Sepsis presents with diverse clinical trajectories, complicating timely stratification and outcome prediction. Conventional approaches relying on single data modalities often fail to capture this heterogeneity. A multimodal embedding model integrates structured clinical variables with unstructured text to generate unified patient representations across 19,526 cases. The approach supports phenotyping and prognosis without task-specific tuning, enabling identification of clinically distinct subgroups and robust prediction of 28-day outcomes across retrospective, prospective and external validation settings.
Multimodal Representation and Model Design
The model combines tabular and textual inputs to construct a unified embedding for each patient. Tabular variables include demographic data, Sequential Organ Failure Assessment scores, comorbidities and laboratory indicators, while textual inputs include microbiological findings and CT reports. This integration allows the model to capture both structured physiological signals and unstructured clinical narratives. The architecture employs a multilayer perceptron for tabular data and a transformer-based encoder for text, with contrastive learning used to align representations across modalities.
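The article does not say which contrastive objective is used. As a minimal sketch, assuming a symmetric InfoNCE loss of the kind common in paired-modality training, the alignment step could look like the following. The function names, the temperature value and the toy embeddings are illustrative assumptions, not details from the study.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct "class" for row i is column i,
    # i.e. each patient's own embedding from the other modality.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_info_nce(tab_emb, text_emb, temperature=0.07):
    # tab_emb[i] and text_emb[i] are assumed to come from the same patient.
    tab = [l2_normalize(v) for v in tab_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # Cosine similarity of every tabular/text pair, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(t, x)) / temperature for x in txt]
              for t in tab]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the tabular-to-text and text-to-tabular directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy embeddings (hypothetical): two patients, two dimensions.
tabular = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_info_nce(tabular, [[0.9, 0.1], [0.1, 0.9]])
shuffled = symmetric_info_nce(tabular, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)  # matched pairs give the lower loss
```

In a real pipeline the embeddings would come from the tabular and text encoders, and the loss would be minimised by gradient descent in a deep learning framework.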
Training relies on unlabelled data, enabling efficient use of large datasets without requiring extensive outcome annotation. This design allows the model to generalise across multiple tasks, including phenotyping and prognosis prediction. Performance evaluation across different datasets shows strong predictive capability, with AUC values of 0.92 and 0.94 in the internal retrospective and prospective cohorts, respectively, and 0.78 in external validation. Although discrimination is lower in the external cohort, the model maintains balanced performance metrics under domain shift, indicating resilience to differences in clinical practice, documentation style and patient populations. Compared with unimodal and conventional multimodal baselines, the integrated approach consistently delivers improved predictive accuracy and stability.
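For readers less familiar with the metric, AUC is the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it by pairwise comparison. The function name and the toy labels and scores are illustrative assumptions and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    # labels: 1 = outcome occurred, 0 = outcome did not occur.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical scores).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

The pairwise loop is quadratic in cohort size, so a rank-based implementation from a statistics library is preferable for large datasets.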
Phenotype Identification and Clinical Characterisation
Clustering of learned representations identifies four clinically distinct phenotypes: high inflammation, low inflammation, intermediate and multiple organ failure. These phenotypes demonstrate clear separation in both tabular and textual analyses, with strong statistical support for differences in laboratory values, comorbidities and outcomes. The high inflammation phenotype shows elevated inflammatory markers and moderate mortality, while the low inflammation phenotype presents minimal organ dysfunction and the lowest mortality rates. The intermediate phenotype exhibits values between these extremes, reflecting mixed clinical characteristics.
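The article does not name the clustering algorithm. As a rough sketch, assuming plain k-means with four clusters applied to the learned embeddings, phenotype assignment could proceed as follows. The farthest-point initialisation, the function names and the two-dimensional toy points are assumptions made for illustration. Real embeddings would be much higher-dimensional.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    # Deterministic seeding: start at the first point, then repeatedly add
    # the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, iterations=50):
    centroids = [list(c) for c in farthest_point_init(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep empty clusters in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Toy "embeddings" (hypothetical): four well-separated pairs of points.
points = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9],
          [0.0, 5.0], [0.1, 5.1], [5.0, 0.0], [4.9, 0.2]]
labels, centroids = kmeans(points, k=4)
print(labels)
```

Cluster indices are arbitrary. Naming a cluster "high inflammation" or "multiple organ failure" requires inspecting the clinical variables within it afterwards.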
The multiple organ failure phenotype displays the most severe clinical profile, with high levels of inflammatory markers, liver and renal dysfunction indicators and coagulation abnormalities. This group also carries the highest burden of comorbidities and the highest in-hospital mortality. Textual data analysis reinforces these distinctions, with specific clinical terms associated with each phenotype. For example, multi-organ involvement and severe complications appear more frequently in the most severe group, while less complex patterns dominate the low inflammation phenotype. The alignment between structured and unstructured data confirms the robustness of the phenotypic classification and supports its clinical relevance.
Treatment Response and Predictive Performance
Analysis of treatment patterns reveals variation in therapeutic response across phenotypes. Use of Xuebijing injection shows differential associations with outcomes, with a significant reduction in in-hospital mortality observed in the high inflammation phenotype. No significant differences are observed in other phenotypes or in the overall population. This finding highlights the importance of patient stratification in evaluating treatment effects and suggests that targeted therapies may yield benefits in specific subgroups.
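The statistical test behind this comparison is not described in the article. One simple way to examine a treatment association within each phenotype, sketched here under that caveat, is to compare in-hospital mortality between treated and untreated patients per subgroup with a two-sided Fisher exact test. All counts below are invented toy numbers, and the phenotype labels are placeholders. None of this reproduces the study's data or method.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    # 2x2 table [[a, b], [c, d]]. The p-value sums the probabilities of all
    # tables with the same margins that are no more likely than the observed one.
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts per phenotype, invented for illustration only:
# (treated deaths, treated survivors, untreated deaths, untreated survivors)
toy_counts = {
    "phenotype_A": (2, 18, 10, 10),
    "phenotype_B": (5, 15, 5, 15),
}

def stratified_association(counts):
    results = {}
    for name, (td, ts, ud, us) in counts.items():
        risk_treated = td / (td + ts)
        risk_untreated = ud / (ud + us)
        results[name] = {
            "risk_difference": risk_treated - risk_untreated,
            "p_value": fisher_exact_two_sided(td, ts, ud, us),
        }
    return results

results = stratified_association(toy_counts)
for name, res in results.items():
    print(name, round(res["risk_difference"], 2), round(res["p_value"], 4))
```

An unadjusted comparison like this cannot separate treatment effect from confounding. Observational analyses typically add covariate adjustment or matching, and correct for testing several subgroups.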
The model also demonstrates strong performance in few-shot classification settings, requiring only limited labelled data to achieve high predictive accuracy. Across multiple datasets, it outperforms baseline models in both discrimination and clinically relevant metrics such as sensitivity, specificity and predictive values. In prospective evaluation, the model achieves higher sensitivity and negative predictive value than human experts, indicating improved ability to identify patients at risk of adverse outcomes. Performance remains consistent across different evaluation metrics, supporting its reliability in clinical decision contexts. Comparison with human predictions shows reduced variability and greater consistency, reflecting the model’s capacity to integrate complex data patterns that may not be readily apparent through experience-based judgement.
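The clinically oriented metrics named above all derive from a confusion matrix of predicted against observed outcomes. The sketch below shows the standard definitions. The function name and the toy labels are illustrative assumptions and do not come from the study.

```python
def binary_metrics(y_true, y_pred):
    # 1 = adverse outcome, 0 = no adverse outcome.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # at-risk patients correctly flagged
        "specificity": ratio(tn, tn + fp),  # low-risk patients correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged patients who had the outcome
        "npv": ratio(tn, tn + fn),          # cleared patients who did not
    }

# Toy predictions (hypothetical): 3 patients with the outcome, 5 without.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                         [1, 1, 0, 1, 0, 0, 0, 0])
print(metrics)
```

Higher sensitivity and negative predictive value, the two measures on which the model is reported to exceed human experts, mean fewer at-risk patients are missed and a negative prediction is more trustworthy.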
The integration of structured and unstructured clinical data enables more comprehensive representation of sepsis patients, supporting both phenotypic stratification and outcome prediction. The identification of four distinct phenotypes, combined with robust predictive performance across multiple datasets, demonstrates the value of multimodal embedding approaches in addressing sepsis heterogeneity. Differential treatment associations further underline the importance of targeted strategies based on patient subgroup characteristics. Consistent performance under varying conditions and improved accuracy compared with human assessment highlight the potential of such models as clinical decision-support tools.
Source: npj Digital Medicine
References:
Liu T, Li Y, Chen H et al. (2026) A multimodal embedding model for sepsis data representation. npj Digit Med: In Press.