Temporal robustness of biomarker-based classification algorithms for sepsis

Emma Rademaker; Rombout B E van Amstel; Said el Bouhaddani; Marc J M Bonten; Lennie P G Derde; Lonneke van Vught; Tom van der Poll; Lieuwe D J Bos; Harm-Jan de Grooth; Olaf L Cremer

doi:10.1007/s00134-025-08218-z

2025 Dec 1;52(1):22–30. doi: 10.1007/s00134-025-08218-z

PMCID: PMC12852201 PMID: 41324692

Abstract

Purpose

Heterogeneity of the host response in sepsis hampers development of effective treatments. Several immunobiologically distinct subphenotypes (or endotypes) have been identified using data-driven analyses of single-timepoint biomarker data, but their temporal stability remains uncertain due to dynamic biology and statistical limitations.

Methods

We analyzed data from 345 sepsis patients across two ICU cohorts. Thirty immune biomarkers were measured every 8 h for up to 7 days. Latent profile analysis was used to identify classes upon admission and to re-classify patients at later timepoints. Temporal robustness was assessed by (1) inter-class transition rates and (2) intra-class cohesion (regardless of label), quantified using the Rand Index (RI).

Results

At ICU admission, three immune profiles were identified: profile A (149 patients, 43%) reflected adaptive immune activation (elevated IL-4, IL-5, RANTES, and GM-CSF); profile B (60 patients, 17%) a hyperinflammatory state (high IL-6, IL-8, IL-1Ra, and low protein C); and profile C (136 patients, 39%) broadly attenuated inflammation. By 48 h, the prevalences of A and B declined to 31% and 13%, while C increased to 56%. Inter-class transitions occurred most in patients assigned to A (41% of all 8-hourly transitions), compared to 39% and 22% for B and C. Intra-class cohesion across intervals was poor (median RI 65%, IQR 62–64%), indicating that patients classified together at admission did not remain consistently together.

Conclusion

Sepsis patients were frequently reclassified across immune profiles over short intervals, with approximately one-third of subgroup peers changing at each timepoint. This instability challenges the clinical utility of biomarker-derived endotypes.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00134-025-08218-z.

Keywords: Sepsis, Immune profile, Biomarker, Subphenotypes, Temporal stability

Take-home message

Immunobiological subgroups of sepsis are not temporally stable: about one-third of subgroup peers are reclassified every 8 h. While in part reflecting true biological change, most transitions were observed in patients with biomarker patterns near subgroup boundaries where classification is less certain. This instability may limit the applicability of subgrouping approaches and highlights the importance of considering temporal dynamics and certainty in therapeutic decision-making.

Introduction

The immunobiology of sepsis is highly complex and dynamic, marked by substantial heterogeneity and temporal fluctuations, with hyperinflammation and immunosuppression coexisting to varying degrees within individual patients [1, 2]. This biological variability is widely considered a key factor in the failure of immunomodulatory therapies in unselected sepsis populations [3] and has spurred efforts to identify treatment-responsive traits.

Over the past decade, the identification of clinically homogeneous subgroups (subphenotypes) and biologically distinct disease subtypes (endotypes) has become a central focus of intensive care unit (ICU) research [4]. Subphenotypes are typically derived using unsupervised learning techniques, such as latent class analysis (LCA) or clustering algorithms, which group patients based on shared statistical patterns in observed data. Endotypes, in contrast, are inferred through biological interpretation of these groupings and imply a shared underlying disease mechanism. To date, numerous subgroups have been described in patients with sepsis and acute respiratory distress syndrome (ARDS), with some linked to differential responses to treatment [4–8]. Although recent studies suggest that endotypes discovered in independent cohorts reflect similar underlying biology [7, 8], individual patients may nonetheless be assigned to different subgroups when multiple classification systems are applied in parallel [9].

These inconsistencies may in part reflect a lack of understanding of how disease patterns evolve as illness progresses, a concern recently emphasized in an ICU subphenotyping consensus statement [4]. Subphenotypes are typically identified at the time of ICU admission, based on cross-sectional clinical and biological data. However, sepsis immunobiology is inherently dynamic [10], and patients may rapidly transition between subphenotypes as their condition progresses. Moreover, new subphenotypes may emerge while others cease to exist. Apart from true biological evolution, apparent subphenotype transitions may also arise from technical factors. Immunobiological data are subject to measurement error, which can affect subphenotype assignment. Additionally, due to the highly data-driven nature of clustering algorithms, subphenotypes are sensitive to clinically irrelevant fluctuations in the data and to arbitrary decisions made during the statistical modeling process. This can shift the dominant patterns these algorithms identify and lead to inconsistent classifications. To inform the timing and feasibility of targeted immunomodulatory strategies, it is thus essential to assess the robustness and temporal stability of these classifications. However, only a few studies have conducted longitudinal analyses [11–13], primarily due to the scarcity of high-quality cohorts with serial immunobiological sampling and a lack of standardized statistical methods for evaluating subphenotype persistence over time.

This study investigated the temporal stability of immunobiology-driven subgroups, derived from a broad panel of biomarkers measured during the first 24 h of ICU admission in critically ill patients with sepsis. By examining both inter-class transitions and intra-class cohesion across consecutive 8-h intervals, we aimed to differentiate true biological evolution from statistical classification error, thereby assessing the methodological robustness of data-driven classification algorithms for biomarker data.

Methods

Study design and patients

This retrospective study analyzed data from two Dutch tertiary ICU cohorts: BASIC (Amsterdam University Medical Center) and BioSep (University Medical Center Utrecht). Ethical approval was granted by the respective institutional review boards (protocol numbers 34294 and 11-205), and informed consent was obtained from all participants. Patients with sepsis were enrolled between April 2011 and December 2013, using the Sepsis-2 definition that was in effect at the time [14]. Patients were excluded if they had received > 48 h of intravenous antibiotics before screening or had been in the ICU > 24 h before enrollment.

Sample preparation and protein assays

Blood was sampled every 8 h from the time of inclusion (within 24 h of ICU admission for all study participants) until day seven or ICU discharge. Luminex multiplex assays (ProcartaPlex, eBioscience, San Diego, USA) and enzyme-linked immunosorbent assays (ELISA) (Diagnostica Stago SAS, Asnières-sur-Seine, France; Siemens Healthineers, Erlangen, Germany) were then used to measure 43 biomarkers of inflammation, endothelial activation, and coagulation (Table S2). Across 26 assay batches on two analytical platforms used in 2013 and 2015, respectively, 13 biomarkers were excluded because of irreconcilable measurement range discrepancies. For the remaining 30 markers, batch effects were statistically harmonized (Supplement 1.5). Subsequently, biomarker concentrations were log₁₀-transformed and standardized using z scores.
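As an illustration of the final preprocessing step described above, the following minimal Python sketch log₁₀-transforms a set of concentrations and converts them to z scores. It is not the authors' code (the analyses were run in R). The function name, the example values, the use of the sample standard deviation, and the choice to standardize over whichever samples are passed in are all assumptions made for this example. Batch harmonization and values at or below zero are not handled.

```python
import math
import statistics

def log10_zscore(concentrations):
    """Log10-transform positive concentrations, then standardize to z scores.

    Assumption: the reference mean and SD are taken from the values passed in;
    the source does not state which samples defined the reference distribution.
    """
    logged = [math.log10(c) for c in concentrations]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD (n - 1 denominator), an assumption
    return [(x - mean) / sd for x in logged]

# Hypothetical concentrations (pg/mL) of one biomarker in five samples
example = [10.0, 100.0, 1000.0, 100.0, 100.0]
z = log10_zscore(example)
```

After this step each biomarker has mean 0 and unit variance on the log scale, so that markers with very different concentration ranges contribute comparably to the classification.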

Statistical analysis

We applied four approaches to unsupervised subgroup identification. In the primary analysis, patients were classified from their first available sample (t0) using latent profile analysis (LPA), and the same model (LPA_t0) was reapplied every 8 h to assign labels over time. LPA, a variant of latent class analysis for continuous variables, assumes that populations are mixtures of distributions representing distinct classes [15]. For alternative approaches, evaluated only within the first 48 h after inclusion, we: (1) repeated LPA independently at each 8-h interval (LPA_ti) and compared results with the t0 solution in terms of number, composition, and prognostic relevance; (2) applied k-means clustering, assigning patients at later times by Euclidean distance to t0 centroids (k-means_t0); and (3) performed de novo k-means at each time point (k-means_ti). These algorithms included only biomarker data, deliberately excluding clinical variables (e.g., age, vital signs). This ensured that identified classes reflected immunobiological patterns rather than demographics, while also capturing noise and spurious associations inherent to biomarker data. Biomarkers with strong pairwise correlations (Pearson’s r > 0.5) were excluded as recommended [15], retaining the biomarker contributing most to variance by principal component analysis. Ten biomarkers were excluded, leaving 20 inputs (Fig. S3). For LPA, models with one to five classes were fitted under conditional independence, with the optimal solution chosen by Bayesian Information Criterion (BIC) and entropy [15]. Patients were assigned to a class if posterior probability exceeded 0.5. For k-means, the number of clusters was determined by majority vote across NbClust indices [16]. When class or cluster counts varied across time points (t0–t48), the most frequent solution was selected.
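To make the re-application of a fixed t0 model more concrete, the sketch below computes posterior class probabilities for a new sample under a Gaussian mixture with conditional independence (one mean and variance per biomarker per class), and assigns a class only if the highest posterior exceeds 0.5. This is a simplified stand-in for the LPA_t0 step, not the authors' R implementation. All parameter values, the two-biomarker setup, and the function names are hypothetical.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def posterior_probabilities(sample, classes):
    """Posterior class probabilities for one sample of z-scored biomarkers.

    `classes` is a list of dicts with keys 'weight', 'means', 'variances'.
    Conditional independence: per-biomarker log densities are summed.
    """
    log_joint = []
    for c in classes:
        ll = math.log(c["weight"])
        for x, m, v in zip(sample, c["means"], c["variances"]):
            ll += log_normal_pdf(x, m, v)
        log_joint.append(ll)
    top = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(l - top) for l in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def assign_class(sample, classes, labels, threshold=0.5):
    """Return (label or None, posteriors); None if no class exceeds threshold."""
    post = posterior_probabilities(sample, classes)
    best = max(range(len(post)), key=post.__getitem__)
    return (labels[best] if post[best] > threshold else None), post

# Hypothetical t0 parameters for three classes and two biomarkers
labels = ["A", "B", "C"]
t0_model = [
    {"weight": 0.43, "means": [1.0, -0.5], "variances": [1.0, 1.0]},
    {"weight": 0.17, "means": [0.0, 1.5], "variances": [1.0, 1.0]},
    {"weight": 0.40, "means": [-0.8, -0.6], "variances": [1.0, 1.0]},
]

clear_label, clear_post = assign_class([0.1, 1.6], t0_model, labels)
border_label, border_post = assign_class([0.1, 0.4], t0_model, labels)
```

In this toy setup the first sample lies close to the hypothetical class B centre and is assigned to B, whereas the second lies between the classes, no posterior exceeds 0.5, and no label is returned. Samples of the second kind are the ones whose label can flip with small changes in biomarker values.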

Assessment of temporal stability

Temporal stability was evaluated using two metrics: (1) the inter-class transition rate, defined as the proportion of patients reassigned to a different class at each time point; and (2) intra-class cohesion, measured by the Rand Index (RI). The RI reflects the proportion of patient pairs that are consistently classified over time, either together in the same class or apart in different classes. In other words, the RI quantifies whether patients remain grouped with their original subgroup peers, irrespective of label.
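Both metrics can be written down in a few lines. The following Python sketch is illustrative only and uses hypothetical labels for six patients. It assumes that every patient has a class label at both time points, and the function names are chosen for this example.

```python
from itertools import combinations

def transition_rate(labels_prev, labels_next):
    """Proportion of patients whose class differs between two time points."""
    pairs = list(zip(labels_prev, labels_next))
    return sum(a != b for a, b in pairs) / len(pairs)

def rand_index(labels_prev, labels_next):
    """Proportion of patient pairs classified consistently at both time points.

    A pair is consistent if the two patients share a class at both time points
    or are in different classes at both time points. Label names are irrelevant.
    """
    agree = total = 0
    for i, j in combinations(range(len(labels_prev)), 2):
        same_prev = labels_prev[i] == labels_prev[j]
        same_next = labels_next[i] == labels_next[j]
        agree += same_prev == same_next
        total += 1
    return agree / total

# Hypothetical class labels for six patients at two consecutive time points
t_i = ["A", "A", "B", "B", "C", "C"]
t_next = ["A", "C", "B", "B", "C", "C"]

rate = transition_rate(t_i, t_next)  # one of six patients changes class
ri = rand_index(t_i, t_next)         # 12 of 15 pairs are consistent
```

Because the RI is computed on pairs, renaming the classes at the second time point leaves it unchanged, which is why it can be used to compare solutions from independently fitted models such as LPA_ti or k-means_ti.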

Exploratory analysis

To assess the robustness of our findings, we evaluated the temporal stability of alternative 2- and 4-class LPA solutions at t0, repeated the four main analyses using all 30 biomarkers, and re-ran LPA relaxing the assumption of conditional independence. To explore drivers of instability, we compared biomarker levels between patients with stable versus unstable assignments, defining stability as remaining in the same class for > 80% of time points during the first 48 h. We also tested whether posterior probabilities from LPA_t0 predicted stability: patients with a posterior probability > 0.95 were considered definite, all others indeterminate, and their subsequent stability was compared. Alternative thresholds (> 0.75, > 0.80, > 0.85, > 0.90) were also evaluated. All analyses were performed in R v4.4.3 (RStudio).
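The two definitions used in this exploratory analysis can be sketched as follows. The sketch is a Python illustration rather than the authors' code. It assumes that "the same class" means the class assigned at t0 and that the percentage is taken over follow-up time points after t0. Neither detail is specified in the text, and the function names are hypothetical.

```python
def is_stable(trajectory, cutoff=0.80):
    """True if the t0 class is held at more than `cutoff` of follow-up time points.

    `trajectory` is the list of class labels from t0 onward (first 48 h).
    Returns None when there are no follow-up samples.
    """
    baseline, followup = trajectory[0], trajectory[1:]
    if not followup:
        return None
    same = sum(label == baseline for label in followup)
    return same / len(followup) > cutoff

def certainty(posteriors, threshold=0.95):
    """Label a t0 assignment 'definite' or 'indeterminate' by its top posterior."""
    return "definite" if max(posteriors) > threshold else "indeterminate"

# Hypothetical trajectories: t0 plus six 8-hourly follow-up labels (t8 to t48)
one_switch = ["A", "A", "C", "A", "A", "A", "A"]    # 5/6 of follow-ups match t0
two_switches = ["A", "C", "C", "A", "A", "A", "A"]  # 4/6 of follow-ups match t0

stable_1 = is_stable(one_switch)
stable_2 = is_stable(two_switches)
```

Under these assumptions, a patient with six follow-up samples tolerates one deviating label (5/6 ≈ 83%) but not two (4/6 ≈ 67%).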

Role of the funding source

The funder of the study and Immunetrics Inc. (responsible for coordinating the laboratory analysis) had no involvement in study design, data collection, data analysis, data interpretation, or writing of the report.

Results

Primary class assignments at t0

A total of 345 patients with sepsis were included (Fig. S1), with 3852 plasma samples analyzed. At baseline (t0), the optimal LPA solution was a three-class model. Although a four-class model showed marginal statistical gain, this did not outweigh the added complexity and risk of overfitting (Table S4). The resulting three classes reflected distinct and interpretable immune response patterns: profile A (149 patients, 43%) corresponded to an active adaptive immune response, with elevated interleukin (IL)-2, IL-4, IL-5, IL-7, IL-12p40, C–C motif chemokine ligand 5 (RANTES), and granulocyte–macrophage colony-stimulating factor (GM-CSF); profile B (60 patients, 17%) reflected a hyperinflammatory state, characterized by high IL-6, IL-8, IL-1Ra, and low Protein-C; and profile C (136 patients, 39%) exhibited broadly attenuated inflammation, marked by overall lower biomarker concentrations (Fig. 1).

Fig. 1 — Immune response profiles. Expression values are the measurements at t0, and are Z score-normalized, with higher expression shown in red and lower levels in blue. Class assignments are color-coded in the annotation bar on the left. Biomarkers that were included in the LPA_t0 model are labeled in bold. Profile A: adaptive; profile B: hyperinflammatory; profile C: attenuated

Patients in profile B exhibited the highest disease severity upon admission and the highest ICU mortality (22 deaths, 37%), followed by profiles A and C, with 28 deaths (19%) and 16 deaths (12%), respectively; p < 0.001 (Table 1).

Table 1.

Patient characteristics

	Profile A N = 149	Profile B N = 60	Profile C N = 136	p value
Demographics
Age	64 [55–73]	64 [53–70]	64 [56–72]	0.15
Male	90 (60)	36 (60)	73 (54)	0.48
Comorbidities
Diabetes mellitus	26 (17)	9 (15)	33 (24)	0.21
Chronic renal insufficiency	24 (16)	6 (10)	17 (13)	0.45
Chronic obstructive pulmonary disease	22 (15)	4 (7)	23 (17)	0.16
Immune deficiency	26 (17)	11 (18)	15 (11)	0.24
Malignancy	27 (18)	14 (23)	21 (15)	0.41
Site of infection
Lower respiratory tract	70 (47)	21 (35)	81 (60)	0.01
Intra-abdominal	34 (23)	14 (23)	15 (11)
Urinary tract	10 (7)	6 (10)	11 (8)
Central nervous system	7 (5)	6 (10)	10 (7)
Skin or soft tissue	11 (7)	4 (7)	4 (3)
Other	17 (11)	9 (15)	15 (11)
Disease severity at ICU admission^a
APACHE IV acute physiology score	70 [54–86]	80 [55–109]	59 [44–74]	< 0.001
Sequential organ failure assessment score^b	8 [5–9]	10 [8–13]	6 [3–8]	< 0.001
Septic shock	51 (34)	29 (48)	23 (17)	< 0.001
White blood cell count (10⁹/L)	15.1 [10.5–23.1]	11.5 [6.5–18.0]	13.1 [10.4–18.9]	0.01
C-reactive protein (mg/L)	167 [76–288]	163 [68–226]	102 [46–185]	0.004
Creatinine (µmol/L)	133 [88–227]	182 [96–256]	88 [64–130]	< 0.001
Bilirubin (µmol/L)	15 [8–23]	20 [9–46]	9 [7–17]	< 0.001
Platelet count (10⁹/L)	211 [147–288]	103 [50–162]	181 [138–264]	< 0.001
Lactate (mmol/L)^b	2.6 [1.6–4.6]	4.6 [2.5–7.9]	2.0 [1.3–3.3]	< 0.001
Outcomes
ICU length of stay^c	6 [2–17] (n = 121)	4 [2–10] (n = 38)	4 [2–8] (n = 120)	0.01
ICU mortality	28 (19)	22 (37)	16 (12)	< 0.001

Data are presented as medians [interquartile range] or absolute numbers (%)

^aThese values represent the worst observed values of the first 48 h of ICU admission

^bSequential Organ Failure Score (SOFA) is defined as the total SOFA without the value for central nervous system

^cLength of stay and length of mechanical ventilation are reported for ICU survivors. ICU: intensive care unit; profile A: adaptive; profile B: hyperinflammatory; profile C: attenuated

Inter-class transitions

Over the first 48 h in the ICU, the prevalence of profiles A and B declined from 43% and 17% at t0 to 31% and 13% at t48, respectively, while that of profile C increased from 39% to 56% (p < 0.001). Thereafter, these proportions remained largely stable among patients still admitted to the ICU beyond day two. However, at the individual patient level, inter-class transition rates remained high throughout the study period. Within the first 48 h, at least one reassignment occurred in 131 (88%) patients initially classified in profile A, 36 (60%) patients in profile B, and 63 (46%) patients in profile C (Fig. 2). Across all 8-h intervals, the overall transition rate was 31%, based on 1,110 reassignments and 2,506 consistent assignments. Average transition rates over time were higher for profiles A and B (41% and 39%, respectively) than for profile C (22%) (Fig. 3A). Patients with profile B had similar transition rates into profile A (17%) and profile C (22%). In contrast, the transition rates from profiles A or C into profile B were lower (7% and 5%, respectively) (Fig. 3). The inter-class transition rate was stable over time for all classes (Fig. 3).
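The overall rate and the class-specific rates reported above can be read as entries of a transition matrix pooled over all consecutive 8-h intervals. The Python sketch below reproduces the overall rate from the two reported counts and shows how such a row-normalized matrix could be tabulated. The trajectories are hypothetical, the function name is chosen for this example, and complete labels at consecutive time points are assumed.

```python
from collections import Counter

# Overall transition rate from the counts reported in the text
reassignments = 1110
consistent = 2506
overall_rate = reassignments / (reassignments + consistent)  # about 0.31

def transition_matrix(trajectories, classes=("A", "B", "C")):
    """Row-normalized transition proportions pooled over consecutive time points.

    matrix[origin][dest] is the share of intervals starting in `origin`
    that end in `dest`; off-diagonal entries are class-specific transition rates.
    """
    counts = Counter()
    for traj in trajectories:
        for prev, nxt in zip(traj, traj[1:]):
            counts[(prev, nxt)] += 1
    matrix = {}
    for origin in classes:
        row_total = sum(counts[(origin, dest)] for dest in classes)
        matrix[origin] = {
            dest: (counts[(origin, dest)] / row_total) if row_total else float("nan")
            for dest in classes
        }
    return matrix

# Hypothetical label sequences for three patients
toy = [["A", "A", "C"], ["B", "A", "A"], ["C", "C", "C"]]
m = transition_matrix(toy)
```

In such a matrix, the off-diagonal entries of a row sum to that class's transition rate. This is consistent with the reported figures for profile B, where transitions into profile A (17%) and profile C (22%) add up to the 39% average rate.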

Fig. 2 — Individual transitions. Alluvial plots depicting patient class assignments at each time point based on LPA_t0. Lines represent individual patient trajectories, with line colors indicating the class assigned at t0. Stacked bars reflect the number of patients assigned to each class at the respective time points. Profile A: adaptive; profile B: hyperinflammatory; profile C: attenuated

Fig. 3 — Inter-class transitions. The left panel displays agreement of patient classifications at consecutive time points i and i + 1. The right panel shows the probability of a class reassignment occurring between ti and ti + 1 over time, stratified by the immune profile assigned at ti. Lines depict locally estimated scatterplot smoothing (LOESS) curves, with shaded bands indicating the standard error of the estimate. Transition probabilities were averaged across three adjacent time intervals before plotting. Profile A: adaptive; profile B: hyperinflammatory; profile C: attenuated

Intra-class cohesion

Intra-class cohesion—representing the extent to which patients remained grouped with their original class peers—was limited. Across all 8-h intervals, the median RI from ti to ti + 1 was 65% [IQR: 62–64], indicating that approximately one-third of class peers were replaced at each time point. Notably, although overall class prevalence stabilized after 48 h, intra-class cohesion did not improve over time, suggesting persisting instability in individual class assignments (Fig. 4). When using t0 as a reference, cohesion was even lower (median RI 55% [IQR: 54–57] across all time points), and gradually declined over time.

Fig. 4 — Intra-class cohesion. The Rand Index (RI) quantifies pairwise consistency in class assignments. Dark blue indicates agreement between consecutive time points (i and i + 1); light blue shows agreement between each time point and baseline (t0). Higher values reflect greater stability

Alternative statistical approaches

To further assess temporal stability, three alternative unsupervised learning approaches were applied: LPA_ti, k-means_t0, and k-means_ti. All produced results that were broadly consistent with the primary LPA_t0 approach in terms of the optimal number of classes/clusters (Table S4) and their prognostic relevance (Table S5). Across all methods and time points, classes/clusters resembled profiles A, B, and C, although the absolute differences in concentrations of profile-defining biomarkers between profiles diminished over time (Figs. S5, S6). Inter-class transition rates varied by method and profile but remained high overall. For LPA_ti and k-means_ti, transition rates were highest for profile C (51% and 44%, respectively) and lowest for profile B (43% and 28%) (Fig. S7A). In contrast, k-means_t0 showed lower overall transition rates, with profile A being the most stable (21%), compared to 31% and 33% for profiles B and C, respectively (Fig. S7A). Intra-class cohesion from ti to ti + 1 ranged between 60 and 70% across the three alternative methods (Fig. S7B). Additional analyses using alternative 2- and 4-class solutions to LPA_t0, using all 30 biomarkers, or relaxing the assumption of conditional independence did not alter the main conclusions and are presented in the supplementary appendix (S2, Fig. S8).

Exploratory analysis

To assess potential sources of the observed temporal instability, we compared patients with stable versus unstable LPA_t0 class assignments. A total of 160 (46%) patients were identified as unstable because they had been reassigned at > 20% of time points during the first 48 h. Compared to patients with more stable classifications, these individuals exhibited less extreme concentrations of profile-defining biomarkers at t0 (Fig. S9). Correspondingly, they showed a trend toward lower posterior probabilities of membership in their assigned class (p = 0.074). These findings suggest that temporal instability may be partially attributable to patients with intermediate immune profiles, resulting in borderline or uncertain class assignments.

Aiming to explore a solution to the observed temporal instability, we subsequently examined the impact of applying stricter posterior probability thresholds for class membership at t0. Incrementally increasing the cutoff in 5% steps from 75% onward resulted in only minimal reduction in inter-class transition rates, and only for profile B (data not shown). At a > 95% probability threshold, 99 (29%) patients remained unclassifiable, of whom 57 (58%) would otherwise have been assigned to profile A, 14 (14%) to profile B, and 28 (28%) to profile C. After exclusion of these indeterminate cases, transition rates only decreased modestly for profile B to 33% (from 39% under the original cutoff assignment) (Fig. S10A). However, intra-class cohesion from ti to ti + 1 after exclusion of patients with indeterminate class membership remained poor (RI 66% [IQR: 64–69]), which was similar to non-stratified LPA_t0 results (Fig. S10B).

Discussion

In this study, we assessed the temporal stability of biomarker-derived subgroups (i.e., subphenotypes or endotypes) in critically ill patients with sepsis. We found that patients were frequently reclassified into different immune profiles over short intervals, with roughly one-third of their original subgroup peers changing at each time point. Subgroup stability remained poor across multiple statistical approaches and was not substantially improved by varying analytical cutoffs.

The classes identified by LPA at t0 (and replicated across alternative statistical approaches) reflected biologically plausible host response states. These included adaptive immune activation (profile A), innate hyperinflammation (profile B), and a broadly attenuated immune state (profile C), each showing coherent associations with clinical outcomes: profile B was linked to death, whereas profile C was associated with discharge. Furthermore, the biological characteristics of profile B closely resembled the well-established hyperinflammatory subphenotype, originally described in ARDS [6, 17] and subsequently also observed in sepsis [18]. Similarly, several transcriptomic studies have identified immunobiological endotypes characterized by dominant innate or adaptive responses [19–21], and two large integrative analyses have subsequently consolidated these findings [7, 8].

A recent study reported that approximately 50% of ARDS patients initially classified as hyperinflammatory were reassigned to the hypoinflammatory subphenotype by day 2 [11]. Likewise, a recent transcriptomic study in a West African cohort of ICU survivors showed a decline in expression of SRS-1 and MARS-2, -3, and -4 endotype-defining genes after 24 h, suggesting a shift toward a more balanced immune state [13]. In another study, Balch and colleagues reported a 57.5% inter-class transition rate between inflammopathic, adaptive, and coagulopathic endotypes within two days of ICU admission [12]. Finally, although longitudinal sampling was limited to 35 patients, Scicluna et al. reported comparable transitions by day 3 between the consensus transcriptomic subtypes [8]. These findings are consistent with our data, in which 53% of patients assigned to profile B at ICU admission were reclassified to less inflammatory profiles by day 2. Prior work has suggested that early resolution of the hyperinflammatory subphenotype may indicate a treatment-responsive trajectory toward recovery [11], and a similar mechanism may underlie our data. However, extending previous studies, our findings suggest that transitions occur at far shorter intervals and often appear stochastic. This indicates that shifts partly reflect statistical classification error in addition to true biological change.

In keeping with these observations, a key finding of our study is that beyond the high rate of inter-class transitions, intra-class cohesion was poor, even at closely spaced time points. At each 8-h interval, approximately a third of class peers were reassigned to a different immune profile, regardless of the statistical approach used. Exploratory analysis suggests that this instability is at least partially attributable to patients whose biomarker expression patterns fall near the decision boundaries of the defined classes. Put more intuitively: while some patients clearly match one of the three immune profiles, others exhibit intermediate or overlapping biological features, resulting in seemingly random class assignments over time. To test whether higher classification certainty at baseline would mitigate this instability, we applied more stringent posterior probability thresholds at t0. While this improved temporal stability modestly for profile B, no meaningful improvement was observed for the other profiles. Additional factors that may have contributed to classification instability include measurement error in biomarker assays and the inherent sensitivity of unsupervised clustering methods to sample-specific noise and outliers, both of which are difficult to mitigate. Notably, small differences in classification stability were observed between samples processed at different facilities (data not shown), possibly suggesting that some laboratories are able to control measurement variation more effectively than others.

Our findings highlight potential obstacles in targeting subphenotypes or endotypes for therapeutic intervention, even though we did not study heterogeneity of treatment effect directly. First, effective stratification relies on treatments precisely matched to distinct immune response profiles, with minimal overlap or uncertainty at classification boundaries. Second, effective implementation requires strong intra-class cohesion to prevent undertreatment of patients not yet assigned to a treatment group but likely to enter it, and overtreatment of those nearing transition out. In other words, we identified a genuine risk of renewed heterogeneity—not in sepsis itself, but in the process of class allocation. The acceptable rate of misclassification will depend on the trade-off between anticipated therapeutic benefit and potential harm.

A notable strength of the study is the use of high-resolution, granular biological data from a sizable and well-curated cohort. However, our study also has several limitations. First, patients were enrolled over a decade ago using the then-standard Sepsis-2 definition. However, only 12 patients in the cohort would not meet current Sepsis-3 criteria [22], suggesting that the findings remain largely applicable. Second, although all measurements were performed by accredited laboratories using standardized reagents, substantial batch effects persisted. Addressing these required extensive harmonization procedures and led to the exclusion of several biomarkers. While this may have influenced our findings to some extent, measurements for nearly all patients were conducted within a single batch, minimizing the potential impact on the temporal stability analysis. Third, although immunoparalysis is a key area of interest in sepsis research, it could not be reliably assessed using the plasma biomarkers available in our dataset. As such, profile C reflects a generally attenuated immune response rather than definite evidence of immunoparalysis. Fourth, we did not evaluate the external validity of the derived immune profiles prior to assessing their temporal stability. However, the aim of this study was not to propose novel endotypes, but rather to evaluate the methodological robustness and temporal stability of unsupervised classification methods. Lastly, the impact of immunomodulatory treatments on transition rates was not assessed; however, the use of such therapies was limited under the protocols in effect at the time.

In conclusion, the frequent reclassification of sepsis patients within the first week of ICU admission, with approximately one-third of class membership changing at each 8-h interval, highlights an inherent volatility of discrete immune profiling approaches. While this statistical instability appears most prominent in patients with intermediate immune profiles, our findings raise concerns about the reliability and clinical applicability of biomarker-based endotyping strategies.

Acknowledgements

This study was conducted within the framework of the MARS project, supported by the Center for Translational Molecular Medicine (project grant 04I-201). We thank Immunetrics Inc. for their in-kind support in conducting the biomarker analysis, and acknowledge all members of the former MARS consortium for their contributions to the BASIC and BioSep studies.

Author contributions

This study was conceived by ER and OLC. MJMB, LvV, TvdP and OLC were responsible for data collection. ER and RvA accessed and verified the underlying data. ER, SeB, and JD performed the statistical analysis. RvA, SeB, MJMB, LPGD, TvdP, LDJB, and HJdG contributed to the study design, statistical analysis, and interpretation of the data. ER drafted the manuscript, to which all authors contributed critical revisions. All authors had full access to all data in the study and accept responsibility to submit for publication.

Data availability

Deidentified study data may be made available upon reasonable request to the principal investigator (o.l.cremer@umcutrecht.nl). Data will be shared for non-commercial academic purposes, subject to a data use agreement and in compliance with applicable privacy regulations.

Declarations

Conflicts of interest

All authors declare that they have no conflicts of interest.

References

1. Cajander S et al (2024) Profiling the dysregulated immune response in sepsis: overcoming challenges to achieve the goal of precision medicine. Lancet Respir Med 12(4):305–322
2. Shankar-Hari M et al (2024) Reframing sepsis immunobiology for translation: towards informative subtyping and targeted immunomodulatory therapies. Lancet Respir Med 12(4):323–336
3. Marshall JC (2014) Why have clinical trials in sepsis failed? Trends Mol Med 20(4):195–203
4. Gordon AC et al (2024) From ICU syndromes to ICU subphenotypes: consensus report and recommendations for developing precision medicine in the ICU. Am J Respir Crit Care Med 210(2):155–166
5. Antcliffe DB et al (2019) Transcriptomic signatures in sepsis and a differential response to steroids. From the VANISH randomized trial. Am J Respir Crit Care Med 199(8):980–986
6. Calfee CS et al (2018) Acute respiratory distress syndrome subphenotypes and differential response to simvastatin: secondary analysis of a randomised controlled trial. Lancet Respir Med 6(9):691–698
7. Moore AR et al (2025) A consensus immune dysregulation framework for sepsis and critical illnesses. Nat Med. doi:10.1038/s41591-025-03956-5
8. Scicluna BP et al (2025) A consensus blood transcriptomic framework for sepsis. Nat Med. doi:10.1038/s41591-025-03964-5
9. van Amstel RBE et al (2023) Uncovering heterogeneity in sepsis: a comparative analysis of subphenotypes. Intensive Care Med 49(11):1360–1369
10. Meyer NJ, Prescott HC (2024) Sepsis and septic shock. N Engl J Med 391(22):2133–2146
11. van Amstel RBE et al (2025) Temporal transitions of the hyperinflammatory and hypoinflammatory phenotypes in critical illness. Am J Respir Crit Care Med 211(3):347–356
12. Balch JA et al (2023) Defining critical illness using immunological endotypes in patients with and without sepsis: a cohort study. Crit Care 27(1):292
13. Chenoweth JG et al (2024) Gene expression signatures in blood from a West African sepsis cohort define host response phenotypes. Nat Commun 15(1):4606
14. Levy MM et al (2003) 2001 SCCM/ESICM/ACCP/ATS/SIS international sepsis definitions conference. Crit Care Med 31(4):1250–1256
15. Sinha P, Calfee CS, Delucchi KL (2021) Practitioner’s guide to latent class analysis: methodological considerations and common pitfalls. Crit Care Med 49(1):e63–e79
16. Charrad M et al (2014) NbClust: an R package for determining the relevant number of clusters in a data set. J Stat Softw 61(6):1–36
17. Calfee CS et al (2014) Subphenotypes in acute respiratory distress syndrome: latent class analysis of data from two randomised controlled trials. Lancet Respir Med 2(8):611–620
18. Sinha P et al (2023) Identifying molecular phenotypes in sepsis: an analysis of two prospective observational cohorts and secondary analysis of two randomised controlled trials. Lancet Respir Med 11(11):965–974
19. Davenport EE et al (2016) Genomic landscape of the individual host response and outcomes in sepsis: a prospective cohort study. Lancet Respir Med 4(4):259–271
20. Scicluna BP et al (2017) Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Respir Med 5(10):816–826
21. Sweeney TE et al (2018) Unsupervised analysis of transcriptomics in bacterial sepsis across multiple datasets reveals three robust clusters. Crit Care Med 46(6):915–925
22. Singer M et al (2016) The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315(8):801–810

Associated Data

Supplementary Materials

Supplementary file 1 (DOCX, 3037 kB)
