JAMA Intern Med. 2023 Apr 3;183(6):611–612. doi: 10.1001/jamainternmed.2022.7182

Factors Associated With Variability in the Performance of a Proprietary Sepsis Prediction Model Across 9 Networked Hospitals in the US

Patrick G Lyons, Mackenzie R Hofford, Sean C Yu, Andrew P Michelson, Philip R O Payne, Catherine L Hough, Karandeep Singh
PMCID: PMC10071393  PMID: 37010858

Abstract

This cohort study uses data from electronic health records to assess variability in a sepsis prediction model across 9 hospitals.


Use of sepsis prediction models may be associated with reduced patient mortality.1 There is concern, however, about the external validity of widely implemented models. For example, although the Epic Sepsis Model (ESM) performed poorly in an academic health system,2 another study at a community hospital found that the ESM improved the timeliness of antibiotic administration.3 The reasons for these disparate findings are unclear.4 Because increased ESM alerting at larger vs smaller hospitals suggests that hospital-related factors are associated with variation in sepsis model performance,4,5 we evaluated ESM performance across 9 hospitals.

Methods

This cohort study enrolled consecutive adult patients presenting to 1 of 9 hospitals in a large network (BJC HealthCare) between January 1, 2020, and June 30, 2022 (eMethods in Supplement 1). The Washington University Institutional Review Board approved this study with a waiver of consent because it involved no more than minimal risk to participants and it could not be practically conducted without a waiver. The ESM was calculated silently (ie, results not clinically available) every 20 minutes from the patient’s arrival in the emergency department. The primary outcome was sepsis, defined as meeting Sepsis-3 criteria using data extracted from the electronic health record; sepsis onset was based on clinical recognition (ie, either cultures or antimicrobials ordered). We evaluated discrimination by calculating the C-statistic (area under the receiver operating characteristic curve) at the hospitalization level, using the highest occurring score prior to clinical recognition of sepsis. A Pearson product-moment correlation was used to detect associations between C-statistics and hospital factors (time to clinical recognition of sepsis, sepsis incidence, Van Walraven index [a modification of Elixhauser comorbidity measures], and cancer prevalence). We used R version 4.0 (R Project for Statistical Computing). We set statistical significance at P < .05 and used 2-sided tests.
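The discrimination measure described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R), and the function names, the toy scores, and the handling of encounters without sepsis (all scores treated as eligible) are assumptions made for illustration. The C-statistic is computed in its pairwise form: the probability that a randomly chosen septic hospitalization has a higher peak score than a randomly chosen non-septic one, with ties counted as one half.

```python
def max_score_before(scores, recognition_hour=None):
    """Return the highest score recorded before clinical recognition.

    scores: list of (hours_since_arrival, score) pairs for one hospitalization.
    recognition_hour: time of clinical recognition, or None when sepsis was
    never recognized (assumption: every score is then eligible).
    """
    eligible = [s for t, s in scores
                if recognition_hour is None or t < recognition_hour]
    return max(eligible) if eligible else None


def c_statistic(septic, non_septic):
    """Pairwise C-statistic (equivalent to the area under the ROC curve)."""
    wins = 0.0
    for p in septic:
        for n in non_septic:
            if p > n:
                wins += 1.0      # septic encounter ranked higher
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(septic) * len(non_septic))


# Hypothetical toy data, not study data.
encounter = [(0.0, 3), (0.33, 8), (0.67, 12)]
print(max_score_before(encounter, recognition_hour=0.5))  # 8
print(c_statistic([8, 5, 9], [4, 5, 2, 7]))               # 0.875
```

The pairwise loop is quadratic in the number of encounters, so a rank-based implementation would be needed at the scale of the study; the sketch only makes the definition concrete.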

Results

We evaluated 806 696 encounters across 9 hospitals, including 233 875 from the largest hospital in the network and 572 821 from the remaining hospitals. Sepsis incidence ranged from 4.1% to 12.0% (Table).

Table. Characteristics and Outcomes of Sepsis and Other Encounters Across 9 US Hospitals.

Values are No. (%)^a unless otherwise indicated.

| Variable | Hospital A | Hospital B | Hospital C | Hospital D | Hospital E | Hospital F | Hospital G | Hospital H | Hospital I |
|---|---|---|---|---|---|---|---|---|---|
| Encounters, No. | 233 875 | 97 996 | 172 929 | 64 154 | 74 963 | 54 206 | 40 571 | 33 958 | 34 065 |
| Staffed beds, No. | 1273 | 449 | 233 | 158 | 107 | 102 | 76 | 68 | 35 |
| Characteristics of patient encounters | | | | | | | | | |
| Age, mean (SD), y | 49 (19) | 57 (20) | 44 (19) | 51 (21) | 50 (20) | 55 (20) | 50 (20) | 53 (19) | 49 (20) |
| Race | | | | | | | | | |
| Black | 101 231 (43) | 18 166 (19) | 127 782 (74) | 10 713 (17) | 1354 (1.8) | 5151 (9.6) | 2900 (7.1) | 421 (1.2) | 10 585 (31) |
| White | 100 050 (43) | 67 598 (69) | 23 884 (14) | 48 946 (76) | 52 718 (70) | 43 468 (80) | 33 732 (83) | 18 854 (56) | 30 916 (91) |
| Other^b | 32 594 (14) | 12 232 (12) | 21 263 (12) | 4495 (7.0) | 20 891 (28) | 5587 (10) | 3939 (9.7) | 2728 (8.0) | 4519 (13) |
| Sex | | | | | | | | | |
| Male | 126 908 (54) | 42 208 (43) | 81 507 (47) | 29 138 (45) | 43 354 (58) | 26 881 (50) | 17 864 (44) | 15 936 (47) | 16 731 (49) |
| Female | 106 967 (46) | 55 788 (57) | 91 422 (53) | 35 016 (55) | 31 609 (42) | 27 325 (50) | 22 707 (56) | 18 022 (53) | 17 334 (51) |
| Hypertension | 23 383 (10.0) | 10 902 (11) | 16 151 (9.3) | 5167 (8.1) | 5244 (7.0) | 2960 (5.5) | 1278 (3.2) | 2469 (7.3) | 1479 (4.3) |
| T2D | 14 482 (6.2) | 5732 (5.8) | 7006 (4.1) | 3110 (4.8) | 3368 (4.5) | 1905 (3.5) | 828 (2.0) | 1034 (3.0) | 1050 (3.1) |
| CHF | 12 166 (5.2) | 5345 (5.5) | 5860 (3.4) | 2536 (4.0) | 2344 (3.1) | 1515 (2.8) | 657 (1.6) | 558 (1.6) | 991 (2.9) |
| COPD | 11 107 (4.7) | 4312 (4.4) | 8407 (4.9) | 3378 (5.3) | 4002 (5.3) | 2250 (4.2) | 961 (2.4) | 967 (2.8) | 1668 (4.9) |
| CKD | 10 986 (4.7) | 4474 (4.6) | 4950 (2.9) | 2069 (3.2) | 1922 (2.6) | 1082 (2.0) | 281 (0.7) | 622 (1.8) | 486 (1.4) |
| Cancer | 19 497 (8.3) | 3609 (3.7) | 1785 (1.0) | 853 (1.3) | 810 (1.1) | 1287 (2.4) | 215 (0.5) | 1621 (4.8) | 284 (0.8) |
| Van Walraven Elixhauser Comorbidity Score, mean (SD) | 9 (13) | 10 (13) | 6 (11) | 8 (13) | 7 (12) | 8 (12) | 6 (10) | 9 (13) | 6 (11) |
| Positive for SARS-CoV-2 | 6672 (2.9) | 2477 (2.5) | 7196 (4.2) | 1583 (2.5) | 2572 (3.4) | 1556 (2.9) | 1108 (2.7) | 708 (2.1) | 1164 (3.4) |
| Outcomes of patient encounters | | | | | | | | | |
| Met Sepsis-3 criteria | 28 148 (12) | 11 900 (12) | 7024 (4.1) | 4949 (7.7) | 6120 (8.2) | 3923 (7.2) | 2127 (5.2) | 2213 (6.5) | 1648 (4.8) |
| Hours between presentation and sepsis clinical recognition, mean (SD) | 8.3 (59) | 4.5 (38) | 6.2 (32) | 2.6 (22) | 0.41 (11) | 3.9 (21) | 2.9 (16) | 4.2 (29) | 1.6 (11) |
| Died | 4762 (2.0) | 1603 (1.6) | 1481 (0.9) | 689 (1.1) | 351 (0.5) | 572 (1.1) | 230 (0.6) | 29 (<0.1) | 224 (0.7) |

Abbreviations: CHF, congestive heart failure; CKD, chronic kidney disease; COPD, chronic obstructive pulmonary disease; T2D, type 2 diabetes; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.

^a Percentage values are rounded to 2 digits (≥10 to whole numbers and <10 to a single decimal place).

^b Other includes Asian, American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander, and individuals with no race or ethnicity provided. We did not break these categories down further given their size at certain hospitals, which could risk identifiability.
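As an arithmetic check on the Table, the hospital-level sepsis incidence can be recomputed from the "Encounters" and "Met Sepsis-3 criteria" counts. The short Python sketch below applies the rounding rule stated in footnote a; the function name `format_pct` is hypothetical, and applying the threshold to the unrounded percentage is an assumption about how the rule was implemented.

```python
# Counts copied from the Table (encounters and encounters meeting Sepsis-3).
encounters = {"A": 233875, "B": 97996, "C": 172929, "D": 64154, "E": 74963,
              "F": 54206, "G": 40571, "H": 33958, "I": 34065}
sepsis = {"A": 28148, "B": 11900, "C": 7024, "D": 4949, "E": 6120,
          "F": 3923, "G": 2127, "H": 2213, "I": 1648}


def format_pct(count, total):
    """Format a percentage per footnote a: whole numbers at or above 10,
    one decimal place below 10 (threshold applied before rounding)."""
    pct = 100 * count / total
    return f"{pct:.0f}" if pct >= 10 else f"{pct:.1f}"


for hospital in encounters:
    print(hospital, format_pct(sepsis[hospital], encounters[hospital]))
```

Under this reading of the rule, the recomputed values agree with the percentages printed in the "Met Sepsis-3 criteria" row.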

Discrimination varied substantially across hospitals (Figure). The C-statistic at the largest hospital (0.55) was similar to that reported in another study at the University of Michigan (0.63).2 The highest C-statistic in the present study (0.73) was close to the range (0.76-0.83) reported by the model developer.

Figure. Association Between Hospital-Level Sepsis Incidence and Epic Sepsis Model C-Statistic Across 9 US Hospitals in a Network.

Each hospital is represented by a blue point (A through I), with 95% CIs represented by vertical bars. The diagonal line represents the line of best fit among hospitals A through I.

Across the 9 hospitals, C-statistics were negatively correlated with sepsis incidence (r = −0.80; P = .009), comorbidity burden (r = −0.78; P = .013), and cancer prevalence (r = −0.86; P = .003). The correlation was not statistically significant with time to clinical recognition of sepsis (r = −0.59; P = .09).
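With only 9 hospitals, each correlation is tested on 7 degrees of freedom, which is why a coefficient as large as −0.59 does not reach significance. The Python sketch below shows the standard conversion from a Pearson r to a 2-sided P value, using t = r·sqrt((n − 2)/(1 − r²)) and a numerically integrated Student t density. It is an illustration, not the authors' analysis; the function names are hypothetical, and because the published coefficients are rounded to 2 decimals the recomputed P values can only approximate the reported ones.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_from_r(r, n, steps=2000):
    """2-sided P value for a Pearson r from n paired observations (|r| < 1)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and t
    # (steps must be even).
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


# Coefficients as reported in the text; n = 9 hospitals.
for label, r in [("sepsis incidence", -0.80), ("comorbidity burden", -0.78),
                 ("cancer prevalence", -0.86), ("time to recognition", -0.59)]:
    print(f"{label}: r = {r:.2f}, P = {two_sided_p_from_r(r, 9):.3f}")
```

The first three recomputed P values fall below .05 and the fourth falls between .05 and .10, consistent with the pattern reported above.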

Discussion

This study found that the performance of a sepsis model was negatively correlated with the incidence of sepsis, the presence of comorbidities, and cancer prevalence. We found no evidence that encounters with COVID-19 were associated with ESM discrimination, suggesting that COVID-19-related alerting increases may be due to model miscalibration.4,5 Our study was limited in that between-hospital differences in recognizing sepsis might affect the calculation of its incidence.1

Possible explanations for our findings include that sepsis may present more heterogeneously at hospitals with a higher sepsis incidence or be more difficult to detect in populations with a higher comorbidity burden, in which other clinical disorders could mimic sepsis. Beyond differential patient characteristics, the low C-statistic at the large teaching hospital in our study may be related to unique patterns in care, documentation, or billing compared with other hospitals in the network. Although academic medical centers often use sepsis prediction models, our findings suggest that these models may be most useful at lower-acuity hospitals, where sepsis rates may be lower. A forthcoming ESM update,6 which includes a recommendation for training on a hospital’s own data before clinical use and a change in the definition of sepsis onset, may mitigate the performance variation that we found and improve the clinical value of the model.

Supplement 1.

eMethods. Supplementary Methods

eReferences

Supplement 2.

Data Sharing Statement

References

1. Adams R, Henry KE, Sridharan A, et al. Prospective, multi-site study of patient outcomes after implementation of the TREWS machine learning-based early warning system for sepsis. Nat Med. 2022;28(7):1455-1460.
2. Wong A, Otles E, Donnelly JP, et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med. 2021;181(8):1065-1070. doi: 10.1001/jamainternmed.2021.2626
3. Tarabichi Y, Cheng A, Bar-Shain D, et al. Improving timeliness of antibiotic administration using a provider and pharmacist facing sepsis early warning system in the emergency department setting: a randomized controlled quality improvement initiative. Crit Care Med. 2022;50(3):418-427.
4. Finlayson SG, Subbaswamy A, Singh K, et al. The clinician and dataset shift in artificial intelligence. N Engl J Med. 2021;385(3):283-286. doi: 10.1056/NEJMc2104626
5. Wong A, Cao J, Lyons PG, et al. Quantification of sepsis model alerts in 24 US hospitals before and during the COVID-19 pandemic. JAMA Netw Open. 2021;4(11):e2135286. doi: 10.1001/jamanetworkopen.2021.35286
6. Ross C. Epic overhauls popular sepsis algorithm criticized for faulty alarms. Stat News. Published October 3, 2022. Accessed November 7, 2022. https://www.statnews.com/2022/10/03/epic-sepsis-algorithm-revamp-training/


Articles from JAMA Internal Medicine are provided here courtesy of American Medical Association
