American Journal of Respiratory and Critical Care Medicine. 2023 Feb 1;207(7):853–854. doi: 10.1164/rccm.202212-2284VP

Artificial Intelligence for Early Sepsis Detection: A Word of Caution

Michiel Schinkel, Tom van der Poll, W Joost Wiersinga
PMCID: PMC10111986  PMID: 36724366

Sepsis is a life-threatening syndrome with an estimated 49 million cases and 11 million related deaths globally each year (1). Early detection of potential sepsis, leading to a timely and appropriate work-up and treatment, seems essential to improve outcomes (2). However, sepsis’ highly heterogeneous and variable presentation makes it hard to establish a prompt and accurate diagnosis. Consequently, the delivery of appropriate care can be delayed. The need for early sepsis detection has incentivized researchers and companies worldwide to leverage advanced analytical tools, including artificial intelligence (AI), to develop automated systems that provide timely alerts and make physicians aware of imminent sepsis. Among the vast array of AI applications developed to support the management of sepsis, early detection tools are the first to have acquired a place in clinical practice (3). For better or for worse, they are already impacting patient care. In the past few years, we have seen the first real-world evaluations of these tools and their potential to help improve patient outcomes. Here, we argue that their results warrant caution (highlighted in Text Box 1).

Text Box 1. Artificial Intelligence for Early Sepsis Detection: A Word of Caution

  • Artificial intelligence (AI) tools already impact sepsis care.

  • Up to now, no large-scale, randomized controlled trial–level evidence demonstrates the clinical benefits of AI-based alerts for patients with sepsis.

  • Sepsis alerts may trigger the unnecessary use of antibiotics.

  • The proprietary nature of many AI tools can make independent validation challenging.

  • Caution should be exercised when using early sepsis detection tools.

In 2021, Wong and colleagues validated the Epic Sepsis Model (ESM), an algorithm for detecting sepsis implemented and used by hundreds of hospitals (4). During the validation, the ESM showed poor calibration and poor discrimination (area under the curve [AUC] of 0.63), the latter far worse than the initially reported AUC of 0.76 to 0.83. Physicians needed to evaluate 109 patients to detect 1 case of sepsis earlier than without the ESM. More recently, a prospective study evaluated the impact of using the Targeted Real-time Early Warning System (TREWS) to improve patient outcomes (5). Researchers have been developing and fine-tuning the TREWS for years, reaching an AUC of 0.97 for detecting sepsis and flagging 82% of cases. This algorithm is undoubtedly one of the better sepsis detection tools, showing robust results in large and diverse cohorts. Starting in 2018, TREWS was deployed in five hospitals as part of a multisite study of patient outcomes after using the alert. Impressively, the TREWS retained adoption rates of 89% throughout this study (5). In a retrospective analysis of 6,877 actionable sepsis cases with alerts, the 4,220 cases in which the alert was confirmed within 3 hours had a significantly reduced mortality rate (adjusted relative reduction of 18.7%) compared with the 2,657 controls in which the alert was not confirmed within 3 hours (5). Although these results are indeed promising, they seem preliminary. A significant concern lies within the control group, which is highly heterogeneous and may have included many patients without sepsis. Despite the sophisticated adjustments for potential confounding, the study’s observational nature and electronic health record–based sepsis identification make it hard to investigate the actual benefits of TREWS on sepsis outcomes. Furthermore, the investigation did not consider the patients with TREWS alerts who did not have sepsis (over 30,000). An alert in this group may have had harmful effects, such as the overuse of diagnostics and antimicrobial therapy. Notably, the authors themselves acknowledge the need for a randomized controlled trial (RCT) to investigate the potential benefits of using TREWS (5). However, they found such a trial difficult to operationalize and ultimately did not conduct a large-scale RCT.
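To make the alert burden described above concrete, the following minimal sketch reworks the figures quoted in this paragraph. It is illustrative arithmetic only, not an analysis from either study: the function names are hypothetical, "over 30,000" is treated as a lower bound of exactly 30,000, and the ESM and TREWS figures rest on different definitions, so they should not be compared directly.

```python
# Illustrative arithmetic on figures quoted in the text (refs. 4 and 5).
# Function and variable names are hypothetical, chosen for this sketch.

def alert_yield(cases_with_alert: int, alerts_without_condition: int) -> float:
    """Fraction of alerted patients who had the condition."""
    return cases_with_alert / (cases_with_alert + alerts_without_condition)


def number_needed_to_evaluate(yield_fraction: float) -> float:
    """Alerted patients a clinician reviews per case found."""
    return 1.0 / yield_fraction


# TREWS retrospective analysis: the two groups add up to the full cohort.
confirmed_within_3h = 4_220
not_confirmed_within_3h = 2_657
sepsis_cases_with_alert = confirmed_within_3h + not_confirmed_within_3h

# Assumption: "over 30,000" non-sepsis alerts is taken as exactly 30,000,
# so the resulting yield is an upper bound.
non_sepsis_alerts_lower_bound = 30_000
trews_yield_upper_bound = alert_yield(
    sepsis_cases_with_alert, non_sepsis_alerts_lower_bound
)

# ESM validation: 109 evaluations per case of sepsis detected earlier.
esm_early_detection_yield = 1 / 109

print(f"Sepsis cases with alerts: {sepsis_cases_with_alert}")
print(f"TREWS alert yield (upper bound): {trews_yield_upper_bound:.1%}")
print(
    "TREWS alerts per sepsis case (lower bound): "
    f"{number_needed_to_evaluate(trews_yield_upper_bound):.1f}"
)
print(f"ESM early-detection yield: {esm_early_detection_yield:.2%}")
```

Under these assumptions, fewer than one in five alerted patients in the TREWS cohort were among the analyzed sepsis cases, which is why the unexamined non-sepsis group matters for judging potential harm.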

The above-mentioned evaluations of implemented AI tools for early sepsis detection showcase serious challenges. Although a notable value proposition of AI-based algorithms is that they can continuously learn, the current regulatory frameworks are not designed for learning systems (6). The Food and Drug Administration approves only “locked” algorithms, which cannot be adjusted in new settings. Using such locked, static tools in ever-changing environments inevitably leads to performance drift. An additional problem is that the ESM and the TREWS, like many other AI-based sepsis tools, are proprietary algorithms (4, 5). AI systems can already feel like a black box because of their inherent complexity. Adding yet another layer of concealment by protecting proprietary intellectual property makes it hard for physicians to obtain meaningful insights into how the algorithms work and whether they fit their patient population. It also makes independent validation very difficult, requiring major effort (4).

Yet, the most critical concern about early sepsis detection tools is that the evidence for their clinical benefits is still preliminary and circumstantial. High-quality trials to investigate the impact of alerts on care processes and patient outcomes are lacking. The evaluation of the TREWS score has been the most extensive to date, but the study design limits the conclusions that can be drawn (5). Implementing and adopting these tools in practice should not be taken lightly, because inappropriate use can cause harm. Sepsis alerts triggering one-size-fits-all protocols may lead to substantial overuse of antibiotics, with profound implications, and may cause a significant burden when physicians must evaluate many patients to detect one patient with sepsis early (4, 7). We need RCT-level evidence to unequivocally show that using these alerts has a net benefit for the patient. Setting up such studies will present significant challenges, including blinding, level of randomization, regulatory barriers, and cohort selection. Shimabukuro and colleagues’ small, randomized study can be seen as a pilot of such a trial (8). Their RCT suggests that a sepsis alert helps improve survival, although their population (N = 142) was largely diluted with patients without sepsis and included only 25 actual cases. As a next step, we need large-scale, high-quality, system-wide RCTs to show whether AI-based sepsis alerts benefit the patients and the healthcare system. These RCTs should adhere to accepted reporting standards, such as CONSORT-AI (Consolidated Standards of Reporting Trials with an AI component), to attain the highest quality evidence (9). Cost-effectiveness evaluations should also be included to prevent overpricing of these tools.
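A rough sense of why a 142-patient pilot cannot settle the question comes from a textbook sample-size calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions under individual randomization. Its inputs are assumptions made purely for illustration: the 20% control-arm mortality is a hypothetical placeholder that appears nowhere in the cited studies, and the 18.7% relative reduction is borrowed from the observational TREWS analysis (5) as a candidate effect size, not as an established treatment effect. Cluster randomization, which a system-wide trial would likely need, would raise the numbers further.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(
    control_mortality: float,
    relative_reduction: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Patients per arm for a two-sided comparison of two proportions
    (normal approximation, equal allocation, no clustering)."""
    treated_mortality = control_mortality * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = (
        control_mortality * (1 - control_mortality)
        + treated_mortality * (1 - treated_mortality)
    )
    effect = control_mortality - treated_mortality
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / effect**2)


# HYPOTHETICAL baseline mortality, chosen only to illustrate the formula.
hypothetical_control_mortality = 0.20
# Assumed effect size, taken from the observational estimate in ref. 5.
assumed_relative_reduction = 0.187

needed = patients_per_arm(hypothetical_control_mortality, assumed_relative_reduction)

# Share of the pilot trial's population (ref. 8) that actually had sepsis.
pilot_sepsis_fraction = 25 / 142

print(f"Patients needed per arm under these assumptions: {needed}")
print(f"Sepsis cases as a share of the pilot population: {pilot_sepsis_fraction:.1%}")
```

Even under these favorable assumptions the required enrollment is more than an order of magnitude larger than the pilot, and enrolling many patients without sepsis would dilute the measurable effect further.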

The potential of high-precision alerts for sepsis detection is promising, but their impact on patient outcomes is still unknown. Further research and development are needed to address the challenges surrounding regulation, proprietary intellectual property, and generation of RCT-level evidence before these tools can be safely used in practice. Until then, caution should be exercised when considering the use of early sepsis detection tools. To showcase both the potential and the concerns about AI, we here admit that not the authors but an AI algorithm wrote this concluding paragraph (10).

Footnotes

Originally Published in Press as DOI: 10.1164/rccm.202212-2284VP on February 1, 2023

Author disclosures are available with the text of this article at www.atsjournals.org.

References

  • 1. Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the Global Burden of Disease Study. Lancet. 2020;395:200–211. doi: 10.1016/S0140-6736(19)32989-7.
  • 2. Evans L, Rhodes A, Alhazzani W, Antonelli M, Coopersmith CM, French C, et al. Surviving Sepsis Campaign: international guidelines for management of sepsis and septic shock 2021. Intensive Care Med. 2021;47:1181–1247. doi: 10.1007/s00134-021-06506-y.
  • 3. Schinkel M, Paranjape K, Nannan Panday RS, Skyttberg N, Nanayakkara PWB. Clinical applications of artificial intelligence in sepsis: a narrative review. Comput Biol Med. 2019;115:103488. doi: 10.1016/j.compbiomed.2019.103488.
  • 4. Wong A, Otles E, Donnelly JP, Krumm A, McCullough J, DeTroyer-Cooley O, et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med. 2021;181:1065–1070. doi: 10.1001/jamainternmed.2021.2626.
  • 5. Adams R, Henry KE, Sridharan A, Soleimani H, Zhan A, Rawat N, et al. Prospective, multi-site study of patient outcomes after implementation of the TREWS machine learning-based early warning system for sepsis. Nat Med. 2022;28:1455–1460. doi: 10.1038/s41591-022-01894-0.
  • 6. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi: 10.1038/s41746-020-00324-0.
  • 7. Klompas M, Calandra T, Singer M. Antibiotics for sepsis-finding the equilibrium. JAMA. 2018;320:1433–1434. doi: 10.1001/jama.2018.12179.
  • 8. Shimabukuro DW, Barton CW, Feldman MD, Mataraso SJ, Das R. Effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: a randomised clinical trial. BMJ Open Respir Res. 2017;4:e000234. doi: 10.1136/bmjresp-2017-000234.
  • 9. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK, SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26:1364–1374. doi: 10.1038/s41591-020-1034-x.
  • 10. OpenAI. https://openai.com/

Articles from American Journal of Respiratory and Critical Care Medicine are provided here courtesy of American Thoracic Society
