
This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

medRxiv [Preprint]. 2026 Jan 15:2026.01.13.26344046. [Version 1] doi: 10.64898/2026.01.13.26344046

Achieving Expert-Level Clinical Infection Detection with LLMs from Clinical Documents: Validation in Complex Patient Cases with Cirrhosis

Yufei Yu, James Ford, Eileen Kim, Avi Patel, Gabriel Wardi, Rohit Loomba, Atul Malhotra, Shamim Nemati, Joseph C Ahn
PMCID: PMC12870549  PMID: 41646769

ABSTRACT

Background

Systemic infections are a leading cause of hospitalization and death among patients with cirrhosis. Timely and accurate infection identification is essential for both clinical care and the development of predictive models. However, existing methods such as ICD-10 coding are unreliable, and manual chart review is resource-intensive and difficult to scale. This study aimed to develop and validate an automated large language model (LLM)-based approach for infection classification and subtyping in patients with cirrhosis presenting to the emergency department (ED).

Method

We developed INFEHR (INfection identification and subtyping using Free-text EHR analysis), an LLM-powered pipeline utilizing Claude 3.5 Sonnet to analyze clinical notes from the first 72 hours of admission. Model outputs were compared against a physician-adjudicated gold standard in a cohort of 1,000 encounters from patients with cirrhosis who presented to the ED. Performance was benchmarked against ICD-10 code–based labeling and CDC Adult Sepsis Event criteria.
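The restriction to notes from the first 72 hours of admission can be illustrated with a short sketch. This is not the authors' code: the `Note` record, the `notes_in_window` helper, and the choice to measure the window from a single admission timestamp are all assumptions made for illustration, and the abstract does not say how ED notes written before the formal admission time were handled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Note:
    """Hypothetical record for one clinical note (not from the preprint)."""
    encounter_id: str
    written_at: datetime
    text: str


def notes_in_window(notes, admitted_at, hours=72):
    """Return notes written within `hours` of admission, oldest first.

    Assumption: the window runs from `admitted_at` to `admitted_at + hours`,
    inclusive at both ends.
    """
    cutoff = admitted_at + timedelta(hours=hours)
    selected = [n for n in notes if admitted_at <= n.written_at <= cutoff]
    return sorted(selected, key=lambda n: n.written_at)


if __name__ == "__main__":
    admitted = datetime(2025, 1, 1, 8, 0)
    example_notes = [
        Note("e1", datetime(2025, 1, 4, 9, 0), "discharge planning"),
        Note("e1", datetime(2025, 1, 2, 8, 0), "progress note"),
        Note("e1", datetime(2025, 1, 1, 9, 0), "ED provider note"),
    ]
    for note in notes_in_window(example_notes, admitted):
        print(note.written_at.isoformat(), note.text)
```

The selected notes would then be passed to the LLM for classification; that step is omitted here because the abstract does not describe the prompt or the API call.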

Results

INFEHR achieved 94.7% overall accuracy, with 99.5% sensitivity and 92.8% positive predictive value for identifying infection presence, outperforming ICD-10–based classification across all metrics (p < 0.0001). The model also demonstrated strong performance in classifying pathogen type and infection site. This pipeline processed notes within seconds, offering improvements in efficiency and scalability over manual review.
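For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and positive predictive value are conventionally computed from binary predictions and a gold standard. The function name `binary_metrics` and the toy labels are illustrative assumptions; they are not the study's data or code, and the statistical test behind the reported p-value is not specified in the abstract.

```python
import math


def binary_metrics(predicted, gold):
    """Compute accuracy, sensitivity, and PPV for binary labels.

    `predicted` and `gold` are equal-length sequences of booleans,
    where True means "infection present".
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one labeled encounter is required")

    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)          # true positives
    fp = sum(1 for p, g in pairs if p and not g)      # false positives
    fn = sum(1 for p, g in pairs if not p and g)      # false negatives
    tn = sum(1 for p, g in pairs if not p and not g)  # true negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly infected encounters that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # PPV: share of flagged encounters that were truly infected.
        "ppv": tp / (tp + fp) if (tp + fp) else math.nan,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not taken from the study.
    gold = [True, True, True, False, False]
    predicted = [True, True, False, True, False]
    print(binary_metrics(predicted, gold))
```

A high sensitivity paired with a somewhat lower PPV, as reported here, means that few true infections were missed while some flagged encounters were false positives.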

Conclusion

INFEHR offers a scalable, reproducible, and accurate method for infection phenotyping in cirrhosis. By overcoming limitations of traditional coding and manual review, it supports high-throughput infection surveillance, improves cohort construction for clinical research, and enables future integration into real-time decision-support tools in hepatology.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
