2020 Aug 21:hvaa200. doi: 10.1093/clinchem/hvaa200

Routine laboratory blood tests predict SARS-CoV-2 infection using machine learning

He S Yang, Yu Hou, Ljiljana V Vasovic, Peter Steel, Amy Chadburn, Sabrina E Racine-Brzostek, Priya Velu, Melissa M Cushing, Massimo Loda, Rainu Kaushal, Zhen Zhao, Fei Wang
PMCID: PMC7499540  PMID: 32821907

Abstract

Background

Accurate diagnostic strategies that rapidly identify SARS-CoV-2 positive individuals are urgently needed for the management of patient care and the protection of health care personnel. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available, with a turn-around time (TAT) usually within 1 to 2 hours.

Method

We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory test results obtained within the two days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosted decision tree (GBDT) model on 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital.
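To make the GBDT idea concrete, the following is a minimal sketch of gradient boosting for a binary outcome, written in plain Python with one-split trees (stumps). It is not the authors' implementation: the function names, the hyperparameters (n_rounds, learning_rate), and the four-row toy dataset are all hypothetical and chosen only for illustration. Each round fits a stump to the residuals between the labels and the current predicted probabilities, then adds a damped version of that stump to the running score.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Find the single (feature, threshold) split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no feature varies: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return 0, float("inf"), mean, mean
    return best[1:]

def fit_gbdt(X, y, n_rounds=20, learning_rate=0.3):
    """Boost stumps on the log-loss gradient (label minus predicted probability)."""
    p = sum(y) / len(y)            # assumes both classes are present
    f0 = math.log(p / (1.0 - p))   # initial score: log-odds of the positive class
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(si) for yi, si in zip(y, scores)]
        j, thr, left_val, right_val = fit_stump(X, residuals)
        stumps.append((j, thr, left_val, right_val))
        scores = [
            si + learning_rate * (left_val if row[j] <= thr else right_val)
            for row, si in zip(X, scores)
        ]
    return f0, learning_rate, stumps

def predict_proba(model, row):
    """Predicted probability of the positive class for one feature row."""
    f0, learning_rate, stumps = model
    score = f0 + sum(
        learning_rate * (lv if row[j] <= thr else rv) for j, thr, lv, rv in stumps
    )
    return sigmoid(score)

# Hypothetical toy data: two made-up numeric features per patient, label 1 = positive.
X_toy = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
y_toy = [0, 0, 1, 1]
model = fit_gbdt(X_toy, y_toy)
print(predict_proba(model, [8.5, 4.0]), predict_proba(model, [1.5, 4.0]))
```

A production model on 30 features and thousands of patients would normally use deeper trees and an established library rather than this hand-rolled loop; the sketch only shows the additive, residual-fitting structure that "gradient boosted" refers to.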

Results

The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Applying the model to an independent patient dataset from a separate hospital yielded a comparable AUC (0.838), supporting its generalizability. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within two days.
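As an illustration of how an AUC and its confidence interval can be computed, the sketch below implements the pairwise definition of AUC and a percentile bootstrap interval in plain Python. The abstract does not state how the 95% CI was derived, so the bootstrap is an assumption here, and the function names, the resampling count, and the four-point example data are hypothetical.

```python
import random

def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not from the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical example: 1 = RT-PCR positive, scores = model-predicted probabilities.
demo_labels = [0, 0, 1, 1]
demo_scores = [0.1, 0.4, 0.35, 0.8]
demo_auc = auc(demo_labels, demo_scores)
demo_ci = bootstrap_auc_ci(demo_labels, demo_scores)
print(demo_auc, demo_ci)
```

In the example, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. With only four points the bootstrap interval is very wide; on a cohort the size of the one reported above it would be far narrower.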

Conclusion

This model, which uses routine laboratory test results, offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.

Keywords: SARS-CoV-2, COVID-19, machine learning, gradient boosted decision tree, routine laboratory tests


Articles from Clinical Chemistry are provided here courtesy of Oxford University Press
