2023 Jul 12;15:49. doi: 10.1186/s13073-023-01202-6

Fig. 3.

Stool-based classifier for COVID-19 disease severity. a Box and scatter plots of the top 50 microbial features and their differential abundance by COVID-19 severity, with bar plots indicating the univariate (nominal) p-value, fold change by study group, prevalence, and taxon-level contribution to the area under the curve (AUC) of a random forest classifier. b Receiver operating characteristic (ROC) and precision-recall curves demonstrating excellent performance in classifying stool samples by COVID-19 severity. Removing stool SARS-CoV-2 viral load and clinical metadata only modestly decreased performance, as did limiting the input to the top 20 differentially abundant microbes by disease class. A sensitivity analysis using only the first stool sample provided by each participant, which should minimize the possibility of overfitting due to repeated measures and longitudinal sampling, still performed well. c External validation of the taxa-only random forest model on an independent dataset of 24 patients with mild/moderate COVID-19 and 14 with severe/critical COVID-19 (Xu et al. 2022).
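
The two ideas in panel b that are easiest to misread are the first-sample sensitivity analysis and the ROC summary itself. The sketch below illustrates both in plain Python. It is not the authors' pipeline: the record layout `(participant_id, collection_day, payload)`, the function names, and the toy values are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than with any particular library.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical record layout: (participant_id, collection_day, payload).
Sample = Tuple[str, int, Any]


def first_sample_per_participant(samples: Sequence[Sample]) -> List[Sample]:
    """Keep only the earliest sample from each participant.

    Restricting to one sample per person removes repeated measures,
    so no participant contributes more than one observation.
    """
    earliest: Dict[str, Sample] = {}
    for sample in samples:
        pid, day = sample[0], sample[1]
        if pid not in earliest or day < earliest[pid][1]:
            earliest[pid] = sample
    return list(earliest.values())


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    scores higher than a randomly chosen negative (label 0); ties
    count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data only: two samples from P1, one from P2.
toy_samples = [("P1", 5, "later"), ("P1", 2, "first"), ("P2", 1, "only")]
toy_first = first_sample_per_participant(toy_samples)

# Toy labels (1 = severe/critical, 0 = mild/moderate) and classifier scores.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With these toy inputs, `toy_first` keeps one record per participant and `toy_auc` is 0.75, since three of the four positive-negative pairs are ranked correctly.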