Table 2.
Summary of models trained
Models were optimized and evaluated via a nested cross-validation protocol. The prefix and suffix of each model name correspond to the dataset and the contamination reduction technique applied, respectively. Neat, SD and CR refer to the feature spaces with no decontamination, Simple Decontamination, and SHAP Decontamination applied, respectively (see Methods). Karius-Without corresponds to the SHAP-decontaminated feature space after the claimed ‘culture-confirmed’ pathogens are excluded. Karius-Only refers to the feature space containing only genera with ‘culture-confirmed’ pathogens as features.
No. of features | Feature space | Model performance: Precision | Model performance: Recall | Model performance: AUROC |
---|---|---|---|---|
1564 | Karius-Neat | 0.976 | 0.983 | 0.995 |
1564 | Karius-normalised | 0.956 | 0.932 | 0.943 |
111 | Karius-SD | 0.896 | 0.787 | 0.942 |
25 | Karius-CR | 0.883 | 0.810 | 0.942 |
22 | Karius-Without | 0.803 | 0.727 | 0.915 |
22 | Karius-Only | 0.929 | 0.862 | 0.950 |
685 | Pooled-Neat | 0.950 | 0.939 | 0.982 |
21 | Pooled-CR | 0.870 | 0.796 | 0.904 |
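
To illustrate what a nested cross-validation protocol of the kind named in the caption involves, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the toy one-dimensional data, the k-nearest-neighbour scorer, the grid of `k` values, the fold counts, and all function names (`stratified_folds`, `knn_scores`, `nested_cv`, and so on) are assumptions made purely for illustration. The inner loop selects a hyperparameter using only the outer training folds, and the outer loop reports precision, recall and AUROC on data that played no part in that selection.

```python
import random


def stratified_folds(indices, y, k):
    """Split indices into k disjoint folds, each containing both classes."""
    pos = [i for i in indices if y[i] == 1]
    neg = [i for i in indices if y[i] == 0]
    return [pos[f::k] + neg[f::k] for f in range(k)]


def knn_scores(x, y, train_idx, test_idx, k):
    """Score a test sample as the positive fraction of its k nearest training samples."""
    scores = []
    for t in test_idx:
        nearest = sorted(train_idx, key=lambda i: abs(x[i] - x[t]))[:k]
        scores.append(sum(y[i] for i in nearest) / k)
    return scores


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold=0.5):
    """Precision and recall after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def nested_cv(x, y, k_grid=(1, 3, 5), outer_k=5, inner_k=3):
    """Return mean (precision, recall, AUROC) over the outer test folds."""
    fold_metrics = []
    all_idx = list(range(len(x)))
    for test_idx in stratified_folds(all_idx, y, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        inner = stratified_folds(train_idx, y, inner_k)

        def inner_auroc(k):
            # Mean validation AUROC for this k, using outer-training data only.
            total = 0.0
            for val_idx in inner:
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                scores = knn_scores(x, y, fit_idx, val_idx, k)
                total += auroc([y[i] for i in val_idx], scores)
            return total / inner_k

        best_k = max(k_grid, key=inner_auroc)  # hyperparameter chosen by the inner loop
        scores = knn_scores(x, y, train_idx, test_idx, best_k)
        labels = [y[i] for i in test_idx]
        fold_metrics.append(precision_recall(labels, scores) + (auroc(labels, scores),))
    n = len(fold_metrics)
    return tuple(sum(m[j] for m in fold_metrics) / n for j in range(3))


# Toy data (an assumption, not the study's data): one feature, two overlapping classes.
rng = random.Random(0)
x = [rng.gauss(1.0, 1.0) for _ in range(30)] + [rng.gauss(-1.0, 1.0) for _ in range(30)]
y = [1] * 30 + [0] * 30
precision, recall, auc = nested_cv(x, y)
print(f"precision={precision:.3f} recall={recall:.3f} AUROC={auc:.3f}")
```

Because the hyperparameter is chosen without ever seeing the outer test fold, the averaged outer metrics are not inflated by the tuning step, which is the reason for nesting the two loops.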