Table 2. Estimated performance metrics
| Algorithm | Missing values handling | AUROC | Accuracyᵃ, % | Sensitivityᵃ, % | Specificityᵃ, % | PPVᵃ, % | NPVᵃ, % |
|---|---|---|---|---|---|---|---|
| A) All studies (group 1; N = 8404)ᵇ | | | | | | | |
| Logistic regression | Only complete observations | 0.705 | 82.5 | 37.4 | 85.5 | 14.7 | 95.3 |
| SVM with linear kernel | Only complete observations | 0.686–0.691 | 75.1–75.7 | 51.0–52.9 | 76.6–77.2 | 12.9–13.3 | 95.9–96.1 |
| Random forest | Only complete observations | 0.682–0.733 | 93.0–93.7 | 0.0–6.2 | 98.8–100.0 | 0.0–30.9 | 93.7–94.0 |
| Extreme gradient boosting treesᶜ | Whole population (no missing value imputation) | 0.656–0.739 | 83.7–93.6 | 3.8–27.1 | 87.2–98.9 | 9.9–20.0 | 94.5–95.5 |
| Boosted treesᶜ | MIA | 0.703–0.726 | 89.6–91.5 | 11.3–18.4 | 93.9–96.3 | 14.6–17.0 | 94.8–95.1 |
| Logistic regressionᶜ | ML single imputation | 0.693 | 80.1 | 40.9 | 82.5 | 12.2 | 95.9 |
| Logistic regressionᶜ | ML multiple imputation | 0.694–0.697 | 79.8–80.2 | 40.0–41.5 | 82.1–82.5 | 11.9–12.4 | 95.8–95.9 |
| B) Phase 3 and 3b/4 studies (group 2; N = 7565)ᵇ | | | | | | | |
| Logistic regression | Only complete observations | 0.696 | 81.9 | 36.3 | 85.0 | 14.3 | 95.1 |
| SVM with linear kernel | Only complete observations | 0.680–0.686 | 74.8–75.5 | 48.9–51.3 | 76.6–77.2 | 12.6–13.4 | 95.6–95.8 |
| Random forest | Only complete observations | 0.673–0.723 | 92.5–93.5 | 0.0–5.1 | 98.6–100.0 | 0.0–41.7 | 93.5–93.8 |
| Extreme gradient boosting treesᶜ | Whole population (no missing value imputation) | 0.599–0.730 | 87.9–92.9 | 4.6–22.6 | 92.2–98.6 | 11.8–19.9 | 94.1–94.9 |
| Boosted treesᶜ | MIA | 0.702–0.720 | 88.8–90.9 | 13.1–18.8 | 93.4–96.0 | 14.9–17.9 | 94.4–94.7 |
| Logistic regressionᶜ | ML single imputation | 0.702 | 82.4 | 35.7 | 85.4 | 13.8 | 95.3 |
| Logistic regressionᶜ | ML multiple imputation | 0.701–0.704 | 82.4–82.6 | 36.4–37.6 | 85.4–85.6 | 14.1–14.5 | 95.4–95.5 |
| C) ORAL Surveillance only (group 3; N = 2911)ᵇ | | | | | | | |
| Logistic regression | Only complete observations | 0.611 | 75.3 | 32.5 | 80.9 | 18.3 | 90.1 |
| SVM with linear kernel | Only complete observations | 0.607–0.610 | 73.1–73.7 | 34.7–36.3 | 78.0–78.8 | 17.3–17.9 | 90.1–90.3 |
| Random forest | Only complete observations | 0.589–0.635 | 87.7–88.4 | 0.0–3.4 | 98.9–100.0 | 0.0–63.9 | 88.3–88.6 |
| Extreme gradient boosting treesᶜ | Whole population (no missing value imputation) | 0.563–0.643 | 74.0–87.4 | 3.9–24.1 | 80.5–98.3 | 14.1–27.6 | 88.6–89.3 |
| Boosted treesᶜ | MIA | 0.603–0.630 | 86.3–87.5 | 3.3–8.0 | 96.6–98.6 | 20.1–26.6 | 88.5–88.8 |
| Logistic regressionᶜ | ML single imputation | 0.624 | 76.1 | 35.3 | 81.5 | 20.1 | 90.5 |
| Logistic regressionᶜ | ML multiple imputation | 0.621–0.629 | 75.9–76.4 | 34.8–36.3 | 81.3–81.8 | 19.8–20.7 | 90.5–90.7 |
The AUROC is computed from the estimated probabilities provided by the models and does not depend on any cut-off value, whereas all other performance measures (i.e., accuracy, sensitivity, specificity, PPV, and NPV) are obtained by applying a cut-off value of 0.5 to the predicted probability (i.e., a patient is classified as having serious infections if their predicted probability is ≥ 0.5).
AUROC area under receiver operating characteristic, MIA missing incorporated in attribute, ML maximum likelihood, N total number of patients included in each group, NPV negative predictive value, PPV positive predictive value, SVM support vector machines
ᵃ Cut-off = 0.5
ᵇ The total number of patients assessed in each model differed according to how missing values were handled by the model
ᶜ Complete patient set. No patients excluded based on missing variables
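The distinction drawn in the table note between the cut-off-free AUROC and the cut-off-dependent measures can be illustrated with a minimal Python sketch. It is not the study's code: the function names `threshold_metrics` and `auroc` are hypothetical, and the toy labels and probabilities are invented solely for illustration. The sketch assumes binary outcomes coded 1 (serious infection) and 0 (no serious infection).

```python
from typing import Dict, Sequence


def threshold_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], cutoff: float = 0.5
) -> Dict[str, float]:
    """Cut-off-dependent metrics (in %), classifying as positive when prob >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def pct(num: int, den: int) -> float:
        # Undefined ratios (e.g. PPV when nothing is predicted positive) become NaN.
        return 100.0 * num / den if den else float("nan")

    return {
        "accuracy": pct(tp + tn, tp + fp + tn + fn),
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
    }


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Cut-off-free AUROC: chance that a random positive outranks a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large data sets.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 2 events among 10 patients (not taken from the study).
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_probs = [0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]

print(threshold_metrics(toy_labels, toy_probs))
print(auroc(toy_labels, toy_probs))
```

When the outcome is uncommon and a model rarely assigns probabilities of 0.5 or higher, sensitivity can fall to 0% while accuracy and specificity stay high, and PPV becomes undefined or unstable. Changing `cutoff` alters all five thresholded metrics but leaves the AUROC unchanged.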