2024 Mar 27;3(3):e0000478. doi: 10.1371/journal.pdig.0000478

Table 2. Model performance using the eICU-CRD dataset.

Comparison of selected metrics for XGBoost and logistic regression (LR, the baseline) across the defined experiments (see Fig 3). The best-performing metric values are shown in bold.

| Experiment | AUROC (95% CI) | AUPRC (95% CI) | Recall (95% CI) | F1-score (95% CI) | PPV (95% CI) | NPV (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| (A1) 80% training, 20% test, randomly selected patients | | | | | | |
| XGBoost | 0.82 (0.81–0.83) | 0.72 (0.71–0.74) | 0.83 (0.82–0.84) | 0.81 (0.80–0.83) | 0.80 (0.78–0.81) | 0.60 (0.68–0.71) |
| Logistic Regression | 0.79 (0.78–0.80) | 0.66 (0.64–0.67) | 0.85 (0.83–0.86) | 0.80 (0.79–0.82) | 0.77 (0.75–0.78) | 0.69 (0.67–0.70) |
| (A1’) 80% training, 20% test, randomly selected patients, with RFE | | | | | | |
| XGBoost | 0.81 (0.80–0.82) | 0.70 (0.69–0.72) | 0.83 (0.82–0.84) | 0.81 (0.80–0.82) | 0.79 (0.77–0.80) | 0.69 (0.67–0.70) |
| Logistic Regression | 0.77 (0.76–0.78) | 0.62 (0.61–0.64) | 0.87 (0.86–0.88) | 0.80 (0.79–0.81) | 0.74 (0.72–0.75) | 0.69 (0.67–0.70) |
| (A2) 80% training, 20% test, randomly selected hospitals | | | | | | |
| XGBoost | 0.84 (0.83–0.85) | 0.69 (0.67–0.70) | 0.85 (0.84–0.86) | 0.86 (0.85–0.87) | 0.87 (0.86–0.88) | 0.64 (0.62–0.65) |
| Logistic Regression | 0.81 (0.80–0.82) | 0.62 (0.61–0.63) | 0.89 (0.88–0.90) | 0.86 (0.85–0.87) | 0.83 (0.82–0.84) | 0.67 (0.66–0.69) |
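
As a reading aid, the F1-score column can be cross-checked against the PPV and Recall columns. This is a minimal sketch, assuming the table uses the standard positive-class definition of F1 as the harmonic mean of PPV (precision) and recall; the table itself does not state the definition. The helper name `f1_from_ppv_recall` is hypothetical, and the inputs are the rounded point estimates from Table 2, so the derived value may differ from the reported one in the last digit.

```python
def f1_from_ppv_recall(ppv: float, recall: float) -> float:
    """Return F1 as the harmonic mean of PPV (precision) and recall.

    Assumes the standard positive-class definition; this is an
    illustrative helper, not the authors' evaluation code.
    """
    if ppv + recall == 0:
        return 0.0  # convention when there are no true positives
    return 2 * ppv * recall / (ppv + recall)


# (PPV, recall, reported F1) point estimates copied from Table 2.
table_rows = {
    "A1 XGBoost": (0.80, 0.83, 0.81),
    "A1' XGBoost": (0.79, 0.83, 0.81),
    "A2 XGBoost": (0.87, 0.85, 0.86),
    "A2 Logistic Regression": (0.83, 0.89, 0.86),
}

for name, (ppv, recall, reported_f1) in table_rows.items():
    derived_f1 = f1_from_ppv_recall(ppv, recall)
    print(f"{name}: derived F1 = {derived_f1:.3f}, reported F1 = {reported_f1:.2f}")
```

For these rows, the derived values agree with the reported F1-scores to within the rounding of the inputs. This is a consistency check only; it says nothing about how the confidence intervals were obtained.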