[Preprint]. 2024 Sep 18:2024.09.17.24313649. [Version 1] doi: 10.1101/2024.09.17.24313649

Table 2.

Performance of the generalizable model at the development and external validation sites.

Site                            Development Site                                                  Validation Site
Dataset                         Development           Validation            Test                  External Validation
Encounters (Cases/Controls), N  10,457 (1,095/9,362)  3,486 (365/3,121)     4,152 (398/3,754)     6,825 (387/6,438)
Model                           XGB        LR         XGB        LR         XGB        LR         XGB        LR
Feature Selection               IG         IG         IG         IG         IG         IG         IG         IG
AUROC                           1.00       0.90       0.81       0.81       0.87       0.86       0.81       0.82
AUPRC                           0.99       0.71       0.52       0.54       0.62       0.61       0.51       0.48
PPV                             1.00       0.84       0.67       0.69       0.82       0.80       0.26       0.18
NPV                             0.99       0.94       0.93       0.93       0.94       0.94       0.97       0.98
Sensitivity                     0.91       0.47       0.35       0.41       0.35       0.38       0.61       0.70
Specificity                     1.00       0.99       0.98       0.98       0.99       0.99       0.90       0.81
F1 score                        0.95       0.61       0.46       0.51       0.49       0.52       0.37       0.29
F2 score                        0.92       0.52       0.39       0.45       0.39       0.43       0.48       0.45
F3 score                        0.92       0.50       0.37       0.43       0.37       0.41       0.54       0.55
F0.5 score                      0.98       0.73       0.57       0.60       0.65       0.66       0.30       0.22

Abbreviations: AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; CFS, correlation-based feature selection; IG, information gain; LR, logistic regression; NB, naïve Bayes; NPV, negative predictive value; PPV, positive predictive value; XGB, XGBoost (extreme gradient boosting).
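
The F1, F2, F3, and F0.5 rows are presumably instances of the standard F-beta score, which combines PPV (precision) and sensitivity (recall) as F_beta = (1 + beta^2) * PPV * sensitivity / (beta^2 * PPV + sensitivity). The table itself does not state how the scores were computed, so the sketch below is an assumption: it applies that textbook formula to the rounded PPV and sensitivity reported for the external validation XGB column. The helper name `f_beta` is hypothetical and is not taken from the authors' code.

```python
def f_beta(ppv: float, sensitivity: float, beta: float) -> float:
    """Standard F-beta score from precision (PPV) and recall (sensitivity)."""
    b2 = beta ** 2
    denom = b2 * ppv + sensitivity
    if denom == 0:
        # Both precision and recall are zero; the score is defined as 0 here.
        return 0.0
    return (1 + b2) * ppv * sensitivity / denom


# External validation, XGB column, as reported in Table 2 (two-decimal rounding).
ppv, sensitivity = 0.26, 0.61
reported = {1: 0.37, 2: 0.48, 3: 0.54, 0.5: 0.30}

for beta, table_value in reported.items():
    recomputed = f_beta(ppv, sensitivity, beta)
    print(f"F{beta}: recomputed {recomputed:.3f}, table {table_value:.2f}")
```

Because the inputs are already rounded to two decimals, the recomputed values can differ from the tabulated ones by about 0.01. A beta above 1 weights sensitivity more heavily and a beta below 1 weights PPV more heavily, which is why F3 exceeds F0.5 in the external validation columns, where sensitivity is much higher than PPV.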