[Preprint]. 2024 Sep 18:2024.09.17.24313649. [Version 1] doi: 10.1101/2024.09.17.24313649

Table 2.

Performance of the generalizable model at the development and external validation sites.

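Site	Development Site	Development Site	Development Site	Development Site	Development Site	Development Site	Validation Site	Validation Site
Dataset	Development	Development	Validation	Validation	Test	Test	External Validation	External Validation
Encounters (Cases/Controls), N	10,457 (1,095/9,362)	10,457 (1,095/9,362)	3,486 (365/3,121)	3,486 (365/3,121)	4,152 (398/3,754)	4,152 (398/3,754)	6,825 (387/6,438)	6,825 (387/6,438)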
Model	XGB	LR	XGB	LR	XGB	LR	XGB	LR
Feature Selection	IG	IG	IG	IG	IG	IG	IG	IG
AUROC	1.00	0.90	0.81	0.81	0.87	0.86	0.81	0.82
AUPRC	0.99	0.71	0.52	0.54	0.62	0.61	0.51	0.48
PPV	1.00	0.84	0.67	0.69	0.82	0.80	0.26	0.18
NPV	0.99	0.94	0.93	0.93	0.94	0.94	0.97	0.98
Sensitivity	0.91	0.47	0.35	0.41	0.35	0.38	0.61	0.70
Specificity	1.00	0.99	0.98	0.98	0.99	0.99	0.90	0.81
F1 score	0.95	0.61	0.46	0.51	0.49	0.52	0.37	0.29
F2 score	0.92	0.52	0.39	0.45	0.39	0.43	0.48	0.45
F3 score	0.92	0.50	0.37	0.43	0.37	0.41	0.54	0.55
F0.5 score	0.98	0.73	0.57	0.60	0.65	0.66	0.30	0.22

Abbreviations: AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; CFS, correlation-based feature selection; IG, information gain; LR, logistic regression; NB, naïve Bayes; NPV, negative predictive value; PPV, positive predictive value; XGB, XGBoost (extreme gradient boosting).
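
The table does not define the F-scores or state how PPV relates to the other columns. As an illustrative check, the minimal Python sketch below assumes the standard F-beta definition (a weighted harmonic mean of PPV and sensitivity) and the standard identity that gives PPV from sensitivity, specificity, and case prevalence. The function names `f_beta` and `ppv_from_rates` are hypothetical and are not taken from the authors' code. The inputs are the rounded two-decimal values from the external validation XGB column, so the results may differ from the reported values in the second decimal place.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: beta > 1 weights recall (sensitivity) more,
    beta < 1 weights precision (PPV) more. Assumed definition, not the authors' code."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV implied by sensitivity, specificity, and case prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    total = true_pos + false_pos
    return 0.0 if total == 0 else true_pos / total


# External validation site, XGB column (rounded values from the table).
ppv, sens, spec = 0.26, 0.61, 0.90
prevalence = 387 / 6825  # cases / encounters at the external site

for beta in (0.5, 1, 2, 3):
    print(f"F{beta} from rounded PPV and sensitivity: {f_beta(ppv, sens, beta):.2f}")

print(f"PPV implied by sensitivity, specificity, prevalence: "
      f"{ppv_from_rates(sens, spec, prevalence):.2f}")
```

Under these assumptions, the lower PPV at the external site is in line with its lower specificity and case prevalence of about 6%, while the F2 and F3 scores, which weight sensitivity more heavily, exceed F1 there.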