2023 Nov 6;2:e45257. doi: 10.2196/45257

Table 4.

Model performance based on a temporal data split for deterioration within 24 hours. We performed a temporal data split for the training and test sets. We fixed the test set to patient encounters recorded during 2020, whereas we expanded the training set gradually to eventually include encounters recorded between 2016 and 2019. We report area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) values with 95% CIs.

Training set	Deterioration within 24 h, n (%)	XGBoost^a AUROC (95% CI)	XGBoost^a AUPRC (95% CI)	Logistic regression AUROC (95% CI)	Logistic regression AUPRC (95% CI)	Neural network AUROC (95% CI)	Neural network AUPRC (95% CI)
2016: n=68,499	2788 (4.1)	0.740 (0.724-0.754)	0.207 (0.187-0.229)	0.679 (0.660-0.696)	0.204 (0.182-0.228)	0.706 (0.688-0.722)	0.211 (0.188-0.233)
2016-2017: n=159,888	7062 (4.4)	0.763 (0.747-0.778)	0.232 (0.211-0.257)	0.687 (0.669-0.704)	0.213 (0.191-0.237)	0.744 (0.727-0.759)	0.241 (0.216-0.266)
2016-2018: n=285,733	12,352 (4.3)	0.758 (0.742-0.773)	0.242 (0.218-0.267)	0.69 (0.671-0.706)	0.216 (0.194-0.240)	0.745 (0.728-0.760)	0.237 (0.213-0.262)
2016-2019: n=431,503	19,261 (4.5)	0.778 (0.763-0.792)	0.25 (0.226-0.276)	0.688 (0.670-0.705)	0.215 (0.192-0.239)	0.754 (0.739-0.769)	0.233 (0.211-0.259)

^aXGBoost: extreme gradient boosting.
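
The evaluation design behind this table (a fixed 2020 test set and a training window that grows from 2016 to 2016-2019) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout (a dict with a `year` key) and the function names `expanding_train_windows`, `auroc`, and `bootstrap_ci` are hypothetical, and the percentile bootstrap is only one common way to obtain a 95% CI; the table does not state which method was used.

```python
import random


def expanding_train_windows(records, first_year, last_train_year, test_year):
    """Yield (label, train, test): fixed test year, training window grows by one year.

    `records` is assumed to be a list of dicts with a "year" key (hypothetical layout).
    """
    test = [r for r in records if r["year"] == test_year]
    for end in range(first_year, last_train_year + 1):
        train = [r for r in records if first_year <= r["year"] <= end]
        label = str(first_year) if end == first_year else f"{first_year}-{end}"
        yield label, train, test


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative case.

    Ties count as 0.5. This pairwise form is quadratic in the sample size, so it
    suits small illustrations only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (assumed method, not taken from the source)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need both classes to bootstrap AUROC")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Calling `expanding_train_windows(records, 2016, 2019, 2020)` yields four training windows labeled like the rows of the table, each paired with the same 2020 test set. For cohorts of the size reported here, a rank-based or library implementation (for example, `roc_auc_score` and `average_precision_score` from scikit-learn) would be a more practical choice than the pairwise `auroc` above.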