Table 4.
Model performance based on a temporal data split for deterioration within 24 hours. We performed a temporal data split for the training and test sets. We fixed the test set to patient encounters recorded during 2020, whereas we expanded the training set gradually to eventually include encounters recorded between 2016 and 2019. We report area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) values with 95% CIs.
| Training set | Deterioration within 24 h, n (%) | XGBoost^a AUROC (95% CI) | XGBoost^a AUPRC (95% CI) | Logistic regression AUROC (95% CI) | Logistic regression AUPRC (95% CI) | Neural network AUROC (95% CI) | Neural network AUPRC (95% CI) |
|---|---|---|---|---|---|---|---|
| 2016: n=68,499 | 2788 (4.1) | 0.740 (0.724-0.754) | 0.207 (0.187-0.229) | 0.679 (0.660-0.696) | 0.204 (0.182-0.228) | 0.706 (0.688-0.722) | 0.211 (0.188-0.233) |
| 2016-2017: n=159,888 | 7062 (4.4) | 0.763 (0.747-0.778) | 0.232 (0.211-0.257) | 0.687 (0.669-0.704) | 0.213 (0.191-0.237) | 0.744 (0.727-0.759) | 0.241 (0.216-0.266) |
| 2016-2018: n=285,733 | 12,352 (4.3) | 0.758 (0.742-0.773) | 0.242 (0.218-0.267) | 0.69 (0.671-0.706) | 0.216 (0.194-0.240) | 0.745 (0.728-0.760) | 0.237 (0.213-0.262) |
| 2016-2019: n=431,503 | 19,261 (4.5) | 0.778 (0.763-0.792) | 0.25 (0.226-0.276) | 0.688 (0.670-0.705) | 0.215 (0.192-0.239) | 0.754 (0.739-0.769) | 0.233 (0.211-0.259) |
^a XGBoost: extreme gradient boosting.
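
The expanding-window design described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the record field `date`, the helper names, and the use of a percentile bootstrap for the 95% CIs are all assumptions, since the source does not state how the intervals were computed.

```python
import random
from datetime import date


def expanding_temporal_splits(encounters, test_year=2020, first_train_year=2016):
    """Yield (label, train, test): fixed test year, training window grows by one year.

    `encounters` is a list of dicts with a hypothetical "date" field.
    """
    test = [e for e in encounters if e["date"].year == test_year]
    for last_year in range(first_train_year, test_year):
        train = [
            e for e in encounters
            if first_train_year <= e["date"].year <= last_year
        ]
        if last_year == first_train_year:
            label = str(first_train_year)
        else:
            label = f"{first_train_year}-{last_year}"
        yield label, train, test


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, metric=auroc, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method; requires both classes in `labels`)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; the metric would be undefined
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Each yielded split corresponds to one row of the table: a model would be fit on `train`, scored on the shared `test` set, and the resulting scores passed to `auroc` and `bootstrap_ci`. The pairwise AUROC above is quadratic in the number of encounters, so at the scale reported here a rank-based implementation from an established library would be the practical choice.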