Table 3.
Model performance across subgroups. Performance is reported as the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) for predicting deterioration within 24 hours in the test set, with 95% CIs computed by bootstrapping.
| Subgroup | XGBoost^a AUROC (95% CI) | XGBoost^a AUPRC (95% CI) | Logistic regression AUROC (95% CI) | Logistic regression AUPRC (95% CI) | Neural network AUROC (95% CI) | Neural network AUPRC (95% CI) | NEWS^b AUROC (95% CI) | NEWS^b AUPRC (95% CI) |
|---|---|---|---|---|---|---|---|---|
| All patients | 0.778^c (0.770-0.785) | 0.244 (0.231-0.258) | 0.654 (0.644-0.663) | 0.169 (0.158-0.181) | 0.756 (0.749-0.764) | 0.222 (0.211-0.235) | 0.682 (0.673-0.690) | 0.151 (0.142-0.161) |
| Male patients | 0.775 (0.764-0.784) | 0.274 (0.258-0.291) | 0.651 (0.638-0.663) | 0.208 (0.193-0.223) | 0.752 (0.742-0.762) | 0.253 (0.237-0.269) | 0.675 (0.665-0.685) | 0.176 (0.163-0.190) |
| Female patients | 0.785 (0.772-0.797) | 0.214 (0.194-0.236) | 0.676 (0.662-0.692) | 0.137 (0.123-0.155) | 0.766 (0.754-0.779) | 0.194 (0.176-0.216) | 0.704 (0.689-0.718) | 0.129 (0.116-0.145) |
| Age group (years) | | | | | | | | |
| <40 | 0.818 (0.797-0.837) | 0.222 (0.193-0.256) | 0.739 (0.718-0.761) | 0.153 (0.130-0.179) | 0.804 (0.784-0.824) | 0.213 (0.184-0.249) | 0.738 (0.717-0.758) | 0.120 (0.104-0.138) |
| 40-59 | 0.758 (0.744-0.772) | 0.251 (0.230-0.275) | 0.609 (0.592-0.626) | 0.159 (0.143-0.176) | 0.734 (0.721-0.749) | 0.226 (0.208-0.248) | 0.640 (0.626-0.655) | 0.149 (0.134-0.165) |
| ≥60 | 0.779 (0.768-0.790) | 0.258 (0.240-0.278) | 0.663 (0.649-0.676) | 0.196 (0.181-0.214) | 0.757 (0.745-0.768) | 0.235 (0.218-0.254) | 0.700 (0.689-0.712) | 0.177 (0.162-0.192) |

^a XGBoost: extreme gradient boosting.
^b NEWS: National Early Warning Score.
^c The best results in each subgroup are italicized.
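
The caption states that the 95% CIs were computed using bootstrapping but does not say which bootstrap variant, how many resamples, or which AUPRC estimator was used. The following is a minimal sketch of one common approach, assuming a percentile bootstrap over test-set samples, 1000 resamples by default, and AUPRC estimated as average precision. The function names (`auroc`, `auprc`, `bootstrap_ci`) and the synthetic data are hypothetical illustrations, not the authors' implementation.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def auprc(labels, scores):
    """AUPRC estimated as average precision (assumption: ties are not grouped)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos


def bootstrap_ci(labels, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI; returns (point estimate, lower, upper)."""
    n = len(labels)
    if sum(labels) in (0, n):
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain a single class
            stats.append(metric(y, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[max(0, int(alpha / 2 * n_boot))]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)]
    return metric(labels, scores), lo, hi


# Synthetic stand-in for a subgroup's test-set labels and model scores.
demo_rng = random.Random(42)
demo_labels = [1 if demo_rng.random() < 0.2 else 0 for _ in range(300)]
demo_scores = [demo_rng.gauss(1.0 if y else 0.0, 1.0) for y in demo_labels]

for name, metric in (("AUROC", auroc), ("AUPRC", auprc)):
    point, lo, hi = bootstrap_ci(demo_labels, demo_scores, metric, n_boot=200)
    print(f"{name} {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

To obtain per-subgroup intervals like those in the table, the same function would be applied to the labels and scores of each subgroup separately. Small subgroups or rare outcomes widen the interval, which is consistent with the wider CIs reported for the <40 age group.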