Table 2.
Performance metrics of recurrent neural network (RNN) and physicians on a balanced test set.
All values are reported with 95% CIs. AUC, PR_AUC, and Brier are threshold-independent metrics; Acc through PPV are based on a threshold of 0.5 for positive/negative classification.
Predictor | AUC | PR_AUC | Brier | Acc | Sens | Spec | F1 | FPR | NPV | PPV |
---|---|---|---|---|---|---|---|---|---|---|
RNN | 0.901 (0.870–0.932) | 0.907 (0.877–0.937) | 0.122 (0.088–0.156) | 0.846 (0.808–0.884) | 0.851 (0.798–0.904) | 0.840 (0.787–0.894) | 0.847 (0.797–0.897) | 0.160 (0.106–0.214) | 0.850 (0.797–0.903) | 0.842 (0.788–0.896) |
Physicians | 0.745 (0.699–0.791) | 0.747 (0.701–0.793) | 0.217 (0.174–0.260) | 0.711 (0.664–0.759) | 0.594 (0.521–0.667) | 0.829 (0.773–0.884) | 0.673 (0.609–0.738) | 0.171 (0.116–0.227) | 0.671 (0.601–0.741) | 0.776 (0.715–0.838) |
n = 350 admissions/patients.
Abbreviations: AUC, area under curve; PR_AUC, precision-recall AUC; Brier, Brier score; Acc, accuracy; Sens, sensitivity; Spec, specificity; F1, F1-score; FPR, false-positive rate; NPV, negative predictive value; PPV, positive predictive value; CI, confidence interval.
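The threshold-based columns of Table 2 all derive from a single 2×2 confusion matrix, and the sketch below shows that relationship. The source does not report raw counts, so this is a minimal sketch under two assumptions: that "balanced" means 175 positive and 175 negative cases (n = 350), and that the counts can be back-calculated from the reported Sens and Spec. The function names `threshold_metrics` and `counts_from_rates` are hypothetical helpers introduced for illustration only.

```python
# Sketch: recompute the threshold-based metrics of Table 2 from a 2x2 confusion matrix.
# ASSUMPTION: the balanced test set holds 175 positive and 175 negative cases (n = 350).
# ASSUMPTION: counts are back-calculated from the reported Sens and Spec; the source
# does not report raw counts.

def threshold_metrics(tp, fp, tn, fn):
    """Return the threshold-based metrics named in the table footnote."""
    return {
        "Acc": (tp + tn) / (tp + fp + tn + fn),
        "Sens": tp / (tp + fn),
        "Spec": tn / (tn + fp),
        "F1": 2 * tp / (2 * tp + fp + fn),
        "FPR": fp / (fp + tn),
        "NPV": tn / (tn + fn),
        "PPV": tp / (tp + fp),
    }

def counts_from_rates(sens, spec, n_pos, n_neg):
    """Back-calculate integer counts from rounded sensitivity and specificity."""
    tp = round(sens * n_pos)
    tn = round(spec * n_neg)
    return {"tp": tp, "fp": n_neg - tn, "tn": tn, "fn": n_pos - tp}

N_POS = N_NEG = 175  # assumed 50/50 split of the 350 admissions

# Sens and Spec values are taken from the two rows of Table 2.
rnn = threshold_metrics(**counts_from_rates(0.851, 0.840, N_POS, N_NEG))
physicians = threshold_metrics(**counts_from_rates(0.594, 0.829, N_POS, N_NEG))

for name, metrics in (("RNN", rnn), ("Physicians", physicians)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under these assumed counts, the recomputed Acc, F1, FPR, NPV, and PPV agree with the reported point estimates to three decimals for both rows, and FPR equals 1 − Spec. The threshold-independent metrics (AUC, PR_AUC, Brier) cannot be recovered this way because they require the per-case predicted probabilities.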