Table 2.
Model | AUROC | AUPRC | Accuracy | F-1 | Precision | Recall |
---|---|---|---|---|---|---|
In-ICU mortality | ||||||
LR | 85.1 ± 3.2 | 39.5 ± 7.2 | 93.4 ± 0.6 | 30.1 ± 7.6 | 55.0 ± 11.6 | 20.7 ± 6.1 |
RF | 89.1 ± 2.2 | 45.9 ± 7.3 | 93.5 ± 0.3 | 14.2 ± 6.5 | 81.8 ± 19.2 | 7.8 ± 3.9 |
GRU-D | 89.4 ± 2.3 | 50.8 ± 6.8 | 94.0 ± 0.6 | 38.9 ± 8.1 | 66.2 ± 10.3 | 27.6 ± 6.5 |
TCN | 89.2 ± 2.5 | 50.8 ± 7.0 | 94.3 ± 0.6 | 46.6 ± 7.3 | 64.5 ± 8.7 | 36.5 ± 7.1 |
In-hospital mortality | ||||||
LR | 83.6 ± 2.6 | 44.7 ± 5.7 | 91.0 ± 0.7 | 35.7 ± 6.0 | 61.4 ± 9.3 | 25.2 ± 5.3 |
RF | 86.4 ± 2.3 | 49.3 ± 5.9 | 90.7 ± 0.4 | 14.5 ± 5.8 | 85.1 ± 14.0 | 7.9 ± 3.4 |
GRU-D | 87.3 ± 2.3 | 52.1 ± 5.6 | 91.6 ± 0.8 | 44.2 ± 6.0 | 65.4 ± 7.5 | 33.4 ± 5.8 |
TCN | 87.7 ± 2.1 | 53.0 ± 6.0 | 91.2 ± 0.9 | 47.2 ± 6.0 | 58.7 ± 6.7 | 39.5 ± 6.2 |
Length of stay (LOS > 3) | ||||||
LR | 69.0 ± 2.1 | 61.7 ± 2.8 | 65.5 ± 1.8 | 53.5 ± 2.7 | 63.6 ± 2.8 | 46.2 ± 2.9 |
RF | 71.4 ± 2.0 | 65.5 ± 2.8 | 67.3 ± 1.7 | 55.3 ± 2.7 | 67.1 ± 2.8 | 47.0 ± 3.0 |
GRU-D | 72.2 ± 2.0 | 65.7 ± 2.7 | 68.1 ± 1.7 | 59.4 ± 2.5 | 65.6 ± 2.6 | 54.2 ± 3.0 |
TCN | 71.6 ± 2.2 | 65.0 ± 2.7 | 67.0 ± 1.7 | 55.6 ± 2.7 | 66.0 ± 2.8 | 48.0 ± 2.9 |
Length of stay (LOS > 7) | ||||||
LR | 66.8 ± 4.2 | 15.9 ± 3.3 | 91.7 ± 0.3 | 2.3 ± 2.8 | 15.2 ± 17.7 | 1.3 ± 1.6 |
RF | 75.3 ± 3.5 | 22.0 ± 4.5 | 92.1 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
GRU-D | 74.4 ± 3.8 | 22.4 ± 4.5 | 92.0 ± 0.4 | 9.8 ± 5.3 | 44.9 ± 20.4 | 5.5 ± 3.2 |
TCN | 73.5 ± 3.6 | 18.8 ± 3.5 | 91.8 ± 0.3 | 3.7 ± 3.5 | 25.0 ± 21.9 | 2.0 ± 1.9 |
All values are shown in %. Primary evaluation metrics: AUROC, AUPRC, accuracy, F-1. Secondary evaluation metrics: precision, recall. TCN, temporal convolutional network; GRU-D, gated recurrent unit with decay; RF, random forest; LR, logistic regression; AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve.
Best-in-task values for primary evaluation metrics are in bold.
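
The F-1 column can be related to the precision and recall columns through the harmonic mean, F-1 = 2PR / (P + R). The following is a minimal Python sketch, not the authors' evaluation code, that recomputes F-1 from a few precision and recall pairs copied from the table. The function name `f1_from_precision_recall` and the `rows` list are illustrative assumptions. Because the table reports summary values with a spread, the harmonic mean of the reported precision and recall is only expected to approximate the reported F-1, not to match it exactly.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in %)."""
    if precision + recall == 0:
        # No positive predictions and no recovered positives: define F-1 as 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (task, model, precision, recall, reported F-1), values copied from Table 2.
rows = [
    ("In-ICU mortality", "LR", 55.0, 20.7, 30.1),
    ("In-ICU mortality", "RF", 81.8, 7.8, 14.2),
    ("In-ICU mortality", "TCN", 64.5, 36.5, 46.6),
    ("Length of stay (LOS > 7)", "RF", 0.0, 0.0, 0.0),
]

for task, model, precision, recall, reported in rows:
    recomputed = f1_from_precision_recall(precision, recall)
    print(f"{task:26s} {model:4s} recomputed={recomputed:5.1f} reported={reported:5.1f}")
```

The RF row for LOS > 7 illustrates why accuracy alone can mislead on an imbalanced task: precision, recall, and F-1 are all 0.0, yet accuracy is 92.1, which is what a classifier that predicts no positive cases would obtain when positives are rare.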