Table 3.
Best threshold chosen by highest F1 score.
AKIᵃ | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Score | Threshold | F1 score (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Precision (95% CI) | TN | FP | FN | TP | Accuracy (%) |
ASA | 3 | 0.412 (0.393–0.43) | 0.914 (0.896–0.93) | 0.27 (0.255–0.284) | 0.266 (0.251–0.281) | 901 | 2439 | 83 | 884 | 41.4 |
LR OFS | 0.273071 | 0.538 (0.512–0.563) | 0.631 (0.597–0.661) | 0.793 (0.78–0.807) | 0.469 (0.442–0.497) | 2650 | 690 | 357 | 610 | 75.7 |
LR OFS + MAP features | 0.27574 | 0.537 (0.512–0.563) | 0.624 (0.59–0.654) | 0.798 (0.785–0.812) | 0.472 (0.444–0.5) | 2666 | 674 | 364 | 603 | 75.9 |
LR RFS | 0.287606 | 0.537 (0.51–0.563) | 0.607 (0.575–0.637) | 0.811 (0.798–0.823) | 0.482 (0.454–0.511) | 2708 | 632 | 380 | 587 | 76.5 |
DNN individual OFS | 0.408436 | 0.545 (0.52–0.569) | 0.654 (0.622–0.682) | 0.784 (0.77–0.798) | 0.467 (0.441–0.493) | 2618 | 722 | 335 | 632 | 75.5 |
DNN individual OFS + MAP features | 0.481765 | 0.559 (0.533–0.587) | 0.548 (0.515–0.579) | 0.881 (0.87–0.892) | 0.571 (0.542–0.603) | 2942 | 398 | 437 | 530 | 80.6 |
DNN individual RFS | 0.406397 | 0.542 (0.516–0.568) | 0.618 (0.586–0.648) | 0.808 (0.794–0.821) | 0.483 (0.455–0.51) | 2699 | 641 | 369 | 598 | 76.5 |
DNN combined OFS | 0.906036 | 0.548 (0.521–0.575) | 0.568 (0.536–0.598) | 0.854 (0.843–0.865) | 0.53 (0.501–0.559) | 2853 | 487 | 418 | 549 | 79.0 |
DNN combined OFS + MAP features | 0.901522 | 0.549 (0.524–0.575) | 0.58 (0.55–0.61) | 0.846 (0.833–0.857) | 0.521 (0.493–0.552) | 2825 | 515 | 406 | 561 | 78.6 |
DNN combined RFS | 0.869984 | 0.557 (0.53–0.583) | 0.575 (0.543–0.606) | 0.858 (0.846–0.87) | 0.539 (0.51–0.569) | 2865 | 475 | 411 | 556 | 79.4 |
Reintubation | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Score | Threshold | F1 score (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Precision (95% CI) | TN | FP | FN | TP | Accuracy (%) |
ASA | 4 | 0.152 (0.121–0.182) | 0.44 (0.361–0.517) | 0.941 (0.937–0.945) | 0.092 (0.072–0.112) | 11,142 | 695 | 89 | 70 | 93.5 |
LR OFS | 0.08 | 0.21 (0.157–0.261) | 0.296 (0.223–0.366) | 0.98 (0.977–0.982) | 0.163 (0.121–0.207) | 11,595 | 242 | 112 | 47 | 97.0 |
LR OFS + MAP features | 0.081 | 0.223 (0.168–0.276) | 0.314 (0.24–0.389) | 0.98 (0.977–0.982) | 0.172 (0.129–0.22) | 11,597 | 240 | 109 | 50 | 97.1 |
LR RFS | 0.079193 | 0.211 (0.161–0.262) | 0.302 (0.231–0.375) | 0.979 (0.977–0.982) | 0.163 (0.121–0.207) | 11,590 | 247 | 111 | 48 | 97.0 |
DNN individual OFS | 0.715748 | 0.21 (0.16–0.257) | 0.333 (0.257–0.406) | 0.975 (0.972–0.978) | 0.153 (0.115–0.192) | 11,544 | 293 | 106 | 53 | 96.7 |
DNN individual OFS + MAP features | 0.734977 | 0.197 (0.149–0.243) | 0.321 (0.247–0.397) | 0.974 (0.971–0.977) | 0.142 (0.104–0.179) | 11,530 | 307 | 108 | 51 | 96.5 |
DNN individual RFS | 0.687943 | 0.22 (0.17–0.269) | 0.371 (0.297–0.445) | 0.973 (0.97–0.976) | 0.156 (0.117–0.196) | 11,518 | 319 | 100 | 59 | 96.5 |
DNN combined OFS | 0.769994 | 0.206 (0.164–0.252) | 0.352 (0.284–0.428) | 0.972 (0.969–0.975) | 0.145 (0.113–0.181) | 11,508 | 329 | 103 | 56 | 96.4 |
DNN combined OFS + MAP features | 0.784518 | 0.228 (0.179–0.278) | 0.34 (0.271–0.414) | 0.978 (0.975–0.981) | 0.171 (0.131–0.215) | 11,576 | 261 | 105 | 54 | 96.9 |
DNN combined RFS | 0.746933 | 0.213 (0.166–0.263) | 0.289 (0.221–0.36) | 0.981 (0.978–0.983) | 0.168 (0.128–0.214) | 11,610 | 227 | 113 | 46 | 97.2 |
Mortality | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Score | Threshold | F1 score (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Precision (95% CI) | TN | FP | FN | TP | Accuracy (%) |
ASA | 5 | 0.239 (0.138–0.356) | 0.161 (0.088–0.253) | 0.999 (0.998–0.999) | 0.467 (0.3–0.667) | 11,893 | 16 | 73 | 14 | 99.3 |
LR OFS | 0.194 | 0.306 (0.208–0.402) | 0.253 (0.167–0.346) | 0.997 (0.996–0.998) | 0.386 (0.265–0.516) | 11,874 | 35 | 65 | 22 | 99.2 |
LR OFS + MAP features | 0.203 | 0.306 (0.212–0.4) | 0.253 (0.17–0.345) | 0.997 (0.996–0.998) | 0.386 (0.267–0.519) | 11,874 | 35 | 65 | 22 | 99.2 |
LR RFS | 0.135 | 0.287 (0.196–0.375) | 0.299 (0.202–0.404) | 0.994 (0.993–0.996) | 0.277 (0.187–0.372) | 11,841 | 68 | 61 | 26 | 98.9 |
DNN individual OFS | 0.59 | 0.294 (0.202–0.389) | 0.276 (0.188–0.383) | 0.996 (0.994–0.997) | 0.316 (0.215–0.429) | 11,857 | 52 | 63 | 24 | 99.0 |
DNN individual OFS + MAP features | 0.587 | 0.268 (0.181–0.36) | 0.253 (0.167–0.356) | 0.995 (0.994–0.996) | 0.286 (0.192–0.391) | 11,854 | 55 | 65 | 22 | 99.0 |
DNN individual RFS | 0.55 | 0.278 (0.204–0.357) | 0.368 (0.276–0.474) | 0.991 (0.989–0.992) | 0.224 (0.16–0.291) | 11,798 | 111 | 55 | 32 | 98.6 |
DNN combined OFS | 0.950117 | 0.271 (0.175–0.367) | 0.218 (0.136–0.312) | 0.997 (0.996–0.998) | 0.358 (0.231–0.482) | 11,875 | 34 | 68 | 19 | 99.1 |
DNN combined OFS + MAP features | 0.975254 | 0.239 (0.138–0.344) | 0.161 (0.089–0.244) | 0.999 (0.998–0.999) | 0.467 (0.294–0.64) | 11,893 | 16 | 73 | 14 | 99.3 |
DNN combined RFS | 0.868749 | 0.267 (0.183–0.346) | 0.299 (0.205–0.393) | 0.993 (0.992–0.995) | 0.241 (0.164–0.325) | 11,827 | 82 | 61 | 26 | 98.8 |
Any outcome | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Score | Threshold | F1 score (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Precision (95% CI) | TN | FP | FN | TP | Accuracy (%) |
ASA | 4 | 0.36 (0.335–0.387) | 0.309 (0.283–0.337) | 0.96 (0.957–0.964) | 0.431 (0.399–0.468) | 10,494 | 435 | 737 | 330 | 90.2 |
LR OFS | 0.122592 | 0.504 (0.48–0.529) | 0.542 (0.513–0.572) | 0.941 (0.936–0.945) | 0.471 (0.445–0.498) | 10,280 | 649 | 489 | 578 | 90.5 |
LR OFS + MAP features | 0.12059 | 0.503 (0.48–0.53) | 0.549 (0.521–0.58) | 0.938 (0.934–0.943) | 0.465 (0.439–0.492) | 10,254 | 675 | 481 | 586 | 90.4 |
LR RFS | 0.124499 | 0.503 (0.479–0.529) | 0.532 (0.505–0.563) | 0.943 (0.939–0.947) | 0.477 (0.449–0.504) | 10,305 | 624 | 499 | 568 | 90.6 |
DNN individual OFS | 0.411454 | 0.479 (0.455–0.504) | 0.515 (0.487–0.545) | 0.938 (0.934–0.942) | 0.448 (0.422–0.475) | 10,252 | 677 | 518 | 549 | 90.0 |
DNN individual OFS + MAP features | 0.395795 | 0.482 (0.46–0.506) | 0.584 (0.555–0.616) | 0.918 (0.913–0.923) | 0.41 (0.386–0.434) | 10,033 | 896 | 444 | 623 | 88.8 |
DNN individual RFS | 0.402621 | 0.473 (0.449–0.498) | 0.535 (0.508–0.567) | 0.929 (0.924–0.934) | 0.424 (0.399–0.452) | 10,153 | 776 | 496 | 571 | 89.4 |
DNN combined OFS | 0.710049 | 0.47 (0.445–0.496) | 0.503 (0.475–0.534) | 0.938 (0.934–0.942) | 0.441 (0.412–0.47) | 10,249 | 680 | 530 | 537 | 89.9 |
DNN combined OFS + MAP features | 0.678431 | 0.475 (0.452–0.5) | 0.587 (0.558–0.616) | 0.914 (0.909–0.919) | 0.399 (0.376–0.424) | 9988 | 941 | 441 | 626 | 88.5 |
DNN combined RFS | 0.632316 | 0.446 (0.423–0.469) | 0.565 (0.535–0.595) | 0.905 (0.9–0.911) | 0.368 (0.345–0.39) | 9894 | 1035 | 464 | 603 | 87.5 |
Comparison of F1 score, sensitivity, and specificity at the best threshold for acute kidney injury (AKI), reintubation, mortality, and any outcome, with 95% CIs, on the test set (N = 11,996). Results are shown for the ASA score, logistic regression (LR) models, deep neural networks predicting individual outcomes (DNN individual), and deep neural networks predicting all three outcomes (DNN combined). Each model was evaluated on each feature set: the original feature set (OFS), OFS plus the minimum MAP features (OFS + MAP), and the reduced feature set (RFS). For the LR and DNN individual models, there is one model per outcome, and the predicted outcome probabilities from the three models are stacked to predict any outcome. For the DNN combined models, a single model predicts all three outcomes, and its three predicted probabilities are stacked to predict any outcome.
ᵃ AKI labels were available for only 4307 of the test patients, so all AKI results are computed on those patients alone.
The best F1 scores among the logistic regression models and among the DNN models are shown in bold.
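
As an illustration of how the columns of Table 3 relate to one another, the following Python sketch derives sensitivity, specificity, precision, F1 score, and accuracy from the TN, FP, FN, and TP counts, and picks the threshold with the highest F1 score by scanning every distinct score. It is a minimal sketch, not the authors' implementation: the names `Confusion`, `confusion_at`, and `best_f1_threshold` are hypothetical, a score at or above the threshold is assumed to count as a positive prediction, and ties in F1 are assumed to go to the lowest threshold. The 95% CIs are not reproduced here.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


def _ratio(num: int, den: int) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, named after the TN/FP/FN/TP columns."""
    tn: int
    fp: int
    fn: int
    tp: int

    @property
    def sensitivity(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return _ratio(self.tn, self.tn + self.fp)

    @property
    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and sensitivity, written in counts.
        return _ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn)

    @property
    def accuracy(self) -> float:
        return _ratio(self.tp + self.tn,
                      self.tp + self.tn + self.fp + self.fn)


def confusion_at(scores: Sequence[float], labels: Sequence[int],
                 threshold: float) -> Confusion:
    """Count outcomes, assuming score >= threshold is a positive call."""
    tn = fp = fn = tp = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return Confusion(tn=tn, fp=fp, fn=fn, tp=tp)


def best_f1_threshold(scores: Sequence[float],
                      labels: Sequence[int]) -> Tuple[float, Confusion]:
    """Scan each distinct score as a cut-off and keep the highest F1.

    Assumption: on ties, the lowest threshold is kept.
    """
    best_t, best_c = None, None
    for t in sorted(set(scores)):
        c = confusion_at(scores, labels, t)
        if best_c is None or c.f1 > best_c.f1:
            best_t, best_c = t, c
    if best_c is None:
        raise ValueError("scores must not be empty")
    return best_t, best_c


# Counts copied from the ASA row of the AKI panel in Table 3.
asa_aki = Confusion(tn=901, fp=2439, fn=83, tp=884)
print(round(asa_aki.f1, 3), round(asa_aki.sensitivity, 3),
      round(100 * asa_aki.accuracy, 1))
```

Applied to the ASA row of the AKI panel, the counts reproduce the reported point estimates (F1 0.412, sensitivity 0.914, accuracy 41.4%), which is a quick way to check any row of the table against its own confusion-matrix columns.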