2020 Apr 20;3:58. doi: 10.1038/s41746-020-0248-0

Table 3.

Best threshold chosen by highest F1 score.

AKI^a
Score Threshold F1 score (95% CI) Sensitivity (95% CI) Specificity (95% CI) Precision (95% CI) TN FP FN TP Accuracy (%)
ASA 3 0.412 (0.393–0.43) 0.914 (0.896–0.93) 0.27 (0.255–0.284) 0.266 (0.251–0.281) 901 2439 83 884 41.4
LR OFS 0.273071 0.538 (0.512–0.563) 0.631 (0.597–0.661) 0.793 (0.78–0.807) 0.469 (0.442–0.497) 2650 690 357 610 75.7
LR OFS + MAP features 0.27574 0.537 (0.512–0.563) 0.624 (0.59–0.654) 0.798 (0.785–0.812) 0.472 (0.444–0.5) 2666 674 364 603 75.9
LR RFS 0.287606 0.537 (0.51–0.563) 0.607 (0.575–0.637) 0.811 (0.798–0.823) 0.482 (0.454–0.511) 2708 632 380 587 76.5
DNN individual OFS 0.408436 0.545 (0.52–0.569) 0.654 (0.622–0.682) 0.784 (0.77–0.798) 0.467 (0.441–0.493) 2618 722 335 632 75.5
DNN individual OFS + MAP features 0.481765 0.559 (0.533–0.587) 0.548 (0.515–0.579) 0.881 (0.87–0.892) 0.571 (0.542–0.603) 2942 398 437 530 80.6
DNN individual RFS 0.406397 0.542 (0.516–0.568) 0.618 (0.586–0.648) 0.808 (0.794–0.821) 0.483 (0.455–0.51) 2699 641 369 598 76.5
DNN combined OFS 0.906036 0.548 (0.521–0.575) 0.568 (0.536–0.598) 0.854 (0.843–0.865) 0.53 (0.501–0.559) 2853 487 418 549 79.0
DNN combined OFS + MAP features 0.901522 0.549 (0.524–0.575) 0.58 (0.55–0.61) 0.846 (0.833–0.857) 0.521 (0.493–0.552) 2825 515 406 561 78.6
DNN combined RFS 0.869984 0.557 (0.53–0.583) 0.575 (0.543–0.606) 0.858 (0.846–0.87) 0.539 (0.51–0.569) 2865 475 411 556 79.4
Reintubation
Score Threshold F1 score (95% CI) Sensitivity (95% CI) Specificity (95% CI) Precision (95% CI) TN FP FN TP Accuracy (%)
ASA 4 0.152 (0.121–0.182) 0.44 (0.361–0.517) 0.941 (0.937–0.945) 0.092 (0.072–0.112) 11,142 695 89 70 93.5
LR OFS 0.08 0.21 (0.157–0.261) 0.296 (0.223–0.366) 0.98 (0.977–0.982) 0.163 (0.121–0.207) 11,595 242 112 47 97.0
LR OFS + MAP features 0.081 0.223 (0.168–0.276) 0.314 (0.24–0.389) 0.98 (0.977–0.982) 0.172 (0.129–0.22) 11,597 240 109 50 97.1
LR RFS 0.079193 0.211 (0.161–0.262) 0.302 (0.231–0.375) 0.979 (0.977–0.982) 0.163 (0.121–0.207) 11,590 247 111 48 97.0
DNN individual OFS 0.715748 0.21 (0.16–0.257) 0.333 (0.257–0.406) 0.975 (0.972–0.978) 0.153 (0.115–0.192) 11,544 293 106 53 96.7
DNN individual OFS + MAP features 0.734977 0.197 (0.149–0.243) 0.321 (0.247–0.397) 0.974 (0.971–0.977) 0.142 (0.104–0.179) 11,530 307 108 51 96.5
DNN individual RFS 0.687943 0.22 (0.17–0.269) 0.371 (0.297–0.445) 0.973 (0.97–0.976) 0.156 (0.117–0.196) 11,518 319 100 59 96.5
DNN combined OFS 0.769994 0.206 (0.164–0.252) 0.352 (0.284–0.428) 0.972 (0.969–0.975) 0.145 (0.113–0.181) 11,508 329 103 56 96.4
DNN combined OFS + MAP features 0.784518 0.228 (0.179–0.278) 0.34 (0.271–0.414) 0.978 (0.975–0.981) 0.171 (0.131–0.215) 11,576 261 105 54 96.9
DNN combined RFS 0.746933 0.213 (0.166–0.263) 0.289 (0.221–0.36) 0.981 (0.978–0.983) 0.168 (0.128–0.214) 11,610 227 113 46 97.2
Mortality
Score Threshold F1 score (95% CI) Sensitivity (95% CI) Specificity (95% CI) Precision (95% CI) TN FP FN TP Accuracy (%)
ASA 5 0.239 (0.138–0.356) 0.161 (0.088–0.253) 0.999 (0.998–0.999) 0.467 (0.3–0.667) 11,893 16 73 14 99.3
LR OFS 0.194 0.306 (0.208–0.402) 0.253 (0.167–0.346) 0.997 (0.996–0.998) 0.386 (0.265–0.516) 11,874 35 65 22 99.2
LR OFS + MAP features 0.203 0.306 (0.212–0.4) 0.253 (0.17–0.345) 0.997 (0.996–0.998) 0.386 (0.267–0.519) 11,874 35 65 22 99.2
LR RFS 0.135 0.287 (0.196–0.375) 0.299 (0.202–0.404) 0.994 (0.993–0.996) 0.277 (0.187–0.372) 11,841 68 61 26 98.9
DNN individual OFS 0.59 0.294 (0.202–0.389) 0.276 (0.188–0.383) 0.996 (0.994–0.997) 0.316 (0.215–0.429) 11,857 52 63 24 99.0
DNN individual OFS + MAP features 0.587 0.268 (0.181–0.36) 0.253 (0.167–0.356) 0.995 (0.994–0.996) 0.286 (0.192–0.391) 11,854 55 65 22 99.0
DNN individual RFS 0.55 0.278 (0.204–0.357) 0.368 (0.276–0.474) 0.991 (0.989–0.992) 0.224 (0.16–0.291) 11,798 111 55 32 98.6
DNN combined OFS 0.950117 0.271 (0.175–0.367) 0.218 (0.136–0.312) 0.997 (0.996–0.998) 0.358 (0.231–0.482) 11,875 34 68 19 99.1
DNN combined OFS + MAP features 0.975254 0.239 (0.138–0.344) 0.161 (0.089–0.244) 0.999 (0.998–0.999) 0.467 (0.294–0.64) 11,893 16 73 14 99.3
DNN combined RFS 0.868749 0.267 (0.183–0.346) 0.299 (0.205–0.393) 0.993 (0.992–0.995) 0.241 (0.164–0.325) 11,827 82 61 26 98.8
Any outcome
Score Threshold F1 score (95% CI) Sensitivity (95% CI) Specificity (95% CI) Precision (95% CI) TN FP FN TP Accuracy (%)
ASA 4 0.36 (0.335–0.387) 0.309 (0.283–0.337) 0.96 (0.957–0.964) 0.431 (0.399–0.468) 10,494 435 737 330 90.2
LR OFS 0.122592 0.504 (0.48–0.529) 0.542 (0.513–0.572) 0.941 (0.936–0.945) 0.471 (0.445–0.498) 10,280 649 489 578 90.5
LR OFS + MAP features 0.12059 0.503 (0.48–0.53) 0.549 (0.521–0.58) 0.938 (0.934–0.943) 0.465 (0.439–0.492) 10,254 675 481 586 90.4
LR RFS 0.124499 0.503 (0.479–0.529) 0.532 (0.505–0.563) 0.943 (0.939–0.947) 0.477 (0.449–0.504) 10,305 624 499 568 90.6
DNN individual OFS 0.411454 0.479 (0.455–0.504) 0.515 (0.487–0.545) 0.938 (0.934–0.942) 0.448 (0.422–0.475) 10,252 677 518 549 90.0
DNN individual OFS + MAP features 0.395795 0.482 (0.46–0.506) 0.584 (0.555–0.616) 0.918 (0.913–0.923) 0.41 (0.386–0.434) 10,033 896 444 623 88.8
DNN individual RFS 0.402621 0.473 (0.449–0.498) 0.535 (0.508–0.567) 0.929 (0.924–0.934) 0.424 (0.399–0.452) 10,153 776 496 571 89.4
DNN combined OFS 0.710049 0.47 (0.445–0.496) 0.503 (0.475–0.534) 0.938 (0.934–0.942) 0.441 (0.412–0.47) 10,249 680 530 537 89.9
DNN combined OFS + MAP features 0.678431 0.475 (0.452–0.5) 0.587 (0.558–0.616) 0.914 (0.909–0.919) 0.399 (0.376–0.424) 9988 941 441 626 88.5
DNN combined RFS 0.632316 0.446 (0.423–0.469) 0.565 (0.535–0.595) 0.905 (0.9–0.911) 0.368 (0.345–0.39) 9894 1035 464 603 87.5

Comparison of F1 score, sensitivity, and specificity at the best thresholds for acute kidney injury (AKI), reintubation, mortality, and any outcome, with 95% CIs, on the test set (N = 11,996) for the ASA score, logistic regression (LR) models, deep neural networks predicting individual outcomes (DNN individual), and deep neural networks predicting all three outcomes (DNN combined). Each model was evaluated with each feature set: the original feature set (OFS), OFS plus the minimum MAP features (OFS + MAP), and the reduced feature set (RFS). For the LR and DNN individual models, there is one model per outcome, and the predicted outcome probabilities from each model are stacked to predict any outcome. For the DNN combined models, there is one model for all three outcomes, and its predicted probabilities are stacked to predict any outcome.
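
The thresholds in the table are those that give the highest F1 score. As a minimal sketch of that selection rule, the Python below scans every distinct predicted probability as a candidate cut-off and keeps the one with the highest F1. The function names, the toy labels and probabilities, and the convention that a score at or above the threshold counts as positive are assumptions for illustration; the source does not state how candidate thresholds were enumerated or on which data the threshold was chosen.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when a case is called positive if its probability >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_prob):
    """Return the candidate threshold with the highest F1 (lowest one on ties)."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))


# Toy data (hypothetical, not taken from the study)
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]

best = best_f1_threshold(y_true, y_prob)
best_f1 = f1_at_threshold(y_true, y_prob, best)
print(best, round(best_f1, 3))
```

On the toy data the selected threshold is 0.35, where TP = 3, FP = 1, FN = 0 and F1 = 6/7.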

^a AKI labels were available for only 4307 of the test patients, so all AKI results are computed on those patients. The best F1 scores for the logistic regression and DNN models are shown in bold.
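
Each reported metric follows from the four confusion-matrix counts in a row. As a worked check, the sketch below recomputes the ASA row for AKI from its counts (TN 901, FP 2439, FN 83, TP 884) using the standard definitions. The function name and dictionary keys are illustrative choices, not the authors' code.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "n": total,
        "sensitivity": tp / (tp + fn),          # recall on positives
        "specificity": tn / (tn + fp),          # recall on negatives
        "precision": tp / (tp + fp),            # positive predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),      # harmonic mean of precision and sensitivity
        "accuracy_pct": 100 * (tp + tn) / total,
    }


# Counts copied from the ASA row of the AKI block in the table
asa_aki = metrics_from_counts(tn=901, fp=2439, fn=83, tp=884)

for name, value in asa_aki.items():
    print(name, round(value, 3))
```

The counts sum to 4307, the number of test patients with AKI labels, and the recomputed values round to the tabulated 0.914 sensitivity, 0.27 specificity, 0.266 precision, 0.412 F1, and 41.4% accuracy.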