2020 Oct 27;20:276. doi: 10.1186/s12911-020-01284-x

Table 2.

Comparison of performance metrics for the MLA and standard scoring systems at the time of severe sepsis onset

| Metric | MLA ≥ 0.029 (DAD training) | MLA ≥ 0.030 (DAD testing) | MLA ≥ 0.017 (CHH external validation) | MEWS ≥ 2 (DAD testing) | SOFA ≥ 2 (DAD testing) | SIRS ≥ 1 (DAD testing) |
|---|---|---|---|---|---|---|
| AUROC (SD) | 0.931 (0.01) | 0.930 (0.01) | 0.948 (0.01) | 0.725 | 0.716 | 0.655 |
| P value (MLA vs comparator) | | | | P < 0.001 | P < 0.001 | P < 0.001 |
| Sensitivity | 0.800 | 0.800 | 0.800 | 0.845 | 0.750 | 0.868 |
| Specificity | 0.926 | 0.933 | 0.921 | 0.444 | 0.554 | 0.334 |
| Accuracy | 0.923 | 0.929 | 0.920 | 0.608 | 0.645 | 0.646 |
| DOR | 53.105 | 56.508 | 47.532 | 4.358 | 3.720 | 3.290 |
| LR+ | 11.411 | 12.110 | 10.306 | 1.521 | 1.680 | 1.303 |
| LR− | 0.216 | 0.215 | 0.217 | 0.349 | 0.452 | 0.396 |

Detailed performance metrics for the Machine Learning Algorithm (MLA) and rules-based systems, taken at the time of severe sepsis onset, using the Dascena Analysis Dataset (DAD) for training and testing and the Cabell Huntington Hospital (CHH) dataset for external validation. The score threshold reported for the MLA is the average over rounds of ten-fold cross-validation. AUROC comparisons between the MLA and each comparator were performed using two-sample t-tests at the 95% confidence level. AUROC: area under the receiver operating characteristic; MEWS: Modified Early Warning Score; SOFA: Sequential Organ Failure Assessment; SIRS: Systemic Inflammatory Response Syndrome; DOR: diagnostic odds ratio; LR: likelihood ratio.
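
The likelihood ratios and diagnostic odds ratio in the table can be related to sensitivity and specificity through their standard textbook definitions: LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. The sketch below applies these definitions to the MEWS ≥ 2 column. It is an illustration only: the function name `likelihood_metrics` is hypothetical, and the source does not state how the authors computed these values. The inputs are the rounded table entries, so results should agree with the table only approximately. The MLA columns may deviate further, possibly because the reported values relate to averages over cross-validation rounds.

```python
def likelihood_metrics(sensitivity: float, specificity: float) -> dict:
    """Derive LR+, LR-, and DOR from sensitivity and specificity.

    Uses the standard definitions; this is an assumed reconstruction,
    not the authors' confirmed procedure.
    """
    lr_pos = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    lr_neg = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return {"LR+": lr_pos, "LR-": lr_neg, "DOR": dor}


# MEWS >= 2 column (DAD testing), using the rounded values from the table.
mews = likelihood_metrics(sensitivity=0.845, specificity=0.444)
print({name: round(value, 3) for name, value in mews.items()})
```

Running the same function on the SOFA and SIRS columns gives a quick consistency check on the reported DOR, LR+, and LR− rows.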