Table 2.
Validation Studies Model Performance Characteristics
Characteristic | UMass Cohort 1 | UMass Cohort 2 | WakeMed Cohort | COVID-19 Cohort | Telemedicine System Alerts, UMass Cohort 1 | Telemedicine System Alerts, UMass Cohort 2 |
---|---|---|---|---|---|---|
CLEW hemodynamic instability model | ||||||
Patientsᵃ | 6,098 | 6,116 | 3,191 | 513 | 6,098 | 6,616 |
F scoreᵇ | 0.20 | 0.21 | 0.33 | 0.36 | 0.032 | 0.032 |
TPR (sensitivity) | 0.72 | 0.72 | 0.61 | 0.79 | 0.67 | 0.69 |
TNR (specificity) | 0.94 | 0.94 | 0.94 | 0.88 | 0.53 | 0.53 |
PPV (precision) | 0.12 | 0.12 | 0.22 | 0.23 | 0.016 | 0.016 |
NPV | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
Accuracy | 0.94 | 0.94 | 0.92 | 0.87 | 0.53 | 0.53 |
Misclassification rate | 0.065 | 0.062 | 0.068 | 0.13 | 0.47 | 0.47 |
TP | 478 | 462 | 492 | 426 | 443 | 441 |
FP | 3,538 | 3,337 | 1,714 | 1,421 | 26,770 | 26,776 |
TN | 53,206 | 53,082 | 27,545 | 10,005 | 29,974 | 29,643 |
FN | 186 | 180 | 320 | 111 | 221 | 201 |
No. of events | 664 | 642 | 812 | 537 | 664 | 642 |
No. of alert windows | 4,016 | 3,799 | 2,206 | 1,847 | 27,213 | 27,212 |
Mean total alerts per patient per dayᶜ | 0.21 | 0.20 | 0.22 | 0.46 | 12 | 11 |
Median lead time (IQR), min | 240 (170-342) | 220 (160-340) | 180 (72-290) | 180 (72-290) | 200 (78-340) | 180 (72-330) |
CLEW respiratory failure model | ||||||
Patientsᵃ | 6,098 | 6,116 | 3,191 | 513 | 6,098 | 6,116 |
F scoreᵇ | 0.143 | 0.132 | 0.111 | 0.213 | 0.020 | 0.019 |
TPR (sensitivity) | 0.75 | 0.70 | 0.58 | 0.59 | 0.49 | 0.47 |
TNR (specificity) | 0.94 | 0.94 | 0.91 | 0.90 | 0.70 | 0.69 |
PPV (precision) | 0.08 | 0.07 | 0.06 | 0.13 | 0.01 | 0.01 |
NPV | 0.99 | 0.99 | 0.92 | 0.81 | 0.99 | 0.99 |
Accuracyᵈ | 0.94 | 0.94 | 0.91 | 0.89 | 0.70 | 0.69 |
Misclassification rate | 0.057 | 0.061 | 0.088 | 0.11 | 0.30 | 0.31 |
TP | 199 | 200 | 124 | 83 | 130 | 132 |
FP | 2,325 | 2,535 | 1,900 | 558 | 12,401 | 13,132 |
TN | 39,172 | 40,011 | 20,485 | 4,786 | 29,069 | 29,414 |
FN | 65 | 84 | 90 | 57 | 134 | 152 |
No. of events | 264 | 284 | 214 | 140 | 264 | 284 |
No. of alert windows | 2,524 | 2,735 | 2,024 | 641 | 12,531 | 13,264 |
Mean total alerts per patient per dayᶜ | 0.18 | 0.19 | 0.27 | 0.35 | 11 | 11 |
Median lead time (IQR), min | 240 (160-350) | 210 (150-320) | 260 (180-370) | 250 (170-360) | 170 (54-340) | 190 (78-350) |
FN = false negative; FP = false positive; IQR = interquartile range; NPV = negative predictive value; PPV = positive predictive value; TN = true negative; TNR = true negative rate; TP = true positive; TPR = true positive rate.
ᵃ Number of ICU visits that contributed data to one or both models.
ᵇ Harmonic mean of precision and recall, with a range of 0 to 1. TPR (true positives/[true positives + false negatives]) can also be termed model recall, and PPV (true positives/[true positives + false positives]) can also be termed model precision.
ᶜ Each 8-h alert window has one or more alerts.
ᵈ Accuracy is defined as (true positives + true negatives)/(all positives + all negatives).
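
The footnote definitions can be checked against the raw counts in the table. The following is a minimal sketch, not the study's own analysis code: it assumes the standard confusion-matrix formulas, and the names `ConfusionCounts` and `performance` are hypothetical. The counts come from the UMass Cohort 1 column of the hemodynamic instability model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Alert-window counts as reported in the TP, FP, TN, and FN rows."""
    tp: int
    fp: int
    tn: int
    fn: int


def performance(c: ConfusionCounts) -> dict:
    """Derive the table's rate metrics from raw counts (standard definitions)."""
    total = c.tp + c.fp + c.tn + c.fn
    tpr = c.tp / (c.tp + c.fn)   # sensitivity, also termed recall
    tnr = c.tn / (c.tn + c.fp)   # specificity
    ppv = c.tp / (c.tp + c.fp)   # precision
    npv = c.tn / (c.tn + c.fn)
    accuracy = (c.tp + c.tn) / total
    return {
        "tpr": tpr,
        "tnr": tnr,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "misclassification_rate": 1.0 - accuracy,
        # F score: harmonic mean of precision (PPV) and recall (TPR)
        "f_score": 2 * ppv * tpr / (ppv + tpr),
    }


# UMass Cohort 1, CLEW hemodynamic instability model
umass1_hemodynamic = ConfusionCounts(tp=478, fp=3538, tn=53206, fn=186)
metrics = performance(umass1_hemodynamic)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For this column, the computed TPR, TNR, PPV, accuracy, and F score round to the reported 0.72, 0.94, 0.12, 0.94, and 0.20. In the same column, TP + FN equals the reported number of events (664) and TP + FP equals the reported number of alert windows (4,016).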