Table 2.
Diagnostic performance indices of the ML model classifier in the validation and test datasets, compared with the performance of two radiologists.
| Classifier | Dataset / Reader | AUC | Threshold | Sensitivity (%) | Specificity (%) | LR+ | LR- |
|---|---|---|---|---|---|---|---|
| ML Model | Validation Dataset | 0.986 (0.978−0.992) | Rule-out (>0.0006) | 99.3 | 75.8 | 4.1 | 0.009 |
| | | | Rule-in (>0.4) | 92.2 | 96.3 | 25.0 | 0.081 |
| | Test Dataset | 0.956 (0.890−0.988) | Rule-out (>0.0006) | 100 | 60 | 2.5 | <0.01 |
| | | | Rule-in (>0.4) | 84.4 | 93.3 | 12.7 | 0.17 |
| Radiologists | Radiologist 1 (Test Dataset) | 0.867 (0.779−0.929) | n.a. | 82.2 | 91.1 | 9.25 | 0.2 |
| | Radiologist 2 (Test Dataset) | 0.889 (0.805−0.945) | n.a. | 80 | 97.8 | 36.0 | 0.2 |
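
The likelihood ratios reported in Table 2 are consistent with the standard definitions LR+ = sensitivity / (1 − specificity) and LR- = (1 − sensitivity) / specificity. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `likelihood_ratios` is hypothetical. Because the inputs here are the rounded percentages from the table, some results may differ slightly from the reported values, which were presumably computed from unrounded counts.

```python
def likelihood_ratios(sensitivity_pct, specificity_pct):
    """Return (LR+, LR-) from sensitivity and specificity given in percent.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    sens = sensitivity_pct / 100.0
    spec = specificity_pct / 100.0
    # LR+ = sensitivity / (1 - specificity); unbounded when specificity is 100%
    lr_pos = sens / (1.0 - spec) if spec < 1.0 else float("inf")
    # LR- = (1 - sensitivity) / specificity; unbounded when specificity is 0%
    lr_neg = (1.0 - sens) / spec if spec > 0.0 else float("inf")
    return lr_pos, lr_neg


# ML model, validation dataset, rule-out threshold (values from Table 2)
lr_pos, lr_neg = likelihood_ratios(99.3, 75.8)
print(round(lr_pos, 1), round(lr_neg, 3))  # prints: 4.1 0.009
```

With the test-dataset rule-out row (sensitivity 100%, specificity 60%), the same function gives LR+ = 2.5 and LR- = 0, which matches the "<0.01" entry in the table.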