Table 2. Predictive performance of the DLM in the testing and external validation cohorts.
Results | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) |
---|---|---|---|---|
Testing cohort | | | | |
Low-malignant | 86 (89/104) [80, 92] | 93 (57/61) [87, 99] | 74 (32/43) [61, 88] | 88 |
Intermediate-malignant | 87 (90/104) [81, 92] | 50 (9/18) [27, 74] | 94 (81/86) [88, 100] | 56 |
High-malignant | 91 (95/104) [86, 97] | 76 (19/25) [58, 94] | 96 (76/79) [92, 100] | 81 |
Overall result | 82 | 73 | 88 | 75 |
External validation cohort | | | | |
Low-malignant | 81 (315/388) [77, 85] | 72 (98/137) [64, 79] | 86 (217/251) [83, 90] | 73 |
Intermediate-malignant | 75 (292/388) [71, 79] | 24 (16/67) [14, 34] | 86 (276/321) [82, 90] | 25 |
High-malignant | 77 (299/388) [73, 81] | 79 (145/184) [73, 85] | 75 (154/204) [70, 81] | 77 |
Overall result | 67 | 58 | 83 | 58 |
Unless otherwise specified, data are percentages, with numbers of images in parentheses and 95% confidence intervals in brackets.
DLM, deep learning model.
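The per-class percentages and F1 scores can be recomputed from the counts in parentheses, treating each class one-vs-rest. The sketch below does this in Python. The names `ClassCounts`, `overall_row`, and `pct` are hypothetical and exist only for illustration. The reading of the "Overall result" row (accuracy as correctly classified images over all images, and the other three columns as unweighted means of the per-class values) is an assumption inferred from the numbers, not something the table states. The confidence intervals are not recomputed, because the table does not say how they were obtained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassCounts:
    """One-vs-rest counts for a single class, copied from the table."""
    tp: int   # images of this class predicted as this class
    pos: int  # all images of this class
    tn: int   # images of other classes not predicted as this class
    neg: int  # all images of other classes

    @property
    def sensitivity(self) -> float:
        return self.tp / self.pos

    @property
    def specificity(self) -> float:
        return self.tn / self.neg

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.pos + self.neg)

    @property
    def f1(self) -> float:
        fn = self.pos - self.tp  # missed images of this class
        fp = self.neg - self.tn  # other-class images assigned to this class
        return 2 * self.tp / (2 * self.tp + fp + fn)


TESTING = {
    "Low-malignant": ClassCounts(57, 61, 32, 43),
    "Intermediate-malignant": ClassCounts(9, 18, 81, 86),
    "High-malignant": ClassCounts(19, 25, 76, 79),
}
EXTERNAL = {
    "Low-malignant": ClassCounts(98, 137, 217, 251),
    "Intermediate-malignant": ClassCounts(16, 67, 276, 321),
    "High-malignant": ClassCounts(145, 184, 154, 204),
}


def pct(value: float) -> int:
    """Convert a proportion to a whole-number percentage."""
    return round(100 * value)


def overall_row(cohort: dict) -> dict:
    """Assumed aggregation: overall accuracy plus macro-averaged metrics."""
    classes = list(cohort.values())
    n_images = sum(c.pos for c in classes)

    def macro(name: str) -> float:
        return sum(getattr(c, name) for c in classes) / len(classes)

    return {
        "accuracy": pct(sum(c.tp for c in classes) / n_images),
        "sensitivity": pct(macro("sensitivity")),
        "specificity": pct(macro("specificity")),
        "f1": pct(macro("f1")),
    }


if __name__ == "__main__":
    for name, cohort in (("Testing", TESTING), ("External", EXTERNAL)):
        for label, c in cohort.items():
            print(name, label, pct(c.accuracy), pct(c.sensitivity),
                  pct(c.specificity), pct(c.f1))
        print(name, "Overall", overall_row(cohort))
```

Under these assumptions the recomputed per-class values and the overall rows agree with the rounded figures reported in the table for both cohorts.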