2021 Sep 17;11:750875. doi: 10.3389/fonc.2021.750875

Table 2.

Predictive performance of DLM in the testing and external validation cohorts.

Results	Accuracy (%)	Sensitivity (%)	Specificity (%)	F1 score (%)
Testing cohort
Low-malignant	86 (89/104) [80, 92]	93 (57/61) [87, 99]	74 (32/43) [61, 88]	88
Intermediate-malignant	87 (90/104) [81, 92]	50 (9/18) [27, 74]	94 (81/86) [88, 100]	56
High-malignant	91 (95/104) [86, 97]	76 (19/25) [58, 94]	96 (76/79) [92, 100]	81
Overall result	82	73	88	75
External validation cohort
Low-malignant	81 (315/388) [77, 85]	72 (98/137) [64, 79]	86 (217/251) [83, 90]	73
Intermediate-malignant	75 (292/388) [71, 79]	24 (16/67) [14, 34]	86 (276/321) [82, 90]	25
High-malignant	77 (299/388) [73, 81]	79 (145/184) [73, 85]	75 (154/204) [70, 81]	77
Overall result	67	58	83	58

Unless otherwise specified, data are percentages, with numbers of images in parentheses and 95% confidence intervals in brackets.

DLM, deep learning model.
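The per-class percentages in Table 2 can be reproduced from the image counts given in parentheses. The following is a minimal sketch, not the authors' code: it treats each class as a one-vs-rest problem and assumes the "Overall result" row reports overall accuracy as the fraction of correctly classified images, and sensitivity, specificity, and F1 score as unweighted (macro) averages over the three classes. The source does not state that averaging scheme; it is an inference that happens to match the reported values. The names `OneVsRest`, `COHORTS`, `pct`, and `overall` are hypothetical, and the 95% confidence intervals are not recomputed because the interval method is not given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneVsRest:
    """One-vs-rest counts for a single class, taken from Table 2."""
    tp: int   # images of this class that were predicted as this class
    pos: int  # all images that truly belong to this class
    tn: int   # images of other classes that were not predicted as this class
    neg: int  # all images that truly belong to other classes

    @property
    def sensitivity(self) -> float:
        return self.tp / self.pos

    @property
    def specificity(self) -> float:
        return self.tn / self.neg

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.pos + self.neg)

    @property
    def f1(self) -> float:
        fn = self.pos - self.tp  # missed images of this class
        fp = self.neg - self.tn  # other-class images assigned to this class
        return 2 * self.tp / (2 * self.tp + fp + fn)


# Counts copied from the sensitivity and specificity columns of Table 2.
COHORTS = {
    "testing": {
        "low": OneVsRest(57, 61, 32, 43),
        "intermediate": OneVsRest(9, 18, 81, 86),
        "high": OneVsRest(19, 25, 76, 79),
    },
    "external": {
        "low": OneVsRest(98, 137, 217, 251),
        "intermediate": OneVsRest(16, 67, 276, 321),
        "high": OneVsRest(145, 184, 154, 204),
    },
}


def pct(value: float) -> int:
    """Express a proportion as a whole-number percentage."""
    return round(100 * value)


def overall(classes: dict) -> dict:
    """Overall row, ASSUMING macro averaging for all metrics except accuracy."""
    rows = list(classes.values())
    total = sum(r.pos for r in rows)  # each image belongs to exactly one class
    return {
        "accuracy": pct(sum(r.tp for r in rows) / total),
        "sensitivity": pct(sum(r.sensitivity for r in rows) / len(rows)),
        "specificity": pct(sum(r.specificity for r in rows) / len(rows)),
        "f1": pct(sum(r.f1 for r in rows) / len(rows)),
    }


if __name__ == "__main__":
    for cohort, classes in COHORTS.items():
        for name, r in classes.items():
            print(cohort, name, pct(r.accuracy), pct(r.sensitivity),
                  pct(r.specificity), pct(r.f1))
        print(cohort, "overall", overall(classes))
```

Under these assumptions, the sketch makes one feature of the table explicit: per-class accuracy counts true negatives, so the intermediate-malignant class can show high accuracy while its sensitivity and F1 score remain low.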