2023 Sep 26;96(1151):20220835. doi: 10.1259/bjr.20220835

Table 2.

Summary of the evaluation metrics of the best-performing models on the internal test set, with 95% confidence intervals

(a) Stage 1 Model performance on the internal test set containing only axillary patches

| Stage | Model | Accuracy | Precision | Sensitivity | Specificity | F1 score | AUROC |
|---|---|---|---|---|---|---|---|
| Stage 1 | Resnet18 | 0.94 (0.91–0.97) | 0.94 (0.88–0.99) | 0.93 (0.89–0.98) | 0.96 (0.92–0.99) | 0.93 (0.90–0.97) | 0.98 (0.97–1.00) |
| Stage 1 | VGG16 | 0.93 (0.89–0.96) | 0.90 (0.84–0.95) | 0.92 (0.87–0.97) | 0.93 (0.89–0.97) | 0.91 (0.87–0.95) | 0.98 (0.97–0.99) |
| Stage 1 | Densenet121 | 0.92 (0.88–0.95) | 0.89 (0.84–0.95) | 0.91 (0.85–0.96) | 0.92 (0.88–0.96) | 0.90 (0.86–0.94) | 0.97 (0.94–0.99) |
| Stage 2 | Resnet18 | 0.97 (0.94–0.99) | 0.95 (0.90–0.98) | 0.97 (0.93–1.00) | 0.96 (0.93–0.99) | 0.96 (0.93–0.98) | 1.00 (0.99–1.00) |

(b) Model performance on the internal test set containing both axillary and non-axillary patches

| Stage | Model | Accuracy | Precision | Sensitivity | Specificity | F1 score | AUROC |
|---|---|---|---|---|---|---|---|
| Stage 1 | Resnet18 | 0.85 (0.81–0.89) | 0.45 (0.33–0.57) | 0.95 (0.86–1.00) | 0.84 (0.79–0.89) | 0.61 (0.49–0.72) | 0.97 (0.91–1.00) |
| Stage 2 | Resnet18 | 0.99 (0.98–1.00) | 0.95 (0.86–1.00) | 0.97 (0.90–1.00) | 0.99 (0.98–1.00) | 0.96 (0.92–1.00) | 1.00 (1.00–1.00) |

Values are point estimates with 95% confidence intervals in parentheses.

AUROC, area under the receiver operating characteristic curve.

(a) Stage 1 Model performance on the internal test set containing only axillary patches. (b) Model performance on the internal test set containing both axillary and non-axillary patches.
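
The table does not state how the F1 score was computed. Assuming the standard definition (the harmonic mean of precision and sensitivity), the reported F1 values can be checked against the precision and sensitivity columns. The sketch below is illustrative only: the function name `f1_from_precision_sensitivity` and the `reported_rows` dictionary are hypothetical, and because the inputs are the rounded point estimates from the table, agreement can only be expected to two decimal places.

```python
def f1_from_precision_sensitivity(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (standard F1 definition)."""
    total = precision + sensitivity
    if total == 0:
        # Convention: F1 is taken as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * sensitivity / total


# (precision, sensitivity, reported F1) copied from the point estimates in Table 2.
reported_rows = {
    "a) Stage 1 Resnet18": (0.94, 0.93, 0.93),
    "a) Stage 1 VGG16": (0.90, 0.92, 0.91),
    "a) Stage 1 Densenet121": (0.89, 0.91, 0.90),
    "a) Stage 2 Resnet18": (0.95, 0.97, 0.96),
    "b) Stage 1 Resnet18": (0.45, 0.95, 0.61),
    "b) Stage 2 Resnet18": (0.95, 0.97, 0.96),
}

for name, (precision, sensitivity, reported_f1) in reported_rows.items():
    recomputed = f1_from_precision_sensitivity(precision, sensitivity)
    print(f"{name}: recomputed F1 = {recomputed:.2f}, reported F1 = {reported_f1:.2f}")
```

Under this assumption, every row reproduces the reported F1 to two decimals. The check also shows why the panel (b) Stage 1 F1 of 0.61 is low despite a sensitivity of 0.95: the harmonic mean is pulled toward the smaller input, here the precision of 0.45.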