Front Oncol. 2023 Jan 23;12:1044496. doi: 10.3389/fonc.2022.1044496

Table 3.

Performance values and statistical significance test results of the best data augmentation strategies for the models trained with the OPTIMAM Hologic database.

| Test set | Strategy | Only synthetic BI-RADS D in training: FROC AUC (95% CI) | Gain | p-value | Synthetic and real BI-RADS D in training: FROC AUC (95% CI) | Gain | p-value |
|---|---|---|---|---|---|---|---|
| OPTIMAM Hologic BI-RADS D Test Set | Baseline | 79.71% (78.44, 80.98) | Ref | Ref | 80.60% (79.20, 82.00) | Ref | Ref |
| OPTIMAM Hologic BI-RADS D Test Set | BC-Aug | 79.62% (77.83, 81.41) | -0.09 | 0.0064 | 81.10% (80.40, 81.80) | +0.50 | 0.2277 |
| OPTIMAM Hologic BI-RADS D Test Set | OP-Aug | 79.86% (78.30, 81.42) | +0.15 | 0.8269 | 80.75% (78.77, 82.73) | +0.15 | 0.5599 |
| OPTIMAM Hologic BI-RADS D Test Set | OP-CS-BC-Aug | 80.95% (79.63, 82.27) | +1.24 | 0.0696 | 80.76% (79.92, 81.60) | +0.16 | 0.7921 |
| INbreast Dataset (external validation) | Baseline | 81.51% (78.93, 84.09) | Ref | Ref | 84.71% (83.39, 86.03) | Ref | Ref |
| INbreast Dataset (external validation) | BC-Aug | 85.66% (81.91, 89.41) | +4.15 | 0.0002 | 84.88% (82.86, 86.90) | +0.17 | 0.1666 |
| INbreast Dataset (external validation) | OP-Aug | 83.45% (80.03, 86.87) | +1.94 | 6.08e-05 | 86.16% (83.37, 88.95) | +1.45 | 0.0041 |
| INbreast Dataset (external validation) | OP-CS-BC-Aug | 84.47% (82.32, 86.62) | +2.95 | 0.0008 | 84.29% (82.22, 86.36) | -0.42 | 0.0162 |

The columns on the left (only synthetic BI-RADS D in training) correspond to the models trained without real BI-RADS D mammograms. The baseline models were trained without synthetic images. The 95% confidence intervals of the FROC AUC are given in parentheses. Gain is the difference in FROC AUC relative to the baseline of the same column group and test set. The p-values were computed using the DeLong method with a maximum of 10 false positives per image (FPPI). Bold values correspond to the best-performing strategy. Ref denotes the reference method.
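
As a minimal sketch of how the Gain column relates to the FROC AUC values, the snippet below recomputes each gain as the strategy's FROC AUC minus the baseline's, using the rounded values from the "only synthetic BI-RADS D in training" columns. The names `froc_auc` and `gains` are illustrative and do not come from the article, and the calculation assumes that Gain is a simple difference in percentage points.

```python
# FROC AUC values (percent), copied from the "only synthetic BI-RADS D in
# training" columns of Table 3. Dictionary layout and names are illustrative.
froc_auc = {
    "OPTIMAM Hologic BI-RADS D": {
        "Baseline": 79.71, "BC-Aug": 79.62, "OP-Aug": 79.86, "OP-CS-BC-Aug": 80.95,
    },
    "INbreast": {
        "Baseline": 81.51, "BC-Aug": 85.66, "OP-Aug": 83.45, "OP-CS-BC-Aug": 84.47,
    },
}


def gains(scores, ref="Baseline"):
    """Return each strategy's FROC AUC minus the reference, in percentage points.

    Assumption: Gain in the table is a plain difference from the baseline.
    """
    base = scores[ref]
    return {
        name: round(value - base, 2)
        for name, value in scores.items()
        if name != ref
    }


optimam_gains = gains(froc_auc["OPTIMAM Hologic BI-RADS D"])
inbreast_gains = gains(froc_auc["INbreast"])

# Strategy with the largest gain on each test set.
best_optimam = max(optimam_gains, key=optimam_gains.get)
best_inbreast = max(inbreast_gains, key=inbreast_gains.get)

print(optimam_gains, best_optimam)
print(inbreast_gains, best_inbreast)
```

Recomputing from the rounded AUCs reproduces the tabulated gains, except that OP-CS-BC-Aug on INbreast comes out as 2.96 rather than the reported +2.95. A plausible explanation, not confirmed by the source, is that the published gain was computed from unrounded AUC values.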