Table 2.
Models | DSC | P value (vs. non-DA model) | MSD (mm) | P value (vs. non-DA model) |
---|---|---|---|---|
Non-DA model | 0.955 ± 0.012 (0.947, 0.953, 0.964) | – | 1.055 ± 0.953 (0.632, 0.775, 0.916) | – |
Conventional-DA model | 0.961 ± 0.011 (0.955, 0.963, 0.968) | 0.03 | 0.594 ± 0.137 (0.536, 0.601, 0.680) | 0.03 |
GAN-DA model | 0.959 ± 0.011 (0.949, 0.959, 0.967) | 0.13 | 0.928 ± 0.690 (0.550, 0.726, 0.883) | 0.31 |
Combined-DA model | 0.965 ± 0.010* (0.955, 0.967, 0.971) | 0.0007 | 0.657 ± 0.254* (0.477, 0.630, 0.715) | 0.005 |
Note. —Data are presented as mean ± standard deviation (25th percentile, median, 75th percentile). DA = data augmentation, GAN = generative adversarial network, DSC = Dice Similarity Coefficient, MSD = mean surface distance. Wilcoxon signed-rank tests with the Bonferroni-Holm correction were used to account for multiple comparisons. Adjusted P-values are reported.
* Combined-DA model has statistically significantly higher DSC (P = 0.0007) and lower MSD (P = 0.01) compared with GAN-DA model.
DSCs are not statistically significantly different between conventional-DA and GAN-DA, or between conventional-DA and combined-DA.
MSDs are not statistically significantly different between conventional-DA and GAN-DA, or between conventional-DA and combined-DA.
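
The Bonferroni-Holm adjustment mentioned in the note can be sketched as follows. This is a minimal illustration of the general step-down procedure, not the authors' analysis code: the function name `holm_adjust` and the raw P values are hypothetical placeholders, and in practice the raw values would come from paired Wilcoxon signed-rank tests (for example, `scipy.stats.wilcoxon`) on per-case DSC or MSD values.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the raw p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1),
        # capped at 1.
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


# Placeholder raw p-values for three hypothetical comparisons (NOT study data).
raw_p = [0.01, 0.04, 0.03]
adjusted_p = holm_adjust(raw_p)

# A comparison is called significant when its adjusted p-value is below 0.05.
significant = [p < 0.05 for p in adjusted_p]
```

Reporting adjusted P values, as the table does, lets each entry be compared directly with the nominal 0.05 threshold instead of a per-comparison threshold.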