Author manuscript; available in PMC: 2023 Sep 27.
Published in final edited form as: Eur Radiol. 2023 Apr 12;33(9):6582–6591. doi: 10.1007/s00330-023-09583-3

Table 2.

Pair-wise comparison of segmentation model DSCs using bootstrap resampling

Model 1 [DSC (95% CI)] | Model 2 [DSC (95% CI)] | p value

Full image segmentation model comparisons
U-Net baseline [0.768 (0.753–0.781)] | U-Net after self-refinement [0.798 (0.784–0.810)] | < 0.001
Mask R-CNN baseline [0.831 (0.816–0.846)] | Mask R-CNN after self-refinement [0.871 (0.854–0.886)] | < 0.001
HRNet baseline [0.838 (0.823–0.854)] | HRNet after self-refinement [0.873 (0.858–0.889)] | < 0.001

Full image vs. hybrid segmentation model comparison
HRNet after self-refinement [0.873 (0.858–0.889)] | Mask R-CNN hybrid [0.884 (0.868–0.899)] | < 0.001

Pair-wise comparison of segmentation model DSCs using the bootstrap resampling technique. All baseline models were trained on the TrainOtsu dataset, and each is compared with the peak model of the same architecture obtained from self-refinement.

The best-performing full image segmentation model (HRNet after self-refinement) is compared with the best-performing hybrid method (Mask R-CNN hybrid). The image patch segmentation model used in the hybrid Mask R-CNN method was trained with the TrainFinal-patch dataset derived from TrainFinal

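For each comparison, the DSC of the better-performing model is in bold text in the published table; in every row this is the second model listed. Bolded p values indicate p < 0.05, which holds for all four comparisons.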

CI, confidence interval; DSC, Dice similarity coefficient
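
The table notes do not spell out how the bootstrap comparison was run. As an illustration only, the sketch below shows one common way such a pair-wise comparison can be set up. It assumes that both models are scored on the same test images, that per-image DSCs are resampled in pairs, that the confidence interval is a percentile interval, and that the p value is a two-sided tail fraction. The function names `dice` and `paired_bootstrap`, the number of resamples, and the empty-mask convention are hypothetical choices, not taken from the paper.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def paired_bootstrap(dsc_a, dsc_b, n_boot=10000, seed=0):
    """Compare two models from per-image DSCs measured on the same test images.

    Returns (observed mean difference b - a, 95% percentile CI, two-sided p value).
    """
    dsc_a = np.asarray(dsc_a, dtype=float)
    dsc_b = np.asarray(dsc_b, dtype=float)
    if dsc_a.ndim != 1 or dsc_a.shape != dsc_b.shape:
        raise ValueError("dsc_a and dsc_b must be 1-D arrays of equal length")

    rng = np.random.default_rng(seed)
    n = dsc_a.size
    # Resample image indices with replacement; the same indices are used for
    # both models so that the pairing between them is preserved.
    idx = rng.integers(0, n, size=(n_boot, n))
    diffs = dsc_b[idx].mean(axis=1) - dsc_a[idx].mean(axis=1)

    ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])
    # Two-sided p value: twice the smaller fraction of resampled differences
    # lying on either side of zero, clipped to the range [1 / n_boot, 1].
    p = 2.0 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    p = min(1.0, max(p, 1.0 / n_boot))

    observed = dsc_b.mean() - dsc_a.mean()
    return observed, (ci_low, ci_high), p
```

With this convention the smallest p value that can be reported is `1 / n_boot`, so an entry such as "< 0.001" would need at least 1000 resamples. Other bootstrap variants (for example, resampling each model independently, or bias-corrected intervals) would give somewhat different numbers.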