Table 2.
Pair-wise comparison of segmentation model DSCs using bootstrap resampling
| Models compared [DSC (95% CI)] | | p value |
|---|---|---|
| Full image segmentation model comparisons | | |
| U-Net baseline [0.768 (0.753–0.781)] | U-Net after self-refinement [**0.798** (0.784–0.810)] | **< 0.001** |
| Mask R-CNN baseline [0.831 (0.816–0.846)] | Mask R-CNN after self-refinement [**0.871** (0.854–0.886)] | **< 0.001** |
| HRNet baseline [0.838 (0.823–0.854)] | HRNet after self-refinement [**0.873** (0.858–0.889)] | **< 0.001** |
| Full image vs. hybrid segmentation model comparison | | |
| HRNet after self-refinement [0.873 (0.858–0.889)] | Mask R-CNN hybrid [**0.884** (0.868–0.899)] | **< 0.001** |
All baseline models were trained using the TrainOtsu dataset and are compared with the peak model of the same architecture obtained from self-refinement.
The best-performing full image segmentation model (HRNet after self-refinement) is compared with the best-performing hybrid method (Mask R-CNN hybrid). The image patch segmentation model used in the hybrid Mask R-CNN method was trained with the TrainFinal-patch dataset derived from TrainFinal.
For each comparison, the DSC of the better-performing model is in bold text. Bolded p values indicate p < 0.05.
CI, confidence interval; DSC, Dice similarity coefficient
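
The table does not spell out the resampling procedure, so the following is only a minimal sketch of one common way to run such a pair-wise comparison. It assumes a paired design (both models scored on the same test images), a percentile 95% CI, and a two-sided p value taken from the bootstrap distribution of the difference in mean DSC. The function names, the `n_boot` and `seed` parameters, and the toy scores are hypothetical and are not taken from the study.

```python
import random
from statistics import mean


def percentile(sorted_vals, q):
    """q-th percentile (0-100) of a pre-sorted list, by linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def paired_bootstrap(dsc_a, dsc_b, n_boot=2000, seed=0):
    """Compare two models by resampling test images with replacement.

    dsc_a, dsc_b: per-image DSCs of the two models on the SAME test images.
    Returns the mean DSC and percentile 95% CI of each model, plus a
    two-sided bootstrap p value for the difference in mean DSC (b minus a).
    """
    if len(dsc_a) != len(dsc_b):
        raise ValueError("both models must be scored on the same images")
    rng = random.Random(seed)
    n = len(dsc_a)
    means_a, means_b, diffs = [], [], []
    for _ in range(n_boot):
        # Draw one set of image indices and apply it to both models,
        # so that each resample keeps the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        m_a = mean(dsc_a[i] for i in idx)
        m_b = mean(dsc_b[i] for i in idx)
        means_a.append(m_a)
        means_b.append(m_b)
        diffs.append(m_b - m_a)
    means_a.sort()
    means_b.sort()
    # Two-sided p value: twice the smaller tail of the difference
    # distribution around zero, capped at 1.
    frac_le = sum(d <= 0 for d in diffs) / n_boot
    frac_ge = sum(d >= 0 for d in diffs) / n_boot
    p_value = min(1.0, 2 * min(frac_le, frac_ge))
    # n_boot resamples cannot resolve a p value below 1 / n_boot.
    p_value = max(p_value, 1.0 / n_boot)
    return {
        "mean_a": mean(dsc_a),
        "ci_a": (percentile(means_a, 2.5), percentile(means_a, 97.5)),
        "mean_b": mean(dsc_b),
        "ci_b": (percentile(means_b, 2.5), percentile(means_b, 97.5)),
        "p_value": p_value,
    }


# Toy per-image scores (synthetic, for illustration only).
baseline = [0.70 + 0.001 * i for i in range(100)]
refined = [d + 0.03 for d in baseline]
result = paired_bootstrap(baseline, refined)
```

With 2000 resamples the smallest p value this sketch can report is 0.0005, so a result of "< 0.001" corresponds to essentially no resample in which the ordering of the two models flips.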