Table 1. Segmentation accuracy for younger and older fetuses. Each metric was computed separately for each label; the table reports mean ± standard deviation over all labels. The best result for each metric is in bold. We used paired t-tests to compare the proposed method with every other method; an asterisk marks a result for which the proposed method was significantly better than all other methods.
Dataset | Method | DSC | HD95 (mm) | ASSD (mm) |
---|---|---|---|---|
Younger fetuses | nnU-Net | 0.872 ± 0.063 | 0.99 ± 0.11 | 0.26 ± 0.14 |
Younger fetuses | Generalized Dice | 0.845 ± 0.087 | 1.09 ± 0.12 | 0.32 ± 0.26 |
Younger fetuses | Focal loss | 0.839 ± 0.080 | 1.15 ± 1.20 | 0.30 ± 0.19 |
Younger fetuses | iMAE | 0.865 ± 0.075 | 1.06 ± 0.15 | 0.26 ± 0.17 |
Younger fetuses | Training on clean labels (without T.L.) | 0.863 ± 0.068 | 1.09 ± 0.14 | 0.28 ± 0.16 |
Younger fetuses | Training on clean labels (with T.L.) | 0.866 ± 0.062 | 1.03 ± 0.13 | 0.27 ± 0.17 |
Younger fetuses | Standard label smoothing | 0.833 ± 0.084 | 1.08 ± 0.17 | 0.34 ± 0.21 |
Younger fetuses | SVLS | 0.843 ± 0.074 | 1.07 ± 0.17 | 0.30 ± 0.18 |
Younger fetuses | DeepLab | 0.851 ± 0.072 | 1.11 ± 0.15 | 0.30 ± 0.15 |
Younger fetuses | UNet++ | 0.866 ± 0.060 | 1.02 ± 0.14 | 0.27 ± 0.15 |
Younger fetuses | Proposed method | **0.893 ± 0.066**∗ | **0.94 ± 0.13**∗ | **0.23 ± 0.13**∗ |
Older fetuses | nnU-Net | 0.896 ± 0.066 | 0.98 ± 0.11 | 0.36 ± 0.12 |
Older fetuses | Generalized Dice | 0.866 ± 0.070 | 1.16 ± 0.11 | 0.46 ± 0.15 |
Older fetuses | Focal loss | 0.861 ± 0.068 | 1.16 ± 0.16 | 0.42 ± 0.16 |
Older fetuses | iMAE | 0.880 ± 0.064 | 1.09 ± 0.17 | 0.41 ± 0.20 |
Older fetuses | Training on clean labels (without T.L.) | 0.877 ± 0.073 | 1.12 ± 0.14 | 0.40 ± 0.18 |
Older fetuses | Training on clean labels (with T.L.) | 0.880 ± 0.070 | 1.04 ± 0.15 | 0.40 ± 0.20 |
Older fetuses | Standard label smoothing | 0.853 ± 0.071 | 1.16 ± 0.12 | 0.39 ± 0.23 |
Older fetuses | SVLS | 0.856 ± 0.077 | 1.10 ± 0.13 | 0.37 ± 0.27 |
Older fetuses | DeepLab | 0.865 ± 0.074 | 1.21 ± 0.19 | 0.43 ± 0.26 |
Older fetuses | UNet++ | 0.885 ± 0.070 | 1.08 ± 0.16 | 0.38 ± 0.23 |
Older fetuses | Proposed method | **0.916 ± 0.059**∗ | **0.94 ± 0.13**∗ | **0.25 ± 0.09**∗ |
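
The aggregation described in the caption (a per-label metric, summarized as mean ± standard deviation over labels, then compared between methods with a paired t-test) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names `dice`, `mean_std`, and `paired_t` are hypothetical, the use of the sample standard deviation is an assumption, and the sketch only returns the t statistic and degrees of freedom, since the caption does not state the significance level or the unit over which samples were paired.

```python
import math
import statistics


def dice(pred, ref, label):
    """Dice similarity coefficient (DSC) for one label.

    `pred` and `ref` are flat sequences of integer labels of equal length
    (e.g. a flattened segmentation volume). Returns NaN if the label is
    absent from both sequences.
    """
    p = [x == label for x in pred]
    r = [x == label for x in ref]
    intersection = sum(a and b for a, b in zip(p, r))
    denom = sum(p) + sum(r)
    return 2.0 * intersection / denom if denom else float("nan")


def mean_std(values):
    """Mean and sample standard deviation (assumption: n - 1 denominator)."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a, b.

    A positive t means `a` is larger than `b` on average.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    # Toy data only; these are not values from the table.
    pred = [0, 1, 1, 2, 2, 2]
    ref = [0, 1, 2, 2, 2, 1]
    per_label = [dice(pred, ref, label) for label in (0, 1, 2)]
    m, s = mean_std(per_label)
    print(f"DSC over labels: {m:.3f} ± {s:.3f}")

    proposed = [0.90, 0.88, 0.92, 0.85]
    baseline = [0.87, 0.86, 0.88, 0.84]
    t, df = paired_t(proposed, baseline)
    print(f"t = {t:.3f}, df = {df}")
```

The resulting t statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a p-value, for example with a statistics library.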