Table 11.
Model | Dice BS | Dice OP | Dice p | 5mm SD BS | 5mm SD OP | 5mm SD p | Precision BS | Precision OP | Precision p | Recall BS | Recall OP | Recall p | HD BS (mm) | HD OP (mm) | HD p |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
nnFormer | 68.5 | 69.7 | _ | 67.6 | 67.7 | _ | 78.9 | 82.9 | _ | 65.7 | 64.4 | _ | 66.7 | 72.3 | _ |
nnUNet | 68.1 | 68.1 | _ | 67.5 | 67.8 | _ | 78.8 | 84.2 | _ | 67.9 | 63.9 | _ | 60.2 | 60.1 | _ |
SegmentationNet | 55.8 | 61.7 | _ | 52.5 | 58.3 | _ | 61.6 | 74.2 | _ | 55.1 | 59.2 | _ | 91.5 | 84.8 | _ |
Swin-UNETR | 47.7 | 55.1 | _ | 42.0 | 50.9 | *** | 48.7 | 78.2 | * | 57.2 | 48.0 | *** | 162.9 | 58.5 | ** |
TransBTS | 51.4 | 62.1 | ** | 46.2 | 59.2 | _ | 58.4 | 73.1 | *** | 53.6 | 59.9 | * | 176.7 | 73.7 | *** |
UNETR | 33.1 | 41.5 | *** | 30.0 | 35.3 | * | 40.9 | 50.7 | _ | 31.9 | 39.2 | ** | 121.7 | 138.1 | * |
VT-UNet | 54.3 | 56.6 | _ | 49.6 | 53.9 | _ | 61.1 | 71.6 | _ | 55.8 | 56.0 | ** | 98.0 | 72.7 | _ |
BS: baseline; OP: optimised. Italic indicates the best inter-model value for each metric. Stars indicate the significance level of the difference between baseline and optimised results on the test set, assessed with a paired t-test or a Wilcoxon signed-rank test, chosen according to whether the results were normally distributed under the Shapiro-Wilk test (no star, shown as _, means not significant; * means p < 0.05; ** means p < 0.01; *** means p < 0.001).
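
The test-selection procedure described in the note can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes SciPy's `shapiro`, `ttest_rel`, and `wilcoxon` functions, assumes that normality is checked on the per-case differences between optimised and baseline scores, and assumes a 0.05 threshold for the Shapiro-Wilk decision. The function names `significance_stars` and `compare_paired` are hypothetical.

```python
import numpy as np
from scipy import stats


def significance_stars(p_value):
    """Map a p-value to the star notation used in the table note."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant


def compare_paired(baseline, optimised, alpha=0.05):
    """Compare paired per-case scores of a baseline and an optimised model.

    Returns the name of the test used, its p-value, and the star label.
    """
    baseline = np.asarray(baseline, dtype=float)
    optimised = np.asarray(optimised, dtype=float)
    differences = optimised - baseline

    # Assumption: normality is assessed on the paired differences,
    # with the same alpha used for the Shapiro-Wilk decision.
    _, normality_p = stats.shapiro(differences)

    if normality_p > alpha:
        # No evidence against normality: use the paired t-test.
        test_name = "paired t-test"
        _, p_value = stats.ttest_rel(optimised, baseline)
    else:
        # Normality rejected: fall back to the non-parametric test.
        test_name = "Wilcoxon signed-rank"
        _, p_value = stats.wilcoxon(optimised, baseline)

    p_value = float(p_value)
    return test_name, p_value, significance_stars(p_value)


if __name__ == "__main__":
    # Synthetic per-case Dice scores, for illustration only.
    rng = np.random.default_rng(0)
    bs = rng.normal(60.0, 10.0, size=30)
    op = bs + rng.normal(5.0, 1.0, size=30)
    print(compare_paired(bs, op))
```

Each star entry in the table would correspond to one such call per model and metric, with the per-case test-set scores of the baseline and optimised configurations as inputs.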