Table 2.
Comparison of segmentation performance for the five DL training methods for all scans in the testing set.
Experimental DL methods | DSC | Avg HD (mm) | HD95 (mm) | XOR |
---|---|---|---|---|
Train on 3He | 0.961 (0.765, 0.981) | 2.335 (35.91, 0.644) | 10.00 (140.9, 1.934) | 0.079 (0.613, 0.037) |
Train on 129Xe | 0.964 (0.886, 0.983) | 1.341 (3.911, 0.675) | 4.809 (15.90, 1.875) | 0.072 (0.253, 0.035) |
Train on 3He, fine-tuned on 129Xe | 0.963 (0.892, 0.983) | 1.384 (4.628, 0.636) | 4.971 (29.80, 1.934) | 0.075 (0.238, 0.034) |
Train on 129Xe, fine-tuned on 3He | 0.968 (0.842, 0.983) | 1.483 (10.84, 0.596) | 4.935 (67.85, 1.563) | 0.066 (0.372, 0.034) |
Combined 3He and 129Xe training | **0.971** (0.886, 0.983) | **1.234** (5.630, 0.594) | **4.193** (52.70, 1.875) | **0.059** (0.255, 0.035) |
Medians (ranges) are given; the best result for each metric is in bold.
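For readers unfamiliar with the overlap metrics in Table 2, the following is a minimal sketch of how DSC and an XOR error could be computed for a pair of binary masks. It is not the authors' evaluation code. The function names are hypothetical, and the XOR definition used here (mismatched voxels divided by the number of ground-truth voxels) is an assumption, since the table itself does not define the metric. The distance-based metrics (Avg HD, HD95) are omitted.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient: 2|P ∩ T| / (|P| + |T|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement (a convention, not from the source).
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def xor_error(pred, truth):
    """XOR error, ASSUMED here to be |P xor T| / |T| (lower is better)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_truth = truth.sum()
    if n_truth == 0:
        raise ValueError("ground-truth mask is empty; XOR error is undefined")
    return np.logical_xor(pred, truth).sum() / n_truth


# Toy 2D example: a 4-voxel ground truth and a 6-voxel prediction that fully covers it.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

dsc = dice_coefficient(pred, truth)  # 2 * 4 / (6 + 4) = 0.8
xor = xor_error(pred, truth)         # 2 mismatched voxels / 4 = 0.5
print(f"DSC = {dsc:.3f}, XOR = {xor:.3f}")
```

Under these definitions a higher DSC and a lower XOR both indicate better agreement, which matches the direction of the best results reported in the table.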