TABLE II.
The effect of BigAug and individual augmentation methods on generalization to unseen domains, measured by Dice score. Source columns give the dataset used for training; their values are validation Dice scores (using a split), included for comparison. Unseen columns give the Dice scores obtained when the model trained on the source is applied to unseen datasets. Baseline refers to a random crop with no further augmentation. Top4 is the combination of the four best-performing augmentations (sharpening, brightness, contrast, scaling). Supervised gives state-of-the-art results from the literature for models trained and tested on the same dataset.
T1 = Task 1 (MRI, whole prostate); T2 = Task 2 (MRI, left atrial); T3 = Task 3 (ultrasound, left ventricle); All = all tasks.

Method | T1 source: MSD-P | T1 unseen: PROMISE | T1 unseen: NCI-ISBI | T1 unseen: Prostate X | T2 source: MSD-H | T2 unseen: ASC | T2 unseen: MM-WHS | T3 source: CETUS-A | T3 unseen: CETUS-B | T3 unseen: CETUS-C | All source: Average | All unseen: Average |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | 89.6 | 60.4 | 58.0 | 76.8 | 91.9 | 4.4 | 72.9 | 85.8 | 51.7 | 39.2 | 89.1 | 49.8 |
Sharpening | 90.6 | 65.5 | 82.8 | 84.0 | 91.5 | 5.7 | 78.9 | 83.7 | 59.5 | 78.5 | 88.6 | 62.9 |
Blurring | 86.1 | 63.9 | 67.0 | 79.9 | 90.9 | 3.3 | 76.9 | 90.5 | 73.4 | 72.4 | 89.2 | 61.1 |
Noise | 91.1 | 59.3 | 67.4 | 81.4 | 91.4 | 8.3 | 78.0 | 87.3 | 66.8 | 62.2 | 90.0 | 59.0 |
Brightness | 89.7 | 63.3 | 66.9 | 83.0 | 91.3 | 12.2 | 80.2 | 85.5 | 63.6 | 83.1 | 88.8 | 63.6 |
Contrast | 91.1 | 72.7 | 60.7 | 86.1 | 91.3 | 12.7 | 78.6 | 88.4 | 58.4 | 85.5 | 90.3 | 63.6 |
Perturb | 90.1 | 63.4 | 69.5 | 81.5 | 91.7 | 6.6 | 77.3 | 88.5 | 63.6 | 83.1 | 90.1 | 55.7 |
Rotation | 87.4 | 59.0 | 57.9 | 75.1 | 91.2 | 5.2 | 72.1 | 78.0 | 60.4 | 62.6 | 85.5 | 54.7 |
Scaling | 90.8 | 59.3 | 60.8 | 78.8 | 91.3 | 7.4 | 75.3 | 91.0 | 84.1 | 68.2 | 91.0 | 61.3 |
Deform | 89.7 | 61.4 | 61.5 | 81.2 | 91.6 | 7.8 | 69.2 | 86.3 | 62.4 | 31.4 | 89.2 | 51.1 |
Top4 | 91.0 | 73.5 | 83.0 | 86.5 | 91.6 | 45.4 | 79.4 | 90.9 | 81.9 | 80.5 | 91.2 | 74.9 |
CycleGAN | - | 74.7 | 76.4 | 81.2 | - | 18.0 | 76.2 | - | 65.3 | 66.6 | - | 63.5 |
BigAug (ours) | 91.3 | 80.2 | 85.4 | 86.5 | 91.4 | 65.5 | 80.0 | 92.1 | 84.9 | 81.3 | 91.6 | 80.0 |
Supervised | - | 91.4 [39] | 89.3 [40] | 91.9* | - | 94.2 [41] | 88.6# | - | 92.5* | 92.5* | - | 91.5 |
\* Indicates inter-observer variability.
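
The caption does not state how the "All Tasks" averages are computed. As a rough check, the sketch below compares two plausible conventions on three rows copied from the table (Baseline, Top4, BigAug): a task-balanced mean (average within each task, then across the three tasks) and a pooled mean over all seven unseen datasets. For these three rows the reported unseen averages appear to agree with the task-balanced mean to within about 0.1, but this is an inference from the numbers, not something the source confirms. The function and variable names are hypothetical.

```python
# Unseen-domain Dice scores copied from the table, grouped by task:
# [Task 1: PROMISE, NCI-ISBI, Prostate X], [Task 2: ASC, MM-WHS],
# [Task 3: CETUS-B, CETUS-C]
unseen_dice = {
    "Baseline": [[60.4, 58.0, 76.8], [4.4, 72.9], [51.7, 39.2]],
    "Top4": [[73.5, 83.0, 86.5], [45.4, 79.4], [81.9, 80.5]],
    "BigAug": [[80.2, 85.4, 86.5], [65.5, 80.0], [84.9, 81.3]],
}

# "All Tasks / Unseen / Average" values as reported in the table.
reported_unseen_average = {"Baseline": 49.8, "Top4": 74.9, "BigAug": 80.0}


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def task_balanced_average(per_task_scores):
    """Average within each task first, then across tasks (assumed convention)."""
    return mean([mean(task) for task in per_task_scores])


def pooled_average(per_task_scores):
    """Average over all datasets at once, ignoring task boundaries."""
    return mean([score for task in per_task_scores for score in task])


if __name__ == "__main__":
    for method, scores in unseen_dice.items():
        print(
            f"{method}: task-balanced={task_balanced_average(scores):.2f}, "
            f"pooled={pooled_average(scores):.2f}, "
            f"reported={reported_unseen_average[method]}"
        )
```

The two conventions differ because Task 1 contributes three unseen datasets while Tasks 2 and 3 contribute two each, so a pooled mean weights Task 1 more heavily.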