Table 3. Results on the TUPAC dataset.
CNN model combinations | Baseline DCNN | ||||||
---|---|---|---|---|---|---|---|
Color augmentation | ✓ | ✓ | ✓ | ✓ | |||
Staining normalization | ✓ | ✓ | ✓ | ||||
Domain adversarial | ✓ | ✓ | ✓ | ||||
Internal test set (F1-score) | 0.8088 (±0.02) | 0.8117 (±0.001) | 0.7630 (±0.04) | 0.6950 (±0.379) | 0.7787 (±0.03) | 0.6985 (±0.01) | 0.6945 (±0.02) |
External test set (F1-score) | 0.71173 (±0.02) | 0.7306 (±0.07) | 0.5424 (±0.01) | 0.8236 (±0.071) | 0.5963 (±0.1) | 0.6740 (±0.01) | 0.5742 (±0.009) |
Internal test set (AUC) | 0.9596 (±0.006) | 0.9631 (±0.005) | 0.9351 (±0.001) | 0.8972 (±0.011) | 0.9503 (±0.01) | 0.9030 (±0.002) | 0.8871 (±0.02) |
External test set (AUC) | 0.8014 (±0.01) | 0.8270 (±0.06) | 0.848 (±0.075) | 0.9146 (±0.003) | 0.7925 (±0.06) | 0.8446 (±0.004) | 0.8255 (±0.06) |
Performance measures for the possible combinations of color augmentation, staining normalization, and DANN. The first column corresponds to the baseline DCNN, trained with neither staining normalization nor color augmentation. Numbers in bold indicate the best result in each row (performance measure).