Table 2.
Model Name | LUAD | BRCA | SARC | OV | Other* | 13 cancer types** | All |
---|---|---|---|---|---|---|---|
Baseline | 0.78 | 0.77 | – | – | – | 0.85 | – |
VGG-16 | 0.85 | 0.88 | 0.92 | 0.84 | 0.85 | 0.86 | 0.86 |
ResNet-34 | 0.87 | 0.87 | 0.88 | 0.82 | 0.86 | 0.86 | 0.86 |
Incep-V4 | **0.89** | **0.89** | **0.96** | **0.93** | **0.87** | **0.88** | **0.89** |
F-score comparison of the models on test patches from LUAD, BRCA, SARC, and OV. *Other: patches from the other cancer types in the set of 23 types used in training. **13 cancer types: the subset of test patches belonging to the 13 cancer types on which the human-in-the-loop baseline model (Baseline) (33) was trained. All: all test patches from all 23 cancer types. The best F-score on each dataset is indicated in bold.