Table 1.
Author | Year | Sample | Number of classes | Diagnosis<sup>a</sup> | Dataset<sup>a</sup> | Training set<sup>b</sup> | Test set | External test set | Pre-processing<sup>a</sup> | Model (Patch level)<sup>a</sup> | Model (Slide level)<sup>a</sup> | Transfer learning | Training approach | Results<sup>a</sup> | Results of the external test set<sup>a</sup> |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Lucas et al.<sup>33</sup> | 2019 | Prostate | 4 | Cancer | Private | 268 000 patches | 89 000 patches | – | Data augmentation | InceptionV3 + SVM | Percentages of GPs used for final Gleason grade | No | Supervised | Kappa: 0.70 | – |
Pantanowitz et al.<sup>40</sup> | 2020 | Prostate | 18 | Cancer | Private | 549 WSIs | 2501 WSIs | 1627 WSIs | Tissue segmentation and data augmentation | InceptionV1, InceptionV3 and ResNet101 | Maximum score | Yes | Supervised | AUC: 0.997<sup>g</sup> | AUC: 0.991, 0.941, 0.971, and 0.957<sup>h</sup> |
Ström et al.<sup>49</sup> | 2020 | Prostate | 2 and 4<sup>a</sup> | Cancer | Private | 1069 WSIs | 246 WSIs | 73 WSIs | Tissue segmentation and data augmentation | 30 InceptionV3 models | Boosted tree | Yes | Supervised | Kappa: 0.83 | Kappa: 0.70 |
BenTaieb et al.<sup>6</sup> | 2017 | Ovary | 5 | Cancer | Public | 68 WSIs | 65 WSIs | – | – | K-means | LSVM | Yes | Weakly supervised | Kappa: 0.89 | – |
Barker et al.<sup>4</sup> | 2016 | Central nervous system | 2 | Cancer | Public | 302 WSIs | 45 WSIs | 302 WSIs | Tissue segmentation, color deconvolution and nuclei segmentation | – | Feature Extraction + Elastic Net (Regression) | No | Weakly supervised | Accuracy: 1.0 | Accuracy: 0.93 |
Xu et al.<sup>60</sup> | 2017 | Central nervous system | 2 | Cancer | Public | 55 WSIs | 40 WSIs | – | Tissue segmentation, resize and data augmentation | Customized AlexNet | Feature Pooling + SVM | Yes | Supervised | Accuracy: 0.975 | – |
Bulten et al.<sup>7</sup> | 2020 | Prostate | 7 | Cancer | Private | 933 WSIs | 210 WSIs | – | Tissue segmentation and data augmentation | Own CNN to detect tumor and U-Net to final label | Normalized percentage of the volume of each class | No | Supervised (with a semi-automatic annotation) | Kappa: 0.819 on Gleason score | – |
Gecer et al.<sup>17</sup> | 2018 | Breast | 5 | Cancer | Private | 180 WSIs | 60 WSIs | – | Color normalization | RoI detector and an own proposed CNN | Majority voting | No | Weakly supervised | Accuracy: 0.55 | – |
Silva-Rodríguez et al.<sup>46</sup> | 2020 | Prostate | 4 and 1<sup>a</sup> | Cancer | Public | 155 WSIs | 2122 patches | – | Tissue segmentation and data augmentation | Own CNN | MLP | No and yes<sup>a</sup> | Supervised | Kappa: 0.732 | – |
Tokunaga et al.<sup>53</sup> | 2019 | Gastric | 4 | Cancer | – | 29 WSIs | – | – | Data augmentation | AWMF-CNN | Aggregating CNN | No | Supervised | IoU (Mean): 0.536 | – |
Sali et al.<sup>43</sup> | 2019 | Small intestine | 4 | Celiac disease | Private | 336 WSIs | 120 WSIs | – | Tissue segmentation, color normalization, resize and data augmentation | Customized ResNet50 | Sum of all labels and majority | No | Weakly supervised | Accuracy: 1.0 | – |
Xu et al.<sup>61</sup> | 2020 | Prostate | 3 | Cancer | Public | 312 WSIs | 49,883 patches | – | Grayscale and tissue segmentation | Feature extractor | PCA and SVM | No | Weakly supervised | Accuracy: 0.771 | – |
Mercan et al.<sup>34</sup> | 2018 | Breast | 14 | Cancer | Private | 240 WSIs | 60 WSIs | – | – | Feature extractor + Linear classifier | PCA and SVM | No | Weakly supervised | Average precision: 0.737 | – |
Adnan et al.<sup>1</sup> | 2020 | Lung | 2 | Cancer | Public | 1026 WSIs | – | – | RoI selection | Feature extractor | GCN | No and yes<sup>a</sup> | Weakly supervised | AUC: 0.89<sup>f</sup> | – |
van Zon et al.<sup>56</sup> | 2020 | Skin | 3 | Cancer | Private | 232 WSIs | 331 WSIs | – | Tissue segmentation and data augmentation | U-Net | Own CNN | No | Supervised | Accuracy: 0.954<sup>d</sup> | – |
Wang et al.<sup>57</sup> | 2019 | Lung | 4 | Cancer | Private | 754 WSIs | 185 WSIs | – | Tissue segmentation, resize and data augmentation | ScanNet | Aggregation of patch prediction values + Random forest<sup>a</sup> | No | Weakly supervised | Accuracy: 0.973 | – |
Syrykh et al.<sup>51</sup> | 2020 | Lymph node | 2 | Cancer | Private | 75% of 378 WSIs | 25% of 378 WSIs | 48 Cases | Tissue segmentation | CNN<sup>a</sup> | Average of patch inferences | – | Weakly supervised | AUC: 0.99 | AUC: 0.69 |
Wei et al.<sup>58</sup> | 2019 | Small intestine | 3 | Celiac disease | Private | 1,018 WSIs | 212 WSIs | – | Data augmentation and color normalization | ResNet50 | Threshold to discard low confidence + Most frequent predicted class | Yes | – | Average F1 score: 0.872 | – |
Korbar et al.<sup>28</sup> | 2017 | Small intestine | 6 | Colorectal polyps | Private | 458 WSIs | 239 WSIs | – | Data augmentation, color normalization and resize | ResNet-D | At least 5 positive class patches with 70% of confidence | No | Supervised | Overall F1 score: 0.888 | – |
Nagpal et al.<sup>37</sup> | 2019 | Prostate | 4 | Cancer | Public and private | 1,226 WSIs | 331 WSIs | – | Data augmentation | Customized Inception V3 | K-nearest neighbor model from patch prediction | No | Supervised | Gleason Score Accuracy: 0.70 | – |
Olsen et al.<sup>39</sup> | 2018 | Skin | 3 models with 2 classes | Cancer | Private | Study 1: 300 WSIs; Study 2: 225 WSIs; Study 3: 225 WSIs | Study 1: 126 WSIs; Study 2: 114 WSIs; Study 3: 123 WSIs | – | Tissue segmentation | Derivative VGG + Rule-based discriminator | Classification model trained with the segmented areas<sup>a</sup> | No | Supervised | Accuracy: Study 1: 0.9945; Study 2: 0.994; Study 3: 1.0 | – |
Wei et al.<sup>59</sup> | 2019 | Lung | 6 | Cancer | Private | RoIs from 279 WSIs | 143 WSIs | – | Tissue segmentation, data augmentation and color normalization | ResNet18 | Threshold to discard low confidence + Most frequent predicted class | Yes | Supervised | Kappa Score: 0.525 | – |
Ianni et al.<sup>20</sup> | 2020 | Skin | 4 | Cancer | Private | 85% of 5,070 WSIs | 15% of 5,070 WSIs | 13,537 WSIs | – | Own Encoder-Decoder CNN + U-Net | Own CNN | No | Supervised (Patch) and Weakly Supervised (Slide) | – | Accuracy: 0.98 |
Iizuka et al.<sup>21</sup> | 2020 | Stomach & Small intestine | 2 models with 3 classes | Cancer | Private | Stomach: 3,628 WSIs; Colon: 3,536 WSIs | Stomach & Colon: 500 WSIs | Stomach & Colon: 500 WSIs | Tissue segmentation and data augmentation | Customized Inception V3 | RNN using the last but one layer from the previous model as input | No | Supervised | AUC<sup>e</sup>: Stomach: 0.97 and 0.99; Colon: 0.96 and 0.99 | AUC<sup>e</sup>: Stomach: 0.98 and 0.93; Colon: 0.97 and 0.96 |
Campanella et al.<sup>8</sup> | 2019 | Skin | 2 | Cancer | Private | 8387 WSIs<sup>c</sup> | 1575 WSIs<sup>c</sup> | – | – | ResNet34 | RNN using the last but one layer from the previous model as input | No | Weakly supervised | AUC: 0.994 | – |
Chuang et al.<sup>9</sup> | 2020 | Larynx, lip and oral cavity, esophagus, pharynx | 3 | Cancer | Private | 626 Cases | 100 Cases | – | – | ResNeXt | ResNet using the probability map as input | Yes | Supervised | AUC: 0.985 | – |
Captions: – Not mentioned or not performed.
<sup>a</sup> Details can be found in the Supplementary Table.
<sup>b</sup> The training and validation sets used during training were counted together as the training set in this column.
<sup>c</sup> Not clearly specified; only the test set size and the whole dataset size were reported, so this number was estimated from those two values.
<sup>d</sup> The authors reported no metric for the final diagnosis; we calculated this metric from the misclassification comparison table.
<sup>e</sup> AUC of adenocarcinoma and adenoma compared to benign, respectively.
<sup>f</sup> This study used the same model in 2 different lung carcinoma tasks, one on a private set with 4 classes and another on TCGA differentiating 2 classes. We considered the most complex task.
<sup>g</sup> The authors reported only the Benign vs. Cancer AUC on the internal test set.
<sup>h</sup> Metrics representing: Benign vs. Cancer, Gleason score 6 or ASAP vs. Gleason score 7–10, ASAP or Gleason pattern 3 or 4 vs. Gleason pattern 5, and Cancer without vs. with perineural invasion, respectively.
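
Several entries in the "Model (Slide level)" column describe simple rules that turn patch-level predictions into one slide-level label: majority voting, discarding low-confidence patches before taking the most frequent class, and requiring a minimum number of confident positive patches. The Python sketch below illustrates these three rules in generic form. It is not the implementation of any cited study. The function names, the `(label, confidence)` patch representation and the default `min_conf` of 0.5 are assumptions made for illustration; the defaults of 5 patches and 0.7 confidence in `min_count_rule` only mirror the wording of the table.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical patch representation: (predicted class, confidence in [0, 1]).
Patch = Tuple[str, float]


def majority_vote(patches: Sequence[Patch]) -> str:
    """Return the most frequent patch-level class (plain majority voting)."""
    counts = Counter(label for label, _ in patches)
    return counts.most_common(1)[0][0]


def thresholded_vote(patches: Sequence[Patch], min_conf: float = 0.5) -> Optional[str]:
    """Discard patches below min_conf, then return the most frequent class.

    Returns None when no patch survives the threshold.
    """
    kept = [(label, conf) for label, conf in patches if conf >= min_conf]
    if not kept:
        return None
    return majority_vote(kept)


def min_count_rule(
    patches: Sequence[Patch],
    positive_label: str,
    min_patches: int = 5,
    min_conf: float = 0.7,
) -> bool:
    """Flag a slide as positive if enough confident positive patches exist."""
    confident = sum(
        1 for label, conf in patches if label == positive_label and conf >= min_conf
    )
    return confident >= min_patches
```

The rules can disagree on the same slide: many low-confidence tumor patches win a plain majority vote but are ignored once a confidence threshold is applied. That is one reason the slide-level aggregation step is reported separately from the patch-level model.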