2023 Dec 22;34(2):96–104. doi: 10.1097/CMR.0000000000000951

Table 2.

Key studies comparing outcomes between pathologists and WSI assessment using CNN

Reference	Segmentation classifier protocol	Dataset	Proposed segmentation OR training method to improve segmentation	Summary
De Logu et al., 2020 [41]	ResNet V2	University of Florence Department of Pathology. H&E of primary invasive cutaneous melanoma (n = 100, Breslow >2 mm)	ROI patches extracted from WSI. ROI were defined and labelled by two dermatopathologists, then used to train and test the CNN to assess performance	CNN had potential to give more detailed information on pathological cases, defining heat maps that distinguish healthy and pathological areas. High concordance between pathologist and CNN. Misclassification seen in patients with dermal solar elastosis and epidermal atrophy (chronic UVR exposed sites).
Wu et al., 2021 [40]	Scale-Aware Transformer Network	240 H&E	WSI comparison between pathologist, CNN and ground truth	While accuracy between pathologist and CNN was comparable, CNN performance on the dysplastic naevus subset appeared inferior.
Xie et al., 2021 [44]	Grad-CAM	841 H&E of melanoma and naevus. Central South University Xiangya Hospital	WSI comparison between proposed CNN model and 20 pathologists for specificity, sensitivity and accuracy.	CNN superior when compared to pathologist. Model identified salient features through heat maps. Additional clinical data was helpful for pathologist and may also aid CNN
Kim et al., 2022 [45]	Inception	256 H&E New York University	Predicting BRAF mutation through WSI analysis (Pathonomics)	Compared with BRAF wild-type nuclei, BRAF-mutated nuclei were shown to be statistically larger (in radii) and rounder (in form factor, solidity, extent, and eccentricity)
Klein et al., 2021 [46]	U-Net	H&E from 90 patients with metastatic melanoma. University Hospital Cologne	Used WSI to determine association with TILs and CPI treatment response in metastatic melanoma	TIL clusters were predictive of response or resistance to CPI. Elevated TIL clusters showed higher response to CPI in BRAF-positive tumours. High TIL counts were associated with increased survival.
Hohn et al., 2021 [47]	ResNeXt50	431 images (430 patients)	Used WSI to examine CNN accuracy when patient data was incorporated (age, sex, location).	Patient data did not improve CNN accuracy unless the confidence level without patient data was low
Li et al., 2021 [48]	ResNet50	701 images (583 patients). Multi-centre database. Chinese University Hospitals	Assessing AUROC including both melanoma and naevus (intradermal, compound, junctional)	Very high AUROC (0.971), a promising result for fully automated CNN assessment of WSI
Schmitt et al., 2021 [43]	ResNet50, DenseNet121, VGG16	427 H&E slides from 5 different institutions	Batch effects that are learned by CNN can cause significant misclassification and accuracy issues. Batch effect variables that were studied included patient age, slide preparation date, slide origin, scanner type	Hidden variables can cause significant accuracy variability. Preparation date of the slide and patient’s age were the biggest factors that caused significant accuracy variability
Zormpas-Petridis et al., 2020 [42]	SuperHistopath/Xception	127 melanoma H&E	Introduction of a novel SuperHistopath framework and modified Xception CNN to analyse 5× magnified WSI and segment ROI in breast cancer, melanoma and neuroblastoma	Accurate segmentation and ability to determine prognostic histological features (some are seen to have high intra- and inter-observer variability within pathologists)

AUROC, area under the receiver operating characteristic curve; CNN, convolutional neural network; CPI, checkpoint inhibitor; H&E, haematoxylin and eosin staining; ROI, region of interest; TILs, tumour-infiltrating lymphocytes; UVR, ultraviolet radiation; WSI, whole-slide imaging.
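
The studies in Table 2 report classifier performance as sensitivity, specificity, accuracy and AUROC. As a minimal sketch of how these metrics relate to slide-level CNN output, the Python below computes them from a list of ground-truth labels (1 = melanoma, 0 = naevus) and predicted scores. The function names, the 0.5 threshold and the example values are hypothetical illustrations and are not taken from any of the cited studies.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec_acc(labels, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed decision threshold.
    The default threshold of 0.5 is an assumption for illustration."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(labels)
    return sensitivity, specificity, accuracy


# Hypothetical example: six slides, three melanomas and three naevi.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(auroc(labels, scores))
print(sens_spec_acc(labels, scores))
```

AUROC summarises ranking quality across all thresholds, whereas sensitivity, specificity and accuracy depend on the chosen cut-off, which is one reason the studies above are not always directly comparable.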