Author manuscript; available in PMC: 2023 Dec 1.
Published in final edited form as: J Am Acad Dermatol. 2020 May 16;87(6):1352–1360. doi: 10.1016/j.jaad.2020.05.053

Table 1:

Recent Studies in Dermatology Utilizing Artificial Intelligence

| Study | Objectives and key findings | Algorithm | Sample size |
|---|---|---|---|
| **Skin Cancer Classification** | | | |
| Esteva et al.14 (2017) | Classified images of (1) keratinocyte carcinomas vs. benign seborrheic keratoses and (2) malignant melanomas vs. benign nevi. ROC AUC of 0.96 for carcinomas and 0.96 for melanomas. | GoogLeNet Inception v3 model | 129,450 images |
| Han et al.20 (2018) | Classified 12 different skin lesions from clinical images. ROC AUCs for BCC, SCC, and melanoma were 0.96, 0.83, and 0.96, respectively. | ResNet | 19,398 images |
| Haenssle et al.15 (2018) | Classified dermatoscopic images of melanoma vs. benign nevi; compared performance with 58 dermatologists. Dermatologists improved in sensitivity and specificity with additional clinical information but still did not perform as well as the CNN. | GoogLeNet Inception v3 model | Not provided |
| Marchetti et al.17 (2018) | Reported the results of 25 deep learning algorithms trained to classify melanomas vs. benign nevi. The top 5 algorithms were combined into a fusion algorithm, which achieved an ROC AUC of 0.86. | Variable | 900 images |
| Brinker et al.18 (2019) | Trained a melanoma classification model on dermatoscopic images and then evaluated it on clinical images. At equal sensitivity (89.4%), the model achieved a higher specificity (68.2%) than the mean of 157 dermatologists (64.4%). | ResNet | 20,735 images |
| Yap et al.19 (2018) | Combined dermatoscopic images, macroscopic images, and patient metadata to train a CNN for skin lesion classification. Reported an ROC AUC of 0.866 for melanoma detection. | ResNet | 2,917 cases |
| Aggarwal et al.36 (2019) | Evaluated deep learning algorithms before and after data augmentation. Across the five lesion classes tested, data augmentation increased AUC by 0.132 on average. | TensorFlow Inception v3 | 938 images |
| Tschandl et al.37 (2019) | Classified nonpigmented skin cancer; compared performance with 95 human raters. ROC AUC of 0.742. The CNN had a higher percentage of correct diagnoses (37.6%) than the average human rater (33.5%) but not than experts (37.3%). | TensorFlow Inception v3 | 7,895 dermoscopic images; 5,829 close-up images |
| Han et al.38 (2019) | Localized and diagnosed keratinocytic skin cancers from unprocessed photographs without manual preselection of suspicious lesions by dermatologists. ROC AUC of 0.910; F1 score of 0.831; Youden index of 0.675. | Region-based convolutional neural network | 182,348 images |
| **Dermatopathology** | | | |
| Hekler et al.39 (2019) | Trained a CNN on images of melanomas and nevi. The CNN had a 19% discordance with the histopathologist, comparable to the discordance between human pathologists (25–26%). | ResNet | 595 whole slide images |
| Olsen et al.22 (2018) | Trained a CNN on whole slide images to identify basal cell carcinomas, dermal nevi, and seborrheic keratoses. ROC AUCs were 0.99, 0.97, and 0.99, respectively. | VGG | 450 whole slide images |
| Hart et al.23 (2019) | Developed a CNN trained on whole slide images to distinguish between Spitz and conventional melanocytic lesions. Reported an accuracy of 92% for a single call from a whole slide image. | TensorFlow Inception v3 | 100 whole slide images |
| Jiang et al.40 (2019) | Used smartphone cameras to photograph microscope ocular images instead of using whole slide images; developed a CNN to recognize BCC. ROC AUC of 0.95. | GoogLeNet Inception v3 model | 8,046 microscopic ocular images |
| Hekler et al.24 (2019) | Developed a CNN to classify histopathologic images as melanoma or nevi. The CNN achieved a mean sensitivity/specificity/accuracy of 76%/60%/68%; the 11 pathologists achieved 52%/67%/59%. | ResNet | 695 whole slide images |
| **Predicting Novel Risk Factors/Epidemiology** | | | |
| Roffman et al.8 (2018) | Trained a model on a data set of personal health information; identified 13 parameters predictive of NMSC. The model achieved a sensitivity of 86.2% and a specificity of 62.7%. | Neural network (not further specified) | 2,056 NMSC cases and 460,574 control cases |
| Lott et al.41 (2018) | Used natural language processing to analyze EMR pathology reports of patients who underwent skin biopsy, calculating population-based frequencies of histologically confirmed melanocytic lesions. PPV 82.4%; sensitivity 81.7%; F1 measure 0.82. | NegEx | 80,368 skin biopsies |
| **Identifying Onychomycosis** | | | |
| Han et al.7 (2018) | Classified nail images as onychomycosis or not. ROC AUC of 0.98. | Ensemble model combining ResNet-152 and VGG-19 | 49,567 images |
| **Quantifying Alopecia Areata** | | | |
| Bernardis et al.42 (2018) | Trained a texture classification model to calculate SALT scores. The model calculated SALT scores in seconds, whereas the average time reported by clinicians is 7 minutes. | Texture classification model | 250 images |
| **Automating Reflectance Confocal Microscopy** | | | |
| Bozkurt et al.43 (2017) | Developed a segmentation algorithm to distinguish the stratum corneum layer and accurately calculate its thickness. | Complex wavelet transform | 15 RCM stacks with 30 images in each stack |
| **Mitosis Detection** | | | |
| Andres et al.21 (2017) | Identified mitotic cells and detected relevant regions in whole slide images. Reported an accuracy of 83% and a Dice coefficient of 0.72. | Random forest classifier | 59 whole slide images |
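The evaluation metrics cited throughout the table (ROC AUC, sensitivity, specificity, PPV, F1, and the Youden index) can be illustrated with a short sketch. The labels and model scores below are invented for demonstration only and are not drawn from any study above; the ROC AUC is computed via its rank-statistic interpretation (the probability that a positive case outscores a negative one).

```python
# Toy illustration of the binary-classification metrics cited in Table 1.
# All data below are hypothetical, not taken from the studies summarized above.

def confusion_counts(y_true, y_pred):
    """Count true/false positives/negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sens = tp / (tp + fn)                # sensitivity (recall)
    spec = tn / (tn + fp)                # specificity
    ppv = tp / (tp + fp)                 # positive predictive value
    f1 = 2 * ppv * sens / (ppv + sens)   # F1 = harmonic mean of PPV and sensitivity
    youden = sens + spec - 1             # Youden index J
    return sens, spec, f1, youden

def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive case outscores a
    negative case (equivalent to the Mann-Whitney U statistic)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical malignant (1) vs. benign (0) labels and model scores:
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(metrics(y_true, y_pred))   # (0.75, 0.75, 0.75, 0.5)
print(roc_auc(y_true, scores))   # 0.875
```

Note that AUC is threshold-free: it summarizes ranking quality across all cutoffs, which is why studies such as Brinker et al.18 additionally report sensitivity and specificity at a fixed operating point when comparing against clinicians.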