Table 4.
Reference | AI performance | Expert performance | Non-expert performance |
---|---|---|---|
Piccolo et al. (2002) [23] | Sensitivity: 92%; Specificity: 74% | Sensitivity: 92%; Specificity: 99% | Sensitivity: 69%; Specificity: 94% |
Chang et al. (2013) [25] | Accuracy (Melanoma: 91%, Non-melanoma: 83%); Sensitivity: 86%; Specificity: 88% | Accuracy: 81%; Sensitivity: 83%; Specificity: 86% | |
Yu et al. (2018) [29] | Accuracy: 82%; Sensitivity: 93%; Specificity: 72%; AUC: 0.80 | Accuracy: 81%; Sensitivity: 97%; Specificity: 67%; AUC: 0.80 | Accuracy: 65%; Sensitivity: 45%; Specificity: 84%; AUC: 0.65 |
Huang et al. (2020) [37] | Sensitivity: 90%; AUC: 0.94 | Sensitivity: 85%; Specificity: 90% | Sensitivity: 66%; Specificity: 72% |
Han et al. (2020) [35] | Sensitivity: 89%; Specificity: 78%; AUC: 0.92 | Sensitivity: 95%; Specificity: 72%; ROC: 0.91 | Accuracy (Dermatology resident: 94%, Non-dermatology clinician: 77%); Sensitivity (Dermatology resident: 69%, Non-dermatology clinician: 65%); AUC (Dermatology resident: 0.88, Non-dermatology clinician: 0.73) |
Fujisawa et al. (2019) [31] | Accuracy (Binary: 92%, Multiclass: 75%) | Accuracy (Binary: 85%, Multiclass: 60%) | Accuracy (Binary: 74%, Multiclass: 42%) |
Jinnai et al. (2019) [38] | Accuracy: 92%; Sensitivity: 83%; Specificity: 95% | Accuracy: 87%; Sensitivity: 86%; Specificity: 87% | Accuracy: 85%; Sensitivity: 84%; Specificity: 86% |
Zhao et al. (2019) [32] | Sensitivity (Benign: 90%, Low risk: 90%, High risk: 75%) | Sensitivity (Benign: 61%, Low risk: 50%, High risk: 64%) | |
Cho et al. (2020) [33] | Sensitivity (Dataset 1: 76%, Dataset 2: 70%); Specificity (Dataset 1: 80%, Dataset 2: 76%); AUC (Dataset 1: 0.83, Dataset 2: 0.77) | Sensitivity (Without algorithm: 90%, With algorithm: 90%); Specificity (Without algorithm: 58%, With algorithm: 61%) | Sensitivity: Dermatology resident (Without algorithm: 80%, With algorithm: 85%), Non-dermatology clinician (Without algorithm: 65%, With algorithm: 74%); Specificity: Dermatology resident (Without algorithm: 53%, With algorithm: 71%), Non-dermatology clinician (Without algorithm: 46%, With algorithm: 49%); AUC: Dermatology resident (Without algorithm: 0.33, With algorithm: 0.42), Non-dermatology clinician (Without algorithm: 0.11, With algorithm: 0.23) |
Han et al. (2020) [36] | Multiclass model accuracy (Top 1: 45%, Top 3: 69%, Top 5: 78%) | Multiclass model accuracy: Without algorithm (Top 1: 50%, Top 3: 67%), With algorithm (Top 1: 53%, Top 3: 74%); Binary model accuracy (Without algorithm: 77%, With algorithm: 85%) | |
Han et al. (2020) [34] | Binary model (Sensitivity: 67%, Specificity: 87%); Multiclass accuracy (Top 1: 50%, Top 3: 70%) | Binary model (Sensitivity: 66%, Specificity: 67%); Multiclass accuracy (Top 1: 38%, Top 3: 53%) | |
Li et al. (2020) [44] | Accuracy (Binary: 73%, Multiclass: 86%) | Accuracy (Binary: 83%, Multiclass: 74%) | |
Liu et al. (2020) [39] | Accuracy (Top 1: 66%, Top 3: 90%) | Accuracy (Top 1: 63%, Top 3: 75%) | Accuracy: Primary care physician (Top 1: 44%, Top 3: 60%), Nurse practitioner (Top 1: 40%, Top 3: 55%) |
Minagawa et al. (2021) [42] | Accuracy: 71% | Accuracy: 90% |