Table 2. Comprehensive summary of findings with GRADE certainty: AI in dermatological diagnosis
| Outcome | No. of studies | Total participants/images | Effect estimate/key results | Comments |
|---|---|---|---|---|
| Early detection of malignant skin lesions | 5 studies [23, 27, 28, 35, 37] | ~60,000+ images | Accuracy: 85–97%; sensitivity: 82–95% | Strong performance in model testing; external validation is still limited |
| Detection of monkeypox from skin images | 5 studies [21, 24, 29, 32, 36] | ~30,000+ images | Accuracy: 88–98%; used SVMs, transformers, and few-shot learning | Models are promising but mostly based on small or synthetic datasets |
| Diagnostic accuracy of AI models for general dermatology | 17 studies [21–29, 31–38] | ~150,000+ images | Accuracy: 80–98%; sensitivity: 76–96%; specificity: 78–97% | High variability in algorithms and lesion types; many use public datasets |
| AI vs. dermatologist diagnostic performance | 1 study [30] | ~1,000 image assessments | Comparable grading accuracy for facial signs | Only one validation study; limited scope |
| Generalizability across populations and skin tones | 3 studies [30, 31, 34] | Multinational (South Africa, China, Uganda) | Performance gaps noted in underrepresented groups | Highlights a lack of training-data diversity |
| Explainability and transparency of AI predictions | 5 studies [23, 27, 28, 31, 35] | Not quantifiable | Used tools such as SHAP and Grad-CAM (see the sketch below the table) | Explainability present but limited validation and integration |
| Real-time or low-resource deployment feasibility | 2 studies [26, 31] | Not specified | Use of lightweight/efficient models for mobile platforms | Mostly feasibility-focused; real-world performance unknown |
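For readers unfamiliar with the explainability tools cited in the table, the following is a minimal Grad-CAM sketch; it is an illustration of the general technique, not code from any of the cited studies [23, 27, 28, 31, 35]. The ResNet-18 backbone, the hooked layer, and the dummy input are all assumptions chosen for demonstration.

```python
# Minimal Grad-CAM sketch (hypothetical illustration, not from the cited studies).
# Assumes PyTorch and torchvision; uses an untrained ResNet-18 as a stand-in
# classifier so the example runs offline (a real use would load trained weights).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # trained dermatology weights in practice
model.eval()

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    # Save the feature maps produced during the forward pass.
    activations["value"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    # Save the gradient of the class score w.r.t. those feature maps.
    gradients["value"] = grad_out[0].detach()

# Hook the last convolutional block, where spatial detail is still present.
target_layer = model.layer4[-1]
target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

def grad_cam(image):
    """image: (1, 3, H, W) tensor, normalized as the model expects."""
    logits = model(image)
    class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()

    # Channel weights: global-average-pooled gradients (Grad-CAM's alpha_k).
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    # Upsample to input resolution and normalize to [0, 1] for overlay.
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.squeeze()

heatmap = grad_cam(torch.randn(1, 3, 224, 224))  # dummy input for illustration
print(heatmap.shape)  # torch.Size([224, 224])
```

The resulting heatmap highlights the image regions that most influenced the predicted class, which is the kind of saliency evidence the cited studies report; as the table notes, such visualizations remain weakly validated and are rarely integrated into clinical workflows.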