2024 Jul 12;35(1):506–516. doi: 10.1007/s00330-024-10902-5

Table 3.

ChatGPT’s diagnostic accuracy in nontumor etiologies

Correct answer (accuracy rate [%])

| Etiology | GPT-4-based ChatGPT: final diagnosis | GPT-4-based ChatGPT: differential diagnosis | GPT-4V-based ChatGPT: final diagnosis | GPT-4V-based ChatGPT: differential diagnosis |
|---|---|---|---|---|
| Muscle/soft tissue/nerve disorder (n = 12) | 7/12 (58%) | 11/12 (92%) | 2/12 (17%) | 3/12 (25%) |
| Arthritis/arthropathy (n = 10) | 4/10 (40%) | 4/10 (40%) | 1/10 (10%) | 1/10 (10%) |
| Infection (n = 8) | 3/8 (38%) | 5/8 (63%) | 0/8 (0%) | 1/8 (13%) |
| Congenital/developmental abnormality and dysplasia (n = 6) | 4/6 (67%) | 4/6 (67%) | 0/6 (0%) | 0/6 (0%) |
| Trauma (n = 6) | 5/6 (83%) | 5/6 (83%) | 1/6 (17%) | 2/6 (33%) |
| Metabolic disease (n = 5) | 2/5 (40%) | 3/5 (60%) | 0/5 (0%) | 0/5 (0%) |
| Anatomical variant (n = 4) | 3/4 (75%) | 3/4 (75%) | 0/4 (0%) | 0/4 (0%) |
| Others (n = 10) | 4/10 (40%) | 5/10 (50%) | 1/10 (10%) | 3/10 (30%) |

ChatGPT, Chat Generative Pre-trained Transformer; GPT-4, Generative Pre-trained Transformer-4; GPT-4V, Generative Pre-trained Transformer-4 with vision
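
The accuracy rates in Table 3 can be recomputed from the reported counts. The sketch below is illustrative only and is not taken from the article: the counts are transcribed from the table, while the names `ROWS`, `percent_half_up`, and `accuracy_table` are hypothetical, and half-up rounding is an assumption chosen because it reproduces the printed values (for example, 5/8 = 62.5% is shown as 63%).

```python
# Counts transcribed from Table 3, one tuple per etiology:
# (etiology, n, GPT-4 final, GPT-4 differential, GPT-4V final, GPT-4V differential)
ROWS = [
    ("Muscle/soft tissue/nerve disorder", 12, 7, 11, 2, 3),
    ("Arthritis/arthropathy", 10, 4, 4, 1, 1),
    ("Infection", 8, 3, 5, 0, 1),
    ("Congenital/developmental abnormality and dysplasia", 6, 4, 4, 0, 0),
    ("Trauma", 6, 5, 5, 1, 2),
    ("Metabolic disease", 5, 2, 3, 0, 0),
    ("Anatomical variant", 4, 3, 3, 0, 0),
    ("Others", 10, 4, 5, 1, 3),
]


def percent_half_up(correct, total):
    """Return 100 * correct / total rounded half-up to a whole percent.

    Integer arithmetic is used because Python's built-in round() rounds
    halves to the nearest even number (round(62.5) == 62), which would not
    match the table. Half-up rounding is an assumption, not a stated method.
    """
    return (200 * correct + total) // (2 * total)


def accuracy_table(rows):
    """Map each etiology to its four accuracy rates, in column order."""
    return {
        name: [percent_half_up(c, n) for c in counts]
        for name, n, *counts in rows
    }


if __name__ == "__main__":
    for name, rates in accuracy_table(ROWS).items():
        print(name, rates)
```

Each list holds the four rates in the table's column order: GPT-4 final diagnosis, GPT-4 differential diagnosis, GPT-4V final diagnosis, GPT-4V differential diagnosis.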