2024 Jul 12;35(1):506–516. doi: 10.1007/s00330-024-10902-5

Table 2.

ChatGPT’s diagnostic accuracy categorized by tumor and nontumor groups

Correct answer (accuracy rate [%])

                              GPT-4-based ChatGPT                        GPT-4V-based ChatGPT
                              Final diagnosis   Differential diagnosis   Final diagnosis   Differential diagnosis
Total (n = 106)               46/106 (43%)      62/106 (58%)             9/106 (8%)        15/106 (14%)
  Tumor group (n = 45)        14/45 (31%)       22/45 (49%)              4/45 (9%)         5/45 (11%)
  Nontumor group (n = 61)     32/61 (52%)       40/61 (66%)              5/61 (8%)         10/61 (16%)
Tumor group (n = 45)^a        14/45 (31%)       22/45 (49%)              4/45 (9%)         5/45 (11%)
  Bone tumor (n = 24)         8/24 (33%)        14/24 (58%)              2/24 (8%)         3/24 (13%)
  Soft tissue tumor (n = 22)  6/22 (27%)        9/22 (41%)               2/22 (9%)         2/22 (9%)

ChatGPT, Chat Generative Pre-trained Transformer; GPT-4, Generative Pre-trained Transformer-4; GPT-4V, Generative Pre-trained Transformer-4 with vision

^a One case presents both a bone tumor and a soft tissue tumor
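
The accuracy rates in Table 2 are correct counts divided by case counts, shown as whole-number percentages. A minimal sketch of that arithmetic follows. It assumes percentages are rounded half up, which the table does not state, and the function name `accuracy_pct` is hypothetical, introduced only for this illustration.

```python
def accuracy_pct(correct: int, total: int) -> int:
    """Percentage of correct answers, rounded half up to a whole number.

    Assumption: half-up rounding. Integer arithmetic is used so the result
    does not depend on float error or on round-half-to-even behaviour.
    """
    return (200 * correct + total) // (2 * total)


# (correct, total) counts copied from Table 2:
# GPT-4-based ChatGPT, final diagnosis column.
final_diagnosis_gpt4 = {
    "Total": (46, 106),
    "Tumor group": (14, 45),
    "Nontumor group": (32, 61),
}

rates = {
    name: accuracy_pct(correct, total)
    for name, (correct, total) in final_diagnosis_gpt4.items()
}

for name, pct in rates.items():
    correct, total = final_diagnosis_gpt4[name]
    print(f"{name}: {correct}/{total} ({pct}%)")
```

The same function can be applied to any other cell of the table by passing that cell's numerator and denominator.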