. 2024 Jul 12;35(1):506–516. doi: 10.1007/s00330-024-10902-5

Table 2.

ChatGPT’s diagnostic accuracy categorized by tumor and nontumor groups

	Correct answer (accuracy rate [%])
	GPT-4-based ChatGPT		GPT-4V-based ChatGPT
	Final diagnosis	Differential diagnosis	Final diagnosis	Differential diagnosis
Total (n = 106)	46/106 (43%)	62/106 (58%)	9/106 (8%)	15/106 (14%)
Tumor group (n = 45)	14/45 (31%)	22/45 (49%)	4/45 (9%)	5/45 (11%)
Nontumor group (n = 61)	32/61 (52%)	40/61 (66%)	5/61 (8%)	10/61 (16%)
Tumor group (n = 45)^a	14/45 (31%)	22/45 (49%)	4/45 (9%)	5/45 (11%)
Bone tumor (n = 24)	8/24 (33%)	14/24 (58%)	2/24 (8%)	3/24 (13%)
Soft tissue tumor (n = 22)	6/22 (27%)	9/22 (41%)	2/22 (9%)	2/22 (9%)

ChatGPT, Chat Generative Pre-trained Transformer; GPT-4, Generative Pre-trained Transformer-4; GPT-4V, Generative Pre-trained Transformer-4 with vision

^a One case presents both a bone tumor and a soft tissue tumor
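
The accuracy rates in Table 2 are simple ratios of correct answers to cases, so they can be recomputed from the counts. The following is a minimal sketch, not the authors' analysis code: the names `TABLE_2`, `accuracy_percent`, and `format_cell` are hypothetical, and rounding halves up to a whole percentage is an assumption about how the rates were rounded. It also checks that the tumor and nontumor counts sum to the totals, and that the bone and soft tissue subgroup sizes agree with the tumor group once the single overlapping case from the footnote is subtracted.

```python
# Counts transcribed from Table 2 as (correct, cases), in column order:
# GPT-4 final, GPT-4 differential, GPT-4V final, GPT-4V differential.
TABLE_2 = {
    "Total": [(46, 106), (62, 106), (9, 106), (15, 106)],
    "Tumor group": [(14, 45), (22, 45), (4, 45), (5, 45)],
    "Nontumor group": [(32, 61), (40, 61), (5, 61), (10, 61)],
}


def accuracy_percent(correct: int, cases: int) -> int:
    """Return correct/cases as a whole-number percentage, rounding halves up.

    Integer arithmetic is used so that exact halves (e.g. 12.5) round up;
    Python's built-in round() would round them to the nearest even number.
    Half-up rounding is an assumption, not something the table states.
    """
    if cases <= 0:
        raise ValueError("cases must be positive")
    return (200 * correct + cases) // (2 * cases)


def format_cell(correct: int, cases: int) -> str:
    """Format one table cell in the 'k/n (p%)' style used by Table 2."""
    return f"{correct}/{cases} ({accuracy_percent(correct, cases)}%)"


def subgroups_sum_to_total() -> bool:
    """Check that tumor + nontumor correct counts equal the total per column."""
    return all(
        tumor[0] + nontumor[0] == total[0]
        for total, tumor, nontumor in zip(
            TABLE_2["Total"], TABLE_2["Tumor group"], TABLE_2["Nontumor group"]
        )
    )


# Subgroup sizes from the table: 24 bone and 22 soft tissue tumors,
# with one case counted in both (footnote a), giving 45 tumor cases.
BONE_CASES, SOFT_TISSUE_CASES, OVERLAP = 24, 22, 1

if __name__ == "__main__":
    for group, cells in TABLE_2.items():
        print(group, *(format_cell(k, n) for k, n in cells), sep="\t")
    print("Subgroups sum to total:", subgroups_sum_to_total())
    print("Tumor cases from subgroups:", BONE_CASES + SOFT_TISSUE_CASES - OVERLAP)
```

Under these assumptions, the recomputed percentages for the three rows included here match the values printed in the table.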