2023 Jun 29;9:e48002. doi: 10.2196/48002

Table 1.

Comparison of GPT-3.5 and GPT-4 (GPT: Generative Pre-trained Transformer) on essential knowledge questions and other question categories in the Japanese Medical Licensing Examination (JMLE).

Question category	Questions (n=254), n (%)	Examinee correct response rate^a (%)	GPT-3.5 correct response rate (%; 95% CI)	GPT-4 correct response rate (%; 95% CI)	P value
All questions	254 (100)	84.9	50.8 (44.6-57.0)	79.9 (75.0-84.9)	<.001
Essential knowledge	78 (30.7)	89.2	55.1 (43.8-66.4)	87.2 (79.6-94.8)	<.001
General clinical	105 (41.3)	83.1	43.8 (34.2-53.5)	73.3 (64.7-81.9)	<.001
Specific disease	71 (28.0)	83	56.3 (44.5-68.2)	81.7 (72.5-90.9)	<.001

^aThe correct response rates of examinees were obtained from the 117th JMLE, as announced by the Ministry of Health, Labour and Welfare [15].
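
The 95% CIs in Table 1 can be approximated with a short calculation. The following is a minimal sketch, not the authors' method: it assumes a normal-approximation (Wald) interval, and the correct-answer counts (129 and 203 of 254) are back-calculated from the reported percentages rather than taken from the source. The function name `wald_ci` is hypothetical. Under these assumptions the intervals for the "All questions" row should land close to the reported values, though small rounding differences are possible if the original analysis used a different interval method.

```python
import math


def wald_ci(correct, total, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation 95% CI.

    Assumption: a Wald interval; the source does not state which
    interval method was used.
    """
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


TOTAL_QUESTIONS = 254  # from Table 1

# Assumption: counts inferred from the reported rates (50.8% and 79.9%),
# not given explicitly in the table.
assumed_correct = {"GPT-3.5": 129, "GPT-4": 203}

for model, k in assumed_correct.items():
    p, lo, hi = wald_ci(k, TOTAL_QUESTIONS)
    print(f"{model}: {100 * p:.1f}% ({100 * lo:.1f}-{100 * hi:.1f})")
```

The same function can be applied to any category row once its correct-answer count is known or inferred from the row's percentage and question count.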