2023 Jun 29;9:e48002. doi: 10.2196/48002

Table 2.

Comparison of GPT-3.5 (Generative Pre-trained Transformer) and GPT-4 by question type in the Japanese Medical Licensing Examination (JMLE).

Question type	Question (n=254), n (%)	Examinee correct response rate^a (%)	GPT-3.5 correct response rate (%; 95% CI)	GPT-4 correct response rate (%; 95% CI)	P value
General	134 (52.7)	84	51.5 (42.9-60.0)	79.1 (72.1-86.1)	<.001
Clinical	98 (38.6)	85.3	50 (39.9-60.1)	79.6 (71.5-87.7)	<.001
Clinical sentence	22 (8.7)	88.8	50 (27.3-72.7)	86.3 (70.8-102)	.005

^aThe correct response rates of examinees were obtained from the 117th JMLE, as announced by the Ministry of Health, Labour and Welfare [15].
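
The upper confidence limit of 102 reported for GPT-4 on clinical sentence questions exceeds 100%. This can happen when a symmetric interval is built around a proportion close to 100% from a small sample (here, n=22). The Python sketch below illustrates the effect under stated assumptions. The count of 19 correct answers out of 22 is inferred from the reported 86.3% and does not appear in the table. The Wald and Wilson formulas are standard textbook intervals and may differ from the method the authors used. The function names are illustrative. The sketch does not attempt to reproduce the reported bounds exactly.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Symmetric normal-approximation (Wald) interval; it can leave [0, 1]."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval; its bounds always stay within [0, 1]."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: 19 of 22 correct, inferred from the reported 86.3%.
for name, interval in (("Wald", wald_interval), ("Wilson", wilson_interval)):
    low, high = interval(19, 22)
    print(f"{name}: {100 * low:.1f}% to {100 * high:.1f}%")
```

With a small n and a proportion near 100%, the symmetric interval extends past 100%, whereas the Wilson interval remains within the valid range.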