. 2023 Jun 29;9:e48002. doi: 10.2196/48002

Table 3.

Comparison of GPT-3.5 (Generative Pre-trained Transformer) and GPT-4 in the Japanese Medical Licensing Examination (JMLE) by difficulty level^a.

| Difficulty level | Questions (n=254), n (%) | Examinee correct response rate^b (%) | GPT-3.5 correct response rate (%; 95% CI) | GPT-4 correct response rate (%; 95% CI) | P value |
| --- | --- | --- | --- | --- | --- |
| Easy | 82 (32.3) | 98.7 | 69.5 (59.3-79.7) | 87.8 (80.6-95.0) | .001 |
| Normal | 112 (44.1) | 90.2 | 46.2 (37.0-55.8) | 77.7 (69.8-85.5) | <.001 |
| Hard | 60 (23.6) | 56.3 | 33.3 (21.1-45.6) | 73.3 (61.8-84.8) | <.001 |

^a Difficulty level was classified by the percentage of correct responses provided by medu4 [16], Japan’s leading preparatory school for the JMLE: easy, >97%; normal, 80% to 96.9%; and hard, <79.9%.

^b The correct response rates of examinees were obtained from the 117th JMLE, as announced by the Ministry of Health, Labour and Welfare [15].
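
As a rough check on the confidence intervals in Table 3, the sketch below recomputes an approximate 95% CI from each reported correct response rate and its question count. It assumes a normal-approximation (Wald) interval. The table does not say which interval method the authors used, so the output is expected to land close to the published bounds rather than match them exactly. The function name `wald_ci` and the `rows` dictionary are illustrative and do not come from the article.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(rate_pct, n, level=0.95):
    """Approximate CI, in percent, for a proportion reported as a percentage.

    Assumption: normal-approximation (Wald) interval; the article's actual
    method is not stated in the table.
    """
    p = rate_pct / 100.0
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% level
    half_width = z * sqrt(p * (1.0 - p) / n)
    lower = max(0.0, p - half_width)
    upper = min(1.0, p + half_width)
    return 100.0 * lower, 100.0 * upper


# Values copied from Table 3: (number of questions, GPT-3.5 rate %, GPT-4 rate %)
rows = {
    "Easy": (82, 69.5, 87.8),
    "Normal": (112, 46.2, 77.7),
    "Hard": (60, 33.3, 73.3),
}

for difficulty, (n, gpt35, gpt4) in rows.items():
    lo35, hi35 = wald_ci(gpt35, n)
    lo4, hi4 = wald_ci(gpt4, n)
    print(f"{difficulty}: GPT-3.5 {gpt35} ({lo35:.1f}-{hi35:.1f}), "
          f"GPT-4 {gpt4} ({lo4:.1f}-{hi4:.1f})")
```

Small differences from the published bounds are plausible under a different interval method (for example, a t-based interval) and should not be read as errors in the table.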