Table 1. Performance of the optimal GPT-4 model with the optimized prompt on the 117th National Medical Licensing Examination (NMLE) in Japan.
Section | Essential | Essential | Essential | Basics and Clinical | Basics and Clinical | Basics and Clinical | Basics and Clinical | Basics and Clinical |
---|---|---|---|---|---|---|---|---|
Category | Basics of medicine (essential) | Clinical medicine (essential) | Comprehension (essential) | Basics of medicine (general) | Basics of medicine (specifics) | Clinical medicine (general) | Clinical medicine (specifics) | Comprehension |
No. of questions | 45 | 22 | 15 | 61 | 27 | 36 | 46 | 10 |
No. of correct answers | 36 | 19 | 12 | 47 | 25 | 22 | 37 | 8 |
No. of output errors | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
No. of incorrect answers | 8 | 3 | 3 | 14 | 1 | 14 | 9 | 2 |
Correct answer rate | 80.0% | 86.4% | 80.0% | 77.0% | 92.6% | 61.1% | 80.4% | 80.0% |
Output error rate | 2.2% | 0.0% | 0.0% | 0.0% | 3.7% | 0.0% | 0.0% | 0.0% |
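Score weight | ×1 | ×3 | ×3 | ×1 | ×1 | ×1 | ×1 | ×1 |
Total score (scoring rate) | Essential: 129/156 (82.7%) | | | Basics and Clinical: 139/180 (77.2%) | | | | |
Minimum passing scoring rate | Essential: 80.0% | | | Basics and Clinical: 74.6% | | | | |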
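
The section totals in Table 1 follow from the per-category counts and the score weights. The short Python sketch below recomputes them as a consistency check. The counts are copied from the table. The per-category weight assignment is an assumption inferred from the score-weight row: ×1 for basics of medicine (essential), ×3 for clinical medicine (essential) and comprehension (essential), and ×1 for every Basics and Clinical category. The names `CATEGORIES`, `PASSING_RATE`, and `section_score` are illustrative only.

```python
# Per-category counts copied from Table 1.
# Weights are an assumption inferred from the score-weight row of the table.
CATEGORIES = [
    # (section, category, questions, correct, weight)
    ("essential", "Basics of medicine (essential)", 45, 36, 1),
    ("essential", "Clinical medicine (essential)", 22, 19, 3),
    ("essential", "Comprehension (essential)", 15, 12, 3),
    ("basics_clinical", "Basics of medicine (general)", 61, 47, 1),
    ("basics_clinical", "Basics of medicine (specifics)", 27, 25, 1),
    ("basics_clinical", "Clinical medicine (general)", 36, 22, 1),
    ("basics_clinical", "Clinical medicine (specifics)", 46, 37, 1),
    ("basics_clinical", "Comprehension", 10, 8, 1),
]

# Minimum passing scoring rates (percent) as reported in Table 1.
PASSING_RATE = {"essential": 80.0, "basics_clinical": 74.6}


def section_score(rows, section):
    """Return (earned points, possible points, scoring rate in percent)."""
    selected = [r for r in rows if r[0] == section]
    earned = sum(correct * weight for _, _, _, correct, weight in selected)
    possible = sum(questions * weight for _, _, questions, _, weight in selected)
    return earned, possible, 100.0 * earned / possible


if __name__ == "__main__":
    for section, threshold in PASSING_RATE.items():
        earned, possible, rate = section_score(CATEGORIES, section)
        verdict = "pass" if rate >= threshold else "fail"
        print(f"{section}: {earned}/{possible} ({rate:.1f}%), "
              f"threshold {threshold}% -> {verdict}")
```

Under the assumed weights, the sketch reproduces the reported totals of 129/156 for the Essential section and 139/180 for the Basics and Clinical section, both above their minimum passing scoring rates.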