Table 1.
Performance of ChatGPT and other large language models on medical examinations
| Model | Country | Examination | Score | Pass mark | Result |
| --- | --- | --- | --- | --- | --- |
| Galactica | USA | USMLE (MedQA)23 | 52.9% | 60% | Fail |
| Flan-PaLM | USA | USMLE (MedQA)9 | 67.6% | 60% | Pass |
| ChatGPT 3.5 | USA | USMLE5 | 44%–64.4% | 60% | Mixed |
| ChatGPT 3.5 | USA | USMLE6 | 42.1%–65.2% | 60% | Mixed |
| ChatGPT 3.5 | USA | American Heart Association life support exams24 | 68%–76.3% | 84% | Fail |
| ChatGPT 3.5 and 4 | USA | Plastic Surgery In-Service Training Exam25 | ChatGPT 3.5: 3rd (2021) and 8th (2022) percentile | – | – |
| | | | ChatGPT 4: 99th (2021) and 88th (2022) percentile | – | – |
| ChatGPT 4 | USA | USMLE12 | 85% | 60% | Pass |
| ChatGPT 3.5 | UK | General Practitioner (GP) AKT26 | 60.17% | 70.42% | Fail |
| ChatGPT 3.5 and 4 | USA | Ophthalmology Board Exam27 | ChatGPT 3.5: 63.1% | 65% | Fail |
| | | | ChatGPT 4: 76.9% | | Pass |
| Med-PaLM 2 | USA | USMLE (MedQA)17 | 86.5% | 60% | Pass |
| ChatGPT 3.5 and 4 | USA | Neurosurgical Board Exam28 | ChatGPT 3.5: 62.4% | – | – |
| | | | ChatGPT 4: 85.2% | | |
| ChatGPT 3.5 | UK | FRCA Primary29 | 69.7% | 71.3% | Fail |
| ChatGPT 3.5 and 4 | UK | Dermatology SCE30 | ChatGPT 3.5: 63.1% | 70%–72% | Fail |
| | | | ChatGPT 4: 90.5% | | Pass |
| ChatGPT 3.5 and 4 | UK | Neurology SCE31 | ChatGPT 3.5: 57% | 58% | Fail |
| | | | ChatGPT 4: 64% | | Pass |
| ChatGPT 3.5 | USA | Neonatal Board Exams32 | 45.3% | – | – |
AKT, Applied Knowledge Test; FRCA, Fellowship of the Royal College of Anaesthetists; SCE, Specialty Certificate Examination; USMLE, United States Medical Licensing Examination.
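
The Result column appears to follow from comparing each score with the pass mark: a score at or above the mark is a pass, one below it is a fail, and a reported range that straddles the mark is labelled Mixed. The Python sketch below illustrates that reading on a few rows copied from Table 1. The function name `classify`, the `(low, high)` tuple representation, and the use of `None` for an unreported pass mark are assumptions made for illustration, not part of the source.

```python
from typing import Optional, Tuple


def classify(score: Tuple[float, float], pass_mark: Optional[float]) -> str:
    """Label a (low, high) score range against a pass mark, all in percent."""
    if pass_mark is None:
        return "–"  # no pass mark reported, so no result can be assigned
    low, high = score
    if low >= pass_mark:
        return "Pass"
    if high < pass_mark:
        return "Fail"
    return "Mixed"  # the reported range straddles the pass mark


# A few rows copied from Table 1: (model, examination, score range, pass mark).
# Single scores are stored as a range whose two ends are equal.
rows = [
    ("Galactica", "USMLE (MedQA)", (52.9, 52.9), 60.0),
    ("Flan-PaLM", "USMLE (MedQA)", (67.6, 67.6), 60.0),
    ("ChatGPT 3.5", "USMLE", (44.0, 64.4), 60.0),
    ("ChatGPT 3.5", "FRCA Primary", (69.7, 69.7), 71.3),
    ("ChatGPT 3.5", "Neonatal Board Exams", (45.3, 45.3), None),
]

results = [classify(score, mark) for _, _, score, mark in rows]
for (model, exam, _, _), result in zip(rows, results):
    print(f"{model:12} {exam:22} {result}")
```

Under this assumed rule the five rows come out as Fail, Pass, Mixed, Fail and –, matching the Result column of the table. A pass mark that is itself a range, as for the Dermatology SCE, would need an extra convention that this sketch does not attempt.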