Cureus. 2023 Jun 22;15(6):e40822. doi: 10.7759/cureus.40822

Table 1. Percentage of questions answered correctly by GPT-3.5 vs. GPT-4 vs. humans by ophthalmology sub-category.

Bolding indicates statistical significance

| Ophthalmology Subcategory | GPT-3.5 Questions Answered Correctly (%) | GPT-4 Questions Answered Correctly (%) | Human Questions Answered Correctly (%) | GPT-3.5 vs GPT-4 P-Value | GPT-3.5 vs Human P-Value | GPT-4 vs Human P-Value |
|---|---|---|---|---|---|---|
| Lens & Cataract (n = 42) | 45 | 52 | 57 | 0.518 | 0.163 | 0.569 |
| External Disease & Cornea (n = 43) | 58 | 70 | 56 | 0.267 | 0.833 | 0.085 |
| Glaucoma (n = 43) | 65 | 84 | 59 | 0.048 | 0.614 | 0.003 |
| Neuro (n = 42) | 69 | 79 | 59 | 0.327 | 0.212 | 0.007 |
| Optics (n = 42) | 38 | 69 | 48 | 0.004 | 0.284 | 0.017 |
| Pathology & Tumors (n = 44) | 45 | 70 | 58 | 0.017 | 0.124 | 0.128 |
| Pediatrics (n = 43) | 63 | 79 | 63 | 0.099 | 0.951 | 0.023 |
| Oculoplastics (n = 42) | 57 | 83 | 59 | 0.008 | 0.851 | < 0.001 |
| Refractive Surgery (n = 42) | 48 | 69 | 58 | 0.002 | 0.397 | 0.003 |
| Retina & Vitreous (n = 42) | 67 | 74 | 63 | 0.480 | 0.652 | 0.157 |
| Intraocular Inflammation & Uveitis (n = 42) | 55 | 76 | 61 | 0.039 | 0.496 | 0.054 |
| Total (n = 467) | 55 | 73 | 58 | < 0.001 | 0.231 | < 0.001 |
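
The table does not say which statistical test produced the P-values. As an illustration only, the sketch below shows one common way to compare two accuracy rates measured on the same number of questions: a pooled two-proportion z-test, which is equivalent to a chi-square test without continuity correction. The function names `counts_from_percent` and `two_proportion_p_value` are hypothetical, and the raw counts are an assumption, reconstructed from the rounded percentages and the n reported for each row. The human columns are not covered, since the number of human responses is not given here.

```python
import math


def counts_from_percent(percent, n):
    """Assumption: recover the number of correct answers from a rounded percentage."""
    return round(percent * n / 100)


def two_proportion_p_value(correct_a, correct_b, n):
    """Two-sided p-value from a pooled two-proportion z-test with equal group size n."""
    p_a = correct_a / n
    p_b = correct_b / n
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (correct_a + correct_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    if se == 0:
        return 1.0  # both groups all correct or all wrong: no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Glaucoma row: GPT-3.5 at 65% and GPT-4 at 84% of n = 43 questions.
n = 43
gpt35_correct = counts_from_percent(65, n)
gpt4_correct = counts_from_percent(84, n)
p_value = two_proportion_p_value(gpt35_correct, gpt4_correct, n)
print(f"Glaucoma, GPT-3.5 vs GPT-4: p = {p_value:.3f}")
```

Under these assumptions the Glaucoma comparison comes out close to the tabulated 0.048, but that agreement does not establish which test the authors used.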