Table 3.
Author | Medically Irrelevant Questions | Invalid for Medical Exam | Inaccurate/Wrong Question | Inaccurate/Wrong Answer or Alternative Answers | Low Difficulty Level |
---|---|---|---|---|---|
Sevgi et al. | N/A | N/A | N/A | 1 (33.3%) | N/A |
Biswas | N/A | N/A | N/A | N/A | N/A |
Agarwal et al. | N/A | Highly valid | N/A | N/A | Somewhat difficult |
Ayub et al. | 9 (23%) | 24 (60%) | 5 (13%) | 5 (13%) | 10 (25%) |
Cheung et al. | 32 (64%) | 28 (56%) | 32 (64%) | 29 (58%) | N/A |
Totlis et al. | N/A | 8 (44.4%) | N/A | N/A | 8 (44.4%) |
Han et al. | N/A | N/A | N/A | N/A | 3 (100%) |
Klang et al. | 2 (0.95%) | 1 (0.5%) | 12 (5.7%) | 14 (6.6%) | 2 (0.95%) |
Summary of faulty questions generated by the AI, November 2023