Table 1.
| | ChatGPT-3.5 (n = 199) | Bing Chat (n = 158) | Bard (n = 112) | p-Value |
---|---|---|---|---|
Accurate | 76 (38.2%) * | 47 (29.8%) ** | 3 (2.7%) *,** | <0.001 |
Inaccurate | 82 (41.2%) * | 77 (48.7%) ** | 26 (23.2%) *,** | <0.001 |
Fabricated | 32 (16.1%) * | 21 (13.3%) ** | 71 (63.4%) *,** | <0.001 |
Incomplete | 9 (4.5%) | 13 (8.2%) | 12 (10.7%) | 0.11 |
* Significant difference between ChatGPT-3.5 and Bard (p < 0.05). ** Significant difference between Bing Chat and Bard (p < 0.05).
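
The arithmetic behind Table 1 can be rechecked with a minimal sketch such as the one below. It rests on an assumption that the table does not state: that each row's p-value comes from a Pearson chi-square test on the 2 × 3 table of that category versus all other categories across the three chatbots. The helper names (`percentages`, `chi_square_p`) are illustrative only, and the pairwise comparisons marked with asterisks are not reproduced.

```python
import math

# Counts copied from Table 1; column order is ChatGPT-3.5, Bing Chat, Bard.
totals = [199, 158, 112]
counts = {
    "Accurate": [76, 47, 3],
    "Inaccurate": [82, 77, 26],
    "Fabricated": [32, 21, 71],
    "Incomplete": [9, 13, 12],
}


def percentages(row, totals):
    """Share of each chatbot's items falling in this category, in percent."""
    return [round(100 * c / n, 1) for c, n in zip(row, totals)]


def chi_square_p(row, totals):
    """Pearson chi-square on the 2 x 3 table (this category vs. all others).

    ASSUMPTION: the table does not name the test that was used.
    """
    grand = sum(totals)
    in_category = sum(row)
    chi2 = 0.0
    for c, n in zip(row, totals):
        # Two cells per chatbot: in the category, and not in the category.
        for observed, margin in ((c, in_category), (n - c, grand - in_category)):
            expected = n * margin / grand
            chi2 += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom, the chi-square survival function is exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)


p_values = {name: chi_square_p(row, totals)[1] for name, row in counts.items()}

for name, row in counts.items():
    print(name, percentages(row, totals), f"p = {p_values[name]:.3g}")
```

Under that assumption, the p-values recomputed this way agree with the p-Value column. The recomputed percentages may differ from the printed ones in the last digit because of rounding.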