Table 3.
Weighted percentage agreement (WPA) point estimates.
| Criterion | ChatGPT, WPA (95% CI) | Claude 2, WPA (95% CI) | Bard, WPA (95% CI) |
| --- | --- | --- | --- |
| Accuracy | 80.26 (67.61-92.92) | 86.84 (76.09-97.59) | 71.05 (56.63-85.47) |
| Relevance | 76.32 (62.80-89.83) | 97.37 (92.28-102.46) | 71.05 (56.63-85.47) |
| Clarity | 72.37 (58.15-86.59) | 94.74 (87.64-101.84) | 60.53 (44.98-76.07) |
| Emotional | 68.42 (53.64-83.20) | 77.63 (64.38-90.88) | 67.11 (52.17-82.04) |
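
Two of the upper confidence limits in the table exceed 100%, which suggests the intervals were computed with a normal-approximation (Wald) formula that is not bounded to the 0-100 range. The sketch below is an illustration under assumptions, not the authors' stated method: the helper `wald_ci` is hypothetical, and the sample size `n = 38` is not given in the table but was inferred by back-calculating from the reported interval widths.

```python
import math


def wald_ci(wpa_percent, n, z=1.96):
    """Normal-approximation (Wald) interval for a percentage.

    wpa_percent: point estimate on the 0-100 scale.
    n: number of rated items (ASSUMED; not stated in the table).
    z: critical value, 1.96 for a 95% interval.
    Returns (lower, upper) on the 0-100 scale, with no clipping to [0, 100].
    """
    p = wpa_percent / 100.0
    se = math.sqrt(p * (1.0 - p) / n)
    return 100.0 * (p - z * se), 100.0 * (p + z * se)


# n = 38 is an inferred value, chosen because it matches the reported widths.
ASSUMED_N = 38

for label, wpa in [
    ("ChatGPT accuracy", 80.26),
    ("Claude 2 relevance", 97.37),
    ("Bard clarity", 60.53),
]:
    lower, upper = wald_ci(wpa, ASSUMED_N)
    print(f"{label}: {wpa:.2f} ({lower:.2f}-{upper:.2f})")
```

Under these assumptions, the computed limits agree with the tabulated ones to within rounding, including the upper limit above 100% for Claude 2 relevance. An interval that respects the 0-100 bound (for example, a Wilson score interval) would give different limits from those reported.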