2023 Jul 12;620(7972):172–180. doi: 10.1038/s41586-023-06291-2

Extended Data Table 2.

Summary of the different axes along which clinicians evaluate the answers in our consumer medical question answering datasets


These axes are: agreement with scientific consensus; possibility and likelihood of harm; evidence of comprehension, reasoning and retrieval ability; presence of inappropriate, incorrect or missing content; and possibility of bias in the answer. A panel of clinicians evaluates the quality of model-generated and human-generated answers along these axes.