This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

medRxiv
[Preprint]. 2023 Jan 28:2023.01.27.23285115. [Version 1] doi: 10.1101/2023.01.27.23285115

Analysis of large-language model versus human performance for genetics questions

Dat Duong, Benjamin D Solomon
PMCID: PMC9928145  PMID: 36789422

Abstract

Large-language models such as ChatGPT have recently received a great deal of attention. To assess ChatGPT in the field of genetics, we compared its performance with that of human respondents in answering genetics questions (involving 13,636 responses) that had been posted on social media platforms starting in 2021. Overall, ChatGPT did not perform significantly differently from human respondents. However, it performed significantly better on memorization-type questions than on critical-thinking questions, frequently gave different answers when asked the same question multiple times, and provided plausible explanations for both correct and incorrect answers.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
