JAMA Otolaryngol Head Neck Surg. 2023 Apr 27;149(6):556–558. doi: 10.1001/jamaoto.2023.0704

Comparison Between ChatGPT and Google Search as Sources of Postoperative Patient Instructions

Noel F. Ayoub, Yu-Jin Lee, David Grimm, Karthik Balakrishnan
PMCID: PMC10141286  PMID: 37103921

Abstract

This qualitative study rates the understandability, actionability, and procedure-specific content of postoperative instructions generated by ChatGPT, Google Search, and Stanford University.


ChatGPT (generative pretrained transformer), an artificial intelligence–powered language model chatbot, has been described as an innovative resource for many industries, including health care.1 Lower health literacy and limited understanding of postoperative instructions have been associated with worse outcomes.2,3 While ChatGPT cannot currently supplant a human clinician, it can serve as a medical knowledge source. This qualitative study assessed the value of ChatGPT in augmenting patient knowledge and generating postoperative instructions for use in populations with low educational or health literacy levels.

Methods

We analyzed postoperative patient instructions for 8 common pediatric otolaryngologic procedures: tympanostomy tube placement, tonsillectomy and adenoidectomy, inferior turbinate reduction, tympanoplasty, cochlear implant, neck mass resection, microdirect laryngoscopy and bronchoscopy, and tongue-tie release. The Stanford University Institutional Review Board deemed this study exempt from review and waived the informed consent requirement given the study design. We followed the SRQR reporting guideline.

Postoperative instructions were obtained from ChatGPT, Google Search, and Stanford University (hereafter, institution). This prompt was entered into ChatGPT: "Please provide postoperative instructions for the family of a child who just underwent a [procedure]. Provide them at a 5th grade reading level." Similarly, this query was entered into Google Search: "My child just underwent [procedure]. What do I need to know and watch out for?" The first nonsponsored Google Search results were used for analysis. Results were extracted and blinded. To enable adequate blinding, we standardized all fonts and removed audiovisuals (eg, pictures). Two of us (N.F.A., Y.-J.L.) scored the instructions.
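
The study queried the ChatGPT web interface directly. For readers who want to script the same prompting step, a minimal sketch in R is shown below; it assumes the OpenAI chat completions API and the httr package, and the model name and response handling are illustrative assumptions rather than part of the study protocol.

```r
library(httr)

procedure <- "tympanostomy tube placement"
prompt <- paste0(
  "Please provide postoperative instructions for the family of a child ",
  "who just underwent a ", procedure,
  ". Provide them at a 5th grade reading level."
)

# Hypothetical API call; the study used the ChatGPT web interface, and
# the model name here is an assumption.
resp <- POST(
  "https://api.openai.com/v1/chat/completions",
  add_headers(Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))),
  body = list(
    model = "gpt-3.5-turbo",
    messages = list(list(role = "user", content = prompt))
  ),
  encode = "json"
)

# Extract the generated instructions from the parsed JSON response
instructions <- content(resp)$choices[[1]]$message$content
cat(instructions)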

The primary outcome was the Patient Education Materials Assessment Tool–printable (PEMAT-P)4 score, which assessed the understandability and actionability of instructions for patients of different backgrounds and health literacy levels. As a secondary outcome, instructions were scored on whether they addressed procedure-specific items. We a priori generated a list of 4 items specific to each procedure that were deemed important for each instruction to mention; see the Table 1 footnote for these items.
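
For reference, PEMAT-P domain scores are computed as the percentage of applicable items rated "agree" (1) rather than "disagree" (0), with nonapplicable items excluded. A minimal sketch in R, using hypothetical item ratings (the actual per-item ratings are not reported in this letter):

```r
# PEMAT-P scoring: each item is rated 1 (agree), 0 (disagree), or NA (not
# applicable); the domain score is the share of applicable items rated agree.
pemat_score <- function(ratings) {
  applicable <- ratings[!is.na(ratings)]
  round(100 * sum(applicable) / length(applicable))
}

# Hypothetical item ratings for one set of instructions
understandability_items <- c(1, 1, 0, 1, 1, NA, 1, 1, 1, 0, 1)
actionability_items     <- c(1, 0, 1, 1, NA)

pemat_score(understandability_items)  # 80
pemat_score(actionability_items)      # 75
```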

Table 1. Understandability, Actionability, and Procedure-Specific Scores for Each Procedure.

Procedure and instructions source           PEMAT-P understandability score, %   PEMAT-P actionability score, %   Procedure-specific items score, %a

Tympanostomy tube placement
  Institutionb                              91                                   80                               100
  Google Search                             82                                   100                              75
  ChatGPT                                   82                                   80                               100
Tonsillectomy and adenoidectomy
  Institutionb                              91                                   80                               100
  Google Search                             82                                   100                              100
  ChatGPT                                   82                                   80                               100
Inferior turbinate reduction
  Institutionb                              91                                   100                              100
  Google Search                             82                                   80                               75
  ChatGPT                                   73                                   80                               100
Tympanoplasty
  Institutionb                              91                                   100                              100
  Google Search                             82                                   100                              100
  ChatGPT                                   82                                   80                               100
Cochlear implant
  Institutionb                              91                                   100                              100
  Google Search                             82                                   40                               0
  ChatGPT                                   82                                   20                               100
Neck mass resection
  Institutionb                              91                                   80                               75
  Google Search                             82                                   80                               100
  ChatGPT                                   82                                   80                               75
Microdirect laryngoscopy and bronchoscopy
  Institutionb                              91                                   80                               100
  Google Search                             82                                   80                               75
  ChatGPT                                   82                                   80                               100
Tongue-tie release
  Institutionb                              91                                   100                              100
  Google Search                             73                                   80                               50
  ChatGPT                                   82                                   80                               100

Abbreviation: PEMAT-P, Patient Education Materials Assessment Tool–printable.

a Reviewers analyzed whether each instruction discussed 4 items specific to each procedure. Tympanostomy tube placement: (1) follow-up/tube check, (2) otorrhea, (3) ear drops, (4) when to call a clinician. Tonsillectomy and adenoidectomy: (1) pain management, (2) what to do if there is bleeding, (3) oral hydration, (4) when to call a clinician. Inferior turbinate reduction: (1) what to do if there is bleeding, (2) nasal sprays, (3) pain management, (4) when to call a clinician. Tympanoplasty: (1) dry ear precautions, (2) ear drops, (3) pain management, (4) when to call a clinician. Cochlear implant: (1) what to do if there is fever, (2) what to do if there is swelling, (3) pain management, (4) wound care. Neck mass resection: (1) pain management, (2) difficulty breathing/swallowing, (3) swelling, (4) when to call a clinician. Microdirect laryngoscopy and bronchoscopy: (1) pain management, (2) difficulty breathing, (3) difficulty swallowing, (4) when to call a clinician. Tongue-tie release: (1) pain, (2) difficulty eating, (3) bleeding, (4) when to call a clinician.

b Standardized postoperative instructions from Stanford University School of Medicine.

Scores were compared using 1-way analysis of variance and Kruskal-Wallis tests, with η² (90% CI) as the appropriate effect size.5 Analysis was performed on February 6, 2023, using R, version 4 (R Core Team).
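
As an illustration, the comparison for the understandability domain could be run as follows, using the scores from Table 1. This is a sketch only: the letter does not name the package used for η² and its 90% CI, so the effectsize package call below is an assumption.

```r
library(effectsize)

# PEMAT-P understandability scores from Table 1, one value per procedure
scores <- data.frame(
  source = factor(rep(c("Institution", "Google Search", "ChatGPT"), each = 8)),
  understandability = c(
    91, 91, 91, 91, 91, 91, 91, 91,  # institution
    82, 82, 82, 82, 82, 82, 82, 73,  # Google Search
    82, 82, 73, 82, 82, 82, 82, 82   # ChatGPT
  )
)

# 1-way analysis of variance with eta-squared (90% CI) as the effect size
fit <- aov(understandability ~ source, data = scores)
eta_squared(fit, ci = 0.90)

# Kruskal-Wallis test as the nonparametric comparison
kruskal.test(understandability ~ source, data = scores)
```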

Results

Overall, understandability scores ranged from 73% to 91%; actionability scores, 20% to 100%; and procedure-specific items, 0% to 100% (Table 1). ChatGPT-generated instructions were scored from 73% to 82% for understandability, 20% to 80% for actionability, and 75% to 100% for procedure-specific items.

Institution-generated instructions consistently had the highest scores (Table 2). Understandability scores were highest for institution (91%) vs ChatGPT (81%) and Google Search (81%) instructions (η², 0.86; 90% CI, 0.67-1.00). Actionability scores were lowest for ChatGPT (73%), intermediate for Google Search (83%), and highest for institution (92%) instructions (η², 0.22; 90% CI, 0.04-0.55). For procedure-specific items, ChatGPT (97%) and institution (97%) instructions had the highest scores and Google Search had the lowest (72%) (η², 0.23; 90% CI, 0-0.64).

Table 2. Comparison of ChatGPT, Google Search, and Institution Instructions.

                             Scores, %
Measure                      ChatGPT   Google Search   Institutiona   η² (90% CI)
PEMAT-P total                78        81              91             0.52 (0.16-0.68)
PEMAT-P understandability    81        81              91             0.86 (0.67-1.00)
PEMAT-P actionability        73        83              92             0.22 (0.04-0.55)
Procedure-specific items     97        72              97             0.23 (0-0.64)

Abbreviation: PEMAT-P, Patient Education Materials Assessment Tool–printable.

a Standardized postoperative instructions from Stanford University School of Medicine.

Discussion

Findings suggest that ChatGPT provides instructions that are helpful for patients at a fifth-grade reading level and across health literacy levels. However, ChatGPT-generated instructions scored lower in actionability than both Google Search and institution instructions and lower in understandability than institution instructions, although they matched institution instructions (and exceeded Google Search) on procedure-specific content. Despite these findings, ChatGPT may be beneficial for patients and clinicians, especially when alternative resources are limited.

Online search engines are common sources of medical information for the public: 7% of Google searches are health-related.6 However, ChatGPT has advantages over search engines: it is free, can be tailored to different literacy levels, and provides succinct information. ChatGPT gives direct answers that are often well written, detailed, and presented in an if-then format, offering patients immediate information while they wait to reach a clinician.

Study limitations include the small number of procedures and resources analyzed and the restriction of the analysis to English. ChatGPT limitations include a lack of citations; users' inability to confirm the accuracy of the information or to explore topics further; and a knowledge base ending in 2021, which excludes the latest data, events, and practice.

Supplement.

Data Sharing Statement

