Annals of Family Medicine. 2023 Nov;21(Suppl 3):5153. doi: 10.1370/afm.22.s1.5153

The bot will answer you now: Using AI to assist patient-physician communication and implications for physician inbox workload

Amanda Walker, Sally Baxter, Ming Tai-Seale, Amy Sitapati, Christopher Longhurst
PMCID: PMC10983471

Abstract

Context

This study explores the potential application of artificial intelligence (AI) to facilitate communication in electronic health record (EHR) systems, with the aim of reducing clinician burden and the risk of burnout. We took real EHR patient messages previously extracted for a study of physician burnout, generated responses to them using ChatGPT, and qualitatively compared those responses to the actual physician responses.

Objectives

To assess the potential of AI to reduce clinician burnout caused by electronic messaging by generating responses to patient messages with ChatGPT, and to evaluate the AI-generated responses on their relational connection, informational content, recommendations for next steps, and the extent of editing required before they can be used.

Study Design and Analysis

Qualitative analysis of AI-generated responses to patient messages compared to actual physician responses.

Dataset

EHR messages

Population studied

EHR patient messages

Intervention

Previously extracted real EHR patient messages were used as prompts to generate responses using ChatGPT. The generated responses were compared qualitatively with the actual physician responses across different categories of patient messages, evaluating relational connection, informational content, follow-up recommendations, and the amount of editing needed.

Outcome Measures

Qualitative assessments of ChatGPT-generated responses to patient messages compared to actual physician responses.

Results

The study found that AI-generated responses lacked relational connection, appearing mechanical and impersonal, while physicians’ responses varied widely, ranging from personal and empathic to instrumental and prescriptive. The informational content of AI-generated responses was general, whereas that of physicians’ responses was more specific. Additionally, AI-generated responses were on average three times longer than physicians’ responses and required substantial editing. Recommendations from AI were typically generic, while physicians provided recommendations tailored to the patient’s specific needs.

Conclusions

While some users have started using generative AI language models in healthcare communication, this study demonstrates significant challenges to making them useful to clinicians, and more efforts are needed to harness the potential of AI to support human critical thinking, judgment, and creativity in healthcare.


Articles from Annals of Family Medicine are provided here courtesy of Annals of Family Medicine, Inc.
