Author manuscript; available in PMC: 2025 Oct 3.
Published in final edited form as: JCO Clin Cancer Inform. 2025 Oct 1;9:e2500102. doi: 10.1200/CCI-25-00102

Unsupervised Large Language Models to Identify Topics in Cancer Center Patient Portal Messages

Ji Hyun Chang 1,2, Amir Ashraf-Ganjouei 1, Isabel Friesner 1, Ryzen Benson 1, Travis Zack 1, Sumi Sinha 1,3, Jason Chan 1,3, Steve Braunstein 1,3, Amy Lin 1, Lisa Singer 1,3, Julian C Hong 1,3,4
PMCID: PMC12490804  NIHMSID: NIHMS2107050  PMID: 41032743

Abstract

Background:

The increasing use of patient portal messages has enhanced patient-provider communication. However, the high volume of these messages has also contributed to physician burnout.

Methods:

Patient-generated portal messages sent to a single cancer center from 2011 to 2023 were extracted. BERTopic, a natural language processing topic modeling technique based on a large language model, was optimized. For further categorization, the topic words were labeled using GPT-4 and then reviewed by two oncologists. Uniform Manifold Approximation and Projection was employed for dimensionality reduction and topic visualization. Message volume changes over time were assessed using a Student’s t-test.

Results:

A total of 2,280,851 messages were analyzed. The monthly average number of messages increased from 2,071 in 2012 to 43,430 in 2022 (p<0.001). There was a significant rise in message volume following the COVID-19 pandemic, with a posterior probability of a causal effect of 96.4% (p=0.04). Scheduling-related messages were the most frequent across departments, while symptoms and health concerns were the second or third most common topic. Topics on prescriptions and medications were more common in medical oncology and surgical oncology than in radiation oncology and gynecological oncology. Despite concurrent institutional changes in self-scheduling systems, scheduling-related messages did not decrease over time.

Conclusions:

The substantial increase in patient portal messages, particularly scheduling-related inquiries, underscores the need for streamlined communication to reduce the burden on healthcare providers. These findings highlight the need for strategies to manage message volume and mitigate physician burnout, laying the groundwork for future artificial intelligence-driven triage systems to improve message management and patient care.

Keywords: Patient portal message, Physician burnout, Natural language processing, Large language model

Introduction

The use of patient portal messages has been rapidly increasing.1 Patient portal messages enhance patient-provider communication2 and relationships with patients3, and they enable patients to remotely report significant symptom changes or adverse events so that medical providers can respond promptly. The impact is most significant for patients with limited physical access to healthcare facilities due to their condition or distance from the hospital.4 Portal messages can also facilitate a multidisciplinary approach and prevent treatment discontinuation by allowing various specialists to access patient-reported information, enabling timely responses to patient needs and reducing the risk of interrupted treatment.5 The COVID-19 pandemic and the accompanying increase in remote clinical care further accentuated their importance, significantly accelerating growth.6

However, perceived patient expectations of a response to portal messages have created new stressors3, and the high volume of patient messages is associated with clinician burnout.7 This not only adversely affects the well-being of physicians themselves but also has negative consequences for patients. Physician burnout is well-documented to be associated with medical errors8–10 and lower patient satisfaction.11

To achieve a balance and optimize the use of patient portal messages, it is imperative to identify message subject matter and devise a strategy for the most efficient management. Topic modeling is a powerful natural language processing (NLP) method for text mining, clustering, and classification. It can automate analysis across the vast volume of patient portal messages and reveal hidden themes by identifying patterns in word choice and linking documents with similar patterns.12 In particular, large language models (LLMs) such as Bidirectional Encoder Representations from Transformers (BERT) and its derivatives, such as sentence transformers, can be applied to enhance these analyses.13

In the current study, we aimed to identify the most frequent topics from patient portal messages to cancer center medical providers with NLP-based topic modeling, with the goal of providing foundational data to inform the development of automated message triage systems.

Methods

Data collection

This retrospective study adhered to the principles of the Declaration of Helsinki and was approved by the institutional review board (IRB) of the University of California San Francisco (UCSF) (IRB number: 19–29686). The need for informed consent was waived due to the retrospective nature of the study, as approved by the ethics board. Patient portal messages generated from October 2011 to April 2023 at the UCSF Cancer Center were retrieved. Basic information such as creation date and time, recipient identifier, and department was collected. This study adhered to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines to ensure comprehensive and transparent reporting of the research methodology and findings.

Topic modeling

Messages were first pre-processed by removing message headers, quoted bounce-back messages from delivery failures, and stopwords. In addition to the built-in “english” stop-word list (a frozenset) in scikit-learn’s CountVectorizer, we removed custom stopwords such as “thank you”, “thanks”, “thank”, “thankyou”, “perfect”, “great”, “dr”, “hi”, “ok”, “sounds good”, “okay”, “yes”, and “thx”, as well as frequently mentioned names of medical providers. To create embeddings, the sentence transformer all-MiniLM-L6-v2 was used.14
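As a rough illustration of the custom stopword step, the sketch below strips the courtesy terms listed above using only the Python standard library. The function name `strip_custom_stopwords` and the example message are hypothetical; provider names and the built-in English stop-word list are omitted, and this is not the study's actual pipeline.

```python
import re

# Custom stopwords quoted in the text. Provider names and the built-in
# English stop-word list are intentionally left out of this sketch.
CUSTOM_STOPWORDS = [
    "thank you", "thanks", "thank", "thankyou", "perfect", "great",
    "dr", "hi", "ok", "sounds good", "okay", "yes", "thx",
]

# Longest entries first so that "thank you" is tried before "thank".
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(
        re.escape(term)
        for term in sorted(CUSTOM_STOPWORDS, key=len, reverse=True)
    )
    + r")\b",
    flags=re.IGNORECASE,
)


def strip_custom_stopwords(message: str) -> str:
    """Remove the custom stopwords and collapse leftover whitespace."""
    without_terms = _PATTERN.sub(" ", message)
    return re.sub(r"\s+", " ", without_terms).strip()


# Hypothetical message, for illustration only.
example = "Hi Dr Lee, thank you! Can I move my scan to Friday? Thanks"
cleaned = strip_custom_stopwords(example)
```

Word boundaries keep short entries such as “ok” or “hi” from matching inside longer words like “okay” or “this”.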

For department-wise analysis, a total of 40 departments in the cancer center were condensed into 6 categories: medical oncology, surgical oncology, supportive care, gynecology oncology, radiation oncology and others. Detailed classification information can be found in Supplemental Table 1. Topic modeling was performed in the medical oncology, surgical oncology, gynecology oncology, and radiation oncology categories because ‘supportive’ or ‘others’ could include a vast variety of services (e.g., psycho-oncology, nutrition, exercise) and may not directly pertain to a specific oncologic treatment, warranting a separate, dedicated analysis. Considering the characteristics and diversity of messages from each department, topic modeling was conducted on a department-by-department basis.

BERTopic was used for topic modeling. BERTopic is a topic modeling technique built on embeddings from BERT-family models; BERT itself is a language representation model designed to be pre-trained and then fine-tuned.15 BERTopic clusters transformer-based document embeddings and applies class-based TF-IDF (c-TF-IDF) to produce interpretable topic representations.16
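To make the c-TF-IDF step concrete, the sketch below implements the weighting as we understand it from the BERTopic paper (reference 16): the weight of term t in class c is tf(t, c) × log(1 + A / tf(t)), where A is the average number of words per class and tf(t) is the frequency of t across all classes. The function name, whitespace tokenization, and toy clusters are assumptions for illustration; the library's own implementation may normalize differently.

```python
import math
from collections import Counter


def c_tf_idf(class_docs: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Class-based TF-IDF: each cluster is treated as one long document."""
    # Term frequency per class (all documents of a cluster concatenated).
    tf = {
        label: Counter(" ".join(docs).lower().split())
        for label, docs in class_docs.items()
    }
    # Term frequency across all classes.
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # A: average number of words per class.
    avg_words = sum(total.values()) / len(tf)
    return {
        label: {
            term: freq * math.log(1 + avg_words / total[term])
            for term, freq in counts.items()
        }
        for label, counts in tf.items()
    }


# Hypothetical clusters, for illustration only.
clusters = {
    "scheduling": ["reschedule my appointment", "appointment time change"],
    "medication": ["refill my prescription", "prescription refill request"],
}
weights = c_tf_idf(clusters)
top_scheduling_term = max(weights["scheduling"], key=weights["scheduling"].get)
```

Terms that are frequent within one cluster but rare elsewhere receive the highest weights, which is what makes the resulting topic words readable.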

Topic number

The determination of the final number of topics was a multi-step process combining quantitative metrics with qualitative assessment. We first evaluated models with various topic counts (e.g., 50, 100, 150, 200) using topic coherence (NPMI) and diversity scores (OCTIS v1.13.1).17 As these metrics often present a trade-off, we then performed a qualitative review of the resulting topics at each threshold. This involved examining representative documents and topic words to identify the point at which increasing the topic count began to produce overly granular or redundant themes (e.g., splitting a single “appointment request” topic into multiple, nearly identical sub-topics). The determined topics were distilled to their four most representative topic words and 30 representative documents. To label the categories, the topic words were anonymized and initially labeled using GPT-4. The grouping process was further reviewed by two board-certified oncologists.
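The two quantitative metrics can be sketched in a few lines. This is a simplified illustration, not the OCTIS implementation: topic diversity is taken as the share of unique words among the top words of all topics, and NPMI is computed from document-level co-occurrence, which is an assumption (library implementations may count co-occurrence differently). The function names and toy data are hypothetical.

```python
import math
from itertools import combinations


def topic_diversity(topics: list[list[str]], top_k: int = 4) -> float:
    """Fraction of unique words among the top_k words of every topic."""
    words = [word for topic in topics for word in topic[:top_k]]
    return len(set(words)) / len(words)


def npmi_coherence(topic_words: list[str], documents: list[str]) -> float:
    """Mean NPMI over word pairs, using document-level co-occurrence."""
    doc_sets = [set(doc.lower().split()) for doc in documents]
    n_docs = len(doc_sets)
    scores = []
    for a, b in combinations(topic_words, 2):
        p_a = sum(a in d for d in doc_sets) / n_docs
        p_b = sum(b in d for d in doc_sets) / n_docs
        p_ab = sum(a in d and b in d for d in doc_sets) / n_docs
        if p_ab == 0:
            scores.append(-1.0)  # the pair never co-occurs
        elif p_ab == 1:
            scores.append(1.0)   # avoids dividing by -log(1) = 0
        else:
            pmi = math.log(p_ab / (p_a * p_b))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores)


# Hypothetical documents, for illustration only.
docs = [
    "appointment reschedule", "appointment reschedule",
    "refill prescription", "refill prescription",
]
coherent = npmi_coherence(["appointment", "reschedule"], docs)
incoherent = npmi_coherence(["appointment", "refill"], docs)
diversity = topic_diversity([["a", "b"], ["b", "c"]], top_k=2)
```

Higher topic counts tend to lower diversity as near-duplicate topics share words, which is the trade-off that motivated the qualitative review described above.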

Message volume analysis

The causal impact of COVID-19 on message volume was analyzed with a Bayesian structural time-series model (tfcausalimpact v0.0.13).18 Changes in message volume over time were evaluated using a Student’s t-test to determine statistical significance.
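For readers unfamiliar with the test, the following minimal sketch computes a two-sample Student's t statistic with pooled variance. The text does not specify which samples were compared, so the monthly counts below are hypothetical placeholders, and the helper name `student_t` is ours.

```python
import math
from statistics import mean, variance


def student_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, int]:
    """Two-sample Student's t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled sample variance across both groups.
    pooled = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t_stat = (mean(sample_b) - mean(sample_a)) / standard_error
    return t_stat, n_a + n_b - 2


# Hypothetical monthly message counts, not the study data.
early_months = [1900, 2050, 2100, 2200]
late_months = [41000, 43000, 44500, 45200]
t_stat, dof = student_t(early_months, late_months)
```

The p-value would then be read from a t distribution with `dof` degrees of freedom, for example with a statistics library.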

Results

From October 2011 to April 2023, 23,977,703 message lines across 2,280,851 messages were received at our cancer center. The monthly average number of messages in 2012 was 2,071, surging to 43,430 messages per month in 2022 (p<0.001) (Supplemental Figure 1-A). An analysis by department revealed significant variation in message volume, as visualized in Supplemental Figure 1-B. Medical oncology and surgical oncology consistently accounted for the vast majority of messages. While all departments experienced substantial growth over the study period, these two departments remained the dominant recipients. Each physician received a median of 7.93 messages per month (interquartile range, 2.39–29.17). In the causal impact analysis, a notable increase in the rate of growth was observed after California issued its stay-at-home order on March 19, 2020, following the COVID-19 outbreak. The analysis revealed a posterior probability of a causal effect of 96.4% (p=0.04).
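The idea behind the causal impact analysis, comparing observed post-intervention volume with a counterfactual projected from the pre-intervention period, can be illustrated with a deliberately simplified stand-in. The sketch below fits an ordinary least-squares linear trend to the pre-period and reports the excess over that trend; it is not the Bayesian structural time-series model used in the study, and the function names and numbers are hypothetical.

```python
def linear_fit(values: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for values at x = 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def excess_over_trend(pre: list[float], post: list[float]) -> list[float]:
    """Observed post-period values minus the projected pre-period trend."""
    slope, intercept = linear_fit(pre)
    start = len(pre)
    projected = [intercept + slope * (start + i) for i in range(len(post))]
    return [observed - expected for observed, expected in zip(post, projected)]


# Hypothetical monthly volumes before and after an intervention date.
pre_period = [10.0, 20.0, 30.0, 40.0]
post_period = [60.0, 80.0]
excess = excess_over_trend(pre_period, post_period)
```

A structural time-series model additionally quantifies uncertainty around the counterfactual, which is what yields a posterior probability of a causal effect.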

After conducting a comparative analysis of NPMI, topic diversity scores, and multiple trial runs, the number of topics was determined as follows (Supplemental Data 1), factoring in the relative volume and diversity of messages in each specialty: 100 for medical oncology, 100 for surgical oncology, 50 for radiation oncology, and 50 for gynecology oncology. A bar chart representing topics and topic words in order of top frequency in the medical oncology category is depicted in Figure 1.

Figure 1.

Frequency Distribution of the Top 16 Topics and Topic Words in the Medical Oncology Category

These topics were further grouped into categories: the anonymized topic words were first labeled using GPT-4, and the grouping was then reviewed by two oncologists (JHC and LS). This yielded 34 categories for medical oncology, 26 for surgical oncology, 19 for radiation oncology, and 21 for gynecology oncology. The agreement between the GPT-4 categorization and the evaluators was 0.48 for medical oncology, 0.55 for surgical oncology, 0.70 for radiation oncology, and 0.82 for gynecology oncology. In the messages sent to all department categories, scheduling-related topics were the most frequent. The categorization of topics for the medical oncology and surgical oncology departments is presented in Table 1 and Table 2, while the categorization for the radiation oncology and gynecology oncology departments is detailed in Supplemental Table 2. In medical oncology, communications primarily focused on appointments and scheduling (38.0%), prescriptions and medications (14.4%), and symptoms and health concerns (12.0%). Similarly, surgical oncology messages centered on appointments and scheduling (40.2%), symptoms and health concerns (12.9%), and prescriptions and medications (10.4%). In radiation oncology, the main topics were scheduling and test results (48.9%), gratitude and well wishes (11.6%), and symptoms and health concerns (11.5%). Lastly, gynecology oncology communications centered on appointments and scheduling (29.7%), symptoms and health concerns (15.5%), and test orders and results (14.3%).
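The text does not state which agreement statistic was used. As a sketch under that caveat, the snippet below computes two common candidates, raw percent agreement and Cohen's kappa, for a pair of label lists. The labels are hypothetical, and neither function is claimed to be the study's method.

```python
from collections import Counter


def percent_agreement(labels_a: list[str], labels_b: list[str]) -> float:
    """Share of items on which the two raters assign the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Chance-corrected agreement; undefined if expected agreement is 1."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Expected agreement if both raters labeled independently.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four topics: model versus reviewer.
model_labels = ["scheduling", "scheduling", "medication", "medication"]
reviewer_labels = ["scheduling", "scheduling", "medication", "scheduling"]
agreement = percent_agreement(model_labels, reviewer_labels)
kappa = cohens_kappa(model_labels, reviewer_labels)
```

Kappa is lower than raw agreement whenever some matches would be expected by chance, so the two statistics should not be compared directly.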

Table 1.

Topic categories for the medical oncology department following topic modeling and categorization

| Medical Oncology | Number of Messages | Percentage (%) |
|---|---|---|
| Appointments and Scheduling | 468663 | 38.0 |
| Prescriptions and Medications | 177364 | 14.4 |
| Symptoms and Health Concerns | 147759 | 12.0 |
| Gratitude and Well Wishes | 130843 | 10.6 |
| Forms and Documents | 75550 | 6.1 |
| Simple Communication | 66446 | 5.4 |
| File Attachment | 26218 | 2.1 |
| Contact information | 20687 | 1.7 |
| Location | 17945 | 1.5 |
| Test Orders and Results | 17313 | 1.4 |
| Medical Records and Results | 18972 | 1.5 |
| Contacting on behalf of someone | 8653 | 0.7 |
| Asking for clarification | 4508 | 0.4 |
| Outliers | 4506 | 0.4 |
| Communications about “numbers” | 3874 | 0.3 |
| Equipment and Supplies | 3760 | 0.3 |
| Primary care | 3711 | 0.3 |
| Supportive Care for Symptoms | 3542 | 0.3 |
| Appointment delays or Transportation | 3499 | 0.3 |
| COVID/Vaccine | 3390 | 0.3 |
| Test Preparation | 3336 | 0.3 |
| Weight | 2824 | 0.2 |
| Carcinogen and Prognosis | 2605 | 0.2 |
| Insurance and Payment | 2569 | 0.2 |
| Health and Lifestyle Management | 2559 | 0.2 |
| Log in issue | 2494 | 0.2 |
| System messages | 1870 | 0.2 |
| Drain | 1806 | 0.1 |
| Non-English | 1578 | 0.1 |
| Misclassified due to initial letter grouping | 1490 | 0.1 |
| Procedure | 799 | 0.1 |
| Dental | 790 | 0.1 |
| Video meeting | 697 | 0.1 |
| Cord Blood | 550 | 0.0 |
| Total | 1233170 | 100.0 |

Table 2.

Topic categories for the surgical oncology department following topic modeling and categorization

| Surgical Oncology | Number of Messages | Percentage (%) |
|---|---|---|
| Appointments and Scheduling | 291541 | 40.2 |
| Symptoms and Health Concerns | 93298 | 12.9 |
| Prescriptions and Medications | 75285 | 10.4 |
| Gratitude and Well Wishes | 68121 | 9.4 |
| Forms and Documents | 36892 | 5.1 |
| Simple Communication | 49431 | 6.8 |
| Asking for clarification | 18054 | 2.5 |
| Location | 16369 | 2.3 |
| File Attachment | 15108 | 2.1 |
| Equipment and Supplies | 9710 | 1.3 |
| Contacting on behalf of someone | 8529 | 1.2 |
| Test Preparation | 7157 | 1.0 |
| Test Orders and Results | 6926 | 1.0 |
| Health and Lifestyle Management | 6911 | 1.0 |
| Nutrition | 4371 | 0.6 |
| Questions about contrasts | 3501 | 0.5 |
| Procedures | 2906 | 0.4 |
| Outlier | 2257 | 0.3 |
| Personal Information | 1588 | 0.2 |
| Non-English | 1269 | 0.2 |
| Insurance and Payment | 1207 | 0.2 |
| Medical Records and Results | 1102 | 0.2 |
| System messages | 1081 | 0.1 |
| Interpreter Request | 850 | 0.1 |
| Patient education | 616 | 0.1 |
| Log in Issue | 427 | 0.1 |
| Total | 724507 | 100.0 |

Discussion

Effective communication is crucial for all medical conditions, but it holds particular significance for cancer patients, who undergo intense longitudinal treatments that require frequent communication with multiple departments regarding treatments, symptoms, appointments, and diagnostic results. The advantages of patient portal messages may therefore be especially pronounced for cancer patients. At the same time, oncologists are at a higher risk of burnout due to their care of seriously ill patients, demanding workloads, and growing reliance on electronic medical record systems.19, 20 Cancer care involves not only decision-making about treatment but also regular visits and monitoring of test results over an extended period of time, as well as the provision of emotional support. To alleviate the burden on healthcare providers and focus their efforts, it is necessary to examine how resources are being allocated. Therefore, in the current study, we conducted a topic analysis of patient portal messages with a specific focus on cancer care. The results revealed a large number of messages regarding scheduling, highlighting a persistent inefficiency in communication and laying the groundwork for developing future automatic triage systems.

To contextualize these findings, our institution uses a hybrid messaging system where messages are triaged by care teams or sent directly to physicians. This dual system creates a significant burden from both complex escalated cases and high-volume unfiltered inquiries. The prevalence of administrative topics we found suggests an automated routing system could manage many of these messages, thereby alleviating the workload for the entire care team.

We used BERTopic, a topic modeling tool that is widely employed, including in medical research. BERTopic is one of several prominent topic modeling approaches, alongside Top2Vec, latent Dirichlet allocation (LDA), and non-negative matrix factorization (NMF). BERT, an LLM, serves as the basis for BERTopic’s embedding, enabling it to capture the contextual meanings of words more effectively, which is a key differentiator from other models. Additionally, applying c-TF-IDF to the clustered documents yields interpretable topic representations, further distinguishing it from other models. This allows BERTopic to handle dynamic updates better and provide consistent topic coherence, making it particularly suitable for large-scale text datasets.16, 21 Many medical-related studies utilizing topic modeling, including BERTopic, have predominantly focused on data from social media platforms such as Twitter22–24 and Reddit25, 26, providing insights into the concerns and experiences patients share within the community. In contrast, this study offers a different perspective by using BERTopic to explore the topics and characteristics of patients’ inquiries directed towards medical providers.

Furthermore, our findings demonstrated a significant increase in the number of messages, particularly following the outbreak of COVID-19. While messages offer convenience, the widespread accessibility of patient portal messages can lead to significant work burden and burnout among healthcare providers.27 As one solution, some hospitals have implemented billing systems to reduce the absolute volume of messages. However, even after implementation, the billing rate remains low, with only 1.4% of all messages being billed at UCSF since November 2020.6 Peppercorn argued that billing has a minimal financial impact on care sustainability and highlighted potential barriers to care if patients avoid messaging due to cost concerns.28 The author proposed better solutions, such as automating workflows to route specific inquiries to billed phone calls or virtual visits, utilizing templates for common questions, and specifying acceptable message topics. In the same context, our study examines the proportion of message topics suitable for templates or automation.

Recent studies have examined the use of LLMs to generate response drafts, and Epic has already implemented GPT-4 to draft communications with patients in the In Basket.29 One study reported that this approach led to a reduction in task load and emotional exhaustion.30 However, automation bias remains a significant concern. Chen et al. pointed out that while there were improvements in efficiency and interphysician agreement, there were also risks associated with LLM-generated responses, with some potentially leading to severe harm or even death, highlighting the need for continuous monitoring by physicians.31

Various attempts have been made to classify patient portal messages; however, the majority of them have employed manual analysis or supervised learning methods.32–35 While establishing predefined categories and classifying messages can be an effective approach, an unsupervised approach appears to be a more pragmatic choice from a research perspective, considering the substantial message volume and the diverse mix of senders and providers. The results demonstrated that across departments, scheduling messages were the most prevalent. The next most frequent topics were “Symptoms and Health Concerns” and “Prescriptions/Medications”. While the ranking differed slightly across departments, in medical oncology, surgical oncology, and radiation oncology, these three topics accounted for over 60% of all messages. While symptom-related inquiries align with the original purpose of patient portal messages, namely facilitating patient communication, scheduling and medication-related messages present opportunities for system optimization. By streamlining these processes, a significant portion of these inquiries could be resolved without human intervention, alleviating the increasing burden on healthcare providers. This optimization would not only reduce the sheer volume of messages but also streamline the delivery process, minimizing unnecessary steps and ultimately decreasing stress for medical professionals. UCSF’s scheduling practices have evolved asynchronously, prompting an investigation into whether this influenced temporal changes in topic frequency. Upon examining the topics over time, we found no difference in the frequency of scheduling-related topics. This suggests there are still factors hindering easy access for patients, highlighting the need for further simplification.
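The combined share of the three leading topics can be checked directly from the counts in Tables 1 and 2. The short calculation below does so for medical and surgical oncology; radiation oncology counts are reported only in the supplement and are therefore not included. The helper name `combined_share` is ours.

```python
# Counts for Appointments and Scheduling, Prescriptions and Medications,
# and Symptoms and Health Concerns, with department totals (Tables 1 and 2).
TOP_THREE = {
    "medical oncology": ([468663, 177364, 147759], 1233170),
    "surgical oncology": ([291541, 75285, 93298], 724507),
}


def combined_share(counts: list[int], total: int) -> float:
    """Percentage of all messages covered by the given topic counts."""
    return 100 * sum(counts) / total


shares = {
    department: round(combined_share(counts, total), 1)
    for department, (counts, total) in TOP_THREE.items()
}
```

Both departments come out at roughly 64% of messages, consistent with the statement that these three topics account for over 60%.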

Additionally, all departments received approximately 10% of messages expressing gratitude and well wishes, indicating a positive impact on building rapport between patients and medical providers. Positive feedback has been reported to reduce physician burnout.36 Therefore, although patient portal messages may increase workload, they may also offer the advantage of providing positive experiences.37

This study possesses several inherent limitations. Firstly, as a single-center study within an academic cancer center, the findings may not be generalizable to other healthcare settings with differing patient populations, workflows, and technology adoption, such as community hospitals or non-oncology specialties.38 Secondly, despite utilizing BERTopic and quantitative metrics, the process of determining the number of topics and interpretation inherently involves subjectivity, potentially introducing bias. Finally, given that the analysis considered entire patient messages, instances of multiple topics within a single message were observed. Consequently, our annotation process likely prioritized the most salient concept, which can oversimplify multifaceted messages. This approach, however, was consistent with our primary objective of first identifying the most frequent, dominant topics to build a foundation for an effective routing system.

While acknowledging its limitations, the current study comprehensively analyzed over two million patient messages accumulated over a decade, offering a rich and unprecedented dataset of patient-provider communication within a large academic cancer center using NLP. It is also important to acknowledge that this is the first study to investigate the topics present in patient messages sent to a cancer center. Because the study focuses on identifying broad topical categories, a crucial step towards effective message routing and triage, the level of granularity achieved in outlining these categories is adequately informative and appropriate for this stage of investigation. Future work will focus on assessing the accuracy of topic labels for automatic triage to ensure that messages are routed efficiently and correctly, minimizing the workload on healthcare providers while maintaining high-quality patient care, as similar machine learning approaches have already been shown to reduce rates of acute care visits.39

In conclusion, this study analyzed over two million patient portal messages from the UCSF Cancer Center, revealing significant insights into communication patterns between patients and healthcare providers. The findings highlighted a substantial increase in message volume, particularly during the COVID-19 pandemic, with scheduling-related inquiries being the most prevalent across departments. Despite advancements in self-scheduling systems, these messages persist, suggesting ongoing barriers to patient access. These insights underscore the need for continued efforts to streamline communication processes in order to reduce the burden on healthcare providers and improve patient access to care.

Supplementary Material

PV Data Supplement

Context summary:

Key objective:

How can unsupervised topic modeling using a large language model identify common themes in patient portal messages to guide strategies for improving communication efficiency and reducing clinician workload?

Knowledge generated:

The monthly message volume experienced a more than 20-fold increase between 2012 and 2022. Across all oncology specialties, communications regarding appointments and scheduling consistently represented the largest topic (over 30% of messages), followed by inquiries about symptoms and medications.

Relevance (written by Dr. Umit Topaloglu):

The study highlights that the massive increase in patient portal messages, especially for scheduling, is a major contributor to physician burnout. Therefore, developing efficient strategies, potentially including AI-driven triage systems, is clinically essential to manage this communication volume and improve provider well-being and patient care.

Acknowledgements:

This study was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA277782, Radiation Oncology Institute, ASCO Conquer Cancer Foundation, and UCSF Computational Cancer Award (Grant number not available). JCH is also supported by the ASTRO-PCF Career Development Award to End Prostate Cancer. Funders had no role in design and conduct of the study, nor the decision to prepare and submit the abstract.

Footnotes

This study was presented in part at the 65th American Society for Radiation Oncology (ASTRO) meeting in San Diego in 2023, and at the American Medical Informatics Association (AMIA) 2024 Informatics Summit in Boston.

Conflicts of interest

The authors declare that they have no competing interests.

Data Availability Statement

The data supporting the findings of this study contain sensitive personal information and, as such, cannot be shared publicly or made available upon request. Due to the confidentiality and privacy concerns associated with the data, we are unable to provide access to the dataset used for this research. We have adhered to all relevant ethical guidelines and institutional policies to ensure the protection of participants’ privacy.

References

  • 1. Cronin RM, Davis SE, Shenson JA, et al.: Growth of Secure Messaging Through a Patient Portal as a Form of Outpatient Interaction across Clinical Specialties. Appl Clin Inform 6:288–304, 2015
  • 2. Dendere R, Slade C, Burton-Jones A, et al.: Patient Portals Facilitating Engagement With Inpatient Electronic Medical Records: A Systematic Review. J Med Internet Res 21:e12779, 2019
  • 3. Lieu TA, Altschuler A, Weiner JZ, et al.: Primary Care Physicians’ Experiences With and Strategies for Managing Electronic Messages. JAMA Netw Open 2:e1918287, 2019
  • 4. Lent AB, Derksen D, Jacobs ET, et al.: Policy Recommendations for Improving Rural Cancer Services in the United States. JCO Oncol Pract 19:288–294, 2023
  • 5. Yin Z, Harrell M, Warner JL, et al.: The therapy is making me sick: how online portal communications between breast cancer patients and physicians indicate medication discontinuation. J Am Med Inform Assoc 25:1444–1451, 2018
  • 6. Holmgren AJ, Byron ME, Grouse CK, et al.: Association Between Billing Patient Portal Messages as e-Visits and Patient Messaging Volume [Internet]. JAMA, 2023 [cited 2023 Jan 10]. Available from: https://jamanetwork.com/journals/jama/fullarticle/2800370
  • 7. Hilliard RW, Haskell J, Gardner RL: Are specific elements of electronic health record use associated with clinician burnout more than others? J Am Med Inform Assoc 27:1401–1410, 2020
  • 8. Al-Ghunaim TA, Johnson J, Biyani CS, et al.: Surgeon burnout, impact on patient safety and professionalism: A systematic review and meta-analysis. Am J Surg 224:228–238, 2022
  • 9. Tawfik DS, Profit J, Morgenthaler TI, et al.: Physician Burnout, Well-being, and Work Unit Safety Grades in Relationship to Reported Medical Errors. Mayo Clin Proc 93:1571–1580, 2018
  • 10. Menon NK, Shanafelt TD, Sinsky CA, et al.: Association of Physician Burnout With Suicidal Ideation and Medical Errors. JAMA Netw Open 3:e2028780, 2020
  • 11. Halbesleben JRB, Rathert C: Linking physician burnout and patient outcomes: Exploring the dyadic relationship between physicians and patients. Health Care Manage Rev 33:29–39, 2008
  • 12. Barde BV, Bainwad AM: An overview of topic modeling methods and tools [Internet], in 2017 International Conference on Intelligent Computing and Control Systems (ICICCS). Madurai, IEEE, 2017, pp 745–750 [cited 2023 Nov 7]. Available from: http://ieeexplore.ieee.org/document/8250563/
  • 13. Benson R, Elia M, Hyams B, et al.: A Narrative Review on the Application of Large Language Models to Support Cancer Care and Research. Yearb Med Inform 33:090–098, 2024
  • 14. all-MiniLM-L6-v2 [Internet]. Available from: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
  • 15. Devlin J, Chang M-W, Lee K, et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [Internet], 2018 [cited 2023 Jan 3]. Available from: https://arxiv.org/abs/1810.04805
  • 16. Grootendorst M: BERTopic: Neural topic modeling with a class-based TF-IDF procedure [Internet], 2022 [cited 2023 Jan 3]. Available from: https://arxiv.org/abs/2203.05794
  • 17. Terragni S, Fersini E, Galuzzi BG, et al.: OCTIS: Comparing and Optimizing Topic models is Simple! [Internet], in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. Online, Association for Computational Linguistics, 2021, pp 263–270 [cited 2023 Oct 26]. Available from: https://aclanthology.org/2021.eacl-demos.31
  • 18. tfcausalimpact [Internet]. Available from: https://github.com/WillianFuks/tfcausalimpact
  • 19. Medisauskaite A, Kamau C: Prevalence of oncologists in distress: Systematic review and meta-analysis. Psychooncology 26:1732–1740, 2017
  • 20. Hlubocky FJ, Back AL, Shanafelt TD: Addressing Burnout in Oncology: Why Cancer Care Clinicians Are At Risk, What Individuals Can Do, and How Organizations Can Respond. Am Soc Clin Oncol Educ Book Am Soc Clin Oncol Annu Meet 35:271–279, 2016
  • 21. Egger R, Yu J: A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts. Front Sociol 7:886498, 2022
  • 22. Ng QX, Lee DYX, Yau CE, et al.: Public perception on “healthy ageing” in the past decade: An unsupervised machine learning of 63,809 Twitter posts. Heliyon 9:e13118, 2023
  • 23. Gabarron E, Dorronzoro E, Reichenpfader D, et al.: What Do Autistic People Discuss on Twitter? An Approach Using BERTopic Modelling. Stud Health Technol Inform 302:403–407, 2023
  • 24. Lindelöf G, Aledavood T, Keller B: Dynamics of the Negative Discourse Toward COVID-19 Vaccines: Topic Modeling Study and an Annotated Data Set of Twitter Posts. J Med Internet Res 25:e41319, 2023
  • 25. Williams CYK, Li RX, Luo MY, et al.: Exploring patient experiences and concerns in the online Cochlear implant community: A cross-sectional study and validation of automated topic modelling. Clin Otolaryngol Off J ENT-UK Off J Neth Soc Oto-Rhino-Laryngol Cervico-Facial Surg 48:442–450, 2023
  • 26. Yao LF, Ferawati K, Liew K, et al.: Disruptions in the Cystic Fibrosis Community’s Experiences and Concerns During the COVID-19 Pandemic: Topic Modeling and Time Series Analysis of Reddit Comments. J Med Internet Res 25:e45249, 2023
  • 27. Adler-Milstein J, Zhao W, Willard-Grace R, et al.: Electronic health records and burnout: Time spent on the electronic health record after hours and message volume associated with exhaustion but not with cynicism among primary care clinicians. J Am Med Inform Assoc JAMIA 27:531–538, 2020
  • 28. Peppercorn J: Now That We Don’t Talk: Should Cancer Centers Bill for Patient Portal Messages in Oncology? JCO Oncol Pract OP.24.00176, 2024
  • 29. Liu S, McCoy AB, Wright AP, et al.: Leveraging large language models for generating responses to patient messages: a subjective analysis. J Am Med Inform Assoc 31:1367–1379, 2024
  • 30. Garcia P, Ma SP, Shah S, et al.: Artificial Intelligence–Generated Draft Replies to Patient Inbox Messages. JAMA Netw Open 7:e243201, 2024
  • 31. Chen S, Guevara M, Moningi S, et al.: The effect of using a large language model to respond to patient messages. Lancet Digit Health 6:e379–e381, 2024
  • 32. Benda NC, Rogers C, Sharma M, et al.: Identifying Nonpatient Authors of Patient Portal Secure Messages in Oncology: A Proof-of-Concept Demonstration of Natural Language Processing Methods. JCO Clin Cancer Inform 6:e2200071, 2022
  • 33. Cronin RM, Fabbri D, Denny JC, et al.: A comparison of rule-based and machine learning approaches for classifying patient portal messages. Int J Med Inf 105:110–120, 2017
  • 34. Sieck CJ, Walker DM, Hefner JL, et al.: Understanding Secure Messaging in the Inpatient Environment: A New Avenue for Communication and Patient Engagement. Appl Clin Inform 9:860–868, 2018
  • 35. Sulieman L, Gilmore D, French C, et al.: Classifying patient portal messages using Convolutional Neural Networks. J Biomed Inform 74:59–70, 2017
  • 36. Solms L, van Vianen AEM, Koen J, et al.: Physician exhaustion and work engagement during the COVID-19 pandemic: A longitudinal survey into the role of resources and support interventions. PloS One 18:e0277489, 2023
  • 37. Janssen A, Keep M, Selvadurai H, et al.: Health professionals’ experiences with a patient portal pre and post launch: A qualitative study. Health Policy Technol 12:100761, 2023
  • 38. Sinha S, Garriga M, Naik N, et al.: Disparities in Electronic Health Record Patient Portal Enrollment Among Oncology Patients. JAMA Oncol 7:935, 2021
  • 39. Hong JC, Eclov NCW, Dalal NH, et al.: System for High-Intensity Evaluation During Radiation Therapy (SHIELD-RT): A Prospective Randomized Study of Machine Learning–Directed Clinical Evaluations During Radiation and Chemoradiation. J Clin Oncol 38:3652–3661, 2020
