Author manuscript; available in PMC: 2025 Dec 10.
Published in final edited form as: J Clin Oncol. 2024 Oct 2;42(35):4119–4122. doi: 10.1200/JCO-24-01570

A Role for Artificial Intelligence in the Detection of Immune-Related Adverse Events

Mohamed I Elsaid 1,2, Alexa Simon Meara 2,3, Dwight H Owen 2,4
PMCID: PMC12172021  NIHMSID: NIHMS2019021  PMID: 39356977

Summary:

In the article that accompanies this editorial, Sun and colleagues used large language models (LLMs) to detect immune-related adverse events (irAEs) from electronic health records. They demonstrated that LLMs had higher sensitivity than ICD codes alone (94.7% vs 68.7%, respectively) and similar specificity, while requiring a fraction of the time needed for manual adjudication. This research represents a significant step toward understanding irAEs more efficiently: it leverages electronic medical record data from patients treated with immune checkpoint inhibitors over the last 15 years to better predict these toxicities, with the goal of minimizing or mitigating them entirely.


Although immune checkpoint inhibitors (ICIs) are increasingly used for patients with multiple cancer types and stages, they can trigger immune syndromes called immune-related adverse events (irAEs). While these immune responses can impact any organ system, common irAEs include gastrointestinal, rheumatic, endocrine, and dermatologic toxicities.1–3 In addition to adversely impacting patients’ quality of life, irAEs may result in treatment interruptions or necessitate prolonged courses of high-dose systemic corticosteroids or other immunosuppressive therapy, which may have long-lasting effects. Some irAEs can cause chronic effects, such as pulmonary toxicities that leave patients requiring life-long oxygen, and some irAEs can be fatal.4,5 Research on potential risk factors for irAEs and on strategies to mitigate or minimize their risk has been hampered by a lack of valid methods to accurately detect and characterize irAEs from medical records in real-world settings, despite the significant risk they pose to patients. In addition, this detection gap prevents researchers from leveraging large data sets, which have been instrumental to breakthroughs in other research fields, such as genomics.6

The complex nature of irAEs underscores the urgent need for a detailed understanding of their frequency and severity. This is especially crucial given the current limitations in detection and reporting tools. IrAEs can range from mild to moderate, but in severe cases, they may cause permanent organ dysfunction and exhibit acute, life-threatening manifestations.7 In a meta-analysis involving 45,855 patients, the risk of all-grade irAEs was 39.8%, with 14.9% of patients experiencing grade ≥3 irAEs.8 The rapidly expanding use of ICIs in combination with other cancer therapies (including other immune therapies, radiation, chemotherapy, targeted therapies, cellular therapies, and antibody-drug conjugates) introduces additional layers of complexity in reporting and managing irAEs. Patients receiving combination therapies may experience overlapping toxicities from different treatment modalities, as well as potential drug-drug interactions, which makes parsing out attribution even more challenging. This underscores the need for detection and classification tools that can distinguish reversible from irreversible toxicities and identify toxicities that overlap across treatments. Integrating large language models (LLMs) offers a promising solution to these challenges by providing advanced tools for accurately and efficiently identifying and characterizing irAEs in electronic health record (EHR) data.

In this JCO issue, Sun et al. conducted a comprehensive retrospective study to investigate the performance of LLMs with retrieval augmented generation (RAG) in identifying and characterizing the most prevalent irAEs among hospitalized patients undergoing ICI therapy. Clinical records were processed through an LLM pipeline augmented with RAG. RAG enables the LLM to search a user-provided database of unstructured text, such as clinical notes, for relevant information. This approach enhances the LLM’s ability to generate accurate responses by incorporating additional context from the retrieved data. The study was motivated by the time- and resource-intensive requirements of manual adjudication and by the limitations of existing methodologies, such as International Classification of Diseases (ICD) codes, in detecting irAEs in large EHR data sets. The authors conducted a comprehensive manual review and adjudication of hospital admissions for 3,542 patients receiving ICI treatment between 2011 and 2023. Using both ICD codes and LLMs, the study team compared the performance of these methods against manual adjudication (i.e., by a dedicated team of physicians and medical students) as the gold standard.
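The retrieval step described above can be illustrated with a minimal sketch. This is not the pipeline used by Sun et al.: the bag-of-words cosine scoring, the function names (`retrieve`, `build_prompt`), and the example notes are all hypothetical, chosen only to show how relevant note text might be selected and placed in front of a question before it is passed to an LLM. A real system would typically use learned embeddings and a vector index rather than word counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, notes, k=2):
    """Return the k notes most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        notes, key=lambda n: cosine(q, Counter(tokenize(n))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, notes, k=2):
    """Prepend the retrieved notes to the question as added context."""
    context = "\n".join(f"- {n}" for n in retrieve(query, notes, k))
    return f"Context from clinical notes:\n{context}\n\nQuestion: {query}"


# Hypothetical, fabricated-for-illustration note snippets (not patient data).
notes = [
    "Patient on pembrolizumab admitted with diarrhea; colonoscopy consistent with colitis.",
    "Follow-up visit for hypertension; blood pressure well controlled.",
    "Elevated transaminases after nivolumab; hepatitis suspected, steroids started.",
]
query = "Is there evidence of immune-related colitis?"

# The prompt would then be sent to an LLM; that call is omitted here.
print(build_prompt(query, notes, k=1))
```

In this toy example, only the first note shares a word with the question, so it is the one retrieved; the resulting prompt gives the model the relevant chart text instead of relying on what it memorized during training.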

Results from Sun et al. are important as they show that the LLMs had a significantly higher sensitivity in identifying irAEs compared to ICD codes, with an overall sensitivity of 94.7% versus 68.7% for ICD codes. In irAE-specific analyses, LLMs showed statistically significant improvements in sensitivity estimates for ICI-hepatitis (p<0.001), myocarditis (p<0.001), and pneumonitis (p=0.003), compared with ICD codes. Notably, the significantly higher sensitivity of the LLMs relative to ICD codes did not come at the cost of specificity, which was comparable between the two methods. External validation results using a second institution’s dataset further supported the study’s main findings, with the LLMs outperforming the ICD codes in sensitivity (98.1% vs. 78.6%) and specificity (95.7% vs. 92.9%). The authors also observed an additional advantage of the LLM approach: efficiency. In this study, the LLM method required an average of 9.53 seconds per chart review, which was significantly less than the estimated average of 15 minutes needed for manual adjudication. The observed efficiency combined with high sensitivity and specificity underscores the potential of LLMs as a valuable tool in detecting irAEs.
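As a brief illustration of the metrics discussed above, the following sketch shows how sensitivity and specificity are computed against a gold standard, and how the per-chart time difference translates into a speedup. The confusion-matrix counts are hypothetical placeholders, not data from Sun et al.; only the 9.53-second and 15-minute figures come from the text.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of adjudicated irAE cases that the method flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of adjudicated non-cases that the method leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only (not from the study).
counts = {"tp": 90, "fn": 10, "tn": 180, "fp": 20}
sens = sensitivity(counts["tp"], counts["fn"])
spec = specificity(counts["tn"], counts["fp"])

# Per-chart review times as reported in the editorial.
LLM_SECONDS_PER_CHART = 9.53
MANUAL_SECONDS_PER_CHART = 15 * 60  # 15 minutes

speedup = MANUAL_SECONDS_PER_CHART / LLM_SECONDS_PER_CHART

print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
print(f"LLM review is roughly {speedup:.0f}x faster per chart")
```

On the reported timings, the LLM review is on the order of 90 times faster per chart than manual adjudication.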

This work is highly impactful because reliable collection of irAE data by manual review is onerous, time-consuming, and subjective even when performed by trained specialists. We and others have shown that relying on ICD codes and billing documentation is fraught with error9–14; therefore, having access to extensive data with a systematic, reliable method for irAE identification would be of great utility to advance research. While the study by Sun et al. provides promising findings, some limitations warrant mentioning. The study employed a retrospective design and was conducted at institutions that both participated extensively in early clinical trials in immunotherapy and, therefore, may have more familiarity with irAEs (including a dedicated immune toxicity consultation service), which might affect the generalizability of the results. In addition, the validation analysis was performed using data from an external institution that is part of the same health system where the primary analysis was conducted. The LLMs demonstrated low positive predictive values for irAEs, which indicates that a large share of the cases they flagged were false positives. This will limit the reliability of LLMs, as their results must be confirmed by supplementary review. The LLMs were limited to detecting the presence of irAEs and lacked any assessment of severity, which limits the model’s ability to provide a comprehensive toxicity assessment. Moreover, the data utilized were only from inpatient records. Ideally, we would be able to identify irAEs across the care continuum, especially in the outpatient setting, where early identification and treatment might prevent hospitalization entirely. Finally, the study focused on four irAEs (colitis, hepatitis, pneumonitis, myocarditis), so it is not clear how the LLMs would perform in detecting other irAEs, such as neurotoxicities, which are rare but often life-threatening, or rheumatic irAEs, which are common but fraught with misdiagnosis.15
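The low positive predictive value noted above is compatible with high sensitivity and specificity whenever the event is uncommon. The sketch below applies Bayes' rule using the external-validation sensitivity and specificity quoted in this editorial (98.1% and 95.7%); the prevalence values are assumptions chosen for illustration, not figures reported by Sun et al., and the output does not reproduce the study's positive predictive values.

```python
def positive_predictive_value(sens, spec, prevalence):
    """PPV via Bayes' rule: P(true irAE | flagged by the method)."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity from the external validation quoted in the text.
SENS, SPEC = 0.981, 0.957

# Assumed per-admission prevalences of a given irAE (hypothetical values).
for prevalence in (0.02, 0.05, 0.20):
    ppv = positive_predictive_value(SENS, SPEC, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.1%}")
```

Under these assumptions, a rare irAE yields a positive predictive value well below 50% even with specificity above 95%, which is why flagged cases still need confirmatory review.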

irAEs are often inadequately and inconsistently reported10,11 due to complex data structures, the need for manual adjudication, and the use of ICD codes. In addition, the complexity and variability of irAEs and the lack of standardized irAE-specific ICD codes contribute to the challenges of identifying and reporting them.7,9 Currently, ICD codes for autoimmune disease and drug-induced toxicities are often used to report irAEs in EHR data.16 The problem is further compounded by the general underrepresentation of true irAEs and overrepresentation of false irAEs when relying solely on ICD codes, as we have previously shown.17 As a result, current irAE reporting practices may impede precise attribution and identification of irAEs and can lead to gaps in both patient care and future oncology research efforts. Notably, due to important advocacy work by the irAE consortium ASPIRE, new ICD-10 modifier codes are expected to be released in October 2024, which is a significant and critical step forward (Table 1).16 However, concerns remain about the future uptake and utilization of the new codes. Additionally, these new codes will not help identify irAEs that occurred during the first 15 years of routine ICI use, so tools such as the one developed by Sun et al. will remain important for learning from patients treated during this period. Additional systemic infrastructure building and investment are certainly warranted to ensure the ongoing safety of our patients being treated with ICIs, especially as these therapies enter the neoadjuvant and adjuvant settings where patients have undergone curative therapy.

Table 1.

New ICD-10 Codes for Immune-Related Adverse Events

ICD-10 code   Description
Z92.26        Personal history of immune checkpoint inhibitor therapy
T45.AX5       Adverse effect of immune checkpoint inhibitors and immunostimulant drugs
T45.AX6A      Adverse effect of immune checkpoint inhibitors and immunostimulant drugs, initial encounter
T45.AX7D      Adverse effect of immune checkpoint inhibitors and immunostimulant drugs, subsequent encounter
T45.AX8S      Adverse effect of immune checkpoint inhibitors and immunostimulant drugs, sequela

* To become effective October 1, 2024.

ICD-10 = International Classification of Diseases, 10th Revision

Integrating LLM-based advanced detection tools into clinical practice presents transformative potential for both irAE management and oncology research. In clinical trials, LLM-based tools can help precisely identify individuals with previous irAEs and ensure appropriate participant selection. LLMs can also be employed for real-time continuous surveillance and early detection of irAEs during follow-up. This proactive approach could facilitate intervention and tailored treatment adjustments based on patient responses. It could also reduce the burden on patients, by minimizing the need to collect patient-reported outcomes, and on researchers and investigators, by streamlining data capture and documentation. Additionally, this approach can improve data quality and consistency through standardized assessments and reporting. Automating data review processes has the potential to reduce expenses, accelerate trial timelines, enhance efficiency, and boost cost-effectiveness.

Creating the infrastructure to implement LLM-based identification and classification tools in outpatient clinics is crucial. Continuous monitoring in outpatient settings can aid in the early detection of irAEs and timely intervention. Ensuring that these models are practical outside academic centers will require addressing challenges such as infrastructure, data privacy, integration with existing systems, and support from healthcare providers.

In conclusion, the study by Sun et al. highlights the transformative potential of LLMs in detecting irAEs among hospitalized patients. Their findings demonstrate that LLMs significantly outperform traditional ICD codes in sensitivity and efficiency, offering a promising solution for accurate irAE identification. Despite limitations such as a high proportion of false positives and a lack of severity assessment, the integration of LLMs into clinical practice can enhance patient selection, real-time monitoring, and data standardization in clinical trials. By addressing current detection and reporting challenges, LLMs provide a robust framework for improving patient care and advancing oncology research in managing irAEs.

Research Support:

DHO receives research support from the NIH (R01CA273924) and the LUNGevity Foundation. DHO reports research funding (to institution) from Merck, BMS, Palobiofarma, Genentech, Pfizer, Abbvie, Nuvalent, Onc.AI. MIE receives research support from the NIH (P30 CA016058-47S1), Genentech and AstraZeneca, all unrelated to the current work.

References

1. Wongvibulsin S, Pahalyants V, Kalinich M, et al. Epidemiology and risk factors for the development of cutaneous toxicities in patients treated with immune-checkpoint inhibitors: A United States population-level analysis. J Am Acad Dermatol. Mar 2022;86(3):563–572. doi: 10.1016/j.jaad.2021.03.094
2. Yin Q, Wu L, Han L, et al. Immune-related adverse events of immune checkpoint inhibitors: a review. Front Immunol. 2023;14:1167975. doi: 10.3389/fimmu.2023.1167975
3. Zhou N, Velez MA, Owen D, Lisberg AE. Immune-Related Adverse Events (irAEs): Implications for Immune Checkpoint Inhibitor Therapy. J Natl Compr Canc Netw. Sep 2020;18(9):1287–1290. doi: 10.6004/jnccn.2020.7640
4. Brahmer JR, Lacchetti C, Schneider BJ, et al. Management of Immune-Related Adverse Events in Patients Treated With Immune Checkpoint Inhibitor Therapy: American Society of Clinical Oncology Clinical Practice Guideline. J Clin Oncol. Jun 10 2018;36(17):1714–1768. doi: 10.1200/jco.2017.77.6385
5. Jamal S, Hudson M, Fifi-Mah A, Ye C. Immune-related Adverse Events Associated with Cancer Immunotherapy: A Review for the Practicing Rheumatologist. J Rheumatol. Feb 2020;47(2):166–175. doi: 10.3899/jrheum.190084
6. Reynolds KL, Arora S, Elayavilli RK, et al. Immune-related adverse events associated with immune checkpoint inhibitors: a call to action for collecting and sharing clinical trial and real-world data. J Immunother Cancer. Jul 2021;9(7). doi: 10.1136/jitc-2021-002896
7. Ramos-Casals M, Sisó-Almirall A. Immune-Related Adverse Events of Immune Checkpoint Inhibitors. Ann Intern Med. Feb 2024;177(2):ITC17–ITC32. doi: 10.7326/aitc202402200
8. Shen X, Yang J, Qian G, et al. Treatment-related adverse events of immune checkpoint inhibitors in clinical trials: a systematic review and meta-analysis. Front Oncol. 2024;14:1391724. doi: 10.3389/fonc.2024.1391724
9. Cheung YM, Hamnvik OR, Shariff A, Gallagher EJ. Dearth of ICD Codes for Complications of Immune Checkpoint Inhibitors Impedes Clinical Care and Research. J Endocr Soc. Feb 9 2023;7(4):bvad019. doi: 10.1210/jendso/bvad019
10. Wang Y, Chen C, Du W, et al. Adverse Event Reporting Quality in Cancer Clinical Trials Evaluating Immune Checkpoint Inhibitor Therapy: A Systematic Review. Front Immunol. 2022;13:874829. doi: 10.3389/fimmu.2022.874829
11. Xie T, Zhang Z, Qi C, et al. The Inconsistent and Inadequate Reporting Of Immune-Related Adverse Events in PD-1/PD-L1 Inhibitors: A Systematic Review of Randomized Controlled Clinical Trials. Oncologist. Dec 2021;26(12):e2239–e2246. doi: 10.1002/onco.13940
12. Nashed A, Zhang S, Chiang CW, et al. Comparative assessment of manual chart review and ICD claims data in evaluating immunotherapy-related adverse events. Cancer Immunol Immunother. Oct 2021;70(10):2761–2769. doi: 10.1007/s00262-021-02880-0
13. Zitu MM, Li L, Elsaid MI, Gatti-Mays ME, Manne A, Shendre A. Comparative assessment of manual chart review and patient-level adverse drug event identification using artificial intelligence in evaluating immunotherapy-related adverse events (irAEs). American Society of Clinical Oncology; 2023.
14. Zitu MM, Zhang S, Owen DH, Chiang C, Li L. Generalizability of machine learning methods in detecting adverse drug events from clinical narratives in electronic medical records. Front Pharmacol. 2023;14:1218679.
15. Liew DFL, Meara AS. Rheumatic Immune-Related Adverse Events: Current Clinical Imperatives Underpin Future Novel Insights. Rheum Dis Clin North Am. May 2024;50(2):xvii–xix. doi: 10.1016/j.rdc.2024.02.008
16. Mittal K, Reynolds K, Funchain P. Introducing ASPIRE and STORIES: A New International Initiative for Faculty Collaboration and Patient Advocacy in Immune-Related Adverse Events. Accessed 07/10/2024. https://ascopost.com/issues/june-10-2024/a-new-international-initiative-for-faculty-collaboration-and-patient-advocacy-in-immune-related-adverse-events/#:~:text=In%20its%20first%20landmark%20advocacy,data%20analysis%20of%20these%20entities
17. Nashed A, Zhang S, Chiang CW, et al. Comparative assessment of manual chart review and ICD claims data in evaluating immunotherapy-related adverse events. Cancer Immunol Immunother. Oct 2021;70(10):2761–2769. doi: 10.1007/s00262-021-02880-0
