Abstract
PURPOSE
Precision oncology clinical trials often struggle to accrue, partly because it is difficult to find potentially eligible patients at moments when they need new treatment. We piloted deployment of artificial intelligence tools to identify such patients at a large academic cancer center.
PATIENTS AND METHODS
Neural networks that process radiology reports to identify patients likely to start new systemic therapy were applied prospectively for patients with solid tumors that had undergone next-generation sequencing at our center. Model output was linked to the MatchMiner tool, which matches patients to trials using tumor genomics. Reports listing genomically matched patients, sorted by probability of treatment change, were provided weekly to an oncology nurse navigator (ONN) coordinating recruitment to nine early-phase trials. The ONN contacted treating oncologists when patients likely to change treatment appeared potentially trial-eligible.
RESULTS
Within weekly reports to the ONN, 60,199 patient-trial matches were generated for 2,150 patients on the basis of genomics alone. Of these, 3,168 patient-trial matches (5%) corresponding to 525 patients were flagged for ONN review by our model, representing a 95% reduction in review compared with manual review of all patient-trial matches weekly. After ONN review for potential eligibility, treating oncologists for 74 patients were contacted. Common reasons for not contacting treating oncologists included cases where patients had already decided to continue current treatment (21%); the trial had no slots (14%); or the patient was ineligible on ONN review (12%). Of 74 patients whose oncologists were contacted, 10 (14%) had a consult regarding a trial and five (7%) enrolled.
CONCLUSION
This approach facilitated identification of potential patients for clinical trials in real time, but further work to improve accrual must address the many other barriers to trial enrollment in precision oncology research.
AI on EHR data can enable delivery of clinical trial information for patients with cancer who need new treatment.
INTRODUCTION
Precision oncology clinical trials often struggle to accrue,1,2 in part because of the difficulty of identifying patients whose tumors meet complex tumor genomic eligibility criteria at the key moments when they need new treatment. Even when computational software tools3 can identify potential matches on the basis of tumor molecular profiles and trial genomic eligibility criteria, finding patients who are ready to start a new treatment remains challenging.4 To address this gap, we conducted a pilot of deploying artificial intelligence (AI) models to find such patients and facilitate clinical trial matching workflows at a large academic cancer center.
CONTEXT
Key Objective
Eligibility for many clinical trials of novel cancer therapeutics is based on genomic biomarker criteria. Biomarker-selected clinical trials often struggle to accrue, in part because of the difficulty of identifying potentially eligible patients when they need new treatments. We piloted an artificial intelligence (AI)–assisted process for matching patients to genomically selected trials in real time at our institution.
Knowledge Generated
We generated weekly lists of patients with tumor genomic matches to nine early-phase clinical trials using the MatchMiner engine. The lists were then sorted on the basis of the output of previously trained neural network models that use imaging reports to predict new treatment and ascertain disease progression. The sorted matches were sent to an oncology nurse navigator in our early-phase clinical trial group, who manually reviewed records for patients with elevated probabilities of treatment change, reaching out to treating oncologists with trial information if the patient appeared potentially eligible.
Relevance
Deploying AI to identify patients potentially eligible for clinical trials at moments when they need new treatment is feasible. Still, this process addresses just one barrier of many to clinical trial accrual.
METHODS
A neural network model was trained to predict initiation of new systemic therapy (chemotherapy, immunotherapy, or targeted therapy) for advanced cancer within 30 days of imaging with computed tomography (CT), magnetic resonance imaging (MRI), bone scan, or positron emission tomography (PET)-CT to evaluate cancer status.5 Briefly, for each patient, the text of each imaging report and previous imaging reports were used to predict the probability of starting new treatment at any given time. To encourage derivation of prognostically relevant learned features, a secondary task was prediction of mortality risk, but only the prediction of new systemic therapy was incorporated into the workflow for the current pilot. Model architecture, training details, and performance were previously published,5 and the training code is available at GitHub.6 For the study described in this report, the model was retrained using additional electronic health records data generated since publication, and the task of predicting mortality risk was defined as prediction of 6-month mortality.
A second set of neural network models was trained, using manually labeled data for patients with six types of solid tumors, to ascertain descriptions of progressive disease and to identify sites of metastatic disease from individual imaging reports from CTs, MRIs, bone scans, or PET-CTs. The details of these models were also previously published,7 and the training code is available at GitHub.8
From April 2022 to January 2023, these models were applied prospectively each time patients with solid tumors that had undergone next-generation genomic sequencing underwent an imaging study. Model output was linked to the MatchMiner tool, which matches patients to clinical trials using their genomic profiles.3 The output consisted of lists of patients with tumor genomic matches to specific trials (derived from MatchMiner), sorted in order of the likelihood of changing treatment within 30 days (derived from our AI model). Patients with a probability of treatment change exceeding the best F1 threshold for that outcome in the model validation set were deemed likely to change treatment. Beginning in May 2022, to improve model specificity for patients most likely to be eligible for a trial, model-derived annotations of the last cancer progression date, the last brain metastasis date, and the last progressive brain metastasis date (if any) were incorporated into this output. In December 2022, the definition of likely to change treatment was refined to additionally require AI-derived ascertainment of cancer progression in the last month defined using the best F1 threshold in the validation set for that outcome.7
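The deployment rule just described (flag a patient when the predicted probability of treatment change exceeds the best-F1 threshold from the model validation set, then sort matches by that probability for review) can be sketched as follows. This is an illustrative reimplementation with hypothetical field names, not the pilot's production code (the actual training code is linked above):

```python
def best_f1_threshold(y_true, y_prob):
    # Scan unique predicted probabilities as candidate thresholds and
    # return the one maximizing F1 on the validation set.
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        fn = sum(1 for y, p in zip(y_true, y_prob) if p < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


def weekly_report(matches, threshold):
    # Keep genomically matched patients flagged as likely to change
    # treatment, sorted by descending probability for ONN review.
    flagged = [m for m in matches if m["p_new_treatment"] >= threshold]
    return sorted(flagged, key=lambda m: m["p_new_treatment"], reverse=True)
```

Thresholding at the best-F1 point balances precision against recall; an institution prioritizing sensitivity could instead fix a target recall and accept a larger manual review burden.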
These reports were provided weekly to an oncology nurse navigator (ONN) charged with coordinating recruitment to early-phase targeted and immunotherapy trials. For nine genomically selected early-phase trials identified as high priority by the Center for Cancer Therapeutic Innovation at our institution, the ONN reviewed records for each patient flagged as likely to change treatment, filtered out patients who were very likely ineligible to participate in the clinical trial on the basis of trial-specific eligibility criteria, and contacted treating oncologists when patients appeared potentially eligible. The proportions of records leading to oncologist contact and reasons for oncologist noncontact were analyzed.
This study was approved by the Dana-Farber Harvard Cancer Center Institutional Review Board; a waiver of written informed consent was granted, given the minimal risk to patients.
RESULTS
There were 2,150 patients (65% female; median age, 63 years) who had tumors with genomic matches to the nine trials of interest during the pilot. Since information about trial matches was provided to the ONN weekly, and patients could have multiple radiology scans on different dates, patients could be included in these reports repeatedly. The reports therefore provided information on patient-trial matches a total of 60,199 times. Of these, 20,035 (33%) corresponded to moments when patients had imaging studies in the last 4 weeks and 3,168 (5%) indicated that the patient was likely to change treatment at the time of the report (Fig 1). Using the AI models to filter for patients likely to change treatment on the basis of imaging reports therefore corresponded to a 95% reduction in the number of patients who needed to be reviewed weekly by the ONN compared with reviewing all patients with genomic matches each week, and to an 84% reduction in manual review compared with weekly review of all patients with genomic matches and recent imaging.
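The workload-reduction figures follow directly from these counts; as a quick arithmetic check using the numbers above:

```python
total_matches = 60_199   # all patient-trial matches in the weekly reports
recent_imaging = 20_035  # matches with imaging in the last 4 weeks
flagged = 3_168          # matches flagged as likely to change treatment

# Reduction vs. reviewing every genomic match weekly
vs_all = 1 - flagged / total_matches      # ≈ 0.947, ie, a 95% reduction
# Reduction vs. reviewing only matches with recent imaging
vs_recent = 1 - flagged / recent_imaging  # ≈ 0.842, ie, an 84% reduction
```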
FIG 1.

Amount of manual review required for examining all patients with a genomic match to a trial; patients with imaging in the last 4 weeks; or using our AI-assisted process: (A) patients ever requiring manual CCTI review and (B) patient-trial matches requiring manual CCTI review at a given time. AI, artificial intelligence; CCTI, Center for Cancer Therapeutic Innovation.
A total of 525 patients (24%) were deemed likely to change treatment by our AI model at least once during the pilot (Fig 1). The ONN contacted the treating oncologist for 74 patients (3% of the total; 14% of the 525 likely patients). Common reasons that the ONN did not contact the treating oncologist included cases where the patient had already decided to continue current treatment (22%) or had already started new treatment (20%); the trial had no available slots at the time (15%); the patient was ineligible on the basis of nurse navigator review (14%); the patient had already been evaluated for the trial (7%); or the patient had enrolled in hospice (5%; Fig 2). Of the 74 patients whose oncologists were contacted, 10 (14%) had a consult regarding early-phase trials; six (8%) consented to participate; and five (7%) received protocol treatment (Fig 3).
FIG 2.

Reasons listed by the oncology nurse navigator for noncontact of treating oncologists when patients with tumor genomic matches had an elevated probability of changing treatment. ONN, oncology nurse navigator.
FIG 3.
Overall pilot outcomes.
The sensitivity of our process for identifying patients who ultimately consented to one of the trials during our pilot was also evaluated. In total, 18 patients consented to the trials at our institution. Of those, six patients had no imaging reports in our inference data set before consent and therefore could not have been captured by our algorithm; this generally occurred when patients signed consent for a trial during a first consultation visit. Of the remaining 12 patients who consented, 11 were predicted to be likely to change treatment on the basis of imaging performed before or on their consent dates. The one patient not predicted by our algorithm as likely to change treatment before consent had slow progression in a lung nodule and consented to one of our pilot's clinical trials after screening for a different trial and being found ineligible. Of the 11 patients predicted to be likely to change treatment, five were already being followed by the early-phase trials group, so these patients were not included in the ONN contact workflow. This left six patients whose treating oncologists were contacted by our ONN and who later consented; as above, five of these six patients were eligible for the trial after consent and started trial treatment.
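The patient-level sensitivity accounting above reduces to simple counts; as a quick check (numbers from the text):

```python
consented = 18   # patients who consented to the nine trials
no_imaging = 6   # consented without prior imaging in the inference data set

evaluable = consented - no_imaging   # 12 patients the model could have flagged
predicted = 11                       # of those, flagged as likely to change treatment
sensitivity = predicted / evaluable  # ≈ 0.917, reported as 92%
```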
We also evaluated the performance of our AI-assisted process against a hypothetical baseline consisting of weekly ONN review of a random sample of patients who had genomic matches to trials and imaging within the last 4 weeks. As noted in Fig 1, there were 20,035 such patient-trial matches that could have been candidates for review during our pilot, of which 3,168 (15%) were determined by our model to correspond to a patient likely to change therapy at that time and were therefore reviewed by our ONN. Of those, 307 patient-trial matches (10%) led to contact with the patient's treating oncologist. Our AI model exhibited 92% sensitivity at the patient level for identifying patients who had imaging on record and then consented to one of these trials. Therefore, approximately 333 patient-trial matches would have resulted in oncologist contact if the ONN had reviewed all 20,035 matches. If the ONN had instead been able to review only a random 15% of the 20,035 matches, she would have been expected to detect only 0.15 × 333 ≈ 50 of the matches that could have resulted in oncologist contact. Therefore, our AI-assisted method triggered approximately six times as many contacts to treating oncologists as would have occurred by simply taking a weekly random sample of patients with recent imaging for review.
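The approximately six-fold estimate can be reproduced from the reported counts; a short calculation mirroring the rounding used in the text:

```python
contacts = 307     # oncologist contacts under the AI-assisted process
sensitivity = 0.92  # patient-level sensitivity of the model

# Contacts expected if the ONN had reviewed all 20,035 candidate matches
est_full = contacts / sensitivity   # ≈ 333
# Contacts expected from a random 15% sample of those matches
est_random = 0.15 * est_full        # ≈ 50
# Ratio of actual contacts to the random-sample baseline
ratio = contacts / est_random       # ≈ 6-fold
```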
DISCUSSION
In summary, this pilot demonstrated that using AI to narrow a list of potential precision clinical trial candidates, by identifying patients likely to change treatment at any given time, is feasible. However, many other factors also affect potential trial enrollment, ranging from eligibility criteria to patient and oncologist preferences for other treatments. Strengths of our study include its piloting of direct integration of AI assistance into clinical research workflows. Limitations include the single-institution nature of the study. Notably, however, our primary model was trained to predict near-term treatment change using imaging reports as inputs and retrospective structured treatment data as labels, and the training code is published.5 This approach could therefore be replicated at other institutions using their own electronic health record (EHR) data without requiring manual annotation, avoiding the bureaucratic and patient privacy challenges9 of attempting to export models trained on EHR data from one center to another.
Rapid delivery of trial information before final treatment decisions are made is a critical component of this type of AI-assisted workflow. Since the most common reason for not contacting a treating oncologist after a patient was deemed likely to change treatment was that a treatment decision had already been made, the time required to extract imaging reports from the EHR for processing, run AI model inference, and return the results must be minimized. At our site, we have access to imaging reports from the Dana-Farber Cancer Institute Enterprise Data Warehouse, which pulls from the Mass General Brigham Enterprise Data Warehouse, which in turn pulls from the Epic electronic health record. It takes approximately 48 hours for signed imaging reports to become available to us for processing through this pipeline. Additionally, although once-weekly generation of patient lists for ONN review was as frequent as feasible given that a single nurse reviewed the lists, patients who had imaging just after a list was generated had more time to start a new treatment before the next list was generated.
Most patients (86%) who were predicted by our models to be likely to change treatment did not trigger contact with the treating oncologist because, on manual ONN review, they were not eligible for the genomically matched study. The specific reasons for ineligibility were not tracked. However, they likely ranged from exclusions consistent with the research question asked by the trial, such as lack of a required previous therapy, to well-documented, arguably overly stringent, standard eligibility criteria.10 Further research is required to improve AI-based matching for criteria in the former category, but work to liberalize eligibility criteria where appropriate is needed for criteria in the latter category.
A prospective study is underway at our center to evaluate the impact of sharing clinical trial information directly with treating oncologists when our model predicts a high probability of readiness for treatment change. Given persistently low rates of accrual to clinical trials and the workload required to identify patients for appropriate clinical trials at appropriate moments in time, AI assistance could provide a strategy to assist clinicians, researchers, and patients in identifying investigational treatment options.
PRIOR PRESENTATION
Presented at the ASCO Annual Meeting, Chicago, IL, June 5, 2023.
SUPPORT
Supported by the National Institutes of Health/National Cancer Institute (R00CA245899, K.L.K.); the Doris Duke Charitable Foundation (2020080, K.L.K.); and the Lurie Family Endowment for the Knowledge Systems Group (T.M., P.T., J.L., M.R.G., and E.C.).
AUTHOR CONTRIBUTIONS
Conception and design: Kenneth L. Kehl, Tali Mazor, Pavel Trukhanov, Deborah Schrag, Michael J. Hassett, Ethan Cerami
Financial support: Ethan Cerami
Administrative support: Deborah Schrag, Ethan Cerami
Provision of study materials or patients: Antonio Giordano, Leena Gandhi, Deborah Schrag
Collection and assembly of data: Kenneth L. Kehl, Tali Mazor, Pavel Trukhanov, James Lindsay, Matthew R. Galvin, Karim S. Farhat, Emily McClure, Antonio Giordano, Leena Gandhi, Deborah Schrag, Ethan Cerami
Data analysis and interpretation: Kenneth L. Kehl, Tali Mazor, Pavel Trukhanov, Matthew R. Galvin, Antonio Giordano, Leena Gandhi, Deborah Schrag, Michael J. Hassett, Ethan Cerami
Manuscript writing: All authors
Final approval of manuscript: All authors
Accountable for all aspects of the work: All authors
AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/po/author-center.
Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).
Kenneth L. Kehl
Employment: Change Healthcare, OptumRx
Pavel Trukhanov
Leadership: NoRD Bio
Stock and Other Ownership Interests: NoRD Bio
Matthew R. Galvin
Employment: Invitae
Stock and Other Ownership Interests: Invitae
Antonio Giordano
Consulting or Advisory Role: Pfizer
Leena Gandhi
Employment: NextPoint Therapeutics
Leadership: Bright Peak Therapeutics, Neximmune
Stock and Other Ownership Interests: Lilly
Uncompensated Relationships: Accurius
Deborah Schrag
Stock and Other Ownership Interests: Merck
Consulting or Advisory Role: JAMA-Journal of the American Medical Association, Swiss Re Management Ltd
Research Funding: AACR (Inst), GRAIL (Inst)
Patents, Royalties, Other Intellectual Property: PRISSMM model is trademarked and curation tools are available to academic medical centers and government under creative commons license
No other potential conflicts of interest were reported.
REFERENCES
- 1.Nass SJ, Moses HL, Mendelsohn J (eds); Committee on Cancer Clinical Trials and the NCI Cooperative Group Program, Board on Health Care Services: A National Cancer Clinical Trials System for the 21st Century: Reinvigorating the NCI Cooperative Group Program. Washington, DC, Institute of Medicine, National Academies Press, 2010 [PubMed] [Google Scholar]
- 2.O’Dwyer PJ, Gray RJ, Flaherty KT, et al. : The NCI-MATCH trial: Lessons for precision oncology. Nat Med 29:1349-1357, 2023 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Klein H, Mazor T, Siegel E, et al. : MatchMiner: An open-source platform for cancer precision medicine. NPJ Precis Oncol 6:69, 2022 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Hamel LM, Dougherty DW, Albrecht TL, et al. : Unpacking trial offers and low accrual rates: A qualitative analysis of clinic visits with physicians and patients potentially eligible for a prostate cancer clinical trial. JCO Oncol Pract 16:e124-e131, 2020 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Kehl KL, Groha S, Lepisto EM, et al. : Clinical inflection point detection on the basis of EHR data to identify clinical trial-ready patients with cancer. JCO Clin Cancer Inform 10.1200/CCI.20.00184 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.PRISSMM imaging inflection points. https://github.com/prissmmnlp/imaging_inflection_points
- 7.Kehl KL, Xu W, Gusev A, et al. : Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset. Nat Commun 12:7304, 2021 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.PRISSMM pan-cancer outcomes. https://github.com/prissmmnlp/pan_cancer_outcomes
- 9.Lehman E, Jain S, Pichotta K, et al. : Does BERT pretrained on clinical notes reveal sensitive data? arXiv. 2021. http://arxiv.org/abs/2104.07762 [Google Scholar]
- 10.Jin S, Pazdur R, Sridhara R: Re-evaluating eligibility criteria for oncology clinical trials: Analysis of investigational new drug applications in 2015. J Clin Oncol 35:3745-3752, 2017 [DOI] [PMC free article] [PubMed] [Google Scholar]

