Abstract
The role of real‐world evidence (RWE) in regulatory, drug development, and healthcare decision‐making is rapidly expanding. Recent advances have increased the complexity of cancer care and widened the gap between randomized clinical trial (RCT) results and the evidence needed for real‐world clinical decisions.1 Instead of remaining invisible, data from the >95% of cancer patients treated outside of clinical trials can help fill this void.
DEFINING RWE
RWE is derived from the data of patients treated in real‐world settings. The surge of electronic health records (EHRs), as well as other technologies, enables researchers to better understand the real‐world patient experience. EHR‐derived data can be combined with other data sources such as administrative claims, genomic information, and mortality datasets to create a more complete description of a patient's cancer journey. It is crucial to develop rigorous guidance for translating real‐world data (RWD) into actionable and meaningful RWE.
Standardized criteria facilitate the evaluation of RWD and data analysis in order to establish the confidence level in an RWE result. This article furthers the ongoing discussion about RWE in healthcare by defining RWE, exploring potential use cases in cancer, and proposing a regulatory‐grade RWE checklist.
RWE is generated from RWD that are documented during the course of routine clinical care. With appropriate privacy and ethical processes in place, RWD can be gathered retrospectively, as commonly used for health outcomes research, or prospectively, as may be used for safety monitoring or a pragmatic trial.
While all RWD sources have their limitations, the richest in terms of depth and breadth is the EHR. Structured data, such as cancer diagnosis codes, can be easily extracted from the EHR with appropriate technological and software solutions. Abstraction of unstructured data, such as tumor histology from pathology reports, can supplement the core structured data elements.2
RWE QUALITY
Credible RWE is generated from high‐quality data that 1) are obtained from relevant RWD sources, 2) are cleaned, harmonized, and linked to fill in gaps, and 3) include endpoints. Quality criteria need to encompass the entire process of generating RWE, from data sources and processing to defining appropriate use cases (Figure 1).
The optimal RWD source depends on the RWE hypothesis and purpose.3 As the EHR is a contemporaneous (prospective or retrospective) account of the clinical narrative, it provides contextual details and longitudinal follow‐up for outcomes. The completeness of EHR data depends on clinician workflow, care location, and patient factors. Missing data may need to be filled in from alternative data sources; for example, claims data may provide evidence of emergency department visits that are not documented in clinic notes.
Each type of RWD source has well‐documented limitations: the absence of clinical results and endpoints in claims data; limited follow‐up and reliability in patient registries; and selection bias and security concerns for smartphone/wearables data. These limitations should not preclude the use of RWD; rather, data characteristics need to be consistently documented in order to understand their potential implications for analysis and interpretation. A comprehensive RWE checklist covers multiple data quality dimensions. Compiled as metadata, this information must accompany each dataset.
To create a comprehensive dataset that truly represents the clinical journey, disparate pieces of information must be aggregated. Cleaning and harmonization are necessary to integrate heterogeneous sources. For example, a lumpectomy may be documented in structured (procedure code) and/or unstructured (clinician note) data; capturing both is imperative. Heterogeneous units (e.g., serum calcium reported as mg/dL or mEq/L) also need to be standardized. Deidentified links to external sources such as genomic databases can further enhance dataset completeness. The efficiency of curating these complex, relational datasets can be increased through technology‐enabled approaches (Figure 1). Overall, these steps ensure consistent data values and yield a parsimonious set of variables that can accommodate a variety of analytic use cases.
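As an illustration of the unit standardization step, the following minimal sketch converts serum calcium values to mg/dL. The record layout, field names, and function name are assumptions made for this example and are not drawn from any particular EHR system; the conversion relies only on the molar mass of calcium (about 40.08 g/mol) and its valence of 2.

```python
# Approximate molar mass of calcium (g/mol) and the charge of the Ca2+ ion.
CALCIUM_MOLAR_MASS = 40.08
CALCIUM_VALENCE = 2


def calcium_to_mg_per_dl(value: float, unit: str) -> float:
    """Return a serum calcium value expressed in mg/dL."""
    normalized = unit.strip().lower()
    if normalized == "mg/dl":
        return value
    if normalized == "meq/l":
        # mEq/L -> mmol/L (divide by valence)
        # mmol/L -> mg/L (multiply by molar mass)
        # mg/L -> mg/dL (divide by 10)
        return value / CALCIUM_VALENCE * CALCIUM_MOLAR_MASS / 10
    raise ValueError(f"Unsupported serum calcium unit: {unit!r}")


# Hypothetical lab records from two source systems (illustrative values only).
records = [
    {"patient_id": "A", "value": 9.6, "unit": "mg/dL"},
    {"patient_id": "B", "value": 4.8, "unit": "mEq/L"},
]

harmonized = [
    {
        "patient_id": r["patient_id"],
        "calcium_mg_dl": round(calcium_to_mg_per_dl(r["value"], r["unit"]), 2),
    }
    for r in records
]
```

Rejecting unrecognized units, rather than passing values through silently, keeps unit ambiguity visible during curation instead of letting it surface later as an analytic artifact.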
For many oncology use cases, relevant, high‐quality outcome variables must be included in the study dataset. These measures include binary endpoints such as vital status, as well as complex tumor (e.g., response), clinical (e.g., patient‐reported outcomes), and healthcare system (e.g., resource utilization) endpoints. Challenges for accurately capturing real‐world endpoints include variable documentation of key events (e.g., change in tumor measurement) and subjectivity (e.g., adverse event descriptions). Given the importance of endpoints, multiple stakeholders (e.g., the US Food and Drug Administration, the National Cancer Institute, biopharmaceutical companies, and academic researchers) are actively engaged in establishing endpoint frameworks and metrics.
POTENTIAL RWE USE CASES
The overarching objective of cancer research is to improve patient outcomes and/or quality of life. Retrospective and prospective RWE may play a complementary role to existing data or may stand alone. Here we explore potential RWE use cases.
REGULATORY DECISION‐MAKING
Multiple scientific, drug development, patient access, and legislative forces have coincided to focus efforts on developing rigorous RWE for regulatory purposes.4 For new oncology therapeutics, conventional RCTs remain the gold standard. However, RCTs produce efficacy and safety results for narrow patient populations, circumscribed clinical settings, and limited drug combinations. By expanding data sources, regulatory‐grade RWE can provide critical information needed by clinicians, patients, and regulatory bodies to make informed decisions.5
Postmarketing requirements and commitments
Traditional phase IV and other postmarketing studies can be cumbersome and face a myriad of patient enrollment barriers such as changing practice patterns. Rigorous RWE studies may generate unique hypotheses for future basic science, drug development, health outcomes, and clinical research.
Pharmacovigilance
Identification of rare side effects may be facilitated by longitudinal RWE for broad populations. Instead of relying on voluntary reporting, carefully conducted RWE studies may uncover adverse event trends in real time.
Label expansion
Registrational trials often exclude specific populations (e.g., patients with HIV). Focused, in‐depth RWE studies may be able to turn anecdotes about these patients into cohorts large enough and robust enough for regulatory consideration. By thoroughly evaluating structured and unstructured RWD for individual patients, RWE may rigorously document safety and effectiveness with the level of quality and detail needed to support label expansion.
DRUG DEVELOPMENT
During its development life‐cycle, a new oncology therapeutic faces multiple go/no‐go decision points. Scientific and safety standards always have primacy. However, limited resources mean some good drugs are never fully explored. By clarifying real‐world unmet needs, RWE may help optimize decisions during predevelopment and guide clinical development strategies (Figure 2).
During clinical development, RWE may also inform clinical trial design and conduct. RWE about specific populations (e.g., renal cell carcinoma patients with asymptomatic brain metastases) may help avoid unnecessarily restrictive exclusion criteria. Understanding prevalence patterns for potential trial candidates (e.g., rare cancers progressing on chemotherapy) may facilitate patient enrollment. Synthetic control arms based on RWE are also being explored, particularly for cancers with a well‐established standard of care, poor prognosis, and low incidence (e.g., small cell lung cancer). In contrast to historical controls, synthetic controls may have greater recency that can help control for changes in supportive care over time.
CLINICAL DECISION‐MAKING
Treatment decisions often depend on the risk/benefit ratio for an individual patient. Although clinical uncertainty cannot be eliminated, RWE may help fine‐tune this appraisal and promote personalized medicine tailored to both the patient and the tumor.
CHECKLIST FOR REGULATORY‐GRADE RWE
As with all scientific evidence, RWE, both retrospective and prospective, must be fit for purpose. We propose a checklist to ensure regulatory‐grade data quality. In all cases, policies and procedures must be well documented, each dataset tested, and results reported when appropriate.
1) High quality
The provenance of each datapoint must be clear, traceable, and auditable. Data quality must be systematically measured with predetermined frameworks (e.g., interrater reliability) and against benchmarks (e.g., stage distribution in Surveillance, Epidemiology and End Results (SEER)).
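One way to quantify interrater reliability is Cohen's kappa, which corrects the observed agreement between two abstractors for the agreement expected by chance. The sketch below is illustrative only: the stage labels are invented, and kappa is one of several possible agreement statistics rather than a metric prescribed by this checklist.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical stage at diagnosis abstracted independently from the same six charts.
abstractor_1 = ["I", "II", "II", "III", "IV", "IV"]
abstractor_2 = ["I", "II", "III", "III", "IV", "IV"]

kappa = cohens_kappa(abstractor_1, abstractor_2)
```

In this invented example the abstractors agree on five of six charts and chance agreement is 0.25, so kappa is about 0.78.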
2) Complete
Completeness requires predefined rules for abstraction of structured and unstructured data, data harmonization, and quality monitoring. Completeness needs to be benchmarked to appropriate gold standards (e.g., National Death Index for date of death).
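Benchmarking against a gold standard can be expressed as two simple measures: the share of reference‐standard deaths that the RWD also captures (sensitivity), and, among deaths found in both sources, the share whose dates agree within a tolerance. The sketch below assumes deidentified patient identifiers that have already been linked across sources; the function names, the 15‐day tolerance, and all dates are hypothetical.

```python
from datetime import date


def death_capture_sensitivity(rwd_deaths, reference_deaths):
    """Share of reference-standard deaths that also appear in the RWD."""
    if not reference_deaths:
        raise ValueError("The reference standard contains no deaths.")
    captured = sum(1 for pid in reference_deaths if pid in rwd_deaths)
    return captured / len(reference_deaths)


def death_date_agreement(rwd_deaths, reference_deaths, tolerance_days=15):
    """Among deaths present in both sources, share with dates within the tolerance."""
    shared = [pid for pid in reference_deaths if pid in rwd_deaths]
    if not shared:
        raise ValueError("No deaths are present in both sources.")
    close = sum(
        abs((rwd_deaths[pid] - reference_deaths[pid]).days) <= tolerance_days
        for pid in shared
    )
    return close / len(shared)


# Hypothetical linked records: patient identifier -> date of death.
reference = {
    "p1": date(2017, 1, 10),
    "p2": date(2017, 3, 2),
    "p3": date(2017, 6, 20),
    "p4": date(2017, 9, 5),
}
rwd = {
    "p1": date(2017, 1, 10),
    "p2": date(2017, 3, 20),
    "p4": date(2017, 9, 1),
}

sensitivity = death_capture_sensitivity(rwd, reference)
agreement = death_date_agreement(rwd, reference)
```

Here the RWD captures three of the four reference deaths (sensitivity 0.75), and two of those three dates fall within the assumed 15‐day tolerance.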
3) Transparent
Transparent study designs and analysis plans are critical for robust RWE. In particular, the specific aims and cohort selection criteria need to be precisely defined. Study design considerations include retrospective vs. prospective data collection, the need for matching or propensity scores to facilitate comparisons, and endpoint validation.
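To make the matching step concrete, the sketch below pairs treated and comparison patients by greedy 1:1 nearest‐neighbor matching on propensity scores that are assumed to have been estimated already (for example, by logistic regression). The caliper of 0.05, the processing order, and the patient identifiers are assumptions for illustration; this is one of several matching strategies, not a recommended default.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated and controls map patient identifiers to propensity scores.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # copy so the caller's data is not modified
    pairs = []
    # Process treated patients from highest to lowest score; the order is a
    # choice made for this sketch so that results are deterministic.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        # Closest remaining control by absolute difference in propensity score.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical precomputed propensity scores.
treated_scores = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
control_scores = {"C1": 0.60, "C2": 0.33, "C3": 0.40, "C4": 0.10}

matched_pairs = greedy_match(treated_scores, control_scores)
```

In this invented example T3 remains unmatched because no comparison patient falls within the caliper. Reporting how many patients were dropped in this way is part of a transparent analysis plan.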
4) Generalizable
RWE is often based on a broad range of patients, which can translate into better generalizability. Potential biases (e.g., geographic representation) must be identified and reported to allow for appropriate statistical adjustments and clinical interpretations.
5) Timely
RWE reflects daily clinical decisions. Thus, reliable RWE needs to be recent and timely. Details about the timepoint that the data analysis represents (e.g., time period, last update, and number of potential candidates) must be reported.
6) Scalable
Data challenges become exponentially more complicated as the number of patients and variables increases. Therefore, scaling requires 1) a balance between high touch and automation; 2) a modular data model that can be used in multiple contexts and facilitates model evolution (e.g., frequency of intravenous regimens); and 3) unambiguous variable definitions, particularly for endpoints.
CONCLUSION
The significance and scope of potential RWE use cases require rigorous quality assessment, especially when RWE is used for regulatory decision‐making. Therefore, we propose a checklist for robust, regulatory‐grade RWE: 1) High quality, 2) Complete, 3) Transparent, 4) Generalizable, 5) Timely, and 6) Scalable. As suggested by potential use cases in cancer, the up‐front investment in establishing regulatory‐grade standards will pay dividends as RWE gains importance in medicine.
CONFLICT OF INTEREST/DISCLOSURE
Dr. Miksad and Dr. Abernethy are employed at Flatiron Health, a for‐profit company.
References
1. Sherman, R.E., Davies, K.M., Robb, M.A., Hunter, N.L. & Califf, R.M. Accelerating development of scientific evidence for medical products within the existing US regulatory framework. Nat. Rev. Drug Discov. 16, 297–298 (2017).
2. Berger, M.L., Curtis, M.D., Smith, G., Harnett, J. & Abernethy, A.P. Opportunities and challenges in leveraging electronic health record data in oncology. Future Oncol. 12, 1261–1274 (2016).
3. FDA guidance: Use of real‐world evidence to support regulatory decision‐making for medical devices. Silver Spring, MD: US Food and Drug Administration.
4. Khozin, S., Blumenthal, G.M. & Pazdur, R. Real‐world data for clinical evidence generation in oncology. J. Natl. Cancer Inst. 109 (2017).
5. Khozin, S., Kim, G. & Pazdur, R. Regulatory watch: from big data to smart data: FDA's INFORMED initiative. Nat. Rev. Drug Discov. 16, 306 (2017).