As the use of real-world data (RWD) to generate real-world evidence (RWE) in oncology evolves, it has the potential to improve care and outcomes for patients. Notably, there has been rapid growth in the availability of RWD, accompanied by a convergence of efforts across government,1-3 academic and nonprofit organizations, data holders, and the pharmaceutical industry to advance appropriate uses while elucidating strengths and limitations. RWD and RWE have many potential applications in oncology, from researching cancer etiology to implementing improvements in clinical care and providing evidence to support regulatory decision making. In their article, Castellanos et al4 present a perspective on processes and scientific methods to address data quality for electronic health record–derived data, and in this editorial we add a regulatory perspective to the discussion.
In the regulatory setting, RWD may be used according to relevant regulatory standards in evaluating safety signals that cannot easily be assessed in clinical trials. In addition, RWD could be used to describe the natural history of cancers with poorly understood or inadequately characterized etiology or to serve as an external control or historical benchmark to provide context for findings from single-arm trials where a randomized study may be unfeasible or unethical. RWD can also be used prospectively in a randomized pragmatic clinical trial. In all cases, the quality of RWD must be a paramount consideration given its potential to affect the interpretation of study findings.
Castellanos et al4 created a definition of RWD quality by synthesizing definitions across published guidelines and guidances from organizations including the European Medicines Agency, the National Institute for Health and Care Excellence, the US Food and Drug Administration (FDA), the Duke-Margolis Health Policy Center, and the Patient-Centered Outcomes Research Institute, and used this new definition to characterize quality attributes of a single RWD source.4 This effort highlighted apparent similarities in how various regulatory agencies and health policy experts sought to define RWD quality, as well as challenges in how to directly evaluate data quality in the absence of specific benchmarks or thresholds (ie, how good is good enough?). The authors noted additional challenges facing the field broadly in data quality assessment, including a need for multidisciplinary expertise, lack of consistent terminology, and considerations for new quality dimensions as RWD sources evolve.
The FDA does not endorse any specific data source as appropriate for regulatory decision making.5 Although some have referred to regulatory grade when describing RWD sources, such a term has no regulatory meaning. Whether a particular data source is fit for use must be determined in the context of the specific intended use. Since 2021, the agency has published five draft or final guidances related to the use of RWD for regulatory purposes.6-10 The FDA RWE guidances focus on outlining how a data source may be fit for use, meeting both principles of relevance and reliability to answer a specific scientific question. Castellanos et al4 consider both relevance and reliability as dimensions of data quality while noting considerable conceptual overlap specifically between data quality and reliability. Reliable data are complete (necessary data to address study question, design, analysis) and accurate (appropriate collection, transmission, data processing) with known provenance (audit trail accounting for data origin, how it got to present place) and traceability (permits understanding of relationship between original source data, tabulation data, analytic datasets, and study results). The concept of relevance captures availability of key data elements (exposures, outcomes, covariates) and a sufficient number of representative patients, which is contingent on the research question.8,9
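As a rough illustration of the completeness aspect of reliability described above, the following minimal sketch computes, for a handful of made-up patient records, the fraction with a non-missing value for each key data element. The element names, the toy records, and the 0.8 cutoff are all hypothetical assumptions chosen for illustration; they are not drawn from the FDA guidances or from Castellanos et al, and any real threshold would need to be justified for the specific research question.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical key data elements (exposure, outcome, covariate) for one study question.
REQUIRED_ELEMENTS: List[str] = ["exposure_start_date", "tumor_response", "age_at_index"]


def completeness_by_element(records: List[Record], required: List[str]) -> Dict[str, float]:
    """Return the fraction of records with a non-missing value for each required element."""
    total = len(records)
    if total == 0:
        # With no records, nothing can be considered captured.
        return {name: 0.0 for name in required}
    return {
        name: sum(1 for record in records if record.get(name) is not None) / total
        for name in required
    }


def elements_below_threshold(completeness: Dict[str, float], threshold: float) -> List[str]:
    """List elements whose completeness falls below a study-specific threshold."""
    return sorted(name for name, fraction in completeness.items() if fraction < threshold)


# Toy records invented for this sketch; None marks a missing value.
EXAMPLE_RECORDS: List[Record] = [
    {"exposure_start_date": "2021-03-01", "tumor_response": "partial response", "age_at_index": "71"},
    {"exposure_start_date": "2021-04-12", "tumor_response": None, "age_at_index": "64"},
    {"exposure_start_date": "2021-05-20", "tumor_response": "stable disease", "age_at_index": None},
    {"exposure_start_date": "2021-06-02", "tumor_response": None, "age_at_index": "58"},
]

if __name__ == "__main__":
    completeness = completeness_by_element(EXAMPLE_RECORDS, REQUIRED_ELEMENTS)
    for name, fraction in completeness.items():
        print(f"{name}: {fraction:.2f}")
    # The 0.8 cutoff is an arbitrary illustrative value, not a regulatory benchmark.
    print("Below threshold:", elements_below_threshold(completeness, 0.8))
```

A check of this kind addresses only completeness. Accuracy, provenance, and traceability would require comparison against source records and documentation of the data pipeline, which a summary statistic cannot capture.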
Data that are not reliable and relevant cannot be deemed fit for use and, therefore, cannot be used to support regulatory decision making in oncology because they may lead to biased or uninterpretable results and erroneous conclusions. For example, tumor response as reported by the treating physician may be available in certain RWD sources; however, imaging data are often inaccessible to confirm tumor response using criteria similar to those in a clinical trial (eg, RECIST version 1.1).11 This may lead to the potential for misclassification of the outcome and inability to properly estimate the effect size, as physician-reported response rate may overestimate response rate according to RECIST criteria in the real-world setting.12,13 An externally controlled trial may underestimate the effect of the medical product if physician-reported response is used in the RWD-derived external control arm. Likewise, a postmarketing study may be interested in characterizing medical product effectiveness in patients who were underrepresented in the pivotal clinical trial, such as older adults. In this instance, physician-reported response rate may overestimate medical product effectiveness in the real-world setting compared with the response rate observed among patients in clinical trials.
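The direction of the bias in the externally controlled trial example can be sketched with simple arithmetic. The response rates below are hypothetical values invented for illustration, not results from any study: they assume a single-arm trial assessed by RECIST, and an RWD-derived external control in which physician-reported response is higher than a RECIST-based assessment of the same patients would have been.

```python
def response_rate_difference(trial_orr_pct: int, control_orr_pct: int) -> int:
    """Difference in objective response rate (trial minus control), in percentage points."""
    return trial_orr_pct - control_orr_pct


# Hypothetical response rates, in percent (assumed values for illustration only).
TRIAL_RECIST_ORR_PCT = 45        # single-arm trial, RECIST-assessed
CONTROL_RECIST_ORR_PCT = 20      # external control, if RECIST assessment were available
CONTROL_PHYSICIAN_ORR_PCT = 30   # external control, physician-reported (overestimate)

# Difference that would be estimated with a like-for-like outcome definition.
reference_difference = response_rate_difference(TRIAL_RECIST_ORR_PCT, CONTROL_RECIST_ORR_PCT)

# Difference estimated when the control arm uses physician-reported response.
observed_difference = response_rate_difference(TRIAL_RECIST_ORR_PCT, CONTROL_PHYSICIAN_ORR_PCT)

# A negative value means the treatment effect is underestimated.
bias = observed_difference - reference_difference

if __name__ == "__main__":
    print(f"Like-for-like difference: {reference_difference} percentage points")
    print(f"Observed difference:      {observed_difference} percentage points")
    print(f"Bias:                     {bias} percentage points")
```

Under these assumed numbers, the estimated difference shrinks from 25 to 15 percentage points solely because the outcome is measured differently in the two arms, which is the underestimation described above.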
As a fit-for-use assessment is inherently specific to the regulatory research question, transparency regarding the study design and careful evaluation and reporting of the selected data source are important to allow for assessment of potential sources of bias. Transparency helps to promote both relevance and reliability of an RWD source to address a scientific question and is particularly important for studies intended to support regulatory decision making. A comprehensive prespecified protocol that transparently describes study objectives and design elements can facilitate assessment of data source relevance14; specifically, outlining the minimum data elements needed to determine selection criteria, exposure assessment, outcome, and relevant covariates can inform RWD source selection and study feasibility.15 Transparency in the protocol and supporting analytic documentation regarding methods for data abstraction, curation, aggregation, and transformation (ie, data provenance), including any deduplication procedures or linkages, can help to assess reliability. Audit trails starting at data extraction and extending through maintenance and retention that include access to source records or certified copies can help promote transparency and reproducibility. The analysis by Castellanos et al4 highlights the importance of transparency as a means to characterize fitness-for-use of a single, specific RWD source. However, this high-level assessment of fitness-for-use is limited absent a specific research question.
Given the rapid increase in the availability of RWD for research and its potential to provide valuable evidence in regulatory settings, innovation to improve data reliability is a key focus among epidemiologists, biostatisticians, and clinicians evaluating RWD across academia, industry, and regulatory science. Prospective studies incorporating RWD, such as noninterventional studies using registry data and pragmatic clinical trials using electronic health records, present a particularly exciting opportunity. Registries may overcome many of the challenges associated with secondary use of RWD not collected for research purposes, such as missingness, duplication, completeness of capture, auditing issues, provenance, and transparency. Technological advances that allow for decentralized patient enrollment, consent to access electronic health records, and linkages to external data to minimize loss to follow-up may help overcome certain challenges that historically limited the utility of registries for generation of oncology RWE. In addition, pragmatic trials may incorporate RWD to minimize clinician and patient burden and streamline design while maintaining randomization, with the potential to generate evidence for new indications and evaluate safety and effectiveness in the postmarketing setting.16 Finally, as alluded to by Castellanos et al,4 a vast amount of unstructured RWD for oncology is becoming available, including radiology reports, laboratory reports, physician notes, and genomic sequencing data. Whereas manual abstraction of such data is labor intensive, the application of artificial intelligence (AI), including techniques such as natural language processing, large language models, and machine learning, holds promise as a means to facilitate data abstraction. However, AI methods need to be verified and validated for regulatory purposes. Given the potential for bias, they may not be appropriate for regulatory decision making at this time. Within the scientific community, efforts to optimize such AI methods aim to ensure they are appropriate for their intended use.
While some may view the use of RWD as a resource-saving alternative to an interventional study, carefully designing and conducting a noninterventional study using RWD may be resource intensive given the level of expertise necessary to ensure the relevance and reliability of RWD sources. Assessing data relevance and reliability is essential, and failure to do so can lead to bias, confounding, and erroneous results. Accordingly, study designs incorporating RWD intended to provide evidence to support regulatory decision making can be discussed early and often with regulatory agencies.7 Commitment to continuous data improvement is an important step in maximizing the utility of RWD to generate RWE for patient-centered regulatory decision making, furthering the development of oncology medical products, and protecting and advancing public health.
Footnotes
See accompanying Article, 10.1200/CCI.23.00046
DISCLAIMER
Views expressed are those of the authors and not the US Food and Drug Administration. This is a US Government work. There are no restrictions on its use.
AUTHOR CONTRIBUTIONS
Conception and design: All authors
Manuscript writing: All authors
Final approval of manuscript: All authors
Accountable for all aspects of the work: All authors
AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/cci/author-center.
Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians.
No potential conflicts of interest were reported.
REFERENCES
- 1. PDUFA Reauthorization Performance Goals and Procedures Fiscal Years 2023 through 2027. Silver Spring, MD: US Food and Drug Administration; 2022.
- 2. Office of the Federal Register, National Archives and Records Administration. Public Law 114-255, 21st Century Cures Act. Washington, DC: US Government Publishing Office; 2016.
- 3. Cancer Moonshot. Washington, DC: The White House; https://www.whitehouse.gov/cancermoonshot/
- 4. Castellanos EH, Wittmershaus BK, Chandwani S. Raising the bar for real-world data in oncology: Approaches to quality across multiple dimensions. JCO Clin Cancer Inform. doi: 10.1200/CCI.23.00046.
- 5. Framework for the FDA's Real-World Evidence Program. Silver Spring, MD: US Food and Drug Administration; 2018.
- 6. Data Standards for Drug and Biological Product Submissions Containing Real-World Data, Draft Guidance for Industry. Silver Spring, MD: US Food and Drug Administration; 2021.
- 7. Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products, Guidance for Industry. Silver Spring, MD: US Food and Drug Administration; 2023.
- 8. Real-World Data: Assessing Electronic Health Records and Medical Claims Data to Support Regulatory Decision-Making for Drug and Biological Products, Draft Guidance for Industry. Silver Spring, MD: US Food and Drug Administration; 2021.
- 9. Real-World Data: Assessing Registries to Support Regulatory Decision-Making for Drug and Biological Products, Draft Guidance for Industry. Silver Spring, MD: US Food and Drug Administration; 2021.
- 10. Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products, Draft Guidance for Industry. Silver Spring, MD: US Food and Drug Administration; 2023.
- 11. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45:228–247. doi: 10.1016/j.ejca.2008.10.026.
- 12. Ma X, Bellomo L, Hooley I, et al. Concordance of clinician-documented and imaging response in patients with stage IV non-small cell lung cancer treated with first-line therapy. JAMA Netw Open. 2022;5:e229655. doi: 10.1001/jamanetworkopen.2022.9655.
- 13. Feinberg BA, Bharmal M, Klink AJ, et al. Using response evaluation criteria in solid tumors in real-world evidence cancer research. Future Oncol. 2018;14:2841–2848. doi: 10.2217/fon-2018-0317.
- 14. Public Policy Committee, International Society of Pharmacoepidemiology. Guidelines for good pharmacoepidemiology practice (GPP). Pharmacoepidemiol Drug Saf. 2016;25:2–10. doi: 10.1002/pds.3891.
- 15. Oncology Quality, Characterization, and Assessment of Real-World Data (QCARD) Initiative. US Food and Drug Administration, Oncology Center of Excellence; 2023. https://www.fda.gov/about-fda/oncology-center-excellence/oncology-quality-characterization-and-assessment-real-world-data-qcard-initiative
- 16. Califf RM, Sugarman J. Exploring the ethical and regulatory issues in pragmatic clinical trials. Clin Trials. 2015;12:436–441. doi: 10.1177/1740774515598334.
