Abstract
The use of real-world data (RWD) for healthcare decision-making is complicated by concerns regarding whether RWD is fit-for-purpose or is of sufficient validity to support the creation of credible real-world evidence (RWE). An efficient mechanism for screening the quality of RWD is needed as regulatory agencies begin to use RWE to inform decisions about treatment effectiveness and safety. First, we provide an overview of RWD and RWE. Data quality frameworks (DQFs) in the US and EU were examined, including their dimensions and subdimensions. There is some convergence of the conceptual DQFs on specific assessment criteria. Second, we describe a list of screening criteria for assessing the quality of RWD sources. The curation and analysis of RWD will continue to evolve in light of developments in digital health and artificial intelligence (AI). In conclusion, this paper provides a perspective on the utilization of RWD and RWE in healthcare decision-making. It covers the types and uses of RWD, DQFs, regulatory landscapes, and the potential impact of RWE, as well as the challenges and opportunities for the greater leveraging of RWD to create credible RWE.
Keywords: real-world data, real-world evidence, data types, data discoverability, data quality, data privacy, interoperability
1. Introduction
The healthcare landscape is rapidly shifting toward the utilization of real-world data (RWD) and real-world evidence (RWE) to inform decision-making processes. Unlike traditional clinical trials that operate within controlled environments, RWD and RWE offer insights derived from real-life patient experiences in diverse settings. However, the full potential of RWD and RWE can only be realized by overcoming several challenges, including data discoverability, transparency in data curation and data quality assurance, the linkage of data across various platforms, and the protection of sensitive data [1,2,3,4,5,6,7,8,9,10].
“Real-world data [FDA] are data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources. Examples of RWD include data derived from electronic health records, medical claims data, data from product or disease registries, and data gathered from other sources (such as digital health technologies) that can provide information regarding patient health status.”
United States (US) Food and Drug Administration (FDA) [5]
Consequently, “Real-world evidence [RWE] is the clinical evidence about the usage and potential benefits or risks of a medical product derived from analysis of RWD”. Together, they offer a glimpse into how treatments and interventions perform in diverse real-world settings beyond traditional clinical trials (CTs).
One of the primary advantages of RWD and RWE lies in their ability to capture the complexities of patient care as it happens in the real world. Other than pragmatic clinical trials (PCTs), most regulatory-purposed CTs involve strict inclusion criteria and controlled environments that do not reflect the diversity of patient populations, treatment patterns, and healthcare delivery systems.
The promise of RWD and RWE lies in their potential to enhance the development of new therapeutic strategies and support the creation of a learning healthcare system that results in continuous quality improvement from the merging of clinical research and healthcare delivery. Observational data have long been utilized by regulators to identify significant adverse events associated with treatments; however, both in the US and Europe, regulators are exploring how to appropriately use RWE derived from observational studies to inform their decisions about treatment effectiveness. In addition, good-quality observational RWD enables the spectrum of healthcare stakeholders to obtain a more comprehensive view of patient health and treatment outcomes, facilitating the identification of trends, patterns, and insights that may have otherwise gone unnoticed.
This article provides a perspective on the utilization of real-world data (RWD) and real-world evidence (RWE) in healthcare decision-making. It covers the importance of RWD/RWE, data quality frameworks (DQFs), regulatory landscapes, and the potential impact of RWD, as well as the challenges and opportunities for appropriately expanding the use of RWD to create credible RWE.
2. RWD Types and Use
The analysis of RWD for most of the last four decades has focused on structured data that have been coded in a standardized fashion. These include diagnosis codes, procedure codes, prescriptions written, dates of eligibility, dates of service, and many others. Most of this information was obtained from administrative claims and pharmacy records. The strength of these data is that they are very complete. Their limitations are that coding is not always accurate and that exposure and outcomes of treatments must be inferred. Recent initiatives have focused on transparency in the curation procedures utilized to acquire and transform “raw” coded RWD into “analyzable” RWD to determine the quality of the RWD and whether it is fit-for-purpose to answer specific scientific questions. These issues will be explored later in this paper.
Since the 21st Century Cures Act, when RWE was defined by the U.S. Congress [7], many new sources of RWD have become available, including electronic health records (EHRs), ‘omics data, specimens, voice recordings, images, texts, and sensor data from wearable devices and health apps. Only a minority of the data in EHRs are recorded in structured fields; most are accessible only as free text or as reports in portable document format (PDF).
The breadth and depth of these data are tremendous; however, the curation of these data into analyzable data sets requires significant effort. Free-text doctor notes can be converted into structured, coded data manually or through natural language processing and machine learning; however, while expert manual coding of data is well accepted, these other methods have not yet been validated. Hence, their acceptance by regulatory authorities has been limited. For other sources of data, such as wearable sensors, there has not been sufficient transparency in the disclosure of how data have been processed or demonstration of the clinical significance of their findings. The sources of RWD continue to expand, particularly with emerging digital sources (Figure 1). The greater variety and availability of RWD are revolutionizing research and development, patient care, and the work of regulatory and health technology assessment (HTA) agencies.
Figure 1.
Types of RWD (EHR = Electronic Health Record; HRA = Health Risk Assessment; HSA = Health Status Assessment; PDF = Portable Document Format; PRO = Patient-Reported Outcome; RWD = Real-World Data).
The findability, accessibility, interoperability, and reusability (FAIR) principles outline the dimensions of RWD that are fundamental considerations in assessing their usefulness [8]. Such data must also be of sufficient quality and fit-for-purpose. Research purpose-driven prospective collection of RWD has advantages over RWD collected for non-research purposes in that it ensures the collection of critical data elements and often includes ongoing data quality checks. These are common characteristics of PCTs, prospective observational studies, and registries. The disadvantage of these approaches is that they may take a long time to complete and require much more effort and funding than the secondary use of RWD collected during healthcare delivery.
Secondary use of routinely collected RWD generally requires less effort but faces its own challenges with respect to transparency of how the research is conducted and, therefore, its credibility. It has been recommended that such research be conducted transparently, including public posting of the protocol and hypotheses tested prior to analyzing the RWD. Furthermore, RWD scientists should describe what efforts were employed in selecting an RWD source to use (such as numbers of patients and events) and report any changes to the protocol, and their rationale, that occur while the RWD is analyzed. In the absence of randomization, the imputation of causality is more difficult due to issues of bias and confounding. These will be discussed in more detail later in this paper. For example, the criteria for designing RWD studies are summarized in Table 1.
Data discoverability and linkage remain significant hurdles, with RWD often scattered across disparate systems and sources. Addressing these challenges requires concerted efforts from stakeholders across the healthcare ecosystem. Initiatives aimed at improving data standardization, interoperability, and privacy protection are essential. Collaboration between healthcare providers, technology companies, regulators, and patients themselves is key to building a robust infrastructure for RWD and RWE utilization.
One useful tool for data linkage is tokenization [9,10]. This process converts patient identifiers into codes or tokens that allow researchers to connect patient data from various sources without the need to know the identity of the patients. Implementing tokenization early in a study may enable researchers to use existing data sources while reducing the need for new data collection. Making tokens involves straightforward steps, but it is crucial to consider factors like defining research questions, obtaining consent, and training staff on using tokens. Tokenization is typically performed using cloud-based data systems.
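As a rough illustration of the idea, the sketch below derives a token from normalized patient identifiers using a keyed hash (HMAC-SHA256) from the Python standard library and then links two sources on the token alone. The choice of identifiers, the normalization rules, the key, and the function names are assumptions made for this example; operational tokenization services use their own procedures, which may differ substantially.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Strip accents, punctuation, and spacing so formatting differences vanish."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def make_token(secret_key: bytes, first: str, last: str, birth_date: str, sex: str) -> str:
    """Return a hex token; identical normalized identifiers always give the same token."""
    message = "|".join(normalize(part) for part in (first, last, birth_date, sex))
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would be held by a trusted party, not by analysts.
KEY = b"example-key-not-for-production"

# Two sources describe the same person with different formatting.
claims = {make_token(KEY, "Ana", "O'Neil", "1980-02-29", "F"): {"claims_paid": 3}}
ehr = {make_token(KEY, " ana ", "ONEIL", "1980/02/29", "f"): {"hba1c": 7.1}}

# Link on the token alone; no names or birth dates are needed at this stage.
linked = {tok: {**claims[tok], **ehr[tok]} for tok in claims.keys() & ehr.keys()}
```

A keyed hash is used here rather than a plain hash so that someone without the key cannot regenerate tokens from guessed identifiers. The normalization shown does not reconcile different date orders (for example, day-first versus year-first), which a real pipeline would need to standardize before hashing.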
3. Acceptability of RWD for Regulatory Decision-Making
In the US, RWD/RWE initiatives are driven by various regulatory frameworks, such as the FDA’s RWD and RWE guidance documents [5,11,12,13,14,15,16,17,18,19]. The FDA’s “Advancing Real-World Evidence Program” [19] follows various regulatory frameworks. The EU has embarked on similar initiatives and supported the development of systems to facilitate data sharing while ensuring compliance with privacy regulations. The European Health Data Space (EHDS) aims to foster cross-border data exchanges and transfers to support healthcare delivery, research, and innovation. The EHDS seeks to strike a balance between enabling data sharing for legitimate purposes and safeguarding individual privacy rights [20]. Recently, the EMA has published a reflection paper on the use of RWD in non-interventional studies to generate RWE [21]. Ultimately, regulators want to be convinced that the RWD used to generate RWE is reliable, relevant, and fit-for-purpose (Table 2). The definitions of these terms, although they overlap, are not identical across jurisdictions; this represents an opportunity for harmonization efforts in the future. Regulators also want to be assured that non-interventional RWD studies are designed and conducted rigorously. With respect to the latter issue, a consensus on good practice recommendations has yet to emerge. By implementing robust data governance frameworks and fostering collaboration between stakeholders, healthcare organizations can harness the power of real-world data while safeguarding patient privacy and complying with regulatory requirements.
At its base, the creation of RWE requires access to RWD sources that are reliable, relevant, and fit-for-purpose. Deciding whether an RWD source is fit-for-purpose depends on the assessment of reliability and relevance in the context of the regulatory issue being considered. Reliability and relevance are the overarching framework for the assessment of data quality. The data quality framework in the EU is more detailed and encompasses more dimensions than that in the US [22,23] (Table 2). For the EMA, the major considerations are transparency, reliability, extensiveness, coherence, and timeliness. Twelve DQ dimensions were identified in a systematic literature review, and it was concluded that while there was much overlap across the proposed DQFs, the definitions of “dimensions” were quite variable [24].
To assist researchers and sponsors facing this complexity, we have recently developed a research-intuitive set of screening criteria to assess whether potential RWD sources for planned research studies are fit-for-purpose and of sufficient quality to support the creation of trustworthy RWE (Table 1) [25]. We have given this tool the acronym ATRAcTR (meaning: Authentic Transparent Relevant Accurate Track-Record).
Table 1.
ATRAcTR data quality criteria.
| Dimensions | Concepts |
|---|---|
| Authenticity | |
| Transparency | |
| Relevancy | |
| Accuracy | |
| Track Record | |
These criteria are consistent with the frameworks promoted by the FDA and the EMA within a simplified framework and employ terminology directly relatable to the work performed by clinical researchers (Table 2).
Table 2.
A comparison of multiple DQFs.
| FDA dimension | FDA subdimensions | EMA dimension | EMA subdimensions | ATRAcTR |
|---|---|---|---|---|
| Data Reliability | Accuracy; Completeness; Provenance; Traceability | Data Reliability | Precision; Accuracy; Plausibility | Data Authenticity; Data Transparency; Data Accuracy |
| | | Data Extensiveness | Completeness; Coverage | |
| | | Data Coherence | Format; Structural; Semantic; Uniqueness; Conformance; Validity | |
| | | Data Timeliness | | |
| Data Relevance | Exposure; Outcomes; Adequate Sample size | Data Relevance | | Data Relevance |
| Study Design | Employ Causal Inference Framework | | | |
| | | | | Data Track Record |
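Several of the subdimensions listed in Table 2 lend themselves to simple automated checks. The Python sketch below computes completeness, uniqueness, and plausibility for a toy extract. The field names, the records, the plausible range, and the working definition of each check are illustrative assumptions, not definitions taken from the FDA, EMA, or ATRAcTR frameworks.

```python
def completeness(rows, field):
    """Share of rows in which the field is populated."""
    return sum(r.get(field) not in (None, "") for r in rows) / len(rows)


def uniqueness(rows, key):
    """Number of distinct key values divided by the number of rows."""
    return len({r[key] for r in rows}) / len(rows)


def plausibility(rows, field, low, high):
    """Share of populated values that fall inside an assumed plausible range."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return sum(low <= v <= high for v in values) / len(values)


# Invented records for illustration only.
rows = [
    {"patient_id": "A1", "systolic_bp": 128},
    {"patient_id": "A2", "systolic_bp": None},   # missing measurement
    {"patient_id": "A2", "systolic_bp": 131},    # duplicate identifier
    {"patient_id": "A3", "systolic_bp": 1300},   # likely data-entry error
]

report = {
    "completeness": completeness(rows, "systolic_bp"),
    "uniqueness": uniqueness(rows, "patient_id"),
    "plausibility": plausibility(rows, "systolic_bp", 50, 300),
}
```

Checks of this kind only screen a source; whether the resulting values are acceptable still depends on the research question, which is the fit-for-purpose judgment described above.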
Beyond data quality, it has become apparent that study design is equally, if not more, important to the creation of credible RWE. Recent trial emulation work has demonstrated that discrepancies between RCTs and the RWD studies that attempted to emulate their findings were due to two factors: not embedding the study design within a causal inference framework or not being able to closely emulate the RCT [26,27,28]. Causal inference frameworks have not been routinely incorporated into RWD protocols, but their use will improve the robustness of conclusions from observational studies that produce RWE. The credibility of RWD for causal inference from observational studies has been a matter of debate; these studies must be carefully designed, and statistical methods must be applied to address the potential for bias and confounding [29]. The use of RWD for causal inference has been advanced by the work of Hernán and colleagues, who advocate for a target trial emulation approach to study design, which also has implications for how we think about data quality. This approach enables a qualitative assessment of how closely RWD studies can emulate the theoretical RCT that one would perform, if feasible.
Recent advances in statistical methods for causal inference in epidemiology (e.g., doubly robust methods and G estimation) are all based on estimating treatment effects using the mean difference in predicted outcomes for the intervention and comparison groups rather than estimating a parameter in an equation [30,31]. These methods may enable a quantitative assessment of residual bias and confounding. Because the treatment effects are based on predictions, machine learning methods are particularly useful for implementing these new statistical estimates of treatment effects. RWD needs to be adequate for estimating good predictions, but it is not clear that we need equally high levels of data quality for all the variables in predictive models.
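The idea of estimating a treatment effect as a mean difference in predicted outcomes can be illustrated with g-computation (standardization), a simpler relative of the doubly robust estimators mentioned above. In this minimal Python sketch, the "outcome model" is just the mean outcome within each combination of one confounder stratum and treatment, and the data are invented so that sicker patients are treated more often. It assumes no unmeasured confounding and that every stratum contains both treated and untreated patients.

```python
from collections import defaultdict
from statistics import mean


def g_computation_ate(records):
    """records: list of (stratum, treated, outcome) tuples, with treated in {0, 1}."""
    cells = defaultdict(list)
    for stratum, treated, outcome in records:
        cells[(stratum, treated)].append(outcome)
    # Outcome model: predicted outcome = observed mean in each (stratum, treatment) cell.
    predicted = {cell: mean(values) for cell, values in cells.items()}
    # Predict every patient's outcome under treatment and under comparison, then average.
    return mean(predicted[(s, 1)] - predicted[(s, 0)] for s, _, _ in records)


def naive_difference(records):
    """Unadjusted difference in mean outcomes, ignoring the confounder."""
    treated = [y for _, a, y in records if a == 1]
    untreated = [y for _, a, y in records if a == 0]
    return mean(treated) - mean(untreated)


# Invented data: the high-severity stratum is treated more often and has higher outcomes.
records = (
    [("low", 1, 12)] * 2 + [("low", 0, 10)] * 6
    + [("high", 1, 24)] * 6 + [("high", 0, 20)] * 2
)

adjusted = g_computation_ate(records)    # averages the stratum effects of 2 and 4
unadjusted = naive_difference(records)   # distorted by confounding
```

With these invented numbers, the standardized estimate is 3.0, whereas the unadjusted difference is 8.5. In practice, the cell means would be replaced by a fitted regression or machine learning model, which is where the quality of the predictor variables becomes relevant.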
4. System Interoperability and Data Privacy
It has become clear that the value of RWD increases as various data sources are linked together. Accurate linkage of different data sources requires overcoming the challenges of interoperability, particularly in the US, where data are often siloed. In most instances, common data models (CDMs) are employed, which organize data into a standard structure and apply standard data definitions [32,33]. These CDMs typically vary across networks. Standardizing data curation across these disparate data types and sources presents formidable challenges, requiring consensus among stakeholders and alignment with industry best practices.
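To make the role of a CDM concrete, the Python sketch below maps records from two hypothetical sources, each with its own field names and coding conventions, into one shared structure. The `CommonRecord` fields, the source layouts, and the code mappings are invented for illustration and do not correspond to any specific published CDM.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CommonRecord:
    person_id: str
    sex: str              # standardized to "F", "M", or "U" (unknown)
    diagnosis_code: str   # upper case, without the decimal point
    service_date: date


# Assumed source-specific codings mapped to one standard vocabulary.
SEX_MAP = {"f": "F", "female": "F", "2": "F", "m": "M", "male": "M", "1": "M"}


def from_claims(row: dict) -> CommonRecord:
    """Hypothetical claims layout: MEMBER_ID, GENDER, DX1, SVC_DT as YYYYMMDD."""
    d = row["SVC_DT"]
    return CommonRecord(
        person_id=row["MEMBER_ID"],
        sex=SEX_MAP.get(row["GENDER"].strip().lower(), "U"),
        diagnosis_code=row["DX1"].replace(".", "").upper(),
        service_date=date(int(d[:4]), int(d[4:6]), int(d[6:8])),
    )


def from_ehr(row: dict) -> CommonRecord:
    """Hypothetical EHR layout: patient, sex, icd10, encounter_date as YYYY-MM-DD."""
    return CommonRecord(
        person_id=row["patient"],
        sex=SEX_MAP.get(row["sex"].strip().lower(), "U"),
        diagnosis_code=row["icd10"].replace(".", "").upper(),
        service_date=date.fromisoformat(row["encounter_date"]),
    )


a = from_claims({"MEMBER_ID": "P001", "GENDER": "2", "DX1": "E11.9", "SVC_DT": "20240115"})
b = from_ehr({"patient": "P001", "sex": "Female", "icd10": "e119", "encounter_date": "2024-01-15"})
same_event = a == b   # the two source rows become identical once standardized
```

Once both sources share one structure and one set of definitions, the same analysis code can run against either, which is the practical benefit a CDM is meant to deliver.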
Technical challenges include differences between systems and linkage issues such as non-unique patient identification numbers, inconsistent semantics, and the variety of CDMs in use. Linkage has frequently been addressed using probabilistic linkage techniques. As discussed earlier, recent efforts involve data tokenization, together with interoperability standards such as those established under Health Level Seven International (HL7) and Integrating the Healthcare Enterprise (IHE) [34,35].
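Probabilistic linkage is commonly formulated in the Fellegi-Sunter style, in which agreement or disagreement on each field adds a log-likelihood-ratio weight to the score for a pair of records. The Python sketch below illustrates that scoring. The fields, the m- and u-probabilities, and the decision threshold are invented for illustration; in practice they would be estimated from the data.

```python
import math

# For each field: m = P(agree | same person), u = P(agree | different people).
# These values are assumptions chosen for the example.
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "zip3": (0.90, 0.05),
}
THRESHOLD = 8.0  # hypothetical cut-off above which a pair is treated as a match


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Sum of log2 likelihood ratios over all compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            score += math.log2(m / u)              # agreement is evidence for a match
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement is evidence against
    return score


claims_rec = {"last_name": "ONEIL", "birth_date": "1980-02-29", "zip3": "021"}
ehr_moved = {"last_name": "ONEIL", "birth_date": "1980-02-29", "zip3": "024"}
ehr_other = {"last_name": "SMITH", "birth_date": "1975-06-01", "zip3": "021"}

likely_same = match_score(claims_rec, ehr_moved) > THRESHOLD
likely_different = match_score(claims_rec, ehr_other) < THRESHOLD
```

Unlike exact matching on an identifier, this approach tolerates a disagreement on one field (here, a changed ZIP code) when the remaining fields agree strongly.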
Initiatives such as EHR adoption, interoperability standards, and data exchange platforms aim to streamline data sharing and integration, enabling stakeholders to access and analyze data more efficiently. Despite these advancements, siloed data remains a significant barrier to realizing the full potential of RWD in the US.
In contrast, the EU landscape for RWD is characterized by initiatives aimed at facilitating data discovery and exchange. The EHDS seeks to foster cross-border data exchanges to support healthcare delivery, research, and innovation. EHDS aims to overcome barriers to data sharing while ensuring compliance with privacy regulations, thereby enabling stakeholders to access a wealth of RWD from diverse sources across member states.
Additionally, metadata catalogs play a crucial role in facilitating data discovery within the EU. These catalogs provide comprehensive information about the available datasets, including data types, sources, and access requirements, enabling researchers to identify relevant datasets for their specific research questions. However, challenges persist due to varying levels of digitalization across member states, as well as differences in data governance frameworks and privacy regulations.
Data privacy considerations loom large when leveraging RWD, particularly given the sensitive nature of health data. Privacy concerns are addressed through regulatory frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR) in the US and EU, respectively [36,37]. Safeguarding patient privacy and ensuring compliance with regulatory requirements remain critical considerations in the collection, use, and sharing of RWD. Stringent privacy protections, such as de-identification techniques, encryption protocols, and access controls, are essential for mitigating the risks of unauthorized access, data breaches, and privacy violations. By prioritizing patient privacy and data security, stakeholders can foster trust and confidence in RWD/RWE initiatives, enabling data-driven decision-making while respecting individual privacy rights.
In the US, the foremost of these frameworks is HIPAA, enacted in 1996, which establishes national standards for the protection of sensitive patient health information. HIPAA mandates stringent safeguards for the handling and transmission of health data and imposes penalties for non-compliance.
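As one example of a de-identification technique, the Python sketch below generalizes three quasi-identifiers in the spirit of the HIPAA Safe Harbor method: dates are reduced to the year, ZIP codes to their first three digits, and ages above 89 are grouped into a single category. It is deliberately incomplete, since Safe Harbor covers many more identifier types and restricts three-digit ZIP codes for sparsely populated areas, and the record layout and function name are assumptions made for illustration.

```python
from datetime import date

# Assumed set of direct identifiers present in this toy record layout.
DIRECT_IDENTIFIERS = {"name", "phone", "email"}


def deidentify(record: dict) -> dict:
    """Drop direct identifiers and coarsen dates, ZIP codes, and extreme ages."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue                                     # remove outright
        if isinstance(value, date):
            out[field + "_year"] = value.year            # keep the year only
        elif field == "zip":
            out["zip3"] = value[:3]                      # first three digits only
        elif field == "age":
            out["age"] = "90+" if value > 89 else value  # aggregate ages over 89
        else:
            out[field] = value
    return out


patient = {
    "name": "Ana O'Neil",
    "zip": "02139",
    "age": 93,
    "admission_date": date(2024, 1, 15),
    "diagnosis_code": "E11.9",
}
safe = deidentify(patient)
```

Generalization of this kind reduces analytic detail (for example, exact intervals between dates are lost), which is one reason the choice of de-identification approach matters for whether a source remains fit-for-purpose.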
In addition to HIPAA, recent legislative developments in the US include the California Consumer Privacy Act (CCPA) [38]. The CCPA enhances consumer privacy rights and imposes obligations on businesses regarding the collection, use, and sale of personal information. Although not specifically tailored to healthcare data, the CCPA has implications for RWD initiatives, particularly in the context of patient privacy and data sharing. The California Privacy Rights Act (CPRA) subsequently amended and extended the CCPA [39].
Both HIPAA and GDPR play a crucial role in safeguarding patient privacy and establishing standards for data protection in the US and EU, respectively. However, challenges remain in ensuring compliance and safeguarding patient data in the face of evolving threats and technological advancements. Healthcare organizations must remain vigilant and proactive in implementing robust security measures to protect patient data and uphold high standards of privacy and confidentiality.
5. Discussion
The importance of RWD and RWE is ever-increasing and is foundational to the creation of a learning healthcare system. From optimizing treatment pathways and identifying real-world safety concerns to informing regulatory decisions and shaping healthcare policy, applications using RWD for RWE can be vast and far-reaching. Through data discoverability, ensuring quality, protecting privacy, and promoting interoperability, RWD and RWE initiatives enable stakeholders to harness the power of big data and evidence to drive positive change and innovation in healthcare.
In a learning healthcare system, new insights are learned not only from discrete experiments, such as standard CTs, but also from a range of investigations that utilize RWD. The range extends from randomized interventional studies (e.g., PCTs) through non-randomized interventional studies (e.g., single-arm studies utilizing an external control group) to non-randomized non-interventional or observational studies (e.g., prospective observational studies and registries) [40].
As healthcare continues to evolve, the richness of RWD will continue to expand and improve in quality. By appropriately harnessing the diverse forms of RWD as they become available, stakeholders can unlock new opportunities for improving patient care, advancing research, and informing policies and decision-making in healthcare. This will be augmented by additional advances in study designs and statistical analyses. Advances in artificial intelligence (AI) may be a core part of this era of big data [41,42,43]. However, validation of AI on a use-case basis is required to ensure that its findings are applicable and beneficial to larger, diverse patient populations. This is especially true for regulatory decision-making.
6. Conclusions and Future Direction
RWD and RWE applications will increase in the future, given the abundance of data from various sources. Standard RCTs alone will not adequately address the complex intersection of many diseases and comorbid conditions seen in individual patients, so alternative ways of generating evidence are needed to fill these gaps. Big data, RWD, digital health, AI, and robotics have the potential to improve the generalizability of evidence across the spectrum of patient characteristics and comorbid conditions while considering the tradeoff between potential benefits and risks, as well as data privacy rules.
Concerns about equity and diversity will still loom large. Biases in data collection, algorithmic bias, and disparities in access to healthcare resources can exacerbate inequities and perpetuate systemic biases in healthcare delivery and outcomes. Addressing these concerns will require a concerted effort to ensure inclusivity, fairness, and transparency in development.
The deployment of AI-driven healthcare solutions must be executed in a deliberate fashion, accompanied by transparency, explicability, and use-case validation. By prioritizing equity and diversity, stakeholders can mitigate biases, promote health equity, and ensure that data and technologies benefit all patient groups. It is anticipated that RWD and RWE will play increasing roles in the biopharma industry as issues of ethics, transparency, and trustworthiness are addressed.
Actions by regulatory agencies and HTA authorities will have a significant impact on the speed and application of RWE in population-based decision-making. As noted earlier, there is ample room for the harmonization of data quality frameworks. In addition, the incorporation of causal inference considerations into study design and analysis is in its early stages. The adoption of international standards for “good study practices” also needs to be addressed for non-interventional studies. The lack of such standards has been less of an issue for descriptive RWE studies (e.g., treatment patterns, disease progression) and for safety studies; however, they will be critical for studies of treatment effectiveness. See, for example, best practices [44]. Achieving consensus on data quality frameworks, study design, and analysis standards will likely have a greater short-term impact on the adoption of RWE than the creation of new RWD sources.
Acknowledgments
The authors would like to acknowledge William H. Crown for valuable input regarding the content of this article. The authors acknowledge Arghya Bhattacharya and Shanthakumar V for medical writing support.
Author Contributions
Conceptualization, K.H.Z. and M.L.B.; methodology, K.H.Z. and M.L.B.; software, not applicable; validation, not applicable; formal analysis, not applicable; investigation, K.H.Z. and M.L.B.; resources, K.H.Z.; data curation, K.H.Z. and M.L.B.; writing—original draft preparation, K.H.Z.; writing—review and editing, K.H.Z. and M.L.B.; visualization, K.H.Z. and M.L.B.; supervision, K.H.Z.; project administration, K.H.Z. and M.L.B.; funding acquisition, K.H.Z. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
K.H.Z. is an employee and shareholder at Viatris Inc. K.H.Z. is a co-founder of AI4Purpose, Inc. M.L.B. was an independent consultant for Viatris Inc. Viatris Inc. sponsored the design of the study and was involved in the decision to publish the results.
Funding Statement
This research was funded by Mylan Inc.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
References
- 1.Zou K.H., Li J.Z., Imperato J., Potkar C.N., Sethi J., Edward J., Ray A. Harnessing real-world data for regulatory use and Applying Innovative Applications. J. Multidiscip. Healthc. 2020;13:671–679. doi: 10.2147/JMDH.S262776. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Zou K.H., Li J.Z., Salem L.A., Imperato J., Edwards J., Ray A., editors. Harnessing real-world evidence to reduce the burden of noncommunicable disease: Health information technology and innovation to generate insights. Health Serv. Outcomes Res. Methodol. 2021;21:8–20. doi: 10.1007/s10742-020-00223-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Zou K.H., Salem L.A., Ray A., editors. Real World Evidence in a Patient Centric Digital Era. Taylor & Francis; Boca Raton, FL, USA: 2021. [Google Scholar]
- 4.Berger M.L., Ganz P.A., Zou K.H., Greenfield S. When Will Real-World Data Fulfill Its Promise to Provide Timely Insights in Oncology? JCO Clin. Cancer Inform. 2024;8:e2400039. doi: 10.1200/CCI.24.00039. [DOI] [PubMed] [Google Scholar]
- 5.U.S. Food & Drug Administration Real-World Evidence. [(accessed on 10 July 2024)];2023 Available online: https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence.
- 6.Concato J., Corrigan-Curay J. Real-World Evidence—Where Are We Now? N. Engl. J. Med. 2022;386:1680–1682. doi: 10.1056/NEJMp2200089. [DOI] [PubMed] [Google Scholar]
- 7.U.S. Congress H.R.34—21st Century Cures Act. [(accessed on 10 July 2024)];2016 Available online: https://www.congress.gov/bill/114th-congress/house-bill/34.
- 8.FAIR FAIR Principles. 2024. [(accessed on 10 July 2024)]. Available online: https://www.go-fair.org/fair-principles.
- 9.Zou K.H., Vigna C., Talwai A., Jain R., Galaznik A., Berger M.L., Li J.Z. The Next Horizon of Drug Development: External Control Arms and Innovative Tools to Enrich Clinical Trial Data. Ther. Innov. Regul. Sci. 2024;58:443–455. doi: 10.1007/s43441-024-00627-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Shah K., Patt D., Mullangi S. Use of Tokens to Unlock Greater Data Sharing in Health Care. JAMA. 2023;330:2333–2334. doi: 10.1001/jama.2023.23720. [DOI] [PubMed] [Google Scholar]
- 11.U.S. Food & Drug Administration Use of Real-World Evidence to Support Regulatory Decision-Making for Medical Devices. [(accessed on 10 July 2024)];2017 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-real-world-evidence-support-regulatory-decision-making-medical-devices.
- 12.U.S. Food & Drug Administration Use of Electronic Health Record Data in Clinical Investigations Guidance for Industry. [(accessed on 10 July 2024)];2018 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-electronic-health-record-data-clinical-investigations-guidance-industry.
- 13.U.S. Food & Drug Administration Framework for FDA’s Real-World Evidence Program. [(accessed on 10 July 2024)];2018 Available online: https://www.fda.gov/media/120060/download?attachment.
- 14.U.S. Food & Drug Administration CVM GFI #266 Use of Real-World Data and Real-World Evidence to Support Effectiveness of New Animal Drugs. [(accessed on 10 July 2024)];2021 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/cvm-gfi-266-use-real-world-data-and-real-world-evidence-support-effectiveness-new-animal-drugs.
- 15.U.S. Food & Drug Administration Real-World Data: Assessing Electronic Health Records and Medical Claims Data to Support Regulatory Decision-Making for Drug and Biological Products. [(accessed on 10 July 2024)];2021 doi: 10.1002/pds.5444. Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/real-world-data-assessing-electronic-health-records-and-medical-claims-data-support-regulatory. [DOI] [PMC free article] [PubMed]
- 16.U.S. Food & Drug Administration Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products. [(accessed on 10 July 2024)];2023 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-design-and-conduct-externally-controlled-trials-drug-and-biological-products.
- 17.U.S. Food & Drug Administration Real-World Data: Assessing Registries to Support Regulatory Decision-Making for Drug and Biological Products. [(accessed on 10 July 2024)];2023 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/real-world-data-assessing-registries-support-regulatory-decision-making-drug-and-biological-products.
- 18.U.S. Food & Drug Administration Data Standards for Drug and Biological Product Submissions Containing Real-World Data. [(accessed on 10 July 2024)];2023 Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/data-standards-drug-and-biological-product-submissions-containing-real-world-data.
- 19.U.S. Food & Drug Administration Advancing Real-World Evidence Program. [(accessed on 10 July 2024)];2024 Available online: https://www.fda.gov/drugs/development-resources/advancing-real-world-evidence-program.
- 20.European Commission European Health Data Space. 2024. [(accessed on 10 July 2024)]. Available online: https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en.
- 21.European Medicines Agency Reflection Paper on Use of Real-World Data in Non-Interventional Studies to Generate Real-World Evidence—Scientific Guideline. 2024. [(accessed on 10 July 2024)]. Available online: https://www.ema.europa.eu/en/reflection-paper-use-real-world-data-non-interventional-studies-generate-real-world-evidence-scientific-guideline.
- 22.Kahn M.G., Callahan T.J., Barnard J., Bauck A.E., Brown J., Davidson B.N., Estiri H., Goerg C., Holve E., Johnson S.G., et al. A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data. EGEMS. 2016;4:1244. doi: 10.13063/2327-9214.1244. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.European Medicines Agency and Heads of Medicines Agencies Data Quality Framework for EU Medicines Regulation. 2023. [(accessed on 10 July 2024)]. Available online: https://www.ema.europa.eu/en/documents/regulatory-procedural-guideline/data-quality-framework-eu-medicines-regulation_en.pdf.
- 24.Mahendraratnam N., Silcox C., Mercon K., Krostsch A., Romine M., Aten A., Sherman R., Daniel G., McClellan M. Determining Real-World Data’s Fitness for Use and the Role of Reliability. Duke Margolis Center for Health Policy. 2019. [(accessed on 10 July 2024)]. Available online: https://healthpolicy.duke.edu/sites/default/files/2019-11/rwd_reliability.pdf.
- 25.Berger M.L., Crown W.H., Li J.Z., Zou K.H. ATRAcTR (Authentic Transparent Relevant Accurate Track-Record): A screening tool to assess the potential for real-world data sources to support creation of credible real-world evidence for regulatory decision-making. Health Serv. Outcomes Res. Methodol. 2023 doi: 10.1007/s10742-023-00319-w.
- 26.Wang S.V., Schneeweiss S., Franklin J.M., Desai R.J., Feldman W., Garry E.M., Glynn R.J., Lin K.J., Paik J., Patorno E., et al. Emulation of randomized clinical trials with nonrandomized database analyses: Results of 32 clinical trials. JAMA. 2023;329:1376–1385. doi: 10.1001/jama.2023.4221. Erratum in JAMA 2024, 331, 1236.
- 27.Wang S.V., Schneeweiss S. Understanding the facets of emulating randomized clinical trials-reply. JAMA. 2023;330:770–771. doi: 10.1001/jama.2023.11535.
- 28.Schneeweiss S., Wang S.V. Hypothetical assessments of trial emulations. JAMA Intern. Med. 2024;184:446. doi: 10.1001/jamainternmed.2023.7945.
- 29.Desai R.J., Wang S.V., Sreedhara S.K., Zabotka L., Khosrow-Khavar F., Nelson J.C., Shi X., Toh S., Wyss R., Patorno E., et al. Process guide for inferential studies using healthcare data from routine clinical practice to evaluate causal effects of drugs (PRINCIPLED): Considerations from the FDA Sentinel Innovation Center. BMJ. 2024;384:e076460. doi: 10.1136/bmj-2023-076460.
- 30.Park J.E., Campbell H., Towle K., Yuan Y., Jansen J.P., Phillippo D., Cope S. Unanchored population-adjusted indirect comparison methods for time-to-event outcomes using inverse odds weighting, regression adjustment, and doubly robust methods with either individual patient or aggregate data. Value Health. 2024;27:278–286. doi: 10.1016/j.jval.2023.11.011.
- 31.Loh W.W. Estimating Curvilinear Time-Varying Treatment Effects: Combining g-Estimation of Structural Nested Mean Models with Time-Varying Effect Models for Longitudinal Causal Inference. [(accessed on 10 July 2024)];Psychol. Methods. 2024 doi: 10.1037/met0000637. Advance online publication. Available online: https://psycnet.apa.org/record/2024-54079-001?doi=1.
- 32.HealthIT.gov Building Data Infrastructure to Support Patient Centered Outcomes Research (PCOR): Common Data Model Harmonization. [(accessed on 10 July 2024)];2024 Available online: https://www.healthit.gov/topic/scientific-initiatives/pcor/common-data-model-harmonization-cdm.
- 33.Observational Health Data Science and Informatics (OHDSI) Standardized Data: The OMOP Common Data Model. 2024. [(accessed on 10 July 2024)]. Available online: https://www.ohdsi.org/data-standardization.
- 34.HL7 FHIR HL7 FHIR Foundation Enabling Health Interoperability through FHIR. 2024. [(accessed on 10 July 2024)]. Available online: https://fhir.org.
- 35.Integrating the Healthcare Enterprise (IHE) International Making Healthcare Interoperable. 2024. [(accessed on 10 July 2024)]. Available online: https://www.ihe.net.
- 36.U.S. Department of Health and Human Services Health Information Privacy. [(accessed on 10 July 2024)];2024 Available online: https://www.hhs.gov/hipaa/index.html.
- 37.GDPR.EU What Is GDPR, the EU’s New Data Protection Law? 2024. [(accessed on 10 July 2024)]. Available online: https://gdpr.eu/what-is-gdpr.
- 38.Rob Bonta, Attorney General California Consumer Privacy Act (CCPA) [(accessed on 10 July 2024)];2024 Available online: https://oag.ca.gov/privacy/ccpa.
- 39.California Privacy Protection Agency The California Consumer Privacy Act. [(accessed on 10 July 2024)];2024 Available online: https://cppa.ca.gov/regulations.
- 40.Casey J.D., Courtright K.R., Rice T.W., Semler M.W. What can a learning healthcare system teach us about improving outcomes? Curr. Opin. Crit. Care. 2021;27:527–536. doi: 10.1097/MCC.0000000000000857.
- 41.Rajpurkar P., Chen E., Banerjee O., Topol E.J. AI in health and medicine. Nat. Med. 2022;28:31–38. doi: 10.1038/s41591-021-01614-0.
- 42.How to support the transition to AI-powered healthcare. Nat. Med. 2024;30:609–610. doi: 10.1038/s41591-024-02897-9.
- 43.Silcox C., Zimlichmann E., Huber K., Rowen N., Saunders R., McClellan M., Kahn C.N., Salzberg C.A., 3rd, Bates D.W. The potential for artificial intelligence to transform healthcare: Perspectives from international health leaders. NPJ Digit. Med. 2024;7:88. doi: 10.1038/s41746-024-01097-6.
- 44.Berger M.L., Sox H., Willke R.J., Brixner D.L., Eichler H.G., Goettsch W., Madigan D., Makady A., Schneeweiss S., Tarricone R., et al. Good Practices for Real-World Data Studies of Treatment and/or Comparative Effectiveness: Recommendations from the Joint ISPOR-ISPE Special Task Force on Real-World Evidence in Health Care Decision Making. Value Health. 2017;20:1003–1008. doi: 10.1016/j.jval.2017.08.3019.
Data Availability Statement
Not applicable.

