Pharmacoepidemiology and Drug Safety. 2025 Mar 14;34(3):e70127. doi: 10.1002/pds.70127

A Brief Report on Proposed Areas of International Harmonization of Real‐World Evidence Relevance, Reliability and Quality Standards Among Medical Product Regulators

Maryam Nafie 1, Valerie J Parker 1, Mark McClellan 1, Rachele M Hendricks‐Sturrup 1
PMCID: PMC11907323  PMID: 40084391

ABSTRACT

Background

International harmonization of real‐world data and real‐world evidence (RWD/E) standards is a goal among RWD/E policy stakeholders. The Duke‐Robert J. Margolis Institute for Health Policy developed an online ‘International Harmonization of RWE Standards Dashboard’ to provide timely updates on progress toward this goal.

Methods

Guidance for industry (draft and final) and related literature published online by medical product regulators globally were sought and, where needed, translated into English by a certified translator. Consultations were then held with practicing experts to identify, collate, and interpret documents. An online Tableau tool was assembled to collate guidance documents and regulatory definitions of the following key terms used among the community to describe fit‐for‐use RWE in regulatory submissions: relevance, reliability, and quality.

Results

As of February 2025, the United States Food and Drug Administration (FDA) has released the most RWE guidance documents to date (n = 13; 4 draft, 9 final). Four (4) regulators globally (US FDA, EMA, Taiwan FDA, Brazil ANVISA) have directly defined at least two (2) out of the three key terms (reliability, relevance, quality), indicating alignment around the importance of these terms used in the context of RWD/E. Across these terms, we propose areas of definitional alignment: data representativeness and research and regulatory concern (relevance), accuracy in data interpretation and quality and integrity during data accrual (reliability), and data quality assurance across sites and time (quality). We propose areas of definitional misalignment regarding clinical context, data availability and representativeness, and ensuring study sample sizes and/or datasets are adequate to address a given study question.

Conclusions

Our assessment of the definitions provided by these four regulators leads us to propose distinct areas for harmonization where regulators appear to align and to highlight opportunities to address misalignment.

Keywords: international harmonization, real‐world data, real‐world evidence, regulatory science


Summary.

  • As international harmonization of real‐world evidence (RWE) standards is a goal among medical product regulators globally, the Duke‐Margolis International Harmonization of RWE Standards Dashboard was built to serve this goal.

  • The United States Food and Drug Administration has published the most guidance (n = 13), although only four (4) regulators (US FDA, EMA, Taiwan FDA, Brazil ANVISA) have directly defined and aligned around the use of at least two (2) of the following key terms used to describe fit‐for‐use RWE in regulatory submissions: reliability, relevance, and quality.

  • Our assessment of findings to date in the Dashboard supports areas for harmonization as well as opportunities to address specific areas of misalignment.

1. Purpose

Real‐world data (RWD) and real‐world evidence (RWE) are increasingly used to demonstrate both the efficacy and safety of medical products investigated globally across a broad range of therapeutic and disease areas [1]. RWE approaches have involved but have not been limited to the use of external RWD controls (direct matching, benchmark/natural history), external controls (literature/prior trials), and RWD generated via an expanded access program [1, 2]. For these reasons, international harmonization of RWD and RWE standards has become a stated goal among regulators, medical product developers, and their collaborators globally.

In 2022, members of the International Coalition of Medicines Regulatory Authorities (ICMRA), namely the European Medicines Agency (EMA), the United States Food and Drug Administration (US FDA), and Health Canada, published a statement noting their interest and intent to converge on terminology for RWD and RWE. More recently, members of the M14 working group within the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) published the draft M14 guideline, General Principles on Plan, Design, and Analysis of Pharmacoepidemiological Studies That Utilize Real‐World Data for Safety Assessment of Medicines [3].

To augment transparency and support this effort's partnership with communities of practice, we, through the Duke‐Robert J. Margolis Institute for Health Policy's RWE Collaborative, developed an online and publicly accessible International Harmonization of RWE Standards Dashboard (“Dashboard”) [4]. This brief report builds on current and relevant work, including that of the International Society for Pharmacoepidemiology (ISPE), conducted to generate recommendations for improvements to RWD identification and feasibility assessments [5]. Building on our recent engagement and presentation at the 2024 ISPE Annual Meeting, entitled “Catalyzing International Harmonization of Real World Evidence Standards,” we describe the general process used to build the Dashboard and summarize quantitative and qualitative findings. We discuss signs of definitional alignment around key terms, highlight areas of definitional misalignment, and offer practical recommendations.

2. Methods

2.1. Identification and Selection of Guidance Documents and Frameworks for Industry

A broad internet search, alongside consultations with practicing experts across government, industry, and academia, was conducted to locate, source, and collate both draft and final regulatory guidance documents and frameworks on RWD/RWE. Three analysts (M.N., V.J.P., R.M.H‐S.) evaluated the guidance documents and frameworks found, either as published in English or in certified English translation. Using Microsoft Excel, we extracted and captured both formal and informal definitions for the following key terms used to describe fit‐for‐use RWE in regulatory settings: reliability, relevance, and quality.

2.2. Quantitative and Qualitative Assessment

We quantified the total number of guidance documents and frameworks published to date across all regulators globally, as well as the total number of guidance documents and frameworks published by each regulator to date. We also quantified the total number of regulators who have defined the following terms: reliability, relevance, and quality. We evaluated definitions provided among a subset of regulators that have formally defined at least two (2) of those terms to determine early signs of definitional alignment/misalignment, propose areas for harmonization based on our assessment of where regulators appear to align, and highlight opportunities to address misalignment.
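The tallying step described above can be sketched in a few lines of Python. This is only an illustration and not the authors' workflow, which used Microsoft Excel and Tableau. The term coverage for the four named regulators follows Appendix A; "Regulator X" is a hypothetical entry included only to show the filter excluding a regulator, and the function and variable names are assumptions made for this sketch.

```python
from collections import Counter

KEY_TERMS = ("relevance", "reliability", "quality")

# Key terms each regulator directly defines, as summarized in Appendix A.
# "Regulator X" is hypothetical and exists only to exercise the filter.
defined_terms = {
    "US FDA": {"relevance", "reliability", "quality"},
    "EMA": {"relevance", "reliability", "quality"},
    "Taiwan FDA": {"relevance", "reliability", "quality"},
    "Brazil ANVISA": {"relevance", "quality"},  # reliability: definition undetermined
    "Regulator X": {"quality"},  # hypothetical
}


def regulators_defining_at_least(definitions, minimum=2):
    """Return, sorted by name, regulators defining at least `minimum` key terms."""
    return sorted(
        name
        for name, terms in definitions.items()
        if len(terms & set(KEY_TERMS)) >= minimum
    )


def regulators_per_term(definitions):
    """Count how many regulators define each key term."""
    counts = Counter()
    for terms in definitions.values():
        counts.update(term for term in terms if term in KEY_TERMS)
    return {term: counts[term] for term in KEY_TERMS}


subset = regulators_defining_at_least(defined_terms)
per_term = regulators_per_term(defined_terms)
print(subset)
print(per_term)
```

With the hypothetical entry excluded by the threshold, the subset contains the four regulators whose definitions are compared in the qualitative assessment.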

2.3. Statement of Work in Progress

Information captured in Microsoft Excel was imported into Tableau Desktop to display summarized findings online via Tableau Public on the Duke‐Margolis International Harmonization of RWE Standards Dashboard (https://healthpolicy.duke.edu/projects/international‐harmonization‐real‐world‐evidence‐standards‐dashboard). Information within the Dashboard continues to be updated regularly. The Dashboard in its current form is meant to serve as a foundational work‐in‐progress, particularly as new guidance documents and/or frameworks published by government agencies globally are released and/or finalized and thus captured within the Dashboard.

3. Results

3.1. Quantitative Findings

As of February 2025, we observed 58 guidance documents published across 14 regulators, although the number varies by regulator. The United States Food and Drug Administration (US FDA) has released the greatest number of guidance documents (n = 13; 4 draft, 9 final), followed by the European Medicines Agency (EMA), the National Medical Products Administration/China Center for Drug Evaluation (NMPA/CDE‐China), and the Pharmaceuticals and Medical Devices Agency (PMDA‐Japan), which have each released seven (n = 7) guidance documents. Four (4) of the 14 regulators (US FDA, EMA, Taiwan FDA, Brazil ANVISA) have directly defined and aligned around the use of at least two (2) of the three stated key terms (reliability, relevance, and quality).

3.2. Qualitative Findings: Definitional Alignment

We observed signs of definitional alignment among the subset of regulators that have formally defined at least two of the three following terms: relevance, reliability, and/or quality (see Appendix A and Table 1). The US FDA and ANVISA have defined data “relevance” as inclusive of “sufficient numbers of representative patients for the study” and “data is robust and representative,” thus indicating potential definitional alignment.

TABLE 1.

Proposed areas for harmonization and opportunities to address signs of potential misalignment among regulators globally concerning the relevance, reliability, and quality of real‐world data and evidence.

Term Proposed areas for harmonization and opportunities to address potential misalignment
Relevance

Proposed areas for alignment
Data representativeness:
  • Sufficient numbers of representative patients.

  • Data is robust and representative, thus being able to address the regulatory problem in the clinical context of interest.

Research and regulatory concern:
  • Dataset presents the data elements useful to answer a given research question.

  • Generate valid evidence informing a specific research question based on the study design.

  • Relevance of the data to a specific regulatory purpose.

Opportunity to address signs of potential misalignment
  • Identify and discuss scenarios where a problem driven by clinical context is observed and where data availability and representativeness are necessary to ensure data is relevant to address the observed problem.

Reliability

Proposed areas for alignment
Accuracy in Data Interpretation:
  • Degree to which data are accurate or correctly representing an observed reality.

  • How correct and true the data are.

Quality and Integrity During Data Accrual:
  • Data accuracy, completeness, provenance, and traceability.

  • Accrual, quality, and integrity.

  • Affected by the source and quality of the data, and the assessment aspects include data accrual.

  • Adequate data quality control or data assurance.

  • Description of how data is collected.

  • During data accrual and analysis, operations and processes to ensure that errors are minimized.

  • Quality and completeness of the data collected are adequate.

Opportunity to address signs of potential misalignment
  • Clarify if data representation is a form of actively ensuring either or both data reliability and relevance to address a problem driven by clinical context, particularly given observed emerging alignment around data representativeness being a form of actively ensuring data relevance.

Quality

Proposed areas for alignment
Data Quality Assurance Across Sites and Time:
  • Assessment of completeness, accuracy, and consistency across sites and over time.

  • Must consider completeness, accuracy, consistency, and transparency of the data.

  • High quality data is data that presents, in every aspect of its origin, clarity and traceability.

Opportunity to address signs of potential misalignment
  • Determine whether the “extent to which a dataset presents the data elements useful to answer a given research question” and where “data quality is relative to the research question and does not address the question on what level is the quality measured,” serve as functions to ensuring ‘relevance’ versus ‘quality.’

  • Determine if the act of assessing whether a study sample size and/or dataset, as well as the level to which the quality of data are measured, are adequate to address a given study question is operationalizable to support data reliability, relevance, and/or quality.

Regarding “reliability,” the EMA includes “accuracy” and “representation” in its definition of reliability, as does the US FDA in one of its definitions (“data accuracy, completeness, provenance, and traceability”). “Quality” is also a consideration in the EMA, Taiwan FDA, and one of the US FDA's definitions of reliability, and the US FDA and Taiwan FDA consider quality within the scope of data accrual.

For the term “quality” as a standalone definition, the Taiwan FDA advises sponsors to “take into account completeness, accuracy, consistency, and transparency of the data.” The US FDA notes this in its definition of “quality,” such that quality would involve an “assessment of completeness, accuracy, and consistency across sites and over time.” Likewise, ANVISA notes regarding quality, “data that presents, in every aspect of its origin, clarity and traceability, and it is also auditable.”
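As a rough illustration of how the overlap in these "quality" definitions could be tabulated, the sketch below compares the descriptors quoted above using set intersection. It is not the authors' method, which was a qualitative reading of the definitions. The descriptor sets are taken from the quotations in this section, with "auditability" being our normalization of ANVISA's "auditable"; the function name is an assumption made for this sketch.

```python
from itertools import combinations

# Descriptors quoted in each regulator's "quality" definition (see Appendix A).
# "auditability" is a normalization of ANVISA's wording "it is also auditable".
quality_descriptors = {
    "US FDA": {"completeness", "accuracy", "consistency"},
    "Taiwan FDA": {"completeness", "accuracy", "consistency", "transparency"},
    "Brazil ANVISA": {"clarity", "traceability", "auditability"},
}


def pairwise_overlap(descriptors):
    """Map each regulator pair to the sorted descriptors both definitions share."""
    return {
        (a, b): sorted(descriptors[a] & descriptors[b])
        for a, b in combinations(sorted(descriptors), 2)
    }


overlap = pairwise_overlap(quality_descriptors)
for pair, shared in overlap.items():
    print(pair, shared)
```

Literal term matching of this kind is a crude proxy. It finds the three descriptors shared by the US FDA and Taiwan FDA but no overlap with ANVISA, whose wording differs even where the underlying concepts may be related, which is why the assessment in this report relies on expert interpretation.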

3.3. Qualitative Findings: Definitional Misalignment

Despite signs of alignment, we observed areas of possible misalignment for these terms among the subset of regulators (see Appendix A). We observed that the Taiwan FDA and ANVISA have defined data “relevance” as specific to a regulatory purpose or problem driven by clinical context, while the US FDA and EMA define “relevance” as specific to a research question and involving aspects of data availability and representativeness, thus indicating potential definitional misalignment.

The Taiwan FDA considers quality control and quality/data assurance in its definition of reliability. The EMA includes “accuracy” and “representation” in its definition of reliability, though, as mentioned above, both the US FDA and ANVISA include “representative” within their definitions of “relevance.” ANVISA has yet to formally or clearly define “reliability” as a key term.

Regarding the term “quality,” the EMA includes “extent to which a dataset presents the data elements useful to answer a given research question” and “data quality is relative to the research question and does not address the question on what level is the quality measured” in its definitions of “relevance” and “quality,” respectively. Yet, the US FDA includes “study sample size should be adequate to address the study question” in its definition of “quality,” thus indicating not only differences in these definitions themselves, but also differences in quality measurement considerations (i.e., sample size).

In Table 1, we propose areas for harmonization based on our assessment of where regulators seem to align most and highlight opportunities to address misalignment.

4. Conclusion

Overall, our findings indicate opportunities for regulators to consider our proposed areas of alignment and discuss opportunities to address signs of potential misalignment. We believe this work is critical to continue to help shape national and international RWE regulatory policy conversations; guide operational goals and standards for RWE consideration broadly among regulators, health technology assessment bodies, and payers; and, importantly, support researchers and other regulatory stakeholders seeking to identify, in real‐time, global regulatory policy developments focused on RWE [6].

4.1. Plain Language Summary

Harmonization around the consideration and use of real‐world evidence (RWE) is a stated goal among regulators globally. As a helpful step towards achieving this goal, we collated, quantified, and evaluated regulatory guidance documents and definitions of key terms used among the community to describe fit‐for‐use RWE (relevance, reliability, quality) provided in regulatory submissions. The United States Food and Drug Administration has released the most guidance documents to date (n = 13; 4 draft, 9 final). Only four (4) of these regulators (US FDA, EMA, Taiwan FDA, Brazil ANVISA), however, have directly defined and aligned around the use of at least two (2) of the aforementioned key terms. Our assessment of definitions provided among these four regulators leads us to propose distinct areas for harmonization based on our assessment of where regulators appear to align and highlight opportunities to address misalignment.

Ethics Statement

The authors have nothing to report.

Conflicts of Interest

Dr. McClellan is an independent director on the boards of Johnson & Johnson, Cigna, Alignment Healthcare, and PrognomIQ; co‐chairs the Guiding Committee for the Health Care Payment Learning and Action Network; and receives fees for serving as an advisor for Arsenal Capital Partners, Blackstone Life Sciences, and MITRE.

Acknowledgments

We formally acknowledge and thank Dr. Eric Monson within the Duke University Libraries' Center for Data and Visualization Sciences for providing close data visualization guidance and assistance towards the development of the Dashboard. We also formally acknowledge and thank Adam Aten, former Duke‐Margolis Institute for Health Policy staff, for his early thought and contributions towards the development of this work and Dr. Trevan Locke and Matt D'Ambrosio within the Duke‐Margolis Institute for Health Policy for their ongoing support towards engagement with our Duke‐Margolis RWE Collaborative members to support this work. We also acknowledge and thank Rachel Neha Shaw for her efforts in supporting this work during its initial phases. Last but not least, we formally acknowledge and thank our Duke‐Margolis RWE Collaborative members who have contributed to the successful development and maintenance of the Duke‐Margolis International Harmonization of RWE Standards Dashboard.

Appendix A.

See Table A1.

TABLE A1.

Overview of definitions of relevance, reliability, and quality among a subset of medical product regulators globally with definitions for at least two key fit‐for‐purpose terms (n = 4).

Regulatory agency Definition source Definition of relevance, reliability, and quality as fit‐for‐purpose terms a
United States (US) Food and Drug Administration (FDA)

Real‐World Data: Assessing Electronic Health Records and Medical Claims Data To Support Regulatory Decision Making for Drug and Biological Products (“EHR and claims final guidance”)

Use of Real‐World Evidence to Support Regulatory Decision‐Making for Medical Devices (“device draft guidance”)

Relevance: “… the availability of key data elements (exposure, outcomes, covariates) and sufficient numbers of representative patients for the study” (page 3; in the EHR and claims final guidance)

Reliability: “… data accuracy, completeness, provenance, and traceability” (page 3; EHR and claims final guidance) and “… consideration of accrual, quality, and integrity of RWD” (lines 462–464; device draft guidance)

Quality: “Quality control processes; Assessment of completeness, accuracy, and consistency across sites and over time; Study sample size should be adequate to address the study question. Establishment and adherence to data collection, recording, and source verification procedures; Adequate patient protections (e.g., methods to protect the privacy of individuals' health data and adherence to applicable privacy and ethics standards) established in advance of executing the study protocol; Prior demonstration of RWE generation from the data source.” (pages 16–18; device draft guidance)

European (EU) Medicines Agency

Data Quality Framework for EU medicines regulation

Relevance: “… the extent to which a dataset presents the data elements useful to answer a given research question. This definition is narrower and more data‐focused than the more commonly understood meaning of ‘relevance’ (i.e., relevance of a data source to generate valid evidence informing a specific research question based on the study design)” (pages 25–26)

Reliability: “The reliability dimension answers the question: to what degree are data accurate or correctly representing an observed reality? When considering the “fit‐for‐purpose” definition of quality, reliability covers how correct and true the data are.” Subdimensions of reliability include: precision, accuracy and plausibility. (page 15)

Quality: “… fitness for purpose for users' needs in relation to health research, policymaking, and regulation and that the data reflect the reality, which they aim to represent. Data quality is relative to the research question and does not address the question on what level is the quality measured e.g., variable, data source or institutional level. These aspects are addressed in the data quality determinants and dimensions of data quality” (page 39)

Taiwan FDA

Real‐World Data—Relevance and Reliability Assessment Considerations (official English language translation)

Relevance: “… the degree of fit between the collected data and the study objectives, i.e., the relevance of the data to a specific regulatory purpose” (page 12)

Reliability: “… affected by the source and quality of the data, and the assessment aspects include data accrual and data quality control or data assurance. The assessment considerations for reliability includes as follows: how is the data collected? During data accrual and analysis, do operators and processes ensure that errors are minimized? and is the quality and completeness of the data collected adequate? In other words, it involves determining if there is adequate data quality control or data assurance” (page 16)

Quality: “… must take into account completeness, accuracy, consistency, and transparency of the data” (page 36)

Brazilian Health Regulatory Agency (ANVISA)

Best practices guide for real‐world data studies (official English language translation)

Relevance: “… if the data is robust and representative, thus being able to address the regulatory problem in the clinical context of interest” (lines 304–306)

Reliability: Definition undetermined.

Quality: “High quality data is data that presents, in every aspect of its origin, clarity and traceability, and it is also auditable. Quality assessment must be carried out systematically, with predefined and determined structures, such as, for example, reliability between two or more evaluators. Furthermore, it must be comparable in relation to references” (page 17)

a Areas in bold within each definition indicate opportunities for definitional alignment, respectively.

Funding: The authors received no specific funding for this work.

References


Articles from Pharmacoepidemiology and Drug Safety are provided here courtesy of Wiley
