Future Oncology. 2025 Sep 1;21(23):3075–3089. doi: 10.1080/14796694.2025.2552098

Finding the right tool for the specific task: navigating RWE tools and checklists

Richard Willke a, Paul Cottu b,c, Andrew Briggs d, Uwe Siebert e,f,g, Connie Chen h, Beata Korytowsky h, Julien Heidt i, Meghan Renfrow i, Kate Lovett i, Adam Brufsky j
PMCID: PMC12490408  PMID: 40888174

ABSTRACT

Real-world evidence (RWE) is increasingly used to support product approvals and label expansions, as well as clinical and payer decision-making. Various tools (e.g. frameworks, checklists) have been developed to help inform and assess the robustness and quality of real-world study design and reporting. This targeted review provides a practical guide for leveraging these tools to increase awareness and utility for decision-makers. A pre-defined search strategy was applied to identify articles from PubMed. Articles published from 1 January 2020 through 4 October 2024 were included and reviewed to identify relevant tools aimed at supporting RWE study planning, reporting, or quality assessment. Key information regarding each tool was extracted and summarized, including strengths, limitations, and covered domains. In total, 119 articles were initially identified, of which 15 were included after screening, referencing a total of 17 tools. These 17 tools varied in format and structure, ranging from detailed guidelines and templates to checklists and questionnaires. The utility and application of the tools identified in this targeted review vary across the evaluation of study planning, reporting, and quality. Selection of the appropriate tool depends on several factors, including the intended purpose of the tool, the intended real-world study design, and the availability of study documentation.

KEYWORDS: Real-world evidence, real-world data, payer decision-making, health technology assessment, frameworks, checklists, decision-making tools, reporting standards

1. Background

Real-world data (RWD) refers to data on patient health status and healthcare delivery that is routinely collected from various sources, such as electronic health records (EHR), insurance claims, and patient registries. Real-world evidence (RWE) is derived from the analysis of RWD and provides insights into the effectiveness and safety of medical products in real-world settings [1]. The role of observational RWE in healthcare decision-making, as a complement to randomized controlled trials (RCTs), is continuously evolving [1–3]. Traditional RCTs may limit the population under investigation with narrow eligibility criteria, which can reduce the generalizability of findings for certain patient subgroups, such as those with extensive comorbidities [4]. In response, key global regulators, payers, and health technology assessment (HTA) bodies have provided guidance on best practices for utilizing RWE [5,6]. Such guidance highlights the potential for RWE to add value across the product lifecycle, including supporting new product approvals or label expansions and increasing the likelihood of reimbursement. This is particularly critical in complex therapeutic areas like oncology, where RWE can facilitate the inclusion of broader, more heterogeneous patient populations affected by diverse cancer types [4]. As the use of RWE to inform decision-making expands across therapeutic areas, a wider range of stakeholders is being called upon to engage with, assess, and apply RWE findings in decision-making processes. However, inconsistencies in RWE quality, study design, and reporting, as well as unmeasured confounding, can undermine trust and hinder utility, especially in critical areas like patient safety and treatment efficacy [3].

Recent policy developments have underscored the growing importance of RWE in healthcare decision-making. Notably, the 21st Century Cures Act, enacted in 2016, aimed to accelerate medical product development and bring innovations to patients more efficiently [7]. In alignment with this goal, the US FDA introduced its RWE framework in 2018 to guide the use of RWE in regulatory decisions, particularly for approving new indications for already approved drugs and fulfilling post-approval study requirements. This framework emphasized the importance of fit-for-purpose RWD and accelerated the need for tools that operationalize relevant guidance to support RWE generation [8]. Beyond the US, global health authorities and reimbursement agencies, including the European Medicines Agency (EMA) and HTA bodies such as Canada’s Drug Agency (CDA-AMC, formerly CADTH), the National Institute for Health and Care Excellence (NICE), and the Institute for Clinical and Economic Review (ICER), have also provided guidance on ensuring and assessing the relevance and reliability of RWD for regulatory submissions and reimbursement decisions [9–13].

The importance of RWE in oncology is particularly notable due to its growing role in regulatory and payer assessment of oncology therapies. The FDA’s Oncology Center of Excellence has advanced efforts to incorporate RWE into oncology regulatory decisions. This includes using RWE to complement traditional clinical trial data, thereby accelerating the approval process for new oncology drugs and expanding indications for existing therapies [14]. Assessing the practical application of real-world endpoints is a key part of this effort. Organizations such as Friends of Cancer Research have played a vital role in improving the quality of oncology RWE by developing methodological recommendations and frameworks for assessing real-world endpoints. Their efforts have provided valuable insights and strategies to encourage the utilization of RWE in oncology research [15]. By leveraging RWE, stakeholders in oncology can make more informed decisions that reflect the complexities of real-world patient populations and treatment settings. This not only enhances the regulatory and HTA review processes but also ensures that patients receive timely access to safe and effective cancer treatments.

The growing demand for RWE in oncology and the development of various assessment and study design guidance tools mark an exciting shift in the field. However, the wide range of tools with different purposes, quality, and expertise requirements highlights the need for standardization and guidance on practical implementation. Although multiple checklists, frameworks, and tools exist to help assess and interpret RWE, there may be low awareness of their availability and a lack of direction on selecting the most appropriate tool, leading to their underutilization. For example, multiple tools are available to guide the design and reporting of real-world oncology studies, one of which is the European Society for Medical Oncology’s Guidance for Reporting Oncology Real-World Evidence (ESMO-GROW) [3,16]. It can be challenging to understand when, or if, one tool may be more appropriate than another. Furthermore, in a recent survey of health plan pharmacy and medical directors, approximately half noted a lack of experience interpreting RWE studies [17,18]. As such, there is a need for guidance to support healthcare providers (HCPs), payers, and researchers in better navigating the evolving landscape of tools for evaluating RWE. This targeted review aimed to identify and summarize these tools, including those applicable to oncology, to aid decision-makers (e.g., HCPs, clinical guideline developers, payers), across the spectrum of experience with RWE, in their real-world application.

2. Methods

2.1. Study design

A targeted review of the peer-reviewed literature (1 January 2020 – 4 October 2024) was conducted in PubMed to identify relevant RWE tools. PubMed was chosen as the sole data source for this review because it is a comprehensive and widely recognized database that indexes a vast array of high-impact, peer-reviewed journals across multiple disciplines. We did not anticipate that expanding beyond PubMed would yield a substantially larger body of relevant literature. The review comprised the following steps to identify the tools for inclusion: 1) search strategy development, 2) article selection and tool identification, and 3) article review.

2.2. Search strategy

A list of search terms was developed to identify relevant published literature, including “checklist,” “tool,” “real-world,” “observational,” and “rwe” (for the full search string, see supplemental material). The search criteria excluded certain study types (i.e., case reports, clinical trials, complementary therapies, protocols) and specific topics not related to oncology (i.e., air pollutants, education, sports nutrition, COVID-19, drug-related side effects, adverse event reporting systems, pharmacokinetics, and safety) from the general pool of results. The aim of this search strategy was to include both tools specific to RWE in oncology and those for broader therapy areas, while excluding those with a specific focus on non-oncology therapy areas. If newer tools were based on tools developed prior to the article selection window, the original tools were identified and included (information on the date of original publication was captured).
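For illustration only, the sketch below shows how such a date-bounded PubMed query could be run programmatically, assuming Biopython’s Entrez module. The term shown is a simplified stand-in for the full search string in the supplemental material, and the email address is a placeholder.

```python
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # placeholder; NCBI requires a contact address

# Simplified stand-in for the review's full search string (see supplement).
term = (
    '("real-world"[Title/Abstract] OR observational[Title/Abstract] OR rwe[Title/Abstract]) '
    'AND (checklist[Title/Abstract] OR tool[Title/Abstract]) '
    'NOT ("case reports"[Publication Type] OR "clinical trial"[Publication Type])'
)

# Restrict to the review's publication-date window (1 Jan 2020 - 4 Oct 2024).
handle = Entrez.esearch(
    db="pubmed", term=term,
    datetype="pdat", mindate="2020/01/01", maxdate="2024/10/04",
    retmax=200,
)
record = Entrez.read(handle)
handle.close()
print(record["Count"], record["IdList"][:5])  # number of hits and first PMIDs
```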

2.2.1. Article selection and tool identification

A single independent reviewer (with no adjudication) operationalized the following inclusion criteria:

  • Articles that described RWE tools, or were cited within relevant review articles (ancestral search) AND

  • Articles that detailed tools used in the design, conduct, and/or reporting of RWE studies

Articles were excluded based on the following exclusion criteria:

  • Articles that are review papers and do not describe an RWE tool OR

  • Articles not available in English OR

  • Articles that are disease-specific and not related to oncology (e.g., The OHStat Guidelines for Reporting Observational Studies and Clinical Trials in Oral Health Research: Manuscript Checklist) OR

  • Articles detailing tools that originate from HTA or regulatory guidance and are country-specific, unless the tool(s) have been used and recognized on a global level (e.g., by ISPOR – The Professional Society for Health Economics and Outcomes Research)

Next, tools mentioned within the included articles were assessed for eligibility based on the following inclusion criteria:

  • Tools applicable to the design, reporting, and/or quality assessment of RWE studies

Tools were excluded based on the following criteria:

  • Tools not applicable to the design, reporting, and/or quality assessment of RWE studies OR

  • Tools that are disease-specific and not related to oncology OR

  • Tools that originate from HTA or regulatory guidance and are country-specific, unless they have been used and recognized on a global level (i.e., ISPOR) OR

  • Tools applicable only to the design of meta-analyses or systematic reviews

Given that most RWE tools are designed to be broadly applicable across various audiences, articles and tools were neither included nor excluded based on the intended user. Similarly, the format of the tools was not a criterion for inclusion or exclusion but was instead evaluated during the article review process. A minimal sketch of this screening logic is shown below.
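The sketch encodes the criteria above as a simple filter. The record fields are hypothetical labels for the criteria, not fields from the authors’ actual screening form.

```python
from dataclasses import dataclass

@dataclass
class Article:
    """Hypothetical screening record; field names are illustrative only."""
    describes_rwe_tool: bool                 # describes an RWE tool or is cited in a relevant review
    details_design_conduct_reporting: bool   # details tools for design, conduct, and/or reporting
    is_review_without_tool: bool
    in_english: bool
    disease_specific_non_oncology: bool
    country_specific_hta_guidance: bool
    globally_recognized: bool                # e.g., ISPOR tools are retained despite country-specific origin

def include_article(a: Article) -> bool:
    # Both inclusion criteria must hold (AND).
    if not (a.describes_rwe_tool and a.details_design_conduct_reporting):
        return False
    # Any single exclusion criterion is sufficient to exclude (OR).
    if a.is_review_without_tool or not a.in_english or a.disease_specific_non_oncology:
        return False
    if a.country_specific_hta_guidance and not a.globally_recognized:
        return False
    return True
```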

2.2.2. Article review

Once articles were identified and selected, a single independent reviewer assessed each manuscript and extracted key information based upon a priori specified assessment criteria using a standardized extraction form. The data extraction outputs were then reviewed and quality checked by at least one additional reviewer, independent of the original extractor, to confirm accuracy, ensure consistency, and reduce subjective influence. This process helped mitigate biases, such as confirmation and reviewer bias, by involving two independent reviewers.

To better understand the potential use cases and intended audience(s), basic information was captured (e.g., date of publication, overall structure [number of items, format]) as well as the categories and topics covered by each tool. Based on each tool’s domains and stated purpose (if applicable), tools were categorized by intended use case throughout the RWE study lifecycle: 1) protocol development, 2) study reporting, or 3) quality assessment. Initial use case categorization was performed by the primary reviewer and finalized through uniform consensus among the expert review team. Areas of disagreement were generally limited to tools with overlapping domains or ambiguous stated purposes and were resolved through discussion and consensus.

Likewise, when strengths and limitations were not explicitly stated in the source literature, comparative strengths and weaknesses were inferred by the primary reviewer and adjudicated by additional reviewers using a pre-defined categorization. This categorization included scoring methodology (e.g., binary vs. scaled scoring), length (i.e., number of items included within the tool), whether or not the tool was formally validated, notable domain coverage (i.e., categories/topics addressed), as well as the degree of tool complexity relative to one another. Assessment of tool complexity was based on criteria such as the expertise required for implementation (e.g., whether specialized training in epidemiology or biostatistics was needed) and the estimated time for completion, based on the number of items, complexity of instructions, etc. These criteria were applied uniformly across all tools, regardless of whether explicit strengths or limitations were reported in the source literature. However, this assessment process was not systematic or validated, and reflects a targeted approach informed by expert judgment. All categorization decisions were tracked in a centralized Excel spreadsheet, which was accessible to the full review team to ensure transparency and facilitate iterative updates. Tables 1–3 describe and summarize these characteristics.
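To make the extraction form concrete, the following is a minimal sketch of one extraction record mirroring the columns of Tables 1–3. The class and field names are our own simplification, not the authors’ actual form.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ToolRecord:
    """Hypothetical extraction record mirroring the columns of Tables 1-3."""
    name: str
    format: str                        # checklist, template, framework, questionnaire, or guideline
    n_items: Optional[int]             # None where not applicable (e.g., guideline documents)
    year_published: int
    year_last_revised: Optional[int]
    formally_validated: bool
    strengths: list[str] = field(default_factory=list)
    weaknesses: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)
    use_cases: set[str] = field(default_factory=set)  # non-exclusive use cases

# Example row transcribed from Table 1:
harper = ToolRecord(
    name="HARPER", format="Template", n_items=24,
    year_published=2022, year_last_revised=None, formally_validated=False,
    use_cases={"protocol development", "study reporting"},
)
```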

Table 1.

Tools to support protocol development.

  Characteristic
Most* useful when …
 
Tool Format Length (# of items) Year Published (Year Last Revised) Formally validated/tested? Strength(s) Weakness(es) Notable domains as described within the tool (non-exhaustive)* … structured protocol template is needed … seeking guidance on data feasibility requirements … high complexity of study design and/or data sources Ref
European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) Checklist for Study Protocols Checklist (Yes/No) 68 2011 (2018) No • Compatible with multiple RWE study designs (e.g., exploratory, descriptive, prediction) and data sources • May require referring to a supplemental resource (ENCePP Guide on Methodological Standards in Pharmacoepidemiology) • Study design, source, & populations
• Exposure/outcome definitions & measurement
• Bias & effect measure modification
• Data sources
• Analysis plan
• Data management & quality control
• Communication of study results
    x [21]
European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) Guide on Methodological Standards in Pharmacoepidemiology Guideline recommendations (200 pages) N/A 2011 (2023) No • Periodically updated (11 times to date) to reflect latest methodological advancements
• Addresses novel study designs (e.g., target trial emulation)
• High implementation effort due to length of document • Study protocol & design
• Definition & validation of drug exposure, outcomes & covariates
• Methods to address bias & confounding
• Effect modification & interaction
• Research networks for multi-database studies
• Systematic reviews & meta-analysis
• Signal detection methodology & application
• Statistical analysis
• Quality management
• Communication of study results
  x x [22]
HARmonized Protocol Template to Enhance Reproducibility of Hypothesis Evaluating Real-World Evidence Studies on Treatment Effects (HARPER) Template 24 2022 No • Combines structured table format with free-text sections to enhance flexibility • Does not cover every aspect of transparency over the lifecycle of a research study (e.g., sharing of protocol, code, data, etc.) • Timeline
• Rationale & background
• Research questions & objectives
• Research methods
x   x [19,20]
International Society for Pharmacoepidemiology Guidelines for Good Pharmacoepidemiology Practice (ISPE GPP) Guideline recommendations N/A 1996 (2015) No • Does not require extensive expertise • Broad guidelines that are not study design-specific • May not reflect latest methodological advancements • Protocol development
• Responsibilities, personnel, facilities, resource commitment, & contractors
• Study conduct
• Communication
• Adverse event reporting
  x   [3,19,23–25]
Structured Template and Reporting Tool for Real-World Evidence (STaRT-RWE) Framework 16 2021 No • Compatible with multiple RWE study designs (e.g., exploratory, descriptive, predictive) and data sources • Focused primarily on study design decisions
• Limited utility for assessing whether data are fit-for-purpose
• Design diagram
• Summary of analytical study population
• Analysis specification
• Sensitivity analyses
• Attrition table
• Power & sample size calculation
x   x [20]

Abbreviation: RWE: Real-world evidence.

*Notable domains are the key areas a tool addresses, providing insight into its scope and applicability. These domains are non-exhaustive, meaning the tool may cover additional areas beyond the main ones listed. The lack of a checkmark beneath the “Most useful when…” columns does not imply that a tool cannot be used for the stated purpose, only that it may not necessarily be the most useful in that particular instance.

Table 2.

Tools to support study reporting.

  Characteristic
Most* useful when …
 
Tool Format Length (# of items) Year Published (Year Last Revised) Formally validated/tested? Strength(s) Weakness(es) Notable domains as described within the tool (non-exhaustive)* … structured protocol template is needed … seeking guidance on data feasibility requirements … high complexity of study design and/or data sources Ref
Assessment of Real-World Observational Studies (ArRoWS) Questionnaire 9 2019 Yes • Low implementation effort (shorter in length compared to other tools) • Potential for ambiguity
• Focused primarily on quality assessment
Core items:
• Target population
• Sample size & power calculation
• Exposure/outcome measures
• Statistical analyses, confounding factors, bias, uncertainty
Study design items:
• Study design, data sources
• Exposure/outcome assessment
• Follow-up period
• Handling of missing data, sensitivity analyses
x   x [26]
European Society for Medical Oncology Guidance for Reporting Oncology Real-World Evidence (ESMO-GROW) Checklist 35 2023 No • Clear, categorical criteria for scoring
• Addresses novel study designs or methods (e.g., use of AI, machine learning, etc.)
• Tailored to oncology studies
• Requires specific expertise (e.g., oncology knowledge) • Introduction
• Methods
• Results
• Discussion and conclusions
• Final considerations
x   x [3,16,27]
HARmonized Protocol Template to Enhance Reproducibility of Hypothesis Evaluating Real-World Evidence Studies on Treatment Effects (HARPER) Template 24 2022 No • Combines structured table format with free-text sections to enhance flexibility • Does not cover every aspect of transparency over the lifecycle of a research study (e.g., sharing of protocol, code, data, etc.) • Timeline
• Rationale & background
• Research questions & objectives
• Research methods
x [19,20]
International Society for Pharmacoepidemiology Guidelines for Good Pharmacoepidemiology Practice (ISPE GPP) Guideline recommendations N/A 1996 (2015) No • Does not require extensive expertise • Broad guidelines that are not method-specific • Only periodically updated and may not reflect latest methodological advancements • Protocol development
• Responsibilities, personnel, facilities, resource commitment, & contractors
• Study conduct
• Communication
• Adverse event reporting
x x   [3,19,23–25]
Structured Template and Reporting Tool for Real-World Evidence (STaRT-RWE) Framework 16 2021 No • Compatible with multiple RWE study designs (e.g., exploratory, descriptive, predictive) and data sources • Focused primarily on study implementation decisions
• May not capture all information needed to assess whether data are fit-for-purpose
• Design diagram
• Summary of analytical study population
• Analysis specification
• Sensitivity analyses
• Attrition table
• Power & sample size calculation
    x [20]
STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) Checklist 22 2007 (2021) No • Widely endorsed
• Standardizes/enhances clarity of observational study reporting
• Limited to three study design types • Introduction
• Methods
• Results
• Discussion
• Other information
x x   [24,28–31]
AI: Artificial intelligence; RWE: Real-world evidence            

*Notable domains are the key areas a tool addresses, providing insight into its scope and applicability. These domains are non-exhaustive, meaning the tool may cover additional areas beyond the main ones listed. The lack of a checkmark beneath the “Most useful when…” columns does not imply that a tool cannot be used for the stated purpose, only that it may not necessarily be the most useful in that particular instance.

Table 3.

Tools to support quality assessment.

  Characteristic
Most* useful when …
 
Tool Format Length (# of items) Year Published (Year Last Revised) Formally validated/tested? Strength(s) Weakness(es) Notable domains as described within the tool (non-exhaustive)* … structured protocol template is needed … seeking guidance on data feasibility requirements … high complexity of study design and/or data sources Ref
Assessment of Real-World Observational Studies (ArRoWS) Questionnaire 9 2019 Yes • Low implementation effort (shorter in length compared to other tools) • Potential for ambiguity due to subjective nature Core items:
• Target population
• Sample size & power calculation
• Exposure/outcome measures
• Statistical analyses, confounding factors, bias, uncertainty
Study design items:
• Study design, data sources
• Exposure/outcome assessment
• Follow-up period
• Handling of missing data, sensitivity analyses
x x   [26]
Data Governance Checklist Checklist 38 2023 (proposed) No • Covers multiple components of data governance (e.g., data privacy, security, management, and access) • Requires expert interpretation (e.g., understanding of data privacy, management, etc.) • Data privacy & security
• Data management & linkage
• Data access management
• Generation & use of RWE
x   x [33]
European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) Guide on Methodological Standards in Pharmacoepidemiology Guideline recommendations (200 pages) N/A 2011 (2023) No • Frequently updated to incorporate the latest methodological advancements
• Addresses novel study designs (e.g., target trial emulation)
• High implementation effort
• Requires specific expertise (e.g., understanding of pharmacoepidemiology)
• Study design, source, & populations
• Exposure/outcome definitions & measurement
• Bias & effect measure modification
• Data sources
• Analysis plan
• Data management & quality control
• Communication of study results
x x   [22]
European Society for Medical Oncology Guidance for Reporting Oncology Real-World Evidence (ESMO-GROW) Checklist 35 2023 No • Clear, categorical criteria for scoring
• Addresses novel study designs or methods (e.g., use of AI, machine learning, etc.)
• Tailored to oncology studies
• Requires specific expertise (e.g., oncology knowledge) • Introduction
• Methods
• Results
• Discussion and conclusions
• Final considerations
x x x [3,16,27]
Good Research for Comparative Effectiveness (GRACE) Checklist Checklist 11 2014 (2016) Yes • Widely endorsed (e.g., NPC, ISPE) • Potential for ambiguity due to subjective nature • Data
• Methods
x     [35]
Grading of Recommendations Assessment, Development, and Evaluation (GRADE) Approach Framework 8 2000 (2013) Yes • Widely endorsed (e.g., WHO, FDA, EMA)
• Criteria definitions provided
• Potential for ambiguity due to subjective nature • Risk of bias
• Strength of recommendation
x   x [34]
International Society for Pharmacoepidemiology Guidelines for Good Pharmacoepidemiology Practice (ISPE GPP) Guideline recommendations N/A 1996 (2015) No • Comprehensive methodological resource
• Does not require extensive expertise
• Broad guidelines that are not method-specific • Only periodically updated and may not reflect latest advancements in pharmacoepidemiology • Protocol development
• Responsibilities, personnel, facilities, resource commitment, & contractors
• Study conduct
• Communication
• Adverse event reporting
x     [3,19,23–25]
International Society for Pharmacoeconomics and Outcomes Research – Academy of Managed Care Pharmacy – National Pharmaceutical Council (ISPOR-AMCP-NPC) Questionnaire (Yes/No/NA) 33 2012 (2017) No • Low implementation effort • Potential for ambiguity due to subjective nature • Relevance
• Reliability & validity
• Data linkages
• Eligibility determination
• Research design
• Treatment effects
• Sample selection
• Censoring
• Variable definition
• Resource validation
• Statistical analysis
• Generalizability
• Data interpretation
x x x [38]
International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Checklist on Retrospective Database Studies Checklist 27 2003 (2013) No • Tailored to retrospective database studies
• Ensures data relevance for broad types of retrospective health-related data sources • Addresses data linkage
• Primarily designed for medical claims or encounter-based databases, limiting applicability to other designs Data delineation:
• Characteristics
• Provenance
• Governance
• Data fitness-for-purpose
• Reliability – completeness, accuracy
• Relevance – data content, care setting & time period, sample size & follow-up period
x x   [37]
International Society for Pharmacoeconomics and Outcomes Research (ISPOR) SUITABILITY Checklist Checklist 24 2024 No • Focused on EHR data for HTAs
• Includes forward-looking recommendations for improving use of EHR data over time
• Limited applicability to RWD types beyond EHR data
• Does not address national or local idiosyncrasies of EHR data systems, processes, and governance
• Does not address data linkage
• Relevance
• Credibility
x x   [36]
National Evaluation System for health Technology Coordinating Center (NESTcc) Data Quality Framework Framework 7 2020 No • Focused on EHR data
• Compatible with multiple RWE study designs (e.g., exploratory, descriptive, predictive) and data sources
• Requires expert interpretation (e.g., data management, EHR data) Data:
• Governance
• Characteristics
• Capture
• Transformation
• Curation
x x   [19]
Risk of Bias in Non-randomized Studies – of Interventions (ROBINS-I) Questionnaire 34 2016 No • Specifically designed to assess non-randomized studies • Potential for ambiguity due to subjective nature Bias:
• Due to confounding
• In selection of participants into the study
• In classification of interventions
• Due to deviations from intended interventions
• Due to missing data
• In measurement of outcomes
• In selection of the reported result
x   x [3,31]
Structured Template and Reporting Tool for Real-World Evidence (STaRT-RWE) Framework 16 2021 No • Compatible with multiple RWE study designs (e.g., exploratory, descriptive, predictive) and data sources • Focused primarily on study implementation decisions
• May not capture all information needed to assess whether data are fit-for-purpose
• Design diagram
• Summary of analytical study population
• Analysis specification
• Sensitivity analyses
• Attrition table
• Power & sample size calculation
x     [20]
Use-case specific Relevance and Quality Assessment (UReQA) Framework Framework 5 2020 (2023) No • Compatible with multiple RWE study designs (e.g., exploratory, descriptive, predictive) and data sources • Potential for ambiguity due to subjective nature • Preassessment
• Data element standardization
• Cohort definition
• Verification and validation
• Benchmarking
x x   [24,32]

*Notable domains are the key areas a tool addresses, providing insight into its scope and applicability. These domains are non-exhaustive, meaning the tool may cover additional areas beyond the main ones listed. The lack of a checkmark beneath the “Most useful when…” columns does not imply that a tool cannot be used for the stated purpose, only that it may not necessarily be the most useful in that particular instance.

Abbreviations: AI: Artificial intelligence; EHR: Electronic health records; FDA: Food and Drug Administration; RWD: Real-world data; RWE: Real-world evidence; WHO: World Health Organization.

3. Results

The application of the aforementioned search strategy yielded 119 articles, 35 of which were included for subsequent article assessment. Of those, 15 articles were ultimately included in the review. Within the 15 selected articles, a total of 17 tools were referenced. These tools varied in terms of format, including checklists, guideline recommendations, templates, frameworks, and questionnaires. Three of the 17 included tools have undergone some form of formal validation, meaning they have been assessed through methods such as inter-rater reliability, content and/or construct validation, sensitivity and specificity testing, and/or practical utility testing. In terms of intended use case, of the 17 included tools, five support protocol development, six support study reporting, and 14 support quality assessment (not mutually exclusive) (Figure 1).

Figure 1. PRISMA flow diagram.

Abbreviations: RWE: Real-world evidence; HTA: Health technology assessment; ISPOR: The Professional Society for Health Economics and Outcomes Research.

3.1. Tools supporting protocol development

The five tools identified for supporting protocol development included: 1) the HARmonized Protocol Template to Enhance Reproducibility of Hypothesis Evaluating Real-World Evidence Studies on Treatment Effects (HARPER) [19,20] (2022), 2) the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) Checklist for Study Protocols [21] (2018), 3) the Structured Template and Reporting Tool for Real-World Evidence (STaRT-RWE) [20] (2021), 4) the ENCePP Guide on Methodological Standards in Pharmacoepidemiology [22] (2023), and 5) the International Society for Pharmacoepidemiology (ISPE) Guidelines for Good Pharmacoepidemiology Practice (GPP) [3,19,23–25] (2015) (Table 1). All of the aforementioned tools included domains related to study design (e.g., research question, data sources, study population) and analytic methods. In particular, the ENCePP Checklist for Study Protocols, a checklist intended to promote transparency, methodological rigor, and regulatory alignment in non-interventional studies, and the ENCePP Guide on Methodological Standards in Pharmacoepidemiology, a comprehensive resource offering methodological guidance for conducting high-quality pharmacoepidemiological studies, include sections focused on data management and quality control, while the ISPE GPP, guideline recommendations aimed at ensuring ethical and transparent conduct of pharmacoepidemiological research across all phases of study set-up, includes a section on adverse event reporting.

The most recently published tool was HARPER, a protocol template designed to improve the clarity and reproducibility of RWE study protocols, developed in 2022 based on a review of prior tools, including STaRT-RWE and the ISPE GPP. HARPER, which combines free text with structured tables, has been pilot tested across a range of study designs and data source types (e.g., active comparator new user design effectiveness study, nested case-control study, oncology drug vs standard of care, pregnancy safety study, and a self-controlled case series). The recency of HARPER, along with the rigorous review methodology through which it was developed, represents a strength compared to older tools such as STaRT-RWE and the ISPE GPP. Additionally, HARPER has been endorsed by ISPE, ISPOR, and the Centers for Medicare & Medicaid Services (CMS) (HARPER+), further adding to its strengths. A limitation of HARPER is that it can be perceived as geared toward studies in the United States (US) or other contexts where specific international frameworks, such as ENCePP, are not mandated. Nonetheless, HARPER has recently been acknowledged in the draft International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) M14 guidance on real-world evidence (issued 30 May 2025). The guidance notes that in the absence of specific regulatory formatting or content requirements, sponsors may use scientifically developed frameworks, such as HARPER, for developing study documents [10]. This suggests that HARPER is considered an acceptable option by ICH, unless a country’s specific regulatory guidance dictates otherwise.

Ultimately, if a study is intended for European stakeholders, the ENCePP checklist and associated guidance may be more appropriate due to their alignment with European Union (EU) expectations and practices. Although both HARPER and ENCePP can be used for a variety of study types, ENCePP has historically been the preferred checklist for pharmacovigilance and post-authorization safety study (PASS) protocols. When selecting a tool, it is important to consider both the intended purpose and geography of the RWE study, as well as any organizational requirements around protocol templates that may exist.

3.2. Tools supporting study reporting

The six tools identified for supporting study reporting included: 1) the Assessment of Real-World Observational Studies (ArRoWS) [26] (2019), 2) ESMO-GROW [3,16,27] (2023), 3) HARPER [19,20] (2022), 4) the ISPE GPP [3,19,23–25] (2015), 5) STaRT-RWE [20] (2021), and 6) STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) [24,28–31] (2021). While there is strong consensus around the key elements to include in study reporting (including details regarding study design and methods/conduct), there remains a general lack of specificity and clarity in the guidelines on how to report these elements, leading to potential misinterpretation [19]. Because results reporting is intrinsically tied to study design and overall quality assessment, there are very few existing tools solely for the purpose of standardizing study reporting. For example, the STROBE checklist [24,28–31], aimed at improving the reporting quality of observational studies, primarily cohort, case-control, and cross-sectional studies, is widely used and endorsed for study reporting but is limited in that it focuses on just those three common study design types (Table 2). STROBE may be less applicable to more innovative study designs (e.g., external comparator arms, hybrid prospective and retrospective data collection), which are becoming increasingly common in oncology and rare disease, where unmet need is high and sample sizes are small. While the majority of tools for study reporting can be applied to many study design types, the ESMO-GROW checklist [3,16,27], a comprehensive checklist for reporting observational and non-interventional oncology RWE studies, is tailored to the complexities of oncology, such as rapidly evolving treatments and heterogeneous patient populations. It also accommodates oncology-specific data sources such as tumor registries and biobanks. This oncology-specific focus may limit the applicability of ESMO-GROW to other therapeutic areas and requires a certain level of familiarity with oncology-specific concepts for successful implementation across all 35 checklist items. In contrast, the ArRoWS questionnaire, designed to ensure the quality and transparency of RWE studies, is useful for supporting study reporting across a broad range of indications but is relatively short (nine questions). While the ArRoWS questionnaire is broad in its coverage and focuses on informing decision making, the limited number of questions opens the door to ambiguity in interpretation that could make the operationalization of this tool less practical than others. This demonstrates the important trade-off between length, breadth, and interpretability. Meanwhile, both HARPER and STaRT-RWE offer frameworks that complement existing checklists by using tabular and visual formats to minimize ambiguous prose and the potential for misinterpretation [19]. HARPER may be preferred due to both its format flexibility and recency. Selecting the appropriate tool will depend on the indication of interest, the complexity and innovativeness of the study design, the type of tool (checklist vs questionnaire), and the decision maker’s technical expertise with RWE (Table 2).

3.3. Tools supporting quality assessment

There were 14 tools identified for quality assessment (Table 3). In general, these tools aim to assess the risk of bias resulting from the data source and/or the study methodology. Among the tools, there was heterogeneity with regard to format, recency, and length. Despite this heterogeneity, the most common domains were those focused on study design, bias and confounding, and data management and quality. Assessing the quality of RWE requires transparency and access to information. For example, certain tools, such as the ISPE GPP [3,19,23–25] and the ENCePP Guide on Methodological Standards in Pharmacoepidemiology, may require more extensive documentation to complete than others; this is an important consideration when selecting a tool. Some tools focus on the underlying data quality/management (data governance, transformation, curation, standardization), including the Use-case specific Relevance and Quality Assessment (UReQA) framework [24,32], a structured framework for evaluating data relevance and quality in RWE studies; the National Evaluation System for health Technology Coordinating Center (NESTcc) Data Quality Framework [19], a systematic framework for assessing RWD for decision-making; and the Data Governance (DG) Checklist [33], a checklist supporting decision-making on data privacy, security, management, and access. Other tools focus on the study design and analytic methods (bias due to confounding, missing data, misclassification, outcome measurement), such as the Risk of Bias in Non-randomized Studies – of Interventions (ROBINS-I) tool [3,31], a structured questionnaire for assessing bias in non-randomized interventional studies (Table 3). Given that these tools aim to evaluate quality from different perspectives, it may be beneficial to use more than one tool to assess different domains if time and resources permit.

There are multiple reasons why a decision maker may select one tool for quality assessment over another. Although the UReQA framework (2023) appears to have a limited number of “items,” it actually comprises five interlinked, iterative steps that are relatively labor-intensive (preassessment, data element standardization, cohort definition, verification and validation, and benchmarking). This framework has a strong emphasis on data relevance and incorporates multiple stakeholder perspectives to accommodate various types of RWE studies and data types; however, the framework is rather general and lacks customization for specific therapeutic areas (unlike ESMO-GROW or ArRoWS, for example). UReQA also lacks explicit guidance for evaluating studies using certain analytic methods (such as machine learning or predictive analytics). In contrast, the ROBINS-I questionnaire is structured and systematic, facilitating consistent and transparent evaluation of bias risk. The language in the questionnaire is aligned with the terminology of causal inference (e.g., target trial). While useful for studies intended to assess causal effects, substantial expertise in epidemiology is required to understand the potential biases that may be present in an RWE study. Additionally, its strong focus on internal validity means this questionnaire may need to be supplemented with tools like UReQA or ArRoWS that assess the relevance and applicability of the evidence.

While less recent, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach (2013), a systematic method for assessing the certainty of evidence and strength of recommendations in healthcare, is widely used and endorsed for quality assessment [34]. A notable feature of GRADE is its explicitly defined criteria for upgrading or downgrading evidence, which adds to its utility for decision making; however, the criteria for upgrading evidence from RWE (e.g., large effect sizes) may be challenging to apply consistently in observational studies, as GRADE was originally designed for RCTs. The Good Research for Comparative Effectiveness (GRACE) checklist (2016), an 11-item checklist for evaluating the quality and reliability of observational studies in comparative effectiveness research, does not include a mechanism for upgrading or downgrading evidence, but it is more tailored to observational research, as it is primarily designed for comparative effectiveness studies and places more emphasis on external validity and generalizability [35]. This reinforces that internal vs external validity is a key differentiator between the purpose and application of these different tools.

The DG checklist (2023) aims to promote the compliance of RWE studies by addressing key aspects such as data privacy, security, management, and access. This checklist was developed based on a literature review and a three-round Delphi panel incorporating multi-stakeholder perspectives. The checklist items cover topics such as data privacy and security, data management and linkage, data access, and RWE generation. Each checklist item can receive a “yes” or a “no,” which then translates to a quantitative score corresponding to “excellent,” “acceptable,” or “low” quality [33]. This binary, quantitative scoring mechanism is simultaneously a strength and a limitation, as it is a very structured and straightforward mechanism to support decision making. The DG checklist is a good tool for users seeking fast insights into whether minimum-quality governance has been achieved for the respective dataset or RWE study; however, it assumes that certain documentation on the study conduct and data source is available (e.g., data access agreements, relevant standard operating procedures (SOPs), etc.), which may not always be the case.
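To make this kind of scoring concrete, the sketch below converts binary answers into a quantitative score and quality band in the spirit of the DG checklist. The item labels, equal weighting, and thresholds are invented for illustration and are not taken from the published checklist [33].

```python
# Hypothetical subset of yes/no governance items; labels are illustrative only.
ANSWERS = {
    "data_privacy_policy_documented": True,
    "security_controls_in_place": True,
    "data_linkage_methods_described": False,
    "access_agreements_available": True,
}

def dg_quality(answers: dict[str, bool]) -> tuple[float, str]:
    """Convert yes/no answers into a percentage score and a quality band."""
    score = 100 * sum(answers.values()) / len(answers)
    if score >= 90:          # band thresholds are illustrative assumptions
        band = "excellent"
    elif score >= 70:
        band = "acceptable"
    else:
        band = "low"
    return score, band

print(dg_quality(ANSWERS))   # (75.0, 'acceptable')
```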

The NESTcc Data Quality Framework (2020) is designed to ensure the quality of RWD for decision making, and while comprehensive in scope and applicable to diverse stakeholders, it primarily addresses EHR data [19]. This is also true of the ISPOR SUITABILITY Checklist (2024), a list of recommendations created for use in HTAs to evaluate the relevance, completeness, accuracy, timeliness, and generalizability of EHR data. For a study based solely upon an EHR data source, either of these tools would be useful in assessing relevance and reliability, despite differing in recency and format [36].

In addition to the ISPOR SUITABILITY Checklist, our targeted review identified two other ISPOR-sponsored checklists for RWE quality assessment. The ISPOR Checklist on Retrospective Database Studies (2013) is less recent and, although applicable to multiple types of retrospective data, was developed primarily for medical claims or encounter-based databases in order to support quality assessment [37]. The checklist was written in the form of 27 questions to guide decision makers as they consider the database, study methodology, and study conclusions. A unique and helpful feature of this checklist is that key references are provided for many of the questions. This helps users obtain more context on the specific items and enhances understanding. The ISPOR – Academy of Managed Care Pharmacy – National Pharmaceutical Council (ISPOR-AMCP-NPC) Questionnaire (2017) is similar in format but is framed as a questionnaire rather than a checklist and aims to assess the relevance and credibility of observational studies in healthcare decision-making by providing a structured approach to evaluating study design, data integrity, and result interpretation. The rationale for this framing is that checklists might mislead users if a study satisfies nearly all of the elements of a checklist and yet still harbors “fatal flaws” (defined as design, execution, or analysis elements of the study that by themselves may significantly undermine the validity of the results) [38]. The ISPOR-AMCP-NPC questionnaire was developed based on a review of items in previous questionnaires and guidance recommendations, previous ISPOR Task Force recommendations, as well as methods and reporting guidance documents (including GRADE, STROBE, and ENCePP). The questionnaire is divided into two domains: 1) relevance and 2) credibility. Although “relevance” is a term that is regularly used as a key dimension of data quality, the use of “credibility” is less standardized and accepted, demonstrating how some aspects of this questionnaire may be out of date and misaligned with present terminology.

There are various elements to consider when selecting a tool, including its recency, the need to assess internal vs external validity, the type of data (EHR, claims, other), and the desired format of the output (quantitative score vs qualitative insights).
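As a purely illustrative aid, those selection factors can be expressed as a simple filter over a miniature tool registry. The entries and attribute values below are our own rough distillation of Table 3, not an authoritative encoding of the review’s recommendations.

```python
# Hypothetical mini-registry distilled from Table 3; attributes are simplified.
TOOLS = [
    {"name": "ROBINS-I",         "focus": "internal validity", "data": "any",    "output": "qualitative"},
    {"name": "GRACE Checklist",  "focus": "external validity", "data": "any",    "output": "qualitative"},
    {"name": "DG Checklist",     "focus": "data governance",   "data": "any",    "output": "quantitative"},
    {"name": "NESTcc Framework", "focus": "data quality",      "data": "EHR",    "output": "qualitative"},
    {"name": "ISPOR Retrospective DB Checklist",
                                 "focus": "data quality",      "data": "claims", "output": "qualitative"},
]

def shortlist(focus: str, data: str) -> list[str]:
    """Return candidate quality-assessment tools matching the stated needs."""
    return [
        t["name"] for t in TOOLS
        if t["focus"] == focus and t["data"] in ("any", data)
    ]

print(shortlist("data quality", "EHR"))  # ['NESTcc Framework']
```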

4. Discussion

This targeted literature review identified 17 tools that could be useful to HCPs and payers in the evaluation of RWE study planning, reporting, and quality. Although the tools identified in this review vary with regard to the domains covered, degree of complexity (Table 4), and their timing of use during the lifecycle of a study, they are all intended to promote consistency and improve the overall quality of RWE. While many tools cover similar or overlapping domains, each introduces subtle but important differences that reflect the specific priorities or perspectives of their creators. These nuances help explain why new tools continue to emerge, despite the frequent perception of “framework fatigue.” Rather than signaling redundancy, this evolution reflects stakeholders’ efforts to address gaps they observe in existing tools and to tailor approaches to better suit their contexts and objectives. As such, it is not necessarily standardization of these tools and their associated domains that is needed, but rather more clarity around how to select the most appropriate tool for a given task or stakeholder need.

Table 4.

Tools by intended use case and degree of complexity.

  Intended Use Case
 
Degree of Complexity* Protocol Development Study Reporting Quality Assessment Ref
Lower • ENCePP Checklist for Study Protocols • ArRoWS
• ISPOR SUITABILITY Checklist
• STROBE
• ArRoWS
• ISPOR-AMCP-NPC Questionnaire
• ISPOR SUITABILITY Checklist
[21,24,26,28–31,36,38]
Higher • ENCePP Guide on Methodological Standards in Pharmacoepidemiology
• HARPER
• STaRT-RWE
• ESMO-GROW
• HARPER
• ISPE Guidelines for Good Pharmacoepidemiology Practice
• ISPOR Checklist on Retrospective Database Studies
• STaRT-RWE
• DG Checklist
• ENCePP Guide on Methodological Standards in Pharmacoepidemiology
• ESMO-GROW
• GRACE Checklist
• GRADE Approach
• ISPE Guidelines for Good Pharmacoepidemiology Practice
• ISPOR Checklist on Retrospective Database Studies
• NESTcc Data Quality Framework
• ROBINS-I
• STaRT-RWE
• UReQA Framework
[3,16,19,20,22,23,25,27]

Abbreviations: ENCePP: European Network of Centres for Pharmacoepidemiology and Pharmacovigilance; ArRoWS: Assessment of Real-World Observational Studies; ISPOR: International Society for Pharmacoeconomics and Outcomes Research; STROBE: STrengthening the Reporting of OBservational studies in Epidemiology; AMCP-NPC: Academy of Managed Care Pharmacy – National Pharmaceutical Council; HARPER: HARmonized Protocol Template to Enhance Reproducibility of Hypothesis Evaluating Real-World Evidence Studies on Treatment Effects; STaRT-RWE: Structured Template and Reporting Tool for Real-World Evidence; ESMO-GROW: European Society for Medical Oncology Guidance for Reporting Oncology Real-World Evidence; ISPE: International Society for Pharmacoepidemiology; DG: Data Governance; GRACE: Good Research for Comparative Effectiveness; GRADE: Grading of Recommendations Assessment, Development, and Evaluation; NESTcc: National Evaluation System for health Technology Coordinating Center; ROBINS-I: Risk of Bias in Non-randomized Studies – of Interventions; UReQA: Use-case specific Relevance and Quality Assessment.

*Tools listed in this table were rated based on their complexity relative to each other, with high complexity tools covering a wide range of topics, providing detailed instructions, and requiring more advanced technical knowledge (e.g., biostatistics, pharmacoepidemiology, data privacy and management) and increased implementation effort. Lower complexity checklists are more focused, concise, and easier to apply with basic understanding and fewer resources.

Though knowledge of the tools, frameworks, and checklists identified in this review exists within the epidemiology community, awareness is lower beyond that stakeholder group. Given the number of existing tools and the lack of harmonized expectations on RWE study design, quality elements, and reporting [3], there is an unmet need to assist different stakeholders in selecting and utilizing these tools in decision-making. While tools like ROBINS-I and STROBE are widely endorsed, frequently cited in the literature, and have been incorporated into some payer reviews, few tools have shown empirical improvements in study rigor or uptake by HTA bodies or regulators. The GRACE Checklist is one of the few RWE tools that has undergone extensive validation efforts against external measures of quality [35]. This relative lack of validation does not diminish the value of other tools but highlights the importance of using each judiciously and in combination rather than relying on a single tool.

Adopting a template, checklist, or framework to support protocol development helps ensure that all of the necessary elements are included. Utilizing such templates to develop study protocols can ensure the appropriate elements are identified a priori and included by following explicit guidance when communicating key study parameters. This facilitates reproducibility, replication in independent data sources, and the assessment and mitigation of potential sources of bias [19]. The developers of HARPER, one of the available tools for protocol development, have been piloting the template with key international stakeholders, and uptake is increasing, as evidenced by its acknowledgment in the draft ICH M14 guidance [10]. Such standardization will likely become increasingly common as templates become codified in guidance, with the subsequent utility of these tools improving the quality and transparency of study design while also reducing investigator time and burden.

Utilizing tools for results reporting further increases traceability and supports data trustworthiness. If a tool is used to inform study reporting, it should be cited. The study’s protocol (or a redacted version) should be made publicly available (perhaps along with the data dictionary), per EU PAS Register and www.clinicaltrials.gov requirements [39]. This fosters transparency, as a decision maker can only make an assessment based on the information they are provided, and the quality of evidence generated is only as good as the quality of the underlying data. When assessing the quality of evidence, it is useful to have an understanding of the data source(s) used (and how they were selected), the detailed study methodology, and other aspects of the analysis that may impact relevance (including availability, sufficiency, and representativeness) and reliability (including accuracy, completeness, provenance, and timeliness) [40]. In some situations, this information may be required or requested by payers and regulators to support transparency in decision-making [38].

Many tools overlap in their focus on ensuring the inclusion of necessary elements in study protocols and enhancing transparency and reproducibility. For example, HARPER, ENCePP Checklist for Study Protocols, and STaRT-RWE all provide comprehensive guidance for protocol development, covering domains such as study design, data sources, and analytic methods. However, this redundancy can lead to confusion among users regarding which tool to select, highlighting the need for clearer guidance around selecting the right tool for a given task and, as applicable, formal evaluation of existing tools for different tasks.

Certain domains, such as external validity and data provenance, are often under-addressed. External validity, which pertains to the generalizability of study findings to broader patient populations, is crucial for the applicability of RWE but is not consistently emphasized across all tools. Similarly, data provenance, which involves documenting the lineage and transformations of the data used, is essential for assessing data reliability but is not uniformly covered. Tools like the DG Checklist and NESTcc Data Quality Framework address data governance and quality but may not provide sufficient guidance on external validity.

The targeted focus of some tools can be highly beneficial for certain use cases, providing specialized guidance and insights; however, this can also be limiting for broader applications. ESMO-GROW, for example, provides valuable, oncology-specific guidance for RWE study reporting and quality assessment; however, its focus on oncology may limit its applicability for broader stakeholder needs, such as regulators or HTA bodies, who may require additional economic and methodological considerations when evaluating RWE. In such contexts, more general tools like the GetReal Trial Tool may be better aligned with these broader evaluative frameworks [41]. Importantly, the oncology-specific nature of a tool does not inherently make it fit for the task; rather, it highlights its tailored utility within a specific domain. A comprehensive and effective approach to RWE generation, reporting, and assessment may therefore require the complementary use of multiple tools, each addressing distinct but critical dimensions of evidence quality.

The tools captured in this review do not fully address the gap of assessing underlying data quality. There are existing frameworks for conducting fit-for-purpose feasibility assessment (e.g., Structured Process to Identify Fit-for-purpose data [42]) that inform data source selection based on data relevance to the research question and study objectives. Such frameworks assume assessment of the underlying reliability of a data source as a necessary first step. Other publications have attempted to operationalize existing data quality frameworks put forth by regulators by mapping data quality checks in data sources to the dimensions of these frameworks [40]. In oncology, there are specific data quality considerations such as accurate and complete capture of tumor histology and staging, consistency in treatment sequencing and biomarker status, and ascertainment of complex endpoints such as progression-free and overall survival across data sources. For example, Friends of Cancer Research emphasizes the importance of aligning analytic definitions of oncology endpoints and developing common methodological frameworks to support high-quality RWE generation [15]. This underscores the need for rigorous methods to ensure the reliability and applicability of RWE in oncology, as RWD quality is the foundation for reliable evidence.

The existence of multiple tools at different stages of a study’s life cycle presents a challenge for decision makers, underscoring a need for structured and iterative assessment of data sources, study design, and the resulting RWE. For “end users” of that RWE (e.g., payers, HCPs), having visibility into the assessments, tools, and templates that researchers utilized throughout the study may support transparency, consistency, and enhanced understanding. To increase awareness and uptake of these tools, socialization and training should be incorporated into educational forums and physician/continuing medical education training. Academic journals requiring or recommending the use of RWE tools may also help place emphasis on the importance of study quality and thus increase adoption.

4.1. Limitations

Our targeted review has several limitations. Publications with tools were selected using predefined search terms developed based on subject matter expertise and adjudicated by multiple authors with expertise in the development, reporting, and evaluation of RWE; however, these terms were not validated, meaning their sensitivity and specificity are not known. The search was also limited to a single source (PubMed), which introduces the potential for database bias. It is possible that other tools are available, as this was a targeted rather than systematic review. Data extraction was conducted manually, and assessments of tool characteristics, including strengths and weaknesses, were either drawn directly from the source publications or inferred based on the expertise of the reviewers in the absence of explicit evaluations. These assessments were first conducted by the primary reviewer, who holds a clinical doctoral degree and a master’s in business administration with an analytics focus. They were then adjudicated by additional expert reviewers, including several with advanced degrees in epidemiology and decades of combined expertise in pharmacoepidemiology, real-world evidence, clinical oncology, and outcomes research; however, the assessments were not based on a standardized or validated assessment framework. Ultimately, what is considered a strength or weakness of a tool may vary depending on the intended use, user needs, and research context. The perceived value of specific tool features may also evolve over time, particularly as tools are externally validated or gain recognition and adoption by various stakeholders.

We did not include tools designed for specific therapeutic areas or interventions; however, prior studies have found that appraisal tools designed for specific interventions can potentially be applied to interventions more generally [25]. Moreover, because we focused on the published literature, we did not include relevant guidance documents available only on websites or in the gray literature. This area is rapidly evolving, and frequently released updates may not be fully reflected in our review.

The categorization of tools by use case was based on the available literature and the content of each tool. For certain tools, it could be argued that they are applicable to use cases beyond those to which they have been assigned here. This review offers considerations for implementing tools in various use cases but does not provide a formal evaluation of their utility; future studies could formally evaluate the comparative value of different checklists for different use cases.

5. Conclusions

The tools identified in this targeted review can be utilized by various stakeholders for multiple use cases, including guiding study planning and reporting and supporting quality assessment. Successful implementation of these tools supports more transparent, rigorous RWE generation and interpretation of study findings. When deciding which tool to use for a specific task, there is no one-size-fits-all solution: shorter tools tend to be more high-level, which can create ambiguity in their operationalization. Selection of the appropriate tool will depend on multiple factors, including the intended purpose of the tool, the intended real-world study design, and the availability of study documentation. Given the sheer number of available tools and their variability in domains, structure, and terminology, a coordinated effort among methodologists, regulators, payers, and other stakeholders involved in generating, evaluating, and reporting RWE could help develop further guidance on selecting the appropriate tool for a given task. As existing tools are harmonized, the barrier to adoption may be lowered, particularly for nonspecialist users.

6. Future perspective

Looking ahead, the next 5–10 years are likely to bring significant evolution in how RWE quality is assessed, particularly in oncology. Advances in artificial intelligence (AI) will increasingly augment key components of RWE generation, such as cohort identification, unstructured data abstraction, and bias detection, necessitating tools and checklists that incorporate AI-specific quality domains (e.g., model transparency, calibration, drift monitoring). Traditional static checklists may give way to more dynamic, continuous quality assurance frameworks that reflect the iterative and evolving nature of RWD generation and use. As regulatory bodies across jurisdictions move toward greater harmonization, metadata standards and quality evaluation frameworks are expected to become more interoperable and consistent, with oncology-specific adaptations (e.g., tumor-specific endpoints, genomic data integration) becoming more formalized. Particularly in the context of precision oncology, the growing use of novel and multi-modal data sources, including patient-reported outcomes (PROs), digital biomarkers, and genomic sequencing, will demand more sophisticated tools that evaluate data provenance, linkage quality, and contextual relevance. Moreover, as privacy-preserving analytic approaches and federated data environments become more common, RWE frameworks will need to account for distributed data architectures in which direct data access is limited. Finally, we anticipate the emergence of curated repositories or registries of RWE datasets supported by standardized data quality metrics and potential certification mechanisms. Together, these trends underscore the need for flexible tools that can adapt to the rapidly changing landscape of oncology research and RWE generation.

Funding Statement

This study was funded by Pfizer, Inc.

Article highlights

  • Real-world evidence (RWE) is increasingly used to support regulatory approvals, label expansions, and payer decision-making in oncology.

  • A targeted literature review identified 17 tools designed to support RWE study planning, reporting, and quality assessment, drawn from 15 articles published between 2020 and 2024.

  • Tools varied in format (e.g., checklists, templates, frameworks, questionnaires) and were categorized by use case: protocol development (5 tools), study reporting (6 tools), and quality assessment (14 tools).

  • Selecting the appropriate tool depends on factors such as study design, intended use, geographic context, and availability of study documentation.

  • Many tools overlap in purpose but differ in complexity, format, and applicability, highlighting the need for harmonization and user guidance.

  • Awareness and adoption of these tools remain limited outside the epidemiology community, underscoring the need for broader education and standardization.

  • Future RWE tools will need to evolve to address AI integration, novel data types, and federated data environments, especially in oncology research.

Author contributions

All authors meet the authorship criteria as defined by Future Oncology. Each author made a significant contribution to the work reported, participated in drafting or critically revising the article, agreed on the journal submission, reviewed and approved all versions of the manuscript, and accepts responsibility for the integrity of the work.

Connie Chen and Beata Korytowsky provided overall strategic oversight, including development of concept, study design, and critical review of the manuscript.

Richard Willke, Paul Cottu, Andrew Briggs, Uwe Siebert, and Adam Brufsky provided subject matter expertise, reviewed the manuscript, and gave critical feedback from the health care provider and payer perspectives.

Julien Heidt, Meghan Renfrow, and Kate Lovett participated in study design and implementation, including data collection, data analysis and interpretation, and drafting of the manuscript.

Disclosure statement

C.C. and B.K. are employees of and hold stock in Pfizer, Inc. J.H., M.R., and K.L. are/were employees of IQVIA, which received funding from Pfizer in connection with the development of the manuscript and to complete this study. A. Brufsky has received grants from Agendia and AstraZeneca, and consulting fees or honoraria from AstraZeneca, Pfizer, Novartis, Lilly, Genentech/Roche, Seagen, Daiichi Sankyo, Merck, Agendia, Sanofi, and Puma. P.C. has received honoraria for consulting or advisory roles from Novartis and Pfizer; research funding from AstraZeneca, Genentech/Roche, Novartis, Pierre Fabre, and Pfizer; and reimbursement for travel, accommodations, or expenses from AstraZeneca, NanoString Technologies, Novartis, Pfizer, and Roche. R.W. has received honoraria for consulting or advisory roles from Pfizer, Viatris, Sarepta, Bayer, and SKL, and for consulting work with PhRMA. A. Briggs reports consultancy payments from Pfizer in relation to the current manuscript and consultancy payments, not related to the current manuscript, from Astellas, AstraZeneca, BioCryst, Boehringer Ingelheim, Chiesi, Daiichi Sankyo, Gilead, Galderma, GSK, Idorsia, Novartis, Pharmacosmos, Rhythm, Sanofi, Sobi, Takeda, and Teofarma. U.S. has received honoraria for consulting or advisory roles from Pfizer and also provides teaching and consulting in causal inference and health decision science methods for academic institutions, scientific societies, HTA agencies, and industry. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer disclosure

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Supplemental data

Supplemental data for this article can be accessed online at https://doi.org/10.1080/14796694.2025.2552098

References

Papers of special note have been highlighted as either of interest (•) or of considerable interest (••) to readers.

1. Zisis K, Pavi E, Geitona M, et al. Real-world data: a comprehensive literature review on the barriers, challenges, and opportunities associated with their inclusion in the health technology assessment process. J Pharm Pharm Sci. 2024;27:12302. doi: 10.3389/jpps.2024.12302
2. Hogervorst MA, Soman KV, Gardarsdottir H, et al. Analytical methods for comparing uncontrolled trials with external controls from real-world data: a systematic literature review and comparison with European regulatory and health technology assessment practice. Value Health. 2025;28(1):161–174. doi: 10.1016/j.jval.2024.08.002
3. Sarri G, Hernandez LG. The maze of real-world evidence frameworks: from a desert to a jungle! An environmental scan and comparison across regulatory and health technology assessment agencies. J Comp Eff Res. 2024 Sep;13(9):e240061. doi: 10.57264/cer-2024-0061 •• This review by Sarri and Hernandez focuses on regulatory and health technology assessment (HTA) frameworks for real-world evidence (RWE). Our work is important because it fills a gap in this review by specifically looking at the published literature on RWE.
4. Batra A, Cheung WY. Role of real-world evidence in informing cancer care: lessons from colorectal cancer. Curr Oncol. 2019 Nov;26(Suppl 1):S53–S56. doi: 10.3747/co.26.5625
5. Jansen MS, Dekkers OM, le Cessie S, et al. Real-world evidence to inform regulatory decision making: a scoping review. Clin Pharmacol Ther. 2024;115(6):1269–1276. doi: 10.1002/cpt.3218
6. Innovative Health Initiative. D6.2 report on global regulatory best practices analysis: a scoping review of HTA and regulatory RWD/RWE policy documents. 2024. Available from: https://www.iderha.org/sites/iderha/files/2024-05/D6.2%20Report%20on%20Global%20Regulatory%20Best%20Practices%20Analysis_v2.0.pdf
7. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research, Center for Biologics Evaluation and Research, Oncology Center of Excellence. Real-world data: assessing electronic health records and medical claims data to support regulatory decision-making for drug and biological products. FDA guidance. 2024 Jul.
8. U.S. Department of Health and Human Services, Food and Drug Administration. Framework for FDA's real-world evidence program. FDA guidance. 2018 Dec.
9. Cantoni C, Pearl M. Data quality framework for EU medicines regulation: application to real-world data. EMA guidance. 2023 Dec.
10. European Medicines Agency. ICH M14 guideline on general principles on plan, design, and analysis of pharmacoepidemiological studies that utilize real-world data for safety assessment of medicines. [cited 2025 Jul 15]. Available from: https://www.ema.europa.eu/en/documents/scientific-guideline/ich-m14-guideline-general-principles-plan-design-analysis-pharmacoepidemiological-studies-utilize-real-world-data-safety-assessment-medicines-step-2b_en.pdf
11. CADTH. Guidance for reporting real-world evidence. CADTH guidance. 2023 May.
12. National Institute for Health and Care Excellence. NICE real-world evidence framework. NICE guidance. 2022 Jun.
13. Institute for Clinical and Economic Review. A guide to ICER's methods for health technology assessment. ICER guidance. 2020 Oct.
14. Sola-Morales O, Curtis LH, Heidt J, et al. Effectively leveraging RWD for external controls: a systematic literature review of regulatory and HTA decisions. Clin Pharmacol Ther. 2023;114(2):325–355. doi: 10.1002/cpt.2914
15. Rivera DR, Henk HJ, Garrett-Mayer E, et al. The Friends of Cancer Research real-world data collaboration pilot 2.0: methodological recommendations from oncology case studies. Clin Pharmacol Ther. 2022;111(1):283–292. doi: 10.1002/cpt.2453
16. Castelo-Branco L, Pellat A, Martins-Branco D, et al. ESMO guidance for reporting oncology real-world evidence (GROW). Ann Oncol. 2023 Dec;34(12):1097–1112. doi: 10.1016/j.annonc.2023.10.001 • This guidance provides reporting guidelines for oncology real-world evidence. Our work is significant because it addresses the broader published literature on RWE, beyond specific guidelines.
17. National Pharmaceutical Council. What's in your RWE evaluation toolbox? [cited 2024 Nov 7]. Available from: https://www.npcnow.org/resources/whats-your-rwe-evaluation-toolbox
18. Chen S, Graff J, Yun S, et al. Online tools to synthesize real-world evidence of comparative effectiveness research to enhance formulary decision making. J Manag Care Spec Pharm. 2021 Jan;27(1):95–104. doi: 10.18553/jmcp.2021.27.1.095
19. Wang SV, Pinheiro S, Hua W, et al. STaRT-RWE: structured template for planning and reporting on the implementation of real world evidence studies. BMJ. 2021 Jan 12;372:m4856. doi: 10.1136/bmj.m4856
20. Wang SV, Pottegard A, Crown W, et al. Harmonized protocol template to enhance reproducibility of hypothesis evaluating real-world evidence studies on treatment effects: a good practices report of a joint ISPE/ISPOR task force. Value Health. 2022 Oct;25(10):1663–1672. doi: 10.1016/j.jval.2022.09.001 • This report presents a harmonized protocol template to enhance the reproducibility of hypothesis-evaluating RWE studies on treatment effects. Our work is important as it builds on such standardized tools by examining the published literature.
21. European Network of Centres for Pharmacoepidemiology and Pharmacovigilance. ENCePP checklist for study protocols. [cited 2024 Nov 15]. Available from: https://encepp.europa.eu/encepp-toolkit/encepp-checklist-study-protocols_en
22. European Network of Centres for Pharmacoepidemiology and Pharmacovigilance. ENCePP guide on methodological standards in pharmacoepidemiology. [cited 2024 Nov 15]. Available from: https://encepp.europa.eu/encepp-toolkit/methodological-guide_en
23. International Society for Pharmacoepidemiology. Guidelines for good pharmacoepidemiology practices (GPP). [cited 2024 Nov 14]. Available from: https://www.pharmacoepi.org/resources/policies/guidelines-08027/
24. Capkun G, Corry S, Dowling O, et al. Can we use existing guidance to support the development of robust real-world evidence for health technology assessment/payer decision-making? Int J Technol Assess Health Care. 2022 Nov 2;38(1):e79. doi: 10.1017/S0266462322000605
25. Jiu L, Hartog M, Wang J, et al. Tools for assessing quality of studies investigating health interventions using real-world data: a literature review and content analysis. BMJ Open. 2024 Feb 13;14(2):e075173. doi: 10.1136/bmjopen-2023-075173 •• This literature review and content analysis by Jiu et al. focuses on assessing the quality of studies investigating health interventions using real-world data. Our work complements this by addressing the published studies on RWE.
26. Khambholja K, Gehani M. Use of structured template and reporting tool for real-world evidence for critical appraisal of the quality of reporting of real-world evidence studies: a systematic review. Value Health. 2023 Mar;26(3):427–434. doi: 10.1016/j.jval.2022.09.003
27. European Society for Medical Oncology. ESMO-GROW checklist for authors and reviewers. [cited 2024 Nov 15]. Available from: https://www.esmo.org/content/download/775073/18293338/1/ESMO-GROW-Checklist.pdf
28. Bruggesser S, Stockli S, Seehra J, et al. The reporting adherence of observational studies published in orthodontic journals in relation to STROBE guidelines: a meta-epidemiological assessment. Eur J Orthod. 2023 Feb 10;45(1):39–44. doi: 10.1093/ejo/cjac045
29. Ghaferi AA, Schwartz TA, Pawlik TM. STROBE reporting guidelines for observational studies. JAMA Surg. 2021 Jun 1;156(6):577–578. doi: 10.1001/jamasurg.2021.0528
30. Vandenbroucke JP, von Elm E, Altman DG, et al. Strengthening the reporting of observational studies in epidemiology (STROBE): explanation and elaboration. Ann Intern Med. 2007 Oct 16;147(8):W163–94. doi: 10.7326/0003-4819-147-8-200710160-00010-w1
31. White R. Building trust in real world evidence (RWE): moving transparency in RWE towards the randomized controlled trial standard. Curr Med Res Opin. 2023 Dec;39(12):1737–1741. doi: 10.1080/03007995.2023.2263353
32. Desai K, Ru M, Reynolds B, et al. An oncology real-world data assessment framework for outcomes research. Value Health. 2021;24(1):S25. doi: 10.1016/j.jval.2021.04.129
33. Sola-Morales O, Sigurðardóttir K, Akehurst R, et al. Data governance for real-world data management: a proposal for a checklist to support decision making. Value Health. 2023 Apr;26(4):32–42. doi: 10.1016/j.jval.2023.02.012 • This study proposes a checklist for data governance in real-world data management. It highlights the need for standardized approaches, which our work aims to address by focusing on the published literature.
34. American Academy of Pediatric Dentistry. GRADE framework in systematic reviews. Available from: https://www.aapd.org/globalassets/aapd-grade
35. Dreyer NA, Schneeweiss S, McNeil BJ, et al. The GRACE principles: recognizing high-quality observational studies of comparative effectiveness. Am J Manag Care. 2010;16(6):467–471. Available from: https://www.ajmc.com/view/ajmc_10jundreyer_467to4711
36. Fleurence RL, Kent S, Adamson B, et al. Assessing real-world data from electronic health records for health technology assessment: the SUITABILITY checklist: a good practices report of an ISPOR task force. Value Health. 2024;27(6):692–701. Available from: https://www.ispor.org/publications/journals/value-in-health/abstract/Volume-27–Issue-6/Assessing-Real-World-Data-From-Electronic-Health-Records-for-Health-Technology-Assessment–The-SUITABILITY-Checklist–A-Good-Practices-Report-of-an-ISPOR-Task-Force
37. Motheral B, Brooks J, Clark MA, et al. A checklist for retrospective database studies – report of the ISPOR Task Force on Retrospective Databases. Value Health. 2003;6(2):90–97. doi: 10.1046/j.1524-4733.2003.00242.x
38. Berger ML, Martin BC, Husereau D, et al. A questionnaire to assess the relevance and credibility of observational studies to inform health care decision making: an ISPOR-AMCP-NPC Good Practice Task Force report. Value Health. 2014 Mar;17(2):143–156. doi: 10.1016/j.jval.2013.12.011
39. Real World Transparency Initiative. Real world evidence registry. Available from: https://osf.io/registries/rwe/discover
40. Castellanos EH, Wittmershaus BK, Chandwani S. Raising the bar for real-world data in oncology: approaches to quality across multiple dimensions. JCO Clin Cancer Inform. 2024 Jan 19;8:e2300046. doi: 10.1200/CCI.23.00046
41. Boateng D, Kumke T, Vernooij R, et al.; GetReal Initiative. Validation of the GetReal trial tool – facilitating discussion and understanding more pragmatic design choices and their implications. Contemp Clin Trials. 2023;125:107054. doi: 10.1016/j.cct.2022.107054
42. Gatto NM, Campbell UB, Rubinstein E, et al. The structured process to identify fit-for-purpose data: a data feasibility assessment framework. Clin Pharmacol Ther. 2022 Jan;111(1):122–134. doi: 10.1002/cpt.2466
