Author manuscript; available in PMC: 2021 Dec 23.
Published in final edited form as: Clin Trials. 2020 Feb 17;17(4):377–382. doi: 10.1177/1740774520905576

Regulatory-Grade Clinical Trial Design Using Real-World Data

Mark S Levenson 1
PMCID: PMC8697198  NIHMSID: NIHMS1551475  PMID: 32063037

Abstract

Real-world data and evidence provide the potential to address the effectiveness and safety of drugs. The US Food and Drug Administration has initiated a program to evaluate the potential use of real-world evidence for regulatory uses. Whether a study is designed for regulatory purposes or for other purposes, existing regulation and guidance provide a reference for high-quality studies. Clarifying the study objectives and the role of real-world data in the study are important considerations. Robustness and transparency of the analysis allow for greater understanding and acceptance of the study results.

Keywords: Evidence, bias, heterogeneity, real-world

Background

Real-World Data (RWD) has been routinely used for certain regulatory purposes for some time. In particular, RWD has been used to evaluate suspected signals of adverse events in drugs.1 The Sentinel System2,3 is an example of the extensive use of RWD for this purpose. Responding to the Food and Drug Administration Amendments Act of 2007, the United States Food and Drug Administration (FDA) initiated the development of the Sentinel System. The Sentinel System is based on a large distributed network of RWD curated using a common data model and associated tools, permitting efficient evaluation of drug safety issues using observational study methods. Building on the Sentinel infrastructure, the FDA-Catalyst Program4 allows interaction with patients and providers. The IMPACT-AFib study provides an example of the use of the FDA-Catalyst Program for a randomized, educational interventional study. This system may be capable of addressing safety and efficacy problems that could not be addressed without additional data collection.

Recognizing the potential for additional use of RWD and more generally Real-World Evidence (RWE) for regulatory purposes, the United States Congress included in the 21st Century Cures legislation (Cures) (Public Law 114–255) the direction for FDA to develop a program to evaluate the potential use of RWE to support the approval of a new indication of an approved drug or to support or satisfy postapproval study requirements. Cures explicitly states that there are no changes in evidentiary standards for drug approval. Cures uses the term RWE to mean data regarding the usage, or the potential benefits or risks, of a drug derived from sources other than traditional clinical trials. In December 2018, FDA released a framework for this program5 (Framework). The Framework calls for demonstration projects, stakeholder engagement, new internal processes, and guidance development.

The Framework provides definitions of Real-World Data (RWD) and Real-World Evidence (RWE).

Real-World Data (RWD) are data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources.

Real-World Evidence (RWE) is the clinical evidence about the usage and potential benefits or risks of a medical product derived from analysis of RWD.

Examples of RWD include electronic health records, medical claims and billing data, patient-generated data, and emerging sources such as mobile devices. Environmental and social-economic data may provide RWD in that the information may have bearing on patient health.

It is FDA’s position that studies used to demonstrate the safety and effectiveness of a drug should reflect the diversity of patients that the drug is intended for.6 However, these studies are often conducted in special centers outside routine health-care delivery and often entail inclusion criteria to ensure the safety of participants and to maximize the sensitivity to drug effects. In fact, FDA has provided guidance on enriching studies for this purpose.7 Such enrichments may be used to select patients with greater adherence, tolerability, likelihood of benefit, or likelihood of events. The guidance does caution that “Some strategies to decrease variability can result in studies that provide too little information about the full range of patients who will receive a drug in clinical practice.” There is recognition in the drug development community that narrow inclusion criteria hinder trial enrollment and the generalizability of the study and that justification for excluding patients should be addressed.8

RWD has a range of potential uses in studies intended for regulatory uses. On one end of the spectrum, RWD may be used in the study planning and design stage, for example, to understand the intended population of a drug, evaluate design proposals and feasibility, and identify clinical sites and patients for a study. During study conduct, RWD may be used as a source of study data, such as medical history, safety-related information, and endpoints. Greater potential use of RWD may include the use of RWD as an external control for a study. On the far end of the spectrum, observational studies may be based solely on RWD sources, such as in Sentinel. Hybrid designs making use of internal and external controls are possible.9

RWD is not tied to specific study objectives. For example, RWD may play a role in a study designed to maximize the sensitivity to a drug effect under optimal conditions. In this case, RWD may leverage existing infrastructure and data to promote efficiency in study conduct. Alternatively, RWD provides an opportunity to enroll more diverse patient populations from a range of treatment settings.

The remainder of this paper expands on some RWD and RWE regulatory considerations from a statistical perspective for the design of interventional studies. The paper focuses on statistical design issues and does not address other important regulatory issues such as data quality, human subject protection, and study integrity. The paper first discusses the regulation and some guidance for establishing the effectiveness of a drug and how the regulation and guidance may relate to the use of RWD and RWE. The paper then discusses the need to clarify the objectives of an RWE study and potential tradeoffs among objectives, making use of the causal frameworks from a draft FDA guidance. Finally, the paper provides some concluding remarks. Whether a study is designed for regulatory purposes or for other purposes, existing regulation and guidance provide a reference for high-quality studies.

Regulatory standards

For a drug to be marketed in the United States, the applicant must, among other requirements, establish the safety and effectiveness of the drug (Federal Food, Drug, and Cosmetic Act, 21 USC 355). The establishment of effectiveness requires substantial evidence, which consists of adequate and well-controlled investigations. An adequate and well-controlled investigation is defined in regulation (21CFR314.126) to have a number of characteristics. Note that the safety of the drug must be established, but the demonstration of safety is not based on the substantial evidence criteria. As noted in the background section, these standards of evidence currently apply to RWE when used to establish effectiveness. These characteristics represent good scientific practice and are relevant outside of the regulatory setting. This section states the characteristics of an adequate and well-controlled investigation and discusses issues relevant to RWD and RWE. The material in this section is not meant as a statement or interpretation of the regulation. For a complete statement of the characteristics, refer to the regulation.

Clear objectives and summary of methods and results

The protocol specifies the objectives and methods of the study. Although not stated in the regulation, the protocol should be written prior to the conduct of the study.10 In the domain of RWE, special attention is needed here, because the study may make use of data that exist at the time of the study design and planning. To ensure the integrity of the study, assurances are needed that any review of the existing data would not affect the integrity of the study findings. For example, the choice of an external control from existing data sources should not be based on the rate of an outcome and should be blinded to the study outcome to the extent feasible during the design of the study and the specification of the analysis plan.

Design permits a valid comparison with a control

This characteristic is fundamental to a controlled study because it facilitates isolation of the effect of the drug from other factors that can influence the measured outcome, such as the underlying disease. Several types of controls are described, including placebo concurrent control, dose-comparison concurrent control, no treatment concurrent control, active treatment concurrent control, and historical control. FDA Guidance E10 Choice of Control Group and Related Issues in Clinical Trials (E10)11 provides further FDA guidance in this area. The use of a historical control, and more generally an external control, implies a non-randomized study and in some situations provides an avenue for non-randomized studies to demonstrate effectiveness. Single-arm trials in which results from the single arm are compared to accepted medical knowledge fall into this category. Also in this category are single-arm trials in which results are compared to a specific group of patients derived from external sources. However, both the regulation and E10 provide caveats and caution. E10 states, “The inability to control bias restricts use of the external control design to situations in which the effect of treatment is dramatic and the usual course of the disease highly predictable.” Because the effect of the drug is not known until the completion of the study, this statement might be interpreted to mean that large effects are necessary for substantial evidence. Without randomization, it is important to evaluate the comparability of factors between the treated and control groups that might influence the endpoint, and therefore, these factors should be understood and measured.
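As an illustration of evaluating comparability between a treated arm and an external control, one commonly used diagnostic is the standardized mean difference of a baseline covariate. The sketch below is not taken from the regulation or from E10; the function name and the age values are hypothetical and serve only to show the calculation.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has no variability in either group")
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical baseline ages (illustrative numbers only, not study data).
trial_arm_age = [61, 58, 70, 66, 63, 59]
external_control_age = [55, 52, 64, 60, 57, 53]

smd_age = standardized_mean_difference(trial_arm_age, external_control_age)
print(f"Standardized mean difference in age: {smd_age:.2f}")
```

A large absolute value suggests that the groups differ on that measured factor and that the difference should be addressed in the design or analysis. Such a diagnostic says nothing about unmeasured factors.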

Adequate selection of patients

The study patients should have the condition being studied. The implications of heterogeneity of patients and other study characteristics are further discussed later in the paper.

Assigning patients to treatment and control groups minimizes bias

The regulation states that for concurrent control groups, randomization is ordinarily used to assign patients to either the treatment or control groups. Randomization ensures that both measured and unmeasured characteristics are probabilistically balanced between the treated and control groups. It is hard to ensure or establish this balance without randomization. However, the regulation does not state that randomization is required.

Adequate measures to minimize biases on subjects, observers, and analysts

Typically, blinding is also used in studies to minimize bias. When possible, patients and investigators are blinded to treatment allocation. Without blinding, bias may affect protocol adherence and ascertainment of endpoints, possibly resulting in misleading findings. It may be difficult to blind in some real-world studies. Further research is likely needed to explore the impact of lack of blinding on adherence and endpoint ascertainment and to identify the situations in which that impact is not significant.

Well-defined and reliable assessment of subjects’ responses

Responses under this heading refer to patient outcomes or endpoints. Valid, reliable, and accurate endpoints are central to the strength of the study finding. Blinding is one means to promote the reliability of the endpoints. However, as mentioned, this may not always be possible with the use of electronic health data. The use of hard endpoints that are clear and unambiguous may reduce the consequences of lack of blinding. Some research has shown that the use of RWD for certain endpoints yields results similar to those from traditional study endpoint ascertainment.12,13 The use of these data may afford the ability for long-term follow-up.14 With the use of an external control, it is important to consider the comparability of endpoint ascertainment between the internal interventional arm and the external control. For this situation, in addition to the comparability of the endpoint ascertainment, consideration of confounding is relevant. Some endpoints may be associated with less confounding than others. For example, survival may be the most objective endpoint, but many factors such as standard of care may affect survival. In some cases, multiple RWD sources may be combined to derive endpoints.15

Adequate analysis to assess drug results

Good design and conduct promote straightforward analysis. However, non-randomized studies, complex adherence patterns, and the absence of information (e.g., missing data) may necessitate analysis that is more complex than has been typically used in traditional randomized trials. Analysis methods should have well-understood properties and assumptions. Likewise, they should be robust to departures from their assumptions and should include sensitivity analyses and diagnostics to evaluate the assumptions and the impact of deviations from them.

Because study results are reviewed by decision makers with a range of backgrounds, it is advantageous for methods to be based on accessible and understandable concepts. Relatedly, specification of analysis methods in a regulatory setting calls for some consensus among parties with different backgrounds and focus. Thus, the use of an analysis that requires extensive specifications, such as the form of a model, may be challenging in a regulatory context. However, for advancement of the use of RWD in trials, it may be necessary to explore the potential, appropriate use, and robustness of novel methods.

The quantification of potential bias may assist reviewers in evaluating the strength of findings. Bias may result from population selection, confounding, endpoint ascertainment, and measurement error.16–18 For example, for a binary endpoint, assumptions on the misclassification of the endpoint (sensitivity and specificity) can be used to derive the bias in the effect measurement. Finally, the analysis plan should be finalized and documented before the review and analysis of the study data.10
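The binary-endpoint example can be sketched numerically. The code below assumes nondifferential misclassification, that is, the same sensitivity and specificity in both arms. The risks, sensitivity, and specificity are hypothetical values chosen for illustration, and the function names are not from any cited method.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected observed event proportion under endpoint misclassification."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def corrected_risk(observed, sensitivity, specificity):
    """Back-calculate the true risk from an observed proportion."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed - (1 - specificity)) / denom


# Hypothetical inputs (assumptions for illustration only).
true_control, true_treated = 0.20, 0.10
sens, spec = 0.80, 0.95

obs_control = observed_risk(true_control, sens, spec)
obs_treated = observed_risk(true_treated, sens, spec)

true_rd = true_treated - true_control
observed_rd = obs_treated - obs_control

# Under nondifferential misclassification the observed risk difference
# equals (sensitivity + specificity - 1) times the true risk difference.
print(f"true risk difference:     {true_rd:.3f}")
print(f"observed risk difference: {observed_rd:.3f}")
print(f"attenuation factor:       {sens + spec - 1:.2f}")
```

In this setting the observed effect is attenuated toward the null. If misclassification differs between arms, for example when ascertainment differs between an internal arm and an external control, the bias can be in either direction.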

Precise research question

The first element of the regulation for adequate and well-controlled investigations relates to clear objectives and summaries. Intent-to-treat and Per Protocol analyses are two commonly discussed ideals in the design and analysis of trials. However, even in traditional trials, with the presence of missing data and other post-baseline events, there may be ambiguity in the precise objective of the trial. For studies using RWD, this may be an even larger issue, because of the greater potential for missing data and complex exposure patterns. It is helpful to consider the framework of the draft guidance E9(R1) Statistical Principles for Clinical Trials: Addendum: Estimands and Sensitivity Analysis in Clinical Trials (E9R1),19 which makes use of the concept of estimand to help align the study design and analysis with the study objectives. The estimand is the unknown target quantity (true drug effect) that addresses the study objectives. The estimand is specified by the study population, endpoint, summary measure, and the handling of post-randomization events (intercurrent events).
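One way to make the four attributes of an estimand concrete is to record them explicitly at the design stage. The sketch below is only an illustrative data structure, not an FDA or ICH template. The strategy names other than Treatment Policy come from E9(R1) itself rather than from this paper, and the example study details are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Strategy(Enum):
    """Strategies for intercurrent events named in ICH E9(R1)."""
    TREATMENT_POLICY = "treatment policy"
    HYPOTHETICAL = "hypothetical"
    COMPOSITE = "composite"
    WHILE_ON_TREATMENT = "while on treatment"
    PRINCIPAL_STRATUM = "principal stratum"


@dataclass(frozen=True)
class Estimand:
    population: str
    endpoint: str
    summary_measure: str
    intercurrent_events: Dict[str, Strategy]  # event -> handling strategy

    def describe(self) -> str:
        events = "; ".join(
            f"{event}: {strategy.value}"
            for event, strategy in self.intercurrent_events.items()
        )
        return (
            f"{self.summary_measure} in {self.endpoint} "
            f"among {self.population} ({events})"
        )


# Hypothetical example for illustration only.
primary = Estimand(
    population="adults with the condition treated in routine care",
    endpoint="hospitalization within 12 months",
    summary_measure="risk difference",
    intercurrent_events={
        "treatment discontinuation": Strategy.TREATMENT_POLICY,
        "switch to second-line therapy": Strategy.TREATMENT_POLICY,
    },
)
print(primary.describe())
```

Writing the estimand down in this form forces a decision for each anticipated intercurrent event before the data are examined.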

Studies intended to reflect clinical practice are likely to have more complex exposure and adherence patterns than highly-controlled studies; equivalently, there may be more intercurrent events. For the demonstration of effectiveness, FDA has generally advocated for the use of the Intention-To-Treat principle10 or, in the language of E9R1, Treatment Policy. Information from the time after patients stop treatment is included in the analysis, so that the overall effect includes information from patients who stopped treatment because of lack of efficacy, adverse events, or other reasons. In practice, the information after the patient stops therapy may not be available, and the missing data problem arises. Stopping treatment or lack of follow-up may be common in studies reflecting clinical practice, because there is more flexibility in the intervention delivery and the follow-up. In addition, even when follow-up is good, if adherence is short relative to the study period, any effect of the drug might be diluted by long periods without exposure to the drug. Because the estimated drug effect would be smaller, larger sample sizes would be needed to achieve the same statistical power as in a study with better adherence. There is also the issue of whether such an estimand is appropriate if there is significant study time for patients who have stopped therapy or if there are other intercurrent events.
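The effect of dilution on sample size can be illustrated with a simplified calculation. The sketch assumes a continuous endpoint compared between two equal arms with a normal approximation, and it assumes that the Treatment Policy effect is the full effect multiplied by the proportion of follow-up spent on treatment. Both assumptions, and all numeric inputs, are hypothetical simplifications rather than anything stated in the guidance.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha=0.05, power=0.90):
    """Approximate per-arm sample size for a two-sided comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical inputs (assumptions for illustration only).
full_effect = 0.5        # effect with complete adherence
sigma = 1.0              # outcome standard deviation
adherent_fraction = 0.7  # proportion of follow-up spent on treatment

diluted_effect = full_effect * adherent_fraction

n_full = n_per_arm(full_effect, sigma)
n_diluted = n_per_arm(diluted_effect, sigma)

print(f"per-arm n, full adherence:    {n_full}")
print(f"per-arm n, diluted effect:    {n_diluted}")
print(f"approximate inflation factor: {1 / adherent_fraction ** 2:.2f}")
```

Because the required sample size scales with the inverse square of the effect, a modest loss of adherence can roughly double the number of patients needed under these assumptions.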

Estimands other than Treatment Policy offer promise and pose challenges. Hernan and Robins20 argue that, for studies designed to reflect clinical practice, Intention-To-Treat may not be appropriate, and they advocate what they refer to as Per Protocol Analysis. They present an example for a contraceptive, in which one might be interested in the effectiveness when used as indicated and not the effectiveness that includes significant non-adherence. One can argue that both measures of effectiveness are of interest. However, the latter may be more dependent on how the specific study design encourages adherence. Both measures are dependent on the population in the study. Hernan and Robins are clear that naïve estimates do not necessarily produce Per Protocol estimates. Adherence may be non-random and depend on baseline factors, post-baseline factors, and unmeasured factors. Increasingly complex models are needed to account for these factors. As discussed, complex models may be challenging in a regulatory setting. These models entail more assumptions, and the sensitivity analyses and diagnostics become more important. Reaching consensus on the form of the model may be challenging. Furthermore, the factors that predict adherence need to be understood and measured. With the use of RWD, for example existing information in electronic medical records, there may be limited information on the predictors. Whichever estimand is used, the proper interpretation should be conveyed in professional labeling for the drug.

Hernan and Scharfstein21 propose a focus on studies that compare treatment strategies that include protocol-defined use of second-line treatment, as opposed to comparing individual, initial treatments. This approach reduces issues related to adherence, because adherence is more broadly and clinically defined. This approach clearly has a role in developing clinical evidence. However, to effectively search for such strategies, questions like “does a drug work in ideal conditions?” might need to be addressed first.

As discussed in the background section, RWD may be used to promote efficiency by leveraging existing infrastructure and data, or it may provide an opportunity to enroll more diverse patient populations from a range of treatment settings. These two goals can be in opposition. Design elements that are associated with studies designed to reflect clinical practice,22 such as broad eligibility, flexible delivery and adherence, and even patient-centered outcomes, may be associated with greater heterogeneity. If the heterogeneity can be accounted for, for example by including predictive covariates in the analysis, then loss of efficiency may not be a problem. Otherwise, such studies may require larger sample sizes to achieve the same statistical power as a highly-controlled study. Thus, there may be a trade-off between providing evidence reflecting clinical practice and the efficiency of leveraging existing infrastructure and data. It is important to have clear study objectives and a clear role for the use of RWD in the study.
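The link between heterogeneity and sample size can be made explicit under simplifying assumptions: a continuous endpoint, two equal arms, a normal approximation, and covariates whose relationship to the outcome is linear. The symbols below are introduced for illustration and do not appear in the cited guidance.

```latex
% Per-arm sample size for detecting a mean difference \delta
% with outcome variance \sigma^2, two-sided level \alpha, power 1-\beta:
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}\,\sigma^{2}}{\delta^{2}}

% Broader eligibility and flexible delivery add between-patient variance:
\sigma^{2}_{\text{pragmatic}} = \sigma^{2}_{\text{controlled}} + \sigma^{2}_{\text{heterogeneity}}

% Adjusting for baseline covariates that explain a fraction R^{2}
% of the outcome variance leaves the residual variance
\sigma^{2}_{\text{adjusted}} = \sigma^{2}_{\text{pragmatic}}\,(1 - R^{2})
```

Under these assumptions the sample size grows in proportion to the added variance, and covariate adjustment recovers efficiency only to the extent that the measured covariates are predictive of the outcome.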

Conclusions

The use of RWD and RWE offers great promise to leverage large amounts of healthcare data and address questions relevant to patients and public health. In addition to specifying legal requirements, existing FDA regulation offers a framework to consider in the design and analysis of RWE studies to achieve high-quality evidence. It is likely that randomization will still play the primary role for studies to establish the effectiveness of drugs. However, in certain settings, non-randomized studies may play a role, although challenges exist for these studies to provide well-accepted evidence. The characteristics of an adequate and well-controlled investigation can still provide a roadmap for non-randomized studies. Experience comparing observational studies with randomized clinical studies that address the same question, as in the FDA Duplicate project,23 will provide some insight into the reliability of observational studies. The current regulation and guidance reserve external controls for special cases. Understanding the elements of observational studies that are associated with reliable results may in the future permit the use of these studies in a wider range of settings. In parallel, the advancement and understanding of the quantification of bias will aid in evaluating study findings.

It is important to separate the clinical question from the use of RWD, since RWD may play a role in a range of clinical questions. First, the clinical questions should be clear, whether the intent is to isolate the effect of the drug or to provide evidence on its use and effectiveness in clinical practice. RWD may be used to provide information on effects in clinical practice, and it can be used to enhance efficiency in the study by leveraging existing infrastructure and data sources. However, at times these two purposes may be at odds, since heterogeneity associated with clinical practice may result in larger sample sizes.

One of the strengths of the traditional randomized clinical trial is that with good design and conduct, the analysis is relatively straightforward, transparent, and widely comprehensible. In the planning, parties with different views and focus can generally reach consensus on the analysis plan. With more complex study designs and exposure and adherence patterns, studies may require more complex analytic approaches. It is important that analyses maintain the desirable features of traditional analyses such as robustness and transparency. It is also important that the necessary data to conduct the appropriate analysis are available in the study. Ultimately, designs and methods should bring the best outcomes to patients, and not be limited by the current state of designs, methods, and data.

The regulatory use of RWD and RWE will likely grow as the understanding of the strengths and limitations of these studies grows and with technological advances, such as new analysis methods and data sources. What we know about traditional clinical trials will provide a foundation for this growth.

Acknowledgements

I wish to acknowledge the helpful comments from two anonymous reviewers.

Funding

No external funding sources.

Footnotes

Declaration of conflicting interests

There are no conflicts of interest to declare.

Disclaimer

This article reflects the views of the author and should not be construed to represent FDA’s views or policies.

References

  • 1. United States Food and Drug Administration. Guidance for Industry and FDA Staff: Best practices for conducting and reporting pharmacoepidemiologic safety studies using electronic healthcare data sets, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/best-practices-conducting-and-reporting-pharmacoepidemiologic-safety-studies-using-electronic (2013, accessed 13 September 2019).
  • 2. Platt R, Wilson M, Chan KA, et al. The new Sentinel Network-improving the evidence of medical-product safety. N Engl J Med 2009; 361: 645–647.
  • 3. Platt R, Brown JS, Robb M, et al. The FDA Sentinel Initiative - An evolving national resource. N Engl J Med 2018; 379: 2091–2093.
  • 4. Cocoros NM, Pokorney SD, Haynes K, et al. FDA-Catalyst-Using FDA’s Sentinel Initiative for large-scale pragmatic randomized trials: approach and lessons learned during the planning phase of the first trial. Clin Trials 2019; 16: 90–97.
  • 5. United States Food and Drug Administration. Framework for FDA’s real-world evidence program, https://www.fda.gov/media/120060/download (2018, accessed 13 September 2019).
  • 6. United States Food and Drug Administration. Guidance for Industry and FDA Staff: Enhancing the diversity of clinical trial populations - Eligibility criteria, enrollment practices, and trial designs, https://www.fda.gov/media/127712/download (2019, accessed 13 September 2019).
  • 7. United States Food and Drug Administration. Guidance for Industry and FDA Staff: Enrichment strategies for clinical trials to support determination of effectiveness of human drugs and biological products, https://www.fda.gov/media/121320/download (2019, accessed 13 September 2019).
  • 8. Kim ES, Bruinooge SS, Roberts S, et al. Broadening eligibility criteria to make clinical trials more representative: American Society of Clinical Oncology and Friends of Cancer Research Joint Research Statement. J Clin Oncol 2017; 35: 3737–3744.
  • 9. Lin J, Gamalo-Siebers M and Tiwari R. Propensity-score-based priors for Bayesian augmented control design. Pharm Stat 2019; 18: 223–238.
  • 10. United States Food and Drug Administration. Guidance for Industry: E9 statistical principles for clinical trials, https://www.fda.gov/media/71336/download (1998, accessed 13 September 2019).
  • 11. United States Food and Drug Administration. Guidance for Industry: E10 choice of control group and related issues in clinical trials, https://www.fda.gov/media/71349/download (2001, accessed 13 September 2019).
  • 12. Hlatky MA, Ray RM, Burwen DR, et al. Use of Medicare data to identify coronary heart disease outcomes in the Women’s Health Initiative. Circ Cardiovasc Qual Outcomes 2014; 7: 157–162.
  • 13. Barry SJ, Dinnett E, Kean S, et al. Are routinely collected NHS administrative records suitable for endpoint identification in clinical trials? Evidence from the West of Scotland Coronary Prevention Study. PLoS One 2013; 8: e75379.
  • 14. Ford I, Murray H, McCowan C, et al. Long-term safety and efficacy of lowering low-density lipoprotein cholesterol with statin therapy: 20-year follow-up of West of Scotland Coronary Prevention Study. Circulation 2016; 133: 1073–1080.
  • 15. Curtis MD, Griffith SD, Tucker M, et al. Development and validation of a high-quality composite real-world mortality endpoint. Health Serv Res 2018; 53: 4460–4476.
  • 16. Schneeweiss S. Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics. Pharmacoepidemiol Drug Saf 2006; 15: 291–303.
  • 17. Lash TL, Fox MP, MacLehose RF, et al. Good practices for quantitative bias analysis. Int J Epidemiol 2014; 43: 1969–1985.
  • 18. Lash TL, Fox MP, Cooney D, et al. Quantitative bias analysis in regulatory settings. Am J Public Health 2016; 106: 1227–1230.
  • 19. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). Draft Guidance: E9(R1) statistical principles for clinical trials: addendum: Estimands and sensitivity analysis in clinical trials, https://www.fda.gov/media/108698/download (2017, accessed 13 September 2019).
  • 20. Hernan MA and Robins JM. Per-protocol analyses of pragmatic trials. N Engl J Med 2017; 377: 1391–1398.
  • 21. Hernan MA and Scharfstein D. Cautions as regulators move to end exclusive reliance on intention to treat. Ann Intern Med 2018; 168: 515–516.
  • 22. Loudon K, Treweek S, Sullivan F, et al. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ 2015; 350: h2147.
  • 23. Franklin JM, Pawar A, Martin D, et al. Nonrandomized real-world evidence to support regulatory decision-making: process for a randomized trial replication project. Clin Pharmacol Ther. Epub ahead of print 21 September 2019. DOI: 10.1002/cpt.1633.
