Author manuscript; available in PMC 2014 Jun 1.
Published in final edited form as: Stroke. 2013 Jun;44(6 Suppl 1):S116–S118. doi: 10.1161/STROKEAHA.111.000031

Novel Methodologic Approaches to Phase I, II, and III Trials

Sharon D Yeatts 1,2
PMCID: PMC3684044  NIHMSID: NIHMS465120  PMID: 23709704

Among the future directions for stroke research identified by the Stroke Progress Review Group (PRG) was a call for improved trial design, conduct, and outcome assessment. This need is underscored by recent trial findings: of the more than 100 clinical trials in ischemic stroke published during the first decade of the 21st century, only five of 31 Phase III trials demonstrated efficacy on the primary outcome1. As the number of experimental therapies undergoing clinical investigation in stroke increases, there is a growing need for more efficient statistical designs to 1) determine the optimal dose, 2) identify interventions with therapeutic potential, and 3) evaluate clinically relevant treatment effects.

Correct selection of the optimal dose, via a thorough understanding of the dose-toxicity and dose-efficacy relationships, is critical for establishing the efficacy of a potential therapeutic agent2. Traditional rule-based dose-escalation designs3, such as the “3+3” design and its variations, have suboptimal statistical operating characteristics. The inefficiency of the “3+3” design arises because the decision to escalate or de-escalate is based solely on event data from the current dose, without considering event information available from neighboring doses. An alternative, the Continual Reassessment Method4 (CRM), is an adaptive dose-finding algorithm in which escalation or de-escalation through the dose range is determined by continual re-estimation of the dose-toxicity curve, and each cohort of subjects is treated at the dose currently believed to be the maximum tolerated dose (MTD). Although computationally more intensive than the “3+3”, the CRM and its variations use all available toxicity data in the estimation of the MTD and thus are statistically more efficient. Ethical considerations also favor the CRM, because it typically treats fewer patients at subtherapeutic doses.
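
To make the mechanics concrete, the Python sketch below (not from the original article) illustrates a single dose-assignment step of a Bayesian CRM using a one-parameter power model; the skeleton, target toxicity rate, prior standard deviation, and example toxicity counts are illustrative assumptions rather than values from any cited trial.

```python
# A minimal sketch of one CRM dose-assignment step, assuming a one-parameter
# power ("empiric") model p_i(a) = skeleton_i ** exp(a) with a normal prior on a.
import numpy as np

skeleton = np.array([0.05, 0.12, 0.25, 0.40, 0.55])  # assumed prior toxicity guesses by dose
target = 0.25                                          # assumed target toxicity probability defining the MTD

def crm_recommend(n_treated, n_tox, prior_sd=np.sqrt(1.34)):
    """Return (index of dose currently estimated to be the MTD, posterior toxicity estimates)."""
    a = np.linspace(-4, 4, 2001)                       # grid for numerical integration over the parameter
    prior = np.exp(-0.5 * (a / prior_sd) ** 2)         # unnormalized normal prior
    p = skeleton[:, None] ** np.exp(a)[None, :]        # dose-toxicity curve at every grid point
    # Binomial likelihood of the observed toxicity data, accumulated over doses
    loglik = (n_tox[:, None] * np.log(p) +
              (n_treated - n_tox)[:, None] * np.log1p(-p)).sum(axis=0)
    post = prior * np.exp(loglik - loglik.max())
    post /= np.trapz(post, a)                          # normalized posterior density on the grid
    p_hat = np.trapz(p * post, a, axis=1)              # posterior mean toxicity probability at each dose
    return int(np.argmin(np.abs(p_hat - target))), p_hat

# Example: 3 patients treated at dose 1 (0 toxicities) and 3 at dose 2 (1 toxicity)
next_dose, p_hat = crm_recommend(np.array([3, 3, 0, 0, 0]), np.array([0, 1, 0, 0, 0]))
```

The next cohort would be treated at `next_dose`, and the curve would be re-estimated again once its toxicity outcomes are observed.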

Once the appropriate dose has been determined through a Phase I trial, the next step is to evaluate efficacy potential in a Phase II trial. The traditional concurrently controlled Phase II design, intended to simultaneously estimate the treatment effect and assess its variability, is often criticized as amounting to an underpowered Phase III comparative clinical trial. Proposed alternative designs, such as the selection design and the futility design, are instead intended to weed out ineffective or mediocre therapies.

The objective of the futility design, which has been successfully implemented in stroke5,6, is to discard treatments that do not show promise. The statistical hypotheses are framed such that the goal is to demonstrate that the intervention is futile; failure to conclude futility is considered evidence that a definitive Phase III clinical trial is warranted. In the single-arm futility design7, the experimental treatment arm is compared with a target response rate, π1, defined as the minimum proportion of successes in the treated group that would warrant further study. If the true success proportion π is less than π1, the intervention is declared futile. Comparison of the experimental arm with a target response rate, which is fixed and has no variability, results in a smaller sample size than would be required for a direct comparison with a concurrent control arm. The target response rate can be determined from a clinically relevant treatment effect and historical control data.
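
As a minimal illustration of this decision rule, the sketch below tests H0: π ≥ π1 against H1: π < π1 with an exact binomial test; the target rate, sample size, significance level, and observed success count are illustrative assumptions, not values from any cited stroke trial.

```python
# A minimal sketch of the single-arm futility decision rule described above.
from scipy.stats import binomtest

def futility_test(successes, n, pi1, alpha=0.10):
    """Declare futility if H0: pi >= pi1 is rejected in favor of H1: pi < pi1."""
    result = binomtest(successes, n, p=pi1, alternative="less")
    return result.pvalue < alpha, result.pvalue

# Example: assumed target rate pi1 = 0.40; 28 successes observed among 80 treated subjects
futile, p_value = futility_test(28, 80, pi1=0.40)
# futile == True would discard the therapy; otherwise it would advance toward Phase III
```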

Concerns over the use of historical control data include temporal changes in outcome associated with improvements in patient management, as well as variations in eligibility criteria, protocol adherence, and primary outcome measures across clinical trials. If the historical control data are outdated and therefore no longer relevant, the trial results may not provide an accurate assessment of the futility of the experimental treatment. Inclusion of a small cohort of control subjects for calibration of the threshold value has been suggested8. If this cohort is too small, its usefulness for calibration is quite limited; if it is too large, the trial begins to resemble an underpowered Phase III trial.

The inclusion of a concurrent control arm in the futility design avoids the drawbacks of historical control data and allows a direct comparison of treatment arms. The futility hypothesis is based on a direct comparison of the randomized treatment arms, such that futility is declared if the absolute treatment effect in favor of the experimental treatment is less than δ, a pre-specified clinically meaningful futility margin. Although the sample size is larger than in the single-arm design, the concurrently controlled futility design is not an alternative to Phase III efficacy testing; the objective remains to establish futility, rather than to demonstrate efficacy, of the active treatment. The NINDS-funded Phase II trial of deferoxamine mesylate in intracerebral hemorrhage (Hi-Def in ICH, clinicaltrials.gov NCT01662895) employs this design.
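
A minimal sketch of this concurrently controlled comparison is given below, using a normal-approximation test of H0: pT − pC ≥ δ against H1: pT − pC < δ; the margin, counts, and significance level are assumptions for illustration and are not taken from the Hi-Def protocol.

```python
# A minimal sketch of a concurrently controlled futility test on the difference
# in success proportions, with the null hypothesis shifted by the futility margin delta.
import numpy as np
from scipy.stats import norm

def controlled_futility_test(x_t, n_t, x_c, n_c, delta, alpha=0.10):
    p_t, p_c = x_t / n_t, x_c / n_c
    se = np.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)  # unpooled standard error
    z = (p_t - p_c - delta) / se        # effect measured against the futility margin
    p_value = norm.cdf(z)               # one-sided; small values support futility
    return p_value < alpha, p_value

# Example: assumed margin delta = 0.10; 45/100 successes on treatment vs 40/100 on control
futile, p_value = controlled_futility_test(45, 100, 40, 100, delta=0.10)
```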

Selection designs9 can be used to prioritize candidate interventions, so that resources are allocated to the most promising candidates. In a selection design, the objective is to select the “best” among K interventions (or K interventions and a control) for further testing, and subjects are randomized to one of the K interventions. The “best” intervention is defined as the one with the numerically, rather than statistically significantly, highest response rate. The sample size is determined to ensure that, if the best treatment is superior by at least some margin D, it will be selected with high probability.
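
The Monte Carlo sketch below illustrates how such a sample size can be explored: it estimates the probability that the truly best arm has the numerically highest observed response rate. The response rates, margin D, number of arms, and per-arm sample sizes are illustrative assumptions.

```python
# A minimal Monte Carlo sketch of the probability of correct selection in a
# pick-the-winner selection design with binary outcomes.
import numpy as np

def prob_correct_selection(n_per_arm, p_best, p_others, k_arms, n_sim=100_000, seed=0):
    rng = np.random.default_rng(seed)
    best = rng.binomial(n_per_arm, p_best, size=n_sim)                 # responses on the truly best arm
    others = rng.binomial(n_per_arm, p_others, size=(n_sim, k_arms - 1))
    # Correct selection: the truly best arm shows a strictly higher observed count
    return np.mean(best > others.max(axis=1))

# Example: 3 arms, best arm superior by an assumed margin D = 0.15 (0.40 vs 0.25)
for n in (20, 40, 60):
    print(n, prob_correct_selection(n, p_best=0.40, p_others=0.25, k_arms=3))
```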

The selection design can be combined with a futility or superiority test in a sequential two-stage design, as described in the trial of Co-Q10 in ALS10. At the conclusion of Stage 1, a treatment is selected, and the statistical hypothesis is tested at the end of Stage 2. Including the Stage 1 subjects in the hypothesis test has the advantage of using all available outcome data but introduces a selection bias that must be accounted for in the test statistic. If the Stage 1 subjects are excluded from the Stage 2 hypothesis test, the parameter estimate is unbiased, but the overall sample size is increased.
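
The selection bias noted here can be demonstrated with a short simulation: when the Stage 1 “winner” is chosen by its observed response rate, reusing those Stage 1 subjects inflates the final estimate, whereas an estimate from Stage 2 subjects alone does not. All response rates and sample sizes below are illustrative assumptions.

```python
# A minimal Monte Carlo sketch of selection bias in a two-stage select-then-test design.
import numpy as np

rng = np.random.default_rng(3)
k, p_true, n1, n2, n_sim = 3, 0.30, 30, 60, 50_000    # all arms truly identical

stage1 = rng.binomial(n1, p_true, size=(n_sim, k))    # Stage 1 responses per arm
winner = stage1.argmax(axis=1)                         # select the numerically best arm
x1 = stage1[np.arange(n_sim), winner]                  # winner's Stage 1 successes
x2 = rng.binomial(n2, p_true, size=n_sim)              # winner's Stage 2 successes

print("pooled estimate:", ((x1 + x2) / (n1 + n2)).mean())  # biased above the true 0.30
print("Stage 2 only:   ", (x2 / n2).mean())                 # approximately 0.30
```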

Adaptive designs promise increased flexibility for the trial to respond to accumulating information, a promise that has generated both enthusiasm and confusion. According to the FDA draft guidance11, an adaptive design “includes… a prospectively planned opportunity for modification of one or more specified aspects of the study design and hypotheses based on analysis of data (usually interim data) from subjects in the study.” It is important to emphasize that which aspects of the design may be modified, and in what manner the trial will adapt to each, must be prespecified in the design stage in order to maintain trial validity.

While certainly not novel, group sequential methods are adaptive according to this definition, in that they allow the trial to be stopped early, based on accumulating data, in the face of overwhelming efficacy or futility. Other valid mechanisms for adaptation in Phase III trials include blinded sample size reestimation12 and covariate adaptive randomization13. Adaptations based on an unblinded assessment of interim data, including sample size reestimation and response adaptive randomization, may be more enticing but may also be more controversial in the confirmatory setting. Early phase designs allow for more flexibility with regard to adaptation, and adaptive designs have gained greater acceptance in this exploratory setting. Whether exploratory or confirmatory, Bayesian or Frequentist, each trial must demonstrate that the statistical operating characteristics remain sound in the face of the chosen adaptation(s).
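
As one concrete example of a blinded adaptation, the sketch below re-estimates the per-arm sample size for a continuous outcome from the pooled interim standard deviation, without unblinding treatment assignment; the planned effect size, significance level, and power are illustrative assumptions, and this is only one simple version of sample size reestimation.

```python
# A minimal sketch of blinded sample size re-estimation for a continuous outcome:
# the pooled SD is re-estimated from interim data ignoring treatment labels, and the
# per-arm sample size is recomputed for the originally planned effect size.
import numpy as np
from scipy.stats import norm

def reestimated_n_per_arm(interim_outcomes, planned_effect, alpha=0.05, power=0.90):
    sd_blinded = np.std(interim_outcomes, ddof=1)      # blinded (pooled) SD estimate
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(np.ceil(2 * (z * sd_blinded / planned_effect) ** 2))

# Example: interim outcomes pooled across both (still-blinded) arms
interim = np.random.default_rng(1).normal(loc=10, scale=12, size=150)
n_new = reestimated_n_per_arm(interim, planned_effect=4.0)
```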

Even a well-designed trial of an effective therapy can fail because of an inappropriate primary outcome or primary analysis. The selection of valid, reliable, and efficient outcome measures and analytic strategies for stroke clinical trials is receiving much attention in the stroke literature14. The modified Rankin Scale (mRS) is an ordinal disability measure commonly used as the primary efficacy outcome in stroke clinical trials. The traditional analysis dichotomizes the ordinal scale into a binary response, with success or failure determined by comparing each subject’s result to a fixed threshold. Responder analysis (also referred to as the sliding dichotomy or the stratified dichotomy) instead tailors the threshold for success to each subject’s baseline prognosis15; a subject with a mild stroke would face a more stringent definition of success than a subject with a severe stroke. This approach is thought to reflect the clinical perspective of outcome more accurately and is currently being implemented in the SHINE trial (clinicaltrials.gov NCT01369069). While these dichotomization approaches have the advantage of a relatively straightforward clinical interpretation, reducing an ordinal measure to a binary outcome discards information.
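
A minimal sketch of a severity-tailored responder definition is shown below; the baseline NIHSS strata and mRS cutpoints are illustrative assumptions and are not the definitions used in SHINE.

```python
# A minimal sketch of a responder (sliding dichotomy) definition for the mRS,
# with the threshold for "success" tailored to baseline stroke severity.

def responder(mrs_90day, baseline_nihss):
    """Return 1 if the subject meets the severity-adjusted definition of success."""
    if baseline_nihss <= 7:        # assumed mild stratum: require mRS 0-1
        return int(mrs_90day <= 1)
    elif baseline_nihss <= 14:     # assumed moderate stratum: require mRS 0-2
        return int(mrs_90day <= 2)
    else:                          # assumed severe stratum: require mRS 0-3
        return int(mrs_90day <= 3)

# Example: the same 90-day mRS of 2 is a failure after a mild stroke but a success after a severe one
print(responder(2, baseline_nihss=5), responder(2, baseline_nihss=20))  # 0, 1
```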

Analytic approaches that maintain the ordinal nature of the scale, sometimes referred to as shift analysis16, are statistically more powerful than dichotomized analyses; however, the clinical interpretation is also less intuitive. In addition, some of these ordinal approaches require distributional assumptions which may not be supported by the trial data. Ordinal regression, for example, requires the assumption of proportional odds, which means that the estimated odds ratio is constant across all possible cutpoints of the ordinal scale. Alternative methods must be carefully considered in the design stage; a clinical trial powered for ordinal regression may be underpowered for the logistic regression required if the proportional odds assumption is violated.
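
One informal way to examine the proportional odds assumption is to compute the treatment-versus-control odds ratio at every possible dichotomization of the mRS, which should be roughly constant if the assumption holds; the sketch below does this with simulated data, and all outcome distributions are illustrative assumptions.

```python
# A minimal sketch of a cutpoint-by-cutpoint check of the proportional odds assumption.
import numpy as np

def cutpoint_odds_ratios(mrs_treat, mrs_control, cutpoints=range(0, 6)):
    """Odds ratio of a 'good' outcome (mRS <= c) for treatment vs control at each cutpoint c."""
    ors = {}
    for c in cutpoints:
        a = np.mean(np.asarray(mrs_treat) <= c)
        b = np.mean(np.asarray(mrs_control) <= c)
        if 0 < a < 1 and 0 < b < 1:                    # skip degenerate cutpoints
            ors[c] = (a / (1 - a)) / (b / (1 - b))
    return ors

# Example with simulated 90-day mRS scores (0-6)
rng = np.random.default_rng(2)
treat = rng.choice(7, size=200, p=[0.20, 0.18, 0.17, 0.15, 0.12, 0.10, 0.08])
ctrl = rng.choice(7, size=200, p=[0.12, 0.14, 0.16, 0.17, 0.15, 0.14, 0.12])
print(cutpoint_odds_ratios(treat, ctrl))
```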

The mRS reflects a global assessment of function, but a patient may consider many other life aspects in describing outcome following a stroke. Patient-reported outcomes, including the NeuroQOL17 and PROMIS18 tools, may provide important information regarding quality of life, cognition and social functioning. Consideration of these varying aspects of global outcome may provide a more complete understanding of the evolution of stroke and allow a more sensitive estimate of treatment effect.

These are just a sampling of the novel approaches being considered, and in some cases implemented, in current stroke trials. Continued development of innovative trial designs, outcome assessments, and analytic approaches is an essential component of stroke research.

Footnotes

Disclosures: Dr. Yeatts is the SDMC PI for the phase II trial of deferoxamine in ICH (Hi-Def; U01 NS074425). She is the unblinded statistician for the IMS 3 (U01 NS077304) and ProTECT (U01 NS062778; Bio-ProTECT R01 NS071867) phase III clinical trials.

References

1. Hong KS, Lee SJ, Hao Q, Liebeskind DS, Saver JL. Acute stroke trials in the 1st decade of the 21st century. Stroke. 2011;42:e314.
2. Fisher M, for the Stroke Therapy Academic Industry Roundtable IV. Enhancing the development and approval of acute stroke therapies. Stroke. 2005;36:1808–1813. doi: 10.1161/01.STR.0000173403.60553.27.
3. Storer BE. Design and analysis of phase I clinical trials. Biometrics. 1989;45:925–937.
4. O’Quigley J, Pepe M, Fisher L. Continual reassessment method: a practical design for phase I clinical trials in cancer. Biometrics. 1990;46:33–48.
5. The IMS Trial Investigators. Combined intravenous and intra-arterial recanalization for acute ischemic stroke: the Interventional Management of Stroke study. Stroke. 2004;35:904–912. doi: 10.1161/01.STR.0000121641.77121.98.
6. The IMS II Trial Investigators. The Interventional Management of Stroke (IMS) II Study. Stroke. 2007;38:2127–2135. doi: 10.1161/STROKEAHA.107.483131.
7. Palesch YY, Tilley BC, Sackett DL, Johnston KC, Woolson R. Applying a phase II futility study design to therapeutic stroke trials. Stroke. 2005;36:2410–2414. doi: 10.1161/01.STR.0000185718.26377.07.
8. Herson J, Carter SK. Calibrated phase II clinical trials in oncology. Statistics in Medicine. 1986;5:441–447. doi: 10.1002/sim.4780050508.
9. Simon R, Thall PF, Ellenberg SS. New designs for the selection of treatments to be tested in randomized clinical trials. Statistics in Medicine. 1994;13:417–429. doi: 10.1002/sim.4780130506.
10. Levy G, Kauffman P, Buchsbaum R, Montes J, Barsdorf A, Arbing A, et al. A two-stage design for a phase II clinical trial of coenzyme Q10 in ALS. Neurology. 2006;66:660–663. doi: 10.1212/01.wnl.0000201182.60750.66.
11. US Food and Drug Administration. Draft Guidance for Industry: Adaptive Design Clinical Trials for Drugs and Biologics. 2010. http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM201790.pdf. Accessed October 31, 2012.
12. Chow SC, Chang M. Adaptive sample size adjustment. In: Chow SC, editor. Adaptive Design Methods in Clinical Trials. Boca Raton, FL: Chapman & Hall; 2007. pp. 137–159.
13. Chow SC, Chang M. Adaptive randomization. In: Chow SC, editor. Adaptive Design Methods in Clinical Trials. Boca Raton, FL: Chapman & Hall; 2007. pp. 47–73.
14. Saver JL. Optimal end points for acute stroke therapy trials: best ways to measure treatment effects of drugs and devices. Stroke. 2011;42:2356–2362. doi: 10.1161/STROKEAHA.111.619122.
15. Adams HP Jr, Leclerc JR, Bluhmki E, Clarke W, Hansen MD, Hacke W. Measuring outcomes as a function of baseline severity of ischemic stroke. Cerebrovascular Diseases. 2004;18:124–129. doi: 10.1159/000079260.
16. Saver JL. Novel end point analytic techniques and interpreting shifts across the entire range of outcome scales in acute stroke trials. Stroke. 2007;38:3055–3062. doi: 10.1161/STROKEAHA.107.488536.
17. Cella D, Nowinski C, Peterman A, Victorson D, Miller D, Lai J-S, et al. The Neurology Quality-of-Life Measurement Initiative. Archives of Physical Medicine and Rehabilitation. 2011;92:S28–S36. doi: 10.1016/j.apmr.2011.01.025.
18. Fries JF, Bruce B, Cella D. The promise of PROMIS: using item response theory to improve assessment of patient-reported outcomes. Clinical and Experimental Rheumatology. 2005;23:S53–S57.
