In the accompanying commentary, Lauer supports the role of large-scale randomized clinical trials (RCTs) in comparative effectiveness research (CER) and argues that the current debate on CER will reinvigorate the clinical trial enterprise. [1] Although I agree with Lauer that randomization will continue to play a large role in CER going forward, I believe that the current clinical trial enterprise needs reinvention rather than reinvigoration, as it falls short of the data-production infrastructure required to achieve the goals set out by the recent legislation on CER. [2] I discuss some of the concerns about the current RCT infrastructure and how we might rebuild it to meet the needs of health care.
Comparative effectiveness research (CER) is meant “to assist consumers, clinicians, purchasers and policymakers to make informed decisions”. [3] However, the informational requirements of these decision makers are starkly different. Moreover, decision making at all levels is strongly interrelated. Individual patients and their physicians usually require nuanced information to make the right treatment decision for that patient. Manufacturers and purchasers require information on the potential uptake of treatments in the population, and on the potential value generated among patients taking a treatment, to make correct pricing and quantity decisions. Policy makers in charge of insurance coverage decisions consider all of this information plus the budget impact of their decisions. Tunis et al [4] recognize this complexity in decision making and recommend practical clinical trials, for which the hypothesis and study design are developed specifically to answer the questions faced by decision makers. However, they also recognize that we do not have the luxury of time and budget to conduct a separate CER study for each level of decision within a clinical context. Therefore, how results emanating from a single CER study, or a few such studies, can inform all levels of decision making remains the biggest challenge in the design of CER studies going forward. RCTs are often touted for the power of randomization: random allocation of treatment equates the distributions of all possible factors that affect outcomes across the treatment groups, so any difference in average outcomes between the groups can be attributed to the difference in treatment allocation. While this simple and powerful idea helps establish a causal effect of treatment allocation among a group of patients, it is far from clear how such an effect should inform decision making for the various stakeholders, even though, historically, such results have influenced decision making at all levels.
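The logic of randomization described above can be illustrated with a small simulation. This is a hypothetical sketch, not data from any study cited here: each simulated patient carries an unobserved prognostic factor, coin-flip allocation balances that factor across arms, and the difference in mean outcomes recovers the assumed average treatment effect.

```python
import random
import statistics

# Illustrative simulation (assumed numbers): each patient has an unobserved
# prognostic factor; randomization balances it across the two arms, so the
# difference in mean outcomes estimates the average treatment effect.
random.seed(7)

TRUE_EFFECT = 2.0  # assumed average benefit of treatment, in outcome units
patients = [{"prognosis": random.gauss(0, 1)} for _ in range(100_000)]

treated, control = [], []
for p in patients:
    arm = treated if random.random() < 0.5 else control  # coin-flip allocation
    outcome = p["prognosis"] + (TRUE_EFFECT if arm is treated else 0.0)
    arm.append(outcome)

estimate = statistics.mean(treated) - statistics.mean(control)
print(round(estimate, 2))  # close to TRUE_EFFECT in large samples
```

Note that this recovers only the average effect among the enrolled population, which is exactly the limitation the commentary goes on to discuss.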
First, it is often a stretch to interpolate the average causal effect from a study to an effect that accrues to an individual within that study. For example, a physician trying to decide on the best smoking cessation intervention for a motivated and committed individual would find it hard to inform her decision based on the causal effect established in an RCT under intention-to-treat principles in which a large fraction of the enrollees were not as motivated or committed. Second, extrapolation of RCT results to those individuals who would typically not enroll in such studies remains contentious. Finally, CER on alternative access to treatments does not often equate to CER on alternative treatment assignments. Yet, in the absence of any other information, it is quite plausible that individuals and policy makers are drawn to behave (either directly or through nudges by a third party, e.g. marketing) as if the RCT results apply to them. How else could one explain that, when Nexium (esomeprazole) was found to heal slightly more patients with GERD than Prilosec (omeprazole) (93.7% vs. 84.2%, p-value < 0.05), [5] almost the entire patient population switched to the branded Nexium within a year and shunned the generic Prilosec, which was proven to work for at least 80% of patients? [6] Similarly, after the CATIE results on the comparative effectiveness of antipsychotics came out, [7] many State Medicaid agencies instituted prior authorization requirements on some of the second-generation antipsychotic drugs, [8] only to realize increased costs and decreased health. [9]
The influence of such average results on demand for a treatment has tremendous consequences for the entire health care economy. [10] This is illustrated by a recent example of Medicare coverage. The Centers for Medicare and Medicaid Services (CMS) recently conducted a year-long review of a new drug, sipuleucel-T (Provenge), that was found to extend median survival by 4.1 months in patients with castration-resistant prostate cancer when compared to placebo. [11] CMS decided to continue to pay for this drug at a price tag of $93,000 per patient. Over the next five years, about 120,000 patients will use Provenge [12] and are expected to gain a total of about 40,000 life-years at a price tag of $11.1B to Medicare, which will be paid by US taxpayers. In countries like the UK, where policy makers are allowed to consider costs in their decisions to cover new medical breakthroughs, drugs like Provenge would most likely not make the cut. But in the US, Medicare has resisted the urge to institute a formal rationing mechanism and has upheld patient and physician autonomy with regard to treatment choices. [13] Consequently, in the US setting it becomes even more important that resources are devoted to identifying the portion of the patient population that would actually benefit from a treatment, and not just to establishing that the treatment is beneficial on average. [14] If such targeting can be accomplished, Medicare would end up paying for Provenge for only a fraction of patients, each of whom would benefit substantially from taking it. In fact, Medicare may be open to paying double or triple the current price for Provenge if it is targeted optimally, thereby still preserving the manufacturer's incentives for innovation. Patient health would inevitably improve. Future incentives for research would line up with the needs of the patients who do not benefit from current technology.
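The budget arithmetic behind these figures is straightforward to verify, treating the 4.1-month median survival gain as if it were a mean gain for every patient (a simplification the back-of-the-envelope totals above also make):

```python
# Back-of-the-envelope check of the Provenge figures cited above.
price_per_patient = 93_000     # USD, CMS payment per patient
patients_5yr = 120_000         # projected users over five years
median_gain_months = 4.1       # survival gain vs. placebo

total_cost = price_per_patient * patients_5yr              # about $11.2 billion
total_life_years = patients_5yr * median_gain_months / 12  # about 41,000

print(f"${total_cost / 1e9:.1f}B for {total_life_years:,.0f} life-years")
```

At roughly $270,000 per life-year under these assumptions, the value of targeting the subset of patients who benefit most becomes apparent.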
However, none of these welfare-enhancing behaviors can be realized if we continue to rely on RCTs, practical or not, that give us average results, unless those averages are established for nuanced subpopulations. Population-level average results will instead continue to promote winner-takes-all environments that increase the risk faced by innovators and propel the demand that fuels unsustainable growth in health care costs without generating commensurate value. [10,14,15]
Therefore, arguments that RCTs give an unbiased estimate of an effect seem less important than the realization that the target parameters of RCTs are often misplaced. The recent shift towards the term patient-centered outcomes research (PCOR) carries important implications for the future of CER. The 2010 PPACA legislation specifically states: “Research shall be designed, as appropriate, to take into account the potential for differences in the effectiveness of healthcare treatments, services and items as used with various subpopulations, such as racial and ethnic minorities, women, age and groups of individuals with different comorbidities, genetic and molecular subtypes, or quality of life preferences and include members of such subpopulations as subject in the research as feasible and appropriate.” RCTs are not geared to answer the questions about effect heterogeneity that this legislation demands, mainly because they rely on hypothesis-testing approaches and cannot possibly accommodate the multitude of hypotheses across subgroups within the limits of time, money and recruitment.
Effect heterogeneity, however, is the fundamental building block for decision making at every level. [16] One approach to this problem is to develop prediction algorithms for individual-level treatment effect heterogeneity. Any attempt to individualize care based on prediction algorithms must begin with a hypothesis-generation exercise, and therefore these results can provide valuable resources to clinicians and policymakers, who in their absence must rely on traditional comparative effectiveness results based on averages. Such algorithms can be constructed not only by identifying low-dimensional individual characteristics such as genomic information, but also by collapsing high-dimensional outcomes and behavior into individual-level characteristics, which can be used to establish individualized treatment effects. [17] The necessity of an algorithmic approach also lies in the feasibility of translating enormous amounts of information to the bedside without overwhelming physicians. Observational data coupled with appropriate methodologies are far better suited for these purposes than explanatory trials when compared across the dimensions of cost, time and population coverage. The role of confirmatory RCTs then becomes even more relevant, as they would be designed to randomly assign algorithms that allocate treatment rather than to assign treatments themselves. Efforts along these lines are already underway in diabetes, [17] mental health [18] and cardiovascular health. [19] However, more investment in methods is required to develop prediction algorithms for effects rather than risks.
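As a toy illustration of why effect heterogeneity matters for targeting (all numbers here are assumptions, not results from the studies cited), consider a simulated trial in which a minority of patients carrying a hypothetical marker gain most of the benefit. Stratified effect estimates, the simplest precursor of a prediction algorithm, recover the heterogeneity that the population average hides:

```python
import random
import statistics

# Hypothetical sketch: a binary "marker" (e.g. a genomic or behavioral
# characteristic) modifies the true treatment effect. All effect sizes and
# prevalences below are assumed for illustration.
random.seed(3)

def simulate_patient():
    marker = random.random() < 0.3   # 30% of patients carry the marker
    treated = random.random() < 0.5  # randomized allocation
    effect = 5.0 if marker else 0.5  # assumed heterogeneous benefit
    outcome = random.gauss(0, 1) + (effect if treated else 0.0)
    return marker, treated, outcome

data = [simulate_patient() for _ in range(200_000)]

def mean_outcome(marker, treated):
    return statistics.mean(y for m, t, y in data
                           if m == marker and t == treated)

effect_marker = mean_outcome(True, True) - mean_outcome(True, False)
effect_no_marker = mean_outcome(False, True) - mean_outcome(False, False)
avg_effect = 0.3 * effect_marker + 0.7 * effect_no_marker

# The population-average effect (about 1.85 here) masks that most patients
# gain little (about 0.5) while marker carriers gain a lot (about 5.0) --
# precisely the information needed to target treatment and coverage.
```

Real individualized-effect algorithms must of course handle many candidate characteristics at once, which is why the text argues for prediction methods rather than one-subgroup-at-a-time hypothesis tests.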
Once information is generated to promote welfare enhancing decisions at the individual patient level, economic evaluations, budget-impact analysis and other forms of evaluations directed to decision making at the population level will naturally become much more informative. To what extent recent investments in comparative effectiveness research and the newly created Patient-Centered Outcomes Research Institute will help transform our current clinical research system towards achieving these goals remains to be seen.
Acknowledgment
The author acknowledges financial support from the National Institutes of Health grants R01MH083706, RC4CA155809 and R01CA155329. The commentary reflects views of the author and not necessarily those of the University of Washington or the National Bureau of Economic Research.
References
1. Lauer MS. How the debate about comparative effectiveness research should impact the future of clinical trials. Stat Med. 2012;XX:XXX–XXX. doi: 10.1002/sim.5400.
2. Patient Protection and Affordable Care Act of 2009, H.R. 3590, 111th Congress § 6301. 2010.
3. IOM. Initial national priorities for comparative effectiveness research. Washington, DC: 2009.
4. Tunis SR, Stryer DB, Clancy CM. Practical clinical trials: Increasing the value of clinical research for decision making in clinical and health policy. JAMA. 2003;290:1624–1632. doi: 10.1001/jama.290.12.1624.
5. Richter JE, Kahrilas PJ, Johanson J, Maton P, Breiter JR, Hwang C, Marino V, Hamelin B, Levine JG; Esomeprazole Study Investigators. Efficacy and safety of esomeprazole compared with omeprazole in GERD patients with erosive esophagitis: a randomized controlled trial. Am J Gastroenterol. 2001;96(3):656–665. doi: 10.1111/j.1572-0241.2001.3600_b.x.
6. Harris G. Prilosec’s Maker Switches Users To Nexium, Thwarting Generics. Wall Street Journal. 2002 Jun 6.
7. Lieberman JA, Stroup TS, McEvoy JP, Swartz MS, Rosenheck RA, Perkins DO, Keefe RS, Davis SM, Davis CE, Lebowitz BD, Severe J, Hsiao JK. Effectiveness of antipsychotic drugs in patients with chronic schizophrenia. N Engl J Med. 2005;353:1209–1223. doi: 10.1056/NEJMoa051688.
8. Polinski JM, Wang PS, Fischer MA. Medicaid’s prior authorization program and access to atypical antipsychotic medications. Health Affairs. 2007;26:750–760. doi: 10.1377/hlthaff.26.3.750.
9. Soumerai SB, Zhang F, Ross-Degnan D, Ball DE, LeCates RF, Law MR, Hughes TE, Chapman D, Adams AS. Use of atypical antipsychotic drugs for schizophrenia in Maine Medicaid following a policy change. Health Affairs. 2008;27:w185–w195. doi: 10.1377/hlthaff.27.3.w185.
10. Basu A, Jena A, Philipson T. Impact of comparative effectiveness research on health and healthcare spending. NBER Working Paper No w15633, 2010. J Health Econ. 2011;30(4):695–706. doi: 10.1016/j.jhealeco.2011.05.012.
11. Kantoff PW, Higano CS, Shore ND, Berger ER, Small EJ, Penson DF, Redfern CH, Ferrari AC, Dreicer R, Sims RB, Xu Y, Frohlich MW, Schellhammer PF; IMPACT Study Investigators. Sipuleucel-T immunotherapy for castration-resistant prostate cancer. N Engl J Med. 2010;363(5):411–422. doi: 10.1056/NEJMoa1001294.
12. Cetin K, Beebe-Dimmer JL, Fryzek JP, Markus R, Carducci MA. Recent time trends in the epidemiology of stage IV prostate cancer in the United States: analysis of data from the Surveillance, Epidemiology, and End Results Program. Urology. 2010;75(6):1396–1404. doi: 10.1016/j.urology.2009.07.1360.
13. Rosenbaum S, Frankford DM. Who should determine when health care is medically necessary? N Engl J Med. 1999;340(3):229–232. doi: 10.1056/NEJM199901213400312.
14. Basu A. Economics of individualization in comparative effectiveness research and a basis for a patient-centered healthcare. NBER Working Paper No w16900, 2011. J Health Econ. 2011;30(3):549–559. doi: 10.1016/j.jhealeco.2011.03.004.
15. Meltzer D, Basu A, Conti R. The economics of comparative effectiveness studies: Societal and private perspectives and their implications for prioritizing public investments in comparative effectiveness research. PharmacoEconomics. 2010;28(10):843–853. doi: 10.2165/11539400-000000000-00000.
16. Basu A. Individualization at the heart of comparative effectiveness research: The time for i-CER has come. Med Dec Making. 2009;29(6):N9–N11. doi: 10.1177/0272989X09351586.
17. Kaplan SH, Billimek J, Sorkin DH, Ngo-Metzger Q, Greenfield S. Who can respond to treatment? Identifying patient characteristics related to heterogeneity of treatment effects. Med Care. 2010;48(6 Suppl):S9–S16. doi: 10.1097/MLR.0b013e3181d99161.
18. Guo Y, Bowman FB, Clinton K. Predicting the brain response to treatment using a Bayesian hierarchical model with application to a study of schizophrenia. Hum Brain Mapp. 2008;29(9):1092–1109. doi: 10.1002/hbm.20450.
19. Goldberger JJ, Buxton AE, Cain M, Costantini O, Exner DV, Knight BP, Lloyd-Jones D, Kadish AH, Lee B, Moss A, Myerburg R, Olgin J, Passman R, Rosenbaum D, Stevenson W, Zareba W, Zipes DP. Risk stratification for arrhythmic cardiac death: Identifying the roadblocks. Circulation. 2011;123:2423–2430. doi: 10.1161/CIRCULATIONAHA.110.959734.