A phrase that one hears increasingly these days in biomedical circles is “comparative effectiveness research” (CER). No doubt, part of the reason is that the current administration in the United States has promised a large wave of funding for CER. But before rushing off to write grant proposals, it might be a good idea for us to have a clear notion of what CER is, or should be. There does not appear to be a universal definition, but the proposed definitions embody a simple idea: The medicine that should be practiced on people should be the medicine that has been shown to work, in some scientific sense. It is perhaps a mark of where we are in terms of progress in biomedical science that anyone can see this as a new goal.
Evidently, the previous version of this idea was “evidence-based medicine” (EBM). The roots of the EBM movement were in studies showing wide geographical medical-practice variations that did not seem rationally explainable, together with the generally low quality of medical reviews written by nominal opinion leaders. The core EBM activity is the evidence review, which is a systematic collection and assessment of articles pulled from the medical literature (broadly defined) in order to provide a summary judgment about a specific therapy for use in a specific population to treat a specific condition. A key tenet of the EBM approach is that high-quality randomized clinical trials (RCTs) form the best and primary evidence base, and, while other kinds of evidence can be used, they are deprecated. With these methodological strengths, why do we need to revisit the issue of the best scientific approach for discovering the best medical practices?
For some time now, complementary and alternative medicine (CAM) researchers have complained that RCTs are not universally appropriate in their work, and so, to the extent that EBM is tightly bound with this one approach, it may ask the wrong questions or give the wrong answers, at least regarding CAM. Part of the reason for this kind of criticism is that CAM research is generally on the biomedical frontier, where the wisdom that works well in conventional biomedicine may work rather poorly. The RCT grew up in a symbiotic fashion with the increasing production, marketing, and use of drugs targeted for well-defined conditions. It is no surprise, therefore, that RCTs work best for investigating drugs, or interventions that can be packaged like drugs and that have druglike effects. Nor is it a surprise, for this very reason, that CAM researchers have been at the forefront of RCT critics.
However, dissatisfaction with EBM has now spread beyond the CAM community. Family-medicine doctors are increasingly expressing the opinion that RCTs, as they are currently run, fail to provide answers to most of these doctors' pressing clinical questions. It is not clear, for example, that the conclusions drawn from RCTs, which would exclude most of the patients that a family-medicine doctor would see, can be safely extrapolated in any clinically useful way. Moreover, there has been consistent pressure to increase the sample sizes of RCTs, which may simply mean that ever-smaller and less clinically relevant differences will be found “significant.” Thus, at least part of the reason why EBM is seen as being in need of an overhaul is that it does not always do a good job of serving the needs of primary care, nor of CAM.
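The point about sample size can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a two-arm trial with equal arms, a continuous outcome with known standard deviation, and a simple two-sample z-test. The function names and the effect size of 0.05 standard deviations are hypothetical choices, not figures from any actual trial.

```python
import math


def two_sided_p(diff, sd, n_per_arm):
    """Two-sided p-value for a two-sample z-test (equal arms, known SD)."""
    se = sd * math.sqrt(2.0 / n_per_arm)   # standard error of the difference
    z = abs(diff) / se
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def min_n_for_significance(diff, sd, z_crit=1.959964):
    """Smallest per-arm n at which an observed difference `diff` reaches
    two-sided significance at the level implied by z_crit (default ~0.05)."""
    return math.ceil(2.0 * (z_crit * sd / diff) ** 2)


if __name__ == "__main__":
    # Hypothetical, clinically trivial difference: 0.05 standard deviations.
    for n in (100, 1_000, 10_000, 100_000):
        print(n, two_sided_p(0.05, 1.0, n))
    print("per-arm n needed:", min_n_for_significance(0.05, 1.0))
```

Under these assumptions the same tiny difference moves from clearly nonsignificant to highly “significant” purely because the arms grow, which is the sense in which larger trials can reward smaller and less clinically relevant effects.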
It would be a dreadful mistake if CER were to be conceptualized entirely as a protest against RCTs and conventional EBM philosophy. Saying what something is not, is not a good way of defining it. Again, however, it seems as though we could draw on the challenges of CAM, both research and practice, to make contributions to a positive definition of CER. The first characteristic I would choose is patient-centeredness. The notion is that clinical research ought to consider the individual patient as the primary unit of analysis and interpretation. The aim of the research should be to improve the care of the individual patient. The failing of RCTs in this regard is that they focus exclusively on the treatment of groups of patients, and usually discourage the extension of findings to subgroups, and never recommend extending the findings to individuals. The irony here is that, while nonintervention (“observational”) studies are frequently criticized on the grounds that they cannot be generalized, only the most vocal critics of the RCT point out that, using the same logic, the results of an RCT cannot be specialized to individual patients.
My second choice would be to focus on patient dynamics instead of statics. Frequent measurement of important outcomes, markers of outcomes, and behaviors, gives one the opportunity of saying something meaningful about the trajectories of individual patients. Cumulative knowledge then consists of generalizing the fundamental principles learned by the study of such patient trajectories, rather than by achieving statistical significance for an outcome that summarizes superficial measurements of groups of patients. Measurements made in the context of a trial, once at the start and once at the end, cannot shed any light on how the condition of each patient evolved (or failed to evolve) under actual therapeutic conditions.
A third choice would be that clinical research should be carried out on the kinds of patients who actually go to doctors and have the condition under study. In one of their more remarkable methodological lapses, the inventors of RCTs set out principles such as excluding from participation people who had comorbidities or who were unlikely to follow directions. This violates perhaps the most important principle of prospective studies (that a factor influencing the relationship between treatment and outcome must not be related to whether one is in the study).
A fourth choice is that we need methods that are more eclectic with regard to outcomes. Reimbursement schemes are sensitive to utilization of medical services, particularly when that utilization is chronic. Physicians and practitioners want to see their patients improve across a range of areas, including daily activities and life satisfaction, in addition to proper laboratory values or results of standardized tests. Different patients may value different outcomes differently, and, if we are going to be patient-centered, then we should take such differences seriously. Therefore, it seems important to consider multiple outcomes across several levels, which is forbidden by RCT dogma.
These four areas (patient-centeredness, patient dynamics, patient heterogeneity, and patient values) all lead to the same conclusion, in terms of where CER should place its heaviest focus. Most CER should be carried out using the health records of real patients.
We know that this is possible—at least in principle. There are health care delivery systems with good-to-excellent electronic medical records (EMRs) or electronic health records (EHRs), and some of these systems regularly carry out funded research based on those records. We also know that there are severe methodological challenges in drawing valid conclusions from studies of naturalistic data (meaning that there is no researcher-level intervention). Ironically, most of the statistical work that has been done to address these challenges comes out of the behavioral sciences, where interventions are rare, and not from the world of bench science, which inspired the form of the RCT. Nevertheless, although forging good CER methodology presents problems, there is a large and growing literature pointing toward solutions. One approach involves taking advantage of the sheer size of clinic populations, in order to form groups of highly similarly situated patients, then following their courses through different therapeutic pathways. If adequate “similarity” can be achieved, then the risk of confounding can be much diminished.
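As a rough illustration of the “similarly situated patients” idea, the following sketch groups records into strata of patients who share the same values on a chosen set of covariates, compares two therapies only within each stratum, and then averages the within-stratum differences. It is a minimal sketch of exact matching by stratification, one of several approaches in the literature, and not a recommended analysis. The record layout and the field names (`age_band`, `diabetic`, `therapy`, `outcome`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def stratified_difference(records, covariates, treatment_key, outcome_key, a, b):
    """Size-weighted mean of within-stratum outcome differences (a minus b).

    A stratum is the set of patients sharing identical covariate values.
    Strata lacking either therapy are dropped, since no like-with-like
    comparison is possible there. Returns None if no stratum qualifies.
    """
    strata = defaultdict(lambda: {a: [], b: []})
    for rec in records:
        if rec[treatment_key] in (a, b):
            key = tuple(rec[c] for c in covariates)
            strata[key][rec[treatment_key]].append(rec[outcome_key])

    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if groups[a] and groups[b]:
            n = len(groups[a]) + len(groups[b])
            weighted_sum += n * (mean(groups[a]) - mean(groups[b]))
            total += n
    return weighted_sum / total if total else None


# Hypothetical toy records: sicker patients tend to receive therapy "A".
toy_records = [
    {"age_band": "60+", "diabetic": True, "therapy": "A", "outcome": 2.0},
    {"age_band": "60+", "diabetic": True, "therapy": "A", "outcome": 4.0},
    {"age_band": "60+", "diabetic": True, "therapy": "B", "outcome": 1.0},
    {"age_band": "<60", "diabetic": False, "therapy": "A", "outcome": 8.0},
    {"age_band": "<60", "diabetic": False, "therapy": "B", "outcome": 5.0},
    {"age_band": "<60", "diabetic": False, "therapy": "B", "outcome": 7.0},
    {"age_band": "<60", "diabetic": True, "therapy": "A", "outcome": 6.0},
]

if __name__ == "__main__":
    print(stratified_difference(
        toy_records, ["age_band", "diabetic"], "therapy", "outcome", "A", "B"))
```

In the toy data the crude comparison of all “A” patients with all “B” patients understates the advantage of “A”, because “A” is given more often to the sicker stratum; comparing within strata removes that particular source of confounding, although only for the covariates that were actually recorded and matched on.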
A focus on clinic-based CER would have one immediate beneficial side effect. The only therapies that would be compared with each other would be those that would be used in actual practice. Thus, the inclusion of a placebo or sham group in a CER study would only make sense if clinicians would actually use it if it outperformed a putatively active therapy. Perhaps a large amount of money could be saved if we stopped testing therapeutic approaches that no one intends to use.
We also know, however, that EHR-based research presents practical problems. In the United States, the estimated percentage of physicians who have any kind of EMR for their patients is on the order of 10%. In addition, even this discouraging figure needs to be tempered by the fact that most of those EMRs will have been designed for purposes other than research, and so they may be of limited usefulness. European countries, including the United Kingdom, are in much better positions, although limited research usefulness will remain a problem. However, in a certain sense, the problems of EHR-based research are much smaller than the problems of RCT-based research. We will never have enough money to carry out all of the RCTs that would need to be done for all of the outstanding clinically important questions, but it is conceivable that we could invest in EHRs that would be research-ready. In addition, once that investment was made, the marginal cost of each EHR study would be a tiny fraction of that of its RCT counterpart.
Coming back to CAM, there is an additional aspect of the economic argument. We are all tired of reading CAM evidence reviews that conclude that there are not enough studies (or at least not enough high-quality studies) to justify any conclusions. This pattern has misled some scientists to take the position that there is no evidence that any CAM modality actually works. However, a moment's thought shows that the areas in which there will be enough studies are precisely those that have attracted the most funding, and CAM is not one of them. Indeed, in the United States, it is said that the pharmaceutical industry outspends the National Institutes of Health (NIH) on biomedical research, which would certainly explain why evidence reviews of drugs have an easier time coming to conclusions.
If we follow the path I suggest, that CER should be primarily EHR-based, then it becomes evident that, in many countries (especially the United States), CAM practices will be at a disadvantage. This is because CAM is usually practiced by solo practitioners or in very small clinics, which cannot afford the luxury of having sophisticated EMR systems. Of course, one can point to the lack of EHRs in health care generally, but many conventional clinics could conceivably add an EHR to their budgets in the near future. As a health care delivery system, CAM is much less organized. However, even this can be seen as an opportunity. If we are to take a patient-centered approach, then all of the health care received by each individual patient is important, regardless of whether it took place in an “official” or “casual” context. In the United States (lacking a health care system of even minimal rationality) especially, but also in other more medically advanced countries, capturing the nonclinical aspects of medical care will be a problem. In addition, trying to integrate information across clinics and practices will only make things harder. We should, therefore, be concentrating on the concept that the EHR belongs to the patient, should all be in one place (electronically), should be highly portable, and should be flexible in the types of information it contains. If we see that such a system is almost dictated by the requirements of CER, then we shall have opportunities to improve both patient care and patient-centered research at the same time. Of course, we would then also have the chance of putting CAM and conventional medicine research on the same footing, which would be an enormous step forward.
Acknowledgments
This work was supported by a grant (RC1AT005715) from the National Institutes of Health. The opinions expressed are solely those of the author.