My charge is to debate against evidence‐based medicine; yet, I am not opposed to evidence. Clearly, the more evidence we accumulate, the closer we get to understanding an issue. Researchers and clinicians alike must therefore press forward in the rigorous pursuit of knowledge. However, we must use the restraint, precision, and caution of a microsurgeon when we discuss evidence, truth, and knowledge.
We must dissect these words, separating them into their rightful places, lest we adulterate their meaning and reduce them to twisted, unrecognizable constructs. The reason is that these words are not equivalent. Evidence is information that is used to approach truth, whereas truth is infallible, unequivocal, immutable fact. The definition of knowledge has been debated through the ages, but the term is typically used to represent a person's comprehension of a particular subject. We see, therefore, that although truth is the object of our desire, being absolute, it is likely unattainable. Evidence does imbue us with knowledge, but in no way does it affirm truth. Thus, evidence‐based medicine cannot be our conduit to truth. It is, however, a system that can, when used correctly, enhance our ability to care for patients.

Strictly speaking, evidence‐based medicine seeks to use the best signals garnered from the best science to make clinical decisions. A hierarchy of medical evidence has been constructed, with double‐blind, placebo‐controlled, randomized clinical trials (RCTs) on top and experience‐based clinical acumen on the bottom. The justification of this hierarchy has been called into question. Although its roots dive to the depths of statistics, even the brightest of statisticians have raised eyebrows about our current construct. In addition to statistics, the issue spans science, philosophy, and mathematics. To elucidate the shortfall, and even danger, of today's love affair with evidence‐based medicine, I examine our methods for acquiring knowledge and demonstrate not only why our current system fails to meet its goals, but also how its failure spawns often unrecognized yet decidedly dangerous consequences.
Deductive and inductive reasoning represent the 2 distinct but often interconnected approaches we use to acquire knowledge in the pursuit of truth. Deductive reasoning moves from the general to the specific, a classic example being, “All men are mortal. Socrates is a man. Therefore Socrates is mortal.” Inductive reasoning moves in the opposite direction, taking a specific observation and using it to draw a general inference. An example of inductive reasoning can be seen in our RCTs: in a small group of prespecified study patients, drug x decreases cholesterol and consequently decreases heart attacks by 30%; we infer that giving drug x to all patients with characteristics similar to those in the study group will therefore reduce their risk of heart attacks by 30%. Although deductive reasoning does not depend on our observations and perceptions, inductive reasoning does. Plato—and many other philosophers and scientists—maintained that the human flaw inherent in inductive reasoning makes it an inferior means of approaching the truth. Plato's cave allegory depicts this nicely: in the cave, man observes shadows and, based on his particular perspective, infers their meaning. The shadows are, of course, reflections of something, not the thing itself. Our current gold standard for science, the RCT, uses inductive reasoning to draw conclusions. Making matters worse, RCTs' conclusions are by definition simply expressions of probability; they are decidedly not the truth. The evidence accumulated through these trials, although purely a proxy for truth, has been crowned as truth. By doing this we have allowed and even fostered the illusion that the best of our clinical trials answer questions of truth and falsehood. They do not. In fact, they were not originally intended to illuminate truth; they were only meant to bring us closer to it.
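The inductive leap in such a trial can be reduced to simple arithmetic on event rates. The sketch below uses invented numbers (70 and 100 events per 1000 patients, chosen only to reproduce a 30% relative risk reduction), not data from any actual trial:

```python
# Hypothetical RCT arithmetic: relative risk reduction (RRR).
# All counts are invented for illustration; no real trial is referenced.
treated_events, treated_n = 70, 1000   # heart attacks in the drug-x arm
control_events, control_n = 100, 1000  # heart attacks in the placebo arm

risk_treated = treated_events / treated_n  # 0.07
risk_control = control_events / control_n  # 0.10
rrr = 1 - risk_treated / risk_control      # the "30% reduction"

print(f"Relative risk reduction: {rrr:.0%}")  # prints "Relative risk reduction: 30%"
# The trial observed a 30% reduction in this particular sample;
# generalizing that figure to "all similar patients" is precisely
# the inductive step described in the text.
```

Note that the 30% figure is itself a sample statistic with a confidence interval around it, which is why the trial's conclusion is an expression of probability rather than a statement of truth.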
By distorting the factual significance of the outcomes of these trials, we put our faith in the shadows of Plato's cave, bowing to them as though they were divine. Allowing ourselves (and the public) to be misled into believing we know what we in fact do not, we not only draw erroneous scientific conclusions, but also carelessly discard sound, benchtop science in addition to well‐earned clinical acumen. Instead, we adopt phantoms of truth. We pray to idols. We have become prisoners of our own design, handcuffed by our self‐imposed methodologies.
So then how accurate are our top‐notch RCTs? In their 2011 article, “The Frequency of Medical Reversal,” Prasad et al. succinctly demonstrated the alarmingly high frequency of medical reversals.1 To estimate the rate of reversal, they searched 1 year of the New England Journal of Medicine for original articles. They found that of the original articles making claims about medical practice, 13% represented true reversals. They defined a reversal as “a new trial—superior to predecessors because of better design, increased power or more appropriate controls—contradicting current clinical practice.” Thus, reversals are not only commonplace, but also have far‐reaching implications, often unwinding and dismantling clinical practices that frequently involve surgery, medications, or a combination of both. These practices typically have had great impact on the lives of many, meaning that reversals are often accompanied by patient and physician angst, anger, or, in its most extreme form, outrage. Prasad et al.'s findings shed light on the ignored yet consequential fallout of our misrepresentation of evidence. It is not just physicians who are affected by these trials; our patients and the media are affected as well. A direct consequence of misrepresenting evidence as truth is that when what we presented as truth is found to be false—13% of the time in the Prasad et al. paper—the public views us as either fools or falsifiers. We lose our stature and credibility as teachers and truth seekers, becoming viewed as narrow‐minded, arrogant, and misguided soldiers of an impure science. If instead we were frank with our patients and ourselves, acknowledging that science is a process yearning for the truth but never quite reaching it, we would not have to contend with the misunderstandings and oftentimes ill will that accompany these all‐too‐frequent reversals.
The very underpinning of our scientific method is another factor that warrants scrutiny. Does the P value represent what we think it does, and is it the proper touchstone for us to use when we perform and interpret clinical trials? For decades researchers have disputed the merits of the P value, Goodman being one of the most eloquent and vocal.2, 3, 4 The P value was introduced in the 1920s by R. A. Fisher as a means of measuring the strength of evidence. It is defined as “the probability of the observed result, plus more extreme results, if the null hypothesis were true.” The null hypothesis, also developed by Fisher and the cornerstone of the modern scientific method, assumes there is no significant difference between the phenomena under study. It is imperative for us to recognize that the null hypothesis can never actually be proved; study data can merely reject or fail to reject it, and that rejection or failure to reject is determined by the P value. Unfortunately, there are many problems inherent in the P value. First and foremost, its very foundation is fragile. After all, its creator never intended for it to become the touchstone for truth; Fisher developed it as a tolerably flawed but useful tool to “guide the strength of evidence against the null hypothesis.” It was meant to tell researchers that a study had elicited enough meaning to warrant repeating it. Fisher never infused it with the power to define or identify truth. We, his descendants, have done so in error. Goodman argues effectively that the P value fails on many grounds to provide us with a solid measure of evidence. In fact, his “A Dirty Dozen” paper describes 12 common P value misconceptions.
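Fisher's definition can be made concrete with a simple worked example of my own devising (a coin‐fairness test, not drawn from Fisher or Goodman): given 60 heads in 100 flips, the P value sums the probability of every outcome at least as extreme as the one observed, assuming the null hypothesis of a fair coin is true.

```python
# Sketch: the textbook P value definition, applied to a coin-flip example.
# Null hypothesis: the coin is fair (p = 0.5). Observation: 60 heads in 100 flips.
from math import comb

n, observed_heads = 100, 60

def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips under the null (fair coin)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# "The probability of the observed result, plus more extreme results, if
# the null hypothesis were true": sum every outcome no more probable under
# the null than the one actually observed (the exact two-sided test).
threshold = binom_pmf(observed_heads, n)
p_value = 0.0
for k in range(n + 1):
    pr = binom_pmf(k, n)
    if pr <= threshold + 1e-12:  # small epsilon guards float comparison
        p_value += pr

print(f"two-sided P value: {p_value:.4f}")  # ≈ 0.0569
```

Note what this number is and is not: it conditions on the null being true, so it says nothing directly about the probability that the null (or the finding) is itself true.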
Goodman asserts that “the most serious consequence of this array of P value misconceptions is the false belief that the probability of a conclusion being in error can be calculated from the data in a single experiment without reference to external evidence or the plausibility of the underlying mechanism.” In other words, the P value must be viewed in the context of human physiology and prior knowledge; it is not a stand‐alone number that possesses enough information to direct the medical management of our patients. Another example of the P value's frailty is that it tends to overstate the case against the null hypothesis, resulting in an inappropriately high frequency of positive trials. The consequence is that we often prescribe medications and perform surgeries under the perforated umbrella of clinical evidence. Lest the reader feel that Goodman is alone in this battle, he is not. In fact, in 1998, the editor of the prestigious journal Epidemiology assiduously but unsuccessfully tried to ban the use of the P value!
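Goodman's central warning can be illustrated with a back‐of‐the‐envelope calculation. The numbers below (a 10% prior plausibility of the tested hypotheses and 80% power) are illustrative assumptions of mine, not figures from Goodman's papers; the point is only that the chance a “significant” result is wrong depends on external plausibility, not on the P value alone:

```python
# Sketch (hypothetical numbers): how often is a "significant" finding a
# false positive? That depends on prior plausibility and power, not on
# the P value threshold alone.
alpha = 0.05   # significance threshold
power = 0.80   # assumed probability of detecting a true effect
prior = 0.10   # assumed fraction of tested hypotheses that are actually true

true_positives  = prior * power          # truly effective and detected
false_positives = (1 - prior) * alpha    # ineffective but "significant"
false_positive_share = false_positives / (true_positives + false_positives)

print(f"P(false positive | significant) = {false_positive_share:.0%}")
# prints "P(false positive | significant) = 36%"
# Far higher than the 5% a naive reading of "P < 0.05" would suggest.
```

Under these assumed inputs, more than a third of “positive” trials would be wrong, which is one quantitative reading of why the P value “overstates the case against the null hypothesis.”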
In sum, through the medical system's endowment of the P value and the RCT with boundless, unfounded power, the lay public and physicians alike have become confused. Conflicting publications are released on nearly a weekly basis, each of them treated as gospel, its message shouted from the rooftops by the media as well as camera‐adoring members of our profession. The fact that science is a process is ignored. Undecipherable statistical jargon cloaks the fact that medical evidence emanates not from truth but from falsifiable inference (the rejection of the null hypothesis). Our facts are not facts; they are probabilities. Although we are enamored of the notion that we live in the world of truth in evidence‐based medicine, the real truth is that we know very little. Instead, we understand a great many things with a great deal of probability. For us to practice better medicine and perform superior research, we must all accept this fact. We need to evolve as a group beyond the omnipresent bias witnessed from the days of Galileo through the present time. There is no end to science; it is an evolutionary discipline. The best we can do at any moment is be familiar with the totality of the literature and practice medicine based upon its trends as well as our knowledge of human physiology and hard‐won clinical acumen. Only when enough compelling evidence tells us to do so should we modify our clinical practice. In this process, let us be forever vigilant not to slay the intuitive physician.
References
- 1. Prasad V, Gall V, Cifu A. The frequency of medical reversal. Arch Intern Med. 2011;171:1675–1676.
- 2. Goodman S. A dirty dozen: twelve p‐value misconceptions. Semin Hematol. 2008;45:135–140.
- 3. Goodman S. Toward evidence‐based medical statistics. 1: The P value fallacy. Ann Intern Med. 1999;130:995–1004.
- 4. Goodman S. Toward evidence‐based medical statistics. 2: The Bayes factor. Ann Intern Med. 1999;130:1005–1013.
