Annals of Surgery. 2006 Nov;244(5):668–676. doi: 10.1097/01.sla.0000225356.04304.bc

Comparison of Effects in Randomized Controlled Trials With Observational Studies in Digestive Surgery

Satoru Shikata, Takeo Nakayama, Yoshinori Noguchi, Yoshinori Taji, Hisakazu Yamagishi
PMCID: PMC1856609  PMID: 17060757

Abstract

Objectives:

To compare the results of randomized controlled trials versus observational studies in meta-analyses of digestive surgical topics.

Summary Background Data:

While randomized controlled trials have been recognized as providing the highest standard of evidence, claims have been made that observational studies may overestimate treatment benefits. This debate has recently been renewed, particularly with regard to pharmacotherapies.

Methods:

The PubMed (1966 to April 2004), EMBASE (1986 to April 2004) and Cochrane databases (Issue 2, 2004) were searched to identify meta-analyses of randomized controlled trials in digestive surgery. Fifty-two outcomes of 18 topics were identified from 276 original articles (96 randomized trials, 180 observational studies) and included in meta-analyses. All available binary data and study characteristics were extracted and combined separately for randomized and observational studies. In each selected digestive surgical topic, summary odds ratios or relative risks from randomized controlled trials were compared with observational studies using an equivalent calculation method.

Results:

Significant between-study heterogeneity was seen more often among observational studies (5 of 12 topics) than among randomized trials (1 of 9 topics). In 4 of the 16 primary outcomes compared (10 of 52 total outcomes), summary estimates of treatment effects showed significant discrepancies between the two designs.

Conclusions:

One fourth of observational studies gave different results than randomized trials, and between-study heterogeneity was more common in observational studies in the field of digestive surgery.



The first randomized controlled trial in medicine was an investigation of streptomycin in 1948.1 Since then, randomized controlled trials have been widely recognized as offering the gold standard for evaluating treatment efficacy and effectiveness and are classified as providing the highest grade of evidence in the hierarchy of research designs.2

Evaluations in the 1970s and 1980s suggested that observational studies may spuriously overestimate treatment benefits, yielding misleading conclusions.3–6 In recent years, this debate has resurfaced. Some reports have suggested that, for selected medical topics, randomized and observational studies may yield very similar results.7,8 Conversely, opposing results have been reported from a large number of diverse medical topics.9 Although these previous studies included some surgical topics, most assessed topics involving pharmacotherapies. However, pharmacologic and surgical therapies differ in clinical nature, and results from pharmacologic investigations may therefore not apply to surgical fields.

This issue warrants investigation with a focus on the surgical area, and no previous studies appear to have undertaken an exhaustive assessment of a single clinical field. The present study investigated digestive surgery, allowing a systematic search and evaluation.

This systematic and exhaustive search of a large number of diverse articles on digestive surgery seeks to answer the following question: Do observational studies in digestive surgery tend to produce the same results as randomized controlled trials?

METHODS

Search for Meta-Analyses of Randomized Controlled Trials and Selection of Topics

Meta-analyses of randomized controlled trials in digestive surgery that had been published up to April 2004 were selected as topics in this study. Retrieved articles were judged suitable for use as topics only if all of the following criteria were met: 1) meta-analysis of randomized controlled trials; 2) investigating digestive surgery; 3) assessing the treatment effects of at least one operative intervention versus any other intervention (operative or nonoperative); and 4) involving human subjects. Searches were not restricted by language. Studies were excluded if the main purpose was not evaluation of treatment effect (eg, studies of diagnosis). A literature search was performed using the PubMed (1966 to April 2004), EMBASE (1986 to April 2004) and Cochrane Library (Issue 2, 2004) databases. A computer-assisted search was conducted using the following combination of Medical Subject Headings terms and text words: “surgical procedures, operative,” “digestive system surgical procedures,” “randomized,” “random,” “meta-analysis,” and “review.” A manual search was also performed using references from the retrieved review articles.

Search for Observational Studies for Meta-Analysis

If meta-analyses of both randomized and observational studies had been performed on the same topic in a selected review article, those results were used for comparison. When the selected review article contained no meta-analysis of observational studies, we gathered observational studies and performed the meta-analysis ourselves, using the following process.

For meta-analyses of observational studies, we first searched observational studies for all selected topics. In each topic, the same inclusion criteria used for meta-analysis of randomized controlled trials were used, with the exception of study design. Observational study designs were used if they could be categorized as prospective nonrandomized studies, retrospective cohort studies, case-control studies, case series with control groups, or other unspecified designs (provided a control group was used). A literature search was performed using the PubMed (1966 to April 2004), EMBASE (1986 to April 2004) and Cochrane Library (Issue 2, 2004) databases. PubMed contains no search term for observational studies, so a text-word strategy was used to search for “observational,” “nonrandomized,” “case series,” “case control study,” “cohort,” “retrospective,” and “prospective.” In addition, a manual search was performed using references from the retrieved review articles. We also attempted to contact as many experts from the review articles as possible.

Data Extraction and Selection of Outcomes

All available binary data were extracted from the outcomes of the gathered observational studies. Data extraction was performed after translation of the article into English if the article had not been written in English or Japanese. Up to this point, 2 authors (S.S., T.N.) undertook the literature searches and data extraction independently, and disagreements were resolved by consensus.

For final inclusion of a topic in the present evaluation, binary data for the same outcome had to be available from at least one randomized trial and at least one observational study. When primary outcomes had been defined in the review article, these were used for the main comparison. Whenever the primary outcome was unclear, the outcome that was considered a priori as the most clinically important was selected, using consensus among the data extractors. In digestive surgery, mortality was generally given priority in clinical importance over other outcomes.

Statistical Analysis

For all selected topics, data from observational studies were combined. Generally, the fixed-effects model weighted by Peto’s odds ratio method or the Mantel-Haenszel method was used for data pooling, followed by a test of heterogeneity.10,11 Heterogeneity between studies was assessed using the Q statistic.12 Given the low power of this test, a significance level of 0.10 was used, rather than 0.05.13 If the test indicated significant heterogeneity, the random-effects model using the DerSimonian-Laird method was used.14 However, this study sought to compare summary estimates of randomized controlled trials with those of observational studies under equivalent conditions to the maximum extent possible. Thus, when performing meta-analysis of observational studies, we used the same method that had been used in the corresponding meta-analysis of randomized controlled trials. The quantity I² was also used to assess heterogeneity between trials in meta-analyses, calculated as I² = [(Q − df)/Q] × 100%, where Q is the χ² statistic and df is its degrees of freedom. A value greater than 50% may be considered indicative of substantial heterogeneity.15
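The heterogeneity quantities described above can be illustrated with a minimal Python sketch. It is not the authors' STATA procedure: for brevity it pools log odds ratios with generic inverse-variance weights rather than Peto's or the Mantel-Haenszel method, and the function name `pool_log_odds_ratios` and the example inputs are hypothetical. Negative values of I² are set to zero, following the usual convention for this statistic.

```python
import math

def pool_log_odds_ratios(log_ors, variances):
    """Pool study-level log odds ratios and summarize heterogeneity.

    Assumption: inverse-variance weighting stands in for the Peto and
    Mantel-Haenszel weights named in the text.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1

    # I^2 = [(Q - df) / Q] x 100, truncated at zero
    i2 = max(0.0, (q - df) / q * 100.0) if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of the between-study variance tau^2
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    random_effects = (
        sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    )

    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random_effects),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }

# Hypothetical odds ratios from three studies, each with variance 0.04 on the log scale
example = pool_log_odds_ratios(
    [math.log(0.5), math.log(2.0), math.log(1.0)],
    [0.04, 0.04, 0.04],
)
print(example)
```

To apply the 0.10 significance threshold, Q would be compared against a χ² distribution with df degrees of freedom; that step needs a χ² tail function and is omitted here.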

Although pooled odds ratio or pooled relative risk could be used as the indicator of summary estimates of outcomes, the present study used the same indicator that had been used in the meta-analysis of randomized controlled trials. In this context, odds ratios and relative risks will inevitably be similar in magnitude, as the rates of outcome events are low. Relative risks were therefore considered as odds ratios in comparisons of summary estimates. Confidence intervals were always calculated at 95%. When one arm of an outcome contained no events, this was considered a “zero cell” in the 2 × 2 table. Zero cells create problems in computing ratio measures of treatment effect. This problem was dealt with using a common method of adding 0.5 to each cell of the 2 × 2 table for the trial.16
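As a hedged illustration of the zero-cell handling described above, the sketch below adds 0.5 to every cell of a 2 × 2 table when any cell is empty, then computes the odds ratio with a 95% confidence interval on the log scale. The standard-error formula (square root of the summed reciprocal cell counts) is the usual large-sample approximation and is an assumption here, since the text does not state which variance estimator was used; the function name and example counts are hypothetical.

```python
import math

def odds_ratio_with_ci(events_a, total_a, events_b, total_b, z_crit=1.96):
    """Odds ratio (arm A vs arm B) with a 95% CI from a 2 x 2 table."""
    a, b = events_a, total_a - events_a   # arm A: events, non-events
    c, d = events_b, total_b - events_b   # arm B: events, non-events

    # Continuity correction: add 0.5 to each cell if any cell is zero
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    log_or = math.log((a * d) / (b * c))
    # Large-sample standard error of the log odds ratio (assumed estimator)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    return (
        math.exp(log_or),
        math.exp(log_or - z_crit * se),
        math.exp(log_or + z_crit * se),
    )

# Hypothetical trial: 0 deaths among 20 patients vs 3 deaths among 20 patients
print(odds_ratio_with_ci(0, 20, 3, 20))
```

Without the correction, the first arm's zero count would make the odds ratio exactly zero and its logarithm undefined, which is why ratio measures cannot be computed directly for such trials.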

To evaluate concordance between the results of randomized and observational studies, the following analyses were performed: 1) assessment of the number of cases in which the summary estimates of the observational studies suggested an effect at least double that of the randomized trials; and 2) evaluation of whether differences in the summary odds estimates of randomized controlled trials and observational studies for the same topic were larger than what would be expected by chance alone. To accomplish this, Z scores were calculated as follows:

$$
Z = \frac{\ln(\mathrm{OR}_{\mathrm{RCT}}) - \ln(\mathrm{OR}_{\mathrm{OBS}})}{\sqrt{\mathrm{var}[\ln(\mathrm{OR}_{\mathrm{RCT}})] + \mathrm{var}[\ln(\mathrm{OR}_{\mathrm{OBS}})]}}
$$

where ln(OR_RCT) is the natural logarithm of the odds ratio or relative risk of randomized controlled trials, ln(OR_OBS) is the natural logarithm of the odds ratio or relative risk of observational studies, and var is variance. A Z score above 1.96 or below −1.96 suggests a nonrandom difference between randomized controlled trials and observational studies (0.05 level of statistical significance).17
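The Z-score comparison can be sketched in Python as follows, assuming the statistic is the difference between the two log summary estimates divided by the square root of their summed variances, which is consistent with the definitions given above. The helper that recovers a variance from a reported 95% confidence interval is an added convenience, not something described in the text, and the function names and numeric inputs are hypothetical.

```python
import math

def log_variance_from_ci(lower, upper, z_crit=1.96):
    """Variance of a log ratio estimate, recovered from its 95% CI limits."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return se ** 2

def discrepancy_z(or_rct, var_rct, or_obs, var_obs):
    """Z score for the difference between two summary estimates (log scale)."""
    return (math.log(or_rct) - math.log(or_obs)) / math.sqrt(var_rct + var_obs)

# Hypothetical summary estimates and confidence intervals for one outcome
var_rct = log_variance_from_ci(0.8, 3.0)
var_obs = log_variance_from_ci(0.6, 1.1)
z = discrepancy_z(1.55, var_rct, 0.81, var_obs)
print(z, abs(z) > 1.96)
```

With this sign convention, a positive Z means the randomized trials gave the larger ratio and a negative Z means the observational studies did.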

All statistical analyses were performed using STATA statistical software version 8.1 (STATA Corporation, College Station, TX).

RESULTS

Characteristics of Topics, Observational Studies

A literature search was first performed to select meta-analyses of randomized controlled trials for the topics, identifying 1184 potentially relevant articles. The process finally identified and selected 15 meta-analyses of randomized controlled trials for digestive surgical topics in this research (Fig. 1).7,18–31 Three of the 15 reviews contained two topics.21,30,31 Thus, 18 topics were identified for comparison of summary estimates between randomized controlled trials and observational studies (Table 1).

FIGURE 1. Summary profile of search for meta-analyses of randomized controlled trials.

TABLE 1. Topics of Meta-Analyses Considering Both Randomized Controlled Trials and Observational Studies

Meta-analyses of observational studies could not be identified for 10 of the 18 topics (topics 2, 3, 8–13, 17, and 18), so additional meta-analyses were required. Meta-analyses of observational studies had been identified for the remaining 8 topics (topics 1, 4–7, and 14–16), and the results were used for comparisons in this study.

For meta-analyses of observational studies for the 10 topics without existing meta-analyses, a literature search was performed and 111 observational studies were selected from 10,960 articles using the process outlined in Figure 2. Of the 111 selected articles, 17 had not been written in English or Japanese, instead appearing in 7 different languages, and the 2 trial assessors therefore abstracted data from the articles after translation into English by independent translators. A total of 52 common outcomes for both randomized controlled and observational studies were available for comparison in this study.

FIGURE 2. Summary profile of search for observational studies.

Using the described processes, 52 outcomes of 18 topics were investigated in 276 original articles (96 randomized trials, 180 observational studies) with a total of 101,170 study patients (Table 1). The 180 observational studies comprised 36 prospective and 144 retrospective studies. Randomized and observational studies on the same topic generally administered treatment in the same way and outcome measures were similarly defined.

Between-Study Heterogeneity

Data on between-study heterogeneity using the I² statistic were available for all 10 meta-analyses of observational studies that we performed specifically for the present study (topics 2, 3, 8–13, 17, and 18). Conversely, such data had not been described in the 8 remaining, previously reported meta-analyses (topics 1, 4–7, and 14–16). Among the primary outcomes of the 16 topics, significant heterogeneity was noted among randomized controlled trials in 1 of 9 topics (11.1%) and among observational studies in 5 of 12 topics (41.7%). The difference between these rates was not statistically significant (P = 0.18 by Fisher exact test).
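The comparison of heterogeneity rates (1 of 9 versus 5 of 12 topics) can be checked with a short pure-Python sketch of a two-sided Fisher exact test. The implementation sums the probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided P value and is assumed here; the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum tables no more likely than the observed one (small float tolerance)
    return sum(
        prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )

# Randomized trials: 1 heterogeneous topic, 8 not; observational: 5 and 7
p = fisher_exact_two_sided(1, 8, 5, 7)
print(round(p, 3))  # 0.178
```

The result of about 0.18 agrees with the P value reported above.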

Comparison of Primary Outcomes

In almost all topics, the primary outcome defined in the review or decided by author consensus was mortality. However, in topics dealing with safety of procedures, such as appendectomy and operation for fissure-in-ano, one of the complications, such as risk of wound infection or persistence of fissure, was considered as a more appropriate primary outcome.

In 16 of 18 topics, primary outcomes could be compared between observational studies and randomized controlled trials. These summary estimates and associated 95% confidence intervals are shown in Figure 3. One of 16 primary outcomes displayed a magnitude of effect in the combined observational studies that was outside the 95% confidence interval for the combined randomized controlled trials (topic 14). In 4 of 16 primary outcomes, summary estimates from observational studies were at least double those from randomized controlled trials (topics 7, 8, 15, and 17). The converse occurred in 3 topics (topics 11, 14, and 16) (exact P = 0.45 by Wilcoxon test). Evaluation by Z score revealed significant discrepancies between randomized trials and observational studies for 4 of 16 primary outcomes (topic 7, Z = −4.28; topic 8, Z = −2.36; topic 11, Z = 2.19; topic 14, Z = 4.34).

FIGURE 3. Comparison of primary outcomes between observational studies and randomized controlled trials. This figure is based on data from 13 review articles7,18–24,26,27,29–31 and 10 meta-analyses of observational studies by the authors. OR, odds ratio; RR, relative risk; CI, confidence interval. *Outcome reporting relative risk rather than odds ratio.

Comparison of All Outcomes

All summary estimates for 52 outcomes of 18 topics are shown in Table 2. Three types of calculation model were used: random-effects calculation using the DerSimonian-Laird method, and fixed-effects calculation using either Peto’s odds ratio method or the Mantel-Haenszel method. In 21 of 52 outcomes, relative risk was evaluated rather than odds ratio in meta-analyses of observational studies, as the original meta-analyses of randomized controlled trials had used relative risks for evaluations.

TABLE 2. Summary Estimates for All Outcomes

In 9 of 52 outcomes, summary estimates from observational studies were at least double those from randomized controlled trials. The converse occurred in 10 outcomes (exact P = 0.943 by Wilcoxon test). Evaluation by Z score revealed significant discrepancies between randomized trials and observational studies in 10 of 52 outcomes.

Overall, these data suggest that about one fourth of observational studies gave different results than randomized trials.

DISCUSSION

Using data from 276 articles in 18 topics, summary estimates were compared between randomized controlled trials and observational studies in digestive surgery. Significant between-study heterogeneity occurred more often among observational studies than among randomized controlled trials. One fourth of the summary estimates of treatment effects in randomized controlled trials and observational studies differed significantly from each other. These findings suggest that observational studies in digestive surgery tend to yield results similar to those of randomized controlled trials, or at least do not systematically overestimate or underestimate treatment effects relative to randomized controlled trials.

Our findings support the conclusions of earlier evaluations in the 1970s and 1980s.3–6 In 2001, Ioannidis et al investigated 45 diverse pharmacologic and surgical topics in 408 articles and concluded that observational studies tend to indicate larger treatment effects (28 of 45 topics vs. 11 of 45 topics) and between-study heterogeneity is more frequent among observational studies than among randomized controlled trials (41% vs. 23%).9 On the other hand, previous studies by Benson and Hartz7 and Concato et al8 reached the opposite conclusion. Benson and Hartz7 investigated 19 diverse pharmacologic and surgical treatments in 136 articles and found little evidence of larger or differing estimates of treatment effects in observational studies compared with randomized controlled trials. Concato et al8 evaluated 5 clinical topics and 99 articles, concluding that well-designed observational studies do not systematically overestimate the magnitude of treatment effects when compared with randomized controlled trials on the same topic.

All these previous studies have made substantial contributions toward identifying the problems caused by differing study designs. However, conclusions have inevitably been in the form of general statements, as the studies addressed diverse topics in various clinical fields. The present study was limited to a single clinical field, digestive surgery, and thus offers two advantages over previous studies: a more exhaustive search than is possible in studies of diverse clinical fields, and higher applicability to clinical practice than a general statement.

In 25% of digestive surgical topics, summary estimates of treatment effects in observational studies yielded different results than randomized trials, but both designs reached similar results in the remaining topics. This may be attributable to various factors. First, the quality of surgical randomized controlled trials is low according to some review articles and may be so low that the essential contents of randomized trials do not differ from those of observational studies.32,33 Second, for most topics, sample sizes may be too small to detect clinically important differences between the results of the two types of study. In fact, 12 of 18 topics used fewer than 500 randomized patients. Combined with the use of a rare endpoint, mortality, very wide confidence intervals can be expected in the randomized evidence. Such wide confidence intervals make it very difficult to demonstrate any significant discrepancy between the two designs.

This study examined not only primary outcomes but also secondary outcomes. Generally, results about concordance of different studies may vary depending on whether primary or secondary outcomes are examined. Discrepancies may be less apparent for secondary outcomes than for primary outcomes because secondary events are likely to be too uncommon to show any significant difference between arms except in extremely large trials (mega-trials).17

One possible explanation for the greater frequency of between-study heterogeneity in observational studies than in randomized trials is that each observational study usually includes a wide spectrum of subjects from the population at risk. In contrast, randomized trials use specific inclusion criteria and may not be representative of populations seen in clinical practice.

All topics examined in this study were comparisons in the form of A versus B. Generally, A represented a new procedure while B represented an accepted method, but deciding which was newer was difficult in some topics. Most trials in medicine estimate the benefits of pharmacologic effects, whereas 50 of 52 outcomes in this study estimate risks of operations, such as mortality and morbidity. Discrepancies in summary estimates were estimated accordingly between randomized trials and observational studies. For example, the greatest statistical discrepancy between the two types of study design was topic 14 (mortality), comparing extended and limited lymph node dissections for adenocarcinoma of the stomach (Z = 4.34). In this topic, although the summary estimate from observational studies was one fourth that from randomized trials (0.63 vs. 2.39), this represented an underestimation of risks, not of benefits.

This analysis showed that one fourth of observational studies gave results different from those of randomized trials and that between-study heterogeneity was more common in observational studies in the field of digestive surgery. Furthermore, even if clinical applicability is improved by combining a large number of observational studies, estimations of treatment effect sometimes differ from those obtained from randomized controlled trials. The present study confirmed such tendencies in the well-defined area of digestive surgery. However, observational studies offer several advantages over randomized controlled trials, including lower cost, greater timeliness, and a broader range of patients.34 These benefits remain worthy of attention in real clinical settings, particularly where random allocation is not easily accepted by either clinicians or patients. In the field of digestive surgery, large observational studies may actually be more reliable than small underpowered randomized controlled trials. To clarify how to interpret the findings of observational studies and randomized controlled trials, further analyses in other fields are eagerly awaited.

Footnotes

Supported by a Health and Labour Sciences Research Grant (Health Technology Assessment) from the Ministry of Health, Labour and Welfare, Japan.

Reprints: Takeo Nakayama, MD, PhD, Department of Health Informatics, Kyoto University School of Public Health, Konoe-cho, Yoshida, Sakyo-ku, Kyoto 606-8501, Japan. E-mail: nakayama@pbh.med.kyoto-u.ac.jp.

REFERENCES

  • 1. Streptomycin treatment of pulmonary tuberculosis: a Medical Research Council investigation. BMJ. 1948;2:769–782.
  • 2. U.S. Preventive Services Task Force. Guide to Clinical Preventive Services: Report of the U.S. Preventive Services Task Force, 2nd ed. Baltimore: Williams & Wilkins, 1996.
  • 3. Chalmers TC, Matta RJ, Smith H Jr, et al. Evidence favoring the use of anticoagulants in the hospital phase of acute myocardial infarction. N Engl J Med. 1977;297:1091–1096.
  • 4. Sacks HS, Chalmers TC, Smith H Jr. Randomized versus historical controls for clinical trials. Am J Med. 1982;72:233–240.
  • 5. Colditz GA, Miller JN, Mosteller F. How study design affects outcomes in comparisons of therapy. I: Medical. Stat Med. 1989;8:441–454.
  • 6. Miller JN, Colditz GA, Mosteller F. How study design affects outcomes in comparisons of therapy. II: Surgical. Stat Med. 1989;8:455–466.
  • 7. Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000;342:1878–1886.
  • 8. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892.
  • 9. Ioannidis JP, Haidich AB, Pappa M, et al. Comparison of evidence of treatment effects in randomized and nonrandomized studies. JAMA. 2001;286:821–830.
  • 10. Yusuf S, Peto R, Lewis J, et al. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Prog Cardiovasc Dis. 1985;27:335–371.
  • 11. Mantel N, Haenszel WH. Statistical aspects of the analysis of data from retrospective studies of diseases. J Natl Cancer Inst. 1959;22:719–748.
  • 12. Fleiss JL. Statistical Methods for Rates and Proportions. New York: Wiley, 1981.
  • 13. Fleiss JL. Analysis of data from multiclinic trials. Control Clin Trials. 1986;7:267–275.
  • 14. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7:177–188.
  • 15. Higgins JPT, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–560.
  • 16. Egger M, Davey Smith G, Altman DG. Systematic Reviews in Health Care: Meta-Analysis in Context, 2nd ed. London: BMJ Books, 2001.
  • 17. Ioannidis JP, Cappelleri JC, Lau J. Issues in comparisons of meta-analyses and large trials. JAMA. 1998;279:1089–1093.
  • 18. Leiboff AR, Soroff HS. The treatment of generalized peritonitis by closed postoperative peritoneal lavage: a critical review of the literature. Arch Surg. 1987;122:1005–1010.
  • 19. Spina GP, Henderson JM, Rikkers LF, et al. Distal spleno-renal shunt versus endoscopic sclerotherapy in the prevention of variceal rebleeding: a meta-analysis of 4 randomized clinical trials. J Hepatol. 1992;16:338–345.
  • 20. Urbach DR, Kennedy ED, Cohen MM. Colon and rectal anastomoses do not require routine drainage: a systematic review and meta-analysis. Ann Surg. 1999;229:174–180.
  • 21. Nelson RL. Meta-analysis of operative techniques for fissure-in-ano. Dis Colon Rectum. 1999;42:1424–1428.
  • 22. Hulscher JB, Tijssen JG, Obertop H, et al. Transthoracic versus transhiatal resection for carcinoma of the esophagus: a meta-analysis. Ann Thorac Surg. 2001;72:306–313.
  • 23. Urschel JD, Blewett CJ, Bennett WF, et al. Handsewn or stapled esophagogastric anastomoses after esophagectomy for cancer: meta-analysis of randomized controlled trials. Dis Esophagus. 2001;14:212–217.
  • 24. Urschel JD, Urschel DM, Miller JD, et al. A meta-analysis of randomized controlled trials of route of reconstruction after esophagectomy for cancer. Am J Surg. 2001;182:470–475.
  • 25. Urschel JD, Blewett CJ, Young JE, et al. Pyloric drainage (pyloroplasty) or no drainage in gastric reconstruction after esophagectomy: a meta-analysis of randomized controlled trials. Dig Surg. 2002;19:160–164.
  • 26. Singer MA, Nelson RL. Primary repair of penetrating colon injuries: a systematic review. Dis Colon Rectum. 2002;45:1579–1587.
  • 27. Lustosa SA, Matos D, Atallah AN, et al. Stapled versus handsewn methods for colorectal anastomosis surgery: a systematic review of randomized controlled trials. Sao Paulo Med J. 2002;120:132–136.
  • 28. Sutherland LM, Burchard AK, Matsuda K, et al. A systematic review of stapled hemorrhoidectomy. Arch Surg. 2002;137:1395–1406.
  • 29. McCulloch P, Nita ME, Kazi H, et al. Extended versus limited lymph nodes dissection technique for adenocarcinoma of the stomach. Cochrane Database Syst Rev. 2003;(4):CD001964.
  • 30. Merlin TL, Hiller JE, Maddern GJ, et al. Systematic review of the safety and effectiveness of methods used to establish pneumoperitoneum in laparoscopic surgery. Br J Surg. 2003;90:668–679.
  • 31. Papi C, Catarci M, D’Ambrosio L, et al. Timing of cholecystectomy for acute calculous cholecystitis: a meta-analysis. Am J Gastroenterol. 2004;99:147–155.
  • 32. Solomon MJ, McLeod RS. Surgery and the randomized controlled trial: past, present and future. Med J Aust. 1998;169:380–383.
  • 33. McCulloch P, Taylor I, Sasako M, et al. Randomized trials in surgery: problems and possible solutions. BMJ. 2002;324:1448–1451.
  • 34. Feinstein AR. Epidemiologic analyses of causation: the unlearned scientific lessons of randomized trials. J Clin Epidemiol. 1989;42:481–489.

Articles from Annals of Surgery are provided here courtesy of Lippincott, Williams, and Wilkins
