Journal of Epidemiology and Community Health. 2006 May;60(5):382–388. doi: 10.1136/jech.2004.031880

Randomised by (your) god: robust inference from an observational study design

George Davey Smith
PMCID: PMC2563965  PMID: 16614326

In 1954 Archie Cochrane, then aged 45, was diagnosed as having an acute attack of polio. He was sceptical of the diagnosis at the time, but evidently thought little more about it.1 Eleven years later he was informed that his sister, Helen (fig 1), was acutely ill in a psychiatric hospital in Glasgow, said to be suffering from senile psychosis. Unhappy with this diagnosis, and hearing that she was potentially terminally ill, he enlisted the help of a physician friend to visit her in hospital. There it was determined that she was suffering from an acute attack of porphyria, precipitated by the use of barbiturates. Archie remembered back to his supposed polio attack and recalled that he had been prescribed a sleeping tablet before it. It looked as though he, too, had suffered an attack of porphyria.

Figure 1 Archie Cochrane and his sister Helen (source: Bosch FX. Archie Cochrane: Back to the Front, Barcelona, 2003).

As variegate porphyria is an autosomal dominant condition, Archie reasoned that informing family members of their possible propensity could avoid future episodes, through avoidance of triggering agents, as well as guard against potentially serious misdiagnoses. He therefore contacted family members and obtained urine or faecal samples, to trace transmission through the family. As a man convinced of the necessity of high response rates he was justifiably proud of the fact that samples were obtained from 152 of 153 living descendants of his maternal great grandfather. Indeed the relevant family tree, framed and hanging on the wall, was among the first items discussed when I visited him in Rhoose Farm House in 1985. In his report of the investigation of familial distribution of variegate porphyria tendency Archie calculated mortality rates among the maternal side of his family and, although very imprecisely estimated, a hint of excess total mortality was seen for women aged 45–64.2 Although the genetic variants underlying variegate porphyria are now well characterised3 these have many non‐specific effects. Thus the present level of understanding of the function of these variants contributes little to the understanding of disease aetiology and modifiable environmental risk factors more generally. This is not so, however, for genetic variants with more circumscribed functional consequences. Indeed, examining the association of such functional genetic variants with health outcomes offers one potential way of obtaining robust inferences regarding modifiable environmental causes of disease. To illustrate the potential of this approach (sometimes referred to as Mendelian randomisation) I will briefly outline why there is a problem in conventional observational epidemiology, introduce Mendelian randomisation as one potential solution, and discuss the limitations of this approach.

Limits of observational epidemiology

To someone interested in the health consequences of a modifiable environmental exposure—say, a particular aspect of diet—the obvious approach would be to directly study dietary intake and how this relates to the risk of disease. Why, then, should an alternative approach be advanced? The impetus for thinking of new approaches is that conventional observational study designs have yielded findings that have failed to be confirmed by randomised controlled trials.4 Observational studies showed that β carotene intake was associated with a lower risk of lung cancer and cardiovascular disease mortality, and some authorities were impressed enough with this evidence that large scale randomised controlled trials of β carotene supplementation were launched. Large numbers of people took β carotene, justification for which would come from reports such as the 1990 review of this issue that concluded “Available data thus strongly support the hypothesis that dietary carotenoids reduce the risk of lung cancer”.5 However, when large scale randomised controlled trials reported their findings β carotene supplementation produced no reduction in risk of lung cancer or cardiovascular disease. Similarly, observational studies reported that people taking vitamin E supplements had lower risk of coronary heart disease, and again based on such studies randomised controlled trials were launched. The results were similarly disappointing—there was no evidence of benefit from supplements, despite the fact that the trials directly mimicked the observational studies that had studied the apparent consequences of supplemental vitamin E intake.

In 2001 the Lancet published an observational study showing an inverse association between circulating vitamin C levels and incident coronary heart disease.6 The left hand side of figure 2 summarises these data, presenting the relative risk for 15.7 μmol/l higher plasma vitamin C level, assuming a log‐linear association. As can be seen, adjustment for confounders had little impact on this association. However a large scale randomised controlled trial, the heart protection study, examined the effect of a supplement that increased average plasma vitamin C levels by 15.7 μmol/l. In this study randomisation to the supplement was associated with no decrement in coronary heart disease risk.7

Figure 2 Estimates of the effects of an increase of 15.7 μmol/l plasma vitamin C on CHD five year mortality estimated from the observational epidemiological EPIC study and the randomised controlled heart protection study. EPIC m, men, age adjusted; EPIC m*, men, adjusted for age, systolic blood pressure, cholesterol, BMI, smoking, diabetes, and vitamin supplement use; EPIC f, women, age adjusted; EPIC f*, women, adjusted for age, systolic blood pressure, cholesterol, BMI, smoking, diabetes, and vitamin supplement use.

It is possible to advance case by case explanations for the discrepancies between observational studies and randomised controlled trials of particular exposures. However, it seems probable that a general explanation also applies. Thus there is considerable confounding between vitamin C levels and other exposures that could increase the risk of coronary heart disease. In the British women's heart and health study (BWHHS), for example, women with higher plasma vitamin C levels were less likely to be in a manual social class, have no car access, be a smoker or be obese and more likely to exercise, be on a low fat diet, have a daily alcoholic drink, and be tall.8 Furthermore, among these women in their 60s and 70s, those with higher plasma vitamin C levels were less likely to have come from a home 50 years or more previously in which their father was in a manual job, or which had no bathroom or hot water, or within which they had to share a bedroom. They were also less likely to have limited educational attainment. In short, a substantial amount of confounding by factors from across the life course that indicate increased risk of coronary heart disease was seen. Table 1 illustrates how four simple dichotomous variables from across the life course can generate large differences in cardiovascular disease mortality.9

Table 1 Cardiovascular mortality according to cumulative risk indicator (father's social class, screening social class, smoking, alcohol use)9.

Cumulative risk indicator        Number   CVD deaths   Relative risk
4 favourable (0 unfavourable)       517           47   1
3 favourable (1 unfavourable)      1299          227   1.99 (1.45 to 2.73)
2 favourable (2 unfavourable)      1606          354   2.60 (1.92 to 3.52)
1 favourable (3 unfavourable)      1448          339   2.98 (2.20 to 4.05)
0 favourable (4 unfavourable)       758          220   4.55 (3.32 to 6.24)

In the BWHHS 15.7 μmol/l higher plasma vitamin C level was associated with a relative risk of incident coronary heart disease of 0.88 (95% CI 0.80 to 0.97).10 This is close to the estimates seen in the observational study summarised in figure 2. When we adjusted for the same confounders as were adjusted for in that study the estimate changed very little—to 0.90 (95% CI 0.82 to 0.99). Only when we additionally adjusted for confounders acting across the life course was considerable attenuation seen, with a residual relative risk of 0.95 (95% CI 0.85 to 1.05). It is obvious that given inevitable amounts of measurement imprecision in the confounders, or a limited number of missing confounders, the residual association is essentially null and close to the finding of the randomised controlled trial. Most studies have more limited information on potential confounders, and in other fields we may be even more ignorant of the confounding factors we should measure. In these cases inferences drawn from observational epidemiological studies may be seriously misleading. It is for these reasons that alternative approaches—including Mendelian randomisation—need to be applied.
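The attenuation described above can be made concrete with a little arithmetic on the log relative risk scale. The sketch below is illustrative only: the function name `attenuation` is hypothetical, the three relative risks are the BWHHS point estimates quoted in the text, and the confidence intervals are ignored.

```python
import math

def attenuation(rr_before: float, rr_after: float) -> float:
    """Proportion of the log relative risk removed by further adjustment."""
    return 1 - math.log(rr_after) / math.log(rr_before)

rr_initial = 0.88      # BWHHS estimate per 15.7 umol/l higher vitamin C
rr_standard = 0.90     # after adjusting for the same confounders as the earlier study
rr_life_course = 0.95  # after additionally adjusting for life course confounders

att_standard = attenuation(rr_initial, rr_standard)
att_life_course = attenuation(rr_initial, rr_life_course)
print(f"conventional adjustment removes {att_standard:.0%} of the log relative risk")
print(f"life course adjustment removes {att_life_course:.0%} of the log relative risk")
```

On these point estimates the conventional adjustment removes a little under a fifth of the association, whereas the life course adjustment removes roughly three fifths, which is the contrast the paragraph draws.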

What is Mendelian randomisation?

Mendelian randomisation is an instrumental variables approach,11 in which genetic variants are the instruments. In instrumental variable approaches a measure is required that is related to the exposure of interest, but is not related through any other pathway to the outcome.12 Thus the instruments will serve as proxies for the exposure of interest, but the instruments will not have a confounded or biased association with the disease outcome. Therefore, the association of the instrument with the outcome can provide robust evidence regarding the potentially causal association of exposure and disease.

Genetic variants can serve as useful instruments within this framework.13,14,15,16,17,18,19 Within populations of homogeneous origin the vast majority of genetic variants will be uncorrelated with other variants, the exception being those which are physically close together on a chromosome and thus remain associated despite repeated meioses, an association referred to in the genetic literature as “linkage disequilibrium”. Furthermore, genetic variants tend to be unrelated to the behavioural and socioeconomic factors that underlie so much confounding in conventional observational epidemiology. Thus if a genetic variant can be taken to proxy for a modifiable environmental risk factor, or for a potentially modifiable physiological measure, the variant should only be related to the disease outcome to the extent to which it serves as a proxy for these factors. As well as allowing an unconfounded estimate of exposure and disease associations, the genetic variant will not be influenced by reverse causation. While the onset of disease may change people's behaviour, or alter physiological parameters such as circulating cholesterol levels or blood pressure, such diseases will not change germ line genetic variants.20,21
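The logic of using a genetic variant as an instrument can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is invented, the variant frequency and effect sizes are arbitrary, and the ratio (Wald) estimator is a standard instrumental variable estimator that the text itself does not spell out. The exposure has no effect on the outcome in this toy population, but an unmeasured confounder raises the exposure and lowers the outcome.

```python
import random
from statistics import mean

def simulate(n=50_000, true_effect=0.0, seed=1):
    """Return (genotype, exposure, outcome) triples from a toy confounded population."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = 1 if rng.random() < 0.3 else 0         # variant carried by about 30% (assumed)
        u = rng.gauss(0, 1)                        # unmeasured confounder
        x = 1.0 * g + u + rng.gauss(0, 1)          # exposure: raised by variant and confounder
        y = true_effect * x - u + rng.gauss(0, 1)  # outcome: lowered by the confounder
        rows.append((g, x, y))
    return rows

def observational_slope(rows):
    """Least squares slope of outcome on exposure, ignoring the confounder."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def wald_ratio(rows):
    """Genotype-outcome difference divided by genotype-exposure difference."""
    x1 = mean([x for g, x, _ in rows if g == 1])
    x0 = mean([x for g, x, _ in rows if g == 0])
    y1 = mean([y for g, _, y in rows if g == 1])
    y0 = mean([y for g, _, y in rows if g == 0])
    return (y1 - y0) / (x1 - x0)

rows = simulate()
print(f"observational slope: {observational_slope(rows):+.2f}")
print(f"instrumental variable (Wald ratio) estimate: {wald_ratio(rows):+.2f}")
```

In this setup the observational slope is clearly negative, suggesting a protective effect that does not exist, while the genotype based estimate stays close to the true value of zero because genotype is unrelated to the confounder.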

The utility of this approach may be best appreciated through examples, and those below have been selected to illustrate the range of inferences that can be drawn.

Alcohol and health: reporting bias, reverse causation, confounding, or cause?

Studying the association of alcohol and health outcomes is problematic, as the reporting of alcohol intake may be seriously biased in a way that could be differential with respect to health outcomes. Furthermore, health problems might lead people to reduce or stop drinking, generating an apparently protective effect of alcohol consumption, and confounding will occur—heavy drinkers are likely to smoke more and display other unfavourable behavioural and socioeconomic characteristics, putting them at high risk of disease. The predicted influence of bias, reverse causation, and confounding on the alcohol‐disease association could go in either direction, inflating or attenuating real effects.

Mendelian randomisation can help here because a genetic variant exists that is strongly associated with alcohol consumption. Alcohol is oxidised to acetaldehyde, which in turn is oxidised by aldehyde dehydrogenases (ALDHs) to acetate. Half of Japanese people are heterozygotes or homozygotes for a null variant of ALDH2 and peak blood acetaldehyde concentrations after alcohol challenge are 18 times and five times higher among homozygous null variant and heterozygous individuals respectively compared with homozygous wild type individuals.22 This renders the consumption of alcohol unpleasant through inducing facial flushing, palpitations, drowsiness, and other symptoms. As table 2 shows, there are considerable differences in alcohol consumption according to genotype.23 The principles of Mendelian randomisation are seen to apply—two factors that would be expected to be associated with alcohol consumption—age, and cigarette smoking—which would confound conventional observational associations between alcohol and disease, are not related to genotype, despite the strong association of genotype with alcohol consumption.

Table 2 Relation between characteristics and ALDH2 genotype: the 2*2/2*2 and 2*2/2*1 genotypes are associated with avoidance of alcohol consumption23.

Characteristic             2*2/2*2       2*2/2*1       2*1/2*1       p Value
Age (years)                61.3 (0.8)    61.5 (0.4)    60.6 (0.4)    NS
BMI (kg/m2)                23.1 (0.2)    23.0 (0.1)    23.3 (0.1)    NS
Alcohol (cups/day)         0.21 (0.06)   0.6 (0.03)    1.16 (0.03)   0.0001
% smoker                   48.5          47.9          47.7          NS
% hypertension             40.6          37.7          46.9          0.0002
Cholesterol (mg/dl)        203 (2.3)     203 (1.1)     203 (1.0)     NS
Triglyceride (mg/dl)       134 (7.4)     137 (3.5)     150 (3.3)     0.012
HDL cholesterol (mg/dl)    48 (1.0)      52 (0.5)      54 (0.5)      0.0001

Values are expressed as the mean and standard error. Alcohol consumption in cups/day (one cup of Japanese alcohol corresponds to 25.2 ml ethanol).
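The pattern in table 2 (a large difference in alcohol consumption by genotype alongside no difference in age) can be checked roughly from the published means and standard errors. This is a sketch under stated assumptions: it compares only the homozygous null variant and homozygous wild type columns, treats the groups as independent, and uses a simple z statistic, whereas the p values in the table come from the original study's own three group analysis. The function name `z_from_means` is hypothetical.

```python
import math

def z_from_means(mean_a, se_a, mean_b, se_b):
    """z statistic for the difference between two independent group means."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# (mean, standard error) taken from table 2
alcohol_null, alcohol_wild = (0.21, 0.06), (1.16, 0.03)  # cups/day
age_null, age_wild = (61.3, 0.8), (60.6, 0.4)            # years

z_alcohol = z_from_means(*alcohol_wild, *alcohol_null)
z_age = z_from_means(*age_wild, *age_null)
print(f"alcohol: z = {z_alcohol:.1f}")
print(f"age:     z = {z_age:.1f}")
```

The alcohol difference is many standard errors from zero while the age difference is well within one, consistent with genotype being a strong proxy for alcohol intake without being related to this potential confounder.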

It would be expected that ALDH2 genotype influences diseases known to be related to alcohol consumption, and as proof of principle it has been shown that ALDH2 null variant homozygosity, which is associated with low alcohol consumption, is indeed related to a lower risk of liver cirrhosis.24 Considerable evidence, including data from randomised controlled trials, suggests that alcohol increases HDL cholesterol levels25,26 (which should protect against CHD) and blood pressure (which should counteract or reverse the protective effect of alcohol).27,28 In line with this, ALDH2 genotype is strongly associated with HDL cholesterol and hypertension in the expected direction (table 2). Given the apparent protective effect of alcohol against CHD risk seen in observational studies, possession of the null variant ALDH2 allele, which is associated with lower alcohol consumption, should be associated with a greater risk of myocardial infarction, and this is what was seen in a case‐control study.23 Men either homozygous or heterozygous for null variant ALDH2 were at twice the risk of myocardial infarction. Supporting the reasoning that the HDL cholesterol elevating effects of alcohol render it protective against coronary heart disease, statistical adjustment for HDL cholesterol greatly attenuated the association between ALDH2 genotype and CHD.

The implications of this example are that examination of ALDH2 null variant and disease associations can provide evidence regarding the causal influence of alcohol, rather than that people should be screened for ALDH2 variants as a way of determining their disease risk. This basic point—that genetic variant‐disease associations provide evidence about environmentally modifiable risk of disease in populations, rather than point to detection of genetic susceptibility in individuals as a way of targeting interventions—is central to the Mendelian randomisation enterprise.

Cholesterol and coronary heart disease

Randomised controlled trials of lowering cholesterol with statins have shown that circulating cholesterol has a causal effect on coronary heart disease risk to most (but not all29) people's satisfaction. Before this there was some debate regarding causality, and some commentators still consider that confounding or reverse causation could generate the association (fig 3). A variant in the gene coding for apolipoprotein B (Apo B) is associated with a 2.6 mmol/l higher circulating cholesterol level, a condition known as familial defective Apo B.30,31 The principle of Mendelian randomisation applies: although related to a substantial difference in cholesterol level, the variant is not related to triglyceride, fibrinogen, glucose, body mass index, or waist‐hip ratio. The variant is, however, associated with an odds ratio for coronary heart disease of 7 (95% CI 2.2 to 22).32 This is greater than would be predicted by randomised controlled trial evidence on lowering total cholesterol and reducing CHD mortality, but the genetic variant is associated with lifelong differences in cholesterol level, whereas trials only lower cholesterol for a few years. The greater increase in risk is therefore expected, and is likely to reflect the effect of increased cholesterol acting over many decades. This illustrates one particular value of Mendelian randomisation approaches, which is that differences in an exposure proxied by a genetic variant will reflect long term differences, not subject to the short term fluctuations that apply to single measures of, say, cholesterol level in middle age. The implication of the finding that familial defective Apo B is related to higher coronary heart disease is that cholesterol reduction will reduce CHD risk in the whole population, not that screening for this genetic variant to detect people at high risk would be valuable. The population attributable risk of familial defective Apo B for coronary heart disease is trivial, and even if the risk in all people carrying this variant was reduced to the background population risk by treatment, the effect on overall population CHD occurrence would be small.
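Why a variant with a large relative risk can still have a trivial population attributable risk is easy to see numerically. The sketch below uses Levin's formula for the population attributable fraction, which is a standard formula but is not given in the text. The odds ratio of 7 is taken from the paragraph above and used as an approximation to the relative risk; the carrier frequency is a hypothetical value chosen only for illustration.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction for a binary exposure."""
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)

odds_ratio = 7.0           # from the text; treated here as an approximate relative risk
carrier_frequency = 0.001  # hypothetical carrier frequency, for illustration only

paf = attributable_fraction(carrier_frequency, odds_ratio)
print(f"population attributable fraction: {paf:.2%}")
```

With a carrier frequency of this order the attributable fraction is well under 1%, even though carriers themselves are at markedly raised risk. The value of the variant lies in what it reveals about cholesterol, not in how much disease it accounts for.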

Figure 3 Non‐randomised comparisons.

Statin therapy, HMG‐CoA reductase genotype, and cholesterol reduction

Genetic variants in the HMG‐CoA reductase gene are associated with a differential cholesterol response to statin treatment; for example, one study reported a 42 compared with 33 mg/dl fall in total cholesterol according to genotype.33 This difference is comparatively small, although it is highly statistically robust. It is unlikely to be useful in clinical practice: people with a slightly smaller response could simply receive a greater dose of statin based on cholesterol response, not genotype.

These findings are, however, potentially exciting for another reason. The mode of action of statins in preventing CHD is debated, and it has been suggested that mechanisms other than lowering circulating LDL‐cholesterol are involved. For example, an influence on inflammatory causes of CHD34 may underlie their influence on CHD risk. Indeed opponents of the diet‐heart theory have cited the additional actions of statins as evidence against the argument that lipid lowering by statins provides strong evidence of a causal relation between cholesterol and coronary heart disease.29 Variants in the HMG‐CoA reductase gene, combined with the reasoning offered by the “Mendelian randomisation” paradigm, offer a way of resolving this issue, however. If cholesterol lowering is the mechanism through which statins act, then in fixed dose randomised trials of statin treatment, in the group randomised to statin treatment, those with the HMG‐CoA reductase variant related to a lesser reduction in cholesterol level should have a higher risk of CHD than those with variants related to greater statin induced cholesterol reduction. Among the control group CHD risk should not be related to HMG‐CoA reductase gene variant, as genotype is not related to baseline blood cholesterol levels. If these effects were seen, and if a lack of influence of these variants on other potential mechanisms were confirmed, then the attribution of the CHD risk reduction seen with statins to their cholesterol reducing effects could be reliably inferred. These data would confirm that, despite some attempts to argue the opposite,29 circulating cholesterol levels are causally related to CHD and thus that non‐pharmacological ways of reducing cholesterol levels will lead to reductions in CHD occurrence. Such increased knowledge of disease mechanism and mode of drug action may offer more to public health than the supposed benefits of “personalised medicine”.35

Folate, MTHFR, and neural tube defects

Examining the effects of genotype of mothers on the health outcomes of their children can be termed “intergenerational Mendelian randomisation”.18 In these circumstances, exposures of interest relate to aspects of the intrauterine environment that are difficult to measure but are modified by maternal genotype. For example, folate deficiency in pregnancy is now known to be a cause of neural tube defects (NTDs), an effect confirmed by the beneficial effects of periconceptional folate supplementation.36,37 The MTHFR 677C→T polymorphism is associated with slower enzymatic processing of folate, and in a meta‐analysis of case‐control studies of NTDs, mothers homozygous for the null variant (TT) had twice the risk of having an infant with an NTD compared with homozygous CC mothers.38 The relative risk of an NTD associated with the TT genotype in the infant was less than that seen with respect to maternal genotype, and there was no effect of paternal genotype on offspring NTD risk. This suggests that it is the intrauterine environment (influenced by maternal TT genotype operating as a proxy for lower maternal folate levels), rather than the genotype of offspring, that increases the risk of NTD (see fig 4). The association between maternal MTHFR genotype and offspring NTD risk provides evidence that maternal folate intake is a key aetiological (and potential preventive) factor, as has been confirmed by the randomised trials of folate supplementation. The population attributable risk of maternal MTHFR for NTDs would be low, however, and this finding does not suggest that screening women for this genetic variant will be a particularly useful strategy. Instead it provides evidence that maternal folate is causally related to offspring NTD risk.

Figure 4 Parental MTHFR and offspring neural tube defect risk.38

Box 1 Limitations of Mendelian randomisation

  • Failure to establish reliable genotype‐intermediate phenotype or genotype‐disease associations

  • The confounding of genotype‐intermediate phenotype disease associations by linkage disequilibrium between the genetic variant of interest and another genetic variant with an influence on the outcome under investigation.

  • The confounding of genotype‐intermediate phenotype disease associations by pleiotropic effects of genetic variants.

  • Canalisation and developmental stability.

  • Lack of suitable polymorphisms for studying modifiable exposures of interest.

  • Inadequate biological understanding of the function of genetic variants.

Fibrinogen and coronary heart disease: proving a negative?

In the case of cholesterol and CHD we saw that a genetic variant related to higher cholesterol level was associated with a higher risk of CHD, as would be expected if cholesterol had a causal effect on CHD. This approach offers a method for investigating the causal effect of many other intermediate phenotypes. It has, for example, been applied to the potential causal influence of circulating fibrinogen levels on CHD risk. Many observational epidemiological studies show that higher levels of circulating fibrinogen are related to increased risk of CHD.39 Fibrinogen levels are, however, higher in a wide variety of population subgroups known to have increased CHD risk—for example, cigarette smokers, people from less favourable socioeconomic backgrounds, non‐drinkers, and people who engage in less leisure time activity.40 Thus confounding could generate a positive fibrinogen‐CHD association, and furthermore atherosclerosis itself may increase fibrinogen levels, meaning that fibrinogen could be a marker of disease state rather than a causal factor in its own right.

Box 2 Misunderstandings of Mendelian randomisation

One common misunderstanding regarding Mendelian randomisation relates to the fact that the environmentally modifiable exposures that are proxied for by the genetic variants used in studies within the Mendelian randomisation paradigm are influenced by many factors other than these genetic variants. For example, when discussing the case of fibrinogen and coronary heart disease (which is discussed in this paper), Jousilahti and Salomaa49 consider that this approach does not take into account the complex genetic background of multifactorial disease, and fails to recognise that genetic and environmental factors other than the polymorphism under study influence fibrinogen levels. The suggestion here is that because other factors are involved, the association between a genetic polymorphism related to fibrinogen level and disease outcomes cannot be taken to provide information about the association between fibrinogen and coronary heart disease. This reflects a basic misunderstanding of epidemiology, as well as of Mendelian randomisation. Of course many genetic and environmental factors influence fibrinogen levels; indeed, with respect to environmental factors, this is why the association between fibrinogen and coronary heart disease is strongly confounded, and thus cannot be reliably estimated from observational epidemiology. However, groups defined by the genetic variant under study consistently differ with respect to mean fibrinogen level. As a thought experiment, consider another example. Within a population the use of antihypertensive medicines (even if these are widely and appropriately prescribed) will only make a small contribution to the variance in blood pressure. However, this obviously does not mean that antihypertensive drugs will not have an influence on the sequelae of raised blood pressure. The fact that there are many environmental and genetic factors contributing to blood pressure other than antihypertensive drugs is irrelevant, as is the wide range of factors contributing to variance in fibrinogen levels. Groups that differ with respect to use of antihypertensive drugs will differ with respect to blood pressure, and this will be shown in their clinical event rate. Similarly groups that differ with respect to a genetic variant related to fibrinogen level will differ with respect to fibrinogen level, and if fibrinogen level were a cause of coronary heart disease (in the way that blood pressure is indeed a cause of coronary heart disease) these groups would have different rates of disease.

In a large case‐control study Youngman and colleagues41 showed that fibrinogen was associated with heart disease in the usual way: a 0.12 g/l increase conferred a relative risk of CHD of 1.20 (1.13 to 1.26). However, fibrinogen is also influenced by a polymorphism in the β fibrinogen gene, presence of the T allele being associated with a 0.12 g/l increase in serum levels in the population. Presence or absence of the T allele should not be associated with any of the behavioural or environmental correlates of fibrinogen that may confound associations with heart disease. Therefore, estimates of effects of this allele on CHD risk are, in effect, unconfounded “intention to treat” estimates of the effect of the higher fibrinogen levels associated with presence of the allele. In their study, Youngman and colleagues found that the relative risk of CHD associated with presence of the T allele (that is, the unconfounded effect of a 0.12 g/l increase in fibrinogen) was 1.03 (0.96 to 1.10). This provides evidence against a strong causal association of fibrinogen with CHD, although it does not, of course, exclude any association. Indeed it can be shown that 30 000 cases and 30 000 controls are required to exclude a relative risk of 1.5 for CHD between the top and bottom tertile of fibrinogen at 80% power.14
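Because the measured and the genetically proxied fibrinogen differences are both 0.12 g/l, the two relative risks quoted above can be compared directly. The sketch below is a rough illustration and not the analysis performed in the cited study. It assumes the quoted intervals are 95% confidence intervals, recovers standard errors from them on the log scale, and treats the two estimates as independent, which is only an approximation since they come from the same study. The function name `log_rr_and_se` is hypothetical.

```python
import math

def log_rr_and_se(rr, lower, upper, z_crit=1.96):
    """Log relative risk and its standard error recovered from a 95% interval."""
    return math.log(rr), (math.log(upper) - math.log(lower)) / (2 * z_crit)

obs_log, obs_se = log_rr_and_se(1.20, 1.13, 1.26)  # measured fibrinogen, per 0.12 g/l
gen_log, gen_se = log_rr_and_se(1.03, 0.96, 1.10)  # T allele, also 0.12 g/l

diff = obs_log - gen_log
z = diff / math.sqrt(obs_se ** 2 + gen_se ** 2)    # assumes independent estimates
print(f"ratio of relative risks: {math.exp(diff):.2f}")
print(f"z for the difference:    {z:.1f}")
```

Under these assumptions the observational estimate is clearly larger than the genotype based estimate, which is what would be expected if much of the observational association were attributable to confounding or reverse causation.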

Limitations of Mendelian randomisation

Mendelian randomisation is an attractive strategy for improving strength of causal inference that can be drawn from observational epidemiology. However, there are several limitations that need to be considered (box 1). Firstly, and not specific to Mendelian randomisation, establishing robust evidence on genetic variant‐disease associations has proved problematic, probably largely because effect sizes will be small and publication bias will ensure that many of the putative associations that reach public attention are actually chance findings.42 Strategies for improving this situation have been discussed,42 and the Mendelian randomisation approach would benefit greatly from the robust establishment of associations between functional genetic variants and disease outcomes.

Given a true genotype–phenotype association, when are Mendelian randomisation interpretations misleading? By “true” in this statement is meant an association that is not attributable to population stratification (in which the coexistence of different disease rates and allele frequencies within population sub‐sections leads to an association between the two at the whole population level). Confounding could occur through a second genetic variant that influences disease risk being in linkage disequilibrium with the variant under study. A second form of confounding will arise if the genetic variant has more than one functional effect (known as pleiotropy). A possible example of this relates to the association of APOE genotype and coronary heart disease.43 The APOE genotype is associated with differences in cholesterol level, but the genotype is not associated with CHD risk in the way predicted by this. This could be because the APOE genotype associated with lower cholesterol levels is also associated with less efficient transfer of very low density lipoproteins and chylomicrons.44,45
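Population stratification, as defined in the parenthesis above, can be illustrated with two invented subpopulations. All numbers are hypothetical: within each subpopulation carriers and non‐carriers have exactly the same disease risk, but the subpopulations differ in both allele frequency and disease risk. The function name `risk_ratio_by_carrier` is likewise made up for this sketch.

```python
def risk_ratio_by_carrier(groups):
    """Carrier vs non-carrier risk ratio from (size, carrier fraction, disease risk) groups."""
    carriers = sum(n * f for n, f, _ in groups)
    noncarriers = sum(n * (1 - f) for n, f, _ in groups)
    carrier_cases = sum(n * f * r for n, f, r in groups)
    noncarrier_cases = sum(n * (1 - f) * r for n, f, r in groups)
    return (carrier_cases / carriers) / (noncarrier_cases / noncarriers)

# Hypothetical subpopulations: (size, fraction carrying the variant, disease risk).
# Within each one, disease risk is identical for carriers and non-carriers.
sub_a = (1000, 0.1, 0.05)
sub_b = (1000, 0.5, 0.20)

print(f"within A: {risk_ratio_by_carrier([sub_a]):.2f}")
print(f"within B: {risk_ratio_by_carrier([sub_b]):.2f}")
print(f"pooled:   {risk_ratio_by_carrier([sub_a, sub_b]):.2f}")
```

The variant has no association with disease inside either subpopulation, yet pooling them produces an apparent excess risk among carriers simply because carriers are concentrated in the higher risk subpopulation.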

The possible role of confounding through linkage disequilibrium or pleiotropy could be investigated through tabulating other disease risk factors by genotype. This, of course, relies on knowledge of other causal factors for the disease in question and militates against some of the advantages of Mendelian randomisation. Empirical evidence to date suggests that, when this has been studied, confounders are only rarely related to genotype, but more systematic evidence on this is required.

A further potential limitation relates to the process of canalisation or developmental adaptation that may occur in response to the effect of the genotype under investigation. Compensation may occur to a perturbation introduced by genotypic effects expressed during fetal development. Thus if a genotype changed, say, cholesterol levels during fetal development, the developmental programme could be changed to compensate for this, such that tissues were less susceptible to the action of cholesterol, for example. Findings from knock‐out animal model preparations (in which a gene is essentially rendered non‐expressive) often show less severe phenotypic effects than anticipated from knowledge of the function of the knocked‐out gene.46 A full discussion of this issue is available elsewhere.13 In terms of human studies using the Mendelian randomisation approach it is unclear how important this theoretical problem is. However, it should be borne in mind that when analogies are drawn between Mendelian randomisation and randomised controlled trials, randomised controlled trial interventions tend to occur in adulthood (for example, drugs lowering cholesterol are given then) whereas in Mendelian randomisation the “randomisation” occurs during gamete formation and conception, and thus prior to fetal development.

Genetic epidemiology and population health: is Mendelian randomisation the link?

Genetic epidemiology has been viewed as almost the antithesis of behavioural, environmental, or social epidemiology. This line of reasoning sees genetic epidemiology as primarily investigating non‐modifiable biological factors, which must lead to a purely biological notion of disease causation, prevention, and treatment.47 This critique can be expanded to cover two features of findings from genetic association studies: that the population attributable risk of the genetic variants is low and that in any case the influence of genetic factors is not reversible. Illustrating both of these criticisms, Terwilliger and Weiss48 suggest, as reasons for considering that many of the current claims regarding genetic epidemiology are hype, firstly, that alleles identified as increasing the risk of common diseases “tend to be involved in only a small subset of all cases of such diseases” and, secondly, that in any case “while the concept of attributable risk is an important one for evaluating the impact of removable environmental factors, for non‐removable genetic risk factors, it is a moot point”.

Mendelian randomisation, however, puts environmental factors centre stage by explicitly using studies in which genetic variant‐disease outcome associations can provide robust evidence regarding the causal nature of environmental factors in influencing health. This approach intends to inform behavioural, environmental, or social approaches to disease control by helping to establish those causal factors for which intervention will influence disease rates. The approach does not inform strategies for genetic screening for disease risk or targeting of therapy. In this light the criticisms of Terwilliger and Weiss regarding the small subset of diseases that can be said to be “caused” by the genetic variants, and the low population attributable risk for the genetic variants, do not apply.
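One way to make this logic concrete is the ratio (Wald) estimator from the instrumental variables literature. This is offered as a hedged illustration rather than a method described in this article. The function name `wald_ratio` and all input values are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the genotype-exposure estimate.

```python
import math


def wald_ratio(beta_gene_outcome: float,
               se_gene_outcome: float,
               beta_gene_exposure: float) -> tuple[float, float]:
    """Ratio estimate of the effect of an exposure on an outcome.

    beta_gene_outcome  -- per-allele association of genotype with outcome (log odds)
    se_gene_outcome    -- standard error of that association
    beta_gene_exposure -- per-allele association of genotype with the exposure
    Returns (estimate, approximate standard error) per unit of exposure.
    """
    if beta_gene_exposure == 0.0:
        raise ValueError("genotype must be associated with the exposure")
    estimate = beta_gene_outcome / beta_gene_exposure
    # First-order approximation: treats beta_gene_exposure as known without error.
    std_error = abs(se_gene_outcome / beta_gene_exposure)
    return estimate, std_error


# Hypothetical inputs: each allele raises the exposure by 0.5 units and the
# log odds of disease by 0.10 (standard error 0.04).
estimate, std_error = wald_ratio(0.10, 0.04, 0.5)
odds_ratio_per_unit = math.exp(estimate)

print(f"log odds ratio per unit exposure: {estimate:.2f} (SE {std_error:.2f})")
print(f"odds ratio per unit exposure:     {odds_ratio_per_unit:.2f}")
```

The estimate describes the modifiable exposure, not the genetic variant. The variant matters only as a route to that estimate, which is why its own population attributable risk is beside the point.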

Conclusions

Despite some prevalent misunderstandings (box 2), Mendelian randomisation allows for many forms of inference regarding the causal effects of modifiable exposures on disease risk. Empirical evidence regarding the utility of the approach remains limited, but the rapid expansion of knowledge in functional genomics promises many exciting possibilities to test its application in the next few years.

Acknowledgements

Thanks to Shah Ebrahim for (nearly) endless discussion of Mendelian randomisation.

Addendum

Since this paper was accepted there have been several developments in this field. See references 50–55 for examples.

References

  • 1. Cochrane A. One man's medicine. London: BMJ Books, 1989.
  • 2. Cochrane A L, Goldberg A. A study of faecal porphyrin levels in a large family. J Hum Genet 1968;32:195–208.
  • 3. Morgan R R, Da Silva V, Puy H, et al. Functional studies of mutations in the human protoporphyringen oxidase gene in variegate porphyria. Cell Mol Biol 2002;48:79–82.
  • 4. Davey Smith G, Ebrahim S. Data dredging, bias, or confounding. BMJ 2002;325:1437–1438.
  • 5. Willett W C. Vitamin A and lung cancer. Nutrition Reviews 1990;48:201–211.
  • 6. Khaw K‐T, Bingham S, Welch A, et al. Relation between plasma ascorbic acid and mortality in men and women in EPIC‐Norfolk prospective study: a prospective population study. Lancet 2001;357:657–663.
  • 7. Heart Protection Study Collaborative Group. MRC/BHF heart protection study of antioxidant vitamin supplementation in 20536 high‐risk individuals: a randomised placebo‐controlled trial. Lancet 2002;360:23–33.
  • 8. Lawlor D A, Davey Smith G, Bruckdorfer K R, et al. Those confounded vitamins: what can we learn from the differences between observational versus randomised trial evidence? Lancet 2004;363:1724–1727.
  • 9. Davey Smith G, Hart C. Lifecourse socioeconomic and behavioural influences on cardiovascular disease mortality: the Collaborative study. Am J Public Health 2002;92:1295–1298.
  • 10. Lawlor D A, Kundu D, Bruckdorfer R, et al. Vitamin C is not associated with coronary heart disease risk once life course socioeconomic position is taken into account: prospective findings from the British women's heart and health study. Heart 2005;91:1086–1087.
  • 11. Thomas D C, Conti D V. Commentary: the concept of ‘Mendelian randomization’. Int J Epidemiol 2004;33:21–25.
  • 12. Greenland S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol 2000;29:1102.
  • 13. Davey Smith G, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 2003;32:1–22.
  • 14. Davey Smith G, Harbord R, Ebrahim S. Fibrinogen, C‐reactive protein and coronary heart disease: does Mendelian randomization suggest the associations are non‐causal? QJM 2004;97:163–166.
  • 15. Brennan P. Mendelian randomization and gene‐environment interaction. Int J Epidemiol 2004;33:17–21.
  • 16. Tobin M D, Minelli C, Burton P R. Development of Mendelian randomisation: from hypothesis test to ‘Mendelian deconfounding’. Int J Epidemiol 2004;33:21–25.
  • 17. Keavney B. Katan's remarkable foresight: genes and causality 18 years on. Int J Epidemiol 2004;33:11–14.
  • 18. Davey Smith G, Ebrahim S. Mendelian randomization: prospects, potentials, and limitations. Int J Epidemiol 2004;33:30–42.
  • 19. Khoury M J, Millikan R, Little J, et al. The emergence of epidemiology in the genomics age. Int J Epidemiol 2004;33:936–944.
  • 20. Katan M B. Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet 1986;i:507–508. (Reprinted Int J Epidemiol 2004;33:9.)
  • 21. Katan M B. Mendelian randomization, 18 years on. Int J Epidemiol 2004;33:10–11.
  • 22. Enomoto N, Takase S, Yasuhara M. Acetaldehyde metabolism in different aldehyde dehydrogenase‐2 genotypes. Alcohol Clin Exp Res 1991;15:141–144.
  • 23. Takagi S, Iwai N, Yamauchi R, et al. Aldehyde dehydrogenase 2 gene is a risk factor for myocardial infarction in Japanese men. Hypertens Res 2002;25:677–681.
  • 24. Chao Y‐C, Liou S‐R, Chung Y‐Y, et al. Polymorphism of alcohol and aldehyde dehydrogenase genes and alcoholic cirrhosis in Chinese patients. Hepatology 1994;19:360–366.
  • 25. Haskell W L, Camargo C, Williams P T, et al. The effect of cessation and resumption of moderate alcohol intake on serum high‐density‐lipoprotein subfractions. N Engl J Med 1984;310:805–810.
  • 26. Burr M L, Fehily A M, Butland B K, et al. Alcohol and high‐density‐lipoprotein cholesterol: a randomised controlled trial. Br J Nutr 1986;56:81–86.
  • 27. Dyer A R, Stamler J, Paul O, et al. Alcohol, cardiovascular risk factors and mortality: the Chicago experience. Circulation 1981;64(suppl III):III20–III27.
  • 28. Wallace R B, Lynch C F, Pomrehn P R, et al. Alcohol and hypertension: epidemiologic and experimental considerations. Circulation 1981;64(suppl III):III41–III47.
  • 29. Ravnskov U. A hypothesis out‐of‐date. The diet‐heart idea. J Clin Epidemiol 2002;55:1057–1063.
  • 30. Soria L F, Ludwig E H, Clarke H R G, et al. Association between a specific apolipoprotein B mutation and familial defective apolipoprotein B‐100. Proc Natl Acad Sci U S A 1989;86:587–591.
  • 31. Tybaerg‐Hansen A, Humphries S E. Familial defective apolipoprotein B‐100: a single mutation that causes hypercholesterolemia and premature coronary artery disease. Atherosclerosis 1992;96:91–107.
  • 32. Tybaerg‐Hansen A, Steffenson R, Meinertz H, et al. Association of mutations in the apolipoprotein B gene with hypercholesterolemia and the risk of ischemic heart disease. N Engl J Med 1998;338:1577–1584.
  • 33. Chasman D I, Posada D, Subrahmanyan L, et al. Pharmacogenetic study of statin therapy and cholesterol reduction. JAMA 2004;291:2821–2827.
  • 34. Balk E M, Lau J, Goudas L C, et al. Effects of statins on nonlipid serum markers associated with cardiovascular disease: a systematic review. Ann Intern Med 2003;139:670–682.
  • 35. Haga S B, Burke W. Using pharmacogenetics to improve drug safety and efficacy. JAMA 2004;291:2869–2871.
  • 36. MRC Vitamin Study Research Group. Prevention of neural tube defects: results of the Medical Research Council vitamin study. Lancet 1991;338:131–137.
  • 37. Czeizel A E, Dudás I. Prevention of the first occurrence of neural‐tube defects by periconceptional vitamin supplementation. N Engl J Med 1992;327:1832–1835.
  • 38. Botto L D, Yang Q. 5,10‐Methylenetetrahydrofolate reductase gene variants and congenital anomalies: a HuGE review. Am J Epidemiol 2000;151:862–877.
  • 39. Danesh J, Collins R, Appleby P, Peto R. Association of fibrinogen, C‐reactive protein, albumin or leucocyte count with coronary heart disease. Meta‐analysis of prospective studies. JAMA 1998;279:1477–1482.
  • 40. Brunner E, Davey Smith G, Marmot M, et al. Childhood social circumstances and psychosocial and behavioural factors as determinants of plasma fibrinogen. Lancet 1996;347:1008–1013.
  • 41. Youngman L D, Keavney B D, Palmer A, et al. Plasma fibrinogen and fibrinogen genotypes in 4685 cases of myocardial infarction and in 6002 controls: test of causality by “Mendelian randomisation”. Circulation 2000;102(suppl II):31–32.
  • 42. Colhoun H M, McKeigue P M, Davey Smith G. Problems of reporting genetic associations with complex outcomes. Lancet 2003;361:865–872.
  • 43. Keavney B, Palmer A, Parish S, et al. Lipid‐related genes and myocardial infarction in 4685 cases and 3460 controls: discrepancies between genotype, blood lipid concentrations, and coronary disease risk. Int J Epidemiol 2004;33:1002–1013.
  • 44. Smith J. Apolipoproteins and aging: emerging mechanisms. Ageing Research Reviews 2002;1:345–365.
  • 45. Eichner J E, Dunn S T, Perveen G, et al. Apolipoprotein E polymorphism and cardiovascular disease: a HuGE review. Am J Epidemiol 2002;155:487–495.
  • 46. Williams R S, Wagner P D. Transgenic animals in integrative biology: approaches and interpretations of outcome. J Appl Physiol 2000;88:1119–1126.
  • 47. Baird P A. Genetic technologies and achieving health for populations. Int J Health Serv 2000;30:407–424.
  • 48. Terwilliger J D, Weiss W M. Confounding, ascertainment bias, and the blind quest for a genetic ‘fountain of youth’. Ann Med 2003;35:532–544.
  • 49. Jousilahti P, Salomaa V. Fibrinogen, social position, and Mendelian randomisation. J Epidemiol Community Health 2004;58:883.
  • 50. Davey Smith G, Ebrahim S. What can mendelian randomisation tell us about modifiable behavioural and environmental exposures. BMJ 2005;330:1076–1079.
  • 51. Lewis S, Davey Smith G. Alcohol, ALDH2 and oesophageal cancer. A meta‐analysis which illustrates the potentials and limitations of a Mendelian randomization approach. Cancer Epidemiol Biomarkers Prev 2005;14:1967–1971.
  • 52. Davey Smith G, Lawlor D, Harbord R, et al. Association of C‐reactive protein with blood pressure and hypertension: lifecourse confounding and Mendelian randomisation tests of causality. Arterioscler Thromb Vasc Biol 2005;25:e137–e138.
  • 53. Davey Smith G, Harbord R, Milton J, et al. Does elevated plasma fibrinogen increase the risk of coronary heart disease? Evidence from a meta‐analysis of genetic association studies. Arterioscler Thromb Vasc Biol 2005;25:2228–2233.
  • 54. Timpson N J, Lawlor D A, Harbord R M, et al. C‐reactive protein and its role in metabolic syndrome: mendelian randomisation study. Lancet 2005;366:1954–1959.
  • 55. Lewis S J, Lawlor D A, Davey Smith G, et al. The thermolabile variant of MTHFR is associated with depression in the British women's heart and health study and a meta‐analysis. Mol Psychiatry 2006 (online).

Articles from Journal of Epidemiology and Community Health are provided here courtesy of BMJ Publishing Group