Illustrative example of instrumental variable analyses in RCTs and Mendelian randomization (MR) studies to answer aetiological questions about the effect of a risk factor (LDLc) on an outcome (CHD). The figure shows directed acyclic graphs (DAGs) of instrumental variable (IV) analyses to test the causal effect of low-density lipoprotein cholesterol (LDLc) on coronary heart disease (CHD). In (a) and (b), the IV is randomization to receiving a statin or not (i.e. this is an example of IV analysis to test an intermediate in an RCT); statins are 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase inhibitors. In (c) and (d), the IV is genetic variants in the HMGCR gene (i.e. this is an MR study); these variants mimic HMG-CoA reductase inhibition. In (e) and (f), the IV is genetic variants (MR) that are independent of those in the HMGCR gene. The three key assumptions of IV analyses are illustrated in (a), (c) and (e): (i) the IV ‘Z’ (randomization to statins in (a) and genetic variants related to LDLc in (c) and (e)) is (or is plausibly) robustly related to the risk factor (LDLc in all figures); (ii) the IV is not related to confounders (shown by letter C in all figures) of the risk factor-outcome association (shown by the lack of an arrow from C to Z in all figures); (iii) the IV only affects the outcome ‘Y’ (CHD) through its effect on the risk factor ‘X’ (LDLc). This last assumption is known as the exclusion restriction criterion. In the RCT of statins example, we know that assumption (i) is true, and if the RCT is well conducted then assumption (ii) will be true. If, however, statins are directly (independently of LDLc) related to other factors which then affect CHD, assumption (iii) will be violated and the estimated causal effect will be a biased estimate of the true effect of LDLc.
There is some evidence that statins do relate to a wide range of other lipids and fatty acids in addition to LDLc,46 though whether these are caused by the statins independently of LDLc and whether they affect CHD is currently unknown. If they do (as shown as an illustrative example in (b)), then the estimate of the LDLc effect on CHD is likely to be biased (what is assumed to be the effect of LDLc on CHD will be the combined effect of LDLc and other lipids/fatty acids on CHD). In the MR example of variants in the HMGCR gene, we know that assumption (i) is correct and there is evidence that assumption (ii) is also likely to be true.44 As with the RCT example, in MR we are often most worried about violation of assumption (iii) due to genuine (horizontal) pleiotropy,35–38 i.e. that variants in HMGCR influence other factors independently of LDLc, which in turn (independently of LDLc) affect CHD (d). As these variants mimic the action of statins, any pleiotropy is likely to be similar to that seen for statins46 (d). By contrast, (e) and (f) show the use of genetic variants that are unrelated to HMGCR. Although there may still be violation of the exclusion restriction criterion (due to genuine pleiotropy) with these variants, it is unlikely to be related to violation of the exclusion restriction criterion in an RCT of statins, because the variants have been selected on the basis that their actions are on a different path from those of statins.
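
The effect of violating assumption (iii) can be illustrated with a small simulation. The following is a minimal sketch, not the analysis used in any of the cited studies. It assumes a linear model with a continuous outcome standing in for CHD, a binary instrument Z, and an unmeasured confounder C that is independent of Z; all function names, effect sizes and the sample size are hypothetical values chosen only for illustration. The IV estimate is computed as the Wald ratio, cov(Z, Y) / cov(Z, X).

```python
import random


def wald_ratio(z, x, y):
    """IV (Wald ratio) estimate of the effect of X on Y: cov(Z, Y) / cov(Z, X)."""
    n = len(z)
    mz, mx, my = sum(z) / n, sum(x) / n, sum(y) / n
    cov_zy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / n
    cov_zx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x)) / n
    return cov_zy / cov_zx


def simulate(n, true_effect, direct_effect, seed=0):
    """Simulate Z -> X -> Y with confounder C and an optional path from Z
    to Y that bypasses X (a violation of the exclusion restriction).

    All coefficients are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)                 # instrument Z (e.g. randomized arm)
        ci = rng.gauss(0, 1)                   # confounder C, independent of Z
        xi = -1.0 * zi + ci + rng.gauss(0, 1)  # risk factor X, lowered by Z
        # Outcome Y: effect of X, confounding by C, and (if nonzero)
        # a path from Z that does not go through X.
        yi = true_effect * xi + direct_effect * zi + ci + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return wald_ratio(z, x, y)


# Assumption (iii) holds: Z affects Y only through X.
valid = simulate(20000, true_effect=0.5, direct_effect=0.0)
# Assumption (iii) violated: Z also affects Y by another path.
biased = simulate(20000, true_effect=0.5, direct_effect=-0.2)

print(f"exclusion restriction holds:    {valid:.2f}")
print(f"exclusion restriction violated: {biased:.2f}")
```

Under these assumptions the first estimate should lie close to the simulated effect of X on Y (0.5), despite the confounding by C. The second should lie close to 0.7, because the Wald ratio attributes the additional Z-to-Y path (-0.2) to the one-unit reduction in X caused by Z.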