Wellcome Open Research. 2023 Aug 4;4:186. Originally published 2019 Nov 26. [Version 3] doi: 10.12688/wellcomeopenres.15555.3

Guidelines for performing Mendelian randomization investigations: update for summer 2023

Stephen Burgess 1,2,a, George Davey Smith 3,4, Neil M Davies 3,5,6,7, Frank Dudbridge 8, Dipender Gill 9, M Maria Glymour 10, Fernando P Hartwig 3,11, Zoltán Kutalik 12,13,14, Michael V Holmes 15,16, Cosetta Minelli 17, Jean V Morrison 18, Wei Pan 19, Caroline L Relton 3,4,20, Evropi Theodoratou 21,22
PMCID: PMC7384151  PMID: 32760811

Version Changes

Updated. Changes from Version 2

This is an update to the guidelines for performing Mendelian randomization investigations to reflect updates in the literature over the past three years – both advances in technologies and datasets providing more opportunities for advanced analyses, and methodological innovations enabling broader and more reliable analyses. As stated in the original publication, we will continue to revise these guidelines periodically. Notable changes in this update are: 1) addition of three new co-authors (Zoltán Kutalik, Jean Morrison, Wei Pan) to better reflect the diversity and geographical spread of thought leaders in MR, 2) new paragraphs on within-family analyses, 3) updated advice on drug-target MR analyses, including on variant choice and use of colocalization, 4) revised discussion of methods including recently published robust methods for MR, 5) substantial revision to the section on other sensitivity analyses and a renewed emphasis on triangulation of evidence, 6) a new section “Extensions and additional analyses”, and 7) a general edit of the text to improve accuracy and clarity. We will continue to revisit these guidelines as applied practice shifts and the methodological literature develops.

Abstract

This paper provides guidelines for performing Mendelian randomization investigations. It is aimed at practitioners seeking to undertake analyses and write up their findings, and at journal editors and reviewers seeking to assess Mendelian randomization manuscripts. The guidelines are divided into ten sections: motivation and scope, data sources, choice of genetic variants, variant harmonization, primary analysis, supplementary and sensitivity analyses (one section on robust statistical methods and one on other approaches), extensions and additional analyses, data presentation, and interpretation. These guidelines will be updated based on feedback from the community and advances in the field. Updates will be made periodically as needed, and at least every 24 months.

Keywords: Mendelian randomization, guidelines, genetic epidemiology, causal inference


The aim of this paper is to provide guidelines for performing Mendelian randomization investigations. It is written both for practitioners seeking to undertake analyses and write up their findings, and for journal editors and reviewers seeking to assess Mendelian randomization manuscripts. These guidelines are deliberately written as suggestions and recommendations rather than as prescriptive rules, as we believe that there is no recipe or single “right way” to perform a Mendelian randomization investigation. Best practice will depend on the aim of the investigation and the specific exposure and outcome variables. However, we believe these guidelines will help investigators to consider the key issues in designing, undertaking and presenting Mendelian randomization analyses. These guidelines will be updated based on feedback from the community and advances in the field. Updates will be made periodically as needed, and at least every 24 months.

These guidelines are complementary to the STROBE-MR recommendations on reporting Mendelian randomization investigations 1, 2 . Here, we provide advice on which analyses to perform in a Mendelian randomization investigation, whereas the STROBE-MR guidelines focus on reporting the analyses chosen by the investigators. We assume a familiarity with the basic concepts of Mendelian randomization and genetic epidemiology, such as pleiotropy and linkage disequilibrium 3–6 . We use the term “exposure” to refer to the proposed causal factor, and “outcome” to refer to the trait or disease that the exposure is hypothesized to influence.

Flowcharts highlighting some of the key analytic steps and choices for investigators are provided as Figure 1 and Figure 2, and a one-page checklist summarizing these guidelines written for reviewers of Mendelian randomization analyses is provided as Figure 3. The guidelines are divided into ten sections: motivation and scope, data sources, choice of genetic variants, variant harmonization, primary analysis, supplementary and sensitivity analyses (one section on robust statistical methods and one on other approaches), extensions and additional analyses, data presentation, and interpretation. Software to implement the statistical methods is referenced in Table 1.

Figure 1. Flowchart highlighting some of the key analytic choices in performing a Mendelian randomization (MR) analysis.

Figure 2. Generic analytic pipeline for Mendelian randomization (MR).

Figure 3. Checklist of questions to consider when reviewing a Mendelian randomization investigation.

Table 1. Summary of some methods proposed for Mendelian randomization: inverse-variance weighted method and robust methods.

| Method | Consistency assumption | Strengths and weaknesses | Reference | Software |
|---|---|---|---|---|
| Inverse-variance weighted | All variants valid or balanced pleiotropy | Most efficient (greatest statistical power), biased if average pleiotropic effect differs from zero | 16 | * |
| MR-Egger | InSIDE | Sensitive to outliers, sensitive to violations of InSIDE assumption, InSIDE assumption often not plausible, often less efficient | 17 | * |
| MR-RAPS | Balanced pleiotropy (except outliers) | Downweights outliers, sensitive to violations of balanced pleiotropy assumption | 18 | ‡ |
| Weighted median | Majority valid | Robust to outliers, sensitive to addition/removal of genetic variants | 19 | * |
| Mode-based estimation | Plurality valid | Robust to outliers, sensitive to bandwidth parameter and addition/removal of genetic variants, generally conservative | 20 | * |
| MR-PRESSO | Outlier-robust | Removes outliers, efficient with valid IVs, very high false positive rate with several invalid IVs | 21 | ‡ |
| MR-Robust | Outlier-robust | Downweights outliers, efficient with valid IVs, high false positive rate with several invalid IVs | 22 | * |
| MR-Lasso | Outlier-robust | Removes outliers, efficient with valid IVs, high false positive rate with several invalid IVs | 22 | |
| Contamination mixture | Plurality valid | Robust to outliers, sensitive to variance parameter and addition/removal of genetic variants | 23 | * |
| MR-Mix | Plurality valid | Robust to outliers, requires large numbers of genetic variants, very high false positive rate in several scenarios | 24 | ‡ |
| MR-cML | Plurality valid | Likelihood-based and robust to the violation of all three IV assumptions | 25 | *, ‡ |

Each of the methods in the table can be implemented using summarized data. False positive rates refer to the simulation study by Slob and Burgess 26 . InSIDE is the Instrument Strength Independent of Direct Effect assumption. IV = instrumental variable.

* Implemented in MendelianRandomization package for R ( https://cran.r-project.org/web/packages/MendelianRandomization/index.html)

† Implemented in mrrobust package for Stata ( https://github.com/remlapmot/mrrobust)

‡ Implemented for R in its own software package:

- MR-PRESSO in mrpresso package ( https://github.com/rondolab/MR-PRESSO),

- MR-RAPS in mr.raps package ( https://github.com/qingyuanzhao/mr.raps),

- MR-Mix in MRMix package ( https://github.com/gqi/MRMix),

- MR-cML in MRcML package ( https://github.com/xue-hr/MRcML).

1. Motivation and scope

Mendelian randomization uses genetic variants to assess causal relationships using observational data. A genetic variant can be considered as an instrumental variable for a given exposure if it satisfies the instrumental variable assumptions: 1) it is associated with the exposure, 2) it is not associated with the outcome due to confounding pathways, and 3) it does not affect the outcome except potentially via the exposure 7, 8 .

Before embarking on a Mendelian randomization analysis, investigators should consider the aims of their investigation and the primary hypotheses of interest. There are many potential motivations for using Mendelian randomization, and the motivation should influence decisions on how to perform the analysis, and how to arrange and present its results. The objective of a Mendelian randomization analysis is a test of a causal hypothesis, and sometimes additionally an estimate of a causal effect 9 . The straightforward statement of the causal hypothesis is that interventions on the exposure variable will affect the outcome. If the genetic associations with the exposure vary with time, then there are some nuances in terms of what causal hypotheses can be tested 10 ; we discuss the impact of time-varying relationships between variables in Section 10.

If a Mendelian randomization investigation is performed primarily to assess whether an exposure has a causal effect on an outcome, then estimating the size of the causal effect of the exposure on the outcome is less important and may even be unnecessary 9, 11 . Priorities in such an analysis are to find genetic variants that satisfy the instrumental variable assumptions and to test their associations with the outcome in the largest available dataset that is relevant to the causal question of interest. Investigators may be able to find mediating traits downstream of the exposure that both help understand the mechanistic pathways from the exposure to the outcome, and provide modifiable targets for intervention in order to influence the outcome.

In contrast, if investigators seek to estimate the quantitative impact on the outcome of a proposed intervention in the exposure 12 , then further questions become more important, such as how well the genetic variant proxies the specific intervention, whether genetic associations with the exposure are estimated in a relevant population, and whether the relationships between variables are linear and homogeneous in the population 13 . However, as we discuss in Section 10, causal estimates from Mendelian randomization should always be interpreted with caution. Alternatively, if investigators simply want to assess whether traits share common genetic predictors (potentially implying shared aetiological mechanisms), then an analytic approach that assesses shared heritability (such as LD-score regression 14 or bivariate genome-based restricted maximum likelihood [GREML] 15 ) may be preferable to conducting a Mendelian randomization investigation.

Investigators should also give thought to the scope of their analysis. If the aim of the investigation is to understand disease aetiology, then consideration of a limited set of exposures/outcomes as main analyses may be justified. If instead the question relates to public health, then consideration of a wider range of outcomes influenced by an exposure may be worthwhile, as public health recommendations should assess the broad consequences of intervention on an exposure, which may involve weighing risks and benefits for different outcomes. At the extreme end of the spectrum is a phenome-wide Mendelian randomization investigation, in which very large numbers of exposure/outcome pairs are considered 27–29 . Such analyses are generally regarded as exploratory or “hypothesis-generating”, and results are typically treated as provisional until replicated in an independent dataset.

Specifying the primary analyses in a Mendelian randomization investigation is important to address problems of multiple testing, particularly given the large number of analyses that could be performed using available genetic data 30 . Additional analyses, including subgroup analyses and analyses on related outcomes, may be presented as supplementary, exploratory, or sensitivity analyses. Strict correction for multiple testing is often overly conservative, given the typically low power of Mendelian randomization studies and the fact that Mendelian randomization often investigates exposure/outcome relationships with prior epidemiological or biological support. As with all epidemiological analyses, selective reporting of “significant” results (leading to reporting bias) should be avoided and all analyses performed should be described transparently.

2. Data sources

The next fundamental question is which data sources will be used: how many datasets are included in the analysis and whether the analysis is performed using individual-level data or summarized data.

Mendelian randomization investigations can be performed using data from a single sample (known as one-sample Mendelian randomization), in which genetic variants, exposure, and outcome are measured in the same individuals, or from two samples (known as two-sample Mendelian randomization), in which variant–exposure associations are estimated in one dataset, and variant–outcome associations are estimated in a second dataset 31 . Two-sample investigations often occur when genetic associations with the exposure are estimated in a cross-sectional sample of healthy individuals, to reflect genetic associations with usual levels of the exposure in the population, and genetic associations with a binary disease outcome are estimated in a case-control study.

There are benefits and limitations of both one- and two-sample settings. A one-sample setting allows the investigation to be conducted in a single population sample, meaning that Mendelian randomization and conventional epidemiological findings (for example from multivariable-adjusted regression) can be compared in the same individuals. In a two-sample setting, the populations from which the two samples were extracted may differ. This is problematic if associations of the genetic variant with the exposure or with variables on pleiotropic pathways differ between the two samples, as this could affect the validity of the instrumental variable assumptions. A particular concern arises if the two samples represent different ethnic groups, as patterns of linkage disequilibrium can differ between population ancestry groups, meaning that a genetic variant may not be as strongly (or even not at all) associated with the exposure in the outcome dataset. Alternatively, the two samples could differ substantially according to population characteristics such as age, sex, socio-economic background, and so on 32 . Such differences can affect not only the interpretation of causal estimates, but also the validity of causal inferences 33 . For example, genetic variants associated with smoking intensity may be strongly associated with disease outcomes in populations where smoking is common, but not in populations where smoking is rare. One-sample analyses do not suffer from these concerns, nor do they require harmonization of the genetic variants across the datasets (see Section 4).

Another related issue is whether the analysis is performed using individual-level data or summarized data. Summarized data are genetic association estimates from regression of the exposure or outcome on a genetic variant 16, 34 . Several large consortia have made such estimates publicly available for millions of variants 30, 35, 36 . Although the use of summarized data is often synonymous with the two-sample setting, the benefits and limitations for the analysis of the two choices (i.e. one- versus two-sample and individual-level versus summarized data) are distinct. Moreover, two-sample approaches can be used with individual-level data (such as the use of externally-derived weights), and summarized data approaches can be used with one-sample data (if necessary, by creating the summarized data from the individual-level data 37 ).
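
To make concrete what an analysis based on summarized data involves, the sketch below combines per-variant association estimates using the inverse-variance weighted method listed in Table 1. It is a minimal illustration only: it assumes uncorrelated variants, a fixed-effect model, and negligible uncertainty in the variant–exposure estimates. The function name `ivw_estimate` and all numeric inputs are hypothetical and are not drawn from any dataset discussed here.

```python
import math

def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect inverse-variance weighted estimate from summarized data.

    beta_x: variant-exposure association estimates
    beta_y: variant-outcome association estimates
    se_y:   standard errors of the variant-outcome estimates
    Assumes uncorrelated variants and ignores uncertainty in beta_x.
    """
    if not (len(beta_x) == len(beta_y) == len(se_y)) or not beta_x:
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of beta_y on beta_x through the origin,
    # with weights equal to 1 / se_y^2.
    numerator = sum(bx * by / s**2 for bx, by, s in zip(beta_x, beta_y, se_y))
    denominator = sum(bx**2 / s**2 for bx, s in zip(beta_x, se_y))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error

# Hypothetical summarized data for three variants.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.02, 0.02, 0.02]
estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
```

Because only `beta_x`, `beta_y` and `se_y` are needed, the same calculation applies whether the estimates come from one sample or two; the choice of sample affects how the result should be interpreted rather than how it is computed.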

Summarized data are often available for larger sample sizes, meaning that power to detect a causal effect is increased. However, access to only summarized data limits the range of analyses that can be performed. Individual-level data are required to conduct analyses in specific subgroups or strata of the population, or to choose which variables to adjust for when generating the summarized data. If published summarized association estimates have already been adjusted for a variable causally downstream of the exposure or outcome, collider bias (see Section 7) can occur 38 . Individual-level data in a one-sample setting are required to investigate non-linear effects 39 . A specific advantage of publicly available summarized data is transparency, as the analysis can be reproduced by a third party with access to the same data.

One- and two-sample investigations also differ in terms of bias with weak instruments 40 . In a one-sample setting, if the genetic variant–exposure associations are weak, then chance variation means that genetic associations with the exposure and outcome are correlated in the direction of the confounded association between the two. This results in instrumental variable estimates that are biased in the direction of the confounded association, and inflated false positive (type 1 error) rates, particularly when more than one variant is included in the analysis 41 . In a two-sample setting without sample overlap, bias due to weak instruments is in the direction of the null, and does not lead to false positive findings. However, as several large consortia have overlapping studies, participants may overlap between the datasets used to estimate the genetic associations with the exposure and outcome 42 . In this case, the direction and size of the bias vary linearly with the degree of overlap (formally, with the degree of correlation between the genetic association estimates). For the special case of a one-sample analysis with a binary disease outcome, if the genetic associations with the exposure are estimated in the controls only, then genetic associations with the exposure and outcome will not be correlated, and bias will follow the pattern of the two-sample setting 42 . Various statistical methods have been proposed to reduce weak instrument bias due to sample overlap and bias due to winner’s curse (see Section 3) 43–45 .

The “randomization” in Mendelian randomization refers to the quasi-random allocation of genetic variants from parents to offspring that occurs at conception. This randomization only truly holds conditional on the parental genotype. The key consequence of this randomization is that genetic variants are independently distributed from traits that they do not affect, an implication of Mendel’s laws of segregation and independent assortment. There is some plausibility that this independence holds for many traits at a population level in large “well-mixed” populations. Empirical investigations in European populations have shown that associations between genetic variants and many traits are no stronger than would be expected due to chance alone 46, 47 .

Concerns about independence are greater for traits that are more socially-patterned, as this increases susceptibility to associations arising from population structure, assortative mating, and dynastic effects 48 . Population structure can give rise to genetic associations due to differences in the frequency of a variant and the distribution of a trait across the population (such as latitude in Europe, which correlates with allele frequencies for lactase persistence variants and milk consumption 49 ). Assortative mating occurs when individuals reproduce with people who are more similar to themselves than would be expected by chance. This can also lead to genetic associations that represent social differences rather than causal effects 50 . Finally, dynastic effects occur when a parent’s genotype affects their child’s outcome by a causal pathway not acting via the child’s genotype (for example, due to an effect of the parent’s phenotype on the child’s outcome). This can induce associations between the offspring’s genotype and outcomes that do not reflect the effect of the exposure in the offspring.

These potential sources of bias have encouraged the development of statistical approaches and datasets to perform within-family Mendelian randomization analyses, exploiting the random allocation of variants between siblings 51 . Hence, if a relevant dataset is available and statistical power is reasonable, investigators could consider performing a within-family Mendelian randomization analysis, particularly if the exposure is socially-patterned or likely to be subject to population stratification 48, 52 . If this is not possible, then the validity of the investigation relies on independence of the genetic variants from potential confounders holding at a population level. However, within-family analyses typically have lower power than analyses in unrelated individuals, as families where all individuals have the same genotype (such as where both parents are major homozygotes) would not contribute any information to the analysis. Hence, imprecise null findings from family-based analyses should be interpreted with caution, particularly if a population-based analysis suggests a causal effect.

3. Selection of genetic variants

The most important decision to be made in designing a Mendelian randomization investigation is which genetic variants to include in the analysis 53 . First, it is necessary to decide whether the analysis is performed using variants from a single gene region, or using variants from multiple regions of the genome (a polygenic analysis). For example, a Mendelian randomization analysis for C-reactive protein (CRP) may be conducted using variants in the neighbourhood of the CRP gene region (which encodes C-reactive protein), or it may be conducted using all independent genome-wide significant predictors of CRP 54 . The former has advantages of specificity – if a gene region has a specific biological link with the exposure, then the Mendelian randomization investigation based on these variants (sometimes called a “cis-Mendelian randomization analysis” 55 , as the variants are cis-variants for the gene product) is more plausible as an assessment of the causal role of that particular exposure compared with an analysis including all genome-wide significant predictors of CRP regardless of function. However, if only one gene region is included in the analysis, then several robust statistical analysis methods (see Section 6) are less reliable, as they assume independence in whether variants violate the instrumental variable assumptions. Variants in the same gene region are likely to either all be valid instruments or all invalid. When genetic variants are all valid instruments, power depends on the proportion of variance in the exposure explained by the variants 56 – hence a polygenic Mendelian randomization investigation will typically have greater power than one including variants only from a single gene region.
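
As a rough illustration of how power scales with the proportion of variance explained, the sketch below uses a widely used asymptotic approximation for a continuous outcome, power ≈ Φ(|β|·√(R²·N) − z), where β is the causal effect in standard deviation units, R² is the proportion of variance in the exposure explained by the variants, N is the sample size for the outcome associations, and z is the two-sided critical value. This formula, the function name `approx_power`, and the numeric inputs are assumptions introduced for illustration; they are not taken from the text above, and the approximation presumes valid instruments and standardized exposure and outcome.

```python
from statistics import NormalDist

def approx_power(r2, n, effect, alpha=0.05):
    """Approximate power to detect a causal effect with valid instruments.

    r2:     proportion of exposure variance explained by the variants
    n:      sample size for the variant-outcome associations
    effect: causal effect per SD of exposure, in SD units of the outcome
    alpha:  two-sided significance level
    Asymptotic approximation only; assumes a continuous outcome.
    """
    if not 0 < r2 < 1 or n <= 0:
        raise ValueError("r2 must lie in (0, 1) and n must be positive")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(abs(effect) * (r2 * n) ** 0.5 - z_crit)

# Hypothetical comparison: a single gene region explaining 1% of variance
# versus a polygenic instrument explaining 4%, at the same sample size.
power_single_region = approx_power(r2=0.01, n=20000, effect=0.2)
power_polygenic = approx_power(r2=0.04, n=20000, effect=0.2)
```

Under these hypothetical inputs the polygenic instrument yields higher power, but the comparison only holds if the additional variants are themselves valid instruments.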

Mendelian randomization analyses for investigating drug targets often use variants in a single gene region, typically the gene that encodes the protein target under investigation 57, 58 . For example, investigations into the effects of glucagon-like peptide 1 receptor (GLP1R) agonists have considered variants in the GLP1R gene region 59 , and investigations into the effects of activated factor X inhibitors have considered variants in the F10 gene 60 . However, for complex multifactorial exposures such as body mass index or blood pressure, there is no single relevant gene, and so a more agnostic polygenic analysis may be necessary. In some cases, both approaches may be possible: for example, variants associated with low-density lipoprotein (LDL) cholesterol in the HMGCR gene region have particular relevance for understanding the impact of taking statin drugs, whereas a polygenic analysis including genetic predictors of LDL-cholesterol from multiple gene regions may be informative about the effect of LDL-cholesterol perturbation more generally. The latter approach allows investigators to test for consistency of the causal finding across multiple variants that influence the exposure via different biological pathways 61 .

When the analysis is based on a single gene region, it may be that a single variant is included in the analysis. However, if multiple variants explain independent variance in the exposure, then their inclusion will increase the power to detect a causal effect, even if the variants are partially correlated. With summarized data, appropriate methods should be used to account for correlated variants 32 . For drug targets, variants may be chosen based on associations with levels of a protein or similar biomarker that reflects pharmacological perturbation of the target, or expression of the targeted gene in a relevant tissue or cell type. If there are many correlated candidate variants in a gene region, then including all variants in a single analysis will typically result in numerical instability, as the analysis can be highly sensitive to small changes in the variant correlation matrix 62 . Variable selection and dimension reduction approaches have been proposed to maximize the proportion of variance in the exposure explained by the selected variants while avoiding instability due to multicollinearity 63 .
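
One way of accounting for correlated variants with summarized data is to extend inverse-variance weighting to a generalized weighted regression that uses the variant correlation matrix. The sketch below is a minimal version under stated assumptions: a fixed-effect model, negligible uncertainty in the variant–exposure estimates, and a correlation matrix supplied by the user (for example from a reference panel). The function names and the numbers are hypothetical. The explicit check for a near-singular matrix reflects the instability that arises when many highly correlated variants are included.

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-10:
            raise ValueError("correlation matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution

def ivw_correlated(beta_x, beta_y, se_y, corr):
    """Inverse-variance weighted estimate allowing for correlated variants.

    corr is the variant correlation matrix (unit diagonal, symmetric).
    Working with estimates scaled by se_y keeps the linear solve scale-free.
    """
    zx = [bx / s for bx, s in zip(beta_x, se_y)]
    zy = [by / s for by, s in zip(beta_y, se_y)]
    u = solve_linear(corr, zx)                     # corr^{-1} zx
    denominator = sum(a * b for a, b in zip(zx, u))  # zx' corr^{-1} zx
    numerator = sum(a * b for a, b in zip(zy, u))    # zy' corr^{-1} zx
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Hypothetical data: two variants in one gene region with correlation 0.6.
corr = [[1.0, 0.6], [0.6, 1.0]]
estimate, std_error = ivw_correlated([0.10, 0.12], [0.05, 0.06], [0.02, 0.02], corr)
```

With an identity correlation matrix the function reduces to the standard inverse-variance weighted estimate; with positive correlation the standard error is wider, since the variants carry partly overlapping information.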

For a polygenic analysis, there are two main strategies for selecting variants: either a biologically driven approach or a statistically driven approach. The two approaches are not mutually exclusive, and the overall decision of which variants to include may comprise elements from both approaches.

A biological approach to selecting genetic variants would include variants from regions that have a biological link to the exposure of interest. For example, several Mendelian randomization investigations for vitamin D have used variants from four gene regions that are biologically implicated in the synthesis or metabolism of vitamin D 64 . However, caution is required as biological understanding is often imperfect. As an example, although genetic variants in the IL6R gene region are associated with increased circulating levels of interleukin-6, they in fact decrease interleukin-6 signalling, leading to opposite directions of association with disease outcomes to those expected based on serum interleukin-6 measurements 65 .

A common statistical approach when selecting genetic variants is to include all variants associated with the exposure of interest at a given level of statistical significance (typically, a genome-wide significance threshold, such as p < 5×10⁻⁸). Selection may be based on the dataset in which genetic associations with the exposure are estimated. However, this can lead to “winner’s curse” – genetic associations tend to be overestimated in the dataset in which they were first discovered. If genetic variants are selected based on their associations with the exposure in the dataset under analysis, weak instrument bias is exacerbated (in the direction of the observational association in a one-sample setting, and in the direction of the null in a two-sample setting) 41 . This bias can be avoided by selecting genetic variants based on a different dataset entirely. This can lead to a “three-sample” analysis, in which variants are identified in one dataset, and the genetic associations with the exposure and outcome are estimated in separate datasets 66 . If associations with the exposure from separate large datasets are not available, investigators will have to choose between basing their variant choice on the dataset under analysis (and hence risking winner’s curse bias), or basing their variant choice on a smaller dataset (and hence risking uninformative findings due to low power) 67 . When genetic variants are chosen solely based on their association with the exposure without reference to the function of the variants, researchers should be especially careful about the possibility of variants being pleiotropic.
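
The “three-sample” design described above can be sketched as follows: variants are selected by p-value in a discovery dataset, and the exposure association estimates used in the analysis are then taken from a separate dataset. The variant identifiers, p-values, and effect estimates are hypothetical, as are the function and variable names. The sketch does not prune variants for linkage disequilibrium, which would usually also be needed in practice.

```python
GENOME_WIDE_THRESHOLD = 5e-8

def select_variants(discovery_results, threshold=GENOME_WIDE_THRESHOLD):
    """Return IDs of variants whose discovery p-value is below the threshold."""
    return [row["variant"] for row in discovery_results if row["p_value"] < threshold]

# Dataset 1 (discovery): used only to decide which variants to include.
discovery = [
    {"variant": "variant_A", "p_value": 3e-12},
    {"variant": "variant_B", "p_value": 4e-8},
    {"variant": "variant_C", "p_value": 2e-6},
]
selected = select_variants(discovery)

# Dataset 2 (independent of discovery): supplies the exposure associations,
# so the estimates used in the analysis are not inflated by winner's curse.
exposure_estimates = {"variant_A": 0.11, "variant_B": 0.06, "variant_C": 0.04}
analysis_beta_x = {v: exposure_estimates[v] for v in selected if v in exposure_estimates}
```

Genetic associations with the outcome would then be taken from a third dataset for the same selected variants.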

A more nuanced approach to variant selection would be to start off with a statistical rationale for choosing genetic variants, but then to exclude variants that are known to be pleiotropic or that are associated with variables that represent pleiotropic pathways to the outcome. However, a genetic association with a variable does not necessarily reflect that the instrumental variable assumptions are violated. Additionally, if variants are associated with a variable that has no influence on the outcome, bias will not be introduced.

We use the term “horizontal pleiotropy” (sometimes referred to as “direct pleiotropy” or simply “pleiotropy”) to refer to the scenario where a genetic variant is associated with variables on different causal pathways to the outcome, and “vertical pleiotropy” (sometimes referred to as “indirect pleiotropy” or “mediated pleiotropy”) to refer to the scenario where a genetic variant is associated with variables that are on the same causal pathway to the outcome 68 . Provided that the causal pathway from the genetic variant to the outcome is mediated entirely via the exposure (see Figure 4), a genetic variant is a valid instrument for assessing the causal role of the exposure (assuming the other instrumental variable assumptions are satisfied), even if it is associated with another variable 31 . In practice, distinguishing between horizontal pleiotropy and vertical pleiotropy requires knowledge of the relationships between the variables in the analysis. When there are multiple genetic variants, horizontal pleiotropy is more likely if a genetic association with a specific variable is only observed for a small number of variants. In contrast, vertical pleiotropy (in particular corresponding to the scenarios in Figure 4a–c) is likely to lead to genetic associations with that variable for all variants that associate with the exposure. While removing horizontally pleiotropic variants from a Mendelian randomization analysis should lead to more reliable results, care must be exercised, as removing vertically pleiotropic variants could lead to distorted causal estimates.

Figure 4. Directed acyclic graphs illustrating validity and invalidity of instrumental variable assumptions in different scenarios.

a) Mediator is on causal pathway from exposure to outcome. b) Mediator is on causal pathway from genetic variants to exposure. c) Genetic variants influence the exposure, which has downstream effect on a related variable which does not affect the outcome. d) Genetic variants influence a related variable, and the related variable affects the outcome and exposure of interest. We note that the related variable may be known or unknown. e) Genetic variants influence the exposure and outcome via different causal pathways. f) Genetic variants influence the outcome primarily, and only influence the exposure via the outcome. In scenarios a, b, and c, as there is no alternative pathway from the genetic variants to the outcome, the instrumental variable assumptions are satisfied. In scenario d, the pathway from the genetic variants to the outcome does not pass via the exposure, and so the instrumental variable assumptions are not satisfied for the exposure (although they are satisfied for the related variable). Scenarios a, b, and c are examples of “vertical pleiotropy” (also called “indirect pleiotropy”) that do not invalidate the instrumental variable assumptions. Scenario d reflects a situation where the causal risk factor has been incorrectly identified – it is not the exposure, but the related variable. Scenario e reflects “horizontal pleiotropy” (also called “direct pleiotropy”) that violates the instrumental variable assumptions. Scenario f reflects a reverse causation situation where the genetic variant has been incorrectly identified as primarily affecting the exposure.

Another possible scenario that would lead to instrument invalidity is if genetic variants influence the outcome primarily rather than the exposure (Figure 4f; see also the discussion of reverse causation in Section 7). If there is a reverse causal effect of the outcome on the exposure, then genetic predictors of the outcome could be identified as hits in a genome-wide association study for the exposure. However, such variants would not be valid instrumental variables.

In conclusion, there is no one correct way to choose which genetic variants to include in an analysis. Causal conclusions will be more reliable when the instrumental variable assumptions are more plausible. In practice, a balance may need to be struck between including fewer variants (and potentially having insufficient power) and including more variants (and potentially including more pleiotropic variants). We note the possibility that a researcher could exploit this uncertainty and perform a data-driven investigation, choosing variants based on the results of Mendelian randomization analyses with different sets of variants. This underscores the importance of writing an analysis plan before looking at the data, and considering prospectively what criteria may be considered for including and excluding genetic variants from the analysis. This problem is not unique to Mendelian randomization, and pre-registration of analysis plans has been suggested as a potential way of ensuring that analyses are conducted transparently and without bias (whether intentional or unintentional) 69 .

A practical suggestion for performing a polygenic analysis is to consider both a liberal analysis, including more genetic variants, and a conservative analysis, including fewer variants 31 . While it is theoretically possible for pleiotropy to lead to a false negative finding, it is generally more likely that pleiotropy will bias estimates away from the null. Hence a null finding in a liberal analysis is more convincing evidence of a true null relationship – there is little evidence for a causal relationship even when potentially pleiotropic genetic variants are included in the analysis. Section 6 and Section 7 describe sensitivity analyses for assessing the instrumental variable assumptions and the robustness of non-null findings.

4. Variant harmonization

Genetic associations with exposures and outcomes are typically reported per additional copy of a particular allele. Hence, when combining summarized data on genetic associations, it is important to ensure that genetic associations are expressed per additional copy of the same allele 70 . This is particularly important as not all publicly-available data resources are consistent about reporting strand information correctly. For example, if a genetic variant is a biallelic single nucleotide polymorphism (SNP) with alleles A and G on the positive strand, then the corresponding base pairs on the negative strand will be T and C. In this case, one dataset may report the association per additional copy of the A allele, and another per additional copy of the T allele – but the same comparison is being made. Allele and strand information can be double-checked by comparing allele frequency information – if the allele frequencies are similar for the A and T alleles, then the researcher can be more confident that this is a strand mismatch. Additional care should be taken for palindromic variants – if the alleles were A and T (or C and G), then the same alleles would appear on both the positive and negative strands. In such a case, if the allele frequency is close to 50%, analysts may choose to drop the variant from the analysis if it is not possible to verify that the alleles have been correctly orientated. While this is a conservative policy, allele alignment problems have led to incorrect results in Mendelian randomization analyses, and retractions and corrections of manuscripts.
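The alignment rules described above can be sketched in code. This is a minimal illustration rather than a reference implementation: the function name `harmonize_outcome_beta`, its arguments, and the 0.08 band around 50% used to flag ambiguous palindromic variants are all assumptions made for this example.

```python
# Complementary bases, used to translate alleles between strands.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonize_outcome_beta(exp_ea, exp_oa, exp_eaf,
                           out_ea, out_oa, out_eaf, out_beta,
                           ambiguous_band=0.08):
    """Return the outcome association per copy of the exposure effect allele.

    ea/oa are effect/other alleles and eaf is the effect allele frequency.
    Returns None when the variant cannot be aligned and should be dropped.
    All names and the default ambiguous_band are illustrative assumptions.
    """
    if COMPLEMENT[exp_ea] == exp_oa:
        # Palindromic variant (A/T or C/G): allele labels look the same on
        # both strands, so only allele frequencies can orient it.
        if min(abs(exp_eaf - 0.5), abs(out_eaf - 0.5)) < ambiguous_band:
            return None  # frequency too close to 50% to verify orientation
        same_side = (exp_eaf < 0.5) == (out_eaf < 0.5)
        return out_beta if same_side else -out_beta

    # Non-palindromic: try the alleles as reported, then on the other strand.
    for ea, oa in ((out_ea, out_oa), (COMPLEMENT[out_ea], COMPLEMENT[out_oa])):
        if (ea, oa) == (exp_ea, exp_oa):
            return out_beta       # same effect allele
        if (ea, oa) == (exp_oa, exp_ea):
            return -out_beta      # effect and other allele are swapped
    return None  # alleles are incompatible between the two datasets
```

For example, an outcome dataset reporting T/C for a variant that the exposure dataset reports as A/G is treated as a strand difference and the association is kept unchanged, whereas an A/T variant with allele frequency near 50% is dropped.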

5. Primary analysis

Different statistical methods have been proposed for Mendelian randomization with individual-level data and with summarized data. In a one-sample setting with individual-level data, a causal effect estimate can be obtained using the two-stage least-squares (2SLS) method. In the first stage, the exposure is regressed on the genetic variants and any relevant covariates. In the second stage the outcome is then regressed on the predicted values of the exposure from the first regression and the same covariates 71 . In general, we recommend only including as covariates age, sex, genomic principal components of ancestry, and technical covariates (such as recruitment centre), as further adjustment may bias estimates either if adjustment is for a variable on the causal pathway from the genetic variants to the outcome (a mediator), or if adjustment induces collider bias 72 . Strictly speaking, the 2SLS method refers to a two-stage analysis using linear regression for continuous outcomes and exposures. Similar two-stage analyses can be performed with binary variables using logistic regression 73 , although in this case estimates are sensitive to correct specification of the first-stage regression model 74 and other approaches that make weaker distributional assumptions, such as structural mean models, may be preferred 75 .
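The two stages can be illustrated on simulated data. The sketch below is a simplified illustration under stated assumptions: it uses a single genetic variant, omits covariates, and all simulation parameters (sample size, allele frequency, effect sizes) are invented for the example.

```python
import random


def ols(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(g, x, y):
    """2SLS point estimate with a single instrument g and no covariates."""
    a0, a1 = ols(g, x)                   # stage 1: exposure on variant
    x_hat = [a0 + a1 * gi for gi in g]   # genetically predicted exposure
    _, theta = ols(x_hat, y)             # stage 2: outcome on prediction
    return theta


# Simulated data (all parameter values are assumptions for illustration).
rng = random.Random(1)
n, true_effect = 20000, 0.3
g = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]  # allele count 0/1/2
u = [rng.gauss(0, 1) for _ in range(n)]                              # unmeasured confounder
x = [0.5 * gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

observational = ols(x, y)[1]                # confounded regression estimate
causal = two_stage_least_squares(g, x, y)   # instrumental variable estimate
print(f"OLS: {observational:.3f}  2SLS: {causal:.3f}  truth: {true_effect}")
```

The sketch returns a point estimate only. A standard error taken from a naive second-stage regression does not account for uncertainty in the first stage, so dedicated instrumental variable software should be used for inference.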

The 2SLS method can be applied to the two-sample setting if individual-level data are available for both samples 76 . However, it is typical for two-sample investigations to use summarized data. With summarized data, if only one genetic variant is used as an instrument, the causal effect estimate is simply the ratio of the variant–outcome association to the variant–exposure association. With multiple variants as instruments, the most commonly used method is the inverse-variance weighted (IVW) method 16 . With uncorrelated variants, the IVW estimate can be obtained from an IVW meta-analysis of the ratio estimates for the individual variants 77 . The same estimate can equivalently be calculated as the ratio estimate using a weighted genetic risk score as a single instrument, with the weights equal to the associations of each variant with the exposure estimated in the first sample 32 . A modification of this method has been proposed to allow for correlation (linkage disequilibrium) between variants 32 . For continuous outcomes, the IVW estimate is asymptotically equivalent to the 2SLS estimate obtained from individual-level data 16 . The 2SLS method (and thus also the IVW method) provides the most efficient estimate of the causal effect when all genetic variants are valid instruments 32 .
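The ratio and IVW calculations for uncorrelated variants can be written out directly. In the sketch below the function names and the summarized data are hypothetical, and the standard error of each ratio estimate uses a first-order approximation that ignores uncertainty in the variant–exposure association, which is an assumption of this illustration.

```python
import math


def ratio_estimates(bx, by, se_by):
    """Variant-specific ratio estimates and first-order standard errors."""
    theta = [y / x for x, y in zip(bx, by)]
    se = [s / abs(x) for x, s in zip(bx, se_by)]
    return theta, se


def ivw_meta_analysis(bx, by, se_by):
    """Fixed-effect inverse-variance weighted mean of the ratio estimates."""
    theta, se = ratio_estimates(bx, by, se_by)
    w = [1 / s ** 2 for s in se]
    estimate = sum(wi * ti for wi, ti in zip(w, theta)) / sum(w)
    return estimate, math.sqrt(1 / sum(w))


def ivw_regression(bx, by, se_by):
    """The same point estimate, as a weighted regression through the origin."""
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_by))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, se_by))
    return num / den


# Hypothetical summarized data for three uncorrelated variants.
bx = [0.10, 0.20, 0.30]        # variant-exposure associations
by = [0.06, 0.09, 0.16]        # variant-outcome associations
se_by = [0.010, 0.012, 0.015]  # standard errors of by
estimate, se = ivw_meta_analysis(bx, by, se_by)
```

The two functions return the same point estimate, which shows that the meta-analysis of ratio estimates and the weighted regression through the origin are two views of one calculation.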

If all genetic variants are valid instruments and the relationships between all variables (genetic variants, exposure and outcome) are linear and homogeneous for all individuals in the population, then we would expect the variant-specific estimates (that is, the ratio estimates based on each variant in turn) to all target the same causal parameter, and for there to be no more heterogeneity between the variant-specific estimates than would be expected by chance alone 13 . However, there are many reasons why excess heterogeneity may occur in practice. These include statistical reasons (such as departures from linearity and homogeneity across individuals) and biological reasons. For instance, variants associated with body mass index (BMI) influence BMI via different biological mechanisms 78 . Additionally, some variants are associated with BMI from early childhood and others from adolescence or later. Variants that influence BMI for longer may be expected to have stronger proportional associations with chronic disease outcomes for which BMI is a cause. Hence if there is a true causal effect of the exposure on the outcome, some heterogeneity may be expected in the variant-specific causal estimates. However, heterogeneity would also arise if some genetic variants are not valid instrumental variables (see Section 6) 79 .

The IVW method can be performed using a fixed-effects or a random-effects meta-analysis model. Unless there are very few variants (meaning that heterogeneity between the variant-specific estimates cannot be estimated reliably) or all variants are taken from the same gene region, we recommend using a multiplicative random-effects model as the default option for the IVW method. If there is no more heterogeneity between the ratio estimates for the individual variants than would be expected by chance alone, then the random-effect analysis is equivalent to the fixed-effect analysis, and there is no loss of precision in making the weaker random-effects assumption. However, if there is excess heterogeneity, then the fixed-effect analysis is inappropriate, as its confidence intervals are misleadingly narrow. A multiplicative random-effects model is preferred to the additive random-effects model that is more common in the meta-analysis literature as it does not change the relative weighting of the variant-specific estimates 34 . In contrast, an additive random-effects model upweights outlying estimates, which are more likely to represent pleiotropic variants. The multiplicative random-effects IVW method provides valid causal estimates under the assumption of balanced pleiotropy; that is, pleiotropic effects on the outcome are equally likely to be positive as negative 34 .
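A minimal sketch of the multiplicative random-effects calculation is given below, assuming uncorrelated variants and first-order weights. The function name is hypothetical. The overdispersion factor is floored at 1, so that the random-effects standard error is never smaller than the fixed-effect one, and the point estimate is the same under both models.

```python
import math


def ivw_multiplicative_random_effects(bx, by, se_by):
    """Return (estimate, fixed-effect SE, multiplicative random-effects SE)."""
    w = [x ** 2 / s ** 2 for x, s in zip(bx, se_by)]   # inverse-variance weights
    theta = [y / x for x, y in zip(bx, by)]            # ratio estimates
    estimate = sum(wi * ti for wi, ti in zip(w, theta)) / sum(w)
    se_fixed = math.sqrt(1 / sum(w))
    # Cochran's Q measures heterogeneity around the pooled estimate.
    q = sum(wi * (ti - estimate) ** 2 for wi, ti in zip(w, theta))
    # Overdispersion factor, floored at 1 so precision is never overstated.
    phi = max(1.0, q / (len(bx) - 1))
    return estimate, se_fixed, se_fixed * math.sqrt(phi)
```

With perfectly concordant ratio estimates the two standard errors coincide. With heterogeneous estimates only the standard error is inflated, which is why the relative weighting of the variants is unchanged.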

We recommend the IVW method with multiplicative random-effects as the primary analysis method for use with summarized data, because it is the most efficient analysis method with valid instrumental variables, and it accounts for heterogeneity in the variant-specific causal estimates. If a causal effect is detected using this method, then investigators should proceed to perform sensitivity analyses (Section 6 and Section 7) to assess the robustness of their finding to the assumption of balanced pleiotropy.

A scenario that requires a different approach to the primary analysis occurs when there are several related exposures that have shared genetic predictors, meaning that it is difficult to find specific predictors of the individual exposures. In this case, a multivariable Mendelian randomization approach may be the primary analysis strategy 80 . Multivariable Mendelian randomization is an extension to standard (univariable) Mendelian randomization that allows genetic variants to be associated with more than one exposure, and estimates the direct causal effects of each exposure in a single analysis model. The instrumental variable assumptions in multivariable Mendelian randomization require each variant to be associated with at least one of the exposures, not associated with the outcome via confounding, and not to affect the outcome except potentially via its association with one or more of the exposures included in the analysis model. For identification, it is also required that there is no perfect collinearity between the genetic associations; that is, there are variants that explain independent variation in each exposure 81 . Examples of exposure sets where multivariable Mendelian randomization has been used include lipid fractions (such as high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, and triglycerides) 82 , and body composition measures (such as fat mass and fat-free mass) 83 . Provided that genetic variants act as instrumental variables for the set of exposures, the direct causal effects of the individual exposures on the outcome can be estimated 84 . Both the 2SLS and IVW methods can be adapted to the multivariable setting 81 . A multivariable analysis strategy may also be worthwhile if genetic variants are associated with measured exposures that represent potentially pleiotropic pathways from the genetic variants to the outcome, as the effects of these exposures on the outcome will be accounted for in the multivariable analysis model (Section 7). Specific methods have been proposed based on a multivariable approach in the context of gene expression data, such as the transcriptome-wide summary statistics-based Mendelian randomization (TWMR) method 85 .
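For two exposures, the multivariable IVW estimate can be sketched as a weighted regression of the variant–outcome associations on both sets of variant–exposure associations, with no intercept. The function below is an illustrative assumption (including its name and its collinearity tolerance) that returns point estimates only and solves the two normal equations by hand.

```python
def multivariable_ivw(bx1, bx2, by, se_by):
    """Weighted regression of by on (bx1, bx2) with no intercept.

    Solves the 2x2 weighted normal equations directly and returns the
    direct-effect estimates (theta1, theta2).
    """
    w = [1 / s ** 2 for s in se_by]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    s1y = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    s2y = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    # A (near-)zero determinant means the two sets of genetic associations
    # are collinear, so the direct effects are not identified.
    if abs(det) < 1e-12 * max(s11 * s22, 1e-300):
        raise ValueError("genetic associations are (near-)collinear")
    theta1 = (s22 * s1y - s12 * s2y) / det
    theta2 = (s11 * s2y - s12 * s1y) / det
    return theta1, theta2
```

The explicit collinearity check mirrors the identification requirement stated above: if the variants do not explain independent variation in each exposure, the estimates cannot be computed.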

6. Robust methods for sensitivity analysis

A robust analysis method is defined here as a method that can provide valid causal inferences under weaker assumptions than the standard IVW method. Many robust analysis methods are available to detect and account for pleiotropy when using multiple genetic variants. Any polygenic Mendelian randomization investigation where variants are chosen based on their associations with the exposure that does not perform one or more robust methods may be viewed as somewhat incomplete 54, 86 . Investigators should consider using multiple methods that make different assumptions about the nature of the underlying pleiotropy 26 . Although robust methods typically use the term ‘pleiotropy’, any source of instrument invalidity can be expressed as algebraically equivalent to bias from pleiotropy 87 , and so these methods can help assess sensitivity of findings to instrument invalidity more generally, and not simply invalidity that arises from horizontal pleiotropy. However, the robust methods are more likely to be effective for addressing instrument invalidity that arises due to issues such as pleiotropy or linkage disequilibrium with a variant influencing a confounder, which affect specific variants in a sporadic way, and less effective for instrument invalidity that arises due to issues such as population stratification or dynastic effects, which affect all variants in a systematic way 48 . We here use the language of pleiotropy to make mathematically precise statements about the assumptions needed for methods to provide consistent estimates, but these statements cover instrument invalidity more generally.

While a full comparison of all the robust methods that have been proposed is beyond the scope of this paper, a summary of several methods is provided as Table 1. This table is based on a broader review and comparison of methods 26 . We proceed to provide a brief description of some commonly used methods.

The most commonly used robust methods are MR-Egger, median- and mode-based methods, and MR-PRESSO. We focus on these methods here as they can be implemented using summarized data alone, and they rely on different assumptions to provide consistent causal estimates. The MR-Egger method estimates the causal effect as the slope from the weighted regression of the variant–outcome associations on the variant–exposure associations, and the average pleiotropic effect as the intercept. The method allows all genetic variants to have pleiotropic effects; however, it requires that the pleiotropic effects are independent of the variant–exposure associations (referred to as the Instrument Strength Independent of Direct Effect (InSIDE) assumption) 17 . This assumption would be violated in the case of “correlated pleiotropy”, which occurs when genetic variants influence a confounder of the exposure and outcome (and hence there are correlated pleiotropic effects on the exposure and outcome) 88 . A multivariable version of the MR-Egger method is available 89 . Estimates from the MR-Egger method are particularly affected by outlying and influential datapoints 90 , and are prone to be imprecise, particularly when the variant–exposure associations are all similar in magnitude. This can lead to the method having low power to detect a causal effect. A heterogeneity measure has been proposed to quantify the similarity between variant–exposure associations and the potential impact on MR-Egger analyses 91 . Another method making the InSIDE assumption is the MR-RAPS (robust adjusted profile score) method, which first excludes strongly pleiotropic variants, and then assumes all remaining variants follow the InSIDE assumption 18 .
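The MR-Egger point estimates can be sketched as a weighted linear regression with an intercept. The function below is a hypothetical illustration: it orients each variant so that its association with the exposure is positive, uses weights equal to the inverse variance of the variant–outcome associations, and does not compute standard errors.

```python
def mr_egger(bx, by, se_by):
    """Weighted regression of by on bx with an intercept.

    Returns (intercept, slope): the intercept estimates the average
    pleiotropic effect and the slope estimates the causal effect.
    """
    # Orient each variant so its association with the exposure is positive.
    sign = [1 if x >= 0 else -1 for x in bx]
    bx = [s * x for s, x in zip(sign, bx)]
    by = [s * y for s, y in zip(sign, by)]
    w = [1 / s ** 2 for s in se_by]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, bx)) / sw   # weighted means
    my = sum(wi * y for wi, y in zip(w, by)) / sw
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, bx, by))
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, bx))
    slope = sxy / sxx
    return my - slope * mx, slope
```

The denominator `sxx` shrinks towards zero when the variant–exposure associations are similar in magnitude, which is one way to see why MR-Egger estimates become imprecise in that setting.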

The median- and mode-based methods 19, 20, 92 rely on some genetic variants being valid instruments, but make weaker assumptions about the invalid instruments and are more robust to outliers. Specifically, the median-based method assumes that less than half of the variants are invalid instruments (majority valid assumption), and the mode-based method assumes more variants estimate the true causal effect than estimate any other quantity (plurality valid assumption). Intuitively speaking, both methods take the variant-specific causal estimates (i.e. the ratio estimates based on the individual variants), and calculate a measure of central tendency of these estimates. These methods have a natural robustness to variants with outlying ratio estimates, and so are not as affected by the presence of a small number of pleiotropic variants as the IVW and MR-Egger methods. The mode-based method has been shown to have low precision in some simulated and real datasets 26 . Other methods have been proposed that make the same plurality valid assumption as the mode-based method, including the contamination mixture method 23 and MR-Mix 24 .
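The weighted median can be sketched as follows. This is an illustrative implementation under the assumption that the weights are the first-order inverse variances of the ratio estimates and that the median is found by linear interpolation between ordered estimates; the function names are hypothetical and no standard error is computed.

```python
def weighted_median(values, weights):
    """Weighted median with linear interpolation between ordered estimates."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    # Cumulative weight up to the midpoint of each estimate's own weight.
    cum, position = 0.0, []
    for _, w in pairs:
        cum += w
        position.append((cum - 0.5 * w) / total)
    below = [i for i, p in enumerate(position) if p < 0.5]
    if not below:
        return pairs[0][0]
    i = below[-1]
    if i == len(pairs) - 1:
        return pairs[i][0]
    frac = (0.5 - position[i]) / (position[i + 1] - position[i])
    return pairs[i][0] + frac * (pairs[i + 1][0] - pairs[i][0])


def weighted_median_estimate(bx, by, se_by):
    """Weighted median of the variant-specific ratio estimates."""
    theta = [y / x for x, y in zip(bx, by)]
    w = [x ** 2 / s ** 2 for x, s in zip(bx, se_by)]
    return weighted_median(theta, w)


# Hypothetical data: three concordant variants and two outlying variants.
bx = [0.1, 0.1, 0.1, 0.1, 0.1]
by = [0.05, 0.05, 0.05, 0.20, 0.30]
se_by = [0.01] * 5
median_estimate = weighted_median_estimate(bx, by, se_by)
```

In this invented example the IVW estimate would equal the mean of the ratio estimates and be pulled towards the two outliers, whereas the weighted median stays at the value shared by the majority of variants.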

The MR-PRESSO method is a variation on the IVW method that first sequentially removes genetic variants from the analysis whose variant-specific causal estimate differs substantially from those of other variants 21 . The IVW method is then performed for all variants that are not judged to be heterogeneous. A potential problem with this sequential (that is, one-by-one) removal strategy is that, when there are several variants with similar outlying estimates, no single variant may be judged to be an outlier on its own. Alternative methods have considered penalized regression for simultaneous parameter estimation and outlier detection, using a Lasso (also called L1) penalty 22 or an L0 penalty. The constrained maximum likelihood method (MR-cML) 25 is one such method: it performs selection of invalid instruments and estimation while allowing any of the three instrumental variable assumptions to be violated via either uncorrelated or correlated pleiotropy. In addition to asymptotically valid inference, it also offers a data perturbation/resampling scheme to account for uncertainty in model selection and so achieve better inferences in finite samples.

A further class of robust methods uses latent modelling to distinguish to what extent genetic associations with the outcome arise due to a causal effect of the exposure, as opposed to pleiotropic effects of particular variants either on the outcome directly or on a common cause of the exposure and outcome. A causal model is evidenced if the predominance of variants that associate with the exposure also associate with the outcome in a proportional way. If the genetic associations with the outcome do not follow this pattern, then a non-causal explanation would be preferred. Emerging methods that take this approach include the Causal Analyses Using Summary Effect Estimates (CAUSE) 88 and Latent Heritable Confounder Mendelian randomization (LHC-MR) 93 methods.

While it would be excessive to perform every robust method for Mendelian randomization that has been proposed, or even all the methods mentioned here, investigators should pick a sensible range of methods to assess the sensitivity of their findings. For example, one suggestion is to perform MR-RAPS, the weighted median-based method, and the MR-cML method, as these methods require different assumptions to be satisfied for asymptotically consistent estimates (respectively: InSIDE, majority valid, and plurality valid). If estimates from all methods are similar, then any causal claim is more credible. However, finding differences between estimates does not necessarily imply the absence of a causal effect. Different methods will perform better and worse in different scenarios, so critical thought and judgement is required. Two recent simulation studies that compared different methods recommended the contamination mixture method 26 and MR-Mix 94 as having the lowest mean squared error across a range of different methods – these methods both make the same assumption for consistent estimation as the mode-based method, and so either could be used in preference to it. Alternatively, the MR-CUE 95 (“correlated horizontal pleiotropy unraveling shared etiology and confounding”) method has been demonstrated to have good performance in an extensive comparison of methods.

New methods for Mendelian randomization analysis are appearing regularly, and (unsurprisingly) tend to report simulations and applications suggesting they have advantages over previous methods. It is unlikely that one approach will perform best in all settings, and so it is important to perform a range of analyses that depend on different sets of assumptions. Combining this with orthogonal validation – such as through the use of positive and negative controls (Section 7) – is the most powerful way of addressing causal questions, within the triangulation of evidence framework 96, 97 .

We also recommend that a measure of the heterogeneity between variant-specific causal estimates, such as Cochran’s Q statistic or the I² statistic, is reported as a part of a polygenic Mendelian randomization investigation 79, 98, 99 . Conclusions are more reliable when multiple genetic variants provide concordant evidence for a causal effect, and particularly when there is no more heterogeneity between the variant-specific causal estimates than expected by chance. As discussed in Section 5, some heterogeneity may be expected even when all genetic variants are valid instruments. However, causal conclusions are less reliable when there is substantial heterogeneity, especially when there are distinct outliers (which may represent pleiotropic variants) or when evidence for a causal effect depends on one or a small number of variants.
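Both heterogeneity measures can be computed from the variant-specific ratio estimates and their weights. The sketch below assumes uncorrelated variants and first-order weights; the function name is hypothetical. Under homogeneity, Q is approximately chi-squared distributed with degrees of freedom equal to the number of variants minus one.

```python
def heterogeneity(bx, by, se_by):
    """Cochran's Q, its degrees of freedom, and I-squared for ratio estimates."""
    theta = [y / x for x, y in zip(bx, by)]
    w = [x ** 2 / s ** 2 for x, s in zip(bx, se_by)]
    pooled = sum(wi * ti for wi, ti in zip(w, theta)) / sum(w)   # IVW estimate
    q = sum(wi * (ti - pooled) ** 2 for wi, ti in zip(w, theta))
    df = len(theta) - 1
    # I-squared: share of variation beyond that expected by chance, floored at 0.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i_squared
```

For three hypothetical variants with ratio estimates 0.4, 0.5 and 0.6, each with standard error 0.05, this gives Q = 8 on 2 degrees of freedom and I² = 0.75.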

Leave-one-out analyses (i.e. remove one variant from the analysis and re-estimate the causal effect) can be valuable in assessing the reliance of a Mendelian randomization analysis on a particular variant 100 . If there is one genetic variant that is particularly strongly associated with the exposure, then it may dominate the estimate of the causal effect. Investigators should assess the robustness of findings to the removal of such variants. If a causal effect is only evidenced by one variant, then the validity of the inference depends only on that variant. If there are many variants in an analysis, leaving one variant out at a time is unlikely to change the estimate substantially, and leaving out subsets of the variants (say, a randomly chosen 30% at a time 101 ) may be more appropriate. A further approach for identifying variants to remove from the analysis is Steiger filtering, which removes variants from the analysis if their association with the outcome is stronger than that with the exposure 102 . It is highly unlikely that variants could have a stronger association with the outcome than the exposure if the instrumental variable assumptions are satisfied and the genetic association with the outcome is entirely mediated via the exposure (unless there is substantial measurement error in the exposure).
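A leave-one-out analysis can be sketched by recomputing the IVW estimate with each variant omitted in turn. The function names and the data below are hypothetical; the last variant is deliberately constructed to be strongly associated with the exposure and to have a discordant ratio estimate.

```python
def ivw(bx, by, se_by):
    """IVW point estimate for uncorrelated variants."""
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_by))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, se_by))
    return num / den


def leave_one_out(bx, by, se_by):
    """IVW estimate recomputed with each variant omitted in turn."""
    results = []
    for j in range(len(bx)):
        keep = [k for k in range(len(bx)) if k != j]
        results.append(ivw([bx[k] for k in keep],
                           [by[k] for k in keep],
                           [se_by[k] for k in keep]))
    return results


# Hypothetical data: the fourth variant dominates the analysis.
bx = [0.05, 0.06, 0.04, 0.50]
by = [0.005, 0.006, 0.004, 0.250]
se_by = [0.01, 0.01, 0.01, 0.01]
full = ivw(bx, by, se_by)
loo = leave_one_out(bx, by, se_by)
for j, est in enumerate(loo):
    print(f"omitting variant {j + 1}: {est:.3f} (all variants: {full:.3f})")
```

In this example the estimate barely moves when any of the first three variants is omitted, but changes markedly when the fourth is omitted, indicating that the overall estimate relies on that single variant.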

While removing horizontally pleiotropic variants from a Mendelian randomization analysis will improve the validity of causal inferences, there is some danger in a post hoc or data-driven selection of genetic variants. This is particularly true if many genetic variants are judged to be heterogeneous: the removal of too many variants from the analysis could provide a false impression of agreement amongst the remaining variants, and over-precision in the causal estimate. Removing a variant from the analysis is more justified when a pleiotropic association of the variant has been identified 103 .

7. Other approaches for sensitivity analysis

Sensitivity analysis should not be limited to the application of different statistical methods. This is particularly important for investigations based on a single gene region, as several of the methods discussed above are not applicable in this case. Other approaches for assessing robustness include varying the dataset and choice of genetic variants in the analysis (including the suggestion of liberal and conservative variant sets in Section 3), the use of positive and negative control outcomes and/or samples, colocalization, subgroup analyses, and examining associations with potentially pleiotropic variables. We describe each of these in turn.

A positive control outcome is an outcome for which it is already established that the exposure is causal. For example, the outcome of gout may be used as a positive control in a Mendelian randomization investigation for serum uric acid as an exposure, as raised uric acid levels are known to increase risk of gout. Provided that there is sufficient statistical power, then if genetic variants that are associated with serum uric acid are not also associated with risk of gout, then we may question whether the genetic variants are truly able to assess the effects of varying serum uric acid 104 . Conversely, a negative control outcome is an outcome for which it is believed that the exposure cannot be causal 105 . For example, childhood levels of vitamin D have been used as a negative control outcome for the effect of adulthood BMI; childhood BMI was shown to affect childhood vitamin D levels in a Mendelian randomization investigation, but adulthood BMI did not 106 . If a Mendelian randomization investigation suggests that the negative control outcome is caused by the exposure, then violation of the instrumental variable assumptions (such as through pleiotropy or population stratification) may be suspected 107 .

Colocalization assesses whether the same genetic variant (or variants) influences two traits 108, 109 . If genetic variants in a given gene region are associated with both an exposure and an outcome, it may be that the same genetic variants causally influence both the exposure and outcome (implying the likely presence of a causal pathway including the exposure and outcome). However, it may instead be that the two associations are driven by different causal variants, and these variants are correlated due to linkage disequilibrium 110 . This would typically indicate a violation of the Mendelian randomization assumptions. An example of this is the APOE gene region, which contains genetic variants associated with LDL-cholesterol and Alzheimer’s disease. However, LDL-cholesterol does not appear to be a cause of Alzheimer’s disease in Mendelian randomization analyses using variants from other gene regions 111 . A colocalization analysis revealed distinct causal variants for LDL-cholesterol and Alzheimer’s disease at the APOE locus, indicating that variants in this gene region are not valid instruments assessing the effect of LDL-cholesterol on Alzheimer’s disease 112 .

Colocalization can be a useful sensitivity analysis when a Mendelian randomization analysis is based on a single gene region 113 . However, there are several limitations to such an analysis, including identifying the true causal exposure (this is particularly relevant when using gene expression as the exposure, as colocalization results often differ depending on the choice of tissue) and statistical power. Bayesian approaches for colocalization (such as the coloc family of methods) typically only conclude that there is colocalization if there are variants having strong associations (p < 10⁻⁴ or 10⁻⁵) with both the exposure and outcome. In the language of coloc, if genetic variants are strongly associated with the exposure but not the outcome, then the method may prioritize hypothesis H₁ (existence of a causal variant for trait 1 but not trait 2) rather than hypothesis H₃ (distinct causal variants for traits 1 and 2 – that is, the traits fail to colocalize) or hypothesis H₄ (shared causal variants for traits 1 and 2 – that is, the traits colocalize). Hence, while in some cases colocalization methods will provide helpful evidence supporting or questioning the Mendelian randomization assumptions, in other cases they may provide no strong evidence for or against colocalization 112 .

A subgroup analysis comparing Mendelian randomization results from subgroups of the population in which the genetic variants have different degrees of association with the exposure can serve as a sensitivity analysis to assess the instrumental variable assumptions. An example of such a subgroup analysis is the comparison of genetic associations with blood pressure in men and women in an East Asian population for variants implicated in the metabolism of alcohol 114, 115 . As women in East Asia tend not to drink alcohol, genetic associations with blood pressure are observed in men but not in women. Also, genetic associations are stronger in heavier drinkers 114 . This provides confidence that the genetic associations are driven by alcohol consumption and not by a pleiotropic mechanism. Such an analysis can be performed if there is a subgroup of the population that has reduced or increased levels of the exposure 116, 117 . However, if the subgroup is defined by a collider (see below), then stratification can introduce bias to the analysis 118 . As sex cannot be affected by autosomal genetic variants, sex cannot be a collider, and so stratification on sex will not induce collider bias 119 .

A further possible sensitivity analysis is to check the genetic associations with other variables associated with the outcome, and which are thought not to lie on the causal pathway through the exposure (i.e. are not mediators). Such variables may lie on alternative pleiotropic pathways to the outcome. If the genetic variants are not associated with such variables, then some reassurance can be drawn that the Mendelian randomization assumptions are satisfied. A further possibility in this case is to perform a multivariable Mendelian randomization, including the putative pleiotropic variables as additional exposures in the analysis model 120 . This analysis will estimate the direct effect of the exposure on the outcome keeping these variables constant.

There are several other potential sources of bias in a Mendelian randomization analysis other than invalid instruments. We consider here collider bias, selection bias, and reverse causation as three potential sources of bias, and direct readers to reviews that list further potential sources of bias 68, 121 .

A collider is a common effect of two variables – for example, the exposure is influenced by the genetic variants and the exposure–outcome confounders, and so is a collider. Any variable causally downstream of the exposure is also affected by the genetic variants and confounders, and so is also a collider. Even if the genetic variants and confounders are uncorrelated (they are marginally independent in the population), they will typically be associated when conditioning on the collider (they become conditionally dependent) 118 . Stratifying on or adjusting for a collider therefore leads to an association between variables that influence the collider. An association between the genetic variants and the exposure–outcome confounders would lead to biased causal estimates 122 . Collider bias is not unique to Mendelian randomization, but it is particularly relevant as some published genetic association estimates have been adjusted for potential colliders 123 . For example, genome-wide association studies of many brain volume measures routinely adjust for measures of cranial size or total brain volume, but head size may itself be influenced by exposures of interest in subsequent Mendelian randomization studies 124 . Methods to account for collider bias have recently been proposed 125–127 .
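The way conditioning on a collider induces an association can be seen in a small simulation. The sketch below is illustrative only: the variable names, effect sizes, sample size, and the selection threshold on the exposure are all assumptions made for the example.

```python
import random


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


rng = random.Random(2023)
n = 20000
g = [rng.gauss(0, 1) for _ in range(n)]   # genetic score
u = [rng.gauss(0, 1) for _ in range(n)]   # confounder, independent of g
x = [gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]   # exposure: a collider

marginal = correlation(g, u)
# Condition on the collider by keeping only individuals with high exposure.
selected = [(gi, ui) for gi, ui, xi in zip(g, u, x) if xi > 1.0]
conditional = correlation([p[0] for p in selected], [p[1] for p in selected])
print(f"marginal: {marginal:.3f}  conditional on high exposure: {conditional:.3f}")
```

In the full simulated population the genetic score and the confounder are close to uncorrelated, but among individuals selected on high exposure they become negatively correlated, which is the induced association that biases causal estimates.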

Selection bias is a specific example of collider bias which occurs when selection into a study sample depends on a collider. Most epidemiological studies do not recruit all individuals from the target population with equal probability, and so suffer from selection bias. Even if genetic variants behave as if randomly distributed in the population as a whole, they may not be randomly distributed in a selected subset of the population. A specific example of selection bias is index event bias, where entry into the study sample is dependent on having a particular index event 128 . For example, investigations into disease survival can only include individuals who have had an initial disease event 129 . Simulation studies have shown that selection bias can have a severe impact on Mendelian randomization estimates, but only when the selection effects are quite strong 72, 122 . Selection bias can potentially be addressed using inverse-probability weighting 130 , although this requires estimation of the probability of selection into the study sample for all participants. Specific weights to reduce selection bias have been proposed for the UK Biobank study 131 .

While the genetic code is fixed at conception and so cannot be influenced by reverse causation, if the outcome influences the risk factor, then variants that primarily affect the outcome would (in a large enough sample size) be associated with the exposure 132 . As discussed above and shown in Figure 4f, if genetic variants used as instrumental variables for the exposure in fact influence the outcome primarily, then genetic associations with the outcome could be present without the exposure influencing the outcome. The MR-Steiger method has been developed to detect such variants (for a continuous exposure and outcome) and remove them from the analysis 102 .
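A simplified sketch of the directionality idea follows: for each variant, compare the proportion of variance it explains in the exposure with the proportion it explains in the outcome, and flag variants that explain more variance in the outcome. This is an illustration only. The published MR-Steiger method applies a formal statistical test to the difference in correlations, whereas the plain comparison, the function names, and the summary statistics below are assumptions made for this example.

```python
import numpy as np

def variance_explained(beta, se, n):
    """Approximate r^2 for a continuous trait from a regression coefficient,
    its standard error, and the sample size: r^2 = t^2 / (t^2 + n - 2)."""
    t2 = (np.asarray(beta, dtype=float) / np.asarray(se, dtype=float)) ** 2
    return t2 / (t2 + n - 2)

def steiger_style_filter(beta_x, se_x, n_x, beta_y, se_y, n_y):
    """Boolean mask: True where a variant explains more variance in the
    exposure than in the outcome (simplified; no formal test is applied)."""
    r2_x = variance_explained(beta_x, se_x, n_x)
    r2_y = variance_explained(beta_y, se_y, n_y)
    return r2_x > r2_y

# Hypothetical summary statistics for three variants.
keep = steiger_style_filter(
    beta_x=[0.10, 0.08, 0.02], se_x=[0.01, 0.01, 0.01], n_x=50_000,
    beta_y=[0.03, 0.02, 0.09], se_y=[0.01, 0.01, 0.01], n_y=50_000,
)
print(keep)  # the third variant is flagged as primarily affecting the outcome
```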

8. Extensions and additional analyses

We distinguish between sensitivity and supplementary analyses discussed in the previous two sections, which are conducted to improve reliability in testing the primary causal hypothesis, and extensions and additional analyses, which address related, but distinct causal hypotheses. We only provide brief comments and references on these extensions to Mendelian randomization, several of which are the subject of ongoing methodological investigations.

Non-linear Mendelian randomization aims to characterize the shape of the causal relationship between the exposure and outcome; that is, does the causal effect of the exposure on the outcome vary at different levels of the exposure 39 ? Several methods for non-linear instrumental variable analysis have been proposed. Two broad categories of non-linear methods operate by: i) estimation of a flexible model relating the exposure to the outcome 133, 134 , and ii) stratification of the population into strata with different average levels of the exposure, and estimation of stratum-specific causal effects 135 . Results from these approaches can be sensitive to the assumptions made by the methods: for the first category, to which models relating the exposure to the outcome are considered 136 ; and for the second category, to whether the genetic effect on the exposure varies in the population 137 . Indeed, variability in the effect of genetic variants on the exposure is evident for several exposures; this variability can lead to highly misleading estimates 137, 138 . A doubly-ranked stratification method has been proposed that may be less sensitive to variability in these genetic effects 139 . The reliability of existing non-linear methods is a topic of ongoing research.

Factorial Mendelian randomization takes genetic predictors of two exposures (or two interventions on the same exposure), and assesses whether there is statistical interaction between these in their association with the outcome. Under the assumption that the genetic predictors are instrumental variables for the exposures, the statistical interaction can be interpreted as an interaction between the causal effects of the exposures on the outcome on the same scale 140 . For example, a study investigated interactions between genetic variants in the HMGCR gene region and the PCSK9 gene region, which respectively can be regarded as proxies for statins and PCSK9 inhibitors 141 , in their associations with coronary artery disease risk. The investigation found no association of the outcome with the interaction term between these variants in logistic regression, indicating no evidence for deviation from additivity in the combined effects of statins and PCSK9 inhibitors on a logit scale. A weakness of these investigations is that statistical power to detect an interaction is often low, in which case a null finding should not be interpreted as strong evidence for lack of interaction.

Time-varying Mendelian randomization aims to assess the potentially varying effect of an exposure at different periods during the life course 142 . For example, an investigation considered genetic predictors of BMI measured during early-life and later-life, and used a multivariable Mendelian randomization framework to assess the independent effects of early-life and later-life BMI on coronary artery disease risk 143 . Little evidence was found for a direct effect of early-life BMI on coronary artery disease risk. In contrast, for breast cancer, the effect of early-life BMI appeared stronger than the effect of later-life BMI. Researchers should be cautious when performing such analyses to ensure that the values of the exposure at different time periods genuinely represent biologically distinct risk factors, and not simply measures taken at different times but capturing the same essential risk factor 144 .

Mediation can be assessed in a Mendelian randomization framework in two ways 145 . In two-step Mendelian randomization, analysts estimate the effect of the exposure on the outcome, and compare this to the product of the effect of the exposure on the proposed mediator and the effect of the mediator on the outcome 146 . Each of these steps can be performed using standard Mendelian randomization, although separate instrumental variables are required for the exposure and mediator. These approaches can also be used to explore evidence for molecular mediation: the involvement of gene expression, DNA methylation, or metabolites in a disease pathway 147 . Alternatively, analysts can compare the effect of the exposure on the outcome from standard (that is, univariable) Mendelian randomization to the effect from multivariable Mendelian randomization including the mediator as an additional exposure variable 84 . The latter estimate represents the direct (that is, unmediated) effect of the exposure on the outcome. For example, investigators considered the effect of time in education on coronary heart disease in a multivariable Mendelian randomization additionally accounting for BMI, systolic blood pressure, and smoking behaviour 148 . They showed that a substantial proportion of the effect of education on coronary heart disease risk was mediated via one or other of these traits.
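The arithmetic of the two approaches can be sketched as follows. The function names and all numerical inputs are hypothetical; the standard error of the indirect effect uses a first-order delta-method approximation that assumes the two step estimates are independent, and the proportion mediated is only straightforward to interpret when all effects are on a linear scale.

```python
import math

def two_step_mediation(total, exp_to_med, se_exp_to_med, med_to_out, se_med_to_out):
    """Product-of-coefficients estimate of the indirect (mediated) effect."""
    indirect = exp_to_med * med_to_out
    # Delta-method standard error, assuming independent step estimates.
    se_indirect = math.sqrt((exp_to_med * se_med_to_out) ** 2
                            + (med_to_out * se_exp_to_med) ** 2)
    return {
        "indirect": indirect,
        "se_indirect": se_indirect,
        "proportion_mediated": indirect / total,
    }

def difference_method(total, direct):
    """Indirect effect as total (univariable) minus direct (multivariable) effect."""
    indirect = total - direct
    return {"indirect": indirect, "proportion_mediated": indirect / total}

# Hypothetical estimates on a linear scale.
two_step = two_step_mediation(total=0.40, exp_to_med=0.50, se_exp_to_med=0.05,
                              med_to_out=0.30, se_med_to_out=0.06)
difference = difference_method(total=0.40, direct=0.25)
print(two_step)
print(difference)
```

With these illustrative inputs both approaches attribute 0.15 of the total effect of 0.40 to the mediator; in real data the two need not agree exactly.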

In bidirectional Mendelian randomization, investigators perform separate Mendelian randomization analyses to assess the effect of the exposure on the outcome, and the effect of the outcome on the exposure 149 . These analyses require separate instrumental variables for the exposure and for the outcome 150 . For example, investigators considered associations between genetic predictors of educational attainment and short-sightedness, and between genetic predictors of short-sightedness and educational attainment 151 . They found associations in the former case, but not in the latter. This suggests that time spent in education affects one’s eyesight, rather than poor eyesight affecting an individual’s propensity to spend more time in education. Resolving the direction of causation between these two factors using Mendelian randomization answered a question first posed over 400 years ago 152 . Although in this example evidence for a causal effect was found in one direction but not the other, this is not always the case. For example, there is evidence from Mendelian randomization that higher BMI has a causal effect on increasing smoking prevalence 153 , and that cigarette smoking causally reduces BMI 154 . However, in other cases, bidirectional Mendelian randomization findings may reflect the presence of shared aetiological pathways, rather than true causal effects in both directions 155 .

9. Data presentation

An attractive feature of Mendelian randomization is that the analysis can be summarized graphically in a transparent way. For example, in a polygenic analysis, a scatter plot of the genetic associations with the outcome against the genetic associations with the exposure reveals much about the analysis – whether different genetic variants provide similar estimates of the causal effect or if there is considerable heterogeneity, and whether the analysis is dominated by a single genetic variant or not 31 . The scatter plot is appealing as it presents the data with no manipulation. Examples of scatter plots illustrating heterogeneity and no heterogeneity in the causal estimates from different variants are shown in Figure 5. Alternatives are forest plots, funnel plots, and radial plots – each of these assesses heterogeneity in the variant-specific causal estimates 156 . Plots allow the investigators and readers to assess the reliability of the analysis method and its underlying assumptions, and we strongly recommend their inclusion in a manuscript.

Figure 5. Scatter plot of genetic associations with the outcome (vertical axis) against genetic associations with the exposure (horizontal axis).

Examples illustrated are: (left) no heterogeneity in the variant-specific causal estimates (effect of LDL-cholesterol on coronary heart disease risk using 8 variants associated with LDL-cholesterol); and (right) heterogeneity in the variant-specific causal estimates (effect of C-reactive protein on coronary heart disease risk using 17 genome-wide significant predictors of C-reactive protein). As indicated by differences in estimates, not all genetic variants are valid instrumental variables for C-reactive protein, and so a causal interpretation is not appropriate. Taken from Burgess et al., 2018 92 .
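The quantities that such a scatter plot displays can also be summarized numerically. The sketch below computes variant-specific ratio estimates, an inverse-variance weighted (IVW) estimate using first-order weights, and Cochran's Q statistic for heterogeneity, which under homogeneity is approximately chi-squared distributed with (number of variants minus 1) degrees of freedom. The function name and the summary statistics are invented for illustration and are not the data shown in Figure 5.

```python
import numpy as np

def ivw_and_q(beta_x, beta_y, se_y):
    """Variant-specific ratio estimates, IVW estimate, and Cochran's Q.

    beta_x: genetic associations with the exposure
    beta_y, se_y: genetic associations with the outcome and their standard errors
    """
    beta_x, beta_y, se_y = (np.asarray(a, dtype=float) for a in (beta_x, beta_y, se_y))
    ratio = beta_y / beta_x                 # slope from the origin to each point
    weights = (beta_x / se_y) ** 2          # first-order inverse-variance weights
    ivw = np.sum(weights * ratio) / np.sum(weights)
    q = np.sum(weights * (ratio - ivw) ** 2)
    return ratio, ivw, q, len(ratio) - 1    # last value: degrees of freedom for Q

beta_x = [0.1, 0.2, 0.3, 0.4]
se_y = [0.02, 0.02, 0.02, 0.02]

# Homogeneous case: every variant implies the same causal estimate.
_, ivw_hom, q_hom, df = ivw_and_q(beta_x, [0.05, 0.10, 0.15, 0.20], se_y)
# Heterogeneous case: the fourth variant implies a much larger estimate.
_, ivw_het, q_het, _ = ivw_and_q(beta_x, [0.05, 0.10, 0.15, 0.40], se_y)

print(f"homogeneous  : IVW = {ivw_hom:.3f}, Q = {q_hom:.2f} on {df} df")
print(f"heterogeneous: IVW = {ivw_het:.3f}, Q = {q_het:.2f} on {df} df")
```

A Q statistic far larger than its degrees of freedom, as in the second case, corresponds to points that do not lie along a single line through the origin on the scatter plot.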

Other important information to report includes the first-stage R² statistic (when the exposure is continuous), which is a measure of the variance in the exposure explained by the genetic variants, and (particularly in a one-sample setting) the related F statistic, which is a measure of instrument strength and can be used to judge the extent of weak instrument bias 157 . For multivariable Mendelian randomization, the conditional F statistic is a more relevant measure of instrument strength, and assesses the strength of association of the variants with each exposure in turn after accounting for the other exposures in the model 158 .
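The relationship between the R² and F statistics can be sketched as below. The first function uses the standard regression identity for an F statistic with k instruments and n participants; the second uses the common summary-data approximation of a per-variant F statistic as the squared ratio of the variant-exposure association to its standard error. Function names and input values are illustrative assumptions, and the conditional F statistic for multivariable analyses is not covered here.

```python
def r_squared_to_f(r2, n, k):
    """F statistic for k genetic variants jointly explaining a proportion r2
    of the variance in the exposure among n participants."""
    if not 0 <= r2 < 1:
        raise ValueError("r2 must lie in [0, 1)")
    return (r2 / (1 - r2)) * (n - k - 1) / k

def per_variant_f(beta_x, se_x):
    """Approximate F statistic for a single variant from summary statistics."""
    return (beta_x / se_x) ** 2

# Hypothetical example: 10 variants explaining 2% of variance in 10,000 people.
f_stat = r_squared_to_f(r2=0.02, n=10_000, k=10)
f_single = per_variant_f(beta_x=0.05, se_x=0.01)
print(f"joint F statistic      : {f_stat:.1f}")
print(f"per-variant F statistic: {f_single:.1f}")
```

For a fixed R², adding more variants lowers the F statistic, which is one reason why including many weakly associated variants can worsen weak instrument bias.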

Investigators can also make some statement about the power of their proposed analyses. Power to detect a causal effect depends on the proportion of variance in the exposure explained by the genetic variants, the proposed size of the causal effect, the sample size (for the genetic associations with the outcome), and (with a binary outcome) the proportion of individuals with an outcome event. Power calculators can be found at http://cnsgenomics.com/shiny/mRnd/ and https://sb452.shinyapps.io/power/. Power calculations are often performed post hoc, as sample sizes are rarely determined based on a proposed Mendelian randomization analysis. Power calculations are more meaningful when performed prior to the analysis, and can guide investigators on which exposure/outcome pairs to consider, and so focus on analyses that have a better chance of giving meaningful results.
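For a continuous outcome, a rough normal approximation shows how these ingredients combine. The sketch assumes that the exposure and outcome are both measured in standard deviation units, that the causal effect is small enough for the outcome variance to stay close to 1, and that the test is two-sided (the negligible opposite tail is ignored). It is an approximation written for illustration under those assumptions, not a reproduction of the online calculators listed above, and the function name is hypothetical.

```python
from statistics import NormalDist

def mr_power_continuous(n, r2, beta, alpha=0.05):
    """Approximate power to detect a causal effect `beta` (in SD units) with
    outcome sample size `n` and instruments explaining a proportion `r2`
    of the variance in the exposure."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, about 1.96
    noncentrality = abs(beta) * (n * r2) ** 0.5    # expected z-statistic
    return NormalDist().cdf(noncentrality - z_crit)

power_small = mr_power_continuous(n=20_000, r2=0.02, beta=0.1)
power_large = mr_power_continuous(n=50_000, r2=0.02, beta=0.1)
print(f"n = 20,000: power = {power_small:.2f}")
print(f"n = 50,000: power = {power_large:.2f}")
```

Because power depends on the product of sample size and variance explained, halving the variance explained has the same effect in this approximation as halving the sample size.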

10. Interpretation

Finally, we discuss the interpretation of findings from Mendelian randomization investigations. In the first instance, a Mendelian randomization investigation assesses the association of genetic predictors of an exposure with an outcome, or equivalently, the association of genetically-predicted levels of an exposure with an outcome. Making causal inferences from observational data always relies on untestable assumptions. In Mendelian randomization, a key assumption is that observed differences in the outcome associated with genetically-predicted levels of the exposure would also be seen if the exposure were intervened on 9, 71 . This version of the consistency assumption in causal inference 159 is referred to as gene—environment equivalence 160 .

In line with the STROBE-MR guidelines 1, 2 , we recommend that a cautious interpretation should be taken when describing the extent to which a causal effect has been demonstrated by a Mendelian randomization investigation. The appropriate degree of caution will depend on the plausibility of the instrumental variable assumptions, the concordance of estimates from different methods and different analytical approaches, the results from sensitivity and supplementary analyses, and so on. Even if a Mendelian randomization finding is replicated in a separate dataset, there is still intrinsic uncertainty in the instrumental variable assumptions, meaning that uncertainty in a causal conclusion remains. Another specific caution is that if multiple related traits are similarly associated with the same genetic variants (such as different measures of obesity or gene expression in different tissues), then Mendelian randomization approaches cannot identify the true causal risk factor without additional assumptions.

Mendelian randomization estimates relate specifically to changes in the exposure induced by the genetic variants used as instrumental variables. The genetic code is fixed at conception, and so Mendelian randomization investigations typically compare groups of the population having different trajectories in their distribution of the exposure over time 161 . Analyses therefore typically can be interpreted as assessing the impact of long-term elevated levels of an exposure 162 . For example, genetic variants in the CRP gene have been shown to be associated with CRP levels throughout the life course, with similar relative associations in childhood and in middle age 163 . However, in most cases, we have incomplete information about how the genetic variant changes the distribution of the exposure across the life course. If the genetic associations with the exposure vary over time, then Mendelian randomization estimates based on genetic associations with the exposure measured at a single timepoint can be unreliable 33 . Similar difficulties of interpretation arise if the impact on the outcome relates to levels of the exposure at a specific time period in life. A plausible example of this is the effect of vitamin D on multiple sclerosis; multiple sclerosis risk is hypothesized to be influenced by vitamin D levels during childhood, but not vitamin D levels in adulthood 164 .

That said, results from Mendelian randomization investigations have often been shown to qualitatively agree with the results from randomized trials, suggesting that a causal interpretation for Mendelian randomization findings is often reasonable 121 . Mendelian randomization investigations are worthwhile in providing an alternative line of aetiological evidence even though the instrumental variable assumptions can never be proved beyond all doubt 96, 97 . However, quantitative differences between estimates from Mendelian randomization and from trials are likely, particularly as there are differences between how genetic variants influence the exposure and how clinical and pharmaceutical interventions influence the exposure 165 . As genetic variants typically affect usual levels of exposures on a long-term basis, Mendelian randomization estimates are often larger than those from conventional observational studies or randomized trials for the same magnitude of difference in the exposure 33 . Hence, the causal estimate from a Mendelian randomization investigation should not generally be interpreted directly as the expected impact of intervening on the exposure in applied practice 166 .

The estimate from a Mendelian randomization investigation is therefore better interpreted as a test statistic for a causal hypothesis and an indicator of the direction of the effect, rather than the estimated impact of a well-defined intervention at a specific point in time. But even when a Mendelian randomization investigation is performed primarily to assess the causal role of an exposure, causal estimates can still be useful, for example to assess heterogeneity in estimates from different variants as a test of instrument validity, or to compare results from different analysis methods as an assessment of robustness. A logical consequence of the 2SLS/IVW method providing the most efficient causal estimate when combining evidence across multiple valid instrumental variables is that, under the same assumptions, the method provides the most powerful test of the presence of a causal effect.

Summary

Overall, the key elements of a Mendelian randomization investigation to be reported in any manuscript are: i) motivation for why a Mendelian randomization analysis should be performed and for the scope of the analysis, ii) a clear description and justification of the choice of dataset(s) for the analysis, including why a one- or two-sample approach was chosen for the primary analysis, iii) a clear description and justification of the choice of genetic variants used in the analysis, iv) a discussion, whether statistically or biologically led, of whether the genetic variants are likely to satisfy the instrumental variable assumptions, v) a graphical presentation of the data, such as a scatter plot of the genetic associations, and vi) some attempt to test the robustness of the main findings, whether by use of robust methods (for a polygenic analysis) or another approach – whatever is most appropriate to the analysis under consideration. These elements are necessary for the reader to judge the reliability of a Mendelian randomization investigation.

Particularly with the advent of summarized data and the two-sample setting, performing a Mendelian randomization analysis has become more straightforward 30 . The difficulty is not in performing a Mendelian randomization analysis, but rather in performing a credible analysis 167 and providing a reasoned interpretation 164 . We hope that these guidelines, summarized in the accompanying flowcharts ( Figure 1 and Figure 2) and checklist ( Figure 3), will aid practitioners in performing reliable analyses, and editors and reviewers in judging the reliability of analyses, and that their use will help improve the overall quality of Mendelian randomization investigations.

Disclaimer

The views expressed in this article are those of the authors. Publication in Wellcome Open Research does not imply endorsement by Wellcome.

Funding Statement

Grant information is now as follows: Stephen Burgess is supported by the Wellcome Trust (225790/Z/22/Z) and the United Kingdom Research and Innovation Medical Research Council (MC_UU_00002/7). Dipender Gill is funded by the Wellcome Trust. Evropi Theodoratou is supported by a CRUK Career Development Fellowship (C31250/A22804). George Davey Smith, Neil M Davies, Fernando Hartwig, and Caroline Relton work within a Unit funded by the Medical Research Council (MC_UU_00011/1, MC_UU_00011/5). Neil Davies is supported by a Future Research Leaders grant funded by the Economic and Social Research Council (ES/N000757/1) and a Norwegian Research Council grant (number 295989). Michael Holmes works in a unit that receives funding from the UK Medical Research Council and is supported by a British Heart Foundation Intermediate Clinical Research Fellowship (FS/18/23/33512) and the National Institute for Health Research Oxford Biomedical Research Centre. Wei Pan is supported by NIH grants (R01 AG065636, RF1 AG067924, U01 AG073079). This research was supported by the National Institute for Health Research Cambridge Biomedical Research Centre (NIHR203312). The views expressed are those of the authors and not necessarily those of the National Institute for Health Research or the Department of Health and Social Care.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 3; peer review: 2 approved]

Data availability

No data are associated with this article.

References

  • 1. Skrivankova VW, Richmond RC, Woolf BAR, et al.: Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization: The STROBE-MR Steering Group Statement. JAMA. 2021;326(16):1614–1621. doi: 10.1001/jama.2021.18236
  • 2. Skrivankova VW, Richmond RC, Woolf BAR, et al.: Strengthening the reporting of observational studies in epidemiology using mendelian randomisation (STROBE-MR): explanation and elaboration. BMJ. 2021;375: n2233. doi: 10.1136/bmj.n2233
  • 3. Davey Smith G, Ebrahim S: 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1–22. doi: 10.1093/ije/dyg070
  • 4. Burgess S, Thompson SG: Mendelian Randomization: Methods for causal inference using Genetic Variants. 2nd ed. Chapman & Hall/CRC; 2021.
  • 5. Davies NM, Holmes MV, Davey Smith G: Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ. 2018;362: k601. doi: 10.1136/bmj.k601
  • 6. Sanderson E, Glymour MM, Holmes MV, et al.: Mendelian randomization. Nat Rev Methods Primers. 2022;2: 6. doi: 10.1038/s43586-021-00092-5
  • 7. Greenland S: An introduction to instrumental variables for epidemiologists. Int J Epidemiol. 2000;29(4):722–729. doi: 10.1093/ije/29.4.722
  • 8. Martens EP, Pestman WR, de Boer A, et al.: Instrumental variables: application and limitations. Epidemiology. 2006;17(3):260–267. doi: 10.1097/01.ede.0000215160.88317.cb
  • 9. Didelez V, Sheehan N: Mendelian randomization as an instrumental variable approach to causal inference. Stat Methods Med Res. 2007;16(4):309–330. doi: 10.1177/0962280206077743
  • 10. Swanson SA, Labrecque J, Hernán MA: Causal null hypotheses of sustained treatment strategies: What can be tested with an instrumental variable? Eur J Epidemiol. 2018;33(8):723–728. doi: 10.1007/s10654-018-0396-6
  • 11. VanderWeele TJ, Tchetgen Tchetgen EJ, Cornelis M, et al.: Methodological challenges in Mendelian randomization. Epidemiology. 2014;25(3):427–435. doi: 10.1097/EDE.0000000000000081
  • 12. Hernán MA, Taubman SL: Does obesity shorten life? The importance of well-defined interventions to answer causal questions. Int J Obes (Lond). 2008;32 Suppl 3:S8–S14. doi: 10.1038/ijo.2008.82
  • 13. Hernán MA, Robins JM: Instruments for causal inference: an epidemiologist's dream? Epidemiology. 2006;17(4):360–372. doi: 10.1097/01.ede.0000222409.00878.37
  • 14. Bulik-Sullivan BK, Loh PR, Finucane HK, et al.: LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47(3):291–5. doi: 10.1038/ng.3211
  • 15. Visscher PM, Hemani G, Vinkhuyzen AA, et al.: Statistical power to detect genetic (co)variance of complex traits using SNP data in unrelated samples. PLoS Genet. 2014;10(4): e1004269. doi: 10.1371/journal.pgen.1004269
  • 16. Burgess S, Butterworth A, Thompson SG: Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–665. doi: 10.1002/gepi.21758
  • 17. Bowden J, Davey Smith G, Burgess S: Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–525. doi: 10.1093/ije/dyv080
  • 18. Zhao Q, Wang J, Hemani G, et al.: Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann Stat. 2020;48(3):1742–1769. doi: 10.1214/19-AOS1866
  • 19. Bowden J, Davey Smith G, Haycock PC, et al.: Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol. 2016;40(4):304–314. doi: 10.1002/gepi.21965
  • 20. Hartwig FP, Davey Smith G, Bowden J: Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985–1998. doi: 10.1093/ije/dyx102
  • 21. Verbanck M, Chen CY, Neale B, et al.: Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–698. doi: 10.1038/s41588-018-0099-7
  • 22. Rees JMB, Wood AM, Dudbridge F, et al.: Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates. PLoS One. 2019;14(9): e0222362. doi: 10.1371/journal.pone.0222362
  • 23. Burgess S, Foley CN, Allara E, et al.: A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun. 2020;11(1): 376. doi: 10.1038/s41467-019-14156-4
  • 24. Qi G, Chatterjee N: Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects. Nat Commun. 2019;10(1): 1941. doi: 10.1038/s41467-019-09432-2
  • 25. Xue H, Shen X, Pan W: Constrained maximum likelihood-based Mendelian randomization robust to both correlated and uncorrelated pleiotropic effects. Am J Hum Genet. 2021;108(7):1251–1269. doi: 10.1016/j.ajhg.2021.05.014
  • 26. Slob E, Burgess S: A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol. 2020;44(4):313–329. doi: 10.1002/gepi.22295
  • 27. Millard LA, Davies NM, Timpson NJ, et al.: MR-PheWAS: hypothesis prioritization among potential causal effects of body mass index on many outcomes, using Mendelian randomization. Sci Rep. 2015;5: 16645. doi: 10.1038/srep16645
  • 28. Li X, Meng X, Spiliopoulou A, et al.: MR-PheWAS: exploring the causal effect of SUA level on multiple disease outcomes by using genetic instruments in UK Biobank. Ann Rheum Dis. 2018;77(7):1039–1047. doi: 10.1136/annrheumdis-2017-212534
  • 29. Gill D, Benyamin B, Moore LSP, et al.: Associations of genetically determined iron status across the phenome: A mendelian randomization study. PLoS Med. 2019;16(6): e1002833. doi: 10.1371/journal.pmed.1002833
  • 30. Hemani G, Zheng J, Elsworth B, et al.: The MR-Base platform supports systematic causal inference across the human phenome. eLife. 2018;7: e34408. doi: 10.7554/eLife.34408
  • 31. Burgess S, Scott RA, Timpson NJ, et al.: Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur J Epidemiol. 2015;30(7):543–552. doi: 10.1007/s10654-015-0011-z
  • 32. Burgess S, Dudbridge F, Thompson SG: Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat Med. 2016;35(11):1880–1906. doi: 10.1002/sim.6835
  • 33. Labrecque JA, Swanson SA: Interpretation and Potential Biases of Mendelian Randomization Estimates With Time-Varying Exposures. Am J Epidemiol. 2019;188(1):231–238. doi: 10.1093/aje/kwy204
  • 34. Bowden J, Del Greco MF, Minelli C, et al.: A framework for the investigation of pleiotropy in two-sample summary data Mendelian randomization. Stat Med. 2017;36(11):1783–1802. doi: 10.1002/sim.7221
  • 35. Staley JR, Blackshaw J, Kamat MA, et al.: PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics. 2016;32(20):3207–3209. doi: 10.1093/bioinformatics/btw373
  • 36. Elsworth B, Lyon M, Alexander T, et al.: The MRC IEU OpenGWAS data infrastructure. bioRxiv. 2020; 2020.08.10.244293. doi: 10.1101/2020.08.10.244293
  • 37. Minelli C, Del Greco MF, van der Plaat DA, et al.: The use of two-sample methods for Mendelian randomization analyses on single large datasets. Int J Epidemiol. 2021;50(5):1651–1659. doi: 10.1093/ije/dyab084
  • 38. Hartwig FP, Tilling K, Davey Smith G, et al.: Bias in two-sample Mendelian randomization when using heritable covariable-adjusted summary associations. Int J Epidemiol. 2021;50(5):1639–1650. doi: 10.1093/ije/dyaa266
  • 39. Burgess S, Davies NM, Thompson SG, et al.: Instrumental variable analysis with a nonlinear exposure-outcome relationship. Epidemiology. 2014;25(6):877–885. doi: 10.1097/EDE.0000000000000161
  • 40. Pierce BL, Burgess S: Efficient design for Mendelian randomization studies: subsample and 2-sample instrumental variable estimators. Am J Epidemiol. 2013;178(7):1177–1184. doi: 10.1093/aje/kwt084
  • 41. Burgess S, Thompson SG, CRP CHD Genetics Collaboration: Avoiding bias from weak instruments in Mendelian randomization studies. Int J Epidemiol. 2011;40(3):755–764. doi: 10.1093/ije/dyr036
  • 42. Burgess S, Davies NM, Thompson SG: Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol. 2016;40(7):597–608. doi: 10.1002/gepi.21998
  • 43. Mounier N, Kutalik Z: Bias correction for inverse variance weighting Mendelian randomization. Genet Epidemiol. 2023;47(4):314–331. doi: 10.1002/gepi.22522
  • 44. Ye T, Shao J, Kang H: Debiased inverse-variance weighted estimator in two-sample summary-data Mendelian randomization. Ann Stat. 2021;49(4):2079–2100. doi: 10.1214/20-AOS2027
  • 45. Xu S, Wang P, Fung WK, et al. : A novel penalized inverse-variance weighted estimator for Mendelian randomization with applications to COVID-19 outcomes. Biometrics. 2022. 10.1111/biom.13732 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Davey Smith G, Lawlor DA, Harbord R, et al. : Clustered Environments and Randomized Genes: A Fundamental Distinction between Conventional and Genetic Epidemiology. PLoS Med. 2007;4(12): e352. 10.1371/journal.pmed.0040352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Taylor M, Tansey KE, Lawlor DA, et al. : Testing the principles of Mendelian randomization: Opportunities and complications on a genomewide scale. bioRxiv. 2017; 124362. 10.1101/124362 [DOI] [Google Scholar]
  • 48. Howe LJ, Nivard MG, Morris TT, et al. : Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nat Genet. 2022;54(5):581–592. 10.1038/s41588-022-01062-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Vissers LET, Sluijs I, van der Schouw YT, et al. : Dairy Product Intake and Risk of Type 2 Diabetes in EPIC-InterAct: A Mendelian Randomization Study. Diabetes Care. 2019;42(4):568–575. 10.2337/dc18-2034 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Hartwig FP, Davies NM, Davey Smith G: Bias in Mendelian randomization due to assortative mating. Genet Epidemiol. 2018;42(7):608–620. 10.1002/gepi.22138 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Davies NM, Howe LJ, Brumpton B, et al. : Within family Mendelian randomization studies. Hum Mol Genet. 2019;28(R2):R170–R179. 10.1093/hmg/ddz204 [DOI] [PubMed] [Google Scholar]
  • 52. Brumpton B, Sanderson E, Heilbron K, et al. : Avoiding dynastic, assortative mating, and population stratification biases in Mendelian randomization through within-family analyses. Nat Commun. 2020;11(1): 3519. 10.1038/s41467-020-17117-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Swerdlow DI, Kuchenbaecker KB, Shah S, et al. : Selecting instruments for Mendelian randomization in the wake of genome-wide association studies. Int J Epidemiol. 2016;45(5):1600–1616. 10.1093/ije/dyw088 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Burgess S, Bowden J, Fall T, et al. : Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. Epidemiology. 2017;28(1):30–42. 10.1097/EDE.0000000000000559 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Schmidt AF, Finan C, Gordillo-Marañón M, et al. : Genetic drug target validation using Mendelian randomisation. Nat Commun. 2020;11(1): 3255. 10.1038/s41467-020-16969-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Brion MJ, Shakhbazov K, Visscher PM: Calculating statistical power in Mendelian randomization studies. Int J Epidemiol. 2013;42(5):1497–1501. 10.1093/ije/dyt179 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Holmes MV, Richardson TG, Ference BA, et al. : Integrating genomics with biomarkers and therapeutic targets to invigorate cardiovascular drug development. Nat Rev Cardiol. 2021;18(6):435–453. 10.1038/s41569-020-00493-1 [DOI] [PubMed] [Google Scholar]
  • 58. Burgess S, Mason A, Grant A, et al. : Using genetic association data to guide drug discovery and development: review of methods and applications. Am J Hum Genet. 2023;110(2):195–214. 10.1016/j.ajhg.2022.12.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Daghlas I, Karhunen V, Ray D, et al. : Genetic Evidence for Repurposing of GLP1R (Glucagon‐Like Peptide‐1 Receptor) Agonists to Prevent Heart Failure. J Am Heart Assoc. 2021;10(13): e020331. 10.1161/JAHA.120.020331 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Gill D, Burgess S: Use of a Genetic Variant Related to Circulating FXa (Activated Factor X) Levels to Proxy the Effect of FXa Inhibition on Cardiovascular Outcomes. Circ Genom Precis Med. 2020;13(5):551–553. 10.1161/CIRCGEN.120.003061 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Foley CN, Mason AM, Kirk PDW, et al. : MR-Clust: clustering of genetic variants in Mendelian randomization with similar causal estimates. Bioinformatics. 2021;37(4):531–541. 10.1093/bioinformatics/btaa778 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Burgess S, Zuber V, Valdes-Marquez E, et al. : Mendelian randomization with fine-mapped genetic data: Choosing from large numbers of correlated instrumental variables. Genet Epidemiol. 2017;41(8):714–725. 10.1002/gepi.22077 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Gkatzionis A, Burgess S, Newcombe PJ: Statistical methods for cis-Mendelian randomization with two-sample summary-level data. Genet Epidemiol. 2023;47(1):3–25. 10.1002/gepi.22506 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Mokry LE, Ross S, Ahmad OS, et al. : Vitamin D and Risk of Multiple Sclerosis: A Mendelian Randomization Study. PLoS Med. 2015;12(8): e1001866. 10.1371/journal.pmed.1001866 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Interleukin-6 Receptor Mendelian Randomisation Analysis (IL6R MR) Consortium, Swerdlow DI, Holmes MV, et al. : The interleukin-6 receptor as a target for prevention of coronary heart disease: a mendelian randomisation analysis. Lancet. 2012;379(9822):1214–1224. 10.1016/S0140-6736(12)60110-X [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Zhao Q, Chen Y, Wang J, et al. : Powerful three-sample genome-wide design and robust statistical inference in summary-data Mendelian randomization. Int J Epidemiol. 2019;48(5):1478–1492. 10.1093/ije/dyz142 [DOI] [PubMed] [Google Scholar]
  • 67. Jiang T, Gill D, Butterworth AS, et al. : An empirical investigation into the impact of winner’s curse on estimates from Mendelian randomization. Int J Epidemiol. 2022; dyac233. 10.1093/ije/dyac233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Davey Smith G, Hemani G: Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23(R1):R89–R98. 10.1093/hmg/ddu328 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Baldwin JR, Pingault JB, Schoeler T, et al. : Protecting against researcher bias in secondary data analysis: challenges and potential solutions. Eur J Epidemiol. 2022;37(1):1–10. 10.1007/s10654-021-00839-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Hartwig FP, Davies NM, Hemani G, et al. : Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int J Epidemiol. 2016;45(6):1717–1726. 10.1093/ije/dyx028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Lawlor DA, Harbord RM, Sterne JA, et al. : Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–1163. 10.1002/sim.3034 [DOI] [PubMed] [Google Scholar]
  • 72. Hughes RA, Davies NM, Davey Smith G, et al. : Selection Bias When Estimating Average Treatment Effects Using One-sample Instrumental Variable Analysis. Epidemiology. 2019;30(3):350–357. 10.1097/EDE.0000000000000972 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Didelez V, Meng S, Sheehan NA: Assumptions of IV Methods for Observational Epidemiology. Stat Sci. 2010;25(1):22–40. 10.1214/09-STS316 [DOI] [Google Scholar]
  • 74. Vansteelandt S, Bekaert M, Claeskens G: On model selection and model misspecification in causal inference. Stat Methods Med Res. 2012;21(1):7–30. 10.1177/0962280210387717 [DOI] [PubMed] [Google Scholar]
  • 75. Palmer TM, Sterne JA, Harbord RM, et al. : Instrumental variable estimation of causal risk ratios and causal odds ratios in Mendelian randomization analyses. Am J Epidemiol. 2011;173(12):1392–1403. 10.1093/aje/kwr026 [DOI] [PubMed] [Google Scholar]
  • 76. Inoue A, Solon G: Two-Sample Instrumental Variables Estimators. Rev Econ Stat. 2010;92(3):557–561. 10.1162/REST_a_00011 [DOI] [Google Scholar]
  • 77. Thompson JR, Minelli C, Del Greco MF: Mendelian Randomization using Public Data from Genetic Consortia. Int J Biostat. 2016;12(2). 10.1515/ijb-2015-0074 [DOI] [PubMed] [Google Scholar]
  • 78. Walter S, Kubzansky LD, Koenen KC, et al. : Revisiting Mendelian randomization studies of the effect of body mass index on depression. Am J Med Genet B Neuropsychiatr Genet. 2015;168B(2):108–115. 10.1002/ajmg.b.32286 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Del Greco MF, Minelli C, Sheehan NA, et al. : Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med. 2015;34(21):2926–2940. 10.1002/sim.6522 [DOI] [PubMed] [Google Scholar]
  • 80. Burgess S, Thompson SG: Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol. 2015;181(4):251–260. 10.1093/aje/kwu283 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Sanderson E, Davey Smith G, Windmeijer F, et al. : An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol. 2019;48(3):713–727. 10.1093/ije/dyy262 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82. Burgess S, Freitag DF, Khan H, et al. : Using multivariable Mendelian randomization to disentangle the causal effects of lipid fractions. PLoS One. 2014;9(10): e108891. 10.1371/journal.pone.0108891 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Larsson SC, Bäck M, Rees JMB, et al. : Body mass index and body composition in relation to 14 cardiovascular conditions in UK Biobank: a Mendelian randomization study. Eur Heart J. 2019; ehz388. 10.1093/eurheartj/ehz388 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Burgess S, Thompson DJ, Rees JMB, et al. : Dissecting Causal Pathways Using Mendelian Randomization with Summarized Genetic Data: Application to Age at Menarche and Risk of Breast Cancer. Genetics. 2017;207(2):481–487. 10.1534/genetics.117.300191 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Porcu E, Rüeger S, Lepik K, et al. : Mendelian randomization integrating GWAS and eQTL data reveals genetic determinants of complex and clinical traits. Nat Commun. 2019;10(1): 3300. 10.1038/s41467-019-10936-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. Hemani G, Bowden J, Davey Smith G: Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet. 2018;27(R2):R195–R208. 10.1093/hmg/ddy163 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87. Kang H, Zhang A, Cai TT, et al. : Instrumental Variables Estimation With Some Invalid Instruments and its Application to Mendelian Randomization. J Am Stat Assoc. 2016;111(513):132–144. 10.1080/01621459.2014.994705 [DOI] [Google Scholar]
  • 88. Morrison J, Knoblauch N, Marcus J, et al. : Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet. 2020;52(7):740–747. 10.1038/s41588-020-0631-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89. Rees JM, Wood AM, Burgess S: Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy. Stat Med. 2017;36(29):4705–4718. 10.1002/sim.7492 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Burgess S, Thompson SG: Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32(5):377–389. 10.1007/s10654-017-0255-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Bowden J, Del Greco MF, Minelli C, et al. : Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I² statistic. Int J Epidemiol. 2016;45(6):1961–1974. 10.1093/ije/dyw220 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Burgess S, Zuber V, Gkatzionis A, et al. : Modal-based estimation via heterogeneity-penalized weighting: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid. Int J Epidemiol. 2018;47(4):1242–1254. 10.1093/ije/dyy080 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93. Darrous L, Mounier N, Kutalik Z: Simultaneous estimation of bi-directional causal effects and heritable confounding from GWAS summary statistics. Nat Commun. 2021;12(1): 7274. 10.1038/s41467-021-26970-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94. Qi G, Chatterjee N: A Comprehensive Evaluation of Methods for Mendelian Randomization Using Realistic Simulations and an analysis of 38 biomarkers for risk of type 2 diabetes. Int J Epidemiol. 2021;50(4):1335–1349. 10.1093/ije/dyaa262 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Cheng Q, Zhang X, Chen LS, et al. : Mendelian randomization accounting for complex correlated horizontal pleiotropy while elucidating shared genetic etiology. Nat Commun. 2022;13(1): 6490. 10.1038/s41467-022-34164-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96. Munafò MR, Davey Smith G: Robust research needs many lines of evidence. Nature. 2018;553(7689):399–401. 10.1038/d41586-018-01023-3 [DOI] [PubMed] [Google Scholar]
  • 97. Lawlor DA, Tilling K, Davey Smith G: Triangulation in aetiological epidemiology. Int J Epidemiol. 2016;45(6):1866–1886. 10.1093/ije/dyw314 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98. Bowden J, Del Greco MF, Minelli C, et al. : Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol. 2019;48(3):728–742. 10.1093/ije/dyy258 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Bowden J, Hemani G, Davey Smith G: Invited Commentary: Detecting Individual and Global Horizontal Pleiotropy in Mendelian Randomization-A Job for the Humble Heterogeneity Statistic? Am J Epidemiol. 2018;187(12):2681–2685. 10.1093/aje/kwy185 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100. Corbin LJ, Richmond RC, Wade KH, et al. : BMI as a Modifiable Risk Factor for Type 2 Diabetes: Refining and Understanding Causal Estimates Using Mendelian Randomization. Diabetes. 2016;65(10):3002–3007. 10.2337/db16-0418 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Smith JG, Luk K, Schulz CA, et al. : Association of low-density lipoprotein cholesterol-related genetic variants with aortic valve calcium and incident aortic stenosis. JAMA. 2014;312(17):1764–1771. 10.1001/jama.2014.13959 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Hemani G, Tilling K, Davey Smith G: Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 2017;13(11): e1007081. 10.1371/journal.pgen.1007081 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103. Cho Y, Haycock PC, Sanderson E, et al. : Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat Commun. 2020;11(1): 1010. 10.1038/s41467-020-14452-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104. Palmer TM, Nordestgaard BG, Benn M, et al. : Association of plasma uric acid with ischaemic heart disease and blood pressure: mendelian randomisation analysis of two large cohorts. BMJ. 2013;347: f4262. 10.1136/bmj.f4262 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Lipsitch M, Tchetgen Tchetgen E, Cohen T: Negative Controls: A Tool for Detecting Confounding and Bias in Observational Studies. Epidemiology. 2010;21(3):383–388. 10.1097/EDE.0b013e3181d61eeb [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Richardson TG, Power GM, Davey Smith G: Adiposity may confound the association between vitamin D and disease risk – a lifecourse Mendelian randomization study. eLife. 2022;11: e79798. 10.7554/eLife.79798 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. Sanderson E, Richardson TG, Hemani G, et al. : The use of negative control outcomes in Mendelian randomization to detect potential population stratification. Int J Epidemiol. 2021;50(4):1350–1361. 10.1093/ije/dyaa288 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108. Giambartolomei C, Vukcevic D, Schadt EE, et al. : Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10(5): e1004383. 10.1371/journal.pgen.1004383 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Hormozdiari F, Kostem E, Kang EY, et al. : Identifying causal variants at loci with multiple signals of association. Genetics. 2014;198(2):497–508. 10.1534/genetics.114.167908 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110. Solovieff N, Cotsapas C, Lee PH, et al. : Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet. 2013;14(7):483–495. 10.1038/nrg3461 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111. Benn M, Nordestgaard BG, Frikke-Schmidt R, et al. : Low LDL cholesterol, PCSK9 and HMGCR genetic variation, and risk of Alzheimer's disease and Parkinson's disease: Mendelian randomisation study. BMJ. 2017;357: j1648. 10.1136/bmj.j1648 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112. Zuber V, Grinberg NF, Gill D, et al. : Combining evidence from Mendelian randomization and colocalization: Review and comparison of approaches. Am J Hum Genet. 2022;109(5):767–782. 10.1016/j.ajhg.2022.04.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113. Zheng J, Haberland V, Baird D, et al. : Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat Genet. 2020;52(10):1122–1131. 10.1038/s41588-020-0682-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114. Chen L, Davey Smith G, Harbord RM, et al. : Alcohol intake and blood pressure: a systematic review implementing a Mendelian randomization approach. PLoS Med. 2008;5(3): e52. 10.1371/journal.pmed.0050052 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115. Millwood IY, Walters RG, Mei XW, et al. : Conventional and genetic evidence on alcohol and vascular disease aetiology: a prospective study of 500 000 men and women in China. Lancet. 2019;393(10183):1831–1842. 10.1016/S0140-6736(18)31772-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116. van Kippersluis H, Rietveld CA: Pleiotropy-robust Mendelian randomization. Int J Epidemiol. 2018;47(4):1279–1288. 10.1093/ije/dyx002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117. Spiller W, Slichter D, Bowden J, et al. : Detecting and correcting for bias in Mendelian randomization analyses using Gene-by-Environment interactions. Int J Epidemiol. 2019;48(3):702–712. 10.1093/ije/dyy204 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118. Munafò MR, Tilling K, Taylor AE, et al. : Collider scope: when selection bias can substantially influence observed associations. Int J Epidemiol. 2018;47(1):226–235. 10.1093/ije/dyx206 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119. Cho Y, Shin SY, Won S, et al. : Alcohol intake and cardiovascular risk factors: A Mendelian randomisation study. Sci Rep. 2015;5(1): 18422. 10.1038/srep18422 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120. White J, Sofat R, Hemani G, et al. : Plasma urate concentration and risk of coronary heart disease: a Mendelian randomisation analysis. Lancet Diabetes Endocrinol. 2016;4(4):327–336. 10.1016/S2213-8587(15)00386-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121. Haycock PC, Burgess S, Wade KH, et al. : Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies. Am J Clin Nutr. 2016;103(4):965–978. 10.3945/ajcn.115.118216 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122. Gkatzionis A, Burgess S: Contextualizing selection bias in Mendelian randomization: how bad is it likely to be? Int J Epidemiol. 2019;48(3):691–701. 10.1093/ije/dyy202 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123. Gilbody J, Borges MC, Davey Smith G, et al. : Multivariable MR can mitigate bias in two-sample MR using covariable-adjusted summary associations. medRxiv. 2022; 2022.07.19.22277803. 10.1101/2022.07.19.22277803 [DOI] [Google Scholar]
  • 124. Zhao B, Luo T, Li T, et al. : Genome-wide association analysis of 19,629 individuals identifies variants influencing regional brain volumes and refines their genetic co-architecture with cognitive and mental health traits. Nat Genet. 2019;51(11):1637–1644. 10.1038/s41588-019-0516-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125. Dudbridge F, Allen RJ, Sheehan NA, et al. : Adjustment for index event bias in genome-wide association studies of subsequent events. Nat Commun. 2019;10(1): 1561. 10.1038/s41467-019-09381-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126. Mahmoud O, Dudbridge F, Davey Smith G, et al. : A robust method for collider bias correction in conditional genome-wide association studies. Nat Commun. 2022;13(1): 619. 10.1038/s41467-022-28119-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127. Mitchell RE, Hartley AE, Walker VM, et al. : Strategies to investigate and mitigate collider bias in genetic and Mendelian randomisation studies of disease progression. PLoS Genet. 2023;19(2): e1010596. 10.1371/journal.pgen.1010596 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128. Yaghootkar H, Bancks MP, Jones SE, et al. : Quantifying the extent to which index event biases influence large genetic association studies. Hum Mol Genet. 2017;26(5):1018–1030. 10.1093/hmg/ddw433 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129. Paternoster L, Tilling K, Davey Smith G: Genetic epidemiology and Mendelian randomization for informing disease therapeutics: Conceptual and methodological challenges. PLoS Genet. 2017;13(10): e1006944. 10.1371/journal.pgen.1006944 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130. Schoeler T, Speed D, Porcu E, et al. : Correction for participation bias in the UK Biobank reveals non-negligible impact on genetic associations and downstream analyses. bioRxiv. 2022; 2022.09.28.509845. 10.1101/2022.09.28.509845 [DOI] [Google Scholar]
  • 131. van Alten S, Domingue BW, Galama T, et al. : Reweighting the UK Biobank to reflect its underlying sampling population substantially reduces pervasive selection bias due to volunteering. medRxiv. 2022; 2022.05.16.22275048. 10.1101/2022.05.16.22275048 [DOI] [Google Scholar]
  • 132. Burgess S, Swanson SA, Labrecque JA: Are Mendelian randomization investigations immune from bias due to reverse causation? Eur J Epidemiol. 2021;36(3):253–257. 10.1007/s10654-021-00726-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133. Hall P, Horowitz JL: Nonparametric methods for inference in the presence of instrumental variables. Ann Stat. 2005;33(6):2904–2929. 10.1214/009053605000000714 [DOI] [Google Scholar]
  • 134. Sulc J, Sjaarda J, Kutalik Z: Polynomial Mendelian randomization reveals non-linear causal effects for obesity-related traits. HGG Adv. 2022;3(3): 100124. 10.1016/j.xhgg.2022.100124 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135. Staley JR, Burgess S: Semiparametric methods for estimation of a nonlinear exposure-outcome relationship using instrumental variables with application to Mendelian randomization. Genet Epidemiol. 2017;41(4):341–352. 10.1002/gepi.22041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136. Horowitz JL: Applied Nonparametric Instrumental Variables Estimation. Econometrica. 2011;79(2):347–394. 10.3982/ECTA8662 [DOI] [Google Scholar]
  • 137. Burgess S: Violation of the constant genetic effect assumption can result in biased estimates for non-linear Mendelian randomization. Human Heredity. 2023; (in press). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138. Davey Smith G: Mendelian randomisation and vitamin D: the importance of model assumptions. Lancet Diabetes Endocrinol. 2023;11(1):14. 10.1016/S2213-8587(22)00345-X [DOI] [PubMed] [Google Scholar]
  • 139. Tian H, Mason AM, Liu C, et al. : Relaxing parametric assumptions for non-linear Mendelian randomization using a doubly-ranked stratification method. PLoS Genet. 2023;19(6): e1010823. 10.1371/journal.pgen.1010823 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 140. Rees JMB, Foley CN, Burgess S: Factorial Mendelian randomization: using genetic variants to assess interactions. Int J Epidemiol. 2020;49(4):1147–1158. 10.1093/ije/dyz161 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141. Ference BA, Robinson JG, Brook RD, et al. : Variation in PCSK9 and HMGCR and Risk of Cardiovascular Disease and Diabetes. N Engl J Med. 2016;375(22):2144–2153. 10.1056/NEJMoa1604304 [DOI] [PubMed] [Google Scholar]
  • 142. Sanderson E, Richardson TG, Morris TT, et al. : Estimation of causal effects of a time-varying exposure at multiple time points through multivariable mendelian randomization. PLoS Genet. 2022;18(7): e1010290. 10.1371/journal.pgen.1010290 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143. Richardson TG, Sanderson E, Elsworth B, et al. : Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study. BMJ. 2020;369: m1203. 10.1136/bmj.m1203 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144. Tian H, Burgess S: Estimation of time-varying causal effects with multivariable Mendelian randomization: some cautionary notes. Int J Epidemiol. 2023;52(3):846–857. 10.1093/ije/dyac240 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145. Carter AR, Sanderson E, Hammerton G, et al. : Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur J Epidemiol. 2021;36(5):465–478. 10.1007/s10654-021-00757-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146. Relton CL, Davey Smith G: Two-step epigenetic Mendelian randomization: a strategy for establishing the causal role of epigenetic processes in pathways to disease. Int J Epidemiol. 2012;41(1):161–176. 10.1093/ije/dyr233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147. Richmond RC, Hemani G, Tilling K, et al. : Challenges and novel approaches for investigating molecular mediation. Hum Mol Genet. 2016;25(R2):R149–R156. 10.1093/hmg/ddw197 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148. Carter AR, Gill D, Davies NM, et al. : Understanding the consequences of education inequality on cardiovascular disease: mendelian randomisation study. BMJ. 2019;365: l1855. 10.1136/bmj.l1855 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149. Burgess S, Daniel RM, Butterworth AS, et al. : Network Mendelian randomization: using genetic variants as instrumental variables to investigate mediation in causal pathways. Int J Epidemiol. 2015;44(2):484–495. 10.1093/ije/dyu176 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 150. Timpson NJ, Nordestgaard BG, Harbord RM, et al. : C-reactive protein levels and body mass index: elucidating direction of causation through reciprocal Mendelian randomization. Int J Obes (Lond). 2011;35(2):300–308. 10.1038/ijo.2010.137 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151. Mountjoy E, Davies NM, Plotnikov D, et al. : Education and myopia: assessing the direction of causality by mendelian randomisation. BMJ. 2018;361: k2022. 10.1136/bmj.k2022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152. Morgan IG, French AN, Rose KA: Intense schooling linked to myopia. BMJ. 2018;361: k2248. 10.1136/bmj.k2248 [DOI] [PubMed] [Google Scholar]
  • 153. Carreras-Torres R, Johansson M, Haycock PC, et al. : Role of obesity in smoking behaviour: Mendelian randomisation study in UK Biobank. BMJ. 2018;361: k1767. 10.1136/bmj.k1767 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 154. Taylor AE, Richmond RC, Palviainen T, et al. : The effect of body mass index on smoking behaviour and nicotine metabolism: a Mendelian randomization study. Hum Mol Genet. 2019;28(8):1322–1330. 10.1093/hmg/ddy434 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155. Burgess S, Chirinos JA, Damrauer SM, et al. : Genetically Predicted Pulse Pressure and Risk of Abdominal Aortic Aneurysm: A Mendelian Randomization Analysis. Circ Genom Precis Med. 2022;15(3): e003575. 10.1161/CIRCGEN.121.003575 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156. Bowden J, Spiller W, Del Greco F, et al. : Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the Radial plot and Radial regression. Int J Epidemiol. 2018;47(4):1264–1278. 10.1093/ije/dyy101 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157. Burgess S, Thompson SG: Bias in causal estimates from Mendelian randomization studies with weak instruments. Stat Med. 2011;30(11):1312–1323. 10.1002/sim.4197 [DOI] [PubMed] [Google Scholar]
  • 158. Sanderson E, Spiller W, Bowden J: Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Stat Med. 2021;40(25):5434–5452. 10.1002/sim.9133 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159. Cole SR, Frangakis CE: Commentary: The Consistency Statement in Causal Inference: A Definition or an Assumption? Epidemiology. 2009;20(1):3–5. 10.1097/EDE.0b013e31818ef366 [DOI] [PubMed] [Google Scholar]
  • 160. Davey Smith G: Epigenesis for epidemiologists: does evo-devo have implications for population health research and practice? Int J Epidemiol. 2012;41(1):236–247. 10.1093/ije/dys016 [DOI] [PubMed] [Google Scholar]
  • 161. Swanson SA, Tiemeier H, Ikram MA, et al. : Nature as a Trialist?: Deconstructing the Analogy Between Mendelian Randomization and Randomized Trials. Epidemiology. 2017;28(5):653–659. 10.1097/EDE.0000000000000699 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162. Morris TT, Heron J, Sanderson ECM, et al. : Interpretation of Mendelian randomization using a single measure of an exposure that varies over time. Int J Epidemiol. 2022;51(6):1899–1909. 10.1093/ije/dyac136 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 163. Kivimäki M, Lawlor DA, Davey Smith G, et al. : Variants in the CRP Gene as a Measure of Lifelong Differences in Average C-Reactive Protein Levels: The Cardiovascular Risk in Young Finns Study, 1980–2001. Am J Epidemiol. 2007;166(7):760–764. 10.1093/aje/kwm151 [DOI] [PubMed] [Google Scholar]
  • 164. Holmes MV, Ala-Korpela M, Davey Smith G: Mendelian randomization in cardiometabolic disease: challenges in evaluating causality. Nat Rev Cardiol. 2017;14(10):577–590. 10.1038/nrcardio.2017.78 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165. Burgess S, Butterworth A, Malarstig A, et al. : Use of Mendelian randomisation to assess potential benefit of clinical intervention. BMJ. 2012;345: e7325. 10.1136/bmj.e7325 [DOI] [PubMed] [Google Scholar]
  • 166. Ference BA: How to use Mendelian randomization to anticipate the results of randomized trials. Eur Heart J. 2018;39(5):360–362. 10.1093/eurheartj/ehx462 [DOI] [PubMed] [Google Scholar]
  • 167. Burgess S, Davey Smith G: How humans can contribute to Mendelian randomization analyses. Int J Epidemiol. 2019;48(3):661–664. 10.1093/ije/dyz152 [DOI] [PMC free article] [PubMed] [Google Scholar]
Wellcome Open Res. 2020 Jul 24. doi: 10.21956/wellcomeopenres.17423.r38569

Reviewer response for version 2

Sonja A Swanson 1

I appreciate the authors' willingness to engage with my suggestions in the prior review. I look forward to seeing this living document continue to improve as the methods, settings, and therefore guidelines relevant to Mendelian randomization investigators continue to evolve.

Is the rationale for developing the new method (or application) clearly explained?

Yes

Is the description of the method technically sound?

Partly

Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Partly

If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

No source data required

Are sufficient details provided to allow replication of the method development and its use by others?

Partly

Reviewer Expertise:

Epidemiology; causal inference.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Wellcome Open Res. 2020 Mar 17. doi: 10.21956/wellcomeopenres.17034.r37985

Reviewer response for version 1

Sonja A Swanson 1

These guidelines are a welcome addition to the literature and reflect the experience and common practices of some thoughtful and influential researchers in the MR community. I am especially encouraged to see that this is a living document that the researchers expect to update periodically. Generally, the guidelines reflect consensus on analytic practices with some appropriate caveats provided for the targeted audience. In this review, I highlight areas that are not discussed in great detail that I think warrant attention in the development of further guidelines.

1. Well-defined causal effect estimands and causal null hypotheses: Part 1 carefully separates the goal of testing for a non-null effect from that of estimating a causal effect. Its consideration of testing, however, does not make clear what types of null hypotheses can really be assessed (see Swanson et al. 2018) 1, and the description of effect estimation only suggests qualitatively weighing whether the genetic variants act like the intervention of interest. More guidance could be provided on question framing, especially by giving more specifics on the types of causal questions MR is able to answer. See also Hernán & Robins (2006) 2 and Swanson et al. (2017) 3.

2. The importance of time: Related to the framing of the research question is how time enters that framing. MR estimates are often described as “lifetime effects”, but the standard IV methods used in MR were developed for time-fixed exposures. (Note that this type of concern also applies to causal null hypothesis testing, as described in Swanson et al. 2018 1.) At the very least, some qualitative guidance on how to interpret results in light of the time-varying nature of exposures and outcomes over the life-course is needed. See again Swanson et al. (2017) 3 and also Labrecque & Swanson (2019) 4.

3.       Available robust methods, sensitivity analyses, and falsification strategies: These guidelines state that Sections 6 and 7 are not exhaustive considerations of these tools, but it would be helpful to the MR community if guidance on these points was expanded in future versions of these guidelines. It is not very clear why the authors chose to present the tools they did have room for, and not others. It also is not very clear why these sections focus so much on the risk of bias due to pleiotropy when other types of biases can also threaten MR estimates.

Is the rationale for developing the new method (or application) clearly explained?

Yes

Is the description of the method technically sound?

Partly

Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Partly

If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

No source data required

Are sufficient details provided to allow replication of the method development and its use by others?

Partly

Reviewer Expertise:

Epidemiology; causal inference.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

References

  • 1. Swanson et al.: Causal null hypotheses of sustained treatment strategies: What can be tested with an instrumental variable? Eur J Epidemiol. 2018;33(8):723-728. doi: 10.1007/s10654-018-0396-6
  • 2. Hernán & Robins: Instruments for causal inference: an epidemiologist's dream? Epidemiology. 2006;17(4):360-72. doi: 10.1097/01.ede.0000222409.00878.37
  • 3. Swanson et al.: Nature as a Trialist?: Deconstructing the Analogy Between Mendelian Randomization and Randomized Trials. Epidemiology. 2017;28(5):653-659. doi: 10.1097/EDE.0000000000000699
  • 4. Labrecque & Swanson: Interpretation and Potential Biases of Mendelian Randomization Estimates With Time-Varying Exposures. American Journal of Epidemiology. 2019;188(1):231-238. doi: 10.1093/aje/kwy204
Wellcome Open Res. 2020 Apr 20.
Stephen Burgess 1

> We thank the reviewers for their comments, which we have numbered for reference. We provide point-by-point responses to their suggestions below. <

Reviewer 2 (Sonja Swanson)

B0. These guidelines are a welcomed addition to the literature and reflect the experience and common practices performed by some thoughtful and influential researchers in the MR community. I am especially encouraged to see that this is a living document that the researchers expect to update periodically. Generally, the guidelines reflect consensus on analytic practices with some appropriate caveats provided for the targeted audience. In this review, I highlight areas that are not discussed in great detail that I think warrant attention in the development of further guidelines.

> We thank the reviewer for her comments and positive view of these guidelines. <

B1.       Well-defined causal effect estimands and causal null hypotheses: Part 1 carefully separates the goal of testing for a non-null effect and estimating a causal effect. Its consideration of testing, however, does not make clear what types of null hypotheses can really be assessed (see Swanson et al. 2018)1, and the description on effect estimation only suggests qualitatively weighing whether the genetic variants act like the intervention of interest. More guidance could be provided on question framing, especially by giving more specifics on the types of causal questions MR is able to answer. See also Hernán & Robins (2006)2 and Swanson et al. (2017)3.

> We appreciate this point, which we now introduce at the beginning of the manuscript, referencing the Swanson 2018 paper. We have added to the manuscript: “The straightforward statement of the causal hypothesis is that interventions in the exposure variable will affect the outcome. If the genetic associations with the exposure vary with time, then there are some nuances in terms of what causal hypotheses can be tested [Swanson et al, 2018]; we discuss the impact of time-varying relationships between variables in Section 9.” (Section 1).

> We also refer to this debate in Section 9 (Interpretation) when we discuss numerical estimates from Mendelian randomization and how these relate to clinically-meaningful causal parameters. In particular, we have added to the manuscript: “The estimate from a Mendelian randomization investigation is therefore better interpreted as a test statistic for a causal hypothesis rather than the estimate of a well-defined intervention.” (Section 9). <

B2.       The importance of time: Related to the framing of the research question is also how time is a part of this framing. MR estimates are often described as “lifetime effects” but the standard IV methods used in MR are developed for time-fixed exposures. (Note that this type of concern also applies to causal null hypothesis testing, as described in Swanson et al. 2018)1. At the very least, some qualitative guidance on how to interpret results in light of the time-varying nature of exposures and outcomes over the life-course is needed. See again Swanson et al. (2017)3 and also Labrecque & Swanson (2019)4.

> As per the response to point B1, we now reference the discussion about time-varying relationships between variables in Section 1 in relation to tests of the causal null hypothesis.

> We have edited the discussion in Section 9 to include issues relating to time-varying relationships between variables, referencing the papers cited above: “Mendelian randomization estimates relate specifically to changes in the exposure induced by the genetic variants used as instrumental variables. Genetic variants are present from before birth, and so Mendelian randomization investigations typically compare groups of the population having different trajectories in their distribution of the exposure over time. Analyses therefore typically can be interpreted as assessing the impact of long-term elevated levels of an exposure. However, in most cases, we have incomplete information about how the genetic variant changes the distribution of the exposure across the life course. If the genetic associations with the exposure vary over time, then Mendelian randomization estimates based on genetic associations with the exposure measured at a single timepoint can be unreliable. Similar difficulties of interpretation arise if the impact on the outcome relates to levels of the exposure at a specific time period in life. A plausible example of this is the effect of vitamin D on multiple sclerosis; multiple sclerosis risk is hypothesized to be influenced by vitamin D levels during early childhood, but not vitamin D levels in adulthood.” (Section 9). <

B3.      Available robust methods, sensitivity analyses, and falsification strategies: These guidelines state that Sections 6 and 7 are not exhaustive considerations of these tools, but it would be helpful to the MR community if guidance on these points was expanded in future versions of these guidelines. It is not very clear why the authors chose to present the tools they did have room for, and not others. It also is not very clear why these sections focus so much on the risk of bias due to pleiotropy when other types of biases can also threaten MR estimates.

> As we have stated, we will revise these guidelines over time. However, balance is needed in this manuscript between comprehensiveness and comprehensibility. It is perhaps better for a comprehensive discussion of methods to be provided separately. Also, while we aim to update these guidelines regularly, we are unable to update advice with the regularity needed to offer up-to-the-minute recommendations on methods as they are developed and updated. We therefore focus on methods whose performance is generally understood. We have now added reference to a more extensive review of methods: “This table is based on a broader review and comparison of methods [Slob, 2020].” (Section 6).

> We have now more clearly stated the motivation behind our choice of presentation: “We focus on these methods here as they can be implemented using summarized data alone, and they rely on different assumptions to provide consistent causal estimates.” (Section 6).

> In terms of focusing on pleiotropy, we were unclear in the initial submission. As described in Kang et al, JASA 2016 “Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization”, there is a statistical correspondence between pleiotropy and instrument validity, meaning that any instrument invalidity can be expressed algebraically in terms of pleiotropy. Hence, while we use the language of pleiotropy to provide mathematically precise statements of the assumptions needed for consistent estimates, this section does not only cover pleiotropy, but invalid instruments more generally.

> We have changed the section headings for Sections 6 and 7 to: “Robust methods for sensitivity analysis” and “Other approaches for sensitivity analysis” to make clear that these sections are not narrowly focused on pleiotropy, but cover other sources of instrument invalidity. We have also added the sentence: “Although robust methods typically use the term ‘pleiotropy’, there is a mathematical correspondence between instrument invalidity and pleiotropy [Kang et al, 2016], and so these methods can help assess sensitivity of findings to instrument invalidity more generally, and not simply invalidity that arises from horizontal pleiotropy. We here use the language of pleiotropy to make mathematically precise statements about the assumptions needed for methods to provide consistent estimates.” (Section 6).

> As per the response to point A1, we have added new paragraphs in Section 7 on alternative sources of bias. We hope this addresses the reviewer’s concerns. <

Wellcome Open Res. 2019 Dec 24. doi: 10.21956/wellcomeopenres.17034.r37168

Reviewer response for version 1

Jean Morrison 1

This article is a practical review of Mendelian randomization (MR) and a set of guidelines intended for practitioners. The writing is clear, well organized, and fairly thorough. I think this is an invaluable resource to investigators who wish to carry out an MR analysis and provides a good survey of relevant literature. My comments are mostly minor; I feel this article is a much-needed contribution to the field.

Major comments

  • Reverse direction effects: To me it seemed that one major omission was a discussion of reverse direction effects. These can easily cause false positives for methods like IVW and are especially an issue if one is using the "agnostic" variable selection method. I think it could be good to add a graph or two graphs to Figure 4 displaying a reverse effect (of outcome on exposure) and possibly also a feedback loop. This should be accompanied by a discussion of when reverse effects are something the investigator should think about and when they aren't. The sensitivity testing section should include a discussion of testing in the opposite direction if agnostic variable selection is used and if the pair of traits warrants that consideration. It would also be good to include a discussion of how to interpret a positive result in both directions.

  • Collider bias when using summary statistics: When discussing individual level data approaches the authors give a set of recommended covariates and note that collider bias is a concern. A brief explanation of what a collider is and how it causes bias should be added. A parallel discussion should be added concerning summary statistic based analyses. In particular, it is important that investigators know which covariates were adjusted to compute the summary statistics and how to identify a potential collider. 

Minor comments:

  • Slightly more attention should be given to analyses described early on as "exploratory", in which the investigator scans through many potential causal effects, and how these should be treated differently. Relatedly, more attention could be given to analyses that use the agnostic variable selection method. These issues are linked because a phenome-wide MR analysis is likely to use the agnostic approach. In my view, for studies like these, a robust method (or multiple robust methods) should always be used; the investigator should assume that some variants are pleiotropic.

  • There is more danger in using a one sample approach with an agnostic variant set due to weak instrument bias, which should be mentioned. Methods exist that estimate and correct for this bias by using correlation among test statistics for variants that aren't associated with either trait. CAUSE (Morrison et al. bioRxiv 2019) 1 is one but there must be other approaches to this issue as well. 

  • Both Egger regression and the modal estimator have much lower power than other methods. This is worth mentioning when discussing interpreting results from sensitivity analyses.

  • It is worth mentioning in figure 4 that the "related variable" may not always be known. 

  • In the paragraph mentioning the "three sample" approach, it would be interesting to include any results about how much bias in the causal effect might be created by selection bias if one simply selects significant variants from the exposure GWAS.

Is the rationale for developing the new method (or application) clearly explained?

Yes

Is the description of the method technically sound?

Yes

Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Yes

If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

No source data required

Are sufficient details provided to allow replication of the method development and its use by others?

Yes

Reviewer Expertise:

statistical genetics, Mendelian randomization

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

References

  • 1. Morrison et al.: Mendelian randomization accounting for horizontal and correlated pleiotropic effects using genome-wide summary statistics [preprint]. bioRxiv. 2019. https://doi.org/10.1101/682237
Wellcome Open Res. 2019 Dec 24.
Stephen Burgess 1

Jean,

Thanks for the rapid and thoughtful response. We will wait for other reviewers to comment, and then update the article to address your points in due course.

Best wishes, Stephen

Wellcome Open Res. 2020 Apr 20.
Stephen Burgess 1

> We thank the reviewers for their comments, which we have numbered for reference. We provide point-by-point responses to their suggestions below. <

Reviewer 1 (Jean Morrison):

A0. This article is a practical review of Mendelian randomization (MR) and a set of guidelines intended for practitioners. The writing is clear, well organized, and fairly thorough. I think this is an invaluable resource to investigators who wish to carry out an MR analysis and provides a good survey of relevant literature. My comments are mostly minor; I feel this article is a much-needed contribution to the field.

> We thank the reviewer for her comments and positive view of these guidelines. <

Major comments

A1. Reverse direction effects: To me it seemed that one major omission was a discussion of reverse direction effects. These can easily cause false positives for methods like IVW and are especially an issue if one is using the "agnostic" variable selection method. I think it could be good to add a graph or two graphs to Figure 4 displaying a reverse effect (of outcome on exposure) and possibly also a feedback loop. This should be accompanied by a discussion of when reverse effects are something the investigator should think about and when they aren't. The sensitivity testing section should include a discussion of testing in the opposite direction if agnostic variable selection is used and if the pair of traits warrants that consideration. It would also be good to include a discussion of how to interpret a positive result in both directions.

> We have now added new paragraphs to Section 7 covering biases in estimation arising due to issues other than invalid instruments. These cover reverse causation (point A1) and collider bias (point A2).

> Relating to reverse causation, we have added to the manuscript: “While the genetic code is fixed at conception and so cannot be influenced by reverse causation, if the outcome influences the risk factor, this can result in gene–outcome associations becoming distorted and lead to misleading inferences. As shown in Figure 4e, if genetic variants that are supposed to be instrumental variables for the exposure in fact influence the outcome primarily, then genetic associations with the outcome could be present without the exposure influencing the outcome. The MR-Steiger method has been developed to detect such variants and remove them from the analysis. Bidirectional Mendelian randomization analyses have been proposed that use separate sets of instrumental variables for the exposure and outcome to assess the direction of causal effect.” (Section 7).

> We have added a reverse causation scenario and a feedback loop scenario to Figure 4 as requested: “Another possible scenario that would lead to instrument invalidity is if genetic variants influence the outcome primarily rather than the exposure (Figure 4e, see also discussion on reverse causation in Section 7).” (Section 3). <

A2. Collider bias when using summary statistics: When discussing individual level data approaches the authors give a set of recommended covariates and note that collider bias is a concern. A brief explanation of what a collider is and how it causes bias should be added. A parallel discussion should be added concerning summary statistic based analyses. In particular, it is important that investigators know which covariates were adjusted to compute the summary statistics and how to identify a potential collider.

> As per the response to point A1, we have expanded the discussion about collider bias and selection bias in new paragraphs in Section 7 on bias due to issues other than invalid instruments.

> We have added to the manuscript: “A collider is a common effect of two variables – for example, any variable causally downstream of the exposure is influenced by the genetic variants and the exposure–outcome confounders, and so is a collider. Even if two variables are unrelated (they are marginally independent), they will typically be related when conditioning on the collider (conditionally dependent). Stratifying on or adjusting for a collider therefore leads to an association between variables that influence the collider. An association between the genetic variants and the exposure–outcome confounders would lead to biased causal estimates. Collider bias is not unique to Mendelian randomization, but it is particularly relevant as some published genetic association estimates have been adjusted for potential colliders. Methods to account for collider bias have recently been proposed.

> “Selection bias is a specific example of collider bias which occurs when selection into a study sample depends on a collider. Simulation studies have shown that selection bias can have a severe impact on Mendelian randomization estimates, but only when the associations of variables with the collider are quite strong. Selection bias can potentially be addressed using inverse-probability weighting, although this requires estimation of the probability of selection into the study sample for all individuals.” (Section 7).

> We also discuss the problem raised by the reviewer about pre-computed summarized data: “If published summarized association estimates have already been adjusted for a variable causally downstream of the exposure, collider bias (see Section 7) may be unavoidable.” (Section 2). <
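
The collider mechanism quoted in this response can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the guidelines: the function names, the unit effect sizes, and the selection threshold of 1.0 are assumptions chosen purely for illustration. A genetic score and a confounder are generated independently, a collider is formed as their sum plus noise, and the correlation between score and confounder is compared in the full sample and in the subsample selected on the collider.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_collider(n=20000, threshold=1.0, seed=1):
    """Return (marginal, conditional) correlation of a genetic score and a confounder.

    All quantities are hypothetical: g is a standardized genetic score,
    u is a confounder generated independently of g, and c is a collider
    influenced by both.
    """
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    c = [gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]

    # Correlation in the full sample: g and u are independent by construction.
    marginal = pearson(g, u)

    # Selecting on the collider (c above the threshold) mimics selection bias.
    selected = [(gi, ui) for gi, ui, ci in zip(g, u, c) if ci > threshold]
    g_sel = [pair[0] for pair in selected]
    u_sel = [pair[1] for pair in selected]
    conditional = pearson(g_sel, u_sel)
    return marginal, conditional


if __name__ == "__main__":
    marginal, conditional = simulate_collider()
    print(f"full sample: {marginal:.3f}, selected on collider: {conditional:.3f}")
```

Under these assumptions the correlation in the full sample is close to zero, whereas selection on the collider induces a clearly negative correlation between the genetic score and the confounder, which is the kind of induced association that would bias a causal estimate.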

Minor comments:

A3. Slightly more attention should be given to analyses described early on as "exploratory", in which the investigator scans through many potential causal effects, and how these should be treated differently. Relatedly, more attention could be given to analyses that use the agnostic variable selection method. These issues are linked because a phenome-wide MR analysis is likely to use the agnostic approach. In my view, for studies like these, a robust method (or multiple robust methods) should always be used; the investigator should assume that some variants are pleiotropic.

> In reference to phenome scan analyses, we have added: “Such analyses are generally regarded as exploratory or “hypothesis-generating”, and results are typically treated as provisional until replicated in an independent dataset.” (Section 1). In relation to analyses performed using an agnostic set of genetic variants, we have added: “If genetic variants are chosen in a way that is completely agnostic to the function of the variants, then researchers should be especially careful about the possibility of variants being pleiotropic.” (Section 3). <

A4. There is more danger in using a one sample approach with an agnostic variant set due to weak instrument bias, which should be mentioned. Methods exist that estimate and correct for this bias by using correlation among test statistics for variants that aren't associated with either trait. CAUSE (Morrison et al. bioRxiv 2019)1 is one but there must be other approaches to this issue as well.

> We have added discussion on winner’s curse and weak instrument bias when genetic variants are selected in the dataset under analysis (see point A7).

> We have added reference to the CAUSE method and another similar paper: “A further class of robust methods uses latent modelling to distinguish to what extent genetic associations with the outcome arise due to a causal effect of the exposure, as opposed to via direct (pleiotropic) or confounder-driven effects of particular variants. A causal model is evidenced if the predominance of variants that associate with the exposure also associate with the outcome in a proportional way. If the genetic associations with the outcome do not follow this pattern, then a non-causal explanation would be preferred. Emerging methods that take this approach include the Causal Analyses Using Summary Effect Estimates (CAUSE) [Morrison et al] and Latent Heritable Confounder MR (LHC-MR) [Darrous et al] methods.” (Section 6).

> We have also added reference to recently developed methods that are based on multivariable Mendelian randomization: “Methods have been proposed based on a multivariable approach in the context of gene expression data, including the MR-link [van der Graaf et al] and transcriptome-wide summary statistics-based Mendelian Randomization (TWMR) [Porcu et al] methods.” (Section 7). <

A5. Both Egger regression and the modal estimator have much lower power than other methods. This is worth mentioning when discussing interpreting results from sensitivity analyses.

> We have now mentioned this with respect to the MR-Egger method: “Estimates from the MR-Egger method are particularly affected by outlying and influential datapoints, and are prone to be imprecise, particularly when the variant–exposure associations are all similar in magnitude. This can lead to the method having low power to detect a causal effect.” (Section 6), and also with respect to the mode-based method: “The mode-based method has been shown to have low precision in some simulated and real datasets.” (Section 6). <
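
The loss of precision when variant–exposure associations are similar in magnitude can be seen from the form of the two estimators. The sketch below is an illustration under simplifying assumptions, not the authors' implementation: the function names `ivw` and `mr_egger` are hypothetical, standard errors are the plain fixed-effect ones from weighted regression (with no adjustment for overdispersion), variants are assumed to be oriented so that exposure associations are positive, and the example numbers are invented. IVW is a weighted regression of outcome associations on exposure associations through the origin; MR-Egger adds an intercept, so its slope is identified only from the spread of the exposure associations around their weighted mean.

```python
import math


def ivw(bx, by, se_y):
    """Inverse-variance weighted estimate and fixed-effect standard error."""
    w = [1 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den, math.sqrt(1 / den)


def mr_egger(bx, by, se_y):
    """MR-Egger slope, its fixed-effect standard error, and the intercept."""
    w = [1 / s ** 2 for s in se_y]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, bx)) / sw
    ybar = sum(wi * y for wi, y in zip(w, by)) / sw
    # The slope is driven only by variation of bx around its weighted mean.
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, math.sqrt(1 / sxx), intercept


if __name__ == "__main__":
    # Invented summarized data: five variants with similar exposure associations.
    bx = [0.10, 0.11, 0.12, 0.09, 0.10]
    by = [0.5 * x for x in bx]  # outcome associations proportional to bx
    se_y = [0.01] * 5
    est, se_ivw = ivw(bx, by, se_y)
    slope, se_egger, intercept = mr_egger(bx, by, se_y)
    print(f"IVW: {est:.3f} (SE {se_ivw:.3f})")
    print(f"MR-Egger: {slope:.3f} (SE {se_egger:.3f}), intercept {intercept:.4f}")
```

With these invented inputs both methods recover the same slope, but the MR-Egger standard error is roughly ten times the IVW standard error, because the exposure associations barely vary. Applied analyses would normally rely on established software rather than a hand-rolled version like this.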

A6. It is worth mentioning in figure 4 that the "related variable" may not always be known.

> We now mention this in the figure caption: “We note that the related variable may be known or unknown.” <

A7. In the paragraph mentioning the "three sample" approach, it would be interesting to include any results about how much bias in the causal effect might be created by selection bias if one simply selects significant variants from the exposure GWAS.

> Bias due to selecting variants based on their statistical significance in the dataset under analysis comes under the broader category of winner’s curse. Selecting variants based on their association with the exposure in the dataset under analysis can lead to exacerbation of weak instrument bias. We have expanded the discussion on winner’s curse in Section 3: “Often, selection is based on the dataset in which genetic associations with the exposure are estimated. However, this leads to “winner’s curse” – genetic associations tend to be overestimated in the dataset in which they were first discovered. If genetic variants are selected based on their associations with the exposure in the dataset under analysis, weak instrument bias is exacerbated (in the direction of the observational association in a one-sample setting, and in the direction of the null in a two-sample setting).” (Section 3). <
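
The overestimation described here as winner's curse can be reproduced in a few lines. This is a minimal sketch with invented values, not a result from the guidelines: the true association of 0.02, the standard error of 0.01, the selection threshold of |z| > 3, and the function name `winners_curse` are all assumptions for illustration. Many variants with the same true exposure association are estimated once in a discovery dataset and once in an independent dataset; variants are selected on the discovery estimate only.

```python
import random
import statistics


def winners_curse(true_beta=0.02, se=0.01, z_threshold=3.0,
                  n_variants=50000, seed=7):
    """Mean estimate of selected variants in discovery and independent data.

    All parameter values are hypothetical. Every variant has the same true
    association with the exposure; estimates differ only by sampling error.
    """
    rng = random.Random(seed)
    discovery = [rng.gauss(true_beta, se) for _ in range(n_variants)]
    independent = [rng.gauss(true_beta, se) for _ in range(n_variants)]

    # Select variants on their significance in the discovery dataset only.
    keep = [i for i, b in enumerate(discovery) if abs(b / se) > z_threshold]

    mean_discovery = statistics.fmean([discovery[i] for i in keep])
    mean_independent = statistics.fmean([independent[i] for i in keep])
    return mean_discovery, mean_independent


if __name__ == "__main__":
    mean_discovery, mean_independent = winners_curse()
    print(f"selected variants, discovery data: {mean_discovery:.4f}")
    print(f"selected variants, independent data: {mean_independent:.4f}")
```

In this setup the selected variants' discovery estimates average well above the true value of 0.02, while their estimates in the independent dataset remain centred on it. This is consistent with the motivation for estimating exposure associations in data separate from those used for variant selection, although the size of the distortion in any real analysis depends on instrument strength and the threshold used.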

Associated Data

    Data Availability Statement

    No data are associated with this article.

