Summary
Evidence on the validity of drug targets from randomized trials is reliable but typically expensive and slow to obtain. In contrast, evidence from conventional observational epidemiological studies is less reliable because of the potential for bias from confounding and reverse causation. Mendelian randomization is a quasi-experimental approach analogous to a randomized trial that exploits naturally occurring randomization in the transmission of genetic variants. In Mendelian randomization, genetic variants that can be regarded as proxies for an intervention on the proposed drug target are leveraged as instrumental variables to investigate potential effects on biomarkers and disease outcomes in large-scale observational datasets. This approach can be implemented rapidly for a range of drug targets to provide evidence on their effects and thus inform on their priority for further investigation. In this review, we present statistical methods and their applications to showcase the diverse opportunities for applying Mendelian randomization in guiding clinical development efforts, thus enabling interventions to target the right mechanism in the right population group at the right time. These methods can inform investigators on the mechanisms underlying drug effects, their related biomarkers, implications for the timing of interventions, and the population subgroups that stand to gain the most benefit. Most methods can be implemented with publicly available data on summarized genetic associations with traits and diseases, meaning that the only major limitations to their usage are the availability of appropriately powered studies for the exposure and outcome and the existence of a suitable genetic proxy for the proposed intervention.
Keywords: genetic epidemiology, Mendelian randomization, causal inference, instrumental variables, target validation
Mendelian randomization exploits naturally occurring randomization in the transmission of genetic variants to mimic randomized trials in observational data. We review statistical methods and applications to showcase opportunities for applying Mendelian randomization in guiding pharmaceutical development efforts to target the right mechanism in the right population at the right time.
Introduction
The availability of data on genetic associations from large-scale epidemiological studies offers the opportunity to improve the efficacy and efficiency of drug development. Genetic variants that affect the function of a gene may be used to provide insight into the efficacy, adverse effects, and repurposing potential of interventions that perturb the corresponding protein in specific population groups.1 Mendel’s second law (“the independent segregation of alleles at conception”) means that these genetic variants should not be systematically associated with confounding variables, creating a natural experiment analogous to a randomized trial2 (Figure 1). The use of genetic variants as instrumental variables (Figure 2) to assess causal relationships from observational data is known as Mendelian randomization.3,4
Figure 1.
Schematic diagram illustrating analogy between Mendelian randomization and randomized trial
In a randomized trial, the population is split into control and treatment groups at random by the investigator; in Mendelian randomization, the population is divided into groups based on a genetic variant (a “natural experiment”).
Figure 2.
Schematic diagram illustrating the instrumental variable assumptions
For a genetic variant to be a valid instrumental variable, it must (1) be associated with the exposure (that is, population subgroups defined by the genetic variant have different average levels of the exposure); (2) not be associated with the outcome by a confounding pathway (that is, population subgroups defined by the genetic variant have similar average levels of competing risk factors); and (3) not have a direct effect on the outcome (that is, any influence of the genetic variant on the outcome operates via the exposure). If the genetic variant satisfies these assumptions, then any association between the variant and the outcome must be due to a causal effect of the exposure.
The relevance of Mendelian randomization to drug discovery efforts has previously been discussed at length.5,6,7 The majority of drug targets are proteins, which are coded for by genes. Genetic variants can therefore be used as proxies for studying the effect of pharmacologically perturbing these protein drug targets. As a result of the random allocation of genetic variants at conception, variants are typically distributed independently of potential confounders conditional on the parental genotype. This independence has been shown empirically to hold marginally at a population level for many variants.8,9 Therefore, aside from associations arising because of population stratification or linkage disequilibrium, genetic variants should only be associated with traits that they affect and so should be independent of confounding factors. Furthermore, the fixed nature of the genotype means it cannot be influenced by environmental variables,10 thus reducing the possibility of spurious results due to confounding by such factors and ensuring genetic associations are protected from bias due to reverse causation.11 Therefore, compared with conventional observational epidemiological analyses, Mendelian randomization analyses have the potential to provide more reliable insights into causal relationships.12 The human relevance of such analyses can offer considerable advantages over animal models, from which findings may not be directly translatable.13
Systematic reviews indicate that drug targets with human genetic evidence are more than twice as likely to progress into clinical practice.14,15 Furthermore, the availability of large-scale genetic association data and genotyped biobanks offers the potential for time- and cost-efficient investigations.16 Indeed, two-thirds of drugs approved in the US in 2021 have evidential support from human genetics.17 These advantages have led to the widespread adoption of genetic data as a key evidence source in drug discovery, and generation and analyses of genetic data have become key strategic elements for many pharmaceutical companies.
In this review, we present ways in which methodological developments in Mendelian randomization and other related genetic analyses can be used not only to select between promising targets but also to inform all stages of drug development.18 The ultimate judge of therapeutic effectiveness is a clinical trial. However, while a necessary part of drug development, trials can be expensive and time consuming.19 Many Mendelian randomization analyses can be performed rapidly and efficiently with existing data resources. Genetic analyses that address questions of translational relevance can guide trial design by gathering focused evidence on what are the best targets for intervention, when is the optimal time to intervene, and who would most benefit from intervention. This enables trials that have the greatest chance of an informative outcome to be prioritized.
We focus on nine areas where genetic evidence can impact drug development: (1) target validation, (2) clarification of causal pathways, (3) identification of relevant outcomes, (4) elucidation of complex mechanisms, (5) uncovering of tissue specificity, (6) investigation of heterogeneity in response to treatment, (7) ancestry-specific considerations, (8) prediction of the magnitude of effect of an intervention, and (9) understanding the impact of timings of interventions. In each case, we consider methodological developments and provide practical examples, focusing on how these methods have been used in the existing literature. A summary of some relevant examples is given as Table 1.
Table 1.
Some published examples where human genetic data have been leveraged to facilitate and inform drug development in cardiometabolic disease
Target | Gene | Outcome | Conclusions from genetic evaluation | Clinical validation of genetic evidence |
---|---|---|---|---|
Coagulation factor X | F10 | cardiovascular and thrombotic disease | repurposing potential of factor Xa inhibitors in other cardiovascular disease subtypes21 | phase III clinical trial22 |
Coagulation factor XI | F11 | cardioembolic stroke | protective effects of factor XI inhibition in cardioembolic stroke36 | phase II trial evidence in venous thromboembolism163 |
Fibroblast growth factor 21 (FGF21) | FGF21 | non-alcoholic steatohepatitis | favourable effects of circulating FGF21 on cardiometabolic biomarkers164 | phase II clinical trial165 |
Glucagon-like peptide 1 (GLP1) | GLP1R | heart failure | protective effects of GLP1 agonism in heart failure166 | phase III trial underway (NCT01800968) |
Glucose-dependent insulinotropic polypeptide (GIP) | GIP | cardiometabolic disease | favourable effects of GIP agonism on bodyweight, lipid traits, coronary artery disease and inflammation167 | phase III clinical trials168,169 |
Interleukin 6 receptor (IL6R) | IL6R | cardiovascular disease, severe COVID-19, adverse effects, repurposing potential and biomarkers | beneficial effects of IL6R inhibition in severe COVID-19,72 increased risk of infectious, allergic and autoimmune disease,170 identify biomarkers to measure efficacy,170 repurposing potential in atherosclerotic disease171 | phase II and III clinical trial evidence74,172 |
Proprotein convertase subtilisin/kexin type 9 (PCSK9) | PCSK9 | cardiovascular disease | protective effects of PCSK9 inhibition173 | phase III clinical evidence174 |
Niemann-Pick C1-like 1 (NPC1L1) | NPC1L1 | coronary heart disease | protective effects of NPC1L1 inhibition175 | phase III clinical evidence176 |
Cholesteryl ester transfer protein (CETP) | CETP | cardiovascular disease | protective effects of CETP inhibition on cardiovascular disease177 | phase III clinical evidence154 |
Target validation
We initially discuss conventional Mendelian randomization analyses and proceed to consider related and complementary approaches: use of colocalization, genetic predictors of drug response, and rare variants.
Two categories of Mendelian randomization investigations are those using variants from a single genetic region and those using variants from multiple genetic regions (polygenic analyses).20 Mendelian randomization analyses for target selection and drug development typically use genetic variants from a single genetic region, the region around the gene that encodes the protein target for investigation. Such analyses have been called “cis-Mendelian randomization” analyses, as genetic variants in a relevant coding region are known as cis-variants.6 When investigating drug targets, the exposure is taken as a measure of pharmacological perturbation of the relevant drug target. Choice of genetic region is crucial to the validity of the investigation; if variants in the genetic region do not replicate the effect of the drug, then the investigation has limited value.5
As an example, a cis-Mendelian randomization investigation considered coagulation factor X as a risk factor for various cardiovascular diseases by using a genetic variant in the F10 region that had previously been shown to associate with plasma levels of activated factor X (FXa).21 Given the use of FXa inhibitors for preventing deep venous thrombosis and pulmonary embolism,22 genetic associations with these outcomes provided confirmatory evidence of the efficacy of interventions on this pathway. A non-significant association with coronary artery disease (despite a greater number of cases) suggests that prevention of this outcome is a lower priority for pursuit through FXa inhibition as a monotherapy.
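To make a single-variant analysis of this kind concrete, the following minimal Python sketch computes a ratio (Wald) estimate, that is, the variant-outcome association divided by the variant-exposure association, with a delta-method standard error. The function name and all numerical inputs are hypothetical and are not taken from the factor X investigation; the sketch also assumes that the two associations come from non-overlapping samples.

```python
import math


def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Ratio (Wald) estimate from one variant, with a delta-method standard error.

    beta_gx, se_gx: variant-exposure association and its standard error.
    beta_gy, se_gy: variant-outcome association and its standard error.
    Assumes the two associations are estimated in non-overlapping samples.
    """
    if beta_gx == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gy / beta_gx
    # The second term reflects uncertainty in the variant-exposure association.
    variance = se_gy**2 / beta_gx**2 + beta_gy**2 * se_gx**2 / beta_gx**4
    return estimate, math.sqrt(variance)


# Hypothetical summarized associations, for illustration only.
estimate, se = wald_ratio(beta_gx=0.20, se_gx=0.01, beta_gy=0.05, se_gy=0.01)
lower, upper = estimate - 1.96 * se, estimate + 1.96 * se
print(f"estimate = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

If the outcome association is a log odds ratio, the estimate is on the log odds scale per unit change in the exposure.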
Although several of the examples included in this review focus on cardiovascular diseases, Mendelian randomization investigations have been performed for many other outcomes. The bias toward cardiovascular disease in this review (which reflects a bias in the literature23) is primarily because (1) there are several modifiable risk factors for cardiovascular disease that are suitable for consideration as exposures in Mendelian randomization and (2) summarized data from large consortia were released earlier for cardiovascular diseases than for most other diseases.
If there are multiple candidate variants in a genetic region, investigators may consider including multiple variants in their analysis. Variants chosen for inclusion in a cis-Mendelian randomization analysis should be conditionally independent predictors of the putative causal trait, which we refer to as the exposure. That is, each variant included in the analysis should be associated with the exposure in a model adjusting for other included variants. If a Mendelian randomization analysis is performed with summarized data, any correlation between variants should be accounted for in the analysis.24 If two variants are in perfect linkage disequilibrium, estimates will be no more precise when both variants are used compared to when one variant is used. Variants in partial linkage disequilibrium can increase the power of an analysis if they explain independent variation in the exposure.25
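As a minimal sketch of how correlation between variants might be accounted for with summarized data, the Python code below computes a fixed-effect inverse-variance weighted estimate by generalized least squares, in which the covariance of the variant-outcome associations is built from their standard errors and the variant correlation matrix. The function name and the numbers are hypothetical. The example illustrates the point made above: two variants in near-perfect linkage disequilibrium give almost no gain in precision over one variant.

```python
import numpy as np


def ivw_correlated(beta_x, beta_y, se_y, rho):
    """Fixed-effect IVW estimate allowing for correlated variants.

    beta_x, beta_y: variant associations with the exposure and the outcome.
    se_y: standard errors of the variant-outcome associations.
    rho: variant correlation (linkage disequilibrium) matrix.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    # Covariance of the variant-outcome associations implied by the correlations.
    omega = np.outer(se_y, se_y) * np.asarray(rho, dtype=float)
    omega_inv_bx = np.linalg.solve(omega, beta_x)
    precision = beta_x @ omega_inv_bx
    estimate = (omega_inv_bx @ beta_y) / precision
    return estimate, np.sqrt(1.0 / precision)


# Hypothetical data: one variant versus two variants with identical associations.
est_one, se_one = ivw_correlated([0.2], [0.05], [0.01], [[1.0]])
est_ld, se_ld = ivw_correlated(
    [0.2, 0.2], [0.05, 0.05], [0.01, 0.01], [[1.0, 0.98], [0.98, 1.0]]
)
est_ind, se_ind = ivw_correlated(
    [0.2, 0.2], [0.05, 0.05], [0.01, 0.01], [[1.0, 0.0], [0.0, 1.0]]
)
print(se_one, se_ld, se_ind)
```

With uncorrelated variants the standard error shrinks by a factor of the square root of two, whereas with a correlation of 0.98 it is almost unchanged.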
Selecting too many correlated variants can lead to numerical instability in Mendelian randomization estimates, as multicollinearity between variants makes estimates highly sensitive to misspecification of the variant correlation matrix.26 One option to mitigate this issue is employing a variable selection method from the fine-mapping literature to select a small number of variants with independent signals, such as Bayesian stochastic search.4 An alternative approach is performing dimension reduction on the matrix of genetic variants and using a small number of principal components or factors from this matrix as instruments.26,27 However, unless additional variants explain a substantial fraction of additional variance in the exposure, gains in power over analyses based on the lead variant alone are likely to be small.28
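The dimension reduction idea can be sketched as follows. This is an illustration under stated assumptions rather than the published implementation: the choice of weighting matrix (exposure associations scaled by outcome standard errors, multiplied elementwise by the correlation matrix), the 99% variance threshold, and the function name are all assumptions of this sketch. Principal components of the weighted matrix are used to project the summarized associations onto a small number of components, and a correlated inverse-variance weighted estimate is then computed from the projected data.

```python
import numpy as np


def pca_ivw(beta_x, beta_y, se_y, rho, var_explained=0.99):
    """IVW estimate using principal components of correlated variants as instruments."""
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Weighted variant matrix (an assumption of this sketch).
    scaled = beta_x / se_y
    phi = np.outer(scaled, scaled) * rho

    # Eigen-decomposition, sorted from largest to smallest eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(phi)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the smallest number of components reaching the variance threshold.
    share = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(share, var_explained)) + 1
    k = min(k, len(beta_x))
    w = eigvecs[:, :k]

    # Project associations and their covariance onto the retained components.
    bx_pc, by_pc = w.T @ beta_x, w.T @ beta_y
    omega_pc = w.T @ (np.outer(se_y, se_y) * rho) @ w
    omega_inv_bx = np.linalg.solve(omega_pc, bx_pc)
    precision = bx_pc @ omega_inv_bx
    return (omega_inv_bx @ by_pc) / precision, np.sqrt(1.0 / precision), k


# Hypothetical region with five highly correlated variants and a true effect of 0.3.
rho = np.full((5, 5), 0.95) + 0.05 * np.eye(5)
bx = np.array([0.20, 0.18, 0.22, 0.19, 0.21])
pc_estimate, pc_se, n_components = pca_ivw(bx, 0.3 * bx, np.full(5, 0.02), rho)
print(pc_estimate, pc_se, n_components)
```

Because the projected covariance matrix has only as many rows as retained components, the matrix inversion is far less sensitive to small errors in the correlation matrix than an analysis of all variants would be.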
A particular concern to the validity of Mendelian randomization analyses is pleiotropy (sometimes called “horizontal pleiotropy”), defined here as the association of a genetic variant with a trait that is not on the causal pathway from the variant to the outcome through the exposure.29 While there is a wide variety of Mendelian randomization methods that are robust to some degree of pleiotropy,30 these methods typically rely on the availability of variants in multiple regions, hence allowing the consistency of findings across these variants to be investigated for a polygenic Mendelian randomization analysis. As an example, several genes are implicated in the synthesis and metabolism of vitamin D. Variants in four distinct regions linked to vitamin D were shown to associate concordantly with the risk of multiple sclerosis, providing evidence of a potential protective effect of vitamin D on multiple sclerosis risk.31 However, these methods typically are of limited benefit when variants are all located in a single genetic region because the assumption of instrument validity is not independent for different variants in such a single region.
An alternative approach that can be used to assess the robustness of a cis-Mendelian randomization finding is colocalization. Colocalization is a method developed in the context of genome-wide association studies (GWASs) to distinguish between two scenarios at a genetic region containing variants associated with two traits: (1) the traits are affected by the same variant(s) or (2) the traits are affected by different variants.32 If colocalization is applied to the exposure and outcome from a Mendelian randomization analysis, the latter scenario is likely to represent violation of the Mendelian randomization assumptions (Figure 2), as a genetic predictor of the exposure could be associated with the outcome via linkage disequilibrium with a distinct variant that influences the outcome. For example, colocalization analyses suggest that the causal variants for low-density lipoprotein (LDL) cholesterol and Alzheimer disease at the APOE locus are distinct, indicating that different mechanisms underlie these associations.33 A related example is at the GLP1R locus, where the lead signal for bodyweight reduction is not the same as that for lowering elevated blood glucose,33 suggesting distinct signaling mechanisms underlie the effect of glucagon-like peptide 1 receptor (GLP1R) agonism on these two outcomes.34
A limitation of the use of colocalization as a sensitivity analysis for Mendelian randomization is that colocalization methods (in particular, the coloc method35 and its derivatives) require the presence of a genetic variant that is strongly associated with each trait in the given genetic region to conclude that there is colocalization. If a genetic variant is associated with the outcome at a relatively weak level of statistical significance (say p = 0.005), then the coloc method under its default prior settings will typically conclude that there is no causal variant for the outcome (in the language of coloc, this is the hypothesis H1) rather than evidence for the hypothesis of distinct causal variants (H3) or a shared causal variant (H4; this is colocalization).33 Such a finding does not provide strong evidence either in favor of or against the validity of the Mendelian randomization assumptions.
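The behavior described above can be illustrated with a minimal Python sketch of the posterior calculation under a single-causal-variant assumption, using Wakefield approximate Bayes factors for each variant. This is not the coloc package itself: the function names are hypothetical, the prior probabilities (1e-4, 1e-4, 1e-5) and the prior standard deviation of 0.15 are assumed values intended to mimic typical defaults, and the summary statistics are invented. Trait 1 plays the role of the exposure and trait 2 the outcome.

```python
import numpy as np


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield log approximate Bayes factor for each variant."""
    z2 = (beta / se) ** 2
    r = prior_sd**2 / (prior_sd**2 + se**2)
    return 0.5 * (np.log(1.0 - r) + r * z2)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0 to H4, assuming at most one causal variant per trait.

    Works on the natural scale for readability; a production implementation
    would stay on the log scale to avoid overflow.
    """
    abf1, abf2 = np.exp(labf1), np.exp(labf2)
    s1, s2, s12 = abf1.sum(), abf2.sum(), (abf1 * abf2).sum()
    unnormalized = np.array([
        1.0,                        # H0: no causal variant for either trait
        p1 * s1,                    # H1: causal variant for trait 1 only
        p2 * s2,                    # H2: causal variant for trait 2 only
        p1 * p2 * (s1 * s2 - s12),  # H3: distinct causal variants
        p12 * s12,                  # H4: shared causal variant
    ])
    return unnormalized / unnormalized.sum()


# Hypothetical region of 50 variants; only the first variant carries any signal.
n_variants, se = 50, 0.02
z_exposure = np.zeros(n_variants)
z_exposure[0] = 8.0
z_outcome_weak = np.zeros(n_variants)
z_outcome_weak[0] = 2.81  # roughly p = 0.005
z_outcome_strong = np.zeros(n_variants)
z_outcome_strong[0] = 8.0

labf_exposure = log_abf(z_exposure * se, se)
posterior_weak = coloc_posteriors(labf_exposure, log_abf(z_outcome_weak * se, se))
posterior_strong = coloc_posteriors(labf_exposure, log_abf(z_outcome_strong * se, se))
print(np.round(posterior_weak, 3))
print(np.round(posterior_strong, 3))
```

In this invented example, the same lead variant is associated with both traits, yet with the weak outcome association the largest posterior probability falls on H1 rather than H4; only when the outcome association is strong does H4 dominate.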
As an example, two genetic variants in the F11 region that are associated with circulating levels of coagulation factor XI have been shown to be associated with risk of ischemic stroke and in particular the cardioembolic subtype.36 In a separate investigation,37 variants in the F11 region were shown to colocalize for circulating levels of factor XI and venous thromboembolism via the coloc-SuSiE method,38 an extension of the standard coloc method that allows colocalization to be detected when there is more than one causal variant at a given locus. This supports the findings of phase II trials, which have shown evidence for the beneficial effect of factor XI inhibition on venous thromboembolism risk.39
As a complementary approach to strengthen target validation, investigators considered the impact of statins on cardiovascular diseases, not by using genetic variants in HMGCR that encodes the target of statin therapy but instead by using a score comprising genetic variants that predict statin efficacy (see “investigating heterogeneity in responses”).40 Amongst statin users, this score was inversely associated with risk of myocardial infarction and peripheral vascular disease and positively associated with intracerebral hemorrhage, mirroring results from statin trials41,42 and genetic analyses using variants in HMGCR.43 This is an example of triangulation: the use of distinct approaches to address the same question.44 Amongst statin non-users, the score was not associated with any of these diseases. Statin non-users represent a natural negative control group, as we would not expect genetic variants that influence the efficacy of statins to affect disease outcomes amongst individuals who do not use statins. A limitation of this analysis is that the restriction to statin users could induce selection bias. However, the impact of mild selection effects on Mendelian randomization estimates is typically not substantial.45
Mendelian randomization investigations can be performed with rare genetic variants, although specific methodological approaches are recommended when considering evidence from multiple rare variants. To assess associations with rare variants, it is common to consider collapsing analyses that combine information on several variants into a single test.46 If genetic associations are considered for individual variants in isolation, there may not be enough power to detect an association. By combining information across variants in a region, power can be increased. Several approaches have been considered, including burden tests, which combine multiple variants into a single burden variable and assess associations of the burden variable, and kernel-based test methods, which combine test statistics for each variant via a kernel matrix. The sequence kernel association optimal unified test (SKAT-O) method combines these two approaches in a data-driven way.47 Rare variant approaches have been used in a hypothesis-free way to search for associations across the genome48 and in a focused way, for example, to demonstrate associations of variants in CIDEB with liver disease.49 The latter investigation used the REGENIE method,50 which has been developed to perform such analyses in large datasets. A detailed review of rare variant methods is beyond the scope of this paper; a recent review is available elsewhere.51
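A burden test in its simplest form can be sketched as below. This toy Python example collapses rare-allele counts across variants into one burden variable per person and regresses a quantitative trait on it. It uses simulated data with an assumed effect size, omits covariate adjustment and relatedness, and is not the SKAT-O or REGENIE implementation; the function name is hypothetical.

```python
import numpy as np


def burden_test(genotypes, trait):
    """Regress a quantitative trait on the per-person count of rare alleles.

    genotypes: (n_individuals, n_variants) array of rare-allele counts (0, 1, 2).
    trait: (n_individuals,) array of trait values.
    Returns the slope, its standard error, and the z-statistic.
    """
    burden = genotypes.sum(axis=1)
    x = burden - burden.mean()
    y = trait - trait.mean()
    slope = (x @ y) / (x @ x)
    residuals = y - slope * x
    sigma2 = (residuals @ residuals) / (len(y) - 2)
    se = np.sqrt(sigma2 / (x @ x))
    return slope, se, slope / se


# Simulated data: 20 rare variants (allele frequency 0.001), each raising the
# trait by 0.5 standard deviations per allele. All values are assumptions.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.001, size=(20_000, 20))
trait = 0.5 * genotypes.sum(axis=1) + rng.normal(size=20_000)

slope, se, z = burden_test(genotypes, trait)
print(f"burden slope = {slope:.2f} (SE {se:.2f}), z = {z:.1f}")
```

The burden variable pools roughly twenty times as many rare alleles as any single variant, which is the source of the gain in power; the gain is lost if variants in the region have effects in opposing directions, which is the situation that kernel-based tests are designed to handle.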
A critical question when performing a Mendelian randomization analysis is how to choose which genetic variants to include in the analysis. First, one must determine which regions to focus on, and second, one must decide which variants from these regions to include in the analysis. Mendelian randomization analyses are most reliable when variants are selected on the basis of their biological relevance to the exposure or, more specifically, their relevance to the proposed intervention on the exposure.20 For this reason, drug target Mendelian randomization analyses typically focus on a single region. Selection of variants from a region may be guided by functional insight or associations with levels of a circulating biomarker, a relevant protein, or gene expression in a relevant tissue or cell type. The optimal approach in any case will depend on the specific investigation and the available data. Discussion of these considerations in the context of relevant examples can be found in Gill et al.5 and Swerdlow et al.52
Clarifying causal pathways
Although Mendelian randomization provides evidence about the causal nature of an exposure for a disease outcome, more detailed analyses are required to understand the causal pathway by which the exposure may influence the outcome. A relevant approach here is multivariable Mendelian randomization, the use of genetic variants that are associated with multiple exposures to estimate the effect of each exposure on the outcome.53,54 While standard (that is, univariable) Mendelian randomization assesses whether genetically predicted levels of an exposure are associated with the outcome in a univariable model, multivariable Mendelian randomization assesses whether genetically predicted levels of multiple exposures are associated with the outcome in a multivariable model.55 Two scenarios in which this method can be used are (1) when there are several related traits with shared genetic predictors and (2) to assess mediation of the effect of a complex trait via one or more proposed mediators (Figure 3).56
Figure 3.
Schematic diagram illustrating two scenarios in which multivariable Mendelian randomization can be used:
(A and B) To disentangle the effects of related traits with shared genetic predictors (A) and to assess mediation in the effect of an exposure via a proposed mediator (B).
As an example of the first scenario (related traits), it is difficult to find genetic predictors of high-density lipoprotein (HDL) cholesterol concentrations that are not also associated with LDL cholesterol and/or triglyceride concentrations. However, while polygenic univariable Mendelian randomization analyses suggest that genetically predicted levels of HDL cholesterol are associated with coronary artery disease risk, this association attenuates substantially in multivariable Mendelian randomization analyses. The univariable analysis is subject to potential bias, as several of the genetic predictors of HDL cholesterol concentrations have pleiotropic effects on the outcome via LDL cholesterol and/or triglycerides. Conditional on genetically predicted LDL cholesterol and triglyceride concentrations, the association between genetically predicted HDL cholesterol concentrations and coronary artery disease risk is attenuated toward the null.57 This suggests that HDL cholesterol by itself is not a worthwhile target for pharmacological intervention in a general population.
An extension of this method is the Mendelian randomization Bayesian model averaging (MR-BMA) method, which extends multivariable Mendelian randomization to consider large numbers of traits, comparing the evidence for different sets of exposures as causal risk factors.58 An analysis considering genetic predictors of 30 lipidomic measurements concluded that the most plausible causal model for those data had apolipoprotein B as the sole risk factor affecting coronary artery disease. This suggests that differences in coronary artery disease risk are proportional to the change in the number of hepatically derived cholesterol-carrying lipoprotein particles, not the concentration of LDL cholesterol.59 A similar conclusion was reached investigating lipid predictors of cardiovascular risk reduction in response to pharmacological agents in clinical trials.60 Genetic analyses have yielded similar conclusions for peripheral artery disease.61 These findings have implications for the choice of targets for lipid-lowering therapies and the measurement of lipid traits in clinical trials for cardiovascular disease.
As an example of the second scenario (mediation), previous Mendelian randomization investigations have demonstrated evidence for body mass index (BMI) as a causal risk factor for coronary heart disease.62 However, the mechanism linking BMI to coronary heart disease risk is unclear. Multivariable Mendelian randomization analyses have suggested that the effect of BMI on coronary heart disease risk is partially mediated via systolic blood pressure and type 2 diabetes propensity.63 This implies that much of the cardiovascular benefit of BMI lowering could be achieved by reductions in blood pressure and risk of type 2 diabetes. Such insight is also directly relevant to identifying the population that stands to gain the greatest reduction in cardiovascular risk from bodyweight reduction.
Multivariable Mendelian randomization is difficult to implement for cis-Mendelian randomization analyses, as it is necessary to have at least as many independent genetic variants as there are traits in order to estimate the effect of each trait.64 Additionally, it is necessary to have variants that differ in their relative strength of association with the traits; if genetic associations with two traits are perfectly proportional, then it is not possible to distinguish between the traits in a multivariable model.55
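The two requirements above can be seen directly in a minimal Python sketch of a multivariable inverse-variance weighted analysis, written here as a weighted regression of variant-outcome associations on the variant-exposure associations without an intercept. The function name and data are hypothetical, variants are assumed to be uncorrelated, and the standard errors are fixed-effect standard errors. The estimator is undefined when there are fewer variants than exposures or when the exposure associations are perfectly proportional, and the code checks for both.

```python
import numpy as np


def mvmr_ivw(beta_x, beta_y, se_y):
    """Multivariable IVW estimates from summarized data on uncorrelated variants.

    beta_x: (n_variants, n_exposures) variant-exposure associations.
    beta_y, se_y: (n_variants,) variant-outcome associations and standard errors.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    n_variants, n_exposures = beta_x.shape
    if n_variants < n_exposures:
        raise ValueError("need at least as many variants as exposures")
    if np.linalg.matrix_rank(beta_x) < n_exposures:
        raise ValueError("exposure associations are perfectly proportional")
    weights = 1.0 / se_y**2
    xtwx = beta_x.T @ (beta_x * weights[:, None])
    xtwy = beta_x.T @ (beta_y * weights)
    estimates = np.linalg.solve(xtwx, xtwy)
    se = np.sqrt(np.diag(np.linalg.inv(xtwx)))  # fixed-effect standard errors
    return estimates, se


# Hypothetical associations with two exposures; only the first affects the outcome.
bx = np.array([[0.10, 0.02], [0.08, 0.09], [0.03, 0.12],
               [0.12, 0.05], [0.05, 0.01], [0.02, 0.07]])
by = bx @ np.array([0.4, 0.0])
mv_estimates, mv_se = mvmr_ivw(bx, by, np.full(6, 0.01))
print(np.round(mv_estimates, 3), np.round(mv_se, 3))
```

Replacing the second column of `bx` with a multiple of the first makes the function raise an error, which corresponds to the situation in which the traits cannot be distinguished.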
Multivariable Mendelian randomization could have utility for cis-Mendelian randomization if multiple traits are associated with variants in a particular genetic region linked with a disease outcome. A potential scenario where this may occur is for protein traits at a “gene cluster,” a region of the chromosome containing several adjacent genes, that contains one or more GWAS hits, such as the interleukin-1 receptor cluster that contains variants associated with various autoimmune diseases.65,66 A recent methodological development extended dimension reduction methods for variant selection to the multivariable cis-Mendelian randomization setting.67 In univariable cis-Mendelian randomization, including multiple variants may increase power slightly, whereas in multivariable cis-Mendelian randomization, including multiple variants is necessary to disentangle the various putative causal traits. This method was used to show that, despite the presence of multiple protein associations at the locus, the most likely causal risk factor for cardioembolic stroke at the chemokine receptor gene cluster is monocyte chemoattractant protein-1.67
Identifying relevant outcomes
An efficient way of addressing unmet medical need is the repurposing of existing drugs for novel disease outcomes. If genetic variants can be found that proxy a particular pharmacological intervention and are valid instrumental variables, then genetic evidence for the effect of the intervention on a range of outcomes can be assessed in a phenome-wide association study.68,69 Outcomes can include continuous traits and diseases: intermediate biomarkers indicating efficacy of the treatment (which could therefore be used in early-phase trials as proxies for a disease outcome), positive and negative controls, putative additional indicated disease outcomes, and safety signals.
For example, genetic variants in the IL6R region that can be considered as proxies for intervention on interleukin-6 signaling are associated with levels of downstream inflammatory biomarkers (C-reactive protein, fibrinogen) similarly to tocilizumab, an interleukin-6 receptor inhibitor.70 Tocilizumab is used in the treatment of rheumatoid arthritis, which is therefore a positive control outcome. These IL6R variants are associated with rheumatoid arthritis and with coronary artery disease in the same direction, suggesting that interleukin-6 receptor inhibition may also reduce risk of cardiovascular disease.71 These same variants have also been shown to be associated with reduced risk of COVID-19 and COVID-19 hospitalization72,73; tocilizumab was subsequently shown to be an effective treatment for reducing COVID-19 severity in the RECOVERY trial.74
Wide-angled Mendelian randomization analyses (that is, analyses considering a broad set of outcomes) could help define the therapeutic effects of an intervention and hence the primary outcome of a clinical trial. Phase III trials are currently underway for antisense oligonucleotides that substantially reduce lipoprotein(a) concentrations.75 Genetic variants in the LPA region are strongly associated with coronary heart disease, displaying a dose-dependent relationship on the log-linear scale76 (see “predicting the magnitude of effect”). Variants in this region are also associated with risk of ischemic stroke, peripheral artery disease, abdominal aortic aneurysm, and aortic stenosis but not with hemorrhagic stroke,77 suggesting that hemorrhagic stroke should not be considered as part of the primary efficacy endpoint for lipoprotein(a) trials.
An example of a potential safety signal is the genetic association of variants in the CETP region with age-related macular degeneration (AMD) risk.78,79 Cholesteryl ester transfer protein (CETP) inhibitors are being developed for treatment of cardiovascular diseases,80 and vigilance for increased risk of AMD incidence or progression in treated patients may be appropriate.
Elucidating complex mechanisms
Genetic associations can be used to untangle complex biological mechanisms. As a simple example, genetic predictors of C-reactive protein concentrations in the IL6R region are associated with coronary artery disease risk,70 but those in the CRP region are not.81 The IL6R region encodes interleukin-6 receptor; interleukin-6 is upstream of C-reactive protein in the inflammatory cascade. The genetic associations suggest that targeting interleukin-6 pathways may reduce coronary artery disease risk but targeting C-reactive protein directly will not.
A typical feature of polygenic Mendelian randomization analyses is heterogeneity amongst genetic predictors of an exposure in their associations with the outcome. While this can be a sign of specific variants having pleiotropic associations,29 it could instead reflect the presence of multiple causal mechanisms by which the exposure influences the outcome, particularly if the Mendelian randomization estimates for different variants cluster around distinct values. Such clusters may reflect distinct components of the exposure that have different effects on the outcome, different mechanisms of intervention on the exposure, or distinct causal pathways passing via the exposure82 (Figure 4). Untangling these mechanisms could help pinpoint specific aspects of a complex exposure that could be intervened on to reduce disease risk.83
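Heterogeneity of this kind is commonly quantified with Cochran's Q statistic applied to the variant-specific ratio estimates. The Python sketch below is a minimal illustration with hypothetical data and a hypothetical function name; it uses first-order inverse-variance weights and assumes uncorrelated variants. Under homogeneity, Q is approximately chi-squared distributed with degrees of freedom equal to the number of variants minus one, so values far above that number indicate heterogeneity.

```python
import numpy as np


def cochran_q(beta_x, beta_y, se_y):
    """Pooled IVW estimate and Cochran's Q for variant-specific ratio estimates."""
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    ratios = beta_y / beta_x
    weights = (beta_x / se_y) ** 2  # inverse variance of each ratio (first order)
    pooled = np.sum(weights * ratios) / np.sum(weights)
    q = np.sum(weights * (ratios - pooled) ** 2)
    return pooled, q, len(ratios) - 1


# Hypothetical variants: four with a ratio estimate of 0.5 and two with -0.2,
# mimicking two clusters of variants acting through different mechanisms.
bx = np.array([0.10, 0.12, 0.08, 0.10, 0.09, 0.11])
se = np.full(6, 0.01)
pooled_same, q_same, df = cochran_q(bx, 0.5 * bx, se)
cluster_ratios = np.array([0.5, 0.5, 0.5, 0.5, -0.2, -0.2])
pooled_mixed, q_mixed, _ = cochran_q(bx, cluster_ratios * bx, se)
print(q_same, q_mixed, df)
```

A large Q does not by itself distinguish pleiotropy from multiple causal mechanisms; that distinction requires examining whether the variant-specific estimates group around distinct values.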
Figure 4.
Schematic diagram illustrating a potential explanation leading to clustering of genetic variants associated with an exposure
Two approaches have been proposed to cluster genetic variants in a Mendelian randomization analysis: (1) clustering based on variant-specific Mendelian randomization estimates and (2) clustering based on associations with related traits.
Two methods with relevance to Mendelian randomization have been proposed for the clustering of genetic variants associated with a particular exposure: one that clusters variants on the basis of their associations with the outcome and another that clusters variants on the basis of their associations with multiple traits but typically does not include the outcome among these traits.
The first method, MRClust, finds groups of variants with similar Mendelian randomization estimates for the effect of the exposure on the outcome but then requires the user to explore genetic associations with different traits to interpret the clusters.82 For example, although most genetic predictors of increased BMI were positively associated with type 2 diabetes risk, a minority cluster of genetic variants was associated with increased BMI but decreased type 2 diabetes risk.84 A shared feature of this cluster was associations with increased birthweight, suggesting a dichotomy in effects on type 2 diabetes risk between genetic variants that predispose an individual to excess adiposity in later life versus those that predispose an individual to large size from birth.85
The second method, NAvMix, finds groups of variants having similar proportional associations with a range of traits.86 If the outcome is not included in the clustering algorithm, then we can test whether the Mendelian randomization estimates for the outcome based on different clusters differ. For example, genetic predictors of BMI were divided into five clusters on the basis of their associations with nine cardiovascular traits. While three of these clusters had positive Mendelian randomization estimates for the outcome of coronary heart disease, the fourth cluster had a null estimate, and the fifth cluster had a negative estimate. The fifth cluster, which was also the smallest cluster, was associated with cardiovascular traits in a pattern that has previously been described as “metabolically favorable adiposity,” namely increased HDL cholesterol concentrations and decreased systolic blood pressure, triglyceride concentrations, waist-hip ratio, and type 2 diabetes propensity.87 This cluster also differed from others in its associations with levels of inflammatory biomarkers.86 This implies there may be targets that increase bodyweight but decrease other biomarkers of cardiometabolic disease and have favorable overall effects on cardiovascular disease risk.
An alternative approach to unravel complexity is to perform dimension reduction on a set of related traits. This can be used to summarize high-dimensional data on a complex phenotype into meaningful composite variables. For example, it is unlikely that a single component of body composition is causal for metabolic disease but rather the agglomerative effect of a mechanism such as adiposity, which influences many anthropometric traits. Principal-component analysis methods have been applied to reduce multiple body composition measures into four main components, representing body size, adiposity, predisposition to abdominal fat deposition, and lean mass, which were then investigated in Mendelian randomization analyses.88 Another context where this approach could be used is to partition imaging data into components via methods such as sparse principal-component analysis, which creates components that are sparse (that is, they are only calculated on the basis of a subset of all available variables) and hence are more interpretable.89
Another potential complexity in the effect of an intervention is if there is an interaction between two treatments. If separate genetic variants (or sets of variants) are available that proxy distinct interventions, then the statistical interaction between these variants in their association with the outcome can be explored in an approach known as factorial Mendelian randomization, named in analogy with a factorial trial90 (Figure 5). Initial implementations of factorial Mendelian randomization considered a 2-by-2 design in which the genetic instruments were dichotomized to divide participants into one of four groups, corresponding to no intervention, intervention A only, intervention B only, and interventions A and B.91 However, while this is an intuitive way of conceptualizing and communicating the analysis, power to detect an interaction is greater when considering the presence of a statistical interaction between the genetic instruments considered as continuous variables.90
Figure 5.
Schematic diagram illustrating factorial Mendelian randomization to assess interactions between exposures:
(A) Diagram indicating causal effects of exposures plus their interaction.
(B) Illustration of the approach as a 2-by-2 design with dichotomized genetic scores that act as proxies for interventions on exposures 1 and 2.
A recent analysis considered interactions between genetic variants in the IL6R region, representing proxies for intervention on interleukin-6 receptor signaling, and genetic variants in regions corresponding to lipid-lowering therapies (PCSK9, HMGCR, NPC1L1). No significant interactions were observed in associations with cardiovascular disease risk, suggesting that the effects of interleukin-6 receptor inhibition and lipid-lowering therapies on cardiovascular disease are independent with no detectable departure from additivity.92 Such analyses can help understanding of the therapeutic potential of novel targets to provide additional benefit in conjunction with existing medicines, although power to detect an interaction is typically low in practice.
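The 2-by-2 formulation of factorial Mendelian randomization can be sketched with simulated data. The example below is an illustration under stated assumptions, not the analysis used in the cited studies: the two genetic scores are simulated as independent standard normal variables, the outcome is continuous, the effect sizes are arbitrary, and the function names are hypothetical. The interaction is summarized by the contrast of the four cell means, which is zero in expectation when the two scores act additively.

```python
import random


def simulate(n, interaction, seed=1):
    """Simulate two independent genetic scores and a continuous outcome."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        g1 = rng.gauss(0, 1)  # score acting as a proxy for intervention A
        g2 = rng.gauss(0, 1)  # score acting as a proxy for intervention B
        y = 0.3 * g1 + 0.3 * g2 + interaction * g1 * g2 + rng.gauss(0, 1)
        data.append((g1, g2, y))
    return data


def factorial_contrast(data):
    """Dichotomize both scores at zero and return the 2-by-2 interaction contrast."""
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for g1, g2, y in data:
        cells[(int(g1 > 0), int(g2 > 0))].append(y)
    m = {k: sum(v) / len(v) for k, v in cells.items()}
    # (both high) - (A only) - (B only) + (neither)
    return m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]


additive = factorial_contrast(simulate(20000, interaction=0.0))
interacting = factorial_contrast(simulate(20000, interaction=0.2))
print("contrast without interaction:", round(additive, 3))
print("contrast with interaction:", round(interacting, 3))
```

Dichotomizing discards information, which is consistent with the point above that power is greater when the scores are analyzed as continuous variables with a product term.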
Uncovering tissue specificity
The availability of genetic association data related to gene expression in particular tissues has created the opportunity for Mendelian randomization analyses to be used to reveal the particular body sites or cell types at which proteins may be exerting their relevant biological effects. For example, recent work partitioned BMI-associated genetic variants into two groups on the basis of whether their effects were likely being exerted in the brain or adipose tissue. Specifically, the approach taken was to perform pairwise colocalization analyses at each of the relevant regions, considering the traits of BMI and tissue-specific gene expression.93 Subsequently, multivariable Mendelian randomization analyses considered the partitioned sets of variants for brain-related BMI variation and adipose-related BMI variation. These analyses provided evidence that effects of BMI reduction on cardiovascular outcomes were more likely to be a consequence of brain-related mechanisms than those related to effects in the adipose tissue. For drug development, this provides the important insight that perturbing mechanisms affecting appetite and behavior may be more effective for achieving weight loss than those affecting peripheral adipose tissue metabolism.
In another example, tissue-specific gene expression data have been used to support a role for brain ACE expression in the pathophysiology of Alzheimer disease.94 Particularly when pursuing drug development, insight into the relevant site of action is paramount for ensuring that any ensuing asset has appropriate pharmacokinetic properties, while concurrently minimizing risk of potential adverse effects in other tissues. However, researchers should be aware that GWAS signals often colocalize with gene expression across multiple tissues—it may be that expression data in sufficient sample sizes for the most relevant tissue or cell type are not available.
A potential limitation of employing variants related to gene expression is that such data are typically taken from donors (often either healthy or deceased) and may not reflect associations observed in disease states. Further, gene expression patterns can be sensitive to environmental factors (such as glycemic status or pH balance) and thus may not be reflective of the relevant pathophysiological processes.95 A further limitation is that measurements of gene expression in many current datasets are derived from bulk tissues, which are mixtures of different cell types, whereas the relevant effects on expression (and therapy) may be specific to a single or small number of cell types, although single-cell expression datasets are now becoming available.96
Investigating heterogeneity in responses
Mendelian randomization investigations typically provide an estimate that represents the population-averaged impact of a shift in the distribution of an exposure.97 However, it may be that the effect of the exposure on the outcome varies between subgroups of the population. By identifying such subgroups, treatments can be targeted toward those who will benefit most.
One scenario that would lead to such heterogeneity is if the effect of the exposure is non-linear. Non-linearity in causal relationships can be investigated in a Mendelian randomization framework but requires the availability of individual-level data on the genetic variants, exposure, and outcome in a single sample. A popular method for non-linear Mendelian randomization first stratifies the sample on the basis of levels of the exposure and then conducts separate Mendelian randomization analyses within each stratum.97,98 An important methodological point is that stratifying on the exposure directly would break randomization and lead to biased estimates in the strata.99 This is because the distribution of the genetic variants would no longer be the same within each stratum, as genetic variants predisposing individuals to higher levels of the exposure would be more common in strata with high levels of the exposure and less common in strata with low levels of the exposure. This is an example of collider bias.100 For this reason, the method first regresses the exposure on the genetic variants and then stratifies on residual values of the exposure from this regression, as this “residual exposure” is independent of the genetic variants and hence randomization still holds within strata of the residual exposure. An alternative method, known as PolyMR, also calculates this residual exposure but then performs a parametric instrumental variable analysis via a polynomial function of the residual exposure to estimate the shape of the non-linear relationship.101
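The residual stratification step can be sketched as follows. This is a simplified simulation under assumptions made here for illustration (a single continuous genetic score, a constant genetic effect on the exposure, a quadratic effect of the exposure on the outcome, and an unmeasured confounder `u`); the function names are hypothetical and the code is not the implementation from the cited papers.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """Least-squares slope from regressing y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residual_stratified_estimates(g, x, y, n_strata=3):
    """Stratify on the residual exposure, then compute a ratio estimate per stratum."""
    b = slope(g, x)
    a = mean(x) - b * mean(g)
    resid = [xi - a - b * gi for gi, xi in zip(g, x)]  # exposure with genetic part removed
    order = sorted(range(len(x)), key=lambda i: resid[i])
    size = len(order) // n_strata
    estimates = []
    for s in range(n_strata):
        end = (s + 1) * size if s < n_strata - 1 else len(order)
        idx = order[s * size:end]
        gs = [g[i] for i in idx]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        estimates.append(slope(gs, ys) / slope(gs, xs))
    return estimates


rng = random.Random(7)
g, x, y = [], [], []
for _ in range(30000):
    gi = rng.gauss(0, 1)
    u = rng.gauss(0, 1)                        # unmeasured confounder
    xi = 5 + 0.5 * gi + u + rng.gauss(0, 1)    # exposure
    yi = 0.05 * xi ** 2 + u + rng.gauss(0, 1)  # effect of exposure grows with its level
    g.append(gi)
    x.append(xi)
    y.append(yi)

estimates = residual_stratified_estimates(g, x, y)
print("stratum-specific estimates:", [round(e, 2) for e in estimates])
```

Because the simulated effect of the exposure is steeper at higher levels, the stratum-specific estimates should tend to increase from the lowest to the highest stratum of the residual exposure.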
The residual method was used to investigate the effect of alcohol on cardiovascular disease outcomes.102 A non-linear Mendelian randomization investigation found evidence that the effect of alcohol consumption on coronary artery disease is much stronger at high levels of alcohol consumption than at low levels of alcohol consumption, although a harmful effect of alcohol consumption was evident across the whole distribution. This suggests that reductions in alcohol consumption will have the strongest effect on coronary artery disease risk for those with high levels of consumption. In contrast, non-linear Mendelian randomization analyses for the effect of blood pressure on coronary heart disease risk103 and the effect of average blood glucose levels on coronary heart disease risk have indicated linear causal relationships,104 suggesting that appropriate interventions on these risk factors may be similarly beneficial for the whole population.
Another possibility is that the causal effect of an exposure differs within strata of a measured covariate. Again, stratifying directly on a covariate could induce collider bias.105 A similar approach, which calculates and then stratifies on a residual version of the covariate, has been proposed to avoid inducing collider bias.106 This stratification approach has been used to show that the effect of smoking on risk of bladder cancer is greater for those with low bodyweight,106 and that the effect of interleukin-6 receptor inhibition on coronary heart disease risk is greater for those with high levels of C-reactive protein.107
An important assumption made by these methods is that the genetic effect on the exposure is linear and constant for all individuals in the population. In many cases, this assumption will be questionable. In other cases, it logically cannot hold; for example, if the exposure has a natural zero level (such as non-consumers for alcohol as an exposure) or is rounded to the nearest whole number. An alternative method has been proposed that stratifies the population without making such strict parametric assumptions.108 This method is implemented by ranking the population twice: first on the basis of the value of the instrument and second on the basis of the value of the exposure. As with the residual method, the strata formed are independent of the genetic variants.
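One reading of the two-step ranking is sketched below. It assumes that individuals are first sorted by the instrument and grouped into consecutive "pre-strata" whose size equals the number of final strata, and that within each pre-stratum the individual with the lowest exposure goes to the first stratum, the next to the second, and so on. This grouping rule, the function name `doubly_ranked_strata`, and the simulated data are assumptions made for illustration rather than details stated in the text.

```python
import random


def doubly_ranked_strata(z, x, n_strata):
    """Assign a stratum label (0 .. n_strata - 1) to each individual by ranking twice."""
    order = sorted(range(len(z)), key=lambda i: z[i])  # first ranking: instrument
    stratum = [0] * len(z)
    for start in range(0, len(order), n_strata):
        # Pre-stratum: individuals with neighbouring instrument values.
        pre_stratum = sorted(order[start:start + n_strata], key=lambda i: x[i])
        for rank, i in enumerate(pre_stratum):         # second ranking: exposure
            stratum[i] = rank
    return stratum


rng = random.Random(11)
z = [rng.gauss(0, 1) for _ in range(3000)]   # instrument (simulated)
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]  # exposure (simulated)
strata = doubly_ranked_strata(z, x, n_strata=3)


def stratum_mean(values, s):
    members = [v for v, k in zip(values, strata) if k == s]
    return sum(members) / len(members)


z_means = [stratum_mean(z, s) for s in range(3)]
x_means = [stratum_mean(x, s) for s in range(3)]
print("mean instrument by stratum:", [round(m, 3) for m in z_means])
print("mean exposure by stratum:", [round(m, 3) for m in x_means])
```

In this sketch the mean of the instrument is nearly identical across strata while the mean exposure increases, which is the property needed for stratum-specific analyses to remain valid.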
Another possibility is that the effect of an exposure varies within genetic subgroups of the population, an area of research known as “pharmacogenetics.” GWASs of response to drug treatment, such as those estimated from randomized clinical trials, have revealed specific variants that associate with treatment response.109 These can also be combined to provide a polygenic treatment response score.110 Such a score could be used to identify potential non-responders to treatment in whom alternative therapies may be preferred111 or alternatively those who would benefit most from early intervention. It would also be relevant when a treatment has non-negligible side effects, as causal estimates can be compared between subgroups of the population to judge whether benefits outweigh risks in each subgroup.
This approach can also be mirrored for cross-sectional observational data in a Mendelian randomization framework by using genetic variants as proxies for drug treatments and investigating gene-gene interactions between the treatment-mimicking instrument and other variants that represent potential effect modifiers. Even if individual variants with strong gene-gene interactions cannot be detected, it is possible to combine signals across sub-GWAS-significant variants with a random forest approach to construct a polygenic response score.112 This idea is analogous to the creation of a polygenic risk score, which can provide superior performance for risk prediction compared with approaches that only include GWAS-significant variants.113
Ancestry-specific considerations
Ancestry has a special place in human genetic investigations because of the complex interplay of genetics with ethnic identity and cultural practice.114 Typically, Mendelian randomization analyses are conducted in populations from a single ancestry group, under the assumption that this population does not contain substructure (that is, it is a “well-mixed population”115). Population stratification can lead to associations between genetic variants and traits that reflect social relationships rather than biological mechanisms. For example, if a genetic variant is more prevalent in a subpopulation that has higher risk of a disease, then the variant will be associated with disease risk. For this reason, it is recommended to adjust for genomic principal components when estimating genetic associations.116 Empirical investigations have provided supportive evidence that population stratification in curated datasets does not lead to substantially more significant associations than would be expected due to chance alone.9 However, investigations in UK Biobank have demonstrated that the assumption of a well-mixed population is violated for this dataset.117 Investigators should be aware of the possibility that genetic associations could be biased by population stratification in general, even for datasets that are not large enough for this assumption to be assessed. A further possibility is to conduct Mendelian randomization analyses within families118; this is particularly important for exposures that are socially patterned compared with those that are biologically determined.119
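The way population stratification can generate a spurious association, and how adjustment removes it, can be illustrated with a small simulation. The sketch below makes strong simplifying assumptions: there are two subpopulations with known labels (standing in for the genomic principal components used in practice), carrier status is binary, the outcome is continuous, and the variant has no effect on the outcome. All values and function names are hypothetical.

```python
import random


def mean(v):
    return sum(v) / len(v)


def simulate_structured_population(n_per_group=10000, seed=3):
    """Two subpopulations that differ in allele frequency and in baseline outcome."""
    rng = random.Random(seed)
    rows = []
    for group, allele_freq, baseline in ((0, 0.2, 0.0), (1, 0.6, 1.0)):
        for _ in range(n_per_group):
            carrier = 1 if rng.random() < allele_freq else 0
            outcome = baseline + rng.gauss(0, 1)  # the variant itself has no effect
            rows.append((group, carrier, outcome))
    return rows


def carrier_difference(rows):
    """Mean outcome in carriers minus mean outcome in non-carriers."""
    carriers = [y for _, c, y in rows if c == 1]
    non_carriers = [y for _, c, y in rows if c == 0]
    return mean(carriers) - mean(non_carriers)


rows = simulate_structured_population()
naive = carrier_difference(rows)
within = [carrier_difference([r for r in rows if r[0] == grp]) for grp in (0, 1)]
adjusted = sum(within) / len(within)
print("unadjusted association:", round(naive, 3))
print("association within subpopulations (averaged):", round(adjusted, 3))
```

The unadjusted comparison shows an association driven entirely by subpopulation membership, whereas the within-subpopulation comparison is close to zero.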
Although most Mendelian randomization investigations have been performed in European-descent populations, important investigations have been performed in other population groups. For lipoprotein-associated phospholipase A2 (Lp-PLA2), the existence of a common inactivating variant in the PLA2G7 region in South Asians enabled investigations to consider a far greater magnitude of change in Lp-PLA2 concentrations than would have been possible based on analyses in European-descent populations.120 Similarly, the common “alcohol flushing response” association in the ALDH2 region enables powerful analyses to investigate the effect of alcohol consumption in East Asians.121 There are no genetic variants in Europeans that explain a similar proportion of variance in the distribution of alcohol consumption.
Effect heterogeneity when comparing estimates between population groups can provide mechanistic insights as well as providing information on the potential benefit of intervention on the exposure in each population. Mendelian randomization analyses of metabolic traits on stroke risk in African ancestry individuals mirrored results in European ancestry individuals.122 However, an analysis considering the effect of lipid traits revealed evidence for a harmful effect of LDL cholesterol on type 2 diabetes risk in African ancestry individuals, which is in contrast to the protective effect in European ancestry individuals.123 This may reflect a stronger representation of lipoprotein-lipase-related mechanisms rather than LDL-receptor-related mechanisms in African ancestry individuals.124 As a further example, Mendelian randomization estimates for the effect of lipoprotein(a) on coronary heart disease risk were stronger for European ancestry individuals than for African ancestry individuals.125
A limitation of such analyses is lack of suitable ancestry-specific data for non-European ancestry populations. Allele frequencies and patterns of linkage disequilibrium may differ between populations, which has implications for Mendelian randomization and colocalization methods. For example, a genetic variant may have a pleiotropic association in one ancestry group (because of linkage disequilibrium with another functional variant) but not in another group. Low sample size has a dual effect on power in Mendelian randomization investigations: first, genetic associations with the outcome are less precise, and second, identified genetic variants typically explain less variability in the exposure. Researchers often face a choice between choosing genetic variants based on larger European ancestry datasets or based on smaller ancestry-specific datasets. The former option may lead to a greater number of selected variants, but there is no guarantee that genetic predictors of the exposure selected in Europeans will be the optimal predictors of the exposure in another ancestry group.126 However, efforts are underway to gather such data as well as to develop suitable tools for the analysis of non-European data.127
Predicting the magnitude of effect
Mendelian randomization typically compares genetically defined subgroups of the population that have different trajectories in their average level of the exposure since childhood. Hence, estimates typically reflect the impact of life-long differences in an exposure. However, it may be that the impact of sustained differences in an exposure differs from the impact of short-term interventions achievable in a clinical trial.128,129
When designing a clinical trial for lowering lipoprotein(a) levels, an important question is the trial inclusion criterion relating to lipoprotein(a) levels.130 The distribution of lipoprotein(a) concentration in European ancestry individuals is highly skewed, with median levels at around 30 mg/dL but as much as 1,000-fold differences between individuals.131 Recruiting individuals at the median lipoprotein(a) value would limit the maximum possible benefit of lipoprotein(a)-lowering therapies, as the maximum absolute reduction in lipoprotein(a) levels would be 30 mg/dL. In contrast, the maximum absolute reduction in lipoprotein(a) levels would be over three times greater for an individual with lipoprotein(a) levels of 100 mg/dL. Predicting the magnitude of effect of lipoprotein(a) lowering was critically important to the design of the trial in order to focus recruitment on individuals having a potential detectable benefit of lipoprotein(a) lowering and hence maximize power, given that any trial is finite in sample size.
Lipoprotein(a) concentration is particularly amenable to Mendelian randomization investigations, as it is highly heritable.132,133 Genetic variants having a wide range of magnitudes of association with lipoprotein(a) levels were able to demonstrate a log-linear relationship between the genetic association with lipoprotein(a) levels and the genetic association with coronary heart disease risk, suggesting that the potential benefit of intervention was proportional to the absolute change in lipoprotein(a) levels.76
To estimate the potential benefit of lipoprotein(a) lowering in a short-term trial, investigators compared Mendelian randomization estimates for LDL-cholesterol lowering to trial estimates for LDL cholesterol lowering.76 The same ratio of life-long to short-term estimates observed for LDL cholesterol lowering was assumed to hold for lipoprotein(a) lowering. This may be reasonable given their similarities as circulating lipid traits that are believed to causally affect cardiovascular risk. This enabled investigators to estimate the potential benefit of lipoprotein(a) lowering in a short-term trial by multiplying the Mendelian randomization estimate for lipoprotein(a) by this ratio. Consistent with these calculations, the HORIZON trial used a cut-off of 70 mg/dL for participant inclusion; only individuals with lipoprotein(a) levels above this threshold were invited to participate in the trial.134
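The scaling argument can be written out as a short calculation. The numbers below are placeholders chosen for illustration and are not the estimates reported in the cited study; the function name and the assumption that effects are expressed as log odds ratios per unit change are likewise hypothetical. The sketch also relies on the proportionality to absolute change described above.

```python
import math


def short_term_odds_ratio(mr_log_or_per_unit, units_lowered, short_to_lifelong_ratio):
    """Predicted trial odds ratio: life-long MR estimate scaled by the short-term ratio."""
    return math.exp(mr_log_or_per_unit * units_lowered * short_to_lifelong_ratio)


# Hypothetical LDL-cholesterol estimates, both per the same unit of lowering.
lifelong_ldl_log_or = math.log(0.50)  # from Mendelian randomization (placeholder)
trial_ldl_log_or = math.log(0.80)     # from randomized trials (placeholder)
ratio = trial_ldl_log_or / lifelong_ldl_log_or

# Hypothetical life-long estimate per 10 mg/dL lower lipoprotein(a).
mr_lpa_log_or_per_10 = math.log(0.95)

or_30 = short_term_odds_ratio(mr_lpa_log_or_per_10, 30 / 10, ratio)
or_100 = short_term_odds_ratio(mr_lpa_log_or_per_10, 100 / 10, ratio)
print("short-term to life-long ratio:", round(ratio, 3))
print("predicted odds ratio for 30 mg/dL lowering:", round(or_30, 3))
print("predicted odds ratio for 100 mg/dL lowering:", round(or_100, 3))
```

Under these placeholder inputs, the predicted benefit is larger for the greater absolute reduction, which is the reasoning behind restricting recruitment to individuals with high baseline levels.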
A limitation of this investigation is that it only considered lipoprotein(a) concentrations and not apolipoprotein(a) isoform size; a previous Mendelian randomization analysis indicated that apolipoprotein(a) isoform size is an independent causal risk factor for coronary artery disease.135 This is a potential reason why Mendelian randomization estimates for lipoprotein(a) concentration differ between European and African ancestry populations,125 as the distributions of apolipoprotein(a) isoform size and the number of kringle IV repeats differ between the ancestry groups.136
Understanding timings of interventions
Mendelian randomization analyses can investigate the impact of interventions on an exposure during critical time periods. A relatively under-studied population in the context of drug development efforts is women of childbearing age and pregnant women. Because of risks to the fetus, clinical study of drug effects in this population is more challenging, with the result that fewer effective therapies are available for pregnant women. To combat this, genetic data may be used to inform on the relative safety of interventions. For example, genetic evidence has suggested that blood pressure lowering may reduce risk of pre-eclampsia or eclampsia in pregnant women,137 a finding that was subsequently supported in evidence from clinical trials.138 More recently, genetic data have been used to investigate the comparative safety of beta-blocker and calcium-channel-blocker antihypertensive drugs in pregnancy, and the evidence suggested that beta-blocker effects may lower offspring birthweight.139
An extension of this notion is to consider how interventions on the exposure at different time points may affect the outcome. By understanding the relationship between time and causal effects, we can better consider how the timing of an intervention will impact clinical outcomes.
If genetic variants are available that are more strongly associated with the exposure at specific time points than others, then the values of the exposure at different time points can be used as separate risk factors in a multivariable Mendelian randomization analysis. For example, investigators have considered BMI measured during early life and during later life as separate risk factors and assessed whether genetically predicted values of early life and later life BMI were associated with coronary artery disease risk.140 A positive univariable association between genetically predicted early-life BMI and coronary artery disease risk was interpreted as evidence that early-life BMI is a causal risk factor for coronary artery disease. Lack of independent association between genetically predicted early-life BMI and coronary artery disease risk in a multivariable analysis that adjusted for genetically predicted later-life BMI provided evidence that early-life BMI does not have a direct effect on coronary artery disease risk. The effect of early-life BMI appears to be mediated via later-life BMI. This has translational relevance, as it implies that for a general population, bodyweight should be pharmacologically lowered in adult life for the purposes of reducing cardiovascular risk, and not in early life.
As a note of caution, such analyses are likely to suffer from model misspecification, as it is unlikely that disease risk is a discrete function of the exposure at the fixed time points considered in the multivariable model.141 We would therefore recommend that these analyses are only conducted when values of the exposure at different time points can reasonably be interpreted as distinct risk factors and not simply measurements of the same risk factor at different timepoints. Empirical investigations have shown that results in the latter case can be misleading, with no guarantee that estimates from multivariable Mendelian randomization correspond to the true pattern of the time-varying effect of the exposure.141
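The contrast between univariable and multivariable estimates for early-life and later-life exposures can be sketched with simulated summarized data. The example assumes equal standard errors for all outcome associations (so the weights drop out), regression through the origin, and a data-generating model in which only the later-life exposure affects the outcome; the effect sizes and function names are hypothetical.

```python
import random


def ivw_univariable(bx, by):
    """Slope of by on bx through the origin (equal weights assumed)."""
    return sum(a * c for a, c in zip(bx, by)) / sum(a * a for a in bx)


def ivw_multivariable(bx1, bx2, by):
    """Two-exposure regression through the origin, solved via the normal equations."""
    s11 = sum(a * a for a in bx1)
    s22 = sum(b * b for b in bx2)
    s12 = sum(a * b for a, b in zip(bx1, bx2))
    s1y = sum(a * c for a, c in zip(bx1, by))
    s2y = sum(b * c for b, c in zip(bx2, by))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(2)
b_early, b_late, b_out = [], [], []
for _ in range(200):                          # 200 simulated variants
    e = rng.gauss(0, 0.05)                    # association with early-life exposure
    l = 0.6 * e + rng.gauss(0, 0.04)          # correlated association with later-life exposure
    o = 0.4 * l + rng.gauss(0, 0.005)         # outcome affected via later-life exposure only
    b_early.append(e)
    b_late.append(l)
    b_out.append(o)

uni_early = ivw_univariable(b_early, b_out)
mv_early, mv_late = ivw_multivariable(b_early, b_late, b_out)
print("univariable estimate, early life:", round(uni_early, 3))
print("multivariable estimates (early, later):", round(mv_early, 3), round(mv_late, 3))
```

In this simulation the early-life exposure has a positive univariable estimate that attenuates toward zero once the later-life exposure is included, mirroring the pattern described for BMI and coronary artery disease; the caution about model misspecification applies equally to such sketches.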
A further potential area of investigation is effects on disease progression or disease survival. While Mendelian randomization analyses have suggested that BMI is a protective risk factor for breast cancer,142,143 separate analyses have indicated that BMI is a harmful risk factor for breast cancer progression.144 Mendelian randomization analyses for disease progression are difficult to implement, as there is a natural selection event that may induce bias: disease progression can only be measured in individuals who have had an initial disease event.145,146 If the exposure influences the risk of the disease, then this can lead to collider bias.45 Additionally, disease progression cohorts are typically older and less healthy, which can lead to substantial survival bias. Few GWASs of disease progression are available.147 However, the increasing availability of large biobank cohorts of individuals who are (relatively) disease free at baseline and have extensive follow-up data makes this potentially fertile ground for future investigations.148
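The selection mechanism behind collider bias in progression analyses can be illustrated by simulation. The sketch below assumes a genetically predicted exposure `g` that affects disease incidence but not progression, and an unmeasured factor `u` that affects both; the liability threshold, effect sizes, and function name are arbitrary choices made for illustration.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(5)
everyone, cases = [], []
for _ in range(40000):
    g = rng.gauss(0, 1)                  # genetically predicted exposure
    u = rng.gauss(0, 1)                  # unmeasured factor for incidence and progression
    liability = 0.8 * g + 0.8 * u + rng.gauss(0, 1)
    progression = u + rng.gauss(0, 1)    # g has no effect on progression
    everyone.append((g, progression))
    if liability > 1.0:                  # progression is only observable in cases
        cases.append((g, progression))

slope_all = slope(*zip(*everyone))
slope_cases = slope(*zip(*cases))
print("association in the whole population:", round(slope_all, 3))
print("association among cases only:", round(slope_cases, 3))
```

Although `g` has no effect on progression, restricting to cases induces a negative association, because cases with low genetically predicted exposure tend to have high values of the unmeasured factor.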
Limitations and future directions
The limitations of Mendelian randomization have been widely discussed since the conception of the approach.149,150 We focus here on limitations of Mendelian randomization for testing a causal hypothesis and those particularly relevant to drug discovery and development; several other limitations have already been discussed in the relevant sections. As per “predicting the magnitude of effect”, there are many additional reasons why Mendelian randomization is not well suited for estimation of the magnitude of a causal effect (principally, it considers life-long variability in traits due to genetic variation, whereas trials assess the impact of short-term changes128,129).
Causality cannot be demonstrated directly from observational data. Any causal claim must rely on an untestable assumption. In the case of Mendelian randomization, we assume that genetic variants used in the analysis satisfy the assumptions of an instrumental variable: they are associated with the exposure, not associated with the outcome via a confounding pathway, and can only influence the outcome via their effect on the exposure. As an alternative phrasing of these assumptions, we assume the genetic variants are distributed “as if randomly” in the population (that is, independently of competing risk factors) and gene-environment equivalence (that is, the result of inheriting a genetic variant is qualitatively equivalent to the proposed intervention that we are assessing). If any of these assumptions are violated, Mendelian randomization investigations can be misleading.
For some exposures and corresponding interventions, we have plausible genetic instruments. For others, either we do not, or else there is uncertainty in the extent to which the genetic variants mimic the intervention. For example, glycated hemoglobin and urinary sodium excretion can be considered as sentinel traits (positive control effects) for sodium-glucose co-transporter-2 (SGLT2) inhibitors; that is, any variant that mimics SGLT2 inhibitors should be associated with both glycated hemoglobin and urinary sodium levels. A recent investigation of variants in the SLC5A2 region did not find any common variants that were associated with both glycated hemoglobin and urinary sodium excretion and so could be used as instruments in Mendelian randomization analyses.151 As a further example, the precise mechanism by which metformin exerts effects on glycemic control is unclear. While a recent investigation considered genetic variants relating to levels of postulated downstream protein targets of metformin,152 the direct relevance of this analysis to the impact of metformin in clinical practice is uncertain. In other cases, it may be unclear what aspect of intervention on the exposure is proxied by the genetic variants. For example, HDL particles are heterogeneous in their protein and enzymatic content. The extent to which this function is captured by genetic predictors of HDL cholesterol concentrations is unclear. Some drug targets, such as the target of calcium channel blockers, are made up of more than one protein. Again, the extent to which the totality of the drug effect is mimicked by genetic variants in regions relating to individual protein targets is questionable. Further still, many neuropsychiatric drugs have wide-ranging effects on a variety of targets.
In isolation, Mendelian randomization investigations into drug targets may be of limited value for unraveling the optimal mode of action for a pharmacological intervention. Specifically, if an instrument for drug target perturbation relates to risk of a particular disease, further functional insights will be necessary to uncover the mechanism by which this effect may be occurring. While additional Mendelian randomization analyses, including exploration of potential mediating pathways, may be informative in this respect, such genetic analyses must typically still be supplemented with experimental approaches that offer complementary insights.
We note that a positive finding from a well-conducted Mendelian randomization analysis is no guarantee that intervention on the corresponding pathway will yield a successful drug. For example, interventions on the pathway may have undesired adverse effects or may not result in sufficient clinical benefit for the target disease. A relevant example is CETP inhibition; although variants in the CETP region are associated with lower coronary heart disease risk153 and trials of anacetrapib demonstrated reduced incidence of major coronary events,154 these findings have so far been insufficient for the drug to be employed in routine clinical practice.
In conclusion, the availability of large-scale genetic data has created unprecedented opportunity to offer insight for drug target identification and consequent clinical development. In order to maximize the potential of these data, it is equally important that appropriate methods be used to capitalize on these learnings. There are many ways that genetic variants can be used to provide evidence not only on the causal nature of a target, but also focused evidence that addresses questions of translational relevance and guides the drug development process. With the advent of publicly available data from GWASs as well as large population-based cohort studies with concomitant phenotypic and genetic data (such as UK Biobank155), the limiting factor for such investigations is often not data availability but analyst skill and time. As the techniques discussed in this review become more familiar to investigators, we hope to see their application becoming routine in all stages of the drug development pipeline.
Acknowledgments
This research was funded by United Kingdom Research and Innovation Medical Research Council (MC_UU_00002/7 and MC_UU_00011/3) and was supported by the National Institute for Health Research Cambridge Biomedical Research Centre (BRC-1215-20014). S.B. is supported by the Wellcome Trust (225790/Z/22/Z). The views expressed are those of the authors and not necessarily those of the National Health Service, the National Institute for Health Research, or the Department of Health and Social Care. D.G. is supported by the British Heart Foundation Center of Research Excellence at Imperial College London (RE/18/4/34215).
Declaration of interests
W.G.H., G.K.H., L.B.K., and D.G. are employed by Novo Nordisk.
Data and code availability
Most methods listed in this review can be implemented with summarized genetic associations,156 that is, the beta coefficients and standard errors from univariable regression on each genetic variant in turn. Such data have been made publicly available by many large GWAS consortia157 and for large population-based biobanks.158,159,160 These methods can be implemented with the MendelianRandomization161 and TwoSampleMR16 packages for R. An exception is non-linear Mendelian randomization, which requires individual participant data.162 A full list of software packages and links is provided in Table 2.
Table 2.
List of software packages for Mendelian randomization (MR) and related methods discussed in this review (this is not a comprehensive list of all MR methods but covers the methods most commonly used in the literature)
Approach | Package name | Weblink |
---|---|---|
Multiple methods | MendelianRandomization | https://cran.r-project.org/web/packages/MendelianRandomization/ |
Multiple methods | TwoSampleMR | https://github.com/MRCIEU/TwoSampleMR |
Outlier-robust estimation | MR-PRESSO | https://github.com/rondolab/MR-PRESSO |
Colocalization | coloc | https://github.com/chr1swallace/coloc |
Rare variant burden testing | SKAT | https://cran.r-project.org/web/packages/SKAT/ |
Rare variant burden testing | regenie | https://rgcgithub.github.io/regenie/ |
Robust estimation for multivariable MR | Robust MVMR | https://github.com/aj-grant/robust-mvmr |
Variable selection | MR-BMA | https://github.com/verena-zuber/demo_AMD |
Factor-based cis-MR | con-cis-MR | https://github.com/ash-res/con-cis-MR |
Network cis-MR | TwoStepCisMR | https://github.com/bar-woolf/TwoStepCisMR/wiki |
Clustering variants based on outcome associations | MRClust | https://github.com/cnfoley/mrclust |
Clustering variants based on trait associations | NAvMix | https://github.com/aj-grant/navmix |
Non-linear residual method (individual-level data) | nlmr | https://github.com/jrs95/nlmr |
Non-linear residual and doubly ranked methods | SUMnlmr | https://github.com/amymariemason/SUMnlmr |
Non-linear polynomial method | PolyMR | https://github.com/JonSulc/PolyMR |
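
As a rough illustration of what summarized genetic associations allow, the sketch below computes per-variant ratio estimates and a fixed-effect inverse-variance weighted (IVW) estimate from beta coefficients and standard errors alone. It is a minimal sketch, not the implementation used by the MendelianRandomization or TwoSampleMR packages: the function names and the numeric inputs are hypothetical, and it assumes uncorrelated variants, valid instruments, and first-order standard errors that ignore uncertainty in the variant-exposure associations.

```python
import math


def wald_ratios(beta_exposure, beta_outcome, se_outcome):
    """Per-variant ratio estimates with first-order standard errors.

    Assumption: uncertainty in the variant-exposure associations is ignored,
    so each standard error is se_outcome / |beta_exposure|.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) == 0:
        raise ValueError("at least one genetic variant is required")
    if any(bx == 0 for bx in beta_exposure):
        raise ValueError("variant-exposure associations must be non-zero")

    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    ses = [sy / abs(bx) for bx, sy in zip(beta_exposure, se_outcome)]
    return ratios, ses


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate, assuming uncorrelated variants."""
    ratios, ses = wald_ratios(beta_exposure, beta_outcome, se_outcome)
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


if __name__ == "__main__":
    # Hypothetical summarized associations for three independent variants.
    bx = [0.10, 0.20, 0.15]    # variant-exposure beta coefficients
    by = [0.05, 0.10, 0.075]   # variant-outcome beta coefficients
    sy = [0.01, 0.02, 0.015]   # variant-outcome standard errors

    estimate, standard_error = ivw_estimate(bx, by, sy)
    lower = estimate - 1.96 * standard_error
    upper = estimate + 1.96 * standard_error
    print(f"IVW estimate: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

For cis-Mendelian randomization with correlated variants in a single gene region, this simple weighting would not be appropriate without accounting for linkage disequilibrium, and the dedicated packages listed in Table 2 should be preferred for any real analysis.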
References
- 1.Plenge R.M., Scolnick E.M., Altshuler D. Validating therapeutic targets through human genetics. Nat. Rev. Drug Discov. 2013;12:581–594. doi: 10.1038/nrd4051. [DOI] [PubMed] [Google Scholar]
- 2.Thanassoulis G., O'Donnell C.J. Mendelian randomization: nature's randomized trial in the post-genome era. JAMA. 2009;301:2386–2388. doi: 10.1001/jama.2009.812. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Smith G.D., Ebrahim S. 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 2003;32:1–22. doi: 10.1093/ije/dyg070. [DOI] [PubMed] [Google Scholar]
- 4.Gkatzionis A., Burgess S., Conti D.V., Newcombe P.J. Bayesian variable selection with a pleiotropic loss function in Mendelian randomization. Stat. Med. 2021;40:5025–5045. doi: 10.1002/sim.9109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Gill D., Georgakis M.K., Walker V.M., Schmidt A.F., Gkatzionis A., Freitag D.F., Finan C., Hingorani A.D., Howson J.M.M., Burgess S., et al. Mendelian randomization for studying the effects of perturbing drug targets. Wellcome Open Res. 2021;6:16. doi: 10.12688/wellcomeopenres.16544.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Schmidt A.F., Finan C., Gordillo-Marañón M., Asselbergs F.W., Freitag D.F., Patel R.S., Tyl B., Chopade S., Faraway R., Zwierzyna M., Hingorani A.D. Genetic drug target validation using Mendelian randomisation. Nat. Commun. 2020;11:3255. doi: 10.1038/s41467-020-16969-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Holmes M.V., Richardson T.G., Ference B.A., Davies N.M., Davey Smith G. Integrating genomics with biomarkers and therapeutic targets to invigorate cardiovascular drug development. Nat. Rev. Cardiol. 2021;18:435–453. doi: 10.1038/s41569-020-00493-1. [DOI] [PubMed] [Google Scholar]
- 8.Smith G.D., Lawlor D.A., Harbord R., Timpson N., Day I., Ebrahim S. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Med. 2007;4:e352. doi: 10.1371/journal.pmed.0040352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Taylor M., Tansey K.E., Lawlor D.A., Bowden J., Evans D., Davey Smith G., Timpson N. Testing the principles of Mendelian randomization: Opportunities and complications on a genomewide scale. Preprint at bioRxiv. 2017. doi: 10.1101/124362. [DOI] [Google Scholar]
- 10.Davey Smith G., Ebrahim S. What can mendelian randomisation tell us about modifiable behavioural and environmental exposures? BMJ. 2005;330:1076–1079. doi: 10.1136/bmj.330.7499.1076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Burgess S., Swanson S.A., Labrecque J.A. Are Mendelian randomization investigations immune from bias due to reverse causation? Eur. J. Epidemiol. 2021;36:253–257. doi: 10.1007/s10654-021-00726-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Burgess S., O’Donnell C.J., Gill D. Expressing results from a Mendelian randomization analysis: separating results from inferences. JAMA Cardiol. 2021;6:7–8. doi: 10.1001/jamacardio.2020.4317. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Hingorani A.D., Kuan V., Finan C., Kruger F.A., Gaulton A., Chopade S., Sofat R., MacAllister R.J., Overington J.P., Hemingway H., et al. Improving the odds of drug development success through human genomics: modelling study. Sci. Rep. 2019;9:18911. doi: 10.1038/s41598-019-54849-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Nelson M.R., Tipney H., Painter J.L., Shen J., Nicoletti P., Shen Y., Floratos A., Sham P.C., Li M.J., Wang J., et al. The support of human genetic evidence for approved drug indications. Nat. Genet. 2015;47:856–860. doi: 10.1038/ng.3314. [DOI] [PubMed] [Google Scholar]
- 15.King E.A., Davis J.W., Degner J.F. Are drug targets with genetic support twice as likely to be approved? Revised estimates of the impact of genetic support for drug mechanisms on the probability of drug approval. PLoS Genet. 2019;15:e1008489. doi: 10.1371/journal.pgen.1008489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Hemani G., Zheng J., Elsworth B., Wade K.H., Haberland V., Baird D., Laurin C., Burgess S., Bowden J., Langdon R., et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408. doi: 10.7554/eLife.34408. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Ochoa D., Karim M., Ghoussaini M., Hulcoop D.G., McDonagh E.M., Dunham I. Human genetics evidence supports two-thirds of the 2021 FDA-approved drugs. Nat. Rev. Drug Discov. 2022;21:551. doi: 10.1038/d41573-022-00120-3. [DOI] [PubMed] [Google Scholar]
- 18.Mountjoy E., Schmidt E.M., Carmona M., Schwartzentruber J., Peat G., Miranda A., Fumis L., Hayhurst J., Buniello A., Karim M.A., et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 2021;53:1527–1533. doi: 10.1038/s41588-021-00945-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Wouters O.J., McKee M., Luyten J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009-2018. JAMA. 2020;323:844–853. doi: 10.1001/jama.2020.1166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Burgess S., Davey Smith G., Davies N.M., Dudbridge F., Gill D., Glymour M.M., Hartwig F.P., Holmes M.V., Minelli C., Relton C.L., Theodoratou E. Guidelines for performing Mendelian randomization investigations [version 2; peer review: 2 approved] Wellcome Open Res. 2019;4:186. doi: 10.12688/wellcomeopenres.15555.2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Gill D., Burgess S. Use of a Genetic Variant Related to Circulating FXa (Activated Factor X) Levels to Proxy the Effect of FXa Inhibition on Cardiovascular Outcomes. Circ. Genom. Precis. Med. 2020;13:551–553. doi: 10.1161/CIRCGEN.120.003061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Bonaca M.P., Bauersachs R.M., Anand S.S., Debus E.S., Nehler M.R., Patel M.R., Fanelli F., Capell W.H., Diao L., Jaeger N., et al. Rivaroxaban in Peripheral Artery Disease after Revascularization. N. Engl. J. Med. 2020;382:1994–2004. doi: 10.1056/NEJMoa2000052. [DOI] [PubMed] [Google Scholar]
- 23.Boef A.G.C., Dekkers O.M., le Cessie S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int. J. Epidemiol. 2015;44:496–511. doi: 10.1093/ije/dyv071. [DOI] [PubMed] [Google Scholar]
- 24.Burgess S., Dudbridge F., Thompson S.G. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat. Med. 2016;35:1880–1906. doi: 10.1002/sim.6835. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Brion M.J.A., Shakhbazov K., Visscher P.M. Calculating statistical power in Mendelian randomization studies. Int. J. Epidemiol. 2013;42:1497–1501. doi: 10.1093/ije/dyt179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Burgess S., Zuber V., Valdes-Marquez E., Sun B.B., Hopewell J.C. Mendelian randomization with fine-mapped genetic data: Choosing from large numbers of correlated instrumental variables. Genet. Epidemiol. 2017;41:714–725. doi: 10.1002/gepi.22077. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Patel A., Gill D., Newcombe P.J., Burgess S. Conditional inference in cis-Mendelian randomization using weak genetic factors. Preprint at arXiv. 2020. doi: 10.48550/ARXIV.2005.01765. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Dudbridge F., Newcombe P.J. Accuracy of Gene Scores when Pruning Markers by Linkage Disequilibrium. Hum. Hered. 2015;80:178–186. doi: 10.1159/000446581. [DOI] [PubMed] [Google Scholar]
- 29.Verbanck M., Chen C.-Y., Neale B., Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 2018;50:693–698. doi: 10.1038/s41588-018-0099-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Slob E.A.W., Burgess S. A comparison of robust Mendelian randomization methods using summary data. Genet. Epidemiol. 2020;44:313–329. doi: 10.1002/gepi.22295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Mokry L.E., Ross S., Ahmad O.S., Forgetta V., Smith G.D., Goltzman D., Leong A., Greenwood C.M.T., Thanassoulis G., Richards J.B. Vitamin D and risk of multiple sclerosis: a Mendelian randomization study. PLoS Med. 2015;12:e1001866. doi: 10.1371/journal.pmed.1001866. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Hormozdiari F., van de Bunt M., Segrè A.V., Li X., Joo J.W.J., Bilow M., Sul J.H., Sankararaman S., Pasaniuc B., Eskin E. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 2016;99:1245–1260. doi: 10.1016/j.ajhg.2016.10.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Zuber V., Grinberg N.F., Gill D., Manipur I., Slob E.A.W., Patel A., Wallace C., Burgess S. Combining evidence from Mendelian randomization and colocalization: review and comparison of approaches. Am. J. Hum. Genet. 2022;109:767–782. doi: 10.1016/j.ajhg.2022.04.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Wilding J.P.H., Batterham R.L., Calanna S., Davies M., Van Gaal L.F., Lingvay I., McGowan B.M., Rosenstock J., Tran M.T.D., Wadden T.A., et al. Once-weekly semaglutide in adults with overweight or obesity. N. Engl. J. Med. 2021;384:989–1002. doi: 10.1056/NEJMoa2032183. [DOI] [PubMed] [Google Scholar]
- 35.Giambartolomei C., Vukcevic D., Schadt E.E., Franke L., Hingorani A.D., Wallace C., Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Gill D., Georgakis M.K., Laffan M., Sabater-Lleal M., Malik R., Tzoulaki I., Veltkamp R., Dehghan A. Genetically determined FXI (factor XI) levels and risk of stroke. Stroke. 2018;49:2761–2763. doi: 10.1161/STROKEAHA.118.022792. [DOI] [PubMed] [Google Scholar]
- 37.Namba S., Konuma T., Wu K.-H., Zhou W., Global Biobank Meta-analysis Initiative, Okada Y. A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. Preprint at medRxiv. 2021. doi: 10.1101/2021.12.03.21267280. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Wallace C. A more accurate method for colocalisation analysis allowing for multiple causal variants. PLoS Genet. 2021;17:e1009440. doi: 10.1371/journal.pgen.1009440. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Büller H.R., Bethune C., Bhanot S., Gailani D., Monia B.P., Raskob G.E., Segers A., Verhamme P., Weitz J.I., FXI-ASO TKA Investigators. Factor XI antisense oligonucleotide for prevention of venous thrombosis. N. Engl. J. Med. 2015;372:232–240. doi: 10.1056/NEJMoa1405760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Mayerhofer E., Malik R., Parodi L., Burgess S., Harloff A., Dichgans M., Rosand J., Anderson C.D., Georgakis M.K. Genetically predicted on-statin LDL response is associated with higher intracerebral haemorrhage risk. Brain. 2022;145:2677–2686. doi: 10.1093/brain/awac186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Baigent C., Keech A., Kearney P.M., Blackwell L., Buck G., Pollicino C., Kirby A., Sourjina T., Peto R., Collins R., et al. Efficacy and safety of cholesterol-lowering treatment: prospective meta-analysis of data from 90,056 participants in 14 randomised trials of statins. Lancet. 2005;366:1267–1278. doi: 10.1016/s0140-6736(05)67394-1. [DOI] [PubMed] [Google Scholar]
- 42.Vergouwen M.D.I., De Haan R.J., Vermeulen M., Roos Y.B.W.E.M. Statin Treatment and the Occurrence of Hemorrhagic Stroke in Patients With a History of Cerebrovascular Disease. Stroke. 2008;39:497–502. doi: 10.1161/strokeaha.107.488791. [DOI] [PubMed] [Google Scholar]
- 43.Allara E., Morani G., Carter P., Gkatzionis A., Zuber V., Foley C.N., Rees J.M.B., Mason A.M., Bell S., Gill D., et al. Genetic Determinants of Lipids and Cardiovascular Disease Outcomes: A Wide-Angled Mendelian Randomization Investigation. Circ. Genom. Precis. Med. 2019;12:e002711. doi: 10.1161/CIRCGEN.119.002711. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Lawlor D.A., Tilling K., Davey Smith G. Triangulation in aetiological epidemiology. Int. J. Epidemiol. 2016;45:1866–1886. doi: 10.1093/ije/dyw314. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Gkatzionis A., Burgess S. Contextualizing selection bias in Mendelian randomization: how bad is it likely to be? Int. J. Epidemiol. 2019;48:691–701. doi: 10.1093/ije/dyy202. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Asimit J., Zeggini E. Rare Variant Association Analysis Methods for Complex Traits. Annu. Rev. Genet. 2010;44:293–308. doi: 10.1146/annurev-genet-102209-163421. [DOI] [PubMed] [Google Scholar]
- 47.Lee S., Emond M.J., Bamshad M.J., Barnes K.C., Rieder M.J., Nickerson D.A., NHLBI GO Exome Sequencing Project—ESP Lung Project Team. Christiani D.C., Wurfel M.M., Lin X. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am. J. Hum. Genet. 2012;91:224–237. doi: 10.1016/j.ajhg.2012.06.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Wang Q., Dhindsa R.S., Carss K., Harper A.R., Nag A., Tachmazidou I., Vitsios D., Deevi S.V.V., Mackay A., Muthas D., et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature. 2021;597:527–532. doi: 10.1038/s41586-021-03855-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Verweij N., Haas M.E., Nielsen J.B., Sosina O.A., Kim M., Akbari P., De T., Hindy G., Bovijn J., Persaud T., et al. Germline Mutations in CIDEB and Protection against Liver Disease. N. Engl. J. Med. 2022;387:332–344. doi: 10.1056/NEJMoa2117872. [DOI] [PubMed] [Google Scholar]
- 50.Mbatchou J., Barnard L., Backman J., Marcketta A., Kosmicki J.A., Ziyatdinov A., Benner C., O'Dushlaine C., Barber M., Boutkov B., et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 2021;53:1097–1103. doi: 10.1038/s41588-021-00870-7. [DOI] [PubMed] [Google Scholar]
- 51.Povysil G., Petrovski S., Hostyk J., Aggarwal V., Allen A.S., Goldstein D.B. Rare-variant collapsing analyses for complex traits: guidelines and applications. Nat. Rev. Genet. 2019;20:747–759. doi: 10.1038/s41576-019-0177-4. [DOI] [PubMed] [Google Scholar]
- 52.Swerdlow D.I., Kuchenbaecker K.B., Shah S., Sofat R., Holmes M.V., White J., Mindell J.S., Kivimaki M., Brunner E.J., Whittaker J.C., et al. Selecting instruments for Mendelian randomization in the wake of genome-wide association studies. Int. J. Epidemiol. 2016;45:1600–1616. doi: 10.1093/ije/dyw088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Burgess S., Thompson S.G. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 2015;181:251–260. doi: 10.1093/aje/kwu283. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Porcu E., Rüeger S., Lepik K., eQTLGen Consortium. BIOS Consortium. Santoni F.A., Reymond A., Kutalik Z. Mendelian randomization integrating GWAS and eQTL data reveals genetic determinants of complex and clinical traits. Nat. Commun. 2019;10:3300. doi: 10.1038/s41467-019-10936-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Sanderson E., Davey Smith G., Windmeijer F., Bowden J. An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int. J. Epidemiol. 2019;48:713–727. doi: 10.1093/ije/dyy262. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Carter A.R., Sanderson E., Hammerton G., Richmond R.C., Davey Smith G., Heron J., Taylor A.E., Davies N.M., Howe L.D. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur. J. Epidemiol. 2021;36:465–478. doi: 10.1007/s10654-021-00757-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Burgess S., Freitag D.F., Khan H., Gorman D.N., Thompson S.G. Using multivariable Mendelian randomization to disentangle the causal effects of lipid fractions. PLoS One. 2014;9:e108891. doi: 10.1371/journal.pone.0108891. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Zuber V., Colijn J.M., Klaver C., Burgess S. Selecting likely causal risk factors from high-throughput experiments using multivariable Mendelian randomization. Nat. Commun. 2020;11:29. doi: 10.1038/s41467-019-13870-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Zuber V., Gill D., Ala-Korpela M., Langenberg C., Butterworth A., Bottolo L., Burgess S. High-throughput multivariable Mendelian randomization analysis prioritizes apolipoprotein B as key lipid risk factor for coronary artery disease. Int. J. Epidemiol. 2021;50:893–901. doi: 10.1093/ije/dyaa216. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Marston N.A., Giugliano R.P., Melloni G.E.M., Park J.G., Morrill V., Blazing M.A., Ference B., Stein E., Stroes E.S., Braunwald E., et al. Association of Apolipoprotein B–Containing Lipoproteins and Risk of Myocardial Infarction in Individuals With and Without Atherosclerosis: Distinguishing Between Particle Concentration, Type, and Content. JAMA Cardiol. 2022;7:250–256. doi: 10.1001/jamacardio.2021.5083. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Levin M.G., Zuber V., Walker V.M., Klarin D., Lynch J., Malik R., Aday A.W., Bottolo L., Pradhan A.D., Dichgans M., et al. Prioritizing the Role of Major Lipoproteins and Subfractions as Risk Factors for Peripheral Artery Disease. Circulation. 2021;144:353–364. doi: 10.1161/circulationaha.121.053797. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Hägg S., Fall T., Ploner A., Mägi R., Fischer K., Draisma H.H.M., Kals M., de Vries P.S., Dehghan A., Willems S.M., et al. Adiposity as a cause of cardiovascular disease: a Mendelian randomization study. Int. J. Epidemiol. 2015;44:578–586. doi: 10.1093/ije/dyv094. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Gill D., Zuber V., Dawson J., Pearson-Stuttard J., Carter A.R., Sanderson E., Karhunen V., Levin M.G., Wootton R.E., Klarin D., et al. Risk factors mediating the effect of body mass index and waist-to-hip ratio on cardiovascular outcomes: Mendelian randomization analysis. Int. J. Obes. 2021;45:1428–1438. doi: 10.1038/s41366-021-00807-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Burgess S., Thompson D.J., Rees J.M.B., Day F.R., Perry J.R., Ong K.K. Dissecting causal pathways using Mendelian randomization with summarized genetic data: application to age at menarche and risk of breast cancer. Genetics. 2017;207:481–487. doi: 10.1534/genetics.117.300191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Timms A.E., Crane A.M., Sims A.-M., Cordell H.J., Bradbury L.A., Abbott A., Coyne M.R.E., Beynon O., Herzberg I., Duff G.W., et al. The interleukin 1 gene cluster contains a major susceptibility locus for ankylosing spondylitis. Am. J. Hum. Genet. 2004;75:587–595. doi: 10.1086/424695. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Zhu G., Whyte M.K.B., Vestbo J., Carlsen K., Carlsen K.H., Lenney W., Silverman M., Helms P., Pillai S.G. Interleukin 18 receptor 1 gene polymorphisms are associated with asthma. Eur. J. Hum. Genet. 2008;16:1083–1090. doi: 10.1038/ejhg.2008.67. [DOI] [PubMed] [Google Scholar]
- 67.Batool F., Patel A., Gill D., Burgess S. Disentangling the effects of traits with shared clustered genetic predictors using multivariable Mendelian randomization. Genet. Epidemiol. 2022;46:415–429. doi: 10.1002/gepi.22462. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Zheng J., Haberland V., Baird D., Walker V., Haycock P.C., Hurle M.R., Gutteridge A., Erola P., Liu Y., Luo S., et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 2020;52:1122–1131. doi: 10.1038/s41588-020-0682-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Millard L.A.C., Davies N.M., Tilling K., Gaunt T.R., Davey Smith G. Searching for the causal effects of body mass index in over 300 000 participants in UK Biobank, using Mendelian randomization. PLoS Genet. 2019;15:e1007951. doi: 10.1371/journal.pgen.1007951. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Interleukin-6 Receptor Mendelian Randomisation Analysis IL6R MR Consortium. Swerdlow D.I., Holmes M.V., Kuchenbaecker K.B., Engmann J.E.L., Shah T., Sofat R., Guo Y., Chung C., Peasey A., et al. The interleukin-6 receptor as a target for prevention of coronary heart disease: a mendelian randomisation analysis. Lancet. 2012;379:1214–1224. doi: 10.1016/s0140-6736(12)60110-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.IL6R Genetics Consortium Emerging Risk Factors Collaboration. Sarwar N., Butterworth A.S., Freitag D.F., Gregson J., Willeit P., Gorman D.N., Gao P., Saleheen D., Rendon A., et al. Interleukin-6 receptor pathways in coronary heart disease: a collaborative meta-analysis of 82 studies. Lancet. 2012;379:1205–1213. doi: 10.1016/s0140-6736(11)61931-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Larsson S.C., Burgess S., Gill D. Genetically proxied interleukin-6 receptor inhibition: opposing associations with COVID-19 and pneumonia. Eur. Respir. J. 2021;57:2003545. doi: 10.1183/13993003.03545-2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Bovijn J., Lindgren C.M., Holmes M.V. Genetic variants mimicking therapeutic inhibition of IL-6 receptor signaling and risk of COVID-19. Lancet. Rheumatol. 2020;2:e658–e659. doi: 10.1016/s2665-9913(20)30345-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Abani O., Abbas A., Abbas F., Abbas M., Abbasi S., Abbass H., Abbott A., Abdallah N., Abdelaziz A., Abdelfattah M., et al. Tocilizumab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial. Lancet. 2021;397:1637–1645. doi: 10.1016/S0140-6736(21)00676-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Viney N.J., van Capelleveen J.C., Geary R.S., Xia S., Tami J.A., Yu R.Z., Marcovina S.M., Hughes S.G., Graham M.J., Crooke R.M., et al. Antisense oligonucleotides targeting apolipoprotein(a) in people with raised lipoprotein(a): two randomised, double-blind, placebo-controlled, dose-ranging trials. Lancet. 2016;388:2239–2253. doi: 10.1016/s0140-6736(16)31009-1. [DOI] [PubMed] [Google Scholar]
- 76.Burgess S., Ference B.A., Staley J.R., Freitag D.F., Mason A.M., Nielsen S.F., Willeit P., Young R., Surendran P., Karthikeyan S., et al. Association of LPA Variants With Risk of Coronary Disease and the Implications for Lipoprotein(a)-Lowering Therapies: A Mendelian Randomization Analysis. JAMA Cardiol. 2018;3:619–627. doi: 10.1001/jamacardio.2018.1470. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Larsson S.C., Gill D., Mason A.M., Jiang T., Bäck M., Butterworth A.S., Burgess S. Lipoprotein(a) in Alzheimer, Atherosclerotic, Cerebrovascular, Thrombotic, and Valvular Disease. Circulation. 2020;141:1826–1828. doi: 10.1161/CIRCULATIONAHA.120.045826. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Burgess S., Davey Smith G. Mendelian Randomization Implicates High-Density Lipoprotein Cholesterol-Associated Mechanisms in Etiology of Age-Related Macular Degeneration. Ophthalmology. 2017;124:1165–1174. doi: 10.1016/j.ophtha.2017.03.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Nordestgaard L.T., Christoffersen M., Lauridsen B.K., Afzal S., Nordestgaard B.G., Frikke-Schmidt R., Tybjærg-Hansen A. Long-term Benefits and Harms Associated With Genetic Cholesteryl Ester Transfer Protein Deficiency in the General Population. JAMA Cardiol. 2022;7:55–64. doi: 10.1001/jamacardio.2021.3728. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Nicholls S.J., Ditmarsch M., Kastelein J.J., Rigby S.P., Kling D., Curcio D.L., Alp N.J., Davidson M.H. Lipid lowering effects of the CETP inhibitor obicetrapib in combination with high-intensity statins: a randomized phase 2 trial. Nat. Med. 2022;28:1672–1678. doi: 10.1038/s41591-022-01936-7. [DOI] [PubMed] [Google Scholar]
- 81.C Reactive Protein Coronary Heart Disease Genetics Collaboration CCGC. Wensley F., Gao P., Burgess S., Kaptoge S., Di Angelantonio E., Shah T., Engert J.C., Clarke R., Davey-Smith G., et al. Association between C reactive protein and coronary heart disease: mendelian randomisation analysis based on individual participant data. BMJ. 2011;342:d548. doi: 10.1136/bmj.d548. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Foley C.N., Mason A.M., Kirk P.D.W., Burgess S. Clustering of genetic variants in Mendelian randomization with similar causal estimates. Bioinformatics. 2021;37:531–541. doi: 10.1093/bioinformatics/btaa778. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Udler M.S., Kim J., von Grotthuss M., Bonàs-Guarch S., Cole J.B., Chiou J., Christopher D. Anderson on behalf of METASTROKE and the ISGC. Boehnke M., Laakso M., Atzmon G., et al. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: A soft clustering analysis. PLoS Med. 2018;15:e1002654. doi: 10.1371/journal.pmed.1002654. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Burgess S., Foley C.N., Allara E., Staley J.R., Howson J.M.M. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat. Commun. 2020;11:376. doi: 10.1038/s41467-019-14156-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Thompson W.D., Beaumont R.N., Kuang A., Warrington N.M., Ji Y., Tyrrell J., Wood A.R., Scholtens D.M., Knight B.A., Evans D.M., et al. Higher maternal adiposity reduces offspring birthweight if associated with a metabolically favourable profile. Diabetologia. 2021;64:2790–2802. doi: 10.1007/s00125-021-05570-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Grant A.J., Gill D., Kirk P.D.W., Burgess S. Noise-augmented directional clustering of genetic association data identifies distinct mechanisms underlying obesity. PLoS Genet. 2022;18:e1009975. doi: 10.1371/journal.pgen.1009975. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Yaghootkar H., Lotta L.A., Tyrrell J., Smit R.A.J., Jones S.E., Donnelly L., Beaumont R., Campbell A., Tuke M.A., Hayward C., et al. Genetic evidence for a link between favorable adiposity and lower risk of type 2 diabetes, hypertension, and heart disease. Diabetes. 2016;65:2448–2460. doi: 10.2337/db15-1671. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Sulc J., Sonrel A., Mounier N., Auwerx C., Marouli E., Darrous L., Draganski B., Kilpeläinen T.O., Joshi P., Loos R.J.F., Kutalik Z. Composite trait Mendelian randomization reveals distinct metabolic and lifestyle consequences of differences in body shape. Commun. Biol. 2021;4:1064. doi: 10.1038/s42003-021-02550-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Karageorgiou V., Gill D., Bowden J., Zuber V. Sparse Dimensionality Reduction Approaches in Mendelian Randomization with highly correlated exposures. Preprint at medRxiv. 2022. doi: 10.1101/2022.06.15.22276455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Rees J.M.B., Foley C.N., Burgess S. Factorial Mendelian randomization: using genetic variants to assess interactions. Int. J. Epidemiol. 2020;49:1147–1158. doi: 10.1093/ije/dyz161. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Ference B.A., Majeed F., Penumetcha R., Flack J.M., Brook R.D. Effect of naturally random allocation to lower low-density lipoprotein cholesterol on the risk of coronary heart disease mediated by polymorphisms in NPC1L1, HMGCR, or both: a 2 × 2 factorial Mendelian randomization study. J. Am. Coll. Cardiol. 2015;65:1552–1561. doi: 10.1016/j.jacc.2015.02.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Georgakis M.K., Malik R., Burgess S., Dichgans M. Additive Effects of Genetic Interleukin-6 Signaling Downregulation and Low-Density Lipoprotein Cholesterol Lowering on Cardiovascular Disease: A 2×2 Factorial Mendelian Randomization Analysis. J. Am. Heart Assoc. 2022;11:e023277. doi: 10.1161/jaha.121.023277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Leyden G.M., Shapland C.Y., Davey Smith G., Sanderson E., Greenwood M.P., Murphy D., Richardson T.G. Harnessing tissue-specific genetic variation to dissect putative causal pathways between body mass index and cardiometabolic phenotypes. Am. J. Hum. Genet. 2022;109:240–252. doi: 10.1016/j.ajhg.2021.12.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Ryan D.K., Karhunen V., Su B., Traylor M., Richardson T.G., Burgess S., Tzoulaki I., Gill D. Genetic Evidence for Protective Effects of Angiotensin-Converting Enzyme Against Alzheimer Disease But Not Other Neurodegenerative Diseases in European Populations. Neurol. Genet. 2022;8:e200014. doi: 10.1212/NXG.0000000000200014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Murray J.I., Whitfield M.L., Trinklein N.D., Myers R.M., Brown P.O., Botstein D. Diverse and specific gene expression responses to stresses in cultured human cells. Mol. Biol. Cell. 2004;15:2361–2374. doi: 10.1091/mbc.E03-11-0799. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Yazar S., Alquicira-Hernandez J., Wing K., Senabouth A., Gordon M.G., Andersen S., Lu Q., Rowson A., Taylor T.R.P., Clarke L., et al. Single-cell eQTL mapping identifies cell type–specific genetic control of autoimmune disease. Science. 2022;376:eabf3041. doi: 10.1126/science.abf3041. [DOI] [PubMed] [Google Scholar]
- 97.Burgess S., Davies N.M., Thompson S.G., EPIC-InterAct Consortium Instrumental variable analysis with a nonlinear exposure-outcome relationship. Epidemiology. 2014;25:877–885. doi: 10.1097/ede.0000000000000161. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Staley J.R., Burgess S. Semiparametric methods for estimation of a nonlinear exposure-outcome relationship using instrumental variables with application to Mendelian randomization. Genet. Epidemiol. 2017;41:341–352. doi: 10.1002/gepi.22041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Cole S.R., Platt R.W., Schisterman E.F., Chu H., Westreich D., Richardson D., Poole C. Illustrating bias due to conditioning on a collider. Int. J. Epidemiol. 2010;39:417–420. doi: 10.1093/ije/dyp334. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Hernán M.A., Hernández-Díaz S., Robins J.M. A Structural Approach to Selection Bias. Epidemiology. 2004;15:615–625. doi: 10.1097/01.ede.0000135174.63482.43. [DOI] [PubMed] [Google Scholar]
- 101.Sulc J., Sjaarda J., Kutalik Z. Polynomial Mendelian randomization reveals non-linear causal effects for obesity-related traits. HGG Adv. 2022;3:100124. doi: 10.1016/j.xhgg.2022.100124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Biddinger K.J., Emdin C.A., Haas M.E., Wang M., Hindy G., Ellinor P.T., Kathiresan S., Khera A.V., Aragam K.G. Association of Habitual Alcohol Intake With Risk of Cardiovascular Disease. JAMA Netw. Open. 2022;5:e223849. doi: 10.1001/jamanetworkopen.2022.3849. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Malik R., Georgakis M.K., Vujkovic M., Damrauer S.M., Elliott P., Karhunen V., Giontella A., Fava C., Hellwege J.N., Shuey M.M., et al. Relationship Between Blood Pressure and Incident Cardiovascular Disease: Linear and Nonlinear Mendelian Randomization Analyses. Hypertension. 2021;77:2004–2013. doi: 10.1161/HYPERTENSIONAHA.120.16534. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Burgess S., Malik R., Liu B., Mason A.M., Georgakis M.K., Dichgans M., Gill D. Dose–response relationship between genetically proxied average blood glucose levels and incident coronary heart disease in individuals without diabetes mellitus. Diabetologia. 2021;64:845–849. doi: 10.1007/s00125-020-05377-0.
- 105.Burgess S. “C-reactive protein levels and risk of dementia”: Subgroup analyses in Mendelian randomization are likely to be misleading. Alzheimer's Dementia. 2022. doi: 10.1002/alz.12743.
- 106.Coscia C., Gill D., Benítez R., Pérez T., Malats N., Burgess S. Avoiding collider bias in Mendelian randomization when performing stratified analyses. Eur. J. Epidemiol. 2022;37:671–682. doi: 10.1007/s10654-022-00879-0.
- 107.Georgakis M.K., Malik R., Richardson T.G., Howson J.M.M., Anderson C.D., Burgess S., Hovingh G.K., Dichgans M., Gill D. Associations of genetically predicted IL-6 signaling with cardiovascular disease risk across population subgroups. BMC Med. 2022;20:245. doi: 10.1186/s12916-022-02446-6.
- 108.Tian H., Mason A.M., Liu C., Burgess S. Relaxing parametric assumptions for non-linear Mendelian randomization using a doubly-ranked stratification method. Preprint at bioRxiv. 2022. doi: 10.1101/2022.06.28.497930.
- 109.Postmus I., Trompet S., Deshmukh H.A., Barnes M.R., Li X., Warren H.R., Chasman D.I., Zhou K., Arsenault B.J., Donnelly L.A., et al. Pharmacogenetic meta-analysis of genome-wide association studies of LDL cholesterol response to statins. Nat. Commun. 2014;5:5068. doi: 10.1038/ncomms6068.
- 110.Lewis J.P., Backman J.D., Reny J.-L., Bergmeijer T.O., Mitchell B.D., Ritchie M.D., Déry J.P., Pakyz R.E., Gong L., Ryan K., et al. Pharmacogenomic polygenic response score predicts ischaemic events and cardiovascular mortality in clopidogrel-treated patients. Eur. Heart J. Cardiovasc. Pharmacother. 2020;6:203–210. doi: 10.1093/ehjcvp/pvz045.
- 111.Pan Y., Chen W., Wang Y., Li H., Johnston S.C., Simon T., Zhao X., Liu L., Wang D., Meng X., et al. Association Between ABCB1 Polymorphisms and Outcomes of Clopidogrel Treatment in Patients With Minor Stroke or Transient Ischemic Attack: Secondary Analysis of a Randomized Clinical Trial. JAMA Neurol. 2019;76:552–560. doi: 10.1001/jamaneurol.2018.4775.
- 112.Xu Z.M., Burgess S. Polygenic modelling of treatment effect heterogeneity. Genet. Epidemiol. 2020;44:868–879. doi: 10.1002/gepi.22347.
- 113.Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013;9:e1003348. doi: 10.1371/journal.pgen.1003348.
- 114.Birney E., Inouye M., Raff J., Rutherford A., Scally A. The language of race, ethnicity, and ancestry in human genetic research. Preprint at arXiv. 2021. doi: 10.48550/ARXIV.2106.10041.
- 115.Nowak M.A., Tarnita C.E., Antal T. Evolutionary dynamics in structured populations. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2010;365:19–30. doi: 10.1098/rstb.2009.0215.
- 116.Price A.L., Patterson N.J., Plenge R.M., Weinblatt M.E., Shadick N.A., Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006;38:904–909. doi: 10.1038/ng1847.
- 117.Haworth S., Mitchell R., Corbin L., Wade K.H., Dudding T., Budu-Aggrey A., Carslake D., Hemani G., Paternoster L., Smith G.D., et al. Apparent latent structure within the UK Biobank sample has implications for epidemiological analysis. Nat. Commun. 2019;10:333. doi: 10.1038/s41467-018-08219-1.
- 118.Brumpton B., Sanderson E., Heilbron K., Hartwig F.P., Harrison S., Vie G.Å., Cho Y., Howe L.D., Hughes A., Boomsma D.I., et al. Avoiding dynastic, assortative mating, and population stratification biases in Mendelian randomization through within-family analyses. Nat. Commun. 2020;11:3519. doi: 10.1038/s41467-020-17117-4.
- 119.Howe L.J., Nivard M.G., Morris T.T., Hansen A.F., Rasheed H., Cho Y., Chittoor G., Lind P.A., Palviainen T., van der Zee M.D., et al. Within-sibship GWAS improve estimates of direct genetic effects. Preprint at bioRxiv. 2021. doi: 10.1101/2021.03.05.433935.
- 120.Gregson J.M., Freitag D.F., Surendran P., Stitziel N.O., Chowdhury R., Burgess S., Kaptoge S., Gao P., Staley J.R., Willeit P., et al. Genetic invalidation of Lp-PLA2 as a therapeutic target: Large-scale study of five functional Lp-PLA2-lowering alleles. Eur. J. Prev. Cardiol. 2017;24:492–504. doi: 10.1177/2047487316682186.
- 121.Millwood I.Y., Walters R.G., Mei X.W., Guo Y., Yang L., Bian Z., Bennett D.A., Chen Y., Dong C., Hu R., et al. Conventional and genetic evidence on alcohol and vascular disease aetiology: a prospective study of 500 000 men and women in China. Lancet. 2019;393:1831–1842. doi: 10.1016/S0140-6736(18)31772-0.
- 122.Fatumo S., Karhunen V., Chikowore T., Sounkou T., Udosen B., Ezenwa C., Nakabuye M., Soremekun O., Daghlas I., Ryan D.K., et al. Metabolic traits and stroke risk in individuals of African ancestry: Mendelian randomization analysis. Stroke. 2021;52:2680–2684. doi: 10.1161/STROKEAHA.121.034747.
- 123.Soremekun O., Karhunen V., He Y., Rajasundaram S., Liu B., Gkatzionis A., Soremekun C., Udosen B., Musa H., Silva S., et al. Lipid traits and type 2 diabetes risk in African ancestry individuals: A Mendelian Randomization study. EBioMedicine. 2022;78:103953. doi: 10.1016/j.ebiom.2022.103953.
- 124.Lotta L.A., Stewart I.D., Sharp S.J., Day F.R., Burgess S., Luan J., Bowker N., Cai L., Li C., Wittemans L.B.L., et al. Association of Genetically Enhanced Lipoprotein Lipase-Mediated Lipolysis and Low-Density Lipoprotein Cholesterol-Lowering Alleles With Risk of Coronary Disease and Type 2 Diabetes. JAMA Cardiol. 2018;3:957–966. doi: 10.1001/jamacardio.2018.2866.
- 125.Satterfield B.A., Dikilitas O., Safarova M.S., Clarke S.L., Tcheandjieu C., Zhu X., Bastarache L., Larson E.B., Justice A.E., Shang N., et al. Associations of genetically predicted Lp(a) (lipoprotein[a]) levels with cardiovascular traits in individuals of European and African Ancestry. Circ. Genom. Precis. Med. 2021;14:e003354. doi: 10.1161/CIRCGEN.120.003354.
- 126.Graham S.E., Clarke S.L., Wu K.-H.H., Kanoni S., Zajac G.J.M., Ramdas S., Surakka I., Ntalla I., Vedantam S., Winkler T.W., et al. The power of genetic diversity in genome-wide association studies of lipids. Nature. 2021;600:675–679. doi: 10.1038/s41586-021-04064-3.
- 127.O’Connell J., Yun T., Moreno M., Li H., Litterman N., Kolesnikov A., Noblin E., Chang P.C., Shastri A., Dorfman E.H., et al. A population-specific reference panel for improved genotype imputation in African Americans. Commun. Biol. 2021;4:1–9. doi: 10.1038/s42003-021-02777-9.
- 128.Burgess S., Butterworth A., Malarstig A., Thompson S.G. Use of Mendelian randomisation to assess potential benefit of clinical intervention. Br. Med. J. 2012;345:e7325. doi: 10.1136/bmj.e7325.
- 129.Ference B.A. How to use Mendelian randomization to anticipate the results of randomized trials. Eur. Heart J. 2018;39:360–362. doi: 10.1093/eurheartj/ehx462.
- 130.Hardy J., Niman S., Goldfaden R.F., Ashchi M., Bisharat M., Huston J., Hartmann H., Choksi R. A Review of the Clinical Pharmacology of Pelacarsen: A Lipoprotein(a)-Lowering Agent. Am. J. Cardiovasc. Drugs. 2022;22:47–54. doi: 10.1007/s40256-021-00499-1.
- 131.Kamstrup P.R., Benn M., Tybjaerg-Hansen A., Nordestgaard B.G. Extreme lipoprotein(a) levels and risk of myocardial infarction in the general population: the Copenhagen City Heart Study. Circulation. 2008;117:176–184. doi: 10.1161/circulationaha.107.715698.
- 132.Clarke R., Peden J.F., Hopewell J.C., Kyriakou T., Goel A., Heath S.C., Parish S., Barlera S., Franzosi M.G., Rust S., et al. Genetic variants associated with Lp(a) lipoprotein level and coronary disease. N. Engl. J. Med. 2009;361:2518–2528. doi: 10.1056/NEJMoa0902604.
- 133.Kamstrup P.R., Tybjaerg-Hansen A., Steffensen R., Nordestgaard B.G. Genetically elevated lipoprotein(a) and increased risk of myocardial infarction. JAMA. 2009;301:2331–2339. doi: 10.1001/jama.2009.801.
- 134.Nicholls S.J., Bubb K.J. The Riskier Lipid: What Is on the HORIZON for Lipoprotein (a) and Should There Be Lp(a) Screening for All? Curr. Cardiol. Rep. 2021;23:97. doi: 10.1007/s11886-021-01528-w.
- 135.Saleheen D., Haycock P.C., Zhao W., Rasheed A., Taleb A., Imran A., Abbas S., Majeed F., Akhtar S., Qamar N., et al. Apolipoprotein(a) isoform size, lipoprotein(a) concentration, and coronary artery disease: a mendelian randomisation analysis. Lancet Diabetes Endocrinol. 2017;5:524–533. doi: 10.1016/s2213-8587(17)30088-8.
- 136.Marcovina S.M., Albers J.J., Wijsman E., Zhang Z., Chapman N.H., Kennedy H. Differences in Lp[a] concentrations and apo[a] polymorphs between black and white Americans. J. Lipid Res. 1996;37:2569–2585.
- 137.Ardissino M., Slob E.A.W., Millar O., Reddy R.K., Lazzari L., Patel K.H.K., Ryan D., Johnson M.R., Gill D., Ng F.S. Maternal Hypertension Increases Risk of Preeclampsia and Low Fetal Birthweight: Genetic Evidence From a Mendelian Randomization Study. Hypertension. 2022;79:588–598. doi: 10.1161/HYPERTENSIONAHA.121.18617.
- 138.Tita A.T., Szychowski J.M., Boggess K., Dugoff L., Sibai B., Lawrence K., Hughes B.L., Bell J., Aagaard K., Edwards R.K., et al. Treatment for Mild Chronic Hypertension during Pregnancy. N. Engl. J. Med. 2022;386:1781–1792. doi: 10.1056/NEJMoa2201295.
- 139.Ardissino M., Slob E.A.W., Rajasundaram S., Reddy R.K., Woolf B., Girling J., Johnson M.R., Ng F.S., Gill D. Safety of beta-blocker and calcium channel blocker antihypertensive drugs in pregnancy: a Mendelian randomization study. BMC Med. 2022;20:288. doi: 10.1186/s12916-022-02483-1.
- 140.Richardson T.G., Sanderson E., Elsworth B., Tilling K., Davey Smith G. Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study. Br. Med. J. 2020;369:m1203. doi: 10.1136/bmj.m1203.
- 141.Tian H., Burgess S. Estimation of time-varying causal effects with multivariable Mendelian randomization: some cautionary notes. Preprint at medRxiv. 2022. doi: 10.1101/2022.03.16.22272492.
- 142.Guo Y., Warren Andersen S., Shu X.-O., Michailidou K., Bolla M.K., Wang Q., Garcia-Closas M., Milne R.L., Schmidt M.K., Chang-Claude J., et al. Genetically predicted body mass index and breast cancer risk: Mendelian randomization analyses of data from 145,000 women of European descent. PLoS Med. 2016;13:e1002105. doi: 10.1371/journal.pmed.1002105.
- 143.Vithayathil M., Carter P., Kar S., Mason A.M., Burgess S., Larsson S.C. Body size and composition and risk of site-specific cancers in the UK Biobank and large international consortia: A mendelian randomisation study. PLoS Med. 2021;18:e1003706. doi: 10.1371/journal.pmed.1003706.
- 144.Guo Q., Burgess S., Turman C., Bolla M.K., Wang Q., Lush M., Abraham J., Aittomäki K., Andrulis I.L., Apicella C., et al. Body mass index and breast cancer survival: a Mendelian randomization analysis. Int. J. Epidemiol. 2017;46:1814–1822. doi: 10.1093/ije/dyx131.
- 145.Vansteelandt S., Dukes O., Martinussen T. Survivor bias in Mendelian randomization analysis. Biostatistics. 2018;19:426–443. doi: 10.1093/biostatistics/kxx050.
- 146.Cho Y., Rau A., Reiner A., Auer P.L. Mendelian randomization analysis with survival outcomes. Genet. Epidemiol. 2021;45:16–23. doi: 10.1002/gepi.22354.
- 147.Paternoster L., Tilling K., Davey Smith G. Genetic epidemiology and Mendelian randomization for informing disease therapeutics: Conceptual and methodological challenges. PLoS Genet. 2017;13:e1006944. doi: 10.1371/journal.pgen.1006944.
- 148.Mitchell R.E., Hartley A., Walker V.M., Gkatzionis A., Yarmolinsky J., Bell J.A., Chong A.H., Paternoster L., Tilling K., Smith G.D. Strategies to investigate and mitigate collider bias in genetic and Mendelian randomization studies of disease progression. Preprint at medRxiv. 2022. doi: 10.1371/journal.pgen.1010596.
- 149.Smith G.D., Ebrahim S. Mendelian randomization: prospects, potentials, and limitations. Int. J. Epidemiol. 2004;33:30–42. doi: 10.1093/ije/dyh132.
- 150.VanderWeele T.J., Tchetgen Tchetgen E.J., Cornelis M., Kraft P. Methodological challenges in Mendelian randomization. Epidemiology. 2014;25:427–435. doi: 10.1097/EDE.0000000000000081.
- 151.Wang S., Said M.A., Groot H.E., van der Most P.J., Thio C.H.L., van de Vegte Y.J., Verweij N., Snieder H., van der Harst P. Search for a Functional Genetic Variant Mimicking the Effect of SGLT2 Inhibitor Treatment. Genes. 2021;12:1174. doi: 10.3390/genes12081174.
- 152.Zheng J., Xu M., Walker V., Yuan J., Korologou-Linden R., Robinson J., Huang P., Burgess S., Au Yeung S.L., Luo S., et al. Evaluating the efficacy and mechanism of metformin targets on reducing Alzheimer's disease risk in the general population: a Mendelian randomisation study. Diabetologia. 2022;65:1664–1675. doi: 10.1007/s00125-022-05743-0.
- 153.Schmidt A.F., Hunt N.B., Gordillo-Marañón M., Charoen P., Drenos F., Kivimaki M., Lawlor D.A., Giambartolomei C., Papacosta O., Chaturvedi N., et al. Cholesteryl ester transfer protein (CETP) as a drug target for cardiovascular disease. Nat. Commun. 2021;12:5640. doi: 10.1038/s41467-021-25703-3.
- 154.The HPS3/TIMI55–REVEAL Collaborative Group. Effects of Anacetrapib in Patients with Atherosclerotic Vascular Disease. N. Engl. J. Med. 2017;377:1217–1227. doi: 10.1056/NEJMoa1706444.
- 155.Sudlow C., Gallacher J., Allen N., Beral V., Burton P., Danesh J., Downey P., Elliott P., Green J., Landray M., et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779. doi: 10.1371/journal.pmed.1001779.
- 156.Burgess S., Scott R.A., Timpson N.J., Davey Smith G., Thompson S.G., EPIC-InterAct Consortium. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur. J. Epidemiol. 2015;30:543–552. doi: 10.1007/s10654-015-0011-z.
- 157.Kamat M.A., Blackshaw J.A., Young R., Surendran P., Burgess S., Danesh J., Butterworth A.S., Staley J.R. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics. 2019;35:4851–4853. doi: 10.1093/bioinformatics/btz469.
- 158.Abbott L., Bryant S., Churchhouse C., Ganna A., Howrigan D., Palmer D., Neale B., Walters R., Carey C., et al. The Hail team. UK Biobank GWAS results Round 2. 2018. http://www.nealelab.is/uk-biobank Accessed 27th July 2022.
- 159.Kurki M.I., Karjalainen J., Palta P., Sipilä T.P., Kristiansson K., Donner K., Reeve M.P., Laivuori H., Aavikko M., Kaunisto M.A., et al. FinnGen: Unique genetic insights from combining isolated population and national health register data. Preprint at medRxiv. 2022. doi: 10.1101/2022.03.03.22271360.
- 160.Okada Y., Kanai M. BioBank Japan Pheweb GWAS results downloads. 2021. https://pheweb.jp/downloads Accessed 27th July 2022.
- 161.Broadbent J.R., Foley C.N., Grant A.J., Mason A.M., Staley J.R., Burgess S. MendelianRandomization v0.5.0: updates to an R package for performing Mendelian randomization analyses using summarized data. Wellcome Open Res. 2020;5:252. doi: 10.12688/wellcomeopenres.16374.2.
- 162.Mason A.M., Burgess S. Software Application Profile: SUMnlmr, an R package that facilitates flexible and reproducible non-linear Mendelian randomisation analyses. Int. J. Epidemiol. 2021. doi: 10.1101/2021.12.10.21267623.
- 163.Piccini J.P., Caso V., Connolly S.J., Fox K.A.A., Oldgren J., Jones W.S., Gorog D.A., Durdil V., Viethen T., Neumann C., et al. Safety of the oral factor XIa inhibitor asundexian compared with apixaban in patients with atrial fibrillation (PACIFIC-AF): a multicentre, randomised, double-blind, double-dummy, dose-finding phase 2 study. Lancet. 2022;399:1383–1390. doi: 10.1016/S0140-6736(22)00456-1.
- 164.Larsson S.C., Gill D. Genetic Evidence Supporting Fibroblast Growth Factor 21 Signalling as a Pharmacological Target for Cardiometabolic Outcomes and Alzheimer's Disease. Nutrients. 2021;13:1504. doi: 10.3390/nu13051504.
- 165.Harrison S.A., Ruane P.J., Freilich B.L., Neff G., Patil R., Behling C.A., Hu C., Fong E., de Temple B., Tillman E.J., et al. Efruxifermin in non-alcoholic steatohepatitis: a randomized, double-blind, placebo-controlled, phase 2a trial. Nat. Med. 2021;27:1262–1271. doi: 10.1038/s41591-021-01425-3.
- 166.Daghlas I., Karhunen V., Ray D., Zuber V., Burgess S., Tsao P.S., Lynch J.A., Lee K.M., Voight B.F., Chang K.M., et al. Genetic Evidence for Repurposing of GLP1R (Glucagon-Like Peptide-1 Receptor) Agonists to Prevent Heart Failure. J. Am. Heart Assoc. 2021;10:e020331. doi: 10.1161/JAHA.120.020331.
- 167.Karhunen V., Daghlas I., Zuber V., Vujkovic M., Olsen A.K., Knudsen L.B., Haynes W.G., Howson J.M.M., Gill D. Leveraging human genetic data to investigate the cardiometabolic effects of glucose-dependent insulinotropic polypeptide signalling. Diabetologia. 2021;64:2773–2778. doi: 10.1007/s00125-021-05564-7.
- 168.Frías J.P., Davies M.J., Rosenstock J., Pérez Manghi F.C., Fernández Landó L., Bergman B.K., Liu B., Cui X., Brown K., SURPASS-2 Investigators. Tirzepatide versus Semaglutide Once Weekly in Patients with Type 2 Diabetes. N. Engl. J. Med. 2021;385:503–515. doi: 10.1056/NEJMoa2107519.
- 169.Rosenstock J., Wysham C., Frías J.P., Kaneko S., Lee C.J., Fernández Landó L., Mao H., Cui X., Karanikas C.A., Thieu V.T. Efficacy and safety of a novel dual GIP and GLP-1 receptor agonist tirzepatide in patients with type 2 diabetes (SURPASS-1): a double-blind, randomised, phase 3 trial. Lancet. 2021;398:143–155. doi: 10.1016/S0140-6736(21)01324-6.
- 170.Georgakis M.K., Malik R., Li X., Gill D., Levin M.G., Vy H.M.T., Judy R., Ritchie M., Verma S.S., et al. Regeneron Genetics Center. Genetically Downregulated Interleukin-6 Signaling Is Associated With a Favorable Cardiometabolic Profile: A Phenome-Wide Association Study. Circulation. 2021;143:1177–1180. doi: 10.1161/CIRCULATIONAHA.120.052604.
- 171.Georgakis M.K., Malik R., Gill D., Franceschini N., Sudlow C.L.M., Dichgans M., INVENT Consortium, CHARGE Inflammation Working Group. Interleukin-6 Signaling Effects on Ischemic Stroke and Other Cardiovascular Outcomes: A Mendelian Randomization Study. Circ. Genom. Precis. Med. 2020;13:e002872. doi: 10.1161/CIRCGEN.119.002872.
- 172.Ridker P.M., Devalaraja M., Baeres F.M.M., Engelmann M.D.M., Hovingh G.K., Ivkovic M., Lo L., Kling D., Pergola P., Raj D., et al. IL-6 inhibition with ziltivekimab in patients at high atherosclerotic risk (RESCUE): a double-blind, randomised, placebo-controlled, phase 2 trial. Lancet. 2021;397:2060–2069. doi: 10.1016/S0140-6736(21)00520-1.
- 173.Cohen J.C., Boerwinkle E., Mosley T.H., Jr., Hobbs H.H. Sequence variations in PCSK9, low LDL, and protection against coronary heart disease. N. Engl. J. Med. 2006;354:1264–1272. doi: 10.1056/NEJMoa054013.
- 174.Schwartz G.G., Steg P.G., Szarek M., Bhatt D.L., Bittner V.A., Diaz R., Edelberg J.M., Goodman S.G., Hanotin C., Harrington R.A., et al. Alirocumab and Cardiovascular Outcomes after Acute Coronary Syndrome. N. Engl. J. Med. 2018;379:2097–2107. doi: 10.1056/NEJMoa1801174.
- 175.Myocardial Infarction Genetics Consortium Investigators. Stitziel N.O., Won H.H., Morrison A.C., Peloso G.M., Do R., Lange L.A., Fontanillas P., Gupta N., Duga S., et al. Inactivating mutations in NPC1L1 and protection from coronary heart disease. N. Engl. J. Med. 2014;371:2072–2082. doi: 10.1056/NEJMoa1405386.
- 176.Cannon C.P., Blazing M.A., Giugliano R.P., McCagg A., White J.A., Theroux P., Darius H., Lewis B.S., Ophuis T.O., Jukema J.W., et al. Ezetimibe Added to Statin Therapy after Acute Coronary Syndromes. N. Engl. J. Med. 2015;372:2387–2397. doi: 10.1056/NEJMoa1410489.
- 177.Johannsen T.H., Frikke-Schmidt R., Schou J., Nordestgaard B.G., Tybjærg-Hansen A. Genetic inhibition of CETP, ischemic vascular disease and mortality, and possible adverse effects. J. Am. Coll. Cardiol. 2012;60:2041–2048. doi: 10.1016/j.jacc.2012.07.045.
Data Availability Statement
Most methods listed in this review can be implemented with summarized genetic associations,156 namely the beta coefficients and standard errors from univariable regression of a trait on each genetic variant in turn. Such data have been made publicly available by many large GWAS consortia157 and for large population-based biobanks.158,159,160 These methods can be implemented with the MendelianRandomization161 and TwoSampleMR16 packages for R. An exception is non-linear Mendelian randomization, which requires individual participant data.162 A full list of software packages and links is provided in Table 2.
Table 2.
List of software packages for Mendelian randomization (MR) and related methods discussed in this review (this is not a comprehensive list of all MR methods but covers the methods most commonly used in the literature)
Approach | Package name | Weblink |
---|---|---|
Multiple methods | MendelianRandomization | https://cran.r-project.org/web/packages/MendelianRandomization/ |
Multiple methods | TwoSampleMR | https://github.com/MRCIEU/TwoSampleMR |
Outlier-robust estimation | MR-PRESSO | https://github.com/rondolab/MR-PRESSO |
Colocalization | coloc | https://github.com/chr1swallace/coloc |
Rare variant burden testing | SKAT | https://cran.r-project.org/web/packages/SKAT/ |
Rare variant burden testing | regenie | https://rgcgithub.github.io/regenie/ |
Robust estimation for multivariable MR | Robust MVMR | https://github.com/aj-grant/robust-mvmr |
Variable selection | MR-BMA | https://github.com/verena-zuber/demo_AMD |
Factor-based cis-MR | con-cis-MR | https://github.com/ash-res/con-cis-MR |
Network cis-MR | TwoStepCisMR | https://github.com/bar-woolf/TwoStepCisMR/wiki |
Clustering variants based on outcome associations | MRClust | https://github.com/cnfoley/mrclust |
Clustering variants based on trait associations | NAvMix | https://github.com/aj-grant/navmix |
Non-linear residual method (individual-level data) | nlmr | https://github.com/jrs95/nlmr |
Non-linear residual and doubly ranked methods | SUMnlmr | https://github.com/amymariemason/SUMnlmr |
Non-linear polynomial method | PolyMR | https://github.com/JonSulc/PolyMR |
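
To illustrate how summarized genetic associations of the kind described in the Data Availability Statement feed into an analysis, the following is a minimal Python sketch of a fixed-effect inverse-variance weighted (IVW) estimate. It assumes uncorrelated variants and uses first-order weights that ignore uncertainty in the variant-exposure associations. The function name `ivw_estimate` and all input values are hypothetical, chosen only for illustration; this is not the implementation used in the MendelianRandomization or TwoSampleMR packages.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    Assumes uncorrelated variants and first-order weights (the standard
    errors of the variant-exposure associations are ignored).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must contain one entry per genetic variant")
    if not beta_exposure:
        raise ValueError("at least one genetic variant is required")

    # Weighted regression of outcome associations on exposure associations
    # through the origin, with weights 1 / se_outcome^2.
    numerator = sum(
        bx * by / s**2
        for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx**2 / s**2 for bx, s in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three variants (illustration only).
beta_x = [0.10, 0.20, 0.15]    # variant-exposure associations
beta_y = [0.05, 0.10, 0.075]   # variant-outcome associations
se_y = [0.01, 0.02, 0.015]     # standard errors of beta_y

estimate, ivw_se = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate: {estimate:.3f} (SE {ivw_se:.3f})")
```

In this toy input every variant has the same ratio of outcome to exposure association, so the pooled estimate equals that common ratio. In practice, the robust and multivariable methods listed in Table 2 relax the assumptions made here.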