Abstract
Mendelian Randomization (MR) represents a class of instrumental variable methods using genetic variants. It has become popular in epidemiological studies to account for unmeasured confounders when estimating the effect of an exposure on an outcome. The success of Mendelian Randomization depends on three critical assumptions, which are difficult to verify. Therefore, sensitivity analysis methods are needed for evaluating results and drawing plausible conclusions. We propose a general and easy-to-apply approach to conduct sensitivity analysis for Mendelian Randomization studies. Bound et al. (1995) derived a formula for the asymptotic bias of the instrumental variable estimator. Based on their work, we derive a new sensitivity analysis formula. The parameters in the formula include sensitivity parameters such as the correlation between the instruments and the unmeasured confounder, the direct effect of the instruments on the outcome and the strength of the instruments. In our simulation studies, we examined our approach in various scenarios using either individual SNPs or an unweighted allele score as instruments. Using a previously published dataset from a bone mineral density study, we demonstrate that our proposed method is a useful tool for MR studies, and that investigators can combine their domain knowledge with our method to obtain bias-corrected results and make informed conclusions on the scientific plausibility of their findings.
Keywords: instrumental variable, causal inference, sensitivity analysis, unmeasured confounding
1. Introduction
Determining the relationship between exposures and health-related outcomes is one of the major goals of epidemiology. When researchers investigate the effects of exposures on outcomes, they often make assumptions at different stages, such as study design and statistical analysis. For example, epidemiological studies are usually nonrandomized for logistical or ethical reasons, so potential confounders are collected from subjects and later included in the analyses. At the analysis stage, epidemiologists and statisticians assume that no unobserved confounders have been left out of the statistical models, so that the estimated effect of exposure is the true effect. However, such an assumption is usually hard to test statistically and typically relies on subject-matter knowledge. With the increasing availability of large-scale genetic data and associations from genome-wide association studies, George Davey Smith and colleagues (Davey Smith and Ebrahim 2005; Davey Smith and Ebrahim 2003; Davey Smith and Ebrahim 2004) have argued for the use of DNA variants as instrumental variables, which they termed Mendelian Randomization (MR). Mendelian randomization circumvents the need for the assumption of no omitted confounders when inferring the true effect.
The use of Mendelian randomization methods was initially suggested by the Dutch scientist Martijn Katan in 1986 (Katan 2004). In describing the relationship between serum cholesterol level and cancer, Katan stated that the different alleles of the apolipoprotein E (apo E) gene were major determinants of plasma cholesterol levels in several populations. In addition, the alleles were not affected by confounders since they were inherited from parents and had not changed since birth. Katan argued that this meant that reverse causality could not occur. Therefore, these genes could potentially be used to investigate the relationship between low serum cholesterol and cancer. Although Katan did not use the term “Mendelian randomization” and did not provide statistical methods for analysis, he made the important observation that genetic variants are usually not affected by confounders and outcomes; hence, they may be used as surrogates for associated exposures in statistical analyses. However, the relationship between the genetic variants and the outcomes is not exactly the same as that between the exposures and the outcomes, which is of main interest to investigators. More advanced analytic methods are needed to take advantage of the genetic variants’ unconfoundedness.
The method of instrumental variables (IV) was developed by Wright (Wright 1928) for simultaneous equation estimation. It has long been used in econometrics to estimate causal effects of treatment on outcomes while accommodating unmeasured confounding (Haavelmo 1944; Theil 1958). The IV methods depend on finding one or more variables to extract the unconfounded variation of treatment for studying the association between treatment and the outcome. In Mendelian randomization, genetic variants are the instrumental variables for studying effects of exposure on health outcomes. There are three critical assumptions that must hold for valid inferences using IV/MR methods:
(a) The instrumental variables are strongly associated with the exposure.
(b) The instrumental variables are independent of the unmeasured confounder.
(c) The instrumental variables affect outcome only through exposure.
When these assumptions hold, the IV method will yield consistent estimators for the true effect. However, identifying valid instruments is very difficult. While assumption (a) can be assessed statistically, the other two require making statistically unverifiable assumptions. When assumption (b) is violated, the unmeasured confounder between exposure and outcome then becomes an unmeasured confounder between instruments and exposure, causing biased estimates of instrumental effect on exposure. This will result in incorrect estimates of exposure on outcome. Although genetic variants generally do not correlate with the common baseline confounders (Katan 2004), the correlation of genetic variants with known or unknown confounders is still of importance. First, population stratification may still exist even within data from the same continent despite the careful design (Bauchet et al. 2007; Novembre et al. 2008; Seldin and Price 2008). Secondly, Mendelian randomization has been applied to omics studies to facilitate estimating the causal effect of omics on complex disease phenotypes (Auerbach et al. 2018). However, batch effects are widespread in large high-throughput-omics data and can impede efforts to uncover disease mechanisms (Goh et al. 2017; Leek et al. 2010; Listgarten et al. 2010). The batch effects can impact the estimation of true effects of omics on traits by being correlated with both omics, such as DNA expression data, and genotype data (Listgarten et al. 2010; Michaelson et al. 2009). It is also common for genetic variants to exhibit pleiotropy and thereby function in multiple biological pathways. This renders alternative pathways by which IVs may affect outcome separately from the path through the exposure. In a recent paper, Sivakumaran et al (Sivakumaran et al. 2011) carefully estimated the frequency of pleiotropy. They only considered independent traits to avoid inflation in estimated pleiotropy frequency by highly correlated traits. 
They found 4.6–7.8% of relatively common SNPs to have pleiotropy. Pleiotropy can manifest in different ways. The same gene or regulatory region could give rise to two different traits, or a causal locus could affect a trait through the mediation of another trait (Hackinger and Zeggini 2017). Violation of assumption (b) may cause the association between instrument and outcome to be mediated by an unmeasured confounder, whereas violation of assumption (c) includes both cases. When there are multiple IVs, the overidentifying restrictions test (ORT) can be used to test assumptions (b) and (c). However, ORTs may be inconsistent and have low power in certain situations (Small 2007).
Thus, it is important to have methods available to understand the robustness of findings when assumptions (a)-(c) are violated. Sensitivity analysis is a method for evaluating how much the uncertainty in assumptions (a)-(c) affects the conclusions and for assessing how reliable the study results are. One well-known early example of sensitivity analysis in non-randomized studies was performed by Cornfield et al. concerning the relationship between smoking and lung cancer (Cornfield et al. 1959). It was observed that cigarette smokers had a nine-fold greater risk of developing lung cancer than non-smokers. Cornfield et al. argued that attributing such a large increase in risk only to confounders and not to smoking itself is highly implausible, thereby concluding that smoking itself has a causal effect on lung cancer. This example demonstrated the usefulness of sensitivity analysis. Although sensitivity analysis usually does not give a conclusive answer regarding the reliability or correctness of scientific findings, it does provide information about unobserved factors that could be detrimental to effect estimation and/or hypothesis testing.
Sensitivity analysis methods have been studied and proposed by many authors (Harding 2003; Gastwirth et al. 1998; Greenland 1996; Lin et al. 1998; Rosenbaum and Rubin 1983; Vanderweele and Arah 2011). The focus of the methods is either on hypothesis testing (Gastwirth et al. 1998; Rosenbaum 1987) or on estimating the true effect of the exposure on the outcome (Harding 2003; Lin et al. 1998; Vanderweele and Arah 2011). Both approaches employ sensitivity parameters that quantify bias in parameter estimates. By varying the values of those parameters, researchers can evaluate the sensitivity of their results. For example, in a time-to-event analysis, if our estimated hazard ratio is 1.5, how do we know how much of this value can be attributed to unknown confounders? Lin et al. (1998) derived bias formulae for the Cox proportional hazards model. For a binary unmeasured confounder and a binary exposure, they showed that the sensitivity parameters were the prevalence of the unmeasured confounder in each exposure group and the log hazard of the unmeasured confounder. They demonstrated how unadjusted unmeasured confounding could mask the true effect.
Different statistical methods have been proposed to study the sensitivity of inference to the violation of the three assumptions in IV/MR studies. Methods such as scatterplots, funnel plots and Egger regression can be used to identify unbalanced pleiotropy (Burgess et al. 2017). Those methods do not quantify potential bias in the estimation of the true exposure effect on the outcome. Kolesár et al. proposed a bias-corrected two-stage least squares (TSLS) estimator and suggested comparing the bias-corrected TSLS to the usual limited-information-maximum-likelihood estimator while conducting the ORTs to gauge the effect of violated assumptions (b) and (c) (Kolesár et al. 2015). Conley et al. used Bayesian approaches to incorporate prior knowledge about the degree of violation of assumptions (b) and (c) into the analysis (Conley et al. 2012). Small’s method relies on first creating a sensitivity set for the parameter, then estimating a sensitivity interval for the true effect using values in the sensitivity set and a proposed hypothesis testing procedure (Small 2007). Wang et al. recently proposed a sensitivity analysis method based on the Anderson-Rubin (AR) test for IV analysis (Wang et al. 2018). Similar to other researchers who developed sensitivity analysis methods for various data analyses, we derive a simple sensitivity analysis formula and sensitivity parameters for MR from an IV bias equation. We demonstrate that our method can accurately measure the bias and lets investigators easily manipulate the sensitivity parameters to gauge the plausibility of their MR results. The rest of this manuscript is organized as follows. In Section 2, we define the data and sensitivity parameters and derive the bias formula. Section 3 shows the results of simulation studies to evaluate the formula and parameters. A real data analysis is presented in Section 4. Section 5 concludes with a brief discussion of our methodology.
2. Sensitivity parameters
2.1. Notation, background and general approach
Let Y be an n × 1 vector of the single outcome for n subjects and X be an n × 1 vector for the single exposure. We only consider common genetic variants as instruments in our methods. Let Z = [Z1, … , Zp] be an n × p matrix of p independent common genetic variants, whose minor allele frequency (MAF) is higher than 0.05, as the instrumental variables. Also, U1 and U2 are two unmeasured confounders between Y and X, where U1 is independent of the genetic instruments Z, but U2 is correlated with Z. In this manuscript we focus on a single continuous outcome and a single continuous exposure, with the true relationships between Y and X, between X and Z, and between the unmeasured confounders and Y, X or Z all being linear. To simplify presentation, we do not include any measured confounders in our models. A graphical depiction of the relationships is provided in Figure 1.
Fig. 1.
Relationship between outcome, exposure, instruments and unmeasured confounders
We use the two-stage least squares estimator (Basmann 1957; Theil 1953a; Theil 1953b; Theil 1958) to compute the exposure’s true effect on the outcome. The models we consider are the following:
$$Y = \beta_0 + \beta_x X + \beta_{u1} U_1 + \beta_{u2} U_2 + Z\beta_z + e_y \tag{1}$$
where ey~N(0, In) and βz is the vector of direct effects of Z on outcome and βz = (βz1 … βzp). ey is independent of X and Z.
$$X = \alpha_0 + Z\alpha_z + \alpha_{u1} U_1 + \alpha_{u2} U_2 + e_x \tag{2}$$
where ex~N(0, In), and αz is the vector of effects of Z on exposure, αz = (αz1, … , αzp).
$$U_2 = \gamma_0 + Z\gamma_z + e_{u2} \tag{3}$$
where eu2~N(0, In). From equations 2 and 3, we can further represent the exposure X as:
$$X = (\alpha_0 + \alpha_{u2}\gamma_0) + Z(\alpha_z + \alpha_{u2}\gamma_z) + \alpha_{u1} U_1 + \alpha_{u2} e_{u2} + e_x$$
When the confounder U2 is omitted from model (2), the effect of Z on X is αz + αu2γz, which is not the true association αz. Then, the fitted X is
$$\hat{X} = (\alpha_0 + \alpha_{u2}\gamma_0) + Z(\alpha_z + \alpha_{u2}\gamma_z) \tag{4}$$
Bound et al (Bound et al. 1995) consider a model
$$Y = \beta_0 + \beta_x X + \varepsilon \tag{5}$$
where βx is the true causal effect of exposure X on the outcome Y. They showed that $\operatorname{plim}\hat{\beta}_{IV} = \beta_x + \sigma_{\hat{x},\varepsilon}/\sigma^2_{\hat{x}}$, i.e. the IV estimator converges in probability to the true effect plus a bias term with a sufficiently large sample, where plim denotes the limit in probability, $\sigma_{\hat{x},\varepsilon}$ is the covariance between the fitted value (equation 4) and the error term in equation (5), which is ε = βu1U1 + βu2U2 + Zβz + ey from equation 1, and $\sigma^2_{\hat{x}}$ is the variance of the fitted value (equation 4). We derived the sensitivity parameters from this asymptotic bias, $\sigma_{\hat{x},\varepsilon}/\sigma^2_{\hat{x}}$. Appendix A derives the bias for more than one IV, but for simplicity and intuition, the single IV case is presented here:
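$$\operatorname{plim}\hat{\beta}_{IV} - \beta_x = \frac{\beta_{u2}\gamma_z + \beta_z}{\alpha_z + \alpha_{u2}\gamma_z} \tag{6}$$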
Equation (6) indicates that there are two parts contributing to the bias. The first is the path from the IV (Z) to the outcome (Y) through the unmeasured confounder, and the second is the direct effect of the IV (Z) on the outcome (Y). Both are normalized by the effect of the IV (Z) on the exposure (X). Therefore, the asymptotic bias of the two-stage least squares estimator is caused by any effects of the IV on the outcome that are not mediated by the exposure, and it is reduced when the IV has a stronger effect on the exposure. The IV ratio estimator is equivalent to the two-stage least squares estimator when there is only a single IV. From the ratio estimator’s point of view, we can also derive the same formula. The ratio estimator of the true effect βx is the effect of the IV on the outcome divided by the effect of the IV on the exposure, i.e. $\frac{\beta_x(\alpha_z + \alpha_{u2}\gamma_z) + \beta_{u2}\gamma_z + \beta_z}{\alpha_z + \alpha_{u2}\gamma_z}$, when the unmeasured confounders are not properly adjusted for. This equals βx plus the bias term.
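As an illustration of how the single-IV bias expression can be evaluated, the following minimal sketch assumes that equation (6) takes the form (βu2γz + βz)/(αz + αu2γz), which is our reading of the description above. The function name `asymptotic_bias` and its argument names are hypothetical and are not part of any published software.

```python
def asymptotic_bias(beta_z, gamma_z, alpha_z, alpha_u2, beta_u2):
    """Single-IV asymptotic bias, assuming the form
    (beta_u2 * gamma_z + beta_z) / (alpha_z + alpha_u2 * gamma_z)."""
    # Paths from the IV to the outcome that bypass the exposure:
    # via the unmeasured confounder U2, plus the direct (pleiotropic) effect.
    numerator = beta_u2 * gamma_z + beta_z
    # Total association of the IV with the exposure when U2 is omitted.
    denominator = alpha_z + alpha_u2 * gamma_z
    if denominator == 0:
        raise ValueError("IV has no association with the exposure; bias is undefined")
    return numerator / denominator


# A valid instrument (no pleiotropy, no correlation with U2) gives zero bias.
print(asymptotic_bias(beta_z=0.0, gamma_z=0.0, alpha_z=0.5, alpha_u2=0.5, beta_u2=0.5))  # 0.0
# Pleiotropy of 0.2 with an IV strength of 0.5 gives a bias of 0.4.
print(asymptotic_bias(beta_z=0.2, gamma_z=0.0, alpha_z=0.5, alpha_u2=0.5, beta_u2=0.5))  # 0.4
```

Doubling `alpha_z` in the second call halves the bias, which is the normalization by IV strength described above.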
2.2. Multiple instrument case
In many situations, it will likely be the case that multiple instrumental variables will be required to perform MR. Recently, Zhang and Ghosh explored the issue of using multiple variants in an instrumental variable framework (Zhang and Ghosh 2017). They studied the problem within the kernel machine framework, whose utility and impact in genetic studies has been extensive (e.g., Wu et al. 2011). Through simulation studies, Zhang and Ghosh (2017) made the following observations:
Using the full vector of SNPs as instruments led to suboptimal performance, especially when a fraction of the SNPs represents weak instruments;
In smaller sample sizes, there was substantial finite-sample bias when using the full vector of SNPs as instruments. This is due to the fact that chance imbalances are more likely to occur with more categories subclassified by SNP combination vectors.
In many instances, using a sum allelic score, which reduces the number of SNPs down to a one-dimensional score, was effective in both estimating exposure effects in MR analyses as well as maintaining type I error across a variety of settings.
For the purposes of sensitivity analyses, the sum allelic score reduces the vector Z of SNPs down to a one-dimensional random variable S. We can then substitute S in the role of Z in the previous section and derive the same bias formulae. Using the sum allelic score offers additional advantages. First, it effectively amounts to an averaging operation, so that rare occurrences of certain levels of SNPs do not distort the IV analysis. Second, it circumvents the need for multiparameter sensitivity analyses, which would be difficult to implement in practice. Finally, as seen in the case of one genetic IV, we would have an equation similar to (6) and thus have simple interpretations of the asymptotic bias, now for the sum score S.
3. Simulation studies and their results
In order to investigate if our formula closely estimates the bias and how the parameters in the formula influence the bias, we conducted simulation studies to evaluate the bias formulae across a wide range of situations.
3.1. Single genetic IV
We first sought to evaluate if and how well equation (6) and parameters reflect the observed bias when the three IV assumptions do not hold. Note that in (6), there are three parameters that are related to the IV: βz, the direct effect of IV on the outcome, αz, the direct effect of IV on the exposure and γz, the direct effect of IV on the unmeasured confounder. We focused our simulation studies on these three parameters while fixing the association between the unmeasured confounder to the outcome (βu2) and between the unmeasured confounder to the exposure (αu2).
In all of our simulations, we simulated 1000 datasets with sample size 2500 for each simulation setting. We first simulated a common SNP Z with a minor allele frequency (MAF) of 0.1. For all simulations, Z was coded using an additive model. We simulated two unmeasured confounders, U1~N(0,1) and U2~N(γ0 + Zγz, 1), which are associated with both the outcome Y and the exposure X. We fixed γ0 at 0.1 and obtained the exposure X using equation (2), where ex was drawn from N(0,1), α0 was fixed at 0.1, and both αu1 and αu2 were set to 0.5. Finally, we simulated the outcome Y using equation (1), where ey~N(0,1), β0 was fixed at 0.1, and both βu1 and βu2 were fixed at 0.5. The true effect of exposure on outcome βx was set to 0.5.
In our first study, we specified no association of Z with the unmeasured confounders (γz = 0). We varied the pleiotropy effect (βz) and the strength of the IV effect on the exposure (αz). Figure 2a shows the observed bias as influenced by those two parameters. As expected, the bias is at the highest level when the IV is very weak and the pleiotropy is very strong. At a fixed level of pleiotropy, increasing the strength of the IV can reduce the bias substantially. Figure 2b shows the absolute value of the difference between the observed bias and the bias estimated using equation (6). Equation (6) represents the observed bias very well except when the association between IV and exposure is almost zero. The observed bias includes both the asymptotic bias and the finite-sample bias of the TSLS estimator. The finite-sample bias is also magnified by a weak IV (Matthew et al. 2016). The large difference between the bias estimated by our formula and the observed bias when a very weak IV was involved may be partly due to the larger finite-sample bias that is not included in our formula.
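The single-IV simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `simulate_tsls_bias` is hypothetical, genotypes are drawn as Binomial(2, MAF) to reflect additive coding, and the single-IV TSLS estimate is computed as the ratio cov(Z, Y)/cov(Z, X), to which it is equivalent. The fixed values (intercepts of 0.1, confounder effects of 0.5, βx = 0.5) follow the text.

```python
import numpy as np


def simulate_tsls_bias(alpha_z, beta_z, gamma_z, n=2500, n_rep=1000,
                       maf=0.1, beta_x=0.5, seed=0):
    """Average observed bias of the single-IV TSLS (ratio) estimator."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        z = rng.binomial(2, maf, size=n).astype(float)      # additive SNP coding
        u1 = rng.normal(0.0, 1.0, size=n)                   # confounder independent of Z
        u2 = 0.1 + gamma_z * z + rng.normal(0.0, 1.0, size=n)  # confounder correlated with Z
        x = 0.1 + alpha_z * z + 0.5 * u1 + 0.5 * u2 + rng.normal(0.0, 1.0, size=n)
        y = (0.1 + beta_x * x + 0.5 * u1 + 0.5 * u2
             + beta_z * z + rng.normal(0.0, 1.0, size=n))
        # With one instrument, TSLS reduces to the ratio of sample covariances.
        estimates[r] = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    return estimates.mean() - beta_x


# Strong IV with pleiotropy and no IV-confounder association (gamma_z = 0).
observed = simulate_tsls_bias(alpha_z=1.0, beta_z=0.5, gamma_z=0.0, n_rep=200)
predicted = (0.5 * 0.0 + 0.5) / (1.0 + 0.5 * 0.0)  # assumed form of equation (6)
print(f"observed bias {observed:.3f}, predicted bias {predicted:.3f}")
```

For very weak instruments the ratio has heavy tails, so the mean over replicates can be unstable; a median may be a more robust summary in that regime.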
Fig. 2.
a pleiotropy and strength of IV
b pleiotropy and strength of IV
In our second set of simulation studies, we specified no direct association between Z and the outcome (βz = 0). We varied the strength of association between the IV and the unmeasured confounder U2 (γz) and the strength of the IV effect on the exposure (αz). The correlation between the instrument and the unmeasured confounder causes bias in the TSLS estimator, and stronger instruments can reduce the bias (Figure 3a). The observed bias is higher when the IV is very weak and the instrumental variable is strongly associated with the unmeasured confounder. However, the maximum observed bias is much smaller than that observed in Figure 2a, and the bias appears to increase at a lower rate. The association between the instrumental variable and the exposure includes both the direct effect (αz) and the path through the unmeasured confounder (αu2γz). Therefore, an increased γz also strengthens the IV and reduces the observed bias. Again, equation (6) performs well (Figure 3b). However, unlike in Figure 2b, the largest differences between the observed and estimated bias occur when neither the strength of the IV nor the association between the IV and the unmeasured confounder is large, owing to the diminished contribution of γz to the strength of the IV.
Fig. 3.
a direct effect of IV on outcome and strength of IV
b direct effect of IV on outcome and strength of IV
3.2. Unweighted allele score from multiple genetic IVs
Using multiple genetic variants as instruments in Mendelian Randomization analyses has been shown to increase power (Burgess and Thompson 2014). Several authors (Chao and Swanson 2007; Matthew et al. 2016) have separately shown that the exact finite-sample bias of the TSLS estimator is a non-linear function of the number of IVs, and larger numbers of instrumental variables cause larger finite-sample bias when other parameters are the same. Intuitively, in MR studies, large numbers of genetic IVs cause extensive subdivision of subjects into multi-locus genotypic subgroups, which leads to higher chances of imbalance in the unmeasured confounder across those subgroups. Consequently, this will render the IVs invalid. Therefore, combining multiple genetic variants into some type of allele score to decrease the finite-sample bias while keeping information from each variant has been studied and proposed (Burgess and Thompson 2014; Zhang and Ghosh 2017). We investigated how well our approach works with an unweighted allele score. Again, we simulated 1000 datasets with sample size 2500. We simulated a total of 6 independent common genetic variants, of which 3 variants were valid IVs, i.e. satisfied the three IV assumptions, and the other 3 violated assumptions (b) and (c). Hence, both the IV Z and its effects (βz, αz, and γz) were 6 × 1 vectors, and each variant was coded using an additive model. The allele score was defined as the unweighted sum of all genetic variables in Z. We adopted the same simulation approach and parameter values as in Section 3.1 other than the values of MAF, βz, αz, and γz. For calculating the bias using equation (6), we plugged in the means of the vectors βz, αz, and γz in place of the scalar βz, αz, and γz, respectively. For the studies with multiple IVs, we used relatively weak IVs, with the mean of αz set at 0.1.
We first evaluated the case where all 6 SNPs have the same MAF but have unequal effects on exposure, outcome and unmeasured confounder. We first generated a set of 6 values for the effect on exposure; the six values are (0.1 − 0.05, 0.1 − 0.05, 0.1, 0.1, 0.1 + 0.05, 0.1 + 0.05). We then drew a simple random sample without replacement (SRSWOR) from this set to assign each SNP an effect on exposure, such that the other parameters of the instrument and the IVs’ strength are independent. Similarly, for the IV direct effect on the outcome or on the unmeasured confounder, we selected an effect mean and an effect difference, then randomly assigned the three values (mean, mean − difference, mean + difference) to the three invalid IVs. The IV direct effect on the outcome and the effect on the unmeasured confounder are zero for the 3 valid instruments. Table 1 shows the parameter settings for the simulation studies.
Table 1.
The settings of simulation parameters for Figures 2–6.
| Setting | Figure | αz | βz | γz | MAF |
|---|---|---|---|---|---|
| Single genetic IV | 2 | [0, 2] | [0, 2] | 0 | 0.1 |
| Single genetic IV | 3 | [0, 2] | 0 | [0, 2] | 0.1 |
| Multiple genetic IVs | 4 | SRSWOR(0.05, 0.05, 0.1, 0.1, 0.15, 0.15) | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | (0.1, 0.1, 0.1, 0.1, 0.1, 0.1) |
| Multiple genetic IVs | 5 | SRSWOR(0.05, 0.05, 0.1, 0.1, 0.15, 0.15) | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | SRSWOR(0.1, 0.15, 0.2, 0.3, 0.35, 0.4) |
| Multiple genetic IVs | 6 | SRSWOR(0.05, 0.05, 0.1, 0.1, 0.15, 0.15) | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | (u-0.1, u, u+0.1, 0, 0, 0), u ∈ [0.1, 1.9] | SRSWOR(0.1, 0.12, 0.15, 0.17, 0.19, 0.22) |
Notes: SRSWOR: simple random sample without replacement.
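The unweighted allele score setting with equal MAFs can be sketched as below. This is an illustrative sketch, not the authors' code: the function name `allele_score_bias` is hypothetical, genotypes are assumed to be Binomial(2, MAF), the fixed intercepts and confounder effects follow Section 3.1, and the comparison value plugs the means of the effect vectors into the assumed form of equation (6), (βu2γz + βz)/(αz + αu2γz).

```python
import numpy as np


def allele_score_bias(alpha_z, beta_z, gamma_z, maf, n=2500, n_rep=1000,
                      beta_x=0.5, seed=0):
    """Average observed TSLS bias when the instrument is the unweighted allele score."""
    rng = np.random.default_rng(seed)
    alpha_z, beta_z, gamma_z, maf = (np.asarray(v, dtype=float)
                                     for v in (alpha_z, beta_z, gamma_z, maf))
    p = maf.size
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        z = rng.binomial(2, maf, size=(n, p)).astype(float)  # n x p genotype matrix
        u1 = rng.normal(size=n)
        u2 = 0.1 + z @ gamma_z + rng.normal(size=n)
        x = 0.1 + z @ alpha_z + 0.5 * u1 + 0.5 * u2 + rng.normal(size=n)
        y = 0.1 + beta_x * x + 0.5 * u1 + 0.5 * u2 + z @ beta_z + rng.normal(size=n)
        s = z.sum(axis=1)                                     # unweighted allele score
        estimates[r] = np.cov(s, y)[0, 1] / np.cov(s, x)[0, 1]
    return estimates.mean() - beta_x


# Three invalid SNPs (u = 0.5 in the notation of Table 1) and three valid SNPs.
alpha = [0.05, 0.05, 0.1, 0.1, 0.15, 0.15]
beta = [0.4, 0.5, 0.6, 0.0, 0.0, 0.0]
gamma = [0.4, 0.5, 0.6, 0.0, 0.0, 0.0]
observed = allele_score_bias(alpha, beta, gamma, maf=[0.1] * 6, n_rep=200)
predicted = (0.5 * np.mean(gamma) + np.mean(beta)) / (np.mean(alpha) + 0.5 * np.mean(gamma))
print(f"observed bias {observed:.3f}, mean-plug-in prediction {predicted:.3f}")
```

In practice the assignment of effects to SNPs would be randomized (SRSWOR) in each setting; a fixed assignment is used here only to keep the sketch short.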
It is unrealistic for all genetic variants to have the same MAF. In Appendix A, our derivation shows that the variance of each IV affects the bias, and the variance terms cannot be canceled out when the allele frequencies are different. We investigated how variants having different MAFs would affect our methods. While using the procedure described above to assign IV effects, we randomly assigned distinct MAFs to the variants. We evaluated both a set of 6 MAFs having larger differences (MAF = (0.1, 0.15, 0.2, 0.3, 0.35, 0.4)) and a set having smaller differences (MAF = (0.1, 0.12, 0.15, 0.17, 0.19, 0.22)). The MAFs were assigned to SNPs independently of their validity as an instrument.
In the equal MAF (Figure 4a), unequal MAF (Figure 5a) and smaller MAF difference (Figure 6a) situations, the observed bias demonstrates the same trend. Namely, the bias is larger when pleiotropy is larger and the association between the IV and the unmeasured confounder is not very large, because a smaller γz reduces the IV strength. When MAFs are equal (Figure 4b), the difference between the observed bias and the bias estimated from equation (6) was small; its first quartile was 0.0035 and its third quartile was 0.0073. When the IVs have different MAFs (Figure 5b), the first and third quartiles of the difference between observed and estimated bias were 0.033 and 0.108, respectively. When the differences among MAFs are smaller (Figure 6b), the first quartile of the difference between the observed bias and the bias estimated by equation (6) was 0.019 and the third quartile was 0.071. The difference between the observed bias and the bias estimated by equation (6) increases in both Figures 5b and 6b when the pleiotropy is larger and the correlation between genetic variants and the unmeasured confounder is smaller. Therefore, the difference between the observed bias and the bias estimated using equation (6) becomes larger when genetic variants have different MAFs, and larger differences in MAFs cause a larger deviation of equation (6) from the observed bias. However, the simulations in Figures 5 and 6 show that equation (6) estimates the bias of the IV estimator relatively well in many realistic scenarios.
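To see why unequal MAFs move the bias away from the mean-plug-in version of equation (6), the following sketch compares it with a variance-weighted expression. The weighted form is our own working under the linear models of Section 2 with independent SNPs and Var(Zj) = 2·MAFj·(1 − MAFj) (Hardy-Weinberg proportions assumed); it is not taken from Appendix A, and both function names are hypothetical.

```python
import numpy as np


def score_bias_mean_plugin(alpha_z, beta_z, gamma_z, alpha_u2=0.5, beta_u2=0.5):
    """Assumed form of equation (6) with vector means plugged in."""
    a, b, g = (np.mean(v) for v in (alpha_z, beta_z, gamma_z))
    return (beta_u2 * g + b) / (a + alpha_u2 * g)


def score_bias_variance_weighted(alpha_z, beta_z, gamma_z, maf,
                                 alpha_u2=0.5, beta_u2=0.5):
    """Large-sample bias of the unweighted-score ratio estimator for
    independent SNPs, weighting each SNP by its genotype variance."""
    alpha_z, beta_z, gamma_z, maf = (np.asarray(v, dtype=float)
                                     for v in (alpha_z, beta_z, gamma_z, maf))
    var_z = 2.0 * maf * (1.0 - maf)  # variance of an additive genotype under HWE
    numerator = np.sum((beta_z + beta_u2 * gamma_z) * var_z)
    denominator = np.sum((alpha_z + alpha_u2 * gamma_z) * var_z)
    return numerator / denominator


alpha = [0.05, 0.05, 0.1, 0.1, 0.15, 0.15]
beta = [0.4, 0.5, 0.6, 0.0, 0.0, 0.0]
gamma = [0.4, 0.5, 0.6, 0.0, 0.0, 0.0]

plug = score_bias_mean_plugin(alpha, beta, gamma)
equal = score_bias_variance_weighted(alpha, beta, gamma, [0.1] * 6)
unequal = score_bias_variance_weighted(alpha, beta, gamma, [0.1, 0.15, 0.2, 0.3, 0.35, 0.4])
print(f"plug-in {plug:.3f}, equal MAF {equal:.3f}, unequal MAF {unequal:.3f}")
```

With equal MAFs the variance weights cancel and the two expressions agree; with the unequal MAFs shown, the invalid SNPs happen to carry the smaller variances, so the weighted bias falls below the plug-in value.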
Fig. 4.
a unweighted allele score from SNPs with unequal strength, but same MAF
b unweighted allele score from SNPs with unequal strength, but same MAF
Fig. 5.
a unweighted allele score from SNPs with unequal strength and different MAFs
b unweighted allele score from SNPs with unequal strength and different MAFs
Fig. 6.
a unweighted allele score from SNPs with unequal strength and smaller difference in MAFs
b unweighted allele score from SNPs with unequal strength and smaller difference in MAFs
4. Data analysis
We now apply the methodology to a case study that involves understanding the role of body mass composition in bone mineral density (BMD). Bones provide support to our body and protection to our vital organs. Proper bone health is critical for humans to have a healthy and vibrant life even at older ages. One indicator of bone health is BMD, which measures the amount of minerals (mostly calcium and phosphorus) contained in a certain volume of bone. Researchers and physicians have extensively studied the effect of body lean mass and body fat mass on BMD, but have largely come up with conflicting results, reflecting the complex relationship between body lean mass, fat mass and bone mass (Dimitri 2018). One possible explanation is the presence of unmeasured confounders in the analyses.
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a longitudinal, population-based birth cohort study in Avon, UK (Golding 1990). Timpson et al. conducted a Mendelian randomization study to investigate the influence of body fat on bone mass in children, whose mean age was 9.8 years (Timpson et al. 2009). All subjects of non-white ethnic origin were excluded from analyses. They chose one SNP from each of two independent loci (FTO/rs9939609 and MC4R/rs17782313) as the instruments for the exposure body fat. In order to identify valid IVs, they screened IV candidates for their association with potential confounders and found a weak association between the FTO marker and the covariates lean mass and sitting height in this study population. In their two-stage least squares analysis models, sex, sitting height and height were adjusted for. Timpson et al. (2009) identified a weak positive effect of fat mass on BMD (effect size = 1.081, 95% CI = (1.050, 1.113)). Palmer et al. reanalyzed the data from ALSPAC with different IVs and models from those of Timpson et al. to study the causal effect of fat mass on BMD using Mendelian randomization (Palmer et al. 2012). These authors chose one adiposity-associated SNP from each of four different genes as the IVs in TSLS analyses while adjusting for sex, age, height and height squared. In their sample, the FTO marker did not show an association with lean mass; however, the TMEM18 marker showed an association with lean mass and the mother’s educational attainment. They treated those associations as chance associations. When using the unweighted allele score from the four SNPs in TSLS, they estimated the causal effect of the allele score on BMD as 1.40, and the 95% confidence interval was (0.99, 1.98).
The results from Timpson et al. and Palmer et al. are consistent with conclusions from some but not all of the previous bone health studies. For example, Wosje et al. studied children aged 7 years and found a negative association between baseline fat mass at age 3.5 years and bone mass (Wosje et al. 2009). Besides covariates such as sex, race and height, Wosje et al. (2009) included potential confounders such as physical activity (number of hours of TV viewed per day) and daily calcium intake that were omitted in other prior studies. Timpson et al. and Palmer et al. applied MR to counter the effect of missing confounders. However, there could be some violations of the IV/MR assumptions. These studies clearly illustrate the need for the analysis tools described in this paper.
We applied equation (6) to estimate how much bias various levels of the bias parameters can create. We set αu2 and βu2 each at a low (0.1), medium (0.3) or high (0.5) level to create 9 combinations of those two parameters. We then varied βz and γz from 0.1 to 0.5 in increments of 0.05. For the effect of the IVs on exposure, αz, we adopted the average of the first-stage coefficients listed in Palmer et al.’s article. When the true effect of fat mass on BMD is zero, the estimated coefficient from the MR analysis arises entirely from the bias caused by violation of the assumptions. The 25% and 75% quantiles of the bias are given in Table 2. In some instances, the bias is larger than 1.4. Therefore, if there is a true effect of fat mass in those situations, the effect must be negative in order to bring the estimated coefficient down to 1.4. For example, from the first row of Table 2, when αu2 = 0.1 and βu2 = 0.1, if the bias is 1.56, the bias-corrected effect estimate is 1.40 − 1.56 = −0.16, and the 95% confidence interval is (0.99 − 1.56, 1.98 − 1.56) = (−0.57, 0.42). Thus, investigators may first combine our bias equation with their initial point estimates and confidence intervals from MR analyses to calculate a bias-corrected effect and its confidence interval for each unknown confounder parameter setting. They can then use their domain knowledge to choose plausible unmeasured confounder parameter ranges for evaluating their results. For example, we only used positive effects of unmeasured confounders; if investigators know that there might be confounders having a negative relationship with the outcome or exposure, they could choose negative αu2 or βu2 to obtain bias-corrected estimates of the true effect of exposure on outcome and draw informed conclusions. 
The lower and upper bounds of the bias-adjusted confidence intervals are the initial bounds shifted by the value of the bias, without updating the standard error, since the variance of the estimate from the observed data is the same as the variance of the bias-corrected estimate when the bias is held at a fixed value (Lin et al. 1998).
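The shift described above can be written out as a few lines of Python. This is only a minimal sketch of the arithmetic: the function name `bias_corrected` is hypothetical, and the inputs are the estimate, confidence limits and bias value quoted in the worked example in the text.

```python
def bias_corrected(estimate, ci_lower, ci_upper, bias):
    """Shift an MR point estimate and its confidence limits by a fixed bias.

    The standard error is left unchanged, so the interval keeps its width.
    """
    return estimate - bias, ci_lower - bias, ci_upper - bias


# Values quoted in the text: estimate 1.40, 95% CI (0.99, 1.98),
# and a bias of 1.56 (alpha_u2 = 0.1, beta_u2 = 0.1, 25th percentile).
est, lower, upper = bias_corrected(1.40, 0.99, 1.98, 1.56)
print(round(est, 2), round(lower, 2), round(upper, 2))  # -0.16 -0.57 0.42
```

Repeating the call for each bias value an investigator considers plausible gives the corresponding range of bias-corrected estimates.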
Table 2.
1st and 3rd quartile of bias calculated from our method.
| αu2 | βu2 | 25th percentile | 75th percentile |
|---|---|---|---|
| 0.1 | 0.1 | 1.56 | 3.02 |
| 0.1 | 0.3 | 1.92 | 3.37 |
| 0.1 | 0.5 | 2.37 | 3.83 |
| 0.3 | 0.1 | 1.09 | 2.11 |
| 0.3 | 0.3 | 1.38 | 2.43 |
| 0.3 | 0.5 | 1.71 | 2.72 |
| 0.5 | 0.1 | 0.83 | 1.67 |
| 0.5 | 0.3 | 1.07 | 1.88 |
| 0.5 | 0.5 | 1.28 | 2.09 |
To further investigate the performance of equation (6) in a real-data setting, we simulated data using the same scheme as in Section 3. We identified the MAFs of those four SNPs in the 1000 Genomes Project phase 3 CEU population and simulated the genetic data (Table 3). Based on the first-stage F statistic, FTO/rs9939609 and TMEM18/rs6548238 are the strongest and the weakest IV, respectively. We assigned FTO/rs9939609 and TMEM18/rs6548238 to be invalid IVs in a hypothetical case. Such invalidity could be due to an association with the unmeasured confounder U2 (non-zero γz) and/or a non-zero direct effect (βz) on the outcome BMD. We set αu2 and βu2 each at a low (0.1), medium (0.3) or high (0.5) level, giving 9 combinations of those two parameters. We then varied βz and γz from 0.1 to 0.5 in increments of 0.05. We used the first-stage coefficients listed in the article by Palmer et al. to simulate the exposure (fat mass) (Table 3). When simulating the BMD data, we set the effect of fat mass on BMD to zero. The observed bias from the TSLS analysis of the simulated data (Table 4) is very close to the estimated bias (Table 2) calculated directly from equation (6) in this real-data setting: the corresponding quartiles differ by at most 0.18 across all nine settings.
Table 3.
Genetic variants from the original article and their information.
| Gene/SNP | MAF in CEU | First-stage regression coefficient | First-stage F-statistic |
|---|---|---|---|
| FTO/rs9939609 | 0.444 | 0.1285 | 39.66 |
| MC4R/rs17782313 | 0.258 | 0.0975 | 17.84 |
| TMEM18/rs6548238 | 0.167 | −0.0685 | 7.5 |
| GNPDA2/rs10938397 | 0.424 | 0.054 | 7.59 |
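The genotype and exposure simulation can be sketched as follows. This is an illustration under stated assumptions, not the scheme of Section 3: genotypes are drawn as two independent allele draws per SNP (Hardy-Weinberg equilibrium), the unweighted allele score is taken to be the plain sum of minor-allele counts, the exposure noise is standard normal, and the function names and sample size are hypothetical. Only the MAFs and first-stage coefficients come from Table 3.

```python
import random

# (MAF in CEU, first-stage coefficient) taken from Table 3.
SNPS = {
    "FTO/rs9939609": (0.444, 0.1285),
    "MC4R/rs17782313": (0.258, 0.0975),
    "TMEM18/rs6548238": (0.167, -0.0685),
    "GNPDA2/rs10938397": (0.424, 0.054),
}


def simulate_genotype(maf, rng):
    """Minor-allele count (0, 1 or 2) from two independent allele draws."""
    return int(rng.random() < maf) + int(rng.random() < maf)


def simulate_cohort(n, seed=0):
    """Return a list of (genotypes, allele_score, exposure) tuples."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        genotypes = [simulate_genotype(maf, rng) for maf, _ in SNPS.values()]
        allele_score = sum(genotypes)  # unweighted: every SNP counts equally
        exposure = sum(
            coef * g for (_, coef), g in zip(SNPS.values(), genotypes)
        ) + rng.gauss(0, 1)  # assumed unit-variance noise
        cohort.append((genotypes, allele_score, exposure))
    return cohort


cohort = simulate_cohort(2000, seed=1)
mean_fto = sum(genotypes[0] for genotypes, _, _ in cohort) / len(cohort)
print(f"mean FTO minor-allele count: {mean_fto:.3f} (expected about {2 * 0.444:.3f})")
```

The unmeasured confounder and the outcome are deliberately left out here; adding them requires the structural equations from the earlier sections.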
Table 4.
1st and 3rd quartiles of the observed bias in data simulated using the MAFs and first-stage coefficients from the article by Palmer et al., with the unweighted allele score used as the instrument in the MR analysis.
| αu2 | βu2 | 25th percentile | 75th percentile |
|---|---|---|---|
| 0.1 | 0.1 | 1.67 | 3.13 |
| 0.1 | 0.3 | 2.10 | 3.49 |
| 0.1 | 0.5 | 2.51 | 3.97 |
| 0.3 | 0.1 | 1.15 | 2.17 |
| 0.3 | 0.3 | 1.44 | 2.48 |
| 0.3 | 0.5 | 1.77 | 2.79 |
| 0.5 | 0.1 | 0.87 | 1.70 |
| 0.5 | 0.3 | 1.11 | 1.90 |
| 0.5 | 0.5 | 1.34 | 2.12 |
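The agreement between Table 2 and Table 4 can be checked by differencing the two tables row by row. The sketch below does only that; the dictionary layout and the function name `quartile_gaps` are hypothetical, while the numbers are copied from the two tables.

```python
# Keys are (alpha_u2, beta_u2); values are (25th percentile, 75th percentile).
TABLE_2 = {  # bias calculated from equation (6)
    (0.1, 0.1): (1.56, 3.02), (0.1, 0.3): (1.92, 3.37), (0.1, 0.5): (2.37, 3.83),
    (0.3, 0.1): (1.09, 2.11), (0.3, 0.3): (1.38, 2.43), (0.3, 0.5): (1.71, 2.72),
    (0.5, 0.1): (0.83, 1.67), (0.5, 0.3): (1.07, 1.88), (0.5, 0.5): (1.28, 2.09),
}
TABLE_4 = {  # bias observed in the simulated data
    (0.1, 0.1): (1.67, 3.13), (0.1, 0.3): (2.10, 3.49), (0.1, 0.5): (2.51, 3.97),
    (0.3, 0.1): (1.15, 2.17), (0.3, 0.3): (1.44, 2.48), (0.3, 0.5): (1.77, 2.79),
    (0.5, 0.1): (0.87, 1.70), (0.5, 0.3): (1.11, 1.90), (0.5, 0.5): (1.34, 2.12),
}


def quartile_gaps(formula, observed):
    """Observed minus formula-based quartiles for every parameter setting."""
    return {
        key: tuple(round(o - f, 2) for f, o in zip(formula[key], observed[key]))
        for key in formula
    }


gaps = quartile_gaps(TABLE_2, TABLE_4)
for (alpha_u2, beta_u2), (gap_q1, gap_q3) in gaps.items():
    print(f"alpha_u2={alpha_u2}, beta_u2={beta_u2}: Q1 gap={gap_q1}, Q3 gap={gap_q3}")
largest_gap = max(max(gap) for gap in gaps.values())
print("largest gap:", largest_gap)  # 0.18
```

In every setting the observed quartile is slightly larger than the formula-based one, and the gap shrinks as αu2 increases.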
5. Discussion
Sensitivity analysis is a key technique for nonrandomized studies. It provides qualitative and quantitative information about the influence of unmeasured confounders on study results. In this article we have proposed a sensitivity analysis approach for Mendelian randomization studies. Our method uses sensitivity parameters derived from the asymptotic bias of the IV estimator. We demonstrated the effectiveness of our method through simulation studies and a real-data study. Unlike previous authors (Conley et al. 2012; Kolesár et al. 2015; Small 2007; Wang et al. 2018), we treat the direct effect of the IV on the outcome and the effect of the IV on the outcome through the unmeasured confounder separately, with different sensitivity parameters. These two parameters behave differently because the IV association with unmeasured confounders can also affect the IV association with the exposure. The strength of the IV can exacerbate the bias caused by these two parameters. Therefore, in a sense, it is the most important parameter in both Mendelian randomization and the attendant sensitivity analysis. Weak IVs can also negatively affect the method of Small (2007). The method of Wang et al. (2018) avoids this problem, but it focuses solely on hypothesis testing. We include the IV strength explicitly in our formula because it remains uncertain even when the first-stage F statistic is available. Our method allows investigators to apply their field expertise together with the first-stage F statistic to adjust this parameter in their analyses.
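The interplay between IV strength and the two violation parameters can be illustrated with a small simulation. The sketch below is not equation (6). It assumes a simple linear model with one continuous instrument Z, one unmeasured confounder U = γz·Z + noise, exposure X = αz·Z + αu·U + noise and outcome Y = βz·Z + βu·U + noise (true effect of X on Y set to zero), for which the large-sample bias of the single-instrument ratio estimator is (βz + βu·γz) / (αz + αu·γz). All function names, parameter values and the sample size are hypothetical.

```python
import random


def analytic_bias(alpha_z, beta_z, gamma_z, alpha_u, beta_u):
    """Large-sample bias of the ratio estimator under the assumed model."""
    return (beta_z + beta_u * gamma_z) / (alpha_z + alpha_u * gamma_z)


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulated_bias(alpha_z, beta_z, gamma_z, alpha_u, beta_u, n=50_000, seed=0):
    """Ratio (Wald) estimate cov(Z, Y) / cov(Z, X) when the true effect is zero."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = gamma_z * zi + rng.gauss(0, 1)  # instrument correlated with U
        xi = alpha_z * zi + alpha_u * ui + rng.gauss(0, 1)
        yi = beta_z * zi + beta_u * ui + rng.gauss(0, 1)  # direct effect beta_z
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return covariance(z, y) / covariance(z, x)


settings = dict(beta_z=0.1, gamma_z=0.1, alpha_u=0.3, beta_u=0.3)
for alpha_z in (0.5, 0.1):  # strong versus weak instrument
    expected = analytic_bias(alpha_z, **settings)
    observed = simulated_bias(alpha_z, **settings)
    print(f"alpha_z={alpha_z}: analytic={expected:.3f}, simulated={observed:.3f}")
```

Under these assumptions the same violation produces a several-fold larger bias with the weaker instrument, and γz enters both the numerator and the denominator, which is one way to see why the two violation parameters behave differently.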
Our method is derived from an asymptotic bias formula. For IV/MR estimators, finite-sample bias is an important source of bias. It has been shown that there is significant finite-sample bias even with a large sample size when the IVs are weak (Bound et al. 1995). We observed a larger difference between the bias calculated from our formula and the observed bias when the IVs are very weak. However, investigators usually screen for stronger IVs before the MR analysis. Hence, we do not expect very weak IVs when our method is used for sensitivity analysis. The proposed method does not directly handle the multiple-IV scenario. However, in order to combat finite-sample bias, an allele score combining all genetic variants has been suggested as a replacement for multiple genetic variants in the first-stage model (Burgess and Thompson 2015; Zhang and Ghosh 2017). We studied our method with an allele score in both simulated and real data and demonstrated the effectiveness of the proposed method. Intuitively, the allele score should behave like a single instrument in our method. Nevertheless, our simulations demonstrate that the MAFs of the individual components affect the accuracy of our method under the allele score approach. Recently, GWAS summary data-based Mendelian randomization (SMR) has been proposed, and SMR has been effectively applied to examine the relationship between eQTLs and phenotypes (Wu et al. 2018; Zhu et al. 2016). Although SMR reduces the burden of data requirements, the SNPs used in these studies are still required to meet the IV/MR assumptions. How to extend the procedures in this paper to the SMR case is a topic for future investigation.
Acknowledgements
This research was supported by the National Science Foundation under Grant No. NSF ABI 1457935 and the National Institutes of Health under Grant R01 GM117946.
Appendix A.
We assume independence of the p genetic variants. We first expand the covariance between the fitted exposure and the error term in equation (5).
Hence,
When there is a single SNP (p = 1), the bias simplifies to
When multiple SNPs are used in MR, it has been suggested that a summary score be used in place of the multiple SNPs to reduce the finite-sample bias. This simplified equation may be used in such situations by treating the summary score as the single instrument.
Footnotes
Conflict of Interest disclosure:
Weiming Zhang declares no conflict of interest.
Debashis Ghosh declares no conflict of interest.
References
- Auerbach J et al. (2018) Causal modeling in a multi-omic setting: insights from GAW20. BMC Genet 19:74 doi: 10.1186/s12863-018-0645-4
- Basmann RL (1957) A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica: Journal of the Econometric Society:77–83
- Bauchet M et al. (2007) Measuring European population stratification with microarray genotype data. Am J Hum Genet 80:948–956 doi: 10.1086/513477
- Bound J, Jaeger D, Baker R (1995) Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association 90:443–450 doi: 10.2307/2291055
- Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG (2017) Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants Epidemiology 28:30–42 doi: 10.1097/EDE.0000000000000559
- Burgess S, Thompson SG (2014) Mendelian randomization : methods for using genetic variants in causal estimation. Chapman & Hall/CRC interdisciplinary statistics series. Chapman & Hall/CRC, Boca Raton
- Burgess S, Thompson SG (2015) Mendelian randomization : methods for using genetic variants in causal estimation. Chapman & Hall/CRC interdisciplinary statistics series. CRC Press, Taylor & Francis Group, Boca Raton
- Chao J, Swanson NR (2007) Alternative approximations of the bias and MSE of the IV estimator under weak identification with an application to bias correction. Journal of Econometrics 137:515–555 doi: 10.1016/j.jeconom.2005.09.002
- Conley TG, Hansen CB, Rossi PE (2012) Plausibly Exogenous. The Review of Economics and Statistics 94:260–272 doi: 10.1162/REST_a_00139
- Cornfield J, Haenszel W, Hammond EC, Lilienfeld AM, Shimkin MB, Wynder EL (1959) Smoking and lung cancer: recent evidence and a discussion of some questions J Natl Cancer Inst 22:173–203
- Davey Smith G, Ebrahim S (2005) What can mendelian randomisation tell us about modifiable behavioural and environmental exposures? BMJ 330:1076–1079 doi: 10.1136/bmj.330.7499.1076
- Harding DJ (2003) Counterfactual Models of Neighborhood Effects: The Effect of Neighborhood Poverty on Dropping Out and Teenage Pregnancy American Journal of Sociology 109:676–719 doi: 10.1086/379217
- Dimitri P (2018) Fat and bone in children - where are we now? Ann Pediatr Endocrinol Metab 23:62–69 doi: 10.6065/apem.2018.23.2.62
- Gastwirth JL, Krieger AM, Rosenbaum PR (1998) Dual and simultaneous sensitivity analysis for matched pairs Biometrika 85:907–920 doi: 10.1093/biomet/85.4.907
- Goh WWB, Wang W, Wong L (2017) Why Batch Effects Matter in Omics Data, and How to Avoid Them Trends Biotechnol 35:498–507 doi: 10.1016/j.tibtech.2017.02.012
- Golding J (1990) Children of the nineties. A longitudinal study of pregnancy and childhood based on the population of Avon (ALSPAC) West Engl Med J 105:80–82
- Greenland S (1996) Basic methods for sensitivity analysis of biases. Int J Epidemiol 25:1107–1116
- Haavelmo T (1944) The probability approach in econometrics Econometrica: Journal of the Econometric Society:iii–115
- Hackinger S, Zeggini E (2017) Statistical methods to detect pleiotropy in human complex traits Open Biol 7 doi: 10.1098/rsob.170125
- Katan MB (2004) Apolipoprotein E isoforms, serum cholesterol, and cancer. 1986 Int J Epidemiol 33:9 doi: 10.1093/ije/dyh312
- Kolesár M, Chetty R, Friedman J, Glaeser E, Imbens GW (2015) Identification and Inference With Many Invalid Instruments Journal of Business & Economic Statistics 33:474–484 doi: 10.1080/07350015.2014.978175
- Leek JT et al. (2010) Tackling the widespread and critical impact of batch effects in high-throughput data Nat Rev Genet 11:733–739 doi: 10.1038/nrg2825
- Lin DY, Psaty BM, Kronmal RA (1998) Assessing the sensitivity of regression results to unmeasured confounders in observational studies Biometrics 54:948–963
- Listgarten J, Kadie C, Schadt EE, Heckerman D (2010) Correction for hidden confounders in the genetic analysis of gene expression Proc Natl Acad Sci U S A 107:16465–16470 doi: 10.1073/pnas.1002425107
- Matthew H, Jerry H, Christopher JP (2016) Finite Sample Bias Corrected IV Estimation for Weak and Many Instruments. In, vol 36. Advances in Econometrics. Emerald Publishing Ltd, pp 245–273.
- Michaelson JJ, Loguercio S, Beyer A (2009) Detection and interpretation of expression quantitative trait loci (eQTL) Methods 48:265–276 doi: 10.1016/j.ymeth.2009.03.004
- Neuman JA, Isakov O, Shomron N (2013) Analysis of insertion-deletion from deep-sequencing data: software evaluation for optimal detection Briefings in bioinformatics 14:46–55 doi: 10.1093/bib/bbs013
- Novembre J et al. (2008) Genes mirror geography within Europe Nature 456:98–101 doi: 10.1038/nature07331
- Palmer TM et al. (2012) Using multiple genetic variants as instrumental variables for modifiable risk factors Stat Methods Med Res 21:223–242 doi: 10.1177/0962280210394459
- Rosenbaum PR (1987) Sensitivity Analysis for Certain Permutation Inferences in Matched Observational Studies Biometrika 74:13–26 doi: 10.2307/2336017
- Rosenbaum PR, Rubin DB (1983) Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome Journal of the Royal Statistical Society Series B (Methodological) 45:212–218
- Seldin MF, Price AL (2008) Application of ancestry informative markers to association studies in European Americans PLoS Genet 4:e5 doi: 10.1371/journal.pgen.0040005
- Sivakumaran S et al. (2011) Abundant pleiotropy in human complex diseases and traits Am J Hum Genet 89:607–618 doi: 10.1016/j.ajhg.2011.10.004
- Small DS (2007) Sensitivity Analysis for Instrumental Variables Regression With Overidentifying Restrictions Journal of the American Statistical Association 102:1049–1058 doi: 10.1198/016214507000000608
- Smith GD, Ebrahim S (2003) ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32:1–22
- Smith GD, Ebrahim S (2004) Mendelian randomization: prospects, potentials, and limitations Int J Epidemiol 33:30–42 doi: 10.1093/ije/dyh132
- Theil H (1953a) Estimation and simultaneous correlation in complete equation systems. Central Planning Bureau. Mimeo, The Hague.
- Theil H (1953b) Repeated least squares applied to complete equation systems. Central Planning Bureau. Mimeo, The Hague.
- Theil H (1958) Economic forecasts and policy. Central Planning Bureau. Mimeo, The Hague.
- Timpson NJ, Sayers A, Davey-Smith G, Tobias JH (2009) How does body fat influence bone mass in childhood? A Mendelian randomization approach J Bone Miner Res 24:522–533 doi: 10.1359/jbmr.081109
- Vanderweele TJ, Arah OA (2011) Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders Epidemiology 22:42–52 doi: 10.1097/EDE.0b013e3181f74493
- Wang X, Jiang Y, Zhang NR, Small DS (2018) Sensitivity analysis and power for instrumental variable studies Biometrics doi: 10.1111/biom.12873
- Wosje KS, Khoury PR, Claytor RP, Copeland KA, Kalkwarf HJ, Daniels SR (2009) Adiposity and TV viewing are related to less bone accrual in young children J Pediatr 154:79–85.e72 doi: 10.1016/j.jpeds.2008.06.031
- Wright PG (1928) The tariff on animal and vegetable oils. The Institute of Economics Investigations in international commercial policies, vol no 26. The Macmillan company, New York
- Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X (2011) Rare-variant association testing for sequencing data with the sequence kernel association test Am J Hum Genet 89:82–93 doi: 10.1016/j.ajhg.2011.05.029
- Wu Y et al. (2018) Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits Nat Commun 9:918 doi: 10.1038/s41467-018-03371-0
- Zhang W, Ghosh D (2017) On the use of kernel machines for Mendelian randomization Quantitative Biology 5:368–379 doi: 10.1007/s40484-017-0124-3
- Zhu Z et al. (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets Nat Genet 48:481–487 doi: 10.1038/ng.3538






