Summary:
Mendelian randomization (MR) is a type of instrumental variable (IV) analysis that uses genetic variants as IVs for a risk factor to study its causal effect on an outcome. Extensive investigations on the performance of IV analysis procedures, such as the one based on the two-stage least squares (2SLS) procedure, have been conducted under the one-sample scenario, where measures on IVs, the risk factor, and the outcome are assumed to be available for each study participant. Recent MR analysis usually is performed with data from two independent or partially overlapping genetic association studies (two-sample setting), with one providing information on the association between the IVs and the outcome, and the other on the association between the IVs and the risk factor. We investigate the performance of 2SLS in the two-sample based MR when the IVs are weakly associated with the risk factor. We derive closed form formulas for the bias and mean squared error (MSE) of the 2SLS estimate, and verify them with numeric simulations under realistic circumstances. Using these analytic formulas we can study the pros and cons of conducting MR analysis under one-sample and two-sample settings, and assess the impact of having overlapping samples. We also propose and validate a bias-corrected estimator for the causal effect.
Keywords: Bias, Instrumental variable, Mean squared error, Mendelian randomization, Two-stage least squares estimate
1. Introduction
Mendelian randomization (MR) analysis uses genetic variants as instrumental variables (IVs) to estimate the causal effect of a risk factor on an outcome based on observational studies (Davies, Holmes and Davey Smith, 2018; Lawlor et al., 2008; Smith and Ebrahim, 2003). Recent genome-wide association studies (GWAS) have generated many robust findings on the association between genetic variants and various phenotypes, which facilitate the choice of appropriate genetic variants as IVs for a wide range of MR analyses (Hemani et al., 2018; Staley et al., 2016; Zhu et al., 2016).
Although recently published MR studies usually rely on results from large GWAS, and use IVs that can jointly explain a substantial proportion of the total variation in the risk factor, some MR studies, especially those based on GWAS with limited sample sizes, are still conducted with weak IVs that only explain a small proportion of variation (Meng et al., 2018; Takahashi et al., 2019; Theodoratou et al., 2012; Treur et al., 2018). Challenges of using weak IVs, and ways to overcome them, have been studied extensively in the econometrics literature, with a major focus on the one-sample setting, where measurements on IVs, the risk factor, and the outcome are available for each study participant (Bound, Jaeger and Baker, 1995; Chao and Swanson, 2007; Staiger and Stock, 1997; Stock, Wright and Yogo, 2002). One common IV analysis approach is the two-stage least squares method (2SLS), which builds a regression model using IVs to predict the risk factor, called the first stage regression, and then regresses the outcome on predicted values of the risk factor, called the second stage regression. The 2SLS estimator can have a non-normal sampling distribution even with a large sample size when dealing with weak IVs (Stock et al., 2002). It can have a substantial bias in the same direction as the ordinary least squares (OLS) estimator, which is obtained by regressing the outcome on the risk factor directly, and can thus misrepresent the causal relationship because of unadjusted confounding effects (Bound et al., 1995; Stock et al., 2002).
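As a minimal sketch of the two stages just described, the following Python snippet fits one-sample 2SLS and OLS on simulated data. The sample size, number of IVs, effect sizes, error correlation, and the use of standard normal instruments are all illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def two_stage_least_squares(z, x, y):
    """One-sample 2SLS without intercept or covariates."""
    # First stage: regress the risk factor x on the instruments z.
    pi_hat, *_ = np.linalg.lstsq(z, x, rcond=None)
    x_hat = z @ pi_hat
    # Second stage: regress the outcome y on the predicted risk factor.
    return float(x_hat @ y / (x_hat @ x_hat))

def ordinary_least_squares(x, y):
    """OLS slope from regressing y on x directly (no intercept)."""
    return float(x @ y / (x @ x))

rng = np.random.default_rng(0)
n, k, beta0 = 5000, 20, 0.3          # hypothetical sizes and causal effect
z = rng.standard_normal((n, k))      # stand-in for standardized genotypes
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
u, v = errors[:, 0], errors[:, 1]    # correlated errors mimic confounding
x = z @ np.full(k, 0.2) + v          # risk factor
y = beta0 * x + u                    # outcome

beta_2sls = two_stage_least_squares(z, x, y)
beta_ols = ordinary_least_squares(x, y)
print(f"2SLS: {beta_2sls:.3f}, OLS: {beta_ols:.3f}, truth: {beta0}")
```

With these fairly strong instruments the 2SLS estimate should land near the true value, while OLS is pulled away from it by the correlation between the two error terms.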
Since GWAS is typically carried out to study one trait at a time, an increasing number of MR studies are conducted under the two-sample setting by combining data from two separate GWAS, with one study providing measures on IVs and the risk factor, and the other on IVs and the outcome. The first two-sample IV analysis was conducted by Angrist and Krueger (1992). They developed a two-sample instrumental variable (TSIV) method to study the effect of age at school entry on educational attainment by combining data from two separate censuses. Most subsequent IV analyses (including MR analyses) in the two-sample setting used the more convenient two-sample 2SLS (TS2SLS) method, which can be straightforwardly adapted from the one-sample 2SLS estimator. The easy-to-use TS2SLS estimator was shown to be asymptotically more efficient than the original TSIV estimator (Inoue and Solon, 2010). In addition, since the TS2SLS estimator with weak IVs has an attenuation bias toward the null (Angrist and Krueger, 1995), it provides a more conservative estimate of the causal effect.
Due to the nature of large GWAS consortia, the two GWAS used in the MR analysis can have non-negligible overlap, with a noticeable proportion of subjects participating in both studies (Burgess, Davies and Thompson, 2016a). Although the TS2SLS method can be applied to this overlapping two-sample setting directly, its performance, in terms of bias and mean squared error (MSE), is not well understood. Burgess et al. (2016a) conducted simulation studies to evaluate the weak instrument bias in this setting.
The purpose of this paper is to investigate the bias and MSE due to “weak instruments” in the general two-sample (with or without overlapping subjects) MR analysis. We focus on the TS2SLS estimator as it is the most commonly used approach. It also can be approximated by the inverse-variance weighted estimator (Burgess, Butterworth and Thompson, 2013; Burgess, Dudbridge and Thompson, 2016b). The inverse-variance weighted estimator is very flexible and widely used for MR studies with summarized data that are typically generated by GWAS meta-analyses. We derive theoretic formulas for the bias and MSE under the weak instrument asymptotics. Our results are extensions of those by Chao and Swanson (2007) obtained under the one-sample setting. We verify those formulas with numeric simulations under practical circumstances. We use those analytic results to evaluate the pros and cons of conducting MR analysis under one-sample and two-sample settings, as well as the impact of using overlapping samples. We also propose and validate a class of bias-corrected estimators for the causal effect.
2. Method
2.1. Setup and Assumptions
Let $Y$ represent the continuous outcome, $X$ represent the exposure measure on the risk factor, and $Z$ be the vector of random variables representing the set of IVs. Following the standard conditions for IV analysis, we assume they are connected through the following two regression models,
$$X = Z^{T}\pi + v, \qquad (1)$$
$$Y = X\beta + u, \qquad (2)$$
where $\beta$ is the coefficient for the causal effect of interest, $\pi$ is the vector of regression coefficients for the IVs, and $u$ and $v$ are two correlated error terms. We call (1) and (2) the first and second stage regression models, respectively. To simplify the notation, we first assume no intercept term or adjusted covariates in models (1) and (2). We will extend the results to more general models later.
We consider the general two-sample setting, where we have measures on IVs and the outcome from one study with subjects, and measures on IVs and the exposure from another study with subjects. The two studies have overlapping subjects. We call the data from the first study the outcome data, the one from the second study the risk exposure data. We break all data into three exclusive subsets to represent them. For the first subset, we have subjects on which we have measures on , and denote them as a vector , and a matrix . For the second subset, we have subjects on which we have measures on , and denote them as vectors and , and a matrix . For the third subset, we have subjects on which we have measures on , and denote them as a vector , and a matrix . Based on models (1) and (2), all data can be linked as:
where are corresponding error terms, and . The TS2SLS estimator can be expressed as,
Notice that when the two studies are completely overlapped (i.e.,), the TS2SLS estimator becomes the one sample 2SLS estimator.
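The three-subset layout and the reduction to one-sample 2SLS under complete overlap can be sketched in Python as follows. The sketch assumes the standard TS2SLS form (first-stage coefficients estimated in the exposure data, then the outcome regressed on predicted exposure in the outcome data); the function name, subset sizes, and simulated parameters are hypothetical.

```python
import numpy as np

def ts2sls(z_outcome, y_outcome, z_exposure, x_exposure):
    """Two-sample 2SLS without intercept (standard form, assumed here)."""
    # First stage, exposure data only: regress the risk factor on the IVs.
    pi_hat, *_ = np.linalg.lstsq(z_exposure, x_exposure, rcond=None)
    # Second stage, outcome data only: regress the outcome on predicted exposure.
    x_pred = z_outcome @ pi_hat
    return float(x_pred @ y_outcome / (x_pred @ x_pred))

rng = np.random.default_rng(1)
n1, n2, n3, k, beta0 = 600, 400, 600, 10, 0.3   # hypothetical subset sizes
n = n1 + n2 + n3
z = rng.standard_normal((n, k))
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
u, v = errors[:, 0], errors[:, 1]
x = z @ np.full(k, 0.1) + v
y = beta0 * x + u

outcome_rows = slice(0, n1 + n2)   # subsets 1 and 2: outcome measured
exposure_rows = slice(n1, n)       # subsets 2 and 3: exposure measured
beta_two_sample = ts2sls(z[outcome_rows], y[outcome_rows],
                         z[exposure_rows], x[exposure_rows])

# Complete overlap: every subject contributes to both stages.
beta_full_overlap = ts2sls(z, y, z, x)
projected_x = z @ np.linalg.solve(z.T @ z, z.T @ x)
beta_one_sample = float(projected_x @ y / (projected_x @ x))  # textbook 2SLS
print(beta_two_sample, beta_full_overlap, beta_one_sample)
```

The last two numbers agree up to rounding because projecting the exposure onto the IVs is idempotent, which is the algebraic reason the estimator collapses to one-sample 2SLS when the two studies coincide.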
We consider asymptotic bias and MSE of the TS2SLS estimator under the following assumptions.
Assumption 1: For we have and , , where .
Assumption 2:, where C is a fixed vector.
Assumption 3: For , (i), (ii), (iii).
Assumption 4: There exists a finite integer such that , for some , where denotes the TS2SLS estimator of for a sample size of , and is the true value of .
Assumption 1 is to ensure proportions of the three subsets converge as , and that sample sizes of the exposure data and outcome data are comparable. Assumption 2 is essential for the study of weak instrument asymptotics (Chao and Swanson, 2007; Staiger and Stock, 1997). The main purpose is to evaluate the estimate when is weakly correlated with , in the sense that the first stage F statistic testing is small or moderate even if is large. Notice that if is treated as constant, the first stage F statistic would go to infinity as . But under Assumption 2, the F statistic becomes , representing the targeted weak instrument scenario. In practice, Assumption 2 is reasonable if the F statistic is smaller than 10. Assumption 3 gives some moment conditions on the error terms and IVs. Similar conditions are given by Staiger and Stock (1997). Those conditions hold under fairly weak assumptions. For example, they hold when the error terms are i.i.d. and independent of the IVs. Therefore, they hold for standard MR studies with unrelated subjects.
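The contrast between fixed and local-to-zero first-stage coefficients can be illustrated with a short Python sketch. It assumes standard normal instruments, unit error variance, and a vector C of ones; the function name and all numerical settings are hypothetical choices for illustration.

```python
import numpy as np

def first_stage_f(z, x):
    """F statistic for testing that all first-stage coefficients are zero
    (no-intercept model, matching the simplified setup)."""
    n, k = z.shape
    pi_hat, *_ = np.linalg.lstsq(z, x, rcond=None)
    fitted = z @ pi_hat
    resid = x - fitted
    return float((fitted @ fitted / k) / (resid @ resid / (n - k)))

rng = np.random.default_rng(2)
k = 10
c = np.ones(k)                     # hypothetical fixed vector C
f_local, f_fixed = {}, {}
for n in (2000, 20000):
    z = rng.standard_normal((n, k))
    v = rng.standard_normal(n)
    # Coefficients shrinking at rate 1/sqrt(n): F stays bounded as n grows.
    f_local[n] = first_stage_f(z, z @ (c / np.sqrt(n)) + v)
    # Coefficients held constant: F grows roughly linearly in n.
    f_fixed[n] = first_stage_f(z, z @ (0.1 * c) + v)
    print(n, round(f_local[n], 2), round(f_fixed[n], 2))
```

Under the shrinking coefficients the F statistic stays of moderate size at both sample sizes, whereas with constant coefficients it increases about tenfold when the sample size does.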
Assumption 4 is similar to the one given by Chao and Swanson (2007). This assumption is sufficient for the existence of asymptotic bias and MSE of the estimate . As pointed out by Chao and Swanson (2007), in special cases where the error terms in (1) and (2) among different subjects are i.i.d. and jointly normal, Assumption 4 always holds for . In fact, the estimate would have an infinitely large asymptotic variance, i.e., for (Chao and Swanson, 2007). As a result, higher order asymptotic properties of the estimate cannot be studied for . In standard GWAS analysis, a common practice for modeling a continuous outcome is to check the normality of its residuals, and to apply the Box-Cox transformation so that the residuals are approximately normally distributed. Thus, in real MR studies using data from existing GWAS, Assumption 4 is generally satisfied when the number of IVs is larger than three.
2.2. Asymptotic Bias and MSE formulas
The following theorem gives the asymptotic bias and MSE formulas of .
Theorem 1. Under Assumptions 1–4, as , we have for ,
(3) |
where with representing a random variable following a noncentral Chi-squared distribution with degrees of freedom, and noncentrality parameter . And
(4) |
where , and .
The proof of Theorem 1 is given in Web Appendix A. Here are some remarks on the results.
Remark 1: Theorem 1 indicates that the asymptotic bias is a linear function of the true value and , with the latter being the asymptotic bias of the OLS estimator. Under the one-sample setting (i.e., ,), the asymptotic bias does not depend on . It becomes the same as the one given by Chao and Swanson (2007) by noticing the following known equation when ,
where denotes the gamma function and denotes the confluent hypergeometric function. Similarly, the general asymptotic MSE formula reduces to the one given by Chao and Swanson (2007) under the one-sample setting.
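Inverse moments of a noncentral Chi-squared variable are the kind of quantity that leads to gamma and confluent hypergeometric functions. As a hedged illustration only (it is not claimed to be the exact expectation appearing in Theorem 1), the Python sketch below evaluates E[1/X] for a noncentral Chi-squared X through its standard Poisson-mixture representation and checks it by Monte Carlo. The function name and the values of k and the noncentrality parameter are assumptions.

```python
import math
import numpy as np

def inverse_moment_noncentral_chi2(k, lam, terms=2000):
    """E[1/X] for X ~ noncentral chi-squared with k > 2 degrees of freedom
    and noncentrality lam, using X | J ~ chi-squared(k + 2J), J ~ Poisson(lam/2)
    and E[1/chi-squared(m)] = 1/(m - 2)."""
    if k <= 2:
        raise ValueError("the first inverse moment requires k > 2")
    if lam == 0:
        return 1.0 / (k - 2)
    total = 0.0
    log_weight = -lam / 2.0                      # log Poisson weight at j = 0
    for j in range(terms):
        total += math.exp(log_weight) / (k - 2 + 2 * j)
        log_weight += math.log(lam / 2.0) - math.log(j + 1)
    return total

rng = np.random.default_rng(3)
k, lam = 10, 20.0                                # hypothetical values
draws = rng.noncentral_chisquare(k, lam, size=400_000)
monte_carlo = float(np.mean(1.0 / draws))
series = inverse_moment_noncentral_chi2(k, lam)
print(series, monte_carlo)
```

The series makes the existence condition visible: the j = 0 term has denominator k − 2, so the first inverse moment is finite only for more than two degrees of freedom.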
Remark 2: Under the independent two-sample setting with no overlapping samples (i.e.,), the bias formula becomes similar to the one given by Angrist and Krueger (1995). In this situation, the bias is in proportion to , but is not influenced by . Therefore, under the independent two-sample setting, the TS2SLS estimator is unbiased under the null (i.e.,), and is biased toward 0 when . This is one motivation for the development of split-sample IV estimator (Angrist and Krueger, 1992, 1995).
Remark 3: The bias formula (3) is not a symmetric function of . When and have the same sign, their impacts on the bias can offset each other.
Remark 4: With , and , becomes the concentration parameter under the one-sample setting (Stock et al., 2002). We still call it the concentration parameter under the general two-sample setting. As usual we adopt as a key measure for the strength of IVs on their ability to explain the exposure.
Remark 5: Although the asymptotic bias and MSE formulas are derived under the weak instrument asymptotic assumptions, they are also applicable when the IVs are strong. Assuming is a fixed vector, by arguments similar to those used in the proof of Theorem 1 in Web Appendix A, we can show that those formulas are still good approximations of the bias and MSE of the TS2SLS estimate with strong IVs, with errors in the order of . This can be confirmed with simulations (shown later).
When there are overlapping samples (i.e.,), we can conduct the MR analysis by restricting the overlapping samples to the first stage or the second stage regression. More generally, we can apply the TS2SLS estimator by using a proportion (denoted as ,) of those overlapping subjects in the second stage regression and the rest in the first stage regression. We call this class of split sample estimators . By using , we in fact keep all the overlapping samples in the second stage regression model. For , we only use the overlapping samples in the first stage regression model. According to Theorem 1, if , then for any given , is asymptotically unbiased as there is no overlapping sample between the two stage regressions. Furthermore, we have the following useful result regarding its MSE, with the proof given in Web Appendix B.
Corollary 1. Under the same conditions given in Theorem 1 with , if , then has the smallest asymptotic MSE among all , , and the original TS2SLS estimator .
Corollary 1 suggests that we can restrict all the overlapping samples to the second stage regression to achieve the most accurate estimate of the causal effect when . Given that all involved functions are differentiable, Corollary 1 remains true as long as is not too large. Although we cannot provide a simple analytic formula for this upper bound of , we show in Web Appendix C that such a bound does exist. Therefore, it is beneficial to use all the overlapping samples in the second stage regression, but not in the first stage regression, when estimating a relatively small causal effect .
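The class of split-sample estimators can be sketched in Python as below, assuming the standard TS2SLS form and a random assignment of overlapping subjects to the two stages. The function names, the argument `prop_second_stage`, and the simulated data are hypothetical.

```python
import numpy as np

def ts2sls(z_outcome, y_outcome, z_exposure, x_exposure):
    """Two-sample 2SLS without intercept (standard form, assumed here)."""
    pi_hat, *_ = np.linalg.lstsq(z_exposure, x_exposure, rcond=None)
    x_pred = z_outcome @ pi_hat
    return float(x_pred @ y_outcome / (x_pred @ x_pred))

def split_sample_ts2sls(subset1, subset2, subset3, prop_second_stage, rng):
    """subset1 = (z1, y1): outcome only; subset2 = (z2, x2, y2): overlap;
    subset3 = (z3, x3): exposure only. A proportion prop_second_stage of the
    overlapping subjects enters the second stage, the rest the first stage."""
    z1, y1 = subset1
    z2, x2, y2 = subset2
    z3, x3 = subset3
    order = rng.permutation(len(y2))
    n_second = int(round(prop_second_stage * len(y2)))
    to_second, to_first = order[:n_second], order[n_second:]
    z_out = np.vstack([z1, z2[to_second]])
    y_out = np.concatenate([y1, y2[to_second]])
    z_exp = np.vstack([z3, z2[to_first]])
    x_exp = np.concatenate([x3, x2[to_first]])
    return ts2sls(z_out, y_out, z_exp, x_exp)

rng = np.random.default_rng(4)
n1, n2, n3, k = 500, 500, 500, 10                 # hypothetical sizes
z = rng.standard_normal((n1 + n2 + n3, k))
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=len(z))
x = z @ np.full(k, 0.1) + errors[:, 1]
y = 0.0 * x + errors[:, 0]                        # null causal effect
s1 = (z[:n1], y[:n1])
s2 = (z[n1:n1 + n2], x[n1:n1 + n2], y[n1:n1 + n2])
s3 = (z[n1 + n2:], x[n1 + n2:])

all_in_second = split_sample_ts2sls(s1, s2, s3, 1.0, rng)
all_in_first = split_sample_ts2sls(s1, s2, s3, 0.0, rng)
print(all_in_second, all_in_first)
```

Whatever proportion is chosen, no subject appears in both stages, which is why every member of the class avoids the overlap-induced bias under the null.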
2.3. Bias-corrected Estimator
Given the formula for the asymptotic bias, we can define a bias-corrected estimator with a reduced level of bias. We first consider the most general scenario where , for . Given the bias formula (3), we can define for the following bias-corrected estimator,
with
where is the 2SLS estimator using the overlapping samples that have complete measures on .
The consistency of is not guaranteed under the weak instrument assumptions considered in Theorem 1 with , while keeping k fixed. Instead, we can show that is consistent under the many weak instruments asymptotic of Stock and Yogo (2005) by letting and k go to infinity jointly, with the following condition on the strength of IVs.
Assumption 5: for constant values and as .
This assumption requires that the concentration parameter should not be too small, and that it should be comparable to the number of IVs. Notice that Assumption 5 does not contradict Assumption 2. In fact, by its definition, is expected to grow as more IVs are used, as long as the added IVs are all relevant. Assumption 5 is slightly weaker than the one required by Chao and Swanson (2007) in their study of many weak IVs.
We have the following result on the consistency of , with proof given in Web Appendix D.
Theorem 2. Suppose that Assumptions 1 to 5 and the technical conditions listed in Web Appendix D hold. Then as and jointly.
Using Assumptions 2–5, Chao and Swanson (2007) proposed a bias-corrected estimator under the one-sample setting. By letting , we can see that is similar to the one given by Chao and Swanson (2007). The only difference is that their formula uses an approximate version of the function .
The consistency of the bias-corrected estimate is not guaranteed if k is fixed (i.e., small). This is a common theme when dealing with a small number of weak IVs. In fact, by using arguments similar to those used in the proof of Theorem 1, we can show that converges weakly to a non-degenerate random variable with fixed k. A similar result was obtained by Staiger and Stock (1997) in the one-sample setting. Therefore, even if all plug-in parameters are known or can be consistently estimated, still cannot converge to , since its variance does not converge to 0 as the sample size goes to infinity. On the other hand, by using arguments similar to those in the proof of Theorem 2, we can show that the bias-corrected estimate under fixed k is consistent if is large enough. For example, with , we find that the bias-corrected estimate still has the desired property when (shown later).
2.4. Extension to Estimate with Summary Data and Models with Covariates
In previous sections we have provided results on the TS2SLS estimate based on individual-level data, for an MR study with the simplified models given by (1) and (2). In Web Appendix E we provide corresponding results on MR analyses with GWAS summary data under models (1) and (2). In Web Appendix F we further generalize those results to models with covariates.
3. Numerical Results
3.1. The accuracy of the bias and MSE formulas
First, we conducted simulations to investigate the accuracy of our asymptotic bias and MSE formulas. IVs were genotypes on a set of k () independent genetic markers, denoted as , with ,. Let the risk factor , the outcome , and follow models, , , where u and v were correlated error terms, with correlation coefficient of 0.5. We assumed u and v were both , or 2 degree-of-freedom Chi-squared distributed (re-centered and scaled with mean of 0, and variance of 2). We fixed sample sizes as , and changed the overlapping percentage defined as , by varying the overlapping sample size . We chose such as , or 7, representing the two ends of the weak IV spectrum. We let , 0.0, or 0.3 in the simulation.
For any given configuration, we simulated 10,000 datasets assuming error terms were normally or Chi-squared distributed. By summarizing results over simulated datasets, we obtained empirical estimates of bias and MSE for the TS2SLS estimator. In Tables 1–2 we compared the empirical bias and MSE with the ones calculated by the corresponding asymptotic formulas for scenarios , with different distributions for the error terms. We also demonstrated the accuracy of the formulas for other configurations with normally distributed error terms (Web Appendix Tables S1–S2), and Chi-squared distributed error terms (Web Appendix Tables S3–S4). These tables show that the asymptotic bias and MSE formulas closely approximate the empirical values in all considered situations; in Tables 1–2 the two differ by at most 0.004.
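A scaled-down version of this kind of Monte Carlo study can be sketched in Python as follows. It is not the simulation design used here: the sample sizes, number of IVs, allele frequency, per-IV effect, and number of replicates are hypothetical, the errors are bivariate normal with correlation 0.5, and the standard TS2SLS form is assumed.

```python
import numpy as np

def simulate_bias_mse(beta0, overlap, n_outcome=1000, n_exposure=1000, k=20,
                      pi_j=0.1, rho=0.5, maf=0.3, reps=500, seed=0):
    """Empirical bias and MSE of TS2SLS for a given overlapping percentage
    (overlap = overlapping subjects / outcome sample size)."""
    rng = np.random.default_rng(seed)
    n2 = int(round(overlap * n_outcome))          # overlapping subjects
    n1, n3 = n_outcome - n2, n_exposure - n2      # outcome-only, exposure-only
    n = n1 + n2 + n3
    outcome_rows, exposure_rows = slice(0, n1 + n2), slice(n1, n)
    estimates = np.empty(reps)
    for r in range(reps):
        # Centered genotypes at k independent markers.
        z = rng.binomial(2, maf, size=(n, k)) - 2 * maf
        errors = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
        u, v = errors[:, 0], errors[:, 1]
        x = z @ np.full(k, pi_j) + v
        y = beta0 * x + u
        pi_hat, *_ = np.linalg.lstsq(z[exposure_rows], x[exposure_rows], rcond=None)
        x_pred = z[outcome_rows] @ pi_hat
        estimates[r] = x_pred @ y[outcome_rows] / (x_pred @ x_pred)
    bias = float(estimates.mean() - beta0)
    mse = float(np.mean((estimates - beta0) ** 2))
    return bias, mse

bias_none, mse_none = simulate_bias_mse(beta0=0.0, overlap=0.0)
bias_full, mse_full = simulate_bias_mse(beta0=0.0, overlap=1.0)
print(f"no overlap:   bias={bias_none:.3f}, MSE={mse_none:.3f}")
print(f"full overlap: bias={bias_full:.3f}, MSE={mse_full:.3f}")
```

Under the null, the sketch should reproduce the qualitative pattern in the tables: essentially no bias without overlap and a clearly positive bias with complete overlap.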
Table 1:
Bias and MSE approximation accuracy with normal distributed error terms, independent IVs with k = 100, and . Overlapping percentage (ψ): proportion of subjects in the outcome data that overlap with the risk exposure data. β0: true causal effect; Empi.: empirical estimates of the bias and MSE; Asym.: asymptotic estimates of the bias and MSE given by Theorem 1.
| | ψ | β0 = 0.3 (Empi.) | β0 = 0.3 (Asym.) | β0 = 0 (Empi.) | β0 = 0 (Asym.) | β0 = −0.3 (Empi.) | β0 = −0.3 (Asym.) |
|---|---|---|---|---|---|---|---|
| Bias | 0% | −0.240 | −0.239 | 0.000 | 0.000 | 0.240 | 0.239 |
| | 20% | −0.112 | −0.112 | 0.079 | 0.080 | 0.272 | 0.274 |
| | 40% | 0.015 | 0.016 | 0.160 | 0.160 | 0.304 | 0.304 |
| | 60% | 0.141 | 0.144 | 0.238 | 0.240 | 0.335 | 0.337 |
| | 80% | 0.271 | 0.272 | 0.319 | 0.320 | 0.368 | 0.368 |
| | 100% | 0.400 | 0.399 | 0.400 | 0.399 | 0.400 | 0.399 |
| MSE | 0% | 0.069 | 0.068 | 0.008 | 0.008 | 0.064 | 0.063 |
| | 20% | 0.023 | 0.024 | 0.014 | 0.014 | 0.080 | 0.081 |
| | 40% | 0.010 | 0.011 | 0.033 | 0.033 | 0.099 | 0.099 |
| | 60% | 0.029 | 0.030 | 0.064 | 0.065 | 0.119 | 0.120 |
| | 80% | 0.081 | 0.082 | 0.109 | 0.109 | 0.142 | 0.142 |
| | 100% | 0.166 | 0.166 | 0.166 | 0.166 | 0.166 | 0.166 |
Table 2:
Bias and MSE approximation accuracy with Chi-squared distributed error terms, independent IVs with k = 100, and . Overlapping percentage (ψ): proportion of subjects in the outcome data that overlap with the risk exposure data. β0: true causal effect; Empi.: empirical estimates of the bias and MSE; Asym.: asymptotic estimates of the bias and MSE given by Theorem 1.
| | ψ | β0 = 0.3 (Empi.) | β0 = 0.3 (Asym.) | β0 = 0 (Empi.) | β0 = 0 (Asym.) | β0 = −0.3 (Empi.) | β0 = −0.3 (Asym.) |
|---|---|---|---|---|---|---|---|
| Bias | 0% | −0.240 | −0.239 | 0.000 | 0.000 | 0.240 | 0.239 |
| | 20% | −0.114 | −0.112 | 0.078 | 0.080 | 0.271 | 0.274 |
| | 40% | 0.014 | 0.016 | 0.158 | 0.160 | 0.303 | 0.304 |
| | 60% | 0.140 | 0.144 | 0.237 | 0.240 | 0.334 | 0.337 |
| | 80% | 0.271 | 0.272 | 0.319 | 0.320 | 0.368 | 0.368 |
| | 100% | 0.399 | 0.399 | 0.399 | 0.399 | 0.399 | 0.399 |
| MSE | 0% | 0.068 | 0.068 | 0.008 | 0.008 | 0.064 | 0.063 |
| | 20% | 0.024 | 0.024 | 0.014 | 0.014 | 0.080 | 0.081 |
| | 40% | 0.011 | 0.011 | 0.033 | 0.033 | 0.098 | 0.099 |
| | 60% | 0.029 | 0.030 | 0.064 | 0.065 | 0.118 | 0.120 |
| | 80% | 0.081 | 0.082 | 0.109 | 0.109 | 0.141 | 0.142 |
| | 100% | 0.166 | 0.166 | 0.166 | 0.166 | 0.166 | 0.166 |
To further evaluate the bias and MSE formulas, in Web Appendix G we describe additional simulation studies in which IVs were correlated, had different allele frequencies, had varied levels of association strength, and were in Hardy-Weinberg disequilibrium. As predicted by the theoretic results, characteristics of IVs affect the bias and MSE through the noncentrality parameter . Results shown in Web Appendix Tables S5 and S6 indicate that the theoretic formulas match well with empirical results.
We conducted further simulation studies in situations when Assumption 4 did not hold. We considered an MR study with just two weak IVs (i.e.,), with details on the simulation design given in Web Appendix G. The empirical distribution of the estimate has a long tail, with estimates based on some simulated datasets having very large values. This is expected, as the second moment of the TS2SLS estimate under the weak IV asymptotics does not exist when there are fewer than four weak IVs and the error terms are normally distributed. Because Assumption 4 is violated, it is not surprising that the asymptotic MSE formula performs poorly (Web Appendix Table S7).
Further simulation results confirmed that the theoretic bias and MSE formulas, which are derived under weak IV asymptotics, work well for strong IVs, as explained by Remark 5 in Section 2.2. Details on the simulation design are given in Web Appendix G, with results given in Web Appendix Tables S8 and S9.
3.2. The Impact of Using Overlapping Samples
Having established the validity of the asymptotic bias and MSE formulas, we can use them instead of time-consuming simulations to investigate the performance of the TS2SLS estimator in the general two-sample setting. As mentioned in the Introduction, some two-sample MR studies could have overlapping samples. We can use the bias and MSE formulas to evaluate the impact of having overlapping samples, and check whether there is any advantage in discarding those overlapping samples. Given a two-sample MR study with , and subjects in each of the three exclusive subsets, we can apply the TS2SLS estimators in different ways. We can use the TS2SLS estimator on the original study with overlapping samples (denoted as ). Alternatively, we can use the class of estimators . For illustration purposes, we consider , or 1.0.
We compared the asymptotic bias and MSE among the considered estimators under various scenarios. In Figure 1 we show the theoretic MSE curves over different values under the setting with , , and . As we have proved in Corollary 1, the estimate has the smallest MSE and is nearly unbiased when is relatively small. For with relatively large value in the same direction as ( in this case), the original estimator can have a lower MSE than that of . Similar patterns can be observed under other configurations (results not shown).
Figure 1.
Asymptotic MSE comparisons between split-sample estimator and TS2SLS estimator . Plots are asymptotic MSE of and over various true values of under the simulation setting with , and the overlapping percentage .
3.3. Evaluation of Bias-corrected Estimator
Using simulation studies we compared the bias-corrected estimator , with the original TS2SLS estimator . As expected, we find that does not work well for very weak IVs due to the violation of Assumption 5. For example, in simulations with 100 independent IVs, and , the estimated concentration parameter is negative for some simulated datasets. In Table 3 we show the simulation results with 100 independent IVs, and . It is clear that tends to have a smaller bias than . Since adjusting the bias can introduce extra variation in , the MSE of is not always smaller than that of . In situations when has a relatively large bias, the reduction of bias can lead to a lower MSE. Results based on simulations using 100 independent IVs with show a similar pattern (Web Appendix Table S10). We also evaluated the performance of with relatively small k. In Web Appendix Table S11 we show simulation results for the study using 10 independent IVs with . It demonstrates that the bias-corrected estimate has the expected performance (i.e., bias reduction) for small k as long as the IVs are not too weak.
Table 3:
Performance of the bias-corrected estimator using independent IVs with k = 100, and . Overlapping percentage (ψ): proportion of subjects in the outcome data that overlap with the risk exposure data. β0: true causal effect; TS2SLS: the TS2SLS estimate; Corrected: the bias-corrected estimate.
| | ψ | β0 = 1 (TS2SLS) | β0 = 1 (Corrected) | β0 = 0.3 (TS2SLS) | β0 = 0.3 (Corrected) | β0 = 0 (TS2SLS) | β0 = 0 (Corrected) | β0 = −0.3 (TS2SLS) | β0 = −0.3 (Corrected) | β0 = −1 (TS2SLS) | β0 = −1 (Corrected) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Bias | 0% | −0.333 | 0.007 | −0.100 | 0.002 | −0.000 | −0.000 | 0.099 | −0.002 | 0.333 | −0.007 |
| | 20% | −0.233 | 0.002 | −0.047 | −0.001 | 0.032 | −0.003 | 0.112 | −0.004 | 0.298 | −0.008 |
| | 40% | −0.135 | −0.004 | 0.005 | −0.004 | 0.065 | −0.005 | 0.125 | −0.005 | 0.265 | −0.006 |
| | 60% | −0.035 | −0.005 | 0.058 | −0.005 | 0.098 | −0.005 | 0.138 | −0.006 | 0.231 | −0.006 |
| | 80% | 0.065 | −0.004 | 0.112 | −0.005 | 0.132 | −0.005 | 0.152 | −0.005 | 0.199 | −0.005 |
| | 100% | 0.165 | −0.005 | 0.165 | −0.005 | 0.165 | −0.005 | 0.165 | −0.005 | 0.165 | −0.005 |
| MSE | 0% | 0.122 | 0.034 | 0.014 | 0.011 | 0.003 | 0.007 | 0.012 | 0.007 | 0.115 | 0.019 |
| | 20% | 0.064 | 0.023 | 0.007 | 0.010 | 0.004 | 0.008 | 0.015 | 0.008 | 0.093 | 0.016 |
| | 40% | 0.027 | 0.017 | 0.004 | 0.009 | 0.007 | 0.008 | 0.018 | 0.008 | 0.074 | 0.014 |
| | 60% | 0.008 | 0.013 | 0.007 | 0.009 | 0.012 | 0.008 | 0.022 | 0.008 | 0.057 | 0.012 |
| | 80% | 0.009 | 0.010 | 0.016 | 0.008 | 0.020 | 0.008 | 0.026 | 0.008 | 0.043 | 0.009 |
| | 100% | 0.030 | 0.008 | 0.030 | 0.008 | 0.030 | 0.008 | 0.030 | 0.008 | 0.030 | 0.008 |
Finally, we evaluated in studies with correlated IVs, with details of the simulation design given in Web Appendix G. We can reach similar conclusions as those based on simulations with independent IVs (Web Appendix Table S12).
4. Real Applications
4.1. Effect of Maternal Obesity on Birth Size
Geng and Huang (2018) recently conducted an MR analysis to evaluate the causal effect of maternal central obesity on birth size and puberty height growth. They performed two-sample MR analyses using summary-level GWAS meta-analysis results generated from two large consortia. They considered three risk factors representing maternal central obesity, including waist-to-hip ratio, waist circumference, and hip circumference, all adjusted for BMI. The three outcomes were infant birth weight, birth length, and head circumference. For the purpose of illustration, we focused on the risk factor defined by the mother’s BMI-adjusted hip circumference (HIPadjBMI), and on birth length as the outcome. Geng and Huang (2018) chose 41 independent SNPs that passed the genome-wide significance threshold () as IVs for the risk factor based on results from existing GWAS. Summary statistics on the association between each of those 41 IVs and the outcome (birth length) were obtained from another consortium.
We first re-evaluated the causal effect of HIPadjBMI on birth length using the formula given by Burgess et al. (2016b), which uses the inverse-variance meta-analysis method to generate an estimate of the causal effect based on summary statistics from the two consortia (see Web Appendix E). We used European references from the 1000 Genomes Project to estimate the covariance matrix for the vector of 41 IVs. Since the chosen SNPs in this example were independent, the reference genomes were only used for the variance estimate for each IV. The estimated causal effect was . According to Web Appendix E, this estimate can be considered as an approximation for the TS2SLS estimate. Using summary statistics, we can also estimate the concentration parameter . In this case, since the IVs were relatively strong with , this estimate had negligible weak instrument bias as .
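For independent IVs, the inverse-variance weighted estimate from summary statistics can be sketched in Python as below. This is the commonly used form for uncorrelated variants and may differ in detail from the expression in Web Appendix E; the function name and all summary statistics shown are made up for illustration.

```python
import numpy as np

def ivw_estimate(gamma_exposure, gamma_outcome, se_outcome):
    """Inverse-variance weighted causal estimate for independent IVs.

    gamma_exposure: per-IV association estimates with the risk factor
    gamma_outcome:  per-IV association estimates with the outcome
    se_outcome:     standard errors of gamma_outcome
    """
    gamma_exposure = np.asarray(gamma_exposure, dtype=float)
    gamma_outcome = np.asarray(gamma_outcome, dtype=float)
    se_outcome = np.asarray(se_outcome, dtype=float)
    # Each IV gives a ratio estimate, weighted by its inverse variance.
    weights = gamma_exposure ** 2 / se_outcome ** 2
    ratios = gamma_outcome / gamma_exposure
    beta = np.sum(weights * ratios) / np.sum(weights)
    se = np.sqrt(1.0 / np.sum(weights))
    return float(beta), float(se)

# Hypothetical summary statistics for five independent SNPs.
gx = np.array([0.08, 0.05, 0.11, 0.07, 0.06])
gy = np.array([0.020, 0.016, 0.030, 0.015, 0.021])
sy = np.array([0.010, 0.012, 0.011, 0.009, 0.013])
beta_ivw, se_ivw = ivw_estimate(gx, gy, sy)
print(f"IVW estimate: {beta_ivw:.3f} (SE {se_ivw:.3f})")
```

The estimate is a weighted average of the per-IV ratio estimates, which is why it can be read as a summary-data approximation to TS2SLS when the IVs are independent.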
We noticed that the 41 IVs were chosen from the HIPadjBMI GWAS because of their small p-values (). Due to the selection bias caused by the winner’s curse (Yu et al., 2007; Zhong and Prentice, 2008), the absolute value of the estimated coefficient , for the marginal association between the jth selected IV and the risk factor tended to be inflated. We can correct this selection bias by shrinking toward zero using the method of Zhong and Prentice (2008). We obtained this selection-bias-adjusted estimate (denoted as ) for each IV, then re-estimated the causal effect as , and the concentration parameter as . In this example, even after the adjustment for the selection bias, the IVs were still strong enough () with negligible weak instrument bias.
The chosen IVs in this example are strong mainly because of the extremely large sample size used in the first stage regression model (). But if we want to design a replication study with a smaller sample size to validate the observed causal effect, the IVs could become weak and induce non-negligible weak instrument bias. To illustrate this, suppose the replication study consists of 10,000 subjects with measures on IVs and the risk factor, and 10,000 subjects with measures on IVs and the outcome, with possible overlapping subjects between the two sets. Suppose the true causal effect , and the marginal association coefficient between the jth IV and the risk factor is given by . Based on Web Appendix E, we can estimate all parameters in models (1)–(2) given , the correlation coefficient between the two error terms u and v. We can also calculate the expected bias and MSE for this replication study with various levels of and overlapping percentage (Figure 2). For this particular example with a relatively large causal effect , we can see that it is beneficial (in terms of both reduced bias and MSE) to have some overlapping samples between the outcome data and the risk exposure data, although the ideal overlapping percentage depends on the correlation level between u and v.
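The idea of shrinking a selected association estimate toward zero can be illustrated with a simple conditional-likelihood sketch in Python. It maximizes the likelihood of the observed z-score conditional on having passed a two-sided significance threshold. This is one simple variant in the spirit of likelihood-based winner's curse corrections and should not be read as the exact estimator of Zhong and Prentice (2008); the function names, the grid search, and the threshold of 5.45 (roughly a two-sided p-value of 5e-8) are assumptions.

```python
import math

def normal_cdf(t):
    """Standard normal CDF, using erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def conditional_mle_zscore(z_obs, threshold, grid_size=20001):
    """Conditional MLE of the mean z-score given selection |Z| > threshold.
    The maximizer lies between 0 and |z_obs|, so a grid search suffices."""
    sign = 1.0 if z_obs >= 0 else -1.0
    z_abs = abs(z_obs)
    best_mu, best_loglik = 0.0, -math.inf
    for i in range(grid_size):
        mu = z_abs * i / (grid_size - 1)
        select_prob = normal_cdf(mu - threshold) + normal_cdf(-mu - threshold)
        if select_prob <= 0.0:      # numerical underflow guard
            continue
        loglik = -0.5 * (z_abs - mu) ** 2 - math.log(select_prob)
        if loglik > best_loglik:
            best_mu, best_loglik = mu, loglik
    return sign * best_mu

def shrink_estimate(beta_hat, se, threshold):
    """Selection-adjusted coefficient on the original scale."""
    return se * conditional_mle_zscore(beta_hat / se, threshold)

threshold = 5.45                                    # hypothetical selection cutoff
barely_significant = conditional_mle_zscore(5.5, threshold)
highly_significant = conditional_mle_zscore(12.0, threshold)
print(barely_significant, highly_significant)
```

A z-score just above the cutoff is shrunk substantially, while one far above it is left almost unchanged, matching the intuition that the winner's curse mainly affects IVs that barely pass the selection threshold.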
Figure 2.
Relative asymptotic bias and MSE for the validation study. The validation study has 10,000 samples in the risk exposure data, and 10,000 samples in the outcome data, with varying overlapping percentage. Parameters in underlying two-stage regression models are given by estimates from the real study of causal effect of HIPadjBMI on birth length. Given correlation coefficient between two error terms u and v (r), and overlapping percentage, asymptotic bias and MSE are calculated according to Theorem 1. The left panel is for the absolute relative bias, defined as , with being the asymptotic bias of the TS2SLS estimator , and the true causal effect . The right panel is for the relative MSE, defined as , with being the asymptotic MSE of the TS2SLS estimator .
4.2. Effect of Caffeine Consumption on Sleep Duration
Treur et al. (2018) recently conducted MR analyses to investigate the effect of caffeine consumption on various sleep behaviors. They considered three risk factors representing caffeine consumption: caffeine intake, plasma caffeine, and caffeine metabolic rate. The three sleep-behavior outcomes were sleep duration, chronotype, and insomnia. For the purpose of illustration, we focused on plasma caffeine as the risk factor and used sleep duration as the outcome. Based on results from a GWAS of plasma caffeine with a sample size of 9,876, Treur et al. (2018) used a p-value threshold of to identify 11 independent SNPs as IVs for the risk factor plasma caffeine. Summary statistics on the association between each of these IVs and the outcome (sleep duration) were obtained from a separate GWAS with 128,266 participants.
We first re-evaluated the causal effect of plasma caffeine on sleep duration using the inverse-variance meta-analysis method given by Burgess et al. (2016b), which can be considered an approximation to the TS2SLS estimate (see Web Appendix E). The estimated causal effect was , with the concentration parameter estimated as . In this case, the strength of the IVs was modest with . Compared to our bias-corrected estimator , the original estimate has an absolute relative bias .
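For readers unfamiliar with the inverse-variance meta-analysis method, the following minimal sketch computes the inverse-variance weighted (IVW) estimate from per-IV summary statistics, using the widely used form in which each IV is weighted by the inverse variance of its outcome association. The summary statistics are hypothetical (they are not the plasma caffeine data), and the average of the squared first-stage z-statistics is included only as a rough proxy for IV strength; it is not necessarily the concentration parameter estimate used in the paper.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate and its standard error.

    beta_x: estimated IV-risk factor associations
    beta_y: estimated IV-outcome associations
    se_y:   standard errors of beta_y
    """
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den, math.sqrt(1.0 / den)


def mean_f_statistic(beta_x, se_x):
    """Average per-IV F-statistic (beta/se)^2, a rough measure of IV strength."""
    return sum((b / s) ** 2 for b, s in zip(beta_x, se_x)) / len(beta_x)


# Hypothetical summary statistics for three IVs (illustration only).
beta_x = [0.10, 0.08, 0.12]
se_x = [0.02, 0.02, 0.03]
beta_y = [0.020, 0.016, 0.024]   # constructed as 0.2 * beta_x
se_y = [0.01, 0.01, 0.01]

estimate, std_err = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate = {estimate:.3f} (SE {std_err:.3f})")
print(f"mean F = {mean_f_statistic(beta_x, se_x):.1f}")
```

Because the hypothetical outcome associations were constructed as exactly 0.2 times the risk-factor associations, the IVW estimate recovers 0.2; with real data the per-IV ratios differ and the weights determine how they are combined.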
We noticed that the 11 IVs were chosen from the GWAS with p-value less than . Because of the selection bias caused by the winner's curse, the absolute value of the estimated coefficient for the marginal association between each IV and the risk factor tends to be inflated. We can correct this selection bias as we did in the first example, by using the method of Zhong and Prentice (2008). After obtaining the winner's-curse-adjusted estimate for each IV, we applied the inverse-variance meta-analysis method to re-estimate the causal effect as . After adjustment for the selection bias, the strength of the IVs was relatively weak with . The corresponding weak-IV bias-corrected estimator gave an estimate of , indicating that the absolute relative bias of the original estimate was about 7.0%.
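The idea behind a winner's curse adjustment can be sketched with a conditional-likelihood calculation in the spirit of Zhong and Prentice (2008): the z-statistic of a selected IV is treated as normal with unit variance, conditional on exceeding the selection threshold in absolute value, and the mean is estimated by maximizing that conditional likelihood. This is an illustrative sketch under those assumptions, not the authors' implementation; the threshold c and the input values are hypothetical.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def conditional_score(mu, z, c):
    """Derivative in mu of the log-likelihood of z ~ N(mu, 1) given |z| > c."""
    select_prob = norm_cdf(mu - c) + norm_cdf(-mu - c)
    d_select = norm_pdf(mu - c) - norm_pdf(mu + c)
    return (z - mu) - d_select / select_prob


def winners_curse_adjust(beta_hat, se, c, tol=1e-10):
    """Conditional-likelihood estimate of beta given selection on |beta_hat/se| > c."""
    z = abs(beta_hat / se)
    if z <= c:
        raise ValueError("the IV was not selected at threshold c")
    lo, hi = 0.0, z  # the score is positive at 0 and negative at z
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if conditional_score(mid, z, c) > 0:
            lo = mid
        else:
            hi = mid
    # Convert back from the z scale and restore the original sign.
    return math.copysign(0.5 * (lo + hi) * se, beta_hat)


# Hypothetical IV: estimate 0.12, standard error 0.02, selection at |z| > 5.
adjusted = winners_curse_adjust(0.12, 0.02, 5.0)
print(f"original = 0.12, adjusted = {adjusted:.4f}")
```

The adjusted coefficient is shrunk toward zero, and the shrinkage is strongest when the observed z-statistic lies close to the selection threshold; an IV far above the threshold is left essentially unchanged.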
5. Discussion
The 2SLS estimator with weak IVs has been studied extensively in the literature under the one-sample setting, but its performance in the two-sample setting, under which MR is most often conducted, has not been adequately investigated. In this paper, we derive analytic formulas for the expected bias and MSE of the 2SLS estimate in the general two-sample setting. These formulas can be used to reduce the bias in the causal effect estimate. We show that the derived bias-corrected estimate can have much smaller bias than the original estimate. For estimating a relatively small causal effect, our theoretical results suggest that using all overlapping subjects only in the second-stage regression can lead to a more accurate (smaller MSE) estimate. These formulas can also provide insight into the design of future MR studies. As demonstrated in the real data application, we can use the bias and MSE formulas to study the impact of having overlapping subjects in a two-sample MR study, and to find the optimal way of using those overlapping samples.
The bias and MSE formulas we obtained cannot be used for deriving confidence intervals or conducting hypothesis tests, because they do not fully characterize the asymptotic distribution of the 2SLS estimate. As we have shown in the proof of Theorem 1, the weak asymptotic distribution of 2SLS can be defined by a complicated function of several normal variables. It is possible to use a computationally efficient method to simulate that asymptotic distribution, and to use it to derive p-values or confidence intervals. More investigation is needed to evaluate this Monte Carlo approach.
The bias-corrected estimate tends to be less biased than the original 2SLS estimate, but because of the additional variation induced by the adjustment terms, its MSE can sometimes be larger. Another limitation of the bias-corrected estimate is that its consistency is established under many-weak-instruments asymptotics. It does not have the desired performance when the number of weak IVs is relatively small.
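The trade-off between bias and MSE noted above follows from the standard decomposition of the MSE into squared bias and variance. It is written here for a generic estimator $\hat\beta$ of a parameter $\beta$; the notation is ours, for illustration, and may differ from that used in the theorems.

```latex
\mathrm{MSE}(\hat\beta)
  = E\big[(\hat\beta - \beta)^2\big]
  = \big(E[\hat\beta] - \beta\big)^2 + \mathrm{Var}(\hat\beta)
```

A correction that shrinks the squared-bias term can therefore still increase the MSE if it inflates the variance term by a larger amount.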
Most MR analyses estimate the causal effect of one exposure at a time. Given that multiple risk factors can jointly influence an outcome, a multivariable MR study to assess the causal effect of several exposures could be valuable (Burgess and Thompson, 2015; Sanderson et al., 2018). Results we obtained for single exposure MR analysis could be extended to multivariable MR studies.
We focus on the MR study of a continuous outcome. It would be interesting to study the properties of 2SLS for a binary outcome, as many MR analyses are conducted for binary traits such as disease status. If a standard logistic regression model is assumed for the relationship between the outcome and the risk factor, a two-stage regression (TSR) model, similar to that used for a continuous outcome, can be used. Unlike the 2SLS estimate, however, the corresponding TSR estimate does not have a closed-form solution. We are still investigating ways to obtain formulas for its bias and MSE.
Supplementary Material
Acknowledgements
The study utilized the computational resources of the NIH HPC Biowulf cluster (https://hpc.nih.gov/).
References
- Angrist J, and Krueger AB. (1992). The effect of age at school entry on educational attainment: An application of instrumental variables with moments from two samples. Journal of the American Statistical Association 87, 328–336.
- Angrist J, and Krueger AB. (1995). Split‐sample instrumental variables estimates of the return to schooling. Journal of Business and Economic Statistics 13, 225–235.
- Bound J, Jaeger DA, and Baker RM. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogeneous explanatory variable is weak. Journal of the American Statistical Association 90, 443–450.
- Burgess S, Butterworth A, and Thompson SG. (2013). Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol 37, 658–665.
- Burgess S, Davies NM, and Thompson SG. (2016a). Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol 40, 597–608.
- Burgess S, Dudbridge F, and Thompson SG. (2016b). Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat Med 35, 1880–1906.
- Burgess S, and Thompson SG. (2015). Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol 181, 251–260.
- Chao J, and Swanson NR. (2007). Alternative approximations of the bias and MSE of the IV estimator under weak identification with an application to bias correction. Journal of Econometrics 137, 515–555.
- Davies NM, Holmes MV, and Davey Smith G. (2018). Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ 362, k601.
- Geng TT, and Huang T. (2018). Maternal central obesity and birth size: a Mendelian randomization analysis. Lipids Health Dis 17, 181.
- Hemani G, Zheng J, Elsworth B, et al. (2018). The MR-Base platform supports systematic causal inference across the human phenome. Elife 7.
- Inoue A, and Solon G. (2010). Two‐sample instrumental variables estimators. The Review of Economics and Statistics 92, 557–561.
- Lawlor DA, Harbord RM, Sterne JA, Timpson N, and Davey Smith G. (2008). Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med 27, 1133–1163.
- Meng XH, Chen XD, Greenbaum J, et al. (2018). Integration of summary data from GWAS and eQTL studies identified novel causal BMD genes with functional predictions. Bone 113, 41–48.
- Sanderson E, Davey Smith G, Windmeijer F, and Bowden J. (2018). An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol.
- Smith GD, and Ebrahim S. (2003). ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32, 1–22.
- Staiger D, and Stock J. (1997). Instrumental variables regression with weak instruments. Econometrica 65, 557–586.
- Staley JR, Blackshaw J, Kamat MA, et al. (2016). PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32, 3207–3209.
- Stock J, Wright J, and Yogo M. (2002). A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics 20, 518–529.
- Stock J, and Yogo M. (2005). Asymptotic distributions of instrumental variables statistics with many weak instruments. In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, Andrews D, and Stock J. (eds), 109–120. Cambridge: Cambridge University Press.
- Takahashi H, Cornish AJ, Sud A, et al. (2019). Mendelian randomization provides support for obesity as a risk factor for meningioma. Sci Rep 9, 309.
- Theodoratou E, Palmer T, Zgaga L, et al. (2012). Instrumental variable estimation of the causal effect of plasma 25-hydroxy-vitamin D on colorectal cancer risk: a mendelian randomization analysis. PLoS One 7, e37662.
- Treur JL, Gibson M, Taylor AE, Rogers PJ, and Munafo MR. (2018). Investigating genetic correlations and causal effects between caffeine consumption and sleep behaviours. J Sleep Res 27, e12695.
- Yu K, Chatterjee N, Wheeler W, et al. (2007). Flexible design for following up positive findings. Am J Hum Genet 81, 540–551.
- Zhong H, and Prentice RL. (2008). Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. Biostatistics 9, 621–634.
- Zhu Z, Zhang F, Hu H, et al. (2016). Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet 48, 481–487.