PLOS Genetics. 2021 Aug 9;17(8):e1009703. doi: 10.1371/journal.pgen.1009703

Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data

Ciarrah Barry 1,2, Junxi Liu 1,2, Rebecca Richmond 1,2, Martin K Rutter 3,4, Deborah A Lawlor 1,2, Frank Dudbridge 5, Jack Bowden 6,1,*
Editor: Heather J Cordell 7
PMCID: PMC8376220  PMID: 34370750

Abstract

Over the last decade the availability of SNP-trait associations from genome-wide association studies has led to an array of methods for performing Mendelian randomization studies using only summary statistics. A common feature of these methods, besides their intuitive simplicity, is the ability to combine data from several sources, incorporate multiple variants and account for biases due to weak instruments and pleiotropy. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well developed summary data methods to individual level data, and to explore the use of more sophisticated causal methods allowing for non-linearity and effect modification.

In this paper we describe a general procedure for optimally applying any two sample summary data method using one sample data. Our procedure first performs a meta-analysis of summary data estimates that are intentionally contaminated by collider bias between the genetic instruments and unmeasured confounders, due to conditioning on the observed exposure. These estimates are then used to correct the standard observational association between an exposure and outcome. Simulations are conducted to demonstrate the method’s performance against naive applications of two sample summary data MR. We apply the approach to the UK Biobank cohort to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes.

Our approach can be viewed as a generalization of Dudbridge et al. (Nat. Comm. 10: 1561), who developed a technique to adjust for index event bias when uncovering genetic predictors of disease progression based on case-only data. Our work serves to clarify that in any one sample MR analysis, it can be advantageous to estimate causal relationships by artificially inducing and then correcting for collider bias.

Author summary

Uncovering causal mechanisms between risk factors and disease is challenging with observational data because of unobserved confounding. Mendelian randomization offers a potential solution by replacing an individual’s observed risk factor data with an unconfounded genetic proxy measure. Over the last decade an array of methods has been developed for performing Mendelian randomization (MR) studies using publicly available summary statistics gleaned from two separate genome-wide association studies. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data. In this paper we describe a general procedure for optimally applying any summary data MR method using individual level data from one cohort study. Our approach may at first seem nonsensical: we create summary statistics that are intentionally biased by confounding. This bias can, however, be very accurately estimated, and the estimate then used to correct the results of a standard observational analysis. We apply our new way of performing an MR analysis to data from UK Biobank to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes.

Introduction

Mendelian randomisation (MR) is a technique used to test for, and quantify, the causal relationship between a modifiable exposure and health outcome with observational data, by using genetic variants as instrumental variables [1, 2]. MR circumvents the need to measure and adjust for all variables which confound the exposure-outcome association, and is therefore seen as an attractive additional analysis to perform alongside more traditional epidemiological methods [3]. The following Instrumental Variable assumptions are usually invoked in order to justify testing for a causal effect of an exposure X on a health outcome Y using a set of genes, G:

  • IV1: G must be associated with X;

  • IV2: G must be independent of unmeasured confounding between X and Y;

  • IV3: G must be independent of Y conditional on X and all confounders of the X-Y relationship.

These assumptions are encoded in the causal diagram in Fig 1. Further linearity and homogeneity assumptions are needed in order to consistently estimate the magnitude of the causal effect. When performing an MR analysis it is best practice to pre-select SNPs for use as instruments using external data, in order to avoid bias due to the winner’s curse [4]. Subsequently, if the genetic variants are not as strongly associated with the exposure as in the discovery GWAS, assumption IV1 will only be weakly satisfied, which leads to so-called weak instrument bias [5, 6]. This issue is mitigated as the sample size increases as long as the true association is non-zero. When a genetic variant is in fact associated with the outcome through pathways other than the exposure, a phenomenon known as horizontal pleiotropy [7], this is a violation of assumptions IV2 and/or IV3. Horizontal pleiotropy is not necessarily mitigated by an increasing sample size and is also harder to detect. Its presence can therefore render very precise MR estimates hopelessly biased. Pleiotropy-robust MR methods have been a major focus of research in recent years for this reason [8–11].

Fig 1. The IV assumptions for a genetic variant G are represented by solid lines in the directed acyclic graph (DAG).

Dotted lines represent violations of IV assumptions as described in IV2 and IV3. The causal effect of a unit increase of the exposure, X, on the outcome, Y, is denoted by β. U represents unobserved confounders of X and Y.

One-sample versus Two-sample MR: Pros and cons

Obtaining access to a single cohort with measured genotype, exposure and outcome data that is large enough to furnish an MR analysis has been difficult, historically. It has instead been far easier to obtain summary data estimates of gene-exposure and gene-outcome associations from two independent studies, and to perform an analysis within the ‘two-sample summary data MR’ framework (see Fig 2) [12, 13]. This has made it an attractive option for the large scale pursuit of MR, through software platforms such as MR-Base [14]. The relative simplicity of these methods (which resemble a standard meta-analysis of study results) and their ability to furnish graphical summaries for the detection and adjustment of pleiotropy [15] has also acted to increase their popularity. Indeed, the array of pleiotropy robust two sample summary data methods far outstrips those available for one sample individual level data MR analysis [16]. A further advantage of two-sample over one-sample MR is that weak instruments bias causal estimates towards the null (it is often referred to as a ‘dilution’ bias for this reason), which is conservative [17]. Dilution bias arises precisely because uncertainty in the SNP-exposure association estimates obtained from one cohort is independent of the uncertainty in SNP-outcome association estimates from a non-overlapping cohort (Fig 2). This makes the SNP-exposure association uncertainty akin to ‘classical’ measurement error [18] and enables standard approaches such as Simulation Extrapolation [19, 20] or modified weighting [6, 21] to be used to adjust for its presence. In contrast, weak instruments bias MR estimates obtained from a one sample analysis towards the observational association because uncertainty in the SNP-exposure and SNP-outcome association estimates are correlated. This bias is harder to correct for and is potentially anti-conservative.

Fig 2. In two sample summary data MR, SNP-exposure (G-X) association estimates, $\hat{\beta}_{XG_j}$, from one cohort are combined with SNP-outcome (G-Y) association estimates, $\hat{\beta}_{YG_j}$, from a separate, non-overlapping cohort, to produce a set of SNP-specific causal estimates, $\hat{\beta}_j$.

These are combined using inverse variance weighted meta-analysis ($w_j$ being the weight) to obtain an overall estimate $\hat{\beta}_{IVW}$ for the true causal effect $\beta$.

There are, however, many disadvantages of using two sample summary data compared to individual level data from a single sample MR, some examples of which are now given. The two-sample approach assumes the two cohorts are perfectly homogeneous [13]. If the distribution of confounders is different between the samples, this can result in severe bias [22]. Alternatively, it may be that the independence assumption is violated due to an unknown number of shared subjects across the two studies [23], which cannot be easily removed [24]. Even when the homogeneity assumption is satisfied, two sample methods can give misleading results if the two sets of associations are not properly harmonized [25]. Often, summary statistics from a GWAS have been adjusted for factors that might bias MR results, and the unadjusted data are not available [26]. It may not be possible to source summary data on the exact population needed for a particular analysis (for example, on either men or women only when looking at sex-specific outcomes) [27]. Finally, a richer array of analyses is possible with individual level data, for example the estimation of non-linear causal effects across the full range of the exposure and the exploration of effect modification via covariates.

It is of course possible to naively apply summary data MR methods to the one-sample context, estimating both the gene-exposure and gene-outcome associations in the same sample, an analysis made increasingly easy by the advent of large open-access cohort studies such as the UK Biobank (UKB) [28]. This avoids problems with synthesising and harmonizing data from separate cohorts, but can result in potentially anti-conservative weak instrument bias due to correlated error. A preliminary investigation has found that this naive approach performs particularly badly for pleiotropy robust approaches such as MR-Egger regression [29, 30]. So far, there is no consensus on how best to implement summary data approaches in the one sample setting.

In this paper we propose a general method which we term ‘Collider-Correction’ that can reliably apply two-sample summary data MR methods to one-sample data, whilst maintaining the simplicity and appeal of the two-sample approach. Our method builds on the work of Dudbridge et al. [31], who proposed a method to correct for ‘index event’ (or collider) bias in genetic studies of disease progression, when all subjects included in the analysis have been diagnosed with the disease. In this setting, the analysis is open to contamination from collider bias. Our work serves to clarify that the procedure can be extended to any MR analysis where the aim is to estimate the causal effect, by artificially inducing collider bias in the observational association between X and Y and then correcting for it. This allows any two sample method to be used in a one sample design, thereby benefiting from the plethora of weak instrument and pleiotropy robust approaches available. We show that this approach (a) is statistically efficient compared to artificially splitting the data in two, and (b) will deliver consistent estimates of the causal effect whenever the assumptions of the underlying two-sample approach are satisfied.

Although our method builds on the work of Dudbridge et al, there are several major differences. Firstly, whilst Dudbridge et al focus on the unbiased estimation of the direct SNP-outcome associations, we treat these as nuisance parameters and focus instead on estimation of the causal effect. Secondly, whilst the underlying method we use is closely related to the approach of Dudbridge et al when the chosen method is MR-Egger regression, our paper shows that the underlying method can actually be applied to any MR method. Thirdly, whereas Dudbridge et al propose a solution to adjust for weak instrument bias within the specific context of an MR-Egger model which relies on the InSIDE assumption [9], we propose the use of a SIMEX procedure that can be applied to any regression model, including robust regression models that do not rely on InSIDE for identification. Of course, some recent two-sample approaches have weak-instrument robust weighting built into them, for example MR-RAPS [21, 32]. In this case, SIMEX adjustment is unnecessary.

A major reason for the emergence of weak instrument and pleiotropy robust two-sample MR methods [6, 21, 32] is the avoidance of winner’s curse [4], by using one discovery GWAS for instrument selection and two additional data sources for the two-sample MR analysis (i.e. a ‘three-sample’ design). Although this removes winner’s curse by design, it generally yields far weaker instruments. In practice, it may be hard to obtain data from three independent, homogeneous cohorts to enact the three-sample approach, but a nice property of Collider-Correction is that it can be enacted with two independent data sources rather than three. In Results, we apply Collider-Correction to one-sample individual level UK Biobank data to investigate the causal role of sleep disturbance on HbA1c levels, using both overlapping and non-overlapping GWAS data for instrument selection. In the former case winner’s curse is seen to induce a dilution in the MR estimates that is not present in the latter case.

We see three scenarios where our Collider-Correction approach is applicable. Firstly, when interest lies in estimation of the causal effect of an exposure X on an outcome, Y and only summary data on ‘YadjX’ genetic associations are available (for example, waist/hip ratio adjusted for BMI from the GIANT consortium). The second is when researchers have direct access to individual level patient data. This is likely to become much more common over time as further international biobank studies follow the lead of UKB in opening up data access. Extracting the summary statistics for our approach then enables the efficient implementation of any two-sample method to the data. This is attractive because two-sample methods are currently more numerous than one sample methods, more familiar to researchers and more technically advanced (especially in their ability to adjust for weak instrument bias and pleiotropy). Furthermore, if one additional GWAS is available for instrument selection, Collider-Correction enables winner’s curse, weak instrument bias and pleiotropy to be accounted for using two independent data sets rather than three. The third is when data custodians prefer not to grant direct access to individual level data, but are willing to provide the requisite summary statistics for implementing the Collider-Correction approach, safe in the knowledge that the individual-level data analysis can be performed whilst maintaining data security. Allowing large scale, rapid access to confidential data has obvious benefits to the research community and wider society, as demonstrated through initiatives such as OpenSAFELY [33].

Methods

To motivate ideas, we assume the following individual level data model for the exposure X and continuous outcome Y for subject i:

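$$X_i \mid G_i, U_i = \sum_{j=1}^{k} \beta_{XG_j} G_{ij} + \beta_{UX} U_i + \varepsilon_{X_i} \tag{1}$$
$$Y_i \mid X_i, G_i, U_i = \beta X_i + \sum_{j=1}^{k} \alpha_j G_{ij} + \beta_{UY} U_i + \varepsilon_{Y_i} \tag{2}$$
$$= \sum_{j=1}^{k} (\alpha_j + \beta\beta_{XG_j}) G_{ij} + (\beta\beta_{UX} + \beta_{UY}) U_i + \beta\varepsilon_{X_i} + \varepsilon_{Y_i}$$
$$= \sum_{j=1}^{k} \beta_{YG_j} G_{ij} + \varepsilon^{*}_{Y_i} \tag{3}$$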

Here, $G_i = (G_{i1}, \ldots, G_{ik})'$ represents a set of $k$ variants that predict $X_i$, $\beta$ represents the target estimand, reflecting the causal effect of inducing a 1-unit change in the exposure on the outcome, and $U$ represents unmeasured confounding predicting both $X$ and $Y$. The variables $\varepsilon_{X_i}$, $\varepsilon_{Y_i}$ represent independent residual error terms. Since the unmeasured confounder $U$ is common to both $X$ and $Y$, the total residual errors around $X \mid G$, $Y \mid X, G$ and $Y \mid G$ in Eqs (1)–(3) are correlated. This linear model tacitly assumes that the causal effect is the same for all individuals (that is, regardless of their observed exposure level). This is referred to as ‘Homogeneity’: it is an example of a fourth IV assumption that is needed to ‘point identify’ $\beta$ (assumptions IV1-IV3 are sufficient to test for causality only). We re-write model (2) in ‘reduced form’ as model (3) to clarify that the underlying SNP-outcome association $\beta_{YG_j}$ is equal to $\alpha_j + \beta\beta_{XG_j}$. When the exposure is binary, so that $X = 0$ and $X = 1$ refer to being unexposed and exposed respectively, we can again identify $\beta$ by assuming Homogeneity. This would mean that the effect of intervening and changing $X$ from 1 to 0 is equal and opposite to the effect of intervening and changing $X$ from 0 to 1.

The standard approach to estimating $\beta$ with individual level data is Two Stage Least Squares (TSLS). This assumes that all instruments are valid (not pleiotropic), so that $\alpha_j = 0$ for all $j$. TSLS firstly regresses the exposure on all $k$ genotypes simultaneously to derive an estimate for subject $i$’s genetically predicted exposure: $\hat{X}_i = \sum_{j=1}^{k} \hat{\beta}_{XG_j} G_{ij}$, where $\hat{\beta}_{XG_j}$ is the estimated association between SNP $j$ and $X$. The outcome $Y$ is then regressed on $\hat{X}_i$ and its regression coefficient is taken as the causal estimate $\hat{\beta}$. As explained in Fig 2, when the set of $k$ SNPs which predict $X$ are mutually independent (i.e. not in linkage disequilibrium), the TSLS estimate is asymptotically equivalent to the IVW estimate [34] obtained by:

  • Calculating the causal estimate $\hat{\beta}_j$ by dividing the SNP-outcome association $\hat{\beta}_{YG_j}$ (obtained from a regression of $Y$ on $G_j$) by the SNP-exposure estimate $\hat{\beta}_{XG_j}$ for each SNP; and

  • Performing an inverse variance weighted meta-analysis of the $k$ individual causal estimates, $\hat{\beta}_1, \ldots, \hat{\beta}_k$.

The inverse variance weights traditionally used make the simplifying assumption that the SNP-exposure association $\hat{\beta}_{XG_j}$ is sufficiently precise that its uncertainty can be ignored. This is referred to as the No Measurement Error (NOME) assumption [6]. This procedure is equivalent to fitting the following weighted regression model:

$$\hat{\beta}_{YG_j} = \beta\hat{\beta}_{XG_j} + \epsilon_{YG_j} \tag{4}$$

where $\epsilon_{YG_j}$ is the mean zero residual error with $\mathrm{Var}(\epsilon_{YG_j}) = \sigma^2_{YG_j} = \mathrm{Var}(\hat{\beta}_{YG_j})$ and the intercept is constrained to zero. We will refer to this as the ‘standard’ IVW approach. It is commonly used in two sample summary data MR out of necessity because only summary statistics are available, but not typically in the one sample setting [29].
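As a minimal sketch of the ‘standard’ IVW approach, the code below fits model (4) by weighted least squares through the origin and confirms that this matches an inverse variance weighted mean of the per-SNP ratio estimates under NOME. The function names and the four-SNP summary statistics are hypothetical and chosen purely for illustration.

```python
def ivw_estimate(beta_yg, beta_xg, se_yg):
    """Weighted least squares through the origin (model (4)),
    using first-order weights 1 / Var(beta_yg)."""
    weights = [1.0 / s ** 2 for s in se_yg]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_xg, beta_yg))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_xg))
    return num / den


def ivw_from_ratios(beta_yg, beta_xg, se_yg):
    """Inverse variance weighted mean of the ratio estimates beta_yg / beta_xg.
    Under NOME the ratio for SNP j has variance se_yg^2 / beta_xg^2."""
    ratios = [by / bx for by, bx in zip(beta_yg, beta_xg)]
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_xg, se_yg)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical summary statistics for four SNPs (illustration only).
beta_xg = [0.10, 0.08, 0.12, 0.05]      # SNP-exposure estimates
beta_yg = [0.052, 0.037, 0.063, 0.024]  # SNP-outcome estimates
se_yg = [0.010, 0.012, 0.009, 0.015]    # SNP-outcome standard errors

slope = ivw_estimate(beta_yg, beta_xg, se_yg)
meta = ivw_from_ratios(beta_yg, beta_xg, se_yg)
```

The two functions return the same number because multiplying each ratio by its weight recovers the cross-product term of the weighted regression.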

Inducing collider bias into SNP-outcome associations

Consider a regression of the outcome Y on G and X together (but not U). Under our assumed data generating model:

$$E[Y_i \mid X_i, G_i] = \beta^{*} X_i + \sum_{j=1}^{k} \alpha_j^{*} G_{ij}, \tag{5}$$

yielding estimated coefficients $\hat{\beta}^{*}$ and $\hat{\alpha}_1^{*}, \ldots, \hat{\alpha}_k^{*}$. Since $X$ is a function of both $G$ and $U$, conditioning on $X$ induces a correlation between them [35]. This is commonly referred to as ‘collider bias’ [36]. Its presence contaminates the $G_j$-$Y$ association estimate with a contribution through $U$ so that $\hat{\alpha}_j^{*}$ is not a consistent estimate for $\alpha_j$. For the same reason, $\hat{\beta}^{*}$ is not a consistent estimate for $\beta$. It instead reflects the causal effect, plus a contribution from $X$ to $Y$ via $U$. Such ‘collider biased’ analyses are usually avoided for this reason [36]. However, it is in a special sense advantageous to fit model (5) because under models (1) and (2), $\alpha_j^{*}$, $\alpha_j$, $\beta^{*}$ and $\beta$ are linked through the following linear relation:

$$\hat{\alpha}_j^{*} = \alpha_j + (\beta - \beta^{*})\beta_{XG_j} + \theta_j, \tag{6}$$

where $\theta_j$ is mean zero residual error with $\mathrm{Var}(\theta_j) = \sigma^2_{\alpha_j^{*}} = \mathrm{Var}(\hat{\alpha}_j^{*})$ (see S1(A) Text for a detailed derivation). This suggests the following algorithm for estimating the causal effect:

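  1. Regress $Y$ on $X$ and $G$ to obtain the collider biased parameter estimates $\hat{\beta}^{*}$ and $\hat{\alpha}_1^{*}, \ldots, \hat{\alpha}_k^{*}$.

  2. Regress $X$ on $G$ to obtain estimates $\hat{\beta}_{XG_1}, \ldots, \hat{\beta}_{XG_k}$, where
    $$\hat{\beta}_{XG_j} = \beta_{XG_j} + \delta_j, \tag{7}$$
    for independent residual error term $\delta_j$ with mean zero and variance $\sigma^2_{XG_j} = \mathrm{Var}(\hat{\beta}_{XG_j})$;
  3. Fit the linear model:
    $$E[\hat{\alpha}_j^{*} \mid \hat{\beta}_{XG_j}] = \alpha_0 + (\beta - \beta^{*})\hat{\beta}_{XG_j} \tag{8}$$
    under a user-specified loss function and pleiotropy-identifying assumption in order to obtain an estimate for the Collider-Correction term $(\beta - \beta^{*})$.
  4. Adjust the observational estimate to obtain an estimate for the causal effect $\beta$ via:
    $$\hat{\beta} = \hat{\beta}^{*} + \widehat{\beta - \beta^{*}} \tag{9}$$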
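The four steps can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the authors’ software: the helper names (`ols`, `collider_correction_ivw`) are hypothetical, an intercept is included in both regressions, step 3 uses the IVW choice ($\alpha_0 = 0$, weights $1/\mathrm{Var}(\hat{\alpha}_j^{*})$), and the simulated data use strong, valid instruments with a true causal effect of 0.5 so that the result is easy to check.

```python
import numpy as np


def ols(design, response):
    """Ordinary least squares: coefficient estimates and their variances."""
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ response
    resid = response - design @ coef
    sigma2 = resid @ resid / (design.shape[0] - design.shape[1])
    return coef, sigma2 * np.diag(xtx_inv)


def collider_correction_ivw(x, y, g):
    """Steps 1 to 4 with an IVW fit (intercept fixed at zero) in step 3."""
    n = g.shape[0]
    ones = np.ones((n, 1))
    # Step 1: regress Y on X and G (collider-biased estimates).
    coef1, var1 = ols(np.hstack([ones, x[:, None], g]), y)
    beta_star_hat = coef1[1]
    alpha_star_hat, var_alpha_star = coef1[2:], var1[2:]
    # Step 2: regress X on G.
    coef2, _ = ols(np.hstack([ones, g]), x)
    beta_xg_hat = coef2[1:]
    # Step 3: weighted least squares of alpha* on beta_XG through the origin.
    w = 1.0 / var_alpha_star
    correction = np.sum(w * beta_xg_hat * alpha_star_hat) / np.sum(w * beta_xg_hat ** 2)
    # Step 4: add the correction to the observational estimate.
    return beta_star_hat, correction, beta_star_hat + correction


# Simulated data (assumed setup): 20 independent SNPs, confounder U, beta = 0.5.
rng = np.random.default_rng(0)
n, k, beta = 20_000, 20, 0.5
g = rng.binomial(2, 0.3, size=(n, k)).astype(float)
beta_xg = rng.uniform(0.1, 0.3, size=k)
u = rng.normal(size=n)
x = g @ beta_xg + u + rng.normal(size=n)
y = beta * x + u + rng.normal(size=n)

beta_star_hat, correction, beta_hat = collider_correction_ivw(x, y, g)
```

In this setup the observational coefficient is pulled upwards by confounding (towards 1.0), the estimated correction is negative, and their sum lands close to the true value of 0.5.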

The above procedure, which we call ‘Collider-Correction’, is a modification and generalisation of the Dudbridge approach [31]. In steps 3 and 4 we instead focus on estimation of the Collider-Correction term and the causal parameter $\beta$ rather than, as Dudbridge et al do, the pleiotropic effects. Crucially, we clarify that, as long as the model for $Y$ given $X$ and $G$ in step 1 is correctly specified, the correlation between the residual error in model (6) and residual error in the first-stage model (7) will have a mean of zero. To illustrate this we simulated 500 independent sets of data from models (1)–(2), each containing individual level data on 10,000 subjects. We fixed the number of SNPs to $k = 50$: each SNP was bi-allelic (taking the values 0, 1 or 2), mutually uncorrelated with other SNPs, had a minor allele frequency of 0.3, and collectively explained 1.5% of the variance in the exposure. The correlation between the residual errors in models (1) and (2) was approximately 0.5 to reflect moderate confounding. SNP-exposure and SNP-outcome association parameters $\beta_{XG_j}$ and $\alpha_j$ were generated from dependent distributions, so that their average correlation was approximately 0.45. This is a clear violation of the InSIDE assumption that the sample covariance $\widehat{\mathrm{Cov}}(\alpha_j, \beta_{XG_j})$ is zero [6, 9, 13]. We then applied Steps 1 and 2 of the Collider-Correction algorithm to estimate the $\hat{\alpha}_j^{*}$ and $\hat{\beta}_{XG_j}$ terms. Fig 3A shows, for a single simulated data set, the extent of correlation between the 50 $\beta_{XG_j}$ and $\alpha_j$. Fig 3B shows, across all 500 independent data sets, the sample correlation between the first stage residual $\delta_j = \hat{\beta}_{XG_j} - \beta_{XG_j}$ and both:

  • The Collider-Correction residual: $\theta_j = \hat{\alpha}_j^{*} - \alpha_j - (\beta - \beta^{*})\beta_{XG_j}$ (shown in black);

  • The ‘standard’ SNP-outcome residual: $\epsilon_{YG_j} = \hat{\beta}_{YG_j} - \alpha_j - \beta\beta_{XG_j}$ (shown in red).

Fig 3. (A = Left): Scatter plot of $\beta_{XG_j}$ and $\alpha_j$ terms for a single simulated data set. (B = Right): Sample correlation between $\hat{\delta}_j$ and $\hat{\theta}_j$ (black) and $\hat{\delta}_j$ and $\hat{\epsilon}_{YG_j}$ (red).

We see that the mean correlation of the Collider-Correction residual with the 1st stage residual is zero whereas the mean correlation of the standard SNP-outcome residual with the 1st stage residual is 0.5. This residual error independence property is advantageous because it means that step 3 of the Collider-Correction algorithm can be implemented using any pleiotropy robust two-sample summary data MR method, where the estimand of interest is $\beta - \beta^{*}$ rather than the causal effect $\beta$ directly. Crucially, the residual error independence property means that weak instrument bias will induce a dilution in the slope estimate $\widehat{\beta - \beta^{*}}$ towards zero, because it can be viewed as a consequence of ‘classical’ measurement error. This makes it easy to quantify and correct for using standard methods, as we will subsequently discuss.

In the toy example above we purposefully generated the data so that the InSIDE assumption was violated across the entire set of SNPs to demonstrate that residual error independence does not rely on InSIDE. However, the success of any subsequently applied Collider-Corrected two sample approach in consistently estimating the causal effect β (i.e. so that it is asymptotically unbiased) will of course depend on the pleiotropy identifying assumption being met, just as if it were being applied in a standard two-sample setting. Although the Collider-Correction algorithm is generalisable in theory to any MR analysis method, we now describe several canonical implementations, which require that the InSIDE assumption is satisfied across either the entire set of SNPs or a subset of SNPs.

Implementing Collider-Correction

Collider-Corrected IVW implementation

To implement the Collider-Corrected IVW approach we set the parameter $\alpha_0$ to zero in Eq (8) and estimate the slope $(\beta - \beta^{*})$ using weighted least squares via the model:

$$\hat{\alpha}_j^{*} = (\beta - \beta^{*})\hat{\beta}_{XG_j} + \theta_j^{*} \tag{10}$$

where $\theta_j^{*}$ is mean zero residual error with an assumed variance $\sigma^2_{\alpha_j^{*}} = \mathrm{Var}(\hat{\alpha}_j^{*})$. Note that under data-generating model (6) $\theta_j^{*}$ is actually equal to $\theta_j + \alpha_j$. Under the assumption that the mean pleiotropic effect is zero and the InSIDE assumption is satisfied, the residual error independence property of Collider-Correction will mean that $\theta_j^{*}$ is also independent of uncertainty in $\hat{\beta}_{XG_j}$ so that $(\beta - \beta^{*})$ can be consistently estimated. The IVW approach then quantifies additional uncertainty in the estimate for $(\beta - \beta^{*})$ due to the presence of pleiotropy, by increasing its variance by a factor $\phi$ proportional to the variance of the estimated residual $\mathrm{Var}(\hat{\theta}_j)$ whenever this variance is greater than 1. This is equivalent to fitting a multiplicative random effects model [13].

The IVW estimate uses ‘1st order weights’ that ignore uncertainty in the SNP-exposure association estimate by assuming that its variance $\sigma^2_{XG_j} \approx 0$. This is referred to as the NO Measurement Error (NOME) assumption [6]. When this is violated the estimate $\widehat{\beta - \beta^{*}}$ from model (10) will be diluted towards zero by a factor of $(\bar{F} - 1)/\bar{F}$, where:

$$\bar{F} = \frac{1}{k}\sum_{j=1}^{k} \frac{\hat{\beta}_{XG_j}^{2}}{\sigma^2_{XG_j}} \tag{11}$$

See Section 3.2 in [21] for a more detailed explanation. Note that, whilst the Collider-Correction slope is diluted towards zero in the presence of weak instrument bias, the causal estimate itself is still biased toward the observational association estimate $\hat{\beta}^{*}$, because the causal effect calculated in Step 4 of the Collider-Correction algorithm is the sum of $\hat{\beta}^{*}$ and $\widehat{\beta - \beta^{*}}$. A simple and general method for weak instrument bias adjustment that can be applied directly to the IVW estimate from model (10) is Simulation Extrapolation (SIMEX) [19]. Under SIMEX, a parametric bootstrap is used to generate ‘pseudo’ SNP-exposure associations, each one centred on the observed estimate, but with an increasing amount of uncertainty (i.e. with larger and larger values of $\sigma^2_{XG_j}$). This subsequently induces an increasing dilution in the IVW estimate for $(\beta - \beta^{*})$. A global model is then fitted to the entire set of simulated data in order to extrapolate back to the estimate for $(\beta - \beta^{*})$ that would have been obtained if there were no uncertainty in the SNP-exposure associations (i.e. $\sigma^2_{XG_j} = 0$, NOME satisfied). SIMEX is attractive because it can be applied to any regression model (and hence many MR methods), and reliable software is available in standard packages, such as R and Stata.
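The SIMEX idea described above can be sketched as follows. This is an illustrative version under stated assumptions rather than the packaged R or Stata routines: the function names are hypothetical, the extrapolant is a quadratic in the inflation factor $\lambda$ (a common but not unique choice), and the synthetic summary statistics use a deliberately large number of pseudo-SNPs so that the illustration is stable.

```python
import numpy as np


def wls_slope(a, b, w):
    """Weighted least squares slope through the origin (model (10))."""
    return np.sum(w * a * b) / np.sum(w * b ** 2)


def simex_slope(alpha_star_hat, beta_xg_hat, se_xg, var_alpha_star,
                lambdas=(0.5, 1.0, 1.5, 2.0), n_boot=200, seed=0):
    """Return (naive slope, SIMEX-adjusted slope).

    For each lambda, extra noise with variance lambda * se_xg^2 is added to the
    SNP-exposure estimates and the slope is refitted; a quadratic in lambda is
    then extrapolated back to lambda = -1 (no measurement error).
    """
    rng = np.random.default_rng(seed)
    w = 1.0 / var_alpha_star
    lam_grid = [0.0]
    mean_slopes = [wls_slope(alpha_star_hat, beta_xg_hat, w)]
    for lam in lambdas:
        noise = rng.normal(size=(n_boot, beta_xg_hat.size)) * np.sqrt(lam) * se_xg
        pseudo = beta_xg_hat + noise  # one row per bootstrap replicate
        slopes = (pseudo * (w * alpha_star_hat)).sum(axis=1) / (w * pseudo ** 2).sum(axis=1)
        lam_grid.append(lam)
        mean_slopes.append(slopes.mean())
    coefs = np.polyfit(lam_grid, mean_slopes, deg=2)
    return mean_slopes[0], np.polyval(coefs, -1.0)


# Synthetic summary statistics (assumed values, mean F of roughly 5).
rng = np.random.default_rng(1)
k, true_slope = 2000, -0.6
beta_xg = rng.uniform(0.05, 0.15, size=k)
se_xg = np.full(k, 0.05)
se_alpha = np.full(k, 0.02)
beta_xg_hat = beta_xg + rng.normal(size=k) * se_xg
alpha_star_hat = true_slope * beta_xg + rng.normal(size=k) * se_alpha

naive, adjusted = simex_slope(alpha_star_hat, beta_xg_hat, se_xg, se_alpha ** 2)
```

Because the true attenuation curve is not exactly quadratic, the extrapolated value typically removes most, but not all, of the dilution in the naive slope.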

Connecting IVW to LIML and MR-RAPS

An alternative to SIMEX in the special case of the IVW approach is to find the values of $(\beta - \beta^{*})$ and $\sigma^2_{\alpha^{*}} = \mathrm{Var}(\alpha_j^{*})$ that minimise the weighted sum of squared residuals in the extended model (12):

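$$\hat{\alpha}_j^{*} = (\beta - \beta^{*})\hat{\beta}_{XG_j} + \sqrt{(\beta - \beta^{*})^{2}\sigma^2_{XG_j} + \sigma^2_{\alpha_j^{*}} + z\sigma^2_{\alpha^{*}}}\;\epsilon_j \tag{12}$$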

When $z = 0$ in (12), the pleiotropy variance $\sigma^2_{\alpha^{*}} = \mathrm{Var}(\alpha_j^{*})$ is fixed to zero and the above procedure is equivalent to performing Limited Information Maximum Likelihood (LIML) with summary data (see Section 3.1 in [21]). Furthermore, the weighted sum of squared residuals from (12) follows a $\chi^2_{k-1}$ distribution when the assumption that $\mathrm{Var}(\alpha_j^{*}) = 0$ is satisfied, thus providing a simple weak instrument bias robust test for the presence of pleiotropy. This is referred to as the ‘exact’ Q statistic [6], which is similar to the simulation-based MR-PRESSO test for ‘global’ pleiotropy [37, 38].

Unfortunately, when pleiotropy is present so that $\mathrm{Var}(\alpha_j^{*}) \neq 0$, the LIML estimate will be biased [6]. In order to account for both weak instrument bias and non-zero pleiotropy, $z$ can be set to 1 so that the squared residual minimisation is over both $(\beta - \beta^{*})$ and $\sigma^2_{\alpha^{*}}$. This is equivalent to applying ‘MR-RAPS’ [21] to the Collider-Correction summary statistics. MR-RAPS actually uses an approximation to the least-squares method because the maximum likelihood estimates are inherently unstable. This entails the use of a score function to proxy for the likelihood and a penalization term to dampen the effect of large residuals.
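The $z = 0$ case can be sketched as a one-dimensional minimisation of the weighted sum of squared residuals, with weights that depend on the candidate slope. The code below is an illustration under stated assumptions: the function names are hypothetical, a simple grid search stands in for a proper optimiser, and the synthetic summary statistics contain no pleiotropy. It does not implement the score function or penalization used by MR-RAPS.

```python
import numpy as np


def exact_q(slope, alpha_star_hat, beta_xg_hat, var_alpha_star, var_xg):
    """Weighted sum of squared residuals with z = 0: each residual is scaled by
    Var(alpha*_j) + slope^2 * Var(beta_XGj)."""
    resid = alpha_star_hat - slope * beta_xg_hat
    return np.sum(resid ** 2 / (var_alpha_star + slope ** 2 * var_xg))


def liml_slope(alpha_star_hat, beta_xg_hat, var_alpha_star, var_xg, grid=None):
    """Grid-search minimiser of exact_q; returns (slope, Q at the minimum)."""
    if grid is None:
        grid = np.linspace(-2.0, 2.0, 4001)  # assumed search range
    q_values = np.array([
        exact_q(s, alpha_star_hat, beta_xg_hat, var_alpha_star, var_xg)
        for s in grid
    ])
    best = int(np.argmin(q_values))
    return grid[best], q_values[best]


# Synthetic summary statistics without pleiotropy (assumed values).
rng = np.random.default_rng(1)
k, true_slope = 2000, -0.6
beta_xg = rng.uniform(0.05, 0.15, size=k)
var_xg = np.full(k, 0.05 ** 2)
var_alpha = np.full(k, 0.02 ** 2)
beta_xg_hat = beta_xg + rng.normal(size=k) * np.sqrt(var_xg)
alpha_star_hat = true_slope * beta_xg + rng.normal(size=k) * np.sqrt(var_alpha)

# Naive first-order IVW slope for comparison (ignores var_xg).
w = 1.0 / var_alpha
naive = np.sum(w * alpha_star_hat * beta_xg_hat) / np.sum(w * beta_xg_hat ** 2)
slope_liml, q_min = liml_slope(alpha_star_hat, beta_xg_hat, var_alpha, var_xg)
```

The minimised value `q_min` is the quantity that would be compared against the chi-squared reference distribution when testing for pleiotropy.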

Collider-Corrected MR-Egger implementation

In order to account for pleiotropy with a non-zero mean but under the InSIDE assumption, we could instead allow the intercept $\alpha_0$ and slope $(\beta - \beta^{*})$ to be freely estimated via weighted least squares by fitting a Collider-Correction MR-Egger model [9]:

$$\hat{\alpha}_j^{*} = \alpha_0 + (\beta - \beta^{*})\hat{\beta}_{XG_j} + \theta_j^{*}, \tag{13}$$

where $\theta_j^{*}$ is mean zero residual error with an assumed variance $\sigma^2_{\alpha_j^{*}} = \mathrm{Var}(\hat{\alpha}_j^{*})$. Note that under data-generating model (6) $\theta_j^{*}$ is actually equal to $\theta_j + \alpha_j - \alpha_0$. Using the same argument as for the IVW model, when InSIDE is satisfied this will consistently estimate the Collider-Correction slope (adjusted for $\alpha_0$) and from there, the causal effect. Additional uncertainty due to pleiotropy can again be handled using a multiplicative random effects model [13]. To assess the vulnerability of the MR-Egger regression estimates to weak instrument bias due to violation of the NOME assumption, we use the $I^2_{GX}$ statistic [5]:

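$$I^2_{GX} = \frac{Q_{GX} - (k - 1)}{Q_{GX}}, \quad \text{where} \quad Q_{GX} = \sum_{j=1}^{k} \frac{(\hat{\beta}_{XG_j} - \bar{\beta}_{XG})^{2}}{\sigma^2_{XG_j}} \tag{14}$$

The expected dilution in the Collider-Correction $\widehat{\beta - \beta^{*}}$ due to weak instruments is equal to $(\beta - \beta^{*})I^2_{GX}$. This can easily be adjusted for by applying SIMEX to model (13), just as for the IVW approach.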
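The two instrument strength diagnostics, the mean F statistic of Eq (11) and the $I^2_{GX}$ statistic of Eq (14), can be computed directly from the SNP-exposure summary statistics. The sketch below is illustrative: the function names and the four-SNP inputs are hypothetical, the mean in $Q_{GX}$ is taken to be the inverse variance weighted mean, and $I^2_{GX}$ is truncated at zero, both of which are assumptions borrowed from the usual meta-analysis conventions.

```python
def mean_f(beta_xg_hat, se_xg):
    """Mean F statistic across SNPs: average of (estimate / standard error)^2."""
    return sum((b / s) ** 2 for b, s in zip(beta_xg_hat, se_xg)) / len(beta_xg_hat)


def i2_gx(beta_xg_hat, se_xg):
    """I^2_GX = (Q_GX - (k - 1)) / Q_GX, truncated at zero (assumed convention)."""
    k = len(beta_xg_hat)
    weights = [1.0 / s ** 2 for s in se_xg]
    # Inverse variance weighted mean of the SNP-exposure estimates (assumption).
    centre = sum(w * b for w, b in zip(weights, beta_xg_hat)) / sum(weights)
    q_gx = sum(w * (b - centre) ** 2 for w, b in zip(weights, beta_xg_hat))
    return max(0.0, (q_gx - (k - 1)) / q_gx)


# Hypothetical SNP-exposure estimates and standard errors (illustration only).
beta_xg_hat = [0.10, 0.08, 0.12, 0.05]
se_xg = [0.02, 0.02, 0.02, 0.02]

f_bar = mean_f(beta_xg_hat, se_xg)
ivw_dilution = (f_bar - 1.0) / f_bar       # expected IVW dilution factor
egger_dilution = i2_gx(beta_xg_hat, se_xg)  # expected MR-Egger dilution factor
```

With these inputs the IVW dilution factor is close to 1 while the MR-Egger factor is much smaller, which reflects that MR-Egger depends on the spread of the SNP-exposure estimates rather than on their overall size.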

Collider-Corrected robust regression

IVW, MR-Egger and MR-RAPS rely on the InSIDE assumption to consistently estimate the causal effect. This may be violated in practice, hence the rationale for the development of alternative, robust methods such as the Weighted Median [10]. In the two-sample summary data context it can consistently estimate the causal effect if the majority of the ‘weight’ in the MR analysis stems from genetic variants that are not pleiotropic. That is, the existence of a SNP subset $S$ is assumed for which $\widehat{\mathrm{Cov}}_{j \in S}(\alpha_j, \beta_{XG_j}) = \widehat{\mathrm{Cov}}_{S}(0, \beta_{XG_j}) = 0$, but InSIDE is allowed to be violated for SNPs not in subset $S$. The downside of the weighted median approach is that it is not directly equivalent to a regression model, which in turn means that we cannot benefit from a procedure like SIMEX to perform a weak instrument bias adjustment. However, there is a close connection between the median and minimisation using a Least Absolute Deviation (LAD), or L1-norm. We therefore propose the use of LAD regression [39] instead of least squares, at Step 3 of the Collider-Correction algorithm, with $\alpha_0$ set to zero. This is close in spirit to the Weighted Median, and is amenable to SIMEX-adjustment too. The exact ‘breakdown point’ of LAD regression (or the proportion of pleiotropic SNPs above which LAD regression will not deliver a consistent estimate) depends on the data generating model, but is bounded between $1/k$ ($k$ being the number of SNPs) and 1/2.
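The connection between LAD regression and the median can be made concrete. With $\alpha_0 = 0$, a weighted LAD objective $\sum_j w_j |\hat{\alpha}_j^{*} - s\hat{\beta}_{XG_j}|$ can be rewritten as $\sum_j w_j |\hat{\beta}_{XG_j}| \, |\hat{\alpha}_j^{*}/\hat{\beta}_{XG_j} - s|$, which is minimised by a weighted median of the ratios. The sketch below uses this identity; the function name, the equal weights and the seven-SNP data (five valid SNPs and two pleiotropic ones) are hypothetical choices for illustration, and the weighting scheme used in practice may differ.

```python
def lad_slope_through_origin(alpha_star_hat, beta_xg_hat, weights):
    """Minimise sum_j w_j * |alpha*_j - s * beta_XGj| over s.

    The minimiser is a weighted median of the ratios alpha*_j / beta_XGj
    with weights w_j * |beta_XGj|.
    """
    pairs = sorted(
        (a / b, w * abs(b))
        for a, b, w in zip(alpha_star_hat, beta_xg_hat, weights)
        if b != 0
    )
    half = sum(weight for _, weight in pairs) / 2.0
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return ratio


def ls_slope_through_origin(alpha_star_hat, beta_xg_hat, weights):
    """Weighted least squares slope through the origin, for comparison."""
    num = sum(w * a * b for a, b, w in zip(alpha_star_hat, beta_xg_hat, weights))
    den = sum(w * b ** 2 for b, w in zip(beta_xg_hat, weights))
    return num / den


# Hypothetical data: true slope -0.6, last two SNPs carry a pleiotropic effect.
true_slope = -0.6
beta_xg_hat = [0.10, 0.08, 0.12, 0.05, 0.09, 0.11, 0.07]
pleiotropy = [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.1]
alpha_star_hat = [true_slope * b + p for b, p in zip(beta_xg_hat, pleiotropy)]
weights = [1.0] * len(beta_xg_hat)

lad = lad_slope_through_origin(alpha_star_hat, beta_xg_hat, weights)
ls = ls_slope_through_origin(alpha_star_hat, beta_xg_hat, weights)
```

Here the LAD fit recovers the slope shared by the five valid SNPs, whereas the least squares fit is pulled towards zero by the two outlying SNPs.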

Simulation studies

In order to confirm our theoretical results and assess the performance of the Collider-Correction algorithm, data sets of between 5000 and 50,000 individuals were generated under models (1) and (2) as described previously. Across all simulations:

  • The causal effect of inducing a one-unit change in the exposure on the outcome, β, was set to 0.5 for all individuals;

  • The correlation between the residual errors in model (1) and (2) was set to approximately 0.9 to reflect strong confounding;

  • The observational estimate $\hat{\beta}^{*}$ and the true Collider-Correction term $\beta - \beta^{*}$ were approximately 1.12 and -0.62 respectively.

To showcase the ability of IVW-based approaches, MR-Egger regression and LAD regression, pleiotropy parameters and SNP-exposure associations were generated under three distinct models:

  • For IVW simulations, pleiotropic effect parameters $\alpha_1, \ldots, \alpha_{50}$ were generated with a zero mean independently of the SNP-exposure associations $\beta_{XG_1}, \ldots, \beta_{XG_{50}}$ (InSIDE satisfied);

  • For MR-Egger simulations, pleiotropic effect parameters $\alpha_1, \ldots, \alpha_{50}$ were generated with a non-zero mean independently of the SNP-exposure associations $\beta_{XG_1}, \ldots, \beta_{XG_{50}}$ (InSIDE satisfied);

  • For LAD regression simulations, pleiotropic effect parameters $\alpha_1, \ldots, \alpha_{15}$ were generated with a non-zero mean dependent on the SNP-exposure associations $\beta_{XG_1}, \ldots, \beta_{XG_{15}}$ (with an average correlation of 0.5) whilst $\alpha_{16}, \ldots, \alpha_{50}$ were set to 0. InSIDE was therefore strongly violated across SNPs 1 to 15, satisfied across SNPs 16 to 50, and violated across the full set of SNPs.

IVW simulation results

Fig 4 shows, for a range of sample sizes, the average value across 1000 independent data sets of: (a) the standard IVW estimate (black line); (b) the SIMEX adjusted standard IVW estimate (blue line); (c) the Collider-Corrected IVW estimate (red line); (d) the Collider-Corrected IVW estimate with SIMEX correction (green line); (e) the TSLS estimate (orange line) and (f) the Collider-Corrected MR-RAPS estimate (implemented using the ‘Tukey’ penalization option). We see that methods (a), (c) and (e) give approximately the same answer, and are therefore hard to individually distinguish in the figure. The approximate equivalence of the TSLS and IVW approaches with uncorrelated SNPs is well known, but it is reassuring that our two step approach gives an equivalent answer. We also see that applying a direct SIMEX correction to method (a) (i.e. method (b)) dramatically increases the bias of the causal estimate beyond even that of the observational estimate for small sample sizes. This bias is slow to diminish as the sample size grows. This poor performance arises because uncertainty in the SNP-exposure association estimates cannot be viewed as classical measurement error within a standard IVW model. Conversely, we see that applying a SIMEX correction to the Collider-Corrected IVW estimate (c) (i.e. method (d)) yields a steadily decreasing bias which is essentially zero when the mean F statistic across the instruments is larger than 5. The Collider-Corrected MR-RAPS estimate performs very well too, and is essentially unbiased for mean F statistics greater than 3.5.
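To illustrate what a SIMEX adjustment does to a slope that is attenuated by classical measurement error in the regressor, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the regression is unweighted with zero intercept, the extrapolant is a quadratic in λ, and the λ grid, number of replicates, function names and simulated data are all choices made for this example.

```python
import random


def slope_through_origin(x, y):
    """Least-squares slope of y on x with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(lams, vals):
    """Least-squares fit of vals ~ a + b*lam + c*lam^2 via Cramer's rule."""
    s = [sum(l ** p for l in lams) for p in range(5)]
    t = [sum((l ** p) * v for l, v in zip(lams, vals)) for p in range(3)]
    a_mat = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(a_mat)
    coefs = []
    for col in range(3):
        m = [row[:] for row in a_mat]
        for r in range(3):
            m[r][col] = t[r]
        coefs.append(det3(m) / d)
    return coefs  # [a, b, c]


def simex_slope(x_obs, y, se_x, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=100, seed=0):
    """Simulation step: add extra error of variance lam * se^2 and refit.
    Extrapolation step: evaluate the fitted quadratic at lam = -1."""
    rng = random.Random(seed)
    grid = [0.0]
    means = [slope_through_origin(x_obs, y)]
    for lam in lambdas:
        total = 0.0
        for _ in range(n_rep):
            noisy = [xj + rng.gauss(0.0, (lam ** 0.5) * sj)
                     for xj, sj in zip(x_obs, se_x)]
            total += slope_through_origin(noisy, y)
        grid.append(lam)
        means.append(total / n_rep)
    a, b, c = fit_quadratic(grid, means)
    return a - b + c


# Simulated example: true slope -0.62, regressor observed with error.
rng = random.Random(2021)
k, true_slope, se = 300, -0.62, 0.5
x_true = [rng.gauss(0.0, 1.0) for _ in range(k)]
x_obs = [x + rng.gauss(0.0, se) for x in x_true]
y = [true_slope * x + rng.gauss(0.0, 0.05) for x in x_true]
naive = slope_through_origin(x_obs, y)
adjusted = simex_slope(x_obs, y, [se] * k)
```

In this setting the naive slope is shrunk towards zero, and the extrapolated value moves back towards the true slope. A quadratic extrapolant typically removes most, but not all, of the attenuation.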

Fig 4. Performance of IVW implementations (including the Collider-Correction algorithm) using one-sample data.


Fig 5 gives further intuition on why the correction process works. The black line shows the estimated Collider-Correction $\widehat{\beta - \beta^*}$ as a function of the sample size. The blue line shows the true Collider-Correction multiplied by the expected dilution factor $(\bar{F}-1)/\bar{F}$, which varies as a function of the sample size. The fact that the two lines are in good agreement indicates that the dilution in $\widehat{\beta - \beta^*}$ can be accurately predicted by the F-statistic formula and underlines why SIMEX can be used to correct for it. Fig 6 shows the performance of the IVW estimate implemented using the (one sample) Collider-Correction algorithm, versus that obtained from artificially splitting the data in two and applying the ‘standard’ IVW approach, that is, calculating SNP-exposure associations in one half, SNP-outcome associations in the other half and combining them in the usual manner. This ensures that the residual error independence property is satisfied, as it is for the one sample Collider-Correction approach. Results for each method are shown with and without SIMEX correction. We see that the absolute bias of the Collider-Correction implementations is less than that of the two-sample implementation. However, the two estimation strategies differ more substantially in terms of precision, as shown in Fig 7. Collider-Correction of one sample data is shown to be far more efficient than sample splitting.
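The dilution factor described above is easy to compute from summary statistics. The sketch below assumes that each SNP's F statistic can be approximated by the squared ratio of its SNP-exposure estimate to its standard error; the function names and the example value are illustrative assumptions.

```python
def mean_f_statistic(beta_xg_hat, se_xg):
    """Mean per-SNP F statistic, approximating F_j by (estimate / SE)^2."""
    f_stats = [(b / s) ** 2 for b, s in zip(beta_xg_hat, se_xg)]
    return sum(f_stats) / len(f_stats)


def expected_dilution(mean_f):
    """Expected dilution factor (F - 1) / F of the Collider-Correction slope."""
    return (mean_f - 1.0) / mean_f


# With a mean F of 5, a true Collider-Correction of -0.62 would be expected
# to be estimated, on average, as roughly 0.8 * -0.62.
factor = expected_dilution(5.0)
expected_slope = factor * -0.62
```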

Fig 5. An illustration that the Collider-Correction slope’s dilution can be accurately predicted using the F-statistic.


Fig 6. A comparison of the one sample Collider-Correction versus two-sample IVW approaches in terms of bias.


Fig 7. A comparison of the one sample Collider-Correction versus two-sample IVW approaches in terms of efficiency.


Figs 8A and 9A show the corresponding standard deviation and mean-squared error (a measure of accuracy that equals an estimate’s variance plus the squared bias) for all IVW-based methods across the same set of simulations. They show that whilst the MR-RAPS estimate is less biased for small sample sizes than the Collider-Corrected IVW method with SIMEX adjustment, it is more variable and less accurate.
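The decomposition of mean-squared error into variance plus squared bias can be checked directly on a set of Monte-Carlo estimates. The helper below is a small sketch; the function name and the four toy estimates are assumptions for illustration, and the variance uses the divisor n so that the identity holds exactly.

```python
def summarise_estimates(estimates, truth):
    """Return (bias, standard deviation, mean-squared error) of the estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - truth
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return bias, variance ** 0.5, mse


# Toy example with a true causal effect of 0.5.
bias, sd, mse = summarise_estimates([0.4, 0.5, 0.6, 0.7], truth=0.5)
```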

Fig 8. Monte-Carlo standard deviations for all IVW (A = top-left), MR-Egger (B = top-right) and LAD regression (C = bottom) estimators.


Fig 9. Mean Squared Error for all IVW (A = top-left), MR-Egger (B = top-right) and LAD regression (C = bottom) estimators.


MR-Egger simulation results

Fig 10 shows, for a range of sample sizes, the average value across 1000 independent data sets of: (a) the standard (one-sample) MR-Egger estimate (black line); (b) the SIMEX adjusted standard MR-Egger estimate (blue line); (c) the Collider-Corrected MR-Egger estimate (red line) and (d) the Collider-Corrected MR-Egger estimate with SIMEX correction (green line). As the sample size increases the $I^2_{GX}$ statistic increases from 0.1 to 0.5. This signals that the 50 SNPs get collectively stronger as a set of instruments within MR-Egger as the sample size increases, but even at the largest sample size we expect a dilution of 50% in the MR-Egger slope. Again, we see that standard and Collider-Corrected MR-Egger methods give the same results, but the two approaches differ greatly under SIMEX correction, with the SIMEX adjusted Collider-Corrected estimate being least biased. In Fig 11 we show how dilution in the Collider-Corrected slope estimate $\widehat{\beta - \beta^*}$ for MR-Egger can be accurately quantified using the $I^2_{GX}$ statistic, just as the F-statistic predicts the dilution for IVW. This explains why SIMEX adjustment works.
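For readers who want to compute the statistic that governs MR-Egger dilution, the following sketch applies the usual I² formula, (Q - (k - 1)) / Q, to the SNP-exposure associations, with Cochran's Q calculated using inverse-variance weights [5]. The function name, the truncation at zero and the toy numbers are assumptions of this example; weighted variants of the statistic exist and are not shown.

```python
def i2_gx(beta_xg_hat, se_xg):
    """I^2 statistic of the SNP-exposure associations, truncated at zero."""
    weights = [1.0 / s ** 2 for s in se_xg]
    pooled = sum(w * b for w, b in zip(weights, beta_xg_hat)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, beta_xg_hat))
    k = len(beta_xg_hat)
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q)


# Toy example: Q = 8 on 2 degrees of freedom gives I^2 = 0.75, so the
# MR-Egger slope would be expected to retain about 75% of its true size.
value = i2_gx([0.0, 2.0, 4.0], [1.0, 1.0, 1.0])
```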

Fig 10. Performance of the MR-Egger implementation of the Collider-Correction algorithm under a directional pleiotropy scenario.


Fig 11. An illustration that the Collider-Correction slope’s dilution can be accurately predicted using the $I^2_{GX}$ statistic.


Figs 8B and 9B show the corresponding standard deviation and mean squared error for all MR-Egger-based methods across the same set of simulations. Standard and Collider-Corrected MR-Egger are seen to have the joint smallest variance, but Collider-Corrected MR-Egger with SIMEX adjustment has the smallest mean-squared error because it is far less biased.

LAD-regression simulation results

Fig 12 shows for a range of sample sizes the average value across 1000 independent data sets of: (a) The standard (one-sample) LAD-regression estimate (black line); (b) the SIMEX adjusted standard LAD regression estimate (blue line); (c) the Collider-Corrected LAD regression estimate (red line) and (d) the Collider-Corrected LAD regression estimate with SIMEX correction (green line). For comparison we also show (e) the standard IVW estimate: its bias does not approach zero as the sample size increases because of the presence of non-zero mean pleiotropy violating InSIDE, which is the very motivation for LAD regression. As in the previous simulations, standard and Collider-Corrected LAD regression give identical point estimates on average, but when SIMEX adjustment is applied the two estimates diverge substantially. Collider-Corrected LAD regression with SIMEX adjustment results in the least biased estimates of all.

Fig 12. Performance of the LAD-regression implementation of the Collider-Correction algorithm under InSIDE-violating pleiotropy.


Fig 13 plots the mean dilution in the Collider-Corrected LAD regression estimate, versus that predicted by the IVW dilution factor $(\bar{F}-1)/\bar{F}$. The fact that the observed dilution is below the expected IVW dilution illustrates that LAD regression is more vulnerable to weak instrument bias, because it is a less efficient but more robust technique. This emphasises the importance of being able to address its weak instrument bias.

Fig 13. An illustration that the Collider-Correction slope’s dilution under a LAD-regression analysis can be approximately (but not exactly) predicted using the F-statistic.


Figs 8C and 9C show the corresponding standard deviation and mean-squared error for all LAD regression-based methods across the same set of simulations. The same pattern of higher variance but lower mean-squared error is seen for the SIMEX adjusted Collider-Corrected LAD regression approach as in the MR-Egger case.

The MR-RAPS approach can, in theory, consistently estimate the causal effect when a small proportion of SNPs are pleiotropic and violate the InSIDE assumption, as long as their contribution is strongly penalized by its robust loss function. In order to test this we also calculated the MR-RAPS estimate when applied to the simulated data for LAD regression. MR-RAPS was seen to work well for a proportion of simulated data sets, but its estimates were unstable: in many cases they were an order of magnitude larger than the true value of 0.5. To illustrate this, Fig 14 shows the distribution of its estimates at the largest sample size of 50,000 subjects, where it was most stable. Even in this case substantial instability is observed.
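To give a sense of how a robust loss limits the influence of outlying SNPs, the sketch below compares the squared-error loss with Tukey's biweight loss in its textbook form. This is an illustration of the general idea only: the tuning constant c = 4.685 is the conventional default and is an assumption here, and the text does not state how the MR-RAPS software parametrises its ‘Tukey’ option.

```python
def squared_loss(r):
    """Squared-error loss: grows without bound as the residual grows."""
    return 0.5 * r * r


def tukey_biweight_loss(r, c=4.685):
    """Tukey's biweight loss: close to r^2 / 2 near zero, constant beyond c."""
    if abs(r) >= c:
        return c * c / 6.0
    u = 1.0 - (r / c) ** 2
    return (c * c / 6.0) * (1.0 - u ** 3)


# A standardised residual of 10 costs 50 under squared error, but no more
# than any other residual beyond c under the biweight loss.
cap = tukey_biweight_loss(100.0)
```

Because the biweight loss is bounded, a grossly pleiotropic SNP contributes at most a fixed amount to the objective, which is the sense in which its contribution is "strongly penalized". The bounded loss also makes the objective non-convex, which is one general reason why estimates based on such losses can be unstable.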

Fig 14. Distribution of MR-RAPS estimates at sample size = 50,000 when the data were generated under the LAD regression model.


(A = Left: all estimates, B = Right: estimates less than 1.)

R code for reproducing the simulation study results is available in S1 Code.

Results: Assessing the causal role of insomnia on HbA1c

Observationally, sub-optimal sleep (i.e., low sleep quantity and quality) has been found to be associated with hyperglycaemia [40–42] and increased diabetes risk [43]. Insomnia, defined as difficulty initiating or maintaining sleep, is one of the most important indices of sleep quality [44]. It has been associated with type 2 diabetes in observational studies [44] and in a previous Mendelian randomization study [45]. However, it is unclear whether associations with insomnia are mediated through HbA1c in the general population, whose glucose levels may not meet the threshold criteria for a formal diabetes diagnosis. As such, we focus on a potentially causal role of insomnia on HbA1c, a well-established clinical assessment of long-term glycaemic regulation that is central to the diagnosis of diabetes [46]. To address this question we use individual level data on approximately 320,000 individuals in UK Biobank to furnish a one sample Mendelian randomization study.

Two hundred and forty-eight independent genetic variants at 202 loci were associated with self-reported insomnia at the standard genome-wide significance threshold (p-value $< 5 \times 10^{-8}$) in a recent GWAS of over 1.33 million UK Biobank and 23andMe individuals reported by Jansen et al. [45]. These variants collectively explained 2.6% of the total trait variance. SNP-exposure associations were measured on the log-odds scale using logistic regression. Among this set of variants, 240 SNPs were in principle available for use as instruments in UK Biobank. In this cohort, participants were asked: “Do you have trouble falling asleep at night or do you wake up in the middle of the night?” with responses “Never/rarely”, “Sometimes”, “Usually”, or “Prefer not to answer”. Those who responded “Prefer not to answer” were set to missing. To reflect the Jansen analysis, the remaining entries were treated as a binary variable for insomnia symptoms, with “Never/rarely”, “Sometimes”, and “Usually” coded as 0, 0, and 1, respectively, and a logistic regression was performed. HbA1c measurements were obtained from a panel of biomarkers assayed from blood samples collected at baseline from UK Biobank participants. HbA1c (mmol/mol) was measured in red blood cells by HPLC analysis using Bio-Rad VARIANT II Turbo, and the measurements were log-transformed.
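The coding of the insomnia question described above can be written down explicitly. The mapping below follows the text; the names `INSOMNIA_CODING` and `code_insomnia` are hypothetical, and `None` is used here to represent a missing value.

```python
# Response categories as quoted in the text, mapped to the binary exposure.
INSOMNIA_CODING = {
    "Never/rarely": 0,
    "Sometimes": 0,
    "Usually": 1,
    "Prefer not to answer": None,  # set to missing
}


def code_insomnia(response):
    """Return 0, 1 or None (missing); raise KeyError for unknown responses."""
    return INSOMNIA_CODING[response]


responses = ["Usually", "Sometimes", "Prefer not to answer", "Never/rarely"]
coded = [code_insomnia(r) for r in responses]
# Drop missing values before fitting the logistic regression.
complete_cases = [c for c in coded if c is not None]
```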

Instrument selection and winner’s curse

The mean F statistic for the 240 genetic instruments in the original GWAS was 41, but in order to avoid winner’s curse we did not want to incorporate these estimates directly into our MR analysis. In UK Biobank the same SNPs had an $\bar{F}$ of approximately 8.3 and an $I^2_{GX}$ statistic of approximately 40%, meaning that the MR analysis was susceptible to bias due to both weak instruments and pleiotropy. This motivates the use of our Collider-Correction method for causal estimation. However, the original Jansen GWAS combined data from the UK Biobank (n = 386,533) and 23andMe (n = 944,477) using METAL [47]. As such, there was an approximate 23% overlap between data used for SNP discovery and for estimation in our MR model [4]. To additionally assess the impact of winner’s curse for this reason we performed our subsequent analysis using (a) all 240 SNPs and (b) a subset of 112 SNPs that were genome-wide significant using only the 23andMe portion of the Jansen data. Analysis (b) is completely protected from winner’s curse whereas (a) is not. The downside of analysis (b) is that, with an $\bar{F}$ of 6.88, it is even more susceptible to weak-instrument bias.

Methods used

We applied the TSLS, IVW, MR-Egger, LAD regression and MR-RAPS approaches to the data. The IVW, MR-Egger and LAD regression approaches were implemented in three ways: (1) the ‘Standard’ 1-sample approach (i.e. using all the data to estimate SNP-exposure and SNP-outcome associations); (2) the Collider-Correction algorithm and (3) Collider-Correction with SIMEX adjustment. Note that MR-RAPS incorporates an internal weak instrument bias adjustment, so there is no need to additionally apply a SIMEX algorithm to it. We refer to approach (3), together with MR-RAPS, as the ‘gold-standard’ methods.

Causal estimates

SNP-exposure associations $\hat{\beta}_{XG_j}$ were obtained from a logistic regression of insomnia on the set of SNPs as well as age at recruitment, sex, assessment centre, 10 genetic principal components, and genotyping chip. Estimates for collider biased SNP-outcome associations $\hat{\alpha}_j^*$ were obtained from a multivariable regression of HbA1c on observed insomnia severity, all genetic variants and the same additional covariates. This second regression additionally yielded an estimate for the collider biased observational association between insomnia severity and HbA1c of $\hat{\beta}^* = 0.012$ (S.E. = 0.00057).
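Given these two sets of summary statistics, the IVW version of the Collider-Correction can be sketched as follows: regress the collider biased SNP-outcome associations on the SNP-exposure associations through the origin, then add the resulting slope to the observational association. This is a minimal sketch and not the authors' code; the function names, the inverse-variance weights based on the standard errors of the SNP-outcome estimates, and the toy numbers (which are not the study data) are assumptions.

```python
def ivw_slope(alpha_star_hat, beta_xg_hat, se_alpha):
    """Inverse-variance weighted slope through the origin."""
    weights = [1.0 / s ** 2 for s in se_alpha]
    num = sum(w * b * a for w, b, a in zip(weights, beta_xg_hat, alpha_star_hat))
    den = sum(w * b * b for w, b in zip(weights, beta_xg_hat))
    return num / den


def collider_corrected_estimate(beta_star_hat, alpha_star_hat, beta_xg_hat, se_alpha):
    """Observational association plus the estimated Collider-Correction slope."""
    correction = ivw_slope(alpha_star_hat, beta_xg_hat, se_alpha)
    return beta_star_hat + correction


# Toy inputs chosen so that the correction slope is exactly 0.010.
beta_xg_hat = [0.1, 0.2, 0.3]
alpha_star_hat = [0.001, 0.002, 0.003]
se_alpha = [0.001, 0.001, 0.001]
estimate = collider_corrected_estimate(0.012, alpha_star_hat, beta_xg_hat, se_alpha)
```

Swapping `ivw_slope` for an MR-Egger or LAD fit gives the other Collider-Corrected estimators, and a SIMEX step can be applied to the slope before it is added to the observational association.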

Fig 15 plots the collider biased SNP-outcome associations versus the SNP-exposure associations for analysis (a). Overlaid on the plot are the weak-instrument and pleiotropy adjusted Collider-Correction slopes $\widehat{\beta - \beta^*}$ estimated by the four gold standard methods. The Q statistic is 809 (df = 239), providing overwhelming evidence of heterogeneity due to pleiotropy. The 13 SNPs circled in black contribute a component to this global statistic with a Bonferroni corrected p-value below (5/240)% and could therefore be classed as outliers. Adjusted causal effect estimates can be found in Table 1. Across all methods, we see a consistent picture: a unit increase in the log-odds of insomnia leads to an increase of between 0.017 and 0.024 units of log mmol/mol HbA1c. All estimates are further from the null than the collider biased observational association, $\hat{\beta}^*$. However, the results highlight that, without weak-instrument adjustment, all summary data MR-methods are biased in the direction of $\hat{\beta}^*$.
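The outlier rule described above can be sketched as follows. Each SNP's contribution to Q is its squared standardised residual around the fitted line, which is compared with a chi-squared distribution on one degree of freedom at the Bonferroni threshold 0.05/k. The function names, the use of first-order weights (the inverse variance of the SNP-outcome estimate) and the toy data are assumptions of this example.

```python
import math


def chi2_1df_pvalue(q):
    """Upper tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(q / 2.0))


def q_statistic_outliers(alpha_star_hat, beta_xg_hat, se_alpha, slope, level=0.05):
    """Return (Q, indices of SNPs whose contribution passes Bonferroni)."""
    contributions = [
        ((a - slope * b) / s) ** 2
        for a, b, s in zip(alpha_star_hat, beta_xg_hat, se_alpha)
    ]
    threshold = level / len(contributions)
    flagged = [j for j, qj in enumerate(contributions)
               if chi2_1df_pvalue(qj) < threshold]
    return sum(contributions), flagged


# Toy data: the third SNP sits five standard errors above the fitted line.
q_total, outliers = q_statistic_outliers(
    alpha_star_hat=[0.001, 0.002, 0.008],
    beta_xg_hat=[0.1, 0.2, 0.3],
    se_alpha=[0.001, 0.001, 0.001],
    slope=0.01,
)
```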

Fig 15. Collider biased SNP-outcome associations, $\hat{\alpha}_j^*$, versus SNP-exposure associations, $\hat{\beta}_{XG_j}$, for 240 SNPs that were genome-wide significant using 23andMe and UKB data.


Table 1. Point estimates, standard errors and p-values for the: TSLS, IVW, MR-Egger, LAD-regression and MR-RAPS methods.

Estimates reflect the average causal effect of a unit increase in the log-odds of insomnia on HbA1c levels across the population. ‘Standard’ = standard 1-sample analysis. Top rows: Analysis (a)—All 240 SNPs from Jansen et al used. Bottom rows: Analysis (b)—only genome wide significant SNPs from 23andMe data (ignoring UK Biobank) used.

Method Estimate S.E. p-value
Analysis (a): 23andMe + UK Biobank significant SNPs
# SNPs: 240, $\bar{F}$ = 8.36, Q (p-value) = 809 (< 2 × 10^-16), $I^2_{GX}$ = 41.0%
$\hat{\beta}^*$ 0.012 0.00057 < 2 × 10^-16
TSLS 0.016 0.002 < 1 × 10^-16
Standard IVW 0.013 0.003 5.04 × 10^-7
Standard MR-Egger 0.007 0.005 1.3 × 10^-1
Standard LAD 0.011 0.004 2.09 × 10^-3
Collider-Corrected IVW 0.022 0.0028 1.1 × 10^-15
Collider-Corrected IVW (SIMEX) 0.024 0.0031 3.1 × 10^-14
Collider-Corrected MR-Egger 0.015 0.0060 1.3 × 10^-2
Collider-Corrected MR-Egger (SIMEX) 0.017 0.0085 4.5 × 10^-2
Collider-Corrected LAD 0.020 0.0036 2.0 × 10^-8
Collider-Corrected LAD (SIMEX) 0.021 0.0024 < 2 × 10^-16
Collider-Corrected MR-RAPS 0.020 0.0026 3.1 × 10^-15
Analysis (b): 23andMe significant SNPs only
# SNPs: 112, $\bar{F}$ = 6.88, Q (p-value) = 385 (< 2 × 10^-16), $I^2_{GX}$ = 52.1%
$\hat{\beta}^*$ 0.012 0.00057 < 2 × 10^-16
TSLS 0.017 0.003 2.39 × 10^-10
Standard IVW 0.014 0.004 5.23 × 10^-4
Standard MR-Egger 0.008 0.006 1.76 × 10^-1
Standard LAD 0.012 0.006 3.30 × 10^-2
Collider-Corrected IVW 0.024 0.0045 1.2 × 10^-7
Collider-Corrected IVW (SIMEX) 0.026 0.0051 3.3 × 10^-7
Collider-Corrected MR-Egger 0.020 0.0083 1.8 × 10^-2
Collider-Corrected MR-Egger (SIMEX) 0.023 0.0110 4.5 × 10^-2
Collider-Corrected LAD 0.021 0.0056 1.5 × 10^-4
Collider-Corrected LAD (SIMEX) 0.024 0.0042 1.4 × 10^-8
Collider-Corrected MR-RAPS 0.023 0.0043 3.6 × 10^-8

Table 1 (bottom rows) and Fig 16 show the MR results for analysis (b) using only the 112 SNPs identified in Jansen from 23andMe data, which are immune to the dilution bias caused by winner’s curse. These SNPs have a weaker mean F statistic of 6.88 but a higher $I^2_{GX}$ statistic of 52%. All causal estimates are seen to increase when compared to analysis (a). This is because the winner’s curse which is present in (a) leads to an over-estimation of the SNP-exposure association (which forms the denominator of the standard ratio estimate for $\beta$) and thus an underestimation of the causal effect. Again, across all methods, we see consistent evidence that insomnia causally increases HbA1c.
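The direction of this bias follows from the form of the ratio estimate. As a sketch, writing $\hat{\Gamma}_j$ for the estimated association of SNP $j$ with the outcome (a symbol assumed here for illustration, as it is not defined in this section):

```latex
\hat{\beta}_j = \frac{\hat{\Gamma}_j}{\hat{\beta}_{XG_j}},
\qquad
|\hat{\beta}_{XG_j}| \text{ inflated by winner's curse}
\;\Longrightarrow\;
|\hat{\beta}_j| \text{ shrunk towards zero}.
```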

Fig 16. Collider biased SNP-outcome associations, $\hat{\alpha}_j^*$, versus SNP-exposure associations, $\hat{\beta}_{XG_j}$, for 112 SNPs that were genome-wide significant using 23andMe data only.


In total there were 14 outlier SNPs (13 SNPs in analysis (a) and 6 in analysis (b)), which were investigated using the GWAS Catalog (https://www.ebi.ac.uk/gwas/). A full list can be found in S1(B) Text and S1 Data. Most of these SNPs are only associated with insomnia, except rs10758593 (type 1 and type 2 diabetes), rs12917449 (type 2 diabetes), rs1861412 (BMI) and rs429358 (70+ traits). This provides some biological evidence for the existence of pleiotropy, which further underlines the utility of using robust methods that account for its presence.

Discussion

In this paper we clarify how the principle of Collider-Correction offers a vehicle for applying any two-sample summary data MR method to one sample data, making it easy to account for both pleiotropy and weak instrument bias. Our method is closely related to the approach of Dudbridge et al [31] for genetic studies of disease progression, and primarily serves to emphasise that this procedure is in fact applicable to any MR analysis. We used our new method to provide important insights into the role of insomnia on glycated haemoglobin and, by extension, on incident diabetes.

A nice feature of our approach is that the Collider-Correction term $\beta - \beta^*$ will be large (and therefore the Collider-Corrected estimate will be clearly distinct from the observational association) precisely when there is strong confounding. Conversely, when there is weak confounding, or the confounding has been sufficiently adjusted for, $\beta - \beta^*$ will be zero and the Collider-Corrected estimate will equal the observational association. In this case, the observational association becomes a consistent and likely very efficient estimate of the true causal effect. Collider-Correction therefore naturally promotes the triangulation and synthesis of observational and MR estimates, which can estimate the true causal effect with distinct but complementary assumptions.

We showcased the Collider-Correction approach using four univariate MR approaches that estimate a single causal effect parameter. At the cutting-edge of MR methods research, new approaches are attempting to: estimate causal effects identified by different clusters of SNPs [32, 48, 49]; simultaneously estimate causal effects via multiple exposures [50, 51], or quantify non-linear effects of an exposure [52]. The Collider-Correction algorithm can in principle be adapted to fit all of these multi-parameter approaches and this is an important topic of future research.

The insomnia data were affected by a small amount of winner’s curse, which we removed by design in a sensitivity analysis by restricting our SNP set to those obtained from a purely independent data source. More sophisticated approaches to adjusting for winner’s curse are possible by incorporating the original discovery data. For example, Bowden and Dudbridge [4] describe the most statistically efficient way to combine SNP discovery and validation data from two non-overlapping GWAS studies and remove winner’s curse. As further work, we plan to extend this approach and combine it with Collider-Correction.

Often in MR analyses the outcome of interest is binary and a logistic regression model is used in place of the linear model to estimate the causal effect on the odds ratio scale. In this case, the interpretation of causal estimates from a resulting Collider-Correction analysis will be more nuanced for the following reason. Even if we replaced the assumed linear outcome model in Eq (2) with a logistic model, so that $\beta$ reflected the true causal log-odds ratio for a unit increase in the exposure experienced by each individual, the causal effect estimate (which is a population average) will be diluted by a factor that is proportional to the variance of the residual error in the model not explained by the genetically predicted exposure. This is due to the fact that the odds ratio is a non-collapsible measure [53]. Although this dilution is a very general phenomenon that affects all logistic regression based analyses, three obvious options are available to the applied researcher when implementing Collider-Correction in the binary outcome case. The first would be to simply accept the interpretation of the causal estimate as a population average effect. The second would be to attempt to better approximate the individual causal effect by additionally adjusting for the first stage residual (that is, the observed exposure minus its genetically predicted value) in the second stage logistic model. This is referred to as the Control Function or adjusted IV approach [54]. The third option would be to estimate the causal effect on a risk difference scale. Since the risk difference is a collapsible measure, individual and population average effects are the same. Risk differences can be estimated either by fitting a linear probability model or by extracting the risk difference contrast from the logistic model. This latter approach can be implemented using the margins package in R. A thorough investigation of the performance of Collider-Correction in the binary outcome setting is an interesting avenue for future research.
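Non-collapsibility can be seen in a small numerical example. The sketch below assumes two equally sized strata of an unmeasured variable, a binary exposure that is independent of stratum, and the same conditional log-odds ratio in both strata; the function names, intercepts and effect size are arbitrary illustrative choices.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


def marginal_vs_conditional(log_or=1.0, intercepts=(-2.0, 1.0)):
    """Compare conditional and marginal effects across equally sized strata."""
    risk_unexposed = [expit(a) for a in intercepts]
    risk_exposed = [expit(a + log_or) for a in intercepts]
    n = len(intercepts)
    # Population-average risks (exposure is independent of stratum).
    m0 = sum(risk_unexposed) / n
    m1 = sum(risk_exposed) / n
    conditional_or = math.exp(log_or)
    marginal_or = odds(m1) / odds(m0)
    mean_stratum_rd = sum(b - a for a, b in zip(risk_unexposed, risk_exposed)) / n
    marginal_rd = m1 - m0
    return conditional_or, marginal_or, mean_stratum_rd, marginal_rd


cond_or, marg_or, mean_rd, marg_rd = marginal_vs_conditional()
```

The marginal odds ratio is closer to one than the odds ratio shared by both strata, even though there is no confounding, whereas the marginal risk difference equals the average of the stratum-specific risk differences.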

Supporting information

S1 Text

A: A formal proof of the Collider-Correction formulae. B: A list of outlying SNPs detected in analysis (a) and analysis (b) of the data example.

(PDF)

S1 Data. Additional functional information on the outlying SNPs detected in analysis (a) and analysis (b) of the data example.

(ZIP)

S1 Code. R scripts for re-creating the simulation study results in the paper.

(R)

Data Availability

This research has been conducted using the UK Biobank Resource under application 6818. The individual level data underlying the results presented in the study are available to qualified researchers from UK Biobank (https://www.ukbiobank.ac.uk/). Summary statistics gleaned from the individual level data and used by our Collider-Correction algorithm in applied analyses (a) and (b) are provided in S1 Data, as well as R code to aid their extraction.

Funding Statement

D.A.L., J.L. and R.R. all work in a Unit that receives support from the University of Bristol and UK Medical Research Council (MC_UU_00011/6). C.B. is supported by a Wellcome Trust studentship (218495/Z/19/Z). J.L. is funded by a Diabetes UK project grant (17/0005700). D.A.L. is a National Institute for Health Research Senior Investigator (NF-0616-10102). F.D. is supported by the MRC (MR/S037055/1). J.B. is funded by an Expanding Excellence in England (E3) research award. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Davey Smith G, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology 2003; 32:1–22. doi: 10.1093/ije/dyg070
  • 2. Sheehan N, Didelez V, Burton P, Tobin M. Mendelian Randomisation and Causal Inference in Observational Epidemiology. PLOS Medicine 2008; 5:1–6. doi: 10.1371/journal.pmed.0050177
  • 3. Davey Smith G, Lawlor D, Harbord R, Timpson N, Day I, Ebrahim S. Clustered Environments and Randomized Genes: A Fundamental Distinction between Conventional and Genetic Epidemiology. PLOS Medicine 2007; 4:1–8.
  • 4. Bowden J, Dudbridge F. Unbiased estimation of odds ratios: combining genomewide association scans with replication studies. Genetic Epidemiology 2009; 33:406–418. doi: 10.1002/gepi.20394
  • 5. Bowden J, Del Greco F, Minelli C, Davey Smith G, Sheehan N, Thompson J. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the $I^2$ statistic. IJE 2016; 45:1961–1974. doi: 10.1093/ije/dyw220
  • 6. Bowden J, Del Greco F, Minelli C, Zhao Q, Lawlor D, Sheehan N, Thompson J, Davey Smith G. Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. IJE 2018; 48:728–742.
  • 7. Hemani G, Bowden J, Davey Smith G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. HMG 2018; 27:R195–R208. doi: 10.1093/hmg/ddy163
  • 8. Kang H, Zhang A, Cai T, Small D. Instrumental Variables Estimation With Some Invalid Instruments and its Application to Mendelian Randomization. JASA 2016; 111:132–144. doi: 10.1080/01621459.2014.994705
  • 9. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. IJE 2015; 44:512–525. doi: 10.1093/ije/dyv080
  • 10. Bowden J, Davey Smith G, Haycock P, Burgess S. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genetic Epidemiology 2016; 40:304–314. doi: 10.1002/gepi.21965
  • 11. Bowden J, Hartwig F, Davey Smith G. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. IJE 2017; 46:1985–1998. doi: 10.1093/ije/dyx102
  • 12. Burgess S, Butterworth A, Thompson S. Mendelian Randomization Analysis With Multiple Genetic Variants Using Summarized Data. Genetic Epidemiology 2013; 37:658–665. doi: 10.1002/gepi.21758
  • 13. Bowden J, Del Greco F, Minelli C, Davey Smith G, Sheehan N, Thompson J. A framework for the investigation of pleiotropy in two-sample summary data Mendelian randomization. Statistics in Medicine 2017; 36:1783–1802. doi: 10.1002/sim.7221
  • 14. Hemani G, Zheng J, Elsworth B, Wade K, Haberland V, Baird D et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 2018; 7:e34408. doi: 10.7554/eLife.34408
  • 15. Bowden J, Spiller W, Del Greco F, Sheehan N, Thompson J, Minelli C, Davey Smith G. Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the Radial plot and Radial regression. IJE 2018; 47:1264–1278.
  • 16. Lawlor D, Wade K, Borges M, Palmer T, Hartwig F, Hemani G. A Mendelian Randomization Dictionary: Useful Definitions and Descriptions for Undertaking, Understanding and Interpreting Mendelian Randomization Studies. OSF Preprints 2019.
  • 17. Inoue A, Solon G. Two-sample Instrumental Variable Estimators. The Review of Economics and Statistics 2010; 92:557–561. doi: 10.1162/REST_a_00011
  • 18. Hyslop R, Imbens G. Bias from Classical and Other Forms of Measurement Error. Journal of Business & Economic Statistics 2001; 19:475–481. doi: 10.1198/07350010152596727
  • 19. Cook J, Stefanski L. Simulation-Extrapolation Estimation in Parametric Measurement Error Models. JASA 1994; 89:1314–1328. doi: 10.1080/01621459.1994.10476871
  • 20. Hardin J, Schmiediche H, Carroll R. The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error. The Stata Journal 2003; 3:373–385. doi: 10.1177/1536867X0300300406
  • 21. Zhao Q, Wang J, Hemani G, Bowden J, Small D. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. The Annals of Statistics 2020; 48:1742–1769. doi: 10.1214/19-AOS1866
  • 22. Zhao Q, Wang J, Spiller W, Bowden J, Small D. Two-Sample Instrumental Variable Analyses Using Heterogeneous Samples. Statistical Science 2019; 34:317–333. doi: 10.1214/18-STS692
  • 23. The CARDIoGRAMplusC4D Consortium. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nature Genetics 2015; 47:1121–1130. doi: 10.1038/ng.3396
  • 24. Leblanc M, Zuber V, Thompson W, Andreassen O, Frigessi A, Andreassen B et al. A correction for sample overlap in genome-wide association studies in a polygenic pleiotropy-informed framework. BMC Genomics 2018; 19:1471–2164. doi: 10.1186/s12864-018-4859-7
  • 25. Hartwig F, Davies N, Hemani G, Davey Smith G. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. IJE 2016; 45:1717–1726. doi: 10.1093/ije/dyx028
  • 26. Hartwig F, Tilling K, Davey Smith G, Lawlor D, Borges M. Bias in two-sample Mendelian randomization by using covariable-adjusted summary associations. IJE 2021; In Press.
  • 27. Lawlor D. Commentary: Two-sample Mendelian randomization: opportunities and challenges. IJE 2016; 45:908–915. doi: 10.1093/ije/dyw127
  • 28. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J et al. UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age. PLOS Medicine 2015; 12:1–10. doi: 10.1371/journal.pmed.1001779
  • 29. Minelli C, Del Greco F, van der Plaat D, Bowden J, Sheehan N, Thompson J. The use of two-sample methods for Mendelian randomization analyses on single large datasets. IJE 2021; In Press.
  • 30. Hartwig F, Davies N. Why internal weights should be avoided (not only) in MR-Egger regression. IJE 2016; 45:1676–1678.
  • 31. Dudbridge F, Allen R, Sheehan N, Schmidt F, Lee J, Jenkins R et al. Adjustment for index event bias in genome-wide association studies of subsequent events. Nature Communications 2019; 10:1561. doi: 10.1038/s41467-019-09381-w
  • 32. Shapland C, Zhao Q, Bowden J. Profile-likelihood Bayesian model averaging for two-sample summary data Mendelian randomization in the presence of horizontal pleiotropy. Biorxiv 2020.
  • 33. Williamson E, Walker A, Bhaskaran K, Bacon S, Bates C, Morton C et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature 2020; 584:430–436. doi: 10.1038/s41586-020-2521-4
  • 34.Burgess S and Bowden J Integrating summarized data from multiple genetic variants in Mendelian randomization: bias and coverage properties of inverse-variance weighted methods arXiv 2015; 1512.04486 [Google Scholar]
  • 35.Pearl J. Causal inference in statistics: An overview Statistics Surveys 2009; 3: 96–146 doi: 10.1214/09-SS057 [DOI] [Google Scholar]
  • 36.Munafo M, Tilling K, Taylor A, Evans D, Davey Smith G. Collider scope: when selection bias can substantially influence observed associations IJE 2017; 47: 226–235 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Verbank M, Chen C, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases Nature Genetics 2018; 50: 693–698 doi: 10.1038/s41588-018-0099-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Bowden J, Hemani G, Davey Smith G, Invited Commentary: Detecting Individual and Global Horizontal Pleiotropy in Mendelian Randomization—A Job for the Humble Heterogeneity Statistic? American Journal of Epidemiology 2018; 187: 2681–2685 doi: 10.1093/aje/kwy185 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Giloni A, Padberg M The Finite Sample Breakdown Point of L-1 Regression SIAM Journal on Optimization 2004; 14: 1028–1042 doi: 10.1137/S1052623403424156 [DOI] [Google Scholar]
  • 40.Spiegel K, Leproult R, Van Cauter E Impact of sleep debt on metabolic and endocrine function Lancet 1999; 354: 1435–1439 doi: 10.1016/S0140-6736(99)01376-8 [DOI] [PubMed] [Google Scholar]
  • 41.Stamatakis K, Punjabi N Effects of sleep fragmentation on glucose metabolism in normal subjects Chest 2010; 137: 95–101 doi: 10.1378/chest.09-0791 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Nedeltcheva A, Kessler L, Imperial J, Penev P. Exposure to Recurrent Sleep Restriction in the Setting of High Caloric Intake and Physical Inactivity Results in Increased Insulin Resistance and Reduced Glucose Tolerance The Journal of Clinical Endocrinology & Metabolism 2009; 94: 3242–3250 doi: 10.1210/jc.2009-0483 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Shan Z, Ma H, Xie M, Yan P, Guo Y, Bao W et al. Sleep Duration and Risk of Type 2 Diabetes: A Meta-analysis of Prospective Studies Diabetes Care; 38: 529–537 doi: 10.2337/dc14-2073 [DOI] [PubMed] [Google Scholar]
  • 44.Green M, Espie C, Popham F, Robertson T, Benzeval M Insomnia symptoms as a cause of type 2 diabetes Incidence: a 20?year cohort study BMC Psychiatry 2017; 17: 94.doi: 10.1186/s12888-017-1268-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Jansen P, Watanabe K, Stringer S, Skene N, Bryois J, Hammerschlag A et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways Nature Genetics 2019; 51: 394–403 doi: 10.1038/s41588-018-0333-3 [DOI] [PubMed] [Google Scholar]
  • 46.Guidance Diagnosis and Classification of Diabetes Mellitus Diabetes Care 2004; 27: s5–s10 doi: 10.2337/diacare.27.2007.S5 [DOI] [PubMed] [Google Scholar]
  • 47.Willer C, Li Y, Abecasis G. METAL: fast and efficient meta-analysis of genomewide association scans Bioinformatics 2010; 26: 2190–2191 doi: 10.1093/bioinformatics/btq340 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Qi G, Chatterjee N Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects Nature Communications 2019; 10: 1941.doi: 10.1038/s41467-019-09432-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Burgess S, Foley C, Allara E, Staley J, Howson J A robust and efficient method for Mendelian randomization with hundreds of genetic variants Nature Communications 2020; 11: 376.doi: 10.1038/s41467-019-14156-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Sanderson E, Davey Smith G, Windmeijer F, Bowden J An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings IJE 2018; 48: 713–727 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Wang J, Zhao Q, Bowden J, Hemani G, Davey Smith G, Small D et al. Causal Inference for Heritable Phenotypic Risk Factors Using Heterogeneous Genetic Instruments PLOS Genetics 2021, In Press doi: 10.1371/journal.pgen.1009575 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Staley J, Burgess S. Semiparametric methods for estimation of a nonlinear exposure-outcome relationship using instrumental variables with application to Mendelian randomization Genetic Epidemiology (2017); 41: 341–352 doi: 10.1002/gepi.22041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Vansteelandt S, Bowden J, Babanezhad M, Goetghebeur E. On Instrumental Variables Estimation of Causal Odds Ratios Statistical Science 2011; 26: 403–422 doi: 10.1214/11-STS360 [DOI] [Google Scholar]
  • 54.Palmer T, Thompson J, Tobin M, Sheehan N, Burton P Adjusting for bias and unmeasured confounding in Mendelian randomization studies with binary responses IJE 2008; 37: 1161–1168 [DOI] [PubMed] [Google Scholar]

Decision Letter 0

David Balding, Heather J Cordell

23 Mar 2021

Dear Dr Bowden,

Thank you very much for submitting your Methods entitled 'Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data' to PLOS Genetics. We apologise for the slow response, which was largely due to waiting for a late review, but we think the review was helpful and worth waiting for.

The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Heather J Cordell

Associate Editor

PLOS Genetics

David Balding

Section Editor: Methods

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The manuscript presents an elegant use of the collider correction idea of Dudbridge et al., but instead of using it for SNP-effect correction, it is applied to remove bias in one-sample causal effect estimation. While the MR field has focused tremendously on 2-sample MR methods due to data availability, one-sample MR is still only done by 2SLS. Since the bias correction is stated as a regression problem, analogously to most 2-sample MR methods, it can be combined with any such MR method; however, instead of estimating the causal effect, they now estimate the bias.
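The idea summarised in this comment (estimate the bias by regression, then remove it) can be illustrated with a small simulation. The following is a minimal sketch under strong simplifying assumptions, not the authors' implementation: the data-generating model, the parameter values, the function name `collider_correction_demo`, the absence of pleiotropy, and the use of an unweighted no-intercept regression are all illustrative choices.

```python
import numpy as np

def collider_correction_demo(n=20000, p=20, beta=0.3, kappa=1.0, seed=1):
    """Simulate one-sample data and apply a simplified collider correction.

    Hypothetical model (an assumption for illustration only):
        X = G @ gamma + U + noise
        Y = beta * X + kappa * U + noise   (no pleiotropy)
    """
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # SNP genotypes
    gamma = np.full(p, 0.2)                               # SNP-exposure effects
    U = rng.normal(size=n)                                # unmeasured confounder
    X = G @ gamma + U + rng.normal(size=n)
    Y = beta * X + kappa * U + rng.normal(size=n)

    ones = np.ones((n, 1))
    # SNP-exposure associations from the regression X ~ G.
    gamma_hat = np.linalg.lstsq(np.hstack([ones, G]), X, rcond=None)[0][1:]
    # Regression Y ~ X + G: conditioning on X induces collider bias
    # in the SNP coefficients, and the X coefficient is confounded.
    coef = np.linalg.lstsq(np.hstack([ones, X[:, None], G]), Y, rcond=None)[0]
    beta_obs, beta_star = coef[1], coef[2:]
    # No-intercept regression of the collider-biased SNP coefficients on
    # gamma_hat; in this model its slope estimates minus the confounding bias.
    slope = (gamma_hat @ beta_star) / (gamma_hat @ gamma_hat)
    return beta_obs, beta_obs + slope

beta_obs, beta_corrected = collider_correction_demo()
print(f"observational: {beta_obs:.3f}, corrected: {beta_corrected:.3f}")
```

In this toy model the confounded coefficient of X is close to 0.8 while the true effect is 0.3, and adding the estimated slope moves the estimate back towards 0.3. Any two-sample style estimator could in principle replace the simple slope formula in the last step.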

In all fairness, it has to be stated that the method still requires 2 samples, since one is needed to select the instruments and the method is applicable only when the instruments are known. The Winner’s curse bias of such instrument selection is not touched upon, which is acceptable, but needs to be stated upfront. The simulation results are convincing and the real data application shows a good example of how such collider correction improves causal effect estimation. Below I list a few comments, some of which may improve the manuscript.

Major comments

How different is the method compared to using the Dudbridge et al [29] method to correct the G-Y summary statistics for collider bias and then use classical 2-sample MR methods for the corrected G-Y and G-X summary stats?

The applicability of the method is rather limited: it requires a GWAS to be performed on a YadjX trait, hence its applicability to summary statistics is very low. This needs to be admitted in the Discussion.

The variance-bias tradeoff when adding the SIMEX correction for weak instrument bias could be explored further. Bias corrections that lead to increased RMSE are much less interesting in practice. Could the authors specify in what kind of settings the RMSE of the SIMEX corrected estimate would still decrease? E.g. in Fig 3 (bottom), what would the bias^2+SD^2 plots look like for these methods?
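The bias^2+SD^2 quantity requested here is the mean squared error of an estimator across simulation replicates. A minimal sketch of how it could be computed is given below; the helper name `bias_sd_rmse` and the example numbers are hypothetical and do not come from the manuscript.

```python
import numpy as np

def bias_sd_rmse(estimates, truth):
    """Summarise simulation replicates of an estimator.

    Returns (bias, sd, rmse). The SD uses ddof=0 so that the identity
    rmse**2 == bias**2 + sd**2 holds exactly.
    """
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - truth
    sd = est.std()  # population SD (ddof=0)
    rmse = np.sqrt(np.mean((est - truth) ** 2))
    return bias, sd, rmse

# Made-up replicate estimates around a true effect of 1.0.
bias, sd, rmse = bias_sd_rmse([0.9, 1.1, 1.3], truth=1.0)
print(bias, sd, rmse)
```

A bias correction is worthwhile on this scale only if the reduction in squared bias outweighs any increase in variance.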

It would be interesting to assess how much this approach may still suffer from Winner’s curse bias, which has been ignored as the (50) SNPs have been pre-selected. This is particularly important when the authors correct for the regression dilution bias of Eq (6): the mean F-statistic is much more biased in real data applications, when the SNPs are selected in the same data set. At each locus the top SNP is chosen, but for this SNP the F-statistic is overestimated and hence the regression-dilution bias is underestimated. It would be key for the authors to use 50 loci in their simulations (with realistic LD patterns at each locus) and to choose the top hit SNPs as instruments, as people would do for real data. I strongly suspect that the SIMEX approach (or any other method to mitigate this bias) would perform worse. Also, loci that do not reach genome-wide (GW) significance are not used, hence not all of the 50 SNPs should always be used as instruments. In real data examples, many hundreds of loci reach GW significance and such bias in the regression dilution estimation is far stronger. I’d invite the authors to include more loci and to use as instruments only those that survive some threshold, to reflect more realistic settings. I do not feel that extensive analysis of this phenomenon is needed, only some effort to show how serious this bias could be.
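The selection effect described in this comment can be illustrated at the summary-statistic level. The sketch below is a deliberately simplified assumption, not an analysis from the manuscript: candidate SNPs at a locus are treated as independent (no LD), share one true effect `gamma`, and have a known standard error `se`. The function name and all parameter values are hypothetical.

```python
import numpy as np

def mean_f_stats(n_loci=50, snps_per_locus=10, gamma=0.03, se=0.01, seed=0):
    """Compare mean F-statistics with and without in-sample top-hit selection."""
    rng = np.random.default_rng(seed)
    # Estimated SNP-exposure effects for every candidate SNP at every locus.
    gamma_hat = rng.normal(gamma, se, size=(n_loci, snps_per_locus))
    f = (gamma_hat / se) ** 2
    f_prespecified = f[:, 0].mean()   # one SNP per locus, fixed in advance
    f_top_hit = f.max(axis=1).mean()  # top SNP per locus, chosen in the same data
    return f_prespecified, f_top_hit

f_pre, f_top = mean_f_stats()
print(f"pre-specified SNPs: {f_pre:.1f}, top-hit SNPs: {f_top:.1f}")
```

Because the top hit is chosen for having the largest estimated effect, its F-statistic is inflated relative to a pre-specified SNP, so any dilution correction based on the mean F-statistic would be too small.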

In the real data application X is binary and logistic regression is used, while in the methods the models for X and Y assume linear models. How is this contradiction resolved?

It was not clear in the real data application which of the methods listed in Table 1 were applied to the artificially induced Y~X+G based beta_GY summary statistics and whether they have directly applied classical MR methods to simple Y~G vs X~G types of summary stats (which has sample overlap bias) or to TSLS, which would be the standard choice? This is also not very clear in the simulations: when they say “standard IVW” is it IVW of the Y~G/X~G or Y~X+G/X~G estimates? I guess/hope it is the former one.

Minor comments

1. Black curves are not visible in Figs 3A, 4A/C. I know that it is overlapping other curves, but the reader cannot know which ones (maybe use dashed lines).

2. In Eq (8), shouldn’t sigma^2 have a “hat” on it, since it is just an estimate for the variance of the estimator?

3. Why is the standard error of the collider corrected 1 sample IVW estimate in Fig 3D increasing with the sample size? It would be informative to add the “collider uncorrected 2 sample” MR estimates to Fig 3B (bottom left); would they be the same as the collider corrected 2 sample IVW estimates?

4. “(a) The standard IVW estimate (black line); (b) the SIMEX adjusted standard IVW estimate (blue line); (c) the collider corrected IVW estimate (red line); (d) the collider Corrected IVW estimate with SIMEX correction (green line); and (e) the TSLS estimate (orange line). We see that methods (a), (c) and (e) give essentially the same answer and can therefore not be individually distinguished in the figure.” – I’m not sure I get it: collider correction does nothing to IVW? How is that possible?

Reviewer #2: With pleasure I read this manuscript about using a correction for collider bias to apply two-sample summary data Mendelian randomization (MR) methods to one-sample individual level data. These MR methods are gaining a lot of traction and I believe the authors propose an idea that is likely to gain more traction, as the number of large datasets with individual data are becoming more and more available (think of UK Biobank, Biobank Japan and the Million Veterans program).

However, I feel like in the current form the manuscript is somewhat puzzling. In a somewhat arbitrary order, I have listed my points below:

1. I feel like the method proposed by the authors is not compared to the right models. Currently, they only show how their method compares to a regular IVW (with/without SIMEX)/TSLS method without correcting for collider bias. However, I feel like I miss a lot of methods here that would be more interesting to compare the method to, such as Robinson’s 1988 partially linear model, limited information maximum likelihood, and semi parametric methods such as generalized methods of moments (GMM) and structural mean models (SMM).

2. I think the current reporting of only the estimates is somewhat misleading, given that the standard deviations of the proposed method are much larger (as shown in the bottom right panel of Figure 3). I can imagine that in the current form, due to the large variance, just by chance this method can have a worse estimator compared to just doing a regular IVW. I think a measure that takes into account both bias and variance of the method such as mean squared prediction error (MSPE) (or some other metric as mean average prediction error (MAPE)) is more insightful.

3. I feel like the current empirical example is worrisome. The authors' results are very prone to weak instrument bias (illustrated by the low F-statistics of 8.36 and 6.88 as shown in Table 1) and should be interpreted with a lot more caution.

4. Also, I feel like the example given where there is overlap between discovery sample and estimation sample is a bad example of how an MR study should be done (due to winner’s curse) and hence it would be a better showcase to only report the example where there is no overlap.

5. The proposed method strongly hinges on the InSIDE assumption. I feel like a proper discussion of this assumption is missing.

6. A more thorough inspection of what SNPs are chosen as outliers (appendix C) would be interesting.

7. Elaborate more on the decision: `we propose to fit step 3 using Least-Absolute Deviation (LAD) regression instead of least squares.’ So that I understand why this decision is made.
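One common motivation for LAD regression is robustness to outlying points, which in this setting could correspond to pleiotropic SNPs. The sketch below contrasts the two fits for a single-slope, no-intercept regression; that model form, the toy data, and the function names are assumptions for illustration, and the manuscript's step 3 may differ. For this model the LAD slope is exactly a weighted median of the ratios y_i/x_i with weights |x_i|.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope for y = b * x (no intercept)."""
    return float(x @ y / (x @ x))

def lad_slope(x, y):
    """LAD slope for y = b * x (no intercept).

    sum_i |y_i - b * x_i| = sum_i |x_i| * |y_i / x_i - b|, so the minimiser
    is a weighted median of the ratios y_i / x_i with weights |x_i|.
    Assumes every x_i is non-zero.
    """
    ratios = y / x
    weights = np.abs(x)
    order = np.argsort(ratios)
    cum = np.cumsum(weights[order])
    # First sorted ratio at which the cumulative weight reaches half the total.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(ratios[order][idx])

# Toy data: true slope -0.5, with one outlying point at the largest x.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
y = -0.5 * x
y[4] += 0.5
print(ols_slope(x, y), lad_slope(x, y))
```

In this toy example the single outlier pulls the least-squares slope far from -0.5, while the LAD slope is unaffected.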

Minor remarks:

8. There is inconsistency in the mathematical equations: they are sometimes missing a comma, or a full stop to end the sentence.

9. Figures often do not contain a 0 on the Y axis. This is misleading, especially in the bottom right panel of Figure 4 and the top right panel of Figure 3.

Reviewer #3: see attached file

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Attachment

Submitted filename: review.docx

Decision Letter 1

David Balding, Heather J Cordell

20 Jun 2021

Dear Dr Bowden,

Thank you very much for submitting your Methods entitled 'Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the improvements made in your revised manuscript but identified some remaining concerns.

We therefore ask you to revise the manuscript in the light of the reviewer recommendations. You should address the specific points made by each reviewer, either in the manuscript or through an explanation in your covering letter. The editors have some concern that in trying to respond to previous reviewer comments, the manuscript has lost some readability and so we encourage you to review the manuscript for opportunities to improve clarity. We note again that it's not necessary to make a change suggested by a reviewer if you can give a good explanation why not. The manuscript is also rather long, with many figures; please consider whether some material can be moved to supplementary information.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Heather J Cordell

Associate Editor

PLOS Genetics

David Balding

Section Editor: Methods

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: I thank the authors for having addressed all my concerns. I have only a minor point left to be clarified: the case when the outcome (Y) is binary and the YadjX~G regression is done via logistic regression. I do not see how the derivation starting off from Eq (3) could be adapted: since some (non-linear) link function needs to be applied to the liability, Eq 5 would no longer hold in its current form, which assumes a simple linear relationship between (X, G) and Y. I do not see how monotonicity can resolve this issue.

Reviewer #2: I want to congratulate the authors on improving the manuscript.

There are still some remarks that I would like the authors to clarify:

1. I want to stress that the authors need to be clear about whether or not they require the InSIDE assumption. Currently, they state in the response to reviewers that they do not need it, but in the main manuscript on page 13 they still seem to use it ("under the assumption that the mean pleiotropic effect is zero and the InSIDE assumption is satisfied, the residual error independence property of Collider-Correction will mean that ..."). I think this is a very important point to make: what assumptions does the method rely on?

2. I would like to know under what scenarios with pleiotropy the method will be worse compared to a (standard) IVW approach (please relate this to equation (10)).

3. How prone is the method to weak-instrument bias? There are some hints to this throughout the manuscript, but it is unclear to me if the method is more or less susceptible to this.

Minor remark: figure reference is missing on page 13: " Figure (top-left) shows, for a range of sample sizes the average value across 1000 independent data sets of ... "

Reviewer #3: I thank the authors for their extensive response to my questions. I do not have further comments.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 2

David Balding, Heather J Cordell

8 Jul 2021

Dear Dr Bowden,

We are pleased to inform you that your manuscript entitled "Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Heather J Cordell

Associate Editor

PLOS Genetics

David Balding

Section Editor: Methods

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: I'd like to thank the authors for addressing my remaining point.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-20-01817R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

David Balding, Heather J Cordell

4 Aug 2021

PGENETICS-D-20-01817R2

Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data

Dear Dr Bowden,

We are pleased to inform you that your manuscript entitled "Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Melanie Wincott

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    Supplementary Materials

    S1 Text. A: A formal proof of the Collider-Correction formulae. B: A list of outlying SNPs detected in analysis (a) and analysis (b) of the data example.

    (PDF)

    S1 Data. Additional functional information on the outlying SNPs detected in analysis (a) and analysis (b) of the data example.

    (ZIP)

    S1 Code. R scripts for re-creating the simulation study results in the paper.

    (R)

    Attachment

    Submitted filename: review.docx

    Attachment

    Submitted filename: ReviewerResponse_Final.docx

    Attachment

    Submitted filename: ReviewerResponse_Round2.docx

    Data Availability Statement

    This research has been conducted using the UK Biobank Resource under application 6818. The individual level data underlying the results presented in the study are available to qualified researchers from UK Biobank (https://www.ukbiobank.ac.uk/). Summary statistics gleaned from the individual level data and used by our Collider-Correction algorithm in applied analyses (a) and (b) are provided in S1 Data, as well as R code to aid their extraction.


    Articles from PLoS Genetics are provided here courtesy of PLOS
