PLOS Genetics. 2025 Sep 11;21(9):e1011819. doi: 10.1371/journal.pgen.1011819

Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics

Ke Xu 1,2, Nathaniel Maydanchik 1, Bowei Kang 1, Jianhai Chen 1, Qixiang Chen 1, Gongyao Xu 1, Shinya Tasaki 3, David A Bennett 3, Lin S Chen 1,*
Editor: Xiaofeng Zhu 4
PMCID: PMC12440225  PMID: 40934263

Abstract

Interactions between risk factors and covariate-defined groups are commonly observed in complex diseases. Existing methods for detecting interactions typically require individual-level data. The data availability and the measurements of risk exposures and covariates often limit the power and applicability in assessing interactions. To address these limitations, we propose int2MR, an integrative Mendelian randomization (MR) method that leverages GWAS summary statistics on exposure traits and group-separated and/or combined GWAS statistics on outcome traits. The int2MR can assess a broad range of risk exposure effects on diseases and traits, revealing interactions unattainable with incomplete or limited individual-level data. Simulation studies demonstrate that int2MR effectively controls type I error rates under various settings while achieving considerable power gains with the integration of additional group-combined GWAS data. We applied int2MR to two data analyses. First, we identified risk exposures with sex-interaction effects on ADHD, and our results suggested potentially elevated inflammation in males. Second, we detected age-group-specific risk factors for Alzheimer’s disease pathologies in the oldest-old (age 95+); many of these factors were related to immune and inflammatory processes. Our findings suggest that reduced chronic inflammation may underlie the distinct pathological mechanisms observed in this age group. The int2MR is a robust and flexible tool for assessing group-specific or interaction effects, providing insights into disease mechanisms.

Author summary

Complex diseases often arise from interactions between genetic, environmental, and biological factors, leading to differential effects across population subgroups. However, existing methods for detecting such interactions typically require access to individual-level data, which is often limited or incomplete. To address this challenge, we propose int2MR, an integrative Mendelian Randomization (MR) method that leverages GWAS summary statistics from both exposure traits and outcome traits, including group-separated and combined GWAS data. int2MR enables the detection of exposure-by-group interaction effects even when individual-level data are unavailable. Through simulation studies, we demonstrate that int2MR effectively controls type I error rates while achieving substantial power gains when integrating group-combined GWAS data. We apply int2MR in two real-world analyses: (1) identifying risk exposures with sex-differentiated effects on ADHD and (2) detecting age-differentiated risk factors for Alzheimer’s disease. Our findings highlight the utility of int2MR in uncovering previously undetectable interactions, improving statistical power, and enhancing the interpretability of genetic association studies for complex diseases.

Introduction

Complex diseases often arise from a combination of genetic, environmental, and biological factors, resulting in varied effects of risk factors across different subgroups [1,2]. These group-specific risk effects, also known as exposure-by-group interaction effects, play a critical role in disease mechanisms [3,4]. For example, genetic variations may influence susceptibility to risk factors such as diet, pollution, or lifestyle within specific populations [5–11]. Biological differences, such as age and sex, may lead to different disease pathways across groups [12–21]. Additionally, social determinants of health, including access to healthcare and socioeconomic status, may shape exposure-disease relationships differently for various groups [22–24]. Understanding these exposure-by-group interaction effects is crucial for advancing research on disease mechanisms and precision medicine. By considering group-specific effects, precision medicine can better address the unique risks of diverse populations and improve health outcomes, especially for vulnerable populations.

To assess exposure-by-group interaction effects, a common approach is interaction analysis, which estimates both the effects of risk exposures and their interactions with groups or covariates on disease outcomes [25]. These analyses, though, are often constrained by small sample sizes and the limited availability of risk exposure and covariate measurements within individual studies. Moreover, unmeasured confounding variables can bias the results, complicating causal interpretations. Two-stage least squares (2SLS) methods have been applied to detect interactions while allowing for unmeasured confounding under certain assumptions [26]. However, their applicability is also restricted to studies with individual-level data, and the power is constrained by the strength of the instruments in the data. Mendelian randomization (MR) is a powerful tool to evaluate the causal effects of risk exposures on complex disease outcomes, treating genetic variants associated with exposures of interest as instrumental variables (IVs) [27–30]. Two-sample MR, which uses two sets of genome-wide association study (GWAS) summary statistics as input, has achieved many successes in assessing the causal effects of complex traits as exposures on various diseases as outcomes [31–42]. Most existing MR methods are designed to assess total effects, with limited focus on detecting exposure-by-group interaction effects. While some recent MR methods were proposed to detect interaction effects [3,4], they require individual-level data on exposure, outcome, and the covariate group variables. The growing availability of GWAS summary statistics highlights the need for two-sample MR methods to detect interactions across covariate-defined groups using summary data.

Here we propose an integrative MR method for detecting interaction effects (int2MR) between exposures and covariate groups on complex diseases, using only summary statistics as input. The int2MR method integrates group-specific and group-combined GWAS summary statistics from multiple studies and consortia, enhancing the power to detect group-specific effects of risk exposures. Through extensive simulations and method comparisons, we demonstrated the advantages of the proposed int2MR method for detecting interactions and main effects. We applied the proposed int2MR method to two data analyses. In the first analysis, by integrating sex-stratified and sex-combined Attention Deficit Hyperactivity Disorder (ADHD) GWAS summary statistics from the Psychiatric Genomics Consortium [43,44] and other major consortia, we boosted the power for identifying exposures with sex-interaction effects on ADHD. In the second analysis, motivated by the observation that AD pathology peaks around age 95 and then declines, we sought to identify risk factors with age-group-specific effects on AD pathology in the oldest-old (defined as death-age 95+) compared to the rest of the population. Using int2MR, we integrated age-group-stratified GWAS summary statistics from ROSMAP (Religious Orders Study and the Rush Memory and Aging Project) [45] with publicly available GWAS summary statistics on a wide range of risk exposures from major consortia. We identified multiple immune and inflammation-related exposures with age group-differential effects on AD pathology in the oldest-old, suggesting reduced inflammation and potentially distinct neuroinflammatory mechanisms in the oldest-old compared with the rest of the population. These analyses demonstrate int2MR’s flexibility in assessing exposure-by-group interactions across diverse traits and groups (e.g., socioeconomic or environmental factors) using GWAS summary statistics. 
int2MR enhances detection power for main and interaction effects by integrating group-stratified and group-combined GWAS data.

Description of the method

We propose the int2MR model to detect exposure-by-group interaction effects. We introduce two variations of the model, as illustrated in Fig 1. In this section, we will describe the input statistics and then present the model formulation and the estimation algorithms for both variations. Furthermore, the model is flexible and can be extended to integrate IV-to-outcome statistics from mixed groups with varying group compositions (see S1 Text for details).

Fig 1. Illustrations of the proposed int2MR framework for detecting interaction effects.


(a) The causal diagram of the int2MR model, using summary statistics from two group-specific IV-to-outcome GWASs as input. The model estimates the causal effect β for the reference group (Group 0) and β+βint for the comparison group (Group 1), capturing group-specific effects. (b) int2MR using two sets of group-specific IV-to-outcome GWAS statistics and one set of group-combined GWAS statistics for input. The parameter ρ represents the proportion of samples from the comparison group in the mixed-group study. The IV-to-exposure GWAS statistics in int2MR analyses are all from standard group-combined GWAS analyses.

int2MR: An integrative MR for detecting exposure-by-group interaction effects with group-separated GWAS summary statistics

As illustrated in Fig 1A, the first variation of int2MR takes the following summary statistics as input:

  • IV-to-exposure statistics: Standard GWAS summary statistics for the exposure trait.

  • Group-specific IV-to-outcome statistics: GWAS summary statistics for the outcome stratified by group (i.e., reference group and comparison group).

For the j-th SNP ($j=1,2,\dots,p$), let $\{\hat{\gamma}_j,\hat{s}_{\gamma,j}\}_{j=1}^{p}$ denote the observed marginal IV-to-exposure effect and its standard error obtained from a GWAS study for the exposure trait. Similarly, let $\{\hat{\Gamma}_{0,j},\hat{s}_{0,j}\}_{j=1}^{p}$ and $\{\hat{\Gamma}_{1,j},\hat{s}_{1,j}\}_{j=1}^{p}$ denote the observed marginal IV-to-outcome effects and their standard errors for the reference group (denoted by subscript 0) and the comparison group (denoted by subscript 1), respectively. The IV-to-outcome statistics for the two groups are obtained from group-stratified GWAS studies on the outcome trait. Suppose that $\{\gamma_j\}_{j=1}^{p}$ are the true IV-to-exposure effects, and $\{\Gamma_{0,j}\}_{j=1}^{p}$ and $\{\Gamma_{1,j}\}_{j=1}^{p}$ are the true IV-to-outcome effects for the reference and comparison groups, respectively. For the j-th SNP as an IV, we jointly model the IV-to-exposure effect and the IV-to-outcome effects in the reference and comparison groups as:

$$
\begin{pmatrix}\hat{\gamma}_j\\ \hat{\Gamma}_{0,j}\\ \hat{\Gamma}_{1,j}\end{pmatrix}
\sim\mathcal{N}\left(
\begin{pmatrix}\gamma_j\\ \Gamma_{0,j}\\ \Gamma_{1,j}\end{pmatrix},
\begin{pmatrix}\hat{s}_{\gamma,j}^{2} & & \\ & \hat{s}_{0,j}^{2} & \\ & & \hat{s}_{1,j}^{2}\end{pmatrix}
\right). \tag{1}
$$

Our framework assumes that IVs must satisfy the standard core assumptions: (1) Relevance: the IVs are associated with the exposure (i.e., the IV-to-exposure effects, γj, are nonzero and assumed homogeneous across groups), (2) Independence: the IVs are independent of any unmeasured confounders, and (3) Exclusion Restriction: the IVs affect the outcome only through their effects on the exposure. We relax this last assumption by allowing for uncorrelated pleiotropy. We assume balanced pleiotropy; that is, any residual direct effects of the IV on the outcome are assumed to average out. In our model, a “valid” IV is one that meets these conditions.

The current model assumptions indeed imply that the group-specific genetic effect on the outcome is fully mediated by the effect of the exposure on the outcome. There is no additional direct group-specific genetic effect on the outcome. In other words, our current model does not allow any additional group-specific genetic effect on the outcome beyond the causal effect β, and the differential effect across groups is entirely captured through the interaction effect βint.

The true IV-to-outcome effects in the two groups are:

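$$
\begin{aligned}
\text{Reference group (Group 0):}\quad & \Gamma_{0,j}=\beta\cdot\gamma_j+\alpha_{0,j},\\
\text{Comparison group (Group 1):}\quad & \Gamma_{1,j}=(\beta+\beta_{\mathrm{int}})\cdot\gamma_j+\alpha_{1,j}.
\end{aligned}
$$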

In our model, we assume relevance and independence of the IVs, while relaxing the exclusion restriction by allowing for uncorrelated pleiotropy [32,37]. Here, $\alpha_{0,j}\sim\mathcal{N}(0,\sigma_{\alpha,0}^{2})$ and $\alpha_{1,j}\sim\mathcal{N}(0,\sigma_{\alpha,1}^{2})$ represent uncorrelated pleiotropic effects in the reference group and comparison group, respectively. The main effect $\beta$ captures the causal effect in the reference group, while $\beta_{\mathrm{int}}$ represents the differential effect between the comparison and reference groups, i.e., the interaction effect. This framework allows for estimating and testing any linear combination of $\beta$ and $\beta_{\mathrm{int}}$. For example, we can separately test for non-zero group-specific causal effects, $\beta$ and $\beta+\beta_{\mathrm{int}}$, for the reference and comparison groups, respectively. Additionally, we can test for non-zero interaction effects, $\beta_{\mathrm{int}}$. Note that the off-diagonal elements in the covariance matrices of Eq (1) are non-zero if there are sample overlaps among different GWAS studies.
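As a rough illustration of how the two group-specific slopes carry information on the main and interaction effects, the sketch below fits a fixed-effect inverse-variance-weighted (IVW) slope in each group and takes their difference. This is not the Bayesian int2MR estimator: it ignores pleiotropy variance and the uncertainty in the estimated IV-to-exposure effects, and it assumes non-overlapping samples. The function names `ivw_slope` and `group_effects` and all numeric values are hypothetical.

```python
import numpy as np

def ivw_slope(gamma_hat, Gamma_hat, se_Gamma):
    """Inverse-variance-weighted slope of Gamma_hat on gamma_hat (no intercept)."""
    w = 1.0 / se_Gamma**2
    denom = np.sum(w * gamma_hat**2)
    slope = np.sum(w * gamma_hat * Gamma_hat) / denom
    se = np.sqrt(1.0 / denom)
    return slope, se

def group_effects(gamma_hat, Gamma0_hat, s0, Gamma1_hat, s1):
    """Simplified stand-in for the group-specific model (assumption: no sample overlap)."""
    b0, se0 = ivw_slope(gamma_hat, Gamma0_hat, s0)  # estimates beta (reference group)
    b1, se1 = ivw_slope(gamma_hat, Gamma1_hat, s1)  # estimates beta + beta_int (comparison group)
    return {
        "beta": b0,
        "beta_plus_int": b1,
        "beta_int": b1 - b0,
        "se_int": np.sqrt(se0**2 + se1**2),
    }

# Hypothetical noise-free example with beta = 0.3 and beta_int = 0.2
rng = np.random.default_rng(0)
gamma = rng.normal(0.0, 0.1, size=50)
s0 = np.full(50, 0.02)
s1 = np.full(50, 0.03)
est = group_effects(gamma, 0.3 * gamma, s0, 0.5 * gamma, s1)
```

With noise-free inputs the sketch returns the slopes exactly; with real summary statistics, the hierarchical model described in the text would be needed to handle pleiotropy and measurement error in the exposure effects.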

Modeling interaction effects in the group-combined GWAS: from individual-level data to GWAS summary statistics

In the previous section, we focus on the scenario in which group-specific IV-to-outcome statistics are available. In practice, however, a GWAS cohort often includes individuals from both groups, yielding only combined summary statistics. This situation therefore requires us to model interaction effects directly in the group-combined GWAS.

Let X represent the risk factor of interest and Y denote the outcome. At the individual level, we have the following structural equations.

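$$
X \mid G,S,U=\sum_{j=1}^{p}\gamma_j G_j+\beta_{UX}\cdot U+\varepsilon_X \tag{2}
$$

$$
Y \mid X,S,G,U=\beta X+\sum_{j=1}^{p}\alpha_j G_j+\beta_{UY}U+\beta_{SY}\cdot S+\beta_{\mathrm{int}}\cdot XS+\varepsilon_Y. \tag{3}
$$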

In Eqs 2 and 3, the vector $G=(G_1,G_2,\dots,G_p)$ contains $p$ genetic variants for an individual, and $\boldsymbol{\alpha}=(\alpha_1,\dots,\alpha_p)$ represents the uncorrelated pleiotropy effects. Here, $S$ denotes a binary group label such as sex, taking values in $\{0,1\}$, and $U$ is an unmeasured confounder. $\beta_{UX}$ and $\beta_{UY}$ represent the effects of the confounder $U$ on $X$ and $Y$, respectively, and $\beta_{SY}$ represents the direct effect of group $S$ on the outcome (capturing baseline group differences). The parameters of interest, $\beta$ and $\beta_{\mathrm{int}}$, capture the main causal effect of exposure $X$ on outcome $Y$ and the interaction effect between $X$ and group label $S$, respectively.

When only summary statistics are available, individual group labels $S$ are unobserved. Instead, let $\rho$ denote the proportion of the comparison group in the GWAS sample. For instance, if we aim to estimate the female-specific effect $\beta$ in a sex-combined GWAS, then $\rho$ is the proportion of males. We model $S\sim\mathrm{Bernoulli}(\rho)$ independently of $G$ and $U$, so that $\rho=\mathbb{E}[S]$. Here $\rho$ serves as a summary statistic for the unobserved $S$ and can be used to model interaction effects. Substituting Eq 2 into Eq 3 and marginalizing over $S$ yields (see S1 Text for details):

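$$
Y \mid G,U=c_0+\sum_{j=1}^{p}\left[(\beta+\beta_{\mathrm{int}}\cdot\rho)\gamma_j+\alpha_j\right]\cdot G_j+c_1\cdot U+\varepsilon_Y, \tag{4}
$$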

where $c_0$ is a constant depending on the group indicator $S$ and $c_1$ is a constant representing the strength of the confounding effect. Both $c_0$ and $c_1$ represent non-genetic contributions.
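The marginalization step can be sketched as follows. This is a reconstruction rather than the derivation given in S1 Text, and it additionally assumes that $\varepsilon_X$ has mean zero and is independent of $S$, so that the $S$-dependent terms of Eq 3 can be replaced by their conditional expectations given $G$ and $U$.

```latex
\begin{aligned}
\mathbb{E}[S \mid G,U] &= \rho,\\
\mathbb{E}[XS \mid G,U] &= \rho\Big(\sum_{j=1}^{p}\gamma_j G_j+\beta_{UX}U\Big),\\
\mathbb{E}[Y \mid G,U] &= \beta_{SY}\,\rho
  +\sum_{j=1}^{p}\big[(\beta+\beta_{\mathrm{int}}\rho)\gamma_j+\alpha_j\big]G_j
  +\big[(\beta+\beta_{\mathrm{int}}\rho)\beta_{UX}+\beta_{UY}\big]U .
\end{aligned}
```

Under these assumptions the constants would be $c_0=\beta_{SY}\rho$ and $c_1=(\beta+\beta_{\mathrm{int}}\rho)\beta_{UX}+\beta_{UY}$, and the coefficient of each $G_j$ is the only place where $\beta_{\mathrm{int}}$ interacts with the genetic instruments.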

Here, we denote the true IV-to-outcome effect $\Gamma_j$ for the j-th SNP, which is defined as the coefficient of $G_j$ in the model Eq 4, i.e.,

$$
\Gamma_j=(\beta+\beta_{\mathrm{int}}\cdot\rho)\gamma_j+\alpha_j \tag{5}
$$

for any $j=1,2,\dots,p$, where $\gamma_j$ is the j-th IV-to-exposure effect. We denote the estimated IV-to-outcome and IV-to-exposure effects as $\hat{\Gamma}_j$ and $\hat{\gamma}_j$, respectively.

Crucially, each SNP $G_j$ has coefficient $(\beta+\beta_{\mathrm{int}}\cdot\rho)\gamma_j+\alpha_j$. Thus, GWAS summary statistics encode the interaction effect $\beta_{\mathrm{int}}$ only through the shift $\rho\cdot\beta_{\mathrm{int}}$ in the effective exposure coefficient. This observation motivates our flexible strategy to recover both $\beta$ and $\beta_{\mathrm{int}}$ by integrating information from combined or group-specific GWAS results.

Flexible int2MR allowing the integration of group-specific and group-combined GWAS statistics

The proposed int2MR model can integrate group-specific and group-combined GWAS statistics for the outcome disease/trait, enhancing the power and flexibility of the analysis. As illustrated in Fig 1B, the second variation of int2MR can utilize the following types of summary statistics as inputs:

  • IV-to-exposure statistics: Standard GWAS summary statistics for the exposure trait.

  • Group-specific IV-to-outcome statistics: GWAS summary statistics for the outcome stratified by group, which could be statistics specific to the reference group, the comparison group, or both.

  • Group-combined IV-to-outcome statistics: Standard GWAS summary statistics for the outcome trait derived from mixed-group samples, with a known proportion of group composition, ρ.

For the j-th SNP as an IV, we jointly model the IV-to-outcome effect for each input study and the IV-to-exposure effect as follows:

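$$
\begin{pmatrix}\hat{\gamma}_j\\ \hat{\Gamma}_{0,j}\\ \hat{\Gamma}_{1,j}\\ \hat{\Gamma}_{2,j}\end{pmatrix}
\sim\mathcal{N}\left(
\begin{pmatrix}\gamma_j\\ \Gamma_{0,j}\\ \Gamma_{1,j}\\ \Gamma_{2,j}\end{pmatrix},
\begin{pmatrix}\hat{s}_{\gamma,j}^{2} & & & \\ & \hat{s}_{0,j}^{2} & & \\ & & \hat{s}_{1,j}^{2} & \\ & & & \hat{s}_{2,j}^{2}\end{pmatrix}
\right). \tag{6}
$$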

Note that the covariance matrices above may have non-zero off-diagonal elements if the input GWAS datasets have overlapping samples. These details were omitted here for clarity in presenting the main model.

The true IV-to-outcome effect in each study with varying proportions of comparison group samples is specified as:

$$
\Gamma_{k,j}=(\beta+\beta_{\mathrm{int}}\cdot\rho_k)\gamma_j+\alpha_{k,j}\quad\text{for } k\in\{0,1,2\},
$$

where $\rho_k$ represents the proportion of comparison group samples in the k-th study. For the study with only reference group samples ($k=0$), $\rho_0=0$; for the study with only comparison group samples ($k=1$), $\rho_1=1$; and for the study with group-combined samples ($k=2$), $\rho_2$ reflects the proportion of the comparison group samples. Here, $\alpha_{k,j}\sim\mathcal{N}(0,\sigma_{\alpha,k}^{2})$ is the uncorrelated pleiotropic effect for each study. The causal effect of the reference group is $\beta$ and the causal effect of the comparison group is $\beta+\beta_{\mathrm{int}}$. The interaction effect $\beta_{\mathrm{int}}$ captures the difference in causal effects comparing the comparison group versus the reference group.

To obtain parameter estimates and inference for the above models, we build a Bayesian hierarchical model. We implement a No-U-Turn sampler (NUTS) [46,89], a variant of the Hamiltonian Monte Carlo method [47], to generate posterior samples for inference. By the Bernstein–von Mises theorem, the posterior distribution asymptotically approaches a normal distribution. We estimate the standard errors of $\beta$ and $\beta_{\mathrm{int}}$ by inverting the observed Fisher information matrix, denoted as $\mathbf{I}(\hat{\Theta})$, where $\Theta=(\beta,\beta_{\mathrm{int}},\gamma,\alpha_0,\alpha_1,\alpha_2)$ represents the vector of all model parameters. The observed Fisher information matrix is computed from the Hessian of the negative posterior log-likelihood, yielding $\mathrm{SE}(\beta)=\sqrt{[\mathbf{I}(\hat{\Theta})^{-1}]_{1,1}}$ and $\mathrm{SE}(\beta_{\mathrm{int}})=\sqrt{[\mathbf{I}(\hat{\Theta})^{-1}]_{2,2}}$. Further algorithmic details are provided in the S1 Text.
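The Hessian-based standard errors can be illustrated with a deliberately reduced version of the model. The sketch below assumes flat priors, treats the IV-to-exposure effects as known, and keeps only the two parameters of interest, so it is a toy stand-in for the full parameter vector used in the paper. The helper names `neg_log_post` and `numerical_hessian` and all numeric values are hypothetical.

```python
import numpy as np

def neg_log_post(theta, gamma, Gamma_hat, se, rho):
    """Gaussian negative log-likelihood in (beta, beta_int) under flat priors.

    Gamma_hat and se have shape (K, p) for K outcome studies; rho has length K.
    gamma is treated as known here, which is a simplifying assumption.
    """
    beta, beta_int = theta
    mean = (beta + beta_int * rho[:, None]) * gamma[None, :]
    return 0.5 * np.sum(((Gamma_hat - mean) / se) ** 2)

def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at theta."""
    theta = np.asarray(theta, dtype=float)
    d = theta.size
    H = np.zeros((d, d))
    for a in range(d):
        for b in range(d):
            ea = np.zeros(d)
            eb = np.zeros(d)
            ea[a] = h
            eb[b] = h
            H[a, b] = (f(theta + ea + eb) - f(theta + ea - eb)
                       - f(theta - ea + eb) + f(theta - ea - eb)) / (4 * h * h)
    return H

# Hypothetical inputs: three outcome studies with rho = 0, 1, and 0.5
rng = np.random.default_rng(1)
p = 40
gamma = rng.normal(0.0, 0.1, p)
rho = np.array([0.0, 1.0, 0.5])
se = np.full((3, p), 0.02)
Gamma_hat = (0.3 + 0.2 * rho[:, None]) * gamma[None, :]

f = lambda th: neg_log_post(th, gamma, Gamma_hat, se, rho)
H = numerical_hessian(f, [0.3, 0.2])      # observed information at the mode
cov = np.linalg.inv(H)                    # asymptotic covariance
se_beta, se_beta_int = np.sqrt(np.diag(cov))
```

In this reduced model the Hessian has a closed form, which makes the finite-difference result easy to check; the full model would require differentiating with respect to all parameters.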

The model described by Eq (1) integrates group-separated GWAS statistics on outcome and the model described by Eq (6) integrates group-separated and combined GWAS statistics on outcome. Additionally, int2MR is generalized to integrate group-combined GWAS statistics on outcomes from two or more studies with varying group compositions. β and βint are identifiable as long as the ρ values differ among the IV-to-outcome GWAS samples (see S1 Text for more details). For instance, when analyzing causal effects of male and female groups, we can integrate GWAS statistics on outcome from the Million Veteran Program (MVP) [48,49], which consists of approximately 91.8% males, with another GWAS with an equal proportion of males and females. This flexibility allows int2MR to efficiently combine data from diverse sources, enhancing statistical power and enabling robust inference.
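The identifiability condition can be made concrete with a small sketch. If each outcome study k yields an effective slope of roughly β + βint·ρk, then β and βint follow from a weighted least-squares fit on the design matrix with columns (1, ρk), which is solvable only when the ρk values differ. The function name `solve_main_and_interaction` and the numeric inputs are hypothetical, and treating males as the comparison group in the 91.8% example is an assumption made for illustration.

```python
import numpy as np

def solve_main_and_interaction(rho, slopes, se):
    """Weighted least squares for slopes_k = beta + beta_int * rho_k."""
    rho = np.asarray(rho, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    se = np.asarray(se, dtype=float)
    if rho.max() == rho.min():
        raise ValueError("beta and beta_int are not identifiable when all rho_k are equal")
    X = np.column_stack([np.ones_like(rho), rho])   # columns: intercept (beta), rho (beta_int)
    W = np.diag(1.0 / se**2)
    cov = np.linalg.inv(X.T @ W @ X)
    estimate = cov @ X.T @ W @ slopes
    return estimate, np.sqrt(np.diag(cov))

# Two mixed-group studies: comparison-group proportions 0.918 and 0.5 (hypothetical slopes)
true_beta, true_beta_int = 0.3, 0.2
rho = [0.918, 0.5]
slopes = [true_beta + true_beta_int * r for r in rho]
estimate, std_err = solve_main_and_interaction(rho, slopes, [0.05, 0.05])
```

The closer the ρ values are to one another, the larger the resulting standard errors, which is one way to see why studies with very different group compositions are the most informative.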

Compared to existing MR methods for detecting interaction effects [3,4], a key innovation and major advantage of our model is that it requires only summary statistics. This approach provides unparalleled flexibility in assessing group-specific effects of risk factors on complex diseases, even when individual-level data on risk exposures are incomplete or unavailable. For example, in our data analysis, we evaluated age-group-specific risk factors for Alzheimer’s disease (AD), with a focus on the oldest-old (death-age 95+). Using the ROSMAP study, we obtained age-group-separated GWAS statistics on AD for the oldest-old and those for all samples in ROSMAP as group-specific and combined IV-to-outcome statistics, respectively. Publicly available GWAS statistics for various complex traits, sourced from other GWAS consortia, served as IV-to-exposure statistics. Notably, many of these risk exposures are not directly measured in the ROSMAP data. Yet, our method can assess their interaction effects and age-group-specific contributions to AD and pathologies.

Allowing correlated SNPs as IVs by accounting for linkage disequilibrium (LD)

We extend the int2MR model to allow moderately correlated SNPs as IVs by modeling the LD structure:

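$$
\hat{\gamma}\sim\mathcal{N}\left(\hat{S}_{\gamma}\hat{R}_{\gamma}\hat{S}_{\gamma}^{-1}\gamma,\ \hat{S}_{\gamma}\hat{R}_{\gamma}\hat{S}_{\gamma}\right),\qquad
\hat{\Gamma}_k\sim\mathcal{N}\left(\hat{S}_{\Gamma_k}\hat{R}_{\Gamma_k}\hat{S}_{\Gamma_k}^{-1}\Gamma_k,\ \hat{S}_{\Gamma_k}\hat{R}_{\Gamma_k}\hat{S}_{\Gamma_k}\right),\quad k=0,1,2,
$$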

where $\hat{\gamma}=(\hat{\gamma}_1,\hat{\gamma}_2,\dots,\hat{\gamma}_p)^{\top}$ and $\gamma=(\gamma_1,\gamma_2,\dots,\gamma_p)^{\top}$ are the vectors of estimated and true marginal IV-to-exposure effects, respectively. $\hat{\Gamma}_k=(\hat{\Gamma}_{k,1},\hat{\Gamma}_{k,2},\dots,\hat{\Gamma}_{k,p})^{\top}$ and $\Gamma_k=(\Gamma_{k,1},\Gamma_{k,2},\dots,\Gamma_{k,p})^{\top}$ are the vectors of estimated and true marginal IV-to-outcome effects in the k-th GWAS sample ($k\in\{0,1,2\}$). $\hat{S}_{\gamma},\hat{S}_{\Gamma_0},\hat{S}_{\Gamma_1}$ and $\hat{S}_{\Gamma_2}$ are the corresponding diagonal matrices of standard errors; and $\hat{R}_{\gamma},\hat{R}_{\Gamma_0},\hat{R}_{\Gamma_1}$ and $\hat{R}_{\Gamma_2}$ are the corresponding estimated correlation matrices among all selected IVs, which could be estimated using an independent reference panel. This framework allows for weak to moderate correlations among IVs by accounting for their correlation structures. However, int2MR may not be suitable when IVs are highly correlated.
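The LD-aware likelihood can be sketched numerically. The snippet below builds the mean and covariance of the estimated effects from a vector of true effects, their standard errors, and an LD correlation matrix, assuming the standard summary-statistics form in which the covariance is the symmetric product of the standard-error matrix, the correlation matrix, and the standard-error matrix again. The function name `ld_aware_moments` and all numeric values are hypothetical.

```python
import numpy as np

def ld_aware_moments(effects, se, R):
    """Mean and covariance of estimated marginal effects under an LD matrix R.

    effects: true marginal effects (length p)
    se:      standard errors of the estimates (length p)
    R:       p x p SNP correlation (LD) matrix, e.g. from a reference panel
    """
    S = np.diag(se)
    S_inv = np.diag(1.0 / se)
    mean = S @ R @ S_inv @ effects
    cov = S @ R @ S
    return mean, cov

# Hypothetical three-SNP example with weak to moderate LD
gamma = np.array([0.10, 0.05, -0.02])
se = np.array([0.01, 0.02, 0.015])
R = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.1, 1.0]])
mean, cov = ld_aware_moments(gamma, se, R)

# With independent SNPs (R = identity) the model reduces to the diagonal case
mean_ind, cov_ind = ld_aware_moments(gamma, se, np.eye(3))
```

Setting the correlation matrix to the identity recovers the independent-IV model of Eqs (1) and (6), which is a convenient sanity check when wiring in an estimated LD matrix.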

Verification and comparison

We evaluated the performance of our proposed summary-statistics-based MR method, int2MR, by comparing it with existing methods that utilize either individual-level data or summary statistics. We simulated individual-level genotype data for GWASs of both the exposure and the outcome traits. For the outcome GWAS, we simulated GWAS datasets with group-specific effects, and generated both group-specific and group-combined GWAS summary statistics. In each simulation, we generated 300 SNPs as IVs. For the group-specific GWAS, we simulated 1,000 individuals for the comparison group and 2,000 for the reference group, and each group has group-specific effects. We calculated group-combined GWAS summary statistics by combining data from both the reference group and the comparison group. The total sample size for the mixed group was varied to investigate its effects on method performance, in particular power. In the group-combined GWAS data, individuals from the reference and comparison groups were represented in equal proportions.

From the simulated datasets, we obtained the IV-to-exposure GWAS summary statistics, as well as group-separated and group-combined GWAS summary statistics for the outcome. These summary statistics were used as inputs for our int2MR analyses. For each comparison method, we selected the appropriate input data type, either individual-level data or summary statistics (see S1 Text for additional simulation details). We evaluated the type I error rates and power using a p-value threshold of 0.05.
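Both type I error and power are rejection rates at the 0.05 threshold, computed under the null and under the alternative respectively. A minimal sketch of that calculation, with a binomial Monte Carlo standard error, is shown below. The function name `rejection_rate` is hypothetical, and the 1,000 replicates in the example are an assumed value, not the number used in the paper.

```python
import numpy as np

def rejection_rate(p_values, alpha=0.05):
    """Fraction of replicates with p < alpha, with its Monte Carlo standard error."""
    p_values = np.asarray(p_values, dtype=float)
    rate = float(np.mean(p_values < alpha))
    mc_se = float(np.sqrt(rate * (1.0 - rate) / p_values.size))
    return rate, mc_se

# Hypothetical: p-values from 1,000 null replicates of a well-calibrated test
rng = np.random.default_rng(2)
null_p = rng.uniform(size=1000)
rate, mc_se = rejection_rate(null_p)

# Small deterministic example: two of four p-values fall below 0.05
demo_rate, demo_se = rejection_rate([0.01, 0.2, 0.04, 0.5])
```

The Monte Carlo standard error gives a sense of how much an empirical rate can deviate from the nominal 0.05 by chance alone for a given number of replicates.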

Type I error

Table 1A compares the type I error rates of int2MR using group-separated GWASs with int2MR+20k integrating an additional group-combined GWAS consisting of 20,000 mixed-group samples. Additionally, we compared with the 2SLS method for interactions [26] and standard interaction tests based on ordinary least squares (OLS) regression. We simulated various settings with and without horizontal pleiotropy and with varying strengths of confounding effects. In all settings, the proposed int2MR approach, using both group-separated and group-combined GWAS statistics for outcomes, controlled type I error rates. In contrast, the 2SLS method for interactions failed to control type I error rates in the presence of pleiotropy, and the OLS-based interaction test failed to control type I error rates in the presence of confounding. Table 1B compares type I error rates for testing non-zero total effects for int2MR and other existing MR methods under group-specific pleiotropy and confounding. For int2MR and int2MR+20k, the null hypothesis H0:β=0 was tested. For competing MR methods, IVW [31], MR-Egger [32], MR-Median [50], MR-RAPS [33], and MR-cML [36], we tested for total effects since these methods do not assess interactions. The proposed int2MR methods consistently controlled type I error rates, whereas some competing methods showed slight inflation in type I error rates under group-specific pleiotropy or confounding.

Table 1. Simulation results comparing the type I error rates of different methods in different settings: with and without horizontal pleiotropy effects (α) and in the presence of confounders (U).

| Method | Testing H0: β=0 | | | Testing H0: βint=0 | | |
|---|---|---|---|---|---|---|
| Setting | α=0, U0=0, U1=0.2 | α=0.02, U0=0, U1=0.2 | α=0.02, U0=0.1, U1=−0.2 | α=0, U0=0, U1=0.2 | α=0.02, U0=0, U1=0.2 | α=0.02, U0=0.1, U1=−0.2 |
| int2MR+20k | 0.048 | 0.060 | 0.066 | 0.048 | 0.062 | 0.066 |
| int2MR | 0.060 | 0.060 | 0.067 | 0.046 | 0.064 | 0.062 |
| 2SLS | 0.054 | 0.078 | 0.100 | 0.030 | 0.082 | 0.078 |
| OLS | 0.072 | 0.064 | 0.082 | 0.132 | 0.148 | 0.044 |
| Setting | α0=0, α1=0, U=0.2 | α0=0, α1=0.02, U=0.2 | α0=0.02, α1=0.02, U=0.2 | α0=0, α1=0, U=0.2 | α0=0, α1=0.02, U=0.2 | α0=0.02, α1=0.02, U=0.2 |
| int2MR+20k | 0.042 | 0.040 | 0.068 | 0.034 | 0.066 | 0.070 |
| int2MR | 0.042 | 0.040 | 0.062 | 0.046 | 0.064 | 0.072 |
| 2SLS | 0.072 | 0.066 | 0.082 | 0.042 | 0.080 | 0.088 |
| OLS | 0.260 | 0.258 | 0.210 | 0.098 | 0.104 | 0.130 |
(a) Type I error rate comparisons for testing interaction effects (βint), in the presence of (group-specific) confounding and pleiotropy.

For int2MR, we integrated group-separated GWASs for reference and comparison groups on outcome. In int2MR+N, the subscript “+N” represents the additional sample size from the group-combined GWAS. Among the competing methods, individual-level data-based approaches, such as 2SLS and OLS-based interaction tests, can test for interaction effects, while summary-statistics-based MR methods are limited to testing total effects.

Power comparison

Fig 2A compares the power of different methods for detecting interaction effects. While 2SLS and OLS require individual-level data, int2MR demonstrated comparable power to the OLS-based interaction test with 2,000 samples from the reference group and 1,000 from the comparison group. When additional group-combined GWAS statistics with larger sample sizes (10k, 20k, and 50k) were integrated, int2MR showed a substantial power improvement. Fig 2B compares the power for detecting both main and interaction effects. For 2SLS, OLS, and int2MR, we tested non-zero main and interaction effects, while for other summary-based MR methods, we tested total effects. int2MR improved power by jointly testing for main and interaction effects.

Fig 2. Simulation results comparing the power of int2MR with competing methods.


(a) Power comparison for detecting interaction effects. The method int2MR+N integrates group-separated GWAS data with a group-combined GWAS dataset, where the subscript N represents the sample size of the group-combined GWAS. Comparison methods are 2SLS and OLS-based interaction test, both requiring individual-level data. (b) Power comparison for detecting non-zero causal effects. We test for non-zero main or interaction effects for 2SLS, OLS-based interaction, and int2MR. For existing MR methods (colored green), we test for total effects.

In our int2MR method, integrating additional group-combined GWAS data substantially increases the sample size and thereby boosts the power, which becomes comparable to or even higher than that of OLS. OLS relies solely on the available group-specific data (2,000 samples in the reference group and 1,000 samples in the comparison group). Without integrating the additional group-combined GWAS dataset, OLS has higher power than int2MR in detecting both the main effect and the shared effect. In contrast, int2MR leverages the extra information provided by the group-combined data, which is not utilized by methods based on individual-level data.

In summary, the simulations showed that int2MR effectively controls type I error rates in the presence of pleiotropy and confounding. In addition, int2MR demonstrated strong power in detecting interaction effects and group-specific causal effects. The integration of group-combined and group-separated GWASs on outcomes further enhanced its power.

Applications

Data analysis: Identifying risk factors with sex-biased effects on ADHD

Attention-deficit/hyperactivity disorder (ADHD) is a sex-biased condition, with males significantly more likely to be diagnosed than females [51–53]. These observed sex differences suggest that certain risk factors may have varying causal effects depending on sex. Identifying and understanding these sex-specific effects and their mechanisms are crucial for improving diagnosis and developing targeted interventions for both sexes. In this analysis, we leverage existing GWAS summary statistics to systematically identify potential risk factors with sex-biased effects on ADHD.

We applied the proposed int2MR method to evaluate the interaction effects between risk exposures and sex groups on ADHD. The IV-to-exposure statistics were obtained from publicly available GWASs for 51 complex traits and diseases (see S1 Table) related to immunology, metabolism, gastrointestinal health, cardiovascular health, dermatological conditions, and brain function. The IV-to-outcome GWAS statistics on ADHD were obtained from the Psychiatric Genomics Consortium (PGC), including two sex-stratified ADHD GWAS datasets as well as a sex-combined dataset. The male-only GWAS included 32,102 individuals of European ancestry, while the female-only GWAS included 21,191 individuals of European ancestry [54]. Additionally, we used a larger sex-combined GWAS dataset consisting of 224,534 individuals, approximately 49.61% females [43]. We first applied int2MR using only sex-stratified ADHD GWAS statistics as the IV-to-outcome statistics. We then expanded the int2MR analysis by integrating sex-stratified GWAS statistics with a larger sex-combined GWAS to enhance the power for detecting interaction effects. For each exposure, SNPs significantly associated with the exposure (p-values < 5×10⁻⁸) were selected as IVs, followed by LD clumping with an r² threshold of 0.05. In this analysis, we applied LD clumping to select a set of relatively independent IVs without modeling their LD structures. The analysis was restricted to 35 exposures with at least 20 and at most 1,000 IVs after IV selection.
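The IV selection steps described above can be sketched as a significance filter followed by greedy clumping. This is a simplified illustration: practical LD clumping typically also restricts comparisons to a physical distance window and estimates r² from a reference panel, neither of which is modeled here. The function names `select_ivs` and `enough_ivs` and the toy inputs are hypothetical.

```python
import numpy as np

def select_ivs(p_values, r2, p_threshold=5e-8, r2_threshold=0.05):
    """Keep genome-wide significant SNPs, then greedily clump by pairwise r^2.

    p_values: exposure GWAS p-values (length p)
    r2:       p x p matrix of pairwise squared correlations between SNPs
    """
    p_values = np.asarray(p_values, dtype=float)
    # Visit significant SNPs from most to least significant
    candidates = [i for i in np.argsort(p_values) if p_values[i] < p_threshold]
    kept = []
    for i in candidates:
        # Keep a SNP only if it is in low LD with every SNP already kept
        if all(r2[i, k] <= r2_threshold for k in kept):
            kept.append(int(i))
    return kept

def enough_ivs(n_ivs, min_ivs=20, max_ivs=1000):
    """Exposure-level filter on the number of selected IVs."""
    return min_ivs <= n_ivs <= max_ivs

# Toy example: SNP 0 and SNP 3 are in strong LD; SNP 2 is not significant
p_demo = [1e-10, 1e-9, 1e-6, 1e-12]
r2_demo = np.array([[1.0, 0.0, 0.0, 0.5],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [0.5, 0.0, 0.0, 1.0]])
kept = select_ivs(p_demo, r2_demo)
```

In the toy example, the most significant SNP of the correlated pair is retained and its partner is dropped, while the non-significant SNP never enters the candidate list.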

At a false discovery rate (FDR) threshold of 0.1, we identified 15 traits with significant sex-biased effects on ADHD by integrating the large sex-combined GWAS. In comparison, int2MR using only sex-stratified GWASs identified 11 of these exposures. See Table 2 for the list of fifteen exposures with significant sex-interaction effects on ADHD. Overall, we observed more significant p-values for the exposures when integrating the sex-combined GWAS. The results demonstrate improved power in detecting interaction effects by integrating sex-combined GWAS summary statistics.
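The text does not state which FDR procedure was used. As one common choice, the sketch below applies the Benjamini–Hochberg adjustment to a vector of p-values and flags those at or below the 0.1 threshold. Using Benjamini–Hochberg here is an assumption, and the function name `benjamini_hochberg` and the toy p-values are hypothetical. Adjusted values depend on all tested exposures, so the values in Table 2 cannot be reproduced from the listed rows alone.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downward
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

# Toy example with four hypothetical p-values
p_demo = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(p_demo)
significant = adj <= 0.1
```

Because the adjustment scales with the number of tests, restricting the analysis to the 35 exposures that pass IV selection affects which exposures clear the threshold.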

Table 2. Results on fifteen risk exposures with significant sex-interaction effects on ADHD (FDR 10%) identified using int2MR.

The left panel shows the results based on int2MR using sex-stratified GWAS statistics only. The right panel shows the results based on integrating sex-stratified with sex-combined GWAS statistics.

| Exposure | #IVs | β̂int (sex-stratified GWAS only) | p-value | FDR adj. | β̂int (+ sex-combined GWAS) | p-value | FDR adj. |
|---|---|---|---|---|---|---|---|
| High Light Scatter Reticulocyte Count | 676 | 0.117 | 0.0021 | 0.0611 | 0.125 | 0.0013 | 0.0312 |
| Reticulocyte Count | 681 | 0.089 | 0.0147 | 0.0644 | 0.100 | 0.0036 | 0.0312 |
| Sum of Neutrophil and Eosinophil Counts | 412 | 0.113 | 0.0146 | 0.0644 | 0.124 | 0.0044 | 0.0312 |
| White Blood Cell Count | 503 | 0.101 | 0.0042 | 0.0611 | 0.110 | 0.0037 | 0.0312 |
| Hayfever or Eczema | 353 | -0.371 | 0.0095 | 0.0611 | -0.412 | 0.0045 | 0.0312 |
| Self-Reported Hypertension | 407 | -0.412 | 0.0086 | 0.0611 | -0.415 | 0.0065 | 0.0359 |
| Self-Reported Psoriasis | 190 | 0.210 | 0.0105 | 0.0611 | 0.221 | 0.0072 | 0.0359 |
| Granulocyte Count | 409 | 0.117 | 0.0100 | 0.0611 | 0.125 | 0.0110 | 0.0479 |
| Myeloid White Cell Count | 418 | 0.096 | 0.0257 | 0.0960 | 0.102 | 0.0143 | 0.0557 |
| Neutrophil Count | 418 | 0.104 | 0.0343 | 0.1000 | 0.117 | 0.0198 | 0.0629 |
| Eosinophil Count | 601 | 0.078 | 0.0296 | 0.0960 | 0.080 | 0.0198 | 0.0629 |
| Sum of Basophil and Neutrophil Counts | 415 | 0.108 | 0.0302 | 0.0960 | 0.112 | 0.0231 | 0.0674 |
| Lymphocyte Count | 556 | 0.0783 | 0.0374 | 0.1010 | 0.0817 | 0.0275 | 0.0742 |
| Sum of Eosinophil and Basophil Counts | 560 | 0.0754 | 0.0601 | 0.1403 | -0.0800 | 0.0333 | 0.0787 |
| Inflammatory Bowel Disease | 155 | -0.0272 | 0.1997 | 0.3495 | -0.0389 | 0.0337 | 0.0787 |

Fig 3 highlights several immune-related traits with sex-biased effects on ADHD, including white blood cell count, reticulocyte count, and granulocyte count, all of which showed larger effects (in absolute value) in males compared to females. These findings are consistent with previous reports on immune-related risk factors for ADHD [55–57], suggesting that immune system activity may play a more significant role in ADHD development among males. Prior studies have also shown that males are more prone to increased inflammatory responses, potentially driven by hormonal influences such as testosterone, which promotes a pro-inflammatory state [58]. Additionally, hypertension has been identified as a common comorbidity in adults with ADHD [59]. We observed consistent effect estimates across the two int2MR analyses, with or without the integration of sex-combined GWAS data. By leveraging existing GWAS summary statistics for risk exposure traits and outcome traits from multiple data sources and studies, int2MR enables an efficient evaluation of interaction effects across a wide range of exposures. This is particularly valuable for assessing interactions of risk exposures that are not measured in the studies of outcome traits, making int2MR a powerful and versatile tool for addressing research questions that would be infeasible to study using interaction analysis methods requiring individual-level data.
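To make concrete how an exposure-by-group interaction can be read off summary statistics alone, the sketch below compares two group-specific inverse-variance-weighted (IVW) MR estimates with a z-test. This is a deliberately simple stand-in and not the int2MR model, which is a Bayesian hierarchical model fitted with rstan. All function and argument names are hypothetical, the two groups are assumed to be non-overlapping samples, and uncertainty in the IV-to-exposure estimates is ignored.

```python
import numpy as np
from math import erfc, sqrt


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate of the exposure effect and its standard error."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    se = np.asarray(se_out, dtype=float)
    precision = float(np.sum(bx ** 2 / se ** 2))
    estimate = float(np.sum(bx * by / se ** 2)) / precision
    return estimate, 1.0 / sqrt(precision)


def interaction_z_test(beta_exp, out_g0, se_g0, out_g1, se_g1):
    """Difference between group 1 and group 0 IVW estimates with a two-sided p-value."""
    b0, s0 = ivw(beta_exp, out_g0, se_g0)
    b1, s1 = ivw(beta_exp, out_g1, se_g1)
    diff = b1 - b0
    se_diff = sqrt(s0 ** 2 + s1 ** 2)   # valid only for independent group samples
    z = diff / se_diff
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return diff, se_diff, p_value
```

A difference-of-estimates test like this uses only the group-stratified statistics, so it cannot borrow information from a larger group-combined GWAS, which is the gap the integrative analysis is designed to fill.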

Fig 3. A forest plot of causal effects for risk factors with significant sex-biased effects, stratified by sex (female vs. male).


Many of these risk factors are immune-related traits. Blue points represent male-specific effects, while red points represent female-specific effects.

Data analysis: Identifying age-group-specific risk factors for Alzheimer’s disease in the oldest-old

AD and related dementias (ADRD) present a significant public health challenge, particularly as life expectancy continues to rise. Age is the primary risk factor for ADRD. A recent analysis of data from the Religious Orders Study and Rush Memory and Aging Project (ROSMAP) [45,60,61], involving 1,420 autopsied individuals, found that the probability of Alzheimer’s dementia and cognitive impairment increases with age. Interestingly, a nonlinear relationship was observed for AD pathology, which peaks around age 95 and then slightly declines [62]. This pattern was evident in several AD pathology measures, including global AD pathology burden (gpath), amyloid, and PHF tau tangle density (tangles). In contrast, non-AD pathologies, except for TDP-43, continue to increase beyond age 95 in severity. Survival bias is unlikely, as the nonlinear trajectory with age is observed only for AD pathologies, while non-AD pathologies increase linearly with age. These findings highlight the need to understand the unique biology and mechanisms of neuropathologies in the oldest-old (death-age 95+) to develop effective prevention and treatment strategies for this rapidly growing age group.

To identify risk factors with age-group-specific effects on AD and pathologies in the oldest-old compared to the rest, we applied the int2MR method. We examined the same 51 complex traits and diseases as exposures that were used in our previous analysis on ADHD. We obtained the GWAS summary statistics for clinically diagnosed AD, three AD pathologies (gpath, amyloid, tangles), and three non-AD pathologies including TDP-43 (tdp_st4_binary) [63–67], hippocampal sclerosis (hspath_typ) [67–69], and cortical Lewy body (dlbany) [70–72], from the ROSMAP study. The ROSMAP GWAS data included 2,587 individuals, 408 of whom were 95 years or older at the time of death. We also obtained the GWAS summary statistics for only the oldest-old (N=408). For each exposure, SNPs with a p-value < 5×10⁻⁸ were selected as IVs, followed by LD clumping with an r² threshold of 0.05. The risk exposures and their age-group interaction effect estimates on three AD pathologies obtained in the int2MR analysis are presented in Table 3.
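Because the full ROSMAP sample (N=2,587) contains the oldest-old subgroup (N=408), the two sets of IV-to-outcome estimates are not independent of each other. One rough way to see what the pair implies about the remaining individuals is to treat the full-sample effect as a sample-size-weighted average of the two group effects. That weighting is an assumption made purely for illustration (it ignores group differences in allele frequency and residual variance) and is not a description of how int2MR models the data; `implied_complement_effect` is a hypothetical helper.

```python
def implied_complement_effect(beta_all, beta_sub, n_all, n_sub):
    """Back out the effect in the complement group from full-sample and subgroup effects.

    Assumes beta_all = w * beta_sub + (1 - w) * beta_rest with w = n_sub / n_all.
    """
    w = n_sub / n_all
    if not 0.0 < w < 1.0:
        raise ValueError("the subgroup must be a strict, non-empty subset of the sample")
    return (beta_all - w * beta_sub) / (1.0 - w)


# Share of the ROSMAP sample aged 95+ at death, using the sample sizes in the text.
OLDEST_OLD_WEIGHT = 408 / 2587
```

With a weight of roughly 0.16, the full-sample statistics are dominated by individuals who died before age 95, so the subgroup-only statistics are what carry most of the information about the oldest-old.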

Table 3. Risk exposures and their age-group interaction effect estimates on three AD pathologies obtained in the int2MR analysis.

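| Pathology | Exposure | #IVs | β^int | p-value | FDR adj. |
|---|---|---|---|---|---|
| Amyloid Pathology | Hair or Balding Pattern 4 | 710 | -0.5482 | 6.23×10⁻⁵ | 0.00336 |
| Amyloid Pathology | Inflammatory Bowel Disease | 243 | -0.2078 | 2.73×10⁻⁴ | 0.00736 |
| Global AD Pathology Burden | White Blood Cell Count | 833 | -0.2355 | 2.35×10⁻⁷ | 0.00001 |
| Global AD Pathology Burden | Systemic Lupus Erythematosus | 251 | 0.0630 | 6.77×10⁻⁶ | 0.00018 |
| Global AD Pathology Burden | Self-Reported Psoriasis | 315 | -0.6228 | 1.89×10⁻⁵ | 0.00034 |
| Global AD Pathology Burden | Lymphocyte Count | 915 | -0.1795 | 0.00020 | 0.00177 |
| Global AD Pathology Burden | Myeloid White Cell Count | 686 | -0.2135 | 0.00018 | 0.00177 |
| Global AD Pathology Burden | Neutrophil Count | 660 | -0.2128 | 0.00014 | 0.00177 |
| Global AD Pathology Burden | Sum of Neutrophil and Eosinophil Counts | 656 | -0.2056 | 0.00083 | 0.00637 |
| Global AD Pathology Burden | Granulocyte Count | 651 | -0.2110 | 0.00182 | 0.01101 |
| Global AD Pathology Burden | Fluid intelligence score | 174 | 0.2522 | 0.00183 | 0.01101 |
| Global AD Pathology Burden | Sum of Basophil and Neutrophil Counts | 659 | -0.2138 | 0.00283 | 0.01389 |
| Global AD Pathology Burden | Sum of Eosinophil and Basophil Counts | 934 | -0.1353 | 0.00598 | 0.02690 |
| Tangles Pathology | Chronotype | 292 | 1.3546 | 0.00020 | 0.00528 |
| Tangles Pathology | Self-Reported Psoriasis | 315 | -1.2458 | 0.00018 | 0.00528 |
| Tangles Pathology | Crohn’s Disease | 192 | 0.1629 | 0.00080 | 0.01432 |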

The three AD pathologies analyzed are Amyloid Pathology (amyloid), Global AD Pathology Burden (gpath), and Tangles Pathology (tangles). Significant exposure-by-age-group interactions were identified at a 5% FDR threshold.

Fig 4 presents a heatmap showing the significance of age-group-interaction effects of selected exposures (FDR < 0.05) on AD pathologies (bottom panel) versus non-AD pathologies (top panel). The heatmap showed that several inflammation- and immune-related traits and diseases have significant age-group-interaction effects comparing the oldest-old to the others (95+ vs. 95-). Traits such as lymphocyte and eosinophil counts showed stronger associations with AD pathologies in younger individuals (below 95). However, these associations weaken or even reverse in the oldest-old, suggesting diminished immune responses in the oldest-old. In contrast, these risk factors do not have significant age-group interaction effects on non-AD pathologies. While immune-related exposures are strongly associated with non-AD pathologies, their associations are relatively consistent across age groups (95+ vs. 95-).

Fig 4. A heatmap showing age-group interaction effects on AD pathologies (bottom panel) and non-AD pathologies (top panel), comparing the oldest-old to younger age groups (95+ vs. 95-).


Significant interaction effects were observed for several immune-related traits on AD pathologies. In contrast, for non-AD pathologies the effects of these risk exposures differed less between the two age groups, i.e., the age-group interactions were weaker.

These findings underscore distinct pathological mechanisms in the oldest-old compared to the rest. The role of neuroinflammation in AD progression has been well-documented, with inflammation and immune responses driving the accumulation of AD pathology [73–75]. However, the reduced association between immune traits and AD in the oldest-old suggests a decline in immune response and neuroinflammatory activity, potentially explaining the plateau in AD pathology accumulation in this age group. This may be due to an age-related decline in the immune system’s ability to mount an inflammatory response, along with the brain’s compensatory mechanisms [76–81]. In contrast, non-AD pathologies continue to increase with age [62,82], which may result from a continuous and escalating inflammatory response that persists or intensifies with aging [83–88].

The underlying risk factors and molecular mechanisms driving these patterns in the oldest-old remain largely unexplored. Understanding the biology of AD and related pathologies is critical for developing targeted strategies for prevention and treatment and for slowing disease progression. The proposed int2MR method demonstrates its advantage by leveraging age-group-specific GWAS statistics from ROSMAP, even with a limited sample size (N=408), and integrating them with publicly available GWAS data on diverse exposures. It enables the comprehensive evaluation of age-group interaction effects across a wide range of risk exposures, many of which are not directly measured in the ROSMAP dataset. Such evaluations would be infeasible with traditional interaction tests.

Discussion

In this study, we introduced int2MR, an integrative MR method for detecting exposure-by-group interaction effects by leveraging GWAS summary statistics. int2MR combines group-separated and group-combined GWAS statistics on outcomes with GWAS statistics on exposures. The ability to use only summary statistics provides considerable flexibility, enabling the evaluation of group-specific effects of risk factors on complex diseases when individual-level data are incomplete or unavailable. Our simulation studies demonstrated the power and robustness of int2MR. The method consistently controlled type I error rates in various settings. In terms of power, int2MR showed reasonable performance when using group-separated GWAS summary statistics, comparable to analyses based on individual-level data. When integrating additional group-combined GWAS data on outcomes, int2MR had substantial power improvements.

We applied int2MR to identify exposure-by-sex interaction effects for ADHD, a sex-biased disorder. The analysis revealed multiple risk factors suggesting increased inflammatory responses in males. These findings are consistent with prior studies linking inflammation and immune dysregulation to ADHD. Importantly, integrating the sex-combined GWAS data improved the power to detect interaction effects, showing that the method can make use of additional data sources. We further applied int2MR to identify age-group-specific risk factors for AD pathologies, with a focus on individuals aged 95 and older (the oldest-old). This analysis identified multiple inflammation- and immune-related risk factors, suggesting that chronic inflammation may be reduced in the oldest-old. This may reflect an age-related decline in the immune system’s ability to mount inflammatory responses, coupled with compensatory mechanisms in the brain. In both analyses, int2MR integrated group-separated GWAS statistics on outcomes with GWAS statistics on exposures from various sources. This capability is a major advantage over existing methods that require individual-level data, allowing for the broad assessment of interactions and group-specific effects of risk factors for complex diseases.

Despite its strengths, int2MR has several limitations that present opportunities for future research. First, the current MR model assumes that IVs are not associated with unmeasured confounders, an assumption relaxed by some recent MR methods for total effects. In particular, if the correlated pleiotropic effect differs among the GWAS samples, the resulting pattern may confound the interaction effect (β_int) with the pleiotropic effect. In these cases, it is difficult to disentangle whether the observed deviations in the IV-to-outcome associations are due to a true interaction or to systematic pleiotropic biases. Future work could explore model extensions in this direction. Second, int2MR currently supports only binary group variables. Expanding the framework to interactions with continuous or multi-category variables would significantly broaden its utility. Third, our current work focuses on linear models. In many cases, especially when effect sizes are modest, the linear model provides a reasonable first-order approximation for binary outcomes. We plan to incorporate binary outcomes and nonlinear relationships in future studies and to refine the robustness and applicability of our methods under these conditions. Another opportunity is the application of int2MR in transcriptome-wide MR analyses, enabling the detection of genetically regulated genes and molecular risk factors with interaction effects across various cellular contexts.

int2MR is a flexible, robust, and scalable tool for detecting exposure-by-group interaction effects. Its ability to integrate diverse GWAS summary statistics opens new avenues for understanding the complex interplay between risk factors and group-specific disease mechanisms.
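The remark above about linear models for binary outcomes can be checked numerically. The sketch below compares the exact change in risk under a logistic model with its first-order (linear) approximation, p0(1 − p0)·β, where p0 is a baseline risk and β a log-odds effect. The baseline risk and effect sizes are arbitrary illustrative values, not quantities taken from the paper, and the function names are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def risk_difference_exact(p0, log_or):
    """Exact change in risk when the log-odds shift by log_or from baseline risk p0."""
    baseline_logit = math.log(p0 / (1.0 - p0))
    return logistic(baseline_logit + log_or) - p0


def risk_difference_linear(p0, log_or):
    """First-order Taylor approximation of the same change in risk."""
    return p0 * (1.0 - p0) * log_or


def relative_error(p0, log_or):
    """Relative error of the linear approximation against the exact value."""
    exact = risk_difference_exact(p0, log_or)
    return abs(exact - risk_difference_linear(p0, log_or)) / abs(exact)
```

For a baseline risk of 0.05, the approximation is within a few percent for a log-odds effect of 0.1 but degrades markedly for an effect of 1.0, which is the sense in which the linear model is reasonable only when effects are modest.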

Supporting information

S1 Text

This document presents a comprehensive description of our methodological framework, including an extended discussion of the Bayesian hierarchical model and rigorous justification of all hyperparameter choices. It further details the simulation design and the procedures employed for generating summary statistics. S1 Text reports additional simulation results and expands upon the data-analysis findings presented in the main text.

(PDF)

pgen.1011819.s001.pdf (2.4MB, pdf)
S1 Table

List of the 51 genome-wide association study (GWAS) traits included in our analysis. For each trait, we report the phenotype name, total sample size, number of cases (where applicable), the contributing consortium or study, the citation for the primary GWAS publication (via PubMed), and the web portal used to access the full summary-statistic dataset.

(XLSX)

pgen.1011819.s002.xlsx (17.9KB, xlsx)
S1 Checklist

The STROBE-MR checklist of recommended items to address in reports of Mendelian randomization is included. This checklist is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/) and must be attributed to the STROBE Initiative. For more information, please see https://www.strobe-statement.org/.

(DOCX)

pgen.1011819.s003.docx (29.3KB, docx)

Acknowledgments

The authors would like to thank Dr. Guimin Gao for valuable discussions.

Data Availability

All the GWAS summary statistics of IV-to-exposure effects used in this paper are publicly available. Related links to the summary statistics can be found in S1 Table. The R implementation of our int2MR method is available at https://github.com/kxu-stat/int2MR. Our implementation of algorithms depends on rstan (available on https://CRAN.R-project.org/package=rstan). Additionally, the GWAS summary statistics for the IV-to-outcome effects derived from the Religious Orders Study and the Rush Memory and Aging Project (ROSMAP), along with all primary results underlying our analyses, have been deposited on Zenodo at https://doi.org/10.5281/zenodo.16341091.

Funding Statement

This work was supported by the National Institutes of Health (grant 1R01GM154421 to LSC, BK, JC, and QC; grant 1U01MH139345 to LSC, BK, JC, and QC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Ottman R. Gene-environment interaction: definitions and study designs. Nature Precedings. 2008; p. 1.
  • 2. Ottman R. An epidemiologic approach to gene-environment interaction. Genet Epidemiol. 1990;7(3):177–85. doi: 10.1002/gepi.1370070302
  • 3. Zhu X, Yang Y, Lorincz-Comi N, Li G, Bentley AR, de Vries PS, et al. An approach to identify gene-environment interactions and reveal new biological insight in complex traits. Nat Commun. 2024;15(1):3385. doi: 10.1038/s41467-024-47806-3
  • 4. Gorfine M, Qu C, Peters U, Hsu L. Unveiling challenges in Mendelian randomization for gene-environment interaction. Genet Epidemiol. 2024;48(4):164–89. doi: 10.1002/gepi.22552
  • 5. Olsson T, Barcellos LF, Alfredsson L. Interactions between genetic, lifestyle and environmental risk factors for multiple sclerosis. Nature Reviews Neurology. 2017;13(1):25–36.
  • 6. Perera FP. Environment and cancer: who are susceptible? Science. 1997;278(5340):1068–73. doi: 10.1126/science.278.5340.1068
  • 7. Abdul QA, Yu BP, Chung HY, Jung HA, Choi JS. Epigenetic modifications of gene expression by lifestyle and environment. Arch Pharm Res. 2017;40(11):1219–37. doi: 10.1007/s12272-017-0973-3
  • 8. Alegría-Torres JA, Baccarelli A, Bollati V. Epigenetics and lifestyle. Epigenomics. 2011;3(3):267–77.
  • 9. Franks PW, Mesa JL, Harding AH, Wareham NJ. Gene–lifestyle interaction on risk of type 2 diabetes. Nutrition, Metabolism and Cardiovascular Diseases. 2007;17(2):104–24.
  • 10. Dietrich S, Jacobs S, Zheng J-S, Meidtner K, Schwingshackl L, Schulze MB. Gene-lifestyle interaction on risk of type 2 diabetes: a systematic review. Obes Rev. 2019;20(11):1557–71. doi: 10.1111/obr.12921
  • 11. Kim MS, Shim I, Fahed AC, Do R, Park W-Y, Natarajan P, et al. Association of genetic risk, lifestyle, and their interaction with obesity and obesity-related morbidities. Cell Metab. 2024;36(7):1494-1503.e3. doi: 10.1016/j.cmet.2024.06.004
  • 12. Mauvais-Jarvis F, Bairey Merz N, Barnes PJ, Brinton RD, Carrero J-J, DeMeo DL, et al. Sex and gender: modifiers of health, disease, and medicine. Lancet. 2020;396(10250):565–82. doi: 10.1016/S0140-6736(20)31561-0
  • 13. Lau ES, Paniagua SM, Guseh JS, Bhambhani V, Zanni MV, Courchesne P, et al. Sex differences in circulating biomarkers of cardiovascular disease. J Am Coll Cardiol. 2019;74(12):1543–53. doi: 10.1016/j.jacc.2019.06.077
  • 14. Westergaard D, Moseley P, Sørup FKH, Baldi P, Brunak S. Population-wide analysis of differences in disease progression patterns in men and women. Nat Commun. 2019;10(1):666. doi: 10.1038/s41467-019-08475-9
  • 15. van Dongen J, Nivard MG, Willemsen G, Hottenga J-J, Helmer Q, Dolan CV, et al. Genetic and environmental influences interact with age and sex in shaping the human methylome. Nat Commun. 2016;7:11115. doi: 10.1038/ncomms11115
  • 16. Winkler TW, Justice AE, Graff M, Barata L, Feitosa MF, Chu S, et al. The influence of age and sex on genetic associations with adult body size and shape: a large-scale genome-wide interaction study. PLoS Genet. 2015;11(10):e1005378. doi: 10.1371/journal.pgen.1005378
  • 17. Liu C-T, Estrada K, Yerges-Armstrong LM, Amin N, Evangelou E, Li G, et al. Assessment of gene-by-sex interaction effect on bone mineral density. J Bone Miner Res. 2012;27(10):2051–64. doi: 10.1002/jbmr.1679
  • 18. Oliva M, Muñoz-Aguirre M, Kim-Hellmuth S, Wucher V, Gewirtz ADH, Cotter DJ, et al. The impact of sex on gene expression across human tissues. Science. 2020;369(6509):eaba3066. doi: 10.1126/science.aba3066
  • 19. Ober C, Loisel DA, Gilad Y. Sex-specific genetic architecture of human disease. Nat Rev Genet. 2008;9(12):911–22. doi: 10.1038/nrg2415
  • 20. Hughes T, Adler A, Merrill JT, Kelly JA, Kaufman KM, Williams A, et al. Analysis of autosomal genes reveals gene-sex interactions and higher total genetic risk in men with systemic lupus erythematosus. Ann Rheum Dis. 2012;71(5):694–9. doi: 10.1136/annrheumdis-2011-200385
  • 21. Leng R-X, Wang W, Cen H, Zhou M, Feng C-C, Zhu Y, et al. Gene-gene and gene-sex epistatic interactions of MiR146a, IRF5, IKZF1, ETS1 and IL21 in systemic lupus erythematosus. PLoS One. 2012;7(12):e51090. doi: 10.1371/journal.pone.0051090
  • 22. Marioni RE, Ritchie SJ, Joshi PK, Hagenaars SP, Okbay A, Fischer K. Proceedings of the National Academy of Sciences. 2016;113(47):13366–71.
  • 23. Fan Q, Verhoeven VJ, Wojciechowski R, Barathi VA, Hysi PG, Guggenheim JA. Meta-analysis of gene–environment-wide association scans accounting for education level identifies additional loci for refractive error. Nature Communications. 2016;7(1):11008. doi: 10.1038/ncomms11008
  • 24. Conti G, Heckman JJ. Understanding the early origins of the education-health gradient: a framework that can also be applied to analyze gene-environment interactions. Perspect Psychol Sci. 2010;5(5):585–605. doi: 10.1177/1745691610383502
  • 25. Aiken LS. Multiple regression: Testing and interpreting interactions. Sage. 1991.
  • 26. North T-L, Davies NM, Harrison S, Carter AR, Hemani G, Sanderson E, et al. Using genetic instruments to estimate interactions in mendelian randomization studies. Epidemiology. 2019;30(6):e33–5. doi: 10.1097/EDE.0000000000001096
  • 27. Chen LS, Emmert-Streib F, Storey JD. Harnessing naturally randomized transcription to infer regulatory relationships among genes. Genome Biol. 2007;8(10):R219. doi: 10.1186/gb-2007-8-10-r219
  • 28. Lawlor DA, Harbord RM, Sterne JAC, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–63. doi: 10.1002/sim.3034
  • 29. Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet. 2005;37(7):710–7. doi: 10.1038/ng1589
  • 30. Smith GD, Ebrahim S. “Mendelian randomization”: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1–22. doi: 10.1093/ije/dyg070
  • 31. Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–65. doi: 10.1002/gepi.21758
  • 32. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–25. doi: 10.1093/ije/dyv080
  • 33. Zhao Q, Wang J, Hemani G, Bowden J, Small DS. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann Statist. 2020;48(3). doi: 10.1214/19-aos1866
  • 34. Cheng Q, Zhang X, Chen LS, Liu J. Mendelian randomization accounting for complex correlated horizontal pleiotropy while elucidating shared genetic etiology. Nat Commun. 2022;13(1):6490. doi: 10.1038/s41467-022-34164-1
  • 35. Wang J, Zhao Q, Bowden J, Hemani G, Davey Smith G, Small DS, et al. Causal inference for heritable phenotypic risk factors using heterogeneous genetic instruments. PLoS Genet. 2021;17(6):e1009575. doi: 10.1371/journal.pgen.1009575
  • 36. Xue H, Shen X, Pan W. Constrained maximum likelihood-based Mendelian randomization robust to both correlated and uncorrelated pleiotropic effects. Am J Hum Genet. 2021;108(7):1251–69. doi: 10.1016/j.ajhg.2021.05.014
  • 37. Morrison J, Knoblauch N, Marcus JH, Stephens M, He X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet. 2020;52(7):740–7. doi: 10.1038/s41588-020-0631-4
  • 38. Cheng Q, Qiu T, Chai X, Sun B, Xia Y, Shi X, et al. MR-Corr2: a two-sample Mendelian randomization method that accounts for correlated horizontal pleiotropy using correlated instrumental variants. Bioinformatics. 2022;38(2):303–10. doi: 10.1093/bioinformatics/btab646
  • 39. Bucur IG, Claassen T, Heskes T. Inferring the direction of a causal link and estimating its effect via a Bayesian Mendelian randomization approach. Stat Methods Med Res. 2020;29(4):1081–111. doi: 10.1177/0962280219851817
  • 40. Zhao J, Ming J, Hu X, Chen G, Liu J, Yang C. Bayesian weighted Mendelian randomization for causal inference based on summary statistics. Bioinformatics. 2020;36(5):1501–8. doi: 10.1093/bioinformatics/btz749
  • 41. Grant AJ, Burgess S. A Bayesian approach to Mendelian randomization using summary statistics in the univariable and multivariable settings with correlated pleiotropy. Am J Hum Genet. 2024;111(1):165–80. doi: 10.1016/j.ajhg.2023.12.002
  • 42. Berzuini C, Guo H, Burgess S, Bernardinelli L. A Bayesian approach to Mendelian randomization with multiple pleiotropic variants. Biostatistics. 2020;21(1):86–101. doi: 10.1093/biostatistics/kxy027
  • 43. Demontis D, Walters GB, Athanasiadis G, Walters R, Therrien K, Nielsen TT, et al. Author Correction: genome-wide analyses of ADHD identify 27 risk loci, refine the genetic architecture and implicate several cognitive domains. Nat Genet. 2023;55(4):730. doi: 10.1038/s41588-023-01350-w
  • 44. Demontis D, Walters RK, Martin J, Mattheisen M, Als TD, Agerbo E, et al. Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder. Nat Genet. 2019;51(1):63–75. doi: 10.1038/s41588-018-0269-7
  • 45. Bennett DA, Buchman AS, Boyle PA, Barnes LL, Wilson RS, Schneider JA. Religious orders study and rush memory and aging project. J Alzheimers Dis. 2018;64(s1):S161–89. doi: 10.3233/JAD-179939
  • 46. Hoffman MD, Gelman A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J Mach Learn Res. 2014;15(1):1593–623.
  • 47. Betancourt M. A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434. 2017.
  • 48. Gaziano JM, Concato J, Brophy M, Fiore L, Pyarajan S, Breeling J, et al. Million veteran program: a mega-biobank to study genetic influences on health and disease. J Clin Epidemiol. 2016;70:214–23. doi: 10.1016/j.jclinepi.2015.09.016
  • 49. Verma A, Huffman JE, Rodriguez A, Conery M, Liu M, Ho Y-L, et al. Diversity and scale: Genetic architecture of 2068 traits in the VA Million Veteran Program. Science. 2024;385(6706):eadj1182. doi: 10.1126/science.adj1182
  • 50. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genetic epidemiology. 2016;40(4):304–14.
  • 51. Skogli EW, Teicher MH, Andersen PN, Hovik KT, Øie M. ADHD in girls and boys–gender differences in co-existing symptoms and executive function measures. BMC Psychiatry. 2013;13:298. doi: 10.1186/1471-244X-13-298
  • 52. Polanczyk G, de Lima MS, Horta BL, Biederman J, Rohde LA. The worldwide prevalence of ADHD: a systematic review and metaregression analysis. Am J Psychiatry. 2007;164(6):942–8. doi: 10.1176/ajp.2007.164.6.942
  • 53. Lahey BB, Applegate B, McBurnett K, Biederman J, Greenhill L, Hynd GW, et al. DSM-IV field trials for attention deficit hyperactivity disorder in children and adolescents. Am J Psychiatry. 1994;151(11):1673–85. doi: 10.1176/ajp.151.11.1673
  • 54. Martin J, Walters RK, Demontis D, Mattheisen M, Lee SH, Robinson E, et al. A genetic investigation of sex bias in the prevalence of attention-deficit/hyperactivity disorder. Biol Psychiatry. 2018;83(12):1044–53. doi: 10.1016/j.biopsych.2017.11.026
  • 55. Wang L-J, Yu Y-H, Fu M-L, Yeh W-T, Hsu J-L, Yang Y-H, et al. Attention deficit-hyperactivity disorder is associated with allergic symptoms and low levels of hemoglobin and serotonin. Sci Rep. 2018;8(1):10229. doi: 10.1038/s41598-018-28702-5
  • 56. Bale TL, Epperson CN. Sex differences and stress across the lifespan. Nat Neurosci. 2015;18(10):1413–20.
  • 57. Bilbo SD, Schwarz JM. The immune system and developmental programming of brain and behavior. Front Neuroendocrinol. 2012;33(3):267–86. doi: 10.1016/j.yfrne.2012.08.006
  • 58. Straub RH. The complex role of estrogens in inflammation. Endocr Rev. 2007;28(5):521–74. doi: 10.1210/er.2007-0001
  • 59. Chen Q, Hartman CA, Haavik J, Harro J, Klungsøyr K, Hegvik T-A, et al. Common psychiatric and metabolic comorbidity of adult attention-deficit/hyperactivity disorder: a population-based cross-sectional study. PLoS One. 2018;13(9):e0204516. doi: 10.1371/journal.pone.0204516
  • 60. Bennett DA, Schneider JA, Arvanitakis Z, Wilson RS. Overview and findings from the religious orders study. Curr Alzheimer Res. 2012;9(6):628–45. doi: 10.2174/156720512801322573
  • 61. Bennett DA, Schneider JA, Buchman AS, Barnes LL, Boyle PA, Wilson RS. Overview and findings from the rush Memory and Aging Project. Curr Alzheimer Res. 2012;9(6):646–63. doi: 10.2174/156720512801322663
  • 62. Farfel JM, Yu L, Boyle PA, Leurgans S, Shah RC, Schneider JA, et al. Alzheimer’s disease frequency peaks in the tenth decade and is lower afterwards. Acta Neuropathol Commun. 2019;7(1):104. doi: 10.1186/s40478-019-0752-0
  • 63. Meneses A, Koga S, O’Leary J, Dickson DW, Bu G, Zhao N. TDP-43 Pathology in Alzheimer’s Disease. Mol Neurodegener. 2021;16(1):84. doi: 10.1186/s13024-021-00503-x
  • 64. Josephs KA, Murray ME, Whitwell JL, Parisi JE, Petrucelli L, Jack CR, et al. Staging TDP-43 pathology in Alzheimer’s disease. Acta Neuropathol. 2014;127(3):441–50. doi: 10.1007/s00401-013-1211-9
  • 65. Josephs KA, Whitwell JL, Weigand SD, Murray ME, Tosakulwong N, Liesinger AM, et al. TDP-43 is a key player in the clinical features associated with Alzheimer’s disease. Acta Neuropathol. 2014;127(6):811–24. doi: 10.1007/s00401-014-1269-z
  • 66. McAleese KE, Walker L, Erskine D, Thomas AJ, McKeith IG, Attems J. TDP-43 pathology in Alzheimer’s disease, dementia with Lewy bodies and ageing. Brain Pathol. 2017;27(4):472–9. doi: 10.1111/bpa.12424
  • 67. Nag S, Yu L, Capuano AW, Wilson RS, Leurgans SE, Bennett DA, et al. Hippocampal sclerosis and TDP-43 pathology in aging and Alzheimer disease. Ann Neurol. 2015;77(6):942–52. doi: 10.1002/ana.24388
  • 68. Nelson PT, Schmitt FA, Lin Y, Abner EL, Jicha GA, Patel E, et al. Hippocampal sclerosis in advanced age: clinical and pathological features. Brain. 2011;134(Pt 5):1506–18. doi: 10.1093/brain/awr053
  • 69. Ala TA, Beh GO, Frey WH 2nd. Pure hippocampal sclerosis: a rare cause of dementia mimicking Alzheimer’s disease. Neurology. 2000;54(4):843–8. doi: 10.1212/wnl.54.4.843
  • 70. Mattila PM, Röyttä M, Torikka H, Dickson DW, Rinne JO. Cortical Lewy bodies and Alzheimer-type changes in patients with Parkinson’s disease. Acta Neuropathol. 1998;95(6):576–82. doi: 10.1007/s004010050843
  • 71.Kotzbauer PT, Trojanowsk JQ, Lee VM. Lewy body pathology in Alzheimer’s disease. J Mol Neurosci. 2001;17(2):225–32. doi: 10.1385/jmn:17:2:225 [DOI] [PubMed] [Google Scholar]
  • 72.Kenny ER, Blamire AM, Firbank MJ, O’Brien JT. Functional connectivity in cortical regions in dementia with Lewy bodies and Alzheimer’s disease. Brain. 2012;135(Pt 2):569–81. doi: 10.1093/brain/awr327 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Heneka MT, Carson MJ, El Khoury J, Landreth GE, Brosseron F, Feinstein DL, et al. Neuroinflammation in Alzheimer’s disease. Lancet Neurol. 2015;14(4):388–405. doi: 10.1016/S1474-4422(15)70016-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Chen Y, Yu Y. Tau and neuroinflammation in Alzheimer’s disease: interplay mechanisms and clinical translation. J Neuroinflammation. 2023;20(1):165. doi: 10.1186/s12974-023-02853-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Andronie-Cioara FL, Ardelean AI, Nistor-Cseppento CD, Jurcau A, Jurcau MC, Pascalau N, et al. Molecular mechanisms of neuroinflammation in aging and Alzheimer’s Disease progression. Int J Mol Sci. 2023;24(3):1869. doi: 10.3390/ijms24031869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Muller L, Di Benedetto S, Pawelec G. The immune system and its dysregulation with aging. Biochemistry and cell biology of ageing: Part II clinical science. 2019. p. 21–43. [DOI] [PubMed]
  • 77.Santoro A, Bientinesi E, Monti D. Immunosenescence and inflammaging in the aging process: age-related diseases or longevity?. Ageing Res Rev. 2021;71:101422. doi: 10.1016/j.arr.2021.101422 [DOI] [PubMed] [Google Scholar]
  • 78.Smith R, Strandberg O, Mattsson-Carlgren N, Leuzy A, Palmqvist S, Pontecorvo MJ, et al. The accumulation rate of tau aggregates is higher in females and younger amyloid-positive subjects. Brain. 2020;143(12):3805–15. doi: 10.1093/brain/awaa327 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Therneau TM, Knopman DS, Lowe VJ, Botha H, Graff-Radford J, Jones DT, et al. Relationships between β-amyloid and tau in an elderly population: An accelerated failure time model. Neuroimage. 2021;242:118440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Gao F, Shang S, Chen C, Dang L, Gao L, Wei S, et al. Non-linear relationship between plasma amyloid-β 40 level and cognitive decline in a cognitively normal population. Front Aging Neurosci. 2020;12:557005. doi: 10.3389/fnagi.2020.557005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Whitwell JL, Tosakulwong N, Weigand SD, Graff-Radford J, Ertekin-Taner N, Machulda MM, et al. Relationship of APOE, age at onset, amyloid and clinical phenotype in Alzheimer disease. Neurobiol Aging. 2021;108:90–8. doi: 10.1016/j.neurobiolaging.2021.08.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Wyss-Coray T. Ageing, neurodegeneration and brain rejuvenation. Nature. 2016;539(7628):180–6. doi: 10.1038/nature20411 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Bi J, Zhang C, Lu C, Mo C, Zeng J, Yao M. Age-related bone diseases: role of inflammaging. Journal of Autoimmunity. 2024;143:103169. [DOI] [PubMed] [Google Scholar]
  • 84.Li X, Li C, Zhang W, Wang Y, Qian P, Huang H. Inflammation and aging: signaling pathways and intervention therapies. Signal Transduction and Targeted Therapy. 2023;8(1):239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Amin J, Holmes C, Dorey RB, Tommasino E, Casal YR, Williams DM, et al. Neuroinflammation in dementia with Lewy bodies: a human post-mortem study. Transl Psychiatry. 2020;10(1):267. doi: 10.1038/s41398-020-00954-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Wetering J van, Geut H, Bol JJ, Galis Y, Timmermans E, Twisk JWR, et al. Neuroinflammation is associated with Alzheimer’s disease co-pathology in dementia with Lewy bodies. Acta Neuropathol Commun. 2024;12(1):73. doi: 10.1186/s40478-024-01786-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Iba M, Kim C, Sallin M, Kwon S, Verma A, Overk C, et al. Neuroinflammation is associated with infiltration of T cells in Lewy body disease and α-synuclein transgenic models. J Neuroinflammation. 2020;17(1):214. doi: 10.1186/s12974-020-01888-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Sordo L, Qian T, Bukhari SA, Nguyen KM, Woodworth DC, Head E, et al. Characterization of hippocampal sclerosis of aging and its association with other neuropathologic changes and cognitive deficits in the oldest-old. Acta Neuropathol. 2023;146(3):415–32. doi: 10.1007/s00401-023-02606-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, et al. Stan: a probabilistic programming language. J Stat Softw. 2017;76:1. doi: 10.18637/jss.v076.i01 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Xiaofeng Zhu

18 Mar 2025

PGENETICS-D-25-00150

Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics

PLOS Genetics

Dear Dr. Chen,

Thank you for submitting your manuscript to PLOS Genetics. After careful consideration, we feel that it has merit but does not fully meet PLOS Genetics's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days (by May 17 2025, 11:59 PM). If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosgenetics@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pgenetics/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to any formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Xiaofeng Zhu

Section Editor

PLOS Genetics

Aimée Dudley

Editor-in-Chief

PLOS Genetics

Anne Goriely

Editor-in-Chief

PLOS Genetics

Journal Requirements:

1) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type ‘LaTeX Source File’ and leave your .pdf version as the item type ‘Manuscript’.

2) Your manuscript's sections are not in the correct order.  Please amend to the following order: Abstract, Author Summary, Introduction, Description of the Method, Verification and Comparison, Applications, Discussion, Acknowledgements, References, and Supplementary Information

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: 

https://journals.plos.org/plosgenetics/s/figures

4) We have noticed that you have uploaded Supporting Information files, but you have not included a complete list of legends. Please add a full list of legends for your Supporting Information files after the references list.

5) Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager. 

Potential Copyright Issues:

i) Figure 1. Please confirm whether you drew the images / clip-art within the figure panels by hand. If you did not draw the images, please provide (a) a link to the source of the images or icons and their license / terms of use; or (b) written permission from the copyright holder to publish the images or icons under our CC BY 4.0 license. Alternatively, you may replace the images with open source alternatives. See these open source resources you may use to replace images / clip-art:

- https://commons.wikimedia.org

- https://openclipart.org/.

6) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.".

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note that one of the reviews is uploaded as an attachment.

Reviewer #1: The authors propose a novel MR method to test for the interaction effect of an exposure and a binary group on the outcome using group-stratified and group-combined GWAS summary statistics. The application to ADHD and AD revealed some interesting interaction effects with sex and age on the diseases. The idea is interesting and the paper is well structured and well written. I have several comments for improvement:

1. It might be better to move the model details in Supplementary S1.1 to the main text. This is insightful for motivating the main relationship of GWAS effects. There are several typos in the derivation, e.g. in the equation after S2, $p$ should be $\rho$, and the RHS is not conditioning on $X$.

2. Related to 1, could the authors provide more details on the assumptions and identifiability conditions of the model?

a) For example, what is the definition of valid/invalid IVs? What are the conditions under which $\beta$ and $\beta_{int}$ are identifiable? The current model implicitly assumes that the group-specific genetic effect on the outcome arises only from the group-specific effect of the exposure on the outcome (if no pleiotropy is present); is this correct? Is a group-specific genetic effect on the exposure not allowed?

b) How does the model distinguish $\Gamma_j = \beta \gamma_j + \alpha_j^*$, where $\alpha_j^* = \beta_{int}\gamma_j + \alpha_j$ (I assume this might also be related to the assumption of uncorrelated balanced pleiotropy)? An explicit and detailed description of model assumptions will be very helpful.

c) Based on the current model $\Gamma_j = (\beta + \beta_{int}\rho)\gamma_j + \alpha_j$, is it possible to infer the parameters of interest using only multiple combined outcome GWAS datasets with known $\rho$'s?

3. How is the power of testing $\beta$ and $\beta_{int}$ related to the sample sizes of the different GWAS datasets? If this cannot be derived analytically, more extensive simulation studies will be helpful.

4. For the simulation results, besides Type-I error and power, could the authors also show the averaged estimates, MSE, bias etc. (at least in the supplementary)?

5. In simulation, could the authors also consider scenarios with directional pleiotropy and/or correlated pleiotropy? It will be interesting to see how robust the model is in these scenarios.

6. In real data application, what are the results for shared effects $\beta$? Could the authors also present some results only based on combined GWAS dataset using existing MR methods, and discuss the comparison?

7. Line 234 mentions that 35 exposures are examined, but line 242 mentions "the analysis was restricted to 51 exposures". And Fig 4 shows neither 35 nor 51 exposures. Which one is correct?

8. What is the result of clinically diagnosed AD?

9. The current heatmap only shows the difference in effect, could the authors also present a heatmap comparing the total effects in the two groups? This may give more information than only presenting the difference.
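
To make the identifiability question in comments 2(b) and 2(c) concrete, the following is a minimal sketch in Python, not the authors' int2MR implementation. It assumes the relationship quoted above, $\Gamma_j = (\beta + \beta_{int}\rho)\gamma_j + \alpha_j$, with known group proportions and balanced pleiotropy that is independent of $\gamma_j$, and it replaces the Bayesian model with plain least squares. The function names, simulated values, and noise levels are all hypothetical.

```python
import numpy as np

def slope_through_origin(gamma, Gamma):
    """Unweighted least-squares slope of Gamma on gamma with no intercept."""
    return float(gamma @ Gamma / (gamma @ gamma))

def recover_effects(rhos, slopes):
    """Solve slope_k = beta + beta_int * rho_k for (beta, beta_int) by least squares."""
    rhos = np.asarray(rhos, dtype=float)
    design = np.column_stack([np.ones_like(rhos), rhos])
    coef, *_ = np.linalg.lstsq(design, np.asarray(slopes, dtype=float), rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical simulation settings (not taken from the paper).
rng = np.random.default_rng(0)
n_iv = 200
beta, beta_int = 0.3, 0.2
gamma = rng.normal(0.0, 0.1, n_iv)   # IV-to-exposure effects
rhos = [0.0, 0.5, 1.0]               # known proportion of group 1 in each outcome GWAS

slopes = []
for rho in rhos:
    alpha = rng.normal(0.0, 0.005, n_iv)              # balanced pleiotropy, independent of gamma
    Gamma = (beta + beta_int * rho) * gamma + alpha   # IV-to-outcome effects in this dataset
    slopes.append(slope_through_origin(gamma, Gamma))

beta_hat, beta_int_hat = recover_effects(rhos, slopes)
print(f"beta_hat={beta_hat:.3f}, beta_int_hat={beta_int_hat:.3f}")
```

With a single value of $\rho$ only the combined slope $\beta + \beta_{int}\rho$ is determined, which is the ambiguity raised in comment 2(b); the second step has a unique solution only when at least two distinct proportions are available.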

Reviewer #2: The authors propose int2MR, an integrative Mendelian randomization (MR) method that utilizes GWAS summary statistics for exposure traits and group-separated or combined GWAS statistics for outcome traits, all without requiring individual-level GWAS data. Overall, the manuscript is well-written, but I have a few suggestions for improvement:

In models (1-3), it is unclear how the authors handle $\gamma_j$. Is it treated as a random variable with a prior, or as a fixed but unknown parameter?

The authors state that "int2MR improved power by jointly testing for main and interaction effects." This implies that the main effect should be non-zero. Two clarifications are needed: What if the main effect is zero? Additionally, can the method focus solely on testing the interaction effect?

In the real data analysis, how did the authors specify $\rho_k$? Is it known or unknown in practice?

Regarding sensitivity analysis, the authors used thresholds of $5 \times 10^{-8}$ (ADHD) or $10^{-8}$ (Alzheimer’s Disease) to select instrumental variables (IVs) and then performed LD clumping with an $r^2$ threshold of 0.05, assuming all IVs are valid. Please conduct sensitivity analyses to ensure the robustness of the results.
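
The instrument selection step described in this comment (a genome-wide significance threshold followed by LD clumping at an $r^2$ threshold of 0.05) can be sketched as below. This is a simplified greedy version that assumes a precomputed pairwise $r^2$ matrix and ignores the physical-distance window that dedicated clumping tools also apply. It is not the authors' pipeline, and the function name and toy inputs are hypothetical.

```python
import numpy as np

def select_instruments(pvals, r2, p_threshold=5e-8, r2_threshold=0.05):
    """Keep significant SNPs, most significant first, dropping any SNP whose
    r^2 with an already kept SNP is at or above r2_threshold."""
    pvals = np.asarray(pvals, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    order = np.argsort(pvals, kind="stable")
    candidates = [int(i) for i in order if pvals[i] < p_threshold]
    kept = []
    for i in candidates:
        if all(r2[i, j] < r2_threshold for j in kept):
            kept.append(i)
    return kept

# Toy inputs: SNP 0 and SNP 1 are in strong LD; SNP 2 is not significant.
pvals = [1e-10, 1e-9, 1e-3, 2e-8]
r2 = np.array([
    [1.00, 0.80, 0.00, 0.01],
    [0.80, 1.00, 0.00, 0.01],
    [0.00, 0.00, 1.00, 0.00],
    [0.01, 0.01, 0.00, 1.00],
])
kept = select_instruments(pvals, r2)
print(kept)  # SNP 1 is clumped with SNP 0; SNP 2 fails the p-value threshold
```

A sensitivity analysis of the kind requested here could rerun the downstream estimate while varying `p_threshold` and `r2_threshold` and check that the conclusions are stable.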

Reviewer #3: The review is uploaded as an attachment.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: None

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Can Yang

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: Review.docx

pgen.1011819.s004.docx (15.6KB, docx)

Decision Letter 1

Xiaofeng Zhu

13 Jun 2025

PGENETICS-D-25-00150R1

Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics

PLOS Genetics

Dear Dr. Chen,

Thank you for submitting your manuscript to PLOS Genetics. The revised manuscript has been seen by the reviewers. Reviewer 2 is satisfied with this revision. However, Reviewers 1 and 3 still have some concerns. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 30 days (by Jul 13 2025, 11:59 PM). If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosgenetics@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pgenetics/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Xiaofeng Zhu

Section Editor

PLOS Genetics

Aimée Dudley

Editor-in-Chief

PLOS Genetics

Anne Goriely

Editor-in-Chief

PLOS Genetics

Journal Requirements:

Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed most of my comments satisfactorily. I have two remaining comments:

1. Regarding Comment 1: The typo in the equation below line 71 appears to remain uncorrected and is the same as in the previous version.

2. Regarding Comment 6: I did not see the inclusion of results based on the combined GWAS dataset using existing MR methods. In addition, could the authors also perform traditional MR analyses using the sex-stratified GWAS data and compare those results with the findings presented in Figure S1?

Reviewer #2: I thank the authors for their effort in addressing my concerns.

Reviewer #3: The authors have addressed some of my concerns, but several critical issues remain:

1. The response to comment is superficial. The data analysis used binary outcomes, and yet the proposed method is developed only for linear regression. This mismatch is unacceptable and must be rectified.

2. The rationale for omitting Zhu et al. (2024) for comparison is unpersuasive. By the same logic, OLS—also not designed for MR—should have been excluded.

3. The response states that “relatively independent” SNPs are required, yet the manuscript claims the method tolerates “moderate” correlation. Clarify what constitutes moderate correlation and soften the wording of generalization if necessary.

4. The explanation about power comparison in the response is strong, but the key points need to be integrated into the main text to prevent misinterpretation and ensure balanced representation.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: None

Reviewer #2: Yes

Reviewer #3: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Can Yang

Reviewer #3: No

Decision Letter 2

Xiaofeng Zhu

25 Jul 2025

Dear Dr Chen,

We are pleased to inform you that your manuscript entitled "Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Xiaofeng Zhu

Section Editor

PLOS Genetics

Aimée Dudley

Editor-in-Chief

PLOS Genetics

Anne Goriely

Editor-in-Chief

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-25-00150R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Xiaofeng Zhu

PGENETICS-D-25-00150R2

Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics

Dear Dr Chen,

We are pleased to inform you that your manuscript entitled "Integrative Mendelian randomization for detecting exposure-by-group interactions using group-specific and combined summary statistics" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Benedek Toth

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text

    This document presents a comprehensive description of our methodological framework, including an extended discussion of the Bayesian hierarchical model and rigorous justification of all hyperparameter choices. It further details the simulation design and the procedures employed for generating summary statistics. S1 Text reports additional simulation results and expands upon the data-analysis findings presented in the main text.

    (PDF)

    pgen.1011819.s001.pdf (2.4MB, pdf)

    S1 Table

    List of the 51 genome-wide association study (GWAS) traits included in our analysis. For each trait, we report the phenotype name, total sample size, number of cases (where applicable), the contributing consortium or study, the citation for the primary GWAS publication (via PubMed), and the web portal used to access the full summary-statistic dataset.

    (XLSX)

    pgen.1011819.s002.xlsx (17.9KB, xlsx)

    S1 Checklist

    The STROBE-MR checklist of recommended items to address in reports of Mendelian randomization is included. This checklist is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/) and must be attributed to the STROBE Initiative. For more information, please see https://www.strobe-statement.org/.

    (DOCX)

    pgen.1011819.s003.docx (29.3KB, docx)

    Attachment

    Submitted filename: Review.docx

    pgen.1011819.s004.docx (15.6KB, docx)

    Attachment

    Submitted filename: response_letter.pdf

    pgen.1011819.s005.pdf (264.3KB, pdf)

    Attachment

    Submitted filename: response_r2.pdf

    pgen.1011819.s006.pdf (1.8MB, pdf)

    Data Availability Statement

    All GWAS summary statistics for the IV-to-exposure effects used in this paper are publicly available; links to them are listed in S1 Table. The R implementation of our int2MR method is available at https://github.com/kxu-stat/int2MR. The implementation depends on rstan (available at https://CRAN.R-project.org/package=rstan). Additionally, the GWAS summary statistics for the IV-to-outcome effects derived from the Religious Orders Study and the Rush Memory and Aging Project (ROSMAP), along with all primary results underlying our analyses, have been deposited on Zenodo at https://doi.org/10.5281/zenodo.16341091.


    Articles from PLOS Genetics are provided here courtesy of PLOS