Disentangling horizontal and vertical Pleiotropy in genetic correlation estimation: introducing the HVP model

Lamessa Dube Amente; Natalie T Mills; Thuc Duy Le; Elina Hyppönen; S Hong Lee

doi:10.1007/s00439-025-02762-w

2025 Sep 16;144(8):861–876. doi: 10.1007/s00439-025-02762-w

PMCID: PMC12449366 PMID: 40956352

Abstract

Genome-wide genetic correlation studies have demonstrated widespread shared genetic architecture between complex traits, yet the impact of vertical pleiotropy on these genetic correlation estimates remains unclear. To address this, we propose the Horizontal and Vertical Pleiotropy (HVP) model, designed to disentangle horizontal from vertical pleiotropy effects. This approach provides unbiased genetic correlation estimates specifically attributed to horizontal pleiotropy. Through simulations, we verify that the HVP model corrects biases introduced by vertical pleiotropy—particularly the causal influence of exposure on outcomes—across various scenarios, improving the accuracy of heritability and genetic correlation estimates. Vertical pleiotropy biases genetic variances and covariances, influencing essential estimates such as SNP-based heritability and genetic correlation in traditional methods. By addressing these biases, the HVP model enhances accuracy in parameter estimation. Real data analysis shows that horizontal pleiotropy significantly contributes to genetic correlations between metabolic syndrome (MetS) and traits such as type 2 diabetes, C-reactive protein (CRP), sleep apnoea, and cholelithiasis, whereas vertical pleiotropy is more relevant between body mass index (BMI) and MetS, and MetS and cardiovascular diseases. These findings suggest that action on modifiable factors like lowering BMI may effectively reduce MetS risk, while CRP—though not causative—serves as a useful marker in risk prediction through horizontal pleiotropic genes. These results confirm the HVP model’s relevance and utility in revealing the complex genetic architecture underlying traits such as metabolic syndrome, highlighting its potential to inform precision healthcare.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00439-025-02762-w.

Introduction

Genome-wide genetic correlation studies quantify the shared genetic architecture between complex traits, a concept known as horizontal pleiotropy (Ni et al. 2018; van Rheenen et al. 2019; Zhang et al. 2021). These methods rely on individual-level data or genome-wide association study (GWAS) summary statistics to estimate genetic correlations and reveal significant relationships between traits (Ni et al. 2018; Bulik-Sullivan et al. 2015). Methods based on restricted maximum likelihood (REML) (Lee et al. 2012a, b), including Genome-wide Complex Trait Analysis (GCTA) (Yang et al. 2011, 2013), MTG2 (Lee and van der Werf 2016), and BOLT-REML (Loh et al. 2015), are commonly employed for this purpose, especially for traits with polygenic architectures.

Previous studies have reported substantial genetic correlations across various diseases. For instance, studies published in Science (2018) and Nature Genetics (2013 & 2015) demonstrated substantial genetic correlations among psychiatric disorders, emphasizing the widespread impact of shared genetic effects (Bulik-Sullivan et al. 2015; Cross-Disorder Group of the Psychiatric Genomics et al. 2013; Brainstorm et al. 2018). Similarly, significant genetic correlations between metabolic syndrome (MetS) and multiple complex traits have been reported (van Walree et al. 2022). However, current approaches do not distinguish whether these correlations are driven by horizontal or vertical pleiotropy, limiting our understanding of the underlying mechanisms.

Pleiotropy generally manifests in two primary forms: vertical and horizontal (Ni et al. 2018; van Rheenen et al. 2019; Zhang et al. 2021; Hemani et al. 2018). Vertical pleiotropy occurs when a genetic variant influences one trait, which in turn causally affects another trait (van Rheenen et al. 2019; Lee et al. 2012a, b; Hackinger and Zeggini 2017; Sivakumaran et al. 2011). In contrast, horizontal pleiotropy occurs when genetic factors independently affect two traits, reflecting shared biological processes (Solovieff et al. 2013; Chesmore et al. 2018). Disentangling these two forms is essential for clarifying genotype-to-phenotype relationships and disease mechanisms.

This distinction has important practical implications. For example, vertical pleiotropic effects may not drive changes in the target trait through approaches like functional targeting, gene knockdown, or CRISPR-based gene editing unless the intermediary trait is also involved. In contrast, a modifiable exposure will not affect the outcome in the absence of vertical pleiotropy. Without recognizing this, practical applications such as targeted interventions may be hindered.

Current genetic correlation methods such as REML or Linkage Disequilibrium Score Regression (LDSC) (Bulik-Sullivan et al. 2015; Lee et al. 2012a, b; Yang et al. 2011; Cross-Disorder Group of the Psychiatric Genomics et al. 2013; Brainstorm et al. 2018; van Walree et al. 2022) implicitly assume that observed correlations reflect shared genetic effects alone. However, vertical pleiotropy can also induce genetic correlations that reflect phenotypic causality rather than shared genetic etiology. This conflation presents a critical limitation, as it obscures biological mechanisms and may undermine the predictive utility of polygenic models, especially when causal pathways or mediator traits differ across populations.

To address these issues, we propose the Horizontal and Vertical Pleiotropy (HVP) model, a novel approach designed to disentangle horizontal and vertical pleiotropy while accounting for causal relationships between traits. The HVP method aims to provide unbiased estimates of genetic correlation attributed to shared genetic effects, thereby enhancing both causal inference and biological interpretability. We evaluate the performance of the HVP model through extensive simulations and apply it to estimate the genetic correlation between MetS and related traits using UK Biobank data. The analysis reveals that horizontal pleiotropy drives genetic correlations between MetS and traits such as type 2 diabetes, C-reactive protein (CRP), sleep apnoea, and cholelithiasis, while vertical pleiotropy links body mass index (BMI) with MetS and MetS with cardiovascular diseases. These findings suggest that lowering BMI could effectively reduce MetS risk, while CRP serves as a useful risk biomarker. This highlights the HVP model’s utility in uncovering the complex genetic architecture of MetS and its potential implications for healthcare research.

Method

Ethical statement

This study utilized data from the UK Biobank (http://www.ukbiobank.ac.uk/), which follows a rigorously reviewed scientific protocol approved by the Northwest Multi-centre Research Ethics Committee, the National Information Governance Board for Health & Social Care, and the Community Health Index Advisory Group. Participants provided electronic consent, completed questionnaires on socio-demographic and lifestyle factors, and underwent physical measurements. Blood samples (approximately 45 ml) were collected and stored for various analyses, including genetic, proteomic, and metabolomic studies. Access to UK Biobank data was granted under project number 14575.

Statistical model: horizontal and vertical pleiotropy

Simultaneous consideration of horizontal and vertical pleiotropy (HVP) is feasible within a bivariate linear mixed model (see Fig. 1). In this HVP model, the genetic effects influencing trait 1 are shared with those influencing trait 2, and, concurrently, the phenotypes of trait 1 are partially determined by the phenotypes of trait 2 (Eq. 1):

$$y = \tau c + \alpha + e, \qquad c = \beta + \epsilon \tag{1}$$

where y and c represent traits of interest (outcome vs. exposure), τ represents fixed causal effects of trait 2 (c) on trait 1 (y), and α and β denote random genetic effects associated with traits y and c, respectively. Additionally, e and ϵ encompass non-genetic residual effects pertaining to traits y and c, respectively. The genetic effects of both traits traverse two distinct paths, namely horizontal and vertical pleiotropy.

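Assuming that c is standardised with a mean of zero and a variance of one, the estimated τ can be expressed as:

$$\hat{\tau} = \frac{\mathrm{cov}(y, c)}{\mathrm{var}(c)} = \tau + \mathrm{cov}(\alpha, \beta) + \mathrm{cov}(e, \epsilon) \tag{2}$$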

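This equation highlights the potential bias in estimating τ due to genetic and residual covariances. Additionally, the variances of y and c can be written as:

$$\mathrm{var}(y) = \tau^{2}\,\mathrm{var}(c) + \mathrm{var}(\alpha) + \mathrm{var}(e) + 2\tau\,[\mathrm{cov}(\alpha, \beta) + \mathrm{cov}(e, \epsilon)]$$

$$\mathrm{var}(c) = \mathrm{var}(\beta) + \mathrm{var}(\epsilon)$$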

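It is important to note that a mis-specified model that does not account for vertical pleiotropy is often used in genetic analyses. This model can be expressed as

$$y = g + e^{*}, \qquad c = \beta + \epsilon \tag{5}$$

where g = τβ + α and e* = τϵ + e. Subsequently, the following equations can be derived:

$$\mathrm{var}(g) = \mathrm{var}(\alpha) + \tau^{2}\,\mathrm{var}(\beta) + 2\tau\,\mathrm{cov}(\alpha, \beta) \tag{6}$$

$$\mathrm{cov}(g, \beta) = \mathrm{cov}(\alpha, \beta) + \tau\,\mathrm{var}(\beta) \tag{7}$$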

This clearly illustrates the biased estimates of genetic variances within a univariate context, comparing var(g) with var(α). The deviation is a consequence of both τ and cov(α,β). Importantly, this bias extends to genetic covariances, influencing key parameters, including heritability and genetic correlation (refer to Results section).
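The size of these biases can be checked numerically. The following is a minimal Monte Carlo sketch, not the authors' simulation code. It assumes the model takes the form y = τc + α + e and c = β + ϵ, draws genetic values and residuals directly for each individual instead of simulating SNPs, and uses illustrative parameter values (var(α) = var(β) = 0.5, cov(α,β) = 0.25, cov(e,ϵ) = 0.05, τ = 0.3) that are assumptions rather than values taken from the study.

```python
import math
import random


def correlated_pair(rng, var_a, var_b, cov_ab):
    """Draw one (a, b) pair from a bivariate normal with the given moments."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    a = math.sqrt(var_a) * z1
    slope = cov_ab / math.sqrt(var_a)  # loading of b on z1
    b = slope * z1 + math.sqrt(var_b - slope ** 2) * z2
    return a, b


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


def simulate(tau, n=50_000, seed=1):
    rng = random.Random(seed)
    g, beta, y, c = [], [], [], []
    for _ in range(n):
        a, b = correlated_pair(rng, 0.5, 0.5, 0.25)    # alpha, beta (illustrative)
        e, eps = correlated_pair(rng, 0.4, 0.5, 0.05)  # residuals (illustrative)
        ci = b + eps              # exposure, var(c) = 1
        yi = tau * ci + a + e     # outcome
        g.append(tau * b + a)     # genetic value seen by a model that ignores tau
        beta.append(b)
        y.append(yi)
        c.append(ci)
    return {
        "tau_hat": cov(y, c) / cov(c, c),  # naive regression slope of y on c
        "var_g": cov(g, g),
        "cov_g_beta": cov(g, beta),
    }


tau = 0.3
out = simulate(tau)
print(out["tau_hat"], "vs", tau + 0.25 + 0.05)
print(out["var_g"], "vs", 0.5 + tau ** 2 * 0.5 + 2 * tau * 0.25)
print(out["cov_g_beta"], "vs", 0.25 + tau * 0.5)
```

Each printed pair compares a simulated quantity with the value implied by the corresponding expression: the naive slope absorbs both covariances, and var(g) and cov(g,β) exceed var(α) and cov(α,β) whenever τ > 0.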

Leveraging established Mendelian randomization (MR) methods, we anticipate the accurate classification of genetic variants pertaining to horizontal and vertical pleiotropy (Hackinger and Zeggini 2017; Slob and Burgess 2020; Davey Smith and Hemani 2014). This classification enables an unbiased estimate of τ. Subsequently, the remaining two key parameters (cov(α,β), cov(e,ϵ)) can be accurately inferred by solving multiple equations that elucidate relationships between the two traits in terms of genetic and residual components. The procedural breakdown is as follows:

Estimation of causal effect

GREML approach

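Starting with Eq. (1), genetic components are decomposed into two distinct subsets: one specific to horizontal pleiotropy (α₁, β₁) and the other to vertical pleiotropy (α₂, β₂), as illustrated in Fig. 1. Specifically, we define:

$$\alpha = \alpha_{1} + \alpha_{2}, \qquad \beta = \beta_{1} + \beta_{2}$$

The genetic component for trait 1, under vertical pleiotropy, is expressed as:

$$g_{2} = \tau\beta_{2} + \alpha_{2}$$

Thus, the genetic covariance under vertical pleiotropy is:

$$\mathrm{cov}(g_{2}, \beta_{2}) = \tau\,\mathrm{var}(\beta_{2}) + \mathrm{cov}(\alpha_{2}, \beta_{2})$$

Since α₂ and β₂ are independent by construction, cov(α₂, β₂) = 0, and therefore:

$$\hat{\tau} = \frac{\mathrm{cov}(g_{2}, \beta_{2})}{\mathrm{var}(\beta_{2})}$$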

The values of cov(g₂, β₂) and var(β₂) can be obtained using the conventional GREML method (Lee and van der Werf 2016), i.e. Eq. (5), provided it is based on the SNPs classified as vertical pleiotropy variants. As depicted in the subsequent equations, the estimated τ is used to differentiate variance components attributed to horizontal and vertical pleiotropy.

MR approach

In practice, robust MR methods can be used to estimate an unbiased value of τ (Slob and Burgess 2020; Zhu et al. 2018). To guide researchers in selecting appropriate methods, we recommend following established best practices in MR analysis (Hemani et al. 2018; Burgess et al. 2019), with a particular focus on approaches that are robust to horizontal pleiotropy. Among these, MRLOVA (Amente et al. 2025) is designed to accommodate invalid instruments through an EM-based framework. We also applied methods from different model classes, including Contamination Mixture (Burgess et al. 2020) and MRMix (Qi and Chatterjee 2019), both of which use mixture modelling to handle heterogeneous instrument validity. These approaches offer complementary robustness features and are well suited to sensitivity analysis. We encourage the use of multiple robust MR methods in combination with sensitivity analyses to evaluate the stability of causal inference under different assumptions.

Correction of genetic covariance

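From Eq. (7), the genetic covariance attributed to horizontal pleiotropy is:

$$\mathrm{cov}(\alpha, \beta) = \mathrm{cov}(g, \beta) - \hat{\tau}\,\mathrm{var}(\beta) \tag{8}$$

Here, var(β) and cov(g,β) are estimated using the conventional bivariate GREML method (Eq. 5) based on genome-wide SNPs.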

Estimation of standard error for the corrected genetic covariance

The standard error (se) of the corrected genetic covariance, cov(α,β), can be derived using the delta method. Based on Eq. (8), the corrected genetic covariance is a function of cov(g,β) and var(β), assuming that τ is estimated without error, i.e. f(cov(g,β), var(β)) = cov(g,β) − τ̂ var(β). Then, the delta method approximates the standard error of the corrected genetic covariance as:

$$\mathrm{se}[\mathrm{cov}(\alpha, \beta)] \approx \sqrt{\nabla f^{\top}\,\Omega\,\nabla f} \tag{9}$$

where ∇f = (∂f/∂cov(g,β), ∂f/∂var(β)) = (1, −τ̂) is the derivative of f with respect to the variance components and Ω is the covariance matrix of the variance components. Each element of Ω can be obtained from the information matrix generated by the conventional bivariate GREML method (Lee and van der Werf 2016).
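As a worked illustration of the covariance correction and its delta-method standard error, the sketch below takes Eq. (8) to be cov(α,β) = cov(g,β) − τ̂ var(β) and treats τ̂ as known, as stated above. The function names and the numerical inputs are hypothetical; in practice the sampling variances would come from the GREML information matrix.

```python
import math


def corrected_genetic_cov(cov_g_beta, var_beta, tau_hat):
    """Remove the vertical-pleiotropy part tau * var(beta) from cov(g, beta)."""
    return cov_g_beta - tau_hat * var_beta


def se_corrected_genetic_cov(tau_hat, var_cov_gb, var_var_b, cov_gb_vb):
    """Delta-method se, treating tau_hat as estimated without error.

    The gradient of f = cov(g,beta) - tau * var(beta) with respect to
    (cov(g,beta), var(beta)) is (1, -tau). Omega holds the sampling variances
    of the two estimates and their sampling covariance.
    """
    grad = (1.0, -tau_hat)
    omega = ((var_cov_gb, cov_gb_vb), (cov_gb_vb, var_var_b))
    quad = sum(grad[i] * omega[i][j] * grad[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)


# Hypothetical GREML output (illustrative numbers only).
est = corrected_genetic_cov(cov_g_beta=0.40, var_beta=0.50, tau_hat=0.30)
se = se_corrected_genetic_cov(0.30, var_cov_gb=4e-4, var_var_b=9e-4, cov_gb_vb=2e-4)
print(est, se)
```

Because τ̂ is held fixed, this standard error does not include uncertainty in the causal effect estimate itself.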

Correction of genetic variance

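Similarly, the biased estimate of genetic variance in Eq. (6) can be corrected as:

$$\mathrm{var}(\alpha) = \mathrm{var}(g) - \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(\alpha, \beta)$$

Substituting cov(α,β) from Eq. (8):

$$\mathrm{var}(\alpha) = \mathrm{var}(g) + \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(g, \beta) \tag{10}$$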

Estimation of standard error of the corrected genetic variance

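Using Eq. (9), the se of var(α) can be derived with:

$$f(\mathrm{var}(g), \mathrm{var}(\beta), \mathrm{cov}(g, \beta)) = \mathrm{var}(g) + \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(g, \beta)$$

and

$$\nabla f = (1,\ \hat{\tau}^{2},\ -2\hat{\tau})$$

Again, each element of Ω can be extracted from the information matrix of the GREML method.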

Correction of heritability

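From Eqs. 6 and 10, the corrected heritability can be derived as follows, assuming y is standardised:

$$h^{2} = \frac{\mathrm{var}(\alpha)}{\mathrm{var}(y)} = \mathrm{var}(g) + \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(g, \beta)$$

With standardised y, i.e. var(y) = 1, the standard error of var(α) can be used for the corrected heritability.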
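The variance and heritability correction can be sketched in the same way. The example below assumes the corrected variance has the form var(α) = var(g) + τ̂² var(β) − 2τ̂ cov(g,β), with gradient (1, τ̂², −2τ̂) with respect to (var(g), var(β), cov(g,β)). The function names, input estimates and the diagonal sampling covariance matrix are hypothetical.

```python
import math


def corrected_genetic_var(var_g, var_beta, cov_g_beta, tau_hat):
    """Genetic variance of y after removing the contribution mediated by c."""
    return var_g + tau_hat ** 2 * var_beta - 2.0 * tau_hat * cov_g_beta


def se_corrected_genetic_var(tau_hat, omega):
    """Delta-method se; omega is the 3x3 sampling covariance matrix of
    (var(g), var(beta), cov(g,beta)), and tau_hat is treated as known."""
    grad = (1.0, tau_hat ** 2, -2.0 * tau_hat)
    quad = sum(grad[i] * omega[i][j] * grad[j] for i in range(3) for j in range(3))
    return math.sqrt(quad)


# Hypothetical estimates on a scale where var(y) = 1, so var(alpha) is also h2.
tau_hat = 0.30
h2_naive = 0.695  # var(g) from a model that ignores the causal path
h2_corrected = corrected_genetic_var(h2_naive, 0.50, 0.40, tau_hat)
omega = [[4e-4, 0.0, 0.0], [0.0, 9e-4, 0.0], [0.0, 0.0, 4e-4]]
se_h2 = se_corrected_genetic_var(tau_hat, omega)
print(h2_naive, h2_corrected, se_h2)
```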

Correction of genetic correlation

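The corrected genetic correlation due to horizontal pleiotropy is given by:

$$r_{\alpha\beta} = \frac{\mathrm{cov}(\alpha, \beta)}{\sqrt{\mathrm{var}(\alpha)\,\mathrm{var}(\beta)}} \tag{11}$$

Equation (11) is equivalent to:

$$r_{\alpha\beta} = \frac{\mathrm{cov}(g, \beta) - \hat{\tau}\,\mathrm{var}(\beta)}{\sqrt{[\mathrm{var}(g) + \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(g, \beta)]\,\mathrm{var}(\beta)}}$$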

Estimation of standard error of the corrected genetic correlation

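Using Eq. (9), the se of the corrected genetic correlation can be derived with

$$f(\mathrm{var}(g), \mathrm{var}(\beta), \mathrm{cov}(g, \beta)) = \frac{\mathrm{cov}(g, \beta) - \hat{\tau}\,\mathrm{var}(\beta)}{\sqrt{[\mathrm{var}(g) + \hat{\tau}^{2}\,\mathrm{var}(\beta) - 2\hat{\tau}\,\mathrm{cov}(g, \beta)]\,\mathrm{var}(\beta)}}$$

where ∇f is the vector of partial derivatives of f with respect to var(g), var(β) and cov(g,β).

Each element of Ω can be extracted from the information matrix of the GREML method.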
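Because the gradient of the corrected genetic correlation is tedious to write out by hand, one option is to evaluate it numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it writes the corrected correlation as [cov(g,β) − τ̂ var(β)] / sqrt([var(g) + τ̂² var(β) − 2τ̂ cov(g,β)] var(β)), approximates the gradient by central finite differences, and uses hypothetical estimates and a hypothetical diagonal Ω.

```python
import math


def corrected_rg(var_g, var_beta, cov_g_beta, tau_hat):
    """Genetic correlation attributed to horizontal pleiotropy."""
    cov_ab = cov_g_beta - tau_hat * var_beta
    var_a = var_g + tau_hat ** 2 * var_beta - 2.0 * tau_hat * cov_g_beta
    return cov_ab / math.sqrt(var_a * var_beta)


def delta_se(f, theta, omega, step=1e-6):
    """Delta-method se of f(theta) using central finite-difference gradients."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += step
        down[i] -= step
        grad.append((f(*up) - f(*down)) / (2.0 * step))
    k = len(theta)
    quad = sum(grad[i] * omega[i][j] * grad[j] for i in range(k) for j in range(k))
    return math.sqrt(quad)


tau_hat = 0.30
theta = (0.695, 0.50, 0.40)  # var(g), var(beta), cov(g,beta): illustrative values
omega = [[4e-4, 0.0, 0.0], [0.0, 9e-4, 0.0], [0.0, 0.0, 4e-4]]  # hypothetical

rg = corrected_rg(*theta, tau_hat)
se_rg = delta_se(lambda vg, vb, cgb: corrected_rg(vg, vb, cgb, tau_hat), theta, omega)
naive_rg = theta[2] / math.sqrt(theta[0] * theta[1])  # uncorrected bivariate estimate
print(naive_rg, rg, se_rg)
```

With these inputs the uncorrected correlation is larger than the corrected one, which is the direction of bias expected when τ is positive.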

Binary variables

In the previous section, we derived the theoretical framework assuming that both c and y are quantitative variables. Extending this framework to binary variables, whether for c, y, or both simultaneously, requires a transformation. This transformation enables the conversion of observed-scale estimates for binary variables into estimates on the liability scale (Lee et al. 2011, 2012a, b). Although genetic correlation remains consistent between the liability and observed scales (Lee et al. 2012a, b), fixed causal effects (τ), phenotypic covariance (cov(y, c) in Eq. 2), heritability (h²), coefficient of determination (R²), and genetic covariance differ between these scales. Table 1 illustrates the transformation of key parameters in the HVP model between the observed and liability scale.

Table 1.

Transformation of key parameters between the observed and liability scale

Parameter	Transformation formula
Heritability
R²
Genetic covariance between c and y due to horizontal pleiotropy, i.e. cov(α,β)
Genetic variance of y, i.e. var(g)
Causal effects of c on y

For models involving binary c and/or y, transforming parameters from the observed scale (denoted by subscript o) to the liability scale (denoted by l) allows more accurate interpretations. In the transformation, we assume that the population prevalence (k) matches the sample prevalence, suggesting an absence of ascertainment bias. The variable z represents the height of the normal density function at a specific threshold t on the normal distribution. This threshold corresponds to the proportion of disease prevalence k, calculated as t = − qnorm(k,0,1) and z = dnorm(t,0,1) in R. For cases involving the genetic covariance cov(α,β), z_y is for y and z_c is for c.
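The threshold and density calculations described above translate directly into code. The sketch below is a Python equivalent of the R expressions t = −qnorm(k,0,1) and z = dnorm(t,0,1). For the heritability row it assumes the standard transformation without ascertainment, h²_l = h²_o k(1 − k)/z² (Lee et al. 2011); this is an assumption about the form of the Table 1 entry, and the function name is hypothetical.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, k):
    """Convert observed-scale heritability of a binary trait to the liability
    scale, assuming the sample prevalence equals the population prevalence k."""
    std_normal = NormalDist()   # mean 0, sd 1
    t = -std_normal.inv_cdf(k)  # threshold, as t = -qnorm(k, 0, 1) in R
    z = std_normal.pdf(t)       # density height, as z = dnorm(t, 0, 1) in R
    return h2_observed * k * (1.0 - k) / z ** 2


# Illustrative values: observed-scale h2 of 0.05 for a trait with 10% prevalence.
print(liability_scale_h2(0.05, 0.10))
```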

For summary-based data

The HVP model and parameter corrections are also applicable to summary-based data. In this approach, genetic covariance and correlation estimates obtained from LDSC can be corrected using Eqs. 8 and 11, respectively. The standard errors of these corrected estimates are derived using the block jackknife method available in LDSC (Bulik-Sullivan et al. 2015).
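A generic delete-one-block jackknife illustrates how a standard error for a corrected estimate could be obtained from summary-based output. The sketch below is not LDSC's own code. It assumes that delete-one-block estimates of cov(g,β) and var(β) are available as plain lists, that the correction has the form cov(g,β) − τ̂ var(β), and that τ̂ is treated as fixed; all names and numbers are hypothetical.

```python
import math


def jackknife_se(delete_values):
    """Standard error from delete-one-block estimates of a statistic."""
    n = len(delete_values)
    mean = sum(delete_values) / n
    return math.sqrt((n - 1) / n * sum((v - mean) ** 2 for v in delete_values))


def corrected_cov_jackknife_se(cov_blocks, var_blocks, tau_hat):
    """Apply the correction within each delete-one-block estimate, then jackknife."""
    corrected = [c - tau_hat * v for c, v in zip(cov_blocks, var_blocks)]
    return jackknife_se(corrected)


# Hypothetical delete-one-block estimates of cov(g,beta) and var(beta).
cov_blocks = [0.41, 0.39, 0.40, 0.42, 0.38]
var_blocks = [0.50, 0.51, 0.49, 0.50, 0.50]
print(corrected_cov_jackknife_se(cov_blocks, var_blocks, tau_hat=0.30))
```

Applying the correction inside each block keeps the sampling covariance between cov(g,β) and var(β) in the resulting standard error without estimating it separately.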

Simulation setup

Simulations were conducted using simulated genotype data for a sample of 10,000 individuals. For the simulated phenotypes, 1,000 SNPs were used to model horizontal pleiotropy (with effects α₁ and β₁). Another 1,000 SNPs were used for the remaining genetic effects: half for simulating the exposure (β₂) and the other half for simulating the outcome (α₂). The causal effect τ was estimated using GREML applied to this second SNP set, in which the genetic effects of the exposure and the outcome have zero genetic covariance.

Scenario 1: vertical pleiotropy only

In this scenario, we simulated vertical pleiotropy only (τ > 0), excluding horizontal pleiotropy (cov(α,β) = 0) and residual covariance (cov(e,ε) = 0). The exposure (c) and outcome (y) variables were generated based on a multivariate normal distribution with a predefined variance-covariance structure, using Eq. (1). The variance-covariance structures for the genetic (α and β) and residual (e and ε) effects in Eq. (1) are as follows:

where the residual variance of y is set to maintain the phenotypic variance of y equal to 1.

This setup results in a heritability of 0.5 for both c and y. In this scenario, we systematically varied the causal effect τ from 0 to 0.4 in increments of 0.1. For genetic data, we simulated genotypes for 1000 SNPs using a binomial distribution with a minor allele frequency (MAF) set at 0.2.
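The genotype simulation step can be sketched as follows. This is a minimal illustration that assumes independent SNPs, two independent allele draws per individual and SNP (a Binomial(2, MAF) allele count), and standardisation by the expected mean and standard deviation. The function names are hypothetical, and the dimensions are reduced from those in the text to keep the example fast.

```python
import math
import random


def simulate_genotypes(n_individuals, n_snps, maf=0.2, seed=1):
    """Allele counts (0, 1 or 2) drawn as Binomial(2, maf) for each individual and SNP."""
    rng = random.Random(seed)
    return [
        [int(rng.random() < maf) + int(rng.random() < maf) for _ in range(n_snps)]
        for _ in range(n_individuals)
    ]


def standardise(genotypes, maf=0.2):
    """Centre and scale allele counts by their expected mean 2p and sd sqrt(2p(1-p))."""
    mean = 2.0 * maf
    sd = math.sqrt(2.0 * maf * (1.0 - maf))
    return [[(x - mean) / sd for x in row] for row in genotypes]


genotypes = simulate_genotypes(n_individuals=2_000, n_snps=50)
counts = [x for row in genotypes for x in row]
mean_count = sum(counts) / len(counts)  # should be close to 2 * MAF
w = standardise(genotypes)
print(mean_count)
```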

Scenario 2: both vertical and horizontal pleiotropy

In the second scenario, we introduced a more intricate setting involving both vertical and horizontal pleiotropy (τ > 0 and cov(α,β) > 0), while still excluding residual covariance (cov(e,ε) = 0). To delineate between vertical and horizontal pleiotropy variants, two sets of SNPs were exclusively used, each comprising 1000 SNPs, contributing to distinct genetic effects for y and c. The genetic effects can be decomposed as follows: α = α₁ + α₂ and β = β₁ + β₂. The variance-covariance structures for the genetic effects (α₁ and β₁, and α₂ and β₂) are as follows:

and

The variance-covariance structure for the residual effects (e and ε) is as follows:

where the residual variance of y is set to maintain the phenotypic variance of y equal to 1.

This setup results in a heritability of 0.5 for both c and y, and genetic correlation attributed to horizontal pleiotropy is 0.5. In this scenario, we systematically varied the causal effect τ from 0 to 0.4 in increments of 0.1.

Scenario 3: vertical and horizontal pleiotropy plus residual covariance

In the third scenario, we introduced a more complex setting involving both vertical and horizontal pleiotropy (τ > 0 and cov(α,β) > 0), along with the inclusion of residual covariance (cov(e,ε) > 0). Similar to scenario 2, we used two distinct sets of SNPs, each consisting of 1000 SNPs, to distinguish between vertical and horizontal pleiotropy. The variance-covariance structures for the genetic effects (α₁ and β₁, and α₂ and β₂) are as follows:

and

The variance-covariance structure for the residual effects (e and ε) is as follows:

where the residual variance of y is set to maintain the phenotypic variance of y equal to 1.

This setup results in a heritability of 0.5 for both c and y, and genetic correlation attributed to horizontal pleiotropy is 0.5, and fixed residual correlation at 0.1. In this scenario, we systematically varied the causal effect τ from 0 to 0.4 in increments of 0.1.

Scenario 4: complete mediation of genetic effect of trait 1 (y)

In this simulation scenario, we aimed to explore the complete mediation of genetic effects on trait y (var(α) = 0) through another intermediate trait, c, which exhibits significant direct genetic effects (var(β) = 0.5). Consequently, we set the covariance between α and β to 0 (cov(α, β) = 0), ensuring that no direct genetic effects influence y without passing through c. Additionally, we excluded residual covariance (cov(e, ε) = 0) to focus exclusively on vertical pleiotropy (τ > 0). Our interest lay in understanding how genetic effects manifest via c, acting as the primary genetic influencers on y in the presence of vertical pleiotropy (τ > 0). We constructed the exposure (c) and outcome (y) variables using a multivariate normal distribution, defining their variance-covariance structure according to Eq. (1). The variance-covariance structures for the genetic (α and β) and residual (e and ε) effects are represented as follows:

and

where the residual variance of y is set to maintain the phenotypic variance of y equal to 1. In this setup, the heritability for c is 0.5, while it is 0 for y, with a genetic correlation attributed solely to vertical pleiotropy. The scenario systematically varied the causal effect τ from 0 to 0.4 in increments of 0.1, elucidating the extent of genetic mediation through trait c on the outcome y.

Scenario 5: vertical and horizontal pleiotropy for the exposure, but only horizontal pleiotropy for the outcome

The fifth scenario is a special form of scenario 2 where the causal SNPs for y include only horizontal pleiotropy, with no vertical pleiotropy (var(α₂) = 0). The variance-covariance structures for the genetic effects (α₁ and β₁, and α₂ and β₂) are as follows:

and

The variance-covariance structure for the residual effects (e and ε) is as follows:

where the residual variance of y is set to maintain the phenotypic variance of y equal to 1.

We also performed simulations based on real genotype data from the UKB with MAF > 0.01. Details of the UKB genotype-based simulation are provided in the Supplementary Note.

Simulation for binary outcome and exposure

In real situations, both outcomes and exposures can often be binary. To simulate binary outcomes or exposures, we used the liability threshold model, which is based on predefined population prevalence rates (k). First, continuous quantitative phenotypes were simulated as described earlier. Then, using the specified prevalence rates for the outcome (y) and exposure (c), we applied standard normal distribution theory to determine the corresponding liability thresholds. Individuals were then categorized as either having or not having the outcome, or as exposed or unexposed, depending on whether their continuous phenotypes exceeded the respective threshold values.
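The liability threshold step can be sketched as below. This is a minimal illustration assuming the continuous phenotype is standardised before thresholding and that individuals above the upper-tail threshold for prevalence k are coded as 1. The function name, the prevalence of 0.1 and the simulated liabilities are hypothetical.

```python
import random
from statistics import NormalDist


def to_binary(liabilities, prevalence):
    """Code individuals as 1 when their standardised liability exceeds the
    threshold that leaves a proportion `prevalence` in the upper tail."""
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    n = len(liabilities)
    mean = sum(liabilities) / n
    sd = (sum((x - mean) ** 2 for x in liabilities) / (n - 1)) ** 0.5
    return [1 if (x - mean) / sd > threshold else 0 for x in liabilities]


rng = random.Random(7)
liability = [rng.gauss(0.0, 1.0) for _ in range(20_000)]  # stand-in for a simulated phenotype
cases = to_binary(liability, prevalence=0.1)
print(sum(cases) / len(cases))  # sample prevalence, close to 0.1
```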

Application of real data: MetS comorbidities and related complex traits

Phenotype data

We investigated MetS and 23 associated traits (12 comorbidities and 11 quantitative traits) to explore their biological interrelationships. MetS was defined as a composite phenotype according to the International Diabetes Federation criteria (Alberti et al. 2009). Individuals were classified as having MetS if they met three or more of the following criteria: central obesity (waist circumference (WC) ≥ 88 cm for females and ≥ 102 cm for males), elevated fasting triglycerides (TG) (≥ 1.7 mmol/L or medication), reduced high-density lipoprotein cholesterol (HDL-C) (< 1.29 mmol/L for females and < 1.03 mmol/L for males or medication), elevated blood pressure (systolic/diastolic blood pressure [SBP/DBP] ≥ 130/85 mmHg), and elevated fasting glucose (GLU) (≥ 5.6 mmol/L or medication). The 12 comorbidities were identified using ICD-10 codes (Supplementary Table 1), with relevant conditions from chapters including cardiovascular, psychiatric, metabolic, respiratory, and renal disorders. Quantitative traits included body mass index (BMI), basal metabolic rate (BMR), lung function, insulin-like growth factor 1 (IGF-1), neuroticism, C-reactive protein (CRP), serum vitamin D, and liver function tests (alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (ASP), and gamma-glutamyl transferase (GGT)) (Supplementary Table 2).

Data were adjusted for confounders (age, sex, Townsend Deprivation Index (TDI), education level, assessment centre, batch effect, population structure using the first 10 principal components), with standardized residuals used in analyses.

Genotype data

Rigorous quality control procedures were applied using PLINK v1.9. Individuals were excluded if they did not self-identify as white British, had discordant sex information, or had a genotype missing rate > 0.05. At the variant level, exclusions were made for an INFO score < 0.6, minor allele frequency < 0.01, Hardy-Weinberg equilibrium p-value < 1E-7, or call rate < 0.95. A total of 288,792 individuals and 7,701,772 single nucleotide polymorphisms (SNPs) passed these quality controls. Given the computational demands when using individual-level data, a random sample of 82,955 participants and HapMap3 SNPs was selected to balance computational efficiency with sufficient statistical power. The remaining 205,837 samples were used for the exposure GWAS in the two-sample MR analysis, as detailed below.

Genetic correlation

We estimated genetic correlations between MetS and its associated traits and comorbidities using bivariate GREML, a method that utilizes individual-level data to quantify the genetic overlap between traits. To account for multiple testing, we applied a Bonferroni correction to adjust significance thresholds, minimizing the risk of false-positive findings.

Subsequently, GWAS was conducted on the same dataset to generate summary statistics, followed by LDSC to estimate genetic correlations from these summary data. To further refine the genetic correlations, we applied HVP to disentangle horizontal and vertical pleiotropy, incorporating causal relationships between traits as estimated through MR. This process enabled a direct comparison of corrected genetic correlations derived from individual-level GREML estimates and summary-based LDSC estimates, offering an additional layer of validation and robustness to our findings.

To estimate causal effects, two-sample MR was conducted using the “TwoSampleMR” R package, where a separate sample (n = 205,837) was used for the exposure GWAS. Methods included MR Egger regression (Bowden et al. 2015), Inverse Variance Weighting (IVW), weighted median (Bowden et al. 2016), and weighted mode (Bowden et al. 2019). Additionally, GSMR in GCTA (v1.93) (Zhu et al. 2018), the contamination mixture method (Burgess et al. 2020), Iterative Mendelian Randomization and Pleiotropy (IMRP) (Zhu et al. 2021), the latent outcome variable approach (MRLOVA) (Amente et al. 2025) and the MR Pleiotropy RESidual Sum and Outlier (MR-PRESSO) method (Verbanck et al. 2018) were also employed to address significant heterogeneity and horizontal pleiotropy. Instrumental SNPs were selected based on genome-wide significance, LD-clumped, and harmonized between exposure and outcome datasets. Sensitivity tests evaluated directional pleiotropy, heterogeneity, and the strength of instrumental variables using F-statistics.
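For orientation, the fixed-effect IVW estimator among the methods listed above can be written in a few lines. The sketch below is a generic textbook form (a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, with weights 1/se²), not the TwoSampleMR implementation; the function name and the per-SNP summary statistics are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted causal estimate and its se.

    Assumes independent (LD-clumped) instruments and ignores uncertainty in
    the SNP-exposure effects.
    """
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical harmonised summary statistics for three instruments.
bx = [0.10, 0.20, 0.15]
by = [0.3 * b for b in bx]  # outcome effects consistent with a causal effect of 0.3
se_y = [0.010, 0.020, 0.015]
tau_ivw, se_ivw = ivw_estimate(bx, by, se_y)
print(tau_ivw, se_ivw)
```

An estimate of this kind is only unbiased when the instruments have no horizontal pleiotropic effects, which is why the pleiotropy-robust methods listed above are applied alongside it.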

Results

Simulations

The HVP model integrates both horizontal and vertical pleiotropy. In this model, vertical pleiotropy mediates the genetic effects on the outcome (see Fig. 1). As demonstrated in Eqs. (5–7), vertical pleiotropy can introduce biases into estimates of genetic variances within a univariate context. These biases extend to genetic covariances, impacting key genetic parameters such as heritability and genetic correlation (refer to Fig. 2 and Supplementary Fig. 1). Through extensive simulation analyses (scenarios 1–5), we identified biases in estimating SNP-based heritability and genetic correlation using the existing method, bivariate GREML (as depicted in Eq. (5)).

Fig. 2. Estimated heritability (h²) and genetic correlation (r_g). This figure presents results from simulations examining the heritability of trait y and the genetic correlation between traits y and c under four scenarios. The red dashed lines represent the true values. In each panel, the left plot shows the biased estimates due to vertical pleiotropy, and the right plot shows that, after estimating τ, the HVP model disentangles horizontal pleiotropy from vertical pleiotropy, thereby correcting both the heritability and genetic correlation estimates. In every scenario, the variance-covariance structures for the genetic and residual effects maintain y’s phenotypic variance at 1. A (scenario 1): traits y and c are simulated under vertical pleiotropy only. B (scenario 2): traits y and c are simulated under both horizontal and vertical pleiotropy, without residual covariance. C (scenario 3): traits y and c are simulated under both horizontal and vertical pleiotropy, with residual covariance. D (scenario 4): traits y and c are simulated under the assumption of no direct genetic effect on trait y except through trait c (complete mediation of genetic effects via the exposure).

For instance, in simulation scenario 1, where there is no shared genetic effect between the two phenotypes (cov(α,β) = 0), varying the fixed causal effect τ from 0.0 to 0.4 in increments of 0.1 reveals a marked deviation of the conventional SNP-based heritability and genetic correlation estimates from their true values. Notably, this deviation grows with the magnitude of the causal effect (refer to Fig. 2A). When the proposed method is applied (Eqs. (6)–(11)), the biased estimates are corrected, and unbiased estimates of both heritability and genetic correlation are obtained (Fig. 2A).
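The pattern in scenario 1 can be reproduced with a few lines of arithmetic. The sketch below is not a restatement of Eqs. (6)–(11); it assumes a simple linear structural model, c = g_c + ε for the exposure and y = τ·c + g_y + e for the outcome, where g_c and g_y are the direct genetic values of the two traits. The function names, argument names, and variance values are all hypothetical.

```python
import math


def apparent_genetic_parameters(var_gc, var_gy, cov_h, tau):
    """Genetic (co)variances that a conventional bivariate model would target.

    Assumed model (an illustration, not the paper's exact notation):
        c = g_c + eps            exposure
        y = tau * c + g_y + e    outcome
    var_gc, var_gy: variances of the direct genetic values g_c and g_y
    cov_h:          cov(g_c, g_y), the horizontal-pleiotropy covariance
    """
    # Under the assumed model the total genetic value of y is g_y + tau * g_c.
    var_y_total = var_gy + tau ** 2 * var_gc + 2 * tau * cov_h
    cov_total = cov_h + tau * var_gc
    rg = cov_total / math.sqrt(var_gc * var_y_total)
    return var_y_total, cov_total, rg


def remove_vertical_component(var_gc, var_y_total, cov_total, tau):
    """Invert the map above for a known (or MR-estimated) tau."""
    cov_h = cov_total - tau * var_gc
    var_gy = var_y_total - 2 * tau * cov_total + tau ** 2 * var_gc
    # Guard against a non-positive direct genetic variance.
    rg_h = cov_h / math.sqrt(var_gc * var_gy) if var_gy > 0 else float("nan")
    return var_gy, cov_h, rg_h


# Scenario-1-like setting: no horizontal pleiotropy (cov_h = 0).
for tau in (0.0, 0.1, 0.2, 0.3, 0.4):
    var_y, cov_yc, rg = apparent_genetic_parameters(0.5, 0.3, 0.0, tau)
    _, _, rg_h = remove_vertical_component(0.5, var_y, cov_yc, tau)
    print(f"tau={tau:.1f}  apparent rg={rg:.3f}  corrected rg={rg_h:.3f}")
```

Under these assumptions the apparent genetic correlation is non-zero whenever τ is non-zero, even though the two traits share no direct genetic effects, and subtracting the τ-dependent terms recovers the horizontal component. A negative τ flips the sign of the added terms, which is in line with the underestimation described for negative causal effects.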

A similar pattern was observed in simulation scenarios 2–4, where genetic and residual correlations are present (Figs. 2B–D): the existing method generates biased estimates that can be corrected by our proposed method. We also conducted simulations (scenario 4) in which genetics has no direct effect on the outcome, which is causally influenced by another trait (complete mediation of the genetic effect via the second trait). Interestingly, because of this causal relationship, the non-heritable trait (outcome) appears to be heritable (Fig. 2D, left panel). This bias can also be corrected by the proposed method (Fig. 2D, right panel). A similar pattern was also observed under the fifth scenario (Supplementary Fig. 2).

Our extensive simulations demonstrate that our approach effectively disentangles the two components of pleiotropy across scenarios 1 through 4 (Fig. 2). Simulations based on real UK Biobank (UKB) genotype data yielded similar results, underscoring the robustness of our method (see Supplementary Fig. 1). Additionally, we verified the applicability of the HVP model and parameter corrections for binary exposure and/or binary outcome using the transformation approach (see Supplementary Table 3).

The HVP model also corrects the underestimation of genetic correlation and outcome heritability in scenarios where the causal effect (τ) is negative (Supplementary Tables 4–5).

Real data

MetS comorbidities and related complex traits

Of the 82,955 White British participants included in this analysis, 17,335 (20.90%) were identified as MetS cases at baseline (Supplementary Table 6). Heritability of the complex traits evaluated in this study ranges from 29.18% for BMR to 0.77% for stroke (Supplementary Table 7). We examined the prevalence of ICD-10 codes associated with MetS in individuals with (n = 17,335) and without MetS (n = 65,620). All comorbidities were more frequent in the MetS group, with significant differences confirmed by chi-square tests after multiple test correction. After adjusting for covariates, the associations of stroke and atrial fibrillation with MetS were no longer significant, while others remained significant (Supplementary Table 8). Additionally, baseline comparisons of the quantitative traits showed significant mean differences between the groups, persisting after adjustment except for the FEV1/FVC Z-score and neuroticism (Supplementary Table 9).

Estimation of causal relationship

In the analysis of causal relationships, MetS was examined as both an exposure and an outcome for traits of interest associated with MetS (Fig. 3, Supplementary Figs. 3 and 5, Supplementary Table 10). Based on prior knowledge and current literature, MetS was primarily analyzed as an exposure (Lin et al. 2022; Mottillo et al. 2010; Zhu et al. 2023) for most comorbidities and complex traits, while it was considered an outcome in analyses involving BMI, BMR, CRP, and vitamin D (Xu et al. 2020; Ridker et al. 2008; Timpson et al. 2005). In all cases, reverse causation was systematically evaluated to ensure robustness of the findings (Supplementary Figs. 4 and 6). Instrumental variant details, including F-statistics, are in Supplementary Table 11.
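The causal estimates in this section come from MRLOVA and several other estimators whose internals are not described here. As a rough illustration of the kind of quantity being estimated, the sketch below implements the standard fixed-effect inverse-variance-weighted (IVW) estimator, one of the methods listed under web resources, together with the common approximation F ≈ (β/se)² for per-variant instrument strength. It is a stand-in for illustration, not MRLOVA; the summary statistics are hypothetical and the function names are not taken from any of the cited packages.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate and its standard error."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


def instrument_f_statistics(beta_exposure, se_exposure):
    """Per-variant F-statistic approximated as (beta / se) squared."""
    return [(b / s) ** 2 for b, s in zip(beta_exposure, se_exposure)]


# Hypothetical SNP-exposure and SNP-outcome summary statistics (four variants).
bx = [0.10, 0.08, 0.12, 0.05]
sx = [0.010, 0.012, 0.015, 0.009]
by = [0.031, 0.022, 0.038, 0.014]
sy = [0.008, 0.009, 0.010, 0.007]

tau_hat, tau_se = ivw_estimate(bx, by, sy)
# F < 10 is a widely used rule of thumb for flagging weak instruments.
weak = [f for f in instrument_f_statistics(bx, sx) if f < 10]
print(f"IVW tau = {tau_hat:.3f} (se {tau_se:.3f}); weak instruments: {len(weak)}")
```

Running the same function with exposure and outcome swapped, using instruments selected for the other trait, is the simplest form of the reverse-causation check described above.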

Fig. 3 — Causal relationships between MetS and related complex traits. Causal effect estimates determined by MRLOVA, a robust method, are presented here for clarity and focus. Consistent findings across other methods (detailed in Supplementary Figs. 2 & 4) further validate and strengthen the robustness of these results. MetS metabolic syndrome, AHD atherosclerotic heart disease, IHD ischemic heart disease, MI myocardial infarction, AFib atrial fibrillation, ACD arrhythmia and conduction disorders, CKD chronic kidney disease, DM2 type 2 diabetes, SA sleep apnoea, BMI body mass index, BMR basal metabolic rate, IGF-1 insulin-like growth factor 1, CRP C-reactive protein, vitD vitamin D, ALP alkaline phosphatase, ALT alanine aminotransferase, ASP aspartate aminotransferase, and GGT gamma-glutamyl transferase

When analyzing MetS as an outcome, genetically determined BMI and BMR were positively associated with MetS status. This association was consistent across all methods, except for the weighted mode, which reported a non-significant association for BMR (Fig. 3, Supplementary Fig. 3, Supplementary Table 10). No evidence of reverse causation was detected (Supplementary Fig. 4). No causal relationship was identified between genetically determined serum CRP levels and MetS (Fig. 3, Supplementary Fig. 3), nor between MetS and CRP levels, also confirming the absence of reverse causation (Supplementary Fig. 4). Genetically determined vitamin D levels also showed no significant effect on MetS. However, MetS exhibited a significant negative impact on vitamin D levels, with this reverse causal effect consistently supported by multiple methods, including GSMR, MR-PRESSO, ConMix, and MRLOVA (Fig. 3, Supplementary Figs. 3–4, Supplementary Table 10).

When analyzing MetS as an exposure, genetically determined MetS was significantly associated with the development of cardiovascular diseases, including myocardial infarction (MI), ischemic heart disease (IHD), and atherosclerotic heart disease (AHD) (Fig. 3, Supplementary Fig. 5, Supplementary Table 10). Estimates from all methods, except MR Egger, were consistent in direction and magnitude, with no evidence of reverse causation between cardiovascular diseases and MetS (Supplementary Fig. 6). Genetically determined MetS was also significantly associated with chronic kidney disease (CKD) and type 2 diabetes (DM2). While no reverse causation was detected for CKD, a consistent bidirectional relationship between MetS and DM2 was observed across all methods except MR Egger (Supplementary Figs. 5–6, Supplementary Table 10). No causal association was identified between MetS and atrial fibrillation (AFib) in either direction (Fig. 3, Supplementary Figs. 5–6, Supplementary Table 10).

Genetic correlation estimations

The genetic correlation between MetS and 23 complex traits (including 12 comorbid conditions based on ICD-10 and 11 quantitative traits) was estimated using the UKB data. All phenotypes were adjusted for confounders such as age, sex, and other relevant factors. We applied the conventional bivariate GREML method (Ni et al. 2018) for this analysis. The strongest genetic correlation was observed between MetS and type 2 diabetes (0.69, se = 0.0367), followed by BMI (0.65, se = 0.019), CKD (0.54, se = 0.0843), and BMR (0.50, se = 0.0196). Serum vitamin D levels exhibited a negative correlation with MetS (-0.12, se = 0.0358), as did IGF-1 (-0.06, se = 0.0261). Notably, stroke, neuroticism, chronic obstructive pulmonary disease (COPD), and anxiety showed no significant genetic correlations with MetS (see Fig. 4, Supplementary Table 10).

Fig. 4 — Genetic correlation between MetS and related traits estimated using conventional GREML and the proposed HVP methods. Bonferroni correction for multiple testing was applied to adjust significance thresholds. The causal effect estimated by MRLOVA is used to correct the genetic correlation in this figure

Based on these estimations, we applied the proposed HVP model to correct the genetic correlations and disentangle the effects of horizontal pleiotropy. The estimated genetic correlations between MetS and cardiovascular diseases were primarily driven by vertical pleiotropy, as evidenced by the significant causal effects of MetS on cardiovascular diseases. After correcting using the HVP model, the genetic correlation attributed to horizontal pleiotropy became non-significant. Similarly, the genetic correlation between vitamin D and MetS became non-significant after correction, suggesting no significant horizontal pleiotropy effects (Fig. 4).

Interestingly, the strong genetic correlation between DM2 and MetS was influenced by both vertical and horizontal pleiotropy, with horizontal pleiotropy accounting for 77.67% of the correlation, i.e., the proportion of the original genetic correlation that remains after adjusting for vertical pleiotropy. Likewise, the genetic correlation between BMR and MetS was also affected by both forms of pleiotropy. The corrected genetic correlation between MetS and CKD had a p-value of 0.0023 but became non-significant after adjusting for multiple testing. In contrast, the genetic correlations between MetS and CRP, MetS and sleep apnoea, and MetS and cholelithiasis were driven solely by horizontal pleiotropy (Fig. 4, Supplementary Table 10), indicating that no causal relationship exists between these pairs.
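Two pieces of arithmetic in this paragraph can be checked with a short sketch. It assumes that the Bonferroni adjustment spans the 23 trait pairs analysed at a family-wise level of 0.05, which the text does not state explicitly, and both helper names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def horizontal_share(rg_original, rg_corrected):
    """Proportion of the original genetic correlation that remains after correction."""
    return rg_corrected / rg_original


# Assumption: 23 tests at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 23)
ckd_p = 0.0023  # p-value reported for the corrected MetS-CKD correlation
print(f"threshold = {threshold:.5f}; CKD passes: {ckd_p < threshold}")

# Back-of-envelope value implied by two rounded numbers in the text
# (original rg 0.69 and a 77.67% horizontal share); not quoted from the paper.
implied_corrected_rg = 0.7767 * 0.69
print(f"implied corrected rg for DM2 = {implied_corrected_rg:.2f}")
```

Under the 23-test assumption the threshold sits just below 0.0023, which is consistent with the CKD correlation losing significance only after the multiple-testing adjustment.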

HVP model based on GWAS summary statistics

The HVP model is compatible with GWAS summary statistics, enabling its application even when individual-level data are inaccessible. In scenarios where only summary data were available, LDSC was used to estimate genetic correlations and their associated standard errors. When applied under these conditions, the HVP model demonstrated strong concordance with results obtained from individual-level HVP analyses, showcasing its robustness and adaptability (Fig. 5). This consistency underscores the model’s utility in leveraging summary-level data for reliable genetic correlation estimation and pleiotropy dissection, broadening its applicability across diverse datasets. This functionality is also implemented in the accompanying R package, as detailed in the code availability section.

Fig. 5 — Comparison of corrected genetic correlation estimates based on LDSC and GREML. The bar plot illustrates the corrected genetic correlations between MetS and various traits, as initially estimated by LDSC (using summary statistics) and GREML (using individual level data). Estimates from the two methods are represented by distinct colours. Error bars indicate the 95% confidence intervals (CI) for the corrected genetic correlation estimates. Asterisks represent Bonferroni-adjusted significance levels (* = adjusted p < 0.05, ** = adjusted p < 0.01, *** = adjusted p < 0.001). Trait pairs included in the figure showed significant genetic correlations from both methods and a significant causal effect based on two-sample MR

Discussion

Genome-wide genetic correlation studies and MR analyses are essential tools in understanding the relationships between complex traits. Genetic correlation quantifies the shared genetic basis between traits, while MR leverages genetic data to estimate causal effects. Standard methods for estimating genetic correlation, such as LDSC (Bulik-Sullivan et al. 2015) and REML through approaches like GCTA (Yang et al. 2011), MTG2 (Lee and van der Werf 2016), and BOLT-REML (Loh et al. 2015), have been widely used, especially for polygenic traits. However, these methods do not always account for the distinct forms of pleiotropy—vertical and horizontal—which can obscure the true nature of genetic associations.

Vertical pleiotropy involves a causal chain where a genetic variant influences one trait, which then affects another (Jang et al. 2022). In contrast, horizontal pleiotropy occurs when a genetic variant independently affects two traits, reflecting shared biological pathways (Sivakumaran et al. 2011; Solovieff et al. 2013). Distinguishing between these forms is critical for accurate interpretation of genetic correlations and causal relationships in disease biology.

In this study, we introduced the HVP model, an innovative approach designed to provide unbiased estimates of heritability and genetic correlation by disentangling horizontal and vertical pleiotropy. Unlike traditional methods such as GREML, the HVP model adjusts for vertical pleiotropy by incorporating causal effect estimates from MR. This allows for a clear distinction between shared genetic influences (horizontal pleiotropy) and those mediated through causal pathways (vertical pleiotropy). This distinction is critical because genetic effects arising from vertical pleiotropy may not directly impact the target trait of interest unless the intermediary trait is also addressed. Such insights have profound implications for functional studies, including gene knockdown analyses or CRISPR-based interventions, where understanding the precise causal mechanisms is essential for effective targeting and interpretation (Ford et al. 2019).

Applying the HVP model to data from the UKB, we found that horizontal pleiotropy significantly contributes to the genetic correlations between MetS and traits such as type 2 diabetes, C-reactive protein, sleep apnoea, and cholelithiasis. Previous studies reported significant genetic relationships between MetS and these traits (van Walree et al. 2022), which may primarily be driven by horizontal pleiotropy. In contrast, vertical pleiotropy was more prominent in the relationships between MetS and traits like cardiovascular diseases, serum vitamin D level, and chronic kidney disease, and that between BMI and MetS. While earlier studies reported significant genetic correlations between MetS and these traits (van Walree et al. 2022; Vattikuti et al. 2012; Chen et al. 2019), our analysis reveals that these correlations primarily arise from causal pathways linking MetS to these outcomes, rather than shared genetic effects directly influencing both MetS and these traits. These findings underscore the distinct mechanisms underlying the associations between MetS and related traits, with important implications for risk prediction and the development of targeted interventions, such as gene knockdown studies or CRISPR-based therapies.

The delta method, which assumes that τ is estimated without error, may lead to an underestimation of the standard error of the corrected genetic correlation. However, we find that the theoretical standard error closely matches the empirical standard deviation, and importantly, the empirical coverage of the 95% confidence intervals is close to the nominal level. This suggests that the estimated standard errors are reasonably well calibrated in practice (see Supplementary Table 12). Moreover, sensitivity analyses are recommended in scenarios with small sample sizes or weak genetic associations to ensure the robustness and careful interpretation of findings.
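One way to run such a sensitivity analysis is to compare a delta-method standard error that treats τ as fixed with one that also propagates the uncertainty in τ. The sketch below does this with a numerical gradient. It assumes the linear model c = g_c + ε, y = τ·c + g_y + e for the correction, hypothetical estimates and standard errors, and a diagonal sampling covariance matrix; in a real analysis the full sampling covariance from the REML fit would be used, and none of the names below come from the HVP package.

```python
import math


def corrected_rg(var_gc, var_y_total, cov_total, tau):
    """Genetic correlation after removing the assumed vertical component."""
    cov_h = cov_total - tau * var_gc
    var_gy = var_y_total - 2 * tau * cov_total + tau ** 2 * var_gc
    return cov_h / math.sqrt(var_gc * var_gy)


def delta_method_se(func, estimates, sampling_cov, step=1e-6):
    """First-order delta method: sqrt(g' V g) with a central-difference gradient g."""
    grad = []
    for i in range(len(estimates)):
        up, down = list(estimates), list(estimates)
        up[i] += step
        down[i] -= step
        grad.append((func(*up) - func(*down)) / (2 * step))
    k = len(grad)
    quad = sum(grad[i] * sampling_cov[i][j] * grad[j]
               for i in range(k) for j in range(k))
    return math.sqrt(quad)


def diagonal_cov(standard_errors):
    """Sampling covariance with zero off-diagonals (a simplifying assumption)."""
    k = len(standard_errors)
    return [[standard_errors[i] ** 2 if i == j else 0.0 for j in range(k)]
            for i in range(k)]


# Hypothetical estimates: var_gc, var_y_total, cov_total, tau.
est = [0.5, 0.405, 0.25, 0.3]
ses = [0.02, 0.02, 0.015, 0.03]

# tau treated as known: its sampling variance is set to zero.
se_tau_fixed = delta_method_se(corrected_rg, est, diagonal_cov(ses[:3] + [0.0]))
# tau treated as an estimate with its own standard error.
se_tau_noisy = delta_method_se(corrected_rg, est, diagonal_cov(ses))
print(f"SE, tau fixed: {se_tau_fixed:.4f}; SE, tau uncertain: {se_tau_noisy:.4f}")
```

When τ is independent of the variance-component estimates, the second standard error can never be smaller than the first, so the gap between the two gives a quick sense of how much the fixed-τ assumption matters for a given trait pair.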

Fitting an exposure as a covariate in standard mixed models is a common approach for estimating genetic parameters such as heritability and genetic correlation, and may partially address vertical pleiotropy. However, as shown in previous studies (Aschard et al. 2015; Wang et al. 2024), this approach can introduce collider bias. Specifically, genetic variants that affect the exposure but have no true effect on the outcome may appear spuriously associated with the outcome once the exposure is conditioned on, due to induced statistical dependencies. This results in biased SNP effect estimates, which in turn bias heritability and genetic correlation estimates. Moreover, this approach becomes less feasible and more problematic in the context of bivariate models. In contrast, the HVP model explicitly accounts for the causal path from exposure to outcome and provides a more robust framework than existing methods.
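The collider mechanism described above can be seen in a small simulation. The sketch below uses a hypothetical setup: a genetic score g affects the exposure c, an unmeasured factor u affects both c and the outcome y, and neither g nor c has any effect on y. All effect sizes, the sample size, and the helper names are illustrative assumptions.

```python
import random


def slope(y, x):
    """Simple-regression slope of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ols_two_predictors(y, x1, x2):
    """Slopes of y on x1 and x2 (with intercept) from centred normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(1)
n = 20000
g = [rng.gauss(0, 1) for _ in range(n)]                        # genetic score
u = [rng.gauss(0, 1) for _ in range(n)]                        # unmeasured shared cause
c = [0.5 * gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]  # exposure
y = [ui + rng.gauss(0, 1) for ui in u]                         # outcome, no effect of g or c

marginal_g = slope(y, g)                      # close to zero: g does not affect y
adjusted_g, adjusted_c = ols_two_predictors(y, g, c)
print(f"g effect, unadjusted: {marginal_g:.3f}; adjusted for c: {adjusted_g:.3f}")
```

In this setup the expected coefficient of g is zero without adjustment and about -0.25 once c is included as a covariate, so the adjustment creates an association that the data-generating model does not contain.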

In summary, our findings validate the relevance and utility of the HVP model in unraveling the complex genetic architecture of traits like MetS, providing deeper insights into how genetic effects are shared across various health conditions such as BMI, diabetes, sleep apnoea, cholelithiasis, CKD, cardiovascular diseases, and CRP levels. The HVP model effectively separates the contributions of horizontal and vertical pleiotropy, offering a clearer understanding of the genetic mechanisms linking these traits. By integrating these insights into shared genetic pathways, our approach improves the accuracy with which we identify key genetic drivers across multiple conditions. This enhanced understanding can help inform more targeted public health strategies, enabling early interventions and personalized treatment approaches.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (337.5 KB, docx)

Acknowledgements

This research is supported by the Australian Research Council (DP190100766). L.D.A. acknowledges funding from the Enterprise Research Scholarship of University of South Australia. We obtained real data from the UK Biobank, and we would like to acknowledge them for providing access to the data. Our approved reference number for UK Biobank is 14575. The UK Biobank is funded by the UK Department of Health, the Medical Research Council, the Scottish Executive, and the Wellcome Trust medical research charity. The analyses were performed using computational resources provided by the Australian Government through Gadi under the National Computational Merit Allocation Scheme (NCMAS), as well as resources at the University of South Australia (UniSA) and HPCs (Statgen and Statgen 2 servers) managed by UniSA IT. We thank the University of South Australia IT team for their support in accessing these servers. Finally, we would like to thank the Statistical Genetics Group at the Australian Center for Precision Health for their support in providing quality-controlled genotypic and phenotypic data.

Author contributions

H.L conceived the idea, derived the proposed HVP models, and supervised the study. L.A contributed to performing simulations, data extraction, data analysis, and visualizations. Both L.A and H.L wrote the first draft of the manuscript. H.L initially created the computer code for simulations and analyses and reviewed the simulations, data extraction, data analysis, and visualizations throughout the study. L.A and H.L also worked on preparing the R package. T. L, N. M, and E. H reviewed the manuscript and provided critical feedback and suggestions. All authors discussed the results and contributed to finalizing the manuscript. All authors reviewed the manuscript.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions

Data availability

No datasets were generated or analysed during the current study.

Code availability

The proposed methods are implemented in R packages that are publicly available on GitHub: https://github.com/lamessad/HVP for individual-level data and https://github.com/lamessad/HVPSUM for summary data. All MR methods used are publicly available R packages, with links given in the web resources below. The source code for MTG2 version 2.22 is publicly available at https://sites.google.com/view/s-hong-lee-homepage/mtg2. The source code of LDSC is available at https://github.com/bulik/ldsc. Web resources: TwoSampleMR, https://mrcieu.github.io/TwoSampleMR; MR-IVW, MR-Egger, MR-Weighted-Median, MR-Weighted-Mode, MR-ContMix, https://cran.r-project.org/web/packages/MendelianRandomization; MR-PRESSO, https://github.com/rondolab/MR-PRESSO; IMRP, https://github.com/XiaofengZhuCase/IMRP; GSMR, http://cnsgenomics.com/software/gsmr/; MRLOVA, https://github.com/lamessad/MRLOVA.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Lamessa Dube Amente, Email: lamessa.amente@mymail.unisa.edu.au.

S. Hong Lee, Email: hong.lee@unisa.edu.au.

References

Alberti KG, Eckel RH, Grundy SM, Zimmet PZ, Cleeman JI, Donato KA et al (2009) Harmonizing the metabolic syndrome: a joint interim statement of the International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation 120(16):1640–1645
Amente LD, Mills NT, Le TD, Hypponen E, Lee SH (2025) A latent outcome variable approach for Mendelian randomization using the stochastic expectation maximization algorithm. Hum Genet 144(5):559–574
Aschard H, Vilhjalmsson BJ, Joshi AD, Price AL, Kraft P (2015) Adjusting for heritable covariates can bias effect estimates in genome-wide association studies. Am J Hum Genet 96(2):329–339
Bowden J, Davey Smith G, Burgess S (2015) Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol 44(2):512–525
Bowden J, Davey Smith G, Haycock PC, Burgess S (2016) Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol 40(4):304–314
Bowden J, Del Greco MF, Minelli C, Zhao Q, Lawlor DA, Sheehan NA et al (2019) Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol 48(3):728–742
Brainstorm Consortium, Anttila V, Bulik-Sullivan B, Finucane HK, Walters RK, Bras J et al (2018) Analysis of shared heritability in common disorders of the brain. Science 360(6395)
Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Loh PR et al (2015) An atlas of genetic correlations across human diseases and traits. Nat Genet 47(11):1236–1241
Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM et al (2019) Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res 4:186
Burgess S, Foley CN, Allara E, Staley JR, Howson JMM (2020) A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun 11(1):376
Chen X, Bhuiyan I, Kuja-Halkola R, Magnusson PKE, Svensson P (2019) Genetic and environmental influences on the correlations between traits of metabolic syndrome and CKD. Clin J Am Soc Nephrol 14(11):1590–1596
Chesmore K, Bartlett J, Williams SM (2018) The ubiquity of pleiotropy in human disease. Hum Genet 137(1):39–44
Cross-Disorder Group of the Psychiatric Genomics Consortium, Lee SH, Ripke S, Neale BM, Faraone SV, Purcell SM et al (2013) Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat Genet 45(9):984–994
Davey Smith G, Hemani G (2014) Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet 23(R1):R89–98
Ford K, McDonald D, Mali P (2019) Functional genomics via CRISPR-Cas. J Mol Biol 431(1):48–65
Hackinger S, Zeggini E (2017) Statistical methods to detect pleiotropy in human complex traits. Open Biol 7(11)
Hemani G, Bowden J, Davey Smith G (2018) Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet 27(R2):R195–R208
Jang SK, Saunders G, Liu M, 23andMe Research Team, Jiang Y, Liu DJ et al (2022) Genetic correlation, pleiotropy, and causal associations between substance use and psychiatric disorder. Psychol Med 52(5):968–978
Lee SH, van der Werf JH (2016) MTG2: an efficient algorithm for multivariate linear mixed model analysis based on genomic information. Bioinformatics 32(9):1420–1422
Lee SH, Wray NR, Goddard ME, Visscher PM (2011) Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet 88(3):294–305
Lee SH, Yang J, Goddard ME, Visscher PM, Wray NR (2012a) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics 28(19):2540–2542
Lee SH, Goddard ME, Wray NR, Visscher PM (2012b) A better coefficient of determination for genetic profile analysis. Genet Epidemiol 36(3):214–224
Lin L, Tan W, Pan X, Tian E, Wu Z, Yang J (2022) Metabolic syndrome-related kidney injury: a review and update. Front Endocrinol (Lausanne) 13:904001
Loh PR, Bhatia G, Gusev A, Finucane HK, Bulik-Sullivan BK, Pollack SJ et al (2015) Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat Genet 47(12):1385–1392
Mottillo S, Filion KB, Genest J, Joseph L, Pilote L, Poirier P et al (2010) The metabolic syndrome and cardiovascular risk: a systematic review and meta-analysis. J Am Coll Cardiol 56(14):1113–1132
Ni G, Moser G, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Wray NR, Lee SH (2018) Estimation of genetic correlation via linkage disequilibrium score regression and genomic restricted maximum likelihood. Am J Hum Genet 102(6):1185–1194
Qi G, Chatterjee N (2019) Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects. Nat Commun 10(1):1941
Ridker PM, Pare G, Parker A, Zee RY, Danik JS, Buring JE et al (2008) Loci related to metabolic-syndrome pathways including LEPR, HNF1A, IL6R, and GCKR associate with plasma C-reactive protein: the women’s genome health study. Am J Hum Genet 82(5):1185–1192
Sivakumaran S, Agakov F, Theodoratou E, Prendergast JG, Zgaga L, Manolio T et al (2011) Abundant pleiotropy in human complex diseases and traits. Am J Hum Genet 89(5):607–618
Slob EAW, Burgess S (2020) A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol 44(4):313–329
Solovieff N, Cotsapas C, Lee PH, Purcell SM, Smoller JW (2013) Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet 14(7):483–495
Timpson NJ, Lawlor DA, Harbord RM, Gaunt TR, Day IN, Palmer LJ et al (2005) C-reactive protein and its role in metabolic syndrome: Mendelian randomisation study. Lancet 366(9501):1954–1959
van Rheenen W, Peyrot WJ, Schork AJ, Lee SH, Wray NR (2019) Genetic correlations of polygenic disease traits: from theory to practice. Nat Rev Genet 20(10):567–581
van Walree ES, Jansen IE, Bell NY, Savage JE, de Leeuw C, Nieuwdorp M et al (2022) Disentangling genetic risks for metabolic syndrome. Diabetes 71(11):2447–2457
Vattikuti S, Guo J, Chow CC (2012) Heritability and genetic correlations explained by common SNPs for metabolic syndrome traits. PLoS Genet 8(3):e1002637
Verbanck M, Chen CY, Neale B, Do R (2018) Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet 50(5):693–698
Wang P, Lin Z, Xue H, Pan W (2024) Collider bias correction for multiple covariates in GWAS using robust multivariable Mendelian randomization. PLoS Genet 20(4):e1011246
Xu H, Choi SE, Kang JK, Park DJ, Lee JK, Lee SS (2020) Basal metabolic rate and Charlson comorbidity index are independent predictors of metabolic syndrome in patients with rheumatoid arthritis. Joint Bone Spine 87(5):455–460
Yang J, Lee SH, Goddard ME, Visscher PM (2011) GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 88(1):76–82
Yang J, Lee SH, Goddard ME, Visscher PM (2013) Genome-wide complex trait analysis (GCTA): methods, data analyses, and interpretations. Methods Mol Biol 1019:215–236
Zhang Y, Cheng Y, Jiang W, Ye Y, Lu Q, Zhao H (2021) Comparison of methods for estimating genetic correlation between complex traits using GWAS summary statistics. Brief Bioinform 22(5)
Zhu Z, Zheng Z, Zhang F, Wu Y, Trzaskowski M, Maier R et al (2018) Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun 9(1):224
Zhu X, Li X, Xu R, Wang T (2021) An iterative approach to detect pleiotropy and perform Mendelian randomization analysis using GWAS summary statistics. Bioinformatics 37(10):1390–1400
Zhu Q, Xing Y, Fu Y, Chen X, Guan L, Liao F et al (2023) Causal association between metabolic syndrome and cholelithiasis: a Mendelian randomization study. Front Endocrinol (Lausanne) 14:1180903

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Material 1^{(337.5KB, docx)}

Data Availability Statement

No datasets were generated or analysed during the current study.

GSMR, http://cnsgenomics.com/software/gsmr/, MRLOVA, https://github.com/lamessad/MRLOVA.
