American Journal of Human Genetics. 2013 Feb 7;92(2):197–209. doi: 10.1016/j.ajhg.2013.01.001

Improved Detection of Common Variants Associated with Schizophrenia by Leveraging Pleiotropy with Cardiovascular-Disease Risk Factors

Ole A Andreassen 1,2,3, Srdjan Djurovic 1,2, Wesley K Thompson 3, Andrew J Schork 4,5,6, Kenneth S Kendler 7, Michael C O’Donovan 8, Dan Rujescu 9, Thomas Werge 10, Martijn van de Bunt 11, Andrew P Morris 11, Mark I McCarthy 11; International Consortium for Blood Pressure GWAS; Diabetes Genetics Replication and Meta-analysis Consortium; Psychiatric Genomics Consortium Schizophrenia Working Group, J Cooper Roddey 4,13, Linda K McEvoy 4,12, Rahul S Desikan 4,12, Anders M Dale 3,4,12,13,∗∗
PMCID: PMC3567279  PMID: 23375658

Abstract

Several lines of evidence suggest that genome-wide association studies (GWASs) have the potential to explain more of the “missing heritability” of common complex phenotypes. However, reliable methods for identifying a larger proportion of SNPs are currently lacking. Here, we present a genetic-pleiotropy-informed method for improving gene discovery with the use of GWAS summary-statistics data. We applied this methodology to identify additional loci associated with schizophrenia (SCZ), a highly heritable disorder with significant missing heritability. Epidemiological and clinical studies suggest comorbidity between SCZ and cardiovascular-disease (CVD) risk factors, including systolic blood pressure, triglycerides, low- and high-density lipoprotein, body mass index, waist-to-hip ratio, and type 2 diabetes. Using stratified quantile-quantile plots, we show enrichment of SNPs associated with SCZ as a function of the association with several CVD risk factors and a corresponding reduction in false discovery rate (FDR). We validate this “pleiotropic enrichment” by demonstrating increased replication rate across independent SCZ substudies. Applying the stratified FDR method, we identified 25 loci associated with SCZ at a conditional FDR level of 0.01. Of these, ten loci are associated with both SCZ and CVD risk factors, mainly triglycerides and low- and high-density lipoproteins but also waist-to-hip ratio, systolic blood pressure, and body mass index. Together, these findings suggest the feasibility of using genetic-pleiotropy-informed methods for improving gene discovery in SCZ and identifying potential mechanistic relationships with various CVD risk factors.

Introduction

Complex human traits and disorders are influenced by numerous genes that each have small individual effects,1 and thousands of SNPs have been identified by genome-wide association studies (GWASs).2,3 However, these SNPs fail to explain a substantial proportion of the heritability of the complex phenotypes studied;4 this is often referred to as the “missing heritability.” Recent results indicate that GWASs have the potential to explain a greater proportion of the heritability of common complex phenotypes,5,6 and additional SNPs are likely to be identified in larger samples.7 Because of the polygenic architecture of most complex traits and disorders, a large number of SNPs have associations too weak to be identified in the currently available sample sizes.4 Cost-effective analytical methods are needed for reliably identifying a larger proportion of SNPs associated with complex diseases and phenotypes given that recruitment and genotyping of new participants are expensive. Here, we present a genetic-pleiotropy-informed approach for GWASs to capture more of the polygenic effects in complex disorders and traits. Given the high number of traits in humans and the relatively small number of genes (∼20,000), some genes have to affect multiple traits (genetic pleiotropy).8 By combining independent GWASs from associated traits or comorbid disorders, we hypothesize that a genetic-pleiotropy-informed approach can significantly improve discovery of genes and help capture a greater proportion of the missing heritability.

Reports indicate overlapping SNPs between several human traits9,10 and disorders.11,12 To date, methods for assessing genetic pleiotropy have not taken full advantage of the existing GWAS data, and the majority of these studies have focused on the subset of SNPs exceeding a Bonferroni-corrected threshold of significance for each trait or disorder.10,13 However, this approach cannot detect SNPs that reach genome-wide significance in the combined analysis (hereafter referred to as polygenic pleiotropy) but do not meet Bonferroni-corrected significance in the individual phenotype. Combining GWASs from two traits or disorders also provides increased power to discover genes associated with common biological mechanisms and to potentially inform shared pathophysiological relationships between the phenotypes. In the current study, we use schizophrenia (SCZ [MIM 181500]) as an example of how a pleiotropy-informed analytical approach can improve gene discovery in a disorder with high heritability14 and for which, despite recent discoveries,13,15,16 most of the underlying genetic architecture remains unknown.13

SCZ, a debilitating mental health disorder, is among the leading global causes of disability17 and constitutes a substantial portion of disease burden worldwide. Systematic reviews and meta-analyses indicate that individuals with SCZ have significantly higher mortality rates than the general population, and this corresponds to a 10–20 year reduction in life expectancy.18–20 Although the mortality rate from suicide is high, lifestyle and cardiovascular-disease (CVD) risk factors contribute substantially to life-expectancy reduction in SCZ.19–21 Epidemiological research has shown increased rates of dyslipidemia, type 2 diabetes (T2D [MIM 125853]), and obesity (MIM 601665) and a high prevalence of metabolic syndrome among people with SCZ.22 This increase in CVD risk factors has been primarily attributed to lifestyle factors such as unhealthy diet, sedentary habits, excessive smoking, and the side effects of antipsychotic medication.19,23–25 However, as suggested by studies predating the introduction of antipsychotics,26 studies of untreated first-episode individuals and their healthy relatives,27 and the identification of overlapping candidate genes,28 shared genetics between SCZ and CVD risk factors might also be of importance.

Large GWASs have reported SNPs associated with a number of CVD risk factors, including systolic blood pressure (SBP), diastolic blood pressure (DBP),29 low-density lipoprotein (LDL) cholesterol,30 high-density lipoprotein (HDL) cholesterol,30 triglycerides (TGs),30 T2D,31 body mass index (BMI),32 and waist-to-hip ratio (WHR).33 In the current study, we employed model-free strategies and leveraged the power of multiple large independent GWASs to identify SNPs exhibiting pleiotropy between SCZ and eight CVD risk factors by using summary statistics from six studies. After applying genomic inflation control, we computed the stratified empirical cumulative distribution functions (cdfs) of the nominal p values. Strata were determined by the relative enrichment of pleiotropic SNPs in SCZ as a function of increased nominal p values in the different CVD risk factors. For each nominal p value, an estimate of the stratum-specific true discovery rate (TDR = 1 – false discovery rate [FDR]) was obtained from the empirical cdfs.34,35 We demonstrate that the stratified analysis improves power to detect SNPs by computing replication rates for nominal p value thresholds by using independent substudies for discovery and replication samples. We show that for a given replication rate, nominal p value thresholds are approximately 100 times larger for the most pleiotropic SNPs in SCZ than for all SNPs (hereafter referred to as genetic enrichment). Using this stratified methodology, we constructed a two-dimensional (2D) FDR “look-up” table in which the FDR in SCZ SNPs was computed conditionally on nominal CVD-risk-factor p values (this is referred to as conditional FDR). Using this table, we identified 25 loci that are significantly associated with SCZ at a conditional FDR level of 0.01. Finally, we constructed the conjunction FDR to investigate SNPs significantly associated with both SCZ and CVD risk factors. 
Specifically, we computed the conditional FDR for SCZ given CVD-risk-factor nominal p values, as well as conditional FDR for CVD risk factors given SCZ nominal p values, and we took the maximum of both values as the conjunction FDR. With this approach, we identified ten pleiotropic loci implicating overlapping genetic mechanisms between SCZ and blood lipids.

Material and Methods

Participant Samples

We obtained complete GWAS results in the form of summary-statistics p values from public-access websites or through collaboration with investigators (T2D cases and controls were from the Diabetes Genetics Replication and Meta-analysis [DIAGRAM] Consortium, and SCZ cases and controls were from the Psychiatric GWAS Consortium [PGC] [Table S1, available online]). There was no overlap between participants in the CVD GWAS and the SCZ case-control sample (n = 21,856), except for 2,974 of 12,462 (24%) controls.13

The SCZ GWAS summary-statistics results were obtained from the PGC,13 which consists of 9,394 cases with SCZ or schizoaffective disorder and 12,462 controls (52% screened) from a total of 17 samples from 11 countries. The quality of phenotypic data was verified by a systematic review of data-collection methods and procedures at each site, and only studies that fulfilled these criteria were included. This involved the following nine key items: (1) the use of a structured psychiatric interview, (2) systematic training of interviewers in the use of the instrument, (3) systematic quality control of diagnostic accuracy, (4) reliability trials, (5) review of medical-record information, (6) best-estimate procedure employed, (7) specific inclusion and exclusion criteria developed and utilized, (8) final diagnostic determination made by MDs or PhDs, and (9) special additional training for the final SCZ PGC sample. One sample from Sweden used another approach, but further empirical support for the validity of this approach was provided. Controls consisted of 12,462 European-ancestry samples collected from the same countries. Because the prevalence of SCZ is low, a large control sample in which some controls were not screened for SCZ was utilized. For further details on sample characteristics and quality-control procedures applied, please see Ripke et al.13 A total of 2,974 controls in the SCZ UK case-control sample16 from the Wellcome Trust Case Control Consortium (WTCCC) were also included in several of the CVD-risk-factor GWASs. This constitutes 24% of the total number of controls (n = 12,462) in the SCZ PGC sample.13

More information about inclusion criteria and phenotype characteristics of the CVD-risk-factor samples of the different GWASs is included in the original publications.29–33 The relevant institutional review boards or ethics committees approved the research protocol of the individual GWASs used in the current analysis, and all human participants gave written informed consent.

Statistical Analyses

Stratified Quantile-Quantile Plots

Quantile-quantile (Q-Q) plots compare a nominal probability distribution against an empirical distribution. In the presence of all null relationships, nominal p values form a straight line on a Q-Q plot when they are plotted against the empirical distribution. For each phenotype, for all SNPs, and for each categorical subset (strata), −log10 nominal p values were plotted against −log10 empirical p values (stratified Q-Q plots). Leftward deflections of the observed distribution from the projected null line reflect increased tail probabilities in the distribution of test statistics (Z scores) and, consequently, an overabundance (also termed “enrichment”) of low p values compared to that expected by chance.

Under large-scale testing paradigms, such as GWASs, quantitative estimates of probably true associations can be estimated from the distributions of summary statistics.36,37 A common method for visualizing the enrichment of statistical association relative to that expected under the global null hypothesis is through Q-Q plots of nominal p values obtained from GWAS summary statistics. The usual Q-Q curve has the nominal p value, denoted by “p,” as the y ordinate and the corresponding value of the empirical cdf, denoted by “q,” as the x ordinate. Under the global null hypothesis, the theoretical distribution is uniform on the interval [0, 1]. As is common in GWASs, we instead plot −log10(p) against −log10(q) to emphasize tail probabilities of the theoretical and empirical distributions. Therefore, genetic enrichment results in a leftward shift in the Q-Q curve, corresponding to a larger fraction of SNPs with a nominal −log10 p value greater than or equal to a given threshold. Stratified Q-Q plots are constructed by the creation of subsets of SNPs on the basis of levels of an auxiliary measure for each SNP and the computation of Q-Q plots separately for each level. If SNP enrichment is captured by variation in the auxiliary measure, this is expressed as successive leftward deflections in a stratified Q-Q plot as levels of the auxiliary measure increase.

Genomic Control

The empirical null distribution in GWASs is affected by global variance inflation due to population stratification and cryptic relatedness38 and deflation due to overcorrection of test statistics for polygenic traits by standard genomic-control methods,39 in addition to incorrect asymptotic approximation used for computing the p values. We applied a control method leveraging only intergenic SNPs, which are most likely depleted for true associations (unpublished data). First, we annotated the SNPs to genic (5′ UTR, exon, intron, and 3′ UTR) and intergenic regions by using information from the 1000 Genomes Project (1KGP). As illustrated in Figure S1, SNPs in functional genic regions are more enriched for association with SCZ than SNPs in intergenic regions. We used intergenic SNPs because their relative depletion of associations suggests that they provide a robust estimate of true null effects, and they thus seem to be a better category for genomic control than all SNPs. We converted all p values to Z scores, and for each phenotype, we estimated the genomic inflation factor λGC for intergenic SNPs. We computed the inflation factor λGC as the median of the squared Z scores divided by the expected median of a chi-square distribution with one degree of freedom, and we adjusted all test statistics by λGC. The stratified Q-Q plot for SCZ after control for genomic inflation is shown in Figure S1.
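The inflation-factor calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, p values are assumed to be two-tailed and strictly greater than 0, and "adjusted by λGC" is assumed to mean dividing each squared Z score by λGC, as in standard genomic control.

```python
from statistics import NormalDist, median

STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df: the squared 0.75 normal quantile (~0.455).
CHI2_1DF_MEDIAN = STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_abs_z(p):
    """Convert a two-tailed p value (0 < p <= 1) to an absolute Z score."""
    return abs(STD_NORMAL.inv_cdf(p / 2.0))


def lambda_gc(intergenic_p_values):
    """Median squared Z score of intergenic SNPs divided by the null median."""
    z_squared = [p_to_abs_z(p) ** 2 for p in intergenic_p_values]
    return median(z_squared) / CHI2_1DF_MEDIAN


def adjust_p_values(p_values, lam):
    """Assumed adjustment: divide each squared Z score by lam, then convert back to p."""
    adjusted = []
    for p in p_values:
        z_adj = p_to_abs_z(p) / lam ** 0.5
        adjusted.append(2.0 * (1.0 - STD_NORMAL.cdf(z_adj)))
    return adjusted
```

In this sketch, `lambda_gc` would be called once per phenotype on the intergenic SNPs only, and the resulting factor would then be passed to `adjust_p_values` for all SNPs of that phenotype.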

Stratified Q-Q Plots for Pleiotropic Enrichment

To assess pleiotropic enrichment, we used a Q-Q plot stratified by “pleiotropic” effects. For a given associated phenotype, enrichment for pleiotropic signals is present if the degree of deflection from the expected null line is dependent on SNP associations with the second phenotype. We constructed stratified Q-Q plots of empirical quantiles of nominal –log10(p) values for SNP association with SCZ for all SNPs, as well as for subsets (strata) of SNPs determined by the nominal p values of their association with a given CVD risk factor. Specifically, we computed the empirical cumulative distribution of nominal p values for a given phenotype for all SNPs and for SNPs with significance levels below the indicated cutoffs for the other phenotype (–log10(p) ≥ 0, –log10(p) ≥ 1, –log10(p) ≥ 2, and –log10(p) ≥ 3 corresponding to p < 1, p < 0.1, p < 0.01, and p < 0.001, respectively). The nominal p values (–log10(p)) are plotted on the y axis, and the empirical quantiles (–log10(q), where q = cdf(p)) are plotted on the x axis. To assess polygenic effects below the standard GWAS significance threshold, we focused the stratified Q-Q plots on SNPs with nominal –log10(p) < 7.3 (corresponding to p > 5 × 10−8).
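The construction of the stratified curves can be illustrated with a short sketch. It assumes two equal-length lists of nominal p values for the same SNPs (the names `p_scz` and `p_cvd` are hypothetical), uses "less than or equal to" at each cutoff, and gives tied p values consecutive ranks; plotting is left out.

```python
import math


def stratified_qq(p_scz, p_cvd, cutoffs=(1.0, 0.1, 0.01, 0.001)):
    """Return {cutoff: [(x, y), ...]} with x = -log10(q) and y = -log10(p).

    A SNP enters a stratum when its CVD-risk-factor p value is <= the cutoff.
    q is the empirical cdf of the SCZ p values within that stratum.
    """
    curves = {}
    for cutoff in cutoffs:
        stratum = sorted(p1 for p1, p2 in zip(p_scz, p_cvd) if p2 <= cutoff)
        n = len(stratum)
        points = []
        for rank, p in enumerate(stratum, start=1):
            q = rank / n  # fraction of stratum SNPs with p value <= p
            points.append((-math.log10(q), -math.log10(p)))
        curves[cutoff] = points
    return curves
```

Under enrichment, the curves for stricter cutoffs would lie progressively further to the left of the line x = y.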

Significance of Enrichment

Using Q-Q plots (with 95% confidence intervals [CIs]) of empirical versus nominal –log10(p) values in SCZ as a function of the significance of association with the CVD risk factors, we estimated the significance of the polygenic enrichment (Figures S2A–S2C). After using intergenic SNPs to estimate and control for genomic inflation (Figure S1), we pruned the SNPs by removing SNPs in linkage disequilibrium (LD) (r2 ≥ 0.2) and computed 95% CIs for the Q-Q plots. From these CIs, we calculated standard errors and used two sample t tests to estimate the difference (degree of departure) between the empirical distribution of SCZ (phenotype 1) SNPs that were above a given association threshold (–log10(p) > 1, –log10(p) > 2, –log10(p) > 3, and –log10(p) > 4; red lines) and the distribution of SNPs with –log10(p) ≤ 1 for the CVD-risk-factor (phenotype 2) category (blue line). The p values listed in Table S5 indicate the most significant difference, as assessed by a two-sample t test, between the red (–log10(p) > 1, 2, 3, or 4) and blue (–log10(p) ≤ 1) lines. This is reflected in the largest difference between the 95% CIs. The 95% CIs also illustrate the region containing significant differences in the Q-Q plot. For differences between the distributions, we only report p values appearing above the –log10(p) > 2 threshold on the Q-Q plots. This clearly shows significant enrichment conditioning SCZ on TG and WHR.

For the CVD risk factors with significant enrichment, we further calculated the significance of the enrichment for SNPs with a Fisher’s combined p value below the genome-wide significance level of 5 × 10−8, and we hereafter refer to this as “censoring.” This made it possible to examine whether the enrichment was entirely explained by the most significant pleiotropic SNPs or whether it was due to a more general, polygenic effect (polygenic pleiotropy). As illustrated for SCZ and TG in Figure S2C, the polygenic pleiotropic enrichment is highly significant. See also Table S5.

Stratified TDR

Enrichment seen in the stratified Q-Q plots can be directly interpreted in terms of TDR (equivalent to 1 − FDR40). We applied the stratified FDR method,35 previously used for GWAS enrichment based on linkage information.34 Specifically, for a given p value cutoff, the FDR is defined as

$$\mathrm{FDR}(p) = \frac{\pi_0 F_0(p)}{F(p)},$$ (Equation 1)

where π0 is the proportion of null SNPs, F0 is the null cdf, and F is the cdf of all SNPs, both null and non-null; see below for details on this simple mixture-model formulation.41 Under the null hypothesis, F0 is the cdf of the uniform distribution on the unit interval [0, 1], so Equation 1 reduces to

$$\mathrm{FDR}(p) = \frac{\pi_0\, p}{F(p)}.$$ (Equation 2)

The cdf F can be estimated by the empirical cdf q = Np / N, where Np is the number of SNPs with a p value less than or equal to p and N is the total number of SNPs. By replacing F with q in Equation 2, we get

$$\text{Estimated FDR}(p) = \frac{\pi_0\, p}{q},$$ (Equation 3)

which is biased upward as an estimate of the FDR.41 Replacing π0 in Equation 3 with unity gives an estimated FDR that is further biased upward:

$$q^{*} = \frac{p}{q}.$$ (Equation 4)

If π0 is close to 1, as is probably true for most GWASs, the increase in bias from Equation 3 is minimal. The quantity 1 – p / q is therefore biased downward and hence is a conservative estimate of the TDR.

Referring to the formulation of the Q-Q plots, we see that q* is equivalent to the nominal p value divided by the empirical quantile, as defined earlier. Given the –log10 scale of the Q-Q plots, we can easily obtain

$$-\log_{10}(q^{*}) = \log_{10}(q) - \log_{10}(p),$$ (Equation 5)

demonstrating that the (conservatively) estimated FDR is directly related to the horizontal shift of the curves in the stratified Q-Q plots from the expected line x = y, i.e., a larger shift corresponds to a smaller FDR, as illustrated in Figure 1. As before, the estimated TDR can be obtained as 1 − FDR. For each range of p values (stratum) in a pleiotropic trait, we calculated the TDR as a function of p values in SCZ (indicated by different colored curves) in Figure 1 by using each observed p value as a threshold according to Equation 5.
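As a worked illustration of this estimate, the sketch below computes the conservative FDR p / q and the corresponding TDR for a list of p values, where q is the empirical cdf at each observed p value. The function names are hypothetical, and capping the estimate at 1 is an added convenience rather than something stated in the text. Passing only the p values of one stratum gives the stratum-specific estimate.

```python
def conservative_fdr(p_values):
    """Return min(1, p / q) for each SNP, in input order.

    q is the empirical cdf at p: (rank of p among all p values) / N.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    fdr = [0.0] * n
    for rank, i in enumerate(order, start=1):
        q = rank / n
        fdr[i] = min(1.0, p_values[i] / q)
    return fdr


def conservative_tdr(p_values):
    """TDR estimate, 1 - FDR, which is biased downward."""
    return [1.0 - f for f in conservative_fdr(p_values)]
```

On the Q-Q plot, a smaller value of p / q at a given p corresponds to a larger horizontal distance between the curve and the null line.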

Figure 1. Enrichment and Replication

(A and B) Stratified Q-Q plot of nominal versus empirical –log10 p values (corrected for inflation) in SCZ below the standard GWAS threshold of p < 5 × 10−8 as a function of significance of association with (A) TGs and (B) WHR at the levels of –log10(p) > 0, –log10(p) > 1, –log10(p) > 2, and –log10(p) > 3, which correspond to p < 1, p < 0.1, p < 0.01, and p < 0.001, respectively. Dashed lines indicate the null hypothesis.

(C and D) Stratified TDR plots illustrating the TDR increase associated with increased pleiotropic enrichment in (C) SCZ conditioned on TG (SCZ|TG) and (D) SCZ conditioned on WHR (SCZ|WHR).

(E and F) Cumulative replication plot showing the average rate of replication (p < 0.05) within SCZ substudies for a given p value threshold demonstrates that pleiotropic enriched SNP categories replicate at a higher rate in independent SCZ samples for (E) SCZ conditioned on TG (SCZ|TG) and (F) SCZ conditioned on WHR (SCZ|WHR). The vertical intercept is the overall replication rate per category.

Estimates of Pleiotropy

Let z be the GWAS test statistic for a corresponding p value. The two-group mixture model for Z scores implicit in Equation 1 is given by

$$f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z),$$ (Equation 6)

where f0 is the null distribution (standard normal after appropriate genomic control), f1 is the non-null distribution (which can be estimated parametrically or nonparametrically),36 and π0 is the proportion null, as before. We can easily generalize this model to Z scores from two phenotypes simultaneously (z1 for phenotype 1 and z2 for phenotype 2) by using a bivariate density from the four-group mixture model,

$$f(z_1, z_2) = \pi_0 f_0(z_1, z_2) + \pi_1 f_1(z_1, z_2) + \pi_2 f_2(z_1, z_2) + \pi_3 f_3(z_1, z_2),$$ (Equation 7)

where π0 is the proportion of SNPs for which both phenotypes are null, π1 is the proportion of SNPs for which phenotype 1 is non-null and phenotype 2 is null, π2 is the proportion of SNPs for which phenotype 1 is null and phenotype 2 is non-null, and π3 is the proportion of SNPs for which both phenotypes are non-null (i.e., the pleiotropic SNPs). The mixture densities in Equation 7 are given by

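$$\begin{aligned}
f_0(z_1, z_2) &= \phi(z_1)\,\phi(z_2) \\
f_1(z_1, z_2) &= g_1(z_1)\,\phi(z_2) \\
f_2(z_1, z_2) &= \phi(z_1)\,g_2(z_2) \\
f_3(z_1, z_2) &= g_1(z_1)\,g_2(z_2),
\end{aligned}$$ (Equation 8)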

where ϕ(·) denotes the standard normal density and g1 and g2 denote the non-null marginal densities of z1 and z2, respectively. We found that modeling the marginal non-null densities with normal Laplace densities or the Weibull distribution on the squared Z scores (z²) (in which case the null densities are central chi-square with one degree of freedom) fits the data well. The proportions π = (π0, π1, π2, π3) and the parameters of the non-null distributions can be estimated with maximum likelihood or Bayesian methods such as Markov Chain Monte Carlo. From the probability density function (pdf) (Equation 7), we can compute the joint and conditional cdfs, and hence the FDR (Equation 2), of one phenotype conditionally on tail probabilities of the second.

Figure S3 presents the observed and fitted Q-Q curves for SCZ, TG, and WHR; these curves are based on the marginal cdfs from the bivariate-mixture-model fits, indicating very good fit. The distributions were modeled parametrically, with Weibull distributions for non-null z² and with chi-square with one degree of freedom for null z². The estimated vector of probabilities π from these fits can also be used for testing whether the degree of pleiotropy is significantly higher than that expected by chance if both phenotypes are independent. Independence implies that the joint pdf of both phenotype Z scores is a product of two two-group mixtures (Equation 6). It is easy to show that demonstrating excess pleiotropy from that predicted by independence is equivalent to showing that π3 > π1π2 / π0 in Equation 7 or that the log odds ratio (LOR)

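$$\mathrm{LOR}(\text{phenotype 1}, \text{phenotype 2}) = \log\left\{\frac{\pi_3}{1 - \pi_3}\right\} - \log\left\{\frac{\pi_1 \pi_2 / \pi_0}{1 - \pi_1 \pi_2 / \pi_0}\right\}$$ (Equation 9)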

is greater than zero. With a multivariate normal approximation to the maximum-likelihood estimates with covariance obtained from the inverse Fisher information matrix, estimates of LOR with 95% CIs are LOR(SCZ,TG) = 4.0 [3.8, 4.3] and LOR(SCZ,WHR) = 2.4 [2.1, 2.7], which are both highly significantly different from zero. These 95% CIs include an adjustment that assumes an effective degree of freedom of 500,000 independent SNPs to account for the correlation of SNPs (i.e., LD).
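To see why π1π2 / π0 is the independence benchmark, suppose the two phenotypes have marginal null proportions a and b. Independence then gives π0 = ab, π1 = (1 − a)b, π2 = a(1 − b), and π3 = (1 − a)(1 − b), so π1π2 / π0 = π3 and the LOR is zero. The sketch below evaluates Equation 9 for given proportions; the function name and the numbers in the usage lines are hypothetical and are not the fitted values from this study.

```python
import math


def pleiotropy_log_odds_ratio(pi0, pi1, pi2, pi3):
    """Equation 9: log odds of pi3 minus log odds of the value implied by independence."""
    expected = pi1 * pi2 / pi0  # pleiotropic proportion expected under independence
    observed_log_odds = math.log(pi3 / (1.0 - pi3))
    expected_log_odds = math.log(expected / (1.0 - expected))
    return observed_log_odds - expected_log_odds


# Hypothetical independent case: marginal null proportions a = 0.95 and b = 0.90.
a, b = 0.95, 0.90
lor_independent = pleiotropy_log_odds_ratio(a * b, (1 - a) * b, a * (1 - b), (1 - a) * (1 - b))

# Hypothetical case with more pleiotropic SNPs than independence predicts.
lor_excess = pleiotropy_log_odds_ratio(0.87, 0.03, 0.08, 0.02)
```

Here `lor_independent` is zero up to rounding error and `lor_excess` is positive.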

Stratified Replication Rate

For each of the 17 substudies contributing to the final meta-analysis in SCZ, we independently adjusted Z scores by using intergenic inflation control. For 1,000 of the possible combinations of the eight-study discovery sets and nine-study replication sets, we calculated the eight-study combined discovery Z score and eight- or nine-study combined replication Z score for each SNP as the average Z score across the eight or nine studies and multiplied these by the square root of the number of studies. For discovery samples, the Z scores were converted to two-tailed p values, whereas replication samples were converted to one-tailed p values, preserving the direction of effect in the discovery sample. For each of the 1,000 discovery-replication pairs, cumulative rates of replication were calculated over 1,000 equally spaced bins spanning the range of negative log10(p values) observed in the discovery samples. The cumulative replication rate for any bin was calculated as the proportion of SNPs with a –log10(discovery p value) greater than the lower bound of the bin with a replication p value < 0.05. Cumulative replication rates were calculated independently for each of the four pleiotropic enrichment categories, as well as for intergenic SNPs and all SNPs. For each category, the cumulative replication rate for each bin was averaged across the 1,000 discovery-replication pairs, and the results are reported in Figure 1. The vertical intercept is the overall replication rate.
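The Z score combination and the replication criterion can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, substudies are weighted equally as the text describes, and the random selection of discovery and replication sets and the binning over 1,000 thresholds are omitted.

```python
import math


def combined_z(z_scores):
    """Average Z score across substudies times the square root of their number."""
    k = len(z_scores)
    return (sum(z_scores) / k) * math.sqrt(k)


def two_tailed_p(z):
    """Two-tailed p value; erfc keeps precision in the extreme tail."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def one_tailed_p(z_replication, z_discovery):
    """One-tailed p value in the direction of the discovery effect."""
    sign = 1.0 if z_discovery >= 0 else -1.0
    return 0.5 * math.erfc(sign * z_replication / math.sqrt(2.0))


def cumulative_replication_rate(disc_z, repl_z, threshold, alpha=0.05):
    """Share of SNPs with -log10(discovery p) > threshold that replicate at alpha.

    Assumes |Z| is small enough (below roughly 37) that the p value does not underflow to 0.
    """
    selected = [
        (zd, zr) for zd, zr in zip(disc_z, repl_z)
        if -math.log10(two_tailed_p(zd)) > threshold
    ]
    if not selected:
        return None  # no SNPs pass the discovery threshold
    hits = sum(1 for zd, zr in selected if one_tailed_p(zr, zd) < alpha)
    return hits / len(selected)
```

Evaluating `cumulative_replication_rate` separately on the SNPs of each enrichment category, and averaging over discovery-replication splits, would give curves of the kind reported in Figure 1.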

Stratified Replication Effect Sizes

Stratified TDR is directly related to stratified replication effect sizes and hence replication rates.

As before, for each of the 17 substudies contributing to the final meta-analysis in SCZ, we independently adjusted Z scores by using intergenic inflation control. For 1,000 of the possible combinations of the eight-study discovery sets and nine-study replication sets, we calculated the eight-study combined discovery Z score and eight- or nine-study combined replication Z score for each SNP. The effect sizes were stratified by levels of log10(p values) from the TG GWAS. As illustrated in Figure S4, we also calculated the cumulative replication rate without overlapping controls (we removed the UK sample that included the WTCCC controls).

For visualization, a cubic smoothing spline was fit for relating the discovery Z score bin midpoints to the corresponding average replication Z scores (see Figure S5). The nonlinear pattern of shrinkage is typical of that observed in mixture models, as in Equation 1. Importantly, the amount of shrinkage is highly dependent on enrichment stratum: replication effect sizes in more enriched strata exhibit more fidelity with discovery sample effect sizes. This directly relates to increased TDR and translates into increased replication rates for enriched strata.

Conditional Statistics—Test of Association with SCZ

To improve detection of SNPs associated with SCZ, we used a stratified FDR approach in which we leveraged associated phenotypes by using established stratified FDR methods.34,35 Specifically, we stratified SNPs on the basis of p values in the pleiotropic phenotype (e.g., TGs). On the basis of the combination of p values for the SNP in SCZ and the pleiotropic trait, we then assigned a conditional FDR value (denoted as FDRSCZ | TG) for SCZ to each SNP by interpolating into a 2D look-up table (Figure S6). All SNPs with FDR < 0.01 (–log10(FDR) > 2) in SCZ given the different CVD risk factors are listed in Table 1 after “pruning” (removing all SNPs with r2 > 0.2 according to 1KGP LD structure). A significance threshold of FDR < 0.01 corresponds to 1 false positive per 100 reported associations. We also list all SNPs with FDR < 0.05 (–log10(FDR) > 1.3) in Table S2.
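A simplified way to see what the conditional FDR measures is to compute it directly from the data rather than from an interpolated look-up table. In the sketch below (hypothetical function and argument names), the conditional FDR for a SNP is its primary-phenotype p value divided by the empirical cdf of primary-phenotype p values among SNPs whose p value in the conditioning phenotype is at least as small. The quadratic loop is meant for illustration only.

```python
def conditional_fdr(p_primary, p_conditional):
    """Estimate the FDR for the primary trait given the conditioning trait, per SNP.

    For SNP i, the stratum is every SNP with conditioning p <= p_conditional[i];
    q is the fraction of that stratum with primary p <= p_primary[i];
    the estimate is min(1, p_primary[i] / q).
    """
    snps = list(zip(p_primary, p_conditional))
    estimates = []
    for p1, p2 in snps:
        stratum = [a for a, b in snps if b <= p2]  # always contains the SNP itself
        q = sum(1 for a in stratum if a <= p1) / len(stratum)
        estimates.append(min(1.0, p1 / q))
    return estimates
```

A SNP with a modest SCZ p value but a very small p value in the conditioning trait falls in a small, enriched stratum, so its estimate can be well below the unconditional p / q.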

Table 1. Conditional FDR: SCZ Loci Given CVD Risk Factors

| Locus | SNP | Gene Region | MIM | Chr | SCZ p Value | SCZ FDR | Minimum Conditional FDR | CVD Risk Factor |
|---|---|---|---|---|---|---|---|---|
| 4 | rs1625579 | AK094607^a | 614304 | 1p21.3 | 5.52 × 10^−6 | 0.02105 | 0.00420 | TG |
| 9 | rs2272417 | IFT172 | 607386 | 2p23.3 | 4.47 × 10^−5 | 0.07516 | 0.00193 | TG |
| 17 | rs17180327 | CWC22 | - | 2q31.3 | 6.37 × 10^−6 | 0.02332 | 0.00780 | HDL |
| 20 | rs13025591 | AGAP1 | 608651 | 2q37 | 9.26 × 10^−6 | 0.02953 | 0.00131 | TG |
| 22 | rs2239547 | ITIH4^a | 600564 | 3p21.1 | 1.73 × 10^−5 | 0.03920 | 0.00400 | HDL |
| 23 | rs11715438 | PTPRG | 176886 | 3p21-p14 | 2.47 × 10^−6 | 0.01601 | 0.00222 | HDL |
| 26 | rs9838229 | DKFZp434A128 | - | 3q27.2 | 1.11 × 10^−5 | 0.02953 | 0.00825 | HDL |
| 37^b | rs2021722 | TRIM26^a | 600830 | 6p21.3 | 2.08 × 10^−9 | 0.00046 | 0.00001 | TG |
| 38 | rs7383287 | HLA-DOB | 600629 | 6p21.3 | 3.44 × 10^−5 | 0.06382 | 0.00748 | HDL |
| 39 | rs1480380 | HLA-DMA | 142855 | 6p21.3 | 3.05 × 10^−6 | 0.01746 | 0.00028 | TG |
| 40 | rs9462875 | CUL9 | 607489 | 6p21.1 | 1.20 × 10^−5 | 0.03383 | 0.00739 | WHR |
| 42 | rs1107592 | MAD1L1 | 602686 | 7p22 | 7.63 × 10^−7 | 0.00919 | 0.00493 | HDL |
| 48 | rs10503253 | CSMD1^a | 608397 | 8p23.2 | 3.96 × 10^−6 | 0.01912 | 0.00432 | TG |
| 51 | rs12234997 | AK055863 | - | 8p23.1 | 2.23 × 10^−5 | 0.04590 | 0.00347 | TG |
| 55 | rs755223 | BC037345 | - | 8q12.3 | 6.91 × 10^−5 | 0.10338 | 0.00895 | HDL |
| 56 | rs7004633 | MMP16^a | 602262 | 8q21.3 | 2.60 × 10^−7 | 0.00504 | 0.00141 | HDL |
| 65 | rs11191580 | NT5C2^a | 600417 | 10q24.32 | 3.73 × 10^−7 | 0.00625 | 0.00013 | SBP |
| | rs7914558 | CNNM2^a | 607803 | 10q24.32 | 1.90 × 10^−6 | 0.01464 | 0.00101 | HDL |
| | rs2296569 | CNNM2 | 607803 | 10q24.32 | 3.78 × 10^−6 | 0.01912 | 0.00127 | TG |
| | rs10748835 | AS3MT | 611806 | 10q24.32 | 2.21 × 10^−6 | 0.01464 | 0.00274 | HDL |
| 67 | rs11191732 | NEURL | 603804 | 10q25.1 | 2.55 × 10^−6 | 0.01601 | 0.00160 | HDL |
| 71 | rs2172225 | METT5D1 | - | 11p14.1 | 4.88 × 10^−5 | 0.08828 | 0.00238 | TG |
| | rs7938219 | CR618717 | - | 11p14.1 | 3.75 × 10^−5 | 0.07516 | 0.00331 | TG |
| 78 | rs548181 | STT3A | 601134 | 11q23.3 | 4.65 × 10^−7 | 0.00707 | 0.00044 | WHR |
| | rs11220082 | FEZ1 | 604825 | 11q24.2 | 2.84 × 10^−6 | 0.01746 | 0.00279 | TG |
| | rs671789 | PKNOX2 | 613066 | 11q24.2 | 1.46 × 10^−5 | 0.03920 | 0.00695 | WHR |
| 80 | rs7972947 | CACNA1C^a | 114205 | 12p13.2 | 7.12 × 10^−6 | 0.02609 | 0.00415 | TG |
| 81 | rs4765905 | CACNA1C^a | 114205 | 12p13.3 | 7.99 × 10^−6 | 0.02609 | 0.00758 | TG |
| 84 | rs8003074 | KIAA0391 | 609947 | 14q13.2 | 7.23 × 10^−6 | 0.02609 | 0.00484 | HDL |
| | rs10135277 | KIAA0391 | 609947 | 14q13.1 | 5.02 × 10^−6 | 0.02105 | 0.00491 | TG |
| 87 | rs1869901 | PLCB2 | 604114 | 15q15 | 3.66 × 10^−6 | 0.01912 | 0.00203 | TG |
| 101 | rs17597926 | TCF4^a | 602272 | 18q21.1 | 6.49 × 10^−7 | 0.00805 | 0.00216 | TG |

Independent complex or single-gene loci (r2 < 0.2) with SNP(s) with a conditional FDR < 0.01 in SCZ given the association in CVD risk factors. We defined the most significant SCZ SNP in each LD block on the basis of the minimum conditional FDR for each phenotype. Listed are the most significant SNPs in each gene of the LD block, as well as the CVD risk factor that provided the signal. All loci with SNPs with conditional FDR < 0.05 were used for defining the number of the loci (Table S2). This and the respective FDR values in each phenotype are listed in Table S2. SCZ FDR values < 0.01 are in bold. The following abbreviations are used: chr, chromosomal region; SCZ, schizophrenia; FDR, false-discovery rate; CVD, cardiovascular disease; TG, triglyceride; HDL, high-density lipoprotein; WHR, waist-to-hip ratio; and SBP, systolic blood pressure.

^a Same locus identified in previous SCZ GWASs. All data were first corrected for genomic inflation.

^b There are additional independent SNPs in the HLA region on chromosome 6 (locus 37). The complete SNP list is shown in Table S6.

Conditional Manhattan Plots

To illustrate the localization of the genetic markers associated with SCZ given the CVD-risk-factor effect, we created a “conditional Manhattan plot” by plotting all SNPs within an LD block in relation to their chromosomal location. As illustrated in Figure 2, the large points represent the SNPs with FDR < 0.05, whereas the small points represent the nonsignificant SNPs. All SNPs without pruning are shown. The strongest signal in each LD block is illustrated with a black line around the circles. We identified these signals by ranking all SNPs in increasing order on the basis of the conditional FDR value for SCZ and then by removing SNPs in LD r2 > 0.2 with any higher ranked SNP. Thus, the selected locus was the most significantly associated with SCZ in each LD block (Figure 2).
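The ranking-and-removal step can be sketched as follows. The SNP identifiers, FDR values, and pairwise r2 values in the usage lines are hypothetical placeholders, and the LD lookup is assumed to be a dictionary keyed by unordered SNP pairs. The sketch follows the wording above literally by comparing each SNP with every higher-ranked SNP; a clumping variant that compares only with SNPs already kept would retain more SNPs.

```python
def prune_by_ld(cond_fdr, ld_r2, r2_threshold=0.2):
    """Return the SNPs that are not in LD with any higher-ranked SNP.

    cond_fdr: {snp_id: conditional FDR}; a smaller value means a higher rank.
    ld_r2:    {frozenset({snp_a, snp_b}): r2}; missing pairs are treated as r2 = 0.
    """
    kept = []
    higher_ranked = []
    for snp in sorted(cond_fdr, key=cond_fdr.get):
        in_ld = any(
            ld_r2.get(frozenset((snp, other)), 0.0) > r2_threshold
            for other in higher_ranked
        )
        if not in_ld:
            kept.append(snp)
        higher_ranked.append(snp)
    return kept


# Hypothetical example: snpB and snpC sit in the same LD block as snpA.
example_fdr = {"snpA": 0.001, "snpB": 0.004, "snpC": 0.02, "snpD": 0.03}
example_ld = {frozenset(("snpA", "snpB")): 0.8, frozenset(("snpB", "snpC")): 0.5}
lead_snps = prune_by_ld(example_fdr, example_ld)
```

In this example `lead_snps` contains `snpA` and `snpD`, the strongest signals of their respective blocks.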

Figure 2. Conditional Manhattan Plot

Conditional Manhattan plot of conditional –log10 (FDR) values for SCZ alone (black) and SCZ given the following CVD risk factors: TGs (SCZ|TG, red), LDL (SCZ|LDL, orange), HDL (SCZ|HDL, cyan), SBP (SCZ|SBP, green), BMI (SCZ|BMI, purple), WHR (SCZ|WHR, blue), and T2D (SCZ|T2D, chartreuse). SNPs with conditional –log10 FDR > 1.3 (i.e., FDR < 0.05) are shown with large points. A black line around the large points indicates the most significant SNP in each LD block, and this SNP was annotated with the closest gene, which is listed above the symbols in each locus (except for the HLA region on chromosome 6) and in Table S2. The figure shows the localization of 106 loci on a total of 21 chromosomes (1–19, 21, and 22). Details for the loci with –log10 FDR > 2 (i.e., FDR < 0.01) are shown in Table 1.

Conjunction Statistics—Test of Association with Both Phenotypes

In order to identify which of the SNPs associated with SCZ given the CVD risk factor (SCZ|CVD, Table 1) were also associated with CVD risk factors given SCZ (opposite direction), we calculated the conditional FDR in the other direction (CVD|SCZ). This is reported in Table 2. The corresponding Z scores are listed in Table S3. The Z scores were calculated from the p values, and the direction of effect was determined by the risk allele.
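As a rough illustration of how a signed Z score can be recovered from a p value, the sketch below assumes two-sided p values and a sign taken from the direction of effect of the risk allele. The function name `signed_z` and the example inputs are hypothetical; the source does not specify the exact conversion used.

```python
from statistics import NormalDist


def signed_z(p_value, effect_sign):
    """Convert a two-sided p value into a Z score with a direction.

    Assumption: p_value is two-sided and lies in (0, 1]; effect_sign is
    +1 or -1 according to the direction of effect of the risk allele.
    """
    # inv_cdf(p/2) is the lower-tail quantile, so negate it to get |z|.
    magnitude = -NormalDist().inv_cdf(p_value / 2.0)
    return magnitude if effect_sign >= 0 else -magnitude


z_up = signed_z(0.05, +1)
z_down = signed_z(0.05, -1)
print(round(z_up, 3), round(z_down, 3))
```

Using the lower-tail quantile of p/2 avoids the loss of precision that 1 - p/2 would incur for very small p values.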

Table 2. Conditional FDR: CVD-Risk-Factor Loci Given SCZ

Locus SNP Gene MIM Chr TG|SCZ LDL|SCZ HDL|SCZ SBP|SCZ BMI|SCZ WHR|SCZ T2D|SCZ
9 rs780110 IFT172 607386 2p23.3 0.00000 0.73578 0.66350 0.88851 0.57686 0.01079 1.00000
rs2272417 IFT172 607386 2p23.3 0.00000 0.86268 0.55896 0.83749 0.70039 0.06244 1.00000
20 rs6759206 AGAP1 608651 2q37 0.01764 0.89696 0.25333 1.00000 1.00000 0.95347 1.00000
22 rs3617 ITIH3 146650 3p21.1 0.69128 0.84071 0.37022 0.97795 0.45287 0.00942 1.00000
rs2276817 ITIH4 600564 3p21.1 0.28255 0.04717 0.25333 0.61208 0.45287 1.00000 1.00000
37 rs2328893 SLC17A4 604216 6p22.2 0.03788 0.34581 0.00396 0.83749 0.65586 1.00000 1.00000
rs1324082 SLC17A1 182308 6p22.2 0.03113 0.63999 0.00465 0.65717 0.78940 0.95347 1.00000
rs13198474 SLC17A3 611034 6p22.2 0.69128 0.73578 0.00289 0.80634 1.00000 0.93285 1.00000
rs16891235 HIST1H1A 142709 6p22.2 0.95191 0.02569 0.00213 0.70268 1.00000 0.93285 1.00000
rs13194781 HIST1H2BN 602801 6p22.2 0.00239 0.97314 0.14244 0.88851 1.00000 0.93285 1.00000
rs1235162 GABBR1 603540 6p22.1 0.00117 0.73578 0.10885 0.70268 0.82974 1.00000 1.00000
rs2844762 HLA-B 142830 6p22.1 0.00491 0.53895 0.78537 0.61208 NA 0.93285 1.00000
rs3130380 HCG18 - 6p22.1 0.00708 0.73578 0.01852 0.77857 0.70039 0.81643 1.00000
rs2524222 GNL1 143024 6p22.1 0.28255 0.02945 0.41447 0.80634 1.00000 0.93285 1.00000
rs9262143 KIAA1949 610990 6p22.1 0.00004 0.26238 0.05759 0.77857 0.92201 0.52829 1.00000
rs3095326 IER3 602996 6p22.1 0.00003 0.04717 0.04502 0.74450 0.92201 0.42354 1.00000
rs3099840 HCP5 604676 6p21.3 0.00000 0.39032 0.02988 0.28698 1.00000 0.37454 1.00000
rs2284178 HCP5 604676 6p21.3 0.01764 0.48709 0.25333 0.18351 0.74603 0.87368 1.00000
rs805294 LY6G6C 610435 6p21.33 1.00000 0.97314 0.12393 0.00248 0.61339 0.75370 1.00000
rs3117577 MSH5 603382 6p21.3 0.00000 0.02164 0.41447 0.61208 0.87106 0.42354 1.00000
rs3130679 C6orf48 605447 6p21.33 0.00000 0.07243 0.14244 0.41364 0.70039 0.13758 1.00000
rs412657 AK123889 - 6p21.33 0.69128 0.97314 0.03447 0.65717 0.65586 0.37454 1.00000
rs9268219 C6orf10 606766 6p21.33 0.00000 0.04220 0.12393 0.38400 0.65586 0.03366 1.00000
rs3129963 BTNL2 606000 6p21.33 0.59071 0.77938 0.00548 0.52604 0.92201 0.04119 1.00000
rs9268853 HLA-DRA 142860 6p21.3 0.69128 0.81421 0.03447 0.41364 0.61339 0.02983 1.00000
rs9275524 HLA-DQA2 613503 6p21.32 0.00409 0.03128 0.00548 0.33310 0.27214 0.05832 1.00000
39 rs1480380 HLA-DMA 142855 6p21.3 0.00708 0.86268 0.41447 0.18351 0.78940 0.10401 NA
40 rs7832 C6orf108 - 6p21.1 0.03399 0.97057 0.10762 NA NA NA NA
51 rs983309 AK055863 - 8p23.1 0.48760 0.00000 0.00000 0.80634 0.78940 0.47533 1.00000
rs17660635 AK055863 - 8p23.1 0.69128 0.00080 0.00010 0.74450 0.92201 0.81643 1.00000
65 rs4919666 SUFU 607035 10q24.32 0.85168 0.86268 0.78537 0.04405 0.40025 0.87368 1.00000
rs2296569 CNNM2 607803 10q24.32 0.15574 0.59079 0.03950 1.00000 1.00000 1.00000 1.00000
rs11191560 NT5C2 600417 10q24.32 0.69128 0.97314 0.72193 0.00000 0.02776 0.47533 1.00000
rs11191580 NT5C2 600417 10q24.32 0.78905 1.00000 0.61021 0.00000 0.02897 0.52829 1.00000
71 rs2958625 METT5D1 - 11p14.1 0.00491 0.89696 0.02569 0.88851 0.52128 0.52829 1.00000
rs10835491 METT5D1 - 11p14.1 0.00409 0.89696 0.03950 0.88851 0.52128 0.52829 1.00000
78 rs10790734 PKNOX2 613066 11q24.2 0.37774 0.89696 1.00000 0.80634 0.65586 0.04476 1.00000

For the independent complex or single-gene loci (r2 < 0.2) with SNP(s) with a conditional FDR < 0.01 in SCZ given associated CVD risk factors (Table 1), the conditional FDR in the other direction is provided, i.e., FDR CVD risk factors given association in SCZ. All independent loci are listed consecutively, and the same locus numbering is used as in Table 1. All data were first corrected for genomic inflation. FDR values < 0.05 are in bold. The following abbreviations are used: chr, chromosomal region; TG, triglyceride; SCZ, schizophrenia; LDL, low-density lipoprotein; HDL, high-density lipoprotein; SBP, systolic blood pressure; BMI, body mass index; WHR, waist-to-hip ratio; T2D, type 2 diabetes; and NA, not available.

In addition, to make a comprehensive, unselected map of pleiotropic signals, we used a conjunction testing procedure, as outlined for p value statistics in Nichols et al.,42 and adapted this method for FDR statistics on the basis of the conditional-FDR approach.34,35 On the basis of the combination of p values for the SNP in SCZ and the pleiotropic trait, we defined the conjunction statistic (denoted as $\mathrm{FDR}_{\mathrm{SCZ\,\&\,TG}}$) as the maximum of the conditional FDRs in the two directions, i.e.,

$$\mathrm{FDR}_{\mathrm{SCZ\,\&\,TG}} = \max\left(\mathrm{FDR}_{\mathrm{SCZ|TG}},\ \mathrm{FDR}_{\mathrm{TG|SCZ}}\right),$$

by interpolating into a bidirectional 2D look-up table (Figure S7). The conjunction statistic allows for identification of SNPs that are associated with both phenotypes, which minimizes the effect of a single phenotype driving the common association signal. All SNPs with conjunction FDR < 0.05 (–log10(FDR) > 1.3) with SCZ and any of the CVD risk factors considered are listed in Table S4 (after pruning).
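The definition above can be illustrated with a short sketch. It assumes the two conditional FDR values have already been computed for each SNP (the look-up-table interpolation is not reproduced here); the function name `conjunction_fdr`, the SNP labels, and the numeric values are hypothetical.

```python
def conjunction_fdr(fdr_scz_given_trait, fdr_trait_given_scz):
    """Conjunction FDR as the maximum of the two conditional FDRs."""
    return max(fdr_scz_given_trait, fdr_trait_given_scz)


# Hypothetical per-SNP conditional FDRs: (SCZ | TG, TG | SCZ).
toy_snps = {
    "snp1": (0.002, 0.030),   # small in both directions
    "snp2": (0.004, 0.700),   # driven by SCZ only
    "snp3": (0.600, 0.001),   # driven by TG only
}

# Keep SNPs whose conjunction FDR falls below 0.05.
pleiotropic = sorted(
    snp for snp, (a, b) in toy_snps.items() if conjunction_fdr(a, b) < 0.05
)
print(pleiotropic)
```

Because the maximum is taken, a SNP passes only when it is significant in both directions, so a strong signal in a single phenotype cannot carry it over the threshold.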

Conjunction Manhattan Plots

To illustrate the localization of the pleiotropic genetic markers in association with both SCZ and CVD risk factors, we used a conjunction Manhattan plot, for which we plotted all SNPs with a significant conjunction FDR within an LD block in relation to their chromosomal location. As illustrated in Figure S8, the large points represent the significant SNPs (FDR < 0.05), whereas the small points represent the nonsignificant SNPs. All SNPs without pruning are shown, and the strongest signal in each LD block is illustrated with a black line around the circles. We identified these signals by ranking all SNPs on the basis of the conjunction FDR and removing SNPs in LD (r² > 0.2) with any higher-ranked SNP (Figure S8).

Results

Q-Q Plots of SCZ SNPs Stratified by Association with Pleiotropic CVD Risk Factors

Stratified Q-Q plots for SCZ conditioned on nominal p values of association with TGs showed enrichment across different levels of significance for TGs (Figure 1A). The earlier departure from the null line (leftward shift) suggests a greater proportion of true associations for a given nominal SCZ p value. Successive leftward shifts for decreasing nominal TG p values indicate that the proportion of non-null effects varies considerably across different levels of association with CVD risk factors. For example, in the –log10(pTG) ≥ 3 category, the proportion of SNPs reaching a given significance level (e.g., –log10(pSCZ) > 6) is roughly 100 times greater than that for the –log10(pTG) ≥ 0 category (all SNPs), indicating a very high level of enrichment. Similarly, a clear pleiotropic enrichment was also seen for HDL and LDL cholesterol. A less clear pleiotropic enrichment was seen for WHR (Figure 1B), BMI, and SBP, but there was no evidence for enrichment in T2D (data not shown).

Conditional TDR in SCZ Is Increased by CVD Risk Factors

Because categories of SNPs with stronger pleiotropic enrichment are more likely to be associated with SCZ, tag SNPs should not all be treated as exchangeable if power for discovery is to be maximized. Specifically, variation in enrichment across pleiotropic categories is expected to be associated with corresponding variation in the TDR (equivalent to 1 – FDR)40 for association of SNPs with SCZ. A conservative estimate of the TDR for each nominal p value is 1 – (p / q), where q is the empirical proportion of SNPs in the category with p values at or below p; it is easily obtained from the stratified Q-Q plots. This relationship is shown for SCZ conditioned on TG (Figure 1C) and WHR (Figure 1D). For a given conditional TDR, the corresponding estimated nominal p value threshold varies by a factor of 100 from the most to the least enriched SNP category (stratum) for SCZ conditioned on TG (SCZ|TG) and approximately by a factor of 40 for SCZ conditioned on WHR (SCZ|WHR). Phenotypes with weaker pleiotropy with SCZ showed smaller increases in conditional TDR (data not shown). Because TDR is strongly related to predicted replication rate, it is expected that the replication rate will increase for a given nominal p value for SNPs in categories with higher conditional TDR.
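The conservative estimate TDR = 1 – (p / q) can be sketched for a single stratum as follows. This is a simplified illustration under stated assumptions: q is taken to be the empirical fraction of SNPs in the stratum with an SCZ p value at or below the threshold, the stratum is defined by a cutoff on the pleiotropic-trait p value, and all names and numbers are hypothetical toy values.

```python
def conditional_tdr(p_threshold, scz_pvalues, trait_pvalues, trait_cutoff):
    """Estimate the conditional TDR, 1 - p/q, within one stratum.

    The stratum contains SNPs whose trait p value is <= trait_cutoff
    (e.g., 0.001 corresponds to -log10(p) >= 3).
    q is the empirical fraction of stratum SNPs with SCZ p <= p_threshold.
    """
    stratum = [ps for ps, pt in zip(scz_pvalues, trait_pvalues)
               if pt <= trait_cutoff]
    if not stratum:
        raise ValueError("no SNPs in the requested stratum")
    q = sum(ps <= p_threshold for ps in stratum) / len(stratum)
    if q == 0:
        return 0.0  # assumption: no SNP passes, so treat the FDR as 1
    fdr = min(1.0, p_threshold / q)
    return 1.0 - fdr


# Toy p values (hypothetical) for eight SNPs.
scz_p = [0.001, 0.004, 0.2, 0.6, 0.9, 0.03, 0.5, 0.8]
tg_p = [0.0005, 0.0008, 0.0002, 0.0009, 0.4, 0.7, 0.3, 0.9]

tdr_enriched = conditional_tdr(0.01, scz_p, tg_p, trait_cutoff=0.001)
tdr_all = conditional_tdr(0.01, scz_p, tg_p, trait_cutoff=1.0)
print(tdr_enriched, tdr_all)
```

In the toy data the enriched stratum gives a higher estimated TDR than the full SNP set at the same nominal SCZ threshold, which is the pattern the stratified plots are meant to reveal.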

Replication Rate in SCZ Is Increased by Pleiotropic CVD Risk Factors

To demonstrate that the observed pattern of differential enrichment does not result from spurious (i.e., nongeneralizable) associations due to category-specific stratification or errors in statistical modeling, we also studied the empirical replication rate across independent substudies of SCZ. Figures 1E and 1F show the empirical cumulative replication-rate plots as a function of nominal p value for the same categories as for the conditional stratified TDR plots in Figures 1C and 1D. Consistent with the conditional TDR pattern, we found that the nominal p value corresponding to a wide range of replication rates was 100 times higher for the –log10(pTG) ≥ 3 category than for the –log10(pTG) ≥ 0 category (Figure 1E). Similarly, SNPs from pleiotropic SNP categories showing the greatest enrichments (–log10(pTG) ≥ 3) replicated at the highest rates—up to five times higher than all SNPs (–log10(pTG) ≥ 0)—for a wide range of p value thresholds. This suggests that adjusting p value thresholds according to the estimated category-specific conditional TDR could improve the discovery of replicating SNP associations. The same relationship between conditional TDR and replication rate was shown for SCZ|WHR (Figure 1F), but here, the increase in enrichment, and thus the increase in replication rate, was weaker than that for SCZ|TG.

SCZ Gene Loci Identified with Conditional FDR

To identify SNPs associated with SCZ, we constructed a conditional Manhattan plot showing the FDR conditional on each of the CVD risk factors (Figure 2). We identified significant loci located on a total of 21 chromosomes (1–19, 21, and 22) associated with SCZ, leveraging the reduced FDR obtained by the associated CVD risk factor. To estimate the number of independent loci, we pruned the associated SNPs (i.e., removed SNPs with LD > 0.2) and identified a total of 106 independent loci with a significance threshold of conditional FDR < 0.05 (Table S2). With the more conservative conditional-FDR threshold of 0.01, there remained 25 significant independent loci, of which 4 were complex and 21 were single genes (Table 1 and black line around large circles in Figure 2). The largest locus was on chromosome 6 in the human-leukocyte-antigen (HLA) region. This is the only locus that would have been discovered by standard methods based on p values (Bonferroni correction), and the 6p21.3 region (close to TRIM26 [MIM 600830]) was significantly associated with SCZ in the primary analysis of the current sample.13 With the FDR method in SCZ alone, six loci were identified. Of these, the regions close to TRIM26 (6p21.3), MMP16 (8q21.3 [MIM 602262]), CNNM2/NT5C2 (10q24.32 [MIM 607803 and 600417]), and TCF4 (18q21.1 [MIM 602272]) were identified in earlier GWASs only after large replication samples were included,13,15 except for 6p21.3. The remaining 19 loci would not have been identified in the current sample without the use of the pleiotropy-informed stratified FDR method. 
Of interest, the AK094607/MIR137 region (1p21.3 [MIM 614304]) and the CSMD1 region (8p23.2 [MIM 608397]) were identified in the primary analysis of the current SCZ sample after the inclusion of a large replication sample,13 and the ITIH4 (3p21.1 [MIM 600564]) and CACNA1C (12p13.3, locus 81 [MIM 114205]) regions were identified in the primary analysis after combination with a large bipolar-disorder sample.12,13 Thus, the current pleiotropy-informed FDR method validated nine loci discovered in considerably larger samples and discovered 16 additional loci. Furthermore, several of these additional loci are located in regions with borderline significance association with SCZ in previous studies: AGAP1 (2q37; CENTG2 [MIM 608651]),13 PTPRG (3p21 [MIM 176886]),13 MAD1L1 (7p22 [MIM 602686]),43 STT3A (11q23.3 [MIM 601134]),13 and PLCB2 (15q15 [MIM 604114]).13

Pleiotropic Gene Loci in SCZ and CVD Risk Factors Identified with Conjunction FDR

As a secondary analysis, we investigated whether any of the SNPs associated with SCZ conditioned on CVD (SCZ|CVD) were also significantly associated with CVD risk factors conditioned on SCZ (CVD|SCZ), i.e., the conditional FDR in the opposite direction. We identified ten independent loci (pruned on the basis of LD > 0.2) with a significant association also with the CVD risk factor (conditional FDR < 0.05); these included three complex loci and seven single-gene loci (Table 2). Of these, the ITIH4 region (3p21.1) and the CNNM2/NT5C2 region (10q24.32), in addition to the HLA region on chromosome 6, have been identified in previous SCZ studies after the inclusion of large replication samples.13 The significant loci were found in the analyses of TG|SCZ (six loci), LDL|SCZ (three loci), HDL|SCZ (four loci), SBP|SCZ (two loci), BMI|SCZ (one locus), and WHR|SCZ (four loci), and six loci were jointly associated with SCZ and more than one CVD risk factor (Table 2). This suggests that overlapping genetic pathways are involved in SCZ and CVD risk factors. The direction of the different SNP associations (Z scores) is shown in Table S3. There was no clear evidence for systematic directions across any of the SNPs in the different phenotypes, probably as a result of complex LD structures, especially on chromosome 6.

Further, to provide a comprehensive, unselected map of pleiotropic loci between SCZ and CVD risk factors, in addition to those primarily associated with SCZ, we performed a conjunction-FDR analysis and constructed a conjunction Manhattan plot (Figure S8). We detected 26 independent pleiotropic loci (pruned on the basis of LD > 0.2; black line around large circles) with a significance threshold of conjunction FDR < 0.05 on a total of 14 chromosomes. See Table S4 for more details.

Discussion

Here, leveraging the power of GWAS data from over 250,000 individuals, we demonstrate that GWASs from associated CVD risk factors can improve discovery of SCZ susceptibility loci. By using the stratified conditional-FDR approach34,35 in the combined analyses of the SCZ and CVD-risk-factor GWASs, we identified a total of 25 significant loci. By analyzing the SCZ GWAS alone with the FDR method, we identified six loci. In contrast, with standard GWAS methods, one locus was significant in the SCZ sample after genomic-control correction.13 The identified pleiotropic loci are associated with overlapping biological processes, and nine of them have been identified in previous SCZ GWASs after the inclusion of large additional samples. This shows the feasibility of using a pleiotropy-informed stratified FDR approach in SCZ in combination with associated phenotypes; it is much more cost efficient than increasing the sample size of SCZ individuals.44

To date, it has been difficult to use GWASs to discover a significant proportion of the missing heritability of complex human traits and disorders. Our statistical framework is based on the fact that SNPs are not exchangeable. Rather, SNPs with effects in pleiotropic phenotypes have a higher probability of being true non-nulls and hence also a higher probability of being replicated in independent studies. We therefore developed a conditional-FDR approach for GWAS summary statistics by adapting stratification methods originally used for linkage analysis and microarray expression data.34,35 Decreased conditional FDR (equivalently, increased conditional TDR) for a given nominal p value increases power to detect true non-null effects. Increased conditional TDR is directly related to increased replication effect sizes and replication rates in de novo samples. Importantly, we validated the conditional-FDR approach by demonstrating increased replication rates in independent SCZ substudies for given nominal p value cutoffs. Equivalently, conditional FDR can be used for controlling FDR at a given level while increasing power to discover non-null SNPs over the usual unconditional approaches that treat all SNPs as exchangeable.45

We also developed a conjunction-FDR approach to identify SNPs that are highly pleiotropic with SCZ and one or more CVD risk factors. The conjunction FDR is the maximum of the conditional FDR for SCZ given a CVD risk factor and vice versa. SNPs that pass a stringent conjunction-FDR threshold have a high probability of being non-null in two phenotypes simultaneously. Of note, conjunction FDR is different from the Fisher combined probability test, for which the alternative hypothesis is that the SNP has a significant effect on at least one (but not necessarily both) phenotypes. We validated our approach by applying a bivariate model to estimate the covariation between SCZ and the CVD risk factors. This showed that for the pleiotropic phenotypes, the degree of pleiotropy is highly significantly different from zero (Figure S3). Further confidence in the significance of the current findings comes from the CI Q-Q plots (Figure S2), which show significant pleiotropic enrichment. Given that the current analyses are based on GWAS summary statistics, the findings depend on correctly computed p values in the original studies.

The current findings of difference in magnitude of enrichment and variation in the pleiotropic loci across the three lipid phenotypes show that the results are not driven by genetic stratification, given that the lipid phenotypes were all obtained from the same individuals30 and each had approximately the same sample size. Moreover, the improved replication rate with increasing pleiotropic enrichment further argues against nonspecific genetic stratification. As such, polygenic pleiotropy could potentially be a nonspecific phenomenon related to heritable human phenotypes, but the lack of polygenic enrichment and of significant loci between SCZ and T2D (an example of a successful GWAS31) suggests that the current results are phenotype specific. It is unlikely that nongenetic correlations explain the observed pleiotropy given that only a fraction of control participants in the CVD-risk-factor GWAS samples overlapped with the SCZ GWAS samples (WTCCC controls). The replication rate based on substudies was not driven by the UK sample (which included the overlapping WTCCC controls), as shown in Figure S4. The current threshold for significant association of SCZ|CVD was set at FDR < 0.01 as a result of the seven CVD risk factors tested. However, the CVD risk factors are highly correlated, and thus the 0.01 level is conservative despite the number of CVD phenotypes tested and is comparable to the standard FDR threshold of 0.05, which translates to 5 false positives per 100 findings.

In the current study, we defined pleiotropy as the association between a single gene or variant and more than one distinct phenotype (diseases or traits).9 It is possible that some of the loci identified in the current study might not be pleiotropic but rather underlie common aspects of the SCZ and CVD-risk-factor phenotypes.9 This can be investigated in samples with more detailed phenotypes. In the present study, we focused on SNPs, but gene-based pleiotropy is also interesting;8 however, this requires raw data from individual participants.

Our results implicate potential shared pathological mechanisms between SCZ and CVD risk factors. The ten pleiotropic loci were associated with multiple CVD phenotypes, supporting the hypothesis that the pathobiology of SCZ is heterogeneous and has numerous underlying mechanisms. The majority of the pleiotropic signal was found with lipid levels, suggesting that lipid biology might be involved in SCZ pathophysiology. As such, genetically determined dyslipidemia in SCZ is in line with evidence for white-matter abnormalities and myelin dysfunction46,47 and supports the neurodevelopmental hypothesis.48 However, the lack of consistent directionality suggests the need for further experimental studies for determining the mechanistic relationship between dyslipidemia and SCZ.

Our results show that a “model-free,” empirical FDR framework that uses unthresholded summary-statistics data from independent GWASs can provide insights into relationships between risk factors and diseases. This approach can be used for examining the shared genetic basis between a number of diseases and traits. With the recent discovery of many common genetic variants influencing diseases and traits, there is increasing interest in pleiotropy. One recent review suggests that pleiotropy is common and associated with ∼17% of genes and ∼5% of SNPs associated with complex human diseases and traits.9 In addition to identifying potential targets for drug development, gaining insight into the degree of genetic “connectivity” between diseases and traits provides an opportunity to ascertain whether current diagnoses and classifications are consistent with genetic architecture or whether genetic similarities traverse clinical conditions. Examining overlap in common variants can elucidate important pathobiology and might identify potential therapeutic targets for common diseases.

In conclusion, the current findings demonstrate that in SCZ, the pleiotropy-informed stratified FDR method can improve the statistical power for detecting “polygenic” effects and can offer insights into mechanistic relationships between lipid biology and SCZ pathogenesis.

Acknowledgments

The authors would like to thank Terry Jernigan for helpful input on this manuscript. O.A.A. was supported by the Kristian Gerhard Jebsen Foundation, the Research Council of Norway, the South East Norway Health Authority, and the Unger-Vetlesen Medical Fund. R.S.D. was supported by National Institutes of Health (NIH) grant T32 EB005970. A.M.D. was supported by NIH grants R01AG031224, R01EB000790, and RC2DA29475. A.J.S. was supported by NIH grants RC2DA029475 and R01HD061414 and the Robert J. Glushko and Pamela Samuelson Graduate Fellowship.

Contributor Information

Ole A. Andreassen, Email: o.a.andreassen@medisin.uio.no.

Anders M. Dale, Email: amdale@ucsd.edu.

Supplemental Data

Document S1. Figures S1–S8, Tables S1–S6, and a list of Schizophrenia Psychiatric GWAS Consortium members


References

  • 1. Glazier A.M., Nadeau J.H., Aitman T.J. Finding genes that underlie complex traits. Science. 2002;298:2345–2349. doi: 10.1126/science.1076641.
  • 2. Hirschhorn J.N., Daly M.J. Genome-wide association studies for common diseases and complex traits. Nat. Rev. Genet. 2005;6:95–108. doi: 10.1038/nrg1521.
  • 3. Hindorff L.A., Sethupathy P., Junkins H.A., Ramos E.M., Mehta J.P., Collins F.S., Manolio T.A. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA. 2009;106:9362–9367. doi: 10.1073/pnas.0903103106.
  • 4. Manolio T.A., Collins F.S., Cox N.J., Goldstein D.B., Hindorff L.A., Hunter D.J., McCarthy M.I., Ramos E.M., Cardon L.R., Chakravarti A. Finding the missing heritability of complex diseases. Nature. 2009;461:747–753. doi: 10.1038/nature08494.
  • 5. Yang J., Benyamin B., McEvoy B.P., Gordon S., Henders A.K., Nyholt D.R., Madden P.A., Heath A.C., Martin N.G., Montgomery G.W. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 2010;42:565–569. doi: 10.1038/ng.608.
  • 6. Yang J., Manolio T.A., Pasquale L.R., Boerwinkle E., Caporaso N., Cunningham J.M., de Andrade M., Feenstra B., Feingold E., Hayes M.G. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 2011;43:519–525. doi: 10.1038/ng.823.
  • 7. Stahl E.A., Wegmann D., Trynka G., Gutierrez-Achury J., Do R., Voight B.F., Kraft P., Chen R., Kallberg H.J., Kurreeman F.A., Diabetes Genetics Replication and Meta-analysis Consortium. Myocardial Infarction Genetics Consortium. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nat. Genet. 2012;44:483–489. doi: 10.1038/ng.2232.
  • 8. Wagner G.P., Zhang J. The pleiotropic structure of the genotype-phenotype map: The evolvability of complex organisms. Nat. Rev. Genet. 2011;12:204–213. doi: 10.1038/nrg2949.
  • 9. Sivakumaran S., Agakov F., Theodoratou E., Prendergast J.G., Zgaga L., Manolio T., Rudan I., McKeigue P., Wilson J.F., Campbell H. Abundant pleiotropy in human complex diseases and traits. Am. J. Hum. Genet. 2011;89:607–618. doi: 10.1016/j.ajhg.2011.10.004.
  • 10. Chambers J.C., Zhang W., Sehmi J., Li X., Wass M.N., Van der Harst P., Holm H., Sanna S., Kavousi M., Baumeister S.E., Alcohol Genome-wide Association (AlcGen) Consortium. Diabetes Genetics Replication and Meta-analyses (DIAGRAM+) Study. Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Global Lipids Genetics Consortium. Genetics of Liver Disease (GOLD) Consortium. International Consortium for Blood Pressure (ICBP-GWAS). Meta-analyses of Glucose and Insulin-Related Traits Consortium (MAGIC). Genome-wide association study identifies loci influencing concentrations of liver enzymes in plasma. Nat. Genet. 2011;43:1131–1138. doi: 10.1038/ng.970.
  • 11. Cotsapas C., Voight B.F., Rossin E., Lage K., Neale B.M., Wallace C., Abecasis G.R., Barrett J.C., Behrens T., Cho J., FOCiS Network of Consortia. Pervasive sharing of genetic effects in autoimmune disease. PLoS Genet. 2011;7:e1002254. doi: 10.1371/journal.pgen.1002254.
  • 12. Sklar P., Ripke S., Scott L.J., Andreassen O.A., Cichon S., Craddock N., Edenberg H.J., Nurnberger J.I., Jr., Rietschel M., Blackwood D., Psychiatric GWAS Consortium Bipolar Disorder Working Group. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat. Genet. 2011;43:977–983. doi: 10.1038/ng.943.
  • 13. Ripke S., Sanders A.R., Kendler K.S., Levinson D.F., Sklar P., Holmans P.A., Lin D.Y., Duan J., Ophoff R.A., Andreassen O.A., Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium. Genome-wide association study identifies five new schizophrenia loci. Nat. Genet. 2011;43:969–976. doi: 10.1038/ng.940.
  • 14. Lichtenstein P., Yip B.H., Björk C., Pawitan Y., Cannon T.D., Sullivan P.F., Hultman C.M. Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: A population-based study. Lancet. 2009;373:234–239. doi: 10.1016/S0140-6736(09)60072-6.
  • 15. Stefansson H., Ophoff R.A., Steinberg S., Andreassen O.A., Cichon S., Rujescu D., Werge T., Pietiläinen O.P., Mors O., Mortensen P.B., Genetic Risk and Outcome in Psychosis (GROUP). Common variants conferring risk of schizophrenia. Nature. 2009;460:744–747. doi: 10.1038/nature08186.
  • 16. Purcell S.M., Wray N.R., Stone J.L., Visscher P.M., O’Donovan M.C., Sullivan P.F., Sklar P., International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748–752. doi: 10.1038/nature08185.
  • 17. Murray C.J., Lopez A.D., editors. The Global Burden of Disease: A comprehensive assessment of mortality, injuries, and risk factors in 1990 and projected to 2020. Vol. 1. Harvard University Press; Cambridge, MA: 1996. (Global Burden of Disease and Injury Series).
  • 18. Colton C.W., Manderscheid R.W. Congruencies in increased mortality rates, years of potential life lost, and causes of death among public mental health clients in eight states. Prev. Chronic Dis. 2006;3:A42.
  • 19. Laursen T.M., Munk-Olsen T., Vestergaard M. Life expectancy and cardiovascular mortality in persons with schizophrenia. Curr. Opin. Psychiatry. 2012;25:83–88. doi: 10.1097/YCO.0b013e32835035ca.
  • 20. Saha S., Chant D., McGrath J. A systematic review of mortality in schizophrenia: Is the differential mortality gap worsening over time? Arch. Gen. Psychiatry. 2007;64:1123–1131. doi: 10.1001/archpsyc.64.10.1123.
  • 21. Marder S.R., Essock S.M., Miller A.L., Buchanan R.W., Casey D.E., Davis J.M., Kane J.M., Lieberman J.A., Schooler N.R., Covell N. Physical health monitoring of patients with schizophrenia. Am. J. Psychiatry. 2004;161:1334–1349. doi: 10.1176/appi.ajp.161.8.1334.
  • 22. Mitchell A.J., Vancampfort D., Sweers K., van Winkel R., Yu W., De Hert M. Prevalence of Metabolic Syndrome and Metabolic Abnormalities in Schizophrenia and Related Disorders—A Systematic Review and Meta-Analysis. Schizophr. Bull. 2011. doi: 10.1093/schbul/sbr148.
  • 23. American Diabetes Association. American Psychiatric Association. American Association of Clinical Endocrinologists. North American Association for the Study of Obesity. Consensus development conference on antipsychotic drugs and obesity and diabetes. Diabetes Care. 2004;27:596–601. doi: 10.2337/diacare.27.2.596.
  • 24. De Hert M.A., van Winkel R., Van Eyck D., Hanssens L., Wampers M., Scheen A., Peuskens J. Prevalence of the metabolic syndrome in patients with schizophrenia treated with antipsychotic medication. Schizophr. Res. 2006;83:87–93. doi: 10.1016/j.schres.2005.12.855.
  • 25. Kaddurah-Daouk R., McEvoy J., Baillie R.A., Lee D., Yao J.K., Doraiswamy P.M., Krishnan K.R. Metabolomic mapping of atypical antipsychotic effects in schizophrenia. Mol. Psychiatry. 2007;12:934–945. doi: 10.1038/sj.mp.4002000.
  • 26. Raphael T.P., Parsons J.P. Blood sugar studies in dementia praecox and manic-depressive insanity. Arch. Neurol. Psychiatry. 1921;5:687–709.
  • 27. Ryan M.C., Collins P., Thakore J.H. Impaired fasting glucose tolerance in first-episode, drug-naive patients with schizophrenia. Am. J. Psychiatry. 2003;160:284–289. doi: 10.1176/appi.ajp.160.2.284.
  • 28. Hansen T., Ingason A., Djurovic S., Melle I., Fenger M., Gustafsson O., Jakobsen K.D., Rasmussen H.B., Tosato S., Rietschel M. At-risk variant in TCF7L2 for type II diabetes increases risk of schizophrenia. Biol. Psychiatry. 2011;70:59–63. doi: 10.1016/j.biopsych.2011.01.031.
  • 29. Ehret G.B., Munroe P.B., Rice K.M., Bochud M., Johnson A.D., Chasman D.I., Smith A.V., Tobin M.D., Verwoert G.C., Hwang S.J., International Consortium for Blood Pressure Genome-Wide Association Studies. CARDIoGRAM consortium. CKDGen Consortium. KidneyGen Consortium. EchoGen consortium. CHARGE-HF consortium. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011;478:103–109. doi: 10.1038/nature10405.
  • 30. Teslovich T.M., Musunuru K., Smith A.V., Edmondson A.C., Stylianou I.M., Koseki M., Pirruccello J.P., Ripatti S., Chasman D.I., Willer C.J. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010;466:707–713. doi: 10.1038/nature09270.
  • 31. Voight B.F., Scott L.J., Steinthorsdottir V., Morris A.P., Dina C., Welch R.P., Zeggini E., Huth C., Aulchenko Y.S., Thorleifsson G., MAGIC investigators. GIANT Consortium. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 2010;42:579–589. doi: 10.1038/ng.609.
  • 32. Speliotes E.K., Willer C.J., Berndt S.I., Monda K.L., Thorleifsson G., Jackson A.U., Lango Allen H., Lindgren C.M., Luan J., Mägi R., MAGIC. Procardis Consortium. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 2010;42:937–948. doi: 10.1038/ng.686.
  • 33. Heid I.M., Jackson A.U., Randall J.C., Winkler T.W., Qi L., Steinthorsdottir V., Thorleifsson G., Zillikens M.C., Speliotes E.K., Mägi R., MAGIC. Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat. Genet. 2010;42:949–960. doi: 10.1038/ng.685.
  • 34.Yoo Y.J., Pinnaduwage D., Waggott D., Bull S.B., Sun L. Genome-wide association analyses of North American Rheumatoid Arthritis Consortium and Framingham Heart Study data utilizing genome-wide linkage results. BMC Proc. 2009;3(Suppl 7):S103. doi: 10.1186/1753-6561-3-s7-s103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Sun L., Craiu R.V., Paterson A.D., Bull S.B. Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies. Genet. Epidemiol. 2006;30:519–530. doi: 10.1002/gepi.20164. [DOI] [PubMed] [Google Scholar]
  • 36.Efron B. Cambridge University Press; New York: 2010. Large-scale inference: Empirical Bayes methods for estimation, testing, and prediction. [Google Scholar]
  • 37.Schweder T., Spjotvoll E. Plots of P-Values to Evaluate Many Tests Simultaneously. Biometrika. 1982;69:493–502. [Google Scholar]
  • 38.Devlin B., Roeder K. Genomic control for association studies. Biometrics. 1999;55:997–1004. doi: 10.1111/j.0006-341x.1999.00997.x. [DOI] [PubMed] [Google Scholar]
  • 39.Yang J., Weedon M.N., Purcell S., Lettre G., Estrada K., Willer C.J., Smith A.V., Ingelsson E., O’Connell J.R., Mangino M., GIANT Consortium Genomic inflation factors under polygenic inheritance. Eur. J. Hum. Genet. 2011;19:807–812. doi: 10.1038/ejhg.2011.39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Benjamini Y., Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Series B Stat. Methodol. 1995;57:289–300. [Google Scholar]
  • 41.Efron B. Size, power and false discovery rates. Ann. Stat. 2007;35:1351–1377. [Google Scholar]
  • 42.Nichols T., Brett M., Andersson J., Wager T., Poline J.B. Valid conjunction inference with the minimum statistic. Neuroimage. 2005;25:653–660. doi: 10.1016/j.neuroimage.2004.12.005. [DOI] [PubMed] [Google Scholar]
  • 43.Wang K.S., Liu X.F., Aragam N. A genome-wide meta-analysis identifies novel loci associated with schizophrenia and bipolar disorder. Schizophr. Res. 2010;124:192–199. doi: 10.1016/j.schres.2010.09.002. [DOI] [PubMed] [Google Scholar]
  • 44.Sullivan P.F. Puzzling over schizophrenia: Schizophrenia as a pathway disease. Nat. Med. 2012;18:210–211. doi: 10.1038/nm.2670. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Craiu R.V., Sun L. Choosing the lesser evil: Trade-off between false discovery rate and non-discovery rate. Stat. Sin. 2008;18:861–879. [Google Scholar]
  • 46.Davis K.L., Stewart D.G., Friedman J.I., Buchsbaum M., Harvey P.D., Hof P.R., Buxbaum J., Haroutunian V. White matter changes in schizophrenia: Evidence for myelin-related dysfunction. Arch. Gen. Psychiatry. 2003;60:443–456. doi: 10.1001/archpsyc.60.5.443. [DOI] [PubMed] [Google Scholar]
  • 47.Karoutzou G., Emrich H.M., Dietrich D.E. The myelin-pathogenesis puzzle in schizophrenia: A literature review. Mol. Psychiatry. 2008;13:245–260. doi: 10.1038/sj.mp.4002096. [DOI] [PubMed] [Google Scholar]
  • 48.Marenco S., Weinberger D.R. The neurodevelopmental hypothesis of schizophrenia: Following a trail of evidence from cradle to grave. Dev. Psychopathol. 2000;12:501–527. doi: 10.1017/s0954579400003138. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Document S1. Figures S1–S8, Tables S1–S6, and a list of Schizophrenia Psychiatric GWAS Consortium members
mmc1.pdf (1.6MB, pdf)

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
