Abstract
Schizophrenia is a highly heritable disorder with a polygenic pattern of inheritance and a population prevalence of ∼1%. Previous studies have implicated synaptic dysfunction in schizophrenia. We tested the accumulated association of genetic variants in expert-curated synaptic gene groups with schizophrenia in 4673 cases and 4965 healthy controls, using functional gene group analysis. Identifying groups of genes with similar cellular function rather than genes in isolation may have clinical implications for finding additional drug targets. We found that a group of 1026 synaptic genes was significantly associated with the risk of schizophrenia (P=7.6 × 10⁻¹¹) and more strongly associated than 100 randomly drawn, matched control groups of genetic variants (P<0.01). Subsequent analysis of synaptic subgroups suggested that the strongest association signals are derived from three synaptic gene groups: intracellular signal transduction (P=2.0 × 10⁻⁴), excitability (P=9.0 × 10⁻⁴) and cell adhesion and trans-synaptic signaling (P=2.4 × 10⁻³). These results are consistent with a role of synaptic dysfunction in schizophrenia and imply that impaired intracellular signal transduction in synapses, synaptic excitability and cell adhesion and trans-synaptic signaling play a role in the pathology of schizophrenia.
Keywords: GWAS, ISC, GAIN, gene group analysis, synapse, genome-wide association
Introduction
Schizophrenia is a chronic and debilitating brain disorder that affects ∼1% of the population.1 It is characterized by delusional beliefs, hallucinations, disordered speech and deficits in emotional and social behavior (see, for example, Mowry et al.2) and is highly familial with heritability estimates of 81%.3
Genome-wide association (GWA) studies have so far explained only a small proportion of the genetic variance in schizophrenia. They are limited in power because of the many tests performed, and they do not necessarily lead to knowledge about the molecular mechanisms of a clinical trait. In addition, the most strongly associated variants, when part of a pathway, might not be the best drug targets for therapeutic intervention, and identifying variants in the same cellular pathway or functionally related gene group may help in finding additional drug targets. Recent GWA studies for schizophrenia have implicated the major histocompatibility complex on 6p21.2–22.1, neurogranin (NRGN) and transcription factor 4 (TCF4).4, 5, 6 In addition, they have provided molecular genetic evidence for a substantial polygenic component, implicating a large number of single-nucleotide polymorphisms (SNPs) of very small effect in the etiology of schizophrenia.4 Some of these are expected to exceed genome-wide significance in larger samples, but the currently available sample sizes are insufficient to detect these effects.7 Therefore, SNPs in the 5 × 10⁻⁸ to 1 × 10⁻⁶ band are a mix of SNPs, some of true effect and some that are false positives. The exact mix of true and false positives is currently unknown. Gene set or pathway analysis involves testing for the combined effect of multiple SNPs, which individually may have a very small effect that does not reach significance. By using a competitive testing scheme, associations of gene sets with a disease are corrected for false positives.
It seems likely that the substantial polygenic component involves SNPs that are not distributed randomly across the genome but are distributed across genes that share a common biological function or pathway.8, 9, 10, 11 Recent pathway analyses provided evidence for the importance of the cell adhesion molecule pathway in schizophrenia,12 as well as the glutamate metabolism, transforming growth factor-β signaling and tumor-necrosis factor receptor-1 pathways.10
Pathway analysis is predicated upon accurate pathway definitions and validated assignment of genes to pathways. However, many of the available databases used for pathway definitions are not optimally annotated and the same pathways can be differentially defined across databases. Classically defined pathways are usually not independent, as the same genes, especially the end points, are often active in different pathways. Consequently, genetic variation that affects the expression or function of genes in different pathways may have similar consequences, have similar impact on pathogenesis and show similar disease association. Genes may also be grouped according to shared cellular function (see, for example, Ruano et al.11). Such ‘functional gene grouping' goes across the traditionally defined biological pathways as it groups genes based on similar cellular function, and not based on a cascade of induced events, as in biological pathways. We recently proposed such a functional gene grouping approach to test for the combined effect of genetic variants in genes with shared cellular function in the synapse using a manually curated database of gene function based on both experimentation and data mining.11 Using this approach, it was found that one relay element involved in many pathways (G proteins) was associated with cognitive traits, a strong association, which had remained unnoticed by traditional single-marker analysis.11
Numerous statistical methods are available to evaluate the enrichment of selected pathways or functional gene groups for selected traits, including, for example, the Gene Set Enrichment Analysis,13, 14 testing for overrepresentation of categories of genes,9 the SNP ratio test,12 hypergeometric tests (see, for example, Jia et al.10) or the Σ-log(P) method combined with permutation.11 Most of these methods correct for linkage disequilibrium between SNPs, the number of SNPs per gene, gene size and multiple testing of independent pathways. Permutation is generally used to determine how likely a given result is if the null hypothesis of no association is true. However, the more genes are present in the defined gene group, the more likely it becomes to observe smaller P-values. In addition, generally these methods do not test how unique a certain result is given the polygenic nature of many studied traits. The latter involves testing whether a given pathway is significantly associated with a trait because it (1) includes a lot of genes and the trait is polygenic in nature or (2) because of the biological function of the pathway. This can be resolved by testing for association of matched-control gene groups in comparison with the targeted gene groups or pathways.
The purpose of the current study is to apply a functional gene group approach to detect well-annotated functional gene groups that are important to the risk of schizophrenia. The synaptic hypothesis of schizophrenia15, 16, 17, 18 is one of the leading hypotheses in the field of schizophrenia. Recent genetic findings underscore the importance of synaptic dysfunction in schizophrenia.6, 12 Therefore, we formally tested whether the group of genes involved in pre- and post-synaptic functioning is related to schizophrenia, and whether this group is more strongly related than randomly drawn matched sets of genes, using a ‘competitive' control method.19 Apart from testing all synaptic genes for an association with the risk of schizophrenia, we also tested 17 subgroups of synaptic functioning, defined based on data mining and experimentation.11 We used the data genotyped within the International Schizophrenia Consortium (ISC) case–control sample4 and the Genetic Association Information Network (GAIN) schizophrenia data set.
Materials and methods
Participants and genotyping
The ISC case–control sample includes 3322 cases and 3587 controls (European ancestry), derived from seven different collection sites and is described in detail elsewhere.4 Subjects were genotyped on Affymetrix 5.0 or 6.0 SNP arrays (Affymetrix, Santa Clara, CA, USA), and 300 523 SNPs passed quality control for the ISC_affy5 sample (3353 subjects) and 717 126 SNPs for the ISC_affy6 sample (3556 subjects).
The GAIN schizophrenia case–control sample has been described elsewhere5, 20 and was downloaded from dbGaP (phs000021.v2.p1). Briefly, this sample included 1351 cases and 1378 controls of European ancestry. All individuals were genotyped on the Affymetrix 6.0 array, and 727 872 SNPs were available for analysis. The quality control procedures followed those described in Shi et al.5 The GAIN sample and the two ISC samples (affy5 and affy6) are independent and non-overlapping, and together contain 9638 individuals (4673 cases/4965 controls) of European ancestry.
Defining functional gene groups
Synaptic functional gene group definition was based on cellular function as determined by previous protein identification and data mining for synaptic genes and gene function.11 Genes were considered ‘synaptic' based on proteomic analysis of synaptic preparations.21 In the case of presynaptic genes, an additional expert curation was performed because only a few analyses of highly purified preparations are currently available for the presynaptic proteome, except synaptic vesicles.22 Hence, presynaptic genes not covered by Takamori et al.22 were manually curated using published functional data and a cumulative scoring paradigm with the following set of weighted criteria: null mutation produces a synaptic phenotype; activation of the gene product (for example, receptor) or blockade thereof directly modulates synaptic function; and immunoelectron microscopy detects gene product in the synapse. More than 500 PubMed entries were manually screened. Although this approach introduces a bias toward well-studied genes, this is inherent to creating functional gene groups, as functional grouping is by definition limited to those genes for which functional data are available. Synaptic genes were subdivided into 17 functional groups based on shared cellular function (a full listing of genes assigned to functional groups is provided in the Supplementary Material, Table S4).
SNP assignment
All SNPs that survived quality control in the ISC and GAIN samples were mapped to genes on the basis of NCBI (National Center for Biotechnology Information) human assembly build 36.3 and dbSNP release 129 (following Holmans et al.9). For the definition of the gene boundaries we downloaded the ‘seq_gene.md' file from the FTP website of NCBI. From this list of records we deleted genes coded as pseudo in the column ‘feature_type'. Subsequently, we selected the records with gene as ‘feature_type' and reference as ‘group_label'. For these records, we assigned SNPs to genes when annotated between ‘chr_start' (transcription start site) and ‘chr_stop' (transcription stop site).
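The positional rule described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the `Gene` record, the function name, and the toy identifiers and coordinates are hypothetical, gene records are assumed to be already filtered to `feature_type` gene and `group_label` reference, and treating both boundaries as inclusive is an assumption.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # corresponds to 'chr_start'
    stop: int   # corresponds to 'chr_stop'


def assign_snps_to_genes(snps, genes):
    """Map each SNP id to the genes whose [start, stop] interval contains it.

    snps: dict of snp_id -> (chromosome, position)
    genes: iterable of Gene records (already filtered as described above)
    SNPs that fall in no gene are left out of the result (nongenic).
    """
    genes_by_chrom = {}
    for gene in genes:
        genes_by_chrom.setdefault(gene.chrom, []).append(gene)

    assignment = {}
    for snp_id, (chrom, pos) in snps.items():
        hits = [g.name for g in genes_by_chrom.get(chrom, [])
                if g.start <= pos <= g.stop]  # inclusive bounds: an assumption
        if hits:
            assignment[snp_id] = hits
    return assignment


# Toy data, made up for illustration only.
genes = [
    Gene("GENE_A", "1", 100, 500),
    Gene("GENE_B", "1", 450, 900),
    Gene("GENE_C", "2", 100, 200),
]
snps = {
    "snp1": ("1", 120),
    "snp2": ("1", 470),   # lies in the overlap of GENE_A and GENE_B
    "snp3": ("1", 950),   # nongenic
    "snp4": ("2", 200),   # exactly on a gene boundary
}
assignment = assign_snps_to_genes(snps, genes)
```

A SNP in overlapping genes is assigned to each of them in this sketch; how such cases were handled in the original analysis is not stated here.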
Association analysis
SNP association analyses were carried out using additive models of allele counts. For the ISC data set, a correction for clustering within stratum (collection site) was performed.4 Cochran–Mantel–Haenszel tests implemented in PLINK were used for the association analyses (PLINK, Boston, MA, USA). All analyses were carried out separately for the ISC_affy5, ISC_affy6 and the GAIN data sets. Empirical P-values from the three data sets were combined by Stouffer's weighted Z-transform method23 to obtain an overall P-value.
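As an illustration of the combination step, the weighted Z-transform can be sketched in a few lines. The function name and the choice of weights (square root of each sample size) are assumptions made for this sketch; the weights actually used are not stated in this section, so the output is not expected to reproduce the combined P-values reported in the paper. The input P-values are made up.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_weighted(p_values, weights):
    """Combine one-sided P-values with Stouffer's weighted Z-transform.

    Each P-value is converted to a Z-score, the Z-scores are averaged with
    the given weights, and the result is converted back to a P-value.
    """
    nd = NormalDist()
    z_scores = [-nd.inv_cdf(p) for p in p_values]  # small p -> large positive Z
    z_combined = (sum(w * z for w, z in zip(weights, z_scores))
                  / sqrt(sum(w * w for w in weights)))
    return nd.cdf(-z_combined)  # upper-tail probability of the combined Z


# Sample sizes taken from the text (ISC_affy5, ISC_affy6, GAIN);
# using sqrt(N) as the weight is an assumption of this sketch.
sample_sizes = [3353, 3556, 2729]
weights = [sqrt(n) for n in sample_sizes]
toy_p_values = [0.02, 0.10, 0.01]  # illustrative values only
combined_p = stouffer_weighted(toy_p_values, weights)
```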
Evaluating the combined effect of all SNPs in a functional gene group: the Σ-log(P) method
We summed the logarithm of the reciprocal of the P-values (denoted as the Σ-log(P) method)24, 25 as previously applied to gene group analysis11 to determine the significance of the combined effect of SNPs annotated to genes in a functional group. The Σ-log(P) method takes the P-values from the association analyses of all genetic variants in a group, calculates the –log10 of each P-value and sums these values over the group to obtain the Σ-log(P) test statistic. To allow unbiased interpretation of the Σ-log(P) test statistic, 10 000 permutations were conducted by permuting affection status over genotypes; these permutations are implicitly conditional on linkage disequilibrium, sample size, gene size, the number of SNPs per gene and the number of genes per group. With this permutation procedure, only the relation between any genetic variant and affection status was disconnected, whereas linkage disequilibrium structure was kept intact. In addition, each group of genetic variants included the same (numbers of) SNPs and genes and had the same sample size as the original data set. For each permutation, we obtained the Σ-log(P) for each functional group. We then compared the observed Σ-log(P) of a group with this empirical null distribution by calculating the proportion of permuted data sets in which the Σ-log(P) was higher than the observed Σ-log(P); this proportion is the empirical P-value.
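The permutation scheme above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `allelic_p` is a stand-in 1-df allelic chi-square test (the study used PLINK for the per-SNP tests), all function names are hypothetical, and the toy data are made up. The key point it shows is that only the affection-status labels are shuffled, so the genotype rows, and hence the correlation between SNPs, stay intact.

```python
import math
import random


def allelic_p(genotypes, status):
    """Stand-in per-SNP test: 1-df chi-square on allele counts (0/1/2 coding).

    Not the test used in the study; chosen only to keep the sketch
    self-contained. For 1 df, P = erfc(sqrt(chi2 / 2)).
    """
    n_case = 2 * sum(status)
    n_ctrl = 2 * (len(status) - sum(status))
    a_case = sum(g for g, s in zip(genotypes, status) if s == 1)
    a_ctrl = sum(g for g, s in zip(genotypes, status) if s == 0)
    table = [[a_case, n_case - a_case], [a_ctrl, n_ctrl - a_ctrl]]
    rows = [n_case, n_ctrl]
    cols = [a_case + a_ctrl, n_case + n_ctrl - a_case - a_ctrl]
    total = n_case + n_ctrl
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    return max(math.erfc(math.sqrt(chi2 / 2.0)), 1e-300)  # avoid log10(0)


def sum_neg_log_p(snp_rows, status):
    """Sum of -log10(P) over all SNPs in the group."""
    return sum(-math.log10(allelic_p(row, status)) for row in snp_rows)


def empirical_p(snp_rows, status, n_perm=10000, seed=0):
    """Proportion of permutations whose statistic is higher than the observed one."""
    rng = random.Random(seed)
    observed = sum_neg_log_p(snp_rows, status)
    shuffled = list(status)
    higher = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute affection status only
        if sum_neg_log_p(snp_rows, shuffled) > observed:
            higher += 1
    return observed, higher / n_perm


# Toy group of two SNPs in 10 cases followed by 10 controls (made-up data).
status = [1] * 10 + [0] * 10
snp_rows = [
    [2] * 10 + [0] * 10,                      # perfectly separates cases/controls
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0] * 2,       # unrelated to status
]
observed_stat, p_emp = empirical_p(snp_rows[:1], status, n_perm=200)
```

With a finite number of permutations the proportion can be zero; how such cases are reported (for example as below 1/n_perm) is a convention this sketch does not fix.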
Controlling for known polygenic effect on schizophrenia
The permutation approach described above provides information on how likely a given value of the combined effect of all SNPs in a group of genes is under the null hypothesis of no association of any SNP included in the functional group with the risk of schizophrenia (that is, self-contained testing).19 We additionally applied matched-control methods (that is, competitive testing) that test whether randomly drawn groups of SNPs/genes would provide an equally or more significant (combined) empirical P-value as compared with the combined P-value from the group of synaptic genes. We created control gene groups matched for the number of genes (method 1) and groups that were matched for the effective number of SNPs, which could be drawn from all genic and nongenic SNPs (method 2), from genic SNPs only (method 3), from nongenic SNPs only (method 4) or from genic SNPs in brain-expressed genes only (method 5). The effective number of SNPs denotes the number of independent SNPs that is consistent with the empirical mean and variance of the distribution of the test statistic under the null hypothesis of no association26 (see Supplementary Material, section 2). Matching for the number of genes as well as for the effective number of SNPs in a functional gene group would be ideal but is highly limited as there will only be a few (<5) sets of gene groups that can be created when the original group of genes is large (1026 in our case). We thus created matched control groups following the five methods described above, each testing slightly different null hypotheses (see Table 1).
Table 1. Five applied competitive control methods to test whether synaptic genes are more strongly associated with the risk for schizophrenia than any other set of randomly grouped genes or single-nucleotide polymorphisms (SNPs).
Method | Matched for | SNPs drawn from | Null hypothesis |
---|---|---|---|
1 | Number of genes | Genic SNPs, excluding SNPs in synaptic genes | No more evidence for association in the group of synaptic genes than any other set of an equal number of genes |
2 | Effective number of SNPs | Genic and nongenic SNPs, excluding SNPs in synaptic genes | No more evidence for association in the group of synaptic genes than any other set of an equal effective number of SNPs |
3 | Effective number of SNPs | Genic SNPs, excluding SNPs in synaptic genes | No more evidence for association in the group of synaptic genes than any other set of an equal effective number of genic SNPs |
4 | Effective number of SNPs | Nongenic SNPs | No more evidence for association in the group of synaptic genes than any other set of an equal effective number of nongenic SNPs |
5 | Effective number of SNPs | Genic SNPs in genes expressed in brain, excluding SNPs in synaptic genes | No more evidence for association in the group of synaptic genes than any other set of an equal effective number of genic SNPs from brain-expressed genes |
For each of the five competitive test designs, 100 matched control groups were drawn. For each draw we carried out an association analysis of all SNPs in the matched control group in each of the three data sets, calculated the Σ-log(P) and then conducted 10 000 permutations of the data set to determine the empirical P-value of each of the 100 matched control groups of genes in each data set, similar to the actual analysis with the group of synaptic genes. These empirical P-values were combined across the three data sets using Stouffer's weighted Z-transform method.23 For the five control designs, we thus obtained five sets of 100 combined empirical P-values. We then calculated how often the true combined empirical P-value (from the synaptic gene group) was higher than the combined empirical P-value from the matched control groups and divided that by the number of draws. As there were 100 draws, the lowest empirical P-value of the combined empirical P-value that could be obtained was <0.01, when none of the combined empirical P-values from the random draws was equal or more significant than the combined empirical P-value from the synaptic gene group (see Figure 1 for a graphical overview of the steps in data analysis).
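The final step of the competitive test can be sketched as below. The function names and toy values are hypothetical, only the gene-count matching of method 1 is shown (matching on the effective number of SNPs is not), and counting ties as "equally significant" against the target group is an assumption of this sketch.

```python
import random


def draw_gene_matched_control(candidate_genes, n_genes, rng):
    """Draw one control group with as many genes as the target group (method 1)."""
    return rng.sample(candidate_genes, n_genes)


def competitive_p(target_p, control_ps):
    """Fraction of control groups that are at least as significant as the target.

    target_p: combined empirical P-value of the target (synaptic) group
    control_ps: combined empirical P-values of the matched control groups
    A result of 0 with n draws is reported as P < 1/n.
    """
    at_least_as_significant = sum(1 for p in control_ps if p <= target_p)
    return at_least_as_significant / len(control_ps)


# Toy gene universe (made-up identifiers).
rng = random.Random(1)
all_genes = [f"G{i}" for i in range(50)]
synaptic = set(all_genes[:10])
candidates = [g for g in all_genes if g not in synaptic]  # exclude target genes
control_group = draw_gene_matched_control(candidates, len(synaptic), rng)

# Toy combined empirical P-values for four control draws.
example = competitive_p(0.001, [0.2, 0.0005, 0.03, 0.6])
```

With 100 draws, as in the study, the smallest value that can be reported this way is P < 0.01.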
Enrichment tests of previously implicated genes
To test whether synaptic functional groups contained previously implicated genes more often than by chance, we retrieved all SNPs with P⩽1.0 × 10⁻⁵ from all significant loci reported in GWA studies for schizophrenia that were published before 14 February 2011, using the GWAS catalog,27 and mapped these loci to protein-coding genes (NCBI build v36.3). In addition, we added genes implicated from genome-wide copy number variation studies. Fisher's exact tests were used to determine the presence of enrichment.
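For readers unfamiliar with the enrichment test, a one-sided Fisher's exact test can be written directly from the hypergeometric distribution. The function name and the counts below are made up for illustration, and the use of a one-sided (enrichment-only) test is an assumption; this section does not state whether one- or two-sided tests were used.

```python
from math import comb


def fisher_enrichment_p(hits_in_group, group_size, total_hits, total_genes):
    """One-sided Fisher's exact test for enrichment.

    Returns P(X >= hits_in_group), where X is the number of previously
    implicated genes found in a random group of `group_size` genes drawn
    from `total_genes` genes, of which `total_hits` are implicated.
    """
    upper = min(group_size, total_hits)
    favourable = sum(
        comb(total_hits, k) * comb(total_genes - total_hits, group_size - k)
        for k in range(hits_in_group, upper + 1)
    )
    return favourable / comb(total_genes, group_size)


# Made-up counts: 10 genes in total, 3 previously implicated,
# a functional group of 4 genes that contains all 3 of them.
p_toy = fisher_enrichment_p(hits_in_group=3, group_size=4,
                            total_hits=3, total_genes=10)
```

Because the test conditions only on counts, it does not account for gene size, which is why enrichment driven by very large genes calls for cautious interpretation.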
Results
Are synaptic genes significantly associated with the risk of schizophrenia?
No individual SNP reached the threshold for genome-wide significance in any of the three data sets using a genome-wide association analysis for each data set separately (see Supplementary Material, Section 1). Functional gene group analysis of all 1026 synaptic genes jointly resulted in a significant association of the total group of pre- and post-synaptic genes to the risk of schizophrenia. This was true in all three samples separately and highly significant when combined across samples, with a combined P-value of 7.6 × 10⁻¹¹. For each sample, the Σ-log(P) obtained from the original analysis with all synaptic genes was in the higher end of the empirical distribution and highly significant with only one of 10 000 permutations exceeding the observed Σ-log(P) for the ISC_affy5, ISC_affy6 and GAIN data sets (see Figure 2).
Are synaptic genes more significantly associated with the risk of schizophrenia than randomly drawn groups of genes/genetic variants?
Results from the five control methods show that SNPs in synaptic genes are more strongly associated with the risk of schizophrenia than any other set of randomly drawn genes. For none of the control methods did we find a combined empirical P-value that was more significant than the combined empirical P-value from the synaptic genes (see Figure 3). The ‘empirical P-value of the combined empirical P-value' was <0.01 in all methods, suggesting that the group of synaptic genes is generally more strongly associated with schizophrenia than other groups of genes that either include the same number of genes, the same effective number of nongenic or genic SNPs, nongenic SNPs only, genic SNPs only, or the same effective number of SNPs drawn from brain-expressed genes only (see Supplementary Material, Sections 2 and 3).
Which synaptic subgroups are most strongly related to the risk of schizophrenia?
We tested 17 synaptic subgroups and one group of synaptic genes that did not share a known function for association with schizophrenia. We found that three synaptic subgroups were significantly associated with increased risk of schizophrenia under the null hypothesis that none of the SNPs in these groups were associated with schizophrenia: intracellular signal transduction group (P=0.0002), genes related to excitability (P=0.0009) and genes involved in CAT signaling (P=0.0024) (see Table 2). The matched control methods for these subgroups resulted in P-values between 0.02 and 0.04 for the intracellular signal transduction group, P-values between <0.01 and 0.03 for the excitability group and between 0.03 and 0.06 for the CAT signaling group (see Figure 3 and Supplementary Material, Section 4).
Table 2. Association of synaptic functional gene groups with schizophrenia in the three data sets.
Functional gene group | ISC AFFY5: N genes | ISC AFFY5: N SNPs | ISC AFFY5: Σ-log(P) | ISC AFFY5: PEMP | ISC AFFY6: N genes | ISC AFFY6: N SNPs | ISC AFFY6: Σ-log(P) | ISC AFFY6: PEMP | GAIN AFFY6: N genes | GAIN AFFY6: N SNPs | GAIN AFFY6: Σ-log(P) | GAIN AFFY6: PEMP | ALL: PCOMB |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
All synaptic genes | 795 | 15 105 | 7102 | 1.0E–04 | 906 | 34 860 | 16 071 | 1.0E–04 | 908 | 35 412 | 16 348 | 1.0E–04 | 7.6E–11 |
Intracellular signal transduction | 112 | 2350 | 1140 | 0.0061 | 133 | 5387 | 2590 | 0.0037 | 134 | 5475 | 2450 | 0.2088 | 0.0002 |
Excitability | 47 | 1120 | 555 | 0.0233 | 50 | 2656 | 1238 | 0.1026 | 50 | 2680 | 1327 | 0.0100 | 0.0009 |
CAT signaling | 69 | 3278 | 1483 | 0.1568 | 79 | 7866 | 3564 | 0.1181 | 79 | 7962 | 3888 | 0.0013 | 0.0024 |
Endocytosis | 20 | 257 | 133 | 0.1000 | 23 | 576 | 312 | 0.0379 | 23 | 581 | 312 | 0.0384 | 0.0029 |
Structural plasticity | 72 | 1169 | 547 | 0.1388 | 81 | 2463 | 1172 | 0.0599 | 82 | 2494 | 1171 | 0.0887 | 0.0108 |
GPCR signaling | 31 | 786 | 395 | 0.0395 | 37 | 1823 | 879 | 0.0832 | 37 | 1844 | 823 | 0.3340 | 0.0162 |
‘Unknown' | 45 | 556 | 266 | 0.1250 | 50 | 1514 | 786 | 0.0087 | 50 | 1542 | 641 | 0.7143 | 0.0272 |
Protein cluster | 40 | 893 | 463 | 0.0160 | 46 | 2039 | 1001 | 0.0435 | 46 | 2069 | 840 | 0.8210 | 0.0272 |
Tyrosine kinase signaling | 7 | 334 | 185 | 0.0190 | 7 | 786 | 352 | 0.3401 | 7 | 798 | 356 | 0.3802 | 0.0491 |
Cell metabolism | 41 | 216 | 112 | 0.0581 | 44 | 473 | 198 | 0.6137 | 44 | 477 | 224 | 0.1830 | 0.1154 |
Neurotransmitter metabolism | 25 | 258 | 113 | 0.4492 | 27 | 580 | 290 | 0.0832 | 27 | 606 | 274 | 0.3137 | 0.1166 |
Intracellular trafficking | 59 | 455 | 205 | 0.3364 | 70 | 987 | 424 | 0.5422 | 69 | 1020 | 519 | 0.0286 | 0.1329 |
LGIC signaling | 33 | 842 | 403 | 0.1210 | 35 | 1910 | 820 | 0.5437 | 35 | 1948 | 855 | 0.4308 | 0.2370 |
Exocytosis | 68 | 1170 | 512 | 0.4321 | 78 | 2547 | 1084 | 0.6207 | 78 | 2589 | 1170 | 0.2500 | 0.4062 |
RPSFB | 47 | 376 | 160 | 0.5549 | 62 | 869 | 403 | 0.1933 | 63 | 889 | 368 | 0.7077 | 0.4204 |
Ion balance/transport | 35 | 322 | 129 | 0.7092 | 41 | 727 | 295 | 0.7088 | 41 | 751 | 372 | 0.1044 | 0.5266 |
Peptide/neurotrophin signals | 21 | 515 | 221 | 0.5124 | 21 | 1122 | 427 | 0.8919 | 21 | 1132 | 530 | 0.2133 | 0.6615 |
G-protein relay | 23 | 208 | 81 | 0.7922 | 23 | 536 | 235 | 0.4444 | 23 | 556 | 228 | 0.6758 | 0.7327 |
Abbreviations: CAT, cell adhesion and trans-synaptic molecule; GAIN, Genetic Association Information Network; GPCR, G-protein-coupled receptor; ISC, International Schizophrenia Consortium; LGIC, ligand-gated ion channel; RPSFB, RNA and protein synthesis, folding and breakdown; SNP, single-nucleotide polymorphism; Unknown, genes that are known to be expressed in the synapse but currently have no known shared function with other genes.
All PEMP values are based on 10 000 permutations of the data.
The signal of the most significant functional group (intracellular signal transduction) was mainly derived from the two ISC samples, whereas the GAIN sample contributed less to the overall evidence of significance of these groups but contributed mostly to association with the CAT signaling group. For the group of excitability genes, however, all three samples independently showed nominally significant or suggestive evidence. Quantile-quantile (Q-Q) plots (Supplementary Figures S3a–c) of the significant functional groups in the three samples show that for each functional group a multitude of SNPs in multiple genes, each of small effect, contribute to the overall significance, suggesting that the association cannot be explained by only a few genes in the group but rather by the joint effect of many genes in the functional group.
Synaptic subgroups include genes associated previously with schizophrenia
We tested whether the synaptic functional groups included genes for schizophrenia previously implicated from GWAS or copy number variation studies. The intracellular signal transduction group includes NRGN that was one of the most significant genes identified in the SGENE+-based GWAS,6 but was not below the genome-wide threshold in the ISC or MGS GWAS.5 The excitability group contains CACNA1C, which was one of the two most significant genes identified in a recent GWAS for bipolar disorder,28 and was recently also associated with schizophrenia.29 From the group of genes involved in CAT signaling, four genes were implicated previously in schizophrenia. Enrichment analysis of previously implicated genes in schizophrenia from GWAS and copy number variation studies indicated significant, although moderate, enrichment of previously associated genes in the total group of synaptic genes (P=0.02) using Fisher's exact test (see Supplementary Material, Section 5). This enrichment was mainly because of enrichment in the CAT signaling group (P=0.0002). However, three out of the four genes in CAT signaling group that were implicated previously were very large genes. As significant results from GWAS studies may be biased toward large genes, the enrichment test for the CAT signaling group needs to be interpreted with caution.
Discussion
Our overarching goal was to test whether genetic variation associated with schizophrenia risk accumulates in functional gene groups operating in the synapse. We showed that the total group of genes encoding proteins in the synapse was highly associated with the risk of developing schizophrenia with a combined P-value of 7.6 × 10⁻¹¹. In addition, the group of synaptic genes was more strongly associated with schizophrenia than any of the matched-control groups of genes (P<0.01). The functional gene group approach is a novel approach in which genes are grouped according to cellular function, and which goes across traditionally defined biological pathways, also referred to as horizontal versus vertical grouping.11 We used a manually curated database of functional gene groups, which tends to include more up-to-date annotation of gene function, especially for genes expressed in brain, than some of the databases available online. We do note, however, that gene function annotation is an ongoing endeavor and that annotation of functional gene groups is therefore continuously improved.
Apart from testing all genetic variants in synaptic genes as a group, we tested subgroups of synaptic functioning and found that three subgroups of synaptic functioning mainly drive the association of the synaptic gene group with schizophrenia: intracellular signal transduction (P=0.0002), excitability (P=0.0009) and CAT signaling (P=0.0024). In general, these associations were stronger than associations with matched-control groups of genes, except for CAT signaling (method 5, P=0.06), indicating that at least some groups of similar size to the CAT signaling group and consisting of SNPs in brain-expressed genes are more significantly associated with schizophrenia than the CAT signaling group. We do note, however, that the CAT signaling group overlaps with the cell adhesion molecule pathway from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database that was previously associated with schizophrenia in the ISC.12
The group of intracellular signal transduction was most strongly associated with the risk of schizophrenia and includes the NRGN gene, which was one of the most significant loci identified in the (independent) SGENE+ GWAS,6 but—as a single marker effect—not below the threshold of significance in the individual samples on which the current analysis was based. In the samples included in our study, each individual SNP in the intracellular signal transduction group contributed very little to the risk of schizophrenia. However, combining their contributions resulted in a significant association.
Intracellular signal transduction in neurons and synapses is characterized by a high degree of crosstalk. A great variety of initial steps, such as activation of many different cell membrane receptors, leads to changes in a rather limited number of enzymes that generate second messengers (adenylyl cyclase, phospholipases) and a limited number of second messengers inside the cell (calcium, cyclic adenosine-monophosphate, cyclic guanosine monophosphate, inositol 1,4,5-triphosphate; reviewed in de Jong and Verhage30). Hence, it is plausible that genetic variation in the genes encoding these factors has similar biological consequences and additive contributions to pathogenesis.
The second most significant functional group (Excitability) regulates steady-state and action potential-induced ionic currents and membrane potential. Many different channels can contribute but they all allow a limited number of types of biologically relevant ions to pass. Hence, as for the group of intracellular signal transduction genes, it is plausible that genetic variation in the genes encoding these channels has similar biological consequences for cellular excitability, and thus additive contributions to pathogenesis.
For complex traits with evidence for large numbers of variants of small effect size contributing to disease risk—such as schizophrenia,4, 31 multiple sclerosis32 and type 1 diabetes mellitus33—it is of crucial importance to test whether a reported association with a group of genes is merely because of the polygenic nature of the disease or the biological function of that group of genes. Any large group of genes is likely to emerge from pathway or functional gene group analysis merely because of background polygenic effects to the risk of disease. Reporting a significant association with the group of synaptic genes may therefore seem rather trivial, as it merely confirms that synaptic genes are included in the multitude of genes related to schizophrenia. A more interesting question is thus whether the group of synaptic genes is more strongly related to schizophrenia than other randomly drawn groups of genes. To test this, we designed five methods in which we created matched-control groups of genetic variants. As the genetic variants were drawn from different pools, every control method tested slightly different null hypotheses, providing insight into how important an observed association with a group of genes is under a polygenic model of inheritance. We propose that such competitive tests for pathways or functional groups need to be included in any future pathway or functional gene group analysis.
In this study we investigated whether the accumulated effects of genetic variants in multiple genes may cause dysfunction of a biological system (for example, intracellular signal transduction), while a single genetic variant is not sufficient to cause disease. The functional gene groups we defined are characterized by redundancy, which has most likely arisen through previous gene duplication. Over time, genetic mutations may arise causing different or less optimal protein function, which may persist in a gene pool, thus leading to diversity or genetic heterogeneity. To some extent genes in the same functional group may functionally replace each other when others function suboptimally. Such redundancy and heterogeneity provide for fail-safe mechanisms, which render functional gene groups, like most other biological systems, robust. Robustness is a property that allows a system to maintain its functions against internal and external perturbations.34, 35
Typically, in different individuals a different set of mutations may be responsible for dysfunction. As a consequence, individuals with the same disease may have completely different genetic backgrounds, which is consistent with both a polygenic model of disease and a threshold model of disease but seriously hampers single-marker GWAS analysis, as it decreases the effect sizes of single SNPs/genes. When focusing on a functional gene group, it becomes less relevant which particular genes carry a mutation, whereas the number of genes carrying a mutation before the system starts to dysfunction is much more important. Robustness, inherent to, for instance, synaptic protein networks and their underlying genes, may thus provide biologically meaningful ways to interpret the notion that ‘thousands of genes underlie complex traits' and may provide important insights into the biological systems important in disease etiology (see Supplementary Material, Sections 6 and 7).
Our current results suggest that multiple genes involved in synaptic functioning are important for schizophrenia, support the synaptic hypothesis of schizophrenia15, 16, 17, 18 and provide tentative evidence that the biological mechanisms of intracellular signal transduction, excitability, and cell adhesion and trans-synaptic signaling are involved in schizophrenia.
Acknowledgments
Statistical analyses were carried out on the Genetic Cluster Computer (http://www.geneticcluster.org), which is financially supported by the Netherlands Scientific Organization (NWO 480-05-003). The genotyping of the samples was provided through the Genetic Association Information Network (GAIN). The data set(s) used for the analyses described in this manuscript were obtained from the GAIN Database found at http://view.ncbi.nlm.nih.gov/dbgap, controlled through dbGaP accession number phs000021.v2.p1. Samples and associated phenotype data for the genome-wide association of schizophrenia study were provided by the Molecular Genetics of Schizophrenia Collaboration (PI: PV Gejman, Evanston Northwestern Healthcare (ENH) and Northwestern University, Evanston, IL, USA). We gratefully acknowledge financial support of the NWO/VIDI (452-05-318 to DP), TOP ZonMW (40-00812-98-07-032 to LNC and MV), NWO-ALW Pilot grant (051.07.004 to MV), FP7 HEALTH-F2-2009-241498 (Eurospin consortium), Neuroscience Campus Amsterdam and the European Union Seventh Framework Program under grant agreement no. HEALTHF2-2009-242167 (‘SynSys' project). ABS and MV are supported by the Centre for Medical Systems Biology (CMSB). PMV acknowledges funding from the Australian National Health and Medical Research Council (NHMRC grants 389892 and 613672). Collaboration between DP and PMV was supported through a Visiting Professorship grant from the Royal Netherlands Academy of Arts and Sciences (KNAW).
Appendix
The International Schizophrenia Consortium:
Susanne Akterin,9 Kristen Ardlie,3 M. Helena Azevedo,28 Nicholas Bass,6 Douglas HR Blackwood,7 Celia Carvalho,11 Kimberly Chambert,2,3 Khalid Choudhury,6 David Conti,11 Aiden Corvin,8 Nick J Craddock,5 Caroline Crombie,21 David Curtis,20 Mark J Daly,2,3,4 Susmita Datta,6 Emma Flordal Thelander,9 Eva Fredriksson,9 Stacey B Gabrie,13 Casey Gates,3 Lucy Georgieva,5 Michael Gill,8 Hugh Gurling,6 Peter A Holmans,5 Christina M Hultman,9,10 Ayman Fanous,11 Gillian Fraser,21 Elaine Kenny,8 George K Kirov,5 James A Knowles,11 Robert Krasucki,6 Joshua Korn,3,4 Soh Leh Kwan,12 Jacob Lawrence,6 Paul Lichtenstein,9 Antonio Macedo,28 Stuart Macgregor,14 Alan W Maclean,7 Patrik Magnusson,9 Scott Mahon,3 Pat Malloy,7 Kevin A McGhee,7 Andrew McQuillin,6 Helena Medeiros,11 Frank Middleton,23 Vihra Milanova,16 Christopher Morley,23 Derek W Morris,8 Walter J Muir,7 Ivan Nikolov,5 N Norton,5 Colm T O'Dushlaine,8 Michael C O'Donovan,5 Michael J Owen,5 Carlos N Pato,11 Carlos Paz Ferreira,27 Ben Pickard,7 Jonathan Pimm,6 Shaun M Purcell,1,2,3,4 Vinay Puri,6 Digby Quested,19,29 Douglas M Ruderfer,1,2,3,4 Edward M Scolnick,2,3 Pamela Sklar,1,2,3,4 David St Clair,12 Jennifer L Stone,1,2,3,4 Patrick F Sullivan,13 Emma F Thelander,9 Srinivasa Thirumalai,18 Draga Toncheva,15 Margaret Van Beck,7 Peter M Visscher,14 John L Waddington,17 Nicholas Walker,22 H Williams5 and Nigel M. Williams5
1 Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA;
2 Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA;
3 Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA;
4 Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA;
5 Department of Psychological Medicine, School of Medicine, Cardiff University, Cardiff, UK;
6 Molecular Psychiatry Laboratory, Department of Mental Health Sciences, University College London Medical School, Windeyer Institute of Medical Sciences, London, UK;
7 Division of Psychiatry, School of Molecular and Clinical Medicine, University of Edinburgh, Edinburgh, UK;
8 Neuropsychiatric Genetics Research Group, Department of Psychiatry and Institute of Molecular Medicine, Trinity College Dublin, Dublin, Ireland;
9 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden;
10 Department of Neuroscience, Psychiatry, Ulleråker, Uppsala University, Uppsala, Sweden;
11 Center for Genomic Psychiatry, University of Southern California, Los Angeles, CA, USA;
12 Institute of Medical Sciences, University of Aberdeen, Foresterhill, Aberdeen, UK;
13 Departments of Genetics, Psychiatry, and Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA;
14 Queensland Institute of Medical Research, Brisbane, QLD, Australia;
15 Department of Medical Genetics, University Hospital Maichin Dom, Sofia, Bulgaria;
16 Department of Psychiatry, First Psychiatric Clinic, Alexander University Hospital, Sofia, Bulgaria;
17 Molecular and Cellular Therapeutics and RCSI Research Institute, Royal College of Surgeons in Ireland, Dublin, Ireland;
18 West Berkshire NHS Trust, Reading, UK;
19 West London Mental Health Trust, Hammersmith and Fulham Mental Health Unit and St Bernard's Hospital, London, UK;
20 Queen Mary College, University of London and East London and City Mental Health Trust, Royal London Hospital, Whitechapel, London, UK;
21 Department of Mental Health, University of Aberdeen, Aberdeen, UK;
22 Ravenscraig Hospital, Inverkip Road, Greenock, UK;
23 State University of New York–Upstate Medical University, Syracuse, NY, USA;
24 Washington VA Medical Center, Washington DC, USA;
25 Department of Psychiatry, Georgetown University School of Medicine, Washington DC, USA;
26 Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA;
27 Department of Psychiatry, Sao Miguel, Azores, Portugal;
28 Department of Psychiatry, University of Coimbra, Coimbra, Portugal;
29 Current address: Department of Psychiatry, University of Oxford, Warneford Hospital, Headington, Oxford, UK
The authors declare no conflict of interest.
Footnotes
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)
References
- Andreasen NC. Symptoms, signs, and diagnosis of schizophrenia. Lancet. 1995;346:477–481. doi: 10.1016/s0140-6736(95)91325-4.
- Mowry BJ, Holmans PA, Pulver AE, Gejman PV, Riley B, Williams NM, et al. Multicenter linkage study of schizophrenia loci on chromosome 22q. Mol Psychiatry. 2004;9:784–795. doi: 10.1038/sj.mp.4001481.
- Sullivan PF, Kendler KS, Neale MC. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch Gen Psychiatry. 2003;60:1187–1192. doi: 10.1001/archpsyc.60.12.1187.
- Purcell SM, Wray NR, Stone JL, Visscher PM, O'Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748–752. doi: 10.1038/nature08185.
- Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I, et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature. 2009;460:753–757. doi: 10.1038/nature08192.
- Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, et al. Common variants conferring risk of schizophrenia. Nature. 2009;460:744–747. doi: 10.1038/nature08186.
- Kim Y, Zerwas S, Trace SE, Sullivan PF. Schizophrenia genetics: where next. Schizophr Bull. 2011;37:456–463. doi: 10.1093/schbul/sbr031.
- Torkamani A, Topol EJ, Schork NJ. Pathway analysis of seven common diseases assessed by genome-wide association. Genomics. 2008;92:265–272. doi: 10.1016/j.ygeno.2008.07.011.
- Holmans P, Green EK, Pahwa JS, Ferreira MA, Purcell SM, Sklar P, et al. Gene ontology analysis of GWA study data sets provides insights into the biology of bipolar disorder. Am J Hum Genet. 2009;85:13–24. doi: 10.1016/j.ajhg.2009.05.011.
- Jia P, Wang L, Meltzer HY, Zhao Z. Common variants conferring risk of schizophrenia: a pathway analysis of GWAS data. Schizophr Res. 2010;122:38–42. doi: 10.1016/j.schres.2010.07.001.
- Ruano D, Abecasis GR, Glaser B, Lips ES, Cornelisse LN, de Jong AP, et al. Functional gene group analysis reveals a role of synaptic heterotrimeric G proteins in cognitive ability. Am J Hum Genet. 2010;86:113–125. doi: 10.1016/j.ajhg.2009.12.006.
- O'Dushlaine C, Kenny E, Heron E, Donohoe G, Gill M, Morris D, et al. Molecular pathways involved in neuronal cell adhesion and membrane scaffolding contribute to schizophrenia and bipolar disorder susceptibility. Mol Psychiatry. 2011;16:286–292. doi: 10.1038/mp.2010.7.
- Subramanian A, Kuehn H, Gould J, Tamayo P, Mesirov JP. GSEA-P: a desktop application for Gene Set Enrichment Analysis. Bioinformatics. 2007;23:3251–3253. doi: 10.1093/bioinformatics/btm369.
- Wang K, Li M, Bucan M. Pathway-based approaches for analysis of genomewide association studies. Am J Hum Genet. 2007;81:1278–1283. doi: 10.1086/522374.
- Owen MJ, O'Donovan MC, Harrison PJ. Schizophrenia: a genetic disorder of the synapse. BMJ. 2005;330:158–159. doi: 10.1136/bmj.330.7484.158.
- Harrison PJ, Weinberger DR. Schizophrenia genes, gene expression, and neuropathology: on the matter of their convergence. Mol Psychiatry. 2005;10:40–68. image 45.
- Hayashi-Takagi A, Sawa A. Disturbed synaptic connectivity in schizophrenia: convergence of genetic risk factors during neurodevelopment. Brain Res Bull. 2010;83:140–146. doi: 10.1016/j.brainresbull.2010.04.007.
- Johnson RD, Oliver PL, Davies KE. SNARE proteins and schizophrenia: linking synaptic and neurodevelopmental hypotheses. Acta Biochim Pol. 2008;55:619–628.
- Wang K, Li M, Hakonarson H. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 2010;11:843–854. doi: 10.1038/nrg2884.
- Suarez BK, Duan J, Sanders AR, Hinrichs AL, Jin CH, Hou C, et al. Genomewide linkage scan of 409 European-ancestry and African American families with schizophrenia: suggestive evidence of linkage at 8p23.3-p21.2 and 11p13.1-q14.1 in the combined sample. Am J Hum Genet. 2006;78:315–333. doi: 10.1086/500272.
- Li K, Hornshaw MP, van Minnen J, Smalla KH, Gundelfinger ED, Smit AB. Organelle proteomics of rat synaptic proteins: correlation-profiling by isotope-coded affinity tagging in conjunction with liquid chromatography-tandem mass spectrometry to reveal post-synaptic density specific proteins. J Proteome Res. 2005;4:725–733. doi: 10.1021/pr049802+.
- Takamori S, Holt M, Stenius K, Lemke EA, Gronborg M, Riedel D, et al. Molecular anatomy of a trafficking organelle. Cell. 2006;127:831–846. doi: 10.1016/j.cell.2006.10.030.
- Whitlock MC. Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach. J Evol Biol. 2005;18:1368–1373. doi: 10.1111/j.1420-9101.2005.00917.x.
- Fisher RA. Statistical Methods for Research Workers. Oliver and Boyd: London; 1925.
- Zaykin DV, Zhivotovsky LA, Westfall PH, Weir BS. Truncated product method for combining P-values. Genet Epidemiol. 2002;22:170–185. doi: 10.1002/gepi.0042.
- Nyholt DR. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet. 2004;74:765–769. doi: 10.1086/383251.
- Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA. 2009;106:9362–9367. doi: 10.1073/pnas.0903103106.
- Ferreira MA, O'Donovan MC, Meng YA, Jones IR, Ruderfer DM, Jones L, et al. Collaborative genome-wide association analysis supports a role for ANK3 and CACNA1C in bipolar disorder. Nat Genet. 2008;40:1056–1058. doi: 10.1038/ng.209.
- Green EK, Grozeva D, Jones I, Jones L, Kirov G, Caesar S, et al. The bipolar disorder risk allele at CACNA1C also confers risk of recurrent major depression and of schizophrenia. Mol Psychiatry. 2010;15:1016–1022. doi: 10.1038/mp.2009.49.
- de Jong AP, Verhage M. Presynaptic signal transduction pathways that modulate synaptic transmission. Curr Opin Neurobiol. 2009;19:245–253. doi: 10.1016/j.conb.2009.06.005.
- The Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium. Genome-wide association study identifies five novel schizophrenia loci. Nat Genet. In press.
- Bush WS, Sawcer SJ, de Jager PL, Oksenberg JR, McCauley JL, Pericak-Vance MA, et al. Evidence for polygenic susceptibility to multiple sclerosis--the shape of things to come. Am J Hum Genet. 2010;86:621–625. doi: 10.1016/j.ajhg.2010.02.027.
- Wei Z, Wang K, Qu HQ, Zhang H, Bradfield J, Kim C, et al. From disease association to risk assessment: an optimistic view from genome-wide association studies on type 1 diabetes. PLoS Genet. 2009;5:e1000678. doi: 10.1371/journal.pgen.1000678.
- Kitano H. Biological robustness. Nat Rev Genet. 2004;5:826–837. doi: 10.1038/nrg1471.
- Stelling J, Sauer U, Szallasi Z, Doyle FJ, 3rd, Doyle J. Robustness of cellular functions. Cell. 2004;118:675–685. doi: 10.1016/j.cell.2004.09.008.