Abstract
Existence of aging associated transcriptional differences in the schizophrenia brain was investigated in RNA sequencing data from 610 postmortem Dorso-Lateral Pre-Frontal Cortex (DLPFC) samples in the CommondMind Consortium (CMC) and the psychENCODE cohorts. This analysis discovered that the trajectory of gene expression changes that occur during brain aging differed between schizophrenia cases and unaffected controls. Mainly, the identified gene expression differences between the diagnosis groups shrank in magnitude following 60 years of age. A differential expression analysis restricted to the 40 to 60 year age group identified 556 statistically significant loci that replicated and had highly consistent gene expression fold changes in the two cohorts. An interaction between age and diagnosis in the wider psychENCODE cohort was also detected. Gene set enrichment analysis discovered disruptions in mitochondria, RNA splicing and phosphoprotein gene pathways. The identified differentially expressed genes in the two cohorts were also significantly enriched in genomic regions associated with schizophrenia although no enrichment was observed for differentially expressed genes identified in the 40 to 60 year age group. This work implicates disruptions to the normal brain aging processes in the pathology of schizophrenia and demonstrates the need for age stratification in schizophrenia postmortem brain gene expression studies.
Introduction
Gene expression in the postmortem schizophrenia brain has been extensively characterized by microarray and next generation sequencing studies1–6. The consensus from these studies is that there are many subtle expression differences in individual genes that likely alter the functioning of various gene pathways in schizophrenia2,4,7,8. The standard analysis approach employed by these studies assumes that gene expression and age have a linear relationship. Thus, appropriate statistical methods are used to adjust for the effect of age on gene expression. However, epidemiological studies consistently find excess early mortality8,9 in disease. In addition, studies that estimate brain age based on neuroanatomical structures10 and integrity of the white matter11,12 have found accelerated aging in schizophrenia. As transcriptional changes are likely to accompany neuroanatomical changes in the brain, the relationship between age and disease is likely to be more complex than assumed in current analysis pipelines. Therefore, in this work publicly available RNA sequencing data was reanalyzed to determine whether the transcriptome of the aging brain differs between unaffected controls and schizophrenia cases.
Results
Gene expression profiles in the aging schizophrenia brain is altered
In order to study aging in schizophrenia, RNA sequencing data generated from the BA9 of the CMC and BA46 of the psychENCODE postmortem brain collections were reanalyzed. As CMC has the largest number of brain samples with RNA sequencing data and gene expression differences in schizophrenia have already been identified in this cohort1,13, we reasoned that this cohort likely has the necessary statistical power to detect differences in our analysis. Following the exclusion criteria (see Methods) 237 unaffected controls and 239 schizophrenia subjects between the ages of 25 to 90 remained (Fig. 1A). In the psychENCODE cohort only BA46 samples from schizophrenia and control cases that were between the ages of 25 and 65 were included (See Methods). This left a total of 67 cases and 67 control samples (Fig. 1B).
Gene count tables were generated for the CMC cohort using the Ensembl annotation for the hg19 human genome build and the DESeq214 package from the bioconductor project was used to identify differentially expressed loci. The following statistical design that assumes an interaction between diagnoses and age was used (see Methods for details):
PMI refers to Post Mortem Interval whereas RIN is the RNA Integrity Number, which is an estimate of RNA quality15,16. RuvFactors were calculated using the RuvSeq package which is a statistical method to remove unwanted variation from RNA sequencing data17. Based on these criteria, Wald tests identified 7434 differentially expressed genes in which the adjusted p-value (corrected for multiple testing) was <0.1 and each gene on average had at least 20 reads per sample. Similar to previous studies, we found moderate differences in fold change in the statistically significant differences we identified (Fig. 2 and Supplementary Fig. 1 plots the top 100 loci with the lowest p-values). These graphs revealed that the difference in gene expression levels between schizophrenia cases and unaffected controls vary with age (Fig. 2). The maximum gene expression difference in the CMC was consistently between 40 and 60 years of age (Supplementary Figure 1) with the caveat that there are relatively few samples under the age of 40 in the CMC cohort. For many loci the expression differences at ages older than 60 gets smaller and the direction of the change is reversed. In order to summarize the results of the differential expression analysis, the normalized difference between schizophrenia cases and unaffected controls was plotted for each locus (Fig. 3 and Supplementary Fig. 2. See Normalized Difference Plots in Methods). Briefly, schizophrenia expression levels (the blue lines in Fig. 2) were subtracted from control expression levels (the red lines in Fig. 2) for each age. The resulting differences in expression at each age were plotted for every locus (Fig. 3 - See methods for details). In order to avoid over plotting, the differentially expressed loci were subdivided into those with a maximum difference at ages 60 or less and were 1) more expressed (3114 loci) or 2) less expressed (3369 loci) in controls (Fig. 3) and those with a maximum difference at ages over 60 (Supplementary Fig. 2) that were 3) more expressed (405 loci) or 4) less expressed (546 loci) in controls. These plots verified that the difference in gene expression between cases and controls is not constant across age. Although, the CMC cohort contains 307 samples over the age of 60 there are only 33 samples under the age of 40. Therefore, the smaller gene expression differences observed between cases and controls under the age of 40 needs to be interpreted with caution.
Age stratification improves consistency between studies
As the largest expression differences between schizophrenia and control samples occurred consistently between 40 and 60 years of age in the CMC analysis, differential expression analysis for this age group in the CMC and psychENCODE cohorts was performed. Eighty-two cases and 54 controls were present in the CMC and 51 cases and 48 controls were present in the psychENCODE cohorts. Gene count tables were generated based on the Ensembl hg19 annotation for both cohorts. For this analysis the interaction term was removed and the following regression formula was used:
The resulting analysis identified 2167 differentially expressed genes in the CMC and 4086 in psychENCODE that had an adjusted p-value of less than 0.1 and at least 20 reads per gene. After excluding results driven by outliers (using cooks distance) 556 loci replicated between the cohorts (Fig. 4, Supplementary Table 1). The p-values ranged from 1.25e-05 to 0.1 (mean 0.045) for the CMC cohort and 2.71e-09 to 0.1 (mean 0.027) for the psychENCODE cohort whereas the absolute fold change ranged from 1.05–1.86 (mean 1.18) for the CMC and 1.04–2.16 (mean 1.17) for psychENCODE. In order to determine the consistency of the changes between the two cohorts, the fold change difference in the two studies was plotted (Fig. 4A). The fold change difference between the two studies were highly correlated (r2 = 0.82) Boxplots were used to gage the variability between diagnosis groups (Fig. 4B). Gene set enrichment analysis performed using the DAVID annotation tool18 revealed that these 556 genes are mostly associated with mitochondrial functions (fold enrichment 1.88–6.48, p-value 0.0017–0.1), a biological pathway implicated to be involved in the pathology of schizophrenia (Table 1).
Table 1.
Category | Term | Gene Count | Fold Enrichment | Bonferroni |
---|---|---|---|---|
SP_PIR_KEYWORDS | mitochondrion | 43 | 2.16 | 1.74E-03 |
SP_PIR_KEYWORDS | transit peptide | 30 | 2.64 | 1.78E-03 |
SP_PIR_KEYWORDS | respiratory chain | 11 | 6.48 | 2.84E-03 |
UP_SEQ_FEATURE | transit peptide:Mitochondrion | 30 | 2.67 | 4.61E-03 |
GOTERM_CC_FAT | GO:0005739~mitochondrion | 51 | 1.88 | 5.30E-03 |
GOTERM_CC_FAT | GO:0070469~respiratory chain | 11 | 5.88 | 5.54E-03 |
GOTERM_CC_FAT | GO:0044429~mitochondrial part | 33 | 2.22 | 1.20E-02 |
GOTERM_BP_FAT | GO:0030198~extracellular matrix organization | 13 | 5 | 1.95E-02 |
SP_PIR_KEYWORDS | mitochondrion inner membrane | 16 | 3.47 | 2.64E-02 |
KEGG_PATHWAY | hsa05010:Alzheimer’s disease | 14 | 3.31 | 3.17E-02 |
KEGG_PATHWAY | hsa05012:Parkinson’s disease | 12 | 3.61 | 5.05E-02 |
KEGG_PATHWAY | hsa00190:Oxidative phosphorylation | 12 | 3.56 | 5.75E-02 |
GOTERM_CC_FAT | GO:0005746~mitochondrial respiratory chain | 9 | 5.63 | 6.11E-02 |
GOTERM_CC_FAT | GO:0019866~organelle inner membrane | 21 | 2.56 | 7.50E-02 |
GOTERM_CC_FAT | GO:0044455~mitochondrial membrane part | 12 | 3.85 | 9.58E-02 |
Gene set enrichment analysis implicates similar gene pathways in the CMC and The psychENCODE cohorts
Although the psychENCODE cohort was much smaller and only spanned an age range of 25 to 65, the existence of an interaction between age and diagnosis was investigated in this cohort. Thus the analysis performed on the CMC cohort using the formula ~Diagnosis + Age + Age: Diagnosis + Gender + Ethnicity + PMI + RIN + RuvFactors was repeated. The threshold for statistical significance was again set at an adjusted p-value of <0.1 and a minimum average read count of 20/sample. Based on these criteria we identified 29 loci to be differentially expressed (Supplementary Table 2). Fifteen out of these loci were also differentially expressed in the CMC cohort (11 in the same direction). Gene set enrichment analysis on the 29 identified loci did not yield statistically significant results. However, a plot of the p-values for all genes tested in the psychENCODE cohort revealed that there are many nominally significant genes (unadjusted p-value <0.05) that failed to reach the threshold of statistical significance following multiple testing (Supplementary Fig. 3) suggesting that this small cohort might be underpowered. Gene set enrichment analysis was performed using the DAVID annotation tool18 on the 2976 nominally expressed genes in psychENCODE (unadjusted p < 0.05, minimum mean read count 20 and outlier removal by cooks distance) and the significantly expressed loci in the CMC to determine whether there was any overlap between the two cohorts. Gene pathways associated with acetylation (p = 4.75E-57) and phosphoproteins (p = 3.55E-40) had the most significant enrichment in the CMC analysis. In addition multiple gene pathways associated with mitochondrial functions, splicing and ribosome activity were significantly enriched (Table 2, Supplementary Table 3). The 4 gene pathways that were significantly enriched in the psychENCODE cohort were alternative splicing (p = 1.25E-10), splice variant (p = 5.28-09) phosphoprotein (p = 1.66E-07) and coiled coil pathways (p = 0.09) (Table 2).
Table 2.
Category | Term | Gene Count | Fold Enrichment | Bonferroni |
---|---|---|---|---|
CommonMind | ||||
SP_PIR_KEYWORDS | acetylation | 1070 | 1.50 | 4.75E-57 |
SP_PIR_KEYWORDS | phosphoprotein | 2375 | 1.21 | 3.55E-40 |
GOTERM_CC_FAT | GO:0005739~mitochondrion | 458 | 1.58 | 6.12E-28 |
GOTERM_CC_FAT | GO:0030529~ribonucleoprotein complex | 250 | 1.82 | 3.61E-24 |
SP_PIR_KEYWORDS | mitochondrion | 359 | 1.60 | 9.83E-22 |
SP_PIR_KEYWORDS | alternative splicing | 2325 | 1.15 | 4.17E-21 |
GOTERM_CC_FAT | GO:0070013~intracellular organelle lumen | 647 | 1.36 | 1.38E-19 |
GOTERM_CC_FAT | GO:0044429~mitochondrial part | 266 | 1.67 | 3.02E-19 |
GOTERM_CC_FAT | GO:0005840~ribosome | 125 | 2.18 | 3.11E-19 |
SP_PIR_KEYWORDS | ribonucleoprotein | 152 | 2.01 | 3.53E-19 |
PsychENCODE | ||||
SP_PIR_KEYWORDS | alternative splicing | 1006 | 1.19 | 1.25E-10 |
UP_SEQ_FEATURE | splice variant | 1002 | 1.18 | 5.28E-09 |
SP_PIR_KEYWORDS | phosphoprotein | 957 | 1.16 | 1.66E-07 |
SP_PIR_KEYWORDS | coiled coil | 280 | 1.22 | 9.15E-02 |
*Only the 10 most significant results listed for CommonMind.
Differentially Expressed Genes Identified In The CMC And The psychENCODE Cohorts Are Enriched In The 108 Regions Associated With Schizophrenia
In order to determine whether the differentially expressed loci identified in the CMC and the psychENCODE cohorts are enriched near the genomic regions associated with schizophrenia19, Inrich analysis20 was performed. The 556 loci identified in the 40 to 60 year age group were not enriched in genomic regions associated with schizophrenia. However, out of the 95 schizophrenia regions that contain genes (based on the Ensembl gene annotations), the differentially expressed genes identified in the CMC at a threshold of adjusted p-value <0.1 overlapped 54 genomic regions associated with schizophrenia (Inrich analysis adjusted p = 0.01, Supplementary Table 4). Repeating the Inrich analysis for differentially expressed genes that had an adjusted p-value of <0.05 (overlapped 50 GWAS regions, Inrich adjusted p = 0.0024) and < 0.01 (overlapped 35 GWAS regions, Inrich adjusted p = 0.0016) improved the significance of the enrichment. The nominally significant psychENCODE loci overlapped 33 genomic regions associated with schizophrenia (Inrich p = 0.02) (Supplementary Table 5). Twenty-six out of the thirty-three regions (76% overlap) identified in the psychENCODE cohort were also statistically significant in the Inrich CMC analysis (Supplementary Tables 4 and 5).
Discussion
The process of brain aging differs significantly between schizophrenia cases and unaffected controls. The level of differential expression between diagnoses groups in the brain varies with age highlighting the complex affect aging has on disease gene expression profiles. As the largest gene expression level differences in disease for the CMC were in the 40 to 60 year age range, we compared this age group between the CMC and psychENCODE. Five hundred fifty six loci enriched in mitochondrial functions were identified that surpassed the threshold for multiple testing in both the CMC and psychENCODE cohorts. The consistency of fold change differences (r2 = 0.82) between the two cohorts provides further confidence in this finding. In comparison, the analysis of the entire CMC cohort yields much smaller gene expression difference compared to other cohorts - a mean of 1.09 and a range of 1.03–1.33 fold1. The similar differences observed between the CMC and psychENCODE when the analysis is restricted to the 40 to 60 year age group strongly suggests that age stratification in schizophrenia gene expression studies is beneficial. Although the largest gene expression differences observed in the CMC data set was between the ages of 40 and 60, the scarcity of samples under 40 years of age prevented us from characterizing gene expression changes in younger subjects. Clearly, ages from late adolescence to early adulthood are extremely important to schizophrenia pathophysiology as disease onset occurs at this stage. However, since most psychiatric brain banks lack samples at these young ages, we are unable to perform a thorough analysis of a key stage in schizophrenia and are likely missing important gene expression changes associated with disease pathology. Evidence for a statistical interaction between age and schizophrenia was discovered in both the CMC and the psychENCODE cohorts. Gene set enrichment analysis revealed disruptions in splicing and phosphoprotein pathways to be common to both cohorts. In addition, the differentially expressed genes identified were enriched in the genomic regions associated with schizophrenia. This finding was especially strong in the CMC cohort and encompassed more than half of all genic regions associated with disease (54 out of 95).
Modeling an interaction between age and diagnosis did not eliminate discrepant findings between the CMC and psychENCODE cohorts. One reason for this discrepancy may be related to the differences between the cohorts. The CMC project generated sequencing data on BA9 samples whereas the PsychENCODE consortium sequenced BA46 samples. The psychENCODE cohort was also less than 3.5 times that of the CMC and may have been underpowered. The overlap in gene set enrichment and Inrich analysis between the CMC and the nominally significant genes in the psychENCODE cohorts support this notion. The fact that the psychENCODE cohort spanned ages between 25 and 65 as opposed to the CMC that spanned ages between 25 and 90 may have also contributed to the differences in these findings. Another reason for the discrepant findings is likely related to the heterogeneous nature of schizophrenia. As schizophrenia genetic studies required over a hundred thousand samples in order to overcome the heterogeneity problem19, expecting every loci to be differentially expressed in every single schizophrenia postmortem brain sample is unrealistic. It is likely that the disruption of different genes that belong to the same biological pathway results in similar pathology in different people. This is congruent with the presented gene set enrichment analysis that found disruptions in RNA splicing and phosphoprotein pathways in both the CMC and the psychENCODE cohorts.
This work implicates gene pathways associated with mitochondrial function, splicing and phosphoproteins in schizophrenia pathology. Previous schizophrenia postmortem brain studies have reported differential gene expression for mitochondrial genes in disease21,22. Widespread splicing deficits have also been reported in schizophrenia2,23,24 and numerous phosphoproteins such as synapsin II25 and DARPP-3226 have been implicated in disease pathology. It is important to note that mitochondrial changes27,28, widespread alterations in splicing patterns29 and altered phosphorylation30,31 also occur in the aging brain. Potentially, the deficits observed in schizophrenia might be similar to the changes that occur in the aging brain.
This notion is in line with previous studies that have found a link between aging and schizophrenia. Many studies have found schizophrenia patients to have lower life expectancy (for a review see8) and the suggestion has been made that schizophrenia might be a systematic disease of accelerated aging32. Although this work discovered age related changes in schizophrenia, our results do not necessarily support accelerated aging in the disease. It is important to note that our current understanding of brain aging at the molecular level is limited and therefore, we may be failing to recognize patterns associated with accelerated aging in our data. These results are also consistent with previous pathway analysis performed on microarray data that found transcriptional differences in the aging schizophrenia brain33,34. Given that inflammation is associated with old age, the findings of increased inflammation in schizophrenia35–37 might also be related to the aging disruptions we are finding. In summary, our results along with previous findings strongly support the existence of a link between aging and schizophrenia pathology. Additional postmortem brain collections that examine schizophrenia throughout the entire lifespan and are cognizant of the importance of age are needed to fully explore the effects of age on gene expression in the disease.
Individuals suffering from schizophrenia have a tendency to smoke38, use recreational drugs, and consume excess levels of alcohol compared to the general population39. In addition, schizophrenia patients use various antipsychotic and other psychiatric medications that cause weight gain and other health problems40,41. These factors have the potential to disrupt the aging process in the brain and may account for our findings. Conceivably, the expression differences observed in the 40 to 60 age group are caused by such factors. However, the statistically significant enrichment between age dependent RNA expression changes in the brain and the 108 genomic regions associated with schizophrenia is intriguing. Further research is needed to determine whether some of the age related gene expression differences observed in the schizophrenia brain might be inherited.
Duration of illness, along with lifetime antipsychotic use are likely to affect gene expression since they are associated with brain volume changes in schizophrenia42,43. Duration of illness in schizophrenia is of particular interest as it appears to influence treatment response44,45. The effects of these factors on gene expression could not be assessed in the current study because this information was not available for the CommonMind collection. In the smaller psychENCODE cohort, which spans a narrower age range and appears to be statistically underpowered, both duration of illness and lifetime antipsychotic use were highly correlated with age. Generally, older subjects with schizophrenia had lived with the disease longer and had used more antipsychotics. Given this association with age, an adequate assessment of the effects of disease duration or the effects of lifetime antipsychotic use will require comparison of early and late disease onset cases that live into old age and comparison of subjects at similar ages that differ in their lifetime antipsychotic use. Each of these comparisons will likely require large cohorts, potentially similar in size to CommonMind, because of the subtle gene expression differences that occur in schizophrenia. It is probable that at least a subset of the differentially expressed genes identified in the current study are associated with the pathology related to duration of illness and/or lifetime antipsychotic use but definitive conclusions can not be made. Larger postmortem brain collections will also enable characterization of the associations between gene expression and disease subtypes or assessment scores, neither of which were available for this study. Potentially, an analysis cognizant of disease subtypes or assessment scores will be extremely beneficial as it may refine the heterogenous schizophrenia classification and yield larger and more consistent gene expression differences.
We conclude that age is an important aspect of schizophrenia pathophysiology and special consideration needs to be given to age in gene expression studies of disease. Appropriate treatment of age as a variable is likely to aid in efforts to unravel the molecular etiology of schizophrenia and the identification of biomarkers and therapeutic targets for the disease.
Materials and Methods
Samples
Differential expression analysis was performed on the DLPFC RNA sequencing data from CMC1 (N = 476) and the PsychENCODE46 (N = 134) projects. Details regarding the ethical approval process and relevant adherence to relevant guidelines and regulations are detailed in the original publications1,46. Access to the RNA sequencing data was approved by the NIMH Respository and Genomic Resource Data Access Committee and deidentified data was downloaded. In the CMC cohort only BA9 samples from schizophrenia cases and unaffected controls were included. Samples from individuals with Klinefelter syndrome were excluded and only a single individual chosen at random from a sibling pair was included in the analysis. Samples below the age of 25 were excluded since there were only 3 schizophrenia samples in this age group. The age group labeled 90+ was also excluded since an exact age could not be assigned for the regression analysis. Samples less then 25 and more then 65 years of age in the psychENCODE cohort were excluded in order to have a balanced number of cases and controls across all age groups (the 3 samples above 65 were controls and 6 out of the 7 samples below 25 were schizophrenia cases). psychENCODE samples with a PMI of greater than 50 hours were also excluded in order to be consistent with the CMC cohort.
Differential expression analysis of schizophrenia and aging
All analysis was performed using R version 3.4.3. Bam files for each corresponding study was downloaded and count tables were generated for genes (considering only exons) using the Ensembl hg19 annotation. Specifically, the readGAlignmentPairs command from the GenomicAlignments package47 was used to read the alignments from the bam files and the resulting data structure along with the Ensembl hg19 annotation was passed as parameters to the summarizeOverlaps command. The summarizeOvelaps command calculates the number of reads that span the exon intervals in the annotation and provide the total number of reads that are present for each gene. RUVSeq package was used to account for hidden batch effects and remove unwanted variation from the samples17. In order to identify loci that are not differentially expressed between the diagnosis groups, which is required by RUVSeq, we performed an initial analysis using DESeq214 in which we only considered diagnosis and age. Genes that had an unadjusted p-value of >0.7 were considered to be not differentially expressed and passed to the Ruvg command to calculate normalization factors that account for unwanted variation17. The factors calculated from RUVseq were included in the design formula of the DESeq analysis. Wald tests were performed in DESeq2 using the DESeq command, which also performs normalization on the samples, in order to identify differentially expressed genes. The statistical model used for the Wald test included an interaction term between diagnosis and age. This and all subsequent formulas were used as the ‘design’ parameter in DESeq2.
The resulting p-values were corrected (or adjusted) for multiple testing using the Benjamini & Hochberg method implemented in the DESeq2 package. Only loci with a corrected p-value <0.1 on average 20 read counts per gene were considered significantly differentially expressed. An adjusted p-value of 0.1 is recommend over 0.05 since the multiple correction methods used to adjust the p-values are overly stringent14. We repeated the same DESeq2 analysis without RuvFactors and found that the age related trajectory differences are observed even in the absence of batch effect correction (data not shown). Cooks distance was calculated for each differentially expressed locus that surpassed the threshold of statistical significance in order to remove genes that appear to be different because of outliers. The Cooks distance cutoff threshold was identified empirically for each cohort. A similar analysis was performed for the 40 to 60 year old age group using the same formula as above but lacking the interaction term.
New normalization factors were calculated using RuvSeq for the 40 to 60 year age group (an initial DESeq2 analysis was performed with diagnosis only and non-differentially expressed genes were identified as before).
Differential expression plots
The raw normalized counts from the DESeq data object were extracted using the ‘counts’ function with the normalized parameter set to true. The obtained normalized read counts were plotted over age using the bioconductor ggplot2 package. We opted to not use log-transformed values in order to provide the most accurate representation of the data.
Normalized difference plots
For each differentially expressed locus identified in the CMC analysis, the loess function in R was used to fit gene expression in control or schizophrenia samples and age. The R ‘predict’ function was run to calculate gene expression values for the age range 25–90 based on the results of the loess function. For each locus, the resulting values for the schizophrenia samples were subtracted from control samples and this value was normalized to the absolute maximum difference in all ages using the following formula:
Normalization meant that only values between −1 and 1 were possible. Each locus had a value of 1 or −1 at some age indicating the maximal gene expression difference in disease for the locus. The normalized values for each locus was plotted using a very thin line in order to avoid over plotting.
GWAS enrichment and gene set analysis
Differentially expressed genes in the CMC with an adjusted p-value of 0.1 or less and a minimum mean read count of 20 were used for gene set enrichment and Inrich20 analysis. The DAVID 6.7 Bioinformatics Resource (https://david-d.ncifcrf.gov/) was selected for the gene set enrichment analysis as it is a robust and widely used tool with over 33000 citations48. DAVID performs a competitive analysis using a modified Fisher’s exact test. The standard DAVID analysis49 was performed using the default conditions of minimum 2 genes per term and an EASE score, which is a modified and more stringent Fisher Exact P-value, of 0.1. The genes in the human genome were used as the background gene list. The databases included in the DAVID analysis were (Selected from Annotation Summary Results page that is generated by DAVID after gene sets are uploaded and mapped):
OMIM (https://www.ncbi.nlm.nih.gov/omim) from the Disease category; COG_ONTOLOGY (https://www.ncbi.nlm.nih.gov/COG/), SP_PIR_KEYWORD (https://proteininformationresource.org/pirwww/dbinfo/iproclass.shtml) and UP_SEQ_FEATURES (https://www.uniprot.org/) from the Functional Categories category; GOTERM_BP_FAT, GOTERM_CC_FAT and GOTERM_MF_FAT (https://www.ebi.ac.uk/GOA) from the Gene_Ontology category; BBID50, BIOCARTA (https://cgap.nci.nih.gov/Pathways/BioCarta_Pathways) and KEGG_PATHWAY (https://www.genome.jp/kegg/) from the Pathways category; INTERPRO (https://www.ebi.ac.uk/interpro/), PIR_SUPERFAMILY (https://proteininformationresource.org/pirwww/dbinfo/iproclass.shtml) and SMART (http://smart.embl-heidelberg.de/) from the Protein_Domains category.
Inrich, analysis was performed as described51 to determine whether intervals near schizophrenia associated loci were more likely to be associated with the identified differentially expressed genes. Only genes that fell within the GWAS regions were considered. Adjusted p-value <0.05 from the Inrich program was set as the threshold for significance. For the PsychENCODE cohort, genes with a nominal p-value of 0.05 or less and a minimum read count of 20 were used for the same analyses. The same genes were used for gene set enrichment analysis performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) 6.7 web tool18. In the 40 to 60 age group the 556 genes that replicated between the two cohorts were subjected to DAVID and Inrich analyses. Gene pathways reaching a Bonferroni corrected p-values < 0.1 from the DAVID analysis are reported.
Supplementary information
Acknowledgements
I thank the Stanley Medical Research Institute for providing funding for this research. I am especially thankful to Dr. Jeff Leek for his guidance and advice on statistical approaches. I would also like to thank Drs Robert Yolken, Tim Moran, Jonathan Pevsner, Russell Margolis, Brion Maher, Fuller Torrey and Maree Webster for their helpful suggestions and Drs Yunjung Kim and Patrick Sullivan for providing code for the Inrich analysis. CMC data were generated as part of the CommonMind Consortium supported by funding from Takeda Pharmaceuticals Company Limited, F. Hoffman-La Roche Ltd and NIH grants R01MH085542, R01MH093725, P50MH066392, P50MH080405, R01MH097276, RO1-MH-075916, P50M096891, P50MH084053S1, R37MH057881 and R37MH057881S1, HHSN271201300031C, AG02219, AG05138 and MH06692. Brain tissue for the study was obtained from the following brain bank collections: the Mount Sinai NIH Brain and Tissue Repository, the University of Pennsylvania Alzheimer’s Disease Core Center, the University of Pittsburgh NeuroBioBank and Brain and Tissue Repositories and the NIMH Human Brain Collection Core. CMC Leadership: Pamela Sklar, Joseph Buxbaum (Icahn School of Medicine at Mount Sinai), Bernie Devlin, David Lewis (University of Pittsburgh), Raquel Gur, Chang-Gyu Hahn (University of Pennsylvania), Keisuke Hirai, Hiroyoshi Toyoshiba (Takeda Pharmaceuticals Company Limited), Enrico Domenici, Laurent Essioux (F. Hoffman-La Roche Ltd), Lara Mangravite, Mette Peters (Sage Bionetworks), Thomas Lehner, Barbara Lipska (NIMH). PsychENCODE data were generated as part of the PsychENCODE Consortium, supported by: U01MH103339, U01MH103365, U01MH103392, U01MH103340, U01MH103346, R01MH105472, R01MH094714, R01MH105898, R21MH102791, R21MH105881, R21MH103877, and P50MH106934 awarded to: Schahram Akbarian (Icahn School of Medicine at Mount Sinai), Gregory Crawford (Duke), Stella Dracheva (Icahn School of Medicine at Mount Sinai), Peggy Farnham (USC), Mark Gerstein (Yale), Daniel Geschwind (UCLA), Thomas M. Hyde (LIBD), Andrew Jaffe (LIBD), James A. Knowles (USC), Chunyu Liu (UIC), Dalila Pinto (Icahn School of Medicine at Mount Sinai), Nenad Sestan (Yale), Pamela Sklar (Icahn School of Medicine at Mount Sinai), Matthew State (UCSF), Patrick Sullivan (UNC), Flora Vaccarino (Yale), Sherman Weissman (Yale), Kevin White (UChicago) and Peter Zandi (JHU).
Author Contributions
S.S. designed the experiments, performed the analysis and authored the manuscript.
Data Availability
The CMC and the psychEncode datasets are available from the NIMH Repository and Genomics Resources (NRGR).
Code Availability
Analysis code used for this manuscript is available upon request and without restriction.
Competing Interests
The author declares no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary information accompanies this paper at 10.1038/s41598-019-42308-5.
References
- 1.Fromer, M. et al. Gene expression elucidates functional impact of polygenic risk for schizophrenia. Nat Neurosci. 19(11), 1442–53, 10.038/nn.4399 (2016). [DOI] [PMC free article] [PubMed]
- 2.Darby MM, Yolken RH, Sabunciyan S. Consistently altered expression of gene sets in postmortem brains of individuals with major psychiatric disorders. Transl Psychiatry. 2016;6(9):e890. doi: 10.1038/tp.2016.173. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Vawter MP, et al. Genome scans and gene expression microarrays converge to identify gene regulatory loci relevant in schizophrenia. Hum Genet. 2006;119(5):558–70. doi: 10.1007/s00439-006-0172-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Fillman, S. G. et al. Increased inflammatory markers identified in the dorsolateral prefrontal cortex of individuals with schizophrenia. Mol Psychiatry (2012). [DOI] [PubMed]
- 5.Roussos, P., Katsel, P., Davis, K. L., Siever, L. J. & Haroutunian, V. A System-Level Transcriptomic Analysis of Schizophrenia Using Postmortem Brain Tissue Samples. Arch Gen Psychiatry. 1–11 (2012). [DOI] [PubMed]
- 6.Vawter MP, et al. Gene expression of metabolic enzymes and a protease inhibitor in the prefrontal cortex are decreased in schizophrenia. Neurochem Res. 2004;29(6):1245–55. doi: 10.1023/B:NERE.0000023611.99452.47. [DOI] [PubMed] [Google Scholar]
- 7.Horvath S, Mirnics K. Schizophrenia as a Disorder of Molecular Pathways. Biol Psychiatry. 2014;10(14):00007–9. doi: 10.1016/j.biopsych.2014.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Laursen, T. M., Nordentoft, M. & Mortensenm, P. B. Excess early mortality in schizophrenia. Annu Rev Clin Psychol. 10, 425–48, 10.1146/annurev-clinpsy-032813-153657 (2014). [DOI] [PubMed]
- 9.Tiihonen J, et al. 11-year follow-up of mortality in patients with schizophrenia: a population-based cohort study (FIN11 study) Lancet. 2009;374(9690):620–7. doi: 10.1016/S0140-6736(09)60742-X. [DOI] [PubMed] [Google Scholar]
- 10.Koutsouleris N, et al. Accelerated brain aging in schizophrenia and beyond: a neuroanatomical marker of psychiatric disorders. Schizophr Bull. 2014;40(5):1140–53. doi: 10.1093/schbul/sbt142. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Kochunov P, et al. Testing the hypothesis of accelerated cerebral white matter aging in schizophrenia and major depression. Biol Psychiatry. 2013;73(5):482–91. doi: 10.1016/j.biopsych.2012.10.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Wright S, et al. Accelerated white matter aging in schizophrenia: role of white matter blood perfusion. Neurobiology of aging. 2014;35(10):2411–8. doi: 10.1016/j.neurobiolaging.2014.02.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Gandal MJ, et al. Shared molecular neuropathology across major psychiatric disorders parallels polygenic overlap. Science. 2018;359(6376):693–7. doi: 10.1126/science.aad6469. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Catts VS, et al. A microarray study of post-mortem mRNA degradation in mouse brain tissue. Brain research Molecular brain research. 2005;138(2):164–77. doi: 10.1016/j.molbrainres.2005.04.017. [DOI] [PubMed] [Google Scholar]
- 16.Gallego Romero I, Pai AA, Tung J, Gilad Y. RNA-seq: impact of RNA degradation on transcript quantification. BMC biology. 2014;12:42. doi: 10.1186/1741-7007-12-42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Risso D, Ngai J, Speed TP, Dudoit S. Normalization of RNA-seq data using factor analysis of control genes or samples. Nat Biotechnol. 2014;32(9):896–902. doi: 10.1038/nbt.2931. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Huang DW, et al. DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35(Web Server issue):W169–75. doi: 10.1093/nar/gkm415. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Biological insights from 108 schizophrenia-associated genetic loci. Nature, 511(7510), 421–7, 10.1038/nature13595 (2014). [DOI] [PMC free article] [PubMed]
- 20.Lee P. H., O’Dushlaine C., Thomas B. & Purcell S. M. INRICH: interval-based enrichment analysis for genome-wide association studies. Bioinformatics. 28(13), 1797–9, 10.093/bioinformatics/bts191 (2012). [DOI] [PMC free article] [PubMed]
- 21.Iwamoto K, Bundo M, Kato T. Altered expression of mitochondria-related genes in postmortem brains of patients with bipolar disorder or schizophrenia, as revealed by large-scale DNA microarray analysis. Hum Mol Genet. 2005;14(2):241–53. doi: 10.1093/hmg/ddi022. [DOI] [PubMed] [Google Scholar]
- 22.Altar CA, et al. Deficient hippocampal neuron expression of proteasome, ubiquitin, and mitochondrial genes in multiple schizophrenia cohorts. Biol Psychiatry. 2005;58(2):85–96. doi: 10.1016/j.biopsych.2005.03.031. [DOI] [PubMed] [Google Scholar]
- 23.Morikawa T, Manabe T. Aberrant regulation of alternative pre-mRNA splicing in schizophrenia. Neurochemistry international. 2010;57(7):691–704. doi: 10.1016/j.neuint.2010.08.012. [DOI] [PubMed] [Google Scholar]
- 24.Wu JQ, et al. Transcriptome sequencing revealed significant alteration of cortical promoter usage and splicing in schizophrenia. PLoS One. 2012;7(4):e36351. doi: 10.1371/journal.pone.0036351. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Molinaro L, Hui P, Tan M, Mishra RK. Role of presynaptic phosphoprotein synapsin II in schizophrenia. World journal of psychiatry. 2015;5(3):260–72. doi: 10.5498/wjp.v5.i3.260. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Colantuoni C, et al. Age-related changes in the expression of schizophrenia susceptibility genes in the human prefrontal cortex. Brain structure & function. 2008;213(1-2):255–71. doi: 10.1007/s00429-008-0181-5. [DOI] [PubMed] [Google Scholar]
- 27.Bishop NA, Lu T, Yankner BA. Neural mechanisms of ageing and cognitive decline. Nature. 2010;464(7288):529–35. doi: 10.1038/nature08983. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Grimm A, Eckert A. Brain aging and neurodegeneration: from a mitochondrial point of view. Journal of neurochemistry. 2017;143(4):418–31. doi: 10.1111/jnc.14037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Mazin P, et al. Widespread splicing changes in human brain development and aging. Molecular systems biology. 2013;9:633. doi: 10.1038/msb.2012.67. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Foster TC. Interaction of rapid signal transduction cascades and gene expression in mediating estrogen effects on memory over the life span. Frontiers in neuroendocrinology. 2005;26(2):51–64. doi: 10.1016/j.yfrne.2005.04.004. [DOI] [PubMed] [Google Scholar]
- 31.Magnoni MS, Govoni S, Battaini F, Trabucchi M. The aging brain: protein phosphorylation as a target of changes in neuronal function. Life sciences. 1991;48(5):373–85. doi: 10.1016/0024-3205(91)90492-T. [DOI] [PubMed] [Google Scholar]
- 32.Kirkpatrick, B., Messias, E., Harvey, P. D., Fernandez-Egea, E. & Bowie, C. R. Is schizophrenia a syndrome of accelerated aging? Schizophr Bull. 34(6), 1024–32 (2008). [DOI] [PMC free article] [PubMed]
- 33.Tang, B. et al. Normal human aging and early-stage schizophrenia share common molecular profiles. Aging Cell. 8(3), 339–42, 10.1111/j.474-9726.2009.00468.x (2009). [DOI] [PMC free article] [PubMed]
- 34.Torkamani, A., Dean, B., Schork, N. J. & Thomas, E. A. Coexpression network analysis of neural tissue reveals perturbations in developmental processes in schizophrenia. Genome Res. 20(4), 403–12, 10.1101/gr.101956.109 (2010). [DOI] [PMC free article] [PubMed]
- 35.Fillman SG, et al. Increased inflammatory markers identified in the dorsolateral prefrontal cortex of individuals with schizophrenia. Mol Psychiatry. 2013;18(2):206–14. doi: 10.1038/mp.2012.110. [DOI] [PubMed] [Google Scholar]
- 36.Hwang Y, et al. Gene expression profiling by mRNA sequencing reveals increased expression of immune/inflammation-related genes in the hippocampus of individuals with schizophrenia. Transl Psychiatry. 2013;3:e321. doi: 10.1038/tp.2013.94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Potvin, S. et al. Inflammatory cytokine alterations in schizophrenia: a systematic quantitative review. Biol Psychiatry. 63(8), 801–8. Epub 2007 Nov 19 (2008). [DOI] [PubMed]
- 38.de Leon J, Diaz FJ. A meta-analysis of worldwide studies demonstrates an association between schizophrenia and tobacco smoking behaviors. Schizophr Res. 2005;76(2-3):135–57. doi: 10.1016/j.schres.2005.02.010. [DOI] [PubMed] [Google Scholar]
- 39.Fowler IL, Carr VJ, Carter NT, Lewin TJ. Patterns of current and lifetime substance use in schizophrenia. Schizophr Bull. 1998;24(3):443–55. doi: 10.1093/oxfordjournals.schbul.a033339. [DOI] [PubMed] [Google Scholar]
- 40.Chen, J., Huang, X. F., Shao, R., Chen, C., Deng, C. Molecular Mechanisms of Antipsychotic Drug-Induced Diabetes. Front Neurosci. 11, 643, 10.3389/fnins.2017.00643 eCollection 2017 (2017). [DOI] [PMC free article] [PubMed]
- 41.Silva, A., Ribeiro, M., Sousa-Rodrigues, C. F. & Barbosa, F. T. Association between antipsychotics and cardiovascular adverse events: A systematic review. 1992), 63(3), 261–7, 10.1590/806-9282.63.03.261 (2017). [DOI] [PubMed]
- 42.Haijma SV, et al. Brain volumes in schizophrenia: a meta-analysis in over 18 000 subjects. Schizophr Bull. 2013;39(5):1129–38.. doi: 10.1093/schbul/sbs118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Hashimoto N, et al. The effect of duration of illness and antipsychotics on subcortical volumes in schizophrenia: Analysis of 778 subjects. NeuroImage Clinical. 2018;17:563–9. doi: 10.1016/j.nicl.2017.11.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Szymanski SR, Cannon TD, Gallacher F, Erwin RJ, Gur RE. Course of treatment response in first-episode and chronic schizophrenia. Am J Psychiatry. 1996;153(4):519–25. doi: 10.1176/ajp.153.4.519. [DOI] [PubMed] [Google Scholar]
- 45.Altamura AC, Serati M, Buoli M. Is duration of illness really influencing outcome in major psychoses? Nordic journal of psychiatry. 2015;69(6):403–17. doi: 10.3109/08039488.2014.990919. [DOI] [PubMed] [Google Scholar]
- 46.Akbarian, S. et al. The PsychENCODE project. Nat Neurosci. 18(12), 1707–12, 10.038/nn.4156 (2015). [DOI] [PMC free article] [PubMed]
- 47.Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput Biol. 9(8), e1003118, 10.1371/journal.pcbi (2013). [DOI] [PMC free article] [PubMed]
- 48.Research NLB. DAVID Bioinformatics Resource, https://scholar.google.com/citations?user=dMn7gzYAAAAJ Google Scholar: Google (2019).
- 49.Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37(1):1–13. doi: 10.1093/nar/gkn923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Becker KG, White SL, Muller J, Engel J. BBID: the biological biochemical image database. Bioinformatics (Oxford, England). 2000;16(8):745–6. doi: 10.1093/bioinformatics/16.8.745. [DOI] [PubMed] [Google Scholar]
- 51.Kim Y, et al. Comparative genomic evidence for the involvement of schizophrenia risk genes in antipsychotic effects. Mol Psychiatry. 2018;23(3):708–12. doi: 10.1038/mp.2017.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The CMC and the psychEncode datasets are available from the NIMH Repository and Genomics Resources (NRGR).
Analysis code used for this manuscript is available upon request and without restriction.