Abstract
Structural variations of the human brain are heritable and highly polygenic traits, with hundreds of associated genes identified in recent genome-wide association studies (GWAS). Transcriptome-wide association studies (TWAS) can both prioritize these GWAS findings and identify additional gene-trait associations. Here we perform cross-tissue TWAS analysis of 211 structural neuroimaging traits and discover 278 associated genes exceeding the Bonferroni significance threshold of 1.04 × 10−8. The TWAS-significant genes for brain structures have been linked to a wide range of complex traits in different domains. Through TWAS gene-based polygenic risk scores (PRS) prediction, we find that TWAS PRS gains substantial power in association analysis compared to conventional variant-based GWAS PRS, and up to 6.97% of phenotypic variance (p-value = 7.56 × 10−31) can be explained in independent testing data sets. In conclusion, our study illustrates that TWAS can be a powerful supplement to traditional GWAS in imaging genetics studies for gene discovery-validation, genetic co-architecture analysis, and polygenic risk prediction.
Subject terms: Gene expression, Neuroscience
Brain structural traits are highly heritable and have been linked to disease. Here the authors have used gene expression data to perform a transcriptome-wide association study on 211 brain structural traits, discovering 278 associated genes.
Introduction
Variations in brain structure and microstructure across individuals are associated with many neurological and psychiatric (referred to as neuropsychiatric hereafter) traits including cognitive functions1–5, neurodegenerative, neurodevelopmental, and psychiatric disorders6–9, as well as alcohol and tobacco consumption10, and physical bone density11. Structural variations of the human brain can be quantified by multimodal magnetic resonance imaging (MRI). Specifically, the T1-weighted MRI (T1-MRI) can provide basic morphometric information of brain tissues, such as volume, surface area, sulcal depth, and cortical thickness. In region of interest (ROI)-based T1-MRI analysis, images are annotated onto ROIs of a pre-defined brain atlas, and then both global (e.g., whole brain, gray matter, white matter) and local (e.g., basal ganglia structures, limbic, and diencephalic regions) markers can be generated to measure the brain anatomy. On the other hand, diffusion MRI (dMRI) can capture local tissue microstructure through the random movement of water. Using diffusion tensor imaging (DTI) models, brain structural connectivity can be quantified by using white matter tracts extracted from dMRI, which build physical connections among brain ROIs and are involved in connected networks for various brain functions12,13. See Miller et al.11 and Elliott et al.14 for a global overview and more information about neuroimaging modalities used in the present study.
Structural neuroimaging traits have shown a moderate-to-high degree of heritability in both twin and population-based studies14–24. In the past decade, genome-wide association studies (GWAS)14,24–34 have been conducted to identify the associated genetic variants (typically single-nucleotide polymorphisms [SNPs]) for brain structures. A highly polygenic35,36 genetic architecture has been observed, indicating that a large number of genetic variants contribute to variations in brain structure measured by neuroimaging biomarkers21,37. Particularly, using data from the UK Biobank (UKB38) cohort, two recent large-scale GWAS have identified 578 associated genes for 101 regional brain volumes derived from T1-MRI39 (referred to as ROI volumes, n = 19,629) and 110 DTI parameters of dMRI40 (referred to as DTI parameters, n = 17,706). Some of these discovered genes had been implicated in neuropsychiatric diseases or traits by previous GWAS. However, most of them have not been verified and need further investigation. Complementary to traditional GWAS, transcriptome-wide association studies (TWAS) have become increasingly adopted in gene-trait association analysis thanks to recent advances in gene expression imputation methods41–47 and the burgeoning generation of expression imputation reference data sets (e.g., the Genotype-Tissue Expression (GTEx) project48). Despite some challenges49 such as interpreting causality, TWAS has successfully discovered additional gene-trait associations and provided insights into biological mechanisms for many complex traits50. Through imputed transcriptomes, TWAS can reduce the multiple testing burden and leverage gene expression data to increase testing power for gene-trait association detection. This is a particularly desirable feature for imaging genetics studies, for which most neuroimaging GWAS data sets continue to have small sample sizes and a heavy multiple testing burden51.
In this work, we performed TWAS analysis for 211 structural neuroimaging traits including 101 ROI volumes and 110 DTI parameters. As these brain-related traits tend to be highly polygenic21,37 and are related to many traits across a range of categories11, we used a cross-tissue (panel) TWAS approach (UTMOST43) in our main analysis. UTMOST first performs single-tissue gene-trait association analysis in each reference panel with both within-tissue and cross-tissue statistical penalties, and then combines these single-tissue results using the Generalized Berk-Jones (GBJ) test52, which accommodates tissue dependence and can account for the potential sharing of local expression regulation across tissues. The UKB data set was used in the discovery phase (n = 19,629 for ROI volumes and 17,706 for DTI parameters, respectively). For the discovery UKB cohort, we compared TWAS-significant genes with previous GWAS findings in gene-based association analysis via MAGMA53 and gene-level functional mapping and annotation results by FUMA54. The UKB TWAS results were validated in five independent data sources, including Philadelphia Neurodevelopmental Cohort (PNC55, n = 537), Alzheimer’s Disease Neuroimaging Initiative (ADNI56, n = 860), Pediatric Imaging, Neurocognition, and Genetics (PING57, n = 461), the Human Connectome Project (HCP58, n = 334), and the ENIGMA224 and ENIGMA-CHARGE collaboration34 (n = 13,193, for eight ROI volume traits, referred to as ENIGMA in this paper). Chromatin interaction enrichment analysis was conducted for TWAS-significant genes. Finally, we developed TWAS gene-based polygenic risk scores59 (PRS) using FUSION41 to fully assess polygenic architecture and examine the predictive capability of the UKB TWAS results.
Results
Overview of TWAS discovery-validation in the six data sets
We conducted a two-phase discovery-validation TWAS analysis for 211 neuroimaging traits by using the UKB cohort for discovery and the other data sets (ADNI, HCP, PING, PNC, and ENIGMA) for validation. We applied the UTMOST gene expression imputation models trained on GTEx tissues, and used GWAS summary statistics generated from previous GWAS as inputs. We refer to 1.04 × 10−8 (that is, 5 × 10−2/22,694/211, adjusted for all candidate genes and traits analyzed) as the significance threshold for gene-trait associations unless otherwise stated. The original version of UTMOST models was trained using GTEx v6 as the reference. In this study, we retrained the UTMOST models using the recently released GTEx v8 data and performed our analysis using both versions. As the GTEx v6 and v8 databases share individual-level samples, we are particularly interested in the associations that can be consistently detected in the two versions. Therefore, in the rest of this paper we report genes that were either (1) significant in both versions; or (2) significant in one version and within a ±1 MB window of at least one significant gene in the other version (Methods).
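The threshold quoted above follows directly from the numbers in the text. The short sketch below reproduces the arithmetic; the function name is illustrative and not part of UTMOST or any other software used in the study.

```python
def bonferroni_threshold(alpha: float, n_genes: int, n_traits: int) -> float:
    """Per-test threshold when every candidate gene is tested against every trait."""
    return alpha / (n_genes * n_traits)


# Values taken from the text: 22,694 candidate genes and 211 traits in UKB.
ukb_threshold = bonferroni_threshold(0.05, 22_694, 211)
print(f"{ukb_threshold:.3g}")  # prints 1.04e-08
```

The same function gives the cohort-specific thresholds once the number of traits analyzed in that cohort is substituted for 211.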
The UKB discovery phase identified 918 significant gene-trait associations (Supplementary Data 1) between 278 genes and 152 neuroimaging traits (57 ROI volumes, 95 DTI parameters). Of the 278 TWAS-significant genes, 90 (32.4%) had significant associations with more than two neuroimaging traits, 29 (10.4%) had more than five significant associations, and 16 (5.8%) had at least ten, including POLR2F, TREH, OR1F12, FOXF1, LRRC37A, AC008105.1, MAPT, ARHGAP27, EIF4EBP3, PLEKHM1, ZKSCAN4, CCDC157, XRCC4, AC005670.1, CRHR1, and RECQL4. These 16 genes together contributed 344 (37.5%) of the 918 gene-trait associations, indicating their widespread influences on brain structures. Specifically, we identified 173 genes whose imputed gene expression levels were significantly associated with one or more of the 57 ROI volumes (328 associations in total, 186 additional, Supplementary Fig. 1), and 140 significantly associated genes (35 overlapping) for one or more of the 95 DTI parameters (590 associations in total, 277 additional, Supplementary Fig. 2).
Figure 1 illustrates that TWAS prioritized previous GWAS findings of MAGMA and FUMA and also discovered many additional associations and genes. Moreover, some genes were associated with both ROI volumes and DTI parameters, while others were more specifically related to certain structures (Supplementary Fig. 3). For example, XRCC4, ZKSCAN4, EIF4EBP3, and CD14 were associated with DTI parameters but not ROI volumes, DEFB124, COX4I2, HCK, HM13, and REM1 showed associations with putamen and pallidum volumes, and the associations of PLEKHM1, LRRC37A, MAPT, AC005670.1, RECQL4, ARHGAP27, and CRHR1 were spread widely across DTI parameters and total brain volume.
We validated the UKB results in the other five independent cohorts. For each data set, we applied the Bonferroni-corrected significance threshold accounting for all candidate genes and traits analyzed (that is, 5 × 10−2/22,694/number of traits, Supplementary Data 2–6). We found that 19 additional UKB TWAS-significant genes (NPSR1, TREH, CRYBA1, MFRP, SLX1B, RPL13AP3, GALP, KCNH7, DCTPP1, LINC02454, JPH3, IL4, HCK, TIMM8AP1, LGALS3, LINC02057, RECQL4, DLGAP5, and AC090666.1) can be validated in one or more of the five data sets. These data sets also replicated six previous UKB GWAS-significant genes (NUP210L, MIR1-1HG, DOK5, KRTAP5-1, AC008393.1, and DPP4), and four genes that were significant in both UKB TWAS and GWAS (DCC, LRRC37A, ANKRD42, and DLG2) (Supplementary Fig. 4). The additional TWAS findings and validated genes are discussed in further detail below.
Additional TWAS discoveries and validated genes
Of the 278 UKB TWAS-significant genes, 159 were not discovered in previous GWAS of the same UKB data set (Supplementary Data 7). TWAS resulted in 102 additional associated genes for 54 ROI volumes (186 associations, Supplementary Fig. 5), and 75 additional genes for 90 DTI parameters (277 associations, Supplementary Fig. 6). According to NHGRI-EBI GWAS catalog60, the 159 TWAS-significant genes replicated 21 previous findings on brain structures, including JPH361 for hippocampal volume in mild cognitive impairment, CRYBA133 for brain stem volume measurement, AC145285.233 for caudate nucleus volume, and C1QL162 for white matter hyperintensity burden. The other 138 genes had not been linked to brain structure previously and thus can be regarded as additional genes for these 211 neuroimaging traits. To explore the genetic overlaps with other traits in different domains, we performed association lookups for the 159 TWAS genes on the NHGRI-EBI GWAS catalog. Figure 2 shows that these genes were widely associated with anthropometric measures (e.g., height, waist-to-hip ratio, heel bone mineral density, body mass index), neuropsychiatric traits (e.g., cognitive function, intelligence, math ability, schizophrenia, bipolar disorder, Alzheimer’s disease), coronary artery disease, mean corpuscular hemoglobin, neuroticism, education, reaction time, chronotype, smoking behavior, and alcohol use, such as ELL63–65, SH2B166–69, IL2768,70, KCNH771,72, HYI73,74, and GNAT175,76.
For the 29 TWAS-validated genes shown in Supplementary Fig. 4, ten (ANKRD42, DCC, LRRC37A, NUP210L, DOK5, KRTAP5-1, MIR1-1HG, AC008393.1, DLG2, and DPP4) had been discovered in the previous UKB GWAS and were implicated in brain-related complex traits, such as neuroticism77, major depression78, schizophrenia75,79,80, intelligence70, math ability72, reaction time68, and insomnia81. The remaining 19 genes, which are additional findings from our TWAS analysis, also had known associations with various neuropsychiatric traits. For example, previous GWAS reported that HCK was associated with chronotype81, LGALS3 with schizophrenia82, AC090666.1 with neuroticism71, CRYBA1 with depression78, RECQL4 with cognitive ability68, KCNH7 with cognitive performance72 and reaction time68, and JPH3 with bipolar disorder83 and cognitive impairment61. Moreover, we found that DCC, MIR1-1HG, DPP4, and RECQL4 were specifically associated with brain-related traits and disorders, while other genes (such as NUP210L, DLG2, AC090666.1, KCNH7, and JPH3) were also widely associated with non-brain traits, including triglycerides84, mean platelet volume64, and coronary artery disease85. In summary, the additional and validated TWAS genes expand the overview of gene-level pleiotropy across these traits, suggesting that neuroimaging-derived biomarkers could be useful in studying a wide range of complex traits.
Comparing power to detect the association between brain tissues and all tissues
As a comparison, we performed a brain tissue-specific version of UTMOST TWAS that only combined brain tissues (10 brain tissues in GTEx v6 or 13 brain tissues in GTEx v8, Method). This brain tissue-specific TWAS detected 396 significant gene-trait associations (Supplementary Data 8) between 134 unique genes and 81 neuroimaging traits, including 84 associated genes for one or more of 29 ROI volumes (136 associations, Supplementary Fig. 7), and 68 genes (18 overlapping) for one or more of 52 DTI parameters (260 associations, Supplementary Fig. 8).
Most (119/134) of the brain tissue-specific genes have been identified by either the cross-tissue TWAS (117/134) or previous GWAS (65/134). The 15 genes that were uniquely identified by brain tissue-specific analysis included DNAJC2, LHFPL3, NUPR1, UQCRQ, BCL2L1, MBD2, KNCN, NUFIP2, MIB2, C3orf62, CDHR4, FXYD1, TMEM173, ZSCAN31, and PI4KAP2. Among them, LHFPL3 showed associations with education86, social behavior87,88, cognitive ability68, schizophrenia89, and bipolar disorder90. MBD2 was associated with reaction time68, ZSCAN31 with schizophrenia89 and cross disorders91, and NUPR1, CDHR4, and C3orf62 with intelligence81,92.
Compared with brain tissue-specific TWAS, the cross-tissue analysis clearly identified more signals. For example, of the 328 gene-trait associations identified by cross-tissue analysis of ROI volumes, 142 had been identified in GWAS, 50 can be additionally identified by brain tissue-specific TWAS, and 136 can only be detected by cross-tissue analysis (Supplementary Fig. 9). Similarly, 313 of the 590 cross-tissue TWAS associations for DTI can be identified in GWAS, 90 can be additionally identified by brain tissue-specific TWAS, and 187 were cross-tissue TWAS only (Supplementary Fig. 10). These results illustrate the advantage of cross-tissue analysis over brain tissue-specific TWAS for discovering association signals that are difficult to identify in traditional GWAS. We further compared their results in a few follow-up analyses below.
Comparison with GWAS variant-level signals and conditional analysis
For each of the 918 gene-trait associations detected in cross-tissue TWAS, we used previous GWAS summary statistics to check the most significant variant within the gene region (with a 1 MB window on each side) that was pinpointed in the same UKB data set (Method). The GWAS p value of the most significant variant (i.e., the variant with the smallest p value) was >1 × 10−6 for associations of 19 genes (Supplementary Data 9). None of them had been identified by MAGMA or FUMA, indicating that it can be difficult to detect these genes by GWAS or post-GWAS screening for any of these neuroimaging traits. Of the 19 genes, seven (GALP, LINC02057, CRYBA1, TREH, IL4, DCTPP1, RECQL4) were validated in one or more of the five validation data sets and were discussed in the previous section. Among the other 12 genes (LGALS16, MYO9A, FAM83C, CEACAMP3, H4C11, AC005670.1, OR10V3P, TMEM136, CELSR3, TMEM101, CCDC157, and GDF5), MYO9A was reported for defects in the structure and function of the neuromuscular junction93, the FAM83 family was linked to certain brain tumors94, CELSR3 was associated with education71 and cognitive ability70,77, and CCDC157 was found to be associated with white matter microstructure in other data sets95. The same check was then performed for the 396 significant gene-trait associations of brain tissue-specific TWAS. We found that only DCTPP1 and CCDC157 had minimum GWAS p value >1 × 10−6 (Supplementary Data 10).
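The variant-level lookup described above can be sketched as follows. This is a minimal illustration on toy data, assuming GWAS summary statistics are available as (chromosome, position, p value) records; the `Variant` type, the `top_variant` helper, and all coordinates are hypothetical and are not taken from the study's pipeline.

```python
from typing import List, NamedTuple, Optional


class Variant(NamedTuple):
    chrom: str
    pos: int      # base-pair position
    pval: float   # GWAS p value for one trait


WINDOW_BP = 1_000_000  # 1 MB on each side of the gene, as in the text


def top_variant(variants: List[Variant], chrom: str, gene_start: int,
                gene_end: int, window: int = WINDOW_BP) -> Optional[Variant]:
    """Return the variant with the smallest p value inside the padded gene region."""
    lo, hi = gene_start - window, gene_end + window
    in_region = [v for v in variants if v.chrom == chrom and lo <= v.pos <= hi]
    return min(in_region, key=lambda v: v.pval) if in_region else None


# Toy summary statistics (made-up values).
variants = [
    Variant("1", 500_000, 3e-4),     # outside the padded region
    Variant("1", 2_400_000, 2e-7),   # inside the padded region
    Variant("1", 5_000_000, 1e-12),  # outside the padded region
    Variant("2", 2_000_000, 1e-20),  # different chromosome
]
best = top_variant(variants, "1", 2_000_000, 2_100_000)

# A gene-trait pair is flagged when even the best nearby variant stays above 1e-6.
weak_gwas_signal = best is not None and best.pval > 1e-6
```

In this toy case the best variant in the region has p = 2 × 10−7, so the pair would not be flagged.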
We next performed a conditional analysis to see whether the TWAS signals remained significant after adjustment for the most significant genetic variant used in UTMOST gene expression imputation models (Method). Although our cross-tissue analysis combined information from many genetic variants across various human tissues, we found that 472 associations may indeed be dominated by the strongest GWAS signal of the imputation model, as their conditional p-values were larger than 0.05 (Supplementary Data 11). However, the conditional p values of eight genes (WIF1, XRCC4, C15orf56, CCDC53, RPSAP52, CCDC157, AMZ1, NMT1) were smaller than 1 × 10−6 for 23 gene-trait associations, suggesting that these associations were unlikely to be driven by a single genetic variant. When the p value threshold was relaxed to 1 × 10−3, 118 associations of 42 genes persisted after conditional analysis. Similar conditional analysis was also performed on significant associations of brain tissue-specific TWAS. The conditional p values were smaller than 1 × 10−6 for five genes (XRCC4, C15orf56, NMT1, CCDC157, AMZ1) with 20 associations, and were smaller than 1 × 10−3 for 25 genes with 84 associations (Supplementary Data 12).
Chromatin interaction enrichment and genetic overlaps
To explore the biological interpretations of TWAS and GWAS-significant genes, we performed enrichment analysis in promoter-related chromatin interactions of four types of brain cells96 (induced pluripotent stem cells (iPSC)-induced excitatory neurons, iPSC-derived hippocampal DG-like neurons, iPSC-induced lower motor neurons, and primary astrocytes) (Method). Both GWAS and cross-tissue TWAS-significant genes were significantly enriched in chromatin interactions of astrocytic glial cells (Supplementary Data 13, Wilcoxon rank test, p value < 2.8 × 10−2), and combining GWAS and cross-tissue TWAS-significant genes resulted in a smaller p value (1.04 × 10−3). Cross-tissue TWAS-significant genes were also significantly enriched in chromatin interactions from two neuron types (excitatory and lower motor neurons). For all of the three neuron types, cross-tissue TWAS-significant genes had smaller enrichment p values (p value range = [2.3 × 10−2, 6.18 × 10−2]) than those of GWAS-significant genes (p value range = [0.11, 0.57]). Overall, these results suggest that cross-tissue TWAS-significant genes interacted more actively with other chromatin regions and may play a more important role in regulating gene expression than other genes. In contrast, brain tissue-specific TWAS-significant genes did not show any significant enrichment (p value range = [0.14, 0.68]), indicating the value of cross-tissue TWAS over brain tissue-specific TWAS.
Next, we applied fastENLOC97 to perform colocalization analysis for the 278 cross-tissue TWAS-significant genes (Methods). We found that 96 of the 278 (34.5%) genes (involving 233 of 918 gene-trait associations) had regional colocalization probability (RCP) > 0.1 in at least one tissue type and seven genes (involving 17 gene-trait associations) had RCP > 0.9 (Supplementary Data 14). Among them, there are known risk genes. For example, SLC16A8 is a known risk gene of glioma/glioblastomas98. In our cross-tissue TWAS analysis, SLC16A8 was significantly associated with multiple white matter microstructure traits, and fastENLOC colocalization analysis also found that SLC16A8 had a high colocalization probability (0.919) with expression quantitative trait loci (eQTL) signals in GTEx v8 nerve tibial tissue type.
To further explore the gene-level genetic overlaps among brain structure and other complex traits and clinical outcomes, we performed cross-tissue TWAS analysis for 16 other brain-related complex traits with a large GWAS sample size, including neuropsychiatric traits, cognition, and cardiovascular risk factors (Supplementary Data 15). We found that 112 of the 278 cross-tissue TWAS-significant genes of neuroimaging traits were also significantly associated with one or more of 14 traits (p value < 5 × 10−2/22,694/16, Supplementary Data 16, Fig. 3). These results suggest that the genes involved in brain structure changes are often related to vascular risk factors and are also active in brain functions and neuropsychiatric disorders/diseases. For example, we found 65 overlapping genes with cognitive function, 54 with education, 53 with numerical reasoning, 50 with intelligence, 39 with neuroticism, 37 with drinking behavior, and 22 with schizophrenia. A large proportion (83/112) of these genes were associated with more than one neuropsychiatric trait, and 13 genes were linked to more than five traits, including NSF, LRP4, ZSCAN9, CRHR1, ARHGAP27, RECQL4, C1QTNF4, KCNH7, MAPT, FAM180B, AC005829.1, AC005670.1, and AC090666.1, indicating the high degree of statistical pleiotropy99 of these genes.
We next performed some additional analyses for the 19 validated UKB TWAS additional genes. First, we found that JPH3 has a high probability of being loss-of-function (LoF) intolerant100 (pLI = 0.986), indicating its intolerance to LoF variation. JPH3 has also been reported for brain disorders, including Huntington disease101,102, Huntington Disease-Like 2101,103, spinocerebellar ataxia101, and Dentatorubral-pallidoluysian atrophy104. Second, DCTPP1 and DLGAP5 were also identified by a recent eQTL study of developing human brain105. Moreover, LGALS3 and DLGAP5 were within the mitotic progenitors and cell division function module in the constructed transcriptional networks106, and JPH3 was within the adult neurons, synaptic transmission, and neuron projection development function module, indicating their potential functions in biological processes of brain development. In addition, NPSR1, GALP, KCNH7, JPH3, IL4, and LGALS3 mutations have been reported to be related to behavior/neurological phenotypes in mice (Mouse Genome Informatics, http://www.informatics.jax.org/).
TWAS gene-based polygenic risk scores analysis
To fully assess the polygenic genetic architecture of neuroimaging traits and examine the predictive ability of UKB TWAS results, we constructed TWAS gene-based PRS on subjects in PNC, HCP, PING, and ADNI cohorts for all of the 211 neuroimaging traits (Method). The prediction analysis was conducted separately on 52 reference panels (13 GTEx v7 brain tissues, 35 GTEx v7 other tissues, 1 non-GTEx brain tissue, and 3 non-GTEx other tissues) using the FUSION41 software and database. We found that genetically predicted profiles for 28 ROI volumes (Fig. 4) and 23 DTI parameters (Supplementary Fig. 11) were significantly associated with the corresponding observed traits in all testing data sets after Bonferroni correction (that is, 101 × 4 + 3 × 110 = 734 tests). Compared with previous SNP-based PRS analysis that yielded significant PRS profiles for 11 ROI volumes39, gene-based PRS profiles were significant for more ROI volumes, such as left/right insula, left/right pallidum, left/right ventral DC, left/right fusiform, and left/right transverse temporal, suggesting a substantial power gain in association analysis of PRS. The significant TWAS PRS can account for 0.97–6.97% phenotypic variance (p value range = [7.56 × 10−31, 6.81 × 10−5]) (Supplementary Data 17–18), which was within a similar range to SNP-based PRS analysis (1.17–6.38%)39. For example, the (incremental) R2 of TWAS PRS of cerebellar vermal lobules VIII–X was 6.97% in PNC and 6.48% in HCP, and the R2 of SFO MD-derived TWAS PRS was 3.8% in PING and 2.41% in PNC.
To evaluate the additional prediction power that TWAS PRS has on top of traditional GWAS PRS, we next included both GWAS and TWAS PRS together as predictors in one linear model to predict the above 28 TWAS-significant ROI volumes (Method). Compared to the linear model with TWAS or GWAS PRS only, we found that the prediction accuracy was improved for most ROIs when using both types of PRS (Fig. 5). Conditioning on GWAS PRS, TWAS PRS can additionally explain 0.33–5.22% of phenotypic variance (Supplementary Data 19, Supplementary Fig. 12). The two PRS together can have 1.48–9.02% prediction R2 (Supplementary Data 20, Supplementary Fig. 13). For example, the R2 of cerebellar vermal lobules VIII–X became 7.94% in PNC and 9.02% in HCP, in which TWAS PRS additionally contributed 5.22% and 3.66% for PNC and HCP, respectively. On the other hand, conditioning on TWAS PRS, GWAS PRS increased the R2 by 0.02–4.65% (Supplementary Data 21, Supplementary Fig. 14). These results clearly demonstrate the unique value of TWAS PRS for complex traits prediction and suggest that combining both GWAS and TWAS PRS can achieve better prediction accuracy.
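The notion of the additional variance explained by one score conditional on the other can be illustrated with nested linear models. The sketch below is a simplified, assumption-laden example: it uses simulated scores and a simulated trait, omits the adjustment covariates that a real analysis would include, and fits ordinary least squares with a small hand-written solver. None of the names or numbers come from the study.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on an intercept plus predictors."""
    n = len(y)
    X = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(beta[a] * X[i][a] for a in range(k)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Simulated, correlated scores and a trait influenced by both (illustrative only).
rng = random.Random(0)
n = 500
gwas_prs = [rng.gauss(0, 1) for _ in range(n)]
twas_prs = [0.5 * g + rng.gauss(0, 1) for g in gwas_prs]
trait = [0.2 * g + 0.2 * t + rng.gauss(0, 1) for g, t in zip(gwas_prs, twas_prs)]

r2_gwas = r_squared(trait, [gwas_prs])
r2_twas = r_squared(trait, [twas_prs])
r2_both = r_squared(trait, [gwas_prs, twas_prs])

incr_twas_given_gwas = r2_both - r2_gwas  # extra variance explained by TWAS PRS
incr_gwas_given_twas = r2_both - r2_twas  # extra variance explained by GWAS PRS
```

Because the single-score models are nested in the joint model, the joint R2 can never be smaller than either single-score R2, which is why both increments are non-negative.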
We also examined the performance of each reference panel on these significant traits. There was a significant linear relationship between the panel sample size and average prediction R2 (48 GTEx reference panels, simple correlation = 0.53, p value = 1.21 × 10−4, Supplementary Fig. 15), which means that currently, the panel sample size may dominate the performance of TWAS PRS analysis regardless of the tissue specificity59. Among the brain tissue panels, we found that cerebellum tissue had the largest sample size and also showed the highest average R2 (Supplementary Data 22), further supporting the importance of reference panel sample size. Thus, we expect that reference panels with larger sample sizes, as they become available, will improve the prediction power of TWAS PRS.
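The relationship between panel size and prediction accuracy is summarized above by a simple (Pearson) correlation. The sketch below shows how such a correlation, and the t statistic commonly used to test it, can be computed; the panel sizes and R2 values in the toy lists are invented for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing zero correlation."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical panel sample sizes and average prediction R^2 per panel.
panel_sizes = [100, 150, 200, 300, 400]
mean_r2 = [0.010, 0.012, 0.011, 0.016, 0.018]
r_toy = pearson_r(panel_sizes, mean_r2)

# Using the values reported in the text (correlation 0.53 across 48 panels):
t_reported = t_from_r(0.53, 48)
```

A t statistic of this size on 46 degrees of freedom corresponds to a two-sided p value on the order of 10−4, in line with the reported value.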
Discussion
In this study, we applied TWAS methods to 211 neuroimaging traits to identify genes whose imputed expression levels were associated with brain structure variations. Using a cross-tissue approach, our main discovery analysis identified 138 additional genes and validated 29 significant genes at stringent Bonferroni correction p value thresholds. Conditional analysis and comparison with GWAS variant-level results suggested that the identification and validation of additional genes reflect the ability of TWAS to reduce the testing burden and to combine small genetic variant effects. We also performed brain tissue-specific TWAS and illustrated the unique strengths of cross-tissue TWAS in conditional and enrichment analyses. Many brain structure-related genes were known genetic factors for a wide range of complex traits, ranging from physical traits, cognition, mental diseases/disorders, and blood assays to lifestyle, which extends the potential applications of neuroimaging traits. Some of these genetic overlaps were additionally highlighted by a TWAS analysis of other complex traits.
The present study faces some limitations. First, as these results are purely based on statistical associations, it is hard to draw conclusions about the underlying causality and prioritize causal genes43,107. This is also one of the main challenges for most of the current TWAS approaches49. Follow-up experimental validation is clearly needed to confirm TWAS results and pinpoint the causal genes of brain structure changes. In addition, colocalization analysis (such as fastENLOC) can also help prioritize genes having more evidence of causal association. Second, the brain tissue-specific TWAS did not yield many additional results compared with the previous GWAS, and brain tissue panels did not show better prediction accuracy than non-brain tissues in gene-based PRS analysis. Both observations support the use of multiple tissues in our analysis to increase testing power for association analysis, but they make the causal interpretation of TWAS results even more complicated. The better performance of cross-tissue analysis may be partially explained by the fact that multi-tissue approaches additionally evaluate cross-tissue evidence108,109. In addition, though gene-based PRS had much better power in association tests than SNP-based polygenic scores, their prediction accuracies were similar. These limitations may be due to the fact that current brain tissue reference panels, like many other tissues, do not have large sample sizes, and/or the associated gene expression imputations may be of low quality. For example, imputations using genetic variants with low frequency may not be accurate when the reference panel sample size is small. Despite these limitations, TWAS holds, and is beginning to deliver on, the promise of becoming a powerful supplement to traditional GWAS in imaging genetics studies. In our study, many additional gene-trait associations were discovered and the underlying genetic overlaps among complex traits were substantially expanded.
With better brain tissue gene expression reference panels and more neuroimaging GWAS data sets available, future TWAS analyses of neuroimaging traits are expected to show the value of tissue specificity and improve our understanding of the genetic basis of human brain.
Methods
GWAS summary statistics data sets
We made use of GWAS summary statistics to test for gene-trait associations in our TWAS study. The GWAS summary-level data were from six studies, including the UKB38 (http://www.ukbiobank.ac.uk/resources/) study, the HCP58 (https://www.humanconnectome.org/) study, the PING57 (http://www.chd.ucsd.edu/research/ping-study.html) study, the PNC55 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v1.p1) study, the ADNI56 (http://adni.loni.usc.edu/data-samples/) study, and ENIGMA224 (GWAS of subcortical volumes) and the ENIGMA-CHARGE34 collaboration (http://enigma.ini.usc.edu/research/). More information about original GWAS design can be found in Zhao et al.39 and Zhao et al.40 for UKB, ADNI, HCP, PING, and PNC studies; and in Hibar et al.24 and Adams et al.33 for ENIGMA studies. Details about GWAS on validation cohorts (HCP, PING, PNC, ADNI, and ENIGMA) are also provided in the Supplementary Note. For discovery, we used the GWAS summary statistics of the UKB study. The GWAS results of the other studies were then used for validation; see Supplementary Data 23 for a summary of sample size, IDs, names, and modalities of the analyzed neuroimaging traits of each GWAS. To explore genetic overlaps, we also performed TWAS analysis for 16 brain-related complex traits; see Supplementary Data 15 for these data resources.
Cross-tissue TWAS analysis by UTMOST
Cross-tissue TWAS analysis was performed for each trait using the UTMOST software (https://github.com/Joker-Jerome/UTMOST). We performed UTMOST analysis using GTEx v6 and v8 reference panels separately. Details about UTMOST model training using GTEx v8 data can be found in the Supplementary Note. We first ran a single-tissue association test for each GTEx reference panel (44 panels in v6 and 49 panels in v8, respectively) using the above GWAS summary statistics as input. There were 22,694 candidate genes considered in UTMOST. Second, the gene-trait associations in all panels (tissues) were combined by the GBJ test (https://cran.r-project.org/web/packages/GBJ/, R version 3.5.0). We used the pre-trained cross-tissue imputation models and pre-calculated covariance matrices provided by UTMOST. For the 211 neuroimaging traits in the UKB cohort, we also performed a brain tissue-specific version of UTMOST analysis that only combined the brain tissues in GTEx (10 tissues in v6 and 13 tissues in v8, respectively). We applied the Bonferroni correction to account for all candidate genes and traits analyzed in each data set. Specifically, the significance threshold was 5 × 10−2/22,694/211 in UKB, PING, PNC, and HCP cohorts, 5 × 10−2/22,694/101 in ADNI cohort, and 5 × 10−2/22,694/16 in the analysis of 16 other complex traits and clinical outcomes. For each cohort, we obtained a list of significant associations for GTEx v6 and v8 versions, respectively. We reported genes that were either (1) significant in both versions; or (2) significant in one version with at least one neighboring gene (within a ±1 MB window) significant in the other version.
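The two-part reporting rule can be sketched as a small set operation. This is an illustrative reading of the rule, not code from UTMOST: the `Gene` type, the `near` and `reported_genes` helpers, the toy coordinates, and the choice to treat two genes as neighbors when one overlaps the other's gene body padded by 1 MB on each side are all assumptions made for the example.

```python
from typing import Dict, NamedTuple, Set


class Gene(NamedTuple):
    chrom: str
    start: int
    end: int


WINDOW_BP = 1_000_000  # +/- 1 MB, as in the text


def near(a: Gene, b: Gene, window: int = WINDOW_BP) -> bool:
    """True if b overlaps [a.start - window, a.end + window] on the same chromosome."""
    return a.chrom == b.chrom and b.start <= a.end + window and b.end >= a.start - window


def reported_genes(sig_v6: Set[str], sig_v8: Set[str],
                   coords: Dict[str, Gene]) -> Set[str]:
    """Genes significant in both versions, or in one version with a
    significant neighbor in the other version."""
    keep = sig_v6 & sig_v8
    for own, other in ((sig_v6, sig_v8), (sig_v8, sig_v6)):
        for g in own - other:
            if any(near(coords[g], coords[h]) for h in other):
                keep.add(g)
    return keep


# Toy example with made-up gene names and coordinates.
coords = {
    "A": Gene("17", 1_000_000, 1_050_000),
    "B": Gene("17", 1_500_000, 1_520_000),
    "C": Gene("17", 9_000_000, 9_010_000),
    "D": Gene("3", 1_000_000, 1_010_000),
}
result = reported_genes({"A", "C"}, {"A", "B", "D"}, coords)
print(sorted(result))  # prints ['A', 'B']
```

Here A is kept because it is significant in both versions, and B because it lies within 1 MB of A; C and D have no significant neighbor in the other version and are dropped.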
Comparison with previous GWAS findings
We compared TWAS-significant genes with those identified in the same UKB cohort by MAGMA gene-based association analysis and FUMA functional gene mapping analysis, which can be found in the previous GWAS (Supplementary Tables 12 and 15 of Zhao et al.39 for ROI volumes and Supplementary Tables 14 and 16 of Zhao et al.40 for DTI parameters, respectively). For each significant gene-trait association, we also explored whether any genetic variant in this gene region (with a 1 Mb window on both sides) had been linked to this neuroimaging trait by checking the smallest p value in the corresponding GWAS. For TWAS-significant genes that were not identified in GWAS, we used the NHGRI-EBI GWAS Catalog (version 2019-10-14, https://www.ebi.ac.uk/gwas/) to look for their reported associations with brain structure traits and any other traits. We summarized the traits that were frequently reported for these genes, such as physical measures (e.g., height, waist-to-hip ratio, heel bone mineral density, body mass index), cognitive functions (such as general cognitive ability, cognitive performance), intelligence, educational attainment, math ability (such as highest math class taken and self-reported math ability), reaction time, neuroticism, neurodegenerative diseases (such as Alzheimer’s disease and Parkinson’s disease), neuropsychiatric disorders (such as major depressive disorder, schizophrenia, and bipolar disorder), coronary artery disease, and mean corpuscular hemoglobin.
Cross-tissue analysis conditional on the most significant GWAS signal
The TWAS gene expression imputation model can be viewed as a weighted sum of multiple genetic variants. If a certain variant has a relatively large weight, the imputed gene expression could be driven by a single GWAS signal. To examine how many significant TWAS signals could be dominated by a single genetic variant, we reran the TWAS analysis in the UKB cohort conditional on the most significant variant used in the UTMOST imputation model (R version 3.5.0). First, for each reference panel, we considered a simple linear model
Phenotype ~ imputed gene expression + variant,
where the variant conditioned on was the most significant variant in previous GWAS of this phenotype in the same UKB cohort. Then, similar to cross-tissue TWAS analysis, single-tissue conditional p values of the imputed gene expression were combined by the GBJ test across the GTEx reference panels (44 panels in GTEx v6 and 49 panels in GTEx v8, respectively).
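The single-panel conditional model above can be sketched in Python as follows. This is an illustrative reimplementation, not the authors' R code. It obtains the coefficient of imputed expression in the model Phenotype ~ imputed gene expression + variant by first regressing both phenotype and expression on the variant (the Frisch-Waugh-Lovell identity). All function names and the toy data are hypothetical, and the cross-panel GBJ combination step is not shown.

```python
import math


def _center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _residualize(target, covariate):
    """Residuals of target after regressing on covariate (with intercept)."""
    t_c, c_c = _center(target), _center(covariate)
    slope = sum(t * c for t, c in zip(t_c, c_c)) / sum(c * c for c in c_c)
    return [t - slope * c for t, c in zip(t_c, c_c)]


def marginal_effect(phenotype, expression):
    """Slope of phenotype ~ expression (no conditioning)."""
    y_c, g_c = _center(phenotype), _center(expression)
    return sum(g * y for g, y in zip(g_c, y_c)) / sum(g * g for g in g_c)


def conditional_effect(phenotype, expression, variant):
    """Expression effect in phenotype ~ expression + variant.

    Returns (beta, t_statistic). A p value would come from a t distribution
    with n - 3 degrees of freedom; that step is omitted to keep the sketch
    free of third-party dependencies.
    """
    r_y = _residualize(phenotype, variant)
    r_g = _residualize(expression, variant)
    s_gg = sum(g * g for g in r_g)
    beta = sum(g * y for g, y in zip(r_g, r_y)) / s_gg
    rss = sum((y - beta * g) ** 2 for g, y in zip(r_g, r_y))
    dof = len(phenotype) - 3  # intercept, expression, variant
    se = math.sqrt(rss / dof / s_gg)
    return beta, beta / se


# Toy data (made up): the phenotype depends on the variant only, and the
# imputed expression is correlated with the phenotype through that variant.
variant = [0, 0, 1, 1, 1, 1, 2, 2]
expression = [1, -1, 2, 0, 2, 0, 3, 1]
phenotype = [0, 0, 4, 4, 2, 2, 6, 6]

beta_marginal = marginal_effect(phenotype, expression)
beta_conditional, t_conditional = conditional_effect(phenotype, expression, variant)
```

In this toy data set the unconditional expression effect is clearly non-zero, whereas the effect conditional on the variant vanishes. This is the pattern the conditional analysis is designed to detect when a TWAS signal is driven by one variant.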
Chromatin interaction enrichment analysis
Differences in chromatin interaction enrichment between significant and non-significant genes were tested using the Wilcoxon rank sum test (R version 3.5.0). For the adult neural Promoter Capture Hi-C data, the enrichment of each gene was measured as the number of interactions overlapping the gene with a CHiCAGO Enrichment Score > 5 (ref. 96). The enrichment was tested separately in four cell types: iPSC-induced excitatory neurons, iPSC-derived hippocampal DG-like neurons, iPSC-induced lower motor neurons, and primary astrocytes. The Wilcoxon rank sum test was performed separately for the significant genes obtained from cross-tissue TWAS analysis, FUMA/MAGMA, and brain tissue-specific TWAS analysis.
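As a rough illustration of this comparison, the sketch below counts, for each gene, the interactions whose score exceeds the threshold and then compares the counts of two gene groups with a two-sided rank sum test. It is a simplified stand-in for the R analysis. It uses a normal approximation with a tie correction and no continuity correction, so p values may differ slightly from those of R's `wilcox.test`. All names and toy counts are hypothetical.

```python
import math


def interaction_count(scores, threshold=5.0):
    """Number of interactions for one gene with score above the threshold."""
    return sum(1 for s in scores if s > threshold)


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (U statistic of x, p value). Tied values receive midranks and the
    variance is corrected for ties.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2           # average of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Toy per-gene interaction counts (made up) for the two gene groups.
significant_counts = [3, 4, 5]
nonsignificant_counts = [0, 1, 2]
u_stat, p_val = rank_sum_test(significant_counts, nonsignificant_counts)
n_strong = interaction_count([2.0, 5.0, 7.5, 12.1])
```

In practice the test would be run once per cell type and once per gene list (cross-tissue TWAS, FUMA/MAGMA, brain tissue-specific TWAS), as described above.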
Gene-based TWAS polygenic risk prediction
Gene-based polygenic profiles were created to assess the out-of-sample prediction power of the UKB TWAS results. In this analysis, we used the individual-level phenotype and genetic data, whose processing steps were detailed in the previous GWAS39,40. The FUSION software and database (http://gusevlab.org/projects/fusion/) were used to impute gene expression levels in the UKB, ADNI, HCP, PNC, and PING data sets using individual-level genetic data. We performed imputation for 52 different reference panels (Supplementary Data 22). In the training data (UKB), we estimated the effect size of each imputed gene expression in a linear regression model, while adjusting for age (at imaging), age-squared, sex, age-sex interaction, age-squared-sex interaction, and the top 40 genetic principal components provided by UKB110 (Data-Field 22009). For ROI volumes, we also included total brain volume (for ROIs other than total brain volume itself) as a covariate. The gene-based TWAS PRS were generated in the testing data by summing the imputed gene expression levels, weighted by their effect sizes estimated from the training data. We tried a series of p value thresholds for predictor selection: 1, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.08, 0.05, 0.02, 0.01, 0.001, 1 × 10−4, 1 × 10−5, 1 × 10−6, 1 × 10−7, and 5 × 10−8. Thus, 17 polygenic profiles were generated for each neuroimaging trait, and we reported the best prediction power achieved by a single profile in a single reference panel. The association between polygenic profile and trait was estimated and tested in a linear regression model (R version 3.5.0), adjusting for the effects of age and sex. The additional phenotypic variation explained by the polygenic profile (i.e., the incremental R2) was used to measure the prediction power. Next, we additionally considered the best variant-based GWAS PRS reported in Zhao et al.39 and re-evaluated the incremental R2.
Specifically, we considered the following four simple linear models
Phenotype ~ covariates (m1),
Phenotype ~ TWAS PRS + covariates (m2),
Phenotype ~ GWAS PRS + covariates (m3), and
Phenotype ~ TWAS PRS + GWAS PRS + covariates (m4).
We estimated the incremental R2 of TWAS PRS conditioning on GWAS PRS using models m4 and m3, the incremental R2 of GWAS PRS conditioning on TWAS PRS using models m4 and m2, and calculated the additional phenotypic variation that can be jointly explained by GWAS and TWAS PRS using models m4 and m1. More details about constructing and evaluating gene-based PRS can be found in Supplementary Note.
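The profile construction and the three incremental R2 quantities can be written out as a short Python sketch. It is a hedged illustration rather than the authors' pipeline. The function names and dictionary-based inputs are hypothetical, selection with `p <= threshold` is an assumption (the text does not state whether the inequality is strict), and the R2 values in the example are made-up numbers, not results from the study.

```python
# The 17 p value thresholds listed in the text.
P_THRESHOLDS = [1, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.08, 0.05, 0.02, 0.01,
                0.001, 1e-4, 1e-5, 1e-6, 1e-7, 5e-8]


def twas_prs(imputed_expression, effects, p_values, p_threshold):
    """Weighted sum of imputed expression over genes passing the threshold.

    imputed_expression: dict gene -> per-subject imputed levels (testing data).
    effects, p_values: dict gene -> effect size / p value from training data.
    """
    n_subjects = len(next(iter(imputed_expression.values())))
    scores = [0.0] * n_subjects
    for gene, effect in effects.items():
        if p_values[gene] <= p_threshold:  # assumed non-strict inequality
            for i, level in enumerate(imputed_expression[gene]):
                scores[i] += effect * level
    return scores


def incremental_r2(r2):
    """Incremental R2 from the R2 of the fitted models m1 to m4."""
    return {
        "twas_given_gwas": r2["m4"] - r2["m3"],
        "gwas_given_twas": r2["m4"] - r2["m2"],
        "joint": r2["m4"] - r2["m1"],
    }


# Toy inputs: three genes, three testing subjects (all values made up).
toy_expression = {"G1": [1.0, 0.0, -1.0],
                  "G2": [0.5, 0.5, 0.5],
                  "G3": [2.0, 1.0, 0.0]}
toy_effects = {"G1": 0.2, "G2": -0.1, "G3": 0.05}
toy_p_values = {"G1": 1e-9, "G2": 0.03, "G3": 0.6}

# One polygenic profile per threshold, 17 in total.
profiles = {t: twas_prs(toy_expression, toy_effects, toy_p_values, t)
            for t in P_THRESHOLDS}

# Hypothetical R2 values for models m1 to m4.
toy_gain = incremental_r2({"m1": 0.10, "m2": 0.13, "m3": 0.15, "m4": 0.16})
```

Each model's R2 would come from fitting the corresponding linear regression in the testing data. Only the subtraction step is shown here because it is what distinguishes the three reported quantities.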
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
This research was partially supported by U.S. NIH grants MH086633 (HT.Z.), HD079124 (Y.L.), HL129132 (Y.L.), and MH116527 (TF.L.). We thank Quan Wang, Bingshan Li, and Jia Wen for helpful conversations. We thank the individuals represented in the UK Biobank, ADNI, HCP, PING, PNC, ENIGMA2, and ENIGMA-CHARGE data sets for their participation and the research teams for their work in collecting, processing, and disseminating these data sets for analysis. This research has been conducted using the UK Biobank resource (application number 22783), subject to a data transfer agreement. We gratefully acknowledge all the studies and databases that made GWAS summary data available. The data resources had obtained informed consent from all participants and had obtained approval from their research ethics committees or institutional review boards. The UKB study had obtained ethics approval from the North West Multicentre Research Ethics Committee (approval number: 11/NW/0382). The ADNI study was approved by the institutional ethical review boards of all participating centers. The institutional review boards of the University of Pennsylvania and the Children’s Hospital of Philadelphia approved all study procedures in the PNC study. The human research protection programs and institutional review boards at the nine institutions participating in the PING project approved all experimental and consenting procedures. All experimental procedures in the HCP study were approved by the institutional review boards at Washington University (approval number: 201204036). Part of the data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012).
ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering and through generous contributions from the following: Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen Idec Inc.; Bristol-Myers Squibb Company; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Medpace, Inc.; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Synarc Inc.; and Takeda Pharmaceutical Company. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. Part of the data collection and sharing for this project was funded by the Pediatric Imaging, Neurocognition and Genetics Study (PING) (U.S. National Institutes of Health Grant RC2DA029475). PING is funded by the National Institute on Drug Abuse and the Eunice Kennedy Shriver National Institute of Child Health & Human Development. PING data are disseminated by the PING Coordinating Center at the Center for Human Development, University of California, San Diego. Support for the collection of the PNC data sets was provided by grant RC2MH089983 awarded to Raquel Gur and RC2MH089924 awarded to Hakon Hakonarson. 
All PNC subjects were recruited through the Center for Applied Genomics at The Children’s Hospital in Philadelphia. HCP data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.
Author contributions
B.Z., Y.S., Y.L., and HT.Z. designed the study. B.Z., Y.S., Y.Y., Z.Y., HY.Z., P.S., TF.L., X.W., TY.L., and Z.Z. performed the experiments and analyzed the data. B.Z., Y.S., Y.L., and HT.Z. wrote the manuscript with feedback from all authors.
Data availability
The data used in this work were obtained from publicly available data sets: the UK Biobank (UKB) study, the Human Connectome Project (HCP) study, the Pediatric Imaging, Neurocognition, and Genetics (PING) study, the Philadelphia Neurodevelopmental Cohort (PNC) study, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, and ENIGMA2 & the ENIGMA-CHARGE collaboration. For the first five data sets, the raw MRI, covariates, and SNP data are available from each data resource: UK Biobank, http://www.ukbiobank.ac.uk/resources/; PING, http://pingstudy.ucsd.edu/resources/genomics-core.html/; PNC, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v1.p1/; ADNI, http://adni.loni.usc.edu/data-samples/; and HCP, https://www.humanconnectome.org/. The GWAS summary statistics can be obtained at https://github.com/BIG-S2/GWAS and http://enigma.ini.usc.edu/research/. In addition, we used 16 other sets of publicly available GWAS summary statistics shared by several GWAS databases. These data resources are summarized in Supplementary Data 15. The FUSION database used in this study is available at http://gusevlab.org/projects/fusion/.
Code availability
We made use of publicly available software and tools, especially UTMOST (https://github.com/Joker-Jerome/UTMOST) and FUSION (http://gusevlab.org/projects/fusion/). The analysis code is freely available at 10.5281/zenodo.4649360 (ref. 111).
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Alvaro Barbeira and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Bingxin Zhao, Yue Shan.
These authors jointly supervised this work: Yun Li, Hongtu Zhu.
Contributor Information
Yun Li, Email: yunli@med.unc.edu.
Hongtu Zhu, Email: htzhu@email.unc.edu.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-23130-y.
References
- 1. Ritchie SJ, et al. Beyond a bigger brain: multivariable structural brain imaging and intelligence. Intelligence. 2015;51:47–56. doi: 10.1016/j.intell.2015.05.001.
- 2. Davies G, et al. Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N = 112 151). Mol. Psychiatry. 2016;21:758–767. doi: 10.1038/mp.2016.45.
- 3. Van der Meer D, et al. Brain scans from 21,297 individuals reveal the genetic architecture of hippocampal subfield volumes. Mol. Psychiatry. 2020;25:3053–3065. doi: 10.1038/s41380-018-0262-7.
- 4. Caldiroli A, et al. The relationship of IQ and emotional processing with insula volume in schizophrenia. Schizophr. Res. 2018;202:141–148. doi: 10.1016/j.schres.2018.06.048.
- 5. Vreeker A, et al. The relationship between brain volumes and intelligence in bipolar disorder. J. Affect. Disord. 2017;223:59–64. doi: 10.1016/j.jad.2017.07.009.
- 6. Nir TM, et al. Effectiveness of regional DTI measures in distinguishing Alzheimer’s disease, MCI, and normal aging. NeuroImage: Clin. 2013;3:180–195. doi: 10.1016/j.nicl.2013.07.006.
- 7. Bohnen NI, Albin RL. White matter lesions in Parkinson disease. Nat. Rev. Neurol. 2011;7:229. doi: 10.1038/nrneurol.2011.21.
- 8. Voineskos AN. Genetic underpinnings of white matter ‘connectivity’: heritability, risk, and heterogeneity in schizophrenia. Schizophr. Res. 2015;161:50–60. doi: 10.1016/j.schres.2014.03.034.
- 9. Sudre G, et al. Estimating the heritability of structural and functional brain connectivity in families affected by attention-deficit/hyperactivity disorder. JAMA Psychiatry. 2017;74:76–84. doi: 10.1001/jamapsychiatry.2016.3072.
- 10. Peng P, et al. Brain structure alterations in respect to tobacco consumption and nicotine dependence: a comparative voxel-based morphometry study. Front. Neuroanat. 2018;12:43. doi: 10.3389/fnana.2018.00043.
- 11. Miller KL, et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat. Neurosci. 2016;19:1523–1536. doi: 10.1038/nn.4393.
- 12. Rubinov M, Sporns O. Complex network measures of brain connectivity: uses and interpretations. Neuroimage. 2010;52:1059–1069. doi: 10.1016/j.neuroimage.2009.10.003.
- 13. Hu W, Zhang A, Cai B, Calhoun V, Wang Y-P. Distance canonical correlation analysis with application to an imaging-genetic study. J. Med. Imaging. 2019;6:026501. doi: 10.1117/1.JMI.6.2.026501.
- 14. Elliott LT, et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature. 2018;562:210–216. doi: 10.1038/s41586-018-0571-7.
- 15. Wen W, et al. Distinct genetic influences on cortical and subcortical brain structures. Sci. Rep. 2016;6:32760. doi: 10.1038/srep32760.
- 16. den Braber A, et al. Heritability of subcortical brain measures: a perspective for future genome-wide association studies. NeuroImage. 2013;83:98–102. doi: 10.1016/j.neuroimage.2013.06.027.
- 17. Eyler LT, et al. Conceptual and data-based investigation of genetic influences and brain asymmetry: a twin study of multiple structural phenotypes. J. Cogn. Neurosci. 2014;26:1100–1117. doi: 10.1162/jocn_a_00531.
- 18. Blokland GA, de Zubicaray GI, McMahon KL, Wright MJ. Genetic and environmental influences on neuroimaging phenotypes: a meta-analytical perspective on twin imaging studies. Twin Res. Hum. Genet. 2012;15:351–371. doi: 10.1017/thg.2012.11.
- 19. Kremen WS, et al. Genetic and environmental influences on the size of specific brain regions in midlife: the VETSA MRI study. Neuroimage. 2010;49:1213–1223. doi: 10.1016/j.neuroimage.2009.09.043.
- 20. Jansen AG, Mous SE, White T, Posthuma D, Polderman TJ. What twin studies tell us about the heritability of brain development, morphology, and function: a review. Neuropsychol. Rev. 2015;25:27–46. doi: 10.1007/s11065-015-9278-9.
- 21. Zhao B, et al. Heritability of regional brain volumes in large-scale neuroimaging and genetic studies. Cereb. Cortex. 2018;29:2904–2914. doi: 10.1093/cercor/bhy157.
- 22. Biton A, et al. Polygenic architecture of human neuroanatomical diversity. Cereb. Cortex. 2020;30:2307–2320.
- 23. Toro R, et al. Genomic architecture of human neuroanatomical diversity. Mol. Psychiatry. 2015;20:1011–1016. doi: 10.1038/mp.2014.99.
- 24. Hibar DP, et al. Common genetic variants influence human subcortical brain structures. Nature. 2015;520:224–229. doi: 10.1038/nature14101.
- 25. Hibar DP, et al. Novel genetic loci associated with hippocampal volume. Nat. Commun. 2017;8:13624. doi: 10.1038/ncomms13624.
- 26. Franke B, et al. Genetic influences on schizophrenia and subcortical brain volumes: large-scale proof of concept. Nat. Neurosci. 2016;19:420–431. doi: 10.1038/nn.4228.
- 27. Guadalupe T, et al. Human subcortical brain asymmetries in 15,847 people worldwide reveal effects of age and sex. Brain Imaging Behav. 2017;11:1497–1514. doi: 10.1007/s11682-016-9629-z.
- 28. van der Meer D, et al. Brain scans from 21,297 individuals reveal the genetic architecture of hippocampal subfield volumes. Mol. Psychiatry. 2018; in press.
- 29. Ikram MA, et al. Common variants at 6q22 and 17q21 are associated with intracranial volume. Nat. Genet. 2012;44:539–544. doi: 10.1038/ng0612-732c.
- 30. Bis JC, et al. Common variants at 12q14 and 12q24 are associated with hippocampal volume. Nat. Genet. 2012;44:545–551. doi: 10.1038/ng.2237.
- 31. Grasby KL, et al. The genetic architecture of the human cerebral cortex. Science. 2020;367:eaay6690.
- 32. Hofer E, et al. Genetic correlations and genome-wide associations of cortical structure in general population samples of 22,824 adults. Nat. Commun. 2020;11:1–16. doi: 10.1038/s41467-020-18367-y.
- 33. Satizabal CL, et al. Genetic architecture of subcortical brain structures in 38,851 individuals. Nat. Genet. 2019;51:1624–1636. doi: 10.1038/s41588-019-0511-y.
- 34. Adams HH, et al. Novel genetic loci underlying human intracranial volume identified through genome-wide association. Nat. Neurosci. 2016;19:1569. doi: 10.1038/nn.4398.
- 35. Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017;169:1177–1186. doi: 10.1016/j.cell.2017.05.038.
- 36. Timpson NJ, Greenwood CMT, Soranzo N, Lawson DJ, Richards JB. Genetic architecture: the shape of the genetic contribution to human traits and disease. Nat. Rev. Genet. 2017;19:110–124. doi: 10.1038/nrg.2017.101.
- 37. O’Connor LJ, et al. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 2019;105:456–476. doi: 10.1016/j.ajhg.2019.07.003.
- 38. Sudlow C, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779. doi: 10.1371/journal.pmed.1001779.
- 39. Zhao B, et al. Genome-wide association analysis of 19,629 individuals identifies variants influencing regional brain volumes and refines their genetic co-architecture with cognitive and mental health traits. Nat. Genet. 2019;51:1637–1644. doi: 10.1038/s41588-019-0516-6.
- 40. Zhao B, et al. Large-scale GWAS reveals genetic architecture of brain white matter microstructure and genetic overlap with cognitive and mental health traits (n = 17,706). Mol. Psychiatry. 2019; Epub ahead of print.
- 41. Gusev A, et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 2016;48:245. doi: 10.1038/ng.3506.
- 42. Barbeira AN, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 2018;9:1825. doi: 10.1038/s41467-018-03621-1.
- 43. Hu Y, et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat. Genet. 2019;51:568–576. doi: 10.1038/s41588-019-0345-7.
- 44. Gamazon ER, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 2015;47:1091. doi: 10.1038/ng.3367.
- 45. Zeng P, Zhou X. Non-parametric genetic prediction of complex traits with latent Dirichlet process regression models. Nat. Commun. 2017;8:456. doi: 10.1038/s41467-017-00470-2.
- 46. Zhou X, Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods. 2014;11:407. doi: 10.1038/nmeth.2848.
- 47. Nagpal S, et al. TIGAR: an improved Bayesian tool for transcriptomic data imputation enhances gene mapping of complex traits. Am. J. Hum. Genet. 2019;105:258–266. doi: 10.1016/j.ajhg.2019.05.018.
- 48. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–660. doi: 10.1126/science.1262110.
- 49. Wainberg M, et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 2019;51:592. doi: 10.1038/s41588-019-0385-z.
- 50. Zhang W. Advancements of transcriptome imputation and related transcriptome-wide association studies. Curr. Res. Biochem. Mol. Biol. 2019;1:14–16. doi: 10.33702/crbmb.2019.1.1.4.
- 51. Smith SM, Nichols TE. Statistical challenges in “big data” human neuroimaging. Neuron. 2018;97:263–268. doi: 10.1016/j.neuron.2017.12.018.
- 52. Sun R, Lin X. Set-based tests for genetic association using the generalized Berk-Jones statistic. arXiv Preprint. 2017;1710:02469.
- 53. de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 2015;11:e1004219. doi: 10.1371/journal.pcbi.1004219.
- 54. Watanabe K, Taskesen E, Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 2017;8:1826. doi: 10.1038/s41467-017-01261-5.
- 55. Satterthwaite TD, et al. Neuroimaging of the Philadelphia neurodevelopmental cohort. Neuroimage. 2014;86:544–553. doi: 10.1016/j.neuroimage.2013.07.064.
- 56. Weiner MW, et al. The Alzheimer’s disease neuroimaging Initiative: a review of papers published since its inception. Alzheimer’s Dement. 2013;9:e111–e194. doi: 10.1016/j.jalz.2013.05.1769.
- 57. Jernigan TL, et al. The pediatric imaging, neurocognition, and genetics (PING) data repository. Neuroimage. 2016;124:1149–1154. doi: 10.1016/j.neuroimage.2015.04.057.
- 58. Somerville LH, et al. The Lifespan Human Connectome Project in Development: a large-scale study of brain connectivity development in 5–21 year olds. NeuroImage. 2018;183:456–468. doi: 10.1016/j.neuroimage.2018.08.050.
- 59. Gusev A, et al. Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nat. Genet. 2018;50:538–548. doi: 10.1038/s41588-018-0092-1.
- 60. Buniello A, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2018;47:D1005–D1012. doi: 10.1093/nar/gky1120.
- 61. Chung J, et al. Genome-wide association study of Alzheimer’s disease endophenotypes at prediagnosis stages. Alzheimer’s Dement. 2018;14:623–633. doi: 10.1016/j.jalz.2017.11.006.
- 62. Verhaaren BF, et al. Multiethnic genome-wide association study of cerebral white matter hyperintensities on MRI. Circulation: Cardiovascular Genet. 2015;8:398–409. doi: 10.1161/CIRCGENETICS.114.000858.
- 63. Kunkle BW, et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat. Genet. 2019;51:414. doi: 10.1038/s41588-019-0358-2.
- 64. Astle WJ, et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell. 2016;167:1415–1429.e19. doi: 10.1016/j.cell.2016.10.042.
- 65. Kim SK. Identification of 613 new loci associated with heel bone mineral density and a polygenic risk score for bone mineral density, osteoporosis and fracture. PLoS ONE. 2018;13:e0200785. doi: 10.1371/journal.pone.0200785.
- 66. Shungin D, et al. New genetic loci link adipose and insulin biology to body fat distribution. Nature. 2015;518:187. doi: 10.1038/nature14132.
- 67. Linnér RK, et al. Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nat. Genet. 2019;51:245–257. doi: 10.1038/s41588-018-0309-3.
- 68. Davies G, et al. Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. Nat. Commun. 2018;9:2098. doi: 10.1038/s41467-018-04362-x.
- 69. Kichaev G, et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 2019;104:65–75. doi: 10.1016/j.ajhg.2018.11.008.
- 70. Savage JE, et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat. Genet. 2018;50:912–919. doi: 10.1038/s41588-018-0152-6.
- 71.Okbay A, et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat. Genet. 2016;48:624–633. doi: 10.1038/ng.3552. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Lee JJ, et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 2018;50:1112–1121. doi: 10.1038/s41588-018-0147-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Herold C, et al. Family-based association analyses of imputed genotypes reveal genome-wide significant association of Alzheimer’s disease with OSBPL6, PTPRG, and PDCL3. Mol. psychiatry. 2016;21:1608–1612. doi: 10.1038/mp.2015.218. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Lee PH, et al. Genomic relationships, novel loci, and pleiotropic mechanisms across eight psychiatric disorders. Cell. 2019;179:1469–1482.e11. doi: 10.1016/j.cell.2019.11.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Li Z, et al. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia. Nat. Genet. 2017;49:1576–1583. doi: 10.1038/ng.3973. [DOI] [PubMed] [Google Scholar]
- 76.Kanai M, et al. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. 2018;50:390–400. doi: 10.1038/s41588-018-0047-6. [DOI] [PubMed] [Google Scholar]
- 77.Lam M, et al. Large-scale cognitive gwas meta-analysis reveals tissue-specific neural expression and potential nootropic drug targets. Cell Rep. 2017;21:2597–2613. doi: 10.1016/j.celrep.2017.11.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Wray NR, et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet. 2018;50:668. doi: 10.1038/s41588-018-0090-3.
- 79.Periyasamy S, et al. Association of schizophrenia risk with disordered niacin metabolism in an Indian genome-wide association study. JAMA Psychiatry. 2019;76:1026–1034. doi: 10.1001/jamapsychiatry.2019.1335.
- 80.Ripke S, et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511:421. doi: 10.1038/nature13595.
- 81.Jansen PR, et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways. Nat. Genet. 2019;51:394–403. doi: 10.1038/s41588-018-0333-3.
- 82.Lam M, et al. Pleiotropic meta-analysis of cognition, education, and schizophrenia differentiates roles of early neurodevelopmental and adult synaptic pathways. Am. J. Hum. Genet. 2019;105:334–350. doi: 10.1016/j.ajhg.2019.06.012.
- 83.Winham SJ, et al. Genome-wide association study of bipolar disorder accounting for effect of body mass index identifies a new risk allele in TCF7L2. Mol. Psychiatry. 2014;19:1010. doi: 10.1038/mp.2013.159.
- 84.Hoffmann TJ, et al. A large electronic-health-record-based genome-wide study of serum lipids. Nat. Genet. 2018;50:401–413. doi: 10.1038/s41588-018-0064-5.
- 85.van der Harst P, Verweij N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 2018;122:433–443. doi: 10.1161/CIRCRESAHA.117.312086.
- 86.Rietveld CA, et al. Common genetic variants associated with cognitive performance identified using the proxy-phenotype method. Proc. Natl. Acad. Sci. 2014;111:13790–13794. doi: 10.1073/pnas.1404623111.
- 87.St Pourcain B, et al. Variability in the common genetic architecture of social-communication spectrum phenotypes during childhood and adolescence. Mol. Autism. 2014;5:18. doi: 10.1186/2040-2392-5-18.
- 88.Day FR, Ong KK, Perry JR. Elucidating the genetic basis of social interaction and isolation. Nat. Commun. 2018;9:2457. doi: 10.1038/s41467-018-04930-1.
- 89.Goes FS, et al. Genome‐wide association study of schizophrenia in Ashkenazi Jews. Am. J. Med. Genet. Part B: Neuropsychiatr. Genet. 2015;168:649–659. doi: 10.1002/ajmg.b.32349.
- 90.Hou L, et al. Genetic variants associated with response to lithium treatment in bipolar disorder: a genome-wide association study. Lancet. 2016;387:1085–1093. doi: 10.1016/S0140-6736(16)00143-4.
- 91.Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet. 2013;381:1371–1379. doi: 10.1016/S0140-6736(12)62129-1.
- 92.Hill W, et al. A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Mol. Psychiatry. 2019;24:169–181. doi: 10.1038/s41380-017-0001-5.
- 93.O’Connor E, et al. Identification of mutations in the MYO9A gene in patients with congenital myasthenic syndrome. Brain. 2016;139:2143–2153. doi: 10.1093/brain/aww130.
- 94.Snijders AM, et al. FAM83 family oncogenes are broadly involved in human cancers: an integrative multi‐omics approach. Mol. Oncol. 2017;11:167–179. doi: 10.1002/1878-0261.12016.
- 95.Sprooten E, et al. Common genetic variants and gene expression associated with white matter microstructure in the human brain. Neuroimage. 2014;97:252–261. doi: 10.1016/j.neuroimage.2014.04.021.
- 96.Song M, et al. Mapping cis-regulatory chromatin contacts in neural cells links neuropsychiatric disorder risk variants to target genes. Nat. Genet. 2019;51:1252. doi: 10.1038/s41588-019-0472-1.
- 97.Pividori M, et al. PhenomeXcan: mapping the genome to the phenome through the transcriptome. Sci. Adv. 2020;6:eaba2083.
- 98.Melin BS, et al. Genome-wide association study of glioma subtypes identifies specific differences in genetic susceptibility to glioblastoma and non-glioblastoma tumors. Nat. Genet. 2017;49:789–794. doi: 10.1038/ng.3823.
- 99.Watanabe K, et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 2019;51:1339–1348. doi: 10.1038/s41588-019-0481-0.
- 100.Lek M, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285. doi: 10.1038/nature19057.
- 101.Schneider SA, Walker RH, Bhatia KP. The Huntington’s disease-like syndromes: what to consider in patients with a negative Huntington’s disease gene test. Nat. Clin. Pract. Neurol. 2007;3:517–525. doi: 10.1038/ncpneuro0606.
- 102.Stevanin G, et al. Huntington’s disease‐like phenotype due to trinucleotide repeat expansions in the TBP and JPH3 genes. Brain. 2003;126:1599–1603. doi: 10.1093/brain/awg155.
- 103.Holmes SE, et al. A repeat expansion in the gene encoding junctophilin-3 is associated with Huntington disease–like 2. Nat. Genet. 2001;29:377–378. doi: 10.1038/ng760.
- 104.Wild EJ, et al. Huntington’s disease phenocopies are clinically and genetically heterogeneous. Mov. Disord. 2008;23:716–720. doi: 10.1002/mds.21915.
- 105.Walker RL, et al. Genetic control of expression and splicing in developing human brain informs disease mechanisms. Cell. 2019;179:750–771. doi: 10.1016/j.cell.2019.09.021.
- 106.Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 2005;4:1–43. doi: 10.2202/1544-6115.1128.
- 107.Ioannidis NM, et al. Gene expression imputation identifies candidate genes and susceptibility loci associated with cutaneous squamous cell carcinoma. Nat. Commun. 2018;9:4264. doi: 10.1038/s41467-018-06149-6.
- 108.Barbeira AN, et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 2019;15:e1007889. doi: 10.1371/journal.pgen.1007889.
- 109.Xu Z, Wu C, Wei P, Pan W. A powerful framework for integrating eQTL and GWAS summary data. Genetics. 2017;207:893–902. doi: 10.1534/genetics.117.300270.
- 110.Bycroft C, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z.
- 111.Zhao B, et al. Transcriptome-wide association analysis of brain structures yields insights into pleiotropy with complex neuropsychiatric traits. Zenodo. 2021. doi: 10.5281/zenodo.4649360.
Data Availability Statement
The data used in this work were obtained from publicly available data sets: the UK Biobank (UKB) study, the Human Connectome Project (HCP) study, the Pediatric Imaging, Neurocognition, and Genetics (PING) study, the Philadelphia Neurodevelopmental Cohort (PNC) study, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, and ENIGMA2 and the ENIGMA-CHARGE collaboration. For the first five data sets, the raw MRI, covariate, and SNP data are available from the respective data resources: UK Biobank, http://www.ukbiobank.ac.uk/resources/; PING, http://pingstudy.ucsd.edu/resources/genomics-core.html/; PNC, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v1.p1/; ADNI, http://adni.loni.usc.edu/data-samples/; and HCP, https://www.humanconnectome.org/. The GWAS summary statistics can be obtained at https://github.com/BIG-S2/GWAS and http://enigma.ini.usc.edu/research/. In addition, we used 16 other sets of publicly available GWAS summary statistics shared by several GWAS databases; these data resources are summarized in Supplementary Data 15. The FUSION database used in this study is available at http://gusevlab.org/projects/fusion/.
We made use of publicly available software and tools, in particular UTMOST (https://github.com/Joker-Jerome/UTMOST) and FUSION (http://gusevlab.org/projects/fusion/). The analysis code is freely available at 10.5281/zenodo.4649360 (ref. 111).