Author manuscript; available in PMC: 2020 May 1.
Published in final edited form as: Nat Genet. 2019 Nov 1;51(11):1637–1644. doi: 10.1038/s41588-019-0516-6

Genome-wide association analysis of 19,629 individuals identifies variants influencing regional brain volumes and refines their genetic co-architecture with cognitive and mental health traits

Bingxin Zhao 1, Tianyou Luo 1, Tengfei Li 2,3, Yun Li 1,4,5, Jingwen Zhang 6, Yue Shan 1, Xifeng Wang 1, Liuqing Yang 7, Fan Zhou 1, Ziliang Zhu 1; Alzheimer’s Disease Neuroimaging Initiative8; Pediatric Imaging, Neurocognition and Genetics8, Hongtu Zhu 1,3,*
PMCID: PMC6858580  NIHMSID: NIHMS1540523  PMID: 31676860

Abstract

Volumetric variations of human brain are heritable and are associated with many brain-related complex traits. Here we performed genome-wide association studies (GWAS) of 101 brain volumetric phenotypes using the UK Biobank (UKB) sample including 19,629 participants. GWAS identified 365 independent genetic variants exceeding significance threshold of 4.9 × 10−10, adjusted for testing multiple phenotypes. Gene-based association study found 157 associated genes (124 new), and functional gene mapping analysis linked 146 additional genes. Many of the discovered genetic variants and genes have previously been implicated in cognitive and mental health traits. Using genome-wide polygenic risk score prediction, more than 6% of phenotypic variance (P = 3.13 × 10−24) in four other independent studies could be explained by the UKB GWAS results. In conclusion, our study identifies many new genetic associations at variant, locus and gene levels and advances our understanding of the pleiotropy and genetic co-architecture between brain volumes and other traits.

Editorial summary:

Genome-wide analyses in 19,629 individuals identify 365 independent variants associated with brain volumetric phenotypes. The study provides insight into the overlapping genetic architecture of brain volume measures and cognitive and mental health traits.


Regional brain volumes are heritable measures of brain functional and structural changes. Volumetric variations of the human brain are known to be phenotypically and genetically associated with heritable cognitive and mental health traits1-5, and understanding the shared genetic influences on these traits is an active research area6. Individual variations of human brain volume are usually quantified by magnetic resonance imaging (MRI). In region of interest (ROI)-based analysis, whole brain MRIs are processed and annotated onto many pre-defined ROIs, and then regional volumetric phenotypes are generated to measure the structure of brain ROIs. Both twin and population-based studies have shown that these volumetric phenotypes can be highly or moderately heritable. The heritability of brain regions estimated from twin studies can be larger than 80%7-12. For example, the heritability of basal ganglia structures (putamen, caudate, pallidum) and limbic and diencephalic regions (hippocampus, amygdala, thalamus) was reported to range from 0.60 to 0.8511. Common genetic variants (typically single-nucleotide polymorphisms (SNPs)) can account for more than 50% of phenotypic variation in the general population13-17. The SNP heritability18 estimates of accumbens area, amygdala, putamen, pallidum, caudate, thalamus and hippocampus range from 0.40 to 0.5415. A highly polygenic or omnigenic19,20 genetic architecture has been observed, which indicates that a large number of genetic variants influence regional brain volumes and that their genetic contributions are widespread across the genome.

Several genome-wide association studies (GWAS)3,14,17,21-25 have been conducted to identify genetic risk variants for brain volumetric phenotypes. However, except for the whole brain volume and volumes of a few specific ROIs (e.g., hippocampus in subcortical area3,17,26), GWAS of most brain volumetric phenotypes have been insufficiently powered: the largest discovery GWAS sample size, in Elliott et al.14, was less than 10,000. This sample size is much smaller than those of recent GWAS of other heritable brain-related traits, such as cognitive function27, neuroticism28, and intelligence29, where sample sizes ranged from 269,867 to 449,484. Given the polygenic nature of brain volumes, most of the genetic risk variants may remain undetected, and GWAS with larger sample sizes can uncover more associated variants and enrich the pleiotropy and genetic co-architecture with other traits. Recently, the UK Biobank (UKB30) study team has collected and released MRI data for more than 20,000 participants. In addition, publicly available imaging genetic datasets have also emerged from several other independent studies, including Philadelphia Neurodevelopmental Cohort (PNC31), Alzheimer’s Disease Neuroimaging Initiative (ADNI32), Pediatric Imaging, Neurocognition, and Genetics (PING33), and the Human Connectome Project (HCP34), among others. These datasets provide a new opportunity to perform better-powered GWAS of all ROI brain volumes.

Here we downloaded the raw MRI data from these data resources and processed the data using consistent standard procedures via advanced normalization tools (ANTs35,36) to generate 101 regional (and total) brain volume phenotypes (referred to as ROI volumes), including total brain volume (TBV), gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). We used 19,629 UKB individuals of British ancestry in the main discovery GWAS. Four other datasets with relatively small sample sizes (total sample size 2,192 after quality controls) were used to validate the UKB findings, and finally, a meta-analysis was performed to combine all the data. We started our analysis of UKB data by estimating SNP heritability, which is the proportion of phenotypic variation that can be explained by the additive effects of all common autosomal variants37. Since the UKB MRI data were released at different time points, we organized them in two parts: the first part was released in 2017 (which we refer to as phase 1, n = 9,198), most of which has been analyzed in Elliott et al.14, and the second part was released in 2018 (which we refer to as phase 2, n = 10,431). To detect any potential heterogeneity between the two phases, we compared the SNP heritability estimated in phase 2 data to those in phase 1 data. We then carried out GWAS to identify the associated genetic variants for each ROI volume. We performed gene-based association analysis via MAGMA38 to uncover gene-level associations, and performed post-GWAS functional mapping and annotation (FUMA39) to explore the functional consequences of the significant genetic variants. We calculated the pairwise genetic correlation between ROI volumes and 50 brain-related complex traits by the linkage disequilibrium (LD) score regression (LDSC40). To confirm the robustness of UKB GWAS findings, we jointly analyzed the UKB GWAS results with those from PNC, ADNI, PING and HCP. We developed genome-wide polygenic risk scores (PRS) to assess the predictive ability of the UKB GWAS results on the four other datasets. GWAS summary statistics of the UKB sample and meta-analysis for the five studies have been made publicly available at https://med.sites.unc.edu/bigs2/data/gwas-summary-statistics/.

RESULTS

SNP heritability estimates of the two UKB phases.

In Supplementary Figure 1, we compare the SNP heritability (h2) estimated separately from UKB phase 1 and 2 data. The sample correlation coefficient of these estimates was 0.85, indicating a moderate to high level of agreement between the two phases in the degree of genetic contribution to ROI volumes. The mean h2 across 101 ROI volumes was 0.41 for phase 1 and 0.37 for phase 2. The difference of mean h2 was not significant (two-sided t-test, P = 0.12). Ten ROIs had >0.6 h2 estimates in both phases, including TBV, cerebellar vermal lobules VIII-X, cerebellar vermal lobules I-V, brain stem, left/right cerebellum exterior, left/right cerebellum white matter, and left/right putamen. The h2 estimates from the combined data were highly correlated with those from phase 1 (correlation = 0.93) and phase 2 (correlation = 0.95) (Supplementary Figs. 2 and 3). The h2 and the corresponding 95% confidence interval (CI) are illustrated in Supplementary Figures 4-6. The h2 estimates, standard errors, raw and Bonferroni-corrected P-values from the one-sided likelihood ratio tests are provided in Supplementary Table 1. In the combined data, h2 of most ROIs was significant after Bonferroni correction for multiple testing (mean h2 = 0.40, h2 range = (0.12, 0.72), standard error = 0.15). SNP heritability estimates of left basal forebrain (h2 = 0.10) and optic chiasm (h2 = 0.06) were not significant. These h2 estimates were comparable with previous results14,15. In addition, for each ROI, we examined the genetic correlation (gc) of its regional volumes collected in the two phases. The gc estimates were distributed around one, and the 95% CIs of the gc estimates covered one for most ROIs (Supplementary Table 1 and Supplementary Fig. 7). In summary, SNP heritability and genetic correlation analyses indicate that most ROI volumes are heritable and have a largely consistent genetic basis across the two phases of data.

Significant GWAS associations of 101 ROI volumes.

We carried out GWAS of the 101 ROI volumes using 8,944,375 genetic variants after genotyping quality controls. Manhattan and QQ plots of all 101 phenotypes are displayed in Supplementary Datasets 1 and 2, respectively. In the rest of this paper, we use 4.9 × 10−10 (that is, 5 × 10−8/101, additionally adjusted for all 101 GWAS performed) as the significance threshold for genetic variant-level associations unless otherwise stated.
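
The threshold arithmetic described above can be reproduced in a few lines. This is only an illustrative sketch: the function and constant names are ours, not part of any analysis pipeline used in the study.

```python
# Conventional genome-wide significance level for a single GWAS.
GENOME_WIDE_ALPHA = 5e-8
# Number of ROI volume phenotypes, each with its own GWAS.
N_PHENOTYPES = 101


def bonferroni_threshold(alpha, n_tests):
    """Divide the per-test significance level by the number of tests."""
    return alpha / n_tests


threshold = bonferroni_threshold(GENOME_WIDE_ALPHA, N_PHENOTYPES)
# 5e-8 / 101 is slightly above 4.95e-10; the text quotes it as 4.9e-10.
print(f"{threshold:.3e}")
```

An association is then counted as significant only when its raw P-value falls below this adjusted level.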

We found that 365 independent significant variants had 494 significant associations with 58 ROIs (Supplementary Tables 2 and 3) at the 4.9 × 10−10 significance level. Independent significant variants were defined as significant variants that were independent of other significant variants by FUMA (Methods). The number of associations for each ROI is displayed in Figure 1 and Supplementary Table 2. Left/right hippocampus, left/right putamen, and cerebellar vermal lobules VIII-X had at least 30 independent significant variants. The number of independent significant associations on each chromosome is shown in Supplementary Table 4. Chromosome 12 had the largest number of independent variant-level associations after weighting by chromosome length (Supplementary Fig. 8).

Figure 1 ∣. Number of independent significant variant-level associations discovered in UKB GWAS (n = 19,629 subjects) at different significance levels.

The P-values are raw P-values of two-sided t-test statistics. The outer layer counts the number of associations for each ROI volume with P < 5 × 10−8, the middle layer counts the ones with P < 5 × 10−9, and inner layer counts P < 4.9 × 10−10. The 4.9 × 10−10 threshold corresponds to adjusting for testing multiple imaging phenotypes with Bonferroni correction.

Based on the pre-calculated LD structure from the 1000 Genomes reference panel41, variants in LD with independent significant variants were identified and then (independent) lead variants and genetic risk loci were defined (Methods). The 494 independent significant variant-level associations were further characterized as 170 significant associations between genetic risk loci and ROI volumes (Supplementary Table 5). Brain stem, X4th ventricle, cerebellar vermal lobules VIII-X, cerebellar vermal lobules VI-VII, left/right putamen, left/right cerebellum exterior, left/right hippocampus, left/right lateral ventricle, left pallidum, TBV and WM had at least five associated loci (Supplementary Table 2). Each chromosome had at least one associated locus except for chromosomes 13, 21 and 22 (Supplementary Table 6). Results at significance thresholds 5 × 10−8 and 5 × 10−9 are also provided in the above tables and summarized in Supplementary Table 7. We also performed association analysis for 283,120 genetic variants on the X chromosome (Methods) but observed no significant association at the 4.9 × 10−10 significance level.

Concordance with previous GWAS results.

We performed association lookups for the 365 independent significant variants and their correlated variants in the NHGRI-EBI GWAS catalog42. We found that 166 independent significant variants (associated with 47 ROI volumes) have previously reported GWAS associations with other traits (Supplementary Table 8). Our results tagged many variants that were previously reported in GWAS of ROI volumes, including 19 variants in van der Meer et al.3 for hippocampal subfield volumes, 12 in Hibar et al.17 for subcortical brain region volumes, 6 in Chen et al.43 for putamen volume, 4 in Bis et al.25 for hippocampal volume, 2 in Hibar et al.21 for hippocampal volume, 2 in Stein et al.44 for brain structure, 2 in Ikram et al.24 for intracranial volume, 1 in Furney et al.45 for whole brain volume, and 1 in Baranzini et al.46 for normalized brain volume (Supplementary Table 9). For the other traits, we highlighted previous associations of 46 variants with mental health disorders (such as schizophrenia, autism spectrum disorder (ASD), and depression), 98 with cognitive functions, 25 with educational attainment, 24 with neuroticism, 14 with Parkinson’s disease, 4 with reaction time, and 3 with Alzheimer’s disease. We observed more overlap with previous GWAS results when the significance threshold was relaxed to 5 × 10−8 (Supplementary Table 10). We also compared our results with those reported in Elliott et al.14, who performed GWAS of 3,144 imaging phenotypes (including brain volume phenotypes processed by FreeSurfer47) using the UKB phase 1 data (n = 8,428). When both were corrected for the number of GWAS analyses performed, 26 of the 78 significant variants reported in Elliott et al.14 were in LD (r2 ≥ 0.6) with our independent significant variants (Supplementary Table 11). When both were relaxed to the 5 × 10−8 significance threshold, 124 of their 616 significant variants were in LD with our independent significant variants.

Gene-based association analysis and functional mapping.

We performed gene-based association analysis with GWAS summary statistics for 18,796 candidate genes (Methods). We found 281 significant gene-level associations (P < 2 × 10−8, adjusted for multiple traits) between 157 genes and 55 ROIs (Supplementary Table 12). Our results replicated 33 genes discovered in previous studies, including FOXO3 in Baranzini et al.46 for normalized brain volume, GATAD2B in Hibar et al.48 for lentiform nucleus volume, GNA12 in Sprooten et al.49 for white matter integrity, MCC in Kim and Webster50 for brain cytoarchitecture, HMGA2 and HRK in Stein et al.44 for brain structure, KANSL1, MAPT, STH and CENPW in Ikram et al.24 for intracranial volume, GMNC, WNT3 and PDCD11 in Klein et al.51 for intracranial volume, SLC44A5 in Furney et al.45 for whole brain volume, MSRB3, BCL2L1, DCC and CRHR1 in Hibar et al.17 for subcortical brain region volumes, LEMD3, WIF1 and ASTN2 in Bis et al.25 for hippocampal volume, MAST4, FAM53B, METTL10 and FAF1 in van der Meer et al.3 for hippocampal subfield volumes, DSCAML1 and KTN1 in Chen et al.43 for putamen volume, and ZIC4, VCAN, PAPPA, DRAM1, DAAM1 and ALDH1A2 in Elliott et al.14 for brain imaging measurements. We found that 124 genes were novel and had not been linked to ROI volumes previously (Supplementary Table 13). Of the 157 detected genes, 70 have previously been implicated with cognitive functions, intelligence, education, neuroticism, neuropsychiatric and neurodegenerative diseases/disorders, such as IGF2BP129,52, WNT327,28,53,54, PLEKHM54-56, and AGBL228,54,57,58. Particularly, 47 of the 70 pleiotropic genes were novel genes of ROI volumes, and thus these findings substantially uncovered the gene-level pleiotropy between ROI volumes and these traits (Fig. 2).

Figure 2 ∣. Genes identified in gene-based association analysis of ROI volumes (n = 19,629 subjects) that have been linked to cognitive traits and mental health disorders in previous GWAS.

For each of the ROI-associated genes listed in the x-axis, we manually checked the previously reported associations on the NHGRI-EBI GWAS catalog (https://www.ebi.ac.uk/gwas/). The novel and previously reported genes of ROI volumes were labeled with two different colors (orange and green, respectively).

The independent significant variants were also annotated by functional consequences on gene functions (Supplementary Table 14 and Supplementary Fig. 9), and were subsequently mapped to genes according to physical position, expression quantitative trait loci (eQTL) association (for brain tissues), and 3D chromatin (Hi-C) interaction (Methods). Functional gene mapping yielded 505 significant associations for 279 genes and 53 ROIs (Supplementary Table 15). Of the 279 genes, 163 were not discovered in the above gene-based association analysis, which replicated more previous findings on ROI volumes, such as FBXW8 in Stein et al.44 for brain structure, WNT16 in Zheng et al.59 for cortical thickness, TBPL2 in Chen et al.43 for putamen volume, FAT3 in Hibar et al.17 for subcortical brain region volumes, FAM175B, LHPP, SLC4A10, RNFT2, TESC, FOXD2, DMRTA2, CDKN2C and DPP4 in van der Meer et al.3 for hippocampal subfield volumes, and EPHA3, SLC39A8, BANK1, CHPT1, ACADM, FAM3C, L3HYPDH, JKAMP, and AQP9 in Elliott et al.14 for brain imaging measurements. We found that 53 (41 new) of the 163 genes were associated with cognitive functions, intelligence, education, neuroticism, neuropsychiatric and neurodegenerative disorders, such as NT5C228,55,60,61, ADAM1061,62, and GOSR127,55 (Supplementary Fig. 10). Particularly, 182 significant Hi-C interactions were observed in the Hi-C functional mapping analysis (Supplementary Table 16), which yielded 33 significant associations between 13 genes and 16 ROIs (Supplementary Table 17). Of the 13 genes, 5 were not mapped by physical position or eQTL association, such as C5orf64 for left pericalcarine. C5orf64 has been reported to be associated with cognitive functions and intelligence27, education and math ability55, as well as risk behaviors63 and Alzheimer’s disease64.

In addition, we explored the biological interpretations of our GWAS results by performing several enrichment and annotation analyses, including gene property analysis by MAGMA and chromatin-based annotation analysis by stratified LDSC65 (Methods). To gain more insights into the biological mechanisms, we used DEPICT66 and MAGMA to conduct gene set analysis (Methods). The results can be found in Supplementary Note and are summarized in Supplementary Tables 18-21. In general, though some positive results can be obtained from these analyses, the present GWAS still has limited power to infer the specific biological pathway(s) influencing brain ROI volumes, and future GWAS with larger sample size is needed to further explore the biological mechanisms of brain imaging phenotypes.

Joint analysis with four independent datasets.

To validate the UKB GWAS results, we repeated GWAS of 101 ROI volumes separately on data obtained from four other independent studies: PNC (n = 537), HCP (n = 334), PING (n = 461), and ADNI (n = 860). Due to the small sample size of these four datasets, the probability of replicating significant findings in the UKB was low. Instead, we checked whether the effect signs were concordant in the five studies and whether the P-value of top UKB risk variants decreased after meta-analysis (Methods). Smaller P-values after meta-analysis indicate similar variant effects in independent samples67,68.
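
To illustrate why a smaller P-value after meta-analysis suggests concordant effects, the sketch below combines a discovery estimate with a much smaller replication estimate. It assumes a fixed-effect, inverse-variance weighted meta-analysis, which is our assumption for illustration (the weighting scheme is not specified in this section), and all effect sizes and standard errors are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided P-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error, and the two-sided P-value.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se, two_sided_p(beta / se)


# Hypothetical discovery estimate (large sample, small standard error).
p_discovery = two_sided_p(0.10 / 0.02)
# Hypothetical small replication sample with the same effect direction.
_, _, p_concordant = fixed_effect_meta([0.10, 0.08], [0.02, 0.06])
# Same replication sample, but with the opposite effect direction.
_, _, p_discordant = fixed_effect_meta([0.10, -0.08], [0.02, 0.06])
print(p_concordant < p_discovery < p_discordant)
```

Even an underpowered replication sample strengthens the pooled evidence when its effect points the same way, and weakens it when the effect is reversed.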

We carried out a joint analysis on 3,841,911 genetic variants that were present in all five sets of GWAS results. For the 7,310 significant associations (at 4.9 × 10−10 significance level), 63.8% (4,666) associations had the same effect signs across the five studies, and 97.0% (7,090) associations had the same effect signs in at least four studies (including UKB). Specifically, the number of genetic variants that had the same effect sign as UKB was 6,823 (93.3%) for ADNI, 6,436 (88.0%) for HCP, 6,455 (88.3%) for PING, and 6,648 (91.0%) for PNC. Exact binomial test69 showed a significant non-random agreement in effect signs across all the four studies (one-sided P < 2.2 × 10−16, null hypothesis: agreement has a probability 0.5). 93.9% (1,877) of the top 2,000 significant associations had smaller P-value after meta-analysis, and 91.4% (6,678) of the 7,310 associations were enhanced. We then performed meta-analysis on all 8,944,375 UKB GWAS genetic variants (variants were allowed to be missing in the four independent datasets). Compared to the UKB GWAS results (Supplementary Table 2, Supplementary Fig. 11, and Supplementary Note), there were more significant associations after meta-analysis: 29,585 significant associations at 5 × 10−8 significance level and 16,591 at 4.9 × 10−10 significance level (Supplementary Table 22 and Supplementary Fig. 12).
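
The sign-agreement test can be sketched with exact integer arithmetic. The counts below are the per-study agreement counts quoted above; the function name is ours, and the sketch treats every association as an independent trial, which is a simplification because variants in LD are correlated. The tail probability is far too small for a floating-point number, so the sketch works on the log10 scale.

```python
import math


def log10_binomial_upper_tail(successes, trials):
    """log10 of P(X >= successes) for X ~ Binomial(trials, 0.5).

    The tail count is summed exactly with Python integers, so no
    underflow occurs even for extremely small probabilities.
    """
    tail = sum(math.comb(trials, k) for k in range(successes, trials + 1))
    return math.log10(tail) - trials * math.log10(2)


N_ASSOCIATIONS = 7310
# Number of associations with the same effect sign as UKB, per study.
same_sign = {"ADNI": 6823, "HCP": 6436, "PING": 6455, "PNC": 6648}

for study, count in same_sign.items():
    log10_p = log10_binomial_upper_tail(count, N_ASSOCIATIONS)
    # Compare against log10(2.2e-16), the bound reported in the text.
    print(study, log10_p < math.log10(2.2e-16))
```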

Genetic correlation with other traits.

We used the meta-analysis GWAS results to estimate the genetic correlation with other traits via LDSC. As positive controls, we first estimated the genetic correlation between several UKB ROI volumes (TBV, left/right thalamus proper, left/right caudate, left/right putamen, left/right pallidum, left/right hippocampus, left/right accumbens area) and their corresponding traits studied in the ENIGMA consortium70. The gc estimates were all significant (P < 4.13 × 10−6), and the average correlation was 0.95 (Supplementary Table 23). We then collected 50 sets of publicly available GWAS summary statistics (Supplementary Table 24) and calculated their pairwise genetic correlation with ROI volumes (Supplementary Table 25). We mainly focused on traits that showed evidence of pleiotropy in association lookups. There were 22 significant associations after adjusting for multiple testing by the Benjamini-Hochberg (B-H) procedure at 0.05 level (Supplementary Table 26 and Supplementary Fig. 13).
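
For readers unfamiliar with the B-H procedure, a minimal sketch of the standard step-up rule follows. It is a generic implementation written for illustration, not the code used in this study, and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    P-value is <= k * fdr / m, then reject the k smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P-values for eight trait pairs.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(example, fdr=0.05))
```

Because the rule is step-up, a P-value that misses its own rank threshold can still be rejected when a larger P-value passes a later threshold.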

Significant genetic correlations linked 13 ROI volumes with general cognitive functions, education (education years, college completion), intelligence, numerical reasoning, reaction time, depressive symptoms, neuroticism, and bipolar disorder (BD) (Fig. 3), which matched our findings in variant and gene level lookups. Particularly, TBV had positive correlations with cognitive functions, education, intelligence, and numerical reasoning (gc range = (0.20, 0.25), mean = 0.22, P-value range = (1.52 × 10−11, 3.45 × 10−5)). These results matched the previous finding that brain size has small but significant connections with cognitive performance71. Reaction time had negative correlations with left/right pallidum, left/right ventral DC, and WM (gc range = (−0.20, −0.13), P-value range = (3.80 × 10−7, 1.14 × 10−4)). The negative correlations between reaction time and WM volumes have been previously reported72,73. Further details can be found in Supplementary Note. When the FDR level was relaxed to 0.1, suggestive evidence was observed for more brain-related traits, such as ASD and sleep traits (Supplementary Table 26 and Supplementary Fig. 14). In conclusion, our results confirm the significant genetic correlation among these traits and quantify the degree of their genetic overlaps.

Figure 3 ∣. Selected pairwise genetic correlations between ROI volumes (n = 21,821 subjects) and other traits.

The pairwise genetic correlations were estimated and tested by LDSC (https://github.com/bulik/ldsc). Stars are significant associations after adjusting for multiple testing by the Benjamini-Hochberg procedure at 0.05 significance level. The y-axis lists the ROI volumes. The x-axis provides the name of cognitive or mental health traits, the consortium sharing the GWAS summary statistics, and the corresponding sample sizes (see Supplementary Table 24 for further information about these studies).

Predictive ability of the UKB GWAS results.

We examined the out-of-sample prediction power of the UKB GWAS summary statistics using polygenic risk score prediction74. We first used a ten-fold cross-validation design to examine the prediction power within the UKB sample for seven ROIs, including thalamus proper, caudate, putamen, pallidum, hippocampus, accumbens area, and TBV (Methods). The polygenic profiles explained 1.18%-3.93% of the phenotypic variance (P-value range = (7.88 × 10−210, 4.90 × 10−72)) for these ROIs. The largest R-squared, 3.93%, was observed for putamen. Next, we used ROI-derived profiles to carry out cross-trait prediction on brain-related traits including education, reaction time, numeric memory, and fluid intelligence. The largest R-squared of a single profile was 0.24% (P = 7.53 × 10−7), which occurred when using the TBV-derived profile to predict fluid intelligence. When the profiles of the seven ROIs were combined in one multivariate model, the R-squared for predicting fluid intelligence improved to 0.52% (P = 1.89 × 10−9). These results are summarized in Supplementary Table 27.

We then used the GWAS summary statistics of 19,629 UKB individuals to construct polygenic profiles on subjects in PNC, HCP, PING, and ADNI. We found that, for 11 ROIs (Fig. 4), the genetically predicted regional volume was significantly associated with the observed ROI volume in all four validation datasets after Bonferroni correction (that is, 101 × 4 = 404 tests), and can account for 1.17%-6.38% phenotypic variance (P-value range = (3.31 × 10−24, 1.68 × 10−5)) (Supplementary Table 28). For example, the R-squared of right putamen-derived profile was 6.38% in ADNI and 4.85% in PNC. Furthermore, 29 genetically predicted regional volumes were significant in at least three of the four datasets, 56 in at least two datasets, and 84 in at least one dataset (Supplementary Figs. 15-17). In summary, our within-UKB and out-of-UKB PRS analyses clearly indicate that UKB GWAS summary statistics of ROI volumes have widespread prediction power across ROIs. However, the R-squared can be low when predicting other brain-related complex traits. Such results are unsurprising because the genetic correlations among these traits were found to be small (though significant) in LDSC analysis.
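
The incremental R-squared reported for these predictions is the variance explained by the polygenic profile beyond what the covariates already explain. The sketch below illustrates the idea for the simplest case of a single covariate, using the identity that the gain in R-squared equals the squared correlation between the phenotype and the covariate-residualized score. The function names and toy data are hypothetical, and the study's actual model adjusts for many covariates.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def residualize(target, covariate):
    """Residuals of a simple linear regression of target on covariate."""
    mt, mc = _mean(target), _mean(covariate)
    scc = sum((c - mc) ** 2 for c in covariate)
    slope = sum((c - mc) * (t - mt) for c, t in zip(covariate, target)) / scc
    return [(t - mt) - slope * (c - mc) for c, t in zip(covariate, target)]


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = _mean(a), _mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def incremental_r2(phenotype, covariate, prs):
    """R-squared gained by adding prs to a model that already has covariate."""
    prs_resid = residualize(prs, covariate)
    return pearson_r(phenotype, prs_resid) ** 2


# Toy data: the phenotype is the sum of an uncorrelated covariate and score.
covariate = [1.0, -1.0, -1.0, 1.0]
prs = [1.0, 1.0, -1.0, -1.0]
phenotype = [c + s for c, s in zip(covariate, prs)]
print(incremental_r2(phenotype, covariate, prs))
```

In the toy data the covariate alone explains half of the phenotypic variance and the score explains the remaining half, so the incremental R-squared is one half.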

Figure 4 ∣. Prediction accuracy (incremental R-squared) of polygenic risk scores constructed by UKB GWAS (n = 19,629 subjects) summary statistics on the four independent datasets.

The y-axis lists the ROI volumes (left/right cerebellum exterior, left/right putamen, left/right cerebellum white matter, left hippocampus, cerebellar vermal lobules VIII-X, X4th ventricle, right accumbens area and TBV). The x-axis lists the four independent cohorts (ADNI, HCP, PING and PNC). The displayed numbers are the proportions of phenotypic variation that can be additionally explained by polygenic risk scores, i.e., the incremental R-squared (see Methods for details of polygenic risk prediction).

DISCUSSION

In this study, we presented GWAS of 101 ROI volumes using data of 19,629 UKB individuals. Our novel contributions include: (i) identification of many new genetic associations at variant, locus, and gene levels; (ii) insights into the genetic co-architecture of brain volume phenotypes and other brain-related complex traits; (iii) validation of the UKB results in independent studies; and (iv) assessment of the predictive power of UKB GWAS results. Significant (P < 4.9 × 10−10) associations were found for 58 of the 101 ROIs. With a larger sample size, the present study replicated many known genetic variants and also prioritized new ones. Compared to Elliott et al.14, our GWAS not only discovered more genetic variants, but also enriched the degree of (statistical) pleiotropy75 of the associated genes and characterized the shared genetic influences with cognitive and mental health traits. Our SNP heritability estimates are aligned with previous results from twin studies. For example, our results supported previous findings that the degree of genetic control varies across different regions within the brain7,12,76,77. We also confirmed that cortical ROIs have larger variability in their heritability estimates than subcortical and ventricular ROIs11. In addition, some subcortical ROIs, such as putamen, cerebellum white matter, and brain stem11,78, were confirmed to be highly heritable. On the other hand, SNP heritability estimates of ROI volumes were found to be generally lower than estimates reported in twin studies7-10. This is expected79 and may indicate that genetic influences cannot be fully captured by additive effects of common genetic variants37. Such gaps may inspire future work to explore the effects of rare genetic variants on ROI volumes and to better model the genetic variation of the brain.

The present GWAS still faces some limitations. First, the current GWAS sample size of ROI volumes (and many other brain imaging phenotypes) is still far from sufficient. The highly polygenic genetic architecture of ROI volumes requires a larger number of individuals to identify many weak causal variants. In the era of sharing GWAS summary statistics, well-powered GWAS is essential for ROI volumes to be linked to the genetic co-architecture atlas with other complex traits. For example, a recent study by Watanabe et al.75, which gave a global overview of the genetic co-architecture of 2,965 traits, focused only on GWAS with sample sizes larger than 50,000, with the average sample size of selected traits being 256,276. In our genetic correlation analysis, we obtained only a limited number of significant correlations, even though many pleiotropic genes were found in association lookups. In addition, ROI-derived PRS currently may have insufficient power to predict other brain-related traits. Therefore, we expect that GWAS of ROI volumes with larger sample sizes will become available and can further improve our understanding of genetic overlaps underlying other traits. Besides increasing the sample size, combining genotyping data with external information, such as gene expression data80, may also help elucidate causal mechanisms, improve prediction performance, and identify genetic connections among traits.

Second, potential imaging artifacts, such as MRI hardware and software changes81, may cause unwanted variation in downstream genetic analyses, especially when combining multi-site and multiple-phase neuroimaging data82-84. In the present GWAS, we confirmed that the pairwise genetic correlations between UKB phases 1 and 2 data were distributed around one, and verified that the UKB GWAS results had satisfactory prediction ability on four other independent datasets. However, we found that the SNP heritability estimates of the two phases of data were not perfectly harmonized. The inadequate GWAS sample size may partially explain the variation in these heritability estimates, but it is also possible that artificial factors impaired the consistency of our results (see Table 1 of Smith and Nichols82 for a list of common imaging batch effects). Future studies that integrate data from more sites and phases are expected to be aware of batch effects and to confirm the previous GWAS findings.

METHODS

GWAS participants and phenotypes.

We performed GWAS separately on five publicly available datasets: the UK Biobank (UKB, http://www.ukbiobank.ac.uk/resources/) study, the Human Connectome Project (HCP, https://www.humanconnectome.org/) study, the Pediatric Imaging, Neurocognition, and Genetics (PING, http://pingstudy.ucsd.edu/resources/genomics-core.html) study, the Philadelphia Neurodevelopmental Cohort (PNC, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v1.p1) study, and the Alzheimer’s Disease Neuroimaging Initiative (ADNI, http://adni.loni.usc.edu/data-samples/) study. The main GWAS made use of data of 19,629 individuals of British ancestry from the UKB study, and the four other GWAS were performed on individuals of European ancestry (see Supplementary Table 29 for a summary of sample size of each GWAS).

The raw MRI, covariates and genetic data were downloaded from each data resource. We processed the MRI data locally using consistent procedures via Advanced Normalization Tools (ANTs, http://stnava.github.io/ANTs/) to generate ROI volume phenotypes for each dataset. The processing steps are detailed in the Supplementary Note, and we removed three ROIs (X5th ventricle and left/right lesion) with missing rates > 99%. For each phenotype and continuous covariate, we further removed values that deviated from the median by more than five times the median absolute deviation. All individuals were aged between 3 and 92 years. More information about the study cohorts can be found in Supplementary Table 30 and the Supplementary Note.
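As an illustration only, the outlier rule described above could be sketched as follows. The function name `mask_mad_outliers`, the use of the raw (unscaled) median absolute deviation, and the choice to mark removed values as missing are assumptions made for this sketch and are not taken from the processing pipeline.

```python
import numpy as np

def mask_mad_outliers(values, n_mads=5.0):
    """Set values farther than n_mads * MAD from the median to NaN.

    Assumption: MAD is the raw median of absolute deviations from the
    median (no 1.4826 scaling); existing NaNs are ignored and kept.
    """
    x = np.asarray(values, dtype=float)
    median = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - median))
    cleaned = x.copy()
    # Mark values beyond the cutoff as missing rather than dropping them,
    # so that rows stay aligned across phenotypes and covariates.
    cleaned[np.abs(x - median) > n_mads * mad] = np.nan
    return cleaned
```

Applied column by column to a phenotype or covariate table, a rule of this kind leaves typical values untouched and flags only extreme measurements.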

Heritability estimation and genome-wide association analysis.

We estimated the proportion of variation explained by all autosomal genetic variants in UKB using GCTA-GREML analysis85 (http://cnsgenomics.com/software/gcta/). The adjusted covariates included age (at imaging), age-squared, sex, age-sex interaction, age-squared-sex interaction, TBV (for ROIs other than TBV itself), as well as the top 40 genetic principal components (PCs) provided by UKB86 (Data-Field 22009). The heritability estimates were tested in one-sided likelihood ratio tests. For genetic variants on the autosomes, we performed association analysis for each ROI volume using PLINK87 (https://www.cog-genomics.org/plink2/), adjusting for the same set of covariates as in the GCTA-GREML analysis. The marginal genetic effects were tested in two-sided t-tests. GWAS were also separately performed on PING, PNC, ADNI, and HCP data. In these four datasets, we adjusted for age, age-squared, sex, age-sex interaction, age-squared-sex interaction, TBV (for ROIs other than TBV itself), and the top ten genetic PCs estimated from the genetic variants. We also adjusted for Alzheimer’s disease status in the ADNI GWAS. To examine the genetic correlation between UKB phase 1 and phase 2 data, we performed GWAS separately on the data of the two phases. For genetic variants on the X chromosome, we performed association analysis using XWAS88 (version 3.0, http://keinanlab.cb.bscb.cornell.edu/content/xwas/). We coded male genotypes on the X chromosome as 0/2, and sex was included as a covariate in the model.
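The covariate terms and the 0/2 coding of male X-chromosome genotypes can be made concrete with a small sketch. This is not the PLINK, GCTA or XWAS implementation; the helper names `build_covariates` and `code_male_x`, the 0/1 numeric coding of sex, and the assumption that male calls arrive as hemizygous allele counts (0 or 1) are all illustrative. TBV and the genetic PCs would be appended as further columns.

```python
import numpy as np

def build_covariates(age, sex):
    """Columns: age, age^2, sex, age x sex, age^2 x sex.

    Assumption: sex is coded numerically (for example 0/1).
    """
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex, dtype=float)
    age_sq = age ** 2
    return np.column_stack([age, age_sq, sex, age * sex, age_sq * sex])

def code_male_x(allele_count, is_male):
    """Recode hemizygous male X genotypes from 0/1 to 0/2.

    Female genotypes (0/1/2 allele counts) are returned unchanged.
    """
    g = np.asarray(allele_count, dtype=float)
    male = np.asarray(is_male, dtype=bool)
    return np.where(male, 2.0 * g, g)
```

Under the 0/2 coding, a male carrying the effect allele is treated like a female homozygote for that allele.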

Genomic risk loci characterization and comparison with previous findings.

Genomic risk loci were defined using the FUMA online platform (version 1.3.4, http://fuma.ctglab.nl/). We input the UKB GWAS summary statistics obtained from PLINK. FUMA first identified independent significant variants, which were defined as variants with a P-value smaller than the predefined threshold and independent of other significant variants at r2 < 0.6. FUMA then constructed LD blocks for these independent significant variants by tagging all variants that had a MAF ≥ 0.0005 and were in LD (r2 ≥ 0.6) with at least one of the independent significant variants. These tagged variants included those from the 1000 Genomes reference panel and may not have been included in the present study. Among the independent significant variants, (independent) lead variants were identified as those that were independent of each other (r2 < 0.1). If the LD blocks of independent significant variants were close to each other (<250 kb based on the closest boundary variants of the LD blocks), they were merged into a single genomic locus. Thus, each genomic locus could contain more than one independent significant variant and more than one lead variant. Independent significant variants and all the tagged variants were subsequently searched by FUMA in the NHGRI-EBI GWAS catalog (version 2019-01-31, https://www.ebi.ac.uk/gwas/) to look for their reported associations (P < 9 × 10−6) with any traits.
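The final merging step can be illustrated with a short sketch. It is not FUMA's code: the function name `merge_ld_blocks` and the representation of each LD block as a (start, end) pair of base-pair positions on one chromosome are assumptions made for illustration.

```python
def merge_ld_blocks(blocks, max_gap=250_000):
    """Merge LD blocks whose closest boundaries are < max_gap bp apart.

    blocks: iterable of (start_bp, end_bp) tuples on a single chromosome.
    Returns a list of merged (start_bp, end_bp) genomic loci.
    """
    loci = []
    for start, end in sorted(blocks):
        if loci and start - loci[-1][1] < max_gap:
            # Close to (or overlapping) the previous locus: extend it.
            loci[-1] = (loci[-1][0], max(loci[-1][1], end))
        else:
            loci.append((start, end))
    return loci
```

For example, blocks at 100-200 kb and 400-500 kb are separated by 200 kb and would form one locus, whereas a block starting at 1 Mb would remain a separate locus.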

Gene-based association analysis and functional annotation.

Gene-based association analysis was carried out for 18,796 protein-coding genes using MAGMA (v1.07, https://ctg.cncr.nl/software/magma/), which was also implemented in FUMA. Genetic variants were mapped to genes according to their physical positions, and the gene-based P-values were then calculated from the GWAS summary statistics of the mapped variants. Default MAGMA parameters were used, which mapped genetic variants to genes with no window around genes (window size = 0). In functional annotation and mapping analysis, variant-level signals were annotated with their biological functionality and were then linked to genes by a combination of positional, eQTL, and 3D chromatin interaction mappings. Specifically, independent significant variants and all the tagged variants were first annotated for functional consequences on gene functions (e.g., intergenic, intronic, exonic) using ANNOVAR89 (version 2017-01-11). Functionally annotated variants were then mapped to 35,808 candidate genes based on physical position on the genome (tissue/cell types for 15-core chromatin state: brain), eQTL associations (tissue types: GTEx90 v7 brain, BRAINEAC91, and CommonMind Consortium92) and chromatin interaction mapping (built-in chromatin interaction data: dorsolateral prefrontal cortex, hippocampus93; annotate enhancer/promoter regions: E053-E082 brain94). We used default values for all other parameters.

For the detected genes, we performed lookups in the NHGRI-EBI GWAS catalog (version 2019-05-03) again to explore the previously reported associations with the same or other traits. We focused on traits including cognitive functions (such as general cognitive ability, cognitive performance, and empathy quotient), intelligence, educational attainment, math ability (such as highest math class taken and self-reported math ability), reaction time, neuroticism, neurodegenerative diseases (such as Alzheimer’s disease and Parkinson’s disease), and neuropsychiatric disorders (such as major depressive disorder, schizophrenia, and bipolar disorder).

Biological annotation and enrichment analyses.

For the 14 brain tissues (GTEx90 v7), we performed gene property analysis via MAGMA. That is, for each candidate gene, we tested whether its tissue-specific expression levels can be linked to the strength of its association with ROI volumes. We also performed cell-type/tissue-specific chromatin-based annotation analysis using stratified LDSC (https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses). The cell-type/tissue-specific annotations of DNase I hypersensitivity and activating histone marks (H3K27ac, H3K4me3, H3K4me1, H3K9ac and H3K36me3) were from the Roadmap Epigenomics consortium94 and the ENCODE project95. For each annotation, we tested whether it had an enriched contribution to per-SNP heritability, conditional on the other annotations. DEPICT (version 1 rel194, https://github.com/perslab/depict) and MAGMA gene set analyses were used to explore the biological pathways implicated by the UKB GWAS summary statistics. Specifically, DEPICT tested 10,968 reconstituted gene sets, and the GWAS summary statistics with P < 10−5 were used as input. The MAGMA gene set analysis examined 10,678 gene sets from the Molecular Signatures Database96 (MSigDB, v6.2, http://software.broadinstitute.org/gsea/msigdb), including 4,761 curated gene sets and 5,917 Gene Ontology (GO) terms. All parameters in these analyses were set to their default values.

Meta-analysis of GWAS results.

We meta-analyzed the UKB, PING, PNC, ADNI, and HCP GWAS summary results using METAL (https://genome.sph.umich.edu/wiki/METAL) with the sample-size weighted approach. Since the sample sizes of the four other datasets were small, we removed the variants that were not present in the UKB data.
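For readers unfamiliar with the sample-size weighted approach, the standard combination rule weights each study's Z-score by the square root of its sample size. The sketch below illustrates that rule for a single variant; it is not METAL itself, the function names are hypothetical, and it assumes the Z-scores have already been aligned to the same effect allele.

```python
import math

def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z-scores with weights w_i = sqrt(N_i).

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

def two_sided_p(z):
    """Two-sided P-value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

With this weighting, a large cohort such as UKB dominates the combined statistic, while concordant evidence from the smaller cohorts still increases it.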

Genetic correlation estimation with LDSC.

LD Hub (v1.9.1, http://ldsc.broadinstitute.org/ldhub/) was used to estimate the genetic correlation between several UKB ROI volumes and their corresponding traits studied in the ENIGMA consortium (http://enigma.ini.usc.edu/). The LDSC software (v1.0.0, https://github.com/bulik/ldsc) was then used to estimate the pairwise genetic correlation with 50 sets of collected GWAS summary statistics. In addition, for each ROI, we also examined the genetic correlation between its regional volumes collected in UKB phases 1 and 2. We used the pre-calculated LD scores provided by LDSC (https://data.broadinstitute.org/alkesgroup/LDSCORE/), which were computed using 1000 Genomes European data. We used HapMap397 variants and removed all variants in the major histocompatibility complex (MHC) region.

Polygenic scoring.

Polygenic profiles were created to examine the out-of-sample prediction power of the GWAS results. Specifically, we used PLINK to generate risk scores in testing data by summing across variants, weighted by their effect sizes estimated from training data. To account for the LD structure, two procedures were used: (i) LD-based pruning (window size 50, step 5, r2 = 0.2); and (ii) posterior effect size estimation under a continuous shrinkage prior with an external LD reference panel98 (https://github.com/getian107/PRScs). We tried five P-value thresholds for predictor selection in each of the two procedures: 1, 0.5, 0.05, 5 × 10−4 and 5 × 10−8. Thus, ten polygenic profiles were generated for each ROI volume, and we reported the best prediction power achieved by any single profile among the ten. The association between polygenic profile and phenotype was estimated and tested in a linear regression model, adjusting for the effects of age and sex. The additional phenotypic variation that can be explained by the polygenic profile (i.e., the incremental R-squared) was used to measure the prediction power.
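The two quantities defined above, the weighted score and the incremental R-squared, can be sketched in a few lines. This is an illustration rather than the PLINK or PRS-CS workflow: the function names are hypothetical, LD pruning and P-value thresholding are assumed to have been applied already, and ordinary least squares via NumPy stands in for the regression software.

```python
import numpy as np

def polygenic_score(genotypes, effect_sizes):
    """Weighted sum of allele counts: one score per individual.

    genotypes: (n_individuals, n_variants) allele counts.
    effect_sizes: (n_variants,) effects estimated in the training data.
    """
    return np.asarray(genotypes, dtype=float) @ np.asarray(effect_sizes, dtype=float)

def r_squared(predictors, y):
    """R-squared of an ordinary least squares fit with an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

def incremental_r2(score, covariates, y):
    """R-squared gained by adding the score to a covariates-only model."""
    full = np.column_stack([covariates, score])
    return r_squared(full, y) - r_squared(covariates, y)
```

Here `covariates` would hold age and sex, so the incremental R-squared reflects only the variance explained by the polygenic profile beyond those two terms.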

For the UKB dataset, we randomly divided the 19,629 UKB individuals into ten folds, used nine of these folds as training data to rerun the GWAS, and created polygenic profiles for the individuals in the remaining fold, which served as testing data. We repeated this procedure ten times so that each fold served as the testing data exactly once. We examined seven ROIs including thalamus proper, caudate, putamen, pallidum, hippocampus, accumbens area, and TBV. For the first six ROIs, their volumes were the sum of the volumes of the corresponding left and right ROIs. We then used these ROI-derived profiles to predict four brain-related traits: education (Data-Field: 845), reaction time (Data-Field: 20023), numeric memory (Data-Field: 4282), and fluid intelligence (Data-Field: 20016). We first assessed the cross-trait prediction ability of each profile, and then we selected the best profile for each ROI and put the seven profiles together in one model for multivariate analysis.
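The ten-fold design can be sketched as an index generator. The function name `ten_fold_splits`, the random seed, and the use of NumPy are assumptions for illustration; rerunning the GWAS on each training set is outside the scope of the sketch.

```python
import numpy as np

def ten_fold_splits(n_individuals, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the test set once."""
    rng = np.random.default_rng(seed)
    # Shuffle individuals, then cut the permutation into near-equal folds.
    folds = np.array_split(rng.permutation(n_individuals), n_folds)
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        yield train_idx, test_idx
```

With 19,629 individuals, each testing fold contains 1,962 or 1,963 individuals and the corresponding training set contains the remaining nine folds.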

Next, we used the UKB GWAS results to perform prediction on ADNI, PING, PNC and HCP data for all 101 ROI volumes. The prediction accuracy was evaluated on all samples in the four testing sets (with phenotype and genetic data available), not limited to individuals of European ancestry used in GWAS.

Reporting summary.

Further information on research design is available in the Life Sciences Reporting Summary linked to this article.

Data availability

The data used in this work were obtained from five publicly available datasets: the UK Biobank (UKB) study, the Human Connectome Project (HCP) study, the Pediatric Imaging, Neurocognition, and Genetics (PING) study, the Philadelphia Neurodevelopmental Cohort (PNC) study, and the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study. We used 50 sets of publicly available GWAS summary statistics from several GWAS databases. The data resources are summarized in Supplementary Table 24. All UKB and meta-analysis GWAS summary statistics of 101 ROI volumes can be found at: https://med.sites.unc.edu/bigs2/data/gwas-summary-statistics/.

Code availability

We made use of publicly available software and tools. All code used to generate the results reported in this paper is available upon request.

Supplementary Material

1

ACKNOWLEDGEMENTS

This research was partially supported by U.S. NIH grants MH086633 (H.Z.) and MH116527 (T. Li), and a grant from the Cancer Prevention Research Institute of Texas (H.Z.). We thank the individuals represented in the UK Biobank, ADNI, HCP, PING and PNC datasets for their participation and the research teams for their work in collecting, processing and disseminating these datasets for analysis. This research has been conducted using the UK Biobank resource (application number 22783), subject to a data transfer agreement. We gratefully acknowledge all the studies and databases that made GWAS summary data available. Part of data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering and through generous contributions from the following: Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen Idec Inc.; Bristol-Myers Squibb Company; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Medpace, Inc.; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Synarc Inc.; and Takeda Pharmaceutical Company. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org).
The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. Part of the data collection and sharing for this project was funded by the Pediatric Imaging, Neurocognition and Genetics Study (PING) (U.S. National Institutes of Health Grant RC2DA029475). PING is funded by the National Institute on Drug Abuse and the Eunice Kennedy Shriver National Institute of Child Health & Human Development. PING data are disseminated by the PING Coordinating Center at the Center for Human Development, University of California, San Diego. Support for the collection of the PNC datasets was provided by grant RC2MH089983 awarded to Raquel Gur and RC2MH089924 awarded to Hakon Hakonarson. All PNC subjects were recruited through the Center for Applied Genomics at The Children’s Hospital in Philadelphia. HCP data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

Footnotes

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

REFERENCES

  • 1. Ritchie SJ et al. Beyond a bigger brain: Multivariable structural brain imaging and intelligence. Intelligence 51, 47–56 (2015).
  • 2. Davies G et al. Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N = 112 151). Molecular Psychiatry 21, 758–767 (2016).
  • 3. van der Meer D et al. Brain scans from 21,297 individuals reveal the genetic architecture of hippocampal subfield volumes. Molecular Psychiatry, doi: 10.1038/s41380-018-0262-7 (2018).
  • 4. Caldiroli A et al. The relationship of IQ and emotional processing with insula volume in schizophrenia. Schizophrenia Research 202, 141–148 (2018).
  • 5. Vreeker A et al. The relationship between brain volumes and intelligence in bipolar disorder. Journal of Affective Disorders 223, 59–64 (2017).
  • 6. Wigmore EM et al. Do regional brain volumes and major depressive disorder share genetic architecture? A study of Generation Scotland (n = 19 762), UK Biobank (n = 24 048) and the English Longitudinal Study of Ageing (n = 5766). Translational Psychiatry 7, e1205 (2017).
  • 7. Wen W et al. Distinct genetic influences on cortical and subcortical brain structures. Scientific Reports 6, 32760 (2016).
  • 8. den Braber A et al. Heritability of subcortical brain measures: a perspective for future genome-wide association studies. NeuroImage 83, 98–102 (2013).
  • 9. Eyler LT et al. Conceptual and data-based investigation of genetic influences and brain asymmetry: a twin study of multiple structural phenotypes. Journal of Cognitive Neuroscience 26, 1100–1117 (2014).
  • 10. Blokland GA, de Zubicaray GI, McMahon KL & Wright MJ. Genetic and environmental influences on neuroimaging phenotypes: a meta-analytical perspective on twin imaging studies. Twin Research and Human Genetics 15, 351–371 (2012).
  • 11. Kremen WS et al. Genetic and environmental influences on the size of specific brain regions in midlife: the VETSA MRI study. NeuroImage 49, 1213–1223 (2010).
  • 12. Jansen AG, Mous SE, White T, Posthuma D & Polderman TJ. What twin studies tell us about the heritability of brain development, morphology, and function: a review. Neuropsychology Review 25, 27–46 (2015).
  • 13. Zhao B et al. Heritability of regional brain volumes in large-scale neuroimaging and genetic studies. Cerebral Cortex 29, 2904–2914 (2018).
  • 14. Elliott LT et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018).
  • 15. Biton A et al. Polygenic architecture of human neuroanatomical diversity. bioRxiv, 592337 (2019).
  • 16. Toro R et al. Genomic architecture of human neuroanatomical diversity. Molecular Psychiatry 20, 1011–1016 (2015).
  • 17. Hibar DP et al. Common genetic variants influence human subcortical brain structures. Nature 520, 224–229 (2015).
  • 18. Yang J et al. Common SNPs explain a large proportion of the heritability for human height. Nature Genetics 42, 565–569 (2010).
  • 19. Boyle EA, Li YI & Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
  • 20. Timpson NJ, Greenwood CMT, Soranzo N, Lawson DJ & Richards JB. Genetic architecture: the shape of the genetic contribution to human traits and disease. Nature Reviews Genetics 19, 110–124 (2017).
  • 21. Hibar DP et al. Novel genetic loci associated with hippocampal volume. Nature Communications 8, 13624 (2017).
  • 22. Franke B et al. Genetic influences on schizophrenia and subcortical brain volumes: large-scale proof of concept. Nature Neuroscience 19, 420–431 (2016).
  • 23. Guadalupe T et al. Human subcortical brain asymmetries in 15,847 people worldwide reveal effects of age and sex. Brain Imaging and Behavior 11, 1497–1514 (2017).
  • 24. Ikram MA et al. Common variants at 6q22 and 17q21 are associated with intracranial volume. Nature Genetics 44, 539–544 (2012).
  • 25. Bis JC et al. Common variants at 12q14 and 12q24 are associated with hippocampal volume. Nature Genetics 44, 545–551 (2012).
  • 26. Satizabal CL et al. Genetic architecture of subcortical brain structures in over 40,000 individuals worldwide. bioRxiv, 173831 (2017).
  • 27. Davies G et al. Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. Nature Communications 9, 2098 (2018).
  • 28. Nagel M et al. Meta-analysis of genome-wide association studies for neuroticism in 449,484 individuals identifies novel genetic loci and pathways. Nature Genetics 50, 920 (2018).
  • 29. Savage JE et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nature Genetics 50, 912–919 (2018).
  • 30. Sudlow C et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Medicine 12, e1001779 (2015).
  • 31. Satterthwaite TD et al. Neuroimaging of the Philadelphia neurodevelopmental cohort. NeuroImage 86, 544–553 (2014).
  • 32. Weiner MW et al. The Alzheimer’s Disease Neuroimaging Initiative: a review of papers published since its inception. Alzheimer’s & Dementia 9, e111–e194 (2013).
  • 33. Jernigan TL et al. The pediatric imaging, neurocognition, and genetics (PING) data repository. NeuroImage 124, 1149–1154 (2016).
  • 34. Somerville LH et al. The Lifespan Human Connectome Project in Development: A large-scale study of brain connectivity development in 5–21 year olds. NeuroImage 183, 456–468 (2018).
  • 35. Avants BB et al. A reproducible evaluation of ANTs similarity metric performance in brain image registration. NeuroImage 54, 2033–2044 (2011).
  • 36. Tustison NJ et al. Large-scale evaluation of ANTs and FreeSurfer cortical thickness measurements. NeuroImage 99, 166–179 (2014).
  • 37. Yang J, Zeng J, Goddard ME, Wray NR & Visscher PM. Concepts, estimation and interpretation of SNP-based heritability. Nature Genetics 49, 1304–1310 (2017).
  • 38. de Leeuw CA, Mooij JM, Heskes T & Posthuma D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Computational Biology 11, e1004219 (2015).
  • 39. Watanabe K, Taskesen E, Bochoven A & Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nature Communications 8, 1826 (2017).
  • 40. Bulik-Sullivan B et al. An atlas of genetic correlations across human diseases and traits. Nature Genetics 47, 1236–1241 (2015).
  • 41. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
  • 42. Buniello A et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Research 47, D1005–D1012 (2018).
  • 43. Chen C-H et al. Leveraging genome characteristics to improve gene discovery for putamen subcortical brain structure. Scientific Reports 7, 15736 (2017).
  • 44. Stein JL et al. Identification of common variants associated with human hippocampal and intracranial volumes. Nature Genetics 44, 552–561 (2012).
  • 45. Furney S et al. Genome-wide association with MRI atrophy measures as a quantitative trait locus for Alzheimer’s disease. Molecular Psychiatry 16, 1130–1138 (2011).
  • 46. Baranzini SE et al. Genome-wide association analysis of susceptibility and clinical phenotype in multiple sclerosis. Human Molecular Genetics 18, 767–778 (2008).
  • 47. Fischl B. FreeSurfer. NeuroImage 62, 774–781 (2012).
  • 48. Hibar DP et al. Genome-wide association identifies genetic variants associated with lentiform nucleus volume in N = 1345 young and elderly subjects. Brain Imaging and Behavior 7, 102–115 (2013).
  • 49. Sprooten E et al. White matter integrity as an intermediate phenotype: exploratory genome-wide association analysis in individuals at high risk of bipolar disorder. Psychiatry Research 206, 223–231 (2013).
  • 50. Kim S & Webster M. Integrative genome-wide association analysis of cytoarchitectural abnormalities in the prefrontal cortex of psychiatric disorders. Molecular Psychiatry 16, 452–461 (2011).
  • 51. Klein M et al. Genetic markers of ADHD-related variations in intracranial volume. American Journal of Psychiatry 176, 228–238 (2019).
  • 52. Hill W et al. A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Molecular Psychiatry 24, 169–181 (2019).
  • 53. Jun G et al. A novel Alzheimer disease locus located near the gene encoding tau protein. Molecular Psychiatry 21, 108–117 (2016).
  • 54. Luciano M et al. Association analysis in over 329,000 individuals identifies 116 independent variants influencing neuroticism. Nature Genetics 50, 6–11 (2018).
  • 55. Lee JJ et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics 50, 1112–1121 (2018).
  • 56. Edwards TL et al. Genome-wide association study confirms SNPs in SNCA and the MAPT region as common risk factors for Parkinson disease. Annals of Human Genetics 74, 97–109 (2010).
  • 57. Turley P et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nature Genetics 50, 229–237 (2018).
  • 58. Okbay A et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nature Genetics 48, 624–633 (2016).
  • 59. Zheng H-F et al. WNT16 influences bone mineral density, cortical bone thickness, bone strength, and osteoporotic fracture risk. PLoS Genetics 8, e1002745 (2012).
  • 60. Pardiñas AF et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nature Genetics 50, 381–389 (2018).
  • 61. Li Z et al. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia. Nature Genetics 49, 1576–1583 (2017).
  • 62. Marioni RE et al. GWAS on family history of Alzheimer’s disease. Translational Psychiatry 8, 99 (2018).
  • 63. Linnér RK et al. Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nature Genetics 51, 245–257 (2019).
  • 64. Kamboh M et al. Genome-wide association study of Alzheimer’s disease. Translational Psychiatry 2, e117 (2012).
  • 65. Finucane HK et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nature Genetics 50, 621–629 (2018).
  • 66. Pers TH et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nature Communications 6, 5890 (2015).
  • 67. Jansen PR et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways. Nature Genetics 51, 394–403 (2019).
  • 68. Skol AD, Scott LJ, Abecasis GR & Boehnke M. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nature Genetics 38, 209–213 (2006).
  • 69. Hanscombe KB et al. Genetic factors influencing coagulation factor XIII B-subunit contribute to risk of ischemic stroke. Stroke 46, 2069–2074 (2015).
  • 70. Thompson PM et al. The ENIGMA Consortium: large-scale collaborative analyses of neuroimaging and genetic data. Brain Imaging and Behavior 8, 153–182 (2014).
  • 71. Nave G, Jung WH, Karlsson Linnér R, Kable JW & Koellinger PD. Are bigger brains smarter? Evidence from a large-scale preregistered study. Psychological Science 30, 43–54 (2019).
  • 72. Walhovd KB & Fjell AM. White matter volume predicts reaction time instability. Neuropsychologia 45, 2277–2284 (2007).
  • 73. Delorme S et al. Reaction time is negatively associated with corpus callosum area in the early stages of CADASIL. American Journal of Neuroradiology 38, 2094–2099 (2017).
  • 74. International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
  • 75. Watanabe K et al. A global overview of pleiotropy and genetic architecture in complex traits. Nature Genetics 51, 1339–1348 (2019).
  • 76. Thompson PM et al. Genetic influences on brain structure. Nature Neuroscience 4, 1253–1258 (2001).
  • 77. Peper JS, Brouwer RM, Boomsma DI, Kahn RS & Hulshoff Pol HE. Genetic influences on human brain structure: a review of brain imaging studies in twins. Human Brain Mapping 28, 464–473 (2007).
  • 78. Yoon U, Perusse D, Lee J-M & Evans AC. Genetic and environmental influences on structural variability of the brain in pediatric twin: deformation based morphometry. Neuroscience Letters 493, 8–13 (2011).
  • 79. Manolio TA et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
  • 80. Gusev A et al. Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nature Genetics 50, 538–548 (2018).
  • 81. Miller KL et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nature Neuroscience 19, 1523–1536 (2016).
  • 82. Smith SM & Nichols TE. Statistical challenges in “big data” human neuroimaging. Neuron 97, 263–268 (2018).
  • 83. Fortin J-P et al. Removing inter-subject technical variability in magnetic resonance imaging studies. NeuroImage 132, 198–212 (2016).
  • 84. Fortin J-P et al. Harmonization of cortical thickness measurements across scanners and sites. NeuroImage 167, 104–120 (2018).

METHODS-ONLY REFERENCES
