Abstract
Structural brain changes along the lineage leading to modern Homo sapiens contributed to our distinctive cognitive and social abilities. However, the evolutionarily relevant molecular variants impacting key aspects of neuroanatomy are largely unknown. Here, we integrate evolutionary annotations of the genome at diverse timescales with common variant associations from large-scale neuroimaging genetic screens. We find that alleles with evidence of recent positive polygenic selection over the past 2000–3000 years are associated with increased surface area (SA) of the entire cortex, as well as specific regions, including those involved in spoken language and visual processing. Therefore, polygenic selective pressures impact the structure of specific cortical areas even over relatively recent timescales. Moreover, common sequence variation within human gained enhancers active in the prenatal cortex is associated with postnatal global SA. We show that such variation modulates the function of a regulatory element of the developmentally relevant transcription factor HEY2 in human neural progenitor cells and is associated with structural changes in the inferior frontal cortex. These results indicate that non-coding genomic regions active during prenatal cortical development are involved in the evolution of human brain structure and identify novel regulatory elements and genes impacting modern human brain structure.
Keywords: cortical surface area, genome-wide association study, human gained enhancers, polygenic selection
Introduction
The size, shape, and neural architecture of the modern human brain reflect the cumulative effects of selective pressures over evolutionary history. Analyses of fossilized skulls indicate that endocranial volume has increased dramatically on the lineage that led to Homo sapiens in the over 6 million years since our last common ancestor with chimpanzees (Fig. 1; Henneberg 1988; Lee and Wolpoff 2003; Jantz and Jantz 2016; Moorjani et al. 2016; Du et al. 2018). It is thought that these volumetric increases were mainly driven by expansions of neocortical surface area (SA) (Rakic 2009; Lui et al. 2011; Geschwind and Rakic 2013), although changes in other brain structures, including the cerebellum, also likely played a significant role (Barton and Venditti 2014; Miller et al. 2019). Beyond overall size differences, skull endocasts of archaic hominins suggest that human-specific refinements to brain structure occurred during the last 300 000 years, most notably the shift toward a more globular shape (Hublin et al. 2017; Gunz et al. 2019). A commonly held view is that differential expansion of distinct regions of the neocortex contributed to the evolution of the distinctive cognitive and social abilities of our species (Rakic 2009; Lui et al. 2011; Geschwind and Rakic 2013). Neuroanatomical changes in our ancestors were accompanied by increasingly sophisticated tool use, the emergence of proficient spoken language, world-wide migrations, and the development of agriculture, among other innovations (Pääbo 2014).
Several studies have identified fixed genomic differences that may have impacted aspects of brain structure along our lineage (Enard 2016; Sousa et al. 2017; Mitchell and Silver 2018), but the genetic variation that shaped the cortex across human evolution is still largely undetermined. In the present study, we adopt a novel strategy to uncover genetic variants that have contributed to anatomical features of the modern human brain. To do so, we identify loci of defined evolutionary relevance in the genome and assess the effects of those loci on cortical structure through large-scale neuroimaging genetics. Comparative genomic and population genetic annotations from multiple sources have been used to identify evolutionarily relevant loci in the human genome across diverse time scales (Fig. 1; Pollard, Salama, King, et al. 2006a; Vernot and Akey 2014; Reilly et al. 2015; Field et al. 2016; Simonti et al. 2016; Vermunt et al. 2016; Nielsen et al. 2017; Peyrégne et al. 2017). Two annotations of particular note capture distinct periods in human history. The singleton density score (SDS) uses genome sequencing data to identify haplotypes with a decreased accumulation of singleton variants in the population being studied, providing evidence for polygenic natural selection acting over the past ~2000–3000 years (Field et al. 2016). On a deeper time scale, human-gained enhancers (HGEs) represent gene regulatory elements that display stronger histone acetylation or methylation marks of promoters or enhancers in human cortical tissue compared with extant primates or mice (Reilly et al. 2015; Vermunt et al. 2016), arising after our last common ancestor with Old World monkeys about 30 million years ago (Mya).
By themselves, these indices suggest loci of likely evolutionary significance in the human genome but are not informative for defining which loci (if any) influence the structure of the human brain. We hypothesize that, for evolutionarily relevant genetic variants that have not reached fixation, data from genome-wide association studies (GWAS) of cortical structure can help determine their potential functional impacts on brain structure. We reason that GWAS data may shed light on the evolution of cortical structure by: 1) determining if alleles under selective pressure are associated with variation in neural anatomy and 2) revealing if interindividual variation in defined genomic regions of evolutionary significance is associated with variation in neural anatomy. Crucially, this novel approach for studying human brain evolution depends on the availability of large datasets of many thousands of individuals in which structural neuroimaging measures have been coupled to genome-wide genotyping. In this regard, we take advantage of recent large-scale GWAS work from the Enhancing NeuroImaging Genetics through Meta Analysis (ENIGMA) consortium (Grasby et al. 2020), including data from the UK Biobank (Elliott et al. 2018), which identified hundreds of genetic loci associated with interindividual variability in human cortical structure in living populations. Thus, here, we integrate genomic annotations spanning 30 million years of our evolutionary history with data from a GWAS meta-analysis of cortical SA in over 33 000 modern humans (Grasby et al. 2020) to assess the aggregate impact of each annotation on modern variation in cortical SA and identify genetic variants within these annotations with notable effects on human neural development.
Materials and Methods
Genome-Wide Association Summary Statistics
Summary statistics for 35 cortical SA phenotypes (global SA and average bilateral SA for 34 regions) were obtained from a European ancestry discovery sample of the ENIGMA cortical SA meta-analysis (Grasby et al. 2020) including data from the UK Biobank (UKBB) (Elliott et al. 2018). For comparative purposes, corresponding summary statistics for cortical thickness were also obtained from the same source. We focused our analyses on SA given its particular expansion during hominid evolution, well established in prior literature, but as a comparison also show results from analyses of thickness in the Supplementary Materials. Details of image segmentation, genotyping, imputation, association, and meta-analysis are found in the primary GWAS meta-analysis reference (Grasby et al. 2020). Briefly, magnetic resonance images of the brain were segmented with FreeSurfer (Dale et al. 1999) using a gyrally defined atlas (Desikan et al. 2006), and visually quality checked based on guidelines provided at the ENIGMA website (http://enigma.ini.usc.edu/research/gwasma-of-cortical-measures/). Imputation of genome-wide genotyping arrays was conducted to the 1000 Genomes phase 1 release v3 reference panel. When conducting associations of gyrally defined regions, the global measure of SA was included as a covariate, in order to test for genetic influences that were specific to each region. The original association models also included four multidimensional scaling components to help control for ancestry, as well as linear and nonlinear corrections for age and sex, diagnostic status, and scanner. Fixed effects meta-analysis was used to combine effects across all sites contributing to the analysis (Willer et al. 2010). All analyses were performed on summary statistics without genomic control (Bacanu et al. 2000) correction applied.
Ancestry Regression
We first determined the impact of subtle population stratification on each GWAS summary statistics dataset, in light of studies showing that such stratification can confound estimates of selection (Berg et al. 2018; Sohail et al. 2019). To do so, all unrelated subjects [defined in Gazal et al. (2015)] were selected from 1000 Genomes Phase 3 data (1000 Genomes Project Consortium et al. 2015). We then selected single nucleotide polymorphisms (SNPs) that had a minor allele frequency (MAF) > 5% in 1000 Genomes and that were not located in the major histocompatibility complex (MHC) locus, the chromosome 8 inversion region, or regions of long linkage disequilibrium (LD). LD-independent SNPs (r2 < 0.2) were selected via pruning using a window size of 500 kb and a slide of 100 kb (PLINK --indep-pairwise 500 100 0.2). Principal component (PC) analysis was performed in PLINK (Chang et al. 2015) on the 264 339 remaining SNPs. In order to obtain SNP PC loadings for all SNPs in the 1000 Genomes Project (after removing SNPs with MAF < 0.05 and those in the MHC locus, the chromosome 8 inversion region, or regions of long LD), we performed linear regressions of the PC scores on the genotype allele count of each SNP (after controlling for sex) and used the resulting regression coefficients as the SNP PC loading estimates. This procedure followed that used in previous work (Sohail et al. 2019). For the first 20 PCs, the weighting of the PCs for each subject was used as a trait and tested for association with each subject’s genotype in PLINK. For each SNP, across all 20 PCs, we identified the degree of association of that SNP to population frequency differences along that principal axis of variation (Beta_PCs). After merging summary statistics of each SA GWAS without genomic control (Bacanu et al. 2000) correction (Beta_strat) with Beta_PC values, ensuring beta values were with respect to the same effect allele, and sorting based on chromosomal position, a block jackknife approach with 1000 blocks was used to assess the correlation between Beta_strat and Beta_PCs, shown in Figure 2a and Supplementary Figure 1.
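As a rough illustration of the block jackknife correlation described above, the following Python sketch computes a Pearson correlation together with a delete-one-block standard error over contiguous blocks of position-sorted SNPs. It is not the analysis code used in this study: the function names, the equal-weight jackknife variance formula, and the toy input values are assumptions made purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def block_jackknife_corr(x, y, n_blocks):
    """Correlation and delete-one-block jackknife standard error.

    x and y are lists that are already sorted by chromosomal position,
    so that each block holds contiguous (LD-correlated) SNPs.
    """
    n = len(x)
    # Boundaries of n_blocks contiguous blocks of near-equal size.
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    full = pearson(x, y)
    leave_one_out = []
    for b in range(n_blocks):
        lo, hi = bounds[b], bounds[b + 1]
        # Recompute the correlation with block b removed.
        leave_one_out.append(pearson(x[:lo] + x[hi:], y[:lo] + y[hi:]))
    mean_loo = sum(leave_one_out) / n_blocks
    # Equal-weight delete-one jackknife variance (assumes similar block sizes).
    var = (n_blocks - 1) / n_blocks * sum((r - mean_loo) ** 2 for r in leave_one_out)
    return full, math.sqrt(var)

# Toy values standing in for Beta_strat and Beta_PC (hypothetical numbers).
beta_strat_toy = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03]
beta_pc_toy = [0.10, -0.20, 0.05, 0.30, -0.10, 0.20, -0.30, 0.15]
r_toy, se_toy = block_jackknife_corr(beta_strat_toy, beta_pc_toy, n_blocks=4)
```

Dividing the correlation by its jackknife standard error gives an approximate test statistic that accounts for LD between neighboring SNPs, which a naive per-SNP standard error would ignore.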
We then implemented an ancestry regression procedure following previous work (Bhatia et al. 2016). We used a regression model fitting each set of SA GWAS summary statistics without genomic control correction (Beta_strat) simultaneously to the 20 Beta_PC values calculated as described above using the lm() function in R (v3.2.3). The residuals of this model (Beta_r) were used as ancestry-corrected effect sizes. Ancestry-corrected standard errors and P-values were calculated following the same prior work (Bhatia et al. 2016). The same block jackknife correlation method was used to assess the impact of subtle population stratification by correlating Beta_r with Beta_PC in Figure 2b and Supplementary Figure 2. The same analyses were completed for cortical thickness (Supplementary Figs 3 and 4).
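The core of the ancestry regression step, taking residuals of a multiple regression of Beta_strat on the Beta_PC values, can be sketched in plain Python as follows. This is only an illustrative stand-in for the lm() fit described above: the function names are hypothetical, the example uses two PCs rather than 20, and an intercept is included because that is the default behavior of lm(). The corrected standard errors and P-values of Bhatia et al. (2016) are not reproduced here.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ancestry_regress(beta_strat, beta_pcs):
    """Residuals of an OLS fit of beta_strat on an intercept plus all PC loadings.

    beta_strat: list of GWAS effect sizes, one per SNP.
    beta_pcs:   list of per-SNP lists holding that SNP's loading on each PC.
    """
    design = [[1.0] + list(row) for row in beta_pcs]  # intercept + PC columns
    k = len(design[0])
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, beta_strat)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in design]
    return [y - f for y, f in zip(beta_strat, fitted)]

# Hypothetical loadings on two PCs for five SNPs, and their GWAS effect sizes.
beta_pcs_toy = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [0.5, 2.0]]
beta_strat_toy2 = [0.30, -0.10, 0.25, 0.60, 0.00]
beta_r_toy = ancestry_regress(beta_strat_toy2, beta_pcs_toy)
```

By construction, the residuals are uncorrelated with every PC loading column included in the model, which is why the Beta_r values are expected to show reduced correlations with the Beta_PCs.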
We evaluated an additional measure of population stratification, the LD-score regression (LDSC) intercept (Bulik-Sullivan, Finucane, et al. 2015a), before and after ancestry regression (Fig. 2c, Supplementary Fig. 5). The summary statistics (with or without ancestry regression, as above) were first written into a standard format using munge_sumstats.py. Then, precomputed LD scores from 1000 Genomes Phase 3 (using only HapMap3 SNPs, excluding the MHC region) were downloaded from the LDSC website (https://github.com/bulik/ldsc) and implemented according to the guidelines given there.
Genetic Correlations
Genetic correlations of ancestry-regressed cortical structure with height (Supplementary Fig. 6) were calculated using LDSC (Bulik-Sullivan, Finucane, et al. 2015a). Summary statistics for height were acquired from previously published work (Wood et al. 2014).
SDS Implementation
SDSs (Field et al. 2016) for each SNP were downloaded from https://datadryad.org/resource/doi:10.5061/dryad.kd58f. Ancestry-regressed summary statistics (without any significance thresholding) were merged with SDS values by rsID, and we ensured that each SDS value described the trait-increasing allele (tSDS). The ancestry-regressed Z-score was calculated as the ancestry-regressed beta divided by the ancestry-regressed standard error. Merged files were then sorted by chromosomal position, and block jackknife Spearman’s correlation with 100 blocks was used to determine the relationship between ancestry-regressed Z-scores and tSDS values. The Benjamini-Hochberg false discovery rate (FDR) correction was used to correct for multiple comparisons across the 35 GWASs. Results were plotted on a representative brain surface using the R/plotly package, where the correlation values were only shown for significant associations after FDR correction (FDR adjusted P-value < 0.05; see Fig. 3a). These analyses were run in two additional ways: 1) without ancestry regression on the full ENIGMA SA GWASs (Fig. 3b); and 2) without ancestry regression in the UKBB dataset restricted to European individuals (N = 9923; Supplementary Fig. 7), which is less susceptible to the population stratification that can arise when effects are combined across many sites, as in the larger ENIGMA analysis. The same analyses were also completed for cortical thickness (Supplementary Fig. 8).
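The orientation of SDS to the trait-increasing allele and the Z-score calculation can be sketched as below. This is a minimal illustration under stated assumptions, not the study's code: the function name and argument names are hypothetical, SNPs are assumed to be biallelic and reported on the same strand in both files, and sds_allele stands for whichever allele a positive SDS value refers to in the downloaded file.

```python
def to_tsds(beta, se, effect_allele, sds, sds_allele):
    """Return (z, tsds), both expressed for the trait-increasing allele.

    beta, se      : ancestry-regressed effect size and standard error
    effect_allele : allele that beta refers to in the GWAS
    sds           : singleton density score from the SDS file
    sds_allele    : allele that a positive SDS value refers to (assumed column)
    """
    z = beta / se
    # Express SDS relative to the GWAS effect allele.
    sds_effect = sds if effect_allele == sds_allele else -sds
    # If the effect allele decreases the trait, switch to the other allele:
    # both the Z-score and the SDS change sign.
    if z < 0:
        return -z, -sds_effect
    return z, sds_effect

# Hypothetical SNP: effect allele A lowers the trait, SDS is reported for G.
z_toy, tsds_toy = to_tsds(-0.2, 0.1, "A", 1.5, "G")
```

With this orientation, a positive tSDS indicates that the trait-increasing allele has recently risen in frequency, so a positive correlation between Z-scores and tSDS values points toward selection favoring larger trait values.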
Partitioned Heritability
The contributions of each SNP set to the total SNP heritability of each trait were determined using partitioned heritability analyses as implemented in the LDSC software package (Finucane et al. 2015). Tests of heritability enrichment within human accelerated regions (HARs) (Capra et al. 2013), selective sweep regions (Peyrégne et al. 2017), Neanderthal-introgressed SNPs (Vernot and Akey 2014), and Neanderthal-depleted regions (Vernot et al. 2016) all controlled for the baselineLD v2 model from the original LDSC study (Finucane et al. 2015). Heritability enrichment in fetal brain HGEs (Reilly et al. 2015) controlled for both the baseline model and a set of fetal brain active regulatory elements (E081) from the Epigenomics Roadmap resource. Heritability enrichment in adult brain HGEs (Vermunt et al. 2016) controlled for both the baseline model and adult brain active regulatory elements (E073) from the Epigenomics Roadmap resource. Active regulatory elements were defined using chromHMM (Ernst and Kellis 2012) states from the 15-state model, including all of the following annotations: 1_TssA, 2_TssAFlnk, 6_EnhG, and 7_Enh.
Gene Annotations
Gene sets impacted by genetic variation within any HGE were derived separately for 1) global SA or 2) any of the 34 regional SA loci. We first identified all SNPs within 10 000 kb of a genome-wide significant (P-value < 5 × 10−8) GWAS locus with r2 > 0.6 in the 1000G EUR population to the index SNP, using PLINK 1.9. With this extended list of SNPs in LD with the GWAS index SNP, we looked for overlaps with HGEs defined in any human brain region or developmental time period (Reilly et al. 2015). For those genome-wide significant loci that also overlapped with HGEs, we then recorded known functional impacts on gene expression using adult brain expression quantitative trait loci (eQTLs) from the PsychENCODE dataset (Wang et al. 2018), downloaded from http://adult.psychencode.org/, selecting the dataset thresholded by the following parameters: FDR < 0.05 and expression > 0.1 FPKM in at least 10 samples. Gene biotype annotations (e.g., protein coding) were called using ENSEMBL via biomaRt.
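The overlap test between LD-expanded GWAS loci and HGE intervals can be illustrated with a short Python sketch. It assumes that HGEs are supplied as BED-style intervals (0-based start, half-open end) and that SNP positions are 1-based; the function names and all coordinates below are hypothetical and are not taken from the study's data or code.

```python
def build_index(bed_intervals):
    """Group (chrom, start, end) BED intervals by chromosome, sorted by start."""
    index = {}
    for chrom, start, end in bed_intervals:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index

def snp_in_intervals(index, chrom, pos):
    """True if a 1-based SNP position lies in any 0-based, half-open interval."""
    for start, end in index.get(chrom, []):
        if start >= pos:
            break  # intervals are sorted by start; later ones cannot contain pos
        if pos <= end:
            return True
    return False

def locus_overlaps_hge(index, ld_snps):
    """True if the index SNP or any of its LD proxies falls within an HGE."""
    return any(snp_in_intervals(index, chrom, pos) for chrom, pos in ld_snps)

# Hypothetical HGE intervals and one locus (index SNP plus two LD proxies).
hge_index = build_index([("chr6", 1000, 2000), ("chr6", 5000, 5600)])
locus_toy = [("chr6", 900), ("chr6", 3000), ("chr6", 5200)]
overlap_toy = locus_overlaps_hge(hge_index, locus_toy)
```

Treating a locus as overlapping when any LD proxy falls in an HGE corresponds to the "directly or with an LD-associated SNP" criterion; a linear scan is used here for clarity, whereas an interval tree or a dedicated genomics library would be preferable for genome-scale data.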
Pathway enrichment was performed for each gene list using the gost function from the “gprofiler2” package (version 0.1.3). Electronic gene ontology (GO) annotations (evidence code IEA) were excluded, the sources were limited to GO, KEGG, and Reactome pathways, and FDR correction was applied with a significance threshold of 0.05.
Chromatin Accessibility Quantitative Trait Locus (caQTL) Mapping at the HEY2 Locus
caQTL data were acquired from our previous work (Liang et al. 2020). Briefly, we generated chromatin accessibility profiles from primary human neural progenitor cell lines (Ndonors cultured = 73) and their differentiated neuronal progeny (Ndonors cultured = 61) using ATAC-seq (Buenrostro et al. 2013). We genotyped the same cell lines using an Illumina HumanOmni2.5 or HumanOmni2.5Exome platform and imputed to the 1000 Genomes Phase 3 reference panel. We performed a caQTL analysis separately for progenitors and neurons using a mixed effects model including a kinship matrix for SNPs within 100 kb up- and downstream of the center of each chromatin accessibility peak. Allele-specific chromatin accessibility analysis was performed in DESeq2 (Love et al. 2014) after utilizing WASP to reduce mapping bias (van de Geijn et al. 2015).
Data Visualization
Genomic loci plots were constructed using the R package “Gviz”, with evolutionary annotation data sourced from the references given in Data Availability. Brain plots (Figs 3 and 4) were made using the “plotly” package. All other plots were made in R using “ggplot2” and related packages.
Data and Code Availability
Code used to perform analyses is available at https://bitbucket.org/jasonlouisstein/enigmaevolma6/src/master/. Genomic regions that underwent rapid change on the human lineage (human accelerated regions, HARs) were combined from several sources (Pollard, Salama, Lambert, et al. 2006b; Prabhakar et al. 2006; Bird et al. 2007; Bush and Lahn 2008; Lindblad-Toh et al. 2011). BED files listing fetal brain enhancer elements not found in macaques or mice were obtained from previous work (Reilly et al. 2015). Adult brain enhancer elements arising since our last common ancestor with the macaque or chimpanzee were obtained from previous work (Vermunt et al. 2016). A refined list of SNPs gained through introgression with Neanderthals was obtained from previous work (Simonti et al. 2016). Genomic regions depleted of introgressed Neanderthal DNA were obtained from previous work (Vernot et al. 2016). Ancient selective sweep regions identified using extended lineage sorting were obtained from previous work (Peyrégne et al. 2017). A summary of all annotations is found in Supplementary Table 1.
Results
Reducing the Impact of Subtle Population Stratification
The ENIGMA consortium recently conducted a GWAS meta-analysis identifying hundreds of common variants associated with variability in SA and cortical thickness in European populations (N = 33 992 individuals from cohorts across the lifespan) (Grasby et al. 2020). Given the massive expansion of SA in modern humans and only subtle increases in cortical thickness as compared with extant mammalian species (Rakic 2009), we chose SA as the primary focus for the present study. Nevertheless, for comparative purposes, we performed a matching set of analyses for thickness associations, and these are shown in the Supplementary Materials.
Population stratification is the existence of systematic differences in allele frequencies between populations. Unbalanced representations of multiple populations in genetic association studies can lead to false-positive findings that are driven by allele frequency differences between populations rather than true association with a trait (Balding 2006). Moreover, subtle population stratification in GWAS statistics can inflate the assessment of polygenic selection impacting a trait (Berg et al. 2018; Novembre and Barton 2018; Barton et al. 2019; Sohail et al. 2019). We first tested whether subtle population stratification was influencing meta-analysis effect sizes in the cortical GWAS data, even after applying the accepted standard correction for multidimensional scaling components of ancestry prior to meta-analysis (Grasby et al. 2020). PC analysis enabled us to identify major axes of variation in allele frequency across current human populations using unrelated individuals of all ancestries from the 1000 Genomes Phase 3 data. Then, we tested the association of each SNP to the top 20 PCs (each treated as a separate trait) within the 1000 Genomes population, yielding an estimate of the degree to which each SNP contributes to population frequency differences along each principal axis of variation (Beta_PCs) (Sohail et al. 2019). Finally, using Pearson’s correlation, the Beta_PCs were correlated with the effect sizes from the GWAS meta-analysis for each trait, which may be impacted by population stratification (Beta_Strat). To assess the significance of the correlation in the context of LD, a block jackknife approach was employed to calculate the standard errors for the correlation (Kunsch 1989; Busing et al. 1999). 
Significant correlations between Beta_PCs (consistent allele frequency differences differentiating human populations) and Beta_Strat (effect sizes of variants on human brain structure from GWAS) are indicative of subtle, uncorrected population stratification (Berg et al. 2018; Sohail et al. 2019). As shown in Figure 2a, we detected significant relationships between Beta_Strat and PCs 6, 7, 8, 15, 18, and 19 for global SA, indicating subtle residual population stratification affecting the GWAS summary statistics. This analysis also showed subtle population stratification affecting summary statistics for each of the regional SAs, to varying degrees (Supplementary Fig. 1). We note that another measure of population stratification, the LDSC intercept (Bulik-Sullivan, Loh, et al. 2015b), gave values that were uniformly less than 1.05 (a commonly used threshold for ruling out stratification) for global SA and all regional SAs (Fig. 2c).
To correct for this subtle population stratification, we implemented an ancestry regression procedure based on GWAS summary statistics (Bhatia et al. 2016). The residuals (Beta_r) of a model fitting GWAS effect sizes (Beta_Strat) with the first 20 PC weightings (Beta_PC) were used as ancestry-corrected estimates of effect sizes. As expected, these ancestry-corrected estimates (Beta_r) showed much reduced correlations with PC weights (Fig. 2b, Supplementary Fig. 2). Additionally, LDSC intercepts for phenotypes after ancestry regression were generally slightly decreased, consistent with diminished effects of subtle population stratification on common variant associations to SA (Fig. 2c). Furthermore, correlations between effect size measurements after ancestry regression (Beta_r) and effect size measurements prior to ancestry regression (Beta_Strat) were all extremely high, indicating that ancestry regression did not strongly change the association statistics (correlations all >0.995; Fig. 2d). There were 343 genome-wide significant loci (P-values < 5 × 10−8; clumping r2 < 0.2) prior to ancestry regression impacting global SA or any of the regional SAs and 303 genome-wide significant loci after ancestry regression. Finally, those brain regions that showed the highest LDSC intercepts prior to ancestry regression, indicative of being most affected by subtle residual population stratification, were also those that showed the largest changes in GWAS effect sizes following ancestry regression (r = −0.334; P-value = 0.0498; Fig. 2d). The ancestry regression procedure was also carried out for cortical thickness GWAS (Supplementary Figs 3–5). For all subsequent analyses, we used the ancestry-corrected effect size estimates, standard errors, and P-values, thereby minimizing the impact of population stratification on the results of our evolutionary assessments.
Specificity of GWAS Results to Brain versus Body Size
To investigate whether GWAS results of cortical structure revealed specific influences on the brain as compared with global body size, we performed genetic correlations with a GWAS of height (Wood et al. 2014). As was noted in previous work, and shown in Supplementary Figure 6, there is a partially shared genetic basis between height and global cortical SA (rg = 0.21) (Grasby et al. 2020). However, our previous work also indicated the genetic correlations between intracranial volume controlling for height and global SA (rg = 0.81) are much stronger than genetic correlations between height and global cortical SA (rg = 0.21), which demonstrates that the genetic signal discovered in our global cortical SA GWAS is mostly brain specific and not driven entirely by body size (Grasby et al. 2020). In the association model of each of the 34 cortical regions, we control for global SA to identify specific effects on that region, so we do not expect to observe a large degree of shared genetics with body size. Indeed, we did not observe any significant (FDR < 0.05) genetic correlations between height and the 34 regional SA measurements (Supplementary Fig. 6). We performed the same analyses for thickness and only detected one region with a significant (FDR < 0.05) genetic correlation with height, inferior temporal gyrus, which was not implicated in any of our subsequent evolutionary analyses. In sum, our findings are largely brain-specific.
Evidence for Polygenic Selection Impacting Human Cortical Structure
In our evolutionary analyses, we first assessed how alleles that show evidence of recent selective pressure impact cortical SA. The SDS reveals haplotypes under recent positive/negative selection in the human genome by identifying those that harbor fewer/greater singleton variants (presumed to have arisen recently) near any given SNP (Field et al. 2016). This metric, together with data from a suitable GWAS, can be used to infer whether a trait of interest has been subject to highly polygenic selection on an evolutionarily recent timescale, over the past ~2000–3000 years. We found that alleles with evidence of positive selection over this recent timescale have a small but detectable influence on increasing global SA in the GWAS datasets (block jackknife correlation = 0.0129, FDR adjusted P-value = 0.0038; Fig. 3a). In addition, our results showed that alleles undergoing polygenic selection over the past ~2000–3000 years are associated with variation in cortical SA of individual gyrally defined brain regions (Fig. 3a). Notably, based on the cortical region-specific GWASs, there is a detectable relationship between alleles under positive polygenic selective pressure (increasing in allele frequency over time) and increased cortical SA in regions known to be important for speech/language functions (pars opercularis, part of the inferior frontal gyrus) and visual processing (lateral occipital cortex). Conversely, alleles under negative polygenic selective pressure (decreasing allele frequency over time) are associated with increased cortical SA in the pre- and postcentral gyrus, regions involved in somatosensation and movement.
We conducted secondary analyses to investigate the potential impacts of the subtle population stratification described above, because it was recently shown that SDS correlations can be highly influenced by this confounder (Berg et al. 2018; Sohail et al. 2019). Exploratory analyses of GWAS data that were uncorrected for ancestry showed a clear relationship between the LDSC intercept (a measure of the degree of population stratification) and the level of correlation between SDS and the GWAS Z-scores (cor = 0.432, P-value = 0.0096; Fig. 3b). In contrast, for analyses of GWAS data that had undergone ancestry regression, there was no significant relationship between the LDSC intercept and the level of correlation between SDS and GWAS Z-scores (cor = 0.257, P-value = 0.137; Fig. 3c), indicating that the ancestry regression procedure is effective in diminishing confounding effects of population stratification. We note that although the ancestry regression procedure attenuates the signals of polygenic selection impacting SA, several regions are nevertheless robust to such adjustment (regions with FDR < 0.05 are colored and labeled in Fig. 3a). Finally, we show that SDS correlations within the UKBB European population alone, which is less susceptible to the impacts of subtle population stratification than meta-analysis of data from consortia (Berg et al. 2018; Sohail et al. 2019), show a highly consistent SDS relationship with the ancestry-corrected meta-analysis results (cor = 0.635, P-value = 4.2 × 10−5; Supplementary Fig. 7).
After implementing the ancestry regression procedures, we also performed the evolutionary analyses on cortical thickness. We found no significant correlation between global thickness and selective pressures over the past 2000–3000 years. We detected two significant associations between recent selective pressures and cortical thickness in the precuneus and superior parietal cortex. In these regions, alleles inferred to have increased in frequency over the past 2000–3000 years are associated with increased cortical thickness (Supplementary Fig. 8). These regions have been independently proposed in prior studies as relevant for human brain evolution (Bruner et al. 2017; Pereira-Pedro et al. 2020).
Significant Heritability Enrichment within HGEs (30 Mya)
We went on to assess deeper evolutionary time scales, targeting human fetal brain enhancer elements that emerged since our last common ancestor with macaques, commonly referred to in the literature as HGEs (Reilly et al. 2015). These elements were detected by comparing post-translational modifications of histone tails indicative of enhancers and promoters (H3K27ac and H3K4me2) across humans, macaques, and mice. Using brain tissue from similar developmental time points across the three species, regulatory elements (peaks in the histone modification signals) were identified that were present in human fetal brain at 7 postconception weeks (PCW), but to a significantly lesser degree in developing macaque or mouse brain tissue (Reilly et al. 2015). The enhancer activity of HGEs has recently been experimentally tested using a massively parallel reporter assay in human neural progenitor cells (Uebbing et al. 2019). In this assay, 43% of HGEs were found to be active enhancers, providing important experimental validation that regions carrying these histone post-translational modification marks are functionally active. To understand how these HGEs influence cortical SA in modern humans, we measured their relative contribution to total SNP heritability. A trait’s SNP-based heritability is the total amount of variance in the trait (e.g., global SA) that can be attributed to common variation across the genome, and it can be estimated from GWAS summary statistics. This genome-wide SNP heritability can be partitioned into categories to measure how specific genomic regions of interest (in this case, evolutionary annotations) contribute to the heritability of the trait.
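To make the notion of heritability enrichment concrete, the sketch below computes the enrichment point estimate as defined for LDSC partitioned heritability (Finucane et al. 2015): the proportion of SNP heritability attributed to an annotation divided by the proportion of SNPs in that annotation. The function name and all numbers are hypothetical. In practice the per-annotation heritability is estimated by regression on LD scores with jackknife standard errors, which this sketch does not attempt to reproduce.

```python
def heritability_enrichment(h2_annotation, h2_total, snps_annotation, snps_total):
    """Enrichment = share of SNP heritability / share of SNPs in the annotation."""
    prop_h2 = h2_annotation / h2_total
    prop_snps = snps_annotation / snps_total
    return prop_h2 / prop_snps

# Hypothetical annotation: 1% of SNPs explaining 5% of SNP heritability.
enrichment_toy = heritability_enrichment(
    h2_annotation=0.015, h2_total=0.30,
    snps_annotation=10_000, snps_total=1_000_000,
)
```

An enrichment of 1 means the annotation contributes exactly as much heritability as expected from its share of SNPs; values above 1 indicate enrichment and values below 1 indicate depletion.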
We assessed how common variants within HGEs contribute to the SNP-based heritability of cortical SAs, testing for enrichment using LDSC partitioned heritability (Finucane et al. 2015), with FDR correction for the 35 traits (34 regions plus global SA). Furthermore, because SNPs within regulatory elements that are active during fetal development are known to make significant impacts on both intracranial volume and cortical SA (de la Torre-Ubieta et al. 2018; Grasby et al. 2020), we controlled for a global category of fetal brain active regulatory elements [derived from the Epigenomics Roadmap (Roadmap Epigenomics Consortium et al. 2015)] in the analysis. This is in addition to the 97 categories included in the baselineLD v2 model, which span a wide range of functional elements. These additional control categories make it possible to assess the contribution of evolution-focused annotations with a high degree of specificity. SNPs within HGE elements made significantly enriched contributions to cortical SA heritability for 6 out of 34 gyrally defined regions after controlling for global SA (Fig. 4a). The enrichment signal was strongest for the pars orbitalis, part of the inferior frontal gyrus (Enrichment = 14.96, FDR corrected P-value = 0.0053). As the regional GWAS results were controlled for global SA, the heritability enrichment signals detected in each region are independent of global SA. Our findings indicate that SNPs within these HGEs have effects beyond those of general fetal enhancers. Altogether, the data suggest that a key set of neural enhancer regions that became functional since our split from Old World monkeys contribute an unusually large amount to the heritability of regional cortical SA in adult humans. This influence on SA in the adult brain may be realized through common genetic variation within these HGEs impacting gene regulation during fetal brain development. 
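The FDR correction across the 35 traits can be illustrated with a minimal Benjamini-Hochberg sketch. The text here does not state which FDR procedure was applied to the enrichment P-values, so Benjamini-Hochberg (the procedure named in the Methods for the SDS analyses) is assumed; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical enrichment P-values for four traits.
p_toy = [0.01, 0.04, 0.03, 0.005]
p_adj_toy = benjamini_hochberg(p_toy)
significant_toy = [p < 0.05 for p in p_adj_toy]
```

Each adjusted value is the smallest FDR level at which that test would be declared significant, so comparing adjusted P-values with 0.05 matches the threshold used throughout.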
To assess the specificity of these findings to brain-related phenotypes, we tested heritability enrichment of inflammatory bowel disease (Jostins et al. 2012) for these evolutionary annotations, as it is a non-neural human trait with a GWAS meta-analysis of comparable sample size to the cortical structure GWASs. We found no significant enrichments across the same set of evolution-focused annotations, applying the same controls described above (Supplementary Table 3). The same partitioned heritability analysis was performed for global and regional cortical thickness, but no significant enrichment was identified (Supplementary Fig. 10).
Other Classes of Evolutionary Annotations are not Enriched for Cortical SA or Thickness Heritability
We examined the contributions of several other evolution-focused annotations (namely, HGEs active in the adult brain based on comparison to either macaque or chimpanzee, human accelerated regions, selective sweeps, and Neanderthal introgressed or depleted regions) to the heritability of cortical SA, finding no significant positive enrichment (Supplementary Fig. 9). The results suggest that these particular sets of genomic regions do not contribute more to the heritability of cortical SA than expected, given their size.
The same partitioned heritability analysis was performed for global and regional cortical thickness, with the only positive enrichment surviving FDR correction being for Neanderthal lineage depleted regions in the superior parietal region (Enrichment = 0.20, FDR-corrected P-value = 0.042, Supplementary Fig. 10).
Linking GWAS Results, Regulatory Elements, Genes, and Evolutionary History
To further understand how gene regulation is impacted by common variation within HGEs, we established which of the genome-wide significant SA loci (P-value < 5 × 10⁻⁸, including SNPs in LD at r² > 0.6) fall within HGEs and also modulate gene expression in adult cortical tissue (Wang et al. 2018) (expression quantitative trait loci, eQTLs, at FDR < 0.05). Seven of twenty-four genome-wide significant global SA loci overlapped (directly or with an LD-associated SNP) with an HGE. Four of those seven loci also have a significant eQTL, together impacting 18 protein-coding eGenes (genes whose expression is associated with the genetic variation). These eGenes included developmentally relevant genes FOXO3, ERBB3, and WNT3 (a full list is found in Supplementary Table 2). One SNP in LD with rs2802295 (rs9400239, r² = 0.715), associated with global SA, maps to a 7 PCW fetal brain HGE and is located within an intron of the FOXO3 gene on chromosome 6q21. The derived allele (G) at rs2802295 is associated with increased global cortical SA. The Human Genome Dating atlas estimates the derived allele to be 26 353 (23 115.3–29 770.7, 95% confidence interval) generations old (Albers and McVean 2020). Assuming 25 years per generation, the estimated age of the derived allele is 658 (578–744) kya. rs2802295 has also been associated with interindividual variation in general intelligence (Sniekers et al. 2017) (marked by rs2490272 index SNP, r² = 1.0 with rs2802295), with the SA increasing allele also associated with higher scores on tests of intelligence. The SNP also functions as a cortical eQTL for FOXO3 [FDR adjusted P-value = 0.0051, derived from the adult brain PsychENCODE dataset (Wang et al. 2018)]. FOXO3 encodes a transcription factor that regulates neuronal stem cell homeostasis (Renault et al. 2009), among other roles. Considering the 279 genome-wide significant regional SA loci, there were 46 that overlapped (directly or with an LD-associated SNP) with an HGE.
Out of those 46 loci, 30 also have a significant eQTL, impacting a total of 47 protein-coding eGenes. These eGenes include known genes involved in areal identity including LMO4 (Huang et al. 2009) as well as developmentally relevant transcription factors like HEY2 (a full list is found in Supplementary Table 2).
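The overlap criterion used above (a locus counts if its index SNP, or any SNP in LD with it at r² > 0.6, lies inside an HGE) can be sketched as a simple interval lookup. The coordinates, r² values, and function names below are hypothetical, the intervals are treated as half-open on a single chromosome, and this is only an illustration of the logic, not the pipeline used in the study.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Sort and merge overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def in_intervals(pos, merged):
    """True if pos falls inside one of the merged intervals."""
    starts = [s for s, _ in merged]
    k = bisect_right(starts, pos) - 1
    return k >= 0 and merged[k][0] <= pos < merged[k][1]


def locus_overlaps(index_pos, proxies, merged, r2_min=0.6):
    """Check the index SNP and every proxy SNP with r2 above r2_min."""
    candidates = [index_pos] + [pos for pos, r2 in proxies if r2 > r2_min]
    return any(in_intervals(p, merged) for p in candidates)


# Hypothetical enhancer intervals and two hypothetical loci.
hge = merge_intervals([(100, 200), (150, 250), (400, 500)])
locus_a = locus_overlaps(300, [(120, 0.7), (450, 0.3)], hge)  # proxy at 120 qualifies
locus_b = locus_overlaps(300, [(450, 0.3)], hge)              # proxy is below the r2 cutoff
print(locus_a, locus_b)
```

A real analysis would repeat this per chromosome and would take LD proxies from a reference panel.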
We focused on understanding potential mechanisms by which evolutionarily relevant genetic variation may be associated with changes in inferior frontal brain structure, given this region’s strong HGE partitioned heritability enrichment and involvement in language. For a locus significantly associated with pars opercularis SA, 26 SNPs in LD (r² > 0.6) with index SNP rs1159974 map to fetal brain HGEs, with the locus centered on the promoter of the HEY2 gene on chromosome 6q22 (Fig. 5a). Among SNPs within HGEs, the strongest cortical eQTL for HEY2 is rs10457469 [FDR adjusted P-value = 7.09 × 10⁻⁴⁴, derived from the adult brain PsychENCODE dataset (Wang et al. 2018); r² between rs10457469 and rs1159974 = 1]. HEY2 encodes a transcription factor that regulates neural progenitor proliferation during neurogenesis (Sakamoto et al. 2003).
Next, we leveraged our recently generated dataset of chromatin accessibility QTLs (caQTLs) in human cortical neural progenitors and their differentiated neuronal progeny (Liang et al. 2020) to further understand the influence of genetic variation on gene regulatory elements in the developing brain. Using this dataset, we identified a chromatin accessibility peak at the promoter of HEY2 (chr6:125746611–125750660) that overlapped with multiple HGEs and had significantly higher accessibility in progenitors than in neurons (logFC = 0.484, FDR adjusted P-value = 5.98 × 10⁻³¹; Fig. 5a–d). A SNP associated with differences in chromatin accessibility (caSNP) within this peak (rs7764016) was in high LD (r² = 0.823 calculated using the donors of the caQTL dataset; r² = 0.988 calculated in 1000G phase 3 EUR dataset) with the index SNP associated with pars opercularis SA (rs1159974). The allele linked to decreased SA and increased HEY2 gene expression (T) was associated with higher chromatin accessibility of the promoter peak in progenitors (P-value = 3.99 × 10⁻⁸) but not in neurons (P-value = 0.68; Fig. 5a,b). To provide further support for these findings, we used an alternative method for inferring allelic effects on chromatin accessibility within heterozygous donors (allele specific chromatin accessibility) at rs7764016 and found that the T allele was associated with higher chromatin accessibility in both progenitors (P-value = 1.51 × 10⁻¹⁰) and neurons (P-value = 7.45 × 10⁻⁸; Fig. 5c). Controlling for the GWAS index SNP in the progenitor caQTL analysis abolished the caQTL signal, demonstrating that these variants mark the same locus (co-localization; Fig. 5a). Overall, we suggest a causal variant (rs7764016) where the T allele is associated with increased chromatin accessibility in neural progenitors at an HGE near the promoter of HEY2, increased gene expression of HEY2, and decreased cortical SA of the pars opercularis.
Conversely, the derived allele (G) of rs7764016 is associated with reduced HEY2 expression and increased cortical SA for this region. The Human Genome Dating atlas estimates this derived allele to be 2993.7 (2660.5–3316.7, 95% confidence interval) generations old (Albers and McVean 2020). Assuming 25 years per generation, the estimated age of the derived allele is 74 (66–82) kya. These results indicate that genetically mediated alteration of the function of a regulatory element with specific activity in the developing human brain impacts adult inferior frontal cortical SA. The findings also suggest a specific gene and regulatory element involved in shaping inferior frontal gyrus cortical structure in humans, acting within a polygenic framework.
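The age conversion quoted above is simple arithmetic: generations multiplied by an assumed generation time. The sketch below reproduces it for the rs7764016 estimate using the 25-year generation time stated in the text. The function name is hypothetical, and the whole-number kya values in the text correspond to these products with the fractional part dropped.

```python
YEARS_PER_GENERATION = 25  # generation time assumed in the text


def generations_to_kya(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert an allele age in generations to thousands of years (kya)."""
    return generations * years_per_generation / 1000.0


# Point estimate and 95% confidence bounds for the derived allele, in generations.
estimate = generations_to_kya(2993.7)
lower = generations_to_kya(2660.5)
upper = generations_to_kya(3316.7)
print(f"{estimate:.1f} kya ({lower:.1f} to {upper:.1f})")
```

Because the result scales linearly with the assumed generation time, a different assumption (for example, 29 years) would shift all three values proportionally.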
Likely due to the limited number of eGenes identified, no significant (FDR < 0.05) gene ontology terms with greater than 5 intersections with HGE regulated genes were identified. Nevertheless, this analysis points to specific developmentally interesting genes regulated by HGEs which have shaped both the overall SA of the cortex and specific regions. Plots of all genome-wide significant loci that overlapped with one or more of the evolutionary annotations considered in this study are provided in Supplementary Figure 11.
Discussion
By integrating genomic annotations of primate evolutionary history with the largest available genome-wide association analysis of neuroanatomy in living populations (Grasby et al. 2020), we are able to map genetic variation shaping cortical SA across different time periods on the lineage that led to modern humans. We find evidence of polygenic selection influencing global SA over the past 2000–3000 years. Notably, the signals of polygenic selection for increased SA in parts of the inferior frontal gyrus highlight cortical regions known to be important for the production of spoken language. These results are interesting in light of a recent study that used paleoanthropology, speech biomechanics, ethnography, and historical linguistics to show that changes in human bite configuration and speech-sound inventories occurred after the Neolithic period, potentially due to advances in food-processing technologies (Blasi et al. 2019). Thus, it is plausible that the consequent increases in the diversity of sounds produced may have led to a subtle, but consistent, polygenic selection of alleles increasing cortical SA in brain regions with relevance for speech. If this hypothesis is confirmed, it would represent a novel example of gene-culture co-evolution on the human lineage (Laland et al. 2010).
Considering a deeper evolutionary timescale, our analyses also reveal that common variation found within human-gained enhancers that are active during fetal development has effects on cortical SA measured largely in adults. Of note, regions of the inferior frontal gyrus were again among the most significant cortical areas implicated by our analyses, suggesting that they have been subject to evolutionary processes at multiple distinct timepoints on the lineage that led to modern humans. These findings implicate neural progenitor proliferation and differentiation as processes critical to evolutionary expansion of cortical SA on the human lineage. Such a relationship is consistent with the radial unit hypothesis (Rakic 2009), which posits that cortical expansion is driven by an increase in the progenitor pool present during development. In addition, through the integration of multi-omic QTLs, brain structure GWAS, and evolutionarily relevant genomic annotations, we identify a regulatory element near the promoter of HEY2 with activity specific to humans where sequence variation in that locus impacts the cortical structure of the inferior frontal gyrus. We note that the decreased expression of HEY2 is associated with increased cortical SA. Work in mice links this gene to neural progenitor proliferation (Sakamoto et al. 2003). The effect of allelic regulation of HEY2 expression levels on progenitor proliferation and cortical areal size will depend on spatiotemporal patterns of HEY2 expression and interactions with other factors that are co-expressed with it in the different regions. We believe that this represents a novel approach to identify the functional impact of evolutionarily relevant regulatory elements on brain structure. Intriguingly, a rare single gene duplication of HEY2 was identified in a child with cardiac and neurodevelopmental deficits, including disrupted speech development (Jordan et al. 2015). 
Although this case report requires further support from identification and characterization of additional mutation carriers, it is consistent with our association of HEY2 promoter variants with changes in cortical SA of inferior frontal regions, as these brain areas are known to be hubs in distributed circuits involved in speech and language processing.
Our study should be carefully interpreted in light of some limitations. First, in this study, we were only able to assess a subset of genetic variation that is important for human cortical SA expansion and refinement through human evolution. Specifically, we assess alleles that are both common and polymorphic in current human populations, with a bias toward European ancestry. It is almost certain that derived alleles that are now fixed in modern human populations (and therefore not detectable in GWAS) also made substantial contributions to the shaping of cortical SA during hominid evolution. So far, relatively few of these variants are known (Sousa et al. 2017), but future studies, for example introducing fixed chimpanzee or Neanderthal alleles into human neural progenitor cells, will help to assess the impacts of this class of genetic variants (Ryu et al. 2018). Second, our study is limited to understanding selective pressures within defined historical windows from evolutionarily relevant genomic annotations (Fig. 1). Our SDS correlations suggest that polygenic selective forces impacted human cortical structure but are uninformative about the timepoint that polygenic selective forces first acted because SDS does not provide information concerning evolutionary periods preceding 3000 years ago. Third, subtle population stratification can influence the inferences of polygenic selection impacting a trait (Berg et al. 2018; Sohail et al. 2019). Prior to analyses of GWAS data, we implemented an ancestry regression procedure to correct for subtle population stratification (Bhatia et al. 2016). We show that this procedure reduces the impact of population stratification, as evaluated by two independent methods (ancestry PC correlations and LDSC intercept). However, LDSC intercepts were not uniformly at 1 (indicative of no population stratification) suggesting that some residual population stratification remains. 
Allele frequency differences across populations may not be independent of selective pressures, so our procedure may also over-correct leading to diminished evidence of selective effects. Even using our conservative ancestry regression approach, robust signals of polygenic selection were detected for cortical SA, giving us confidence in the results. Nevertheless, replication of our findings in future genetic association studies of brain structure in sufficiently large family-based populations that are less susceptible to impacts of population stratification (Spielman and Ewens 1996; Hemani et al. 2013) would allow further verification of the results presented here. Finally, future studies focusing on understanding genetic influences on behavioral and cognitive traits (language, motor skills) (Deriziotis and Fisher 2017) combined with GWAS of their neurobiological substrates (like this one) may provide a more complete picture of how shifts in genetic variation across time might yield changes in brain structure and behavior.
These findings provide new insights into a number of long-standing debates about the genetic basis for brain size and cortical SA expansion in modern humans. First, consistent with the idea that noncoding genetic variation is a large driver of human brain evolution (King and Wilson 1975), we note that genomic annotations of evolutionary history in which cortical SA heritability enrichment was observed are not derived from protein-coding variations. Instead, these come largely from noncoding intergenic or specifically regulatory sequences. Second, our work refutes prior claims that an evolutionary change in just one gene (or perhaps a small handful of genes) can fully account for the distinctive nature of the modern human brain. For example, it was previously proposed that a single genetic variant of strong effect was sufficient to cause the expansion of human brains and cognitive abilities around 50 kya (Klein 2002). Here, we not only show that variation in multiple human-gained enhancers influences cortical SA in aggregate, but also find evidence of much more recent polygenic selection acting on these traits. We clarified molecular mechanisms for one of the genes contributing to the overall polygenic signal, HEY2, by integration of multi-omic datasets. Thus, multiple alleles each of small effect have contributed to the shaping of modern human cortical SA across different evolutionary timescales, even within the last 2000–3000 years, supporting the importance of gene-culture co-evolution in explaining our biology. In sum, selective pressures over the last 30 million years of human evolution appear to have shaped different aspects of modern human brain structure, from ancient effects on broad growth patterns through to much more recent influences on a number of cortical regions, including those linked to our capacity for spoken language.
Notes
We thank Leo Zsembik and Shana Hall for initial work on polygenic selection analyses. We also thank Dr Philipp Gunz for many helpful discussions during the development of Figure 1. S.E.F. is a member of the Center for Academic Research and Training in Anthropogeny (CARTA). Conflict of Interest: D.P.H. is a full-time employee of Genentech, Inc.
Funding
Foundation of Hope (to J.L.S.); the Brain Research Foundation (to J.L.S.); the National Institutes of Health (R01 MH118349, R00 MH102357, U54 EB020403 to J.L.S.); the National Science Foundation (ACI-16449916 to J.L.S.); the Max Planck Society (to S.E.F.); APP1173025 (to K.L.G.).
Author Contributions
S.E.F. and J.L.S. originated the project and oversaw the work. A.K.T., S.E.F., and J.L.S. drafted the manuscript. A.K.T. performed partitioned heritability analyses. A.K.T., D.L., and J.L.S. identified specific genes and regulatory elements impacted by HGE. J.L.S. implemented ancestry regression. S.L., S.M.B., and J.L.S. implemented recent polygenic selection analyses. E.A.K., B.E.S., and L.K.D. provided data for performing ancestry regression and independently implemented analyses. K.L.G., N.J., J.P., L.C.C., J.B., D.P.H., P.A.L., P.M.T., S.E.M., and J.L.S. provided the SA GWAS summary statistics. All authors edited the manuscript.
Contributor Information
Amanda K Tilot, Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, 6500 AH, Netherlands; Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA 90292, USA.
Ekaterina A Khramtsova, Department of Medicine, Section of Genetic Medicine & Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA; Computational Sciences, Janssen Pharmaceuticals, Spring House, PA 19477, USA.
Dan Liang, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina, Chapel Hill, NC 27599, USA.
Katrina L Grasby, Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia.
Neda Jahanshad, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA 90292, USA.
Jodie Painter, Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia.
Lucía Colodro-Conde, Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia.
Janita Bralten, Radboud University Medical Center, 6525 XZ Nijmegen, Netherlands.
Derrek P Hibar, Genentech, Inc., South San Francisco, CA 94080, USA.
Penelope A Lind, Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia.
Siyao Liu, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina, Chapel Hill, NC 27599, USA.
Sarah M Brotman, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina, Chapel Hill, NC 27599, USA.
Paul M Thompson, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA 90292, USA.
Sarah E Medland, Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia.
Fabio Macciardi, Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92697, USA.
Barbara E Stranger, Department of Medicine, Section of Genetic Medicine & Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA; Department of Pharmacology, Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.
Lea K Davis, Department of Medicine, Division of Medical Genetics, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt University Medical Center, Vanderbilt Genetics Institute, Nashville, TN 37232, USA.
Simon E Fisher, Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, 6500 AH, Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, 6500 HB, Netherlands.
Jason L Stein, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina, Chapel Hill, NC 27599, USA.
References
- 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA et al. 2015. A global reference for human genetic variation. Nature. 526:68–74.
- Albers PK, McVean G. 2020. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol. 18:e3000586.
- Bacanu SA, Devlin B, Roeder K. 2000. The power of genomic control. Am J Hum Genet. 66:1933–1944.
- Balding DJ. 2006. A tutorial on statistical methods for population association studies. Nat Rev Genet. 7:781–791.
- Barton N, Hermisson J, Nordborg M. 2019. Why structure matters. Elife. 8:e45380. doi: 10.7554/eLife.45380.
- Barton RA, Venditti C. 2014. Rapid evolution of the cerebellum in humans and other great apes. Curr Biol. 24:2440–2444.
- Berg JJ, Harpak A, Sinnott-Armstrong N, Joergensen AM, Mostafavi H, Field Y, Boyle EA, Zhang X, Racimo F, Pritchard JK et al. 2018. Reduced signal for polygenic adaptation of height in UK Biobank. Elife. 8:e39725. doi: 10.7554/eLife.39725.
- Bhatia G, Furlotte NA, Loh P-R, Liu X, Finucane HK, Gusev A, Price A. 2016. Correcting subtle stratification in summary association statistics. bioRxiv. 10.1101/076133.
- Bird CP, Stranger BE, Liu M, Thomas DJ, Ingle CE, Beazley C, Miller W, Hurles ME, Dermitzakis ET. 2007. Fast-evolving noncoding sequences in the human genome. Genome Biol. 8:R118.
- Blasi DE, Moran S, Moisik SR, Widmer P, Dediu D, Bickel B. 2019. Human sound systems are shaped by post-Neolithic changes in bite configuration. Science, 363(6432):eaav3218. doi: 10.1126/science.aav3218.
- Bruner E, Preuss TM, Chen X, Rilling JK. 2017. Evidence for expansion of the precuneus in human evolution. Brain Struct Funct. 222:1053–1060.
- Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. 2013. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 10:1213–1218.
- Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Loh PR, ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3, Duncan L, Perry JRB, Patterson N et al. 2015a. An atlas of genetic correlations across human diseases and traits. Nat Genet. 47:1236–1241.
- Bulik-Sullivan BK, Loh PR, Finucane HK, Ripke S, Yang J, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Patterson N, Daly MJ, Price AL, Neale BM. 2015b. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 47:291–295.
- Bush EC, Lahn BT. 2008. A genome-wide screen for noncoding elements important in primate evolution. BMC Evol Biol. 8:17.
- Busing FMTA, Meijer E, Leeden RVD. 1999. Delete-m Jackknife for unequal m. Stat Comput. 9:3–8.
- Capra JA, Erwin GD, McKinsey G, Rubenstein JLR, Pollard KS. 2013. Many human accelerated regions are developmental enhancers. Philos Trans R Soc Lond B Biol Sci. 368:20130025.
- Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 4:7.
- Dale AM, Fischl B, Sereno MI. 1999. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage. 9:179–194.
- de la Torre-Ubieta L, Stein JL, Won H, Opland CK, Liang D, Lu D, Geschwind DH. 2018. The dynamic landscape of open chromatin during human cortical neurogenesis. Cell. 172:289–304.e18.
- Deriziotis P, Fisher SE. 2017. Speech and language: translating the genome. Trends Genet. 33:642–656.
- Desikan RS, Ségonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, Buckner RL, Dale AM, Maguire RP, Hyman BT et al. 2006. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 31:968–980.
- Du A, Zipkin AM, Hatala KG, Renner E, Baker JL, Bianchi S, Bernal KH, Wood BA. 2018. Pattern and process in hominin brain size evolution are scale-dependent. Proc Biol Sci. 285(1873):20172738. doi: 10.1098/rspb.2017.2738.
- Elliott LT, Sharp K, Alfaro-Almagro F, Shi S, Miller KL, Douaud G, Marchini J, Smith SM. 2018. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature. 562:210–216.
- Enard W. 2016. The molecular basis of human brain evolution. Curr Biol. 26:R1109–R1117.
- Ernst J, Kellis M. 2012. ChromHMM: automating chromatin-state discovery and characterization. Nat Methods. 9:215–216.
- Field Y, Boyle EA, Telis N, Gao Z, Gaulton KJ, Golan D, Yengo L, Rocheleau G, Froguel P, McCarthy MI et al. 2016. Detection of human adaptation during the past 2000 years. Science. 354:760–764.
- Finucane HK, Bulik-Sullivan B, Gusev A, Trynka G, Reshef Y, Loh P-R, Anttila V, Xu H, Zang C, Farh K et al. 2015. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat Genet. 47:1228–1235.
- Gazal S, Sahbatou M, Babron M-C, Génin E, Leutenegger A-L. 2015. High level of inbreeding in final phase of 1000 genomes project. Sci Rep. 5:17453.
- Geschwind DH, Rakic P. 2013. Cortical evolution: judge the brain by its cover. Neuron. 80:633–647.
- Grasby KL, Jahanshad N, Painter JN, Colodro-Conde L, Bralten J, Hibar DP, Lind PA, Pizzagalli F, Ching CRK, McMahon MAB et al. 2020. The genetic architecture of the human cerebral cortex. Science. 367(6484):eaay6690. doi: 10.1126/science.aay6690.
- Gunz P, Tilot AK, Wittfeld K, Teumer A, Shapland CY, van Erp TGM, Dannemann M, Vernot B, Neubauer S, Guadalupe T et al. 2019. Neandertal introgression sheds light on modern human endocranial globularity. Curr Biol. 29:120–127.e5.
- Hemani G, Yang J, Vinkhuyzen A, Powell JE, Willemsen G, Hottenga J-J, Abdellaoui A, Mangino M, Valdes AM, Medland SE et al. 2013. Inference of the genetic architecture underlying BMI and height with the use of 20,240 sibling pairs. Am J Hum Genet. 93:865–875.
- Henneberg M. 1988. Decrease of human skull size in the Holocene. Hum Biol. 60:395–405.
- Huang Z, Kawase-Koga Y, Zhang S, Visvader J, Toth M, Walsh CA, Sun T. 2009. Transcription factor Lmo4 defines the shape of functional areas in developing cortices and regulates sensorimotor control. Dev Biol. 327:132–142.
- Hublin J-J, Ben-Ncer A, Bailey SE, Freidline SE, Neubauer S, Skinner MM, Bergmann I, Le Cabec A, Benazzi S, Harvati K et al. 2017. New fossils from Jebel Irhoud, Morocco and the pan-African origin of Homo sapiens. Nature. 546:289–292.
- Isler K, Christopher Kirk E, Miller JMA, Albrecht GA, Gelvin BR, Martin RD. 2008. Endocranial volumes of primate species: scaling analyses using a comprehensive and reliable data set. J Hum Evol. 55:967–978.
- Jantz RL, Jantz LM. 2016. The remarkable change in Euro-American cranial shape and size. Hum Biol. 88:56–64.
- Jordan VK, Rosenfeld JA, Lalani SR, Scott DA. 2015. Duplication of HEY2 in cardiac and neurologic development. Am J Med Genet A. 167A:2145–2149.
- Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, Lee JC, Schumm LP, Sharma Y, Anderson CA et al. 2012. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 491:119–124.
- King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science. 188:107–116.
- Klein RG. 2002. The Dawn of human culture. New York, NY: Wiley.
- Klein RG. 2009. The human career. Chicago, IL: University of Chicago Press.
- Kunsch HR. 1989. The jackknife and the bootstrap for general stationary observations. Ann Stat. 17(3):1217–1241.
- Laland KN, Odling-Smee J, Myles S. 2010. How culture shaped the human genome: bringing genetics and the human sciences together. Nat Rev Genet. 11:137–148.
- Lee S-H, Wolpoff MH. 2003. The pattern of evolution in Pleistocene human brain size. Paleobiology. 29:186–196.
- Liang D, Elwell AL, Aygün N, Lafferty MJ, Krupa O, Cheek KE, Courtney KP, Yusupova M, Garrett ME, Ashley-Koch A et al. 2020. Cell-type specific effects of genetic variation on chromatin accessibility during human neuronal differentiation. bioRxiv. 10.1101/2020.01.13.904862.
- Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Washietl S, Kheradpour P, Ernst J, Jordan G, Mauceli E et al. 2011. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 478:476–482.
- Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15:550.
- Lui JH, Hansen DV, Kriegstein AR. 2011. Development and evolution of the human neocortex. Cell. 146:18–36.
- Miller IF, Barton RA, Nunn CL. 2019. Quantitative uniqueness of human brain evolution revealed through phylogenetic comparative analysis. Elife. 8:e41250. doi: 10.7554/eLife.41250.
- Mitchell C, Silver DL. 2018. Enhancing our brains: genomic mechanisms underlying cortical evolution. Semin Cell Dev Biol. 76:23–32.
- Moorjani P, Amorim CEG, Arndt PF, Przeworski M. 2016. Variation in the molecular clock of primates. Proc Natl Acad Sci USA. 113:10607–10612.
- Neubauer S, Gunz P, Schwarz U, Hublin J-J, Boesch C. 2012. Brief communication: endocranial volumes in an ontogenetic sample of chimpanzees from the Taï Forest National Park. Ivory Coast Am J Phys Anthropol. 147:319–325. [DOI] [PubMed] [Google Scholar]
- Neubauer S, Hublin J-J, Gunz P. 2018. The evolution of modern human brain shape. Sci Adv. 4:eaao5961.
- Nielsen R, Akey JM, Jakobsson M, Pritchard JK, Tishkoff S, Willerslev E. 2017. Tracing the peopling of the world through genomics. Nature. 541:302–310.
- Novembre J, Barton NH. 2018. Tread lightly interpreting polygenic tests of selection. Genetics. 208:1351–1355.
- Pääbo S. 2014. The human condition-a molecular approach. Cell. 157:216–226.
- Pereira-Pedro AS, Bruner E, Gunz P, Neubauer S. 2020. A morphometric comparison of the parietal lobe in modern humans and Neanderthals. J Hum Evol. 142:102770.
- Peyrégne S, Boyle MJ, Dannemann M, Prüfer K. 2017. Detecting ancient positive selection in humans using extended lineage sorting. Genome Res. 27:1563–1572.
- Pollard KS, Salama SR, King B, Kern AD, Dreszer T, Katzman S, Siepel A, Pedersen JS, Bejerano G, Baertsch R et al. 2006a. Forces shaping the fastest evolving regions in the human genome. PLoS Genet. 2:e168.
- Pollard KS, Salama SR, Lambert N, Lambot M-A, Coppens S, Pedersen JS, Katzman S, King B, Onodera C, Siepel A et al. 2006b. An RNA gene expressed during cortical development evolved rapidly in humans. Nature. 443:167–172.
- Prabhakar S, Noonan JP, Pääbo S, Rubin EM. 2006. Accelerated evolution of conserved noncoding sequences in humans. Science. 314:786.
- Rakic P. 2009. Evolution of the neocortex: a perspective from developmental biology. Nat Rev Neurosci. 10:724–735.
- Reilly SK, Yin J, Ayoub AE, Emera D, Leng J, Cotney J, Sarro R, Rakic P, Noonan JP. 2015. Evolutionary genomics. Evolutionary changes in promoter and enhancer activity during human corticogenesis. Science. 347:1155–1159.
- Renault VM, Rafalski VA, Morgan AA, Salih DAM, Brett JO, Webb AE, Villeda SA, Thekkat PU, Guillerey C, Denko NC et al. 2009. FoxO3 regulates neural stem cell homeostasis. Cell Stem Cell. 5:527–539.
- Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, Heravi-Moussavi A, Kheradpour P, Zhang Z, Wang J et al. 2015. Integrative analysis of 111 reference human epigenomes. Nature. 518:317–330.
- Ryu H, Inoue F, Whalen S, Williams A, Kircher M, Martin B, Alvarado B, Samee MAH, Keough K, Thomas S et al. 2018. Massively parallel dissection of human accelerated regions in human and chimpanzee neural progenitors. bioRxiv. 10.1101/256313.
- Sakamoto M, Hirata H, Ohtsuka T, Bessho Y, Kageyama R. 2003. The basic helix-loop-helix genes Hesr1/Hey1 and Hesr2/Hey2 regulate maintenance of neural precursor cells in the brain. J Biol Chem. 278:44808–44815.
- Simonti CN, Vernot B, Bastarache L, Bottinger E, Carrell DS, Chisholm RL, Crosslin DR, Hebbring SJ, Jarvik GP, Kullo IJ et al. 2016. The phenotypic legacy of admixture between modern humans and Neandertals. Science. 351:737–741.
- Sniekers S, Stringer S, Watanabe K, Jansen PR, Coleman JRI, Krapohl E, Taskesen E, Hammerschlag AR, Okbay A, Zabaneh D et al. 2017. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat Genet. 49:1107–1112.
- Sohail M, Maier RM, Ganna A, Bloemendal A, Martin AR, Turchin MC, Chiang CW, Hirschhorn J, Daly MJ, Patterson N et al. 2019. Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies. eLife. 8:e39702. doi: 10.7554/eLife.39702.
- Sousa AMM, Meyer KA, Santpere G, Gulden FO, Sestan N. 2017. Evolution of the human nervous system function, structure, and development. Cell. 170:226–247.
- Spielman RS, Ewens WJ. 1996. The TDT and other family-based tests for linkage disequilibrium and association. Am J Hum Genet. 59:983–989.
- Uebbing S, Gockley J, Reilly SK, Kocher AA, Geller E, Gandotra N, Scharfe C, Cotney J, Noonan JP. 2019. Massively parallel discovery of human-specific substitutions that alter neurodevelopmental enhancer activity. bioRxiv. 10.1101/865519.
- van de Geijn B, McVicker G, Gilad Y, Pritchard JK. 2015. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat Methods. 12:1061–1063.
- Vermunt MW, Tan SC, Castelijns B, Geeven G, Reinink P, de Bruijn E, Kondova I, Persengiev S, Netherlands Brain Bank, Bontrop R et al. 2016. Epigenomic annotation of gene regulatory alterations during evolution of the primate brain. Nat Neurosci. 19:494–503.
- Vernot B, Akey JM. 2014. Resurrecting surviving Neandertal lineages from modern human genomes. Science. 343:1017–1021.
- Vernot B, Tucci S, Kelso J, Schraiber JG, Wolf AB, Gittelman RM, Dannemann M, Grote S, McCoy RC, Norton H et al. 2016. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science. 352:235–239.
- Wang D, Liu S, Warrell J, Won H, Shi X, Navarro FCP, Clarke D, Gu M, Emani P, Yang YT et al. 2018. Comprehensive functional genomic resource and integrative model for the human brain. Science. 362(6420):eaat8464. doi: 10.1126/science.aat8464.
- Willer CJ, Li Y, Abecasis GR. 2010. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 26:2190–2191.
- Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S, Chu AY, Estrada K, Luan J, Kutalik Z et al. 2014. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet. 46:1173–1186.
Data Availability Statement
Code used to perform analyses is available at https://bitbucket.org/jasonlouisstein/enigmaevolma6/src/master/. Genomic regions that underwent rapid change on the human lineage (human accelerated regions, HARs) were combined from several sources (Pollard, Salama, Lambert, et al. 2006b; Prabhakar et al. 2006; Bird et al. 2007; Bush and Lahn 2008; Lindblad-Toh et al. 2011). BED files listing fetal brain enhancer elements not found in macaques or mice were obtained from previous work (Reilly et al. 2015). Adult brain enhancer elements arising since our last common ancestor with the macaque or chimpanzee were obtained from previous work (Vermunt et al. 2016). A refined list of SNPs gained through introgression with Neanderthals was obtained from previous work (Simonti et al. 2016). Genomic regions depleted of introgressed Neanderthal DNA were obtained from previous work (Vernot et al. 2016). Ancient selective sweep regions identified using extended lineage sorting were obtained from previous work (Peyrégne et al. 2017). A summary of all annotations is found in Supplementary Table 1.
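As an illustration of how interval annotations from several sources (such as the HAR lists above) might be combined into one non-redundant set, the following minimal Python sketch pools BED-style intervals and merges those that overlap or abut. The statement above does not specify how the sources were combined, so the union-and-merge rule, the function name `merge_intervals`, and the toy coordinates are all assumptions made for this example; the linked repository remains the authoritative record of the analysis.

```python
from typing import Iterable, List, Tuple

# BED-style interval: (chromosome, start, end), with a half-open [start, end) range.
Interval = Tuple[str, int, int]


def merge_intervals(sources: Iterable[Iterable[Interval]]) -> List[Interval]:
    """Pool intervals from several sources and merge overlapping or abutting ones.

    Hypothetical helper for illustration only; not the authors' procedure.
    """
    # Sorting by (chromosome, start, end) places candidates for merging side by side.
    pooled = sorted(iv for source in sources for iv in source)
    merged: List[Interval] = []
    for chrom, start, end in pooled:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Same chromosome and touching the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy coordinates, invented for the example (not real HAR positions).
source_a = [("chr1", 100, 200), ("chr2", 50, 60)]
source_b = [("chr1", 150, 300), ("chr1", 400, 500)]
combined = merge_intervals([source_a, source_b])
```

Under this rule, an element reported by more than one source is counted once, which avoids double-counting the same region when testing an annotation for enrichment.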