Abstract
Facial recognition plays a key role in human interactions, and there has been great interest in understanding the evolution of human abilities for individual recognition and tracking social relationships. Individual recognition requires sufficient cognitive abilities and phenotypic diversity within a population for discrimination to be possible. Despite the importance of facial recognition in humans, the evolution of facial identity has received little attention. Here we demonstrate that faces evolved to signal individual identity under negative frequency-dependent selection. Faces show elevated phenotypic variation and lower between-trait correlations compared to other traits. Regions surrounding face-associated SNPs show elevated diversity consistent with frequency-dependent selection. Genetic variation maintained by identity signaling tends to be shared across populations and, for some loci, predates the origin of Homo sapiens. Studies of human social evolution tend to emphasize cognitive adaptations but we show that social evolution has shaped patterns of human phenotypic and genetic diversity as well.
Introduction
Human societies are predicated on our abilities to individually recognize and track scores of people in our social networks1,2. The complexity of human societies is widely recognized as a major selective force that has shaped our cognitive abilities and social intelligence3–5. Indeed, humans have highly developed individual recognition abilities, and there is a rich literature examining social cognition and individual recognition in humans6–9. In particular, facial recognition plays a critical role in human social interactions, and the cognitive mechanisms underlying facial recognition have been well studied9–11. When it comes to recognition, however, cognition is only one half of the equation. Recognition also depends on phenotypic variation in a population12, without which discrimination is impossible. Compared to other animals or other parts of our bodies, we perceive human faces as being unusually variable and easy to recognize (Fig. 1). While this phenomenon can be at least partly explained by our specialization for learning human faces10, the fact that facial recognition is so important for social interactions among humans suggests that selection may have led to increased facial distinctiveness. Despite the striking differences among human faces and the importance of facial identity for human society, the evolution of individuality in human faces has yet to be explored.
Figure 1. Humans have much more individually distinctive faces than many animals.

(A) Human populations show extensive variability in facial morphology that is used for individual recognition. Patterns of elevated variability are even maintained in more genetically homogeneous populations such as the Finnish, as demonstrated by the portraits of six male soldiers. (B) In contrast to the variability present in human faces, many animals such as king penguins have much more uniform appearances. While king penguins are not known to visually recognize individuals, they do have highly distinctive vocalizations that are used for individual recognition. (Photo credits: SA-kuva, Finnish Armed Forces photograph; Wikimedia Commons)
Theoretical and empirical studies have proposed multiple non-mutually exclusive negative frequency-dependent selection (NFDS) pressures that may maintain elevated phenotypic diversity in natural populations. These include apostatic selection, in which predation is lower on rare prey13,14, mating preferences for novel phenotypes15 and selection to be recognizable16. For example, both apostatic selection and frequency-dependent mate preferences have been shown to contribute to the maintenance of the highly variable and heritable male coloration patterns in the guppy (Poecilia reticulata)14,17. Frequency-dependent attractiveness has been proposed as a mechanism to explain patterns of hair and eye color diversity in Europeans18, and recent tests have found empirical support for frequency-dependent attractiveness of beards in human males19. Whereas elevated variation in guppy coloration is limited to adult males17, human facial individuality is not limited to a particular age or sex-class, suggesting that frequency-dependent mate preferences are unlikely to be the sole or major driver of elevated diversity in human facial patterns. Indeed, individual recognition is important in humans from cradle to grave across multiple contexts. Selection to be individually recognizable in a variety of scenarios is therefore a prime hypothesis to explain the high diversity in human facial appearance12.
Individual recognition will only evolve when it is beneficial to identify other individuals12. Whether or not individuals benefit by being identifiable and easily recognized raises a different question. Traits used for individual recognition are expected to evolve as either identity cues or as identity signals depending on the benefits of being recognized12,20,21. Identity cues are traits that allow discrimination but have not evolved for the purpose of recognition22 and are not expected to show signatures of adaptive evolution. Cues are essentially inadvertent phenotypic variation that other individuals can use for discrimination23,24. For example, today human fingerprints are used for forensic identification though they have not evolved to facilitate recognition. As is the case for fingerprints, identity cues do not necessarily benefit individuals that are being recognized and may in fact harm them. In contrast, identity signals are traits that have been selected to facilitate individual recognition and as a result show elevated variation within populations16,25. Individual recognition can rely on cues alone, but if individuals benefit from being recognized then selection is expected to favor individuals that advertise their identity with distinctive phenotypes12,20,21. Identity signals evolve when being confused with others is costly due to misdirected behaviors including aggression26, mating opportunities27, parental care28, etc. Comparative and experimental evidence for identity signaling leading to increased phenotypic diversity has been documented in multiple taxa25,26,28,29 but has not been investigated in humans.
Individual recognition is facilitated when individuals display divergent trait values and novel combinations of traits11,21 leading to disruptive selection on multiple traits and the evolution of independent developmental pathways21,26. Three key predictions of the identity signaling hypothesis are (i) facial characteristics should be more variable than other visible traits not used for recognition, (ii) face traits are expected to show lower inter-trait correlations compared to other morphological traits, and (iii) loci underlying normal facial variation are expected to show elevated genetic diversity consistent with NFDS favoring rare phenotypes. Loci contributing to identity-signaling traits are expected to show evidence of NFDS, such as an excess of intermediate-frequency alleles and elevated diversity when controlling for divergence30,31. Selection for identity signaling on any one facial trait is likely to be relatively weak as there are numerous traits that contribute to individual identity. Due to the complex genetic architecture of facial variation32–34 we expect a modest signature of elevated diversity in genomic regions underlying identity signals as a whole35,36, though there may be stronger evidence for NFDS at a subset of loci. While it is plausible that identity cues could also show elevated phenotypic variation and reduced phenotypic correlations as a result of relaxed selection or stochastic developmental processes, the loci underlying identity cues are expected to evolve neutrally. Thus even a weak signature of NFDS, as may be expected for a complex quantitative trait such as facial identity, would reject the cue hypothesis and provide support for identity signaling.
Consistent with the predictions of the identity-signaling hypothesis, we find elevated phenotypic variation and reduced levels of inter-trait correlations in human faces compared to non-facial morphology. Furthermore, we find population genomic support for the identity-signaling hypothesis. Loci associated with variation in normal facial morphology show elevated nucleotide diversity compared to loci associated with variation in height or presumably neutral, intergenic variation. The loci with the strongest evidence of selection tend to be shared across continents, suggesting that selection on at least some loci associated with identity signaling is likely to be old. Indeed, by comparing sequences of modern humans to those of Neanderthals and Denisovans, we demonstrate that variation at some loci associated with facial morphology predates the origin of the human species. While studies of human social evolution have tended to emphasize its effect on cognition, our results suggest that social evolution has also played an important role in shaping human morphology.
Results
Morphological evidence
Morphological comparisons between faces and other traits are consistent with the predictions of identity signaling. We tested these predictions using data for 18 facial and 46 non-facial linear distance measures from the ANSUR anthropometric study of US army personnel37 for females and males of African and European American ancestry respectively (Supplementary Tables 1-2). Linear distances between facial landmarks have higher coefficients of variation than linear measurements of body traits in every group (Fig. 2a, Mann-Whitney U (MWU) test, n = 18 facial and 46 non-facial measures, P < 0.03 for all comparisons). Without selection for uncoupled development, traits within individuals are generally correlated, as larger individuals tend to have larger traits38,39. However, facial traits show lower inter-trait correlation coefficients than body traits in all four groups as predicted by the identity-signaling hypothesis (Fig. 2b, MWU test, n = 153 facial correlations and 1035 non-facial correlations, P < 0.001 for all comparisons). Indeed the vast majority of the body measures are correlated (Percentage of significant Pearson's correlations between traits, AAF = 95.17%, AAM = 99.14%, EAF = 96.62%, EAM = 99.81%, n = 1035 pairwise comparisons), though many fewer facial traits are correlated with each other (AAF = 63.4%, AAM = 73.9%, EAF = 47.1%, EAM = 84.2%, n = 153 pairwise comparisons, Z-ratio < -11 and P < 0.002 for all comparisons). Uncorrelated values for face traits increase the diversity of facial phenotypes, facilitating recognition. For example, the breadth and length of hands are correlated (Fig. 2c, r² = 0.30, P < 0.0001), though the breadth and length of noses are not (Fig. 2d, r² = 0.002, P = 0.06). These results add to previous findings that humans have among the lowest craniofacial morphological integration among primates and mammals more broadly40.
Figure 2. Morphological evidence that human faces have evolved to signal individual identity.
Morphological comparisons of facial features to other aspects of body morphology are consistent with selection for identity signals. (A) In all four groups examined facial traits have higher coefficients of variation than other body traits (P < 0.03 for all comparisons). (B) Facial traits as a group show lower inter-trait correlations than non-facial traits in all four populations examined (P < 0.001 for all comparisons). (C) For most traits, such as hands, larger individuals have larger traits such that the width and length of an individual's hand are correlated. (D) In contrast to hands, the width and length of the nose are not correlated. Box-plots show median and 25th and 75th percentiles (N = 181 African American females; 457 African American males; 204 European American females; 1168 European American males). The scatterplots show the trait values for European American male service members measured in the ANSUR II dataset. Best-fit lines are shown for significant regressions.
Population genomic evidence
Using data from the 1000 Genomes Project41, we tested for a signature of NFDS in genomic regions surrounding SNPs previously associated with differences in normal facial morphology in Europeans32,33. We compared the distribution of population genetic summary statistics calculated around face SNPs to the distribution of summary statistics for 5000 putatively neutral intergenic regions as identified by the Neutral Region Explorer42. Here we present an analysis based on 2kb windows, though the patterns of diversity reported here are robust to a range of window sizes (Supplementary Fig. 1). Elevated diversity in face regions relative to the intergenic regions would be consistent with the predictions of NFDS. However, it is possible that morphological traits in general could show elevated diversity in comparison to neutral regions, so we also compared face regions to regions surrounding SNPs associated with height43, another complex morphological trait, as an additional control.
Patterns of diversity surrounding face-associated SNPs are consistent with NFDS on complex quantitative traits as predicted under the identity-signaling hypothesis. Here we present the values for the Finnish population (Fig. 3) though broadly similar overall patterns are found for other 1000 Genomes population samples from Europe and Africa and to a weaker extent Asia (Supplementary Fig. 2-9). The folded site-frequency spectrum shows that regions surrounding face-associated SNPs have an excess of intermediate frequency variants compared to the two sets of control loci (Fig. 3a, MWU, P < 0.0001 for both comparisons). Additionally, the distribution of summary statistics for faces differs from distributions found for height or intergenic regions consistent with NFDS on facial traits (Fig. 3b-e, P < 0.05 for all comparisons). One possible confounding factor is that intermediate-frequency SNPs are overrepresented in genotyping panels44 and thus more often associated with traits in genome-wide association studies, so elevated diversity could conceivably be confounded by ascertainment biases. Two lines of evidence argue against this. First, the association studies and population genomic analyses were conducted in different samples of Europeans, and the patterns of elevated diversity around face SNPs are also found in African and Asian populations. Second, in the Finnish examined here the minor allele frequency of the focal SNPs is actually lower for faces than for height (MWU, P = 0.028, Supplementary Fig. 10), suggesting that potential biases in association studies cannot explain the elevated patterns of diversity surrounding face-associated SNPs. The combination of elevated morphological and genetic diversity associated with human faces rejects a neutral explanation for human facial individuality and instead supports the hypothesis that human facial diversity is the product of selection for identity signaling in humans.
Figure 3. Population genomic evidence that human faces have evolved to signal individual identity.
Genomic regions associated with facial morphology show evidence of selection for identity signaling in the Finnish. (A) Face regions (N = 59) have elevated levels of intermediate-frequency alleles compared to neutral regions (N = 5000) or genomic regions associated with variation in height (N = 365). The bar graph shows the proportion of SNPs within each allele-frequency bin. (B) Additionally, face regions have elevated levels of π, (C) even after controlling for differing rates of divergence among loci. (D) Similarly, face regions show an elevated number of segregating sites, measured as Watterson's θ. (E) Tajima's D is higher in facial regions than neutral regions while (F) Fu and Li's D* is higher in facial regions than height regions. Box-plots show medians and 25th and 75th percentiles. Whiskers show the 5th and 95th percentiles. Outliers are not shown so that the main distributions can be viewed at larger size. The P-values shown are from one-tailed Mann-Whitney U tests. Note that sample sizes are reduced for tests corrected for divergence, as alignments were not available for all regions considered (N = 58 face loci, 356 height loci, 4873 neutral loci).
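As a rough illustration of how the summary statistics named in Figure 3 are defined, the following sketch computes nucleotide diversity (π), Watterson's θ, Tajima's D and a folded site-frequency spectrum for a single window. It is not the pipeline used in this study. The input format (a list of alternate-allele counts, one per segregating site, among n sampled chromosomes) and all function names are assumptions made for illustration, and the formulas are the standard textbook estimators.

```python
import math
from collections import Counter


def harmonic_terms(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1..n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    return a1, a2


def nucleotide_diversity(allele_counts, n):
    """Pi for the window: mean pairwise differences summed over sites.

    allele_counts holds one count k (0 < k < n) per segregating site.
    """
    return sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)


def wattersons_theta(allele_counts, n):
    """Watterson's theta for the window: segregating sites divided by a1."""
    a1, _ = harmonic_terms(n)
    return len(allele_counts) / a1


def tajimas_d(allele_counts, n):
    """Tajima's D; returns NaN when the variance term is not positive."""
    s = len(allele_counts)
    a1, a2 = harmonic_terms(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return float("nan")
    pi = nucleotide_diversity(allele_counts, n)
    return (pi - s / a1) / math.sqrt(variance)


def folded_sfs(allele_counts, n):
    """Count sites by minor-allele count (1 .. n // 2)."""
    return Counter(min(k, n - k) for k in allele_counts)
```

Dividing π and θ by the window length (for example 2000 for a 2kb window) gives per-site values. In this sketch, a window dominated by intermediate-frequency variants yields a positive Tajima's D, whereas a window of singletons yields a negative value.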
Evolutionary dynamics of identity signaling loci
Due to the lack of data on SNPs associated with signaling traits in animals, population genomic methods have not previously been used to empirically explore the evolutionary dynamics of signaling traits. The present dataset on identity signals in humans, however, provides an unprecedented opportunity to examine the history of selection on signaling traits used in social communication. Selection for identity signaling is expected to act on faces in all populations though it need not occur at the same loci. Conceivably, selection may act on the same loci across populations; different populations could maintain diversity at distinct loci underlying the same trait; or selection may act on loci underlying different traits in each population depending on the dynamics of selection as human populations expanded across the globe. We explored this question by assessing whether loci showing elevated diversity, where both π corrected for divergence with macaques and Tajima's D fall in the 95th percentile, were shared across populations. Indeed a disproportionate number of loci show evidence of elevated diversity in at least one population for faces (9/58) compared to height (6/356; χ² = 23.5, P < 0.0001) and intergenic regions (57/4873; χ² = 78.8, P < 0.0001, Fig. 4). Furthermore, the regions that show elevated diversity for faces are more consistent across continents than expected; 5 of 9 regions show elevated diversity on at least two continents compared to 1 of 57 intergenic regions (χ² = 27.2, P < 0.0002, Fig. 4). All 9 regions identified as having elevated diversity in at least one population have high levels of π/divergence (>90th percentile) in both African populations examined here (Supplementary Table 3). Additionally, analyses of the haplotype networks for the 9 regions show greater allelic diversity in African populations, with European and Asian populations carrying a subset of the African haplotypes (Supplementary Fig. 11-19).
These patterns of diversity and haplotype sharing across populations are consistent with an African origin of allelic variation at identity signaling loci predating human migration out of Africa. Population differentiation in facial morphology appears in part to be the result of differential loss of diversity in non-African populations, consistent with reduced morphological variation in populations with increased distance to Africa45.
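The contingency comparisons reported above can be checked directly from the counts given in the text. The text does not state which form of the χ² test was used, so as an assumption this sketch uses a Pearson χ² on 2×2 tables with an optional Yates continuity correction; the function name is hypothetical. Under that assumption, the first two reported values are reproduced with the correction and the third without it.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction: shrink |ad - bc| by n / 2, floored at 0.
        diff = max(0.0, diff - n / 2.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff * diff / denominator


# Counts taken from the text: loci with elevated diversity vs. the remainder.
face_vs_height = chi_square_2x2(9, 58 - 9, 6, 356 - 6, yates=True)
face_vs_neutral = chi_square_2x2(9, 58 - 9, 57, 4873 - 57, yates=True)
# Regions elevated on at least two continents vs. the remainder.
shared_across_continents = chi_square_2x2(5, 9 - 5, 1, 57 - 1, yates=False)

print(round(face_vs_height, 1),
      round(face_vs_neutral, 1),
      round(shared_across_continents, 1))  # prints 23.5 78.8 27.2
```

Several expected cell counts in these tables are small (for example, fewer than one face locus expected in the face versus intergenic comparison), so an exact test would be a reasonable cross-check.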
Figure 4. Patterns of elevated diversity in face-associated loci across populations.
The face-associated loci with elevated diversity consistent with selection for identity signaling tend to be shared across populations both within and between continents. The heatmap highlights loci on the extreme ends of the distributions for π (controlling for divergence with macaque) and Tajima's D. Columns correspond to populations and rows correspond to individual loci. Squares that are fully filled in with dark blue designate loci with evidence of elevated diversity (>95th percentile for both summary statistics). A greater number of loci show evidence of elevated diversity in at least one population for faces (9/58) compared to height (6/356; χ² = 23.5, P < 0.0001) and intergenic regions (57/4873; χ² = 78.8, P < 0.0001). Additionally, patterns of elevated diversity are more consistently shared across populations for face-associated regions compared to the neutral regions (5/9 face regions versus 1/57 neutral regions, χ² = 27.2, P < 0.0002). To facilitate visual comparison, representative subsamples of height and intergenic regions are shown here. Subsamples were generated by randomly selecting loci from the height and neutral lists, which we confirmed did not deviate from the distribution of the total sample. All analyses reported were conducted on the full datasets.
Here we present two examples highlighting the complex evolutionary trajectories of loci involved in identity signaling in humans. The examples illustrate (i) that genetic variation underlying identity signals tends to be old and of African origin and (ii) that phenotypic divergence between non-African populations is partly related to the differential loss of ancestral variation (Fig. 5). Variants associated with the distance between the chin and bridge of the nose33 are found within an intron of TMCT2. A sliding window analysis of the region demonstrates that there is elevated diversity and reduced Fst consistent with sustained selection for identity signaling that is common to the three continental groups or occurred in their ancestral population (Fig. 5a). In contrast to the shared diversity at TMCT2, intronic variants of SDK1 associated with nasal morphology32 show a clear reduction of nucleotide diversity in Asian populations compared to the elevated diversity found in African populations. The reduction in diversity in Asian populations and increased Fst between Asian and African populations could be the result of either loss of diversity during population bottlenecks or directional selection on nasal morphology in Asian populations (Fig. 5b). For both loci we constructed gene trees for the 5kb window with the highest level of nucleotide diversity for 30 modern human sequences as well as the Neanderthal, Denisovan and chimpanzee sequences (Fig. 5c-d). Both trees provide further evidence for the ancient origins of loci under selection for identity signaling as archaic hominin species are nested within modern human diversity. This result suggests that selection on some loci associated with identity signaling predates the origin of Homo sapiens and the emergence of modern facial morphology.
Figure 5. Evolutionary history of example face-associated loci.
Patterns of genetic diversity associated with facial morphology at TMCT2 and SDK1. (A) At TMCT2 variation is largely shared across continents, while (B) at SDK1 variation has been lost mainly in the CHB population. The sliding window analyses (A-B) show nucleotide diversity for three 1000 Genomes populations representing Europe (FIN), Asia (CHB) and Africa (YRI) respectively for 5kb windows at 1kb sliding intervals. Nucleotide diversity is shown with solid lines while Fst is represented by dotted lines. Color of the lines represents the population examined for π (FIN = blue, CHB = black, YRI = red) or the two-population Fst comparisons (FIN - YRI = red, CHB - YRI = black, FIN - CHB = blue). The locations of SNPs associated with facial morphology are shown as blue circles except for the focal SNP included in other window-based analyses, which is denoted with a red circle. The UCSC Genome Browser tracks showing the locations of exons and three ENCODE regulatory regions, which show regions likely associated with genomic features involved in gene regulation, are shown below the sliding window. (C-D) Maximum-likelihood trees show the relationships among 10 modern humans sampled from each of three populations (FIN, CHB and YRI) as well as sequences from Denisovan, Neanderthal and chimpanzee. The modern human sequences are colored according to their population of origin (FIN = blue, CHB = black, YRI = red). The region analyzed was the 5kb window with the highest nucleotide diversity as determined by the sliding window analysis. Note that in both cases, the sequences for archaic hominins are nested within modern human diversity, indicating the origin of the major haplogroups predates the evolution of Homo sapiens.
Discussion
Here we have presented both morphological and population genomic evidence consistent with the hypothesis that selection for individual identity signals has shaped patterns of human facial diversity. Though the evidence for selection at individual loci is modest, as expected for molecular evolution of polygenic traits35, the combination of morphological and genomic data from multiple populations clearly rejects the identity cue hypothesis and provides compelling evidence consistent with the idea that selection for individual identity signaling has shaped patterns of facial morphology in humans. Provided that the variation used in identity signals is not developmentally costly to produce or maintain, even a small selective advantage of individuality is expected to give rise to elevated phenotypic diversity when confusion is costly21. Previous studies have shown that being confused with others may be costly in a range of circumstances including within social hierarchies in Polistes wasps26, sexual selection in house mice27, and parent-offspring interactions in cliff swallows28. It is unknown at present which aspects of human sociality have been the most important sources of selection for identity signaling, though it is likely that multiple facets of social interactions contribute to selection for identity signals. Individual recognition and discrimination among individuals play a role in shaping important human behaviors including kin recognition46, investment in offspring47, and cooperation48. It is likely that many social contexts favor identity signaling in humans, so it will be important for future research to explore the relative benefits of individuality across many social contexts and developmental stages in humans.
In addition to selection for identity signaling it is possible that other frequency-dependent processes such as preferences for mates with rare or novel features could have played a role in shaping human diversity. For example, a recent study showed frequency-dependent effects on the attractiveness of male facial hair-styles19. Preferences for individuals with rare phenotypes have also been shown in other animals, such as guppies where rare phenotypes confer a survival advantage due to reduced predation14,17. In humans, females tend to advertise physical attractiveness to mates more prominently than do males, who tend to advertise resources or performance ability49,50. Thus if frequency-dependent mate preferences were the major driver in determining facial identity, then females might be expected to show elevated levels of individuality compared to males just as female preferences for novel individuals contribute to the elevated color pattern variation seen in male guppies17. However, in humans both males and females show elevated individuality in faces compared to other external morphology (Fig. 2), suggesting that mate preferences alone cannot explain the patterns observed here. Similarly, mate preferences might be expected to drive variation only in adults as is observed in guppies17, yet distinctive facial morphology is seen at all life stages in humans. To the extent that frequency-dependent mate preferences play a role in shaping patterns of facial individuality it is likely that mating preferences and identity signaling would have a positive feedback. If individual distinctiveness is beneficial in non-sexual contexts, preferences for mates with rare phenotypes may then also provide an additional benefit to distinctiveness15.
Finally, our data do not preclude potential directional or stabilizing selection pressures that may arise from other potential mating preferences51 or climate52 on particular features of human facial morphology, though directional and stabilizing selection do not predict elevated genetic diversity within populations at the associated loci and so cannot explain the patterns of elevated genetic variation we have documented here.
It is important to note that facial recognition is widespread in primates53 and identity signaling is unlikely to be limited to human facial morphology, though the loci under selection may vary considerably across species. This may be especially true for humans, which have undergone considerable directional evolution of facial form during the course of hominin evolution54. While faces are a key feature used in human social recognition, other traits such as our voices also contribute to recognition and may have also experienced selection for identity signaling. Additionally, the strength of selection for particular identity signaling traits may have changed over time in modern humans as cultural practices gave rise to individually distinctive clothing and hairstyles, which provide additional cues to identity. Traditional treatments of social selection in human evolution have emphasized the potential role for social interactions in shaping our cognitive abilities3, though our work demonstrates that social selection has shaped our morphology as well to facilitate social recognition. Importantly, our work draws a link between social interactions and the maintenance of genetic variation underlying traits used in social recognition. Social recognition is found across disparate animal taxa, suggesting that selection for identity signaling is likely to be a common mechanism generating phenotypic variation and maintaining genetic variation.
Methods
Morphological Analyses
We examined morphological relationships among body parts and facial features using published anthropometric datasets. We focused our analyses on the ANSUR II dataset because it provides a large, consistent database of individual-level facial and body measurements. We analyzed the linear anthropometric measurements (Supplementary Tables 1-2). In our analysis we considered four groups of service members based on their sex and racial identity: African American females (n = 181, mean height = 64.29 ± 0.17 inches), African American males (n = 457, mean height = 69.12 ± 0.13 inches), European American females (n = 204, mean height = 64.27 ± 0.15 inches), and European American males (n = 1168, mean height = 69.32 ± 0.08 inches). Compared to the general civilian population, the individuals measured in the ANSUR II dataset tend to be taller and have lower levels of body fat55. Neither of these factors should influence our results or conclusions because our comparisons use facial and body measurements from the same individuals. Identity signaling predicts that traits used for recognition will have greater variance and be less correlated with each other compared to non-recognition traits in the same group of individuals.
Using the ANSUR II dataset we tested two predictions of the identity-signaling hypothesis. First, we considered the levels of variation in each trait by calculating the coefficient of variation, dividing the standard deviation of each trait by the mean. Coefficients of variation provide a scale-free method for comparing variation across samples that differ in average size as is the case for human morphological data. Second, we considered the correlations among traits by calculating the inter-trait Pearson's correlations for all pairwise combinations of traits within each class of traits. To compare the distribution of correlation coefficients between bodies and faces we recorded the correlation coefficients significant at the P < 0.05 level. For any pair of traits which did not show a significant correlation at P < 0.05, we recorded the correlation as 0. Pearson's correlation test is sensitive to the sample size such that correlations are more likely to be significant when larger samples are used. Therefore, comparisons between the different groups considered should be made with caution because of differences in sample size. For example, differences in the percentage of significant pairwise comparisons between males and females likely reflect differences in sample sizes. Within a group, however, the same individuals were measured for both facial and body traits, providing a direct comparison of the relative degree of correlation among traits.
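The two morphological measures described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names and input layout (a dictionary mapping trait names to equal-length lists of measurements) are hypothetical, the sample standard deviation is assumed, and the significance of each correlation is approximated with a large-sample normal approximation to the usual t-test so that the example needs only the standard library.

```python
import math
from itertools import combinations
from statistics import NormalDist, mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample SD is an assumption)."""
    return stdev(values) / mean(values)


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P for r, using a normal approximation to the t statistic."""
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def thresholded_correlations(traits, alpha=0.05):
    """Pairwise r within one trait class; non-significant pairs recorded as 0."""
    recorded = {}
    for (name_a, a), (name_b, b) in combinations(traits.items(), 2):
        r = pearson_r(a, b)
        significant = approx_p_value(r, len(a)) < alpha
        recorded[(name_a, name_b)] = r if significant else 0.0
    return recorded
```

Running `thresholded_correlations` separately on the facial and non-facial traits of one group gives two lists of recorded coefficients, which can then be compared with a rank-based test such as the Mann-Whitney U test.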
Selection of Genomic Regions for Analysis
Face-associated SNPs were taken from two recent genome-wide association studies of normal facial morphology. Paternoster et al33 conducted a discovery phase association study in which they examined the relationship between facial characteristics and more than 2.5 million imputed SNPs in a sample of 2,185 15-year-olds from the Avon Longitudinal Study of Parents and Children (ALSPAC)56. Only subjects who genetically clustered with the CEU HapMap population were included in their analysis. The study identified 30 loci associated with facial morphology at P < 5 × 10⁻⁷, which we examined in our study. Liu et al32 examined the relationship between facial morphology and more than 2.5 million SNPs in a discovery phase association study of 5,388 adults. The samples in the Liu et al32 study came from individuals of European ancestry living in the Netherlands, Australia, Canada, Germany and the United Kingdom. They identified 29 loci associated with facial morphology at P < 5 × 10⁻⁷, which we examined in this study. None of the SNPs identified by the two studies overlapped, providing a total of 59 loci for investigation. In both studies, multiple linked SNPs were often identified in association with a particular phenotype. When more than one SNP was associated with a trait, we chose the SNP with the smallest P value within a 1 Mb region of a chromosome from the association study. The SNPs identified for further examination from the two studies include one from each of 59 loci distributed throughout the autosomes. The SNPs are largely intergenic (95%), though a few occur within introns (5%). None were located in coding regions.
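One way to read the rule of keeping the SNP with the smallest P value within a 1 Mb region is as a greedy filter. The sketch below is illustrative only: the `Snp` record, its field names and the `pick_lead_snps` function are hypothetical, and the greedy ordering is an assumption about how ties between nearby SNPs would be resolved.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str       # hypothetical identifier
    chrom: str      # chromosome name
    pos: int        # base-pair coordinate
    p_value: float  # association P value from the GWAS


def pick_lead_snps(snps, min_distance=1_000_000):
    """Keep one SNP per region, preferring the smallest P value.

    SNPs are visited from most to least significant; a SNP is kept only if
    no already-kept SNP lies within min_distance on the same chromosome.
    """
    kept = []
    for snp in sorted(snps, key=lambda s: s.p_value):
        clash = any(
            k.chrom == snp.chrom and abs(k.pos - snp.pos) < min_distance
            for k in kept
        )
        if not clash:
            kept.append(snp)
    return sorted(kept, key=lambda s: (s.chrom, s.pos))
```

Changing `min_distance` adapts the same filter to other spacing rules.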
We compared face-associated genomic regions to two sets of control regions. First, we examined SNPs associated with height taken from the GWAS Catalog of the National Human Genome Research Institute (www.genome.gov) on April 25, 2013. To prevent multiple sampling of any region, we only considered SNPs that were separated by more than 4 kb. In instances where multiple nearby SNPs had been associated with height, we chose the SNP with the smallest P value reported in the GWAS Catalog. We excluded six SNPs associated with height that fall within the HLA region, though including these SNPs in our analyses does not alter our pattern of results. This produced a total of 365 loci associated with variation in height. Like faces, height is a composite character that depends on the morphology of numerous different bones. Additionally, both height and facial morphology have complex genetic bases, with many loci of small effect contributing to overall phenotypic variation43. SNPs associated with height are predominantly located in intronic regions (54%) and intergenic regions (36%), with a smaller percentage found near the 3′ and 5′ ends of genes (6%) or in exons (3%). Second, we considered the genome-wide patterns of diversity by examining 5000 2 kb intergenic regions. We identified putatively neutral intergenic regions in Europeans using the Neutral Region Explorer webserver42. The same set of intergenic regions was used for all populations.
We analyzed regions surrounding the SNPs identified by the association studies at a set window size. The causative mutations underlying the traits are not known, though they are likely to be located near the SNPs identified through genome-wide association studies57. The a priori best choice for a window size is not clear, though the patterns of elevated nucleotide diversity we observe are seen over a range of window sizes (Supplementary Fig. 1). We chose to analyze 1 kb both upstream and downstream of the SNPs, providing windows of 2 kb. Smaller window sizes show markedly increased variance in the summary statistics across loci (Supplementary Fig. 1), though this variance levels off at window sizes of 2 kb or greater.
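As a small illustration of the windowing step, the following Python sketch builds a window of 1 kb on either side of a focal SNP and keeps the variants that fall inside it. The function names are hypothetical, and the use of inclusive coordinates (which makes the window 2,001 bp including the focal site) is an assumption.

```python
def flanking_window(pos, flank=1000):
    """Return inclusive (start, end) covering `flank` bp on each side of pos."""
    return max(1, pos - flank), pos + flank


def variants_in_window(variant_positions, window):
    """Keep the variant positions that fall inside the inclusive window."""
    start, end = window
    return [p for p in variant_positions if start <= p <= end]
```

Varying `flank` reproduces the comparison across window sizes described above.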
Summary Statistics
We calculated summary statistics for each population using binary SNP and indel data from the 1000 Genomes Project Phase 1 variants. Nine non-admixed populations originating from Europe (CEU: Utah residents with Northern and Western European Ancestry; GBR: British from England and Scotland; FIN: Finnish from Finland; TSI: Toscani from Italia), Asia (CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese; JPT: Japanese from Tokyo, Japan) and Africa (LWK: Luhya from Webuye, Kenya; YRI: Yoruba from Ibadan, Nigeria) were considered in our study. We downloaded the population data to Galaxy58 using the Table Browser function of the UCSC Genome Browser. We filtered the data based on the sets of 2 kb windows for face, height and intergenic regions to produce three files for each population, which we subsequently examined using custom macros in Excel.
We used folded site frequency spectra to examine the distribution of minor allele frequencies among SNPs found within each of the demarcated regions. The expected distribution of allele frequencies at loci underlying a polygenic trait under negative frequency-dependent selection is unclear and will depend on the exact form of selection and the genetic architecture of the trait59. Nonetheless, frequency-dependent selection is expected to maintain alleles in a population, on average, longer than expected for neutral alleles60. Thus, the distribution of allele frequencies should differ from that expected in a stationary population at mutation-drift equilibrium. In particular, we expect fewer rare alleles under a scenario of frequency-dependent selection. Spectra were compared using the raw counts of SNPs with each minor allele frequency using a Mann-Whitney U test. To graph the folded site frequency spectra we binned data into ranges of minor allele frequencies.
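The construction of a folded site frequency spectrum can be sketched as follows. This is an illustrative Python example rather than the Excel macros used in the study; the function names and the bin edges are assumptions.

```python
from collections import Counter


def minor_allele_count(alt_count, n_chromosomes):
    """Fold an allele count: the rarer of the two alleles defines the class."""
    return min(alt_count, n_chromosomes - alt_count)


def folded_sfs(alt_counts, n_chromosomes):
    """Number of segregating sites in each minor-allele-count class 1..n//2.

    Monomorphic sites (minor allele count 0) are ignored.
    """
    tally = Counter(minor_allele_count(c, n_chromosomes) for c in alt_counts)
    return [tally.get(k, 0) for k in range(1, n_chromosomes // 2 + 1)]


def bin_by_frequency(sfs, n_chromosomes, edges=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Collapse the spectrum into minor-allele-frequency bins for plotting.

    Each entry of `edges` is the inclusive upper bound of a bin.
    """
    binned = [0] * len(edges)
    for k, count in enumerate(sfs, start=1):
        maf = k / n_chromosomes
        for i, upper in enumerate(edges):
            if maf <= upper:
                binned[i] += count
                break
    return binned
```

An excess of sites in the lowest bin corresponds to an excess of rare alleles; under frequency-dependent selection that bin is expected to be relatively depleted.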
In addition to the aggregated site frequency spectrum analysis, we also considered the distribution of multiple summary statistics of genetic diversity across the loci considered within our study. We calculated the following summary statistics for each 2 kb window: π, π corrected for human-macaque divergence, Watterson's θ, Tajima's D, and Fu and Li's D*. Both π and θ are estimators of the neutral mutation parameter, 4Neμ. π is based on the number of pairwise differences among sequences within a sample and θ is based on the proportion of segregating sites. Loci under frequency-dependent selection are expected to show elevated values for π because frequency-dependent selection maintains alleles over longer periods of time. Older alleles accumulate mutations and therefore show higher levels of pairwise sequence divergence. We also examined the distribution of π corrected for the rate of divergence between humans and macaques. Different regions of the genome are known to experience different rates of mutation61. Loci with higher mutation rates will show elevated levels of π. The rate of divergence between humans and macaques provides a means of estimating the relative differences in mutation rates among loci62. The maintenance of multiple alleles in a population under frequency-dependent selection is also expected to lead to higher estimates of θ. Tajima's D is the normalized difference between π and θ. Tajima's D takes on positive values when there is an excess of intermediate-frequency variants and negative values when there is an excess of rare variants. Fu and Li's D* is based on the number of nucleotide variants observed only once in a sample63. Negative values of Fu and Li's D* indicate an excess of singletons. Loci under frequency-dependent selection are expected to have a relatively smaller number of singletons and therefore more positive values of Fu and Li's D*.
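To make the relationship among π, Watterson's θ and Tajima's D concrete, the sketch below computes all three from per-site allele counts using the standard formulas. It is an illustration under stated assumptions (biallelic sites, no missing data, a hypothetical `diversity_stats` function) and not the implementation used for the analyses; Fu and Li's D* is omitted.

```python
import math


def diversity_stats(alt_counts, n, window_length):
    """Per-site pi and Watterson's theta, plus Tajima's D, for one window.

    alt_counts: alternate-allele count at each variant site
    n: number of sampled chromosomes
    window_length: number of sites in the window (e.g. 2000)
    """
    seg = [k for k in alt_counts if 0 < k < n]  # segregating sites only
    s = len(seg)
    # Mean number of pairwise differences, summed over sites.
    pi_total = sum(2 * k * (n - k) / (n * (n - 1)) for k in seg)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_total = s / a1
    if s == 0:
        d = float("nan")  # D is undefined without segregating sites
    else:
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        d = (pi_total - theta_total) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return {
        "pi": pi_total / window_length,
        "theta_w": theta_total / window_length,
        "tajima_d": d,
    }
```

A window dominated by singletons gives π below θ and therefore a negative D, whereas a window with many intermediate-frequency variants gives a positive D.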
We calculated the summary statistics using the allele frequencies given in the Phase 1 variant files from the 1000 Genomes Project. The short indels recorded in the dataset were treated in the same manner as SNPs. Human-macaque divergence was estimated using the LastZ alignment of the two reference genomes. Only regions with alignments between the two species' genomes were considered in analyses of π corrected for divergence with macaques (Faces = 58 regions, Height = 356 regions, Neutral = 4873 regions). For the aligned regions, the average alignment lengths were 1832.21 ± 4.49 sites out of 2000. We compared the distribution of summary statistics for face-associated loci to the distributions for the two control datasets using one-tailed Mann-Whitney U tests.
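A one-tailed Mann-Whitney U comparison of this kind can be sketched in Python as follows. The function name is hypothetical, and the P value uses the large-sample normal approximation without continuity or tie corrections, which is an assumption of this example rather than a description of the software used.

```python
from statistics import NormalDist


def mann_whitney_u_greater(x, y):
    """One-tailed test that values in x tend to exceed values in y.

    Returns (U statistic for x, approximate P value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average 1-based rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    p = 1.0 - NormalDist().cdf((u1 - mu) / sigma)
    return u1, p
```

In this setting `x` would hold a summary statistic (for example π) for the face-associated windows and `y` the same statistic for one of the control sets.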
Patterns of diversity across populations
We asked whether the same loci showed elevated diversity in different populations. To do this we identified loci in each population for which π/divergence and Tajima's D were above the 95th percentile as determined from the empirical distribution of intergenic regions examined. We then asked whether a disproportionate number of loci with elevated diversity were shared between continents for face regions compared to the intergenic regions examined. For the nine loci showing elevated diversity in at least one population, we investigated the patterns of haplotype sharing across populations. We examined the sequences in the 2 kb window used for previous analyses. For those coordinates we downloaded a combined PED file including CHB, FIN and YRI from the 1000 Genomes Project site (browser.1000genomes.org). We converted the PED files to FASTA format using PGDSpider64. This procedure produced a FASTA file containing the polymorphic sites found within the examined loci. Using the ‘pegas’ package in R65 we created haplotype networks for each of the loci.
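The outlier screen described above can be sketched in Python. The function names are hypothetical, and the nearest-rank definition of the percentile is an assumption; other percentile conventions would shift the threshold slightly.

```python
import math


def empirical_threshold(neutral_values, percentile=95):
    """Nearest-rank percentile of the intergenic (neutral) distribution."""
    ordered = sorted(neutral_values)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[rank - 1]


def elevated_loci(pi_div, tajima_d, neutral_pi_div, neutral_tajima_d):
    """Loci exceeding the neutral 95th percentile for both statistics.

    pi_div and tajima_d map a locus name to its value in one population.
    """
    thr_pi = empirical_threshold(neutral_pi_div)
    thr_d = empirical_threshold(neutral_tajima_d)
    return {
        locus
        for locus in pi_div
        if pi_div[locus] > thr_pi and tajima_d[locus] > thr_d
    }


def shared_between(pop_a_hits, pop_b_hits):
    """Loci flagged as elevated in both populations."""
    return pop_a_hits & pop_b_hits
```

Applying `elevated_loci` to each population separately and intersecting the resulting sets gives the loci with elevated diversity shared across populations.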
Sliding window analyses
To examine the extent to which selection for identity signaling has been shared or divergent across continents, we conducted a sliding-window analysis of the regions identified as having elevated diversity in at least one population. We calculated π and Fst for 5 kb windows every 1 kb for a total of 200 kb. π was calculated for one representative population for each continent (FIN, CHB and YRI). We estimated levels of differentiation between populations using Hudson's Fst following Bhatia et al66, as it produces unbiased estimates of Fst and is less sensitive to sample size and rare variants than other estimators of Fst such as Weir-Cockerham's and Nei's67. We estimated Fst for each set of SNPs considered by calculating a ratio of averages rather than an average of ratios, as the former is less sensitive to the presence of rare variants in a sample66.
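A minimal Python sketch of the sliding-window Fst calculation is given below. It uses the Hudson estimator in the form presented by Bhatia et al66, combined across SNPs as a ratio of averages. The function names, the half-open window convention and the input layout are assumptions made for illustration.

```python
def hudson_components(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's Fst for one SNP.

    p1, p2: allele frequencies in the two populations
    n1, n2: number of sampled chromosomes in each population
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def hudson_fst(sites):
    """Ratio of averages: sum of numerators over sum of denominators."""
    total_num = 0.0
    total_den = 0.0
    for p1, n1, p2, n2 in sites:
        num, den = hudson_components(p1, n1, p2, n2)
        total_num += num
        total_den += den
    return total_num / total_den if total_den > 0 else float("nan")


def sliding_windows(region_start, region_length=200_000, size=5_000, step=1_000):
    """Yield half-open (start, end) windows that fit inside the region."""
    start = region_start
    while start + size <= region_start + region_length:
        yield start, start + size
        start += step


def windowed_fst(variants, region_start, **window_args):
    """variants: list of (pos, p1, n1, p2, n2); returns [(window_start, Fst)]."""
    out = []
    for start, end in sliding_windows(region_start, **window_args):
        inside = [v[1:] for v in variants if start <= v[0] < end]
        out.append((start, hudson_fst(inside)))
    return out
```

Summing numerators and denominators before dividing keeps SNPs with very small denominators, typically rare variants, from dominating the window estimate, which is the motivation for the ratio of averages.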
The SNPs identified in association with facial morphology are not found in coding sequences, so they are likely to influence gene regulation or splicing in some manner. For the two loci examined in greater detail, we used the UCSC Genome Browser to identify polymorphic sites in ENCODE regulatory regions. We focused on three ENCODE tracks in the UCSC browser68: H3K27Ac marks, DNase sensitivity clusters, and transcription factor binding sites. The H3K27Ac track shows regions for which there is ChIP-seq-based evidence of enrichment for the H3K27Ac histone mark. H3K27 acetylation is associated with enhanced transcription. DNase sensitivity clusters show regions sensitive to DNase as assessed across 125 cell types. Promoters and other regulatory regions tend to be DNase sensitive. The transcription factor track shows regions with evidence of transcription factor binding sites.
Gene Trees
For the two 5 kb loci examined we constructed maximum likelihood gene trees with 10 sequences each from the FIN, CHB and YRI 1000 Genomes populations for a total of thirty sequences. Additionally, we included the human and chimpanzee reference sequences as well as sequences for Denisovans69 and Neanderthals (http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/bam/). We downloaded the alignment of the human and chimpanzee reference sequences from Ensembl. Denisovan sequences were downloaded using the Table Browser function of the UCSC Genome Browser. The draft Altai Neanderthal sequences were downloaded for the relevant chromosomes from the website of the Department of Evolutionary Genetics at the Max Planck Institute. We constructed individual sequences for the 1000 Genomes, Denisovan and Neanderthal samples by manually altering the human reference sequence in accordance with the data found in the respective VCF files using MEGA70 (version 5.2.1). For the phased 1000 Genomes data we selected one chromosome per individual sample. For the Neanderthal and Denisovan sequences, we included all of the sites that differed from the human reference to make a single sequence. After removing sites with gaps in the alignment, we constructed a maximum likelihood tree using a general time reversible model with a gamma distribution of invariant sites.
Supplementary Material
Acknowledgments
We thank M. Phifer-Rixey and A. Werner for computational assistance during this project. W. Allen, K. Ferris and T. Hendry provided useful comments on earlier drafts. MJS was supported by a Ruth Kirschstein National Research Service Award from NIH. MWN was supported by NIH R01 GM074245.
Footnotes
Author contributions: MJS conceived the project; MJS and MWN designed the study; MJS collected and analyzed the data; and MJS and MWN wrote the paper.
Conflict of interest statement: The authors have no conflicts of interest.
References
- 1. Dunbar RIM. Social cognition on the Internet: testing constraints on social network size. Philos Trans R Soc B Biol Sci. 2012;367:2192–2201. doi: 10.1098/rstb.2012.0121.
- 2. Apicella CL, Marlowe FW, Fowler JH, Christakis NA. Social networks and cooperation in hunter-gatherers. Nature. 2012;481:497–501. doi: 10.1038/nature10736.
- 3. Dunbar RIM, Shultz S. Evolution in the social brain. Science. 2007;317:1344–1347. doi: 10.1126/science.1145463.
- 4. Byrne RW, Whiten A. Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans. Oxford: University Press; 1988.
- 5. Whiten A, Erdal D. The human socio-cognitive niche and its evolutionary origins. Philos Trans R Soc B Biol Sci. 2012;367:2119–2129. doi: 10.1098/rstb.2012.0114.
- 6. Haxby JV, Hoffman EA, Gobbini MI. Human neural systems for face recognition and social communication. Biol Psychiatry. 2002;51:59–67. doi: 10.1016/s0006-3223(01)01330-0.
- 7. Herrmann E, Call J, Hernandez-Lloreda MV, Hare B, Tomasello M. Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis. Science. 2007;317:1360–1366. doi: 10.1126/science.1146282.
- 8. Parr LA. The evolution of face processing in primates. Philos Trans R Soc B Biol Sci. 2011;366:1764–1777. doi: 10.1098/rstb.2010.0358.
- 9. Pascalis O, Kelly DJ. The Origins of Face Processing in Humans: Phylogeny and Ontogeny. Perspect Psychol Sci. 2009;4:200–209. doi: 10.1111/j.1745-6924.2009.01119.x.
- 10. Kanwisher N, Yovel G. The fusiform face area: a cortical region specialized for the perception of faces. Philos Trans R Soc B-Biol Sci. 2006;361:2109–2128. doi: 10.1098/rstb.2006.1934.
- 11. Light LL, Kayra-Stuart F, Hollander S. Recognition memory for typical and unusual faces. J Exp Psychol [Hum Learn] 1979;5:212.
- 12. Tibbetts EA, Dale J. Individual recognition: it is good to be different. Trends Ecol Evol. 2007;22:529–537. doi: 10.1016/j.tree.2007.09.001.
- 13. Bond AB, Kamil AC. Apostatic selection by blue jays produces balanced polymorphism in virtual prey. Nature. 1998;395:594–596.
- 14. Olendorf R, et al. Frequency-dependent survival in natural guppy populations. Nature. 2006;441:633–636. doi: 10.1038/nature04646.
- 15. Kokko H, Jennions MD, Houde A. Evolution of frequency-dependent mate choice: keeping up with fashion trends. Proc R Soc B Biol Sci. 2007;274:1317–1324. doi: 10.1098/rspb.2007.0043.
- 16. Dale J. Bird Color Vol 2 Funct Evol. Harvard University Press; 2006.
- 17. Hughes KA, Houde AE, Price AC, Rodd FH. Mating advantage for rare males in wild guppy populations. Nature. 2013;503:108–110. doi: 10.1038/nature12717.
- 18. Frost P. European hair and eye color: A case of frequency-dependent sexual selection? Evol Hum Behav. 2006;27:85–103.
- 19. Janif ZJ, Brooks RC, Dixson BJ. Negative frequency-dependent preferences and variation in male facial hair. Biol Lett. 2014;10:20130958. doi: 10.1098/rsbl.2013.0958.
- 20. Johnstone RA. Recognition and the evolution of distinctive signatures: when does it pay to reveal identity? Proc R Soc Lond Ser B-Biol Sci. 1997;264:1547–1553.
- 21. Dale J, Lank DB, Reeve HK. Signaling individual identity versus quality: A model and case studies with ruffs, queleas, and house finches. Am Nat. 2001;158:75–86. doi: 10.1086/320861.
- 22. Scott-Phillips TC. Defining biological communication. J Evol Biol. 2008;21:387–395. doi: 10.1111/j.1420-9101.2007.01497.x.
- 23. Thom MD, Hurst JL. Individual recognition by scent. Ann Zool Fenn. 2004;41:765–787.
- 24. Bergman TJ, Sheehan MJ. Social Knowledge and Signals in Primates. Am J Primatol. 2013;75:683–694. doi: 10.1002/ajp.22103.
- 25. Sheehan MJ, Tibbetts EA. Selection for individual recognition and the evolution of polymorphic identity signals in Polistes paper wasps. J Evol Biol. 2010;23:570–577. doi: 10.1111/j.1420-9101.2009.01923.x.
- 26. Sheehan MJ, Tibbetts EA. Evolution of identity signals: Frequency-dependent benefits of distinctive phenotypes used for individual recognition. Evolution. 2009;63:3106–3113. doi: 10.1111/j.1558-5646.2009.00833.x.
- 27. Thom MDF, Dytham C. Female choosiness leads to the evolution of individually distinctive males. Evolution. 2012. doi: 10.1111/j.1558-5646.2012.01732.x.
- 28. Medvin MB, Stoddard PK, Beecher MD. Signals for parent offspring recognition: a comparative-analysis of the begging calls of cliff swallows and barn swallows. Anim Behav. 1993;45:841–850.
- 29. Pollard KA, Blumstein DT. Social Group Size Predicts the Evolution of Individuality. Curr Biol. 2011;21:413–417. doi: 10.1016/j.cub.2011.01.051.
- 30. Bamshad M, Wooding SP. Signatures of natural selection in the human genome. Nat Rev Genet. 2003;4:99–111A. doi: 10.1038/nrg999.
- 31. Andrés AM, et al. Targets of balancing selection in the human genome. Mol Biol Evol. 2009;26:2755–2764. doi: 10.1093/molbev/msp190.
- 32. Liu F, et al. A Genome-Wide Association Study Identifies Five Loci Influencing Facial Morphology in Europeans. PLoS Genet. 2012;8:e1002932. doi: 10.1371/journal.pgen.1002932.
- 33. Paternoster L, et al. Genome-wide Association Study of Three-Dimensional Facial Morphology Identifies a Variant in PAX3 Associated with Nasion Position. Am J Hum Genet. 2012;90:478–485. doi: 10.1016/j.ajhg.2011.12.021.
- 34. Attanasio C, et al. Fine Tuning of Craniofacial Morphology by Distant-Acting Enhancers. Science. 2013;342. doi: 10.1126/science.1241006.
- 35. Pritchard JK, Pickrell JK, Coop G. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Curr Biol. 2010;20:R208–R215. doi: 10.1016/j.cub.2009.11.055.
- 36. Biswas S, Akey JM. Genomic insights into positive selection. TRENDS Genet. 2006;22:437–446. doi: 10.1016/j.tig.2006.06.005.
- 37. Gordon CC, Churchill T, Clauser CE, Bradtmiller B, McConville JT. Anthropometric survey of US army personnel: methods and summary statistics 1988. 1989 DTIC Document.
- 38. Frankino WA, Zwaan BJ, Stern DL, Brakefield PM. Natural selection and developmental constraints in the evolution of allometries. Science. 2005;307:718–720. doi: 10.1126/science.1105409.
- 39. Huxley J. Problems of relative growth. 1932.
- 40. Marroig G, Shirai LT, Porto A, de Oliveira FB, De Conto V. The evolution of modularity in the mammalian skull II: evolutionary consequences. Evol Biol. 2009;36:136–148.
- 41. A map of human genome variation from population-scale sequencing. Nature. 2011;473:544–544. doi: 10.1038/nature09534.
- 42. Arbiza L, Zhong E, Keinan A. NRE: a tool for exploring neutral loci in the human genome. BMC Bioinformatics. 2012;13:301. doi: 10.1186/1471-2105-13-301.
- 43. Allen HL, et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature. 2010;467:832–838. doi: 10.1038/nature09410.
- 44. Barrett JC, Cardon LR. Evaluating coverage of genome-wide association studies. Nat Genet. 2006;38:659–662. doi: 10.1038/ng1801.
- 45. Manica A, Amos W, Balloux F, Hanihara T. The effect of ancient population bottlenecks on human phenotypic variation. Nature. 2007;448:346–348. doi: 10.1038/nature05951.
- 46. Lieberman D, Tooby J, Cosmides L. The architecture of human kin detection. Nature. 2007;445:727–731. doi: 10.1038/nature05510.
- 47. Fernandez-Duque E, Valeggia CR, Mendoza SP. The biology of paternal care in human and nonhuman primates. Annu Rev Anthropol. 2009;38:115–130.
- 48. Brosnan SF, Salwiczek L, Bshary R. The interplay of cognition and cooperation. Philos Trans R Soc B-Biol Sci. 2010;365:2699–2710. doi: 10.1098/rstb.2010.0154.
- 49. Anderson RC, Klofstad CA. For Love or Money? The Influence of Personal Resources and Environmental Resource Pressures on Human Mate Preferences. Ethology. 2012;118:841–849.
- 50. Bereczkei T, Voros S, Gal A, Bernath L. Resources, attractiveness, family commitment; reproductive decisions in human mate choice. Ethology. 1997;103:681–699. doi: 10.1111/j.1439-0310.1997.tb00178.x.
- 51. Puts DA, Jones BC, DeBruine LM. Sexual selection on human faces and voices. J Sex Res. 2012;49:227–243. doi: 10.1080/00224499.2012.658924.
- 52. Hubbe M, Hanihara T, Harvati K. Climate signatures in the morphological differentiation of worldwide modern human populations. Anat Rec. 2009;292:1720–1733. doi: 10.1002/ar.20976.
- 53. Parr LA. The evolution of face processing in primates. Philos Trans R Soc B-Biol Sci. 2011;366:1764–1777. doi: 10.1098/rstb.2010.0358.
- 54. Wood B, Harrison T. The evolutionary context of the first hominins. Nature. 2011;470:347–352. doi: 10.1038/nature09709.
- 55. Fromuth R, Parkinson M. Predicting 5th and 95th percentile anthropometric segment lengths from population stature. Proc ASME Int Des Eng Tech Conf. 2008:3–6.
- 56. Golding J, Pembrey M, Jones R, Team AS. ALSPAC-the avon longitudinal study of parents and children. I. study methodology. Paediatr Perinat Epidemiol. 2001;15:74–87. doi: 10.1046/j.1365-3016.2001.00325.x.
- 57. Orozco G, Barrett JC, Zeggini E. Synthetic associations in the context of genome-wide association scan signals. Hum Mol Genet. 2010;19:R137–R144. doi: 10.1093/hmg/ddq368.
- 58. Blankenberg D, et al. Galaxy: A Web-Based Genome Analysis Tool for Experimentalists. Curr Protoc Mol Biol. 2010:19.10. 1–19.10. 21. doi: 10.1002/0471142727.mb1910s89.
- 59. Navarro A, Barton NH. The effects of multilocus balancing selection on neutral variability. Genetics. 2002;161:849–863. doi: 10.1093/genetics/161.2.849.
- 60. Takahata N, Nei M. Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics. 1990;124:967–978. doi: 10.1093/genetics/124.4.967.
- 61. Wolfe KH, Sharp PM, Li WH. Mutation rates differ among regions of the mammalian genome. 1989. doi: 10.1038/337283a0.
- 62. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 1980;16:111–120. doi: 10.1007/BF01731581.
- 63. Fu YX, Li WH. Statistical tests of neutrality of mutations. Genetics. 1993;133:693–709. doi: 10.1093/genetics/133.3.693.
- 64. Lischer HEL, Excoffier L. PGDSpider: an automated data conversion tool for connecting population genetics and genomics programs. Bioinformatics. 2012;28:298–299. doi: 10.1093/bioinformatics/btr642.
- 65. Paradis E. pegas: an R package for population genetics with an integrated–modular approach. Bioinformatics. 2010;26:419–420. doi: 10.1093/bioinformatics/btp696.
- 66. Bhatia G, Patterson N, Sankararaman S, Price AL. Estimating and interpreting FST: The impact of rare variants. Genome Res. 2013;23:1514–1521. doi: 10.1101/gr.154831.113.
- 67. Weir BS, Hill WG. Estimating F-statistics. Annu Rev Genet. 2002;36:721–750. doi: 10.1146/annurev.genet.36.050802.093940.
- 68. Rosenbloom KR, et al. ENCODE Data in the UCSC Genome Browser: year 5 update. Nucleic Acids Res. 2013;41:D56–D63. doi: 10.1093/nar/gks1172.
- 69. Meyer M, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012;338:222–226. doi: 10.1126/science.1224344.
- 70. Tamura K, et al. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121.