Genome Biology and Evolution. 2018 Dec 19;11(1):232–241. doi: 10.1093/gbe/evy271

Immune Gene Diversity in Archaic and Present-day Humans

David Reher 1, Felix M Key 1,2, Aida M Andrés 1,3,2, Janet Kelso 1,2
Editor: Partha Majumder
PMCID: PMC6347564  PMID: 30566634

Abstract

Genome-wide analyses of two Neandertals and a Denisovan have shown that these archaic humans had lower genetic heterozygosity than present-day people. A similar reduction in genetic diversity of protein-coding genes (gene diversity) was found in exome sequences of three Neandertals. Reduced gene diversity, particularly in genes involved in immunity, may have important functional consequences. In fact, it has been suggested that reduced diversity in immune genes may have contributed to Neandertal extinction. We therefore explored gene diversity in different human groups, and at different time points on the Neandertal lineage, with a particular focus on the diversity of genes involved in innate immunity and genes of the Major Histocompatibility Complex (MHC).

We find that the two Neandertals and a Denisovan have similar gene diversity, all significantly lower than any present-day human. This is true across gene categories, with no gene set showing an excess decrease in diversity compared with the genome-wide average. Innate immune-related genes show a similar reduction in diversity to other genes, both in present-day and archaic humans. There is also no observable decrease in gene diversity over time in Neandertals, suggesting that there may have been no ongoing reduction in gene diversity in later Neandertals, although this needs confirmation with a larger sample size. In both archaic and present-day humans, genes with the highest levels of diversity are enriched for MHC-related functions. In fact, in archaic humans the MHC genes show evidence of having retained more diversity than genes involved only in the innate immune system.

Keywords: Neandertal, Denisovan, immunity, evolution, natural selection, diversity

Introduction

Since the first complete Neandertal genome was sequenced (Green et al. 2010), ongoing efforts have retrieved DNA sequences from a number of additional extinct hominins (Meyer et al. 2012; Castellano et al. 2014; Prufer et al. 2014, 2017). Comparing the Neandertal and Denisovan genomes to the genomes of present-day people provided evidence that the ancestors of all non-Africans living today met and interbred with Neandertals (Green et al. 2010) and that the ancestors of people that today live in Oceania interbred with Denisovans (Reich et al. 2010). Although some of the resulting introgressed DNA has been shown to be adaptive in anatomically modern humans (Huerta-Sanchez et al. 2014; Racimo et al. 2015; Dannemann et al. 2016; Racimo et al. 2017), conserved regions of present-day human genomes are significantly depleted of introgressed Neandertal sequence, which has been interpreted as evidence for purifying selection against introgressed Neandertal DNA in anatomically modern human genomes (Sankararaman et al. 2014; Fu et al. 2016; Harris and Nielsen 2016; Juric et al. 2016). Recent studies suggested that slightly deleterious alleles may have accumulated in the genomes of Neandertals and Denisovans because of reduced efficacy of natural selection as a result of their small long-term effective population size (Ne) (Harris and Nielsen 2016; Juric et al. 2016).

All archaic individuals analyzed to date have genome-wide heterozygosities that are lower than those seen in present-day humans. The genome-wide heterozygosity of a ∼50,000-year-old Neandertal from Vindija cave in Croatia (Prufer et al. 2017) was estimated to be 1.6 × 10^-5, similar to that previously reported for the ∼120,000-year-old Altai Neandertal (Prufer et al. 2014) and only slightly lower than the estimate for an ∼80,000-year-old Denisovan individual (1.8 × 10^-5) (Meyer et al. 2012). This low genetic diversity has also been observed in the exome sequences of three Neandertals from the Vindija, El Sidrón, and Denisova Caves, which show lower average heterozygosities than present-day humans (Castellano et al. 2014). Genetic diversity in genic regions is particularly important, as it can potentially impact the levels of functional diversity in the population. However, the limited number of high-quality archaic genome sequences means that we do not know to what extent levels of gene diversity (i.e., genetic diversity in protein-coding genes) may have changed over time. The availability of two high-coverage Neandertal genomes of individuals who lived 70,000 years apart, as well as the high-coverage genome of one Denisovan, now allows us to begin to explore gene diversity in archaic human populations at different times during Neandertal history.

It has been suggested that a lack of functional variation in immune-related genes, some of which are targets of long-term balancing selection (Meyer and Thomson 2001; Key, Teixeira, et al. 2014; Bitarello et al. 2018), could have contributed to Neandertal extinction (Wolff and Greenwood 2010; Houldcroft and Underdown 2016; Sullivan et al. 2017). This is argued especially for genes of the innate immune system, which serves as a first line of defense against pathogens. The idea is known as the differential pathogen resistance hypothesis (DPRH), and Sullivan et al. (2017) recently presented evidence both for and against it. Using the exome data (Castellano et al. 2014), they found that Neandertals had substantially lower numbers of nonsynonymous single nucleotide polymorphisms (SNPs) than present-day humans in 73 innate immune-related genes, 12 genes of the Major Histocompatibility Complex (MHC), 164 virus-interacting protein genes, and 73 loci with high diversity in chimpanzee (which might be enriched for targets of balancing selection). They concluded that reduced protein sequence diversity in this set of immune genes may have resulted in reduced resistance to pathogens and have thereby contributed to Neandertal extinction. However, they also reported a higher number of nonsynonymous SNPs in Neandertals than in present-day humans for 12 genes of the MHC, suggesting high levels of functional diversity in this component of the immune system.

Here, we leverage existing high-quality whole-genome data from three archaic humans to test for evidence of a specific reduction of gene diversity in archaic humans that would be expected under the DPRH. We focus on comparing genetic diversity between archaic and present-day humans, and over time in the Neandertal lineage in a comprehensive set of 1,548 innate immunity genes defined by Deschamps et al. (2016). We chose to study genes of the innate immune system as these are affected in a more direct way by the effects of natural selection than genes involved in adaptive immunity (Quintana-Murci and Clark 2013), such as T cell and B cell receptors, that derive their variability both from inherited genetic variation and from individual somatic recombination (Flajnik and Kasahara 2010). In addition, we separately analyze 14 MHC genes because of their important role in immunity, and their well-studied and unique evolutionary history (Meyer and Thomson 2001; Key, Teixeira, et al. 2014; Bitarello et al. 2018). In a second analysis, we then generalize the idea underlying the DPRH. Instead of exploring gene diversity only in innate immunity genes, we tested if any functional category of genes had particularly high or low gene diversity in Neandertals when compared with modern humans.

Materials and Methods

Data

Our analyses are based on three published high-coverage genomes of the Altai and Vindija Neandertals and the Denisovan (Meyer et al. 2012; Prufer et al. 2014, 2017) as well as a published data set of 14 present-day individuals consisting of five individuals from Africa (Mandenka, Mbuti, San, Yoruba, Dinka), three from Asia (Dai, Han, Papuan), two from Australia, two from Europe (French, Sardinian), and two from South America (Karitiana, Mixe) (Meyer et al. 2012). For all analyses, we used the filters applied by Prufer et al. (2017). In brief, we retained sequences with mapping quality >25, sites with coverage >10 (including both a 2.5% higher and lower coverage cut-off; corrected for GC content), and unique positions in the genome according to a 35-mer 1-mismatch filter, while removing simple repeats (tandem repeat finder track at UCSC). We downloaded a list of all annotated autosomal human protein-coding genes from BioMart/Ensembl Release 84 (GRCh37) (Yates et al. 2016), including introns, exons, and an additional 1 kb upstream and downstream to capture adjacent regulatory elements, and filtered for uniqueness by HGNC symbol and gene coordinates (GRCh37/hg19, N = 17,505). We extracted the sequences that pass these filters for each individual from the whole-genome VCF files and excluded genes with <2,000 callable sites from the analysis (reducing the number of genes by 2,041 for each individual on average). Data processing was done using Tabix (Li 2011), BEDOPS (Neph et al. 2012), and BEDTools (Quinlan 2014), and statistical analysis and visualization were done using R (Team RC 2016).
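The gene-level callability filter described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline, which used Tabix, BEDOPS, and BEDTools. The `Gene` class and the function names are hypothetical, coordinates are assumed to be 0-based and half-open, callable intervals are assumed not to overlap one another, and callable sites are assumed to be counted over the gene including its 1 kb flanks.

```python
from dataclasses import dataclass

FLANK_BP = 1_000            # 1 kb upstream and downstream, as stated in the text
MIN_CALLABLE_SITES = 2_000  # genes with fewer callable sites are excluded


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 0-based, half-open."""
    symbol: str
    chrom: str
    start: int
    end: int


def add_flanks(gene, flank=FLANK_BP):
    """Extend a gene by `flank` bp on both sides, clamping at position 0."""
    return Gene(gene.symbol, gene.chrom, max(0, gene.start - flank), gene.end + flank)


def count_callable(gene, callable_intervals):
    """Count bases of `gene` covered by (chrom, start, end) callable intervals.

    Assumes the callable intervals do not overlap one another.
    """
    total = 0
    for chrom, start, end in callable_intervals:
        if chrom != gene.chrom:
            continue
        overlap = min(end, gene.end) - max(start, gene.start)
        if overlap > 0:
            total += overlap
    return total


def keep_gene(gene, callable_intervals, min_sites=MIN_CALLABLE_SITES):
    """Return True if the flanked gene has at least `min_sites` callable sites."""
    return count_callable(add_flanks(gene), callable_intervals) >= min_sites
```

In practice the callable intervals would come from the per-individual filter tracks; here they are plain tuples so the sketch stays self-contained.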

Measure of Gene Diversity

To estimate gene diversity for autosomal protein-coding genes per genome, we counted SNPs (Andres 2009). For individual genomes, a SNP is defined as a biallelic heterozygous site. To account for local heterogeneity in mutation rate and the rate of substitutions, we divided the number of SNPs by the number of fixed differences (FDs), which serves as a proxy for mutation rate. For individual genomes, a FD is a site in which the chimpanzee reference allele (taken from the EPO alignment version 69 [Yates et al. 2016] based on the chimpanzee reference CHIMP2.1.4) is different from a homozygous allele in the test individual. We calculated the SNP/FD ratio for each gene that passed our filter and define this measure as a proxy for genetic diversity in protein-coding genes, which we call gene diversity. We note that because we consider the full length of genes including regulatory sequences and introns, and use divergence with chimpanzee, the number of genes with a very small number of FDs is extremely low (on average, there are eight genes with fewer than three FDs per individual; the mean number of FDs per gene over all individuals is 272). Further, we predefined sets of genes (more specifically, innate immune and MHC genes, see below), summed the total number of SNPs in those genes, and divided that number by the total number of FDs in those genes to obtain a single combined ratio of SNPs to FDs per individual (mean SNP/FD ratio). We computed confidence intervals on bootstrapped sets. After sampling N genes with replacement from both test and background gene sets, we recalculated the SNP/FD ratios for each of 5,000 resampled sets and used the 2.5% and 97.5% quantiles of the resulting empirical distributions as cut-offs.
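The pooled SNP/FD ratio and its bootstrap confidence interval can be sketched as below. This is an illustrative sketch under stated assumptions rather than the authors' R code: each gene is represented as a hypothetical `(n_snps, n_fds)` tuple, the function names are invented for illustration, and quantiles are taken with a simple nearest-rank rule.

```python
import random


def pooled_snp_fd(genes):
    """Pooled SNP/FD ratio: total SNPs divided by total fixed differences.

    `genes` is a list of (n_snps, n_fds) tuples, one per gene.
    """
    total_snps = sum(snps for snps, _ in genes)
    total_fds = sum(fds for _, fds in genes)
    if total_fds == 0:
        raise ValueError("gene set has no fixed differences")
    return total_snps / total_fds


def bootstrap_ci(genes, n_boot=5000, seed=0):
    """95% bootstrap interval for the pooled ratio, resampling genes with replacement."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        resampled = rng.choices(genes, k=len(genes))
        if sum(fds for _, fds in resampled) == 0:
            continue  # skip degenerate resamples with no fixed differences
        ratios.append(pooled_snp_fd(resampled))
    ratios.sort()
    # Nearest-rank 2.5% and 97.5% quantiles of the empirical distribution.
    lower = ratios[int(0.025 * (len(ratios) - 1))]
    upper = ratios[int(0.975 * (len(ratios) - 1))]
    return lower, upper
```

Pooling counts before dividing, rather than averaging per-gene ratios, keeps genes with very few FDs from dominating the estimate.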

Diversity in Innate Immune and MHC Genes

To test whether there is evidence for an overall increase or reduction in gene diversity of innate immune genes in archaic humans, we calculated SNP/FD ratios in a comprehensive set of innate-immune genes curated by Deschamps et al. (2016). This set combines genes from InnateDB (Breuer et al. 2013) and genes assigned to the GO category innate immune response (GO: 0045087). We updated this gene list with recent InnateDB entries following our filtering scheme. This resulted in a set of 1,548 innate immunity genes. We additionally investigated the following subsets of this innate-immune gene list separately: Toll-like receptor signaling pathway (GO: 0002224, N = 169), innate immune response in mucosa (GO: 0002227, N = 10), defense response (GO: 0006952, N = 65), defense response to bacterium (GO: 0042742, N = 60), defense response to Gram-negative bacterium (GO: 0050829, N = 29), defense response to Gram-positive bacterium (GO: 0050830, N = 52), defense response to fungus (GO: 0050832, N = 16), defense response to virus (GO: 0051607, N = 154), as well as the MHC genes (N = 14). We then defined a hand-curated background set of autosomal protein-coding genes without any reported immune function (13,393 Ensembl genes [Yates et al. 2016]), for which we excluded 4,723 genes with any reported immune system-related function (ImmPort gene list [Breuer et al. 2013]) as well as genes shorter than 500 bp in length. To compare the diversity of immune genes relative to the protein-coding background between archaic and present-day humans, we normalized the levels of gene diversity for each individual by the overall gene diversity in the set of background genes found in that same individual, that is, we divided the mean SNP/FD ratio of innate-immune genes by the mean SNP/FD ratio of the background genes (normalized gene diversity). We repeated the same analysis for the MHC gene set, that is, all HLA genes on chromosome 6.
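The normalization step can be illustrated with a short sketch. The function names and the `(n_snps, n_fds)` tuple representation are assumptions made for illustration. The log2 transform follows the scale used in the figure captions, so a value of 0 means the test set and the background have the same pooled SNP/FD ratio.

```python
import math


def pooled_snp_fd(genes):
    """Pooled SNP/FD ratio for a list of (n_snps, n_fds) tuples."""
    total_fds = sum(fds for _, fds in genes)
    if total_fds == 0:
        raise ValueError("gene set has no fixed differences")
    return sum(snps for snps, _ in genes) / total_fds


def normalized_gene_diversity(test_genes, background_genes):
    """log2 of (test-set ratio / background ratio) within one individual."""
    test_ratio = pooled_snp_fd(test_genes)
    background_ratio = pooled_snp_fd(background_genes)
    if test_ratio == 0 or background_ratio == 0:
        raise ValueError("log2 ratio is undefined when a pooled ratio is zero")
    return math.log2(test_ratio / background_ratio)
```

Because both ratios come from the same genome, individual-wide effects such as overall low heterozygosity cancel out, which is what makes archaic and present-day individuals comparable on this scale.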

GO Enrichment Analysis

We performed a gene ontology (GO) enrichment analysis to explore whether any particular functional groups of genes (GO categories) are overrepresented among the genes with the highest (top-5% tail of the empirical SNP/FD ratio distribution) or lowest (bottom-5% tail of the same distribution) SNP/FD ratios in the three archaics, or a set of three representative present-day humans (Africa [Yoruba], Europe [French], and Asia [Han]). In this analysis, we only considered genes that pass our above-mentioned filters in the test individuals, and averaged the SNP/FD ratio over those individuals for each gene. For GO enrichment analyses, we used the R package “GOfuncR” (Prufer et al. 2007; Grote et al. 2016; Grote 2018). In the GO enrichment analyses, we compared the test sets to all genes with SNP/FD ratios outside the top and bottom 5% in the relevant set of three individuals. We further performed GO enrichment analyses for pairs of genomes and for individual genomes. While these analyses have lower power than the one above, they allow us to better define and understand the enrichment signal in genes with specific functions.
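The construction of the test and comparison gene sets described above can be sketched as follows. This only illustrates the tail selection and not the enrichment test itself, which the authors ran with GOfuncR. The function name, the dictionary input format, and the integer-percent tail parameter are assumptions, and ties at the cut-off are broken arbitrarily by gene name.

```python
def tail_gene_sets(per_individual, tail_percent=5):
    """Split genes into top tail, bottom tail, and the remaining comparison set.

    `per_individual` is a list of dicts mapping gene name -> SNP/FD ratio,
    one dict per individual. Only genes present in every dict are used.
    """
    shared = set.intersection(*(set(d) for d in per_individual))
    n_ind = len(per_individual)
    mean_ratio = {g: sum(d[g] for d in per_individual) / n_ind for g in shared}
    # Sort by mean ratio; gene name breaks ties so the result is deterministic.
    ranked = sorted(mean_ratio, key=lambda g: (mean_ratio[g], g))
    k = max(1, len(ranked) * tail_percent // 100)
    bottom = set(ranked[:k])
    top = set(ranked[-k:])
    comparison = set(ranked) - top - bottom
    return top, bottom, comparison
```

Averaging per-gene ratios over the three individuals before ranking means a gene must be consistently diverse, or consistently depleted, to reach a tail.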

Results

Archaic Humans Had Lower Overall Gene Diversity than Present-day Humans

We estimated gene diversity per individual by calculating SNP/FD ratios in five present-day individuals from African populations (Mandenka, Mbuti, San, Yoruba, and Dinka), and nine present-day individuals from non-African populations (French, Sardinian, Dai, Han, Papuan, Karitiana, Mixe, and two Australians). In agreement with previous observations, we consistently found significantly higher diversity in African individuals than in individuals from non-African populations (fig. 1A, indicated by nonoverlapping 95% confidence intervals), consistent with reduced diversity in non-Africans as a consequence of the out-of-Africa bottleneck (reviewed in Cavalli-Sforza and Feldman 2003). All three archaic humans exhibit significantly lower gene diversity than the present-day humans, consistent with their previously reported overall low genomic diversity (Meyer et al. 2012; Castellano et al. 2014; Prufer et al. 2014, 2017). The Altai Neandertal has lower gene diversity than the other two archaic individuals, likely as a consequence of recent inbreeding (Prufer et al. 2014). Removing the extended tracts of homozygosity (defined by Prufer et al. 2014) from the Altai genome (Altai*) results in comparable levels of gene diversity in the Altai and Vindija Neandertals. Gene diversity in both of the Neandertals is slightly lower than in the Denisovan, which is consistent with the reported differences in genome-wide diversity (Meyer et al. 2012).

Fig. 1.

—Distributions of SNP/FD ratios per gene for all 17 individuals. Black lines and notches give medians and 95% confidence intervals, respectively. Y-axis trimmed at 0.5 for clarity (for full plots see supplementary fig. S1, Supplementary Material online), each grey dot gives the SNP/FD ratio for a single gene. (A) All protein-coding genes and (B) innate immune-related genes. AF, African; EU, European; AS, Asian; SA, South American; AH, Archaic Humans.

Archaic Humans Had Similarly Low Gene Diversity in Innate Immune Genes Compared with Non-immune Genes

Next, we tested whether genes of the innate immune system showed a similar reduction in Neandertals (when compared with present-day humans) as non-immune genes. Figure 1B shows the distribution of SNP/FD ratios for each immune gene in all individuals as well as for the Altai Neandertal excluding homozygous tracts (Altai*). As was the case for all autosomal protein-coding genes (fig. 1A), present-day humans from Africa (Mandenka, Mbuti, San, Yoruba, and Dinka) have higher diversity in innate immune-related genes (SNP/FD ratios range from 0.154 to 0.167) than individuals from non-African populations (French, Sardinian, Dai, Han, Papuan, Karitiana, Mixe, and two Australians, SNP/FD ratios range from 0.042 to 0.105). With values from 0.005 to 0.018, the median SNP/FD ratios are lower for the three archaic humans than for the present-day humans (fig. 1B). The median SNP/FD ratio for the Altai Neandertal is slightly (not significantly) lower than that for the Vindija 33.19 Neandertal. Again, after removing identified homozygous tracts, the Altai Neandertal (Altai*) exhibits similar gene diversity to the younger Vindija 33.19 Neandertal and the Denisovan, suggesting the lower SNP/FD ratio is likely a result of recent inbreeding in the Altai Neandertal (Prufer et al. 2014).

To further investigate a putative specific reduction in innate immune gene diversity, we investigated normalized immune gene diversity by dividing mean SNP/FD ratios in immune genes by mean SNP/FD ratios in a set of non-immune-related background genes (see Materials and Methods, fig. 2). There are no significant differences in the normalized gene diversities between any pair of ancient or present-day individuals, as indicated by overlapping 95% confidence intervals. Furthermore, 95% confidence intervals include the value 0 in 16 of the 17 individuals (in one Australian the upper confidence interval limit is slightly below 0). This suggests that in all individuals, innate-immune related genes have levels of diversity that are expected given their genome-wide gene diversity. We thus find no indication that innate immunity genes in archaic individuals have significantly different levels of normalized gene diversity than in present-day humans. These results are also reflected in the analysis of eight subsets of innate immunity genes (containing 10–169 genes each), in which we also find no evidence for a specific reduction of gene diversity, even though there is some variation due to low sample sizes (supplementary fig. S2, Supplementary Material online).

Fig. 2.

—Normalized mean SNP/FD ratios (log2) for all 17 individuals in the full set of innate immune-related genes (N = 1,548). Error bars give 95% confidence intervals calculated by bootstrapping (B = 5,000). AF, Africa; EU, European; AS, Asian; SA, South American; AH, Archaic Humans. Dashed line gives expected value if the mean values for innate immunity genes and autosomal protein-coding background genes were equal.

The younger Vindija 33.19 Neandertal and the older Altai Neandertal show similar levels of normalized gene diversity in innate immune genes (–0.05 and –0.02 for Vindija 33.19 and Altai*, respectively, with a trend towards lower normalized gene diversity in Vindija 33.19). This further suggests that immune gene diversity did not decrease substantially over time. We note that differences in the branch lengths leading to each of the archaic humans, which reflect the differences in ages of the specimens (that is, branch shortening), do not have a substantial effect on our analyses, as we do not observe a strong positive correlation between the age of archaic individuals and the normalized gene diversity (Pearson correlation coefficient between normalized mean SNP/FD ratio and age, using Altai*: 0.08).

High MHC Gene Diversity in Archaic Humans

MHC genes are known to be among the most diverse genes in the genome, due to the action of long-term balancing selection (Meyer and Thomson 2001; Key, Teixeira, et al. 2014; Bitarello et al. 2018). In contrast to the overall gene diversity, which is consistently lower in archaic than in present-day individuals, the levels of gene diversity in MHC genes of archaic humans are comparable to the levels observed in present-day humans (supplementary fig. S3, Supplementary Material online). To better understand this signature, we evaluated MHC gene diversity for the three archaic humans and the 14 present-day humans by averaging the normalized SNP/FD ratios (fig. 3A). Both archaic and present-day humans show higher diversity in MHC genes than in the background gene set (indicated by log2-values of lower 95% confidence intervals > 0). Interestingly, MHC diversity is ∼47-fold higher than the background genes in archaic humans (95% CI: 32–76-fold) but only ∼7-fold higher than background genes in present-day humans (95% CI: 5–9-fold). This higher diversity in the MHC observed in archaics compared with present-day humans is driven largely by the MHC class II genes (fig. 3B–D). It is interesting that the normalized gene diversity in the MHC of the two early modern humans Loschbour (which is ∼7,000 years old [Lazaridis et al. 2014]) and Ust’-Ishim (which is ∼45,000 years old [Fu et al. 2014]) is comparable to that of present-day humans (fig. 3E), and thus lower than that of the archaic humans. The same is true for the set of innate immunity genes (supplementary fig. S4, Supplementary Material online).
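Because the fold differences quoted above are plotted on a log2 scale in the figures, a small conversion helper makes the two scales easy to compare. This sketch assumes that "N-fold higher" denotes a ratio of N between MHC and background diversity. The function names are hypothetical.

```python
import math


def fold_to_log2(fold):
    """Convert a fold ratio (e.g. 47 for 47-fold) to the log2 scale."""
    if fold <= 0:
        raise ValueError("fold ratio must be positive")
    return math.log2(fold)


def log2_to_fold(value):
    """Convert a log2 value back to a fold ratio."""
    return 2.0 ** value


# Fold ratios quoted in the text, converted to the log2 scale of the figures.
for label, fold in [("archaic", 47), ("present-day", 7)]:
    print(f"{label}: {fold}-fold = {fold_to_log2(fold):.2f} on the log2 scale")
```

Under this reading, the archaic interval of 32 to 76-fold spans roughly 5 to 6.25 log2 units, and the present-day interval of 5 to 9-fold spans roughly 2.3 to 3.2.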

Fig. 3.

—Comparison of normalized mean SNP/FD ratios (log2) of present-day humans (PDH) and archaic humans (AH). (A) Average values for PDH and AH for all MHC genes. Error bars give 95% confidence intervals calculated via bootstrap (B = 5,000). (B) Average values for PDH and AH for all MHC class I genes. (C) Average values for PDH and AH for all MHC class II genes. (D) Distribution of SNP/FD ratios (log2) of MHC genes (x-axis) for all individuals. Missing values for single individuals can either be genes without FDs or genes with <500 callable sites. (E) Comparison of normalized mean SNP/FD ratios (mean) between single individuals. AF, Africa; EU, European; AS, Asian; SA, South American; EMH, Early anatomically Modern Humans; AH, Archaic Humans. Note differences in the y-axis. Dashed lines in (D) and (E) give expected values if the mean values for innate immunity and autosomal protein-coding background genes are the same.

Comparing MHC gene diversity between archaic and early modern humans also helps to determine whether problems with the alignment of short ancient DNA reads in these highly polymorphic genomic regions could lead to an overestimate of diversity. The sequences generated from these ancient specimens have comparable read length distributions (supplementary fig. S5A, Supplementary Material online), and a similarly high median genomic coverage to the three archaic genomes (supplementary fig. S5B, Supplementary Material online). Since MHC gene diversity in the Loschbour and Ust’-Ishim individuals is no higher than in present-day humans, the high MHC diversity in the archaics is likely not caused by problems with aligning short ancient DNA reads. The fact that the signature is unique to the MHC further suggests that it is not an artefact of incorrect mapping of short ancient DNA reads (supplementary figs. S6–S8, Supplementary Material online).

The relatively high MHC diversity in archaic humans is evenly distributed in introns and exons, which do not show significantly different SNP/FD ratios (supplementary fig. S9, Supplementary Material online). In fact, neither introns nor exons show different SNP/FD ratios between archaic and present-day humans (supplementary fig. S10, Supplementary Material online). High gene diversity is thus homogeneously distributed within single genes, rather than due to peaks of diversity in highly polymorphic sections of the genes, which we would expect with mapping errors. The observed patterns are consistent with the high linkage disequilibrium and low average recombination rates of the MHC region (International HapMap Consortium 2005; Miretti et al. 2005; de Bakker et al. 2006; Traherne 2008), resulting in high levels of diversity across the entire MHC region. Further, we note that coverage is also evenly distributed across genes (we find no evidence for different coverages in introns than in exons) and is comparable between ancient and present-day individuals (supplementary fig. S11, Supplementary Material online).

We separately investigated SNP/FD ratios for the β-2-microglobulin gene (B2M). B2M is part of the class I MHC (light-chain) but—unlike the other MHC genes—is located on chromosome 15. It is known to be nonpolymorphic in humans (Corazza et al. 2004; Esposito et al. 2008). In concordance with this, we find low SNP/FD ratios that are comparable between archaic and present-day humans (supplementary fig. S12, Supplementary Material online).

Genes with Highest/Lowest Diversity Show Similar GO Enrichments in Archaic and Present-day Humans

When evaluating the diversity of the entire gene set, genes with the highest diversity (in the top-5% tail of the empirical distribution for SNP/FD ratios) show a highly significant enrichment only of GO categories related to the MHC in both the archaic and present-day humans after correcting for multiple testing (Bonferroni correction, k = 17, table 1). This signal was consistently found when testing the archaic humans in pairs (rather than triplets, supplementary tables S1–S3, Supplementary Material online) and individually for the Vindija Neandertal and the Denisovan (supplementary tables S4 and S5, Supplementary Material online), with the Altai Neandertal showing nonsignificant enrichment (supplementary table S6, Supplementary Material online).

Table 1.

Significantly Enriched GO Categories for Genes from the Top 5% Tail of the SNP/FD Empirical Distribution in the Three Archaic Humans and Three Present-day Humans (Yoruba, French, Han) Ordered by Family-Wise Error Rate (FWER)

Archaic Humans (N = 3)
Present-day Humans (N = 3)
GO ID GO Name FWER GO ID GO Name FWER
0032395 MHC class II receptor activity 0 0004984 Olfactory receptor activity 0
Top 5% 0042613 MHC class II protein complex 0 0042613 MHC class II protein complex 0
0004984 Olfactory receptor activity 0.017 0032395 MHC class II receptor activity 0.034
0042611 MHC protein complex 0.034

Note.—FWER values are given after correcting for multiple testing (Bonferroni correction, k = 17). Analyses considering pairs of individuals, and single individuals, are presented in the Supplementary Materials. GO categories related to the MHC region are highlighted with bold font.
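The Bonferroni step mentioned in the table note can be illustrated as below. This is a generic sketch of the correction and not the authors' code. It assumes that the uncorrected values are multiplied by k = 17 and capped at 1, which matches the tabulated values 0.017 and 0.034 being multiples of 17 × 0.001. The function name is hypothetical.

```python
def bonferroni(values, k):
    """Bonferroni-adjust a list of p-values (or FWERs) for k tests, capped at 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(1.0, v * k) for v in values]


# Hypothetical uncorrected values, chosen only to illustrate the arithmetic.
adjusted = bonferroni([0.0, 0.001, 0.002, 0.1], k=17)
print([round(a, 3) for a in adjusted])
```

An uncorrected value of 0 stays 0 after correction, which is why several categories in the table keep an FWER of 0.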

In the bottom-5% tail of the empirical distribution we found enriched categories related to virus or mitochondrial functions (Bonferroni correction, k = 17, table 2). It is tempting to interpret these findings as differences between archaic and present-day humans, especially as previous work suggests that genes related to antiviral defense are often subject to natural selection, either under strong purifying (Deschamps et al. 2016) or positive selection (Manry et al. 2011; Key, et al. 2014; Enard et al. 2016). However, there was no consistent enrichment pattern when analyzing genomes in pairs or individually (rather than triplets), and most enrichments were nonsignificant trends in some, but not all, archaic or present-day humans (supplementary tables S7–S22, Supplementary Material online). Thus, we have no strong evidence of any different GO enrichment patterns between the modern and archaic genomes.

Table 2.

Significantly Enriched GO Categories among the Bottom 5% Tail of the SNP/FD Empirical Distribution in the Three Archaic Humans and Three Present-day Humans (Yoruba, French, Han) Ordered by Family-Wise Error Rate (FWER)

Archaic Humans (N = 3)
Present-day Humans (N = 3)
GO ID GO Name FWER GO ID GO Name FWER
0004984 Olfactory receptor activity 0 0003676 Nucleic acid binding 0
Bottom 5% 0004930 G-protein coupled receptor activity 0
0004888 Transmembrane signaling receptor activity 0
0098800 Inner mitochondrial membrane protein complex 0
0099600 Transmembrane receptor activity 0
0005125 Cytokine activity 0
0005179 Hormone activity 0
0098798 Mitochondrial protein complex 0
0038023 Signaling receptor activity 0
0007186 G-protein coupled receptor signaling pathway 0.017
0005743 Mitochondrial inner membrane 0.034

Note.—FWER values are given after correcting for multiple testing (Bonferroni correction, k = 17). Analyses considering pairs of individuals, and single individuals, are presented in the Supplementary Materials.

Together, our results indicate that the enrichment of gene categories among the genes with the highest or lowest diversity is specific to neither archaic nor present-day humans, with patterns of diversity in genes with the highest or lowest SNP/FD ratios probably being shaped by the action of long-term balancing selection and strong purifying selection or selective sweeps, respectively. However, we caution that the strength of our conclusions is limited by our small sample size, and that they will have to be confirmed when more archaic genomes become available.

Discussion

Our results are consistent with previous studies that have reported lower genetic diversity in archaic humans than in present-day humans, both genome-wide and in protein-coding regions (Meyer et al. 2012; Castellano et al. 2014; Prufer et al. 2014, 2017). In a recent study, lower protein-coding diversity observed in a set of 73 innate immunity genes (Sullivan et al. 2017) was interpreted as suggesting that Neandertals may have lacked the functional immune diversity necessary to survive new pathogen infections (Wolff and Greenwood 2010; Houldcroft and Underdown 2016; Sullivan et al. 2017). Here, we re-evaluated this hypothesis by studying the diversity of a set of 1,548 innate immune genes, and by explicitly comparing them to all autosomal protein-coding genes. We focused on innate rather than adaptive immune genes as individual variation in innate immune genes is not affected by somatic recombination (Flajnik and Kasahara 2010). Using this set of innate immune genes, we find no significant difference in diversity between protein-coding genes involved in innate immunity and all other autosomal protein-coding genes in any present-day or archaic individual. More strikingly, we see no difference in innate immune gene diversity between the older Altai Neandertal and the younger Vindija Neandertal, who lived at least 70,000 years later, as might have been the case if Neandertals were losing important gene diversity over time. A larger number of Neandertal genomes is needed to confirm our results, but with the currently available genomes, we find no evidence to link a specific reduction in innate immune gene diversity to Neandertal extinction. We cannot exclude, though, that the global reduction in genome-wide diversity in archaic humans affected the function of immune-related genes.

As expected from long-term balancing selection, we find that diversity in MHC genes is much higher than the diversity in other autosomal protein-coding genes (Meyer and Thomson 2001; Key, Teixeira, et al. 2014; Bitarello et al. 2018) but, interestingly, this effect is much stronger in archaic than in present-day humans: For archaic humans, we find an ∼47-fold higher diversity in MHC than in the protein-coding background, whereas for present-day humans, MHC diversity is only ∼7-fold higher than in background genes. This signal is driven by very high diversity in the polymorphic MHC class II genes. This is consistent with the analysis of 12 MHC genes by Sullivan et al. (2017) who reported a significantly higher number of nonsynonymous SNPs in Neandertals compared with present-day humans, also at intermediate frequencies, for MHC genes relative to a genome-wide background. From this, they concluded that heterozygote advantage at MHC loci might have been stronger than expected and might have maintained crucial functional variation despite low Ne in Neandertals (Sullivan et al. 2017). Interestingly, the two early modern humans we analyzed here show similar MHC gene diversity to present-day people (fig. 3E), even though their population densities, and therefore likelihoods of pathogen transmission, were presumably more similar to that of archaic humans than to that of present-day humans. This is not completely unexpected as the effective population size of early modern humans was likely higher than that of archaic humans (Fu et al. 2014). However, it contrasts with the unexpectedly high MHC diversity maintained in the archaic genomes.

Although it is difficult to completely rule out that technical artefacts might increase our measure of diversity in the MHC genes, none of our tests shows evidence for misalignments of short ancient DNA reads being responsible for our findings. There are thus two plausible explanations for the pattern of MHC diversity. 1) It could be caused by the old TMRCA in the MHC region (Leffler et al. 2013; Tesicky and Vinkler 2015) and the persistent presence of intermediate-frequency alleles as a consequence of long-term balancing selection (Meyer and Thomson 2001; Key, Teixeira, et al. 2014; Bitarello et al. 2018), resulting in the maintenance of sequence diversity in these genes, as has been reported for targets of long-term balancing selection (Bitarello et al. 2018). 2) It could have been shaped by stronger selective pressures in Neandertals than in modern humans, preventing extensive loss of diversity in these genes, although in populations with small Ne the effects of selection are generally weaker than in populations with larger Ne (Willi et al. 2006; Charlesworth 2009; Hoffmann et al. 2017). A possible mechanism for this would be associative overdominance. In that case, selection against homozygous recessive deleterious alleles in genomic regions could result in overdominance at linked neutral loci, boosting the effects of balancing selection. Associative overdominance has recently been reported to drive maintenance of genetic diversity in experimental small-Ne populations of field-caught Drosophila melanogaster, especially in regions with low recombination rates (Fraser 2017; Schou et al. 2017). This is particularly interesting as recombination rates in the human MHC region are, on average, notably lower than the genome average (International HapMap Consortium 2005; Miretti et al. 2005; de Bakker et al. 2006; Traherne 2008). Theoretically, the increased diversity in the MHC could also be the result of introgression into the archaic hominins. However, we note that 1) gene flow of this magnitude has not been detected to date and 2) if introgression contributed, we would not expect it to strongly affect the gene set as a whole. Therefore, we consider this an unlikely explanation.
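To make the interplay between drift and balancing selection concrete, the following is a minimal Wright-Fisher sketch, offered as an illustration and not as an analysis performed in this study. It assumes direct, symmetric heterozygote advantage at a single biallelic locus (fitnesses 1 − s, 1, 1 − s), which is a simplification: associative overdominance acts through linked deleterious alleles and is not modeled here. The population size, selection coefficient, and function names are arbitrary choices for the example.

```python
import random


def next_freq(p, s):
    """Allele frequency after selection with genotype fitnesses
    AA = 1 - s, Aa = 1, aa = 1 - s (symmetric heterozygote advantage)."""
    q = 1.0 - p
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - s)
    return (p * p * (1 - s) + p * q) / mean_fitness


def simulate_heterozygosity(n_diploid, s, generations, replicates, rng):
    """Mean expected heterozygosity 2p(1-p) across replicate populations,
    each starting at p = 0.5 and evolving under selection plus drift."""
    total = 0.0
    for _ in range(replicates):
        p = 0.5
        for _ in range(generations):
            p_sel = next_freq(p, s)
            # Drift: binomial sampling of 2N gene copies.
            copies = sum(1 for _ in range(2 * n_diploid) if rng.random() < p_sel)
            p = copies / (2 * n_diploid)
            if p in (0.0, 1.0):  # allele lost or fixed
                break
        total += 2 * p * (1 - p)
    return total / replicates


if __name__ == "__main__":
    rng = random.Random(1)
    print("neutral: ", simulate_heterozygosity(50, 0.0, 100, 100, rng))
    print("balanced:", simulate_heterozygosity(50, 0.1, 100, 100, rng))
```

With s = 0 the locus loses heterozygosity through drift alone, whereas heterozygote advantage pulls the allele frequency back toward 0.5 each generation. How much diversity is retained depends on the product of Ne and s, which is why selection is generally expected to be less effective in small populations.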

Future sequencing of additional high-coverage archaic genomes that sample the geographic and temporal distribution of Neandertals will allow questions about the effects of gene diversity on Neandertal fitness to be addressed in greater detail.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.

Acknowledgments

We thank Steffi Grote for help with the GO enrichment analysis, and Bárbara D. Bitarello, Michael Dannemann, Steffi Grote, Benjamin M. Peter, Svante Pääbo, and Joshua M. Schmidt for helpful comments. We are also grateful for the comments of two anonymous reviewers. This work was supported by the Max Planck Society.

Author Contributions

A.M.A. and J.K. designed the study. D.R. analyzed the data. All authors interpreted results. D.R. wrote the manuscript with contributions from F.M.K., A.M.A., and J.K. All authors read and approved the final manuscript.

Literature Cited

  1. Andres AM. 2009. Targets of balancing selection in the human genome. Mol Biol Evol. 26(12):2755–2764.
  2. Bitarello BD, et al. 2018. Signatures of long-term balancing selection in human genomes. Genome Biol Evol. 10:939–955.
  3. Breuer K, et al. 2013. InnateDB: systems biology of innate immunity and beyond – recent updates and continuing curation. Nucleic Acids Res. 41(D1):D1228–D1233.
  4. Castellano S, et al. 2014. Patterns of coding variation in the complete exomes of three Neandertals. Proc Natl Acad Sci U S A. 111(18):6666–6671.
  5. Cavalli-Sforza LL, Feldman MW. 2003. The application of molecular genetic approaches to the study of human evolution. Nat Genet. 33(Suppl):266–275.
  6. Charlesworth B. 2009. Fundamental concepts in genetics: effective population size and patterns of molecular evolution and variation. Nat Rev Genet. 10(3):195–205.
  7. Corazza A, et al. 2004. Properties of some variants of human beta2-microglobulin and amyloidogenesis. J Biol Chem. 279(10):9176–9189.
  8. Dannemann M, Andres AM, Kelso J. 2016. Introgression of Neandertal- and Denisovan-like haplotypes contributes to adaptive variation in human toll-like receptors. Am J Hum Genet. 98(1):22–33.
  9. de Bakker PI, et al. 2006. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 38(10):1166–1172.
  10. Deschamps M, et al. 2016. Genomic signatures of selective pressures and introgression from archaic hominins at human innate immunity genes. Am J Hum Genet. 98(1):5–21.
  11. Enard D, Cai L, Gwennap C, Petrov DA. 2016. Viruses are a dominant driver of protein adaptation in mammals. eLife 5:56.
  12. Esposito G, et al. 2008. The controlling roles of Trp60 and Trp95 in beta2-microglobulin function, folding and amyloid aggregation properties. J Mol Biol. 378(4):887–897.
  13. Flajnik MF, Kasahara M. 2010. Origin and evolution of the adaptive immune system: genetic events and selective pressures. Nat Rev Genet. 11(1):47–59.
  14. Fraser DJ. 2017. Genetic diversity of small populations: not always “doom and gloom”? Mol Ecol. 26(23):6499–6501.
  15. Fu Q, et al. 2014. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514(7523):445–449.
  16. Fu Q, et al. 2016. The genetic history of Ice Age Europe. Nature 534(7606):200–205.
  17. Green RE, et al. 2010. A draft sequence of the Neandertal genome. Science 328(5979):710–722.
  18. Grote S. 2018. GOfuncR: gene ontology enrichment using FUNC. R package version 1.2.0.
  19. Grote S, Prufer K, Kelso J, Dannemann M. 2016. ABAEnrichment: an R package to test for gene set expression enrichment in the adult and developing human brain. Bioinformatics 32(20):3201–3203.
  20. Harris K, Nielsen R. 2016. The genetic cost of Neanderthal introgression. Genetics 203(2):881–891.
  21. Hoffmann AA, Sgro CM, Kristensen TN. 2017. Revisiting adaptive potential, population size, and conservation. Trends Ecol Evol. 32(7):506–517.
  22. Houldcroft CJ, Underdown SJ. 2016. Neanderthal genomics suggests a pleistocene time frame for the first epidemiologic transition. Am J Phys Anthropol. 160(3):379–388.
  23. Huerta-Sanchez E, et al. 2014. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature 512:194–197.
  24. International HapMap Consortium. 2005. A haplotype map of the human genome. Nature 437:1299–1320.
  25. Juric I, Aeschbacher S, Coop G. 2016. The strength of selection against Neanderthal introgression. PLoS Genet. 12(11):e1006340.
  26. Key FM, et al. 2014. Selection on a variant associated with improved viral clearance drives local, adaptive pseudogenization of interferon lambda 4 (IFNL4). PLoS Genet. 10(10):e1004681.
  27. Key FM, Teixeira JC, de Filippo C, Andres AM. 2014. Advantageous diversity maintained by balancing selection in humans. Curr Opin Genet Dev. 29:45–51.
  28. Lazaridis I, et al. 2014. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513(7518):409–413.
  29. Leffler EM, et al. 2013. Multiple instances of ancient balancing selection shared between humans and chimpanzees. Science 339(6127):1578–1582.
  30. Li H. 2011. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27(5):718–719.
  31. Manry J, et al. 2011. Evolutionary genetic dissection of human interferons. J Exp Med. 208(13):2747–2759.
  32. Meyer D, Thomson G. 2001. How selection shapes variation of the human major histocompatibility complex: a review. Ann Hum Genet. 65(Pt 1):1–26.
  33. Meyer M, et al. 2012. A high-coverage genome sequence from an archaic Denisovan individual. Science 338(6104):222–226.
  34. Miretti MM, et al. 2005. A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms. Am J Hum Genet. 76(4):634–646.
  35. Neph S, et al. 2012. BEDOPS: high-performance genomic feature operations. Bioinformatics 28(14):1919–1920.
  36. Prufer K, et al. 2017. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358(6363):655–658.
  37. Prufer K, et al. 2007. FUNC: a package for detecting significant associations between gene sets and ontological annotations. BMC Bioinformatics 8:41.
  38. Prufer K, et al. 2014. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505:43–49.
  39. Quinlan AR. 2014. BEDTools: the Swiss-army tool for genome feature analysis. Curr Protoc Bioinformatics. 47:11.12.11–34.
  40. Quintana-Murci L, Clark AG. 2013. Population genetic tools for dissecting innate immunity in humans. Nat Rev Immunol. 13(4):280–293.
  41. Racimo F, Marnetto D, Huerta-Sanchez E. 2017. Signatures of archaic adaptive introgression in present-day human populations. Mol Biol Evol. 34(2):296–317.
  42. Racimo F, Sankararaman S, Nielsen R, Huerta-Sanchez E. 2015. Evidence for archaic adaptive introgression in humans. Nat Rev Genet. 16(6):359–371.
  43. Reich D, et al. 2010. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468(7327):1053–1060.
  44. Sankararaman S, et al. 2014. The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507(7492):354–357.
  45. Schou MF, Loeschcke V, Bechsgaard J, Schlötterer C, Kristensen TN. 2017. Unexpected high genetic diversity in small populations suggests maintenance by associative overdominance. Mol Ecol. 26(23):6510–6523.
  46. Sullivan AP, Manuel M, Marques-Bonet T, Perry GH. 2017. An evolutionary medicine perspective on Neandertal extinction. J Hum Evol. 108:62–71.
  47. R Core Team. 2016. R: a language and environment for statistical computing. Version 3.3.1. Vienna (Austria): R Foundation for Statistical Computing.
  48. Tesicky M, Vinkler M. 2015. Trans-species polymorphism in immune genes: general pattern or MHC-restricted phenomenon? J Immunol Res. 2015:838035.
  49. Traherne JA. 2008. Human MHC architecture and evolution: implications for disease association studies. Int J Immunogenet. 35(3):179–192.
  50. Willi Y, Van Buskirk J, Hoffmann AA. 2006. Limits to the adaptive potential of small populations. Ann Rev Ecol Evol Syst. 37(1):433–458.
  51. Wolff H, Greenwood AD. 2010. Did viral disease of humans wipe out the Neandertals? Med Hypotheses. 75(1):99–105.
  52. Yates A, et al. 2016. Ensembl 2016. Nucleic Acids Res. 44(D1):D710–D716.

Articles from Genome Biology and Evolution are provided here courtesy of Oxford University Press