Significance
There is much interest in understanding the role of natural selection in shaping physiological adaptations to climate, diet, and diseases in humans. We investigated this issue by analyzing genomic data from Native American populations inhabiting different ecological regions and ancient Native Americans. We found signals of natural selection at the fatty acid desaturases (FADS) genes not only in an Arctic population, as was previously found, but throughout the Americas, suggesting a single and strong adaptive event that occurred in Beringia, before the range expansion of the first Americans within the American continent and Greenland.
Keywords: peopling of America, natural selection, genetics, first Americans
Abstract
When humans moved from Asia toward the Americas over 18,000 y ago and eventually peopled the New World, they encountered a new environment with extreme climate conditions and distinct dietary resources. These environmental and dietary pressures may have led to instances of genetic adaptation with the potential to influence the phenotypic variation in extant Native American populations. An example of such an event is the evolution of the fatty acid desaturases (FADS) genes, which have been claimed to harbor signals of positive selection in Inuit populations due to adaptation to the cold Greenland Arctic climate and to a protein-rich diet. Because there was evidence of intercontinental variation in this genetic region, with indications of positive selection for its variants, we decided to compare the Inuit findings with other Native American data. Here, we use several lines of evidence to show that the signal of positive selection at FADS is not restricted to the Arctic but instead is broadly observed throughout the Americas. The shared signature of selection among populations living in such a diverse range of environments is likely due to a single and strong instance of local adaptation that took place in the common ancestral population before its entrance into the New World. These first Americans peopled the whole continent and spread this adaptive variant across a diverse set of environments.
Human history in America begins about 18,000–15,000 y before the present (yBP) (1, 2), when the first Americans entered the New World and migrated southward, peopling all latitudes of the continent. However, the history of this human group begins well before this date, before the initial colonization. Around 25,000 yBP, the Last Glacial Maximum (LGM) drastically reduced sea levels. In the region where we currently find the Bering Strait, this drop in sea level exposed an unglaciated land extension known as the Bering land bridge, which joined Northeast Asia and North America, forming Beringia (3–5). Most of the studies on the peopling of the Americas agree that the genetic differentiation of the current Native American populations probably occurred in this area during a standstill in Beringia (6–9), where migrants from different regions in Asia had arrived about 23,000 yBP, inhabiting it for a few thousand years (5,000–8,000 y) (1).
Greenlandic Inuit are one of the descendant populations of the first Americans. A recent study analyzing 196,725 SNPs in 191 individuals from this population proposed that they have experienced genetic and physiological adaptations to a cold climate, acquired through long-term residence in the extreme conditions of the Arctic and subsistence on a diet rich in protein and fatty acids (10). Using an approach based on a measure of genetic differentiation (FST), these authors provided evidence for positive selection at several sites in the FADS genes on chromosome 11 (FADS1, FADS2, and FADS3), which are involved in the metabolism of omega-3 polyunsaturated fatty acids (PUFAs).
A previous investigation (11) also studied FADS1 and FADS2, with different SNPs from those of ref. 10, and found an association between one haplotype (D) and PUFA metabolism. This haplotype showed considerable intercontinental variation in its frequency, possibly indicating some level of local adaptation. Moreover, the observation that the climate and dietary resources in Beringia during the LGM were probably similar to those in Greenland (4) suggested the possibility that this adaptive haplotype originated in a population ancestral to both Inuit and Native Americans. For this reason, we analyzed sequence variation in FADS genes in Native American populations from Central and South America and searched for sites in FADS genes implicated in cold adaptation.
Results and Discussion
Genomic Screening for Mutations with Large Continental Frequency Differential.
We first compared Native Americans (12) to other continental populations (Africans, Europeans, and East Asians; Table S1) to identify alleles that had high frequencies in indigenous American populations and low frequencies elsewhere. We found three significant SNPs (Fisher’s exact test with Bonferroni correction), two of which lay within FADS2 (rs174570, P < 0.0001 and rs174556, P < 0.001) and another located on chromosome 22 (rs13054099, P < 0.0000001) in SLC25A17 (Fig. 1). The latter gene encodes a peroxisomal membrane protein that belongs to the family of mitochondrial solute carriers. It is expressed in liver and is likely involved in membrane transport (13). Thus far, there is no a priori evidence that it plays a role in adaptation to diet or temperature.
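The screening step above can be illustrated with a minimal sketch of a two-sided Fisher's exact test on a 2 × 2 table of allele counts, followed by a Bonferroni adjustment. The allele counts and the number of tests used below are hypothetical placeholders, not values from the study, and the function names are illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical counts: (derived, ancestral) alleles in Native Americans
# versus all other continental groups pooled.
p_raw = fisher_exact_two_sided(95, 5, 20, 80)
# Assumption: one test per SNP in dataset 1 (374,470 SNPs).
p_adj = bonferroni(p_raw, n_tests=374_470)
```

The adjusted value is simply the raw P value multiplied by the number of tests, so only very extreme frequency differences survive a genome-wide correction.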
Table S1.
Population | Region | Sample size |
Illumina | ||
CHB | East Asian | 83 |
Dai | East Asian | 10 |
Daur | East Asian | 9 |
Han | East Asian | 33 |
Han-N_China | East Asian | 10 |
Hezhen | East Asian | 8 |
Japanese | East Asian | 28 |
JPT | East Asian | 85 |
Miao | East Asian | 10 |
Mongolia | East Asian | 7 |
Mongolia | East Asian | 8 |
Naxi | East Asian | 8 |
Oroqen | East Asian | 9 |
She | East Asian | 10 |
Sindhi | East Asian | 1 |
Tu | East Asian | 10 |
Tujia | East Asian | 10 |
Xibo | East Asian | 9 |
Yi | East Asian | 10 |
French | European | 28 |
Adygei | European | 17 |
Basque | European | 24 |
CEU | European | 108 |
Italia | European | 12 |
Orcadian | European | 15 |
Sardinian | European | 28 |
TSI | European | 88 |
Tuscan | European | 8 |
Inuit | Greenland | 13 |
Aleutian | Aleutian islands | 3 |
Altaian | Siberian | 12 |
Buryat | Siberian | 17 |
Chukchi | Siberian | 30 |
Dolgan | Siberian | 4 |
Evenki | Siberian | 15 |
Ket | Siberian | 2 |
Khanty | Siberian | 35 |
Koryak | Siberian | 10 |
Naukan | Siberian | 16 |
Nganasan1 | Siberian | 23 |
Selkup | Siberian | 9 |
Tundra_Nentsi | Siberian | 3 |
Yakut | Siberian | 34 |
Yukaghir | Siberian | 12 |
Algonquin | Native American | 5 |
Arara | Native American | 1 |
Arhuaco | Native American | 5 |
Aymara | Native American | 23 |
Bribri | Native American | 4 |
Cabecar | Native American | 31 |
Chane | Native American | 2 |
Chipewyan | Native American | 15 |
Chono | Native American | 4 |
Chorotega | Native American | 1 |
Cree | Native American | 4 |
Diaguita | Native American | 5 |
Embera | Native American | 5 |
Guahibo | Native American | 6 |
Guarani | Native American | 6 |
Guaymi | Native American | 5 |
Huetar | Native American | 1 |
Huilliche | Native American | 4 |
Inga | Native American | 9 |
Jamamadi | Native American | 1 |
Kaingang | Native American | 2 |
Kaqchikel | Native American | 12 |
Karitiana | Native American | 13 |
Kogi | Native American | 4 |
Maleku | Native American | 3 |
Maya1 | Native American | 18 |
Mixe | Native American | 17 |
Mixtec | Native American | 5 |
Ojibwa | Native American | 4 |
Palikur | Native American | 3 |
Parakanã | Native American | 1 |
Piapoco | Native American | 6 |
Purepecha | Native American | 1 |
Quechua | Native American | 40 |
Surui | Native American | 24 |
Teribe | Native American | 3 |
Ticuna | Native American | 6 |
Toba | Native American | 4 |
Waunana | Native American | 3 |
Wayuu | Native American | 11 |
Wichi | Native American | 5 |
Yaghan | Native American | 4 |
Yaqui | Native American | 1 |
Zapotec1 | Native American | 22 |
Affymetrix | ||
Dai | East Asian | 10 |
Daur | East Asian | 9 |
Han | East Asian | 34 |
Han_Nchina | East Asian | 10 |
Hezhen | East Asian | 9 |
Japanese | East Asian | 29 |
Miao | East Asian | 10 |
Mongolia | East Asian | 10 |
Naxi | East Asian | 9 |
Oroqen | East Asian | 9 |
She | East Asian | 10 |
Tu | East Asian | 10 |
Tujia | East Asian | 10 |
Xibo | East Asian | 9 |
Yi | East Asian | 10 |
Adygei | European | 17 |
Basque | European | 22 |
French | European | 28 |
Italian | European | 13 |
Orcadian | European | 13 |
Sardinian | European | 28 |
Tuscan | European | 8 |
Apalai | Native American | 4 |
Arara | Native American | 4 |
Guarani_GN | Native American | 7 |
Guarani_KW | Native American | 10 |
Karitiana | Native American | 4 |
Surui | Native American | 4 |
Urubu Kaapor | Native American | 3 |
Xavante | Native American | 11 |
Zoró | Native American | 1 |
Genetic Signatures of Natural Selection.
Genes showing population-specific changes in allele frequency, such as FADS2 and SLC25A17 in Native Americans, are potential candidates for revealing adaptive evolution. We thus focused our subsequent analyses on chromosomes 11 and 22. To evaluate whether we could identify natural selection as the cause for these extreme allele frequency changes, we applied three cross-population test statistics to two different datasets with 374,470 (dataset 1, ref. 12) and 593,142 (dataset 2, ref. 14) SNPs, respectively. Two tests are FST-based, involving the comparison of differentiation between Native Americans, Europeans, and East Asians [the population branch statistic (PBS) (15) and a Bayesian method implemented in BayeScan (16)]. To formally test whether natural selection underlies the cases of extreme differentiation, PBS values were compared against those obtained with neutral coalescent simulations generated according to a plausible demographic scenario for the peopling of the New World (17). BayeScan corrects for demographic biases by differentiating the proportion of the FST variation that is locus-specific (due to natural selection) from that which is population-specific (due to demography) (16). In addition to these tests, we also performed the cross-population extended haplotype homozygosity test (XP-EHH) (18) between pairs of populations.
Fig. S1 A and B show the distribution of PBS statistics for the Native American branch across the extension of chromosome 11 in each dataset. The most extreme PBS value in dataset 1 corresponds to rs174570 (Fig. S1A), the same SNP that was highly significant in our preliminary analysis of extreme allele frequency differences across continents (Fig. 1). This result demonstrated that the SNP detection was consistent across typing platforms and different methods. In addition, this SNP was one of the sites found under positive selection in the Greenlandic Inuit (10).
For dataset 2, we found 15 SNPs with highly significant PBS values (Fig. S1B), three of them located in FADS genes (rs74771917, rs7115739, and rs174570). Neutral coalescent simulations indicated that these deviations were statistically significant (P < 0.0001; Fig. S2), consistent with the action of positive selection, as opposed to genetic drift, in increasing the frequency of the derived allele in Native American populations. The other SNPs showing signals of natural selection in Native Americans were located at, or near, genes related to metabolism (Table S2). The BayeScan analysis corroborated all of the aforementioned findings, especially regarding selection on FADS genes (Fig. S1 C and D). However, for the rs13054099 SNP on chromosome 22 we found weak or no evidence for selection with all methods.
Table S2.
Chromosome | Position | rs | Ancestral | Derived | DAF EUR | DAF EAS | DAF NAM | PBS | Gene | Associated disease |
Illumina | ||||||||||
11 | 61353788 | rs174570 | C | T | 0.1113 | 0.3683 | 0.9973 | 1.3018 | FADS2 | Metabolic syndrome |
11 | 61337211 | rs174556 | C | T | 0.2393 | 0.3612 | 0.9973 | 1.0207 | MIR1908(-1.997kb); FADS1(0); FADS2(-3.039kb) | Metabolic syndrome |
11 | 133087432 | rs4936203 | G | A | 0.0777 | 0.0904 | 0.7280 | 1.0070 | — | |
11 | 12128635 | rs9804570 | T | C | 0.7423 | 0.8853 | 0.1296 | 1.0043 | MICAL2 | Asthma/visceral fat |
11 | 94230937 | rs4309121 | T | C | 0.1662 | 0.2875 | 0.91398 | 0.9873 | AMOTL1 | Psychiatric disorder/ADH |
11 | 79818081 | rs7111535 | A | G | 0.8807 | 0.8042 | 0.1791 | 0.9551 | — | |
11 | 61326406 | rs174546 | T | C | 0.7180 | 0.6346 | 0.0027 | 0.9493 | TMEM258(+9.745kb); MIR611(+9.797kb); FEN1(+5.116kb); FADS1(0) | Metabolic syndrome |
11 | 79810048 | rs12277775 | G | A | 0.8857 | 0.7989 | 0.1818 | 0.9475 | — | |
11 | 61354548 | rs1535 | G | A | 0.7134 | 0.6289 | 0.0027 | 0.9348 | ELF5(-0.262kb); CAT(+6.472kb) | β2-glycoprotein I (β2-GPI) plasma levels/cataracts in type 2 diabetes |
11 | 11779270 | rs4910420 | G | A | 0.1296 | 0.0907 | 0.7434 | 0.9289 | — | |
11 | 61396955 | rs174449 | G | A | 0.6509 | 0.7705 | 0.0410 | 0.9250 | MIR6746(-5.308kb); FADS3(-0.615kb); FADS2(+5.553kb) | Metabolic syndrome |
11 | 61314379 | rs102275 | C | T | 0.7018 | 0.6317 | 0.0027 | 0.9228 | TMEM258(0); MYRF(+1.813kb); MIR611(-2.163kb); FEN1(-2.305kb); FADS1(-9.293kb) | Metabolic syndrome |
11 | 61306034 | rs174534 | A | G | 0.2957 | 0.3725 | 0.9944 | 0.9065 | TMEM258(-7.143kb); MYRF(0) | Metabolic syndrome |
11 | 68613614 | rs1551310 | T | C | 0.0046 | 0.4504 | 0.8413 | 0.8981 | TPCN2(0); MIR3164(+6.312kb) | Prostate cancer/hair color |
11 | 13715365 | rs1866948 | T | C | 0.1128 | 0.0213 | 0.6489 | 0.8762 | FAR1(+4.896kb) | Bone mineral density |
Affymetrix | ||||||||||
11 | 61627960 | rs74771917 | C | T | 0.0388 | 0.1835 | 0.9479 | 1.7484 | FADS2 | Metabolic syndrome |
11 | 12071129 | rs4910436 | C | T | 0.0310 | 0.0053 | 0.7396 | 1.7148 | LOC105376554 | Visceral fat |
11 | 61641717 | rs7115739 | G | T | 0.0433 | 0.1962 | 0.9457 | 1.6692 | MIR6746(-3.97kb); FADS3(0); FADS2(+6.891kb) | Metabolic syndrome |
11 | 12016528 | rs952932 | C | T | 0.0349 | 0.0053 | 0.6771 | 1.4705 | DKK3 | Visceral fat |
11 | 43903391 | rs33984031 | T | G | 0.0194 | 0.0293 | 0.6458 | 1.2962 | ALKBH3 | Body mass index |
11 | 43902894 | rs3740984 | G | A | 0.0194 | 0.0293 | 0.6458 | 1.2962 | ALKBH3 | Body mass index |
11 | 43895919 | rs34319180 | C | T | 0.0194 | 0.0293 | 0.6458 | 1.2962 | ALKBH3 | Body mass index |
11 | 43895827 | rs34529902 | C | T | 0.9806 | 0.9707 | 0.3542 | 1.2962 | ALKBH3 | Body mass index |
11 | 120155118 | rs11217788 | G | A | 0.1016 | 0.1941 | 0.8958 | 1.2258 | POU2F3 | Psoriasis/intraocular pressure/corneal curvature |
11 | 69593176 | rs11263566 | A | G | 0.0233 | 0.1064 | 0.7292 | 1.2221 | FGF4 | Serum urate levels by body-mass-index/type 1 diabetes/IgG glycosylation/breast cancer |
11 | 69601039 | rs11263574 | T | G | 0.0234 | 0.1064 | 0.7292 | 1.2199 | — | |
11 | 43217978 | rs17500007 | A | G | 0.0543 | 0.0000 | 0.5833 | 1.1376 | — | |
11 | 61597212 | rs174570 | C | T | 0.1008 | 0.4149 | 0.9896 | 1.1373 | FADS2 | Metabolic syndrome |
11 | 43893782 | rs12418820 | G | T | 0.0195 | 0.0585 | 0.6304 | 1.1277 | ALKBH3 | Body mass index |
11 | 45192009 | rs12362992 | G | A | 0.2791 | 0.0775 | 0.8854 | 1.1073 | PRDM11 | IgG glycosylation/thyroid hormone levels/forced vital capacity |
Putatively Selected Haplotype in Extant and Ancient Americans.
To investigate whether this signal of natural selection could be the result of adaptation to the conditions encountered by the ancestors of first Americans in Beringia, we compared the genotypes of living Native Americans and Inuit to those of four ancient humans. These included Saqqaq, a Paleo-Eskimo from Greenland who lived ∼4,000 yBP (19); Anzick-1, an individual belonging to the classical North American Clovis culture who lived ∼12,500 yBP (20); the Mal’ta boy, who lived in Siberia ∼24,000 yBP (22); and the Ust’-Ishim man, who lived in Siberia ∼45,000 yBP (21) (Table 1). The haplotype found in the homozygous state in the Anzick-1 individual, who is a representative of the first Americans or close to them, is the same as the one present at high frequencies in extant Native Americans from diverse locations of the continent. The Paleo-Eskimo had a haplotype different from that present in Native Americans or the Inuit. The high frequency of the putatively selected FADS haplotype and its shared distribution in living and ancient Native Americans (Table S3) is therefore consistent with a scenario of intense selection during the Beringian standstill.
Table 1.
Genotypes composing the putatively selected FADS haplotype are shown in red. Double asterisks indicate that the genotypes could not be identified in the sequences examined. Y chr, Y chromosome.
Table S3.
Population | Absolute frequency | No. of chromosomes | Relative frequency |
Siberia | |||
Aleutian | 6 | 6 | 1.000 |
Altaian | 20 | 24 | 0.833 |
Buryat | 16 | 34 | 0.471 |
Chukchi | 49 | 60 | 0.817 |
Dolgan | 6 | 8 | 0.750 |
Evenki | 15 | 30 | 0.500 |
Ket | 2 | 4 | 0.500 |
Khanty | 56 | 70 | 0.800 |
Koryak | 14 | 20 | 0.700 |
Naukan | 30 | 32 | 0.938 |
Nganasan1 | 23 | 46 | 0.717 |
Selkup | 13 | 18 | 0.722 |
Tundra_Nentsi | 6 | 6 | 1.000 |
Yakut | 41 | 68 | 0.603 |
Yukaghir | 14 | 24 | 0.583 |
Greenland | |||
Inuit | 26 | 26 | 1.000 |
North and Central America | |||
Algonquin | 10 | 10 | 1.000 |
Chipewyan | 23 | 30 | 0.767 |
Cree | 8 | 8 | 1.000 |
Diaguita | 10 | 10 | 1.000 |
Kaqchikel | 24 | 24 | 1.000 |
Maya1 | 34 | 36 | 0.944 |
Mixe | 34 | 34 | 1.000 |
Mixtec | 4 | 10 | 0.400 |
Ojibwa | 6 | 8 | 0.750 |
Purepecha | 2 | 2 | 1.000 |
Yaqui | 2 | 2 | 1.000 |
Zapotec1 | 42 | 44 | 0.955 |
Average | 199 | 218 | 0.913 |
South America | |||
Arara | 2 | 2 | 1.000 |
Arhuaco | 8 | 10 | 0.800 |
Aymara | 43 | 46 | 0.935 |
Bribri | 7 | 8 | 0.875 |
Cabecar | 57 | 62 | 0.919 |
Chane | 4 | 4 | 1.000 |
Chono | 8 | 8 | 1.000 |
Chorotega | 2 | 2 | 1.000 |
Embera | 10 | 10 | 1.000 |
Guahibo | 12 | 12 | 1.000 |
Guarani | 12 | 12 | 1.000 |
Guaymi | 10 | 10 | 1.000 |
Huetar | 2 | 2 | 1.000 |
Huilliche | 8 | 8 | 1.000 |
Inga | 16 | 18 | 0.889 |
Jamamadi | 2 | 2 | 1.000 |
Kaingang | 4 | 4 | 1.000 |
Karitiana | 26 | 26 | 1.000 |
Kogi | 8 | 8 | 1.000 |
Maleku | 6 | 6 | 1.000 |
Palikur | 6 | 6 | 1.000 |
Parakanã | 2 | 2 | 1.000 |
Piapoco | 10 | 12 | 0.833 |
Quechua | 79 | 80 | 0.988 |
Suruí | 48 | 48 | 1.000 |
Teribe | 6 | 6 | 1.000 |
Ticuna | 12 | 12 | 1.000 |
Toba | 8 | 8 | 1.000 |
Waunana | 5 | 6 | 0.833 |
Wayuu | 20 | 22 | 0.909 |
Wichi | 10 | 10 | 1.000 |
Yaghan | 8 | 8 | 1.000 |
Average | 461 | 480 | 0.960 |
During the LGM, Beringia offered a feasible home for the Asian migrants because its ecological conditions (4) furnished a relatively isolated place for their differentiation. We hypothesized that a putative selection event occurred after migrants from northeastern Siberia (23), as well as other Asian regions, reached Beringia. Extant native Siberians show lower frequencies (average 69%) of the putatively selected FADS haplotype than Native Americans (Fig. 2 and Table S3). Cardona et al. (24) surveyed the entirety of chromosome 11 in Siberian populations and found no indication of selection at the FADS region. Therefore, the presence of the putatively selected FADS haplotype in Siberia might have resulted from subsequent gene flow between Native Americans and Siberian populations occurring after the Beringian event. Note that the two ancient Siberians (Mal’ta and Ust’-Ishim) had haplotypes distinct from that found in the ancient Anzick-1 Native American individual but similar to the one found in the Saqqaq Paleo-Eskimo (Table 1).
Several studies have suggested that all Paleo-Eskimos arose from a single distinct migration that occurred around 6,000 yBP (19, 25, 26). It would therefore be expected that the selected alleles found in extant Inuit should also be found in ancient Eskimo genomes, especially if selection had occurred in this specific human group due to the extreme weather conditions and particular diet. The observed difference suggests that another explanation should be considered.
The vast majority (95%) of Native Americans have the putatively selected haplotype, which is found at high frequency in the Inuit from Greenland (Fig. 2). Considering the whole continent, there are slight differences in haplotype frequencies between south (80–100%, average 96%) and north and central (40–100%, average 91.3%) Native Americans (Table S3). Such variation could be due to later gene flow with ancient Siberian populations, subsequent to the initial peopling of America (12), or to stochastic factors. The high frequencies occur despite marked differences in lifestyles and diets of the different indigenous populations. Amazonian hunter-gatherers have highly variable diets, mainly composed of fruits, roots, and small mammals, whereas agriculturalists from Mesoamerica and the Andean region eat food based on crop agriculture (27) (Table S4). By contrast, Arctic populations such as the Inuit have diets that rely on hunting marine mammals and have animal fat as the main source of nutrients (10).
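As a quick arithmetic check of the regional averages quoted above, the following sketch pools the regional totals listed in Table S3 (haplotype copies over chromosomes sampled). It assumes the averages are pooled over chromosomes rather than averaged across populations, which is what the totals in Table S3 suggest.

```python
# (haplotype copies, chromosomes sampled) from the regional totals in Table S3.
regions = {
    "North and Central America": (199, 218),
    "South America": (461, 480),
}

def pooled_frequency(counts):
    """Frequency obtained by pooling chromosomes across groups."""
    counts = list(counts)
    copies = sum(c for c, _ in counts)
    chromosomes = sum(n for _, n in counts)
    return copies / chromosomes

per_region = {name: c / n for name, (c, n) in regions.items()}
overall = pooled_frequency(regions.values())

for name, freq in per_region.items():
    print(f"{name}: {freq:.3f}")
print(f"All Native Americans: {overall:.3f}")
```

Pooling by chromosomes weights large samples such as Quechua more heavily than single-individual samples, which an unweighted mean of population frequencies would not do.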
Table S4.
Pairs of comparisons | P value |
AA × MA | 0.3521 |
AA × SA | 0.9196 |
AA × RF | 0.5217 |
MA × SA | 0.1194 |
MA × RF | 0.2420 |
SA × RF | 0.6089 |
Groups were compared using the χ2 test with Yates correction (IBM SPSS Statistics). Populations are as follows. AA: Aymará, Huilliche, Inga, and Quechua. MA: Cabecar, Guaymi, Kaqchikel, Maya, Mixe, Mixtec, Purepecha, and Zapotec. SA: Arara, Arhuaco, Bribri, Chane, Chorotega, Embera, Guahibo, Guarani, Huetar, Jamamadi, Kaingang, Karitiana, Kogi, Maleku, Palikur, Parakanã, Piapoco, Suruí, Teribe, Ticuna, Toba, Waunana, Wichi, and Yaghan. RF: Inuit and Chono (29). AA, Andean agriculturalists; MA, Mesoamerican agriculturalists; RF, rich-fat consumers; SA, South American hunter-gatherers.
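The kind of comparison summarized in Table S4 can be sketched as a χ2 test with Yates continuity correction on a 2 × 2 table. The sketch below is not the SPSS procedure itself; it assumes the test was run on chromosome counts pooled from Table S3, and uses the AA and MA groups as an example. Under that assumption the result comes out close to the P value listed for AA × MA.

```python
from math import erfc, sqrt

def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if 0 in (r1, r2, c1, c2):
        raise ValueError("every row and column total must be positive")
    # Yates correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / (r1 * r2 * c1 * c2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Pooled Table S3 counts: (selected-haplotype copies, other chromosomes).
aa = (146, 6)    # Aymara, Huilliche, Inga, Quechua
ma = (207, 15)   # Cabecar, Guaymi, Kaqchikel, Maya, Mixe, Mixtec, Purepecha, Zapotec
chi2, p = chi2_yates_2x2(*aa, *ma)
print(f"chi2 = {chi2:.4f}, P = {p:.4f}")
```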
The extant Eskimo-Aleut genomes are a result of the encounter of the ancestors of the first Americans with subsequent, and more recent, streams of Asian gene flow (12). Multidisciplinary models of American settlement also suggested that Eskimos are descendants of ancient Beringians, although later gene flow from Asia introduced some specific genetic and phenotypic traits into these groups (28, 29). Given the aforementioned pattern of genetic variation in the ancient individuals, the fact that our results for the XP-EHH test do not show differentiation between Greenlandic Inuit and Native Americans in the FADS region (XP-EHH = 1.13218) implies that the signal of natural selection found in the Inuit is a result of their shared ancestry with the first Americans, which reflects adaptive selection in that region of the genome.
Conclusions
Our results strongly support the hypothesis that the signal of positive selection at FADS is not restricted to the Arctic. We found a signature of natural selection at FADS loci in all 53 Native American populations studied here, including a strong signal in Amazonian populations that are extremely different from the Inuit regarding culture, environment, and diet. In our opinion, this genetic profile suggests a single and very strong adaptive event that occurred in Beringia, before the range expansion of the ancestors of the first Americans within the American continent and Greenland. This event was likely related to the metabolic adaptations to diet and cold weather required to subsist during the glacial and periglacial conditions that existed during the Beringian standstill (30).
Materials and Methods
Populations.
We analyzed data from 349 individuals from 44 Native American populations previously published by Reich et al. (12), as well as information on 48 individuals from nine Native American populations published by Skoglund et al. (14). These datasets were called dataset 1 and dataset 2, respectively. Detailed population information can be found in Table S1. In addition, data from 225 Asian individuals and 328 Europeans from the Human Genome Diversity Project (www.hagsc.org/hgdp/) were evaluated. The data were curated in the same way as reported by Laurie et al. (31). We computationally phased these data using the SHAPEIT software (32) with default parameters.
PBS Test.
PBS analysis was performed as described in ref. 15. For each SNP, an FST value was estimated between pairs of populations [Native Americans (NAM), East Asians (EAS), and Utah residents of European ancestry (CEU)], using Reynolds et al.’s (33) estimator. Sites that were nonpolymorphic in at least two populations were excluded. PBS was estimated for NAM relative to EAS, using CEU as an outgroup. The analyses were done with individual SNPs and with 20-SNP windows overlapping by 5 SNPs, as implemented by Fumagalli et al. (10).
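The structure of the PBS calculation can be sketched as follows. Each pairwise FST is converted to a branch length T = −log(1 − FST), and the Native American branch is PBS = (T_NAM,EAS + T_NAM,CEU − T_EAS,CEU)/2 (15). For brevity the sketch uses a simplified heterozygosity-based FST, which is a stand-in and not the Reynolds et al. estimator used in the analysis, so it is not expected to reproduce the PBS values in Table S2.

```python
from math import log

def fst_simple(p1, p2):
    """Simplified two-population FST from allele frequencies (stand-in estimator)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # site is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total

def branch_length(fst, cap=0.999999):
    """T = -log(1 - FST); the cap avoids log(0) when FST reaches 1."""
    return -log(1 - min(fst, cap))

def pbs(p_nam, p_eas, p_ceu):
    """Population branch statistic for the NAM branch."""
    t_nam_eas = branch_length(fst_simple(p_nam, p_eas))
    t_nam_ceu = branch_length(fst_simple(p_nam, p_ceu))
    t_eas_ceu = branch_length(fst_simple(p_eas, p_ceu))
    return (t_nam_eas + t_nam_ceu - t_eas_ceu) / 2

# Derived allele frequencies of rs174570 in dataset 1, as listed in Table S2.
value = pbs(p_nam=0.9973, p_eas=0.3683, p_ceu=0.1113)
print(f"PBS (simplified FST) = {value:.3f}")
```

A large PBS arises when the focal population is strongly differentiated from both comparison populations while those two remain similar to each other.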
Neutral Coalescent Simulations.
Simulations were performed using the demographic model for the peopling of the New World inferred by Gutenkunst et al. (17). In short, these authors modeled the Eurasian divergence from the African population, with subsequent divergence of the Asian population from the European, at ∼26 kya. The Americas were peopled by a subset of the Asian population with 800 individuals that went through exponential growth after entering the New World ∼22 kya. Gene flow from Europeans to both Asian and Native American populations was allowed to happen, because the genetic dataset used to infer this model comprised Mexican individuals who are known to have some European genetic contribution (17). This approach allowed us to correct for possible bias due to genetic admixture, and indeed the generation time and population size estimates are in agreement with the literature (1). More details about this scenario can be found in figure 3 and table 2 from ref. 17.
We implemented our simulations according to this scenario in ms (34) with the following command line:
ms Ntot 10000 -s 1 -I 4 0 NEuro NAsia NAmer -n 1 1.682020 -n 2 2.424020 -n 3 4.185850 -n 4 7.942130 -eg 0 2 67.978337 -eg 0 3 109.406463 -eg 0 4 147.474095 -ema 0 4 x 0 0 0 0 x 3.960400 0 0 3.960400 x 0 0 0 0 x -ej 0.029475 4 3 -ema 0.029475 4 x 0.881098 0.561966 x 0.881098 x 3.960400 x 0.561966 3.960400 x x -ej 0.036114 3 2 -en 0.036114 2 0.287184 -ema 0.036114 4 x 7.293140 x x 7.293140 x x -ej 0.197963 2 1 -en 0.303500 1 1.
The total sample size (Ntot) and the sample sizes per group (NEuro, NAsia, and NAmer) varied between datasets: 904, 328, 358, and 218 for dataset 1 and 365, 129, 188, and 48 for dataset 2, respectively. We ran 10,000 simulations under this scenario to generate a null distribution of PBS values against which the empirical PBS values were compared. This procedure allowed us to evaluate the significance of the differences and to identify outlier PBS values.
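Comparing an empirical PBS value against the simulated null distribution amounts to an empirical P value. A minimal sketch follows; the add-one convention is an assumption made here so that the estimate is never exactly zero, and the null values are synthetic placeholders rather than ms output.

```python
def empirical_p_value(observed, null_values):
    """Share of simulated values at least as large as the observed one.

    The +1 in numerator and denominator is an assumed convention that keeps
    the estimate strictly positive.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# Synthetic stand-in for 10,000 simulated PBS values under neutrality.
null_pbs = [i / 10_000 for i in range(10_000)]
p = empirical_p_value(1.3018, null_pbs)  # observed PBS of rs174570, Table S2
print(f"empirical P = {p:.6f}")
```

With 10,000 simulations the smallest attainable value under this convention is 1/10,001, so an observed statistic that exceeds every simulated value yields P just below 0.0001.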
Detecting SNPs Under Positive Selection with a Bayesian Approach.
Two versions of BayeScan (16, 35) were used to identify candidate targets for positive selection. Both versions assume an island model in which the subpopulations’ allele frequencies are correlated through a common migrant gene pool from which they may differ in varying degrees. Based on this model, FST is calculated and decomposed into a population-specific component (β), shared by all loci, and a locus-specific component (α), shared by all populations (see refs. 16 and 35 and the corresponding manuals for details). Significant, positive values of α indicate an overly large level of differentiation of a given SNP, given the genomic average specific for each population (the β part of the FST mentioned above), indicating positive selection. With the older version of BayeScan we were able to identify outlier loci, and with the new version we identified those that were outliers in the Native American population, assigning posterior probabilities for each marker as being under positive selection. These posterior probabilities were then transformed into q-values to control for the false discovery rate.
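One common way to turn posterior probabilities into q-values is sketched below. It is an assumption about the transformation, not code taken from BayeScan: loci are ranked by decreasing posterior probability, and the q-value of a locus is the mean posterior error (1 − P) over all loci ranked at or above it, that is, the expected false discovery rate if that locus were the significance cutoff.

```python
def q_values(posterior_probs):
    """q-value per locus from posterior probabilities of selection."""
    order = sorted(range(len(posterior_probs)),
                   key=lambda i: posterior_probs[i], reverse=True)
    q = [0.0] * len(posterior_probs)
    running_error = 0.0
    for rank, idx in enumerate(order, start=1):
        # Accumulate the posterior probability of being a false positive.
        running_error += 1.0 - posterior_probs[idx]
        q[idx] = running_error / rank
    return q

# Hypothetical posterior probabilities for three SNPs.
example = q_values([0.99, 0.50, 0.90])
print([round(v, 4) for v in example])
```

Declaring significant every locus with a q-value below, for example, 0.05 then keeps the expected proportion of false positives among the reported loci at or below 5%.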
A sliding-window approach was followed to generate Manhattan plots with the distribution of the q-values along chromosome 11. Windows were 500 kb wide and were shifted by 25 kb at each step. The q-value associated with each window was taken as the 95% quantile of the q-values calculated for all SNPs in the window.
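The windowing scheme described above can be sketched as follows. The quantile definition (nearest rank) and the choice to skip empty windows are assumptions, and the positions and q-values in the example are hypothetical.

```python
from math import ceil

def window_q_values(positions, q_values, window=500_000, step=25_000, quantile=0.95):
    """Summarize per-SNP q-values in sliding windows along one chromosome.

    Returns (window_start, window_end, q) tuples for non-empty windows.
    """
    snps = sorted(zip(positions, q_values))
    results = []
    if not snps:
        return results
    last_position = snps[-1][0]
    start = 0
    while start <= last_position:
        in_window = sorted(q for pos, q in snps if start <= pos < start + window)
        if in_window:
            # Nearest-rank quantile (assumed definition).
            k = max(0, ceil(quantile * len(in_window)) - 1)
            results.append((start, start + window, in_window[k]))
        start += step
    return results

# Hypothetical SNP positions (bp) and q-values.
windows = window_q_values([10_000, 30_000, 600_000], [0.2, 0.8, 0.05])
print(windows[0], windows[-1], len(windows))
```

Because consecutive windows overlap by 475 kb, each SNP contributes to up to 20 windows, which smooths the resulting Manhattan plot.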
Note that this approach slightly differs from the previous one (PBS). The different implementations repeated for each dataset were intended to eliminate biases related to a specific method, its implementation, and also any bias related to the genetic data.
Variation in Ancient Humans.
We checked the genotypes at FADS SNPs in the ancient individuals by accessing the information in VCF or BAM files; for this purpose we used the VariantAnnotation (https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html) and Samtools (www.htslib.org/) packages.
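The genotype lookup can be illustrated with a small plain-text VCF reader. This is a sketch only: it does not use VariantAnnotation or Samtools, it reads just the first sample column, and the VCF record in the example is made up for illustration.

```python
def genotypes_at(vcf_lines, target_ids):
    """Return {variant ID: genotype as alleles} for the first sample of a VCF."""
    found = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or fields[2] not in target_ids:
            continue
        alleles = [fields[3]] + fields[4].split(",")  # REF followed by ALT alleles
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        gt = sample_values[format_keys.index("GT")]
        sep = "|" if "|" in gt else "/"
        # Translate allele indices (0 = REF, 1.. = ALT) into bases; "." is missing.
        called = [alleles[int(i)] if i != "." else "." for i in gt.split(sep)]
        found[fields[2]] = sep.join(called)
    return found

# Made-up VCF content for illustration.
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1",
    "11\t61353788\trs174570\tC\tT\t50\tPASS\t.\tGT:DP\t1/1:12",
    "11\t61337211\trs174556\tC\tT\t50\tPASS\t.\tGT:DP\t./.:0",
]
print(genotypes_at(vcf, {"rs174570", "rs174556"}))
```

Missing calls are kept as "./." so that sites that could not be genotyped in an ancient genome remain distinguishable from homozygous reference calls.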
Geographic Analysis.
Geographic maps displaying spatial variation among populations were obtained by inverse distance weighting using the ArcGIS 9.3 software (www.esri.com/software/arcgis).
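Inverse distance weighting estimates the value at an unsampled location as a weighted mean of the sampled values, with weights that decay with distance. The sketch below is a generic version and not the ArcGIS implementation; the power parameter of 2, the use of planar rather than great-circle distances, and the sample coordinates are all assumptions.

```python
from math import hypot

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: iterable of (xi, yi, value) tuples, e.g. haplotype frequencies
    at sampling locations in projected (planar) coordinates.
    """
    numerator = denominator = 0.0
    for xi, yi, value in samples:
        distance = hypot(x - xi, y - yi)
        if distance == 0:
            return value  # exactly on a sampled location
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical sampling locations with haplotype frequencies.
points = [(0.0, 0.0, 1.0), (2.0, 0.0, 0.5)]
print(idw(1.0, 0.0, points))   # midpoint between the two samples
print(idw(0.0, 0.0, points))   # on top of the first sample
```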
Acknowledgments
We thank Marcelo Zagonel de Oliveira for technical assistance in the geographic analysis. This work was supported by a Science Without Borders fellowship from Conselho Nacional de Desenvolvimento Científico e Tecnológico Grant PDE 201145/2015-4 (to C.E.G.A.) and Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil Grant 2012/09950-9 (to K.N.).
Footnotes
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620541114/-/DCSupplemental.
References
- 1. Fagundes NJ, et al. Mitochondrial population genomics supports a single pre-Clovis origin with a coastal route for the peopling of the Americas. Am J Hum Genet. 2008;82(3):583–592. doi: 10.1016/j.ajhg.2007.11.013.
- 2. Poznik GD, et al.; 1000 Genomes Project Consortium. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat Genet. 2016;48(6):593–599. doi: 10.1038/ng.3559.
- 3. Hoffecker JF, Elias SA, O’Rourke DH. Anthropology. Out of Beringia? Science. 2014;343(6174):979–980. doi: 10.1126/science.1250768.
- 4. Meiri M, et al. Faunal record identifies Bering isthmus conditions as constraint to end-Pleistocene migration to the New World. Proc Biol Sci. 2013;281(1776):20132167. doi: 10.1098/rspb.2013.2167.
- 5. Hopkins DM. The Bering Land Bridge. Stanford Univ Press; Stanford, CA: 1967.
- 6. Szathmary EJE. Genetics of aboriginal North Americans. Evol Anthropol. 1993;1:202–220.
- 7. Bonatto SL, Salzano FM. A single and early migration for the peopling of the Americas supported by mitochondrial DNA sequence data. Proc Natl Acad Sci USA. 1997;94(5):1866–1871. doi: 10.1073/pnas.94.5.1866.
- 8. Tamm E, et al. Beringian standstill and spread of Native American founders. PLoS One. 2007;2(9):e829. doi: 10.1371/journal.pone.0000829.
- 9. Kitchen A, Miyamoto MM, Mulligan CJ. A three-stage colonization model for the peopling of the Americas. PLoS One. 2008;3(2):e1596. doi: 10.1371/journal.pone.0001596.
- 10. Fumagalli M, et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science. 2015;349(6254):1343–1347. doi: 10.1126/science.aab2319.
- 11. Ameur A, et al. Genetic adaptation of fatty-acid metabolism: A human-specific haplotype increasing the biosynthesis of long-chain omega-3 and omega-6 fatty acids. Am J Hum Genet. 2012;90(5):809–820. doi: 10.1016/j.ajhg.2012.03.014.
- 12. Reich D, et al. Reconstructing Native American population history. Nature. 2012;488(7411):370–374. doi: 10.1038/nature11258.
- 13. Visser WF, van Roermund CW, Waterham HR, Wanders RJ. Identification of human PMP34 as a peroxisomal ATP transporter. Biochem Biophys Res Commun. 2002;299(3):494–497. doi: 10.1016/s0006-291x(02)02663-3.
- 14. Skoglund P, et al. Genetic evidence for two founding populations of the Americas. Nature. 2015;525(7567):104–108. doi: 10.1038/nature14895.
- 15. Yi X, et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010;329(5987):75–78. doi: 10.1126/science.1190371.
- 16. Foll M, Gaggiotti O. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: A Bayesian perspective. Genetics. 2008;180(2):977–993. doi: 10.1534/genetics.108.092221.
- 17. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5(10):e1000695. doi: 10.1371/journal.pgen.1000695.
- 18. Szpiech ZA, Hernandez RD. selscan: An efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol. 2014;31(10):2824–2827. doi: 10.1093/molbev/msu211.
- 19. Rasmussen M, et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature. 2010;463(7282):757–762. doi: 10.1038/nature08835.
- 20. Rasmussen M, et al. The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature. 2014;506(7487):225–229. doi: 10.1038/nature13025.
- 21. Fu Q, et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature. 2014;514(7523):445–449. doi: 10.1038/nature13810.
- 22. Raghavan M, et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. 2014;505(7481):87–91. doi: 10.1038/nature12736.
- 23.Dulik MC, et al. Mitochondrial DNA and Y chromosome variation provides evidence for a recent common ancestry between Native Americans and Indigenous Altaians. Am J Hum Genet. 2012;90(2):229–246. doi: 10.1016/j.ajhg.2011.12.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Cardona A, et al. Genome-wide analysis of cold adaptation in indigenous Siberian populations. PLoS One. 2014;9(5):e98076. doi: 10.1371/journal.pone.0098076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Harrit RK, et al. Paleo-Eskimo beginnings in North America: A new discovery at Kuzitrin Lake, Alaska. Etud Inuit. 1998;22:61–81. [Google Scholar]
- 26.Gilbert MT, et al. Paleo-Eskimo mtDNA genome reveals matrilineal discontinuity in Greenland. Science. 2008;320(5884):1787–1789. doi: 10.1126/science.1159750. [DOI] [PubMed] [Google Scholar]
- 27.Hünemeier T, et al. Evolutionary responses to a constructed niche: Ancient Mesoamericans as a model of gene-culture coevolution. PLoS One. 2012;7(6):e38862. doi: 10.1371/journal.pone.0038862. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.González-José R, Bortolini MC, Santos FR, Bonatto SL. The peopling of America: Craniofacial shape variation on a continental scale and its interpretation from an interdisciplinary view. Am J Phys Anthropol. 2008;137(2):175–187. doi: 10.1002/ajpa.20854. [DOI] [PubMed] [Google Scholar]
- 29.Bortolini MC, González-José R, Bonatto SL, Santos FR. Reconciling pre-Columbian settlement hypotheses requires integrative, multidisciplinary, and model-bound approaches. Proc Natl Acad Sci USA. 2014;111(2):E213–E214. doi: 10.1073/pnas.1321197111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Hoffecker JF, Elias SA, O’Rourke DH, Scott GR, Bigelow NH. Beringia and the global dispersal of modern humans. Evol Anthropol. 2016;25(2):64–78. doi: 10.1002/evan.21478. [DOI] [PubMed] [Google Scholar]
- 31. Laurie CC, et al.; GENEVA Investigators. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet Epidemiol. 2010;34(6):591–602. doi: 10.1002/gepi.20516.
- 32. Delaneau O, Zagury JF, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10(1):5–6. doi: 10.1038/nmeth.2307.
- 33. Reynolds J, Weir BS, Cockerham CC. Estimation of the coancestry coefficient: Basis for a short-term genetic distance. Genetics. 1983;105(3):767–779. doi: 10.1093/genetics/105.3.767.
- 34. Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002;18(2):337–338. doi: 10.1093/bioinformatics/18.2.337.
- 35. Foll M, Gaggiotti OE, Daub JT, Vatsiou A, Excoffier L. Widespread signals of convergent adaptation to high altitude in Asia and America. Am J Hum Genet. 2014;95(4):394–407. doi: 10.1016/j.ajhg.2014.09.002.