Proceedings of the National Academy of Sciences of the United States of America. 2017 Feb 13;114(9):2195–2199. doi: 10.1073/pnas.1620541114

Genetic signature of natural selection in first Americans

Carlos Eduardo G Amorim a,1, Kelly Nunes b,1, Diogo Meyer b, David Comas c, Maria Cátira Bortolini d, Francisco Mauro Salzano d,2, Tábita Hünemeier b,2
PMCID: PMC5338486  PMID: 28193867

Significance

There is much interest in understanding the role of natural selection in shaping physiological adaptations to climate, diet, and diseases in humans. We investigated this issue by analyzing genomic data from Native American populations inhabiting different ecological regions and ancient Native Americans. We found signals of natural selection at the fatty acid desaturases (FADS) genes not only in an Arctic population, as was previously found, but throughout the Americas, suggesting a single and strong adaptive event that occurred in Beringia, before the range expansion of the first Americans within the American continent and Greenland.

Keywords: peopling of America, natural selection, genetics, first Americans

Abstract

When humans moved from Asia toward the Americas over 18,000 y ago and eventually peopled the New World, they encountered a new environment with extreme climate conditions and distinct dietary resources. These environmental and dietary pressures may have led to instances of genetic adaptation with the potential to influence the phenotypic variation in extant Native American populations. An example of such an event is the evolution of the fatty acid desaturases (FADS) genes, which have been claimed to harbor signals of positive selection in Inuit populations due to adaptation to the cold Greenland Arctic climate and to a protein-rich diet. Because there was evidence of intercontinental variation in this genetic region, with indications of positive selection for its variants, we decided to compare the Inuit findings with other Native American data. Here, we use several lines of evidence to show that the signal of positive selection at the FADS genes is not restricted to the Arctic but instead is broadly observed throughout the Americas. The shared signature of selection among populations living in such a diverse range of environments is likely due to a single and strong instance of local adaptation that took place in the common ancestral population before their entrance into the New World. These first Americans peopled the whole continent and spread this adaptive variant across a diverse set of environments.


Human history in America begins about 18,000–15,000 y before the present (yBP) (1, 2), when the first Americans entered the New World and migrated southward, peopling all latitudes of the continent. However, the history of this human group begins well before this date, before the initial colonization. Around 25,000 yBP, the Last Glacial Maximum (LGM) drastically reduced sea levels. In the region where we currently find the Bering Strait, this drop in sea level exposed an unglaciated land extension known as the Bering land bridge, which joined Northeast Asia and North America, forming Beringia (3–5). Most of the studies on the peopling of the Americas agree that the genetic differentiation of the current Native American populations probably occurred in this area during a standstill in Beringia (6–9), where migrants from different regions in Asia had arrived about 23,000 yBP, inhabiting it for a few thousand years (5,000–8,000 y) (1).

Greenlandic Inuit are one of the descendant populations of the first Americans. A recent study analyzing 196,725 SNPs in 191 individuals from this population proposed that they have undergone genetic and physiological adaptations to a cold climate as a result of living for a long time in the extreme conditions of the Arctic and subsisting on a diet rich in protein and fatty acids (10). Using an approach based on a measure of genetic differentiation (FST), these authors provided evidence for positive selection at several sites in the FADS genes on chromosome 11 (FADS1, FADS2, and FADS3), which are involved in the metabolism of omega-3 polyunsaturated fatty acids (PUFAs).

A previous investigation (11) also studied FADS1 and FADS2, with different SNPs from those of ref. 10, and found an association of one haplotype (D) with PUFA metabolism. This haplotype presented considerable intercontinental variation in its frequency, possibly indicating some level of local adaptation. Moreover, the observation that the climate and diet resources in Beringia during the LGM were probably similar to those in Greenland (4) suggested the possibility that this adaptive haplotype originated in a population ancestral to both Inuit and Native Americans. For this reason, we analyzed sequence variation in FADS genes in Native American populations from Central and South America and searched for sites in FADS genes implicated in cold adaptation.

Results and Discussion

Genomic Screening for Mutations with Large Continental Frequency Differential.

We first compared Native Americans (12) to other continental populations (Africans, Europeans, and East Asians; Table S1) to identify alleles that had high frequencies in indigenous American populations and low frequencies elsewhere. We found three significant SNPs (Fisher’s exact test with Bonferroni correction), two of which lay within FADS2 (rs174570, P < 0.0001 and rs174556, P < 0.001) and another located on chromosome 22 (rs13054099, P < 0.0000001) in SLC25A17 (Fig. 1). The latter gene encodes a peroxisomal membrane protein that belongs to the family of mitochondrial solute carriers. It is expressed in liver and is likely involved in membrane transport (13). Thus far, there is no a priori evidence that it plays a role in adaptation to diet or temperature.
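
The allele-frequency screen described above can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the function names, the allele counts, and the number of tests used in the Bonferroni step are all hypothetical, because the per-SNP counts are not reported here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    p = sum(prob(k) for k in range(k_min, k_max + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical allele counts (derived, ancestral) for one SNP
native_american = (430, 6)
other_continents = (300, 900)
p_raw = fisher_exact_two_sided(*native_american, *other_continents)
p_adj = bonferroni(p_raw, n_tests=100_000)  # hypothetical number of SNPs tested
```

The two-sided P value here follows the common convention of summing all tables that are no more probable than the observed one; other conventions exist and may give slightly different values.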

Table S1.

Information about the populations and samples considered in the present analysis

Population Region Sample size
Illumina
 CHB East Asian 83
 Dai East Asian 10
 Daur East Asian 9
 Han East Asian 33
 Han-N_China East Asian 10
 Hezhen East Asian 8
 Japanese East Asian 28
 JPT East Asian 85
 Miao East Asian 10
 Mongolia East Asian 7
 Mongolia East Asian 8
 Naxi East Asian 8
 Orogen East Asian 9
 She East Asian 10
 Sindhi East Asian 1
 Tu East Asian 10
 Tujia East Asian 10
 Xibo East Asian 9
 Yi East Asian 10
 French European 28
 Adygei European 17
 Basque European 24
 CEU European 108
 Italia European 12
 Orcadian European 15
 Sardinian European 28
 TSI European 88
 Tuscan European 8
 Inuit Greenland 13
 Aleutian Aleutian islands 3
 Altaian Siberian 12
 Buryat Siberian 17
 Chukchi Siberian 30
 Dolgan Siberian 4
 Evenki Siberian 15
 Ket Siberian 2
 Khanty Siberian 35
 Koryak Siberian 10
 Naukan Siberian 16
 Nganasan1 Siberian 23
 Selkup Siberian 9
 Tundra_Nentsi Siberian 3
 Yakut Siberian 34
 Yukaghir Siberian 12
 Algonquin Native American 5
 Arara Native American 1
 Arhuaco Native American 5
 Aymara Native American 23
 Bribri Native American 4
 Cabecar Native American 31
 Chane Native American 2
 Chipewyan Native American 15
 Chono Native American 4
 Chorotega Native American 1
 Cree Native American 4
 Diaguita Native American 5
 Embera Native American 5
 Guahibo Native American 6
 Guarani Native American 6
 Guaymi Native American 5
 Huetar Native American 1
 Huilliche Native American 4
 Inga Native American 9
 Jamamadi Native American 1
 Kaingang Native American 2
 Kaqchikel Native American 12
 Karitiana Native American 13
 Kogi Native American 4
 Maleku Native American 3
 Maya1 Native American 18
 Mixe Native American 17
 Mixtec Native American 5
 Ojibwa Native American 4
 Palikur Native American 3
 Parakanã Native American 1
 Piapoco Native American 6
 Purepecha Native American 1
 Quechua Native American 40
 Surui Native American 24
 Teribe Native American 3
 Ticuna Native American 6
 Toba Native American 4
 Waunana Native American 3
 Wayuu Native American 11
 Wichi Native American 5
 Yaghan Native American 4
 Yaqui Native American 1
 Zapotec1 Native American 22
Affymetrix
 Dai East Asian 10
 Daur East Asian 9
 Han East Asian 34
 Han_Nchina East Asian 10
 Hezhen East Asian 9
 Japanese East Asian 29
 Miao East Asian 10
 Mongolia East Asian 10
 Naxi East Asian 9
 Orogen East Asian 9
 She East Asian 10
 Tu East Asian 10
 Tujia East Asian 10
 Xibo East Asian 9
 Yi East Asian 10
 Adygei European 17
 Basque European 22
 French European 28
 Italian European 13
 Orcadian European 13
 Sardinian European 28
 Tuscan European 8
 Apalai Native American 4
 Arara Native American 4
 Guarani_GN Native American 7
 Guarani_KW Native American 10
 Karitiana Native American 4
 Surui Native American 4
 Urubu Kaapor Native American 3
 Xavante Native American 11
 Zoró Native American 1

Fig. 1.

Significant SNPs found in Native American populations compared with other continental populations (Africans, Asians, and Europeans). The comparison was tested using Fisher’s exact test with Bonferroni correction. We found three significant SNPs, two within FADS2 (rs174570, P < 0.0001 and rs174556, P < 0.001) and another located on chromosome 22 (rs13054099, P < 0.0000001) in SLC25A17.

Genetic Signatures of Natural Selection.

Genes showing population-specific changes in allele frequency, such as FADS2 and SLC25A17 in Native Americans, are potential candidates for revealing adaptive evolution. We thus focused our subsequent analyses on chromosomes 11 and 22. To evaluate whether we could identify natural selection as the cause for these extreme allele frequency changes, we applied three cross-population test statistics to two different datasets containing 374,470 (dataset 1, ref. 12) and 593,142 (dataset 2, ref. 14) SNPs, respectively. Two tests are FST-based, involving the comparison of differentiation between Native Americans, Europeans, and East Asians [the population branch statistic (PBS) (15) and a Bayesian method implemented with Bayescan (16)]. To formally test whether natural selection underlies the cases of extreme differentiation, PBS values were compared against those obtained with neutral coalescent simulations generated according to a plausible demographic scenario for the peopling of the New World (17). Bayescan corrects for demographic biases by differentiating the proportion of the FST variation that is locus-specific (due to natural selection) from that which is population-specific (due to demography) (16). In addition to these tests, we also performed the cross-population extended haplotype homozygosity test (XP-EHH) (18) between pairs of populations.

Fig. S1 A and B show the distribution of PBS statistics for the Native American branch across the extension of chromosome 11 in each dataset. The most extreme PBS value in dataset 1 corresponds to rs174570 (Fig. S1A), the same SNP that was highly significant in our preliminary analysis of extreme allele frequency differences across continents (Fig. 1). This result demonstrated that the SNP detection was consistent across typing platforms and different methods. In addition, this SNP was one of the sites found under positive selection in the Greenlandic Inuit (10).

Fig. S1.

(A and B) Distribution of the PBS statistics (99.9th and 99.5th percentiles) for the Native American branch across the extension of chromosome 11 (on the x axis) in each dataset (1 and 2, respectively). (C and D) FST outlier detection from BayeScan across the extension of chromosome 11 (on the x axis) in both datasets. With a false discovery rate of 0.05 (dashed line), SNPs that reach a significant log-transformed q-value (y axis) in the Native American population are shown in red. The dashed lines indicate the sliding-window q-value.

For dataset 2, we found 15 SNPs with highly significant PBS values (Fig. S1B), three of which are located in FADS genes (rs74771917, rs7115739, and rs174570). Neutral coalescent simulations indicated that these deviations were statistically significant (P < 0.0001; Fig. S2), consistent with the action of positive selection as opposed to genetic drift in increasing the frequency of the derived allele in Native American populations. The other SNPs showing signals of natural selection in Native Americans were located at, or near, genes related to metabolism (Table S2). The Bayescan analysis corroborated all of the aforementioned findings, especially regarding selection on FADS genes (Fig. S1 C and D). However, for the rs13054099 SNP on chromosome 22 we found weak or no evidence for selection with all methods.

Fig. S2.

Distribution of 10,000 simulated PBS values under a neutral coalescent model. The dashed line represents the top observed PBS SNP values in the empirical datasets. (A) Dataset 1. (B) Dataset 2.

Table S2.

Information on the 15 SNPs showing significant signals of natural selection

Chromosome Position rs Ancestral Derived DAF EUR DAF EAS DAF NAM PBS Gene Associated disease
Illumina
 11 61353788 rs174570 C T 0.1113 0.3683 0.9973 1.3018 FADS2 Metabolic syndrome
 11 61337211 rs174556 C T 0.2393 0.3612 0.9973 1.0207 MIR1908(-1.997kb)|FADS1(0) |FADS2(-3.039kb) Metabolic syndrome
 11 133087432 rs4936203 G A 0.0777 0.0904 0.7280 1.0070
 11 12128635 rs9804570 T C 0.7423 0.8853 0.1296 1.0043 MICAL2 Asthma/visceral fat
 11 94230937 rs4309121 T C 0.1662 0.2875 0.91398 0.9873 AMOTL1 Psychiatric disorder/ADH
 11 79818081 rs7111535 A G 0.8807 0.8042 0.1791 0.9551
 11 61326406 rs174546 T C 0.7180 0.6346 0.0027 0.9493 TMEM258(+9.745kb)|MIR611(+9.797kb) |FEN1(+5.116kb)|FADS1(0) Metabolic syndrome
 11 79810048 rs12277775 G A 0.8857 0.7989 0.1818 0.9475
 11 61354548 rs1535 G A 0.7134 0.6289 0.0027 0.9348 ELF5(-0.262kb)|CAT(+6.472kb) β2-glycoprotein I (β2-GPI) plasma levels/cataracts in type 2 diabetes
 11 11779270 rs4910420 G A 0.1296 0.0907 0.7434 0.9289
 11 61396955 rs174449 G A 0.6509 0.7705 0.0410 0.9250 MIR6746(-5.308kb)|FADS3(-0.615kb) |FADS2(+5.553kb) Metabolic syndrome
 11 61314379 rs102275 C T 0.7018 0.6317 0.0027 0.9228 TMEM258(0)|MYRF(+1.813kb)|MIR611(-2.163kb) |FEN1(-2.305kb)|FADS1(-9.293kb) Metabolic syndrome
 11 61306034 rs174534 A G 0.2957 0.3725 0.9944 0.9065 TMEM258(-7.143kb)|MYRF(0) Metabolic syndrome
 11 68613614 rs1551310 T C 0.0046 0.4504 0.8413 0.8981 TPCN2(0)|MIR3164(+6.312kb) Prostate cancer/hair color
 11 13715365 rs1866948 T C 0.1128 0.0213 0.6489 0.8762 FAR1(+4.896kb) Bone mineral density
Affymetrix
 11 61627960 rs74771917 C T 0.0388 0.1835 0.9479 1.7484 FADS2 Metabolic syndrome
 11 12071129 rs4910436 C T 0.0310 0.0053 0.7396 1.7148 LOC105376554 Visceral fat
 11 61641717 rs7115739 G T 0.0433 0.1962 0.9457 1.6692 MIR6746(-3.97kb)|FADS3(0)|FADS2(+6.891kb) Metabolic syndrome
 11 12016528 rs952932 C T 0.0349 0.0053 0.6771 1.4705 DKK3 Visceral fat
 11 43903391 rs33984031 T G 0.0194 0.0293 0.6458 1.2962 ALKBH3 Body mass index
 11 43902894 rs3740984 G A 0.0194 0.0293 0.6458 1.2962 ALKBH3 Body mass index
 11 43895919 rs34319180 C T 0.0194 0.0293 0.6458 1.2962 ALKBH3 Body mass index
 11 43895827 rs34529902 C T 0.9806 0.9707 0.3542 1.2962 ALKBH3 Body mass index
 11 120155118 rs11217788 G A 0.1016 0.1941 0.8958 1.2258 POU2F3 Psoriasis/intraocular pressure/corneal curvature
 11 69593176 rs11263566 A G 0.0233 0.1064 0.7292 1.2221 FGF4 Serum urate levels by body-mass-index/type 1 diabetes/IgG glycosylation/breast cancer
 11 69601039 rs11263574 T G 0.0234 0.1064 0.7292 1.2199
 11 43217978 rs17500007 A G 0.0543 0.0000 0.5833 1.1376
 11 61597212 rs174570 C T 0.1008 0.4149 0.9896 1.1373 FADS2 Metabolic syndrome
 11 43893782 rs12418820 G T 0.0195 0.0585 0.6304 1.1277 ALKBH3 Body mass index
 11 45192009 rs12362992 G A 0.2791 0.0775 0.8854 1.1073 PRDM11 IgG glycosylation/thyroid hormone levels/forced vital capacity

Putatively Selected Haplotype in Extant and Ancient Americans.

To investigate whether this signal of natural selection could be the result of adaptation to the conditions encountered by the ancestors of first Americans in Beringia, we compared the genotypes of living Native Americans and Inuit to those of four ancient humans. These included Saqqaq, a Paleo-Eskimo from Greenland who lived ∼4,000 yBP (19); Anzick-1, an individual belonging to the classical North American Clovis culture who lived ∼12,500 yBP (20); the Mal’ta boy who lived in Siberia ∼24,000 yBP (21); and the Ust’-Ishim man, who lived in Siberia ∼45,000 yBP (22) (Table 1). The haplotype found in the homozygous state in the Anzick-1 individual—which is a representative of first Americans, or close to them—is the same as the one present at high frequencies in extant Native Americans from diverse locations of the continent. The Paleo-Eskimo had a haplotype different from that present in Native Americans or the Inuit. The high frequency of the putatively selected FADS haplotype and its shared distribution in living and ancient Native Americans (Table S3) is therefore consistent with a scenario of intense selection during the Beringian standstill.

Table 1.

Genetic information in ancient and extant Native Americans

Genotypes composing the putatively selected FADS haplotype are shown in red. Double asterisks indicate that the genotypes could not be identified in the sequences examined. Y chr, Y chromosome.

Table S3.

Information on the frequencies of the putatively selected FADS haplotype

Population Absolute frequency No. of chromosomes Relative frequency
Siberia
 Aleutian 6 6 1.000
 Altaian 20 24 0.833
 Buryat 16 34 0.471
 Chukchi 49 60 0.817
 Dolgan 6 8 0.750
 Evenki 15 30 0.500
 Ket 2 4 0.500
 Khanty 56 70 0.800
 Koryak 14 20 0.700
 Naukan 30 32 0.938
 Nganasan1 23 46 0.717
 Selkup 13 18 0.722
 Tundra_Nentsi 6 6 1.000
 Yakut 41 68 0.603
 Yukaghir 14 24 0.583
Greenland
 Inuit 26 26 1.000
North and Central America
 Algonquin 10 10 1.000
 Chipewyan 23 30 0.767
 Cree 8 8 1.000
 Diaguita 10 10 1.000
 Kaqchikel 24 24 1.000
 Maya1 34 36 0.944
 Mixe 34 34 1.000
 Mixtec 4 10 0.400
 Ojibwa 6 8 0.750
 Purepecha 2 2 1.000
 Yaqui 2 2 1.000
 Zapotec1 42 44 0.955
  Average 199 218 0.913
South America
 Arara 2 2 1.000
 Arhuaco 8 10 0.800
 Aymara 43 46 0.935
 Bribri 7 8 0.875
 Cabecar 57 62 0.919
 Chane 4 4 1.000
 Chono 8 8 1.000
 Chorotega 2 2 1.000
 Embera 10 10 1.000
 Guahibo 12 12 1.000
 Guarani 12 12 1.000
 Guaymi 10 10 1.000
 Huetar 2 2 1.000
 Huilliche 8 8 1.000
 Inga 16 18 0.889
 Jamamadi 2 2 1.000
 Kaingang 4 4 1.000
 Karitiana 26 26 1.000
 Kogi 8 8 1.000
 Maleku 6 6 1.000
 Palikur 6 6 1.000
 Parakanã 2 2 1.000
 Piapoco 10 12 0.833
 Quechua 79 80 0.988
 Suruí 48 48 1.000
 Teribe 6 6 1.000
 Ticuna 12 12 1.000
 Toba 8 8 1.000
 Waunana 5 6 0.833
 Wayuu 20 22 0.909
 Wichi 10 10 1.000
 Yaghan 8 8 1.000
  Average 461 480 0.960
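
The relative frequencies in Table S3 are the number of chromosomes carrying the haplotype divided by the number of chromosomes sampled, and the "Average" rows appear to be pooled values (summed carriers over summed chromosomes) rather than means of the per-population frequencies. The short Python sketch below checks this reading on the North and Central America rows; the function names are ours and purely illustrative.

```python
# (carrier chromosomes, total chromosomes) copied from the
# North and Central America rows of Table S3
north_central = {
    "Algonquin": (10, 10), "Chipewyan": (23, 30), "Cree": (8, 8),
    "Diaguita": (10, 10), "Kaqchikel": (24, 24), "Maya1": (34, 36),
    "Mixe": (34, 34), "Mixtec": (4, 10), "Ojibwa": (6, 8),
    "Purepecha": (2, 2), "Yaqui": (2, 2), "Zapotec1": (42, 44),
}

def relative_frequency(carriers, chromosomes):
    """Haplotype frequency in one population."""
    return carriers / chromosomes

def pooled_frequency(groups):
    """Sum carriers and chromosomes over populations, then take the ratio."""
    carriers = sum(c for c, _ in groups.values())
    chromosomes = sum(n for _, n in groups.values())
    return carriers, chromosomes, carriers / chromosomes

carriers, chromosomes, freq = pooled_frequency(north_central)
print(carriers, chromosomes, round(freq, 3))
```

Pooling weights each population by its sample size, so a small sample with a low frequency (for example Mixtec, with 10 chromosomes) pulls the regional value down less than it would in an unweighted mean.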

During the LGM, Beringia offered a feasible home for the Asian migrants because its ecological conditions (4) furnished a relatively isolated place for their differentiation. We hypothesized that a putative selection event occurred after migrants from northeastern Siberia (23), as well as other Asian regions, reached Beringia. Extant native Siberians show lower frequencies (average 69%) of the putatively selected FADS haplotype (Fig. 2 and Table S3). Cardona et al. (24) surveyed the entirety of Siberian chromosome 11 and found no indication of selection at the FADS region. Therefore, the presence of the putatively selected FADS haplotype in Siberians might have resulted from subsequent gene flow between Native Americans and Siberian populations occurring after the Beringian event. Note that the two ancient Siberians (Mal’ta and Ust’-Ishim) had haplotypes distinct from that found in the ancient Anzick-1 Native American individual but similar to the one found in the Saqqaq Paleo-Eskimo (Table 1).

Fig. 2.

The geographic distribution of the putatively selected FADS haplotype that presents a signal of natural selection in Native American populations.

Several studies have suggested that all Paleo-Eskimos arose from a single distinct migration that occurred around 6,000 yBP (19, 25, 26). It would therefore be expected that the selected alleles found in extant Inuit should also be found in ancient Eskimo genomes, especially if selection had occurred in this specific human group due to the extreme weather conditions and particular diet. The observed difference suggests that another explanation should be considered.

The vast majority (95%) of Native Americans have the putatively selected haplotype, which is found at high frequency in the Inuit from Greenland (Fig. 2). Considering the whole continent, there are slight differences in haplotype frequencies between south (80–100%, average 96%) and north and central (40–100%, average 91.3%) Native Americans (Table S3). Such variation could be due to later gene flow with ancient Siberian populations, subsequent to the initial peopling of America (12), or to stochastic factors. The high frequencies occur despite marked differences in lifestyles and diets of the different indigenous populations. Amazonian hunter-gatherers have highly variable diets, mainly composed of fruits, roots, and small mammals, whereas agriculturalists from Mesoamerica and the Andean region eat food based on crop agriculture (27) (Table S4). By contrast, Arctic populations such as the Inuit have diets that rely on hunting marine mammals and have animal fat as the main source of nutrients (10).

Table S4.

Comparisons of the absolute frequencies of the putatively selected haplotype between populations regarding their subsistence modes

Pairs of comparisons P value
AA × MA 0.3521
AA × SA 0.9196
AA × RF 0.5217
MA × SA 0.1194
MA × RF 0.2420
SA × RF 0.6089

Groups were compared using the χ2 test with Yates correction (IBM SPSS Statistics). Populations are as follows. AA: Aymará, Huilliche, Inga, and Quechua. MA: Cabecar, Guaymi, Kaqchikel, Maya, Mixe, Mixtec, Purepecha, and Zapotec. SA: Arara, Arhuaco, Bribri, Chane, Chorotega, Embera, Guahibo, Guarani, Huetar, Jamamadi, Kaingang, Karitiana, Kogi, Maleku, Palikur, Parakanã, Piapoco, Suruí, Teribe, Ticuna, Toba, Waunana, Wichi, and Yaghan. RF: Inuit and Chono (29). AA, Andean agriculturalists; MA, Mesoamerican agriculturalists; RF, rich-fat consumers; SA, South American hunter-gatherers.
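
A comparison of this kind can be sketched in Python as a 2 × 2 χ2 test with Yates continuity correction. The test in the paper was run in SPSS; the code below is only an illustrative stand-in, and the pooled counts are our own summation of the Table S3 rows for the AA and MA groups listed above (carrier vs. noncarrier chromosomes).

```python
from math import erfc, sqrt

def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates correction for the table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    # Yates correction: shrink |ad - bc| by n/2, but not below zero
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Pooled from Table S3 (our summation):
# AA = Aymara + Huilliche + Inga + Quechua -> 146 carriers of 152 chromosomes
# MA = Cabecar + Guaymi + Kaqchikel + Maya + Mixe + Mixtec + Purepecha + Zapotec
#      -> 207 carriers of 222 chromosomes
chi2, p = chi2_yates_2x2(146, 152 - 146, 207, 222 - 207)
```

With these pooled counts the sketch gives a P value of about 0.35, in line with the AA × MA entry of Table S4, which suggests the table was computed on pooled chromosome counts.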

The extant Eskimo-Aleut genomes result from the encounter of the ancestors of the first Americans with a subsequent, more recent stream of Asian gene flow (12). Multidisciplinary models of American settlement also suggested that Eskimos are descendants of ancient Beringians, although later gene flow from Asia introduced some specific genetic and phenotypic traits into these groups (28, 29). Given the aforementioned pattern of genetic variation in the ancient individuals, the fact that our results for the XP-EHH test do not show differentiation between Greenlandic Inuit and Native Americans in the FADS region (XP-EHH = 1.13218) implies that the signal of natural selection found in the Inuit is a result of their shared ancestry with the first Americans, which reflects adaptive selection in that region of the genome.

Conclusions

Our results strongly support our hypothesis that the signal of positive selection is not restricted to the Arctic. We found a signature of natural selection at FADS loci in all 53 Native American populations studied here, including a strong signal in Amazonian populations that are extremely different from the Inuit regarding culture, environment, and diet. In our opinion, this genetic profile suggests a single and very strong adaptive event that occurred in Beringia, before the range expansion of the ancestors of the first Americans within the American continent and Greenland. This event was likely related to the metabolic adaptations to diet and cold weather required to subsist during the glacial and periglacial conditions that existed during the Beringian standstill (30).

Materials and Methods

Populations.

We analyzed data from 349 individuals from 44 Native American populations previously published by Reich et al. (12), as well as information on 48 individuals from nine Native American populations published by Skoglund et al. (14). These datasets were called dataset 1 and dataset 2, respectively. Detailed population information can be found in Table S1. In addition, data from 225 Asian individuals and 328 Europeans from the Human Genome Diversity Project (www.hagsc.org/hgdp/) were evaluated. The data were curated as reported by Laurie et al. (31). We computationally phased these data using the SHAPEIT software (32) with default parameters.

PBS Test.

PBS analysis was performed as described in ref. 15. For each SNP, an FST value was estimated between pairs of populations [Native Americans (NAM), East Asians (EAS), and Utah residents of European ancestry (CEU)], using Reynolds et al.’s (33) estimator. Sites that were nonpolymorphic in at least two populations were excluded. PBS was estimated for NAM relative to EAS, using CEU as an outgroup. The analyses were done with individual SNPs and with 20-SNP windows overlapping by 5 SNPs, as implemented by Fumagalli et al. (10).
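
The PBS calculation can be sketched as follows, using the definition of ref. 15: each pairwise FST is converted to a branch length T = −log(1 − FST), and the statistic for the target branch is (T_target,sister + T_target,outgroup − T_sister,outgroup)/2. This Python sketch makes two simplifying assumptions that we flag explicitly: the pairwise FST is a simple heterozygosity-based estimate, not the Reynolds et al. estimator used in the paper, and the windowing helper only generates window start indices, because the way per-window values were aggregated is not described here. All function names are illustrative.

```python
from math import log

def fst_two_pops(p1, p2):
    """Simple FST = (H_T - H_S) / H_T for one biallelic SNP.

    Stand-in estimator with equal population weights; the paper uses
    Reynolds et al.'s estimator instead.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total

def branch_length(fst, cap=0.999999):
    """T = -log(1 - FST); FST is capped to avoid log(0)."""
    return -log(1 - min(fst, cap))

def pbs(fst_target_sister, fst_target_outgroup, fst_sister_outgroup):
    """Population branch statistic for the target population."""
    t_ts = branch_length(fst_target_sister)
    t_to = branch_length(fst_target_outgroup)
    t_so = branch_length(fst_sister_outgroup)
    return (t_ts + t_to - t_so) / 2

def window_starts(n_snps, size=20, overlap=5):
    """Start indices of SNP windows of `size` that overlap by `overlap`."""
    step = size - overlap
    return list(range(0, max(n_snps - size, 0) + 1, step))

# Derived allele frequencies for rs174570 (Illumina rows of Table S2)
nam, eas, eur = 0.9973, 0.3683, 0.1113
value = pbs(fst_two_pops(nam, eas), fst_two_pops(nam, eur), fst_two_pops(eas, eur))
```

Because the FST estimator differs from the one in the paper, `value` is not expected to reproduce the PBS reported in Table S2; the sketch only shows how the three pairwise comparisons combine into a branch-specific statistic.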

Neutral Coalescent Simulations.

Simulations were performed using the demographic model for the peopling of the New World inferred by Gutenkunst et al. (17). In short, these authors modeled the Eurasian divergence from the African population, with subsequent divergence of the Asian population from the European, at ∼26 kya. The Americas were peopled by a subset of the Asian population with 800 individuals that went through exponential growth after entering the New World ∼22 kya. Gene flow from Europeans to both Asian and Native American populations was allowed to happen, because the genetic dataset used to infer this model comprised Mexican individuals that are known to have some European genetic contribution (17). This approach allowed us to correct for possible bias due to genetic admixture, and indeed the generation time and population size estimates are in agreement with the literature (1). More details about this scenario can be found in figure 3 and table 2 from ref. 17.

We implemented our simulations according to this scenario in ms (34) with the following command line:

  • ms Ntot 10000 -s 1 -I 4 0 NEuro NAsia NAmer -n 1 1.682020 -n 2 2.424020 -n 3 4.185850 -n 4 7.942130 -eg 0 2 67.978337 -eg 0 3 109.406463 -eg 0 4 147.474095 -ema 0 4 x 0 0 0 0 x 3.960400 0 0 3.960400 x 0 0 0 0 x -ej 0.029475 4 3 -ema 0.029475 4 x 0.881098 0.561966 x 0.881098 x 3.960400 x 0.561966 3.960400 x x -ej 0.036114 3 2 -en 0.036114 2 0.287184 -ema 0.036114 4 x 7.293140 x x 7.293140 x x -ej 0.197963 2 1 -en 0.303500 1 1.

The total sample size (Ntot) and the per-group sample sizes (NEuro, NAsia, and NAmer) varied between datasets: 904, 328, 358, and 218 for dataset 1 and 365, 129, 188, and 48 for dataset 2, respectively. We used 10,000 simulations according to this scenario to generate a null distribution of PBS values, to which the empirical PBS values were compared. This procedure allowed us to evaluate the significance of the differences and to identify the outlier PBS values.
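
The comparison of an observed PBS value against the simulated null distribution can be sketched in Python as below. The exact formula used for the P value is not stated in the text, so the add-one estimator here is an assumption, and both function names are illustrative. The first helper simply checks that Ntot in the ms command equals the sum of the three sampled groups (the first population in the `-I` option has zero samples).

```python
def ms_sample_sizes(n_euro, n_asia, n_amer):
    """Return (Ntot, NEuro, NAsia, NAmer) for the ms command line."""
    return (n_euro + n_asia + n_amer, n_euro, n_asia, n_amer)

def empirical_p_value(observed, null_values):
    """One-sided empirical P value of `observed` against simulated values.

    Uses the add-one estimator (r + 1) / (n + 1), which never returns
    exactly zero; this choice is an assumption, not taken from the paper.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

dataset1 = ms_sample_sizes(328, 358, 218)
dataset2 = ms_sample_sizes(129, 188, 48)
```

With 10,000 simulations, an observed value larger than every simulated one gives 1/10,001 under this estimator, which is consistent with reporting P < 0.0001.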

Detecting SNPs Under Positive Selection with a Bayesian Approach.

Two versions of BayeScan (16, 35) were used to identify candidate targets for positive selection. Both versions assume an island model in which the subpopulations’ allele frequencies are correlated through a common migrant gene pool from which they may differ in varying degrees. Under this model, FST is estimated and decomposed into a population-specific component (β), shared by all loci, and a locus-specific component (α), shared by all populations (see refs. 16 and 35 and the corresponding manuals for details). Significant, positive values of α indicate an overly large level of differentiation of a given SNP, given the genomic average specific for each population (the β part of the FST mentioned above), indicating positive selection. With the older version of BayeScan we were able to identify outlier loci, and with the new version we identified those that were outliers in the Native American population, assigning posterior probabilities for each marker as being under positive selection. These posterior probabilities were then transformed into q-values to control for the false discovery rate.
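
The two ideas in this paragraph can be illustrated with a short Python sketch. In the model of ref. 16 the decomposition is logistic, logit(FST) = α + β for locus i and population j, so FST can be recovered from the two components. For the q-values, we assume the usual construction in which loci are ranked by posterior probability and the q-value of a locus is the mean posterior error probability (1 − posterior probability) of all loci ranked at least as high; this is our reading, and the BayeScan manual should be consulted for the exact definition. Function names are illustrative.

```python
from math import exp

def fst_from_components(alpha, beta):
    """Invert logit(FST) = alpha + beta (locus and population effects)."""
    return 1 / (1 + exp(-(alpha + beta)))

def q_values(posterior_probs):
    """q-value per locus from posterior probabilities of selection.

    Assumed construction: rank loci from highest to lowest posterior
    probability; the q-value at each rank is the running mean of
    (1 - posterior probability), i.e. the expected false discovery rate
    if that locus and all better-ranked loci are called outliers.
    """
    order = sorted(range(len(posterior_probs)),
                   key=lambda i: posterior_probs[i], reverse=True)
    q = [0.0] * len(posterior_probs)
    running_error = 0.0
    for rank, i in enumerate(order, start=1):
        running_error += 1 - posterior_probs[i]
        q[i] = running_error / rank
    return q

example = q_values([0.99, 0.5, 0.9])
```

A positive α raises FST above the level set by β alone, which is why significantly positive α values are read as candidates for positive selection.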

A sliding-window approach was followed to generate Manhattan plots with the distribution of the q-values along chromosome 11. Windows of 500 kb were used, shifted by 25 kb at each step. The q-value associated with each window was taken as the 95% quantile of the q-values calculated for all SNPs in the window.
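
The windowing scheme described above can be sketched in Python as follows. The window size, step, and quantile come from the text; where the first window starts, how empty windows are handled, and the quantile interpolation rule are not specified, so the choices below (start at the first SNP, skip empty windows, linear interpolation) are assumptions. Function names are illustrative.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def sliding_window_q(positions, q_vals, window=500_000, step=25_000, q=0.95):
    """Summarize per-SNP q-values in overlapping windows along a chromosome.

    Returns a list of (window_start, window_end, quantile) tuples for
    every window that contains at least one SNP.
    """
    if not positions:
        return []
    snps = sorted(zip(positions, q_vals))
    out = []
    start, last = snps[0][0], snps[-1][0]
    while start <= last:
        end = start + window
        in_window = [v for pos, v in snps if start <= pos < end]
        if in_window:
            out.append((start, end, quantile(in_window, q)))
        start += step
    return out

# Hypothetical positions (bp) and q-values for three SNPs
windows = sliding_window_q([0, 10_000, 600_000], [0.1, 0.3, 0.5])
```

Because the step is much smaller than the window, each SNP contributes to up to 20 consecutive windows, which smooths the per-SNP signal along the chromosome.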

Note that this approach slightly differs from the previous one (PBS). The different implementations repeated for each dataset were intended to eliminate biases related to a specific method, its implementation, and also any bias related to the genetic data.

Variation in Ancient Humans.

We checked the genotypes at the FADS SNPs in the ancient individuals using the information in VCF or BAM files; for this purpose we used the VariantAnnotation (https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html) and Samtools (www.htslib.org/) packages.
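
As an illustration of what such a lookup involves, the Python sketch below reads genotype calls at chosen positions from VCF text. It is a plain-text stand-in, not the VariantAnnotation or Samtools workflow used in the paper. The sample name and its genotype are invented for the example; the position and alleles of rs174570 are taken from the Affymetrix rows of Table S2, and treating the ancestral allele as the reference allele is an assumption.

```python
def genotypes_at(vcf_text, targets):
    """Return {(chrom, pos): {sample: 'A/G'}} for positions in `targets`.

    `targets` is a set of (chromosome, position) pairs. Only the GT field
    is read; missing alleles are reported as '.'.
    """
    samples = []
    result = {}
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _snp_id, ref, alt = fields[:5]
        key = (chrom, int(pos))
        if key not in targets:
            continue
        alleles = [ref] + alt.split(",")
        gt_index = fields[8].split(":").index("GT")
        calls = {}
        for sample, entry in zip(samples, fields[9:]):
            gt = entry.split(":")[gt_index]
            sep = "|" if "|" in gt else "/"
            calls[sample] = sep.join(
                "." if a == "." else alleles[int(a)] for a in gt.split(sep)
            )
        result[key] = calls
    return result

# Hypothetical one-sample VCF; the sample name and genotype are made up.
example_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "hypothetical_sample"]),
    "\t".join(["11", "61597212", "rs174570", "C", "T", ".", "PASS",
               ".", "GT", "1/1"]),
])
calls = genotypes_at(example_vcf, {("11", 61597212)})
```

For low-coverage ancient genomes, a real analysis would also need to inspect read support in the BAM files before trusting a call, which this sketch does not attempt.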

Geographic Analysis.

Geographic maps displaying spatial variation among populations were obtained by inverse distance weighted interpolation using the ArcGIS 9.3 software (www.esri.com/software/arcgis).
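
Inverse distance weighting estimates the value at an unsampled location as a weighted mean of the sampled values, with weights proportional to 1/d^p, where d is the distance to each sample. The Python sketch below illustrates the idea. The ArcGIS settings used for Fig. 2 are not reported, so the power of 2 and the planar distance on raw coordinates are assumptions, and the sample points are hypothetical.

```python
from math import hypot

def idw(points, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    `points` is a list of (x, y, value) tuples; distances are planar,
    which ignores the curvature of the Earth.
    """
    num = den = 0.0
    for px, py, value in points:
        d = hypot(x - px, y - py)
        if d == 0:
            return value  # exactly on a sampled location
        w = 1 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical sampling locations with haplotype frequencies
samples = [(0.0, 0.0, 1.0), (2.0, 0.0, 0.0)]
midpoint = idw(samples, 1.0, 0.0)
```

A larger power makes the surface follow the nearest sample more closely, whereas a smaller power gives a smoother map.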

Acknowledgments

We thank Marcelo Zagonel de Oliveira for technical assistance in the geographic analysis. This work was supported by a Science Without Borders fellowship from Conselho Nacional de Desenvolvimento Científico e Tecnológico Grant PDE 201145/2015-4 (to C.E.G.A.) and Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil Grant 2012/09950-9 (to K.N.).

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620541114/-/DCSupplemental.

References

  • 1.Fagundes NJ, et al. Mitochondrial population genomics supports a single pre-Clovis origin with a coastal route for the peopling of the Americas. Am J Hum Genet. 2008;82(3):583–592. doi: 10.1016/j.ajhg.2007.11.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Poznik GD, et al. 1000 Genomes Project Consortium Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat Genet. 2016;48(6):593–599. doi: 10.1038/ng.3559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Hoffecker JF, Elias SA, O’Rourke DH. Anthropology. Out of Beringia? Science. 2014;343(6174):979–980. doi: 10.1126/science.1250768.
  • 4. Meiri M, et al. Faunal record identifies Bering isthmus conditions as constraint to end-Pleistocene migration to the New World. Proc Biol Sci. 2013;281(1776):20132167. doi: 10.1098/rspb.2013.2167.
  • 5. Hopkins DM. The Bering Land Bridge. Stanford Univ Press; Stanford, CA: 1967.
  • 6. Szathmary EJE. Genetics of aboriginal North Americans. Evol Anthropol. 1993;1:202–220.
  • 7. Bonatto SL, Salzano FM. A single and early migration for the peopling of the Americas supported by mitochondrial DNA sequence data. Proc Natl Acad Sci USA. 1997;94(5):1866–1871. doi: 10.1073/pnas.94.5.1866.
  • 8. Tamm E, et al. Beringian standstill and spread of Native American founders. PLoS One. 2007;2(9):e829. doi: 10.1371/journal.pone.0000829.
  • 9. Kitchen A, Miyamoto MM, Mulligan CJ. A three-stage colonization model for the peopling of the Americas. PLoS One. 2008;3(2):e1596. doi: 10.1371/journal.pone.0001596.
  • 10. Fumagalli M, et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science. 2015;349(6254):1343–1347. doi: 10.1126/science.aab2319.
  • 11. Ameur A, et al. Genetic adaptation of fatty-acid metabolism: A human-specific haplotype increasing the biosynthesis of long-chain omega-3 and omega-6 fatty acids. Am J Hum Genet. 2012;90(5):809–820. doi: 10.1016/j.ajhg.2012.03.014.
  • 12. Reich D, et al. Reconstructing Native American population history. Nature. 2012;488(7411):370–374. doi: 10.1038/nature11258.
  • 13. Visser WF, van Roermund CW, Waterham HR, Wanders RJ. Identification of human PMP34 as a peroxisomal ATP transporter. Biochem Biophys Res Commun. 2002;299(3):494–497. doi: 10.1016/s0006-291x(02)02663-3.
  • 14. Skoglund P, et al. Genetic evidence for two founding populations of the Americas. Nature. 2015;525(7567):104–108. doi: 10.1038/nature14895.
  • 15. Yi X, et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010;329(5987):75–78. doi: 10.1126/science.1190371.
  • 16. Foll M, Gaggiotti O. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: A Bayesian perspective. Genetics. 2008;180(2):977–993. doi: 10.1534/genetics.108.092221.
  • 17. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5(10):e1000695. doi: 10.1371/journal.pgen.1000695.
  • 18. Szpiech ZA, Hernandez RD. selscan: An efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol. 2014;31(10):2824–2827. doi: 10.1093/molbev/msu211.
  • 19. Rasmussen M, et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature. 2010;463(7282):757–762. doi: 10.1038/nature08835.
  • 20. Rasmussen M, et al. The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature. 2014;506(7487):225–229. doi: 10.1038/nature13025.
  • 21. Fu Q, et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature. 2014;514(7523):445–449. doi: 10.1038/nature13810.
  • 22. Raghavan M, et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. 2014;505(7481):87–91. doi: 10.1038/nature12736.
  • 23. Dulik MC, et al. Mitochondrial DNA and Y chromosome variation provides evidence for a recent common ancestry between Native Americans and Indigenous Altaians. Am J Hum Genet. 2012;90(2):229–246. doi: 10.1016/j.ajhg.2011.12.014.
  • 24. Cardona A, et al. Genome-wide analysis of cold adaptation in indigenous Siberian populations. PLoS One. 2014;9(5):e98076. doi: 10.1371/journal.pone.0098076.
  • 25. Harrit RK, et al. Paleo-Eskimo beginnings in North America: A new discovery at Kuzitrin Lake, Alaska. Etud Inuit. 1998;22:61–81.
  • 26. Gilbert MT, et al. Paleo-Eskimo mtDNA genome reveals matrilineal discontinuity in Greenland. Science. 2008;320(5884):1787–1789. doi: 10.1126/science.1159750.
  • 27. Hünemeier T, et al. Evolutionary responses to a constructed niche: Ancient Mesoamericans as a model of gene-culture coevolution. PLoS One. 2012;7(6):e38862. doi: 10.1371/journal.pone.0038862.
  • 28. González-José R, Bortolini MC, Santos FR, Bonatto SL. The peopling of America: Craniofacial shape variation on a continental scale and its interpretation from an interdisciplinary view. Am J Phys Anthropol. 2008;137(2):175–187. doi: 10.1002/ajpa.20854.
  • 29. Bortolini MC, González-José R, Bonatto SL, Santos FR. Reconciling pre-Columbian settlement hypotheses requires integrative, multidisciplinary, and model-bound approaches. Proc Natl Acad Sci USA. 2014;111(2):E213–E214. doi: 10.1073/pnas.1321197111.
  • 30. Hoffecker JF, Elias SA, O’Rourke DH, Scott GR, Bigelow NH. Beringia and the global dispersal of modern humans. Evol Anthropol. 2016;25(2):64–78. doi: 10.1002/evan.21478.
  • 31. Laurie CC, et al.; GENEVA Investigators. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet Epidemiol. 2010;34(6):591–602. doi: 10.1002/gepi.20516.
  • 32. Delaneau O, Zagury JF, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10(1):5–6. doi: 10.1038/nmeth.2307.
  • 33. Reynolds J, Weir BS, Cockerham CC. Estimation of the coancestry coefficient: Basis for a short-term genetic distance. Genetics. 1983;105(3):767–779. doi: 10.1093/genetics/105.3.767.
  • 34. Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002;18(2):337–338. doi: 10.1093/bioinformatics/18.2.337.
  • 35. Foll M, Gaggiotti OE, Daub JT, Vatsiou A, Excoffier L. Widespread signals of convergent adaptation to high altitude in Asia and America. Am J Hum Genet. 2014;95(4):394–407. doi: 10.1016/j.ajhg.2014.09.002.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences