American Journal of Human Genetics. 2019 Feb 28;104(3):553–561. doi: 10.1016/j.ajhg.2019.02.007

Recent Adaptive Acquisition by African Rainforest Hunter-Gatherers of the Late Pleistocene Sickle-Cell Mutation Suggests Past Differences in Malaria Exposure

Guillaume Laval 1,2,∗, Stéphane Peyrégne 1,2,3, Nora Zidane 1,2, Christine Harmant 1,2, François Renaud 4, Etienne Patin 1,2, Franck Prugnolle 4,5, Lluis Quintana-Murci 1,2,5,∗∗
PMCID: PMC6407493  PMID: 30827499

Abstract

The hemoglobin βS sickle mutation is a textbook case in which natural selection maintains a deleterious mutation at high frequency in the human population. Homozygous individuals for this mutation develop sickle-cell disease, whereas heterozygotes benefit from higher protection against severe malaria. Because the overdominant βS allele should be purged almost immediately from the population in the absence of malaria, the study of the evolutionary history of this iconic mutation can provide important information about the history of human exposure to malaria. Here, we sought to increase our understanding of the origins and time depth of the βS mutation in populations with different lifestyles and ecologies, and we analyzed the diversity of HBB in 479 individuals from 13 populations of African farmers and rainforest hunter-gatherers. Using an approximate Bayesian computation method, we estimated the age of the βS allele while explicitly accounting for population subdivision, past demography, and balancing selection. When the effects of balancing selection are taken into account, our analyses indicate a single emergence of βS in the ancestors of present-day agriculturalist populations ∼22,000 years ago. Furthermore, we show that rainforest hunter-gatherers have more recently acquired the βS mutation from the ancestors of agriculturalists through adaptive gene flow during the last ∼6,000 years. Together, our results provide evidence for a more ancient exposure to malarial pressures among the ancestors of agriculturalists than previously appreciated, and they suggest that rainforest hunter-gatherers have been increasingly exposed to malaria during the last millennia.

Keywords: rainforest hunter-gatherers, farmers, malaria, sickle mutation, balancing selection, approximate Bayesian computation, Africa

Main Text

The burden imposed by the malaria (MIM: 611162) parasite Plasmodium falciparum on children worldwide is paramount: it causes hundreds of thousands of deaths per year. As a consequence, malaria is probably among the strongest—and the most documented—selective pressures imposed by an infectious agent on the human population. In this context, balancing selection at the β-globin gene HBB (MIM: 141900), as a result of the heterozygote advantage afforded by the βS sickle (MIM: 603903) mutation in malaria-endemic regions (i.e., the “Haldane malaria hypothesis”),1, 2, 3 is the most iconic case in which natural selection maintains a deleterious mutation at high frequency in humans. Although βSS homozygotes develop sickle-cell disease, an often-fatal anemia caused by red-cell deformities, heterozygotes benefit from higher protection—of about an order of magnitude—against life-threatening forms of malaria.4, 5, 6

The evolutionary history of the βS mutation has been the object of intense research over the last decades; the study of this mutation can provide insight into the observed disparities in malaria susceptibility across populations and, more generally, into the history of human malaria. Because the current geographic distribution of βS correlates with the dependence of populations on agriculture, it has been hypothesized that malaria developed with slash-and-burn agriculture in central Africa.7, 8 Two exclusive models have been proposed in this regard: the multicentric model, which assumes recent, independent mutational events8, 9, 10, 11, 12, 13 occurring concomitantly with the emergence of agriculture ∼3,000–5,000 years ago (ya),14 and the unicentric model, which supports a more ancient, unique origin of the βS allele.15, 16, 17

A recent, comprehensive analysis of whole-genome sequence data for 156 βS carriers from agriculturalist (AGR) populations of the 1000 Genomes Project, the African Genome Variation Project, and Qatar has supported a unique origin of the βS allele ∼7,300 ya in Africa (95% credible interval: 3,400–11,100 years).18 However, this age was estimated from neutral (without selection) ancestral recombination graphs,19 whereas βS is known to have evolved under balancing selection. Indeed, in the absence of any heterozygote advantage, the βS mutation should have been purged almost immediately from the population because of its strong deleterious effect in the homozygous state.18 Furthermore, most studies have focused on the history of βS among AGR populations only. Defining the geographic distribution and age of the βS mutation in rainforest hunter-gatherers (RHGs)20 might thus provide new insights into the history and epidemiology of malaria in sub-Saharan Africa.

In the present study, we explored the evolutionary history of βS in populations of AGRs and RHGs by explicitly considering population structure, demography, and overdominance in our analyses. We sequenced HBB (Figure S1A) in 423 individuals from seven AGR and six RHG populations (Table 1). Informed consent was obtained from all participants, and the study was approved by the Institut Pasteur (IRB n° 2011-54/IRB/8). To investigate the history of the βS mutation (rs334) in a genome-wide context, we merged our βS genotypes with 332,550 available SNP genotypes obtained for 479 individuals, including the 423 sequenced individuals21, 22, 23 and 56 unrelated Yoruba (Yoruba in Ibadan, Nigeria, from HapMap24), in whom the βS mutation was genotyped (Supplemental Methods).

Table 1.

Description of Population Samples, βS Frequencies, and Haplotype-Based Statistics Computed Using 500 kb Windows Around βS

| Population | ID in Map (a) | Group | Sample Size | Genome-Wide Dataset | βS Frequency | iHS | ΔiHH | nSL | DIND | CSS (b) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mandenka | 1 | wAGR (c) | 20 | HGDP (ref. 21) | 0.15 | −3.640∗∗ | −5.104∗∗ | −3.769∗∗ | −4.014∗∗ | 26.661∗∗ |
| Yoruba | 2 | wAGR | 77 | HGDP, HapMap (ref. 24) | 0.136 | −3.627∗∗ | −4.475∗∗ | −3.789∗∗ | −3.900∗∗ | 26.000∗∗ |
| Nzebi | 3 | wAGR | 28 | CA (d) (ref. 22) | 0.089 | −3.527∗∗ | −6.719∗∗ | −3.804∗∗ | −8.686∗∗ | 29.641∗∗ |
| Bakota | 4 | wAGR | 46 | CA (ref. 22) | 0.098 | −3.052∗∗ | −3.844 | −3.139∗∗ | −2.409 | 20.168∗∗ |
| Nzime | 5 | wAGR | 47 | CA (ref. 23) | 0.043 | 1.025 | 0.863 | 0.897 | 0.943 | 0.533 |
| Bakiga | 6 | eAGR | 49 | CA (ref. 23) | 0.01 | NA (e) | NA | NA | NA | NA |
| Bantu | 7 | eAGR | 10 | HGDP | 0.1 | −1.338 | −0.849 | −1.341 | −0.579 | 7.872 |
| Baka | 8 | wRHG | 82 | CA (refs. 22, 23) | 0.11 | −3.071∗∗ | −4.771∗∗ | −2.976∗∗ | −3.527∗∗ | 23.367∗∗ |
| Bongo | 9 | wRHG | 33 | CA (refs. 22, 23) | 0.061 | −2.237 | −2.228 | −1.891 | −1.384 | 13.328 |
| Bakoya | 10 | wRHG | 25 | CA (ref. 22) | 0 | NA | NA | NA | NA | NA |
| Biaka | 11 | wRHG | 21 | HGDP | 0.048 | −1.448 | −1.669 | −1.457 | −1.099 | 10.489 |
| Mbuti | 12 | eRHG | 13 | HGDP | 0.154 | −1.889 | −1.455 | −1.768 | −1.516 | 11.867 |
| Batwa | 13 | eRHG | 28 | CA (ref. 23) | 0 | NA | NA | NA | NA | NA |
| Merged samples according to lifestyle | | | | | | | | | | |
| AGR | | | 277 | | 0.087 | −1.719 | −0.635 | −2.422 | 0.063 | 9.430 |
| RHG | | | 202 | | 0.069 | −3.193∗∗ | −3.362 | −3.001∗∗ | −3.715∗∗ | 20.850∗∗ |

∗p < 0.05; ∗∗p < 0.01.
(a) Identifiers used in Figure S1B.
(b) Combined selection score.
(c) “w” and “e” stand for western and eastern, respectively.
(d) Individuals from central Africa.
(e) Not applicable.

The βS mutation was, as expected, widely distributed across sub-Saharan Africans (Table 1 and Figure S1B) at frequencies that ranged from 0% (Batwa RHG and Bakoya RHG) to 15% (Mandenka AGR and Mbuti RHG).18, 20 To minimize the impact of local drift and low sample sizes, we merged the different subpopulations according to their lifestyles, and the frequencies of βS were remarkably similar: 9% in AGR and 7% in RHG. The degree of population differentiation at rs334 was slightly, although non-significantly, lower than genome-wide expectations (FST 0 versus 0.016; Figure S2). Although such merged frequency values depend on the populations included in the analysis, these observations suggest that malaria might have exerted selective pressures of comparable intensities in AGR and RHG populations. Under the Haldane hypothesis, similar exposure to malaria results in similar βS equilibrium frequencies and thus in diminished FST (shared overdominance).25, 26

To determine the time needed for βS to reach equilibrium frequencies, we simulated the frequency trajectory of this allele with the forward-in-time simulator SLiM.27 We assumed a relative fitness of each genotype wβAβA = 1, wβAβS = 1 + s, and wβSβS = 1 − l. Under overdominance (s > 0 and l > 0), the equilibrium frequency depends only on the ratio h = s/l (e.g., peq = 1 − (1 + h)/(1 + 2h) = h/(1 + 2h) in a population of infinite size, leading peq to be approximately equal to s/l when peq is small). We found that equilibrium frequencies similar to those observed in the AGR or RHG dataset (∼8%) can indeed be achieved in a very short time period (e.g., <2,800 years if one assumes a generation time of 28 years18, 28 and depending on the parameter l, Figure S3). These simulations suggest that, if βS first occurred earlier than ∼2,800 ya, its current frequency, which has already reached an equilibrium state depending on the s/l value, cannot be used for estimating its age. We thus reasoned that the age of βS could be estimated from the long-range conservation of its associated haplotypes. If βS has recently reached its equilibrium frequency, one would expect unusually long haplotypes associated with βS, a pattern similar to that expected after an event of recent positive selection.29 βS haplotypes will then be shortened by recombination as a function of the equilibrium-state duration, which can be very long (e.g., HLA haplotypes have been maintained for millions of years).30
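
The equilibrium relationship described above can be checked numerically. The following is a minimal sketch, not the SLiM simulations used in the study: it iterates the standard deterministic one-locus selection recursion in an infinite, randomly mating population, so it ignores drift, demography, and population structure. The function names, the starting frequency `q0`, and the 95% threshold are illustrative assumptions; s = 0.08 and l = 0.8 are chosen so that h = s/l = 0.1.

```python
def equilibrium_frequency(s, l):
    """Equilibrium frequency of the overdominant allele for fitnesses 1, 1 + s, 1 - l."""
    h = s / l
    return h / (1.0 + 2.0 * h)  # algebraically equal to 1 - (1 + h) / (1 + 2h)


def next_frequency(q, s, l):
    """Allele frequency after one generation of selection (infinite population, random mating)."""
    p = 1.0 - q
    w_aa, w_as, w_ss = 1.0, 1.0 + s, 1.0 - l
    mean_fitness = p * p * w_aa + 2.0 * p * q * w_as + q * q * w_ss
    return q * (p * w_as + q * w_ss) / mean_fitness


def generations_to_reach(fraction, s, l, q0, max_generations=100_000):
    """Number of generations until the frequency first reaches fraction * equilibrium."""
    target = fraction * equilibrium_frequency(s, l)
    q, generation = q0, 0
    while q < target and generation < max_generations:
        q = next_frequency(q, s, l)
        generation += 1
    return generation


if __name__ == "__main__":
    s, l = 0.08, 0.8           # assumed values giving h = 0.1
    q0 = 1.0 / (2 * 10_000)    # assumed: one new copy among 10,000 diploid individuals
    generations = generations_to_reach(0.95, s, l, q0)
    print("equilibrium frequency:", round(equilibrium_frequency(s, l), 4))
    print("generations to 95% of equilibrium:", generations)
    print("years at 28 years per generation:", generations * 28)
```

Because drift and demography are ignored, the generation count from this sketch is only indicative and need not match the time obtained from the full simulations.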

To test this prediction, we simulated the conservation of haplotypes carrying an overdominant mutation that occurred at different time depths, uniformly distributed until 100,000 ya (i.e., Plasmodium falciparum might have existed for 100,000 years in Africa).6 We assumed a single origin of βS in the AGR or in the RHG lineages (Figures 1A and 1B) and set h=0.1 to simulate equilibrium frequencies equal to peq=0.083 in each (close to the ∼8% observed in AGR and RHG). Given that child mortality of βSS homozygotes has been reported to range between 50% and 90%,31 we considered high recessive lethality (l=0.8) as previously assumed.18 We used the pedigree-based recombination rate of 2.7 × 10⁻⁸ per generation per site observed in the HBB region32 to simulate large DNA regions containing βS. We also simulated the population structure and specific demographic parameters that we recently inferred for these populations33 and matched the numbers of simulated SNPs and the allele frequency spectrum to genome-wide observations. After merging samples according to their AGR and RHG lifestyles (Figures 1A and 1B) to avoid unwanted noise due to low sample sizes, we assessed the long-range conservation of βS haplotypes by using four haplotype-based statistics (iHS,29 DIND,34 ΔiHH,35 and nSL36) that we computed with 100 kb and 500 kb windows around βS (Supplemental Methods).
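
To give a sense of scale for why haplotype length carries information about age, the sketch below computes the probability that a window around the mutation escapes recombination along a single lineage for a given number of generations, using the approximation exp(−r·L·g). This approximation, the function names, and the example ages are assumptions made for illustration; they are not the haplotype-based statistics or the simulations used in the study. The recombination rate and the 28-year generation time are the values quoted in the text.

```python
import math

RECOMBINATION_RATE = 2.7e-8   # per site per generation (value quoted in the text)
GENERATION_TIME = 28          # years per generation (value quoted in the text)


def expected_crossovers_per_generation(window_bp, rate=RECOMBINATION_RATE):
    """Expected number of crossovers per generation within a window of the given size."""
    return window_bp * rate


def prob_window_intact(age_years, window_bp, rate=RECOMBINATION_RATE,
                       generation_time=GENERATION_TIME):
    """Approximate probability that no crossover has hit the window on one lineage."""
    generations = age_years / generation_time
    return math.exp(-rate * window_bp * generations)


if __name__ == "__main__":
    for window in (100_000, 500_000):
        for age in (5_000, 20_000, 50_000):   # example ages, chosen arbitrarily
            print(window, age, prob_window_intact(age, window))
```

Under this simplification, the probability falls off exponentially with age, which is consistent with younger mutations sitting on longer conserved haplotypes.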

Figure 1.

Simulated Demography in Agriculturalist (AGR) and Rainforest Hunter-Gatherer (RHG) Populations

(A and B) The long-range conservation of βS haplotypes is assessed in the lineage where the βS mutation occurred (at frequency 1/2N), i.e., in the corresponding, merged population group. For example, in (A), we computed the long-range conservation of βS haplotypes in the merged group of AGR samples. Time is presented in thousands of years, and effective population sizes are presented in thousands of individuals.

Our simulations clearly showed that haplotypes containing a βS mutation of recent origin tend to be more conserved than haplotypes that evolve at similar frequencies under neutrality, as attested by the markedly negative values of haplotype-based statistics (Figures 2 and S4). The monotonic relationship between mutation age and haplotype-based statistics indicated that these metrics are informative for estimating the age of an overdominant βS mutation. We then assessed the long-range conservation of βS haplotypes in our empirical data (Tables 1 and S1) by first computing the haplotype-based statistics in each population separately. We found genome-wide-significant signals of selection in some AGR groups, as previously shown,24, 37, 38 but also in several RHG populations (p < 0.01, Figure 3, Table 1). Consistent with our simulations, these results indicate rapid increases in frequency, and these increases are not compatible with neutral expectations. Furthermore, they reveal that recent balancing selection has targeted the βS mutation in both AGR and RHG, highlighting the need to explicitly consider this selective regime in our age estimations.

Figure 2.

Haplotype Long-Range Conservation Provides Information Useful for Estimating the Age of the βS Mutation

We used SLiM to simulate DNA regions around βS according to realistic demographic parameters (Figure 1), recombination rate (2.7 × 10⁻⁸ per generation per site), and overdominance parameters (h=0.1, peq=0.083, and l=0.8).

(A) Simulated long-range conservation of βS haplotypes in the AGR (red) and RHG (blue) lineages, for various bins of βS age drawn from a uniform distribution ranging from the present to 100,000 ya (bins of 10,000 years each). For each bin, the 99% CIs (red and blue dashed lines) and the average (red and blue curves) of simulated values are indicated. Four haplotype-based statistics, each computed with 500 kb windows around the simulated βS mutations, are shown. The haplotype-based statistics at the simulated βS in the AGR and RHG lineages were obtained from 200,000 simulations with the assumption that βS occurred by mutation in either of these lineages. The red and blue horizontal lines represent the empirical average haplotype-based statistics computed similarly at βS using our data. Note that βS haplotypes tend to be longer than expected under neutrality, as indicated by negative haplotype-based statistics (see Supplemental Methods).

(B) Distributions of the corresponding haplotype-based statistics obtained for all simulated ages (<100 ky) and for simulated ages younger than 40,000 years (<40 ky) and 5,000 years (<5 ky) in AGR and RHG, respectively.

Figure 3.

βS Haplotype-Based Statistics Observed in Each Population

(A and B) Colored circles and triangles indicate iHS values (y axis, left side) and the combined selection score (CSS) (y axis, right side), respectively, computed for the βS mutation (vertical dashed line) in each population separately. Significantly negative and positive values of iHS and CSS, respectively, indicate that the βS haplotypes are longer than expected under neutrality. Populations in which the signal was found to be significant at the genome-wide level are indicated (∗p < 0.05 and ∗∗p < 0.01). Black circles and gray triangles indicate iHS and CSS values, respectively, computed for mutations located in the 500 kb flanking regions of HBB and exhibiting allele frequencies in the same range of variation as that observed for βS, i.e., from 0 to 0.15. Values of other haplotype-based statistics for the βS mutation are reported in Table 1 and Table S1.

To estimate the age of βS, we next computed the haplotype-based statistics by merging population samples depending on their lifestyles, as we did in our simulations. We found more negative values for RHG than for AGR (Table 1); this difference could reflect a more recent age of βS in RHG (Figure 2) and/or varying population sub-structure, which might have differentially affected the AGR and RHG haplotype-based statistics. However, our simulation-based approach considers population sub-structure; the model used presented an excellent fit to the genome-wide levels of population differentiation (FST) observed within and between AGR and RHG lineages.33 Simulated ages ranging from the present to ∼15,000 and ∼60,000 ya generated haplotype-based statistics that are compatible with those observed in the RHG and AGR data, respectively, thus indicating a more recent βS age in RHG (Figure 2).

Because haplotype-based statistics contain no information on the origin of βS, we used the sequence-based data of HBB (Table S2) to test whether the βS allele in RHG was a new, independent mutation or the same as that present in AGR. Using simulations, we found that a single mutational origin of βS, in AGR or in RHG, better fits our data (p < 0.011, Supplemental Methods, Figure S5). We then considered the possibility that βS could have occurred in AGR or in RHG and estimated its age independently in the two groups. We used an ABC framework39 and the previous simulations for these estimations (Figure 2). As ABC summary statistics, we used the four haplotype-based metrics and the current βS frequency and θπ computed around βS to build posterior distributions from simulations that closely match the observed βS frequency and associated genetic diversity. Our simulations reproduced the empirical data well both in terms of βS frequencies and haplotype-based statistics (Figures S6 and S7). The estimation accuracy was found to be higher for young mutations, as expected, and was similar for the three ABC methods (Figures S8–S10). Because of the very similar posterior distributions obtained across methods (Figure S9), we summarized our results by combining them into a single posterior distribution (Supplemental Methods) without any loss of accuracy, e.g., the 90% credible intervals (CIs) computed from the combined posteriors contained the true values for ∼90% of the simulations, as expected (Figures S8 and S10).
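
The general logic of rejection-based ABC can be sketched as follows. This is a toy illustration only: `toy_summary_statistic` is a hypothetical stand-in that maps age to a single noisy, monotonic statistic, whereas the study used SLiM simulations, several summary statistics, and three ABC methods. The decay scale, noise level, acceptance fraction, and seed are arbitrary assumptions; only the uniform prior on age from the present to 100,000 ya comes from the text.

```python
import math
import random


def toy_summary_statistic(age_years, rng):
    """Hypothetical simulator stand-in: more negative for younger mutations, plus noise."""
    return -4.0 * math.exp(-age_years / 20_000.0) + rng.gauss(0.0, 0.2)


def abc_rejection(observed, n_simulations=20_000, accept_fraction=0.01, seed=1):
    """Draw ages from the prior, simulate, and keep the ages whose statistic is closest to the observed one."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_simulations):
        age = rng.uniform(0.0, 100_000.0)  # uniform prior on age, as in the text
        distance = abs(toy_summary_statistic(age, rng) - observed)
        draws.append((distance, age))
    draws.sort()
    n_accepted = max(1, round(n_simulations * accept_fraction))
    return sorted(age for _, age in draws[:n_accepted])


def summarize(posterior_ages):
    """Posterior mean and an approximate 95% credible interval from the accepted ages."""
    n = len(posterior_ages)
    mean = sum(posterior_ages) / n
    lower = posterior_ages[int(0.025 * n)]
    upper = posterior_ages[max(int(0.975 * n) - 1, 0)]
    return mean, lower, upper


if __name__ == "__main__":
    # Pretend the observed statistic was generated by a mutation aged 22,000 years.
    observed = -4.0 * math.exp(-22_000 / 20_000.0)
    print(summarize(abc_rejection(observed)))
```

The accepted ages approximate the posterior distribution; in this toy setting the interval widens for older ages because the statistic flattens with age, which mirrors the lower estimation accuracy reported for older mutations.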

Our mean estimates indicate that the βS mutation occurred very recently in RHG, ∼3,800 ya (e.g., 3,200 with 95% CI: [1,500–6,600 ya] when 500 kb windows were used) and ∼22,000 ya in AGR (e.g., 23,900 with 95% CI: [10,600–78,700 ya] when 500 kb windows were used; see Figure 4A for all individual estimates, and Figures S11A–S11C and S12A–S12C). These estimations further support the possibility of a single occurrence of βS in the ancestors of AGR, as proposed by the unicentric model18 and investigated here with our resequencing data (p < 0.002, Figure S13). The same estimates were obtained when we used western AGR and RHG individuals only, confirming that population subdivision did not alter our results (Figures S11D and S11E and Figures S12D and S12E). Our results collectively support a model where the βS mutation appeared only once in the ancestors of present-day AGR populations and was later introgressed into RHG groups through gene flow.

Figure 4.

ABC Estimations of the Age of the βS Mutation

Note that the error bars given on the right-hand models roughly indicate the 95% CI computed from the posterior distributions of the βS age.

(A and B) ABC posterior distributions of the βS age obtained from a combination of three different ABC methods (Supplemental Methods). Estimations were obtained through the use of haplotype-based statistics computed with 100 kb and 500 kb windows around βS. The age corresponding to the maximum posterior probability, the posterior average (in brackets), and the 95% CIs are indicated. (95% CIs are also displayed with horizontal colored lines.) The estimations obtained with each ABC method are indicated in Figures S11, S12, S16, and S17, and the model used for obtaining the estimations is indicated to the right of the corresponding figures. Black dotted lines indicate the uniform prior distributions of age from the present to 100,000 ya. Note that the estimations obtained with 500 kb windows in AGR have been performed without the nSL statistic so that estimations would not exceed the prior limits, as is known to be a problem in ABC.51

(A) Estimations obtained with simulations performed according to the parameters described in the main text (h=0.1, peq=0.083 and l=0.8) and used in Figure 2. The posterior distributions in red and blue were obtained from 200,000 simulations in which βS occurred in the AGR or the RHG lineage, respectively, and the haplotype-based statistics were computed in the corresponding lineage (Figures 1A and 1B).

(B) Estimations obtained from simulations performed according to the parameters described in the main text (h=0.1, peq=0.083 and l uniformly distributed between 0 and 1). These estimations were obtained from 200,000 simulations in which βS occurred in the AGR lineage and spread by gene flow to the RHG lineage. The long-range conservation of haplotypes with age simulated for this model can be found in Figures S14 and S15. Posterior distributions in AGR (red) and RHG (blue) were performed with the haplotype-based statistics computed in the AGR and RHG lineages, respectively. In the case of RHG, the age is slightly overestimated with respect to the estimation shown in (A) because in this single-origin model, βS has to occur first in AGR and spread to RHG by gene flow.

Finally, we sought to re-estimate the age of βS with no prior information on homozygote lethality and while assuming the retained single-origin model. We set l uniformly distributed between l=0 (wβSβS=1) and l=1 (wβSβS=0) and obtained similar, yet more robust, estimations (Figures 4B and S14–S17). The mean estimates confirmed that βS occurred first in AGR ∼24,000 ya (e.g., 23,300 with 95% CI: [13,200–73,500 ya] when 500 kb windows were used) and was later introduced in RHG. The posterior distribution of the time at which βS first occurred in AGR given the RHG data (Figure 4B and Figures S16 and S17), which cannot be formally used for estimating the time at which βS first occurred in RHG because it systematically overestimates the date of arrival of βS in these groups, confirms that increased gene flow led to the occurrence of βS in RHG in the last ∼6,000 years (see above). (The migration rate from AGR to RHG increased by two orders of magnitude 10,000 ya.33) Although the accuracy of demographic models inferred from genetic data is still a matter of debate,40 our estimates of a more recent age of βS in RHG appear to be robust to demography; when we swapped empirical data and simulated demography (i.e., analyzed the RHG data by using a “wrong” AGR demographic model and vice versa), we found that βS occurred ∼9,300 ya in RHG and ∼19,000 ya in AGR. Our estimated dates of βS arrival in AGR are indeed older than those recently obtained by Shriner and Rotimi,18 and even older when one considers their recombination rate of 1.5 × 10⁻⁸ (∼37,600 years, see Figure S18). This suggests that the use of neutral ancestral recombination graphs might lead to underestimations of the age of mutations targeted by recent balancing selection.19 Indeed, ancestral recombination graphs are known to be sensitive mainly to long-term balancing selection, owing to large inflations in the times to most recent common ancestor (TMRCA). Interestingly, our estimations of lethality of ∼0.7 (95% CI: 0.2–1), though imprecise, are in good agreement with epidemiological data related to mortality rates among βSS homozygotes.

Here, by combining computer simulations and population-genetics theory, we have revisited the evolutionary history of the βS mutation in the context of both realistic selective expectations and populations that differ in their lifestyles and ecologies. We considered a parsimonious selection model with similar fitness values across populations, a reasonable assumption given the very similar average βS frequencies observed in AGR and RHG. Our analyses, which show that the age and recessive lethality of an overdominant mutation can potentially be estimated from genetic data, open new opportunities to investigate selection parameters that vary according to ecological habitats. For example, central African Bantu-speaking AGR that live near the rainforest exhibit a lower βS average frequency than do the western African Mandenka and Yoruba, who live in more open savannah environments (∼7% versus ∼14%, respectively; Table 1). This suggests a reduced heterozygote advantage s of the βS variant among Bantu speakers, with respect to that previously assumed in non-Bantu-speaking AGR (s ∼0.15).10, 18 However, given that equilibrium frequencies are driven by h=s/l, a diminished equilibrium frequency might also indicate higher recessive lethality l among Bantu speakers. Indeed, the severity of sickle-cell disease depends on several environmental factors (e.g., climate, air quality, and infection),41 which can vary according to the lifestyles of the populations analyzed. Future investigations based on larger amounts of genetic data from populations differing in lifestyles and exposure to environmental cues, together with detailed epidemiological data, should reveal such differential selection by formally estimating h in individual populations.
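
As a worked illustration of how h = s/l ties observed frequencies to selection parameters, the infinite-population equilibrium expression peq = h/(1 + 2h) can be inverted to h = peq/(1 − 2peq). The sketch below applies this inversion to the approximate frequencies quoted above, treating them as equilibrium values. That treatment, the function names, and fixing l at 0.8 (the value used in the simulations) are assumptions for illustration, not estimates from the study.

```python
def implied_h(p_eq):
    """Invert p_eq = h / (1 + 2h) to obtain the ratio h = s / l."""
    if not 0.0 < p_eq < 0.5:
        raise ValueError("equilibrium frequency must lie strictly between 0 and 0.5")
    return p_eq / (1.0 - 2.0 * p_eq)


def implied_s(p_eq, l):
    """Heterozygote advantage s implied by p_eq when recessive lethality l is held fixed."""
    return implied_h(p_eq) * l


def implied_l(p_eq, s):
    """Recessive lethality l implied by p_eq when heterozygote advantage s is held fixed."""
    return s / implied_h(p_eq)


if __name__ == "__main__":
    for label, p_eq in (("central African AGR", 0.07), ("western African AGR", 0.14)):
        # l = 0.8 is an assumed, fixed value taken from the simulation settings
        print(label, "h =", implied_h(p_eq), "s at l = 0.8:", implied_s(p_eq, 0.8))
```

Under these assumptions, a lower equilibrium frequency can be produced either by a smaller s at fixed l or by a larger l at fixed s, which is the ambiguity described above.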

An important finding of our study is the much older age of βS emergence than previously appreciated. This is in agreement with the estimated date of emergence of human malaria in sub-Saharan Africa during the late Pleistocene, as suggested by the divergence of Plasmodium falciparum and its closest relative, Plasmodium praefalciparum (isolated in western lowland gorillas) 40,000–60,000 ya.42 Furthermore, our age estimates and selection signals support a scenario whereby the βS mutation was introduced from AGR to RHG groups through adaptive gene flow more recently, in accordance with the increase in migration rates between the two groups during the last 10,000 years.33 The frequency of βS in RHG, and its associated long haplotypes, are unlikely to result from neutral, recent gene flow from AGR populations (i.e., in the last hundreds of years) because RHG groups with the highest βS frequencies (i.e., the Baka and the Mbuti) present the lowest AGR ancestry proportions because of very recent and limited episodes of admixture.23 Our findings suggest instead that RHG have an increased βS heterozygote advantage due to an exposure to malaria that has been high enough to prevent the loss of βS and to drive its frequency close to 8% in the past few thousand years.

Collectively, our results support previous evidence in favor of an early occurrence of human malaria and substantial selective pressures predating the emergence of agriculture in Africa;18 i.e., they support the expansion of the highly anthropophilic malaria vector Anopheles gambiae43 and the emergence of Plasmodium falciparum42 during the late Pleistocene. Importantly, our study extends this knowledge to the view that the ancestors of present-day AGR were highly exposed to malaria before the ancestors of RHG were exposed, suggesting different ecological habitats and/or population densities for these groups. Consistent with this model, the genetic diversity of present-day AGR is compatible with population growth 16,000–22,000 ya; RHG, on the other hand, are known to live as small, mobile groups,23, 44 and increased population densities favor malaria transmission.45 This also suggests that the ancestors of AGR lived in open areas where malaria is expected to be more prevalent (e.g., rainforest fringe or savannah, as suggested for the ancestors of Bantu-speaking AGR46) or manipulated their habitat by creating open areas (as archaeological evidence suggests that Homo sapiens has manipulated the tropical forest for at least 45,000 years).47 More recently, the mid-Holocene climate changes, which had created encroaching savannah habitats in the periphery of the rainforest by at least 4,000 ya,46 and/or agriculture-induced deforestation, known to facilitate malaria transmission,48, 49, 50 could have further increased the exposure of RHG to malaria, as attested by the young age of βS in these populations. (The maximum posterior probabilities of the age of βS were found at 3,200 and 4,450 years, Figure 4.) In light of this, our results suggest that the penetration of Bantu-speaking AGR into the central African rainforest about 4,000–5,000 ya22, 46 was accompanied by both the increased prevalence of malaria among RHG groups and the adaptive acquisition of the βS malaria-protective mutation by these populations.

Declaration of Interests

The authors declare no competing interests.

Acknowledgments

We thank Paul Verdu, Luis B. Barreiro, and George H. Perry for providing western and eastern hunter-gatherer DNA samples. This work was supported by the Institut Pasteur, the Centre National de la Recherche Scientifique (CNRS), and the Agence Nationale de la Recherche (ANR) grants “IEIHSEER” ANR-14-CE14-0008-02, “TBPATHGEN” ANR-14-CE14-0007-02, and “AGRHUM” ANR-14-CE02-0003-01. The laboratory of L.Q.-M. has received funding from the French government’s Investissement d’Avenir program through the Laboratoire d’Excellence “Integrative Biology of Emerging Infectious Diseases” (grant no. ANR-10-LABX-62-IBEID).

Published: February 28, 2019

Footnotes

Supplemental Data can be found with this article online at https://doi.org/10.1016/j.ajhg.2019.02.007.

Contributor Information

Guillaume Laval, Email: glaval@pasteur.fr.

Lluis Quintana-Murci, Email: quintana@pasteur.fr.

Accession Numbers

All newly generated sequences reported in this manuscript are accessible in GenBank from the accession number GenBank: MK475663 to GenBank: MK476504.

Web Resources

GenBank, http://www.ncbi.nlm.nih.gov/genbank/

Supplemental Data

Document S1. Figures S1–S18, Table S1, Supplemental Methods, and Supplemental References
Document S2. Table S2
Document S3. Article plus Supplemental Data

References

  • 1.Allison A.C. Protection afforded by sickle-cell trait against subtertian malareal infection. BMJ. 1954;1:290–294. doi: 10.1136/bmj.1.4857.290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Allison A.C. The sickle-cell and hemoglobin C genes in some African populations. Ann. Hum. Genet. 1956;21:67–89. [PubMed] [Google Scholar]
  • 3.Haldane J.B.S. Disease and Evolution. La. Ric. Sci. 1949;19(Suppl. A):68–76. [Google Scholar]
  • 4.Ackerman H., Usen S., Jallow M., Sisay-Joof F., Pinder M., Kwiatkowski D.P. A comparison of case-control and family-based association methods: The example of sickle-cell and malaria. Ann. Hum. Genet. 2005;69:559–565. doi: 10.1111/j.1529-8817.2005.00180.x. [DOI] [PubMed] [Google Scholar]
  • 5.Hill A.V., Allsopp C.E., Kwiatkowski D., Anstey N.M., Twumasi P., Rowe P.A., Bennett S., Brewster D., McMichael A.J., Greenwood B.M. Common west African HLA antigens are associated with protection from severe malaria. Nature. 1991;352:595–600. doi: 10.1038/352595a0. [DOI] [PubMed] [Google Scholar]
  • 6.Kwiatkowski D.P. How malaria has affected the human genome and what human genetics can teach us about malaria. Am. J. Hum. Genet. 2005;77:171–192. doi: 10.1086/432519. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Livingstone F.B. Anthropological implications of sickle cell gene distribution in West Africa. Am. Anthropol. 1958;60:533–562. [Google Scholar]
  • 8.Wiesenfeld S.L. Sickle-cell trait in human biological and cultural evolution. Development of agriculture causing increased malaria is bound to gene-pool changes causing malaria reduction. Science. 1967;157:1134–1140. doi: 10.1126/science.157.3793.1134. [DOI] [PubMed] [Google Scholar]
  • 9.Chebloune Y., Pagnier J., Trabuchet G., Faure C., Verdier G., Labie D., Nigon V. Structural analysis of the 5′ flanking region of the beta-globin gene in African sickle cell anemia patients: further evidence for three origins of the sickle cell mutation in Africa. Proc. Natl. Acad. Sci. USA. 1988;85:4431–4435. doi: 10.1073/pnas.85.12.4431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Currat M., Trabuchet G., Rees D., Perrin P., Harding R.M., Clegg J.B., Langaney A., Excoffier L. Molecular analysis of the beta-globin gene cluster in the Niokholo Mandenka population reveals a recent origin of the beta(S) Senegal mutation. Am. J. Hum. Genet. 2002;70:207–223. doi: 10.1086/338304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Kurnit D.M. Evolution of sickle variant gene. Lancet. 1979;1:104. doi: 10.1016/s0140-6736(79)90093-x. [DOI] [PubMed] [Google Scholar]
  • 12.Mears J.G., Lachman H.M., Cabannes R., Amegnizin K.P.E., Labie D., Nagel R.L. Sickle gene. Its origin and diffusion from West Africa. J. Clin. Invest. 1981;68:606–610. doi: 10.1172/JCI110294. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Pagnier J., Mears J.G., Dunda-Belkhodja O., Schaefer-Rego K.E., Beldjord C., Nagel R.L., Labie D. Evidence for the multicentric origin of the sickle cell hemoglobin gene in Africa. Proc. Natl. Acad. Sci. USA. 1984;81:1771–1773. doi: 10.1073/pnas.81.6.1771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Phillipson D.W. Cambridge University Press; 2005. African Archaeology. [Google Scholar]
  • 15.Flint J., Harding R.M., Clegg J.B., Boyce A.J. Why are some genetic diseases common? Distinguishing selection from other processes by molecular analysis of globin gene variants. Hum. Genet. 1993;91:91–117. doi: 10.1007/BF00222709. [DOI] [PubMed] [Google Scholar]
  • 16.Solomon E., Bodmer W.F. Evolution of sickle variant gene. Lancet. 1979;1:923. doi: 10.1016/s0140-6736(79)91398-9. [DOI] [PubMed] [Google Scholar]
  • 17.Stine O.C., Dover G.J., Zhu D., Smith K.D. The evolution of two west African populations. J. Mol. Evol. 1992;34:336–344. doi: 10.1007/BF00160241. [DOI] [PubMed] [Google Scholar]
  • 18.Shriner D., Rotimi C.N. Whole-genome-sequence-based haplotypes reveal single origin of the sickle allele during the Holocene wet phase. Am. J. Hum. Genet. 2018;102:547–556. doi: 10.1016/j.ajhg.2018.02.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Rasmussen M.D., Hubisz M.J., Gronau I., Siepel A. Genome-wide inference of ancestral recombination graphs. PLoS Genet. 2014;10:e1004342. doi: 10.1371/journal.pgen.1004342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Cavalli-Sforza L.L. Orlando Academic Press; 1986. African pygmies. [Google Scholar]
  • 21.Li J.Z., Absher D.M., Tang H., Southwick A.M., Casto A.M., Ramachandran S., Cann H.M., Barsh G.S., Feldman M., Cavalli-Sforza L.L., Myers R.M. Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008;319:1100–1104. doi: 10.1126/science.1153717. [DOI] [PubMed] [Google Scholar]
  • 22.Patin E., Lopez M., Grollemund R., Verdu P., Harmant C., Quach H., Laval G., Perry G.H., Barreiro L.B., Froment A. Dispersals and genetic adaptation of Bantu-speaking populations in Africa and North America. Science. 2017;356:543–546. doi: 10.1126/science.aal1988. [DOI] [PubMed] [Google Scholar]
  • 23.Patin E., Siddle K.J., Laval G., Quach H., Harmant C., Becker N., Froment A., Régnault B., Lemée L., Gravel S. The impact of agricultural emergence on the genetic history of African rainforest hunter-gatherers and agriculturalists. Nat. Commun. 2014;5:3163. doi: 10.1038/ncomms4163. [DOI] [PubMed] [Google Scholar]
  • 24.Frazer K.A., Ballinger D.G., Cox D.R., Hinds D.A., Stuve L.L., Gibbs R.A., Belmont J.W., Boudreau A., Hardenbol P., Leal S.M., International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851–861. doi: 10.1038/nature06258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Brandt D.Y.C., César J., Goudet J., Meyer D. The effect of balancing selection on population differentiation: A study with HLA genes. G3 (Bethesda) 2018;8:2805–2815. doi: 10.1534/g3.118.200367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Schierup M.H., Vekemans X., Charlesworth D. The effect of subdivision on variation at multi-allelic loci under balancing selection. Genet. Res. 2000;76:51–62. doi: 10.1017/s0016672300004535. [DOI] [PubMed] [Google Scholar]
  • 27.Haller B.C., Messer P.W. SLiM 2: Flexible, interactive forward genetic simulations. Mol. Biol. Evol. 2017;34:230–240. doi: 10.1093/molbev/msw211. [DOI] [PubMed] [Google Scholar]
  • 28.Fenner J.N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 2005;128:415–423. doi: 10.1002/ajpa.20188. [DOI] [PubMed] [Google Scholar]
  • 29.Voight B.F., Kudaravalli S., Wen X., Pritchard J.K. A map of recent positive selection in the human genome. PLoS Biol. 2006;4:e72. doi: 10.1371/journal.pbio.0040072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Leffler E.M., Gao Z., Pfeifer S., Ségurel L., Auton A., Venn O., Bowden R., Bontrop R., Wall J.D., Sella G. Multiple instances of ancient balancing selection shared between humans and chimpanzees. Science. 2013;339:1578–1582. doi: 10.1126/science.1234070. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Grosse S.D., Odame I., Atrash H.K., Amendah D.D., Piel F.B., Williams T.N. Sickle cell disease in Africa: A neglected cause of early childhood mortality. Am. J. Prev. Med. 2011;41(6, Suppl 4):S398–S405. doi: 10.1016/j.amepre.2011.09.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Matise T.C., Chen F., Chen W., De La Vega F.M., Hansen M., He C., Hyland F.C., Kennedy G.C., Kong X., Murray S.S. A second-generation combined linkage physical map of the human genome. Genome Res. 2007;17:1783–1786. doi: 10.1101/gr.7156307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Lopez M., Kousathanas A., Quach H., Harmant C., Mouguiama-Daouda P., Hombert J.M., Froment A., Perry G.H., Barreiro L.B., Verdu P. The demographic history and mutational load of African hunter-gatherers and farmers. Nat. Ecol. Evol. 2018;2:721–730. doi: 10.1038/s41559-018-0496-4. [DOI] [PubMed] [Google Scholar]
  • 34.Barreiro L.B., Ben-Ali M., Quach H., Laval G., Patin E., Pickrell J.K., Bouchier C., Tichit M., Neyrolles O., Gicquel B. Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense. PLoS Genet. 2009;5:e1000562. doi: 10.1371/journal.pgen.1000562. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Grossman S.R., Shlyakhter I., Karlsson E.K., Byrne E.H., Morales S., Frieden G., Hostetter E., Angelino E., Garber M., Zuk O. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science. 2010;327:883–886. doi: 10.1126/science.1183863. [DOI] [PubMed] [Google Scholar]
  • 36.Ferrer-Admetlla A., Liang M., Korneliussen T., Nielsen R. On detecting incomplete soft or hard selective sweeps using haplotype structure. Mol. Biol. Evol. 2014;31:1275–1291. doi: 10.1093/molbev/msu077. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Liu X., Ong R.T., Pillai E.N., Elzein A.M., Small K.S., Clark T.G., Kwiatkowski D.P., Teo Y.Y. Detecting and characterizing genomic signatures of positive selection in global populations. Am. J. Hum. Genet. 2013;92:866–881. doi: 10.1016/j.ajhg.2013.04.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Sabeti P.C., Schaffner S.F., Fry B., Lohmueller J., Varilly P., Shamovsky O., Palma A., Mikkelsen T.S., Altshuler D., Lander E.S. Positive natural selection in the human lineage. Science. 2006;312:1614–1620. doi: 10.1126/science.1124309. [DOI] [PubMed] [Google Scholar]
  • 39.Beaumont M.A., Zhang W., Balding D.J. Approximate Bayesian computation in population genetics. Genetics. 2002;162:2025–2035. doi: 10.1093/genetics/162.4.2025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Lapierre M., Lambert A., Achaz G. Accuracy of demographic inferences from the site frequency spectrum: The case of the Yoruba population. Genetics. 2017;206:439–449. doi: 10.1534/genetics.116.192708. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Tewari S., Brousse V., Piel F.B., Menzel S., Rees D.C. Environmental determinants of severity in sickle cell disease. Haematologica. 2015;100:1108–1116. doi: 10.3324/haematol.2014.120030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Otto T.D., Gilabert A., Crellen T., Böhme U., Arnathau C., Sanders M., Oyola S.O., Okouga A.P., Boundenga L., Willaume E. Genomes of all known members of a Plasmodium subgenus reveal paths to virulent human malaria. Nat. Microbiol. 2018;3:687–697. doi: 10.1038/s41564-018-0162-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Crawford J.E., Lazzaro B.P. The demographic histories of the M and S molecular forms of Anopheles gambiae s.s. Mol. Biol. Evol. 2010;27:1739–1744. doi: 10.1093/molbev/msq070. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Patin E., Laval G., Barreiro L.B., Salas A., Semino O., Santachiara-Benerecetti S., Kidd K.K., Kidd J.R., Van der Veen L., Hombert J.M. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 2009;5:e1000448. doi: 10.1371/journal.pgen.1000448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Kabaria C.W., Gilbert M., Noor A.M., Snow R.W., Linard C. The impact of urbanization and population density on childhood Plasmodium falciparum parasite prevalence rates in Africa. Malar. J. 2017;16:49. doi: 10.1186/s12936-017-1694-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Grollemund R., Branford S., Bostoen K., Meade A., Venditti C., Pagel M. Bantu expansion shows that habitat alters the route and pace of human dispersals. Proc. Natl. Acad. Sci. USA. 2015;112:13296–13301. doi: 10.1073/pnas.1503793112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Roberts P., Hunt C., Arroyo-Kalin M., Evans D., Boivin N. The deep human prehistory of global tropical forests and its relevance for modern conservation. Nat. Plants. 2017;3:17093. doi: 10.1038/nplants.2017.93. [DOI] [PubMed] [Google Scholar]
  • 48.Afrane Y.A., Githeko A.K., Yan G. The ecology of Anopheles mosquitoes under climate change: case studies from the effects of deforestation in East African highlands. Ann. N Y Acad. Sci. 2012;1249:204–210. doi: 10.1111/j.1749-6632.2011.06432.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Guerra C.A., Snow R.W., Hay S.I. A global assessment of closed forests, deforestation and malaria risk. Ann. Trop. Med. Parasitol. 2006;100:189–204. doi: 10.1179/136485906X91512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Walsh J.F., Molyneux D.H., Birley M.H. Deforestation: Effects on vector-borne disease. Parasitology. 1993;106(Suppl):S55–S75. doi: 10.1017/s0031182000086121. [DOI] [PubMed] [Google Scholar]
  • 51.Fagundes N.J.R., Ray N., Beaumont M., Neuenschwander S., Salzano F.M., Bonatto S.L., Excoffier L. Statistical evaluation of alternative models of human evolution. Proc. Natl. Acad. Sci. USA. 2007;104:17614–17619. doi: 10.1073/pnas.0708280104. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

Document S1. Figures S1–S18, Table S1, Supplemental Methods, and Supplemental References
mmc1.pdf (5.2MB, pdf)
Document S2. Table S2
mmc2.xlsx (22.1KB, xlsx)
Document S3. Article plus Supplemental Data
mmc3.pdf (6.8MB, pdf)

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics