Abstract
Southwestern Angola is a region characterized by contact between indigenous foragers and incoming food-producers, involving genetic and cultural exchanges between peoples speaking Kx’a, Khoe-Kwadi, and Bantu languages. Although present-day Bantu speakers share a patrilocal residence pattern and matrilineal principle of clan and group membership, a highly stratified social setting divides dominant pastoralists from marginalized groups that subsist on alternative strategies and have previously been thought to have pre-Bantu origins. Here, we compare new high-resolution sequence data from 2.3 Mb of the male-specific region of the Y chromosome (MSY) from 170 individuals with previously reported mitochondrial DNA (mtDNA) genomes, to investigate the population history of seven representative southwestern Angolan groups (Himba, Kuvale, Kwisi, Kwepe, Twa, Tjimba, !Xun), and to study the causes and consequences of sex-biased processes in their genetic variation. We found no clear link between the formerly Kwadi-speaking Kwepe and pre-Bantu eastern African migrants, and no pre-Bantu MSY lineages among Bantu-speaking groups, except for small amounts of “Khoisan” introgression. We therefore propose that irrespective of their subsistence strategies, all Bantu-speaking groups of the area share a male Bantu origin. Additionally, we show that in Bantu-speaking groups, the levels of within-group and between-group variation are higher for mtDNA than for MSY. These results, together with our previous demonstration that the matriclanic systems of southwestern Angolan Bantu groups are genealogically consistent, suggest that matrilineality strongly enhances both female population sizes and interpopulation mtDNA variation.
Subject terms: Genetic variation, Genetic markers
Introduction
Due to their uniparental modes of inheritance, the mitochondrial DNA (mtDNA) and male-specific region of the Y chromosome (MSY) have been extensively used to study differences between paternal and maternal histories of human populations and assess the influence of socio-cultural practices on sex-specific patterns of variation [1, 2]. However, direct comparisons of mtDNA and MSY diversity were hampered until recently by differences between methods used to detect variation, and by ascertainment bias in the choice of MSY single nucleotide polymorphisms (SNPs) [3]. In the last few years, various studies took advantage of the increasing availability of next-generation sequencing (NGS) platforms to compare unbiased MSY and mtDNA sequence data on a global scale [4–7]. Nevertheless, the sex-specific patterns disclosed by worldwide studies reflect average trends exhibited by different macro-regions, and do not explore the rich diversity of interactions between genetic variation and cultural practices that shape MSY and mtDNA variation at the local level [8].
Here, we used targeted NGS to generate unbiased sequence data from 2.3 Mb of the MSY in 170 males comprising seven small communities from SW Angola, whose complete mtDNA genomes have been previously investigated [9]. Despite being located in a relatively small geographic area (Fig. 1), these groups offer a unique framework to explore the relationships between socio-cultural practices, local variation in mtDNA versus MSY diversity, and continent-wide migratory processes. The !Xun from Kunene Province are Kx’a-speaking hunter gatherers who descend from the oldest population layer of southern Africa represented by the so-called “Khoisan” peoples [10]. The Kuvale, Himba, Tjimba, Twa, Kwisi, and Kwepe from the Angolan Namib Desert are all Bantu-speaking groups, who display striking socio-economic disparities that have been associated with different population histories. The Himba and Kuvale are two mildly polygynous pastoralist populations belonging to the broad Herero ethnic division, who arrived in SW Africa during the Bantu expansions [11]. The Kwepe, Twa, Kwisi, and Tjimba are marginalized ethnic minorities subsisting on small-scale pastoralism and foraging, who gravitate around the Himba and Kuvale and are best described as peripatetic peoples [12]. While the Himba-speaking Tjimba are commonly thought to be impoverished Himba [13], the Kuvale-speaking Kwisi and Twa have been associated with a hypothetical stratum of pre-Bantu foragers of unknown provenance, whose original language has been lost [14]. The Kwepe, who spoke Kwadi—a now extinct language belonging to the Khoe-Kwadi family—before shifting to Kuvale, were considered to be remnants of a pre-Bantu pastoralist migration introducing Khoe-Kwadi languages into southern Africa [15]. 
In spite of their different subsistence strategies, all Bantu-speaking groups of the Angolan Namib share a patrilocal residence pattern and a matrilineal descent-group system, which regulates important parts of social life, such as group membership, inheritance, and marriage behavior.
In this study, we compare new data on MSY sequence variation with our previous findings on whole mtDNA genomes [9] to investigate whether the histories of these groups fit the expectations of earlier anthropological and linguistic hypotheses, and to study the causes and consequences of sex-biased processes for their genetic variation.
Material and methods
Samples
We analyzed 162 partial MSY sequences sampled from populations that inhabit the Angolan Namib Desert (including the Himba, Kuvale, Kwepe, Kwisi, Twa and Tjimba) and the Kunene Province (!Xun) (Fig. 1; Table S1), and eight additional sequences from Bantu-speaking individuals with other ethnic affiliations used exclusively in haplotype-based analysis. The samples were collected as described previously [16], with the donors’ written informed consent, the ethical clearance of ISCED and CIBIO/InBIO-University of Porto boards, and the support and permission of the Provincial Governments of Namibe and Kunene.
MSY Sequencing
Indexed libraries produced previously [9] were enriched for 2.3 Mb of target MSY as previously described [17]. Paired-end sequencing data of 107 bp length were generated on the Illumina HiSeq 2500 platform and standard Illumina base-calling was performed using Bustard. We trimmed Illumina adapters and merged completely overlapping paired sequences using leeHom [18], and de-multiplexed the pooled sequencing data using deML [19]. The sequencing data were aligned to the human reference genome hg19 and SNPs were identified according to ref. [17]. Reads that aligned to the MSY captured region are available in the European Nucleotide Archive (https://www.ebi.ac.uk/ena) with the study accession number PRJEB27776. Y chromosome haplogroups were determined using yhaplo [20], which is based on the International Society of Genetic Genealogy (ISOGG) nomenclature of January 2016 (https://isogg.org/tree/index.html), with two modifications: (i) the SNP defining B-50f2(P) (Table S2) was corrected according to a recent ISOGG update (August 6th, 2017); and (ii) instead of using variant NC_000024.9:g.16251357G>A (not typed in this study), we used variant NC_000024.9:g.7595638T>A [21, 22] to define haplogroup E-V1245 (Table S2).
Data analysis
Genetic diversity indices, pairwise Φst values and Analyses of Molecular Variance (AMOVA) were computed in Arlequin v3.5 [23]. Non-metric multidimensional scaling (MDS), k-means and neighbor-joining (NJ) analyses based on pairwise Φst distance matrices were carried out in R, using the functions “isoMDS”, “kmeans” with several random starts, and “nj”, respectively. To determine the support of NJ partitions we generated bootstrap replicates with the function “boot.phylo” and used “stat.phist” (strataG v0.9.2) to recalculate Φst distances. Mantel tests were performed in R using the function “mantel” (package vegan) with 1,000 permutations of matrix elements to determine significance.
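The permutation logic behind a Mantel test can be sketched in a few lines of Python. This is only an illustration and not the vegan implementation used in this study: the function names, the toy matrices, and the one-sided p-value convention are assumptions. The sketch correlates the off-diagonal entries of two distance matrices, then repeats the calculation after jointly shuffling the rows and columns of one matrix.

```python
import math
import random


def _upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(dist_a, dist_b, permutations=1000, seed=1):
    """Return (observed correlation, one-sided permutation p-value)."""
    n = len(dist_a)
    identity = list(range(n))
    fixed = _upper_triangle(dist_b, identity)
    observed = _pearson(_upper_triangle(dist_a, identity), fixed)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # the same shuffle is applied to rows and columns
        if _pearson(_upper_triangle(dist_a, order), fixed) >= observed:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical 4 x 4 distance matrices, not data from this study.
toy_a = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
toy_b = [[0, 2, 4, 7], [2, 0, 2, 4], [4, 2, 0, 2], [7, 4, 2, 0]]
r, p = mantel_test(toy_a, toy_b)
```

With only a handful of populations the number of distinct permutations is small, so the smallest attainable p-value is bounded by the number of groups rather than by the number of permutations requested.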
A phylogenetic tree was constructed with BEAST v1.8 [24], using an A00 representative haplotype as outgroup [4]. To account for the absence of invariable sites in BEAST, we applied an invariant site correction. We used a strict clock and a mutation rate of 0.74 × 10⁻⁹ mutations/bp/year, as estimated by Karmin et al. [4] based on calibration with two ancient DNA sequences. Despite the uncertainty regarding the mutation rate and differences between genealogical and evolutionary-based estimates, the estimates calibrated with aDNA better match independently-dated events such as the Out-of-Africa expansion and the peopling of the Americas [3]. Furthermore, the use of aDNA calibrations yielded similar results in different studies [25]. Additional settings used in BEAST are reported in Table S3. We performed additional BEAST runs to build Bayesian Skyline plots (BSP) for different population groupings based on MSY and previously published mtDNA data [9] (see settings in Table S3). The estimates of effective population size (Ne) were obtained assuming the more conservative, lower values of generation time recommended by Fenner [26] for hunter-gatherer females and males (25 and 31 years, respectively), as they are likely more suitable for the traditional societies studied here than the corresponding values estimated for nation states.
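The scale implied by these settings can be made concrete with a small Python sketch. The mutation rate, target length, and generation times are taken from the text. The helper names are hypothetical, and the sketch assumes that a skyline run on a per-year clock reports population size as Ne multiplied by generation time.

```python
MSY_RATE_PER_BP_PER_YEAR = 0.74e-9  # strict-clock rate quoted in the text
MSY_TARGET_BP = 2.3e6               # sequenced MSY target quoted in the text
GENERATION_YEARS = {"female": 25, "male": 31}  # generation times quoted in the text


def years_per_mutation(rate=MSY_RATE_PER_BP_PER_YEAR, length=MSY_TARGET_BP):
    """Average waiting time for one mutation along a single lineage."""
    return 1.0 / (rate * length)


def expected_pairwise_differences(tmrca_years,
                                  rate=MSY_RATE_PER_BP_PER_YEAR,
                                  length=MSY_TARGET_BP):
    """Expected differences between two sequences that coalesce tmrca_years ago.

    Both lineages accumulate mutations independently, hence the factor of 2.
    """
    return 2 * tmrca_years * rate * length


def ne_from_skyline(ne_times_generation, sex):
    """Convert a skyline value in units of Ne x generation time into Ne.

    Assumption: the skyline axis is Ne multiplied by generation time in years.
    """
    return ne_times_generation / GENERATION_YEARS[sex]


waiting_time = years_per_mutation()  # roughly 588 years per mutation
```

Under these assumptions a single 2.3 Mb lineage gains about one mutation every six centuries, which is why events within the last 2 kya rest on only a few mutations per lineage.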
Median-joining networks were computed with Network 5.0 (www.fluxus-engineering.com) and plotted with Network Publisher v2.1.1.2.
For comparative purposes, we merged the sequence data generated in this study (2.3 Mb) with (i) 447 partial MSY sequences (0.9 Mb) from other southern African groups [27], obtaining an overlap of 0.56 Mb, and (ii) 21 complete Y chromosomes of various origins in Africa [22, 28]. The merged datasets were used to build networks.
Results
We obtained 2.3 Mb of MSY sequence from each of 170 Angolan individuals, with a mean coverage of 28× (range 8–52×). After quality filtering, a total of 1854 SNPs were identified, of which only 66% are reported in dbSNP (build 150). A VCF file containing all SNPs and 154 non-variable nucleotide positions that are different from the reference sequence is available online (Supplementary datafile 1). An average of 6 nucleotides per individual (0.3%) were missing and were imputed with Beagle 4.0 [29] as described in Supplementary datafile 2. We used a simulated dataset to evaluate the performance of Beagle in imputing missing genotypes from haploid data and show that this imputation choice is suitable for empirical datasets where the amount of missing genotypes is below 25% (Fig. S1; Supplementary datafile 2).
MSY phylogeography in Angola
Figure 1 displays a Bayesian phylogenetic tree for the Angolan MSY sequences, which also includes an early splitting A00 haplotype [4] (see also Fig. S2 for a network relating all Angolan haplotypes). The estimated split time of the A-L419 branch (143 kya), which comprises A-P262 and A-M51, corresponds to the time to the most recent common ancestor (TMRCA) of all Angolan sequences. This and the date of the split of the Angolan sequences from A00 (256 kya) are remarkably close to previous estimates based on high-coverage whole Y chromosomes sampled from other populations (Table S4) [4, 30]. Despite being sampled in a relatively small area, the Angolan lineages have very different phylogeographical characteristics, and belong to haplogroups that have been associated with three major population layers that settled southern Africa at different periods (see Table S2 for alternative haplogroup nomenclatures). A-P262, A-M51, and B-50f2(P) contain deep-rooting nodes and are associated with an early substrate of “Khoisan” foragers (>10 kya) speaking Kx’a and Tuu languages [27]. These represent 44% of the !Xun but only 0–9% of the genetic makeup of Bantu-speaking peoples from Angola (Fig. 1; Table S2). E-M293 sequences, which have been linked to a pre-Bantu migration of sheep pastoralists from East Africa (~2 kya) [31] have a TMRCA of 6.6 kya and are observed in varying frequencies among the Kwepe (7%), Kwisi (6%), Himba (2%), and !Xun (25%) (Fig. 1; Table S2). E-M180 and B-M109, which have been previously associated with the Bantu expansions [32, 33] (though B-M109 might also have existed in “Khoisan” groups before the arrival of Bantu speakers [27]), have TMRCAs close to 10 kya and represent 91–98% of the MSY sequences sampled among Bantu-speaking groups and 31% of the !Xun MSY sequences (Fig. 1; Table S2). In accordance with previous studies [7], subhaplogroup E-M180 displays a star-like branching pattern (Figs. 1 and S2), consistent with a rapid demographic expansion from a small ancestral population size.
An inspection of the molecular relationships between MSY haplotypes from different populations reveals that most lineages from Angola cluster together with other available sequences from southern Africa (Fig. S3). The only exceptions are the B-M109 sequences (Fig. S3g-h), which are grouped in a divergent monophyletic cluster that includes lineages previously found in SW Bantu groups from Namibia, and the B-50f2(P) haplotypes, which are very divergent from haplotypes found in the wider region of southern Africa or in Pygmy groups (Fig. S3c-d).
Intrapopulation diversity and demographic inferences
Table S1 presents summary statistics for MSY and mtDNA diversity. The MSY nucleotide diversity in the !Xun (πMSY = 1.2 × 10⁻⁴) is 2.5 times higher than in the Bantu-speaking groups (πMSY = 4.8 × 10⁻⁵), and similar to values calculated from previous studies for other “Khoisan” groups (πMSY = 1.5 × 10⁻⁴–1.8 × 10⁻⁴) [6, 27].
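For readers less familiar with the statistic, nucleotide diversity can be illustrated as the mean number of pairwise differences per site. The Python sketch below uses hypothetical toy haplotypes. It assumes haplotypes are given as equal-length strings of variable positions and that the total number of sequenced sites is supplied separately. It is not the Arlequin routine used for Table S1.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes, sequence_length):
    """Mean pairwise differences between haplotypes, divided by the number of sites."""
    pairs = list(combinations(haplotypes, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / len(pairs) / sequence_length


# Hypothetical toy sample: three haplotypes scored at three variable positions
# within a 10-site region (not data from this study).
toy_pi = nucleotide_diversity(["AAT", "AAC", "GAC"], sequence_length=10)

# Ratio of the two values reported in the text for the !Xun and the Bantu speakers.
reported_ratio = 1.2e-4 / 4.8e-5
```

In the toy sample the three pairs differ at 1, 2, and 1 positions, so the mean is 4/3 differences over 10 sites. The reported ratio reproduces the 2.5-fold difference stated above.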
In Bantu speakers, MSY vs. mtDNA diversity ratios accounting for differences in mutation rates of the two chromosomes (πMSY/πmtDNA) range from 0.09 to 0.5 (Table S1), indicating that Bantu peoples, like many other human populations, display less MSY diversity relative to mtDNA than expected in neutral demographic models without sex-biased processes [4, 6, 27, 34]. In contrast, the !Xun resemble other “Khoisan” groups in displaying comparable levels of diversity in both sexes (πMSY/πmtDNA = 1.11) [6, 27].
To better understand the present differences in levels of mtDNA and MSY diversity, we inferred the history of male and female effective population size (Ne) changes by using BSPs (Fig. 2). We found striking past population size differences between males and females in a pooled sample comprising all Bantu-speaking groups from the Angolan Namib (Fig. 2a). Starting from the past, Ne estimates based on mtDNA (Nef) remained stable at a high level (~20,000) for a long period of time, then dropped sharply to a minimum (~2000) around 2 kya, and expanded again toward the present (Fig. 2a). In contrast, the male demographic profile is characterized by a recent expansion from a relatively low, more stable long-term population size (Nem ~3000) (Fig. 2a). Unlike the Bantu-speaking populations, the !Xun displayed almost overlapping female and male population sizes that start to decline around 10 kya with no traces of population recovery (Fig. 2b).
When population size changes among Bantu speakers with different subsistence patterns were compared, the peripatetic communities showed less pronounced differences in sex-specific Ne, and smaller post-bottleneck size recoveries than the pastoral populations (Fig. 2c, d). These differences persisted when the lower sample sizes of peripatetics were taken into account (Fig. S4).
As BSPs assume a single, isolated, panmictic population, and the Angolan groups are likely to be part of a network of structured populations, some inferred demographic events might have been more influenced by migration levels and sampling design than by real changes in population size [35]. To account for these confounding factors, we generated separate mtDNA and MSY BSPs for all individual groups (Fig. S5). The demographic profile of the Kuvale, who display high frequencies of “Khoisan”-related mtDNA haplogroups, remained similar to that of the Himba, who have similar sample sizes and do not show signs of “Khoisan” introgression in their mtDNA [9], suggesting that the differences between female and male BSPs of pastoralists are not exclusively due to admixture with resident foragers (Fig. S5). On the other hand, not all of the BSPs for individual peripatetic groups show the signs of post-bottleneck Ne recovery that were detected in the pooled “peripatetic” sample (Fig. S5).
Interpopulation diversity
To compare the levels of between-population divergence for the MSY with our previous results on mtDNA variation in the same populations [9], we carried out AMOVA based on different partitions of the data (Table S5). Although we found similar amounts of divergence between the !Xun and Bantu speakers (22.5% for the MSY vs. 16.6% for mtDNA), the genetic differentiation among Bantu speakers is much lower for the MSY than for mtDNA (4.4% for the MSY vs. 20.2% for mtDNA), even when the Kuvale are removed from the comparisons to eliminate the confounding effects of “Khoisan” lineages on the levels of population divergence (5.5% for the MSY vs. 18.8% for mtDNA) [9] (Table S5). Moreover, we found that the genetic differentiation among matriclans for mtDNA (50.8%) is much higher than for the MSY (2.5%) (Table S5), reflecting the structuring effect of the matriclanic system on mtDNA, but not on MSY variation [9].
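The percentages above are variance components from AMOVA. As a minimal sketch of how a single-level Φst can be obtained from pairwise differences, the Python code below follows the standard sums-of-squares decomposition. It is an illustration under stated assumptions (haploid sequences, one grouping level, hypothetical function names and toy data) and does not reproduce the hierarchical Arlequin analyses reported in Table S5.

```python
def pairwise_differences(seq_a, seq_b):
    """Number of positions at which two equally long haplotypes differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def _ssd(haplotypes):
    """Sum of squared deviations: all ordered pairwise distances divided by 2n."""
    n = len(haplotypes)
    total = sum(pairwise_differences(a, b) for a in haplotypes for b in haplotypes)
    return total / (2 * n)


def phi_st(populations):
    """Single-level AMOVA Phi_st for a list of populations (lists of haplotypes)."""
    sizes = [len(pop) for pop in populations]
    n_total = sum(sizes)
    n_pops = len(populations)
    pooled = [hap for pop in populations for hap in pop]

    ssd_within = sum(_ssd(pop) for pop in populations)
    ssd_among = _ssd(pooled) - ssd_within

    ms_among = ssd_among / (n_pops - 1)
    ms_within = ssd_within / (n_total - n_pops)
    # Average sample size, corrected for unequal population sizes.
    n0 = (n_total - sum(size * size for size in sizes) / n_total) / (n_pops - 1)

    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    return var_among / (var_among + var_within)


# Hypothetical toy data: two populations fixed for different haplotypes.
fully_differentiated = phi_st([["AA", "AA"], ["TT", "TT"]])
```

When every population carries the same mix of haplotypes the among-population component can become negative, which is usually read as no detectable differentiation.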
The population relationships displayed in an MDS plot based on pairwise Φst values further reveal noticeable differences between the MSY and mtDNA (Fig. 3a), which are reflected in a lack of correlation between their corresponding Φst matrices (Mantel test, p-value = 0.091). In addition, there is a clear mismatch between the clustering patterns inferred from k-means analyses based on MSY and mtDNA (Fig. 3b, c). For mtDNA, the best k-means partition (k = 4; Φct = 20.9%; p-value < 0.006; Table S5) places the Kwisi and Twa in a separate group, and associates the Kwepe with the Himba. For MSY (k = 3; Φct = 14.4%; p-value < 0.014; Table S5), the Kwepe and the Kwisi are grouped with their northern Kuvale neighbors, while the southernmost Twa are grouped with the Himba (Figs. 1 and 3). Interestingly, the MSY clustering has remarkable parallels with the distribution of cultural traits such as language, dressing habits, and names of matriclans. For example, while the Twa tend to imitate the dressing habits of Himba women, the Kwisi and Kwepe try to mimic the characteristic attire of the Kuvale [14]. Moreover, the variety of Kuvale spoken by the Twa has clearly been influenced by the Himba language, while the Kwisi and Kwepe speak language varieties that are practically indistinguishable from mainstream Kuvale [9] (see NJ in Fig. S6a). This is also reflected in the significant correlation we found between lexicon-based linguistic distances [9] and MSY distances (Mantel test, p-value = 0.033). Finally, we have previously shown that peripatetic groups tend to replace their own clan names with those of their neighboring pastoral groups, leading to the shared use of matriclan labels by the Twa and Himba on the one hand, and by the Kwisi, Kwepe and Kuvale on the other [9].
Despite the genetic consistency of the matriclanic system within each group, this clan switching leads to quite different patterns of population relationships based on mtDNA variation and on the distribution of clan names (Fig. S6b-c). In contrast, we found that NJ trees constructed based on Φst distances for MSY and on distances based on clan name frequencies do have similar patterns (Fig. S6b, d), as confirmed by a significant correlation between the two distance matrices (Mantel test, p-value = 0.001).
Discussion
The origins of MSY diversity in SW Angola
In accordance with our previous mtDNA study [9], the present MSY analysis reveals a major division between the Kx’a-speaking !Xun and the Bantu-speaking groups, whose paternal genetic ancestry does not display any old remnant lineages or a clear link to pre-Bantu eastern African migrants introducing Khoe-Kwadi languages and pastoralism into southern Africa (cf. [15]). This is especially evident in the distribution of the eastern African subhaplogroup E-M293 [31], which reaches the highest frequency in the !Xun (25%) and not in the formerly Kwadi-speaking Kwepe (7%). This observation, together with recent genome-wide estimates of 9-22% of eastern African ancestry in other Kx’a and Tuu-speaking groups [36], suggests that eastern African admixture was not restricted to present-day Khoe-Kwadi speakers. Alternatively, it is likely that the dispersal of pastoralism and Khoe-Kwadi languages involved a series of punctuated contacts that led to a wide variety of cultural, genetic and linguistic outcomes, including possible shifts to Khoe-Kwadi by originally Bantu-speaking peoples [37].
Although traces of an ancestral pre-Bantu population may yet be found in autosomal genome-wide studies, the extant variation in both uniparentally inherited markers strongly supports a scenario in which all groups of the Angolan Namib share most of their genetic ancestry with other Bantu groups but became increasingly differentiated within the highly stratified social context of SW African pastoral societies [11].
The influence of socio-cultural behaviors on the diversity of MSY and mtDNA
A comparison of the MSY variation with previous mtDNA results for the same groups [9] identifies three main sex-specific patterns. First, gene flow from the Bantu into the !Xun is much higher for male than for female lineages (31% for the MSY vs. 3% for mtDNA; Table S2, see also Fig. 2 of ref. [9]), similar to the reported male-biased patterns of gene flow from Bantu to “Khoisan”-speaking groups [34], and from non-Pygmies to Pygmies in Central Africa [38]. A comparable trend involving the introgression of MSY eastern African lineages was also found in the !Xun, who exhibit high frequencies of haplogroup E-M293 (25%) while retaining the mtDNA profile characteristic of southern African “Khoisan” populations [9]. These patterns may be explained by a context of social discrimination, in which women from food producing populations are prevented from moving into forager communities, while food producing men and forager women can have children, who will then mostly be raised in the mother’s group [38, 39]. However, the dominant Kuvale pastoralists, who show a high frequency of “Khoisan”-related mtDNA (53%), indicate that admixed children may also remain in the father’s group [9].
Secondly, the levels of intrapopulation diversity in the Bantu-speaking peoples from the Namib were found to be consistently higher for mtDNA than for the MSY, reflecting the marked association between the Bantu expansion and the relatively young MSY E-M180 haplogroup, which has no parallel in mtDNA [27, 40]. In contrast, the !Xun have a more diverse MSY haplogroup composition, combining deeply rooted lineages and younger clades obtained through recent admixture. Using BSP analysis, we found these patterns to be reflected in larger long-term Nef than Nem in Bantu speakers, and more equal sex-specific Ne in the !Xun (Fig. 2).
Global patterns showing that Nef was larger than Nem during a large part of human history have been explained by a number of sex-biased processes, including natural selection affecting the MSY or culturally influenced sex-specific demographic behaviors [1–3]. In the context of the Bantu expansions, these patterns have been mostly interpreted as the result of polygyny and/or higher levels of assimilation of females from resident forager communities [39, 41]. However, most groups from the Angolan Namib are only mildly polygynous [11] and ethnographic data suggest that the actual rates of polygyny in many populations may be insufficient to significantly reduce Nem [2, 42]. In addition, the finding of a large Nef/Nem ratio in the Himba (Fig. S5), who have almost no “Khoisan”-related mtDNA lineages [9], indicates that female-biased introgression cannot fully explain the observed patterns.
An alternative explanation may be sought in the prevailing matrilineal descent rules, which might have created a sex-specific structuring effect, similar to that proposed for patrilineal groups from Central Asia [43]. As we previously demonstrated [9], all Bantu-speaking groups sampled in the region have genetically consistent, highly structured matrilineal descent systems, with levels of genetic variation between matriclans that are 20 times higher for mtDNA than for the MSY (50.8% for mtDNA vs. 2.5% for the MSY; Table S5). Since ethnic groups are conglomerates of matriclans, they harbor a remarkable amount of mtDNA structure and have fragmented female populations that can inflate Nef estimates [44]. Under this hypothesis, and using the terminology proposed by Wakeley (1999) [45], the population size growth starting at ~2 kya that is detected in both the female and male BSPs (Fig. 2 and S5) would be associated with the “scattering phase” of the mtDNA tree. The separation of male and female BSPs before 2 kya would then correspond to Wakeley’s “collecting phase” [45] and reflect the inflation of Nef due to the large mtDNA differences between matriclans. Since the male pool is not structured, the MSY tree has no collecting phase and Nem remains essentially unchanged beyond 2 kya. In the future, it will be interesting to make a more comprehensive re-evaluation of the relationship between descent rules and Nem/Nef ratios across different Bantu populations, since studies in other regions of the world have shown that the more structured sex may not display the highest Ne if the extinction rate of clans is high [43, 46].
The third important sex-specific pattern observed in this study is the much lower amount of between-group differentiation for the MSY than for mtDNA among Bantu-speaking populations (4.4% for the MSY vs. 20.2% for mtDNA), in spite of the patrilocal residence patterns of all ethnic groups (Table S5). This difference can hardly be explained by unequal levels of introgression of “Khoisan” mtDNA lineages into the Bantu, since the percentage of mtDNA variation remains high (18.8%) when the Kuvale, who have high frequencies of “Khoisan”-related mtDNA, are excluded from the comparisons. It therefore seems more plausible that differentiation is higher in the mtDNA simply because there is more ancestral mtDNA than MSY variation that can be sorted among different populations [47]. Moreover, due to the matriclanic organization of all Bantu-speaking communities, factors enhancing inter-group differentiation, like kin-structured migration and kin-structured founder effects [48], would have been restricted to mtDNA. Finally, it is also likely that the discrepancy between among-group divergence of mtDNA vs. the MSY might have been influenced by higher migration rates in males than females. In fact, although all Bantu-speaking populations have patrilocal residence patterns, the observance of endogamy rules severely constrains the between-group mobility of females. In this context, the children from extramarital unions involving members from different populations tend to be raised in the mother’s group, effectively increasing male versus female migration rates. Moreover, it is likely that, in the highly hierarchized setting of the Namib, most inter-group extramarital unions would involve men from dominant groups and women from peripatetic communities. This hypothesis is indirectly supported by the finding that in MSY-based clusters (but not in mtDNA-based clusters) pastoralist populations are grouped together with peripatetic communities that share their cultural traits (Figs. S6 and 3b), suggesting that migration of MSY lineages follows a path that is similar to horizontally transmitted cultural features.
Taken together, our results highlight the importance of the matrilineal rule of descent in shaping sex-specific patterns of population diversity and differentiation, stressing the need to better understand how regularities disclosed at the global level are associated with demographic processes occurring at local scales.
Acknowledgements
We thank all sample donors for their participation in this study, the governments of Namibe and Kunene Provinces in Angola for supporting our work, João Guerra, Raimundo Dungulo, and Serafim Nemésio for assistance in the preparation of fieldwork, António Mbeape, José Domingos, and Okongo Toko for assistance with sample collection, Roland Schröder for assistance in the lab and Enrico Macholdt for assistance in the processing of sequencing data. This is scientific paper no. 8 from the Portuguese-Angolan TwinLab established between CIBIO/InBIO and ISCED/Huíla, Lubango. This work is dedicated to the memory of our colleague Samuel Aço.
Author contributions
JR, BP, and MS planned the study; JR, A-MF, SO, TA, and FL performed the fieldwork. SO generated the laboratory data. SO, AH, BP, MS, and JR analyzed the data. SO and JR wrote the article with input from all other authors.
Funding
Financial support was provided by FEDER funds through the Operational Programme for Competitiveness Factors (COMPETE), by National Funds through FCT (Foundation for Science and Technology) under the PTDC/BIA-EVF/2907/2012 and FCOMP-01-0124-FEDER-028341, and by the Max Planck Society. SO was supported by the FCT grant SFRH/BD/85776/2012. BP acknowledges the LABEX ASLAN (ANR-10-LABX-0081) of Université de Lyon for its financial support within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) of the French government operated by the National Research Agency (ANR).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Electronic supplementary material
The online version of this article (10.1038/s41431-018-0304-2) contains supplementary material, which is available to authorized users.
References
- 1.Webster TH, Sayres MAW. Genomic signatures of sex-biased demography: progress and prospects. Curr Opin Genet Dev. 2016;41:62–71. doi: 10.1016/j.gde.2016.08.002. [DOI] [PubMed] [Google Scholar]
- 2.Heyer E, Chaix R, Pavard S, Austerlitz F. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 2012;21:597–612. doi: 10.1111/j.1365-294X.2011.05406.x.
- 3.Jobling MA, Tyler-Smith C. Human Y-chromosome variation in the genome-sequencing era. Nat Rev Genet. 2017;18:485–97. doi: 10.1038/nrg.2017.36.
- 4.Karmin M, Saag L, Vicente M, Sayres MAW, Järve M, Talas UG, et al. A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res. 2015;25:459–66. doi: 10.1101/gr.186684.114.
- 5.Hallast P, Batini C, Zadik D, Delser PM, Wetton JH, Arroyo-Pardo E, et al. The Y-chromosome tree bursts into leaf: 13,000 high-confidence SNPs covering the majority of known clades. Mol Biol Evol. 2014;32:661–73. doi: 10.1093/molbev/msu327.
- 6.Lippold S, Xu H, Ko A, Li M, Renaud G, Butthof A, et al. Human paternal and maternal demographic histories: Insights from high-resolution Y chromosome and mtDNA sequences. Investig Genet. 2014;5:13. doi: 10.1186/2041-2223-5-13.
- 7.Poznik GD, Xue Y, Mendez FL, Willems TF, Massaia A, Wilson Sayres MA, et al. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat Genet. 2016;48:593–9. doi: 10.1038/ng.3559.
- 8.Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, et al. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2006;2:420–4. doi: 10.1371/journal.pgen.0020053.
- 9.Oliveira S, Fehn AM, Aço T, Lages F, Gayà-Vidal M, Pakendorf B, et al. Matriclans shape populations: insights from the Angolan Namib Desert into the maternal genetic history of southern Africa. Am J Phys Anthropol. 2018;165:518–35. doi: 10.1002/ajpa.23378.
- 10.Barnard A. Hunters and Herders of Southern Africa: a comparative ethnography of the Khoisan Peoples. Cambridge: Cambridge University Press; 1992.
- 11.Estermann C. The Ethnography of Southwestern Angola: The Herero People (translated and edited by Gibson GD). New York: Africana Pub. Co; 1981.
- 12.Berland JC, Rao A. Customary strangers: new perspectives on peripatetic people in the Middle East, Africa, and Asia. Westport: Greenwood Publishing Group; 2004.
- 13.MacCalman H, Grobbelaar B. Preliminary report of two stone-working OvaTjimba groups in the northern Kaokoveld of South West Africa. Windhoek: Staatsmuseum; 1965.
- 14.Estermann C. The ethnography of Southwestern Angola: The Non-Bantu Peoples. The Ambo Ethnic Group (translated and edited by Gibson GD). New York: Africana Pub. Co; 1976.
- 15.Güldemann T. A linguist’s view: Khoe-Kwadi speakers as the earliest food-producers of southern Africa. South Afr Humanit. 2008;20:93–132.
- 16.Pinto JC, Oliveira S, Teixeira S, Martins D, Fehn AM, Aço T, et al. Food and pathogen adaptations in the Angolan Namib desert: Tracing the spread of lactase persistence and human African trypanosomiasis resistance into southwestern Africa. Am J Phys Anthropol. 2016;161:436–47. doi: 10.1002/ajpa.23042.
- 17.Kutanan W, Kampuansai J, Changmai P, Flegontov P, Schröder R, Macholdt E, et al. Contrasting maternal and paternal genetic variation of hunter-gatherer groups in Thailand. Sci Rep. 2018;8:1536. doi: 10.1038/s41598-018-20020-0.
- 18.Renaud G, Stenzel U, Kelso J. LeeHom: Adaptor trimming and merging for Illumina sequencing reads. Nucleic Acids Res. 2014;42:e141. doi: 10.1093/nar/gku699.
- 19.Renaud G, Stenzel U, Maricic T, Wiebe V, Kelso J. DeML: Robust demultiplexing of Illumina sequences using a likelihood-based approach. Bioinformatics. 2015;31:770–2. doi: 10.1093/bioinformatics/btu719.
- 20.Poznik GD. Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men. bioRxiv 2016; 88716.
- 21.Trombetta B, D’Atanasio E, Massaia A, Ippoliti M, Coppa A, Candilio F, et al. Phylogeographic refinement and large scale genotyping of human Y chromosome haplogroup E provide new insights into the dispersal of early pastoralists in the African continent. Genome Biol Evol. 2015;7:1940–50. doi: 10.1093/gbe/evv118.
- 22.Poznik GD, Henn BM, Yee MC, Sliwerska E, Euskirchen GM, Lin AA, et al. Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females. Science. 2013;341:562–5. doi: 10.1126/science.1237619.
- 23.Excoffier L, Lischer HEL. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10:564–7. doi: 10.1111/j.1755-0998.2010.02847.x.
- 24.Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012;29:1969–73. doi: 10.1093/molbev/mss075.
- 25.Balanovsky O. Toward a consensus on SNP and STR mutation rates on the human Y-chromosome. Hum Genet. 2017;136:575–90. doi: 10.1007/s00439-017-1805-8.
- 26.Fenner JN. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am J Phys Anthropol. 2005;128:415–23. doi: 10.1002/ajpa.20188.
- 27.Barbieri C, Hübner A, Macholdt E, Ni S, Lippold S, Schröder R, et al. Refining the Y chromosome phylogeny with southern African sequences. Hum Genet. 2016;135:541–53. doi: 10.1007/s00439-016-1651-0.
- 28.Mallick S, Li H, Lipson M, Mathieson I, Gymrek M, Racimo F, et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature. 2016;538:201–6. doi: 10.1038/nature18964.
- 29.Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81:1084–97. doi: 10.1086/521987.
- 30.Scozzari R, Massaia A, Trombetta B, Bellusci G, Myres NM, Novelletto A, et al. An unbiased resource of novel SNP markers provides a new chronology for the human Y chromosome and reveals a deep phylogenetic structure in Africa. Genome Res. 2014;24:535–44. doi: 10.1101/gr.160788.113.
- 31.Henn BM, Gignoux C, Lin AA, Oefner PJ, Shen P, Scozzari R, et al. Y-chromosomal evidence of a pastoralist migration through Tanzania to southern Africa. Proc Natl Acad Sci USA. 2008;105:10693–8. doi: 10.1073/pnas.0801184105.
- 32.Beleza S, Gusmão L, Amorim A, Carracedo A, Salas A. The genetic legacy of western Bantu migrations. Hum Genet. 2005;117:366–75. doi: 10.1007/s00439-005-1290-3.
- 33.De Filippo C, Barbieri C, Whitten M, Mpoloka SW, Gunnarsdóttir ED, Bostoen K, et al. Y-chromosomal variation in sub-Saharan Africa: insights into the history of Niger-Congo groups. Mol Biol Evol. 2011;28:1255–69. doi: 10.1093/molbev/msq312.
- 34.Bajić V, Barbieri C, Hübner A, Güldemann T, Naumann C, Gerlach L, et al. Genetic structure and sex-biased gene flow in the history of southern African populations. Am J Phys Anthropol. 2018;167:656–71.
- 35.Heller R, Chikhi L, Siegismund HR. The Confounding Effect of Population Structure on Bayesian Skyline Plot Inferences of Demographic History. PLoS One. 2013;8:e62992. doi: 10.1371/journal.pone.0062992.
- 36.Schlebusch CM, Malmström H, Günther T, Sjödin P, Coutinho A, Edlund H, et al. Southern African ancient genomes estimate modern human divergence to 350,000 to 260,000 years ago. Science. 2017;358:652–5. doi: 10.1126/science.aao6266.
- 37.Rocha J, Fehn A-M. Genetics and demographic history of the Bantu. eLS. 2016.
- 38.Verdu P, Becker NSA, Froment A, Georges M, Grugni V, Quintana-Murci L, et al. Sociocultural behavior, sex-biased admixture, and effective population sizes in central African pygmies and non-pygmies. Mol Biol Evol. 2013;30:918–37. doi: 10.1093/molbev/mss328.
- 39.Destro-Bisol G, Donati F, Coia V, Boschi I, Verginelli F, Caglià A, et al. Variation of female and male lineages in sub-Saharan populations: The importance of sociocultural factors. Mol Biol Evol. 2004;21:1673–82. doi: 10.1093/molbev/msh186.
- 40.Barbieri C, Vicente M, Oliveira S, Bostoen K, Rocha J, Stoneking M, et al. Migration and interaction in a contact zone: mtDNA variation among Bantu-speakers in Southern Africa. PLoS One. 2014;9:e99117. doi: 10.1371/journal.pone.0099117.
- 41.Wood ET, Stover DA, Ehret C, Destro-Bisol G, Spedini G, McLeod H, et al. Contrasting patterns of Y chromosome and mtDNA variation in Africa: Evidence for sex-biased demographic processes. Eur J Hum Genet. 2005;13:867–76. doi: 10.1038/sj.ejhg.5201408.
- 42.Scelza BA. Female choice and extra-pair paternity in a traditional human population. Biol Lett. 2011;7:889–91. doi: 10.1098/rsbl.2011.0478.
- 43.Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, et al. From social to genetic structures in central Asia. Curr Biol. 2007;17:43–48. doi: 10.1016/j.cub.2006.10.058.
- 44.Hartl D, Clark A. Principles of population genetics. Sunderland: Sinauer Associates; 1997.
- 45.Wakeley J. Non-equilibrium migration in human evolution. Genetics. 1999;153:1863–71. doi: 10.1093/genetics/153.4.1863.
- 46.Zeng TC, Aw AJ, Feldman MW. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun. 2018;9:2077. doi: 10.1038/s41467-018-04375-6.
- 47.Jakobsson M, Edge MD, Rosenberg NA. The relationship between F(ST) and the frequency of the most frequent allele. Genetics. 2013;193:515–28. doi: 10.1534/genetics.112.144758.
- 48.Fix A. Migration and colonization in human microevolution. Cambridge: Cambridge University Press; 1999.