Abstract
Recent genetic studies have established that the KhoeSan populations of southern Africa are distinct from all other African populations and have remained largely isolated during human prehistory until ∼2000 years ago. Dozens of different KhoeSan groups exist, belonging to three different language families, but very little is known about their population history. We examine new genome-wide polymorphism data and whole mitochondrial genomes for >100 South Africans from the ≠Khomani San and Nama populations of the Northern Cape, analyzed in conjunction with 19 additional southern African populations. Our analyses reveal fine-scale population structure in and around the Kalahari Desert. Surprisingly, this structure does not always correspond to linguistic or subsistence categories as previously suggested, but rather reflects the role of geographic barriers and the ecology of the greater Kalahari Basin. Regardless of subsistence strategy, the indigenous Khoe-speaking Nama pastoralists and the N|u-speaking ≠Khomani (formerly hunter-gatherers) share ancestry with other Khoe-speaking forager populations that form a rim around the Kalahari Desert. We reconstruct earlier migration patterns and estimate that the southern Kalahari populations were among the last to experience gene flow from Bantu speakers, ∼14 generations ago. We conclude that local adoption of pastoralism, at least by the Nama, appears to have been primarily a cultural process with limited genetic impact from eastern Africa.
Keywords: ancestry, population structure, KhoeSan, pastoralism
The indigenous populations of southern Africa, referred to by the compound ethnicity “KhoeSan” (Schlebusch 2010), have received intense scientific interest. This interest is due both to the practice of hunter-gatherer subsistence among many groups—historically and to the present day—and genetic evidence suggesting that the ancestors of the KhoeSan diverged early on from all other African populations (Behar et al. 2008; Tishkoff et al. 2009; Henn et al. 2011, 2012; Pickrell et al. 2012; Veeramah et al. 2012; Barbieri et al. 2013). Genetic data from KhoeSan groups have been extremely limited until very recently, and the primary focus has been on reconstructing early population divergence. Demographic events during the Holocene and the ancestry of the Khoekhoe-speaking pastoralists have received limited, mostly descriptive, attention in human evolutionary genetics. However, inference of past population history depends strongly on understanding recent population events and cultural transitions.
The KhoeSan comprise a widely distributed set of populations throughout southern Africa, speaking, at least historically, languages from one of three different linguistic families, all of which contain click consonants rarely found elsewhere. New genetic data indicate that there is deep population divergence even among KhoeSan groups (Pickrell et al. 2012; Schlebusch et al. 2012, 2013; Schlebusch and Soodyall 2012; Barbieri et al. 2013), with populations living in the northern Kalahari estimated to have split from southern groups 30,000–35,000 years ago (Pickrell et al. 2012; Schlebusch et al. 2012; Schlebusch and Soodyall 2012). Pickrell et al. (2012) estimate a time of divergence between the northwestern Kalahari and southeastern Kalahari populations dating back to 30,000 years ago; “northwestern” refers to Juu-speaking groups like the !Xun and Ju|’hoansi, while “southeastern” refers to Taa speakers. In parallel, Schlebusch et al. (2012) also estimated an ancient time of divergence among the KhoeSan (dating back to 35,000 years ago), but here the southern groups include the ≠Khomani, Nama, and Karretjie (multiple language families), and the northern populations again refer to the !Xun and Ju|’hoansi. Thus, KhoeSan populations are not only strikingly isolated from other African populations but also appear geographically structured among themselves. To contrast this with Europeans, the ≠Khomani and the Ju|’hoansi may have diverged >30,000 years ago but live only 1000 km apart, roughly the distance between Switzerland and Denmark, whose populations show little genetic differentiation (Novembre et al. 2008). However, it is unclear how this ancient southern African divergence maps onto current linguistic and subsistence differences among populations, which may have emerged during the Holocene.
In particular, the genetic ancestry of the Khoe-speaking populations, and specifically of the Khoekhoe (e.g., the Nama), who practice sheep, goat, and cattle pastoralism, remains a major open question. Archaeological data have been marshaled to argue for a demic migration of the Khoe from eastern Africa into southern Africa, but others have argued that pastoralism represents cultural diffusion without significant population movement (Boonzaier 1996; MacDonald 2000; Robbins et al. 2005; Sadr 2008, 2015; Dunne et al. 2012; Pleurdeau et al. 2012; Jerardino et al. 2014). Lactase persistence alleles are present in KhoeSan groups, are especially frequent in the Nama (20%), and clearly derive from eastern African pastoralist populations (Breton et al. 2014; Macholdt et al. 2014). This observation, in conjunction with other Y-chromosome and autosomal data (Henn et al. 2008; Pickrell et al. 2014), has been used to argue that pastoralism in southern Africa was another classic example of demic diffusion. However, the previous work is problematic in that it tended to focus on single loci (MCM6/LCT, Y chromosome), which are subject to drift or selection. Estimates of eastern African autosomal ancestry in the KhoeSan remain minimal (<10%), and the ancestry informative markers are distributed across both pastoralist and hunter-gatherer populations. Here, we present a comprehensive study of recent population structure in southern Africa and clarify fine-scale structure beyond “northern” and “southern” geographic descriptors. We then specifically test whether the Khoe-speaking Nama pastoralists derive their ancestry from eastern Africa, the northeastern Kalahari Basin, or far southern Africa. Our results suggest that ecological features of southern Africa, broadly speaking, are better explanatory features than language, clinal geography, or subsistence on its own.
Materials and Methods
Sample collection and ethical approval
DNA samples from the Nama, ≠Khomani San, and South African Colored populations were collected with written informed consent and approval of the Human Research Ethics Committee of Stellenbosch University (N11/07/210), South Africa, and Stanford University (protocol 13829). Community level results were returned to the communities in 2015 prior to publication. A contract for this project was approved by the Working Group of Indigenous Minorities in Southern Africa (ongoing).
Autosomal data and genotyping platforms
Two primary datasets were used: A) ∼565,000 SNPs on the Affymetrix Axiom Genome-wide Human Origins Array derived from Pickrell et al. (2012), Lazaridis et al. (2014), with additional ≠Khomani San and Hadza individuals from our collections for a total of 33 populations and 396 individuals. B) ∼320,000 SNPs from the intersection of HGDP (Illumina 650Y) (Li et al. 2008), HapMap3 (joint Illumina Human 1M and Affymetrix SNP 6.0), Illumina OmniExpressPlus and OmniExpress SNP array platforms generated here, as well as the dataset from Petersen et al. (2013) for a total of 21 populations and 852 individuals.
Population structure
ADMIXTURE (Alexander et al. 2009) was used to estimate ancestry proportions via a model-based approach. The program was run across a range of k values, where k is the assumed number of ancestral populations. Cross-validation (CV) was performed by ADMIXTURE, and the CV values were plotted to identify the most stable k value. The Q matrix was plotted in R. Ten iterations, each with a different random seed, were performed for each k value. Iterations were grouped according to admixture patterns to identify the major and minor modes using pong (Behr et al. 2015). The Q matrices from ADMIXTURE, together with the longitude and latitude coordinates for each population, were reformatted for use in an R script supplied by Ryan Raaum to generate the surface maps (Figure 2).
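As a minimal sketch of the cross-validation step, the Python below parses lines of the form `CV error (K=3): 0.540`, which ADMIXTURE prints when run with its `--cv` option, and averages the error for each k across replicate runs. The log excerpt, its values, and the helper names `parse_cv_errors` and `mean_cv_by_k` are hypothetical, and choosing the k with the lowest mean CV error is one common heuristic rather than necessarily the stability criterion applied in this study.

```python
import re
from collections import defaultdict

# Matches ADMIXTURE's cross-validation report line, e.g. "CV error (K=3): 0.540"
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")

def parse_cv_errors(log_text):
    """Return a list of (k, cv_error) pairs found in ADMIXTURE log text."""
    return [(int(k), float(err)) for k, err in CV_LINE.findall(log_text)]

def mean_cv_by_k(pairs):
    """Average the CV error over replicate runs for each k."""
    grouped = defaultdict(list)
    for k, err in pairs:
        grouped[k].append(err)
    return {k: sum(errs) / len(errs) for k, errs in sorted(grouped.items())}

# Hypothetical log excerpts from two replicate runs (values are made up).
example_logs = """
CV error (K=2): 0.560
CV error (K=3): 0.540
CV error (K=4): 0.550
CV error (K=2): 0.562
CV error (K=3): 0.538
CV error (K=4): 0.552
"""

means = mean_cv_by_k(parse_cv_errors(example_logs))
best_k = min(means, key=means.get)  # k with the lowest mean CV error
```

In practice the per-k means would be plotted, as described above, and inspected alongside the run-to-run modes rather than reduced to a single minimum.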
Estimating Effective Migration Surfaces (EEMs) analysis
Estimating Effective Migration Surfaces (EEMs) analyses (Petkova et al. 2016) were run on the Affymetrix Human Origins data set. Genetic dissimilarities were calculated using the bed2diffs script and EEMs was run using the runeems_snps version of the program. A grid is constructed so as to house all demes in the data provided. Each individual is assigned to a specific deme. Using a stepping stone model, migration rates between demes are calculated. Genetic dissimilarities are calculated fitting an “isolation-by-distance model.” In order for the MCMC iterations to converge, the number of MCMC iterations, burn iterations, and thin iterations were increased. The other parameters were optimized as per the manual’s recommendations, i.e., diversity and migration parameters were adjusted so as to produce 20–30% acceptance rates. The PopGPlot R package was used to visualize the data.
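To make the notion of a genetic dissimilarity matrix concrete, the sketch below computes the mean squared difference in genotype dosage (0, 1, or 2 copies of an allele) for each pair of individuals, skipping sites where either genotype is missing. This is an illustrative assumption about the kind of quantity involved, not a reimplementation of the bed2diffs script; the function name and the toy genotypes are hypothetical.

```python
def pairwise_dissimilarity(genotypes):
    """Mean squared dosage difference for every pair of individuals.

    genotypes: list of per-individual lists of dosages (0, 1, 2) or None
    for a missing call. Returns a symmetric n x n list of lists.
    """
    n = len(genotypes)
    diffs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Use only sites observed in both individuals.
            squared = [
                (a - b) ** 2
                for a, b in zip(genotypes[i], genotypes[j])
                if a is not None and b is not None
            ]
            if not squared:
                raise ValueError(f"no shared sites for individuals {i} and {j}")
            diffs[i][j] = diffs[j][i] = sum(squared) / len(squared)
    return diffs

# Toy data: three individuals, four SNPs, one missing genotype.
toy_genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, None],
    [2, 1, 0, 0],
]
D = pairwise_dissimilarity(toy_genotypes)
```

A matrix of this shape, together with sampling coordinates, is the kind of input that a spatial method can compare against the dissimilarities expected under isolation by distance.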
Association between Fst, geography, and language
A Mantel test (Fst and geographic distance) and a partial Mantel test (Fst and language, accounting for geographic distance) were performed using the vegan package in R. Geographic distances (in kilometers) between populations were calculated using latitude and longitude values as tabulated in Supplemental Material, Table S1. Weir and Cockerham genetic distances (Fst) were calculated from allele frequencies estimated with vcftools (Danecek et al. 2011). A Jaccard phonemic distance matrix was used as formulated in Creanza et al. (2015). Populations included in the analysis were the Nama, ≠Khomani, East Taa, West Taa, Naro, G|ui, G||ana, Shua, Kua, !Xuun, and Khwe.
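The logic of a Mantel test can be sketched in a few lines of Python: correlate the off-diagonal entries of two distance matrices, then permute the rows and columns of one matrix together to build a null distribution. The code below is a simplified stand-in written for illustration and is not the vegan implementation; it omits the partial Mantel test, and the coordinates and "genetic" distances are hypothetical toy values. Geographic distances are obtained with the haversine formula, which is an assumption about how kilometers were derived from latitude and longitude.

```python
import math
import random

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))

def upper_triangle(m):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(a, b, permutations=999, seed=0):
    """One-sided Mantel test: returns (r_observed, permutation p-value)."""
    rng = random.Random(seed)
    n = len(a)
    flat_a = upper_triangle(a)
    r_obs = pearson(flat_a, upper_triangle(b))
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns of b with the same relabeling.
        permuted = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical sampling coordinates (latitude, longitude) for five populations.
coords = [(-20.0, 15.0), (-22.0, 18.0), (-24.0, 21.0), (-26.0, 20.0), (-28.0, 17.0)]
geo = [[haversine_km(*p, *q) for q in coords] for p in coords]
# Toy genetic distances set proportional to geography, so r should be ~1.
fst = [[d / 10000.0 for d in row] for row in geo]

r, p = mantel(geo, fst, permutations=199, seed=1)
```

With real data the two matrices are independent measurements, and a partial Mantel test additionally conditions on a third matrix (here, geographic distance) before correlating Fst with language.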
Mitochondrial DNA network
We used Network (ver. 4.6, copyrighted by Fluxus Technology) for a median-joining phylogenetic network analysis to produce Figure 5 and Figure S6. Network Publisher (ver. 2.0.0.1, copyrighted by Fluxus Technology) was then used to draw the phylogenetic relationships among individuals.
Data availability
The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article. Data files are freely available on GitHub: https://github.com/bmhenn/khoesan_arraydata.
Results
To resolve fine-scale population structure and migration events in southern Africa, we generated genome-wide data from three South African populations. We genotyped ≠Khomani San (n = 75), Nama (n = 13), and South African Colored (SAC) (n = 25) individuals on the Illumina OmniExpress and OmniExpressPlus SNP array platforms. Sampling locations are listed in Table S1, in addition to language groupings and subsistence strategies. These data were merged with HapMap3 (joint Illumina Human1M and Affymetrix SNP 6.0) (International HapMap 3 Consortium et al. 2010), HGDP (Illumina 650Y) data (Li et al. 2008), and Illumina HumanOmni1-Quad (Petersen et al. 2013), resulting in an intersection of ∼320,000 SNPs for 852 individuals from 21 populations. In addition, we used the Affymetrix Human Origins SNP Array generated as part of Pickrell et al. (2012) and Lazaridis et al. (2014), including n = 9 ≠Khomani San individuals from our collection and encompassing >396 individuals from 33 populations. Whole mitochondrial genomes were generated from off-target reads from exome- and Y-chromosome capture short read Illumina sequencing. Reads were mapped to GRCh37, which uses the revised Cambridge reference sequence. Only individuals with >7× haploid coverage were included in the analysis: ≠Khomani San (n = 64) and Nama (n = 31); haplogroup frequencies were corrected for pedigree structure (Table S2). In this study, we address population structure among southern African KhoeSan, the genetic affinity of the Khoe, and how pastoralism diffused into southern Africa.
Population structure in southern African KhoeSan populations
We first tested whether southern African populations conform to an isolation-by-distance model, or whether there is strong heterogeneity among populations relative to geographic distance. Using 22 southern African populations (with 560,000 SNPs from Affymetrix Human Origins array), we implemented the spatially explicit program EEMs (Petkova et al. 2016) to test for effective migration patterns across the region. We observe a higher effective migration rate (m) in the central Kalahari Basin relative to a lower migration rate that forms a rim around the Kalahari Desert (Figure 1). A second resistance band stretches across northern Namibia, indicating higher gene flow above northern Namibia, Angola, and southern Zambia. Differences in effective migration rates can result from differences in effective population sizes. For example, a larger effective population size can result in higher effective migration rates, relative to neighboring demes, with smaller Ne’s. The higher m in the central Kalahari Basin, relative to the rim, could result from either a larger Ne relative to Kalahari rim populations or simply higher migration among groups in a similar ecological area.
We then tested whether heterogeneity in population structure could be mapped to distinct genetic ancestries. Unsupervised population structure analysis identifies five distinct, spatially organized ancestries among the sampled 22 southern African populations. These ancestries were inferred from the Affymetrix Human Origins data set using ADMIXTURE (Figure S1) (Alexander et al. 2009). Multimodality per k value was assessed using pong (Behr et al. 2015) and results from k = 10 are discussed below (6/10 runs assigned to the major mode, 3/10 other runs involved cluster switching only within East Africa). Visualization of these ancestries according to geographic sampling location specifically demonstrates fine-scale structure in and around the Kalahari Desert (Figure 2). While prior studies have argued for a northern vs. southern divergence of KhoeSan populations (Pickrell et al. 2012; Schlebusch et al. 2012; Schlebusch and Soodyall 2012; Barbieri et al. 2013, 2014), the structure inferred from our data set indicates a more geographically complex pattern of divergence and gene flow. Even recent migration events into southern Africa remain structured, consistent with ecological boundaries to gene flow (see below). The distribution of the five ancestries corresponds to: a northern Kalahari ancestry, central Kalahari ancestry, circum-Kalahari ancestry, a northwestern Namibian savannah ancestry, and ancestry from eastern Bantu speakers (Figure 2). This geographic patterning does not neatly correspond to linguistic or subsistence categories, in contrast to previous discussions (Pickrell et al. 2012; Schlebusch et al. 2012; Barbieri et al. 2014).
The northern Kalahari ancestry is the most defined of these ancestries, encompassing several forager populations such as the Ju|’hoansi, !Xun, Khwe, Naro, and to a lesser extent the Khoekhoe-speaking Hai||om. While these populations are among the best-studied KhoeSan in anthropological texts with particular reference to cultural similarities (Dornan 1925; Bleek 1928; Schapera 1934; Barnard 1992), they represent only a fraction of the diversity among Khoisan-speaking populations. We note that this cluster includes Kx’a (Juu), Khoe-Kwadi, and Khoekhoe speakers, suggesting that language interacts in a complex fashion with other factors such as subsistence strategy and ecology. The Hai||om are thought to have shifted to speaking Khoekhoe from an ancestral Juu-based language (Barnard 1992). The second, central Kalahari ancestry occupies a larger geographical area throughout the Kalahari Basin, with its highest frequency among the Taa speakers, G|ui, G||ana, ≠Hoan, and Naro. This ancestry spans all three Khoisan language families (Table S1), at considerable frequency in each; all of these groups are primarily foragers.
The third ancestry cluster is represented by southern KhoeSan populations distributed along the rim of the Kalahari Desert (Figure 2), referred to here as the “circum-Kalahari ancestry.” The circum-Kalahari ancestry is at its highest frequency in the Nama and ≠Khomani (see also Figure S2), with significant representation in the Hai||om, Khwe, !Xun, and Shua. This ancestry spans all linguistic and subsistence strategies. We propose that the circum-Kalahari ancestry is better explained by ecology than by alternative factors such as language or recent migration. Specifically, we find that the Kalahari Desert is an ecological boundary to gene flow (Figure 1, Figure 2). The circum-Kalahari ancestry is not easily explained by a pastoralist Khoekhoe dispersal. This spatially distinct ancestry is common in both forager and pastoralist groups; indeed, all of the circum-Kalahari populations except the Nama were historically foragers. Therefore, to support a Khoekhoe dispersal model, we would have to posit an adoption of pastoralism by a northeastern group, leading to demic expansion around the Kalahari, with subsequent reversion to foraging in the majority of the circum-Kalahari groups; this scenario seems unlikely (but see Smith 2014 for additional discussion).
Finally, our analysis reveals two additional ancestries outside of the greater Kalahari Basin: one ancestry composed of Bantu speakers, frequent to the north, east, and southeast of the Kalahari; and a second composed of Himba, Ovambo, and Damara ancestry in northwestern Namibia distributed throughout the mopane savannah. Interestingly, the Damara are a Khoekhoe-speaking population of former foragers (later in servitude to the Nama pastoralists) whose ancestry has been unclear (see below).
We used our data and the Affymetrix HumanOrigins data set containing the greatest number of KhoeSan populations to date, to test whether language or geography better explains genetic distance (see language families and subsistence strategies in Table S1). The genetic data were compared to a phonemic distance matrix (Jaccard 1908) as well as geographic distances between each population (Table S3). In order to test whether genetic distance (Fst) was associated with geography or language, we performed a partial Mantel test for the relationship between Fst and language (Creanza et al. 2015) accounting for geographic distance among 11 KhoeSan populations. This result was not significant (r = 0.06, P = 0.30). Although an association between Fst and geographic distance within Africa has been documented (Ramachandran et al. 2005; Tishkoff et al. 2009; Creanza et al. 2015), a Mantel test for the relationship between Fst and pairwise geographic distance in our data set was also null (r = 0.021, P = 0.38), reflecting the nonlinear aspect of shared ancestry in southern Africa as seen in Figure 1 and Figure 2.
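For readers unfamiliar with the Jaccard measure, the following sketch shows how a distance between two phoneme inventories can be computed as one minus the ratio of shared to total phonemes. The inventories and population labels are hypothetical placeholders and do not reflect the phoneme data of Creanza et al. (2015); the function name is likewise an assumption made for illustration.

```python
def jaccard_distance(inventory_a, inventory_b):
    """Jaccard distance: 1 - |A intersect B| / |A union B| for two collections."""
    a, b = set(inventory_a), set(inventory_b)
    union = a | b
    if not union:
        return 0.0  # two empty inventories are treated as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical toy inventories; the symbols are placeholders, not real data.
inventories = {
    "pop_A": {"p", "t", "k", "!"},
    "pop_B": {"p", "t", "k", "|"},
    "pop_C": {"p", "t"},
}

names = sorted(inventories)
phonemic = [
    [jaccard_distance(inventories[x], inventories[y]) for y in names]
    for x in names
]
```

The resulting square matrix has the same layout as the Fst and geographic distance matrices, which is what allows the three to be compared in a Mantel framework.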
Spatially distinct ancestries are also supported by principal components analysis (PCA) (Figure 3, Figure S3). The KhoeSan anchor one end of PC1 opposite to Eurasians. PC2 separates other African populations from the KhoeSan, including western Africans, as well as central and eastern African hunter-gatherers. PC3 separates the Ju|’hoansi and !Xun (northern Kalahari) from ≠Hoan, Taa speakers and Khoe speakers, with other KhoeSan populations intermediate. PC3 and PC4 suggest that the present language distribution may reflect recent language transitions, as genetic ancestry and linguistic structure do not neatly map onto each other (Figure S4). For example, the ≠Hoan currently speak a Kx’a language but are genetically distinct from other northern Kalahari Kx’a speakers; rather, they appear to be more genetically similar to southern Kalahari Taa speakers who cluster together. We suggest that the patterns observed here are better explained by ecogeographic patterns than either language or subsistence alone (Figure S5). Specifically, PC3 discriminates northern vs. southern Kalahari ancestry (see below). PC4 discriminates western and eastern non-KhoeSan ancestry derived from Bantu speakers or other populations. Finally, the intermediate position of the Nama, ≠Khomani, and Hai||om on PC3 and PC4 is neither linguistic- nor subsistence based, but represents a nonlinear circum-Kalahari component featured in Figure 2.
A divergent southern KhoeSan ancestry
This separation of northern (Ju|’hoansi) and southern (Taa and Khoe speakers) KhoeSan populations has been observed by Schlebusch et al. (2012) and Pickrell et al. (2012). Based on the inferred ancestral allele frequencies (Figure S2), we estimate that this trans-Kalahari genetic differentiation is substantial (Fst = 0.05). We verify this divergence between the northern Kx’a speakers and the shared Nama and ≠Khomani ancestry in a new, second sample of Nama, from South Africa rather than central Namibia (Table S1, Figure S3). This southern KhoeSan ancestry is also present in admixed Bantu-speaking populations from South Africa (e.g., amaXhosa) as well as the admixed Western Cape SAC populations (de Wit et al. 2010), supporting a hypothesis of distinct southern-specific KhoeSan ancestry (Figure S1, Figure S2) shared between indigenous and admixed groups.
Mitochondrial data support this concept of a southern-specific KhoeSan ancestry (Schlebusch et al. 2013; Barbieri et al. 2013). Both mitochondrial DNA (mtDNA) haplogroups L0d and L0k are at high frequency in northern KhoeSan populations (Behar et al. 2008), but L0k is absent in our sample of the Nama (n = 31) and there is only one ≠Khomani individual (n = 64) with L0k (1.56%) (Table 1). L0d dominates the haplogroup distribution for both the Nama and ≠Khomani (84 and 91%, respectively), with L0d2a especially common in both. L0d2a, inferred to have originated in southern Africa, was also previously found at high frequencies in the Karretjie people further south in the central Karoo of South Africa, as well as the SAC population in the Western Cape (Quintana-Murci et al. 2010; Schlebusch et al. 2013). L0d2b is also common in the Nama (16%).
Table 1. Mitochondrial DNA haplogroup frequencies of the Nama and ≠Khomani.
Minimal population structure between the Nama and ≠Khomani
The ≠Khomani San are a N|u-speaking (!Ui classified language) former hunter-gatherer population that inhabit the southern Kalahari Desert in South Africa, bordering on Botswana and Namibia. The Nama, currently a primarily caprid pastoralist population, live in the Richtersveld along the northwestern coast of South Africa and up into Namibia. The ancestral geographic origin of the Nama has been widely contested over a number of years (Nurse and Jenkins 1977; Barnard 1992; Boonzaier 1996), but a leading hypothesis suggests that they originated further north in Botswana/Zambia and migrated into South Africa and Namibia ∼2000 years ago (Nurse and Jenkins 1977; Barnard 1992; Boonzaier 1996; Pickrell et al. 2012). The Nama and N|u languages are in distinct, separate Khoisan language families [Khoe and Tuu (!Ui-Taa), respectively] and these groups historically utilized different subsistence strategies. For this reason, we hypothesized that there would be strong population structure between the two populations.
Our global ancestry results, inferred from ADMIXTURE, show minimal population structure between the Nama and ≠Khomani San in terms of their southern KhoeSan ancestry. The ≠Khomani share ∼10% of their ancestry with the Botswana KhoeSan populations (Figure S1, Figure S3), consistent with their closer proximity to the southern Botswana populations (Taa speakers !Xo and ≠Hoan). PCA reveals a degree of fine-scale population structure between the Nama and ≠Khomani, with each population forming its own distinct cluster at PC4, partly due to the increase in Damara ancestry in the Nama (Figure 3B, Figure S1), but the two groups are clearly proximal. This increase in Damara ancestry (as depicted from k = 9 in all modes of Figure S1) is likely due to integration of the Damara people as clients of the Nama over multiple generations. However, our second sample of Nama from South Africa does not harbor significant western African ancestry, suggesting heterogeneity in the Damara component (Figure S2).
Recent patterns of admixture in South Africa
Two Bantu-speaking, spatially distinct ancestries are present in southern Africa. The first is rooted in the Ovambo and Himba in northwestern Namibia; the other reflects gene flow from Bantu-speaking ancestry present in the east (Figure 2). We estimated the time intervals for admixture events into the southern KhoeSan via analysis of the distribution of local ancestry segments using RFMix (Maples et al. 2013) and TRACTs (Gravel 2012) for the ≠Khomani OmniExpress data set (n = 59 unrelated individuals) (Figure 4, Table S2). The highest likelihood model suggests that there were three gene flow events. Approximately 14 generations ago (∼443–473 years ago assuming a generation time of 30 years and accounting for the age of our sampled individuals), the ≠Khomani population received gene flow from a Bantu-speaking group, represented here by the Kenyan Luhya. Our results are consistent with Pickrell et al. (2012) who found that the southern Kalahari Taa speakers were the last to interact with the expanding Bantu speakers ∼10–15 generations ago. Subsequently, this event was followed by admixture with Europeans between 6 and 7 generations ago (∼233–263 years ago), after the arrival of the Dutch in the Cape and the resulting migrations of “trekboers” (nomadic pastoralists of Dutch, French, and German descent) from the Cape into the South African interior. Lastly, we find a recent pulse of primarily KhoeSan ancestry 4–5 generations ago (∼173–203 years ago). This event could be explained by gene flow into the ≠Khomani from another KhoeSan group, potentially as groups shifted local ranges in response to the expansion of European farmers in the Northern Cape, or other population movements in southern Namibia or Botswana.
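The conversion from generations to calendar years can be checked with a short calculation. The sketch below multiplies the number of generations by the stated 30-year generation time and adds an offset for the age of the sampled individuals. The 23 to 53 year offset is an assumption back-calculated from the reported interval for 14 generations (443 − 14 × 30 = 23 and 473 − 14 × 30 = 53); it is not a value stated in the text, and the function name is hypothetical.

```python
GENERATION_YEARS = 30        # generation time stated in the text
AGE_OFFSET_YEARS = (23, 53)  # ASSUMPTION: inferred from the reported 443-473 range

def years_ago(generations, generation_years=GENERATION_YEARS,
              age_offset=AGE_OFFSET_YEARS):
    """Return (low, high) years before present for a pulse of admixture."""
    base = generations * generation_years
    return base + age_offset[0], base + age_offset[1]

bantu_pulse = years_ago(14)     # gene flow from Bantu speakers
european_pulse = years_ago(7)   # upper bound of the 6-7 generation estimate
khoesan_pulse = years_ago(5)    # upper bound of the 4-5 generation estimate
```

Under this assumed offset, the reported year ranges for the European and KhoeSan pulses correspond to the upper generation counts (7 and 5, respectively) of their stated intervals.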
We also considered the impact of recent immigration into indigenous South Africans, derived from non-African source populations. The SAC populations are a five-way admixed population, deriving ancestries from European, eastern African, KhoeSan, and Asian populations (de Wit et al. 2010). This unique, admixed ethnic population was founded by the Dutch who settled on the southern tip of South Africa by the 17th century and by the importation of slaves from Indonesia, Bengal, India, and Madagascar. However, within the SAC, strong differences in ancestry and admixture proportions are observed between different districts within Cape Town, the Eastern Cape, and the Northern Cape Provinces. SAC individuals from the Northern Cape, where historically there was a greater concentration of European settlement (Theal 1887), have higher European ancestry. The SAC individuals from the Eastern Cape, which is the homeland of the Bantu-speaking Xhosa populations, have relatively more ancestry from Bantu-speaking populations (Figure S2). The “ColouredD6” population is from an area in Cape Town called District 6. According to written documentation, this was historically a district where slaves and political exiles from present-day Indonesia resided, as well as many people from Madagascar and India (du Plessis 1947). The SAC D6 population consequently has a noticeable increase in south/eastern Asian ancestry, represented by the Pathan and Han Chinese populations in our data set (Figure S2).
This south/eastern Asian ancestry is not confined to the SAC population, as attested by the presence of the M36 mitochondrial haplogroup. The M36 haplogroup (South Indian/Dravidian in origin) is present in two of 64 ≠Khomani San matrilineages (Table 1). The presence of M36 is likely derived from slaves of South Asian origin who escaped from Cape Town or the surrounding farms and dispersed into the northwestern region of South Africa. In addition, we observe one M7c3c lineage in the Nama (Table 1), which traces back to southeastern Asia but has been implicated in the Austronesian expansion of Polynesian speakers into Oceania (Kayser 2010; Delfin et al. 2012) and Madagascar (Poetsch et al. 2013). The importation of Malagasy slaves to Cape Town may best explain the observation of M7c3c in the Nama.
Discussion
The KhoeSan are distinguished by their unique phenotype(s), genetic divergence, click languages, and hunter-gatherer subsistence strategy compared to other African populations; classifications of the many KhoeSan ethnic groups have primarily relied on language or subsistence strategy. Here, we generate additional genome-wide data from three South African populations and explore patterns of fine-scale population structure among 22 southern African groups. We find that complex geographic or “ecological” information is likely a better explanatory variable for genetic ancestry than language or subsistence. We identify five primary ancestries in southern Africans, each localized to a specific geographic region (Figure 2). In particular, we examined the circum-Kalahari ancestry, which appears as a ring around the Kalahari Desert and accounts for the primary ancestry of the Nama, representative of the Khoekhoe-speaking pastoralists.
We observe striking ecogeographic population structure associated with the Kalahari Desert. There are two distinct ancestries segregating within the Kalahari Desert KhoeSan populations, described here as northern Kalahari and central Kalahari ancestries. Analyses of migration rates across the 22 populations indicate particularly high migration within the Kalahari Desert. This may indicate a larger effective population size for the two desert ancestries or extensive migration related to shifting ranges in response to climatic and ecological changes over time. It is worth noting that the northern Kalahari formerly supported an extensive lake (i.e., Makgadikgadi) just before and after the Last Glacial Maximum, in addition to the Okavango Delta and associated river systems; archeological data may suggest high population density near the pans, although this likely predates the genetic structure we observe today (Burrough 2016; Robbins et al. 2016). Our lack of samples outside of Botswana, Namibia, and northern South Africa prevents precise inference of m in Zambia, Limpopo, and Mozambique, but Figure 2 indicates recent extensive gene flow in the east, consistent with the expansion of Bantu-speaking agriculturalists into eastern grasslands and coastal forests. Additionally, we find a separate ancestry segregating in the far western border of Namibia and Angola, particularly frequent in the Damara and Himba, and to a lesser extent in the Ovambo and Mbukushu. This intersection of steppe and savannah along the Kunene may have facilitated recent settlement of the area during the past 500 years by Bantu-speaking pastoralists, but it is noteworthy that little Kalahari KhoeSan ancestry persists in these populations. Rather, the Damara (currently Nama speaking) or related hunter-gatherers may have been formerly more widespread in this area and subsequently absorbed into the western Bantu-speaking pastoralists.
The practice of sheep, goat, and cattle pastoralism in Africa is widespread. Within KhoeSan populations, pastoralist communities are limited to the Khoekhoe-speaking populations. Earlier hypotheses proposed that the Khoe-speaking pastoralists derived from a population originating outside of southern Africa. However, more recent genetic work supports a model of autochthonous Khoe ancestry influenced by either demic or cultural diffusion of pastoralism from East Africa ∼2500 years ago (Pleurdeau et al. 2012; Pickrell et al. 2014). For example, the presence of lactase persistence alleles in southern Africa indicates contact between East African herders and populations in south-central Africa, with subsequent migration into Namibia (Breton et al. 2014). This scenario is also supported by Y-chromosomal analysis that indicates a direct interaction between eastern African populations and southern African populations ∼2000 years ago (Henn et al. 2008). However, in both cases (i.e., MCM6/LCT and Y-chromosome M293), the frequency of the eastern African alleles is low in southern Africa and occurs in both pastoralist and hunter-gatherer populations. A simple model of eastern African demic diffusion into south-central Africa, leading to the adoption of pastoralism and a Khoekhoe population expansion from this area cannot be inferred from the genetic data.
Our samples from the Khoekhoe-speaking Nama pastoralists demonstrate that their primary ancestry is shared with other far southern nonpastoralist KhoeSan, such as the ≠Khomani San and the Karretjie (see also Schlebusch et al. 2011). mtDNA also suggests that the Nama display a haplogroup frequency distribution more similar to KhoeSan south of the Kalahari than to any other population in south-central Africa. Our results indicate that the majority of the Nama ancestry has likely been present in far southern Africa for longer than previously assumed, rather than resulting from a recent migration from further north in Botswana where other Khoe speakers live. The only other Khoekhoe-speaking population in our data set is the Hai||om who share ∼50% of the circum-Kalahari ancestry with the Nama and ≠Khomani, but are foragers rather than pastoralists. We conclude that Khoekhoe-speaking populations share a circum-Kalahari genetic ancestry with a variety of other Khoe-speaking forager populations in addition to the !Xun, Karretjie, and ≠Khomani (Figure 1, Figure 2). This ancestry is divergent from central and northern Kalahari ancestries, arguing against a major demic expansion of Khoekhoe pastoralists from northern Botswana into South Africa. Rather, in this region, cultural transfer likely played a more important role in the diffusion of pastoralism. Of course, a demic expansion of the Khoekhoe within a more limited region of Namibia and South Africa may still have occurred—but geneticists currently lack representative DNA samples from many of the now “Coloured” interior populations, which may carry Khoekhoe ancestry.
This is an unusual case of cultural transmission (Jerardino et al. 2014). Other prehistoric economic transitions have been shown to be largely driven by demic diffusion (Gignoux et al. 2011; Fort 2012; Lazaridis et al. 2014; Skoglund et al. 2014; Malmström et al. 2015). Recent analysis of Europe provides a case study of demic diffusion, which appears far more complex than initially hypothesized. The initial spread of Near Eastern agriculturalists into southern Europe clearly replaced or integrated many of the autochthonous hunter-gatherer communities. Even isolated populations such as the Basque have been shown to derive much of their ancestry from Near Eastern agriculturalists (Skoglund et al. 2014). The early demic diffusion of agriculture exhibits a strong south-to-north cline across Europe, reflecting the integration of hunter-gatherers into composite southern agriculturalist populations, which then expanded northward with mixed ancestry (Sikora et al. 2014). The cline of the early Near Eastern Neolithic ancestry becomes progressively diluted in far northern European populations. In contrast, we see little evidence of a clear eastern African ancestry cline within southern African KhoeSan; nor is the putative “Khoe” ancestry identified in the Nama of eastern African origin or even of clear origin from northeastern Botswana where initial pastoralist contact presumably occurred.
However, the transfer of pastoralism from eastern to southern Africa itself was not purely cultural (see above). We also report here the presence of mitochondrial L4b2, which supports limited gene flow from eastern Africa during approximately the same time frame as the pastoralist diffusion. L4b2, formerly known as L3g or L4g, is a mtDNA haplogroup historically found at a high frequency in eastern Africa, in addition to the Arabian Peninsula. L4b2 is at high frequency specifically in click-speaking populations such as the Hadza and Sandawe in Tanzania (sometimes described as “Khoisan speaking”) (Knight et al. 2003). Nearly 60% of the Hadza population and 48% of Sandawe belong to L4b2 (Tishkoff et al. 2007). Even though both Tanzanian click-speaking groups and the southern African KhoeSan share some linguistic similarities and a hunter-gatherer lifestyle, they have been isolated from each other over the past 35,000 years (Tishkoff et al. 2007). The L4b2a2 haplogroup is present at a low frequency in both the Nama and ≠Khomani San, observed in one matriline in each population (Table 1). L4b2 was also formerly reported in the SAC population (0.89%) (Quintana-Murci et al. 2010) but has not been discussed in the literature. We identified several additional southern L4b2 haplotypes from whole mtDNA genomes deposited in public databases (Behar et al. 2008; Barbieri et al. 2013) and analyzed these samples together with all L4b2 individuals available in the National Center for Biotechnology Information (NCBI). Median-joining phylogenetic network analysis of the mtDNA haplogroup L4b2 supports the hypothesis that there was gene flow from eastern Africans to southern African KhoeSan groups. As shown in Figure 5 (and in more detail in Figure S6), southern African individuals branch off in a single lineage from eastern African populations in this network (Salas et al. 2002; Tishkoff et al. 2007; Gonder et al. 2007).
The mitochondrial network suggests a recent migratory scenario (estimated to be <5000 years before present), although the source of this gene flow, whether from eastern African click-speaking groups or others, remains unclear (Pickrell et al. 2014).
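As an illustrative aside on the low L4b2 frequencies discussed above, a haplogroup seen in a single matriline gives only a loose bound on its population frequency. The following minimal Python sketch computes a Wilson score interval for such a count. The sample size of 50 is a hypothetical placeholder, not a value from Table 1, and the function name `wilson_interval` is introduced only for illustration; this calculation is not part of the analyses reported here.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of sampled individuals carrying the haplogroup
    n:         total number of sampled individuals (must be > 0)
    z:         standard normal quantile (1.96 gives roughly 95% coverage)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical example: 1 carrier among 50 sampled individuals.
    # The value 50 is an assumption for illustration only.
    low, high = wilson_interval(1, 50)
    print(f"point estimate: {1 / 50:.3f}")
    print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

With a single observation the interval stays wide relative to the point estimate, so such counts are better read as evidence that the lineage is present than as a precise frequency estimate.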
Conclusions
Analysis of 22 southern African populations reveals that fine-scale population structure corresponds better with ecological categories than with linguistic or subsistence categories. The Nama pastoralists are autochthonous to far southwestern Africa, rather than representing a recent population movement from further north. We find that KhoeSan ancestry remains highly structured across southern Africa and suggest that cultural diffusion likely played the key role in the adoption of pastoralism.
Acknowledgments
We thank Jeffrey Kidd for assisting with genotyping of samples, David Poznik for providing off-target mtDNA reads from a separate next-generation sequencing experiment, Aaron Behr and Sohini Ramachandran for prepublication use of pong, and Meng Lin for help with analyses. We thank Carlos Bustamante for his encouragement and support of this project and Marcus Feldman for a close reading of our manuscript. We thank Julie Granka, Justin Myrick, and Cedric Werely for assistance with the saliva sample collection and Ben Viljoen for DNA extractions. Guidance from Ryan Raaum with regard to formulating the surface plots is appreciated. We also thank the Working Group of Indigenous Minorities in Southern Africa and the South African San Institute for their encouragement and advice. Finally, we thank Richard Jacobs, Wilhelmina Mondzinger, Hans Padmaker, Willem de Klerk, Hendrik Kaiman, and the communities in which we have sampled; without their support, this study would not have been possible. Funding was provided by a Stanford University Center on the Demographics and Economics of Health and Aging (CDEHA) seed grant to B.M.H. (National Institutes of Health, National Institute on Aging, NIA P30 AG017253-12) as well as a Stanford University Computational, Evolutionary, and Human Genomics trainee research grant to A.R.M. C.U. was funded by the National Research Foundation of South Africa. C.R.G. was funded by Predoctoral Training Grant 32.
Author contributions: C.U., M.K., A.R.M., and D.B. performed analysis. C.R.G., M.M., A.R.M., C.U., and B.M.H. collected DNA samples. P.D.v.H., M.M., E.G.H., and B.M.H. conceived of the study. C.U., C.R.G., M.M., E.G.H., and B.M.H. wrote the manuscript in collaboration with all coauthors. All authors read and approved the manuscript.
Footnotes
Supplemental material is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.116.187369/-/DC1.
Communicating editor: L. B. Jorde
Literature Cited
- Alexander D. H., Novembre J., Lange K., 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19: 1655–1664.
- Barbieri C., Vicente M., Rocha J., Mpoloka S. W., Stoneking M., et al., 2013. Ancient substructure in early mtDNA lineages of Southern Africa. Am. J. Hum. Genet. 92: 285–292.
- Barbieri C., Güldemann T., Naumann C., Gerlach L., Berthold F., et al., 2014. Unraveling the complex maternal history of Southern African Khoisan populations. Am. J. Phys. Anthropol. 153: 435–448.
- Barnard, A., 1992 Hunters and Herders of Southern Africa: A Comparative Ethnography of the Khoisan Peoples. Cambridge University Press, Cambridge, UK.
- Behar D. M., Villems R., Soodyall H., Blue-Smith J., Pereira L., et al.; Genographic Consortium, 2008. The dawn of human matrilineal diversity. Am. J. Hum. Genet. 82: 1130–1140.
- Behr A. A., Liu K. Z., Liu-Fang G., Nakka P., Ramachandran S., 2016. pong: fast analysis and visualization of latent clusters in population genetic data. Bioinformatics: btw327.
- Bleek, D. F., 1928 The Naron: A Bushman Tribe of the Central Kalahari, Cambridge University Press Archive, Cambridge, UK.
- Blench R., and K. C. MacDonald, 2000 The Origins and Development of African Livestock: Archaeology, Genetics, Linguistics, and Ethnography. UCL Press, London.
- Boonzaier, E., 1996 The Cape Herders: A History of the Khoikhoi of Southern Africa, New Africa Books, Kaapstad, South Africa.
- Breton G., Schlebusch C. M., Lombard M., Sjödin P., Soodyall H., et al., 2014. Lactase persistence alleles reveal partial East African ancestry of southern African Khoe pastoralists. Curr. Biol. 24: 852–858.
- Burrough, S. L., 2016 Late quaternary environmental change and human occupation of the Southern African interior, pp. 161–174 in Africa from MIS 6–2, Vertebrate Paleobiology and Paleoanthropology, edited by B. A. Stewart and S. C. Jones. Springer-Verlag, Berlin.
- Creanza N., Ruhlen M., Pemberton T. J., Rosenberg N. A., Feldman M. W., et al., 2015. A comparison of worldwide phonemic and genetic variation in human populations. Proc. Natl. Acad. Sci. USA 112: 1265–1272.
- Danecek P., Auton A., Abecasis G., Albers C. A., Banks E.; 1000 Genomes Project Analysis Group, 2011. The variant call format and VCFtools. Bioinformatics 27: 2156–2158.
- Delfin F., Myles S., Choi Y., Hughes D., Illek R., et al., 2012. Bridging near and remote Oceania: mtDNA and NRY variation in the Solomon Islands. Mol. Biol. Evol. 29: 545–564.
- Dornan, S. S., 1925 Pygmies and Bushmen of the Kalahari: An Account of the Hunting Tribes Inhabiting the Great Arid Plateau of the Kalahari Desert, Seeley, Service & Company, London.
- Dunne J., Evershed R. P., Salque M., Cramp L., Bruni S., et al., 2012. First dairying in green Saharan Africa in the fifth millennium BC. Nature 486: 390–394.
- du Plessis, I. D. D., 1947 The Cape Malays. South African Institute of Race Relations, Johannesburg, South Africa.
- Fort J., 2012. Synthesis between demic and cultural diffusion in the Neolithic transition in Europe. Proc. Natl. Acad. Sci. USA 109: 18669–18673.
- Gignoux C. R., Henn B. M., Mountain J. L., 2011. Rapid, global demographic expansions after the origins of agriculture. Proc. Natl. Acad. Sci. USA 108: 6044–6049.
- Gonder M. K., Mortensen H. M., Reed F. A., de Sousa A., Tishkoff S. A., 2007. Whole-mtDNA genome sequence analysis of ancient African lineages. Mol. Biol. Evol. 24: 757–768.
- Gravel S., 2012. Population genetics models of local ancestry. Genetics 191: 607–619.
- Henn B. M., Gignoux C., Lin A. A., Oefner P. J., Shen P., et al., 2008. Y-chromosomal evidence of a pastoralist migration through Tanzania to southern Africa. Proc. Natl. Acad. Sci. USA 105: 10693–10698.
- Henn B. M., Gignoux C. R., Jobin M., Granka J. M., Macpherson J. M., et al., 2011. Hunter-gatherer genomic diversity suggests a southern African origin for modern humans. Proc. Natl. Acad. Sci. USA 108: 5154–5162.
- Henn B. M., Cavalli-Sforza L. L., Feldman M. W., 2012. The great human expansion. Proc. Natl. Acad. Sci. USA 109: 17758–17764.
- International HapMap 3 Consortium; Altshuler D. M., Gibbs R. A., Peltonen L., et al., 2010. Integrating common and rare genetic variation in diverse human populations. Nature 467: 52–58.
- Jaccard P., 1908. Nouvelles Recherches Sur La Distribution Florale. Bull. Soc. Vaud. Sci. Nat. 44: 223–270.
- Jerardino A., Fort J., Isern N., Rondelli B., 2014. Cultural diffusion was the main driving mechanism of the Neolithic transition in southern Africa. PLoS One 9: e113672.
- Kayser M., 2010. The human genetic history of Oceania: near and remote views of dispersal. Curr. Biol. 20: R194–R201.
- Knight A., Underhill P. A., Mortensen H. M., Zhivotovsky L. A., Lin A. A., et al., 2003. African Y chromosome and mtDNA divergence provides insight into the history of click languages. Curr. Biol. 13: 464–473.
- Lazaridis I., Patterson N., Mittnik A., Renaud G., Mallick S., et al., 2014. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513: 409–413.
- Li J. Z., Absher D. M., Tang H., Southwick A. M., Casto A. M., et al., 2008. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319: 1100–1104.
- MacDonald, K. C. R. H. M., 2000 The origins and development of African livestock: archaeology, genetics, linguistics and ethnography. Orig. Dev. Domest. Anim. Arid West Afr.: 127–162.
- Macholdt E., Lede V., Barbieri C., Mpoloka S. W., Chen H., et al., 2014. Tracing pastoralist migrations to southern Africa with lactase persistence alleles. Curr. Biol. 24: 875–879.
- Malmström H., Linderholm A., Skoglund P., Storå J., Sjödin P., et al., 2015. Ancient mitochondrial DNA from the northern fringe of the Neolithic farming expansion in Europe sheds light on the dispersion process. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370: 20130373.
- Maples B. K., Gravel S., Kenny E. E., Bustamante C. D., 2013. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93: 278–288.
- Novembre J., T. Johnson, K. Bryc, Z. Kutalik, A. R. Boyko et al., 2008 Genes mirror geography within Europe. Nature 456: 98–101.
- Nurse G. T., Jenkins T., 1977. Health and the hunter-gatherer. Biomedical studies on the hunting and gathering populations of Southern Africa. Monogr. Hum. Genet. 8: 1–126.
- Petersen D. C., Libiger O., Tindall E. A., Hardie R.-A., Hannick L. I., et al.; Indian Genome Variation Consortium, 2013. Complex patterns of genomic admixture within southern Africa. PLoS Genet. 9: e1003309.
- Petkova D., Novembre J., Stephens M., 2016. Visualizing spatial population structure with estimated effective migration surfaces. Nat. Genet. 48: 94–100.
- Pickrell J. K., Patterson N., Barbieri C., Berthold F., Gerlach L., et al., 2012. The genetic prehistory of southern Africa. Nat. Commun. 3: 1143.
- Pickrell J. K., Patterson N., Loh P.-R., Lipson M., Berger B., et al., 2014. Ancient west Eurasian ancestry in southern and eastern Africa. Proc. Natl. Acad. Sci. USA 111: 2632–2637.
- Pleurdeau D., Imalwa E., Détroit F., Lesur J., Veldman A., et al., 2012. “Of sheep and men”: earliest direct evidence of caprine domestication in Southern Africa at Leopard Cave (Erongo, Namibia). PLoS One 7: e40340.
- Poetsch M., Wiegand A., Harder M., Blöhm R., Rakotomavo N., et al., 2013. Determination of population origin: a comparison of autosomal SNPs, Y-chromosomal and mtDNA haplogroups using a Malagasy population as example. Eur. J. Hum. Genet. 21: 1423–1428.
- Quintana-Murci L., Harmant C., Quach H., Balanovsky O., Zaporozhchenko V., et al., 2010. Strong maternal Khoisan contribution to the South African coloured population: a case of gender-biased admixture. Am. J. Hum. Genet. 86: 611–620.
- Ramachandran S., Deshpande O., Roseman C. C., Rosenberg N. A., Feldman M. W., et al., 2005. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc. Natl. Acad. Sci. USA 102: 15942–15947.
- Robbins L. H., Campbell A. C., Murphy M. L., Brook G. A., Srivastava P., et al., 2005. The advent of herding in Southern Africa: early AMS dates on domestic livestock from the Kalahari Desert. Curr. Anthropol. 46: 671–677.
- Robbins, L. H., G. A. Brook, M. L. Murphy, A. H. Ivester, and A. C. Campbell, 2016 The Kalahari during MIS 6–2 (190–12 ka): archaeology, paleoenvironment, and population dynamics, pp. 175–193 in Africa from MIS 6–2, Vertebrate Paleobiology and Paleoanthropology, edited by B. A. Stewart and S. C. Jones. Springer-Verlag, Berlin.
- Sadr K., 2008. Invisible herders? The archaeology of Khoekhoe pastoralists. South. Afr. Humanit. 20: 179–203.
- Sadr K., 2015. Livestock first reached southern Africa in two separate events. PLoS One 10: e0134215.
- Salas A., Richards M., De la Fe T., Lareu M.-V., Sobrino B., et al., 2002. The making of the African mtDNA landscape. Am. J. Hum. Genet. 71: 1082–1111.
- Schapera I., 1934. The Khoisan Peoples of South Africa. Routledge & Kegan Paul, London.
- Schlebusch C., 2010. Issues raised by use of ethnic-group names in genome study. Nature 464: 487; author reply 487.
- Schlebusch C. M., Soodyall H., 2012. Extensive population structure in San, Khoe, and mixed ancestry populations from southern Africa revealed by 44 short 5-SNP haplotypes. Hum. Biol. 84: 695–724.
- Schlebusch C. M., de Jongh M., Soodyall H., 2011. Different contributions of ancient mitochondrial and Y-chromosomal lineages in “Karretjie people” of the Great Karoo in South Africa. J. Hum. Genet. 56: 623–630.
- Schlebusch C. M., Skoglund P., Sjödin P., Gattepaille L. M., Hernandez D., et al., 2012. Genomic variation in seven Khoe-San groups reveals adaptation and complex African history. Science 338: 374–379.
- Schlebusch C. M., Lombard M., Soodyall H., 2013. MtDNA control region variation affirms diversity and deep sub-structure in populations from southern Africa. BMC Evol. Biol. 13: 56.
- Sikora M., Carpenter M. L., Moreno-Estrada A., Henn B. M., Underhill P. A., et al., 2014. Population genomic analysis of ancient and modern genomes yields new insights into the genetic ancestry of the Tyrolean Iceman and the genetic structure of Europe. PLoS Genet. 10: e1004353.
- Skoglund P., Malmström H., Omrak A., Raghavan M., Valdiosera C., et al., 2014. Genomic diversity and admixture differs for Stone-Age Scandinavian foragers and farmers. Science 344: 747–750.
- Smith A., 2014. The Origins of Herding in Southern Africa: Debating the “Neolithic” model. Lap Lambert Academic Publishing, Saarbrücken, Germany.
- Theal, G. M., 1887 History of the Boers in South Africa, or the wanderings and wars of the emigrant farmers [microform]: from their leaving the Cape colony to the acknowledgement of their independence by Great Britain. S. Sonnenschein, Lowrey, London.
- Tishkoff S. A., Reed F. A., Ranciaro A., Voight B. F., Babbitt C. C., et al., 2007. Convergent adaptation of human lactase persistence in Africa and Europe. Nat. Genet. 39: 31–40.
- Tishkoff S. A., Reed F. A., Friedlaender F. R., Ehret C., Ranciaro A., et al., 2009. The genetic structure and history of Africans and African Americans. Science 324: 1035–1044.
- Veeramah K. R., Wegmann D., Woerner A., Mendez F. L., Watkins J. C., et al., 2012. An early divergence of KhoeSan ancestors from those of other modern humans is supported by an ABC-based analysis of autosomal resequencing data. Mol. Biol. Evol. 29: 617–630.
- de Wit E., Delport W., Rugamika C. E., Meintjes A., Möller M., et al., 2010. Genome-wide analysis of the structure of the South African Coloured Population in the Western Cape. Hum. Genet. 128: 145–153.
Data Availability Statement
The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article. Data files are freely available on GitHub: https://github.com/bmhenn/khoesan_arraydata.