2022 May 26;185(11):1842–1859.e18. doi: 10.1016/j.cell.2022.04.008

The genomic origins of the world’s first farmers

Nina Marchi 1,2,22, Laura Winkelbach 3,22, Ilektra Schulz 2,4,22, Maxime Brami 3,22, Zuzana Hofmanová 2,4,5,6, Jens Blöcher 3, Carlos S Reyna-Blanco 2,4, Yoan Diekmann 3,7, Alexandre Thiéry 1,2,24, Adamandia Kapopoulou 1,2, Vivian Link 2,4, Valérie Piuz 1, Susanne Kreutzer 3,25, Sylwia M Figarska 3, Elissavet Ganiatsou 8, Albert Pukaj 3, Travis J Struck 9, Ryan N Gutenkunst 9, Necmi Karul 10, Fokke Gerritsen 11,12, Joachim Pechtl 13, Joris Peters 14,15, Andrea Zeeb-Lanz 16, Eva Lenneis 17, Maria Teschler-Nicola 18,19, Sevasti Triantaphyllou 20, Sofija Stefanović 21, Christina Papageorgopoulou 8, Daniel Wegmann 2,4,23,∗, Joachim Burger 3,23,∗∗, Laurent Excoffier 1,2,23,26,∗∗∗
PMCID: PMC9166250  PMID: 35561686

Summary

The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.

Keywords: demographic inference, demogenomic modeling, demographic processes, ancient genomics, Neolithic transition, upper Palaeolithic, human evolution, population admixture

Highlights

  • European HGs diverged from SW Asian HGs during the LGM

  • Low genetic diversity of European HGs is due to a strong LGM demographic bottleneck

  • Ancestors of western early farmers emerged after repeated post-LGM admixtures

  • EFs strongly diverged from SW Asians during their expansion through Anatolia


Ancient DNA analysis and evolutionary modeling have allowed for the ancestral tracing of the Neolithic populations of Southwest Asia and Europe to resolve the genetic origins of the world’s first sedentary farmers.

Introduction

Genetic analyses of skeletal remains from prehistoric sites have greatly enriched our knowledge of the changes that brought sedentism and food-production, along with new people, material culture and practices, to Europe approximately 8.6 kya (kiloyears ago) through processes often described in archaeology as the “Neolithic revolution” (Childe, 1936). These processes are thought to have reached a tipping point ∼11.7 kya in Southwest Asia, where plants and animals were first domesticated (Fuller et al., 2011; Peters et al., 1999). From this region, it is widely agreed that farming spread into Europe along two main routes, the “Mediterranean” route and the “Danubian” route (Shennan, 2018); it is the latter route that forms the focus of the research described here. Despite this well-developed archaeological narrative, the genetic origins of the world’s first farmers and the spatiotemporal scope of the processes involved remain elusive, in large part due to the lack of high-quality ancient genomes derived from the populations involved in the crucial initial phases of farming expansion.

To date, palaeogenetic studies have established that European early farmers (EFs) were genetically distinct from contemporary Holocene hunter-gatherers (HGs) inhabiting the continent (Bramanti et al., 2009; Skoglund et al., 2012). Genetic exchange between EFs and HGs appears to have been limited in the early phases of the agricultural expansion, with more intense exchange taking place in the later stages (Lazaridis et al., 2014; Mathieson et al., 2015). Most farmers present in Continental Europe around 9 kya appear to have descended from populations inhabiting the Aegean basin and the Eastern Marmara region (Hofmanová et al., 2016), but their ultimate genetic and geographic origins are still a matter of debate.

Early Neolithic (EN) farmers from the Aegean are clearly related to Central Anatolian farmers (Kılınç et al., 2017), but they also show affinities with Pre-Pottery Neolithic farmers of the Southern Levant (Lazaridis et al., 2016). This suggests a common origin of all these populations prior to the westward spread of agriculture (Kılınç et al., 2016), potentially in the Fertile Crescent area, an archaeologically significant region that contained parts of modern-day Iran, Iraq, Israel, Palestine, Jordan, Lebanon, Syria, and Turkey. However, research has also revealed that Aegean farmers are genetically distinct from early farming populations from the eastern wing of the Fertile Crescent, the Zagros region of Iran and northern Iraq, which may indicate parallel adoption of farming practices by genetically distant groups of HGs across Southwest Asia (Broushaki et al., 2016). Furthermore, there is some evidence of genetic continuity between Epipalaeolithic and Neolithic populations of Central Anatolia (Feldman et al., 2019), suggesting local transitions to agriculture without major gene flow. To make the picture even more complex, some Central Anatolian EFs also show genetic affinities to Caucasus HGs as represented by a 10th millennium before present (BP) genome from Kotias in Western Georgia (Jones et al., 2015; Kılınç et al., 2016; Skourtanioti et al., 2020). Caucasus HGs are themselves thought to be closely related to early Iranian Neolithic farmers (Lazaridis et al., 2016) as well as to later Pontic-Caspian steppe pastoralists (Lazaridis, 2018; Mathieson et al., 2018; Narasimhan et al., 2019).

Despite efforts to understand the genomic history of early farming populations in Europe and Southwest Asia, we still lack a detailed historical scenario of population demography, divergence, and migration embedded in geographic and temporal frameworks. Only a few analyses have previously estimated divergence times between ancestors of Neolithic and HG groups (Broushaki et al., 2016; Jones et al., 2015), and the models they used were simplistic. The palaeogenetic conclusions outlined above mainly derive from the interpretation of descriptive analyses and summary statistics (e.g., principal component and admixture analyses, f-statistics), usually computed on low coverage genomes and/or on ascertained sets of SNPs (Patterson et al., 2012). Such data are difficult to integrate into “demogenomic” (shorthand for demographic modeling using genomic information) analyses. The goal of this paper is to reconstruct the ancestry of Southwest Asian and European EFs and the processes that contributed to their differentiation from HGs. To do so, we produced 15 high-resolution genomes of early Holocene farmers and HGs distributed along a geographical and temporal transect reaching from Southwest Asia to the Rhine valley in West-Central Europe (Figure 1). DNA was extracted from skeletons recovered from some of the most important archaeological sites in early Holocene Europe and Anatolia, including the first farming villages in the Aegean basin (Barcın, Aktopraklık, Nea Nikomedeia); Mesolithic and Neolithic sites from the Iron Gates gorges and other areas of the Central Balkans (Lepenski Vir, Vlasac, Grad-Starčevo, Vinča-Belo Brdo); and the oldest cemeteries and mass grave or “massacre” sites of the Central European EN (Kleinhadersdorf, Asparn-Schletz, Essenbach-Ammerbreite, Dillingen-Steinheim, Herxheim) (Table 1; Figure 1).

Figure 1.

Spatial and temporal distribution of the ancient genomes analyzed in this study

(A) Location of archaeological sites with newly sequenced genomes and additional genomes used for modeling: Neolithic (black) and Mesolithic or Palaeolithic (red); different chronological phases of Neolithic expansion (colored areas) and archaeological cultures (blue) along the Danubian route of Neolithization; geographical areas (purple).

(B) Chronological distribution of the 25 genomes analyzed in this study, with the 15 newly sequenced genomes in bold, and the previously published genomes in italics (details in Tables 1 and S3). We also list the cultural groups (EFs, early farmers; HGs, hunter-gatherers), the regions and the archaeological sites where ancient individuals were sampled. The chronological interval at 2 sigma (95.4% probability) is shown for each directly 14C-dated sample, except for Stuttgart and Ess7, for which approximate dates are given based on the archaeological context.

Table 1.

Archaeological and genetic information for newly sequenced genomes

| Individual | Period (culture) | Site | Country | Age (cal. BP) | Mean depth (X) | Genetic sex | mtDNA haplogroup | Y haplogroup |
|---|---|---|---|---|---|---|---|---|
| VLASA7 | LM | Vlasac | Serbia | 8,764–8,340 | 15.21 | M | U5a2a | I2 |
| VLASA32 | LM | Vlasac | Serbia | 9,741–9,468 | 12.65 | M | U5a2a | R1b1 |
| AKT16 | EN | Aktopraklık | Turkey | 8,635–8,460 | 12.25 | F | K1a3 | |
| Bar25 | EN | Barcın | Turkey | 8,384–8,205 | 12.65 | M | N1a1a1 | G2a2b2a1 |
| Nea3 | EN | Nea Nikomedeia | Greece | 8,327–8,040 | 11.57 | F | K1a2c | |
| Nea2 | EN | Nea Nikomedeia | Greece | 8,173–8,023 | 12.51 | F | K1a | |
| LEPE48 | TEN | Lepenski Vir | Serbia | 8,012–7,867 | 10.92 | M | K1a1 | C1a2b |
| LEPE52 | E-MN | Lepenski Vir | Serbia | 7,931–7,693 | 12.37 | M | H3 | G2a2b2a1a1c |
| STAR1 | EN (Starčevo) | Grad-Starčevo | Serbia | 7,589–7,476 | 10.55 | F | T2e2 | |
| VC3-2 | EN (Starčevo) | Vinča-Belo Brdo | Serbia | 7,565–7,426 | 11.22 | M | HV-16311 | G2a2a1a3 |
| Asp6 | EN (LBK) | Asparn-Schletz | Austria | 7,575–7,474 | 12.11 | M | U5a1c1 | G2a2b2a3 |
| Klein7 | EN (LBK) | Kleinhadersdorf | Austria | 7,244–7,000 | 11.30 | F | W1-119 | |
| Dil16 | EN (LBK) | Dillingen-Steinheim | Germany | 7,235–6,998 | 10.60 | M | J1c6 | C1a2b |
| Ess7 | EN (LBK) | Essenbach-Ammerbreite | Germany | (7,050–6,900 BP) | 12.34 | M | U5b2c1 | G2a2b2a1a1 |
| Herx | EN (LBK) | Herxheim | Germany | 7,164–6,993 | 11.46 | F | K1a4a1i | |

LM, late Mesolithic; EN, early Neolithic; TEN, transformational/early Neolithic; E-MN, early-middle Neolithic; LBK, Linearbandkeramik. Samples with genetic sex determined as XX and XY are noted as F and M, respectively.

The samples’ ages are based on 14C dating (95.4% probability), except Ess7, for which an approximate date is given based on the archaeological context.

Results

Genetic structure and affinities of ancient individuals

Multidimensional scaling (MDS) performed on the neutral portion of the genome (Figure 2A) of ancient individuals and modern reference populations reveals three clusters of ancient individuals: (1) European HGs, (2) western EFs, i.e., EFs from Europe and Anatolia, (3) an EF individual from Iran (WC1) and a Mesolithic HG from Caucasus (KK1). Consistent with previous analyses based on ascertained SNPs (Marcus et al., 2020; Skoglund et al., 2012), the western EFs show strongest affinities with modern Sardinians, with the exception of one English (CarsPas1) and two Northwest Anatolian EFs (Bar8 and AKT16), who are found to cluster with modern individuals from other parts of Southern Europe. In contrast, Palaeolithic and Mesolithic European HGs are genetically well differentiated from all modern Europeans. The Iranian EF and the Caucasus HG appear to be genetically close to modern populations from their sampling area, in keeping with some long-term genetic continuity in those regions. This observation is even more pronounced when performing an MDS analysis on the whole genome, including sites potentially affected by selection (Figure M1_5 from Methods S1). Generally, ancient individuals appear to be closer to modern ones when the MDS is computed on the whole genome instead of just neutral sites. This could be explained by a slower evolution of genomic regions influenced by background selection (Pouyet et al., 2018). Another striking difference visible on the whole-genome MDS plot (Figure M1_5) is that western EFs are closer to some other Southern Europeans than to Sardinians.

Figure 2.

Genetic structure and affinities of ancient individuals

(A) Multidimensional scaling (MDS) analysis performed on the neutrally evolving portion of ancient (n = 25; large filled circles and squares: early farmers; triangles: hunter-gatherers) and modern (n = 65, shown as small circles) genomes from Europe and SW Asia. Ellipses highlight two clusters of ancient individuals: the European hunter-gatherers (HGs) and the European and Anatolian early farmers (i.e., western EFs).

(B) Admixture analyses (K = 3) performed on 22 ancient genomes (three genomes with the lowest quality were discarded: Bichon, Bon002, and Bar8). Note that AKT16 in NW Anatolia is more admixed than a neighbor genome from the Barcın site (Bar25), in keeping with f-statistics analyses (see Figure S1), which has led us to consider them as originating from independent populations in our demographic modeling.

(C) Heterozygosity computed at neutral sites in ancient genomes (HGs in blue, EFs in green).

(D) Runs of homozygosity (ROHs) computed on imputed ancient whole genomes for intermediate ROHs (2–10 Mb, dark color) or long ROHs (>10 Mb, light color), indicative of background relatedness in small populations and close inbreeding, respectively.

See also Figure S1.

Demogenomic modeling assumptions

To progress our research, we first contrasted alternative models of population differentiation using high-resolution ancient genomes drawn from each of the three MDS clusters described above. Sampled individuals from each cluster were assumed to derive from populations that belong to a large structured metapopulation, consisting of interconnected but mostly unsampled populations. Such a model is described in the literature as the “continent-island” model (Excoffier, 2004, 2013) with sampled populations (the islands) receiving a single pulse of gene flow from the metapopulation (the continent) shortly before their sampling time. The three metapopulations were designated Western, Central, and Eastern to reflect the geographic distribution of the ancient genomes across Europe and SW Asia. These three metapopulations represent the pools of western European HGs, western EFs from Europe and Anatolia, and Iranian EFs, respectively. Additional ancient individuals were then gradually added to the initial model to infer their ancestry and relationships with other populations (see Methods S1—Demogenomic inference with fastsimcoal2). Thus, increasingly complex models could be built and tested, resulting in the demographic scenario shown in Figure 3.

Figure 3.

Demographic scenario inferred from genomic modeling

This demographic history was obtained by compiling the best models of all tested scenarios (see Methods S1— Demogenomic inference with fastsimcoal2—Final model). Times of the events (y axis) and population ages (shown below their symbols) are indicated in ky BP. Under each population name, we indicate their sampled genomes, their associated inbreeding coefficients (Fis), and their diploid effective population sizes (Ne). Unfilled symbols indicate ancestral populations that we simulated after or before key events (split times or admixture events). The X symbols indicate bottlenecks that occurred on ancestral branches, modeled as a one-generation bottleneck through a population with its effective size shown in italics. Admixture proportions >10% from the Western metapopulation are indicated by blue arrows.

See also Figure S2 for effective population sizes inferred by MSMC2.

All western EFs share a remote common ancestry with Caucasus HGs

In contrast with previous studies (Lazaridis et al., 2016), we find that Caucasus HGs (represented by KK1) and western EFs are all descended from a population ancestral to the Central metapopulation. This is in line with a recent genetic study showing that KK1 was more closely related to EFs than to western European HGs (Speidel et al., 2021). This ancestral Central metapopulation received about 14% (95% CI 8–26) of its gene pool from the Western metapopulation some 14.2 kya (95% CI 13.7–19.0, Figures 3 and M1_18). Ancestors of the Iranian Neolithic population (represented by WC1) were not affected by this initial admixture: they rather diverged from the Eastern metapopulation 13.6 kya (95% CI 11–24.6) after its split from the Central metapopulation ∼15.8 kya (95% CI 14.3–25.6). Even though Caucasus HGs show closer genetic affinities with EFs from Iran (Figures 2A and 2B), our analyses suggest that they share a common ancestry with all western EFs.

Ancestors of western EFs admixed twice with western HGs

We find that the ancestors of western EFs received a second pulse of gene flow (15%, 95% CI 6–17) from the Western metapopulation ∼12.9 kya (95% CI 9.4–13.9), while Caucasus HGs did not (Figure M1_20B). Models that do not include this additional admixture have a lower likelihood and are therefore rejected (Figure M1_20A). Thus, the ancestors of western EFs are the product of repeated episodes of gene flow from the Western metapopulation. These populations have then diverged from Caucasus HGs due to an intense period of genetic drift between 12.9 and 9.1 kya (Figures 3 and 4). Indeed, we find that their effective population size was reduced to 620 individuals (95% CI 72–2,150) during this relatively long period of drift, which caused them to not only diverge genetically from their ancestral population but also from Caucasus and European HGs, and from Iranian EFs (Figure 4).
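The amount of drift implied by these estimates can be gauged with a back-of-the-envelope calculation. The sketch below is not part of the fastsimcoal2 analysis reported here. It assumes a constant diploid effective size, no gene flow during the drift period, and a generation time of 29 years (the value used for the MSMC2 analyses), and it applies the textbook Wright-Fisher expectation F = 1 − (1 − 1/(2Ne))^t. The function name `expected_drift` is hypothetical.

```python
def expected_drift(ne_diploid: float, years: float, gen_time: float = 29.0) -> float:
    """Expected inbreeding coefficient F accumulated by drift alone.

    Textbook Wright-Fisher result: F_t = 1 - (1 - 1/(2*Ne))**t,
    with t in generations. Constant Ne and no gene flow are assumed.
    """
    generations = years / gen_time
    return 1.0 - (1.0 - 1.0 / (2.0 * ne_diploid)) ** generations


# Point estimate and CI bounds for Ne quoted in the text (620; 72-2,150),
# applied to the 12.9-9.1 kya interval (3,800 years).
for ne in (72, 620, 2150):
    f = expected_drift(ne, years=12_900 - 9_100)
    print(f"Ne = {ne:>5}: F ~ {f:.3f}")
```

Under these simplifying assumptions, the point estimate of 620 individuals corresponds to F of roughly 0.10 over the drift period, which illustrates how a few thousand years at such a small effective size can produce substantial divergence.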

Figure 4.

Evolutionary insights gained from the demographic scenario shown in Figure 3

(A) MDS analysis done on 12 populations used in the demogenomic analyses and on simulated ancestral populations (unfilled symbols) sampled at key moments of their history, as defined on Figure 3: (1) on the ancestral branch before the split between Western and Eastern metapopulations 25.6 kya; (2) on the Central metapopulation branch just before and after the admixture occurring 14.2 kya; (3) on the Western metapopulation branch just before this admixture; (4) on the Eastern metapopulation branch at the time of split of the Iranian population 13.6 kya; (5) at the top of the western EFs ancestors branch just after its admixture with the Western metapopulation (12.9 kya), and then every 25 generations until the split of the Aegean populations 9.3 kya. Arrows indicate the trajectory of the populations caused by important demographic events (i.e., admixture events, bottlenecks, episodes of drift).

(B) Admixture plot for K = 3 performed on sampled and ancestral populations.

See also Figure S3 for corresponding admixture plots done on observed and simulated individuals.

Anatolian and Aegean farmers differentiation

Populations from Northwest Anatolia (the archaeological sites of Aktopraklık and Barcın) and Northern Greece (Nea Nikomedeia) appear to have diverged from one another at about the same time ∼9.1–9.3 kya (95% CI 9.1–12, Figures M1_20 and M1_22), potentially during the colonization of the wider Aegean area by EFs. In contrast, EFs from Central Anatolia (represented by a genome from Boncuklu) diverged at least 1,000 years earlier ∼10.5 kya (95% CI 10.5–11, Figure M1_24). That Anatolian and Aegean populations show varying amounts of recent gene flow from the Western metapopulation suggests different levels of interaction with surrounding HGs. Indeed, genomes from Northern Greece show a lower degree of further HG introgression (3%, 95% CI 1–11) than those of Boncuklu (10%, 95% CI 3–15), Barcın (12%, 95% CI 6–16), and especially Aktopraklık (17%, 95% CI 11–18) (Table S4). The high level of Western metapopulation admixture found in Aktopraklık, a site previously described as influenced by both Epipalaeolithic and Neolithic traditions (Özdoğan, 2011), is in line with the admixture analysis (Figure 2B) and the f-statistics that show larger genetic affinities of AKT16 with European HGs than other western EFs (Figure S1).
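To see how the successive pulses add up, the point estimates quoted above can be combined arithmetically. The sketch below is an illustration only and not a result of the demogenomic model: it treats each pulse as replacing a fraction m of the recipient gene pool, ignores drift, and ignores any gene flow from the Central metapopulation (which itself carries Western ancestry). The function and variable names are hypothetical.

```python
def cumulative_ancestry(pulses):
    """Fraction of ancestry from the source after sequential admixture pulses.

    Each pulse replaces a fraction m of the recipient gene pool, so the
    non-source fraction is multiplied by (1 - m) at every pulse.
    """
    remaining = 1.0
    for m in pulses:
        if not 0.0 <= m <= 1.0:
            raise ValueError("admixture proportions must lie in [0, 1]")
        remaining *= 1.0 - m
    return 1.0 - remaining


# Point estimates quoted in the text: two ancestral pulses (14% and 15%)
# followed by a recent, population-specific pulse.
ANCESTRAL_PULSES = (0.14, 0.15)
RECENT_PULSE = {
    "Northern Greece": 0.03,
    "Boncuklu": 0.10,
    "Barcın": 0.12,
    "Aktopraklık": 0.17,
}

totals = {
    pop: cumulative_ancestry(ANCESTRAL_PULSES + (m,))
    for pop, m in RECENT_PULSE.items()
}
for pop, total in totals.items():
    print(f"{pop}: {total:.1%} cumulative Western ancestry")
```

Because the pulses multiply rather than add, the two ancestral events alone already imply about 27% Western ancestry under this simplification, and the recent pulse shifts each population only moderately from that baseline.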

Figure S1.

Population grouping verified with f-statistics, related to Figure 2 and Methods S1

These analyses were performed on the 1240k dataset.

(A) We first tested if some individuals of a specific group had significantly more shared ancestry with individuals of a different group using f-statistics of the form D(Ind1 from a population, Ind2 from the same population; Test, Mbuti [outgroup]). We only found three significant absolute Z scores (>3.0, yellow). For Austria, Asp6 appears to share more ancestry with VLASA7 than Klein7. Since variation in European HG ancestry is expected in EF populations due to the ongoing process of admixture and since these samples did not show variation in their affinities to other EF samples, modeling them as a single population seems justified. For NWAnatolia, AKT16 was found to share significantly more ancestry with both Loschbour and Bichon, which we further investigated in (B).

(B) To shed more light on the variation in European HG ancestry among Anatolian and Greek samples, we calculated f-statistics of the form D(NGreece/SGreece/NWAnatolia/CAnatolia, CAnatolia; HG_west, Mbuti [outgroup]), where we use HG_west to denote both West 1 and West 2 European HGs. This test indicates whether the tested individual/population from NGreece, SGreece, NWAnatolia, or CAnatolia (left) shares more (orange, Z score > 0) or less (blue, Z score < 0) ancestry with the tested HG_west individual (bottom) than the tested individual from CAnatolia (top). Significant Z scores above 3.0 or below −3.0 are shown with more intense colors. Among the CAnatolian individuals, Pınarbaşı and Boncuklu_N appear to share excess drift with HG_west when compared with individuals from Greece and NWAnatolia, in contrast to Tepecik-Çiftlik_N. Among the individuals from NGreece, SGreece, and NWAnatolia, AKT16 appears closest to HG_west, and much closer than Bar25. In light of these results, we modeled AKT16 and Bar25 independently in the demographic inferences.
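For readers unfamiliar with statistics of the form D(W, X; Y, Z), the following minimal sketch computes the point estimate from per-site allele frequencies, using the frequency-based definition given by Patterson et al. (2012). It is not the pipeline used for this figure: the Z scores shown here would additionally require a block jackknife over the genome, which is omitted, and the input frequencies below are made-up toy values. With this sign convention, a positive D indicates that W shares more alleles with Y than X does.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-site allele frequencies (values in [0, 1]).

    Frequency-based form (Patterson et al., 2012):
      numerator   = sum((w - x) * (y - z))
      denominator = sum((w + x - 2wx) * (y + z - 2yz))
    """
    if not (len(w) == len(x) == len(y) == len(z)):
        raise ValueError("all populations need the same number of sites")
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


# Toy frequencies (hypothetical): ind1 matches the test genome more often
# than ind2 does, and the outgroup carries the ancestral allele everywhere.
ind1 = [1.0, 0.0, 1.0, 0.0]
ind2 = [0.0, 0.0, 1.0, 1.0]
test = [1.0, 0.0, 1.0, 0.0]
outgroup = [0.0, 0.0, 0.0, 0.0]

d = d_statistic(ind1, ind2, test, outgroup)
print(f"D(ind1, ind2; test, outgroup) = {d:+.2f}")
```

Swapping the first two arguments flips the sign of D, which is why the panels report which individual sits on the left of each comparison.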

A stepwise, demic expansion of Neolithic farmers into Central Europe

To understand the progressive spread of EFs into Europe, we expanded the existing demographic model by including three EN populations from Serbia, Austria, and Germany (Figure M1_25). Even though the model is not spatially explicit, aspects of the EFs spread can be inferred from the spatial and temporal distribution of the archaeological sites. We find that a simple model with a strict stepwise migration of EFs originating in the wider Aegean region (Northwest Anatolia or Northern Greece) and extending to Serbia through the Balkans and along the so-called Danubian corridor, then to Austria, and eventually Germany, is better supported than a scenario allowing for long-distance migration from the Aegean directly to Austria (Figure M1_26A; Table S4). We also find that early farming communities incorporated a few HG individuals at each modeled stage of their dispersal along the Danubian corridor (2%–7%), compatible with previous estimates of 3%–9% (Hofmanová et al., 2016; Lipson et al., 2017; Nikitin et al., 2019; Figure M1_26B).

Scenarios without Western metapopulation introgression into EF populations have a lower likelihood than scenarios with introgression (Figure M1_26A). Even though we modeled this introgression from the Western metapopulation that is closely related to the newly sequenced genomes from the Mesolithic site of Vlasac, we cannot exclude that it actually occurred from other European HG groups like those related to Loschbour and Bichon. Previous genetic studies have indeed suggested that different Mesolithic backgrounds could have introgressed into the EF gene pool in different regions of Europe (Lipson et al., 2017).

A last glacial maximum divergence between Eastern and Western metapopulations

Our model also provides important insights regarding the deep branching of pre-Neolithic populations. The divergence between the ancestors of the Western and Eastern metapopulations is estimated to date back to ∼25.6 kya (95% CI 17.3–31.3, Figure M1_18). This is much younger than the previously inferred divergence time between the ancestors of western European HGs and either Iranian EFs (46–77 kya; Broushaki et al., 2016) or European EFs (46 kya; Jones et al., 2015). However, these previous divergence times were obtained using relatively simple models without bottlenecks and assuming topologies with only recent or even no admixture at all.

We have explored additional scenarios to evaluate the effects of these simplifications on metapopulation divergence times. As expected, a model without bottlenecks on the metapopulation branches leads to a much older divergence time of 39 kya between Eastern and Western metapopulations (Table S4), which is more in line with previous estimates, but this model has a lower likelihood than the original one (Figure M1_18A). On the other hand, models without any admixture between the Western HG metapopulation and the ancestors of western EFs lead to a much younger divergence time between the Eastern and Western HG metapopulations (16 kya, Table S4), but the fit with the data is very poor (Figure M1_18A).

Comparing two western European HGs from the sites of Bichon and Loschbour with our newly sequenced Mesolithic individuals from Vlasac further reveals that European HG populations had already split during the last glacial maximum (LGM) ∼22.8 kya (95% CI 16.7–24.7, Figure M1_28), with Bichon and Loschbour populations subsequently diverging from one another, approximately 1,000 years later.

The reduced diversity in European HGs is due to a massive LGM bottleneck

Genetic diversity, as quantified by the heterozygosity at neutral sites, is much lower in HGs than in EFs (Figure 2C), with the exception of Northwest Anatolian EFs, in line with previous studies (Kılınç et al., 2016; Kousathanas et al., 2017). HG genomes furthermore show a generally larger proportion of intermediate runs of homozygosity (ROHs) (2–10 Mb ROHs, Figure 2D; Figure M1_7; Ringbauer et al., 2021), indicative of background relatedness within European HGs and usually attributed to small population size (Ceballos et al., 2021); a small population size is also observed in MSMC2 analyses (Figure S2).
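The ROH length classes used here can be made concrete with a small sketch. It only illustrates the binning by length (2–10 Mb as intermediate, >10 Mb as long, following Figure 2D); it does not call ROHs from genotype data, and the segment coordinates and function names are hypothetical.

```python
from collections import Counter

MB = 1_000_000  # base pairs per megabase


def classify_roh(length_bp: int) -> str:
    """Bin a run of homozygosity using the length thresholds of Figure 2D."""
    if length_bp > 10 * MB:
        return "long"          # >10 Mb: suggests close inbreeding
    if length_bp >= 2 * MB:
        return "intermediate"  # 2-10 Mb: background relatedness
    return "short"             # below the 2 Mb cut-off


def summarize_roh(segments):
    """Total ROH length in Mb per class for (start, end) pairs of one genome."""
    totals = Counter()
    for start, end in segments:
        length = end - start
        if length <= 0:
            raise ValueError("segment end must exceed start")
        totals[classify_roh(length)] += length / MB
    return dict(totals)


# Hypothetical segments, for illustration only.
toy_segments = [(0, 3 * MB), (5 * MB, 17 * MB), (20 * MB, 21 * MB)]
print(summarize_roh(toy_segments))
```

A genome dominated by the intermediate class, with little in the long class, is the pattern interpreted in the text as background relatedness rather than close inbreeding.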

Figure S2.

MSMC2 effective population size estimates, related to Figure 3 and Methods S1

These were obtained for populations either with four haplotypes or two haplotypes when only one ancient individual was available for the population (shown in the legend with , details in Table M1_3, population sizes estimated from single ancient and modern genomes are shown in Figure M1_12). We used a mutation rate of 1.25 × 10^−8 per generation per site and a generation time of 29 years. The analysis suggests smaller effective population sizes in the most recent times for HGs compared with EFs.
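The role of the mutation rate and generation time can be sketched as follows. The snippet assumes the usual MSMC2 output conventions, in which time boundaries are scaled by the mutation rate and lambda is a scaled coalescence rate; the function names and the input values in the example are hypothetical and do not come from the analyses reported here.

```python
MU = 1.25e-8     # mutations per site per generation (from the caption)
GEN_TIME = 29.0  # years per generation (from the caption)


def scaled_time_to_years(t_scaled: float, mu: float = MU,
                         gen_time: float = GEN_TIME) -> float:
    """Mutation-scaled time -> generations (t / mu) -> years."""
    return t_scaled / mu * gen_time


def lambda_to_ne(lam: float, mu: float = MU) -> float:
    """Scaled coalescence rate -> diploid Ne, i.e. (1 / lambda) / (2 * mu)."""
    return (1.0 / lam) / (2.0 * mu)


# Hypothetical output row: scaled time 1.25e-5 and lambda 4000.
years = scaled_time_to_years(1.25e-5)
ne = lambda_to_ne(4000.0)
print(f"{years:,.0f} years ago, Ne ~ {ne:,.0f}")
```

Both conversions are linear in the assumed mutation rate, so a different rate would rescale the time and size axes of the plot without changing the shape of the curves.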

Interestingly, however, our model estimates HG effective population sizes to be larger than those of most EFs, particularly those from Anatolia (Bon002, AKT16, and Bar25), which show effective population sizes of only a few hundred individuals (Figure 3), consistent with their high proportion of intermediate-length ROHs (Figure 2D). The lower diversity observed among European HGs thus appears to be due to a very strong LGM bottleneck (Figures 3 and M1_18; see Discussion) rather than to individuals living in small isolated groups.

The inferred model reproduces key features of genomic data

Genomic data simulated under the most complete demographic scenario (Figure 3) lead to population relationships (Figure 4A) that are very similar to those observed on the MDS plot performed on real data (Figure 2A). Three clusters are clearly visible: European HGs are all in proximity to each other and markedly distinct from the western EFs, whereas the Caucasus HG and the Iranian EF remain distinct from both groups. Furthermore, a simplified admixture graph (Patterson et al., 2012) fitted on data simulated under the model presented in Figure 3 leads to f-statistics consistent with those calculated on the real data (compare Figures M1_29 and M1_30). All population relationships inferred in our model of Figure 3 are thus confirmed by f-statistics. Finally, an admixture analysis performed on data simulated according to our model leads to admixture proportions that are very similar to those observed, as shown in Figures 4B and S3. In particular, the Caucasus HG shows a large yellow component shared with the Iranian EF, in line with their proximity on the MDS plot. The good match between observed and simulated data thus provides an a posteriori validation of our model-based approach.

Figure S3.

Comparison of observed and simulated Admixture plots, related to Figure 4 and Methods S1.

Admixture plot for K = 2 (left panel) and K = 3 (right panel) carried out on (A) observed data for 16 ancient genomes included in fastsimcoal2 demographic inferences (909,688 sites without missing data; Bichon, Bar8, and Bon002 were not included because of lower quality) and on (B) data simulated according to our final model (shown in Figure 3), for the same subset of individuals as in (A).

Discussion

Evolutionary insights gained from explicit demographic modeling

Our sequencing of ancient genomes at >10x has not only tripled the number of high-quality whole genomes available for the early Holocene in Europe but also allowed us to perform genetic analyses on an unbiased set of markers minimally impacted by selection. Such neutral markers are ideally suited for reconstructing the population history of Europe and Southwest Asia from the Late Pleistocene to the Early Holocene. In addition to confirming long-held assumptions and interpretations, our modeling provides several key insights about the demographic processes that preceded the Neolithic transition and its expansion to the west.

A split of European HGs triggered by the LGM

Our model suggests that European HGs had already split into two subgroups (West 1 and West 2 in Figure 5C) ∼23 kya, after experiencing a very severe bottleneck during the LGM, responsible for their low level of genetic diversity (Figure 2C). In contrast to previous studies (Ceballos et al., 2021; Günther et al., 2018), HGs were found to have generally larger effective population sizes than contemporary EFs (Figure 3). Such relatively large effective population sizes can lead to slow population differentiation, which might explain why the different HG groups show close genetic affinities (Figures 2A and 4A) despite long divergence times and a wide geographic distribution. Large HG effective population sizes could be due to long-distance genetic exchanges between groups. By contrast, the inferred low effective population size of EFs (despite obviously large census sizes) suggests that the Neolithic transition was linked to a reduction in local EF effective population sizes, potentially due to “sedentarization” or commitment to place (Aimé et al., 2013) and restricted gene flow among small-scale farming communities, as observed at the aceramic Neolithic sites of Boncuklu and Aşıklı (Yaka et al., 2021).

Figure 5.

A spatiotemporal interpretation of population differentiation in SW Asia and Europe based on our model and the geographic distribution of the genomes

For a Figure360 author presentation of this figure, see https://doi.org/10.1016/j.cell.2022.04.008.

Colored shaded areas indicate approximate putative distributions of populations at different time points. The letters (A) to (H) indicate the chronological order of events; see main text for a detailed description. Note that warmer periods (Bølling and Allerød interstadials, Holocene) correspond to population range expansions while colder periods (LGM, Older Dryas) are associated with contractions.

See also Figure S4 for additional f-statistics analyses supporting alternative connections between the Levant and the Aegean/Greece area.

Ancestors of western EFs are related to Caucasus HGs

We have shown that HGs from the Caucasus share a common ancestor with western EFs, as both show traces of an ancestral admixture event between the Western and Eastern metapopulations (Figure 3). Despite this historical relationship, our model still manages to predict the apparent affinity of the Caucasus HG genome with the Iranian EF (e.g., the yellow component shared by KK1 and WC1 found for real and simulated data in Figures 2B, 4B, and S3, as well as their proximities on the MDS plots of Figures 2A and 4A). This shows that observed patterns of genetic similarities are not sufficient on their own to reconstruct actual ancestry relationships.

Specific demographic processes explain EF and HG differentiation from their ancestral population

Through our demogenomic modeling, we can demonstrate how specific demographic processes gave rise to the genetic divergence of past populations. We can indeed not only simulate the genomic diversity of sampled ancient individuals but also that of individuals drawn from ancestral populations at any time point and thus predict their relationships with sampled individuals (Figure 4A). For instance, we infer that the population ancestral to all our ancient sampled individuals was genetically close to Iranian EFs and Caucasus HGs. As also shown in Figure 4A, the ancestors of European HGs (the Western metapopulation) considerably diverged from the ancestral population due to an LGM bottleneck, explaining their outlier position on the top left of the MDS plot.

The ancestors of western EFs (i.e., the Central metapopulation) were initially close to the ancestral population and to the Iranian EFs, but the two consecutive admixture events with the Western HG metapopulation put them closer to the European HGs on the MDS plot. Nonetheless, it was the >2,500 years of intense genetic drift that made them particularly distinct from all other groups, resulting in them occupying the upper right corner of the MDS plot in Figure 4A. We postulate that this rapid divergence process could be due to recurrent founder effects during their dispersal through Anatolia. Although western EFs were previously described as genetically intermediate between other SW Asian groups (Feldman et al., 2019), or as a mixture of Iranian and Southern Levant Neolithic populations with western HGs (Lazaridis et al., 2016), this initial admixture signal remains hidden to classical admixture analyses because it is progressively eroded by the genetic drift that occurred during the migration of these populations through Anatolia (Figure 4B). Whereas populations simulated shortly after the two main admixture events with the Western metapopulation appear as admixed (i.e., EFAncestors [12.9 kya] in Figure 4B), this signal progressively disappears through time with the emergence (and migration) of western EFs, who increasingly appear to have a completely unrelated gene pool (e.g., green component in the admixture analyses of Figure 4B). Their slightly more central position on the MDS plot (Figure 4A) in later generations is then due to admixture with the Western metapopulation and surrounding farmers modeled as the Central metapopulation.

A spatial interpretation of population differentiation

The timing and sequence of demographic events that emerge from our model suggest a scenario of population differentiation with a clear geographic and chronological resolution from the LGM to the early Holocene (Figure 5).

HG structuration and divergence induced by the LGM

We find that the divergence between the Eastern and Western HG metapopulations was initiated by the LGM some 26 kya (Figures 5A and 5B), probably due to a deterioration of the habitat and a contraction into LGM refugia potentially located in milder regions (Figure 5B). The archaeological, environmental, and climatic records indeed suggest that large tracts of Eurasia were deserted at the height of the LGM, ∼26 kya, when ice sheets were at their maximum extent (Clark et al., 2009; French, 2021; Jöris et al., 2009), and both human and animal populations survived in glacial refugia located in more southern regions of Europe (Sommer et al., 2008).

In our initial model, the LGM divergence is immediately followed by a bottleneck of very strong intensity in the population ancestral to European HGs. The modeled intensity I of this bottleneck depends on the bottleneck duration (t) and its size (Nbot) as I = t/(2Nbot). If the bottleneck had lasted 4,000 years (corresponding to 138 generations of 29 years), our estimated intensity I = 0.18 would correspond to an effective bottleneck population size of 383 individuals. This low number is in line with the archaeological record suggesting a 60% decline in census population size in the latter part of the Gravettian, 14C-dated to 29,000–25,000 cal. BP, with total population size in Europe as low as 700–1,550 individuals (Maier, 2017). We have explored an alternative demographic scenario where the bottleneck was decoupled from the divergence between the Eastern and Western HG metapopulations. In that case (Figure M1_17; Table S4), we find a slightly older metapopulation divergence (27 kya) occurring during the phase of decrease in northern summer solar insolation, i.e., between 32 and 26 kya (Clark et al., 2009), and a more recent bottleneck (23 kya) sitting in the middle of the LGM, confirming that the two events are related to this extreme cold phase.
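The arithmetic linking bottleneck intensity, duration, and size can be checked by inverting I = t/(2Nbot). The following is a minimal Python sketch of that calculation; the function name and the rounding of the number of generations are illustrative assumptions and are not taken from the study's code.

```python
def bottleneck_size(intensity, duration_years, generation_time=29):
    """Invert I = t / (2 * Nbot) to obtain the effective size Nbot.

    t is expressed in generations, here rounded to the nearest integer
    (an assumption made to match the 138 generations quoted in the text).
    """
    generations = round(duration_years / generation_time)  # 4000 / 29 -> 138
    return generations / (2 * intensity)


n_bot = bottleneck_size(intensity=0.18, duration_years=4000)
print(round(n_bot))  # 383
```

Because only the ratio t/Nbot is constrained by the intensity, a shorter assumed duration at the same intensity would imply a proportionally smaller Nbot.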

Our analyses further indicate that European HGs differentiated in two separate refugia by the end of the LGM 21.7 kya (Figures 3 and 5C), perhaps corresponding to what archaeologists traditionally identify as the areas of distribution of Solutrean and Epigravettian technocomplexes (Smith, 1966; Kozłowski and Kaczanowska, 2004).

Post-LGM range expansions and admixture

Following a period of range expansion after the LGM (Figure 5D), representatives of the Central HG metapopulation, who were likely descendants of the Epigravettian refugial population, mixed ∼14.2 kya with the population ancestral to both Caucasus HGs and western EFs (called East 1 in Figure 5E). Given the assumed geographical distribution of the preceding glacial refugia, this Bølling interstadial period admixture likely happened in a region encompassing Southeastern Anatolia and the Northern Levant or even in neighboring regions such as Central and Eastern Anatolia or the Turkish South coast.

Differentiation of Anatolian and Aegean EF groups

The demographic processes behind the differentiation between Central Anatolian and Aegean EF populations are more difficult to pinpoint and locate with precision. The inferred low population size of the ancestors of western EFs after their split from the Central metapopulation during the Older Dryas (Figure 5F) could be due to a westward range expansion and associated recurrent founder effects during the Allerød interstadial. This period was characterized by relatively favorable climatic conditions, which may have allowed the ancestors of western EFs to further mix with Epipalaeolithic HGs in Anatolia (∼12.9 kya; Figure 5G).

The fact that Central Anatolian EFs share the same admixture event and drift with Aegean EFs suggests that they were part of the same expansion wave, and that Central Anatolia was settled by EFs before they reached the Aegean area, potentially by a different route. However, a previous study (Feldman et al., 2019) has shown that Central Anatolian EFs and Epipalaeolithic HGs were genetically similar, which indicates that admixed groups existed in Central Anatolia prior to the Neolithic transition. Alternatively, admixed HG populations could have moved there from the Fertile Crescent, adopting fully developed farming practices at a later stage, which is in line with the observation that early aceramic sites such as Boncuklu and Aşıklı on the Anatolian Plateau show experiments in crop cultivation and caprine management ∼9.7 kya (Buitenhuis et al., 2018; Ergun et al., 2018).

By contrast, the migration to NW Anatolia (Figure 5H) likely occurred at the time of the fully developed ceramic Neolithic characterized by the establishment of widespread mixed farming (Bogaard et al., 2017). Further support for such a demic diffusion scenario to NW Anatolia by a direct (coastal) route and to a lesser extent via the Konya plain region of Central Anatolia comes from f-statistics showing Southern Levant populations sharing more drift with Aegean Neolithic individuals than with Central Anatolian ones (Figure S4). This signal could be due to (1) long-distance gene flow between Aegean and Levantine communities, as suggested by archaeologists based on similarities in material culture (Perlès, 2001; Horejs et al., 2015), (2) a higher level of Western metapopulation admixture in Central Anatolia as observed in Boncuklu (Figure S1B) and inferred in our model (Figure 3), or (3) an early migration of the Boncuklu ancestors from the Fertile Crescent to Central Anatolia, combined with some later gene flow between populations from the Fertile Crescent and the ancestors of Aegean EFs. However, f-statistics analyses reveal that EFs from the Aegean and NW Anatolia are rather heterogeneous in their levels of shared drift with several populations, including HGs from the Levant and EFs from Iran (Figure S4; Table S5), suggesting that the Neolithization of the Aegean was a more complex process.

Figure S4.

Patterns of population admixture revealed by f-statistics, related to Figure 5 and Methods S1.

Relationship of Anatolian and Greek Neolithics with the Levant using f-statistics on the 1240k dataset of the form D(CAnatolia/NWAnatolia/NGreece/SGreece/CEurope,Test; Levant,Mbuti [outgroup]). This test indicates whether Neolithic individuals from CAnatolia, NWAnatolia, Greece, or CEurope (left) share less (blue, Z score < 0) or more (orange, Z score > 0) ancestry with samples from the Levant, namely individuals from contemporary Israel associated with Natufian culture (Israel_Natufian), Pre-Pottery Neolithic B (Israel_PPNB), and Chalcolithic (Israel_C; all bottom) than the Test individuals/populations from Greece and CAnatolia. Significant Z scores above 3.0 or below −3.0 are shown with more intense colors. We find a strongly significant excess of shared drift between populations from the Levant and NGreece, SGreece, NWAnatolia, and CEurope when contrasted to Boncuklu. This signal was, however, not replicated when representing CAnatolia by Pınarbaşı. In contrast, the SGreece populations Diros_EN and Peloponnese_N appear to share excess drift with populations from the Levant, and in particular the Chalcolithic Israel_C, when contrasted to samples from NGreece, NWAnatolia, or CEurope.
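To illustrate the kind of statistic summarized in this figure, the sketch below computes a D-statistic of the form D(W, X; Y, Z) from per-site allele frequencies, using the ratio estimator of Patterson et al. (2012). It is a toy illustration only: the frequencies are made up, the function name is hypothetical, and the block-jackknife step that yields the Z scores reported by ADMIXTOOLS is deliberately omitted.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-site allele frequencies in four populations.

    Numerator:   sum over sites of (w - x) * (y - z)
    Denominator: sum over sites of (w + x - 2wx) * (y + z - 2yz)
    A positive value indicates that W shares more alleles with Y than X does.
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    return num / den


# Hypothetical frequencies at three sites (not real data).
left = [0.9, 0.1, 0.8]      # e.g., a Neolithic group on the left of the test
test = [0.5, 0.5, 0.5]      # the Test population it is contrasted with
levant = [0.9, 0.1, 0.8]    # population whose affinity is being probed
outgroup = [0.0, 0.0, 0.0]  # outgroup fixed for the reference allele

print(d_statistic(left, test, levant, outgroup) > 0)  # True
```

In this toy case the left population tracks the Levant frequencies exactly, so D is positive, which corresponds to the orange (Z score > 0) direction in the figure.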

Neolithic expansion occurred through a mixture of cultural and demic diffusion

Even though the initial spread of the Neolithic must have been through cultural diffusion in the Fertile Crescent among genetically well differentiated groups, our results indicate that expansion to Northwestern Anatolia, the Aegean Basin and the Danubian corridor occurred primarily through demic diffusion (Figure 5H). The initial spread of populations beyond the Fertile Crescent was certainly far from linear and associated with multiple genetic influences from the Levant, some of which were not linked to the emergence of a fully developed Neolithic economy. As soon as Neolithic lifeways reached Europe, the mode of dispersal of populations to Central Europe became more linear and can essentially be modeled as a stepping-stone migration (Figure M1_26).

Limitations of the study

Although our demogenomic model (Figure 3) can clarify observed population affinities, it has exposed temporal and geographical gaps, which can only be filled by producing additional genomic data of similarly high quality. In particular, high-quality genomes from pre-Neolithic and Neolithic populations of the Western Fertile Crescent, as well as reference genomes for Central Anatolia and HGs from Eastern Europe are needed to confirm our conclusions and render them more complete. Whereas the demographic scenarios we have explored are much more complex than those previously investigated (Broushaki et al., 2016; Jones et al., 2015), they are certainly still very schematic and reality must have been more complex. In particular, we have used simplified assumptions such as constant mutation rates and generation times, which may have actually changed over time (Coll Macià et al., 2021; Harris, 2015), or we have modeled genetic interactions between populations as direct and unique pulses of gene flow, whereas gene flow could have occurred over prolonged periods, or could have occurred indirectly through unsampled populations. However, since our model still captures and reproduces most observed genetic patterns (Figure 4), it seems robust enough to provide key insights into past demographic processes. Finally, while we have analyzed genetic evidence relating to European EFs along the Danubian corridor, the spread of EFs along the Mediterranean coast remains to be investigated to obtain a more complete picture of the Neolithic settlement of Europe.

Conclusions

Population modeling using the framework outlined here allowed us to extract key, unexpected, but complementary and far more detailed information on population affinities than could be concluded from summary statistics and multivariate analysis alone. In addition, this model provides a time frame for the differentiation of the major groups populating Southwest Asia and Europe from the LGM until the introduction of agriculture and highlights the crucial role of climatic changes as a driver of population fragmentation and admixture events (Lahr and Foley, 1998). Although the world’s first farmers look genetically very different from European HGs, our simulation based on high-quality genomes shows that some European and Southwest Asian populations in fact shared a recent common history marked by repeated interactions since the end of the last ice age. Strong drift during their expansion through Anatolia contributed to making western EFs look more dissimilar than they actually were and somehow concealed their hybrid nature. In summary, the idea of a single cultural and genetic origin of all farmers in the Fertile Crescent, without significant initial contribution of European-like HGs, is no longer tenable in its current form.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Biological samples

Ancient human bone material This study AKT16; ERS10598167
Ancient human bone material This study Bar25; ERS10598177
Ancient human bone material This study Nea2; ERS10598179
Ancient human bone material This study Nea3; ERS10598178
Ancient human bone material This study VLASA32; ERS10598181
Ancient human bone material This study VLASA7; ERS10598180
Ancient human bone material This study LEPE48; ERS10598171
Ancient human bone material This study LEPE52; ERS10598168
Ancient human bone material This study STAR1; ERS10598170
Ancient human bone material This study VC3-2; ERS10598169
Ancient human bone material This study Asp6; ERS10598172
Ancient human bone material This study Klein7; ERS10598176
Ancient human bone material This study Ess7; ERS10598174
Ancient human bone material This study Dil16; ERS10598175
Ancient human bone material This study Herx; ERS10598173

Chemicals, peptides, and recombinant proteins

AccuPrime™ Pfx SuperMix Invitrogen Cat#12344040
Bst Polymerase, Large Fragment (8 U/μl) New England Biolabs GmbH Cat#M0275S
dNTPs (each 10 mM) QIAGEN, Hilden, Germany Cat#201901
EDTA (0.5 M, pH 8.0) Ambion/Applied Biosystems, Life Technologies Cat#AM9262
Lauroylsarcosine, Sodium Salt Merck Millipore, Merck KGaA, Darmstadt, Germany Cat#428010
NEBNext End Repair Enzyme Mix New England Biolabs GmbH Cat#E6050L
NEBNext End Repair Reaction Buffer (10X) New England Biolabs GmbH Cat#E6050L
Nuclease-free H2O Life Technologies Cat#AM9932
PEG-4000 Thermo Scientific Cat#EL0011
Proteinase K Roche Diagnostics, Mannheim, Germany Cat#3115828001
T4 DNA Ligase (5 U/μl) Thermo Scientific Cat#EL0011
T4 DNA Ligase Buffer (10X) Thermo Scientific Cat#EL0011
ThermoPol Buffer (10X) New England Biolabs GmbH Cat#M0275S
Tris-EDTA Sigma-Aldrich Cat#T9285
Tris-HCl (1M, pH 8.0) Life Technologies Cat#15568025
USER™ enzyme New England Biolabs GmbH Cat#M5505L

Critical commercial assays

Agilent 2100 Expert Bioanalyzer System and High Sensitivity DNA Analysis Kit Agilent Technologies Cat#5067-4626 (kit)
Qubit Fluorometric quantitation and dsDNA HS Assay Kit Invitrogen Cat#Q32854 (kit); Cat#Q32856 (tubes)

Deposited data

Sequencing data and aligned BAMfiles This study ENA: PRJEB50857
Code and input-files connected to this study This study https://doi.org/10.5281/zenodo.6367517 for the release of https://github.com/CMPG/originsEarlyFarmers/
Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Zuzana Hofmanová, co-author of this study) (Hofmanová et al., 2016) Bar8
Ancient genome - Demultiplexed FASTQ files (Jones et al., 2015) Bichon; ERR1078331 - ERR1078351
Ancient genome - FASTQ files (produced from BAMfiles by ENA) (Kılınç et al., 2016) Bon002; ERR1514027
Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Kay Prüfer) (Lazaridis et al., 2014) Loschbour
Ancient genome - Raw sequencing FASTQ file kindly provided by the authors (contact: Lara Cassidy) (Gamba et al., 2014) NE1
Ancient genome - Aligned BAM file (Günther et al., 2018) SF12; ERR2060277
Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Kay Prüfer) (Lazaridis et al., 2014) Stuttgart
Ancient genome - Demultiplexed FASTQ file kindly provided by the authors (contact: Yoan Diekmann, co-author of this study) (Brace et al., 2019) CarsPas1
Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Jens Blöcher, co-author of this study) (Broushaki et al., 2016) WC1
Ancient genome - Demultiplexed FASTQ files (Jones et al., 2015) KK1; ERR1078321-ERR1078325
Modern genomes - Aligned BAMfiles from 77 modern individuals The Simons Genome Diversity Project (SGDP) (Mallick et al., 2016) https://www.simonsfoundation.org/simons-genome-diversity-project/ Individual identifiers, see Table S2.
1000 Genomes phase3 per chromosome VCFs (1000 Genomes Project Consortium et al., 2015) http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/
Allen Ancient DNA Resource v42.4 and v37.2 Reich lab public data release https://reichdata.hms.harvard.edu/pub/datasets/amh_repo/curated_releases/index_v42.4.html; https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data
Assembly gaps and centromeres UCSC genome browser http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gap.txt.gz
Chimpanzee hg19 nucleotide states UCSC genome browser http://hgdownload.cse.ucsc.edu/goldenpath/hg19/vsPanTro4/axtNet/
Chimpanzee Reference Genome (panTro4) The Chimpanzee Sequencing and Analysis Consortium (2005) GenBank assembly accession: GCA_000001515.4
CpG islands list UCSC genome browser (CpG Islands (cpgIslandExt) Track)
Ensembl Compara 71 genome FASTA files Ensembl Compara http://ftp.ensembl.org/pub/release-71/fasta/ancestral_alleles/homo_sapiens_ancestor_GRCh37_e71.tar.bz2
Gorilla hg19 nucleotide states UCSC genome browser http://hgdownload.cse.ucsc.edu/goldenpath/hg19/vsGorGor5/axtNet/
Haplotype Reference Consortium dataset (McCarthy et al., 2016) accession number EGAD00001002729 on the European Genome-phenome Archive
HapMap file for chrX (ANGSD) HapMapChrx.gz (Rasmussen et al., 2011) http://www.popgen.dk/angsd/index.php/ANGSD
HapMap phase II b37 genetic map N/A https://github.com/odelaneau/shapeit4/tree/master/maps
Human reference sequence hs37d5 (1000 Genomes Project Consortium et al., 2015) ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
Known InDel positions (1000 Genomes Project Consortium et al., 2015) ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.indels.b37.vcf.gz
Known InDel positions (1000 Genomes Project Consortium et al., 2015) ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/Mills_and_1000G_gold_standard.indels.b37.vcf.gz
Per chromosome mappability mask for human reference genome hs37d5 The Simons Genome Diversity Project (SGDP) (Mallick et al., 2016) https://github.com/wangke16/MSMC-IM/blob/master/masks/um75-hs37d5.bed.gz
Recombination Map for YRI population (1000 Genomes Project Consortium et al., 2015) ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20130507_omni_recombination_rates/
Ultraconserved sites for recalibration step (Dimitrieva and Bucher, 2013) https://ccg.epfl.ch/UCNEbase/data/download/ucnes/hg19_UCNE_coord.bed

Oligonucleotides

P5 and P7 adapters (Meyer and Kircher, 2010) IDT, Leuven, Belgium N/A
P5 and P7 indexing primers with index sequences (8 bp) from the NexteraXT index Kit v2 (Kircher et al., 2012) and Illumina, San Diego, California, United States IDT, Leuven, Belgium N/A

Software and algorithms

ADMIXTOOLS package v7300 (Patterson et al., 2012) https://github.com/DReichLab/AdmixTools
ANGSD - version 0.917 (Rasmussen et al., 2011) http://www.popgen.dk/angsd/index.php/ANGSD
ANNOVAR (Wang et al., 2010) https://annovar.openbioinformatics.org
ATLAS - commits 6bd2482 & 7cfc900 (Link et al., 2017) https://bitbucket.org/wegmannlab/atlas/
ATLAS-Pipeline, commit 6df90e7 Wegmann lab, Ilektra Schulz bitbucket.org/wegmannlab/atlas-pipeline
bcftools versions: 1.9 and 0.1.15 (Danecek et al., 2021) https://samtools.github.io/bcftools/howtos/index.html
bwa - Burrows-Wheeler Alignment Tool - versions 0.7.15 and 0.7.17 (Li, 2013) bio-bwa.sourceforge.net
BEDOPS v2.4.40 (Neph et al., 2012) https://bedops.readthedocs.io/en/latest/
Bedtools 2.25.0 (Quinlan and Hall, 2010) https://bedtools.readthedocs.io/en/latest/
ContamMix - version 1.0 (Fu et al., 2013) https://science.umd.edu/biology/plfj/
dadi (Gutenkunst et al., 2009) https://bitbucket.org/gutenkunstlab/dadi
fastsimcoal2.7 (Excoffier et al., 2013, 2021) http://cmpg.unibe.ch/software/fastsimcoal2/
fastqc - version 0.11.5 Babraham Bioinformatics www.bioinformatics.babraham.ac.uk/projects/fastqc/
GATK - version 3.7 (DePristo et al., 2011) https://gatk.broadinstitute.org
HIrisPlex-S webtool (Chaitanya et al., 2018; Walsh et al., 2013) https://hirisplex.erasmusmc.nl/
IBDSeq v. r1206 (Browning and Browning, 2013) http://faculty.washington.edu/browning/ibdseq.html
LEA R package v.2.6.0 (Frichot and François, 2015) https://bioconductor.org/packages/release/bioc/html/LEA.html
mafft - version 7.31 (Katoh et al., 2002) https://mafft.cbrc.jp/alignment/software/linux.html
MIA (Mapping Iterative Assembler) - version 1.0 MPI EVA Bioinformatics https://github.com/mpieva/mapping-iterative-assembler
MSMC2 (Schiffels and Wang, 2020) https://github.com/stschiff/msmc2
MSMC-tools, commit 07bc8a9 (Schiffels and Wang, 2020) https://github.com/stschiff/msmc-tools
phy-mer (Navarro-Gomez et al., 2015) https://github.com/MEEIBioinformaticsCenter/phy-mer/
Picard-tools - version 2.9 Broad Institute http://broadinstitute.github.io/picard/
R - version 4.0, 3.7, and 3.6.1 R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria https://www.R-project.org/
SAMtools - version 1.3 (Li et al., 2009) http://www.htslib.org/
Samtools 1.9 (Danecek et al., 2021) https://github.com/samtools/samtools
seqtk - version 1.2 N/A https://github.com/lh3/seqtk
SHAPEIT4 v1.2 (Delaneau et al., 2019) https://odelaneau.github.io/shapeit4/
Snakemake - version 4.0 (Köster and Rahmann, 2012) https://snakemake.readthedocs.io/en/stable/
Trim Galore! - version 0.4.3 Babraham Bioinformatics www.bioinformatics.babraham.ac.uk/projects/trim_galore/
Yjasc_3752_ry_compute.py, version 0.4 (Skoglund et al., 2013) https://ars.els-cdn.com/content/image/1-s2.0-S0305440313002495-mmc1.zip
Yleaf (Ralf et al., 2018) https://github.com/genid/Yleaf

Other

Agencourt® AMPure® XP beads Beckman Coulter Cat#A63880
Amicon Ultra-4 Centrifugal Filter Units, 30kDa Merck Millipore, Darmstadt, Germany Cat#UFC803096
Hydroxyapatite Sigma-Aldrich Cat#21223
MinElute PCR Purification Kit QIAGEN, Hilden, Germany Cat#28006
MSB® Spin PCRapace Invitek Molecular GmbH, Berlin, Germany Cat#1020220400
Spezial-Edelkorund (EW60/250μ) Harnisch+Rieth, Winterbach, Germany Cat#75250
Spezial-Edelkorund (30B/50μ) Harnisch+Rieth, Winterbach, Germany Cat#75308

Resource availability

Lead contact

  • Further information and requests should be directed to and will be fulfilled by the lead contact Laurent Excoffier (laurent.excoffier@iee.unibe.ch).

Materials availability

  • This study did not generate new unique reagents.

Experimental model and subject details

In this study, we present new whole-genome sequences for 15 ancient human individuals (see Table 1, Figure D1_1 from Data S1, and key resources table for an overview). Our sample set consists of two Mesolithic individuals from Vlasac (Serbia), two individuals from Transitional and Neolithic layers at Lepenski Vir (Serbia) and 11 early Neolithic individuals originating from Aktopraklık and Barcın in Turkey (one individual each), from Nea Nikomedeia in Greece (two individuals), from Vinča-Belo Brdo and Grad-Starčevo in Serbia (one individual each), from Kleinhadersdorf and Asparn-Schletz in Austria (one individual each) and Essenbach-Ammerbreite, Dillingen-Steinheim and Herxheim in Germany (one individual each). For detailed information on the archaeological background, please refer to Data S1. Age estimated from anthropological analyses and genetic sex of the individuals are reported in Table 1 and Data S1.

The ancient skeletal material used for this study was collected and analysed in collaboration with the anthropologists/archaeologists in charge of the archaeological sites, in accordance with the laws of the respective countries. Specifically, for the Greek samples, we had permission by the Greek Ministry of Culture and Sports for the sampling, DNA extraction, and radiocarbon dating according to Greek law for destructive sampling of archaeological material (Ν.3028/02).

Method details

Data generation

Sample preparation

All laboratory analyses were conducted in the dedicated ancient DNA facilities of the Palaeogenetics Group Mainz (Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Mainz), according to strict ancient DNA protocols to prevent contamination with modern DNA as well as cross contamination between samples as previously described (Bollongino et al., 2013; Bramanti et al., 2009; Scheu et al., 2015). These ancient DNA standards include decontamination of workspace, labware and samples, independent DNA extractions and the processing of blank controls during sample pulverisation, DNA extraction, library preparation and PCR reactions to monitor contaminations.

All steps prior to PCR amplification (sample preparation, DNA extraction and library preparation) were carried out in the dedicated ancient DNA facilities of the Palaeogenetics Group, which are separated from post-PCR areas.

The petrous bone samples were prepared as described in Hofmanová et al. (2016). In detail, after documentation, samples were sterilised under ultraviolet light (254 nm) from 2 sides for 45 minutes per side. In order to remove superficial contaminants, soil remains and the outer bone surface were removed using a sandblasting machine (P-G 400, Harnisch+Rieth) with Spezial-Edelkorund (EW60/250μ and 30B/50μ; Harnisch+Rieth). The densest, inner part of the petrous bone was then isolated and cut in cubes using a disk saw (Electer Emax IH-300, MAFRA). After irradiation with ultraviolet light (254 nm) from two sides, for 45 minutes per side, the densest bone cubes were pulverised using a mixer mill (MM200, Retsch). To control contamination, blank milling controls containing hydroxyapatite were processed in parallel.

Radiocarbon dating and stable isotope analysis

New radiocarbon dates and stable isotope values (carbon δ13C and nitrogen δ15Ν) were obtained for seven individuals (Table D1_2). The analyses were performed at the Curt-Engelhorn-Zentrum Archäometrie gGmbH (Mannheim, Germany). The collagen used for the analyses was extracted from fragments of the petrous bones used for palaeogenomic analysis.

All dates were uniformly re-calibrated in OxCal 4.4.2 (Bronk Ramsey, 2009) using the IntCal20 calibration curve (Reimer et al., 2020). The isotope results were used as a basis for palaeodietary inference (Figure D1_11) and for freshwater reservoir effect correction in the case of the Vlasac sample (Table D1_2). Radiocarbon dates on human bones from archaeological sites in the Danube Gorges are known to be influenced by a large freshwater 14C reservoir effect, due to high intake of freshwater food, and need to be corrected before calibration (Cook et al., 2001). We did it following the method described by Cook et al. (2001), later amended by Bonsall et al. (2015): for human bones with $\delta^{15}\mathrm{N} > 8.3$, uncalibrated dates of the form $\mu \pm \sigma$ were adjusted to $(\mu - 545c) \pm \sqrt{\sigma^{2} + (70c)^{2}}$, where $c = (\delta^{15}\mathrm{N} - 8.3)/(17 - 8.3)$.
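The reservoir correction can be sketched in a few lines of Python. This sketch assumes that the adjustment reads as (μ − 545c) ± sqrt(σ² + (70c)²) with c = (δ15N − 8.3)/(17 − 8.3), and that dates from bones with δ15N at or below 8.3 are left unchanged. The function name, parameter names, and the example date are hypothetical and are not values from this study.

```python
import math


def reservoir_correct(mu, sigma, d15n,
                      offset=545.0, offset_sd=70.0,
                      d15n_terrestrial=8.3, d15n_aquatic=17.0):
    """Return (corrected age, corrected error) in uncalibrated 14C years BP.

    c is the assumed fraction of aquatic diet, interpolated linearly
    between the terrestrial and aquatic delta-15N end members.
    """
    if d15n <= d15n_terrestrial:
        return mu, sigma  # no aquatic signal, so no correction is applied
    c = (d15n - d15n_terrestrial) / (d15n_aquatic - d15n_terrestrial)
    corrected_age = mu - offset * c
    corrected_sd = math.sqrt(sigma ** 2 + (offset_sd * c) ** 2)
    return corrected_age, corrected_sd


# Hypothetical date of 8000 +/- 40 BP with an intermediate delta-15N value.
age, sd = reservoir_correct(mu=8000, sigma=40, d15n=14.0)
print(f"{age:.0f} +/- {sd:.0f} BP")
```

The corrected date, not the raw one, would then be passed to the calibration software.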

DNA extraction

DNA extraction followed the protocol by Yang et al. (1998) with the modifications described in MacHugh et al. (2000) and Gamba et al. (2014) as well as additional modifications described below. For each sample, 0.15-0.31 g of bone powder was used for extraction.

Prior to extraction, a pre-lysis was performed by adding 1 ml of EDTA (0.5 M, pH8) to the bone powder and incubating at room temperature for 10 minutes. The solution was centrifuged at maximum speed to pellet the powder and the supernatant was removed.

Lysis was performed on rocking shakers at 37°C for 24 hours (900 rpm) using 1 ml of extraction buffer containing EDTA (950 μl, 0.5 M, pH8), Tris-HCl (20 μl, 1 M, pH8), N-Lauroylsarcosine (17 μl, 5%) and Proteinase K (13 μl; 20 mg/ml). After 24 hours of incubation, the samples were centrifuged for 10 minutes at 10,000 rpm, the supernatant was removed, transferred into new tubes and stored in a fridge until further processing. A second lysis step was performed following the same procedure.
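As a consistency check on the extraction buffer, the listed volumes should add up to 1 ml, and the final concentration of each component follows from simple dilution (volume × stock concentration / total volume). The sketch below works through this; the dictionary layout and function name are illustrative assumptions.

```python
# (volume in microlitres, stock concentration, unit), as listed in the text
recipe = {
    "EDTA": (950, 0.5, "M"),
    "Tris-HCl": (20, 1.0, "M"),
    "N-Lauroylsarcosine": (17, 5.0, "%"),
    "Proteinase K": (13, 20.0, "mg/ml"),
}


def final_concentrations(recipe):
    """Return the total volume and each component's final concentration."""
    total = sum(vol for vol, _, _ in recipe.values())
    final = {name: (vol * stock / total, unit)
             for name, (vol, stock, unit) in recipe.items()}
    return total, final


total_ul, final = final_concentrations(recipe)
print(f"total volume: {total_ul} ul")
for name, (conc, unit) in final.items():
    print(f"{name}: {conc:.3g} {unit}")
```

Under these assumptions the buffer is dominated by EDTA at close to its stock concentration, with the other components present at low final concentrations.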

Following lysis, the supernatants from the two lysis steps were merged on an Amicon Filter (Amicon Ultra-4 30 kDa, 15 ml) and centrifuged for 10 min at 2500 rpm. The DNA was then washed twice with 3 ml 1X Tris-EDTA, followed by centrifugation at 2500 rpm for 20 minutes and discarding of the flow-through in between. After washing, the extract was concentrated to 100 μl and subsequently purified with the MinElute PCR Purification Kit following the manufacturer’s instructions but incubating for 5 minutes during elution with 44 μl elution buffer (preheated to 65°C).

Blank controls were processed during DNA extraction and incorporated into all further steps of the analyses.

Library preparation

Double-indexed libraries were prepared according to the protocol by Kircher et al. (2012) with slight modifications.

Prior to library preparation the DNA extracts were treated with USER™ enzyme: 5 μl of USER™ enzyme was added to 16.25 μl of DNA extract (exception: 17 μl of two merged extracts were used for library SL3_5 of sample AKT16, see Table S1) and the mixture was incubated for three hours at 37°C (Verdugo et al., 2019).

The blunt-end repair step followed immediately and was performed using the NEBNext End Repair Module: the DNA extract was mixed with NEBNext End Repair Reaction Buffer (10X, 7 μl), NEBNext End Repair Enzyme Mix (3.5 μl) and nuclease-free water (38.25 μl; for a final reaction volume of 70 μl) and incubated for 15 minutes at 25°C followed by 5 minutes at 12°C. Deviating from this procedure, two libraries (SL2.12_MU and SL3.12_MU of sample VC3-2) were produced using 20 μl DNA extract and without USER™ enzyme treatment of the DNA extract (see Table S1). In the adapter ligation step hybridised adapters P5 and P7 (Meyer and Kircher, 2010) were used at a concentration of 0.75 μM. 3 μl of Fill-In product (total volume: 40 μl) were amplified using the AccuPrime™ Pfx SuperMix (20 μl) in one PCR parallel adding unique and sample-specific index combinations to the library molecules (final reaction volume: 25 μl; final primer concentration: 200 nM or 160 nM each). Double indexing followed Kircher et al. (2012), but using index sequences from the NexteraXT index Kit v2 (Illumina; barcode length 8 bp; ordered at IDT, Leuven, Belgium). The PCR was performed in 9-13 cycles; the PCR temperature profile followed the manufacturer’s recommendations but using an annealing temperature of 60°C, extending for 30 seconds during each cycle and performing a final elongation step for 5 minutes.

For purification during library preparation the MinElute PCR Purification Kit was used, while amplified libraries were purified with the MSB® Spin PCRapace. Libraries were quantified by Qubit Fluorometric quantitation (dsDNA HS assay) and measurement on the Agilent 2100 Bioanalyser System (High Sensitivity DNA Analysis).

Blank controls, as well as positive controls (nonsense hybrids) of known concentration, were processed in every library step including PCR amplification to verify the success of the library preparation and to monitor contamination. Quantification of blank controls did not indicate significant contamination during any laboratory step (pulverisation, DNA extraction, library preparation and amplification).

Sample Screening

Using shallow shotgun sequencing on an Illumina MiSeq™ platform at StarSEQ GmbH (Mainz, Germany), all samples were screened for their endogenous DNA preservation. Libraries were equimolarly pooled, subsequently purified with Agencourt® AMPure® XP beads and sequenced in single-end runs with 50 bp read length. Demultiplexing was performed by the sequencing facility using the MiSeq Reporter and allowing one mismatch in the barcode. Blank controls from DNA extraction, library preparation and PCR-steps were sequenced alongside to estimate the fraction of potentially contaminating reads introduced in the lab (for details see Table S1).

Whole-genome sequencing

For deeper shotgun sequencing, 2-5 DNA extracts and 3-9 libraries of each sample were prepared (as detailed above). The libraries were amplified in up to 13 PCR parallels, each with an individual index combination, to increase the complexity of the libraries. Subsequently, PCR parallels were purified and quantified individually as described above. For sequencing, PCR parallels were pooled equimolarly according to their concentrations measured on Qubit® and purified with Agencourt® AMPure® XP beads. The samples were sequenced on an Illumina HiSeq3000 (SE, 100 cycles or PE, 150 cycles) at the Next Generation Sequencing Platform at the University of Berne, Switzerland. A summary and details of the laboratory work and the sequencing strategy for each sample can be found in Table S1.

Bioinformatics pipeline

The bioinformatic steps within this section were conducted with the following programs and versions unless specified otherwise:

Raw data handling

We committed to processing all 102 samples (15 ancient genomes from this study, 10 previously published ancient genomes from Brace et al., 2019; Broushaki et al., 2016; Gamba et al., 2014; Günther et al., 2018; Hofmanová et al., 2016; Jones et al., 2015; Kılınç et al., 2016; Lazaridis et al., 2016, and 77 modern SGDP genomes from Mallick et al., 2016; see Table D1_1 and Table S2 for the complete list) with the same bioinformatic pipeline, albeit with minor changes due to variation in how the raw data were obtained. We therefore first describe the pipeline used for the 15 samples sequenced in this study, followed by a description of the differences in how previously published data were analysed:

The sequencing quality was checked with fastqc. Raw reads were trimmed using Trim Galore with no quality filter and a length filter of 30 bp (-q0, --length 30, -a 'AGATCGGAAGAGCACACGTCTGAACTCC'). For paired-end libraries, the additional option --retain_unpaired was used and the second adapter sequence (-a2 'AGATCGGAAGAGCGTCGTGTAGGGAAAG') was provided. A second fastqc analysis was performed to verify trimming and the quality of the remaining sequences. Reads were then aligned to the 1000 Genomes project version of the human reference genome hs37d5 (1000 Genomes Project Consortium et al., 2015) using bwa mem with options -t 8 -M. Reads with a mapping quality below 30 were filtered out with SAMtools -q30. Duplicate reads were marked but not removed using picard-tools MarkDuplicates with VALIDATION_STRINGENCY=SILENT and AS=TRUE. Unmapped reads were removed with SAMtools with option -F4. Informative read groups were added to keep track of individual PCR-parallels with picard-tools AddOrReplaceReadGroups. All necessary sorting and indexing in between the steps was performed with SAMtools index and sort, and library files from the same samples were merged using SAMtools merge.

Local Realignment

Local indel-realignment is an important but computationally expensive step in our pipeline. In order to allow for parallelization, we proceeded with the following three steps:

  • (i)

    We identified potential target intervals using GATK RealignerTargetCreator, on a set of 12 ancient and 13 modern samples (Asp6, Bar25, Dil16, Ess7, Klein7, LEPE48, Nea2, Nea3, STAR1, VC3-2, VLASA32, VLASA7, French-1, Sardinian-1, Russian-1, Spanish-1, Georgian-1, Hungarian-1, Icelandic-1, Finnish-1, English-1, Greek-1, Mende-1, Estonian-1, Polish-1), while providing known indel sites identified by the 1000 Genomes project (see Key Resources table). We refer to the obtained target set as TargetsBase.

  • (ii)

    Using ATLAS task=downsample, we created a set of reads by downsampling the following 5 ancient and 5 modern samples to a depth of 4X: Bar25, Klein7, STAR1, VLASA32, VLASA7, French-1, Georgian-1, Finnish-1, Greek-1, Polish-1. We refer to this set as the GuidanceSet.

  • (iii)

    We ran local indel-realignment for each sample individually as follows: we identified private target positions for the sample using GATK RealignerTargetCreator while providing the known indel sites from the 1000 Genomes project (as above). We then unified the identified targets with TargetsBase to obtain a sample-specific target set. Finally, we ran GATK IndelRealigner on this sample together with the GuidanceSet on the sample-specific targets. We parallelized this step additionally by contig, merging the output with picard-tools.
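The unification of private targets with TargetsBase in step (iii) can be pictured as a union of genomic intervals. The following is a minimal sketch only: the tuple representation, the function name `unify_targets` and the choice to fuse overlapping or adjacent intervals are assumptions made for illustration, and the text does not state how the merge was actually implemented.

```python
from typing import List, Tuple

# (contig, start, end) with 1-based inclusive coordinates (an assumption).
Interval = Tuple[str, int, int]


def unify_targets(base: List[Interval], private: List[Interval]) -> List[Interval]:
    """Union of two target lists, fusing intervals that overlap or touch."""
    merged: List[Interval] = []
    for contig, start, end in sorted(base + private):
        if merged and merged[-1][0] == contig and start <= merged[-1][2] + 1:
            prev = merged[-1]
            # Extend the previous interval instead of adding a new one.
            merged[-1] = (contig, prev[1], max(prev[2], end))
        else:
            merged.append((contig, start, end))
    return merged


# Hypothetical toy intervals, not taken from the study.
targets_base = [("1", 100, 150), ("2", 10, 20)]
private_targets = [("1", 140, 200), ("1", 500, 510)]
sample_targets = unify_targets(targets_base, private_targets)
```

In this toy case the two overlapping intervals on contig 1 collapse into one, so the sample-specific set contains three targets.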

Filter and Sample Statistics

To reduce spurious alignments, we used picard-tools FilterSamReads to filter out reads that contained soft-clipped bases previously identified with ATLAS task=assessSoftClipping or did not pass picard-tools ValidateSamFile. We further used SAMtools to keep only primary alignments and, for paired-end libraries, proper pairs.

We determined read counts, sequencing depth, endogenous DNA-content and other statistics using ATLAS task=BAMDiagnostics and SAMtools flagstat. These statistics are given per library-parallel and for the merged data for all 15 ancient samples presented here in Table S1.

ATLAS Pipeline

We used commit 6df90e7 of the snakemake pipeline ATLAS-Pipeline, which is available at bitbucket.org/wegmannlab/atlas-pipeline and summarised below. Configfiles used are made available at https://github.com/CMPG/originsEarlyFarmers.

Since Post-mortem-Damage (PMD) is most prevalent at read ends, the PMD pattern is different if a read spans the entire fragment or not. Following Kousathanas et al. (2017), we split single-end read groups by length using ATLAS task=splitRGbyLength with the option allowForLarger. We identified reads that span the entire fragment as those shorter than the maximum read length minus 5 bases to account for variation introduced by adapter-trimming. We then identified PMD patterns for each read group using ATLAS task=estimatePMD, providing the reference genome with fasta=hs37d5.fasta and limiting the analysis to the chromosomes 1-22, X and Y with option chr. PMD estimates are shown in Figure M1_1A. We performed base-quality recalibration as described in Kousathanas et al. (2017), using ATLAS task=recal on known ultraconserved sites obtained from https://ccg.epfl.ch/UCNEbase/data/download/ucnes/hg19_UCNE_coord.bed (Dimitrieva and Bucher, 2013). We assumed equal base frequencies (equalBaseFreq) and inferred recalibration parameters for each PCR-parallel (pmdFile) from all sites with a minimum depth of 2 (minDepth=2). Sequencing- and PCR-duplicates were not excluded from the estimation of recalibration parameters (keepDuplicates) to add additional power. A quality transformation by base quality recalibration was created with ATLAS task=qualityTransformation (Figure M1_2).

For paired-end sequenced read-groups, the estimated recalibration parameters were directly corrected in the BAM file using ATLAS task=recalBAM, providing the recal-parameters (recal) and the reference (fasta=hs37d5.fasta). Paired-end reads with corrected base qualities were then physically merged with ATLAS task=mergeReads to avoid counting overlapping bases twice in downstream analyses. Reads that did not pass picard-tools ValidateSamFile were omitted. Because physical merging changes the relative positions of bases within a read, we re-estimated PMD patterns as outlined above.

We called Bayesian genotypes in the reference genome with ATLAS task=callNew and parameters method=Bayesian prior=theta fixedTheta=0.001 infoFields=DP equalBaseFreq. The reference genome, PMD patterns and recal-patterns (for single-end read groups) were provided with fasta, pmdFile and recal, respectively. The first and last 2 bp of each read were ignored (trim5=2, trim3=2). We also created recalibrated BAM files from single-end libraries, which we used in any subsequent analysis in which ATLAS was not involved (e.g. contamination estimation).

The molecular sex was inferred using a script from Skoglund et al. (2013), version 0.4, based on the ratio of reads aligning to the X and Y chromosomes (Table 1; Figure M1_1B).

Contamination estimation

Blank controls of extraction, library-preparation and index-PCR were analysed alongside the screening process. The concentration of potential contaminants was never higher than 0.81 ng/μl. A Bioanalyser analysis (HS DNA, Agilent Technologies, Waldbronn, Germany) and the screening results for extraction- and library-controls confirm the detected DNA to consist of primer- and adapter-dimers with a maximum of 55 aligning reads per blank control (out of a potential share of 200,000 reads; see Table S1).

We used the ContamMix R-script estimate.R to estimate the amount of authentically mapping mitochondrial reads at 311 modern diagnostic marker positions. We ran ContamMix with the option --baseq20 and the following two input files that we generated from all reads in the recalibrated BAM file that mapped against the MT-Genome: i) An alignment of mitochondrial reads against their own consensus (--samFn). This consensus was constructed by iteratively mapping the reads with MIA (mia) and mapping the last iteration with MIA (mia) to obtain one FASTA-sequence. The MT-reads were mapped against their consensus using bwa aln and samse (v.0.7.17) and filtered for a minimum mapping quality of 30 with SAMtools -q30. ii) A multiple alignment of the consensus genome and a fasta-file containing the diagnostic marker positions against each other (--malnFn) obtained with mafft.

For male individuals, contamination was additionally estimated using the ANGSD contamination script based on haploid X-chromosomal regions (X:5000000-154900000). We ran ANGSD with the options -doCounts 1 and -iCounts 1, followed by the executable misc/contamination, providing the publicly available HapMap file HapMapChrx.gz.

The contamination estimates obtained with both methods (ANGSD and ContamMix) are shown in Figure M1_1C and Table S1.

Reference samples

To minimise reference biases (Günther and Nettelblad, 2019), we analysed all previously published samples with the same pipeline as outlined above. However, some adaptations were necessary for some samples:

  • Blank characters in read names (e.g. from ENA accession numbers) were removed to ensure the proper detection of optical duplicates.

  • Quality scores of Bichon FASTQ files were converted from Illumina 1.5 to Illumina 1.9 with seqtk seq -V 64.

  • Stuttgart and Loschbour were received as unaligned and untrimmed demultiplexed raw data in BAM file format and transformed to unaligned FASTQ format using picard-tools SamToFastq.

  • For Loschbour files, individual library information was not recoverable. Therefore, duplicates across all sequencing files were marked.

  • Raw sequencing data was not obtainable for Bon002 and SF12. For Bon002, we downloaded the FASTQ files that were produced from BAM files by ENA. For SF12, we used published BAM files with already physically merged reads and converted them back to FASTQ files using picard-tools SamToFastq. For both, we split the FASTQ files by run identifiers to treat each run separately throughout the pipeline. Since those reads were already pre-processed, we omitted adapter trimming, removed orphaned reads and otherwise followed the pipeline for single-end samples without splitting read groups by length.

  • For NE1, one of the FASTQ files was corrupted and only 25% of the contained reads could be recovered. As a result, we could only use 97% of the total number of reads generated for that sample (Gamba et al., 2014). We then demultiplexed the fixed FASTQ-files with a custom bash script, allowing one mismatch at the first position and one mismatch at any other position of the index.

  • Since the modern samples from the SGDP dataset were already aligned to the desired reference genome, we abstained from remapping them and analysed them alongside the ancient samples as described above, starting with filtering for MQ<30, marking optical and PCR duplicates and performing local realignment.

For each ancient sample, all pipeline results and differences from the standard pipeline are detailed in Table S1.

Data filtering

Individual VCFs obtained after Bayesian genotype calling were first filtered according to their read depth (DP): for each individual we excluded sites that had DP < 8 and those whose DP was more than 2.5 s.d. above or below the mean DP of that individual. To minimise the effects of outliers, the mean DP was calculated using the R function optimize for the sites within +/-10 of the mode DP and a tolerance of $10^{-6}$ (see Table S2). We also excluded sites that had poor genotype quality (GQ < 30), and we only kept autosomal sites. Furthermore, heterozygous sites were considered as homozygous if they had a significant allelic imbalance (p-value ≤ 0.1) tested by Fisher’s exact test. After filtering, the modern individuals had on average 2,549,812,798 sites passing the filters against 1,980,736,974 for the ancients (ranging from 333,061,590 to 2,544,170,685). All VCF file processing was performed using bcftools v0.1.15 (Danecek et al., 2021).
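As an illustration of the per-individual depth and genotype-quality filters, the sketch below computes depth bounds and applies them to single sites. It rests on several assumptions that the text does not specify: the centre is taken as a plain mean over sites within 10 of the mode DP (rather than through the R function optimize), the s.d. is computed over all sites, and the function names are hypothetical. The allelic imbalance test is not shown.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import List, Tuple


def depth_bounds(depths: List[int], window: int = 10,
                 n_sd: float = 2.5, min_dp: int = 8) -> Tuple[float, float]:
    """Return (low, high) depth bounds for one individual."""
    mode_dp = Counter(depths).most_common(1)[0][0]
    near_mode = [d for d in depths if abs(d - mode_dp) <= window]
    centre = mean(near_mode)   # assumption: plain mean around the mode
    spread = pstdev(depths)    # assumption: s.d. taken over all sites
    low = max(min_dp, centre - n_sd * spread)
    high = centre + n_sd * spread
    return low, high


def keep_site(dp: int, gq: int, low: float, high: float, min_gq: int = 30) -> bool:
    """True if a site passes both the depth and the genotype-quality filter."""
    return low <= dp <= high and gq >= min_gq


# Hypothetical depths with two extreme outliers.
toy_depths = [20] * 50 + [19] * 20 + [21] * 20 + [200] * 2
low, high = depth_bounds(toy_depths)
```

With these toy values the lower bound is set by the hard DP < 8 cut-off, and the two sites at depth 200 fall above the upper bound.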

Applying these filters led to a substantial increase in the quality of our heterozygous sites, reducing the number of sites with high allelic imbalances and decreasing the asymmetry between singleton and non-singleton sites to a minimum (Figures M1_3 and M1_4). To quantify this improvement explicitly, let us denote by $H_i^s$ the set of loci at which individual $i$ was called heterozygous either for a singleton (only non-reference allele across all samples, $s = 1$) or not ($s = 0$). We then define the imbalance statistic

$$r_i^s = \frac{\sum_{l \in H_i^s} I\left(\frac{r_{il}}{d_{il}} > 0.5\right)}{\sum_{l \in H_i^s} I\left(\frac{r_{il}}{d_{il}} < 0.5\right)},$$

where $r_{il}$ denotes the number of reference alleles observed in individual $i$ at site $l$ out of the total number of observed alleles $d_{il}$ at this site, and $I(\cdot)$ is the indicator function. To compare singleton ($s = 1$) against non-singleton ($s = 0$) positions, we further quantify $\rho = r_i^1 / r_i^0$. As shown in Figure M1_4, the chosen filters guarantee comparable imbalances at singleton and non-singleton sites, indicating similar quality of both categories. A striking outlier was sample CarsPas1, which was not included in any demographic analysis.
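The imbalance statistic and the singleton to non-singleton ratio can be computed directly from per-site allele counts. The following is a minimal sketch: the function name and the toy counts are hypothetical, and sites with exactly balanced counts contribute to neither sum, as implied by the strict inequalities.

```python
from typing import Iterable, Tuple


def imbalance(sites: Iterable[Tuple[int, int]]) -> float:
    """Ratio of reference-heavy to alternative-heavy heterozygous sites.

    Each site is (r_il, d_il): reference allele count and total allele count.
    """
    ref_heavy = 0
    alt_heavy = 0
    for ref_count, depth in sites:
        if depth == 0:
            continue  # no observed alleles, nothing to classify
        fraction = ref_count / depth
        if fraction > 0.5:
            ref_heavy += 1
        elif fraction < 0.5:
            alt_heavy += 1
    return ref_heavy / alt_heavy if alt_heavy else float("inf")


# Hypothetical heterozygous calls for one individual.
singleton_sites = [(7, 10), (6, 10), (3, 10), (5, 10)]
non_singleton_sites = [(6, 10), (4, 10), (5, 10)]
rho = imbalance(singleton_sites) / imbalance(non_singleton_sites)
```

A value of the ratio close to 1 corresponds to the situation described above, in which singleton and non-singleton sites show comparable imbalance.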

Dataset assembling

For further downstream analysis we assembled the following datasets:

  • “Ancient”: The filtered individual VCFs from the 25 ancient individuals were merged into a single VCF polarised with the Chimpanzee reference genome (panTro4 from The Chimpanzee Sequencing Analysis Consortium, 2005). We only kept the biallelic polymorphic sites (6,986,216) for which individuals can have missing genotypes (pipeline made available in our GitHub repository: https://github.com/CMPG/originsEarlyFarmers).

  • “102samples”: Same as Ancient but with all 102 ancient and modern samples, keeping 14,896,103 biallelic polymorphic sites. On average, the modern genomes show 4% of missing sites with a maximum of 6%, the ancients 29%, ranging from 3 to 87%.

  • “Neutral”: In order to avoid biases due to background selection (BGS) and biased gene conversion (BGC) when estimating population diversity and relationships (Matthey-Doret and Whitlock, 2019), we performed most of our genetic and demographic analyses on a “neutral” portion of the genome (Pouyet et al., 2018). We defined this portion as the restricted set of sites from the 102samples dataset that had the same reference allele in the chimpanzee and gorilla reference genomes, that were in regions with recombination rate >1 cM/Mb where BGS has little effect, that were outside of CpG islands, that were not CpG sites (CpG sites being a C followed by a G in the chimpanzee and gorilla genomes with a T as alternative allele in our data, or a G preceded by a C in the chimpanzee and gorilla genomes with an A as alternative allele), and that carried A↔T or G↔C mutations, which are not affected by BGC or post-mortem damage. We kept 473,834 biallelic sites. We obtained the Chimpanzee and the Gorilla hg19 nucleotide states from the chained and netted alignments http://hgdownload.cse.ucsc.edu/goldenpath/hg19/vsPanTro4/axtNet/ and http://hgdownload.cse.ucsc.edu/goldenpath/hg19/vsGorGor5/axtNet/, respectively; the recombination map for the YRI population from the 1000 Genomes Project (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20130507_omni_recombination_rates/); the CpG islands listing from the UCSC genome browser (CpG Islands cpgIslandExt Track). Filtering was performed in R, with VCF files read using the SNPRelate package (Zheng et al., 2012).

  • “Phased”: We performed genotype imputation and phasing for each chromosome on the 102samples dataset that was polarised back with the hs37d5 human reference genome. We used SHAPEIT4 v1.2 (Delaneau et al., 2019) with default parameters for sequencing data, the HapMap phase II b37 genetic map provided in SHAPEIT4 folders, and the Haplotype Reference Consortium (McCarthy et al., 2016) dataset (accession number EGAD00001002729 on the European Genome-phenome Archive) as reference panel. All the chromosomal phased-imputed VCFs were then concatenated into a single phased VCF with bcftools v1.9, which includes 2,795,127,079 sites in total.

  • “1240k”: We compared the samples obtained in this study with previously published, target-enriched datasets available through Allen Ancient DNA Resource v42.4 (AADR) at https://reichdata.hms.harvard.edu/pub/datasets/amh_repo/curated_releases/index_v42.4.html. We used the AADR population labels, but split Peleponnese_N into Diros_EN and Peleponnese_LN. To ensure our data are comparable to the pseudo-haploid calls, we generated majority calls for all our ancient individuals with ATLAS (task=majorityBase; commit 7cfc900) at the 1,233,013 SNP positions present in this reference dataset. We then combined the calls of the 25 ancient genomes with the AADR and refer to this dataset as 1240k in the following.

  • “1240k_HOIll”: We extracted the non-functional HOIll subset of sites (Lazaridis et al., 2016) for the Ancient dataset. HOIll sites were also extracted for additional samples (Altai, Chimp, Vindija and Dinka) retrieved from the target-enriched dataset available through Allen Ancient DNA Resource v37.2 (AADR) at https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data.

Phenotype predictions

Pigmentation

Pigmentation phenotypes of hair, skin and eyes were predicted with the HIrisPlex-S webtool (Chaitanya et al., 2018; Walsh et al., 2013) for each of the newly sequenced samples and the previously published samples. In case of missing genotypes for any of the 41 SNPs used by HIrisPlex-S, the sites with low or no depth in the Ancient dataset were examined directly in the BAMs and the most abundant allele was chosen (Table S3). In BAMs, we only considered bases at least 3 bases away from either end of the read, with a base quality ≥ 25, and no C↔T and G↔A SNPs to avoid any effect of PMD on the prediction. To account for the uncertainty associated with low-coverage sites, two HIrisPlex-S input files were created for each individual: one in which the missing genotype consisted of the allele taken from the BAM and the non-effect allele and one where the genotype was comprised of the allele taken from the BAM and the effect allele. Running HIrisPlex-S twice for each sample resulted in ranges of probabilities for each phenotype (Table S3). A prediction was accepted without further explanation if in both runs the same phenotype showed a probability ≥ 0.7 (Chaitanya et al., 2018). If predictions differed between runs, the most parsimonious phenotype was chosen, following the approach of Walsh in Brace et al. (2019) (Figure M1_9).
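The acceptance rule for the two HIrisPlex-S runs per individual can be expressed as a small decision function. This is a sketch only: the dictionary input format, the function name and the phenotype labels are hypothetical, and the manual choice of the most parsimonious phenotype when runs disagree is not modelled (the function simply returns None in that case).

```python
from typing import Dict, Optional


def accept_prediction(run_a: Dict[str, float], run_b: Dict[str, float],
                      threshold: float = 0.7) -> Optional[str]:
    """Return the phenotype supported at >= threshold in both runs, else None."""
    top_a = max(run_a, key=run_a.get)
    top_b = max(run_b, key=run_b.get)
    if top_a == top_b and run_a[top_a] >= threshold and run_b[top_b] >= threshold:
        return top_a
    return None  # needs case-by-case evaluation


# Hypothetical probabilities for the two runs (non-effect vs. effect allele filled in).
consistent = accept_prediction({"brown": 0.91, "blue": 0.09},
                               {"brown": 0.78, "blue": 0.22})
ambiguous = accept_prediction({"brown": 0.91, "blue": 0.09},
                              {"brown": 0.45, "blue": 0.55})
```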

Standing height

To assess genetic variation related to standing height, a classical highly heritable polygenic trait (McEvoy and Visscher, 2009), polygenic scores (PS) were computed based on a set of 670 SNPs (Chan et al., 2015) on the Ancient dataset (Figure M1_10). To account for missing data, which are common even in high-depth ancient genomes, we applied the generalised risk score approach described by Veeramah et al. (2018) to samples where enough SNPs were present to account for at least 75% of the score's effect size. This led to the exclusion of five ancient individuals (Bar8, Bon002, Bichon, CarsPas1, AKT16) out of the 25 included in this study.

Other phenotypes

Genotypes for SNPs associated with additional phenotypes of interest were inspected manually for each sample in the Ancient dataset, or in the BAM files if necessary: rs4988235, a variant in the MCM6 gene associated with lactase persistence in Eurasia; rs3827760 in the EDAR gene; rs17822931 in ABCC11; and seven SNPs located in the FADS1/2 gene complex (listed in Table M1_2). A two-sided binomial test was used to compare counts of derived alleles in the ancient individuals with frequencies estimated in the CEU population from phase 3 of the 1000 Genomes Project (1000 Genomes Project Consortium et al., 2015), chosen as a proxy for Central European modern populations (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/).
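For illustration, an exact two-sided binomial test of an observed derived-allele count against a reference frequency can be written with the standard library alone. The sketch below uses one common definition of the two-sided p-value (the summed probability of all outcomes no more likely than the observed one); the software used for the test is not stated in the text, and the counts in the example are hypothetical.

```python
from math import comb


def binom_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value for k derived alleles out of n."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties with the observed outcome are counted.
    cutoff = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Hypothetical example: 0 derived alleles among 10 ancient chromosomes,
# compared with a derived-allele frequency of 0.5 in the reference population.
p_value = binom_two_sided(0, 10, 0.5)
```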

Quantification and statistical analysis

Population genetics analyses

Uniparental haplogroup determination

Mitochondrial haplogroups were determined from the BAM files for the 15 newly-sequenced genomes using phy-mer (Navarro-Gomez et al., 2015) with K-mer minimal number of occurrences = 10 (Table S3).

For the samples genetically identified as men, Y-chromosomal haplogroups were determined using Yleaf (Ralf et al., 2018) with minimal base-quality of 20 (-q 20) and base-majority to determine an allele set to 90% (-b 90).

Neanderthal introgression

We quantified proportions of Vindija Neanderthal (Prüfer et al., 2017) introgression by computing f4-ratio statistics of the form $\frac{f_4(\text{Altai}, \text{Chimp}; X, \text{Dinka})}{f_4(\text{Altai}, \text{Chimp}; \text{Vindija}, \text{Dinka})}$, as previously suggested (Petr et al., 2019), with qpF4ratio from the ADMIXTOOLS package (Patterson et al., 2012) on the 1240k_HOIll dataset.
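To make the f4-ratio concrete, the sketch below computes it from per-site allele frequencies using the usual definition of f4 as the average product of allele-frequency differences. It is not a substitute for qpF4ratio, which additionally provides block-jackknife standard errors; the function names and the toy frequencies are hypothetical.

```python
from typing import Sequence


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over sites of (a - b) * (c - d)."""
    n = len(a)
    return sum((a[i] - b[i]) * (c[i] - d[i]) for i in range(n)) / n


def f4_ratio(altai, chimp, x, vindija, dinka) -> float:
    """Estimated Neanderthal ancestry proportion in X."""
    return f4(altai, chimp, x, dinka) / f4(altai, chimp, vindija, dinka)


# Toy allele frequencies at four sites; X is built as a 2% Vindija, 98% Dinka mixture.
altai = [1.0, 0.0, 1.0, 1.0]
chimp = [0.0, 0.0, 0.0, 1.0]
vindija = [1.0, 0.0, 1.0, 0.0]
dinka = [0.0, 0.0, 0.5, 0.0]
x = [0.02 * v + 0.98 * d for v, d in zip(vindija, dinka)]
alpha = f4_ratio(altai, chimp, x, vindija, dinka)
```

Because f4 is linear in the allele frequencies of X, the ratio recovers the mixture proportion used to build the toy population.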

Genomic heterozygosity

We estimated the level of genetic diversity of each individual as the proportion of heterozygous sites found in the neutrally evolving portion of the genome (Figure 2C; Table S2). Thus, for every genome, we divided the number of heterozygous sites observed in the Neutral dataset by the expected number of neutral sites genotyped for this individual. This number is obtained by considering all the genotyped sites for the considered individual that i) have the same reference allele in both the chimpanzee and gorilla reference genomes, ii) are not CpG sites and lie outside of CpG islands, and iii) are in regions with recombination rate > 1 cM/Mb based on Pouyet et al. (2018). Finally, in order to consider only sites unaffected by BGC, we divided the number of these previously defined sites by three, as only one third of all mutations should be BGC free (i.e. A↔T and G↔C polymorphisms).
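The arithmetic behind this estimate is simple enough to state as code. In the sketch below, the function name and the example counts are hypothetical; only the division of the candidate sites by three follows the description above.

```python
def neutral_heterozygosity(n_het_neutral: int, n_candidate_sites: int) -> float:
    """Heterozygous neutral sites divided by the expected number of neutral sites.

    n_candidate_sites: genotyped sites passing criteria i) to iii); one third of
    them is taken as the expected number of BGC-free (A<->T, G<->C) sites.
    """
    expected_neutral = n_candidate_sites / 3
    return n_het_neutral / expected_neutral


# Hypothetical counts for one genome.
het = neutral_heterozygosity(100, 300_000)
```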

Runs of homozygosity (ROHs)

ROHs were identified on the Phased dataset in the imputed genomes of 90 European and SW Asian modern and ancient individuals using IBDSeq v. r1206 (Browning and Browning, 2013) with default parameters except errormax = 0.005 and ibdlod = 2. We further processed the artificially long tracts spanning assembly gaps or centromeres (genomic locations were obtained from UCSC genome browser: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gap.txt.gz) and split them into shorter tracts excluding the gap stretch, inspired by Sikora et al. (2017). As advised when genomes do not come from a homogeneous population (Browning and Browning, 2013), we focused on intermediate (2-10 Mb) and long (>10 Mb) ROHs similarly to Racimo et al. (2020).
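The post-processing of tracts spanning gaps, followed by length classification, can be sketched as below. This is an illustration under assumptions: coordinates are treated as 0-based half-open base-pair positions, the function names are hypothetical, and the handling of tract boundaries in the actual analysis may differ.

```python
from typing import List, Tuple

MB = 1_000_000
Tract = Tuple[int, int]  # (start, end) in bp, half-open (an assumption)


def split_at_gaps(tract: Tract, gaps: List[Tract]) -> List[Tract]:
    """Cut a tract around every gap it overlaps, dropping the gap stretch."""
    pieces = [tract]
    for gap_start, gap_end in sorted(gaps):
        next_pieces: List[Tract] = []
        for start, end in pieces:
            if gap_end <= start or gap_start >= end:
                next_pieces.append((start, end))  # no overlap with this gap
                continue
            if start < gap_start:
                next_pieces.append((start, gap_start))
            if gap_end < end:
                next_pieces.append((gap_end, end))
        pieces = next_pieces
    return pieces


def roh_class(start: int, end: int) -> str:
    """Classify a tract as 'long' (>10 Mb), 'intermediate' (2-10 Mb) or 'short'."""
    length = end - start
    if length > 10 * MB:
        return "long"
    if length >= 2 * MB:
        return "intermediate"
    return "short"


# A hypothetical 20 Mb tract spanning a 7 Mb gap.
pieces = split_at_gaps((0, 20 * MB), [(5 * MB, 12 * MB)])
classes = [roh_class(s, e) for s, e in pieces]
```

In this toy case an apparent long ROH becomes two intermediate ones once the gap stretch is excluded.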

Multi-Dimensional scaling (MDS)

Genetic relationships among individuals were estimated from pairwise average nucleotide divergence πXY (Nei and Li, 1979). For each pair of individuals X and Y, we identified the sites showing no missing data for the two individuals, and we computed their average nucleotide divergence πXY over these sites (Table S2). We considered whole-genome sites or neutral sites only (in this case, the number of differences was divided by the expected number of neutral sites as defined above in the heterozygosity paragraph). We then represented the relationships between modern and ancient genomes from Europe and SW Asia (for sites from the 102samples or Neutral data sets; Figure 2A; Figure M1_5A) or only the ancient genomes (for sites from the Ancient dataset; Figure M1_5B), using a classical multidimensional scaling (MDS) approach implemented in R (cmdscale function from stats package).
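As a minimal sketch of the pairwise statistic, the code below averages, over sites without missing data in either individual, the probability that one chromosome drawn from X differs from one drawn from Y. The genotype coding (0, 1 or 2 copies of the alternative allele, None for missing) and this particular per-site definition are assumptions; the exact implementation used for the analysis is not given in the text.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0, 1 or 2 alternative alleles; None if missing


def pairwise_divergence(gx: List[Genotype], gy: List[Genotype]) -> float:
    """Average per-site divergence between two diploid individuals."""
    total = 0.0
    n_sites = 0
    for a, b in zip(gx, gy):
        if a is None or b is None:
            continue  # keep only sites observed in both individuals
        px, py = a / 2, b / 2
        # Probability that one random allele from X differs from one from Y.
        total += px * (1 - py) + py * (1 - px)
        n_sites += 1
    if n_sites == 0:
        raise ValueError("no shared non-missing sites")
    return total / n_sites


# Hypothetical genotypes at four sites; the last site is missing in X.
pi_xy = pairwise_divergence([0, 2, 1, None], [2, 2, 1, 0])
```

A matrix of such values over all pairs of individuals is the kind of input expected by a classical MDS.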

Uniform manifold approximation and projection

In order to represent potentially complex relationships between 90 individuals (65 modern and 25 ancient samples) from Western Eurasia, we performed a Uniform Manifold Approximation and Projection (UMAP) dimension reduction on their genotypes obtained at 2035 neutral polymorphic sites (from the Neutral dataset) that did not present missing data for any individual. The analysis was performed using the umap function from the umap R package (using the options n_neighbors=3, n_epochs=1000, min_dist=0.1, negative_sample_rate=5, n_components=2, and metric="euclidean") (Figure M1_6).

Admixture clustering analyses

Admixture coefficients of each ancient genome were estimated using the R package LEA (Frichot and François, 2015) with parameters K = 2 or 3, alpha = 10, and number of repetitions = 5. We used the function snmf to calculate the fit (entropy) of each run and we plotted the admixture coefficients from the run with the smallest entropy value. The groups were defined in an unsupervised manner. In order to maximise the number of genomic sites to be used, we excluded for these analyses the three lowest-quality ancient genomes (Bichon, Bon002, Bar8) from the Ancient dataset (Figure 2B; Figure S3A).

f-statistics

We first used f-statistics to confirm that the assignment of samples to specific populations is not violated by the presence of shared drift with external samples. For this, we calculated all possible f-statistics on the 1240k dataset of the form of D(Individual 1 from the population tested, Individual 2 from the same population; Other samples, Outgroup) using ADMIXTOOLS (Patterson et al., 2012) and Mbuti as the outgroup. We used qpGraph from ADMIXTOOLS v7300 (Patterson et al., 2012) using 1,000 initial conditions to match specificities of the datasets.

To validate the model inferred by fastsimcoal2, we aimed at comparing f-statistics predicted under the fastsimcoal2 model against those calculated from the data. For this, we first created a full admixture graph carefully matching the model in Figure 3. To validate this graph, we used data simulated with fastsimcoal2 under the model shown in Figure 3 (see Methods S1, section "Demogenomic inference with fastsimcoal2 - Final Model"), which we fitted with qpGraph 100 times with 1,000 initial conditions each. However, qpGraph failed to identify a suitable set of branch lengths and admixture proportions to explain the simulated data, with all runs resulting in predicted f-statistics very different from those calculated on the simulated data, suggesting that the SFS captures more information of the full data than f-statistics do.

Since the full graph could not be fitted, we next aimed at manually simplifying the graph. Specifically, we removed admixture edges, until we identified the simplest graph (in terms of admixture edges) for which no major differences between the f-statistics predicted under the model and those calculated from the simulated data were detected. This graph was then re-fitted with qpGraph on both a subset of the Ancient dataset (the sites from the EurEF panel shown in Table M1_4) as well as on the 1240k dataset described above. In both cases, qpGraph was run 100 times with 1,000 initial conditions each. To present the fitted graphs and quantify the reliability of the fitting, we calculated the median and 90% quantiles for all branch lengths and admixture odds ratios across the 20 best graphs as judged by the final score. A more detailed description can be found in Methods S1.

Joint Distribution of Fitness Effects

To create a representative Neolithic model population, we aggregated all Neolithic samples that were newly sequenced, plus the WC1 genome (N = 14) from the 102samples dataset. Similarly, to create a representative modern population, we used the SGDP populations in the 102samples dataset that are in rough proximity to the ancient populations (Polish, Bergamo, Czech, French, Hungarian, Greek, Albanian, Bulgarian, and Turkish). We annotated the synonymous and nonsynonymous SNPs using ANNOVAR (Wang et al., 2010) with hg19/GRCh37 (included with ANNOVAR) as the reference genome. Ancestral alleles were determined based on the Ensembl Compara 71 genome FASTA files. To account for missing data, we projected the joint allele frequency spectra downward to 16 Neolithic chromosomes and 28 modern chromosomes, to roughly maximise the number of segregating synonymous SNPs.
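Downward projection of a frequency spectrum is commonly done by hypergeometric subsampling. The sketch below shows this for a one-dimensional, unfolded spectrum only; it illustrates the general technique rather than the joint two-population projection used here, and the function name and toy spectrum are hypothetical.

```python
from math import comb
from typing import List


def project_sfs(sfs: List[float], m: int) -> List[float]:
    """Project an unfolded SFS from n = len(sfs) - 1 chromosomes down to m."""
    n = len(sfs) - 1
    if m > n:
        raise ValueError("can only project downward")
    projected = [0.0] * (m + 1)
    for i, count in enumerate(sfs):
        # A site with i derived alleles among n shows j derived alleles among m
        # subsampled chromosomes with hypergeometric probability.
        for j in range(max(0, m - (n - i)), min(i, m) + 1):
            weight = comb(i, j) * comb(n - i, m - j) / comb(n, m)
            projected[j] += count * weight
    return projected


# Four hypothetical singleton sites observed in 4 chromosomes, projected to 2.
projected = project_sfs([0, 4, 0, 0, 0], 2)
```

The total number of sites is preserved, but some sites become monomorphic in the smaller sample, which is why the projection size trades off sample size against the number of segregating sites.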

The joint Distribution of Fitness Effects (DFE) analysis requires a simpler demographic model than our main inference (Figure 3), and simulations suggest that joint DFE analysis is robust to demographic model details (Huang et al., 2021). Based on prior results (Gravel et al., 2011), we used dadi (Gutenkunst et al., 2009) to fit a demographic model to the synonymous data in which the ancestors of the Neolithic population underwent a bottleneck to relative size $n_B$ followed by exponential growth (Figure M1_11A). The ancestors of the modern samples diverged $T_B$ time units after the bottleneck, following the same growth rate to reach final size $n_F$. $T_S$ time units after this divergence, Neolithic chromosomes were sampled, $T_F$ time units before present. We also included a parameter $p_{mis}$ to account for potential ancestral state misidentification (Ragsdale et al., 2016). To fit this model, we used grid points of [128, 138, 148], and the best-fit parameters were $n_B = 0.113$, $n_F = 1.42$, $T_B = 0.106$, $T_S = 0.0054$, $T_F = 0.0232$, $p_{mis} = 0.0634$.

We then fit a model including this demographic history plus a joint DFE to the nonsynonymous data (Huang et al., 2021). We modelled the DFE as a bivariate lognormal distribution. We assumed that the population-scaled mutation rate $\theta$ for nonsynonymous mutations was 2.31 times that for synonymous mutations, and we again included a parameter for ancestral state misidentification. We assumed that selection coefficients were potentially different in the ancestors of the modern individuals (Figure M1_11E). The best-fit parameters for the joint DFE were $\mu = 3.29$, $\sigma = 2.61$, and $\rho = 0.9968$, with 0 misidentification inferred.

Demographic Analyses

MSMC2

We used MSMC2 (Schiffels and Wang, 2020) to infer past effective sizes of ancestral populations and their split times for all high-quality ancient and some representative modern individuals (Table M1_3) on the Phased dataset. To ensure high data quality, especially high mappability and genotype quality, we followed Schiffels and Wang (2020) and used two masks: 1) per-chromosome mappability masks (um75-hs37d5.bed.gz) for the human reference genome hs37d5, downloaded from https://github.com/wangke16/MSMC-IM/tree/master/masks; and 2) sample-specific masks that we generated as suggested by Schiffels and Wang (2020). We then ran the generate_multihetsep.py script from MSMC-tools (https://github.com/stschiff/msmc-tools, commit 07bc8a9) to obtain single-sample, multi-sample, and paired-population input files for MSMC2.

Example command lines for two, four, or eight haplotypes:

generate_multihetsep.py --chr 1 --mask ind1.chr1.mask.bed --mask mappabilityMaskperChr/chr1_m75-hs37d5.bed ind1.chr1.phased.vcf > ind1.chr1.multihetsep.txt

generate_multihetsep.py --chr 1 --mask ind1.chr1.mask.bed --mask ind2.chr1.mask.bed --mask mappabilityMaskperChr/chr1_m75-hs37d5.bed ind1.chr1.phased.vcf ind2.chr1.phased.vcf > pop1.chr1.multihetsep.txt

generate_multihetsep.py --chr 1 --mask ind1.chr1.mask.bed --mask ind2.chr1.mask.bed --mask ind3.chr1.mask.bed --mask ind4.chr1.mask.bed --mask mappabilityMaskperChr/chr1_m75-hs37d5.bed ind1.chr1.phased.vcf ind2.chr1.phased.vcf ind3.chr1.phased.vcf ind4.chr1.phased.vcf > pop1_pop2.chr1.multihetsep.txt

We inferred past effective population size for each diploid individual separately (Figure M1_12) as well as using two samples per population if available (Figure S2), using command lines such as:

msmc2 -t11 -s -o ind1.2haps.msmc2 ind1.chr.multihetsep.txt

msmc2 -t11 -I 0,1,2,3 -s -o pop1.4haps.msmc2 pop1.chr.multihetsep.txt

All MSMC2 results were scaled using a mutation rate of 1.25 × 10⁻⁸ per base pair per generation (Scally and Durbin, 2012; Schiffels and Wang, 2020) and a generation time of 29 years (Fenner, 2005).
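As a minimal sketch of this scaling, the snippet below converts one MSMC2 time boundary and one coalescence rate into years and an effective population size, using the standard MSMC conversion (time divided by the mutation rate, then multiplied by the generation time; Ne as the inverse of twice the mutation rate times the coalescence rate). The input values are hypothetical and do not come from the study.

```python
MU = 1.25e-8   # mutation rate per base pair per generation (from the text)
GEN_TIME = 29  # years per generation (from the text)

def scale_time(scaled_time, mu=MU, gen_time=GEN_TIME):
    """Convert an MSMC2 time boundary (mutation-scaled) into years."""
    return scaled_time / mu * gen_time

def scale_ne(coal_rate, mu=MU):
    """Convert a scaled coalescence rate (lambda) into an effective population size."""
    return 1.0 / (2.0 * mu * coal_rate)

# Hypothetical single row of an MSMC2 output file: left_time_boundary and lambda.
left_time_boundary, lam = 1.0e-5, 4000.0
years = scale_time(left_time_boundary)
ne = scale_ne(lam)
print(f"{years:.0f} years ago, Ne ~ {ne:.0f}")
```

Because both quantities depend on the assumed mutation rate and generation time, a different choice of either value rescales the time axis and the population sizes without changing the shape of the inferred curve.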

To estimate split times between population pairs, we used MSMC2 1) to estimate coalescent rates among the samples of the first population, 2) to estimate coalescent rates among the samples of the second population, and 3) to estimate coalescent rates across the two populations. Example command lines used for two populations pop1 and pop2 were:

msmc2 -t11 -I 0,1,2,3 -s -o pop1.4haps.msmc2 pops.chr.multihetsep.txt

msmc2 -t11 -I 4,5,6,7 -s -o pop2.4haps.msmc2 pops.chr.multihetsep.txt

msmc2 -t11 -I 0-4,0-5,0-6,0-7,1-4,1-5,1-6,1-7,2-4,2-5,2-6,2-7,3-4,3-5,3-6,3-7 -s -o pop1-pop2.8haps.cross.msmc2 pops.chr.multihetsep.txt

We then used the MSMC2 script combineCrossCoal.py to create a single output file with all three rates:

combineCrossCoal.py pop1-pop2..cross.msmc2.final.txt pop1..msmc2.final.txt pop2..msmc2.final.txt > pop1-pop2.combined.msmc2.final.txt

For each population pair, we then plotted the relative cross-coalescence rate (CCR), which is estimated as the ratio of the across-population rate to the mean within-population rate (Figure M1_13; Schiffels and Wang, 2020). The relative CCR indicates when two populations were a single population (values around 1) and when they were well separated into two isolated populations (values close to zero) (Schiffels and Wang, 2020). However, translating the relative CCR into estimates of split times is difficult for two reasons. First, if a population split was followed by migration, the relative CCR will remain high even after the split. Second, the uncertainty associated with the different coalescence rates translates into a gradual change in the relative CCR, even under a hard split. As a rough estimate, Schiffels and Wang (2020) recommend taking the split time as the time when the relative CCR reaches 0.5, which we show in Figure M1_14 for all pairwise comparisons.
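The relative CCR and the 0.5 heuristic can be sketched as follows. The snippet assumes three coalescence rates per time interval (within population 1, across populations, within population 2) and uses linear interpolation between adjacent time points to locate the crossing. The function names, the interpolation choice, and the toy rows are assumptions made for illustration.

```python
def relative_ccr(lambda_00, lambda_01, lambda_11):
    """Across-population rate divided by the mean of the two within-population rates."""
    return 2.0 * lambda_01 / (lambda_00 + lambda_11)

def split_time_at(times, ccr, threshold=0.5):
    """Return the most recent time at which the relative CCR crosses the threshold.

    times must be increasing (present to past); linear interpolation is used
    between the two time points that bracket the crossing.
    """
    for k in range(1, len(times)):
        lo, hi = ccr[k - 1], ccr[k]
        if lo < threshold <= hi:
            frac = (threshold - lo) / (hi - lo)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None  # no crossing found in the supplied range

# Made-up rows (time in years, lambda_00, lambda_01, lambda_11), for illustration only.
rows = [
    (5000.0, 2.0, 0.1, 2.0),
    (10000.0, 2.0, 0.6, 2.0),
    (20000.0, 2.0, 1.4, 2.0),
    (40000.0, 2.0, 2.0, 2.0),
]
times = [r[0] for r in rows]
ccr = [relative_ccr(r[1], r[2], r[3]) for r in rows]
split = split_time_at(times, ccr)
print(split)
```

A single crossing time summarises a curve that rises gradually, so it is best read together with the full CCR trajectory, for the two reasons given in the text.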

To obtain confidence intervals around coalescence rate estimates, we generated 20 artificial genomes by block-bootstrapping MSMC2 input files in 5 Mb blocks (Figures M1_15 and M1_16), as suggested in Schiffels and Wang (2020).
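The idea behind block bootstrapping can be sketched in a few lines: the genome is cut into blocks, and each artificial genome is assembled by drawing the same number of blocks with replacement. The snippet below works on abstract block labels only; the function name, the labels, and the seed are hypothetical, and it is not the script used to produce the bootstrapped MSMC2 input files.

```python
import random

def bootstrap_genomes(block_ids, n_replicates=20, seed=0):
    """Resample genomic blocks with replacement to build artificial genomes."""
    rng = random.Random(seed)
    return [
        [rng.choice(block_ids) for _ in block_ids]  # same number of blocks as the original
        for _ in range(n_replicates)
    ]

# Hypothetical block labels: (chromosome, block index), each standing for one 5 Mb window.
blocks = [(chrom, i) for chrom in (1, 2) for i in range(10)]
replicates = bootstrap_genomes(blocks)
print(len(replicates), len(replicates[0]))
```

Resampling whole blocks rather than single sites preserves the local linkage structure that MSMC2 relies on, while still generating variation among replicates.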

fastsimcoal2

Demographic inferences were carried out with fastsimcoal2.7 (Excoffier et al., 2013, 2021) on six different panels of newly sequenced ancient individuals, using the neutral SFS and a neutral mutation rate adjusted for each panel from the basal mutation rate (Table M1_4).

Panel composition and properties: For each panel, we first excluded from the Ancient dataset i) sites with any missing data in any individual of the panel using BEDOPS (Neph et al., 2012), ii) sites with different reference alleles in the chimp and gorilla reference genomes, iii) CpG sites, iv) sites found in CpG islands, and v) sites in genomic regions with a recombination rate below 1 cM/Mb. We thus kept a total of TX sites for panel X, among which SX were polymorphic and MX monomorphic (the properties of each panel are given in Table M1_4). Second, in order to have neutral data best suited for demographic inference (Pouyet et al., 2018), we only kept the Sneutral_X BGC-free A↔T and G↔C polymorphic sites, which represented αX ≈ 0.2 of the filtered polymorphic sites for any panel. Since we would have expected them to represent one third of all filtered sites if mutation rates had been equal for all mutation types, we computed the ratio rX = αX/(1/3) = 3αX, which represents the reduction in mutation rate that has occurred at those sites in panel X. We then estimated the number of neutrally evolving monomorphic sites in each panel as Mneutral_X = MX/3, since we expect one third of all mutations to be BGC-free.
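The arithmetic in this paragraph can be checked with a short worked example. The snippet computes the share of BGC-free polymorphic sites, the ratio relative to the expected one third, and the estimated number of neutral monomorphic sites. The function name and all counts are made up for illustration; the real per-panel values are those of Table M1_4.

```python
def neutral_site_summary(s_filtered, s_neutral, m_filtered):
    """Compute alpha_X, r_X and the neutral monomorphic count for one panel."""
    alpha = s_neutral / s_filtered  # share of BGC-free A<->T and G<->C polymorphic sites
    r = alpha / (1.0 / 3.0)         # r_X = 3 * alpha_X
    m_neutral = m_filtered / 3.0    # M_neutral_X = M_X / 3
    return alpha, r, m_neutral

# Made-up counts for a hypothetical panel.
alpha, r, m_neutral = neutral_site_summary(
    s_filtered=1_000_000, s_neutral=200_000, m_filtered=300_000_000
)
print(alpha, r, m_neutral)
```

With a share of about 0.2, the ratio is about 0.6, meaning that the BGC-free class accumulates polymorphisms at roughly 60% of the rate expected if all mutation types were equally frequent.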

Parameter inference via maximum likelihood: Parameter estimates were obtained by maximising the model likelihood over 50 independent runs of fastsimcoal, 50 expectation conditional maximisation (ECM) cycles per run and 500,000 coalescent simulations per estimation of the expected SFS (except for the Core models for which 200,000 simulations were run). The command line used for the estimation was of the type:

  • fsc -t xxx.tpl -n500000 -d -e xxx.est -M -L50 -q -C5 --multiSFS --logprecision 18 -c1 -B1

where fsc is the fastsimcoal2.7 program (available on http://cmpg.unibe.ch/software/fastsimcoal2/) and xxx the generic name of the input files. The .est and .tpl input files are available in our GitHub repository: https://github.com/CMPG/originsEarlyFarmers.

Likelihood comparison and model choice: When several models were tested for the same panel, we retained the model with the highest estimated likelihood over 50 runs, and recorded the maximum likelihood (ML) parameters of this model. In order to take into account the variance in the estimation of the likelihoods due to the limited number of coalescent simulations (here 200,000 or 500,000) used to estimate the expected SFS, we also compared the likelihoods of the models estimated on the basis of 10 million coalescent simulations done under the ML parameters. We repeated this procedure 100 times per model to check whether the distributions of these likelihoods overlapped and were thus not distinguishable. We used the following command line to get these 100 likelihoods:

  • fsc -i xxx_maxL.par -R100 -n10000000 -d -u -C5 --logprecision 18 -q

Finally, we also computed the Akaike information criterion (AIC) (Akaike, 1974) and the relative likelihoods of the models assuming site independence, even though our “neutral” sites may be linked on some chromosomes.
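A minimal sketch of the AIC comparison is given below. It assumes that likelihoods are available on a log10 scale and converts them to natural logarithms before applying AIC = 2k − 2 ln L; the relative likelihood of each model is then exp((AIC_min − AIC_i)/2). The model names, likelihood values, and parameter counts are hypothetical.

```python
import math

def aic_from_log10(log10_lik, n_params):
    """AIC = 2k - 2 ln(L), with the likelihood supplied as log10(L)."""
    return 2.0 * n_params - 2.0 * log10_lik * math.log(10.0)

def relative_likelihoods(aics):
    """exp((AIC_min - AIC_i) / 2) for each model; the best model scores 1."""
    best = min(aics)
    return [math.exp((best - a) / 2.0) for a in aics]

# Hypothetical models: (log10 likelihood, number of parameters).
models = {"model_A": (-1000.0, 10), "model_B": (-1001.0, 8)}
aics = {name: aic_from_log10(lik, k) for name, (lik, k) in models.items()}
weights = dict(zip(aics, relative_likelihoods(list(aics.values()))))
print(aics, weights)
```

Because linked sites violate the independence assumption, likelihood differences computed this way tend to be overstated, which is one reason to also compare the likelihood distributions obtained from repeated simulations.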

Confidence intervals: Confidence intervals around ML parameter point estimates were obtained via a parametric bootstrap approach, for which we first generated 100 SFS covering Tneutral_X nucleotides using estimated ML parameters, with the command line:

  • fsc -i xxx.par -n100 -j -d -s0 -x -I -q -u

Then, for each of these bootstrapped SFS, we re-estimated the parameters of the model using 20 independent runs starting at the ML parameter values (option --initvalues in fastsimcoal2). We used 60 ECM cycles for each run and performed 500,000 simulations to estimate the expected SFS under a given set of parameter values and to estimate the model likelihood. The fastsimcoal2 command line used for the bootstrap was of the type:

  • fsc -t xxx.tpl -n500000 -d -e xxx.est --initvalues xxx.pv -M -L60 -q -C5 --multiSFS --logprecision 18

The limits of 95% confidence intervals were finally estimated by computing the 2.5% and 97.5% quantiles of the distribution of the 100 newly estimated ML parameter values.
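The percentile step can be sketched as below, using the 2.5% and 97.5% quantiles of a list of bootstrap estimates. Several quantile conventions exist; this sketch uses linear interpolation between order statistics, which is an assumption and not necessarily the convention used for the reported intervals. The bootstrap values are made up.

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def percentile_ci(estimates, level=0.95):
    """Return the (lower, upper) percentile confidence interval."""
    tail = (1.0 - level) / 2.0
    return quantile(estimates, tail), quantile(estimates, 1.0 - tail)

# 100 made-up bootstrap estimates of one parameter (here simply 1, 2, ..., 100).
boot = [float(i) for i in range(1, 101)]
low, high = percentile_ci(boot)
print(low, high)
```

With 100 bootstrap replicates, each limit is determined by only a few of the most extreme estimates, so the interval bounds themselves carry some sampling noise.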

Acknowledgments

We are grateful to Martina Unterländer and Aleksandra Žegarac for help with sample preparation. Lara Cassidy and Kay Prüfer kindly provided access to unpublished raw sequencing data. We thank Ourania Palli and Franz Pieler for useful archaeological information. We thank Katie Meheux for her careful proofreading of the manuscript and Jan Grenner for his voice on the soundtrack of Figure360. We thank the General Directorate of Cultural Heritage Rhineland-Palatinate, the Anthropological Department of the Natural History Museum Vienna, Michaela Harbeck, and the Bavarian State Collection for Anthropology and Palaeoanatomy, Munich, for providing skeletal samples. We also acknowledge the use of the sequencing platform at the University of Berne for whole-genome sequencing services and support, the IBU cluster of the University of Berne for NGS data analysis (https://www.bioinformatics.unibe.ch/), the UBELIX HPC cluster of the University of Berne (http://www.id.unibe.ch/hpc) for the demographic analyses, and the supercomputer Mogon of Johannes Gutenberg University Mainz (https://hpc.uni-mainz.de). We finally thank the three anonymous reviewers for their constructive comments, which greatly helped us to improve the manuscript. Funding: Swiss NSF grant no. 310030_188883 (L.E. and N.M.); Swiss NSF grant no. 31003A_173062 (D.W., I.S., V.L., and C.S.R.-B.); Swiss NSF grant no. 310030_200420 (D.W.); German Science Foundation BU 1403/6-1 (J. Burger, C.P., and S.K.); Humboldt foundation (J. Burger and C.P.); Greek-German bilateral agreement (GSRT and BMBF) project “BIOMUSE” 5030121 (C.P., J. Burger, L.W., E.G., and Y.D.); Serbian Ministry of Science project ID III47001 (S.S.); Czech Grant Agency (GACR) 21-17092X (Z.H.); National Institutes of Health grant R01GM127348 (R.N.G. and T.J.S.); ERC BIRTH project 640557 (S.S.); European Research Council under the European Union’s Horizon 2020 research and innovation program, ERC-2019-SyG n°856453 (Z.H.); Marie Skłodowska-Curie actions ITN “BEAN” (J. Burger, S.S., S.M.F., C.P., and Z.H.); Marie Skłodowska-Curie individual fellowship (M.B.: 793893 “ODYSSEA”); EMBO Long-Term Fellowship (Z.H.: ALTF 445-2017); Seal of Excellence Fund grant from the University of Berne (N.M.: SELF2018-04); Mainz University (L.W.).

Author contributions

Conceptualization, J. Burger, D.W., and L.E.; resources (samples, archaeological, and anthropological context), S.S., M.B., C.P., S.T., N.K., F.G., A.Z.-L., J. Pechtl, J. Peters, E.L., and M.T.-N.; data production, L.W., S.M.F., and S.K.; data curation, I.S., V.L., A.T., and A.K.; formal analyses, investigation, and visualization, N.M., L.E., A.K., E.G., T.J.S., R.N.G., V.P., J. Burger, J. Blöcher, Y.D., Z.H., I.S., A.P., C.S.R.-B., and D.W.; writing – original draft, N.M., M.B., D.W., J. Burger, and L.E.; writing – review & editing: all co-authors.

Declaration of interests

The authors declare no competing interests.

Published: May 12, 2022

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.cell.2022.04.008.

Contributor Information

Daniel Wegmann, Email: daniel.wegmann@unifr.ch.

Joachim Burger, Email: jburger@uni-mainz.de.

Laurent Excoffier, Email: laurent.excoffier@iee.unibe.ch.

Supporting citations

The following references appear in the supplemental information: Adam (2007); Alpaslan-Roodenberg and Roodenberg (2020); Alpaslan-Roodenberg (2011); Ambrose (1991); Ameur et al. (2012); Angel (1973); Arbuckle et al. (2014); Atakuman et al. (2020); Baird (2012); Baird et al. (2013, 2018); Bar-Yosef (2011); Beerli (2004); Beichman et al. (2017); Benjamin et al. (2017); Bickle (2018); Bickle et al. (2015); Bocquet-Appel and Bar-Yosef (2008); Bonsall et al. (2016); Bonsall et al. (1997, 2008); Borić (2009); Borić (2011); Borić (2016); Borić (2019); Borić and Dimitrijević (2007); Borić et al. (2008); Borić and Price (2013); Boulestin and Coupey (2015); Boulestin et al. (2009); Brami and Horejs (2019); Brink-Kloke (1990); Brotherton et al. (2013); Buckley et al. (2017); Budd et al. (2018); Budd et al. (2013); Burger et al. (2020); Çakırlar (2013); Carter (2016); Carter et al. (2019); Cavalli-Sforza et al. (2004); Chapman (2000); Çilingiroğlu and Çakırlar (2013); Çilingiroğlu et al. (2020); Clason (1980); Colledge et al. (2019); Cox et al. (2019); Cristiani et al. (2018); Davison et al. (2006); de Becdelièvre et al. (2020); de Manuel et al. (2016); Diaz-Papkovich et al. (2019); Diaz-Papkovich et al. (2021); Dietrich and Kociumaka (2001); Douka et al. (2017); Efstratiou et al. (2014); Ehrich (1977); Enattah et al. (2002); Esin et al. (2016); Fogel et al. (1989); Fujimoto et al. (2008); Fuller et al. (2006); Fumagalli et al. (2015); Gallego-Llorente et al. (2016); Gazal et al. (2014); Gerritsen and Özbal (2019); Gerritsen et al. (2020); González-Fortes et al. (2017); Haack (2016); Halstead and Isaakidou (2020); Harding et al. (2000); Hershkovitz et al. (1997); Hofmanová (2017); Ivanova et al. (2018); Jovanović et al. (2019); Jovanović (2017); Kartal (2003); Karul and Avci (2011); Kozłowski (1999); Kozłowski (2005); Krauß et al. (2018); Kreutzer (2017); Lagia et al. (2007); Lambeck (1995); Latreille et al. (2009); Lazaridis et al. (2017); Li et al. (2009); Malaspinas et al. (2016); Marchi and Excoffier (2020); Marchi et al. (2021); Marouli et al. (2017); Wild et al. (2004); Mateiciucová (2015); Mittnik et al. (2019); Nehlich et al. (2010); Neugebauer-Maresch and Lenneis (2015); Nieszery (1995); Ohashi et al. (2011); Olalde et al. (2018, 2019); Orschiedt and Haidle (2012); Ortner (2003); Özdoǧan (1997); Papathanasiou (2003); Papathanasiou (2011, 2015); Papoulia (2016); Park et al. (2012); Pechtl (2015); Perlès (2000); Perlès et al. (2013); Porčić et al. (2020); Pyke and Yiouni (1996); Quinlan and Hall (2010); Richards and Hedges (1999); Riedhammer (2019); Rodden (1964, 1965); Rodden et al. (1962); Rodden and Rodden (1964); Rosenstock et al. (2019); Runnels (1988); Runnels and Özdoğan (2001); Schoeninger and DeNiro (1982); Shang et al. (2013); Slatkin (2005); Stadler (2015); Stefanović (2016); Stefanović and Borić (2008); Tasić et al. (2015); Tasić et al. (1990, 2016); Teschler-Nicola (2012); Tiefenböck and Teschler-Nicola (2015); Tourloukis and Harvati (2018); Triantaphyllou (2001); Turck (2019); Vaiglova et al. (2014); van Andel and Runnels (1995); Vigne et al. (2012); Wahlund (1928); Weiberg et al. (2019); Weninger et al. (2014); Whittle et al. (2002); Windl (1999, 2009); Xue et al. (2009); Yamaguchi et al. (2012); Zeder (2017); Zeeb-Lanz (2016); Zeeb-Lanz (2019a); Zeeb-Lanz (2019b); Zoledziewska et al. (2015).

Supplemental information

Table S1. Sample processing including detailed information on library preparation, pipeline statistics and read statistics for sequenced libraries and ancient reference samples, related to Methods S1 and STAR Methods
mmc1.xlsx (519.9KB, xlsx)
Table S2. Genomic and sampling information about all 102 genomes used in our study, i.e., 15 newly sequenced ancient genomes, 10 ancient genomes from the literature and 77 modern genomes selected within the SGDP panel, related to Figure 2, Methods S1, and STAR Methods

This table includes in particular the depths of coverage before and after filtering, the genomic heterozygosity shown in Figure 2B, and the average nucleotide pairwise differences between individuals used in Figure 2A and Figure M1_5.

mmc2.xlsx (785.2KB, xlsx)
Table S3. Detailed information for mtDNA and Y haplogroups, including quality information and defining markers, for newly-sequenced individuals (in black) and published individuals (in gray); and results for pigmentation phenotypes as reported by the HIrisPlex-S webtool for the newly-sequenced individuals (in black) and published individuals (in gray), including the raw results files as well as our combined interpretation, related to Methods S1
mmc3.xlsx (460.6KB, xlsx)
Table S4. Parameters inferred for six different panels, related to Figure 3 and Methods S1

For each panel, we provide the maximum estimated log-likelihood of all models tested, with the best supported model highlighted in red, their relative likelihood derived from the AIC, the values obtained for the different parameters and the 95% confidence intervals under the best supported models.

mmc4.xlsx (455.3KB, xlsx)
Table S5. Results of f-statistics in the form of D(Population1, Population2, Population3, Outgroup) where we used Mbuti as Outgroup and Population1–3 were samples analysed in the study and relevant reference samples and populations, related to Figures S1 and S4 and Methods S1
mmc5.xlsx (770.2KB, xlsx)
Methods S1. Bioinformatic pipeline, population genetics analyses and demographic inferences, related to STAR Methods, Figures 2, 3, and 4, Figures S1, S2, S3, and S4, and Tables S1, S2, S3, S4, and S5
mmc6.pdf (7.6MB, pdf)
Data S1. Archaeological background, related to Figure 1, Table 1, and STAR Methods
mmc7.pdf (25.8MB, pdf)
Figure360. Animation and narration of Figure 5, related to Figure 5
Download video file (43.5MB, mp4)

Data and code availability

  • Raw sequencing data (FASTQ files) and aligned BAM files generated in this study have been deposited at the European Nucleotide Archive (ENA: PRJEB50857) and are publicly available as of the date of publication. Individual accession numbers are listed in the key resources table. Filtered VCF files have been deposited at the European Variant Archive (EVA: PRJEB51919) and are publicly available as of the date of publication. Further archaeological information and additional analyses are available in Data S1 and Methods S1, together with the supplemental tables and figures.

  • All original code has been deposited at https://github.com/CMPG/originsEarlyFarmers and is publicly available as of the date of publication under DOI https://doi.org/10.5281/zenodo.6367517.

  • Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request.

References

  1. 1000 Genomes Project Consortium. Auton A., Brooks L.D., Durbin R.M., Garrison E.P., Kang H.M., Korbel J.O., Marchini J.L., McCarthy S., McVean G.A., et al. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393.
  2. Adam E. Looking out for the Gravettian in Greece. Paléo. 2007;19:145–158.
  3. Aimé C., Laval G., Patin E., Verdu P., Ségurel L., Chaix R., Hegay T., Quintana-Murci L., Heyer E., Austerlitz F. Human genetic data reveal contrasting demographic patterns between sedentary and nomadic populations that predate the emergence of farming. Mol. Biol. Evol. 2013;30:2629–2644. doi: 10.1093/molbev/mst156.
  4. Akaike H. A new look at the statistical model identification. IEEE Trans. Autom. Control. 1974;19:716–723.
  5. Alpaslan-Roodenberg M.S., Roodenberg J. In the light of new data: The population of the first farming communities in the eastern Marmara region. Prähistorische Z. 2020;95:48–77.
  6. Alpaslan-Roodenberg S.M. A preliminary study of the burials from late neolithic-early chalcolithic Aktopraklık. Anatolica. 2011;37:17–43.
  7. Ambrose S.H. Effects of diet, climate and physiology on nitrogen isotope abundances in terrestrial foodwebs. J. Archaeol. Sci. 1991;18:293–317.
  8. Ameur A., Enroth S., Johansson A., Zaboli G., Igl W., Johansson A.C.V., Rivas M.A., Daly M.J., Schmitz G., Hicks A.A., et al. Genetic adaptation of fatty-acid metabolism: a human-specific haplotype increasing the biosynthesis of long-chain omega-3 and omega-6 fatty acids. Am. J. Hum. Genet. 2012;90:809–820. doi: 10.1016/j.ajhg.2012.03.014.
  9. Angel J.L. In: Die Anfänge des Neolithikums vom Orient bis Nordeuropa. Teil VIIIa Anthropologie. Schwidetzky I., editor. Böhlau Verlag; 1973. Early Neolithic people of Nea Nikomedeia; pp. 103–112.
  10. Arbuckle B.S., Kansa S.W., Kansa E., Orton D., Çakırlar C., Gourichon L., Atici L., Galik A., Marciniak A., Mulville J., et al. Data sharing reveals complexity in the westward spread of domestic animals across Neolithic Turkey. PLoS One. 2014;9:e99845. doi: 10.1371/journal.pone.0099845.
  11. Atakuman Ç., Erdoğu B., Gemici H.C., Baykara İ., Karakoç M., Biagi P., Starnini E., Guilbeau D., Yücel N., Turan D., et al. Before the Neolithic in the Aegean: the Pleistocene and the Early Holocene record of Bozburun—Southwest Turkey. J. Isl. Coastal Archaeol. 2020 doi: 10.1080/15564894.2020.1803458.
  12. Baird D. In: A companion to the archaeology of the ancient Near East, 13,000--4000 BC. Potts D.T., editor. Wiley Online Library; 2012. The late Epipaleolithic, Neolithic, and chalcolithic of the Anatolian plateau; pp. 431–465.
  13. Baird D., Asouti E., Astruc L., Baysal A., Baysal E., Carruthers D., Fairbairn A., Kabukcu C., Jenkins E., Lorentz K., et al. Juniper smoke, skulls and wolves’ tails. The Epipalaeolithic of the Anatolian plateau in its South-west Asian context; insights from Pınarbaşı. Levant. 2013;45:175–209.
  14. Baird D., Fairbairn A., Jenkins E., Martin L., Middleton C., Pearson J., Asouti E., Edwards Y., Kabukcu C., Mustafaoğlu G., et al. Agricultural origins on the Anatolian plateau. Proc. Natl. Acad. Sci. USA. 2018;115:E3077–E3086. doi: 10.1073/pnas.1800163115.
  15. Bar-Yosef O. Climatic fluctuations and early farming in West and East Asia. Curr. Anthropol. 2011;52:S175–S193.
  16. Beerli P. Effect of unsampled populations on the estimation of population sizes and migration rates between sampled populations. Mol. Ecol. 2004;13:827–836. doi: 10.1111/j.1365-294x.2004.02101.x.
  17. Beichman A.C., Phung T.N., Lohmueller K.E. Comparison of single genome and allele frequency data reveals discordant demographic histories. G3 (Bethesda) 2017;7:3605–3620. doi: 10.1534/g3.117.300259.
  18. Benjamin J., Rovere A., Fontana A., Furlani S., Vacchi M., Inglis R.H., Galili E., Antonioli F., Sivan D., Miko S., et al. Late Quaternary sea-level changes and early human societies in the central and eastern Mediterranean Basin: an interdisciplinary review. Quat. Int. 2017;449:29–57.
  19. Bickle P. Stable isotopes and dynamic diets: the Mesolithic-Neolithic dietary transition in terrestrial central Europe. J. Archaeol. Sci.: Rep. 2018;22:444–451.
  20. Bickle P., Hofmann D., Bentley R.A., Hedges R., Hamilton J., Laiginhas F., Nowell G., Pearson D.G., Whittle A. In: Mitteilungen der Prähistorischen Kommission. von Kleinhadersdorf D.L.G., Neugebauer-Maresch C., Lenneis E., editors. Verlag der Österreichischen Akademie der Wissenschaften; 2015. The isotope results from Kleinhadersdorf; pp. 173–177.
  21. Bocquet-Appel J.-P., Bar-Yosef O. Springer Science and Business Media; 2008. The Neolithic demographic transition and its consequences.
  22. Bogaard A., Filipović D., Fairbairn A., Green L., Stroud E., Fuller D., Charles M. Agricultural innovation and resilience in a long-lived early farming community: the 1,500-year sequence at Neolithic to early Chalcolithic Çatalhöyük, central Anatolia. Anatol. Stud. 2017;67:1–28.
  23. Bollongino R., Nehlich O., Richards M.P., Orschiedt J., Thomas M.G., Sell C., Fajkosová Z., Powell A., Burger J. 2000 years of parallel societies in Stone Age Central Europe. Science. 2013;342:479–481. doi: 10.1126/science.1245049.
  24. Bonsall C., Boroneanț A., Simalcsik A., Higham T. In: Southeast Europe and Anatolia in prehistory. Essays in Honor of Vassil Nikolov on His 65th Anniversary. Bacvarov K., Gleser R., editors. Verlag Dr. Rudolf Habelt; 2016. Radiocarbon dating of Mesolithic burials from Ostrovul Corbului, southwest Romania; pp. 41–50.
  25. Bonsall C., Boroneanţ V., Radovanović I. Archaeopress; 2008. The Iron Gates in prehistory: new perspectives.
  26. Bonsall C., Lennon R., McSweeney K., Stewart C., Harkness D., Boronean V., Bartosiewicz L., Payton R., Chapman J. Mesolithic and early Neolithic in the iron gates: a palaeodietary perspective. J. Eur. Archaeol. 1997;5:50–92.
  27. Bonsall C., Vasić R., Boroneanț A., Roksandic M., Soficaru A., McSweeney K., Evatt A., Aguraiuja Ü., Pickard C., Dimitrijević V., et al. New AMS 14C dates for human remains from Stone Age sites in the Iron Gates reach of the Danube, southeast Europe. Radiocarbon. 2015;57:33–46.
  28. Borić D. In: Metals and societies: studies in honour of Barbara S. Ottaway. Kienlin T.L., Roberts B.W., editors. Verlag Dr. Rudolf Habelt; 2009. Absolute dating of metallurgical innovations in the Vinča Culture of the Balkans; pp. 191–245.
  29. Borić D. In: Beginnings – New Research in the Appearance of the Neolithic between Northwest Anatolia and the Carpathian Basin. Papers of the International Workshop 8th - 9th April 2009. Ciobotaru D., Horejs B., Krauß R., Krauß R., editors. Verlag Marie Leidorf GmbH; 2011. Adaptations and transformations of the Danube Gorges foragers (c. 13,000–5500 cal. BC): an overview; pp. 157–203.
  30. Borić D. Serbian Archaeological Society; 2016. Deathways at Lepenski Vir: patterns in mortuary practice.
  31. Borić D. Lepenski Vir chronology and stratigraphy revisited. Starinar. 2019:9–60.
  32. Borić D., Dimitrijević V. When did the “Neolithic package” reach Lepenski Vir? Radiometric and faunal evidence. Doc. Praehistorica. 2007;34:53–71.
  33. Borić D., French C., Dimitrijević V. Vlasac revisited: formation processes, stratigraphy and dating. Doc. Praehistorica. 2008;35:261–287.
  34. Borić D., Price T.D. Strontium isotopes document greater human mobility at the start of the Balkan Neolithic. Proc. Natl. Acad. Sci. USA. 2013;110:3298–3303. doi: 10.1073/pnas.1211474110.
  35. Boulestin B., Coupey A.-S. Archaeopress; 2015. Cannibalism in the linear pottery culture: the human remains from Herxheim.
  36. Boulestin B., Zeeb-Lanz A., Jeunesse C., Haack F., Arbogast R.-M., Denaire A. Mass cannibalism in the Linear Pottery Culture at Herxheim (Palatinate, Germany) Antiquity. 2009;83:968–982.
  37. Brace S., Diekmann Y., Booth T.J., van Dorp L., Faltyskova Z., Rohland N., Mallick S., Olalde I., Ferry M., Michel M., et al. Ancient genomes indicate population replacement in Early Neolithic Britain. Nat. Ecol. Evol. 2019;3:765–771. doi: 10.1038/s41559-019-0871-9.
  38. Bramanti B., Thomas M.G., Haak W., Unterlaender M., Jores P., Tambets K., Antanaitis-Jacobs I., Haidle M.N., Jankauskas R., Kind C.-J., et al. Genetic discontinuity between local hunter-gatherers and central Europe’s first farmers. Science. 2009;326:137–140. doi: 10.1126/science.1176869.
  39. Brami M., Horejs B., editors. Proceedings of the Neolithic Workshop held at 10th ICAANE in Vienna. Austrian Academy of Sciences Press; 2019. The central/western Anatolian farming frontier.
  40. Brink-Kloke H. Das linienbandkeramische Gräberfeld von Essenbach-Ammerbreite. Ldkr (Germania) 1990;68:427–481.
  41. Bronk Ramsey C.B. Bayesian analysis of radiocarbon dates. Radiocarbon. 2009;51:337–360.
  42. Brotherton P., Haak W., Templeton J., Brandt G., Soubrier J., Jane Adler C., Richards S.M., Der Sarkissian C., Ganslmeier R., Friederich S., et al. Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans. Nat. Commun. 2013;4:1764. doi: 10.1038/ncomms2656.
  43. Broushaki F., Thomas M.G., Link V., López S., van Dorp L., Kirsanow K., Hofmanová Z., Diekmann Y., Cassidy L.M., Díez-del-Molino D., et al. Early Neolithic genomes from the eastern Fertile Crescent. Science. 2016;353:499–503. doi: 10.1126/science.aaf7943.
  44. Browning B.L., Browning S.R. Detecting identity by descent and estimating genotype error rates in sequence data. Am. J. Hum. Genet. 2013;93:840–851. doi: 10.1016/j.ajhg.2013.09.014.
  45. Buckley M.T., Racimo F., Allentoft M.E., Jensen M.K., Jonsson A., Huang H., Hormozdiari F., Sikora M., Marnetto D., Eskin E., et al. Selection in Europeans on fatty acid desaturases associated with dietary changes. Mol. Biol. Evol. 2017;34:1307–1318. doi: 10.1093/molbev/msx103.
  46. Budd C., Karul N., Alpaslan-Roodenberg S., Galik A., Schulting R., Lillie M. Diet uniformity at an early farming community in northwest Anatolia (Turkey): carbon and nitrogen isotope studies of bone collagen at Aktopraklık. Archaeol. Anthropol. Sci. 2018;10:2123–2135.
  47. Budd C., Lillie M., Alpaslan-Roodenberg S., Karul N., Pinhasi R. Stable isotope analysis of Neolithic and Chalcolithic populations from Aktopraklık, northern Anatolia. J. Archaeol. Sci. 2013;40:860–867.
  48. Buitenhuis H., Peters J., Pöllath N., Stiner M.C., Munro N.D., Saritaş Ö. In: The early settlement at Aşıklı Höyük. Essays in Honor of Ufuk Esin. Özbaşaran M., Duru G., Stiner M., editors. Yayınları; 2018. The faunal remains from levels 3 and 2 of Aşıklı Höyük: evidence for emerging management practices; pp. 281–324.
  49. Burger J., Link V., Blöcher J., Schulz A., Sell C., Pochon Z., Diekmann Y., Žegarac A., Hofmanová Z., Winkelbach L., et al. Low prevalence of lactase persistence in Bronze Age Europe indicates ongoing strong selection over the last 3,000 years. Curr. Biol. 2020;30:4307–4315.e13. doi: 10.1016/j.cub.2020.08.033. [DOI] [PubMed] [Google Scholar]
  50. Çakırlar C. In: Barely surviving or more than enough? Groot M., Lentjes D., Zeiler J., editors. Sidestone Press; 2013. Rethinking Neolithic subsistence at the gateway to Europe with new archaeozoological evidence from Istanbul; pp. 59–79. [Google Scholar]
  51. Carter T. Obsidian consumption in the Late Pleistocene--Early Holocene Aegean: contextualising new data from Mesolithic Crete. Annu. Br. Sch. Athens. 2016;111:13–34. [Google Scholar]
  52. Carter T., Contreras D.A., Holcomb J., Mihailović D.D., Karkanas P., Guérin G., Taffin N., Athanasoulis D., Lahaye C. Earliest occupation of the Central Aegean (Naxos), Greece: implications for hominin and Homo sapiens’ behavior and dispersals. Sci. Adv. 2019;5:eaax0997. doi: 10.1126/sciadv.aax0997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Cavalli-Sforza L.L., Moroni A., Zei G. Princeton University Press; 2004. Consanguinity, inbreeding, and genetic drift in Italy. [Google Scholar]
  54. Ceballos F.C., Gürün K., Altınışık N.E., Gemici H.C., Karamurat C., Koptekin D., Vural K.B., Mapelli I., Sağlıcan E., Sürer E., et al. Human inbreeding has decreased in time through the Holocene. Curr. Biol. 2021;31:3925–3934.e8. doi: 10.1016/j.cub.2021.06.027. [DOI] [PubMed] [Google Scholar]
  55. Chaitanya L., Breslin K., Zuñiga S., Wirken L., Pośpiech E., Kukla-Bartoszek M., Sijen T., de Knijff P., Liu F., Branicki W., et al. The HIrisPlex-S system for eye, hair and skin colour prediction from DNA: introduction and forensic developmental validation. Forensic Sci. Int. Genet. 2018;35:123–135. doi: 10.1016/j.fsigen.2018.04.004. [DOI] [PubMed] [Google Scholar]
  56. Chan Y., Salem R.M., Hsu Y.-H.H., McMahon G., Pers T.H., Vedantam S., Esko T., Guo M.H., Lim E.T., et al., GIANT Consortium. Genome-wide analysis of body proportion classifies height-associated variants by mechanism of action and implicates genes important for skeletal development. Am. J. Hum. Genet. 2015;96:695–708. doi: 10.1016/j.ajhg.2015.02.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Chapman J. Routledge; 2000. Fragmentation in archaeology. People, places and broken objects in the prehistory of south eastern Europe. [Google Scholar]
  58. Childe V.G. Watts & Company; 1936. Man makes himself. [Google Scholar]
  59. Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005;437:69–87. doi: 10.1038/nature04072. [DOI] [PubMed] [Google Scholar]
  60. Çilingiroğlu Ç., Çakırlar C. Towards configuring the neolithisation of Aegean Turkey. Doc. Praehistorica. 2013;40:21–29. [Google Scholar]
  61. Çilingiroğlu Ç., Kaczanowska M., Kozłowski J.K., Dinçer B., Çakırlar C., Turan D. Between Anatolia and the Aegean: Epipalaeolithic and Mesolithic foragers of the Karaburun Peninsula. J. Field Archaeol. 2020;45:479–497. [Google Scholar]
  62. Clark P.U., Dyke A.S., Shakun J.D., Carlson A.E., Clark J., Wohlfarth B., Mitrovica J.X., Hostetler S.W., McCabe A.M. The Last Glacial Maximum. Science. 2009;325:710–714. doi: 10.1126/science.1172873. [DOI] [PubMed] [Google Scholar]
  63. Clason A.T. Padina and Starčevo: game, fish and cattle. Palaeohistoria. 1980;22:141–173. [Google Scholar]
  64. Coll Macià M., Skov L., Peter B.M., Schierup M.H. Different historical generation intervals in human populations inferred from Neanderthal fragment lengths and mutation signatures. Nat. Commun. 2021;12:5317. doi: 10.1038/s41467-021-25524-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Colledge S., Conolly J., Crema E., Shennan S. Neolithic population crash in northwest Europe associated with agricultural crisis. Quat. Res. 2019;92:686–707. [Google Scholar]
  66. Cook G.T., Bonsall C., Hedges R.E.M., McSweeney K., Boronean V., Pettitt P.B. A freshwater diet-derived 14C reservoir effect at the Stone Age sites in the Iron Gates gorge. Radiocarbon. 2001;43:453–460. [Google Scholar]
  67. Cox S.L., Ruff C.B., Maier R.M., Mathieson I. Genetic contributions to variation in human stature in prehistoric Europe. Proc. Natl. Acad. Sci. USA. 2019;116:21484–21492. doi: 10.1073/pnas.1910606116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Cristiani E., Radini A., Borić D., Robson H.K., Caricola I., Carra M., Mutri G., Oxilia G., Zupancich A., Šlaus M., et al. Dental calculus and isotopes provide direct evidence of fish and plant consumption in Mesolithic Mediterranean. Sci. Rep. 2018;8:8147. doi: 10.1038/s41598-018-26045-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Danecek P., Bonfield J.K., Liddle J., Marshall J., Ohan V., Pollard M.O., Whitwham A., Keane T., McCarthy S.A., Davies R.M., et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10 doi: 10.1093/gigascience/giab008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Davison K., Dolukhanov P., Sarson G.R., Shukurov A. The role of waterways in the spread of the Neolithic. J. Archaeol. Sci. 2006;33:641–652. [Google Scholar]
  71. de Becdelièvre C., Jovanović J., Hofmanová Z., Goude G., Stefanović S. In: Farmers at the frontier: a pan European perspective on Neolithisation. Gron K.J., Sørensen L., Rowley-Conwy P., editors. Oxbow Books; 2020. Direct insight into dietary adaptations and the individual experience of Neolithisation: comparing subsistence, provenance and ancestry of Early Neolithic humans from the Danube Gorges c. 6200–5500 cal BC; pp. 45–76. [Google Scholar]
  72. de Manuel M., Kuhlwilm M., Frandsen P., Sousa V.C., Desai T., Prado-Martinez J., Hernandez-Rodriguez J., Dupanloup I., Lao O., Hallast P., et al. Chimpanzee genomic diversity reveals ancient admixture with bonobos. Science. 2016;354:477–481. doi: 10.1126/science.aag2602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Delaneau O., Zagury J.F., Robinson M.R., Marchini J.L., Dermitzakis E.T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 2019;10:5436. doi: 10.1038/s41467-019-13225-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. DePristo M.A., Banks E., Poplin R., Garimella K.V., Maguire J.R., Hartl C., Philippakis A.A., del Angel G., Rivas M.A., Hanna M., et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 2011;43:491–498. doi: 10.1038/ng.806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Diaz-Papkovich A., Anderson-Trocmé L., Ben-Eghan C., Gravel S. UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts. PLoS Genet. 2019;15:e1008432. doi: 10.1371/journal.pgen.1008432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Diaz-Papkovich A., Anderson-Trocmé L., Gravel S. A review of UMAP in population genetics. J. Hum. Genet. 2021;66:85–91. doi: 10.1038/s10038-020-00851-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Dietrich H., Kociumaka C. Jungsteinzeitliche Befunde aus Steinheim: Stadt und Landkreis Dillingen ad Donau, Schwaben. Das Archäol. Jahr Bayern. 2001;2000:32–35. [Google Scholar]
  78. Dimitrieva S., Bucher P. UCNEbase--a database of ultraconserved non-coding elements and genomic regulatory blocks. Nucleic Acids Res. 2013;41:D101–D109. doi: 10.1093/nar/gks1092. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Douka K., Efstratiou N., Hald M.M., Henriksen P.S., Karetsou A. Dating Knossos and the arrival of the earliest Neolithic in the southern Aegean. Antiquity. 2017;91:304–321. [Google Scholar]
  80. Efstratiou N., Biagi P., Starnini E. The Epipalaeolithic site of Ouriakos on the island of Lemnos and its place in the Late Pleistocene peopling of the east Mediterranean region. Adalya. 2014;17:1–13. [Google Scholar]
  81. Ehrich R.W. In: Ancient Europe and the Mediterranean. Markotić V., editor. Aris & Phillips; 1977. Starčevo revisited; pp. 59–67. [Google Scholar]
  82. Enattah N.S., Sahi T., Savilahti E., Terwilliger J.D., Peltonen L., Järvelä I. Identification of a variant associated with adult-type hypolactasia. Nat. Genet. 2002;30:233–237. doi: 10.1038/ng826. [DOI] [PubMed] [Google Scholar]
  83. Ergun M., Tengberg M., Willcox G., Douché C. In: The early settlement at Aşıklı Höyük. Essays in Honor of Ufuk Esin. Özbaşaran M., Duru G., Stiner M., editors. Yayınları; 2018. Plants of Aşıklı Höyük and changes through time: first archaeobotanical results from the 2010–14 excavation seasons; pp. 191–217. [Google Scholar]
  84. Esin N.V., Esin N.I., Yanko-Hombach V. The Black Sea basin filling by the Mediterranean salt water during the Holocene. Quat. Int. 2016;409:33–38. [Google Scholar]
  85. Excoffier L. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol. Ecol. 2004;13:853–864. doi: 10.1046/j.1365-294x.2003.02004.x. [DOI] [PubMed] [Google Scholar]
  86. Excoffier L., Dupanloup I., Huerta-Sánchez E., Sousa V.C., Foll M. Robust demographic inference from genomic and SNP data. PLoS Genet. 2013;9:e1003905. doi: 10.1371/journal.pgen.1003905. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Excoffier L., Marchi N., Marques D.A., Matthey-Doret R., Gouy A., Sousa V.C. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics. 2021;37:4882–4885. doi: 10.1093/bioinformatics/btab468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Feldman M., Fernández-Domínguez E., Reynolds L., Baird D., Pearson J., Hershkovitz I., May H., Goring-Morris N., Benz M., Gresky J., et al. Late Pleistocene human genome suggests a local origin for the first farmers of central Anatolia. Nat. Commun. 2019;10:1218. doi: 10.1038/s41467-019-09209-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Fenner J.N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 2005;128:415–423. doi: 10.1002/ajpa.20188. [DOI] [PubMed] [Google Scholar]
  90. Fogel M.L., Tuross N., Owsley D.W. Nitrogen isotope tracers of human lactation in modern and archaeological populations. Carnegie Inst. Wash. Yearb. 1989;88:111–117. [Google Scholar]
  91. French J.C. Cambridge University Press; 2021. Palaeolithic Europe: A Demographic and Social Prehistory. [Google Scholar]
  92. Frichot E., François O. LEA: an R package for landscape and ecological association studies. Methods Ecol. Evol. 2015;6:925–929. [Google Scholar]
  93. Fu Q., Mittnik A., Johnson P.L.F., Bos K., Lari M., Bollongino R., Sun C., Giemsch L., Schmitz R., Burger J., et al. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 2013;23:553–559. doi: 10.1016/j.cub.2013.02.044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Fujimoto A., Kimura R., Ohashi J., Omi K., Yuliwulandari R., Batubara L., Mustofa M.S., Samakkarn U., Settheetham-Ishida W., Ishida T., et al. A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum. Mol. Genet. 2008;17:835–843. doi: 10.1093/hmg/ddm355. [DOI] [PubMed] [Google Scholar]
  95. Fuller B.T., Fuller J.L., Harris D.A., Hedges R.E.M. Detection of breastfeeding and weaning in modern human infants with carbon and nitrogen stable isotope ratios. Am. J. Phys. Anthropol. 2006;129:279–293. doi: 10.1002/ajpa.20249. [DOI] [PubMed] [Google Scholar]
  96. Fuller D.Q., Willcox G., Allaby R.G. Cultivation and domestication had multiple origins: arguments against the core area hypothesis for the origins of agriculture in the Near East. World Archaeol. 2011;43:628–652. [Google Scholar]
  97. Fumagalli M., Moltke I., Grarup N., Racimo F., Bjerregaard P., Jørgensen M.E., Korneliussen T.S., Gerbault P., Skotte L., Linneberg A., et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science. 2015;349:1343–1347. doi: 10.1126/science.aab2319. [DOI] [PubMed] [Google Scholar]
  98. Gallego-Llorente M., Connell S., Jones E.R., Merrett D.C., Jeon Y., Eriksson A., Siska V., Gamba C., Meiklejohn C., Beyer R., et al. The genetics of an early Neolithic pastoralist from the Zagros, Iran. Sci. Rep. 2016;6:31326. doi: 10.1038/srep31326. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Gamba C., Jones E.R., Teasdale M.D., McLaughlin R.L., Gonzalez-Fortes G., Mattiangeli V., Domboróczki L., Kővári I., Pap I., Anders A., et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 2014;5:5257. doi: 10.1038/ncomms6257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Gazal S., Sahbatou M., Perdry H., Letort S., Génin E., Leutenegger A.-L. Inbreeding coefficient estimation with dense SNP data: comparison of strategies and application to HapMap III. Hum. Hered. 2014;77:49–62. doi: 10.1159/000358224. [DOI] [PubMed] [Google Scholar]
  101. Gerritsen F., Özbal R. Barcın Höyük, a seventh millennium settlement in the Eastern Marmara region of Turkey. Doc. Praehistorica. 2019;46:58–67. [Google Scholar]
  102. Gerritsen F., Özbal R., Gerrits P. In: Metallurgica Anatolica: Festschrift für Ünsal Yalçın anlässlich seines 65. Geburtstags / Ünsal Yalçın 65. Yaşgünü Armağan Kitabı. Yalçın H.G., Stegemeier O., editors. Ege Yayınları; 2020. A red floor at Neolithic Barcın Höyük: special or not? pp. 35–43. [Google Scholar]
  103. González-Fortes G., Jones E.R., Lightfoot E., Bonsall C., Lazar C., Grandal-d’Anglade A., Garralda M.D., Drak L., Siska V., Simalcsik A., et al. Paleogenomic Evidence for Multi-generational Mixing between Neolithic Farmers and Mesolithic Hunter-Gatherers in the Lower Danube Basin. Curr. Biol. 2017;27:1801–1810.e10. doi: 10.1016/j.cub.2017.05.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Gravel S., Henn B.M., Gutenkunst R.N., Indap A.R., Marth G.T., Clark A.G., Yu F., Gibbs R.A., 1000 Genomes Project, Bustamante C.D. Demographic history and rare allele sharing among human populations. Proc. Natl. Acad. Sci. USA. 2011;108:11983–11988. doi: 10.1073/pnas.1019276108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Günther T., Malmström H., Svensson E.M., Omrak A., Sánchez-Quinto F., Kılınç G.M., Krzewińska M., Eriksson G., Fraser M., Edlund H., et al. Population genomics of Mesolithic Scandinavia: investigating early postglacial migration routes and high-latitude adaptation. PLoS Biol. 2018;16:e2003703. doi: 10.1371/journal.pbio.2003703. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Günther T., Nettelblad C. The presence and impact of reference bias on population genomic studies of prehistoric human populations. PLoS Genet. 2019;15:e1008302. doi: 10.1371/journal.pgen.1008302. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Gutenkunst R.N., Hernandez R.D., Williamson S.H., Bustamante C.D. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5:e1000695. doi: 10.1371/journal.pgen.1000695. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Haack F. In: Ritualised destruction in the early Neolithic - the exceptional site of Herxheim (Palatinate, Germany). Zeeb-Lanz A., editor. Generaldirektion Kulturelles Erbe Rheinland-Pfalz; 2016. The early Neolithic ditched enclosure of Herxheim - architecture, fill formation processes and service life; pp. 19–118. [Google Scholar]
  109. Halstead P., Isaakidou V. In: Farmers at the frontier: a pan European perspective on Neolithisation. Gron K.J., Sørensen L., Rowley-Conwy P., editors. Oxbow Books; 2020. Pioneer farming in earlier Neolithic Greece; pp. 77–100. [Google Scholar]
  110. Harding R.M., Healy E., Ray A.J., Ellis N.S., Flanagan N., Todd C., Dixon C., Sajantila A., Jackson I.J., Birch-Machin M.A., et al. Evidence for variable selective pressures at MC1R. Am. J. Hum. Genet. 2000;66:1351–1361. doi: 10.1086/302863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Harris K. Evidence for recent, population-specific evolution of the human mutation rate. Proc. Natl. Acad. Sci. USA. 2015;112:3439–3444. doi: 10.1073/pnas.1418652112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Hershkovitz I., Rothschild B.M., Latimer B., Dutour O., Léonetti G., Greenwald C.M., Rothschild C., Jellema L.M. Recognition of sickle cell anemia in skeletal remains of children. Am. J. Phys. Anthropol. 1997;104:213–226. doi: 10.1002/(SICI)1096-8644(199710)104:2<213::AID-AJPA8>3.0.CO;2-Z. [DOI] [PubMed] [Google Scholar]
  113. Hofmanová Z. Johannes Gutenberg-Universität Mainz; 2017. Palaeogenomic and Biostatistical Analysis of Ancient DNA Data from Mesolithic and Neolithic Skeletal Remains. [Google Scholar]
  114. Hofmanová Z., Kreutzer S., Hellenthal G., Sell C., Diekmann Y., Díez-Del-Molino D., van Dorp L., López S., Kousathanas A., Link V., et al. Early farmers from across Europe directly descended from Neolithic Aegeans. Proc. Natl. Acad. Sci. USA. 2016;113:6886–6891. doi: 10.1073/pnas.1523951113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Horejs B., Milić B., Ostmann F., Thanheiser U., Weninger B., Galik A. The Aegean in the early 7th millennium BC: maritime networks and colonization. J. World Prehist. 2015;28:289–330. doi: 10.1007/s10963-015-9090-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Huang X., Fortier A.L., Coffman A.J., Struck T.J., Irby M.N., James J.E., León-Burguete J.E., Ragsdale A.P., Gutenkunst R.N. Inferring genome-wide correlations of mutation fitness effects between populations. Mol. Biol. Evol. 2021;38:4588–4602. doi: 10.1093/molbev/msab162. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Ivanova M., De Cupere B., Ethier J., Marinova E. Pioneer farming in southeast Europe during the early sixth millennium BC: climate-related adaptations in the exploitation of plants and animals. PLoS One. 2018;13:e0197225. doi: 10.1371/journal.pone.0197225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Jones E.R., Gonzalez-Fortes G., Connell S., Siska V., Eriksson A., Martiniano R., McLaughlin R.L., Gallego Llorente M., Cassidy L.M., Gamba C., et al. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 2015;6:8912. doi: 10.1038/ncomms9912. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Jöris O., Street M., Sirocko F. In: Wetter, Klima, Menschheitsentwicklung. Von der Eiszeit bis ins 21. Jahrhundert. Sirocko F., editor. WBG; 2009. Siedlungsleere – das Kältemaximum der Letzten Kaltzeit (24.000–16.000 BP); pp. 83–87. [Google Scholar]
  120. Jovanović J., de Becdelièvre C., Stefanović S., Živaljević I., Dimitrijević V., Goude G. Last hunters - first farmers: new insight into subsistence strategies in the Central Balkans through multi-isotopic analysis. Archaeol. Anthropol. Sci. 2019;11:3279–3298. [Google Scholar]
  121. Jovanović J.D. University of Belgrade; 2017. The diet and health status of the early Neolithic communities of the central Balkans (6200–5200 BC) PhD thesis. [Google Scholar]
  122. Kartal M. Anatolian epi-paleolithic period assemblages: problems, suggestions, evaluations and various approaches. Anadolu. 2003;24:45–62. [Google Scholar]
  123. Karul N., Avci M.B. Neolithic communities in the Eastern Marmara region: Aktopraklik C. Anatolica. 2011;37:1–15. [Google Scholar]
  124. Katoh K., Misawa K., Kuma K.-I., Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–3066. doi: 10.1093/nar/gkf436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Kılınç G.M., Koptekin D., Atakuman Ç., Sümer A.P., Dönertaş H.M., Yaka R., Bilgin C.C., Büyükkarakaya A.M., Baird D., Altınışık E., et al. Archaeogenomic analysis of the first steps of Neolithization in Anatolia and the Aegean. Proc. Biol. Sci. 2017;284 doi: 10.1098/rspb.2017.2064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Kılınç G.M., Omrak A., Özer F., Günther T., Büyükkarakaya A.M., Bıçakçı E., Baird D., Dönertaş H.M., Ghalichi A., Yaka R., et al. The demographic development of the first farmers in Anatolia. Curr. Biol. 2016;26:2659–2666. doi: 10.1016/j.cub.2016.07.057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Kircher M., Sawyer S., Meyer M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 2012;40:e3. doi: 10.1093/nar/gkr771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  128. Köster J., Rahmann S. Snakemake--a scalable bioinformatics workflow engine. Bioinformatics. 2012;28:2520–2522. doi: 10.1093/bioinformatics/bts480. [DOI] [PubMed] [Google Scholar]
  129. Kousathanas A., Leuenberger C., Link V., Sell C., Burger J., Wegmann D. Inferring heterozygosity from ancient and low coverage genomes. Genetics. 2017;205:317–332. doi: 10.1534/genetics.116.189985. [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. Kozłowski J.K. Gravettian/Epigravettian sequences in the Balkans: environment, technologies, hunting strategies and raw material procurement. Br. Sch. Athens Stud. 1999;3:319–329. [Google Scholar]
  131. Kozłowski J.K. The importance of the Aegean basin for the Neolithization of South-Eastern Europe. J. Isr. Prehist. Soc. 2005;35:409–424. [Google Scholar]
  132. Kozłowski J.K., Kaczanowska M. Gravettian/Epigravettian sequences in the Balkans and Anatolia. Mediterr. Archaeol. Archaeom. 2004;4:5–18. [Google Scholar]
  133. Krauß R., Marinova E., De Brue H., Weninger B. The rapid spread of early farming from the Aegean into the Balkans via the Sub-Mediterranean-Aegean Vegetation Zone. Quat. Int. 2018;496:24–41. [Google Scholar]
  134. Kreutzer S. 2017. Populationsgenetische Analyse prähistorischer Individuen aus Griechenland. [Google Scholar]
  135. Lagia A., Eliopoulos C., Manolis S. Thalassemia: macroscopic and radiological study of a case. Int. J. Osteoarchaeol. 2007;17:269–285. [Google Scholar]
  136. Lahr M.M., Foley R.A. Towards a theory of modern human origins: geography, demography, and diversity in recent human evolution. Am. J. Phys. Anthropol. 1998;27:137–176. doi: 10.1002/(sici)1096-8644(1998)107:27+<137::aid-ajpa6>3.0.co;2-q. [DOI] [PubMed] [Google Scholar]
  137. Lambeck K. Late Pleistocene and Holocene sea-level change in Greece and south-western Turkey: a separation of eustatic, isostatic and tectonic contributions. Geophys. J. Int. 1995;122:1022–1044. [Google Scholar]
  138. Latreille J., Ezzedine K., Elfakir A., Ambroisine L., Gardinier S., Galan P., Hercberg S., Gruber F., Rees J., Tschachler E., et al. MC1R gene polymorphism affects skin color and phenotypic features related to sun sensitivity in a population of French adult women. Photochem. Photobiol. 2009;85:1451–1458. doi: 10.1111/j.1751-1097.2009.00594.x. [DOI] [PubMed] [Google Scholar]
  139. Lazaridis I. The evolutionary history of human populations in Europe. Curr. Opin. Genet. Dev. 2018;53:21–27. doi: 10.1016/j.gde.2018.06.007. [DOI] [PubMed] [Google Scholar]
  140. Lazaridis I., Mittnik A., Patterson N., Mallick S., Rohland N., Pfrengle S., Furtwängler A., Peltzer A., Posth C., Vasilakis A., et al. Genetic origins of the Minoans and Mycenaeans. Nature. 2017;548:214–218. doi: 10.1038/nature23310. [DOI] [PMC free article] [PubMed] [Google Scholar]
  141. Lazaridis I., Nadel D., Rollefson G., Merrett D.C., Rohland N., Mallick S., Fernandes D., Novak M., Gamarra B., Sirak K., et al. Genomic insights into the origin of farming in the ancient Near East. Nature. 2016;536:419–424. doi: 10.1038/nature19310. [DOI] [PMC free article] [PubMed] [Google Scholar]
  142. Lazaridis I., Patterson N., Mittnik A., Renaud G., Mallick S., Kirsanow K., Sudmant P.H., Schraiber J.G., Castellano S., Lipson M., et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature. 2014;513:409–413. doi: 10.1038/nature13673. [DOI] [PMC free article] [PubMed] [Google Scholar]
  143. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arXiv. 2013 1303.3997v2. [Google Scholar]
  144. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Link V., Kousathanas A., Veeramah K., Sell C., Scheu A., Wegmann D. ATLAS: analysis tools for low-depth and ancient samples. Preprint at bioRxiv. doi: 10.1101/105346.
  146. Lipson M., Szécsényi-Nagy A., Mallick S., Pósa A., Stégmár B., Keerl V., Rohland N., Stewardson K., Ferry M., Michel M., et al. Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature. 2017;551:368–372. doi: 10.1038/nature24476. [DOI] [PMC free article] [PubMed] [Google Scholar]
  147. MacHugh D.E., Edwards C.J., Bailey J.F., Bancroft D.R., Bradley D.G. The extraction and analysis of ancient DNA from bone and teeth: a survey of current methodologies. Anc. Biomol. 2000;3:81–102. [Google Scholar]
  148. Maier A. Population and settlement dynamics from the Gravettian to the Magdalenian. Mitt. Ges. Urgeschichte. 2017;26:83–101. [Google Scholar]
  149. Malaspinas A.-S., Westaway M.C., Muller C., Sousa V.C., Lao O., Alves I., Bergström A., Athanasiadis G., Cheng J.Y., Crawford J.E., et al. A genomic history of Aboriginal Australia. Nature. 2016;538:207–214. doi: 10.1038/nature18299. [DOI] [PMC free article] [PubMed] [Google Scholar]
  150. Mallick S., Li H., Lipson M., Mathieson I., Gymrek M., Racimo F., Zhao M., Chennagiri N., Nordenfelt S., Tandon A., et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature. 2016;538:201–206. doi: 10.1038/nature18964. [DOI] [PMC free article] [PubMed] [Google Scholar]
  151. Marchi N., Excoffier L. Gene flow as a simple cause for an excess of high-frequency-derived alleles. Evol. Appl. 2020;13:2254–2263. doi: 10.1111/eva.12998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  152. Marchi N., Schlichta F., Excoffier L. Demographic inference. Curr. Biol. 2021;31:R276–R279. doi: 10.1016/j.cub.2021.01.053. [DOI] [PubMed] [Google Scholar]
  153. Marcus J.H., Posth C., Ringbauer H., Lai L., Skeates R., Sidore C., Beckett J., Furtwängler A., Olivieri A., Chiang C.W.K., et al. Genetic history from the Middle Neolithic to present on the Mediterranean island of Sardinia. Nat. Commun. 2020;11:939. doi: 10.1038/s41467-020-14523-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  154. Marouli E., Graff M., Medina-Gomez C., Lo K.S., Wood A.R., Kjaer T.R., Fine R.S., Lu Y., Schurmann C., Highland H.M., et al. Rare and low-frequency coding variants alter human adult height. Nature. 2017;542:186–190. doi: 10.1038/nature21039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  155. Mateiciucová I. In: Das Linearbandkeramische Gräberfeld von Kleinhadersdorf. Neugebauer-Maresch C., Lenneis E., editors. Verlag der Österreichischen Akademie der Wissenschaften; 2015. Silices; pp. 111–122. [Google Scholar]
  156. Mathieson I., Alpaslan-Roodenberg S., Posth C., Szécsényi-Nagy A., Rohland N., Mallick S., Olalde I., Broomandkhoshbacht N., Candilio F., Cheronet O., et al. The genomic history of southeastern Europe. Nature. 2018;555:197–203. doi: 10.1038/nature25778. [DOI] [PMC free article] [PubMed] [Google Scholar]
  157. Mathieson I., Lazaridis I., Rohland N., Mallick S., Patterson N., Roodenberg S.A., Harney E., Stewardson K., Fernandes D., Novak M., et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature. 2015;528:499–503. doi: 10.1038/nature16152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  158. Matthey-Doret R., Whitlock M.C. Background selection and FST: consequences for detecting local adaptation. Mol. Ecol. 2019;28:3902–3914. doi: 10.1111/mec.15197. [DOI] [PubMed] [Google Scholar]
  159. McCarthy S., Das S., Kretzschmar W., Delaneau O., Wood A.R., Teumer A., Kang H.M., Fuchsberger C., Danecek P., Sharp K., et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 2016;48:1279–1283. doi: 10.1038/ng.3643. [DOI] [PMC free article] [PubMed] [Google Scholar]
  160. McEvoy B.P., Visscher P.M. Genetics of human height. Econ. Hum. Biol. 2009;7:294–306. doi: 10.1016/j.ehb.2009.09.005. [DOI] [PubMed] [Google Scholar]
  161. Meyer M., Kircher M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010;2010 doi: 10.1101/pdb.prot5448. pdb.prot5448. [DOI] [PubMed] [Google Scholar]
  162. Mittnik A., Massy K., Knipper C., Wittenborn F., Friedrich R., Pfrengle S., Burri M., Carlichi-Witjes N., Deeg H., Furtwängler A., et al. Kinship-based social inequality in Bronze Age Europe. Science. 2019;366:731–734. doi: 10.1126/science.aax6219. [DOI] [PubMed] [Google Scholar]
  163. Narasimhan V.M., Patterson N., Moorjani P., Rohland N., Bernardos R., Mallick S., Lazaridis I., Nakatsuka N., Olalde I., Lipson M., et al. The formation of human populations in South and Central Asia. Science. 2019;365:eaat7487. doi: 10.1126/science.aat7487. [DOI] [PMC free article] [PubMed] [Google Scholar]
  164. Navarro-Gomez D., Leipzig J., Shen L., Lott M., Stassen A.P.M., Wallace D.C., Wiggs J.L., Falk M.J., van Oven M., Gai X. Phy-Mer: a novel alignment-free and reference-independent mitochondrial haplogroup classifier. Bioinformatics. 2015;31:1310–1312. doi: 10.1093/bioinformatics/btu825. [DOI] [PMC free article] [PubMed] [Google Scholar]
  165. Nehlich O., Borić D., Stefanović S., Richards M.P. Sulphur isotope evidence for freshwater fish consumption: a case study from the Danube Gorges, SE Europe. J. Archaeol. Sci. 2010;37:1131–1139. [Google Scholar]
  166. Nei M., Li W.H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA. 1979;76:5269–5273. doi: 10.1073/pnas.76.10.5269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  167. Neph S., Kuehn M.S., Reynolds A.P., Haugen E., Thurman R.E., Johnson A.K., Rynes E., Maurano M.T., Vierstra J., Thomas S., et al. BEDOPS: high-performance genomic feature operations. Bioinformatics. 2012;28:1919–1920. doi: 10.1093/bioinformatics/bts277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  168. Neugebauer-Maresch C., Lenneis E. Vol. 82. Verlag der Österreichischen Akademie der Wissenschaften; 2015. (Das Linearbandkeramische Gräberfeld von Kleinhadersdorf. Mitteilungen der Prähistorischen Kommission). [Google Scholar]
  169. Nieszery N. Verlag Marie Leidorf; 1995. Linearbandkeramische Gräberfelder in Bayern. [Google Scholar]
  170. Nikitin A.G., Stadler P., Kotova N., Teschler-Nicola M., Price T.D., Hoover J., Kennett D.J., Lazaridis I., Rohland N., Lipson M., et al. Interactions between earliest Linearbandkeramik farmers and central European hunter gatherers at the dawn of European Neolithization. Sci. Rep. 2019;9:19544. doi: 10.1038/s41598-019-56029-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  171. Ohashi J., Naka I., Tsuchiya N. The impact of natural selection on an ABCC11 SNP determining earwax type. Mol. Biol. Evol. 2011;28:849–857. doi: 10.1093/molbev/msq264. [DOI] [PubMed] [Google Scholar]
  172. Olalde I., Brace S., Allentoft M.E., Armit I., Kristiansen K., Booth T., Rohland N., Mallick S., Szécsényi-Nagy A., Mittnik A., et al. The Beaker phenomenon and the genomic transformation of northwest Europe. Nature. 2018;555:190–196. doi: 10.1038/nature25738. [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Olalde I., Mallick S., Patterson N., Rohland N., Villalba-Mouco V., Silva M., Dulias K., Edwards C.J., Gandini F., Pala M., et al. The genomic history of the Iberian Peninsula over the past 8000 years. Science. 2019;363:1230–1234. doi: 10.1126/science.aav4040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  174. Orschiedt J., Haidle M.N. In: Sticks, stones and broken bones: Neolithic violence in a European perspective. Schulting R.J., Fibiger L., editors. Oxford University Press; 2012. Violence against the living, violence against the dead on the human remains from Herxheim, Germany. Evidence of a crisis and mass cannibalism? pp. 121–138. [Google Scholar]
  175. Ortner D.J. Academic Press; 2003. Identification of pathological conditions in human skeletal remains. [Google Scholar]
  176. Özdoğan M. Anatolia from the Last Glacial Maximum to the Holocene Climatic Optimum: cultural formations and the impact of the environmental setting. Paléorient. 1997;23:25–38. [Google Scholar]
  177. Özdoğan M. Archaeological evidence on the westward expansion of farming communities from eastern Anatolia to the Aegean and the Balkans. Curr. Anthropol. 2011;52:S415–S430. [Google Scholar]
  178. Papathanasiou A. Stable isotope analysis in Neolithic Greece and possible implications on human health. Int. J. Osteoarchaeol. 2003;13:314–324. [Google Scholar]
  179. Papathanasiou A. In: Human Bioarchaeology of the Transition to Agriculture. Pinhasi R., Stock J., editors. John Wiley & Sons, Ltd; 2011. Health, diet and social implications in Neolithic Greece from the study of human osteological material; pp. 87–106. [Google Scholar]
  180. Papathanasiou A. Stable isotope analyses in Neolithic and Bronze Age Greece: an overview. Hesperia. 2015;49:25–55. [Google Scholar]
  181. Papoulia C. Géoarchéologie Des Îles de méditerranée. CNRS Editions; 2016. Late Pleistocene to Early Holocene Sea-crossings in the Aegean: direct, indirect and controversial evidence; pp. 33–46. [Google Scholar]
  182. Park J.-H., Yamaguchi T., Watanabe C., Kawaguchi A., Haneji K., Takeda M., Kim Y.-I., Tomoyasu Y., Watanabe M., Oota H., et al. Effects of an Asian-specific nonsynonymous EDAR variant on multiple dental traits. J. Hum. Genet. 2012;57:508–514. doi: 10.1038/jhg.2012.60. [DOI] [PubMed] [Google Scholar]
  183. Patterson N., Moorjani P., Luo Y., Mallick S., Rohland N., Zhan Y., Genschoreck T., Webster T., Reich D. Ancient admixture in human history. Genetics. 2012;192:1065–1093. doi: 10.1534/genetics.112.145037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  184. Pechtl J. Verorten in Raum, Zeit und Umwelt. Ein Forschungsprojekt zur ersten bäuerlichen Kultur in Bayerisch-Schwaben. Denkmalpflege Informationen. 2015;161:14–17. [Google Scholar]
  185. Perlès C. In: Hunters of the golden Age. The mid-upper Palaeolithic of Eurasia 30,000 - 20,000 BP, W. Roebroeks. Mussi M., Svoboda J., Fennema K., editors. University of Leiden; 2000. Greece, 30,000–20,000 bp; pp. 375–397. [Google Scholar]
  186. Perlès C. Cambridge University Press; 2001. The early Neolithic in Greece: the first farming communities in Europe. [Google Scholar]
  187. Perlès C., Quiles A., Valladas H. Early seventh-millennium AMS dates from domestic seeds in the Initial Neolithic at Franchthi Cave (Argolid, Greece) Antiquity. 2013;87:1001–1015. [Google Scholar]
  188. Peters J., Helmer D., Von Den Driesch A., Saña Segui M. Early animal husbandry in the northern Levant. Paléorient. 1999;25:27–48. [Google Scholar]
  189. Petr M., Pääbo S., Kelso J., Vernot B. Limits of long-term selection against Neandertal introgression. Proc. Natl. Acad. Sci. USA. 2019;116:1639–1644. doi: 10.1073/pnas.1814338116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  190. Porčić M., Blagojević T., Pendić J., Stefanović S. The Neolithic demographic transition in the Central Balkans: population dynamics reconstruction based on new radiocarbon evidence. Phil. Trans. R. Soc. Lond. B. 2020;376:1816. doi: 10.1098/rstb.2019.0712. [DOI] [PMC free article] [PubMed] [Google Scholar]
  191. Pouyet F., Aeschbacher S., Thiéry A., Excoffier L. Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences. eLife. 2018;7:e36317. doi: 10.7554/eLife.36317. [DOI] [PMC free article] [PubMed] [Google Scholar]
  192. Prüfer K., de Filippo C., Grote S., Mafessoni F., Korlević P., Hajdinjak M., Vernot B., Skov L., Hsieh P., Peyrégne S., et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science. 2017;358:655–658. doi: 10.1126/science.aao1887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  193. Pyke G., Yiouni P. The British School at Athens; 1996. Nea Nikomedeia I: The excavation of an early Neolithic village in Northern Greece 1961–1964. The excavation and the ceramic assemblage. [Google Scholar]
  194. Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  195. R Core Team . R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2019. https://www.R-project.org/ [Google Scholar]
  196. Racimo F., Sikora M., Vander Linden M.V., Schroeder H., Lalueza-Fox C. Beyond broad strokes: sociocultural insights from the study of ancient genomes. Nat. Rev. Genet. 2020;21:355–366. doi: 10.1038/s41576-020-0218-z. [DOI] [PubMed] [Google Scholar]
  197. Ragsdale A.P., Coffman A.J., Hsieh P., Struck T.J., Gutenkunst R.N. Triallelic population genomics for inferring correlated fitness effects of same site nonsynonymous mutations. Genetics. 2016;203:513–523. doi: 10.1534/genetics.115.184812. [DOI] [PMC free article] [PubMed] [Google Scholar]
  198. Ralf A., Montiel González D.M., Zhong K., Kayser M. Yleaf: software for human Y-chromosomal haplogroup inference from next-generation sequencing data. Mol. Biol. Evol. 2018;35:1291–1294. doi: 10.1093/molbev/msy032. [DOI] [PubMed] [Google Scholar]
  199. Rasmussen M., Guo X., Wang Y., Lohmueller K.E., Rasmussen S., Albrechtsen A., Skotte L., Lindgreen S., Metspalu M., Jombart T., et al. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science. 2011;334:94–98. doi: 10.1126/science.1211177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  200. Reimer P.J., Austin W.E.N., Bard E., Bayliss A., Blackwell P.G., Bronk Ramsey C., Butzin M., Cheng H., Edwards R.L., Friedrich M., et al. The IntCal20 Northern Hemisphere Radiocarbon Age Calibration Curve (0–55 cal kBP) Radiocarbon. 2020;62:725–757. [Google Scholar]
  201. Richards M.P., Hedges R.E.M. Stable isotope evidence for similarities in the types of marine foods used by late Mesolithic humans at sites along the Atlantic Coast of Europe. J. Archaeol. Sci. 1999;26:717–722. [Google Scholar]
  202. Riedhammer K. In: Ritualised destruction in the early Neolithic – the exceptional site of Herxheim (palatinate, Germany) Zeeb-Lanz A., editor. Vol. 2. Generaldirektion Kulturelles Erbe, Direktion Landesarchäologie; 2019. The radiocarbon dates from Herxheim and their archaeological interpretation; pp. 285–304. [Google Scholar]
  203. Ringbauer H., Novembre J., Steinrücken M. Parental relatedness through time revealed by runs of homozygosity in ancient DNA. Nat. Commun. 2021;12:5425. doi: 10.1038/s41467-021-25289-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  204. Rodden R.J. Recent discoveries from prehistoric Macedonia: an interim report. Balk. Stud. 1964;5:109–124. [Google Scholar]
  205. Rodden R.J. An early Neolithic village in Greece. Sci. Am. 1965;212:82–92. [Google Scholar]
  206. Rodden R.J., Dimbleby G.W., Western A.C., Willis E.H., Higgs E.S., Clench W.J. Excavations at the Early Neolithic Site at Nea Nikomedeia, Greek Macedonia (1961 season) Proc. Prehist. Soc. 1962;28:267–288. [Google Scholar]
  207. Rodden R.J., Rodden J.M. Illustrated London News; 1964. A European Link with Chatal Huyuk: uncovering a 7th Millennium Settlement in Macedonia. Part I - Site and pottery; pp. 564–567. [Google Scholar]
  208. Rosenstock E., Ebert J., Martin R., Hicketier A., Walter P., Groß M. Human stature in the Near East and Europe ca. 10,000–1000 BC: its spatiotemporal development in a Bayesian errors-in-variables model. Archaeol. Anthropol. Sci. 2019;11:5657–5690. [Google Scholar]
  209. Runnels C. A prehistoric survey of Thessaly: new light on the Greek middle paleolithic. J. Field Archaeol. 1988;15:277–290. [Google Scholar]
  210. Runnels C., Özdoğan M. The Palaeolithic of the Bosphorus region NW Turkey. J. Field Archaeol. 2001;28:69–92. [Google Scholar]
  211. Scally A., Durbin R. Revising the human mutation rate: implications for understanding human evolution. Nat. Rev. Genet. 2012;13:745–753. doi: 10.1038/nrg3295. [DOI] [PubMed] [Google Scholar]
  212. Scheu A., Powell A., Bollongino R., Vigne J.-D., Tresset A., Çakırlar C., Benecke N., Burger J. The genetic prehistory of domesticated cattle from their origin to the spread across Europe. BMC Genet. 2015;16:54. doi: 10.1186/s12863-015-0203-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  213. Schiffels S., Wang K. MSMC and MSMC2: the multiple sequentially markovian coalescent. Methods Mol. Biol. 2020;2090:147–166. doi: 10.1007/978-1-0716-0199-0_7. [DOI] [PubMed] [Google Scholar]
  214. Schoeninger M.J., DeNiro M.J. Carbon isotope ratios of apatite from fossil bone cannot be used to reconstruct diets of animals. Nature. 1982;297:577–578. doi: 10.1038/297577a0. [DOI] [PubMed] [Google Scholar]
  215. Shang D., Zhang X., Sun M., Wei Y., Wen Y. Strong association of the SNP rs17822931 with wet earwax and bromhidrosis in a Chinese family. J. Genet. 2013;92:289–291. doi: 10.1007/s12041-013-0229-z. [DOI] [PubMed] [Google Scholar]
  216. Shennan S. Cambridge University Press; 2018. The first farmers of Europe: an evolutionary perspective. [Google Scholar]
  217. Sikora M., Seguin-Orlando A., Sousa V.C., Albrechtsen A., Korneliussen T., Ko A., Rasmussen S., Dupanloup I., Nigst P.R., Bosch M.D., et al. Ancient genomes show social and reproductive behavior of early Upper Paleolithic foragers. Science. 2017;358:659–662. doi: 10.1126/science.aao1807. [DOI] [PubMed] [Google Scholar]
  218. Skoglund P., Malmström H., Raghavan M., Storå J., Hall P., Willerslev E., Gilbert M.T., Götherström A., Jakobsson M. Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science. 2012;336:466–469. doi: 10.1126/science.1216304. [DOI] [PubMed] [Google Scholar]
  219. Skoglund P., Storå J., Götherström A., Jakobsson M. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 2013;40:4477–4482. [Google Scholar]
  220. Skourtanioti E., Erdal Y.S., Frangipane M., Balossi Restelli F., Yener K.A., Pinnock F., Matthiae P., Özbal R., Schoop U.-D., Guliyev F., et al. Genomic history of Neolithic to Bronze Age Anatolia, Northern Levant, and Southern Caucasus. Cell. 2020;181:1158–1175.e28. doi: 10.1016/j.cell.2020.04.044. [DOI] [PubMed] [Google Scholar]
  221. Slatkin M. Seeing ghosts: the effect of unsampled populations on migration rates estimated for sampled populations. Mol. Ecol. 2005;14:67–73. doi: 10.1111/j.1365-294X.2004.02393.x. [DOI] [PubMed] [Google Scholar]
  222. Smith . Publications de l’Institut de Préhistoire de l’Université de Bordeaux, Mém. n° 5. Delmas; 1966. Le Solutréen en France; p. 449. [Google Scholar]
  223. Sommer R.S., Zachos F.E., Street M., Jöris O., Skog A., Benecke N. Late Quaternary distribution dynamics and phylogeography of the red deer (Cervus elaphus) in Europe. Quat. Sci. Rev. 2008;27:714–733. [Google Scholar]
  224. Speidel L., Cassidy L., Davies R.W., Hellenthal G., Skoglund P., Myers S.R. Inferring population histories for ancient genomes using genome-wide genealogies. Mol. Biol. Evol. 2021;38:3497–3511. doi: 10.1093/molbev/msab174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  225. Stadler P. In: Mitteilungen der Prähistorischen Kommission 82. von Kleinhadersdorf D.L.G., Neugebauer-Maresch C., Lenneis E., editors. Verlag der Österreichischen Akademie der Wissenschaften; 2015. Versuch einer Auswertung der 14C-Proben von Kleinhadersdorf mittels Bayes’scher Statistik; pp. 149–151. [Google Scholar]
  226. Stefanović S. Laboratory for Bioarchaeology, Faculty of Philosophy; 2016. People of the Lepenski Vir: Bioanthropological Analysis of Human Skeletal Remains. [Google Scholar]
  227. Stefanović S., Borić D. In: The Iron Gates in Prehistory: New Perspectives, BAR Int. Series 1893. Bonsall C., Boroneanţ V., Radovanović I., editors. Archaeopress; 2008. The newborn infant burials from Lepenski Vir: in pursuit of contextual meanings; pp. 131–169. [Google Scholar]
  228. Tasić N., Marić M., Ramsey C.B., Kromer B., Barclay A., Bayliss A., Beavan N., Gaydarska B., Whittle A. Vinča-Belo Brdo, Serbia: the times of a tell. Germanica. 2015;93:1–75. [Google Scholar]
  229. Tasić N., Marić M., Ramsey C.B., Kromer B., Barclay A., Bayliss A., Beavan N., Gaydarska B., Whittle A. University of Heidelberg; 2016. Vinča-Belo Brdo, Serbia: the times of a tell (Dataset) (Heidelberg Research Data Repository. [Google Scholar]
  230. Tasić N., Srejović D., Stojanović B. project Rastko; 1990. Vinča: Centre of the Neolithic Culture of the Danubian Region. [Google Scholar]
  231. Teschler-Nicola M. In: Sticks, stones and broken bones: Neolithic violence in a European perspective. Schulting R.J., Fibiger L., editors. Oxford University Press; 2012. The Early Neolithic site Asparn/Schletz (Lower Austria) pp. 101–120. [Google Scholar]
  232. Tiefenböck B., Teschler-Nicola M. In: Neugebauer-Maresch C., Lenneis E., editors. Vol. 82. Verlag der Österreichischen Akademie der Wissenschaften; 2015. pp. 297–392. (Das Linearbandkeramische Gräberfeld von Kleinhadersdorf. Mitteilungen der Prähistorischen Kommission-Teil II: Anthropologie). [Google Scholar]
  233. Tourloukis V., Harvati K. The Palaeolithic record of Greece: a synthesis of the evidence and a research agenda for the future. Quat. Int. 2018;466:48–65. [Google Scholar]
  234. Triantaphyllou S. Archaeopress; 2001. A bioarchaeological approach to prehistoric cemetery populations from western and central Greek Macedonia. [Google Scholar]
  235. Turck R. In: Zeeb-Lanz A., editor. Vol. 2. Generaldirektion Kulturelles Erbe; 2019. Where did the dead from Herxheim originate? Isotope analyses of human individuals from the find concentrations in the ditches, Direktion Landesarchäologie; pp. 313–421. (Ritualised destruction in the early Neolithic – The Exceptional Site of Herxheim (palatinate, Germany)). [Google Scholar]
  236. Vaiglova P., Bogaard A., Collins M., Cavanagh W., Mee C., Renard J., Lamb A., Gardeisen A., Fraser R. An integrated stable isotope study of plants and animals from Kouphovouno, southern Greece: a new look at Neolithic farming. J. Archaeol. Sci. 2014;42:201–215. [Google Scholar]
  237. van Andel T.H., Runnels C.N. The earliest farmers in Europe. Antiquity. 1995;69:481–500. [Google Scholar]
  238. Veeramah K.R., Rott A., Groß M., van Dorp L., López S., Kirsanow K., Sell C., Blöcher J., Wegmann D., Link V., et al. Population genomic analysis of elongated skulls reveals extensive female-biased immigration in Early Medieval Bavaria. Proc. Natl. Acad. Sci. USA. 2018;115:3494–3499. doi: 10.1073/pnas.1719880115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  239. Verdugo M.P., Mullin V.E., Scheu A., Mattiangeli V., Daly K.G., Maisano Delser P., Hare A.J., Burger J., Collins M.J., Kehati R., et al. Ancient cattle genomics, origins, and rapid turnover in the Fertile Crescent. Science. 2019;365:173–176. doi: 10.1126/science.aav1002. [DOI] [PubMed] [Google Scholar]
  240. Vigne J.-D., Briois F., Zazzo A., Willcox G., Cucchi T., Thiébault S., Carrère I., Franel Y., Touquet R., Martin C., et al. First wave of cultivators spread to Cyprus at least 10,600 y ago. Proc. Natl. Acad. Sci. USA. 2012;109:8445–8449. doi: 10.1073/pnas.1201693109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  241. Wahlund S. Zusammensetzung von Populationen und Korrelationerscheinungen vom Standpunkt der Vererbungslehre aus betrachtet. Hereditas. 1928;11:65–106. [Google Scholar]
  242. Walsh S., Liu F., Wollstein A., Kovatsi L., Ralf A., Kosiniak-Kamysz A., Branicki W., Kayser M. The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA. Forensic Sci. Int. Genet. 2013;7:98–115. doi: 10.1016/j.fsigen.2012.07.005. [DOI] [PubMed] [Google Scholar]
  243. Wang K., Li M., Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164. doi: 10.1093/nar/gkq603. [DOI] [PMC free article] [PubMed] [Google Scholar]
  244. Weiberg E., Bevan A., Kouli K., Katsianis M., Woodbridge J., Bonnier A., Engel M., Finné M., Fyfe R., Maniatis Y., et al. Long-term trends of land use and demography in Greece: a comparative study. Holocene. 2019;29:742–760. [Google Scholar]
  245. Weninger B., Clare L., Gerritsen F., Horejs B., Krauß R., Linstädter J., Özbal R., Rohling E.J. Neolithisation of the Aegean and southeast Europe during the 6600–6000 calBC period of Rapid Climate Change. Doc. Praehistorica. 2014;41:1–31. [Google Scholar]
  246. Whittle A., Bartosiewicz L., Borić D., Pettitt P., Richards M. In the beginning: new radiocarbon dates for the Early Neolithic in Northern Serbia and South-East Hungary. Antaeus. 2002;25:63–117. [Google Scholar]
  247. Wild E.M., Stadler P., Häußer A., Kutschera W., Steier P., Teschler-Nicola M., Wahl J., Windl H.J. Neolithic massacres: local skirmishes or General Warfare in Europe? Radiocarbon. 2004;46:377–385. [Google Scholar]
  248. Windl H.J. Makabres Ende einer Kultur? Archäol. Dtschl. 1999;1:54–57. [Google Scholar]
  249. Windl H.J. In: Zum Ende der Bandkeramik in Mitteleuropa. Beiträge der Internationalen Tagung in Herxheim bei Landau (Pfalz) vom 14. Krisen, Kulturwandel, Kontinuitäten, Zeeb-Lanz A., editors. verlag Marie Leidorf; 2009. Zur Stratigraphie der bandkeramischen Grabenwerke von Asparn an der Zaya-Schletz; pp. 191–196. [Google Scholar]
  250. Xue Y., Zhang X., Huang N., Daly A., Gillson C.J., Macarthur D.G., Yngvadottir B., Nica A.C., Woodwark C., Chen Y., et al. Population differentiation as an indicator of recent positive selection in humans: an empirical evaluation. Genetics. 2009;183:1065–1077. doi: 10.1534/genetics.109.107722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  251. Yaka R., Mapelli I., Kaptan D., Doğu A., Chyleński M., Erdal Ö.D., Koptekin D., Vural K.B., Bayliss A., Mazzucato C., et al. Variable kinship patterns in Neolithic Anatolia revealed by ancient genomes. Curr. Biol. 2021;31:2455–2468.e18. doi: 10.1016/j.cub.2021.03.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  252. Yamaguchi K., Watanabe C., Kawaguchi A., Sato T., Naka I., Shindo M., Moromizato K., Aoki K., Ishida H., Kimura R. Association of melanocortin 1 receptor gene (MC1R) polymorphisms with skin reflectance and freckles in Japanese. J. Hum. Genet. 2012;57:700–708. doi: 10.1038/jhg.2012.96. [DOI] [PubMed] [Google Scholar]
  253. Yang D.Y., Eng B., Waye J.S., Dudar J.C., Saunders S.R. Technical note: Improved DNA extraction from ancient bones using silica-based spin columns. Am. J. Phys. Anthropol. 1998;105:539–543. doi: 10.1002/(SICI)1096-8644(199804)105:4<539::AID-AJPA10>3.0.CO;2-1. [DOI] [PubMed] [Google Scholar]
  254. Zeder M.A. In: Human Dispersal and Species Movement: From Prehistory to the Present. Boivin N., Crassard R., Petraglia M., editors. Cambridge University Press; 2017. Out of the Fertile Crescent: the dispersal of domestic livestock through Europe and Africa; pp. 261–303. [Google Scholar]
  255. Zeeb-Lanz A., editor. Forschungen zur Pfälzischen Archäologie, 8.1. Generaldirektion Kulturelles Erbe, Direktion Landesarchäologie; 2016. Ritualised Destruction in the Early Neolithic – the Exceptional Site of Herxheim (Palatinate, Germany) Vol. 1. [Google Scholar]
  256. Zeeb-Lanz A. In: Ritualised destruction in the early Neolithic – the exceptional site of Herxheim (palatinate, Germany) Vol. 2. Zeeb-Lanz A., editor. Generaldirektion Kulturelles Erbe, Direktion Landesarchäologie; 2019. The Herxheim ritual enclosure – a synthesis of results and interpretative approaches; pp. 423–482. [Google Scholar]
  257. Zeeb-Lanz A., editor. Forschungen Zur Pfälzischen Archäologie 8.2. Generaldirektion Kulturelles Erbe, Direktion Landesarchäologie; 2019. Ritualised Destruction in the Early Neolithic – the Exceptional Site of Herxheim (Palatinate, Germany), Vol. 2. [Google Scholar]
  258. Zheng X., Levine D., Shen J., Gogarten S.M., Laurie C., Weir B.S. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28:3326–3328. doi: 10.1093/bioinformatics/bts606. [DOI] [PMC free article] [PubMed] [Google Scholar]
  259. Zoledziewska M., UK10K Consortium. Sidore C., Chiang C.W.K., Sanna S., Mulas A., Steri M., Busonero F., Marcus J.H., Marongiu M., et al. Height-reducing variants and selection for short stature in Sardinia. Nat. Genet. 2015;47:1352–1356. doi: 10.1038/ng.3403. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Table S1. Sample processing including detailed information on library preparation, pipeline statistics and read statistics for sequenced libraries and ancient reference samples, related to Methods S1 and STAR Methods
mmc1.xlsx (519.9KB, xlsx)
Table S2. Genomic and sampling information about all 102 genomes used in our study, i.e., 15 newly sequenced ancient genomes, 10 ancient genomes from the literature and 77 modern genomes selected within the SGDP panel, related to Figure 2, Methods S1, and STAR Methods

This table includes in particular the depths of coverage before and after filtering, the genomic heterozygosity shown in Figure 2B, and the average nucleotide pairwise differences between individuals used in Figure 2A and Figure M1_5.

mmc2.xlsx (785.2KB, xlsx)
Table S3. Detailed information for mtDNA and Y haplogroups, including quality information and defining markers, for newly-sequenced individuals (in black) and published individuals (in gray); and results for pigmentation phenotypes as reported by the HIrisPlex-S webtool for the newly-sequenced individuals (in black) and published individuals (in gray), including the raw results files as well as our combined interpretation, related to Methods S1
mmc3.xlsx (460.6KB, xlsx)
Table S4. Parameters inferred for six different panels, related to Figure 3 and Methods S1

For each panel, we provide the maximum estimated log-likelihood of all models tested, with the best supported model highlighted in red, their relative likelihood derived from the AIC, the values obtained for the different parameters and the 95% confidence intervals under the best supported models.

mmc4.xlsx (455.3KB, xlsx)
Table S5. Results of f-statistics in the form of D(Population1, Population2, Population3, Outgroup) where we used Mbuti as Outgroup and Population1–3 were samples analysed in the study and relevant reference samples and populations, related to Figures S1 and S4 and Methods S1
mmc5.xlsx (770.2KB, xlsx)
Methods S1. Bioinformatic pipeline, population genetics analyses and demographic inferences, related to STAR Methods, Figures 2, 3, and 4, Figures S1, S2, S3, and S4, and Tables S1, S2, S3, S4, and S5
mmc6.pdf (7.6MB, pdf)
Data S1. Archaeological background, related to Figure 1, Table 1, and STAR Methods
mmc7.pdf (25.8MB, pdf)
Figure360. Animation and narration of Figure 5, related to Figure 5
Download video file (43.5MB, mp4)

Data Availability Statement

  • Raw sequencing data (FASTQ files) and aligned BAM files generated in this study have been deposited at the European Nucleotide Archive (ENA: PRJEB50857) and are publicly available as of the date of publication. Individual accession numbers are listed in the key resources table. Filtered VCF files have been deposited at the European Variant Archive (EVA: PRJEB51919) and are publicly available as of the date of publication. Further archaeological information and additional analyses are available in Data S1 and Methods S1, as well as in the supplemental tables and figures.

  • All original code has been deposited at https://github.com/CMPG/originsEarlyFarmers under DOI https://doi.org/10.5281/zenodo.6367517 and is publicly available as of the date of publication.

  • Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request.
