Nature. 2024 Jan 10;625(7994):301–311. doi: 10.1038/s41586-023-06865-0

Population genomics of post-glacial western Eurasia

Morten E Allentoft 1,2,✉,#, Martin Sikora 1,✉,#, Alba Refoyo-Martínez 1,#, Evan K Irving-Pease 1,#, Anders Fischer 1,3,4,#, William Barrie 5,6,#, Andrés Ingason 1,7,#, Jesper Stenderup 1, Karl-Göran Sjögren 3, Alice Pearson 6, Bárbara Sousa da Mota 8,9, Bettina Schulz Paulsson 3, Alma Halgren 10, Ruairidh Macleod 1,5,11,12, Marie Louise Schjellerup Jørkov 13, Fabrice Demeter 1,14, Lasse Sørensen 15, Poul Otto Nielsen 15, Rasmus A Henriksen 1, Tharsika Vimala 1, Hugh McColl 1, Ashot Margaryan 16,17, Melissa Ilardo 18, Andrew Vaughn 19, Morten Fischer Mortensen 15, Anne Birgitte Nielsen 20, Mikkel Ulfeldt Hede 21, Niels Nørkjær Johannsen 22, Peter Rasmussen 15, Lasse Vinner 1, Gabriel Renaud 23, Aaron Stern 19, Theis Zetner Trolle Jensen 16, Gabriele Scorrano 1, Hannes Schroeder 16, Per Lysdahl 24, Abigail Daisy Ramsøe 1, Andrei Skorobogatov 25, Andrew Joseph Schork 6,26, Anders Rosengren 1,6, Anthony Ruter 1, Alan Outram 27, Aleksey A Timoshenko 28, Alexandra Buzhilova 29, Alfredo Coppa 30, Alisa Zubova 31, Ana Maria Silva 32,33, Anders J Hansen 1, Andrey Gromov 31, Andrey Logvin 34, Anne Birgitte Gotfredsen 1, Bjarne Henning Nielsen 35, Borja González-Rabanal 36, Carles Lalueza-Fox 37,38, Catriona J McKenzie 27, Charleen Gaunitz 1, Concepción Blasco 39, Corina Liesau 39, Cristina Martinez-Labarga 40, Dmitri V Pozdnyakov 28, David Cuenca-Solana 41,42, David O Lordkipanidze 43,44, Dmitri En’shin 45, Domingo C Salazar-García 46,47, T Douglas Price 3,48, Dušan Borić 30,49, Elena Kostyleva 50, Elizaveta V Veselovskaya 51, Emma R Usmanova 52,53,54,55, Enrico Cappellini 16, Erik Brinch Petersen 56, Esben Kannegaard 57, Francesca Radina 58, Fulya Eylem Yediay 1, Henri Duday 59, Igor Gutiérrez-Zugasti 41, Ilya Merts 60, Inna Potekhina 61,62, Irina Shevnina 34, Isin Altinkaya 1, Jean Guilaine 63, Jesper Hansen 64, Joan Emili Aura Tortosa 46, João Zilhão 33,65, Jorge Vega 66, Kristoffer Buck Pedersen 67, Krzysztof Tunia 68, Lei Zhao 1, Liudmila N Mylnikova 28, Lars Larsson 
69, Laure Metz 70, Levon Yepiskoposyan 71,72, Lisbeth Pedersen 73, Lucia Sarti 74, Ludovic Orlando 75, Ludovic Slimak 75, Lutz Klassen 57, Malou Blank 3, Manuel González-Morales 41, Mara Silvestrini 76, Maria Vretemark 77, Marina S Nesterova 28, Marina Rykun 78, Mario Federico Rolfo 79, Marzena Szmyt 80, Marcin Przybyła 81, Mauro Calattini 74, Mikhail Sablin 82, Miluše Dobisíková 83, Morten Meldgaard 84, Morten Johansen 85, Natalia Berezina 29, Nick Card 86, Nikolai A Saveliev 87, Olga Poshekhonova 45, Olga Rickards 40, Olga V Lozovskaya 88, Olivér Gábor 89, Otto Christian Uldum 85,90, Paola Aurino 91, Pavel Kosintsev 92,93, Patrice Courtaud 59, Patricia Ríos 39, Peder Mortensen 94, Per Lotz 95,96, Per Persson 97, Pernille Bangsgaard 98, Peter de Barros Damgaard 1, Peter Vang Petersen 15, Pilar Prieto Martinez 99, Piotr Włodarczak 68, Roman V Smolyaninov 100, Rikke Maring 23,57, Roberto Menduiña 66, Ruben Badalyan 101, Rune Iversen 56, Ruslan Turin 25, Sergey Vasilyev 51,102, Sidsel Wåhlin 24, Svetlana Borutskaya 29, Svetlana Skochina 45, Søren Anker Sørensen 95, Søren H Andersen 103, Thomas Jørgensen 95, Yuri B Serikov 104, Vyacheslav I Molodin 28, Vaclav Smrcka 105, Victor Merts 106, Vivek Appadurai 6, Vyacheslav Moiseyev 31, Yvonne Magnusson 107, Kurt H Kjær 1, Niels Lynnerup 13, Daniel J Lawson 108, Peter H Sudmant 10,19, Simon Rasmussen 109, Thorfinn Sand Korneliussen 1, Richard Durbin 6,110, Rasmus Nielsen 1,10, Olivier Delaneau 8,9, Thomas Werge 1,6,111, Fernando Racimo 1, Kristian Kristiansen 1,3, Eske Willerslev 1,5,112,
PMCID: PMC10781627  PMID: 38200295

Abstract

Western Eurasia witnessed several large-scale human migrations during the Holocene1–5. Here, to investigate the cross-continental effects of these migrations, we shotgun-sequenced 317 genomes, mainly from the Mesolithic and Neolithic periods, from across northern and western Eurasia. These were imputed alongside published data to obtain diploid genotypes from more than 1,600 ancient humans. Our analyses revealed a ‘great divide’ genomic boundary extending from the Black Sea to the Baltic. Mesolithic hunter-gatherers were highly genetically differentiated east and west of this zone, and the effect of the neolithization was equally disparate. Large-scale ancestry shifts occurred in the west as farming was introduced, including near-total replacement of hunter-gatherers in many areas, whereas no substantial ancestry shifts happened east of the zone during the same period. Similarly, relatedness decreased in the west from the Neolithic transition onwards, whereas, east of the Urals, relatedness remained high until around 4,000 bp, consistent with the persistence of localized groups of hunter-gatherers. The boundary dissolved when Yamnaya-related ancestry spread across western Eurasia around 5,000 bp, resulting in a second major turnover that reached most parts of Europe within a 1,000-year span. The genetic origin and fate of the Yamnaya have remained elusive, but we show that hunter-gatherers from the Middle Don region contributed ancestry to them. Yamnaya groups later admixed with individuals associated with the Globular Amphora culture before expanding into Europe. Similar turnovers occurred in western Siberia, where we report new genomic data from a ‘Neolithic steppe’ cline spanning the Siberian forest steppe to Lake Baikal. These prehistoric migrations had profound and lasting effects on the genetic diversity of Eurasian populations.

Subject terms: Population genetics, Archaeology, Genomics


An analysis involving the shotgun sequencing of more than 300 ancient genomes from Eurasia reveals a deep east–west genetic divide from the Black Sea to the Baltic, and provides insight into the distinct effects of the Neolithic transition on either side of this boundary.

Main

Genetic diversity in west Eurasian human populations was largely shaped by three major prehistoric migrations: anatomically modern hunter-gatherers (HGs) occupying the area from around 45,000 bp (refs. 4,6); Neolithic farmers expanding from the Middle East from around 11,000 bp (ref. 4); and steppe pastoralists coming out of the Pontic Steppe around 5,000 bp (refs. 1,2). Palaeogenomic analyses have uncovered the early post-glacial colonization routes7 that led to a basal ancestral dichotomy between HGs in central and western Europe and HG groups represented further east8. Western HG (WHG) ancestry appears to be derived directly from ancestry sources related to Epigravettian, Azilian and Epipalaeolithic cultures (the Villabruna cluster)9, whereas eastern HG (EHG) ancestry shows further admixture with an Upper Palaeolithic Siberian source (Ancient North Eurasian; ANE)10. The WHG ancestry composition was regionally variable in the Mesolithic populations. There is evidence for continuous local admixture in Iberian HGs11, which contrasts with the more homogeneous WHG ancestry profile in Britain and northwestern continental Europe, suggesting ancestry formation before expansion12. The timing of the ancestry admixture that formed EHG has been estimated at 13,000–15,000 bp, and the composition seems to follow a cline that is broadly correlated with geography, with Baltic and Ukrainian HGs showing more affinity to the Villabruna Upper Palaeolithic cluster ancestry, as compared with HGs in Russia, who exhibited more ANE ancestry5,7,13,14. Genomic analyses of Mesolithic skeletal material from the Scandinavian Peninsula have revealed varied mixes of WHG and EHG ancestry among the later Mesolithic populations3,15,16.

Beyond these broad-scale characterizations, our knowledge about Mesolithic population structure and demographic admixture processes is limited, and has substantial chronological and geographical information gaps. This is partly owing to a relative paucity of well-preserved Mesolithic human skeletons older than 8,000 years, and partly because most ancient DNA studies on the Mesolithic and Neolithic periods have been restricted to individuals from Europe. The archaeological record indicates a boundary from the eastern Baltic to the Black Sea, east of which HG societies persisted for much longer than in western Europe, despite the similar distance to the distribution centre for early agriculture in the Middle East17. Components of eastern and western HG ancestry appear highly variable in this boundary region5,18,19 but the wider spatiotemporal genetic implications of the east–west division are unclear. The spatiotemporal mapping of population dynamics east of Europe, including northern and central Asia during the same time period, is limited. In these regions, the term ‘Neolithic’ is characterized by cultural and economic changes including societal-network differences, changes in lithic technology and use of pottery. For instance, the Neolithic cultures of the central Asian steppe and the Russian taiga belt possessed pottery, but retained a HG economy alongside stone-blade technology, similar to the preceding Mesolithic cultures20. A fundamental lack of data from some key regions and periods has made it difficult to gain a deeper understanding of how the neolithization differed in its timing, mechanisms and effects across northern and western Eurasia.

The transition from hunting and gathering to farming was based on domesticated plants and animals of Middle Eastern origin, and represents one of the most fundamental shifts in demography, health, lifestyle and culture in human prehistory. The neolithization process in large parts of Europe was accompanied by the arrival of immigrants of Anatolian descent21. For example, in Iberia, the Neolithic began with the abrupt spread of immigrant farmers of Anatolian–Aegean ancestry along the Mediterranean and Atlantic coasts, after which admixture with local HGs gradually took place11. Similarly, in southeastern and central Europe, farming rapidly spread with Anatolian Neolithic farmers, who were to some extent subsequently admixed with local HGs22–27. Conversely, in Britain, data suggest that there was a complete replacement of the HG population when agriculture was introduced by incoming continental farmers, without a subsequent resurgence of local HG ancestry12,28. In the east Baltic region, a markedly different neolithization trajectory occurred, with the introduction of domesticates only at the emergence of the Corded Ware complex (CWC) around 4,800 calibrated years before present (cal. bp) (refs. 18,19). Similarly, in eastern Ukraine, HGs of Mesolithic ancestry co-existed for millennia with farming groups further west5,29. These studies have all provided important regional contributions to the understanding of west Eurasian population history, but from a broader cross-continental perspective, our knowledge is still patchy.

From approximately 5,000 bp, an ancestry component related to Early Bronze Age steppe pastoralists such as the Yamnaya culture rapidly spread across Europe through the expansion of the CWC and related cultures1,2. Although previous studies have identified these large-scale migrations into Europe and central Asia, central aspects concerning the demographic processes are not resolved. Yamnaya ancestry (that is, ‘steppe’ ancestry) has been characterized broadly as a mix between EHG ancestry and Caucasus hunter-gatherer (CHG), formed in a hypothetical admixture between a ‘northern’ steppe source and a ‘southern’ Caucasus source30. However, the exact origins of these ancestry sources have not been identified. Furthermore, with a few exceptions31–33, published Yamnaya Y-chromosomal haplogroups do not match those found in Europeans after 5,000 bp, and the origin of this patrilineal lineage is also unresolved. Finally, in Europe, ‘steppe’ ancestry has hitherto been identified only in admixed form, but the origin of this admixture event and the mechanism by which the ancestry subsequently spread with the CWC have remained elusive.

To investigate these formative processes at a cross-continental scale, we sequenced the genomes of 317 radiocarbon-dated (by accelerator mass spectrometry) individuals of mainly Mesolithic and Neolithic origin, covering major parts of Eurasia. We combined these with published shotgun-sequenced data to impute a dataset of more than 1,600 diploid ancient genomes. Of the 317 sampled ancient skeletons (Fig. 1, Extended Data Fig. 1 and Supplementary Data 1), 272 were radiocarbon-dated within the project, 30 dates were derived from published literature and 15 examples were dated by archaeological context. Dates were corrected for marine and freshwater reservoir effects (Supplementary Note 4) and ranged from the Upper Palaeolithic around 25,700 cal. bp to the mediaeval period (around 1,200 cal. bp). However, 97% of the individuals (n = 309) date to between 11,000 and 3,000 cal. bp, with a heavy focus on individuals associated with various Mesolithic and Neolithic cultures. Geographically, the 317 sampled skeletons cover a vast territory across Eurasia, from Lake Baikal to the Atlantic coast and from Scandinavia to the Middle East, deriving from contexts that include burial mounds, caves, bogs and the sea floor (Supplementary Notes 6 and 7). Broadly, we can divide our research area into three large regions: (1) central, western and northern Europe; (2) eastern Europe, including western Russia, Belarus and Ukraine; and (3) the Urals and western Siberia (Supplementary Notes 6 and 7). Samples cover many of the key Mesolithic and Neolithic cultures in western Eurasia, such as the Maglemose, Ertebølle, Funnel Beaker (TRB) and Corded Ware/Single Grave cultures in Scandinavia; the Cardial in the Mediterranean; the Körös and Linear Pottery (LBK) in southeastern and central Europe; and many archaeological cultures in Ukraine, western Russia and the trans-Ural region (for example, Veretye, Lyalovo, Volosovo and Kitoi). 
Our sampling was particularly dense in Denmark, from where an accompanying paper presents a detailed and continuous sequence of 100 genomes spanning the Early Mesolithic to the Bronze Age34. Dense sampling was also obtained from Ukraine, western Russia and the trans-Ural region, spanning the Early Mesolithic through to the Neolithic, up to around 5,000 bp.

Fig. 1. Sample overview and broad-scale genetic structure.

a,b, Geographical (a) and temporal (b) distribution of the 317 ancient genomes sequenced and reported in this study. Insert shows dense sampling in Denmark34. The age and the geographical region of ancient individuals are indicated by the colour and the shape of the symbols, respectively. Colour scale for age is capped at 15,000 years; older individuals are indicated with black. Random jitter was added to geographical coordinates to avoid overplotting. c,d, PCA of 3,316 modern and ancient individuals from Eurasia, Oceania and the Americas (c), and restricted to 2,126 individuals from western Eurasia (west of the Urals) (d). Principal components were defined using both modern and imputed ancient (n = 1,492) genomes passing all filters, with the remaining low-coverage ancient genomes projected. Ancient genomes sequenced in this study are indicated with black circles (imputed genomes passing all filters, n = 213) or grey diamonds (pseudo-haploid projected genomes; n = 104). Genomes of modern individuals are shown in grey, with population labels corresponding to their median coordinates. BA, Bronze Age.

Extended Data Fig. 1. Genetic structure of the 317 herein-reported ancient genomes.

a–d, PCA of 3,316 modern and ancient individuals from Eurasia, Oceania and the Americas (a,b), as well as restricted to 2,126 individuals from western Eurasia (west of the Urals) (c,d). Shown are analyses with principal components inferred either using both modern and imputed ancient genomes passing all filters, and projecting low coverage ancient genomes (a,c); or only modern genomes and projecting all ancient genomes (b,d). Ancient genomes sequenced in this study are indicated either with black circles (imputed genomes) or grey diamonds (projected genomes). e, Model-based clustering results using ADMIXTURE for 284 newly reported genomes (excluding close relatives and individuals flagged for possible contamination). Results shown are based on ADMIXTURE runs from K = 2 to K = 15 on 1,593 ancient individuals, corresponding to the full set of 1,492 imputed genomes passing filters as well as 101 low coverage genomes represented by pseudo-haploid genotypes (flags “lowcov” or “lowGpAvg”, Supplementary Data 7; indicated with alpha transparency in plot).

Broad-scale genetic structure

Ancient DNA was extracted from either dental cementum or petrous bones, and the 317 genomes were shotgun-sequenced to a depth of coverage ranging between 0.01× and 7.1× (mean, 0.75×, median, 0.26×), with more than 1× coverage for 81 genomes (Supplementary Note 1). We used a computational method optimized for low-coverage data35 to impute genotypes using the 1000 Genomes phased data36 as a reference panel. This method was jointly applied to more than 1,300 previously published shotgun-sequenced genomes (Supplementary Data 7), resulting in a dataset of 8.5 million common single-nucleotide polymorphisms (SNPs) (with a minor allele frequency (MAF) greater than 1% and an imputation INFO score greater than 0.5) for 1,664 imputed diploid ancient genomes (Extended Data Fig. 2). For most downstream analyses, n = 71 individuals were excluded because they were found to be close relatives or because the estimated contamination was greater than 5%. This resulted in 1,593 genomes, of which 1,492 were analysed as imputed (213 sequenced in this study) and 101 were analysed as pseudo-haploids owing to low coverage (less than 0.1×) and/or low imputation quality (average genotype probability lower than 0.98).
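
The filtering rules described above can be summarized in a short sketch. The Python below is illustrative only and is not the pipeline used in the study: the thresholds are taken from the text, whereas the `Genome` record, its field names, the `classify` and `keep_snp` helpers, and the order in which the checks are applied are assumptions made for the example.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; all names below are hypothetical.
MIN_COVERAGE = 0.1        # depth of coverage (x) below which a genome is not imputed
MIN_MEAN_GP = 0.98        # minimum average genotype probability after imputation
MAX_CONTAMINATION = 0.05  # maximum estimated contamination fraction


@dataclass
class Genome:
    sample_id: str
    coverage: float
    mean_gp: float
    contamination: float
    close_relative: bool = False  # assumed flag for the relative that is dropped


def classify(g: Genome) -> str:
    """Return 'excluded', 'pseudo_haploid' or 'imputed' for one genome."""
    # Assumption: exclusions are applied before the coverage/quality split.
    if g.close_relative or g.contamination > MAX_CONTAMINATION:
        return "excluded"
    if g.coverage < MIN_COVERAGE or g.mean_gp < MIN_MEAN_GP:
        return "pseudo_haploid"
    return "imputed"


def keep_snp(maf: float, info: float) -> bool:
    """SNP filter from the text: MAF greater than 1% and INFO greater than 0.5."""
    return maf > 0.01 and info > 0.5
```

Under these rules the reported counts are internally consistent: 1,664 imputed genomes minus 71 exclusions leaves 1,593, which splits into 1,492 imputed and 101 pseudo-haploid genomes.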

Extended Data Fig. 2. Imputation accuracy of ancient DNA.

a, Imputation accuracy across 42 high-coverage ancient genomes when downsampled to lower depth of coverage values (see Supplementary Note 2 and Supplementary Table 2.1). b, Imputation accuracy for 1× depth of coverage across 9 prehistoric European genomes; c, across 5 Viking age genomes; and d, across 7 ancient genomes from Early Medieval Hungary. In all panels, imputation accuracy is shown as the squared Pearson correlation between imputed and true genotype dosages as a function of MAF of the target variant sites.
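
The accuracy metric named in this legend, the squared Pearson correlation between imputed and true genotype dosages as a function of MAF, can be sketched as follows. This is a minimal illustration and not the evaluation code behind the figure: the function names, the record layout and the choice to pool all sites within a MAF bin before computing the correlation are assumptions.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined without variance
    return (sxy * sxy) / (sxx * syy)


def r2_by_maf_bin(records, bin_edges):
    """Group (maf, imputed_dosage, true_dosage) records into MAF bins.

    bin_edges is an increasing list such as [0.01, 0.05, 0.5]; a record falls
    into the bin (lo, hi] with lo < maf <= hi. Returns {(lo, hi): r2}.
    """
    bins = {}
    for maf, imputed, true in records:
        for lo, hi in zip(bin_edges, bin_edges[1:]):
            if lo < maf <= hi:
                imp_list, true_list = bins.setdefault((lo, hi), ([], []))
                imp_list.append(imputed)
                true_list.append(true)
                break
    return {b: pearson_r2(imp, tru) for b, (imp, tru) in bins.items()}
```

Dosages here are expected alternate-allele counts between 0 and 2, so an imputed dosage of 1.8 against a true genotype of 2 contributes a small error rather than a full mismatch.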

We conducted a broad-scale characterization of this dataset using principal component analysis (PCA) and model-based clustering (ADMIXTURE), which recapitulated previously described ancestry clines in ancient Eurasian populations at increased resolution (Fig. 1, Extended Data Fig. 1 and Supplementary Note 3d). Our imputed whole genomes allowed us to perform PCA using ancient genomes as input, instead of projecting onto a space defined by modern variation. Notably, this resulted in much higher differentiation among the ancient individuals than observed previously (Extended Data Fig. 1). This is particularly notable in a PCA of west Eurasian individuals, in which the variance explained by the first two PCs increases more than 1.5-fold, and present-day populations are confined within a small central area of the PCA space (Fig. 1d and Extended Data Fig. 1c,d). These results are consistent with the genetic differentiation between ancient Europeans being higher than is observed in present-day populations, reflecting more genetic isolation and lower effective population sizes among ancient groups.
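
The distinction between defining principal components from a set of genomes and projecting others onto them can be illustrated with a small sketch. It is not the software used in the study: it computes only the first component by power iteration, omits the genotype standardization that PCA tools for genetic data typically apply, and all function names are hypothetical.

```python
def center(rows, means):
    """Subtract per-SNP means from each genotype row."""
    return [[v - m for v, m in zip(r, means)] for r in rows]


def top_pc(rows, iters=200):
    """Leading principal axis of already-centred rows via power iteration."""
    n_feat = len(rows[0])
    v = [1.0 / (i + 1) for i in range(n_feat)]  # arbitrary non-zero start
    for _ in range(iters):
        # w = X^T (X v): apply the unnormalized covariance matrix to v
        scores = [sum(a * b for a, b in zip(r, v)) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(n_feat)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def fit_and_project(reference, projected):
    """Define PC1 from `reference` genomes only, then score both sets on it."""
    n = len(reference)
    means = [sum(col) / n for col in zip(*reference)]
    axis = top_pc(center(reference, means))

    def score(rows):
        return [sum(a * b for a, b in zip(r, axis)) for r in center(rows, means)]

    return score(reference), score(projected)
```

Because the axis and the means are estimated from the reference set alone, projected genomes cannot influence the component. This is why letting imputed ancient genomes define the space, rather than projecting them onto modern variation, can reveal differentiation among ancient individuals that a modern-defined space compresses.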

To obtain a finer-scale characterization of genetic ancestries across space and time, we used an approach similar to the widely used ChromoPainter–FineSTRUCTURE workflow37–39. We first performed community detection on a network constructed from pairwise identity-by-descent (IBD)-sharing similarities between ancient individuals to group them into hierarchically related clusters of similar genetic ancestry (Extended Data Fig. 3 and Supplementary Note 3c). At higher levels of the hierarchy, the resulting clusters represented previously described ancestry groups reflecting broad genetic structure, such as EHGs and WHGs (‘HG_EuropeE’ and ‘HG_EuropeW’; Extended Data Fig. 3). Clusters at the lowest level resolved fine-scale genetic structure, grouping individuals within restricted spatiotemporal ranges and/or archaeological contexts but also revealing previously unknown connections across broader geographical areas (Extended Data Fig. 3 and Supplementary Note 3f). These resulting clusters were subsequently used in supervised ancestry modelling, in which sets of ‘target’ individuals were modelled as mixtures of ‘source’ groups (Methods).

Extended Data Fig. 3. Genetic clustering of ancient individuals.

Characterization of genetic clusters for 1,401 imputed ancient individuals from Eurasia (that is, excluding 91 individuals from Africa and Americas), inferred from pairwise IBD sharing (indicated using coloured symbols throughout). a, Temporal distribution of clustered individuals, grouped by broad ancestry cluster. b,c, Geographical distribution of clustered individuals, shown for individuals predating 3,000 bp (b) and after 3,000 bp (c). d, Network graph of pairwise IBD sharing between 596 ancient Eurasians predating 3,000 bp, highlighting within- and between-cluster relationships. Each node represents an individual, and the width of edges connecting nodes indicates the fraction of the genome shared IBD between the respective pair of individuals. Network edges were restricted to the 10 highest sharing connections for each individual, and the layout was computed using the force-directed Fruchterman-Reingold algorithm. e, Neighbour-joining tree showing relationships between genetic clusters, inferred using total variation distance (TVD) of IBD painting palettes. f,g, PCA of 3,119 Eurasian (f) or 2,126 west Eurasian (g) ancient and modern individuals (“HO” dataset).
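
Two operations mentioned in this legend lend themselves to a compact illustration: restricting the IBD network to each individual's highest-sharing connections, and comparing painting palettes by total variation distance. The sketch below is an assumption-laden example rather than the authors' code. It takes TVD to be half the summed absolute difference between two palettes that each sum to 1, and the function names and input layouts are hypothetical.

```python
def tvd(p, q):
    """Total variation distance between two palettes.

    p and q map donor-cluster labels to proportions that each sum to 1.
    Assumed definition: 0.5 * sum of absolute differences.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def top_k_edges(ibd, k=10):
    """Keep, for every individual, its k highest-sharing connections.

    ibd maps unordered pairs (a, b) to the fraction of the genome shared IBD.
    An edge survives if it is in the top k of either endpoint.
    """
    by_node = {}
    for (a, b), w in ibd.items():
        by_node.setdefault(a, []).append((w, b))
        by_node.setdefault(b, []).append((w, a))
    edges = {}
    for node, partners in by_node.items():
        for w, other in sorted(partners, reverse=True)[:k]:
            edges[frozenset((node, other))] = w
    return edges
```

A matrix of pairwise `tvd` values between cluster-level palettes is the kind of distance input from which a neighbour-joining tree such as the one in panel e can be built.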

Population structure of HGs after the LGM

Our study comprises 113 shotgun-sequenced and imputed HG genomes, of which 79 were sequenced in this study. Among them, we report a 0.83× (0.83-fold coverage) genome of an Upper Palaeolithic skeleton from Kotias Klde Cave in Georgia, Caucasus (NEO283), directly dated to 26,052–25,323 cal. bp (95% confidence interval). In the PCA of all non-African individuals, this individual occupied a position distinct from those of other previously sequenced Upper Palaeolithic individuals—shifted towards west Eurasians along PC1 (Supplementary Note 3d). Using admixture graph modelling, we find that a well-fitting graph for this Caucasus Upper Palaeolithic lineage derives it as a mixture of predominantly west Eurasian Upper Palaeolithic HG ancestry (76%), with a contribution of about 24% from a ‘basal Eurasian’ ghost population, first observed in west Asian Neolithic individuals4 (Supplementary Note 3d and Supplementary Fig. 3d.16). To further explore the fine-scale structure of later European HGs, we then performed supervised ancestry modelling using sets of increasingly proximate source clusters (Extended Data Fig. 4). We replicate previous results of broad-scale genetic differentiation between HGs in eastern and western Europe after the Last Glacial Maximum (LGM)5,7. We show that the deep ancestry divisions in the Eurasian human gene pool that were established during early post-LGM dispersals7 persisted throughout the Mesolithic (Extended Data Fig. 4). Using distal sets of pre-LGM HGs as sources, we modelled western HGs as predominantly derived from a source related to the herein-reported Caucasus Upper Palaeolithic individual from Kotias Klde cave (Caucasus_25000BP), whereas eastern HGs showed varying amounts of ancestry related to a Siberian HG from Mal’ta (Malta_24000BP; Extended Data Fig. 4a and Supplementary Data 12). 
Using post-LGM sources, this divide is best represented by ancestry related to southern European (Italy_15000BP_9000BP) and Russian (RussiaNW_11000BP_8000BP) HGs, respectively, corresponding to the ‘WHG’ and ‘EHG’ labels commonly used in previous studies.

Extended Data Fig. 4. Genetic structure of European HGs after the LGM.

a, Supervised ancestry modelling using non-negative least squares on IBD sharing profiles. Panels show estimated ancestry proportions for target individuals from genetic clusters representing European HGs, using different sets of increasingly proximal source groups. Individuals used as sources in a particular set are indicated with black crosses and coloured bars with 100% ancestry proportion. Black lines indicate 1 standard error for the respective ancestry component. b, Residuals for model fit of target individuals from selected genetic clusters across different source sets. c, Moon charts showing spatial distribution of ancestry proportions in European HGs deriving from four European source groups (set “hgEur2”; source origins shown with coloured symbol). Estimated ancestry proportions are indicated by both size and amount of fill of moon symbols. Note that ‘Italy_15000BP_9000BP’ and ‘RussiaNW_11000BP_8000BP’ correspond to ‘WHG’ and ‘EHG’ labels used in previous studies. d, Maps showing networks of highest between-cluster IBD sharing (top 10 highest sharing per individual) for individuals from two genetic clusters representing Scandinavian HGs. See Supplementary Data 1 and 7 for details of individual sample IDs presented here.
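
The idea of non-negative least squares on sharing profiles can be sketched as follows: a target's profile is approximated by a mixture of source profiles with weights constrained to be non-negative. The example is a minimal sketch under stated assumptions and not the authors' implementation. It uses a simple coordinate-descent solver written for this illustration (in practice a library routine such as `scipy.optimize.nnls` would be the more usual choice), rescaling the weights to sum to 1 is an assumption, and the function name and toy profiles are hypothetical.

```python
def nnls_mixture(sources, target, sweeps=500):
    """Approximate `target` as a non-negative mixture of `sources`.

    sources: list of source profiles (lists of equal length).
    target:  one profile of the same length.
    Returns (proportions, residual_norm); proportions are the non-negative
    weights rescaled to sum to 1 (rescaling is an assumption of this sketch).
    """
    x = [0.0] * len(sources)
    resid = list(target)  # residual target - sum_j x[j] * sources[j], with x = 0
    sq = [sum(v * v for v in s) for s in sources]
    for _ in range(sweeps):
        for j, s in enumerate(sources):
            if sq[j] == 0:
                continue
            # Exact minimizer along coordinate j, clipped at zero.
            step = sum(a * r for a, r in zip(s, resid)) / sq[j]
            new = max(0.0, x[j] + step)
            delta = new - x[j]
            if delta:
                resid = [r - delta * a for r, a in zip(resid, s)]
                x[j] = new
    total = sum(x)
    props = [v / total for v in x] if total > 0 else x
    return props, sum(r * r for r in resid) ** 0.5


# Toy example: a target sharing 70% with source 0 and 30% with source 1.
proportions, residual = nnls_mixture([[1, 0, 0], [0, 1, 0]], [0.7, 0.3, 0.0])
```

The residual norm returned alongside the proportions plays the role of the model-fit residuals shown in panel b: a large residual indicates that none of the supplied sources, in any non-negative combination, reproduces the target's profile well.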

Adding extra proximate sources allowed us to further refine the ancestry composition of northern European HGs. In Denmark, our 28 sequenced and imputed HG genomes derived almost exclusively from a southern European source (Italy_15000BP_9000BP), with notable homogeneity across a 5,000-year transect34 (Extended Data Fig. 4a and Supplementary Data 12). By contrast, we observed marked geographical variation in the ancestry composition of HGs from other parts of Scandinavia. Mesolithic individuals from Scandinavia were broadly modelled as mixtures with varying proportions of eastern and western HGs using distal post-LGM sources (‘hgEur1’; Extended Data Fig. 4a), as previously reported15. In Mesolithic individuals from southern Sweden, the eastern HG ancestry component was largely replaced by a southeastern European source (Romania_8800BP) in more proximate models, making up between 60% and 70% of the ancestry (Extended Data Fig. 4a and Supplementary Data 12). Ancestry related to Russian HGs increased in a cline towards the far north, peaking at around 75% in a late HG from Tromsø (VK531; around 4,350 bp) (Extended Data Fig. 4a,c and Supplementary Data 12); this was also reflected in the fact that those individuals shared the highest IBD with northern Russian HGs (Extended Data Fig. 4d). During the late Mesolithic, we observed higher southern European HG ancestry in coastal individuals (NEO260 from Evensås and NEO679 from Skateholm) than in earlier individuals from further inland. Adding Danish HGs as a proximate source substantially improved the fit for those two individuals (‘hgEur3’; Extended Data Fig. 4b), with an estimated 58–76% of ancestry derived from Danish HGs (‘hgEur3’; Extended Data Fig. 7a and Supplementary Data 12), suggesting a population genetic link with Denmark, where this ancestry prevailed (Extended Data Fig. 4c). 
These results indicate that there were at least three distinct waves of northwards HG ancestry into Scandinavia: (1) a predominantly southern European source into Denmark and coastal southwestern Sweden; (2) a source related to southeastern European HGs into the Baltic and southeastern Sweden; and (3) a northwest Russian source into the far north, which then spread south along the Atlantic coast of Norway15 (Extended Data Fig. 4c). These movements are likely to represent post-glacial expansions from refugial areas shared with many plant and animal species40.

Extended Data Fig. 7. Ancestry modelling for genetic clusters east of the Urals.

Supervised ancestry modelling using non-negative least squares on IBD sharing profiles. Panels show estimated ancestry proportions of a global Eurasian cluster corresponding to central, east and north Asian individuals with east Eurasian genetic affinities. Columns show results of modelling target individuals using three panels of increasingly distal source groups: “postBA”: Bronze Age and Neolithic source groups; “postNeol”, Bronze Age and later targets using Late Neolithic/early Bronze Age and earlier source groups; “deep”, Mesolithic and later targets using deep ancestry source groups. Individuals used as sources in a particular set are indicated with black crosses and coloured bars with 100% ancestry proportion. Black lines indicate 1 standard error for the respective ancestry component.

On the Iberian Peninsula, the earliest individuals, including an approximately 9,200-year-old HG (NEO694) from Santa Maira (eastern Spain), sequenced in this study, showed predominantly southern European HG ancestry, with a minor contribution from Upper Palaeolithic HG sources (Extended Data Fig. 4a). This observed Upper Palaeolithic HG ancestry source mix is likely to reflect the pre-LGM Magdalenian-related ancestry component that has previously been reported in Iberian HGs11, for which a good source population proxy is lacking in our dataset. By contrast, later individuals from northern Iberia were more similar to HGs from southeastern Europe, deriving around 30–40% of their ancestry from a source related to HGs from the Balkans in more proximate models11,41 (Extended Data Fig. 4a and Supplementary Data 12). The earliest evidence for this gene flow was observed in a Mesolithic individual from El Mazo, Spain (NEO646) who was dated, calibrated and reservoir-corrected to around 8,200 bp (8,365–8,182 cal. bp; 95%) but dated slightly earlier by context42 (8,550–8,330 bp). The directly dated age coincides with some of the oldest Mesolithic geometric microliths in northern Iberia, appearing around 8,200 bp at this site42. An influx of southeastern European HG-related ancestry in Ukrainian individuals after the Mesolithic (Extended Data Fig. 4a and Supplementary Data 12) suggests a similar eastward expansion in southeastern Europe5. Of note, two newly reported approximately 7,300-year-old genomes from the Middle Don River region in the Pontic-Caspian steppe (Golubaya Krinitsa, NEO113 and NEO212) were found to be predominantly derived from earlier Ukrainian HGs, but with around 18–24% of their ancestry contributed from a source related to HGs from the Caucasus (Caucasus_13000BP_10000BP) (Extended Data Fig. 4a and Supplementary Data 12). Further lower-coverage (non-imputed) genomes from the same site project in the same PCA space (Fig. 1d), shifted away from the European HG cline towards Iran and the Caucasus. Using the linkage-disequilibrium-based method DATES43, we dated this admixture to around 8,300 bp (Supplementary Data 14). These results document genetic contact between populations from the Caucasus and the steppe region that is much earlier than previously known, providing evidence of admixture before the advent of later nomadic steppe cultures, in contrast with recent hypotheses, and further to the west than has been previously reported5,44.

Major genetic transitions in Europe

Previous ancient genomics studies have documented several episodes of large-scale population turnover in Europe within the past 10,000 years (see, for example, refs. 1,2,5,45), but the 317 genomes reported here fill important knowledge gaps. Our analyses reveal profound differences in the spatiotemporal neolithization dynamics across Europe. Supervised admixture modelling (using the ‘deep’ ancestry set; Supplementary Data 11) and spatiotemporal kriging46 document a broad east–west distinction along a boundary zone running from the Black Sea to the Baltic. On the western side of this ‘great divide’, the Neolithic transition is accompanied by large-scale shifts in genetic ancestry from local HGs to farmers with Anatolian-related ancestry (Boncuklu_10000BP; Fig. 2a, Fig. 3 and Extended Data Figs. 5–7). The arrival of Anatolian-related ancestry in different regions spans an extensive time period of more than 3,000 years, from its earliest evidence in the Balkans (Lepenski Vir) at around 8,700 bp (ref. 5) to around 5,900 bp in Denmark.

Fig. 2. Genetic ancestry transects of western Eurasia.


a, Regional timelines of genetic ancestry compositions within the past 12,000 years in western Eurasia. Ancestry proportions in 1,012 imputed ancient genomes (representing populations west of the Urals) inferred using supervised ancestry modelling with the ‘deep’ HG ancestry source groups. Coloured bars within the timelines represent ancestry proportions for temporally consecutive individuals, with the width corresponding to their age difference. Individuals with identical age were offset along the time axis by adding random jitter. b, Map highlighting geographical areas (coloured areas) for samples included in the individual regional timelines, and excavation locations (black crosses). Only shotgun-sequenced genomes were used in our study, so the exact timing of ancestry shifts might differ slightly from previous studies if they are based on different types of data from different individuals.

Fig. 3. Spatiotemporal kriging analysis of major ancestries.


The temporal transects show how WHG ancestry (Italy_15000BP_9000BP) was replaced by Neolithic farmer ancestry (Boncuklu_10000BP) during the Neolithic transition in Europe. Later, the steppe migrations around 5,000 cal. bp introduced both EHG (MiddleDon_7500BP) and CHG (Caucasus_13000BP_10000BP) ancestry into Europe, thereby reducing Neolithic farmer ancestry.

Extended Data Fig. 5. Ancestry modelling for HG and Neolithic farmer-associated genetic clusters.


Supervised ancestry modelling using non-negative least squares on IBD sharing profiles. Panels show estimated ancestry proportions of two global Eurasian clusters, corresponding to European HGs before 4,000 bp and individuals from Europe and western Asia from around 10,000 bp until historical times, including Anatolian-associated (Neolithic) farmers, Caucasus HGs and recent individuals with genetic affinity to the Levant. Columns show results of modelling target individuals using three panels of increasingly distal source groups: “postBA”: Bronze Age and Neolithic source groups; “postNeol”, Bronze Age and later targets using Late Neolithic/early Bronze Age and earlier source groups; “deep”, Mesolithic and later targets using deep ancestry source groups. Individuals used as sources in a particular set are indicated with black crosses and coloured bars with 100% ancestry proportion. Black lines indicate 1 standard error for the respective ancestry component.
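The non-negative least squares step named in this caption can be illustrated with a minimal sketch. It assumes that each source group and each target individual is summarised as a numeric vector (a hypothetical stand-in for an IBD sharing profile), and that the fitted non-negative weights are rescaled to sum to one to give proportions; both points are assumptions, not details stated in the text. The solver is a plain projected gradient descent chosen for brevity, not the routine used in the study.

```python
def nnls_projected_gradient(sources, target, steps=5000):
    """Minimise ||S w - t||^2 subject to w >= 0 by projected gradient descent.

    sources: list of source profiles (lists of floats, same length as target).
    target:  profile of the individual being modelled.
    """
    k, n = len(sources), len(target)
    # Gram matrix G = S^T S and right-hand side b = S^T t.
    gram = [[sum(sources[i][m] * sources[j][m] for m in range(n))
             for j in range(k)] for i in range(k)]
    rhs = [sum(sources[i][m] * target[m] for m in range(n)) for i in range(k)]
    # trace(G) bounds the largest eigenvalue of G, so 1/trace(G) is a safe step.
    step = 1.0 / sum(gram[i][i] for i in range(k))
    w = [1.0 / k] * k
    for _ in range(steps):
        grad = [sum(gram[i][j] * w[j] for j in range(k)) - rhs[i]
                for i in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[i] - step * grad[i]) for i in range(k)]
    return w


def ancestry_proportions(sources, target):
    """Rescale NNLS weights to sum to one (assumed normalisation)."""
    weights = nnls_projected_gradient(sources, target)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights are zero; target is not explained by sources")
    return [x / total for x in weights]
```

With toy profiles, a target constructed as 70% of one source and 30% of another should be recovered as proportions close to 0.7, 0.3 and 0 for a third, unused source.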

Furthermore, we corroborate previous reports (for example, refs. 2,5,45,47) of widespread, low-level admixture between early European farmers and local HGs, resulting in a resurgence of HG ancestry in many regions of Europe during subsequent centuries (Extended Data Fig. 8b,c and Supplementary Data 8). The resulting estimated proportions of HG ancestry rarely exceeded 10%, with notable exceptions observed in individuals from southeastern Europe (Iron Gates) and Sweden (Pitted Ware Culture), as well as in the herein-reported Early Neolithic genomes from Portugal (western Cardial), which are estimated to contain 27%–43% Iberian HG ancestry (Iberia_9000BP_7000BP). The latter result, together with an estimated admixture date of just 200 years earlier (‘Iberia farmer early’ in Supplementary Data 14), suggests extensive first-contact admixture, and is in agreement with archaeological inferences derived from modelling the spread of farming across west Mediterranean Europe48. Neolithic individuals from Denmark showed some of the highest overall proportions of HG ancestry (up to around 25%), but this was mostly derived from non-local western European-related HGs (EuropeW_13500BP_8000BP), with only a small contribution from local Danish HG groups in some individuals (Extended Data Fig. 8b and Supplementary Note 3f).

Extended Data Fig. 8. Dynamics of the Neolithic transition in Europe.


a, Supervised ancestry modelling using non-negative least squares on IBD sharing profiles. Panels show estimated ancestry proportions for target individuals from genetic clusters representing European Neolithic farmer individuals (“fEur” source set). Individuals used as sources in a particular set are indicated with black crosses and coloured bars with 100% ancestry proportion. Black lines indicate 1 standard error for the respective ancestry component. b, Composition of HG ancestry proportions from different source groups in individuals with Neolithic farmer ancestry, shown as bar plots. Grey bars represent contributions from a source with ancestry related to local HGs. c, Moon charts showing spatial distribution of estimated ancestry proportions related to local HGs across Europe. Estimated ancestry proportions are indicated by size and amount of fill of moon symbols. Coloured areas indicate the geographical extent of individuals included as local sources in the respective regions. d, Estimated time of admixture between local HG groups and Neolithic farmers. Black diamonds and error bars represent point estimate and standard errors of admixture time, coloured bars show temporal range of included target individuals. The time to admixture was adjusted backwards by the average age of individuals for each region. e, Moon charts showing spatial distribution of estimated ancestry proportions derived from genetic clusters of early Neolithic European farmers (locations indicated with coloured symbols). Estimated ancestry proportions are indicated by size and amount of fill of moon symbols. Red symbols indicate individuals where standard errors exceed the point estimates for the respective ancestry source.
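The adjustment described for panel d, in which the time to admixture is shifted backwards by the average age of the individuals, amounts to simple arithmetic. The sketch below assumes that the admixture time is first estimated in generations before the sampled individuals lived and is then converted to years; the function name and the default of 29 years per generation are illustrative assumptions, not values taken from the text.

```python
def admixture_date_bp(generations_before_sample, mean_sample_age_bp,
                      years_per_generation=29.0):
    """Calendar estimate (years bp) of an admixture event.

    generations_before_sample: admixture time in generations, measured
        backwards from when the sampled individuals lived.
    mean_sample_age_bp: average age of the target individuals, in years bp.
    years_per_generation: assumed generation interval (hypothetical default).
    """
    if generations_before_sample < 0 or years_per_generation <= 0:
        raise ValueError("generations must be >= 0 and generation time > 0")
    # Shift the estimate backwards in time by the age of the samples.
    return mean_sample_age_bp + generations_before_sample * years_per_generation
```

For example, `admixture_date_bp(10, 7000, years_per_generation=30)` places an event 10 generations before individuals of mean age 7,000 bp at 7,300 bp.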

We find evidence for regional stratification in early Neolithic farmer ancestries in subsequent Neolithic groups. Specifically, southern European early farmers were found to have provided major genetic ancestry to Neolithic groups of later dates in western Europe, whereas central European early farmer ancestry was mainly observed in subsequent Neolithic groups in eastern Europe and Scandinavia (Extended Data Fig. 8e). These results are consistent with distinct migratory routes of expanding farmer populations, as previously suggested49.

On the eastern side of the great divide, no ancestry shifts can be observed during this period. In the east Baltic region50, Ukraine and western Russia, local HG ancestry prevailed until around 5,000 bp without a noticeable input of Anatolian-related farmer ancestry (Figs. 2 and 3 and Extended Data Figs. 5–7). This eastern genetic continuity is in congruence with the archaeological record, which shows the persistence of pottery-using forager groups in this wide region, and a delayed introduction of cultivation and animal husbandry by several thousand years (Supplementary Note 5). Around 5,000 bp, major demographic events unfolded on the Eurasian Steppe, resulting in steppe-related ancestry spreading rapidly both eastwards and westwards1,2, marking the end of the great population genomic divide (Figs. 3 and 6). We find that this second transition happened at a faster pace than during the neolithization, reaching most parts of Europe within an approximately 1,000-year time period after first appearing in the eastern Baltic region around 4,800 cal. bp (Fig. 3). In line with previous reports, we observe that by around 4,200 cal. bp, steppe-related ancestry was already dominant in individuals from Britain, France and the Iberian Peninsula12,51. Notably, because of the delayed neolithization in southern Scandinavia, these dynamics resulted in two episodes of large-scale genetic turnover in Denmark and southern Sweden within a period of roughly 1,000 years34 (Fig. 3).

Fig. 6. Genetic relatedness across western Eurasia.


Maps showing networks of highest IBD sharing (top 10 highest sharing per individual) during different time periods for 579 imputed genomes predating 3,000 cal. bp and located in the geographical region shown. Shading and thickness of lines are scaled to represent the amount of IBD shared between two individuals. In the earliest periods, sharing networks exhibit strong links within relatively narrow geographical regions, representing predominantly close genetic ties between small HG communities, and rarely crossing the East–West divide extending from the Baltic to the Black Sea. From around 9,000 cal. bp onwards, a more extensive network with weaker individual ties appears in the south, linking Anatolia to the rest of Europe, as early Neolithic farmer communities spread across the continent. The period 7,000–5,000 cal. bp shows more connected subnetworks of western European and eastern/northern European Neolithic farmers, while locally connected networks of HG communities prevail on the eastern side of the divide. From c. 5,000 bp onwards the divide finally collapses, and continental-wide genetic relatedness unifies large parts of western Eurasia.
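The network construction mentioned in this caption, in which each individual is linked to the partners with whom it shares the most IBD, can be sketched as follows. The input format (a dictionary of pairwise totals in cM) and the function name are hypothetical; only the "top 10 per individual" rule comes from the caption.

```python
from collections import defaultdict


def top_sharing_edges(pair_ibd, k=10):
    """Return network edges keeping, for each individual, its k strongest links.

    pair_ibd: dict mapping (id_a, id_b) -> total IBD shared (cM, assumed unit).
    Returns a dict mapping frozenset({id_a, id_b}) -> shared IBD.
    """
    partners = defaultdict(list)
    for (a, b), shared in pair_ibd.items():
        partners[a].append((shared, b))
        partners[b].append((shared, a))
    edges = {}
    for individual, links in partners.items():
        # Sort partners by shared IBD, highest first, and keep the top k.
        strongest = sorted(links, key=lambda link: link[0], reverse=True)[:k]
        for shared, other in strongest:
            edges[frozenset((individual, other))] = shared
    return edges
```

An edge is kept if it is among the top k for either of its two endpoints, so an individual can end up with more than k links when it ranks highly for many others.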

Although the broader effects of the steppe migrations around 5,000 cal. bp are well known, the origin of this ancestry has remained a mystery. Here we show that the steppe ancestry composition (Steppe_5000BP_4300BP) can be modelled as a mixture of around 65% ancestry related to herein-reported HG genomes from the Middle Don River region (MiddleDon_7500BP) and around 35% ancestry related to HGs from Caucasus (Caucasus_13000BP_10000BP) (Extended Data Fig. 6 and Supplementary Data 9). Thus, Middle Don HGs, who already carried ancestry related to Caucasus HGs (Extended Data Fig. 4a), serve as a hitherto-unknown proximal source for the majority ancestry contribution into Yamnaya-related genomes. The individuals in question derive from the burial ground Golubaya Krinitsa (Supplementary Note 3). Material culture and burial practices at this site are similar to the Mariupol-type graves, which are widely found in neighbouring regions of Ukraine; for instance, along the Dnepr River. They belong to the group of complex pottery-using HGs mentioned above, but the genetic composition at Golubaya Krinitsa is different from that in the remaining Ukrainian sites (Fig. 2a and Extended Data Fig. 5). A previous study30 suggested a model for the formation of Yamnaya ancestry that includes a ‘northern’ steppe source (EHG + CHG ancestry) and a ‘southern’ Caucasus Chalcolithic source (CHG ancestry), but did not identify the exact origin of these sources. The Middle Don genomes analysed here show the appropriate balance of EHG and CHG ancestry, suggesting that they are candidates for the missing northern proximate source for Yamnaya ancestry.

Extended Data Fig. 6. Ancestry modelling for post-Neolithic genetic clusters.


Supervised ancestry modelling using non-negative least squares on IBD sharing profiles. Panels show estimated ancestry proportions of a global Eurasian cluster corresponding to European individuals after 5,000 bp, as well as pastoralist groups from the Eurasian steppe. Columns show results of modelling target individuals using three panels of increasingly distal source groups: “postBA”: Bronze Age and Neolithic source groups; “postNeol”, Bronze Age and later targets using Late Neolithic/early Bronze Age and earlier source groups; “deep”, Mesolithic and later targets using deep ancestry source groups. Individuals used as sources in a particular set are indicated with black crosses and coloured bars with 100% ancestry proportion. Black lines indicate 1 standard error for the respective ancestry component.

The dynamics of the continent-wide transition from Neolithic farmer ancestry to steppe-related ancestry also differ markedly between geographical regions. The contribution of local Neolithic ancestry to the incoming groups was high in eastern, western and southern Europe, reaching more than 50% on the Iberian Peninsula41 (‘postNeol’ set; Extended Data Fig. 6 and Supplementary Data 10). Scandinavia, however, shows a very different picture, with much lower contributions (less than 15%), including near-complete replacement of the local population in some regions (Extended Data Fig. 9b). Steppe-related ancestry accompanies and spreads with the formation of the CWC across Europe, and our results provide new evidence on the foundational admixture event. Individuals associated with the CWC carry a mix of steppe-related and Neolithic farmer-related ancestry; we show that the latter can be modelled as deriving exclusively from a genetic cluster associated with the Late Neolithic Globular Amphora culture (GAC) (Poland_5000BP_4700BP), and that this ancestry co-occurred with steppe-related ancestry across all sampled European regions (Fig. 4a and Extended Data Fig. 6). This suggests that the spread of steppe-related ancestry was predominantly mediated through groups already admixed with GAC-related farmer groups of the eastern European plains—an observation that has major implications for understanding the emergence of the CWC.

Extended Data Fig. 9. Dynamics of the steppe transition in Europe.


a, Estimated time of admixture between local HG groups and Neolithic farmers. Black diamonds and error bars represent point estimate and standard errors of admixture time, coloured bars show temporal range of included target individuals. The time to admixture was adjusted backwards by the average age of individuals for each region. b, Moon charts showing spatial distribution of estimated ancestry proportions related to local Neolithic farmers across Europe. Estimated ancestry proportions are indicated by size and amount of fill of moon symbols. Coloured areas indicate the geographical extent of individuals included as local sources in the respective regions. c, Map showing networks of highest between-cluster IBD sharing (top 10 highest sharing per individual) for individuals from genetic cluster “Steppe_5000BP_4300BP” representing the major steppe ancestry source for Europeans. d, Distributions of difference in estimated steppe-related ancestry proportions, using individuals from the genetic cluster “Steppe_5000BP_4300BP”, associated with either Yamnaya or Afanasievo cultural contexts as separate sources.

Fig. 4. Fine-scale structure and temporal dynamics of steppe-related ancestry during the second transition in Europe.


a, Correlation between the estimated proportions of steppe-related and GAC farmer-related ancestries (‘postNeol’ source set), across west Eurasian target individuals. b, Timeline of difference in estimated steppe-related ancestry proportions, using individuals from the genetic cluster ‘Steppe_5000BP_4300BP’ associated with either Yamnaya or Afanasievo cultural contexts as separate sources. Individuals from European post-Neolithic genetic clusters before 3,000 cal. bp are indicated with coloured symbols; other west Eurasian target individuals are indicated with grey symbols. Symbols with black outlines highlight early steppe-related individuals associated with either Corded Ware or related (for example, Battle Axe) cultural contexts.

A stylistic connection between GAC and CWC ceramics has long been suggested, including the use of amphora-shaped vessels and the development of cord decoration patterns52. Moreover, shortly before the emergence of the earliest CWC groups, eastern GAC and western Yamnaya groups exchanged cultural elements in the forest–steppe transition zone northwest of the Black Sea, where GAC ceramic amphorae and flint axes were included in Yamnaya burials, and the typical Yamnaya use of ochre was included in GAC burials53, indicating close interactions between these groups. Previous ancient genomic data from a few individuals suggested that this was limited to cultural influences and not population admixture54. However, in the light of our new genetic evidence, it seems that this zone—and possibly other similar zones of contact between GAC and groups from the steppe (for example, the Yamnaya)—were key in the formation of the CWC, through which steppe-related ancestry and GAC-related ancestry co-dispersed far towards the west and the north55. This resulted in regionally diverse situations of interaction and admixture14,32, but a substantial part of the CWC dispersal happened through corridors of cultural and demic transmission that had been established by the GAC during the preceding period33,56. Differences in Y-chromosomal haplogroups between CWC and Yamnaya suggest that the currently published Yamnaya-associated genomes do not represent the most direct source for the steppe ancestry component in CWC32,33. This notion was supported by proximate ancestry modelling using published genomes1 associated with Yamnaya or Afanasievo cultural contexts as separate sources, which revealed a subtle increase in affinity for an Afanasievo-related source over a Yamnaya-related source in early individuals with European steppe ancestry before 3,000 cal. bp (Fig. 4b and Extended Data Fig. 9d). 
The result confirms the subtle population genomic structure in the population associated with Yamnaya or Afanasievo, showing that more dense sampling across the steppe horizon will be required to find the direct source or sources of steppe ancestry in the early CWC.

HG resilience east of the Urals

In contrast to the considerable number of ancient HG genomes from western Eurasia that have been studied so far, genomic data from HGs east of the Urals have remained sparse. These regions are characterized by an early introduction of pottery from areas further east, and were inhabited by complex forager societies with permanent and sometimes fortified settlements20,57. Here, we substantially expand knowledge on ancient populations of this region by reporting genomic data from 38 individuals, 28 of whom date to pottery-associated HG contexts between 8,300 and 5,000 cal. bp (Supplementary Data 2). Most of these genomes form a previously only sparsely sampled13,43 ‘Neolithic steppe’ cline that spans the Siberian forest steppe zones of the Irtysh, Ishim, Ob, and Yenisei River basins to the Lake Baikal region (Fig. 1c and Extended Data Figs. 1a and 3e). Supervised admixture modelling (using the ‘deep’ set of ancestry sources; Supplementary Data 9) revealed contributions from three major sources in these HGs from east of the Urals: early west Siberian HG ancestry (SteppeC_8300BP_7000BP) dominated in the western forest steppe; northeast Asian HG ancestry (Amur_7500BP) was highest at Lake Baikal; and Palaeo-Siberian ancestry (SiberiaNE_9800BP) was observed in a cline of decreasing proportions from northern Lake Baikal westwards across the forest steppe13 (Extended Data Figs. 7 and 10a).

Extended Data Fig. 10. Genetic transformations across the Eurasian steppe.


a, Moon charts showing spatial distribution of estimated ancestry proportions of Siberian HGs from the “deep” Siberian ancestry sources (names and locations indicated with coloured symbols). Estimated ancestry proportions are indicated by size and amount of fill of moon symbols. b, Timelines of ancestry proportions from “postNeol” sources in central and north Asian ancient individuals after 5,000 bp. Symbol shape and colour indicate the genetic cluster of each individual. Black lines indicate 1 standard error. c,d, Difference in estimated steppe-related ancestry proportions, using individuals from genetic cluster “Steppe_5000BP_4300BP” associated with either Yamnaya or Afanasievo cultural contexts as separate sources, as a function of time (c) or total estimated steppe-ancestry proportion (d). Individuals from genetic clusters of individuals associated with Okunevo (blue stars) or Sintashta/Andronovo (green diamonds) contexts are indicated with coloured symbols.

We used these Neolithic HG clusters (‘postNeol’ ancestry source set; Extended Data Fig. 7) as putative source groups in more proximal admixture modelling to investigate the spatiotemporal dynamics of ancestry compositions across the steppe and the Lake Baikal region after the Neolithic period. We replicate previously reported evidence for a genetic shift towards higher forest steppe HG ancestry (source SteppeCE_7000BP_3600BP) in Late Neolithic and Early Bronze Age (LNBA) individuals at Lake Baikal (clusters Baikal_5600BP_5400BP and Baikal_4800BP_4200BP)13,58. However, ancestry related to this cluster is also already observed at around 7,000 bp in herein-reported Neolithic HG individuals both at Lake Baikal (NEO199 and NEO200) and along the Angara river to the north (NEO843) (Extended Data Fig. 7). Both male individuals at Lake Baikal belonged to the Y-chromosome haplogroup Q1b1, characteristic of the later LNBA groups in the same region (Supplementary Note 3b and Supplementary Fig. 3b.5). Together with an early estimated admixture time (upper bound of around 7,300 cal. bp) for the LNBA groups (Supplementary Data 14), these results suggest that gene flow between HGs of Lake Baikal and those of the south Siberian forest steppe regions already occurred during the eastern Early Neolithic, consistent with archaeological interpretations of contact. In this region, bifacially flaked tools first appeared near Baikal59, from where the technique spread far to the west. We find echoes of such bifacial flaking in archaeological complexes (Shiderty 3, Borly, Sharbakty 1, Ust-Narym and so on) in northern and eastern Kazakhstan, around 6,500–6,000 cal. bp (refs. 60,61). Here, Mesolithic cultural networks with southwest Asia have also been recorded, as evidenced by pebble and flint lithics known from southwest Asia cultures62.

Genomes reported here also shed light on the genetic origins of the Early Bronze Age Okunevo Culture in the Minusinsk Basin in Southern Siberia. In contrast to previous results, we find no evidence for Lake Baikal HG-related ancestry in the Okunevo13,58 when using our newly reported Siberian forest steppe HG genomes jointly with Lake Baikal LNBA genomes as putative proximate sources. Instead, we find that they originate from the admixture of a forest steppe HG source (best modelled as a mixture of clusters Steppe_6700BP_4600BP and SteppeCE_7000BP_3600BP) and steppe-related ancestry (Steppe_5300BP_4000BP; Extended Data Fig. 7, set ‘postBA’ and Supplementary Data 11). We date the admixture with steppe-related ancestry to around 4,600 bp (Supplementary Data 14), and find it to be modelled exclusively from an Afanasievo-related source in proximate modelling separating the Yamnaya and Afanasievo steppe ancestries (Extended Data Figs. 9d and 10c,e). This is direct evidence for gene flow from peoples of the Afanasievo Culture, who were closely related to the Yamnaya and existed near Altai and Minusinsk Basin during the era of the steppe migrations1,58.

From around 3,700 cal. bp, individuals across the steppe and Lake Baikal regions show markedly different ancestry profiles (Fig. 5 and Extended Data Figs. 7 and 9b). We document a sharp increase in non-local ancestries, with only limited ancestry contributions from local HGs. The early stages of this transition are characterized by an influx of steppe-related ancestry, which decays over time from its peak of around 70% in the earliest individuals. Similar to the dynamics in western Eurasia, steppe-related ancestry is here correlated with GAC-related farmer ancestry (Poland_5000BP_4700BP; Fig. 5 and Extended Data Fig. 10b), recapitulating the previously documented gene flow from GAC groups into neighbouring groups of the steppe and the forest steppe, and the eastward expansion of admixed western steppe pastoralists from the Sintashta and Andronovo complexes during the Bronze Age43,63. However, GAC-related ancestry is notably absent in individuals of the Okunevo culture, and individuals with steppe ancestry after 3,700 bp show a slight excess in affinity to Yamnaya over Afanasievo in proximate modelling (Extended Data Fig. 10d), providing further support for two distinct eastward migrations of western steppe pastoralists during the early (Yamnaya-related) and later (Sintashta and Andronovo) Bronze Age. The later stages of the transition are characterized by increasing central Asian (Turkmenistan_7000BP_5000BP) and northeast Asian-related (Amur_7500BP) ancestry components (Fig. 5 and Extended Data Fig. 10b). Together, these results show that deeply structured HG ancestry dominated the eastern Eurasian steppe substantially longer than in western Eurasia, before successive waves of population expansions swept across the steppe within the last 4,000 years. These included a large-scale introduction of domesticated horse lineages concomitant with new equestrian equipment and spoke-wheeled chariotry63,64, as well as the adoption of millet as a robust subsistence crop65.

Fig. 5. Genetic transects east of the Urals.


Timelines of genetic ancestry compositions within the past 6,000 years east of the Urals. Shown are ancestry proportions in 148 imputed ancient genomes from this region, inferred using supervised ancestry modelling (‘postNeol’ source set). Panels separate ancestry proportions from local forest steppe HGs (HG) and sources representing ancestries originating further east or west.

Sociocultural insights

We used patterns of pairwise IBD sharing between individuals to examine our data for temporal shifts in relatedness within genetic clusters. We found clear trends of a reduction of within-cluster relatedness over time, in both western and eastern Eurasia (Extended Data Fig. 11a). This pattern is consistent with a scenario of increasing effective population sizes during this period66. Nevertheless, we observe notable differences in temporal relatedness patterns between western and eastern Eurasia, mirroring the wider difference in population dynamics discussed above. In the west, within-group relatedness changed substantially during the Neolithic transition (around 9,000–6,000 bp), in which clusters of individuals with Anatolian farmer-related ancestry show overall reduced IBD sharing compared with clusters of individuals with HG-associated ancestry (Extended Data Fig. 11a). In the east, genetic relatedness remained high until around 4,000 bp, consistent with a much longer persistence of smaller localized HG groups (Fig. 6 and Extended Data Fig. 11a).
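As a rough illustration of the within-cluster relatedness summary, the sketch below totals the length of IBD segments shared by each pair of individuals and takes the median over all pairs inside a genetic cluster. The segment tuple format, the cluster labels and the treatment of pairs without any detected segment as zero sharing are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from statistics import median


def within_cluster_median_ibd(segments, cluster_of):
    """Median total IBD shared per pair, computed separately for each cluster.

    segments:   iterable of (id_a, id_b, length_cm) IBD segments.
    cluster_of: dict mapping individual id -> genetic cluster label.
    """
    # Sum segment lengths per unordered pair of individuals.
    totals = defaultdict(float)
    for a, b, length_cm in segments:
        totals[frozenset((a, b))] += length_cm
    members = defaultdict(list)
    for individual, cluster in cluster_of.items():
        members[cluster].append(individual)
    result = {}
    for cluster, individuals in members.items():
        # Pairs with no detected segment count as zero sharing (assumption).
        pair_totals = [totals.get(frozenset(pair), 0.0)
                       for pair in combinations(individuals, 2)]
        if pair_totals:
            result[cluster] = median(pair_totals)
    return result
```

Sharing between individuals from different clusters is ignored here, because only within-cluster pairs are enumerated.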

Extended Data Fig. 11. Patterns of co-ancestry.


a, Panels show within-cluster genetic relatedness over time, measured as the total length of genomic segments shared IBD between individuals. Results for both measures are shown separately for individuals from western versus eastern Eurasia. Small grey dots indicate estimates for individual pairs, with larger coloured symbols indicating median values within genetic clusters. Ranges of median values for major ancestry groups are indicated with labelled convex hulls. b, Distribution of ROH lengths for 29 individuals with evidence for recent parental relatedness (>50 cM total in ROHs > 20 cM). c, Karyogram showing genomic distribution of ROH in individual tem003, an ancient case of uniparental disomy for chromosome 2. Regions within ROH are indicated with blue colour.

Next, we examined the data for evidence of recent parental relatedness, by identifying individuals in which more than 50 centimorgans (cM) of their genomes was contained in long (more than 20 cM) runs of homozygosity (ROH) segments67. We detected only 29 such individuals out of a total sample of 1,396 imputed ancient genomes from across Eurasia (Extended Data Fig. 11b). This suggests that close kin mating was not common in the regions and periods covered by our data. No obviously discernible spatiotemporal or cultural clustering was observed among the individuals with recent parental relatedness. Notably, an approximately 1,700-year-old Sarmatian individual from Temyaysovo (tem003)68 was found to be homozygous for almost the entirety of chromosome 2, but without evidence of ROH elsewhere in the genome, suggesting that this is the first documented case of uniparental disomy in an ancient individual (Extended Data Fig. 11c). Among several noteworthy familial relationships (see Supplementary Fig. 3c.2), we report a Mesolithic father–son burial at Ertebølle (NEO568 and NEO569), as well as a Mesolithic mother–daughter burial at Dragsholm (NEO732 and NEO733), Denmark34.
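The ROH criterion stated above can be written as a short check. The sketch assumes that ROH segment lengths for an individual are already available in cM; the function name and input format are hypothetical, while the two thresholds (segments longer than 20 cM, summing to more than 50 cM) are those given in the text.

```python
def has_recent_parental_relatedness(roh_lengths_cm,
                                    min_segment_cm=20.0,
                                    min_total_cm=50.0):
    """Apply the long-ROH criterion to one individual's ROH segment lengths."""
    # Only segments strictly longer than the per-segment threshold count.
    long_total = sum(length for length in roh_lengths_cm
                     if length > min_segment_cm)
    return long_total > min_total_cm


# Share of flagged individuals reported in the text: 29 of 1,396 genomes.
flagged_percent = 100 * 29 / 1396
```

`flagged_percent` evaluates to roughly 2.1, which is the basis for describing close kin mating as uncommon in the dataset.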

Formation and dissolution of the divide

We have provided evidence for the existence of a clear east–west genetic division extending from the Black Sea to the Baltic, mirroring archaeological observations, and persisting for several millennia. We show that this deep ancestry division in the Eurasian human gene pool that was established during early post-LGM dispersals7 was maintained throughout the Mesolithic and Neolithic ages (Fig. 6). Accordingly, we show that the genetic effect of the Neolithic transition was highly distinct east and west of this boundary. These observations raise a series of questions related to understanding the underlying drivers.

In eastern Europe, the expansion of Neolithic farming was halted for around 3,000 years, and this delay could be linked to environmental factors, with regions east of the division having more continental climates and harsher winters, possibly less suited for Middle Eastern agricultural practices69. Here, highly developed HG societies persisted with stable, complex and sometimes fortified settlements, long-distance exchange and large cemeteries70,71. A diet including freshwater fish is clear both from our isotopic data (Supplementary Data 2) and from analyses of lipids in pottery71. In the northern forested regions of this boundary zone, HG societies persisted until the emergence of the CWC around 5,000 cal. bp, whereas in the southern and eastern steppe regions, hunting and gathering was eventually complemented with some animal husbandry (cattle and sheep), and possibly horse herding in central Asia72. Some of these groups, such as Khvalynsk at the Volga, saw the emergence of male sodalities involved in wide-ranging trade connections of copper objects from east central Europe and the Caucasus29. Settlements were confined mainly to the flat flood plains and river valleys, whereas the steppe belt remained largely unexploited.

The eventual dissolution of this genetic, economic and social border was driven by events that unfolded in the steppe region. Here, two temporal phases of technological innovations can be observed archaeologically: the widespread dispersal of ox-drawn wheeled vehicles around 5,500 cal. bp and the later development of horse riding. Combined with possible changing environmental conditions73, this opened up the steppe as an economic zone, allowing Yamnaya groups to exploit the steppe as pastoral nomads around 5,000 cal. bp (ref. 74). Eneolithic settlements along river valleys were replaced by this new mobile economy75, which finally dissolved the great genomic boundary that had persisted in the preceding millennia (Fig. 6).

By 4,000 cal. bp, the invention of chariot warfare and the adoption of millet as a food crop allowed the final eastward expansion into central Asia and beyond by the Andronovo and related groups, with global legacies for the expansion of Indo-European languages76. Our study has provided new genetic knowledge on these steppe migrations on two levels: we have identified a hitherto-unknown source of ancestry in HGs from the Middle Don region contributing ancestry to the steppe pastoralists, and we have documented how the later spread of steppe-related ancestry into Europe through the CWC was first mediated through peoples associated with the GAC. In a contact zone that included forested northern regions, the CWC was rapidly formed from a cultural and genetic amalgamation of steppe-groups related to the Yamnaya and the GAC groups in eastern Europe. In accordance with their mixed cultural and genetic background, the CWC practised a mixed economy, using various subsistence strategies in different environments. This flexibility would have contributed substantially to their success in settling and adapting to very different ecological and climatic settings over a very short period of time33.

Methods

Generation and authentication of ancient DNA data

Sampling of ancient human remains was undertaken in collaboration with co-authors responsible for the curation and contextual analyses of these, and with the approval of the relevant institutions responsible for the archaeological remains (detailed in the Reporting Summary). Laboratory work was undertaken in dedicated ancient DNA clean-lab facilities (Globe Institute, University of Copenhagen) following optimized ancient DNA protocols1,77 (Supplementary Note 1). Double-stranded blunt-end libraries were constructed from the extracted DNA using NEBNext DNA Prep Master Mix Set E6070 (New England Biolabs) and sequenced (80 bp and 100 bp single read) on Illumina HiSeq 2500 and 4000 platforms. Initial shallow shotgun screening identified 317 of 962 ancient samples with sufficient DNA preservation for deeper sequencing. Of these, 211 were teeth, 91 were petrous bones and 15 were sampled from long bones, ribs and cranial bones (Supplementary Data 2). Reads were mapped to the human reference genome build 37 and also to the mitochondrial genome (rCRS) alone. Mapped reads were filtered for mapping quality 30 and sorted using Picard (v.1.127) (http://picard.sourceforge.net) and SAMtools78. Data were merged to library level and duplicates were removed using Picard MarkDuplicates (v.1.127) and merged to sample level. Sample-level BAMs were re-aligned using GATK (v.3.3.0) and hereafter had the md-tag updated and extended BAQs calculated using samtools calmd (v.1.10)78. Read depth and coverage were determined using pysam (https://github.com/pysam-developers/pysam) and BEDtools (v.2.23.0)79. Post-mortem DNA damage patterns were determined using mapDamage2.0 (ref. 80). For the 317 samples we observed C-to-T deamination fractions ranging from 10.4% to 67.8%, with an average of 38.3% across all samples (Supplementary Data 1). These numbers indicate DNA-molecule degradation consistent with a millennia-scale depositional age. 
Three methods were used to estimate DNA contamination: two based on mitochondrial sequences81,82 and one method investigating X-chromosomal data in males (ANGSD, Supplementary Note 1). All contamination estimates are reported in Supplementary Data 5 (summary values in Supplementary Data 1). On the basis of this approach, we had a total of 15 samples flagged as ‘possibly contaminated’ in our downstream analyses (Supplementary Note 1).

Imputation of ancient genomes

We imputed the ancient genomes in this study using the imputation and phasing tool GLIMPSE v.1.0.0 (ref. 35) and 1000 Genomes phase 3 (ref. 36) as a reference panel. We first generated genotype likelihoods at the biallelic 1000 Genomes variant sites from the bam files with bcftools v.1.10 and the command bcftools mpileup with parameters -I -E -a ‘FORMAT/DP’ --ignore-RG, followed by bcftools call -Aim -C alleles. Using GLIMPSE_chunk, the genotype likelihood data were first split into chunks of sizes between 1 and 2 Mb with a buffer region of 200 kb at each side. We then imputed each chunk with GLIMPSE_phase with parameters --burn 10, --main 15 and --pbwt-depth 2. Finally, the imputed chunks were ligated with GLIMPSE_ligate. To validate the accuracy of the imputation, 42 high-coverage (5× to 39×) genomes, including a Neolithic trio, were downsampled for testing83 (Supplementary Note 2). We evaluated imputation accuracy on the basis of depth of coverage; MAF; and ancestry and time frame of ancient genomes, using high-coverage ancient genomes83. Genomes with higher than 1× coverage provided a notably high imputation accuracy (closely matching that obtained for modern samples; Extended Data Fig. 2), except for African genomes, which had lower accuracy owing to the poor representation of this ancestry in the reference panel. Imputation accuracy was influenced by both MAF and coverage (Supplementary Fig. 2.3). We found that coverage as low as 0.1× and 0.4× was sufficient to obtain r2 imputation accuracies of 0.8 and 0.9 at common variants (MAF ≥ 10%), respectively. 
We conclude that ancient genomes can be imputed confidently from coverages above 0.4×, and that genome-wide aggregate analyses relying on common SNPs (for example, PCA and admixture modelling) can be performed with little bias at genome coverage as low as 0.1× when specific quality control is applied to the imputed data (although at very low coverage a bias arises towards the major allele; see Supplementary Note 2). We also used D-statistics to test whether bias affecting inferred ancestry components83 propagates into individual-level pairwise analyses; these tests indicated that imputed ancient genomes down to 0.1× coverage are not significantly affected (Supplementary Note 2).
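As a rough illustration of the accuracy measure quoted above, r2 can be computed as the squared Pearson correlation between validation genotypes and imputed genotypes. The sketch below assumes (this is not stated in the text) that validation genotypes are coded as 0/1/2 alternative-allele counts and imputed genotypes as dosages between 0 and 2; the function name and toy values are hypothetical.

```python
def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between validation genotypes and imputed dosages."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long vectors with at least two sites")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # r2 is undefined when either vector is constant
    return cov * cov / (var_t * var_d)


# Toy example: four sites, high-coverage calls versus imputed dosages.
r2 = imputation_r2([0, 1, 2, 1], [0.1, 0.9, 1.8, 1.2])
```

In practice such a value would be computed separately within MAF and coverage bins, which is how the dependence on both factors described above becomes visible.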

Demographic inference

We determined the genetic sex of the study individuals using the ratio of reads aligning to either of the sex chromosomes (RY statistic)84. Y chromosomes of inferred male individuals were further analysed using phylogenetic placement85. We built a reference phylogenetic tree of 1,244 male individuals from the 1000 Genomes project with RAxML-NG (ref. 86), using the general time-reversible model including among-site rate heterogeneity and ascertainment correction (model GTR+G+ASC_LEWIS). For each ancient sample, haploid genotypes given the positions and alleles in the reference panel were called using ‘bcftools call’ (options -C alleles --ploidy 1 -i). The resulting genotypes were converted to fasta format and placed onto the reference tree using EPA-ng (ref. 85). Phylogenetic placements were processed and visualized using gappa (ref. 87). To convert phylogenetic placements into haplogroup calls, we assigned each branch of the reference phylogeny to its representing haplogroup, using SNP annotations from ISOGG (v.15.73). For each ancient sample, haplogroups were then called using the most basal branch accumulating 99% of the placement weights, obtained using ‘accumulate’ in gappa. Phylogenetic analyses of reconstructed mitochondrial genomes were also undertaken using RAxML-NG (ref. 86; see Supplementary Note 3a).
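The RY statistic can be sketched as the fraction of sex-chromosome reads that align to the Y chromosome, with a normal-approximation confidence interval. The thresholds below (0.016 and 0.075) are the defaults commonly cited for this method and are an assumption here, since the text does not state which cut-offs were applied; the function name is hypothetical.

```python
from math import sqrt


def ry_sex(n_x, n_y, female_max=0.016, male_min=0.075):
    """Return (RY, 95% CI, call) from counts of reads mapped to X and Y."""
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads mapped to the sex chromosomes")
    ry = n_y / total
    se = sqrt(ry * (1.0 - ry) / total)  # binomial standard error
    lower, upper = ry - 1.96 * se, ry + 1.96 * se
    if upper < female_max:      # whole interval below the female threshold
        call = "XX"
    elif lower > male_min:      # whole interval above the male threshold
        call = "XY"
    else:
        call = "undetermined"
    return ry, (lower, upper), call
```

A sample with very few sex-chromosome reads tends to yield a wide interval and therefore an undetermined call, which is the conservative behaviour one would want at low coverage.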

To infer genetic relatedness between the study individuals, we used the allele-frequency-free inference method introduced previously88. For each pair of individuals, three relatedness estimators were calculated, R0, R1 and KING-robust (ref. 89) using the site-frequency-spectrum (SFS)-based approach. We used the realSFS method90 implemented in the ANGSD package91 to infer the 2D-SFS, selecting the SFS with the highest likelihood across ten replicates. We used a set of 1,191,529 autosomal transversion SNPs with MAF ≥ 0.05 from the 1000 Genomes Project36 for the analysis. Previously established cut-offs89 for the KING-robust estimator were applied to assign individual pairs to first-, second- or third-degree relationships. Parent–offspring relationships were distinguished from sibling relationships using R0 and R1 ratios, by requiring that R0 ≤ 0.02 and 0.4 ≤ R1 ≤ 0.6 to infer a parent–offspring relative pair. Individual pairs with fewer than 20,000 sites contributing to the estimators were excluded.
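The decision rules in this paragraph can be summarised in a short sketch. The R0 and R1 bounds and the 20,000-site minimum are taken from the text; the KING-robust kinship boundaries (0.354, 0.177, 0.0884, 0.0442) are the standard published values and are assumed to be the "previously established cut-offs" referred to above. The function name and return labels are hypothetical.

```python
def classify_pair(king, r0, r1, n_sites, min_sites=20_000):
    """Assign a relationship class to one pair of individuals."""
    if n_sites < min_sites:
        return "excluded"                      # too few sites for a stable estimate
    if king > 0.354:
        return "duplicate or monozygotic twin"
    if king > 0.177:
        # First-degree pair: separate parent-offspring from siblings.
        if r0 <= 0.02 and 0.4 <= r1 <= 0.6:
            return "first-degree (parent-offspring)"
        return "first-degree (sibling)"
    if king > 0.0884:
        return "second-degree"
    if king > 0.0442:
        return "third-degree"
    return "unrelated"
```

The R0 criterion reflects the fact that a parent and child always share at least one allele identical by descent, so sites at which the two carry opposite homozygous genotypes should be almost absent.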

We generated a dataset for population genetic analysis by combining the 317 newly sequenced individuals with 1,347 previously published ancient genomes with genomic coverage higher than 0.1× generated using shotgun sequencing (Supplementary Data 7). Imputed genotype data (Supplementary Note 2) for this set of 1,664 ancient genomes were merged with genotypes of 2,504 modern individuals from the 1000 Genomes Project36 used as a reference panel in the imputation. We retained only SNPs that passed the 1000 Genomes strict mask, resulting in a final dataset of 4,168 individuals genotyped at 7,321,965 autosomal SNPs (‘1000G’ dataset). As well as imputed genotypes, we also generated pseudo-haploid genotypes for each ancient individual by randomly sampling an allele from sequencing reads covering those SNPs. For population structure analyses in the context of global genetic diversity, we generated a second dataset by intersecting the ancient genotype data with SNP array data of 2,180 modern individuals from 213 worldwide populations3,4,92,93 (‘HO’ dataset).
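Pseudo-haploid calling, as described above, amounts to drawing one observed allele at random per SNP. A minimal sketch follows, assuming the reads have already been reduced to a list of observed alleles per site; the data layout, the function name and the use of None for uncovered sites are illustrative choices, not the authors' implementation.

```python
import random


def pseudo_haploid(reads_by_site, rng):
    """Sample one allele per site from the alleles observed in the reads.

    reads_by_site maps a site identifier to the list of alleles seen in
    reads covering that site; sites without reads are returned as None.
    """
    calls = {}
    for site, alleles in reads_by_site.items():
        calls[site] = rng.choice(alleles) if alleles else None
    return calls


# Toy example with a fixed seed so that the draw is reproducible.
reads = {"snp1": ["A", "A", "G"], "snp2": [], "snp3": ["T", "T"]}
calls = pseudo_haploid(reads, random.Random(1))
```

Because only one allele is kept regardless of depth, this representation avoids calling heterozygotes from low-coverage data, at the cost of discarding diploid information.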

To facilitate filtering for downstream analyses, we flagged individuals to potentially exclude according to the following criteria: (i) contamination estimate greater than 5% (‘contMT5pct’, ‘contNuc5pct’; Supplementary Note 1); (ii) autosomal coverage less than 0.1× (‘lowcov’); (iii) genome-wide average imputation genotype probability less than 0.98 (‘lowGpAvg’); (iv) individual is the lower-quality sample in a close relative pair (‘1d_rel’, ‘2d_rel’; Supplementary Note 3c). A total of 1,492 individuals (213 newly reported) passed all filters and were used in most of the downstream analyses unless otherwise noted.
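The four flagging criteria can be expressed as a small function. The flag labels and thresholds come from the text; the mapping of ‘contMT5pct’ to mitochondrial estimates and ‘contNuc5pct’ to nuclear (X-chromosomal) estimates is inferred from the label names and should be read as an assumption, as should the argument names.

```python
def qc_flags(cont_mt, cont_nuc, coverage, gp_avg, lower_quality_relative=None):
    """Return the list of exclusion flags raised for one individual.

    cont_mt / cont_nuc: contamination estimates as fractions (None if unavailable).
    coverage: autosomal depth of coverage.
    gp_avg: genome-wide average imputation genotype probability.
    lower_quality_relative: 1 or 2 if the individual is the lower-quality
        member of a first- or second-degree pair, otherwise None.
    """
    flags = []
    if cont_mt is not None and cont_mt > 0.05:
        flags.append("contMT5pct")
    if cont_nuc is not None and cont_nuc > 0.05:
        flags.append("contNuc5pct")
    if coverage < 0.1:
        flags.append("lowcov")
    if gp_avg < 0.98:
        flags.append("lowGpAvg")
    if lower_quality_relative == 1:
        flags.append("1d_rel")
    elif lower_quality_relative == 2:
        flags.append("2d_rel")
    return flags
```

An individual passes all filters when the returned list is empty; keeping the flags separate makes it possible to relax a single criterion for a particular analysis.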

We investigated overall population structure among the dataset individuals using PCA and model-based clustering (ADMIXTURE94). We performed PCA using different subsets of individuals in the ‘HO’ dataset. For the PCA including only imputed diploid samples, we used GCTA (ref. 95), excluding SNPs with MAF < 0.05 in the respective panel. For PCA projecting low coverage or flagged individuals, we used smartpca (refs. 96,97) with options ‘lsqproject: YES’ and ‘autoshrink: YES’ on a fixed set of 400,186 SNPs with MAF ≥ 0.05 in non-African individuals passing all filters. We ran ADMIXTURE on a set of 1,593 ancient individuals from the ‘1000G’ dataset, excluding individuals flagged as close relatives or with a contamination estimate greater than 5%. For the 1,492 individuals passing all filters we used imputed genotypes; the remaining 101 lower-coverage samples were represented by pseudo-haploid genotypes. We restricted the analysis to transversion SNPs with imputation INFO score ≥ 0.8 and MAF ≥ 0.05. We further performed linkage-disequilibrium pruning and filtering for missingness using plink98 (options --indep-pairwise 500 50 0.4 --geno 0.8), for a final analysis set of 142,550 SNPs.

We performed admixture graph fitting (qpGraph) to investigate deep Eurasian population structure using ADMIXTOOLS2 (ref. 99). For these analyses, pairwise f2-statistics were pre-computed from pseudo-haploid genotypes in the ‘1000G’ dataset using the ‘extract_f2’ function with ‘afProd=TRUE’. We grouped individuals into populations using their membership in the genetic clusters inferred from IBD sharing (Supplementary Note 3f), with the exception of the Upper Palaeolithic European individual Kostenki 14, who was treated as a separate population (new cluster label ‘Europe_37000BP_33000BP_Kostenki’). We carried out admixture graph fitting using a semi-automatic iterative approach (Supplementary Note 3d).

We used IBDseq100 to detect genomic segments shared IBD between all individuals in the ‘1000G’ dataset, restricting to transversion SNPs with imputation INFO score ≥ 0.8 and MAF ≥ 0.01. We filtered the resulting IBD segments for LOD score ≥ 3 and a minimum length of 2 centimorgans (cM), and further removed regions of excess long IBD as described previously101. First, we used the GenomicRanges102 package in R to calculate the total number of long IBD segments (greater than 10 cM) overlapping each position along the genome, and calculated their 3% trimmed mean and s.d. We then called regions of excess IBD if they were more than 10 trimmed s.d. from the trimmed mean, and removed any segments overlapping the excess IBD regions. For analyses of ROH we used a shorter length cut-off of 1 cM.
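The excess-IBD filter can be illustrated on a vector of per-position counts of long IBD segments. The sketch assumes that the 3% trimming is applied to each tail, that the trimmed s.d. is the s.d. of the values remaining after trimming, and that only counts above the trimmed mean are treated as excess; none of these details is spelled out in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev


def trimmed(values, trim=0.03):
    """Drop the lowest and highest `trim` fraction of the sorted values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    return ordered[k:len(ordered) - k] if k else ordered


def excess_ibd_positions(counts, trim=0.03, n_sd=10):
    """Indices whose segment count exceeds the trimmed mean by more than n_sd trimmed s.d."""
    core = trimmed(counts, trim)
    centre, spread = mean(core), stdev(core)
    return [i for i, c in enumerate(counts) if c - centre > n_sd * spread]
```

Segments overlapping any flagged position would then be removed. Trimming before computing the mean and s.d. keeps the outlying regions themselves from inflating the threshold.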

We performed genetic clustering of the ancient individuals using hierarchical community detection on a network of pairwise IBD-sharing similarities103. To facilitate the detection of clusters at a finer scale, we ran IBDseq (v.r1206) on a dataset restricting to ancient samples only, and applied more lenient filters of imputation INFO score > 0.5, and minimum IBD segment length of 1 cM. We constructed a weighted network of the individuals using the igraph104 package in R, with the fraction of the genome shared IBD between pairs of individuals as weights. We then performed iterative community detection on this network using the Leiden algorithm105 implemented in the leidenAlg R package (v1.01; https://github.com/kharchenkolab/leidenAlg). We used a resolution parameter of r = 0.5 as the starting value for each level of community detection. If more than one community was detected, we split the network into the respective communities, and repeated the community detection step. If no communities were detected, we incremented the resolution parameter in steps of 0.5 until a maximum value of r = 3. The initial clustering was completed when no more communities were detected at the highest resolution parameter, across all subcommunities. To convert the resulting hierarchy into a final clustering, we simplified the initial clustering by collapsing nodes into single clusters on the basis of observed spatiotemporal annotations of the samples. We note that the obtained clusters should not be interpreted as ‘populations’ in the sense of a local community of individuals, but rather as sets of individuals with shared ancestry. Although this approach is an oversimplification of the complex spatiotemporally structured populations investigated here, the obtained clusters nevertheless captured real effects, grouping individuals within restricted spatiotemporal ranges and/or archaeological contexts and recapitulating known relationships between clusters.
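The iterative procedure described above can be sketched as a recursion around a generic community-detection routine. Here `detect` is a hypothetical callable standing in for the Leiden step (it receives a list of nodes and a resolution and returns a list of communities); the toy detector in the usage example only exists to make the sketch runnable and has nothing to do with IBD sharing.

```python
def hierarchical_communities(nodes, detect, r_start=0.5, r_step=0.5, r_max=3.0):
    """Recursively split `nodes` until no resolution up to r_max yields a split.

    Returns a nested list: leaves are lists of nodes that could not be
    divided further, inner lists are the detected sub-communities.
    """
    nodes = list(nodes)
    r = r_start
    while r <= r_max:
        # Ignore empty communities so that the recursion always makes progress.
        communities = [c for c in detect(nodes, r) if c]
        if len(communities) > 1:
            # Split found: restart at r_start within each sub-community.
            return [hierarchical_communities(c, detect, r_start, r_step, r_max)
                    for c in communities]
        r += r_step
    return nodes  # no split detected at any resolution


def toy_detect(nodes, resolution):
    """Toy detector: separates odd from even integers once resolution >= 1.0."""
    odd = [n for n in nodes if n % 2]
    even = [n for n in nodes if n % 2 == 0]
    if resolution >= 1.0 and odd and even:
        return [odd, even]
    return [nodes]


hierarchy = hierarchical_communities([1, 2, 3, 4], toy_detect)
```

The resulting nested structure corresponds to the initial clustering; collapsing it into final clusters on the basis of spatiotemporal annotations is a separate, manual step.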

To circumvent some of the pitfalls of grouping individuals into discrete clusters, we used supervised ancestry modelling in which sets of ‘target’ individuals were modelled as mixtures of ‘source’ groups, selected to represent particular ancestry components. As an illustrative case, an individual of European HG ancestry with a minor contribution of Neolithic farmer admixture might be inferred to be a member of a HG genetic cluster, but will be modelled as a mixture of a HG and Neolithic farmer sources in the ancestry modelling. To estimate ancestry proportions from patterns of pairwise IBD sharing, we applied an approach akin to ‘chromosome painting’106. We first inferred an IBD-based ‘painting profile’ for each target individual, by summing up the total amount of IBD shared with each ‘donor’ group (using population labels for modern donors or IBD-based genetic clusters for ancient donors), and normalizing them to the interval [0,1]. We used a leave-one-out approach38 to account for the fact that recipient individuals cannot be included as donors from their own group. We then used these painting profiles in supervised modelling of target individuals as mixtures from different sets of putative source groups38,107, using non-negative least squares implemented in the R package limSolve108. We estimated standard errors of ancestry proportions using a weighted block jackknife, leaving out each chromosome in turn. A comparison of results obtained using this approach to other commonly used methods (supervised ADMIXTURE, qpAdm) is shown in Supplementary Note 3f. We focused our analyses on three panels of putative source clusters reflecting different temporal depths: ‘deep’, using a set of deep ancestry source groups reflecting major ancestry poles; ‘postNeol’, using diverse Neolithic and earlier source groups; and ‘postBA’, using Late Neolithic and Bronze Age source groups (Extended Data Figs. 5–7). We also used additional source sets in follow-up analyses of more restricted spatiotemporal contexts (Supplementary Data 7–13).
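To make the painting-profile idea concrete, the following sketch normalises IBD totals per donor group and then fits a target profile as a mixture of two source profiles. It is a deliberate simplification: the study uses non-negative least squares over several sources in limSolve, whereas this example handles only two sources and additionally assumes that the proportions sum to one and that normalisation means dividing by the total. All names and numbers are hypothetical.

```python
def painting_profile(ibd_by_donor):
    """Scale total IBD shared with each donor group so the values sum to 1."""
    total = sum(ibd_by_donor.values())
    if total <= 0:
        raise ValueError("no IBD sharing recorded")
    return {donor: value / total for donor, value in ibd_by_donor.items()}


def two_source_mixture(target, source_a, source_b):
    """Least-squares proportion of source_a in target, constrained to [0, 1]."""
    donors = sorted(set(target) | set(source_a) | set(source_b))
    t = [target.get(d, 0.0) for d in donors]
    a = [source_a.get(d, 0.0) for d in donors]
    b = [source_b.get(d, 0.0) for d in donors]
    diff = [ai - bi for ai, bi in zip(a, b)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        raise ValueError("source profiles are identical")
    # Unconstrained optimum of |t - (alpha*a + (1-alpha)*b)|^2, then clipped.
    alpha = sum((ti - bi) * di for ti, bi, di in zip(t, b, diff)) / denom
    return min(1.0, max(0.0, alpha))


hg = painting_profile({"HG": 80.0, "Farmer": 20.0})
farmer = painting_profile({"HG": 10.0, "Farmer": 90.0})
target = painting_profile({"HG": 31.0, "Farmer": 69.0})
alpha_hg = two_source_mixture(target, hg, farmer)
```

Repeating such a fit with each chromosome left out in turn would give the replicate estimates needed for a block jackknife standard error.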

Finally, we aimed to infer the geographical and temporal spread of major ancestries (Supplementary Note 3e). We used a method46 applying spatiotemporal ordinary kriging on latent ancestry proportion estimates from ancient and present-day genomes. This way, we obtained spatiotemporal maps reflecting the dynamics of the spread of ancestry during the transition from the Mesolithic to the Neolithic, Bronze Age, Iron Age and more recent periods. We obtained ancestry proportions estimated using ADMIXTURE109 with K = 9 latent ancestry clusters (Supplementary Note 3d) on a sequence dataset including both whole-genome shotgun-sequenced genomes and genomic sequences obtained through SNP capture (Supplementary Note 2, intersection with ‘HO’ dataset). We performed spatiotemporal kriging110 of these proportions over the last 12,900 years, in intervals of 300 years, with a 5,000-point spatial grid spanning western and central Eurasia. We used the R package gstat to fit a spatiotemporal variogram via a metric covariance model, and perform ordinary kriging111. We focused on the ancestry clusters for which we could fit variogram models that were not static over time.

14C chronology and reservoir effects

Of the 317 individuals sequenced in this study, 272 were 14C-dated in the project, 30 had 14C dates obtained from the literature and 15 were dated by archaeological context (Supplementary Note 4 and Supplementary Data 2). Some individuals were dated twice. Most of the dates (n = 242) were obtained at the 14CHRONO Centre laboratory at Queen’s University, Belfast, following published sample pretreatment and laboratory protocols112. Additional samples were analysed by the Oxford Radiocarbon Accelerator Unit (ORAU) laboratory (n = 24) and by the Keck-CCAMS Group (n = 6) (see previous reports113,114 for laboratory procedures). Only dates with a C/N ratio of 2.9–3.6 were accepted; both δ13C and δ15N collagen measurements were also performed, and were used in estimates of marine and freshwater reservoir effects (MRE and FRE, respectively) (see Supplementary Note 4 and Supplementary Data 4). Published values of MRE and FRE were used where available, but for some regions, such as sites in western Russia, a standard FRE value of 500 years was applied. A diet-weighted reservoir offset was then applied to the 14C central value before calibration. Calibrations were made in Oxcal 4.4 using the Intcal20 calibration curve115. For display and calculation purposes a midpoint of the reservoir-corrected and calibrated 95% interval was calculated. Full details of the reservoir correction and calibration procedure are given in Supplementary Note 4 and the calculations are in Supplementary Table 4.1.
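The arithmetic of the reservoir correction can be sketched as follows. The C/N acceptance range and the 500-year default FRE are taken from the text; the way dietary fractions are supplied (as marine and freshwater proportions already estimated from the isotope values) and all function names are assumptions, and calibration itself is not reproduced here.

```python
def cn_ratio_accepted(cn_ratio):
    """Collagen quality criterion: accept C/N ratios between 2.9 and 3.6."""
    return 2.9 <= cn_ratio <= 3.6


def reservoir_corrected_age(c14_age_bp, marine_fraction, freshwater_fraction,
                            mre_years, fre_years=500):
    """Subtract a diet-weighted reservoir offset from the uncalibrated 14C age."""
    if min(marine_fraction, freshwater_fraction) < 0:
        raise ValueError("dietary fractions must be non-negative")
    if marine_fraction + freshwater_fraction > 1:
        raise ValueError("dietary fractions must not sum to more than 1")
    offset = marine_fraction * mre_years + freshwater_fraction * fre_years
    return c14_age_bp - offset


def interval_midpoint(older_cal_bp, younger_cal_bp):
    """Midpoint of a calibrated 95% interval, used for display and calculation."""
    return (older_cal_bp + younger_cal_bp) / 2


# Toy example: 50% marine and 20% freshwater protein, MRE of 400 years.
corrected = reservoir_corrected_age(5000, 0.5, 0.2, 400)
```

The offset is subtracted because carbon from marine or freshwater sources makes a sample appear older than its true age.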

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-023-06865-0.

Supplementary information

Supplementary Information (28.7MB, pdf)

Supplementary Notes 1–7: 1, Data Generation and Authentication; 2, Imputation of ancient DNA (including Figures S2.1 to S2.11, and Tables S2.1 and S2.2); 3, Demographic Inference, comprising: 3a ‘Phylogenetic analysis of mtDNA sequences’ (including Figures S3a.1 to S3a.3), 3b ‘Y chromosome / sex determination’ (including Figures S3b.1 to S3b.8), 3c ‘Relatedness’ (including Figures S3c.1 and S3c.2, and Tables S3c.1 and S3c.2), 3d ‘Overall Population Structure’ (including Figures S3d.1 to S3d.16), 3e ‘Inferring the spatiotemporal spread of population movements in the past 13 millennia’ (including Figures S3e.1 to S3e.5, and animations S3e.1 to S3e.11), 3f ‘HBD/IBD sharing/ROH/clustering’ (including Figures S3f.1 to S3f.53); 4, 14C chronology and estimates of reservoir effects (including Table S4.1); 5, From forager to farmer in western Eurasia: an archaeological overview (including Figures S5.1 to S5.3); 6, Catalogue of Danish archaeological sites (including Figures S6.1 to S6.15); and 7, Catalogue of non-Danish archaeological sites (including Figures S7.1 to S7.3, and Tables S7.1 to S7.3).

Reporting Summary (108.7KB, pdf)
Supplementary Data 1 (63.1KB, xlsx)

Summary details of samples presented with novel genome data.

Supplementary Data 2–4 (230.2KB, xlsx)

Supplementary Data 2 contains dates, isotopes and context. Supplementary Data 3 includes reservoir correction calculations, and Supplementary Data 4 contains isotopes and all individual samples.

Supplementary Data 5 and 6 (162.4KB, xlsx)

Supplementary Data 5 contains DNA contamination estimates and Supplementary Data 6 contains relatedness estimates.

Supplementary Data 7 (206.9KB, xlsx)

Full ancient genomes dataset.

Supplementary Data 8–13 (4.3MB, xlsx)

Supplementary Data 8 contains mixture model sets. Supplementary Data 9–13 show ancestry proportions for sets “deep”, “postNeol”, “postBA”, “hgEur” and “fEur” respectively.

Supplementary Data 14 (29KB, xlsx)

Admixture time estimates.

Acknowledgements

We acknowledge P. Bennike, who was involved in initiating this project, for her substantial contributions to its conception and to prehistoric research more broadly; she passed away in 2017. We thank L. Olsen and P. Selmer Olsen for administrative and technical assistance, respectively; the UK Biobank for access to the UK Biobank genomic resource; Illumina for collaboration; and S. Ellingvåg for assistance with sample access. E.W. thanks St John’s College, Cambridge, for providing a stimulating environment of discussion and learning. The Lundbeck Foundation GeoGenetics Centre is supported by grants from the Lundbeck Foundation (R302-2018-2155 and R155-2013-16338), the Novo Nordisk Foundation (NNF18SA0035006), the Wellcome Trust (214300), the Carlsberg Foundation (CF18-0024), the Danish National Research Foundation (DNRF94, DNRF174), the University of Copenhagen (KU2016 programme) and Ferring Pharmaceuticals A/S to E.W. This research has been conducted using the UK Biobank Resource and the iPSYCH Initiative, funded by the Lundbeck Foundation (R102-A9118 and R155-2014-1724). This work was further supported by the Swedish Foundation for Humanities and Social Sciences grant (Riksbankens Jubileumsfond M16-0455:1) to K.K. M.E.A. was supported by Marie Skłodowska-Curie Actions of the EU (grant no. 300554), The Villum Foundation (grant no. 10120) and Independent Research Fund Denmark (grant no. 7027-00147B). W.B. is supported by the Hanne and Torkel Weis-Fogh Fund (Department of Zoology, University of Cambridge). A.P. is funded by the Wellcome grant WT214300; B.S.d.M and O.D. by the Swiss National Science Foundation (SFNS PP00P3_176977) and the European Research Council (ERC 679330); R. Macleod by an SSHRC doctoral studentship grant (G101449: ‘Individual Life Histories in Long-Term Cultural Change’); G.R. by a Novo Nordisk Foundation Fellowship (gNNF20OC0062491); N.N.J. by Aarhus University Research Foundation; B.S.P. by an ERC-Starter Grant 'NEOSEA' (grant no. 
949424); H.S. by a Carlsberg Foundation Fellowship (CF19-0601); G.S. by Marie Skłodowska-Curie Individual Fellowship ‘PALAEO-ENEO’ (grant agreement number 751349); A. J. Schork by a Lundbeckfonden Fellowship (R335-2019-2318) and the National Institute on Aging (NIH award numbers U19AG023122, U24AG051129 and UH2AG064706); A.L. and I.S. by the Science Committee, Ministry of Education and Science of the Republic of Kazakhstan (AP08856317); B.G.-R. and M.G.-M. by the Spanish Ministry of Science and Innovation (project HAR2016-75605-R); C.M.-L. and O.R. by the Italian Ministry for the Universities (grants 2010-11 prot.2010EL8TXP_001, ‘Biological and cultural heritage of the central-southern Italian population through 30 thousand years’ and 2008 prot. 2008B4J2HS_001, ‘Origin and diffusion of farming in central-southern Italy: a molecular approach’); and D.C.-S. and I.G.-Z. by the Spanish Ministry of Science and Innovation (project HAR2017-86262-P). D.C.S.-G. acknowledges funding from the Generalitat Valenciana (CIDEGENT/2019/061) and the Spanish Government (EUR2020-112213). D.B. was supported by the NOMIS Foundation and Marie Skłodowska-Curie Global Fellowship 'CUSP' (grant no. 846856); E.R.U. by the Science Committee, Ministry of Education and Science of the Republic of Kazakhstan (АР09261083: ‘Transcultural Communications in the Late Bronze Age (Western Siberia–Kazakhstan–Central Asia)‘); E.C. by Villum Fonden (17649); J.E.A.T. by the Spanish Ministry of Economy and Competitiveness (HAR2013‐46861‐R) and Generalitat Valenciana (Aico/ 2018/125 and Aico 2020/97); and P.K. by the Russian Ministry of Science and Higher Education (Ural Federal University Program of Development within the Priority-2030 Program). P.K. also acknowledges the Museum of the Institute of Plant and Animal Ecology (UB RAS, Ekaterinburg). L.Y. acknowledges funding by the Science Committee of the Armenian Ministry of Education and Science (project 21AG-1F025); L.O. 
by the ERC Consolidator Grant ‘PEGASUS’ (agreement no. 681605); M. Sablin by the Russian Ministry of Science and Higher Education (075-15-2021-1069); N.C. by Historic Environment Scotland; S.V. and E.V.V. by the Russian Ministry of Science and Higher Education (075-15-2022-328); and V.M. by the Science Committee, Ministry of Education and Science of the Republic of Kazakhstan (AR08856925). V.A. is supported by a Lundbeckfonden Fellowship (R335-2019-2318); P.H.S. by the National Institute of General Medical Sciences (R35GM142916); S.R. by the Novo Nordisk Foundation (NNF14CC0001); T.S.K. is funded by Carlsberg grant CF19-0712; R.D. by the Wellcome Trust (WT214300); R.N. by the National Institute of General Medical Sciences (NIH grant R01GM138634); and F. Racimo by a Villum Fonden Young Investigator Grant (no. 00025300); by a Novo Nordisk Fonden Data Science Ascending Investigator Award (NNF22OC0076816) and by the European Research Council (ERC) under the European Union’s Horizon Europe programme (grant agreement No. 101077592). T.W. and V.A. are supported by the Lundbeck Foundation iPSYCH initiative (R248-2017-2003).

Author contributions

E.W. initiated the study. M.E.A., M.S., T.S.K., R.D., R.N., O.D., T.W., F. Racimo, K.K. and E.W. led the study. M.E.A., M.S., A.F., M.M., C.L.-F., R.N., E.C., T.W., K.K. and E.W. conceptualized the study. M.E.A., M.S., H.S., L.O., T.S.K., R.D., R.N., O.D., T.W., F. Racimo, K.K. and E.W. supervised the research. M.E.A., L.O., R.D., R.N., T.W., K.H.K., K.K. and E.W. acquired funding for research. M.E.A., A.F., J.S., K.-G.S., M.L.S.J., M.U.H., A.A.T., A.C., A.Z., A.M.S., A.J.H, A.G., A.L., B.H.N., B.G.-R, C.B., C.L., C.M.-L., D.V.P., D.C.-S., D.O.L., D.E., D.C.S.-G., D.B., E.B.P., E.K., E.V.V., E.R.U., E. Kannegaard, F. Radina, H.D., I.G.-Z., I.P., I.S., J.G., J.H., J.E.A.T., J.Z., J.V., K.B.P., K.T., L.N., L.L., L.N.M., L.Y., L.P., L. Sarti, L. Slimak, L.K., M.G.-M., M. Silvestrini, M.D., M.V., M.S.N., M.R., M.S., M.P., M.C., M. Sablin, N.C., N.S., O.P., O.R., O.V.L., P.A., P.K., P.C., P. Ríos, P. Lotz, P. Lysdahl, P.M., P.P., P.B., P.d.B.D., P.V.P., P.P.M., P.W., R.V.S., R. Maring, R. Menduiña, R.B., R.T., S.V., S.W., S.B., S.S., S.A.S., S.H.A., T.D.P., T.Z.T.J., Y.B.S., V.I.M., V.S., V.M., Y.M., I.M., O.G. and N.L. were involved in sample collection. M.E.A., M.S., A.R.-M., E.K.I.-P., W.B., A.I., J.S., A.P., B.S.d.M., M.I., M.M., L.V., A. Stern, C.G., F.E.Y, D.J.L., T.S.K., R.D., R.N., O.D., F. Racimo, K.H.K. and E.W. were involved in developing and applying methodology. M.E.A., J.S., C.G. and L.V. led the DNA laboratory work research component. K.-G.S., A.F. and M.E.A. led bioarchaeological data curation. M.E.A., M.S., A.R.-M., E.K.I.-P., W.B., A.I., A.P., B.S.d.M., B.S.P., A.H., R.A.H., T.V., H.M., A.M., A.V., A.B.N., P. Rasmussen, G.R., A. D. Ramsøe, A. Skorobogatov, A. J. Schork, A. Rosengren, C.J.M., I.A., L.Z., R. Maring, V.S., V.A., P.H.S., S.R., T.S.K., O.D. and F. Racimo undertook formal analyses of data. M.E.A., M.S., A.R.-M., E.K.I.-P., A.F., W.B., A.I., K.-G.S., R. Macleod, D.J.L., P.H.S., T.S.K., F. Racimo and E.W. drafted the main text (M.E.A. and M.S. 
led this). M.E.A., M.S., A.R.-M., E.K.I.-P., A.F., W.B., A.I., K.-G.S., A.P., B.S.d.M., B.S.P., A.H., R. Macleod, R.A.H., T.V., M.F.M., A.B.N., M.U.H., P. Rasmussen, A. J. Stern, N.N.J., H.S., G.S., A. Ramsøe, A. Skorobogatov, A. Rosengren, A.O., A.B., A.C., A.G., A.L., A.B.G., C.J.M., D.C.S.-G., E.B.P., E. Kostyleva, E.R.U., E. Kannegaard, I.G.-Z., I.P., I.S., J.G., J.H., J.E.A.T., K.H.K., L.Z, L.Y., L.P., L.K., M.B., M.G.-M., M.V., M.R., M.J., N.B., O.V.L., O.C.U., P.K., P. Lysdahl, P.B., P.W., R.V.S., R. Maring, R.B., R.I., S.V., S.W., S.B., S.H.A., T.Z.T.J., V.S., D.J.L., P.H.S., S.R., T.S.K., O.D. and F. Racimo drafted supplementary notes and materials. M.E.A., M.S., A.R.-M., E.K.I.-P., A.F., W.B., A.I., G.S., A.H., M.L.S.J., F.D., R. Macleod, L. Sørensen, P.O.N., R.A.H., T.V., H.M., A.M., N.N.J., H.S., A. Ramsøe, A. Skorobogatov, A. J. Schork, A. Ruter, A.O., B.H.N., B.G.-R., D.C.-S., D.C.S.-G., I.G.-Z., I.P., J.G., J.E.A.T., L.Z., L.O., L.K., M.G.-M., P.d.B.D., R.I., S.A.S., D.J.L., I.M., O.G., P.H.S., T.S.K., R.D., R.N., O.D., T.W., F. Racimo, K.K. and E.W. were involved in reviewing drafts and editing (M.E.A., M.S., A.F., K.-G.S., R. Macleod and E.W. led this, and subsequent finalization of the study). All co-authors read, commented on and agreed on the submitted manuscript.

Peer review

Peer review information

Nature thanks Benjamin Peter and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Data availability

All adapter-trimmed sequence data (fastq) for the samples sequenced in this study are publicly available on the European Nucleotide Archive under accession PRJEB64656, together with sequence alignment map files, aligned using human build GRCh37. The full analysis dataset including both imputed and pseudo-haploid genotypes for all ancient individuals used in this study is available at 10.17894/ucph.d71a6a5a-8107-4fd9-9440-bdafdfe81455. Aggregated IBD-sharing data as well as high-resolution versions of supplementary figures are available at Zenodo (10.5281/zenodo.8196989). Previously published ancient genomic data used in this study are detailed in Supplementary Data 7, and are all already publicly available. Bioarchaeological data (including accelerator mass spectrometry results) are included in the online supplementary materials of this submission. Map figures were created using Natural Earth Data (in Figs. 1–3 and 6 and Extended Data Figs. 1, 3, 4 and 8–11).

Code availability

All analyses relied on available software, which has been fully referenced in the manuscript and is detailed in the relevant supplementary notes. A collection of R functions for IBD-based mixture model inference is available at https://github.com/martinsikora/mixmodel_ibd.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Morten E. Allentoft, Martin Sikora, Alba Refoyo-Martínez, Evan K. Irving-Pease, Anders Fischer, William Barrie, Andrés Ingason

Deceased: Esben Kannegaard, Peder Mortensen

These authors jointly supervised this work: Thorfinn Korneliussen, Richard Durbin, Rasmus Nielsen, Olivier Delaneau, Thomas Werge, Fernando Racimo, Kristian Kristiansen, Eske Willerslev

Change history

1/18/2024

A Correction to this paper has been published: 10.1038/s41586-024-07044-5

Contributor Information

Morten E. Allentoft, Email: morten.allentoft@curtin.edu.au

Martin Sikora, Email: martin.sikora@sund.ku.dk.

Eske Willerslev, Email: ew482@cam.ac.uk.

Extended data

is available for this paper at 10.1038/s41586-023-06865-0.

Supplementary information

The online version contains supplementary material available at 10.1038/s41586-023-06865-0.

Associated Data

Supplementary Materials

Supplementary Information (28.7MB, pdf)

Supplementary Notes 1–7: 1, Data Generation and Authentication; 2, Imputation of ancient DNA (including Figures S2.1 to S2.11, and Tables S2.1 and S2.2); 3, Demographic Inference, comprising: 3a ‘Phylogenetic analysis of mtDNA sequences’ (including Figures S3a.1 to S3a.3), 3b ‘Y chromosome/sex determination’ (including Figures S3b.1 to S3b.8), 3c ‘Relatedness’ (including Figures S3c.1 and S3c.2, and Tables S3c.1 and S3c.2), 3d ‘Overall Population Structure’ (including Figures S3d.1 to S3d.16), 3e ‘Inferring the spatiotemporal spread of population movements in the past 13 millennia’ (including Figures S3e.1 to S3e.5, and animations S3.1 to S3e.11), 3f ‘HBD/IBD sharing/ROH/clustering’ (including Figures S3f.1 to S3f.53); 4, 14C chronology and estimates of reservoir effects (including Table S4.1); 5, From forager to farmer in western Eurasia: an archaeological overview (including Figures S5.1 to S5.3); 6, Catalogue of Danish archaeological sites (including Figures S6.1 to S6.15); and 7, Catalogue of non-Danish archaeological sites (including Figures S7.1 to S7.3, and Tables S7.1 to S7.3).

Reporting Summary (108.7KB, pdf)

Supplementary Data 1 (63.1KB, xlsx)

Summary details of samples presented with novel genome data.

Supplementary Data 2–4 (230.2KB, xlsx)

Supplementary Data 2 contains dates, isotopes and context. Supplementary Data 3 includes reservoir correction calculations, and Supplementary Data 4 contains isotopes and all individual samples.

Supplementary Data 5 and 6 (162.4KB, xlsx)

Supplementary Data 5 contains DNA contamination estimates and Supplementary Data 6 contains relatedness estimates.

Supplementary Data 7 (206.9KB, xlsx)

Full ancient genomes dataset.

Supplementary Data 8–13 (4.3MB, xlsx)

Supplementary Data 8 contains mixture model sets. Supplementary Data 9–13 show ancestry proportions for sets “deep”, “postNeol”, “postBA”, “hgEur” and “fEur” respectively.

Supplementary Data 14 (29KB, xlsx)

Admixture time estimates.


Articles from Nature are provided here courtesy of Nature Publishing Group
