This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2024 Jan 11:2023.07.11.548607. Originally published 2023 Jul 11. [Version 2] doi: 10.1101/2023.07.11.548607

The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change

Alexis Simon 1,2, Graham Coop 1,2
PMCID: PMC10370008  PMID: 37503227

Abstract

Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 years, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.

Keywords: Linked selection, gene flow, time series, ancient DNA, human evolution

1. Introduction

There is a long-standing debate about the role of genetic drift versus selection in evolutionary change (Buffalo, 2021; Gillespie, 1984; Jensen et al., 2019; Kern & Hahn, 2018; Kimura, 1968; Kreitman, 1996; Sella et al., 2009). While this debate has sometimes been contentious, the answers to these questions are quantitative, describing the relative contributions of basic evolutionary forces to allele frequency change and how this differs across species and different functional categories. Estimating these contributions is complicated, in part because selection can have direct and indirect effects, where the indirect effects include “linked selection”, the impact of correlated selection at nearby selected sites (Barton, 2000; Charlesworth et al., 1993; Coop, 2016; Kaplan et al., 1989; Maynard-Smith & Haigh, 1974; Sella et al., 2009). The problem is made more difficult as we often rely on a single snapshot of contemporary genomes to tease apart multiple interacting processes (gene flow, demography, hard or soft sweeps, background selection, selective interference, etc).

Genomic time series, from museum collections and ancient DNA, offer a potent reservoir of temporal genetic data to track the changes in allele frequencies, identify selected loci, and understand the impact of other evolutionary forces (e.g. Bergland et al., 2014; He et al., 2023; Le et al., 2022; Machado et al., 2021; Mathieson & Terhorst, 2022). Ancient human DNA has already revolutionized our view of human history, revealing that large-scale population movement and gene flow are pervasive, with complex patterns of allele frequency change driven by population turnover.

Unlike genetic drift, allele frequency change due to either selection or gene flow is expected to be sustained and directional. Recent investigations have highlighted the role of gene flow and population structure in confounding our interpretation of genetic signals of selection in humans (Berg et al., 2019; Petr et al., 2019; Sohail et al., 2019; Souilmi et al., 2022). Methods to look at selection at single loci and on polygenic scores in human ancient DNA now often account for the confounding effects of gene flow. These approaches have revealed persuasive signals of selection (Field et al., 2016; Irving-Pease et al., 2022; Ju & Mathieson, 2021; Mathieson & Terhorst, 2022; Mathieson et al., 2015; Souilmi et al., 2022; Wilde et al., 2014). However, these methods only capture outlier signals, and so cannot give us a full picture of how gene flow, selection, and genetic drift have driven genome-wide change. Linked selection is thought to have a critical role in shaping patterns of genetic diversity and divergence in humans on long time scales, with an autosomal reduction in diversity of upward of 20 % from background selection alone (McVicker et al., 2009; Murphy et al., 2021). Some authors have also argued for a pervasive role of selective sweeps in shaping genome-wide patterns of diversity (Enard et al., 2014; Schrider & Kern, 2017). Thus, it is an open question how much of allele frequency change genome-wide is driven by selection in humans.

We set out to decompose the total variance in allele frequency change into the contributions of linked selection, gene flow, and drift. Unlike genetic drift, ongoing selection creates covariance in allele frequency change between non-overlapping time intervals. The use of genome-wide allele frequency change covariances in time series data has recently been proposed to identify the proportion of genome-wide change due to selection in closed populations (Buffalo & Coop, 2019). In a panmictic population, a genome-wide positive covariance between a pair of time periods indicates the compounding of allele frequency change across generations, while negative covariance can potentially result from selection pressures in opposite directions. Many different modes of selection are expected to generate these covariance patterns (Buffalo & Coop, 2020; Santiago & Caballero, 1998). This approach has been applied to experimental evolution datasets (Brennan et al., 2022; Buffalo & Coop, 2020) and to natural populations where temporal sampling is available (in Mimulus, oaks and cod; Kelly, 2022; Reid et al., 2023; Saleh et al., 2022). These applications, along with related methods (Bertram, 2021), have revealed that a reasonable proportion of total allele frequency change, especially in artificial selection experiments, can be attributed to widespread selection beyond just a few outliers. However, applying these methods when population structure and migration are present will give biased results, as sustained gene flow across time periods can also drive temporal covariance in allele frequency change.

Here, we develop theory and simulations to show how the effect of gene flow can be accounted for using an admixture model from a known set of sources to recover the genome-wide contribution of gene flow and linked selection. We demonstrate this approach using two European human ancient DNA time series from the UK and the Bohemian region of Central Europe to quantify the contributions of linked selection and gene flow to the total variance in allele frequency change between the Neolithic and a modern or recent time point. In both cases, we find a major contribution of gene flow to allele frequency change. After correcting for known gene flow in these time transects, we do not observe any signal of genome-wide linked selection. However, we detect a weak signal of linked selection in levels of temporal allele frequency variance in regions of the genome with low recombination and high gene density.

2. Results

2.1. Model

We consider a model of data from a population sampled at multiple discrete time points ($t \in [0, T]$) from an arbitrary geographic region. This population receives gene flow, modeled as single pulses of admixture, from other known source groups through time. We follow the population allele frequency over time, $p_t$, and use $\Delta p_t$, defined as $p_{t+1} - p_t$, to denote the change in allele frequency between adjacent time points. The focal population has mean ancestry fractions $\bar{\alpha}_{r,t}$ ($r \in [1, R]$) from $R$ source populations that will change over time due to gene flow. We assume that allele frequencies in the source populations are constant and that good proxy samples for these sources are available, and discuss the implications of those assumptions later.

Given sample frequencies at some large set of SNPs, we can calculate the empirical variance-covariance matrix of allele frequency change for our time series, $\mathrm{Cov}(\Delta p_i, \Delta p_j)$, for all combinations of time intervals $i$ and $j$, averaged over SNPs. In calculating these covariances we include adjustments for biases in the variance and covariance estimates due to shared sampling of an intermediate time point (see section 4.1 and appendix A).
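The covariance matrix described above can be sketched in a few lines. This is a minimal illustration, not the estimator used in the paper: the input layout `freqs[t][l]` and the function names are hypothetical, the toy frequencies are made up, and the sampling-noise bias adjustments of section 4.1 and appendix A are deliberately left out.

```python
def allele_freq_changes(freqs):
    """freqs[t][l] is the frequency of SNP l at time point t (hypothetical layout).
    Returns deltas[i][l] = p_{i+1} - p_i for each adjacent pair of time points."""
    return [[later - earlier for earlier, later in zip(freqs[t], freqs[t + 1])]
            for t in range(len(freqs) - 1)]


def cov(x, y):
    """Covariance across SNPs (dividing by the number of SNPs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def temporal_cov_matrix(freqs):
    """Matrix of Cov(dp_i, dp_j) over all pairs of time intervals.
    NOTE: no correction for sampling noise is applied in this sketch."""
    deltas = allele_freq_changes(freqs)
    return [[cov(d_i, d_j) for d_j in deltas] for d_i in deltas]


# Toy data: 3 time points (2 intervals), 4 SNPs.
freqs = [
    [0.10, 0.50, 0.30, 0.80],
    [0.20, 0.40, 0.30, 0.90],
    [0.30, 0.30, 0.40, 0.90],
]
C = temporal_cov_matrix(freqs)
print(C)
```

In this toy example the off-diagonal entry `C[0][1]` is positive (about 0.0044) because most SNPs keep moving in the same direction across the two intervals.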

Under our model, the total variance in allele frequency change between the first sampled time point (0) and any following time point (T) in the time series can be decomposed into sums over time intervals of the contributions due to drift, selection and admixture:

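$$
\begin{aligned}
\mathrm{Var}(p_T - p_0) ={}& \underbrace{\sum_{i=0}^{T-1} \mathrm{Var}(\Delta_D p_i)}_{\text{Drift}} + \underbrace{\sum_{i=0}^{T-1} \mathrm{Var}(\Delta_S p_i)}_{\text{Selection}} + \underbrace{\sum_{i=0}^{T-1} \mathrm{Var}(\Delta_A p_i)}_{\text{Admixture}} \\
&+ \underbrace{\sum_{i \neq j}^{T-1} \mathrm{Cov}(\Delta_S p_i, \Delta_S p_j)}_{\text{Selection}} + \underbrace{\sum_{i \neq j}^{T-1} \mathrm{Cov}(\Delta_A p_i, \Delta_A p_j)}_{\text{Admixture}}
\end{aligned}
\tag{1}
$$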

For simplicity, we omit here an interaction term: drift in one time period is subsequently erased by admixture in later time periods, which adds an additional term to the covariances (appendix C). We account for this term in our analyses.
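The structure of eq. (1) can be recovered in two steps. The following is a sketch under assumptions introduced here for illustration: the change in each interval splits additively into drift, selection and admixture components, the covariances between different components vanish, and drift is uncorrelated between intervals.

```latex
p_T - p_0 = \sum_{i=0}^{T-1} \Delta p_i
\quad\Longrightarrow\quad
\mathrm{Var}(p_T - p_0)
  = \sum_{i=0}^{T-1} \mathrm{Var}(\Delta p_i)
  + \sum_{i \neq j}^{T-1} \mathrm{Cov}(\Delta p_i, \Delta p_j),
\qquad
\Delta p_i = \Delta_D p_i + \Delta_S p_i + \Delta_A p_i .
```

Substituting the second expression into the first and dropping the cross-component terms leaves the five sums of eq. (1). The drift-admixture interaction mentioned above is one of the cross terms that this sketch drops.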

The expected variance and covariance of allele frequency change due to admixture follow from the expected allele frequency change implied by the change in ancestry proportions through time. Specifically, if in the $t$th time interval admixture changes the ancestry proportion from the $r$th source from $\bar{\alpha}_{r,t}$ to $\bar{\alpha}_{r,t+1}$ ($\Delta\bar{\alpha}_{r,t} = \bar{\alpha}_{r,t+1} - \bar{\alpha}_{r,t}$), then the expected change in frequency due to admixture is $\Delta_A p_t = \sum_{r=1}^{R} \Delta\bar{\alpha}_{r,t} f_r$, where $f_r$ is the allele frequency in the source population $r$. Thus, the admixture contribution to covariance in allele frequency change between time periods $i$ and $j$ can be expressed as:

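$$
\mathrm{Cov}(\Delta_A p_i, \Delta_A p_j) = \mathrm{Cov}\left(\sum_{r=1}^{R} \Delta\bar{\alpha}_{r,i} f_r,\ \sum_{r=1}^{R} \Delta\bar{\alpha}_{r,j} f_r\right) \tag{2}
$$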

As we only have sample allele frequencies from proxies of the sources of admixture, this matrix is corrected for sampling noise biases in fr (appendix B). With this admixture covariance in hand, we can now calculate the contribution of gene flow to allele frequency change, and adjust for the contribution of admixture when looking for covariances induced by selection.
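Eq. (2) can be illustrated with a small sketch. The data layout (`alpha[t][r]`, `source_freqs[r][l]`), the function names and the toy numbers are all hypothetical, and the correction for sampling noise in the source frequencies (appendix B) is not included.

```python
def admixture_deltas(alpha, source_freqs):
    """Expected admixture-driven change per interval and SNP.
    alpha[t][r]: mean ancestry fraction from source r at time point t.
    source_freqs[r][l]: allele frequency of SNP l in source r (assumed constant)."""
    n_snps = len(source_freqs[0])
    deltas = []
    for t in range(len(alpha) - 1):
        d_alpha = [after - before for before, after in zip(alpha[t], alpha[t + 1])]
        deltas.append([
            sum(d_alpha[r] * source_freqs[r][l] for r in range(len(d_alpha)))
            for l in range(n_snps)
        ])
    return deltas


def cov(x, y):
    """Covariance across SNPs (dividing by the number of SNPs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def admixture_cov_matrix(alpha, source_freqs):
    """Cov(dA p_i, dA p_j) for all pairs of intervals, without noise correction."""
    deltas = admixture_deltas(alpha, source_freqs)
    return [[cov(d_i, d_j) for d_j in deltas] for d_i in deltas]


# Toy example: ancestry from source 1 rises by 0.2 in each of two intervals.
alpha = [[1.0, 0.0], [0.8, 0.2], [0.6, 0.4]]
source_freqs = [[0.1, 0.5, 0.9],   # source 0
                [0.3, 0.5, 0.1]]   # source 1
C_adm = admixture_cov_matrix(alpha, source_freqs)
print(C_adm)
```

Because the same ancestry shift is repeated in both intervals, the expected admixture-driven changes are identical and the between-interval covariance equals the within-interval variance. This is the kind of positive covariance that repeated gene flow in one direction produces.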

We express the estimated proportion of total variance in allele frequency change attributable to gene flow (admixture) up to time $t$, $A_t$, as:

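$$
A_t = \frac{\sum_{i=0}^{t-1} \mathrm{Var}(\Delta_A p_i) + \sum_{i \neq j}^{t-1} \mathrm{Cov}(\Delta_A p_i, \Delta_A p_j)}{\mathrm{Var}(p_t - p_0)} \tag{3}
$$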

where the terms in the numerator are given by eq. (2). Note that $A_t$ might be an underestimate as it excludes the contribution of gene flow from unmodeled sources, as well as gene flow events that leave the admixture proportions relatively unchanged.

The proportion of total variance in allele frequency change between 0 and $t$ due to linked selection, $G_t$, is defined as the ratio of the total covariance due to linked selection over the total variance (Buffalo & Coop, 2019). Under our model, $G_t$ can be estimated by correcting the empirical covariance by the estimated covariance term due to admixture:

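$$
G_t = \frac{\sum_{i \neq j}^{t-1} \mathrm{Cov}(\Delta_S p_i, \Delta_S p_j)}{\mathrm{Var}(p_t - p_0)} = \frac{\sum_{i \neq j}^{t-1} \left[\mathrm{Cov}(\Delta p_i, \Delta p_j) - \mathrm{Cov}(\Delta_A p_i, \Delta_A p_j)\right]}{\mathrm{Var}(p_t - p_0)} \tag{4}
$$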

We also report an estimate of $G$ not controlling for admixture ($G_{nc}$). Note that $G$ is a lower bound on the proportion as it does not account for the contribution of linked selection to the variance in allele frequency change within time periods. We attribute the residual proportion $1 - A - G$ of the total allele frequency variance to drift-like allele frequency change. This proportion of the temporal variance is consistent with genetic drift as it excludes covariances between time periods, which cannot come from drift, and the contribution of known gene flow.
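Given an empirical covariance matrix and an admixture covariance matrix, eqs. (3) and (4) reduce to sums over matrix entries. The sketch below uses made-up 2 x 2 matrices and hypothetical function names, and ignores the drift-admixture interaction term of appendix C. It relies only on the identity that the variance of a sum of changes is the sum of all entries of their covariance matrix.

```python
def block_sum(M, t):
    """Sum of all entries of the leading t x t block of M."""
    return sum(M[i][j] for i in range(t) for j in range(t))


def off_diagonal_sum(M, t):
    """Sum of the between-interval (i != j) entries of the leading t x t block."""
    return sum(M[i][j] for i in range(t) for j in range(t) if i != j)


def decompose(C, C_adm, t):
    """Return (A_t, G_nc, G_t, residual) for the first t intervals.
    C: empirical covariance matrix of allele frequency change.
    C_adm: admixture covariance matrix, as in eq. (2)."""
    var_total = block_sum(C, t)            # Var(p_t - p_0)
    A = block_sum(C_adm, t) / var_total    # eq. (3)
    G_nc = off_diagonal_sum(C, t) / var_total
    G = (off_diagonal_sum(C, t) - off_diagonal_sum(C_adm, t)) / var_total  # eq. (4)
    return A, G_nc, G, 1 - A - G


# Made-up matrices in which all between-interval covariance comes from admixture.
C = [[0.010, 0.004],
     [0.004, 0.008]]
C_adm = [[0.006, 0.004],
         [0.004, 0.005]]
A, G_nc, G, residual = decompose(C, C_adm, t=2)
print(A, G_nc, G, residual)
```

In this constructed example the uncorrected estimate `G_nc` is clearly positive while the corrected `G` is zero, which mirrors the situation where admixture alone generates the covariances.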

2.2. Simulations

To illustrate our decomposition of genome-wide allele frequency change, we simulated a simple scenario where a population receives pulses of admixture (arrows in fig. 1A and fig. S1). This repeated admixture results in positive covariances between time intervals due to the admixture-driven allele frequency change in generations 160–100 (measured before present) being in the same direction as those in generations 60–0 (fig. 1B & C). We can remove the covariance due to admixture (eq. (2)), resulting in covariances close to zero (fig. 1B & D).

Figure 1:

Simulation scenario of admixture between two populations (0 and 1) under neutrality (B to E) and with selection (F). (A) Ancestry proportions of the focal admixed population through time in generations before present. Arrows indicate the migration pulses from Pop1 (at 150, 130, 110, 50, 30 and 10 generations before present). (B) Covariance between time intervals. Below diagonal values are before admixture correction, above diagonal are after admixture correction. (C) and (D) pre- and post-admixture correction covariances. X-axis values are slightly shifted for visualization and the bottom lines indicate point groupings to their corresponding time. (E) Proportion of the total variance between the initial measured time (120) and t due to linked selection (Gnc and G are pre- and post-admixture correction respectively) and to gene flow (A). Points are slightly x-shifted for visualization. (F) Simulations for the decomposition of variance for neutral polymorphisms with selection around a gradually moving optimum starting at generation 140 BP for three independent traits. All points have 95 % confidence intervals (CIs) of the mean, computed using 100 replicates of the simulations (here the small CIs are hidden by their points).

We can calculate the total contribution of gene flow to allele frequency change in our simulated time series (eq. (3)) and see that gene flow accounts for much of the allele frequency change. Because the repeated gene flow creates positive covariance in allele frequency change, failing to account for this gene flow generates a spurious signal of linked selection (large non-zero Gnc's, dashed black line fig. 1E). However, when we account for gene flow, the signal of linked selection is almost completely removed from our neutral simulations (G, black line, fig. 1E). The remaining slightly non-zero G value in our final time period (fig. 1E) results from a slight over-correction for the covariance due to the interaction of drift and gene flow (see figs. S2 and S3F).
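The way repeated gene flow creates positive covariance under neutrality can be reproduced with a toy simulation. This is not the simulation setup of the paper: it is a minimal sketch with two intervals, each consisting of one admixture pulse from a fixed source followed by one generation of Wright-Fisher drift, and all parameter values (`m`, `n_loci`, `n_chrom`, `seed`) are arbitrary choices made for illustration.

```python
import random


def drift_one_generation(p, n_chrom, rng):
    """Binomial resampling of n_chrom chromosomes at frequency p."""
    return sum(rng.random() < p for _ in range(n_chrom)) / n_chrom


def cov(x, y):
    """Covariance across loci (dividing by the number of loci)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def simulate_cov(m, n_loci=1000, n_chrom=200, seed=1):
    """Cov(dp_0, dp_1) across neutral loci after two pulse-plus-drift intervals.
    m is the admixture fraction per pulse; m = 0 gives a closed population."""
    rng = random.Random(seed)
    dp0, dp1 = [], []
    for _ in range(n_loci):
        p0 = rng.uniform(0.1, 0.9)  # focal population frequency
        f = rng.uniform(0.1, 0.9)   # source frequency, constant through time
        p1 = drift_one_generation((1 - m) * p0 + m * f, n_chrom, rng)
        p2 = drift_one_generation((1 - m) * p1 + m * f, n_chrom, rng)
        dp0.append(p1 - p0)
        dp1.append(p2 - p1)
    return cov(dp0, dp1)


cov_with_pulses = simulate_cov(m=0.2)  # repeated gene flow from the same source
cov_closed = simulate_cov(m=0.0)       # drift only
print(cov_with_pulses, cov_closed)
```

With these settings the covariance under repeated pulses is clearly positive, while the drift-only covariance is close to zero, even though no locus is under selection in either case.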

To illustrate the effects of selection on the covariance, we extended the above admixture simulations to include a set of loci underlying traits under stabilizing selection around a moving optimum (see section 4.2). In these simulations, we can see the proportion of neutral allele frequency change due to selection increasing as covariances build up over time (black line fig. 1F, see also fig. S3). The effect of selection on the covariances in neutral allele frequencies is also well illustrated in a model without any gene flow (figs. S4 and S5).

2.3. Ancient DNA time transects

We investigated two time transects of allele frequencies in ancient humans in restricted geographical regions, the first one in the UK (Patterson et al., 2022, the England and Wales samples), and the second in the Central European region of Bohemia (Papac et al., 2021, samples from the current Czech Republic) spanning periods back to ~5,600 years ago. Both these time series cover major migrations of people, where an initial mixture of early-farmer-like ancestry (EEF-like) and Western hunter-gatherer-like ancestry (WHG-like) is partially replaced by Steppe pastoralist-like ancestry (Steppe-like). This large turnover due to Steppe-like migration into Central and Western Europe was followed by a progressive increase in EEF-like ancestry over a longer time period.

We combined the Patterson et al. (2022) UK data from 793 ancient individuals with the present day GBR 1000 genomes samples (“people of European ancestry from Great Britain (GBR)”), to form a time series that runs from ~5,500 years ago to the present day. Following Patterson et al. (2022) we broke the samples into 7 time periods corresponding to transitions in ancestry proportions (fig. 2C). We used the individual ancestry proportions of Patterson et al. (2022) inferred from a qpAdm three population model. These ancestry proportions are calculated to reflect genetic similarity to a set of samples that are pre-specified proxies for sources of ancestry. Note that an increase in a particular ancestry likely does not reflect gene flow directly from that source but rather from more nearby groups who themselves were already mixtures. In turn, each of the three major putative source ancestries was a product of admixture in the past.

Figure 2:

Human time series covariance matrices and ancestries (UK left column, Bohemia right column). (A-B) Covariance matrices with covariances significantly different from zero marked with a star. The covariances have only been corrected for sampling bias and not admixture. (C-D) Mean ancestry proportions from the three reference populations in the time transects samples (the mean of sample ages in each period is used for the representation).

The covariance matrix of allele frequency changes between time periods is shown in fig. 2A. The UK time transect shows several large negative covariances between allele frequency change in the first time period (Δp0, 5424–4005 years ago) and subsequent time periods (see also black points in fig. 3A). This negative covariance largely reflects that the initial large population turnover due to Steppe-like migration (during the first time period) was followed by an increase in EEF-like ancestry (fig. 2C, Patterson et al., 2022), generating allele frequency changes in the opposite direction. After correction for admixture, covariances are strongly reduced, with only a small subset differing significantly from 0 (fig. 3C). In the UK time transect, most of the total variance in allele frequency change across time is due to admixture-based changes, with 60 % of allele frequency change being driven by the Steppe-like gene flow; the contribution of admixture then drops gradually as the EEF-like ancestry increases due to subsequent migration(s) (fig. 3E, G and A). If we do not adjust for admixture, our estimate of the contribution of linked selection is negative (fig. 3E, Gnc), reflecting the negative covariances induced by admixture. After adjusting for admixture there is no signal of a long-term contribution of linked selection, with G not departing from zero, as there is no consistent pattern of residual positive covariances. Our empirical covariance results match those produced by a UK-like neutral simulation (Pearson and Durbin, 2023 model with UK matched admixture pulses, figs. S16 and S17). Finally, we checked for effects of ascertainment bias on our G and A estimates, making use of the fact that the genotyping array consists of SNP sets discovered in different ascertainment panels. While using subsets of differently ascertained loci increases the uncertainty in our estimates, particularly for panels from more genetically distant samples, we find that our results are robust to the ascertainment scheme (fig. S8).

Figure 3:

Human time series covariance corrections and variance decomposition (UK left column, Bohemia right column). The top panels show the time intervals on the years BP axis, note that the two time series have different axes. All 95 % CIs are computed through a block bootstrap procedure and represented with vertical bars. (A-B) Covariance values pre-admixture correction. Each line corresponds to the covariance between a first time interval (Δpi, color code) and a later time interval (Δpj, x-axis). (C-D) Covariance values post-admixture correction. (E-F) Proportion of the total variance, between the initial measured time (0) and time t, due to linked selection (Gnc for non-corrected and G) and to gene flow (A).

We investigated another time transect from the Bohemia region (Papac et al., 2021) spanning from 5606 to 3037 years ago, which we also split into 7 time periods following the original paper (fig. 2D). The largest covariance is negative, and an order of magnitude larger than in the UK dataset (fig. 3B), and again seems to be due to the large influx from a Steppe-like source between 3880 and 3120 years ago (between time points 3 and 4, fig. 2D) and subsequent recovery of EEF-like ancestry (between Δp3 and Δp4, fig. 2D). Again, this large covariance due to admixture can be corrected for (fig. 3D). While the older time points have a large amount of uncertainty due to small sample sizes (section 4.3), nearly all of the variance in allele frequency change across the full time period is accounted for by admixture (A ≈ 0.9), and we see little evidence of allele frequency change attributable to linked selection in this Bohemian time series (fig. 3F, G; a result that holds across SNP ascertainment schemes, fig. S8).

In sum, we see little evidence, in either transect, of linked selection in the covariances in allele frequency change between time intervals, suggesting that having accounted for admixture, much of the residual change across time intervals is due to drift-like sampling processes. One caveat is that if selection operates over short time scales, e.g. selection pressures fluctuate or deleterious alleles are quickly lost, selection could generate substantial allele frequency change (variance) within time intervals but little to no covariance between the time intervals we consider.

To address the concern about the time intervals, we first reran our covariance analysis on the larger UK dataset, splitting each time period in half. With this finer dissection of short-term covariances, we still see no evidence of covariance due to linked selection (fig. S12). To further explore the effect of time intervals, we extended our simulations of Gaussian stabilizing selection and found that the sum of covariances decreases with the length of the time interval studied (fig. 4A); however, this effect is only pronounced when the recombination rate is low (see also fig. S6). Temporal covariances are also generated under models of background selection (fig. 4B and fig. S7; Buffalo & Coop, 2020) and, while somewhat diminished, these also persist with longer sampling periods. Thus our simulations suggest that while temporal binning of ancient DNA samples will lead to lower covariances, the signal of linked selection should still be detectable.
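Part of the reduction in covariance under temporal binning is pure bookkeeping, which a small sketch can make concrete. When adjacent intervals are merged, covariances between intervals that now fall in the same bin are counted as within-bin variance, so the between-interval sum shrinks while the total variance is unchanged. The matrix values and function names below are made up for illustration, and the sketch does not model the decay of covariances with recombination that the simulations explore.

```python
def bin_cov_matrix(C, width):
    """Merge consecutive intervals in groups of `width`.
    len(C) must be divisible by width. Merged changes are sums of the
    original changes, so each merged entry is the sum of a block of C."""
    n_bins = len(C) // width
    return [[sum(C[i * width + a][j * width + b]
                 for a in range(width) for b in range(width))
             for j in range(n_bins)]
            for i in range(n_bins)]


def off_diagonal_sum(M):
    """Sum of between-interval covariances."""
    return sum(M[i][j] for i in range(len(M)) for j in range(len(M)) if i != j)


def total_sum(M):
    """Total variance of the summed change: all entries of the matrix."""
    return sum(sum(row) for row in M)


# Four short intervals with equal variance and a constant positive covariance.
C_fine = [[1.0 if i == j else 0.2 for j in range(4)] for i in range(4)]
C_binned = bin_cov_matrix(C_fine, width=2)

print(off_diagonal_sum(C_fine), off_diagonal_sum(C_binned))
print(total_sum(C_fine), total_sum(C_binned))
```

Here the between-interval sum drops from about 2.4 to about 1.6 after binning, while the total stays the same. The covariance signal is reduced but does not vanish.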

Figure 4:

Variation of the total covariances and variances with sampling intervals (x-axis) and recombination rates (color coded) for the Gaussian stabilizing selection model (GSS, A and C) and background selection (BGS, B). Mean and 95 % CI of the mean are plotted for either the sum of all covariances or the sum of all variances of allele frequency changes between time intervals after corrections. The sum of covariances is proportional to G. For each recombination rate, 100 simulations are run with sample recording at every generation starting at the creation of Pop2 in the demographic scenario with admixture (fig. S1).

One further prediction is that linked selection is expected to have larger effects in low recombination regions than high recombination regions, and in regions with a greater density of functional sites (patterns that are seen in human polymorphism datasets, Cai et al., 2009; Hernandez et al., 2011; McVicker et al., 2009). In simulations, we can see this effect, with greater temporal covariances in regions of lower recombination (fig. 4A) and larger variances in allele frequency changes with lower recombination (fig. 4C). While the temporal covariances decrease with longer time intervals, linked selection makes a greater contribution to the variance of allele frequency change within time intervals, so the overall signal of linked selection can be retained in the correlation of allele frequency temporal variances with recombination. To empirically examine this effect of selection we binned SNPs by their local recombination rate and by a measure of the potential strength of linked selection, the B-value, which at each location in the genome combines the information of recombination rates and density of coding sites (McVicker et al., 2009; Murphy et al., 2021). In both time transects, we do not observe any significant variation in the G and A statistics recalculated in bins of recombination rate (Bhérer et al., 2017) or the B-value (figs. S10 and S14). However, in the UK transect, we do see a significant increase in the total variance in allele frequency change in the lowest B-value bin (corresponding to the largest decrease in effective population size due to background selection, fig. 5A, first bin mean is above the genome-wide 95 % CI) and in the variances of change within some of the time intervals (fig. 5B). Noise in the smaller Bohemia time transect precludes seeing such effects (figs. S13 to S15). The increase in allele frequency temporal variance in the UK for the lowest B-value quintile, compared to the genome-wide mean, is 14.8 %, suggesting that a fifth of this, ~3 %, is an estimate of the genome-wide contribution of linked selection to allele frequency change.
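The step from 14.8 % to ~3 % can be written out explicitly. The reading below is an illustrative assumption: the excess variance is taken to be confined to the lowest B-value quintile, which holds one fifth of the SNPs, with the other four bins sitting at the baseline.

```latex
\underbrace{0.148}_{\text{excess variance in lowest quintile}}
\times
\underbrace{\tfrac{1}{5}}_{\text{fraction of SNPs in that bin}}
= 0.0296 \approx 3\,\%
```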

Figure 5:

Temporal variances for the UK time series binned by a proxy for the strength of linked selection (B-value). (A) Total temporal variance and sum of covariance for each quintile bin of B-values. The blue dash-dotted line and interval are the genome-wide mean and 95 % block bootstrap confidence interval for total variance computed with 1/5th of windows sampled on each bootstrap to be comparable to the binned values. (B) Variances by bin for each time interval, normalized by heterozygosity. B-value quintile bins: [0.536–0.755), [0.755–0.849), [0.849–0.902), [0.902–0.944), [0.944–0.997).

3. Discussion

Here we have shown how ancient DNA data can be used to decompose the contribution of gene flow, linked selection, and drift to genome-wide allele frequency change. Using two ancient DNA time transects, our results demonstrate that gene flow is the dominant force changing allele frequencies in the recent history of European human populations, and that selection-driven change is not common across the genome. This does not necessarily contradict the number of signals of temporal selection found to date, as a small fraction of loci could be subject to strong selection (e.g. Irving-Pease et al., 2022; Ju & Mathieson, 2021; Le et al., 2022; Mathieson & Terhorst, 2022; Mathieson et al., 2015; Wilde et al., 2014). Indeed, some of these methods apply similar admixture adjustments as ours, but look for genome-wide outliers and so only detect strong selection on single loci (e.g. Mathieson and Terhorst (2022) expect to be able to detect selection coefficients > 0.02). Another set of approaches looks for ancient selection on polygenic scores constructed from genome-wide association studies (Le et al., 2022; Mathieson & Terhorst, 2022). These approaches account for admixture and can detect subtle shifts at loci in ancient DNA, but rely on the fraction of genetic variation for specific traits captured in modern-day samples (Yair & Coop, 2022). Thus, our genome-wide method is complementary to both time series outlier approaches as well as phenotype-motivated approaches.

The large contribution of gene flow to evolutionary change in the past few thousand years is not surprising given the dynamic picture of population movement that has emerged from ancient DNA. Our results provide additional evidence that allele frequency changes are well fit by relatively simple admixture models, and strengthen the view that multiple migration events throughout the history of European human populations have played a preponderant role in the composition of modern populations. The lack of a substantial contribution of linked selection is perhaps more surprising. Linked selection has been estimated to account for a reduction of upward of 20 % in long-term patterns of human diversity (under models of background selection, McVicker et al., 2009; Murphy et al., 2021), and so we should expect a similar portion of the variance in allele frequencies to come from linked selection. Much of this effect should manifest itself in the compounding of positive covariances between allele frequency changes across the generations. While this effect has been seen empirically in selection experiments and in some natural populations, we currently do not see any evidence of this in humans. One possibility is that the time periods we consider are not long enough for strong covariances to build up, as the long-term patterns of linked selection reflect dynamics over coalescent time scales of hundreds of thousands of years. Another possibility is that negative selection generating background selection is fast enough that it does not contribute to covariances among the time periods used here. However, under this latter interpretation, we should see higher allele frequency variances in regions predisposed to stronger linked selection, but we see this effect only weakly when partitioning loci by B-value. Larger collections of ancient DNA will allow better temporal resolution of allele frequency covariances, which could be combined with more individual-level approaches to avoid the need for sample lumping in time periods. It is also possible that some signals of linked selection may be washed out at the fine geographic scale of our time series, as our time series approach may partially be picking up ephemeral change which may average out over the much larger meta-population within which our time series are embedded.

Our approach uses ancestry proportions from ancient DNA for the three major inferred waves of gene flow into Europe. The sparsity of ancient DNA means that we rely on the ancestry proportions of relatively small samples of ancient individuals to be representative of people living in the past. However, the periods that we divide our samples into reflect reasonably well-established periods in the peopling of the regions. We also rely on allele frequencies in a set of samples as proxies for sources of gene flow. As we discuss below, the misspecification of the sources of gene flow may appear as evolutionary change within our focal time series. One future extension might be to use principal components analysis to learn about major axes of population structure involving samples in a time series and then to regress these principal components out of our genotypes to account for variation in ancestry composition in a more model-free manner.

Our admixture correction seems to perform well on time intervals involving the large ancestry shift in Steppe-like and EEF-like ancestry (compare black points between fig. 3A and C). However, we see several negative covariances that remain after adjustment for admixture (fig. 3A and C). In principle, these could reflect fluctuating selection, but that seems unlikely given the general lack of other evidence of selection. Rather, these covariances could reflect that our proxies for gene flow sources only capture part of the allele frequency change driven by gene flow. Indeed, the increase of EEF-like ancestry in the UK population is driven by subsequent migration(s) from populations similar to the UK but with a higher proportion of EEF-like ancestry, probably from mainland Europe. Therefore, modeling the increase of EEF-derived alleles with the ancestral EEF allele frequencies might not fully account for the impact of migration. More detailed modeling with admixture graphs and tree sequences could help better resolve the sources of gene flow in time series (e.g. Allentoft et al., 2022; Irving-Pease et al., 2022; Pearson and Durbin, 2023, although such inferences may currently not be fully robust, Maier et al., 2023).

We attribute residual variance after accounting for gene flow and temporal covariances to drift-like processes. Genetic drift from the compounded sampling of parents to form each generation in our geographic area will obviously contribute to this. However, as our focal geographic areas are not homogeneous populations, small changes in ancestry composition over time that are not captured by larger-scale admixture analyses might be captured as drift-like processes. Finally, as we take a fixed sample to reflect the allele frequencies in the sources of gene flow, change in the actual groups contributing to gene flow can also contribute to the signal of drift; e.g. if the allele frequencies in the source of EEF-like ancestry early in the time series differ from those in the source contributing EEF-like ancestry later, that will appear as drift-like change in our focal time series. Our drift-like change in allele frequency is small, corresponding to relatively large estimates of temporal effective population sizes (see section 4.3). However, further work is needed to separate the long-term effect of drift and the combined contribution of other drift-like processes to our estimates of allele frequency change.

Extensions of our approach to larger geographical areas would allow the contributions of local genetic drift and migration among regions to be more fully explored. Such analyses would also pose an interesting set of modeling challenges to measure evolutionary change across spatially spread populations experiencing both local migration and more long-range gene flow events.

Finally, while a large body of ancient DNA work has focused on humans, ancient DNA and museum datasets for a wide range of other organisms are also being generated (e.g. dogs, Bergström et al., 2020; horses, Orlando, 2020; sticklebacks, Kirch et al., 2021; chipmunks, Bi et al., 2019; Amaranthus, Kreiner et al., 2022). The spread of ancient DNA and museum DNA research as well as more widespread usage of genome-wide sequencing to temporally monitor contemporary natural populations will generate a rich set of resources of time series data. This offers the chance for comparative studies to decompose the contribution of different forces to genome-wide evolutionary change across systems, time scales, and ecological and selection regimes.

4. Materials and Methods

4.1. Calculating the covariance matrix

We bin our samples into a set of discrete time points and then calculate the allele frequency change at SNP $l$ between adjacent time points $t$ and $t+1$, $\Delta p_{t,l}$. We then calculate the empirical variance-covariance matrix of these allele frequency changes for all time points, averaged across SNPs. We denote the raw covariance matrix by $R$. We wish to quantify the expected contribution of admixture to this matrix, but in doing so we also have to correct for sampling noise in both the time series allele frequencies and the sources of admixture. The corrected covariance matrix is given by

$$C = R - S_s - A + S_A - D \tag{5}$$

$S_s$ is the expected matrix of biases from using sample frequencies in calculating the empirical covariance matrix (see eq. (6) below). Our admixture adjustment, $A$, is the expected admixture variance-covariance matrix (see eq. (2)), where proxy samples are used as references for the admixture sources. The matrix $S_A$ is the expected bias in the admixture matrix due to the sampling noise from using sample frequencies in our admixture correction (see eq. (7) below). Finally, $D$ is the expected drift/admixture interaction matrix (see appendix C).

Here we calculate the sampling biases in the specific case of pseudohaploid data, in line with the ancient DNA datasets considered in this paper (appendix A), using $n_i$ for the haploid sample size at time point $i$ and $\tilde{p}_{i,l}$ for the sample frequency at SNP $l$. The sampling noise from taking a small sample of individuals inflates the variance of allele frequency change, and the sample shared between adjacent time intervals creates a covariance:

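$$S_{s,\,i=j} = E\left[\frac{1}{n_{i,l}-1}\,\tilde{p}_{i,l}\left(1-\tilde{p}_{i,l}\right) + \frac{1}{n_{i+1,l}-1}\,\tilde{p}_{i+1,l}\left(1-\tilde{p}_{i+1,l}\right)\right], \qquad S_{s,\,j-i=1} = -E\left[\frac{1}{n_{i+1,l}-1}\,\tilde{p}_{i+1,l}\left(1-\tilde{p}_{i+1,l}\right)\right] \tag{6}$$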

with all other terms in the matrix set to zero (appendix A and Buffalo and Coop, 2019). These matrices are calculated as an average over all our SNPs. Second, sampling noise is also present in frequencies of the samples used as proxies for admixture, and so this biases the admixture expectation as the same reference samples are used for multiple time points (appendix B):

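$$S_{A,i,j} = E\left[\sum_r \Delta\alpha_{r,i}\,\Delta\alpha_{r,j}\,\frac{1}{n_{r,l}-1}\,\tilde{f}_{r,l}\left(1-\tilde{f}_{r,l}\right)\right] \tag{7}$$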

following eq. (B.3), with $\alpha_{r,i}$ the admixture proportion from reference population $r$ at time $i$, and $\tilde{f}_r$ the empirical allele frequency in reference population $r$.

Finally, the simple admixture covariance expectation is missing a term due to shared drift variances between time intervals. This can be estimated as shown in appendix C and requires the assumption that only one parental population is contributing to gene flow during each time interval. D is given by eq. (C.9) and is dependent on the estimated drift variance terms in parental populations and admixture proportions at each time step common between two time points.
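The first two ingredients of eq. (5) can be sketched in a few lines of plain Python. This is only an illustration, not the pipeline archived on Zenodo: the function names and the nested-list data layout (`freqs[t][l]`, `sizes[t][l]`) are hypothetical, covariances are assumed to be taken across SNPs after mean-centering, and the admixture terms $A$, $S_A$ and $D$ are left out.

```python
from statistics import fmean


def raw_covariance(freqs):
    """Raw matrix R: covariances of allele frequency changes across SNPs.

    freqs[t][l] is the sample allele frequency at time point t and SNP l.
    """
    n_snps = len(freqs[0])
    # Allele frequency change for each pair of adjacent time points.
    deltas = [
        [freqs[t + 1][l] - freqs[t][l] for l in range(n_snps)]
        for t in range(len(freqs) - 1)
    ]
    means = [fmean(d) for d in deltas]
    return [
        [
            fmean((di[l] - mi) * (dj[l] - mj) for l in range(n_snps))
            for dj, mj in zip(deltas, means)
        ]
        for di, mi in zip(deltas, means)
    ]


def sampling_bias(freqs, sizes):
    """Matrix S_s of eq. (6) for pseudohaploid samples.

    sizes[t][l] is the number of non-missing haploid genotypes (must be > 1).
    """
    # Average sampling variance per time point: p~ (1 - p~) / (n - 1).
    noise = [
        fmean(p * (1.0 - p) / (n - 1) for p, n in zip(freqs[t], sizes[t]))
        for t in range(len(freqs))
    ]
    n_int = len(freqs) - 1
    s = [[0.0] * n_int for _ in range(n_int)]
    for i in range(n_int):
        # Both end points of interval i inflate its variance.
        s[i][i] = noise[i] + noise[i + 1]
        if i + 1 < n_int:
            # Adjacent intervals share time point i + 1 (negative bias).
            s[i][i + 1] = s[i + 1][i] = -noise[i + 1]
    return s


def subtract(r, s):
    """Element-wise R - S_s, the first correction in eq. (5)."""
    return [[a - b for a, b in zip(rr, rs)] for rr, rs in zip(r, s)]
```

Subtracting `sampling_bias(...)` from `raw_covariance(...)` lowers the diagonal and raises the covariances between adjacent intervals, which is the direction of correction described in appendix A.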

4.2. Simulations

We used the Demes format to write interoperable demographic scenarios (Gower et al., 2022). This allowed us to run the same model with either msprime for neutral simulations (Baumdicker et al., 2022) or SLiM (v3.7) for simulations including selection (Haller & Messer, 2019). Results were recorded as tree sequences and analyzed in Python using tskit (Kelleher et al., 2018). All results are based on 100 replicates of each scenario. The simulation pipeline was built with snakemake (Mölder et al., 2021); it is available in the Zenodo archive https://doi.org/10.5281/zenodo.8093105, which also records the versions of all software used.

In the simple scenario of the main text, an ancestral population splits into the source populations 1,500 generations before present (BP). All populations are kept at a constant size of 10,000 diploid individuals. At 200 generations BP, our focal population, which receives the admixture pulses, is created from the first parental population (pop0). Pulses of admixture from pop1 happen at regular 20-generation intervals, starting at 150 and finishing at 10 generations BP. We sample 30 individuals 10 generations before and after each pulse in our focal population. For our admixture sources, we sample 30 individuals from each parental population at 200 generations BP for allele frequency computations. Samples are rendered pseudohaploid to mimic ancient DNA results (though no missing data were inserted). A census event is performed in the source populations when the admixed population is created to allow us to compute the admixture proportions of all descendants.
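The timing of this scenario can be laid out explicitly with a short sketch. The function names are hypothetical, and two points are assumptions rather than statements from the text: sampling "10 generations before and after pulses" is read as sampling midway between consecutive pulses, and the pulse size `alpha` is a free parameter because its value is not given in this section.

```python
def pulse_times(first=150, last=10, step=20):
    """Generations BP at which pop1 sends an admixture pulse."""
    return list(range(first, last - 1, -step))


def sampling_times(pulses, offset=10):
    """Sampling points `offset` generations before and after each pulse."""
    times = {t + offset for t in pulses} | {t - offset for t in pulses}
    return sorted(times, reverse=True)  # oldest first


def pop1_ancestry(alpha, n_pulses):
    """Expected pop1 ancestry in the focal population after each pulse.

    Each pulse replaces a fraction `alpha` (hypothetical value) of the
    focal population, which starts with no pop1 ancestry.
    """
    q, out = 0.0, []
    for _ in range(n_pulses):
        q = (1.0 - alpha) * q + alpha
        out.append(q)
    return out
```

Under this reading there are 8 pulses and 9 sampling points (160, 140, ..., 0 generations BP), so each sampled interval contains exactly one pulse.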

We simulated a chromosome 100 Mbp in length with a mutation rate of $1 \times 10^{-8}$ per bp per generation and a uniform recombination rate of $2 \times 10^{-8}$.

To simulate linked selection in SLiM (Haller & Messer, 2019), we considered three independent polygenic traits with alleles having a random effect size of ±0.01, evolving under a model of stabilizing selection around an initial optimum of 0 for each trait. The fitness landscape is a Gaussian function centered on the optimum with a variance, $V_s$, of 1. The optimum is gradually shifted from 0 to 3 standard deviations between 140 generations BP and the present, in the same way in all extant populations (by steps of shift/time). The ancestral population has a burn-in of 0.1N generations in SLiM and the complete ancestral history has been recapitated with pyslim (Ralph et al., 2023). Mutations under selection are not used in the downstream analyses and neutral mutations are added a posteriori with msprime. We note that these simulations are not intended to mimic a particular selection scenario, as the density of loci underlying different traits per chromosome is unknown. Rather, the parameters were chosen to generate results where both admixture and selection made comparable contributions for illustration purposes.
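The fitness landscape and the moving optimum described above can be written down as a small sketch. The actual simulations were run in SLiM, so this Python version is only illustrative: the function names are hypothetical, the Gaussian is assumed to have the usual form $\exp\left(-(z-\theta)^2 / (2V_s)\right)$, and `total_shift` is assumed to be expressed in trait units.

```python
import math


def gaussian_fitness(z, optimum, vs=1.0):
    """Fitness of trait value z under Gaussian stabilizing selection."""
    return math.exp(-((z - optimum) ** 2) / (2.0 * vs))


def optimum_at(gen_bp, total_shift, start_bp=140):
    """Trait optimum at `gen_bp` generations before present.

    The optimum stays at 0 until `start_bp`, then moves by equal steps of
    total_shift / start_bp per generation until it reaches `total_shift`
    at the present (gen_bp = 0).
    """
    if gen_bp >= start_bp:
        return 0.0
    return total_shift * (start_bp - gen_bp) / start_bp
```

For example, `optimum_at(70, 3.0)` is halfway through the shift, and an individual sitting exactly on the optimum always has fitness 1.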

Finally, to investigate the role of sampling intervals and recombination rates, we repeated the above Gaussian stabilizing selection simulations with varying recombination rates ($0.1 \times 10^{-8}$, $0.5 \times 10^{-8}$, $1 \times 10^{-8}$, $2 \times 10^{-8}$, $3 \times 10^{-8}$), approximately spanning the range of human recombination rates, and analyzed the output tree sequences at different time sampling intervals ([2, 5, 10, 15, 20]). We also simulated background selection in SLiM using a model similar to Buffalo and Coop (2020), with a number of deleterious mutations per haploid genome per generation of $U = 1$ and a negative selection coefficient of $s = 0.1$. The rest of the model and analysis pipeline is similar to the Gaussian stabilizing selection case above.

4.3. Ancient DNA analyses

We used two datasets from Patterson et al. (2022) and Papac et al. (2021) for ancient DNA time transects in the UK and the Bohemian region respectively. Data from those papers were downloaded from the indicated sources and merged with a set of parental population proxies and modern samples from the AADR v50.0 1240k dataset (Mallick et al., 2023 using data from Allentoft et al., 2015; Bergström et al., 2020; Brace et al., 2019; Consortium, 2015; Coutinho et al., 2020; Fu et al., 2016; Haak et al., 2015; Jones et al., 2015; Lamnidis et al., 2018; Lazaridis et al., 2016, 2017; Lipson et al., 2017; Mallick et al., 2016; Margaryan et al., 2020; Martiniano et al., 2016; Mathieson et al., 2015, 2018; Mittnik et al., 2018; Narasimhan et al., 2019; Olalde et al., 2018; Patterson et al., 2012; Raghavan et al., 2014; Rivollat et al., 2020; Scheib et al., 2019; Schroeder et al., 2019; Skoglund et al., 2015; Villalba-Mouco et al., 2019; Wang et al., 2019). Modern samples were used to provide a modern time point in the UK time transect. The data analysis snakemake pipeline can be found in the zenodo archive https://doi.org/10.5281/zenodo.8093107 and includes the version of all software used.

Individuals from each time period defined in the original analyses are pooled together to compute allele frequencies and the mean estimated age is taken as the time point date. For the UK dataset (Patterson et al., 2022), we merged the published data with AADR v50.0 (providing modern samples and parental population proxies), and with data from Fowler et al. (2022) to access 10 individuals missing from the other datasets. Only loci with more than 10 genotypes in each time point grouping and more than 5 genotypes in reference populations were kept, resulting in 474,554 SNPs kept over the initial 1,135,618. We used combined filters 0 and 1 from Table S5 of Patterson et al. (2022) as our quality and relevance filtering. This resulted in sample sizes of [37, 69, 26, 23, 273, 38, 62] for periods labeled [‘Neolithic’, ‘Chalcolithic/EBA’, ‘Middle Bronze Age’, ‘Late Bronze Age’, ‘Iron Age’, ‘Post Iron Age’, ‘Modern’] and mean non-missing genotypes across all SNPs of [25.5, 46.3, 19.6, 14.8, 208.2, 25.5, 59.9]. Mean sample dates for those periods are [5424, 4005, 3326, 2929, 2215, 1180, 0] years BP. Reference sample sizes are [18, 21, 18] for groups labeled [‘WHGA’, ‘Balkan_N’, ‘OldSteppe’] in the dataset interpreted as WHG-like, EEF-like, and Steppe-like. Those reference samples have mean non-missing genotypes across all SNPs of [7.4, 15.4, 12.4] respectively.

For the Papac et al. (2021) dataset, only loci with more than 2 genotypes in each time point grouping and more than 2 genotypes in reference samples were kept, resulting in 461,844 SNPs kept over the initial 1,150,639. Sample sizes in this dataset are [3, 5, 29, 14, 48, 59, 84] for periods labeled [‘Neolithic’, ‘Proto-Eneolithic’, ‘Early Eneolithic’, ‘Middle Eneolithic’, ‘Corded Ware’, ‘Bell Beaker’, ‘Unetice’] and mean non-missing genotypes across all SNPs of [3.0, 3.7, 23.4, 10.9, 33.1, 41.9, 55.1]. Mean sample dates for those periods are [5607, 5253, 4229, 3880, 3726, 3120, 3037] years BP. Reference sample sizes are [4, 17, 15] for groups labeled [‘WHG’, ‘Anatolia_Neolithic’, ‘Yamnaya’] in the dataset interpreted as WHG-like, EEF-like, and Steppe-like. Those reference samples have mean non-missing genotypes across all SNPs of [3.5, 13.1, 9.0] respectively.

We used admixture measures from both published papers produced by the qpAdm method, extracted from the supplementary Table S5 of Patterson et al. (2022) and Table S9 of Papac et al. (2021). In concordance with the literature on European human demographic history during the last 5000 years, we consider the simplest three-way admixture between populations genetically most similar to European early farmers (EEF-like, early migrants from Anatolia), Western hunter-gatherers (WHG-like) and individuals associated with the Steppe pastoralist Yamnaya culture (Steppe-like).

We computed confidence intervals around estimates by block bootstrap sampling of windows of 1000 SNPs along the genome. Statistics computed for each window were re-sampled $10^4$ times with replacement and a 95% confidence interval was computed by the pivot method as in Buffalo and Coop (2020). Statistics were computed through a weighted average to account for variability in the number of SNPs in each window (windows at the end of chromosomes often do not contain the required number of SNPs). When dealing with ratio statistics (like G), we computed the numerators and denominators separately and used the ratio of the weighted averages for the final values.
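A minimal sketch of this bootstrap for a ratio statistic is given below. It is not the archived implementation: the function names are hypothetical, windows are assumed to be summarized by one numerator, one denominator and one SNP count each, and the quantiles are picked by simple index truncation.

```python
import random


def ratio_of_weighted_means(num, den, weights):
    """Ratio of the weighted average numerator to the weighted average denominator."""
    wn = sum(w * x for w, x in zip(weights, num))
    wd = sum(w * x for w, x in zip(weights, den))
    return wn / wd  # the common factor 1 / sum(weights) cancels


def pivot_ci(num, den, weights, n_boot=10_000, level=0.95, seed=1):
    """Block bootstrap over windows with a pivot confidence interval.

    num[w], den[w]: per-window numerator and denominator of the statistic.
    weights[w]: number of SNPs in window w.
    Returns (estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(num)
    estimate = ratio_of_weighted_means(num, den, weights)
    boots = []
    for _ in range(n_boot):
        # Resample whole windows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        boots.append(
            ratio_of_weighted_means(
                [num[i] for i in idx],
                [den[i] for i in idx],
                [weights[i] for i in idx],
            )
        )
    boots.sort()
    tail = (1.0 - level) / 2.0
    q_lo = boots[int(tail * (n_boot - 1))]
    q_hi = boots[int((1.0 - tail) * (n_boot - 1))]
    # Pivot interval: reflect the bootstrap quantiles around the estimate.
    return estimate, 2.0 * estimate - q_hi, 2.0 * estimate - q_lo
```

Resampling numerators and denominators together, and only then forming the ratio, keeps short windows at chromosome ends from dominating the estimate.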

Each dataset was transformed from the eigenstrat to the sgkit format through a plink (Chang et al., 2015) conversion step. Sex chromosomes were removed from the datasets. To investigate the correlation of our statistics with recombination or background selection, we incorporated in the dataset recombination rates (Bhérer et al., 2017, sex-averaged version) and B-values (Murphy et al., 2021) for each SNP – by using the value of the window the SNP was in. We split all SNPs into five quantile bins and computed G and A proportions for each one, as well as the variance and covariances.

We can compute a simple estimate of the diploid effective population size, 2N, by equating the expected variance due to drift after t generations,

$$\mathrm{Var}(p_t) = \overline{p_0\left(1-p_0\right)}\left[1-\left(1-\frac{1}{2N}\right)^{t}\right] \tag{8}$$

with the residual variances for each time period in the studied datasets (having adjusted for the variance due to admixture and sampling). Using a generation time of 30 years, and using the number of generations between the mid-points of each time interval for the UK dataset we obtain 2N = [ 351472, 3204693, 2827982, 2041971, 361192, 203114 ] for each time interval. For the Bohemia dataset time intervals, we get 2N = [ 768209, 2576065, 1392381, 1181146, 1275489, 1167632 ]. We note that these effective population size estimates are only approximate as they do not account for the more continuous distribution of sampling times present in the data.
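Solving eq. (8) for $2N$ gives $2N = 1 / \left[1 - \left(1 - V/H\right)^{1/t}\right]$, where $V$ is the residual variance and $H = \overline{p_0(1-p_0)}$. The sketch below implements this inversion; the function names are hypothetical, and the number of generations is assumed to be the difference between mean sample dates divided by the generation time.

```python
def generations_between(date_old_bp, date_new_bp, generation_time=30.0):
    """Number of generations separating two mean sample dates (years BP)."""
    return (date_old_bp - date_new_bp) / generation_time


def expected_drift_variance(two_n, mean_het, generations):
    """Right-hand side of eq. (8); mean_het is the average of p0 (1 - p0)."""
    return mean_het * (1.0 - (1.0 - 1.0 / two_n) ** generations)


def two_n_from_drift(var_drift, mean_het, generations):
    """Invert eq. (8) to obtain the diploid effective size 2N."""
    remaining = 1.0 - var_drift / mean_het  # equals (1 - 1/2N)^t
    return 1.0 / (1.0 - remaining ** (1.0 / generations))
```

For the first UK interval, `generations_between(5424, 4005)` gives 47.3 generations with the 30-year generation time used in the text.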

Supplementary Material

Supplement 1
media-1.pdf (524.2KB, pdf)

Significance statement.

The relative contribution of random genetic drift and natural selection to the change in allele frequencies through time is a long-standing question in evolutionary biology. We show through theory and simulation how genomic time series, such as ancient DNA datasets, can be used to decompose the genome-wide contributions of selection, gene flow, and genetic drift to allele frequency change. We apply these methods to two time series from ancient Europeans and show that gene flow accounts for most allele frequency change over the last few thousand years, with genetic drift and not selection making up much of the rest of the contribution to genome-wide evolutionary change.

Acknowledgments

We thank members of the Coop lab, Vince Buffalo, and Joshua Schraiber for helpful discussions. We also thank the editor and reviewers for helpful comments during the review process. AS and GC were supported by the National Institute of General Medical Sciences of the National Institutes of Health (NIH R35 GM136290 to GC).

Appendix

A. Pseudohaploid sampling

Following Buffalo and Coop (2020), the observed variance in allele frequency at time t can be decomposed with the law of total variance:

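$$\mathrm{Var}(\tilde{p}_t) = E\left[\mathrm{Var}(\tilde{p}_t \mid p_t)\right] + \mathrm{Var}\left(E\left[\tilde{p}_t \mid p_t\right]\right) \tag{A.1}$$
$$= E\left[\mathrm{Var}(\tilde{p}_t \mid p_t)\right] + \mathrm{Var}(p_t) \tag{A.2}$$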

This gives us a way to correct the observed variance for sampling noise.

Pseudohaploid representation is common in ancient DNA data to avoid errors when calling heterozygotes. Most often, one read (and therefore allele) is selected randomly among the mapped reads for each individual at a given position. Pseudohaploid calling can be modeled as binomial sampling. We consider sampling $n_t$ individuals in a population with frequency of the alternate allele $p_t$ at time $t$. Pseudohaploid calling is equivalent to each individual drawing one allele from the pool of alleles. We define $X_t \sim \mathrm{Binom}(n_t, p_t)$ as the number of alternate alleles sampled and $\tilde{p}_t = X_t / n_t$. Then the sampling noise is

$$\mathrm{Var}(\tilde{p}_t \mid p_t) = \frac{1}{n_t^2}\,\mathrm{Var}(X_t) \tag{A.3}$$
$$= \frac{1}{n_t}\,p_t\left(1-p_t\right) \tag{A.4}$$

Correction of the variance $\mathrm{Var}(\Delta\tilde{p}_t)$ is carried out by subtracting both sampling variances, $\mathrm{Var}(\tilde{p}_t \mid p_t)$ and $\mathrm{Var}(\tilde{p}_{t+1} \mid p_{t+1})$. As in Buffalo and Coop (2020), the covariances between two overlapping time intervals, $\mathrm{Cov}(\Delta\tilde{p}_t, \Delta\tilde{p}_{t+1})$, are negatively biased by the shared sampling noise in $p_{t+1}$, and this needs to be corrected by adding the shared time point sampling variance $\mathrm{Var}(\tilde{p}_{t+1} \mid p_{t+1})$ back in. For these corrections, we need an unbiased estimator of the half heterozygosity. We define the sample heterozygosity as $\tilde{H}/2 = \tilde{p}\left(1-\tilde{p}\right)$, then

$$E\left[\tilde{H}/2\right] = p\left(1-p\right)\frac{n-1}{n} \tag{A.5}$$

Therefore

$$\mathrm{Var}(\tilde{p}_t \mid p_t) = \frac{1}{n_t-1}\,\tilde{p}_t\left(1-\tilde{p}_t\right) \tag{A.6}$$

Similarly, if needed, we can compute the diploid sampling bias estimator:

$$\mathrm{Var}(\tilde{p}_t \mid p_t) = \frac{1}{2n_t-1}\,\tilde{p}_t\left(1-\tilde{p}_t\right) \tag{A.7}$$
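The bias in eq. (A.5), and the fact that the $1/(n_t-1)$ estimator removes it, can be checked exactly by summing over the binomial distribution. The sketch below does this with hypothetical function names; it is a numerical check of the algebra above, not part of the analysis pipeline.

```python
from math import comb


def expected_sample_half_het(n, p):
    """Exact E[p~ (1 - p~)] when X ~ Binom(n, p) and p~ = X / n."""
    return sum(
        comb(n, x) * p**x * (1.0 - p) ** (n - x) * (x / n) * (1.0 - x / n)
        for x in range(n + 1)
    )


def sampling_variance_estimate(p_tilde, n, diploid=False):
    """Estimated sampling variance of p~.

    Pseudohaploid data use n sampled alleles; diploid data use 2n.
    """
    n_alleles = 2 * n if diploid else n
    return p_tilde * (1.0 - p_tilde) / (n_alleles - 1)
```

With `n = 10` and `p = 0.3`, `expected_sample_half_het` returns $p(1-p)(n-1)/n = 0.189$, and averaging `sampling_variance_estimate` over the same binomial distribution recovers the true sampling variance $p(1-p)/n$.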

B. Pseudohaploid sampling noise in reference populations

Let’s consider the allele frequency of reference population r

$$\tilde{f}_r = f_r + \delta f_r \tag{B.1}$$

The observed allele frequency is equal to the true allele frequency ($f_r$) plus sampling noise.

Therefore

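$$\Delta p_{A,i} = \sum_{r=1}^{R} \Delta\bar{\alpha}_{i,r}\left(\tilde{f}_r - \delta f_r\right) = \Delta\tilde{p}_{A,i} - \Delta\delta f_{r,i} \tag{B.2}$$

Decomposing $\mathrm{Cov}(\Delta p_{A,i}, \Delta p_{A,j})$ as $E\left[\Delta p_{A,i}\,\Delta p_{A,j}\right] - E\left[\Delta p_{A,i}\right]E\left[\Delta p_{A,j}\right]$ and remembering that $E\left[\Delta\delta f_{r,i}\right] = 0$, we end up with $\mathrm{Cov}(\Delta p_{A,i}, \Delta p_{A,j}) = \mathrm{Cov}(\Delta\tilde{p}_{A,i}, \Delta\tilde{p}_{A,j}) + E\left[\Delta\delta f_{r,i}\,\Delta\delta f_{r,j}\right]$

Therefore

$$\mathrm{Cov}(\Delta p_{A,i}, \Delta p_{A,j}) = \mathrm{Cov}(\Delta\tilde{p}_{A,i}, \Delta\tilde{p}_{A,j}) + \sum_r \Delta\alpha_{r,i}\,\Delta\alpha_{r,j}\,\mathrm{Var}(\delta f_r) \tag{B.3}$$

with $\mathrm{Var}(\delta f_r)$, the variance of the sampling noise in the pseudohaploid case, equal to $\frac{1}{n_r-1}\,\tilde{f}_r\left(1-\tilde{f}_r\right)$ (similar to appendix A).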
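The sampling-noise term for the reference populations can be sketched as follows. The function name and the nested-list layout are hypothetical; because the changes in admixture proportion do not depend on the SNP, the average over SNPs is applied to the per-reference sampling variance only.

```python
from statistics import fmean


def admixture_sampling_bias(delta_alpha, ref_freqs, ref_sizes):
    """Matrix of sum_r d_alpha[r][i] * d_alpha[r][j] * Var(delta f_r).

    delta_alpha[r][i]: change in admixture proportion from reference r
        during time interval i.
    ref_freqs[r][l], ref_sizes[r][l]: sample frequency and number of
        non-missing pseudohaploid genotypes in reference r at SNP l.
    """
    # Average pseudohaploid sampling variance of each reference sample.
    noise = [
        fmean(f * (1.0 - f) / (n - 1) for f, n in zip(freqs, sizes))
        for freqs, sizes in zip(ref_freqs, ref_sizes)
    ]
    n_int = len(delta_alpha[0])
    return [
        [
            sum(da[i] * da[j] * v for da, v in zip(delta_alpha, noise))
            for j in range(n_int)
        ]
        for i in range(n_int)
    ]
```

Small reference samples (for example, a handful of hunter-gatherer genomes) make the corresponding entries of `noise` large, so intervals with big ancestry changes from that source receive the largest correction.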

C. Accounting for drift in the admixture correction

We consider a simple model where only the focal admixed population experiences drift and the parental populations from which gene flow occurs do not. This is in line with our use of a single proxy sample for the sources of admixture. Under this model, drift happening in any of our contributing populations is absorbed into the drift observed in the focal population. Drift that occurs in one time interval can partially be erased by admixture in subsequent time intervals. This interaction between drift and gene flow generates additional covariance that needs to be accounted for.

Let $f_r$ be the frequencies of the $R$ parental populations for a particular SNP. At time 0, an admixed population of frequency $p_0$ is established with ancestry proportions $q_{r,0}$ from any of the $R$ populations. Subsequently, between time points $t$ and $t+1$, this admixed population can receive a migration pulse from any of the $R$ populations, where a proportion $\alpha_{r,t}$ of individuals in the focal population is replaced by migrants from population $r$. Between time points $t$ and $t+1$, drift also changes the frequency by $\Delta d_t$. $p_t$ is the allele frequency at time $t$ of our admixed population, and $q_{r,t}$ are its ancestry proportions at time $t$.

We define the proportion of individuals replaced by admixture as:

$$A_t = \sum_r \alpha_{r,t}, \quad \text{with } 0 \le A_t \le 1 \tag{C.1}$$

The change in allele frequency can then be written as:

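$$p_{t+1} = \left(1-A_t\right)p_t + \sum_r \alpha_{r,t} f_r + \Delta d_t \tag{C.2}$$
$$\Delta p_t = p_{t+1} - p_t \tag{C.3}$$
$$= -A_t p_t + \sum_r \alpha_{r,t} f_r + \Delta d_t \tag{C.4}$$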

We can expand this out in terms of the change in allele frequency due to admixture and drift in the preceding time periods:

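$$\Delta p_t = \sum_r \left(\alpha_{r,t} - A_t q_{r,t}\right) f_r + \Delta d_t - A_t\left[\Delta d_{t-1} + \sum_{k=0}^{t-2} \prod_{l=k+1}^{t-1}\left(1-A_l\right)\Delta d_k\right] \tag{C.5}$$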

We can express the ancestry fraction from source r at time t in terms of the change due to admixture in previous time periods:

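$$q_{r,t} = \alpha_{r,t-1} + \sum_{k=0}^{t-2} \alpha_{r,k} \prod_{l=k+1}^{t-1}\left(1-A_l\right) \tag{C.6}$$
$$q_{r,t+1} = \alpha_{r,t} + \left(1-A_t\right) q_{r,t}, \qquad \Delta q_{r,t} = \alpha_{r,t} - A_t q_{r,t} \tag{C.7}$$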

As a constraint, if at each time step there is only one $\alpha_{r,t} > 0$, then $A_t = \alpha_{r,t}$ and eq. (C.7) simplifies to:

$$\Delta q_{r,t} = \alpha_{r,t} - \alpha_{r,t}\,q_{r,t}, \qquad \alpha_{r,t} = \frac{\Delta q_{r,t}}{1-q_{r,t}} \tag{C.8}$$

allowing us to compute the admixture fraction from the ancestry proportions.
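The recursion in eq. (C.7) and its inversion in eq. (C.8) can be illustrated with a short round trip: simulate ancestry proportions from known pulses, then recover the pulses from the proportions alone. The function names are hypothetical, and the recovery step assumes, as in the text, that exactly one source contributes in each interval (so it is the only source whose ancestry increases).

```python
def ancestry_trajectory(q0, alphas):
    """Apply q_{r,t+1} = alpha_{r,t} + (1 - A_t) q_{r,t} over all intervals.

    q0[r]: initial ancestry proportions; alphas[t][r]: pulse proportions.
    Returns the list of ancestry vectors at every time point.
    """
    traj = [list(q0)]
    for alpha_t in alphas:
        a_total = sum(alpha_t)  # A_t in eq. (C.1)
        traj.append([a + (1.0 - a_total) * q for a, q in zip(alpha_t, traj[-1])])
    return traj


def pulses_from_ancestry(traj):
    """Recover (source index, alpha) per interval using eq. (C.8)."""
    pulses = []
    for q_now, q_next in zip(traj, traj[1:]):
        dq = [b - a for a, b in zip(q_now, q_next)]
        r = max(range(len(dq)), key=dq.__getitem__)  # the contributing source
        if dq[r] <= 0.0:
            pulses.append((None, 0.0))  # no gene flow in this interval
        else:
            pulses.append((r, dq[r] / (1.0 - q_now[r])))
    return pulses
```

A pulse of 0.2 into a population that already carries 30% of that ancestry only raises the proportion by 0.14, which is why the division by $1 - q_{r,t}$ is needed.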

When computing the covariance between two time intervals $i$ and $j$ ($i < j$), a composite drift term that depends on the admixture pulses is shared between $\Delta p_i$ and $\Delta p_j$: the drift from every time interval between 0 and $i$ contributes to both. This expected admixture-drift term $D$ can be computed as:

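$$D(\Delta p_i, \Delta p_j)_{i<j} = \sum_{k=0}^{i} \left[\left(-A_i\right)^{\mathbf{1}_{>0}(i-k)} \prod_{l=k+1}^{i-1}\left(1-A_l\right)^{\mathbf{1}_{>0}(i-k-1)}\right] \times \left[-A_j \prod_{l=k+1}^{j-1}\left(1-A_l\right)^{\mathbf{1}_{>0}(j-k-1)}\right] \mathrm{Var}(\Delta d_k) \tag{C.9}$$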

with

$$\mathbf{1}_{>0}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases} \tag{C.10}$$

D can then be subtracted from the empirical covariance to remove this effect.

To compute $D$, the drift variances of the individual time intervals, $\mathrm{Var}(\Delta d_k)$, need to be estimated. We can use the fact that the variance at each time interval can be decomposed as a linear combination of drift variances to solve for them. Solving is only possible under the assumption that a single parental population contributes to gene flow at a given time, so that the values of the individual $\alpha_r$ can be estimated (eq. (C.8)). The system for $0 \le i \le t$ is of the form:

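$$\mathrm{Var}(\Delta p_i) = \mathrm{Var}\left(\sum_r \left(\alpha_{r,i} - A_i q_{r,i}\right) f_r\right) + \mathrm{Var}(\Delta d_i) + A_i^2\left[\mathrm{Var}(\Delta d_{i-1}) + \sum_{k=0}^{i-2} \prod_{l=k+1}^{i-1}\left(1-A_l\right)^2 \mathrm{Var}(\Delta d_k)\right] \tag{C.11}$$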
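Because the variance of interval $i$ only involves drift from intervals $0$ to $i$, the system is triangular and can be solved by forward substitution. The sketch below is one way to do this, with hypothetical function names. It assumes that `resid_var[i]` is $\mathrm{Var}(\Delta p_i)$ after removing the admixture variance term and the sampling noise, and it takes the coefficient of $\Delta d_k$ in $\Delta p_i$ from the expansion in eq. (C.5): 1 for $k = i$, and $-A_i \prod_{l=k+1}^{i-1}(1-A_l)$ for $k < i$.

```python
from math import prod


def drift_coefficient(i, k, A):
    """Coefficient of the drift term of interval k in the change of interval i."""
    if k > i:
        return 0.0
    if k == i:
        return 1.0
    # Earlier drift is partially erased by later pulses, then enters with -A_i.
    return -A[i] * prod(1.0 - A[l] for l in range(k + 1, i))


def solve_drift_variances(resid_var, A):
    """Solve the triangular system of eq. (C.11) by forward substitution."""
    var_d = []
    for i, v in enumerate(resid_var):
        carried = sum(drift_coefficient(i, k, A) ** 2 * var_d[k] for k in range(i))
        var_d.append(v - carried)
    return var_d


def drift_admixture_term(i, j, A, var_d):
    """Shared drift covariance between intervals i < j."""
    if not i < j:
        raise ValueError("expected i < j")
    return sum(
        drift_coefficient(i, k, A) * drift_coefficient(j, k, A) * var_d[k]
        for k in range(i + 1)
    )
```

With noisy empirical variances the subtraction in `solve_drift_variances` can produce small negative values; how to handle that case is a design choice this sketch leaves open.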

Data availability

No new data were produced for this work. Analysis pipelines are available at doi: 10.5281/zenodo.8093105 for simulations and at doi: 10.5281/zenodo.8093107 for the ancient DNA data. Those analyses rely on a custom helper Python module available at doi: 10.5281/zenodo.8093101.

References

  1. Allentoft M. E., Sikora M., Refoyo-Martínez A., Irving-Pease E. K., Fischer A., Barrie W., Ingason A., Stenderup J., Sjögren K.-G., Pearson A., Mota B. S. d., Paulsson B. S., Halgren A., Macleod R., Jørkov M. L. S., Demeter F., Novosolov M., Sørensen L., Nielsen P. O., … Willerslev E. (2022). Population genomics of stone age eurasia. bioRxiv. 10.1101/2022.05.04.490594 [DOI] [Google Scholar]
  2. Allentoft M. E., Sikora M., Sjögren K.-G., Rasmussen S., Rasmussen M., Stenderup J., Damgaard P. B., Schroeder H., Ahlström T., Vinner L., Malaspinas A.-S., Margaryan A., Higham T., Chivall D., Lynnerup N., Harvig L., Baron J., Casa P. D., Dąbrowski P., … Willerslev E. (2015). Population genomics of bronze age eurasia. Nature, 522(7555), 167–172. 10.1038/nature14507 [DOI] [PubMed] [Google Scholar]
  3. Barton N. H. (2000). Genetic hitchhiking. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 355(1403), 1553–1562. 10.1098/rstb.2000.0716 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Baumdicker F., Bisschop G., Goldstein D., Gower G., Ragsdale A. P., Tsambos G., Zhu S., Eldon B., Ellerman E. C., Galloway J. G., Gladstein A. L., Gorjanc G., Guo B., Jeffery B., Kretzschumar W. W., Lohse K., Matschiner M., Nelson D., Pope N. S., … Kelleher J. (2022). Efficient ancestry and mutation simulation with msprime 1.0. Genetics, 220(3), iyab229. 10.1093/genetics/iyab229 [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Berg J. J., Harpak A., Sinnott-Armstrong N., Joergensen A. M., Mostafavi H., Field Y., Boyle E. A., Zhang X., Racimo F., Pritchard J. K., & Coop G. (2019). Reduced signal for polygenic adaptation of height in UK biobank (Nordborg M., McCarthy M. I., Nordborg M., Barton N. H., & Hermisson J., Eds.). eLife, 8, e39725. 10.7554/eLife.39725 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Bergland A. O., Behrman E. L., O’Brien K. R., Schmidt P. S., & Petrov D. A. (2014). Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in drosophila. PLOS Genetics, 10(11), e1004775. 10.1371/journal.pgen.1004775 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Bergström A., Frantz L., Schmidt R., Ersmark E., Lebrasseur O., Girdland-Flink L., Lin A. T., Storå J., Sjögren K.-G., Anthony D., Antipina E., Amiri S., Bar-Oz G., Bazaliiskii V. I., Bulatović J., Brown D., Carmagnini A., Davy T., Fedorov S., … Skoglund P. (2020). Origins and genetic legacy of prehistoric dogs. Science, 370(6516), 557–564. 10.1126/science.aba9572 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Bertram J. (2021). Allele frequency divergence reveals ubiquitous influence of positive selection in drosophila. PLOS Genetics, 17 (9), e1009833. 10.1371/journal.pgen.1009833 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bhérer C., Campbell C. L., & Auton A. (2017). Refined genetic maps reveal sexual dimorphism in human meiotic recombination at multiple scales. Nature Communications, 8(1), 14994. 10.1038/ncomms14994 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bi K., Linderoth T., Singhal S., Vanderpool D., Patton J. L., Nielsen R., Moritz C., & Good J. M. (2019). Temporal genomic contrasts reveal rapid evolutionary responses in an alpine mammal during recent climate change. PLoS Genetics, 15(5), e1008119. 10.1371/journal.pgen.1008119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Brace S., Diekmann Y., Booth T. J., van Dorp L., Faltyskova Z., Rohland N., Mallick S., Olalde I., Ferry M., Michel M., Oppenheimer J., Broomandkhoshbacht N., Stewardson K., Martiniano R., Walsh S., Kayser M., Charlton S., Hellenthal G., Armit I., … Barnes I. (2019). Ancient genomes indicate population replacement in early neolithic britain. Nature Ecology & Evolution, 3(5), 765–771. 10.1038/s41559-019-0871-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Brennan R. S., deMayo J. A., Dam H. G., Finiguerra M., Baumann H., Buffalo V., & Pespeni M. H. (2022). Experimental evolution reveals the synergistic genomic mechanisms of adaptation to ocean warming and acidification in a marine copepod. Proceedings of the National Academy of Sciences, 119(38), e2201521119. 10.1073/pnas.2201521119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Buffalo V. (2021). Quantifying the relationship between genetic diversity and population size suggests natural selection cannot explain lewontin’s paradox. eLife, 10, e67509. 10.7554/eLife.67509 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Buffalo V., & Coop G. (2019). The linked selection signature of rapid adaptation in temporal genomic data. Genetics, 213, 1007–1045. 10.1534/genetics.119.302581 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Buffalo V., & Coop G. (2020). Estimating the genome-wide contribution of selection to temporal allele frequency change. Proceedings of the National Academy of Sciences, 117 (34), 20672–20680. 10.1073/pnas.1919039117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Cai J. J., Macpherson J. M., Sella G., & Petrov D. A. (2009). Pervasive hitchhiking at coding and regulatory sites in humans. PLoS Genetics, 5(1), e1000336. 10.1371/journal.pgen.1000336 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Chang C. C., Chow C. C., Tellier L. C., Vattikuti S., Purcell S. M., & Lee J. J. (2015). Second-generation PLINK: Rising to the challenge of larger and richer datasets. GigaScience, 4(1), s13742–015–0047–8. 10.1186/s13742-015-0047-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Charlesworth B., Morgan M. T., & Charlesworth D. (1993). The effect of deleterious mutations on neutral molecular variation. 134, 1289–1303. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Consortium, 1. G. P. (2015). A global reference for human genetic variation. Nature, 526(7571), 68–74. 10.1038/nature15393 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Coop G. (2016). Does linked selection explain the narrow range of genetic diversity across species? bioRxiv. 10.1101/042598 [DOI] [Google Scholar]
  21. Coutinho A., Günther T., Munters A. R., Svensson E. M., Götherström A., Storå J., Malmström H., & Jakobsson M. (2020). The neolithic pitted ware culture foragers were culturally but not genetically influenced by the battle axe culture herders. American Journal of Physical Anthropology, 172(4), 638–649. 10.1002/ajpa.24079 [DOI] [PubMed] [Google Scholar]
  22. Enard D., Messer P. W., & Petrov D. A. (2014). Genome-wide signals of positive selection in human evolution. Genome Research, 24(6), 885–895. 10.1101/gr.164822.113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Field Y., Boyle E. A., Telis N., Gao Z., Gaulton K. J., Golan D., Yengo L., Rocheleau G., Froguel P., McCarthy M. I., & Pritchard J. K. (2016). Detection of human adaptation during the past 2000 years. Science, 354(6313), 760–764. 10.1126/science.aag0776 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Fowler C., Olalde I., Cummings V., Armit I., Büster L., Cuthbert S., Rohland N., Cheronet O., Pinhasi R., & Reich D. (2022). A high-resolution picture of kinship practices in an early neolithic tomb. Nature, 601(7894), 584–587. 10.1038/s41586-021-04241-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Fu Q., Posth C., Hajdinjak M., Petr M., Mallick S., Fernandes D., Furtwängler A., Haak W., Meyer M., Mittnik A., Nickel B., Peltzer A., Rohland N., Slon V., Talamo S., Lazaridis I., Lipson M., Mathieson I., Schiffels S., … Reich D. (2016). The genetic history of ice age europe. Nature, 534(7606), 200–205. 10.1038/nature17993 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Gillespie J. H. (1984). The status of the neutral theory: The neutral theory of molecular evolution. motoo kimura. cambridge university press, new york, 1983. xvi, 367 pp., illus. $69.50. Science, 224(4650), 732–733. 10.1126/science.224.4650.732 [DOI] [PubMed] [Google Scholar]
  27. Gower G., Ragsdale A. P., Bisschop G., Gutenkunst R. N., Hartfield M., Noskova E., Schiffels S., Struck T. J., Kelleher J., & Thornton K. R. (2022). Demes: A standard format for demographic models. Genetics, 222(3), iyac131. 10.1093/genetics/iyac131 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Haak W., Lazaridis I., Patterson N., Rohland N., Mallick S., Llamas B., Brandt G., Nordenfelt S., Harney E., Stewardson K., Fu Q., Mittnik A., Bánffy E., Economou C., Francken M., Friederich S., Pena R. G., Hallgren F., Khartanovich V., … Reich D. (2015). Massive migration from the steppe was a source for indo-european languages in europe. Nature, 522(7555), 207–211. 10.1038/nature14317 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Haller B. C., & Messer P. W. (2019). SLiM 3: Forward genetic simulations beyond the wright–fisher model (Hernandez R., Ed.). Molecular Biology and Evolution, 36(3), 632–637. 10.1093/molbev/msy228 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. He Z., Dai X., Lyu W., Beaumont M., & Yu F. (2023). Estimating temporally variable selection intensity from ancient DNA data. Molecular Biology and Evolution, msad008. 10.1093/molbev/msad008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Hernandez R. D., Kelley J. L., Elyashiv E., Melton S. C., Auton A., McVean G., 1000 Genomes Project, Sella G., & Przeworski M. (2011). Classic selective sweeps were rare in recent human evolution. Science, 331(6019), 920–924. 10.1126/science.1198878 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Irving-Pease E. K., Refoyo-Martínez A., Ingason A., Pearson A., Fischer A., Barrie W., Sjögren K.-G., Halgren A. S., Macleod R., Demeter F., Henriksen R. A., Vimala T., McColl H., Vaughn A., Speidel L., Stern A. J., Scorrano G., Ramsøe A., Schork A. J., … Willerslev E. (2022). The selection landscape and genetic legacy of ancient Eurasians. bioRxiv. 10.1101/2022.09.22.509027 [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Jensen J. D., Payseur B. A., Stephan W., Aquadro C. F., Lynch M., Charlesworth D., & Charlesworth B. (2019). The importance of the neutral theory in 1968 and 50 years on: A response to Kern and Hahn 2018. Evolution, 73(1), 111–114. 10.1111/evo.13650 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Jones E. R., Gonzalez-Fortes G., Connell S., Siska V., Eriksson A., Martiniano R., McLaughlin R. L., Gallego Llorente M., Cassidy L. M., Gamba C., Meshveliani T., Bar-Yosef O., Müller W., Belfer-Cohen A., Matskevich Z., Jakeli N., Higham T. F. G., Currat M., Lordkipanidze D., … Bradley D. G. (2015). Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nature Communications, 6(1), 8912. 10.1038/ncomms9912 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Ju D., & Mathieson I. (2021). The evolution of skin pigmentation-associated variation in West Eurasia. Proceedings of the National Academy of Sciences, 118(1), e2009227118. 10.1073/pnas.2009227118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Kaplan N. L., Hudson R. R., & Langley C. H. (1989). The “hitchhiking effect” revisited. Genetics, 123(4), 887–899. 10.1093/genetics/123.4.887 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Kelleher J., Thornton K. R., Ashander J., & Ralph P. L. (2018). Efficient pedigree recording for fast population genetics simulation. PLOS Computational Biology, 14(11), e1006581. 10.1371/journal.pcbi.1006581 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Kelly J. K. (2022). The genomic scale of fluctuating selection in a natural plant population. Evolution Letters, 6(6), 506–521. 10.1002/evl3.308 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Kern A. D., & Hahn M. W. (2018). The neutral theory in light of natural selection (Kumar S., Ed.). Molecular Biology and Evolution, 35(6), 1366–1371. 10.1093/molbev/msy092 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Kimura M. (1968). Evolutionary rate at the molecular level. Nature, 217, 624–626. 10.1038/217624a0 [DOI] [PubMed] [Google Scholar]
  41. Kirch M., Romundset A., Gilbert M. T. P., Jones F. C., & Foote A. D. (2021). Ancient and modern stickleback genomes reveal the demographic constraints on adaptation. Current Biology, 31(9), 2027–2036.e8. 10.1016/j.cub.2021.02.027 [DOI] [PubMed] [Google Scholar]
  42. Kreiner J. M., Latorre S. M., Burbano H. A., Stinchcombe J. R., Otto S. P., Weigel D., & Wright S. I. (2022). Rapid weed adaptation and range expansion in response to agriculture over the past two centuries. Science, 378(6624), 1079–1085. 10.1126/science.abo7293 [DOI] [PubMed] [Google Scholar]
  43. Kreitman M. (1996). The neutral theory is dead. Long live the neutral theory. BioEssays, 18(8), 678–683. 10.1002/bies.950180812 [DOI] [PubMed] [Google Scholar]
  44. Lamnidis T. C., Majander K., Jeong C., Salmela E., Wessman A., Moiseyev V., Khartanovich V., Balanovsky O., Ongyerth M., Weihmann A., Sajantila A., Kelso J., Pääbo S., Onkamo P., Haak W., Krause J., & Schiffels S. (2018). Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe. Nature Communications, 9(1), 5018. 10.1038/s41467-018-07483-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Lazaridis I., Mittnik A., Patterson N., Mallick S., Rohland N., Pfrengle S., Furtwängler A., Peltzer A., Posth C., Vasilakis A., McGeorge P. J. P., Konsolaki-Yannopoulou E., Korres G., Martlew H., Michalodimitrakis M., Özsait M., Özsait N., Papathanasiou A., Richards M., … Stamatoyannopoulos G. (2017). Genetic origins of the Minoans and Mycenaeans. Nature, 548(7666), 214–218. 10.1038/nature23310 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Lazaridis I., Nadel D., Rollefson G., Merrett D. C., Rohland N., Mallick S., Fernandes D., Novak M., Gamarra B., Sirak K., Connell S., Stewardson K., Harney E., Fu Q., Gonzalez-Fortes G., Jones E. R., Roodenberg S. A., Lengyel G., Bocquentin F., … Reich D. (2016). Genomic insights into the origin of farming in the ancient Near East. Nature, 536(7617), 419–424. 10.1038/nature19310 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Le M. K., Smith O. S., Akbari A., Harpak A., Reich D., & Narasimhan V. M. (2022). 1,000 ancient genomes uncover 10,000 years of natural selection in Europe. 10.1101/2022.08.24.505188 [DOI] [Google Scholar]
  48. Lipson M., Szécsényi-Nagy A., Mallick S., Pósa A., Stégmár B., Keerl V., Rohland N., Stewardson K., Ferry M., Michel M., Oppenheimer J., Broomandkhoshbacht N., Harney E., Nordenfelt S., Llamas B., Gusztáv Mende B., Köhler K., Oross K., Bondár M., … Reich D. (2017). Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature, 551(7680), 368–372. 10.1038/nature24476 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Machado H. E., Bergland A. O., Taylor R., Tilk S., Behrman E., Dyer K., Fabian D. K., Flatt T., González J., Karasov T. L., Kim B., Kozeretska I., Lazzaro B. P., Merritt T. J., Pool J. E., O’Brien K., Rajpurohit S., Roy P. R., Schaeffer S. W., … Petrov D. A. (2021). Broad geographic sampling reveals the shared basis and environmental correlates of seasonal adaptation in Drosophila (Nordborg M., & Wittkopp P. J., Eds.). eLife, 10, e67577. 10.7554/eLife.67577 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Maier R., Flegontov P., Flegontova O., Isildak U., Changmai P., & Reich D. (2023). On the limits of fitting complex models of population history to f-statistics (Nordborg M., Ed.). eLife, 12, e85492. 10.7554/eLife.85492 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Mallick S., Li H., Lipson M., Mathieson I., Gymrek M., Racimo F., Zhao M., Chennagiri N., Nordenfelt S., Tandon A., Skoglund P., Lazaridis I., Sankararaman S., Fu Q., Rohland N., Renaud G., Erlich Y., Willems T., Gallo C., … Reich D. (2016). The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature, 538(7624), 201–206. 10.1038/nature18964 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Mallick S., Micco A., Mah M., Ringbauer H., Lazaridis I., Olalde I., Patterson N. J., & Reich D. E. (2023). The Allen Ancient DNA Resource (AADR): A curated compendium of ancient human genomes. 10.1101/2023.04.06.535797 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Margaryan A., Lawson D. J., Sikora M., Racimo F., Rasmussen S., Moltke I., Cassidy L. M., Jørsboe E., Ingason A., Pedersen M. W., Korneliussen T., Wilhelmson H., Buś M. M., de Barros Damgaard P., Martiniano R., Renaud G., Bhérer C., Moreno-Mayar J. V., Fotakis A. K., … Willerslev E. (2020). Population genomics of the Viking world. Nature, 585(7825), 390–396. 10.1038/s41586-020-2688-8 [DOI] [PubMed] [Google Scholar]
  54. Martiniano R., Caffell A., Holst M., Hunter-Mann K., Montgomery J., Müldner G., McLaughlin R. L., Teasdale M. D., van Rheenen W., Veldink J. H., van den Berg L. H., Hardiman O., Carroll M., Roskams S., Oxley J., Morgan C., Thomas M. G., Barnes I., McDonnell C., … Bradley D. G. (2016). Genomic signals of migration and continuity in Britain before the Anglo-Saxons. Nature Communications, 7(1), 10326. 10.1038/ncomms10326 [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Mathieson I., Alpaslan-Roodenberg S., Posth C., Szécsényi-Nagy A., Rohland N., Mallick S., Olalde I., Broomandkhoshbacht N., Candilio F., Cheronet O., Fernandes D., Ferry M., Gamarra B., Fortes G. G., Haak W., Harney E., Jones E., Keating D., Krause-Kyora B., … Reich D. (2018). The genomic history of southeastern Europe. Nature, 555(7695), 197–203. 10.1038/nature25778 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Mathieson I., Lazaridis I., Rohland N., Mallick S., Patterson N., Roodenberg S. A., Harney E., Stewardson K., Fernandes D., Novak M., Sirak K., Gamba C., Jones E. R., Llamas B., Dryomov S., Pickrell J., Arsuaga J. L., de Castro J. M. B., Carbonell E., … Reich D. (2015). Genome-wide patterns of selection in 230 ancient Eurasians. Nature, 528(7583), 499–503. 10.1038/nature16152 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Mathieson I., & Terhorst J. (2022). Direct detection of natural selection in Bronze Age Britain. Genome Research, 32(11), 2057–2067. 10.1101/gr.276862.122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Maynard-Smith J., & Haigh J. (1974). The hitch-hiking effect of a favourable gene. Genetical Research, 23(1), 23–35. 10.1017/S0016672300014634 [DOI] [PubMed] [Google Scholar]
  59. McVicker G., Gordon D., Davis C., & Green P. (2009). Widespread genomic signatures of natural selection in hominid evolution. PLOS Genetics, 5(5), e1000471. 10.1371/journal.pgen.1000471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Mittnik A., Wang C.-C., Pfrengle S., Daubaras M., Zariņa G., Hallgren F., Allmäe R., Khartanovich V., Moiseyev V., Tõrv M., Furtwängler A., Andrades Valtueña A., Feldman M., Economou C., Oinonen M., Vasks A., Balanovska E., Reich D., Jankauskas R., … Krause J. (2018). The genetic prehistory of the Baltic Sea region. Nature Communications, 9(1), 442. 10.1038/s41467-018-02825-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Mölder F., Jablonski K., Letcher B., Hall M., Tomkins-Tinch C., Sochat V., Forster J., Lee S., Twardziok S., Kanitz A., Wilm A., Holtgrewe M., Rahmann S., Nahnsen S., & Köster J. (2021). Sustainable data analysis with Snakemake. F1000Research, 10(33). 10.12688/f1000research.29032.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Murphy D., Elyashiv E., Amster G., & Sella G. (2021). Broad-scale variation in human genetic diversity levels is predicted by purifying selection on coding and non-coding elements. 10.1101/2021.07.02.450762 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Narasimhan V. M., Patterson N., Moorjani P., Rohland N., Bernardos R., Mallick S., Lazaridis I., Nakatsuka N., Olalde I., Lipson M., Kim A. M., Olivieri L. M., Coppa A., Vidale M., Mallory J., Moiseyev V., Kitov E., Monge J., Adamski N., … Reich D. (2019). The formation of human populations in South and Central Asia. Science, 365(6457), eaat7487. 10.1126/science.aat7487 [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Olalde I., Brace S., Allentoft M. E., Armit I., Kristiansen K., Booth T., Rohland N., Mallick S., Szécsényi-Nagy A., Mittnik A., Altena E., Lipson M., Lazaridis I., Harper T. K., Patterson N., Broomandkhoshbacht N., Diekmann Y., Faltyskova Z., Fernandes D., … Reich D. (2018). The Beaker phenomenon and the genomic transformation of northwest Europe. Nature, 555(7695), 190–196. 10.1038/nature25738 [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Orlando L. (2020). The evolutionary and historical foundation of the modern horse: Lessons from ancient genomics. Annual Review of Genetics, 54(1), 563–581. 10.1146/annurev-genet-021920-011805 [DOI] [PubMed] [Google Scholar]
  66. Papac L., Ernée M., Dobeš M., Langová M., Rohrlach A. B., Aron F., Neumann G. U., Spyrou M. A., Rohland N., Velemínský P., Kuna M., Brzobohatá H., Culleton B., Daněček D., Danielisová A., Dobisíková M., Hložek J., Kennett D. J., Klementová J., … Haak W. (2021). Dynamic changes in genomic and social structures in third millennium BCE central Europe. Science Advances, 7(35), eabi6941. 10.1126/sciadv.abi6941 [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Patterson N., Isakov M., Booth T., Büster L., Fischer C.-E., Olalde I., Ringbauer H., Akbari A., Cheronet O., Bleasdale M., Adamski N., Altena E., Bernardos R., Brace S., Broomandkhoshbacht N., Callan K., Candilio F., Culleton B., Curtis E., … Reich D. (2022). Large-scale migration into Britain during the Middle to Late Bronze Age. Nature, 601(7894), 588–594. 10.1038/s41586-021-04287-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Patterson N., Moorjani P., Luo Y., Mallick S., Rohland N., Zhan Y., Genschoreck T., Webster T., & Reich D. (2012). Ancient admixture in human history. Genetics, 192(3), 1065–1093. 10.1534/genetics.112.145037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Pearson A., & Durbin R. (2023). Local ancestry inference for complex population histories. bioRxiv. 10.1101/2023.03.06.529121 [DOI] [Google Scholar]
  70. Petr M., Pääbo S., Kelso J., & Vernot B. (2019). Limits of long-term selection against Neandertal introgression. Proceedings of the National Academy of Sciences, 116(5), 1639–1644. 10.1073/pnas.1814338116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Raghavan M., Skoglund P., Graf K. E., Metspalu M., Albrechtsen A., Moltke I., Rasmussen S., Stafford T. W. Jr, Orlando L., Metspalu E., Karmin M., Tambets K., Rootsi S., Mägi R., Campos P. F., Balanovska E., Balanovsky O., Khusnutdinova E., Litvinov S., … Willerslev E. (2014). Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature, 505(7481), 87–91. 10.1038/nature12736 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Ralph P., Wong Y., Jeffery B., Ashander J., Patterson G., Kelleher J., R., M., Kern A., Tittes S., Huang X., Galloway J., & lclclclclclclc. (2023). tskit-dev/pyslim: 1.0.3 (Version 1.0.3). Zenodo. 10.5281/zenodo.8068030 [DOI] [Google Scholar]
  73. Reid B. N., Star B., & Pinsky M. L. (2023). Detecting parallel polygenic adaptation to novel evolutionary pressure in wild populations: A case study in Atlantic cod (Gadus morhua). Philosophical Transactions of the Royal Society B: Biological Sciences, 378(1881), 20220190. 10.1098/rstb.2022.0190 [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Rivollat M., Jeong C., Schiffels S., Küçükkalıpçı İ., Pemonge M.-H., Rohrlach A. B., Alt K. W., Binder D., Friederich S., Ghesquière E., Gronenborn D., Laporte L., Lefranc P., Meller H., Réveillas H., Rosenstock E., Rottier S., Scarre C., Soler L., … Haak W. (2020). Ancient genome-wide DNA from France highlights the complexity of interactions between Mesolithic hunter-gatherers and Neolithic farmers. Science Advances, 6(22), eaaz5344. 10.1126/sciadv.aaz5344 [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Rohland N., Mallick S., Mah M., Maier R., Patterson N., & Reich D. (2022). Three assays for in-solution enrichment of ancient human DNA at more than a million SNPs. Genome Research, 32(11), 2068–2078. 10.1101/gr.276728.122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Saleh D., Chen J., Leplé J.-C., Leroy T., Truffaut L., Dencausse B., Lalanne C., Labadie K., Lesur I., Bert D., Lagane F., Morneau F., Aury J.-M., Plomion C., Lascoux M., & Kremer A. (2022). Genome-wide evolutionary response of European oaks during the Anthropocene. Evolution Letters, 6(1), 4–20. 10.1002/evl3.269 [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Santiago E., & Caballero A. (1998). Effective size and polymorphism of linked neutral loci in populations under directional selection. Genetics, 149(4), 2105–2117. 10.1093/genetics/149.4.2105 [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Scheib C. L., Hui R., D’Atanasio E., Wohns A. W., Inskip S. A., Rose A., Cessford C., O’Connell T. C., Robb J. E., Evans C., Patten R., & Kivisild T. (2019). East Anglian early Neolithic monument burial linked to contemporary megaliths. Annals of Human Biology, 46(2), 145–149. 10.1080/03014460.2019.1623912 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Schrider D. R., & Kern A. D. (2017). Soft sweeps are the dominant mode of adaptation in the human genome. Molecular Biology and Evolution, 34(8), 1863–1877. 10.1093/molbev/msx154 [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Schroeder H., Margaryan A., Szmyt M., Theulot B., Włodarczak P., Rasmussen S., Gopalakrishnan S., Szczepanek A., Konopka T., Jensen T. Z. T., Witkowska B., Wilk S., Przybyła M. M., Pospieszny Ł., Sjögren K.-G., Belka Z., Olsen J., Kristiansen K., Willerslev E., … Allentoft M. E. (2019). Unraveling ancestry, kinship, and violence in a Late Neolithic mass grave. Proceedings of the National Academy of Sciences, 116(22), 10705–10710. 10.1073/pnas.1820210116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Sella G., Petrov D. A., Przeworski M., & Andolfatto P. (2009). Pervasive natural selection in the Drosophila genome? PLOS Genetics, 5(6), e1000495. 10.1371/journal.pgen.1000495 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Skoglund P., Mallick S., Bortolini M. C., Chennagiri N., Hünemeier T., Petzl-Erler M. L., Salzano F. M., Patterson N., & Reich D. (2015). Genetic evidence for two founding populations of the Americas. Nature, 525(7567), 104–108. 10.1038/nature14895 [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Sohail M., Maier R. M., Ganna A., Bloemendal A., Martin A. R., Turchin M. C., Chiang C. W., Hirschhorn J., Daly M. J., Patterson N., Neale B., Mathieson I., Reich D., & Sunyaev S. R. (2019). Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies (Nordborg M., McCarthy M. I., Barton N. H., & Hermisson J., Eds.). eLife, 8, e39702. 10.7554/eLife.39702 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Souilmi Y., Tobler R., Johar A., Williams M., Grey S. T., Schmidt J., Teixeira J. C., Rohrlach A., Tuke J., Johnson O., Gower G., Turney C., Cox M., Cooper A., & Huber C. D. (2022). Admixture has obscured signals of historical hard sweeps in humans. Nature Ecology & Evolution, 1–13. 10.1038/s41559-022-01914-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Villalba-Mouco V., van de Loosdrecht M. S., Posth C., Mora R., Martínez-Moreno J., Rojo-Guerra M., Salazar-García D. C., Royo-Guillén J. I., Kunst M., Rougier H., Crevecoeur I., Arcusa-Magallón H., Tejedor-Rodríguez C., García-Martínez de Lagrán I., Garrido-Pena R., Alt K. W., Jeong C., Schiffels S., Utrilla P., … Haak W. (2019). Survival of Late Pleistocene hunter-gatherer ancestry in the Iberian Peninsula. Current Biology, 29(7), 1169–1177.e7. 10.1016/j.cub.2019.02.006 [DOI] [PubMed] [Google Scholar]
  86. Wang C.-C., Reinhold S., Kalmykov A., Wissgott A., Brandt G., Jeong C., Cheronet O., Ferry M., Harney E., Keating D., Mallick S., Rohland N., Stewardson K., Kantorovich A. R., Maslov V. E., Petrenko V. G., Erlikh V. R., Atabiev B. C., Magomedov R. G., … Haak W. (2019). Ancient human genome-wide data from a 3000-year interval in the Caucasus corresponds with eco-geographic regions. Nature Communications, 10(1), 590. 10.1038/s41467-018-08220-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Wilde S., Timpson A., Kirsanow K., Kaiser E., Kayser M., Unterländer M., Hollfelder N., Potekhina I. D., Schier W., Thomas M. G., & Burger J. (2014). Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y. Proceedings of the National Academy of Sciences, 111(13), 4832–4837. 10.1073/pnas.1316513111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Yair S., & Coop G. (2022). Population differentiation of polygenic score predictions under stabilizing selection. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1852), 20200416. 10.1098/rstb.2020.0416 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Supplement 1
media-1.pdf (524.2KB, pdf)

Data Availability Statement

No new data were produced for this work. Analysis pipelines are available at doi: 10.5281/zenodo.8093105 for the simulations and at doi: 10.5281/zenodo.8093107 for the ancient DNA data. These analyses rely on a custom helper Python module available at doi: 10.5281/zenodo.8093101.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
