Journal of the Royal Society Interface. 2025 Jul 16;22(228):20250146. doi: 10.1098/rsif.2025.0146

Taylor’s Power Law rules the dynamics of allele frequencies during viral evolution in response to host changes

João M F Silva 1, María J Olmo-Uceda 1, Valerie J Morley 2, Paul E Turner 2,3,4, Santiago F Elena 1,5
PMCID: PMC12303090  PMID: 40664232

Abstract

Sudden and gradual changes from permissive to resistant hosts affect viral fitness, virulence and rates of molecular evolution. We analysed the roles of stochasticity and selection in evolving populations of Sindbis virus under different rates of host replacement. First, approximate Markov models within the Wright–Fisher diffusion framework revealed a reduction in effective population size by approximately half under sudden host changes. These scenarios were also associated with fewer weak beneficial mutations. Second, genetic distance between populations at consecutive time points indicated that populations undergoing gradual host changes evolved steadily until the original host disappeared. Distances to the ancestral sequence in these cases exhibited occasional leapfrog phenomena, where the rise of certain haplotypes is not predictable based on their relatedness to previously dominant ones. In contrast, populations exposed to sudden changes exhibited less-stable compositions and diverged from the ancestral sequence at a consistent rate. Third, we observed that the distribution of allele frequencies followed Taylor’s Power Law. Both treatments exhibited high levels of allele aggregation and significant fluctuations, with neutral, beneficial and deleterious alleles distinguishable by their behaviour and position on Taylor’s plot. Finally, we found evidence that the host replacement regime influences the temporal distribution of mutations across the genome.

Keywords: approximate Wright–Fisher diffusion model, effective population size, experimental evolution, Hurst’s exponent, selection coefficient, Sindbis virus, Taylor’s temporal fluctuation scaling, virus evolution

1. Introduction

Environmental heterogeneity strongly influences molecular adaptation dynamics. Pathogens adapt to new hosts by fixing mutations that enhance fitness in the novel host, even if these mutations are neutral or detrimental in the original host. Consequently, the availability of novel hosts in the environment affects both the rate and direction of molecular adaptation. For example, the adaptation dynamics of the Sindbis virus (SINV) to a less-permissive cell type in laboratory tissue culture are determined by the rate at which these novel host cells ‘invade’ the environment [1,2]. In a series of evolution experiments, Morley et al. [1] introduced a novel host cell type at varying rates, ranging from a gradual increase in the proportion of the novel cell type with each passage to an abrupt shift to an environment composed entirely of the novel host. Gradual changes were associated with virus populations achieving higher fitness in both novel and original hosts, as well as greater convergence among populations that fixed the same adaptive mutations [1]. Here, we reanalyse these results within a systems biology framework to uncover the interplay between selection and noise under different host replacement rates, and how this affects population composition during adaptation.

Allele frequencies typically fluctuate over time in both natural and experimental populations due to a variety of factors, including purely stochastic processes (e.g. genetic drift, fluctuating selection, migration or unpredictable ecological changes) and deterministic processes (e.g. directional selection). Indeed, temporal fluctuations are ubiquitous in physical systems, raising the question of whether they follow universal laws. Various power-law relationships have been found to be pervasive in physical and biological systems. One such relationship, known as Taylor’s Power Law or the fluctuation scaling law, posits that the variance ($\sigma^2$) of a system’s elements scales as a power of the mean ($\mu$): $\sigma^2 = V\mu^{\beta}$ [3]. Originally described in ecology, this law naturally arises in many complex systems [4–9]. Its parameters capture both the amplitude of the noise (V) and the degree of temporal aggregation (β) in the fluctuations observed within the system [5,6,8]. Notably, in the context of infectious diseases, Taylor’s Law has been applied to temporal data from the human microbiota, revealing that an individual’s negative health status is associated with increased noise and system instability [8]. Similarly, recent studies have found that during severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, transcript dynamics in cells from human intestinal organoids, but not pulmonary cells, exhibit an increase–decrease–increase pattern in system noise and instability as infection progresses and the virus accumulates [9]. In the context of temporal variations in allele frequencies during virus adaptation to novel hosts, we propose that Taylor’s Power Law can be used to model how the variance of allele frequencies changes over time. In populations influenced solely by stochastic processes (i.e. exponential), it is expected that β = 2, meaning that variance scales quadratically with the mean allele frequency. In contrast, β > 2 indicates non-random processes that amplify variability, such as migration or environmental heterogeneity, whereas β < 2 suggests processes that constrain variability, such as stabilizing selection or density dependence (with β = 1 corresponding to a Poisson process).

Many natural processes exhibit long-term memory or persistent behaviour (autocorrelation), meaning that a high value is likely to be followed by another high value (and similarly for low values) [10–15]. This behaviour can be quantified using the Hurst exponent (denoted as H throughout this work), where: 0 < H < 0.5 indicates anti-persistent (negative autocorrelation) behaviour, H = 0.5 indicates a random walk, and 0.5 < H < 1 indicates persistent behaviour. Interestingly, many studied processes have an estimated H ≈ 0.7, a phenomenon known as the Hurst phenomenon [10]. Although Taylor’s Law and the Hurst phenomenon both describe variability and scaling behaviour, they focus on different aspects of the process and are indirectly related. Specifically, when Taylor’s Law is applied to the variance of fluctuations at different time scales, the simple relationship β = 2H must hold [11]. This connection arises because both laws describe the fractal or scaling properties of purely stochastic processes.

In this work, we estimate population genetic parameters—such as the selection coefficient per allele (s) and effective population size (Ne)—and describe allele dynamics in experimentally evolving SINV populations under two different temporal schemes of host replacement. SINV populations that experienced a sudden replacement of a highly susceptible host with a less permissive one exhibited smaller Ne values, indicating stronger bottlenecks and genetic drift. This observation prompted us to investigate how noise and selection affect the dynamics of virus molecular adaptation to novel hosts. First, we show that under gradual replacement of the highly susceptible host, the genetic composition of the viral populations changed steadily until the host was completely removed. In contrast, under the sudden treatment, the shift in population composition was more pronounced and less consistent. Second, we characterized Taylor’s Power Law for the temporal variation in allele frequency. This power model allowed us to examine in greater detail how selection and noise resulting from genetic drift influence the dynamics of molecular adaptation. Third, we investigate the persistent behaviour characteristic of the Hurst phenomenon in the distribution of mutations along SINV genomes, finding evidence that, in the sudden treatment, mutations become more randomly distributed earlier in time.

2. Methods

2.1. Description of the study system and data acquisition

We analysed data from an experimental evolution study that tracked the molecular adaptation of SINV (species Alphavirus sindbis, genus Alphavirus, family Togaviridae) typically cultured on a highly susceptible host, BHK-21 cells, while challenged to infect the less-susceptible host CHO cells, which were genetically modified (pgsD-677 ATCC CRL-2244) to be more resistant to SINV infection [1]. Briefly, SINV populations were evolved through 25 passages (approx. 100 virus generations) in monolayer cell cultures with a multiplicity of infection (MOI) of approximately 0.01 plaque-forming units (pfu) per cell at each passage. Two different treatments from the original study [1,2] were chosen for an in-depth analysis to investigate the link between population parameters and system dynamics of molecular adaptation to a novel host. Figure 1a shows a schematic representation of the chosen treatments. The initial stock was prepared by expression of an infectious clone in BHK-21 cells for 24 h [16]. In the gradual treatment, the proportion of CHO cells in the cell cultures increased at each passage, reaching 100% at the last passage. In contrast, in the sudden treatment, cell cultures were entirely composed of CHO cells from the first passage. For each treatment, nine populations were evolved and sequenced at passages 4, 7, 10, 13, 16, 19, 22 and 25. Following RNA extraction, two technical replicates were prepared for each sample from the reverse transcription step onward. Sample preparation is fully explained in [2].

Figure 1.

Schematic representations of the experiments and computational analyses flow. (a) Proportion of CHO cells at each passage for both treatments. Dotted lines represent the passages in which sequencing was performed. (b) Estimation of population parameters, diversity measures and Taylor’s parameters directly from the allele frequency table. Hurst’s exponent H was calculated from binary sequences where ones (represented by vertical lines) correspond to polymorphic sites.

The final allele frequency tables for each sample were obtained from [2]. Briefly, reads were trimmed with cutadapt version 1.8.3 [17], aligned to the consensus sequence of the original stock with BWA version 0.7.10 [18] and variant calling was conducted with QUASR version 7.01 [19] and VarScan version 2.3.9 [20]. To minimize technical errors, variants that appeared in only one technical replicate were assumed to be errors and were excluded. For variants detected in both technical replicates with a frequency of at least 1%, the mean of the two replicate frequencies was included in the final variant table.
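The replicate-filtering rule described above can be sketched as follows. This is a minimal illustration and not the pipeline used in [2]: the function name, the dictionary layout keyed by (position, allele) and the example frequencies are hypothetical, and the sketch assumes that the 1% threshold must be met in each replicate.

```python
def merge_replicates(rep1, rep2, min_freq=0.01):
    """Combine two technical replicates into one variant table.

    rep1, rep2: dicts mapping a variant key, e.g. (position, allele),
    to its frequency in that replicate (hypothetical layout).
    """
    merged = {}
    # Variants seen in only one replicate are treated as errors and dropped.
    for variant in rep1.keys() & rep2.keys():
        f1, f2 = rep1[variant], rep2[variant]
        # Assumption: the 1% threshold is applied to each replicate.
        if f1 >= min_freq and f2 >= min_freq:
            merged[variant] = (f1 + f2) / 2  # mean of the two replicates
    return merged


# Made-up frequencies for illustration only.
rep_a = {(1021, "T"): 0.12, (3300, "G"): 0.02, (5000, "A"): 0.03}
rep_b = {(1021, "T"): 0.10, (3300, "G"): 0.005}
table = merge_replicates(rep_a, rep_b)
```

In this toy input, only the variant at position 1021 survives: the variant at 5000 is missing from one replicate and the variant at 3300 falls below the threshold in the other.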

2.2. Wright–Fisher approximate Bayesian computation estimates of selection coefficients and effective population sizes

The selection coefficient (s) and effective population size (Ne) were estimated for each allele using the approxwf software [21] (figure 1b), which uses a discrete approximation of the Wright–Fisher diffusion model and Bayesian inference to estimate population parameters. The implementation assumes a log-uniform (1, 5) prior distribution on Ne and a normal (0, 0.05) prior distribution on s [21]. The algorithm runs a Markov chain Monte Carlo (MCMC) using 51 states for 25 000 iterations, discarding the first 2000 iterations as burn-in. The mean values of s and Ne from their posterior distributions were used in downstream analyses.
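To make the roles of s and Ne concrete, the following is a minimal forward simulation of a haploid Wright–Fisher population with selection followed by binomial drift. It is not the approxwf inference procedure, which works in the opposite direction (from observed trajectories to posterior estimates); the function names and all parameter values are illustrative assumptions.

```python
import random


def expected_next_freq(p, s):
    """Deterministic frequency change for an allele with relative fitness 1 + s."""
    return p * (1 + s) / (1 + s * p)


def simulate_wright_fisher(p0, s, ne, generations, rng):
    """Return an allele-frequency trajectory under selection plus drift."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p_sel = expected_next_freq(p, s)
        # Drift: draw ne individuals from the post-selection frequency.
        copies = sum(1 for _ in range(ne) if rng.random() < p_sel)
        p = copies / ne
        trajectory.append(p)
    return trajectory


rng = random.Random(1)
# Same s and starting frequency, two hypothetical population sizes.
large = simulate_wright_fisher(p0=0.1, s=0.05, ne=2000, generations=100, rng=rng)
small = simulate_wright_fisher(p0=0.1, s=0.05, ne=100, generations=100, rng=rng)
```

Comparing `large` and `small` over repeated runs shows the qualitative point made in the text: with the same s, a smaller Ne produces noisier trajectories in which even beneficial alleles are more often lost.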

2.3. Genetic diversity within evolving viral populations

The allele frequency table was used to calculate genetic distance and diversity between passages (figure 1b). To measure the genetic distance between two viral populations from consecutive time points, the allele frequency difference (AFD) [22] was calculated at each polymorphic site and averaged by the richness of the sample with custom R scripts, where n represents the total number of different alleles observed at the polymorphic site and the $f_{i1}$ and $f_{i2}$ terms are the proportions of allele i in the two viral populations,

$$\mathrm{AFD} = \frac{1}{2}\sum_{i=1}^{n}\left|f_{i1} - f_{i2}\right|.$$
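The AFD at a single polymorphic site can be sketched as below. The original analysis used custom R scripts that are not reproduced here; this Python function, its name and the example frequencies are illustrative assumptions.

```python
def allele_frequency_difference(freqs_1, freqs_2):
    """AFD at one site: half the summed absolute differences in allele frequency.

    freqs_1, freqs_2: dicts mapping each allele to its frequency in the
    first and second population; absent alleles count as frequency 0.
    """
    alleles = set(freqs_1) | set(freqs_2)
    return 0.5 * sum(
        abs(freqs_1.get(a, 0.0) - freqs_2.get(a, 0.0)) for a in alleles
    )


# Made-up frequencies at one site in two consecutive passages.
passage_4 = {"A": 0.7, "G": 0.3}
passage_7 = {"A": 0.4, "G": 0.6}
afd = allele_frequency_difference(passage_4, passage_7)
```

For this site, AFD = 0.5 × (|0.7 − 0.4| + |0.3 − 0.6|) = 0.3. The measure ranges from 0 for identical compositions to 1 when the two populations are fixed for different alleles.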

As a second measure of within-sample genetic diversity, we used allele richness, defined as the number of unique alleles per locus, adjusted for sample size.

2.4. Taylor’s Fluctuations Power Law

Mean and variance of allele frequencies ($p$) across time ($\bar{p}$ and $\sigma_p^2$, respectively) were computed and used to estimate Taylor’s parameters V and β by linear regression in log–log space: $\log \sigma_p^2 = \log V + \beta \log \bar{p}$ [3] (figure 1b). Given that the proportion of zeros had a significant influence on the fits, Taylor’s parameters were estimated independently for alleles grouped by their number of non-zero occurrences. Multivariate analyses of variance (MANOVA) of Taylor’s parameters were performed with parameters estimated from at least five observations. The magnitude of effects was evaluated using the $\eta_P^2$ statistic. Conventionally, values of $0.01 < \eta_P^2 \le 0.14$ are considered medium effects and $\eta_P^2 > 0.14$ large ones. Three random walk matrices were constructed based on the cumulative sum of random variables drawn from a normal distribution with a mean of zero and standard deviations of 0.2, 0.05 and 0.01, respectively. For each matrix, 1000 random walks with eight time points were generated, and values below 0.01 were set to zero and those above 0.99 were set to one. After eliminating walks with only zeros, 792, 757 and 587 random walks were present in the matrices with standard deviation 0.2, 0.05 and 0.01, respectively.
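The log–log regression and the grouping by non-zero occurrences can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it uses ordinary least squares, the sample variance (as R's `var` would), base-10 logarithms, and hypothetical function names.

```python
import math
from statistics import mean, variance


def taylor_fit(trajectories):
    """Fit log(var) = log(V) + beta * log(mean); return (V, beta) or None."""
    xs, ys = [], []
    for series in trajectories:
        m, v = mean(series), variance(series)  # sample variance (assumption)
        if m > 0 and v > 0:  # logarithms are undefined otherwise
            xs.append(math.log10(m))
            ys.append(math.log10(v))
    if len(xs) < 2:
        return None
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # all alleles share the same mean; slope is undefined
        return None
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    return 10 ** (y_bar - beta * x_bar), beta


def fit_by_nonzero_count(trajectories):
    """Estimate (V_n, beta_n) separately for each number of non-zero time points n."""
    groups = {}
    for series in trajectories:
        n = sum(1 for f in series if f > 0)
        groups.setdefault(n, []).append(series)
    fits = {n: taylor_fit(group) for n, group in groups.items()}
    return {n: fit for n, fit in fits.items() if fit is not None}


# Rescaled copies of one series: variance scales with mean squared, so beta = 2.
base = [0.1, 0.2, 0.3, 0.4]
scaled = [[f * c for f in base] for c in (1.0, 0.5, 0.25)]
V, beta = taylor_fit(scaled)
```

The toy input is built so that the answer is known in advance: multiplying a series by a constant c multiplies its mean by c and its variance by c², which places all points on a line of slope 2 in log–log space.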

2.5. Long-range dependence of mutation sites

To characterize the long-range dependence behaviour of mutations along the SINV genome, rescaled-range analyses were performed on the placement of mutations at each time point with the R package pracma version 2.4.2 [23]. For this analysis, binary sequences of length 11 703 (the genome length) with ones at the positions of mutations were used to estimate the empirical Hurst’s exponent (H) (figure 1b). The minimum window size was set to 2000 to avoid windows with only zeros. Indels were excluded from this analysis.
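A basic rescaled-range (R/S) estimate of H on a binary mutation sequence can be sketched as below. This is a simplified textbook version and not the pracma implementation, which reports several corrected estimates; the halving scheme for window sizes, the function names and the randomly placed mutation sites are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev


def rescaled_range(window):
    """R/S statistic of one window, or None if the window is constant."""
    sd = pstdev(window)
    if sd == 0:
        return None
    m = mean(window)
    running, cumulative = 0.0, []
    for x in window:
        running += x - m  # cumulative deviation from the window mean
        cumulative.append(running)
    return (max(cumulative) - min(cumulative)) / sd


def hurst_rs(series, min_window):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    log_sizes, log_rs = [], []
    size = len(series)
    while size >= min_window:
        chunks = [series[i:i + size]
                  for i in range(0, len(series) - size + 1, size)]
        values = [rs for rs in map(rescaled_range, chunks) if rs is not None]
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(mean(values)))
        size //= 2  # halve the window until the minimum size is reached
    x_bar, y_bar = mean(log_sizes), mean(log_rs)
    sxx = sum((x - x_bar) ** 2 for x in log_sizes)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_sizes, log_rs))
    return sxy / sxx


GENOME_LENGTH = 11703
rng = random.Random(0)
sites = rng.sample(range(GENOME_LENGTH), 120)  # hypothetical mutated positions
binary = [0] * GENOME_LENGTH
for pos in sites:
    binary[pos] = 1
h = hurst_rs(binary, min_window=2000)
```

With a genome of 11 703 positions and a minimum window of 2000, this scheme yields only three window sizes, so the slope is a coarse estimate; that limitation applies to the sketch and is not a statement about the pracma output.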

2.6. Statistical analyses

All the statistical analyses indicated above were performed with R version 4.4.0 under RStudio version 2024.04.2+764.

3. Results and discussion

3.1. A sudden host transition is associated with stronger genetic drift

The selection coefficient (s) and effective population size (Ne) were estimated for each allele in the evolving viral populations (figure 2) using a diffusion approximation to the Wright–Fisher process [21]. This method is essentially the same approximation developed by Kolmogorov and later analysed by Kimura in the 1940s and 1950s, as reviewed by Rouzine et al. [24]. Regardless of the cistron, the sudden host transition treatment consistently resulted in approximately twofold smaller Ne values. Because s and Ne can influence each other, we tested for the effects of cistron, treatment, population and their interactions on both s and Ne parameters. All independent variables and their interactions, with the exception of the interaction between treatment and cistron, were significant (MANOVA; p < 0.0001 for treatment, cistron and population, and p = 0.0012 for the interaction between cistron and population). When s and Ne were analysed independently, treatment had a large effect on Ne but a very small effect on s ($\eta_P^2$ = 0.99 for Ne and $\eta_P^2$ = 0.01 for s). However, post hoc pairwise tests show that, on average, alleles observed in the sudden transition at the nsp1, nsp2 and nsp3 cistrons are significantly less deleterious than those found under the gradual transition. The proteins encoded by these three cistrons are involved in the formation of the viral replication complex [25], and thus are expected to be under strong purifying selection. In particular, the product of nsp3 is involved in host specificity and virulence.

Figure 2.

Population parameters s and Ne for each allele across all populations. Boxplots of the mean values of the posterior distribution of s and Ne for each allele, displayed by treatment and cistron. Asterisks represent the significance of Mann–Whitney two-samples tests: * p < 0.05 and **** p < 0.0001.

Sudden host transitions were associated with stronger (more extreme) bottlenecks, which led to reduced Ne and, in turn, increased the influence of genetic drift. Interestingly, unlike in the gradual treatment, the cistron had no significant effect on s in the sudden treatment (ANOVA; p < 0.0001 for the gradual treatment and p = 0.3400 for the sudden treatment). One possible explanation is that under strong bottlenecks, the effect of selection across different genomic regions may be weaker or less detectable, whereas it becomes more apparent when the populations are less impacted by drift, as seen in the gradual treatment. However, even in the gradual treatment, the effect size of cistron on s was moderate ($\eta_P^2$ = 0.03). Although population parameters differed significantly between treatments, particularly Ne, these estimates should be interpreted with caution: the approxwf method assumes that each allele evolves independently and ignores linkage effects such as clonal interference, which have been shown to be important [26]. More sensitive methods for the estimation of population parameters are available [27–29].

3.2. A sudden host transition is associated with more instability in population allele composition

Changes in the composition of the populations between consecutive passages were evaluated by summing the AFDs [22] of all alleles. Overall, the composition of the populations was more variable (less steady) in the sudden treatment (figure 3), most likely due to greater effects of genetic drift. In the gradual treatment, most populations changed steadily until the last passage. At this point, a large shift in the composition of the populations was seen (figure 3a).

Figure 3.

Evolution of the populations’ composition. (a) Boxplots of the mean AFD between two consecutive passages and between each passage and ancestral sequence, also showing the mean AFD for each population represented by lines. (b) Boxplot of the richness of each passage. Lines represent the richness of each population, and asterisks represent the significance of paired samples Wilcoxon tests: * p < 0.05 and ** p < 0.01.

The rate at which the populations diverged from the ancestral sequence also differed between treatments (figure 3a). In the sudden treatment, populations showed a steady trend of increasing divergence from the ancestor. In contrast, the gradual treatment displayed instances where populations became more similar to the ancestral sequence over time. This phenomenon, known as the leapfrog effect [30], occurs when one haplotype replaces another, with both having evolved from a common ancestor. Since both haplotypes are more similar to their ancestor than they are to one another, the population can become more similar to the ancestral population as it evolves. Five not mutually exclusive mechanisms can be brought forward to explain this leapfrog effect. First, balancing selection might favour diversity at certain loci, allowing rare haplotypes to gain an advantage under changing environmental or selective conditions [26]. Second, epistasis and hitchhiking: specific combinations of alleles in rare haplotypes may confer a fitness advantage (epistasis) that becomes beneficial over time, hitchhiking them to dominance [31–33]. Third, environmental fluctuations can alter the fitness landscape, favouring haplotypes that were previously at a disadvantage, even if they are less related to the dominant haplotypes [34–36]. Fourth, genetic drift in small populations allows rare haplotypes to randomly increase in frequency, contributing to their rise in dominance over time [37]. Fifth, lineage sorting, in which low-frequency alleles unrelated to previously dominant ones are linked through deeper evolutionary events [38], and recombination, which breaks apart haplotypes and creates new advantageous combinations not present in the previously dominant group [39], both contribute to the emergence of novel variants. Since we observed the leapfrog phenomenon in the gradual transition regime (larger Ne; figure 2), we propose that genetic drift may contribute less than the other four mechanisms.

3.3. BHK-21 specialists persisted until the complete elimination of BHK-21 cells from the environment

We hypothesize the co-occurrence of BHK-21-specialist, CHO-specialist and generalist haplotypes in the gradual treatment. Clonal interference and competition are expected between and within generalists and specialists, but less so between haplotypes specialized in different host cells. Clonal interference occurs when beneficial mutations are lost due to competition with other beneficial mutations that are present in other haplotypes [30,40,41]. Here, it is more likely that beneficial mutations on generalists will be lost due to competition with specialists, although the opposite may also happen, especially if there is no trade-off in the evolution of generalists [42]. The large shift in population composition seen at the last passage in the gradual treatment suggests that BHK-21 specialists became extinct once BHK-21 cells were completely absent from the environment. We noted, however, that two populations did not seem to follow this trend, at least as strongly, suggesting that in those cases, a large fraction of high-fitness generalists could have emerged (figure 3a).

To better understand the evolution of population composition, we analysed the richness of each sample, defined as the number of polymorphic sites (figure 3b). Specifically, we aimed to determine whether the elimination of BHK-21 specialists was, as expected, associated with a loss in richness. Interestingly, in the sudden treatment, an initial decline in richness was followed by a relatively high and stable level of richness from passage 13 onwards. This observation, along with the concurrent high variability in the population, suggests that many mutations are lost between passages but are continually replenished by new ones. In the gradual treatment, although a decrease in richness was observed at the final point, it is not statistically significant. Therefore, the apparent loss of BHK-21 specialists does not fully account for the substantial shift in population composition observed at the final passage in the gradual treatment.

We reanalysed fitness data of the evolved populations on BHK-21 and CHO cells to better contextualize the results presented above (figure 4). Populations evolved under the sudden treatment exhibited higher fitness in CHO cells than in BHK-21 cells, consistent with a composition dominated by CHO-specialist haplotypes. In contrast, no significant fitness difference was observed in populations evolved under the gradual replacement regime. Given the absence of a trade-off between hosts in these populations [1], our results suggested that, in the gradual treatment, the combined fitness of CHO specialists and generalists in CHO cells is comparable to the fitness of generalists in BHK-21 cells. Thus, gradual host replacement is associated not only with the emergence of generalist populations but also with the evolution of high-fitness generalist haplotypes [1].

Figure 4.

Effect of host replacement regime on fitness. Boxplots of the change in fitness relative to the ancestral for control (lineages evolved in BHK-21 cells), sudden and gradual treatments. Lines represent lineages, and asterisks represent the significance of paired Wilcoxon tests: *** p < 0.001 and **** p < 0.0001.

3.4. Allele frequencies display large fluctuations and aggregation behaviour

Morley & Turner [2] observed that some positively selected mutations were lost after reaching intermediate allele frequencies, suggesting clonal interference, which was particularly pronounced in the gradual treatment lineages. Here, we investigated whether temporal fluctuations in allele frequencies follow Taylor’s Power Law, focusing on the β parameter, which represents the slope of the fit in the log–log space. A β value of 1 indicates random fluctuations around the mean, consistent with a Poisson process, while a β value of 2 suggests large fluctuations characteristic of an exponential distribution. We hypothesize that due to the larger effect of drift on smaller populations, alleles in the sudden treatment will fluctuate closer to what is expected by the Poisson distribution, and thus, present smaller β than alleles in the gradual treatment. Additionally, large fluctuations due to clonal interference, especially for those alleles that reached intermediate or high frequencies, are also expected in the gradual treatment. Inspection of the Taylor’s plots showed that the fraction of zeros had a drastic impact on the fit (figure 5), which is to be expected given that only eight time points are present. Here, it is important to note that the presence of zeros might be either due to biological reasons, i.e. the allele was not present at that time point, or due to technical noise (measurement error). The latter is more likely for low frequency alleles that fluctuate close to the detection limit.

Figure 5.

Taylor’s Law plots. Each panel corresponds to one population, where each point corresponds to the mean and variance of an allele frequency across all passages. Alleles are coloured by their number of non-zero occurrences (n), and for each n, a coloured line represents the fit to Taylor’s Law. A black line represents the fit to Taylor’s Law for all alleles together. Taylor’s parameters V and β, respectively, for the fit without adjusting for n, are shown.

Fits to Taylor’s Law were performed using alleles grouped by their number of non-zero occurrences (n) to analyse the scaling of Taylor’s parameters V and β, referred to here as Vn and βn, under different host-change regimes. In all cases, βn was estimated to be close to or higher than one (figure 6a). When fitting based on the number of zeros, we expect that as the mean allele frequency increases, aggregation becomes more apparent, since values are concentrated in non-zero ranges. However, the βn parameter still effectively captures the distribution of these values and quantifies the degree of aggregation, with an important caveat. Because allele frequency is bounded in the interval [0, 1], early fixation results in high mean frequency and low variance, given that most time points will be ones, causing βn to decrease and resemble a Poisson-like process. In contrast, late fixation can increase βn. Therefore, interpretation of this parameter should focus on its deviation from 2, which represents the expected value for random fluctuations without fixation. In other words, mean frequencies and their fluctuations will occupy different regions with respect to βn = 2 (electronic supplementary material, figure S1a).
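The caveat about bounded frequencies can be illustrated with two made-up trajectories, both with eight non-zero or fixed time points in play. The values below are hypothetical and chosen only to contrast early and late fixation; they are not taken from the data.

```python
from statistics import mean, variance

# Hypothetical trajectories over the eight sequenced passages.
early_fixation = [0.2, 0.9, 1, 1, 1, 1, 1, 1]       # fixed by the third time point
late_fixation = [0, 0, 0, 0.05, 0.2, 0.6, 0.9, 1]   # fixed only at the last one

early_mean, early_var = mean(early_fixation), variance(early_fixation)
late_mean, late_var = mean(late_fixation), variance(late_fixation)
```

The early-fixing allele has the higher mean (about 0.89 versus 0.34) but the lower variance (about 0.08 versus 0.18), so alleles of this kind pull the fitted slope downwards, as described in the text.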

Figure 6.

Distribution of alleles frequencies across passages by their non-zero occurrences (n). (a) Taylor’s parameters estimated from alleles frequencies for each n for the gradual and sudden treatments and random walk matrices. (b) For each passage, boxplots of the distribution of alleles at each category of n.

To test this, three simulated matrices of random walks were generated based on the cumulative sum of random variables drawn from a normal distribution with mean zero and standard deviations of 0.2, 0.05 and 0.01, respectively (electronic supplementary material, figure S1b), where higher variance corresponds to greater fluctuations between time points. We then performed fits to the simulated random walks and observed that βn tends to be lower than one when the frequency of zeros decreases (figure 6a). Additionally, βn decreases as fluctuation increases, indicating that higher fluctuations are associated with lower βn. This is due to alleles occupying different regions of the Taylor’s plot depending on the strength of their fluctuations (electronic supplementary material, figure S1c). As expected, when fitting to Taylor’s Law without adjusting for n, random matrices behave close to a Poisson distribution (β ≈ 1), where parameter V increases with larger fluctuations.
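The construction of the random walk matrices can be sketched as follows. The sketch assumes that each walk starts from zero, that censoring is applied to the recorded values after the cumulative sum (so the underlying walk is not altered), and that seeds and function names are arbitrary; it is not expected to reproduce the exact counts of retained walks reported in the Methods.

```python
import random


def make_random_walks(n_walks, n_points, sd, rng, low=0.01, high=0.99):
    """Generate censored random walks and drop those that are zero throughout."""
    walks = []
    for _ in range(n_walks):
        level, walk = 0.0, []
        for _ in range(n_points):
            level += rng.gauss(0.0, sd)  # cumulative sum of normal steps
            if level < low:
                walk.append(0.0)   # below the detection-like threshold
            elif level > high:
                walk.append(1.0)   # treated as fixed
            else:
                walk.append(level)
        if any(value > 0 for value in walk):
            walks.append(walk)
    return walks


rng = random.Random(42)
matrices = {sd: make_random_walks(1000, 8, sd, rng) for sd in (0.2, 0.05, 0.01)}
```

Each entry of `matrices` can then be treated like a table of allele-frequency trajectories and passed to the same Taylor's Law fitting procedure used for the observed data.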

To better understand how allele frequencies are distributed across the sequenced passages, we examined and plotted the number of alleles separated by n at each passage (figure 6b). In the sudden treatment, a higher concentration of alleles with an n of one or two was observed during early passages, consistent with a strong bottleneck as viral populations were introduced to the completely new CHO host. Starting at passage 13, the number of these alleles begins to rise again. Overall, the sudden treatment showed a consistently higher number of alleles with an n of one or two across all passages compared with the gradual treatment (figure 6b). This pattern aligns with the expected effects of genetic drift, as well as the continuous emergence and loss of new mutations discussed above. In the gradual treatment, alleles with an n of one or two also clustered at the first and last passages. However, their early concentration was less pronounced than in the sudden treatment, and their numbers did not begin to rise again until passage 22 (figure 6b).

3.5. Effects of treatment and selection coefficient on alleles’ fluctuation

Next, we tested the effect of treatment, population and n on the distribution of alleles, accounting for all interactions between these variables. The number of non-zero occurrences, n, had the highest influence on Taylor’s parameters Vn and βn (MANOVA; p < 0.0001; $\eta_P^2$ = 0.83). Both the main effect treatment and the interaction between treatment and n had a significant influence on Taylor’s parameters (MANOVA; respectively, p = 0.0019 and $\eta_P^2$ = 0.14, and p = 0.0080 and $\eta_P^2$ = 0.11), which indicates that host change rate has a significant and large impact on the distribution of allele frequencies. The main effect population and the interaction between n and population were not significant, despite large effect sizes (MANOVA; respectively, p = 0.0531 and $\eta_P^2$ = 0.23, and p = 0.0837 and $\eta_P^2$ = 0.22).

Alleles were divided into three categories, neutral, deleterious and beneficial, to determine whether the strength of selection has an impact on Taylor’s parameters. By visualizing the distribution of s (figure 7a), it becomes clear that s follows a mostly bimodal distribution with a peak for negatively selected alleles and another for neutral ones, with only a fraction being positively selected. Interestingly, negatively selected alleles in the gradual treatment seem to be subjected to stronger negative selection. Based on these distributions, we set a threshold of s < −0.16 for alleles to be considered as negatively selected and s > 0.16 for them to be considered positively selected, whereas alleles with s ∈ (−0.16, 0.16) are considered as neutral.
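The thresholding rule can be written as a small helper. The function name and category labels are illustrative, and the treatment of values exactly equal to ±0.16 (assigned here to the neutral class) is an assumption, since the text specifies only open intervals.

```python
def classify_selection(s, threshold=0.16):
    """Assign an allele to a selection category from its estimated s."""
    if s < -threshold:
        return "deleterious"
    if s > threshold:
        return "beneficial"
    # Assumption: boundary values of exactly +/- threshold count as neutral.
    return "neutral"


# Made-up posterior means of s for four alleles.
labels = [classify_selection(s) for s in (-0.30, -0.05, 0.10, 0.25)]
```

For these four values, the labels are deleterious, neutral, neutral and beneficial, respectively.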

Figure 7.

Effects of selection on allele fluctuations. (a) Distribution of s per treatment. Dashed lines represent the thresholds for categorizing neutral, negatively and positively selected alleles. Taylor’s parameters V and β, respectively, for the fit without adjusting for n, are shown. (b) Taylor’s plots of combined allele frequencies across populations by treatment and s. All the alleles of each treatment are represented as dots and are coloured by their number of non-zero occurrences (n) in their selection classification. For each n and s, a coloured line represents the fit to Taylor’s Law. A black line represents the fit to Taylor’s Law for all alleles classified as the corresponding s in the panel. Taylor’s parameters V and β, respectively, for the fit without adjusting for n, are shown. (c) Boxplots of Vn and βn by treatment and s category. Asterisks represent the significance of unpaired Wilcoxon tests: * p < 0.05.

Because separating alleles by s drastically reduces sample size for some combinations of population and n, fits to Taylor’s Law were performed by treatment (with frequencies from all populations combined), selection category and n (figure 7b). Even so, sample size strongly affected the fits, especially where alleles with low mean frequency were absent (e.g. positively selected alleles with n = 8), which prevents an accurate estimate of how variance scales with the mean. Additionally, some n categories are missing entirely for either positively or negatively selected alleles, depending on treatment. Again, n had the strongest influence on Taylor’s parameters Vn and βn (MANOVA; p < 0.0001; ηP² = 0.75), followed by the interaction between n and s (p = 0.0011; ηP² = 0.31) and by s (p = 0.0299; ηP² = 0.2). In this case, treatment did not have a significant impact on Taylor’s parameters. It is likely that combining allele frequencies across populations introduced confounding effects. This is consistent with the large, although not statistically significant, effect sizes found for population and for the interaction between n and population when Taylor’s parameters were estimated separately for each population.
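A fit to Taylor’s Law stratified by n, as described above, can be sketched as follows. This is a minimal illustration and not the authors’ pipeline. It assumes the usual form variance = V · mean^β, fitted by ordinary least squares on log10-transformed values, and it assumes the population variance. The function names (`taylor_fit`, `fit_by_n`) and the toy trajectories are hypothetical.

```python
import math
from collections import defaultdict
from statistics import fmean, pvariance


def taylor_fit(means, variances):
    """Least-squares fit of log10(var) = log10(V) + beta * log10(mean)."""
    xs = [math.log10(m) for m in means]
    ys = [math.log10(v) for v in variances]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    V = 10 ** (y_bar - beta * x_bar)
    return V, beta


def fit_by_n(trajectories):
    """Group allele trajectories by n (non-zero time points) and fit each group."""
    groups = defaultdict(list)
    for freqs in trajectories:
        n = sum(1 for f in freqs if f > 0)
        mean, var = fmean(freqs), pvariance(freqs)
        if mean > 0 and var > 0:  # logarithms are undefined otherwise
            groups[n].append((mean, var))
    fits = {}
    for n, points in groups.items():
        # A slope needs at least two alleles with distinct mean frequencies.
        if len({m for m, _ in points}) >= 2:
            means, variances = zip(*points)
            fits[n] = taylor_fit(means, variances)
    return fits


# Hypothetical trajectories (one list of frequencies per allele).
toy = [
    [0.0, 0.0, 0.1, 0.2],
    [0.0, 0.0, 0.2, 0.4],
    [0.1, 0.1, 0.1, 0.1],  # zero variance, so it is skipped
]
for n, (V, beta) in fit_by_n(toy).items():
    print(f"n = {n}: V = {V:.3f}, beta = {beta:.3f}")
```

In this toy input the second trajectory is twice the first, so the mean doubles, the variance quadruples and the fitted β for n = 2 is 2. The sketch also shows why sparse strata are a problem: a group with fewer than two distinct means yields no fit at all.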

Despite not finding a significant influence of treatment on Taylor’s parameters, some interesting observations can be made from these Taylor’s plots with regard to s. First, points are far more dispersed for neutral alleles, which occupy a region made up of alleles that are present at several time points but have low mean frequency and low variance. Most notably, all alleles with n = 8 that have low mean frequency and variance are neutral. Because of this, if the data are fitted to Taylor’s Law without adjusting for n, neutral alleles will be closer to a Poisson distribution. Neutral alleles also exhibit the highest values of βn (figure 7c). Second, positively selected alleles with n = 8 have negative βn: their variance decreases as their mean frequency increases. This is due to early fixation, which sets the frequency to one at most time points and consequently keeps the variance low. When the data are fitted to Taylor’s Law without adjusting for n, this drives β down and, as a result, β is lower for positively selected alleles than for negatively selected ones. Additionally, positively selected alleles occupy a region of the Taylor’s plot close to the upper limits of the system, especially for higher n. Third, negatively selected alleles occupy a region of the Taylor’s plot between those of neutral and positively selected alleles. This difference is clearer for the sudden treatment.
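The link between early fixation and low variance can be checked numerically. The sketch below uses hypothetical eight-point trajectories, not data from the study. It also uses the Bhatia–Davis inequality, which states that a variable confined to [0, 1] has population variance of at most mean × (1 − mean). Whether this bound corresponds to the upper limits drawn in figure 7b is an assumption.

```python
from statistics import fmean, pvariance


def mean_var_bound(freqs):
    """Return (mean, population variance, Bhatia-Davis upper bound)."""
    m = fmean(freqs)
    return m, pvariance(freqs), m * (1.0 - m)


# Hypothetical trajectories with n = 8 (non-zero at every time point).
trajectories = {
    "late sweep": [0.01, 0.02, 0.05, 0.10, 0.30, 0.60, 0.90, 1.00],
    "early sweep": [0.20, 0.90, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00],
    "neutral, low frequency": [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.01],
}

results = {name: mean_var_bound(f) for name, f in trajectories.items()}
for name, (m, v, bound) in results.items():
    print(f"{name}: mean = {m:.4f}, variance = {v:.5f}, bound = {bound:.5f}")
```

The early sweep has a higher mean but a lower variance than the late sweep, because the bound mean × (1 − mean) shrinks as the mean approaches one. Among alleles present at all eight time points, this gives the negative relation between variance and mean described above.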

3.6. Host replacement regime affects the distribution of mutations along the Sindbis virus genome

Lastly, host change dynamics are likely to affect not only the temporal aggregation dynamics of allele frequencies but also the genomic location of mutations, owing to the interplay between drift, selection and linkage. Through genetic hitchhiking [2,31–33], neutral or low-fitness mutations may benefit from being strongly linked to high-fitness ones. In addition, gradual host replacement is associated with greater clonal interference [2], in which genomes containing multiple beneficial mutations outcompete those containing only one, promoting the fixation of linked mutations [2,43–45]. Given that the fixation of sweeps of mutations is greater in the gradual treatment [2], we sought to investigate whether mutations in the gradual treatment are less evenly distributed along the genome, that is, whether they aggregate.

For this, we estimated the empirical Hurst’s exponent H to measure persistent behaviour in the placement of mutations at each passage for each population (figure 8). Here, H was estimated from binary sequences with the same length as the SINV genome, where ones represent sites that differ in any way from the ancestral sequence. Lower values of H are associated with more evenly spaced mutations along the SINV genome. Despite high variability in the estimates, particularly in the sudden treatment, evidence of persistent behaviour was observed in most cases, with the median H value at each passage remaining above 0.5 for both treatments (figure 8). During the first two passages, the median H was higher in the sudden treatment, followed by two passages where H values were similar between treatments. In the final passages, H decreased and became lower in the sudden treatment; however, these differences were only statistically significant at passage 22 (p = 0.040). This pattern suggests that the continuous emergence and loss of mutations observed in the sudden treatment at later passages is associated with a more random distribution of mutations along the genome, potentially explaining the more scattered mutation pattern in this treatment. Nonetheless, treatment did not have a significant overall effect on H (linear model with passage and treatment as orthogonal fixed effects, and population as a random effect nested within treatment; sudden coefficient estimate = −0.043, p = 0.100).
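To illustrate how a Hurst exponent can be obtained from a binary sequence, the sketch below implements a basic rescaled-range (R/S) estimator. It is an assumption-laden illustration: the estimator used in the study may differ, and the function names, window sizes, sequence length and synthetic sequences are all hypothetical. The sequences are also much denser in ones than a real mutation map would be.

```python
import math
import random
from statistics import fmean, pstdev


def rescaled_range(chunk):
    """R/S statistic of one window, or None if the window is constant."""
    mean, sd = fmean(chunk), pstdev(chunk)
    if sd == 0:
        return None
    cum = lo = hi = 0.0
    for x in chunk:
        cum += x - mean  # cumulative deviation from the window mean
        lo, hi = min(lo, cum), max(hi, cum)
    return (hi - lo) / sd


def hurst_rs(series, min_window=8):
    """Slope of log(mean R/S) against log(window size), doubling the window."""
    n = len(series)
    log_w, log_rs = [], []
    w = min_window
    while w <= n // 2:
        values = [rescaled_range(series[i:i + w]) for i in range(0, n - w + 1, w)]
        values = [v for v in values if v is not None]
        if values:
            log_w.append(math.log(w))
            log_rs.append(math.log(fmean(values)))
        w *= 2
    if len(log_w) < 2:
        raise ValueError("not enough informative window sizes")
    x_bar, y_bar = fmean(log_w), fmean(log_rs)
    sxx = sum((x - x_bar) ** 2 for x in log_w)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_w, log_rs))
    return sxy / sxx


def markov_binary(length, stay_prob, rng):
    """Binary sequence whose state persists with probability stay_prob."""
    state, out = rng.randint(0, 1), []
    for _ in range(length):
        out.append(state)
        if rng.random() > stay_prob:
            state = 1 - state
    return out


# Synthetic sequences: independent sites versus clustered runs of ones.
rng = random.Random(0)
random_seq = [rng.randint(0, 1) for _ in range(4096)]
clustered_seq = markov_binary(4096, 0.95, random.Random(1))

h_random = hurst_rs(random_seq)
h_clustered = hurst_rs(clustered_seq)
print(f"independent sites: H = {h_random:.2f}")
print(f"clustered sites:   H = {h_clustered:.2f}")
```

Windows that contain no variation are skipped, which matters for sparse mutation maps where most short windows are all zeros. R/S estimates are known to be biased upwards for short windows, so values slightly above 0.5 for independent sites are expected.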

Figure 8.

Persistent behaviour in the placement of mutations.

Hurst exponents (H) estimated from binary sequences (where 1 represents a mutant allele) for each population at each passage. Dashed line represents the 0.7 Hurst phenomenon threshold. Asterisks represent the significance of unpaired Wilcoxon tests: * p < 0.05.

4. Conclusions

This study investigates how viruses adapt at the molecular level when exposed to novel hosts. By reanalysing genomic data from an experimental evolution study of SINV under sudden versus gradual host-replacement regimes, we explored how environmental heterogeneity, genetic drift and selection shape temporal allele frequency fluctuations and the genome-wide distribution of mutations. Using two tools from the complex systems framework, Taylor’s Power Law and the Hurst exponent, we examined whether the observed fluctuations reflect random noise or more complex underlying dynamics.

Our findings highlight the pivotal role of environmental heterogeneity in viral adaptation. Gradual host replacement fosters more stable and convergent evolutionary trajectories, enhancing viral fitness across hosts. This regime promotes clonal interference, facilitating the fixation of linked beneficial mutations and producing temporally persistent mutation patterns across the genome. In contrast, sudden host transitions reduce effective population size, intensify genetic drift, destabilize allele frequencies and result in more dispersed mutation distributions along the viral genome. These results underscore how both the rate and nature of environmental change influence evolutionary outcomes.

The application of Taylor’s Power Law and the Hurst exponent provided insights into the scaling behaviour and temporal aggregation of allele frequency fluctuations. Neutral alleles exhibited higher variability, while beneficial alleles tended to reach fixation earlier. The interplay between genetic drift, selection and linkage effects shaped the spatial and temporal distributions of mutations. Gradual host replacement favoured aggregated mutation patterns due to hitchhiking of linked beneficial mutations. While our findings provide valuable insights into the evolutionary dynamics of SINV under different host-replacement regimes, we acknowledge that the relatively small sample sizes, particularly the limited number of replicate populations and the eight sequenced time points, may constrain the generalizability and statistical power of some analyses. This limitation is especially relevant when estimating Taylor’s parameters and Hurst exponents, or when stratifying alleles by selection category and occurrence frequency. As such, some subgroup trends should be interpreted with caution. Future studies incorporating denser temporal sampling and larger replicate numbers would enhance the robustness of these scaling analyses and allow for more nuanced exploration of stochastic versus deterministic forces in viral evolution.

Finally, while gradual host-replacement led to greater convergence on high-fitness generalist haplotypes, the more erratic evolutionary trajectories seen under sudden host transitions may, over the long term, facilitate the emergence of high-fitness haplotypes that are distant from local optima in sequence space. This may result from the persistence of neutral or mildly deleterious mutations and the broader genomic spread of mutations under this regime.

Overall, this study advances our understanding of virus evolution and offers a novel framework for predicting pathogen adaptation to dynamic environments, such as shifts in host availability or immune pressures. However, we recognize that these findings may not generalize across all viral systems. In natural populations, particularly those involving multicellular hosts, bottlenecks during host-to-host transmission, immune responses, and spatial structure can have pronounced effects on allele dynamics. Additionally, intrinsic factors such as mutation rates, recombination frequency and the number of accessible beneficial mutations vary widely across viruses and are likely to influence adaptive trajectories. While the principles uncovered here, such as the impact of environmental change rate on effective population size and allele frequency fluctuations, may extend to other RNA viruses, further studies across diverse viral systems and ecological contexts are needed to test the universality of these patterns. Finally, we acknowledge that the relatively small number of replicate populations and limited temporal resolution may constrain the statistical power of some analyses, particularly those involving Taylor’s parameters and Hurst exponents. Future work with denser sampling and larger experimental designs will be essential to validate and expand upon these findings.

Acknowledgements

The authors would like to thank Dr. Carlos P. Garay and Dr. José A. Oteo for very fruitful discussions on the meaning of Taylor’s Power Law and Hurst’s exponent in evolving complex systems.

Contributor Information

João M. F. Silva, Email: joaomarcos.fagundes@gmail.com.

María J. Olmo-Uceda, Email: mariajose.olmo@csic.es.

Valerie J. Morley, Email: valerie.morley@ginkgobioworks.com.

Paul E. Turner, Email: paul.turner@yale.edu.

Santiago F. Elena, Email: santiago.elena@csic.es.

Ethics

This work did not require ethical approval from a human subject or animal welfare committee.

Data accessibility

Raw sequencing data are available through the NCBI SRA project accession number SRP096731. Electronic supplementary material is available at the Zenodo repository [46].

Declaration of AI use

We have not used AI-assisted technologies in creating this article.

Authors’ contributions

J.M.F.S.: conceptualization, formal analysis, investigation, software, writing—original draft, writing—review and editing; M.J.O.-U.: conceptualization, investigation, methodology, writing—review and editing; V.J.M.: data curation, investigation; P.E.T.: data curation, investigation, supervision, writing—review and editing; S.F.E.: conceptualization, formal analysis, investigation, methodology, writing— original draft, writing—review and editing.

All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

Conflict of interest declaration

We declare we have no competing interests.

Funding

This work was supported by grant PID2022-136912NB-I00 funded by MCIN/AEI/10.13039/501100011033 and by ‘ERDF a way of making Europe’ and by Generalitat Valenciana grant CIPROM/2022/59 to S.F.E. M.J.O-U. was supported by grant FPU2019/05246 funded by MCIN/AEI/10.13039/501100011033 and by ‘ESF investing in your future’.

References

  • 1. Morley VJ, Mendiola SY, Turner PE. 2015. Rate of novel host invasion affects adaptability of evolving RNA virus lineages. Proc. R. Soc. B 282, 20150801. (10.1098/rspb.2015.0801)
  • 2. Morley VJ, Turner PE. 2017. Dynamics of molecular evolution in RNA virus populations depend on sudden versus gradual environmental change. Evolution 71, 872–883. (10.1111/evo.13193)
  • 3. Taylor LR. 1961. Aggregation, variance and the mean. Nature 189, 732–735. (10.1038/189732a0)
  • 4. Eisler Z, Bartos I, Kertész J. 2008. Fluctuation scaling in complex systems: Taylor’s law and beyond. Adv. Phys. 57, 89–142. (10.1080/00018730801893043)
  • 5. Fronczak A, Fronczak P. 2010. Origins of Taylor’s power law for fluctuation scaling in complex systems. Phys. Rev. E 81, 066112. (10.1103/physreve.81.066112)
  • 6. Kendal WS, Jørgensen B. 2011. Tweedie convergence: a mathematical basis for Taylor’s power law, 1/f noise, and multifractality. Phys. Rev. E 84, 066120. (10.1103/PhysRevE.84.066120)
  • 7. Lazzardi S, Valle F, Mazzolini A, Scialdone A, Caselle M, Osella M. 2023. Emergent statistical laws in single-cell transcriptomic data. Phys. Rev. E 107, 044403. (10.1103/physreve.107.044403)
  • 8. Martí JM, Martínez-Martínez D, Rubio T, Gracia C, Peña M, Latorre A, Moya A, P. Garay C. 2017. Health and disease imprinted in the time variability of the human microbiome. mSystems 2, e00144-16. (10.1128/msystems.00144-16)
  • 9. Silva JMF, Oteo JÁ, Garay CP, Elena SF. 2024. System and transcript dynamics of cells infected with severe acute respiratory syndrome virus 2 (SARS-CoV-2). PLOS Complex Syst. 1, e0000016. (10.1371/journal.pcsy.0000016)
  • 10. Hurst HE. 1951. Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–799. (10.1061/taceat.0006518)
  • 11. Mandelbrot BB, Wallis JR. 1969. Robustness of the rescaled range R/S in the measurement of noncyclic long run statistical dependence. Water Resour. Res. 5, 967–988. (10.1029/wr005i005p00967)
  • 12. Feder J. 1988. Fractals. New York, NY: Plenum Press. (10.1007/978-1-4899-2124-6)
  • 13. Blumm N, Ghoshal G, Forró Z, Schich M, Bianconi G, Bouchaud JP, Barabási AL. 2012. Dynamics of ranking processes in complex systems. Phys. Rev. Lett. 109, 128701. (10.1103/physrevlett.109.128701)
  • 14. Ghorbani M, Jonckheere EA, Bogdan P. 2018. Gene expression is not random: scaling, long-range cross-dependence, and fractal characteristics of gene regulatory networks. Front. Physiol. 9, 1446. (10.3389/fphys.2018.01446)
  • 15. Oteo JA, Oteo-García G. 2022. Mutations along human chromosomes: how randomly scattered are they? Phys. Rev. E 106, 064404. (10.1103/physreve.106.064404)
  • 16. Rice CM, Levis R, Strauss JH, Huang HV. 1987. Production of infectious RNA transcripts from Sindbis virus cDNA clones: mapping of lethal mutations, rescue of a temperature-sensitive marker, and in vitro mutagenesis to generate defined mutants. J. Virol. 61, 3809–3819. (10.1128/jvi.61.12.3809-3819.1987)
  • 17. Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10. (10.14806/ej.17.1.200)
  • 18. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760. (10.1093/bioinformatics/btp324)
  • 19. Gaidatzis D, Lerch A, Hahne F, Stadler MB. 2015. QuasR: quantification and annotation of short reads in R. Bioinformatics 31, 1130–1132. (10.1093/bioinformatics/btu781)
  • 20. Koboldt DC, Chen K, Wylie T, Larson DE, McLellan MD, Mardis ER, Weinstock GM, Wilson RK, Ding L. 2009. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics 25, 2283–2285. (10.1093/bioinformatics/btp373)
  • 21. Ferrer-Admetlla A, Leuenberger C, Jensen JD, Wegmann D. 2016. An approximate Markov model for the Wright–Fisher diffusion and its application to time series data. Genetics 203, 831–846. (10.1534/genetics.115.184598)
  • 22. Berner D. 2019. Allele frequency difference AFD—an intuitive alternative to FST for quantifying genetic population differentiation. Genes 10, 308. (10.3390/genes10040308)
  • 23. Borchers H. 2023. pracma: practical numerical math functions, version 2.4.42 [R package]. (10.32614/CRAN.package.pracma)
  • 24. Rouzine IM, Rodrigo A, Coffin JM. 2001. Transition between stochastic evolution and deterministic evolution in the presence of selection: general theory and application to virology. Microbiol. Mol. Biol. Rev. 65, 151–185. (10.1128/mmbr.65.1.151-185.2001)
  • 25. Kääriänen L, Ahola T. 2002. Functions of alphavirus nonstructural proteins in RNA replication. Prog. Nucleic Acids Res. Mol. Biol. 71, 187–222. (10.1016/S0079-6603(02)71044-1)
  • 26. Desai MM, Fisher DS. 2007. Beneficial mutation–selection balance and the effect of linkage on positive selection. Genetics 176, 1759–1798. (10.1534/genetics.106.067678)
  • 27. Rouzine IM, Coffin JM. 1999. Linkage disequilibrium test implies a large effective population number for HIV in vivo. Proc. Natl Acad. Sci. USA 96, 10758–10763. (10.1073/pnas.96.19.10758)
  • 28. Barlukova A, Rouzine IM. 2021. The evolutionary origin of the universal distribution of mutation fitness effect. PLoS Comput. Biol. 17, e1008822. (10.1371/journal.pcbi.1008822)
  • 29. Likhachev IV, Rouzine IM. 2023. Measurement of selection coefficients from genomic samples of adapting populations by computer modeling. STAR Protoc. 4, 101821. (10.1016/j.xpro.2022.101821)
  • 30. Gerrish PJ, Lenski RE. 1998. The fate of competing beneficial mutations in an asexual population. Genetica 102, 127–144. (10.1023/A:1017067816551)
  • 31. Smith JM, Haigh J. 1974. The hitch-hiking effect of a favourable gene. Genet. Res. 23, 23–35. (10.1017/S0016672308009579)
  • 32. Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, Botstein D, Desai MM. 2013. Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations. Nature 500, 571–574. (10.1038/nature12344)
  • 33. Buskirk SW, Peace RE, Lang GI. 2017. Hitchhiking and epistasis give rise to cohort dynamics in adapting populations. Proc. Natl Acad. Sci. USA 114, 8330–8335. (10.1073/pnas.1702314114)
  • 34. Flynn KM, Cooper TF, Moore FBG, Cooper VS. 2013. The environment affects epistatic interactions to alter the topology of an empirical fitness landscape. PLoS Genet. 9, e1003426. (10.1371/journal.pgen.1003426)
  • 35. Melbinger A, Vergassola M. 2015. The impact of environmental fluctuations on evolutionary fitness functions. Sci. Rep. 5, 15211. (10.1038/srep15211)
  • 36. Cervera H, Lalić J, Elena SF. 2016. Effect of host species on topography of the fitness landscape for a plant RNA virus. J. Virol. 22, 10160–10169. (10.1128/jvi.01243-16)
  • 37. Gillespie JH. 2000. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155, 909–919. (10.1093/genetics/155.2.909)
  • 38. Brunet É, Derrida B, Mueller AH, Munier S. 2007. Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization. Phys. Rev. E 76, 041104. (10.1103/physreve.76.041104)
  • 39. Cohen E, Kessler DA, Levine H. 2005. Recombination dramatically speeds up evolution of finite populations. Phys. Rev. Lett. 94, 098102. (10.1103/PhysRevLett.94.098102)
  • 40. Miralles R, Gerrish PJ, Moya A, Elena SF. 1999. Clonal interference and the evolution of RNA viruses. Science 285, 1745–1747. (10.1126/science.285.5434.1745)
  • 41. Park SC, Krug J. 2007. Clonal interference in large populations. Proc. Natl Acad. Sci. USA 104, 18135–18140. (10.1073/pnas.0705778104)
  • 42. Remold S. 2012. Understanding specialism when the jack of all trades can be the master of all. Proc. R. Soc. B 279, 4861–4869. (10.1098/rspb.2012.1990)
  • 43. Gillespie JH. 1984. Molecular evolution over the mutational landscape. Evolution 38, 1116–1129. (10.1111/j.1558-5646.1984.tb00380.x)
  • 44. Schiffels S, Szöllősi GJ, Mustonen V, Lässig M. 2011. Emergent neutrality in adaptive asexual evolution. Genetics 189, 1361–1375. (10.1534/genetics.111.132027)
  • 45. Good BH, Desai MM. 2014. Deleterious passengers in adapting populations. Genetics 198, 1183–1208. (10.1534/genetics.114.170233)
  • 46. Elena S. 2025. Host shifts and viral evolution: evidence of Taylor’s Law in allele frequency dynamics [dataset]. Zenodo. (10.5281/zenodo.14866795)

Articles from Journal of the Royal Society Interface are provided here courtesy of The Royal Society
