Virus Evolution. 2025 Jul 21;11(1):veaf054. doi: 10.1093/ve/veaf054

Immune pressure is key to understanding observed patterns of respiratory virus evolution in prolonged infections

Amber Coats 1, Yintong R Wang 2, Katia Koelle 3,4
PMCID: PMC12360705  PMID: 40831532

Abstract

Analyses of viral samples from prolonged SARS-CoV-2 infections as well as from prolonged infections with other respiratory viruses have indicated that there are several consistent patterns of evolution observed across these infections. These patterns include accelerated rates of nonsynonymous substitution, viral genetic diversification into distinct lineages, parallel substitutions across infected individuals, and heterogeneity in rates of antigenic evolution. Here, we use within-host model simulations to explore the drivers of these intrahost evolutionary patterns. Our simulations build on a tunably rugged fitness landscape model to first assess the role that mutations that impact only viral replicative fitness have in driving these patterns. We then further incorporate pleiotropic sites that jointly impact replicative fitness and antigenicity to assess the role that immune pressure has on these patterns. Through simulation, we find that the empirically observed patterns of viral evolution in prolonged infections cannot be robustly explained by viral populations evolving on replicative fitness landscapes alone. Instead, we find that immune pressure is needed to consistently reproduce the observed patterns. Moreover, our simulations show that the amount of antigenic change that occurs is higher when immune pressure is stronger and at intermediate immune breadth. While our simulation models were designed to shed light on drivers of viral evolution in prolonged infections with respiratory viruses that generally cause acute infection, their structure can be used to better understand viral evolution in other acutely infecting viruses such as noroviruses that can cause prolonged infection as well as viruses such as HIV that are known to chronically infect.

Keywords: prolonged viral infections, persistent infections, SARS-CoV-2, influenza A viruses, fitness landscape, within-host immune escape

Introduction

Many respiratory viruses, including influenza viruses and coronaviruses, typically cause acute infections that last less than two weeks (Bar-On et al., 2020, Einav et al., 2020). However, particularly in individuals who are immunocompromised or immunosuppressed, these viral infections can persist for much longer (Memoli et al., 2014, Baang et al., 2021, Dioverti et al., 2022, Machkovech et al., 2024). Sequencing of respiratory tract samples from individuals experiencing prolonged viral infections has revealed that a considerable amount of viral evolution can occur in these infections (Rogers et al., 2015, Xue et al., 2017, Borges et al., 2021, Chen et al., 2021, Kemp et al., 2021, Harari et al., 2022, Khatamzas et al., 2022, Ko et al., 2022, Nussenblatt et al., 2022, Quaranta et al., 2022, Scherer et al., 2022, Sonnleitner et al., 2022, Chaguza et al., 2023, Gonzalez-Reiche et al., 2023). Understanding these viral evolutionary dynamics is important for several reasons. First, these evolutionary dynamics can reveal whether the infecting virus has evolved to escape from a patient’s immune response or treatment regimen (McMinn et al., 1999, Rogers et al., 2015, Jensen et al., 2021, Kemp et al., 2021, Khatamzas et al., 2022, Scherer et al., 2022, Khosravi et al., 2023), and as such could help inform treatment strategies for the focal patient and more generally for individuals experiencing prolonged viral infections. Second, new viral variants that emerge at the host population level may originate from viruses that evolved in individuals experiencing prolonged infections, as has been discussed for SARS-CoV-2 (Berkhout and Herrera-Carrillo, 2022, Ghafari et al., 2022, Hill et al., 2022). Characterizing patterns of viral evolution within individuals with prolonged infections could therefore help in surveillance efforts at the level of the host population and efforts to anticipate phenotypes of forthcoming variants.

Many studies have described patterns of respiratory virus evolution within individuals with prolonged infections (Rocha et al., 1991, Rogers et al., 2015, Xue et al., 2017, Borges et al., 2021, Chen et al., 2021, Kemp et al., 2021, Harari et al., 2022, Khatamzas et al., 2022, Ko et al., 2022, Nussenblatt et al., 2022, Quaranta et al., 2022, Riddell et al., 2022, Scherer et al., 2022, Sonnleitner et al., 2022, Wilkinson et al., 2022, Chaguza et al., 2023, Gonzalez-Reiche et al., 2023, Rutsinsky et al., 2024). In these studies and others, four evolutionary patterns are frequently observed: (I) Nonsynonymous substitution rates tend to be higher than synonymous substitution rates (Choi et al., 2020, Harari et al., 2022, Chaguza et al., 2023, Markov et al., 2023) (Fig. 1A), particularly in viral genes that code for surface proteins; (II) Multiple co-circulating viral lineages often establish within individuals with prolonged infection (Rogers et al., 2015, Chaguza et al., 2023, Machkovech et al., 2024) (Fig. 1B); (III) Parallel viral substitutions often occur across individuals experiencing prolonged infection (Memoli et al., 2014, Xue et al., 2017, Wilkinson et al., 2022) (Fig. 1C); and (IV) The extent of antigenic evolution that is observed across individuals with prolonged infection is highly variable (Rocha et al., 1991, McMinn et al., 1999, Xue et al., 2017, Harari et al., 2022) (Fig. 1D).

Figure 1.


Evolutionary patterns of respiratory viruses observed in individuals with prolonged infection. (A) Nonsynonymous substitution rates generally exceed synonymous substitution rates. The panel shows nonsynonymous and synonymous viral substitutions that have accrued in an individual experiencing a prolonged SARS-CoV-2 infection. Sample days and substitutions are relative to the individual’s T0, Day 0 consensus sequence. Schematic reproduced from Choi et al. (2020) with permission from the Massachusetts Medical Society/New England Journal of Medicine. (B) Multiple distinct viral lineages often evolve within individuals with prolonged infections. (C) Parallel substitutions often arise across individuals with prolonged infection. The schematic shows consensus viral sequences from early and late infection timepoints of three infected individuals. Arrows highlight the frequently observed E484K substitution in the spike gene of SARS-CoV-2. (D) Rates of antigenic evolution are variable across individuals with prolonged infection. The schematic depicts heterogeneity in the extent of antigenic evolution that occurs over time in three individuals.

While these four patterns of respiratory virus evolution in individuals with prolonged infection are well established, we still lack a comprehensive understanding of their drivers. Here, we develop a simulation model for respiratory virus evolution within individuals with prolonged infection and simulate this model under various parameterizations to help shed light on possible drivers of within-host viral evolution in these longer-term infections. Specifically, we extend a tunably rugged fitness landscape model (Aita et al., 2000, Neidhart et al., 2014) to consider viral evolution at sites that are either synonymous or nonsynonymous, with the latter impacting either viral replicative fitness, antigenicity, or both replicative fitness and antigenicity (sites that we call pleiotropic sites). We simulate this model to identify features of the viral fitness landscape and characteristics of the immune response that can reproduce these four evolutionary patterns that have been empirically observed in prolonged respiratory virus infections. From these simulations, we find that immune pressure is key to consistently reproducing all four viral evolutionary patterns shown in Fig. 1.

Methods

The fitness landscape model

We model the viral genome as consisting of four different types of sites: synonymous sites (Inline graphic), phenotypic sites (Inline graphic), antigenic sites (Inline graphic), and pleiotropic sites (Inline graphic) (Fig. 2A). Mutations falling on synonymous sites are assumed to be silent and do not impact viral fitness. Mutations falling on phenotypic sites are nonsynonymous and impact phenotypes related to replicative fitness. Mutations falling on antigenic sites are nonsynonymous and impact only antigenicity. Mutations falling on pleiotropic sites are nonsynonymous and simultaneously impact both replicative fitness and antigenicity. The number of sites in each of these four classes is given by Inline graphic, Inline graphic, Inline graphic, and Inline graphic, respectively, with the total number of sites in the viral genome given by Inline graphic. Genotypes are modeled as bitstrings, with each site carrying one of two possible alleles: 0 or 1. As such, genotype space consists of Inline graphic genotypes. We determine the overall fitness of a given viral strain by taking into consideration its replicative fitness as well as its antigenic phenotype.
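As a minimal sketch of the genome structure described above, the four site classes can be stored as a label array alongside the bitstring genotype. The function names, the integer class codes, and the site counts used here are illustrative assumptions and are not taken from the study.

```python
import numpy as np

# Hypothetical integer codes for the four site classes described in the text.
SITE_CLASSES = ("synonymous", "phenotypic", "antigenic", "pleiotropic")

def make_site_map(n_syn, n_phen, n_anti, n_pleio):
    """Label each genome position with the index of its site class."""
    return np.repeat(np.arange(4), (n_syn, n_phen, n_anti, n_pleio))

def site_masks(site_map):
    """Boolean masks for sites affecting replicative fitness and antigenicity."""
    replicative = np.isin(site_map, (1, 3))  # phenotypic + pleiotropic sites
    antigenic = np.isin(site_map, (2, 3))    # antigenic + pleiotropic sites
    return replicative, antigenic

# Illustrative counts only (not the paper's parameterization).
site_map = make_site_map(n_syn=2, n_phen=3, n_anti=2, n_pleio=1)
replicative_mask, antigenic_mask = site_masks(site_map)
genotype = np.zeros(site_map.size, dtype=np.uint8)  # bitstring with alleles 0/1
n_genotypes = 2 ** site_map.size                    # size of genotype space
```

Because recombination is not modeled, the ordering of classes along the array is arbitrary.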

Figure 2.


Structure of the viral genome and projections of replicative fitness and antigenic fitness. (A) Structure of the viral genome, showing the 4 different types of sites: phenotypic (Inline graphic), pleiotropic (Inline graphic), antigenic (Inline graphic), and synonymous (Inline graphic). Site order is not important because we do not consider the process of recombination in our simulations of viral evolution. (B) Projection of the RMF fitness landscape given by equation (1), parameterized with a small Inline graphic to implement a highly rugged fitness landscape. Here, Inline graphic. (C) Projection of the RMF fitness landscape given by equation (1), parameterized with a Inline graphic of 0.2 to implement a semi-rugged fitness landscape. (D) Projection of the RMF fitness landscape with a large Inline graphic to implement a relatively smooth fitness landscape. Here, Inline graphic. In (B–D), Inline graphic is given by the bitstring of all ones and the genome length is set to Inline graphic. (E–G) Linearly transformed fitness landscapes, derived from panels (B–D), with parameter Inline graphic set to 100. The red lines in panels (B–G) show one genetic pathway consisting of single point mutations from the bitstring of all ones (Inline graphic; Hamming distance to Inline graphic) to the bitstring of all zeros (Hamming distance to Inline graphic). (H) Antigenic fitness of genotypes that are different antigenic distances away from the consensus genotype Inline graphic. The black line shows the antigenic fitness function parameterized with Inline graphic (and any value of Inline graphic between 0 and 1). The red line shows Inline graphic with Inline graphic and Inline graphic, corresponding to immune strength being weak and immune breadth being narrow. The red dashed line shows Inline graphic with Inline graphic and Inline graphic, corresponding to immune strength being weak and immune breadth being moderate. The blue line shows Inline graphic with Inline graphic and Inline graphic, corresponding to immune strength being strong and immune breadth being narrow. The blue dashed line shows Inline graphic with Inline graphic and Inline graphic, corresponding to immune strength being strong and immune breadth being moderate.

To quantify the replicative fitness of a viral genotype, we extend an existing fitness landscape model called the Rough Mount Fuji (RMF) model (Aita et al., 2000, Neidhart et al., 2014). The original model has two parameters: the length of the genome and a parameter Inline graphic that controls the ruggedness of the landscape. The landscape is static: the fitness of a genotype does not change over time. We therefore use the RMF landscape only for quantifying the replicative fitness component of a viral genotype. Because in our model replicative fitness depends only on the alleles present at Inline graphic and Inline graphic sites, the length of the genome we use for the RMF model is given by Inline graphic, resulting in a total of Inline graphic possible genotypes on the landscape. Fitness values in the RMF model are initially calculated according to the following equation (Neidhart et al., 2014):

graphic file with name DmEquation1.gif (1)

where Inline graphic denotes focal genotype Inline graphic and Inline graphic denotes a reference genotype. The first term, Inline graphic, is a product of the non-negative parameter Inline graphic and the Hamming distance Inline graphic between genotype Inline graphic and reference genotype Inline graphic. The Hamming distance is defined as the number of sites at which the two genotypes differ. The minimum value of Inline graphic is 0, which occurs when Inline graphic. The maximum value of Inline graphic is Inline graphic, which occurs when focal genotype Inline graphic differs from Inline graphic at every site that contributes to replicative fitness. The second term, Inline graphic, is a random variable that is drawn from a specified probability distribution. Here, we use a Gaussian distribution with mean 0 and standard deviation 1 to generate our Inline graphic random variables. This RMF model generates a tunably rugged fitness landscape, with the random variable Inline graphic contributing the ruggedness and the parameter Inline graphic modulating the extent of the ruggedness. When Inline graphic, the fitness landscape reduces to a House of Cards landscape, where the fitness value of a genotype is uncorrelated with the fitness values of the genotypes that surround it. At low Inline graphic (high ruggedness), epistatic interactions dominate. As Inline graphic gets larger, epistatic interactions become weaker and the fitness landscape becomes less rugged, reducing to a smooth Mount Fuji landscape as Inline graphic, with genotype Inline graphic being the genotype with highest fitness. Figure 2B–D project the RMF landscape for three different values of Inline graphic: Inline graphic, Inline graphic, and Inline graphic.
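The behavior of the RMF model at its two extremes can be illustrated with a short sketch. It assumes the standard RMF form from Neidhart et al. (2014), in which raw fitness equals the negative of the ruggedness parameter times the Hamming distance to the reference genotype, plus a standard Gaussian noise term. The function name, the 8-site genome, and the parameter values are assumptions chosen only to keep the full landscape enumerable.

```python
import itertools
import numpy as np

def rmf_landscape(n_sites, c, rng):
    """Raw RMF fitness for every bitstring of length n_sites.

    Assumed form: F(g) = -c * hamming(g, reference) + eta(g), with the
    reference genotype set to all ones and eta ~ Normal(0, 1).
    """
    genotypes = np.array(list(itertools.product((0, 1), repeat=n_sites)),
                         dtype=np.uint8)
    # With an all-ones reference, the Hamming distance is the number of zeros.
    hamming_to_ref = n_sites - genotypes.sum(axis=1).astype(float)
    noise = rng.normal(0.0, 1.0, size=len(genotypes))
    return genotypes, -c * hamming_to_ref + noise

rng = np.random.default_rng(0)
# Large c: the additive term dominates and the reference genotype is the peak.
genotypes, smooth = rmf_landscape(n_sites=8, c=50.0, rng=rng)
# c = 0: fitness is pure noise (House of Cards landscape).
_, house_of_cards = rmf_landscape(n_sites=8, c=0.0, rng=rng)
best_smooth = genotypes[np.argmax(smooth)]
```

With the large value of the ruggedness parameter used here, `best_smooth` is the all-ones reference genotype, whereas the location of the fittest genotype under `c=0.0` is determined entirely by the noise draws.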

Under this model, the range of values that the fitness landscape spans differs depending on the value of Inline graphic chosen. When Inline graphic is larger, the range of fitness values is larger (Fig. 2B–D). To be able to compare patterns of viral evolution across fitness landscapes of different ruggedness, we therefore modified the above RMF model such that the overall distribution of fitness values would be similar across Inline graphic values. We did this by linearly transforming the Inline graphic fitness values so that they fall between 0 and Inline graphic, where Inline graphic is a non-negative parameter, similar to the approach taken in Greene and Crona (2014). We denote the linearly-transformed replicative fitness value of genotype Inline graphic as Inline graphic. The equation used for linear transformation is: Inline graphic where the slope Inline graphic and the y-intercept Inline graphic. Because linear transformation may result in some genotypes (particularly those that are distant from the reference genotype) having negative fitness values, we further set the fitness values Inline graphic of all genotypes with negative fitness values to 0. Fig. 2E–G show the linearly-transformed fitness landscapes for the RMF landscapes shown in Fig. 2B–D, respectively, with Inline graphic set to 100. These linearly-transformed fitness landscape projections demonstrate that a lower value of the parameter Inline graphic results in a more rugged fitness landscape, with many local fitness peaks present across the landscape.

We allow antigenic changes from mutations occurring at either antigenic (Inline graphic) sites or pleiotropic (Inline graphic) sites to impact overall viral fitness. Specifically, we assume that the overall fitness of a viral genotype Inline graphic depends multiplicatively on its replicative fitness and its antigenic fitness:

graphic file with name DmEquation2.gif (2)

Here, antigenic fitness Inline graphic is a value that lies between 0 and 1. When Inline graphic is closer to 0, there is substantial immune pressure against genotype Inline graphic that results in its overall fitness being close to 0. As Inline graphic approaches 1, immune pressure against genotype Inline graphic is weaker and overall fitness is determined solely by replicative fitness. By modeling antigenic fitness in this manner, we are assuming that it modulates viral fitness by introducing a fitness cost. We model antigenic fitness using the function:

graphic file with name DmEquation3.gif (3)

where Inline graphic denotes the consensus genotype circulating in the viral population and Inline graphic denotes the Hamming distance between genotype Inline graphic and consensus genotype Inline graphic at exclusively antigenic and pleiotropic sites (notationally referred to here as Inline graphic sites). This function contains 2 parameters. Taking on values between 0 and 1, parameter Inline graphic can be thought of as a parameter that modifies the strength of immune pressure. When Inline graphic, the antigenic fitness of the consensus genotype (and all genotypes) is 1, reflecting an absence of immune pressure. As Inline graphic approaches 1, the antigenic fitness of the consensus genotype approaches 0, indicating that immune pressure is strong and considerably impacts overall viral fitness. Parameter Inline graphic can be thought of as a parameter that modifies the breadth of immunity. Also taking on values between 0 and 1, when Inline graphic is closer to 0, the breadth of the immunity is narrow, with genotypes that are only a single antigenic mutation away from the consensus genotype experiencing only very weak immune pressure (their Inline graphic values are close to 1). As Inline graphic approaches 1, immune protection becomes broader, with genotypes that are several antigenic mutations away from the consensus genotype still experiencing similar immune pressure to that of the consensus genotype. Figure 2H shows the antigenic fitness values of viral genotypes that are different antigenic distances away from the consensus genotype under different parameterizations of Inline graphic and Inline graphic.
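The qualitative behavior described above can be explored with a small sketch. The functional form used here, one minus the strength parameter times the breadth parameter raised to the antigenic distance, is an assumption chosen only because it reproduces the limits described in the text; it should not be read as equation (3) itself. The function and parameter names are likewise hypothetical.

```python
def antigenic_fitness(antigenic_distance, strength, breadth):
    """Hypothetical stand-in for equation (3).

    ASSUMED form: 1 - strength * breadth ** antigenic_distance, with both
    parameters in [0, 1]. It matches the limits described in the text:
    fitness is 1 everywhere when strength = 0, the consensus genotype
    (distance 0) has fitness 1 - strength, and a small breadth lets
    single-mutation variants escape almost completely.
    """
    return 1.0 - strength * breadth ** antigenic_distance

def overall_fitness(replicative, antigenic):
    """Multiplicative combination of the two components, as in equation (2)."""
    return replicative * antigenic

# Strong, narrow immunity: the consensus is heavily penalized and a single
# antigenic mutation restores most of the replicative fitness.
consensus = overall_fitness(80.0, antigenic_fitness(0, strength=0.9, breadth=0.1))
one_step = overall_fitness(80.0, antigenic_fitness(1, strength=0.9, breadth=0.1))
```

Under this assumed form, antigenic fitness increases monotonically with antigenic distance from the consensus genotype, so any trade-off arises only through the replicative fitness component.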

A mutation at a synonymous (Inline graphic) site that generates viral genotype Inline graphic does not impact either Inline graphic or Inline graphic, such that the overall fitness of the mutant is the same as its parent. A mutation at a phenotypic (Inline graphic) site impacts only Inline graphic. A mutation at an antigenic (Inline graphic) site impacts only Inline graphic. A mutation at a pleiotropic (Inline graphic) site impacts both Inline graphic and Inline graphic. Mutations at both Inline graphic and Inline graphic sites impact replicative fitness on the RMF landscape and thus interact epistatically with other Inline graphic and Inline graphic sites. The formulation of equation (2) allows for a trade-off between replicative fitness Inline graphic and immune escape (captured by Inline graphic) in that a mutation at a pleiotropic site may result in a viral genotype that is further away from the consensus genotype (thus enabling immune escape and increasing antigenic fitness Inline graphic) and at the same time result in lower replicative fitness Inline graphic. However, the formulation also allows for a mutation to simultaneously enable immune escape while increasing replicative fitness. This would occur if the mutation increases both Inline graphic and Inline graphic. Our model formulation thus does not impose a trade-off between replicative fitness and immune escape, but it allows for such a trade-off to occur. One would expect a trade-off to occur more frequently when the virus is already well-adapted to the host simply because a mutation impacting replicative fitness would have a higher chance of being deleterious if the virus was already well-adapted. Finally, it is important to note that, by the formulation of equation (3), we are assuming that immune escape is fleeting in that viral fitness with respect to antigenicity only depends on how different a genotype is from the consensus genotype at the present time and not the history of viral genotypes that have circulated over the course of infection.

Simulating within-host viral evolution

We simulate within-host viral evolution under the assumption of a fixed viral population size Inline graphic using Gillespie’s τ-leap algorithm (Gillespie, 2001). Initially, at the time of infection (Inline graphic), each viral particle is set to the same infecting genotype. We adopt this assumption to reflect the low levels of viral diversity at the start of infection that result from a tight transmission bottleneck (McCrone et al., 2018, Lythgoe et al., 2021, Martin and Koelle, 2021, Shi et al., 2024). We then update the evolving viral population from one time point to the next by determining which viral particles will die and which viral particles will reproduce over the next Inline graphic time increment. To determine which viral particles will die, we first draw a random number Inline graphic from a Poisson distribution with mean Inline graphic, where Inline graphic is the per capita viral death rate. We then choose Inline graphic viral particles at random and without replacement to be removed from the viral population. To determine which viral particles will reproduce, we draw Inline graphic viral particles (with replacement) in proportion to their overall fitness values Inline graphic. Progeny viruses inherit the genotype of their respective parents, plus any additional mutations that occur during their “birth.” The number of additional mutations that occur during the birth of a given viral particle is drawn from a Poisson distribution with mean Inline graphic, where Inline graphic is the per site per infection cycle mutation rate and Inline graphic is the total number of sites in the viral genome. A mutation results in a flip of the parental allele (from either 0 to 1 or 1 to 0). The sites at which mutations occur are selected at random from the viral genome. Once viral particle deaths and births have been simultaneously updated, time is incremented from Inline graphic to Inline graphic.
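One update of this scheme might be sketched as follows. Several details are assumptions rather than statements of the study's implementation: the mean number of deaths is taken to be the death rate times the population size times the time increment, the number of births is set equal to the number of deaths so that the population size stays fixed, parents are drawn from the population as it stood before the deaths, and all fitness values are assumed to be positive. The function name and the toy fitness function are hypothetical.

```python
import numpy as np

def tau_leap_step(population, fitness_fn, death_rate, mu, tau, rng):
    """Advance a fixed-size population (rows = genotypes) by one increment."""
    n_particles, n_sites = population.shape
    # Number of deaths in this increment (assumed mean: death_rate * N * tau).
    n_deaths = min(rng.poisson(death_rate * n_particles * tau), n_particles)
    if n_deaths == 0:
        return population
    dying = rng.choice(n_particles, size=n_deaths, replace=False)
    # Parents are drawn with replacement in proportion to overall fitness.
    fitness = np.array([fitness_fn(g) for g in population], dtype=float)
    parents = rng.choice(n_particles, size=n_deaths, replace=True,
                         p=fitness / fitness.sum())
    offspring = population[parents].copy()
    # Poisson-distributed number of mutations per birth; each flips one allele.
    n_mutations = rng.poisson(mu * n_sites, size=n_deaths)
    for child, k in zip(offspring, n_mutations):
        sites = rng.choice(n_sites, size=min(k, n_sites), replace=False)
        child[sites] ^= 1
    # Deaths and births are applied simultaneously.
    new_population = population.copy()
    new_population[dying] = offspring
    return new_population

rng = np.random.default_rng(1)
# Toy run: 200 particles, 20 sites, fitness rising with the number of ones.
population = np.tile(np.array([0, 1] * 10, dtype=np.uint8), (200, 1))
for _ in range(24):
    population = tau_leap_step(population, lambda g: 1.0 + g.sum(),
                               death_rate=4.0, mu=1e-3, tau=1 / 24, rng=rng)
```

The toy fitness function and parameter values are placeholders; in the model described here, `fitness_fn` would return the overall fitness given by equation (2).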

When viral genome sizes are very small, it is possible to calculate the replicative fitness values of each genotype prior to simulating viral evolution on the landscape. However, even with only 100 sites that impact replicative fitness, genotype space becomes too large to adopt this approach. When simulating viral evolution, we therefore dynamically allocate Inline graphic fitness values as strains are accessed through mutation over the course of a prolonged infection. Adopting this approach allows us to calculate and store a considerably smaller portion of viral genotype space.
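A minimal sketch of this kind of on-demand allocation is a cache keyed by genotype, so that the random component of a genotype's fitness is drawn once, when the genotype first appears, and reused thereafter. The class name is hypothetical, and the raw RMF form used (negative ruggedness parameter times Hamming distance to an all-ones reference, plus standard Gaussian noise) is an assumption based on the standard RMF model.

```python
import numpy as np

class LazyRMF:
    """Allocate raw RMF fitness values only for genotypes that are visited."""

    def __init__(self, c, n_sites, seed=0):
        self.c = c
        self.n_sites = n_sites
        self._rng = np.random.default_rng(seed)
        self._cache = {}

    def fitness(self, genotype):
        # Arrays are not hashable, so use the raw bytes of the bitstring.
        key = np.asarray(genotype, dtype=np.uint8).tobytes()
        if key not in self._cache:
            # All-ones reference: Hamming distance is the number of zeros.
            hamming_to_ref = self.n_sites - int(np.sum(genotype))
            self._cache[key] = -self.c * hamming_to_ref + self._rng.normal()
        return self._cache[key]

    def n_allocated(self):
        return len(self._cache)

landscape = LazyRMF(c=0.2, n_sites=100)
parent = np.ones(100, dtype=np.uint8)
mutant = parent.copy()
mutant[3] ^= 1
first = landscape.fitness(parent)
again = landscape.fitness(parent)   # cached: same value, no new draw
landscape.fitness(mutant)
```

Only two of the 2^100 possible genotypes are stored after these calls, which is the property that makes simulation on large genotype spaces feasible.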

Model parameterization

We simulated intrahost viral evolution using viral genomes that have a total of either Inline graphic or Inline graphic sites. Our choice of these genome sizes attempted to strike a balance between computational tractability and the ~660 base pair lengths of the receptor binding domains of SARS-CoV-2 and influenza A viruses. In our Inline graphic simulations, we let 315 of these sites be nonsynonymous and the remaining 85 sites be synonymous (Inline graphic). This ratio of 315:85 reflects the approximate ratio of 3.7 in nonsynonymous to synonymous sites observed for SARS-CoV-2 and influenza viruses. Of the 315 nonsynonymous sites, we consider in our simulations different scenarios for these sites being classified as phenotypic (Inline graphic), antigenic (Inline graphic), and pleiotropic (Inline graphic) sites. Similarly, in our Inline graphic simulations, we let 630 of the sites be nonsynonymous and the remaining 170 sites be synonymous (Inline graphic), again resulting in a nonsynonymous-to-synonymous ratio of 3.7. In all of our simulations, we set the parameter Inline graphic to 100, such that replicative fitness spans values from 0 to 100 (or a little higher for the most rugged landscapes considered; see Fig. 2E). Unless otherwise noted, we further set infecting genotypes to be ~50% adapted to the host by setting the reference genotype Inline graphic to all ones and letting infecting genotypes be bitstrings that contain exactly 50% ones and exactly 50% zeros at randomly chosen sites across their genomes.

We set the mutation rate to Inline graphic mutations per site per infection cycle. This mutation rate lies between the estimated mutation rate of Inline graphic mutations per site per cycle for coronaviruses (Bar-On et al., 2020, Amicone et al., 2022) and the estimated mutation rate of Inline graphic mutations per site per cycle for influenza A viruses (Pauly et al., 2017). At this mutation rate and with a genome length of Inline graphic, two or more mutations are expected to occur in less than 0.005% of replications. With a genome length of Inline graphic, two or more mutations are also expected to only rarely occur (in less than 0.02% of replications). As such, in our simulations, we only allow zero mutations (with probability Inline graphic) or one mutation (with probability Inline graphic) to occur during the process of viral replication. We set the generation time to 6 h, corresponding to a death rate of Inline graphic per day. We chose this generation time based on the 6–8 h generation time estimated for influenza viruses (Einav et al., 2020) and the 6–9 h generation time estimated for SARS-CoV-1 (Schneider et al., 2012, Bar-On et al., 2020). In all of our simulations, we calculate overall fitness values for genotypes using equation (2) and set the time step to Inline graphic hour.
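The reasoning behind restricting replication events to zero or one mutation can be checked with a short calculation of the Poisson probability of two or more mutations per replication. The mutation rate used below is a hypothetical placeholder, not the value used in the study, and the conversion from generation time to death rate assumes that the death rate is the reciprocal of the generation time.

```python
import math

def prob_two_or_more(mu, n_sites):
    """P(K >= 2) for K ~ Poisson(mu * n_sites) mutations per replication."""
    lam = mu * n_sites
    # 1 - P(0) - P(1), written with expm1 for accuracy at small lam.
    return -math.expm1(-lam) - lam * math.exp(-lam)

def death_rate_per_day(generation_time_hours):
    """Per capita death rate, assuming rate = 1 / generation time."""
    return 24.0 / generation_time_hours

mu_placeholder = 1e-5  # HYPOTHETICAL value, for illustration only
p_400 = prob_two_or_more(mu_placeholder, 400)
p_800 = prob_two_or_more(mu_placeholder, 800)
rate = death_rate_per_day(6.0)  # 6 h generation time -> 4.0 per day
```

For small expected mutation counts, the probability of two or more mutations is approximately half the square of the expected count, so doubling the genome length roughly quadruples it.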

Results

Viral population size modulates the strength of selection and genetic drift

We first simulated the model to confirm that it recovers the well-established evolutionary pattern that genetic drift dominates when (effective) population sizes are small and that selection acts more efficiently when population sizes are larger. To this end, we simulated viral evolution on a relatively smooth replicative fitness landscape (Inline graphic) for two different viral population sizes: Inline graphic and Inline graphic. The viral genome in these simulations consisted of 85 synonymous (Inline graphic) sites and 315 phenotypic (Inline graphic) sites. Supplementary Figure S1 shows the evolutionary dynamics of simulated viral populations over the time course of 3 years. When viral population sizes are small (Inline graphic), mean fitness of these populations does not consistently increase (Supplementary Fig. S1A), indicating a lack of viral adaptation. In contrast, when viral population sizes are larger (Inline graphic), mean population fitness consistently increases (Supplementary Fig. S1B). In small viral populations, genetic divergence accrues at a rate that is similar to that expected under neutral evolution (Supplementary Fig. S1C), whereas in large viral populations, genetic divergence accrues at a rate that exceeds that expected under neutral evolution (Supplementary Fig. S1D), again indicating the occurrence of viral adaptation in these larger populations. Finally, as one would expect, average pairwise genetic diversity levels are lower in the smaller viral populations (Supplementary Fig. S1E) than in the larger viral populations (Supplementary Fig. S1F). More interestingly, the small populations show patterns of genetic diversity that are consistent with expected levels of genetic diversity under neutral evolution (Supplementary Fig. S1E), whereas the large populations show patterns of genetic diversity that are substantially lower than those expected under neutral evolution (Supplementary Fig. S1F). This is consistent with positive selection acting to reduce genetic diversity in the large-Inline graphic simulations. Together, the results shown in Supplementary Fig. S1 indicate that viral population sizes modulate the strength of selection and genetic drift, with genetic drift dominating in small viral populations and selection dominating in large viral populations, as expected from population genetic theory.

Observed excess of nonsynonymous substitutions appears incompatible with viral evolution on static replicative fitness landscapes

We next used our model to explore the impact that fitness landscape ruggedness has on patterns of within-host viral evolution and adaptation, specifically focusing on what types of fitness landscapes could consistently reproduce the empirically observed pattern that nonsynonymous substitution rates generally exceed synonymous substitution rates in prolonged viral infections (Fig. 1A). To this end, we considered four fitness landscapes across a range of ruggedness from Inline graphic (highly rugged) to Inline graphic (smooth), each with an overall viral genome size of Inline graphic sites. In each case, we set the viral population size to Inline graphic based on findings that viral effective population sizes in prolonged infections are thought to be large (Xue et al., 2017, Lumby et al., 2020). We further initially considered only synonymous sites and sites impacting replicative fitness (Inline graphic, Inline graphic, Inline graphic, Inline graphic) and simulated viral evolution on each of the four fitness landscapes for one-year periods. Six independent replicates were simulated for each fitness landscape to be able to assess general trends.

On a highly rugged landscape (Inline graphic), viral populations rapidly adapted in the first 2 months following infection, from a mean fitness value of ~50 to a mean fitness value of ~80 (Fig. 3A). Following this initial period of adaptation, further increases in fitness were less pronounced and only occurred sporadically. Divergence from the infecting genotype increased rapidly during the initial period of adaptation but then tended to slow down (Fig. 3E), with divergence considerably lower than expected under neutral evolution during later time points. These results are consistent with initial movements of the viral populations to local higher-fitness peaks in the landscape, followed by the populations getting “trapped” in these local peaks, at least temporarily. Indeed, when we plot changes in these populations’ consensus sequences over time, we see that a small number of nonsynonymous substitutions occurred shortly following infection, but additional nonsynonymous substitutions were rare thereafter (Fig. 3I). In contrast, synonymous substitutions continued to accumulate over the year of simulation (Fig. 3I). Figure 3I further indicates that nonsynonymous viral substitution rates would be expected to be lower than synonymous ones in viral populations that have evolved on this rugged fitness landscape when calculated 6–12 months following infection.

Figure 3.


Patterns of viral adaptation observed across fitness landscapes of variable ruggedness. Columns correspond to simulations of viral populations on fitness landscapes of increasing smoothness, with the Inline graphic landscape implementing a highly rugged landscape (first column) and the Inline graphic landscape (last column) implementing a relatively smooth landscape. (A–D) Mean population fitness over the course of infection for six independent simulations. The dashed black line at 50 indicates the expected fitness of the infecting genotype. (E–H) Mean divergence from the infecting genotype for the same six populations. The dashed black line shows expected divergence under neutral evolution. (I–L) Number of nonsynonymous and synonymous substitutions in the six populations over time. For a given simulation, substitutions are calculated between the consensus sequence at a given time point and the infecting genotype. Red lines show nonsynonymous substitutions. Grey lines show synonymous substitutions. Substitutions are normalized by the number of nonsynonymous and synonymous sites, respectively, yielding a per-site number of substitutions. Dashed black line shows the expected number of substitutions per site under neutral evolution. All simulations used a viral genome of length Inline graphic, with Inline graphic and Inline graphic. Each infecting genotype had a Hamming distance of 200 from the reference genotype of all ones. Other parameters were: Inline graphic, Inline graphic mutations per site per infection cycle, Inline graphic, Inline graphic infection cycles per day.
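The per-site substitution counts described in this caption can be sketched as follows: take the majority allele at each site as the consensus, compare it with the infecting genotype, and normalize the number of differences by the number of sites in each class. The function name, the toy population, and the treatment of exact 50% ties (assigned allele 0) are assumptions made for illustration.

```python
import numpy as np

def per_site_substitutions(population, infecting_genotype, synonymous_mask):
    """Return (nonsynonymous, synonymous) substitutions per site.

    Substitutions are differences between the population consensus and the
    infecting genotype, normalized by the number of sites in each class.
    """
    # Majority allele per site; exact ties resolve to allele 0 (assumption).
    consensus = (population.mean(axis=0) > 0.5).astype(np.uint8)
    changed = consensus != infecting_genotype
    nonsyn_mask = ~synonymous_mask
    nonsyn = changed[nonsyn_mask].sum() / nonsyn_mask.sum()
    syn = changed[synonymous_mask].sum() / synonymous_mask.sum()
    return float(nonsyn), float(syn)

# Toy example: 4 particles, 6 sites; the first two sites are synonymous.
infecting = np.zeros(6, dtype=np.uint8)
synonymous_mask = np.array([True, True, False, False, False, False])
population = np.zeros((4, 6), dtype=np.uint8)
population[:, 0] = 1    # fixed synonymous change
population[:, 2] = 1    # fixed nonsynonymous change
population[:3, 3] = 1   # nonsynonymous change at 75% frequency
population[:, 4] = 1    # fixed nonsynonymous change
nonsyn_rate, syn_rate = per_site_substitutions(population, infecting,
                                               synonymous_mask)
```

In this toy example, three of the four nonsynonymous sites and one of the two synonymous sites differ from the infecting genotype at the consensus level.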

We next simulated viral evolution on a less rugged fitness landscape (Inline graphic). Mean fitness of the simulated viral populations also rapidly increased within the first 2 months of infection, but fitness levels only reached values of Inline graphic60 (Fig. 3B), rather than the fitness values of 80 that were observed on the more rugged fitness landscape of Inline graphic. This indicates that viral populations still got trapped, at least temporarily, in local fitness peaks on this less rugged landscape, despite the landscape being smoother. The remainder of the evolutionary patterns on the Inline graphic landscape are highly similar to those observed on the Inline graphic landscape: divergence tended to level off (Fig. 3F) and nonsynonymous substitutions occurred only early on in infection and then purifying selection dominated (Fig. 3J), resulting in fewer nonsynonymous substitutions than synonymous substitutions by 6–12 months post-infection. Patterns of viral evolution on an even smoother fitness landscape (Inline graphic) look similar to those on the Inline graphic landscape, with the only appreciable difference being that the initial periods of viral adaptation that occurred during the first 2 months of infection reached even lower fitness plateaus (55–60, rather than Inline graphic60). This pattern of lower fitness plateaus in higher-Inline graphic simulations makes sense in that local fitness peaks in higher-Inline graphic simulations will have fitness values similar to those of the infecting genotypes, whereas local fitness peaks in lower-Inline graphic simulations can have fitness values much higher than those of the infecting genotypes. Finally, on the smoothest fitness landscape considered (Inline graphic), fitness levels continued to increase over the simulated year-long infections (Fig. 3D). These continuous increases, however, only increased mean fitness slightly. 
Unlike the populations that evolved on the more rugged landscapes, divergence in the populations evolving on the smooth Inline graphic landscape continued to increase, consistent with neutral patterns of divergence (Fig. 3H). Nonsynonymous substitutions also continued to accrue, but only at rates similar to those of synonymous substitutions (Fig. 3L). This is because the fitness impacts of phenotype-impacting mutations on these smooth landscapes were small, approaching neutrality. Nevertheless, selection was still able to act (albeit slowly) on the viral populations evolving on these smooth landscapes, as indicated by the consistent increase in mean fitness in these populations (Fig. 3D). The sustained increase in mean fitness is possible because this smooth landscape lacks local fitness peaks in which the viral populations can temporarily get trapped. Together, our results indicate that simulations of viral evolution occurring on fitness landscapes of variable ruggedness do not recapitulate the empirically observed excess of nonsynonymous over synonymous substitutions.

Viral populations evolving on static replicative fitness landscapes do not diversify into substantively distinct lineages during their adaptation

While the results shown in Fig. 3 indicate that viral adaptation is expected to occur across fitness landscapes that differ in their ruggedness, they did not yield information on the extent to which the viral populations diversified throughout their adaptation. It is conceivable that the viral populations remained largely monomorphic, in each simulation traversing to a single local fitness peak. Alternatively, viral populations could have diversified, leading to the occupation of many different local fitness peaks. To determine which of these possibilities occurred, we serially sampled the simulated viral populations on a monthly basis and inferred time-aligned trees from these samples (Fig. 4). These time-aligned trees first indicate that viral turnover is generally observed, particularly in viral populations evolving on smoother (Inline graphic) fitness landscapes. Across all four fitness landscapes considered, the trees do not consistently show viral diversification into substantively distinct lineages. On the more rugged fitness landscapes (Inline graphic and Inline graphic), the most recent common ancestor of viruses sampled at 1-year post infection is sometimes the infecting genotype, but in other simulations, the time of the most recent common ancestor is only several months before this final time point sample. This pattern of recent common ancestry becomes more robust on smoother fitness landscapes (particularly Inline graphic). As such, these simulations do not consistently reproduce the pattern of viral lineage diversification observed in prolonged infections (Fig. 1B). These phylogenies further indicate that, while several fitness peaks may get “discovered,” the presence of fitness differences between these local peaks often results in viral populations ultimately persisting in only one of them.

Figure 4.


Phylogenies inferred from simulated viral populations evolving on fitness landscapes of variable ruggedness. Columns correspond to the fitness landscapes in Fig. 3, ranging from a highly rugged fitness landscape (Inline graphic, first column) to a smooth fitness landscape (Inline graphic, fourth column). Rows correspond to the six independent simulations shown in Fig. 3. For each simulation, a time-aligned phylogeny was inferred using a dataset that contained 130 sequences (10 sequences per time point sampled, with monthly sampling from Inline graphic to Inline graphic days). Time-aligned phylogenies were generated by first inferring neighbor-joining trees using the R packages Ape (Paradis and Schliep, 2019) and treeio (Wang et al., 2020) and then using treedater (Volz and Frost 2017) to time-align these neighbor-joining trees. Time-aligned trees were visualized using FigTree v1.4.4 (Rambaut, 2018). The time scale corresponds to the number of days following infection.

Parallel substitutions do not occur readily on static replicative fitness landscapes

We now address through simulation the question of whether static replicative fitness landscapes of various ruggedness can consistently reproduce the pattern of parallel substitutions that is frequently observed across individuals with prolonged infections (Fig. 1C). We again considered viral evolution on the four different fitness landscapes we considered in Fig. 3, and simulated viral evolution for 6 months. For each landscape, we started each of the six simulations off with the same infecting genotype that was Inline graphic50% adapted to the host and used the same fitness landscape for each of the simulations. As such, genotypes that had been accessed in a previous simulation and had their fitness values already dynamically allocated retained their fitness values across simulations.
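The idea of dynamically allocating fitness values and reusing them across simulations can be sketched as a cached lookup. This is only an illustration of the caching behaviour: the class name `LazyLandscape`, the Gaussian draw, and its mean and standard deviation are assumptions, and the independent draws here stand in for (and do not implement) the tunably rugged landscape model used in the paper.

```python
import random

class LazyLandscape:
    """Fitness values are drawn on first access and cached thereafter,
    so every simulation sharing this object sees the same landscape."""

    def __init__(self, seed=0, mean=50.0, sd=10.0):
        # mean and sd are hypothetical illustration values
        self._rng = random.Random(seed)
        self._cache = {}
        self.mean, self.sd = mean, sd

    def fitness(self, genotype):
        key = tuple(genotype)
        if key not in self._cache:
            # First visit to this genotype: allocate its fitness value.
            self._cache[key] = self._rng.gauss(self.mean, self.sd)
        return self._cache[key]

landscape = LazyLandscape(seed=1)
g = (1, 0, 1, 1)
first_run = landscape.fitness(g)    # allocated during a first simulation
second_run = landscape.fitness(g)   # reused, unchanged, by a later simulation
```

Sharing one landscape object across the six simulations is what makes any difference between the replicate populations attributable to stochastic evolution rather than to differences in the landscape.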

On a very rugged landscape (Inline graphic), the mean fitness of all six simulated viral populations increased, largely in the 2 months following infection (Fig. 5A). These dynamics are, as expected, consistent with the dynamics of mean fitness that were observed across different fitness landscapes parameterized with Inline graphic (Fig. 3A). Interestingly, the fitness levels at which the populations plateaued differed across the six simulations, despite the same underlying fitness landscape and the same infecting genotype. This indicates that the different viral populations may, at least temporarily, be residing in different nearby local fitness peaks. To explore this possibility, we identified, for each population, the set of sites that contained a nonsynonymous high-frequency allele that differed from the infecting genotype at 6 months post-infection, defining high-frequency as exceeding 20%. For each pair of individuals, we then determined the number of sites that were shared across their sets. On the Inline graphic landscape, we found that very few (if any) high-frequency mutations were shared between individuals (Fig. 5E). To understand these results, we can remember that highly rugged landscapes come close to a House of Cards landscape, where the fitness effect of a mutation depends almost entirely on its genetic context (i.e. epistatic interactions dominate fitness effects). As such, one would expect the first nonsynonymous substitution to impact the fitness effects of all other possible nonsynonymous substitutions. Parallel substitutions would therefore only likely be observed if the first substitution was the same one across individuals. Once different substitutions occurred, the viral populations across the different individuals would be expected to be on different evolutionary paths due to the extreme ruggedness of the fitness landscape.
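The pairwise comparison described above can be sketched as follows, assuming binary genomes in NumPy arrays and the 20% frequency threshold stated in the text. The function names and the example site sets are hypothetical.

```python
import numpy as np

def high_freq_sites(population, infecting, nonsyn_sites, threshold=0.2):
    """Nonsynonymous sites where the allele differing from the infecting
    genotype exceeds `threshold` in frequency."""
    mutated = population[:, nonsyn_sites] != infecting[nonsyn_sites]
    freq = mutated.mean(axis=0)
    return set(np.asarray(nonsyn_sites)[freq > threshold].tolist())

def shared_matrix(site_sets):
    """Entry (i, j) counts sites shared by individuals i and j; the diagonal
    gives each individual's own number of high-frequency sites."""
    n = len(site_sets)
    return np.array(
        [[len(site_sets[i] & site_sets[j]) for j in range(n)] for i in range(n)]
    )

# Hypothetical site sets for three individuals.
sets = [{3, 17, 42}, {17, 80}, {5}]
m = shared_matrix(sets)
```

In this toy example, only the first two individuals share a site (site 17), so the single nonzero off-diagonal count is the one between them; a matrix dominated by zeros off the diagonal corresponds to the lack of parallel substitutions reported in Fig. 5E.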

Figure 5.


Patterns of viral evolution and parallel mutations on identical static replicative fitness landscapes. In each column, all six viral populations evolve on the same static fitness landscape, starting from the same infecting genotype. (A–D) Changes in mean population fitness of six viral populations evolving on fitness landscapes of various ruggedness: Inline graphic (A), Inline graphic (B), Inline graphic (C), and Inline graphic (D). (E–H) The number of shared, high-frequency nonsynonymous mutations observed across pairs of individuals at time Inline graphic years. High-frequency was defined as Inline graphic20%. Cells along the diagonal show the number of high-frequency nonsynonymous mutations identified in each individual at time Inline graphic years. Model parameters are: Inline graphic, Inline graphic mutations per site per infection cycle, Inline graphic, Inline graphic replications per day, Inline graphic and Inline graphic. In Supplementary Fig. S2, we further considered an alternative definition of what constitutes a high-frequency mutation, defining it as any mutation in an individual that reached a frequency of 20% or higher at any point in time during the 6-month course of the individual’s infection. Parallel mutations were then again considered those that occurred across pairs of individuals. The results using this alternative definition are qualitatively similar to those shown in (E)–(H): few, if any, parallel mutations are observed across pairs of individuals.

On a semi-rugged landscape parameterized with Inline graphic, we see that fitness increases to similar levels across the individual infections (Fig. 5B). However, these fitness increases appear to largely stem from different nonsynonymous substitutions, given little-to-no sharing of nonsynonymous high-frequency mutations (Fig. 5F). Viral evolution on even smoother fitness landscapes again results in little to no sharing of nonsynonymous variation (Fig. 5G and H). These results make sense in that, on smoother landscapes, beneficial mutations all have similar fitness effects, such that there are many different routes to higher fitness.

Simulated patterns of within-host viral evolution are largely robust to different viral genome sizes

Together, our results shown in Figs 3–5 indicate that adaptation is readily observed in simulated viral populations evolving on fitness landscapes of variable ruggedness. On rugged landscapes, adaptation occurs rapidly and then evolution is dominated by purifying selection, such that nonsynonymous substitution rates do not exceed synonymous ones in the long term. On smooth landscapes, adaptation continues to occur, but fitness differentials are so small that nonsynonymous substitution rates are similar to synonymous ones in the long term. As such, viral evolution on these static fitness landscapes did not consistently reproduce the excess of nonsynonymous substitutions observed in prolonged infections. Our simulations on these static landscapes further indicate that co-circulating viral lineages do not robustly evolve and that parallel mutations are not readily observed on any of the considered fitness landscapes. All of these results, however, were based on simulations with a viral genome of size Inline graphic sites. To assess the robustness of these results, we further considered a viral genome with Inline graphic sites. Simulations with this larger genome size were generally consistent with the results with the Inline graphic viral genome. On rough fitness landscapes, Inline graphic viral populations exhibited analogous nonsynonymous and synonymous substitution patterns (Supplementary Fig. S3), an analogous lack of substantive lineage diversification (Supplementary Fig. S4) and an analogous lack of parallel substitutions across individuals experiencing prolonged infection (Supplementary Fig. S5). On smoother fitness landscapes, Inline graphic viral populations again exhibited steady, albeit slow, rates of adaptation (Supplementary Fig. S3). Viral divergence was slightly higher than expected under neutrality, and the rate of nonsynonymous substitutions slightly exceeded that of synonymous substitutions (Supplementary Fig. S3), indicating that this empirical pattern may be reproducible on a smooth fitness landscape if the viral genome size is sufficiently large, provided that a considerable fraction of mutations increase fitness. However, substantive lineage diversification still was not consistently reproduced (Supplementary Fig. S4) and parallel substitutions across individuals experiencing prolonged infection were still not observed (Supplementary Fig. S5) on smooth fitness landscapes with viral genome sizes of Inline graphic.

Because these viral simulations (with genome sizes of Inline graphic or Inline graphic) could not consistently reproduce the observed patterns of within-host viral evolution, we next considered the role that mutations that impact antigenicity may play in reproducing the empirical patterns shown in Fig. 1. Specifically, we consider the impact of pleiotropic sites on these patterns, where mutations at these sites impact both replicative fitness and antigenicity.

Interindividual variation in rates of antigenic evolution can be explained by differences in the strength and breadth of immune pressure

To assess the impact that pleiotropic sites have on patterns of within-host viral evolution in prolonged infections, we assume a semi-rugged fitness landscape of Inline graphic, again starting with a viral genome of size Inline graphic with 315 nonsynonymous sites and Inline graphic synonymous sites. However, we now assume that a subset of the nonsynonymous sites impact only replicative fitness (Inline graphic) while the remainder of the nonsynonymous sites impact both replicative fitness and antigenicity (Inline graphic). We arrived at this partition between phenotypic sites and pleiotropic sites based roughly on the proportion of amino acid residues in the receptor binding domain of SARS-CoV-2’s spike gene that impact antigenicity (Greaney et al. 2022). In our simulations, we consider different strengths of immune pressure as well as different breadths of the immune response. Our first simulations consider variation in the strength of immune pressure, modified by varying the value of parameter Inline graphic in equation (3). Figure 6 shows simulations at four different values of Inline graphic, ranging from Inline graphic (no immune pressure) to Inline graphic (very strong immune pressure). For all of these simulations we kept immune breadth constant at a moderate level of Inline graphic. The strength of immune pressure (Inline graphic) implemented under these scenarios is shown graphically in Supplementary Fig. S6.

Figure 6.


The strength of the immune response impacts patterns of within-host viral evolution in prolonged infections. Columns correspond to varying strengths of the immune response: no immune response (Inline graphic), low strength (Inline graphic), medium strength (Inline graphic), and high strength (Inline graphic). All simulations set the breadth of the immune response to Inline graphic. (A–D) Extent of antigenic evolution over the course of infection for six simulations. Antigenic evolution was calculated as divergence between the consensus genotype at a given time point and the infecting genotype, at the subset of sites that impact antigenicity (Inline graphic and Inline graphic sites). (E–H) Number of nonsynonymous (red) and synonymous (grey) substitutions per site over the course of infection. Dashed black line shows the expected number of substitutions under neutral evolution. (I–L) Mean viral replicative fitness over the course of infection. The horizontal dashed line shows the expected fitness of the infecting genotype. (M–P) Time-aligned phylogenies for the simulations shown in black in the above panels. The time scale corresponds to the number of days following infection. Simulations were performed using a viral genome of length Inline graphic, with Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Other parameters are: Inline graphic, Inline graphic mutations per site per infection cycle, Inline graphic, Inline graphic, and Inline graphic infection cycles per day.

In the absence of immune pressure (Inline graphic), antigenic divergence (realized through mutations occurring at pleiotropic sites) remained low (Fig. 6A), as one might expect given the lack of immune pressure acting on these simulated viral populations. In the presence of immune pressure (Inline graphic), antigenic divergence increased over the simulated infections, with higher rates of antigenic divergence observed with stronger immune pressure (Fig. 6B-D). In all cases, there is an apparent leveling-off, or plateauing, of antigenic divergence. We return to this intriguing pattern below. While our results show that there is some variation in the amount of antigenic evolution that occurs within simulations that are parameterized with the same strength of immune pressure, there is considerably more variation in the amount of antigenic evolution observed across simulations that differ in the strength of immune pressure. Observed interindividual heterogeneity in the rate of antigenic evolution could thus be explained by variation across individuals in the strength of their immune response, with individuals that exert higher immune pressure giving rise to viral populations that have undergone more antigenic evolution. Of note, these simulations do not consider the impact of immune pressure on viral population sizes; with higher immune pressure, it could be the case that viral population sizes are reduced, which would decrease the efficiency of selection and therefore may reduce the rate of within-host antigenic evolution, as has been suggested previously (Grenfell et al., 2004).

Figure 7 considers the impact that the breadth of the immune response has on the rate of antigenic evolution. We parameterize the simulations for this figure using different values of Inline graphic (see equation (3)), ranging from Inline graphic (very narrow immune breadth) to Inline graphic (broad immune breadth). In these simulations, we keep the strength of immune pressure the same across simulations at a moderate value of Inline graphic. The extent of immune pressure (Inline graphic) adopted under these scenarios is shown graphically in Supplementary Fig. S7. In simulations with narrow immune breadth (Inline graphic), antigenic evolution occurred, but only at a very slow rate (Fig. 7A). As immune breadth increased (higher Inline graphic), the rate of antigenic evolution increased (Fig. 7B and C). These results make sense in that viral genotypes that are more than a single antigenic mutation away from the consensus genotype have incrementally higher antigenic fitness (Inline graphic) when immune breadth is intermediate, thereby facilitating antigenic divergence. As immune breadth increases further (toward Inline graphic), the fitness differential conferred by antigenicity-impacting mutations is reduced. Interestingly, we therefore see a reduced rate of antigenic evolution at very broad immune breadths (Fig. 7D). In Fig. 7C and D, there is again an apparent leveling-off, or plateauing, of antigenic divergence, which we return to below. Our results indicate that observed interindividual heterogeneity in the rate of antigenic evolution could be explained by variation across individuals in the breadth of their immune response, with individuals that have an intermediate immune breadth expected to give rise to viral populations that have undergone the most antigenic evolution. Again, these conclusions are based on results that assume that viral population sizes are not impacted by the breadth of the immune response.

Figure 7.


The breadth of the immune response impacts patterns of within-host viral evolution in prolonged infections. Columns correspond to varying breadths of the immune response: very narrow breadth (Inline graphic), low breadth (Inline graphic), medium breadth (Inline graphic), and broad breadth (Inline graphic). All simulations assumed a moderate strength of immune pressure (Inline graphic). (A–D) Extent of antigenic evolution over the course of infection for six simulations. (E–H) Number of nonsynonymous (red) and synonymous (grey) substitutions per site over the course of infection. Dashed black line shows the expected number of substitutions under neutral evolution. (I–L) Mean viral replicative fitness over the course of infection. The horizontal dashed line shows the expected fitness of the infecting genotype. (M–P) Time-aligned phylogenies for the simulations shown in black in the above panels. The time scale corresponds to the number of days. Simulations were performed using a viral genome of length Inline graphic, with Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Other parameters are: Inline graphic, Inline graphic mutations per site per infection cycle, Inline graphic, Inline graphic, and Inline graphic infection cycles per day.

In sum, our results shown in Figs 6A–D and 7A–D indicate that differences between individuals in the strength and/or breadth of their immune response can reproduce heterogeneities in the extent of antigenic evolution that are observed across individuals experiencing prolonged infections (Fig. 1D). To determine whether these results are robust to changes in viral genome size, we again simulated viral evolution under the same immune escape parameterizations, this time with a viral genome that had Inline graphic sites, keeping the same proportions of phenotypic, pleiotropic, antigenic, and synonymous sites as for our previous simulations with Inline graphic sites. Patterns of antigenic evolution in these simulations displayed the same overall trends as those observed in Figs 6 and 7, with more antigenic evolution observed when the strength of the immune response was stronger (Supplementary Fig. S8) and the highest amount of antigenic evolution observed when the breadth of the immune response was moderate (Supplementary Fig. S9). However, because of the larger number of sites that impacted antigenicity in the Inline graphic simulations, the overall levels of antigenic divergence tended to be larger in the Inline graphic simulations than in the Inline graphic simulations.

We next wanted to determine whether and to what extent the simulated patterns of antigenic evolution would be robust to whether the antigenicity-impacting mutations affected only antigenicity or whether they affected both antigenicity and replicative fitness (i.e. whether the antigenicity-impacting sites were antigenic (Inline graphic) sites or pleiotropic (Inline graphic) sites). To address this question, we simulated the model with our original Inline graphic site viral genomes, under the same parameterizations as those in Figs 6 and 7, with the only difference being that the Inline graphic pleiotropic sites were replaced with Inline graphic antigenic sites (Supplementary Figs S10 and S11). Simulations of this model yielded similar patterns of antigenic evolution to those shown in Figs 6 and 7, with the greatest amount of antigenic evolution occurring when immune pressure was strong (high Inline graphic, Supplementary Fig. S10A–D) and when immune breadth was intermediate (intermediate Inline graphic, Supplementary Fig. S11A–D). A closer comparison of Supplementary Fig. S10 and Fig. 6, however, reveals that when the antigenicity-impacting sites also impact viral replicative fitness (i.e. when Inline graphic and Inline graphic; Fig. 6A–D), the rate of antigenic evolution is slower than when antigenicity-impacting sites only impact antigenicity (i.e. when Inline graphic and Inline graphic; Supplementary Fig. S10A–D). This is likely because antigenic evolution is impeded by the simultaneous impact these mutations have on viral replicative fitness. Similarly, a closer comparison of Supplementary Fig. S11 and Fig. 7 reveals that when the antigenicity-impacting sites also impact viral replicative fitness (i.e. when Inline graphic and Inline graphic; Fig. 7A–D), the rate of antigenic evolution is slower than when antigenicity-impacting sites only impact antigenicity (i.e. when Inline graphic and Inline graphic; Supplementary Fig. S11A–D). 
These results, at the within-host level, are consistent with findings from a recent study that showed that pleiotropic effects can constrain the antigenic evolution of influenza viruses at the population level (Yu et al., 2025).

Immune pressure increases nonsynonymous substitution rates

We now revisit the patterns of viral evolution shown in Fig. 1A–C to assess whether our above simulations with antigenicity-impacting sites can consistently reproduce these patterns. We do this by returning to the Inline graphic simulations, with Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Figure 6E–H indicate that when the strength of the immune response is greater (larger Inline graphic), nonsynonymous substitution rates increase and exceed synonymous substitution rates. This result makes sense in that mutations at pleiotropic sites can and often do yield viral genotypes that have a selective advantage in the presence of immune pressure, with the selective advantage being larger with stronger immune pressure. Figure 7E–H indicate that when immune breadth is at an intermediate level (moderate Inline graphic), nonsynonymous substitution rates are higher than they are at either narrow breadth or broad breadth, and again exceed synonymous substitution rates. This result similarly makes sense in that mutations at pleiotropic sites again yield viral genotypes that experience the largest selective advantage when immune breadth is at an intermediate level.

While immune escape, occurring through mutations at pleiotropic sites, consistently increases nonsynonymous substitution rates, its impact on viral replicative fitness is varied. When infecting genotypes are of intermediate fitness (Inline graphic50% adapted), mean replicative fitness increases similarly in simulations without immune pressure (Fig. 6I) and in simulations with immune pressure of various strengths and breadths (Figs 6J–L and 7I–L). Under these scenarios, therefore, immune escape neither impedes nor facilitates viral phenotypic adaptation. The presence of host immune pressure, however, does impact viral replicative fitness when the infecting genotype is either poorly adapted or well adapted to the host. When the infecting genotype is poorly adapted to the host, stronger immune pressure tends to facilitate viral phenotypic adaptation, with mean fitness levels increasing to higher levels than in the absence of immune pressure (Supplementary Fig. S12). In contrast, when the infecting genotype is well adapted to the host, stronger immune pressure tends to impede viral phenotypic adaptation, with mean fitness levels evolving to be lower than in the absence of immune pressure (Supplementary Fig. S13). These results make sense in terms of immune pressure driving antigenic evolution. When antigenic evolution occurs at pleiotropic sites, these evolutionary dynamics increase mean replicative fitness if the replicative fitness effects of these mutations are on average positive, which they are if the infecting genotype is poorly adapted. In contrast, these evolutionary dynamics decrease mean replicative fitness if the replicative fitness effects of these mutations are on average negative, which they are if the infecting genotype is well adapted. As such, we would not expect immune pressure to either always facilitate or always impede viral adaptation unrelated to antigenicity. The net impact of immune pressure on viral adaptation unrelated to antigenicity is expected to depend on the average replicative fitness impact of mutations at pleiotropic sites.

Immune pressure results in multiple co-circulating viral lineages

In the absence of sites impacting antigenicity, our simulations did not robustly reproduce empirically observed patterns of co-circulating viral lineages. However, in the presence of immune pressure, our simulations consistently yielded two, and occasionally three, co-circulating viral lineages (Figs 6N–P and 7M–P). This was the case for even the weakest immune strength we considered (Inline graphic; Fig. 6N) and for even relatively broad immune breadth (Inline graphic; Fig. 7P). These results make sense in that immune pressure, as implemented, results in negative frequency-dependent selection. As such, the consensus genotype is expected to shift back and forth between the two clades, with the viral clade that does not include the consensus genotype having a selective advantage.
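The back-and-forth shifting of the consensus between two clades can be illustrated with a deliberately simple toy model. This is not the antigenic fitness function used in the simulations (which is defined in the Methods); it assumes only that whichever clade does not contain the consensus gains a fixed, hypothetical selective advantage `s`, and it ignores mutation and drift.

```python
def simulate_two_clades(p0=0.9, s=0.2, generations=50):
    """Track the frequency p of clade A under a toy escape advantage s."""
    p, trajectory = p0, [p0]
    for _ in range(generations):
        # The clade holding the consensus (the majority) is the immune target;
        # the other clade receives the escape advantage s.
        w_a, w_b = (1.0, 1.0 + s) if p > 0.5 else (1.0 + s, 1.0)
        # Standard deterministic selection update for two competing types.
        p = p * w_a / (p * w_a + (1.0 - p) * w_b)
        trajectory.append(p)
    return trajectory

traj = simulate_two_clades()
# Count how often the majority (consensus-holding) clade changes.
switches = sum((a > 0.5) != (b > 0.5) for a, b in zip(traj, traj[1:]))
```

Under these assumptions, neither clade is driven to extinction: the initially dominant clade declines until it loses the majority, after which the two clades hover near equal frequency and the majority changes hands repeatedly.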

The co-circulation of viral lineages that is observed in the presence of immune pressure also helps us understand the seemingly jagged nonsynonymous and synonymous substitution rates apparent in Figs 6F–H and 7E–H. These occur because the co-circulating clades have different numbers of nonsynonymous as well as synonymous substitutions. The jaggedness arises from the consensus genotype rapidly changing from being in one clade to being in another clade.

Finally, the co-circulation of viral lineages in these simulations also helps us understand the plateauing of antigenic divergence that is observed in Figs 6B–D and 7B–D. This plateauing results from the diminishing returns of immune escape mutations combined with negative frequency-dependent selection. More specifically, if an immune escape mutation occurs in a viral genome that is in the non-dominant clade (i.e. in the clade that does not contain the consensus sequence), then this mutation would not appreciably increase antigenic fitness (Inline graphic) because the mutant virus and its parent would both already be antigenically far away from the consensus genotype. If an immune escape mutation occurs in a viral genome that is instead in the dominant clade, then this mutation would increase its antigenic fitness, but only while that clade remains dominant. As such, once two (or more) antigenically divergent lineages are co-circulating, immune pressure for new antigenic mutations should be reduced, resulting in a leveling-off of antigenic divergence. Note that this expectation may be specific to the way we model antigenic fitness; if we did not model immune escape as “fleeting” (see Methods), but instead assumed that viral antigenic fitness depended in some way on the history of viral genotypes that have circulated over the course of infection, we might not expect antigenic divergence to plateau.

Immune pressure increases the likelihood of observing parallel substitutions

Finally, we examined the impact of immune pressure on the occurrence of parallel mutations across individuals. To compare against simulations that included immune pressure, we first simulated a model with no immune pressure by setting the strength of immune pressure to Inline graphic. Consistent with our previous results, mean replicative fitness in each individual increased over the 6 months of simulation (Fig. 8A). Next, we calculated the number of parallel high-frequency mutations that were shared across individuals at time Inline graphic years, as we did for Fig. 5. We again considered only nonsynonymous sites and defined high-frequency mutations as those that exceeded a frequency of 20%. Figure 8F again shows that in the absence of immune pressure (Inline graphic), parallel mutations do not readily occur. (The results shown in Fig. 8F are quantitatively similar to those in Fig. 5G, but the scale bar used is different, such that the results in Fig. 8F can be compared against those of Fig. 8G–J.) In contrast, in the presence of immune pressure, parallel mutations are more readily observable (Fig. 8G–J). This is particularly the case when immune pressure is strong and the breadth of the immune response is moderate (Fig. 8J), because the rate of antigenic evolution is largest under this parameterization (Fig. 8O versus Fig. 8K–N).

We hypothesized that parallel mutations were more likely to be observed in the presence of immune pressure because relatively few sites impacted antigenicity, and even fewer of these sites impacted antigenicity without decreasing replicative fitness. To test this hypothesis, we simulated the model under a similar parameterization, only changing the 48 pleiotropic (Inline graphic) sites into antigenic (Inline graphic) sites. Supplementary Fig. S15 shows that simulations of this model generate similar patterns to those shown in Fig. 8F–J, with somewhat higher frequencies of observed parallel mutations in the simulations with antigenic (Inline graphic) sites rather than pleiotropic (Inline graphic) sites. The greater number of parallel mutations in Supplementary Fig. S15 arises because the overall rate of antigenic change is higher when the evolution of antigenicity-impacting mutations is not constrained by pleiotropic effects. As such, pleiotropic effects tend to decrease the number of observed parallel mutations due to the constraints they place on the rate of antigenic evolution. To further determine whether parallel mutations occur as a result of a small number of antigenicity-impacting sites, we next performed model simulations with a much larger number of antigenicity-impacting mutations, while keeping the viral genome size at Inline graphic sites: Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Supplementary Figure S16 shows that simulations of this model, even in the presence of strong immune pressure, do not result in a large number of parallel mutations. This indicates that, indeed, the common occurrence of parallel mutations observed in Fig. 8 in the presence of immune pressure is due to the relatively small number of sites that impact antigenicity and the strength of selection for antigenic change in these simulations.
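The pairwise count of shared high-frequency mutations described in this section can be illustrated with a minimal Python sketch. It is not taken from the authors' simulation code: the function names, the dictionary layout (nonsynonymous site index mapped to mutant frequency at the final time point), and the example frequencies are all hypothetical, and mutations are assumed to be identified by site index alone.

```python
from itertools import combinations


def high_frequency_mutations(freqs, threshold=0.20):
    """Return the set of sites whose mutant frequency exceeds the threshold."""
    return {site for site, f in freqs.items() if f > threshold}


def shared_mutation_counts(individuals, threshold=0.20):
    """Count shared high-frequency mutations for every pair of individuals.

    `individuals` maps an individual's label to a dict of
    {nonsynonymous site index: mutant frequency} (hypothetical layout).
    """
    hf = {name: high_frequency_mutations(f, threshold)
          for name, f in individuals.items()}
    return {(a, b): len(hf[a] & hf[b])
            for a, b in combinations(sorted(hf), 2)}


# Hypothetical mutant frequencies at the final sampled time point.
example = {
    "ind1": {12: 0.85, 40: 0.30, 77: 0.05},
    "ind2": {12: 0.60, 40: 0.10, 91: 0.45},
    "ind3": {12: 0.25, 91: 0.22, 77: 0.21},
}

counts = shared_mutation_counts(example)
for pair, n_shared in counts.items():
    print(pair, n_shared)
```

In this toy input, site 12 exceeds 20% in all three individuals and site 91 in two of them, so the pair `("ind2", "ind3")` shares two high-frequency mutations while the other pairs share one.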

Figure 8.

Immune pressure increases the frequency of parallel mutations observed across individuals. Columns correspond to different parameterizations of the immune response (depicted in Fig. 1H). Column 1: no immune pressure (Inline graphic). Column 2: weak immune strength (Inline graphic) and narrow immune breadth (Inline graphic). Column 3: weak immune strength (Inline graphic) and moderate immune breadth (Inline graphic). Column 4: strong immune strength (Inline graphic) and narrow immune breadth (Inline graphic). Column 5: strong immune strength (Inline graphic) and moderate immune breadth (Inline graphic). Infecting genotypes are Inline graphic50% adapted to the host. (A–E) Changes in mean viral replicative fitness for six independent viral populations evolving on the same fitness landscape, starting with the same infecting genotype. (F–J) The number of shared high-frequency mutations across pairs of individuals. Only mutations at nonsynonymous sites that exceeded frequencies of 20% at time Inline graphic years were considered in this calculation. Supplementary Figure S14 shows analogous results for an alternative definition of high-frequency nonsynonymous mutations, namely any nonsynonymous mutation that reaches a frequency of 20% or higher at any point over the 6-month course of an individual’s infection. (K–O) Extent of antigenic evolution over the course of infection. All simulations were performed using a viral genome of length Inline graphic, with Inline graphic, Inline graphic, Inline graphic, and Inline graphic. Other parameters are: Inline graphic, Inline graphic mutations per site per infection cycle, Inline graphic, Inline graphic, and Inline graphic infection cycles per day.

Discussion

Prolonged infections with respiratory viruses such as influenza viruses and coronaviruses have been extensively documented. Through serial sampling of these infections, several consistent patterns of viral evolution have been identified, including an observed excess of nonsynonymous substitutions (Fig. 1A), co-circulating viral lineages (Fig. 1B), parallel substitutions across infected individuals (Fig. 1C), and variable rates of antigenic evolution (Fig. 1D). Here, we have developed a fitness landscape model to determine what processes likely drive these evolutionary patterns. Through simulation, we found that the patterns shown in Fig. 1A–C could not be consistently reproduced in the absence of sites that impacted antigenicity. In contrast, when we instead assumed that a subset of the nonsynonymous sites impacted antigenicity in addition to viral replicative fitness, our simulations were able to consistently reproduce these three observed patterns. The final pattern, of variable rates of antigenic evolution, could be explained by differing strengths of immune pressure, differing breadths of the immune response, or both, across infected individuals.

We designed our model to be flexibly parameterized by easily changing the number and types of sites, the mutation rate, the ruggedness of the replicative fitness landscape, and the strength and breadth of the immune response. While we could not simulate the model under all possible parameterizations, we considered the impact of multiple different fitness landscapes on patterns of within-host viral evolution, the impact of different viral genome sizes and the impact of different strengths and breadths of the immune response on these patterns. Our model did adopt several assumptions that could be relaxed in future studies. First, we assumed that no recombination occurs within our model genome. This assumption simplified our model considerably, in that we did not have to model cellular co-infection either implicitly or explicitly, and it allowed us to organize our sites in the model genome in an arbitrary order. While this assumption simplified our model, it also prevented us from being able to determine how different fitness landscapes and types of immune pressure would impact rates of recombination and ultimately adaptation and immune escape. Second, we assumed that viral population sizes were constant over the course of infections and compared across simulations that had the same viral population size. This assumption could be easily relaxed, but in the absence of a question related specifically to population sizes and population dynamics, we adopted this assumption to be able to interpret our results without this variation as a confounding factor. Third, we assumed a single viral population within each individual, with no spatial subdivision or tissue compartmentalization. As such, spatial structure could not be assessed or invoked as a driver of any of the evolutionary patterns we considered. 
Given findings that within-host spatial structure occurs in respiratory virus infections and may contribute to genetic diversification of within-host viral populations (Gallagher et al., 2018, Chaguza et al., 2023, Farjo et al., 2024, Smith et al., 2024, Ferreri et al., 2025), this assumption could be relaxed in future extensions of the model. Fourth, we assumed that all mutations that impacted antigenicity did so to the same extent. This is clearly not the case empirically. For example, for influenza viruses, it is well known that mutations around the receptor binding site have particularly large antigenic effects, although mutations at other sites in the head of the hemagglutinin protein also impact antigenicity (Koel et al., 2013). This variation in antigenic impact may explain why, of the parallel mutations observed across individuals experiencing prolonged infections, several (like SARS-CoV-2’s E484K mutation) are particularly recurrent. Finally, we used our model to evaluate the drivers of four different evolutionary patterns that have been documented in prolonged infections with respiratory viruses. As additional studies accrue, some of these patterns themselves may need reevaluation. For example, a recent study has found that sequencing errors may have inflated estimates of viral diversity and evolutionary rates of SARS-CoV-2 populations sampled from individuals experiencing prolonged infections (Rutsinsky et al., 2024).
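To make the notion of a tunably rugged replicative fitness landscape concrete, the following Python sketch implements a generic Rough Mount Fuji-style landscape in the spirit of Neidhart et al. (2014): an additive term that declines with Hamming distance from a reference genotype, plus a genotype-specific random term whose spread sets the ruggedness. It is an illustration under stated assumptions, not the model used in this study: the function name, parameter names, binary genotype encoding, and Gaussian noise are all hypothetical choices, and site types, antigenicity, and immune pressure are not represented.

```python
import random


def make_rmf_fitness(reference, slope, noise_sd, seed=0):
    """Build a Rough Mount Fuji-style fitness function on binary genotypes.

    Fitness value = -slope * (Hamming distance to `reference`)
                    + a random term drawn once per genotype.
    noise_sd = 0 gives a smooth, purely additive landscape; larger values
    of noise_sd relative to slope give a more rugged landscape.
    """
    rng = random.Random(seed)
    noise = {}  # cache so that each genotype keeps a fixed random term

    def fitness(genotype):
        key = tuple(genotype)
        if key not in noise:
            noise[key] = rng.gauss(0.0, noise_sd)
        distance = sum(g != r for g, r in zip(key, reference))
        return -slope * distance + noise[key]

    return fitness


reference = (0, 0, 0, 0)  # hypothetical best-adapted genotype

# Smooth landscape: fitness depends only on distance from the reference.
smooth = make_rmf_fitness(reference, slope=1.0, noise_sd=0.0)

# Rugged landscape: the random term can reorder neighboring genotypes.
rugged = make_rmf_fitness(reference, slope=1.0, noise_sd=1.0, seed=1)

print(smooth((1, 1, 0, 0)))
print(rugged((1, 1, 0, 0)))
```

Caching the random term per genotype keeps the landscape fixed across repeated evaluations, which is what allows independent populations to evolve on the same landscape in a simulation.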

Although the work presented here focused on viral evolution in individuals experiencing prolonged infection with respiratory viruses that generally cause acute infection, our results reveal patterns similar to those observed in other prolonged viral infections, including prolonged infections with enteric viruses such as noroviruses (van Beek et al., 2017, Doerflinger et al., 2017) as well as with viruses such as hepatitis C virus (HCV) and HIV that lead to chronic infection. For example, chronic HCV infections are known to result in rapid diversification from the founder virus into multiple distinct viral lineages that continue to co-circulate over the course of the infection (Raghwani et al., 2016, 2019). This co-circulation of viral lineages results in significant fluctuations in viral divergence over time (Raghwani et al., 2016), a pattern that we also observed in our analysis when we incorporated immune pressure. As such, we hope that the fitness landscape model presented here may further be adapted to evaluate the drivers of observed patterns of viral evolution in these other types of viruses.

Supplementary Material

rough_MountFufi_final_supplemental_clean_veaf054

Acknowledgments

We thank two anonymous reviewers for their helpful comments and feedback, as well as Anice Lowen, Anne Piantadosi, Jessica Belser, Tim Read, and members of the Koelle lab for feedback.

Contributor Information

Amber Coats, Program in Microbiology and Molecular Genetics, Emory University, 1462 Clifton Road NE, Atlanta, GA 30322, United States.

Yintong R Wang, Department of Biology, Emory University, 1510 Clifton Road NE, Atlanta, GA 30322, United States.

Katia Koelle, Department of Biology, Emory University, 1510 Clifton Road NE, Atlanta, GA 30322, United States; Emory Center of Excellence for Influenza Research and Response (CEIRR), Atlanta GA, United States.

Conflict of interest: None declared.

Funding

Research reported in this publication was funded by NIH R01 AI154894 and the National Institute of Allergy and Infectious Diseases, Centers of Excellence for Influenza Research and Response, contract number 75N93021C00017. AC was further supported by Emory University and the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number T32AI138952 (Infectious Diseases Across Scales Training Program; IDASTP). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Data availability

All simulation code is available on GitHub at: https://github.com/katiakoelle/rmf_model. All data in this work are simulated data.

References

  1. Aita T, Uchiyama H, Inaoka T et al. Analysis of a local fitness landscape with a model of the rough Mt. Fuji-type landscape: application to prolyl endopeptidase and thermolysin. Biopolymers 2000;54:64–79.
  2. Amicone M, Borges V, Alves MJ et al. Mutation rate of SARS-CoV-2 and emergence of mutators during experimental evolution. Evol Med Public Health 2022;10:142–55. 10.1093/emph/eoac010
  3. Baang JH, Smith C, Mirabelli C et al. Prolonged severe acute respiratory syndrome coronavirus 2 replication in an immunocompromised patient. J Infect Dis 2021;223:23–7. 10.1093/infdis/jiaa666
  4. Bar-On YM, Flamholz A, Phillips R et al. SARS-CoV-2 (COVID-19) by the numbers. Elife 2020;9:e57309. 10.7554/eLife.57309
  5. Ben Zvi A, Rutsinsky N, Fabian I et al. Diverse patterns of intra-host genetic diversity in chronically infected SARS-CoV-2 patients. Virus Evol 2025;11:veaf047. 10.1093/ve/veaf047
  6. Berkhout B, Herrera-Carrillo E. SARS-CoV-2 evolution: on the sudden appearance of the omicron variant. J Virol 2022;96:e0009022. 10.1128/jvi.00090-22
  7. Borges V, Isidro J, Cunha M et al. Long-term evolution of SARS-CoV-2 in an immunocompromised patient with non-Hodgkin lymphoma. mSphere 2021;6:e0024421. 10.1128/mSphere.00244-21
  8. Chaguza C, Hahn AM, Petrone ME et al. Accelerated SARS-CoV-2 intrahost evolution leading to distinct genotypes during chronic infection. Cell Rep Med 2023;4:100943. 10.1016/j.xcrm.2023.100943
  9. Chen L, Zody MC, Di Germanio C et al. Emergence of multiple SARS-CoV-2 antibody escape variants in an immunocompromised host undergoing convalescent plasma treatment. mSphere 2021;6:e0048021. 10.1128/mSphere.00480-21
  10. Choi B, Choudhary MC, Regan J et al. Persistence and evolution of SARS-CoV-2 in an immunocompromised host. N Engl J Med 2020;383:2291–3. 10.1056/NEJMc2031364
  11. Dioverti V, Salto-Alejandre S, Haidar G. Immunocompromised patients with protracted COVID-19: a review of “long persisters”. Curr Transplant Rep 2022;9:209–18. 10.1007/s40472-022-00385-y
  12. Doerflinger SY, Weichert S, Koromyslova A et al. Human norovirus evolution in a chronically infected host. mSphere 2017;2:e00352-16. 10.1128/mSphere.00352-16
  13. Einav T, Gentles LE, Bloom JD. SnapShot: influenza by the numbers. Cell 2020;182:532–532.e1. 10.1016/j.cell.2020.05.004
  14. Farjo M, Koelle K, Martin MA et al. Within-host evolutionary dynamics and tissue compartmentalization during acute SARS-CoV-2 infection. J Virol 2024;98:e0161823. 10.1128/jvi.01618-23
  15. Ferreri LM, Seibert B, Caceres CJ et al. Dispersal of influenza virus populations within the respiratory tract shapes their evolutionary potential. Proc Natl Acad Sci USA 2025;122:e2419985122. 10.1073/pnas.2419985122
  16. Gallagher ME, Brooke CB, Ke R et al. Causes and consequences of spatial within-host viral spread. Viruses 2018;10:627. 10.3390/v10110627
  17. Ghafari M, Liu Q, Dhillon A et al. Investigating the evolutionary origins of the first three SARS-CoV-2 variants of concern. Front Virol 2022;2:2. 10.3389/fviro.2022.942555
  18. Gillespie DT. Approximate accelerated stochastic simulation of chemically reacting systems. J Chem Phys 2001;115:1716–33. 10.1063/1.1378322
  19. Gonzalez-Reiche AS, Alshammary H, Schaefer S et al. Sequential intrahost evolution and onward transmission of SARS-CoV-2 variants. Nat Commun 2023;14:3235. 10.1038/s41467-023-38867-x
  20. Greaney AJ, Starr TN, Bloom JD. An antibody-escape estimator for mutations to the SARS-CoV-2 receptor-binding domain. Virus Evol 2022;8:veac021. 10.1093/ve/veac021
  21. Greene D, Crona K. The changing geometry of a fitness landscape along an adaptive walk. PLoS Comput Biol 2014;10:e1003520. 10.1371/journal.pcbi.1003520
  22. Grenfell BT, Pybus OG, Gog JR et al. Unifying the epidemiological and evolutionary dynamics of pathogens. Science 2004;303:327–32. 10.1126/science.1090727
  23. Harari S, Tahor M, Rutsinsky N et al. Drivers of adaptive evolution during chronic SARS-CoV-2 infections. Nat Med 2022;28:1501–8. 10.1038/s41591-022-01882-4
  24. Hill V, Du Plessis L, Peacock TP et al. The origins and molecular evolution of SARS-CoV-2 lineage B.1.1.7 in the UK. Virus Evol 2022;8:veac080. 10.1093/ve/veac080
  25. Jensen B, Luebke N, Feldt T et al. Emergence of the E484K mutation in SARS-COV-2-infected immunocompromised patients treated with bamlanivimab in Germany. Lancet Reg Health Eur 2021;8:100164. 10.1016/j.lanepe.2021.100164
  26. Kemp SA, Collier DA, Datir RP et al. SARS-CoV-2 evolution during treatment of chronic infection. Nature 2021;592:277–82. 10.1038/s41586-021-03291-y
  27. Khatamzas E, Antwerpen MH, Rehn A et al. Accumulation of mutations in antibody and CD8 T cell epitopes in a B cell depleted lymphoma patient with chronic SARS-CoV-2 infection. Nat Commun 2022;13:5586. 10.1038/s41467-022-32772-5
  28. Khosravi D, Soloff H, Langsjoen RM et al. Severe acute respiratory syndrome coronavirus 2 evolution and escape from combination monoclonal antibody treatment in a person with HIV. Open Forum Infect Dis 2023;10:ofad054. 10.1093/ofid/ofad054
  29. Ko KKK, Yingtaweesittikul H, Tan TT et al. Emergence of SARS-CoV-2 spike mutations during prolonged infection in immunocompromised hosts. Microbiol Spectr 2022;10:e0079122. 10.1128/spectrum.00791-22
  30. Koel BF, Burke DF, Bestebroer TM et al. Substitutions near the receptor binding site determine major antigenic change during influenza virus evolution. Science 2013;342:976–9. 10.1126/science.1244730
  31. Lumby CK, Zhao L, Breuer J et al. A large effective population size for established within-host influenza virus infection. Elife 2020;9:e56915. 10.7554/eLife.56915
  32. Lythgoe KA, Hall M, Ferretti L et al. SARS-CoV-2 within-host diversity and transmission. Science 2021;372:eabg0821. 10.1126/science.abg0821
  33. Machkovech HM, Hahn AM, Garonzik Wang J et al. Persistent SARS-CoV-2 infection: significance and implications. Lancet Infect Dis 2024;24:e453–62. 10.1016/S1473-3099(23)00815-0
  34. Markov PV, Ghafari M, Beer M et al. The evolution of SARS-CoV-2. Nat Rev Microbiol 2023;21:361–79. 10.1038/s41579-023-00878-2
  35. Martin MA, Koelle K. Comment on “genomic epidemiology of superspreading events in Austria reveals mutational dynamics and transmission properties of SARS-CoV-2”. Sci Transl Med 2021;13:eabh1803. 10.1126/scitranslmed.abh1803
  36. McCrone JT, Woods RJ, Martin ET et al. Stochastic processes constrain the within and between host evolution of influenza virus. Elife 2018;7:e35962. 10.7554/eLife.35962
  37. McMinn P, Carrello A, Cole C et al. Antigenic drift of influenza A (H3N2) virus in a persistently infected immunocompromised host is similar to that occurring in the community. Clin Infect Dis 1999;29:456–8. 10.1086/520243
  38. Memoli MJ, Athota R, Reed S et al. The natural history of influenza infection in the severely immunocompromised vs nonimmunocompromised hosts. Clin Infect Dis 2014;58:214–24. 10.1093/cid/cit725
  39. Neidhart J, Szendro IG, Krug J. Adaptation in tunably rugged fitness landscapes: the rough Mount Fuji model. Genetics 2014;198:699–721. 10.1534/genetics.114.167668
  40. Nussenblatt V, Roder AE, Das S et al. Yearlong COVID-19 infection reveals within-host evolution of SARS-CoV-2 in a patient with B-cell depletion. J Infect Dis 2022;225:1118–23. 10.1093/infdis/jiab622
  41. Paradis E, Schliep K. Ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 2019;35:526–8. 10.1093/bioinformatics/bty633
  42. Pauly MD, Procario MC, Lauring AS. A novel twelve class fluctuation test reveals higher than expected mutation rates for influenza A viruses. Elife 2017;6:e26437. 10.7554/eLife.26437
  43. Quaranta EG, Fusaro A, Giussani E et al. SARS-CoV-2 intra-host evolution during prolonged infection in an immunocompromised patient. Int J Infect Dis 2022;122:444–8. 10.1016/j.ijid.2022.06.023
  44. Raghwani J, Rose R, Sheridan I et al. Exceptional heterogeneity in viral evolutionary dynamics characterises chronic hepatitis C virus infection. PLoS Pathog 2016;12:e1005894. 10.1371/journal.ppat.1005894
  45. Raghwani J, Wu C-H, Ho CKY et al. High-resolution evolutionary analysis of within-host hepatitis C virus infection. J Infect Dis 2019;219:1722–9. 10.1093/infdis/jiy747
  46. Rambaut A. Figtree v1.4.4. 2018. Software available from http://tree.bio.ed.ac.uk/software/figtree/
  47. Riddell AC, Kele B, Harris K et al. Generation of novel severe acute respiratory syndrome coronavirus 2 variants on the B.1.1.7 lineage in 3 patients with advanced human immunodeficiency Virus-1 disease. Clin Infect Dis 2022;75:2016–8. 10.1093/cid/ciac409
  48. Rocha E, Cox NJ, Black RA. Antigenic and genetic variation in influenza A (H1N1) virus isolates recovered from a persistently infected immunodeficient child. J Virol 1991;65:2340–50. 10.1128/jvi.65.5.2340-2350.1991
  49. Rogers MB, Song T, Sebra R et al. Intrahost dynamics of antiviral resistance in influenza A virus reflect complex patterns of segment linkage, reassortment, and natural selection. MBio 2015;6:e02464-14. 10.1128/mBio.02464-14
  50. Scherer EM, Babiker A, Adelman MW et al. SARS-CoV-2 evolution and immune escape in immunocompromised patients. N Engl J Med 2022;386:2436–8. 10.1056/NEJMc2202861
  51. Schneider M, Ackermann K, Stuart M et al. Severe acute respiratory syndrome coronavirus replication is severely impaired by MG132 due to proteasome-independent inhibition of M-calpain. J Virol 2012;86:10112–22. 10.1128/JVI.01001-12
  52. Shi YT, Harris JD, Martin MA, Koelle K. Transmission bottleneck size estimation from de novo viral genetic variation. Mol Biol Evol 2024;41:msad286. 10.1093/molbev/msad286
  53. Smith E, Hamilton WL, Warne B et al. Variable rates of SARS-CoV-2 evolution in chronic infections. PLoS Pathog 2025;28:e1013109. 10.1371/journal.ppat.1013109
  54. Sonnleitner ST, Prelog M, Sonnleitner S et al. Cumulative SARS-CoV-2 mutations and corresponding changes in immunity in an immunocompromised patient indicate viral evolution within the host. Nat Commun 2022;13:2560. 10.1038/s41467-022-30163-4
  55. van Beek J, de Graaf M, Smits S et al. Whole-genome next-generation sequencing to study within-host evolution of norovirus (NoV) among immunocompromised patients with chronic NoV infection. J Infect Dis 2017;216:1513–24. 10.1093/infdis/jix520
  56. Volz EM, Frost SDW. Scalable relaxed clock phylogenetic dating. Virus Evol 2017;3:vex025. 10.1093/ve/vex025
  57. Wang L-G, Lam TT-Y, Xu S et al. Treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Mol Biol Evol 2020;37:599–603. 10.1093/molbev/msz240
  58. Wilkinson SAJ, Richter A, Casey A et al. Recurrent SARS-CoV-2 mutations in immunodeficient patients. Virus Evol 2022;8:veac050. 10.1093/ve/veac050
  59. Xue KS, Stevens-Ayers T, Campbell AP et al. Parallel evolution of influenza across multiple spatiotemporal scales. Elife 2017;6:e26875. 10.7554/eLife.26875
  60. Yu TC, Kikawa C, Dadonaite B et al. Pleiotropic mutational effects on function and stability constrain the antigenic evolution of influenza hemagglutinin. bioRxiv 2025.05.24.655919.

Articles from Virus Evolution are provided here courtesy of Oxford University Press
