PLOS Pathogens. 2018 Sep 12;14(9):e1007291. doi: 10.1371/journal.ppat.1007291

Antigenic evolution of viruses in host populations

Igor M Rouzine 1,2,*,#, Ganna Rozhnova 2,3,4,#
Editor: Marco Vignuzzi5
PMCID: PMC6173453  PMID: 30208108

Abstract

To escape immune recognition in previously infected hosts, viruses evolve genetically in immunologically important regions. The host’s immune system responds by generating new memory cells recognizing the mutated viral strains. Despite recent advances in data collection and analysis, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed speed of evolution and the incidence of infection. Here we establish a general and simple relationship between long-term cross-immunity, genetic diversity, speed of evolution, and incidence. We develop an analytic method fusing the standard epidemiological susceptible-infected-recovered approach and the modern virus evolution theory. The model includes the factors of strain selection due to immune memory cells, random genetic drift, and clonal interference effects. We predict that the distribution of recovered individuals in memory serotypes creates a moving fitness landscape for the circulating strains which drives antigenic escape. The fitness slope (effective selection coefficient) is proportional to the reproductive number in the absence of immunity R0 and inversely proportional to the cross-immunity distance a, defined as the genetic distance of a virus strain from a previously infecting strain conferring a 50% decrease in infection probability. Analysis predicts that the evolution rate increases linearly with the fitness slope and logarithmically with the genomic mutation rate and the host population size. Fitting our analytic model to data obtained for influenza A H3N2 and H1N1, we predict the annual infection incidence within a previously estimated range, (4–7)%, and the antigenic mutation rate of Ub = (5–8) × 10⁻⁴ per transmission event per genome. Our prediction of the cross-immunity distance of a = (14–15) amino acid substitutions agrees with independent data for equine influenza.

Author summary

Spread of many RNA viruses in a population represents a competition between host immune responses and viral evolution. RNA viruses accumulate mutations in immunologically important regions to escape immune recognition in hosts previously exposed to infection, while the immune system responds by producing new memory cells. Despite recent advances in data collection and analysis, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed speed of evolution and the incidence of infection. By combining the standard epidemiological approach with the modern theory of viral evolution, we predict a general relationship between long-term cross-immunity, antigenic diversity of the virus, its evolution speed, infection incidence, and the time to the most recent common ancestor. We apply these theoretical findings to available data on influenza virus to determine two important parameters of its evolution and confirm the model. Current strategies of vaccination against influenza should take into account the stochastic fluctuations in the fitness effect of mutations predicted by the theory.

Introduction

Spread of many RNA viruses occurs as a race between host immune responses and rapid viral evolution. The development of treatment and effective preventive measures such as vaccines and therapeutic interference particles [1–3] requires understanding of the mechanics of viral evolution at the scale of a population. To evade immune recognition by hosts previously exposed to infection, in a never-ending chase, viruses accumulate mutations in immunologically relevant regions of the genome [4]. Despite advances in the collection and analysis of epidemiological and genomic data, it remains conceptually unclear how epidemiology, immune response, and evolutionary factors interact to produce the observed evolution speed and the incidence of infection.

Influenza virus infects 5–15% of the world population. The global spread and reinfection of the same individuals is caused by rapid evolution of antibody-binding regions [4]. A large amount of information has been obtained, including world-wide circulation [5–7], genetic maps of virus variants and antibodies, molecular mechanisms, and fitness effect of specific mutations [4, 8–10]. Vigorous data analysis and computer simulation helped to understand many features of influenza virus evolution [7, 11–15]. In particular, recent work [15] offers an inference model to predict short-term evolution of influenza, which is helpful for optimization of vaccination strategy. However, a more general connection between the population-scale parameters of the virus and its evolutionary behavior is still lacking.

The aim of this work is to establish general and simple relationships for the speed of virus evolution, genetic diversity, and annual incidence in terms of population parameters, and to fit them to the available data for influenza virus. We propose a general analytic approach combining a susceptible-infected-recovered (SIR) framework [11, 16] with the stochastic evolution theory [17–25]. Using the experimental observation that phylogenetic trees of influenza virus have a vine-like structure with short branches [4], we focus on virus evolution along the one-dimensional trunk. Analysis demonstrates that the evolution under immune memory occurs in the form of a traveling wave in antigenic space, with the fitness landscape moving together with the wave. The fitness slope (effective selection coefficient) can be expressed in terms of the cross-immunity distance.

We provide analytic predictions for the speed, incidence, and the average time to most recent common ancestor in terms of population parameters, including reproduction number, population size, and cross-immunity distance. Then we discuss how the punctuated nature of influenza evolution alternating small-effect and large mutations [4, 14] may be interpreted from the stochasticity of evolution.

Model and methods

Strain-structured epidemiological model

We start by describing briefly our model and approach. The details are given in S1 Appendix. Standard models of evolution focus on the dynamics of virus strains (variants), while standard epidemiological models study the transmission of a virus in a host population. For viruses that evolve to evade immune memory of previously infected hosts, evolutionary and epidemiological dynamics are tightly coupled [26]. Here we adopt a strain-based formulation of epidemiological models, in which all individuals are infected or recovered. Recovered individuals are classified according to their current ability to respond to various viral strains which represent genetic variants of an antibody-binding region of the virus (e.g., hemagglutinin gene for influenza virus). Each infected individual is assumed to be infected with a single strain denoted by x. We measure the “antigenic coordinate” x as genetic distance in terms of non-synonymous nucleotide substitutions. Infection by a viral strain is cleared in several days or weeks, leaving in the recovered individual immunological memory that provides full protection against the same strain and partial protection against infection by genetically close strains. We assume one-dimensional space, x, that represents the trunk of the phylogenetic tree. For each recovered individual, we keep track only of the memory of the most recent infection [11, 12]. In S1 Appendix, Section 1.3.3, we show that this approximation has a modest effect on the final results.

Let i(x, t) denote the density of individuals currently infected with strain x, and r(x, t) be the density of individuals whose last infection was with strain x and who then recovered. The model is represented by a system of differential equations that describe the dynamics of the distributions i(x, t) and r(x, t):

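$$
\begin{aligned}
\frac{dr(x,t)}{dt} &= -r(x,t)\,R_0\int_{x}^{\infty} K(x-y)\,i(y,t)\,dy + i(x,t),\\
\frac{di(x,t)}{dt} &= i(x,t)\left[R_0\int_{-\infty}^{x} K(y-x)\,r(y,t)\,dy - 1\right] + (\text{mutation term})
\end{aligned}
\tag{1}
$$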

We assume that each individual is either infected or recovered, as given by the normalization condition

$$
\int_{-\infty}^{+\infty}\left[r(x,t)+i(x,t)\right]dx = 1.
$$

The treatment of mutations, which are assumed to be rare, will be described below in subsection Mutation.

Eq 1 describes the following epidemiological processes. Firstly, individuals recovered from strain x can be infected with strain y, and their susceptibility is proportional to the cross-immunity function K(x − y), which depends on the genetic distance between strains x and y, so that K(x − y) > 0 for y > x; K(x − y) ≡ 0 for y < x; and K(−∞) = 1.

Here we assume that individuals recovered from strain x can be infected only by strains y ahead of x in time, y > x, so that K(u) is zero when its argument u is zero or positive (Fig 1, blue curve). In fact, there is a narrow region at the leading edge, where the backward infection could be possible. However, since the edge region is very narrow in the parameter range of interest, this process has a minor effect on the results (see the details in S1 Appendix, Section 1.3.2).
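The conditions on K can be made concrete with a short sketch. It assumes the functional form listed in Table 1, K(u) = |u|/(a + |u|) for u < 0, together with the alternative form K = 1 − exp(−|u|/a) mentioned later in Results; the function names are hypothetical and chosen only for illustration. The argument u is the coordinate of the remembered strain minus the coordinate of the challenging strain.

```python
import math

def cross_immunity(u: float, a: float) -> float:
    """Susceptibility K(u) = |u| / (a + |u|) for u < 0, and 0 for u >= 0.

    u is (remembered strain) - (challenging strain); negative u means the
    challenging strain lies ahead of the remembered one in antigenic space.
    """
    if u >= 0:
        return 0.0
    return abs(u) / (a + abs(u))

def cross_immunity_exp(u: float, a: float) -> float:
    """Alternative saturating form K(u) = 1 - exp(-|u| / a) for u < 0."""
    if u >= 0:
        return 0.0
    return 1.0 - math.exp(-abs(u) / a)

a = 15.0  # cross-immunity distance in amino acid substitutions (Table 1, H3N2)
for u in (0.0, -1.0, -a, -10 * a):
    print(u, cross_immunity(u, a), cross_immunity_exp(u, a))
```

With the Table 1 form, a strain at distance a from the remembered strain infects with half the probability of a fully novel strain, which matches the definition of the cross-immunity distance given in the Abstract. Both forms have slope 1/a at the origin.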

Fig 1. One-dimensional epidemiological model predicts a steady traveling wave along fitness axis.


A) Frequencies of recovered individuals (black curve) and the infected individuals (red histogram) in the population in the reference frame moving with the wave. Here the X axis plots the antigenic coordinate in that reference frame, u = x − ct. Black solid line shows the analytic prediction for r(u) (Eq 3). Histograms show the result of a full stochastic simulation of the epidemiological model, Eq 1. Blue line is the cross-immunity function K(u) (Table 1). Parameters (example): R0 = 2, a = 9, Ub = 5.8 × 10⁻⁶, N = 10⁸. Units of the values on the axes are given in Table 1 and Eq 1. A wave in the rest frame of reference is shown in S2B Fig.

Secondly, infected individuals with the density i(x, t) recover. Thirdly, individuals infected with a strain x may produce a mutant strain x′ with a small probability, as explained below (Mutation). We measure time in the units of infectious period, trec, so that recovery rate is 1, and transmission rate equals the basic reproduction number, R0, defined as the reproduction number in a population of previously uninfected individuals.

Mutation

So far we have considered only dynamics of strains x which already exist. What drives the antigenic evolution is the emergence of new viral strains. Each strain x occasionally and accidentally undergoes a mutation event which changes its ability to be recognized by antibodies (phenotype). We describe this as a variable change in its antigenic coordinate Δx > 0. The new influenza strain with a new antigenic coordinate, x + Δx, is either cleared from the individual or (with small probability) transmitted to another person. The model parameters describing random mutations are the average rate Ub per genome per infectious period (Table 1) and the distribution of Δx. The actual distribution may be quite complex [27]; here, we consider a class of exponential distributions [23]. Specifically, we assume that with each mutation, the value of Δx is drawn randomly with the following probability density

$$
\rho(\Delta x) = \frac{e^{-(\Delta x)^{\beta}}}{\Gamma(1+1/\beta)}, \tag{2}
$$

where β is a fixed parameter.
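A minimal sketch of how jumps Δx could be drawn from Eq 2, reading the density as exp(−(Δx)^β)/Γ(1 + 1/β) on Δx > 0. It uses the standard change of variables: if Y follows a gamma distribution with shape 1/β and unit scale, then Y^(1/β) has this density. The function names are hypothetical; this is not taken from the authors' simulation code.

```python
import math
import random

def sample_delta_x(beta: float, rng: random.Random) -> float:
    """Draw one jump from the density exp(-dx**beta) / Gamma(1 + 1/beta), dx > 0."""
    y = rng.gammavariate(1.0 / beta, 1.0)  # shape 1/beta, scale 1
    return y ** (1.0 / beta)

def jump_density(dx: float, beta: float) -> float:
    """Probability density of Eq 2 evaluated at dx >= 0."""
    return math.exp(-(dx ** beta)) / math.gamma(1.0 + 1.0 / beta)

rng = random.Random(1)
beta = 2.0  # half-Gaussian case used for data fitting
samples = [sample_delta_x(beta, rng) for _ in range(100_000)]
mean_dx = sum(samples) / len(samples)

# For beta = 2 the exact mean is Gamma(2/beta) / Gamma(1/beta) = 1 / sqrt(pi).
print("sample mean:", mean_dx, "exact mean:", 1.0 / math.sqrt(math.pi))
```

Larger β makes the distribution more sharply cut off around Δx = 1, which is the sense in which the large-β limit approaches a fixed jump size.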

Table 1. Model parameters: Input (upper rows) and output (lower rows).

| Notation | Name | Unit | H3N2 | H1N1 |
|---|---|---|---|---|
| R0 | Basic reproduction number | 1 (b) | 1.8 (a) | 1.46 (a) |
| trec | Recovery time | day | 5 (a) | 5 (a) |
| Ub | Mutation rate per genome | 1/trec; 1/yr | 5 × 10⁻⁴; 0.036 (c) | 8 × 10⁻⁴; 0.058 (c) |
| a | Cross-immunity distance | AA | 15 (c) | 14 (c) |
| K(u) | Cross-immunity function | 1 | \|u\|/(a + \|u\|) | \|u\|/(a + \|u\|) |
| N | Population size | 1 | 10⁸ | 10⁸ |
| β | Mutation distribution parameter | 1 | 2 | 2 |
| σ | Average selection coefficient | 1 | 0.048 (d) | 0.028 (d) |
| 365 N_i/(trec N) | Annual incidence | 1/yr | 0.07 (d) | 0.04 (d) |
| c | Substitution rate | 1/trec; 1/yr | 0.036; 2.6 (a) | 0.031; 2.26 (a) |
| TMRCA2 | Pairwise coalescent time | year | 3.03 (a) | 4.59 (a) |

a Known from published data for influenza A strains H3N2 and H1N1 [7, 13, 30, 31]

b Unit “1” stands for “dimensionless”.

c Input parameter of the model which was adjusted to fit published data.

d A value predicted for the best-fit parameter set
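Several rows of Table 1 quote the same rate in two units, per infectious period trec and per year. The short sketch below shows the conversion, assuming trec = 5 days as listed in the table; the helper name is hypothetical.

```python
T_REC_DAYS = 5.0                       # recovery time from Table 1
PERIODS_PER_YEAR = 365.0 / T_REC_DAYS  # 73 infectious periods per year

def per_period_to_per_year(rate_per_trec: float) -> float:
    """Convert a rate measured per infectious period into a rate per year."""
    return rate_per_trec * PERIODS_PER_YEAR

ub_h3n2_per_year = per_period_to_per_year(5e-4)   # mutation rate, H3N2
ub_h1n1_per_year = per_period_to_per_year(8e-4)   # mutation rate, H1N1
c_h3n2_per_year = per_period_to_per_year(0.036)   # substitution rate, H3N2
c_h1n1_per_year = per_period_to_per_year(0.031)   # substitution rate, H1N1

print(ub_h3n2_per_year, ub_h1n1_per_year, c_h3n2_per_year, c_h1n1_per_year)
```

The converted values agree with the paired entries in the table to the precision quoted there.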

Genetic drift

Below in Results, we introduce the critically important factor of random genetic drift [28, 29] by allowing the number of new infections to vary randomly among the sources of transmission. The model parameters and their estimates used in the analysis are summarized in Table 1.

Results

The model described in the previous section establishes a general analytic relationship between immunological, epidemiological, and evolutionary properties of a virus causing non-chronic infection. Using the analytic approach described in the previous section, below we predict the evolution speed, the incidence of influenza in a population, and the time to the most recent common ancestor. Then, we test analytic results with stochastic simulation and compare them to available data on influenza strain A H3N2.

Recovered individuals and the traveling wave

Below we analyze epidemiological dynamics in two steps. First, we assume that, in the realistic parameter range, a ≫ 1, the frequency of infected individuals, i(x, t), represents a solitary peak, much narrower in genetic distance x than the frequency of recovered individuals, r(x, t). Using this fact, we find analytically the form of r(x, t). Second, we apply the well-developed theory of asexual evolution [18–21, 23] to obtain parameters of the distribution of infected individuals i(x, t). Details are given in S1 Appendix; here we present the main steps of the derivation.

We start our analytic derivation by noting that, in the limit of small mutation rates, the main role of mutation is to form new strains with antigenic coordinate x larger than for already existing strains. For already existing strains, mutation is negligible. This assumption is intuitively clear and is verified in the relevant parameter range, using estimates of mutation rate Ub (Table 1).

Neglecting the mutation term in Eq 1, we seek a traveling wave solution of the form r(x, t) = r(x − ct) and i(x, t) = i(x − ct), where x − ct ≡ u is the relative antigenic coordinate of a strain and c = d⟨x⟩/dt is the wave speed, defined as the average number of non-synonymous nucleotide substitutions per year. Without loss of generality, we choose the peak of the infected wave i(u) to be at u = 0, i.e., di(u)/du = 0 at u = 0. The traveling wave solution of Eq 1 for infected and recovered individuals then reads

$$
i(u) = A c f(u), \qquad
r(u) \approx
\begin{cases}
A\exp\left[-A R_0\int_{u}^{0} K(v)\,dv\right], & u<0,\\
0, & u>0,
\end{cases}
\tag{3}
$$

where A is a constant found from the normalization condition $\int_{-\infty}^{+\infty}[r(u)+i(u)]\,du = 1$, and f(u) is a narrow peak with unit area and a width much less than the width of the recovered distribution, r(u). The wave speed c and the shape of the infected density f(u) are to be determined later on.

At large R0, K(v) in Eq 3 can be expanded linearly near zero, so that density of the recovered becomes a half of a Gaussian

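$$
r(u) \approx \frac{2R_0}{\pi a}\, e^{-\left(\frac{R_0 u}{a\sqrt{\pi}}\right)^{2}},\quad u<0; \qquad r(u) = 0,\quad u>0 \tag{4}
$$

and A = 2R0/(πa). The fraction of infected individuals in population

$$
\frac{N_{inf}}{N} = \int_{-\infty}^{\infty} i(u)\,du = Ac = \frac{2R_0 c}{\pi a} \tag{5}
$$

is assumed to be much smaller than 1. Then the annual incidence of infection is expressed in terms of cross-immunity distance, evolution speed, and basic reproduction number as

$$
\text{Annual incidence} = \frac{2R_0 c}{\pi a}\,\frac{365}{t_{rec}}, \tag{6}
$$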

which is a directly testable prediction.
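The relation between the normalization constant A and the incidence can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' code. It takes the recovered density as r(u) = A exp[−A R0 ∫ K dv] with the integral running from u to 0 (Eq 3) and K(u) = |u|/(a + |u|) (Table 1), neglects the small infected fraction in the normalization, and solves for A by bisection. The result is compared with the large-R0 closed form A = 2R0/(πa). All function names are hypothetical, and the bisection bracket is tuned for R0 well above 1.

```python
import math

def simpson(f, lo, hi, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def recovered_density(u, A, R0, a):
    """r(u) of Eq 3 for K(u) = |u|/(a + |u|); u = 0 returns the left limit A."""
    if u > 0:
        return 0.0
    s = -u
    # Integral of K(v) from u to 0 equals s - a*log(1 + s/a).
    return A * math.exp(-A * R0 * (s - a * math.log1p(s / a)))

def solve_A(R0, a):
    """Bisection for A such that r(u) integrates to 1 over u < 0."""
    def mass(A):
        return simpson(lambda s: recovered_density(-s, A, R0, a), 0.0, 300.0 * a, 6000)
    lo, hi = 0.05 / a, 5.0 / a  # assumed bracket, adequate for R0 of about 1.2 or more
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mass(mid) < 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

def annual_incidence(A, c_per_trec, t_rec_days):
    """Infected fraction A*c (Eq 5) times the number of infectious periods per year."""
    return A * c_per_trec * 365.0 / t_rec_days

# Example parameters of Fig 1.
A_fig1 = solve_A(2.0, 9.0)

# H3N2 values from Table 1: R0 = 1.8, a = 15, c = 0.036 per t_rec, t_rec = 5 days.
R0, a, c, t_rec = 1.8, 15.0, 0.036, 5.0
incidence_numeric = annual_incidence(solve_A(R0, a), c, t_rec)
incidence_closed = annual_incidence(2.0 * R0 / (math.pi * a), c, t_rec)  # Eq 6

print("A for Fig 1 parameters:", A_fig1)
print("incidence, numerical normalization:", incidence_numeric)
print("incidence, closed form of Eq 6:", incidence_closed)
```

The two estimates need not coincide at R0 close to 2, because Eq 4 and hence Eq 6 rely on the large-R0 expansion of K, whereas the numerical normalization uses the full cross-immunity function.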

The analytic solution, Eqs 3 and 4, is based on the assumption that the infected wave i(u) is much narrower than the recovered wave r(u). To verify the validity of this approximation, we compare Eq 3 with a Monte-Carlo simulation based on Eq 1. The simulation confirms the existence of a steady traveling wave with two linked components moving to the right in antigenic coordinate (Fig 1). The infected wave i(u) is, indeed, a narrow peak. The time-averaged solution for recovered individuals obtained from simulation agrees fairly well with the analytic prediction (black line). The recovered wave r(u) displays a sharp increase near the maximum of i(u) and a slowly decaying tail at u < 0. The sharp increase is due to continuous recovery of infected individuals. The decaying tail is caused by reinfection of recovered individuals once they become genetically remote from the moving front of wave r(u). This derivation captures only the shape of the recovered peak, leaving the narrow infected peak undefined.

Moving fitness landscape

In order to determine the infected individual distribution, i(u), we use standard traveling wave theory [18–23]. The interesting feature of the selection due to immune escape is that the fitness landscape which controls the traveling wave travels with the wave. Moreover, it is the wave itself which creates its own landscape, as follows: the recovered create a landscape for the infected evolution, which moves the recovered distribution forward in x, and so on.

To derive the form of the landscape at the level of the human population, we use the standard definition of viral fitness as the average number of secondary infections caused by an infected individual [28, 32–34]. (This reproductive number must not be confused with the basic reproductive number R0, which is its maximum value, i.e., the value in a totally susceptible population.) Here we choose to define fitness w(x, t) as the reproductive number minus 1, i.e., the exponential expansion rate of the density of infected individuals i(x, t) measured per infectious period:

$$
w(x,t) = \frac{\partial \ln i(x,t)}{\partial t} = R_0\int_{-\infty}^{x} K(y-x)\,r(y,t)\,dy - 1. \tag{7}
$$

The form of w(u) obtained from Eqs 7 and 3 is shown in Fig 2 (red line).

Fig 2. Traveling fitness landscape and its linear approximation near the infected peak.


Red curve: analytic result (Eq 7). Gray circles: Monte-Carlo simulation based on Eq 1. Black line: linear approximation with the average selection coefficient σ = 0.066 (Eq 8). Parameters as in Fig 1: R0 = 2, a = 9, Ub = 5.8 × 10⁻⁶, N = 10⁸. For the accuracy of the linear approximation, see S1 Fig.

The asymptotic cases of the fitness landscape w(u) are

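$$
w(u) \approx
\begin{cases}
R_0 - 1, & u \gg a,\\
\sigma u, & |u| \ll a,\\
-1, & u<0,\ |u| \gg a,
\end{cases}
\tag{8}
$$

where

$$
\sigma = -R_0\int_{-\infty}^{0} \frac{dK}{du}\, r(u)\,du \tag{9}
$$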

has the meaning of the fitness landscape slope, or the average selection coefficient. According to Eq 8, w(u) is positive for u > 0 and negative for u < 0, indicating that viruses are selected for in front of the infected peak and selected against in the wake of the wave. For large positive or negative u, |u| ≫ a, we predict saturation of w(u). At u = 0, w(0) = 0, which is equivalent to the fact that the actual reproduction number is exactly 1 at the peak of the wave. Within the range |u| ≪ a, where the narrow peak of the infected individuals is located, the fitness landscape can be expanded linearly with slope σ > 0, which represents the average selection coefficient of a mutation event. For sufficiently large R0, from Eqs 4 and 9, σ can be approximated by a series in 1/R0

$$
\sigma(a, R_0) = \frac{1}{a}\left[R_0 - 2 + \frac{3\pi}{2R_0} + O\!\left(\frac{1}{R_0^{2}}\right)\right], \tag{10}
$$

where a ≡ 1/|K′(0)|, and the second and third terms are small corrections to the first term. Thus, the average selection coefficient σ of the traveling fitness landscape is inversely proportional to the cross-immunity distance a. It also increases linearly with the basic reproduction ratio R0 when R0 is large. The two correction terms in Eq 10 depend on the form of the cross-immunity function in Table 1. For an alternative form K(x) = 1 − exp(−x/a), they are smaller by factors of 2 and 6, respectively. The overall agreement for the entire landscape w(u) between the analytic prediction and simulation is quite good (Fig 2).
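The landscape and its slope can be evaluated numerically for the example parameters of Figs 1 and 2 (R0 = 2, a = 9). The sketch below is a hedged illustration: it reads Eq 7 in the wave frame as w(u) = R0 ∫ K(y − u) r(y) dy − 1 over y < min(u, 0), reads Eq 9 as σ = −R0 ∫ (dK/du) r(u) du over u < 0, uses r(u) from Eq 3 with K(u) = |u|/(a + |u|), and fixes A by normalizing r(u) to 1. Function names and the bisection bracket are assumptions made for this example.

```python
import math

def simpson(f, lo, hi, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def recovered_density(u, A, R0, a):
    """r(u) of Eq 3 for K(u) = |u|/(a + |u|); u = 0 returns the left limit A."""
    if u > 0:
        return 0.0
    s = -u
    return A * math.exp(-A * R0 * (s - a * math.log1p(s / a)))

def solve_A(R0, a):
    """Bisection for A such that r(u) integrates to 1 (infected fraction neglected)."""
    def mass(A):
        return simpson(lambda s: recovered_density(-s, A, R0, a), 0.0, 300.0 * a, 6000)
    lo, hi = 0.05 / a, 5.0 / a  # assumed bracket, adequate for R0 of about 1.2 or more
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mass(mid) < 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

def fitness(u, A, R0, a):
    """w(u) of Eq 7 in the moving frame."""
    upper = min(u, 0.0)           # only memory strains behind the challenger contribute
    def integrand(s):
        y = upper - s             # position of the remembered strain
        d = u - y                 # antigenic distance to the challenger, d >= 0
        return d / (a + d) * recovered_density(y, A, R0, a)
    return R0 * simpson(integrand, 0.0, 300.0 * a, 6000) - 1.0

def selection_slope(A, R0, a):
    """sigma of Eq 9; for u = -s < 0, dK/du = -a / (a + s)**2."""
    f = lambda s: a / (a + s) ** 2 * recovered_density(-s, A, R0, a)
    return R0 * simpson(f, 0.0, 300.0 * a, 6000)

R0, a = 2.0, 9.0
A = solve_A(R0, a)
sigma = selection_slope(A, R0, a)
sigma_series = (R0 - 2.0 + 3.0 * math.pi / (2.0 * R0)) / a  # Eq 10 without O(1/R0^2)
w_peak = fitness(0.0, A, R0, a)
w_ahead = fitness(100.0 * a, A, R0, a)
w_behind = fitness(-100.0 * a, A, R0, a)

print("sigma from Eq 9:", sigma, "truncated series of Eq 10:", sigma_series)
print("w(0):", w_peak, "w far ahead:", w_ahead, "w far behind:", w_behind)
```

For these parameters the numerically integrated slope should land close to the value σ = 0.066 quoted in the caption of Fig 2, and the limits of w(u) reproduce the saturation levels of Eq 8. The truncated series of Eq 10 is an expansion in 1/R0, so it is not expected to be accurate at R0 = 2.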

Fig 3. Analytic results for the evolution speed are confirmed by stochastic simulation.


Simulation is performed at fixed parameters R0 = 2, a = 9; Ub and β as shown. Solid and dashed lines are analytic results for the wave speed, c (Eq 6, S14, S16-S18), at two values of mutation rate Ub which define the broadest range of interest for RNA viruses, and two values of parameter β to test sensitivity to the shape of the selection coefficient distribution. Symbols show results obtained either by full stochastic simulation of the SIR model (Eq 1) or by a reduced simulation with σ = 0.066 (S1 Appendix).

Antigenic diversity and the speed of evolution

We get further insight into the dynamics of the model by predicting the speed of viral evolution c. So far, we have left this value undetermined because it weakly affects the shape of the density of recovered individuals r(x, t), Eq 3. In contrast, the density of infected individuals i(u), which is much narrower, needs to be determined simultaneously with c. Our result for the average selection coefficient σ, Eq 10, reduces the problem of epidemiological evolution to models of asexual populations with many diverse sites where the speed was derived previously in terms of population size, selection coefficient and mutation rate [18–23]. We consider a case with randomly distributed selection coefficient s = σΔx, where mutational distance Δx is sampled from the distribution in Eq 2 with large parameter β.

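This section contains the central result of our analysis: antigenic diversity Var[x] = ⟨(Δx)²⟩ and the adaptation rate v, defined as the average rate of fitness increase (“fitness flux”), depend on the cross-immunity range a and other parameters [23]

$$
\mathrm{Var}[x] = \frac{2\log(N_{inf}\,\sigma)}{\log(\beta\sigma/U_b)} \tag{11}
$$

$$
v = \sigma^{2}\,\mathrm{Var}[x] \tag{12}
$$

Another measure of evolution rate is the average substitution rate c

$$
c = (\sigma^{2}/s^{*})\,\mathrm{Var}[x] \tag{13}
$$

$$
s^{*} = \sigma\left[\frac{2}{\beta}\log\frac{\sigma}{U_b}\right]^{\frac{1}{\beta-1}} \tag{14}
$$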

where s* represents the most probable fitness gain of a mutation established in a population [23]. Note that s* is larger than the average selection coefficient σ. The expressions for Var[x] and s* are approximate, within the accuracy of logarithms inside the large logarithms. For more accurate expressions, see S1 Appendix.

To apply these results to our case of antigenic evolution, we substitute the average selection coefficient σ from Eq 10 and the infected population size Ninf from Eq 5. Then the metrics of evolution speed c, v are expressed in terms of a and epidemiological parameters (Table 1). In the limit of very large β, Eqs 11–14 match results of a model with constant selection coefficient σ [18, 20].
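A short numerical sketch of Eqs 11–14 may help with orders of magnitude. It rests on an assumed reading of the typeset formulas: Var[x] = 2 log(Ninf σ)/log(βσ/Ub), v = σ² Var[x], s* = σ[(2/β) log(σ/Ub)]^(1/(β−1)), and c = (σ²/s*) Var[x]. The inputs are the H3N2 entries of Table 1, with Ninf obtained from the tabulated annual incidence. The function name is hypothetical, and the output is only expected to agree with the data to logarithmic accuracy.

```python
import math

def wave_statistics(sigma, ub, beta, n_inf):
    """Return (Var[x], v, s_star, c) under the assumed reading of Eqs 11-14.

    Rates are per infectious period; requires beta > 1 and sigma > ub.
    """
    if beta <= 1.0:
        raise ValueError("this form of Eq 14 needs beta > 1")
    var_x = 2.0 * math.log(n_inf * sigma) / math.log(beta * sigma / ub)           # Eq 11
    v = sigma ** 2 * var_x                                                         # Eq 12
    s_star = sigma * ((2.0 / beta) * math.log(sigma / ub)) ** (1.0 / (beta - 1.0)) # Eq 14
    c = (sigma ** 2 / s_star) * var_x                                              # Eq 13
    return var_x, v, s_star, c

# H3N2 entries of Table 1.
sigma, ub, beta = 0.048, 5e-4, 2.0
n_hosts, t_rec_days, incidence_per_year = 1e8, 5.0, 0.07
n_inf = n_hosts * incidence_per_year * t_rec_days / 365.0  # infected at any one time

var_x, v, s_star, c = wave_statistics(sigma, ub, beta, n_inf)
c_per_year = c * 365.0 / t_rec_days

print("Var[x] =", var_x, " v =", v, " s* =", s_star)
print("substitution rate per year =", c_per_year)
```

Under these assumptions the substitution rate comes out at a few amino acid substitutions per year, the same order as the observed H3N2 value in Table 1, and s* exceeds σ as stated in the text.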

We verified analytic results for wave speed c by Monte-Carlo simulation in a wide range of N and Ub (Fig 3). We used two methods: full simulation of initial Eq 1 with randomly distributed mutational effects, and a reduced Moran algorithm with linearized fitness landscape (symbols in Fig 3). We observe that our analytic prediction of a logarithmic increase of c with N and Ub follows simulation quite well, except at the smallest Ub and N explored in our study. Logarithmic dependencies are characteristic for asexual evolution models [18–23, 35, 36]. Abbreviations IS, CI, MM near symbols indicate different regimes regarding the number of genomic sites evolving within the same time frame: selection sweeps at isolated sites (IS), pairwise clonal interference (CI) [23, 35, 36], and multiple-mutation regime (MM) [18–21, 23]. The traveling wave models are designed for the MM regime, which explains the discrepancy at the smallest Ub and N. We also observe that the steepness of the selection coefficient distribution, β, weakly affects the predicted speed.
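To give a feel for what a reduced simulation on a linearized landscape involves, here is a toy sketch. It is explicitly not the authors' Moran algorithm or their full SIR simulation: it uses Wright–Fisher-type resampling with weights exp(σ(x − mean x)), and mutations that add a jump drawn from the density of Eq 2. The population size and mutation rate are illustrative values chosen so the toy run is fast and visibly moves; they are not the estimates of Table 1, and all names are hypothetical.

```python
import math
import random

def sample_delta_x(beta, rng):
    """Draw a jump from the density exp(-dx**beta) / Gamma(1 + 1/beta) (Eq 2)."""
    return rng.gammavariate(1.0 / beta, 1.0) ** (1.0 / beta)

def simulate_wave(n_hosts, sigma, ub, beta, generations, seed=0):
    """Return the mean antigenic coordinate after each generation."""
    rng = random.Random(seed)
    x = [0.0] * n_hosts          # every infection starts from the same strain
    mean_path = []
    for _ in range(generations):
        x_bar = sum(x) / n_hosts
        # Selection: linear landscape w = sigma * (x - mean), sampled with drift.
        weights = [math.exp(sigma * (xi - x_bar)) for xi in x]
        x = rng.choices(x, weights=weights, k=n_hosts)
        # Mutation: each transmission mutates with probability ub, always forward.
        x = [xi + sample_delta_x(beta, rng) if rng.random() < ub else xi for xi in x]
        mean_path.append(sum(x) / n_hosts)
    return mean_path

path = simulate_wave(n_hosts=1000, sigma=0.066, ub=1e-2, beta=2.0,
                     generations=300, seed=1)
print("mean antigenic coordinate after 300 generations:", path[-1])
```

Repeating such a run over a grid of population sizes and mutation rates is one way to look for the slow, logarithmic-type dependence of the speed discussed above, although a toy model of this size should not be expected to match Fig 3 quantitatively.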

Our analysis predicts that the substitution rate of antigenic mutations c, Eq 13, is inversely proportional to the cross-immunity distance a and increases logarithmically with host population size and mutation rate. The average selection coefficient at the population level, σ, is also inversely proportional to a, Eq 10. An alternative measure of the evolution speed, the adaptation rate v, Eq 12, is inversely proportional to a². The annual incidence of infection, Eq 6, also scales as 1/a².

Time to the most recent common ancestor

Taking advantage of recent theoretical progress in asexual phylogeny [24, 25, 38], we also calculated an important observable quantity, the time to the most recent common ancestor of two co-existing viruses (S1 Appendix, Eqs S20-S21).

$$
T_{MRCA2} = z\,\sqrt{\frac{2\log(N\sigma)}{v}} \tag{15}
$$

Here the numeric factor z depends on the distribution of the mutational effect Δx [24, 25]. The predicted values are z = 1.5 in the case of a fixed mutational effect Δx, and z = 3 in the case of the Gaussian distribution of Δx (Eq 2 with β = 2). Because the Gaussian case is more realistic, and because we are not aware of any results for TMRCA2 for other forms of distribution, below we choose the value β = 2 for data fitting.
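As a rough numerical illustration of Eq 15, the sketch below assumes the formula reads TMRCA2 = z·sqrt(2 log(Nσ)/v), which is the dimensionally consistent reading when v is a fitness variance per infectious period squared. The adaptation rate v is taken from an assumed reading of Eqs 11 and 12, and the inputs are the H3N2 entries of Table 1. Function names are hypothetical.

```python
import math

def adaptation_rate(sigma, ub, beta, n_inf):
    """v = sigma^2 * Var[x], with Var[x] = 2 log(N_inf sigma) / log(beta sigma / U_b)."""
    var_x = 2.0 * math.log(n_inf * sigma) / math.log(beta * sigma / ub)
    return sigma ** 2 * var_x

def pairwise_coalescent_time(n_hosts, sigma, v, z=3.0):
    """Assumed reading of Eq 15, in units of the infectious period."""
    return z * math.sqrt(2.0 * math.log(n_hosts * sigma) / v)

# H3N2 entries of Table 1; z = 3 corresponds to beta = 2.
sigma, ub, beta = 0.048, 5e-4, 2.0
n_hosts, t_rec_days, incidence_per_year = 1e8, 5.0, 0.07
n_inf = n_hosts * incidence_per_year * t_rec_days / 365.0

v = adaptation_rate(sigma, ub, beta, n_inf)
t_mrca_periods = pairwise_coalescent_time(n_hosts, sigma, v, z=3.0)
t_mrca_years = t_mrca_periods * t_rec_days / 365.0

print("T_MRCA2 in infectious periods:", t_mrca_periods)
print("T_MRCA2 in years:", t_mrca_years)
```

With these inputs the estimate is a few years, comparable to the observed H3N2 value in Table 1. Exact agreement is not expected, since the expressions are approximate and the more accurate forms are given in S1 Appendix.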

Comparison with influenza A data

To test the model, we compared its predictions with available data on influenza A H3N2 and H1N1, as follows. The input parameters of the model and the output (predicted) parameters are summarized in Table 1. The values of input parameters such as population size N, reproduction ratio in the absence of immune recognition R0 (during a major pandemic caused by antigenic shift), and recovery time trec have been measured [7, 13, 30, 31]. In contrast, parameters a and Ub result from biological interactions at multiple biological scales (cell, host, population) and are hard to measure directly. On the other hand, data on two parameters predicted by the model, TMRCA2 and substitution rate c, are available. Therefore, we opted to adjust the unknown input parameters a and Ub to fit available data for the two predicted parameters (Fig 4A). We assumed a total susceptible population of N = 10⁸ individuals, which corresponds to a large country.

Fig 4. For influenza A virus, the model predicts annual incidence and cross-immunity which agree with observations.


Shown is the best fit to combined immunological, epidemiological, and evolutionary data available on influenza A strains H3N2 (red and blue colors) and H1N1 (magenta and cyan colors). (A) X and Y-axis are the cross-immunity scale, a, and the mutation rate per genome per transmission event, Ub, respectively. Analytic predictions for the evolution speed c (red and magenta curve, Eq 13) and TMRCA2 (blue and cyan, Eq 15 with z = 3) are shown as contours of constant heights taken from data [7] (Extended Data Table 1 and refs). Population size is estimated N ∼ 10⁸ [31]. Dashed lines show the intersection points where both parameters fit experimental values. (B) Solid curves: The same three quantities for H3N2 as a function of population number N at the best-fit values of a and Ub. Dashed lines correspond to N = 10⁸. (A and B) Input from data [7, 31]: R0 = 1.8, c = 2.6 AA/year, TMRCA2 = 3.0 years for H3N2 and R0 = 1.46, c = 2.3 AA/year, TMRCA2 = 4.6 years for H1N1. Infection cycle time trec = 5 days. Predicted annual incidence of infection of (4–7)% and the cross-immunity scale a = (14–15) AA are in good agreement with independent data [37].

It is evident that strain H3N2 has a faster evolution rate and a shorter time TMRCA2 than strain H1N1 due to a larger value of R0 causing, in turn, a larger average selection coefficient σ. The values of Ub and a for the two strains are similar (Fig 4A).

The best-fit values for the cross-immunity distance, a = 14–15, agree very well with independent data on equine influenza [37], which represents a direct confirmation of the model. The predicted annual incidence in humans of (4–7)% also falls within the experimentally observed range and previous modeling estimates [12, 13, 15]. Interestingly, the model explains the inverse correlation between TMRCA2 and evolution rate c reported previously for H3N2, H1N1 and two strains of influenza B [7]. Indeed, the predicted evolution rate c is linearly proportional to the effective selection coefficient σ ∼ R0/a, while TMRCA2 is inversely proportional to σ. The dependence of c and TMRCA2 on the other parameters, Ub and N, is logarithmically slow.

To generalize our results for epidemics occurring on larger or smaller scales, we calculated the dependence of c, TMRCA2, and the annual incidence on population size N (Fig 4B). The sensitivity of our predictions to input parameters Ub, a, and R0 has also been tested (S1 Appendix, S3 and S4 Figs). Thus, traveling wave theory with modest selection predicts logarithmic dependence of the speed on population size (Fig 4B).

Results are robust to the existence of additional dimensions of antigenic space

Epidemiological data demonstrate that, a priori, antigenic space is not one-dimensional but has fractal nature and fractal dimensionality more than 1 [8, 31]. To demonstrate the weak sensitivity of our model to the existence of additional dimensions, we extended our model to a discrete random tree of epitope variants and solved it numerically (S1 Appendix, S6 Fig). Phylogeny demonstrates quasi-1D behavior comprising a long trunk of permanently fixed mutations and short branches representing transient virus variants and resembling the actual influenza H3N2 phylogeny [4, 12, 13, 15]. We also confirmed the formation of a 1D traveling wave for two-dimensional genetic space (S5 Fig).

Discussion

We investigated stochastic evolutionary dynamics of a virus driven by the pressure to escape immune recognition in previously infected individuals. We mapped this problem to an evolutionary model with fitness landscape expressed in terms of the cross-immunity function K(x) (Fig 2). Stochastic evolution occurs as a traveling wave with two population components structured in the antigenic variant space x, recovered individuals and the currently infected individuals, with different widths and total counts (Fig 1). The recovered distribution is broad and large. The infected distribution represents a narrow and small peak at the recovered distribution front. We expressed several observable parameters including the speed of viral evolution, the annual incidence of infection, and the average time to the most recent ancestor in terms of model parameters N, Ub, R0, K(x) (Table 1). The analytic predictions agree with simulation and are able to estimate correctly important parameters of viral evolution in host populations, as we illustrated using genomic data on influenza.

One of the puzzling aspects of influenza virus evolution is its punctuated nature [4]. While most mutations are almost neutral or have a modest phenotypic (fitness) effect, some represent large jumps in antibody recognition [14]. Our results interpret these jumps as a natural consequence of the stochastic nature of the traveling wave models. The extension of the leading edge of a wave occurs due to adding rare, best available escape alleles. Asexual evolution theory with variable fitness effect of mutations demonstrates that most fixed mutations have a fitness effect in excess of the average fitness effect [23]. Good et al. show that the most likely selection coefficient s* that drives the wave depends on model parameters σ, N, Ub, mapping the results either onto the multiple-mutation (MM) model with fixed s [18–21] or the two-site clonal interference (CI) model [35, 36]. The present work demonstrates that influenza virus evolves within the MM regime near the border with the CI regime (Fig 3). In this region, the fitness effect of a fixed allele is predicted to fluctuate strongly around the most likely value s*, which represents a possible explanation of the punctuated effect.

An SIR model with immune memory and 1D antigenic space (Eq 1) has been previously proposed by Lin et al. [11]. Their analysis differs from ours in two critical aspects. Firstly, their approach to viral evolution was completely deterministic, i.e., it assumes infinite population size. In fact, the effect of clonal interference acting in a finite population diminishes the antigenic return on additional mutations. Secondly, their mutation term in Eq 1 had a diffusion form proportional to the second derivative of the infected individual density, ∂²i(x, t)/∂x². This approximation would be correct if the front edge of the wave was smooth. As we discuss in S1 Appendix, neither approximation holds at low mutation rates, Ub ∼ 10⁻⁴. As a result, the approach of Lin et al. predicts evolution speeds far below simulation results. The traveling wave approach employed here naturally accounts for both the stochastic effects and the steepness of the leading edge. Future development of this model requires inclusion of finite mutation cost [39].

Our analytic results agree with the numeric results of a previous simulation by Bedford et al [12]. Using a similar model, they predicted the same incidence range for influenza A and the same range for the evolution speed, and they interpreted the quasi-one-dimensional trajectory in the genetic space that we have also observed (S5 and S6 Figs). As starting parameters, they assumed a mutation rate Ub ∼ 10⁻⁴ and set the cross-immunity distance to a = 1/0.07 based on equine flu data [37]. By comparison, here we determine Ub and a a posteriori by fitting human H3N2 and H1N1 data on c and TMRCA from the cited work [7]. We test the model by comparing our prediction with the experimental value of a [37].
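The comparison of the two values of a is simple arithmetic, which can be checked as follows. The interval a = (14 − 15) amino acid substitutions is the fitted estimate quoted in the Abstract; the variable and function names are hypothetical and serve only this illustration.

```python
def cross_immunity_distance(rate_per_substitution: float) -> float:
    """Return a = 1 / rate, the distance used as input by Bedford et al [12]."""
    return 1.0 / rate_per_substitution


# Value assumed a priori in the simulation, based on equine influenza data [37].
a_equine = cross_immunity_distance(0.07)

# Interval obtained here a posteriori from the fit to H3N2 and H1N1 data (Abstract).
fitted_low, fitted_high = 14.0, 15.0
inside_fitted_interval = fitted_low <= a_equine <= fitted_high

if __name__ == "__main__":
    print(f"a = 1/0.07 = {a_equine:.2f} substitutions")
    print(f"inside the fitted interval ({fitted_low:g}, {fitted_high:g}): "
          f"{inside_fitted_interval}")
```

The value 1/0.07 ≈ 14.3 falls inside the fitted interval, which is the agreement referred to above.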

Conclusion

Merging the standard epidemiological approach and the modern traveling wave theory, we develop a general analytic approach that connects epidemiological and immunological parameters to the observed parameters of influenza evolution. We demonstrate that the distribution of recovered individuals in the genetic space effectively creates a fitness landscape for the distribution of infected individuals, and that both distributions move together along a quasi-one-dimensional path. Our predictions agree well with data on influenza A H3N2.

Supporting information

S1 Appendix. Mathematical appendix.

(PDF)

S1 Fig. Theory of clonal interference with relative fitness linear in antigenic coordinate is accurate at small mutation rates and approximately correct at intermediate rates.

(TIFF)

S2 Fig. Finite population size N eliminates the artifact of “mirror wave”.

(TIFF)

S3 Fig. Dependence of the wave speed and incidence on the population size.

(TIFF)

S4 Fig. Dependence of the wave speed and incidence on the cross-immunity scale.

(TIFF)

S5 Fig. Two-dimensional influenza model predicts spontaneous development of a stable 1D-like traveling wave starting from a flat front.

(TIFF)

S6 Fig. Phylogenetic tree of virus strains existing at different times in a multi-dimensional antigenic space projected onto 2D.

(TIFF)

Acknowledgments

This work was initiated in extensive discussions with Michael Lässig. I.M.R. is grateful to Eric Brunet for valuable suggestions and discussions.

Data Availability

All data in the work are from published studies cited in the text.

Funding Statement

This work has been partly supported by Deutsche Forschungsgemeinschaft grant SFB 680/C2 to Michael Lässig, http://www.dfg.de/, and Agence Nationale de Recherche grant J16R389 to IMR, http://www.agence-nationale-recherche.fr/. The funding agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Metzger VT, Lloyd-Smith JO, Weinberger LS. Autonomous targeting of infectious superspreaders using engineered transmissible therapies. PLoS Comput Biol. 2011;7(3):e1002015. doi: 10.1371/journal.pcbi.1002015
  • 2. Rouzine IM, Weinberger LS. Design requirements for interfering particles to maintain coadaptive stability with HIV-1. J Virol. 2013;87(4):2081–2093. doi: 10.1128/JVI.02741-12
  • 3. Rouzine IM, Weinberger LS. Reply to “Coadaptive stability of interfering particles with HIV-1 when there is an evolutionary conflict”. J Virol. 2013;87(17):9960–9962. doi: 10.1128/JVI.00932-13
  • 4. Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus AD, et al. Mapping the antigenic and genetic evolution of influenza virus. Science. 2004;305(5682):371–376. doi: 10.1126/science.1097211
  • 5. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008;453(7195):615–619. doi: 10.1038/nature06945
  • 6. Russell CA, Jones TC, Barr IG, Cox NJ, Garten RJ, Gregory V, et al. Influenza vaccine strain selection and recent studies on the global migration of seasonal influenza viruses. Vaccine. 2008;26 Suppl 4:D31–34. doi: 10.1016/j.vaccine.2008.07.078
  • 7. Bedford T, Riley S, Barr IG, Broor S, Chadha M, Cox NJ, et al. Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature. 2015;523(7559):217–220. doi: 10.1038/nature14460
  • 8. Koel BF, Burke DF, Bestebroer TM, van der Vliet S, Zondag GC, Vervaet G, et al. Substitutions near the receptor binding site determine major antigenic change during influenza virus evolution. Science. 2013;342(6161):976–979. doi: 10.1126/science.1244730
  • 9. Fonville JM, Wilks SH, James SL, Fox A, Ventresca M, Aban M, et al. Antibody landscapes after influenza virus infection or vaccination. Science. 2014;346(6212):996–1000. doi: 10.1126/science.1256427
  • 10. Neher RA, Bedford T, Daniels RS, Russell CA, Shraiman BI. Prediction, dynamics, and visualization of antigenic phenotypes of seasonal influenza viruses. Proc Natl Acad Sci USA. 2016;113(12):E1701–1709. doi: 10.1073/pnas.1525578113
  • 11. Lin J, Andreasen V, Casagrandi R, Levin SA. Traveling waves in a model of influenza A drift. J Theor Biol. 2003;222(4):437–445. doi: 10.1016/S0022-5193(03)00056-0
  • 12. Bedford T, Rambaut A, Pascual M. Canalization of the evolutionary trajectory of the human influenza virus. BMC Biol. 2012;10:38. doi: 10.1186/1741-7007-10-38
  • 13. Strelkowa N, Lassig M. Clonal interference in the evolution of influenza. Genetics. 2012;192(2):671–682. doi: 10.1534/genetics.112.143396
  • 14. Bedford T, Suchard MA, Lemey P, Dudas G, Gregory V, Hay AJ, et al. Integrating influenza antigenic dynamics with molecular evolution. Elife. 2014;3:e01914. doi: 10.7554/eLife.01914
  • 15. Luksza M, Lassig M. A predictive fitness model for influenza. Nature. 2014;507(7490):57–61. doi: 10.1038/nature13087
  • 16. Gog JR, Rimmelzwaan F, Osterhaus ADME, Grenfell BT. Population dynamics of rapid fixation in cytotoxic T lymphocyte escape mutants of influenza A. Proc Natl Acad Sci USA. 2003;100:11143–11147. doi: 10.1073/pnas.1830296100
  • 17. Tsimring LS, Levine H, Kessler DA. RNA virus evolution via a fitness-space model. Phys Rev Lett. 1996;76(23):4440–4443. doi: 10.1103/PhysRevLett.76.4440
  • 18. Rouzine IM, Wakeley J, Coffin JM. The solitary wave of asexual evolution. Proc Natl Acad Sci USA. 2003;100(2):587–592. doi: 10.1073/pnas.242719299
  • 19. Desai MM, Fisher DS. Beneficial mutation selection balance and the effect of linkage on positive selection. Genetics. 2007;176(3):1759–1798. doi: 10.1534/genetics.106.067678
  • 20. Rouzine IM, Brunet E, Wilke CO. The traveling-wave approach to asexual evolution: Muller’s ratchet and speed of adaptation. Theor Popul Biol. 2008;73(1):24–46. doi: 10.1016/j.tpb.2007.10.004
  • 21. Brunet E, Rouzine IM, Wilke CO. The stochastic edge in adaptive evolution. Genetics. 2008;179(1):603–620. doi: 10.1534/genetics.107.079319
  • 22. Hallatschek O. The noisy edge of traveling waves. Proc Natl Acad Sci USA. 2011;108(5):1783–1787. doi: 10.1073/pnas.1013529108
  • 23. Good BH, Rouzine IM, Balick DJ, Hallatschek O, Desai MM. Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc Natl Acad Sci USA. 2012;109(13):4950–4955. doi: 10.1073/pnas.1119910109
  • 24. Desai MM, Walczak AM, Fisher DS. Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics. 2013;193(2):565–585. doi: 10.1534/genetics.112.147157
  • 25. Neher RA, Hallatschek O. Genealogies of rapidly adapting populations. Proc Natl Acad Sci USA. 2013;110(2):437–442. doi: 10.1073/pnas.1213113110
  • 26. Grenfell BT, Pybus OG, Gog JR, Wood JL, Daly JM, Mumford JA, et al. Unifying the epidemiological and evolutionary dynamics of pathogens. Science. 2004;303(5656):327–332. doi: 10.1126/science.1090727
  • 27. Acevedo A, Brodsky L, Andino R. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature. 2014;505(7485):686–690. doi: 10.1038/nature12861
  • 28. Poulin R. Evolutionary Ecology of Parasites. Princeton University Press; 2007.
  • 29. Rouzine IM, Rodrigo A, Coffin JM. Transition between stochastic evolution and deterministic evolution in the presence of selection: general theory and application to virology [review]. Microbiol Mol Biol Rev. 2001;65:151–185. doi: 10.1128/MMBR.65.1.151-185.2001
  • 30. Carrat F, Vergu E, Ferguson NM, Lemaitre M, Cauchemez S, Leach S, et al. Time lines of infection and disease in human influenza: a review of volunteer challenge studies. Am J Epidemiol. 2008;167(7):775–785. doi: 10.1093/aje/kwm375
  • 31. Biggerstaff M, Cauchemez S, Reed C, Gambhir M, Finelli L. Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. BMC Infect Dis. 2014;14:480. doi: 10.1186/1471-2334-14-480
  • 32. Astier S. Principles of Plant Virology. Science Publishers; 2007.
  • 33. Nowak MA. Evolutionary Dynamics: Exploring the Equations of Life. Harvard University Press; 2006.
  • 34. Rice SH. Evolutionary Theory: Mathematical and Conceptual Foundations. Sinauer Associates; 2004.
  • 35. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102-103(1-6):127–144. doi: 10.1023/A:1017067816551
  • 36. Schiffels S, Szollosi GJ, Mustonen V, Lassig M. Emergent neutrality in adaptive asexual evolution. Genetics. 2011;189(4):1361–1375. doi: 10.1534/genetics.111.132027
  • 37. Park AW, Daly JM, Lewis NS, Smith DJ, Wood JL, Grenfell BT. Quantifying the impact of immune escape on transmission dynamics of influenza. Science. 2009;326(5953):726–728. doi: 10.1126/science.1175980
  • 38. Brunet E, Derrida B, Mueller AH, Munier S. Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization. Phys Rev E Stat Nonlin Soft Matter Phys. 2007;76(4 Pt 1):041104. doi: 10.1103/PhysRevE.76.041104
  • 39. Batorsky R, Sergeev RA, Rouzine IM. The Route of HIV Escape from Immune Response Targeting Multiple Sites Is Determined by the Cost-Benefit Tradeoff of Escape Mutations. PLoS Comput Biol. 2014;10:e1003878. doi: 10.1371/journal.pcbi.1003878

Articles from PLoS Pathogens are provided here courtesy of PLOS
