Abstract
The most basic stochastic epidemic models are those involving global transmission, meaning that infection rates depend only on the type and state of the individuals involved, and not on their location in the population. Simple as they are, there are still several open problems for such models. For example, when will such an epidemic go extinct and with what probability (questions depending on the population being fixed, changing or growing)? How can a model be defined explaining the sometimes observed scenario of frequent mid-sized epidemic outbreaks? How can evolution of the infectious agent transmission rates be modelled and fitted to data in a robust way?
Keywords: Stochastic epidemics, Global transmission, Extinction, Genetic evolution, Endemicity
1. Introduction and classification
Epidemic processes are essentially stochastic, but stochastic epidemic models have not had a straightforward history. That epidemics proceed by chance contacts with individuals was understood from the earliest days of modelling, but early modelling developments were deterministic. The development of stochastic models, from the 1950s onward (e.g. Bailey, 1950; Bartlett, 1956), was in parallel with developments in techniques, starting with models that dealt in total numbers of infecteds, susceptibles, etc. Individual-based models came in first to deal with spatial populations (1970s), with subsequent developments related to computer methodology (simulations, inference) and network theory.
Stochastic models can conveniently be classified according to whether their contact structure is global, network, metapopulation or spatial. Given the many other aspects of disease to be modelled, there is good reason to model contact structure as simply as possible. Models with too many parameters cannot usefully be fitted: as von Neumann is reputed to have said, ‘Give me four parameters and I will draw you an elephant, five and I will have him wave his trunk’.
The simplest contact structure is no structure, often referred to as either global or homogeneous mixing (Mollison, 1995). Individuals’ probabilities of interaction do not depend on their location in the population, such as their social group or spatial location. Global models can incorporate individual heterogeneity, for example by having different rates of infection for individuals of different age, sex, or infection history. Numerous examples of (deterministic) global models, over the range of diseases important to humans, can be found in Anderson and May (1992).
Network epidemic models (Pellis et al., in this volume) are more difficult to define. Any individual-based epidemic model can be thought of as a network or random graph: with individuals as nodes, and infection of one by another as a link. The question is rather whether network theory can be usefully applied. In recent years network models have been notably successful in analysing models where individuals vary greatly in their number of contacts (the degree distribution of the underlying graph).
Metapopulation models (Ball et al., in this volume) deal with cases where interactions do depend on social group. The basic case is where the population is partitioned into non-overlapping groups, e.g. households; individuals have one contact rate with individuals in different groups, and another (higher) rate for individuals in the same group. More general metapopulation models allow an individual to belong to several different types of group, each with its own contact rate, or allow more levels of mixing.
Spatial models (Riley et al., in this volume) vary from simple lattices with only nearest–neighbour interactions, for which some theoretical analysis is possible, to complex models with long-distance interactions, for which only qualitative and approximate results are known. A key feature of spatial models is that they display slower than exponential growth, even in their earliest stage; this makes it difficult to approximate them adequately by deterministic models, and even to define threshold parameters.
As a simple example to illustrate these different types of model, consider a disease among two types of individual, male and female. In each case consider a simple Markov SIR process, in which infected individuals (I) have an exponentially distributed infectious period before being removed (R), during which they may infect susceptibles (S) as follows. First, suppose that the infection rate between any (I,S) pair depends only on the types of the individuals involved (perhaps individuals can only infect others of the opposite sex, and perhaps the rates from male to female and female to male are different); this is a global model. Second, suppose the individuals live distributed between a number of different villages, and that the rates of infection have two levels, with higher infection rates if the (I,S) pair live in the same village, lower if they live in different villages; this is a metapopulation model. Third, suppose instead that the individuals live in a line of houses equally spaced along a street, and that the infection rate between I and S depends on the distance between the houses they live in (normally one would take this to be a decreasing function of distance); this is a spatial model. Finally, in any of these populations, suppose that we think of individuals as vertices of a graph, with edges of the graph connecting pairs that have some kind of social relationship; and then take rates of infection between connected individuals that only depend on their type; this is a network model. Note that all the other three examples can be considered as network models, if we draw edges between all pairs of individuals (“everyone knows everyone”), and add dependence of infection rates on village or distance as appropriate.
We are now ready to state our first challenge, namely: is this classification into global, network, metapopulation and spatial models sufficient for the range of contact structures of interest in understanding infectious disease dynamics?
The focus of the present paper is global stochastic epidemic models, where any (infectious) individual may infect any other (susceptible) individual at a transmission rate that may differ between different pairs of individuals, but should be of the same order 1/N (or 0), where N is the population size. The simplest model assumption is where all transmission rates are identical, which is called a homogeneously mixing population of homogeneous individuals, but one may also assume different mixing rates and/or that individuals are of different types with respect to susceptibility and/or infectivity. As we shall see in this paper, there are several open problems even for global epidemic models (having transmission only on a global scale). In real-world epidemics there is of course nearly always some local structure within which transmission is much higher. Still, results for global epidemic models have undoubtedly been most influential in affecting health policies, and for highly transmissible diseases global mixing is often a reasonable approximation.
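To make the simplest case concrete, the following is a minimal sketch of a homogeneously mixing Markov SIR epidemic, simulated event by event. It assumes a closed population of size N, a rate beta/N for each (infective, susceptible) pair, and a recovery rate gamma per infective; the function name `simulate_sir` and the parameter names are illustrative choices, not notation taken from the paper.

```python
import random


def simulate_sir(n, beta, gamma, initial_infectives=1, seed=None):
    """Simulate a global (homogeneously mixing) Markov SIR epidemic.

    Assumptions (illustrative, not from the paper): closed population of
    size n, each (I,S) pair has infection rate beta / n, and each infective
    recovers at rate gamma. Returns the path as a list of (time, S, I).
    """
    rng = random.Random(seed)
    s, i = n - initial_infectives, initial_infectives
    t = 0.0
    path = [(t, s, i)]
    while i > 0:
        infection_rate = beta * s * i / n  # s * i pairs, each at rate beta / n
        recovery_rate = gamma * i
        total = infection_rate + recovery_rate
        t += rng.expovariate(total)  # exponential waiting time to next event
        if rng.random() < infection_rate / total:
            s, i = s - 1, i + 1  # infection: S -> I
        else:
            i -= 1  # removal: I -> R
        path.append((t, s, i))
    return path


if __name__ == "__main__":
    n = 1000
    path = simulate_sir(n, beta=2.0, gamma=1.0, seed=1)
    end_time, s_end, _ = path[-1]
    print(f"epidemic ended at t={end_time:.2f}; ever infected: {n - s_end}")
```

Repeated runs with different seeds typically split into minor outbreaks that die out almost immediately and major outbreaks that infect a substantial fraction of the population, a distinction that a deterministic model cannot show.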
Having specified identical transmission rates (between all pairs of individuals) does not define the model uniquely. Other aspects to consider in formulating a stochastic model include the following.
Type of epidemic model
An SI model is one in which Susceptibles may become infected and infectious, and if they do, they remain infectious forever. In an SIR model, individuals that are infectious (from now on denoted Infectives) eventually recover from the disease and become immune for the rest of their lives (measles and chicken-pox being two examples). An SIS model is one in which infectives, rather than recovering and becoming immune, recover and enter the susceptible state again. SEIS models allow for a latent (E for exposed) state in which an individual has already been infected but has not yet started to shed virus or bacteria. Other examples, hopefully self-explanatory, are SEIR, SIRS, SEIRS, …
Treatment of time
Is the time evolution of the epidemic of interest or only the end/final state of an outbreak? Is discrete or continuous time more appropriate? Do all rates/probabilities obey the practical Markov property (that future events only depend on present states and not the history, meaning that all underlying distributions are exponentials), or are durations not all exponentially distributed?
Population
Are we considering a fixed and finite population of size N, or a population having births and deaths but fluctuating randomly around N, or even a growing population? If the time-frame of interest is short, then a fixed population model might suffice, whereas if interest is on longer periods, a dynamic population is more realistic, thus allowing for influx of new individuals. If the population size fluctuates randomly around N it will eventually die out with probability 1 (and the disease will go extinct before this happens) so questions of interest then relate to population-disease properties prior to extinction (quasiendemic) and the length of time to extinction of the disease. Alternatively, if the population grows, then it may happen that the disease will remain present in the population for ever (endemic situation).
Fluctuations over time
Do all event rates stay the same over time except for the numbers “at risk”? The simplest models answer this question with a yes, but there are situations where this is clearly not the case, for example when the infectious agent evolves on the same time scale as the epidemic outbreak, and/or because individuals start taking precautions as more and more people are struck by the disease. A (perhaps simpler) fluctuation over time is where individuals and/or transmission rates change over time for reasons other than the epidemic itself. Examples include seasonality due to school term and school closure, but also varying transmission rates due to changes in temperature.
These types of questions are dealt with in the remainder of the current paper, and several challenges for these types of models are listed.
2. Endemicity: persistence of infection
Bartlett’s seminal paper (Bartlett, 1956) highlighted a severe inadequacy of deterministic models in describing the persistence of infection in an SIR (or similar) process with demography: fluctuations in the prevalence of infection about the endemic level can often be large enough for transmission to be interrupted by stochastic fadeout. Using a stochastic linearization approach, Bartlett estimated the magnitude of these fluctuations and characterized the critical community size required for the persistence of such infections (most notably, for measles). This approach, later formalized in terms of an Ornstein–Uhlenbeck process, provides the basis of later work that derives approximations for the time to extinction when starting at the endemic (quasi-)equilibrium (e.g. Nåsell, 1999, and others). Improved approximations can be obtained using large deviation theory (e.g. Kamenev and Meerson, 2008).
The question of endemic persistence is most pointed for a newly-introduced infection given that the initial epidemic is the most severe. While it is well known how to compute the probability that an epidemic takes off when N is large (e.g. Ball, 1983), a more challenging question is how to calculate the probability that the infection persists through the trough that follows the initial epidemic. In particular, how does this probability depend on the parameters of the infection process (i.e. the transmission parameter and recovery rate), the birth rate and the population size? van Herwaarden (1997) provides an approximate answer, obtained by asymptotic solution of a boundary value problem applied to a Fokker–Planck equation, and more recently, Meerson and Sasorov (2009) have used large deviation theory and the WKB (Wentzel-Kramers-Brillouin) approximation approach to attack the problem.
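For the first of these questions, the probability that an epidemic takes off, the standard branching-process argument can be sketched as follows for the Markov SIR model. The symbols are assumptions introduced here for illustration: a single initial infective in a large, otherwise susceptible population, with each infective making infectious contacts at rate \beta and recovering at rate \gamma. Conditioning on whether the first event for an infective is a transmission or a recovery gives a quadratic equation for the extinction probability q of the approximating branching process, whose smallest root in [0,1] is the one required.

```latex
% Assumptions (illustrative notation): one initial infective, N large,
% infectious contacts at rate \beta, recovery at rate \gamma.
\begin{align*}
R_0 &= \frac{\beta}{\gamma}, \\
q &= \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}\, q^{2}
  \quad\Longrightarrow\quad
  q = \min\!\left(1, \frac{\gamma}{\beta}\right)
    = \min\!\left(1, \frac{1}{R_0}\right), \\
\Pr(\text{major outbreak}) &= 1 - q = 1 - \frac{1}{R_0}
  \qquad (R_0 > 1).
\end{align*}
```

With m initial infectives the corresponding extinction probability is q^m. No comparably simple expression is available for the probability of surviving the trough after the first epidemic, which is why that question calls for the approximations cited above.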
Challenges remain in extending this work beyond the simplest settings, for instance when there is extrinsic seasonal variation in transmission (e.g. the seasonally forced outbreaks of measles in the pre-vaccine era) or for infections with more complex lifecycles (e.g. vector-borne infections).
3. Near-critical behaviour
Many disease systems of interest are neither strongly supercritical (with large outbreaks possible), nor subcritical (with large outbreaks impossible), but instead exhibit ‘stuttering’ behaviour of repeated, midsized outbreaks. This is particularly true for emerging zoonotic infections (Lloyd-Smith et al., 2009) and diseases where transmission has been significantly reduced due to eradication or elimination efforts (Klepac et al., 2013). Blumberg and Lloyd-Smith (2013) review approaches to this problem based on estimation of the parameters of a subcritical branching process; however, this problem is inherently extremely hard and has already been identified by Lloyd-Smith et al. (2009) as an issue requiring additional attention from modellers. In particular, the clustering of unvaccinated individuals (see also the paper ‘Network Models’ in this journal issue) means that the homogeneous mixing assumption underlying commonly used branching process methods may be inappropriate. Even once an appropriate model has been selected, data that are available are likely to be at best weakly informative about the value of R0.
A key challenge for understanding diseases that are emerging, or close to elimination, is therefore to obtain a thorough understanding of the information content of near-critical branching processes, together with methods for data collection and quantification of relevant uncertainties.
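The idea of estimating R0 from the sizes of subcritical transmission chains can be illustrated with a minimal sketch. It assumes, purely for illustration, a homogeneous branching process with Poisson offspring and fully observed chains, each started by one case; this is a simplification and not necessarily one of the methods compared in the review cited above. Since the mean total chain size of a subcritical branching process is 1/(1 - R0), a simple moment estimator is 1 - 1/(mean chain size). The function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Sample a Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def chain_size(rng, r0, cap=10**6):
    """Total number of cases in one chain started by a single case.

    Assumption: each case infects a Poisson(r0) number of others. The cap
    only guards against very long chains when r0 is close to or above 1.
    """
    size, active = 1, 1
    while active > 0 and size < cap:
        offspring = sum(poisson(rng, r0) for _ in range(active))
        size += offspring
        active = offspring
    return size


def estimate_r0(sizes):
    """Moment estimator based on E[chain size] = 1 / (1 - R0) for R0 < 1."""
    mean_size = sum(sizes) / len(sizes)
    return 1.0 - 1.0 / mean_size


if __name__ == "__main__":
    rng = random.Random(42)
    for true_r0 in (0.5, 0.9):
        sizes = [chain_size(rng, true_r0) for _ in range(200)]
        print(f"true R0={true_r0}, estimate={estimate_r0(sizes):.3f}, "
              f"largest chain={max(sizes)}")
```

As R0 approaches 1 the chain-size distribution becomes very heavy-tailed, so estimates from a modest number of observed chains vary widely from sample to sample. This is one way to see why such data are at best weakly informative near criticality.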
4. Epidemics in growing populations
Rigorous analysis of stochastic SIR epidemics has mainly focussed on static populations, which do not allow for demographic turnover through births and deaths. There is a need for models of such epidemics in populations with demographic turnover, and for further extensions to populations with some social structure described through households or a network model.
If a population with demographic turnover has a large (quasi) stationary state, then an SIR epidemic will go extinct if there is no importation of the disease from outside (e.g. Section 4.7, Diekmann et al., 2013). However, it is still possible to distinguish between subcritical epidemics in which the epidemic will die out quickly and supercritical epidemics in which it takes an exponential (in the stationary population size) time to go extinct.
If the population is growing, e.g. if the population grows according to a birth and death process, then it is possible that the epidemic survives forever. Ignoring population structure, such a model has been studied in Britton and Trapman (2014). It is shown that there are different regimes of survival. It might be that the epidemic survives, but the number of infectious individuals increases at a slower speed than the population does, so the fraction of infected individuals goes to 0. It is also possible that the population and epidemic reach equilibrium and the fraction of infectious individuals converges to a constant. Some theoretical mathematical questions are still open (cf. Britton and Trapman, 2014), in addition to relevant challenges from an epidemiological perspective. Examples of challenging questions are: Can an epidemic spread so fast that, because of the quick depletion of susceptible individuals, after the first wave of the epidemic the epidemic still dies out with relatively large probability? If yes, what is the probability of this relatively fast extinction, and how does it depend on the model parameters?
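Questions of this kind can be explored numerically with a small simulation. The sketch below assumes one possible formulation, which should not be read as the exact model of Britton and Trapman (2014): a linear birth and death population (per-capita birth rate lam, per-capita death rate mu, newborns susceptible) with a Markov SIR epidemic in which infections occur at total rate beta * S * I / n, where n is the current population size, and infectives recover at rate gamma. All names are hypothetical.

```python
import random


def simulate_growing_sir(s0, i0, lam, mu, beta, gamma, t_max,
                         seed=None, max_events=1_000_000):
    """Markov SIR epidemic in a linear birth-death population (a sketch).

    Assumed model: births at rate lam per individual (newborns susceptible),
    deaths at rate mu per individual, infections at total rate
    beta * S * I / n with n = S + I + R, recoveries at rate gamma per
    infective. Stops at t_max, at disease extinction, or after max_events.
    Returns (time, S, I, R).
    """
    rng = random.Random(seed)
    s, i, r, t = s0, i0, 0, 0.0
    for _ in range(max_events):
        n = s + i + r
        if i == 0 or n == 0:
            break
        rates = [
            lam * n,           # 0: birth of a susceptible
            mu * s,            # 1: death of a susceptible
            mu * i,            # 2: death of an infective
            mu * r,            # 3: death of a recovered individual
            beta * s * i / n,  # 4: infection
            gamma * i,         # 5: recovery
        ]
        total = sum(rates)
        if total <= 0:
            break
        dt = rng.expovariate(total)
        if t + dt > t_max:
            t = t_max
            break
        t += dt
        # Choose the event type with probability proportional to its rate.
        u = rng.random() * total
        event = 0
        while event < 5 and u >= rates[event]:
            u -= rates[event]
            event += 1
        if event == 0:
            s += 1
        elif event == 1:
            s -= 1
        elif event == 2:
            i -= 1
        elif event == 3:
            r -= 1
        elif event == 4:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
    return t, s, i, r


if __name__ == "__main__":
    t, s, i, r = simulate_growing_sir(200, 5, lam=0.03, mu=0.01,
                                      beta=2.0, gamma=1.0, t_max=50.0, seed=1)
    print(f"t={t:.1f}, S={s}, I={i}, R={r}")
```

Running the function over many seeds and recording how often the number of infectives hits zero shortly after the first wave gives a crude Monte Carlo estimate of the fast-extinction probability under these assumptions, though it offers no analytical insight into how that probability depends on the parameters.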
The real challenge, however, lies in taking network structure into account in growing populations. We consider the most basic model, in which the population is governed by a linear birth and death model. Newly born individuals do not have connections yet; every individual acquires new connections at a fixed rate, and connections are broken at another rate (cf. Britton et al., 2011). On this network a Markov SIR epidemic model can be considered. In addition to the open questions which already appear in unstructured populations, questions arise due to dependencies which naturally appear in those networks (Britton et al., 2011). Even writing down an expression for R0 in this model is not trivial (see Leung et al., 2012 for a similar model). One possible way to attack this open question is to work via infinite-type branching processes, where the type of an infectious individual is its age at the moment of its infection. Adaptations of methods from Ball et al. (2013) might be used to give (implicit) expressions for R0 and the probability of extinction of an SIR epidemic introduced by a single individual in an already large population.
5. Mutation and evolution
How can we represent the process of pathogen mutation (an inherently stochastic process) and associated fitness change within global epidemic models so as to capture observed evolutionary patterns with sufficient accuracy for the question at hand?
Patterns of incidence for all host-pathogen systems are influenced by evolution. However, the scale at which these changes can be observed in both time and space varies massively from one system to another. For many important human viruses, such as smallpox (prior to its eradication) and measles, rates are so slow they can safely be ignored. However, for antigenically variable viruses such as influenza and dengue, and for most bacteria, models that do not represent evolutionary processes in some way fail to capture even coarse patterns of incidence beyond relatively short periods of time or distances. The results of evolution can be seen directly when genotypic or phenotypic data are available, such as in the antigenic relationship between circulating strains of viral infections or in the proportion of bacterial isolates resistant to a specific treatment. Also, these phenotypic traits often drive crude measures of incidence even when they are not observed. The joint representation of evolutionary phylogenies and epidemic dynamics within the same quantitative framework is often referred to as phylodynamics (Grenfell et al., 2004).
Influenza in humans is the canonical example of an antigenically variable pathogen evolving rapidly in space and time: globally recommended vaccines need to be updated every few years (Smith et al., 2004) and resistance to established treatments emerges regularly (Graitcer et al., 2011) and spreads rapidly. However, despite early progress (Ferguson et al., 2003; Koelle et al., 2006), the representation of evolution at the global scale in a way that can be robustly tested with available data remains challenging (Ratmann et al., 2012). Simulation approaches that represent a subset of a consensus viral genome sequence for each infected individual will undoubtedly be extended to larger host populations with more accurate transport models, until they eventually reach a genuinely global scale.
The challenges presented by bacteria to large epidemic models are different from those of viruses (Gray et al., 2011). Bacterial populations evolve much more slowly and, in general, maintain geographically distinct lineages for much longer periods than do rapidly mutating viruses. Also, relative to point mutations, the recombination of bacteria (in which large portions of genes are exchanged during co-infection of different lineages) is much more important than is the reassortment of segmented viruses such as influenza (in which whole genes are exchanged). Therefore, to date, there has not been sufficient motivation to attempt the construction of global-scale models of key bacterial species such as Staphylococcus aureus. However, the increasing clinical importance of strains resistant to more than one treatment (Levy and Marshall, 2004) may well motivate exactly these types of analyses. In particular, the degree to which excessive use of antibiotics in one population influences the incidence of resistant strains in neighbouring populations is a question that naturally leads to global-scale analyses.
Acknowledgments
All authors are grateful to the Isaac Newton Institute. TB is supported by the Swedish Research Council. TH is supported by the Engineering and Physical Sciences Research Council. ALL is supported by the Research and Policy for Infectious Disease Dynamics (RAPIDD) program of the Science and Technology Directorate, Department of Homeland Security, and Fogarty International Center, National Institutes of Health, and by grants from the National Institutes of Health (R01-AI091980) and the National Science Foundation (RTG/DMS-1246991). SR is supported by Wellcome Trust Project Award 093488/Z/10/Z, R01 TW008246-01 from Fogarty International Centre, The Medical Research Council (UK, Project Grant MR/J008761/1) and the RAPIDD program from Fogarty International Centre with the Science & Technology Directorate, Department of Homeland Security. PT is supported by Vetenskapsrådet (Swedish Research Council) project no. 20105873.
References
- Anderson RM, May RM. Infectious Diseases of Humans. Oxford University Press; 1992.
- Bailey NTJ. A simple stochastic epidemic. Biometrika. 1950;37(3–4):193–202.
- Ball F. The threshold behaviour of epidemic models. J. Appl. Probab. 1983;20(2):227–241.
- Ball F, Sirl D, Trapman P. Epidemics on random intersection graphs. arXiv:1011.4242. 2013.
- Bartlett MS. Deterministic and stochastic models for recurrent epidemics. In: Neyman J, editor. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability. Vol. 4. Berkeley: University of California Press; 1956. pp. 81–109.
- Blumberg S, Lloyd-Smith JO. Comparing methods for estimating R0 from the size distribution of subcritical transmission chains. Epidemics. 2013;5(3):131–145. doi: 10.1016/j.epidem.2013.05.002.
- Britton T, Lindholm M, Turova T. A dynamic network in a dynamic population: asymptotic properties. J. Appl. Probab. 2011;48(4):1163–1178.
- Britton T, Trapman P. Stochastic epidemics in growing populations. Bull. Math. Biol. 2014;76:985–996. doi: 10.1007/s11538-014-9942-x.
- Diekmann O, Heesterbeek H, Britton T. Mathematical Tools for Understanding Infectious Diseases Dynamics. Princeton University Press; 2013.
- Ferguson NM, Galvani AP, Bush RM. Ecological and immunological determinants of influenza evolution. Nature. 2003 Mar;422(6930):428–433. doi: 10.1038/nature01509.
- Graitcer SB, Gubareva L, Kamimoto L, Doshi S, Vandermeer M, Louie J, Waters C, Moore Z, Sleeman K, Okomo-Adhiambo M, Marshall SA, St George K, Pan CY, LaPlante JM, Klimov A, Fry AM. Characteristics of patients with oseltamivir-resistant pandemic (H1N1) 2009, United States. Emerg. Infect. Dis. 2011 Feb;17(2):255–257. doi: 10.3201/eid1702.101724.
- Gray RR, Tatem AJ, Johnson JA, Alekseyenko AV, Pybus OG, Suchard MA, Salemi M. Testing spatiotemporal hypothesis of bacterial evolution using methicillin-resistant Staphylococcus aureus ST239 genome-wide data within a Bayesian framework. Mol. Biol. Evol. 2011 May;28(5):1593–1603. doi: 10.1093/molbev/msq319.
- Grenfell BT, Pybus OG, Gog JR, Wood JLN, Daly JM, Mumford JA, Holmes EC. Unifying the epidemiological and evolutionary dynamics of pathogens. Science. 2004 Jan;303(5656):327–332. doi: 10.1126/science.1090727.
- Kamenev A, Meerson B. Extinction of an infectious disease: a large fluctuation in a nonequilibrium system. Phys. Rev. E. 2008;77(6):061107. doi: 10.1103/PhysRevE.77.061107.
- Klepac P, Metcalf CJE, McLean AR, Hampson K. Towards the endgame and beyond: complexities and challenges for the elimination of infectious diseases. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 2013;368(1623):20120137. doi: 10.1098/rstb.2012.0137.
- Koelle K, Cobey S, Grenfell B, Pascual M. Epochal evolution shapes the phylodynamics of interpandemic influenza A (H3N2) in humans. Science. 2006 Dec;314(5807):1898–1903. doi: 10.1126/science.1132745.
- Leung K, Kretzschmar M, Diekmann O. Dynamic concurrent partnership networks incorporating demography. Theor. Popul. Biol. 2012;82(3):229–239. doi: 10.1016/j.tpb.2012.07.001. http://www.sciencedirect.com/science/article/pii/S0040580912000792.
- Levy SB, Marshall B. Antibacterial resistance worldwide: causes, challenges and responses. Nat. Med. 2004 Dec;10(12 Suppl.):S122–S129. doi: 10.1038/nm1145.
- Lloyd-Smith JO, George D, Pepin KM, Pitzer VE, Pulliam JRC, Dobson AP, Hudson PJ, Grenfell BT. Epidemic dynamics at the human–animal interface. Science. 2009;326(5958):1362–1367. doi: 10.1126/science.1177345.
- Meerson B, Sasorov PV. WKB theory of epidemic fade-out in stochastic populations. Phys. Rev. E. 2009;80(4):041130. doi: 10.1103/PhysRevE.80.041130.
- Mollison D. The structure of epidemic models. In: Mollison D, editor. Epidemic Models: Their Structure and Relation to Data. Chapter 2. Cambridge University Press; 1995. pp. 17–33.
- Nåsell I. On the time to extinction in recurrent epidemics. J. Royal Stat. Soc. B. 1999;61(2):309–330.
- Ratmann O, Donker G, Meijer A, Fraser C, Koelle K. Phylodynamic inference and model assessment with approximate Bayesian computation: influenza as a case study. PLoS Comput. Biol. 2012 Dec;8(12):e1002835. doi: 10.1371/journal.pcbi.1002835.
- Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus ADME, Fouchier RAM. Mapping the antigenic and genetic evolution of influenza virus. Science. 2004 Jul;305(5682):371–376. doi: 10.1126/science.1097211.
- van Herwaarden O. Stochastic epidemics: the probability of extinction of an infectious disease at the end of a major outbreak. J. Math. Biol. 1997;35:793–813. doi: 10.1007/s002850050077.