Abstract
There is known heterogeneity between individuals in infectious disease transmission patterns. The source of this heterogeneity is thought to affect epidemiological dynamics but studies tend not to control for the overall heterogeneity in the number of secondary cases caused by an infection. To explore the role of individual variation in infection duration and transmission rate in parasite emergence and spread, while controlling for this potential bias, we simulate stochastic outbreaks with and without parasite evolution. As expected, heterogeneity in the number of secondary cases decreases the probability of outbreak emergence. Furthermore, for epidemics that do emerge, assuming more realistic infection duration distributions leads to faster outbreaks and higher epidemic peaks. When parasites require adaptive mutations to cause large epidemics, the impact of heterogeneity depends on the underlying evolutionary model. If emergence relies on within-host evolution, decreasing the infection duration variance decreases the probability of emergence. These results underline the importance of accounting for realistic distributions of transmission rates to anticipate the effect of individual heterogeneity on epidemiological dynamics.
Keywords: epidemiology, modelling, infection duration, superspreading, evolutionary rescue, emerging infectious diseases
1. Introduction
The expected number of secondary cases produced by an infected individual in a naive population is a key concept in epidemiology [1,2]. It is classically referred to as the basic reproduction number and denoted R0. Only infections with R0 > 1 can cause major outbreaks. However, this mean value does not reflect the impact of superspreading events, where an individual causes an unusually large number of secondary cases [3–8]. The more frequent these events are, the higher the variance in the number of secondary cases, and, therefore, the lower the probability of outbreak emergence and the faster the epidemic growth for outbreaks that do emerge [6].
Several biological processes can explain the heterogeneity in the number of secondary cases [9]. However, models investigating these processes tend only to vary one source of heterogeneity at a time. By doing so, they do not control for the (overall) heterogeneity in the number of secondary cases, which is known to have strong effects, independently of its source [6]. One of the few exceptions suggests that the biology matters, since it finds, for instance, that heterogeneity in host susceptibility has a lesser impact on the probability of emergence than heterogeneity in transmission rate, which can be defined as the product between a contact rate and the probability of transmission given that there is a contact between two individuals [10].
We use a stochastic mechanistic model to explore whether heterogeneity in transmission rates and heterogeneity in infection duration have different effects on epidemic spread. Based on earlier models, we hypothesize that a more homogeneous distribution of infectious period duration decreases the variability of population dynamics in the early outbreak, therefore decreasing the probability of outbreak extinction [11], but also increasing epidemic growth as well as epidemic peak size [11,12]. However, we stress that these hypotheses are based on studies that, in contrast to ours, do not control for variations in the distribution of the number of secondary cases.
Even if initially maladapted (i.e. R0 < 1), a parasite can evolve into a well-adapted strain before fading out and then cause a major outbreak, a phenomenon called ‘evolutionary emergence’ or ‘evolutionary rescue’ [13,14]. Since higher epidemic sizes can be reached more frequently with increasing heterogeneity of secondary cases when R0 < 1 [15], we hypothesize that the source of heterogeneity could affect evolutionary emergence. Since we do not explicitly model the within-host evolution process, we consider two extreme evolutionary processes for a mutant strain with R0 > 1 to appear [14,16]: either by taking over a host infected by the resident strain or during a transmission event.
Following earlier studies [6,15,17], we assume that the number of secondary infections caused by each individual follows a negative-binomial distribution with mean R0 and dispersion parameter k. The smaller k is, the more dispersed the distribution. For example, the 2003 SARS outbreak in Singapore led to many superspreading events and transmission chain analyses estimated that k = 0.16 [6], and recent data from COVID-19 epidemics yielded values of k of the order of 0.3 [18].
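As an illustration of what the dispersion parameter implies, the sketch below evaluates the negative-binomial probability mass function in its mean and dispersion parameterization. The function name `neg_binomial_pmf` and the choice R0 = 1.5 are assumptions made for this example; only k = 0.16 comes from the text.

```python
import math


def neg_binomial_pmf(z, r0, k):
    """P(Z = z) for a negative binomial with mean r0 and dispersion k."""
    log_p = (
        math.lgamma(k + z) - math.lgamma(k) - math.lgamma(z + 1)
        + k * math.log(k / (k + r0))
        + z * math.log(r0 / (k + r0))
    )
    return math.exp(log_p)


# Expected share of infected individuals who cause no secondary case,
# for a strongly dispersed case (k = 0.16) and a geometric case (k = 1).
p_zero_dispersed = neg_binomial_pmf(0, r0=1.5, k=0.16)
p_zero_geometric = neg_binomial_pmf(0, r0=1.5, k=1.0)
```

With the same mean, the smaller k puts more probability mass on zero secondary cases, which is why a few individuals must account for most of the transmission.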
We model individual transmission rates and infection duration values using lognormal distributions, denoted respectively B and Γ. Most models involving ordinary differential equations are ‘memoryless’, that is, the duration of the infections is assumed to be exponentially distributed (CVΓ = 1) (but see [11,12,19]). This is biologically unrealistic for recovery events since they often depend on the number of days since infection [20,21], and tends to overestimate the heterogeneity due to infection duration. We disentangle the specific role of infection duration heterogeneity from that of the secondary cases by varying k and the coefficient of variation (CV) of the infection duration (CVΓ). Those two parameters combined govern the distribution of transmission rate.
We simulate outbreaks, without and with evolution, and measure key summary statistics to analyse the impact of different sources of heterogeneity on properties of emerging outbreaks. We confirm that the dispersion of the distribution of the number of secondary infections is the main driver of the frequency of emergence, but we also find that the source of this heterogeneity has a strong impact on the properties of emerging epidemics, and more interestingly that it can affect the risk of evolutionary emergence.
As an illustration, we compare dynamics that could be obtained with parameters estimated from two outbreaks: SARS in Singapore in 2003 and Ebola in West Africa in 2014, which have similar values of R0 and k [6,22] and different infection duration heterogeneity. We estimate (95% credible interval (CI): 0.44–1.9) for Ebola and 0.27 (95% CI: 0.01–0.80) for SARS. An explanation for that difference is that the Ebola virus is known to sometimes persist in some body fluids after clearance from the blood [23]. Animal studies also show variability in the host immune response against Ebola virus infection, which might allow persistence in some individuals [24,25]. Regarding SARS outbreaks, the reason why some infected individuals spread the virus more than others is thought to be a combination of host and environmental properties. On the biological side, individuals causing superspreading events were older [26], and coinfections have been hypothesized to increase the infectivity of SARS-CoV [27]. On the environmental side, superspreaders had a higher number of close contacts, and the diagnosis of the infection was often delayed [26].
2. Material and methods
(a) Model without evolution
We implement a non-Markovian version of the susceptible–infected–recovered (SIR) epidemiological model [28], which means that not all rates are held constant throughout an infection [29]. We assume that the host population is of fixed size N and that epidemics are initiated by a single infectious individual. At time t, each individual is characterized by its current state (susceptible, infectious or removed), and, if infected, the time at which it will recover.
The first source of heterogeneity in the model comes from the transmission rates and has a behavioural (i.e. contact rates) or a biological (i.e. infectiousness) origin. We model it by drawing the per capita transmission rate βi for each individual i from a lognormal distribution, denoted B, with parameters μB and σB. For mathematical convenience, and without further qualitative impact, we set the mean of B such that $E[B] = 1$. The standard deviation of B is imposed by the choice of the coefficient of variation (CVB), which is equal to $\sqrt{e^{\sigma_B^2} - 1}$.
The second source of heterogeneity comes from the infection duration and has a biological origin. We assume that individuals remain in the I compartment for a time drawn randomly from a lognormal distribution, denoted Γ, with parameters μΓ and σΓ. By construction, the expectation of Γ is R0 in our model and we vary its coefficient of variation (CVΓ), which is equal to $\sqrt{e^{\sigma_\Gamma^2} - 1}$, between 0.05 and 2.
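The mapping between a chosen mean and coefficient of variation and the two lognormal parameters can be sketched as follows. This relies only on the standard lognormal identities (mean $e^{\mu + \sigma^2/2}$ and CV $\sqrt{e^{\sigma^2} - 1}$); the function name `lognormal_params` is hypothetical.

```python
import math


def lognormal_params(mean, cv):
    """Return the log-scale (mu, sigma) of a lognormal with given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)        # from CV^2 = exp(sigma^2) - 1
    mu = math.log(mean) - sigma2 / 2.0      # from mean = exp(mu + sigma^2 / 2)
    return mu, math.sqrt(sigma2)


# Example: an infection duration distribution with mean 1.5 and CV 0.5
mu_gamma, sigma_gamma = lognormal_params(mean=1.5, cv=0.5)
```

The returned pair can be passed directly to a lognormal sampler such as `random.lognormvariate(mu, sigma)`.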
(b) Coefficients of variation and dispersion
Given the construction of our model, the distribution of the number of secondary infections is determined by heterogeneities in transmission rate and infection duration. Since the force of infection over the course of an individual’s infection is the product of two lognormally distributed variables (B and Γ), it is itself lognormally distributed, with parameters $\mu_B + \mu_\Gamma$ and $\sqrt{\sigma_B^2 + \sigma_\Gamma^2}$. The number of secondary infections therefore follows a lognormal-Poisson compound distribution.
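One way to see how k and the two coefficients of variation constrain each other is a second-moment match, sketched below. This is an assumption made for illustration and not necessarily the numerical procedure used to produce the results: it equates the squared CV of the individual force of infection, $(1 + \mathrm{CV}_B^2)(1 + \mathrm{CV}_\Gamma^2) - 1$ for independent factors, with the value $1/k$ implied by a negative-binomial distribution. The function name `cv_transmission` is hypothetical.

```python
import math


def cv_transmission(k, cv_gamma):
    """Transmission-rate CV implied by k and the infection duration CV.

    Moment-matching approximation:
    1 + 1/k = (1 + CV_B^2) * (1 + CV_Gamma^2).
    """
    ratio = (1.0 + 1.0 / k) / (1.0 + cv_gamma ** 2)
    if ratio < 1.0:
        # The duration heterogeneity alone already exceeds what k allows.
        raise ValueError("cv_gamma is too large for this value of k")
    return math.sqrt(ratio - 1.0)


cv_b_low_duration_cv = cv_transmission(k=0.16, cv_gamma=0.27)
cv_b_high_duration_cv = cv_transmission(k=0.16, cv_gamma=0.9)
```

Under this approximation, a larger infection duration CV leaves less room for transmission-rate heterogeneity at fixed k, and no solution exists once the squared duration CV exceeds 1/k.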
(c) Evolutionary emergence model
We introduce an additional class of individuals by distinguishing between Ir and Im, which refer to individuals infected by the resident and mutant parasite strain, respectively, each strain having its own reproduction number. Initially, we assume that Ir = 1 and Im = 0. Mutant infections can emerge from a transmission event or from taking over an infected host. In the case of within-host mutation, the mutation rate represents the instantaneous probability that a mutant appears and takes over the host. In the case of mutation during transmission, the mutation rate represents the probability that a mutant is transmitted instead of a resident strain. We assume that the mutation increases the mean transmission rate without altering CVB (i.e. by setting ). We further assume that the infectious period duration is not impacted by the mutation. For simplicity, we neglect coinfections and therefore assume that, in the case of within-host mutations, the mutant instantaneously takes over the host.
(d) Frequency of emergence
We use the total epidemic size to determine if an outbreak has emerged or not. Emergence is assumed to occur when the total epidemic size is greater than the herd immunity threshold, i.e. a fraction $1 - 1/R_0$ of the population [1].
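The emergence criterion can be written as a one-line check. The sketch below assumes the classical expression of the herd immunity threshold, a fraction 1 − 1/R0 of the population; the function name `has_emerged` and the example sizes are hypothetical.

```python
def has_emerged(total_size, population, r0):
    """True if the total epidemic size exceeds the herd immunity threshold."""
    if r0 <= 1.0:
        raise ValueError("the threshold 1 - 1/R0 is only defined for R0 > 1")
    threshold = (1.0 - 1.0 / r0) * population
    return total_size > threshold


# With R0 = 1.5 the threshold is one third of the population.
large_outbreak = has_emerged(total_size=20_000, population=50_000, r0=1.5)
small_outbreak = has_emerged(total_size=10_000, population=50_000, r0=1.5)
```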
(e) Numerical simulations
We simulate epidemics, i.e. the succession of infection and recovery events, using Gillespie’s next reaction method [29] to generate non-Markovian distributions. The algorithm runs as follows:
1. Initialize (i.e. set S = N − 1, I = 1, t = 0).
2. In the case of a new infected individual i, draw βi and the recovery time of this individual from their respective distributions, B and Γ.
3. Update the new force of infection and draw the time to the next infection assuming an exponential distribution.
4. Look for the event with the closest time of occurrence (i.e. either recovery or new infection), and update the compartments (S, I).
5. Update the time t to the time of the new event.
6. Go back to step 2.
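The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Java implementation used for the study: the force of infection is taken to be the sum of the individual βi multiplied by S/N, the mean of the transmission rate distribution is set to 1, and the two CVs are passed as free arguments rather than derived from k. All function and variable names are hypothetical.

```python
import heapq
import math
import random


def lognormal_params(mean, cv):
    """Log-scale (mu, sigma) of a lognormal with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def simulate_outbreak(n, r0, cv_b, cv_gamma, rng):
    """Return the total number of hosts ever infected in one stochastic run."""
    mu_b, sd_b = lognormal_params(1.0, cv_b)      # transmission rates, mean 1
    mu_g, sd_g = lognormal_params(r0, cv_gamma)   # infection durations, mean r0
    s, t, total = n - 1, 0.0, 0                   # step 1: initialize
    recoveries = []                               # heap of (recovery time, beta)
    force = 0.0                                   # sum of beta over infectious hosts

    def infect(now):
        # Step 2: draw the transmission rate and recovery time of a new case.
        nonlocal force, total
        beta = rng.lognormvariate(mu_b, sd_b)
        duration = rng.lognormvariate(mu_g, sd_g)
        heapq.heappush(recoveries, (now + duration, beta))
        force += beta
        total += 1

    infect(t)
    while recoveries:
        # Step 3: exponential waiting time to the next infection.
        rate = max(force, 0.0) * s / n
        t_infection = t + rng.expovariate(rate) if rate > 0.0 else math.inf
        t_recovery, beta = recoveries[0]
        # Steps 4 and 5: apply the closest event and move time forward.
        if t_infection < t_recovery:
            t = t_infection
            s -= 1
            infect(t)
        else:
            heapq.heappop(recoveries)
            t = t_recovery
            force -= beta
    return total


rng = random.Random(1)
sizes = [simulate_outbreak(1000, 1.5, 0.5, 0.5, rng) for _ in range(20)]
```

Redrawing the infection time after every event is valid because the infection process is memoryless between events, whereas recovery times are drawn once and stored, which is what makes the model non-Markovian.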
In the case of evolutionary emergence, we adapt the model depending on how the mutant appears. (i) If the mutant appears during transmission, the model includes one force of infection for each class of infected host (Ir and Im), and two additional events: infection by the mutant strain (assuming an exponential distribution with a rate ), and recovery of an Im individual. (ii) In the scenario where the mutant first takes over the host, we distinguish the event of infection by the mutant strain (assuming an exponential distribution with a rate ) from the within-host mutation of a resident strain into a mutant strain (assuming an exponential distribution with a rate Ir × μ).
The model was implemented in Java 11.0.7 using parallel computation to decrease computing time. Simulation outputs were analysed with R v. 4.1.2.
(f) Parameter estimation for known outbreaks
To estimate CVB and CVΓ from observed outbreaks, we analysed serial interval and secondary case distributions from measles [30], Ebola [31,32], pneumonic plague [33], smallpox [34,35], monkeypox [36] and SARS outbreaks [6,37]. For the measles outbreak, patient line data were available, therefore allowing joint distribution estimations, and for the others, we had to assume that the two distributions were independent (see electronic supplementary material, table for further details about the data and parameter sources).
To obtain biologically relevant parameters from these empirical data, we infer parameters assuming a model with a latent period, the distribution of which we set using independent sources in the literature [32,38–41]. For simplicity, we assume that for a given parasite the distribution of the latent period does not vary between outbreaks. We also use independent estimates of R0 [30,32,37] and assume a constant transmission rate during the infectious period. We use a Bayesian approach, with the following priors: and . We use JAGS v. 4.3.0 to estimate parameters [42].
3. Results
(a) Epidemic emergence without evolution
For a given heterogeneity in the number of secondary cases k, the coefficients of variation in infection duration (CVΓ) and transmission rate (CVB) are negatively correlated. This is shown in figure 1 and further explained in the Methods. Since the former should be easier to measure, we focus on the role of infection duration heterogeneity, but the results can also be interpreted in terms of transmission heterogeneity.
Figure 1.

Numerical estimation of the transmission rate coefficient of variation (CVB), as a function of heterogeneity in the number of secondary cases, k, and infection duration coefficient of variation CVΓ. Labels in white show the range of values estimated using maximum-likelihood methods from outbreak data. If k remains constant, increasing CVΓ always decreases CVB. Note that when the heterogeneity in the number of secondary cases is low (i.e. k is high) it is impossible to have a high CVΓ. (Online version in colour.)
To illustrate the feasibility of inferring these infection properties, we highlight the values of parameters for several well-studied outbreaks in figure 1. This also shows that our parameter ranges are biologically realistic.
(i) Probability of emergence
Figure 2a shows that the probability of an outbreak emergence only depends on the overall heterogeneity, here measured by k. The source of heterogeneity (i.e. infection duration or infectiousness) does not seem to play any role. Results are shown with R0 = 1.5, but a similar pattern is observed for any R0 > 1.
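The dependence on k alone is what branching process theory predicts for negative-binomial offspring distributions [6]. As a rough cross-check, the sketch below computes the extinction probability q as the smallest solution of $q = (1 + R_0(1 - q)/k)^{-k}$ by fixed-point iteration. It assumes an infinite susceptible population, so it only approximates the simulation-based frequency of emergence; the function name `extinction_probability` is hypothetical.

```python
def extinction_probability(r0, k, tol=1e-12, max_iter=100_000):
    """Smallest fixed point of the negative-binomial generating function."""
    q = 0.0  # starting from 0 makes the iteration converge to the smallest root
    for _ in range(max_iter):
        q_next = (1.0 + r0 * (1.0 - q) / k) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# Probability of emergence from a single case, for two values of k
p_emergence_geometric = 1.0 - extinction_probability(r0=1.5, k=1.0)
p_emergence_dispersed = 1.0 - extinction_probability(r0=1.5, k=0.16)
```

For k = 1 the offspring distribution is geometric and the extinction probability reduces to 1/R0, which gives a simple sanity check on the iteration.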
Figure 2.
Summary statistics of emergence of epidemics without evolution. We run simulations with R0 = 1.5 and vary the heterogeneity in the number of secondary cases, k, and the infection duration CV (CVΓ). (a) Frequency of emergence of an outbreak starting from one infection as a function of model heterogeneity. (b) Epidemic trajectories with k = 0.4, but different infection duration heterogeneity ( in red and in purple). The total population size is 50 000. (c–e) Metrics relative to the case where k = 1 and CVΓ = 1; colours indicate the value of CVΓ. Lines represent mean values computed from simulated outbreaks that emerge, and shaded areas show the 95% confidence intervals. (c) Relative time until the epidemic reaches the emergence threshold (i.e. here a prevalence of 100 infected individuals). (d) Relative doubling time during the exponential phase (i.e. going from a prevalence of 500 to 1000 infections). (e) Relative prevalence peak size. (f) Final outbreak size, as a percentage of the total population. This metric does not depend on . (Online version in colour.)
In the following, we analyse the properties of simulated outbreaks without evolution with R0 = 1.5, and compare key metrics with a reference value close to the Markovian case, i.e. k = 1 and CVΓ = 1.
(ii) Growth rate
In the initial phases of an outbreak, the law of large numbers does not apply and prevalence time series are strongly affected by stochasticity (figure 2b). We quantify the early growth during this stochastic phase by measuring the time until the prevalence reaches the outbreak threshold of 100 infected individuals [43]. As expected [6], decreasing k leads to faster epidemic growth. Furthermore, for a given k, decreasing the heterogeneity in infection duration also increases the early epidemic growth (figure 2c). On average, this would make a SARS outbreak reach the outbreak threshold 50% faster than an Ebola outbreak.
We then study the deterministic exponential growth phase, which starts when the number of infected individuals is high enough for the law of large numbers to apply, and ends when the depletion of the susceptible host population can no longer be neglected [43] (figure 2b). Figure 2d shows that the growth rate during this phase is mostly impacted by CVΓ. For instance, even with similar R0, Ebola outbreaks would have a doubling time of 1.4 times the mean infection duration, while SARS outbreaks would have a doubling time of 0.9 times the mean infection duration. Not taking into account the difference in infectious period distribution between the two epidemics and considering a memoryless model with CVΓ = 1 would lead to an overestimation of the SARS R0 [44].
(iii) Epidemic peak size and final size
The prevalence peak value is highly affected by the heterogeneity in infection duration: its median increases by more than 50% when CVΓ decreases from 1 to 0.5 (figure 2e). Although k has little effect on the mean epidemic peak size, there is a correlation between the variance in peak size and that of the number of secondary cases.
Finally, none of our heterogeneity metrics seems to affect the median final epidemic size, which is always close to 58% of the population (figure 2f), corresponding to the expected value for R0 = 1.5 according to classical theory [28]. As for the other metrics, the variance in the total epidemic size decreases with k.
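The value of about 58% can be recovered from the classical final size relation of the SIR model, $z = 1 - e^{-R_0 z}$, where z is the fraction of the population eventually infected. The sketch below solves it by fixed-point iteration; the function name `final_size` is hypothetical and the relation assumes a large, well-mixed population.

```python
import math


def final_size(r0, tol=1e-12, max_iter=100_000):
    """Non-trivial solution of z = 1 - exp(-r0 * z) for r0 > 1."""
    z = 1.0  # start away from the trivial solution z = 0
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


z_expected = final_size(1.5)  # close to 0.58, as reported for figure 2f
```

Because this relation involves only R0, it is consistent with a median final size that is insensitive to how heterogeneity is distributed.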
(b) Evolutionary emergence
We now assume that the introduced ‘resident’ strain has R0 < 1 and, therefore, will go extinct unless it evolves into a phenotypically different ‘mutant’ strain with R0 > 1. The mutant strain can arise either by taking over a host infected by the resident strain or during a transmission event.
(i) Mutation probability
To disentangle the evolutionary process from the epidemiological process, following Yates et al. [10], we first assume that a mutant instantaneously takes over the population (). The mutation probability does not depend on the origin of heterogeneity. Moreover, figure 3a,b shows that the way the mutant strain appears does not seem to qualitatively affect the relationship between the mutation probability and the heterogeneity in the number of secondary cases, k. Overall, this relationship mostly depends on the reproduction number of the resident strain: when it is very low, there is little impact of k on the frequency of emergence, whereas when it is higher, increasing k increases the mutation probability.
Figure 3.
Individual heterogeneity and evolutionary emergence. We run simulations and vary the heterogeneity in the number of secondary cases, k, the infection duration CV (CVΓ), the mutation rate, and the resident strain reproduction number. (a,b) Probability of mutation, as a function of and the mutation rate. In (a), the mutant appears during the infection within a host and replaces the resident strain, and in (b) the mutant appears during transmission. There is no influence of CVΓ. (c,d) Probability of outbreak emergence of a mutant (c) taking over a host, and (d) appearing during transmission, as a function of , in the case of a mutant strain basic reproduction number of 1.5 and a mutation rate of $10^{-3}$. (Online version in colour.)
(ii) Mutant outbreak
We then consider the more realistic case where the mutant has R0 = 1.5 (figure 3c,d). The general trend is qualitatively consistent with the case without evolution: decreasing the heterogeneity in the number of secondary cases increases the frequency of emergence.
When the mutant appears during transmission (figure 3d), the source of heterogeneity does not play any role. However, when the mutant appears by taking over an infection (figure 3c), decreasing the infection duration heterogeneity decreases the probability of emergence. The difference between these two scenarios is that when the mutant arises within the host, the infection is ongoing, and the host recovery time is kept constant since we assume no difference in immune response between resident and mutant strains. Therefore, with a more heterogeneous infection duration, individuals with longer infections will increase the probability that a mutant arises within the host and can transmit before the host recovers.
4. Discussion
When modelling epidemics, the variation between individuals can be aggregated into a single metric, the dispersion of the secondary infections caused by each individual, which shapes infectious disease outbreaks [6]. Several studies investigate how variations in a specific trait can have an impact on epidemiological dynamics but the majority overlook that variations in one trait (e.g. the distribution of the duration of infectious periods) may also affect the distribution of individual secondary cases. In this study, we investigate the relative effects of variation in the infection duration and transmission rate while keeping the distribution of the secondary cases constant.
Increasing the heterogeneity in transmission rates is known to lead to a faster increase in cases per generation among the outbreaks that do emerge in branching process models [6]. By simulating the whole course of the epidemic, we show that this effect does not translate into an increased growth rate after the epidemic evades the stochastic phase. Methodologically, this could also be studied using recent developments of branching process theory in epidemiology to incorporate the depletion of susceptible hosts [45].
We show that the heterogeneity in infectious period duration plays an important role in the deterministic phase of the epidemic, by decreasing the growth rate and, more strikingly, the prevalence peak size. While previous studies reported a similar effect of both heterogeneity in the number of secondary cases and infection duration heterogeneity [11,44,46,47], we further show that this phenomenon is intrinsically related to the latter. Indeed, more heterogeneous infectious periods are known to lead to longer generation times because transmission relies on long infections, therefore increasing the doubling time and flattening the epidemic curve [44].
When considering a simple evolutionary rescue scenario, we show that the probability of mutation does not depend on the infection duration heterogeneity. This is consistent with the observation that the final epidemic size is not affected by the source of heterogeneity. Furthermore, we show that with a very low resident R0, the distribution of secondary cases does not have any impact either. This can be explained by the fact that for R0 < 1, the decrease in frequency of emergence associated with heterogeneity is compensated by a higher probability of reaching larger outbreak sizes [15] (electronic supplementary material, figure S1), therefore maintaining the mean outbreak size (electronic supplementary material, figure S2). This effect diminishes as R0 gets higher and disappears when R0 > 1.
Finally, we show that infectious period duration heterogeneity can affect evolutionary emergence depending on the process that generates the mutant infection [14]. The impact of the mutational pathway and evolutionary scenario has already been pointed out by several studies [10,48]. As expected, we find no difference between the two mutation scenarios if the process is memoryless. This further underlines the importance of questioning this biologically unrealistic assumption [11,12,19,46]. When assuming more realistic infection duration distributions, we find that if mutations appear upon transmission events, the probability of evolutionary emergence only depends on the distribution of the secondary cases. However, when the mutation appears after a host takeover, infection duration heterogeneity increases the frequency of emergence. This is illustrated by electronic supplementary material, figure S3: although in either scenario the probability that a mutant appears remains constant and equal to the mutation rate, when the mutation occurs within the host, the probability that it gets transmitted is higher in the case of rare long infections, as already pointed out [49].
Our effort to maintain a simple and tractable model of outbreak emergence naturally leads to several limitations. In particular, there is an identifiability issue regarding the biological bases of the transmission rate heterogeneity, which could originate from variations in transmission rate or in host susceptibility. However, Yates et al. [10] find that the heterogeneity in infectivity plays a larger role in the frequency of emergence than the heterogeneity in susceptibility. It could also be interesting to enrich the model by considering a latent period during which exposed hosts are not yet infectious. This has been shown to affect R0 estimates but in a deterministic model that did not take into account superspreading events [47]. More generally, investigating other sources of heterogeneity of the number of secondary infections may help uncover potential biases. Another simplification made here is the assumption that infectiousness is constant over the infectious period of an individual. This is not biologically accurate, and therefore the infectious period defined here is probably shorter than the real infectious period, since infectiousness is usually higher at the beginning of the infection.
Since we ignore within-host dynamics, we chose two extreme scenarios regarding the way a mutant appears: either during transmission or within the host. Biological reality is likely in-between: mutants will gradually take over a host, which means an increasing proportion of the transmission events will be caused by the mutant [16]. At least for rapidly evolving viruses such as human immunodeficiency virus 1 and hepatitis C virus, within-host genetic variation is higher than what is expected given the strong host immune response selection [50]. This suggests that within-host selection of novel mutations and transmission occur at the same rate.
Nested models, which explicitly include both within- and between-host dynamics, can take into account this gradual replacement. Coombs et al. [51] showed in a simple nested model with chronic infections that the best between-host competitors can be competitively excluded if they are outcompeted within the host in the short term during an infection. Moreover, when allowing mutation, coexistence of both strains could be possible under certain scenarios, which is not possible with our simplified model. When taking explicitly into account the interaction between the parasite and the host’s immune system and the possibility of multiple infections, models suggest that the outcome of the competition can lead to the coexistence of two strains with different within-host growth rates, as soon as there is a possibility that multiple infections can occur [52]. Including the possibility that more than two strains can coexist during the infection, it was shown that the level of selection that matters depends on the extent of phenotypic variation: with a higher between-host than within-host phenotypic variation observed, it is expected that strains maximizing the between-host transmission are selected, and vice versa [53]. Finally, Park et al. [54] combined a nested model with the question of the probability of emergence of an outbreak, with a stochastic epidemiological model. They showed that conflicting fitness effects of a mutation at the within-host level and at the between-host levels can strongly decrease the probability of emergence of a mutant.
We assumed that the population has no spatial structure, which is more realistic for directly transmitted diseases, such as SARS or measles, than for sexually transmitted infections for which contact networks impose strong constraints [15]. Furthermore, at the beginning of an epidemic, the spatial structure appears to have little effect on outbreak metrics, especially R0 [55]. However, it is known that heterogeneity in host susceptibility and spatial structure decreases the final epidemic size, i.e. the total proportion of the population infected throughout the epidemic [56,57]. We also do not include host demography and limit our analysis to a single epidemic wave.
We also assumed no correlation between infectiousness and infectious period duration. While this seems biologically realistic, little is known about the nature of the relationship between those parameters. Indeed, one could expect that higher infectiousness is associated with a higher pathogen load, leading to a shorter asymptomatic period where transmission can occur, as has been observed for HIV infections [58]. However, when analysing the measles outbreak in Hagelloch, Germany [30], where a joint estimation of both parameters is possible, we found no significant correlation between the estimated infectious rate and the infectious period duration (electronic supplementary material, figure S4), although our sample is limited to the 32 individuals who did transmit early in the outbreak.
Finally, this analysis relies on numerical results. This enables us to explore the role of stochasticity, which is particularly important to consider in the context of outbreak emergence from a mathematical modelling [59] and a statistical inference [60] point of view. However, it limits our analysis to a restricted number of parameters that we selected as being biologically relevant, although our results do not seem to be affected by the choice of R0 (figure S5).
These theoretical results have implications for outbreak monitoring. In particular, we show that making simplifying but biologically unrealistic assumptions about the distributions of infection duration can lead to underestimating the risk of emergence, the epidemic doubling time and the prevalence peak size. Given the risk of saturation of healthcare systems, accurately anticipating these values is a major issue. This stresses the importance of collecting detailed biological data to better inform epidemiological models.
Acknowledgements
The authors thank the CNRS and the IRD, and acknowledge the i-Trop HPC (South Green Platform) at IRD Montpellier for providing HPC resources that have contributed to the research results reported within this study (https://bioinfo.ird.fr/).
Data accessibility
Data and code are accessible on GitLab: https://gitlab.in2p3.fr/ete/heterogeneity-outbreak.
Authors' contributions
B.E.: formal analysis, investigation, writing—original draft, writing—review and editing; C.S.: conceptualization, methodology, supervision; S.A.: conceptualization, methodology, supervision, validation, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed herein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
No funding has been received for this article.
References
- 1. Anderson RM, May RM. 1992. Infectious diseases of humans: dynamics and control, 1st edn. Oxford, UK: Oxford University Press.
- 2. Keeling MJ, Rohani P. 2008. Modeling infectious diseases in humans and animals. Princeton, NJ: Princeton University Press.
- 3. Endo A, Funk S. 2020. Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China. Wellcome Open Res. 5, 67. (doi:10.12688/wellcomeopenres.15842.1)
- 4. Gomes MGM, Águas R, Lopes JS, Nunes MC, Rebelo C, Rodrigues P, Struchiner CJ. 2012. How host heterogeneity governs tuberculosis reinfection? Proc. R. Soc. B 279, 2473-2478. (doi:10.1098/rspb.2011.2712)
- 5. Lemieux JE, et al. 2021. Phylogenetic analysis of SARS-CoV-2 in Boston highlights the impact of superspreading events. Science 371, eabe3261. (doi:10.1126/science.abe3261)
- 6. Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. 2005. Superspreading and the effect of individual variation on disease emergence. Nature 438, 355-359. (doi:10.1038/nature04153)
- 7. Marm Kilpatrick A, Daszak P, Jones MJ, Marra PP, Kramer LD. 2006. Host heterogeneity dominates West Nile virus transmission. Proc. R. Soc. B 273, 2327-2333. (doi:10.1098/rspb.2006.3575)
- 8. Woolhouse MEJ, et al. 1997. Heterogeneities in the transmission of infectious agents: implications for the design of control programs. Proc. Natl Acad. Sci. USA 94, 338-342. (doi:10.1073/pnas.94.1.338)
- 9. VanderWaal KL, Ezenwa VO. 2016. Heterogeneity in pathogen transmission: mechanisms and methodology. Funct. Ecol. 30, 1606-1622. (doi:10.1111/1365-2435.12645)
- 10. Yates A, Antia R, Regoes RR. 2006. How do pathogen evolution and host heterogeneity interact in disease emergence? Proc. R. Soc. B 273, 3075-3083. (doi:10.1098/rspb.2006.3681)
- 11. Anderson D, Watson R. 1980. On the spread of a disease with gamma distributed latent and infectious periods. Biometrika 67, 191-198. (doi:10.1093/biomet/67.1.191)
- 12. Malice M-P, Kryscio RJ. 1989. On the role of variable incubation periods in simple epidemic models. Math. Med. Biol. 6, 233-242. (doi:10.1093/imammb/6.4.233)
- 13. Antia R, Regoes RR, Koella JC, Bergstrom CT. 2003. The role of evolution in the emergence of infectious diseases. Nature 426, 658-661. (doi:10.1038/nature02104)
- 14. Gandon S, Hochberg ME, Holt RD, Day T. 2013. What limits the evolutionary emergence of pathogens? Phil. Trans. R. Soc. B 368, 20120086. (doi:10.1098/rstb.2012.0086)
- 15. Garske T, Rhodes C. 2008. The effect of superspreading on epidemic outbreak size distributions. J. Theor. Biol. 253, 228-237. (doi:10.1016/j.jtbi.2008.02.038)
- 16. Alizon S, Luciani F, Regoes RR. 2011. Epidemiological and clinical consequences of within-host evolution. Trends Microbiol. 19, 24-32. (doi:10.1016/j.tim.2010.09.005)
- 17. Hellewell J, et al. 2020. Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts. Lancet Glob. Health 8, e488-e496. (doi:10.1016/S2214-109X(20)30074-7)
- 18. Sun K, et al. 2021. Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2. Science 371, eabe2424. (doi:10.1126/science.abe2424)
- 19. Lloyd AL. 2001. Realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics. Theor. Popul. Biol. 60, 59-71. (doi:10.1006/tpbi.2001.1525)
- 20. Chan M, Johansson MA. 2012. The incubation periods of dengue viruses. PLoS ONE 7, e50972. (doi:10.1371/journal.pone.0050972)
- 21. Lessler J, Reich NG, Brookmeyer R, Perl TM, Nelson KE, Cummings DA. 2009. Incubation periods of acute respiratory viral infections: a systematic review. Lancet Infect. Dis. 9, 291-300. (doi:10.1016/S1473-3099(09)70069-6)
- 22. Althaus CL. 2015. Ebola superspreading. Lancet Infect. Dis. 15, 507-508. (doi:10.1016/S1473-3099(15)70135-0)
- 23. Chughtai AA, Barnes M, Macintyre CR. 2016. Persistence of Ebola virus in various body fluids during convalescence: evidence and implications for disease transmission and control. Epidemiol. Infect. 144, 1652-1660. (doi:10.1017/S0950268816000054)
- 24. MacIntyre CR, Chughtai AA. 2016. Recurrence and reinfection – a new paradigm for the management of Ebola virus disease. Int. J. Infect. Dis. 43, 58-61. (doi:10.1016/j.ijid.2015.12.011)
- 25. Rasmussen AL, et al. 2014. Host genetic diversity enables Ebola hemorrhagic fever pathogenesis and resistance. Science 346, 987-991. (doi:10.1126/science.1259595)
- 26. Shen Z, Ning F, Zhou W, He X, Lin C, Chin DP, Zhu Z, Schuchat A. 2004. Superspreading SARS events, Beijing, 2003. Emerg. Infect. Dis. 10, 256-260. (doi:10.3201/eid1002.030732)
- 27. Bassetti S, Bischoff WE, Sherertz RJ. 2005. Are SARS superspreaders cloud adults? Emerg. Infect. Dis. 11, 637-638. (doi:10.3201/eid1104.040639)
- 28. Kermack WO, McKendrick AG. 1927. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. A 115, 700-721. (doi:10.1098/rspa.1927.0118)
- 29. Gibson MA, Bruck J. 2000. Efficient exact stochastic simulation of chemical systems with many species and many channels. J. Phys. Chem. A 104, 1876-1889. (doi:10.1021/jp993732q)
- 30. Jombart T, Frost S, Nouvellet P, Campbell F, Sudre B. 2020. Outbreaks: a collection of disease outbreak data. R package version 1.9.0. See https://CRAN.R-project.org/package=outbreaks.
- 31. Faye O, et al. 2015. Chains of transmission and control of Ebola virus disease in Conakry, Guinea, in 2014: an observational study. Lancet Infect. Dis. 15, 320-326. (doi:10.1016/S1473-3099(14)71075-8)
- 32. WHO Ebola Response Team. 2014. Ebola virus disease in West Africa – the first 9 months of the epidemic and forward projections. N. Engl. J. Med. 371, 1481-1495. (doi:10.1056/NEJMoa1411100)
- 33. Gani R, Leach S. 2004. Epidemiologic determinants for modeling pneumonic plague outbreaks. Emerg. Infect. Dis. 10, 608-614. (doi:10.3201/eid1004.030509)
- 34. Fenner F, et al. 1988. Smallpox and its eradication. Geneva, Switzerland: World Health Organization.
- 35. Nishiura H, Eichner M. 2007. Infectiousness of smallpox relative to disease age: estimates based on transmission network and incubation period. Epidemiol. Infect. 135, 1145-1150. (doi:10.1017/S0950268806007618)
- 36. Jezek Z, Grab B, Dixon H. 1987. Stochastic model for interhuman spread of monkeypox. Am. J. Epidemiol. 126, 1082-1092. (doi:10.1093/oxfordjournals.aje.a114747)
- 37. Lipsitch M, et al. 2003. Transmission dynamics and control of severe acute respiratory syndrome. Science 300, 1966-1970. (doi:10.1126/science.1086616)
- 38. Bailey NTJ. 1956. On estimating the latent and infectious periods of measles: I. Families with two susceptibles only. Biometrika 43, 15-22. (doi:10.2307/2333574)
- 39. Kuk AYC, Ma S. 2005. The estimation of SARS incubation distribution from serial interval data using a convolution likelihood. Stat. Med. 24, 2525-2537. (doi:10.1002/sim.2123)
- 40. Nishiura H. 2009. Determination of the appropriate quarantine period following smallpox exposure: an objective approach using the incubation period distribution. Int. J. Hyg. Environ. Health 212, 97-104. (doi:10.1016/j.ijheh.2007.10.003)
- 41. Nolen LD, et al. 2016. Extended human-to-human transmission during a monkeypox outbreak in the Democratic Republic of the Congo. Emerg. Infect. Dis. 22, 1014-1021. (doi:10.3201/eid2206.150579)
- 42. Plummer M. 2003. JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In Proc. 3rd Int. Workshop Distributed Statistical Computing, 20 March 2003, Vienna, Austria (eds K Hornik, A Zeileis). See https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf.
- 43. Hartfield M, Alizon S. 2014. Epidemiological feedbacks affect evolutionary emergence of pathogens. Am. Nat. 183, E105-E117. (doi:10.1086/674795)
- 44. Wallinga J, Lipsitch M. 2007. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc. R. Soc. B 274, 599-604. (doi:10.1098/rspb.2006.3754)
- 45. Barbour A, Reinert G. 2013. Approximating the epidemic curve. Electron. J. Probab. 18, 1-30. (doi:10.1214/EJP.v18-2557)
- 46. Britton T, Lindenstrand D. 2008. Epidemic modelling: aspects where stochasticity matters. arXiv, 0812.3505v1. (doi:10.48550/arXiv.0812.3505)
- 47. Wearing HJ, Rohani P, Keeling MJ. 2005. Appropriate models for the management of infectious diseases. PLoS Med. 2, e174. (doi:10.1371/journal.pmed.0020174)
- 48. Alexander HK, Day T. 2010. Risk factors for the evolutionary emergence of pathogens. J. R. Soc. Interface 7, 1455-1474. (doi:10.1098/rsif.2010.0123)
- 49. André J-B, Day T. 2005. The effect of disease life history on the evolutionary emergence of novel pathogens. Proc. R. Soc. B 272, 1949-1956. (doi:10.1098/rspb.2005.3170)
- 50. Poon AFY, Pond SLK, Bennett P, Richman DD, Brown AJL, Frost SDW. 2007. Adaptation to human populations is revealed by within-host polymorphisms in HIV-1 and hepatitis C virus. PLoS Pathog. 3, e45. (doi:10.1371/journal.ppat.0030045)
- 51. Coombs D, Gilchrist MA, Ball CL. 2007. Evaluating the importance of within- and between-host selection pressures on the evolution of chronic pathogens. Theor. Popul. Biol. 72, 576-591. (doi:10.1016/j.tpb.2007.08.005)
- 52. Alizon S, van Baalen M. 2008. Multiple infections, immune dynamics, and the evolution of virulence. Am. Nat. 172, E150-E168. (doi:10.1086/590958)
- 53. Lythgoe KA, Pellis L, Fraser C. 2013. Is HIV short-sighted? Insights from a multistrain nested model. Evolution 67, 2769-2782. (doi:10.1111/evo.12166)
- 54. Park M, Loverdo C, Schreiber SJ, Lloyd-Smith JO. 2013. Multiple scales of selection influence the evolutionary emergence of novel pathogens. Phil. Trans. R. Soc. B 368, 20120333. (doi:10.1098/rstb.2012.0333)
- 55. Trapman P, Ball F, Dhersin J-S, Tran VC, Wallinga J, Britton T. 2016. Inferring R0 in emerging epidemics – the effect of common population structure is small. J. R. Soc. Interface 13, 20160288. (doi:10.1098/rsif.2016.0288)
- 56. Becker N, Marschner I. 1990. The effect of heterogeneity on the spread of disease. In Stochastic processes in epidemic theory (eds Gabriel J-P, Lefèvre C, Picard P), pp. 90-103. Berlin, Germany: Springer.
- 57. Volz E. 2008. SIR dynamics in random networks with heterogeneous connectivity. J. Math. Biol. 56, 293-310. (doi:10.1007/s00285-007-0116-4)
- 58. Fraser C, Hollingsworth TD, Chapman R, de Wolf F, Hanage WP. 2007. Variation in HIV-1 set-point viral load: epidemiological analysis and an evolutionary hypothesis. Proc. Natl Acad. Sci. USA 104, 17 441-17 446. (doi:10.1073/pnas.0708559104)
- 59. Britton T, Pardoux E. 2019. Stochastic epidemic models. In Stochastic epidemics in a homogeneous community (eds Britton T, Pardoux E), pp. 5-19. Cham, Switzerland: Springer.
- 60. King AA, Domenech de Cellès M, Magpantay FMG, Rohani P. 2015. Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola. Proc. R. Soc. B 282, 20150347. (doi:10.1098/rspb.2015.0347)