Superspreading quantified from bursty epidemic trajectories

Julius B Kirkegaard; Kim Sneppen

doi:10.1038/s41598-021-03126-w

. 2021 Dec 16;11:24124. doi: 10.1038/s41598-021-03126-w

Superspreading quantified from bursty epidemic trajectories

Julius B Kirkegaard ^1,^✉, Kim Sneppen ¹

PMCID: PMC8677763 PMID: 34916534

Abstract

The quantification of spreading heterogeneity in the COVID-19 epidemic is crucial as it affects the choice of efficient mitigating strategies irrespective of whether its origin is biological or social. We present a method to deduce temporal and individual variations in the basic reproduction number directly from epidemic trajectories at a community level. Using epidemic data from the 98 districts in Denmark we estimate an overdispersion factor k for COVID-19 to be about 0.11 (95% confidence interval 0.08–0.18), implying that 10 % of the infected cause between 70 % and 87 % of all infections.

Subject terms: Computational biophysics, Applied mathematics, Statistics, Diseases, Population dynamics

Introduction

In controlling epidemics, a deep understanding of the dynamics that underlie the spread of a disease is critical for choosing which interventions are most efficient to mitigate its continued spread. Epidemiological models of disease spreading^1,2 depend on parameters that capture effects both of the pathogen–host biology³ and the behaviour of the population in which the disease propagates⁴. Population-level data allow the estimation of the average basic reproduction number R, denoting the average number of people an infectious individual will transmit the disease to. Hidden in the average value of R are both temporal variations and variations between infectious individuals⁵. Variations in time stem both from the fact that social behaviour can change during an epidemic due to e.g. interventions being put in place, and because as the epidemic progresses the susceptible fraction of the population decreases. Variations from person to person can results both from biological differences or social behaviour.

A popular tale for some diseases is the 20/80-rule stating that 20 % of infectious individuals are responsible for 80 % of all infections. This was for example seen in recent epidemics such as the 2003 Asia outbreak of SARS⁵ and the 2014 Africa outbreak of Ebola⁶. Numerous of studies of COVID-19 suggest even more extreme statistics for this disease^7–11. These effects have collectively become known as superspreading, and while it is simple to define theoretically, measuring them typically requires data at the level of individuals. Viral genome sequences can be used to inform the analysis^12,13, and when contact tracing data is available^14–16 the analysis may be performed directly. More indirectly, the number of imported versus local cases has also been shown to inform the dispersion¹⁷. Focusing exclusively on the early evolution of the epidemic, recent work¹⁸ has shown that the variation in infection rate between regions can be used to estimate the dispersion.

In this report, we derive a Bayesian model for local epidemic outbursts to address the inverse problem of estimating temporal variations and individual infection heterogeneity from aggregate data. In other words, we demonstrate how to estimate this heterogeneity using data that only contain the total counts of the number of infected (and tested) per day. Our method relies on the fact that the epidemic trajectories of case numbers are bursty on a regional level, reflecting a mixture of simple Poisson randomness, varying testing frequencies, and individual infection heterogeneity. We model these fluctuations and sample for the statistics of the duration between reported cases as illustrated in Fig. 1). Using regional data allows us to bypass the averaging on the larger scales and permits the estimation of the underlying heterogeneity. Our method simultaneously samples across many regions, and thus naturally separates local outbursts from the large scale variation in average reproduction number.

Model definitions. (a) Illustration of a heterogeneous infection pattern (superspreading). Inset shows the probability density function and (one minus) the cumulative probability for the gamma distribution $Γ (R = 1.0, k = 0.1)$ . (b) Likelihood model. The infected individuals whose infection was reported on some day (orange) will themselves infect a number of people. These are in turn detected on other days according to the distribution $p (Δ τ)$ . (c) Time scale definitions. $τ_{d}$ denotes the duration from being infected to being reported, and $τ_{i}$ the duration between infections (generation time). Finally, $Δ τ$ is the difference between the reported times of infector-infectee pairs. (d) The maximum likelihood of the distributions we employ for $τ_{d}$ and $τ_{i}$ , and the distribution thus implied for $Δ τ$ . $p (Δ τ)$ has support below zero as it is possible that the infector’s infection is reported after the infectee.

We apply our approach on data for Denmark, which has a number of features that permit the analysis: Denmark, with a population of 5.8 million, makes available daily data for all of its 98 municipalities, which all coordinate their testing identically. As shown in Fig. 2(a–f), the number of cases in these municipalties vary significantly. In the capital region of Copenhagen, daily cases number in the hundreds, whereas in more rural Vesterhimmerlands the daily rate is less than ten. Finally, the population of Denmark is fairly uniform and thus slow, temporal variations in R can be assumed to affect all regions.

Daily cases of COVID-19 in Denmark between 26 February and 17 November 2020, during which only PCR tests were employed. (a) The total number of cases in each of Denmark’s 98 municipalities. (b–f) Daily number of cases in five municipalities. (g) Simulations of an epidemic with dispersion parameter $k = \infty$ and $k = 0.1$ , respectively. Both simulations use $R = 0.9$ and a crossing parameter chosen such that on average an infectious person enters every fifth day. Map created from DAGI [“Danmarks Administrative Geografiske Inddeling”] data (2020) supplied by the Danish Agency for Data Supply and Efficiency.

Model

Following the seminal paper of Lloyd-Smith et al.⁵, we assign each person an infectivity $ν$ sampled from a gamma distribution $Γ (R (t), k)$ with mean R(t) and dispersion parameter k. Small k correspond to a disease driven mainly by superspreading as illustrated in Fig. 1(a). The mean basic reproduction number is taken to be time-dependent to include changes due to policy, behavior and immunity. Accounting for subsequent independent stochastic infections, the offspring distribution is negative binomial $NB (R (t), k)$ ⁵. Our objective is to estimate R(t) and k. These parameters can be deduced directly if contact tracing data is available. Using only aggregate data, however, we need to instead build a probabilistic augmentation of the missing contact information.

In aggregate data, the duration between the infections of an infector-infectee pair being reported $Δ τ$ is stochastic. This distribution can be calculated if the distribution of infection-to-infection $p (τ_{i})$ (generation time) and infection-to-reporting $p (τ_{d})$ are known. As illustrated in Fig. 1(c), the time between reporting obey the random variable relation $Δ τ = - τ_{d} + τ_{i} + τ_{d}$ , where the first $τ_{d}$ refers to the random time of reporting for the infector and the latter $τ_{d}$ the time of reporting for the infectee. These do not affect the mean value of $Δ τ$ but do increase its variance. The resulting distribution is shown in Fig. 1(d) using estimates from the literature of $p (τ_{i}) \sim Γ (5.0 \pm 0.75, 10 \pm 1.5)$ and $p (τ_{d}) \sim Γ (4.5 \pm 0.75, 5.0 \pm 1.0)$ ^19,20. With these distributions in place it is straightforward to simulate an epidemic if R(t) and k are known. Fig. 2(g) shows two such examples for $k = \infty$ and $k = 0.1$ . For $k = 0.1$ there will be superspreading, but because of the distribution of $Δ τ$ these will be distributed over a number of days rendering visual distinction difficult and thus makes statistical analysis crucial for its discovery.

To tackle the inverse problem of the simulation we define a self-consistent model of the data. For simplicity let us first assume that all infectious individuals are found and postpone the discussion of under-reporting. Figure 1(b) illustrates our approach: we define the likelihood of the data by calculating the probability of the observed time-series for each municipality. In practice, this can only be calculated in a reasonable amount of time because of a few key features of the negative-binomial distribution. These are derived in the methods section, but can be summarised as follows: If the offspring distribution from a single individual is the negative binomial $NB (R, k)$ , then the offspring distribution from M people, where each individual is found on one specific day with probability p is exactly $NB (p M R, M k)$ . The total likelihood of a single day is then found by convolving these distributions using $p (Δ τ)$ for the daily probability of reporting. The precise formulae are presented in the SI.

To complete our model, we need to adjust for correlations that are present in the data as shown in Fig. 3. Naturally, a municipality with a large population will have a larger number of cases per day than a municipality with a small population. This is because there will be more imported cases in large regions (there may also be variations in R between cities and rural areas²¹, but this is a second-order effect that we ignore). As most imported cases will come from other municipalities, we ignore effects of international travel. In fact, daily cases per population of the municipalities will be strongly correlated as a function of time as demonstrated in Fig. 3(a), reflecting the fact that Denmark is a small country with overall homogeneous development of the disease. To account for coupling between communities we introduce a crossing parameter c that corrects for the fraction of infections that occur across municipality borders. For the number of infectious individuals in municipality m we thus use $N_{m}^{corrected} = (1 - c) N_{m} + c f_{m} T$ , where $N_{m}$ is the uncorrected number of infectious individuals in municipality m, $f_{m}$ is the population fraction of municipality m, and T is the total number of infectious individuals across all municipalities. With this simple formula it is ensured that municipalities will, on average, have a number of infections that is proportional to the population of the municipality.

Data correlations. (a) Cases per population of each municipality smoothed over one week. Thick line shows cases for all of Denmark. Inset shows the cross correlation between municipalities as a function of time. (b) Daily test frequency in each municipality. Inset shows the correlation of deviations of daily cases from a weekly running mean both with and without linear correction for the number of tests.

Figure 3(b) shows that the testing frequency in each municipality is highly irregular, with e.g. fewer tests being done on weekends. Our method to detect variations in reproduction number depends on the deviations in cases in each municipality to be uncorrelated. The inset of Fig. 3(b) shows that there is a small correlation present. This is natural, since the number of tests is correlated across municipalities. We correct for this effect by scaling with the number of tests. This is incorporated into our model by re-scaling the distribution of reporting $p (τ_{d})$ in proportion to the daily number of tests (see SI for details).

Finally, we employ Hamiltonian Monte Carlo²² to sample for R(t), k and c from the total likelihood function of all regions, aimed to reproduce the case counts at each day, given case counts on previous days. In particular, we run the NUTS algorithm²³ with gradients of the log likelihood calculated by automatic differentiation²⁴ on GPUs that allow for fast calculations of convolutions that make up our likelihood function (see methods). We restrict temporal variations of R(t) to be slow on the scale of weeks by parameterising the function using cubic Hermite splines. The Hamiltonian Monte Carlo chain is then run multiple times for sampled $p (τ_{i})$ and $p (τ_{d})$ .

Results

Our results are shown in Fig. 4. The sampling reveals an R(t) [Fig. 3(a)] that slightly deviates from estimates obtained by single approximations using e.g. the SIR model^1,25. This is because we calculate an R(t) that best explain the statistics of each municipality and not the sum of these. Further, we have a large uncertainty on our estimates because our R(t) models the reproduction number under uncertain values of $p (τ_{i})$ and $p (τ_{d})$ (see Ref.²⁶ for details on precise estimation of R(t) alone). In other words, we calculate the true value of R(t) as defined by the average offspring count, and not as the value of R(t) that best makes a single model fit the evolving infection statistics²⁷.

Figure 3(b) shows that we cannot constrain c more than to say that by far most infections happen within municipality borders, as is expected. In contrast, the degree of superspreading as defined by the value of k is fairly constrained as shown in Fig. 4(c). We find k in the range 0.08 – 0.18 (95% confidence interval), with mean $k = 0.11$ , which compares well to estimated confidence intervals obtained from other methods^{12,14,15,17,18} but smaller than $k \sim 0.4$ reported by Ref.¹⁶. For $R = 1.4$ , for instance, our value range corresponds to an epidemic in which $10 %$ of the infected individuals are responsible for $70 %$ – $87 %$ of all cases. In this case, the majority of infectious individuals will not infect anyone, in broad agreement with the fact that there are remarkably few transmissions within households^28,29. We note that the precise range of such statistics depends on the choice of probability distribution for infectiousness, for which we used the gamma distribution as has become standard⁵. If, for instance, the distribution instead were fat-tailed^4,9, then the quantification of the dispersion statistics would differ. This could be remedied by introducing an exponential cutoff, but then this extra parameter would also need to be sampled for. The value of the crossing parameter c only weakly affects k, which is instead affecting mainly by the mean value of $τ_{i}$ and the widths of the distributions of $τ_{i}$ and $τ_{d}$ . In particular, in our model we assume that infectious individuals spread the disease over time. If, in contrast, the spread from individuals is driven mainly single events then our distribution for $p (Δ τ)$ is too wide. To study the effect of this, we ran our model with both $p (τ_{i})$ and one of the two $p (τ_{d})$ that make up $p (Δ τ)$ constrained to a single day. This leads to a k that is about $40 %$ larger than the one estimated.

We have until now assumed that all infectious individuals were included in the data. This is of course not true. Focusing on estimating k we here consider the case where only a (time-independent) fraction $f < 1$ of all infectious are found. This leads our method to overestimate k. Most simply, if the incidence at each day is a factor 1/f larger than the measured data, fluctuations are amplified by 1/f and the true dispersion parameter k will be our measured k multiplied by f. Thus a value of $k = 0.1$ from Fig. 4c and an $f \sim 1 / 3$ would correspond to a true $k \sim 0.03$ . It is however more realistic to assume that each detected case is independently found with probability f. Using simulated data where a fraction $f \sim 1 / 3$ of cases are independently detected we find that a measured k of 0.1 correspond to a true underlying k that is between 0.05 and 0.085, depending on the simulation (see SI). If, on the other hand, there is large correlations between the reporting present in the data, our method may underestimate k. This is harder to gauge precisely as it depends on the correlations.

These systematic uncertainties should be considered for our estimated value of k. The existence of large spreading events makes our model underestimate k, whereas uncorrelated under-reporting leads to overestimation. The effects will tend to affect the value of k in opposite directions, but taken to the extreme could bring k to 0.04 – 0.28. We have furthermore tested our method on random subsets of all municipalities, and found that this did not have any significant impact on our estimates. Restricting to considered time interval to smaller subsections also did not affect the estimated value of k significantly.

Traditionally one characterises an epidemic with only one number, $R_{0}$ , and even so there are remarkably few direct measurements of this average for known diseases. Here we ventured beyond such average measurements and proposed a new community level method to extract also variations in infectivity without having access to person sensitive data and contact tracing. Using our method we quantified the COVID-19 epidemic as one of the most extreme superspreader dominated diseases ever recorded⁵. It has previously been demonstrated that such level of heterogeneity should make COVID-19 comparatively easy to mitigate with societal restrictions^10,11.

Methods

Negative binomial formulas

The offspring distribution from an individual with infectivity $ν$ is Poissonian:

\begin{matrix} p (n) = Pois (n ; ν) = \frac{ν^{n} e^{- ν}}{n!} . \end{matrix}

When infectivty $ν$ is distributed according to a Gamma distribution

\begin{matrix} p (ν) = Γ (ν ; R, k) = \frac{ν^{k - 1}}{Γ (k)} {(\frac{k}{R})}^{k} e^{- \frac{k ν}{R}}, \end{matrix}

the total offspring distribution becomes negative binomial

\begin{matrix} p (n) = NB (n ; R, k) = \frac{Γ (n + k)}{n! Γ (k)} \frac{k^{k} R^{n}}{{(k + R)}^{n + k}} . \end{matrix}

The negative binomial has probability generating function

\begin{matrix} G_{n} (s ; R, k) = {[1 + \frac{R}{k} (1 - s)]}^{- k} . \end{matrix}

The number of infections from M infectious people will then have generating function

\begin{matrix} G_{n}^{M} (s) = {[1 + \frac{R}{k} (1 - s)]}^{- M k} = {[1 + \frac{MR}{Mk} (1 - s)]}^{- M k} = G_{n} (s ; M R, M k) . \end{matrix}

If a person is only reported with probability p, corresponding to a Bernoulli random variable with generating function $G^{B} (s) = p s + (1 - p)$ , the generating function for the reported offspring distribution becomes

\begin{matrix} G_{n}^{D} (s) = G_{n} (G^{B}, (s)) = {(1 + \frac{pR}{k} (1 - s))}^{- k} = G_{n} (s ; p R, k) . \end{matrix}

These formulas combined show that the reported offspring distribution from M people is $NB (p M R, M k)$ .

Base likelihood model

We derive our likelihood model by calculating the probability to observe a given number of cases on a specific day given the previous days’ case counts. Define the variable $z (d_{1}, d_{2})$ as the number people reported on day $d_{2}$ whose infector was reported on day $d_{1}$ . Using the above results we have

\begin{matrix} z (d_{1}, d_{2}) \sim NB (p (d_{2} - d_{1}) c_{1} R, c_{1} k), \end{matrix}

where $c_{1}$ is the number of cases on day $d_{1}$ , and $p (Δ τ)$ is the distribution of the time between reporting.

The number of infections $c_{2}$ on day $d_{2}$ is given by

\begin{matrix} c_{2} = \sum_{d_{1}} z (d_{1}, d_{2}) . \end{matrix}

In other words: the number of infections reported on day $d_{2}$ is the sum of those cases from the surrounding days ( ${d_{1}}$ ) that are reported on day $d_{2}$ . We make the assumption that these are independent, although this is not strictly true. In this case $c_{2}$ will be distributed as

\begin{matrix} c_{2} \sim \underset{d_{1}}{⊛} NB (p (d_{2} - d_{1}) c_{1} R, c_{1} k), \end{matrix}

where $⊛$ denotes convolution.

The total log likelihood is then found by summing over all regions and days:

\begin{matrix} log L = \sum_{regions} \sum_{i} log (\underset{j}{⊛}, NB, (c_{i} ; p (d_{i} - d_{j}) c_{j} R, c_{j} k)), \end{matrix}

where $c_{i}$ is the number of cases on day $d_{i}$ . We use PyTorch to evaluate this expression and its gradients on an Nvidia Geforce RTX 2080 Ti GPU. More details are given in the SI.

Supplementary information

Supplementary Information.^{(226.8KB, pdf)}

Acknowledgements

This project has received funding from the Novo Nordisk Foundation, under its Data Science Initiative, Grant Agreement NNF20OC0062047, and from the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme, Grant Agreement No. 740704.

Author contribution

J.B.K. and K.S. conceived the project. J.B.K. performed research. J.B.K. wrote paper, and both J.B.K. and K.S. edited the paper.

Competing interest

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-021-03126-w.

References

1.Kermack WO, McKendrick AG, Walker GT. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. 1927;115:700–721. [Google Scholar]
2.Hans H, Roy M, Anderson V, Andreasen S, Bansal D, De A, Chris D, Ken TD, Eames W, John E, Simon DWF, Sebastian F, et al. Modeling infectious disease dynamics in the complex landscape of global health. Science. 2015;347:277. doi: 10.1126/science.aaa4339. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Varghese PM, Tsolaki AG, Yasmin H, Shastri A, Ferluga J, Vatish M, Madan T, Kishore U. Host-pathogen interaction in covid-19: Pathogenesis, potential therapeutics and vaccination strategies. Immunobiology. 2020;225:152008. doi: 10.1016/j.imbio.2020.152008. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Kirkegaard, J. B., Mathiesen, J., & Sneppen, K. Airborne pathogens in a heterogeneous world: Superspreading and mitigation. Sci. Rep. 11 (2020). [DOI] [PMC free article] [PubMed]
5.Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438:355–359. doi: 10.1038/nature04153. [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Faye O, Boëlle P-Y, Heleze E, Faye O, Loucoubar C, Magassouba NF, Soropogui B, Keita S, Gakou T, Koivogui L, et al. Chains of transmission and control of ebola virus disease in conakry, guinea, in 2014: an observational study. Lancet Infect. Dis. 2015;15:320–326. doi: 10.1016/S1473-3099(14)71075-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Liu Y, Eggo RM, Kucharski AJ. Secondary attack rate and superspreading events for sars-cov-2. Lancet. 2020;395:e47. doi: 10.1016/S0140-6736(20)30462-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Frieden TR, Lee CT. Identifying and interrupting superspreading events–implications for control of severe acute respiratory syndrome coronavirus 2. Emerg. Infect. Dis. 2020;26:1059. doi: 10.3201/eid2606.200495. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Wong F, Collins JJ. Evidence that coronavirus superspreading is fat-tailed. Proc. Natl. Acad. Sci. 2020;117:29416–29418. doi: 10.1073/pnas.2018490117. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Sneppen K, Nielsen BF, Taylor RJ, Simonsen L. Overdispersion in covid-19 increases the effectiveness of limiting nonrepetitive contacts for transmission control. Proc. Natl. Acad. Sci. 2021;118:e2016623118. doi: 10.1073/pnas.2016623118. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Nielsen BF, Simonsen L, Sneppen K. Covid-19 superspreading suggests mitigation by social network modulation. Phys. Rev. Lett. 2021;126:118301. doi: 10.1103/PhysRevLett.126.118301. [DOI] [PubMed] [Google Scholar]
12.Miller D, Martin MA, Harel N, Tirosh O, Kustin T, Meir M, Sorek N, Gefen-Halevi S, Amit S, Vorontsov O, et al. Full genome viral sequences inform patterns of sars-cov-2 spread into and within israel. Nat. Commun. 2020;11:1–10. doi: 10.1038/s41467-020-19248-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Liang W, Gao GF, Bi Y, et al. Inference of person-to-person transmission of covid-19 reveals hidden super-spreading events during the early outbreak phase. Nat. Commun. 2020;11:1–6. doi: 10.1038/s41467-020-18836-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Agus H, Hadi S, Kasim MF, Nuraini N, Lestari B, Triany D, Widyastuti W. Superspreading in early transmissions of covid-19 in indonesia. Sci. Rep. 2020;10:1–4. doi: 10.1038/s41598-019-56847-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Adam DC, Wu P, Wong JY, Lau EHY, Tsang TK, Cauchemez S, Leung GM, Cowling BJ. Clustering and superspreading potential of sars-cov-2 infections in hong kong. Nat. Med. 2020;26:1714–1719. doi: 10.1038/s41591-020-1092-0. [DOI] [PubMed] [Google Scholar]
16.Lau, M.S.Y., Grenfell, B., Nelson, K., & Lopman, B. Characterizing super-spreading events and age-specific infectivity of covid-19 transmission in georgia, USA. MedRXiv (2020). [DOI] [PMC free article] [PubMed]
17.Endo A, Abbott S, Kucharski AJ, Funk S, et al. Estimating the overdispersion in covid-19 transmission using outbreak sizes outside china. Wellcome Open Res. 2020;5:67. doi: 10.12688/wellcomeopenres.15842.3. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Pozderac C, Skinner B. Superspreading of sars-cov-2 in the USA. PLoS ONE. 2021;16:e0248808. doi: 10.1371/journal.pone.0248808. [DOI] [PMC free article] [PubMed] [Google Scholar]
19.Ferretti, L., Ledda, A., Wymant, C., Zhao, L., Ledda, V. & Abeler-Dorner, L. et al. The timing of covid-19 transmission. SSRN: 10.2139/ssrn.3716879
20.Griffin, J.M., Collins, A.B., Hunt, K., McEvoy, D., Casey, M., Byrne, A.W., McAloon, C.G., Barber, A., Lane, E.A. & More, S.J. A rapid review of available evidence on the serial interval and generation time of covid-19. medRxiv (2020). [DOI] [PMC free article] [PubMed]
21.Afshordi, N., Holder, B., Bahrami, M., & Lichtblau, D. Diverse local epidemics reveal the distinct effects of population density, demographics, climate, depletion of susceptibles, and intervention in the first wave of covid-19 in the united states. arXiv preprint arXiv:2007.00159 (2020).
22.Betancourt, M. A conceptual introduction to hamiltonian monte carlo. arXiv preprint arXiv: 1701.02434 (2017).
23.Hoffman MD, Gelman A. The no-u-turn sampler: adaptively setting path lengths in hamiltonian monte carlo. J. Mach. Learn. Res. 2014;15:1593–1623. [Google Scholar]
24.Paszke, A., Gross S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J. & Chintala, S. Pytorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems 32 ( Curran Associates, Inc., 2019).
25.Bettencourt LMA, Ribeiro RM. Real time bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS ONE. 2008;3:e2185. doi: 10.1371/journal.pone.0002185. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Gostic KM, McGough L, Baskerville EB, Abbott S, Joshi K, Tedijanto C, Kahn R, Niehus R, Hay JA, De Salazar PM, et al. Practical considerations for measuring the effective reproductive number, r t. PLoS Comput. Biol. 2020;16:e1008409. doi: 10.1371/journal.pcbi.1008409. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Breban R, Vardavas R, Blower S. Theory versus data: how to calculate r0? PLoS One. 2007;2:e282. doi: 10.1371/journal.pone.0000282. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Bi, Q., Wu, Y., Mei, S., Ye, C., Zou, X., Zhang, Z., Liu, X., Wei, L., Truelove, S.A., Zhang, T., et al. Epidemiology and transmission of covid-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. Lancet Infect. Dis. (2020). [DOI] [PMC free article] [PubMed]
29.Young, S., Young-Man, P., Seonju, K., Sangeun, Y., Baeg-Ju, L., Chang, N., Kim, B., Kim, J.I., Sook, H., Young, K., Kim, B. & Park, Y., et al. Coronavirus disease outbreak in call center, South Korea. Emerg. Infect. Dis. (2020). [DOI] [PMC free article] [PubMed]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information.^{(226.8KB, pdf)}

[CR1] 1.Kermack WO, McKendrick AG, Walker GT. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. 1927;115:700–721. [Google Scholar]

[CR2] 2.Hans H, Roy M, Anderson V, Andreasen S, Bansal D, De A, Chris D, Ken TD, Eames W, John E, Simon DWF, Sebastian F, et al. Modeling infectious disease dynamics in the complex landscape of global health. Science. 2015;347:277. doi: 10.1126/science.aaa4339. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR3] 3.Varghese PM, Tsolaki AG, Yasmin H, Shastri A, Ferluga J, Vatish M, Madan T, Kishore U. Host-pathogen interaction in covid-19: Pathogenesis, potential therapeutics and vaccination strategies. Immunobiology. 2020;225:152008. doi: 10.1016/j.imbio.2020.152008. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR4] 4.Kirkegaard, J. B., Mathiesen, J., & Sneppen, K. Airborne pathogens in a heterogeneous world: Superspreading and mitigation. Sci. Rep. 11 (2020). [DOI] [PMC free article] [PubMed]

[CR5] 5.Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438:355–359. doi: 10.1038/nature04153. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR6] 6.Faye O, Boëlle P-Y, Heleze E, Faye O, Loucoubar C, Magassouba NF, Soropogui B, Keita S, Gakou T, Koivogui L, et al. Chains of transmission and control of ebola virus disease in conakry, guinea, in 2014: an observational study. Lancet Infect. Dis. 2015;15:320–326. doi: 10.1016/S1473-3099(14)71075-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR7] 7.Liu Y, Eggo RM, Kucharski AJ. Secondary attack rate and superspreading events for sars-cov-2. Lancet. 2020;395:e47. doi: 10.1016/S0140-6736(20)30462-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Frieden TR, Lee CT. Identifying and interrupting superspreading events–implications for control of severe acute respiratory syndrome coronavirus 2. Emerg. Infect. Dis. 2020;26:1059. doi: 10.3201/eid2606.200495. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR9] 9.Wong F, Collins JJ. Evidence that coronavirus superspreading is fat-tailed. Proc. Natl. Acad. Sci. 2020;117:29416–29418. doi: 10.1073/pnas.2018490117. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR10] 10.Sneppen K, Nielsen BF, Taylor RJ, Simonsen L. Overdispersion in covid-19 increases the effectiveness of limiting nonrepetitive contacts for transmission control. Proc. Natl. Acad. Sci. 2021;118:e2016623118. doi: 10.1073/pnas.2016623118. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR11] 11.Nielsen BF, Simonsen L, Sneppen K. Covid-19 superspreading suggests mitigation by social network modulation. Phys. Rev. Lett. 2021;126:118301. doi: 10.1103/PhysRevLett.126.118301. [DOI] [PubMed] [Google Scholar]

[CR12] 12.Miller D, Martin MA, Harel N, Tirosh O, Kustin T, Meir M, Sorek N, Gefen-Halevi S, Amit S, Vorontsov O, et al. Full genome viral sequences inform patterns of sars-cov-2 spread into and within israel. Nat. Commun. 2020;11:1–10. doi: 10.1038/s41467-020-19248-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR13] 13.Liang W, Gao GF, Bi Y, et al. Inference of person-to-person transmission of covid-19 reveals hidden super-spreading events during the early outbreak phase. Nat. Commun. 2020;11:1–6. doi: 10.1038/s41467-020-18836-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR14] 14.Agus H, Hadi S, Kasim MF, Nuraini N, Lestari B, Triany D, Widyastuti W. Superspreading in early transmissions of covid-19 in indonesia. Sci. Rep. 2020;10:1–4. doi: 10.1038/s41598-019-56847-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR15] 15.Adam DC, Wu P, Wong JY, Lau EHY, Tsang TK, Cauchemez S, Leung GM, Cowling BJ. Clustering and superspreading potential of sars-cov-2 infections in hong kong. Nat. Med. 2020;26:1714–1719. doi: 10.1038/s41591-020-1092-0. [DOI] [PubMed] [Google Scholar]

[CR16] 16.Lau, M.S.Y., Grenfell, B., Nelson, K., & Lopman, B. Characterizing super-spreading events and age-specific infectivity of covid-19 transmission in georgia, USA. MedRXiv (2020). [DOI] [PMC free article] [PubMed]

[CR17] 17.Endo A, Abbott S, Kucharski AJ, Funk S, et al. Estimating the overdispersion in covid-19 transmission using outbreak sizes outside china. Wellcome Open Res. 2020;5:67. doi: 10.12688/wellcomeopenres.15842.3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR18] 18.Pozderac C, Skinner B. Superspreading of sars-cov-2 in the USA. PLoS ONE. 2021;16:e0248808. doi: 10.1371/journal.pone.0248808. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Ferretti, L., Ledda, A., Wymant, C., Zhao, L., Ledda, V. & Abeler-Dorner, L. et al. The timing of covid-19 transmission. SSRN: 10.2139/ssrn.3716879

[CR20] 20.Griffin, J.M., Collins, A.B., Hunt, K., McEvoy, D., Casey, M., Byrne, A.W., McAloon, C.G., Barber, A., Lane, E.A. & More, S.J. A rapid review of available evidence on the serial interval and generation time of covid-19. medRxiv (2020). [DOI] [PMC free article] [PubMed]

[CR21] 21.Afshordi, N., Holder, B., Bahrami, M., & Lichtblau, D. Diverse local epidemics reveal the distinct effects of population density, demographics, climate, depletion of susceptibles, and intervention in the first wave of covid-19 in the united states. arXiv preprint arXiv:2007.00159 (2020).

[CR22] 22.Betancourt, M. A conceptual introduction to hamiltonian monte carlo. arXiv preprint arXiv: 1701.02434 (2017).

[CR23] 23.Hoffman MD, Gelman A. The no-u-turn sampler: adaptively setting path lengths in hamiltonian monte carlo. J. Mach. Learn. Res. 2014;15:1593–1623. [Google Scholar]

[CR24] 24.Paszke, A., Gross S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J. & Chintala, S. Pytorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems 32 ( Curran Associates, Inc., 2019).

[CR25] 25.Bettencourt LMA, Ribeiro RM. Real time bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS ONE. 2008;3:e2185. doi: 10.1371/journal.pone.0002185. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR26] 26.Gostic KM, McGough L, Baskerville EB, Abbott S, Joshi K, Tedijanto C, Kahn R, Niehus R, Hay JA, De Salazar PM, et al. Practical considerations for measuring the effective reproductive number, r t. PLoS Comput. Biol. 2020;16:e1008409. doi: 10.1371/journal.pcbi.1008409. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Breban R, Vardavas R, Blower S. Theory versus data: how to calculate r0? PLoS One. 2007;2:e282. doi: 10.1371/journal.pone.0000282. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR28] 28.Bi, Q., Wu, Y., Mei, S., Ye, C., Zou, X., Zhang, Z., Liu, X., Wei, L., Truelove, S.A., Zhang, T., et al. Epidemiology and transmission of covid-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. Lancet Infect. Dis. (2020). [DOI] [PMC free article] [PubMed]

[CR29] 29.Young, S., Young-Man, P., Seonju, K., Sangeun, Y., Baeg-Ju, L., Chang, N., Kim, B., Kim, J.I., Sook, H., Young, K., Kim, B. & Park, Y., et al. Coronavirus disease outbreak in call center, South Korea. Emerg. Infect. Dis. (2020). [DOI] [PMC free article] [PubMed]

PERMALINK

Superspreading quantified from bursty epidemic trajectories

Julius B Kirkegaard

Kim Sneppen

Abstract

Introduction

Figure 1.

Figure 2.

Model

Figure 3.

Results

Figure 4.

Methods

Negative binomial formulas

Base likelihood model

Supplementary information

Acknowledgements

Author contribution

Competing interest

Footnotes

Supplementary Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Superspreading quantified from bursty epidemic trajectories

Julius B Kirkegaard

Kim Sneppen

Abstract

Introduction

Figure 1.

Figure 2.

Model

Figure 3.

Results

Figure 4.

Methods

Negative binomial formulas

Base likelihood model

Supplementary information

Acknowledgements

Author contribution

Competing interest

Footnotes

Supplementary Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases