Super-spreading events initiated the exponential growth phase of COVID-19 with ℛ₀ higher than initially estimated

Marek Kochańczyk; Frederic Grabowski; Tomasz Lipniacki

doi:10.1098/rsos.200786

R. Soc. Open Sci. 2020 Sep 23; 7(9): 200786.

PMCID: PMC7540800 PMID: 33047040

Abstract

The basic reproduction number $R_{0}$ of the coronavirus disease 2019 has been estimated to range between 2 and 4. Here, we used an SEIR model that properly accounts for the distribution of the latent period and, based on empirical estimates of the doubling time in the near-exponential phases of epidemic progression in China, Italy, Spain, France, UK, Germany, Switzerland and New York State, we estimated that $R_{0}$ lies in the range 4.7–11.4. We explained this discrepancy by performing stochastic simulations of model dynamics in a population with a small proportion of super-spreaders. The simulations revealed two-phase dynamics, in which an initial phase of relatively slow epidemic progression diverts to a faster phase upon appearance of infectious super-spreaders. Early estimates obtained for this initial phase may suggest lower $R_{0}$ .

Keywords: COVID-19, reproduction number

1. Introduction

The basic reproduction number $R_{0}$ is a critical parameter characterizing the dynamics of an outbreak of an infectious disease. By definition, $R_{0}$ quantifies the expected number of secondary cases generated by an infectious individual in an entirely susceptible population. $R_{0}$ may be influenced by natural conditions (such as seasonality) as well as socio-economic factors (such as population density or ingrained societal norms and practices) [1]. Accurate estimation of $R_{0}$ is of crucial importance because it informs the extent of control measures that should be implemented to terminate the spread of an epidemic. Also, $R_{0}$ determines the immune proportion f of population that is required to achieve herd immunity, $f = 1 - 1 / R_{0}$ .
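As a brief numerical illustration of the herd immunity relation $f = 1 - 1 / R_{0}$, the sketch below evaluates f for the ends of the commonly quoted range (2–4) and for the ends of the range estimated in this study (4.7–11.4). The function name is ours and purely illustrative.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Immune proportion f = 1 - 1/R0 required for herd immunity (needs R0 > 1)."""
    if r0 <= 1.0:
        raise ValueError("R0 must exceed 1 for a positive threshold")
    return 1.0 - 1.0 / r0


# Ends of the early 2-4 range and of the 4.7-11.4 range estimated in this study.
for r0 in (2.0, 4.0, 4.7, 11.4):
    print(f"R0 = {r0:5.1f} -> f = {herd_immunity_threshold(r0):.3f}")
```

The higher the value of $R_{0}$, the closer f approaches 1, which is why the difference between the two ranges matters for the extent of control measures.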

A preliminary estimate published by the World Health Organization (WHO) suggested that $R_{0}$ of coronavirus disease 2019 (COVID-19) lies between 1.4 and 2.5 [2]. This estimate was later revised to 2–2.5 [3], which is broadly in agreement with numerous other studies that, based on official data from China, implied the range of 2–4 (see Liu et al. [4] or Boldog et al. [5] for a summary). This range suggests an outbreak of a contagious disease that should be containable by imposition of moderate restrictions on social interactions. Unfortunately, the moderate restrictions implemented in, for example, Italy and Spain turned out to be insufficient to prevent a surge of daily new cases and, consequently, nationwide quarantines had to be introduced.

We estimated the range of $R_{0}$ of COVID-19 based on the doubling times observed in the exponential phases of the epidemic in China, Italy, Spain, France, UK, Germany, Switzerland and New York State. For each of these locations, we used trajectories of both cumulative confirmed cases and deaths [6]. Since our stochastic simulations suggested that the epidemic may have two-phase dynamics—slow (and susceptible to extinction) before any super-spreading events occur, and fast and steadily expanding after the occurrence of super-spreading events—to capture the second phase of the trajectories, we analysed them after a fixed threshold of cases or deaths had been exceeded, in two-week intervals. Both the stochastic simulations and $R_{0}$ estimates were obtained within a susceptible–exposed–infected–removed (SEIR) model that correctly reproduces the shape of the latent period distribution and yields a plausible mean generation time. We concluded that the range of $R_{0}$ is 4.7–11.4, which is considerably higher than most early estimates. We conjecture that these early estimates were obtained for the first phase of the epidemic in which super-spreading events were absent.

2. Results

2.1. The SEIR model

We used an SEIR model (see Methods for model equations and justification of parameter values) in which:

—
we assumed that the latent period is the same as the incubation period and is Erlang-distributed with the shape parameter m = 6 and the mean of $1 / σ = 5.28$ days [7];
—
we assumed that the infectious period is Erlang-distributed with the shape parameter n = 1 (exponentially distributed) or n = 2, and the mean of $1 / γ = 2.9$ days [8,9];
—
the infection rate coefficient β was determined from σ, γ, m, n and doubling time $T_{d}$ , which in turn was estimated based on the epidemic data as described in the next subsection, ultimately allowing us to estimate $R_{0} = β / γ$ as $R_{0} (T_{d})$ .

The use of Erlang distributions translates directly into the inclusion of multiple consecutive substates in the SEIR model: we assumed m ‘exposed’ substates and n ‘infectious’ substates (an Erlang distribution is the distribution of a sum of independent, identically distributed exponential variables).
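The correspondence between Erlang-distributed periods and chains of substates can be checked numerically. The minimal sketch below (function and variable names are illustrative, not taken from the authors' code) draws the latent period as a sum of m = 6 exponential waiting times, each with rate mσ, and compares the sample mean and standard deviation with the theoretical values 1/σ and 1/(σ√m).

```python
import random
import statistics


def sample_erlang(shape: int, mean: float, rng: random.Random) -> float:
    """Draw one Erlang variate as a sum of `shape` exponential waiting times."""
    rate = shape / mean  # exit rate of each substate, i.e. m * sigma in the model
    return sum(rng.expovariate(rate) for _ in range(shape))


rng = random.Random(0)  # fixed seed for reproducibility
latent = [sample_erlang(6, 5.28, rng) for _ in range(50_000)]

sample_mean = statistics.fmean(latent)
sample_sd = statistics.stdev(latent)
theory_sd = 5.28 / 6 ** 0.5  # Erlang s.d. = mean / sqrt(shape)

print(f"mean = {sample_mean:.2f} days (theory 5.28)")
print(f"s.d. = {sample_sd:.2f} days (theory {theory_sd:.2f})")
```

With m = 1 the same function returns an exponential variate whose standard deviation equals its mean, which is the broader distribution discussed in the next subsection.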

2.2. Estimation of $R_{0}$ in the exponential growth phase

First, we estimated the doubling time $T_{d}$ within two-week periods beginning on the day on which the number of confirmed cases (‘removed’ in the SEIR model naming convention, see Methods) exceeded 100 or the number of deaths exceeded 10 in China, six European countries and New York State (figure 1a,b). The values of $T_{d}$ that we obtained lie between $T_{d}^{\min} = 1.86$ days (based on cases in New York State) and $T_{d}^{\max} = 2.96$ days (based on deaths in Switzerland).

Figure 1. — Estimation of the doubling time and the resulting basic reproduction number $R_{0}$. (a,b) Estimates of the doubling time $T_{d}$ for China, six European countries and New York State using two-week periods beginning (a) when the number of confirmed cases exceeded 100 or (b) when the number of deaths exceeded 10, according to data gathered and made available by Johns Hopkins University [6]. (c) The range of $R_{0}$ estimated using two variants of our SEIR model (violet solid and dashed curves) for the range of $T_{d}$ estimated in (a) and (b). Vertical lines in the yellow area are $T_{d}$ estimates based on the cumulative number of cases (orange, from (a)) or the cumulative number of deaths (brown, from (b)). Blue and green solid curves correspond to $R_{0} (T_{d})$ according to SEIR models structured and parametrized as in the study of Kucharski et al. [9] (m = 2, n = 2) and Wu et al. [10] (m = 1, n = 1).

Then, we estimated the range of $R_{0}$ as a function of the doubling time $T_{d}$ using a formula that takes into account the mean latent and infectious periods, 1/σ and 1/γ, respectively, as well as the shape parameters m and n (see equation (4.8) in Methods). The lower bound was obtained using the model variant with $n = 2$ (two ‘infectious’ substates), whereas the upper bound results from the model with $n = 1$ (one ‘infectious’ substate), see figure 1c. Substituting $T_{d}^{\max}$ into the variant of our model with the lower $R_{0} (T_{d})$ curve ($n = 2$) and $T_{d}^{\min}$ into the variant with the higher $R_{0} (T_{d})$ curve ($n = 1$), we arrived at the estimated $R_{0}$ range of 4.7–11.4. The cases-based doubling time for China, 2.36 days, is consistent with the value of 2.4 days reported by Sanche et al. [11], who estimated that $R_{0}$ for China lies in the range 4.7 to 6.6, overlapping with our estimated range for China: 5.6–7.3. Models with one or two ‘exposed’ substates, often used to estimate $R_{0}$, substantially underestimate it, cf. figure 1c and the articles by Wearing et al. [12], Wallinga & Lipsitch [13] and Kochańczyk et al. [14].

There are two main reasons why our estimates of the basic reproduction number are higher compared to other published estimates:

(i) Our SEIR model comprises six ‘exposed’ substates to account for the latent period distribution. As shown in figure 1c, the broader latent period distribution, exponential (i.e. Erlang with m = 1), results in lower $R_{0}$ estimates than the Erlang distribution with m = 2 (with the remaining model parameters unchanged). We characterized the sensitivity of $R_{0}$ to the mean latent period, 1/σ, in electronic supplementary material, figure S1, while in figure S2 we show that the assumed latent period distribution is in agreement with epidemiological estimates [7,15,16].
(ii) We estimated the doubling time, $T_{d}$, from the growth of the number of cumulative cases and cumulative deaths in the two-week-long exponential phases of the epidemic in eight locations, obtaining $T_{d}$ ranging from 1.86 to 2.96 days. These values are much lower than the values reported in the early influential studies of Wu et al. [17], Wu et al. [10] and Li et al. [16]: 5.2 days, 6.4 days and 7.4 days, respectively. In these studies, the basic reproduction number was estimated to lie between 1.94 and 2.68. The summary in table 1 shows that the lower $R_{0}$ estimates follow from much longer estimates of $T_{d}$.

Table 1.

Relation between $T_{d}$, model parameters (mean latent period or mean incubation period, 1/σ; mean period of infectiousness, 1/γ; and consequent mean generation interval, 〈GI〉), mean serial interval, 〈SI〉, and $R_{0}$. All estimates are based on the epidemic development in Hubei province of China. The unit of all values, except for $R_{0}$, is day. Confidence intervals are given in round brackets; a credible interval is given in square brackets.

$T_{d}$	1/σ	1/γ	〈SI〉 or 〈GI〉	$R_{0}$	reference
?	5.2	2.9	6.65^a	2.35 (1.15–4.77)	Kucharski et al. [9]
5.2 (4.6–6.1)	6.5	?	7.0 (5.8–8.1)	1.94 (1.83–2.06)	Wu et al. [17]
6.4 [5.8–7.1]	6	2.4^b	8.4	2.68 (2.47–2.86)	Wu et al. [10]
7.4	5.2 (4.1–7.0)	?	7.5 (5.3–19)	2.2 (1.4–3.9)	Li et al. [16]

^aThe 〈GI〉 value is not given in the article but calculated from the assumed values of 1/σ and 1/γ as $⟨ GI ⟩ = 1 / σ + \frac{1}{2} (1 / γ)$ [18].

^bThe value 1/γ was obtained by the authors as 〈SI〉 − 1/σ, which is inconsistent with the assumption that the infection occurs in a random time during the period of infectiousness.

2.3. Impact of super-spreading on $T_{d}$ estimation

The discrepancy in $T_{d}$ estimation may potentially be attributed to the fact that not all ‘removed’ individuals are registered. When the ratio of registered to ‘removed’ individuals increases over time, the true increase of the ‘removed’ cases may be overestimated. We do not rule out this possibility, although we consider it implausible as the expansion of testing capacity in the considered countries has been slower than the progression of the outbreak. We rather attribute the discrepancy to the fact that in the early phase, in which the doubling time (growth rate) is estimated based on individual case reports, the consequences of potential super-spreading events (such as football matches, carnival festivities, demonstrations, masses or hospital-acquired infections) are negligible due to the low probability of such events when the number of infected individuals is low. In a given region or country, the occurrence of the first super-spreading events triggers a transition to faster exponential growth, in which subsequent super-spreading events become statistically significant and may become decisive drivers of the epidemic spread [19]. Based on case reports in China, Sanche et al. [11] inferred that the initial epidemic period in Wuhan was dominated by simple transmission chains. Phylogenetic analyses by Worobey et al. [20] revealed that the first cases recorded in the USA and Europe did not initiate sustained SARS-CoV-2 transmission networks. In turn, super-spreading events were very likely the main drivers of the epidemic spread in, for example, Italy and Germany, where spatial heterogeneity of registered cases was evident in the early exponential phase [21,22]. In Italy, Spain and France, this explosive phase was followed by a phase of slower growth, during which mass gatherings were forbidden, but quarantine (which finally brought the effective reproduction number below 1) had not yet been introduced.

Motivated by these considerations, we analysed the impact of super-spreading on estimation of $T_{d}$ based on stochastic simulations of SEIR model dynamics (see electronic supplementary material, listing S1). Simulations were performed in the perfectly mixed regime according to the Gillespie algorithm [23]. We assumed that a predefined fixed proportion of individuals (equal to 33%, 10%, 3% or 1%) has higher infectiousness and as such is responsible for on average either half of infections (super-spreaders) or two-thirds of infections (hyper-spreaders). To reproduce these fractions in systems with different assigned proportions of super- or hyper-spreaders, their infectiousness is assumed to be inversely proportional to their proportion in the simulated population. In figure 2, we show dynamics of the epidemic spread in the presence of 1% of hyper-spreaders to demonstrate that the phase of slower growth is transformed into the faster exponential growth phase upon the occurrence of hyper-spreading events.
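To make the simulation set-up concrete, the following is a minimal sketch of a Gillespie-type simulation with two classes of individuals. It is not the authors' listing S1: it assumes a constant susceptible pool (the early-phase approximation), a single ‘infectious’ substate (n = 1) and m ‘exposed’ substates, and all function and variable names are hypothetical.

```python
import math
import random


def gillespie_outbreak(beta_n, beta_h, rho, sigma, gamma, m, rng, max_removed=100):
    """One stochastic trajectory; returns (times, counts) recorded at each removal."""
    # exposed[k][j]: number of individuals of class k (0 normal, 1 highly
    # infectious) in 'exposed' substate j; infectious[k]: infectious individuals.
    exposed = [[0] * m for _ in range(2)]
    infectious = [0, 0]
    exposed[0][0] = 1  # start from a single exposed normal individual
    t, removed, times, counts = 0.0, 0, [], []
    while removed < max_removed:
        rates = [beta_n * infectious[0] + beta_h * infectious[1]]  # new infection
        rates += [m * sigma * exposed[k][j] for k in range(2) for j in range(m)]
        rates += [gamma * infectious[k] for k in range(2)]  # removal
        total = sum(rates)
        if total == 0.0:
            break  # nobody exposed or infectious: the outbreak went extinct
        t += rng.expovariate(total)  # waiting time to the next event
        event = rng.choices(range(len(rates)), weights=rates)[0]
        if event == 0:  # infection; the new case is highly infectious with prob. rho
            exposed[1 if rng.random() < rho else 0][0] += 1
        elif event <= 2 * m:  # progression through the 'exposed' substates
            k, j = divmod(event - 1, m)
            exposed[k][j] -= 1
            if j + 1 < m:
                exposed[k][j + 1] += 1
            else:
                infectious[k] += 1
        else:  # removal of an infectious individual (registered as a case)
            infectious[event - 1 - 2 * m] -= 1
            removed += 1
            times.append(t)
            counts.append(removed)
    return times, counts


# Parameters as in figure 2: 1% of hyper-spreaders causing 2/3 of infections,
# with the mean infection rate chosen to give a deterministic T_d of 2 days.
rho, share = 0.01, 2 / 3
m, sigma, gamma = 6, 1 / 5.28, 1 / 2.9
r = math.log(2) / 2.0
beta_mean = r * (r / (m * sigma) + 1) ** m / (1 - 1 / (r / gamma + 1))
beta_n = beta_mean * (1 - share) / (1 - rho)
beta_h = beta_n * share * (1 - rho) / (rho * (1 - share))

rng = random.Random(1)
runs = [gillespie_outbreak(beta_n, beta_h, rho, sigma, gamma, m, rng) for _ in range(10)]
extinct = sum(1 for _, counts in runs if len(counts) < 100)
print(f"{extinct} of {len(runs)} trajectories went extinct before 100 removed cases")
```

Each trajectory ends either by extinction or when 100 cumulative ‘removed’ cases have been recorded; the recorded event times can then be used for doubling time estimation.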

Figure 2. — Stochastic epidemic spread in the presence of 1% of hyper-spreaders. (a) Trajectories of confirmed cases (cumulative R in terms of SEIR compartments) resulting from 100 independent stochastic simulations. When the first hyper-spreading event occurs, the colour of the line changes from blue to brown. The dashed grey line shows a deterministic trajectory. (b) Proportion of infections transmitted by hyper-spreaders among all transmission events over time. Stochastic trajectories stabilize at 66.7%. Trajectories shown in both panels result from the same set of simulations; simulations resulting in outbreak failure were discarded. Model parameters used for simulations in both panels: $(m, n) = (6, 1)$, (1/σ, 1/γ) = (5.28 days, 2.9 days). The infection rate coefficient of hyper-spreaders was set to β_h = 198 × β_n (where β_n is the infection rate coefficient for normal spreaders), which ensures that in the deterministic limit 66.7% of infections are transmitted by hyper-spreaders. In turn, β_n was set such that the average infection rate coefficient β = 2.97 × β_n gives $T_{d} = 2$ days (see equation (4.7) in Methods).
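The factors 198 and 2.97 quoted in the caption of figure 2 follow from simple arithmetic. If a proportion ρ of individuals is to cause a fraction q of infections in the deterministic limit, then q = ρβ_h / (ρβ_h + (1 − ρ)β_n), which can be solved for β_h/β_n. The sketch below (with illustrative function names) evaluates this ratio and the resulting population-average β in units of β_n.

```python
def spreader_ratio(rho: float, share: float) -> float:
    """beta_h / beta_n such that a proportion `rho` of individuals causes a
    fraction `share` of all infections in the deterministic limit."""
    return share * (1.0 - rho) / (rho * (1.0 - share))


def mean_beta_factor(rho: float, share: float) -> float:
    """Population-average infection rate coefficient in units of beta_n."""
    return rho * spreader_ratio(rho, share) + (1.0 - rho)


for rho in (0.33, 0.10, 0.03, 0.01):
    print(f"rho = {rho:.2f}: "
          f"super-spreaders x{spreader_ratio(rho, 1 / 2):.1f}, "
          f"hyper-spreaders x{spreader_ratio(rho, 2 / 3):.1f}")
```

For ρ = 1% and q = 2/3 this gives β_h = 198 × β_n and an average β of 2.97 × β_n, in agreement with the caption; for super-spreaders (q = 1/2) the ratio is 99.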

We estimated $T_{d}$ in two ways: based on one month of growth of the number of new cases since the first registered case (30 days since the first case) and based on growth of new cases in the two-week period after the number of registered cases exceeds 100 (14 days since 100 cases). As we are interested in the initial phase characterized by exponential growth, we assumed that the susceptible population remains constant. In figure 3, we show histograms of $T_{d}$ calculated using either the ‘14 days since 100 cases’ method or the ‘30 days since the first case’ method. One may observe that the histograms calculated using the ‘30 days since the first case’ method are broader than those calculated using the ‘14 days since 100 cases’ method, and the width of all histograms increases with increasing infectiousness (which is set inversely proportional to ρ). When $T_{d}$ is calculated using the ‘14 days since 100 cases’ method, its median value is slightly larger than $T_{d}$ in the deterministic model (equal to 2 days); however, when $T_{d}$ is calculated using the ‘30 days since the first case’ method, then for high infectiousness of super- and hyper-spreaders (correspondingly, for low ρ) its median value becomes much larger than the deterministic $T_{d}$ . Using the ‘30 days since the first case’ method for the case of the lowest considered $ρ = 1 %$ , when super-spreaders (hyper-spreaders) have their infectiousness about 100 times (200 times) higher than the infectiousness of normal individuals, one obtains median $T_{d}$ larger than $T_{d}$ obtained in the deterministic model by 29% (67%), while for ‘14 days since 100 cases’ the $T_{d}$ overestimation is negligible, 3% (6%). This difference is caused by low probability of appearance of super- or hyper-spreaders in the first weeks of the outbreak.

Figure 3. — Estimation of the doubling time $T_{d}$ based on stochastic simulations of the SEIR model with super- and hyper-spreaders. Histograms show probability density $p (T_{d})$ estimated using the ‘14 days since 100 cases’ method (orange) and the ‘30 days since the first case’ method (green). In each column, ρ denotes a fixed proportion of super-spreaders (top row) or hyper-spreaders (bottom row) in the population. For decreasing proportions of super- and hyper-spreaders (from left, except the shared leftmost panel with ρ = 0, to right), their infection rate coefficient β has been reduced to give the same deterministic $T_{d} = 2 days$ (vertical dotted grey lines). Remaining model parameters: (m, n) = (6, 1); (1/σ, 1/γ) = (5.28 days, 2.9 days). Each histogram results from 5000 stochastic simulations starting from a single infected normal individual; trajectories resulting in outbreak failure were discarded; fraction of trajectories that resulted in epidemic extinction for given conditions is given as ${\hat{p}}_{ext}$ . Each distribution is described in terms of its mean (μ), median (Q₂ and vertical dashed lines), standard deviation (s.d.) and the fraction of probability mass for $T_{d} > 2.5 days$ ( ${\hat{p}}_{2.5 +}$ ).

We note that $T_{d}$ estimation for a given country based on available data is equivalent to the analysis of a single stochastic trajectory and that at a very initial stage the epidemic can cease. The probability of extinction is larger when a small fraction of super-spreaders is responsible for a large fraction of cases. In figure 3, we provide the extinction probability, ${\hat{p}}_{ext}$, which in the extreme case of 1% of hyper-spreaders reaches 29%, whereas without hyper-spreaders (or super-spreaders) it is 11%.

The examples shown in figures 2 and 3 focus on the case in which $T_{d} = 2$ days, which is close to $T_{d}$ estimated for Spain and New York State. After removing super-spreaders (assumed to be responsible for 50% of transmissions), the doubling time would be equal to 3.05 days, whereas after removing hyper-spreaders (responsible for 66.7% of transmissions), the doubling time would be equal to 4.24 days. The doubling times in the range 5.2–7.4 days, obtained by analysing early onsets of the epidemic (Wu et al. [10,17] and Li et al. [16]), exceed our model prediction obtained after removing 66.7% of transmissions by hyper-spreaders, suggesting that the fraction of transmissions for which hyper-spreaders are responsible can be even larger. Endo et al. estimated that 80% of secondary transmissions could have been caused by 10% of infectious individuals [19].
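The doubling times after removal of super- or hyper-spreaders can be reproduced by inverting the relation between β and $T_{d}$ (equation (4.7) in Methods). The sketch below is one possible way to do this, assuming that removal of a class of spreaders simply scales the average β by the fraction of transmissions that remains; the bisection bounds and function names are our own choices.

```python
import math


def beta_of_td(t_d, mean_latent=5.28, mean_infectious=2.9, m=6, n=1):
    """Infection rate coefficient implied by doubling time t_d (equation (4.7))."""
    r = math.log(2.0) / t_d
    sigma, gamma = 1.0 / mean_latent, 1.0 / mean_infectious
    return (r * (r / (m * sigma) + 1.0) ** m
            / (1.0 - (r / (n * gamma) + 1.0) ** (-n)))


def td_of_beta(beta, lo=0.1, hi=100.0, iterations=200):
    """Invert beta_of_td by bisection (beta_of_td decreases with t_d)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if beta_of_td(mid) > beta:
            lo = mid  # beta at mid is too large, so the doubling time is longer
        else:
            hi = mid
    return 0.5 * (lo + hi)


beta_all = beta_of_td(2.0)                        # average beta for T_d = 2 days
td_no_super = td_of_beta(beta_all * (1 - 1 / 2))  # 50% of transmissions removed
td_no_hyper = td_of_beta(beta_all * (1 - 2 / 3))  # 66.7% of transmissions removed
print(f"without super-spreaders: T_d = {td_no_super:.2f} days")
print(f"without hyper-spreaders: T_d = {td_no_hyper:.2f} days")
```

Under these assumptions the results agree with the values of 3.05 and 4.24 days given above.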

Finally, we compare $T_{d}$ estimates obtained for eight considered locations using the ‘14 days since 100 cases’ method (as in figure 1) or the ‘30 days since the first case’ method. As expected, $T_{d}$ estimates using ‘30 days since the first case’ method in most cases are larger and more dispersed (in range 1.92–12.6) than the estimates based on the primary method (1.86–2.88). Results shown in table 2 clearly indicate that the ‘14 days since 100 cases’ method is more reliable. Its disadvantage lies in the fact that for a given location the $T_{d}$ estimate is possible when the epidemic is fully developed (see third and fourth columns of table 2). It is, however, important to note that for China the ‘14 days since 100 cases’ method estimate (chosen in our study and giving $R_{0} (T_{d})$ in range 5.6–7.3) was possible on 4 February 2020, before the surge of the epidemic in Europe and USA, and more than one month ahead of the first European country-wide lockdown that was imposed in Italy (9 March 2020).

Table 2.

Doubling time ( $T_{d}$ ) estimates using the ‘14 days since 100 cases’ or the ‘30 days since the first case’ method. For all locations except China, the data gathered and made available by Johns Hopkins University [6] are used. As for the second method, the calculation either starts from the first case or from two first cases (if no date is provided for the first case). In the case of China (for which Johns Hopkins University [6] early data are not available) data from [24] are used.

location	‘14 days since 100 cases’			‘30 days since the first case’
	$T_{d}$ (days)	from	up to	$T_{d}$ (days)	from	up to
China	2.36	548 cases (22 Jan 2020)	23 707 cases (4 Feb 2020)	4.43	1 case (1 Dec 2019)	37 cases (30 Dec 2019)
Italy	2.56	155 cases (23 Feb 2020)	5883 cases (7 Mar 2020)	3.25	2 cases (31 Jan 2020)	1128 cases (29 Feb 2020)
Spain	2.11	120 cases (2 Mar 2020)	7798 cases (15 Mar 2020)	6.35	1 case (1 Feb 2020)	84 cases (1 Mar 2020)
France	2.61	130 cases (1 Mar 2020)	4496 cases (14 Mar 2020)	12.30	2 cases (24 Jan 2020)	12 cases (22 Feb 2020)
UK	2.88	134 cases (2 Mar 2020)	3077 cases (15 Mar 2020)	7.66	2 cases (31 Jan 2020)	61 cases (29 Feb 2020)
Germany	2.56	130 cases (1 Mar 2020)	4585 cases (14 Mar 2020)	12.56	1 case (27 Jan 2020)	17 cases (25 Feb 2020)
Switzerland	2.86	114 cases (5 Mar 2020)	3028 cases (18 Mar 2020)	2.34	1 case (25 Feb 2020)	10 897 cases (25 Mar 2020)
NY (state)	1.86	106 cases (8 Mar 2020)	11 727 cases (21 Mar 2020)	1.92	1 case (2 Mar 2020)	75 833 cases (31 Mar 2020)

3. Conclusion

Based on epidemic data from China, New York State and six European countries, we have estimated that the basic reproduction number $R_{0}$ lies in the range 4.7–11.4 (5.6–7.3 for China), which is higher than most previous estimates [4,5,8]. There are two sources of the discrepancy in $R_{0}$ estimation. First, in agreement with data on the incubation period distribution (assumed to be the same as the latent period distribution), we used a model with six ‘exposed’ states, which substantially increases $R_{0} (T_{d})$ with respect to the models with one or two ‘exposed’ states. Second, we estimated $T_{d}$ based on the two-week period of the exponential growth phase beginning on the day on which the number of cumulative registered cases exceeds 100, or on which the number of cumulative registered fatalities exceeds 10. Importantly, values of $T_{d}$ estimated from the growth of registered cases and from the growth of registered fatalities led to similar $R_{0}$ estimates. This approach, in contrast to estimation of $R_{0}$ based on individual case reports, implicitly takes into account super-spreading events that substantially shorten $T_{d}$. Spatial heterogeneity of the epidemic spread observed in many European countries, including Italy, Spain and Germany, can be associated with larger or smaller super-spreading events that initiated outbreaks in particular regions of these countries. The absence or rarity of super-spreading events in the first phase of the epidemic explains why $T_{d}$ estimated using the ‘30 days since the first case’ method is in most cases larger and more dispersed than that obtained with our method of choice, ‘14 days since 100 cases’. This, in turn, suggests that the reproduction number calculated based on early epidemic development is probably underestimated in general, and thus in the case of future epidemics must be considered with caution.

Our estimates are consistent with current epidemic data in Italy, Spain and France. As of 24 April 2020, these countries managed to terminate the exponential growth phase by means of country-wide quarantine. Current COVID-19 Community Mobility Reports [25] show about 80% reduction of mobility in retail and recreation, transit stations and workplaces in these countries. Together with increased social distancing, this reduction possibly lowered the infection rate β at least fivefold; additionally, massive testing reduced the infectious period, 1/γ. Consequently, we suspect that the reproduction number R = β/γ was reduced more than fivefold, which brought it to the values somewhat smaller than 1. This suggests that $R_{0}$ in these countries could have been larger than 5.

4. Methods

4.1. SEIR model equations and parametrization

The dynamics of our SEIR model is governed by the following system of ordinary differential equations:

$$\frac{dS}{dt} = - β I(t) S(t) / N, \tag{4.1}$$

$$\frac{dE_{1}}{dt} = β I(t) S(t) / N - m σ E_{1}(t), \tag{4.2}$$

$$\frac{dE_{i}}{dt} = m σ E_{i-1}(t) - m σ E_{i}(t), \quad 2 \leq i \leq m, \tag{4.3}$$

$$\frac{dI_{1}}{dt} = m σ E_{m}(t) - n γ I_{1}(t), \tag{4.4}$$

$$\frac{dI_{j}}{dt} = n γ I_{j-1}(t) - n γ I_{j}(t), \quad 2 \leq j \leq n, \tag{4.5}$$

$$\frac{dR}{dt} = n γ I_{n}(t), \tag{4.6}$$

where N = S(t) + E₁(t) + … + E_m(t) + I₁(t) + … + I_n(t) + R(t) is the constant population size, and I(t) = I₁(t) + … + I_n(t) is the size of infectious subpopulation. As m is the number of ‘exposed’ substates and n is the number of ‘infectious’ substates, there are m + n + 2 equations in the system. In the early phase of the epidemic, 1 − S(t)/N ≪ 1 and with constant coefficients β, σ and γ the growth of R (as well as E_i and I_j) is exponential.
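Equations (4.1)–(4.6) can be integrated numerically to confirm that, in the early phase, R grows exponentially with the prescribed doubling time. The following is a minimal sketch using a forward-Euler scheme with a small time step; the step size, the population size of 10⁹, the initial condition (one individual in E₁) and all names are assumptions made for illustration. The coefficient β is set from $T_{d} = 2$ days via equation (4.7).

```python
import math


def seir_rhs(state, beta, sigma, gamma, m, n):
    """Right-hand side of (4.1)-(4.6); state = [S, E_1..E_m, I_1..I_n, R]."""
    s, e, i = state[0], state[1:1 + m], state[1 + m:1 + m + n]
    infection = beta * sum(i) * s / sum(state)  # beta * I * S / N
    d_e = [infection - m * sigma * e[0]]
    d_e += [m * sigma * (e[k - 1] - e[k]) for k in range(1, m)]
    d_i = [m * sigma * e[-1] - n * gamma * i[0]]
    d_i += [n * gamma * (i[k - 1] - i[k]) for k in range(1, n)]
    return [-infection] + d_e + d_i + [n * gamma * i[-1]]


def integrate(state, days, dt, **params):
    """Forward-Euler integration; returns the list of states at every step."""
    history = [state]
    for _ in range(round(days / dt)):
        deriv = seir_rhs(state, **params)
        state = [x + dt * dx for x, dx in zip(state, deriv)]
        history.append(state)
    return history


m, n = 6, 1
sigma, gamma = 1 / 5.28, 1 / 2.9
r = math.log(2) / 2.0  # growth rate for T_d = 2 days
beta = r * (r / (m * sigma) + 1) ** m / (1 - (r / (n * gamma) + 1) ** (-n))

initial = [1e9 - 1.0, 1.0] + [0.0] * (m - 1 + n) + [0.0]  # one individual in E_1
dt = 0.01
history = integrate(initial, days=35, dt=dt,
                    beta=beta, sigma=sigma, gamma=gamma, m=m, n=n)

# Doubling time of R measured between day 25 and day 35.
removed_25 = history[round(25 / dt)][-1]
removed_35 = history[round(35 / dt)][-1]
td_numeric = 10 * math.log(2) / math.log(removed_35 / removed_25)
print(f"numerical doubling time: {td_numeric:.3f} days")
```

The measured doubling time should be close to 2 days, with a small deviation due to the Euler discretization; a higher-order integrator would reduce it.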

An important property of a given SEIR model parametrization is its implied distribution of generation interval (GI), the period between subsequent infection events in a transmission chain. While the expected GI is easily computable from model parameters as $⟨ GI ⟩ = σ^{- 1} + \frac{1}{2} γ^{- 1}$ (the mean period of infectiousness is halved to reflect the assumption that the infection occurs at a random time during the period of infectiousness [17]), it can hardly be estimated even from detailed epidemiological data. It should be noted that in some sources the formula 〈GI〉 = σ⁻¹ + γ⁻¹ is used (e.g. [26]), in which it is assumed that the infection occurs at the end of the period of infectiousness, not at a random point of this period. GI may be related to the serial interval, SI, the period between the occurrence of symptoms in the infector and the infectee. Although GI and SI may have different distributions, their means are expected to be equal and thus may be directly compared. Our parametrization implies 〈GI〉 = 5.28 days + 2.9 days / 2 = 6.73 days, which is consistent with 〈GI〉 of the model by Ferguson et al. (6.5 days) [27] and the estimates of 〈SI〉 by Wu et al. (7.0 days) [17], Ma et al. (6.8 days) [28] or Bi et al. (6.3 days) [29].

The short period of effective infectiousness reflects the assumption that the individuals with confirmed infection are quickly isolated or self-isolated and then cannot infect susceptible individuals. This enabled us to identify the reported increase of confirmed cases with the transfer of the individuals from the (last substate of the) ‘infectious’ compartment to the ‘removed’ compartment of the SEIR model. In addition to the currently diseased individuals that remain isolated, the ‘removed’ compartment contains the recovered (and assumed to be resistant) and the deceased individuals.

We assume the same 1/γ = 2.9 days in all locations and times, being, however, aware that the mean infectious period may shorten over time due to the implementation of protective health-care practices, increased diagnostic capacity, and contact tracing [29]. In turn, the mean latent period, 1/σ, may be considered an intrinsic property of the disease. As the distribution of the latent period is not known, as a simplification, in our model, the distribution of the latent period (time since infection during which an infected individual cannot infect) is assumed to be the same as the distribution of the incubation period (time since infection during which an infected individual has not yet developed symptoms). We demonstrated the influence of 1/σ on the estimation of $R_{0}$ in electronic supplementary material, figure S1.

4.2. Estimation of the doubling time and the basic reproduction number

Growth rates used for estimation of respective doubling times, $T_{d}$, were determined by linear regression of the logarithm of the cumulative confirmed cases and cumulative deaths in the exponential phase of the epidemic, separately in each of the eight considered locations. We discarded initial parts of trajectories with fewer than 100 confirmed cases (or 10 registered fatalities) and used two-week-long periods to strike a balance between: (i) analysis of epidemic progression when stochastic effects associated with individual transmission events, including super-spreading, are relatively small (see stochastic simulation trajectories in figure 2a), and (ii) analysis of the exponential phase of epidemic progression, which is relatively short due to imposition of restrictions. We expect that the trajectories of deaths may be less affected by under-reporting; nevertheless, doubling times obtained from growth rates of cumulative cases and cumulative deaths turn out to be quite consistent.
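The regression step can be sketched as follows. The function below fits a least-squares line to the logarithm of a daily cumulative series over a 14-day window that starts on the first day the series exceeds a threshold, and converts the slope to a doubling time. The window convention (14 daily data points), the function name and the synthetic input are assumptions made for illustration; no real epidemic data are used here.

```python
import math


def doubling_time(cumulative, threshold=100, window_days=14):
    """Doubling time from a log-linear least-squares fit over `window_days`
    daily values, starting on the first day the count exceeds `threshold`."""
    start = next(k for k, c in enumerate(cumulative) if c > threshold)
    window = cumulative[start:start + window_days]
    if len(window) < window_days:
        raise ValueError("not enough data after the threshold is exceeded")
    days = range(len(window))
    logs = [math.log(c) for c in window]
    mean_t = sum(days) / len(window)
    mean_y = sum(logs) / len(window)
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
             / sum((t - mean_t) ** 2 for t in days))  # growth rate per day
    return math.log(2.0) / slope


# Synthetic series that doubles every 2.5 days (not real data).
synthetic = [3.0 * 2.0 ** (day / 2.5) for day in range(40)]
td_estimate = doubling_time(synthetic)
print(f"estimated doubling time: {td_estimate:.2f} days")
```

For deaths, the same function would be called with `threshold=10`.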

In the context of our SEIR model, the doubling time $T_{d}$ and parameters β, σ, γ, n, m satisfy the relation

$$β (T_{d}; σ, γ, m, n) = \frac{\frac{\log 2}{T_{d}} \left(\frac{\log 2}{T_{d} m σ} + 1\right)^{m}}{1 - \left(\frac{\log 2}{T_{d} n γ} + 1\right)^{- n}} \tag{4.7}$$

that enables calculation of the basic reproduction number using the doubling time $T_{d}$ estimated directly from the epidemic data as

$$R_{0} (T_{d}) = \frac{β (T_{d}; σ, γ, m, n)}{γ}, \tag{4.8}$$

in accordance with Wearing et al. [12] and Wallinga & Lipsitch [13].
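Equations (4.7) and (4.8) translate directly into code. The sketch below (function names are ours) evaluates $R_{0} (T_{d})$ for the parametrization used in this study and for the two extreme doubling times, $T_{d}^{\min} = 1.86$ days with n = 1 and $T_{d}^{\max} = 2.96$ days with n = 2.

```python
import math


def beta_from_doubling_time(t_d, sigma, gamma, m, n):
    """Equation (4.7): infection rate coefficient implied by doubling time t_d."""
    r = math.log(2.0) / t_d  # exponential growth rate (per day)
    numerator = r * (r / (m * sigma) + 1.0) ** m
    denominator = 1.0 - (r / (n * gamma) + 1.0) ** (-n)
    return numerator / denominator


def r0_from_doubling_time(t_d, mean_latent=5.28, mean_infectious=2.9, m=6, n=1):
    """Equation (4.8): R0 = beta / gamma."""
    sigma, gamma = 1.0 / mean_latent, 1.0 / mean_infectious
    return beta_from_doubling_time(t_d, sigma, gamma, m, n) / gamma


r0_upper = r0_from_doubling_time(1.86, n=1)  # shortest T_d, higher curve
r0_lower = r0_from_doubling_time(2.96, n=2)  # longest T_d, lower curve
print(f"R0 range: {r0_lower:.1f} to {r0_upper:.1f}")
```

The two values bracket the range of 4.7–11.4 reported in Results. Setting m = 1 or m = 2 in the same function gives the lower curves discussed in connection with figure 1c.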

Supplementary Material

Electronic Supplementary Material

rsos200786supp1.pdf (505 KB, pdf)

Reviewer comments

rsos200786_review_history.pdf (1.4 MB, pdf)

Acknowledgements

We thank the reviewers whose comments helped us to improve the manuscript.

Data accessibility

All data used in this theoretical study are referenced.

Authors' contributions

M.K. conceived study, performed model and data analysis, prepared figures and wrote manuscript; F.G. conceived study, performed model analysis and prepared figures; T.L. conceived study and wrote manuscript. All authors gave final approval for publication.

Competing interests

We declare we have no competing interest.

Funding

This study was supported by the National Science Centre (Poland) grant no. 2018/29/B/NZ2/00668.

References

1. Delamater P, Street E, Leslie T, Yang YT, Jacobsen K. 2019. Complexity of the basic reproduction number (R₀). Emerg. Infect. Dis. 25, 1 (doi:10.3201/eid2501.171901)
2. WHO. 2020. Statement on the meeting of the International Health Regulations (2005) Emergency Committee regarding the outbreak of novel coronavirus 2019 (n-CoV) on 23 January 2020. https://www.who.int/news-room/detail/23-01-2020-statement-on-the-meeting-of-the-international-health-regulations-(2005)-emergency-committee-regarding-the-outbreak-of-novel-coronavirus-(2019-ncov) (accessed 26 June 2020).
3. WHO. 2020. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). https://www.who.int/publications-detail/report-of-the-who-china-joint-mission-on-coronavirus-disease-2019-(covid-19) (accessed 26 June 2020).
4. Liu Y, Gayle AA, Wilder-Smith A, Rocklöv J. 2020. The reproductive number of COVID-19 is higher compared to SARS coronavirus. J. Travel Med. 27, taaa021 (doi:10.1093/jtm/taaa021)
5. Boldog P, Tekeli T, Vizi Z, Dénes A, Bartha FA, Röst G. 2020. Risk assessment of novel coronavirus COVID-19 outbreaks outside China. J. Clin. Med. 9, 571 (doi:10.3390/jcm9020571)
6. Dong E, Du H, Gardner L. 2020. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534. (doi:10.1016/S1473-3099(20)30120-1). For GitHub repository see https://github.com/CSSEGISandData/COVID-19.
7. Lauer SA, Grantz KH, Bi Q, Jones FK, Zheng Q, Meredith H, Azman AS, Reich NG, Lessler J. 2020. The incubation period of 2019-nCoV from publicly reported confirmed cases: estimation and application. medRxiv. (doi:10.1101/2020.02.02.20020016)
8. Liu T. et al. 2020. Time-varying transmission dynamics of novel coronavirus pneumonia in China. bioRxiv. (doi:10.1101/2020.01.25.919787)
9. Kucharski AJ, Russell TW, Diamond C, Liu Y, Edmunds J, Funk S, Eggo RM. 2020. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect. Dis. 20, 553–558. (doi:10.1016/S1473-3099(20)30144-4)
10. Wu JT, Leung K, Leung GM. 2020. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet 395, 689–697. (doi:10.1016/s0140-6736(20)30260-9)
11. Sanche S, Lin YT, Xu C, Romero-Severson E, Hengartner N, Ke R. 2020. High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. Emerg. Infect. Dis. 26, 1470–1477. (doi:10.3201/eid2607.200282)
12. Wearing HJ, Rohani P, Keeling MJ. 2005. Appropriate models for the management of infectious diseases. PLoS Med. 2, 0020174 (doi:10.1371/journal.pmed.0020174)
13. Wallinga J, Lipsitch M. 2007. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc. R. Soc. B 274, 599–604. (doi:10.1098/rspb.2006.3754)
14. Kochańczyk M, Grabowski F, Lipniacki T. 2020. Dynamics of COVID-19 pandemic at constant and time-dependent contact rates. Math. Model. Nat. Phenom. 15, 28 (doi:10.1051/mmnp/2020011)
15. Backer JA, Klinkenberg D, Wallinga J. 2020. Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China, 20–28 January 2020. Euro Surveill. 25, 2000062 (doi:10.2807/1560-7917.ES.2020.25.5.2000062)
16. Li Q. et al. 2020. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N. Engl. J. Med. 382, 1199–1207. (doi:10.1056/NEJMoa2001316)
17. Wu JT, Leung K, Bushman M, Kishore N, Niehus R, de Salazar PM, Cowling BJ, Lipsitch M, Leung GM. 2020. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat. Med. 26, 506–510. (doi:10.1038/s41591-020-0822-7)
18. Nelson KE, Williams K. 2014. Infectious disease epidemiology: theory and practice, 3rd edn. Burlington, MA: Jones & Bartlett Learning.
19. Endo A, Abbott S, Kucharski A, Funk S. 2020. Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China [version 3; peer review: 2 approved]. Wellcome Open Res. 5, 67 (doi:10.12688/wellcomeopenres.15842.3)
20. Worobey M. et al. 2020. The emergence of SARS-CoV-2 in Europe and the US. bioRxiv. (doi:10.1101/2020.05.21.109322)
21. Cereda D. et al. 2020. The early phase of the COVID-19 outbreak in Lombardy, Italy. arXiv. (http://arxiv.org/abs/2003.09320)
22. Mercker M, Betzin U, Wilken D. 2020. What influences COVID-19 infection rates: a statistical approach to identify promising factors applied to infection data from Germany. medRxiv. (doi:10.1101/2020.04.14.20064501)
23. Harris LA. et al. 2016. BioNetGen 2.2: advances in rule-based modeling. Bioinformatics 32, 3366–3368. (doi:10.1093/bioinformatics/btw469)
24. Huang C. et al. 2020. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395, 497–506. (doi:10.1016/S0140-6736(20)30183-5)
25. Google LLC. 2020. Google COVID-19 Community Mobility Reports. https://www.google.com/covid19/mobility (accessed 26 June 2020).
26. Lipsitch M. et al. 2003. Transmission dynamics and control of severe acute respiratory syndrome. Science 300, 1966–1970. (doi:10.1126/science.1086616)
27. Ferguson N, Laydon D, Nedjati Gilani G, Imai N, Ainslie K, Baguelin M. 2020. Report 9. https://spiral.imperial.ac.uk:8443/handle/10044/1/77482 (accessed 26 March 2020).
28. Ma S, Zhang J, Zeng M, Yun Q, Guo W, Zheng Y, Zhao S, Wang MH, Yang Z. 2020. Epidemiological parameters of coronavirus disease 2019: a pooled analysis of publicly reported individual data of 1155 cases from seven countries. medRxiv. (doi:10.1101/2020.03.21.20040329)
29. Bi Q. et al. 2020. Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. Lancet Infect. Dis. 20, 911–919. (doi:10.1016/S1473-3099(20)30287-5)

Associated Data

Supplementary Materials

Electronic Supplementary Material

rsos200786supp1.pdf (505 KB, PDF)

Reviewer comments

rsos200786_review_history.pdf (1.4 MB, PDF)

Data Availability Statement

All data used in this theoretical study are referenced.

