Abstract
An epidemic model is formulated that describes the spread of an epidemic in a population. The model incorporates an Erlang distribution of times of sojourn in the incubating, symptomatically infectious and asymptomatically infectious compartments. Basic properties of the model are explored, with focus on properties important in the context of the current COVID-19 pandemic.
Keywords: COVID-19, Mathematical model, Erlang distribution, Asymptomatic infections
1. Introduction
As part of the authors’ work on COVID-19, we have relied several times on a specific model derived from an earlier model (Arino, Brauer, van den Driessche, Watmough, & Wu, 2007). This model, like that previous model, is applicable to a variety of emerging and re-emerging pathogens exhibiting an observable latent period as well as symptomatic and asymptomatic infections. The specificity of the present model is the further incorporation of Erlang distributions of the time of sojourn in some of the important compartments in the model. Indeed, most of the work we have carried out so far on COVID-19 has concerned predictions over a short time period, often no more than a month. In this context, incorporating a better description of sojourn times is extremely important. The present model generalises the simple 3-compartment SIR model to age of infection models, providing a reasonable approximation to the details of progression through infection with a minimal number of parameters and the convenience of an ODE model over integral or PDE models.
In this short note, we present this model and explain some of its features in the context of the current COVID-19 pandemic. We also conduct a simple sensitivity analysis in order to highlight the most important parameters in the model.
The main conclusion of the analysis here is that model responses are highly sensitive to the value of the parameter describing the fraction of cases that are asymptomatic, highlighting the need for intensive research to get a better handle on the value of this critical parameter.
2. The model
We use a simple variation on the classic SLIAR epidemic model for susceptible, latently infected, symptomatic and asymptomatic infectious and removed individuals, with numbers denoted respectively S, L, I, A and R (Arino, Brauer, van den Driessche, Watmough, & Wu, 2006). The SLIAR epidemic model has often been used to describe the propagation of diseases caused by viruses leading to respiratory illness, such as influenza (Jin et al., 2011; Kim, Lee, & Jung, 2017; Li et al., 2020). Unlike SLIR (or SEIR) models, it accounts for transmission by asymptomatic individuals, which in the case of COVID-19 has been reported to contribute substantially to disease propagation (Li et al., 2020).
As the time scale of interest is short, the model has no birth or natural death, only death by removal from the infectious compartments I and A. It is therefore an epidemic model, as opposed to an endemic model. Furthermore, because the time horizon for simulations is very short in comparison to reported estimates of incubation period (Backer, Klinkenberg, & Wallinga, 2020; Lauer et al., 2020a) and communicable period (Hu et al., 2020), making a more appropriate description of sojourn times in the incubation, symptomatically infectious and asymptomatically infectious compartments is important. Although not ideal, we use an Erlang distribution, i.e., a Gamma distribution with integer shape parameter. To simplify the problem, we use two compartments for each of the L, I and A states (Arino, 2020). We could use more if need be; the overall set up would vary very little.
Let us briefly justify this modification in the current context. Consider, for instance, the incubation period, i.e., the time between infection and the onset of symptoms. A wide range of possible durations has been reported. Let us suppose, for instance, that the mean incubation period of COVID-19 is 5.2 days, as reported by Lauer et al. (2020). Then, comparing the fraction of individuals infected at time 0 and still incubating at time t under an exponential and an Erlang distribution with that mean, we obtain Fig. 1. The Erlang distribution thus allows both a less pronounced early end and a less extended duration of the incubation period. If need be, we could further extend this behaviour by adding more compartments and thus increasing the shape parameter of the Erlang distribution.
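The comparison described above can be sketched numerically. The snippet below is a minimal illustration, not the code used to produce Fig. 1: it evaluates the fraction still incubating at time t for an exponential distribution and for an Erlang distribution with the same mean. The function names are hypothetical, and the shape parameter 2 is taken from the two-compartment chains used in the model.

```python
import math

def surviving_exponential(t, mean):
    """Fraction still incubating at time t for an exponential sojourn time."""
    return math.exp(-t / mean)

def surviving_erlang(t, mean, shape=2):
    """Fraction still incubating at time t for an Erlang sojourn time.

    The Erlang distribution has integer `shape`; each stage has rate
    shape / mean, so the overall mean sojourn time equals `mean`.
    """
    rate = shape / mean
    x = rate * t
    return math.exp(-x) * sum(x**k / math.factorial(k) for k in range(shape))

MEAN_INCUBATION_DAYS = 5.2  # mean incubation period quoted in the text

for t in (1, 3, 5.2, 10, 20):
    e = surviving_exponential(t, MEAN_INCUBATION_DAYS)
    g = surviving_erlang(t, MEAN_INCUBATION_DAYS)
    print(f"t = {t:>4} days   exponential: {e:.3f}   Erlang(2): {g:.3f}")
```

Early on, the Erlang curve lies above the exponential one (fewer very short incubation periods), while at late times it lies below (fewer very long ones).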
Note that this does not increase the complexity of the model or of, for instance, parameter fitting procedures, since we use the same parameter for all compartments in these “chains”. For instance, the incubation period is described using the single parameter ε; the only difference with the exponential case is that, here, $2/\varepsilon$ is the mean sojourn time in the combined $L_1$ and $L_2$ compartments.
The flow diagram of the model is as shown in Fig. 2.
We suppose that incidence takes the form
\[
\beta f(X)\,\bigl(\eta L_2 + I_1 + I_2 + \xi (A_1 + A_2)\bigr), \tag{1}
\]
where β is the transmission coefficient, η and ξ are the attenuation factors for transmission by incubating and asymptomatic cases, respectively, $X$ is the state vector and $f(X)$ is the function describing the nature of the overall incidence function. As in Fig. 2, we can also think of incidence as the product of $S$ and a force of infection. Typical choices for f include $f(X)=S$, making the incidence mass action, and $f(X)=S/N$, with $N$ the total population, giving proportional incidence.
The system governing the behaviour is then the following:
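\begin{align}
S' &= -\beta f(X)\,\bigl(\eta L_2 + I_1 + I_2 + \xi (A_1 + A_2)\bigr) \tag{2a}\\
L_1' &= \beta f(X)\,\bigl(\eta L_2 + I_1 + I_2 + \xi (A_1 + A_2)\bigr) - \varepsilon L_1 \tag{2b}\\
L_2' &= \varepsilon (L_1 - L_2) \tag{2c}\\
I_1' &= (1-\pi)\,\varepsilon L_2 - \gamma I_1 \tag{2d}\\
I_2' &= \gamma (I_1 - I_2) \tag{2e}\\
A_1' &= \pi\,\varepsilon L_2 - \gamma A_1 \tag{2f}\\
A_2' &= \gamma (A_1 - A_2) \tag{2g}\\
R' &= \gamma (I_2 + A_2) \tag{2h}
\end{align}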
Note that it is assumed that there can be transmission during the incubation period, as this was reported (Tong et al., 2020). Thus, the $L_2$ compartment can be interpreted as consisting of pre-symptomatic infectious individuals. A fraction π of individuals is assumed to go into an asymptomatic phase following incubation (and correspondingly, a fraction $1-\pi$ develops symptoms). Finally, ε and γ describe the rates at which incubation and infectiousness end, respectively. By properties of Erlang distributions, the average times spent incubating and infectious (symptomatically or asymptomatically) are $2/\varepsilon$ and $2/\gamma$ time units, respectively.
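As an illustration of how a system of this type can be explored numerically, the following is a minimal sketch rather than the authors' implementation. It assumes mass action incidence, two-stage chains in which each stage is left at rate ε or γ, transmission during incubation from the second latent stage only, and a hand-written fixed-step Runge-Kutta scheme. All function names and parameter values are hypothetical and chosen only so that the example runs.

```python
def sliar_rhs(x, beta, eta, xi, pi, eps, gamma):
    """Right-hand side of the 8-compartment model (mass action incidence assumed)."""
    S, L1, L2, I1, I2, A1, A2, R = x
    incidence = beta * S * (eta * L2 + I1 + I2 + xi * (A1 + A2))
    return [
        -incidence,                        # S
        incidence - eps * L1,              # L1
        eps * (L1 - L2),                   # L2
        (1 - pi) * eps * L2 - gamma * I1,  # I1
        gamma * (I1 - I2),                 # I2
        pi * eps * L2 - gamma * A1,        # A1
        gamma * (A1 - A2),                 # A2
        gamma * (I2 + A2),                 # R
    ]

def rk4_step(f, x, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(x, *params)
    k2 = f([v + 0.5 * dt * k for v, k in zip(x, k1)], *params)
    k3 = f([v + 0.5 * dt * k for v, k in zip(x, k2)], *params)
    k4 = f([v + dt * k for v, k in zip(x, k3)], *params)
    return [v + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for v, a, b, c, d in zip(x, k1, k2, k3, k4)]

def simulate(x0, params, t_end, dt=0.05):
    """Integrate from t = 0 to t_end; returns the list of states at each step."""
    states = [list(x0)]
    for _ in range(round(t_end / dt)):
        states.append(rk4_step(sliar_rhs, states[-1], dt, params))
    return states

# Hypothetical values: population of 100,000, mean incubation 5.2 days,
# mean infectious period 10 days (so eps = 2/5.2 and gamma = 2/10).
N = 1e5
params = (3e-6, 0.5, 0.5, 0.3, 2 / 5.2, 2 / 10)  # beta, eta, xi, pi, eps, gamma
x0 = [N - 1, 1, 0, 0, 0, 0, 0, 0]                # one individual in L1

trajectory = simulate(x0, params, t_end=300.0)
print("final susceptible fraction:", trajectory[-1][0] / N)
```

Since the right-hand sides sum to zero, the total population is conserved along the trajectory, which gives a quick sanity check on any implementation.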
3. Some properties of the model
The purpose of this paper is not to conduct a thorough mathematical analysis of (2). We refer to (Arino et al., 2006) for considerations on the behaviour of a version of this model with only one of each disease status compartments. It is useful, though, to summarise some elementary properties of (2).
3.1. Behaviour of the model
An epidemic model such as (2), unlike endemic models with demographic components, has only one possible long-term outcome: a disease-free equilibrium in which the only two potentially positive components are S and R. As is customary with such models, we denote by $S_\infty$ and $R_\infty$ the limiting values of S and R. Because of the structure of the model, these limits always exist (Arino et al., 2007, Theorem 5.1).
As with other simple epidemic models of this type, the main interest is to know whether, following the introduction of infected individuals in the population, the number of infected individuals goes through an exponential growth phase, indicating an epidemic phase, before the disease becomes extinct. This is decided using the basic reproduction number $\mathcal{R}_0$. To find its value, we use the method in (Arino et al., 2007), using the notation therein. First, since with . The matrix (to avoid confusion with the fraction π of asymptomatic cases) has $(i,j)$ entry the fraction of the jth susceptible compartment moving, upon infection, to the ith infected compartment. Therefore, here it is a vector, $(1,0,0,0,0,0)^T$ with the infected compartments ordered as $(L_1,L_2,I_1,I_2,A_1,A_2)$, since all new infections move to the $L_1$ compartment. The row vector b describes the relative horizontal transmissions. It takes the form $b=(0,\eta,1,1,\xi,\xi)$. The function denoted β in (Arino et al., 2007) is $\beta f(X)/S$ here. Finally, the matrix V describing transitions between and out of infected states takes the form
and has inverse
From (Arino et al., 2007), the basic reproduction number of (2) is then
where $S_0$ is the susceptible population at the initial time. Note that for this formula to hold, must be defined. In other words, the basic reproduction number takes the form
(3) |
where the superscript is used to show dependence on the nature of f. If $f(X)=S$, i.e., we use mass action incidence, then
(4) |
whereas in the case of proportional or frequency-dependent incidence, $f(X)=S/N$ and
(5) |
Observe that the latter form stems from the fact that at , and the latter equals since the model clearly preserves the total population.
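For readers who wish to check the computation, the next-generation ingredients can be sketched as follows. This is a reconstruction under stated assumptions, not a reproduction of the displayed formulas: the infected compartments are ordered as $(L_1,L_2,I_1,I_2,A_1,A_2)$, each stage of a chain is left at rate ε or γ, and only the second latent stage transmits during incubation. The symbol ρ is introduced here for convenience and is not part of the original notation.

```latex
\[
V=\begin{pmatrix}
\varepsilon & 0 & 0 & 0 & 0 & 0\\
-\varepsilon & \varepsilon & 0 & 0 & 0 & 0\\
0 & -(1-\pi)\varepsilon & \gamma & 0 & 0 & 0\\
0 & 0 & -\gamma & \gamma & 0 & 0\\
0 & -\pi\varepsilon & 0 & 0 & \gamma & 0\\
0 & 0 & 0 & 0 & -\gamma & \gamma
\end{pmatrix},
\qquad
V^{-1}\begin{pmatrix}1\\0\\0\\0\\0\\0\end{pmatrix}
=\begin{pmatrix}
1/\varepsilon\\ 1/\varepsilon\\ (1-\pi)/\gamma\\ (1-\pi)/\gamma\\ \pi/\gamma\\ \pi/\gamma
\end{pmatrix}.
\]
% The first column of V^{-1} lists the mean time an individual entering L_1
% spends in each infected compartment. Weighting by b = (0, eta, 1, 1, xi, xi):
\[
\rho := b\,V^{-1}\,(1,0,0,0,0,0)^{T}
= \frac{\eta}{\varepsilon} + \frac{2(1-\pi)}{\gamma} + \frac{2\xi\pi}{\gamma}.
\]
% Under mass action incidence, f(X) = S, this gives
\[
\mathcal{R}_0 = \beta\,\rho\,S_0 .
\]
```

Each term of ρ has a direct reading: $\eta/\varepsilon$ is the attenuated contribution of the pre-symptomatic stage, and the two remaining terms are the contributions of the symptomatic and asymptomatic chains, each of mean duration $2/\gamma$.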
3.2. Final size relations
One measure of particular importance in the context of COVID-19 and other emerging or re-emerging pathogens is the epidemic final size . The final size is often expressed in terms of the attack rate .
In the case where incidence is mass action, since we have a single susceptible class, the method in (Arino et al., 2007) provides an explicit final size relation,
where is the initial infected population. This simplifies to
(6) |
With other incidence functions, we obtain inequalities of the form
where K is the initial total population size. Similarly to (6), this can easily be simplified but is not shown here.
Thus, in the case of mass action incidence, the most commonly used incidence for epidemics, finding the final size of the epidemic (or, equivalently, the epidemic attack rate) requires solving the simple transcendental equation (6), which is easily done, at least numerically.
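A numerical solution of a final size relation of this kind can be sketched as below. The relation coded here is an assumption on our part, obtained by integrating a two-stage-chain, mass action version of the model with all initially infected individuals placed in the first latent stage: $\ln(S_0/S_\infty)=\beta\rho\,(S_0-S_\infty+L_1(0))$, with $\rho=\eta/\varepsilon+2(1-\pi+\xi\pi)/\gamma$. The function names and the bisection approach are illustrative choices only.

```python
import math

def transmission_weight(eta, xi, pi, eps, gamma):
    """rho: expected attenuation-weighted infectious time per infection (assumed form)."""
    return eta / eps + 2.0 * (1.0 - pi + xi * pi) / gamma

def final_susceptibles(S0, beta, rho, L1_0=0.0, tol=1e-12):
    """Solve ln(S0 / s) = beta * rho * (S0 - s + L1_0) for s in (0, S0] by bisection."""
    def g(s):
        return math.log(S0 / s) - beta * rho * (S0 - s + L1_0)

    hi = S0            # g(hi) <= 0 always holds at s = S0
    lo = S0 / 2.0
    while g(lo) <= 0.0:  # move left until the sign changes (g -> +inf as s -> 0)
        lo /= 2.0
    while hi - lo > tol * S0:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values, for illustration only.
S0 = 1e5
rho = transmission_weight(eta=0.5, xi=0.5, pi=0.3, eps=2 / 5.2, gamma=2 / 10)
beta = 3e-6
S_inf = final_susceptibles(S0, beta, rho, L1_0=1.0)
print("R0 (mass action):", beta * rho * S0)
print("attack rate:", 1.0 - S_inf / S0)
```

When the basic reproduction number exceeds one and there are no initial infected, the equation also has the trivial root $S_\infty=S_0$; the bracketing used above converges to the nontrivial root in that case.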
3.3. Detecting the turning point and the peak
Other measures of particular importance in the context of the study of an epidemic outbreak are the timing of the peak as well as the disease prevalence at the peak. In the discussion that follows, assume incidence is mass action.
In a classic ODE Kermack-McKendrick SIR model, the peak is easily characterised as the point in phase space where . Because of the number of compartments, the peak here can be studied in two parts. Before the peak is actually reached, there is first a point in time at which incidence starts to decrease; this is sometimes called the turning point of the epidemic. In the present model, this point is reached when , i.e., with mass action incidence, when
(7) |
where . Note that is not defined everywhere in phase space; for instance, it is not defined at nor if one considers an with sign pattern . However, past initial transients and before becomes close to , (7) provides a characterisation of the phase the epidemic is currently in. If , then the “natural tendency” of the epidemic is to propagate more. When enough susceptible individuals have been “consumed” by the infection, i.e., when , the epidemic cannot sustain itself anymore. For perspective, in the context of COVID-19 and using time units of days and mass action incidence, ε is of the order of while β is typically several orders of magnitude smaller, so the factor in (7) is in the range . Note that because only $I_1$ and $I_2$ are observable, determining whether (7) holds is impossible in the field. On the other hand, this is easily done when considering numerical simulations.
The peak is then the point at which prevalence of the infection in the population is maximum. Here, we focus on the observable peak, i.e., the one that can actually be measured in real-life data. Because of that, we consider the peak to happen at the time when the observable part of the epidemic, i.e., the number of symptomatically infectious individuals $I_1+I_2$, is maximum. Thus, at the peak, $(I_1+I_2)'=0$, i.e.,
\[
(1-\pi)\,\varepsilon L_2 = \gamma I_2. \tag{8}
\]
Since $(1-\pi)\varepsilon L_2$ is the rate at which new symptomatic infections occur and $\gamma I_2$ is the rate at which symptomatic infections are resolved, either by recovery or death, the occurrence of conditions necessary for the peak to take place can be inferred from the data. So, while (7) is not observable, (8) is.
3.4. Start date of the epidemic in a location
When conducting a numerical investigation of the system properties, finding the date at which to start simulations is important. We describe here the methodology used to do so.
In this work, we do not account for “structural” under-reporting of cases and thus assume that the observable quantity in our model, in terms of infection, is the number of new symptomatic cases. As a consequence, we take, for a given location, the time $t_c$ at which the first c confirmed cases are reported. A location is thus characterised by a pair $(t_c, c)$.
For illustration, take the cumulative confirmed case count in China as reported to WHO, which stood at 547 cases on January 22, 2020, i.e., $(t_c, c) = (\text{2020-01-22}, 547)$. We seek the initial date for China such that, on January 22, 2020, China has this cumulative confirmed case count when solving (2) numerically and with the parameters considered. Given a point u in parameter space, we initiate a simulation with initial time . We solve (2) numerically, forward in time over the interval , with the population of China, and all other state variables equal to zero. This gives a solution . Extracting from this solution, we compute
As is the rate at which individuals enter the compartment, represents the total number of individuals having become infectious at time t, i.e., the cumulative number of symptomatic infections t days after the introduction of the first infectious individual in the population. We then let be the point when ; for China, with the parameters u, is then .
Note that a random point in parameter space might lead to a situation where . This is used in parameter estimation to disqualify such points. Other points that would be disqualified are those such that is too large. These aspects will be discussed in further work on the topic.
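The procedure of this subsection can be sketched in code as follows. This is an illustration under several assumptions that are ours rather than the text's: mass action incidence, two-stage chains with per-stage rates ε and γ, a single individual placed in the first symptomatic compartment at the initial time, a one-year search horizon, and hypothetical parameter values. The cumulative number of symptomatic infections is tracked as an extra state variable, and a parameter point for which the target count is never reached returns `None`, mirroring the disqualification described above.

```python
from datetime import date, timedelta

def rhs(x, beta, eta, xi, pi, eps, gamma):
    """Model right-hand side plus C, the cumulative symptomatic infections."""
    S, L1, L2, I1, I2, A1, A2, R, C = x
    incidence = beta * S * (eta * L2 + I1 + I2 + xi * (A1 + A2))
    new_symptomatic = (1 - pi) * eps * L2  # inflow into I1
    return [-incidence, incidence - eps * L1, eps * (L1 - L2),
            new_symptomatic - gamma * I1, gamma * (I1 - I2),
            pi * eps * L2 - gamma * A1, gamma * (A1 - A2),
            gamma * (I2 + A2), new_symptomatic]

def rk4_step(x, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(x, *params)
    k2 = rhs([v + 0.5 * dt * k for v, k in zip(x, k1)], *params)
    k3 = rhs([v + 0.5 * dt * k for v, k in zip(x, k2)], *params)
    k4 = rhs([v + dt * k for v, k in zip(x, k3)], *params)
    return [v + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for v, a, b, c, d in zip(x, k1, k2, k3, k4)]

def days_until_case_count(c, S0, params, horizon_days=365.0, dt=0.05):
    """Days after introduction at which cumulative symptomatic cases reach c, or None."""
    x = [S0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # assumed: one individual in I1
    for n in range(int(horizon_days / dt) + 1):
        if x[8] >= c:
            return n * dt
        x = rk4_step(x, dt, params)
    return None  # target never reached: the parameter point is disqualified

def start_date(t_c, c, S0, params):
    """Calendar date at which to start the simulation so that C = c on date t_c."""
    t_star = days_until_case_count(c, S0, params)
    if t_star is None:
        return None
    return t_c - timedelta(days=round(t_star))

# Hypothetical parameter point u = (beta, eta, xi, pi, eps, gamma).
u = (2.2e-10, 0.5, 0.5, 0.3, 2 / 5.2, 2 / 10)
print(start_date(date(2020, 1, 22), 547, 1.4e9, u))
```

Rounding the elapsed time to whole days is a simplification; a finer treatment would keep the fractional part when aligning simulated and reported counts.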
4. Sensitivity analysis
A global sensitivity analysis is carried out to characterise the impact of uncertainty in factors (inputs of the model) on model outputs. The inputs of the model and their ranges are listed in Table 1; p inputs associated with the model parameters are considered. The number of observable cases during the course of the epidemic and at the peak, and the timing of the peak, are the outputs of interest here. The number of observable cases is the cumulative confirmed case count .
Table 1.
Parameter | Definition | Range |
---|---|---|
β | transmission coefficient | |
η | attenuation of transmission for incubating individuals | |
ξ | attenuation of transmission for asymptomatic individuals | |
π | fraction of asymptomatic cases | |
Incubation time | mean duration of incubation | |
Infectious period | mean duration of infectious period | |
The variance-based analysis is performed using the R package multisensi. For each factor, values are chosen using Latin hypercube sampling with uniform distributions on the range considered for the factor of interest (Table 1). Then, using a complete factorial design approach, scenarios are generated. Simulations are run for these scenarios with the same initial condition , and . Then, sensitivity indices are computed using a classic ANOVA decomposition (Lamboni, Makowski, Simon, Gabrielle, & Monod, 2009; Monod, Naud, & Makowski, 2006).
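The scenario generation step can be illustrated with a short sketch. It is written in Python with the standard library only and is not the multisensi workflow used for the analysis; it shows one reading of the design described above, in which levels for each factor are drawn by one-dimensional Latin hypercube (stratified) sampling and then crossed in a complete factorial design. The ranges and the number of levels below are hypothetical placeholders, not the values of Table 1.

```python
import itertools
import random

def latin_hypercube_levels(low, high, n, rng):
    """Draw n values in [low, high]: one uniform draw in each of n equal strata."""
    width = (high - low) / n
    levels = [low + (k + rng.random()) * width for k in range(n)]
    rng.shuffle(levels)
    return levels

def factorial_scenarios(ranges, n_levels, seed=0):
    """Cross the sampled levels of all factors (complete factorial design)."""
    rng = random.Random(seed)
    names = list(ranges)
    levels = [latin_hypercube_levels(lo, hi, n_levels, rng)
              for lo, hi in (ranges[name] for name in names)]
    return [dict(zip(names, combo)) for combo in itertools.product(*levels)]

# Hypothetical ranges, for illustration only.
RANGES = {
    "beta": (1e-9, 1e-7),
    "eta": (0.0, 1.0),
    "xi": (0.0, 1.0),
    "pi": (0.0, 1.0),
    "incubation_time": (2.0, 10.0),
    "infectious_period": (5.0, 15.0),
}

scenarios = factorial_scenarios(RANGES, n_levels=3)
print(len(scenarios), "scenarios; first:", scenarios[0])
```

Each scenario is a dictionary of factor values that could then be passed to a simulation of the model, with the sensitivity indices computed from the resulting outputs.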
Fig. 3a shows the influence of factors on the number of observable cases over time. The sensitivity indices are computed every 2 days over a period of 250 days. The lower panel of Fig. 3a details, at each time point, contributions (normalised to 1) of model parameters to the total variability of model responses. At a given time , the relative lengths of the coloured segments represent the relative contributions of the main effect sensitivity indices to this total variability. Interactions between two factors are denoted “interaction” and those involving three or more factors are denoted “residual”.
For instance, at (after 85 days), the number of observable cases is mostly sensitive to the main effect of π. The main effect of the transmission coefficient β accounts for one-third of the variability. Interactions between factors contribute about one-tenth of the output variability. In the long run, this trend persists.
In the early dynamics, the variability is due to the main effects of incubation time and interactions between factors.
Fig. 3b displays the total sensitivity, a measure of the influence of each factor; the outputs considered here are the timing of the peak and the value of the observable variables at the peak, and respectively. In each bar (total sensitivity), the main effect (dark grey) and interaction between the factor of interest and another (light grey) are detailed. The time of the peak is affected by the transmission coefficient β and the incubation time. The value of the peak is mainly influenced by the proportion π of asymptomatic cases, the durations of incubation and infectiousness, and the transmission coefficient β. The transmission coefficient (β) has the most influence on whereas is most impacted by the proportion π of asymptomatic cases. Furthermore, note that for the most influential factors of the first-order interactions contribute the most to their total sensitivity. The timing of the peak of observable case numbers results mostly from the interplay between factors.
5. Discussion
The model presented here can be used to consider some of the aspects of spread of a novel or re-emergent pathogen. We have focused here on practical aspects of the use of the model, in particular the need, in emergency response settings, to provide a fast evaluation of outcomes.
Model (2) consists of 8 differential equations, but its parametrisation involves the same number of parameters as the 5-equation model from which it is derived (Arino et al., 2006). Because a lot is unknown during the early stages of a crisis like the ongoing COVID-19 pandemic, simple models that can be fitted using a minimal number of parameters are extremely useful.
In a time of crisis, it is however also important not to oversell the capabilities of a model. We have been considering many variations on the current model, as part of work conducted in Canada regarding COVID-19. While this base model has proved very helpful in many circumstances, it is unable in particular to provide insights into testing or contact tracing. Other modelling paradigms such as individual-based (IBM) or agent-based (ABM) models are much better suited to answer questions in this area. A continuous-time Markov chain model version of this model has for instance been considered to answer specific questions where a better understanding of the infection chains is required, such as importations of cases into new locations.
Where our model is quite appropriate, on the other hand, is when reasonably sized populations are considered. In this case, it can easily be shown, as had been done in (Arino et al., 2006), that ODE models provide essentially similar results to population-level IBM and ABM.
This highlights another strength of this type of approach: ODE models are quite amenable to extensive sensitivity analyses, bringing forward an important non-methodological conclusion of the present work. One of the most influential parameters in the model considered is the proportion of asymptomatic cases. The basic reproduction number (through the transmission parameter β and its interaction with other parameters) determines the time of the peak of observable cases, whereas the value of the peak, and thus the impact on the health care system, depends critically on the proportion of cases that are asymptomatic. This highlights the pressing need for more research, both in the field and in modelling, to understand the drivers of asymptomaticity and its prevalence among cases.
Declaration of competing interest
We have no conflict of interest.
Acknowledgements
The authors are supported in part by NSERC Discovery Grants. JA is also supported by CIHR through the Canadian 2019 Novel Coronavirus (COVID-19) Rapid Research Funding Opportunity. The authors wish to thank Dr. Nicholas Ogden, Director of Public Health Risk Science (PHRS) at the National Microbiology Laboratory of the Public Health Agency of Canada, as well as Drs. Aamir Fazil and Erin Rees, also with PHRS, for helpful discussions during work on COVID-19.
Handling Editor: Jianhong Wu
Footnotes
Peer review under responsibility of KeAi Communications Co., Ltd.
Contributor Information
Julien Arino, Email: Julien.Arino@umanitoba.ca.
Stéphanie Portet, Email: Stephanie.Portet@umanitoba.ca.
References
- Arino J. Mathematical epidemiology in a data-rich world. Infectious Disease Modelling. 2020;5:161–188. doi: 10.1016/j.idm.2019.12.008. Code: https://github.com/julien-arino/modelling-with-data
- Arino J., Brauer F., van den Driessche P., Watmough J., Wu J. Simple models for containment of a pandemic. Journal of The Royal Society Interface. 2006;3(8):453–457. doi: 10.1098/rsif.2006.0112.
- Arino J., Brauer F., van den Driessche P., Watmough J., Wu J. A final size relation for epidemic models. Mathematical Biosciences and Engineering. 2007;4(2):159–175. doi: 10.3934/mbe.2007.4.159.
- Backer J.A., Klinkenberg D., Wallinga J. Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China, 20–28 January 2020. Euro Surveillance. 2020;25(5). doi: 10.2807/1560-7917.ES.2020.25.5.2000062.
- Hu Z., Song C., Xu C., Jin G., Chen Y., Xu X. Clinical characteristics of 24 asymptomatic infections with COVID-19 screened among close contacts in Nanjing, China. medRxiv. 2020.
- Jin Z., Zhang J., Song L.-P., Sun G.-Q., Kan J., Zhu H. Modelling and analysis of influenza A (H1N1) on networks. BMC Public Health. 2011;11(Suppl 1):S9. doi: 10.1186/1471-2458-11-S1-S9.
- Kim S., Lee J., Jung E. Mathematical model of transmission dynamics and optimal control strategies for 2009 A/H1N1 influenza in the Republic of Korea. Journal of Theoretical Biology. 2017;412:74–85. doi: 10.1016/j.jtbi.2016.09.025.
- Lamboni M., Makowski D., Simon L., Gabrielle B., Monod H. Multivariate global sensitivity analysis for dynamic crop models. Field Crops Research. 2009;113(3):312–320.
- Lauer S.A., Grantz K.H., Bi Q., Jones F.K., Zheng Q., Meredith H.R. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Annals of Internal Medicine. 2020. doi: 10.7326/M20-0504.
- Lauer S.A., Grantz K.H., Bi Q., Jones F.K., Zheng Q., Meredith H. The incubation period of 2019-nCoV from publicly reported confirmed cases: Estimation and application. medRxiv. 2020.
- Li R., Pei S., Chen B., Song Y., Zhang T., Yang W. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (COVID-19). medRxiv. 2020.
- Monod H., Naud C., Makowski D. Uncertainty and sensitivity analysis for crop models. Working with dynamic crop models: Evaluation, analysis, parameterization, and applications. 2006;4:55–100.
- Tong Z.-D., Tang A., Li K.-F., Li P., Wang H.-L., Yi J.-P., et al. Potential presymptomatic transmission of SARS-CoV-2, Zhejiang Province, China, 2020. Emerging Infectious Diseases. 2020;26(5).