Skip to main content
Elsevier - PMC COVID-19 Collection logoLink to Elsevier - PMC COVID-19 Collection
. 2021 Jan 29;34:100439. doi: 10.1016/j.epidem.2021.100439

COVID-19 outbreak in Wuhan demonstrates the limitations of publicly available case numbers for epidemiological modeling

Elba Raimúndez a,b, Erika Dudkin a, Jakob Vanhoefer a, Emad Alamoudi a, Simon Merkt a, Lara Fuhrmann a, Fan Bai a, Jan Hasenauer a,b,c,
PMCID: PMC7845523  PMID: 33556763

Abstract

Epidemiological models are widely used to analyze the spread of diseases such as the global COVID-19 pandemic caused by SARS-CoV-2. However, all models are based on simplifying assumptions and often on sparse data. This limits the reliability of parameter estimates and predictions.

In this manuscript, we demonstrate the relevance of these limitations and the pitfalls associated with the use of overly simplistic models. We considered the data for the early phase of the COVID-19 outbreak in Wuhan, China, as an example, and perform parameter estimation, uncertainty analysis and model selection for a range of established epidemiological models. Amongst others, we employ Markov chain Monte Carlo sampling, parameter and prediction profile calculation algorithms.

Our results show that parameter estimates and predictions obtained for several established models on the basis of reported case numbers can be subject to substantial uncertainty. More importantly, estimates were often unrealistic and the confidence/credibility intervals did not cover plausible values of critical parameters obtained using different approaches. These findings suggest, amongst others, that standard compartmental models can be overly simplistic and that the reported case numbers provide often insufficient information for obtaining reliable and realistic parameter values, and for forecasting the evolution of epidemics.

Keywords: Compartment model, SEIRD, Parameter estimation, Model selection, Uncertainty analysis

1. Introduction

Epidemiological models are essential tools in public health as they facilitate assessments and forecasts of the spread of infectious diseases. This has been for instance demonstrated for influenza (Yang et al., 2015), dengue (Reich et al., 2016), Ebola (Shaman et al., 2014), Zika (Chowell et al., 2016), and – most recently – COVID-19 (Ferguson et al., 2020, Boldog et al., 2020). These assessments and forecasts are the basis for political decision making (Doms et al., 2018) and therefore of vital importance.

The spectrum of mathematical modeling approaches in epidemiology ranges from relatively simple ordinary differential equation (ODE) models (Kermack et al., 1927, Hethcote, 2000, Brauer and Castillo-Chavez, 2012), partial differential equation (PDE) models (Chalub and Souza, 2011, Lotfi et al., 2014), stochastic differential equation (SDE) models (Dargatz et al., 2006, Greenwood and Gordillo, 2009, Allen, 2010), continuous-time discrete-state Markov chain (CTMC) models (Allen, 2010, Britton, 2010, Isham, 2007), to complex agent-based models (Epstein and Axtell, 1996, Bruch and Atwell, 2015). While ODE, PDE and SDE models provide descriptions at the population level, agent-based models are centered around formulations of the properties and dynamics of individuals. Some models explicitly account for space (usually in terms of countries, regions and/or cities) to capture spreading. Furthermore, models for the infection spread are usually combined with models of testing and reporting strategy to link them to the observed case number (Birrell et al., 2011).

The choice of the modeling approach depends on the purpose of the study, the availability of information about the underlying disease and population, and the amount and quality of experimental data. Yet, all these models rely on parameter values that need to be extracted from the literature or estimated from given data. The parameters of epidemiological models in many studies are inferred using frequentist and Bayesian parameter estimation methods. Frequentist methods often rely on parameter optimization for obtaining point estimates and profile likelihoods for uncertainty analysis (Brookhart et al., 2002). Bayesian methods exploit sampling strategies such as Markov chain Monte Carlo (MCMC) methods (Weidemann et al., 2014, Birrell et al., 2011) or variational inference (Chatzilena et al., 2019). Also flexible emulator based methods based, e.g. on Gaussian process, have been applied (Farah et al., 2014). For applications in which competing hypotheses are available, the parameter estimation is often complemented by model selection. Established model selection measures include the Akaike Information Criterion (AIC) (Akaike, 1973), the Bayesian Information Criterion (BIC) (Schwarz, 1978), or Bayes factors (Kass and Raftery, 1995).

In this study, we exploit state-of-the-art parameter estimation and model selection approaches to perform an analysis of the COVID-19 outbreak in Wuhan, China. The first cases of COVID-19 were reported on December 30, 2019 and the Chinese Center for Disease Control and Prevention confirmed the isolation of a novel virus on January 7, 2020 (Gralinski and Menachery, 2020). Already by January 27, there were 1590 confirmed cases which include severe cases and 85 cumulative death cases in Wuhan, and several exported confirmed cases to Cambodia, Canada, France, Japan, Malaysia, Nepal, Republic of Korea, Singapore, Thailand, United States of America, and Vietnam (World Health Organization, 2020). As SARS-CoV-2 spread quickly, the Director-General of World Health Organisation (WHO) declared the flood of infections caused by SARS-CoV-2 a global pandemic on March 11 (Tedros, 2020).

Our study complements other modeling efforts (Shao and Shan, 2020, Chen et al., 2020, Li et al., 2020a, Ming et al., 2020, Read et al., 2020, Koo et al., 2020, Neher et al., 2020, Li et al., 2020b, Zhao et al., 2020b, Tian, 2020, Liu et al., 2020a, Nordt and Herdener, 2020, Jenny et al., 2020) by considering multiple established model topologies, observables, parameter estimation and model selection approaches. To recapitulate the situation in the beginning of the pandemic, we limit the use of prior knowledge to a minimum. This highlights challenges, e.g. the limited information content of case numbers and the dependence on proper model topology, but also opportunities for quantitative modeling in epidemiology.

2. Results

2.1. Observable selection and parameter identifiability

For this study, we considered the case numbers reported by the Health Commission of Hubei Province (2020) and Wuhan Municipal Health Commission (2020). These case numbers were particularly relevant for the analysis of the early transmission dynamics and the planning of interventions. Accordingly, these data were the basis of several modeling studies on the dynamics of COVID-19 epidemic (see e.g. Tian, 2020, Zhao et al., 2020a, Wang et al., 2020, Ming et al., 2020, Roosa et al., 2020, Zhao et al., 2020b, Peng et al., 2020, Li et al., 2020b). Here, we used the time interval from January 9 to February 9, as afterwards the definition of a positive test changed (Chinese Center for Disease Control and Prevention, 2020), which limits the comparability.

The Chinese Center for Disease Control and Prevention provides time-resolved information on:

  • Reported number of infected individuals: yI(t)

  • Reported number of recovered individuals: yR(t)

  • Reported number of deceased individuals: yD(t)

  • Reported cumulative number of infected individuals: yT(t)=yI(t)+yR(t)+yD(t)

The reported number of deceased individuals is probably most accurate, yet the overall reliability of the measurement and the distribution of the errors is unknown. As in the literature different combinations of these observables are used for model parameterization, we consider here the following fitting scenarios:

  • O1: Observations of yT and yD.

  • O2: Observations of yI and yD.

  • O3: Observations of yI, yR and yD.

As different studies considered different aspects of the data, we first asked which scenario is most suited to determine the parameters of the infection process.

To address this question, we employed a classical deterministic Susceptible–Exposed–Infectious–Recovered–Deceased (SEIRD) model (Capasso, 1993). This compartmental model describes the size of the susceptible (S), exposed (E), infectious (I), recovered (R) and deceased (D) subgroups (Fig. 1A). The time-dependence of the subgroup sizes is governed by the ODEs:

dSdt=βSINS(0)=NS
dEdt=βSINκEE(0)=NE
dIdt=κE(γ+δ)II(0)=NI
dRdt=γIR(0)=NR
dDdt=δID(0)=ND

in which β is the average number of contacts per person per time which result in an infection, κ is the rate at which exposed individuals become infectious, γ is the rate at which infectious individuals recover, δ is the rate at which infectious individuals decease, and N=S(t)+E(t)+I(t)+R(t)+D(t) is the overall population size. Note that the inverse of the rate κ is the average incubation time TE=κ1. The initial conditions for the different state variables are given by NS, NE, NI, NR and ND. The initial conditions are usually non-zero and might be inferred along with the unknown model parameters as shown in Peng et al., 2020, Tsay et al., 2020, Tang et al., 2020. We applied the simplifying assumption that all infectious individuals are observed.

Fig. 1.

Fig. 1

Analysis of different observable combinations. (A) Schematic of the SEIRD model. (B,C,D) Fitting results for observation scenarios O1, O2, and O3 assuming log-normally distributed measurement noise. The simulation for the maximum likelihood estimate (line) and interval for + one standard deviation of the inferred noise level (shaded area) is depicted. The upper bound for the inverse of the rate constants was set to 182 days and for the initial conditions to 1000 individuals.

For all observable scenarios we performed a maximum likelihood estimation assuming normally as well as log-normally distributed measurement noise with unknown standard deviations (see Materials and Methods). All parameters were assumed to be unknown with conservative bounds (Table 4), similar to various recent publications (Roda et al., 2020, Bertozzi et al., 2020, Tang et al., 2020, Peng et al., 2020). The multi-start local optimizations converged (Supplementary Figure S1A) and the simulations with the maximum likelihood estimates achieved a good agreement with the observed data (Fig. 1B–D and Supplementary Figure S1C–E). This confirms the findings of other research groups showing that the SEIRD model is sufficient to fit the observed case numbers of the COVID-19 outbreak in Wuhan. The comparably low noise levels inferred for the number of deceased individuals confirms our expectation that these observations are most reliable. Model selection based on AIC and BIC indicated a strong support for log-normally distributed measurement noise (Supplementary Figure S1B). For this choice, the residual distribution is consistent with the theoretically expected on Supplementary Figure S2, which indicated that the statistical model is appropriate.

Table 4.

Model parameters. Nominal values, lower bounds, upper bounds, priors and units for the model parameters. The nominal values are only used for model parameters which are not fitted.

Name Fitted Nominal value Lower bound Upper bound Prior Unit
Parameters of transition rates β0 Yes - 105 105 log-uniform day−1
ξ0 Yes - 0.0038 16.63 log-uniform day−1
κ Yes - 0.0038 16.63 log-uniform/-normal day−1
γ Yes - 0.0038 16.63 log-uniform/-normal day−1
ν Yes - 0.0038 16.63 log-uniform day−1
δ Yes - 0.0038 16.63 log-uniform/-normal day−1
ρ Yes - 0.0038 16.63 log-uniform day−1
Δ Yes - 0.0038 16.63 log-uniform -
TC No 13 - - - day
k Yes - 0.0038 16.63 log-uniform day−1

Total and initial
number of people
N No 9×106 - - - #
NE Yes - 101 103 log-uniform #
NA Yes - 101 103 log-uniform #
NI Yes - 101 103 log-uniform #
NR Yes - 101 103 log-uniform #
NR Yes - 101 103 log-uniform #
ND Yes - 101 103 log-uniform #

Noise level σT2 Yes - 103 103 log-uniform #
σI2 Yes - 103 103 log-uniform #
σR2 Yes - 103 103 log-uniform #
σD2 Yes - 103 103 log-uniform #

For an in-depth analysis of the impact of the choice of observables, we performed uncertainty analysis using frequentist and Bayesian methods using bounded log-uniform priors (Fig. 2). This analysis revealed several well-known problems, e.g. that the estimates of the transmission rate β and the progression rate κ are subject to substantial uncertainties (see also previous studies (Tuncer and Le, 2018, Roda et al., 2020, Roosa and Chowell, 2019)). Furthermore, profile likelihood calculation and MCMC sampling showed that for the case of an upper bound for the inverse of the rate constants of 182 days and for the initial conditions of 1000 individuals, O3 provides improved parameter identifiability and decreased parameter uncertainties compared to O1 and O2 (Fig. 2A–C). This was to be expected as O3 uses three observables (I, R and D) while O1 and O2 use two observables (and a subset of the information encoded in O3). Overall, the results were robust to changes in the bounds of the rates and for the initial conditions (Supplementary Figures S3, S4 and S5).

Fig. 2.

Fig. 2

Uncertainty quantification for the different observable combinations. (A–C) Parameter confidence/credibility intervals obtained using profile calculation and MCMC samples. The gray lines indicate the employed parameter boundaries. (D–F) Posterior of state variables obtained by MCMC sampling. Medians (line) and 99% confidence (dashed lines) / credibility intervals (area) are indicated. The upper bound for the inverse of the rate constants was set to 182 days and for the initial conditions to 1000 individuals.

As in some studies a subset of rates and initial conditions is fixed to specific values, we analyzed the impact of the choices on the estimates of the remaining parameters. We observed that while the maximum likelihood estimates remained mostly similar for different plausible choices, the confidence intervals are altered substantially (Supplementary Figures S3, S4 and S5). Subsequently, we analyzed the impact of fixing the initial number of exposed individuals NE for model O3. Here, we found that the estimates of the remaining parameters to be sensitive to the choice of NE. This was particularly relevant for the progression rate κ, for which implausible confidence intervals were obtained for a range of values of NE (Supplementary Figure S6). This is critical as it implies that fixing the initial value of this compartment (for which usually only unreliable estimates are available), can bias the complete analysis. The selected bounds as well as indicating which parameters are estimated for the subsequent analyses in the manuscript are shown in Table 4. Interestingly, the large parameter uncertainties for O1 and O2 are only partially reflected in the prediction uncertainties (Fig. 2D–F) due to a strong parameter correlation (Supplementary Figures S7, S8 and S9).

The most critical observation was that the parameter estimates are not realistic, independently of the selected initial / estimated conditions, and that the credibility intervals derived from the MCMC samples are too narrow. The 99%-credibility intervals for O1, O2 and O3 suggested that κ is in the interval of [0.39, 4.28] ×102 days−1. This would imply an incubation period of [23.4, 257.6] days. This is not consistent with the estimates reported by the WHO which indicate a median incubation time of 5–6 days (World Health Organization, 2020), which have been confirmed by several other studies (Chen et al., 2020, Li et al., 2020b, Koo et al., 2020, Jenny et al., 2020, Neher et al., 2020). Similar inconsistencies are observed for the basic reproduction number. Not only the Bayesian parameter estimates for bounded log-uniform priors are off, but also the maximum likelihood estimates. However, in contrast to narrow 99%-credibility intervals computed from MCMC samples, the 99%-confidence intervals derived from profile likelihoods are broad and cover realistic values. This indicates that parameters estimates derived from case numbers can be unrealistic and their reliability should be carefully assessed using different approaches.

In addition to the parameters, several model predictions are unrealistic and implausible. This includes high numbers of exposed individuals for O1. Given the estimated parameters of the transmission process, these estimate numbers of exposed individuals at the initial time point of our simulation could not have been reached. This implies that the estimated rate constants and initial conditions are inconsistent in this setup. As the consistency of the model and its parameters is essential, this needs to be assessed and ensured.

As the information encoded in the number of reported cases alone appeared insufficient to infer realistic and identifiable parameter estimates, we complemented it in the following with log-normal priors for the incubation period, death rate and recovery rate as specified in the Materials and Methods section. The parameters of the prior of the incubation time are derived from the work of Backer et al. (2020), which is based on the infections among travelers from Wuhan. While the parameters of the priors of the death and recovery rates are derived from the work of Zhou et al. (2020). We tested the sensitivity of the results with respect to the parameters by considering a range of values for the scale parameter. This revealed that plausible estimates of the incubation period TE are already obtained for weak priors, whereas plausible estimates for the death and recovery rates required strong priors (Supplementary Figure S10). For the subsequent analysis, we set the scales of the priors according to the reliability of the parameters provided in the respective publications. As O3 with these priors achieves plausible estimates, i.e. the parameter confidence intervals cover values reported in the literature (Read et al., 2020, Jenny et al., 2020, Peng et al., 2020), we considered for the following sections this setup with priors in the model fitting (Fig. 3A).

Fig. 3.

Fig. 3

Fit for different epidemiological models. (A–D) Illustration of model structures (top) and model-data comparison (bottom). The simulation for the maximum likelihood estimate (line) and interval for + one standard deviation of the inferred noise level (shaded area) is depicted.

2.2. Analysis of transmission process

The SEIRD model with the priors for the incubation period, the death rate and the recovery rates provides a reasonable description of the case numbers reported for Wuhan (Fig. 3A). Yet, this widely used model disregards the observation that patients are asymptomatic (European Centre for Disease Prevention and Control, 2020a). These patients can be infectious but are more difficult to detect. To study the impact of asymptomatic patients, we consider besides the basic SEIRD model (M1) also two alternative epidemiological models:

  • M2: A SAIRD model considering asymptomatic individuals (A) with transmission rate ζ and symptomatic individuals (I) with transmission rate β. The asymptomatic individuals are assumed to become symptomatic.

  • M3: A SAIRD model similar to M2, which allows for the direct recovery of asymptomatic patients. The recovered asymptomatic patients (R) are assumed to remain unreported.

We assume that the reported number of infected individuals corresponds to the number of symptomatic patients (I). Accordingly, the reported number of recovered individuals is assumed to count only previously symptomatic individuals.

As a further model extension, we consider the possibility of waning immunity. Some studies suggested that the infection with SARS-CoV-2 does not necessarily induce a long-lived antibody response (Amanat and Krammer, 2020, Long et al., 2020), while others found antibody responses 4 months after infection (Gudbjartsson et al., 2020). To address this, we consider:

  • M4: A SAIRD model similar to M2, which allows recovered individuals (R) to become susceptible (S) with rate ρ.

The parameters of models M1–M4 were estimated using multi-start local optimization. The simulations of models M1 to M4 for the respective maximum a posterior estimate show a reasonable agreement with the data (Fig. 3). Interestingly, while the simulations for M1, M2 and M4 are similar, the simulation for M3 shows an early saturation (Fig. 3C). The reason is that the initial number of unobserved recovered patients (which can also be interpreted as immune patients) is estimated to be very high, which does not appear to be plausible.

As the fitting results for all models were highly reproducible (Fig. 4A), we considered the values of the likelihood function and evaluated the AIC and BIC for model selection (Fig. 4B,C). First of all, we observed that although M4 is more complex than M1 and M2, it does not achieve a better likelihood value (Fig. 4A, zoom in). This suggests that recovered individuals are not again becoming susceptible in the considered time frame. However, when taking into account the AIC and BIC values, for which differences of 10 are considered substantial (Burnham and Anderson, 2002), we cannot reject M4. As all models achieved a relatively similar fit, M1 to M3 appeared to be suitable descriptions and the ranking differs for AIC and BIC. This confirms the limited information content of the case numbers as there is clear evidence for the relevance of asymptomatic cases.

Fig. 4.

Fig. 4

Analysis of model structure. (A) Waterfall plots for the 200 multi-start runs. The optimization runs are sorted by increasing negative log-likelihood value. The lower panel shows a magnification of the best 20 starts. (B) Differences in AIC with respect to the lowest value indicate M3 as the most plausible model. (C) Differences in BIC with respect to the lowest value indicate M1 as the most plausible model. Black dashed line in (B) and (C) depicts a change of 10 units considered as a rejection threshold (Burnham and Anderson, 2002).

To assess the uncertainty of the parameter estimates and predictions, we computed the profile likelihoods and performed MCMC sampling (Fig. 5A–D). The results indicate that the parameter uncertainties for M2–M4 are larger than for M1, but that most parameters are well determined. The profile likelihoods yield overall more conservative estimates than the sampling (Supplementary Figures S11, S12, S13 and S14). The predictions of the state variables based on the sampling suggest low uncertainties of all model states (Fig. 5E–H) while still having large parameter uncertainties as a result of correlations between parameters (Supplementary Figures S11, S12, S13 and S14). In particular for M3 this appears unrealistic as there are so far no reports about a large number of immune individuals (Fig. 5G).

Fig. 5.

Fig. 5

Uncertainty quantification for different model structures. (A–D) Confidence/credibility intervals for the model parameters obtained using profile calculation and MCMC sampling. The gray lines indicate the employed parameter boundaries. (E–H) Confidence/credibility intervals for the state variables obtained using prediction profile likelihood calculation and MCMC sampling. Medians (line) and 99% confidence (dashed lines) / credibility intervals (area) are indicated.

2.3. Analysis of intervention effect

A key question in many recent studies is how much interventions such as compulsory masks and social contact restrictions, as well as the rising public awareness impact the transmission rate of SARS-CoV-2. To study how well this question can be assessed based on early case report data, we considered three simple scenarios:

  • No change of the transmission rate.

  • Discrete change in the transmission rate due to the compulsory masks introduced by the government in Wuhan on January 22 and substantially increasing contract restrictions.

  • Continuous change in the transmission rate due to rising public awareness and a broad spectrum of interventions.

These scenarios are illustrated in Fig. 6 and a detailed mathematical description is provided in the Materials and Methods section.

Fig. 6.

Fig. 6

Modeling intervention strategies. Schematic of distinct intervention effects influencing β and ζ. TC denotes the time at which an intervention is performed.

As the result of the analysis of the infection dynamics carried out in the previous section was inconclusive, we considered model M1 to M4. For all 12 combinations of model structures and intervention effects, we performed parameter estimation and uncertainty analysis. The assessment of the model selection criteria provided support for a discrete change in the transmission rate on January 22 (Fig. 7A). The resulting model provides an accurate description of the data and suggests that the transmission rate dropped by around 46% (Fig. 7B). Moreover, the residual distribution is consistent with the theoretically expected (Supplementary Figures S15). The uncertainty estimates for the decrease (Δ) depend heavily on the analysis approach. While MCMC sampling yields a 99% credibility interval from 32.2 to 63.67%, the profiles suggests a much broader regime (Fig. 7C and Supplementary Figure S16). Accordingly, the reported case number for the early outbreak were not sufficient for an accurate assessment of all model parameters. Despite the parameter uncertainties, the states seem to be relatively well determined (Fig. 7D).

Fig. 7.

Fig. 7

Analysis of intervention effects. (A) Model selection using AIC weights (top) and BIC weights (bottom). (B) Best model fit (top) and estimated intervention effect (bottom). The simulation for the maximum likelihood estimate (line) and interval for + one standard deviation of the inferred noise level (shaded area) is depicted. (C) Confidence/credibility intervals for the model parameters obtained using profile calculation and MCMC sampling. The gray lines indicate the employed parameter boundaries. (D) Confidence/credibility intervals for the state variables obtained using prediction profile likelihood calculation and MCMC sampling. Medians (line) and 99% confidence (dashed line) / credibility intervals (area) are indicated.

3. Discussion

Pandemics pose a global challenge and show the importance of model-based forecasting. Forecasts influence the political decision-making process and have a significant impact on our society. Minimizing model uncertainties and properly evaluating them is therefore crucial. Yet, many publications are still only using reported case number and/or omit an identifiability and uncertainty analysis (Li et al., 2020a, Ming et al., 2020, Maier and Brockmann, 2020, Read et al., 2020, Zhao et al., 2020b, Barbarossa et al., 2020, Salim et al., 2020, Berk and Kadyrov, 2020). Here, we demonstrated that both aspects are problematic.

Our analysis of the COVID-19 outbreak in Wuhan demonstrates that the parameterization of epidemiological models can result in incorrect parameter estimates and predictions. Surprisingly, even Bayesian uncertainty analysis using MCMC sampling with bounded log-uniform priors as well as informative priors on the incubation time, death and recovery rates, resulted in an underestimation of the indeterminacy and provided inaccurate predictions. In principle this could be caused by (i) problems in the parameter estimation, (ii) unsuitable statistical data model, (iii) poor data quality, (iv) low information content of the data, and (v) inadequate process descriptions. Yet, we ensured the reliability and reproducibility of the fitting results (by even comparing multiple methods) and confirmed the appropriateness of the noise data (by considering multiple noise models and evaluating residual distributions). This indicates that (i)-(iii) do not cause the observed problems. In our opinion the concrete reasons are:

Besides parameter estimation, the aforementioned limitations of case numbers were observed in the model selection process. The data did, for instance, not allow to unravel that a large number of the asymptomatic cases is not detected. Yet, the data still contained information that allowed to detect the effect of government restrictions.

The problems we encountered in this study have in parts been described for other models. In particular, practical and structural identifiability has been reported in several publications which used case report data (Chowell, 2017, Roosa et al., 2020, Mukandavire et al., 2020, Zhao et al., 2020b, Tuncer and Le, 2018, Roda et al., 2020). In contrast, bias in parameter estimates and inappropriateness of confidence intervals – which is caused by unsuitability of the model – is rarely discussed.

Fortunately, the problems can be addressed by refining the models and by incorporating additional information. Already the incorporation of details of the disease progression provides substantially improved estimates and predictions (Giordano et al., 2020). In addition, parameter priors can be used to incorporate information which is not contained in case numbers. Yet, – as we demonstrate above – the priors have to be chosen appropriately, as the results can be very sensitive to them. Furthermore, while literature-based priors are used in many manuscripts (Khailaie et al., 2020), we hypothesize that it would be better to use information about individual cases for parameter estimation and model selection. In particular the date of the onset of symptoms, the date of the positive test, and the date of recovery/death for individuals is highly relevant. These data are being collected and analyzed (Backer et al., 2020, Lauer et al., 2020, Zhou et al., 2020, Wölfel et al., 2020), but should in the future be shared much earlier. Furthermore, randomized testing would be required, ideally using antibody tests to determine the fraction of completely asymptomatic patients. Such studies are usually not possible during an initial phase of a pandemic but are now on the way (Bayerisches Staatsministerium für Wissenschaft und Kunst., 2020).

This study does not offer new insights into the COVID-19 pandemic. However, it pinpoints important pitfalls and showcases the relevance of the underlying assumptions and the available data. Furthermore, it demonstrates that even a proper uncertainty analysis using state-of-the-art frequentist or Bayesian approaches does not ensure that the true parameters and dynamics are captured within the uncertainty bounds. While we demonstrate this aspect here for deterministic compartmental models – which are the basis of many modeling studies for COVID-19 and beyond – it certainly holds also for other modeling approaches. Similarly, model-free studies are based on assumptions, data and statistical models, rendering them subject to at least the same limitations.

4. Materials and methods

4.1. Data

The study is based on official reports on the total number of cases and the numbers of infected individuals, recovered individuals and deceased individuals. From January 11 to 20, the reports were made available by the Wuhan Municipal Health Commission (2020). Afterwards, the reports were organized by the Health Commission of Hubei Province (2020). All these data are publicly available on the respective webpages. The complete data sets used in this study are listed in Table 1.

The exact population size in the city of Wuhan in the period under consideration is not precisely known due to Chinese New Year. In this study we assume a population size of 9 million which was mentioned by the mayor of Wuhan, Zhou Xianwang (Business Insider, Ashley Collman, 2020).

Table 1.

Number of total, infected, deceased and recovered cases in Wuhan, China. Missing data in the official reports are indicated by “-”.

Date Total Infected Recovered Deceased
January 9
January 10
January 11 41 34 6 1
January 12 41 33 7 1
January 13 41 33 7 1
January 14 41 33 7 1
January 15 41 27 12 2
January 16 45 28 15 2
January 17 62 41 19 2
January 18 121 94 24 3
January 19 198 169 25 4
January 20 258 227 25 6
January 21 363 326 28 9
January 22 425 380 28 17
January 23 495 441 31 23
January 24 572 502 32 38
January 25 618 533 40 45
January 26 698 593 42 63
January 27 1590 85
January 28 1905 1726 75 104
January 29 2261 2050 82 129
January 30 2639 2377 103 159
January 31 3215 2884 139 192
February 1 4109 3714 171 224
February 2 5142 4653 224 265
February 3 6384 5768 303 313
February 4 8351 7621 368 362
February 5 10117 9272 431 414
February 6 11618 10606 534 478
February 7 13603 12360 698 545
February 8 14982 13497 877 608
February 9 16902 15177 1044 681

4.2. Mathematical models

We considered four different deterministic compartmental models for the description of the transmission process. The state variables of the models are the number of individuals with particular characteristics and the notations can be found in Table 2.

Table 2.

State variables of the mathematical models.

Name Description
S Susceptible
E Exposed but not infectious
A Asymptomatic and infectious
I Symptomatic and infectious
R Recovered with previously symptomatic progression
R Recovered with previously asymptomatic progression
D Deceased

The models allow for various processes which result in the transitions of individuals between compartments (see Fig. 3). A description of the rates is provided in Table 3. For the time-dependence of the transmission rates β and ζ,

β(t)=β0g(t)andζ(t)=β0ξ0g(t),

we considered three scenarios:

Table 3.

Rates in the mathematical models.

Name Process Description
β(t) S+IEA (Time-dependent) transmission rate for symptomatic
ζ(t) S+AEA (Time-dependent) transmission rate for asymptomatic
κ EAI Progression rate (= TE1 for incubation time TE)
γ IR Recovery rate for symptomatic case
ν AR Recovery rate for asymptomatic case
δ ID Death rate
ρ RS Rate at which immunity wanes
  • No change: g(t)=1

  • Discrete change: g(t)=1for tTC1Δotherwise

  • Continuous change: g(t)=ekt

The function g(t) describes the reduction of the transmission rates compared to baseline at t=0, with g(0)=1 for all scenarios. The parameters for the discrete change are the time point TC and the relative reduction Δ, and for the continuous change we have the decay rate k. The parameter ξ0=ζ0β0 denotes the relative difference between the transmission rates of the symptomatic individuals (β(t)) and asymptomatic individuals (ζ(t)).

The ODEs governing the dynamics of the different compartmental models are provided in Table 5. We decided to initialize the model on January 9, which is when the virus was first detected (European Centre for Disease Prevention and Control, 2020b), defining t=0. As on January 22 the wearing of masks became mandatory, we set for the scenario of a discrete reduction of β for TC=13 days.

The state variables of the model were linked to the observed case numbers using observation functions. The observation functions for the total number of cases (yT) as well as for the number of infected (yI), recovered (yR) and deceased (yD) people are provided in Table 5. As the reported numbers are subject to unknown measurement noise, we considered two error models:

  • Additive normally distributed measurement noise:
    y¯T,k=yT(tk)+εT,k,εT,kN(0,σT2)
    y¯I,k=yI(tk)+εI,k,εI,kN(0,σI2)
    y¯R,k=yR(tk)+εR,k,εR,kN(0,σR2)
    y¯D,k=yD(tk)+εD,k,εD,kN(0,σD2)
  • Multiplicative log-normally distributed measurement noise:
    y¯T,k=yT(tk)εT,k,εT,klogN(0,σT2)
    y¯I,k=yI(tk)εI,k,εI,klogN(0,σI2)
    y¯R,k=yR(tk)εR,k,εR,klogN(0,σR2)
    y¯D,k=yD(tk)εD,k,εD,klogN(0,σD2)

The observation time points tk, k=1,,32, are the days listed in Table 1, and the measurements (as indicated with the superscript m) are the respective case numbers. The distribution parameters σT, σI, σR and σD were considered as unknown.

In the following the parameters of the transition rates, the total and initial number of people in different compartments, and the parameters of the noise distribution are inferred. A comprehensive list of all model parameters and implemented constraints is provided in Table 4. The boundaries of the search space were chosen very conservatively to indicate the initial knowledge gap about SARS-CoV-2 and COVID-19.

Table 5.

Mathematical models. ODEs for the transmission dynamics, initial conditions and observations functions are defined for all considered model structures. As some models consider only a subset of the state variables, some rows are empty.

SEIRD model (M1) SAIRD model (M2) SAIRRD model (M3) SAIRDS model (M4)
Infection dynamics dSdt=βSIN dSdt=βSINζSAN dSdt=βSINζSAN dSdt=βSINζSAN+ρR
dEdt=βSINκE
dAdt=βSIN+ζSANκA dAdt=βSIN+ζSAN(κ+ν)A dAdt=βSIN+ζSANκA
dIdt=κE(γ+δ)I dIdt=κA(γ+δ)I dIdt=κA(γ+δ)I dIdt=κA(γ+δ)I
dRdt=γI dRdt=γI dRdt=γI dRdt=γIρR
dRdt=νA
dDdt=δI dDdt=δI dDdt=δI dDdt=δI

Initial conditions S(0)=NNENINRND S(0)=NNANINRND S(0)=NNANINRNRND S(0)=NNANINRND
E(0)=NE
A(0)=NA A(0)=NA A(0)=NA
I(0)=NI I(0)=NI I(0)=NI I(0)=NI
R(0)=NR R(0)=NR R(0)=NR R(0)=NR
R(0)=NR
D(0)=ND D(0)=ND D(0)=ND D(0)=ND

Observables yT=I+R+D yT=I+R+D yT=I+R+D yT=I+R+D
yI=I yI=I yI=I yI=I
yR=R yR=R yR=R yR=R
yD=D yD=D yD=D yD=D

4.3. Parameter estimation

To infer the unknown model parameters we used frequentist and Bayesian approaches. These approaches considered the conditional probability p(D|θ) of the data D given the parameters θ, also known as likelihood. For additive normally distributed measurement noise the likelihood function is

p(D|θ)=iIk=13212πσiexp12y¯i,kyi(tk)σi2,

and for multiplicative log-normally distributed measurement it is

p(D|θ)=iIk=13212πσiy¯i,kexp12log(y¯i,k)log(yi(tk))σi2.

The set of considered observables is encoded by I and differs for the scenarios O1 to O3: IO1={T,D}, IO2={I,D}, and IO3={I,R,D}. In addition to the case reports, we also incorporated knowledge available before the parameter estimation. For Bayesian approaches this was done by defining prior distributions. These prior distributions are mostly log-uniform with conservative upper and lower bounds, meaning that the distribution over the log-transformed parameter values is flat. For κ we include in parts of the study information about the incubation period (Backer et al., 2020), given by a log-normal prior:

p(κ)=12πσκκexp12log(κ)log(κˆ)σκ2,withκˆ=(6.4[day])1 and σκ=0.3.

For δ and γ we include in parts of the study information about the mean death and mean recovery time (Zhou et al., 2020), given by a log-normal prior:

p(δ)=12πσδδexp12log(δ)log(δˆ)σδ2,withδˆ=(18.5[day])1 and σδ=0.15,

and

p(γ)=12πσγγexp12log(γ)log(γˆ)σγ2,withγˆ=(22[day])1 and σγ=0.1.

For the frequentist approaches the available estimates of κ, δ and γ are treated as data points.

Remark

For the parameter estimation we consider the log-transformed parameter values. For the log-transformed parameters, the log-uniform prior become effectively a uniform prior. This renders the frequentist and the Bayesian approaches comparable, namely, the maximum likelihood and the maximum a posteriori estimates coincide.

4.3.1. Maximum likelihood and maximum a posteriori estimates

To determine the maximum likelihood and the maximum a posteriori estimates, we minimized the negative log-likelihood function and negative log-posterior function, respectively. As these optimization problems are non-linear and non-convex, we used multi-start local optimization. The starting points for the local optimizations were generated using latin hypercube sampling. Local optimization was performed using the interior point algorithm implemented in the MATLAB function lsqnonlin.m, which exploits the least-squares like structure of the optimization problems. To facilitate convergence, we computed the gradients of the residuals via forward sensitivity equations. The convergence of the global optimization was assessed using waterfall plots.

For each of the 18 considered combinations of compartment model, noise model, observable scenario and intervention scenario, we performed 200 local optimizations. For all combinations at least 15 runs converged to the observed global optimum. This suggested that the results are highly reliable.

4.3.2. Frequentist uncertainty analysis

To evaluate the (frequentist) parameter and prediction confidence intervals we used profile likelihoods (Raue et al., 2009, Kreutz et al., 2013). The profile likelihoods were computed using the MATLAB toolbox Data2Dynamics (Raue et al., 2015). This toolbox implements optimization-based methods with adaptive step-size selection as well as fast integration-based methods (Hass et al., 2016). To ensure the robustness of the results, the consistency of the outcomes was checked and confirmed.

The profile likelihoods were used to derive the (finite sample) confidence intervals and to assess practical identifiability (Raue et al., 2009). For a significance level α, the bounds of the confidence interval were determined as the smallest and largest parameter values for which the profile likelihood stays above the threshold defined by the αth-percentile of the χ2-distribution (Meeker and Escobar, 1995). We used the χ2-distribution with one degree of freedom which yields the so called point-wise confidence intervals. We note that for practically identifiable parameters (as termed (Raue et al., 2009)), the confidence intervals can still be rather wide.

4.3.3. Bayesian uncertainty analysis

To evaluate the (Bayesian) parameter and prediction credibility intervals we used Markov chain Monte Carlo sampling (Ballnus et al., 2017). The parameter posterior distribution was sampled using the Adaptive Metropolis algorithm implemented in the MATLAB toolbox PESTO (Stapor et al., 2018). The methods are self-tuning and provided good convergence properties. Convergence after burn-in was assessed using the Geweke test (Geweke, 1992). The parameter samples were used to generate samples for the model states and observables.

The samples of parameters and predictions were used to derive the credibility intervals. For a credibility level α, the bounds of the credibility interval were determined as the 100α2- and the 100(1α2)-percentile of the respective sample. This procedure yields the so called equal-tailed interval.

4.4. Model selection

We considered competing hypotheses on the dynamics of the infection process, the effect of the intervention and the noise distribution. Each of the resulting models was assessed using the Akaike information criterion (AIC),

AICj=2logp(D|θˆ(j))+2nθ(j),

and the Bayesian information criterion (BIC),

BICj=2logp(D|θˆ(j))+log(nD)nθ(j),

in which j is the model index, θˆ(j) is the maximum likelihood estimate for the jth model, and nθ(j) is the number of parameter of the jth model. The number of independent data points is denoted by nD. The parameter priors are treated as data points, therefore included in nD. AIC and BIC account for the likelihood of the data and penalize model complexity. Low AIC and BIC values are favorable. We consider a difference of 10 between AIC/BIC values of different models as substantial (Burnham and Anderson, 2002).

For analysis and visualization we also evaluated the AIC and BIC weights (Burnham and Anderson, 2002). The AIC weight for the jth model is defined as

wAIC,j=exp{12(AICjAICmin)}jexp{12(AICjAICmin)},withAICminminjAICj,

and provides the weight of evidence in favor of the jth model being the actual best model in terms of the Kullback–Leibler Information (assuming that the true model is in the considered set). The BIC weight for the jth model is defined as

wBIC,j=exp{12(BICjBICmin)}jexp{12(BICjBICmin)},withBICminminjBICj,

and provides an approximation to the Bayesian posterior probability of the jth model. AIC and BIC weights are between 0 and 1, and a high value indicates a strong support.

4.5. Implementation and availability

The model formulation, parameter estimation and profile likelihoods were performed in the MATLAB toolbox Data2Dynamics (https://github.com/Data2Dynamics/d2d) (Raue et al., 2015). Outliers in the computed prediction profiles arising from calculation errors were corrected subsequently. The calculation of parameter confidence intervals and MCMC sampling was carried out using the MATLAB toolboxes PESTO (https://github.com/ICB-DCM/PESTO) (Stapor et al., 2018) and AMICI (https://github.com/ICB-DCM/AMICI) (Fröhlich et al., 2017a, Fröhlich et al., 2017b). For numerical integration Data2Dynamics and AMICI rely on the SUNDIALS solver CVODES (Serban and Hindmarsh, 2005).

The complete implementation (including the respective version of the used toolboxes) and data are available on ZENODO ( https://doi.org/10.5281/zenodo.4457194). This includes the MATLAB code as well as the specification of the parameter estimation problems as PEtab files (Schmiester et al., 2020) (with the model in SBML format Hucka et al. (2003)).

CRediT authorship contribution statement

Elba Raimúndez: Visualization, Formal analysis, Methodology, Software, Writing - original draft, Writing - review & editing. Erika Dudkin: Visualization, Formal analysis, Software, Writing - original draft, Writing - review & editing. Jakob Vanhoefer: Methodology. Emad Alamoudi: Resources. Simon Merkt: Resources, Software. Lara Fuhrmann: Resources. Fan Bai: Conceptualization, Resources. Jan Hasenauer: Conceptualization, Writing - original draft, Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Funding

This work was supported by the European Union’s Horizon 2020 research and innovation program (CanPathPro; Grant no. 686282; E.D., J.H., S.M.), the Federal Ministry of Education and Research of Germany (Grant no. 031L0159C; E.A. & Grant no. 01ZX1705; J.H.), the Federal Ministry of Economic Affairs and Energy, Germany (Grant no. 16KN074236; J.V.), the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC-2047/1 - 390685813 (E.R., F.B., S.M., J.H) and EXC 2151 - 390873048 (E.R., E.D., E.A., J.V., S.M., F.B., J.H.), and the Excellence Strategy of the federal and state governments (E.D.).

Footnotes

Appendix A

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.epidem.2021.100439.

Appendix A. Supplementary data

The following is the Supplementary material related to this article.

MMC S1

This e-component contains all Supplementary Figures.

mmc1.pdf (6MB, pdf)

References

  1. Ahmetolan S., Bilge A.H., Demirci A., Peker-Dobie A., Ergonul O. What can we estimate from fatality and infectious case data using the Susceptible-Infected-Removed (SIR) model? A case study of COVID-19 pandemic. Front. Med. 2020;7 doi: 10.3389/fmed.2020.556366. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Akaike H. 2nd International Symposium on Information Theory, Tsahkadsor, Armenian SSR, Vol. 1. Akademiai Kiado; 1973. Information theory and an extension of the maximum likelihood principle; pp. 267–281. [Google Scholar]
  3. Allen L.J.S. Chapman and Hall/CRC; 2010. An Introduction to Stochastic Processes with Applications to Biology. [Google Scholar]
  4. Amanat F., Krammer F. SARS-CoV-2 vaccines: Status report. Immunity. 2020;52:583–589. doi: 10.1016/j.immuni.2020.03.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Anastassopoulou C., Russo L., Tsakris A., Siettos C. Data-based analysis, modelling and forecasting of the COVID-19 outbreak. PLOS ONE. 2020;15 doi: 10.1371/journal.pone.0230405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Backer J.A., Klinkenberg D., Wallinga J. Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China. Eurosurveillance. 2020;25:20–28. doi: 10.2807/1560-7917.ES.2020.25.5.2000062. 2000062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Ballnus B., et al. Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems. BMC Syst. Biol. 2017;11:63. doi: 10.1186/s12918-017-0433-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Barbarossa M.V., et al. 2020. A first study on the impact of current and future control measures on the spread of COVID-19 in Germany. medRxiv. [DOI] [Google Scholar]
  9. Bayerisches Staatsministerium für Wissenschaft und Kunst. 2020. Pressemitteilung nr 072: Gemeinsam gegen COVID-19: Münchner Tropeninstitut beginnt Stichprobenanalyse zur Verbreitung der Corona-Pandemie und zur Wirksamkeit von Gegenmaßnahmen. https://www.stmwk.bayern.de/pressemitteilung/11894/gemeinsam-gegen-covid-19-muenchner-tropeninstitut-beginnt-stichprobenanalyse-zur-verbreitung-der-corona-pandemie-und-zur-wirksamkeit-von-gegenmassnahmen.html. [Google Scholar]
  10. Berk S.H., Kadyrov S. 2020. Purely data-driven exploration of COVID-19 pandemic after three months of the outbreak. medRxiv. [DOI] [Google Scholar]
  11. Bertozzi A.L., Franco E., Mohler G., Short M.B., Sledge D. The challenges of modeling and forecasting the spread of COVID-19. Proc. Natl. Acad. Sci. USA. 2020;117:16732–16738. doi: 10.1073/pnas.2006520117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Birrell P.J., et al. Bayesian modeling to unmask and predict influenza A/H1N1pdm dynamics in London. Proc. Natl. Acad. Sci. USA. 2011;108:18238–18243. doi: 10.1073/pnas.1103002108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Boldog P., et al. Risk assessment of novel coronavirus COVID-19 outbreaks outside China. J. Clin. Med. 2020;9:571. doi: 10.3390/jcm9020571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Brauer F., Castillo-Chavez C. In: Mathematical Models in Population Biology and Epidemiology, Texts in Applied Mathematics. Brauer F., Castillo-Chavez C., editors. Springer; New York, NY: 2012. Epidemic models; pp. 345–409. [Google Scholar]
  15. Britton T. Stochastic epidemic models: A survey. Math. Biosci. 2010;225:24–35. doi: 10.1016/j.mbs.2010.01.006. [DOI] [PubMed] [Google Scholar]
  16. Brookhart M.A., Hubbard A.E., van der Laan M.J., Colford J.M., Eisenberg J.N.S. Statistical estimation of parameters in a disease transmission model: Analysis of a cryptosporidium outbreak. Stat. Med. 2002;21:3627–3638. doi: 10.1002/sim.1258. [DOI] [PubMed] [Google Scholar]
  17. Bruch E., Atwell J. Agent-based models in empirical social research. Sociol. Methods Res. 2015;44:186–221. doi: 10.1177/0049124113506405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Burnham K.P., Anderson D.R. second ed. Springer; New York, NY: 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. [Google Scholar]
  19. Business Insider, Ashley Collman . 2020. https://www.businessinsider.com/5-million-left-wuhan-before-coronavirus-quarantine-2020-1?r=DE&IR=T. [Google Scholar]
  20. Capasso V. Springer-Verlag; 1993. Mathematical structures of epidemic systems. (Lecture Notes in Biomathematics). [Google Scholar]
  21. Chalub F.A.C.C., Souza M.O. The SIR epidemic model from a PDE point of view. Math. Comput. Modelling. 2011;53:1568–1574. [Google Scholar]
  22. Chatzilena A., van Leeuwen E., Ratmann O., Baguelin M., Demiris N. Contemporary statistical inference for infectious disease models using stan. Epidemics. 2019;29 doi: 10.1016/j.epidem.2019.100367. [DOI] [PubMed] [Google Scholar]
  23. Chen T.-M., et al. A mathematical model for simulating the phase-based transmissibility of a novel coronavirus. Infect Dis. Poverty. 2020;9:24. doi: 10.1186/s40249-020-00640-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Chinese Center for Disease Control and Prevention . 2020. http://www.chinacdc.cn/en/ [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Chowell G. Fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts. Infect Dis. Model. 2017;2:379–398. doi: 10.1016/j.idm.2017.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Chowell G., et al. Using phenomenological models to characterize transmissibility and forecast patterns and final burden of zika epidemics. PLoS Curr. 2016;8 doi: 10.1371/currents.outbreaks.f14b2217c902f453d9320a43a35b9583. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Dargatz C., Georgescu V., Held L. Stochastic modelling of the spatial spread of influenza in Germany. Austrian J. Stat. 2006;35:5–20. [Google Scholar]
  28. Doms C., Kramer S.C., Shaman J. Assessing the use of influenza forecasts and epidemiological modeling in public health decision making in the United States. Sci. Rep. 2018;8:1–7. doi: 10.1038/s41598-018-30378-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Epstein J.M., Axtell R. MIT Press Books; 1996. Growing Artificial Societies: Social Science from the Bottom Up. [Google Scholar]
  30. European Centre for Disease Prevention and Control . 2020. Outbreak of novel coronavirus disease 2019 (COVID-19): increased transmission globally – fifth update. https://www.ecdc.europa.eu/sites/default/files/documents/RRA-outbreak-novel-coronavirus-disease-2019-increase-transmission-globally-COVID-19.pdf. [Google Scholar]
  31. European Centre for Disease Prevention and Control . 2020. Event background COVID-19. https://www.ecdc.europa.eu/en/novel-coronavirus/event-background-2019. [Google Scholar]
  32. Farah M., Birrell P., Conti S., Angelis D.D. Bayesian emulation and calibration of a dynamic epidemic model for A/H1N1 influenza. J. Am. Stat. Assoc. 2014;109:1398–1411. [Google Scholar]
  33. Ferguson N., et al. 2020. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Fröhlich F., Kaltenbacher B., Theis F.J., Hasenauer J. Scalable parameter estimation for genome-scale biochemical reaction networks. PLoS Comput. Biol. 2017;13 doi: 10.1371/journal.pcbi.1005331. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Fröhlich F., Theis F.J., Rädler J.O., Hasenauer J. Parameter estimation for dynamical systems with discrete events and logical operations. Bioinformatics. 2017;33:1049–1056. doi: 10.1093/bioinformatics/btw764. [DOI] [PubMed] [Google Scholar]
  36. Geweke J. In: Bayesian Statistics, Vol. 4. Bernardo J.M., Smith A.F.M., Dawid A.P., Berger J.O., editors. University Press Oxford; 1992. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments; pp. 169–193. [Google Scholar]
  37. Giordano G., et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat. Med. 2020;26:855–860. doi: 10.1038/s41591-020-0883-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Gralinski L.E., Menachery V.D. Return of the coronavirus: 2019-nCoV. Viruses. 2020;12(135) doi: 10.3390/v12020135. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Greenwood P.E., Gordillo L.F. In: Mathematical and Statistical Estimation Approaches in Epidemiology. Chowell G., Hyman J.M., Bettencourt L.M.A., Castillo-Chavez C., editors. Springer Netherlands; Dordrecht: 2009. Stochastic epidemic modeling; pp. 31–52. [Google Scholar]
  40. Gudbjartsson D.F., et al. Humoral immune response to SARS-CoV-2 in Iceland. N. Engl. J. Med. 2020;383:1724–1734. doi: 10.1056/NEJMoa2026116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Halloran M.E., et al. Modeling targeted layered containment of an influenza pandemic in the United States. P. Natl. Acad Sci. 2008;105:4639–4644. doi: 10.1073/pnas.0706849105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Hass H., Kreutz C., Timmer J., Kaschek D. Fast integration-based prediction bands for ordinary differential equation models. Bioinformatics. 2016;32:1204–1210. doi: 10.1093/bioinformatics/btv743. [DOI] [PubMed] [Google Scholar]
  43. He S., Tang S., Rong L. A discrete stochastic model of the COVID-19 outbreak: Forecast and control. Math. Biosci. Eng. 2020;17:2792–2804. doi: 10.3934/mbe.2020153. [DOI] [PubMed] [Google Scholar]
  44. Health Commission of Hubei Province . 2020. http://wjw.hubei.gov.cn/ [Google Scholar]
  45. Hethcote H.W. The mathematics of infectious diseases. SIAM Rev. 2000;42:599–653. [Google Scholar]
  46. Hucka M., et al. The systems biology markup language (SBML): A medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19:524–531. doi: 10.1093/bioinformatics/btg015. [DOI] [PubMed] [Google Scholar]
  47. Isham V. Oxford University Press; 2007. Stochastic Models for Epidemics. [Google Scholar]
  48. Jenny P., Jenny D.F., Gorji H., Arnoldini M., Hardt W.-D. 2020. Dynamic modeling to identify mitigation strategies for covid-19 pandemic. medRxiv. [DOI] [PubMed] [Google Scholar]
  49. Kass R.E., Raftery A.E. Bayes factors. J. Am. Stat. Assoc. 1995;90:773–795. [Google Scholar]
  50. Kermack W.O., McKendrick A.G., Walker G.T. A contribution to the mathematical theory of epidemics. P. Roy. Soc. A-Math Phy. 1927;115:700–721. [Google Scholar]
  51. Khailaie S., et al. 2020. Estimate of the development of the epidemic reproduction number Rt from coronavirus SARS-CoV-2 case data and implications for political measures based on prognostics. medRxiv. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Koo J.R., et al. Interventions to mitigate early spread of SARS-CoV-2 in singapore: a modelling study. Lancet Infect Dis. 2020;20:678–688. doi: 10.1016/S1473-3099(20)30162-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Kreutz C., Raue A., Kaschek D., Timmer J. Profile likelihood in systems biology. FEBS J. 2013;280:2564–2571. doi: 10.1111/febs.12276. [DOI] [PubMed] [Google Scholar]
  54. Lauer S.A., et al. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Ann. Intern. Med. 2020;172:577–582. doi: 10.7326/M20-0504. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Li R., et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2) Science. 2020;368:489–493. doi: 10.1126/science.abb3221. https://science.sciencemag.org/content/368/6490/489. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Li Q., et al. Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. N. Engl. J. Med. 2020;382:1199–1207. doi: 10.1056/NEJMoa2001316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Liu Y., Gayle A.A., Wilder-Smith A., Rocklöv J. The reproductive number of COVID-19 is higher compared to SARS coronavirus. J. Travel Med. 2020;27 doi: 10.1093/jtm/taaa021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Liu Z., Magal P., Seydi O., Webb G. 2020. Predicting the cumulative number of cases for the COVID-19 epidemic in China from early data. medRxiv. [DOI] [PubMed] [Google Scholar]
  59. Long Q.-X., et al. Antibody responses to SARS-CoV-2 in patients with COVID-19. Nat. Med. 2020;26:845–848. doi: 10.1038/s41591-020-0897-1. [DOI] [PubMed] [Google Scholar]
  60. Lotfi E.M., Maziane M., Hattaf K., Yousfi N. Partial differential equations of an epidemic model with spatial diffusion. Int. J. Partial Differ. Equ. 2014;2014 [Google Scholar]
  61. Maier B.F., Brockmann D. Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China. Science. 2020 doi: 10.1126/science.abb4557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Meeker W.Q., Escobar L.A. Teaching about approximate confidence regions based on maximum likelihood estimation. Am. Stat. 1995;49:48–53. [Google Scholar]
  63. Ming W.-K., Huang J., Zhang C.J.P. 2020. Breaking down of the healthcare system: Mathematical modelling for controlling the novel coronavirus (2019-nCoV) outbreak in wuhan, China. bioRxiv. [DOI] [Google Scholar]
  64. Mukandavire Z., et al. Quantifying early COVID-19 outbreak transmission in South Africa and exploring vaccine efficacy scenarios. PLOS ONE. 2020;15 doi: 10.1371/journal.pone.0236003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Neher R.A., Dyrdak R., Druelle V., Hodcroft E.B., Albert J. Potential impact of seasonal forcing on a SARS-CoV-2 pandemic. Swiss Med. Wkly. 2020;150 doi: 10.4414/smw.2020.20224. [DOI] [PubMed] [Google Scholar]
  66. Nishiura H., et al. Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19) Int. J. Infect Dis. 2020;94:154–155. doi: 10.1016/j.ijid.2020.03.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Nordt C., Herdener M. 2020. A pragmatic model to forecast the COVID-19 epidemic in different countries and allowing for daily updates. medRxiv. [DOI] [Google Scholar]
  68. Peng L., Yang W., Zhang D., Zhuge C., Hong L. 2020. Epidemic analysis of COVID-19 in China by dynamical modeling. medRxiv. [DOI] [Google Scholar]
  69. Raue A., et al. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics. 2009;25:1923–1929. doi: 10.1093/bioinformatics/btp358. [DOI] [PubMed] [Google Scholar]
  70. Raue A., et al. Data2Dynamics: A modeling environment tailored to parameter estimation in dynamical systems. Bioinformatics. 2015;31:3558–3560. doi: 10.1093/bioinformatics/btv405. [DOI] [PubMed] [Google Scholar]
  71. Read J.M., Bridgen J.R., Cummings D.A., Ho A., Jewell C.P. 2020. Novel coronavirus 2019-nCoV: Early estimation of epidemiological parameters and epidemic predictions. medRxiv. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Reich N.G., et al. Challenges in real-time prediction of infectious disease: A case study of dengue in Thailand. PLoS Negl. Trop D. 2016;10 doi: 10.1371/journal.pntd.0004761. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Roda W.C., Varughese M.B., Han D., Li M.Y. Why is it difficult to accurately predict the COVID-19 epidemic? Infect Dis. Model. 2020;5:271–281. doi: 10.1016/j.idm.2020.03.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Roosa K., Chowell G. Assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models. Theor. Biol. Med. 2019;16:1. doi: 10.1186/s12976-018-0097-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Roosa K., et al. Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020. Infect Dis. Model. 2020;5:256–263. doi: 10.1016/j.idm.2020.02.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Salim N., et al. 2020. Covid-19 epidemic in Malaysia: Impact of lock-down on infection dynamics. medRxiv. [DOI] [Google Scholar]
  77. Schmiester L., et al. PEtab – Interoperable specification of parameter estimation problems in systems biology. PLoS Comput. Biol. 2020;17:1–10. doi: 10.1371/journal.pcbi.1008646. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Schwarz G. Estimating the dimension of a model. Ann. Stat. 1978;6:461–464. [Google Scholar]
  79. Serban R., Hindmarsh A.C. ASME 2005 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Vol. 6. ASME; 2005. CVODES: The sensitivity-enabled ODE solver in SUNDIALS; pp. 257–269. [Google Scholar]
  80. Shaman J., Yang W., Kandula S. Inference and forecast of the current west african ebola outbreak in guinea, sierra leone and liberia. PLoS Curr. 2014;6 doi: 10.1371/currents.outbreaks.3408774290b1a0f2dd7cae877c8b8ff6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Shao P., Shan Y. 2020. Beware of asymptomatic transmission: Study on 2019-nCoV prevention and control measures based on extended SEIR model. bioRxiv. [DOI] [Google Scholar]
  82. Stapor P., et al. PESTO: Parameter estimation toolbox. Bioinformatics. 2018;34:705–707. doi: 10.1093/bioinformatics/btx676. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Tang B., et al. Estimation of the transmission risk of the 2019-nCoV and its implication for public health interventions. J. Clin. Med. 2020;9:462. doi: 10.3390/jcm9020462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Tedros A.G. 2020. WHO director-general’s opening remarks at the media briefing on COVID-19-11 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020. [Google Scholar]
  85. Tian H. others. An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China. Science. 2020 doi: 10.1126/science.abb6105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Tsay C., Lejarza F., Stadtherr M.A., Baldea M. Modeling state estimation, and optimal control for the US COVID-19 outbreak. Sci. Rep. 2020;10:10711. doi: 10.1038/s41598-020-67459-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Tuncer N., Le T.T. Structural and practical identifiability analysis of outbreak models. Math. Biosci. 2018;299:1–18. doi: 10.1016/j.mbs.2018.02.004. [DOI] [PubMed] [Google Scholar]
  88. Wang H., et al. Phase-adjusted estimation of the number of coronavirus disease 2019 cases in Wuhan, China. Cell Discov. 2020;6:1–8. doi: 10.1038/s41421-020-0148-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Weidemann F., Dehnert M., Koch J., Wichmann O., Höhle M. Bayesian parameter inference for dynamic infectious disease modelling: Rotavirus in Germany. Stat. Med. 2014;33:1580–1599. doi: 10.1002/sim.6041. [DOI] [PubMed] [Google Scholar]
  90. Wölfel R., et al. Virological assessment of hospitalized patients with COVID-2019. Nature. 2020;581:465–469. doi: 10.1038/s41586-020-2196-x. [DOI] [PubMed] [Google Scholar]
  91. World Health Organization . 2020. Novel coronavirus (2019-nCoV) situation report - 30. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200219-sitrep-30-covid-19.pdf?sfvrsn=3346b04f_2. [Google Scholar]
  92. World Health Organization . 2020. Novel coronavirus (2019-nCoV) situation report - 7. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200127-sitrep-7-2019--ncov.pdf. [Google Scholar]
  93. Wu J.T., et al. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat. Med. 2020;26:506–510. doi: 10.1038/s41591-020-0822-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Wuhan Municipal Health Commission . 2020. http://wjw.wuhan.gov.cn/ [Google Scholar]
  95. Yang W., Cowling B.J., Lau E.H.Y., Shaman J. Forecasting influenza epidemics in Hong Kong. PLoS Comput. Biol. 2015;11 doi: 10.1371/journal.pcbi.1004383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Zhao S., et al. Estimating the unreported number of novel coronavirus (2019-nCoV) cases in China in the first half of 2020: a data-driven modelling analysis of the early outbreak. J. Clin. Med. 2020;9:388. doi: 10.3390/jcm9020388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Zhao S., et al. Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak. Int. J. Infect. Dis. 2020;92:214–217. doi: 10.1016/j.ijid.2020.01.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Zhou F., et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in wuhan, China: a retrospective cohort study. Lancet. 2020;395:1054–1062. doi: 10.1016/S0140-6736(20)30566-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

MMC S1

This e-component contains all Supplementary Figures.

mmc1.pdf (6MB, pdf)

Articles from Epidemics are provided here courtesy of Elsevier

RESOURCES