ISA Transactions. 2021 Jan 20;124:157–163. doi: 10.1016/j.isatra.2021.01.029

A compartmental epidemic model incorporating probable cases to model COVID-19 outbreak in regions with limited testing capacity

A Hasan, Y Nasution
PMCID: PMC7817488  PMID: 33487398

Abstract

We propose a new compartmental epidemic model taking into account people who have symptoms with no confirmatory laboratory testing (probable cases). We prove well-posedness of the model and provide an explicit expression for the basic reproduction number (R0). We use the model together with an extended Kalman filter (EKF) to estimate the time-varying effective reproduction number (Rt) of COVID-19 in West Java province in Indonesia and the state of Michigan in the USA, where laboratory testing capacities are limited. Based on our estimation, the value of Rt is higher when the probable cases are taken into account. This correction can be used by decision and policy makers when considering re-opening policy and evaluation of public measures.

Keywords: COVID-19, Probable case, Extended Kalman filter, Reproduction number

1. Introduction

Since the WHO declared it a pandemic in early March 2020, the coronavirus disease (COVID-19) had spread to 216 countries and territories by the end of 2020. By then, the virus had infected more than 80 million people (confirmed) and caused more than 1.8 million deaths. In Indonesia, the number of confirmed infections exceeded 750 thousand, with over 20 thousand deaths, while in the USA the number of confirmed infections exceeded 20 million, with over 350 thousand deaths. The pandemic has attracted researchers from different disciplines to model and forecast the outbreak using different approaches, e.g., [1], [2].

To prevent and control the outbreak, in early April 2020 the Indonesian government began to impose Large-Scale Social Restrictions (LSSR) in several regions with severe outbreaks, such as Jakarta and West Java province. LSSR includes measures such as closing public places, restricting public transport, and limiting travel to and from restricted regions. The regulation is enforced by each local government. Superspreaders dictate COVID-19 transmission in big cities such as Jakarta and Batam [3]. To assess the duration of LSSR, local governments rely on estimates of the time-varying effective reproduction number (Rt) in each region. Government officials use a sequential Bayesian method based on the discrete SIR model presented in [4], [5]. The method requires daily confirmed infection data as its primary input to estimate Rt.

Based on the CDC case definitions, however, there are two classifications of COVID-19 cases, namely probable cases and confirmed cases [6]. A person is classified as a probable case if he or she

  • meets clinical criteria AND epidemiologic evidence with no confirmatory laboratory testing performed for COVID-19,

  • meets presumptive laboratory evidence AND either clinical criteria OR epidemiologic evidence,

  • meets vital records criteria with no confirmatory laboratory testing performed for COVID-19.

Details regarding clinical criteria, epidemiologic evidence, and vital records criteria can be found in [6]. A probable case is quite similar to a Patient under Surveillance (Pasien Dalam Pengawasan or PDP) as defined by the Indonesian Ministry of Health. In many provinces in Indonesia, the number of probable cases far exceeded the number of confirmed cases, especially at the beginning of the pandemic. For example, Fig. 1 shows a much larger number of probable cases than active/confirmed cases in West Java province. Limited laboratory testing and the lag between testing and results are likely causes of this situation. Our aim is to estimate the time-varying effective reproduction number Rt by utilizing probable case data together with confirmed case data. To this end, we propose a Susceptible–Probable–Infectious–Recovered (SPIR) model with four compartments, as shown in Fig. 2.

Fig. 1. Stacked-bar chart of COVID-19 cases in West Java, Indonesia.

Fig. 2. The SPIR model, proposed to account for probable cases in the COVID-19 transmission process.

2. Model description

2.1. Susceptible–probable–infectious–recovered (SPIR) model

The SPIR model consists of four ordinary differential equations (ODEs), describing the evolution of the population in each stage over time, and is given by:

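$$\frac{dS(t)}{dt} = \Lambda(t) - \frac{\beta S(t)\,(I(t)+P(t))}{N} + \varepsilon P(t) - \mu_1 S(t), \tag{1}$$
$$\frac{dP(t)}{dt} = \frac{\beta S(t) P(t)}{N} - (\kappa+\varepsilon+\mu_2) P(t), \tag{2}$$
$$\frac{dI(t)}{dt} = \frac{\beta S(t) I(t)}{N} + \kappa P(t) - (\gamma+\mu_2) I(t), \tag{3}$$
$$\frac{dR(t)}{dt} = \gamma I(t) - \mu_1 R(t). \tag{4}$$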

Here, $S$, $P$, $I$, and $R$ denote the susceptible, probable, active/confirmed, and recovered cases, respectively. The system satisfies

$$S(t)+P(t)+I(t)+R(t)=N, \tag{5}$$

which expresses in mathematical terms the constancy of the population $N$. We assume the natality $\Lambda(t)$ compensates for the deaths from all compartments, i.e.,

$$\Lambda(t)=\mu_1 S(t)+\mu_2 P(t)+\mu_2 I(t)+\mu_1 R(t), \tag{6}$$

where $\mu_1$ and $\mu_2$ denote the death rates. The numbers of deaths from compartments $S$, $P$, $I$, and $R$ are given by $\mu_1 S(t)$, $\mu_2 P(t)$, $\mu_2 I(t)$, and $\mu_1 R(t)$, respectively. In our model, we assume $\mu_2>\mu_1$ since infected individuals die at a faster rate. If $\mu_1=\mu_2=\mu$, then the natality is constant, i.e., $\Lambda=\mu N$. The remaining parameters are the infection rate $\beta$, the negative testing rate $\varepsilon$, the positive testing rate $\kappa$, and the recovery rate $\gamma$. Among these, $\beta$ can be considered the most important parameter, since it quantifies the transmission of the virus. The infection rate $\beta$ can also be defined as the average number of contacts per person per unit time multiplied by the probability of disease transmission per contact. Thus, in practice this parameter is time-varying due to interventions. Note that, to simplify the analysis, the same infection rate $\beta$ is used for compartment $I$ and compartment $P$. However, if a Polymerase Chain Reaction (PCR) test shows a negative result, a person in compartment $P$ can go back to compartment $S$. This is modeled by the negative testing rate $\varepsilon$.
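The following is a minimal Python sketch of the right-hand side of (1)–(4), with the natality taken from (6). It is only an illustration: the function name `spir_rhs`, the initial state, and all numerical parameter values are assumptions chosen for the example and are not taken from the paper (whose published code is in MATLAB). The sketch checks that the four derivatives sum to zero, which is the population constancy stated in (5).

```python
import numpy as np

def spir_rhs(z, beta, eps, kappa, gamma, mu1, mu2):
    """Right-hand side of the SPIR model (1)-(4), with natality from (6)."""
    S, P, I, R = z
    N = S + P + I + R
    births = mu1 * S + mu2 * P + mu2 * I + mu1 * R  # Lambda(t), eq. (6)
    dS = births - beta * S * (I + P) / N + eps * P - mu1 * S
    dP = beta * S * P / N - (kappa + eps + mu2) * P
    dI = beta * S * I / N + kappa * P - (gamma + mu2) * I
    dR = gamma * I - mu1 * R
    return np.array([dS, dP, dI, dR])

# Hypothetical values, for illustration only (not from the paper).
z0 = np.array([9990.0, 6.0, 4.0, 0.0])
params = dict(beta=0.3, eps=0.04, kappa=0.04, gamma=0.08, mu1=1e-5, mu2=0.004)

dz = spir_rhs(z0, **params)
# The derivatives cancel, so S + P + I + R stays equal to N.
print("sum of derivatives:", dz.sum())
```

Because births exactly balance deaths in (6), the printed sum is zero up to floating-point rounding.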

2.2. Well-posedness of the SPIR model

We establish Theorems 1 and 2 to ensure the mathematical and biological well-posedness of the SPIR model (1)–(4). Let us define $z(t)=(S(t),P(t),I(t),R(t))^{\top}$, such that the Initial Value Problem (IVP) of the SPIR model can be written as

$$\frac{dz(t)}{dt}=F(z(t)), \tag{7}$$
$$z(0)=z_0. \tag{8}$$

Theorem 1

For any initial condition $z(0)=z_0\in\mathbb{R}^4$ and $t>0$, there exists a unique continuously differentiable vector-valued function $z(t)$ solving the IVP (7)–(8).

Proof

The Jacobian of (1)–(4) is linear with respect to $S(t)$, $P(t)$, $I(t)$, and $R(t)$. Thus, the system is locally Lipschitz. From the Picard–Lindelöf theorem [7], there exists a unique local solution for the system (1)–(4). To prove global existence of the solution, first let us assume the first-order derivative of $z(t)$ is defined in an interval $(a,b)$. Furthermore, let us denote

$$J=[t_0-a,\;t_0+a], \tag{9}$$
$$B=\{z\in\mathbb{R}^4 \mid \|z-z_0\|\le b\}, \tag{10}$$
$$D=\{(t,z)\in\mathbb{R}\times\mathbb{R}^4 \mid t\in J,\ z\in B\}. \tag{11}$$

The vector-valued function $F$ is continuous with respect to $t$ and $z$ on $J$ and $B$, respectively. Thus, $F$ is Lebesgue measurable. Furthermore, it is clear from (1) that for $\varepsilon,\mu_1>0$, $S(t)$ is a monotonically decreasing function. Subsequently, (2)–(4) are bounded. Thus, there exist positive constants $\omega$ and $\lambda$ such that $\|F(z)\|\le\omega+\lambda\|z\|$. Following Theorem 3.1 and Remark 3.2 in [8], the global solution exists. This concludes the proof.  □

Theorem 2

Let us assume that $z(0)\ge 0$. The function $z(t)$ will remain positive and bounded for all $t\in[0,\infty)$, i.e., the system (1)–(4) is a dynamical system on the compact set

$$K=\{(S,P,I,R)\in\mathbb{R}_+^4 \mid \exists\, c>0 : S+P+I+R<c\}. \tag{12}$$

Proof

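The solution of (2) is given by

$$P(t)=P(0)\exp\left(\int_0^t\left[\frac{\beta S(\tau)}{N}-(\kappa+\varepsilon+\mu_2)\right]d\tau\right). \tag{13}$$

Thus, for $P(0)\ge 0$ we have $P(t)\ge 0$. Furthermore, the solution of (1) is given by

$$S(t)=S(0)\exp\left(-\int_0^t\left[\frac{\beta(I(\tau)+P(\tau))}{N}+\mu_1\right]d\tau\right)+\exp\left(-\int_0^t\left[\frac{\beta(I(\tau)+P(\tau))}{N}+\mu_1\right]d\tau\right)\times\int_0^t\big(\Lambda(s)+\varepsilon P(s)\big)\exp\left(\int_0^s\left[\frac{\beta(I(\tau)+P(\tau))}{N}+\mu_1\right]d\tau\right)ds. \tag{14}$$

Since $\Lambda(t)\ge 0$ and $P(t)\ge 0$, it follows that $S(t)\ge 0$. Using a similar approach, we can prove $I(t),R(t)\ge 0$. Boundedness follows from the fact that $S(t)+P(t)+I(t)+R(t)=N$.  □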

2.3. Basic reproduction number

Setting the left-hand side of (1)–(4) to zero, the Disease-Free Equilibrium (DFE) of the SPIR model is given by $(S,P,I,R)=(\Lambda(t)/\mu_1,0,0,0)$. From (5), we have $S=N$, and from (6), we have $\Lambda(t)=\mu_1 S=\mu_1 N$. Thus, $\Lambda(t)/\mu_1=N$ and the DFE becomes $(S,P,I,R)=(N,0,0,0)$. Following Lemma 1 in [9], we define:

$$F=\begin{pmatrix}\beta & 0\\ 0 & \beta\end{pmatrix}\quad\text{and}\quad V=\begin{pmatrix}\kappa+\varepsilon+\mu_2 & 0\\ -\kappa & \gamma+\mu_2\end{pmatrix}. \tag{15}$$

According to [10], the next generation matrix is defined as $FV^{-1}$ and the basic reproduction number is defined as the spectral radius of the next generation matrix, i.e.,

$$R_0=\max\{\Omega(FV^{-1})\}=\max\left\{\frac{\beta}{\kappa+\varepsilon+\mu_2},\ \frac{\beta}{\gamma+\mu_2}\right\}, \tag{16}$$

where $\Omega$ denotes the eigenvalues of the next generation matrix $FV^{-1}$. Note that, in practice, $\beta=\beta(t)$ due to interventions. Thus, considering that the number of susceptible individuals declines over time [11], the time-varying effective reproduction number is given by:

$$R_t(t)=\frac{S(t)}{S(0)}\max\left\{\frac{\beta(t)}{\kappa+\varepsilon+\mu_2},\ \frac{\beta(t)}{\gamma+\mu_2}\right\}. \tag{17}$$

In the next section, we use an extended Kalman filter (EKF) to estimate the infection rate $\beta(t)$, and thus the time-varying effective reproduction number $R_t(t)$.
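As a numerical cross-check of (15)–(17), the sketch below computes the spectral radius of the next generation matrix with NumPy and compares it with the closed-form maximum in (16). The function names and parameter values are hypothetical, chosen only for illustration; the lower-left entry of V is taken as -κ, following the convention in [9] that V collects transfers out of a compartment minus transfers into it.

```python
import numpy as np

def next_generation_r0(beta, eps, kappa, gamma, mu2):
    """R0 as the spectral radius of F V^{-1}, with F and V as in (15)."""
    F = np.array([[beta, 0.0], [0.0, beta]])
    V = np.array([[kappa + eps + mu2, 0.0], [-kappa, gamma + mu2]])
    ngm = F @ np.linalg.inv(V)
    return float(np.max(np.abs(np.linalg.eigvals(ngm))))

def closed_form_r0(beta, eps, kappa, gamma, mu2):
    """Closed-form expression (16)."""
    return max(beta / (kappa + eps + mu2), beta / (gamma + mu2))

def effective_rt(beta_t, S_t, S0, eps, kappa, gamma, mu2):
    """Time-varying effective reproduction number (17)."""
    return S_t / S0 * closed_form_r0(beta_t, eps, kappa, gamma, mu2)

# Hypothetical parameter values, for illustration only.
p = dict(eps=0.04, kappa=0.04, gamma=0.1, mu2=0.004)
r0_numeric = next_generation_r0(0.3, **p)
r0_closed = closed_form_r0(0.3, **p)
rt_example = effective_rt(0.3, S_t=9000.0, S0=10000.0, **p)
print(r0_numeric, r0_closed, rt_example)
```

Since V is lower triangular, the eigenvalues of the next generation matrix are simply the two ratios appearing in (16), so the numerical and closed-form values should agree.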

3. Estimation of the time-varying effective reproduction number Rt

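Substituting (6) into (1), the SPIR model (1)–(4) can be written as follows:

$$\frac{dS(t)}{dt} = -\frac{\beta(t) S(t)\,(I(t)+P(t))}{N} + (\varepsilon+\mu_2) P(t) + \mu_2 I(t) + \mu_1 R(t), \tag{18}$$
$$\frac{dP(t)}{dt} = \frac{\beta(t) S(t) P(t)}{N} - (\kappa+\varepsilon+\mu_2) P(t), \tag{19}$$
$$\frac{dI(t)}{dt} = \frac{\beta(t) S(t) I(t)}{N} + \kappa P(t) - (\gamma+\mu_2) I(t), \tag{20}$$
$$\frac{dR(t)}{dt} = \gamma I(t) - \mu_1 R(t). \tag{21}$$

In this section, the SPIR model (18)–(21) is discretized using the forward Euler method and, together with the EKF, is used to estimate the time-varying effective reproduction number $R_t$.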

3.1. Discrete-time stochastic augmented SPIR model

Discretizing the SPIR model (18)–(21) using the forward Euler method and augmenting the infection rate $\beta$ as a fifth state variable, we obtain the following discrete-time stochastic augmented SPIR model:

$$S(k+1)=S(k)-\frac{\beta(k)S(k)\,(I(k)+P(k))\,\Delta t}{N}+(\varepsilon+\mu_2)\Delta t\,P(k)+\mu_2\Delta t\,I(k)+\mu_1\Delta t\,R(k)+w_1(k), \tag{22}$$
$$P(k+1)=\frac{\beta(k)S(k)P(k)\,\Delta t}{N}+\big(1-(\kappa+\varepsilon+\mu_2)\Delta t\big)P(k)+w_2(k), \tag{23}$$
$$I(k+1)=\frac{\beta(k)S(k)I(k)\,\Delta t}{N}+\kappa\Delta t\,P(k)+\big(1-(\gamma+\mu_2)\Delta t\big)I(k)+w_3(k), \tag{24}$$
$$R(k+1)=\gamma\Delta t\,I(k)+(1-\mu_1\Delta t)R(k)+w_4(k), \tag{25}$$
$$\beta(k+1)=\beta(k)+w_5(k), \tag{26}$$

where $\Delta t$ denotes the time step. In practice, we set $\Delta t=0.01$. Here, the noise $w(k)=(w_1(k),w_2(k),w_3(k),w_4(k),w_5(k))^{\top}$ is added to model uncertainty and is assumed to be uncorrelated white Gaussian noise. Furthermore, we assume the infection rate $\beta(k)$ is a piecewise constant function with a jump every day. Thus, within one day the value of $\beta(k)$ is constant and (26) is satisfied. Augmenting a parameter as a new state variable is a common technique when estimating a parameter using an EKF (see page 422 in [12]).
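A minimal Python sketch of the noise-free part of (22)–(26) is given below. The function name `spir_step`, the initial state, and the parameter values are illustrative assumptions; only the time step of 0.01 comes from the text. The loop takes 100 substeps, i.e., one day at that step size, and the final check confirms that the forward Euler update preserves S + P + I + R, as the continuous model does in (5).

```python
import numpy as np

def spir_step(x, dt, N, eps, kappa, gamma, mu1, mu2):
    """One forward Euler step of the augmented SPIR model (22)-(26), noise omitted."""
    S, P, I, R, beta = x
    S_next = (S - beta * S * (I + P) * dt / N
              + (eps + mu2) * dt * P + mu2 * dt * I + mu1 * dt * R)
    P_next = beta * S * P * dt / N + (1 - (kappa + eps + mu2) * dt) * P
    I_next = beta * S * I * dt / N + kappa * dt * P + (1 - (gamma + mu2) * dt) * I
    R_next = gamma * dt * I + (1 - mu1 * dt) * R
    return np.array([S_next, P_next, I_next, R_next, beta])  # beta held constant

# Hypothetical values, for illustration only (dt = 0.01 as in the text).
N = 10000.0
p = dict(dt=0.01, N=N, eps=0.04, kappa=0.04, gamma=0.08, mu1=1e-5, mu2=0.004)
x = np.array([N - 10.0, 6.0, 4.0, 0.0, 0.3])  # (S, P, I, R, beta)

for _ in range(100):  # 100 substeps of 0.01 = one day
    x = spir_step(x, **p)

print("population after one day:", x[:4].sum())
```

In a stochastic simulation, the noise terms of (22)–(26) would be added to the returned vector; they are left out here so the conservation check is exact up to rounding.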

3.2. Extended Kalman filter (EKF)

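In this section, we use the EKF to estimate the state variables in (22)–(26) dynamically. To simplify the presentation, we define an augmented state vector

$$x(k+1)=\big(S(k+1),\ P(k+1),\ I(k+1),\ R(k+1),\ \beta(k+1)\big)^{\top}, \tag{27}$$

such that the discrete-time stochastic augmented SPIR model (22)–(26) can be written as follows:

$$x(k+1)=f(x(k))+w(k). \tag{28}$$

Let us denote by $\hat{x}(k)$ the state vector estimated by the EKF. Applying a first-order Taylor series expansion to $f$ at $\hat{x}(k)$, we obtain $f(x(k))\approx f(\hat{x}(k))+J_f(\hat{x}(k))\,(x(k)-\hat{x}(k))$, where $J_f(\hat{x}(k))$ is the Jacobian matrix of $f$, given by

$$J_f(\hat{x}(k))=\begin{pmatrix}
J_{11}(\hat{x}(k)) & J_{12}(\hat{x}(k)) & J_{13}(\hat{x}(k)) & \mu_1\Delta t & J_{15}(\hat{x}(k))\\
J_{21}(\hat{x}(k)) & J_{22}(\hat{x}(k)) & 0 & 0 & J_{25}(\hat{x}(k))\\
J_{31}(\hat{x}(k)) & \kappa\Delta t & J_{33}(\hat{x}(k)) & 0 & J_{35}(\hat{x}(k))\\
0 & 0 & \gamma\Delta t & 1-\mu_1\Delta t & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix}, \tag{29}$$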

where

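$$J_{11}(\hat{x}(k))=1-\frac{\beta(k)\,(I(k)+P(k))\,\Delta t}{N}, \tag{30}$$
$$J_{12}(\hat{x}(k))=-\frac{\beta(k)S(k)\,\Delta t}{N}+(\varepsilon+\mu_2)\Delta t, \tag{31}$$
$$J_{13}(\hat{x}(k))=-\frac{\beta(k)S(k)\,\Delta t}{N}+\mu_2\Delta t, \tag{32}$$
$$J_{15}(\hat{x}(k))=-\frac{S(k)\,(I(k)+P(k))\,\Delta t}{N}, \tag{33}$$
$$J_{21}(\hat{x}(k))=\frac{\beta(k)P(k)\,\Delta t}{N}, \tag{34}$$
$$J_{22}(\hat{x}(k))=\frac{\beta(k)S(k)\,\Delta t}{N}+1-(\kappa+\varepsilon+\mu_2)\Delta t, \tag{35}$$
$$J_{25}(\hat{x}(k))=\frac{S(k)P(k)\,\Delta t}{N}, \tag{36}$$
$$J_{31}(\hat{x}(k))=\frac{\beta(k)I(k)\,\Delta t}{N}, \tag{37}$$
$$J_{33}(\hat{x}(k))=\frac{\beta(k)S(k)\,\Delta t}{N}+1-(\gamma+\mu_2)\Delta t, \tag{38}$$
$$J_{35}(\hat{x}(k))=\frac{S(k)I(k)\,\Delta t}{N}. \tag{39}$$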
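The Jacobian entries (29)–(39) can be checked against a finite-difference approximation of the noise-free map in (22)–(26). The sketch below does this in Python, treating N as a fixed constant as in the text. The names `spir_map`, `jacobian`, and `numerical_jacobian`, as well as all numerical values, are assumptions made for this illustration.

```python
import numpy as np

def spir_map(x, dt, N, eps, kappa, gamma, mu1, mu2):
    """Noise-free map f(x) of (22)-(26)."""
    S, P, I, R, b = x
    return np.array([
        S - b * S * (I + P) * dt / N + (eps + mu2) * dt * P + mu2 * dt * I + mu1 * dt * R,
        b * S * P * dt / N + (1 - (kappa + eps + mu2) * dt) * P,
        b * S * I * dt / N + kappa * dt * P + (1 - (gamma + mu2) * dt) * I,
        gamma * dt * I + (1 - mu1 * dt) * R,
        b,
    ])

def jacobian(x, dt, N, eps, kappa, gamma, mu1, mu2):
    """Analytical Jacobian with entries (30)-(39), laid out as in (29)."""
    S, P, I, R, b = x
    return np.array([
        [1 - b * (I + P) * dt / N, -b * S * dt / N + (eps + mu2) * dt,
         -b * S * dt / N + mu2 * dt, mu1 * dt, -S * (I + P) * dt / N],
        [b * P * dt / N, b * S * dt / N + 1 - (kappa + eps + mu2) * dt,
         0.0, 0.0, S * P * dt / N],
        [b * I * dt / N, kappa * dt,
         b * S * dt / N + 1 - (gamma + mu2) * dt, 0.0, S * I * dt / N],
        [0.0, 0.0, gamma * dt, 1 - mu1 * dt, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ])

def numerical_jacobian(x, p, h=1e-4):
    """Central-difference approximation of the Jacobian of spir_map."""
    J = np.zeros((5, 5))
    for j in range(5):
        e = np.zeros(5)
        e[j] = h
        J[:, j] = (spir_map(x + e, **p) - spir_map(x - e, **p)) / (2 * h)
    return J

# Hypothetical values, for illustration only.
p = dict(dt=0.01, N=10000.0, eps=0.04, kappa=0.04, gamma=0.08, mu1=1e-5, mu2=0.004)
x = np.array([9990.0, 6.0, 4.0, 0.0, 0.3])
J_analytic = jacobian(x, **p)
J_numeric = numerical_jacobian(x, p)
print("max abs difference:", np.max(np.abs(J_analytic - J_numeric)))
```

Because the map is linear in each state variable taken separately, the central difference is exact up to rounding, which makes this a convenient sanity check on the signs in (30)–(39).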

The EKF has two main tuning parameters: the process covariance matrix $Q_F$ and the observation covariance matrix $R_F$. Detailed procedures for implementing this method can be found in [13]. Note that the main purpose of the EKF here is real-time data fitting. Thus, the tuning parameters are chosen such that the Relative Root Mean Square Error (RRMSE) between the confirmed data and the estimated data is sufficiently small. The RRMSE for each variable $X$ is defined as

$$\mathrm{RRMSE}_X=\frac{1}{N_d}\sum_{i=1}^{N_d}\frac{\|X_i-\hat{X}_i\|_2^2}{\|X_i\|_2^2}, \tag{40}$$

where $N_d$ is the number of days observed. Here, $X_i\in\{S(i),P(i),I(i),R(i)\}$ and $\hat{X}_i\in\{\hat{S}(i),\hat{P}(i),\hat{I}(i),\hat{R}(i)\}$ denote the confirmed data and the estimated data, respectively.
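For readers unfamiliar with the EKF, the sketch below shows one generic predict/update cycle together with an RRMSE function following (40) as written above. It is a textbook-style illustration and not the authors' implementation (their procedure is described in [13]). In particular, the linear measurement model y(k) = H x(k) + v(k), the function names, and the toy inputs are assumptions introduced here.

```python
import numpy as np

def ekf_step(x_hat, P_cov, y, f, jac, H, Q, R):
    """One EKF cycle for x(k+1) = f(x(k)) + w(k), y(k) = H x(k) + v(k) (assumed)."""
    # Prediction: propagate the estimate and its covariance through the model.
    F = jac(x_hat)
    x_pred = f(x_hat)
    P_pred = F @ P_cov @ F.T + Q
    # Update: correct the prediction with the new measurement y.
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new

def rrmse(X, X_hat):
    """Relative error (40) for scalar daily data: mean of squared relative errors."""
    X = np.asarray(X, dtype=float)
    X_hat = np.asarray(X_hat, dtype=float)
    return float(np.mean((X - X_hat) ** 2 / X ** 2))

# Toy example: identity dynamics, only the first state is measured.
x1, P1 = ekf_step(
    x_hat=np.zeros(2), P_cov=np.eye(2), y=np.array([2.0]),
    f=lambda x: x, jac=lambda x: np.eye(2),
    H=np.array([[1.0, 0.0]]), Q=np.zeros((2, 2)), R=np.eye(1),
)
err = rrmse([2.0, 4.0], [1.0, 4.0])
print(x1, P1, err)
```

For the augmented SPIR model, `f` and `jac` would be the map (22)–(26) and the Jacobian (29), and `Q` and `R` would play the roles of the tuning matrices described above. With equal prior and measurement variance, the toy update moves the first state halfway toward the measurement and leaves the unmeasured state unchanged.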

4. Case studies

We use the discrete-time stochastic augmented SPIR model together with the EKF to estimate the time-varying effective reproduction number Rt in West Java province in Indonesia and the state of Michigan in the USA. All data sets and MATLAB code are available on GitHub (https://github.com/agusisma/covidPDP). For comparison, we also estimate Rt without considering probable cases, using the SIRD model presented in [13].

To simplify the model, the parameters are assumed to be constant and are obtained from clinical information, such as the case-fatality rate (CFR), the infectious time $T_i$, and the life expectancy $T_l$. If an individual is infectious for an average time period $T_i$, then $\gamma+\mu_2=1/T_i$. Since $\mu_2$ is the death rate due to the disease, $\mu_2$ is the fatality fraction of $1/T_i$, i.e., $\mu_2=\mathrm{CFR}/T_i$. On the other hand, since $\gamma$ is the recovery fraction of $1/T_i$, we have $\gamma=(1-\mathrm{CFR})/T_i$. Furthermore, since $\mu_1$ is the natural death rate relative to the COVID-19 death rate, we assume $\mu_1=\mathrm{CFR}/T_l$, where $T_l$ is the life expectancy. The total rate out of compartment $P$ is $1/T_i$, i.e., $\varepsilon+\kappa+\mu_2=1/T_i$. Since $\mu_2=\mathrm{CFR}/T_i$, the positive and negative testing rates satisfy $\varepsilon+\kappa=(1-\mathrm{CFR})/T_i$. If $c_p$ denotes the fraction of people in compartment $P$ who test positive, then $\kappa=c_p(1-\mathrm{CFR})/T_i$ and $\varepsilon=(1-c_p)(1-\mathrm{CFR})/T_i$.

The data used in this simulation are provided in Table 1. The tuning parameters for the Kalman filter are chosen as $Q_F=\mathrm{diag}(10,\ 10,\ 10,\ 5,\ 0.2)$ and $R_F=\mathrm{diag}(100,\ 10,\ 5,\ 1)$, respectively, such that (40) is minimized.

Table 1. Data used to calculate model parameters.

Region      N (million)   CFR      Ti (days)   Tl (days)
West Java   48            0.0425   12 ± 3      72 × 360
Michigan    10            0.0093   12 ± 3      78.3 × 360
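The parameter relations described above can be evaluated directly from the Table 1 entries. The Python sketch below does so for both regions, using the central value of 12 days for the infectious time. The value of the positive-test fraction c_p is not reported in this section, so the 0.5 used here is a purely hypothetical placeholder, and the function name `spir_parameters` is likewise an assumption.

```python
def spir_parameters(cfr, t_i, t_l, c_p):
    """Model rates from CFR, infectious time t_i, life expectancy t_l (both in days),
    and the fraction c_p of probable cases that test positive."""
    return {
        "mu2": cfr / t_i,                      # disease-induced death rate
        "gamma": (1 - cfr) / t_i,              # recovery rate
        "mu1": cfr / t_l,                      # natural death rate, as assumed in the text
        "kappa": c_p * (1 - cfr) / t_i,        # positive testing rate
        "eps": (1 - c_p) * (1 - cfr) / t_i,    # negative testing rate
    }

# CFR, Ti, and Tl from Table 1; c_p = 0.5 is a hypothetical placeholder.
west_java = spir_parameters(cfr=0.0425, t_i=12, t_l=72 * 360, c_p=0.5)
michigan = spir_parameters(cfr=0.0093, t_i=12, t_l=78.3 * 360, c_p=0.5)

for name, rates in (("West Java", west_java), ("Michigan", michigan)):
    print(name, {k: round(v, 6) for k, v in rates.items()})
```

By construction, both γ + μ2 and ε + κ + μ2 equal 1/Ti, so under this parametrization the two ratios inside the maximum in (16) coincide.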

4.1. West Java province, Indonesia

Fig. 3 shows the results of dynamic data fitting using the EKF for West Java province in Indonesia. As can be seen from this figure, the discrete-time stochastic augmented SPIR model together with the EKF is able to model the transmission of COVID-19 accurately. The RRMSE values obtained with the chosen tuning parameters QF and RF are given in Table 2; the errors are reasonably small. The estimated Rt is shown in Fig. 4. Here, the value of Rt is higher when the probable cases are taken into account. The difference between the estimated Rt with and without probable cases becomes smaller once the rate of probable cases decreases.

Fig. 3. Real-time fitting of COVID-19 cases in West Java using EKF.

Table 2. RRMSE between confirmed data and estimated data using EKF (RRMSE per compartment X and total).

Region      S         P         I         R         Total
West Java   4.2e−16   6.6e−04   1.3e−06   2.2e−06   6.7e−04
Michigan    5.7e−12   1.0e−04   2.5e−01   2.6e−03   2.6e−01

Fig. 4. Estimation of Rt in West Java with and without probable cases. The estimate of Rt without probable cases is obtained using the SIRD model [13].

On May 6, the government of West Java province enacted a Large-Scale Social Restriction (LSSR). Control measures included: (i) closure of schools and workplaces, (ii) restrictions on religious activities, activities in public places, and social and cultural activities, and (iii) restrictions on modes of transportation. The decision was based on the evaluation of Rt; according to WHO, such restrictions are needed if Rt>1 and can be relaxed if Rt<1 for fourteen consecutive days. Since government officials estimated Rt without considering probable cases, the restriction was relaxed on May 20, when Rt had been below 1 for two weeks. Taking probable cases into account, our estimation shows that during the LSSR period the estimated value of Rt was still above 1. Thus, the LSSR needed to be extended rather than relaxed. The effect of the LSSR, where Rt<1, can be seen only after two weeks (May 24) since there are delays in infection confirmation.

The forecast of the total number of cases, with a 95% confidence interval, is presented in Fig. 5 by the dashed blue line, whereas the confirmed data are presented by the dashed red line. The forecast assumes the current measures are continued for the next 30 days. The figure shows that the forecast is slightly higher than the confirmed data, but the confirmed data still lie inside the confidence interval. This could mean that the government tightened the measures, that the number of tests decreased, that transmission was successfully slowed down, or a combination of these.

Fig. 5. Forecast of the total number of cases in West Java. The dashed red line is the actual (confirmed) data, while the dashed blue line is the forecast.

4.2. Michigan, USA

Fig. 6 shows the data fitting results for COVID-19 cases in Michigan using the EKF. It can be observed that the EKF estimates the confirmed cases accurately. As can be seen from Table 2, the RRMSE is fairly small. A stay-at-home order was issued on March 24 by the government of Michigan. The order included measures such as: (i) temporarily suspending in-person services that are not necessary for businesses and operations, and (ii) asking residents to remain at home, not to engage in outdoor activity, and to practice social distancing. The order lasted until May 15. The government started in-house testing for COVID-19 on March 16, with the capability to deliver same-day results. As a result, the number of daily probable cases is relatively low compared to the number of daily confirmed cases, as can be seen from Fig. 7. Furthermore, as expected, the estimates of Rt with and without probable cases are fairly similar. The stay-at-home order proved effective in reducing transmission, as indicated by Rt<1 two weeks after it was put into effect. However, the number of daily cases increased once the order was repealed.

Fig. 6. Real-time fitting of COVID-19 cases in Michigan using EKF.

Fig. 7. Estimation of Rt in Michigan with and without probable cases.

Fig. 8 shows the forecast of the total number of cases in Michigan over 30 days. To assess the quality of the forecast, we compare the 30-day forecast made using data up to 11 June with the confirmed data from 12 June until 12 July. The forecast is slightly lower than the confirmed data, but the confirmed data still lie inside the 95% confidence interval. This could mean that the control measures were relaxed. In this case, the government should tighten the measures to control transmission.

Fig. 8. Forecast of the total number of cases in Michigan. The dashed red line is the actual (confirmed) data, while the dashed blue line is the forecast.

5. Conclusion

A new compartmental epidemic model taking probable cases into account has been presented in this paper. The model, called the SPIR model, consists of four compartments: Susceptible (S), Probable (P), Active (I), and Recovered (R). The model is used to estimate the time-varying effective reproduction number Rt in West Java province in Indonesia and the state of Michigan in the USA. To this end, we apply an EKF to a discrete-time stochastic augmented SPIR model. Numerical simulations using data from West Java province show that when probable cases are taken into account, the value of Rt is significantly higher. However, if the number of probable cases is not significant, as exemplified by the state of Michigan, the value of Rt is similar to that obtained with regular methods. These results can be used to inform policy makers when deciding whether to loosen or tighten the measures. In general, our model and approach can be used in regions with limited testing capacity.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1. Giordano G., Blanchini F., Bruno R., Colaneri P., Filippo A.D., Matteo A.D., et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat Med. 2020;26:855–860. doi: 10.1038/s41591-020-0883-7.
  • 2. Singhal A., Singh P., Lall B., Joshi S. Modeling and prediction of COVID-19 pandemic using Gaussian mixture model. Chaos Solitons Fractals. 2020;138:1–8. doi: 10.1016/j.chaos.2020.110023.
  • 3. Hasan A., Susanto H., Kasim M., Nuraini N., Lestari B., Triany D., et al. Superspreading in early transmissions of COVID-19 in Indonesia. Sci Rep. 2020;10:1–4. doi: 10.1038/s41598-020-79352-5.
  • 4. Bettencourt L., Ribeiro R. Real time Bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS ONE. 2008;3:1–9. doi: 10.1371/journal.pone.0002185.
  • 5. Cori A., Ferguson N., Fraser C., Cauchemez S. A new framework and software to estimate time-varying reproduction numbers during epidemics. Am J Epidemiol. 2013;178:1505–1512. doi: 10.1093/aje/kwt133.
  • 6. Center for Disease Control and Prevention. Coronavirus Disease 2019 (COVID-19) 2020 Interim Case Definition. https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/.
  • 7. Bartle R., Sherbert D. Introduction to real analysis. 4th ed. Wiley; 2011.
  • 8. Lin W. Global existence theory and chaos control of fractional differential equations. J Math Anal Appl. 2007;332:709–726. doi: 10.1016/j.jmaa.2006.10.040.
  • 9. van den Driessche P., Watmough J. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math Biosci. 2002;180:29–48. doi: 10.1016/S0025-5564(02)00108-6.
  • 10. Diekmann O., Heesterbeek J., Metz J. On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations. J Math Biol. 1990;28:365–382. doi: 10.1007/BF00178324.
  • 11. Brauer F., Castillo-Chavez C., Feng Z. Mathematical models in epidemiology. 1st ed. Springer-Verlag; 2019.
  • 12. Simon D. Optimal state estimation: Kalman, H infinity, and nonlinear approaches. 1st ed. Wiley-Interscience; 2006.
  • 13. Hasan A., Susanto H., Tjahjono V., Kusdiantara R., Putri E., Hadisoemarto P., et al. A new estimation method for COVID-19 time-varying reproduction number using active cases. MedRxiv (preprint). 2020.

Articles from ISA Transactions are provided here courtesy of Elsevier
