Journal of Applied Statistics. 2021 Oct 8;50(2):231–246. doi: 10.1080/02664763.2021.1985091

Monitoring SEIRD model parameters using MEWMA for the COVID-19 pandemic with application to the state of Qatar

Edward L Boone (corresponding author), Abdel-Salam G Abdel-Salam, Indranil Sahoo, Ryad Ghanam, Xi Chen, Aiman Hanif
PMCID: PMC9870005  PMID: 36698549

ABSTRACT

During the current COVID-19 pandemic, decision-makers are tasked with implementing and evaluating strategies for both treatment and disease prevention. To make effective decisions, they need to simultaneously monitor several attributes of the pandemic: the transmission rate and infection rate for disease prevention, the recovery rate, which indicates treatment effectiveness, and the mortality rate, among others. This work presents a technique for monitoring the pandemic by employing a Susceptible, Exposed, Infected, Recovered, Death (SEIRD) model regularly estimated by an augmented particle Markov chain Monte Carlo scheme in which the posterior distribution samples are monitored via Multivariate Exponentially Weighted Moving Average (MEWMA) process monitoring. This is illustrated on the COVID-19 data for the State of Qatar.

KEYWORDS: Epidemiology, augmented particle Markov chain Monte Carlo, Multivariate exponentially weighted moving average, process monitoring, COVID-19

1. Introduction

Coronavirus Disease 2019 (COVID-19) [22,27] is a severe pandemic affecting the whole world with a fast spreading regime, requiring strict precautions to keep it under control. As there are limited cures and targeted treatments at the moment, establishing those precautions becomes inevitable. These precautions [10] include social distancing, closure of businesses/schools and travel prohibitions [4].

The coronavirus responsible for COVID-19 is a new human Betacoronavirus that uses a densely glycosylated spike protein to penetrate host cells. It is classified within Nidovirales, viruses that use a nested set of mRNAs to replicate, and further falls under a subfamily comprising the alpha, beta, gamma and delta CoVs. The virus that causes COVID-19 belongs to the Betacoronavirus 2B lineage and has a close relationship with SARS species. It is considered a novel virus since the monoclonal antibodies do not exhibit a high degree of binding to SARS-CoV-2. Replication of the viral RNA occurs when RNA polymerase binds and re-attaches to multiple locations [8,18].

Cases of COVID-19 started in December 2019, when an unusual condition was reported among a cluster of patients in Wuhan, China. Within a few weeks, the virus had spread to different parts of the world. On 20 January 2020, the first case of COVID-19 was recorded in the USA; Italy reported its first confirmed case on 31 January 2020. With COVID-19 cases rising across the world, governments soon intervened in the financial and healthcare sectors. In late January 2020, the first US travel restrictions were imposed on travel from China. Weeks later, additional travel bans were imposed on countries in Europe and the United Kingdom. The World Health Organization (WHO) declared COVID-19 a pandemic on 11 March 2020, with a total of more than 100,000 cases globally. As of 26 January 2021, the worldwide total number of confirmed COVID-19 cases exceeds 100 million, with over 2.15 million deaths. The virus has a global mortality rate of 3.4%, which makes it more severe than the flu. Elderly people with pre-existing illnesses succumb to COVID-19 more often. People with only mild symptoms recover within 3–7 days, while those with conditions such as pneumonia or other severe disease take weeks to recover. As of 26 January 2021, the recovery percentage of patients in China, for example, stands at 95%. The global recovery rate for COVID-19 is around 97% [26].

The main efforts in the literature are focused on model estimation and forecasting the dynamic nature of the COVID-19 pandemic, as opposed to monitoring the process. The Susceptible, Exposed, Infected, Recovered, Death (SEIRD) model is a common compartmental model for the spread of disease through a population, and variants of it have been used in modeling the COVID-19 pandemic, such as [15] for Italy, [2] for India and [9] for the State of Qatar. For more on disease modeling in general, see [6,13,17,24].

Other modeling approaches include a time-series model to analyze the outbreak of the pandemic [7] and a time-varying Bayesian semi-parametric model to look at short-term projections of the pandemic [23]. Han et al. [12] study the dynamic pattern of COVID-19 deaths over time. The impact of government intervention such as a lockdown has been studied for China in [25] and for India in [2,3]. Monitoring of the mathematical process has been done for Ukraine's COVID-19 outbreak [14]; however, the parameter estimates are not generated directly from the data and no clear monitoring scheme is presented. As the vaccine is beginning to be distributed to the public, monitoring the pandemic becomes more important to decision-makers, as it lets them determine how the pandemic is progressing and detect any shocks to the system that may be problematic. The monitoring approach needs to be able to react quickly to any large shifts in the system.

The goal of this work is to develop a method of monitoring a pandemic using a base mathematical model, such as SEIRD, that can be quickly updated as soon as new information comes in and can ‘signal’ if there is a change in the parameters of the mathematical model. The literature does not seem to provide any approaches that meet this goal. The big challenge in this problem is updating the parameters in an automated way and converting those parameters to something that can be monitored. Here a Bayesian approach is taken for the parameter estimation via a sampling algorithm that will allow for quick updating, avoid particle depletion and from which the samples can be monitored using a standard process control regime.

This work is organized in the following manner. Section 2 introduces the SEIRD model specific to the State of Qatar, the mean model and the likelihood used. Section 3 introduces the Reproduction number and illustrates its deficiency in this case. This is followed by the Markov chain Monte Carlo sampling algorithm with particle augmentation to update the parameters at each time step in Section 4. The Multivariate Exponentially Weighted Moving Average (MEWMA) monitoring approach is presented in Section 5. The method is illustrated on real data from the State of Qatar in Section 6, which shows how the monitoring can be employed to identify critical changes in the pandemic. Finally, Section 7 provides a discussion of the method and provides some suggestions for implementation as well as possible areas for improvement.

2. SEIRD model

Let t be a time index that is the number of days from the first recorded case of COVID-19 in the population of interest, S(t) be the number of subjects Susceptible at time t, E(t) be the number of subjects Exposed at time t, I(t) be the number of Infected (symptomatic) subjects at time t, R(t) be the cumulative number of Recovered subjects at time t and D(t) be the cumulative number of subject Deaths at time t. This can be modeled with the following system of ordinary differential equations:

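$$
\begin{aligned}
\frac{dS(t)}{dt} &= -\alpha S(t)E(t), \\
\frac{dE(t)}{dt} &= \alpha S(t)E(t) - \beta E(t) - \gamma E(t), \\
\frac{dI(t)}{dt} &= \beta E(t) - \gamma I(t) - \eta I(t), \\
\frac{dR(t)}{dt} &= \gamma I(t), \\
\frac{dD(t)}{dt} &= \eta I(t),
\end{aligned} \tag{1}
$$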

where the parameters are explained as follows: α is the transmission rate (per day × individual²) from susceptible to exposed, β is the rate (per day) at which exposed become infected (symptomatic), γ is the rate (per day) at which infected become recovered and η is the mortality rate (per day) for those infected. Notice that this model formulation makes several key assumptions. Immigration, emigration, natural mortality and births are negligible over the time frame and hence are not in the model. Once a person is in the infected group, they are quarantined and hence do not mix with the Susceptible population. The recovered and deaths compartments are for those who first are infected. Those who are exposed and are asymptomatic recover at the same rate γ as those who become sick and recover. The SEIRD model presented here is the same as the one presented in [9] and matches the assumptions needed for the example provided in Section 6. However, the estimation and monitoring method is not specific to this particular model.
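To make the dynamics of model (1) concrete, the following is a minimal sketch that integrates the five equations with a classical fourth-order Runge-Kutta step in plain Python. It is not the authors' implementation (which was written in MATLAB). The initial values are those reported for Qatar in Section 6.3, while the four rate values and the helper names `seird_rhs`, `rk4_step` and `simulate` are hypothetical choices made only for illustration.

```python
def seird_rhs(state, alpha, beta, gamma, eta):
    """Right-hand side of model (1); state = (S, E, I, R, D)."""
    S, E, I, R, D = state
    new_exposed = alpha * S * E
    return (
        -new_exposed,                        # dS/dt
        new_exposed - beta * E - gamma * E,  # dE/dt
        beta * E - gamma * I - eta * I,      # dI/dt
        gamma * I,                           # dR/dt
        eta * I,                             # dD/dt
    )


def rk4_step(state, dt, params):
    """Advance the state by one Runge-Kutta (RK4) step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = seird_rhs(state, *params)
    k2 = seird_rhs(shifted(state, k1, dt / 2), *params)
    k3 = seird_rhs(shifted(state, k2, dt / 2), *params)
    k4 = seird_rhs(shifted(state, k3, dt), *params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, params, days, steps_per_day=10):
    """Return the state at day 0, 1, ..., days."""
    dt = 1.0 / steps_per_day
    path = [state0]
    state = state0
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, params)
        path.append(state)
    return path


# Initial values from Section 6.3; the rates below are illustrative guesses,
# not estimates from the paper.
state0 = (2_782_000.0, 3.0, 1.0, 0.0, 0.0)
params = (1e-7, 0.05, 1 / 14, 1e-4)  # alpha, beta, gamma, eta (hypothetical)
path = simulate(state0, params, days=30)
```

One consequence of the equations as written is that exposed individuals who recover without symptoms leave E at rate γ but are not added to R, so the five compartments do not sum to a constant over time.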

Due to the dynamic nature of how the pandemic has developed, assuming the system is in ‘steady state’ is invalid: governments have intervened in the system in an effort to influence various parameters, and medical treatment of the disease has changed across the time frame. Hence the parameters are also functions of time, denoted as α(t), β(t), γ(t) and η(t). Specifically,

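$$
\begin{aligned}
\frac{d\lambda_S(t)}{dt} &= -\alpha(t)\lambda_S(t)\lambda_E(t), \\
\frac{d\lambda_E(t)}{dt} &= \alpha(t)\lambda_S(t)\lambda_E(t) - \beta(t)\lambda_E(t) - \gamma(t)\lambda_E(t), \\
\frac{d\lambda_I(t)}{dt} &= \beta(t)\lambda_E(t) - \gamma(t)\lambda_I(t) - \eta(t)\lambda_I(t), \\
\frac{d\lambda_R(t)}{dt} &= \gamma(t)\lambda_I(t), \\
\frac{d\lambda_D(t)}{dt} &= \eta(t)\lambda_I(t),
\end{aligned} \tag{2}
$$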

where $\lambda_S(t)$, $\lambda_E(t)$, $\lambda_I(t)$, $\lambda_R(t)$ and $\lambda_D(t)$ denote the respective mean parameters.

At each time point t, the parameters must have a prior distribution. For this work, the prior distribution specification will be the same for all t, however, this is not necessary if one has information that needs to be included at a specific time.

Since a Bayesian methodology is being employed the likelihood is specified to be:

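$$
\begin{aligned}
I(t) &\sim \mathrm{Poisson}(\lambda_I(t)), \\
R(t) &\sim \mathrm{Poisson}(\lambda_R(t)), \\
D(t) &\sim \mathrm{Poisson}(\lambda_D(t)).
\end{aligned} \tag{3}
$$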

Notice that S(t) and E(t) are not in the likelihood, as they are latent states that are not directly observed. There is no easy way to observe the number of exposed subjects: many people in the population may be exposed to and carry the virus while remaining asymptomatic, and hence are not readily observed. Since asymptomatic exposed people are visually indistinguishable from those who are susceptible, directly observing the susceptibles is also difficult. The government did not collect these data and hence they are unobserved (latent). The true likelihood for {S(t),E(t),I(t),R(t),D(t)} should be Multinomial. However, with two latent states, one of which is the largest state, the Multinomial approach is challenging to apply; thus this work uses a Poisson likelihood as an approximation.
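As an illustration of how the Poisson likelihood in (3) could be evaluated for one day of data, here is a minimal sketch in plain Python. The function names `poisson_logpmf` and `seird_loglik`, and the counts and means used at the bottom, are hypothetical and are not taken from the paper.

```python
import math


def poisson_logpmf(k, lam):
    """log P(K = k) for K ~ Poisson(lam); lgamma(k + 1) equals log(k!)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def seird_loglik(observed, means):
    """Sum the Poisson log-likelihoods over the observed states I, R and D.

    S and E are latent, so they contribute no term.
    """
    return sum(poisson_logpmf(observed[c], means[c]) for c in ("I", "R", "D"))


observed = {"I": 120, "R": 15, "D": 1}     # hypothetical counts on one day
means = {"I": 110.0, "R": 14.2, "D": 0.8}  # hypothetical model means
ll = seird_loglik(observed, means)
```

Working on the log scale with `math.lgamma` avoids forming large factorials directly. The sketch assumes every mean is strictly positive; a mean of exactly zero would need separate handling.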

3. The basic reproduction number R0

The basic reproduction number, R0, is defined as the expected number of secondary cases produced by a single infection in a completely susceptible population. It is dimensionless and can be calculated as a product of the transmissibility, the average rate of contact between susceptible and infected individuals, and the duration of the infectiousness [5]. In model (1), the last two equations do not contribute to R0 and so

$$R_0 = \frac{\alpha}{\beta + \gamma}.$$

Since our model is time varying, it follows that

$$R_0(t) = \frac{\alpha(t)}{\beta(t) + \gamma(t)}.$$

In this work, the Reproduction number is not considered since it does not account for all the parameters in the model. One of the goals of this work is to ensure that the monitoring process accounts for all of the parameters.
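The point can be seen with a small numerical sketch, assuming the reproduction number takes the form α/(β+γ) as in the expression above. The two parameter sets below are hypothetical and differ only in the mortality rate η.

```python
def r0(alpha, beta, gamma):
    """Reproduction number in the assumed form alpha / (beta + gamma)."""
    return alpha / (beta + gamma)


# Two hypothetical parameter sets that differ only in eta.
theta_low_mortality = {"alpha": 0.3, "beta": 0.05, "gamma": 0.07, "eta": 0.0001}
theta_high_mortality = {"alpha": 0.3, "beta": 0.05, "gamma": 0.07, "eta": 0.01}

r0_low = r0(theta_low_mortality["alpha"], theta_low_mortality["beta"],
            theta_low_mortality["gamma"])
r0_high = r0(theta_high_mortality["alpha"], theta_high_mortality["beta"],
             theta_high_mortality["gamma"])
```

Both calls return the same value even though η differs by a factor of 100, so a shift in mortality alone would be invisible to a chart built on the reproduction number.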

4. Sequential sampling with particle augmentation

From a monitoring perspective, we need to estimate the parameters, θt={α(t),β(t),γ(t),η(t)}, at each time step, dependent primarily on the latest data. If at time t we want to estimate the change in the parameters from t−1 to t, we need an estimation algorithm that can update the parameters so that changes can be detected. In a dynamic system where the parameters are changing due to interventions such as vaccines or quarantines, the estimation algorithm also needs to take into account the previous changes that have taken place in the system. Traditional approaches to varying coefficient models from a Bayesian perspective have difficulties, as the sampling methods required often suffer from particle depletion as the process proceeds through time. Recall that many sampling methods are based on some form of acceptance/rejection rule. These methods start with a large number of particles (candidate samples), which are evaluated through the distribution of interest; the particles that best agree with the distribution are accepted and the other particles are rejected. In a sequential framework, the accepted particles from one time step are used as the candidate particles for the next time step. In each subsequent step, some particles are rejected, which depletes the pool of candidate particles for the next step. Across many steps, this results in having only a few candidate particles and hence even fewer accepted particles. Eventually, only one particle remains, which cannot adequately represent the distribution of interest. Hence any inferences made from one or a few particles are suspect. To avoid the problem of particle depletion, a new algorithm is presented here that augments the accepted particles with additional particles at each time step to be used as candidate particles for the next time step.

The proposed Algorithm 1 is a variant of the sampling importance resampling algorithm at each time step. To ensure there are enough particles to work through the sampling process, the basic idea is to augment each accepted sample by some random perturbations to generate new particles to move through to the next step.

Let us fix the notation first. Denote $\mathcal{D}(k)=\{S(k),E(k),I(k),R(k),D(k)\}$ as the actual state vector of the system at time step $k$, where the first two components are latent, $k\in\{0,1,\ldots,T\}$, and $T$ denotes the total number of time steps considered. Denote $g(\theta_0)$ and $g(\theta_k\mid\theta_{k-1})$, respectively, as the candidate distributions to sample $\theta$ from at time step 0 and at time step $k$ ($k=1,2,\ldots,T$). Let $\tilde{\theta}_k$ denote the set of accepted samples of $\theta$ at time $k$ that will be passed on to the next time step ($k\in\{0,1,\ldots,T-1\}$). Denote the set that contains all the accepted values of $\theta$ up to time step $k$ as $\tilde{\Theta}(k)$, i.e. $\tilde{\Theta}(k)=\tilde{\Theta}(k-1)\cup\tilde{\theta}_k$, $k=1,2,\ldots,T$.

We elaborate on the key steps of Algorithm 1. At time step 0, we draw $n_c$ candidate samples of $\theta$ from $g(\theta_0)$, denoted by $\{\theta_{0,j}\}_{j=1}^{n_c}$. We then evaluate the posterior distribution using the data at time 0 (i.e. $\mathcal{D}(0)$) and the candidate particles to obtain the unnormalized weights $\{w_j, j=1,2,\ldots,n_c\}$ at time step 0, which are subsequently normalized to the $\hat{w}_j$. We then obtain $n_p$ posterior samples by selecting from $\{\theta_{0,j}, j=1,2,\ldots,n_c\}$ with the corresponding probabilities given by $\{\hat{w}_j\}_{j=1}^{n_c}$; denote this set of $n_p$ samples by $\tilde{\theta}_0=\{\tilde{\theta}_{0,1},\tilde{\theta}_{0,2},\ldots,\tilde{\theta}_{0,n_p}\}$. This set of accepted values of $\theta$ will be passed on to time step 1.

The sampling for time step $k$ is similar to that for time step 0 but enhanced with sample augmentation. Specifically, at time step $k$ ($k\in\{1,2,\ldots,T\}$), using an appropriate candidate density $g(\theta_k\mid\tilde{\theta}_{k-1,\ell})$ which is conditioned on each accepted sample from time step $k-1$, we generate a batch of $n_b$ candidate samples in the neighborhood of $\tilde{\theta}_{k-1,\ell}$; denote the batch by $\{\theta_{k,\ell,j}\}_{j=1}^{n_b}$ for $\ell=1,2,\ldots,n_p$. We then evaluate the posterior distribution using the data up to time $k$ (i.e. $\mathcal{D}(k)$), the accepted samples up to time step $k-1$ (i.e. $\tilde{\Theta}(k-1)$) and the candidate particles, to obtain the unnormalized weights $\{w_{\ell,j}, j=1,2,\ldots,n_b, \ell=1,2,\ldots,n_p\}$ at time step $k$, which are subsequently normalized to the $\hat{w}_{\ell,j}$. We then obtain $n_p$ posterior samples by selecting from $\{\theta_{k,\ell,j}, j=1,2,\ldots,n_b, \ell=1,2,\ldots,n_p\}$ with the corresponding probabilities given by $\{\hat{w}_{\ell,j}, j=1,2,\ldots,n_b, \ell=1,2,\ldots,n_p\}$; denote this set of $n_p$ accepted samples by $\tilde{\theta}_k=\{\tilde{\theta}_{k,1},\tilde{\theta}_{k,2},\ldots,\tilde{\theta}_{k,n_p}\}$. Then by the end of time step $k$, we update the set of accepted values of $\theta$ to $\tilde{\Theta}(k)=\tilde{\Theta}(k-1)\cup\tilde{\theta}_k$.
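The resample-then-augment loop described above can be sketched on a toy problem. The code below is not the authors' Algorithm 1: it tracks a single Poisson rate rather than the four SEIRD parameters, it weights candidates using only the current observation's likelihood (omitting any prior or candidate-density correction and any dependence on earlier accepted samples), and it perturbs particles multiplicatively on the log scale so that they stay positive. The function names, the exponential prior with mean 10, the perturbation scale and the synthetic counts are all assumptions made for illustration.

```python
import math
import random


def poisson_logpmf(k, lam):
    """log P(K = k) for K ~ Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def weighted_resample(particles, log_weights, n_out, rng):
    """Normalize the weights stably in log space and draw n_out particles."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    return rng.choices(particles, weights=weights, k=n_out)


def augmented_sir(counts, n_c=2000, n_p=200, n_b=10, scale=0.05, seed=1):
    """Toy sampling importance resampling with particle augmentation."""
    rng = random.Random(seed)
    # Time step 0: n_c candidates from a hypothetical Exp prior with mean 10.
    candidates = [max(rng.expovariate(1 / 10), 1e-9) for _ in range(n_c)]
    log_w = [poisson_logpmf(counts[0], th) for th in candidates]
    accepted = [weighted_resample(candidates, log_w, n_p, rng)]
    for y in counts[1:]:
        # Augmentation: n_b perturbed copies around each accepted particle,
        # giving n_p * n_b candidates for this time step.
        candidates = [
            th * math.exp(scale * rng.gauss(0.0, 1.0))
            for th in accepted[-1]
            for _ in range(n_b)
        ]
        log_w = [poisson_logpmf(y, th) for th in candidates]
        accepted.append(weighted_resample(candidates, log_w, n_p, rng))
    return accepted


# Synthetic counts whose underlying rate jumps from about 20 to about 40.
counts = [20] * 10 + [40] * 10
accepted = augmented_sir(counts)
means = [sum(step) / len(step) for step in accepted]
```

Because every step starts from `n_p * n_b` fresh candidates, the accepted set keeps many distinct values instead of collapsing to a single particle, and the particle mean drifts toward the new rate after the jump.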

5. Monitoring

The presented method for estimating the SEIRD model parameters through time is quite responsive to changes in the system. Hence, these parameters can be monitored for changes, which will be manifested in the data. Since there are four parameters in the SEIRD model that need to be simultaneously monitored, the Multivariate Exponentially Weighted Moving Average (MEWMA) approach was chosen as an appropriate monitoring method. The MEWMA control chart, developed in [16], is an extension of the univariate EWMA.

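First the parameter samples were differenced using a single backward lag of one: $\Delta\alpha(t)_i=\alpha(t)_i-\alpha(t-1)_i$, $\Delta\gamma(t)_i=\gamma(t)_i-\gamma(t-1)_i$, $\Delta\beta(t)_i=\beta(t)_i-\beta(t-1)_i$ and $\Delta\eta(t)_i=\eta(t)_i-\eta(t-1)_i$, for $i=1,\ldots,n_p$. These form a vector $(\Delta\alpha(t)_i,\Delta\gamma(t)_i,\Delta\beta(t)_i,\Delta\eta(t)_i)^T$ to be monitored for significant deviations from zero, which would correspond to a significant change in the set of parameters. Writing $\Delta\alpha(t)$ for the $n_p$-vector of the $\Delta\alpha(t)_i$ (and similarly for the other parameters) and $\mathbf{1}_{n_p}$ for the $n_p$-vector of ones, the multivariate parameters are given by the mean:

$$\overline{\Delta\theta}(t)=\frac{1}{n_p}\mathbf{1}_{n_p}^T\big(\Delta\alpha(t),\Delta\gamma(t),\Delta\beta(t),\Delta\eta(t)\big)$$

and variance:

$$\mathrm{Cov}(\Delta\theta(t))=\frac{1}{n_p-1}\begin{pmatrix}\Delta\alpha(t)^T\\ \Delta\gamma(t)^T\\ \Delta\beta(t)^T\\ \Delta\eta(t)^T\end{pmatrix}\big(\Delta\alpha(t),\Delta\gamma(t),\Delta\beta(t),\Delta\eta(t)\big).$$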

In order to have a monitoring process that is not too sensitive, the MEWMA is employed as

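$$\mathrm{MEWMA}(t)=\Lambda\,\overline{\Delta\theta}(t)+(1-\Lambda)\,\mathrm{MEWMA}(t-1), \tag{4}$$

with the moving covariance matrix:

$$V(t)=\Lambda^2\,\mathrm{Cov}(\Delta\theta(t))+(1-\Lambda)^2\,V(t-1),$$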

where Λ is a smoothing coefficient that controls how much a new observation can influence the overall mean. Lower values of Λ are more conservative in that new observations do not have much influence on the mean and higher values allow new observations to have a greater influence on the mean. This is used to control false signals as any process being monitored will have some natural variation that may cause the observation to be signalled. Typical values of Λ include 0.1, 0.15, 0.2, 0.25 and 0.3.
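The role of Λ becomes clearer when the two recursions are unrolled. The following is a short derivation by repeated substitution; the starting values $\mathrm{MEWMA}(0)$ and $V(0)$ are not specified in the text and are kept symbolic here.

$$
\begin{aligned}
\mathrm{MEWMA}(t) &= \Lambda\sum_{j=0}^{t-1}(1-\Lambda)^{j}\,\overline{\Delta\theta}(t-j) + (1-\Lambda)^{t}\,\mathrm{MEWMA}(0), \\
V(t) &= \Lambda^{2}\sum_{j=0}^{t-1}(1-\Lambda)^{2j}\,\mathrm{Cov}(\Delta\theta(t-j)) + (1-\Lambda)^{2t}\,V(0).
\end{aligned}
$$

The most recent mean difference therefore receives weight Λ, and a mean difference that is $j$ steps old receives weight $\Lambda(1-\Lambda)^{j}$. For example, with Λ = 0.2 the newest observation has weight 0.2 and one that is five steps old has weight $0.2\times 0.8^{5}\approx 0.066$, whereas with Λ = 0.1 the corresponding weights are 0.1 and $0.1\times 0.9^{5}\approx 0.059$, so the smaller Λ spreads its weight more evenly over the past.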

Since we are looking at the differences in parameter values, the target value for the process mean should be (0,0,0,0), indicating no shift in the process. In the multivariate case, a test statistic based on Hotelling's $T^2$ can be defined as

$$T^2(t)=\mathrm{MEWMA}(t)^T\,V(t)^{-1}\,\mathrm{MEWMA}(t). \tag{5}$$

When $n_p$ is large, $T^2(t)$ in Equation (5) is approximately $\chi^2(4)$, which has a 0.95-quantile equal to 9.48. Hence any $T^2(t)>9.48$ would be deemed a significant change in the parameter differences and thus a significant change in the SEIRD process. Note that in our case $n_p=1000$, which is deemed large enough for the $T^2$ approximation.
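The monitoring recursion and the signalling rule can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' code: the recursion is started from MEWMA(0) = 0 and V(0) = 0 (the text does not give starting values), the covariance is computed after centering the differences (the formula in the text is written without explicit centering), and the difference samples are synthetic Gaussian draws with a deliberately large shift injected at the sixth time step. All function names are hypothetical.

```python
import math
import random


def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def covariance(rows):
    """Centered sample covariance with divisor n - 1 (centering is assumed)."""
    n, p = len(rows), len(rows[0])
    m = mean_vector(rows)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def mewma_t2(diff_samples, lam=0.2):
    """diff_samples[t] holds the n_p difference vectors for one time step."""
    p = len(diff_samples[0][0])
    z = [0.0] * p                      # MEWMA(0) = 0: assumed starting value
    V = [[0.0] * p for _ in range(p)]  # V(0) = 0: assumed starting value
    out = []
    for rows in diff_samples:
        m, C = mean_vector(rows), covariance(rows)
        z = [lam * m[j] + (1 - lam) * z[j] for j in range(p)]
        V = [[lam ** 2 * C[a][b] + (1 - lam) ** 2 * V[a][b] for b in range(p)]
             for a in range(p)]
        x = solve(V, z)                # x = V^{-1} z, avoiding an explicit inverse
        out.append(sum(zi * xi for zi, xi in zip(z, x)))
    return out


def chi2_4_cdf(x):
    """Closed-form CDF of the chi-square distribution with 4 degrees of freedom."""
    return 1.0 - math.exp(-x / 2) * (1 + x / 2)


rng = random.Random(0)


def draw(shift, n_p=500):
    """Synthetic differences: unit-variance noise, shift in the first component."""
    return [[rng.gauss(shift if j == 0 else 0.0, 1.0) for j in range(4)]
            for _ in range(n_p)]


diffs = [draw(0.0) for _ in range(5)] + [draw(10.0)]
t2 = mewma_t2(diffs)
signals = [t for t, v in enumerate(t2, start=1) if v > 9.48]
```

In this synthetic run only the sixth step exceeds the control limit. The helper `chi2_4_cdf` can be used to check the limit itself: the χ²(4) distribution function evaluated near 9.488 is approximately 0.95.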

6. Data analysis: state of Qatar

6.1. Data description

The World Health Organization, Johns Hopkins University and other agencies maintain data sets on the daily number of confirmed infected cases, deaths, and recoveries for every country. We apply the proposed methodology to study the evolution of the pandemic in the State of Qatar. All data for Qatar were obtained from Johns Hopkins University and are freely accessible via the Johns Hopkins COVID-19 GitHub repository [19]. The GitHub site includes daily cumulative number of confirmed infections, cumulative number of recovered and cumulative number of deaths starting 22 January 2020.

The goal of the data analysis is to demonstrate and assess the proposed modeling approach and its use in monitoring the pandemic as the varying coefficient approach will allow for the model parameters to adjust quickly to changes in the data generation process. In model (1), the Recovered and Death states are cumulative with no outgoing transitions whereas the Infected state has transitions from Exposed and to Recovered and Death states. Hence the data for confirmed infections are cumulative and include the numbers from both the Recovered and Death states. As such, if CI(t) denotes the confirmed infections at time t, then the number of Infected subjects at time t is defined as

$$I(t)=CI(t)-R(t)-D(t).$$

From here on, the term ‘Active Infections’ will be used to denote this derived variable versus the cumulative Infected provided in the data.

Figure 1 shows the plots of daily Active Infections, Recovered and Deaths data for the State of Qatar since 29 February 2020. The Active Infections start very low but encounter a large jump around day 12 due to increased testing. The Active Infections then seem to plateau until day 30, after which there is an extreme growth in Active Infections. The Active Infections start to go down after day 90. The patterns in the Recovered and Deaths are similar and reflect the time of infection before recovery or death. Both graphs show a very slight increase in the number of recovered or dead subjects until about day 90, after which a steady increase is noticed.

Figure 1. Plot of Active Infections (a), Recovered (b) and Deaths (c) for the State of Qatar data.

6.2. Evolution of the pandemic in Qatar

The State of Qatar is one of the countries heavily affected by the COVID-19 pandemic. Since its first infected case on 29 February 2020, Qatar has become one of the most heavily infected countries in the Middle East, with the total number of confirmed cases standing at 148,258 as of 26 January 2021. The total number of deaths in Qatar so far stands at 248, which is low relative to the total number of infected cases and is an indication of the country's highly effective healthcare system.

Qatar prepared a flexible plan for risk management, grounded on national risk assessment, taking into account the global risk assessment done by WHO and focusing on reinforcing capacities to reduce or eliminate health risks from COVID-19. Along with its well-organized healthcare system, the country was very quick to respond to the global pandemic. It implemented many preventive measures very early on in the pandemic, including border control for early detection of cases. These included, but were not limited to, installing thermal screening for passengers who entered the country at Hamad International Airport and at seaports as early as January 2020, with the first quarantine facilities opening on 1 February [1].

On 9 March 2020 (day 10), Qatar closed all universities and schools and placed a travel ban on 15 countries: Bangladesh, China, Egypt, India, Iran, Iraq, Italy, Lebanon, Nepal, Pakistan, the Philippines, South Korea, Sri Lanka, Syria and Thailand. On 14 March 2020 (day 15), Qatar expanded its travel ban to include three new countries: Germany, Spain and France [11,20]. On 21 March 2020, the Ministry of Municipality and Environment closed all parks and public beaches to curb the spread of coronavirus. On 23 March 2020 (day 24), the Ministry of Commerce and Industry (MoCI) decided to temporarily close all restaurants, cafes, food outlets and food trucks to the public. The MoCI also decided to close all unnecessary businesses on 27 March 2020 (day 28) [11,20].

As the number of infected cases continued to rise, on 8 April 2020 (day 40), the Ministry of Public Health (MoPH) announced that the Primary Health Care Corporation would designate two health centers, one in Umm-Salal and one in Gharrafat Al-Rayyan, for screening, testing and quarantining COVID-19 patients. MoPH also announced a hotline for psychological aid on 9 April 2020 (day 41). On 3 June 2020 (day 95), up to four people and families were allowed inside a vehicle, and working hours for private sector employees were extended to run from 7.00am until 8.00pm. Also, in early June 2020, it was announced that the country would start reopening in phases, with the first phase beginning on 15 June 2020 (day 108) and the second phase beginning on 1 July 2020 (day 123).

These proactive interventions made by the government change the dynamics of the pandemic and hence, need to be considered while setting up a real-time monitoring system of the infection, recovery and death rates. In the next subsection, we illustrate the proposed model as a data-driven forecasting model for use by stakeholders in the State of Qatar to monitor the COVID-19 pandemic.

6.3. Data analysis results

When specifying prior distributions, one typically attempts to use diffuse or minimally informative prior distributions. However, in the case of the SEIRD model, one needs to specify reasonably informative distributions, as during the first few time periods the data do not contain information about all model transitions. For example, in the first few time periods neither recoveries nor deaths are observed; hence the parameters governing the transitions from the infected to the recovered state and from the infected to the death state cannot be based on data, as there are no data yet in these states. Another issue is that a dynamical system may go quickly to a stable point, such as no disease spread or everyone having died from the disease, if the parameter values are not chosen correctly. To ensure the initialization phase of the model is successful, the prior distribution parameters were chosen by tuning the model to the time periods up to and including the first death. This resulted in the following specification: α(t) ∼ Exp(2/4450000), β(t) ∼ Exp(1/105), γ(t) ∼ Exp(1/14) and η(t) ∼ Exp(1/9500). These values are identical to those chosen by Ghanam (2021).

The SEIRD model was run with the following initial values for Qatar: S(0)=2,782,000, E(0)=3, I(0)=1, R(0)=0 and D(0)=0. The SEIRD model and the sampling process given in Algorithm 1 were coded in MATLAB R2020a and were run on a PC with Intel Core i7-7700 CPU at 3.60GHz with 8GB of RAM. At each time step the sampler was run with nc=10,000, np=1000 and nb=10. Hence, each of the np samples had a batch of nb samples generated in its neighborhood, resulting in np × nb = 10,000 candidate particles at the beginning of the next time step (Step 9 of Algorithm 1); this can be accomplished by, e.g. randomly sampling from a Gaussian distribution centered at each of the np samples with a small variance. The model computation time is about 60 minutes for T = 135 time steps. Note that the number of individuals in each compartment in the model is comparatively small due to the population size. This means that many of the computations are faster, especially when dealing with the large factorials associated with the Poisson distribution.

Figure 2 shows the coefficient estimates across time, α(t) (panel a), β(t) (panel b), γ(t) (panel c) and η(t) (panel d) for the Qatar data set. Of particular interest is the time frame from day 90 to day 95 in which the active infections I(t) exhibited a large drop (see Figure 1). Notice that the distribution for α(t) becomes incredibly concentrated in this time frame as evidenced by the narrow credible intervals. This is further exhibited in β(t) and η(t). Also, by examining γ(t) during this time frame one can see a large spike in the recovery rate with very narrow credible intervals after about day 80. The recovery rate keeps on increasing dramatically with another spike around day 115.

Figure 2. Plots of α(t) (a), β(t) (b), γ(t) (c) and η(t) (d) across time with associated 95% credible intervals for the State of Qatar data.

Figure 3 shows the model fitted to the data with 95% posterior predictive bounds for Active Infections (panel a), Recovered (panel b) and Deaths (panel c). All three models appear to fit the data extremely well based on visual inspection. Particularly notice that around day 90, the dramatic drop in Active infections is contrasted with Figure 2.

Figure 3. Plots of the data, fitted model with 95% posterior prediction bounds for Active Infections (a), Recovered (b) and Deaths (c) for the State of Qatar data.

To assess the fit of the model, a Pseudo-$R^2$ was calculated as:

$$\text{Pseudo-}R^2=1-\frac{\sum_{t=1}^{n}[I(t)-\hat{I}(t)]^2+\sum_{t=1}^{n}[R(t)-\hat{R}(t)]^2+\sum_{t=1}^{n}[D(t)-\hat{D}(t)]^2}{\sum_{t=1}^{n}[I(t)-\overline{I(t)}]^2+\sum_{t=1}^{n}[R(t)-\overline{R(t)}]^2+\sum_{t=1}^{n}[D(t)-\overline{D(t)}]^2}$$

where $\overline{I(t)}$, $\overline{R(t)}$ and $\overline{D(t)}$ are the sample means of I, R and D, respectively, across time (and hence are not a function of time), and $\hat{I}(t)$, $\hat{R}(t)$ and $\hat{D}(t)$ are the medians of the posterior predictive distributions for I, R and D, respectively, at each time (and hence are functions of time).

Since uncertainty quantification is important, the proportion of observations that fall into the predictive bands was calculated as follows:

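$$\hat{P}_{fit}=\frac{\sum_{t=1}^{n}\mathbb{I}\{I(t)\in\hat{C}_I(t)\}+\sum_{t=1}^{n}\mathbb{I}\{R(t)\in\hat{C}_R(t)\}+\sum_{t=1}^{n}\mathbb{I}\{D(t)\in\hat{C}_D(t)\}}{3n}$$

where $\hat{C}_I(t)$, $\hat{C}_R(t)$ and $\hat{C}_D(t)$ denote the 95% predictive intervals for I(t), R(t) and D(t), respectively, and $\mathbb{I}\{A\}$ is an indicator function taking the value one if event A is true and zero otherwise.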
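Both fit summaries are simple to compute once the observed series, the posterior predictive medians and the predictive bounds are available. The sketch below is a plain Python illustration with hypothetical function names and made-up toy numbers; it pools the three observed series exactly as the two formulas do.

```python
def pseudo_r2(observed, fitted):
    """Pooled pseudo-R^2 over the I, R and D series.

    observed and fitted map 'I', 'R', 'D' to equal-length lists.
    """
    sse = 0.0
    sst = 0.0
    for key in ("I", "R", "D"):
        y, yhat = observed[key], fitted[key]
        ybar = sum(y) / len(y)
        sse += sum((a - b) ** 2 for a, b in zip(y, yhat))
        sst += sum((a - ybar) ** 2 for a in y)
    return 1.0 - sse / sst


def coverage(observed, lower, upper):
    """Share of observations inside their predictive interval, pooled over I, R, D."""
    hits = 0
    total = 0
    for key in ("I", "R", "D"):
        for y, lo, hi in zip(observed[key], lower[key], upper[key]):
            hits += lo <= y <= hi
            total += 1
    return hits / total


# Toy numbers for illustration only.
observed = {"I": [10, 20, 30, 40], "R": [0, 5, 10, 20], "D": [0, 0, 1, 2]}
fitted = {"I": [11, 19, 31, 39], "R": [0, 5, 10, 20], "D": [0, 0, 1, 2]}
lower = {"I": [9, 17, 29, 37], "R": [0, 3, 8, 18], "D": [0, 0, 0, 0]}
upper = {"I": [13, 21, 33, 41], "R": [2, 7, 12, 22], "D": [0, 0, 1, 1]}

r2 = pseudo_r2(observed, fitted)
p_fit = coverage(observed, lower, upper)
```

In the toy data one of the twelve observations (the last death count) lies outside its interval, so the pooled coverage is 11/12.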

The Pseudo-$R^2=0.9999$ shows very close agreement between the median fitted values and their corresponding data values. The proportion of observations that fall into the 95% predictive bands is $\hat{P}_{fit}=0.8394$, which indicates that the model has less uncertainty than it should. However, the value is still quite high, with approximately 84% of observations being captured by the intervals.

Figure 4 shows the plot of $T^2(t)$ for the Qatar data with a control limit set at 9.48 (red dashed line) and a smoothing parameter Λ=0.2. Until day 40, the process appears stable, as indicated by $T^2(t)$ remaining below the control limit. From day 40 onward, several time points signal a change in the process: days 40, 44, 47, 64, 65, 67, 68, 69, 71, 77, 95 and 123. The early days (40, 44, 47) can easily be seen to agree with the change in both infection rate α(t) and death rate η(t) in Figure 2 (a) and (d). Notice in these plots the high volatility with spikes in α(t) and a clear shift upwards in η(t) at the same times. During days 64–77, there appears to be a very large amount of volatility in the infection rate α(t), which is also evidenced in Figure 3 (a) for active infections. Notice that the active infections have large increases one day and smaller increases the next day. The method also picks up the spike in infection rates as well as the dramatic increase in recovery rate γ(t) at day 95. At day 123, Figure 2 shows a spike in the rate of exposed to actively infected, β(t) (panel b), and another shift in recovery rate γ(t) (panel c).

Figure 4. Plot of Hotelling's $T^2$ statistic through time. Horizontal line corresponds to the 95% control limit.

The results from our model corroborate the evolution of the COVID-19 pandemic in Qatar. As discussed in Section 6.2, in response to the rising number of active cases, day 40 (8 April 2020) marked the designation of two healthcare facilities for screening, testing and quarantining COVID-19 patients, which resulted in changes in infection and death rates. Days 64 to 77 correspond to the first two weeks of May 2020, when the number of new daily confirmed cases was seen to increase dramatically. Also, Qatar reached the 20,000 mark of COVID-19 cases on day 69 (8 May 2020) and the 30,000 mark on day 77 (16 May 2020). This caused our model to flag multiple days within the two weeks, indicating that the pandemic was constantly evolving over this time frame. Another change in the dynamics of the pandemic was signalled on day 95. On this day, up to four people and families were allowed to travel inside a vehicle, and working hours for private sector employees were extended due to the increased recovery rate in the country. Finally, our model signalled day 123 (1 July 2020), the day of the second phase reopening in Qatar.

To study the sensitivity of the signals to the smoothing parameter in the MEWMA chart, several other analyses were performed with Λ=0.1, Λ=0.15, Λ=0.25 and Λ=0.3, which are commonly chosen values for this parameter. When Λ=0.1 the monitoring process signalled at days: 44, 47, 69, 77 and 123. Since this smoothing parameter puts more weight on the previous mean than on the new observations, it is expected that a smaller number of days would be signalled. When Λ=0.15 the monitoring process signalled at days: 40, 44, 47, 64, 68, 69, 71, 77 and 123. Notice that for both Λ=0.1 and Λ=0.15 day 95 is not signalled, although inspection of almost all the charts shows a shift in the process on that day. Hence, these parameter choices would be considered too conservative in this case. When Λ=0.25 the following days were signalled: 40, 44, 47, 64, 65, 67, 68, 69, 71, 77, 95 and 123. This is the same result as when Λ=0.2. When Λ=0.3 the monitoring process signalled the following days: 40, 44, 47, 64, 65, 67, 68, 69, 71, 72, 74, 77, 80, 95 and 123. Here days 72, 74 and 80 are added to the list of signalled days. This reflects the volatility in α(t) across this time frame. Overall, when less conservative values for Λ are chosen, the days signalled are quite reasonable. It could be argued that in a pandemic situation, a more sensitive monitoring process would be beneficial to public policy makers, as it can signal when an effective intervention has been introduced.

7. Discussion

This work provides a novel tool for monitoring and capturing changes in the evolution of a pandemic by monitoring changes in the parameters of mathematical epidemiological models, such as the Susceptible, Exposed, Infected, Recovered, Death (SEIRD) model, using the Multivariate Exponentially Weighted Moving Average (MEWMA) process monitoring technique. A Bayesian approach is taken for the parameter estimation, with a sampling algorithm that both allows quick updating of the SEIRD model and provides samples that can be monitored by the MEWMA regime. This sampling algorithm uses the notion of Sampling Importance Resampling, but augments the particles at each step to avoid particle depletion. This quick updating allows the process monitoring scheme to ‘signal’ quickly if there is a change in the model parameters. The method is then used to monitor the evolution of the COVID-19 pandemic in the State of Qatar.
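The idea of resampling and then augmenting the particles can be illustrated with a small sketch. This is not the authors' augmented particle MCMC scheme: the Gaussian jitter used as the augmentation step, the function name sir_step, the jitter_sd parameter and the toy likelihood are all assumptions chosen only to show why augmentation counteracts particle depletion.

```python
import numpy as np


def sir_step(particles, log_lik, rng, jitter_sd=0.05):
    """One Sampling Importance Resampling step followed by a jitter-based augmentation.

    particles : array of shape (n, d), samples from the previous step
    log_lik   : function mapping an (n, d) array to n log-likelihood values
    """
    particles = np.asarray(particles, dtype=float)
    logw = log_lik(particles)
    w = np.exp(logw - np.max(logw))   # subtract the max for numerical stability
    w /= w.sum()
    # Resample with replacement in proportion to the importance weights.
    idx = rng.choice(len(particles), size=len(particles), replace=True, p=w)
    resampled = particles[idx]
    # Assumed augmentation: small Gaussian noise so duplicated particles differ again.
    return resampled + rng.normal(0.0, jitter_sd, size=resampled.shape)


def toy_log_lik(p):
    """Toy Gaussian log-likelihood centred at 2.0 with standard deviation 0.5."""
    return -0.5 * np.sum(((p - 2.0) / 0.5) ** 2, axis=1)


rng = np.random.default_rng(1)
prior = rng.normal(0.0, 1.0, size=(2000, 1))   # samples carried over from the previous step
posterior = sir_step(prior, toy_log_lik, rng)
print(posterior.mean())
```

Without the final jitter line, repeated resampling would leave progressively fewer distinct particles; any augmentation scheme has to restore diversity without distorting the target distribution too much, which is why the jitter here is kept small.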

Here, the maximum-likelihood (ML) approach might at first seem to be a reasonable method. However, when combined with the MEWMA method, the ML approach will cause problems. Recall that the MEWMA approach requires samples to be drawn at each step. In order to obtain samples for the ML approach, one would need to assume approximate normality for the parameters and then draw samples from the approximate sampling distribution of the parameters. The assumption of approximate normality may or may not be appropriate at every time step. On the other hand, the Bayesian approach directly provides those samples for each time step without making any distributional assumption.
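A small sketch shows why the normality assumption can be questionable for rate parameters. The estimates and standard errors below are hypothetical numbers chosen for illustration, not values from this study; the point is only that draws from a normal approximation can fall outside the admissible range when an estimate lies close to zero.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ML estimates of two non-negative rate parameters.
theta_hat = np.array([0.02, 0.30])
# Hypothetical covariance from the inverse information matrix (standard errors 0.015 and 0.05).
cov_hat = np.diag([0.015, 0.05]) ** 2

# Draw samples from the approximate normal sampling distribution of the estimator.
draws = rng.multivariate_normal(theta_hat, cov_hat, size=5000)

# Share of draws in which the first rate is negative, which is impossible for a rate.
frac_negative = float(np.mean(draws[:, 0] < 0.0))
print(frac_negative)
```

Under these assumed values a noticeable share of the draws for the first parameter is negative, so the samples fed to the monitoring scheme would not all be valid parameter values.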

Despite the proliferation of forecasting models for the evolution of the COVID-19 pandemic, their accuracy can be compromised, and comparisons among them complicated, by numerous factors, e.g. their construction methods, the distinct healthcare systems adopted by different countries/regions, different political decisions or policies, and distinct testing and reporting mechanisms [21]. Hence, using the forecasts given by a particular forecasting model for critical decision-making is challenging. The proposed approach takes a different perspective and enables decision-makers to work with a tailored SEIRD model, assess the effectiveness of the policies/decisions made, and adopt interventions and/or prevention strategies consistently over time.

The State of Qatar example illustrates the proposed method's ability to perform daily monitoring of a pandemic. The proposed model fits the data very well, with a pseudo-R² = 0.999. In the model definition, immigration, emigration, natural births and natural mortality have not been included; however, based on the high pseudo-R², they would have a negligible effect on the fit. Furthermore, the model does not contain compartments for subjects who recovered without being confirmed as infections. Since this is not observed, one can only speculate on the impact that additional data would have on the model fit; however, it is expected to be very small. As seen in Figures 2 and 3, the proposed method successfully picks up the day-to-day fluctuations in the pandemic evolution process in Qatar via the estimated time-varying model parameters. Note that the pandemic's overall state can also be monitored by tracking the T² statistic over time (see Figure 4). For Qatar, the method signals the first change in the process around day 40. This change can be attributed to several government interventions, such as closing parks and public beaches on day 24, closing all unnecessary businesses on day 28 and announcing two major health centers dedicated to COVID-19 patients on day 40. The method also signals multiple days beyond day 40, all of which seem reasonable upon further inspection. Thus, the proposed method gives decision-makers the ability to evaluate planned interventions as well as discover new changes to the process and respond accordingly. This method can also be extended to monitoring a process at the state/county level by incorporating a spatial covariance and using the mixed model approach.

The unique environment of Qatar and the immediate policy making and enforcement by the government of Qatar make the SEIRD model presented here appropriate for the population. The government enforced quarantines, surveillance testing, lockdowns and strict masking requirements. These and other restrictions were imposed uniformly throughout the country, which makes the proposed model assumptions valid in this setting. In contrast, in countries like the United States, individual states had different policy requirements, and enforcement varied in a way that violates the assumptions of the SEIRD model. For example, masking was not enforced, testing was not enforced, those who were sick could interact with those who were healthy, and mobility across the country was not impeded, allowing the disease to spread in a non-uniform way. Pursuing a model for the US would require individual models for each state as well as a mechanism across models to account for transmission across states. This is a very different model from the one for Qatar and needs to be explored separately.

Finally, since the augmented sampling regime allows posterior samples to be saved from the previous day, updating is performed on a daily basis and only requires the new data and the previous day's samples. Thus the entire SEIRD model need not be refitted from the beginning of the series. Furthermore, the MEWMA is quickly calculated from the posterior samples and can quickly signal those managing the pandemic. Note that the method is not tied to the SEIRD model given in Equation (1), as the augmented sampler and MEWMA monitoring protocol are generic. Our motivation here has been a system in which the reproduction number does not include all the relevant parameters. In systems where the reproduction number depends on all parameters and is meaningful, it could be added as a further dimension of the monitoring protocol, one that could ‘signal’ serious changes in long-term process outcomes.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Al Khal A., Al-Kaabi S., and Checketts R.J., Qatar's response to COVID-19 pandemic, Heart. Views. 32 (2020), pp. 21–129.
  • 2. Bagal D.K., Rath A., Barua A., and Patnaik D., Estimating the parameters of susceptible-infected-recovered model of COVID-19 cases in India during lockdown periods, Chaos, Solitons & Fractals 140 (2020), pp. 110154.
  • 3. Basu D., Salvatore M., Ray D., Kleinsasser M., Purkayastha S., Bhattacharyya R., and Mukherjee B., A comprehensive public health evaluation of lockdown as a non-pharmaceutical intervention on COVID-19 spread in India: National trends masking state level variations, 2020 medRxiv.
  • 4. Chinazzi M.D., Gioannini C., Litvinova M., Merler S., Mu K., Rossi L., Sun K., and Viboud C., The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak, Sci. 2020.
  • 5. Chowell G. and Brauer F., The basic reproduction number of infectious diseases: Computation and estimation using compartmental epidemic models, Mathematical and statistical estimation approaches in epidemiology, Springer, 2009.
  • 6. Clancy D. and O'Neill P.D., Bayesian estimation of the basic reproduction number in stochastic epidemic models, Bayesian Anal. 3 (2008), pp. 737–757.
  • 7. Deb S. and Majumdar M., A time series method to analyze incidence pattern and estimate reproduction number of COVID-19, preprint (2020). Available at arXiv:2003.10655.
  • 8. Fisher D. and Heymann D., Q&A: the novel coronavirus outbreak causing COVID-19, BMC Med. (2020). doi: 10.1186/s12916-020-01533-w.
  • 9. Ghanam R., Boone E. L., and Abdel-Salam A.-S. G., SEIRD Model for Qatar COVID-19 Outbreak, Lett. Biomath. 8 (2021), pp. 19–28. Available at https://lettersinbiomath.journals.publicknowledgeproject.org/index.php/lib/article/view/319.
  • 10. Giuliani D.D. and Santi F., Modelling and predicting the spread of Coronavirus (COVID-19) infection in NUTS-3 Italian regions, Preprint (2020). Available at arXiv:2003.06664.
  • 11. Hamad Medical Corporation, Major Risks to Business Continuity, Doha, Qatar, 2020.
  • 12. Han Z., Li T., and You J., These Unprecedented Times: The Dynamic Pattern Of COVID-19 Deaths Around The World, Preprint (2020). Available at arXiv:2011.02824.
  • 13. Jewell C.P., Kypraios T., Neal P., and Roberts G.O., Bayesian analysis for emerging infectious diseases, Bayesian Anal. 4 (2009), pp. 465–496.
  • 14. Kyrychko Y.N., Blyuss K.B., and Brovchenko I., Mathematical modelling of the dynamics and containment of COVID-19 in Ukraine, Sci. Rep. 10 (2020), pp. 19662. Available at 10.1038/s41598-020-76710-1.
  • 15. Loli Piccolomini E. and Zama F., Monitoring Italian COVID-19 spread by a forced SEIRD model, PLoS. ONE. 15 (2020), pp. e0237417. Available at 10.1371/journal.pone.0237417.
  • 16. Lowry C.A., Woodall W.H., Champ C.W., and Rigdon S.E., A multivariate exponentially weighted moving average control chart, Technometrics 34 (1992), pp. 46–53.
  • 17. May R.M. and Anderson R.M., Infectious Diseases of Humans: Dynamics and Control, Oxford University Press, Oxford, 1991. ISBN 0-19-854040-X.
  • 18. McIntosh K., Coronavirus disease 2019 (COVID-19), Up-To-Date, Available at https://www.uptodate.com/contents/coronaviruses. Accessed May 8 (2020).
  • 19. Miller M., 2019 novel coronavirus COVID-19 (2019-nCoV) data repository, Bulletin-Association of Canadian Map Libraries and Archives (ACMLA) 164 (2020), pp. 47–51.
  • 20. Ministry of Public Health, Qatar National Preparedness and Response Plan for Communicable Diseases, Doha, Qatar, 2019.
  • 21. Nikolopoulos K., Punia S., Schäfers A., Tsinopoulos C., and Vasilakis C., Forecasting and planning during a pandemic: COVID-19 growth rates, supply chain disruptions, and governmental decisions, Eur. J. Oper. Res. 290 (2021), pp. 99–115.
  • 22. Rezabakhsh A., Ala A., and Khodaei S.H., Novel coronavirus (COVID-19): A new emerging pandemic threat, J. Res. Clinical Med. 8 (2020), pp. 5–6.
  • 23. Roy A. and Karmakar S., Bayesian semiparametric time varying model for count data to study the spread of the COVID-19 cases, preprint (2020). Available at arXiv:2004.02281.
  • 24. Vynnycky E. and White R.G., An Introduction to Infectious Disease Modelling, Oxford University Press, Oxford, 2010. ISBN 978-0-19-856576-5.
  • 25. Wang L., Zhou Y., He J., Zhu B., Wang F., Tang L., and Song P.X., An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China, J. Data. Sci. 18 (2020), pp. 409–432.
  • 26. World Health Organization, Novel Coronavirus (2019-nCoV) situation reports.
  • 27. Wu F., Zhao S., Yu B., Chen Y.M., Wang W., Song Z.G., Hu Y., Tao Z.W., Tian J.H., Pei Y.Y., and Yuan M.L., A new coronavirus associated with human respiratory disease in China, Nature. 579 (2020), pp. 265–269.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
