. 2022 Dec 13;167:112984. doi: 10.1016/j.chaos.2022.112984

COVID-19 spread control policies based early dynamics forecasting using deep learning algorithm

Furqan Ali a, Farman Ullah b, Junaid Iqbal Khan a, Jebran Khan a,c, Abdul Wasay Sardar a, Sungchang Lee a,
PMCID: PMC9744690  PMID: 36530380

Abstract

Many severe epidemics and pandemics have hit human civilizations throughout history. The recent disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), known as COVID-19, became a global disease and is still spreading around the globe. It has severely affected the world’s economy and ways of life. This makes it necessary to predict the spread in advance and to consider various control policies so that a complete closure of the country can be avoided. In this paper, we propose a deep learning-based stacked bi-directional long short-term memory (Stacked Bi-LSTM) network that forecasts COVID-19 more accurately for South Korea. The paper’s main objectives are to present a lightweight, accurate, and optimized model that predicts the spread while considering restriction policies such as school closure, workplace closure, and the cancellation of public events. Based on fourteen parameters (including control policies), we forecast the future number of positive, dead, recovered, and quarantined cases. We use a South Korean dataset that includes several control policies implemented to minimize the spread of COVID-19. We compare the performance of the stacked Bi-LSTM with traditional time-series models and the LSTM model using the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). Moreover, we study the impact of control policies on forecasting accuracy, as well as the impact of replacing the default Bi-LSTM activation function Tanh with ReLU. The research provides insight that helps policymakers pool resources optimally at the right date and time ahead of an event and control the spread in the meantime by employing various strategies.

Keywords: COVID-19, Forecasting, Deep Learning, Stacked Bi-LSTM, Long short-term memory, Pandemic, Time series

1. Introduction

Mortality is an inescapable reality of the human experience, but the unavoidable nature of epidemics is debatable. Humanity has been suffering from epidemic diseases throughout history. However, once epidemics become foreseeable, it is possible to make preparations and fight back. COVID-19 is a novel coronavirus infectious disease that transmits from person to person through airborne particles [1]. Major symptoms include shortness of breath and sometimes loss of taste and smell. While the mortality rate of the virus is relatively low, COVID-19 has proven to be quite adept at spreading quickly to large numbers of people. The impact of this virus on South Korea can be seen in Fig. 1.

Fig. 1. Number of COVID-19 cases in South Korea as of October 14, 2021 [2].

In January 2020, COVID-19 and the relevant precautionary guidelines were publicly announced by the World Health Organization (WHO) [3]. Many countries around the globe enforced curfews in their cities and adopted a complete state of lockdown. Health authorities stated that the disease had a more significant impact on older demographic groups. Thus far, the virus has killed more than five million people worldwide. Official figures are based on data accumulated by researchers, keeping in mind that such calculations are always prone to errors or bias. Some advanced studies have also shown that forecasting this disease in the future with minimal error is also possible, particularly with deep learning modeling.

Previous studies have highlighted that pandemics do not follow a linear pattern in their spread [4], [5]. Accordingly, the challenge is not only to control the pandemic with strategies like contact tracing and social distancing [6], but also to develop a model that can forecast the future given nonlinear and unpatterned data, allowing policymakers to take proper action to deal with the future state of the pandemic. This problem persists due to data scarcity and prediction uncertainty. The reduction in uncertainty is directly proportional to how well the model performs over the nonlinear patterns of historical data. Researchers have proposed various techniques to resolve nonlinear pattern forecasting with lower error rates and higher levels of accuracy [7]. Deep learning is an effective tool employed to address these issues, eliciting promising results. Previous approaches attempted to fit a curve using one feature plotted against time. These have been made obsolete by multi-parametric approaches, which consider multiple features plotted against time and fit the curve to the relevant pattern. In deep learning, the selection of essential and relevant parameters is critical [8]. Therefore, multiple parameters are tested in a given scenario to fit the best curve against time and arrive at the model with the lowest validation error.

The government must pool resources optimally at the correct locations and times to combat the pandemic. In this paper, we propose a deep learning-based stacked bi-directional long short-term memory (Bi-LSTM) network that can forecast four different parameters in time, i.e., COVID-19 confirmed, recovered, dead, and quarantined cases. Bi-LSTM is an advanced variant of recurrent neural networks (RNNs) [9]. RNNs are prone to the vanishing gradient problem [10], whereas Bi-LSTM models mitigate the vanishing effect. Different parameters, including control policies, are input to the model. To evaluate the error, we employ the mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). The proposed model is efficient and computationally less expensive than previous hybrid approaches, which generally require significant computational power. The proposed approach differs from previous approaches in that it utilizes control policy data, whereas the deep learning models in the literature are trained only on forecasting parameters such as the positive rate and death rate. Our proposed approach uses control policies along with forecasting parameters to yield better and more accurate results.

In light of the above, our main contributions are as follows.

  • 1. Designing an accurate, simple, stacked bidirectional LSTM model that can forecast COVID-19 for the next three days.

  • 2. Utilizing a novel feature set from the existing COVID-19 data, such as user mobility policies.

  • 3. Analyzing which mobility policies are most helpful in increasing model precision.

  • 4. Forecasting the positive rate, death rate, recovered rate, and quarantined rate considering the ten different policies employed by the government, such as school closure, movement between cities, and international travel.

The rest of the paper is organized as follows: Section 2 briefly describes related work and background on time series forecasting and modeling in the context of COVID-19 and AI. The proposed methodology for forecasting COVID-19 using novel features is explained in Section 3. Section 4 presents and discusses the results, and Section 5 concludes the paper.

2. Literature review

Many studies exist that show the current spread of COVID-19 and its effect on human health, psychology, and daily routines. There are also studies on detection, diagnosis, and the current transmission rate.

Many mathematical models have been derived to forecast the spread of COVID-19 and to support diagnostics. An early Monte Carlo simulation model was applied to show the transmission rate of COVID-19 with the help of a daily mean reproduction number (Rt), taking different parameters into account, such as the number of daily cases and the number of confirmed cases versus deaths [11]. A recent study has been carried out on COVID-19 diagnostics using radiological images. Systematic reviews were collected from three different databases, PubMed, Scopus, and Web of Science, providing insights into which model provided the best accuracy in terms of sensitivity and specificity values [12]. Another study compared a federated machine learning model with a traditional machine learning model and derived the best parameters for the early detection of COVID-19 using chest X-ray images. It was observed that the activation function (softmax) and the model optimizer (SGD) played a crucial role in improving model accuracy [13].

Time-series forecasting predicts the future of a particular phenomenon by utilizing previous data. Approaches are categorized into three different classes [14]: statistical modeling, machine learning modeling, and deep learning modeling. The proposed research is concerned with developing a better model for forecasting COVID-19 cases with respect to time. It provides insight into the spread pattern of COVID-19 by visualizing its spread graph. It also helps policymakers determine which policies are most helpful in improving forecast accuracy and are meaningful for the control of COVID-19. To forecast the growth rate of COVID-19, researchers have used various approaches to obtain the best accuracy [15]. Amar et al. [16] used machine learning regression models to forecast the COVID-19 spread rate in Egypt. They compared seven different regression model variants and selected the approach with the minimum error value. The models forecasted COVID-19 for the next 15 days with a government-provided dataset. Table 1 summarizes the approaches and datasets used for COVID-19 future value forecasting.

Table 1. Literature review and related works for forecasting COVID-19 cases.

Ref No. | Objective | Proposed Model | Key Assertion | Dataset
[17] | The author aims to predict the future conditions of novel Coronavirus to recede its impact. | Stacked LSTM, Bi-directional LSTM, ConvLSTM | COVID-19 cases were forecasted for one month. Convolution LSTM outperforms the other two models. | [18], [19]
[20] | The goal is to forecast time series at large intervals. | Transformers | A self-attention-based transformer network with exceptional capability in time series forecasting is demonstrated. | Real-world TSF datasets
[21] | It demonstrates that COVID-19 spreads at varying rates and scales in distinct geographic regions. | Attention crossing time series model | Author created a new forecasting model called ACTS that produces predictions by analyzing patterns across several time-series data. | Johns Hopkins University
[22] | A worldwide forecasting tool that forecasts COVID-19 confirmed cases for the next seven days around the world. | ANN, LSTM, ARIMA | A comparative analysis between 4 different models is conducted, i.e. ANN, LSTM, ARIMA and CNN. | Johns Hopkins University
[23] | The aim is to forecast COVID-19 infection rate of 60 days for 4 different nations. | Fb-Prophet | Demonstrated a nonlinear association between total registered cases and confirmed cases, as well as a linear relationship between confirmed cases and deaths. | Johns Hopkins University
[24] | The aim is to forecast COVID-19 for the selected states of India using different models. | Encoder-Decoder LSTM and bidirectional LSTM | Authors compare the infection spread of different states with those regions where the infection spread has reached its peak. | [25]
[26] | Forecasting, predictive modeling, and creating a heat map representation of policy measures across countries. | ARIMA, LSTM, MLP | LSTM outperformed the other models in terms of forecast accuracy and lower error rate. | Johns Hopkins University
[27] | Forecasting COVID-19 data of nine countries using deep learning. | M-LSTM, MLP, Random Forest | The author utilized multiple models, compared their performances and effective features in forecasting. | Johns Hopkins University
[28] | Predicting the future cases of infection, based on the study of data mined from the internet search terms of people in the affected region. | GWO-LSTM (Grey Wolf Optimizer) | The author automated the process of hyperparameter-tuning using a meta-heuristic search algorithm, namely Grey Wolf Optimization (GWO). | Johns Hopkins University
[29] | Three hybrid approaches for forecasting COVID-19 using time series data, based on combining three deep learning models, are proposed. | Multi-head attention, CNN, LSTM | The author used deep learning and Bayesian optimization methodologies for hyperparameter-tuning. | Johns Hopkins University
[30] | A DL-based approach is used to forecast the rate of new cases and new death cases for three and seven days ahead. | LSTM, Convolutional LSTM, and GRU | Results show that forecasting for a longer horizon is harder than forecasting for a shorter horizon and Bi-directional models have lower error than other models. | [31]
[32] | The author proposed an ANN based online incremental learning technique to analyze the temporal dynamics of the disease spread. | ANN | The model is able to intelligently adapt to new ground realities in real-time, eliminating the need to retrain the model from scratch. | [33]
[34] | In order to estimate COVID-19 (SARS-CoV-2) evolution in Spain, a semi-parametric technique was proposed. | The Bayesian optimization with LSTM. | It combines new Deep Learning (DL) techniques for analyzing sequences with the traditional Bayesian Poisson–Gamma model. | [35]

Ardabili et al. [36] forecasted the COVID-19 spread for Italy, China, Iran, Germany, and the United States using the dataset of the Worldometer website [37]. They used a multilayer perceptron (MLP) and an adaptive neuro-fuzzy inference system (ANFIS) to forecast COVID-19 infection rates with significantly less error. They reported that integrating the susceptible–exposed–infectious–removed (SEIR) model showed promising results for increasing the accuracy of the MLP model. Gupta et al. [38] predicted three different parameters, the active rate, death rate, and recovered rate, using the data of [39]. Of the three machine learning models they compared, the support vector machine (SVM), FB Prophet, and linear regression, FB Prophet achieved the highest accuracy. US COVID-19 cases are studied using ML and statistical models in [40]. The study showed that the random forest model best captured the cumulative growth of COVID-19 in the US. The authors also concluded that the rise of COVID-19 was significantly affected by the implementation of shelter-in-place (SIP) orders, which caused a reduction in COVID-19 cases and set an example many countries followed.

Saba and Elsheikh [41] used two different approaches, nonlinear autoregressive artificial neural networks (NARANN) and the autoregressive integrated moving average (ARIMA), to forecast the growth of COVID-19 in Egypt. RMSE, MAE, the deviation ratio, and the coefficient of residual mass were used as performance metrics, and NARANN outperformed ARIMA. Neeraj et al. [42] proposed a new attention-based model named AttentionLSTM, which builds on the architecture of the LSTM model with an additional attention layer that stores information in a context vector with respect to time. Instead of attending over the hidden state vector as a whole, the authors used fine-grained attention, i.e. attention on the hidden state dimensions. The authors additionally added an embedding layer, known as Time2Vec, that converts time into vector representations. They employed the dataset obtained from Johns Hopkins University and forecasted the data of five different regions: Canada, North America, Italy, Spain, and France.

Stefano Cabras et al. [43] proposed a semi-parametric approach to forecast COVID-19 for the next 14 days using an LSTM deep learning model along with the Bayesian Poisson–Gamma model, using data from [44]. The data had not been recorded properly, so they used cumulative incidences over 14 days, which resulted in more reliable information about the pandemic. In [45], an LSTM deep learning model was used to forecast COVID-19 and its possible endpoint in Canada using data from Johns Hopkins University and the Canadian Health Authority. The authors found that the recovery rate played a vital role in forecasting COVID-19 more accurately. They applied a wavelet transformation to preserve the time–frequency component and minimize random noise in the dataset. They concluded that their model performed exceptionally well compared to existing models.

Arora et al. [46] forecasted COVID-19 using three different deep learning models: stacked LSTM, bidirectional LSTM, and convolutional LSTM. The data was collected from the Ministry of Health and Family Welfare. The bidirectional LSTM provided better results than the other two models. The dataset was used to forecast weekly and daily cases for all Indian states using historical data from March 2020 to May 2020. Devaraj et al. [47] forecasted the outbreak of COVID-19 using different machine learning and deep learning models, namely ARIMA, LSTM, stacked LSTM, and the FB Prophet model. They forecasted the cumulative confirmed, dead, and recovered cases for different cities in India. Because it was a multivariate time series approach, the correlation between the different input parameters was computed. Features like rainfall, cumulative infected cases, total population, area, temperature, and population density were used. The stacked LSTM performed better than the other models, with an error reduction of two percent. Tomar and Gupta [48] proposed a simple and effective LSTM-based approach to forecast COVID-19 for the next 30 days in India using the Johns Hopkins University dataset. They used this data to forecast total confirmed cases, daily confirmed cases, total recovered cases, and total deaths. They found that recoveries were as high as the total number of infections, but the time needed for recovery was longer than the time needed for viral transmission. Hence, recovery time needed to be improved by utilizing resources more effectively to achieve a healthier balance. The LSTM-based approach delivered 90 percent accuracy in predicting COVID-19 cases.

Huang et al. [49] presented an analysis of COVID-19 diagnosis and isolation procedures that effectively slowed the spread of the virus, as well as a reduction in casualties during a period of the rapid increase in diagnosed cases in Wuhan, China. In the early phases of a pandemic, the authors urged governments to implement forceful public health actions quickly. However, such strategies have not been as effective in other nations with similar populations, such as India, even though they would have helped curb COVID-19’s spread in the US and Brazil.

3. The proposed COVID-19 spread control policies forecasting using deep learning algorithm

In this section, we describe the pipeline of our deep learning model for forecasting COVID-19 cases. We employ a stacked Bi-LSTM-based approach that uses COVID-19 control policies to forecast confirmed cases, recovered cases, death cases, and people quarantined. We introduce restriction policy data and use this data to plot curves. Bi-LSTM is a type of RNN that combines LSTM layers in the forward and backward directions. It handles long-term dependencies well and is suited to processing and predicting time-series problems. First, the LSTM is applied in the forward direction over the input sequence and then in the reverse direction (i.e., the backward layer). Applying the LSTM in both directions improves the learning of long-term dependencies and the accuracy of the model. Before applying the model, we clean and normalize the data as preprocessing steps. The missing data are imputed using the forward and backward imputation methods. We apply feature extraction techniques to determine the dependent and independent variables in the data. A windowing technique is applied to the multivariate data to forecast values one and three days ahead from the data of the past three and seven days. Moreover, we split the data into training and testing sets. We use bi-directional LSTM layers along with the ReLU and Tanh activation functions. We fit the model on the training data and validate it with the testing data. A single dense layer is connected at the end to produce the final prediction, together with a dropout layer to avoid overfitting. Model loss is calculated using RMSE and MAE. We analyze three different curves under three different conditions, i.e., the curve before the COVID-19 peak, the curve during the COVID-19 peak (when prevention policies are implemented), and the forecast of the curve after restrictions are lifted, to observe the number of cases in all three periods.

Fig. 2 shows the overall architecture of the proposed COVID-19 Spread Control Policies Forecasting using Deep Learning Algorithm. We also provide a comparative analysis of Bi-LSTM and LSTM model with different policies. Among all approaches, the deep learning model stacked Bi-LSTM offers the best forecasting performance [50] with three restriction policies i.e. school closing, workspace closing, and facial covering. The overall methodology is composed of:

Fig. 2. Proposed architecture for COVID-19 spread control policies based early dynamics forecasting using deep learning algorithm.

  • Dataset description

  • Preprocessing and feature extraction

  • LSTM architecture for forecasting the future value of COVID-19 parameters.

  • Stacked Bi-LSTM

  • Performance metric

3.1. Dataset description

The dataset is obtained from different resources, the most prominent of which is the Novel Coronavirus (COVID-19) Cases dataset made available on the SNU ARIC website [51]. The dataset consists of a variety of parameters. As a pre-processing step, we select only the data for Korea and drop all parameters that are not relevant.

In Fig. 3, the x-axis represents days and the y-axis represents the policy applied by the government, as described below.

Fig. 3. Pictorial overview of the dataset parameters.

  • 1. OVAR1 (It represents the school closure period. 0 - no measures/1 - recommend closing/2 - require closing (but only for certain levels or categories, such as high school or public schools)/3 - require closing all levels/Blank - no data)

  • 2. OVAR3 (It represents the workplace closure period. 0 - no measures/1 - recommend closing (or recommend work from home)/2 - require closing (or work from home) for some sectors or categories of workers/3 - require closing (or work from home) for all-but-essential workplaces (e.g. grocery stores, doctors)/Blank - no data)

  • 3. OVAR5 (It represents the cancellation of public events. 0 - no measures/1 - recommend cancelling/2 - require cancelling/Blank - no data)

  • 4. OVAR7 (It represents restrictions on gatherings. 0 - restrictions on very large gatherings (the limit is above 1000 people)/1 - restrictions on gatherings between 101–1000 people/2 - restrictions on gatherings between 11–100 people/3 - restrictions on gatherings of 10 people or less/Blank - no data)

  • 5. OVAR9 (It represents the public transport closure period. 0 - no measures/1 - recommend closing (or significantly reduce volume/route/means of transport available)/2 - require closing (or prohibit most citizens from using it)/Blank - no data. Since we had no data for this parameter, we dropped it.)

  • 6. OVAR11 (It represents the period of confinement to the home or workplace. 0 - recommend not leaving house/1 - require not leaving the house with exceptions for daily exercise, grocery shopping, and ‘essential’ trips/2 - require not leaving the house with minimal exceptions (e.g. allowed to leave once a week, or only one person can leave at a time, etc.)/Blank - no data.)

  • 7. OVAR13 (It represents restrictions on internal movement between cities/regions. 0 - no measures/1 - recommend not to travel between regions/cities/2 - internal movement restrictions in place/Blank - no data)

  • 8. OVAR15 (It represents restrictions on international travel. Note: this records policy for foreign travelers, not citizens. 0 - no restrictions/1 - screening arrivals/1 - quarantine arrivals from some or all regions/2 - ban arrivals from some regions.)

  • 9. OVAR21 (It represents public information campaign data. Since we do not have many records for this variable, we dropped it.)

  • 10. OVAR24 (It represents contact-tracing data. 0 - no contact tracing/1 - limited contact tracing; not done for all cases/2 - comprehensive contact tracing; done for all identified cases.)

  • 11. OVAR40 (It represents the facial-covering policy period. 0 - Recommended/1 - Required in some specified shared/public spaces outside the home with other people present, or some situations when social distancing is not possible/2 - Required in all shared/public spaces outside the home with other people present or all situations when social distancing is not possible/3 - Required outside the home at all times regardless of location or presence of other people.)

  • 12. MVAR1 (It represents the total number of confirmed patients on the y-axis and the number of days on the x-axis)

  • 13. MVAR2 (It represents the total number of COVID-19 deaths on the y-axis and the number of days on the x-axis)

  • 14. MVAR3 (It represents the total number of recovered patients on the y-axis and the number of days on the x-axis)

  • 15. MVAR4 (It represents the total number of people quarantined on the y-axis and the number of days on the x-axis)

These parameters are input into the deep learning model, and four parameters are forecasted as output as shown in Fig. 4, which are:

Fig. 4. Dataset target variables.

  • 1. MVAR1 (It represents the total number of positive/confirmed patients on the y-axis and the number of days on the x-axis)

  • 2. MVAR2 (It represents the total number of COVID-19 deaths on the y-axis and the number of days on the x-axis)

  • 3. MVAR3 (It represents the total number of recovered patients on the y-axis and the number of days on the x-axis)

  • 4. MVAR4 (It represents the total number of people quarantined on the y-axis and the number of days on the x-axis)

The dataset contains 536 daily samples collected from 23 January 2020 to 12 July 2021 [51]. We analyze it under three different conditions: (1) the COVID-19 curve before the implementation of restriction policies, (2) the COVID-19 curve during the implementation of restriction policies, and (3) the COVID-19 forecasting curve after the restrictions have been lifted. The resulting forecasting curves show the importance and benefits of these policies in controlling COVID-19. The real-world data has missing values. Missing data points within the time frame were replaced using forward and backward imputation: forward imputation replaces a missing value with the previous value, and backward imputation fills it with the following data point. Outliers in the data are removed using the box plot technique. Furthermore, the data is split into a training set of 480 sample days and a set of 56 sample days for testing and validation. Finally, the data is scaled between 0 and 1 using the min–max scaler technique.
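The box plot rule and the 480/56 split can be sketched as follows. This is only an illustration under stated assumptions: the whisker factor of 1.5, the function names, and the synthetic series are ours, and the text does not say how flagged outliers were subsequently handled.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Flag values outside the box-plot whiskers (k = 1.5 is an assumed default)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

def chronological_split(data, n_train=480):
    """Keep time order: the first n_train days train, the remainder test."""
    return data[:n_train], data[n_train:]

# Synthetic stand-in for one daily series of 536 samples (not the real data).
series = np.arange(536, dtype=float)
series[100] = 1e6                      # inject one artificial spike

mask = iqr_outlier_mask(series)        # True only at the injected spike
train, test = chronological_split(series)
```

Splitting by position rather than at random keeps every test day later than every training day, which matches the forecasting setting.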

3.2. Preprocessing and feature extraction

The data is preprocessed using various techniques. First, the dates are parsed into years, months, and days. Second, if a row contains more than 5 NaN values out of 14 columns, the whole row is removed. Otherwise, if a few columns (2–3) have NaN values, the missing data are imputed using forward and backward imputation in the respective feature column. In forward imputation, a missing entry is filled with the previous row value, and in backward imputation, it is filled with the next available row value. For example, in our dataset, the data in the positive cases (MVAR1) column from Jan 27, 2020, to Jan 30, 2020, is missing, so it was replaced with the previous available value of Jan 26, 2020, which was ‘3’. The data in the quarantined cases (MVAR4) column is missing for Feb 5, 2020, so it was replaced with the next row value (23) from Feb 6, 2020. The same approach is applied to fill in the remaining missing values, and after that, the data is normalized between 0 and 1 using min–max scaling. The formula is described in Eq. (1).

$$x_{\text{scaled}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
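A minimal pandas sketch of the row filter, the two imputation directions, and Eq. (1) is given below. The toy frame is an assumption made for illustration: only the values 3 (MVAR1) and 23 (MVAR4) come from the text, and the remaining numbers are invented placeholders.

```python
import numpy as np
import pandas as pd

# Toy frame, not the real dataset.
df = pd.DataFrame(
    {
        "MVAR1": [3.0, np.nan, np.nan, 4.0],   # gap to be filled forward
        "MVAR4": [np.nan, 23.0, 30.0, 41.0],   # gap to be filled backward
    }
)

# Drop rows with more than 5 missing values, as described in the text.
max_missing = 5
df = df[df.isna().sum(axis=1) <= max_missing]

filled = df.copy()
filled["MVAR1"] = df["MVAR1"].ffill()   # forward imputation: previous row value
filled["MVAR4"] = df["MVAR4"].bfill()   # backward imputation: next row value

def min_max_scale(frame):
    """Column-wise Eq. (1): (x - x_min) / (x_max - x_min)."""
    lo, hi = frame.min(), frame.max()
    return (frame - lo) / (hi - lo)

scaled = min_max_scale(filled)
```

In practice the minimum and maximum would typically be computed on the training portion only and reused for the test portion; whether the original pipeline did so is not stated.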

Once the data is normalized, a windowing technique is applied to the dataset. Each window takes the data of the previous 3 days as input and predicts the data point of the next day.
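The windowing step can be illustrated with the short sketch below. The function name, the positions of the four target columns, and the synthetic array are assumptions; the look-back of 3 days and horizon of 1 day follow the text.

```python
import numpy as np

def make_windows(data, target_cols, lookback=3, horizon=1):
    """Slice a (days, features) array into (input window, future target) pairs."""
    X, y = [], []
    for start in range(len(data) - lookback - horizon + 1):
        end = start + lookback
        X.append(data[start:end])                        # past `lookback` days, all features
        y.append(data[end:end + horizon, target_cols])   # next `horizon` days, targets only
    return np.array(X), np.array(y)

# Synthetic stand-in: 10 days x 14 features (values are just running integers).
data = np.arange(10 * 14, dtype=float).reshape(10, 14)

# Assumed layout: the four forecast variables sit in columns 0-3.
X, y = make_windows(data, target_cols=[0, 1, 2, 3], lookback=3, horizon=1)
```

With 10 days of data this yields 7 samples, each pairing a 3 x 14 input window with a 1 x 4 target; setting `lookback=7` and `horizon=3` gives the longer configuration mentioned earlier in Section 3.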

3.3. LSTM architecture

Recurrent neural networks (RNNs) are a kind of deep learning method that have proven to be effective predictive models [52] due to the automatic extraction of relevant features from the training samples. An RNN feeds the activation from the previous time step as input to the current time step through self-connections in the network. RNNs are good at processing sequential data and exhibit great potential in time-series prediction [53] by storing a large amount of historical information in their internal state. However, they suffer from the vanishing and exploding gradient problems, which lead to long training times or outright training failures. To address these shortcomings, [54] designed the LSTM RNN structure in 1997 to deal with long-term dependencies by means of multiplicative gates that regulate the information flow and memory cells in the recurrent hidden layer. The LSTM structure consists of four gates: the input gate, forget gate, control gate, and output gate, which are shown in Fig. 5 [55].

Fig. 5. LSTM cell working.

The input gate is defined by Eq. (2) as:

$$i_t = \sigma(W_i \times [h_{t-1}, x_t] + b_i) \tag{2}$$

The input gate decides which information can be transferred to the cell. The forget gate decides which information from the previous memory should be discarded and is defined by Eq. (3) as:

$$f_t = \sigma(W_f \times [h_{t-1}, x_t] + b_f) \tag{3}$$

The update of the cell is controlled by the control gate and is given by Eqs. (4), (5).

$$CI_t = \tanh(W_c \times [h_{t-1}, x_t] + b_c) \tag{4}$$
$$C_t = \tanh(f_t \times C_{t-1} + i_t \times CI_t) \tag{5}$$

where the control gate is represented by $CI_t$. The output gate updates the hidden state (from $h_{t-1}$ to $h_t$) and is also responsible for producing the output, as given by Eqs. (6), (7).

$$O_t = \sigma(W_o \times [h_{t-1}, x_t] + b_o) \tag{6}$$
$$h_t = O_t \times \tanh(C_t) \tag{7}$$

In the above equations, tanh scales the values into the range −1 to 1, σ is the sigmoid activation function, and each W is the weight matrix of the corresponding gate.
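To make the gate equations concrete, the following NumPy sketch performs the computation of Eqs. (2)–(7) for a few time steps. The dimensions (14 inputs, 8 hidden units), the random weights, and the dictionary layout are illustrative assumptions, not values from the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; each W[k] acts on the concatenation [h_prev, x_t]."""
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate, Eq. (2)
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate, Eq. (3)
    ci_t = np.tanh(W["c"] @ z + b["c"])       # control gate / candidate, Eq. (4)
    # Cell update as written in Eq. (5); many references omit the outer tanh here.
    c_t = np.tanh(f_t * c_prev + i_t * ci_t)
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate, Eq. (6)
    h_t = o_t * np.tanh(c_t)                  # hidden state, Eq. (7)
    return h_t, c_t

# Assumed sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hid = 14, 8
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "ifco"}
b = {k: np.zeros(n_hid) for k in "ifco"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(3, n_in)):        # a 3-day input window
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the output gate lies in (0, 1) and tanh in (−1, 1), every component of the hidden state stays strictly inside (−1, 1).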

3.4. Stacked bidirectional LSTM

The proposed model uses a stacked bidirectional LSTM architecture. It consists of two bidirectional LSTM layers with multiple units in each layer, as shown in Fig. 6. The Bi-LSTM network forecasts the time series over the given interval, with the data of the previous time steps fed in as input. To elicit deeper insights from the data and to study the varying trends, multiple bi-directional LSTM layers are stacked together to construct an accurate model with a high-level, deeper representation [47]. The representations of each layer are learned further by the subsequent layer for better optimization.

Fig. 6. Stacked bidirectional LSTM.

Conventionally, an LSTM processes the data in one direction only. In other words, if something important at a given time step is linked to the future as well as the past, that information has a chance of being missed. To overcome this issue, the bidirectional LSTM concept was introduced. A bidirectional LSTM processes data in both directions, i.e. in the forward and in the backward direction simultaneously. This helps the Bi-LSTM retain contextual information from both the past and the future. The combination of two bidirectional LSTM layers yields a multi-stack bi-directional LSTM model. The first layer receives the input and passes its output on to the next layer for further processing and the extraction of meaningful insights from the data.

The mathematical formulation of the Bi-LSTM follows from that of the LSTM and is given by Eqs. (8), (9), (10), (11), (12).

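$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{8}$$
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{9}$$
$$O_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{10}$$
$$\tilde{c} = f_t \times C_{t-1} \tag{11}$$
$$h_t = O_t \times \tanh(C_t) \tag{12}$$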

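where $x_t$, $\sigma$, and $C_t$ represent the input sample at time $t$, the sigmoid activation function, and the memory unit, respectively. $(b_f, b_i, b_o)$ and $(W_f, W_i, W_o)$ stand for the bias and weight matrix of each gate, respectively, and $(U_f, U_i, U_o)$ are the weights applied to the previous hidden state $h_{t-1}$.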

To improve on the learning capability of the classical LSTM model, the Bi-LSTM considers the two-way temporal relationship of the input data. In other words, it processes the information forward and backward simultaneously, which gives a better understanding of the contextual information in the data. Each Bi-LSTM block consists of a forward and a backward layer, whose outputs are combined using Eqs. (13), (14).

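$$h_t = \beta h_t^{f} + \gamma h_t^{b} \tag{13}$$
$$y_t = \tanh(h_t) \tag{14}$$

where $h_t^{f}$ and $h_t^{b}$ are the outputs of the forward and backward layers, and $\beta$ and $\gamma$ are numerical factors satisfying $\beta + \gamma = 1$.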
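As an illustration of Eqs. (13), (14), the sketch below merges the per-time-step outputs of a forward and a backward pass with a weighted sum and applies tanh. The function name and the example values are hypothetical, and equal weights (β = γ = 0.5) are only an assumed default. Note that this weighted sum is the formulation given in the text; the Keras `Bidirectional` wrapper, by contrast, concatenates the two directions by default.

```python
import math


def combine_bidirectional(h_forward, h_backward, beta=0.5):
    """Merge forward and backward hidden states as in Eqs. (13), (14).

    h_forward and h_backward hold one value per time step, aligned so that
    index t refers to the same time step in both lists.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if len(h_forward) != len(h_backward):
        raise ValueError("both directions must cover the same time steps")
    gamma = 1.0 - beta                       # enforces beta + gamma = 1
    h = [beta * hf + gamma * hb for hf, hb in zip(h_forward, h_backward)]
    y = [math.tanh(value) for value in h]    # Eq. (14)
    return h, y


# Hypothetical hidden states for a three-step sequence.
h_combined, y_out = combine_bidirectional([0.2, 0.4, 0.6], [0.5, 0.3, 0.1])
```

Setting `beta=1.0` recovers a purely forward (unidirectional) model, which is one way to see the Bi-LSTM as a generalization of the LSTM.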

The proposed model was developed using TensorFlow 2.5, a deep learning framework for Python.

3.5. Performance metric

The performance of the stacked bidirectional LSTM model is evaluated using the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The best model gave an overall RMSE of 214.50, a MAPE of 0.0143 and an MAE of 99.251. The RMSE is calculated by Eq. (15).

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{15}$$

Similarly, the MAE is obtained using Eq. (16).

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left| y_i - \lambda(x_i) \right|}{n} \tag{16}$$

Finally, the MAPE is obtained using Eq. (17).

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{17}$$
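The three metrics in Eqs. (15), (16), (17) can be sketched in a few lines of plain Python. The function names and the sample numbers are hypothetical; the sketch assumes that `actual` holds the observed values and `forecast` the model outputs, and it returns the MAPE as a fraction rather than a percentage, since Eq. (17) has no factor of 100.

```python
import math


def rmse(actual, forecast):
    """Root mean square error, Eq. (15)."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error, Eq. (16)."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mape(actual, forecast):
    """Mean absolute percentage error as a fraction, Eq. (17).

    Undefined when any actual value is zero.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    n = len(actual)
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Hypothetical case counts, for illustration only.
actual_cases = [100.0, 200.0, 400.0]
forecast_cases = [110.0, 190.0, 400.0]
scores = {
    "RMSE": rmse(actual_cases, forecast_cases),
    "MAE": mae(actual_cases, forecast_cases),
    "MAPE": mape(actual_cases, forecast_cases),
}
```

RMSE penalizes large individual errors more heavily than MAE, so on the same forecast the RMSE is never smaller than the MAE.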

4. Results and discussions

In this section, we discuss the results of the stacked Bi-LSTM forecasting. The results revolve around two main methods of forecasting COVID-19: a simple two-layer LSTM model and the stacked Bi-LSTM model. Both networks have a similar configuration, and both are trained on different combinations of policies to determine which combination best improves the forecasting accuracy.

4.1. Features representation

The South Korean COVID-19 dataset includes the positive rate, death rate, recovered rate, quarantine rate, and ten other mobility policies. The policies are included in the dataset to observe the effect of the restriction levels on the forecasting curve and accuracy. The stacked Bi-LSTM model is trained on different feature sets to determine which policy has the largest effect on prediction accuracy. The dataset features and target values are visualized in Fig. 3, Fig. 4.

4.2. Training LSTM and stacked Bi-LSTM models

ARIMA, SARIMA, VAR, LSTM, and the proposed stacked Bi-LSTM are employed to forecast COVID-19. Most of the comparison is made between LSTM and Bi-LSTM. The LSTM network consists of two LSTM layers, with 184 units in the first layer and 164 units in the second, followed by a single dense layer with no activation function. In contrast, the proposed approach uses a stacked Bi-LSTM, which consists of two Bi-LSTM layers with the same network configuration as the LSTM. Table 2 lists the layers and units used in the models.

Table 2. Network architecture and parameters.

Network | No. of layers | No. of units per layer | Loss | Learning rate | Activation function | Optimizer | No. of epochs | No. of variables fed as input
LSTM | 02 | 184, 164 | MAE | ReduceLROnPlateau | Tanh | Adam | 800 | 5, 6, 7, 14
LSTM | 02 | 184, 164 | MAE | ReduceLROnPlateau | ReLU | Adam | 800 | 7
Stacked Bi-LSTM | 02 | 184, 164 | MAE | ReduceLROnPlateau | Tanh | Adam | 800 | 5, 6, 7, 14
Stacked Bi-LSTM | 02 | 184, 164 | MAE | ReduceLROnPlateau | ReLU | Adam | 800 | 7

Stacked Bi-LSTM network graph architecture is shown in Fig. 8.

Fig. 8. Stacked Bi-LSTM architecture.

With two stacked bidirectional LSTM layers, we achieved higher accuracy than the LSTM [50] with the same network configuration. The model also surpasses the traditional forecasting models ARIMA, SARIMA, and VAR; the accuracy comparison is shown in Table 14, Table 15. Increasing the number of layers or units made the model fail to converge to a good minimum and degraded its accuracy. The parameters presented in Table 2 were obtained after several iterations.

We train the model using the Tanh and ReLU activation functions. The default activation function in the Bi-LSTM layer is Tanh; we replaced it with ReLU to observe the change in accuracy, using only the best forecasting model, which is trained on seven features (three policies and four target parameters) as listed in Table 2. The accuracy with Tanh is higher than with ReLU, so the default activation function outperforms ReLU. The forecasting accuracy with ReLU and Tanh is shown in Table 10, Table 11.

The data are normalized to the range 0 to 1. Adam is used as the optimizer; the model did not perform well with other optimizers such as SGD, RMSprop, and Adadelta. An early stopping callback monitors the validation loss in minimum mode with a patience of 100. The learning rate is reduced by a factor of 0.3 when the validation loss has not improved for 80 epochs. Fig. 7 shows the learning curve of the best-performing stacked Bi-LSTM model (with seven parameters).
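The normalization step mentioned above can be sketched as min-max scaling. This is a minimal illustration, not the authors' preprocessing code: the function names and sample values are hypothetical, and fitting the minimum and maximum on the training portion only is an assumption, since the text does not say which portion of the data the scaling was fitted on.

```python
def fit_min_max(train_values):
    """Return the (minimum, maximum) of the training values."""
    lo, hi = min(train_values), max(train_values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return lo, hi


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi):
    """Invert scale(), returning values in the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical cumulative case counts, for illustration only.
train_cases = [10.0, 20.0, 30.0]
lo, hi = fit_min_max(train_cases)
train_scaled = scale(train_cases, lo, hi)
restored = unscale(train_scaled, lo, hi)
```

Forecasts produced in the scaled space have to be passed through `unscale` before error metrics such as MAE can be reported in numbers of cases.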

Table 14. Forecasting/prediction accuracy comparison between VAR, LSTM and Bi-LSTM in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using three policies.

Policy | MAE P | MAE D | MAE R | MAE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing, Workspace closing & Public event cancellation (Bi-LSTM; values as in Table 8) | 186.55 | 6.809 | 179.21 | 322.13 | 276.41 | 7.649 | 234.88 | 436.40
School closing, Workspace closing & Public event cancellation (VAR) | 27548 | 660.96 | 15297.5 | 11590 | 36690 | 876.1 | 22775 | 13836
School, Workspace closing & Public event cancellation (LSTM) | 3401.8 | 27.70 | 3745.6 | 589.31 | 4198.5 | 39.47 | 4441.4 | 726.14

Table 15. Forecasting/prediction accuracy comparison between VAR, LSTM, Bi-LSTM, ARIMA and SARIMAX in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using all 10 policies.

Policy | MAE P | MAE D | MAE R | MAE Q | RMSE P | RMSE D | RMSE R | RMSE Q
All 10 policies (Bi-LSTM; values as in Table 8) | 787.63 | 21.238 | 422.66 | 344.34 | 1042.3 | 29.120 | 559.05 | 524.42
All 10 policies (VAR) | 8655.9 | 54.173 | 2275.23 | 6637.7 | 11820 | 63.24 | 3414.8 | 8515.6
All 10 policies (LSTM) | 6667.1 | 24.06 | 11493.5 | 1436.3 | 7990.0 | 36.90 | 12399.8 | 1568.9
ARIMA (No policy) | 6338.5 | 24.05 | 2678.1 | 2005.5 | 7462.3 | 38.31 | 3217.9 | 2380.9
SARIMA (No policy) | 6075.39 | 168.56 | 5578.2 | 1156.8 | 7069.9 | 200.18 | 6787.3 | 1599.7

Table 10. Forecasting/prediction accuracy over 7 parameters (3 control policies and 4 target variables) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using Bi-LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing, Workspace closing & Public event cancellation (values as in Table 8) | 186.55 | 6.809 | 179.21 | 322.13 | 0.0011 | 0.0034 | 0.0012 | 0.038 | 276.41 | 7.649 | 234.88 | 436.40
School closing, Workspace closing & Public event cancellation (with ReLU) | 1572.8 | 59.79 | 2517.8 | 739.70 | 0.010 | 0.028 | 0.017 | 0.088 | 1863.5 | 74.60 | 2829.9 | 883.2
School closing, Public event cancellation & Facial covering | 307.11 | 14.413 | 376.14 | 359.51 | 0.0020 | 0.0072 | 0.0026 | 0.042 | 448.32 | 15.10 | 427.80 | 532.09
School closing, large gathering restriction & Facial covering | 294.01 | 6.440 | 150.03 | 379.61 | 0.0018 | 0.0032 | 0.0010 | 0.045 | 403.19 | 7.105 | 187.43 | 520.88

Table 11. Forecasting/prediction accuracy over 7 parameters (3 control policies and 4 target variables) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School, Workspace closing & Public event cancellation | 3401.8 | 27.70 | 3745.6 | 589.31 | 0.022 | 0.013 | 0.026 | 0.070 | 4198.5 | 39.47 | 4441.4 | 726.14
School, Workspace closing & Public event cancellation (with ReLU) | 1082.7 | 70.84 | 1112.5 | 1332.6 | 0.007 | 0.033 | 0.007 | 0.148 | 1230.1 | 96.10 | 1417.8 | 1554.5
School closing, Public event cancellation & Facial covering | 3615.6 | 152.5 | 2292.1 | 522.30 | 0.023 | 0.070 | 0.016 | 0.062 | 4417.6 | 161.30 | 2792.3 | 780.03
School closing, large gathering restriction & Facial covering | 4865.0 | 19.165 | 6434.8 | 878.2 | 0.032 | 0.009 | 0.047 | 0.106 | 6584.2 | 19.480 | 7711.6 | 1251.5

Fig. 7. Training and validation loss. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 7 shows that the weights are initialized with very small values, which results in large errors at the beginning of training. As training proceeds, the weights converge and the error approaches zero. Because the data are normalized between 0 and 1, the loss drops sharply at the beginning. Later in training, the validation loss flattens, which indicates that the model is stuck in a local minimum; this happens when the learning rate is very low. To keep the model from getting stuck, we use a learning rate scheduler that adapts the learning rate during training, after which the validation loss continues to improve as the number of iterations increases. The blue line represents the training loss and the orange line the validation loss. The y-axis shows the loss, which ranges between 0 and 1 because the data are normalized, and the x-axis shows the number of iterations.
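The scheduling rule can be illustrated with a simplified, framework-free sketch of a reduce-on-plateau schedule. It is not the Keras `ReduceLROnPlateau` implementation (it ignores options such as a minimum improvement threshold and cooldown), the class name is hypothetical, and the initial learning rate of 0.001 is an assumption because the text does not state it. The factor of 0.3 and the patience of 80 are the values reported for the model.

```python
class PlateauScheduler:
    """Multiply the learning rate by `factor` after `patience` epochs
    without a new best validation loss (simplified sketch)."""

    def __init__(self, lr=0.001, factor=0.3, patience=80, min_lr=0.0):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")   # lowest validation loss seen so far
        self.wait = 0              # epochs since the last improvement

    def step(self, val_loss):
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0      # start counting again after a reduction
        return self.lr


# Toy run with a short patience so the reduction is visible.
demo = PlateauScheduler(lr=1.0, factor=0.3, patience=2)
demo_rates = [demo.step(loss) for loss in [1.0, 0.9, 0.9, 0.9]]
```

In the toy run the loss improves once and then stalls for two epochs, so the rate stays at 1.0 for three epochs and drops to 0.3 on the fourth.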

4.3. Forecasting curve

The predicted numbers of positive, death, recovered, and quarantined cases in South Korea follow almost the same curve as the original data. The model looks at the previous three days of data and forecasts the next three days. We used 480 days of data for training and 56 days of data for testing. The totals are around 170,000 positive cases, 2200 death cases, 160,000 recovered cases, and 185,000 quarantined cases.
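The three-days-in, three-days-out setup can be sketched as a sliding window over the daily series. The function name and the placeholder data are hypothetical, and the sketch assumes that the windows slide one day at a time and that the training and test periods are windowed separately; the text does not specify either detail. In the setting described here, each element of `series` would be one day's feature row, and the targets would keep only the four forecast variables.

```python
def make_windows(series, lookback=3, horizon=3):
    """Split a daily series into (input, target) pairs.

    Each input holds `lookback` consecutive days and each target holds
    the `horizon` days that immediately follow it.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        split = start + lookback
        inputs.append(series[start:split])
        targets.append(series[split:split + horizon])
    return inputs, targets


# Placeholder day indices standing in for 536 daily feature rows.
days = list(range(536))
train_days, test_days = days[:480], days[480:]   # 480 training, 56 test days
train_x, train_y = make_windows(train_days)
test_x, test_y = make_windows(test_days)
```

Splitting before windowing keeps every test target strictly after the training period, so no test day leaks into a training window.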

4.3.1. Forecasting using one policy with target variables data

First, we train the model with a single policy and the four target variables, giving five input features and four output features. There are ten policies, and we use them one at a time during training to observe which performs best. The policy used by each model is named in the figure title. The forecasted parameters are the numbers of positive COVID-19 cases, death cases, recovered cases and quarantined cases. In Figs. 9 to 18, the red line shows the forecasted curve and the green line the original number of cases. Each policy is used to train both the LSTM and the stacked Bi-LSTM model, and the stacked Bi-LSTM gave better results than the LSTM. The error rates are compared in Table 4, Table 5, which also show which single policies give the minimum error when forecasting the target parameters. The policies with the lowest error rates are then combined in pairs as inputs to the forecasting models. The single policies that gave good results are school closing, workplace closing, public event cancellation, large gathering restriction, international travel restriction, public information campaign, contact tracing, and facial covering.

Fig. 9. Forecasting curve (school closing policy).

Fig. 10. Forecasting curve (closing of workplaces).

Fig. 11. Forecasting curve (canceling of public event).

Fig. 12. Forecasting curve (restrictions on large gathering).

Fig. 13. Forecasting curve (stay at home restriction).

Fig. 14. Forecasting curve (internal movement between cities restriction).

Fig. 15. Forecasting curve (restriction on international travel).

Fig. 16. Forecasting curve (public information campaigns).

Fig. 17. Forecasting curve (contact tracing policy).

Fig. 18. Forecasting curve (facial covering policy).

Table 4. Forecasting/prediction accuracy over 5 parameters (single control policy and 4 target variables) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing | 4402.3 | 90.68 | 7089.7 | 612.7 | 2.930 | 4.298 | 5.234 | 7.462 | 5846.1 | 97.94 | 7874.8 | 968.51
Workplace closing | 1702.2 | 9.199 | 899.6 | 659.8 | 0.010 | 0.0045 | 0.0061 | 0.079 | 2350.1 | 12.78 | 1541.9 | 872.0
Public event cancellation | 2432.8 | 13.98 | 3908.2 | 644.38 | 0.015 | 0.006 | 0.027 | 0.077 | 4631.1 | 24.24 | 4238.1 | 926.8
Large gathering restriction | 1269.6 | 55.35 | 2470.1 | 548.25 | 0.0084 | 0.026 | 0.017 | 0.065 | 1550.4 | 70.68 | 2862.1 | 797.03
Stay at home restriction | 1060.0 | 87.76 | 2706.2 | 735.31 | 0.0067 | 0.040 | 0.019 | 0.087 | 1577.0 | 119.90 | 3037.2 | 946.92
Internal movement restriction | 1705.8 | 41.27 | 1461.3 | 897.7 | 1.121 | 1.933 | 1.045 | 11.89 | 2228.4 | 68.24 | 1640.9 | 1470.8
International travel restriction | 3169.7 | 12.59 | 3453.9 | 695.35 | 0.020 | 0.006 | 0.024 | 0.083 | 4730.1 | 15.71 | 4561.8 | 964.4
Public information campaign | 1824.1 | 120.3 | 2135.8 | 675.8 | 0.011 | 0.056 | 0.014 | 0.083 | 3071.4 | 130.27 | 2810.6 | 1104.8
Contact tracing policy | 1836.9 | 112.04 | 2116.7 | 858.4 | 0.012 | 0.052 | 0.014 | 0.101 | 2606.3 | 125.44 | 2516.3 | 1078.6
Facial covering | 4128.0 | 16.22 | 2699.3 | 863.9 | 0.027 | 0.0080 | 0.019 | 0.104 | 5968.1 | 16.86 | 3391.9 | 1235.9

Table 5. Forecasting/prediction accuracy over 5 parameters (single control policy and 4 target variables) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using Bi-LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing | 195.73 | 2.727 | 238.05 | 236.22 | 0.125 | 0.136 | 0.166 | 2.828 | 296.67 | 3.123 | 350.8 | 330.66
Workplace closing | 174.17 | 5.014 | 265.26 | 288.56 | 0.0011 | 0.0024 | 0.0018 | 0.0390 | 222.08 | 9.447 | 337.10 | 373.21
Public event cancellation | 145.70 | 8.062 | 352.14 | 312.16 | 0.0009 | 0.004 | 0.0024 | 0.037 | 188.19 | 9.356 | 446.16 | 420.59
Large gathering restriction | 186.09 | 3.699 | 113.48 | 225.13 | 0.0012 | 0.0018 | 0.0008 | 0.026 | 226.11 | 4.611 | 149.07 | 314.37
Stay at home restriction | 440.09 | 12.592 | 206.87 | 339.2 | 0.002 | 0.006 | 0.001 | 0.047 | 523.08 | 24.01 | 288.28 | 443.26
Internal movement restriction | 580.17 | 6.724 | 190.01 | 539.27 | 0.365 | 0.329 | 0.131 | 6.664 | 1037.8 | 11.805 | 330.91 | 957.63
International travel restriction | 313.66 | 8.592 | 230.76 | 383.32 | 0.002 | 0.0043 | 0.0016 | 0.045 | 484.39 | 9.076 | 275.63 | 576.78
Public information campaign | 178.84 | 3.536 | 302.10 | 332.3 | 0.0011 | 0.0017 | 0.0021 | 0.039 | 220.72 | 4.185 | 376.02 | 459.9
Contact tracing policy | 143.35 | 4.08 | 521.46 | 317.90 | 0.0009 | 0.002 | 0.003 | 0.039 | 181.09 | 5.592 | 651.73 | 425.56
Facial covering | 228.74 | 5.354 | 165.78 | 269.27 | 0.0014 | 0.0026 | 0.0011 | 0.032 | 346.72 | 6.058 | 218.90 | 326.52

4.3.2. Forecasting using 2 policies with target variables data

The best-performing single policies are paired, and each pair is fed to the model together with the four forecasting parameters. For example, school closing is paired with workspace closing, and cancellation of public events with the public information campaign. Figs. 19, 20, 21, and 22 show the curve fits for the paired policies. The Bi-LSTM model gives the best results with the following combinations:

  1. Facial covering and large gathering restriction
  2. School closing and workspace closing policy
  3. School closing and canceling of public events

Fig. 19. Forecasting curve (school closing & canceling of public events).

Fig. 20. Forecasting curve (canceling of public event & public information campaign).

Fig. 21. Forecasting curve (facial covering & restriction on large gathering policy).

Fig. 22. Forecasting curve (school closing & workspace closing policy).

Almost all the curves in the figures look similar, but their error rates differ, as shown in Table 3, Table 7. The tables show that the model trained with the school closing and workspace closing policies has a slightly lower error rate than the models trained without these two policies. This pair can yield better results still when combined with a third policy that positively affects model accuracy.

Table 3. Forecasting/prediction accuracy with 6 parameters (two control policies and 4 target parameters) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using Bi-LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing & Canceling of public event | 229.09 | 5.038 | 211.08 | 388.17 | 0.145 | 0.251 | 0.151 | 4.598 | 372.09 | 6.329 | 252.54 | 539.99
Facial covering & Large gathering restriction | 318.45 | 1.650 | 206.91 | 364.47 | 0.0020 | 0.0008 | 0.0014 | 0.044 | 534.29 | 2.226 | 324.86 | 461.06
Public event cancellation & Public information campaign | 320.81 | 8.314 | 479.84 | 316.65 | 0.0021 | 0.0041 | 0.0034 | 0.037 | 356.65 | 10.648 | 552.77 | 437.56
School & Workspace closing policy | 220.85 | 6.919 | 132.25 | 316.97 | 0.141 | 0.346 | 0.093 | 3.737 | 317.09 | 7.726 | 167.43 | 449.36

Table 7. Forecasting/prediction accuracy over 6 parameters (2 control policies and 4 target variables) in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing & Canceling of public event | 3002.0 | 51.68 | 5583.6 | 658.55 | 1.969 | 2.727 | 4.066 | 7.890 | 3900.9 | 62.18 | 6192.5 | 819.9
Facial covering & Large gathering restriction | 2804.1 | 9.690 | 2467.0 | 811.20 | 0.018 | 0.004 | 0.017 | 0.102 | 4677.9 | 10.899 | 3377.3 | 1296.8
Public event cancellation & Public information campaign | 3066.2 | 23.48 | 2342.3 | 650.7 | 0.0202 | 0.011 | 0.016 | 0.079 | 4121.3 | 29.21 | 3337.2 | 1047.2
School & Workspace closing policy | 7670.7 | 67.30 | 8424.8 | 658.6 | 5.283 | 3.233 | 6.294 | 7.989 | 8748.3 | 72.25 | 9249.3 | 987.01

4.3.3. Forecasting using 3 policies data with target variables data

After obtaining the best results with pairs of policies, we train the model on combinations of three policies to see whether the accuracy improves further. We form the combinations from the policies that gave the minimum MAE while forecasting. The three-policy combinations used as input are:

  1. School closing, workspace closing and public event cancellation
  2. School closing, public event cancellation & facial covering policy
  3. School closing, large gathering restriction & facial covering policy

Their forecasting curves are shown in Fig. 23, Fig. 24, Fig. 25. Among these combinations, school closing, workspace closing, and public event cancellation improved the model's accuracy the most, giving the minimum MAE (Table 10, Table 11). Table 10 reports the error of the Bi-LSTM model and Table 11 that of the LSTM model, so the accuracy of the two models can be compared. Figs. 12 to 26 show only the results of the Bi-LSTM model.
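The feature sets used in Sections 4.3.1 to 4.3.3 can be sketched as the four target variables plus one, two, or three policy columns, which gives the 5, 6, 7 and 14 input variables listed in Table 2. The column names below are hypothetical labels chosen for illustration and are not the dataset's actual field names; the three combinations are those listed above.

```python
# Hypothetical column names; the real dataset may label these differently.
TARGETS = ["positive", "death", "recovered", "quarantined"]
POLICIES = [
    "school_closing",
    "workplace_closing",
    "public_event_cancellation",
    "large_gathering_restriction",
    "stay_at_home_restriction",
    "internal_movement_restriction",
    "international_travel_restriction",
    "public_information_campaign",
    "contact_tracing",
    "facial_covering",
]


def feature_set(policies):
    """Return the input columns: four targets plus the chosen policies."""
    unknown = [p for p in policies if p not in POLICIES]
    if unknown:
        raise ValueError(f"unknown policies: {unknown}")
    return TARGETS + list(policies)


# One model per single policy (5 inputs each).
single_sets = [feature_set([p]) for p in POLICIES]

# The three-policy combinations listed in the text (7 inputs each).
triple_sets = [
    feature_set(["school_closing", "workplace_closing",
                 "public_event_cancellation"]),
    feature_set(["school_closing", "public_event_cancellation",
                 "facial_covering"]),
    feature_set(["school_closing", "large_gathering_restriction",
                 "facial_covering"]),
]

# All ten policies together (14 inputs).
all_policies_set = feature_set(POLICIES)
```

Only the input width changes between experiments; in every case the model forecasts the same four target variables.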

Fig. 23. Forecasting curve (school closing, workspace closing policy & public event cancellation).

Fig. 24. Forecasting curve (school closing, facial covering & public event cancellation).

Fig. 25. Forecasting curve (school closing, facial covering & restriction on large gathering).

Fig. 26. Forecasting curve using 10 control policies.

We then took the best-performing model (trained on three policies) and changed the activation function in the Bi-LSTM layers while leaving the final dense layer (with no activation function) unchanged. The default activation function in the Bi-LSTM layer is Tanh; we replaced it with ReLU to study its impact on model accuracy. Changing the activation function did not improve the accuracy: the default Tanh performed better than ReLU. The difference in error rates with and without ReLU can be seen in Table 10.

The model was also fed the previous seven days of data to forecast one day ahead, but its performance was no better than with three days of prior data. A plausible explanation is that older data points correlate only weakly with the most recent ones. Data points that correlate strongly with the newer ones are useful for forecasting, while weakly correlated or uncorrelated ones should be left out. The forecasting accuracy of this model is shown in Table 6.
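One way to examine how strongly older observations relate to recent ones is the correlation between a series and a lagged copy of itself. The sketch below computes this Pearson correlation for a given lag in plain Python. The function name and the toy series are hypothetical, and the study does not report having performed this analysis; it is offered only as an assumed diagnostic for choosing the lookback length.

```python
import math


def lag_correlation(series, lag):
    """Pearson correlation between series[t] and series[t + lag]."""
    if lag < 1 or lag > len(series) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    earlier, later = series[:-lag], series[lag:]
    n = len(earlier)
    mean_e = sum(earlier) / n
    mean_l = sum(later) / n
    cov = sum((e - mean_e) * (l - mean_l) for e, l in zip(earlier, later))
    var_e = sum((e - mean_e) ** 2 for e in earlier)
    var_l = sum((l - mean_l) ** 2 for l in later)
    if var_e == 0 or var_l == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_e * var_l)


# Toy series that flips sign every day: lag 1 is perfectly anti-correlated,
# lag 2 perfectly correlated.
alternating = [1.0, -1.0] * 6
corr_lag1 = lag_correlation(alternating, 1)
corr_lag2 = lag_correlation(alternating, 2)
```

For cumulative case counts this correlation tends to stay high at every lag because the series only grows, so applying it to daily changes is likely to be more informative.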

Table 6. Forecasting/prediction accuracy over 7 parameters (3 control policies and 4 target variables), 7 days prior data used in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
School closing, Workspace closing & Public event cancellation (with Bi-LSTM) | 1228.0 | 10.65 | 465.4 | 374.5 | 0.0079 | 0.0053 | 0.0034 | 0.045 | 1406.5 | 11.94 | 611.09 | 531.16
School closing, Workspace closing & Public event cancellation (with LSTM) | 3658.2 | 23.43 | 2971.4 | 1180.2 | 0.024 | 0.011 | 0.021 | 0.137 | 4950.3 | 24.41 | 3661.4 | 1401.5

4.3.4. Forecasting using 10 policies data

Finally, both the LSTM and Bi-LSTM models are trained using all 10 policies together with the four target variables. The resulting curve is shown in Fig. 26, where the portion after the blue line is the forecast on completely unseen data. The forecasting accuracy of both models is given in Table 8, Table 9. Adding policies that are not useful distorted the curve and increased the error rate. The models were also trained using 5 policies, but the error rate kept increasing. Moreover, Tables 3 to 11 show that the Bi-LSTM model trained using COVID-19 control policies forecasts more accurately than the LSTM model. We therefore conclude that the best forecasting model is the one trained using 3 control policies: OVAR1 (School closing), OVAR3 (Closing of workplaces) and OVAR5 (Canceling of public event). The model trained using these restriction policies showed the best forecasting results. It also performed better than the model trained on only the four target variables, i.e. positive, death, recovered and quarantined cases, without any control policies. The forecasting accuracy of that model is shown in the rows labelled "No policy" in Table 8, Table 9. The combined results of the best-performing Bi-LSTM model in comparison with the LSTM model are shown in Table 12.

Table 8. Forecasting/prediction accuracy over 14 parameters (10 control policies and 4 target variables) and with 4 parameters (no policy control), in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using Bi-LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
All 10 policies | 787.63 | 21.238 | 422.66 | 344.34 | 0.0050 | 0.0103 | 0.0029 | 0.040 | 1042.3 | 29.120 | 559.05 | 524.42
3 best policies | 186.55 | 6.809 | 179.21 | 322.13 | 0.0011 | 0.0034 | 0.0012 | 0.038 | 276.41 | 7.649 | 234.88 | 436.40
No policy | 603.30 | 2.323 | 413.53 | 376.10 | 0.0039 | 0.0011 | 0.0029 | 0.044 | 711.61 | 2.767 | 471.49 | 569.38

Table 9. Forecasting/prediction accuracy over 14 parameters (10 control policies and 4 target variables) and with 4 parameters (no policy control), in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using LSTM.

Policy | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
All 10 policies | 6667.1 | 24.06 | 11493.5 | 1436.3 | 0.045 | 0.011 | 0.088 | 0.161 | 7990.0 | 36.90 | 12399.8 | 1568.9
No policy | 2532.6 | 70.17 | 1032.7 | 1012.2 | 0.016 | 0.033 | 0.0072 | 0.136 | 2809.5 | 84.04 | 1641.8 | 1301.6

Table 12. Forecasting/prediction accuracy comparison between LSTM and Bi-LSTM in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q).

Policy | Bi-LSTM MAE P | Bi-LSTM MAE D | Bi-LSTM MAE R | Bi-LSTM MAE Q | LSTM MAE P | LSTM MAE D | LSTM MAE R | LSTM MAE Q
School closing | 195.73 | 2.727 | 238.05 | 236.22 | 4402.3 | 90.68 | 7089.7 | 612.7
Workplace closing | 174.17 | 5.014 | 265.26 | 288.56 | 1702.2 | 9.199 | 899.6 | 659.8
Public event cancellation | 145.70 | 8.062 | 352.14 | 312.16 | 2432.8 | 13.98 | 3908.2 | 644.38
Large gathering restriction | 186.09 | 3.699 | 113.48 | 225.13 | 1269.6 | 55.35 | 2470.1 | 548.25
Stay at home restriction | 440.09 | 12.592 | 206.87 | 339.2 | 1060.0 | 87.76 | 2706.2 | 735.31
Internal movement restriction | 580.17 | 6.724 | 190.01 | 539.27 | 1705.8 | 41.27 | 1461.3 | 897.7
International travel restriction | 313.66 | 8.592 | 230.76 | 383.32 | 3169.7 | 12.59 | 3453.9 | 695.35
Public information campaign | 178.84 | 3.536 | 302.10 | 332.3 | 1824.1 | 120.3 | 2135.8 | 675.8
Contact tracing policy | 143.35 | 4.08 | 521.46 | 317.90 | 1836.9 | 112.04 | 2116.7 | 858.4
Facial covering | 228.74 | 5.354 | 165.78 | 269.27 | 4128.0 | 16.22 | 2699.3 | 863.9
School closing & Canceling of public event | 229.09 | 5.038 | 211.08 | 388.17 | 3002.0 | 51.68 | 5583.6 | 658.55
School & Workspace closing | 220.85 | 6.919 | 132.25 | 316.97 | 7670.7 | 67.30 | 8424.8 | 658.6
School closing, Workspace closing & Public event cancellation (values as in Tables 8 and 11) | 186.55 | 6.809 | 179.21 | 322.13 | 3401.8 | 27.70 | 3745.6 | 589.31
All 10 policies | 787.63 | 21.238 | 422.66 | 344.34 | 6667.1 | 24.06 | 11493.5 | 1436.3
No policy | 603.30 | 2.323 | 413.53 | 376.10 | 2532.6 | 70.17 | 1032.7 | 1012.2

4.3.5. Forecasting using ARIMA, SARIMA and VAR (classical models)

We trained the classical models ARIMA, SARIMA, and VAR for positive, death, recovered, and quarantined cases. The results obtained using ARIMA and SARIMA are shown in Table 13. VAR accepts multiple input variables, so it is trained first using the three best-performing policies (those giving the minimum error with Bi-LSTM) and then using all ten policies. The accuracy of the VAR, Bi-LSTM and LSTM models is compared in Table 14, Table 15: the models in Table 14 are trained using three policies, and those in Table 15 using all ten policies, alongside ARIMA and SARIMA. Bi-LSTM outperforms all the other models in forecasting positive, death, recovered, and quarantined cases.

Table 13. Forecasting/prediction accuracy in forecasting of Positive (P), Death (D), Recovered (R) and Quarantined (Q) cases using ARIMA and SARIMA (using 4 forecasting parameters).

Model | MAE P | MAE D | MAE R | MAE Q | MAPE P | MAPE D | MAPE R | MAPE Q | RMSE P | RMSE D | RMSE R | RMSE Q
ARIMA | 6338.5 | 24.05 | 2678.1 | 2005.5 | 0.044 | 0.012 | 0.019 | 0.237 | 7462.3 | 38.31 | 3217.9 | 2380.9
SARIMA | 6075.39 | 168.56 | 5578.2 | 1156.8 | 0.042 | 0.085 | 0.041 | 0.133 | 7069.9 | 200.18 | 6787.3 | 1599.7

5. Conclusion

This paper proposed control-policy-based forecasting of COVID-19 parameters in South Korea using a stacked Bi-LSTM architecture. We compared the accuracy of the stacked Bi-LSTM with the classical time-series models VAR, ARIMA and SARIMA, and with LSTM. Bi-LSTM performs better than the traditional models and LSTM for forecasting COVID-19 cases from the past three days of historical data. The models were also trained using seven days of historical data and with different activation functions (ReLU and Tanh) in the Bi-LSTM and LSTM layers; three days of historical data with the default Tanh activation outperformed seven days of historical data and ReLU. We examined ten policies, including school closure, restrictions on international movement, restrictions on large gatherings, contact tracing, facial covering (face mask), internal travel (between cities), public event cancellation, and stay-at-home restrictions, to forecast the future values of COVID-19 positive, recovered, death, and quarantined cases. The model is trained using single policies, pairs of policies, and combinations of three policies, and finally all the policies together. The single policies that gave the best results are school closing, workspace closing, public event cancellation, restrictions on large gatherings, public information campaigns, and contact tracing. Among the pairs, the best results are given by school and workspace closing, followed by school closing and cancellation of public events. Among the three-policy combinations, the best forecasting results are yielded by school closing, workspace closing and public event cancellation, with MAE of 186.55, 6.809, 179.21 and 322.13 for positive, death, recovered and quarantined cases, respectively. We conclude that this combination performed better than all the others.

Although this study was originally intended to address pandemic growth and its impact on South Korea, the proposed model can, with further fine-tuning, be applied to forecasting COVID-19 cases elsewhere and to other pandemic time-series problems. Moreover, this study may help governments introduce more effective policies to stop COVID-19 growth and remove unnecessary restrictions that have no impact on stopping COVID-19.

In future work, we plan to incorporate vaccinated cases and vaccination policies to check the effect of the vaccination drive on COVID-19 cases. Furthermore, we will use attention models to see if there are more chances of getting better-optimized results.

CRediT authorship contribution statement

Furqan Ali: Conceptualization, Methodology, Software, Writing – original draft, Visualization, Formal analysis. Farman Ullah: Conceptualization, Supervision, Investigation, Validation, Final Review. Junaid Iqbal Khan: Writing – original draft, Visualization, Formal analysis. Jebran Khan: Data curation, Investigation. Abdul Wasay Sardar: Writing – original draft, Visualization. Sungchang Lee: Supervision, Investigation, Validation, Final Review, Fund acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (Ministry of Science and ICT), NRF-2020K1A3A1A47110830.

Data availability

Dataset link: https://sites.google.com/view/snuaric/data/covid-19-data/covid-19-data?authuser=0

Articles from Chaos, Solitons, and Fractals are provided here courtesy of Elsevier
