Abstract
To combat the coronavirus disease 2019 (COVID-19), academicians and clinicians are searching for new approaches to predict the dynamic trends of the COVID-19 outbreak, which may help slow down or stop the pandemic. Epidemiological models like Susceptible–Infected–Recovered (SIR) and its variants are helpful for understanding the dynamic trend of a pandemic and may be used in decision making to optimize possible controls for the infectious disease. However, these epidemiological models, being based on mathematical assumptions, may not predict the real pandemic situation. Recently, machine learning approaches have been used to understand the dynamic trend of COVID-19 spread. In this paper, we design recurrent and convolutional neural network models (vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM) to capture the complex trend of the COVID-19 outbreak and forecast COVID-19 daily confirmed cases 7, 14, and 21 days ahead for India and its four most affected states (Maharashtra, Kerala, Karnataka, and Tamil Nadu). The root mean square error (RMSE) and mean absolute percentage error (MAPE) evaluation metrics are computed on the testing data to demonstrate the relative performance of these models. The results show that the stacked LSTM and hybrid CNN+LSTM models perform best relative to the other models.
Keywords: Deep learning, COVID-19, CNN, LSTM
1. Introduction
The coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified in Wuhan city of China in December 2019 (Huang et al., 2020). It is categorized as an infectious disease and spreads among people through close contact with infected persons, generally via small droplets produced by coughing, sneezing, or talking, and through contaminated surfaces. On March 11, 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. In India, the first case of COVID-19 was reported in Kerala on January 30, 2020; the disease gradually spread throughout India, especially in urban areas, and India witnessed the first wave of COVID-19. India witnessed the second wave in March 2021, which was much more devastating than the first, with shortages of hospital beds, vaccines, oxygen cylinders, and other medicines in parts of the country. To fight COVID-19, the country has vaccination, herd immunity, and epidemiological interventions as a few possible options. In the early stage of COVID-19, India imposed complete as well as partial lockdowns as epidemiological interventions during the first wave, which slowed the transmission rate, delayed the peak, and resulted in a smaller number of COVID-19 cases. India is the second most populous country in the world, where 68.84% and 31.16% of India's population lives in rural and urban areas, respectively. The population density in northeast India is low in comparison to other states of India. The chance of getting infected depends on the spatial distance between contacts, and a low-density population is less prone to spread than a high-density one. Individual personal behaviour (social distancing, frequent hand sanitation, wearing a mask, etc.) also plays a key role in controlling the spread of COVID-19.
Prediction of new COVID-19 cases per day will help the administration and planners to take proper decisions and to make effective policy to tackle the pandemic situation. Epidemiological models are very helpful for understanding the trend of COVID-19 spread and useful in predicting the spread rate, the duration, and the peak of the infectious disease. They can be used for short-term and long-term predictions of new confirmed COVID-19 cases per day, which may be used in decision making to optimize possible controls for the infectious disease. In the literature, several mathematical models for infectious diseases have been introduced, such as logistic models (Turner et al., 1976), generalized growth models (Chowell, 2017), Richards's models (Richards, 1959), sub-epidemic wave models (Chowell et al., 2019), the Susceptible–Infected–Recovered (SIR) model (Kermack & McKendrick, 1927), and the Susceptible–Exposed–Infectious–Removed (SEIR) model. The SIR model is a compartmental model that considers the whole population as a closed population and divides it into susceptible, infected, and recovered compartments. Each infected person infects other persons at an average rate $R_0$, known as the basic reproduction number. Recently, some works have been reported in the literature using the SIR model and its variants to predict the COVID-19 outbreak (Ardabili et al., 2020, Bagal et al., 2020, Chen et al., 2020, Cooper et al., 2020, Verma et al., 2020). These epidemiological models are good at capturing the trend of COVID-19 spread but are designed on several assumptions that would not generally hold for real-life data (Chimmula & Zhang, 2020). They are unreliable due to the complex trend of infection spread, which depends on population density, travel, and individual social aspects such as culture and lifestyle. Therefore, there is a need for deep learning approaches to accurately predict the COVID-19 trends in India. In deep learning, the convolutional neural network (CNN) (LeCun et al., 1989) is one form of deep learning architecture for processing data that has a grid-like topology. This includes time series data, which can be considered a 1D grid of samples taken at regular time intervals, and image data, considered a 2D grid of pixels. A typical end-to-end CNN network consists of different layers such as convolution, activation, max-pooling, and softmax layers.
The recurrent neural network (RNN) (Rumelhart et al., 1986), derived from feedforward neural networks, can use its internal state (memory) to process variable-length sequences of data, making it suitable for sequential data. Long Short-Term Memory (LSTM) was introduced by Hochreiter and Schmidhuber (1997); it overcomes the vanishing and exploding gradient problems of RNNs and captures long-range dependencies, which has proved very promising for modelling sequential data. A common LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell. For a given input sequence $x = (x_1, x_2, \ldots, x_T)$ from time $t = 1$ to $T$, LSTM calculates an output sequence $h = (h_1, h_2, \ldots, h_T)$, mathematically represented as (Hochreiter & Schmidhuber, 1997):
$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$  (1)
$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$  (2)
$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$  (3)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$  (4)
$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$  (5)
$h_t = o_t \odot \tanh(c_t)$  (6)
From Eq. (1) to Eq. (6), $i_t$, $o_t$, $f_t$ and $c_t$ represent the input gate, output gate, forget gate, and cell activation vector respectively, and $h_t$ depicts the hidden state vector, also known as the output vector of the LSTM unit. $W$ denotes a weight matrix; for example, $W_{xi}$ means the weight matrix from the input to the input gate. The $\odot$ stands for element-wise multiplication and $b$ denotes the bias term, whereas $\tanh$ is used as the activation function at the cell input (Eq. (3)) and cell output (Eq. (6)), and $\sigma$ represents the logistic sigmoid function. The LSTM cell is depicted in Fig. 1.
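As a concrete illustration, a single LSTM time step corresponding to Eqs. (1)–(6) can be sketched in NumPy as below; the dictionary keys for the weight matrices and biases are illustrative names, not part of any library API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. (1)-(6); W and b are dicts of
    weight matrices and bias vectors (e.g. W['xi'] maps input to input gate)."""
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + b['i'])  # input gate, Eq. (1)
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + b['f'])  # forget gate, Eq. (2)
    g_t = np.tanh(W['xc'] @ x_t + W['hc'] @ h_prev + b['c'])  # candidate cell, Eq. (3)
    c_t = f_t * c_prev + i_t * g_t                            # cell state, Eq. (4)
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + b['o'])  # output gate, Eq. (5)
    h_t = o_t * np.tanh(c_t)                                  # hidden state, Eq. (6)
    return h_t, c_t
```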
LSTM is a method with multiple layers that can map the input sequence to a vector of fixed dimensionality, from which a deep LSTM decodes the target sequence; this deep LSTM is essentially a recurrent neural network model conditioned on the input sequence. LSTM can solve problems with long-term dependencies, which may be caused by the introduction of many short-term dependencies into the dataset. LSTM has the ability to learn successfully on data having a long range of temporal dependencies because of the time lag between the inputs and their corresponding outputs (Sutskever et al., 2014). LSTM can be used for predicting time series and is beneficial for sequential data (Abdollahi et al., 2021).
Deep learning models such as LSTM and CNN are well suited for understanding and predicting the dynamical trend of COVID-19 spread and have recently been used for prediction by several researchers (Bedi et al., 2021, Dairi et al., 2021, Devaraj et al., 2021, Iqbal et al., 2021, Nabi et al., 2021, Shastri et al., 2020, Wang et al., 2020). Chandra et al. (2021) used LSTM and its variants for ahead prediction of COVID-19 spread in India, with both static and dynamic splits of the training and testing data. LSTMs were used for COVID-19 transmission in Canada by Chimmula and Zhang (2020), and the results show linear transmission in Canada. Arora et al. (2020) performed forecasting of COVID-19 cases for India using LSTM variants and categorized the Indian states into different zones based on COVID-19 cases. Combining the Susceptible–Infectious–Recovered–Deceased model with machine learning strategies, Amaral et al. (2021) introduced a novel data-driven forecasting method in which the mathematical model parameters are determined through an artificial neural network, and predicted COVID-19 cases in São Paulo and Brazil. Wieczorek et al. (2020) designed a neural-network-powered COVID-19 spread forecasting model using the NAdam optimizer, finding the best time step at which data are fed into the network, and compared it with classic statistical approaches; the experimental results show better accuracy for some regions.
In this paper, we employ the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models to capture the dynamic trend of COVID-19 spread and predict the COVID-19 daily confirmed cases 7, 14, and 21 days ahead for India and its four most affected states: Maharashtra, Kerala, Karnataka, and Tamil Nadu. To demonstrate the performance of the deep learning models, RMSE and MAPE errors are computed on the testing data. The flowchart of the designed model is represented in Fig. 2.
The rest of the manuscript is organized as follows. Section 2 describes the deep learning models along with the experimental setup and evaluation metrics. In Section 3, we present the COVID-19 dataset and the experimental results and discussions. Finally, conclusions are drawn in Section 4.
2. Methods
2.1. Experimental setup
The COVID-19 outbreak trend is highly dynamic and depends on the various intervention strategies imposed. To capture this complex trend, in this study we proceed through the following steps during training, testing, and forecasting.
- We used early COVID-19 data up to July 10, 2021, and split the COVID-19 time series into training and testing data by taking the last 20 days as testing data and the remaining data as training data.
- To avoid inconsistency in the COVID-19 time series data, the data are normalized into the interval [0, 1] using the 'MinMaxScaler' function.
- The COVID-19 time series data is reshaped into input-shaped data by taking a time step (time lag), or observation window, of 15 with the number of features equal to one, as for a univariate model. An observation window of 15 means that we use the previous 15 days of COVID-19 time series data to predict the next day, that is, the 16th day. In a univariate model the input contains only one feature.
- Further, we train and test the recurrent and convolutional neural network approaches on the COVID-19 time series data and set up the models by tuning hyper parameters through manual search. COVID-19 daily confirmed case predictions are performed from July 10, 2021 up to July 17, 2021 (7 days), July 24, 2021 (14 days), and July 31, 2021 (21 days) using vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM for India and its four most affected states: Maharashtra, Kerala, Karnataka, and Tamil Nadu. The experimental work is summarized in Fig. 2, and a sketch of the preprocessing steps is given after this list.
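The preprocessing described above can be sketched as follows. The file name and column name are illustrative assumptions; the [0, 1] scaling, the 15-day observation window, the univariate input shape, and the 20-day test split follow the setup above.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative input: daily confirmed cases up to July 10, 2021.
series = pd.read_csv("india_daily_confirmed.csv")["confirmed"].values.reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))  # normalize into [0, 1]
scaled = scaler.fit_transform(series)

def make_windows(data, window=15):
    """Slide a 15-day observation window: previous 15 days -> 16th day."""
    X, y = [], []
    for i in range(len(data) - window):
        X.append(data[i:i + window])
        y.append(data[i + window])
    return np.array(X), np.array(y)

X, y = make_windows(scaled, window=15)  # X has shape (samples, 15, 1): univariate
X_train, y_train = X[:-20], y[:-20]     # last 20 days held out as testing data
X_test, y_test = X[-20:], y[-20:]
```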
The RNN and CNN approaches, viz. vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM, have been implemented in Python using the Keras module of TensorFlow, and the predictions are made with a univariate approach.
2.2. Vanilla LSTM
A vanilla LSTM is an LSTM model that has a single hidden layer of LSTM units. The encoder part is responsible for reading and interpreting the input sequence, and its output is a fixed-length vector (Brownlee, 2018b). Vanilla LSTM has the property of isolating the effect of a change on a performance variant: when vanilla LSTM is used as a baseline against which all of its variants are evaluated, the effect of each change made in a variant can be isolated. The performance of vanilla LSTM is reasonably good on various data sets (Greff et al., 2016). Vanilla LSTM is a state-of-the-art model for a variety of machine learning problems, and vanilla LSTM networks predict accurately, making the most of the long short-term memory even in complicated operating cases (Wu et al., 2018). The designed vanilla LSTM model is depicted in Fig. 3 and its hyper parameters are shown in Table 1.
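A minimal Keras sketch of the vanilla LSTM in Table 1 is given below; the input shape assumes the 15-day univariate window of Section 2.1, and the ReLU activation follows the experimental setup described in Section 3.2.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.regularizers import l1

# Vanilla LSTM per Table 1: 200 units with L1 bias regularizer 0.02,
# 20% dropout, and a single-unit output layer.
model = Sequential([
    LSTM(200, activation="relu", bias_regularizer=l1(0.02), input_shape=(15, 1)),
    Dropout(0.2),
    Dense(1),
])
```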
Table 1. Hyper parameters of the vanilla LSTM, stacked LSTM, BiLSTM, and ED_LSTM models.

| Models | Layers | Number of units | Bias regularizer L1 | Models | Layers | Number of units | Bias regularizer L1 |
|---|---|---|---|---|---|---|---|
| Vanilla LSTM | LSTM | 200 | 0.02 | BiLSTM | Bidirectional | 250 | 0.4 |
| | Dropout | 0.2 | – | | Dropout | 0.2 | – |
| | Dense | 1 | – | | Dense | 1 | – |
| Stacked LSTM | LSTM | 130 | 0.04 | ED_LSTM | LSTM | 125 | 0.02 |
| | Dropout | 0.4 | – | | Dropout | 0.2 | – |
| | LSTM | 100 | 0.04 | | Repeat Vector | 4 | – |
| | Dropout | 0.4 | – | | LSTM | 75 | 0.02 |
| | LSTM | 75 | 0.04 | | Dropout | 0.2 | – |
| | Dense | 1 | – | | Dense | 1 | – |
2.3. Stacked LSTM
Stacked LSTM has more than one LSTM sub-layer, connected together through various weight parameters. On top of a single-layer LSTM, stacked LSTM overlays additional hidden LSTM layers (Sun et al., 2020). In a stacked LSTM each edge corresponds to a weight value and the cell is the time unit. The data transformation performed in a stacked LSTM is mathematically shown below:
$x_t^{(l+1)} = f\big(W^{(l)} h_t^{(l)} + b^{(l)}\big)$  (7)
Here, $f$ is the activation function, $x_t^{(l+1)}$ is the input data for the next hidden layer, the weights of the edges connecting the previous layer's output to the next layer's input are defined in $W^{(l)}$, $h_t^{(l)}$ contains the output value of one cell, and $b^{(l)}$ contains the bias. For feature extraction, the stacked LSTM proves to improve the extraction process (Yu et al., 2019). The designed stacked LSTM model is shown in Fig. 4 and its hyper parameters are tabulated in Table 1.
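A minimal Keras sketch of the stacked LSTM in Table 1 follows; `return_sequences=True` on the lower layers passes the full hidden sequence to the next LSTM sub-layer, and the ReLU activation and 15-day univariate input shape are assumptions carried over from the experimental setup.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.regularizers import l1

# Stacked LSTM per Table 1: three LSTM sub-layers (130, 100, 75 units),
# each with L1 bias regularizer 0.04, interleaved with 40% dropout.
model = Sequential([
    LSTM(130, activation="relu", bias_regularizer=l1(0.04),
         return_sequences=True, input_shape=(15, 1)),
    Dropout(0.4),
    LSTM(100, activation="relu", bias_regularizer=l1(0.04), return_sequences=True),
    Dropout(0.4),
    LSTM(75, activation="relu", bias_regularizer=l1(0.04)),
    Dense(1),
])
```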
2.4. Bidirectional-LSTM
Bidirectional Long Short-Term Memory (BiLSTM) is a deep learning algorithm applied for forecasting time series data. It is adopted to learn from its framework, providing better understanding of the learning context (Abdollahi et al., 2021). BiLSTM supports multivariate time series, allowing multiple dependent series to be modelled together to predict correlations among variables recorded or captured simultaneously over a time period (Said et al., 2021). BiLSTM is a deep learning model for sequential prediction with little error (Shahid et al., 2020). It has many further features, such as handling temporal dependencies in time series data, distribution-free learning, and flexibility in modelling non-linear features. In other words, BiLSTM is an enhanced version of the LSTM algorithm that combines two variants of hidden states, allowing information to come from the backward layer as well as the forward layer. BiLSTM is helpful for situations that require context as input. It is widely used in classification, especially text classification, sentiment classification, speech classification and recognition, and load forecasting. As BiLSTM is a deep learning model with the capacity to capture non-linear processes and the flexibility to model time-dependent data, it has nowadays been used for real-time forecasts of daily events (Zeroual et al., 2020). The designed BiLSTM model is depicted in Fig. 5 and its hyper parameters are presented in Table 1.
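A minimal Keras sketch of the BiLSTM in Table 1 follows; the `Bidirectional` wrapper runs the 250-unit LSTM over the input window in both directions, and the activation and input shape are again assumptions from the experimental setup.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dropout, Dense
from tensorflow.keras.regularizers import l1

# BiLSTM per Table 1: a bidirectional wrapper around 250 LSTM units
# (L1 bias regularizer 0.4), 20% dropout, single-unit output.
model = Sequential([
    Bidirectional(LSTM(250, activation="relu", bias_regularizer=l1(0.4)),
                  input_shape=(15, 1)),
    Dropout(0.2),
    Dense(1),
])
```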
2.5. Encoder Decoder-LSTM
ED_LSTM (Encoder–Decoder) is a sequence-to-sequence network for mapping a fixed-length input to a fixed-length output. It handles variable-length inputs and outputs by first encoding the input sequence and then decoding from that representation. This method can compute a sequence of hidden states. In ED_LSTM, the encoder and decoder improve the continuity of learning input and output sequences: the input sequence is read and the output sequence is written many times sequentially, with the number of reuses depending on the lengths of the input and output sequences. The ED_LSTM model is consistent, and its outputs are stable, reliable, and accurate; it can even effectively mimic long-term dependence between variables (Kao et al., 2020). The advantage of ED_LSTM is that the network can be constructed from a model definition consisting of a list of inputs and outputs, so the model can be trained automatically from the provided dataset. This advantage of ED_LSTM helps to reduce model construction and training cost (Ellis & Chinde, 2020). The designed ED_LSTM model is depicted in Fig. 6 and its hyper parameters are shown in Table 1.
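A minimal Keras sketch of the ED_LSTM in Table 1 follows: the 125-unit encoder compresses the input window into a fixed-length vector, `RepeatVector(4)` feeds that vector to the decoder four times, and the 75-unit decoder produces the output. Letting the decoder return only its final state before the single-unit Dense layer is one plausible reading of Table 1 for a one-step-ahead target, and is an assumption here.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, RepeatVector, Dense
from tensorflow.keras.regularizers import l1

# ED_LSTM per Table 1: encoder LSTM (125 units), RepeatVector(4),
# decoder LSTM (75 units), both with L1 bias regularizer 0.02.
model = Sequential([
    LSTM(125, activation="relu", bias_regularizer=l1(0.02), input_shape=(15, 1)),
    Dropout(0.2),
    RepeatVector(4),                 # repeat the encoded vector for the decoder
    LSTM(75, activation="relu", bias_regularizer=l1(0.02)),
    Dropout(0.2),
    Dense(1),
])
```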
2.6. Convolution Neural Network (CNN)
CNN is a deep learning algorithm that automatically captures and identifies important features without the need for human supervision (Gu et al., 2018). The local connections and shared weights employed in a CNN are useful for extracting features from 2-D input signals such as images. Basically, a CNN has three kinds of layers: the convolution layer, the pooling layer, and the fully connected layer. The convolution layer is primarily associated with the identification of features from raw data. This is achieved by applying filters of predefined size followed by the convolution operation. The pooling layer applies a pooling operation that reduces the dimension of the feature maps while retaining the important features (Albawi et al., 2017). Some of the pooling methods are max pooling and average pooling. The fully connected layer, or dense layer, generates the forecast after the feature-extraction process. The final fully connected layers receive the flattened features arising from the convolution and pooling operations (Alzubaidi et al., 2021, Zhou et al., 2016).
(I) Convolutional layer: The convolutional layer in a CNN architecture consists of multiple convolutional filters, also known as kernels. The convolution operation is performed between the raw data, in the form of a matrix, and these kernels to generate an output feature map. The numbers present in a kernel are its weights. The initial kernel values are random, and during the training process the kernel values are adjusted to help extract important features from the data. In the convolution operation, for example over a 10×10 grey-scale image, a randomly initialized kernel slides vertically and horizontally and the dot product between the kernel and the underlying image patch is computed: the element-wise products are summed to create a single scalar value. In a 1D-CNN the kernel moves in one direction only; similarly, in 2D-CNN and 3D-CNN the kernel moves in two and three directions, respectively. The data processed by the kernel of a CNN sometimes requires padding. Padding refers to adding layers of extra pixels (zeros) around the input data, which helps preserve information present at the borders (Albawi et al., 2017).
(II) Pooling layer: The feature maps generated by the convolutional operations are sub-sampled in the pooling layer, which reduces large feature maps to smaller ones. The pooling layer reduces the dimension of the feature map, resulting in a reduction in the number of parameters to learn, and also reduces the computation that needs to be performed. There are various types of pooling, such as average pooling, max pooling, min pooling, and global average pooling (GAP). The performance of a CNN model may sometimes decrease because of the pooling layer, as it focuses primarily on ascertaining the correct location of a feature rather than on the particular features available in the data (Alzubaidi et al., 2021, Gu et al., 2018, Zhou et al., 2016).
(III) Activation function (transfer function): In a neural network, the activation function transforms the weighted sum of a neuron's inputs into its output. It maps the input to the output and determines whether or not a particular neuron fires. Activation functions can be linear or non-linear. Some of the activation functions used in CNNs are described below. (a) Rectified Linear Unit (ReLU): The ReLU is a piecewise linear function that outputs the input directly when it is positive and zero otherwise. It is one of the most common activation functions in neural networks; one of its advantages over other activation functions is its lower computational load (Albawi et al., 2017). Mathematically it is represented as:
$f(x) = \max(0, x)$  (8)
(b) Sigmoid: The inputs are real numbers and the output is constrained to lie between zero and one. It is an S-shaped function, mathematically represented as:
$\sigma(x) = \dfrac{1}{1 + e^{-x}}$  (9)
(c) Tanh: The Tanh activation function takes real numbers as input and produces output between −1 and 1. It is described mathematically as:
$\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$  (10)
(IV) Fully connected layer: In this layer each neuron is fully connected to the neurons of the adjacent layer, hence the name fully connected (FC) layer. It is located at the end of the CNN architecture and forms the last few layers of the network. The output of the final pooling layer, after flattening, is the input to the FC layer; flattening is the process of unrolling a matrix into a vector (Albawi et al., 2017).
(V) Loss function: Loss functions are used in the output layer to compute the prediction error over the training samples in a CNN. This error is the difference between the actual and the predicted values. Some of the loss functions used in neural networks are the mean squared error (MSE), cross-entropy or softmax loss, Euclidean loss, and hinge loss (Albawi et al., 2017). The designed CNN model is depicted in Fig. 7 and its hyper parameters are shown in Table 2.
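A minimal Keras sketch of the CNN in Table 2 follows; the input shape assumes the 15-day univariate window, and the ReLU activation follows the experimental setup.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense
from tensorflow.keras.regularizers import l1

# CNN per Table 2: two Conv1D layers (100 and 75 filters, kernel size 2,
# L1 bias regularizer 0.4, L1 kernel regularizer 0.002), max pooling,
# then dense layers producing the one-step forecast.
model = Sequential([
    Conv1D(100, kernel_size=2, activation="relu",
           bias_regularizer=l1(0.4), kernel_regularizer=l1(0.002),
           input_shape=(15, 1)),
    Conv1D(75, kernel_size=2, activation="relu",
           bias_regularizer=l1(0.4), kernel_regularizer=l1(0.002)),
    MaxPooling1D(pool_size=2),
    Flatten(),
    Dense(64, activation="relu"),
    Dense(1),
])
```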
Table 2. Hyper parameters of the CNN and hybrid CNN+LSTM models.

| Models | Layers | Number of filters/units | Kernel size | Bias regularizer L1 | Kernel regularizer L1 |
|---|---|---|---|---|---|
| CNN | Conv1D | 100 | 2 | 0.4 | 0.002 |
| | Conv1D | 75 | 2 | 0.4 | 0.002 |
| | MaxPooling1D | – | 2 | – | – |
| | Flatten | – | – | – | – |
| | Dense | 64 | – | – | – |
| | Dense | 1 | – | – | – |
| CNN+LSTM | Conv1D | 100 | 2 | – | 0.002 |
| | MaxPooling1D | – | 2 | – | – |
| | Flatten | – | – | – | – |
| | LSTM | 64 | – | 0.5 | – |
| | Dropout | 0.3 | – | – | – |
| | Dense | 1 | – | – | – |
2.7. Hybrid CNN+LSTM
The hybrid CNN+LSTM deep learning architecture combines the benefits of both the LSTM and the CNN. The LSTM in this hybrid model learns the temporal dependencies present in the input data, while the CNN is integrated so that it can process high-dimensional data. The components of the LSTM are the input gate, forget gate, output gate, memory cell, candidate memory cell, and hidden state (Li et al., 2020). The 1-D CNN extracts the important features from the temporal feature space, which the LSTM then models through non-linear transformations. The convolution layers are wrapped with a time-distributed layer in the model, and it is ensured that the data are transformed appropriately. The layers used in the model are convolutional layers, a max-pooling layer, a flatten layer, and a time-distributed layer, followed by LSTM layers (Li et al., 2020, Liu et al., 2018). The designed CNN+LSTM model is presented in Fig. 8 and its hyper parameters are tabulated in Table 2.
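A minimal Keras sketch of the hybrid CNN+LSTM in Table 2 follows. The convolution/pooling/flatten stack is wrapped in `TimeDistributed`, as described above, so each sub-sequence is convolved separately before the LSTM models the dependencies across sub-sequences; splitting the 15-day window into 3 sub-sequences of 5 days is an illustrative choice, not specified in the paper.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv1D, MaxPooling1D, Flatten, LSTM,
                                     Dropout, Dense, TimeDistributed)
from tensorflow.keras.regularizers import l1

# Hybrid CNN+LSTM per Table 2: Conv1D (100 filters, kernel size 2,
# L1 kernel regularizer 0.002), max pooling, flatten, then a 64-unit LSTM
# (L1 bias regularizer 0.5), 30% dropout, and a single-unit output.
model = Sequential([
    TimeDistributed(Conv1D(100, kernel_size=2, activation="relu",
                           kernel_regularizer=l1(0.002)),
                    input_shape=(3, 5, 1)),   # 15-day window as 3 x 5-day blocks
    TimeDistributed(MaxPooling1D(pool_size=2)),
    TimeDistributed(Flatten()),
    LSTM(64, activation="relu", bias_regularizer=l1(0.5)),
    Dropout(0.3),
    Dense(1),
])
# Inputs must be reshaped accordingly, e.g. X.reshape(-1, 3, 5, 1).
```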
2.8. Evaluation metrics
To demonstrate the relative performance of the various deep learning models, the root mean square error (RMSE) and mean absolute percentage error (MAPE) have been computed, which are mathematically defined as:
$\mathrm{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\big(y_i - \hat{y}_i\big)^2}$  (11)
$\mathrm{MAPE} = \dfrac{100}{n}\sum_{i=1}^{n}\left|\dfrac{y_i - \hat{y}_i}{y_i}\right|$  (12)
Here $y_i$ denotes the actual confirmed cases, $\hat{y}_i$ is the predicted daily confirmed cases from the deep learning model, and $n$ is the total number of observations under study. Small values of RMSE and MAPE represent better performance of a model. In this study, RMSE and MAPE are computed on the test data, where the actual values and the predicted values of the various models are available. For all predictions of 7, 14, and 21 days, we also computed the confidence interval (Gupta & Kapoor, 1994) at 95% for the predicted counts of new confirmed COVID-19 cases per day. The confidence interval gives a range of values for new cases and the probability with which an estimated interval will contain the true value of the confirmed cases.
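The two metrics can be sketched as below; the confidence-interval helper uses a normal approximation, which is an assumption here, since the exact construction from Gupta and Kapoor (1994) is not spelled out in this section.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (11)."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, Eq. (12)."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def confidence_interval_95(preds):
    """Illustrative 95% interval around the mean of repeated predictions,
    assuming an approximately normal spread (z = 1.96)."""
    half = 1.96 * np.std(preds, ddof=1) / np.sqrt(len(preds))
    return np.mean(preds) - half, np.mean(preds) + half
```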
3. Results and discussions
For the analysis and forecasting of the daily confirmed COVID-19 cases, the RNN and CNN models are considered for training and testing. In our study we used vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM to build maps that capture the complex trend in the given COVID-19 time series and to perform forecasting using these maps. The details are discussed in the following subsections.
3.1. COVID-19 data and preprocessing
In this study, daily new COVID-19 cases have been predicted for 7, 14, and 21 days for the whole country (India) and four of its most affected states (Maharashtra, Kerala, Karnataka, and Tamil Nadu) using deep learning approaches. The COVID-19 time series data from January 30, 2020 to July 10, 2021 is accessed from covid19india.org, where the numbers of daily confirmed, recovered, and deceased cases are publicly available online at https://api.covid19india.org/documentation/csv/. We use data up to July 10, 2021, as illustrated in Fig. 9, to train and test the recurrent and convolutional neural network models. The trend of the COVID-19 time series is highly inconsistent in nature, which may be due to the rate of individual infections, the number of reported cases, individual behaviour, the effect of lockdowns, and non-pharmaceutical measures. India and its states witnessed two waves, and the new case counts per day during the peak of the second wave were much higher than in the first wave, as depicted in Fig. 9. Due to this high inconsistency in the per-day counts, the data are normalized into the interval [0, 1] using the 'MinMaxScaler' function of scikit-learn in the preprocessing step before applying the deep learning models.
The 'MinMaxScaler' function normalizes the given time series data using the formula $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$, where $x_{\max}$ and $x_{\min}$ represent the maximum and minimum values of the data $x$. After forecasting, the confirmed case counts per day, which lie in the interval [0, 1], are re-transformed into the corresponding actual numbers by applying the reverse operation with the scaler's 'inverse_transform' function.
3.2. Hyper parameter tuning
The hyper parameters of the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models are summarized in Table 1 and Table 2. To avoid over-fitting, we regularize the models on the training data using regularizers (bias/kernel) with different settings, along with Dropout, as shown in Table 1 and Table 2; around 20% to 40% of neurons are dropped through the Dropout layers. In the CNN and hybrid CNN+LSTM, we use the Conv1D layer with kernel size 2, as depicted in Table 2. Throughout the entire experiment, the 'ReLU' activation function, the 'adamax' optimizer, and the 'MSE' loss function are used. To tune the training epochs, we set up the 'EarlyStopping' callback with 1000 epochs, batch size 64, and patience of 250. This setup checks the performance of the respective model on the training and validation datasets and stops the training if the model appears to be starting to over-learn or over-fit. The learning algorithm is stochastic in nature, so the results may vary between runs (Brownlee, 2018a). To address this issue, we ran each deep learning model up to 10 times, saved the best model, and noted the corresponding performance results in our experiment.
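The training setup described above can be sketched as follows, applied to any of the models defined earlier; the 10% validation split is an illustrative assumption, while the optimizer, loss, epochs, batch size, and patience follow the stated configuration.

```python
from tensorflow.keras.callbacks import EarlyStopping

# 'adamax' optimizer and MSE loss, as used throughout the experiments.
model.compile(optimizer="adamax", loss="mse")

# Early stopping with patience 250, monitored on validation loss.
early_stop = EarlyStopping(monitor="val_loss", patience=250,
                           restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=1000, batch_size=64,
                    validation_split=0.1,       # illustrative split
                    callbacks=[early_stop], verbose=0)
```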
3.3. Prediction performance
In this section, we discuss the prediction performance of the deep learning models for India and four of its states, individually, in the following subsections.
3.3.1. India
India is the second most populous country in the world, which may imply a higher threat from the spread of COVID-19. The daily confirmed cases in India from Jan 30, 2020 to July 10, 2021 are depicted in Fig. 9(a). It is observed that the new confirmed cases per day are highly inconsistent. India witnessed two waves; in the second wave, around 400,000 new cases were reported per day at the peak. To address these issues, we train and test the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and CNN+LSTM models on the normalized India time series data to capture the real trend, with the hyper parameters of Table 1 and Table 2 set by manual tuning. The predicted new COVID-19 cases for 7, 14, and 21 days are calculated from July 10, 2021 using the various recurrent and CNN models, and the corresponding performance metrics, RMSE and MAPE, are presented in Table 3. RMSE and MAPE are computed on the test data for the actual and predicted daily confirmed cases from June 21, 2021 to July 10, 2021. From Table 3, it can be seen that the RMSE and MAPE (7.57%–11.36%) are comparatively small for the stacked LSTM and hybrid CNN+LSTM. In some cases the RMSE and MAPE (7.34%–12.29%) are lower for BiLSTM and ED_LSTM on the test data, but their predicted new cases per day are far from the actual cases (Fig. 10); the BiLSTM and ED_LSTM models suffer from over-fitting. The predicted and actual (red colour) cases for India for 7 days (up to July 17, 2021), 14 days (up to July 24, 2021), and 21 days (up to July 31, 2021) are shown in Fig. 10. It can be observed that the stacked LSTM and hybrid CNN+LSTM provide better predictions, as their forecast case counts are close to the actual counts per day. The predicted new cases for 7, 14, and 21 days by the various models, along with 95% confidence intervals, are shown in Table 4. In our study, for India, we found that stacked LSTM and hybrid CNN+LSTM performed best in terms of prediction consistency among all six deep learning models.
Table 3. RMSE and MAPE (%) on the test data for the next 7, 14, and 21 days predictions.

| Country/States | Models | RMSE (7 days) | MAPE (7 days) | RMSE (14 days) | MAPE (14 days) | RMSE (21 days) | MAPE (21 days) |
|---|---|---|---|---|---|---|---|
| India | Vanilla LSTM | 9746.6 | 19.57 | 8454.28 | 17.01 | 11912.86 | 24.09 |
| | Stacked LSTM | 4687.34 | 8.68 | 5273.4 | 9.99 | 6023.1 | 11.36 |
| | BiLSTM | 3889.28 | 7.34 | 5349.35 | 10.3 | 6369 | 12.29 |
| | ED_LSTM | 4067.74 | 7.95 | 4385.09 | 8.1 | 4431.91 | 8.75 |
| | CNN | 6729.1 | 12.61 | 5320.62 | 10.89 | 4304.57 | 7.6 |
| | CNN+LSTM | 4831.59 | 9.07 | 4916.03 | 7.86 | 4431.45 | 7.57 |
| Maharashtra | Vanilla LSTM | 1455.74 | 14.63 | 2594.56 | 25.73 | 1075.67 | 10.69 |
| | Stacked LSTM | 1505.12 | 15.55 | 1046.71 | 10.21 | 1125.31 | 10.8 |
| | BiLSTM | 1038.16 | 9.95 | 1339.5 | 14.08 | 1041.19 | 10.24 |
| | ED_LSTM | 1889.96 | 19.36 | 1288.35 | 13.8 | 1363.88 | 14.56 |
| | CNN | 1871.02 | 19.54 | 1140.99 | 11.5 | 1232.49 | 12.65 |
| | CNN+LSTM | 1413.44 | 13.61 | 1231.39 | 11.83 | 1328.62 | 12.76 |
| Kerala | Vanilla LSTM | 3758.49 | 22.17 | 2219.32 | 13.78 | 1470.86 | 9.55 |
| | Stacked LSTM | 3747.23 | 23.48 | 2740.37 | 19.24 | 1801.96 | 13.2 |
| | BiLSTM | 2468.36 | 18.21 | 3015.41 | 21.99 | 1778.38 | 13.56 |
| | ED_LSTM | 2436.79 | 17.05 | 2700.32 | 19.71 | 2304.13 | 16.57 |
| | CNN | 1836.93 | 13.6 | 1839.97 | 13.58 | 2436.62 | 17.26 |
| | CNN+LSTM | 2641.06 | 19.17 | 1950.78 | 13.54 | 4276.85 | 28.01 |
| Karnataka | Vanilla LSTM | 948.58 | 27.13 | 461.05 | 14.14 | 460.7 | 13.72 |
| | Stacked LSTM | 616.77 | 19.12 | 482.77 | 13.43 | 876.11 | 25.35 |
| | BiLSTM | 702.86 | 21.95 | 632.44 | 19.64 | 549.77 | 17.01 |
| | ED_LSTM | 495.17 | 15.81 | 497.36 | 15.42 | 720.96 | 21.64 |
| | CNN | 539.86 | 17.15 | 767.04 | 22.86 | 767.9 | 22.64 |
| | CNN+LSTM | 659.5 | 21.37 | 513.75 | 15.96 | 590.59 | 18.37 |
| Tamil Nadu | Vanilla LSTM | 1096.36 | 25.25 | 1437.03 | 33.79 | 1571.95 | 36.87 |
| | Stacked LSTM | 1527.06 | 35.82 | 671.79 | 14.99 | 1253.73 | 29.34 |
| | BiLSTM | 1716.46 | 40.13 | 393.62 | 8 | 779.47 | 17.67 |
| | ED_LSTM | 700.86 | 14.66 | 768.71 | 15.97 | 826.78 | 17.26 |
| | CNN | 1516.43 | 35.44 | 1040.41 | 24.12 | 1134.7 | 26.52 |
| | CNN+LSTM | 594.12 | 11.8 | 823.92 | 15.37 | 1419.17 | 31.53 |
Table 4. Predicted daily confirmed cases with 95% confidence intervals for the next 7, 14, and 21 days predictions.

| Country/States | Models | Predicted on 17-07-2021 | 95% confidence interval | Predicted on 24-07-2021 | 95% confidence interval | Predicted on 31-07-2021 | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| India | Vanilla LSTM | 31759 | [29437, 30832] | 35003 | [30647, 32344] | 41310 | [26377, 30991] |
| | Stacked LSTM | 44617 | [38520, 42429] | 42595 | [36814, 39217] | 48593 | [36012, 39710] |
| | BiLSTM | 43071 | [39296, 41623] | 33485 | [31856, 33257] | 61464 | [39453, 46424] |
| | ED_LSTM | 55588 | [48027, 53046] | 72134 | [55317, 63396] | 156472 | [75427, 104118] |
| | CNN | 72572 | [59971, 68365] | 54217 | [43309, 48319] | 81682 | [56401, 65913] |
| | CNN+LSTM | 35746 | [36094, 37051] | 39876 | [40839, 41877] | 35664 | [37370, 38641] |
| Maharashtra | Vanilla LSTM | 11080 | [10708, 10971] | 17372 | [14738, 16075] | 13909 | [10731, 11978] |
| | Stacked LSTM | 6729 | [6812, 6656] | 12500 | [10553, 11445] | 15587 | [11992, 13394] |
| | BiLSTM | 9497 | [9303, 9429] | 6446 | [6696, 6974] | 9979 | [9229, 9532] |
| | ED_LSTM | 6109 | [6149, 6279] | 7278 | [7377, 7502] | 7254 | [7291, 7356] |
| | CNN | 6057 | [6131, 6332] | 9113 | [8616, 8847] | 8459 | [8199, 8295] |
| | CNN+LSTM | 10164 | [9945, 10115] | 9464 | [9225, 9343] | 10693 | [10213, 10461] |
| Kerala | Vanilla LSTM | 5296 | [5390, 6225] | 17549 | [11776, 14275] | 38564 | [23776, 29651] |
| | Stacked LSTM | 6866 | [6736, 7182] | 34136 | [23184, 28533] | 25395 | [17505, 20351] |
| | BiLSTM | 21663 | [17734, 20477] | 30926 | [21760, 25726] | 15940 | [14720, 15209] |
| | ED_LSTM | 10229 | [10010, 10313] | 25344 | [20536, 22902] | 11223 | [11206, 11425] |
| | CNN | 18715 | [16398, 17947] | 23336 | [18276, 20758] | 12219 | [11081, 11496] |
| | CNN+LSTM | 9224 | [9366, 9829] | 17342 | [14624, 15881] | 4495 | [4828, 5390] |
| Karnataka | Vanilla LSTM | 1156 | [1128, 1154] | 2537 | [2246, 2358] | 2627 | [2423, 2490] |
| | Stacked LSTM | 3688 | [3275, 3541] | 1581 | [1732, 1921] | 1107 | [1014, 1073] |
| | BiLSTM | 2525 | [2575, 2719] | 1611 | [1563, 1607] | 4128 | [2505, 3068] |
| | ED_LSTM | 3725 | [3095, 3509] | 5910 | [3130, 4307] | 1730 | [1315, 1438] |
| | CNN | 3966 | [3130, 3685] | 1959 | [1552, 1712] | 4246 | [2200, 2916] |
| | CNN+LSTM | 747 | [815, 1120] | 1480 | [1607, 1773] | 1762 | [2131, 2402] |
| Tamil Nadu | Vanilla LSTM | 2093 | [1096, 2048] | 980 | [996, 1115] | 2540 | [1443, 1810] |
| | Stacked LSTM | 1036 | [1055, 1196] | 1923 | [2042, 2209] | 551 | [814, 1063] |
| | BiLSTM | 832 | [895, 1067] | 3247 | [3358, 3493] | 977 | [1326, 1650] |
| | ED_LSTM | 6263 | [4765, 5754] | 7711 | [5521, 6578] | 13661 | [7504, 9848] |
| | CNN | 1078 | [1034, 1086] | 1698 | [1677, 1749] | 537 | [683, 908] |
| | CNN+LSTM | 3923 | [3618, 4018] | 2317 | [2485, 2664] | 278 | [826, 1232] |
3.3.2. Maharashtra
Maharashtra was one of the states worst affected by COVID-19 in India during the second wave. The new case count per day is depicted in Fig. 9(b), which shows that the number of daily cases reached nearly 70,000 in the second wave, with the outbreak scenario being highly dynamic. To capture this dynamic trend, we train and test the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models on the Maharashtra time series data with the hyper parameter settings illustrated in Table 1 and Table 2, and compute RMSE and MAPE on the test data, as presented in Table 3. Forecasts of new confirmed cases per day for 7 days (up to July 17, 2021), 14 days (up to July 24, 2021), and 21 days (up to July 31, 2021) from July 10, 2021 are shown in Table 4. Fig. 11 illustrates the predicted and actual cases for the deep learning models. In the 7 days prediction, the stacked LSTM (MAPE = 15.55%) and BiLSTM (MAPE = 9.95%) forecast values close to the actual values, whereas in the 14 days prediction the BiLSTM and ED_LSTM forecast cases close to the actual cases. Table 4 shows the 95% confidence intervals for the predicted confirmed cases per day up to July 31, 2021.
3.3.3. Kerala
We train and test the different recurrent and convolutional neural network models, namely vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and CNN+LSTM, on the Kerala COVID-19 data from Mar 14, 2020 to Jul 10, 2021 (Fig. 9(c)) with the hyper parameter settings of Table 1 and Table 2 to capture the trend of daily confirmed cases, and compute RMSE and MAPE (Table 3) on the test data (the last 20 days). The RMSE and MAPE (9.55%) values for vanilla LSTM in the 21 days prediction are the smallest among the six models on the test data. Using the different learning models, predictions for 7 days (up to July 17, 2021), 14 days (up to July 24, 2021), and 21 days (up to July 31, 2021) are given in Table 4, and their comparison is illustrated in Fig. 12. Due to the highly dynamic (zigzag) trend of the Kerala time series, its trend is difficult to capture. In the 7 and 14 days predictions, CNN+LSTM forecasts the confirmed cases per day close to the actual case counts, and in the 21 days prediction the stacked LSTM forecast is close to the actual values.
3.3.4. Karnataka
The time series data of Karnataka depicted in Fig. 9(d) shows the dynamic trend of the data during the first and second waves. To address these issues and capture the trend of new cases per day, the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models are trained and tested on the Karnataka data with the hyper parameters shown in Table 1 and Table 2. Predictions are then performed for 7 days (up to July 17, 2021), 14 days (up to July 24, 2021), and 21 days (up to July 31, 2021), as displayed in Table 4. Comparisons between the predicted and actual cases by the different models are illustrated in Fig. 13. In the 14 days prediction, stacked LSTM gives the lowest MAPE (13.43%) among the models and also predicts new cases per day close to the actual cases, whereas in the 7 days prediction the hybrid CNN+LSTM provides predicted cases per day close to the actual cases. The ED_LSTM performance is better in the 21 days prediction, but in the 14 days prediction its predicted cases are far from the actual cases, which may be because of over-fitting.
3.3.5. Tamil Nadu
The new case count per day in Tamil Nadu is depicted in Fig. 9(e), which shows that the number of daily cases reached nearly 35,000 in the second wave, with the outbreak scenario being inconsistent in nature. Forecasts of new confirmed cases per day for 7 days (up to July 17, 2021), 14 days (up to July 24, 2021), and 21 days (up to July 31, 2021) from July 10, 2021 are shown in Table 4. The comparison of predicted and actual cases per day for 7, 14, and 21 days using the deep learning models is illustrated in Fig. 14. All models except ED_LSTM are able to capture the declining cases in Tamil Nadu. In the 14 and 21 days predictions, the forecasts of the case counts per day by the vanilla LSTM, stacked LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models are close to the actual cases (Fig. 14).
4. Conclusion
The COVID-19 outbreak is a potential threat due to its dynamical behaviour, and more so in a country like India, which is very densely populated. Researchers are engaged in seeking new approaches to understand the COVID-19 dynamics that overcome the limitations of existing epidemiological models. In this study, we designed the vanilla LSTM, stacked LSTM, ED_LSTM, BiLSTM, CNN, and hybrid CNN+LSTM models to capture the complex dynamical trends of COVID-19 spread and performed forecasting of the COVID-19 confirmed cases for 7, 14, and 21 days for India and its four most affected states: Maharashtra, Kerala, Karnataka, and Tamil Nadu. These deep learning approaches overcome the problems of mathematical modelling that depend on various situational assumptions. The RMSE and MAPE errors on the testing data are computed to demonstrate the relative performance of the deep learning models; in some cases the MAPE and RMSE values reach below 7.50% and 400, respectively. The predicted COVID-19 confirmed cases for 7, 14, and 21 days for the whole of India and its states Maharashtra, Kerala, Karnataka, and Tamil Nadu, along with the confidence interval results, show that the daily confirmed cases predicted by most of the studied models are very close to the actual confirmed cases per day. The stacked LSTM and hybrid CNN+LSTM models perform better among the six models, and their predicted confirmed case values mostly fall within the confidence intervals. These accurate predictions can help government authorities, academicians, and planners to manage services, take decisions accordingly, and create more infrastructure if required. The designed models are applicable to other countries and regions as well.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Biographies
Dr. Hanuman Verma received his Ph.D. and M.Tech. degrees in Computer Science and Technology from Jawaharlal Nehru University, New Delhi, India. He did his M.Sc. in Mathematics & Statistics from Dr. R.M.L. Avadh University, Faizabad, Uttar Pradesh, India. Currently, he is working as an Assistant Professor at the Department of Mathematics, Bareilly College, Bareilly, Uttar Pradesh, India. He has published research papers in reputed international journals, including Elsevier, Wiley, World Scientific, and Springer. His primary research interests include machine learning, deep learning, medical image computing, and soft computing.
Dr. Saurav Mandal did his Bachelor of Technology from National Institute of Technology, Allahabad, India in the field of Information Technology. Then he pursued Master of Technology from Jawaharlal Nehru University, New Delhi, India in the field of Computational and Systems Biology. He holds a doctorate in the field of Computational Biology from Jawaharlal Nehru University, New Delhi, India. His major field of interest is in solving Computational Biology problems using machine learning and deep learning algorithms.
Dr. Akshansh Gupta is currently a scientist at CSIR-Central Electronics Engineering Research Institute, Pilani, Rajasthan. He worked as a DST-funded postdoctoral research fellow and principal investigator under the Cognitive Science Research Initiative (CSRI) scheme of the Department of Science and Technology (DST), Ministry of Science and Technology, Government of India, from 2016 to 2020 at the School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi. He has many publications, including in Springer, Elsevier, and IEEE Transactions. He received his master's and Ph.D. degrees from the School of Computer and Systems Sciences, JNU, in 2010 and 2015 respectively. His research interests include pattern recognition, machine learning, data mining, signal processing, brain–computer interfaces, cognitive science, and IoT. He is also working as Co-PI on a consultancy project named "Development of Machine Learning Algorithms for Automated Classification Based on Advanced Signal Decomposition of EEG Signals" under the ICPS Program, DST, Govt. of India.
References
- Abdollahi J., Irani A.J., Nouri-Moghaddam B. 2021. Modeling and forecasting spread of COVID-19 epidemic in Iran until sep 22, 2021, based on deep learning. arXiv preprint arXiv:2103.08178. [Google Scholar]
- Albawi S., Mohammed T.A., Al-Zawi S. Understanding of a convolutional neural network. In: 2017 International conference on engineering and technology. IEEE; 2017. pp. 1–6. [Google Scholar]
- Alzubaidi L., Zhang J., Humaidi A.J., Al-Dujaili A., Duan Y., Al-Shamma O., et al. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. Journal Of Big Data. 2021;8(1):1–74. doi: 10.1186/s40537-021-00444-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Amaral F., Casaca W., Oishi C.M., Cuminato J.A. Towards providing effective data-driven responses to predict the Covid-19 in São Paulo and Brazil. Sensors. 2021;21(2):540. doi: 10.3390/s21020540. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ardabili S.F., Mosavi A., Ghamisi P., Ferdinand F., Varkonyi-Koczy A.R., Reuter U., et al. Covid-19 outbreak prediction with machine learning. Algorithms. 2020;13(10):249. [Google Scholar]
- Arora P., Kumar H., Panigrahi B.K. Prediction and analysis of COVID-19 positive cases using deep learning models: A descriptive case study of India. Chaos, Solitons & Fractals. 2020;139 doi: 10.1016/j.chaos.2020.110017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bagal D.K., Rath A., Barua A., Patnaik D. Estimating the parameters of susceptible-infected-recovered model of COVID-19 cases in India during lockdown periods. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110154. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bedi P., Dhiman S., Gole P., Gupta N., Jindal V. Prediction of COVID-19 trend in India and its four worst-affected states using modified SEIRD and LSTM models. SN Computer Science. 2021;2(3):1–24. doi: 10.1007/s42979-021-00598-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brownlee J. Better deep learning: train faster, reduce overfitting, and make better predictions. Machine Learning Mastery; 2018a. [Google Scholar]
- Brownlee J. How to develop LSTM models for time series forecasting. Vol. 14. Machine Learning Mastery; 2018b. [Google Scholar]
- Chandra R., Jain A., Chauhan D.S. 2021. Deep learning via LSTM models for COVID-19 infection forecasting in India. arXiv preprint arXiv:2101.11881. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen Y.-C., Lu P.-E., Chang C.-S., Liu T.-H. A time-dependent SIR model for COVID-19 with undetectable infected persons. IEEE Transactions On Network Science And Engineering. 2020;7(4):3279–3294. doi: 10.1109/TNSE.2020.3024723. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chimmula V.K.R., Zhang L. Time series forecasting of COVID-19 transmission in Canada using LSTM networks. Chaos, Solitons & Fractals. 2020;135 doi: 10.1016/j.chaos.2020.109864. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chowell G. Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts. Infectious Disease Modelling. 2017;2(3):379–398. doi: 10.1016/j.idm.2017.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chowell G., Tariq A., Hyman J.M. A novel sub-epidemic modeling framework for short-term forecasting epidemic waves. BMC Medicine. 2019;17(1):1–18. doi: 10.1186/s12916-019-1406-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cooper I., Mondal A., Antonopoulos C.G. A SIR model assumption for the spread of COVID-19 in different communities. Chaos, Solitons & Fractals. 2020;139 doi: 10.1016/j.chaos.2020.110057. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dairi A., Harrou F., Zeroual A., Hittawe M.M., Sun Y. Comparative study of machine learning methods for COVID-19 transmission forecasting. Journal Of Biomedical Informatics. 2021;118 doi: 10.1016/j.jbi.2021.103791. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Devaraj J., Elavarasan R.M., Pugazhendhi R., Shafiullah G., Ganesan S., Jeysree A.K., et al. Forecasting of COVID-19 cases using deep learning models: Is it reliable and practically significant? Results In Physics. 2021;21 doi: 10.1016/j.rinp.2021.103817. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ellis M.J., Chinde V. An encoder–decoder LSTM-based EMPC framework applied to a building HVAC system. Chemical Engineering Research And Design. 2020;160:508–520. [Google Scholar]
- Greff K., Srivastava R.K., Koutník J., Steunebrink B.R., Schmidhuber J. LSTM: A search space odyssey. IEEE Transactions On Neural Networks And Learning Systems. 2016;28(10):2222–2232. doi: 10.1109/TNNLS.2016.2582924. [DOI] [PubMed] [Google Scholar]
- Gu J., Wang Z., Kuen J., Ma L., Shahroudy A., Shuai B., et al. Recent advances in convolutional neural networks. Pattern Recognition. 2018;77:354–377. [Google Scholar]
- Gupta S., Kapoor V. Sultan Chand & Sons AS Printing Press; India: 1994. Fundamental of mathematical statistics. [Google Scholar]
- Hochreiter S., Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735. [DOI] [PubMed] [Google Scholar]
- Huang C., Wang Y., Li X., Ren L., Zhao J., Hu Y., et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet. 2020;395(10223):497–506. doi: 10.1016/S0140-6736(20)30183-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Iqbal M., Al-Obeidat F., Maqbool F., Razzaq S., Anwar S., Tubaishat A., et al. COVID-19 patient count prediction using LSTM. IEEE Transactions On Computational Social Systems. 2021 doi: 10.1109/TCSS.2021.3056769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kao I.-F., Zhou Y., Chang L.-C., Chang F.-J. Exploring a long short-term memory based encoder-decoder framework for multi-step-ahead flood forecasting. Journal Of Hydrology. 2020;583 [Google Scholar]
- Kermack W.O., McKendrick A.G. A contribution to the mathematical theory of epidemics. Proceedings Of The Royal Society Of London. Series A, Containing Papers Of A Mathematical And Physical Character. 1927;115(772):700–721. [Google Scholar]
- LeCun Y., et al. Generalization and network design strategies. Connectionism In Perspective. 1989;19:143–155. [Google Scholar]
- Li T., Hua M., Wu X. A hybrid CNN-LSTM model for forecasting particulate matter (PM2. 5) IEEE Access. 2020;8:26933–26940. [Google Scholar]
- Liu T., Bao J., Wang J., Zhang Y. A hybrid CNN–LSTM algorithm for online defect recognition of CO2 welding. Sensors. 2018;18(12):4369. doi: 10.3390/s18124369. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nabi K.N., Tahmid M.T., Rafi A., Kader M.E., Haider M.A. Forecasting COVID-19 cases: A comparative analysis between recurrent and convolutional neural networks. Results In Physics. 2021;24 doi: 10.1016/j.rinp.2021.104137. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Richards F. A flexible growth function for empirical use. Journal Of Experimental Botany. 1959;10(2):290–301. [Google Scholar]
- Rumelhart D.E., Hinton G.E., Williams R.J. Learning representations by back-propagating errors. Nature. 1986;323(6088):533–536. [Google Scholar]
- Said A.B., Erradi A., Aly H.A., Mohamed A. Predicting COVID-19 cases using bidirectional LSTM on multivariate time series. Environmental Science And Pollution Research. 2021:1–10. doi: 10.1007/s11356-021-14286-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shahid F., Zameer A., Muneeb M. Predictions for COVID-19 with deep learning models of LSTM, GRU and Bi-LSTM. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110212. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shastri S., Singh K., Kumar S., Kour P., Mansotra V. Time series forecasting of Covid-19 using deep learning models: India-USA comparative case study. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110227. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sun L., Wang Y., He J., Li H., Peng D., Wang Y. A stacked LSTM for atrial fibrillation prediction based on multivariate ECGs. Health Information Science And Systems. 2020;8(1):1–7. doi: 10.1007/s13755-020-00103-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sutskever I., Vinyals O., Le Q.V. Advances in neural information processing systems. 2014. Sequence to sequence learning with neural networks; pp. 3104–3112. [Google Scholar]
- Turner M.E., Jr., Bradley E.L., Jr., Kirk K.A., Pruitt K.M. A theory of growth. Mathematical Biosciences. 1976;29(3–4):367–373. [Google Scholar]
- Van Houdt G., Mosquera C., Nápoles G. A review on the long short-term memory model. Artificial Intelligence Review. 2020;53(8):5929–5955. [Google Scholar]
- Verma H., Gupta A., Niranjan U. 2020. Analysis of COVID-19 cases in India through machine learning: A study of intervention. arXiv preprint arXiv:2008.10450. [Google Scholar]
- Wang P., Zheng X., Ai G., Liu D., Zhu B. Time series prediction for the epidemic trends of COVID-19 using the improved LSTM deep learning method: Case studies in Russia, Peru and Iran. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110214. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wieczorek M., Siłka J., Woźniak M. Neural network powered COVID-19 spread forecasting model. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110203. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu Y., Yuan M., Dong S., Lin L., Liu Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing. 2018;275:167–179. [Google Scholar]
- Yu Y., Si X., Hu C., Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation. 2019;31(7):1235–1270. doi: 10.1162/neco_a_01199. [DOI] [PubMed] [Google Scholar]
- Zeroual A., Harrou F., Dairi A., Sun Y. Deep learning methods for forecasting COVID-19 time-series data: A comparative study. Chaos, Solitons & Fractals. 2020;140 doi: 10.1016/j.chaos.2020.110121. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou, J., Hong, X., Su, F., & Zhao, G. (2016). Recurrent convolutional neural network regression for continuous pain intensity estimation in video. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 84–92).