Highlights
- Deep learning based time series forecasting and a comparative case study of Covid-19 confirmed and death cases in India and the USA.
- Recurrent neural network (RNN) based variants of long short term memory (LSTM) are used to design the proposed models.
- The Convolutional LSTM based model outperforms the other models, with high accuracy and very low error.
- One of the unique studies providing state-of-the-art results to help both countries reduce the impact of Covid-19.
Keywords: Recurrent neural networks, Time series, Covid-19, LSTM, Forecasting, Deep learning
Abstract
Covid-19 is a highly contagious disease that has almost frozen the world along with its economy. Its human-to-human and surface-to-human transmission has pushed the world into a catastrophic phase. In this study, our aim is to predict the future course of the novel Coronavirus in order to reduce its impact. We propose a deep learning based comparative analysis of Covid-19 cases in India and the USA. The datasets of confirmed and death cases of Covid-19 are taken into consideration. Recurrent neural network (RNN) based variants of long short term memory (LSTM), namely Stacked LSTM, Bi-directional LSTM and Convolutional LSTM, are used to design the proposed methodology and forecast Covid-19 cases one month ahead. Convolutional LSTM outperforms the other two models and predicts Covid-19 cases with high accuracy and very low error for all four datasets of both countries. The upward/downward trends of the forecasted Covid-19 cases are also visualized graphically, which should help researchers and policy makers mitigate mortality and morbidity by steering the course of Covid-19 in the right direction.
1. Introduction
The world has periodically faced severe epidemics and pandemics over the centuries. The history of epidemics goes back to the 14th century, when the Black Death killed tens of millions of people, or about 50% of the western European population, mainly targeting elderly adults and individuals exposed to psychological stressors [1]. The next epidemic, in the early 1500s, was among the worst demographic catastrophes for mankind: the smallpox epidemic of 1519 killed around 8 million people in Mexico. The same century brought further epidemics to the same country. The cocoliztli epidemic of 1545 led to the death of around 15 million people, around 80% of the native population of Mexico. Cocoliztli arose again in 1576 and killed around 2.5 million people, 50% of the remaining native population [2, 3]. The Spanish flu in the early 1900s and, later in the century, HIV/AIDS were also catastrophic pandemics that killed tens of millions of people [4, 5]. This trend of epidemics and pandemics continued up to another catastrophic year, 2020. This year has almost brought the world to a standstill with the outbreak of the novel Coronavirus (Covid-19) pandemic caused by SARS-CoV-2 [6]. The virus originated in Wuhan city, Hubei province, China, in December 2019 [7]. After the outbreak, the world came under the grip of Covid-19 as it escalated into a fatal pandemic. The first death was observed in early January, which pointed to human-to-human transmission through close or direct contact with an infected person [8]. According to Johns Hopkins University, a total of 12,745,734 confirmed cases of Covid-19 had been reported worldwide by 12th July 2020, with 556,036 deaths and 7,030,749 recoveries [9]. The United States became the most affected country in the world, with a 4.2% case-fatality rate and 41.20 deaths per 100k. India has a 2.7% Covid-19 case-fatality rate with 1.68 deaths per 100k [10]. The world continues to fight Covid-19 with various precautionary measures, but no vaccine for Covid-19 exists yet. Various studies have been published to advance vaccine development and to model diagnostic solutions mathematically. Some of them are discussed below.
Arora et al. [11] reported Covid-19 forecasts for all states of India by applying long short term memory (LSTM) models, predicting next-day and one-week Covid-19 cases with an error of 3%. Hariri et al. [12] used the COVID-19 Essential Supplies Forecasting Tool (COVID-ESFT) to forecast severe, critical and death cases of Covid-19 in northwest Syria. It also identified the time points at which health care system capacity will worsen. Abdulmajeed et al. [13] proposed an online forecasting methodology that takes real time Covid-19 data from the Nigeria Center for Disease Control. An ensemble model was designed to predict Covid-19 cases by combining ARIMA, an additive regression model (namely Prophet, developed by Facebook), and the Holt-Winters exponential smoothing method. Cheng et al. [14] developed a platform, icumonitoring.ch, to forecast ICU bed demand for individual hospitals at canton level in Switzerland. Two canton-level epidemic S-E-I-R models (the CZ and BT models) were used to derive projections of Covid-19 cases. Sohail et al. [15] designed a mathematical model of the interaction between the ACE-2 protein and the S protein amino acid sequence with the help of the Hill function. The model helps to estimate the time Covid-19 takes to infect a body cell and start spreading, which can also help to estimate the incubation period. Saba et al. [16] used AI and statistics based models to forecast the daily Covid-19 cases of Egypt. ARIMA and NARANN techniques were used to design the prediction models, which give a forecasting error of 5%. Pathan et al. [17] used an RNN based long short term memory (LSTM) model to forecast the mutation rate of Covid-19. The nucleotide mutation rate of the 400th patient is predicted with a root mean square error of 0.06 and 0.04 for training and testing respectively.
In this study, we propose deep learning based models to predict Covid-19 confirmed and death cases for India and the USA and present a comparative case analysis. We forecast Covid-19 cases one month ahead for both countries. We use recurrent neural network (RNN) based long short term memory (LSTM) variants to design the proposed models. Stacked LSTM, Bi-directional LSTM and Convolutional LSTM are used to derive the forecasts, and they are able to predict future values better than traditional LSTM models. To the best of our knowledge, no such comparative case study of India and the USA on the Covid-19 pandemic has been done before. Our contribution in this comparative study should help both countries rework their planning and preparation for Covid-19.
The rest of the paper is organized as follows: Section 2 describes the dataset formulation and explains the long short term memory (LSTM) variants used in this study. Section 3 presents the experiments with the proposed models and the environmental setup. Section 4 deals with the results and the selection of the best model, followed by discussion. Section 5 concludes the proposed comparative study and outlines future work.
2. Materials and methods
2.1. Data distribution
In this paper, datasets of Covid-19 confirmed and death cases in India and the USA are used. The datasets of India and the USA are taken from the Ministry of Health and Family Welfare, Government of India [18] and the Centers for Disease Control and Prevention, U.S. Department of Health and Human Services [19] respectively. The datasets used to carry out the experimental predictive analysis, with their date ranges by category, are shown in Table 1.
Table 1.
Country | Confirmed cases | Death cases |
---|---|---|
India | 7th Feb. – 7th July 2020 | 12th March – 7th July 2020 |
USA | 7th Feb. – 7th July 2020 | 26th Feb. – 7th July 2020 |
The graphical representation of confirmed and death cases of both countries shows an upward trend, as shown in Fig. 1 and Fig. 2. To visualize and compare the statistics of Covid-19 cases, we normalize our raw data using MinMaxScaler from the scikit-learn library in the Python programming language. Normalization is important here because the raw values are on inconsistent scales. The data cannot be compared graphically in its original form, so we scale it to a fixed range to make it clearly visible.
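The min-max normalization described above can be sketched in plain Python as follows. This is a minimal illustration of the rescaling that MinMaxScaler performs on a single feature, not the code used in the study; the function name `min_max_scale` and the sample values are hypothetical.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Rescale a 1-D series linearly so its minimum and maximum map to feature_range."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series carries no spread; map every value to the lower bound.
        return [lo for _ in values]
    return [lo + (hi - lo) * (v - v_min) / (v_max - v_min) for v in values]


# Hypothetical cumulative counts, used only to show the effect of scaling.
raw_counts = [10, 20, 30]
scaled_counts = min_max_scale(raw_counts)
```

After scaling, series of very different magnitudes share the same range and can be plotted on a common axis.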
2.2. Severity analysis
The States/Union Territories of both countries are grouped into stages of initial, moderate and severe Covid-19 cases. The comparative distribution of cases is shown in Fig. 3 [18, 19]. The seven moderate and the seven most severe States/Union Territories of both countries are compared separately in Fig. 4. California and Maharashtra are the worst affected States in the USA and India respectively. Grouping a country into stages supports accurate planning, so that high alert areas can be addressed and mitigated on an urgent basis.
2.3. Methods
The RNN is an integral part of deep learning methods and is used to find temporal correlations in time series prediction [20]. An RNN performs well if only recent information is required to decide the present state. It consists of hidden states that are distributed in temporal fashion and is able to predict future events with more accuracy than traditional exponential smoothing methods [21], [22], [23]. An RNN can retain a lot of previous information only for a short time, which makes training more complicated for certain applications. RNNs cannot overcome the vanishing gradient problem, and the next state depends only on the hidden-layer activation of the previous state, which is a further disadvantage [24]. Because RNNs were unable to model long term dependencies, the LSTM was introduced by Hochreiter et al. in 1997 [25].
Long Short Term Memory (LSTM) is a variant of the Recurrent Neural Network (RNN) that is used to overcome the limitations of the RNN. LSTMs are capable of learning long term dependencies by replacing the hidden layers of the RNN with memory cells, as shown in Fig. 5. Gate units, namely the input gate ($i_t$), output gate ($o_t$) and forget gate ($f_t$), together with the activation function, are used to model LSTMs and learn the behavior of temporal correlations. The working procedure of the LSTM cell is defined mathematically in Eqs. (1)–(5) below [26].
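$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{1}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{2}$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{3}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \tag{4}$$
$$h_t = o_t \tanh(c_t) \tag{5}$$
where σ is the logistic sigmoid function; i, f, c and o are the input gate, forget gate, memory cell and output gate respectively; $x_t$ is the input and $h_t$ the hidden state at time t; b denotes a bias term; and $W_{ci}$, $W_{cf}$ and $W_{co}$ are diagonal weight matrices from the memory cell to the gate units. In this paper, three variants of LSTM are used to carry out the experiments and are explained in the following sections.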
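To make the gate interactions concrete, the following NumPy sketch performs one LSTM cell step. It assumes the common peephole formulation, in which diagonal cell-to-gate weights are stored as vectors and applied element-wise. The function names, parameter dictionary keys and random initialization are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Create small random weights (hypothetical initialization for illustration)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p[f"W_h{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[f"b_{g}"] = np.zeros(n_hidden)
    for g in "ifo":
        # Diagonal cell-to-gate (peephole) matrices stored as vectors.
        p[f"w_c{g}"] = rng.normal(0.0, 0.1, n_hidden)
    return p


def lstm_step(x_t, h_prev, c_prev, p):
    """One time step: input gate, forget gate, cell update, output gate, hidden state."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["w_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["w_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["w_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


# Run a short, hypothetical normalized sequence through the cell.
params = init_params(n_in=1, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for value in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([value]), h, c, params)
```

The memory cell `c` is carried across steps and is only modified through the forget and input gates, which is what lets the cell retain information over longer spans than a plain RNN state.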
2.3.1. Stacked LSTM
Stacked LSTM, also known as a multilayer fully connected structure, is composed of multiple LSTM layers, resulting in a stack-like architecture as shown in Fig. 6. Combining multiple LSTM layers leads to greater model complexity and increased depth of the model [27]. Each intermediate LSTM layer outputs a sequence of vectors that is used as the input to the next LSTM layer. A stacked LSTM provides an output for each time step rather than a single output for all time steps [28].
For an unrolled stacked LSTM network, the Lth LSTM layer can be modeled mathematically as given in Eqs. (6)–(11) [26].
(6) |
(7) |
(8) |
(9) |
(10) |
(11) |
In this deep multilayer fully connected network, the output of the (L-1)th layer, $h_t^{L-1}$, in turn serves as the input to the intermediate Lth layer. Similarly, the output of the Lth layer is the input to the (L+1)th layer. This input-output interconnection is the only relation between two subsequent intermediate layers.
2.3.2. Bi-directional LSTM
Traditional RNNs can process information in a single direction only and pay no heed to future information. To overcome this limitation, the concept of the Bi-directional RNN was given by Schuster et al. in 1997 [29]. A Bi-directional RNN can process information in both directions, with separate hidden layers acting as forward layers and backward layers. Combining the Bi-directional RNN with the LSTM cell results in the Bi-directional LSTM (BD-LSTM), introduced by Graves et al. in 2005 [30]. The structural idea of the Bi-directional LSTM is to split the standard RNN into forward states and backward states, as shown in Fig. 7. The output of the forward states is not used as an input to the backward states, and vice versa. Forward layer connections are the same as in the Stacked LSTM outlined in the previous section. In BD-LSTM, the hidden layer sequences of the backward layer are computed iteratively from time t = T to 1. Mathematically, layer L of BD-LSTM can be expressed at time t as shown in Eqs. (12)–(18) [31].
(12) |
(13) |
(14) |
(15) |
(16) |
(17) |
The output of BD-LSTM is the cumulative result of the forward and backward layer outputs, $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ respectively.
Therefore,
(18) |
BD-LSTM trains the model using a forward pass, a backward pass and a weight update. In the forward pass, all inputs for time 1 ≤ t ≤ T are run through the BD-LSTM to find all predicted outputs: the forward states are computed from t = 1 to T, the backward states from t = T to 1, and then the forward pass is done for the output neurons. In the backward pass, the derivative of the objective function is found for the time interval 1 ≤ t ≤ T used in the forward pass; the backward pass is performed for the output neurons, then for the forward states from t = T to 1 and for the backward states from t = 1 to T. Finally, the weights are updated [32].
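The two directions of the forward pass can be sketched as below. To keep the example short, a plain tanh recurrent cell is swapped in for the LSTM cell, and the forward and backward states are combined by concatenation; both choices are assumptions made for illustration, as are all function and variable names.

```python
import numpy as np


def make_rnn_params(n_in, n_hidden, rng):
    """Random input weights, recurrent weights and bias for one direction (illustrative)."""
    return (rng.normal(0.0, 0.1, (n_hidden, n_in)),
            rng.normal(0.0, 0.1, (n_hidden, n_hidden)),
            np.zeros(n_hidden))


def rnn_pass(xs, W_x, W_h, b):
    """Run a simple tanh recurrence over xs (shape: T x n_in) in the given order."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)


def bidirectional_states(xs, fwd, bwd):
    """Forward states run t = 1..T; backward states run t = T..1 and are re-aligned."""
    h_fwd = rnn_pass(xs, *fwd)
    h_bwd = rnn_pass(xs[::-1], *bwd)[::-1]
    # Neither direction feeds the other; they only meet at the output.
    return np.concatenate([h_fwd, h_bwd], axis=1)


rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 1))          # hypothetical sequence with T = 5, one feature
fwd = make_rnn_params(1, 3, rng)
bwd = make_rnn_params(1, 3, rng)
states = bidirectional_states(xs, fwd, bwd)
```

At each time step the combined state therefore summarizes both the past (forward half) and the future (backward half) of the input sequence.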
2.3.3. Convolutional LSTM (ConvLSTM)
The standard fully connected long short term memory (FC-LSTM) is efficient in handling temporal correlations but contains a large amount of redundancy for spatial data. To overcome the limitations of FC-LSTM and predict spatiotemporal data, the convolutional LSTM (ConvLSTM) was proposed by Shi et al. in 2015 [33]. In ConvLSTM, the gates $i_t$, $o_t$, $f_t$, the inputs, the cell outputs and the hidden states are all 3D tensors whose last two dimensions are spatial dimensions. By using the convolution operator (*) in the state-to-state and input-to-state transitions, as shown in Fig. 8, it determines the future state of a cell, which in turn is determined by the inputs and past states of its local neighbors. The mathematical formulation of ConvLSTM is shown in Eqs. (19)–(23), where ‘*’ is the convolution operator and ‘ · ’ represents the Hadamard product [34, 35].
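$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \cdot C_{t-1} + b_i) \tag{19}$$
$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \cdot C_{t-1} + b_f) \tag{20}$$
$$C_t = f_t \cdot C_{t-1} + i_t \cdot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c) \tag{21}$$
$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \cdot C_t + b_o) \tag{22}$$
$$H_t = o_t \cdot \tanh(C_t) \tag{23}$$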
3. Experiment
The experiments are carried out in Google Colaboratory using Python 3.0 with open source libraries such as TensorFlow [36], Pandas [37], NumPy [38], and Keras [39]. The working environment is an Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz with 4 GB RAM under a 64-bit Windows 10 Pro operating system. Time series forecasting of the Covid-19 datasets is modeled using three variants of Recurrent Neural Networks (RNN): Stacked LSTM, Bi-directional LSTM and ConvLSTM. These models are used to learn the hidden behavior of the time series data and predict future values of Covid-19 cases. Historical datasets of Covid-19 confirmed and death cases, as given in Table 1, are fed to the models. The hyper-parameters of all the models are selected and tuned rigorously. The working and parameter selection of all three variants are discussed in the following sections. The framework of the proposed methodology is shown in Fig. 9. The trained models of all three RNN variants are used to forecast one month of future values of Covid-19 cases for both countries. The best of the three models is chosen by comparing error (MAPE) values and the accuracy attained.
3.1. Stacked LSTM model
Stacked LSTM differs from simple RNN models in model depth and complexity. Before working with the designed model, we make our input suitable for it by considering 3 lag steps and a single feature. We divide our input data into an 80% training set and a 20% testing set and normalize it using MinMaxScaler. A total of four datasets are used to predict Covid-19 cases with the stacked LSTM model. The Covid-19 confirmed and death cases of India and the USA are given to the model to forecast future sequences, and comparative results are derived in the next section. We design the model as a two-layer stacked LSTM with 100 hidden neurons in each layer. Return sequences is set to true so that each time step of the input data produces an LSTM output. One dense (fully connected) layer is added to the model, connecting each neuron to the neurons of the next layer. The ReLU activation function is used to address the exploding or vanishing gradient problem of standard RNNs. Adam is used as the optimizer and mean square error as the loss function to evaluate the model. We train our model for 500 epochs with validation split equal to 0.2 and verbose equal to 1. The results of stacked LSTM for both countries are summarized comparatively in Section 4.
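The data preparation described above (a chronological 80/20 split followed by windowing with 3 lag steps) can be sketched in plain Python as follows. The helper names `chronological_split` and `split_sequence` and the sample series are hypothetical, and the exact preprocessing order used in the study may differ.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series into leading training and trailing testing parts."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def split_sequence(series, n_steps=3):
    """Turn a 1-D series into (lag window, next value) pairs with n_steps lags."""
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(list(series[i:i + n_steps]))  # the n_steps previous observations
        y.append(series[i + n_steps])          # the value to be predicted
    return X, y


# Hypothetical cumulative counts, used only to show the resulting shapes.
series = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
train, test = chronological_split(series)
X_train, y_train = split_sequence(train, n_steps=3)
```

Each training sample is a window of three consecutive observations paired with the value that follows it. For an LSTM layer, each window would additionally be reshaped to (time steps, features) = (3, 1).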
3.2. Bi-directional LSTM model
Bi-directional LSTM is an extended version of the unidirectional or traditional LSTM network, because it works in both directions (forward and backward states). As with stacked LSTM, we feed the input to our model in the correct dimensions: it takes a 3D input of the form (samples, time steps, features). It trains two LSTMs instead of one on the input sequence: the first on the input sequence as given and the second on the reversed input sequence. For BD-LSTM we split our dataset into 80% training and 20% testing sets, and MinMaxScaler is used to normalize the data. The experimental setup of BD-LSTM is also carried out with the four datasets mentioned for stacked LSTM. The BD-LSTM model is designed using a single hidden layer of 100 neurons with the ReLU activation function to deal with the vanishing gradient. The BD-LSTM wrapper structure is passed through a dense layer to obtain the forecasted values. We train our model for 500 epochs with validation split equal to 0.2 and verbose equal to 2. Finally, the output of the BD-LSTM is evaluated by taking the cumulative results of the forward and backward layers, as shown mathematically by Eq. (18) in Section 2.3.2. The Adam optimizer is used along with mean square error as the loss function. The results of BD-LSTM for both countries are summarized comparatively in Section 4.
3.3. Convolutional LSTM model
Convolutional LSTM is an extension of the simple LSTM model that is able to read 2D spatial-temporal data. It replaces the internal matrix multiplication with a convolution operation. In this experiment, we first split our dataset into 80% training and 20% testing sets and then normalize it using MinMaxScaler. The input data to the model is divided using the split sequence() function to bring it into the correct form. Specifically, the model takes the input data in the form (samples, time_steps, rows, columns, features). We use a single ConvLSTM2D layer to design our model, with 64 filters and a kernel size of (1, 2), meaning one row and two columns. The ReLU activation function is used to counter the vanishing gradient. The output of the model is flattened and connected to a dense layer before it is used for prediction. The Adam optimizer and mean square error loss are used to compile the model before evaluating the model summary. We train the model for 500 epochs with validation split 0.2 and verbose 2. The output of the experiment is discussed in the next section.
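The five-dimensional input layout mentioned above can be illustrated with a NumPy reshape. The sketch assumes each lag window of 4 values is cut into 2 subsequences of 2 columns with a single row, which matches a kernel of one row and two columns; the window length, the subsequence count and the helper name `to_convlstm_input` are assumptions, since the text does not state them.

```python
import numpy as np


def to_convlstm_input(windows, n_seq, n_cols, n_features=1):
    """Reshape lag windows to (samples, time_steps, rows, columns, features) with rows = 1."""
    windows = np.asarray(windows, dtype=float)
    n_samples, n_steps = windows.shape
    if n_steps != n_seq * n_cols:
        raise ValueError("window length must equal n_seq * n_cols")
    return windows.reshape(n_samples, n_seq, 1, n_cols, n_features)


# Hypothetical lag windows of length 4 (values are illustrative only).
windows = [[10, 20, 30, 40],
           [20, 30, 40, 50],
           [30, 40, 50, 60]]
x = to_convlstm_input(windows, n_seq=2, n_cols=2)
```

Each sample is then read as two time steps, each holding a 1 x 2 "image" with one feature channel, over which the (1, 2) kernel slides.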
4. Results & discussion
The results of time series forecasting for both countries are compared in terms of graphs and different classification metrics. The forecasted Covid-19 confirmed and death cases of both countries are compared graphically in Sections 4.1 and 4.2. The comparison is made across Stacked, Bi-directional and Convolutional LSTMs.
4.1. Covid-19 confirmed cases prediction
Fig. 10(a) and (b) show the actual and forecasted time series data of India and the USA using Stacked LSTM. Fig. 10(c) and (d) show the forecasted results using Bi-directional LSTM. Fig. 10(e) and (f) show the actual and predicted time series trends using Convolutional LSTM. The actual (solid blue line) and forecasted (solid red line) data can be compared visually in Fig. 10, and a sudden change in the forecasted data (red dotted line) is also seen in the case of ConvLSTM.
4.2. Covid-19 death cases prediction
Fig. 11(a) and (b) show the forecasted (solid red line) death cases of both countries using Stacked LSTM. Fig. 11(c) and (d) show the predictions using BD-LSTM, and Fig. 11(e) and (f) show the ConvLSTM forecasts, which indicate a significant upward trend in the number of death cases for the next month. In Figs. 10 and 11, the graphs are titled with the country name and type of Covid-19 cases. The best of the three models is chosen by comparing the classification metrics of all LSTM variants in Table 2, Table 3 and Table 4.
Table 2.
Models | India confirmed: Accuracy (%) | India confirmed: Precision (%) | India confirmed: Recall (%) | India confirmed: F-measure | USA confirmed: Accuracy (%) | USA confirmed: Precision (%) | USA confirmed: Recall (%) | USA confirmed: F-measure |
---|---|---|---|---|---|---|---|---|
Stacked LSTM | 96.00 | 86.06 | 100 | 0.92 | 90.00 | 100 | 44.44 | 0.61 |
Bi-directional LSTM | 96.60 | 91.60 | 100 | 0.95 | 93.33 | 100 | 66.66 | 0.80 |
ConvLSTM | 97.82 | 100 | 75.00 | 0.85 | 98.00 | 100 | 88.00 | 0.94 |
Table 3.
Models | India death: Accuracy (%) | India death: Precision (%) | India death: Recall (%) | India death: F-measure | USA death: Accuracy (%) | USA death: Precision (%) | USA death: Recall (%) | USA death: F-measure |
---|---|---|---|---|---|---|---|---|
Stacked LSTM | 93.33 | 84.61 | 100 | 0.91 | 92.00 | 100 | 69.23 | 0.81 |
Bi-directional LSTM | 96.00 | 75.00 | 100 | 0.85 | 94.00 | 100 | 75.00 | 0.85 |
ConvLSTM | 96.66 | 100 | 80.00 | 0.88 | 97.50 | 87.50 | 100 | 0.93 |
Table 4.
Models | India confirmed cases: MAPE (%) | USA confirmed cases: MAPE (%) | India death cases: MAPE (%) | USA death cases: MAPE (%) |
---|---|---|---|---|
Stacked LSTM | 4.00 | 10.00 | 6.66 | 8.00 |
Bi-directional LSTM | 3.33 | 6.66 | 4.00 | 6.00 |
ConvLSTM | 2.17 | 2.00 | 3.33 | 2.50 |
Analysis of the graphs and classification metrics shows that the ConvLSTM model outperformed the other two models. The predicted trends and results of ConvLSTM closely match the real case scenarios of both countries. According to our forecasts, between the end of July and mid August 2020, cumulative confirmed Covid-19 cases in India could reach 12,82,346, up from 7,42,417 as on 7th July 2020, and cumulative deaths in India could reach 24,333, up from 20,655 as on 7th July 2020 [18]. For the USA over the same period, cumulative confirmed cases could reach 5,82,44,656, up from 3,68,80,242 as on 7th July 2020, and cumulative deaths could reach 1,71,806, up from 1,23,826 as on 7th July 2020 [19]. The official Covid-19 website of the USA [19] also forecasted 1,40,000 to 1,60,000 deaths due to Covid-19 by 25th July, which is close to our predicted death cases. The best proposed model achieves accuracies of 97.82%, 98%, 96.66% and 97.50% for India confirmed cases, USA confirmed cases, India death cases and USA death cases respectively. The Mean Absolute Percentage Error (MAPE) of all models is also compared in Table 4, which shows very low error in our forecasted data. Stacked and Bi-directional LSTMs also show a constant trend for death cases of the USA, but ConvLSTM provides better results.
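The model selection by MAPE can be sketched as follows. The `mape` function uses the standard definition of mean absolute percentage error, and the dictionary reproduces the values reported in Table 4; ranking by the average MAPE across the four datasets is an assumption made for illustration, and the helper names are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def best_model(scores):
    """Return the model name with the lowest average MAPE across datasets."""
    return min(scores, key=lambda name: sum(scores[name]) / len(scores[name]))


# MAPE values from Table 4, ordered as: India confirmed, USA confirmed,
# India death, USA death.
table4_mape = {
    "Stacked LSTM": [4.00, 10.00, 6.66, 8.00],
    "Bi-directional LSTM": [3.33, 6.66, 4.00, 6.00],
    "ConvLSTM": [2.17, 2.00, 3.33, 2.50],
}
selected = best_model(table4_mape)
```

With the Table 4 values, ConvLSTM has the lowest error on every dataset, so it is selected under any reasonable aggregation.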
4.3. Discussion and suggestions
According to our study, the number of Covid-19 confirmed and death cases in both countries will increase over the next month. Necessary steps must be taken to mitigate this devastating pandemic. A solution such as complete lockdown cannot work for months in any country, because economic crises may arise. Public awareness is an important and key step in this regard, as it could help to minimize unnecessary contact tracing and quarantine arrangements. India has become the third worst hit country in Covid-19 cases after the USA and Brazil, which is a warning in itself to fight Covid-19 with full capacity. India is the largest democratic nation in the world, with a population of around 1.3 billion, and the USA is the oldest democracy, with a population of around 331 million; it is a matter of great concern that both countries are becoming prime targets of this pandemic. Covid-19 can very easily attain exponential growth in densely populated areas. The important ongoing achievements of both countries regarding Covid-19 need continuous improvement to ensure the defeat of this unseen enemy. Some of them are discussed below.
- I. Vaccination development
WHO reported on 6th July 2020 that there are 21 leading vaccine candidates in clinical trials, at different phases of clinical evaluation [40]. Both countries have a remarkable presence in vaccine development through clinical trials.
In India, Bharat Biotech, in collaboration with the Indian Council of Medical Research (ICMR) and the National Institute of Virology (NIV), developed COVAXIN™, the first indigenous Covid-19 vaccine [41]. Bharat Biotech submitted its preclinical results, after which the Drug Controller General of India approved the vaccine for phase 1 and 2 human clinical trials. Codagenix with the Serum Institute of India, and Indian Immunologicals Ltd. in collaboration with Griffith University, Australia, are also at the preclinical evaluation stage, using codon deoptimized live attenuated vaccines to prevent Covid-19.
In the USA, the National Institute of Allergy and Infectious Diseases (NIAID) [42] is conducting clinical trials on thousands of volunteers with a lipid nanoparticle (LNP) encapsulated mRNA formulation. It is in the second phase of clinical evaluation and could prove to be a panacea against the Covid-19 pandemic if its successive trials succeed. BioNTech/Fosun is another developer in the second phase of clinical evaluation, using 3 LNP-mRNAs against Covid-19. Massachusetts Eye and Ear Hospital, located in the United States, is in pre-clinical trials of a non-replicating adeno-associated virus vector (AAVCOVID).
- II. Increasing recovery rate
In India, total Covid-19 cases (as on 10th July 2020) are 7,93,802, out of which 4,95,515 have recovered, giving a recovery rate of 62.42 percent [18]. The recovery rate of India has shown an upward trend since March, when the first recovery from Covid-19 was observed. There are around 2,76,682 active cases, which can recover in a shorter time frame if the country is able to slow the spread of Covid-19.
In the USA, recovered cases are 1,426,428 out of 3,219,999 total cases (as on 10th July 2020), giving a recovery rate of 44.29 percent [19]. Around 1,657,749 active cases will take time to recover at such a slow recovery rate. Slowing the spread of the pandemic and speeding up recovery can turn the tables for both countries and bring their economies back on track.
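The recovery rates quoted above follow from dividing recovered cases by total cases. A minimal Python check of that arithmetic, using the figures from the text (the function name `recovery_rate` is illustrative):

```python
def recovery_rate(recovered, total_cases):
    """Recovered cases as a percentage of total reported cases."""
    return 100.0 * recovered / total_cases


# Figures as on 10th July 2020, taken from the text.
india_rate = recovery_rate(495_515, 793_802)
usa_rate = recovery_rate(1_426_428, 3_219_999)
```

The India figure rounds to 62.42 percent. The USA figure comes to just under 44.30 percent, which corresponds to the reported 44.29 percent when truncated rather than rounded.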
- III. Real time Covid-19 testing
Effective testing is a prime part of the diagnosis of Covid-19. Real time testing can lead to early detection and prevention of the disease. Reverse transcription-polymerase chain reaction (RT-PCR) is one such real time testing technique, which detects viral RNA in the body. The USA based company Applied Bio-Systems created a testing kit, TaqMan 2019-nCoV Control Kit v1, that is being used for real time PCR. ICMR in India also evaluated this product and found it satisfactory for use. Apart from this, ICMR approved 150 real time testing kits for Covid-19 from Indian manufacturers and other countries. LyteStar 2019-nCoV RT PCR Kit 1.0 and PATHOFIND COVID-19 Realtime RTPCR Kit are 2 of the 23 Indian manufactured real time testing kits [41]. The US Food and Drug Administration also approved 171 tests, including 142 molecular, 27 antibody and 2 antigen tests [43]. The IgG ELISA and CLIA testing kits are used in serosurveys to calculate the proportion of the population exposed to Covid-19 infection. By considering the extent of seroprevalence of Covid-19 infection, suitable public health policies can be implemented to detect and prevent Covid-19. With these testing kits, health care workers can also survey containment zones for past Covid-19 cases. Euroimmun Anti-SARS-COV-2 IgG ELISA and Erbalisa COVID-19 IgG ELISA are IgG ELISA testing kits of the USA, while COVID Kavach IgG ELISA and KAVACH Karwa SARS-COV 2 IgG ELISA are IgG ELISA and CLIA testing kits of India. ICMR approved all four kits and found them suitable for testing. The fastest possible testing is also an important factor in increasing the recovery rate and hence decreasing mortality.
- IV. Airborne spread of Covid-19
Covid-19 is emerging as an airborne disease and poses a new threat to the world, as various recent studies have provided evidence of it. WHO also takes into consideration the possibility of aerosol transmission as a mode of transmission of Covid-19. The current guidelines of WHO and other international bodies, such as staying at home, washing hands frequently and social distancing, are mandatory but not sufficient to mitigate the airborne spread of Covid-19. Wearing face masks can prevent aerosol transmission by blocking inhalation. Avoiding public gatherings must be mandatory, and effective ventilation with high efficiency air filtration systems should be provided in schools, hospitals, universities and other important places. Aerosol transmission is not yet universally accepted, but some researchers have found sufficient evidence of it [44]. Therefore, it is a matter of great concern to take precautionary measures before getting trapped in a more precarious phase of the pandemic.
5. Conclusion and future work
The novel coronavirus (Covid-19) has almost stopped the world due to its devastating pandemic nature. The world has to learn to live with it and make Covid-19 precautions a part of life, because this pandemic may persist for months to come and all essential and non-essential activities cannot be locked down for so long.
In this study, we used deep learning models to forecast Covid-19 for India and the USA comparatively. The Covid-19 confirmed and death cases of both countries are taken into consideration. The limited amount of Covid-19 data is a challenging factor for time series forecasting. The Long Short Term Memory (LSTM) cell, an extension of the Recurrent Neural Network (RNN), and its variants Stacked LSTM, Bi-directional LSTM and Convolutional LSTM are used to model the predictions. The best model is chosen on the basis of the error rate in predictions, calculated using Mean Absolute Percentage Error (MAPE). The Convolutional LSTM model performs better than the other two models, with an error rate ranging from 2.0 to 3.3 percent across all four datasets. The Stacked LSTM model performs worst in our experimental comparative study. According to our forecasted results, the confirmed and death cases for both countries will rise over the next month. This study is based entirely on statistical data and methodology. The reliability of our best proposed model, which gives state-of-the-art results, is satisfactory against the real time Covid-19 data.
To sum up, this is the first comparative case study of India and the USA to shed light on factors affecting Covid-19 and predict the future timeline of both countries. Our study should help both countries take all precautionary measures before being overwhelmed by the Covid-19 pandemic. In the future, the total economic loss in different sectors at the end of Covid-19 can be studied and a proper recovery plan made, which can help the countries restore their economies. We also plan to forecast possible Covid-19 cases for other countries and to verify aerosol transmission of Covid-19.
Funding information
This research has no funding.
Declaration of Competing Interest
None declared.
Acknowledgement
This research work is dedicated to Covid-19 frontline workers.
References
- 1.DeWitte S.N. Mortality risk and survival in the aftermath of the medieval black death. PLoS ONE. 2014;9(5):e96513. doi: 10.1371/JOURNAL.PONE.0096513.
- 2.Acuna-Soto R., Stahle D.W., Cleaveland M.K., Therrell M.D. Megadrought and megadeath in 16th century Mexico. Emerg Infect Dis. 2002;8(4):360–362. doi: 10.3201/eid0804.010175.
- 3.Kahn C. History of smallpox and its prevention. Arch Pediatr Adolesc Med. 1963;106(6):597. doi: 10.1001/archpedi.1963.02080050599011.
- 4.Shinde G.R., Kalamkar A.B., Mahalle P.N., Dey N., Chaki J., Hassanien A.E. Forecasting models for coronavirus disease (COVID-19): a survey of the state-of-the-art. SN Comp Sci. 2020;1:197. doi: 10.1007/s42979-020-00209-9.
- 5.McKibbin, W.J., Sidorenko, A.A. Global macroeconomic consequences of pandemic influenza. Analysis. Sydney, Australia: Lowy Institute for International Policy; 2006.
- 6.Salim, N., Chan, W.H., Mansor, S., Bazin, N.E.N., Amaran, S., Faudzi, A.A.M., et al. COVID-19 epidemic in Malaysia: impact of lockdown on infection dynamics. medRxiv preprint 2020 10.1101/2020.04.08.20057463.
- 7.Huang C., Wang Y., Li X., Ren L., Zhao J., Hu Y. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet North Am Ed. 2020;395(10223):497–506. doi: 10.1016/S0140-6736(20)30183-5.
- 8.Wang C., Horby P.W., Hayden F.G., Gao G.F. A novel coronavirus outbreak of global health concern. Lancet North Am Ed. 2020;395(10223):470–473. doi: 10.1016/S0140-6736(20)30185-9.
- 9.Johns Hopkins University, Johns Hopkins coronavirus resource center. https://coronavirus.jhu.edu/map.html. (Accessed on 12th of July 2020).
- 10.Johns Hopkins coronavirus resource center, mortality analyses https://coronavirus.jhu.edu/data/mortality. (Accessed on 12th of July 2020).
- 11.Arora, P., Kumar, H., Panigrahi, B.K. Prediction and analysis of COVID-19 positive cases using deep learning models: a descriptive case study of India. Chaos, Solitons and Fractals 2020 10.1016/j.chaos.2020.110017.
- 12.Hariri, M., Obaid, W., Rihawi, H., Safadi, S., McGlasson, M.A. The Covid-19 Forecast in Northwest Syria. medRxiv preprint 2020 10.1101/2020.05.07.20085365.
- 13.Abdulmajeed K., Adeleke M., Popoola L. Online forecasting of Covid-19 cases in Nigeria using limited data. Data Brief. 2020;30:105683. doi: 10.1016/j.dib.2020.105683.
- 14.Cheng Z., Burcu T., Nicola G.C., Perdo D.W.G., Matthias P.H. ICUmonitoring.ch: a platform for short-term forecasting of intensive care unit occupancy during the COVID-19 epidemic in Switzerland. Swiss Med Wkly. 2020;150:w20277. doi: 10.4414/smw.2020.20277.
- 15.Sohail A., Nutini A. Forecasting the timeframe of 2019-nCoV and human cells interaction with reverse engineering. Prog Biophys Mol Biol. 2020;155:29–35. doi: 10.1016/j.pbiomolbio.2020.04.002.
- 16.Saba A.I., Elsheikh A.H. Forecasting the prevalence of COVID-19 outbreak in Egypt using nonlinear autoregressive artificial neural networks. Process Saf Environ Prot. 2020;141:1–8. doi: 10.1016/j.psep.2020.05.029.
- 17.Pathan R.K., Biswas M., Khandaker M.U. Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model. Chaos Solitons Fractals. 2020;138:110018. doi: 10.1016/j.chaos.2020.110018.
- 18.Ministry of Health and Family Welfare (GOI) - https://www.mohfw.gov.in/ (Accessed on July 2nd 2020).
- 19.Centers for Disease Control and Prevention (US DoHHS) - https://www.cdc.gov/ (Accessed on July 2nd 2020).
- 20.Sherstinsky A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys D. 2020 doi: 10.1016/j.physd.2019.132306.
- 21.Singh K., Shastri S., Bhadwal A.S., Kour P. Implementation of exponential smoothing for forecasting time series data. Int J Sci Res Comput Sci Appl Manage Stud. 2019 2319 – 1953.
- 22.Zhao, Z., Nehil-Puleoa, K., Zhao, Y. How well can we forecast the COVID-19 pandemic with curve fitting and recurrent neural networks?. medRxiv preprint 2020 10.1101/2020.05.14.20102541.
- 23.Shastri S., Sharma A., Mansotra V. A model for forecasting tourists arrival in J&K, India. Int J Comput Appl. 2015;129(15) 0975 – 8887.
- 24.Fakhfakh M., Bouaziz B., Gargouri F., Chaari L. ProgNet: Covid-19 prognosis using recurrent and convolutional neural networks. medRxiv preprint. 2020 doi: 10.1101/2020.05.06.20092874.
- 25.Hochreiter S., Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735.
- 26.Yu Y., Xi S., Hu C., Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019;31:1235–1270. doi: 10.1162/neco_a_01199.
- 27.Kuo C.E., Chen G.T. Automatic sleep staging based on a hybrid stacked LSTM neural network: verification using large-scale dataset. IEEE Access. 2020;8:111837–111849. doi: 10.1109/ACCESS.2020.3002548.
- 28.Jahromi A.N., Hashemi S., Dehghantanha A., Parizi R.M. An enhanced stacked LSTM method with no random initialization for malware threat hunting in safety and time-critical systems. IEEE Trans Emerg Top Comput Intell. 2020;(1-11) doi: 10.1109/TETCI.2019.2910243. 2471-285X.
- 29.Schuster M., Paliwal K.K. Bidirectional recurrent neural networks. IEEE Trans Signal Process. 1997;45(11) doi: 10.1109/78.650093.
- 30.Graves A., Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005;18(5–6):602–610. doi: 10.1016/j.neunet.2005.06.042.
- 31.Graves, A., Mohamed, A., Hinton, G. Speech recognition with deep recurrent neural networks. IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC 2013;6645-6649 10.1109/ICASSP.2013.6638947.
- 32.Xia T., Song Y., Zheng Y., Pan E., Xi L. An ensemble framework based on convolutional bi-directional LSTM with multiple time windows for remaining useful life estimation. Comput Ind. 2020;115 doi: 10.1016/j.compind.2019.103182.
- 33.Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C. Convolutional LSTM network: a machine learning approach for precipitation nowcasting. arXiv 2015;1506.04214v2.
- 34.Sun Y., Lo F.P.-W., Lo B. EEG-based user identification system using 1D-convolutional long short-term memory neural networks. Expert Syst Appl. 2019;125:259–267. doi: 10.1016/j.eswa.2019.01.080.
- 35.Agethen S., Hsu W.H. Deep multi-kernel convolutional LSTM networks and an attention-based mechanism for videos. IEEE Trans Multimedia. 2020;22(3):819–829. doi: 10.1109/TMM.2019.2932564.
- 36.Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J. et al. TensorFlow: a system for large-scale machine learning, in: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 2016;265–283.
- 37.McKinney W. Data structures for statistical computing in Python. In: Proceedings of the 9th Python in Science Conference. Austin, TX. Vol. 445. 2010; pp. 51–56.
- 38.Oliphant T.E. A guide to NumPy, Vol. 1. Trelgol Publishing USA; 2006.
- 39.Chollet, F., et al., Keras, https://github.com/keras-team/keras (Accessed on 7th July 2020).
- 40.Covid-19 vaccine candidates, World Health Organisation - https://www.who.int/publications/m/item/draft-landscape-of-covid-19-candidate-vaccines (Accessed on 11 July).
- 41.Indian Council of Medical Research (ICMR) - https://www.icmr.gov.in/ (Accessed on 9th July 2020).
- 42.National Institutes of Health, Department of Health and Human Services, U.S- https://www.nih.gov/coronavirus (Accessed on July 9th 2020).
- 43.The U.S. Food and Drug Administration - https://www.fda.gov/ (Accessed on 9th July 2020).
- 44.Morawska L., Milton D.K. It is time to address airborne transmission of COVID-19. Clin Infect Dis. 2020:ciaa939. doi: 10.1093/cid/ciaa939.