Forecasting outbreak of COVID-19 in Turkey; Comparison of Box–Jenkins, Brown’s exponential smoothing and long short-term memory models

Didem Guleryuz

doi:10.1016/j.psep.2021.03.032

2021 Mar 22;149:927–935.


PMCID: PMC7983456 PMID: 33776248


Keywords: Box-Jenkins method, Brown’s exponential smoothing model, LSTM, COVID-19 forecasting

Abstract

The novel coronavirus disease (COVID-19), which first appeared in China in December 2019, has spread throughout the world. Because the epidemic started later in Turkey than in other European countries, Turkey has the lowest number of deaths among them according to the current data. Outbreak management for COVID-19 is of great importance for public safety and public health, and prediction models can inform the precautionary measures needed to control the spread of the disease. This study therefore aims to develop a forecasting model based on statistical data for Turkey. Box-Jenkins methods (ARIMA), Brown’s exponential smoothing model and RNN-LSTM are employed. ARIMA models were selected, with the lowest AIC values (12.0342, −2.51411, 12.0253, 3.67729, −4.24405, and 3.66077), as the best fit for the number of total cases, the growth rate of total cases, the number of new cases, the number of total deaths, the growth rate of total deaths and the number of new deaths, respectively. The forecast values of each indicator are stable over time: the number of cases in Turkey is not expected to show an increasing trend in the near future, the pandemic is expected to reach a steady state, and no increase in mortality rates is expected between 17 and 31 May. ARIMA models can be used in new outbreak situations to help ensure health and safety. Quick and accurate decisions on precautions are vital for epidemic preparedness and management, so corrective and preventive actions can be updated in light of the forecast values.

1. Introduction

Coronavirus disease (COVID-19), caused by the highly transmissible SARS-CoV-2 virus, leads to acute respiratory damage and severe consequences, and was declared a pandemic by the World Health Organization (WHO). Shortly after the disease was first seen in the Wuhan region of China, according to WHO data, it spread to millions of people around the world and became a global threat (Ndaïrou et al., 2020). Although there has been a significant number of deaths worldwide due to COVID-19, North America and Europe were deemed to be more affected than other continents. According to the WHO update of May 16, 2020, Turkey ranks 9th in the world and 6th in Europe, with 146,457 confirmed total cases (WHO, 2020). In the literature, mathematical models have become more prominent for estimating case numbers in epidemic diseases, as they can inform measures to prevent spread. In addition, since the epidemic is a global problem, analyzing the pandemic process and predicting critical data has become an important research area.

Since data sets of epidemic diseases form a time series, the spread of diseases can be monitored with forecasting methods. In the literature, prediction models have been developed from disease time series data using the Box-Jenkins Method (Liu et al., 2011), Exponential Smoothing Method (Kondratyev and Tsybalova, 2013), Holt-Winters Model (Araz et al., 2014; Becerra et al., 2020), Grey Prediction Method (Jia et al., 2019), Artificial Intelligence-based Prediction Models (Liu et al., 2018) and some simulation models (Mei et al., 2012; Yang et al., 2015). Table 1 summarizes past studies that observed the spread of diseases using ARIMA, exponential smoothing (EXPOS) and LSTM.

Table 1.

Studies Based on Forecasting Disease Outbreak via ARIMA, EXPOS, LSTM.

Methods	Reference (Disease)
ARIMA	(Promprou et al., 2006) - Dengue Haemorrhagic Fever
	(Ture and Kurt, 2006) - Hepatitis A
	(Liu et al., 2011) - Hemorrhagic Fever
	(Kondratyev and Tsybalova, 2013) - Influenza-Like Illnesses
	(Permanasari, 2013) - Malaria
	(Zheng et al., 2015) - Morbidity of Tuberculosis
	(Sharafi et al., 2017) - Zoonotic Cutaneous Leishmaniasis
	(Durga et al., 2019) - Dengue Fever
	(Liu et al., 2015) - Hand, Foot and Mouth Disease
Exponential Smoothing	(Linden et al., 2004) - Disease Management
	(Araz et al., 2014) - Influenza-Like Illnesses
	(Farrington and Andrews, 1996) - Infectious Disease
LSTM	(Lim, 2018) - Alzheimer’s Disease
	(Venna et al., 2019) - Influenza
	(Zhang and Nawata, 2018) - Influenza


Forecasting has become an important issue in the fight against disease with the COVID-19 outbreak. Several studies using prediction models for different countries have been carried out since the end of December 2019. Fanelli and Piazza (2020) developed a simple mean-field model to analyze and predict the temporal dynamics of COVID-19 in China, France and Italy.

Ceylan (2020) predicted the spread of COVID-19 in Italy, Spain, and France by developing ARIMA models. Henrique et al. (2020) compared ARIMA, random forest, support vector regression, cubist regression, ridge regression and stacking-ensemble learning for forecasting cumulative confirmed COVID-19 cases in Brazil. Ndaïrou et al. (2020) proposed and simulated a compartmental mathematical model to identify the transmission dynamics of COVID-19 in Wuhan, China. Tomar and Gupta (2020) used LSTM and curve fitting as data-driven estimation methods to forecast COVID-19 cases in India, and evaluated the effectiveness of preventive measures. Zhang et al. (2020) used Holt–Winters exponential smoothing to track and forecast the COVID-19 pandemic trend, worldwide and in Hubei, China. Haghani et al. (2020) analyzed the bibliometric aspects of the studies on COVID-19 at a macro level.

The aim of this study is to predict whether Turkey is likely to face extraordinary situations. To this end, six different variables were estimated using the ARIMA, EXPOS, and LSTM methods, and the results were compared on the selected performance criteria. The method that gave the best results was selected, and forecasts were made for the interval May 17–May 31. Based on the results obtained, new optimization studies can be carried out as conditions in Turkey change.

2. Materials and methods

In this study, EXPOS, the Box-Jenkins method (also known in the literature as the ARIMA model) and LSTM were used. The resulting predictions were evaluated to assess the spread of the coronavirus pandemic in Turkey.

2.1. Data collection process

Daily data for Turkey were collected for six variables: the Number of Total Cases (TC), the Growth Rate of Total Cases (GRTC), the Number of New Cases (NC), the Number of Total Deaths (TD), the Growth Rate of Total Deaths (GRTD), and the Number of New Deaths (ND) from COVID-19, from 12 March 2020 to 16 May 2020, which covers 64 days. These values were gathered from WHO (WHO, 2020); GRTC and GRTD were calculated from the data on the same website. The data were then organized for use in ARIMA, EXPOS and LSTM, and each of the three methods was used to forecast the variables above for the next 15 days, starting from the 65th day. Fig. 1 shows the prevalence of COVID-19 in Turkey.

Fig. 1 — Statistics of COVID-19 Outbreaks in Turkey.

2.2. Box-Jenkins model development

Box and Jenkins (2015) proposed an iterative three-step method for modeling time series. The three stages are model definition (identification), model estimation and model validation (Box and Jenkins, 2015). The Box-Jenkins method provides a systematic way to identify the p, d and q values and to estimate the series for forecasting. The first step checks whether the time series is stationary. If it is not, the data are transformed to become stationary by differencing to the appropriate degree. Stationarity can be assessed by inspecting the time plot of the series, and statistical tests can also be used (Makridakis and Hibon, 1997). If the autocorrelations in the series remain large and decay only slowly, it is concluded that the stochastic process generating the series is not stationary. When real-life data are analyzed as time series, they are generally found to be non-stationary; that is, the variables do not fluctuate around a fixed mean. Unless a control activity is carried out, such a series does not return to its original level once it deviates from the average for any reason. ARIMA models have attracted attention in the forecasting literature because they are very successful in representing non-stationary time series in real-life applications (Siami-Namini and Namin, 2018).

Numerous tests are available for checking stationarity. In this study, the Augmented Dickey–Fuller (ADF) test, the most frequently used stationarity test, was applied. According to the ADF test results, GRTC, TD and GRTD are stationary at level, while the second difference of TC, the first difference of NC and the first difference of ND are stationary about a constant. In addition, the stationarity of the time series was checked with the autocorrelation graphs given in Fig. 2.
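The idea behind the test can be sketched with the basic Dickey–Fuller regression, which is the ADF test without any lagged difference terms. This is a simplified illustration and not the procedure used in the study: the two series are hypothetical toy data, the function name is invented for the example, and the critical value of about −2.93 is the approximate 5 % value for a regression with a constant and roughly 50 observations.

```python
import math


def dickey_fuller_stat(series):
    """t-statistic of rho in the regression dy_t = c + rho * y_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    rho = sxy / sxx
    intercept = mean_d - rho * mean_x
    residuals = [d - intercept - rho * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    return rho / math.sqrt(sigma2 / sxx)              # rho / standard error


# Hypothetical toy series, not the WHO data.
oscillating = [math.sin(1.7 * t) for t in range(60)]      # fluctuates around 0
increments = [10 + 3 * math.sin(t) for t in range(60)]
cumulative = [sum(increments[: t + 1]) for t in range(60)]  # keeps growing

CRITICAL_5PCT = -2.93  # approximate 5 % critical value (assumption, see text)

for name, data in [("oscillating", oscillating), ("cumulative", cumulative)]:
    stat = dickey_fuller_stat(data)
    verdict = "stationary" if stat < CRITICAL_5PCT else "unit root not rejected"
    print(f"{name}: DF statistic = {stat:.2f} -> {verdict}")
```

A statistic below the critical value rejects the unit-root hypothesis. A cumulative series such as TC behaves like the second toy series, which is why it has to be differenced before an ARMA model is fitted.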

Fig. 2 — Initial Data Autocorrelations for TC (a), NC (b), GRTC (c), TD (d), ND (e), GRTD (f).

ARIMA consists of three parts, namely autoregression (AR), moving average (MA) and an integration operator (I), and is generally written as follows (Box and Jenkins, 2015):

Y_{t} = \sum_{i = 1}^{q} β_{i} ε_{t - i} + \sum_{i = 1}^{p} α_{i} Y_{t - i} + ε_{t}

where $β_{i}$ is a moving average parameter and $α_{i}$ is an autoregressive parameter, $Y_{t}$ is the observed value at time t and $ε_{t}$ is the random variation at time t.
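As a minimal sketch of how the equation above is used for prediction, the one-step forecast is the sum of the AR and MA terms, with the unknown $ε_{t}$ replaced by its expected value of zero. The coefficients and data below are hypothetical, and the plus sign on the MA terms follows the equation as written here; some packages report MA coefficients with the opposite sign.

```python
def arma_one_step(past_values, past_errors, ar_coefs, ma_coefs):
    """Forecast Y_t from sum_i alpha_i * Y_{t-i} + sum_i beta_i * eps_{t-i}.

    past_values[-1] is Y_{t-1} and past_errors[-1] is eps_{t-1}.
    """
    ar_part = sum(a * past_values[-i] for i, a in enumerate(ar_coefs, start=1))
    ma_part = sum(b * past_errors[-i] for i, b in enumerate(ma_coefs, start=1))
    return ar_part + ma_part


# Hypothetical ARMA(1,2) example: alpha_1 = 0.5, beta_1 = 0.2, beta_2 = -0.1.
values = [1.0, 2.0]    # Y_{t-2}, Y_{t-1}
errors = [0.3, -0.4]   # eps_{t-2}, eps_{t-1}
forecast = arma_one_step(values, errors, ar_coefs=[0.5], ma_coefs=[0.2, -0.1])
print(round(forecast, 4))  # 0.5*2.0 + 0.2*(-0.4) + (-0.1)*0.3 = 0.89
```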

If the time series is not stationary, a new time series is obtained with the difference equation:

\nabla X_{t} = X_{t} - X_{t - 1} = X'_{t}

If the time series is still not stationary, the differencing process continues. The purpose of the model definition phase is to estimate the p, d and q parameters. Estimation using diagrams of the autocorrelation function (ACF) and partial autocorrelation function (PACF) is proposed as the most suitable way to find the model parameters (John et al., 2019). ACF and PACF diagrams of the selected models can be seen in Fig. 3 and Fig. 4. Stationarity and seasonality can be checked with the help of these graphs.

Fig. 3 — ACF and PACF Diagrams for TC (a), NC (b) and GRTC (c).

Fig. 4 — ACF and PACF Diagrams for TD (d), ND (e) and GRTD (f).
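Differencing and the sample ACF are simple enough to sketch directly. The helper names and the toy series below are assumptions for illustration; the study itself produced these diagrams in a statistical package.

```python
def difference(series, d=1):
    """Apply the difference operator d times: X'_t = X_t - X_{t-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series[:-1], series[1:])]
    return list(series)


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical quadratic trend: two rounds of differencing leave a constant.
quadratic = [t * t for t in range(6)]
print(difference(quadratic, 1))  # [1, 3, 5, 7, 9]
print(difference(quadratic, 2))  # [2, 2, 2, 2]

# A trending series keeps a large lag-1 autocorrelation (0.85 here),
# the slow decay that signals non-stationarity.
trend_acf = sample_acf(list(range(20)), max_lag=3)
print([round(r, 2) for r in trend_acf])
```

The number of differencing rounds needed to remove the slow decay gives d. The cut-off patterns in the ACF and PACF of the differenced series then suggest q and p.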

2.3. Brown’s exponential smoothing method

Robert G. Brown, who conducted some of the first experimental operations research studies, developed the EXPOS estimation method in 1944. The method has been widely accepted in the literature because it forecasts by giving higher weight to the most recent observations. Given a data set of observations up to time t, EXPOS can be used to predict $y_{t+1}$, the value that follows $y_{t}$. Brown's EXPOS prediction equation is given in Eq. (1).

{\hat{y}}_{t+1} = {\hat{y}}_{t} + λ (y_{t} - {\hat{y}}_{t})

(1)

In this equation, $λ$ is the smoothing constant, which takes values between 0 and 1; ${\hat{y}}_{t + 1}$ is the predicted value at time t + 1; $y_{t}$ is the observed value at time t; and ${\hat{y}}_{t}$ is the predicted value at time t. The equation shows that the new prediction is the old prediction plus a correction for the error in that prediction. When $λ$ is close to 1, the new prediction contains a large correction for the error in the preceding prediction; when $λ$ is close to 0, the correction is small. Expanding the recursion shows that ${\hat{y}}_{t + 1}$ is a weighted moving average of past observations with exponentially decreasing weights, which is why the method is known as exponential smoothing (Hyndman et al., 2008).
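A minimal sketch of the error-correction recursion, in which each new forecast is the previous forecast plus $λ$ times the previous forecast error, is given below. Starting the recursion from the first observation is an assumption made for this example; statistical packages may initialize differently, and the data are hypothetical.

```python
def exponential_smoothing(observations, lam, initial_forecast=None):
    """Return forecasts f where f[t] predicts observations[t].

    The last element is the one-step-ahead forecast beyond the data.
    Recursion: f[t+1] = f[t] + lam * (y[t] - f[t]).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("smoothing constant must lie between 0 and 1")
    # Assumption: start from the first observation unless told otherwise.
    forecast = observations[0] if initial_forecast is None else initial_forecast
    forecasts = [forecast]
    for y in observations:
        forecast = forecast + lam * (y - forecast)
        forecasts.append(forecast)
    return forecasts


data = [10, 12, 11, 13]  # hypothetical observations
print(exponential_smoothing(data, lam=0.5))  # [10, 10.0, 11.0, 11.0, 12.0]
print(exponential_smoothing(data, lam=1.0)[-1])  # 13.0, the last observation
```

With $λ$ = 1 the forecast simply repeats the last observation, which matches the behavior expected from the values of $λ$ near 1 reported for TC and TD.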

There are numerous theoretical and experimental discussions on how to choose a suitable smoothing constant. A large smoothing constant does not necessarily give a better estimate: a large value makes the forecast react strongly to random fluctuations and can cause large forecast errors, while a small value makes the forecast slow to respond to a change in trend. Therefore, the choice of the smoothing constant is very important (Box and Jenkins, 2015). In this study, the optimum smoothing constant was found using the statistical package program. The performance criteria for the estimates made with EXPOS are given in Appendix A.
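One common way to pick the constant, sketched here under the assumption that the goal is to minimize the sum of squared one-step forecast errors, is a simple grid search. The study relied on its statistical package for this step, so the function names, the grid step and the toy series are illustrative only.

```python
def sse_for_lambda(observations, lam):
    """Sum of squared one-step errors for simple exponential smoothing."""
    forecast = observations[0]  # assumption: initialize with first observation
    sse = 0.0
    for y in observations[1:]:
        error = y - forecast
        sse += error * error
        forecast = forecast + lam * error
    return sse


def best_lambda(observations, step=0.01):
    """Grid search over lambda in (0, 1] for the smallest squared error."""
    count = int(round(1 / step))
    candidates = [i * step for i in range(1, count + 1)]
    return min(candidates, key=lambda lam: sse_for_lambda(observations, lam))


trending = list(range(1, 21))   # steady growth, like a cumulative count
alternating = [10, 12] * 10     # noise around a constant level

print(best_lambda(trending))     # 1.0: follow the latest value closely
print(best_lambda(alternating))  # a small value: average out the noise
```

This mirrors the pattern in Table 2, where the cumulative series TC and TD receive constants close to 1 while the growth rates receive much smaller ones.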

2.4. Long short-term memory

One of the most significant disadvantages of traditional neural networks is that the information learned in the previous step is not carried over to the next step. A Recurrent Neural Network (RNN) addresses this with loops that pass information from one step to the next, so the information obtained in the previous cycle can help in processing the current sample. LSTM is a special form of RNN designed to avoid the long-term dependency problem. To this end, LSTMs have memory blocks connected through successive layers. The network is trained using backpropagation, and it overcomes the vanishing gradient problem. Each LSTM memory block contains gates that control the state of the block and its output (Abdel-Nasser and Mahmoud, 2019).
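The role of the gates can be illustrated with a single scalar LSTM step written from the standard LSTM equations. This is a didactic sketch with hypothetical weights and a hidden size of one, not the network trained in this study.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One LSTM time step for scalar input, hidden state and cell state.

    weights maps a gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, b = weights[name]
        return activation(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)        # how much old memory to keep
    write = gate("input", sigmoid)          # how much new content to store
    out = gate("output", sigmoid)           # how much memory to expose
    candidate = gate("candidate", math.tanh)
    c = forget * c_prev + write * candidate  # updated cell state
    h = out * math.tanh(c)                   # updated hidden state / output
    return h, c


# Hypothetical weights: forget gate wide open, input gate shut,
# so the cell state passes through a step almost unchanged.
keep_memory = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=0.7, h_prev=0.0, c_prev=2.0, weights=keep_memory)
print(round(c, 6))  # stays very close to 2.0
```

Because the cell state can be carried forward nearly unchanged when the forget gate is open, gradients along that path do not shrink at every step, which is how the architecture mitigates the vanishing gradient problem.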

In this study, the LSTM network consists of a visible input layer, a hidden layer with six LSTM blocks containing 50 units (neurons), and an output layer that gives the predicted value. The LSTM blocks use the default sigmoid activation function, and the networks were trained for between 100 and 250 epochs with batch sizes from 1 to 45. Performance criteria for the estimates made with LSTM-RNNs are given in Appendix A.

2.5. Model performance measurement

The accuracy of a model can be assessed using selection criteria. In this study, four criteria were calculated to test the capability of ARIMA, EXPOS and LSTM: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Akaike’s information criterion (AIC). AIC values were the primary criterion for identifying the most accurate model. The formulations of these criteria are given in Eqs. (5) to (8), respectively (Becerra et al., 2020).

MAE = \frac{1}{n} \sum_{t = 1}^{n} | e_{t} |

(5)

RMSE = \sqrt{\frac{1}{n} \sum_{t = 1}^{n} {e_{t}}^{2}}

(6)

MAPE = \frac{\sum_{t = 1}^{n} \frac{|e_{t}|}{a_{t}} 100}{n}

(7)

AIC = - 2 ln (L) + 2 k

(8)

where n is the number of observations, k is the number of model parameters, L is the value of the likelihood, a_t is the actual value at time t, and e_t is the error between the observed and predicted values at time t. The ARIMA and EXPOS analyses were performed using Statgraphics Centurion 18 software, and Python was used to construct the LSTM prediction model.
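Eqs. (5) to (8) translate directly into code. The sketch below uses hypothetical numbers and assumes the error is defined as actual minus predicted; MAPE is undefined whenever an actual value is zero, so that case raises an error here.

```python
import math


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def aic(log_likelihood, k):
    """AIC = -2 ln(L) + 2k, taking ln(L) as input."""
    return -2.0 * log_likelihood + 2.0 * k


actual = [100, 200, 400]      # hypothetical observed values
predicted = [110, 190, 400]   # hypothetical model output

print(round(mae(actual, predicted), 4))    # 6.6667
print(round(rmse(actual, predicted), 4))   # 8.165
print(round(mape(actual, predicted), 4))   # 5.0
print(aic(log_likelihood=-10.0, k=3))      # 26.0
```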

3. Results and discussion

Turkey faced the coronavirus pandemic later than other European countries. Thanks to this time advantage, Turkey took a number of social and economic precautions to combat the pandemic, taking the experience of other European countries into consideration. In this study, the statistical data that characterize the ongoing pandemic process are predicted using three different methods from the literature. The Box-Jenkins method with different parameters, EXPOS and LSTM were applied to the selected time series, and the comparative results can be seen in Appendix A. Based on the results of the EXPOS, LSTM and ARIMA forecasting models, ARIMA with different parameters was chosen as the best fit for each variable. The AIC values of the best model of each type are compared in Table 2.

Table 2.

Comparison of AIC Values.

Variables	Models	AIC
TC	ARIMA(1,2,4)	12.03
	EXPOS with $λ$ = 0.9912	12.09
	LSTM	14.66
GRTC	ARIMA(0,1,2)	−2.51
	EXPOS with $λ$ = 0.1601	0.027
	LSTM	0.486
NC	ARIMA(1,1,4)	12.03
	EXPOS with $λ$ = 0.3585	12.19
	LSTM	14.743
TD	ARIMA(0,2,2)	3.677
	EXPOS with $λ$ = 0.9999	3.768
	LSTM	8.132
GRTD	ARIMA(0,2,3)	−4.24
	EXPOS with $λ$ = 0.2	−1.87
	LSTM	−1.17
ND	ARIMA(0,1,2)	3.661
	EXPOS with $λ$ = 0.6074	3.719
	LSTM	6.859


After the most accurate model has been selected, the forecasting equation is developed. Table 3 summarizes the equation coefficients and variables.

Table 3.

Selected ARIMA Models Summaries.

Model	Parameters	Coefficient	Std. Error	t	P-value
TC ARIMA(1,2,4)	AR(1)	0.84863	0.085336	9.94453	0.0000
	MA(1)	0.883853	0.116838	7.56474	0.0000
	MA(4)	−0.56891	0.117572	−4.83883	0.000011
GRTC ARIMA(0,1,2)	MA(1)	1.45741	0.025023	58.2434	0.0000
	MA(2)	−1.02327	0.02682	−38.154	0.0000
NC ARIMA(1,1,4)	AR(1)	0.844464	0.088505	9.54144	0.0000
	MA(1)	0.866544	0.119131	7.27387	0.0000
	MA(4)	−0.56408	0.116328	−4.84902	0.00001
GRTD ARIMA(0,2,3)	MA(1)	1.13117	0.083246	13.5883	0.0000
	MA(2)	−1.26959	0.053086	−23.9156	0.0000
	MA(3)	0.739661	0.072606	10.1873	0.0000
TD ARIMA(0,2,2)	MA(2)	−0.33195	0.12398	−2.67742	0.009555
ND ARIMA(0,1,2)	MA(2)	−0.33191	0.122997	−2.69851	0.008997


All P-values in the last column of Table 3 are less than 0.05, so the corresponding coefficients are statistically different from zero at the 95 % confidence level. For instance, the P-value of the MA(2) term for TD is 0.009555 and the P-value of the MA(4) term for NC is 0.00001, so both coefficients differ significantly from zero. In other words, these P-values indicate that the terms should be kept in the prediction equation.
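The t column of Table 3 is the coefficient divided by its standard error, which can be checked with a few lines of arithmetic. The sketch below uses four rows copied from Table 3; the rule of thumb that an absolute t value above roughly 2 corresponds to P < 0.05 holds for about 60 degrees of freedom and is stated here as an approximation.

```python
def t_ratio(coefficient, std_error):
    """t statistic for the hypothesis that the coefficient is zero."""
    return coefficient / std_error


# (label, coefficient, standard error, reported t) taken from Table 3.
rows = [
    ("TC AR(1)", 0.84863, 0.085336, 9.94453),
    ("GRTC MA(1)", 1.45741, 0.025023, 58.2434),
    ("NC MA(4)", -0.56408, 0.116328, -4.84902),
    ("TD MA(2)", -0.33195, 0.12398, -2.67742),
]

for label, coef, se, reported in rows:
    t = t_ratio(coef, se)
    significant = abs(t) > 2.0  # approximate two-sided 5 % threshold
    print(f"{label}: t = {t:.3f} (reported {reported}), keep term: {significant}")
```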

The fitted and forecast values obtained with the ARIMA model that gives the best results for each variable can be seen in Table 4 for each time series. The table also gives the forecasts for the next 15 days, together with the lower and upper limits of the 95 % confidence interval.

Table 4.

Forecast Table for Each Time Series.

	TC			GRTC			NC
Date	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI
17.05.2020	148,120	147,356	148,884	1.01701	−0.6613	2.6953	1683.98	925.42	2442.53
18.05.2020	149,589	147,905	151,272	1.1173	−0.7282	2.9628	1510.52	449.54	2571.5
19.05.2020	150,804	148,129	153,478	1.1173	−0.9582	3.1928	1269.51	54.845	2484.17
20.05.2020	152,045	148,477	155,613	1.1173	−1.1652	3.3998	1301.12	49.94	2552.29
21.05.2020	153,309	148,753	157,865	1.1173	−1.3549	3.5895	1327.81	−12.98	2668.59
22.05.2020	154,592	148,871	160,313	1.1173	−1.531	3.7656	1350.35	−133.4	2834.13
23.05.2020	155,891	148,780	163,001	1.1173	−1.6961	3.9307	1369.38	−300.1	3038.87
24.05.2020	157,203	148,454	165,953	1.1173	−1.8521	4.0867	1385.46	−499.3	3270.23
25.05.2020	158,528	147,878	169,177	1.1173	−2.0003	4.2349	1399.03	−719.4	3517.51
26.05.2020	159,862	147,049	172,675	1.1173	−2.1417	4.3763	1410.49	−951.9	3772.85
27.05.2020	161,205	145,966	176,443	1.1173	−2.2773	4.5119	1420.17	−1191	4030.87
28.05.2020	162,554	144,633	180,475	1.1173	−2.4076	4.6422	1428.35	−1431	4288.01
29.05.2020	163,910	143,057	184,764	1.1173	−2.5333	4.7679	1435.25	−1671	4541.95
30.05.2020	165,271	141,242	189,300	1.1173	−2.6548	4.8894	1441.08	−1909	4791.23
31.05.2020	166,636	139,196	194,076	1.1173	−2.7725	5.0071	1446	−2143	5034.97

	TD			GRTD			ND
Period	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI	Forecast	Lower Limit 95 % CI	Upper Limit 95 % CI
17.05.2020	4100.3	4088.1	4112.5	1.03164	0.67934	1.3839	45.3119	33.225	57.3985
18.05.2020	4143.1	4113.3	4172.93	1.04278	0.57609	1.5095	42.8137	23.644	61.9834
19.05.2020	4185.9	4130.9	4240.96	1.04881	0.20155	1.8961	42.8137	15.9	69.7273
20.05.2020	4228.8	4143.4	4314.15	1.05483	−0.1436	2.2533	42.8137	9.9321	75.6954
21.05.2020	4271.6	4151.5	4391.6	1.06086	−0.4924	2.6141	42.8137	4.8919	80.7356
22.05.2020	4314.4	4155.9	4472.82	1.06689	−0.8531	2.9869	42.8137	0.4471	85.1804
23.05.2020	4357.2	4156.9	4557.45	1.07292	−1.2286	3.3744	42.8137	−3.574	89.2012
24.05.2020	4400	4154.8	4645.22	1.07895	−1.6199	3.7778	42.8137	−7.273	92.9003
25.05.2020	4442.8	4149.7	4735.93	1.08497	−2.0273	4.1972	42.8137	−10.72	96.3444
26.05.2020	4485.6	4141.9	4829.41	1.091	−2.4508	4.6328	42.8137	−13.95	99.5798
27.05.2020	4528.5	4131.4	4925.51	1.09703	−2.89	5.0841	42.8137	−17.01	102.641
28.05.2020	4571.3	4118.4	5024.11	1.10306	−3.3448	5.5509	42.8137	−19.92	105.552
29.05.2020	4614.1	4103.1	5125.09	1.10909	−3.8147	6.0329	42.8137	−22.71	108.335
30.05.2020	4656.9	4085.4	5228.38	1.11512	−4.2994	6.5296	42.8137	−25.38	111.004
31.05.2020	4699.7	4065.5	5333.87	1.12114	−4.7986	7.0408	42.8137	−27.94	113.572


Fig. 5 shows the fitted and forecast values of each variable. The first part of each graph shows all observed values from March 12 to May 16, and the second part shows the forecast values from May 17, 2020 to May 31, 2020 with the 95 % confidence interval. Since TC (a) and TD (d) are cumulative, the forecast values continue to increase between 17 May and 31 May, as in the historical data.

When the estimates for NC (b) and GRTC (c) are examined in Fig. 5, the forecast values are seen to be stable between May 17 and May 31. Since NC (b) and GRTC (c) are stable, it can be concluded that there will be no rising trend in Turkey's number of cases in the near future. This also reflects the success of the precautions taken by Turkey, and it is projected that some strict measures can be loosened in a controlled manner.

The graphs also give the ARIMA estimates for ND (e) and GRTD (f); there is no increase in GRTD, which is expected to remain stable between 17 and 31 May. This is a promising situation for Turkey in the fight against the pandemic. At the beginning of June, the ARIMA models can be re-estimated using the observed values for May, and governments can determine the precautions they will take accordingly. Turkey may reconsider precautions such as the quarantine for people over 65 and under 20 years old, the interruption of education in schools and universities, the closing of restaurants, cafes and places of worship, minimum shift staffing in government departments, intercity travel restrictions, and the entry and exit ban for the most affected cities in the coming months, and policy measures may be loosened step by step.

4. Conclusion

In this study, six variables that may affect the precautions taken during the pandemic were forecast for the next 15 days, using indicators published officially by the World Health Organization for COVID-19 in Turkey. To determine which prediction model is most suitable, estimates were made with ARIMA with different parameters, EXPOS and LSTM, and the results were compared according to the selected performance criteria. The most appropriate method was found to be ARIMA with different parameters. Forecasting results were evaluated primarily on the AIC criterion. The ARIMA(1,2,4), ARIMA(0,1,2), ARIMA(1,1,4), ARIMA(0,2,2), ARIMA(0,2,3) and ARIMA(0,1,2) models were chosen, with the lowest AIC values (12.0342, −2.51411, 12.0253, 3.67729, −4.24405, and 3.66077), as the best fit for TC, GRTC, NC, TD, GRTD and ND, respectively.

In conclusion, the forecasting results show that COVID-19 cases will be stable in the near future in Turkey. Moreover, the forecasts for TC and TD are encouraging. The government can adjust the current rules for the pandemic process according to these results, so that precautionary restrictions can be reduced. If monitoring of the forecast values of TC and TD shows a tendency to increase, the government can tighten the measures. Overall, ARIMA models with different parameters can be employed to monitor the spread of different epidemics, and governments can set policies to ensure public health and safety according to the results obtained.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. The Comparison for Performance Results of Applied Models

Variables	Models	RMSE	MAE	MAPE	AIC
The Number of Total Case	ARIMA(1,2,4)	379.539	282.306	26.0422	12.03
	ARIMA(2,1,4)	380.117	276.566	122.887	12.07
	ARIMA(0,2,0)	418.859	299.371	4.63986	12.08
	ARIMA(4,2,0)	394.781	285.83	4.51657	12.08
	ARIMA(1,1,0)	418.21	294.491	4.61463	12.1
	EXPOS λ = 0.9912	415.848	289.797	7.29059	12.09
	LSTM	535.96	489.30	0.35	14.66
The Growth Rate of Total Cases	ARIMA(0,1,2)	0.275737	0.141242	10.0591	−2.51
	ARIMA(0,1,3)	0.344714	0.162714	9.97369	−2.04
	ARIMA(0,1,4)	0.362201	0.157848	9.65822	−1.91
	ARIMA(1,2,4)	0.357786	0.197014	14.8608	−1.9
	ARIMA(0,2,3)	0.420685	0.291096	23.3476	−1.64
	EXPOS λ = 0.1601	0.997909	0.397147	19.5974	0.027
	LSTM	0.00218	0.00161	0.15858	0.486
The Number of New Case	ARIMA(1,1,4)	377.866	278.118	163.11	12.03
	ARIMA(2,0,4)	377.386	272.416	337.864	12.05
	ARIMA(0,1,0)	415.521	294.619	17.479	12.06
	ARIMA(4,1,0)	391.426	281.324	18.6604	12.06
	EXPOS λ = 0.3585	435.701	305.3	240.838	12.19
	LSTM	1047.47	998.91	60.07	14.743
The Number of Total Death	ARIMA(0,2,2)	6.09454	4.70935		3.677
	ARIMA(2,2,0)	6.10109	4.6473		3.679
	ARIMA(1,2,1)	6.13436	4.6938		3.69
	ARIMA(2,0,2)	5.97003	4.33173		3.699
	ARIMA(2,1,1)	6.0851	4.44364		3.705
	EXPOS λ = 0.9999	6.4762	4.90641		3.768
	LSTM	24.179	22.40	0.59	8.132
The Growth Rate of Total Death	ARIMA(0,2,3)	0.114303	0.0506061	4.06235	−4.24
	ARIMA(0,2,4)	0.116274	0.0454201	3.64399	−4.18
	ARIMA(2,2,2)	0.11858	0.0539604	4.29279	−4.14
	ARIMA(4,1,4)	0.116602	0.0530026	4.14388	−4.05
	ARIMA(3,2,3)	0.129159	0.0552289	4.36873	−3.91
	EXPOS λ = 0.2	0.387387	0.160508	10.4269	−1.87
	LSTM	0.1046	0.1023	10.01119	−1.17
The Number of New Death	ARIMA(0,1,2)	6.04441	4.63418		3.661
	ARIMA(2,1,0)	6.05087	4.5735		3.663
	ARIMA(0,2,1)	6.17175	4.82092		3.671
	ARIMA(2,0,1)	6.03522	4.37296		3.689
	EXPOS λ = 0.6074	6.32014	4.65942		3.719
	LSTM	6.723	5.9351	11.3374	6.859


References

Abdel-Nasser M., Mahmoud K. Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Comput. Appl. 2019;31:2727–2740. doi: 10.1007/s00521-017-3225-z.
Araz O.M., Bentley D., Muelleman R.L. Using Google Flu Trends data in forecasting influenza-like–illness related ED visits in Omaha, Nebraska. Am. J. Emerg. Med. 2014;32:1016–1023. doi: 10.1016/j.ajem.2014.05.052.
Becerra M., Jerez A., Aballay B., Garcés H.O., Fuentes A. Forecasting emergency admissions due to respiratory diseases in high variability scenarios using time series: a case study in Chile. Sci. Total Environ. 2020;706. doi: 10.1016/j.scitotenv.2019.134978.
Box G.E.P., Jenkins G.M. Time Series Analysis Forecasting and Control. 5th ed. Wiley; 2015.
Ceylan Z. Estimation of COVID-19 prevalence in Italy, Spain, and France. Sci. Total Environ. 2020;729. doi: 10.1016/j.scitotenv.2020.138817.
Durga M.S., Nayak P., Narayan K.A. Forecasting dengue fever incidence using ARIMA analysis. Int. J. Collab. Res. Intern. Med. Public Heal. 2019;11:24–32.
Fanelli D., Piazza F. Analysis and forecast of COVID-19 spreading in China, Italy and France. Chaos Solitons Fractals. 2020;134. doi: 10.1016/j.chaos.2020.109761.
Farrington C.P., Andrews N.J. A statistical algorithm for the early detection of outbreaks of infectious disease. J. R. Stat. Soc. 1996;159:547–563. doi: 10.2307/2983331.
Haghani M., Bliemer M.C.J., Goerlandt F., Li J. The scientific literature on Coronaviruses, COVID-19 and its associated safety-related research dimensions: a scientometric analysis and scoping review. Saf. Sci. 2020;129. doi: 10.1016/j.ssci.2020.104806.
Henrique M., Molin D., Gomes R., Cocco V., Coelho S. Short-term forecasting COVID-19 cumulative confirmed cases: perspectives for Brazil. Chaos Solitons Fractals. 2020;135. doi: 10.1016/j.chaos.2020.109853.
Hyndman R.J., Koehler A.B., Ord J.K., Snyder R.D. Forecasting with Exponential Smoothing. Springer Series in Statistics; 2008.
Jia L., Li K., Jiang Y., Guo X., Zhao T. Prediction and Analysis of Coronavirus Disease 2019. 2019.
John B., Singhal S. An application of integrated EPC–SPC methodology for simultaneously monitoring multiple output characteristics. Int. J. Qual. Reliab. Manag. 2019. doi: 10.1108/IJQRM-04-2018-0104.
Kondratyev M., Tsybalova L. Long-term forecasting of influenza-like illnesses in Russia. Int. J. Pure Appl. Math. 2013;89:619–642. doi: 10.12732/ijpam.v89i4.14.
Lim B. Forecasting Disease Trajectories in Alzheimer’s Disease using Deep Learning. 2018.
Linden A., Adams J.L., Roberts N. Evaluating disease management program effectiveness: an introduction to survival analysis. Dis. Manag. 2004;7:180–190. doi: 10.1089/dis.2004.7.180.
Liu Q., Liu X., Jiang B., Yang W. Forecasting incidence of hemorrhagic fever with renal syndrome in China using ARIMA model. BMC Infect. Dis. 2011;11. doi: 10.1186/1471-2334-11-218.
Liu L., Luan R.S., Yin F., Zhu X.P. Predicting the incidence of hand, foot and mouth disease in Sichuan province, China using the ARIMA model. Epidemiol. Infect. 2015;144:144–151. doi: 10.1017/S0950268815001144.
Liu L., Han M., Zhou Y., Wang Y. LSTM recurrent neural networks for influenza trends prediction. ISBRA 2018 Bioinforma. Res. Appl. 2018;10847:259–264. doi: 10.1007/978-3-319-94968-0_25.
Makridakis S., Hibon M. ARMA models and the Box-Jenkins methodology. J. Forecast. 1997;16:147–163. doi: 10.1002/(SICI)1099-131X(199705)16:3<147::AID-FOR652>3.0.CO;2-X.
Mei S., Van De Vijver D., Xuan L., Zhu Y., Sloot P.M.A. Quantitatively evaluating interventions in the influenza a (H1N1) epidemic on China campus grounded on individual-based simulations. Procedia Comput. Sci. 2012;1:1675–1682. doi: 10.1016/j.procs.2010.04.187.
Ndaïrou F., Area I., Nieto J.J., Torres D.F.M. Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan. Chaos Solitons Fractals. 2020;135. doi: 10.1016/j.chaos.2020.109846.
Permanasari A.E. SARIMA (seasonal ARIMA) implementation on time series to forecast the number of malaria incidence. 2013 Int. Conf. Inf. Technol. Electr. Eng. 2013:203–207. doi: 10.1109/ICITEED.2013.6676239.
Promprou S., Jaroensutasinee M., Jaroensutasinee K. Forecasting Dengue Haemorrhagic Fever Cases in Southern Thailand Using ARIMA Models. Vol. 30. WHO Reg. Off. South-East Asia; 2006. pp. 99–106.
Sharafi M., Ghaem H., Tabatabaee H.R., Faramarzi H. Using seasonal ARIMA time series method. Asian Pacific J. Trop. Med. 2017;10:79–86. doi: 10.1016/j.apjtm.2016.12.007.
Siami-Namini S., Namin A.S. Forecasting Economics and Financial Time Series: ARIMA vs. LSTM. 2018; pp. 1–19.
Tomar A., Gupta N. Prediction for the spread of COVID-19 in India and effectiveness of preventive measures. Sci. Total Environ. 2020;728. doi: 10.1016/j.scitotenv.2020.138762.
Ture M., Kurt I. Comparison of Four Different Time Series Methods to Forecast Hepatitis A Virus Infection. Vol. 31. 2006. pp. 41–46.
Venna S.R., Tavanaei A., Gottumukkala R.N., Raghavan V.V., Member L.S., Maida A.S., Nichols S. A novel data-driven model for real-time influenza forecasting. IEEE Access. 2019;7:7691–7701. doi: 10.1109/ACCESS.2018.2888585.
WHO. World Health Organization. 2020.
Yang W., Cowling B.J., Lau E.H.Y., Shaman J. Forecasting influenza epidemics in Hong. PLoS Comput. Biol. 2015:1–17. doi: 10.1371/journal.pcbi.1004383.
Zhang J., Nawata K. Multi-step prediction for influenza outbreak by an adjusted long short-term memory. Epidemiol. Infect. 2018;146:809–816. doi: 10.1017/S0950268818000705.
Zhang Z., Wang X., Gong H., Liu X., Chen H., Chu Z., Guo Y., Chen Z., Gao C., Liu Z. Daily Tracking and Forecasting of the Global COVID-19 Pandemic Trend using Holt–Winters Exponential Smoothing. Lancet Dly.; 2020.
Zheng Yan-ling, Zhang L., Zhang X., Wang K., Zheng Yu-jian. Forecast model analysis for the morbidity of tuberculosis in Xinjiang, China. PLoS One. 2015:1–13. doi: 10.1371/journal.pone.0116832.
