Energy. 2021 Mar 18;226:120403. doi: 10.1016/j.energy.2021.120403

Forecasting the U.S. oil markets based on social media information during the COVID-19 pandemic

Binrong Wu a, Lin Wang a, Sirui Wang a, Yu-Rong Zeng b
PMCID: PMC8486164  PMID: 34629690

Abstract

Accurate oil market forecasting plays an important role in the theory and application of oil supply chain management for profit maximization and risk minimization. However, the coronavirus disease 2019 (COVID-19) has compelled governments worldwide to impose restrictions, consequently forcing the closure of most social and economic activities. These closures have made the oil markets volatile and pose a huge challenge to oil market forecasting. Fortunately, social media information can closely reflect oil market factors and exogenous factors, such as conflicts and political instability. Accordingly, this study collected a large volume of online oil news and used a convolutional neural network to extract relevant information automatically. Oil markets are divided into four categories: oil price, oil production, oil consumption, and oil inventory. A total of 16,794; 9,139; 8,314; and 8,548 news headlines were collected for the four respective cases. Experimental results indicate that social media information contributes to the forecasting of oil price, oil production, and oil consumption. The mean absolute percentage errors are 0.0717, 0.0144, and 0.0168 for oil price, production, and consumption prediction, respectively, during the COVID-19 pandemic. Marketers must consider the impact of social media information on the oil or similar markets, especially during the COVID-19 outbreak.

Keywords: Social media information, Deep learning, Text mining, Time series forecasting, COVID-19 pandemic

Nomenclature

WHO: World Health Organization
COVID-19: Coronavirus disease 2019
IEA: International Energy Agency
VAR: Vector autoregressive
CNN: Convolutional neural network
BPNN: Backpropagation neural networks
SVM: Support vector machines
MLR: Multiple linear regression
RNN: Recurrent neural network
LSTM: Long short-term memory
U.S.: United States
NLP: Natural Language Processing
WTI: West Texas Intermediate
USA: United States of America
HQIC: Hannan–Quinn information criterion
GDP: Gross Domestic Product
TF–IDF: Term frequency-inverse document frequency
BPD: Barrels per day
API: American Petroleum Institute
OPEC: Organization of Petroleum Exporting Countries
LNG: Liquefied Natural Gas
MAE: Mean absolute error
MAPE: Mean absolute percentage error
RMSE: Root mean square error
ADF: Augmented Dickey-Fuller test
AI: Artificial intelligence
IR: Improving rate
MIV: Mean Impact Value
AIC: Akaike information criterion
BIC: Bayesian information criterion
FPE: Final prediction error
DJI: Dow Jones Industrial Average

1. Introduction

Oil plays a leading role among energy resources, and the oil supply chain plays an important role in the global economy [1]. Exploring the theory and application of petroleum supply chain management for profit maximization and risk minimization is therefore of profound significance [2]. However, the uncertainty surrounding the ongoing COVID-19 pandemic poses great challenges to forecasting the oil markets.

The World Health Organization (WHO) declared the ongoing coronavirus pandemic a global threat. Currently, no effective treatment has been found for this respiratory disease, namely, the coronavirus disease 2019 (COVID-19). Since the end of February 2020, the new virus has spread across many countries. As of early 2021, more than 91 million cases had been reported worldwide, with more than 1.9 million deaths. Governments worldwide have imposed restrictions to slow down the spread of the virus, consequently forcing the closure of most social and economic activities. Some of the measures are partial or complete lockdowns, including banning public gatherings and closing non-essential businesses and even educational institutions [3] (International Energy Agency [4]). Owing to the impact of the COVID-19 pandemic, oil markets in many countries have become volatile. Together with the global economic downturn, the unpredictable COVID-19 pandemic has led to a sharp drop in oil demand [5].

In recent years, researchers have extensively incorporated Internet data into forecasting studies as explanatory variables [6]. Combining Internet data with prediction models can improve forecasting performance [7]. Examining oil-related events can help forecast the oil markets. The challenging question is how to select and quantify these events. Fortunately, qualitative evidence, such as emergencies and politics, can be transformed into quantitative data (e.g., time-series data) through text mining, social network, and sentiment analysis techniques [8]. In particular, text mining is useful for identifying ideas and extracting information [9]. For example, Zhang et al. [10] used text mining techniques to mine product innovation ideas from online reviews. Jeong, Yoon, and Lee [11] proposed a product opportunity mining approach for product planning.

In this study, text extraction, a text mining method, is implemented to convert unstructured text into a structured format. Based on the major validated scientific studies, Hemmatian and Sohrabi [12] summarized the recent methods for opinion classification and aspect extraction. They deemed the CNN model the most promising text mining technique. Meanwhile, CNN has been widely used in other popular fields, including sentence modeling [13], image classification [14], and speech recognition [15]. Inspired by this research, the current study uses the CNN model to process online oil news. Besides, the volume of media information continues to increase and includes a wealth of qualitative information that can be used to improve the accuracy of oil market forecasting. As online news is more persuasive and less noisy, it is a more reliable source than other social media, such as Twitter and blogs [16]. Therefore, online oil news is treated as qualitative data that can help improve the accuracy of oil market forecasting.

Above all, the authors aim to present a novel oil market prediction methodology that focuses on improving forecasting performance by examining social media information. Oil markets are divided into four categories: oil price, oil production, oil consumption, and oil inventory. This research addresses three research questions. First, which type of news is more relevant to the oil price, oil production, oil consumption, and oil inventory? Second, to what extent can social media information improve the forecasting of oil price, oil production, oil consumption, and oil inventory during the COVID-19 pandemic? Third, how do different forecasting models perform in the oil market prediction task?

Fig. 1 illustrates the general framework of the system design of this study. This study takes the U.S. oil markets as an example, as the U.S. is the leading economy in the world, and its oil markets have been highly volatile as an effect of the COVID-19 pandemic. Different keywords, such as “Crude oil” and “American oil,” were employed to collect oil news headlines released on the popular oil website “Oilprice.com” from May 2012 to August 2020. The convolutional neural network (CNN) model is applied to extract textual information from these news headlines. The VAR method is used to select the appropriate lag order of the CNN outputs and the historical oil data. Then, text features, financial market data, and historical oil data are input into several prediction techniques, namely, backpropagation neural networks (BPNN), support vector machines (SVM), multiple linear regression (MLR), recurrent neural network (RNN), and long short-term memory (LSTM). The results illustrate that online media information can facilitate crude oil price, oil production, and oil consumption forecasting, especially during the COVID-19 pandemic. However, using oil news might not help predict oil inventory.

Fig. 1. System design of this study.

The main contributions of this study are summarized as follows:

  • (a) This study is the first to examine the predictive relationship between social media information and oil price, production, consumption, and inventory. The results of text mining indicate that the fluctuations of oil price, production, and inventory are influenced by various international events. By contrast, oil consumption is closely related to domestic events.

  • (b) A deep learning algorithm, namely, CNN, was employed to extract textual information from social media information automatically. The results of the mean impact value (MIV) approach indicate that text features are important predictors, which illustrates the explanatory power of textual information in the forecasting of oil price, production, and consumption.

  • (c) We proposed a novel oil price, production, and consumption forecasting methodology. The findings offer theoretical insight into information processing: oil price, production, and consumption predictions achieved high accuracy when social media information was considered during the COVID-19 pandemic.

The rest of the paper is arranged as follows. Section 2 presents a literature review of the prediction of oil price, production, consumption, and inventory. Section 3 provides the methodology of this study, including the text mining technique and several forecasting techniques. Section 4 describes the data and empirical results of text mining and oil market forecasting. Finally, Section 5 provides the conclusion and future research.

2. Literature review

The research community focuses on oil markets with various forecasting techniques [17,18]. Studies on these topics are divided into four categories, namely, oil price forecasting, oil production forecasting, oil consumption forecasting, and oil inventory forecasting. Table 1 presents a summary of typical studies for oil market forecasting.

Table 1. Summary of recent studies for oil market forecasting.

| Classification | Methods | Influencing factors | Forecasting object or areas |
|---|---|---|---|
| Oil price forecasting | Vector Trend Forecasting Method (VTFM) [20] | Historical oil price data | Brent crude oil spot price |
| | A semi-heterogeneous approach [21] | Historical oil price data | West Texas Intermediate (WTI) crude oil price |
| | Intelligent model search engine [22] | CPI, IPI, USI, BTI, CU, LBR, SP, JPU, and CHU | Brent oil price |
| | Convolutional neural network and Latent Dirichlet Allocation (LDA) topic model [23] | Online oil news, financial market data, and oil price data | West Texas Intermediate (WTI) crude oil |
| | Convolutional neural network with Variational mode decomposition [24] | Google Trends and online media information | West Texas Intermediate (WTI) crude oil price |
| Oil production forecasting | Combining the nonlinear metabolism grey model with Auto-Regressive Integrated Moving Average (ARIMA) [25] | Historical production data | U.S. shale oil production |
| | Ensemble empirical mode decomposition with Long Short-Term Memory [26] | Historical production data | Two actual oilfields from China |
| Oil consumption forecasting | AdaBoost ensemble technology [28] | Historical consumption data | Oil consumption of China |
| | GM (1,1) model [27] | Historical consumption data | Global oil consumption |
| | NMGM (1, 1, α) [29] | Historical consumption data | Oil consumption of China |
| | LogR, DT, BPNN, and SVM [30] | Google Trends and historical consumption data | Global oil consumption |
| Oil inventory forecasting | | | |

Note: Studies rarely investigate the implementation of oil inventory prediction.

Crude oil price fluctuations have a significant impact on the global economy. Because these fluctuations affect a country’s social stability and economic development, forecasting them can help governments make policies and reduce financial losses in the industrial sector [19]. Using historical statistical data, some studies have applied econometric techniques, intelligent algorithms, and various decomposition techniques to predict the crude oil price. To illustrate, Zhao et al. [20] used the vector trend forecasting method (VTFM) to forecast the Brent crude oil price. Wang, Li, and Hong [21] decomposed the oil price series using four decomposition techniques, namely, Variational Mode Decomposition, Empirical Mode Decomposition, Singular Spectral Analysis, and Wavelet Analysis; four different forecasting methods were then used to predict the components from each decomposition technique. They reconstructed the oil price forecasts from the component forecasts and obtained better forecasting performance. However, crude oil price movements are driven by a variety of factors, including oil market factors (e.g., oil demand, stocks, and supply) and exogenous factors (e.g., epidemics, political instability). Forecasting the oil price with these related factors has become an active research topic. For example, Bekiroglu et al. [22] proposed an intelligent model search engine to forecast the Brent oil price using several financial indexes and oil data. Li, Shang, and Wang [23] presented a new oil price forecasting method using online media text mining. Wu et al. [24] combined Google Trends and social media information to forecast weekly oil prices. Drawing on these previous studies, we further process online media news and compare various forecasting techniques to forecast the oil price during the COVID-19 pandemic.

Research on oil production forecasting mainly uses historical data to predict oil yields. For instance, Wang, Song, and Li [25] developed a hybrid of the nonlinear grey model and linear ARIMA to forecast U.S. shale oil production. Liu, Liu, and Gu [26] proposed an ensemble empirical mode decomposition (EEMD) approach based on Long Short-Term Memory (LSTM) to forecast the production of two actual oilfields in China. Oil consumption forecasting has also recently become an active field. For example, Yuan et al. [27] used the GM (1,1) model cluster to predict global oil consumption. Xiao et al. [28] proposed a new hybrid oil consumption forecasting model using selective ensemble, which could extract nonlinear subseries of oil consumption. Wang and Song [29] developed a new nonlinear-dynamic grey model, namely, NMGM (1, 1, α), to forecast China’s oil consumption. Yu et al. [30] found that the accuracy of oil consumption prediction can be significantly improved using Google Trends, which closely reflects factors related to the oil markets. In comparison with previous work, we focus more on analyzing the role of media information in predicting oil consumption.

In summary, many studies have examined oil market factors and exogenous factors to predict oil prices. However, research scarcely uses these factors to forecast oil production, consumption, and inventory. Under the influence of COVID-19, oil markets have experienced drastic fluctuations, which make accurate prediction from historical data alone extremely difficult. Examining oil market factors and exogenous factors may facilitate the forecasting of oil price, production, consumption, and inventory. Meanwhile, online media information can closely reflect oil market factors and exogenous factors. To our knowledge, this study is the first to apply online oil news to examine the predictive relationship with oil markets during the COVID-19 pandemic.

3. Methodology

3.1. Convolutional neural network for text classification

Fig. 2 illustrates the basic structure of the text CNN model. The first step in using CNN for text mining is to implement tokenization and filter punctuation and stop words. Stop words such as "the," "in," and "is" are generally considered useless, as they are common and they dramatically increase the size of the index without increasing the accuracy or recall.

Fig. 2. Structure of the text CNN.

Because documents differ in length, the padded sequence technique is implemented to convert each document to the same length. Words beyond the fixed length are truncated, and a document shorter than the specified length is padded with “0” at the end. Thereafter, we employ a word embedding model (word2vec) to convert each word into a unique vector. Words with similar meanings are also closer to each other in Euclidean terms.
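The preprocessing steps above (tokenization, stop-word filtering, truncation, and zero padding) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the stop-word list, the regular-expression tokenizer, the example headlines, and the maximum length of 7 are all assumptions made for the example, and the word2vec embedding step is omitted.

```python
import re

# Hypothetical stop-word list for illustration; the paper does not list its own.
STOP_WORDS = {"the", "in", "is", "a", "of", "to", "and", "on", "as"}

def tokenize(headline):
    """Lowercase the text, drop punctuation, and remove stop words."""
    words = re.findall(r"[a-z0-9]+", headline.lower())
    return [w for w in words if w not in STOP_WORDS]

def build_vocab(token_lists):
    """Map each distinct word to an integer id; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for word in tokens:
            vocab.setdefault(word, len(vocab) + 1)
    return vocab

def pad_sequence(ids, max_len):
    """Truncate sequences longer than max_len; append 0s to shorter ones."""
    return ids[:max_len] + [0] * max(0, max_len - len(ids))

# Made-up headlines, used only to exercise the functions.
headlines = [
    "Oil prices rise as OPEC cuts production",
    "The API reports a surprise crude inventory build in the U.S.",
]
tokens = [tokenize(h) for h in headlines]
vocab = build_vocab(tokens)
padded = [pad_sequence([vocab[w] for w in t], max_len=7) for t in tokens]
```

With these inputs, the first headline has six tokens after filtering and receives one trailing 0, while the second has eight tokens and is cut to seven.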

In the text-CNN, the width of the convolution kernel is consistent with the dimension of the word vector. Each row of the input matrix represents a word, and in the process of feature extraction, the word is the minimum granularity of text. Given that the input of CNN is a sentence and the correlation between adjacent words is high, the word order and its context are considered in the process of convolution.

In the convolution layer, kernels of different heights produce feature vectors of different lengths. In the pooling layer, we use 1-max-pooling to extract the maximum value of each feature vector. After 1-max-pooling of all feature vectors, the resulting values are concatenated. To prevent overfitting, dropout is applied between the pooling layer and the fully connected layer.

We use two fully connected layers. The first layer uses “relu” as the activation function, and the second layer employs a softmax activation function to obtain the probability of belonging to each class.
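The forward pass described above can be sketched in plain Python. This is an illustrative sketch under several assumptions, not the authors' TensorFlow implementation: the kernel heights (2, 3, 4), the embedding dimension, the hidden size, and the random untrained weights are hypothetical, a ReLU after each convolution is assumed, and dropout is left out because it is only active during training.

```python
import math
import random

def conv_max_pool(sentence, kernel, bias):
    """Slide a (height x dim) kernel over word windows, apply ReLU, keep the max."""
    height = len(kernel)
    activations = []
    for start in range(len(sentence) - height + 1):
        window = sentence[start:start + height]
        s = bias + sum(w * x for k_row, x_row in zip(kernel, window)
                       for w, x in zip(k_row, x_row))
        activations.append(max(0.0, s))
    return max(activations)  # 1-max pooling

def dense(vec, weights, biases):
    """Fully connected layer without activation."""
    return [b + sum(w * v for w, v in zip(row, vec))
            for row, b in zip(weights, biases)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def text_cnn_forward(sentence, kernels, hidden_w, hidden_b, out_w, out_b):
    pooled = [conv_max_pool(sentence, k, b) for k, b in kernels]       # concatenated maxima
    hidden = [max(0.0, h) for h in dense(pooled, hidden_w, hidden_b)]  # first FC layer, ReLU
    return softmax(dense(hidden, out_w, out_b))                        # second FC layer, softmax

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

seq_len, dim = 8, 4                      # hypothetical sentence length and embedding size
sentence = rand_matrix(seq_len, dim)     # stand-in for word2vec vectors
kernels = [(rand_matrix(h, dim), 0.0) for h in (2, 3, 4)]
hidden_w, hidden_b = rand_matrix(5, len(kernels)), [0.0] * 5
out_w, out_b = rand_matrix(2, 5), [0.0] * 2
probs = text_cnn_forward(sentence, kernels, hidden_w, hidden_b, out_w, out_b)
```

Each kernel spans the full embedding width, so it slides only along the word axis, which is how word order and local context enter the features.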

The outputs of the CNN model denote the fluctuation of the monthly oil markets. The oil market movement $o_m$ is described as follows:

$$o_m = \begin{cases} 0, & c_m < c_{m-1} \\ 1, & c_m \ge c_{m-1}, \end{cases} \tag{1}$$

where $c_m$ denotes the oil market value at the end of month $m$. The above text CNN structure is implemented with the Python library TensorFlow.
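As a small illustration of Eq. (1), the labels can be derived from a monthly series as follows. The function name and the example values are made up for this sketch.

```python
def movement_labels(values):
    """Return o_m for each month after the first: 1 if c_m >= c_{m-1}, else 0."""
    return [1 if curr >= prev else 0 for prev, curr in zip(values, values[1:])]

end_of_month_values = [50.5, 29.2, 16.6, 28.6, 28.6]  # made-up series
labels = movement_labels(end_of_month_values)
```

An unchanged value counts as class 1, in line with the non-strict inequality in Eq. (1).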

3.2. Forecasting models

In this study, common and popular forecasting models, such as backpropagation neural network, SVM, and multivariate linear regression, are implemented [18,24,31]. RNN and LSTM, two popular deep learning models, are also considered to predict the oil markets [32,33]. All of the following models perform one-step-ahead prediction.

  • (a) Backpropagation neural network

BPNN, a supervised learning model, is the most basic neural network. Its outputs are propagated forward, and errors are propagated backward. Fig. 3 illustrates a BPNN with a single hidden layer, where $x_1, x_2, \ldots, x_n$ denote the input values and $y$ denotes the output value.

Fig. 3. Structure of BPNN.

  • (b) Multivariate linear regression

Multivariate linear regression (MLR) applies when a linear relationship exists between multiple independent variables and a dependent variable. Let $y$ be the dependent variable and $x_1, \ldots, x_k$ the independent variables. The multiple linear regression model is described as follows:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k + e, \tag{6}$$

where $b_0$ is a constant, $b_1, \ldots, b_k$ are regression coefficients, and $e$ denotes an error. Model parameters are estimated using the ordinary least squares (OLS) method.
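A minimal sketch of estimating the coefficients of Eq. (6) by OLS through the normal equations is given below. The helper names and the toy data are assumptions for illustration; in practice a numerical library would normally be used instead of a hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, targets):
    """Return [b0, b1, ..., bk] minimizing the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Noise-free toy data generated from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
coefficients = fit_ols(rows, targets)
```

Because the toy data contain no noise, the fitted coefficients recover the generating values up to floating-point error.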

  • (c) Support vector machine

SVM is a maximum-margin linear classifier defined in the feature space. Given the training data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, where $x_i$ $(i = 1, \ldots, n)$ is the input and $y_i$ $(i = 1, \ldots, n)$ is the output, the SVM model can be expressed as the following constrained optimization problem (primal formulation):

$$\begin{aligned} \min\; & J(\omega, b, \xi) = \frac{1}{2}\,\omega^{T}\omega + \gamma \sum_{i=1}^{n} \xi_i \\ \text{s.t.}\; & y_i\left[\omega^{T}\phi(x_i) + b\right] \ge 1 - \xi_i,\quad \xi_i \ge 0 \quad (i = 1, \ldots, n), \end{aligned} \tag{7}$$

where $\omega$ denotes the hyperplane vector, $\gamma$ represents the regularization parameter, $b$ represents the bias, $\xi$ denotes the tolerable misclassification error, and $\phi(\cdot)$ denotes the nonlinear mapping function, which is induced by the Gaussian radial basis function kernel $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$ with variance $\sigma^2$.
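The Gaussian radial basis function kernel used above can be computed directly, as in the following sketch. The function names, the sample points, and the value of sigma are illustrative assumptions.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel: exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, sigma):
    """Kernel values for every pair of points."""
    return [[rbf_kernel(p, q, sigma) for q in points] for p in points]

points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.0)]  # made-up inputs
gram = gram_matrix(points, sigma=1.0)
```

The diagonal of the Gram matrix is 1 because each point has zero distance to itself, and entries shrink toward 0 as points move apart, at a rate controlled by sigma.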

  • (d) Recurrent neural network and Long short-term memory

RNNs are a class of neural networks that process sequential data. A basic neural network only establishes weight connections between layers. An RNN also establishes weight connections between neurons in the same layer.

Hochreiter and Schmidhuber [34] proposed LSTM, which is an RNN variant. LSTM extends the RNN architecture with a separate memory unit and control mechanism that controls the flow of information in the network. The gating mechanism consists of input gates, forget gates, and output gates.
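To make the gating mechanism concrete, the sketch below runs one LSTM cell over a short sequence. It uses the commonly cited formulation with input, forget, and output gates plus a candidate memory; the paper does not give its own equations, so the exact form, the sizes, and the random untrained weights here are assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vec):
    return [b + sum(w * v for w, v in zip(row, vec))
            for row, b in zip(weights, bias)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (weights, bias)."""
    z = list(h_prev) + list(x)                                    # previous state and input
    i = [sigmoid(v) for v in affine(*params["input"], z)]         # input gate
    f = [sigmoid(v) for v in affine(*params["forget"], z)]        # forget gate
    o = [sigmoid(v) for v in affine(*params["output"], z)]        # output gate
    g = [math.tanh(v) for v in affine(*params["candidate"], z)]   # candidate memory
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

rng = random.Random(0)
hidden_size, input_size = 3, 2  # hypothetical sizes
params = {
    name: ([[rng.uniform(-0.5, 0.5) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)], [0.0] * hidden_size)
    for name in ("input", "forget", "output", "candidate")
}

h, c = [0.0] * hidden_size, [0.0] * hidden_size
sequence = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]  # made-up inputs
for x in sequence:
    h, c = lstm_step(x, h, c, params)
```

The memory cell `c` is carried across time steps and is only modified through the forget and input gates, which is what lets the network retain information over longer spans than a plain RNN.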

4. Experiment study

The econometric methods were implemented in EViews 10 (Econometric Views), and the artificial intelligence models in Python 3.8. All computations were run on a computer with an Intel(R) Core(TM) i7-10700K CPU (3.80 GHz), 32 GB RAM, and Windows 10.

4.1. Data retrieval and descriptions

Fig. 4 presents the monthly time series from May 2017 to August 2020 for the oil price (West Texas Intermediate, WTI), oil production, oil consumption by the industrial sector, and oil stocks, all obtained from the U.S. Energy Information Administration (http://www.eia.gov). The figure shows that the oil markets changed dramatically, especially in March, April, and May 2020.

Fig. 4. Time series of U.S. monthly oil price, production, consumption, and stocks.

As shown in Fig. 5, the coronavirus disease 2019 (COVID-19) pandemic and the oil price war between Russia and Saudi Arabia affected the monthly oil price in 2020, which was significantly lower than that in the same period of 2018 and 2019. Especially from February to March 2020, the oil price fell by $25.08, which represents a total reduction of 55.57%. Meanwhile, the end of the oil price war and effective control of the COVID-19 pandemic initiated the rise of crude oil prices in May 2020.

Fig. 5. Monthly oil price, production, consumption, and inventory in 2018, 2019, and 2020.

With the development of shale gas in the U.S., total petroleum production capacity has continued to grow in recent years. However, the collapse in the crude oil price caused by the oil price war and the decline in oil demand caused by the outbreak of the COVID-19 pandemic sharply reduced petroleum production in the U.S. In particular, from April to May 2020, oil production fell by 1,991.192 thousand BPD (barrels per day), a 16.58% drop. In May 2020, crude oil production approached its lowest level in three years.

Fig. 5 also illustrates the evident reduction in oil consumption in April since the beginning of the lockdown measures adopted by the U.S. government. With the reopening policy announced in May 2020, total petroleum consumption by the industrial sector saw a rapid recovery. Uncertainty surrounding the ongoing COVID-19 pandemic could also lead to uncertainty in oil consumption. Meanwhile, petroleum stocks in 2020 are consistently higher than those in the same periods of 2018 and 2019. Under the influence of falling demand for oil, petroleum stocks approached the highest level for recent years in June 2020.

As mentioned above, the impact caused by the COVID-19 pandemic and oil price war has caused oil markets to fluctuate dramatically in 2020. Accurately predicting the oil markets has become a huge challenge.

Because many studies show that stock markets and economic development affect the oil market, the Dow Jones Industrial Average (DJIA, abbreviated DJI below) and real Gross Domestic Product (GDP) from May 2017 to August 2020 were collected as two predictive indicators that influence oil markets [35]. DJIA was obtained from the financial portal “Investing.com,” and monthly GDP was collected from “Macroadvisers.com.” Meanwhile, this study utilized different keywords to collect oil news headlines released on the popular oil website “Oilprice.com” from May 2012 to August 2020. Notably, the oil news of each month is consolidated into one sample, giving a total of 100 observations. Fig. 6 illustrates the train, validation, and test periods of the oil market prediction model. In the CNN model, the training period is May 2012–April 2017, including 60 monthly records. The test period is May 2017–August 2020, consisting of 40 monthly records. The outputs of the CNN model over its test period serve as an input variable across the train, validation, and test periods of the oil market prediction model. The training period for the oil market forecasting model is May 2017–April 2019, consisting of 24 monthly observations. The validation period is May 2019–December 2019, including 8 monthly observations. The test period is January 2020–August 2020, including 8 monthly observations. A rolling window is used to estimate the oil market prediction model.

Fig. 6. Train, validation, and test period of oil market prediction model.
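The period boundaries stated above can be checked with a short sketch that enumerates the 100 months and slices them. The helper names are made up, and the `rolling_windows` generator shows only one common reading of "rolling window" (a fixed-length training window that slides forward one month at a time); the paper does not specify the window length or the exact scheme.

```python
def month_range(start_year, start_month, count):
    """Return `count` consecutive months as 'YYYY-MM' strings."""
    months = []
    year, month = start_year, start_month
    for _ in range(count):
        months.append(f"{year}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

months = month_range(2012, 5, 100)               # May 2012 .. August 2020
cnn_train, cnn_test = months[:60], months[60:]   # 60 and 40 monthly records
fc_train = cnn_test[:24]                         # forecasting model: training
fc_val = cnn_test[24:32]                         # forecasting model: validation
fc_test = cnn_test[32:]                          # forecasting model: test

def rolling_windows(n_obs, window):
    """Yield (train_indices, target_index) for a fixed-length rolling window."""
    for end in range(window, n_obs):
        yield list(range(end - window, end)), end

example_windows = list(rolling_windows(5, 3))
```

The slices reproduce the stated boundaries: the CNN training period ends in April 2017, and the forecasting test period runs from January to August 2020.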

4.2. Oil market forecasting

4.2.1. Oil price forecasting

  • (a) News selection and text mining

We collected four online oil news collections using the keywords “Crude oil,” “Crude oil price,” “American oil,” and “WTI.” Table 2 presents the number of news items in each collection. Notably, news with the keyword “WTI” is not available for every month between May 2012 and August 2020. Thus, we use the other three news collections in the subsequent analysis.

Table 2. Number of news using different keywords.

| Type | Keywords | Number of news in CNN train period | Number of news in CNN test period | Total number |
|---|---|---|---|---|
| International news | Crude oil | 2408 | 4385 | 6793 |
| | Crude oil price | 2674 | 5006 | 7680 |
| | WTI | Not enough news | Not enough news | 860 |
| Domestic news | American oil | 622 | 839 | 1461 |

Figs. 7, 8, and 9 show word clouds of the top 100 words in the news collected with the keywords “American oil,” “Crude oil,” and “Crude oil price,” respectively. The words of each collection are ranked by their term frequency-inverse document frequency (TF–IDF) weights, from largest to smallest.

Fig. 7. Word cloud in the news corpus with keywords “American oil”.

Fig. 8. Word cloud in the news corpus with keywords “Crude oil”.

Fig. 9. Word cloud in the news corpus with keywords “Crude oil price”.
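A minimal sketch of ranking words by TF–IDF weight is shown below. TF–IDF has several variants and the paper does not state which one was used; this sketch assumes relative term frequency, idf = ln(N / df), and a summed score per word, and the three toy documents are made up.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Sum each word's TF-IDF weight over all documents."""
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = Counter()
    for doc in documents:
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)                      # relative term frequency
            idf = math.log(n_docs / doc_freq[word])    # rarer words weigh more
            scores[word] += tf * idf
    return scores

docs = [
    ["crude", "inventory", "build"],
    ["crude", "price", "draw"],
    ["crude", "pipeline", "sanction", "sanction"],
]
scores = tfidf_scores(docs)
top_words = [word for word, _ in scores.most_common(3)]
```

In this toy corpus "crude" appears in every document, so its idf and score are zero, while "sanction" ranks first because it is frequent in one document and absent from the others.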

As shown in Fig. 7, the top 20 words, which are collected with keywords “American oil,” are as follows: “crude,” “API (American Petroleum Institute),” “report,” “build,” “inventory,” “price,” “U(USA),” “draw,” “Venezuela,” “sanction,” “gas,” “oil,” “pipeline,” “gasoline,” “Iran,” “surprise,” “energy,” “production,” “export,” and “shale”. For a more detailed analysis, “crude,” “gas,” “oil,” “pipeline,” “gasoline,” “energy,” and “shale” show a close relationship with oil. Meanwhile, “report,” “build,” “inventory,” “price,” “draw,” “production,” and “export” may represent online oil news related to oil supply, oil demand, oil inventory, and oil price. Furthermore, “API (American Petroleum Institute),” “U (USA),” “Venezuela,” “sanction,” and “Iran” may reflect related political and international events.

Fig. 8 illustrates the top 100 words collected with the keywords “Crude oil.” Compared with the words collected using the keywords “American oil,” these words refer to more countries and institutions, such as “U (USA),” “Aramco (Arabian-American Oil Company),” “OPEC (Organization of Petroleum Exporting Countries),” “Tesla,” “Venezuela,” “API (American Petroleum Institute),” “Iranian,” and “Saudi.” This suggests that the online oil news collected with the keywords “Crude oil” is more diverse and international. Fig. 9 illustrates that the top 100 words collected with the keywords “Crude oil price” are similar to those collected with the keywords “Crude oil.”

Appendix A lists the final parameter values of the CNN model in all examples. Table 3 presents the CNN classification results of the different news collections on the test dataset. Table 3 shows that the online oil news collected with the keywords “Crude oil” achieved the best performance. Fig. 10 shows the time series of oil price and the three CNN values. To show the connection more clearly, Fig. 11 shows the time series of oil price and the CNN values for the keywords “Crude oil.” Overall, the online oil news collected with the keywords “Crude oil” is the most appropriate dataset to predict the crude oil price. This finding indicates that the fluctuation of the crude oil price is influenced by various international events.

  • (b) Performance assessment

Table 3. CNN classification results of different datasets.

| Keywords | Accuracy | Precision | Recall | F-measure |
|---|---|---|---|---|
| Crude oil | 0.66 | 0.70 | 0.58 | 0.63 |
| Crude oil price | 0.61 | 0.65 | 0.37 | 0.47 |
| American oil | 0.60 | 0.61 | 0.57 | 0.59 |

Note: The accuracy, precision, recall, and F-measure of CNN classification are evaluated as follows: Accuracy = (TP + TN)/(TP + FP + TN + FN); Precision = TP/(TP + FP); Recall = TP/(TP + FN); F-measure = 2 × Precision × Recall/(Precision + Recall), where TP (true positive) is the number of positive cases which are classified as positive; FP (false positive) is the number of negative cases which are classified as positive; TN (true negative) is the number of negative cases which are classified as negative; and FN (false negative) is the number of positive cases which are classified as negative. The precision, recall, and F-measure in the table are all micro averages.
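The four measures in the note can be computed from confusion-matrix counts as sketched below. The counts are hypothetical and chosen only to exercise the formulas; the last line checks that the F-measure reported for “Crude oil” in Table 3 follows from its rounded precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

# Hypothetical counts, not taken from the paper.
accuracy, precision, recall, f_measure = classification_metrics(tp=7, fp=3, tn=6, fn=4)

# Consistency check on Table 3 ("Crude oil" row): precision 0.70, recall 0.58.
f_crude_oil = 2 * 0.70 * 0.58 / (0.70 + 0.58)
```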

Fig. 10. Time series of oil price and the three CNN values.

Fig. 11. Time series of crude oil price and keywords “Crude oil”.

The performance of the forecasting model is evaluated by three statistical criteria, namely, mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) [36]. These criteria measure the differences between actual and predicted values. MAPE, MAE, and RMSE are defined as follows:

$$\mathrm{MAPE} = \frac{1}{k}\sum_{t=1}^{k} \frac{|\hat{y}_t - y_t|}{y_t}, \tag{8}$$

$$\mathrm{MAE} = \frac{1}{k}\sum_{t=1}^{k} |\hat{y}_t - y_t|, \tag{9}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{k} (\hat{y}_t - y_t)^2}{k}}, \tag{10}$$

where $k$ is the number of months in the test dataset, $\hat{y}_t$ is the predicted value at month $t$, and $y_t$ is the actual value at month $t$.
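Eqs. (8)–(10) translate directly into code. The sketch below uses made-up actual and predicted values chosen so that the results are easy to verify by hand.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, Eq. (8), as a fraction."""
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error, Eq. (9)."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, Eq. (10)."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [100.0, 200.0, 400.0]      # made-up values
predicted = [110.0, 190.0, 400.0]
errors = (mape(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free, which is why it can be compared across price, production, and consumption series, whereas MAE and RMSE are expressed in the units of each series.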

  • (c) Results and discussion

The lag order selection results of all examples are shown in Appendix B. The results demonstrate that the optimal lag orders for the text features extracted by CNN, historical data, DJI, and GDP are one, two, two, and two, respectively. Table 4 presents the descriptive statistics of the CNN values with keywords “Crude oil,” historical oil price, DJI, GDP, and oil price datasets. According to the Jarque-Bera test, the historical oil price, GDP, and oil price datasets all reject the null hypothesis of normal distribution. According to the Augmented Dickey-Fuller (ADF) test, the unit-root null hypothesis cannot be rejected for the historical oil price, DJI, GDP, and oil price datasets, which indicates that they are non-stationary.
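One way to turn the selected lag orders into model inputs for one-step-ahead prediction is sketched below. This is an assumed construction, not one stated in the paper: a lag order of p is read as using the values at t, t-1, ..., t-p to predict the target at t+1. The function names and toy series are hypothetical.

```python
def lagged_row(series, t, lag_order):
    """Values at t, t-1, ..., t-lag_order (newest first)."""
    return [series[t - k] for k in range(lag_order + 1)]

def build_dataset(features, lag_orders, target):
    """Build (inputs, outputs) so that target[t + 1] is predicted from lags up to t."""
    max_lag = max(lag_orders.values())
    inputs, outputs = [], []
    for t in range(max_lag, len(target) - 1):
        row = []
        for name, series in features.items():
            row.extend(lagged_row(series, t, lag_orders[name]))
        inputs.append(row)
        outputs.append(target[t + 1])
    return inputs, outputs

# Made-up monthly series for illustration.
price = [10, 11, 12, 13, 14, 15]
text_feature = [0.5, 0.6, 0.4, 0.7, 0.3, 0.8]
inputs, outputs = build_dataset(
    features={"text": text_feature, "price": price},
    lag_orders={"text": 1, "price": 2},
    target=price,
)
```

The first rows are dropped because the longest lag needs that many earlier observations before a complete input row exists.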

Table 4. Descriptive statistics of the CNN values, historical oil price, DJI, GDP, and oil price datasets.

| Statistic | CNN | Historical oil price | DJI | GDP | Oil price |
|---|---|---|---|---|---|
| Mean | 0.5099 | 52.5055 | 25139.47 | 18627.88 | 51.9398 |
| Median | 0.5112 | 54.8 | 25396.24 | 18696.71 | 54.8 |
| Std. Dev. | 0.0046 | 16.0557 | 304.6084 | 88.0079 | 16.5658 |
| Skewness | −0.3799 | −1.3844 | −0.3707 | −1.6744 | −1.2559 |
| Kurtosis | 0.2140 | 1.6120 | −0.2720 | 4.6609 | 1.0222 |
| Jarque-Bera | 0.8936 | 14.5199∗∗∗ | 1.0959 | 43.3310∗∗∗ | 10.6770∗∗ |
| Augmented Dickey-Fuller | −4.8209∗∗∗ | −1.3072 | −2.1983 | −1.8139 | −1.4594 |

Note: ∗, ∗∗, and ∗∗∗ represent significance at the 10%, 5%, and 1% levels, respectively.

The typical techniques and artificial intelligence (AI) models adopted above are compared to select the best method for forecasting the oil price. Table 5 lists the performance of the different models using different inputs. The results show that the BPNN model obtains the best forecasting performance.

Table 5. Performance comparison of different techniques.

| Model | Input | MAPE | RMSE | MAE | IR(MAPE) |
|---|---|---|---|---|---|
| BPNN | 1 | 46.78% | 11.1041 | 8.4084 | |
| | 2 | 9.56% | 2.2285 | 1.6749 | 79.56% |
| | 3 | 28.51% | 6.6743 | 5.3789 | 39.06% |
| | 4 | 7.17% | 1.9194 | 1.5775 | 84.67% |
| MLR | 1 | 76.27% | 14.5121 | 12.8406 | |
| | 2 | 76.19% | 14.6423 | 12.7107 | 0.10% |
| | 3 | 64.45% | 15.4351 | 12.0035 | 15.50% |
| | 4 | 56.28% | 12.9765 | 10.4790 | 26.21% |
| SVM | 1 | 36.17% | 8.9761 | 7.4284 | |
| | 2 | 26.97% | 7.1991 | 6.4495 | 19.20% |
| | 3 | 46.59% | 14.4929 | 10.1242 | −28.81% |
| | 4 | 23.60% | 5.2307 | 4.0115 | 34.75% |
| LSTM | 1 | 54.43% | 14.5938 | 11.1827 | |
| | 2 | 32.53% | 9.7595 | 8.1046 | 40.24% |
| | 3 | 44.40% | 8.2702 | 6.9034 | 18.43% |
| | 4 | 31.88% | 6.7175 | 5.2260 | 41.43% |
| RNN | 1 | 58.92% | 12.3870 | 11.2355 | |
| | 2 | 22.53% | 4.7991 | 4.4857 | 61.76% |
| | 3 | 50.54% | 15.0736 | 11.4218 | 14.22% |
| | 4 | 17.02% | 4.7706 | 3.3586 | 71.11% |

Note: Input “1” means that only historical data are used to predict. Input “2” means that historical data and text features are used. Input “3” means that historical data and financial data are used. Input “4” means that historical data, financial data, and text features are all used. IR(MAPE) is the improvement rate of MAPE from input “1” to input “2”, “3”, or “4”. The grid search method is used to determine the parameters of the adopted algorithms [37]. Appendix C lists the final parameter values of these forecasting models in all examples.

Furthermore, Fig. 12 shows the forecasting performances of different predictive factors using BPNN. As shown in Table 5 and Fig. 12, combining historical data, text features, and financial features yields the best forecasting performance. In particular, in terms of IR(MAPE), the BPNN forecast that uses text features, financial features, and historical data improves on the BPNN forecast that uses only the historical oil price by 84.67%. In most scenarios, adding financial data or text features improves forecasting performance over using only the historical oil price. This finding suggests that online news and financial data may provide additional predictive information for oil price forecasting.
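The improvement rate can be checked against the BPNN rows of Table 5. The sketch below assumes that IR(MAPE) is the relative reduction in MAPE with respect to input “1”, that is, (MAPE₁ − MAPEᵢ) / MAPE₁; this formula is not stated explicitly in the text but is consistent with the tabulated values. The function name is illustrative.

```python
def improvement_rate(mape_baseline, mape_candidate):
    """Relative MAPE reduction versus the baseline input set, in percent."""
    return (mape_baseline - mape_candidate) / mape_baseline * 100.0

# MAPE (in percent) of the BPNN model for input sets 1 to 4, from Table 5
bpnn_mape = {1: 46.78, 2: 9.56, 3: 28.51, 4: 7.17}

ir = {
    inputs: round(improvement_rate(bpnn_mape[1], value), 2)
    for inputs, value in bpnn_mape.items()
    if inputs != 1
}
print(ir)
```

A negative rate, as for SVM with input “3” in Table 5, means that the added features made the MAPE worse than the historical-data baseline.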

  • (d)

    Evaluate the importance of each predictor using the MIV approach

Fig. 12.


Forecasting performances of different predictive factors using BPNN.

The mean impact value (MIV) approach is implemented to evaluate the relative importance of each predictor quantitatively. Each input variable is changed by ±10% to generate two new training sets. The average difference between the two simulation results is the MIV of that variable. Fig. 13 presents the ranking of the MIV for oil price prediction. The DJI (−2, −1, 0), GDP (−1, 0), text features (0, −1), and historical data (−2) are important factors. The historical oil price data (−1, 0) and GDP (−2) have little effect on the oil price prediction. This finding suggests that financial information and online news have more explanatory power than historical oil prices for oil price forecasting.
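The MIV procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions: `model` is any callable standing in for the trained network, the perturbation is applied to the input rows and the model is evaluated on them, and `toy_model` is a hypothetical linear map, not the paper's BPNN.

```python
def mean_impact_values(model, rows, delta=0.10):
    """Return one MIV per input column.

    model: callable mapping a list of input values to a single prediction
    rows:  list of input rows (lists of floats)
    delta: relative perturbation, 0.10 for the +/-10% used in the text
    """
    n_features = len(rows[0])
    mivs = []
    for j in range(n_features):
        diffs = []
        for row in rows:
            up = list(row)
            down = list(row)
            up[j] = row[j] * (1 + delta)    # variable j increased by 10%
            down[j] = row[j] * (1 - delta)  # variable j decreased by 10%
            diffs.append(model(up) - model(down))
        mivs.append(sum(diffs) / len(diffs))  # average output difference
    return mivs

def toy_model(x):
    """Hypothetical stand-in for a trained forecasting model."""
    return 2.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]

rows = [[1.0, 4.0, 10.0], [3.0, 2.0, 20.0]]
mivs = mean_impact_values(toy_model, rows)

# Rank predictors by the absolute size of their impact, largest first
ranking = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
print(mivs, ranking)
```

The sign of an MIV gives the direction of the effect and its absolute value gives the relative importance, so rankings of this kind are usually based on the absolute value.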

Fig. 13.


Ranking of the mean impact value for oil price prediction using BPNN.

4.2.2. Oil production forecasting

We collected three online oil news collections with the keywords “Crude oil,” “American oil,” and “American oil production.” Table 6 presents the number of news items in each collection. News with the keywords “American oil production” is not available for every month, so this collection is not used. The news collected with the keywords “Crude oil” reflects all international oil events, and the news collected with “American oil” reflects oil events related to America.

Table 6.

Number of news using different keywords.

| Type | Keywords | Number of news in CNN train period | Number of news in CNN test period | Total number |
| --- | --- | --- | --- | --- |
| International news | Crude oil | 2408 | 4385 | 6793 |
| Domestic news | American oil | 622 | 839 | 1461 |
| | American oil production | Not enough news | Not enough news | 885 |

Table 7 presents the classification performances of the CNN model. The results show that the online oil news collections with keywords “Crude oil,” achieved a better performance. Fig. 14 shows the time series of oil production and the two CNN values. The online oil news collections with the keywords “Crude oil,” are the appropriate dataset to predict oil production.
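The four classification criteria reported for the CNN model are related in a standard way. The sketch below assumes a binary classification setting and takes the F-measure to be the harmonic mean of precision and recall (the F1 score); this definition is not spelled out in this section but agrees with the reported values. The confusion-matrix counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall, f_measure(precision, recall)

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, for illustration only
acc, prec, rec, f1 = classification_metrics(tp=30, fp=10, fn=20, tn=40)
print(acc, prec, rec, round(f1, 4))

# Reported (precision, recall, F-measure) for "Crude oil" and "American oil"
reported = [(0.70, 0.70, 0.70), (0.66, 0.68, 0.67)]
for p, r, f in reported:
    print(round(f_measure(p, r), 2), f)
```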

Table 7.

CNN classification results.

| Keywords | Accuracy | Precision | Recall | F-measure |
| --- | --- | --- | --- | --- |
| Crude oil | 0.70 | 0.70 | 0.70 | 0.70 |
| American oil | 0.66 | 0.66 | 0.68 | 0.67 |

Fig. 14.


Time series of oil production and the two CNN values.

Based on the VAR model, the optimal lag orders for text features, historical oil production data, DJI, and GDP are one, one, three, and three, respectively. Table 8 presents the descriptive statistics of the CNN values with keywords “Crude oil,” historical oil production, and oil production datasets. According to the Jarque-Bera test, only the historical oil production dataset rejects the null hypothesis of normal distribution. Based on the ADF test, the historical oil production and oil production datasets are non-stationary.

Table 8.

Descriptive statistics of CNN values, historical oil production, and oil production datasets.

| Statistic | CNN | Historical oil production | Oil production |
| --- | --- | --- | --- |
| Mean | 0.3409 | 11135.46 | 11172.07 |
| Median | 0.3311 | 11423.45 | 11423.45 |
| Std. Dev. | 0.0803 | 1220.75 | 1179.87 |
| Skewness | 0.2133 | −0.2150 | −0.2034 |
| Kurtosis | 0.3729 | −1.3088 | −1.2607 |
| Jarque-Bera | 0.3358 | 3.0895∗ | 2.8797 |
| Augmented Dickey-Fuller | −3.1838∗∗ | −2.0042 | −1.8434 |

Note: The descriptive statistics of DJI and GDP are shown in Table 4.

Table 9 lists the performance comparison of different models using different inputs. The best MAPE was achieved using the SVM model with all selected features. Fig. 15 shows the forecasting performances of different predictive factors using SVM. Fig. 16 shows the ranking of the mean impact value for oil production prediction. The MIV results show that the GDP (−3, −2, −1, 0), DJI (−2, 0), text features (0), and historical data (0) are important factors, whereas the DJI (−3, −1), text features (−1), and historical data (−1) contribute little to oil production forecasting. Online news exhibits some explanatory power for oil production forecasting, and GDP is the best predictor of oil production.

Table 9.

Performance comparison of different models.

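| Model | Input | MAPE | RMSE | MAE | IR(MAPE) |
| --- | --- | --- | --- | --- | --- |
| BPNN | 1 | 4.40% | 616.4990 | 475.5733 | |
| BPNN | 2 | 3.94% | 625.7280 | 428.3586 | 10.45% |
| BPNN | 3 | 2.24% | 294.0954 | 254.1759 | 49.09% |
| BPNN | 4 | 1.69% | 262.4757 | 190.5080 | 61.59% |
| MLR | 1 | 5.10% | 847.4493 | 545.5473 | |
| MLR | 2 | 5.37% | 907.5902 | 574.5642 | −5.29% |
| MLR | 3 | 16.70% | 2417.1 | 1828.2 | −227.45% |
| MLR | 4 | 12.18% | 1659.3 | 1345.4 | −138.82% |
| SVM | 1 | 3.59% | 504.9028 | 398.3374 | |
| SVM | 2 | 4.16% | 609.1765 | 738.3157 | −15.88% |
| SVM | 3 | 2.04% | 293.7686 | 230.5830 | 43.18% |
| SVM | 4 | 1.44% | 213.6692 | 163.0783 | 59.89% |
| LSTM | 1 | 4.23% | 738.3157 | 447.6391 | |
| LSTM | 2 | 4.28% | 589.7330 | 485.2420 | −1.18% |
| LSTM | 3 | 3.40% | 477.3984 | 369.1028 | 19.62% |
| LSTM | 4 | 2.93% | 649.7016 | 339.4954 | 30.73% |
| RNN | 1 | 3.84% | 500.2649 | 424.0437 | |
| RNN | 2 | 3.61% | 627.1218 | 387.3984 | 5.99% |
| RNN | 3 | 3.37% | 691.9704 | 394.2134 | 12.24% |
| RNN | 4 | 3.06% | 433.1777 | 337.4889 | 20.31% |

Fig. 15.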


Forecasting performances of different predictive factors using SVM.

Fig. 16.


Ranking of the mean impact value for oil production prediction.

4.2.3. Oil consumption forecasting

We collected three online oil news collections with different keywords “Crude oil,” “American oil,” and “American oil consumption.” Table 10 shows the different news collections.

Table 10.

Number of news using different keywords.

| Type | Keywords | Number of news in CNN train period | Number of news in CNN test period | Total number |
| --- | --- | --- | --- | --- |
| International news | Crude oil | 2408 | 4385 | 6793 |
| Domestic news | American oil | 622 | 839 | 1461 |
| | American oil consumption | Not enough news | Not enough news | 60 |

Table 11 exhibits the classification results of the CNN model. The results show that the online oil news collection with the keywords “American oil” achieved better performance. This finding suggests that oil consumption is closely related to oil news or events related to America. Fig. 17 illustrates the time series of crude oil consumption and the two CNN values. Fig. 18 shows the time series of oil consumption and the CNN values with keywords “American oil.”

Table 11.

Classification performance of the CNN model.

| Keywords | Accuracy | Precision | Recall | F-measure |
| --- | --- | --- | --- | --- |
| Crude oil | 0.60 | 0.63 | 0.52 | 0.57 |
| American oil | 0.64 | 0.66 | 0.56 | 0.61 |

Fig. 17.


Time series of oil consumption and the two CNN values.

Fig. 18.


Time series of oil consumption and CNN values with keywords “American oil”.

The results demonstrate that the appropriate lag orders for the text features with keywords “American oil,” historical oil consumption data, DJI, and GDP are one, two, one, and two, respectively. Table 12 shows the descriptive statistics of the CNN values with keywords “American oil,” historical oil consumption, and oil consumption datasets. Based on the ADF test, the unit-root null hypothesis is rejected at the 1% significance level for both the historical oil consumption and oil consumption datasets, which indicates that these two series are stationary.

Table 12.

Descriptive statistics of CNN values, historical oil consumption, and oil consumption datasets.

| Statistic | CNN | Historical oil consumption | Oil consumption |
| --- | --- | --- | --- |
| Mean | 0.4803 | 5084.385 | 5092.151 |
| Median | 0.4841 | 45.8462 | 5120.261 |
| Std. Dev. | 0.0128 | 289.9571 | 45.2494 |
| Skewness | −0.6523 | −0.2681 | −0.337 |
| Kurtosis | 0.5070 | 0.2633 | 0.4568 |
| Jarque-Bera | 2.7751 | 0.4555 | 0.8095 |
| Augmented Dickey-Fuller | −2.0265 | −3.8996∗∗∗ | −3.8380∗∗∗ |

Table 13 lists the performance comparison of different models using different inputs. The best MAPE, RMSE, and MAE were obtained using the BPNN model with text features and historical data. Fig. 19 shows the forecasting performances of different predictive factors using BPNN. Fig. 20 shows the ranking of the mean impact value for oil consumption prediction. The MIV results show that the historical data (0, −2) and text features (0, −1) are important factors, whereas the historical data (−1) contribute little to oil consumption forecasting.

Table 13.

Performance comparison of different models.

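| Model | Input | MAPE | RMSE | MAE | IR(MAPE) |
| --- | --- | --- | --- | --- | --- |
| BPNN | 1 | 4.08% | 364.7069 | 182.6500 | |
| BPNN | 2 | 1.68% | 94.5647 | 80.3324 | 58.82% |
| BPNN | 3 | 2.98% | 176.9245 | 144.6125 | 26.96% |
| BPNN | 4 | 2.93% | 191.2712 | 136.6095 | 28.19% |
| MLR | 1 | 5.61% | 355.8126 | 258.9670 | |
| MLR | 2 | 4.97% | 339.5771 | 228.4414 | 11.41% |
| MLR | 3 | 80.80% | 4330.1 | 3994.4 | −1340.29% |
| MLR | 4 | 27.59% | 1670.8 | 1283.2 | −391.80% |
| SVM | 1 | 4.12% | 329.7182 | 184.5873 | |
| SVM | 2 | 3.24% | 174.3819 | 155.5488 | 21.36% |
| SVM | 3 | 2.85% | 152.4080 | 137.7221 | 30.83% |
| SVM | 4 | 2.78% | 156.5058 | 134.2022 | 32.52% |
| LSTM | 1 | 3.53% | 316.4694 | 157.4753 | |
| LSTM | 2 | 2.55% | 135.8281 | 123.2782 | 27.76% |
| LSTM | 3 | 3.45% | 194.4310 | 161.6923 | 2.27% |
| LSTM | 4 | 3.86% | 277.9089 | 181.9194 | −9.35% |
| RNN | 1 | 3.48% | 314.4836 | 154.8826 | |
| RNN | 2 | 2.72% | 183.2028 | 129.0973 | 21.84% |
| RNN | 3 | 1.91% | 110.1180 | 91.6264 | 45.11% |
| RNN | 4 | 3.41% | 198.0660 | 160.7579 | 2.01% |

Fig. 19.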


Forecasting performances of different predictive factors using BPNN.

Fig. 20.


Ranking of the mean impact value for oil consumption prediction.

As shown in Table 13 and Fig. 19, forecasting with historical data and text features outperforms forecasting with historical data alone. In terms of IR(MAPE), the BPNN forecast that uses text features and historical data improves on the BPNN forecast that uses only historical oil consumption by 58.82%. Meanwhile, adding text features to historical data also improves performance over using only historical oil consumption in the other four methods. This finding suggests that online news may provide additional predictive information for oil consumption forecasting.

4.2.4. Oil inventory forecasting

Three online oil news collections with the keywords “Crude oil,” “American oil,” and “American oil inventory” are collected. Table 14 presents the number of news items in each collection. The collection with the keywords “American oil inventory,” which includes only 294 pieces of news, is not available for every month.

Table 14.

Number of news using different keywords.

| Type | Keywords | Number of news in CNN train period | Number of news in CNN test period | Total number |
| --- | --- | --- | --- | --- |
| International news | Crude oil | 2408 | 4385 | 6793 |
| Domestic news | American oil | 622 | 839 | 1461 |
| | American oil inventory | Not enough news | Not enough news | 294 |

Table 15 presents the classification results of CNN. The results show that the online oil news collections with keywords “Crude oil,” achieved a better performance. Fig. 21 shows the time series of oil inventory and the two CNN values.

Table 15.

Results of CNN classification.

| Keywords | Accuracy | Precision | Recall | F-measure |
| --- | --- | --- | --- | --- |
| Crude oil | 0.75 | 0.75 | 0.73 | 0.74 |
| American oil | 0.69 | 0.71 | 0.62 | 0.66 |

Fig. 21.


Time series of oil inventory and the two CNN values.

The results of the VAR model demonstrate that the optimal lag orders for text features, historical oil inventory data, DJI, and GDP are two, two, one, and two, respectively. Table 16 presents the descriptive statistics of the CNN values with keywords “Crude oil,” historical oil inventory, and oil inventory datasets. Based on the Jarque-Bera test, the historical oil inventory and oil inventory datasets both reject the null hypothesis of normal distribution. Based on the ADF test, the unit-root null hypothesis is rejected for the CNN values at the 1% significance level, which indicates stationarity, whereas it is not rejected for the historical oil inventory and oil inventory datasets.

Table 16.

Descriptive statistics of CNN values, historical oil inventory, and oil inventory datasets.

| Statistic | CNN | Historical oil inventory | Oil inventory |
| --- | --- | --- | --- |
| Mean | 0.4320 | 1941.638 | 1943.099 |
| Median | 0.4617 | 1923.667 | 1923.667 |
| Std. Dev. | 0.1337 | 10.1227 | 66.5930 |
| Skewness | −0.2266 | 1.1259 | 1.1397 |
| Kurtosis | −0.9941 | 0.8091 | 0.6487 |
| Jarque-Bera | 2.0521 | 8.3552∗∗ | 8.3145∗∗ |
| Augmented Dickey-Fuller | −4.2322∗∗∗ | −1.1972 | −1.6843 |

Table 17 lists the performance comparison of different models using different inputs. The best MAPE was achieved using the LSTM model with historical data and text features. Fig. 22 shows the forecasting performances of different predictive factors using LSTM. However, the BPNN, MLR, and RNN models generally perform better when only historical data are used to forecast. In terms of IR(MAPE), adding text features or financial features yields a worse MAPE than using only historical data in most cases. Overall, we conclude that oil news and financial features might not help forecast oil inventory.

Table 17.

Performance comparison of different models.

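| Model | Input | MAPE | RMSE | MAE | IR(MAPE) |
| --- | --- | --- | --- | --- | --- |
| BPNN | 1 | 1.52% | 40.0705 | 30.6790 | |
| BPNN | 2 | 2.24% | 52.0491 | 45.9824 | −47.37% |
| BPNN | 3 | 1.51% | 44.1664 | 30.2314 | 0.66% |
| BPNN | 4 | 1.66% | 39.5310 | 33.9962 | −9.21% |
| MLR | 1 | 1.60% | 38.7095 | 32.4300 | |
| MLR | 2 | 1.73% | 40.9421 | 35.1122 | −8.1% |
| MLR | 3 | 2.77% | 67.0007 | 56.1855 | −73.13% |
| MLR | 4 | 2.59% | 59.2417 | 52.7921 | −61.88% |
| SVM | 1 | 1.43% | 39.3432 | 28.7629 | |
| SVM | 2 | 1.43% | 39.3953 | 26.4642 | 0.00% |
| SVM | 3 | 2.00% | 56.6477 | 41.6142 | −39.86% |
| SVM | 4 | 2.38% | 52.1609 | 48.6861 | −66.43% |
| LSTM | 1 | 1.31% | 39.8103 | 26.4642 | |
| LSTM | 2 | 1.14% | 30.4406 | 23.3398 | 12.98% |
| LSTM | 3 | 1.69% | 43.0887 | 34.5260 | −29.01% |
| LSTM | 4 | 1.55% | 40.4426 | 31.5797 | −18.32% |
| RNN | 1 | 1.37% | 40.0492 | 27.5696 | |
| RNN | 2 | 1.58% | 36.1341 | 31.5475 | −15.33% |
| RNN | 3 | 2.10% | 54.8052 | 43.2982 | −53.28% |
| RNN | 4 | 1.66% | 51.7864 | 33.1418 | −21.68% |

Fig. 22.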


Forecasting performances of different predictive factors using LSTM.

4.3. Results analysis and managerial implications

  • (a)

    Result analysis of text mining

The results show that the online oil news collection with the keywords “Crude oil” is conducive to oil price and oil production forecasting, and the collection with the keywords “American oil” facilitates oil consumption forecasting. This finding indicates that the fluctuation of crude oil price, oil production, and oil inventory is influenced by various international events, whereas oil consumption is influenced by domestic events. The CNN trained on the “Crude oil” collection for oil inventory reached 75% classification accuracy. However, the forecasting experiments demonstrate that using oil news might not help predict oil inventory.

  • (b)

    News facilitates oil price, production, and consumption forecasting

Oil price, production, and consumption fluctuated violently during the COVID-19 pandemic, and using online media information yields better forecasting performance for these series. Online media information includes recent events related to oil markets and the media sentiment about short-term oil markets. The results illustrate that online media information can facilitate crude oil price, production, and consumption forecasting, especially during the COVID-19 pandemic. This finding suggests that news might help predict volatile data more than smooth or regular data.

  • (c)

    Managerial implications

Affected by the coronavirus disease 2019 (COVID-19) pandemic and the oil price war between Russia and Saudi Arabia, the oil price fell by a total of 55.57% from February to March 2020. The collapse in crude oil price and the decline in oil demand caused by the outbreak of the COVID-19 pandemic have sharply reduced petroleum production capacity in the U.S. Oil consumption has dropped significantly since the beginning of the lockdown measures adopted by the U.S. government. With the reopening policy released in May 2020, oil consumption saw a rapid recovery. Consequently, forecasting oil price, production, and consumption becomes challenging. Fortunately, as social media messages contain explanations or analyses of the relevant restrictions or reopening policies, online oil news can be used to predict the large fluctuations in oil price, production, and consumption during the COVID-19 pandemic more accurately.

The results of oil price, production, and consumption prediction have helpful implications for marketers. They help marketers deepen their understanding of the internal relationship between social media information and oil markets. Social media information can be used to estimate whether the news is positive or negative for the oil markets. An important positive correlation exists between social media information and the performance of oil markets. In other words, oil markets tend to perform better when social media information shows optimism about oil markets. By contrast, when social media information shows a negative sentiment toward oil markets, oil markets tend to underperform. Therefore, the prevailing mood of social media information should be considered when carrying out oil marketing activities. Overall, marketers must consider the impact of social media information on the oil markets or other markets, especially during the COVID-19 pandemic.

5. Conclusion and future research

Owing to the impact of the COVID-19 pandemic, the oil markets have shown profound uncertainty and volatility, which pose a huge challenge to accurate forecasting. To overcome this challenge, this study uses qualitative information to forecast oil markets, as online media information can reflect a variety of oil-related social events or unexpected political events and plays an important role in the fluctuating trend of the oil markets. Accordingly, we use online news as a predictor for forecasting oil price, production, consumption, and inventory during the COVID-19 pandemic. The CNN, which is a deep learning model, is employed to extract online news text features automatically. MLR, BPNN, SVM, RNN, and LSTM are used as forecasting techniques. The results demonstrate that social media information contributes to the forecasting of oil price, production, and consumption. However, the forecasting accuracy of oil inventory is not improved by online news information.

The main contribution is the introduction of social media information to forecast oil price, production, and consumption. In particular, this study is the first attempt to use online oil news to forecast oil production and consumption, and good forecasting performance is achieved. The findings broaden the theoretical insights for forecasting methodology in that social media information might facilitate the prediction of unstable oil price, production, and consumption during the COVID-19 pandemic.

Some limitations and potential extensions are acknowledged. First, other public opinions on breaking news, press releases, and regulatory announcements can be accessed to distinguish sentiments and extract information. Second, other emerging text classification technologies, such as Graph Neural Network (GNN) and long short-term memory (LSTM), are worth attempting [[38], [39]]. Third, to explore the applicability of this study, online social information can also be used to forecast other markets affected by the pandemic, such as stock markets and gold markets. In future research, we will also use the proposed methodology to handle more forecasting problems under complex environments [40].

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research is partially supported by National Natural Science Foundation of China (No: 71810107003).

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.energy.2021.120403.

Credit author statement

Binrong Wu: Conceptualization, Methodology, Software, Investigation, Writing – original draft. Lin Wang: Conceptualization, Methodology, Writing – reviewing and editing, Supervision, Funding acquisition. Sirui Wang: Software, Visualization, Validation. Yu-Rong Zeng: Conceptualization, Methodology, Visualization, Validation, Writing – reviewing and editing.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.zip (55.1KB, zip)

References

  • 1. Ou S., He X., Ji W., Chen W., Sui L., Gan Y., et al. Machine learning model to project the impact of COVID-19 on US motor gasoline demand. Nature Energy. 2020;5(9):666–673. doi: 10.1038/s41560-020-00711-7.
  • 2. Yu L., Dai W., Tang L. A novel decomposition ensemble model with extended extreme learning machine for crude oil price forecasting. Eng Appl Artif Intell. 2016;47:110–121.
  • 3. Santiago I., Moreno-Munoz A., Quintero-Jiménez P., Garcia-Torres F., Gonzalez-Redondo M.J. Electricity demand during pandemic times: the case of the COVID-19 in Spain. Energy Pol. 2021;148:111964. doi: 10.1016/j.enpol.2020.111964.
  • 4. International Energy Agency (IEA) 2020. An unprecedented global Health and economic crisis. https://www.iea.org/topics/covid-19
  • 5. Sharif A., Aloui C., Yarovaya L. COVID-19 pandemic, oil prices, stock market, geopolitical risk and policy uncertainty nexus in the US economy: fresh evidence from the wavelet-based approach. Int Rev Financ Anal. 2020;70:101496. doi: 10.1016/j.irfa.2020.101496.
  • 6. Hashem I.A.T., Yaqoob I., Anuar N.B., Mokhtar S., Gani A., Ullah Khan S. The rise of “big data” on cloud computing: review and open research issues. Inf Syst. 2015;47:98–115.
  • 7. Li X., Law R., Xie G., Wang S. Review of tourism forecasting research with internet data. Tourism Manag. 2021;83:104245.
  • 8. Curiskis S.A., Drake B., Osborn T.R., Kennedy P.J. An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit. Inf Process Manag. 2020;57(2):102034.
  • 9. Mosa M.A., Anwar A.S., Hamouda A. A survey of multiple types of text summarization with their satellite contents based on swarm intelligence optimization algorithms. Knowl Base Syst. 2019;163:518–532.
  • 10. Zhang M., Fan B., Zhang N., Wang W., Fan W. Mining product innovation ideas from online reviews. Inf Process Manag. 2021;58(1):102389.
  • 11. Jeong B., Yoon J., Lee J. Social media mining for product planning: a product opportunity mining approach based on topic modeling and sentiment analysis. Int J Inf Manag. 2019;48:280–290.
  • 12. Hemmatian F., Sohrabi M.K. A survey on classification techniques for opinion mining and sentiment analysis. Artif Intell Rev. 2017:1–51.
  • 13. Er M.J., Zhang Y., Wang N., Pratama M. Attention pooling-based convolutional neural network for sentence modelling. Inf Sci. 2016;373:388–403.
  • 14. Esteva A., Kuprel B., Novoa R.A., Ko J., Swetter S.M., Blau H.M., et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–118. doi: 10.1038/nature21056.
  • 15. Sainath T.N., Kingsbury B., Saon G., Soltau H., Mohamed A., Dahl G., et al. Deep convolutional neural networks for large-scale speech tasks. Neural Network. 2015;64:39–48. doi: 10.1016/j.neunet.2014.08.005.
  • 16. Ksiazek T.B., Peer L., Lessard K. User engagement with online news: conceptualizing interactivity and exploring the relationship between online news videos and user comments. New Media Soc. 2016;18(3):502–520.
  • 17. de Albuquerquemello V.P., de Medeiros R.K., da Nóbrega Besarria C., Maia S.F. Forecasting crude oil price: does exist an optimal econometric model? Energy. 2018;155:578–591.
  • 18. Ghoddusi H., Creamer G.G., Rafizadeh N. Machine learning in energy economics and finance: a review. Energy Econ. 2019;81:709–727.
  • 19. Karasu S., Altan A., Bekiros S., Ahmad W. A new forecasting model with wrapper-based feature selection approach using multi-objective optimization technique for chaotic crude oil time series. Energy. 2020;212:118750.
  • 20. Zhao L., Wang Y., Guo S., Zeng G. A novel method based on numerical fitting for oil price trend forecasting. Appl Energy. 2018;220:154–163.
  • 21. Wang J., Li X., Hong T., Wang S. A semi-heterogeneous approach to combining crude oil price forecasts. Inf Sci. 2018;460–461:279–292.
  • 22. Bekiroglu K., Duru O., Gulay E., Su R., Lagoa C. Predictive analytics of crude oil prices by utilizing the intelligent model search engine. Appl Energy. 2018;228:2387–2397.
  • 23. Li X., Shang W., Wang S. Text-based crude oil price forecasting: a deep learning approach. Int J Forecast. 2019;35(4):1548–1560.
  • 24. Wu B., Wang L., Lv S., Zeng Y. Effective crude oil price forecasting using new text-based and big-data-driven model. Measurement. 2021;168:108468.
  • 25. Wang Q., Song X., Li R. A novel hybridization of nonlinear grey model and linear ARIMA residual correction for forecasting U.S. shale oil production. Energy. 2018;165:1320–1331.
  • 26. Liu W., Liu W.D., Gu J. Forecasting oil production using ensemble empirical model decomposition based Long Short-Term Memory neural network. J Petrol Sci Eng. 2020;189:107013.
  • 27. Yuan C., Zhu Y., Chen D., Liu S., Fang Z. Using the GM (1,1) model cluster to forecast global oil consumption. Grey Systems. 2017;7(2):286–296.
  • 28. Xiao J., Li Y., Xie L., Liu D., Huang J. A hybrid model based on selective ensemble for energy consumption forecasting in China. Energy. 2018;159:534–546.
  • 29. Wang Q., Song X. Forecasting China’s oil consumption: a comparison of novel nonlinear-dynamic grey model (GM), linear GM, nonlinear GM and metabolism GM. Energy. 2019;183:160–171.
  • 30. Yu L., Zhao Y., Tang L., Yang Z. Online big data-driven oil consumption forecasting with Google trends. Int J Forecast. 2019;35(1):213–223.
  • 31. Wen S., Zhang C., Lan H., Xu Y., Tang Y., Huang Y. A hybrid ensemble model for interval prediction of solar power output in ship onboard power systems. IEEE Transactions on Sustainable Energy. 2021;12(1):14–24.
  • 32. Su H., Zio E., Zhang J., Xu M., Li X., Zhang Z. A hybrid hourly natural gas demand forecasting method based on the integration of wavelet transform and enhanced Deep-RNN model. Energy. 2019;178:585–597.
  • 33. Li J., Wang J. Forcasting of energy futures market and synchronization based on stochastic gated recurrent unit model. Energy. 2020;213:118787.
  • 34. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735.
  • 35. Cifarelli G., Paladino G. Oil price dynamics and speculation A multivariate financial approach. Energy Econ. 2010;32(2):363–372.
  • 36. Nikolopoulos K., Punia S., Schäfers A., Tsinopoulos C., Vasilakis C. Forecasting and planning during a pandemic: COVID-19 growth rates, supply chain disruptions, and governmental decisions. Eur J Oper Res. 2021;290(1):99–115. doi: 10.1016/j.ejor.2020.08.001.
  • 37. Wang L., Wu B., Zhu Q., Zeng Y. Forecasting monthly tourism demand using enhanced backpropagation neural network. Neural Process Lett. 2020;52(3):2607–2636.
  • 38. Zhao L., Xu W., Gao S., Guo J. Cross-sentence N-ary relation classification using LSTMs on graph and sequence structures. Knowl Base Syst. 2020;207:106266.
  • 39. Peng L., Zhu Q., Lv S.X., Wang L. Effective long short-term memory with fruit fly optimization algorithm for time series forecasting. Soft Comput. 2020;24(19):15059–15079.
  • 40. Wang L., Peng L., Wang S.R., Liu S. Advanced backtracking search optimization algorithm for a new joint replenishment problem under trade credits with grouping constraint. Appl Soft Comput. 2020;86


Articles from Energy (Oxford, England) are provided here courtesy of Elsevier
