Abstract
The volatility index is the implied volatility calculated inversely from option prices. This study investigates whether the official Chinese volatility index, iVX, can represent investor sentiment. To describe investor sentiment comprehensively, we build a three-dimensional investor sentiment measurement system composed of macro, meso and micro levels, and decompose iVX by the EEMD method into three components: short-term fluctuations, medium-term fluctuations and a long-term trend. The relationships between iVX, its components and the sentiment indexes at each level are analyzed separately. The empirical results reveal that every component of iVX reflects investor sentiment at the corresponding level, although the extent to which it does so differs across levels. We further introduce mixed-frequency dynamic factor analysis to extract a common sentiment factor, which is more strongly correlated with contemporaneous iVX than the sentiment indexes at each level are. The ADL model in the robustness check supports these results. Our findings confirm that iVX can represent the common sentiment and expectations of Chinese investors on different time scales.
Keywords: iVX, Investor sentiment, EEMD, Mixed-frequency dynamic factor analysis, Correlation
Highlights
• A three-dimensional investor sentiment measurement system composed of macro, meso and micro level is built.
• iVX is decomposed into three components to obtain short-term, medium-term fluctuations and long-term trend.
• Mixed-frequency dynamic factor analysis is introduced to extract the common sentiment factor.
• The relationships between iVX, its components and sentiment indexes at each level have been analyzed separately.
1. Introduction
Ever since stocks came into being, their risk characteristics have been a focus of investors. Risk refers to the uncertainty of return, which is generally measured by volatility. Depending on the definition and method, volatility can be categorized into historical volatility, predicted volatility, and implied volatility. Historical volatility is calculated as the average deviation from the average price of a financial instrument over a given time period. Predicted volatility is estimated from asset conditions, the economic situation, historical experience, etc.; it is entered into the option pricing formula to obtain the option price. Implied volatility is derived inversely by putting the market price of the option and all other known parameters into the Black-Scholes option pricing model and solving for volatility.
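The inverse calculation behind implied volatility can be illustrated with a minimal sketch. It assumes a European call on a non-dividend-paying asset priced by the Black-Scholes formula and uses bisection as the root finder; the function names and the numerical inputs are illustrative assumptions and are not taken from the iVX methodology, which is model-free.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(spot, strike, rate, maturity, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def implied_vol(price, spot, strike, rate, maturity, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve bs_call_price(sigma) = price for sigma by bisection.

    The call price is strictly increasing in sigma, so the root is unique
    whenever the quoted price lies between the prices at lo and hi.
    """
    p_lo = bs_call_price(spot, strike, rate, maturity, lo)
    p_hi = bs_call_price(spot, strike, rate, maturity, hi)
    if not p_lo <= price <= p_hi:
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, maturity, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with illustrative (hypothetical) inputs: price an option at a
# known volatility, then recover that volatility from the price alone.
true_sigma = 0.25
quote = bs_call_price(spot=2.5, strike=2.5, rate=0.03, maturity=0.25, sigma=true_sigma)
recovered = implied_vol(quote, spot=2.5, strike=2.5, rate=0.03, maturity=0.25)
```

Bisection is chosen here only for transparency; any bracketing root finder would serve the same purpose.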
Investor sentiment refers to a belief formed by the expectation of future cash flows and investment risks of assets, which yet does not fully reflect the existing facts (Wurgler & Baker, 2006). This kind of belief is not only influenced by the fundamentals of assets and the information transmission of capital market, but also strongly related to the education, personal experience and personal preference of investors. Therefore, for the same asset, different investors will hold different beliefs, namely sentiment. Herd effect and other phenomena show that sentiment is contagious, which will lead to consistent actions among people. Due to this irrational behavior, if limited arbitrage exists, asset prices will have systematic bias.
In 1993, the Chicago Board Options Exchange (CBOE) launched the world's first volatility index, VIX, which is the implied volatility calculated using the S&P 500 index. In 2016, the Shanghai Stock Exchange of China announced the first official Chinese volatility index, iVX, based on the Shanghai 50ETF. The volatility index, which measures investors' expectations of future market volatility, has become one of the main bases for assessing market risk. Before the subprime mortgage crisis in 2008, the European debt crisis in 2011, and the plunge of the financial market caused by the novel coronavirus epidemic in 2020, VIX rose sharply and hit a historical high each time, suggesting that VIX has good risk-prediction capability. The fact that VIX spikes during periods of market turmoil is why it has become known as the “investor fear gauge” (Whaley, 2000).
The research on VIX has always been one of the important issues in finance and can be divided into three main strands. (1) The prediction of realized volatility. On the one hand, scholars disagree on whether the volatility index has predictive ability. Some find that the correlation between implied volatility and realized market volatility is very weak (Canina & Figlewski, 1993). Chow et al. (2018) show that when market returns are expected to be positively biased, VIX usually overestimates market volatility and otherwise underestimates it, and that the more negatively biased the expectations are, the more seriously VIX underestimates volatility. On the other hand, there is no agreement on the relative predictive power of implied volatility, historical volatility, and time series models such as ARMA and GARCH-family models (Day & Lewis, 1992; Martens & Zein, 2004; Shaikh & Padhi, 2013). Some studies argue that combining the volatility forecasts of different models predicts market volatility more accurately (Fleming, 1998; Kambouroudis, McMillan, & Tsakou, 2016). (2) The relationship between the volatility index and the rate of return. Owing to its risk-indicating characteristic, the volatility index moves closely with stock returns. Studies find that the volatility index is negatively correlated with the underlying index (Giot, 2002), that its negative correlation with returns is asymmetric, the so-called “leverage effect” (Chandra & Thenmozhi, 2015; Peng & Ng, 2012), and that it affects the relation between idiosyncratic volatility and stock returns (Qadan, Kliger, & Chen, 2019). VIX has spillover effects on other stock markets and is negatively correlated with their returns (Chen, Jiang, Liu, et al., 2017; Delisle, Doran, & Peterson, 2011; Sarwar, 2012; Wu, Pan, & Tai, 2015). It can also predict other stock markets (Tissaoui & Azibi, 2019). (3) Pricing of derivatives. Applied research on the volatility index and related derivatives has been advancing. Dupoyet, Daigler, and Chen (2011) argue that the CEV model has certain advantages in VIX futures pricing. Kanniainen, Lin, and Yang (2014) price S&P 500 index options based on the GARCH model and the VIX index. Basher et al. (2016) compare the effects of hedging market risks with indicators such as the VIX index and oil prices. Some researchers also study the volatility indexes of other stock markets (Badshah, Bekiros, Lucey, & Uddin, 2018; Siriopoulos & Fassas, 2012).
Because the Chinese volatility index iVX was released late, there are few studies on it. The existing research focuses on three aspects: the predictive ability of iVX (Qiao, Teng, Li, et al., 2019), the relationship between iVX and returns, including the “leverage effect” (Li, Yu, Luo, et al., 2019; Yue, Ruan, Gehricke, & Zhang, 2019), and the inclusion of the volatility index as an indicator in the construction of sentiment indexes (Xu & Zhou, 2018).
To measure the expectations of investors, scholars have put forward the concept of the investor sentiment index. Previous methods for obtaining investor sentiment indicators fall into three categories. (1) Investor sentiment indexes compiled from questionnaires and interviews. American studies mainly use the University of Michigan consumer survey sentiment index, while Chinese indexes mainly include the “CCTV Watch Index”, the good and bad index by Stock Market Trend Analysis Weekly, the “Long and Short Polls” by Huading, the consumer confidence index, etc. (Lemmon & Portniaguina, 2006; Schmeling, 2009). Although the survey method aims to quantify investor sentiment directly, it is costly to operate, the resulting sentiment index has a relatively low data frequency, and its time span is relatively short. (2) Indirect indicators obtained from the historical data of the stock market. This method synthesizes the investor sentiment index from historical data on several proxies that can be observed in the stock market. At present, the academic community mainly adopts the method proposed by Baker and Wurgler (2007), who use principal component analysis to extract a sentiment index from six stock market variables: the closed-end fund discount, NYSE share turnover, the number of and average first-day returns on IPOs, the equity share in new issues, and the dividend premium. Although using indirect stock market indicators saves more time and labor than surveys and interviews, it is constrained by choices such as index selection and synthesis method. Furthermore, market variables used as proxies may reflect not only investor sentiment but also the equilibrium outcome of the interaction between sentiment and other economic factors (Da, Engelberg, & Gao, 2014; Qiu & Welch, 2004). (3) Analysis of information on the Internet. Many researchers start from Internet information and apply text analysis to the content that investors post on social platforms or search for, in order to infer investor sentiment. This method avoids the shortcomings of the first two but is susceptible to noisy online information. The text data mainly come from Yahoo Finance posts (Antweiler & Frank, 2004; Das & Chen, 2007; Kim & Kim, 2014; Tsukioka, Yanagi, & Takada, 2018), WeChat (Shi, Zhu, Zhao, Kang, & Xiong, 2018), the Weibo platform (Checkley, Higón, & Alles, 2017; Renault, 2017), Twitter (Behrendt & Schmidt, 2018; Li, Chan, Ou, & Ruifeng, 2017) and Google search (Da et al., 2014; Gao, Ren, & Zhang, 2019), etc.
The purpose of this paper is to investigate the ability of the Chinese volatility index iVX to represent investor sentiment. iVX is computed from option prices, and these prices reflect investors' expectations of the stock market, so iVX reveals investors' predictions about the future of stocks. Those predictions in turn reflect investors' sentiment. In theory, therefore, iVX can represent investor sentiment. Li et al. (2019) find that iVX is negatively associated with both the price and the return of the Shanghai 50ETF, which indirectly indicates that iVX could reflect investors' panic sentiment, but they do not measure sentiment directly. As for the sentiment representation of the VIX created by CBOE, most studies take it as a known fact and build on it (Pan, 2018; Smales, 2017; Yang, Jhang, & Chang, 2016). However, little literature confirms that VIX has this property. Some studies prove it indirectly by analyzing the negative correlation between changes in VIX and the return of the underlying index (Smales, 2016; Whaley, 2000; Whaley, 2009). The studies that prove it directly only relate the volatility index to sentiment indexes based on news, social media or other single-level sentiment (Pineiro-Chousa, López-Cabarcos, & Pérez-Pico, 2016; Smales, 2014; Zhang, Fuehres, & Gloor, 2011).
To obtain fluctuations on different time scales, this paper uses Ensemble Empirical Mode Decomposition (EEMD) to decompose iVX. EEMD provides a feasible way to deal with non-linear and non-stationary data. It consists of a local and fully data-driven separation of a signal into fast and slow oscillations (Torres, Marcelo, Gaston, et al., 2011). It has been widely used in industry and resources (Lei & Zuo, 2009; Wang, Gao, & Yan, 2014; Wang, Xu, Chau, et al., 2013). Meanwhile, Zhang et al. (2008, 2009) and Yu, Wang, and Lai (2008) have successfully applied this idea to the social sciences and show that the ensemble empirical mode decomposition method is a useful approach for financial time series analysis.
The contributions of this paper include the following three aspects. (1) This study systematically examines whether iVX can represent sentiment and analyzes the question at the macro, meso and micro levels, whereas previous studies rarely discuss this issue, especially for the newly released and short-lived Chinese volatility index iVX. (2) This paper constructs a three-dimensional comprehensive measurement system of investor sentiment composed of three levels: macro-economy, stock market and individual opinions. The three levels are measured at monthly, weekly and daily frequency, respectively, and include both rational and irrational components, so our measurement system can capture more information about investor attitudes. In contrast, most previous studies focus on only one level of sentiment, such as micro-blog sentiment from the micro perspective. (3) By applying dynamic factor analysis to sentiment indexes of mixed frequencies to extract their common factor, we investigate whether iVX can comprehensively represent investor sentiment on different time scales.
The remainder of the article is organized as follows. Section 2 introduces iVX and compiles sentiment indexes at different levels. Section 3 decomposes iVX by EEMD into three components and analyzes how well they represent investor sentiment at different levels. Section 4 applies mixed-frequency dynamic factor analysis to obtain a common sentiment factor and discusses its relationship with iVX. Section 5 presents the robustness test. Finally, the study concludes with a few remarks.
2. Data and variables construction
2.1. iVX and its descriptive analysis
The Chinese volatility index iVX, released by the Shanghai Stock Exchange, can be traced back to February 9, 2015, and was suspended on February 22, 2018 for unknown reasons. Using a model-free methodology similar to that of VIX, iVX is estimated from the bid and ask prices of the options on the Shanghai 50ETF. This paper collects all the data from the iVX release period, which gives 740 effective observations from the WIND database.
Figs. 1 and 2 depict the daily time series of iVX and the Shanghai 50ETF, and of the changes ΔiVX and Δ50ETF, over the sample period, respectively. The graphs clearly show that when the Chinese stock market fell sharply in mid-2015, iVX rose to a historical peak of 63.79, which suggests that iVX is negatively related to the underlying asset to a certain extent.
Fig. 1.
iVX and Shanghai 50ETF. Notes: Time series plot of the iVX and Shanghai 50ETF from February 9, 2015 to February 22, 2018. The left vertical axis depicts iVX and the right depicts 50ETF.
Fig. 2.
ΔiVX and Δ50ETF. Notes: Time series plot of the ΔiVX and Δ50ETF from February 9, 2015 to February 22, 2018. The left vertical axis depicts ΔiVX and the right depicts Δ50ETF.
Table 1 reports the summary statistics for iVX and ΔiVX. The mean of iVX is 23.879 and the mean of ΔiVX is approximately zero. Both series are positively skewed. ΔiVX exhibits excess kurtosis (25.766), whereas the kurtosis of iVX (3.087) is close to the Gaussian value of 3. Positive skewness combined with heavy tails suggests that large positive changes in iVX occur more frequently than large negative ones. The first-order autocorrelations show that iVX is highly persistent, while ΔiVX is not. The Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests reject the null hypothesis of a unit root for ΔiVX but fail to reject it for iVX.
Table 1.
Descriptive statistics.
| | Mean | Median | Maximum | Minimum | Std. Dev. | Skewness | Kurtosis | ρ1 | ADF | PP |
|---|---|---|---|---|---|---|---|---|---|---|
| iVX | 23.879 | 19.335 | 63.790 | 8.310 | 12.186 | 0.919 | 3.087 | 0.990*** | −2.917 | −2.998 |
| ΔiVX | −0.002 | −0.070 | 15.450 | −11.510 | 1.749 | 2.371 | 25.766 | 0.024 | −26.531*** | −26.506*** |
Notes: The first-order autocorrelation coefficient ρ1 and the Augmented Dickey-Fuller (ADF) and PP test statistics (intercept included) are reported. ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
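The moment-based statistics in Table 1 can be computed along the following lines. This is a minimal sketch on a toy series, not the iVX data; it assumes population (divide-by-n) moments and raw kurtosis, for which the Gaussian benchmark is 3, and the software used for Table 1 may apply different conventions.

```python
import math

def moments(x):
    """Return mean, standard deviation, skewness and raw kurtosis.

    Population moments (divide by n) are used; kurtosis is not
    excess kurtosis, so a Gaussian sample gives roughly 3.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2

def lag1_autocorr(x):
    """First-order sample autocorrelation coefficient."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Toy level series (hypothetical numbers) and its first difference,
# mirroring the iVX / ΔiVX pair in Table 1.
levels = [10.0, 11.0, 12.0, 20.0, 14.0, 13.0, 12.5, 12.0]
changes = [b - a for a, b in zip(levels, levels[1:])]

level_stats = moments(levels)
change_stats = moments(changes)
rho1_levels = lag1_autocorr(levels)
```

The unit-root tests are not reproduced here because they require critical values that are best taken from an econometrics library.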
2.2. Measurement of investor sentiment
Investor sentiment reflects the expectation of investors for the future economy and stock market. This paper builds a three-dimensional measurement system of investor sentiment, which depicts sentiment from three levels: macro, meso and micro. Among them, macro level quantifies investor sentiment towards the whole economy; meso level measures investor sentiment towards the stock market; and micro level shows investor sentiment towards each individual stock. The three levels of sentiment index are monthly, weekly and daily data separately.
2.2.1. Macro investor sentiment
To reflect the latest state of economic development in a timely manner, and taking into account both the representativeness of China's macroeconomic cycle variables and the availability of data, this paper forms a composite macro-level sentiment index from six variables: the Purchasing Manager Index (PMI) and Producer Price Index (PPI) on the production side; the Consumer Satisfaction Index (SI), Consumer Price Index (CPI) and total retail sales of consumer goods (RSCD) on the consumer side; and the consistent index of the business climate index (CBCI), which indicates economic boom and bust and is issued by the China Economic Prosperity Monitoring Center. These data are monthly and come from the WIND database.
Fig. 3 presents the fluctuations of the above six variables from February 2015 to February 2018. The three variables CBCI, PPI, and SI show a strong co-movement. Each first moves downwards, reaches its lowest level in November 2015, December 2015 and April 2016, respectively, and then rises. CBCI is ahead of the other two, implying that it acts as a leading indicator. CPI reaches its lowest level in January 2016, later than PPI, in line with the economic transmission mechanism whereby volatility in industrial production has a lagged effect on consumption. Table 2 displays the correlations between the six proxy variables. CPI is negatively correlated with all other variables, and RSCD is negatively correlated with all others except CBCI. The remaining variables are significantly positively correlated with one another.
Fig. 3.
Macro-sentiment proxy variables. Notes: Time series plot of the macro-sentiment proxy variables from February 2015 to February 2018. The left vertical axis depicts CBCI, PPI, CPI and SI, while the right depicts RSCD and PMI.
Table 2.
Macro coefficient matrix.
| | CBCI | PMI | PPI | SI | CPI | RSCD |
|---|---|---|---|---|---|---|
| CBCI | 1.000 | 0.762 | 0.866 | 0.611 | −0.808 | 0.155 |
| PMI | 0.762 | 1.000 | 0.892 | 0.671 | −0.467 | −0.164 |
| PPI | 0.866 | 0.892 | 1.000 | 0.796 | −0.543 | −0.186 |
| SI | 0.611 | 0.671 | 0.796 | 1.000 | −0.174 | −0.075 |
| CPI | −0.808 | −0.467 | −0.543 | −0.174 | 1.000 | −0.265 |
| RSCD | 0.155 | −0.164 | −0.186 | −0.075 | −0.265 | 1.000 |
This paper applies the principal component analysis on the proxy variables to extract the sentimental principal components.
Further, some variables take longer than others to reveal the same sentiment (Wurgler & Baker, 2006; Baker & Wurgler, 2007), so if the variables are combined contemporaneously, the composite index will not effectively reflect movements in market sentiment. Following Baker and Wurgler (2007), this paper obtains the final sentiment index in two steps. First, we estimate the first principal component of the six proxies and their one-period lags, which gives a first-stage index MacSEN. Then we compute the correlation between MacSEN and each of the 12 variables, pick the six variables most strongly correlated with MacSEN, and form the final sentiment index MacSI by a second-stage principal component analysis.
Following the above steps, six variables were finally selected to extract the principal component, namely $\mathrm{CBCI}_{t-1}$, $\mathrm{PMI}_{t-1}$, $\mathrm{PPI}_{t}$, $\mathrm{SI}_{t}$, $\mathrm{CPI}_{t-1}$ and $\mathrm{RSCD}_{t-1}$. The second-stage PCA results show that the first principal component explains 76.512% of the variance, indicating that it retains the original information well. The correlation between the first-stage index MacSEN and the final MacSI is 0.997, suggesting that little information is lost by dropping the six terms with the other time subscripts. Therefore, this paper adopts MacSI as the macro-level sentiment index. Eq. (1) shows the computation of MacSI. Fig. 4 presents the fluctuations of MacSI, which tends to increase over the whole period.
| (1) |
Fig. 4.
MacSI. Notes: Time series plot of MacSI from February 2015 to February 2018.
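The two-stage principal component procedure of this subsection can be sketched as follows. The sketch runs on synthetic data and makes one explicit assumption: for each proxy, the contemporaneous or the lagged series is kept, whichever is more strongly correlated with the first-stage index, as in Baker and Wurgler. The function names and the toy data are hypothetical.

```python
import numpy as np

def first_pc(data):
    """First principal component scores of standardized columns,
    plus the share of variance the component explains."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    # eigh returns eigenvalues in ascending order: the last one is the largest.
    return z @ eigvecs[:, -1], eigvals[-1] / eigvals.sum()

def two_stage_index(proxies):
    """proxies: (T, k) array. Returns the first-stage index, the final
    index, a 0/1 flag per proxy (1 = lag chosen) and the variance share."""
    current, lagged = proxies[1:], proxies[:-1]      # x_t aligned with x_{t-1}
    stage1, _ = first_pc(np.hstack([current, lagged]))
    k = proxies.shape[1]
    use_lag = np.zeros(k, dtype=int)
    chosen = np.empty_like(current)
    for j in range(k):
        corr_cur = abs(np.corrcoef(stage1, current[:, j])[0, 1])
        corr_lag = abs(np.corrcoef(stage1, lagged[:, j])[0, 1])
        use_lag[j] = int(corr_lag > corr_cur)
        chosen[:, j] = lagged[:, j] if use_lag[j] else current[:, j]
    final, share = first_pc(chosen)
    # The sign of a principal component is arbitrary; align it with stage 1.
    if np.corrcoef(stage1, final)[0, 1] < 0:
        final = -final
    return stage1, final, use_lag, share

# Synthetic proxies driven by one smooth common factor (illustration only).
rng = np.random.default_rng(0)
common = 2.0 * np.sin(np.linspace(0.0, 3.0 * np.pi, 40))
loadings = np.array([1.0, 0.8, -0.6, 0.9])
proxies = common[:, None] * loadings + 0.2 * rng.standard_normal((40, 4))
stage1, final, use_lag, share = two_stage_index(proxies)
```

A high correlation between the first-stage and final indexes, like the 0.997 reported above, indicates that dropping the unselected terms loses little information.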
2.2.2. Meso investor sentiment
The investor sentiment index at the meso level is mainly constructed with the method of Baker and Wurgler (Wurgler & Baker, 2006; Baker & Wurgler, 2007). The six proxy variables they select are the closed-end fund discount, NYSE share turnover, the number of and average first-day returns on IPOs, the equity share in new issues, and the dividend premium. However, compared with the US stock market, the Chinese stock market has its own particularities, mainly in three respects: (1) Almost every IPO hits the daily limit on its first day of listing, which means the average first-day return is almost 44%. (2) From July 10, 2015 to November 27, 2015, the China Securities Regulatory Commission (CSRC) suspended IPO issuance to rescue the market, so the number of IPOs was zero. (3) Few companies in the Chinese stock market pay dividends, and most of investors' income comes from capital gains. Therefore, some of the proxies used by Baker and Wurgler are not suitable for the Chinese market. Following Xu and Zhou (2018), this study selects the closed-end fund discount rate (FDDR), turnover rate (TO), price-earnings ratio (PE) and number of new investors (NIA) as sentiment proxies, all of which are weekly variables. The data come from the WIND database.
Among these variables, the price of a closed-end fund should in theory equal the net asset value per unit of its stock portfolio. In reality, however, closed-end funds often trade at a discount (Lee & Thaler, 1991). It is generally believed that the higher the absolute value of the discount rate, the less optimistic investors are about the market outlook. Since FDDR is negative under normal conditions, it is positively correlated with investor sentiment. TO signifies market liquidity. Under short-selling restrictions and with the participation of irrational investors, high liquidity is often accompanied by overvaluation, so TO is positively correlated with investor sentiment. In addition, PE and NIA are also positively associated with sentiment.
To control for the impact of fund size, we exclude small-scale closed-end funds and innovative closed-end funds with a size below 2 billion yuan, which leaves 18 closed-end funds in total. For NIA, the earliest data released by China Securities Depository and Clearing Co., Ltd. date from May 4, 2015. Earlier values are estimated from the data on China's stock accounts published by Hongli Database (http://www.homilychart.com).
Fig. 5 shows the movement of the four variables. NIA, PE, and TO fluctuate very similarly: each first rises, spikes to its highest level around mid-June 2015, and then descends. FDDR lags the other three, peaking in August 2015 before declining. Table 3 reports a certain degree of association among the four sentiment proxy variables.
Fig. 5.
Meso-sentiment proxy variables. Notes: Time series plot of meso-sentiment proxy variables from February 13, 2015 to February 14, 2018. The start date is Friday of the week with February 9, 2015. The end is earlier than February 22, 2018 due to the Spring Festival in China. The left vertical axis depicts NIA, PE and TO, while the right depicts FDDR.
Table 3.
Meso coefficient matrix.
| | TO | PE | NIA | FDDR |
|---|---|---|---|---|
| TO | 1.000 | 0.303 | 0.695 | 0.256 |
| PE | 0.303 | 1.000 | 0.547 | −0.172 |
| NIA | 0.695 | 0.547 | 1.000 | −0.070 |
| FDDR | 0.256 | −0.172 | −0.070 | 1.000 |
Table 4.
Accuracy of two classifiers.
| | Precision/% | Recall/% | F-measure/% |
|---|---|---|---|
| Noise elimination | 75.567 | 75.200 | 75.383 |
| Sentiment identification | 75.216 | 70.391 | 72.724 |
We use the same method as in Section 2.2.1 to select the proxies or their lags, and then perform principal component analysis to calculate the meso-level sentiment index. The selected terms are $\mathrm{TO}_{t-1}$, $\mathrm{PE}_{t}$, $\mathrm{NIA}_{t-1}$, and $\mathrm{FDDR}_{t}$. To account for the influence of macro-economic factors on meso-level sentiment, we explicitly remove business cycle variation from the four proxies prior to the principal component analysis. The six macroeconomic variables of Section 2.2.1 are used as proxies for economic fundamentals. Each raw variable is regressed on the six macroeconomic variables (standardized before regression), and the principal component analysis is applied to the resulting residual series. The first and second principal components explain 51.833% and 29.518% of the sample variance, respectively, so the two factors together, with a cumulative share of 81.351%, capture much of the common variation. Therefore, we take the weighted average of the first and second principal components as the meso sentiment index, free of macro influence. The resulting sentiment index MeSI is given in Eq. (2), and its variation over the sample period is shown in Fig. 6. The index falls sharply in mid-2015 and is less volatile thereafter.
| (2) |
Fig. 6.
MeSI. Notes: Time series plot of MeSI from February 13, 2015 to February 14, 2018.
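The orthogonalization step of this subsection can be sketched as follows: each proxy is regressed on the standardized macro variables, and the residuals are combined through the first two principal components. The sketch rests on two assumptions that the text does not spell out: the macro variables are already aligned to the weekly grid, and the two components are weighted by their shares of explained variance. All names and data are hypothetical.

```python
import numpy as np

def residualize(proxies, macro):
    """OLS residuals of each proxy column on an intercept and the
    standardized macro variables (macro assumed aligned to the same dates)."""
    z = (macro - macro.mean(axis=0)) / macro.std(axis=0)
    design = np.column_stack([np.ones(len(z)), z])
    beta, *_ = np.linalg.lstsq(design, proxies, rcond=None)
    return proxies - design @ beta

def weighted_two_pc_index(resid):
    """Weighted average of the first two principal components, with
    weights proportional to explained variance (an assumption).
    Component signs are arbitrary and may need aligning in practice."""
    z = (resid - resid.mean(axis=0)) / resid.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    top = np.argsort(eigvals)[::-1][:2]          # two largest eigenvalues
    share = eigvals[top] / eigvals.sum()
    scores = z @ eigvecs[:, top]
    return scores @ (share / share.sum()), share

# Synthetic example: four proxies partly driven by three macro variables.
rng = np.random.default_rng(1)
macro = rng.standard_normal((60, 3))
proxies = macro @ rng.standard_normal((3, 4)) + rng.standard_normal((60, 4))
resid = residualize(proxies, macro)
index, share = weighted_two_pc_index(resid)
```

By construction the residuals are uncorrelated with the macro variables in sample, which is what removing business cycle variation means in this setting.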
2.2.3. Micro investor sentiment
Based on investors' online opinion posts on the Internet social media, the investor sentiment at micro level is extracted and analyzed from massive text data. In terms of data acquisition, we use web crawler tools to grab the information on the East money stock forum, which has a considerable number of visits and influence among the stock forums in China. According to ranking of visitor volume by Alexa's website and Baidu weight rankings, East money stock forum currently ranks first in the major stock forums. Besides, as one of the earliest online stock forums in China, the data of East money stock forum can be traced for a long time, and retain multi-dimensional information such as the post, post time, title, page views, and number of replies. Hence, it can be interpreted that opinion posts on East money stock forum reflect sentiment of most individual investors towards the stock market directly.
We crawl the comments on all constituent stocks of the CSI 800 index posted on the East money stock forum from February 9, 2015 to February 22, 2018, about 17 million records in total. The CSI 800 comprises the constituent stocks of the CSI 500 and CSI 300, which together reflect the overall status of large-, mid- and small-capitalization companies in the Shanghai and Shenzhen securities markets.
There are usually two types of sentiment classification, based on sentiment dictionary or classifier, respectively (Liu & Zhang, 2012). It is hardly possible to meet the professional needs in the field of finance by using a general sentiment dictionary in natural languages, and empirical evidence has shown that the classification results obtained by the latter are better than the former (Wu et al., 2015). Therefore, on the grounds of previous research, we adopt machine learning algorithm and choose SVM classifier to identify investor sentiment and calculate the micro sentiment index.
Considering that online posts contain a great deal of noise, we use two-step classifier for sentiment analysis (Shi et al., 2018). The first step is to get rid of noise by separating the text into noise and non-noise, and the second step is to divide non-noise text into bullish and bearish ones. We use several technologies including data cleaning, text representation, feature extraction and classification, to compute investor sentiment index with individual sentiments, which will be applied into following study.
(1) Data cleaning.
The posts on the online stock forum contain a great deal of punctuation and noise and cannot be used directly for sentiment analysis. Therefore, in the data cleaning process, we first remove the punctuation and gibberish, and then perform word segmentation.
(2) Text representation.
In order to enable the computer to read text, words must be changed into digital data for computer processing. In this paper, Word2vec, which is commonly used in academia, is applied to represent words. It can calculate word vectors according to the context of words, fully capture the semantic information of the context, and has good performance in text classification (Lilleberg, Zhu, & Zhang, 2015; Wolf, Hanani, Bar, et al., 2014). Then we use TF-IDF (term frequency-inverse document frequency) algorithm to compute the weight of words in short text, and provide weighted word2vec text vector.
Word2vec is a tool based on deep learning and released by Google in 2013 (Zhang, Xu, Su, et al., 2015). It includes two neural network architectures: the Continuous Bag-of-Words Model (CBOW) and Skip–Gram. The former predicts the current word from its context, while the latter predicts the surrounding words given the current word (Mikolov, Chen, Corrado, & Dean, 2013). Compared with the CBOW model, Skip–Gram has higher semantic accuracy at the cost of higher computational complexity and longer training time. In this paper, the Skip–Gram model is used to predict the context words from the current word; its mathematical representation (3) is as follows:
| (3) |
where the input $W_t$ is a word in the corpus.
The main idea of TF-IDF is that if a word or phrase appears frequently in one article and rarely in other articles, it distinguishes well between different texts. Formula (4) shows that TF-IDF is the product of the term frequency (TF) and the inverse document frequency (IDF).
$$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \times \log\frac{|D|}{|\{j : t_i \in d_j\}|} \tag{4}$$
where $n_{i,j}$ is the number of times that the word $t_i$ appears in the file $d_j$, the denominator $\sum_{k} n_{k,j}$ is the total number of occurrences of all words in the file $d_j$, $|D|$ is the total number of files in the corpus, and $|\{j : t_i \in d_j\}|$ is the number of files containing $t_i$.
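By summing the TF-IDF-weighted word2vec vectors of the words in a document, we get the new vector $R(d_j)$ of document $d_j$:
$$R(d_j) = \sum_{t_i \in d_j} \mathrm{word2vec}(t_i) \times \mathrm{tfidf}_{i,j} \tag{5}$$
where $\mathrm{word2vec}(t_i)$ is the word2vec vector of word $t_i$ and $\mathrm{tfidf}_{i,j}$ is the TF-IDF weight of $t_i$ in $d_j$ from Formula (4).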
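The text representation step can be illustrated with a small sketch that computes TF-IDF weights as the term frequency times the log inverse document frequency and then sums the weighted word vectors of each document. The two-dimensional embeddings below are hand-made stand-ins for trained Word2vec vectors, and the sum is assumed to run over the distinct words of a document; both are assumptions for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document,
    with tf = n_ij / sum_k n_kj and idf = log(|D| / number of docs containing the term)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def doc_vector(term_weights, embeddings, dim):
    """Sum of word vectors, each scaled by the word's TF-IDF weight."""
    vec = [0.0] * dim
    for term, weight in term_weights.items():
        for i, component in enumerate(embeddings[term]):
            vec[i] += weight * component
    return vec

# Toy corpus of segmented posts and hypothetical 2-d word vectors.
docs = [["buy", "buy", "rally"], ["sell", "rally"], ["sell", "drop"]]
embeddings = {
    "buy": [1.0, 0.0], "sell": [-1.0, 0.0],
    "rally": [0.0, 1.0], "drop": [0.0, -1.0],
}
weights = tf_idf(docs)
vectors = [doc_vector(w, embeddings, dim=2) for w in weights]
```

A term that appears in every document receives an IDF of zero and therefore drops out of the document vector, which is the intended down-weighting of uninformative words.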
(3) Classifiers for sentiment identification.
We use the Support Vector Machine (SVM) to identify investor sentiment; it has become an important approach to classification owing to its outstanding performance (Cortes & Vapnik, 1995; Deng, Tian, & Zhang, 2012). In the first step, “noise elimination”, non-noise data are labeled +1 and noise data −1. In the second step, “bullish-bearish” identification, bullish sentiment is labeled +1 and bearish sentiment −1. By classifying the text in this way, we can identify investors' attitudes towards individual stocks. Table 4 reports the classification performance obtained by 10-fold cross validation. It indicates that the precision and recall of both SVM classifiers exceed 70%.
After identifying individual investor sentiment, we can combine the views of all individual investors into a micro investor sentiment index. In accordance with the method in the previous literature (Antweiler & Frank, 2004; Kim & Kim, 2014; Wu et al., 2015), we define $M_t^{BUY}$ as the total number of bullish posts in time interval t, and $M_t^{SELL}$ as the total number of bearish posts in time interval t. The sentiment index is calculated as:
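The three measures in Table 4 are linked as sketched below. The sketch assumes that the F-measure is the balanced F1 score, the harmonic mean of precision and recall; the text does not state this, but the reported values are consistent with it. The label lists are toy data.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall of the positive class for +1 / -1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fp), tp / (tp + fn)

def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Toy check of the counting logic on hypothetical labels.
toy_precision, toy_recall = precision_recall([1, 1, -1, -1], [1, -1, 1, -1])

# Recompute the F-measure column of Table 4 from its precision and recall.
noise_f = f_measure(75.567, 75.200)        # noise elimination
sentiment_f = f_measure(75.216, 70.391)    # sentiment identification
```

Both recomputed values agree with the F-measure column of Table 4 to three decimal places, which supports the F1 reading of that column.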
| (6) |
Eq. (6) is used to compose the daily micro investor sentiment index for the CSI 800, named MicSI, which reflects the balance between investors' bullish and bearish views. The higher the investor sentiment index, the more investors hold positive expectations for the future stock market.
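To make the aggregation step concrete, the sketch below computes two bullishness measures proposed by Antweiler and Frank (2004) from daily counts of bullish and bearish posts. Which of these forms, if either, corresponds exactly to Eq. (6) is an assumption on our part; the function names and the post counts are hypothetical.

```python
import math

def bullishness_ratio(m_buy, m_sell):
    """(M_buy - M_sell) / (M_buy + M_sell); bounded in [-1, 1]."""
    total = m_buy + m_sell
    return (m_buy - m_sell) / total if total else 0.0

def bullishness_log(m_buy, m_sell):
    """ln((1 + M_buy) / (1 + M_sell)); grows with the number of posts."""
    return math.log((1 + m_buy) / (1 + m_sell))

# Hypothetical (bullish, bearish) post counts per day, for illustration only.
daily_counts = {"day 1": (120, 150), "day 2": (200, 100)}
index = {day: bullishness_ratio(b, s) for day, (b, s) in daily_counts.items()}
```

Both measures are negative when bearish posts dominate and positive when bullish posts dominate, which matches the interpretation of the index given above.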
Fig. 7 presents the variation of MicSI throughout the duration of iVX, i.e. from February 9, 2015 to February 22, 2018. MicSI varies within the interval [−0.7, 0.5] and shows relatively clear trends in different periods. Its average value is −0.066, slightly below zero. This suggests that, on average, investor sentiment leans bearish, which is consistent with the view in behavioral finance that investors are subject to irrational biases (Odean, 1998).
Fig. 7.
MicSI. Notes: Time series plot of MicSI from February 9, 2015 to February 22, 2018.
3. Sentiment representation of iVX
The U.S. volatility index VIX is able to reflect investor sentiment, and is thus known as "the investor fear gauge". To study whether the Chinese volatility index iVX possesses this capability, we proceed as follows. First, we decompose iVX into short-term fluctuations, medium-term fluctuations and a long-term trend, considering that macro-level sentiment is the investor's sentiment towards the market, which lasts for a long time, and that the persistence of meso-level and micro-level sentiment decreases in turn. Then we explore the relationship between each decomposed component of iVX and the corresponding sentiment. For comparison, the relationship between the undecomposed iVX and sentiment is also examined at each level. With this process, we can verify whether iVX is a proper representation of investor sentiment.
3.1. iVX decomposition
Previous studies document short-term volatility clustering (Cont, 2007; Gray, 1996) and long-term mean reversion of volatility (Arav, John, & Yaseen, 2018; Bollerslev & Mikkelsen, 1996), which indicates that the characteristics of the stock market are not consistent across time scales. To obtain the fluctuations on different time scales, we need to decompose iVX. Since the three levels of sentiment differ in duration, if iVX is able to represent sentiment, the short-term (high-frequency) fluctuations, medium-term (low-frequency) fluctuations and long-term trend obtained from the decomposition should be related to the corresponding sentiment at the micro, meso and macro levels, respectively.
We use EEMD to perform the decomposition. Compared with wavelet analysis, it avoids the instability of the result caused by manual selection of the wavelet function. Moreover, compared with the fast Fourier transform, it can analyze the volatility of high-frequency data. Therefore, EEMD is chosen to decompose iVX.
3.1.1. EEMD and fine-to-coarse reconstruction
(1) EEMD model.
Empirical Mode Decomposition (EMD) is an adaptive data analysis method introduced by Huang, Shen, Long, et al. (1998). EMD separates the full signal into a small number of Intrinsic Mode Functions (IMFs), or modes, with frequencies ranging from high to low. However, EMD has an obvious drawback: similar oscillations can appear in different modes, a problem known as "mode mixing". To alleviate it, Wu and Huang (2004) propose a new method, namely the Ensemble Empirical Mode Decomposition (EEMD).
The EEMD algorithm can be summarized as follows:
Step 1. Initialize the number of realizations $M$ and the amplitude of the added white noise. Set $m = 1$.
Step 2. Perform the $m$th EMD decomposition.
1) Add white noise $n_m(t)$ to the given signal $x(t)$:
$$x_m(t) = x(t) + n_m(t) \tag{7}$$
where $n_m(t)$ is the white noise added in the $m$th realization, and $x_m(t)$ is the resulting noisy signal.
2) Decompose the signal $x_m(t)$ by EMD to obtain a group of IMFs $c_{n,m}$ ($n = 1, 2, \cdots, N$), where $c_{n,m}$ is the $n$th IMF in the $m$th decomposition.
3) If $m < M$, let $m = m + 1$ and go back to 1). Repeat 1) and 2) until $m$ equals $M$.
Step 3. Compute the ensemble mean $y_n$ of each IMF over the $M$ decompositions:
$$y_n = \frac{1}{M}\sum_{m=1}^{M} c_{n,m} \tag{8}$$
Step 4. Take the ensemble means $y_n$ ($n = 1, 2, \cdots, N$) as the final IMFs.
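The ensemble-averaging loop of Steps 1 to 4 can be sketched as below. The EMD sifting itself is not implemented: the decomposition routine is passed in as the argument `decompose`, and `toy_decompose` is only a moving-average stand-in (not real EMD) used to make the example runnable. All names, the absolute noise amplitude and the test signal are assumptions of this sketch.

```python
import math
import random

def eemd(signal, decompose, n_trials=100, noise_std=0.2, seed=0):
    """Average the modes returned by `decompose` over noisy copies of `signal`.

    `decompose(x)` must return a list of N equal-length modes; a real EMD
    routine would be plugged in here.
    """
    rng = random.Random(seed)
    sums = None
    for _ in range(n_trials):                                    # m = 1, ..., M
        noisy = [x + rng.gauss(0.0, noise_std) for x in signal]  # Eq. (7)
        modes = decompose(noisy)                                 # c_{n,m}
        if sums is None:
            sums = [[0.0] * len(signal) for _ in modes]
        for acc, mode in zip(sums, modes):
            for t, value in enumerate(mode):
                acc[t] += value
    return [[v / n_trials for v in acc] for acc in sums]         # Eq. (8)

def toy_decompose(x, window=5):
    """Stand-in for EMD (NOT real sifting): fast part and moving average."""
    half = window // 2
    slow = []
    for t in range(len(x)):
        segment = x[max(0, t - half): t + half + 1]
        slow.append(sum(segment) / len(segment))
    fast = [a - b for a, b in zip(x, slow)]
    return [fast, slow]

# Hypothetical test signal: an oscillation plus a slow upward drift.
signal = [math.sin(0.3 * t) + 0.01 * t for t in range(200)]
modes = eemd(signal, toy_decompose, n_trials=50)
```

Because the added noise averages out across realizations, the sum of the averaged modes stays close to the original signal. A real EMD routine may return a different number of IMFs in different realizations, which this sketch does not handle.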
(2) Fine-to-coarse reconstruction.
In practice, fine-to-coarse reconstruction is used to get short-term, medium-term fluctuations and the long-term trend of the original signal (Zhang et al., 2008). The algorithm is as follows:
Step 1. For each $i$, compute the mean of the partial sum $y_1 + \cdots + y_i$ of the IMFs, excluding the residue.
Step 2. Identify the index $i$ at which this mean first departs significantly from zero.
Step 3. If $i$ is identified as a significant change point, the partial reconstruction with the IMFs from $y_i$ to the last IMF is identified as the medium-term fluctuations, the partial reconstruction with the remaining IMFs as the short-term fluctuations, and the residue as the long-term trend.
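The three steps above can be sketched as follows. The function name, the large-sample critical value of 1.96 and the synthetic IMFs are assumptions of this illustration, and the simple t-statistic used here ignores serial correlation in the partial sums.

```python
import math
import statistics

def fine_to_coarse(imfs, residue, t_crit=1.96):
    """Split IMFs (ordered from high to low frequency) into three components.

    Returns (short_term, medium_term, trend, change), where `change` is the
    0-based index of the first IMF assigned to the medium-term component.
    """
    n = len(imfs[0])
    partial = [0.0] * n
    change = len(imfs)                      # default: no significant departure
    for i, imf in enumerate(imfs):
        partial = [p + v for p, v in zip(partial, imf)]   # y_1 + ... + y_i
        mean = statistics.fmean(partial)
        std_err = statistics.stdev(partial) / math.sqrt(n)
        if std_err > 0 and abs(mean / std_err) > t_crit:  # mean differs from 0
            change = i
            break
    zeros = [0.0] * n
    short = [sum(col) for col in zip(*imfs[:change])] if change > 0 else zeros
    medium = [sum(col) for col in zip(*imfs[change:])] if change < len(imfs) else zeros
    return short, medium, list(residue), change

# Synthetic components, for illustration only.
n = 200
imf1 = [(-1.0) ** t for t in range(n)]                        # mean zero
imf2 = [math.sin(2 * math.pi * t / 20) for t in range(n)]     # mean zero
imf3 = [0.5 + 0.1 * math.sin(2 * math.pi * t / 100) for t in range(n)]
residue = [0.01 * t for t in range(n)]
short, medium, trend, change = fine_to_coarse([imf1, imf2, imf3], residue)
```

In this example the mean of the partial sum first departs from zero when the third component is added, so the first two components form the short-term fluctuations and the third forms the medium-term fluctuations.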
3.1.2. Empirical results
We obtain five IMFs and a residue by decomposing iVX; their variations are depicted in Fig. 8.
Fig. 8.
IMFs of iVX.
Comparing IMFs of different frequencies allows each component of iVX to be observed more clearly, which helps in examining their relationship with investor sentiment. We use fine-to-coarse reconstruction to obtain the short-term fluctuations (high-frequency component), medium-term fluctuations (low-frequency component), and the long-term trend of iVX, denoted as HIVX, LIVX and TIVX, respectively. Fig. 9 shows the averages of the IMFs as a function of the IMF index K.
Fig. 9.
Averages of IMFs.
It can be observed that the average of the fine-to-coarse reconstruction departs significantly from zero at IMF4. We confirm this result with a t-test (at the 0.05 significance level). Therefore, we take the partial reconstruction with IMF1, IMF2 and IMF3 as the short-term fluctuations and the partial reconstruction with IMF4 and IMF5 as the medium-term fluctuations. Short-term fluctuations are characterized by small amplitudes, and each steep rise or fall in the medium-term fluctuations corresponds to a significant event. The residue is retained as the long-term trend, which varies slowly around the long-term mean. The variation of the three components HIVX, LIVX and TIVX is shown in Fig. 10.
Fig. 10.
Three decomposed components of iVX.
3.2. Correlations between sentiments and iVX's components
In this section, we plot scatter diagrams of the sentiment index against iVX or its components at the macro, meso and micro levels. Then we calculate their correlation coefficients, and analyze the sentiment representation of iVX at different levels.
Fig. 11 shows the relationship of sentiment index and its corresponding component of iVX on different levels. It suggests negative correlation is the most obvious at the macro level, followed by the meso level, and the weakest at the micro level.
Fig. 11.
Sentiments, iVX and its components.
Table 5 gives the Pearson, Spearman and Kendall correlation coefficients between each sentiment index and the contemporaneous iVX or its components, as well as their first-, second- and third-order lags. It can be inferred that: (1) MacSI, the macro-level sentiment index, is highly negatively correlated with iVX and TIVX. Compared with iVX, MacSI has a higher correlation with TIVX in both the contemporaneous and the lagged periods, indicating that the long-term trend from the iVX decomposition better reflects macro-level sentiment. When MacSI and TIVX are contemporaneous, all three correlation coefficients reach their maximum, indicating that at the macro level iVX best reflects contemporaneous sentiment. (2) All correlation coefficients at the meso level show that sentiment has a moderate negative correlation with iVX and LIVX. Compared with iVX, MeSI has a higher correlation with LIVX, indicating that the medium-term fluctuations from the iVX decomposition better reflect meso-level sentiment. The Pearson correlation coefficient peaks at lag order 2, whereas the Spearman and Kendall correlation coefficients peak at lag order 1, indicating that the meso-level iVX better represents sentiment lagging 1–2 periods, and that the meso-level iVX is a lagging indicator. (3) MicSI, the micro-level sentiment index, has a weak negative correlation with iVX and HIVX. Comparing the correlations across periods, we find that, except for the Pearson coefficient, the correlation coefficients show that the contemporaneous relationship is the strongest for both iVX and HIVX. This indicates that at the micro level, iVX represents the sentiment of the same period. In addition, the Pearson coefficient shows that iVX is more closely related to MicSI than HIVX is.
The reason may be that micro-level sentiment also embeds the medium-term fluctuations or long-term trend of iVX, whose variation is approximately linear. Since the Pearson coefficient measures linear association, whereas the other two coefficients measure rank association, only the Pearson coefficient shows the higher correlation between iVX and MicSI.
Table 5.
Three correlation coefficients of iVX's components and sentiment indexes.
| | | TIVX(t) | TIVX(t−1) | TIVX(t−2) | TIVX(t−3) | iVX(t) | iVX(t−1) | iVX(t−2) | iVX(t−3) |
|---|---|---|---|---|---|---|---|---|---|
| MacSI(t) | Pearson | −0.87444 | −0.84891 | −0.82547 | −0.80065 | −0.82055 | −0.80913 | −0.79668 | −0.78018 |
| | Spearman | −0.84495 | −0.82214 | −0.80728 | −0.79618 | −0.81650 | −0.81750 | −0.79440 | −0.79679 |
| | Kendall | −0.64565 | −0.60952 | −0.58992 | −0.59002 | −0.60360 | −0.62540 | −0.58655 | −0.60071 |
| | | LIVX(t) | LIVX(t−1) | LIVX(t−2) | LIVX(t−3) | iVX(t) | iVX(t−1) | iVX(t−2) | iVX(t−3) |
| MeSI(t) | Pearson | −0.31527 | −0.40692 | −0.42547 | −0.37008 | −0.22912 | −0.26626 | −0.27281 | −0.26329 |
| | Spearman | −0.35993 | −0.39097 | −0.37475 | −0.30252 | −0.09968 | −0.11994 | −0.11911 | −0.10658 |
| | Kendall | −0.24726 | −0.26984 | −0.26230 | −0.21140 | −0.07307 | −0.08285 | −0.08523 | −0.07503 |
| | | HIVX(t) | HIVX(t−1) | HIVX(t−2) | HIVX(t−3) | iVX(t) | iVX(t−1) | iVX(t−2) | iVX(t−3) |
| MicSI(t) | Pearson | −0.12418 | −0.12909 | −0.08559 | 0.00702 | −0.21884 | −0.21817 | −0.20928 | −0.19179 |
| | Spearman | −0.18585 | −0.14072 | −0.07918 | −0.00688 | −0.17953 | −0.17086 | −0.16399 | −0.15331 |
| | Kendall | −0.13233 | −0.09630 | −0.05366 | −0.00421 | −0.12304 | −0.11701 | −0.11344 | −0.10649 |
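The lagged correlation coefficients of the kind reported in Table 5 can be computed along the following lines. This is a minimal standard-library sketch: the Kendall coefficient is the tau-a variant without tie correction (whether the paper uses tau-a or tau-b is not stated), and the two short series are hypothetical.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Average ranks: tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def lagged(sentiment, vol, lag, corr):
    """Correlation between sentiment at t and the volatility series at t - lag."""
    if lag == 0:
        return corr(sentiment, vol)
    return corr(sentiment[lag:], vol[:-lag])

# Hypothetical series, for illustration only.
sentiment = [0.1, 0.4, 0.2, 0.5, 0.3]
vol = [30.0, 15.0, 25.0, 10.0, 20.0]
table = {lag: lagged(sentiment, vol, lag, kendall) for lag in range(4)}
```

Each lag shortens the overlapping sample by one observation, so coefficients at different lags are computed on slightly different samples.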
In the above discussion, we decompose iVX into three components with different characteristics, and then study the sentiment representation of each component at the corresponding level. We conclude that only the component at the macro level has a strong ability to reflect sentiment. However, weekly and daily data, which correspond to the meso and micro levels, account for the majority of financial market research. According to our study, the correlations between iVX or its components and the sentiment index are low at the meso and micro levels (the maximum absolute values are about 0.4 and 0.2, respectively), which is not enough to prove that iVX can reflect investor sentiment. At the same time, the correlations between the components of iVX and the corresponding sentiment index are almost always stronger than those between iVX and the sentiment index, which suggests that iVX also reflects the sentiments and expectations of investors at different levels to some degree. Therefore, it is necessary to combine the sentiment indexes of the different levels to test how the volatility index iVX represents investor sentiment.
4. Representation of common sentiment factor
The dynamic factor model is mainly used in macroeconomic modeling and analysis, and has received increasing attention in recent years. Its theoretical basis can be traced back to Lucas (1977), who finds that a single macro variable is not enough to reflect all the characteristics of economic fluctuations, and that the evolution of fluctuations is accompanied by the interaction and co-movement of many macro variables. The main advantage of the dynamic factor model is that it can extract dominant factors from a variety of macro data through econometric methods. That is, it assumes that the co-movement of many macro variables can be explained by an unobservable common random factor, which better reflects the variation and characteristics of the economy (Koopman & Harvey, 2003).
In this paper, we think that all levels of sentiment are influenced by not only specific factors but also common sentiment factors. Section 3.2 shows that iVX has different correlations with the sentiment index at three levels, hence it should also have potential relationship with the common sentiment factors. Therefore, the dynamic factor model is employed on the extraction of common sentiment factors. Considering that the frequencies of sentiment indexes at three levels are different, we further combine the mixed-frequency modeling method (Aruoba, Diebold, & Scotti, 2009), and build the mixed-frequency dynamic factor model to extract the common sentiment factor.
4.1. Mixed-frequency dynamic factor model
The frequencies of the sentiment indexes at the macro, meso and micro levels are monthly, weekly and daily, respectively. To construct a high-frequency daily common sentiment factor, let $x_t$ denote the unobservable common sentiment factor at day $t$, which evolves daily with AR($p$) dynamics,
$$x_t = \rho_1 x_{t-1} + \cdots + \rho_p x_{t-p} + e_t \tag{9}$$
where $e_t$ is a white noise innovation with unit variance. As our interest lies in a single common sentiment, we use a single-factor model, i.e. $x_t$ is a scalar.
$y_{it}$ denotes the sentiment index at day $t$, which depends linearly on $x_t$ and also on lags of $y_{it}$; $i$ represents the level of the sentiment index and equals 1, 2 and 3 for the micro, meso and macro levels, respectively.
| (10) |
where the coefficient $\beta_i$ reflects the effect of the common sentiment factor $x_t$ on the sentiment index $y_{it}$; $y_{i,t-jD}$ is the lagged $y_{it}$, $D$ is the number of days in each cycle of the low-frequency sentiment index (e.g. ranging from 28 to 31 for the monthly macro index), and $j$ denotes the lag order. The $u_{it}$ are contemporaneously and serially uncorrelated innovations, and are also uncorrelated with $e_t$.
The state space model derived from Eq. (10) has the following form:
| (11) |
| (12) |
where $y_t = (y_{1t}, y_{2t}, y_{3t})'$ is a 3 × 1 vector of observed sentiment indexes, $\alpha_t$ is an $m \times 1$ vector of common factors, $w_t$ contains constant terms and lags of the observed sentiment indexes, and $\varepsilon_t$ and $\eta_t$ are vectors of measurement and transition shocks containing the $u_{it}$ and $e_t$.
Due to different frequencies, the observed vector y t will have a very large number of missing values. The maximum likelihood estimation is taken to estimate parameters, in which the missing values do not offer any data information. Hence joint probability density function can be written as:
| (13) |
where $y_{3,t}$ is missing for $t \in A$ and $y_{2,t}$ is missing for $t \in B$. Based on Eq. (13), Eq. (11) is transformed into the following form:
| (14) |
where $\gamma_t$ is the 3 × 3 identity matrix when there is no missing value in the observed vector $y_t$, the 2 × 3 matrix obtained by deleting the $i$th row of the identity matrix when the $i$th element of $y_t$ is missing, and a 1 × 3 matrix when two elements of $y_t$ are missing.
Eq. (14) is free of missing data. Hence, for the state space model composed of Eqs. (12) and (14), maximum likelihood estimation based on the Kalman filter is used to obtain consistent estimators, and the Kalman filtered and smoothed estimates of the state variable $\alpha_t$ are calculated at the same time.
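The way the Kalman filter handles missing observations can be illustrated with a deliberately simplified sketch. We assume a scalar AR(1) factor with unit innovation variance and measurement equations of the form $y_{it} = \beta_i x_t + u_{it}$, i.e. without the lagged terms and constants of the full model; all parameter values and observations are hypothetical. Because the $u_{it}$ are mutually uncorrelated, the available observations can be processed one at a time, and skipping a missing entry plays the same role as deleting the corresponding row of the selection matrix.

```python
def kalman_filter(obs, beta, sigma2, rho):
    """Filter a scalar AR(1) factor from several series with gaps.

    obs:    one list per day with one entry per series; None marks a
            series that is not observed on that day.
    beta:   loadings of the series on the factor.
    sigma2: measurement error variances of the series.
    rho:    AR(1) coefficient of the factor, with |rho| < 1.
    Returns a list of (filtered mean, filtered variance) per day.
    """
    x, p = 0.0, 1.0 / (1.0 - rho ** 2)        # stationary prior
    filtered = []
    for y_t in obs:
        x, p = rho * x, rho * rho * p + 1.0   # prediction step
        for y, b, s2 in zip(y_t, beta, sigma2):
            if y is None:
                continue                      # missing value: no update
            f = b * b * p + s2                # innovation variance
            k = p * b / f                     # Kalman gain
            x += k * (y - b * x)
            p *= 1.0 - k * b
        filtered.append((x, p))
    return filtered

# Hypothetical data: daily, weekly-like and monthly-like series.
obs = [
    [0.3, None, None],
    [0.1, 0.5, None],
    [-0.2, None, None],
    [0.4, None, 0.8],
]
states = kalman_filter(obs, beta=[1.0, 0.5, 0.8], sigma2=[1.0, 0.5, 0.5], rho=0.5)
```

On days when more series are observed, the filtered variance of the factor is smaller. Parameter estimation by maximum likelihood and the smoothing step are omitted from this sketch.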
4.2. Relationship of iVX and common sentiment factor
Since the state space model in Eqs. (12) and (14) requires the variables to be stationary, we first apply the ADF test to the three sentiment indexes. Table 6 reports that MacSI is non-stationary, whereas the other two indexes are stationary. To preserve the properties of the data, we take the first difference only of the variables with a trend. Finally, all variables are standardized to remove the influence of scale. Table 6 also gives some descriptive statistics of the three sentiment indexes.
Table 6.
Statistics of sentiment indexes.
| | ADF(p-value) | PP(p-value) | Mean | Std. Deviation | Max | Min |
|---|---|---|---|---|---|---|
| MacSI | −0.242 | −0.198 | 0.000 | 2.143 | 3.477 | −2.824 |
| MeSI | −4.664*** | −3.820*** | 0.000 | 0.703 | 3.141 | −1.920 |
| MicSI | −5.836*** | −18.984*** | −0.068 | 0.152 | 0.428 | −0.638 |
According to the AIC and BIC, we set $p = 1$. Hence the measurement and transition equations become
| (15) |
| (16) |
In the transition equation, the covariance matrices are
| (17) |
On the basis of the optimal estimation of the unobserved state vector, Table 7 gives the estimates of the related parameters. The estimates are all significant, and those of β1, β2, and β3 are positive, which suggests that the common sentiment factor affects the three sentiment indexes at different levels in the same direction. ρ is significantly positive, indicating that the first-order lagged common sentiment factor has a positive effect on the current one. In other words, investor sentiment exhibits inertia, which leads to the conservatism bias described in behavioral finance: investors respond insufficiently to new information, resulting in a short-term momentum effect.
Table 7.
Estimations of mixed-frequency dynamic factor model.
| Parameter | Estimate | t-statistic | Parameter | Estimate | t-statistic |
|---|---|---|---|---|---|
| β1 | 0.409⁎⁎⁎ | 192.181 | γ2 | 0.999⁎⁎⁎ | 1221.900 |
| β2 | 0.017⁎⁎⁎ | 27.900 | γ3 | 0.575⁎⁎⁎ | 536.982 |
| β3 | 0.226⁎⁎⁎ | 155.683 | σ1 | 0.586⁎⁎⁎ | 292.436 |
| ρ | 0.563⁎⁎⁎ | 356.392 | σ2 | 0.064⁎⁎⁎ | 481.327 |
| γ1 | 0.419⁎⁎⁎ | 394.044 | σ3 | 0.922⁎⁎⁎ | 32.688 |
Notes: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
Fig. 12 presents the time series of the common sentiment factor estimated by Kalman smoothing. We calculate the correlation coefficients between the common sentiment factor and iVX, as shown in Table 8. All three correlation coefficients show a significant negative correlation between the common sentiment factor and iVX. Compared with Table 5, the absolute values are larger than the correlation coefficients between the meso-level and micro-level sentiment indexes and the corresponding iVX components, and also significantly larger than the correlation coefficients between these two sentiment indexes and iVX. This suggests that, compared with the sentiment index of each level, the common sentiment factor we extract is more closely related to iVX. Therefore, we believe that iVX represents the component common to all levels of sentiment more than their level-specific differences. Besides, compared with lags ranging from 1 to 3, iVX has the strongest correlation with the contemporaneous common sentiment factor, indicating that iVX can reflect the current common sentiment.
Fig. 12.
Common sentiment factor.
Table 8.
Three correlation coefficients of iVX and common sentiment factor.
| iVXt | iVXt-1 | iVXt-2 | iVXt-3 | |
|---|---|---|---|---|
| Pearson | −0.587 | −0.579 | −0.571 | −0.568 |
| Spearman | −0.513 | −0.507 | −0.501 | −0.497 |
| Kendall | −0.370 | −0.363 | −0.358 | −0.354 |
5. Robustness testing
In the previous section, we use three correlation coefficients to discuss the sentiment representation of iVX. In this section, we will further investigate the relationship of iVX and sentiment based on econometric model.
According to the formula of iVX, it is affected by the risk-free rate, option prices and strike prices. Here we take the Shibor 1-year rate as the risk-free rate and the average strike price of all near-term and next-term options as the strike price. Option prices are influenced by factors other than investor sentiment. Therefore, the explanatory variables include lagged iVX or its component.
The Autoregressive Distributed Lag (ADL) model and the Distributed Lag (DL) model, which are widely used in econometrics, allow for the impact of both current and lagged sentiment. The difference between the two is that the ADL model also includes lagged values of the dependent variable as regressors. Thus, we believe ADL is suitable for our research. Further, we adopt a finite autoregressive distributed lag model for the sake of parsimony, since an infinite-lag model may not give accurate results if the sample is not large enough. The ADL model can therefore be written as:
| (18) |
| (19) |
where r stands for Shibor 1-year rate, exp indicates average strike price, ivx_compent refers to TIVX, LIVX and HIVX, sen is MacSI, MeSI and MicSI, and u t (∙) is the error term. a 1, a 2, b 1, b 2, k 1 and k 2 are the lag lengths chosen based on AIC and SC. α i (∙), β i (∙) and γ i (∙) can be transformed as Almon polynomial:
| (20) |
The polynomial order h is decided by maximum of residual sum of squares divided by degrees of freedom, recommended by Mitchell and Speaker (1986).
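The Almon transformation can be illustrated with a short sketch. We assume the standard Almon parameterization, in which the coefficient on lag $i$ is a polynomial $\sum_{k=0}^{h} a_k i^k$; the exact form of Eq. (20) used in the paper may differ. Under this assumption, the distributed lag term can be rewritten as a regression on $h + 1$ transformed regressors. The function names and the toy series are hypothetical.

```python
def almon_regressors(x, n_lags, order):
    """Build z[k][t] = sum_{i=0..n_lags} i**k * x[t - i] for t >= n_lags."""
    z = []
    for k in range(order + 1):
        z.append([
            sum((i ** k) * x[t - i] for i in range(n_lags + 1))
            for t in range(n_lags, len(x))
        ])
    return z

def almon_lag_weights(a, n_lags):
    """Recover the lag coefficients sum_k a[k] * i**k for i = 0..n_lags."""
    return [sum(a_k * i ** k for k, a_k in enumerate(a)) for i in range(n_lags + 1)]

# Hypothetical regressor series and polynomial coefficients.
x = [1, 2, 3, 4, 5]
z = almon_regressors(x, n_lags=2, order=1)
lag_weights = almon_lag_weights([1.0, -0.5], n_lags=2)
```

Regressing the dependent variable on the columns of `z` estimates the polynomial coefficients, from which the individual lag coefficients are recovered. With two lags and a first-order polynomial, three lag coefficients are described by only two parameters.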
At the macro level, all monthly variables, including iVX (MIVX), average strike price (MSP), Shibor 1-year rate (MSR) and MacSI, are found to be non-stationary, whereas their first differences (denoted as DMIVX, DMSP, DMSR and DMacSI, respectively) are stationary. The results are shown in Table 9.
Table 9.
Stationary tests of macro variables.
| Variables | ADF | PP | variables | ADF | PP |
|---|---|---|---|---|---|
| MIVX | −0.527 | −1.233 | DMSR | −2.849⁎⁎ | −2.703⁎ |
| DMIVX | −4.866⁎⁎⁎ | −4.745⁎⁎⁎ | MacSI | −0.242 | −0.198 |
| MSP | −1.284 | −0.985 | DMacSI | −6.026⁎⁎⁎ | −5.380⁎⁎⁎ |
| DMSP | −3.885⁎⁎⁎ | −3.713⁎⁎⁎ | MTIVX | −4.227⁎⁎⁎ | −0.944 |
| MSR | −1.926 | −1.347 | DMTIVX | −2.011⁎⁎ | −1.728⁎ |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
The ADF and PP tests disagree on whether MTIVX is stationary. The reason is that our sample of monthly observations is rather small. To test the variable further, we use the KPSS test, which shows that MTIVX is non-stationary at the 1% level. Thus, we treat MTIVX as non-stationary.
The Johansen test indicates that there is a cointegration relationship among the four variables MIVX, MSP, MSR and MacSI (Table 10, Panel A), and also among MTIVX, MSP, MSR and MacSI (Table 10, Panel B).
Table 10.
Johansen tests of four macro variables.
| Hypothesized: number of cointegration(s) | Eigenvalue | Trace statistic | Probability |
|---|---|---|---|
| Panel A | | | |
| None | 0.593 | 75.659⁎⁎⁎ | 0.000 |
| At most 1 | 0.474 | 44.204⁎⁎ | 0.004 |
| At most 2 | 0.296 | 21.751⁎⁎ | 0.031 |
| At most 3 | 0.237 | 9.454⁎⁎ | 0.044 |
| Panel B | | | |
| None | 0.647 | 62.509⁎⁎⁎ | 0.001 |
| At most 1 | 0.285 | 28.151⁎ | 0.077 |
| At most 2 | 0.245 | 17.096⁎⁎ | 0.029 |
| At most 3 | 0.211 | 7.805⁎⁎⁎ | 0.005 |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
Thus we use the original values (levels) of the variables at the macro level. The regression models are defined as follows:
| (21) |
| (22) |
The estimates are presented in Table 11. For Eq. (21), current sentiment has the strongest significantly negative impact on MTIVX, compared with other periods. For Eq. (22), the results suggest that current sentiment influences iVX most strongly. These inferences indicate that iVX and its long-term trend component reflect investors' sentiment contemporaneously.
Table 11.
Estimations of ADL at macro level.
| Panel A: Eq. (21) | |||||
|---|---|---|---|---|---|
| Parameter | Coefficient | t-statistic | Parameter | Coefficient | t-statistic |
| ω1 | −7.894⁎⁎⁎ | −5.173 | ρ21 | −0.074 | −0.583 |
| α01 | 0.167 | 0.420 | ρ31 | −0.126 | −1.473 |
| β01 | 4.769⁎⁎⁎ | 5.368 | ρ41 | −0.304 | −1.664 |
| ρ01 | −0.354⁎ | −1.848 | δ11 | 1.382⁎⁎⁎ | 18.288 |
| ρ11 | −0.150⁎ | −1.897 | δ21 | −0.584⁎⁎⁎ | −8.765 |
| Panel B: Eq. (22) | |||||
| ω2 | −45.866⁎⁎⁎ | −3.658 | ρ22 | −0.945⁎⁎⁎ | −3.243 |
| α02 | −3.438 | −1.087 | ρ32 | −0.938⁎⁎ | −2.724 |
| β02 | 4.777 | 0.659 | ρ42 | −0.930⁎ | −1.858 |
| β12 | 13.971 | 1.366 | ρ52 | −0.922 | −1.330 |
| β22 | 4.971 | 0.502 | δ12 | −0.012 | −0.073 |
| β32 | 12.506 | 1.383 | δ22 | −0.057 | −0.540 |
| ρ02 | −0.961⁎ | −1.718 | δ32 | −0.103 | −1.384 |
| ρ12 | −0.953⁎⁎ | −2.463 | δ42 | −0.148 | −1.627 |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
At the meso level, we test weekly iVX (WIVX), average strike price (WSP), Shibor 1-year rate (WSR) and their first differences (denoted as DWIVX, DWSP and DWSR, respectively), as well as MeSI and weekly LIVX (WLIVX). Table 12 reports the stationarity test results for these variables.
Table 12.
Stationary tests of meso variables.
| Variables | ADF | PP | Variables | ADF | PP |
|---|---|---|---|---|---|
| WIVX | −0.668 | −1.636 | WSR | −1.509 | −1.342 |
| DWIVX | −7.729⁎⁎⁎ | −10.1563⁎⁎⁎ | DWSR | −5.407⁎⁎⁎ | −4.50⁎⁎ |
| WSP | −0.909 | −1.086 | WLIVX | −8.530⁎⁎⁎ | −3.054⁎⁎⁎ |
| DWSP | −8.158⁎⁎⁎ | −7.687⁎⁎ | MeSI | −4.664⁎⁎⁎ | −3.820⁎⁎⁎ |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
The regression models are defined as follows:
| (23) |
| (24) |
Table 13 reports the estimates. It shows that current sentiment has no significant effect on iVX or its component. For both WLIVX and iVX, first-order lagged sentiment has a significantly negative impact. Thus, iVX and its medium-frequency component represent previous sentiment, consistent with the correlation results in Section 3.2.
Table 13.
Estimations of ADL at meso level.
| Panel A: Eq. (23) | |||||
|---|---|---|---|---|---|
| Parameter | Coefficient | t-statistic | Parameter | Coefficient | t-statistic |
| ω3 | −0.150⁎ | −1.729 | ρ21 | −0.162 | −1.263 |
| α01 | −0.280 | −0.184 | ρ31 | −0.017 | −0.208 |
| β01 | −3.569⁎ | −1.919 | ρ41 | 0.199 | 1.153 |
| ρ01 | −0.239 | −1.376 | δ11 | 1.488 | 31.638 |
| ρ11 | −0.236⁎⁎⁎ | −2.911 | δ21 | −0.798 | −16.962 |
| Panel B: Eq. (24) | |||||
| ω4 | −0.059 | −0.253 | ρ32 | 0.270 | 1.277 |
| α02 | −3.333 | −0.804 | ρ42 | 0.929⁎ | 2.015 |
| β02 | −0.356 | −0.070 | δ12 | 0.225⁎⁎ | 2.718 |
| ρ02 | −0.721 | −1.521 | δ22 | −0.351⁎⁎⁎ | −4.267 |
| ρ12 | −0.555⁎⁎ | −2.509 | δ32 | 0.089 | 1.072 |
| ρ22 | −0.224 | −0.649 | |||
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
At the micro level, we test daily iVX, average strike price (DSP), Shibor 1-year rate (DSR) and their first differences (denoted as DIVX, DDSP and DDSR, respectively), as well as MicSI and daily HIVX. Table 14 reports the stationarity test results for these variables.
Table 14.
Stationary tests of micro variables.
| Variables | ADF | PP | Variables | ADF | PP |
|---|---|---|---|---|---|
| iVX | −2.917 | −1.99 | DSR | −1.396 | −1.208 |
| DIVX | −26.531⁎⁎⁎ | −26.513⁎⁎⁎ | DDSR | −5.627⁎⁎⁎ | −6.977⁎⁎⁎ |
| DSP | −0.923 | −0.997 | MicSI | −5.836⁎⁎⁎ | −14.322⁎⁎⁎ |
| DDSP | −25.898⁎⁎⁎ | −25.898⁎⁎⁎ | HIVX | −12.463⁎⁎⁎ | −12.444⁎⁎⁎ |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
We define the regression models as follows:
| (25) |
| (26) |
The estimates of the equations are shown in Table 15. For Eq. (25), only ρ01 is significantly negative among the ρi1, which suggests that iVX reveals the current sentiment more than lagged sentiment. Although ρ21 and ρ31 are significantly positive, the sum of all the coefficients on sentiment is negative, which means iVX can reflect investor sentiment. For Eq. (26), ρ02 is more negative than ρ12.
Table 15.
Estimations of ADL at micro level.
| Panel A: Eq. (25) | |||||
|---|---|---|---|---|---|
| Parameter | Coefficient | t-statistic | Parameter | Coefficient | t-statistic |
| ω5 | 0.002 | 0.036 | ρ21 | 0.732⁎⁎⁎ | 2.781 |
| α01 | −15.008⁎ | −1.856 | ρ31 | 0.695⁎⁎⁎ | 3.460 |
| α11 | 13.625⁎ | 1.684 | ρ41 | −0.172 | −0.454 |
| β01 | −5.274⁎⁎⁎ | −2.611 | δ11 | 0.737⁎⁎⁎ | 19.956 |
| ρ01 | −1.683⁎⁎⁎ | −4.516 | δ21 | −0.071 | −1.561 |
| ρ11 | −0.061 | −0.301 | δ31 | −0.035 | −0.942 |
| Panel B: Eq. (26) | |||||
| Parameter | Coefficient | t-statistic | Parameter | Coefficient | t-statistic |
| ω6 | 0.008 | 0.130 | ρ12 | −0.064⁎ | −1.813 |
| α02 | −5.539 | −1.195 | ρ22 | 0.018 | 0.679 |
| β02 | −5.671⁎⁎ | −2.434 | ρ32 | 0.032 | 1.170 |
| β12 | 1.160 | 0.495 | ρ42 | −0.025 | −0.694 |
| ρ02 | −0.217⁎⁎ | −2.134 | δ12 | 0.242⁎⁎ | 2.193 |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
After discussing iVX and investors' sentiment at different levels, we analyze how iVX represents the common sentiment factor (CSF). The factor is found to be stationary, with an ADF statistic of −6.625 and a PP statistic of −6.427. The other variables have been tested in the previous analysis. We then run the following regression model:
| (27) |
Table 16 shows that our conclusions hold, and that only the current sentiment has a significantly negative impact on iVX. Therefore, the application of ADL confirms our findings on iVX's representation of investor sentiment, both for the different levels and for the common sentiment factor.
Table 16.
Estimations of ADL for CSF.
| Parameter | Coefficient | t-statistic |
|---|---|---|
| ω7 | 0.009 | 0.146 |
| α01 | −23.405⁎⁎ | −2.513 |
| α11 | 19.037⁎⁎ | 2.044 |
| β01 | −6.290⁎⁎⁎ | −2.713 |
| ρ01 | −0.445⁎⁎⁎ | −3.429 |
| ρ11 | −0.067 | −1.064 |
| ρ21 | 0.164 | 1.602 |
| ρ31 | 0.248⁎⁎⁎ | 3.925 |
| ρ41 | 0.185 | 1.413 |
| δ01 | 0.001 | 0.021 |
Note: ***, **, and * denote rejection of the null hypothesis at the 1%, 5% and 10% significance levels, respectively.
6. Conclusions
In this paper, we investigate whether the Chinese volatility index iVX can represent investor sentiment. Most previous studies treat the sentiment representation of the volatility index as a known fact, or analyze only a certain level of sentiment. In contrast, we construct a three-dimensional investor sentiment measurement system, and use the EEMD method to decompose iVX into three components, i.e. short-term fluctuations, medium-term fluctuations and a long-term trend. We study the relationship between iVX and micro, meso and macro sentiment, respectively. Further, in the framework of the dynamic factor model with mixed-frequency data, a common sentiment factor is constructed to verify how iVX represents the common sentiment. Finally, we use ADL to test our results.
Our findings mainly include the following aspects:
First, the components of iVX at the macro, meso, and micro levels all have a certain ability to represent sentiment. Among them, the component at the macro level has the strongest ability to represent the current sentiment; the one at the meso level reflects the first-order or second-order lagged sentiment; the one at the micro level can represent the current sentiment, but the effect is weak. Overall, iVX tends to reflect macro sentiment more, and has a weaker ability to represent sentiment at the meso and micro levels.
Second, this paper finds that, compared with the sentiment index at each level, the common sentiment factor has a stronger relationship with iVX. This implies that iVX mainly represents the component of sentiment common to the different levels, and iVX has the strongest correlation with the current common sentiment factor. Hence, we believe that iVX can represent the common sentiment of all levels to a certain extent.
The underlying asset of iVX is the Shanghai 50ETF option, and option prices should theoretically reflect the expectations and sentiments of investors. This is confirmed by our verification of the ability of iVX to represent sentiment. Since the Chinese volatility index iVX is still in its infancy, the empirical evidence of this paper contributes a basis for the development of iVX.
Our results show iVX can comprehensively reflect the sentiment and expectations of investors on different time scales. The measurement of investor sentiment is complicated from the macro level to the micro level, while iVX can be calculated at high frequency. Thus, we strongly recommend resuming the release of iVX. It will help investors better understand the overall financial market situation, allocate asset portfolios, and perform risk management, and is also beneficial for regulatory authorities to deepen their understanding of changes in the domestic financial market and effectively conduct financial supervision, so as to achieve the healthy and sustainable development of the financial market and ensure the stable operation of the macroeconomic system.
Declaration of Competing Interest
This research was partly supported by the National Natural Science Foundation of China (No. 71771204).
References
- Antweiler W., Frank M.Z. Is all that talk just noise? The information content of internet stock message boards. The Journal of Finance. 2004;59(3):1259–1294. [Google Scholar]
- Arav O., John T., Yaseen A. Persistence and discontinuity in the VIX dynamics. Chaos, Solitons & Fractals. 2018;113:333–344. [Google Scholar]
- Aruoba S.B., Diebold F.X., Scotti C. Real-time measurement of business conditions. Journal of Business & Economic Statistics. 2009;27(4):417–427. [Google Scholar]
- Badshah I., Bekiros S., Lucey B.M., Uddin G.S. Asymmetric linkages among the fear index and emerging market volatility indices. Emerging Markets Review. 2018;37:17–31. [Google Scholar]
- Baker M., Wurgler J. Investor sentiment in the stock market. The Journal of Economic Perspectives. 2007;21(2):129–151. [Google Scholar]
- Basher S.A., Sadorsky P. Hedging emerging market stock prices with oil, gold, VIX, and bonds: A comparison between DCC, ADCC and GO-GARCH. Energy Economics. 2016;54(Feb.):235–247. [Google Scholar]
- Behrendt S., Schmidt A. The Twitter myth revisited: Intraday investor sentiment, Twitter activity and individual-level stock return volatility. Journal of Banking & Finance. 2018;96:355–367. [Google Scholar]
- Bollerslev T., Mikkelsen H.O. Modeling and pricing long memory in stock market volatility. Journal of Econometrics. 1996;73(1):151–184. [Google Scholar]
- Canina L., Figlewski S. The informational content of implied volatility. The Review of Financial Studies. 1993;6(3):659–681. [Google Scholar]
- Chandra A., Thenmozhi M. On asymmetric relationship of India volatility index (India VIX) with stock market return and risk management. Decision. 2015;42(1):1–23. [Google Scholar]
- Checkley M.S., Higón D.A., Alles H. The hasty wisdom of the mob: How market sentiment predicts stock market behavior. Expert Systems with Applications. 2017;77:256–263. [Google Scholar]
- Chen J., Jiang F., Liu Y., et al. International volatility risk and Chinese stock return predictability. Journal of International Money and Finance. 2017;70:183–203. [Google Scholar]
- Chow V., Jiang W., Li J. Does VIX truly measure return volatility. Electronic Journal. 2018;20(1):1–35. [Google Scholar]
- Cont R. Volatility clustering in financial markets: Empirical facts and agent-based models. SSRN Electronic Journal. 2007;1:289–309. [Google Scholar]
- Cortes C., Vapnik V. Support-vector networks. Machine Learning. 1995;20(3):273–297. [Google Scholar]
- Da Z., Engelberg J., Gao P. The sum of all FEARS investor sentiment and asset prices. The Review of Financial Studies. 2014;28(1):1–32. [Google Scholar]
- Das S.R., Chen M.Y. Yahoo! For Amazon: Sentiment extraction from small talk on the web. Management Science. 2007;53(9):1375–1388. [Google Scholar]
- Day E., Lewis M. Derivatives on market volatility and the information content of stock index options. Journal of Econometrics. 1992;52(1–2):267–287. [Google Scholar]
- Delisle R.J., Doran J.S., Peterson D.R. Asymmetric pricing of implied systematic volatility in the cross-section of expected returns. Journal of Futures Markets. 2011;31(1):34–54. [Google Scholar]
- Deng N., Tian Y., Zhang C. 2012. Support vector machines: Optimization based theory, algorithms, and extensions. [Google Scholar]
- Dupoyet B., Daigler R.T., Chen Z. A simplified pricing model for volatility futures. Journal of Futures Markets. 2011;31(4):307–339. [Google Scholar]
- Fleming J. The quality of market volatility forecasts implied by S&P100 index option prices. Journal of Empirical Finance. 1998;5(4):317–345. [Google Scholar]
- Gao Z., Ren H., Zhang B. Googling investor sentiment around the world. Journal of Financial and Quantitative Analysis. 2019:1–66. [Google Scholar]
- Giot P. Université catholique de Louvain, Center for Operations Research and Econometrics (CORE); 2002. Implied volatility indices as leading indicators of stock index returns? [Google Scholar]
- Gray F.S. Modeling the conditional distribution of interest rates as a regime-switching process. Journal of Financial Economics. 1996;42(1):27–62. [Google Scholar]
- Huang N.E., Shen Z., Long S.R., et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings A. 1998;454(1971):903–995. [Google Scholar]
- Kambouroudis S., McMillan G., Tsakou K. Forecasting stock return volatility: A comparison of Garch, implied volatility, and realized volatility models. Journal of Futures Markets. 2016;36(12):1127–1163. [Google Scholar]
- Kanniainen J., Lin B., Yang H. Estimating and using GARCH models with VIX data for option valuation. Journal of Banking & Finance. 2014;43(jun):200–211. [Google Scholar]
- Kim S.H., Kim D. Investor sentiment from internet message postings and the predictability of stock returns. Journal of Economic Behavior & Organization. 2014;107:708–729. [Google Scholar]
- Koopman S.J., Harvey A. Computing observation weights for signal extraction and filtering. Journal of Economic Dynamics and Control. 2003;27(7):1317–1333. [Google Scholar]
- Lee C., Thaler S. Investor sentiment and the closed-end fund puzzle. Journal of Finance. 1991;46(1):75–109. [Google Scholar]
- Lei Y., Zuo M.J. Fault diagnosis of rotating machinery using an improved HHT based on EEMD and sensitive IMFs. Measurement ence & Technology. 2009;20(12):125701. [Google Scholar]
- Lemmon M., Portniaguina E. Consumer confidence and asset prices: Some empirical evidence. Review of Financial Studies. 2006;19(4):1499–1529. [Google Scholar]
- Li B., Chan K.C.C., Ou C., Ruifeng S. Discovering public sentiment in social media for predicting stock movement of publicly listed companies. Information Systems. 2017;69:81–92. [Google Scholar]
- Li J., Yu X., Luo X., et al. Volatility index and the return-volatility relation: Intraday evidence from Chinese options market. Journal of Futures Markets. 2019;39(11):1348–1359. [Google Scholar]
- Lilleberg J., Zhu Y., Zhang Y. 2015. Support vector machines and Word2vec for text classification with semantic features[C]// 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCICC) IEEE. [Google Scholar]
- Liu B., Zhang L. Mining Text Data, Springer US; New York: 2012. A survey of opinion mining and sentiment analysis; pp. 415–463. [Google Scholar]
- Lucas R. Understanding business cycles. Carnegie-Rochester Conference Series on Public Policy. 1977;5:7–29. [Google Scholar]
- Martens M., Zein J. Predicting financial volatility: High frequency time series Forecasts vis-à-vis implied volatility. Journal of Futures Markets. 2004;23(11):1005–1028. [Google Scholar]
- Mikolov T., Chen K., Corrado G., Corrado J.D. Advances in neural information processing systems 26. Proceeding of a meeting held December 5–8, Lake Tahoe, Nevada, United States. 2013. Districted representations of words and phrases and their compositionality, in 27th Annual Conference on Neural Information Processing Systems; pp. 3111–3119. [Google Scholar]
- Mitchell D.W., Speaker P.J. A simple, flexible distributed lag technique: The polynomial inverse lag. Journal of Econometrics. 1986;31(3):329–340. [Google Scholar]
- Odean T. Are investors reluctant to realize their losses. The Journal of Finance. 1998;53(5):1775–1798. [Google Scholar]
- Pan W. Sentiment and asset price bubble in the precious metals markets. Finance Research Letters. 2018;26:106–111. [Google Scholar]
- Peng Y., Ng W.L. Analysing financial contagion and asymmetric market dependence with volatility indices via copulas. Annals of Finance. 2012;8(1):49–74. [Google Scholar]
- Pineiro-Chousa J.R., López-Cabarcos M.á., Pérez-Pico A.M. Examining the influence of stock market variables on microblogging sentiment. Journal of Business Research. 2016:2087–2092. [Google Scholar]
- Qadan M., Kliger D., Chen N. Idiosyncratic volatility, the VIX and stock returns. The North American Journal of Economics and Finance. 2019;47:431–441. [Google Scholar]
- Qiao G., Teng Y., Li W., et al. Improving volatility forecasting based on Chinese volatility index information: Evidence from CSI 300 index and futures markets. The North American Journal of Economics and Finance. 2019;49(Jul):133–151. [Google Scholar]
- Qiu L., Welch I. Investor sentiment measures. No. w10794. National Bureau of Economic Research. 2004 doi: 10.3386/w10794. [DOI] [Google Scholar]
- Renault T. Intraday online investor sentiment and return patterns in the US stock market. Journal of Banking & Finance. 2017;84(Nov):25–40. [Google Scholar]
- Sarwar G. Is VIX an investor fear gauge in BRIC equity markets? Journal of Multinational Financial Management. 2012;22(3) [Google Scholar]
- Schmeling M. Investor sentiment and stock returns: Some international evidence. Journal of Empirical Finance. 2009;16(3):394–408. [Google Scholar]
- Shaikh I., Padhi P. The information content of implied volatility index (India VIX) Global Business Perspectives. 2013;1(4):359–378. [Google Scholar]
- Shi S., Zhu Y., Zhao Z., Kang K., Xiong X. The investor sentiment mined from WeChat text and stock market performance. Systems Engineering - Theory & Practice. 2018;38(06):1404–1412. [Google Scholar]
- Siriopoulos C., Fassas A. An investor sentiment barometer — Greek Implied Volatility Index (GRIV) Global Finance Journal. 2012;23(2):77–93. [Google Scholar]
- Smales L.A. News sentiment and the investor fear gauge. Finance Research Letters. 2014;11(2):122–130. [Google Scholar]
- Smales L.A. Risk-on/Risk-off: Financial market response to investor fear. Finance Research Letters. 2016;17:125–134. [Google Scholar]
- Smales L.A. The importance of fear: Investor sentiment and stock market returns. Applied Economics. 2017:1–27. [Google Scholar]
- Tissaoui K., Azibi J. International implied volatility risk indexes and Saudi stock return-volatility predictabilities. The North American Journal of Economics and Finance. 2019;47:65–84. [Google Scholar]
- Torres M.E., Marcelo A.C., Gaston S., et al. IEEE International Conference on Acoustics. IEEE. 2011. A complete ensemble empirical mode decomposition with adaptive noise. [Google Scholar]
- Tsukioka Y., Yanagi J., Takada T. Investor sentiment extracted from internet stock message boards and IPO puzzles. International Review of Economics and Finance. 2018;56(Jul):205–217. [Google Scholar]
- Wang J., Gao R.X., Yan R. Integration of EEMD and ICA for wind turbine gearbox diagnosis. Wind Energy. 2014;17(5):757–773. [Google Scholar]
- Wang W.C., Xu D.M., Chau K.W., et al. Improved annual rainfall-runoff forecasting using PSO-SVM model based on EEMD. Journal of Hydroinformatics. 2013;15(4):1377–1390. [Google Scholar]
- Whaley R.E. The investor fear gauge: Explication of the CBOE VIX. The Journal of Portfolio Management. 2000;26(3):12–17. [Google Scholar]
- Whaley R.E. Understanding the VIX. The Journal of Portfolio Management. 2009;35(3):98–105. [Google Scholar]
- Wolf L., Hanani Y., Bar K., et al. Joint Word2vec networks for bilingual semantic representations. International Journal of Computational Linguistics and Applications. 2014;5(01):27–44. [Google Scholar]
- Wu D.D., Zheng L., Olson D.L. A decision support approach for online stock forum sentiment analysis. IEEE Transactions on Systems Man & Cybernetics Systems. 2014;44(44):1077–1087. [Google Scholar]
- Wu P.C., Pan S.C., Tai X.L. Non-linearity, persistence and spillover effects in stock returns: The role of the volatility index. Empirica. 2015;42(3):597–613. [Google Scholar]
- Wu Z., Huang N.E. A study of the characteristics of white noise using the empirical mode decomposition method. Proceedings of the Royal Society A Mathematical Physical & Engineering Sciences. 2004;460(2046):1597–1611. [Google Scholar]
- Wurgler J., Baker M. Investor Sentiment and the Cross-Section of Stock Returns. The Journal of Finance. 2006;61(4):1645–1680. [Google Scholar]
- Xu H.C., Zhou W.X. A weekly sentiment index and the cross-section of stock returns. Finance Research Letters. 2018;27:135–139. [Google Scholar]
- Yang C.-Y., Jhang L.-J., Chang C.-C. Do investor sentiment, weather and catastrophe effects improve hedging performance? Evidence from the Taiwan options market. Pacific-Basin Finance Journal. 2016;37(Apr.):35–51. [Google Scholar]
- Yu L., Wang S., Lai K.K. Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Economics. 2008;30(5):2623–2635. [Google Scholar]
- Yue T., Ruan X., Gehricke S., Zhang J. The Chinese volatility index and the volatility risk premium. 2019. https://nzfc.ac.nz/papers/updated/32.pdf
- Zhang D., Xu H., Su Z., et al. Chinese comments sentiment classification based on word2vec and SVMperf. Expert Systems with Application. 2015;42(4):1857–1863. [Google Scholar]
- Zhang X., Fuehres H., Gloor P.A. Predicting stock market indicators through twitter “I hope it is not as bad as I fear”. Procedia-Social and Behavioral Sciences. 2011;26:55–62. [Google Scholar]
- Zhang X., Lai K.K., Wang S. A new approach for crude oil price analysis based on empirical mode decomposition. Energy Economics. 2008;30(3):905–918. [Google Scholar]
- Zhang X., Yu L., Wang S., et al. Estimating the impact of extreme events on crude oil price: An EMD-based event analysis method. Energy Economics. 2009;31(5):768–778. [Google Scholar]













