Abstract
This paper examines the diffusion of COVID-19 news on social media using a large sample of approximately 45 million tweets. Using textual analysis, I identify tweets containing COVID-19 news and construct an index representing the intensity of Twitter discussions. Moreover, I use retweets and favorites as additional measures of investor attention to COVID-19. The results show that the intensity of Twitter discussions about COVID-19 (and about the treatment program) corresponds to market returns. This suggests a role for financial social networks in transmitting information related to crises, such as COVID-19, and to the resolution of crises.
Keywords: Networks, Social media, Big Data, COVID-19
1. Introduction
The goal of this paper is to examine the diffusion of news related to COVID-19 on social media, and to relate this to daily returns. To undertake this study, I utilize a comprehensive dataset of approximately 45 million tweets covering over 4500 firms for the full calendar year of 2020. Using textual analysis, I construct a daily index of the intensity of Twitter discussions about COVID-19, as well as a second index of the intensity of Twitter discussions about COVID-19 treatment. I then relate these two indexes to daily stock returns. The main finding of this paper is that the intensity of Twitter discussions corresponds to daily returns. Moreover, using a vector autoregression analysis, I show that this effect is not reversed in the short run. In addition, I derive other measures of investor attention to COVID-19 from Twitter using retweets and the number of times tweets are marked as favorite. Such measures also correspond to daily returns.
Investors have long relied on the Internet to communicate financial news (Antweiler and Frank, 2004) and have used Google to search for financial information (Da et al., 2011; Drake et al., 2012); most recently, they have turned to social media to search for and to discuss financial information. Several contemporary papers discuss the role of social media in news diffusion (e.g. Chen et al., 2014; Jung et al., 2018; and Chawla et al., 2017). Social media exhibits a structure that may allow information and noise to propagate widely (Al Guindy and Riordan, 2019). This being the case, the diffusion of discussions and opinions, particularly those related to fear, may have wide-reaching implications for investors. In this paper, I focus on the transmission of information related to fear (from COVID-19) and hope (in the form of treatment) on economic social networks that form on social media. Indeed, early evidence suggests that retail investors’ trading behavior is affected by COVID-19 (Pagano et al., 2021).
2. Data and methodology
The data used in this paper contain publicly available tweets about all firms listed on the three major US exchanges for the full calendar year 2020. In addition to the text and time of each tweet, the dataset contains the number of retweets as well as the number of times each tweet is marked as favorite.
To collect the dataset, I set up three robots that were tasked with the collection of daily tweets about all US firms. The Twitter Application Programming Interface (API) makes those tweets available for a period of approximately one week, after which they are no longer available on the public API. The tweets collected represent all the (publicly available) discussions about each firm in the sample. To identify those tweets, I utilize the “cashtag” feature on Twitter; the cashtag is the ticker symbol preceded by a dollar sign, which designates a tweet as strictly financial. For example, the cashtag “$PFE” included in a tweet indicates that the tweet is specifically about the stock of Pfizer Inc. I take advantage of this feature to collect the record of financial tweeting about all firms listed on the NYSE, NYSE MKT and NASDAQ.¹
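As a rough illustration of the cashtag convention described above, the sketch below pulls ticker symbols out of a tweet's text. The regular expression (one to five letters, with an optional share-class suffix such as `.B`) and the function name `extract_cashtags` are assumptions made for this example; they are not the collection logic used by the robots.

```python
import re

# Assumed pattern: "$" followed by 1-5 letters, optionally ".X" for a share class.
# The lookbehind avoids matching a "$" glued to the end of another word.
CASHTAG_RE = re.compile(r"(?<![A-Za-z0-9_])\$([A-Za-z]{1,5}(?:\.[A-Za-z])?)\b")


def extract_cashtags(text):
    """Return the upper-cased ticker symbols referenced by cashtags in a tweet."""
    return [symbol.upper() for symbol in CASHTAG_RE.findall(text)]


# Hypothetical tweet text; dollar amounts such as "$20" are not cashtags.
sample = "Big vaccine news for $PFE and $mrna today, paid $20 for lunch"
tickers = extract_cashtags(sample)
```

Upper-casing the symbols makes it straightforward to match them against ticker-based firm identifiers.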
After collecting the tweeting history, I conduct textual analysis on the sample to search for financial tweets that particularly discuss COVID-19. To do so, I conduct a keyword search on the text of each tweet, searching for words such as COVID-19, corona virus, SARS-CoV-2, etc. A full list of the words used is shown in the Appendix. If a tweet contains one of these keywords, I label it as a tweet that discusses COVID-19.
After identifying COVID-19 tweets, I identify a subset containing information about a treatment or vaccine, and label them as ‘COVID treatment’ tweets. The list of keywords used to identify the treatment tweets is also included in the appendix. To separate tweets containing news about a treatment from those that are general COVID-19 tweets, I exclude tweets that contain keywords related to a treatment from the original list of COVID-19 tweets such that the first list of tweets is strictly about the pandemic, rather than about the treatment.
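The two-step labeling described above can be sketched as follows. The keyword lists are taken from the Appendix; the matching rule (case-insensitive substring search) and the names `contains_any` and `label_tweet` are assumptions for illustration, since the exact matching procedure is not specified.

```python
# Keyword lists as given in the Appendix, lower-cased for matching.
COVID_KEYWORDS = [
    "covid", "covid-19", "covid_19", "corona virus", "coronavirus", "corona",
    "sars-cov-2", "sars-2", "sars2", "sars", "social distance",
    "social distancing", "quarantine", "self isolate", "self-isolate",
    "pandemic", "flatten the curve",
]
TREATMENT_KEYWORDS = ["drug", "vaccine", "treatment"]


def contains_any(text, keywords):
    """Case-insensitive substring match (an assumed matching rule)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def label_tweet(text):
    """Return 'covid_treatment', 'covid', or 'other' for one tweet."""
    if not contains_any(text, COVID_KEYWORDS):
        return "other"
    # Treatment tweets are a subset of COVID-19 tweets and are removed
    # from the general COVID-19 group, so the two labels never overlap.
    if contains_any(text, TREATMENT_KEYWORDS):
        return "covid_treatment"
    return "covid"
```

Plain substring matching is simple but can over-match (for example, "corona" inside an unrelated word), so a production version might prefer word-boundary matching.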
Having labeled the tweets, I construct an index that measures the “intensity” of the Twitter discussions about COVID-19. I do so by dividing the number of tweets containing COVID-19 news by the total number of daily tweets about a firm, producing a variable that ranges between 0 and 1. Similarly, I construct a second variable representing the intensity of the Twitter discussions about the treatment of COVID-19 using the same method. I then standardize both variables to zero mean and unit standard deviation for ease of empirical interpretation.
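A minimal sketch of the index construction, using made-up firm-day counts, is shown below. The function names and the use of the population standard deviation are assumptions; the text does not say whether the sample or population formula is used.

```python
from statistics import mean, pstdev


def intensity(covid_count, total_count):
    """Share of a firm's daily tweets that discuss COVID-19 (between 0 and 1)."""
    return covid_count / total_count  # assumes total_count > 0


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(value - mu) / sigma for value in values]


# Hypothetical (COVID-19 tweets, all tweets) counts for four firm-days.
firm_days = [(0, 10), (1, 4), (3, 6), (2, 8)]
raw = [intensity(covid, total) for covid, total in firm_days]
z = standardize(raw)
```

After standardization, a regression coefficient on the index reads directly as the change in returns associated with a one standard deviation increase in intensity.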
I also take advantage of the depth of the dataset which includes information about retweets and the number of times tweets are marked as favorite. To do so, I calculate the total number of retweets of all tweets containing information about COVID-19 (or about a treatment) and use this as an additional variable in the empirical analysis. Moreover, I calculate another variable with the total number of times such tweets are marked as favorite.
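The aggregation of retweets and favorites described above could look like the following sketch. The record layout (`ticker`, `day`, `label`, `retweets`, `favorites`) and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical labeled tweets; field names and values are illustrative only.
tweets = [
    {"ticker": "PFE", "day": "2020-11-09", "label": "covid_treatment", "retweets": 12, "favorites": 30},
    {"ticker": "PFE", "day": "2020-11-09", "label": "covid_treatment", "retweets": 3, "favorites": 5},
    {"ticker": "PFE", "day": "2020-11-09", "label": "other", "retweets": 40, "favorites": 9},
    {"ticker": "DAL", "day": "2020-03-16", "label": "covid", "retweets": 7, "favorites": 2},
]


def attention_totals(tweets, label):
    """Sum retweets and favorites per (ticker, day) over tweets with the given label."""
    totals = defaultdict(lambda: {"retweets": 0, "favorites": 0})
    for tweet in tweets:
        if tweet["label"] == label:
            key = (tweet["ticker"], tweet["day"])
            totals[key]["retweets"] += tweet["retweets"]
            totals[key]["favorites"] += tweet["favorites"]
    return dict(totals)


treatment_attention = attention_totals(tweets, "covid_treatment")
covid_attention = attention_totals(tweets, "covid")
```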
Table 1 shows a summary of all tweeting variables, as well as statistics about tweets that discuss COVID-19 in particular. The variables are presented at the firm-day level for the full calendar year 2020.
Table 1.
General statistics
This table provides general descriptive statistics for the sample set. Panel A contains information about firms constituting the sample set. Panel B shows daily tweeting statistics (average by firm) in the sample set.
| Panel A: general statistics | |
|---|---|
| Number of firms in sample | 4743 |
| Stock exchanges covered | NYSE, NYSE MKT, NASDAQ |
| Number of tweets used to construct the sample | 44,857,312 |

Panel B summarizes daily parameters, including the number of tweets per day per firm, the number of retweets, and the number of times tweets are marked as favorite. In addition, it shows the statistics for daily COVID-19 tweets.

| Panel B: Tweeting (daily) descriptive statistics | Mean | Standard deviation |
|---|---|---|
| Number of tweets | 42.31 | 335.63 |
| Number of retweets | 5080.56 | 799,716 |
| Number of favorites | 70.90 | 939.85 |
| Tweets containing COVID-19 mentions | 0.75 | 6.71 |
| Tweets containing COVID-19 treatment | 0.15 | 4.24 |
| Retweets about COVID-19 | 8.60 | 1218.95 |
| Number of favorites/likes about COVID-19 tweets | 1.48 | 21.76 |
| Retweets about COVID-19 treatment | 2.46 | 373.22 |
| Number of favorites/likes about COVID-19 treatment tweets | 0.32 | 15.85 |
3. Empirical results
3.1. Intensity of COVID-19 Twitter discussions and daily returns
To begin the analysis, I start with the simplest model, which examines the relationship between the intensity of the COVID-19 Twitter discussions and daily firm returns. The intensity is defined as the number of tweets discussing COVID-19 divided by the total number of tweets about a firm (in a 24-hour window from market close to market close²). The intensity variable is then standardized for ease of interpretation. The model also includes firm characteristics and fixed effects as control variables as follows:
| (1) |
A second model focuses particularly on the COVID-19 discussions containing information about a treatment:
| (2) |
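Written out with assumed symbol names, models (1) and (2) as described above would take a form such as the following. Here $R_{i,t}$ is the daily return of firm $i$ in basis points, $\mathrm{Intensity}$ is the standardized tweet proportion, $X_{i,t}$ collects the firm-level controls, and $\delta_{j(i)}$ and $\lambda_{d(t)}$ stand for industry and day-of-the-week fixed effects. The notation is illustrative rather than the original typeset equations.

```latex
\begin{align}
R_{i,t} &= \alpha + \beta\,\mathrm{Intensity}^{\mathrm{COVID}}_{i,t}
          + \gamma^{\top} X_{i,t} + \delta_{j(i)} + \lambda_{d(t)} + \varepsilon_{i,t} \tag{1} \\
R_{i,t} &= \alpha + \beta\,\mathrm{Intensity}^{\mathrm{Treatment}}_{i,t}
          + \gamma^{\top} X_{i,t} + \delta_{j(i)} + \lambda_{d(t)} + \varepsilon_{i,t} \tag{2}
\end{align}
```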
The results of these two models (with three variations) are reported in Table 2. All the results suggest a relationship between tweeting about COVID-19 (and the treatment program) and daily stock returns. For example, the first column in the table shows that a one standard deviation increase in the intensity of Twitter discussions about COVID-19 corresponds to a 3.37 basis point reduction in returns on a given day. The second column shows that a one standard deviation increase in the intensity of Twitter discussions about treatment corresponds to a 4.02 basis point increase in returns.
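Because tweets and returns are both measured from market close to market close, each tweet has to be assigned to a trading day before the intensity is computed. One way to do this is sketched below, assuming timestamps are already expressed in exchange local time, a 4:00 pm close, and a caller-supplied sorted list of trading days; all three are assumptions for illustration.

```python
from bisect import bisect_left
from datetime import date, datetime, time

MARKET_CLOSE = time(16, 0)  # assumed 4:00 pm close in exchange local time


def assign_trading_day(timestamp, trading_days):
    """Map a tweet timestamp to the trading day whose close-to-close window contains it.

    trading_days must be a sorted list of dates. Returns None when the tweet
    falls after the last close covered by the list.
    """
    idx = bisect_left(trading_days, timestamp.date())
    posted_on_trading_day = idx < len(trading_days) and trading_days[idx] == timestamp.date()
    # A tweet posted after the close belongs to the next trading day's window.
    if posted_on_trading_day and timestamp.time() > MARKET_CLOSE:
        idx += 1
    return trading_days[idx] if idx < len(trading_days) else None


# Hypothetical calendar: Friday, Monday and Tuesday in March 2020.
calendar = [date(2020, 3, 13), date(2020, 3, 16), date(2020, 3, 17)]
```

Weekend and holiday tweets fall into the window that ends at the next trading day's close, which is why the lookup uses the first trading day on or after the tweet date.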
Table 2.
Daily returns and tweets about COVID-19 and COVID-19 treatment
This table documents the results of the panel regression of daily stock returns in basis points on the proportion of daily tweets about COVID-19 (first column) and COVID-19 treatment news (second column). Three different sets of models are estimated. The proportion of daily tweets is the standardized ratio of tweets related to COVID-19 relative to all tweets about a firm on a given day (first column), and the standardized ratio of tweets related to COVID-19 treatment relative to all tweets about a firm on a given day (second column). Control variables include: Firm size, defined as the natural logarithm of the market value of equity; B/M, defined as the book-to-market ratio; Beta, defined as the CAPM beta; Leverage, defined as the proportion of leverage in the firm's capital structure; Payout ratio, defined as the firm's payout proportion; Institutional ownership, defined as the percentage of the firm's shares held by institutional owners; and Analysts, defined as the number of analysts following the firm. Standard errors are shown in parentheses. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively.
| Daily returns (basis points) | COVID-19 tweets | COVID-19 treatment tweets | COVID-19 tweets | COVID-19 treatment tweets | COVID-19 tweets | COVID-19 treatment tweets |
|---|---|---|---|---|---|---|
| Proportion of daily tweets | −3.37*** (0.80) | 4.02*** (0.92) | −1.20* (0.67) | 2.42*** (0.75) | −1.33** (0.67) | 2.28*** (0.75) |
| Firm size | 3.28*** (0.57) | 3.14*** (0.57) | 1.73*** (0.47) | 1.65*** (0.47) | 1.85*** (0.47) | 1.77*** (0.47) |
| B/M | −0.32 (0.56) | −0.33 (0.56) | −0.53 (0.46) | −0.53 (0.46) | −0.49 (0.46) | −0.49 (0.46) |
| Beta | 11.30*** (2.65) | 11.87*** (2.64) | 11.14*** (2.18) | 11.36*** (2.17) | 10.97*** (2.18) | 11.21*** (2.18) |
| Leverage | −4.85 (3.56) | −5.45 (3.56) | −5.36* (2.93) | −5.62* (2.93) | −4.96* (2.93) | −5.24* (2.93) |
| Payout ratio | −13.54*** (2.65) | −13.29*** (2.65) | −11.07*** (2.18) | −10.95*** (2.18) | −11.19*** (2.19) | −11.07*** (2.18) |
| Institutional ownership | −0.01 (0.04) | −0.001 (0.04) | 0.01 (0.03) | 0.01 (0.03) | 0.01 (0.03) | 0.02 (0.03) |
| Analysts | −0.14* (0.07) | −0.15** (0.07) | −0.03 (0.06) | −0.03 (0.06) | −0.04 (0.06) | −0.05 (0.06) |
| Market return | | | 1.10*** (0.003) | 1.10*** (0.003) | 1.13*** (0.003) | 1.13*** (0.003) |
| VIX | | | 0.23*** (0.05) | 0.22*** (0.05) | 0.44*** (0.05) | 0.43*** (0.05) |
| Lagged market return | | | | | 0.09*** (0.003) | 0.09*** (0.003) |
| Industry fixed effects | Included | Included | Included | Included | Included | Included |
| Day of the week fixed effects | Included | Included | Included | Included | Included | Included |
| Adj. R2 | 0.004 | 0.004 | 0.33 | 0.33 | 0.33 | 0.33 |
| N | 333,334 | 333,334 | 333,334 | 333,334 | 332,042 | 332,042 |
3.2. Dissemination of Twitter discussions and daily returns
In the previous sub-section, the focus was on the number of tweets containing COVID-19 information. The goal of this sub-section is to focus on another aspect: that of further dissemination, or reach. In particular, investors may tweet about COVID-19 without those tweets receiving any attention, so this sub-section focuses on the investor-attention aspect of tweeting. Barber and Odean (2008) show that retail investors are more susceptible to attention-grabbing events, and the diffusion of COVID-19 news and treatment news may constitute just that. The emerging literature on social media in finance has identified both retweets and the number of times tweets are marked as favorite as measures of investor attention (Crowley et al., 2018; Chawla et al., 2017).
The empirical analysis of this section uses the following model:
| (3) |
I also estimate a similar model that focuses on retweets of tweets about COVID-19 treatment:
| (4) |
The results of these models are reported in Table 3. Importantly, I include the total number of retweets about each firm on a given day as a control variable in addition to focusing on retweets about COVID-19; this is important because retweets of general news may be received differently by market participants than retweets of COVID-19 news.
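With assumed notation, models (3) and (4) as described could be written as below, where $\mathrm{RT}^{\mathrm{COVID}}_{i,t}$ and $\mathrm{RT}^{\mathrm{Treatment}}_{i,t}$ are the daily retweet counts of COVID-19 and COVID-19 treatment tweets about firm $i$ (entered in natural logarithms, following the Table 3 caption), $\mathrm{RT}^{\mathrm{All}}_{i,t}$ is the total number of retweets about the firm, and the remaining terms are the firm controls and fixed effects. The symbols are illustrative, not the original typeset equations.

```latex
\begin{align}
R_{i,t} &= \alpha + \beta\,\ln\!\big(\mathrm{RT}^{\mathrm{COVID}}_{i,t}\big)
          + \theta\,\mathrm{RT}^{\mathrm{All}}_{i,t}
          + \gamma^{\top} X_{i,t} + \delta_{j(i)} + \lambda_{d(t)} + \varepsilon_{i,t} \tag{3} \\
R_{i,t} &= \alpha + \beta\,\ln\!\big(\mathrm{RT}^{\mathrm{Treatment}}_{i,t}\big)
          + \theta\,\mathrm{RT}^{\mathrm{All}}_{i,t}
          + \gamma^{\top} X_{i,t} + \delta_{j(i)} + \lambda_{d(t)} + \varepsilon_{i,t} \tag{4}
\end{align}
```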
Table 3.
Daily returns and measures of Twitter attention about COVID-19 and COVID-19 treatment
This table documents the results of the panel regression of daily stock returns in basis points on measures of investor attention and control variables. Number of retweets represents the natural logarithm of the daily number of COVID-19 retweets about a firm. Number of favorites represents the natural logarithm of the number of “likes” for a COVID-19 tweet. The table includes observations where the number of retweets/favorites of the COVID-19 (treatment) tweets is greater than zero. Control variables include: Retweets of all tweets, defined as the total number of retweets about a firm on a given day; Number of favorites of all tweets, defined as the total number of favorites of all tweets about a firm on a given day; Firm size, defined as the natural logarithm of the market value of equity; B/M, defined as the book-to-market ratio; Beta, defined as the CAPM beta; Leverage, defined as the proportion of leverage in the firm's capital structure; Payout ratio, defined as the firm's payout proportion; Institutional ownership, defined as the percentage of the firm's shares held by institutional owners; and Analysts, defined as the number of analysts following the firm. Standard errors are shown in parentheses. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively. Panel A shows the results for retweets, while Panel B shows the results for total favorites.
| Panel A: Daily returns and retweets about COVID-19 and COVID-19 treatment | | |
|---|---|---|
| Daily returns (basis points) | COVID-19 retweets | COVID-19 treatment retweets |
| Retweets of COVID-19 tweets | −10.96*** (2.45) | 10.73* (5.69) |
| Retweets of all tweets | 4.62*** (1.48) | 8.29* (4.30) |
| Firm size | −5.47** (2.34) | −12.04* (6.88) |
| B/M | −1.77 (3.38) | −2.95 (13.91) |
| Beta | 6.35 (10.27) | 67.58** (27.14) |
| Leverage | −14.14 (14.75) | 55.98 (51.71) |
| Payout ratio | −11.99 (13.36) | −23.01 (40.32) |
| Institutional ownership | 0.33* (0.17) | 0.44 (0.52) |
| Analysts | 0.28 (0.25) | −0.51 (0.79) |
| Industry fixed effects | Included | Included |
| Day of the week fixed effects | Included | Included |
| Adj. R2 | 0.004 | 0.01 |
| N | 25,431 | 4,665 |

| Panel B: Daily returns and number of tweeting “favorites” about COVID-19 and COVID-19 treatment | | |
|---|---|---|
| Daily returns (basis points) | COVID-19 number of favorites | COVID-19 treatment number of favorites |
| Number of favorites of COVID-19 tweets | −9.68*** (2.98) | 24.56*** (6.79) |
| Number of favorites of all tweets | 21.66*** (2.02) | 25.10*** (5.24) |
| Firm size | −6.04*** (2.14) | −14.46** (5.72) |
| B/M | 0.91 (3.05) | −19.21 (11.82) |
| Beta | 18.16* (9.32) | 49.60** (23.42) |
| Leverage | −2.95 (13.36) | 82.67* (43.49) |
| Payout ratio | −27.07** (12.02) | −24.73 (33.57) |
| Institutional ownership | 0.43*** (0.16) | 1.28*** (0.44) |
| Analysts | −0.71*** (0.23) | −1.24* (0.67) |
| Industry fixed effects | Included | Included |
| Day of the week fixed effects | Included | Included |
| Adj. R2 | 0.01 | 0.01 |
| N | 30,384 | 6,115 |
As Table 3 shows, a one-unit increase in the log number of retweets about COVID-19 corresponds to a 10.96 basis point reduction in returns on a given day. On the other hand, a one-unit increase in the log number of retweets about COVID-19 treatment corresponds to a 10.73 basis point increase in daily returns. The two columns together show that investor attention to COVID-19, as measured by the number of retweets, corresponds to market returns on a given day.
As a second proxy for investor attention, I use the number of times tweets are marked as favorites and repeat models 3 and 4 above but use the number of favorites rather than the number of retweets. The results are reported in Panel B of Table 3 and are consistent with the finding that investor attention to COVID-19 (COVID-19 treatment) is related to daily market outcomes.
3.3. Vector autoregression analysis
Thus far, the analysis has focused on the daily relationship between tweeting behavior or intensity of Twitter discussions about COVID-19 and returns. However, it is possible that the relationship is more complex, and is one that may span multiple days. For example, it is possible that the impact of the discussions is short-lived, or that it is reversed on subsequent trading days. The question then becomes, what is the multi-day relationship between COVID tweeting intensity and daily returns? To tackle this question, I conduct a vector autoregression (VAR) analysis, similar to Tetlock (2007), that examines the relationship between these two constructs over multiple days as follows:
| (5) |
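A returns equation consistent with the description above and with the rows of Table 4 might be written as follows. The symbols, and the inclusion of lagged returns, are assumptions for illustration; in the panel VAR (estimated by system GMM, per the Table 4 caption) it would be paired with an analogous equation for tweeting intensity.

```latex
\begin{equation}
R_{i,t} = \alpha + \sum_{k=0}^{3} \beta_{k}\,\mathrm{Intensity}_{i,t-k}
        + \sum_{k=1}^{3} \phi_{k}\,R_{i,t-k} + \varepsilon_{i,t} \tag{5}
\end{equation}
```

Under this reading, $\beta_{0}$ corresponds to the "Tweeting day t" row and $\beta_{1}$ to $\beta_{3}$ to the three lagged rows; a reversal would show up as lagged coefficients that offset the sign of $\beta_{0}$.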
The results of this VAR model are reported in Table 4 and cover the entire sample of firms. Panel A of the table focuses on the tweeting intensity about COVID-19 news and returns over multiple days. The table shows that when there is a “shock” to tweeting intensity (an increase in the proportion of COVID-19 tweets), there is a 4.21 basis point reduction in returns on the tweeting day. The VAR model shows no obvious signs of reversal on the three subsequent days, suggesting that the impact of the “tweeting shock” is permanent (or at least is not reversed in the short run).
Table 4.
Vector Autoregression (VAR) of COVID-19 tweets and COVID-19 treatment tweets
This table reports estimates from panel vector autoregressions of daily returns and the proportion of COVID-19 (treatment) tweets. The coefficients are obtained using system GMM estimations. The dependent variables are contemporaneous returns (in basis points) with 3 lags. The independent variables are: the standardized ratio of tweets related to COVID-19 relative to all tweets about a firm on a given day (Panel A); and the standardized ratio of tweets related to COVID-19 treatment relative to all tweets about a firm on a given day (Panel B). Standard errors are shown in parentheses. ***, **, * denote statistical significance at the 1%, 5% and 10% levels respectively.
| Panel A: Returns of a firm (in basis points) given an increase in the proportion of COVID-19 tweets | |
|---|---|
| Dep. variable: returns (basis points) | |
| Tweeting day t | −4.21*** (0.68) |
| Tweeting day t-1 | 1.52** (0.73) |
| Tweeting day t-2 | 1.41* (0.74) |
| Tweeting day t-3 | 0.08 (0.73) |
| N | 758,195 |

| Panel B: Returns of a firm (in basis points) given an increase in the proportion of COVID-19 treatment tweets | |
|---|---|
| Dep. variable: returns (basis points) | |
| Tweeting day t | 1.74*** (0.61) |
| Tweeting day t-1 | 1.26** (0.64) |
| Tweeting day t-2 | 0.30 (0.67) |
| Tweeting day t-3 | −0.50 (0.67) |
| N | 758,195 |
Panel B focuses on the tweeting intensity related to COVID-19 treatment. As the panel shows, a shock to the intensity of discussions about COVID-19 treatment corresponds to a positive return on the tweeting day, with no obvious reversals on the subsequent three days. Taken together, the results of Panels A and B suggest that tweeting intensity about COVID-19 (and about its treatment) has a long-lasting relationship with return behavior.
4. Additional and robustness tests
4.1. Effect of S&P500 membership and trading exchanges
The S&P 500 firms are, by definition, central to the economy, and it is important to examine whether the main result of the study holds for the sample of firms in the S&P 500 index. In addition, it is useful to study whether the result holds for the NYSE and NASDAQ individually. This analysis is conducted in Table 5, and the results suggest that the main findings hold for S&P 500 firms, NYSE-listed firms, and NASDAQ-listed firms.
Table 5.
Daily returns and COVID-19 tweets (robustness tests)
This table documents the results of the panel regression of daily stock returns in basis points on the proportion of daily tweets about COVID-19 (first column) and COVID-19 treatment news (second column). Five different sets of models are estimated, representing various robustness tests. The proportion of daily tweets is the standardized ratio of tweets related to COVID-19 relative to all tweets about a firm on a given day (first column), and the standardized ratio of tweets related to COVID-19 treatment relative to all tweets about a firm on a given day (second column). Control variables include: Firm size, defined as the natural logarithm of the market value of equity; B/M, defined as the book-to-market ratio; Beta, defined as the CAPM beta; Leverage, defined as the proportion of leverage in the firm's capital structure; Payout ratio, defined as the firm's payout proportion; Institutional ownership, defined as the percentage of the firm's shares held by institutional owners; and Analysts, defined as the number of analysts following the firm. Standard errors are shown in parentheses. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively.
| Daily returns (bps) | S&P 500 firms: COVID-19 tweets | S&P 500 firms: COVID-19 treatment tweets | NASDAQ firms: COVID-19 tweets | NASDAQ firms: COVID-19 treatment tweets | NYSE firms: COVID-19 tweets | NYSE firms: COVID-19 treatment tweets | Bottom decile firm days: COVID-19 tweets | Bottom decile firm days: COVID-19 treatment tweets | Top decile firm days: COVID-19 tweets | Top decile firm days: COVID-19 treatment tweets |
|---|---|---|---|---|---|---|---|---|---|---|
| Proportion of daily tweets | −8.31*** (1.34) | 4.93*** (1.36) | −2.80** (1.35) | 3.44** (1.41) | −3.41*** (1.02) | 4.50*** (1.22) | 0.09 (1.56) | 1.67 (2.49) | −18.73*** (3.02) | 6.81*** (2.26) |
| Firm size | 8.05*** (1.49) | 7.77*** (1.49) | 3.51*** (1.10) | 3.47*** (1.10) | 3.96*** (0.75) | 3.76*** (0.75) | 10.15*** (1.69) | 10.15*** (1.69) | −10.10*** (1.88) | −10.61*** (1.88) |
| B/M | −0.41 (0.71) | −0.39 (0.71) | −6.38* (3.55) | −6.42* (3.55) | −0.30 (0.67) | −0.31 (0.67) | −0.12 (1.24) | −0.12 (1.24) | −2.39 (3.02) | −2.85 (3.02) |
| Beta | 17.87*** (6.02) | 19.64*** (6.01) | 4.28 (4.99) | 5.08 (4.99) | 12.75*** (3.33) | 12.97*** (3.33) | −5.86 (7.44) | −5.91 (7.44) | 10.76 (8.06) | 16.60** (8.03) |
| Leverage | 1.28 (6.21) | 0.49 (6.21) | −0.89 (6.80) | −1.94 (6.79) | −2.89 (4.58) | −3.18 (4.58) | −21.23** (10.10) | −21.20** (10.10) | 13.99 (12.62) | 10.93 (12.61) |
| Payout ratio | −10.43** (4.52) | −10.16** (4.52) | −15.12*** (5.03) | −14.66*** (5.03) | −12.56*** (3.23) | −12.39*** (3.23) | −3.19 (6.38) | −3.18 (6.38) | −34.49*** (10.59) | −33.27*** (10.59) |
| Institutional ownership | 0.18* (0.10) | 0.23** (0.10) | −0.01 (0.07) | 0.003 (0.07) | 0.04 (0.05) | 0.05 (0.05) | 0.04 (0.10) | 0.04 (0.10) | 0.34** (0.13) | 0.42*** (0.13) |
| Analysts | −0.15 (0.10) | −0.17* (0.10) | −0.21* (0.12) | −0.22* (0.12) | −0.16* (0.09) | −0.17* (0.09) | −0.92*** (0.31) | −0.92*** (0.31) | −0.01 (0.19) | −0.04 (0.19) |
| Industry F.E. | Included | Included | Included | Included | Included | Included | Included | Included | Included | Included |
| Day of the week F.E. | Included | Included | Included | Included | Included | Included | Included | Included | Included | Included |
| Adj. R2 | 0.005 | 0.004 | 0.003 | 0.003 | 0.004 | 0.004 | 0.01 | 0.01 | 0.01 | 0.005 |
| N | 87,908 | 87,908 | 103,740 | 103,740 | 222,114 | 222,114 | 38,421 | 38,421 | 38,740 | 38,740 |
4.2. Isolating the effect of tweeting
Presumably, if the results in this study are driven by tweeting, then we should expect the results to be strongest among firms/days with the most tweets, and weakest among firms/days with the fewest tweets. To test this empirically, I divide the sample into tweeting deciles and focus on the highest and lowest tweeting deciles. The results of this test are reported in the last four columns of Table 5 and demonstrate that firms that are tweeted about the most exhibit a very significant response, both statistically and economically. In contrast, firms in the lowest tweeting decile do not exhibit significant results, suggesting that the results of this study are driven primarily by firms that are tweeted about the most, and thus reinforcing the role of Twitter as a driver of these results.
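The decile split can be illustrated with a small sketch. It ranks firm-days by their tweet count and keeps the bottom and top tenth; the function name, the input layout, and the handling of ties (left to the sort order) are assumptions rather than the procedure used for Table 5.

```python
def extreme_deciles(firm_days):
    """Split (firm_day_id, tweet_count) pairs into bottom and top tweeting deciles."""
    ranked = sorted(firm_days, key=lambda item: item[1])
    size = len(ranked) // 10  # number of firm-days in one decile
    if size == 0:
        return [], []
    return ranked[:size], ranked[-size:]


# Hypothetical sample: 20 firm-days with tweet counts 1 through 20.
sample = [(f"firm_day_{i}", i) for i in range(1, 21)]
bottom, top = extreme_deciles(sample)
```

The regressions of models (1) and (2) would then be re-estimated separately on the `bottom` and `top` subsets.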
4.3. The industry effect
While COVID-19 had a significant impact on the economy, its impact may have differed across industries. Whereas some industries were significantly affected by the pandemic, others were less so. It is therefore useful to examine the “industry effect” in this study. To do so, I replicate the main analyses of Eq. (1) and Eq. (2) for each of the Fama and French 48 industries and report the results in Table 6. Each entry in Table 6 represents a regression analysis conducted on the specific industry, both for the COVID-19 index and for the COVID-19 treatment index.
Table 6.
COVID-19 tweets and industry response
This table shows the results of the panel regression of daily stock returns in basis points on the proportion of daily tweets about COVID-19, and COVID-19 treatment news. The results are for regressions conducted for each industry separately. The industries are defined according to the Fama & French 48 industry classification. The standard control variables from Table 5 are used within. Standard errors are shown in parentheses. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively.
| Industry | COVID-19 tweets | COVID-19 treatment tweets | Industry | COVID-19 tweets | COVID-19 treatment tweets |
|---|---|---|---|---|---|
| Agriculture | 5.68 (10.1) | −2.84 (16.6) | Petroleum and Natural Gas | −3.62 (8.0) | 19.11** (7.8) |
| Food products | −0.30 (4.3) | 13.29** (5.9) | | | |
| Candy & soda | −4.14 (6.2) | −27.88 (36.5) | Utilities | −2.02 (3.4) | −1.36 (5.7) |
| Beer & liquor | −12.68 (29.0) | −67.13 (132.5) | Communication | −2.18 (3.8) | 1.06 (5.8) |
| Tobacco products | −14.57 (10.7) | 25.49** (12.1) | Personal services | −9.11* (5.1) | 16.26 (23.8) |
| Recreation | 10.55 (9.6) | −45.77 (56.9) | Business services | −1.12 (2.2) | 6.77** (2.9) |
| Entertainment | −13.33** (6.4) | 4.95 (8.7) | Computers | 11.48** (5.6) | −10.78 (12.3) |
| Printing and publishing | 4.75 (14.5) | −18.46 (12.2) | Electronic equipment | −9.92** (4.0) | 8.12 (11.4) |
| Consumer goods | −7.24 (6.1) | 8.84 (7.8) | Measuring and control equipment | −1.75 (5.4) | 6.62 (6.7) |
| Apparel | −3.72 (7.7) | 33.39 (34.2) | Business supplies | 7.71 (7.7) | −36.74 (40.2) |
| Healthcare | 0.61 (5.5) | 4.10 (7.0) | Shipping containers | 3.67 (13.3) | −19.20 (26.6) |
| Medical equipment | −10.40*** (3.5) | 7.67 (5.8) | Transportation | −14.28*** (3.6) | 6.58** (3.0) |
| Pharmaceutical products | −3.54 (2.9) | 1.75 (1.6) | Wholesale | −0.92 (3.9) | 4.26 (4.3) |
| Chemicals | −6.01 (5.4) | −1.07 (7.2) | Retail | −2.74 (4.1) | 7.87 (6.8) |
| Rubber and plastic products | −17.62 (16.4) | 3.70 (9.4) | Restaurants, hotels, motels | −8.83** (4.1) | 23.19*** (8.1) |
| Textiles | −11.74 (25.2) | --- | Banking | 5.75 (6.7) | 3.86 (10.3) |
| Construction materials | −4.54 (10.1) | −10.76 (35.9) | Insurance | 0.73 (3.9) | 7.63* (4.3) |
| Construction | −1.16 (6.2) | −7.64 (8.6) | Real estate | 8.78 (9.8) | 11.93 (31.8) |
| Steel works | 9.59 (12.4) | 36.64** (18.5) | Trading | 5.50 (4.0) | −10.47 (9.1) |
| Fabricated products | 10.06 (16.2) | --- | Others | −11.13 (9.4) | 4.20 (18.0) |
| Machinery | −0.31 (4.6) | −0.59 (4.6) | | | |
| Electrical equipment | 1.39 (6.5) | −262.02* (142.7) | | | |
| Automobiles and trucks | −0.20 (6.3) | 12.23 (12.4) | | | |
| Aircraft | 17.59 (17.1) | 26.10** (12.4) | | | |
| Shipbuilding, railroad equipment | 0.22 (49.1) | --- | | | |
| Defense | 3.59 (18.3) | 0.70 (20.2) | | | |
| Precious metals | −14.25 (10.4) | −12.10 (13.6) | | | |
| Non-metallic and industrial metal mining | −14.20* (7.4) | −16.74 (33.8) | | | |
| Coal | 123.42** (53.3) | 22.49 (37.3) | | | |
The results suggest that some industries were hit particularly hard by the pandemic, as evidenced by the statistical and economic magnitude of the results.³ For example, Transportation and Restaurants, Hotels and Motels exhibit significantly negative returns when news about the pandemic is discussed, and significantly positive returns when news about the treatment program emerges on social media. These industries in particular appear to be more responsive than the rest of the economy, which is consistent with the intuition that demand for these industries is affected by the pandemic. In addition, the recovery news had a strong positive impact on oil and aircraft, which again is consistent with intuition. Interestingly, the technology industry (Computers in Table 6) behaved opposite to the rest of the economy in that it exhibited significantly positive returns when COVID-19 discussions were taking place. This observation is consistent with the notion that the technology sector made strong gains as remote work and study became the norm during the pandemic, and thus the reliance on the technology sector increased.
5. Conclusion
This paper uses approximately 45 million financial tweets about all firms listed on the three major US exchanges spanning the full calendar year 2020. Because the dataset spans the initial COVID-19 pandemic, as well as news about the vaccine and treatment program, it is possible to study the diffusion of both fear and hope within economic social networks forming on Twitter. Using the intensity of Twitter discussions about COVID-19, as well as the diffusion of this news through retweets and favorites, the relationship between COVID-19 news and daily returns is established.
The high-frequency nature of social media shows that fear about COVID-19, and hope about the treatment program diffuse within financial social networks, and that such behavior corresponds to daily returns. Having documented this general relationship, future work could focus on particular COVID-19 news events and examine their diffusion on social media. Other future work could also examine the diffusion of high-impact news and crises more generally on social media and the resulting impact on stock returns.
Footnotes
1. The sample includes firms with share codes 10, 11 and 12 on CRSP. The primary firm identifier is taken to be the ticker symbol, which matches the Twitter data collection that is also based on ticker symbol.
2. The calculation of daily returns and the tweeting for each day are matched (close of market to close of market).
3. It is important to note that many industry results do not exhibit statistical significance. This is largely due to the small-sample effect (as the sample is broken down into 48 industries) and the resulting decrease in the power of tests. However, some interesting observations can still be drawn despite this limitation.
Appendix
Words used to identify COVID-19 tweets
- Covid, covid-19, covid_19
- Corona virus, coronavirus, corona
- SARS-CoV-2, sars-2, sars2, sars
- Social distance, social distancing
- Quarantine
- Self isolate, self-isolate
- Pandemic
- Flatten the curve
Words used to identify COVID-19 treatment tweets
- Drug
- Vaccine
- Treatment
References
- Al Guindy, M., Riordan, R., 2019. The Social Internetwork and stock returns. Working paper.
- Antweiler W., Frank M. Is all that talk just noise? The information content of Internet stock message boards. Journal of Finance. 2004;59(3):1259–1293.
- Barber B., Odean T. All that glitters: the effect of attention and news on the buying behavior of individual and institutional investors. Review of Financial Studies. 2008;21:785–818.
- Chawla, N., Da, Z., Xu, J., Ye, M., 2017. Information diffusion on social media: does it affect trading, return, and liquidity? Working paper.
- Chen H., Prabuddha D., Yu H., Hwang B. Wisdom of crowds: the value of stock opinions transmitted through social media. Review of Financial Studies. 2014;27(5):1367–1403.
- Crowley, R., Huang, W., Lu, H., 2018. Discretionary dissemination on Twitter. Rotman School of Management Working Paper.
- Da Z., Engelberg J., Gao P. In search of attention. Journal of Finance. 2011;66(5):1461–1499.
- Drake M., Roulstone D., Thornock J. Investor information demand: evidence from Google searches around earnings announcements. Journal of Accounting Research. 2012;50(4):1001–1040.
- Jung M., Naughton J., Tahoun A., Wang C. Do firms strategically disseminate? Evidence from corporate use of social media. The Accounting Review. 2018;93(4):225–252.
- Pagano M., Sedunov J., Velthuis R. How did retail investors respond to the COVID-19 pandemic? The effect of Robinhood brokerage customers on market quality. Finance Research Letters. 2021 (available online).
- Tetlock P. Giving content to investor sentiment: the role of media in the stock market. Journal of Finance. 2007;62:1139–1168.
