2023 Jun 23. Online ahead of print. doi: 10.1016/j.vaccine.2023.06.058

Public perception of COVID-19 vaccines through analysis of Twitter content and users

Sameh N Saleh a,b, Samuel A McDonald b,c, Mujeeb A Basit a,b, Sanat Kumar b,d, Reuben J Arasaratnam a, Trish M Perl a, Christoph U Lehmann b,e, Richard J Medford a,b
PMCID: PMC10288320  PMID: 37385887

Abstract

Background

With the global continuation of the COVID-19 pandemic, the large-scale administration of a SARS-CoV-2 vaccine is crucial to achieve herd immunity and curtail further spread of the virus, but success is contingent on public understanding and vaccine uptake. We aim to understand public perception about vaccines for COVID-19 through the wide-scale, organic discussion on Twitter.

Methods

This cross-sectional observational study included Twitter posts matching the search criteria ((‘covid*’ OR ‘coronavirus’) AND ‘vaccine’) posted during vaccine development from February 1st through December 11th, 2020. These COVID-19 vaccine related posts were analyzed with topic modeling, sentiment and emotion analysis, and demographic inference of users to provide insight into the evolution of public attitudes throughout the study period.

Findings

We evaluated 2,287,344 English tweets from 948,666 user accounts. Individuals represented 87.9 % (n = 834,224) of user accounts. Of individuals, men (n = 560,824) outnumbered women (n = 273,400) by 2:1 and 39.5 % (n = 329,776) of individuals were ≥40 years old. Daily mean sentiment fluctuated congruent with news events, but overall trended positively. Trust, anticipation, and fear were the three most predominant emotions; while fear was the most predominant emotion early in the study period, trust outpaced fear from April 2020 onward. Fear was more prevalent in tweets by individuals (26.3 % vs. organizations 19.4 %; p < 0.001), specifically among women (28.4 % vs. males 25.4 %; p < 0.001). Multiple topics had a monthly trend towards more positive sentiment. Tweets comparing COVID-19 to the influenza vaccine had strongly negative early sentiment but improved over time.

Interpretation

This study successfully explores sentiment, emotion, topics, and user demographics to elucidate important trends in public perception about COVID-19 vaccines. While public perception trended positively over the study period, some trends, especially within certain topic and demographic clusters, are concerning for COVID-19 vaccine hesitancy. These insights can provide targets for educational interventions and opportunity for continued real-time monitoring.

Keywords: COVID-19, Vaccine, Twitter, Social media, Public opinion, COVID-19 vaccines, SARS-CoV-2, Vaccination, Vaccination refusal, Vaccine hesitancy, Natural language processing, Sentiment analysis, Topic modeling, Demographic inference

1. Introduction

With the global continuation of the COVID-19 pandemic, the large-scale administration of a SARS-CoV-2 vaccine (referred to from here on as the COVID-19 vaccine) is crucial to achieve herd immunity and curtail further spread of the virus [1]. As governments work to approve and distribute safe and effective vaccines [2], important questions regarding vaccination willingness persist: What are the attitudes and perceptions of the public [3] toward these vaccines, and how can they affect vaccine uptake [4]? Answering these questions is important for developing an education and outreach approach that reaches the vaccine penetration needed for herd immunity [5]. In 2019, prior to the COVID-19 pandemic, the World Health Organization (WHO) had identified vaccine hesitancy as one of the top 10 greatest global health threats [6]. While surveys on attitudes and perception of a COVID-19 vaccine show significant vaccine hesitancy among the general population [7], [8], [9] and health care providers [10], [11], these studies remain small in size, tend to focus on local participants, are prone to sampling error from non-probability sampling and reporting bias, and, perhaps most importantly, cannot capture real-time changes in vaccine willingness. Crowdfunding platforms may provide an indication of emerging community needs related to COVID-19 but fail to provide a continuous assessment of community sentiment [12].

Twitter, the microblogging platform, with over 187 million daily monetizable active users [13], serves as a robust medium to better understand wide-scale, organic public perception about the COVID-19 vaccine. With nearly 400 million mentions, #COVID19 was the most used hashtag on Twitter in 2020 [14]. Social media has become increasingly recognized for its rapid information dissemination (whether accurate or not) and dispersion of sentiment that quickly crosses geographic and social boundaries [15]. Analysis of social media text can inform real-time changes and evolution in population-level attitudes [16], [17]. As evident with the rise of the “infodemic” during the COVID-19 pandemic, Twitter has become a particularly useful data source in public health and healthcare-related research [18] and has been repeatedly used to study public sentiment and understand trends throughout the COVID-19 pandemic [19], [20], [21], [22], [23], [24]. Earlier in the COVID-19 pandemic, we were able to demonstrate initial public sentiment regarding the virus, its origin and spread, and measures to limit its spread [25] as well as early support for social distancing [26] on Twitter.

Social media, and specifically Twitter, has been shown to be a major factor in vaccine uptake and should be monitored and potentially used for interventions to address vaccine hesitancy [27]. Examining sentiments towards the influenza A H1N1 vaccine in 2009 showed that projected vaccination rates based on Twitter sentiment were similar to vaccination rates estimated by traditional phone surveys used by the Centers for Disease Control and Prevention (CDC) [28]. A previous study noted information exposure on Twitter may account for differences in human papillomavirus (HPV) vaccine uptake that are not accounted for by socioeconomic factors like education, insurance, or poverty [29]. Another study noted that there is a significant relationship between social media use by the public and organized action and public doubts of vaccine safety [30].

Content analysis and themes around COVID-19 vaccine hesitancy were evaluated in a subset of Canadian tweets [31] and in tweets following the announcement of successful vaccine trials [32].

Multiple studies thus far have shown the importance and value of analyzing social media to understand public sentiment and discussion about COVID-19 vaccination. One study explored the network map of worldwide Facebook users to examine the relationships and growth in anti-vaccination views and relates these theoretically to COVID-19 vaccination [33]. In a study from China, analysis of 1.75 million Weibo messages between January and October 2020 found positive trends in COVID-19 vaccine acceptance, but revealed areas of misinformation [34]. One study from Australia collected 31,100 English tweets related to COVID-19 vaccine and identified sentiment, emotion, and topics [35]. In a study in the United Kingdom and the United States, analysis of 300,000 Twitter and Facebook posts showed sentiments that correlated with national surveys [36].

We aimed to apply content and sentiment analysis on COVID-19 vaccine related tweets as well as analysis of the responsible, originating user accounts to provide insight into the evolution of public attitudes about the COVID-19 vaccines over time. We hypothesized that content analysis from the start of the pandemic will identify important themes of discussion (especially those with negative sentiment or evidence of misinformation) throughout the vaccine development process that would inform health care officials, public health agencies, and policy makers and could be used to aid in the outreach and educational interventions for the COVID-19 vaccine to the general public.

2. Methods

2.1. Data source

We performed a cross-sectional observational study of English-language tweets obtained by matching the keywords ((‘covid*’ OR ‘coronavirus’) AND ‘vaccine’) from February 1, 2020 to December 11th, 2020. December 11th was chosen as an end date to mark the United States Food and Drug Administration’s first emergency use authorization of a COVID-19 vaccine [37]. We used the snscrape library [38] to obtain (“scrape”) tweets identified through Twitter's advanced search tool, which returns a relevant sample of tweets. We manually reviewed a random subsample of 1,000 tweets and verified the tweets’ relevance to the topic of COVID-19 vaccination. We extracted 21 and 20 variables related to the tweets and to the posting user accounts, respectively (Supplemental Tables S1 and S2).
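The boolean search criteria above can be illustrated with a minimal Python sketch. This is not the authors' collection code (they relied on Twitter's advanced search through snscrape); the regular expressions and the function name `matches_search_criteria` are assumptions used only to show how the query logic composes. In particular, this sketch assumes case-insensitive matching and treats 'vaccine' as an exact token, which may differ from how Twitter's search handled plurals.

```python
import re

# Assumed reading of the query: ('covid*' OR 'coronavirus') AND 'vaccine'.
# 'covid*' is treated as a prefix wildcard; matching is case-insensitive.
COVID_PATTERN = re.compile(r"\b(covid\w*|coronavirus)\b", re.IGNORECASE)
VACCINE_PATTERN = re.compile(r"\bvaccine\b", re.IGNORECASE)


def matches_search_criteria(text: str) -> bool:
    """Return True only if both halves of the boolean query are satisfied."""
    has_covid_term = COVID_PATTERN.search(text) is not None
    has_vaccine_term = VACCINE_PATTERN.search(text) is not None
    return has_covid_term and has_vaccine_term


if __name__ == "__main__":
    examples = [
        "Pfizer's Covid vaccine is days away from approval",
        "Flu vaccine season is here",
        "covid cases are rising again",
    ]
    for example in examples:
        print(matches_search_criteria(example), "-", example)
```

A filter of this kind is also a convenient way to spot-check a manually reviewed subsample for off-topic tweets.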

2.2. Data processing

We measured total daily tweets and completed descriptive statistics for collected variables. We applied natural language processing techniques to process, analyze, and visualize the text from tweets. To preprocess the tweet text for analysis (“cleaning”), we removed hyperlinks, user tags, and words of little analytical value. We also returned words to their root form and segmented text into one- and two-word terms. Further details are discussed in Supplemental Appendix A. We visualized the top 300 processed terms as a word cloud with larger font size representing greater term frequency. All analyses were conducted using Python, version 3.8.2 (Python Software foundation). Institutional review board approval was not required because this study used only publicly available data.
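The cleaning steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the pipeline documented in Supplemental Appendix A: the stop-word list is a tiny hypothetical one, the reduction of words to their root form is omitted, and the function names are invented for this example.

```python
import re
from typing import List

# Tiny illustrative stop-word list (assumption; not the study's list).
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "for", "your"}


def clean_tweet(text: str) -> List[str]:
    """Remove hyperlinks and user tags, lowercase, and drop stop words."""
    text = re.sub(r"https?://\S+", " ", text)  # strip hyperlinks
    text = re.sub(r"@\w+", " ", text)          # strip user tags
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def unigrams_and_bigrams(tokens: List[str]) -> List[str]:
    """Return one-word terms followed by adjacent two-word terms."""
    bigrams = [f"{first}_{second}" for first, second in zip(tokens, tokens[1:])]
    return tokens + bigrams


if __name__ == "__main__":
    tokens = clean_tweet("@user Wear your mask! https://t.co/abc Vaccine is coming")
    print(unigrams_and_bigrams(tokens))
```

Joining bigrams with an underscore mirrors terms such as "wear_mask" that appear among the topic words later in the paper.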

2.3. Sentiment analysis

Sentiment analysis describes the affect of a piece of text, that is, the intrinsic attractiveness or aversiveness of a subject such as events, objects, or situations [39]. We used the Valence Aware Dictionary and sEntiment Reasoner (VADER) [40] to analyze the sentiment polarity of a tweet. VADER, a lexicon and rule-based tool, was designed specifically for sentiment analysis of social media text. In addition to regular words, VADER leverages punctuation, emoticons, emojis, sentiment-laden slang words and acronyms, as well as syntax and capitalization schemas to inform labeling of a positive, neutral, and negative score for each document. These three scores were combined to form a normalized, weighted composite score. Overall positive (≥0.05), neutral (−0.05 to 0.05), and negative (≤−0.05) sentiments are defined at standardized composite score thresholds. When sentiment has been aggregated, we refer to an average sentiment of ≥0.05 as positive and ≤−0.05 as negative. Trends in sentiment over time were determined using the Mann-Kendall trend test. We also manually labeled sentiment for a random sample of 1,000 tweets to evaluate VADER performance. VADER had a weighted average F1 score of 0.77. The negative sentiment had precision of 0.79 and recall of 0.65. We used the TextBlob library [41] to label each tweet from a range of 0 (objective) to 1 (subjective) where objective tweets relay factual information and subjective tweets typically communicate an opinion or belief. Finally, we used the NRCLex library to label words within each tweet with corresponding emotional affects (i.e., Plutchik’s wheel of emotions, which includes anger, anticipation, fear, disgust, joy, sadness, surprise, and trust) based on the National Research Council Canada (NRC) affect lexicon [42]. This wheel of emotions was used since its emotions can be naturally paired into opposites (i.e., joy-sadness, anger-fear, trust-disgust, anticipation-surprise). Based on these labels, we identified tweets with their primary emotion and visualized how the proportion of eligible tweets (i.e., those with an identified primary emotion) with a particular primary emotion changed over time.
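The composite score thresholds and the notion of a primary emotion can be illustrated with a short Python sketch. It assumes the composite scores and per-word emotion labels have already been produced by tools such as VADER and NRCLex; the function names are hypothetical, and the rule that a tie between the top emotions yields no primary emotion is an assumption made for this example rather than a documented detail of the study.

```python
from collections import Counter
from statistics import mean
from typing import Iterable, List, Optional

POSITIVE_THRESHOLD = 0.05   # composite score at or above this is positive
NEGATIVE_THRESHOLD = -0.05  # composite score at or below this is negative


def label_sentiment(composite: float) -> str:
    """Map a composite score to positive, neutral, or negative."""
    if composite >= POSITIVE_THRESHOLD:
        return "positive"
    if composite <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def label_mean_sentiment(composites: Iterable[float]) -> str:
    """Apply the same thresholds to an aggregated (mean) score."""
    return label_sentiment(mean(composites))


def primary_emotion(emotion_labels: List[str]) -> Optional[str]:
    """Return the most frequent emotion label, or None when there is no
    label or the top two emotions are tied (assumed tie rule)."""
    ranked = Counter(emotion_labels).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    print(label_sentiment(0.42))
    print(label_mean_sentiment([0.6, -0.2, 0.1]))
    print(primary_emotion(["fear", "trust", "fear"]))
```

Under this sketch, tweets for which `primary_emotion` returns `None` would simply be left out of the "eligible tweets" denominator.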

2.4. Topic modeling

After cleaning the tweets to distill analyzable text as described in Section 2.2, we applied a machine learning algorithm called Correlation Explanation (CorEx) [43] to identify clusters of topics for all tweets. CorEx identifies the most informative topics based on a set of latent factors that best explain the correlations in the data, in turn maximizing the total correlation or the multivariate mutual information [44]. We used CorEx as opposed to a generative model like Latent Dirichlet Allocation to avoid making assumptions and specifications of hyperparameters. Each document (in our case, each tweet) may include multiple topics. We iterated through 2–20 clusters by analyzing the distribution of topic coherence (TC) for each topic and evaluating how much each additional topic enhances the overall TC. The number of topics was increased until further topics no longer made a substantial contribution to the overall TC. This process is analogous to selecting a cutoff eigenvalue when conducting topic modeling using Latent Semantic Analysis (LSA). That resulted in 15 topics for the topic model. We presented the top 20 words for each topic cluster to author CUL without prior access to individual tweets from the dataset to manually label a theme for each topic. The manually assigned topic labels were reviewed by two other authors, SNS and RJM, with unanimous agreement. We visualized the monthly distribution of topics over time and utilized a heat map to visualize how the mean sentiment of each topic changed per month.
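The stopping rule for the number of topics can be sketched as a simple marginal-gain check. This is only an illustration: the paper describes a judgment that further topics "no longer made a substantial contribution", so the 5 % relative-gain cutoff, the function name `choose_num_topics`, and the TC values below are all hypothetical.

```python
from typing import Dict


def choose_num_topics(overall_tc_by_k: Dict[int, float],
                      min_relative_gain: float = 0.05) -> int:
    """Pick the largest topic count whose added topic still raises the
    overall TC by at least `min_relative_gain` of the previous value."""
    topic_counts = sorted(overall_tc_by_k)
    chosen = topic_counts[0]
    for previous_k, current_k in zip(topic_counts, topic_counts[1:]):
        gain = overall_tc_by_k[current_k] - overall_tc_by_k[previous_k]
        if gain < min_relative_gain * overall_tc_by_k[previous_k]:
            break  # the extra topic adds too little; stop growing the model
        chosen = current_k
    return chosen


if __name__ == "__main__":
    # Hypothetical overall TC values for models with 2 to 6 topics.
    hypothetical_tc = {2: 1.0, 3: 1.5, 4: 1.9, 5: 1.95, 6: 1.96}
    print(choose_num_topics(hypothetical_tc))
```

In practice the overall TC for each candidate topic count would come from fitting a separate topic model, which this sketch does not do.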

2.5. User exploration and demographic inference

Given that each tweet has one authoring account, we identified all unique user accounts in our dataset and provided descriptive statistics with metadata available for the users, including the launch date of the account, followers (accounts following them), follows (accounts they follow), lifetime posts, likes, and media shared, as well as profile pictures, description information, and verified status (badge to indicate an account of public interest that has been verified to be authentic). To better understand demographic differences, we applied a previously validated deep learning system through the m3inference library [45] to infer whether the account user is an individual or an organization based on multimodal input that includes username, display name, description, and profile picture image. If the account is labeled as an individual, the gender (female or male) and age group (≤18 years old, 19–29 years old, 30–39 years old, and ≥40 years old) are then labeled. Each label produced by the algorithm has an accompanying probability. The automatic demographic detection was designed specifically for Twitter profiles for health-related cohort studies [46]. We provided summary statistics for the demographics identified and stratified sentiment and subjectivity analyses by the different demographic groups to evaluate for differences. We used Mann-Whitney U and χ² tests where appropriate to determine significance. Alpha level of significance was set a priori at 0.05 and all hypothesis testing was two-sided. We did not adjust for multiple comparisons as this was an exploratory study and should be interpreted as hypothesis-generating.
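As an illustration of the kind of χ² comparison described above, the following sketch computes a Pearson χ² statistic and its p-value for a 2×2 table using only the Python standard library. The function name and the counts in the usage example are hypothetical, no continuity correction is applied, and the study itself may have used a statistics library with different defaults.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is a squared standard normal variable,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: rows are two groups, columns are
    # "primary emotion is fear" versus "primary emotion is not fear".
    statistic, p_value = chi_square_2x2(10, 20, 20, 10)
    print(round(statistic, 3), round(p_value, 4))
```

With samples as large as those in this study, even small differences in proportions produce very small p-values, which is one reason effect sizes are reported alongside them.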

3. Results

A total of 2,356,285 tweets were extracted for the study period, of which 2,287,344 tweets were English-only and included for evaluation. The tweets were generated by 948,666 accounts which had been active for an average of 6.9 years (interquartile range [IQR], 2.6–10.0) with a median of 267 (IQR, 55–1,100) followers and 3,600 (519–15,572) lifetime likes. Only 2.9 % (n = 27,443) of accounts were verified (Table 1). Of the tweets analyzed, 54 % (n = 1,235,575) had a link, 40.1 % (n = 916,585) mentioned other Twitter accounts, 18.1 % (n = 414,173) used hashtags, and 11.9 % (n = 273,278) contained media like an image or video. In terms of engagement, 41.3 % (n = 943,639), 24.0 % (n = 548,863), and 20.7 % (n = 473,204) of tweets received likes, replies, and retweets, respectively (Table 1). Individuals (vs. organizations) represented 87.9 % (n = 834,224) of user accounts. Of individuals, men (n = 560,824) outnumbered women (n = 273,400) by 2:1 and 39.5 % (n = 329,776) of individuals were ≥40 years old (Table 1).

Table 1.

Tweet and user account characteristics are shown on top and inferred user demographics are shown on bottom.

Tweets (n = 2,287,344)            n (%)
Has link(s)                       1,235,575 (54.0)
Mentions user(s)                  916,585 (40.1)
Has hashtag(s)                    414,173 (18.1)
Has media                         273,278 (11.9)
Is quoted tweet                   133,404 (5.8)
Has like                          943,639 (41.3)
Has reply                         548,863 (24.0)
Has retweet                       473,204 (20.7)
Has quoted tweet                  183,982 (8.0)
Twitter source
  Web App                         686,296 (30.0)
  iPhone/iPad                     660,382 (28.9)
  Android                         432,862 (18.9)
  TweetDeck                       60,845 (2.7)

User Accounts (N = 948,666)       Median (IQR) or n (%)
Years account active              6.9 (2.6–10.0)
Followers                         267 (55–1,100)
Following                         407 (137–1,069)
Lifetime statuses                 4,605 (1,027–16,365)
Lifetime likes                    3,600 (519–15,572)
Media shared                      205 (36–875)
Public lists, member              2 (0–12)
Contains description              800,619 (84.4)
Location listed                   659,720 (69.5)
Contains profile picture          902,666 (95.2)
Contains banner picture           721,542 (76.1)
Contains link in profile          303,761 (32.0)
Verified account                  27,443 (2.9)



User Demographics (N = 948,666)

Category               Group               N (%)             Probability, median (IQR)
Entity                 Individual          834,224 (87.9)    0.999 (0.997–0.999)
                       Organization        114,443 (12.1)    0.867 (0.727–0.999)
Sex (of individuals)   Female              273,400 (32.8)    0.992 (0.949–0.998)
                       Male                560,824 (67.2)    0.996 (0.980–0.999)
Age (of individuals)   <40 years old       504,448 (60.5)    0.972 (0.896–0.994)
                         ≤18 years old     109,327 (13.1)    0.660 (0.528–0.821)
                         19–29 years old   225,360 (27.0)    0.611 (0.504–0.754)
                         30–39 years old   169,761 (20.4)    0.765 (0.568–0.936)
                       ≥40 years old       329,776 (39.5)    0.950 (0.734–0.996)

Daily tweets abruptly spiked to 51,176 tweets on November 9th, the day Pfizer and BioNTech announced their vaccine’s effectiveness [47] (up from 4,052 tweets on November 8th) and peaked on December 8th with 55,779 tweets. Tweets from November 1st to the end of the study period on December 11th accounted for 39.8 % (n = 910,593) of all tweets (Fig. S1). The corpus of tweets contained over 62.5 million words and 416 million characters. The ten most commonly tweeted terms and their frequencies were as follows: “people” (228,482), “trial” (206,310), “take” (181,598), “flu” (159,043), “trump” (149,042), “first” (147,103), “make” (142,242), “test” (131,719), “need” (126,846), and “one” (122,966). Fig. 1 displays a word cloud of the top 300 words with larger font size concordant with frequency.

Fig. 1.

Word cloud of top 300 words related to COVID-19 and vaccine. Larger fonts represent higher frequency in the corpus after preprocessing text.

Daily mean sentiment of tweets fluctuated congruent with news events, but overall trended positively throughout the study period (Mann-Kendall statistic = 10,122; tau = 0.218; p < 0.001) (Fig. 2a). Several days in early to mid-March and October 13th saw particularly negative sentiment, coinciding with the declaration of a pandemic by the WHO and Johnson & Johnson’s halting of its vaccine trial on October 12th [48], respectively. The highest daily mean positive sentiment coincided with Moderna’s July 14th announcement of a safe vaccine with “robust immune response” in an early trial [49] and Pfizer’s November 9th announcement of over 90 % effectiveness of its vaccine [47]. Tweets from accounts representing organizations had more positive sentiment than tweets from individuals (median weekly difference, 0.118; IQR, 0.091–0.144), but there was no significant difference in polarity by age (median weekly difference, 0.006; IQR, −0.011 to 0.019) and only a minimal positive difference for males (median weekly difference, 0.030; IQR, 0.012–0.044) (Fig. 2b–d).
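For readers unfamiliar with the Mann-Kendall statistic reported above, the sketch below shows how the statistic S and a simple Kendall's tau are computed from a time-ordered series. It is a simplified illustration: it computes tau-a, which makes no adjustment for tied values, and it omits the variance calculation behind the p-value, so it may not be the exact variant the authors used. The function names and sample values are hypothetical.

```python
from typing import Sequence


def mann_kendall_s(values: Sequence[float]) -> int:
    """Sum the signs of all later-minus-earlier differences in the series."""
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            difference = values[j] - values[i]
            s += (difference > 0) - (difference < 0)  # +1, 0, or -1
    return s


def kendall_tau_a(values: Sequence[float]) -> float:
    """Scale S by the number of pairs; ranges from -1 to 1 (ties ignored)."""
    n = len(values)
    return mann_kendall_s(values) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical daily mean sentiment values, in time order.
    daily_means = [0.01, 0.03, 0.02, 0.04]
    print(mann_kendall_s(daily_means), kendall_tau_a(daily_means))
```

A positive S (and tau) indicates that later values tend to exceed earlier ones, which is the sense in which sentiment "trended positively".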

Fig. 2.

(a) Mean sentiment polarity shown by day (as points) and by week (as a dashed line). Each tweet was labeled as primarily negative (−1), neutral (0), or positive (1). (b) Mean weekly polarity stratified by individual versus organization. (c) Mean weekly polarity stratified by gender for individual accounts. (d) Mean weekly polarity stratified by age (≥40 versus <40 years) for individual accounts.

The sentiment trends were reflected by the primary emotions identified in the COVID-19 vaccine tweets by month (Fig. 3a). Fear started as the most prevalent primary emotion in nearly 40 % of eligible tweets early on but decreased to under 20 % by the end of the study period. Conversely, trust increased from below 20 % to around 40 % and outpaced fear in April 2020, remaining the most prevalent primary emotion thereafter. Anticipation was the second most prevalent primary emotion for most of the study period, steadily ranging from 25 % to 30 %. All other emotions were consistently expressed as the predominant emotion in <10 % of eligible tweets. Individuals had an increased predominance of fear (26.3 % vs. 19.4 %; p < 0.001) and decreased predominance of anticipation (25.9 % vs. 33.6 %; p < 0.001) and trust (32.5 % vs. 35.2 %; p < 0.001) compared with organizations. For individual accounts, women had more fear (28.4 % vs. 25.4 %; p < 0.001) and less anticipation (23.8 % vs. 26.8 %; p < 0.001) than men, but no significant difference in trust (32.3 % vs. 32.5 %; p = 0.11). Those <40 years old had more fear (26.6 % vs. 26.0 %; p < 0.001) and less trust (32.0 % vs. 33.0 %; p < 0.001) than those ≥40 years old (Fig. 3b–d). Tweets throughout the year tended to be more objective (where 0 is fully objective and 1 is fully subjective) with limited daily variation (overall mean 0.359; std 0.028) (Fig. S2).

Fig. 3.

Percent of tweets with primary emotion per month (a) overall, (b) stratified by individual versus organization, (c) stratified by gender for individual accounts, and (d) stratified by age (≥40 versus <40 years) for individual accounts. Only tweets with a predominant primary emotion (n = 1,489,027) are included.

Table 2 shows each topic label with its key words and a sample tweet. Fig. S3 shows the 15 topics obtained from topic modeling with the proportion of tweets per month that contained each topic. The dominant topic (topic 15) focused on mask use and public reactions. Discussions about misinformation and conspiracy theories (topic 13) comprised the next most common topic, peaking in May and staying relatively consistent from July through December. Tweets related to the Indian and Russian governments’ decision on producing and using the Sputnik V vaccine (topic 2) spiked in August. Discussion of Emergency Use Authorizations (EUA) and vaccine approvals (topic 12) did not spike until November 2020 with the approval of the Pfizer and Moderna vaccines. Several topics had strong mean positive sentiments throughout the study period, including discussions of biotechnology companies and the stock market (topic 3), vaccination firsts (topic 4), vaccine development (topic 6), and EUAs (topic 12). Other topics showed a progressive trend from positive to negative throughout the study period including discussion of US politics and the election (topic 1), the FDA and CDC (topic 14), and mask use and public reactions (topic 15). Tweets comparing COVID-19 and its vaccine to influenza and the influenza vaccine (topic 5) had strongly negative early sentiment but improved over time (Fig. 4). Compared to the rest of individual users (n = 810,318), those exhibiting negative sentiment posting about topic 5 (n = 51,686) were proportionally more likely to be ≥40 years old (45.1 % vs. 39.6 %; p < 0.001) and female (34.0 % vs. 32.7 %; p < 0.001). The only other topic with persistently negative sentiment was discussion of misinformation and conspiracy theories (topic 13). Those exhibiting negative sentiment posting about topic 13 (n = 166,819) were more likely than other user accounts (n = 741,388) to be individuals (90.9 % vs. 87.3 %; p < 0.001) and of those individual accounts, more likely to be female (34.4 % vs. 32.4 %; p < 0.001).

Table 2.

Topic clusters identified by topic modeling. Words contributing to the model are shown in decreasing order of weighting. The topics are labeled manually based on these words.

Topic 15: Mask use and general reactions (811,844 tweets)
Words contributing to topic model: people, mask, even, dont, would, take, know, die, death, need, one, still, many, kill, risk, never, work, way, yet, wear_mask
Representative tweet: “Pretty much what it boils down to, at this point. Ignorance, arrogance, and stupidity will end up killing LOTS of people this year, I'm afraid! Be SMART. WEAR your mask. Wash your hands. Hold off on large gatherings until a safe, effective Covid-19 vaccine arrives. [link]”

Topic 13: Conspiracies and misinformation (557,301 tweets)
Words contributing to topic model: want, think, fake, make, try, believe, conspiracy, bill_gate, money, really, gonna, real, force, shit, god, anything, anyone, hoax, black, put
Representative tweet: “@WhiteHouse Also, isnt this a RNA vaccine? Super experimental albeit dangerous, could mean with DNA as well. Human Guinea pigs. Wouldn't be surprised if the vaccine harms more then the COVID did.”

Topic 7: Impact of lockdowns on school, work, and economy (406,227 tweets)
Words contributing to topic model: wait, year, lockdown, open, month, next, life, time, long, end, last, come, back, next year, week, school, ago, day, economy, away
Representative tweet: “@[tag] @[tag] @[tag] @[tag] And even with a vaccine they will continue with the lockdowns, the social distance and the fear mongering… If not for the Covid, they will find something…”

Topic 10: Vaccine mechanisms and immunity (235,625 tweets)
Words contributing to topic model: virus, mrna, immunity, antibody, prevent, infection, disease, spread, strain, protein, herd_immunity, mutate, symptom, immune_system, sarscov, prevent infection, mutation, cell, cause, infect
Representative tweet: “If tests show one already had COVID-19 so one has antibodies and is now immune, CDC currently counts that as one infected and positive for COVID-19. After a vaccine, will every person vaccinated who therefore grows antibodies, be considered positive & infected? @realDonaldTrump”

Topic 4: First vaccinations and recipients (232,613 tweets)
Words contributing to topic model: first, world, around, world first, become, first test, receive, country, first line, first person, world news, first country, person receive, world leader, government vote, make sure_pass, yearold_woman, day government, first dos, first world
Representative tweet: “Thank the lord this is the beginning of the end: First patient receives Pfizer Covid-19 vaccine [link]”

Topic 12: Emergency use authorizations and approvals (213,036 tweets)
Words contributing to topic model: pfizer, moderna, approval, effective, pfizer_biontech, emergency_use, data, biontech, authorization, receive, regulator, next_week, approve pfizer, effective prevent, pfizer ceo, show effective, moderna effective, approve, data show, early data
Representative tweet: “Pfizer's Covid vaccine is days away from approval after data reveals it is 95 % effective [link]”

Topic 1: US politics and election (212,597 tweets)
Words contributing to topic model: trump, biden, realdonaldtrump, president, election, operation_warp, american, credit, speed, lie, take credit, joebiden, gop, democrat, potus, win, vote, joe_biden, admin, america
Representative tweet: “@realDonaldTrump If you want to take partial credit for the Covid-19 vaccine fine. You still LOST the election. In Georgia for example you are behind there by 12 k votes. The recount wont change the outcome. I look forward to your predictable reply and the end of your regime.”

Topic 3: Stock market and pharma/biotech companies (205,557 tweets)
Words contributing to topic model: market, stock, news, company, good news, biotech, drug company, pharma, price, billion, drug, surge, late, update, rise, break news, positive news, investor, pharma company, announce
Representative tweet: “Markets are supported by both the cumulative upside surprises to the economy since the end of the recession and the apparently faster-than-expected progress toward a COVID-19 vaccine. [link]”

Topic 8: Clinical trials and participants (205,468 tweets)
Words contributing to topic model: trial, clinical, human trial, phase, human, volunteer, participant, oxford, begin, phase clinical, show, trial participant, result, volunteer trial, ahead_large trial, ahead_large, show_promise, immune_response, test, number
Representative tweet: “Coronavirus Vaccine Update | Oxford’s COVID-19 vaccine trial in Brazil begins: Scientists say coronavirus jab may not work for older adults [link]”

Topic 6: Vaccine development and supporters (204,083 tweets)
Words contributing to topic model: research, development, global, effort, develop, fund, researcher, global effort, join, effort develop, access, help, target, hacker, treatment, support, dolly_parton, research development, accelerate, collaboration
Representative tweet: “As the world continues to feel the impact of COVID-19, the biopharmaceutical industry is working around the clock to identify and develop safe and effective vaccines to prevent infection, while also researching and developing new therapies to treat those infected with the virus.”

Topic 2: Russian response and global partners (187,369 tweets)
Words contributing to topic model: russia, india, via, sputnik, russian, china, putin, serum_institute, indian, covaxin, chinese, bharat_biotech, icmr, hacker_target, time india, via nbcnews, russia sputnik, indias_serum, narendramodi, possible
Representative tweet: “A Sputnik moment, president #Putin has announced that #Russia is the first country in the world to register a #Covid_19 vaccine. 10 s of countries already requested it [link]”

Topic 14: FDA and CDC (179,396 tweets)
Words contributing to topic model: trump, fda, failure, administration, trump administration, fda approval, food_drug, food_drug administration, fda approve, cdc, trump admin, white_house, trump claim, president_donald trump, president_donald, cuomo, trump supporter, want, take, exist_sustainable
Representative tweet: “@CDCgov if you try and push through an unproven vaccine because of Trump’s desperation to recover from his abysmal handling of Covid-19…good luck. No one I know myself included will be getting vaccinated.”

Topic 9: Philanthropy and public health (156,183 tweets)
Words contributing to topic model: health, public, mandatory, public health, health official, health care, public trust, official, gavi_sdg, cdc_gatesfoundation, read_billgates cdc_gatesfoundation, gavi_sdg vaccination, make mandatory, read_billgates, health expert, care_worker, health minister, health care_worker, clinton, obama_bush
Representative tweet: “@[tag] This presents a problem and crashes into the argument, should covid vaccines be mandated. I initially thought that it will need more then encouragement and common sense from the public but these vaccine deniers are going to deprive people of protection through fear. Arrest them.”

Topic 5: Comparison to influenza (151,177 tweets)
Words contributing to topic model: flu, shot, influenza, flu shot, every year, season, seasonal, die flu, every, flu death, year flu, kill, jab, take flu, first shot, virus, people, side_effect shot, compare, via
Representative tweet: “Ok so I’m usually not super crunchy about everything but I’ve been hospitalized 2x after getting the flu shot bc of how badly I got the flu within months so I was told not to get the shot by my drs. what does that mean for COVID’s vaccine? Like what if I react the same?”

Topic 11: Safety and side effects from trials (137,445 tweets)
Words contributing to topic model: astrazeneca, safety, effect, johnson johnson, pause, study, unexplained_illness, astrazeneca trial, johnson_pause, long_term effect, oxford_university, pause trial, safety efficacy, efficacy, safety concern, resume, put hold, astrazeneca study, illness, side
Representative tweet: “AstraZeneca COVID-19 vaccine study put on hold due to suspected adverse reaction in UK participant [link]”

Fig. 4.

Heat map showing mean sentiment by month for each topic. Note that a tweet can include multiple topics.

4. Discussion

Twitter is a rich medium that can serve as both thermometer and thermostat for the COVID-19 vaccine, which is a crucial public health strategy to combat the pandemic. It can provide insight into public perception of a COVID-19 vaccine, but can also be used to understand and combat knowledge deficits and vaccine hesitancy through information and education [50], [51]. A majority (59 %) of US Twitter users regularly obtain news on Twitter, proportionally more than any other social media platform [52]. We analyzed nearly 2.3 million COVID-19 vaccine-related tweets in 2020, creating a dataset that exceeded the scope of related studies [53], [54] and is the largest study to date of social media posts about COVID-19 vaccination at the time of this manuscript. We evaluated public perception as the COVID-19 vaccine development went from speculative to theoretical to actual. Generally, we believe that Twitter users favored the vaccine during its development phase. Tweets with positive sentiment were more prominent than tweets with negative sentiment and trust emerged as the predominant emotion. However, there were periods of time (usually linked to events in the public news cycle), demographic subgroups, and topic clusters that had more prominent negative sentiment and emotion.

Organizational accounts were significantly more positive than individual accounts, exhibiting more anticipation and trust and less fear. For individuals, the gender and age distribution in our dataset parallels the reported proportional share of Twitter’s global advertising audience [55]. Women expressed more fear and less anticipation than men, but by the end of the study period, that gap had narrowed. Those <40 years old tended to express less trust and more fear than those ≥40 years old, but the margin was small.

The topic most strongly associated with negative, albeit improving, sentiment was the discussion of the influenza vaccine in combination with the COVID-19 vaccine. These tweets often compared deaths and illness from both diseases or expressed general mistrust of both vaccines. Examples include: “@[user] Only time I've ever had the flu is the 2 times I got flu shots. It was not a minor case either it was the full blown flu. I refuse to get another flu shot and I also will refuse the covid vaccine” and “Flu Virus equals Flu Vaccine. Coronavirus Equals Covid-19 Vaccine Now if the Flu shot gives you the flu, the Covid-19 Shot will give you Coronavirus am I in the general area of Right??”. Notably, users exhibiting negative sentiment about this topic were more likely to be ≥40 years old and female. This focused topic-demographic cluster, for example, exposes a direct opportunity for intervention to correct misinformation and mitigate vaccine hesitancy. Conversely, the emergency use authorizations of the vaccine and reports of the first vaccine recipients, which arose later in the study period, were celebrated with positive sentiment and mirrored the overall increasing trend in positive sentiment and trust.

While the percentage of the population that must be immune to achieve herd immunity (either through innate or acquired immunity) for COVID-19 is not yet known, estimates have increased from 60–70 % to possibly closer to 75–85 % [56], [57]. Achieving herd immunity through infection would come at an untenable cost [58], making the immunization effort critical to protect lives. Therefore, it was concerning to us that fear was a common and persistent emotion in COVID-19 vaccine tweets. While the proportion of ‘trust’ tweets outpaced ‘fear’ tweets relatively early in the study period, approximately 20 % of eligible tweets still expressed fear in association with the vaccine. If this fear translates into refusal to become immunized, we are likely to see not only a prolonged pandemic, but also further increases in COVID-19 related deaths as concerning virus variants take hold. As more people receive the vaccine, we anticipate that sentiment will become more positive over time with increased trust and vaccine uptake, but this will need to be studied consistently, especially in the context of newly approved vaccines and news events.
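
For context on why such estimates shift, a common textbook simplification ties the herd immunity threshold to the basic reproduction number R0 as 1 − 1/R0. The sketch below assumes a homogeneously mixing population and fully protective immunity; the formula and the R0 values are illustrative assumptions, not figures taken from this study.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Classical threshold 1 - 1/R0 for a homogeneously mixing population."""
    if r0 <= 1:
        return 0.0  # an outbreak with R0 <= 1 fades without population immunity
    return 1.0 - 1.0 / r0

# Illustrative R0 values only; they are not estimates from this study.
for r0 in (2.5, 3.0, 4.0, 6.0):
    print(f"R0 = {r0:.1f} -> threshold = {herd_immunity_threshold(r0):.0%}")
```

Under this simplification, a more transmissible virus (higher R0) requires a larger immune fraction, which is one way that threshold estimates can rise over time.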

4.1. Limitations

Our study was limited by several factors. First, we recognize that our dataset is not all-inclusive of tweets discussing the COVID-19 vaccine. Our tweet search criterion was narrow to ensure accuracy of captured tweets for this initial work and did not include terms such as “shot(s)”, “immunization” and “inoculation.” Moreover, despite the volume of tweets analyzed, Twitter’s advanced search tool returns only a relevant sample of all tweets. Second, we used existing tools to analyze the sentiment and emotion of tweets; these tools are not specific to health care topics, which could have skewed our analysis. Third, tweets related to COVID-19 vaccination could have been flagged or removed by Twitter for containing misinformation, but we were not privy to that context and could not determine how it affected our sample. Finally, since we targeted only tweets in English and were unable to determine the geographic location of users, we are limited in drawing conclusions about specific countries or about countries where English is not the predominant language.
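
The first limitation can be illustrated with a small filter that mimics the search criteria ((‘covid*’ OR ‘coronavirus’) AND ‘vaccine’). This is a hedged sketch: the regular expressions and the function name `matches_criteria` are assumptions made for illustration, and Twitter's own tokenization and matching rules are not modeled.

```python
import re

# Sketch of ((covid* OR coronavirus) AND vaccine), case-insensitive.
# Twitter's actual search tokenization and stemming are not modeled here.
DISEASE = re.compile(r"\b(covid\w*|coronavirus)\b", re.IGNORECASE)
VACCINE = re.compile(r"\bvaccine\b", re.IGNORECASE)

def matches_criteria(text: str) -> bool:
    """True when the text has a disease term and the word 'vaccine'."""
    return bool(DISEASE.search(text)) and bool(VACCINE.search(text))

# Hypothetical example posts, written for this illustration.
examples = [
    "New COVID-19 vaccine trial results",
    "Getting my covid shot tomorrow",
    "coronavirus immunization starts Monday",
]
for text in examples:
    print(matches_criteria(text), "|", text)
```

Only the first example passes the filter; the posts that say “shot” or “immunization” instead of “vaccine” are excluded, which is the kind of omission described above.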

5. Conclusions

Leveraging 2.3 million COVID-19 vaccine-related tweets from 2020, we explored sentiment, emotion, topics, and user demographics to elucidate important trends in public perception of the COVID-19 vaccine. Tweets were positive in sentiment overall, with growing trust. However, fear remained a dominant emotion, raising concern about willingness to receive the COVID-19 vaccine, and subsets of negative sentiment emerged. Comparison to influenza and the influenza vaccine, as well as discussion of conspiracy theories, were important topics with negative sentiment and showed some demographic differences that could allow for informed intervention. Future work will leverage these natural language processing tools to engage in targeted messaging based on user interests and emotions.

Ethics approval and consent to participate

The University of Texas Southwestern Human Research Protection Program Policies, Procedures, and Guidance did not require institutional review board approval as all data were publicly available.

Authors’ contributions

Study concept and methodology/design: SNS, CUL, RJM; Data curation: SNS, SK; Analysis: SNS, SK, RJM; Interpretation of data: SNS, SK, CUL, RJM; Manuscript preparation: all authors; Manuscript reviewing and editing: all authors. All authors read and approved the final manuscript. SNS, SK, and RJM have accessed and verified the underlying data.

Funding

None.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Christoph Lehmann reports a financial relationship with Celanese Corporation and Colfax Corp.

Acknowledgements

Not applicable.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.vaccine.2023.06.058.

Appendix A. Supplementary material

The following are the Supplementary data to this article:

Supplementary data 1
mmc1.docx (389.6KB, docx)

Data availability

The code that supports the findings of this study is available upon request from the corresponding author.

References

  • 1. Sridhar D., Gurdasani D. Herd immunity by infection is not an option. Science. 2021;371(6526):230–231. doi: 10.1126/science.abf7921.
  • 2. Forni G., Mantovani A., COVID-19 Commission of Accademia Nazionale dei Lincei, Rome. COVID-19 vaccines. Cell Death Differ. 2021;28(2):626–639. doi: 10.1038/s41418-020-00720-9.
  • 3. Caserotti M., Girardi P., Rubaltelli E., Tasso A., Lotto L., Gavaruzzi T. Associations of COVID-19 risk perception with vaccine hesitancy over time for Italian residents. Soc Sci Med. 2021;272 doi: 10.1016/j.socscimed.2021.113688.
  • 4. Wong M.C.S., Wong E.L.Y., Huang J., Cheung A.W.L., Law K., Chong M.K.C., et al. Acceptance of the COVID-19 vaccine based on the health belief model: a population-based survey in Hong Kong. Vaccine. 2021;39(7):1148–1156. doi: 10.1016/j.vaccine.2020.12.083.
  • 5. Feleszko W., Lewulis P., Czarnecki A., Waszkiewicz P. Flattening the curve of COVID-19 vaccine rejection-an international overview. Vaccines (Basel) 2021;9(1) doi: 10.3390/vaccines9010044.
  • 6. Ten threats to global health in 2019. https://www.who.int/news-room/spotlight/ten-threats-to-global-health-in-2019.
  • 7. Alley S.J., Stanton R., Browne M., To Q.G., Khalesi S., Williams S.L., et al. As the pandemic progresses, how does willingness to vaccinate against COVID-19 evolve? Int J Environ Res Public Health. 2021;18(2) doi: 10.3390/ijerph18020797.
  • 8. Reiter P.L., Pennell M.L., Katz M.L. Acceptability of a COVID-19 vaccine among adults in the United States: how many people would get vaccinated? Vaccine. 2020;38:6500–6507. doi: 10.1016/j.vaccine.2020.08.043.
  • 9. Malik A.A., McFadden S.M., Elharake J., Omer S.B. Determinants of COVID-19 vaccine acceptance in the US. EClinicalMedicine. 2020;26 doi: 10.1016/j.eclinm.2020.100495.
  • 10. Shaw J., et al. Assessment of U.S. health care personnel (HCP) attitudes towards COVID-19 vaccination in a large university health care system. Clin Infect Dis. 2021 doi: 10.1093/cid/ciab054.
  • 11. Verger P., et al. Attitudes of healthcare workers towards COVID-19 vaccination: a survey in France and French-speaking parts of Belgium and Canada, 2020. Euro Surveill. 2021;26 doi: 10.2807/1560-7917.ES.2021.26.3.2002047.
  • 12. Saleh S.N., Lehmann C.U., Medford R.J. Early crowdfunding response to the COVID-19 pandemic: cross-sectional study. J Med Internet Res. 2021;23(2) doi: 10.2196/25429.
  • 13. Q3 2020 letter to shareholders. https://s22.q4cdn.com/826641620/files/doc_financials/2020/q3/Q3-2020-Shareholder-Letter.pdf.
  • 14. McGraw, T. Spending 2020 together on Twitter. https://blog.twitter.com/en_us/topics/insights/2020/spending-2020-together-on-twitter.html.
  • 15. Stieglitz S., Dang-Xuan L. Emotions and information diffusion in social media—sentiment of microblogs and sharing behavior. J Manag Inf Syst. 2013;29(4):217–248.
  • 16. Depoux A., Martin S., Karafillakis E., Preet R., Wilder-Smith A., Larson H. The pandemic of social media panic travels faster than the COVID-19 outbreak. J Travel Med. 2020;27(3) doi: 10.1093/jtm/taaa031.
  • 17. Wilson A.E., Lehmann C.U., Saleh S.N., Hanna J., Medford R.J. Social media: a new tool for outbreak surveillance. ASHE. 2021;1 doi: 10.1017/ash.2021.225.
  • 18. Sinnenberg L., Buttenheim A.M., Padrez K., Mancheno C., Ungar L., Merchant R.M. Twitter as a tool for health research: a systematic review. Am J Public Health. 2017;107(1):e1–e8. doi: 10.2105/AJPH.2016.303512.
  • 19. Chang C.-H., Monselise M., Yang C.C. What are people concerned about during the pandemic? Detecting evolving topics about COVID-19 from Twitter. J Healthc Inform Res. 2021;5(1):70–97. doi: 10.1007/s41666-020-00083-3.
  • 20. Chandrasekaran R., Mehta V., Valkunde T., Moustakas E. Topics, trends, and sentiments of tweets about the COVID-19 pandemic: temporal infoveillance study. J Med Internet Res. 2020;22(10) doi: 10.2196/22624.
  • 21. Xue J., Chen J., Hu R., Chen C., Zheng C., Su Y., et al. Twitter discussions and emotions about the COVID-19 pandemic: machine learning approach. J Med Internet Res. 2020;22(11) doi: 10.2196/20550.
  • 22. Gallotti R., Valle F., Castaldo N., Sacco P., De Domenico M. Assessing the risks of ‘infodemics’ in response to COVID-19 epidemics. Nat Hum Behav. 2020;4:1285–1293. doi: 10.1038/s41562-020-00994-6.
  • 23. Lanier H.D., Diaz M.I., Saleh S.N., Lehmann C.U., Medford R.J., De Silva D. Analyzing COVID-19 disinformation on Twitter using the hashtags #scamdemic and #plandemic: retrospective study. PLoS ONE. 2022;17(6) doi: 10.1371/journal.pone.0268409.
  • 24. Diaz M.I., Hanna J.J., Hughes A.E., Lehmann C.U., Medford R.J. The politicization of ivermectin tweets during the COVID-19 pandemic. Open Forum Infect Dis. 2022;9(7) doi: 10.1093/ofid/ofac263.
  • 25. Medford R.J., Saleh S.N., Sumarsono A., Perl T.M., Lehmann C.U. An ‘Infodemic’: leveraging high-volume Twitter data to understand early public sentiment for the coronavirus disease 2019 outbreak. Open Forum Infect Dis. 2020;7 doi: 10.1093/ofid/ofaa258.
  • 26. Saleh S.N., Lehmann C.U., McDonald S.A., Basit M.A., Medford R.J. Understanding public perception of coronavirus disease 2019 (COVID-19) social distancing on Twitter. Infect Control Hosp Epidemiol. 2021;42(2):131–138. doi: 10.1017/ice.2020.406.
  • 27. Systematic scoping review on social media monitoring methods and interventions relating to vaccine hesitancy; 2020. https://www.ecdc.europa.eu/en/publications-data/systematic-scoping-review-social-media-monitoring-methods-and-interventions.
  • 28. Salathé M., Khandelwal S., Meyers L.A. Assessing vaccination sentiments with online social media: implications for infectious disease dynamics and control. PLoS Comput Biol. 2011;7(10) doi: 10.1371/journal.pcbi.1002199.
  • 29. Dunn A.G., Surian D., Leask J., Dey A., Mandl K.D., Coiera E. Mapping information exposure on social media to explain differences in HPV vaccine coverage in the United States. Vaccine. 2017;35(23):3033–3040. doi: 10.1016/j.vaccine.2017.04.060.
  • 30. Wilson S.L., Wiysonge C. Social media and vaccine hesitancy. BMJ Glob Health. 2020;5(10) doi: 10.1136/bmjgh-2020-004206.
  • 31. Griffith J., Marani H., Monkman H. COVID-19 vaccine hesitancy in Canada: content analysis of Tweets using the theoretical domains framework. J Med Internet Res. 2021;23(4) doi: 10.2196/26874.
  • 32. Boucher J.-C., Cornelson K., Benham J.L., Fullerton M.M., Tang T., Constantinescu C., et al. Analyzing social media to explore the attitudes and behaviors following the announcement of successful COVID-19 vaccine trials: infodemiology study. JMIR Infodemiol. 2021;1(1) doi: 10.2196/28800.
  • 33. Johnson N.F., Velásquez N., Restrepo N.J., Leahy R., Gabriel N., El Oud S., et al. The online competition between pro- and anti-vaccination views. Nature. 2020;582(7811):230–233. doi: 10.1038/s41586-020-2281-1.
  • 34. Yin F., Wu Z., Xia X., Ji M., Wang Y., Hu Z. Unfolding the determinants of COVID-19 vaccine acceptance in China. J Med Internet Res. 2021;23(1) doi: 10.2196/26089.
  • 35. Kwok S.W.H., Vadde S.K., Wang G. Twitter speaks: an analysis of Australian Twitter users’ topics and sentiments about COVID-19 vaccination using machine learning. J Med Internet Res. 2021 doi: 10.2196/26953.
  • 36. Hussain A., et al. Artificial intelligence-enabled analysis of UK and US public attitudes on Facebook and Twitter towards COVID-19 vaccinations. J Med Internet Res. 2021 doi: 10.2196/26627.
  • 37. FDA takes key action in fight against COVID-19 by issuing emergency use authorization for first COVID-19 vaccine. https://www.fda.gov/news-events/press-announcements/fda-takes-key-action-fight-against-covid-19-issuing-emergency-use-authorization-first-covid-19.
  • 38. Snscrape. (Github).
  • 39. Frijda NH. The emotions. (Cambridge University Press; Editions de la Maison des sciences de l’homme, 1986).
  • 40. Hutto C, Gilbert E. VADER: a parsimonious rule-based model for sentiment analysis of social media text; 2014.
  • 41. Loria S. Textblob: simplified text processing. (Textblob).
  • 42. Mohammad SM, Turney PD. Emotions evoked by common words and phrases: using mechanical Turk to create an emotion lexicon. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. Association for Computational Linguistics; 2010. p. 26–34.
  • 43. Gallagher RJ, Reing K, Kale D, Steeg GV. Anchored correlation explanation: topic modeling with minimal domain knowledge. arXiv:1611.10277 [cs, math, stat]; 2018.
  • 44. Steeg GV, Galstyan A. Discovering structure in high-dimensional data through correlation explanation. arXiv:1406.1222 [cs, stat]; 2014.
  • 45. Wang Z et al. Demographic inference and representative population estimates from multilingual social media data. In: The World Wide Web conference (ACM, 2019). p. 2056–2067. doi: 10.1145/3308558.3313684.
  • 46. Yang Y-C, Al-Garadi MA, Love JS, Perrone J, Sarker A. Automatic gender detection in Twitter profiles for health-related cohort studies; 2021. doi: 10.1101/2021.01.06.21249350.
  • 47. Pfizer and Biontech announce vaccine candidate against Covid-19 achieved success in first interim analysis from phase 3 study. https://www.pfizer.com/news/press-release/press-release-detail/pfizer-and-biontech-announce-vaccine-candidate-against.
  • 48. AJMC Staff. A timeline of COVID-19 developments in 2020. AJMC; 2021.
  • 49. Jackson L.A., Anderson E.J., Rouphael N.G., Roberts P.C., Makhene M., Coler R.N., et al. An mRNA vaccine against SARS-CoV-2 - preliminary report. N Engl J Med. 2020;383(20):1920–1931. doi: 10.1056/NEJMoa2022483.
  • 50. Subbaraman N. This COVID-vaccine designer is tackling vaccine hesitancy - in churches and on Twitter. Nature. 2021;590(7846):377. doi: 10.1038/d41586-021-00338-y.
  • 51. Loomba S., de Figueiredo A., Piatek S.J., de Graaf K., Larson H.J. Measuring the impact of COVID-19 vaccine misinformation on vaccination intent in the UK and USA. Nat Hum Behav. 2021;5:337–348. doi: 10.1038/s41562-021-01056-1.
  • 52. Shearer E, Mitchell A. News use across social media platforms in 2020; 2021. https://www.journalism.org/2021/01/12/news-use-across-social-media-platforms-in-2020/.
  • 53. Damiano A.D., Allen Catellier J.R. A content analysis of coronavirus tweets in the United States just prior to the pandemic declaration. Cyberpsychol Behav Soc Netw. 2020;23(12):889–893. doi: 10.1089/cyber.2020.0425.
  • 54. Jang H., Rempel E., Roth D., Carenini G., Janjua N.Z. Tracking COVID-19 discourse on Twitter in North America: infodemiology study using topic modeling and aspect-based sentiment analysis. J Med Internet Res. 2021;23(2) doi: 10.2196/25431.
  • 55. Kemp S. Digital 2020 global digital overview; 2020. https://p.widencdn.net/1zybur/Digital2020Global_Report_en.
  • 56. Fontanet A., Cauchemez S. COVID-19 herd immunity: where are we? Nat Rev Immunol. 2020;20:583–584. doi: 10.1038/s41577-020-00451-5.
  • 57. McNeil Jr DG. The New York Times; 2021. https://www.nytimes.com/2020/12/24/health/herd-immunity-covid-coronavirus.html.
  • 58. Randolph H.E., Barreiro L.B. Herd immunity: understanding COVID-19. Immunity. 2020;52(5):737–741. doi: 10.1016/j.immuni.2020.04.012.

Articles from Vaccine are provided here courtesy of Elsevier
