2022 May 5;134:107320. doi: 10.1016/j.chb.2022.107320

The popularity of contradictory information about COVID-19 vaccine on social media in China

Dandan Wang a,c,d,e, Yadong Zhou b
PMCID: PMC9068608  PMID: 35527790

Abstract

To eliminate the impact of contradictory information on vaccine hesitancy on social media, this research developed a framework to compare the popularity of information expressing contradictory attitudes towards COVID-19 vaccines or vaccination, mine the similarities and differences among the characteristics of contradictory information, and determine which factors influenced popularity the most. We called the Sina Weibo API to collect data. First, content analysis, sentiment computing and k-medoids clustering were used to extract multi-dimensional features from original tweets and quantify their popularity. Statistical analysis showed that anti-vaccine tweets were more popular than pro-vaccine tweets, but the difference was not significant. Then, by visualizing the features' centrality and clustering in information-feature networks, we found differences in text characteristics, information display dimension, topic, sentiment, readability and posters' characteristics among original tweets expressing different attitudes. Finally, we employed regression models and SHapley Additive exPlanations to explore and explain the relationship between tweets' popularity and their content and contextual features. To reduce vaccine hesitancy, we proposed suggestions for adjusting the organizational strategy of contradictory information to control its popularity along different dimensions, such as the poster's influence, activity and identity, and the tweet's topic, sentiment and readability.

Keywords: COVID-19 vaccine, Weibo, Attitude, Information popularity, Content feature, Contextual feature

1. Introduction

In China, as of April 8, 2020, the number of confirmed cases of COVID-19 had reached approximately 80,000. Although physical preventive measures such as wearing masks and social distancing effectively cut off the spread of the virus, long-term control of the COVID-19 pandemic hinged on the development and uptake of vaccines (Chou & Budenz, 2020). In March 2020, an anonymous cross-sectional survey conducted online among Chinese adults showed that 91.3% of participants would accept COVID-19 vaccination once a vaccine became available; of these, 52.2% wanted to get vaccinated as soon as possible, while the others would delay vaccination until the vaccine's safety was confirmed (J. Wang, Jing, et al., 2020). As a preventive innovation, a vaccine's diffusion and adoption are inevitably influenced by the competing dissemination of contradictory information expressing different attitudes towards vaccines and vaccination on social media (Cohen & Head, 2013; Pan & Di Zhang, 2020). Social media platforms such as Twitter (Jamison et al., 2020), Facebook (Xu & Guo, 2018), Instagram (Massey et al., 2020) and YouTube (Ekram et al., 2019) are not only an important resource for obtaining health information but also a breeding ground for health misinformation (Y. Wang, McKee, et al., 2019).

When information cues, such as “getting COVID-19 vaccination can effectively prevent COVID-19 infection, but meanwhile causing side effects, like fatigue, sore arms”, are insufficient or insufficiently cogent, individuals cannot accurately predict their outcomes (Mishel, 1988), which leads to confusion and negative beliefs about vaccines or vaccination in the context of health communication (Nagler et al., 2019). Uncertainty management theory (Brashers, 2001) holds that exposure to such two-sided health information increases the ambivalence of messages, makes people reluctant to follow health recommendations, and leads them to make harmful or even dangerous health decisions and behaviors (Chang, 2013), namely vaccine hesitancy in this case.

Social media provides multiple modes of interaction (such as posting, liking, retweeting and commenting) to encourage “dialogue” and compete for the limited attention of users (Zhu et al., 2020). It therefore plays a powerful role in popularizing pro-vaccine and anti-vaccine arguments (Jamison et al., 2020). Weibo serves as China's equivalent of Twitter (Pulido Rodríguez et al., 2020). The key to resolving vaccine hesitancy among Chinese people, which could also serve as a reference for other communities, states and countries seeking to improve immunization rates, was to help pro-vaccine messages prevail over anti-vaccine messages in competitive dissemination, so that they dominate public opinion and increase the consistency of online opinions. Only after understanding what subjects about COVID-19 vaccines and vaccination were being disseminated on Weibo, how popular these subjects were among social media users, and what factors contributed to their popularity can we provide urgently needed insights on online vaccine promotion for public health communication and education programs, from the perspective of the relationship between information characteristics and popularity.

2. Related research

2.1. Vaccine hesitancy

Vaccine hesitancy refers to an attitude (doubts, concerns) as well as a behavior (refusing some or many vaccines, delaying vaccination) that is complex and context-specific, varying across time, place and vaccines (MacDonald, 2015). Most studies have explored the scope and determinants of vaccine hesitancy using self-reported attitude and behavior data from the perspective of vaccine recipients. J. Wang et al. (2020) conducted an anonymous cross-sectional online survey to evaluate the acceptance of COVID-19 vaccines among Chinese adults in March 2020, and performed multivariate logistic regression to identify the factors individuals considered during their decision process, including the perceived risk and impact of COVID-19 and attribute preferences for vaccines (effectiveness, safety, source, cost, means of getting vaccinated). Similar research was conducted in the USA (Khubchandani et al., 2021), in Italy (Biasio et al., 2020), and among Syrians (Labban et al., 2020). Lazarus et al. (2021) expanded the research globally and concluded that people who trusted government-sourced information more were more likely to accept vaccination.

Few studies have shifted attention to health communication on social media. Elkin et al. (2020) used authors' personal profiles and posts to code their vaccine attitudes on Google, Facebook and YouTube. Jamison et al. (2020) combined manual content analysis and a Latent Dirichlet Allocation (LDA) model to mine the vaccine topics of posts on Twitter. Ittefaq et al. (2021) analyzed topics about the polio vaccine in online news comments in Pakistan. Du et al. (2020), adopting the Health Belief Model (HBM) and the Theory of Planned Behavior (TPB) as a framework, used deep learning to detect and summarize topics about the HPV vaccine on Twitter.

2.2. Contradictory information about vaccine

Contradictory information is defined as logically inconsistent statements (Carpenter et al., 2016). Inconsistencies can be found between true and false messages or between scientifically recognized positive and negative findings on a given issue (Pan & Di Zhang, 2020). Some studies drew on game theory to construct competitive propagation models of contradictory information, then analyzed propagation results (game equilibrium states: dominance, polarization and consensus) and identified the factors influencing those results (number of initial spreaders, participation degree, and network structure) through computational experiments at a macro level (Huang et al., 2021; Sun et al., 2019; Vasconcelos et al., 2019). They focused on the interaction between pieces of contradictory information.

Other studies concentrated on analyzing similarities and differences among the characteristics of online contradictory information and on comparing its dissemination effectiveness at a micro level. Few studies further explored the relationship between the two. Chou and Budenz (2020) claimed that anti-vaccine messages contained stronger anger than pro-vaccine messages. Xu and Guo (2018) used word clouds and networks to visualize word usage and clustering in pro- and anti-vaccine headlines retrieved from Google and, combining text mining and sentiment analysis, reported that the emotion of pro-vaccine information was more positive. They then compared the headlines' popularity (the sum of shares on Facebook, Google+, LinkedIn, Pinterest, and StumbleUpon, and of reactions and comments on Facebook), finding that anti-vaccine information was more popular. Finally, using statistical analysis, they reported that the number of sentiment words positively influenced the popularity of pro-vaccine information but had an insignificant effect on the popularity of anti-vaccine information. Massey et al. (2020) analyzed the topics (coded based on HBM), sentiment, images, and social media features (links, “mention”, location in text), as well as the posters' identities, of pro-vaccine and anti-vaccine (HPV) posts on Instagram. Through univariate, bivariate, and network analysis, they detected frequently used features and their clustering, and indicated that pro-vaccine posts got more likes. Ekram et al. (2019) found no significant difference in the popularity (considering the number of views, likes, dislikes, and comments) of YouTube videos expressing different attitudes about HPV vaccines. Most of the videos were either negative or neutral in tone, and tone was not a predictor of popularity, but topics about side effects, safety and conspiracy theories attracted more attention. W. Wang et al. (2020) focused on messages about HPV vaccines from Chinese websites and WeChat public accounts in 2019, indicating that over 90% of messages were difficult to read and that topics about the vaccine's effectiveness were emphasized most. Gandhi et al. (2020) searched posts about the influenza vaccine on Facebook, finding that anti-vaccine posts were shared and liked more than pro-vaccine posts, that there was no correlation between ease of reading and popularity, and that a pro-vaccine personal post by a nurse was the most popular.

2.3. Research questions

Studies evaluating vaccine hesitancy have not sufficiently mined health information on social media. Some studies set initial conditions and interaction rules to model the communication process. Although they analyzed how characteristics affected information receivers' cognitive decision-making, they inevitably oversimplified the complex communication mechanism of contradictory information, so their conclusions were not sufficiently robust. Other studies of its characteristics and popularity proposed diverse features drawn from the information's source and content, which need to be logically organized within a unified framework that suits information in different forms. In addition, they lacked in-depth modeling of each feature and ignored the fact that stakeholders have different habits of creating and adapting information in a contradictory information environment. Results varied with the social media platform, the feature dimensions, and the measurement of popularity. To fill these gaps, we took the COVID-19 vaccine in China as an example, positing that the scale of vaccine hesitancy could be reflected by the popularity of information expressing contradictory attitudes on Weibo. We posed the following research questions:

RQ1

Were there significant differences among the popularity of tweets expressing different attitudes towards COVID-19 vaccine or vaccination? Which attitude was generally more popular, about what topics, and from whom?

RQ2

What were the similarities and differences of characteristics among tweets expressing different attitudes?

RQ3

How did characteristics influence the popularity of tweets expressing different attitudes, positively or negatively?

3. Methods

Fig. 1 outlined the research framework.

Fig. 1. Research framework.

3.1. Data collection and preprocessing

We first called the Sina Weibo Application Program Interface (API) to crawl original tweets that contained the keywords “COVID-19 vaccine (新冠疫苗)” or “COVID-19 vaccination (新冠疫苗接种)”, together with their posters' information, from January 23, 2020 to February 11, 2021. This period covered the entire process of the first outbreak and cessation of the COVID-19 epidemic in China, as well as the initial stage of vaccine development and promotion. Because tweets are time-sensitive, the interaction data (i.e., retweets, comments and likes) of an original tweet tended to stabilize one week after its release (Wang et al., 2015; Wang et al., 2019). Hence, we traversed the retweet list and comment list of each original tweet within one week of its release, and crawled the text and user information of each retweet and comment. The initial dataset contained 29,218 original tweets, corresponding to 50,693 retweets and 50,796 comments.

Then came preprocessing. We deleted the low-influence original tweets whose number of likes, retweets or comments was 0 (3062 original tweets remained). Next, we invited two trained professionals to annotate the 3062 original tweets. If a tweet contained the above keywords but discussed unrelated topics, it was coded as ‘N’; otherwise, it was coded as ‘Y’. The coders conducted an intercoder reliability test (Krippendorff, 2011) on 10% of the tweets (κ = 0.958). After resolving differences and reaching agreement through discussion, they coded the remaining samples. We deleted the 375 original tweets coded as ‘N’, together with the corresponding retweets, comments and user information (2687 original tweets, with their 40,325 retweets and 38,865 comments, remained). As Fig. 2 shows, discussions about vaccines arose as soon as the epidemic broke out (the lockdown of Wuhan began on January 23, 2020). Even after the epidemic was brought under control, vaccine discussions continued to rise until the end of 2020.

Fig. 2. The number of original tweets during the period.

3.2. Text categorization

We classified original tweets into four categories according to the attitude expressed in each, based on the theory of planned behavior (TPB) (Du et al., 2020). TPB holds that attitudes, subjective norms, and perceived behavioral control drive individuals' intention to perform health behaviors (Ajzen, 1991). We focused only on an amalgamated construct of attitude because of the low prevalence of the other constructs in the data set, although they also influence vaccination behavior. Two trained professionals were invited to annotate the attitude for 10% of the samples, and their coding passed the intercoder reliability test (Krippendorff, 2011) (κ = 0.942). After repeated review and resolution of disagreements, they coded the remaining samples. The coding scheme is shown in Table 1.

Table 1.

Definitions of key constructs of Theory of Planned Behavior (TPB) found in original tweets.

| Construct | Attitudes | Examples in samples |
|---|---|---|
| Approving attitude | Approve of COVID-19 vaccine or vaccination | “The number of COVID-19 cases in the world has exceeded 100 million, get vaccinated quickly!” |
| Disapproving attitude | Disapprove of COVID-19 vaccine or vaccination | “COVID-19 Vaccination is associated with serious side effects, stay away from it!” |
| Querying attitude | Query COVID-19 vaccine or vaccination | “COVID-19 Vaccination price may be 200 RMB/pc, is it necessary to vaccinate COVID-19 vaccine?” |
| Neutral attitude | Stay neutral towards COVID-19 vaccine or vaccination | “The COVID-19 vaccine has obvious protective effect only after 35 days of inoculation” |

3.3. Popularity index construction

To evaluate the effectiveness of rumor rebuttals on social media, Li et al. (2021) proposed rumor refutation effectiveness index (REI), measured as:

$$\mathrm{REI} = \log_2\left[\,1 \times r(1+k) + p + l + 1\,\right] \tag{1}$$

Here l is the number of likes the original tweet received, r the number of retweets, k the ratio of retweets by influential users (influential accounts on Sina Weibo are stamped with the letter “V”), and p the number of positive comments. Likes imply that users approve of the tweet or are interested in it (Del Vicario et al., 2017; Massey et al., 2020; Schmidt et al., 2018). More retweets mean higher credibility and stronger sharing intention (Del Vicario et al., 2017; Lee & Oh, 2017; Schmidt et al., 2018; Zeng et al., 2019). Positive comments indicate the audience's support, while negative comments indicate mistrust (Wang & Song, 2020; Zeng et al., 2019). This study used the same formula to calculate the popularity index (PI) of each original tweet. r and l measure popularity in terms of the scale of information dissemination, while k and p measure popularity in terms of the quality of information dissemination (Fu & Oh, 2019).

To count positive comments, we first deleted irrelevant comments and converted traditional Chinese characters to simplified ones in each comment. Comments usually contained emojis, which can complement semantics and express emotions (Zhang et al., 2019), and one emoji may have different meanings when used to discuss different topics. We therefore manually converted emojis into corresponding text according to the context of the comment. Finally, we adopted Baidu's AipNLP (Hong et al., 2021) to compute the sentiment positive probability α (0 ≤ α ≤ 1) for each comment. If 0.5 < α ≤ 1, the comment was regarded as positive.
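The scoring described in this section can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes Eq. (1) is read as log2[r(1 + k) + p + l + 1], and the function names and sample values below are hypothetical.

```python
import math

def count_positive(probabilities, threshold=0.5):
    """Count comments whose positive probability is strictly above the threshold."""
    return sum(1 for alpha in probabilities if alpha > threshold)

def popularity_index(likes, retweets, influential_ratio, positive_comments):
    """PI under the assumed reading of Eq. (1): log2[r(1 + k) + p + l + 1]."""
    inner = retweets * (1 + influential_ratio) + positive_comments + likes + 1
    return math.log2(inner)

# Hypothetical sentiment outputs for four comments on one original tweet.
comment_probs = [0.91, 0.35, 0.72, 0.50]
p = count_positive(comment_probs)  # 0.50 is not strictly above 0.5

# Hypothetical interaction counts: 2 likes, 2 retweets, half of them by "V" users.
pi = popularity_index(likes=2, retweets=2, influential_ratio=0.5, positive_comments=p)
print(p, pi)
```

The trailing +1 inside the logarithm keeps the index defined (and equal to 0) for a tweet with no interactions at all.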

3.4. Factor extraction

Humans mainly process information in two modes: systematic and heuristic (Chaiken, 1980). From the systematic view of persuasion, social media users make behavioral decisions based on their perception of the information quality displayed in the content (Ghaisani et al., 2017). From the heuristic view of persuasion, information recipients may rely on more accessible contextual cues rather than content characteristics (Chaiken, 1980), because excessive online information may reduce users' motivation to scrutinize content carefully (Alsmadi & O'Brien, 2020). This research considered the impact of both the content and the contextual factors of information on its popularity.

3.4.1. Content factors

The content factors of each original tweet comprised general text characteristics (Li et al., 2021; Massey et al., 2020), information display dimension (Fu et al., 2017; Li et al., 2021; Massey et al., 2020), topic and sentiment (Chou & Budenz, 2020; Jamison et al., 2020; Li et al., 2021; Massey et al., 2020), and readability (W. Wang, Jing, et al., 2020), as summarized in Table 3. For the information display dimension, images and videos are vivid and straightforward, and a link can direct readers to external webpages for more information.

Table 3.

Each original tweet's content and contextual factors that might affect its PI.

| Factor type | Dimension | Variable | Description |
|---|---|---|---|
| Content factors | General text characteristics | text_length | the number of Chinese characters |
| | | num_sentence | the number of sentences |
| | | num_first_person | the number of first-person words, e.g. I (“我”) |
| | | num_number | the number of numerals |
| | | num_noun | the number of nouns |
| | | num_verb | the number of verbs |
| | | num_adj | the number of adjectives |
| | | num_adv | the number of adverbs |
| | | num_emo | the number of emojis |
| | | num_@ | the number of “@” (mention) |
| | | num_! | the number of “!” |
| | | num_? | the number of “?” |
| | | num_# | the number of “#” (hashtag) |
| | | num_place | the number of place names |
| | | location_included | the poster stated his/her location in the original tweet, yes or no |
| | Information display dimension | link_included | it contained one or more links, yes or no |
| | | image_included | it contained one or more images, yes or no |
| | | video_included | it contained one or more videos, yes or no |
| | Topic | | “risk”, “severity”, “effectiveness”, “adverse_effects”, “cost”, “fake_vaccine”, “security”, “conspiracy”, “means”, “dos_don'ts”, “domestic”, “foreign”, “experience” |
| | Sentiment | positive_probability | α ∈ [0,1] |
| | | emotional_intensity | β ∈ [0,1] |
| | | emotional_fluctuation | f ∈ [0,1] |
| | | emotional_trend | “rise”, “fall”, “rise_fall”, “stable” |
| | Readability | proportion_passive | the proportion of passive sentences |
| | | aver_sentence | the average length of sentences |
| | | proportion_prep | the proportion of prepositions |
| | | num_term | the number of medical terms |
| Contextual factors | Posters' characteristics | is_V | marked with the letter “V”, yes or no |
| | | num_tweet | the number of tweets he/she had already posted |
| | | num_fan | the number of fans |
| | | identity | “government”, “traditional_media”, “self_media”, “organization”, “platform”, “medical_company”, “common_company”, “campus”, “medical_personnel”, “common_personnel” |

We coded the topic of each original tweet based on the health belief model (HBM). HBM holds that the motivation of individuals to adopt preventive health behaviors (e.g. vaccination) is affected by six factors: perceived susceptibility, perceived severity, perceived benefits, perceived barriers, cues to action, and self-efficacy (Champion & Skinner, 2008). Because of the low prevalence of self-efficacy in the data set, we focused on the other five constructs. The two professionals first annotated the topic for 10% of the samples, and their coding passed the intercoder reliability test (Krippendorff, 2011) (κ = 0.973). After adding items to and deleting items from the coding scheme of Du et al. (2020), they coded the remaining samples. The final scheme is shown in Table 2.

Table 2.

Definitions of key constructs of Health Belief Model (HBM) found in original tweets.

| Construct | Topics | Examples in samples |
|---|---|---|
| Perceived susceptibility | Risk of getting COVID-19 infection | “The number of COVID-19 cases in the world has exceeded 100 million, get vaccinated quickly!” |
| Perceived severity | Severity of getting COVID-19 infection or refusing COVID-19 vaccination | “COVID-19 causes severe sequelae, not getting vaccinated is like facing death.” |
| Perceived benefits | Effectiveness of COVID-19 vaccination | “COVID-19 Vaccination not only protects against infection, but also reduces contagion.” |
| | | “The COVID-19 vaccine has obvious protective effect only after 35 days of inoculation” |
| Perceived barriers | Adverse effects of COVID-19 vaccination | “COVID-19 Vaccination is associated with serious side effects, stay away from it!” |
| | Cost of COVID-19 vaccination | “COVID-19 Vaccination price may be 200 RMB/pc, is it necessary to vaccinate COVID-19 vaccine” |
| | Fake (counterfeit vaccines, fraudulent information) | “Some institutions use normal saline to make fake COVID-19 vaccines.” |
| | Safety (novelty, infectivity of the vaccine and the standardization of vaccination process) | “COVID-19 vaccine is produced with relatively new technology, and its safety performance cannot be totally guaranteed.” |
| | Conspiracy theory | “COVID-19 Vaccinations are a scam!” |
| Cues to action | Means or channels to get vaccination | “After making an appointment online for COVID-19 vaccination, you can get vaccinated in the community where you live.” |
| | Dos and don'ts for vaccination | “Do not eat foods that are prone to allergies, such as seafood, for a day or two after getting COVID-19 vaccine.” |
| | Domestic vaccine development, production and vaccination | “More than 14 million people in China have been vaccinated with COVID-19 vaccine.” |
| | Foreign vaccine development, production and vaccination | “1.5 million people in the UK have reportedly received at least one dose of COVID-19 vaccine.” |
| | Personal experience of vaccination | “On February 5, 2021, I finished the first injection of COVID-19 vaccine and made an appointment for the second injection on February 20, without discomfort.” |

The emotional positivity expressed in tweets affects audiences' retweeting (Saura et al., 2019). Emotional intensity amplifies the information's vividness, making the publisher's standpoint seem more extreme and more likely to trigger feedback such as comments (Huffaker, 2010). The emotional trend and fluctuation also matter (Li et al., 2021). We adopted Baidu's AipNLP (Hong et al., 2021) to compute the sentiment positive probability α of each original tweet (0 ≤ α ≤ 1; a higher value means more positive emotion). Following Zhang and Zhang (2014), the emotional intensity β is defined as:

$$\beta = |1 - 2\alpha| \tag{2}$$

To describe emotional fluctuation, we first split the text into sequential sentences and computed the positive probability of each sentence, then calculated the standard deviation of all sentences' positive probabilities (Li et al., 2021). To measure emotional trend, we first converted each original tweet to a vector in which each component represented one sentence's positive probability. Because tweets contain different numbers of sentences, we then used Dynamic Time Warping (DTW) (Berndt & Clifford, 1994) to align the score vectors of all tweets. To classify the emotional trends of these vectors (tweets), we compared K-means (Hartigan & Wong, 1979), K-medoids (Park & Jun, 2009) and K-shape (Paparrizos & Gravano, 2015) clustering. K-medoids produced the best clustering results and the most interpretable clusters, so it was applied to the aligned vectors to classify the emotional trends. The optimal number of clusters was 4; that is, the emotional trends were classified into 4 categories: “rise”, “fall”, “first rise and then fall”, and “stable”.
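The three sentiment measures above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the source does not say whether the population or sample standard deviation was used (the population form is assumed here), the DTW routine is the textbook dynamic program with absolute difference as local cost rather than the authors' implementation, and all function names and sample vectors are hypothetical.

```python
import math
import statistics

def emotional_intensity(alpha):
    """beta = |1 - 2*alpha|: 0 when alpha = 0.5, 1 at either extreme."""
    return abs(1 - 2 * alpha)

def emotional_fluctuation(sentence_probs):
    """Standard deviation of per-sentence positive probabilities (population form assumed)."""
    if len(sentence_probs) < 2:
        return 0.0
    return statistics.pstdev(sentence_probs)

def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of possibly different lengths."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

# Hypothetical per-sentence positive probabilities for three tweets.
rising = [0.2, 0.5, 0.9]
rising_long = [0.2, 0.2, 0.5, 0.9, 0.9]  # same shape, more sentences
falling = [0.9, 0.5, 0.2]

print(emotional_intensity(0.9), emotional_fluctuation(rising))
print(dtw_distance(rising, rising_long), dtw_distance(rising, falling))
```

Because DTW lets one sentence match several, the two rising tweets are at distance 0 despite their different lengths, while the rising and falling tweets are far apart. A pairwise distance matrix of this kind is the sort of input a k-medoids routine would need.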

The readability of online vaccine information affects the public's willingness to be immunized (MacLean et al., 2019; W. Wang, Jing, et al., 2020; Xu et al., 2019). The Flesch Reading Ease formula, Flesch-Kincaid Grade Level, Fog Scale and SMOG Index can be used to measure readability (Ley & Florio, 1996). However, these measures are neither specific to health information nor suitable for the Chinese language. Therefore, we constructed four indicators to measure readability, summarized in Table 3. Compared with the active voice, the passive voice is more difficult for readers to understand in everyday Chinese (Hsu et al., 2020). Sentences written in the passive voice often use more characters and prepositional phrases, which can obscure the intended meaning (Hsu et al., 2020). The terminology used in medical consultations might contribute to insecurity and anxiety (Peters et al., 2016). COVID Term (National Population Health, 2020) contains the full Chinese and English names of 442 COVID-related terms, covering disease, virus, symptoms and signs, infected population, epidemic prevention and control, psychological assistance, etc. THUOCL (HanShiyi et al., 2016) contains 18,749 common medical terms in Chinese derived from social media. Treating the words in these two thesauri as medical terms, we counted the number of medical terms appearing in each original tweet.

3.4.2. Contextual factors

The author's influence, a heuristic cue that clarifies source identity and activity, is critical for assigning credibility to a given message (Massey et al., 2020; Zareie et al., 2019). We considered the posters' number of tweets (Noro et al., 2013; Riquelme & González-Cantergiani, 2016) and number of fans (Cappelletti & Sastry, 2012), and whether their accounts were stamped with the letter “V”, all derived from user profiles. In addition, researchers have claimed that tweets posted by news media are retweeted more frequently than tweets posted by common users (Cha et al., 2012), and that vaccine information from health accounts gains more likes than that from non-health accounts (Massey et al., 2020). We categorized posters' stakeholder identities by matching keywords in their personal authentication, introduction, and tags. Starting from the identity-keyword list of An and Ou (2017), two coders first manually marked the identity of 10% of posters (intercoder reliability: κ = 0.971); we then modified and expanded the list and determined 10 categories, shown in Table 3. Using the new list, the remaining posters' identities were coded automatically.

3.5. Statistical analysis

One-Way analysis of variance is used to infer the significant differences among three or more independent groups’ averages of a variable (Bewick et al., 2004). To answer RQ1, we used it to compare the popularity indexes of original tweets with different attitudes.
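As a rough illustration of the test, the F statistic of a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below is not the authors' code; the function name and the popularity values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical popularity indexes for three attitude groups.
approve = [1.0, 2.0, 3.0]
disapprove = [2.0, 3.0, 4.0]
unclear = [3.0, 4.0, 5.0]

f_stat, df_b, df_w = one_way_anova_f([approve, disapprove, unclear])
print(f_stat, df_b, df_w)
```

Turning F into a p-value requires the F distribution, which the standard library does not provide; in practice a routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value.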

3.6. Network analysis of tweets and factors

To explore the central characteristics of original tweets with different attitudes towards vaccines, and how those characteristics cluster, in order to answer RQ2, this research established three affiliation networks whose nodes were original tweets and their characteristics (Faust, 1997). Original tweets coded as “approve” were used to establish the “approve” network, those coded as “disapprove” the “disapprove” network, and those coded as “query” or “neutral” the “unclear” network. First, for each continuous variable (A) in Table 3, we calculated its first and third quartiles among all tweets, then transformed A into three sub-categorical variables: A_low (A < first quartile), A_medium (first quartile ≤ A < third quartile), and A_high (A ≥ third quartile). Each categorical variable (B) in Table 3 was transformed into as many sub-categorical variables as it had possible values; for example, “emotional_trend” became 4 sub-categorical variables (emotional_trend_rise, emotional_trend_fall, emotional_trend_rise_fall, emotional_trend_stable). We obtained 106 sub-features. Then, in the “approve” network, if tweet i had sub-feature j, there was a link from tweet i to sub-feature j. Following the same method, we established the other two networks, using Gephi (Bastian et al., 2009) to visualize these directed, unweighted networks and to calculate the in-degree centrality of each sub-feature node, which indicates how connected or popular a single node is (Farooq et al., 2018). Finally, Gephi's community detection algorithm (Kauffman et al., 2014) was used to detect frequent combinations of sub-features in each network.
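The binning and edge construction can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, quartiles are taken over whatever set of tweets is passed in, and the helper names and the five sample tweets are hypothetical.

```python
import statistics
from collections import Counter

def quartile_bins(values):
    """Label each value 'low' (< Q1), 'medium' (Q1 <= v < Q3) or 'high' (>= Q3)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return ["low" if v < q1 else "high" if v >= q3 else "medium" for v in values]

def build_edges(tweets, continuous, categorical):
    """Return directed (tweet_id, sub_feature) edges of the affiliation network."""
    ids = list(tweets)
    edges = []
    for name in continuous:
        labels = quartile_bins([tweets[t][name] for t in ids])
        edges += [(t, f"{name}_{label}") for t, label in zip(ids, labels)]
    for name in categorical:
        edges += [(t, f"{name}_{tweets[t][name]}") for t in ids]
    return edges

def in_degree(edges):
    """In-degree of each sub-feature node: how many tweets point to it."""
    return Counter(feature for _, feature in edges)

# Five hypothetical tweets with one continuous and one categorical feature.
tweets = {
    "t1": {"text_length": 10, "emotional_trend": "rise"},
    "t2": {"text_length": 20, "emotional_trend": "rise"},
    "t3": {"text_length": 30, "emotional_trend": "fall"},
    "t4": {"text_length": 40, "emotional_trend": "stable"},
    "t5": {"text_length": 50, "emotional_trend": "rise"},
}
edges = build_edges(tweets, continuous=["text_length"], categorical=["emotional_trend"])
degrees = in_degree(edges)
print(degrees.most_common())
```

Every tweet contributes exactly one edge per original variable, so a sub-feature's in-degree is simply the number of tweets in that network that carry it.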

3.7. Regression model establishment

Linear regression models such as Lasso (Ranstam & Cook, 2018) and Ridge (McDonald, 2009) are commonly used. The Support Vector Regression model (SVR) maps linearly inseparable sample points in a low-dimensional space to a high-dimensional, linearly separable feature space through a nonlinear mapping, and then performs linear regression (Ahmad et al., 2020). Random Forest Regressor (Pedregosa et al., 2017), the Extreme Gradient Boosting regression model (XGBoostRegressor) (Dong et al., 2020) and the Light Gradient Boosting Machine regression model (LGBMRegressor) (Ke et al., 2017) are ensemble learning algorithms based on decision tree regression. To answer RQ3, namely to explore the relationship between PI and the factors that might affect it, these six models were each fitted on three data sets (original tweets coded as “approve”; original tweets coded as “disapprove”; original tweets coded as “query” or “neutral”). For each data set, 75% of the data served as the training set used to fit each model, and the remaining 25% served as the testing set used to evaluate the fitted model's performance. We used the mean absolute error (MAE) and mean square error (MSE) to compare the performance of the six fitted models. Cox and Wermuth (1992) cautioned that the coefficient of determination (R2) is not suitable for judging the effectiveness of regression models, especially linear regression, so we did not consider it in this study. Smaller MAE and MSE values mean less error between the actual and predicted values of PI. SHapley Additive exPlanations (SHAP) (Lundberg & Lee, 2017) was applied to the best model to explain the regression results for each data set.
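The evaluation protocol (a 75/25 split scored with MAE and MSE) can be sketched without any modelling library. This is an illustration only: the shuffling seed, the function names and the sample values are hypothetical, and the regression models themselves are not reproduced here.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into training and testing parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(len(rows) * (1 - test_fraction)))
    return rows[:cut], rows[cut:]

def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean square error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data set of eight tweet ids: six go to training, two to testing.
train, test = train_test_split(range(8))

# Hypothetical actual and predicted popularity indexes on a testing set.
actual = [3.0, 5.0, 2.0, 4.0]
predicted = [2.0, 5.0, 4.0, 4.0]
print(len(train), len(test), mae(actual, predicted), mse(actual, predicted))
```

MSE penalizes the single two-point miss more heavily than MAE does (1.25 versus 0.75 here), which is why reporting both gives a fuller picture when comparing models.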

4. Results

4.1. Descriptive statistics

As Fig. 3 shows, most tweets' attitudes were clear (“approve” or “disapprove”). Tweets about domestic status, personal experience, risk, severity, foreign status, and means of getting vaccinated mostly supported vaccines, while tweets about fake vaccines and conspiracy (vaccine nationalism, terrorism, stigmatization, racial discrimination, religion, monopoly, ethics, pseudoscience) mostly held a “disapprove” attitude. Tweets about adverse effects and cost were highly controversial. Most tweets from stakeholders supported vaccines.

Fig. 3. Percentages of original tweets expressing different attitudes, by topic and by stakeholder.

Fig. 4 showed that tweets about COVID-19 vaccines were long. Some used many numbers to state the scale of infection and thereby emphasize the risk of infection and the urgency of vaccination, while also conveying the number of people who had been vaccinated at home or abroad. A tweet with multiple “#” signs indicated that the poster used multiple hashtags to make the tweet more searchable, and citing place names made the content detailed and focused. Vaccine sentiment among tweets was polarized, with high emotional intensity and strong emotional fluctuation. Passive voice, prepositions and professional terms rarely appeared in the tweets, which meant that the text was readable. The high numbers of tweets and fans meant that the original posters were highly active and influential on social media.

Fig. 4.

The distribution of features (continuous variables) among original tweets. The three horizontal lines, from top to bottom, represent the maximum, median, and minimum values. The horizontal width of the shaded area at a given height represents the number of tweets whose feature takes that value.

In Fig. 5, few posters displayed their current location when posting. Most tweets contained external links, images, or videos, and were published by traditional media, self-media, government or general public accounts marked with “V”. The emotional trends of most tweets were not stable.

Fig. 5.

The distribution of features (categorical variables) among original tweets.

4.2. Comparison for popularity indexes of vaccine tweets with different attitudes

In Fig. 6, tweets whose topic and attitude were “fake_vaccine-approve”, “dos_don't-approve”, “cost-query”, “conspiracy-approve”, “security-query”, or “security-approve” were more popular. Notably, there were only one tweet about fake vaccines and five tweets about conspiracy that supported the vaccine, and nine tweets about domestic status that opposed it. In response to rumors and conspiracy theories, the government, traditional media, and self-media actively refuted rumors and guided the public to establish a correct view of a great country that promotes the fair distribution of vaccines around the world. These positive messages spread widely among the public. However, the negative evaluation of domestic vaccines by a few traditional media and self-media accounts also attracted widespread attention.

Fig. 6.

The average popularity indexes of tweets with different attitudes under different topics.

In Fig. 7, tweets whose source and attitude were “platform-query”, “medical_personnel-query”, “common_company-disapprove”, “traditional_media-disapprove”, “traditional_media-approve”, “self_media-query”, or “self_media-disapprove” were more popular. Medical companies and campus accounts expressed only definite attitudes. The former, as vaccine providers, showed support. The latter were responsible for the health of students and made careful decisions for or against vaccines. The most popular anti-vaccine tweets were created by non-medical companies, while the most popular pro-vaccine tweets came from traditional media.

Fig. 7.

The average popularity indexes of tweets with different attitudes posted by users with different identities.

Although the average popularity index of tweets holding a “disapprove” attitude (5.12) was slightly higher than that of tweets holding an “approve” attitude (5.08), a “query” attitude (5.09), or a “neutral” attitude (4.69), one-way analysis of variance showed no significant difference among the popularity indexes of tweets expressing different attitudes (“approve-disapprove”: p = 0.999; “approve-query”: p = 1.000; “approve-neutral”: p = 0.640; “disapprove-query”: p = 1.000; “disapprove-neutral”: p = 0.564; “query-neutral”: p = 0.807).
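A comparison of this kind can be sketched with SciPy. The samples below are synthetic placeholders (the group sizes and spread are assumptions; only the group means echo the reported averages), and Tukey's HSD is used for the pairwise p-values purely as an illustration, since the text does not state which post-hoc procedure produced them.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

rng = np.random.default_rng(1)

# Synthetic popularity indexes per attitude (placeholders, not the paper's data).
groups = {
    "approve": rng.normal(5.08, 1.5, size=300),
    "disapprove": rng.normal(5.12, 1.5, size=150),
    "query": rng.normal(5.09, 1.5, size=40),
    "neutral": rng.normal(4.69, 1.5, size=30),
}
names = list(groups)
samples = [groups[name] for name in names]

# Omnibus one-way ANOVA across the four attitude groups.
f_stat, p_overall = f_oneway(*samples)

# Pairwise post-hoc comparison (assumed procedure: Tukey's HSD).
posthoc = tukey_hsd(*samples)
pairwise = {}
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        pairwise[f"{names[i]}-{names[j]}"] = float(posthoc.pvalue[i, j])

print(f"ANOVA: F={f_stat:.3f}, p={p_overall:.3f}")
for pair, p in pairwise.items():
    print(f"{pair}: p={p:.3f}")
```

With four groups there are six pairwise comparisons, matching the six p-values listed in the text.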

4.3. Characteristics of vaccine tweets with different attitudes

The numbers of tweets coded as “query” and “neutral” were low, and both codes indicated an unclear attitude. Therefore, we combined these two data sets into one “unclear” data set. The resulting networks are summarized in Table 4.

Table 4.

Network overview.

Network              Original tweet nodes   Attribute nodes   Edges
Approve network      1709                   106               52,979
Disapprove network   784                    106               24,304
Unclear network      194                    106               6,014

In Fig. 8, the “approve” network was visualized using Gephi's Fruchterman-Reingold layout algorithm (Grandjean, 2015). Each node's color was consistent with its feature's name (tweet nodes were set to light green), and sub-features’ names were labeled. A node's size was proportional to its in-degree, and an edge's color matched that of its target feature node. Fig. 9 and Fig. 10 followed the same settings. Table 5 showed that tweets with different attitudes shared some commonalities: they contained few passive sentences, “@”, emojis, “!”, and first-person words, and contained no videos. Posters with “V” tended not to state their geographic location. The differences were that “approve” tweets contained more “#”, emotional fluctuation in “disapprove” tweets was stronger, and “unclear” tweets (“query”, “neutral”) contained fewer clues (links, images).

Fig. 8.

“Approve” network.

Fig. 9.

“Disapprove” network.

Fig. 10.

“Unclear” network.

Table 5.

Top 10 in-degree centrality.

Rank   Approve network            Disapprove network             Unclear network
1      proportion_passive_low     location_not_included          proportion_passive_low
2      location_not_included      proportion_passive_low         location_not_included
3      is_V                       is_V                           num_@_low
4      num_?_low                  num_@_low                      num_!_low
5      num_@_low                  num_emo_low                    is_V
6      num_emo_low                video_not_included             num_emo_low
7      num_!_low                  num_!_low                      video_not_included
8      video_not_included         num_?_low                      num_first_person_low
9      num_first_person_low       num_first_person_low           link_not_included
10     num_#_Medium               emotional_fluctuation_Medium   image_not_included
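The in-degree ranking in Table 5 can be illustrated with a small tweet-to-feature network. The three tweets and their feature lists below are hypothetical; only the sub-feature labels are taken from Table 5. In such a network every edge points from a tweet node to a sub-feature node, so a sub-feature's in-degree is simply the number of tweets that exhibit it. The edge counts in Table 4 are also consistent with a fixed number of sub-feature links per tweet (31), which the last lines check arithmetically.

```python
from collections import Counter

# Hypothetical tweets, each mapped to the sub-feature nodes it links to.
tweets = {
    "t1": ["proportion_passive_low", "location_not_included", "is_V"],
    "t2": ["proportion_passive_low", "location_not_included", "num_@_low"],
    "t3": ["proportion_passive_low", "is_V", "video_not_included"],
}

# Directed edges: tweet node -> sub-feature node.
edges = [(tweet, feature) for tweet, feats in tweets.items() for feature in feats]

# In-degree of a sub-feature node = number of tweets pointing at it.
in_degree = Counter(feature for _, feature in edges)
top = in_degree.most_common(2)
print(top)

# Arithmetic check on Table 4: edges = tweet nodes x 31 in all three networks.
table4 = {"approve": (1709, 52979), "disapprove": (784, 24304), "unclear": (194, 6014)}
links_per_tweet = {name: n_edges / n_tweets for name, (n_tweets, n_edges) in table4.items()}
print(links_per_tweet)
```

Ranking sub-features by this count, separately for each attitude network, yields a table of the same shape as Table 5.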

In Fig. 11, based on the combinations of detected features, we could identify the writing pattern of self-media in approve-community 1. They usually used short, easy-to-understand language to express support, preferred videos over links to increase information capacity, and adopted relatively stable emotional expression rather than large emotional swings (“stable”, low emotional fluctuation). In approve-community 2, tweets about “foreign” were long and complex (prepositions, professional terms, place nouns), with significant emotional change (“rise_fall”, medium emotional fluctuation). In approve-community 3, the government usually reported vaccines’ effectiveness and domestic vaccination status, using professional terms but rarely expressing strong emotions. In approve-community 4, unlike those of self-media, the emotional trends of tweets from traditional media were “fall”. In approve-community 5, tweets about individual experience were positive (high positive probability, high emotional intensity, “rise” emotional trend).

Fig. 11.

Communities in the “approve” network and “disapprove” network.

In disapprove_community 2, the posting mode of self-media when expressing a “disapprove” attitude was similar to that when expressing an “approve” attitude. In disapprove_community 3, tweets involving side effects were poorly readable (long, with many professional terms, prepositions, etc.).

In Fig. 12, in community 1, the posting mode of self-media when expressing a “query” or “neutral” attitude was similar to that for the “approve” attitude. In community 2, non-medical companies without “V” created messages about vaccination channels. In community 3, when traditional media expressed uncertainty about vaccine prices at home and abroad, emotions were relatively negative and showed a downward trend, but the intensity was not strong. In community 4, although government and social media platform accounts did not express clear views on vaccines' effectiveness, they expressed optimistic expectations.

Fig. 12.

Communities in the “unclear” network.

4.4. Features influence popularity indexes of vaccine tweets with different attitudes

In Fig. 13, the trained RandomForestRegressor performed best on each data set. The following analyses were therefore based on the fitted RandomForestRegressor.

Fig. 13.

Performance of models on “approve” tweets (a), “disapprove” tweets (b), “unclear” tweets (c).

SHapley Additive exPlanations (SHAP) is a game-theoretic method used to explain the output of any machine learning model (Lundberg & Lee, 2017). Fig. 14 sorted features by the sum of SHAP value magnitudes over the “approve” samples, and used SHAP values to display the distribution of the impact each feature had on the RandomForestRegressor model output. The color represented the feature's value (red for high, blue for low), and features with negligible impact on the model output were omitted. Fig. 15 and Fig. 16 showed the same for the “disapprove” and “unclear” samples, respectively.
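The game-theoretic idea behind SHAP can be illustrated with an exact Shapley computation on a toy model. This is a simplified sketch and not the SHAP library: the three-feature `toy_model`, its coefficients, the feature names, and the all-zero baseline are hypothetical, and "absent" features are replaced by a single baseline value, whereas SHAP averages over background data and uses a faster algorithm for tree ensembles.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                # Weighted marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical popularity model with one interaction term.
    fans, posts, intensity = z
    return 4.0 + 0.8 * fans - 0.3 * posts + 0.5 * fans * intensity


feature_names = ["num_fans", "num_posts", "emotional_intensity"]
baseline = [0.0, 0.0, 0.0]
x = [2.0, 1.0, 1.5]
phi = shapley_values(toy_model, x, baseline)

# Rank features by the sum of absolute Shapley values over several samples,
# mirroring how the summary plots order their features.
samples = [[2.0, 1.0, 1.5], [0.5, 3.0, 0.2], [1.0, 0.0, 2.0]]
all_phi = [shapley_values(toy_model, s, baseline) for s in samples]
importance = {
    name: sum(abs(p[i]) for p in all_phi) for i, name in enumerate(feature_names)
}
ranking = sorted(importance, key=importance.get, reverse=True)
print(dict(zip(feature_names, phi)))
print(ranking)
```

The attributions for one sample always sum to the difference between the model output at that sample and at the baseline, which is the additivity property that gives SHAP its name.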

Fig. 14.

Results of RandomForestRegressor shown by SHAP based on tweets with “approve” attitude.

Fig. 15.

Results of RandomForestRegressor shown by SHAP based on tweets with “disapprove” attitude.

Fig. 16.

Results of RandomForestRegressor shown by SHAP based on tweets with “unclear” attitude (“query”, “neutral”).

In Fig. 14, for “approve” tweets, the poster's number of fans, text length, the number of adjectives, emotional intensity, emotional positive probability, the number of places, the “domestic” topic category, the number of exclamation marks, and the “self-media” identity of the poster had a positive impact on PI, while the number of tweets the poster had posted, the average sentence length, the number of adverbs, the proportion of prepositions, the number of nouns, the “traditional media” identity of the poster, and the “foreign” topic category had a negative impact. The numbers of professional terms, verbs and numerals, as well as emotional fluctuation, might have either a positive or a negative effect.

In Fig. 15, for “disapprove” tweets, the poster's number of fans, the proportion of prepositions, the “conspiracy” topic category, the numbers of first-person words and sentences, emotional intensity, the numbers of verbs, emojis and images, and the “self-media” identity of the poster had a positive impact on PI, while the number of tweets the poster had posted, the number of hashtags, the average sentence length, and the numbers of numerals, nouns and links had a negative impact. Text length, the number of places, and emotional fluctuation might have either a positive or a negative effect.

In Fig. 16, for “unclear” tweets, the poster's number of fans and “self-media” identity, the number of question marks, the proportion of prepositions, the numbers of hashtags and adjectives, emotional intensity, and the number of numerals had a positive impact on PI, while the number of tweets the poster had posted, the average sentence length, text length, the numbers of nouns and sentences, emotional fluctuation, the numbers of places and images, the “medical_company” identity of the poster, and emotional positive probability had a negative impact.
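The three kinds of statement above (positive, negative, or either) are read off the SHAP summary plots by comparing feature values with their SHAP values. One hypothetical way to make that reading numeric is to correlate each feature's values with its SHAP values and label weak correlations as mixed. The sketch below assumes synthetic feature and SHAP arrays, illustrative feature names, and an arbitrary threshold of 0.3; none of these come from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
feature_names = ["num_fans", "num_posts", "emotional_fluctuation"]

# Synthetic feature matrix and SHAP values (assumptions for illustration only).
X = rng.normal(size=(n, 3))
shap_values = np.column_stack([
    0.5 * X[:, 0] + rng.normal(scale=0.05, size=n),   # rises with the feature
    -0.4 * X[:, 1] + rng.normal(scale=0.05, size=n),  # falls with the feature
    0.3 * (X[:, 2] ** 2 - 1.0),                       # non-monotone relation
])


def direction(values, contributions, threshold=0.3):
    """Label a feature's impact by the sign of the value/SHAP correlation."""
    r = np.corrcoef(values, contributions)[0, 1]
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "mixed"


directions = {
    name: direction(X[:, j], shap_values[:, j])
    for j, name in enumerate(feature_names)
}
print(directions)
```

A linear correlation is a coarse summary: it cannot distinguish a feature with no effect from one with a strong but non-monotone effect, so the plots remain the primary evidence.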

5. Discussion

5.1. Popularity of information created by users when expressing different attitudes

First of all, there was more information supporting COVID-19 vaccination on Weibo than opposing it, consistent with existing research (Biasio et al., 2020; Jamison et al., 2020; Lazarus et al., 2021; Massey et al., 2020; J. Wang, Jing, et al., 2020), yet a few users did not express a clear attitude (Elkin et al., 2020). Conspiracy theories were common in anti-vaccine tweets (Jamison et al., 2020). In contrast to Gandhi et al. (2020) and to the active role of doctors in promoting children's immunization emphasized by Wheeler and Buttenheim (2013), we found that although medical companies were committed to promoting COVID-19 vaccination, medical personnel were likely to raise immunization concerns.

Consistent with Xu and Guo (2018), but inconsistent with Massey et al. (2020), we found that the overall popularity of anti-vaccine tweets was higher than that of pro-vaccine tweets, but not significantly so. The differing conclusions might be due to the different types of vaccines and social media studied. In anti-vaccine tweets, vaccines' safety received widespread attention; in pro-vaccine tweets, vaccination precautions were widely disseminated (Massey et al., 2020). The most popular pro-vaccine tweets were created by traditional media, while the most popular anti-vaccine tweets were from non-medical companies. Massey et al. (2020), studying Americans, found that the tweets with the most likes, whether for or against vaccines, came from individuals, not media or institutions. This difference might result from national cultural differences between Chinese collectivism and American individualism (Huff & Kelley, 2003), or from the urgency of the epidemic, which led the government and media to participate more in discussion of the COVID-19 vaccine than of the HPV vaccine.

5.2. Characteristics of information created by users when expressing different attitudes

The information-feature networks of different attitudes not only highlighted the frequently used features, but also visualized how the multi-dimensional features of tweets clustered in the community subgraphs. Massey et al. (2020) found that people tended to mention others for direct communication in anti-vaccine (HPV) tweets, and to include location information in pro-vaccine tweets. In our COVID-19 vaccine samples, however, tweets of every attitude rarely contained “@” or indicated a geographic location. The presence of geo-tagging is a persuasive indicator of the transparency and credibility of content creators (Wirtz & Zimbres, 2018). On the one hand, its absence might be attributed to the COVID-19 vaccine being more novel than the HPV vaccine, so that the public lacked confidence in it. On the other hand, the COVID-19 vaccine was suitable for a wider population (the HPV vaccine was mainly targeted at women), so users might have been more cautious when posting, to avoid making misleading remarks and exposing personal privacy. “V” users also created insufficiently informative tweets without a clear attitude (Rieh, 2002).

5.3. Features impact the popularity of information with different attitudes

Zhang et al. (2014) examined the impact of tweets' content and contextual features on the number of retweets and comments received on Weibo, emphasizing the significant impact of content features. In contrast, we found that contextual factors outperformed content factors in explaining the variance in tweet popularity, suggesting that heuristic rather than systematic strategies dominated users' processing of vaccine information. A high number of fans meant high exposure of an author's tweets, and a high number of tweets meant high activity (Williams et al., 2015). The former had a positive impact on information popularity, while the latter had a negative impact. Excessive posting might cause recipients to doubt the quality of the content, and information overload might damage the author's influence (Qiu et al., 2017). People with low education levels had poor awareness of COVID-19 (Labban et al., 2020), failed to recognize the reasons behind medical recommendations or to foresee the outcomes of their possible actions (Biasio et al., 2020), and hence were less likely to get vaccinated (Khubchandani et al., 2021). Therefore, high readability was conducive to increasing information popularity. This was contrary to the findings of Gandhi et al. (2020) on the general influenza vaccine, with which the public was more familiar.

Pro-vaccine tweets expressing positive emotions were more popular (Xu & Guo, 2018), which differed from Ekram et al. (2019). However, tweets that questioned vaccines while showing positive emotions might be rejected. Regardless of attitude or emotional polarity, strong emotional intensity (reflected in the number of exclamation marks and the emotional intensity score) was attractive (Gupta & Yang, 2019). Large emotional fluctuation, however, reduced information popularity. X. Wang, Jing, et al. (2020) claimed that adults preferred emotional roller coasters when reading books; a crisis might reduce the public's emotional tolerance.

Among pro-vaccine tweets, the public paid more attention to domestic status. The credibility of traditional media was questioned, and the public trusted self-media more. In anti-vaccine tweets, conspiracy theories went viral (Jamison et al., 2020). Vaccine providers did not popularize vaccine knowledge effectively. Adjectives, emoticons and images were vivid and intuitive, and were conducive to information dissemination (Mode, 2020).

5.4. Theoretical contributions

Firstly, research on vaccine hesitancy has relied mainly on data obtained from questionnaire surveys, interviews or experiments, which lack objective authenticity, and has overlooked the impact that contradictory online information has on people's psychology and behavior by competing for users' limited attention. This research provided a new perspective for interpreting the scope and determinants of vaccine hesitancy by comparing the popularity of contradictory information and its affecting factors.

Secondly, during public health emergencies, the dissemination of disaster information and that of vaccine information interacted with each other. Few studies have investigated contradictory information in this complex context. This research filled that gap.

Thirdly, this research developed a unified framework that guides the process from constructing a popularity index and extracting information characteristics to exploring the relationship between characteristics and popularity for information expressing different attitudes. The framework could be applied to other fields of contradictory information, such as rumors and rumor rebuttals.

Fourthly, we extracted features according to users’ systematic and heuristic information processing modes, and introduced the Health Belief Model and the Theory of Planned Behavior to code topics and attitudes, extending the application scope of these theories and providing a theoretical reference for future research on feature extraction and topic mining in the big data era.

Finally, text-feature networks could visualize the posting behavior patterns of social media stakeholders, which could provide new forms of data for research such as user behavior classification and prediction and the construction of user portraits, and could expand the application modes and fields of social network theory.

5.5. Practical implications

Firstly, managers could quickly and accurately gauge citizens' willingness to get vaccinated on a large scale by evaluating the popularity of vaccine messages on the social media commonly used in a specific country or community. Especially during public health emergencies, tweets in which users expressed concerns or questions received wide attention because information clues were lacking. On the one hand, public health departments should publish relatively consistent information in a timely manner to avoid information vacuums, cognitive defects and narrow biases. On the other hand, it is important to improve the readability of information and the health literacy of its recipients.

Secondly, in targeted and tailored vaccine advocacy efforts, we must avoid one-size-fits-all strategies and instead consider the posting patterns that different stakeholders use when discussing different topics and expressing different opinions. Different patterns affected information popularity to different degrees. Public opinion departments should systematically monitor posting users’ tweet-feature networks and the corresponding popularity among receivers in real time, so that they can promptly detect inflection points in the evolution of public opinion, and carry out risk aversion and tracing work by guiding targeted users’ posting behavior, to help pro-vaccine information dominate public opinion.

Thirdly, we found that users' debates on controversial topics were accompanied by strong emotional conflicts and fluctuations. Some highly active and influential users, acting as online opinion leaders, also made radical statements. These could exacerbate attitude polarization (Nan & Daily, 2015), push users into echo chambers, and undermine public opinion guidance (Asker & Dinas, 2019). Hence, we recommended that traditional media, which conveyed the government's directives in China (Guo, 2020), should avoid excessive personal emotion when reporting news, to improve their credibility. Meanwhile, self-media's influence should not be neglected. Besides, to counter conspiracy theories, on the one hand, countries should strengthen international mutual assistance on vaccines, eliminating such theories at the source; once spread, even if widely refuted, their exposure only increased (Majid & Pal, 2020). On the other hand, Internet managers should strengthen restrictions on user posting, such as adding real-name and location settings, and reduce rumors and low-quality and repetitive information, to avoid information distortion and overload (Soroya et al., 2021).

5.6. Limitations

Firstly, the same feature had different impacts on the popularity of tweets with different attitudes; for example, the number of hashtags had a negative impact on the popularity of tweets against the vaccine, but a positive impact for tweets with unclear attitudes. We should not consider only the number of hashtags: their length, whether they are hot, and the semantic similarity between them all mattered (Wang et al., 2016). Secondly, although multi-dimensional features were extracted, their interactive impact on information popularity remains to be analyzed in more detail. Finally, our data were limited to the early stage of vaccine promotion; data could be supplemented and sliced in more detail by event stage to study the dynamic changes of information characteristics and their impact on information popularity.

6. Conclusions

This research first evaluated and compared the popularity of information expressing different attitudes towards the COVID-19 vaccine or vaccination, to reflect vaccine hesitancy on social media. Then, it extracted content and contextual features, and visualized and compared the feature combinations frequently used in information expressing different attitudes. Finally, it clarified the direction and degree of the impact of features on information popularity. These findings could provide several suggestions for adjusting the organizational strategies of contradictory information to reduce vaccine hesitancy.

Credit author statement

Dandan Wang: Conceptualization; Methodology; Software; Formal analysis; Data curation; Visualization; Writing – original draft; Writing – review & editing. Yadong Zhou: Software; Data curation; Visualization.

Funding

This work was supported by the National Natural Science Foundation of China (grant numbers 71661167007, 71420107026) and by the National Key Research and Development Program of China (grant number 2018YFC0806904-03).

Biographies

Dandan Wang (Corresponding author) is a PhD student majoring in Management Science and Engineering in the School of Information Management, Wuhan University. Her research focuses on the dissemination and governance of online rumors (1248948872@qq.com; dandanw@whu.edu.cn).

Yadong Zhou is a technician who specializes in big data and natural language processing technology (lightdisappear@hotmail.com).

References

  1. Ahmad M.S., Adnan S.M., Zaidi S., Bhargava P. A novel support vector regression (SVR) model for the prediction of splice strength of the unconfined beam specimens. Construction and Building Materials. 2020;248:118475. doi: 10.1016/j.conbuildmat.2020.118475. [DOI] [Google Scholar]
  2. Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision Processes. 1991;50(2):179–211. doi: 10.1016/0749-5978(91)90020-T. [DOI] [Google Scholar]
  3. Alsmadi I., O'Brien M.J. Toward autonomous and collaborative information-credibility-assessment systems. Procedia Computer Science. 2020;168:118–122. doi: 10.1016/j.procs.2020.02.272. [DOI] [Google Scholar]
  4. An L., Ou M. Social network sentiment map of the stakeholders in public health emergencies. Library and Information Service in China. 2017;61(20):120–130. doi: 10.13266/j.issn.0252-3116.2017.20.013. [DOI] [Google Scholar]
  5. Asker D., Dinas E. Thinking fast and furious: Emotional intensity and opinion polarization in online media. Public Opinion Quarterly. 2019;83(3):487–509. doi: 10.1093/poq/nfz042. [DOI] [Google Scholar]
  6. Bastian M., Heymann S., Jacomy M., Others Gephi: An open source software for exploring and manipulating networks. Icwsm. 2009;8(2009):361–362. [Google Scholar]
  7. Berndt D.J., Clifford J. KDD workshop; 1994. Using dynamic time warping to find patterns in time series. [Google Scholar]
  8. Bewick V., Cheek L., Ball J. Statistics review 9: One-way analysis of variance. Critical Care. 2004;8(2):130. doi: 10.1186/cc2836. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Biasio L.R., Bonaccorsi G., Lorini C., Pecorelli S. Human Vaccines & Immunotherapeutics; 2020. Assessing COVID-19 vaccine literacy: A preliminary online survey; pp. 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Brashers D.E. Communication and uncertainty management. Journal of Communication. 2001;51(3):477–497. doi: 10.1111/j.1460-2466.2001.tb02892.x. [DOI] [Google Scholar]
  11. Cappelletti R., Sastry N. 2012 international conference on social informatics. 2012. Iarank: Ranking users on twitter in near real-time, based on their information amplification potential. [Google Scholar]
  12. Carpenter D.M., Geryk L.L., Chen A.T., Nagler R.H., Dieckmann N.F., Han P.K. Conflicting health information: A critical research need. Health Expectations. 2016;19(6):1173–1182. doi: 10.1111/hex.12438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Cha M., Benevenuto F., Haddadi H., Gummadi K. The world of connections and information flow in twitter. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans. 2012;42(4):991–998. doi: 10.1109/TSMCA.2012.2183359. [DOI] [Google Scholar]
  14. Chaiken S. Heuristic versus systematic information processing and the use of source versus message cues in persuasion. Journal of Personality and Social Psychology. 1980;39(5):752. doi: 10.1037/0022-3514.39.5.752. [DOI] [Google Scholar]
  15. Champion V.L., Skinner C.S. The health belief model. Health behavior and health education: Theory, research, and practice. 2008;4:45–65. [Google Scholar]
  16. Chang C. Men's and women's responses to two-sided health news coverage: A moderated mediation model. Journal of Health Communication. 2013;18(11):1326–1344. doi: 10.1080/10810730.2013.778363. [DOI] [PubMed] [Google Scholar]
  17. Chou W.-Y.S., Budenz A. Considering emotion in COVID-19 vaccine communication: Addressing vaccine hesitancy and fostering vaccine confidence. Health Communication. 2020;35(14):1718–1722. doi: 10.1080/10410236.2020.1838096. [DOI] [PubMed] [Google Scholar]
  18. Cohen E.L., Head K.J. Identifying knowledge-attitude-practice gaps to enhance HPV vaccine diffusion. Journal of Health Communication. 2013;18(10):1221–1234. doi: 10.1080/10810730.2013.778357. [DOI] [PubMed] [Google Scholar]
  19. Cox D.R., Wermuth N. A comment on the coefficient of determination for binary responses. The American Statistician. 1992;46(1):1–4. doi: 10.1080/00031305.1992.10475836. [DOI] [Google Scholar]
  20. Del Vicario M., Gaito S., Quattrociocchi W., Zignani M., Zollo F. 2017 IEEE international conference on data science and advanced analytics (DSAA) 2017. News consumption during the Italian referendum: A cross-platform analysis on facebook and twitter. [Google Scholar]
  21. Dong W., Huang Y., Lehane B., Ma G. XGBoost algorithm-based prediction of concrete electrical resistivity for structural health monitoring. Automation in Construction. 2020;114:103155. doi: 10.1016/j.autcon.2020.103155. [DOI] [Google Scholar]
  22. Du J., Luo C., Shegog R., Bian J., Cunningham R.M., Boom J.A., Poland G.A., Chen Y., Tao C. Use of deep learning to analyze social media discussions about the human papillomavirus vaccine. JAMA Network Open. 2020;3(11):e2022025. doi: 10.1001/jamanetworkopen.2020.22025. e2022025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Ekram S., Debiec K.E., Pumper M.A., Moreno M.A. Content and commentary: HPV vaccine and YouTube. Journal of Pediatric and Adolescent Gynecology. 2019;32(2):153–157. doi: 10.1016/j.jpag.2018.11.001. [DOI] [PubMed] [Google Scholar]
  24. Elkin L.E., Pullon S.R., Stubbe M.H. ‘Should I vaccinate my child?’comparing the displayed stances of vaccine information retrieved from Google, Facebook and YouTube. Vaccine. 2020;38(13):2771–2778. doi: 10.1016/j.vaccine.2020.02.041. [DOI] [PubMed] [Google Scholar]
  25. Farooq A., Joyia G.J., Uzair M., Akram U. 2018 international conference on computing, mathematics and engineering technologies (iCoMET) 2018. Detection of influential nodes using social networks analysis based on network metrics. [Google Scholar]
  26. Faust K. Centrality in affiliation networks. Social Networks. 1997;19(2):157–191. doi: 10.1016/S0378-8733(96)00300-0. [DOI] [Google Scholar]
  27. Fu H., Oh S. Quality assessment of answers with user-identified criteria and data-driven features in social Q&A. Information Processing & Management. 2019;56(1):14–28. doi: 10.1016/j.ipm.2018.08.007. [DOI] [Google Scholar]
  28. Fu P.-W., Wu C.-C., Cho Y.-J. What makes users share content on facebook? Compatibility among psychological incentive, social capital focus, and content type. Computers in Human Behavior. 2017;67:23–32. doi: 10.1016/j.chb.2016.10.010. [DOI] [Google Scholar]
  29. Gandhi C.K., Patel J., Zhan X. Trend of influenza vaccine facebook posts in last 4 years: A content analysis. American Journal of Infection Control. 2020;48(4):361–367. doi: 10.1016/j.ajic.2020.01.010. [DOI] [PubMed] [Google Scholar]
  30. Ghaisani A.P., Munajat Q., Handayani P.W. 2017 second international conference on informatics and computing (ICIC) 2017. Information credibility factors on information sharing activites in social media. [Google Scholar]
  31. Grandjean M. 2015. GEPHI: Introduction to network analysis and visualisation. [Google Scholar]
  32. Guo L. China's “fake news” problem: Exploring the spread of online rumors in the government-controlled news media. Digital Journalism. 2020;8(8):992–1010. doi: 10.1080/21670811.2020.1766986. [DOI] [Google Scholar]
  33. Gupta R.K., Yang Y. Proceedings of the 27th ACM international conference on multimedia. 2019. Predicting and understanding news social popularity with emotional salience features. [Google Scholar]
  34. Han Shiyi Z.Y., Ma Y., Tu C., Guo Z., Liu Z., Sun M. 2016. THUOCL: Tsinghua open Chinese lexicon. [Google Scholar]
  35. Hartigan J.A., Wong M.A. Algorithm as 136: A K-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics) 1979;28(1):100–108. doi: 10.2307/2346830. [DOI] [Google Scholar]
  36. Hong D., Gao Z., Luo J., Liu J., Xu M., Suo Z., Ji X. Emotion analysis of teaching evaluation system based on AI techno-logy towards Chinese texts. MATEC Web of Conferences. 2021 [Google Scholar]
  37. Hsu P.-H., Kelly-Campbell R.J., Wise K. Readability of hearing-related internet information in traditional Chinese language. Speech, Language and Hearing. 2020;23(3):158–166. doi: 10.1080/2050571X.2019.1702240. [DOI] [Google Scholar]
  38. Huang H., Chen Y., Ma Y. Modeling the competitive diffusions of rumor and knowledge and the impacts on epidemic spreading. Applied Mathematics and Computation. 2021;388:125536. doi: 10.1016/j.amc.2020.125536. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Huffaker D. Dimensions of leadership and social influence in online communities. Human Communication Research. 2010;36(4):593–617. doi: 10.1111/j.1468-2958.2010.01390.x. [DOI] [Google Scholar]
  40. Huff L., Kelley L. Levels of organizational trust in individualist versus collectivist societies: A seven-nation study. Organization Science. 2003;14(1):81–90. doi: 10.1287/orsc.14.1.81.12807. [DOI] [Google Scholar]
  41. Ittefaq M., Baines A., Abwao M., Shah S.F.A., Ramzan T. Does Pakistan still have polio cases?”: Exploring discussions on polio and polio vaccine in online news comments in Pakistan. Vaccine. 2021;39(3):480–486. doi: 10.1016/j.vaccine.2020.12.039. [DOI] [PubMed] [Google Scholar]
  42. Jamison A., Broniatowski D.A., Smith M.C., Parikh K.S., Malik A., Dredze M., Quinn S.C. Adapting and extending a typology to identify vaccine misinformation on Twitter. American Journal of Public Health. 2020;110(S3):S331–S339. doi: 10.2105/AJPH.2020.305940. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kauffman J., Kittas A., Bennett L., Tsoka S. DyCoNet: A Gephi plugin for community detection in dynamic complex networks. PLoS One. 2014;9(7) doi: 10.1371/journal.pone.0101357. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T.-Y. Lightgbm: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems. 2017;30:3146–3154. [Google Scholar]
  45. Khubchandani J., Sharma S., Price J.H., Wiblishauser M.J., Sharma M., Webb F.J. COVID-19 vaccination hesitancy in the United States: A rapid national assessment. Journal of Community Health. 2021;46(2):270–277. doi: 10.1007/s10900-020-00958-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Krippendorff K. 2011. Computing Krippendorff's alpha-reliability. https://repository.upenn.edu/asc_papers/43 [Google Scholar]
  47. Labban L., Thallaj N., Labban A. Assessing the level of awareness and knowledge of COVID 19 pandemic among Syrians. Arch Med. 2020;12(2):8. doi: 10.36648/1989-5216.12.2.309. [DOI] [Google Scholar]
  48. Lazarus J.V., Ratzan S.C., Palayew A., Gostin L.O., Larson H.J., Rabin K., Kimball S., El-Mohandes A. A global survey of potential acceptance of a COVID-19 vaccine. Nature Medicine. 2021;27(2):225–228. doi: 10.1038/s41591-020-1124-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Lee H., Oh H.J. Normative mechanism of rumor dissemination on Twitter. Cyberpsychology, Behavior, and Social Networking. 2017;20(3):164–171. doi: 10.1089/cyber.2016.0447. [DOI] [PubMed] [Google Scholar]
  50. Ley P., Florio T. The use of readability formulas in health care. Psychology, Health & Medicine. 1996;1(1):7–28. doi: 10.1080/13548509608400003. [DOI] [Google Scholar]
  51. Li Z., Zhang Q., Du X., Ma Y., Wang S. Social media rumor refutation effectiveness: Evaluation, modelling and enhancement. Information Processing & Management. 2021;58(1):102420. doi: 10.1016/j.ipm.2020.102420. [DOI] [Google Scholar]
  52. Lundberg S., Lee S.-I. 2017. A unified approach to interpreting model predictions. arXiv preprint arXiv:1705.07874. [Google Scholar]
  53. MacDonald N.E. Vaccine hesitancy: Definition, scope and determinants. Vaccine. 2015;33(34):4161–4164. doi: 10.1016/j.vaccine.2015.04.036. [DOI] [PubMed] [Google Scholar]
  54. MacLean S.A., Basch C.H., Ethan D., Garcia P. Readability of online information about HPV Immunization. Human Vaccines & Immunotherapeutics. 2019;15(7–8):1505–1507. doi: 10.1080/21645515.2018.1502518. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Majid G.M., Pal A. 2020. Conspiracy and rumor correction: Analysis of social media users' comments. 2020 3rd international conference on information and computer technologies (ICICT) [Google Scholar]
  56. Massey P.M., Kearney M.D., Hauer M.K., Selvan P., Koku E., Leader A.E. Dimensions of misinformation about the HPV vaccine on Instagram: Content and network analysis of social media characteristics. Journal of Medical Internet Research. 2020;22(12) doi: 10.2196/21451. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. McDonald G.C. Ridge regression. Wiley Interdisciplinary Reviews: Computational Statistics. 2009;1(1):93–100. doi: 10.1002/wics.14. [DOI] [Google Scholar]
  58. Mishel M.H. Uncertainty in illness. Image - the Journal of Nursing Scholarship. 1988;20(4):225–232. doi: 10.1111/j.1547-5069.1988.tb00082.x. [DOI] [PubMed] [Google Scholar]
  59. Mode I. 2020. 8 emoji-text relations on Instagram. Shifts towards image-centricity in contemporary multimodal practices; p. 177. [Google Scholar]
  60. Nagler R.H., Yzer M.C., Rothman A.J. Effects of media exposure to conflicting information about mammography: Results from a population-based survey experiment. Annals of Behavioral Medicine. 2019;53(10):896–908. doi: 10.1093/abm/kay098. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Nan X., Daily K. Biased assimilation and need for closure: Examining the effects of mixed blogs on vaccine-related beliefs. Journal of Health Communication. 2015;20(4):462–471. doi: 10.1080/10810730.2014.989343. [DOI] [PubMed] [Google Scholar]
  62. National Population Health Science Data Center released "COVID Term". Journal of Medical Informatics in China. 2020;41(2):94. [Google Scholar]
  63. Noro T., Ru F., Xiao F., Tokuda T. Twitter user rank using keyword search. Information Modelling and Knowledge Bases XXIV. Frontiers in Artificial Intelligence and Applications. 2013;251:31–48. doi: 10.3233/978-1-61499-177-9-31. [DOI] [Google Scholar]
  64. Pan S., Di Zhang J.Z. Caught in the crossfire: How contradictory information and norms on social media influence young women's intentions to receive HPV vaccination in the United States and China. Frontiers in Psychology. 2020;11 doi: 10.3389/fpsyg.2020.548365. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Paparrizos J., Gravano L. k-Shape: Efficient and accurate clustering of time series. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia. 2015. [DOI] [Google Scholar]
  66. Park H.-S., Jun C.-H. A simple and fast algorithm for K-medoids clustering. Expert Systems with Applications. 2009;36(2):3336–3341. doi: 10.1016/j.eswa.2008.01.039. [DOI] [Google Scholar]
  67. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V. Scikit-learn: A random forest regressor. 2017. [Google Scholar]
  68. Peters P., Smith A., Funk Y., Boyages J. Language, terminology and the readability of online cancer information. Medical Humanities. 2016;42(1):36–41. doi: 10.1136/medhum-2015-010766. [DOI] [PubMed] [Google Scholar]
  69. Pulido Rodríguez C., Villarejo Carballido B., Redondo Sama G., Guo M., Ramis Salas M.d.M., Flecha R. False news around COVID-19 circulated less on Sina Weibo than on Twitter. How to overcome false information? RIMCIS-International and Multidisciplinary Journal of Social Sciences. 2020;9:22. doi: 10.17583/rimcis.2020.5386. [DOI] [Google Scholar]
  70. Qiu X., Oliveira D.F., Shirazi A.S., Flammini A., Menczer F. Limited individual attention and online virality of low-quality information. Nature Human Behaviour. 2017;1(7):1–7. doi: 10.1038/s41562-017-0132. [DOI] [Google Scholar]
  71. Ranstam J., Cook J. LASSO regression. Journal of British Surgery. 2018;105(10):1348. doi: 10.1002/bjs.10895. [DOI] [Google Scholar]
  72. Rieh S.Y. Judgment of information quality and cognitive authority in the Web. Journal of the American Society for Information Science and Technology. 2002;53(2):145–161. doi: 10.1002/asi.10017. [DOI] [Google Scholar]
  73. Riquelme F., González-Cantergiani P. Measuring user influence on Twitter: A survey. Information Processing & Management. 2016;52(5):949–975. doi: 10.1016/j.ipm.2016.04.003. [DOI] [Google Scholar]
  74. Saura J.R., Reyes-Menendez A., Palos-Sanchez P. Are Black Friday deals worth it? Mining Twitter users' sentiment and behavior response. Journal of Open Innovation: Technology, Market, and Complexity. 2019;5(3):58. doi: 10.3390/joitmc5030058. [DOI] [Google Scholar]
  75. Schmidt A.L., Zollo F., Scala A., Betsch C., Quattrociocchi W. Polarization of the vaccination debate on Facebook. Vaccine. 2018;36(25):3606–3612. doi: 10.1016/j.vaccine.2018.05.040. [DOI] [PubMed] [Google Scholar]
  76. Soroya S.H., Farooq A., Mahmood K., Isoaho J., Zara S.-e. From information seeking to information avoidance: Understanding the health information behavior during a global health crisis. Information Processing & Management. 2021;58(2):102440. doi: 10.1016/j.ipm.2020.102440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Sun Q., Li Y., Hu H., Cheng S. A model for competing information diffusion in social networks. IEEE Access. 2019;7:67916–67922. doi: 10.1109/ACCESS.2019.2918812. [DOI] [Google Scholar]
  78. Vasconcelos V.V., Levin S.A., Pinheiro F.L. Consensus and polarization in competing complex contagion processes. Journal of The Royal Society Interface. 2019;16(155):20190196. doi: 10.1098/rsif.2019.0196. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Wang J., Jing R., Lai X., Zhang H., Lyu Y., Knoll M.D., Fang H. Acceptance of COVID-19 vaccination during the COVID-19 pandemic in China. Vaccines. 2020;8(3):482. doi: 10.3390/vaccines8030482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Wang R., Liu W., Gao S. Hashtags and information virality in networked social movement: Examining hashtag co-occurrence patterns. Online Information Review. 2016. [DOI] [Google Scholar]
  81. Wang C., Liu Y., Xiao Z., Zhou A., Zhang K. Analyzing internet topics by visualizing microblog retweeting. Journal of Visual Languages & Computing. 2015;28:122–133. doi: 10.1016/j.jvlc.2014.11.007. [DOI] [Google Scholar]
  82. Wang S., Li Z., Wang Y., Zhang Q. Machine learning methods to predict social media disaster rumor refuters. International Journal of Environmental Research and Public Health. 2019;16(8):1452. doi: 10.3390/ijerph16081452. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Wang W., Lyu J., Li M., Zhang Y., Xu Z., Chen Y., Zhou J., Wang S. Quality evaluation of HPV vaccine-related online messages in China: A cross-sectional study. Human Vaccines & Immunotherapeutics. 2020; pp. 1–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Wang Y., McKee M., Torbica A., Stuckler D. Systematic literature review on the spread of health-related misinformation on social media. Social Science & Medicine. 2019;240:112552. doi: 10.1016/j.socscimed.2019.112552. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Wang X., Song Y. Viral misinformation and echo chambers: The diffusion of rumors about genetically modified organisms on social media. Internet Research. 2020;30(5):1547–1564. doi: 10.1108/INTR-11-2019-0491. [DOI] [Google Scholar]
  86. Wang X., Zhang S., Smetannikov I. 2020. Fiction popularity prediction based on emotion analysis. 2020 international conference on control, robotics and intelligent system. [Google Scholar]
  87. Wheeler M., Buttenheim A.M. Parental vaccine concerns, information source, and choice of alternative immunization schedules. Human Vaccines & Immunotherapeutics. 2013;9(8):1782–1789. doi: 10.4161/hv.25959. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Williams H.T.P., McMurray J.R., Kurz T., Lambert F.H. Network analysis reveals open forums and echo chambers in social media discussions of climate change. Global Environmental Change. 2015;32:126–138. doi: 10.1016/j.gloenvcha.2015.03.006. [DOI] [Google Scholar]
  89. Wirtz J.G., Zimbres T.M. A systematic analysis of research applying ‘principles of dialogic communication’ to organizational websites, blogs, and social media: Implications for theory and practice. Journal of Public Relations Research. 2018;30(1–2):5–34. doi: 10.1080/1062726X.2018.1455146. [DOI] [Google Scholar]
  90. Xu Z., Ellis L., Umphrey L.R. The easier the better? Comparing the readability and engagement of online pro- and anti-vaccination articles. Health Education & Behavior. 2019;46(5):790–797. doi: 10.1177/1090198119853614. [DOI] [PubMed] [Google Scholar]
  91. Xu Z., Guo H. Using text mining to compare online pro- and anti-vaccine headlines: Word usage, sentiments, and online popularity. Communication Studies. 2018;69(1):103–122. doi: 10.1080/10510974.2017.1414068. [DOI] [Google Scholar]
  92. Zareie A., Sheikhahmadi A., Jalili M. Identification of influential users in social networks based on users' interest. Information Sciences. 2019;493:217–231. doi: 10.1016/j.ins.2019.04.033. [DOI] [Google Scholar]
  93. Zeng J., Burgess J., Bruns A. Is citizen journalism better than professional journalism for fact-checking rumours in China? How Weibo users verified information following the 2015 Tianjin blasts. Global Media and China. 2019;4(1):13–35. doi: 10.1177/2059436419834124. [DOI] [Google Scholar]
  94. Zhang P., Cui Y., Lan Y., Wu L. Sentiment analysis of the micro-blog emergency and related guiding strategy based on the grounded theory and lexicon construction. Journal of Modern Information in China. 2019;39(3):122–131, 143. doi: 10.3969/j.issn.1008.0821.2019.03.014. [DOI] [Google Scholar]
  95. Zhang L., Peng T.-Q., Zhang Y.-P., Wang X.-H., Zhu J.J. Content or context: Which matters more in information processing on microblogging sites. Computers in Human Behavior. 2014;31:242–249. doi: 10.1016/j.chb.2013.10.031. [DOI] [Google Scholar]
  96. Zhang Y., Zhang D. Proceedings of the 2014 IEEE 15th international conference on information reuse and integration (IEEE IRI 2014) 2014. Automatically predicting the helpfulness of online reviews. [Google Scholar]
  97. Zhu Z., Gao C., Zhang Y., Li H., Xu J., Zan Y., Li Z. Cooperation and competition among information on social networks. Scientific Reports. 2020;10(1):12160. doi: 10.1038/s41598-020-69098-5. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Computers in Human Behavior are provided here courtesy of Elsevier
