2021 Jan 28;3(3):e175–e194. doi: 10.1016/S2589-7500(20)30315-0

Table.

Summary of chosen articles

	Publication month	Origin	Social media	Study population and sample size	Methods	Key findings
Detection or prediction of COVID-19 cases
Li et al¹⁷	March	China	Google Trends, Baidu Search Index, and Sina Weibo Index	Keywords of coronavirus and pneumonia were searched, and trend data was collected from Google Trends, Baidu Search Index, and Sina Weibo Index from Jan 2 to Feb 20, 2020	Lag correlation	Lag correlations showed a maximum correlation between trend data and the number of diagnoses at 8–12 days before for laboratory-confirmed cases and 6–8 days before for suspected cases
Liu et al¹⁸	August	China	Sina Weibo	Sina Weibo messages between Jan 20 and Feb 15, 2020; 599 participants	Gathered data via Sina Weibo, then followed up with telephone call; statistical analysis performed with Fisher exact test; rates of death calculated with Kaplan-Meier method; multivariate Cox regression used to establish risk factors for mortality	Older age (ie, >69 years), diffuse pneumonia, and hypoxaemia are factors that can help clinicians to identify patients with COVID-19 who have poor prognosis; aggregated data from social media can also be comprehensive, immediate, and informative in disease prognosis
O'Leary and Storey¹⁹	September	USA	Google Trends, Wikipedia, and Twitter	Google Trends searches for coronavirus and COVID-19 between Jan 21 and April 5, 2020; Wikipedia page views for coronavirus and COVID-19 between Jan 12 and April 5, 2020; number of Twitter original tweets between Jan 27 and April 5, 2020; numbers of COVID-19 cases and deaths in the USA²⁰	Regression analysis	To model the number of cases, the current Wikipedia page views, tweets from 1 week before, and Google Trends searches from 2 weeks before were used; to model the number of deaths, each variable was taken from 1 week earlier than for cases
Peng et al²¹	June	China	Sina Weibo	1200 records	Spatiotemporal distribution of COVID-19 cases in the main urban area of Wuhan, China; kernel density analysis; ordinary least square regression	Older people (ie, >60 years) are at high risk of severe symptoms and have high prevalence in the COVID-19 outbreak, and they account for >50% of the total number of Sina Weibo help seekers; early transmission of COVID-19 in Wuhan, China, could be divided into three phases: scattered infection, community spread, and full-scale outbreak
Qin et al²²	March	China	Baidu Search Index	Social media search index for dry cough, fever, chest distress, coronavirus, and pneumonia from Dec 31, 2019, to Feb 9, 2020; data for new suspected cases of COVID-19 from Jan 20 to Feb 9, 2020	Subset selection; forward selection; lasso regression; ridge regression; elastic net	Case numbers of new suspected COVID-19 correlated significantly with the lagged series of social media search index; social media search index could detect new suspected COVID-19 cases 6–9 days earlier than could laboratories
Zhu et al²³	April	China	Sina Weibo	1101 Sina Weibo posts related to COVID-19 from Dec 31, 2019, to Feb 12, 2020	Descriptive statistics: numbers and percentage; time series analysis	Attention to COVID-19 was low until China openly admitted human-to-human transmission on Jan 20, 2020; attention quickly increased and remained high over time
Government responses
Basch et al²⁴	April	USA	YouTube	100 most widely viewed videos uploaded in January, 2020	Descriptive analysis: frequency, percentage, mean, and standard deviation	Percentage of each of the seven key prevention behaviours that are listed on the US Centers for Disease Control and Prevention website that were covered in the 100 videos varied from 0% (eg, use a face mask for protection if you are caring for the ill) to 31% (avoid close contact with people who are sick); overall, videos that covered at least one prevention behaviour accounted for less than one-third of the 100 videos
Basch et al²⁵	April	USA	YouTube	100 most widely viewed YouTube videos as of Jan 31, 2020, and March 20, 2020, with keyword of coronavirus in English, with English subtitles, or in Spanish	Descriptive analysis: frequency, percentage, mean, and standard deviation	<50% of videos in either sample covered any of the prevention behaviours that are recommended by the US Centers for Disease Control and Prevention
Khatri et al²⁶	March	Singapore	YouTube	150 videos collected on Feb 1–2, 2020, with keywords of 2019 novel coronavirus (50 videos), and Wuhan virus in English (50 videos) and Mandarin (50 videos)	Descriptive analysis: percentage and mean; DISCERN score; Medical Information and Content Index score	Mean DISCERN score for reliability was 3·12 of 5·00 for English and 3·25 of 5·00 for Mandarin videos; mean cumulative Medical Information and Content Index score of useful videos was 6·71 of 25·00 for English and 6·28 of 25·00 for Mandarin
Li et al²⁷	March	China	Sina Weibo	36 746 Sina Weibo data from Dec 30, 2019, to Feb 1, 2020; a random sample of 3000 Sina Weibo posts as training dataset	Linear regression; support vector machine; Naive Bayes; natural language processing	Classified the information related to COVID-19 into seven types of situational information and their predictors
Merkley et al²⁸	April	Canada	Twitter and Google Trends	33 142 tweets from 292 social media accounts of federal members of parliament from Jan 1 to March 28, 2020; 87 Google search trends for the search term coronavirus in the first half (ie, days 1–14) and second half (ie, days 15–31) of March, 2020; a survey of 2499 Canadian citizens ≥18 years from April 2 to April 6, 2020	Linear regression	No members of parliament from any party downplayed the pandemic; no association between Conservative Party vote share and Google search interest in the coronavirus
Rufai and Bunce²⁹	April	USA	Twitter	203 viral tweets from G7 world leaders from Nov 17, 2019, to March 17, 2020 with keywords COVID-19 or coronavirus and a minimum of 500 likes	Qualitative design; content analysis	166 of 203 tweets were informative; 9·4% (19) were morale-boosting; 6·9% (14) were political
Sutton et al³⁰	September	USA	Twitter	690 accounts representing public health, emergency management, and elected officials and 149 335 tweets	χ² analyses; negative binomial regression modelling	Message strategies changed systematically over time; key features that affect message passing, both positively and negatively, were identified; results have the potential to aid in message design strategies as the pandemic continues, or in similar future events
Wang et al³¹	September	USA	Twitter	13 598 tweets related to COVID-19 from Jan 1 to April 27, 2020	Temporal analysis and networking analysis	16 categories of message types were manually annotated; inconsistencies and incongruencies were identified in four critical topics (ie, wearing masks, assessment of risks, stay at home order, and disinfectant and sanitiser); network analysis showed increased communication coordination over time
Infodemics
Ahmed et al³²	October	UK	Twitter	22 785 tweets and 11 333 Twitter users with #FilmYourHospital from April 13 to April 20, 2020	Social network analysis; user analysis	The most important drivers of the #FilmYourHospital conspiracy theory are ordinary citizens; YouTube was the information source most linked to by users; the most retweeted post belonged to a verified Twitter user
Ahmed et al³³	May	UK	Twitter	A subsample of 233 tweets from 10 140 tweets collected from 19:44 h UTC on Friday, March 27, 2020, to 10:38 h UTC on Saturday, April 4, 2020 were used for content analysis	Descriptive statistics: numbers, percentage; social network analysis; content analysis	34·8% (81 of 233) of tweets linked 5G and COVID-19; 32·2% (75) of tweets denounced the conspiracy theory
Brennen et al³⁴	October	UK	Digital visual media	96 samples of visuals from January to March, 2020	Qualitative coding	Organised all findings into six trends: authoritative agency, virulence, medical efficacy, intolerance, prophecy, satire; a small number of manipulated visuals, all were produced by use of simple tools; no examples of so-called deepfakes (ie, techniques that are used to make synthetic videos that closely resemble real videos) or other techniques that were based on artificial intelligence
Bruns et al³⁵	August	Australia	Facebook	89 664 distinct Facebook posts from Jan 1 to April 12, 2020	Time series; network analysis	Substantially increased number of posts about 5G rumours on Facebook after March 19, 2020; network analysis showed that coalitions of various groups were brought together by conspiracy theories about COVID-19 and 5G technology
Galhardi et al³⁶	October	Brazil	WhatsApp, Instagram, and Facebook	Fake news collected from March 17 to April 10, 2020, on the basis of data from the Eu Fiscalizo app (version 5.0.5)	Quantitative content analysis	WhatsApp is the main channel for sharing fake news, followed by Instagram and Facebook
Gallotti et al³⁷	October	Italy	Twitter	>100 million Tweets	Developed an Infodemic Risk Index	Before the rise of COVID-19 cases, entire countries had measurable waves of potentially unreliable information, posing a serious threat to public health
Islam et al³⁸	October	Bangladesh	Fact-checking agency websites, Facebook, Twitter, and websites for television networks and newspapers	2311 infodemic reports related to COVID-19 between Dec 31, 2019, and April 5, 2020	Descriptive analysis; spatial distribution analysis	Misinformation that is fuelled by rumours, stigma, and conspiracy theories can have potentially severe implications on public health if prioritised over scientific guidelines; governments and other agencies should understand the patterns of rumours, stigma, and conspiracy theories that are related to COVID-19 and circulating globally so that they can develop appropriate messages for risk communication
Kouzy et al³⁹	March	Lebanon	Twitter	673 English tweets collected on Feb 27, 2020; 617 tweets after exclusion of tweets that were humorous or not serious	Descriptive statistics; bar chart; χ² statistic to calculate p value (2-sided; p=0·05 significance threshold) for the association between account or tweet characteristics and the presence of misinformation or unverifiable information about COVID-19	153 (24·8%) of 617 tweets had misinformation; 107 (17·3%) had unverifiable information; misinformation rate higher in informal individual or group accounts than in formal individual or group accounts (33·8% [123 of 364] vs 15·0% [30 of 200], p<0·0010)
Moscadelli et al⁴⁰	August	Italy	Fake news and corresponding verified news that was circulated in Italy	2102 articles between Dec 31, 2019, and April 30, 2020	Social media trend analysis by use of BuzzSumo	Links containing fake news were shared 2 352 585 times, accounting for 23·1% (2 352 585 of 10 184 351) of total shares of all reviewed articles
Pulido et al⁴¹	April	Spain	Twitter	942 valid tweets between Feb 6 and Feb 7, 2020	Communicative content analysis	Misinformation was tweeted more but retweeted less than tweets based on scientific evidence; tweets based on scientific evidence had more engagement than misinformation
Rovetta and Bhagavathula⁴²	August	Italy	Google Trends and Instagram	2 million Google Trends queries and Instagram hashtags from Feb 20 to May 6, 2020	Classification of infodemic monikers (ie, a term, query, hashtag, or phrase that generates or feeds fake news, misinterpretations, or discrimination); computed the mean peak volume with a 95% CI	Globally, growing interest exists in COVID-19, and numerous infodemic monikers continue to circulate on the internet
Uyheng and Carley⁴³	October	USA and Philippines	Twitter	12·0 million tweets from 1·6 million users from the USA and 15·0 million tweets from 1·0 million users from the Philippines between March 5 and March 19, 2020	Hate speech score assigned to each tweet by use of machine learning algorithm; bot scores were assigned to each user via BotHunter algorithm; social media analysis via ORA software; network analysis via centrality analysis; cluster analysis via Leiden algorithm	Analysis showed idiosyncratic relationships between bots and hate speech across datasets, emphasising different network dynamics of racially charged toxicity in the USA and political conflicts in the Philippines; bot activity is linked to hate in both countries, especially in communities that are dense and isolated from others
Mental health
Gao et al⁴⁴	April	China	Sina Weibo	Online survey on Wenjuanxing platform from Jan 31 to Feb 2, 2020; with 4872 Chinese citizens aged ≥18 years from 31 provinces and autonomous regions in China	Multivariable logistic regression	Social media exposure was frequently positively associated with high odds of anxiety (odds ratio 1·72, 95% CI 1·31–2·26) and combination of depression and anxiety (odds ratio 1·91, 95% CI 1·52–2·41)
Li et al⁴⁵	March	China	Sina Weibo	Sina Weibo posts from 17 865 active Sina Weibo users between Jan 13 and Jan 26, 2020	Sentiment analysis; paired sample t-test	Negative emotions and sensitivity to social risks increased; scores of positive emotions and life satisfaction decreased after outbreak declaration
Prevention education in videos
Hakimi and Armstrong⁴⁶	September	USA	YouTube	49 of the first 100 videos on YouTube with the most views that were identified by the search term DIY hand sanitiser; 51 videos were excluded because they were not in English or not related to the search term	Codified video content; assessed by use of Cohen's κ; descriptive statistics calculated; assessed by χ² test with 2-sided p value <0·05 as the threshold for significance	Most videos did not describe labelling storage containers, 69% (34 of 49) of videos encouraged the use of oils or perfumes to enhance hand sanitiser scent, and 2% (1) of videos promoted the use of colouring agents to be more attractive for use among children specifically; significantly increased mean number of daily calls to poison control centres regarding unsafe paediatric exposure to hand sanitiser since the first confirmed patient with COVID-19 in the USA (p<0·0010); significantly increased mean number of daily calls in March, 2020, compared with the previous 2 years (p<0·0010)
Hernández-García and Giménez-Júlvez⁴⁷	June	Spain	YouTube	129 videos in Spanish with the terms prevencion coronavirus and prevencion COVID19	Univariate analysis; multiple logistic regression model	Information from YouTube in Spanish on basic measures to prevent COVID-19 is usually not complete and differs according to the type of authorship (ie, mass media, health professionals, individual users, or others)
Moon and Lee⁴⁸	August	South Korea	YouTube	105 most viewed YouTube videos from Jan 1 to April 30, 2020	Modified DISCERN index; Journal of the American Medical Association Score benchmark criteria; Global Quality Score; Title–Content Consistency Index; Medical Information and Content Index	37·14% (39 of 105) of videos contained misleading information; independent user-generated videos showed the highest proportion of misleading information at 68·09% (32 of 47); misleading videos had more likes, fewer comments, and longer running times than did useful videos; transmission and precautionary measures were the most frequently covered content
Ozdede and Peker⁴⁹	July–August	Turkey	YouTube	The top 116 English language videos with at least 300 views	Precision indices and total video information and quality index scores were calculated	High number of views on dentistry YouTube videos related to COVID-19; quality and usefulness of these videos are moderate
Yüce et al⁵⁰	July	Turkey	YouTube	55 English videos about COVID-19 control procedures for dental practices collected on March 31, 2020, between 9:00 h and 18:00 h	Modified DISCERN instrument; descriptive statistics	Only two (3·6%) of 55 videos were good quality, whereas 24 (43·6%) videos were poor quality
Public attitudes
Abd-Alrazaq et al⁷	April	Qatar	Twitter	2·8 million English tweets (167 073 unique tweets from 160 829 unique users) from Feb 2 to March 15, 2020	Word frequencies of single (ie, unigrams) and double words (ie, bigrams); sentiment analysis; mean number of retweets, likes, and followers for each topic; interaction rate per topic; LDA for topic modelling	Identified 12 topics and grouped into four themes; average sentiment positive for ten topics and negative for two topics
Al-Rawi et al⁵¹	November	Canada	Twitter	Over 50 million tweets referencing #Covid-19 and #Covid19 for more than 2 months in early 2020	Mixed method: analysed emoji use by each gender category; the top 600 emojis were manually classified on the basis of their sentiment	Identified five major themes in the analysis: morbidity fears, health concerns, employment and financial issues, praise for front-line workers, and unique gendered emoji use; most emojis are extremely positive across genders, but discussions by women and gender minorities are more negative than by men; when discussing particular topics (eg, financial and employment matters, gratitude, and health care), there are many differences; use of several unique gender emojis to express specific issues (eg, coffin, skull, and siren emojis were used more often by men than by other genders when discussing fears and morbidity, whereas the use of the folded hands emoji as a thankful gesture for front-line workers was found more often in discussions by women than by other genders and the bank emoji was noted only in women's discussions)
Arpaci et al⁵²	July	Turkey	Twitter	43 million tweets between March 22 and March 30, 2020	Evolutionary clustering analysis	Unigram terms appear more frequently than bigram and trigram (ie, triple words) terms; during the epidemic, many tweets about COVID-19 were distributed and attracted widespread public attention; high-frequency words (eg, death, test, spread, and lockdown) indicated that people were afraid of being infected and people who were infected were afraid of death; people agreed to stay at home due to fear of spread and called for physical distancing since they became aware of COVID-19
Barrett et al⁵³	August	USA	Twitter	188 tweets about Governor Dan Patrick's statement on March 23, 2020, about generational self-sacrifice	Thematic analysis	90% (169 of 188) of tweets opposed calculated ageism, whereas only 5% (9) supported it and 5% (10) conveyed no position; opposition centred on moral critiques, political–economic critiques, assertions of the worth of older adults (eg, >60 years), and public health arguments; support centred on individual responsibility and patriotism
Boon-Itt and Skunkan⁵⁴	November	Thailand	Twitter	107 990 English tweets related to COVID-19 between Dec 13, 2019, and March 9, 2020	Sentiment analysis; topic modelling by use of LDA	Sentiment analysis showed a predominantly negative feeling towards the COVID-19 pandemic; topic modelling revealed three themes relating to COVID-19 and the outbreak: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19
Budhwani and Sun⁵⁵	May	USA	Twitter	16 535 tweets about Chinese virus or China virus between March 9 and March 15, 2020; 177 327 tweets between March 19 and March 25, 2020	Descriptive analysis; spatial analysis	Nearly 10 times increase at the national level; all 50 states had an increase in the number of tweets exclusively mentioning Chinese virus or China virus instead of coronavirus disease, COVID-19, or coronavirus; mean 0·38 tweets referencing Chinese virus or China virus were posted per 10 000 people at the state level in the preperiod (ie, March 9–15, 2020), and 4·08 of these stigmatising tweets were posted in the postperiod (ie, March 19–25, 2020), also indicating a 10 times increase
Chang et al⁵⁶	November	Taiwan	10 news websites, 11 discussion forums, 1 social network, 2 principal media sharing networks	1·07 million Chinese texts from Dec 30, 2019, to March 31, 2020	Deductive analysis	Online news promoted negativity and drove emotional social posts; stigmatising language that was linked to the COVID-19 pandemic showed an absence of civic responsibility that encouraged bias, hostility, and discrimination
Chehal et al⁵⁷	July	India	Twitter	29 554 tweets during the second lockdown (ie, April 15–May 3, 2020); 47 672 tweets during the third lockdown (May 4–17, 2020)	Sentiment analysis by use of the National Research Council of Canada Emotion Lexicon	A positive approach in the second lockdown but a negative approach in the third lockdown
Chen et al⁵⁸	September	China	Sina Weibo	1411 posts pertinent to COVID-19 taken from Healthy China, an official Sina Weibo account of the National Health Commission of China, from Jan 14 to March 5, 2020	Descriptive analysis; hypothesis testing	Media richness (ie, potential information load, where low richness is only text and high richness is not only text) negatively predicted citizen participation via government social media, but dialogic loop (ie, stimulation of public dialogue, provision of the dialogue channel, and response to public feedback in a timely manner) facilitated engagement
Damiano and Allen Catellier⁵⁹	August	USA	Twitter	600 English tweets from the USA were selected: 300 from February, 2020, and 300 from March, 2020	Frequencies; χ² statistics	Neutral sentiment; tweets about COVID-19 risks and emotional outrage accounted for <50% (135 of 600); few tweets were related to blame
Darling-Hammond et al⁶⁰	September	USA	Twitter	339 063 tweets from non-Asian respondents of the Project Implicit Asian Implicit Association Test from 2007–20, broken into two datasets: the first dataset was from Jan 1, 2007, to Feb 10, 2020; the second dataset was from Feb 11 to March 31, 2020	Local polynomial regression; interrupted time-series analyses	Implicit Americanness Bias steadily decreased from 2007 to 2020; when media entities began using stigmatising terms, such as Chinese virus, starting from March 8, 2020, Implicit Americanness Bias began to increase; such bias was more pronounced among conservative individuals than among non-conservative individuals
Das and Dutta⁶¹	July	India	Twitter	410 643 tweets with #IndiaLockdown and #IndiafightsCorona from March 22 to April 21, 2020	National Research Council of Canada lexicon for corpus-level emotion mining; sentimentr from open source R software for sentiment analysis to create additional sentiment scores; LDA for topic models; Natural Language Toolkit to develop sentiment-based topic models	For the broad corpus-level analysis, the context of positiveness was substantially higher than were negative sentiments; however, positive sentiment trends were similar to negative sentiment trends in terms of topics covered when the analysis was done at individual tweet level; the results showed that the discussion of COVID-19 in India on Twitter contains slightly more positive sentiments than negative sentiments
De Santis et al⁶²	July	Italy	Twitter	1 044 645 tweets	A general purpose methodological framework, grounded on a biological metaphor and on a chain of NLP and graph analysis techniques	Energy evolution through time was monitored; daily hot topics were identified (eg, COVID-19, Walter Ricciardi's retweet of an anti-Trump tweet from Michael Moore, Gabriele Gravina's argument against suspension of Italian football, increased COVID-19 cases in Italy, high case numbers in Lombardy, Italy, and an interview of Matteo Salvini about COVID-19 topics by Massimo Giletti)
Dheeraj⁶³	May–June	India	Reddit	868 posts related to COVID-19	Fetching the articles: Python Reddit Application Programming Interface Wrapper; data preprocessing: Reddit Application Programming Interface and Natural Language Toolkit library	Of 868 posts on Reddit that were related to COVID-19 articles, 50% (434) were neutral, 22% (191) were positive, and 28% (243) were negative
Essam and Abdo⁶⁴	August	Egypt	Twitter	1 920 593 tweets with corona, coronavirus, or COVID-19 keywords from Feb 1 to April 30, 2020	Thematic analysis	The dominant themes that were closely related to coronavirus tweets included the outbreak of the pandemic, metaphysics responses, signs and symptoms in confirmed cases, and conspiracies; the psycholinguistic analysis showed that tweeters maintained high amounts of affective talk (ie, expression of feelings), which was loaded with negative emotions and sadness; Linguistic Inquiry and Word Count's psychological categories of religion and health dominated the Arabic tweets discussing the pandemic situation
Yin FL et al⁶⁵	March	China	Sina Weibo	Sina Weibo posts from Dec 31, 2019, to Feb 7, 2020	Multiple-information susceptible-discussing-immune model	Model reproduction ratio declined from 1·78 to 0·97, showing that the peak of posts had passed but the topic was still on social media afterwards with a decreased number of posts
Gozzi et al⁶⁶	October	Italy, UK, USA, and Canada	News, YouTube, Reddit, and Wikipedia	227 768 web-based news articles from Feb 7 to May 15, 2020; 13 448 YouTube videos from Feb 7 to May 15, 2020; 107 898 English user posts and 3 829 309 comments on Reddit from Feb 15 to May 15, 2020; 278 456 892 views of Wikipedia pages that were related to COVID-19 from Feb 7 to May 15, 2020	Linear regression; topic modelling by use of LDA	Collective attention was mainly driven by media coverage rather than epidemic progression, rapidly became saturated, and decreased despite media coverage and COVID-19 incidence remaining high; Reddit users were generally more interested in health, data regarding the new disease, and interventions needed to halt the spreading with respect to media exposure than were users of other platforms
Green et al⁶⁷	July	USA	Twitter	19 803 tweets from Democrats and 11 084 tweets from Republicans between Jan 17 and March 31, 2020	Random forest	Democrats discussed the crisis more frequently, emphasising public health and direct aid to US workers, whereas Republicans placed greater emphasis on national unity, China, and businesses
Han et al⁶⁸	April	China	Sina Weibo	1 413 297 Sina Weibo messages, including 105 330 texts with geographical location information, from 00:00 h on Jan 9, 2020, to 00:00 h on Feb 11, 2020	Time series analysis; kernel density estimation; Spearman correlation; LDA model; random forest algorithm	Public response was sensitive to the epidemic and notable social events, especially in urban agglomerations
Jelodar et al⁶⁹	June	China	Reddit	563 079 English comments related to COVID-19 from Reddit between Jan 20 and March 19, 2020	Topic modelling by use of LDA and probabilistic latent semantic analysis; sentiment classification by use of recurrent neural network	The results showed a novel application for NLP based on a long short term memory model to detect meaningful latent topics and sentiment–comment classification on issues related to COVID-19 on social media
Jimenez-Sotomayor et al⁷⁰	April	Mexico	Twitter	A random sample of 351 of 18 128 tweets were analysed from March 12 to March 21, 2020	Qualitative content classification	The most common types of tweets were personal opinions (31·9% [112 of 351]), followed by informative tweets (29·6% [104]), jokes or ridicule (14·2% [50]), and personal accounts (13·4% [47]); 72 of 351 tweets were most likely intended to ridicule or offend someone and 21·1% (74) had content implying that the life of older adults (ie, referred to in tweets as “elderly”, “older”, and “boomer”) was less valuable than that of younger people or downplayed the relevance of COVID-19
Kim⁷¹	August	South Korea	Twitter	27 849 individual tweets about COVID-19 between Feb 10 and Feb 14, 2020	Binary logistic regression; semantic network analysis	Social network size was a negative predictor of incivility
Kurten and Beullens⁷²	August	Belgium	Twitter	373 908 tweets and retweets from Feb 25 to March 30, 2020	Time series; network bigrams; emotion lexicon; LDA	Notable COVID-19 events immediately increased the number of tweets; most topics focused on the need for EU collaboration to tackle the pandemic
Kwon et al⁷³	October	USA	Twitter	259 529 unique tweets containing the word coronavirus between Jan 23 and March 24, 2020	Trending analysis; spatiotemporal analysis	Early facets of physical distancing appeared in Los Angeles (CA, USA), San Francisco (CA, USA), and Seattle (WA, USA); social disruptiveness tweets were most retweeted, and intervention implementation tweets were most favourited
Lai et al⁷⁴	October	USA	Reddit	522 comments from an Ask Me Anything session on COVID-19 on March 11, 2020, from 14:00 h to 16:00 h EST	Content analysis	The highest number of posts were about symptoms (27% [141 of 522]), followed by prevention (25% [131]); symptoms was the most common intended topic for further discussions (28% [94 of 337])
Li et al⁷⁵	April	China	Sina Weibo	115 299 Sina Weibo posts from Dec 23, 2019, to Jan 30, 2020; 11 893 of them were collected from Dec 31, 2019, to Jan 20, 2020, for qualitative analysis; total daily cases of COVID-19 in Wuhan, China, were obtained from the Chinese National Health Commission	Linear regression model; qualitative content analysis	Positive correlation between the number of Sina Weibo posts and the number of reported cases, with ten COVID-19 cases per 40 posts; posts grouped into four themes
Li et al⁷⁶	September	USA	Twitter	155 353 unique English tweets related to COVID-19 that were posted from Dec 31, 2019, to March 13, 2020	Content analysis	Peril of COVID-19 was mentioned the most often, followed by content about marks (ie, cues to identify members of a stigmatised group: flu-like symptoms, personal protective equipment, Asian origin, and health-care providers and essential workers), responsibility, and group labelling; information on conspiracy theories was more likely to be included in tweets about group labelling and responsibility than in tweets about COVID-19 peril
Lwin et al⁷⁷	May	Singapore	Twitter	20 325 929 tweets from 7 033 158 unique users from Jan 28 to April 9, 2020	Sentiment analysis	Public emotions shifted strongly from fear to anger over the course of the pandemic, while sadness and joy also surfaced; anger shifted from xenophobia at the beginning of the pandemic to discourse around the stay-at-home notices; sadness was emphasised by the topics of losing friends and family members, whereas topics that were related to joy included words of gratitude and good health; emotion-driven collective issues around shared public distress experiences of the COVID-19 pandemic are developing and include large-scale social isolation and the loss of human lives
Ma et al⁷⁸	July	China	WeChat	Top 200 accounts from Jan 21 to Jan 27, 2020	Simple linear regression; multiple linear regression; content analysis	For non-medical institution accounts in the model, report and story types of articles had positive effects on whether users followed behaviours; for medical institution accounts, report and science types of articles had a positive effect
Medford et al⁷⁹	June	USA	Twitter	126 049 English tweets from 53 196 unique users with matching hashtags that were related to COVID-19 from Jan 14 to Jan 28, 2020	Temporal analysis; sentiment analysis; topic modelling by use of LDA	The hourly number of tweets that were related to COVID-19 starkly increased from Jan 21, 2020, onwards; fear was the most common emotion and was expressed in 49·5% (62 424 of 126 049) of all tweets; the most common predominant topic was the economic and political effect
Mohamad⁸⁰	June	Brunei	Twitter, Instagram, and TikTok	30 individual profiles from Instagram, Twitter, and TikTok	Qualitative content analysis	Five narratives of local responses to physical distancing practices were apparent: fear, responsibility, annoyance, fun, and resistance
Nguyen et al⁸¹	September	USA	Twitter	3 377 295 US tweets that were related to race from November, 2019, to June, 2020	Support vector machine was used for sentiment analysis	Proportion of negative tweets referencing Asians increased by 68·4%; proportion of negative tweets referencing other racial or ethnic minorities was stable; common themes that emerged during the content analysis of a random subsample of 3300 tweets included: racism and blame, anti-racism, and effect on daily life
Odlum et al⁸²	June	USA	Twitter	2 558 474 tweets from Jan 21 to May 3, 2020	Clustering algorithm; NLP; network diagrams	15 topics (in four themes) were identified; positive sentiments, cohesively encouraging online discussions, and behaviours for COVID-19 prevention were uniquely observed in African American Twitter communities
Park et al⁸³	May	South Korea	Twitter	43 832 unique users and 78 233 relationships on Feb 29, 2020	Network analysis; content analysis	Spread of information was faster in the COVID-19 network than in the other networks; tweets containing medically framed news articles were more popular than were tweets that included news articles adopting non-medical frames
Pastor⁸⁴	April	Philippines	Twitter	Tweets were collected on three Tuesdays in March, 2020, since lockdown in the Philippines	NLP for sentiment analysis	Negative sentiments increased over time in lockdown
Samuel et al⁸⁵	June	USA	Twitter	900 000 tweets from February to March, 2020	Sentiment analysis packages; textual analytics; machine learning classification methods: Naive Bayes and logistic regression	For short tweets, classification accuracy was 91% with Naive Bayes whereas accuracy was 74% with logistic regression; both methods showed weaker performance for longer tweets
Samuel et al⁸⁶	August	USA	Twitter	293 597 tweets, 90 variables	Textual analytics to analyse public sentiment support; sentiment analysis by use of R package Syuzhet (version 1.0.6)	For the reopening of the US economy, there was more positive sentiment support than there was negative support; developed a novel sentiment polarity-based public sentiment scenarios framework
Su et al⁸⁷	June	China and Italy	Sina Weibo and Twitter	850 Sina Weibo users with posts published from Jan 9 to Feb 5, 2020; 14 269 tweets from 188 unique Twitter users from Feb 23 to March 21, 2020	Wilcoxon tests	Individuals focused more on home and expressed a high level of cognitive process after a lockdown in both Wuhan, China, and Lombardy, Italy; level of stress decreased, and the attention to leisure increased in Lombardy, Italy, after the lockdown; attention to group, religion, and emotions became more prevalent in Wuhan, China, after the lockdown
Thelwall and Thelwall⁸⁸	May	UK	Twitter	3 038 026 English tweets from March 10 to March 23, 2020	Word frequency comparison; χ² analysis	Women were more likely to tweet about the virus in the context of family, physical distancing, and health care, whereas men were more likely to tweet about sports cancellations, the global spread of the virus, and political reactions
Wang et al⁸⁹	July	China	Sina Weibo	999 978 randomly selected Sina Weibo posts that were related to COVID-19 from Jan 1 to Feb 18, 2020	Unsupervised Bidirectional Encoder Representations from Transformers model: classify sentiment categories; Term Frequency-Inverse Document Frequency model: summarise the topics of posts; trend analysis; thematic analysis	People were concerned about four aspects regarding COVID-19: the virus origin, symptoms, production activity, and public health control
Wicke and Bolognesi⁹⁰	September	Ireland	Twitter	203 756 tweets	Topic modelling	Although the family frame covers a wider portion of topics, among the figurative frames, war (a highly conventional one) was the frame used most frequently; yet, this frame does not seem to be appropriate to elaborate the discourse around some aspects that are involved in the situation
Xi et al⁹¹	September	China	Sina Weibo	188 unique topics, their views, and comments from Jan 20 to April 28, 2020	Thematic analysis; temporal analysis	Six themes were identified: the most prominent theme was older people contributing to the community (46 [24%] of 188) followed by older patients (defined by keywords—eg, “older people”, “old-aged people”, “grandmother”, “grandfather”, “old grandmother”, “old grandfather”, “old woman”, and “old man”) in hospitals (43 [23%]); the theme of contributing to the community was the most dominant in the first phase (Jan 20–Feb 20, 2020; period of COVID-19 outbreak in China); the theme of older patients in hospitals was most dominant in the second (Feb 21–March 17, 2020; turnover period) and third (March 18–April 28, 2020; post-peak period in China) phases
Xie et al⁹²	August	China	Baidu Search Index and Google Trends	Number of cases by Feb 29, 2020: 79 968 cumulative confirmed cases, 41 675 cured cases, 2873 dead cases	Kendall's τ_b rank test	Both the Baidu Search Index and Google Trends indices showed a similar trend in a slightly different way; daily Google Trends were correlated to seven indicators, whereas daily Baidu Search Index was correlated to only three indicators; these indexes and rumours are statistically related to disease-related indicators; information symmetry was also noted
Xue et al⁹³	November	Canada	Twitter	1 015 874 tweets from April 12 to July 16, 2020	LDA	Nine themes about family violence were identified
Yigitcanlar et al⁹⁴	October	Australia	Twitter	96 666 tweets from Australia from Jan 1 to May 4, 2020	Descriptive analysis; content analysis; sentiment analysis; spatial analysis	Social media analytics is an efficient approach to capture attitudes and perceptions of the public during a pandemic; crowdsourced social media data can guide interventions and decisions of the authorities during a pandemic; effective use of government social media channels can help the public to follow the introduced measures and restrictions
Yu et al⁹⁵	July	Spain	Twitter	22 223 tweets	Topic modelling; network analysis	Identified eight news frames for each newspaper's Twitter account; the entire pandemic development process is divided into three periods: precrisis, lockdown, and recovery period; contributes to the understanding of how Spanish news media cover public health crises on social media platforms
Zhao et al⁹⁶	May	China	Sina Weibo and microblog hot search list	4056 topics from Dec 31, 2019, to Feb 20, 2020	Word segmentation; word frequency; sentiment analysis	The trend of public attention could be divided into three stages; the hot topic keywords of public attention at each stage were slightly different; the emotional tendency of the public towards the COVID-19 pandemic-related hot topics changed from negative to neutral between January and February, 2020, with negative emotions weakening and positive emotions increasing overall; COVID-19 topics with the most public concern were divided into five categories: the situation of the new cases of COVID-19 and its effects, front-line reporting of the pandemic and the measures of prevention and control, expert interpretation and discussion on the source of infection, medical services on the front line of the pandemic, and focus on the pandemic and the search for suspected cases
Zhu et al⁹⁷	July	China	Sina Weibo	1 858 288 microblog data records	LDA	A so-called double peaks feature appeared in the search curve for epidemic topics; the topic changed over time, and the fluctuation of topic discussion rate gradually decreased; political and economic centres attracted high attention on social media; the existence of the subject of rumours enabled people to have more communication and discussion

All studies were published in 2020. LDA=latent Dirichlet allocation. NLP=natural language processing.