Journal of Medical Internet Research. 2022 Oct 5;24(10):e40323. doi: 10.2196/40323

Emotions and Topics Expressed on Twitter During the COVID-19 Pandemic in the United Kingdom: Comparative Geolocation and Text Mining Analysis

Hassan Alhuzali 1,#, Tianlin Zhang 2,#, Sophia Ananiadou 2,3
Editor: Corey Basch
Reviewed by: Joshua Nielsen, Masud Rabbani
PMCID: PMC9536769  PMID: 36150046

Abstract

Background

In recent years, the COVID-19 pandemic has brought great changes to public health, society, and the economy. Social media provide a platform for people to discuss health concerns, living conditions, and policies during an epidemic, allowing policymakers to analyze this content to understand public emotions and attitudes for decision-making.

Objective

The aim of this study was to use deep learning–based methods to understand public emotions on topics related to the COVID-19 pandemic in the United Kingdom through a comparative geolocation and text mining analysis on Twitter.

Methods

Over 500,000 tweets related to COVID-19 from 48 different cities in the United Kingdom were extracted, with the data covering the period of the last 2 years (from February 2020 to November 2021). We leveraged three advanced deep learning–based models to geospatially analyze the sentiment, emotions, and topics of tweets in the United Kingdom: SenticNet 6 for sentiment analysis, SpanEmo for emotion recognition, and combined topic modeling (CTM) for topic modeling.

Results

We observed a significant change in the number of tweets as the epidemiological and vaccination situations shifted over the 2 years. There was a sharp increase in the number of tweets from January 2020 to February 2020 due to the outbreak of COVID-19 in the United Kingdom. The number of tweets then gradually declined from February 2020 onward. Moreover, with identification of the COVID-19 Omicron variant in the United Kingdom in November 2021, the number of tweets grew again. Our findings reveal people’s attitudes and emotions toward topics related to COVID-19. For sentiment, approximately 60% of tweets were positive, 20% were neutral, and 20% were negative. For emotion, people tended to express highly positive emotions in the beginning of 2020, shifting toward highly negative emotions by the end of 2021. The topics also changed during the pandemic.

Conclusions

Through large-scale text mining of Twitter, our study found meaningful differences in public emotions and topics regarding the COVID-19 pandemic among different UK cities. Furthermore, efficient location-based and time-based comparative analysis can be used to track people’s thoughts and feelings, and to understand their behaviors. Based on our analysis, positive attitudes were common during the pandemic; optimism and anticipation were the dominant emotions. With the outbreak and epidemiological change, the government developed control measures and vaccination policies, and the topics also shifted over time. Overall, the proportion and expressions of emojis, sentiments, emotions, and topics varied geographically and temporally. Therefore, our approach of exploring public emotions and topics on the pandemic from Twitter can potentially help inform how public policies are received in a particular geographical area.

Keywords: Twitter, COVID-19, geolocation, emotion detection, sentiment analysis, topic modeling, social media, natural language processing, deep learning

Introduction

The crisis of the COVID-19 pandemic has influenced the whole world on an enormous scale, causing most countries to deal with an unprecedented situation. The societal consequences of lockdowns were tremendous on all levels. The pandemic caused most countries to impose various stages of restrictions on movement, travel, and gathering to contain the outbreak of infection. Such restrictions changed how people worked, socialized, shopped, and traveled, leading to various behavioral and societal changes to deal with the situation (eg, working from home, fear of social interaction, isolation, loneliness). Because of this unprecedented societal change, it was important for policymakers to understand people’s state of mind to help institutions, governments, and individuals navigate through the pandemic [1-4].

Traditionally, policymakers used questionnaires to capture public opinion toward major events, but such methods of evidence gathering are limited by coarse spatiotemporal granularity and small sample sizes, which can introduce bias. Recently, social media have become an important vehicle for gathering information and evidence about public opinion. Twitter is a popular social media platform with more than 19 million users in the United Kingdom [5], where there are many discussions and opinions about topics related to COVID-19. Previous studies show that Twitter can yield important public health information and has broad applicability for public health research, including medical well-being and tracking infectious disease outbreaks [6,7]. Therefore, to address the evidence gap from traditional surveys, Twitter data can be used to supplement data gathering, and to understand public opinion on pandemics [8,9] and reactions to the COVID-19 outbreak [10].

There is a growing body of research that has recently focused on the COVID-19 pandemic with respect to different attributes, including sentiment, emotions, and topics [11-16]. Kleinberg et al [11] built the COVID-19 Real World Worry Dataset, based on texts written by 2500 survey participants who reported their feelings while writing. Gupta et al [13] created another COVID-19 data set from Twitter by using a set of keywords related to the pandemic, and analyzed sentiment and topics as additional attributes to emotion. Several studies have also analyzed COVID-19 vaccine–related discussions on Twitter or Reddit using sentiment analysis and topic modeling in different countries, including the United States [17-19], Canada [20], the United Kingdom [18], Saudi Arabia [21], and Australia [22].

Sentiment represents the attitude and feelings expressed by people. Sentiment analysis determines and interprets whether online posts collected from social media are positive, neutral, or negative, and helps to gain better insight into public perceptions and attitudes. Sentiment analysis can also help to understand how information spreads on social media: a tweet with positive/negative sentiment generates another tweet with the same or opposing sentiment [23]. Sentiment analysis has been used for many practical applications, including financial analysis, politics, health prediction, and health care service improvement [24]. For instance, by analyzing public messages, sentiment analysis can be used by health practitioners to understand potential obstacles to population-based intervention approaches such as COVID-19 vaccination. In addition, analyzing patients’ online reviews of different treatments can improve patient satisfaction [25].

Emotion detection from social media plays an important role in monitoring health and well-being [26]. Clinicians and health professionals also benefit from emotion analysis to understand public emotions and changes in public perception of an intervention (ie, a vaccine). Emotion detection systems have been used for alerting public health practitioners, monitoring mental health patients [27], preventing suicide [28], and detecting adverse drug reactions [29]. Some works utilized emotion-based features to specifically detect adverse drug reactions reported by users on social media, which can guide health professionals and pharmaceutical companies in making medications safer and advocating for patient safety [30-32]. Moreover, the idea of emotional contagion can further play a crucial role in either improving the overall well-being of users or preventing them from developing mental health problems. Kramer et al [33] stated that emotions can be transferred to others through emotional contagion. Emotional contagion makes people experience similar emotions, even if they are not aware of their emotional changes. On the one hand, some works found a strong link between people’s mental health problems (ie, depression and anxiety) and the outbreak of COVID-19 due to intense exposure to negative content on social media [34,35]. On the other hand, one can also expose people to positive or desired emotions (eg, calm, joy, optimism, and rest) to improve their overall well-being [33].

Besides sentiment analysis and emotion detection, topic modeling is an important text analysis technique that groups texts into different themes. Most models can find hidden topics without supervision, and therefore do not require training on specific data with predefined topics, which makes this approach suitable for analyzing social media data to determine what people are talking about on these platforms. Topic modeling has been used for many health applications during the COVID-19 pandemic [36], such as monitoring people’s concerns, predicting COVID-19 cases, and analyzing government responses. Topic modeling has played a crucial role in health information surveillance and public opinion monitoring [37].

Given the growing interest of research in understanding people’s opinions and emotions regarding the pandemic [37], the objective of this study was to use deep learning–based methods to understand public emotions on topics related to the COVID-19 pandemic in the United Kingdom through a comparative geolocation and text mining analysis on Twitter. Specifically, we utilized three advanced deep learning–based methods (ie, SenticNet [38], SpanEmo [39], and combined topic modeling [CTM] [40]), and then performed our analysis on a data set collected from Twitter to explore people’s sentiment, emotions, and topics toward COVID-19. We further included analyses of these attributes focused on understanding the impact of the pandemic over time. The overall goal of this study was to automatically capture the impact COVID-19 had on the UK population using emotion detection, sentiment analysis, and topic modeling.

Methods

Data Source

To develop our corpus, we used the Twitter application programming interface, collecting data via several bounding boxes over multiple cities in the United Kingdom. We further used a list of keywords relevant to the pandemic (eg, coronavirus, sars19, covid19, and NHS [National Health Service]). The data covered the period of the last 2 years (ie, 2020 and 2021). To acquire location labels on the data, we used the Python geocoding library “geopy” [41], which helps locate the coordinates of addresses (eg, Oxford Rd, Manchester M13 9PL), cities (eg, Manchester), countries (eg, United Kingdom), and landmarks (in the form of latitude and longitude coordinates) based on third-party geocoders and several other data sources. More specifically, we used “Nominatim” [42] as a third-party tool. As a result, we acquired a total of 516,427 tweets from 48 cities in this study.
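As a minimal sketch, the following shows how a city name can be resolved to coordinates with geopy’s Nominatim geocoder; the user_agent string and the example query are illustrative rather than the exact values used in our pipeline.

```python
from geopy.geocoders import Nominatim

# Nominatim requires an identifying user agent; this string is illustrative.
geolocator = Nominatim(user_agent="covid19-uk-tweet-geolocation")

# Resolve a city name to its address and latitude/longitude coordinates.
location = geolocator.geocode("Manchester, United Kingdom")
if location is not None:
    print(location.address)
    print(location.latitude, location.longitude)
```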

The number of tweets per city and per emoji is shown in Table 1 and Multimedia Appendix 1, respectively. We further highlight the 9 cities that were used for our analysis: Birmingham, Bristol, Leeds, Leicester, Liverpool, London, Manchester, Nottingham, and Sheffield. It is worth mentioning that these 9 cities are also among the most populated cities in the United Kingdom [43], suggesting a link between population size and the number of tweets posted from a given geolocation area (see the sketch after Table 1). Multimedia Appendix 1 displays the top 50 emojis (by percentage of tweets) together with their meanings, highlighting the usage of emojis expressing different health issues (eg, virus, face with medical mask, syringe, or vaccine) and mental health conditions (eg, hands pressed together).

Table 1.

Number of tweets per city in the United Kingdom.

City Tweets, n Population, n
Bath 1698 105,730
Birminghama 21,120 1,159,888
Blackburn 1092 121,475
Bradford 4980 368,485
Brighton 10,092 245,504
Bristola 10,338 580,199
Cambridge 6894 149,155
Canterbury 2292 64,495
Carlisle 1098 74,536
Chelmsford 3894 119,468
Chester 3516 87,881
Chichester 864 31,881
Coventry 6072 388,793
Derby 3503 264,430
Durham 9414 56,920
Ealing 4914 340,341
Ely 432 20,333
Exeter 3360 127,709
Gloucester 1740 148,167
Hereford 1134 64,037
Kingston 5286 287,705
Kirklees 3156 441,290
Lancaster 876 52,935
Leedsa 11,628 516,298
Leicestera 19,818 472,897
Lichfield 792 34,686
Lincoln 4614 107,434
Liverpoola 15,876 589,774
Londona 111,667 9,088,994
Luton 2658 222,043
Manchestera 25,260 567,334
Newcastle 9642 290,688
Northampton 3954 230,070
Norwich 4290 199,245
Nottinghama 11,827 320,536
Peterborough 2054 179,349
Plymouth 2736 240,297
Portsmouth 4878 248,748
Preston 3816 100,095
Redbridge 3227 310,330
Ripon 138 15,971
Rochdale 1415 114,511
Rotherham 198 111,158
Salford 8034 125,983
Sheffielda 15,582 557,039
Southampton 7806 270,333
Worcester 3492 101,816
York 5748 164,934

aTop nine cities used in subsequent analyses.
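To illustrate the link between population size and tweet volume noted above, the following sketch computes a rank correlation over the figures in Table 1. Only a handful of rows are hard-coded here for brevity; in practice, all 48 cities would be used, and no numeric result is asserted here.

```python
from scipy.stats import spearmanr

# A few (tweets, population) pairs taken from Table 1; the full 48-city table
# would be used in practice.
rows = {
    "London": (111_667, 9_088_994),
    "Manchester": (25_260, 567_334),
    "Birmingham": (21_120, 1_159_888),
    "Leicester": (19_818, 472_897),
    "Bath": (1_698, 105_730),
    "Ripon": (138, 15_971),
}

tweets = [t for t, _ in rows.values()]
population = [p for _, p in rows.values()]

# Spearman rank correlation between tweet volume and city population.
rho, p_value = spearmanr(tweets, population)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```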

Methodology

To preprocess the data, we used the “ekphrasis” tool designed for the specific characteristics of Twitter (eg, misspellings and abbreviations) [44]. The tool provides different functionalities such as tokenization, normalization, and spelling correction. We utilized the tool to tokenize the text; convert words to lowercase; and normalize user mentions, URLs, and repeated characters. Once the preprocessing step was complete, we fed the data through three models: (1) a deep learning–based textual emotion recognition model, (2) a deep learning–based sentiment model, and (3) a neural network topic model. Figure 1 depicts our pipeline, in which we provide an illustration of the three deep learning models.
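The following is a minimal sketch of the kind of ekphrasis configuration described above (tokenization, lowercasing, and normalization of user mentions, URLs, and repeated characters); the exact options used in our pipeline may differ.

```python
from ekphrasis.classes.preprocessor import TextPreProcessor
from ekphrasis.classes.tokenizer import SocialTokenizer

# Normalize Twitter-specific tokens and tokenize with a social-media-aware tokenizer.
text_processor = TextPreProcessor(
    normalize=["url", "user"],           # replace URLs and @mentions with placeholders
    annotate={"repeated", "elongated"},   # mark repeated punctuation and elongated words
    fix_html=True,
    segmenter="twitter",
    corrector="twitter",
    unpack_hashtags=True,
    tokenizer=SocialTokenizer(lowercase=True).tokenize,
)

tweet = "Staaay safe everyone!!! #covid19 https://t.co/xyz @user"
print(text_processor.pre_process_doc(tweet))
```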

Figure 1. Overview of our pipeline. CTM: combined topic modeling.

We used SenticNet 6 [38] for sentiment analysis, since this model has achieved better performance than other machine learning–based sentiment analysis methods. SenticNet 6 can provide sentiment scores (between –1 and 1) for approximately 200,000 common-sense concepts by using both symbolic models (ie, logic and semantic networks) and subsymbolic methods with deep learning architectures to encode meanings and syntactic relations. We then summed the sentiment scores of each concept in a post and applied two basic linguistic patterns (negation and adversative patterns) [45]. For example, without these patterns, “The television is old but rather not expensive” could be wrongly classified as negative because both “old” and “expensive” carry negative polarity, even though the negation (“not”) and the adversative conjunction (“but”) shift the overall sentiment. Finally, we calculated the sentiment polarity of each post automatically. We divided our data into five categories based on the following score ranges: strong negative (–1 to –0.5), weak negative (–0.5 to –0.1), neutral (–0.1 to 0.1), weak positive (0.1 to 0.5), and strong positive (0.5 to 1).
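The five-way polarity binning described above can be expressed as a simple thresholding function over the aggregated score. This is a sketch of the mapping only (not of SenticNet scoring itself), and the treatment of values falling exactly on a boundary is an assumption.

```python
def polarity_category(score: float) -> str:
    """Map an aggregated sentiment score in [-1, 1] to one of five categories.

    Boundary handling (eg, whether -0.5 counts as strong or weak negative) is an
    assumption; the article only gives the ranges.
    """
    if score <= -0.5:
        return "strong negative"
    if score < -0.1:
        return "weak negative"
    if score <= 0.1:
        return "neutral"
    if score < 0.5:
        return "weak positive"
    return "strong positive"

# Example usage on a few aggregated scores.
for s in (-0.8, -0.3, 0.0, 0.2, 0.9):
    print(s, "->", polarity_category(s))
```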

The emotion recognition model is based on our deep learning–based model “SpanEmo” [39], which is designed for multilabel emotion classification. This model is specifically trained on the SemEval-2018 multilabel emotion classification data set [46], labeled with multiple emotion classes (ie, anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, and trust). SpanEmo focuses on both learning emotion-specific associations and integrating their correlations into the training objective. Since SpanEmo achieved strong performance for the task of multilabel emotion classification, we used it to generate predictions for our data. It should be mentioned that only predictions with high confidence were retained.
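Retaining only high-confidence predictions can be illustrated as thresholding the per-label probabilities produced by a multilabel classifier such as SpanEmo. The helper function, label order, and the 0.5 threshold below are illustrative assumptions, not the model’s actual interface.

```python
import numpy as np

# SemEval-2018 Task 1 emotion labels (order here is illustrative).
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear", "joy", "love",
    "optimism", "pessimism", "sadness", "surprise", "trust",
]

def select_emotions(probabilities, threshold=0.5):
    """Keep only emotion labels whose predicted probability reaches the threshold."""
    return [label for label, p in zip(EMOTIONS, probabilities) if p >= threshold]

# Example: per-label sigmoid outputs for one tweet (values are made up),
# predicted to mainly express anticipation and optimism.
probs = np.array([0.05, 0.81, 0.02, 0.10, 0.33, 0.04, 0.76, 0.03, 0.08, 0.02, 0.12])
print(select_emotions(probs))
```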

Lastly, for topic modeling, we used CTM [40]. This model incorporates contextualized document embeddings into neural topic models to produce more coherent and meaningful topics. Because evaluation results on five publicly available data sets show that CTM outperforms traditional latent Dirichlet allocation [47] and other neural topic models, we employed CTM to extract the topics and their associated words from our data.
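As a minimal sketch, fitting a combined topic model with the contextualized-topic-models package might look as follows; the sentence-embedding model name, the toy documents, and the exact API details are assumptions and may differ across package versions.

```python
from contextualized_topic_models.models.ctm import CombinedTM
from contextualized_topic_models.utils.data_preparation import TopicModelDataPreparation

# Aligned lists: raw tweets for the sentence embedder, cleaned tweets for the bag of words.
raw_tweets = ["Got my vaccine today, feeling hopeful", "Another lockdown announced today..."]
clean_tweets = ["got vaccine today feeling hopeful", "another lockdown announced today"]

tp = TopicModelDataPreparation("paraphrase-distilroberta-base-v2")
training_dataset = tp.fit(text_for_contextual=raw_tweets, text_for_bow=clean_tweets)

# 18 topics, matching the number of topics reported in Table 3.
ctm = CombinedTM(bow_size=len(tp.vocab), contextual_size=768, n_components=18)
ctm.fit(training_dataset)

# Top 5 words per topic, as in Table 3.
print(ctm.get_topic_lists(5))
```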

Ethical Considerations

Since our data were collected from Twitter, we followed Twitter’s terms of service and strict ethical research protocols similar to existing guidelines [48], protecting the privacy and security of personal data. It should be mentioned that our study was conducted at the tweet level; we do not anticipate any negative ethical impact from our analysis. Rather, we believe that these results provide insights into people’s emotions and topics among different cities in the United Kingdom during the COVID-19 pandemic.

Results

Words Associated With Emotions

We performed different types of analyses focused on sentiment, emotion, and topic modeling of the COVID-19 online data sets. First, we analyzed emotion-word and topic-word associations, both of which demonstrate the relationship between words and their respective emotion label or topic. We then analyzed tweets for which the location was available, examining how the impact of COVID-19 was discussed across different cities in the United Kingdom. Furthermore, an analysis of time-based features was undertaken, focusing on the impact of COVID-19 over time. Finally, we analyzed instances from our data that illustrate the benefits of considering sentiment, emotion, and topical analysis in understanding the concerns of people during the pandemic in the United Kingdom.

Table 2 presents the top 6 words associated with each emotion, as learned by SpanEmo. More detailed information on how these words are generated is provided by Alhuzali and Ananiadou [39]. There were words indicative of both the corresponding emotion and the COVID-19 pandemic. For instance, the words “death” and “spread” were highly associated with the emotion class fear, whereas words such as “vaccine” and “support” were highly associated with the emotion class anticipation. This is intuitive since some words directly express emotion (eg, angry, afraid, and glad), while other words indirectly express emotion (eg, accident, failure, and birthday). We also observed that some emotion classes shared similar words, especially those that belong to the same valence space [49]. The analysis presented in Table 2 demonstrates that emotion analysis can help in understanding the impact of COVID-19 and the concerns of people during the pandemic.

Table 2.

The top 6 words associated with each emotion class, predicted by SpanEmo.

Emotion class Associated words
Negative emotions

Anger death, think, public, virus, don’t, against

Disgust deaths, virus, against, because, public, after

Fear deaths, spread, symptoms, coronavirus, identify, self-reporting

Sadness deaths, going, cases, hospital, other, please

Pessimism sadly, family, friend, during, weeks, passed
Positive emotions

Anticipation support, vaccine, first, working, public, cases

Joy great, thank, support, happy, amazing, staysafe

Trust trust, thank, protect, important, community, everyone

Love happy, loved, share, beautiful, wonderful, amazing

Optimism please, thank, support, working, great, spread

Surprise shocking, surprised, amazing, public, absolutely, deaths

We extracted topics using CTM. Table 3 summarizes the top 18 topics extracted as well as the top 5 associated words per topic. We noticed that there were many different topics mentioned by users, ranging from those related to COVID-19, such as epidemic control, government policies, and vaccination, to indirectly related subjects such as work, online, and social networking. For example, topic 1 (t1) contains some words about gratitude (ie, grateful, thank), which is related to the attitude toward social support and vaccination. Topic 3 (t3) is about the discussion during the pandemic, topic 10 (t10) centers on the serious consequences of COVID-19 (die, killing), and topic 8 (t8) reveals occupational patterns.

Table 3.

Topics extracted using combined topic modeling and the top 5 associated words per topic.

Topic Associated words
t1 thank, grateful, proud, amazing, heroes
t2 class, sign, trade, worldwide, hold
t3 discuss, blog, discussion, recovery, opportunities
t4 united, fitness, kingdom, complete, image
t5 episode, tune, film, videos, radio
t6 rear, accord, whack, discomfort, fills
t7 vaccination, vaccine, dose, drug, booster
t8 letter, homes, worker, pay, private
t9 visit, eye, tweet, click, website
t10 die, dying, true, killing, cause
t11 confirmed, total, English, wales, reports
t12 rear, accord, jeopardise, unknowingly, discomfort
t13 lies, cummings, press, leader, prime
t14 coronavirus, pandemic, outbreak, instagram, outbreak
t15 masks, wear, face, hand, covering
t16 slow, thread, implement, testandtrace, symptom
t17 couple, havent, felt, daughter, holiday
t18 stay, loved, tough, pray, healthy

Analysis of Location

Figure 2 shows the number of emojis across a sample of UK cities, where the sample consists of the top 9 cities in our data, that is, those with the highest number of tweets (Table 1): Bristol, Birmingham, Leicester, Leeds, Liverpool, London, Manchester, Nottingham, and Sheffield. The emoji set included the following topics: virus, face mask, thumbs-up/-down, broken heart, and others. The proportion of emojis differed from city to city. For example, usage of the syringe emoji, now commonly read as the COVID-19 vaccine emoji, was high in Liverpool; the thumbs-down emoji was used frequently in Birmingham; and the mask emoji was used heavily in London and Liverpool. These emojis are relevant to the COVID-19 pandemic, demonstrating the benefits of mining and analyzing social data such as Twitter for a better understanding of the impact of the pandemic on people from different areas in the United Kingdom.

Figure 2. The number of emojis used across a sample of UK cities.

In Figure 3, we present the proportions of five sentiments (strong positive, weak positive, neutral, weak negative, and strong negative) in the top 9 cities in our data in terms of their number of tweets. We can observe that approximately 60% of tweets were positive and 20% were negative in each city. At the same time, the percentage of tweets with different sentiments differed among these cities. For example, Leeds had a relatively high proportion of strong negative tweets and Sheffield had a relatively low proportion of strong positive tweets.

Figure 3. The distribution of sentiment expressions across a sample of UK cities.

In Figure 4, we present the distribution of emotion expressions across the top 9 cities in our data. It can be observed that these 9 cities shared quite similar distributions, although the proportion differed from emotion to emotion. For instance, “optimism” and “anticipation” were the most frequently expressed emotions. We also noted some mixed emotions such as joy, disgust, and anger, which are reasonable feelings to be expressed during the COVID-19 pandemic. Interestingly, the proportion of trust expressions was extremely low, which could be linked to the lack of trust in decision-makers to deal properly with the situation due to the high infection rates. It is noteworthy that the proportion of trust expressions has been found to be generally scarce on Twitter in previous work [50,51].

Figure 4. The distribution of emotion expressions across a sample of UK cities.

In addition, we also counted the proportion of 10 topics in different cities, as shown in Figure 5. Similar topics received different degrees of attention in different cities. For instance, the main topic discussed in Leicester was t2 (trade, worldwide), which revealed that the public there was more concerned about international trade. In London, residents talked more about t4 (kingdom, united) than in other cities. In addition, Sheffield’s population focused more on the death topic, given the higher proportion of t10 (die, killing) than found in the other cities.

Figure 5. The distribution of topic expressions across a sample of UK cities. See Table 3 for a description of topics t1-t10.

Analysis of Time

As the epidemiological situation changed over time, so did the level of public concern about the epidemic. Figure 6 displays the number of tweets related to COVID-19 from January 2020 to December 2021. We can observe a sharp increase in the number of tweets from January 2020 to February 2020 (approximately 100,000 tweets), mainly due to the outbreak of COVID-19 in the United Kingdom. There was a gradual decline in the number of tweets from February 2020 onward, suggesting that people became less concerned about the epidemic. Moreover, the overall number of tweets was relatively low in 2021. With identification of the COVID-19 Omicron variant in the United Kingdom in November 2021, the number of tweets posted increased.
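The monthly counts underlying Figure 6 can be derived by bucketing tweet timestamps by calendar month, as in the sketch below; the DataFrame layout and column name are assumptions for illustration.

```python
import pandas as pd

# Assumed layout: one row per tweet with a datetime column `created_at`.
tweets = pd.DataFrame({
    "created_at": pd.to_datetime(["2020-02-03", "2020-02-15", "2021-11-28"]),
    "text": ["...", "...", "..."],
})

# Count tweets per calendar month, as plotted in Figure 6.
monthly_counts = tweets.set_index("created_at").resample("MS").size()
print(monthly_counts)
```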

Figure 6. Number of tweets related to COVID-19 from January 2020 to December 2021. Each colored line represents a specific year (ie, red represents 2020, while orange represents 2021).

Figure 7 presents the emotion expressions over time, covering the 2 years (ie, 2020 and 2021). We noticed that the distribution changed with time. In the beginning of 2020, almost all emotion labels displayed high peaks of expression, with some obviously higher than others, such as optimism. As time progressed, the number of posted tweets containing emotions decreased, but the emotion distribution shifted dramatically from highly positive to negative, a trend that continued until the end of 2021. For instance, disgust, sadness, and pessimism were among the top expressed emotions toward the end of this period, which is reasonable given that the number of cases and deaths increased [52].

Figure 7. Number of tweets with different emotion expressions from 2020 to 2021.

Figure 8 shows the change in topics (among 10 selected topics) of all tweets between February 2020 and November 2021. We can see that the change is relatively significant. In April 2020, many tweets expressed gratitude to the heroes of local councils during the epidemic, as reflected by the peak frequency of messages related to t1 (grateful, thanks). In addition, due to advances in vaccine research and an increase in the number of people vaccinated, the number of tweets referring to t7 (vaccination) increased accordingly, reaching its highest value in January 2021. Interestingly, there were many tweets related to t5 (film, videos) because of the emergence of films with special significance, such as A Beacon of Hope: The UK Vaccine Story and One Year On: A pandemic poem for Londoners. For example, someone posted “What an honour to be filmed by @BBCLondonNews reading this part of our One Year On, poem marking the anniversary of the 1st lockdown.”

Figure 8. Number of tweets with different topic expressions from 2020 to 2021. See Table 3 for the descriptions of topics t1-t10.

Analysis of Examples

Multimedia Appendix 2 presents 9 instances from our data, each of which is linked to different attributes (ie, emotions, emojis, sentiment, and topics), demonstrating interesting findings that highlight the benefits of these attributes to the understanding of people’s reactions with respect to the pandemic. Here, we describe some use cases of emojis in tweets that were commonly observed across our data. Examples 1 and 3 display the use of emojis that are related to vaccine-taking (syringe emoji) and feeling strong/protected (muscle emoji). These two examples suggest that being vaccinated can make people feel strong and protected against the COVID-19 disease. Other examples (ie, Examples 4 and 5) discuss flight cancellation (airplane emoji), causing people to miss their already planned trips and holidays. Example 5 also discusses the potential of being able to travel again once the COVID-19 vaccine has been taken. A further example illustrates the benefits of developing volunteering programs that can assist hospitals and communities in fighting the COVID-19 crisis. Furthermore, the mask emoji was used in different ways, depending on the context (eg, lockdown for a long period).

From the perspective of sentiment, different tweets expressed different sentiments (including positive, neutral, and negative sentiments). Example 3 reports that the second COVID-19 vaccine dose had been successfully administered and Example 9 praises community groups for their help and support, both of which show strong positive sentiment from the users. Example 7 expresses negative sentiment since the user could not see her relatives due to the epidemic. Some other examples (ie, Examples 1, 2, 5, 6, and 8) generally express positive attitudes during the pandemic when discussing vaccination, lockdown, or volunteering. In addition, Example 4 shows an instance that expresses mixed sentiment (positive and negative), although it was labeled by SenticNet as neutral. However, SpanEmo identified some mixed emotions, which helps to overcome the limitation of SenticNet in dealing properly with expressions having mixed sentiments or emotions.

Multimedia Appendix 2 also shows the top 3 topics for each example according to the probability calculated based on CTM. Examples 1, 3, and 5 belong to t7, which dominates the discussion of vaccination and boosters. Examples 4, 7, and 8 express the users’ attitudes and moods toward the impact of COVID-19 on their lives, and thus all of these were classified as t18. Examples 3 and 6 also belong to t1 (related to gratitude) because of the appearance of “thank you.” Moreover, discussion or usage of social media (t3) was expressed in some tweets (eg, Example 9).

Discussion

Principal Findings

This study explored more than 500,000 tweets related to COVID-19 between January 2020 and December 2021 in different cities of the United Kingdom, where the number of tweets increased dramatically following the outbreak. We used three deep learning–based models to analyze and combine sentiments, emotions, and topics to identify the key public concerns during the pandemic. Through our analysis, we found that emotion analysis can support understanding of people’s opinions and attitudes during the COVID-19 pandemic. Meanwhile, taking geolocation information into account can reveal differences between different areas in the United Kingdom. The overall sentiment was positive over time, and optimism was the predominant emotion, suggesting that people tend to be optimistic about the situation. There were changes in the sentiments, emotions, and topics expressed on Twitter as the epidemiological situation and government policies changed (eg, vaccination, social distancing) over these 2 years, which also reflect changes in people’s attitudes.

The benefits of the selected attributes for gathering evidence about people’s reactions during the pandemic in the United Kingdom were also identified. These attributes include emotion, sentiment, emojis, and topic modeling. This analysis demonstrated that such attributes can help gather evidence and analyze interactions between people during the pandemic. The first attribute was emotion, which can serve as a guide in understanding people’s reactions. For example, some people express concerns about COVID-19 for multiple reasons, such as (1) the pandemic taking longer to resolve than expected, (2) cancelled or changed plans, (3) travel restrictions, (4) wearing masks, and (5) isolation and lack of contact with family and friends. Others express positive reactions and potential solutions for dealing with the pandemic, including family support, being inoculated with vaccines, staying at home or wearing masks, and volunteering. The second attribute was emojis, which describe the overall expression in the text, similar to topic modeling in the sense that both refer to the topics expressed in tweets. This provides another dimension for emojis, which have been used as a surrogate to collect emotion data [53,54]. Although this point is interesting to observe through this work, we leave it for future work to investigate in greater depth.

Sentiment analysis is also useful to gain insight into public opinion and the perceptions behind certain events. By analyzing the sentiments in our data, we found that most people maintained a positive attitude during the pandemic, which matches the conclusion of previous research [55], since they often posted information related to supportive policies such as social support and vaccination to boost confidence in the fight against COVID-19. Nevertheless, some people still expressed worry about the outbreak and developed negative feelings due to the deaths, isolation, and lockdown policies, which affected their normal lives.

From the topics extracted, we found that there are many distinct topics people focus on, including symptoms of COVID-19, vaccination, social media, government policies, and living conditions. The changing themes of social media reveal the impact of COVID-19 on people’s lives, shifting the discussion about daily life to the pandemic and policies.

In addition, the emojis used, the emotions expressed, and the topics discussed by people who are from different cities in the United Kingdom all differed because of various factors such as the environment in the city, the epidemic situation, policies, and hot spots. The findings reveal the complexity and diversity of people’s perceptions toward the COVID-19 pandemic, which indicates the need to keep track of public attitudes.

Limitations

This work is based on existing natural language processing methods that were used to analyze different attributes such as emotions, sentiment, and topics. However, these methods cannot guarantee that their predictions reflect the actual attribute. In addition, emotion and sentiment are subjective tasks, which makes them difficult to model and could in turn affect our interpretation as well as our results. Moreover, since our data were collected from Twitter with the use of specific keywords, it is possible that we missed other topics and viewpoints in online threads. Related discussions could also be drawn from other social media platforms (eg, Facebook, Reddit). In this respect, our data provide a partial sample of user interactions on Twitter. The methods are nevertheless applicable to other longitudinal data and social media platforms.

Conclusion

Our main contribution is a multimethod approach that provides insights into public sentiment and emotions in UK cities during the COVID-19 pandemic. Furthermore, our methods are location- and time-based, supporting a comparative analysis to track public concerns. Our analysis demonstrated that positive attitudes were common during the pandemic; optimism and anticipation were the dominant emotions. With the outbreak and epidemiological change, the government developed control measures and vaccination policies, and the topics also shifted over time. In addition, the comparative geolocation analysis revealed differences in the emotions expressed and topics discussed by people in different cities. Overall, our study shows that analyzing data from social media can help to better understand public emotions and concerns related to COVID-19 at the city level, which could potentially inform the development of acceptable policies.

Acknowledgments

This work is supported in part by funds from the Medical Research Council (MRC), UK MR/R022461/1, and the Alan Turing Institute, United Kingdom.

Abbreviations

CTM

combined topic modeling

NHS

National Health Service

Multimedia Appendix 1

Percentages of tweets in the data set associated with each individual emoji.

Multimedia Appendix 2

Examples of tweets expressing positive and negative reactions about COVID-19.

Footnotes

Authors' Contributions: HA curated the data for this study. HA and TZ designed methodologies, analyzed and interpreted the data and drafted the manuscript. SA edited the manuscript.

Conflicts of Interest: None declared.

References

  • 1.Shuja J, Alanazi E, Alasmary W, Alashaikh A. COVID-19 open source data sets: a comprehensive survey. Appl Intell. 2021;51(3):1296–1325. doi: 10.1007/s10489-020-01862-6. https://europepmc.org/abstract/MED/34764552 .1862 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Debata B, Patnaik P, Mishra A. COVID-19 pandemic! It's impact on people, economy, and environment. J Public Affairs. 2020 Sep 02;20:e2372. doi: 10.1002/pa.2372. [DOI] [Google Scholar]
  • 3.Hussain MW, Mirza T, Hassan MM. Impact of COVID-19 Pandemic on the Human Behavior. Int J Educ Manag Eng. 2020 Dec 08;10(6):35–61. doi: 10.5815/ijeme.2020.05.05. [DOI] [Google Scholar]
  • 4.Sparasci O, Bhui K, Biswas A, Chamberlain S, Dubicka B, Dudas R, Farooq S, Ford T, Husain N, Jones I, Killaspy H, Lee W, Lingford-Hughes A, Mulholland C, Rubinsztein J, Shankar R, Sharma A, Sinclair L, Stone J, Young A. Impact of COVID-19 on mental health research: is this the breaking point? Br J Psychiatry. 2022 Feb 17;:1–3. doi: 10.1192/bjp.2022.8.S0007125022000083 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Top 25 surprising Twitter statistics UK edition. cybercrew. 2022. [2022-09-28]. https://cybercrew.uk/blog/twitter-statistics-uk/
  • 6.Paul M, Dredze M. You are what you tweet: analyzing twitter for public health. Fifth International AAAI Conference on Web and Social Media; July 17-21, 2011; Barcelona, Spain. 2011. [Google Scholar]
  • 7.Sinnenberg L, Buttenheim AM, Padrez K, Mancheno C, Ungar L, Merchant RM. Twitter as a tool for health research: a systematic review. Am J Public Health. 2017 Jan;107(1):e1–e8. doi: 10.2105/ajph.2016.303512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Ainley E, Witwicki C, Tallett A, Graham C. Using Twitter comments to understand people's experiences of UK health care during the COVID-19 pandemic: thematic and sentiment analysis. J Med Internet Res. 2021 Oct 25;23(10):e31101. doi: 10.2196/31101. https://www.jmir.org/2021/10/e31101/ v23i10e31101 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Zhang C, Xu S, Li Z, Hu S. Understanding concerns, sentiments, and disparities among population groups during the COVID-19 pandemic via Twitter data mining: large-scale cross-sectional study. J Med Internet Res. 2021 Mar 05;23(3):e26482. doi: 10.2196/26482. https://www.jmir.org/2021/3/e26482/ v23i3e26482 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Saleh SN, Lehmann CU, McDonald SA, Basit MA, Medford RJ. Understanding public perception of coronavirus disease 2019 (COVID-19) social distancing on Twitter. Infect Control Hosp Epidemiol. 2021 Feb;42(2):131–138. doi: 10.1017/ice.2020.406. https://europepmc.org/abstract/MED/32758315 .S0899823X20004067 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Kleinberg B, van der Vegt I, Mozes M. Measuring emotions in the COVID-19 Real World Worry Dataset. 1st Workshop on NLP for COVID-19 at ACL 2020; July 5-10, 2020; online. 2020. [Google Scholar]
  • 12.Li X, Zhou M, Wu J. Analyzing COVID-19 on online social media: trends, sentiments and emotions. arXiv. 2020. [2022-09-28]. https://arxiv.org/abs/2005.14464 .
  • 13.Gupta RK, Vishwanath A, Yang Y. COVID-19 Twitter dataset with latent topics, sentiments and emotions attributes. arXiv. 2020. [2022-09-28]. https://arxiv.org/abs/2007.06954 .
  • 14.Glandt K, Khanal S, Li Y. Stance detection in COVID-19 tweets. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; August 1-6, 2021; Bangkok, Thailand. 2021. [DOI] [Google Scholar]
  • 15.Kruspe A, Häberle M, Kuhn I. Cross-language sentiment analysis of European Twitter messages during the COVID-19 pandemic. 1st Workshop on NLP for COVID-19 at ACL 2020; July 5-10, 2020; online. 2020. [Google Scholar]
  • 16.Hossain T, Logan IRL, Ugarte A. COVIDLies: Detecting COVID-19 misinformation on social media. 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020; December 2020; online. 2020. [DOI] [Google Scholar]
  • 17.Hu T, Wang S, Luo W, Zhang M, Huang X, Yan Y, Liu R, Ly K, Kacker V, She B, Li Z. Revealing public opinion towards COVID-19 vaccines with Twitter data in the United States: spatiotemporal perspective. J Med Internet Res. 2021 Sep 10;23(9):e30854. doi: 10.2196/30854. https://www.jmir.org/2021/9/e30854/ v23i9e30854 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, Ali A, Sheikh A. Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: observational study. J Med Internet Res. 2021 Apr 05;23(4):e26627. doi: 10.2196/26627. https://www.jmir.org/2021/4/e26627/ v23i4e26627 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Lyu JC, Han EL, Luli GK. COVID-19 vaccine-related discussion on Twitter: topic modeling and sentiment analysis. J Med Internet Res. 2021 Jun 29;23(6):e24435. doi: 10.2196/24435. https://www.jmir.org/2021/6/e24435/ v23i6e24435 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Yan C, Law M, Nguyen S, Cheung J, Kong J. Comparing public sentiment toward COVID-19 vaccines across Canadian cities: analysis of comments on Reddit. J Med Internet Res. 2021 Sep 24;23(9):e32685. doi: 10.2196/32685. https://www.jmir.org/2021/9/e32685/ v23i9e32685 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Addawood A, Alsuwailem A, Alohali A. Tracking and understanding public reaction during COVID-19: Saudi Arabia as a use case. 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020; December 2020; online. 2020. [DOI] [Google Scholar]
  • 22.Kwok SWH, Vadde SK, Wang G. Tweet topics and sentiments relating to COVID-19 vaccination among Australian Twitter users: machine learning analysis. J Med Internet Res. 2021 May 19;23(5):e26953. doi: 10.2196/26953. https://www.jmir.org/2021/5/e26953/ v23i5e26953 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Gorodnichenko Y, Pham T, Talavera O. Social media, sentiment and public opinions: evidence from #Brexit and #USElection. Eur Econ Rev. 2021 Jul;136:103772. doi: 10.1016/j.euroecorev.2021.103772. [DOI] [Google Scholar]
  • 24.Zunic A, Corcoran P, Spasic I. Sentiment analysis in health and well-being: systematic review. JMIR Med Inform. 2020 Jan 28;8(1):e16023. doi: 10.2196/16023. https://medinform.jmir.org/2020/1/e16023/ v8i1e16023 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Zhang L, Hall M, Bastola D. Utilizing Twitter data for analysis of chemotherapy. Int J Med Inform. 2018 Dec;120:92–100. doi: 10.1016/j.ijmedinf.2018.10.002.S1386-5056(18)30432-5 [DOI] [PubMed] [Google Scholar]
  • 26.Andalibi N, Buss J. The human in emotion recognition on social media: attitudes, outcomes, risks. CHI '20: 2020 CHI Conference on Human Factors in Computing Systems; April 25-30, 2020; Honolulu, HI. 2020. [DOI] [Google Scholar]
  • 27.Zhang T, Schoene AM, Ji S, Ananiadou S. Natural language processing applied to mental illness detection: a narrative review. NPJ Digit Med. 2022 Apr 08;5(1):46. doi: 10.1038/s41746-022-00589-7.10.1038/s41746-022-00589-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Ji S, Pan S, Li X, Cambria E, Long G, Huang Z. Suicidal ideation detection: a review of machine learning methods and applications. IEEE Trans Comput Soc Syst. 2021 Feb;8(1):214–226. doi: 10.1109/tcss.2020.3021467. [DOI] [Google Scholar]
  • 29.Weissenbacher D, Sarker A, Magge A. Overview of the Fourth Social Media Mining for Health (SMM4H) Shared Tasks at ACL 2019. Fourth Social Media Mining for Health Applications (# SMM4H) Workshop & Shared Task; August 2019; Florence, Italy. 2019. pp. 21–30. [DOI] [Google Scholar]
  • 30.Aragón ME, Monroy APL, González-Gurrola LC. Detecting depression in social media using fine-grained emotions. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; June 2019; Minneapolis, MN. 2019. pp. 1481–1486. [DOI] [Google Scholar]
  • 31.Chen X, Sykora MD, Jackson TW. What about mood swings? Identifying depression on twitter with temporal measures of emotions. WWW '18: The Web Conference 2018; April 2018; Lyon, France. 2018. pp. 1653–1660. [DOI] [Google Scholar]
  • 32.Korkontzelos I, Nikfarjam A, Shardlow M, Sarker A, Ananiadou S, Gonzalez GH. Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. J Biomed Inform. 2016 Aug;62:148–158. doi: 10.1016/j.jbi.2016.06.007. https://linkinghub.elsevier.com/retrieve/pii/S1532-0464(16)30050-8 .S1532-0464(16)30050-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Kramer ADI, Guillory JE, Hancock JT. Experimental evidence of massive-scale emotional contagion through social networks. Proc Natl Acad Sci U S A. 2014 Jun 17;111(24):8788–8790. doi: 10.1073/pnas.1320040111. https://www.pnas.org/doi/abs/10.1073/pnas.1320040111?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%3dpubmed .1320040111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Gao J, Zheng P, Jia Y, Chen H, Mao Y, Chen S, Wang Y, Fu H, Dai J. Mental health problems and social media exposure during COVID-19 outbreak. PLoS One. 2020;15(4):e0231924. doi: 10.1371/journal.pone.0231924. https://dx.plos.org/10.1371/journal.pone.0231924 .PONE-D-20-06332 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Bavel JJV, Baicker K, Boggio PS, Capraro V, Cichocka A, Cikara M, Crockett MJ, Crum AJ, Douglas KM, Druckman JN, Drury J, Dube O, Ellemers N, Finkel EJ, Fowler JH, Gelfand M, Han S, Haslam SA, Jetten J, Kitayama S, Mobbs D, Napper LE, Packer DJ, Pennycook G, Peters E, Petty RE, Rand DG, Reicher SD, Schnall S, Shariff A, Skitka LJ, Smith SS, Sunstein CR, Tabri N, Tucker JA, van der Linden S, van Lange P, Weeden KA, Wohl MJA, Zaki J, Zion SR, Willer R. Using social and behavioural science to support COVID-19 pandemic response. Nat Hum Behav. 2020 May;4(5):460–471. doi: 10.1038/s41562-020-0884-z.10.1038/s41562-020-0884-z [DOI] [PubMed] [Google Scholar]
  • 36.Tsao SF, Chen H, Tisseverasinghe T, Yang Y, Li L, Butt ZA. What social media told us in the time of COVID-19: a scoping review. Lancet Digit Health. 2021 Mar;3(3):e175–e194. doi: 10.1016/S2589-7500(20)30315-0. https://linkinghub.elsevier.com/retrieve/pii/S2589-7500(20)30315-0 .S2589-7500(20)30315-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Boon-Itt S, Skunkan Y. Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health Surveill. 2020 Nov 11;6(4):e21978. doi: 10.2196/21978. https://publichealth.jmir.org/2020/4/e21978/ v6i4e21978 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Cambria E, Li Y, Xing FZ. SenticNet 6: Ensemble application of symbolic and subsymbolic AI for sentiment analysis. CIKM '20: 29th ACM International Conference on Information & Knowledge Management; October 19, 2020; online. 2020. [DOI] [Google Scholar]
  • 39.Alhuzali H, Ananiadou S. SpanEmo: Casting multi-label emotion classification as span-prediction. EACL 2021: 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume; April 19-23, 2021; online. 2021. pp. 1573–1584. [DOI] [Google Scholar]
  • 40.Bianchi F, Terragni S, Hovy D. Pre-training is a hot topic: contextualized document embeddings improve topic coherence. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; August 2021; online. 2021. pp. 759–766. [DOI] [Google Scholar]
  • 41.geopy 2.2.0. [2022-09-28]. https://pypi.org/project/geopy/
  • 42.Welcome to GeoPy’s documentation! GeoPy. [2022-09-28]. https://geopy.readthedocs.io/en/stable/#nominatim .
  • 43.United Kingdom: Countries and Major Cities. City Population. [2022-09-28]. https://www.citypopulation.de/en/uk/cities/
  • 44.Baziotis C, Pelekis N, Doulkeridis C. DataStories at SemEval-2017 Task 4: deep LSTM with attention for message-level and topic-based sentiment analysis. 11th International Workshop on Semantic Evaluation (SemEval-2017); August 2017; Vancouver, BC. 2017. [DOI] [Google Scholar]
  • 45.Poria S, Cambria E, Gelbukh A, Bisio F, Hussain A. Sentiment data flow analysis by means of dynamic linguistic patterns. IEEE Comput Intell Mag. 2015 Nov;10(4):26–36. doi: 10.1109/mci.2015.2471215. [DOI] [Google Scholar]
  • 46.Mohammad S, Bravo-Marquez F, Salameh M, Kiritchenko S. SemEval-2018 Task 1: Affect in Tweets. 12th International Workshop on Semantic Evaluation; June 2018; New Orleans, LA. 2018. [DOI] [Google Scholar]
  • 47.Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993–1022. [Google Scholar]
  • 48.Benton A, Coppersmith G, Dredze M. Ethical research protocols for social media health research. First ACL Workshop on Ethics in Natural Language Processing; April 2017; Valencia, Spain. 2017. pp. 94–102. [DOI] [Google Scholar]
  • 49.Mohammad S. Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. 56th Annual Meeting of the Association for Computational Linguistics; July 2018; Melbourne, Australia. 2018. pp. 174–184. [DOI] [Google Scholar]
  • 50.Xu P, Liu Z, Winata GI, Lin Z, Fung P. Emograph: Capturing emotion correlations using graph networks. arXiv. 2020. [2022-09-28]. http://128.84.4.34/abs/2008.09378 .
  • 51.Saif M, Kiritchenko S. Understanding emotions: a dataset of tweets to study interactions between affect categories. Eleventh International Conference on Language Resources and Evaluation; May 2018; Miyazaki, Japan. 2018. [Google Scholar]
  • 52.Barletta WA. The influence of SARS-CoV-2 variants on national case-fatality rates: correlation and validation study. JMIRx Med. 2022;3(2):e32935. doi: 10.2196/32935. https://xmed.jmir.org/2022/2/e32935/ v3i2e32935 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Shoeb AAM, de Melo G. EmoTag1200: Understanding the association between emojis and emotions. 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP); November 2020; online. 2020. pp. 8957–8967. [DOI] [Google Scholar]
  • 54.Felbo B, Mislove A, Søgaard A. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. Conference on Empirical Methods in Natural Language Processing; September 2017; Copenhagen, Denmark. 2017. pp. 1615–1625. [DOI] [Google Scholar]
  • 55.Hung M, Lauren E, Hon ES, Birmingham WC, Xu J, Su S, Hon SD, Park J, Dang P, Lipsky MS. Social network analysis of COVID-19 sentiments: application of artificial intelligence. J Med Internet Res. 2020 Aug 18;22(8):e22590. doi: 10.2196/22590. https://www.jmir.org/2020/8/e22590/ v22i8e22590 [DOI] [PMC free article] [PubMed] [Google Scholar]
