Abstract
Objectives. To illustrate the spatiotemporal distribution of geolocated tweets that contain anti-Asian hate language in the contiguous United States during the early phase of the COVID-19 pandemic.
Methods. We used a data set of geolocated tweets that matched keywords reflecting COVID-19 and anti-Asian hate and identified geographical clusters using the space-time scan statistic with a Bernoulli model.
Results. Anti-Asian hate language surged between January and March 2020. We found clusters of hate across the contiguous United States. The strongest cluster consisted of a single county (Ross County, Ohio), where the proportion of hateful tweets was 312.13 times higher than for the rest of the country.
Conclusions. Anti-Asian hate on Twitter exhibits a significantly clustered spatiotemporal distribution. Clusters vary in size, duration, strength, and location and are scattered across the entire contiguous United States.
Public Health Implications. Our results can inform decision-makers in public health and safety for allocating resources for place-based preparedness and response for pandemic-induced racism as a public health threat. (Am J Public Health. 2022;112(4):646–649. https://doi.org/10.2105/AJPH.2021.306653)
Since the first confirmed case of COVID-19 in the United States on January 19, 2020,1 anti-Asian racist and xenophobic rhetoric has surged on social media,2,3 followed by acts of discrimination and harassment against Asians and Asian Americans in the United States. Between March 19 and May 13, 2020, 1843 hate incidents were reported to the Stop AAPI (Asian American and Pacific Islander) Hate reporting center.4 The surge of anti-Asian hate is deeply rooted in the Yellow Peril ideology, which racializes Asians as a threat to US and Western culture,5 including reimagining Asians as a diseased public health threat.6 What remains unexplored is the spatial concentration of anti-Asian sentiment on social media. Although the online environment of social media is aspatial, spatiotemporal information from social media can provide critical insights about relationships between online sentiment and physical localities,7 and may reflect spatial differences in social and cultural norms and historical contexts of hate activities.8
This study aims to assess the spatial and temporal distributions of tweets containing hateful language toward Asians and Asian Americans in the United States from November 2019 to May 2020, by identifying geographical clusters across US counties. To our knowledge, this is the first spatiotemporal assessment of anti-Asian hate on social media during the early phase of the pandemic.
METHODS
We purchased 4 234 694 geolocated tweets from Twitter. We included tweets in English, located in the contiguous United States, that matched a list of COVID-19 keywords (e.g., “covid2019,” “SARSCoV2”; Table A, available as a supplement to the online version of this article at http://www.ajph.org) and that were sent between November 1, 2019, and May 15, 2020. We excluded tweets with imprecise location information (i.e., tweets that could only be matched at the state level). We classified the remaining 3 274 614 tweets as hateful or nonhateful based on the presence of additional keywords related to anti-Asian hate in the tweet body (e.g., “kungflu,” “Wuhanvirus”; Table B, available as a supplement to the online version of this article at http://www.ajph.org), and assigned them to US counties. An example of a hateful tweet is, “The true spelling of the coronavirus is #Wuhanvirus.” The keywords represent anti-Asian hate in general as well as in the context of COVID-19; we sourced them from hatebase.org and 2 studies on similar topics.3,9 We chose this approach over more sophisticated machine learning classifiers because it is simple and does not require labeled training data. In addition, we assessed the accuracy of our classifier against a set of 500 manually labeled tweets. The county-level counts of hateful and nonhateful tweets correspond to case–control data that can be modeled using the Bernoulli distribution.
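The keyword-based classification and county-level aggregation described above can be sketched as follows. This is a minimal illustration, not the study's code: the keyword tuple contains only the two example terms quoted in the text (the full list is in supplementary Table B), the matching rule (case-insensitive substring search) is an assumption, and the function names and county identifiers are hypothetical.

```python
# Illustrative subset only: the two example terms quoted in the text.
# The study's full keyword list is in its supplementary Table B.
HATE_KEYWORDS = ("kungflu", "wuhanvirus")


def is_hateful(text, keywords=HATE_KEYWORDS):
    """Return True if any keyword occurs in the tweet body (case-insensitive).

    Substring matching is an assumption; the source does not state the rule.
    """
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def county_counts(tweets):
    """Aggregate (county_id, text) pairs into per-county [cases, controls].

    Cases are hateful tweets and controls are nonhateful tweets, which is
    the case-control layout used by a Bernoulli model.
    """
    counts = {}
    for county_id, text in tweets:
        cases_controls = counts.setdefault(county_id, [0, 0])
        cases_controls[0 if is_hateful(text) else 1] += 1
    return counts


# Hypothetical tweets; the county identifiers are made up for illustration.
sample = [
    ("county_A", "The true spelling of the coronavirus is #Wuhanvirus."),
    ("county_A", "Stay home and wash your hands #covid2019"),
    ("county_B", "New SARSCoV2 testing site opens downtown"),
]
result = county_counts(sample)
print(result)
```

A rule of this kind looks only at whether a term is present, so it cannot distinguish a tweet that uses a term from one that criticizes its use.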
To identify spatiotemporal clusters of anti-Asian hate, we employed the space-time scan statistic (STSS) with a Bernoulli model.10 The STSS finds the most likely cluster of hateful tweets using spatial and temporal scanning windows (maximum window size: 5% of population at risk, 50% of study duration) with significance level P < .05. We visualized the spatial distribution of anti-Asian hate as a choropleth map of the relative risk (RR, the proportion of hateful tweets inside a region divided by the proportion outside). Lastly, we supplemented the map with circular clusters identified by STSS.
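For a single candidate window, the RR defined above and the log-likelihood ratio of the Bernoulli scan statistic (as given by Kulldorff10) can be computed as in the sketch below. The function and argument names are hypothetical, and the counts in the usage lines are made up. The sketch covers only the scoring of one window; enumerating space-time windows and testing significance (commonly done by Monte Carlo replication in scan-statistic software) are not shown.

```python
import math


def relative_risk(c_in, n_in, c_total, n_total):
    """Proportion of hateful tweets inside a window divided by that outside.

    Raises ZeroDivisionError if there are no hateful tweets outside.
    """
    p_in = c_in / n_in
    p_out = (c_total - c_in) / (n_total - n_in)
    return p_in / p_out


def _xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c_in, n_in, c_total, n_total):
    """Log-likelihood ratio of one candidate window under the Bernoulli model.

    c_in, n_in: hateful and all tweets inside the window.
    c_total, n_total: hateful and all tweets in the whole study area.
    Returns 0.0 when the inside proportion does not exceed the outside one,
    because only windows with elevated proportions are of interest.
    """
    c_out, n_out = c_total - c_in, n_total - n_in
    p_in, p_out, p_all = c_in / n_in, c_out / n_out, c_total / n_total
    if p_in <= p_out:
        return 0.0
    alternative = (_xlogy(c_in, p_in) + _xlogy(n_in - c_in, 1 - p_in)
                   + _xlogy(c_out, p_out) + _xlogy(n_out - c_out, 1 - p_out))
    null = _xlogy(c_total, p_all) + _xlogy(n_total - c_total, 1 - p_all)
    return alternative - null


# Hypothetical counts: 30 of 1000 tweets inside the window are hateful,
# compared with 100 of 10000 in the whole study area.
rr = relative_risk(30, 1000, 100, 10000)
llr = bernoulli_llr(30, 1000, 100, 10000)
print(f"RR = {rr:.2f}, LLR = {llr:.2f}")
```

The window with the largest log-likelihood ratio is the most likely cluster; the RR expresses its strength.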
RESULTS
The keyword-based tweet classification resulted in 10 823 tweets (0.33%) that included anti-Asian hate language. Their temporal distribution indicated (1) low hate from November 2019 to January 2020 (0%–0.1% of daily tweets hateful); (2) a low total number of daily tweets but a high percentage of hateful tweets in January 2020 (1.5% of 2884 daily tweets hateful); (3) a second peak in mid-March 2020 (1% of 90 075 daily tweets hateful); and (4) a subsequent decline. An accuracy assessment of our classifier resulted in interrater agreement of 94% and accuracy of 90%.
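The two assessment figures can be read as simple proportions, as in the sketch below. The source does not say how interrater agreement was measured, so treating it as raw percent agreement (rather than, for example, a chance-corrected coefficient) is an assumption, and all labels shown are hypothetical.

```python
def percent_agreement(labels_a, labels_b):
    """Share of items on which two label sequences agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical labels (1 = hateful, 0 = nonhateful) for ten tweets.
rater_1 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
rater_2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
keyword_classifier = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]

# Agreement between the two human raters.
interrater = percent_agreement(rater_1, rater_2)
# Accuracy of the classifier, taking rater 1 as the reference labels.
accuracy = percent_agreement(keyword_classifier, rater_1)
print(interrater, accuracy)
```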
The STSS identified 15 clusters of anti-Asian hate (Figure 1; Table C, available as a supplement to the online version of this article at http://www.ajph.org). For example, cluster 9 included 21 counties in Connecticut, New York, and Massachusetts and had an RR of 3.39, which means the proportion of hateful tweets inside cluster 9 was 3.39 times that outside. We found the highest RRs in small clusters consisting of 1 county each (cluster 1, RR = 312.13; cluster 2, RR = 14.24; cluster 7, RR = 30.05; cluster 8, RR = 138.71). Therefore, cluster 1 (Ross County, Ohio) was the strongest cluster. These single-county clusters had low total numbers of tweets (62, 3793, 364, and 46, respectively), whereas cluster 3 (RR = 3.17) had the highest total (63 349). Given the extreme variations in size, location, and proportion of hateful tweets, cluster statistics varied considerably.
FIGURE 1— The Spatial Distribution of Hateful Tweets Against Asians and Asian Americans: United States, November 2019–May 2020
Note. Map of the relative risk (RR), augmented by clusters identified by the space-time scan statistic.
DISCUSSION
In this study, we classified geolocated tweets as hateful or nonhateful toward Asians based on a set of keywords. We described the spatiotemporal distribution of anti-Asian tweets and identified statistically significant clusters (Figure 1, Table C). The main strength of our approach is the ability to delineate areas and time periods that exhibited strong anti-Asian language. In addition, Figure 1 illustrates statistically significant clusters of increased anti-Asian tweets as well as within-cluster variation.
Significant clusters included rural places, as well as high–population density cities in the United States. It is worth noting that our clusters at least partially included the most populous urban regions, often together with their surrounding suburban and rural areas. In summary, clusters are scattered and there is no identifiable pattern of anti-Asian hate along urban–rural and geographic gradients. Further analysis including demographic and socioeconomic factors may explain cluster locations. Our findings differ from those of similar studies,3,9 which identified a spike in the number of anti-Asian tweets starting in March 2020, but not in January 2020. Whereas those studies focused on the absolute number of tweets, our graphs and analyses are based on the proportion of tweets that were hateful.
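The contrast between counts and proportions can be illustrated with the daily figures reported in the Results. The sketch below derives the approximate number of hateful tweets implied by each peak; because the reported percentages are rounded, the derived counts are rough, and the variable and function names are hypothetical.

```python
# Daily figures as reported in the Results. The percentages are rounded
# there, so the implied counts below are approximate.
january_peak = {"daily_tweets": 2884, "share_hateful": 0.015}
march_peak = {"daily_tweets": 90075, "share_hateful": 0.01}


def implied_hateful_count(day):
    """Approximate number of hateful tweets implied by a daily share."""
    return day["daily_tweets"] * day["share_hateful"]


jan_count = implied_hateful_count(january_peak)
mar_count = implied_hateful_count(march_peak)
print(f"January: about {jan_count:.0f} hateful tweets per day")
print(f"March: about {mar_count:.0f} hateful tweets per day")
```

By share, the January peak is the higher of the two, while by implied count the March peak is roughly 20 times larger, which is consistent with count-based studies detecting a spike only in March.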
A limitation of our approach is our keyword-based classifier. A tweet stating “It is wrong to blame China” would be falsely classified as hateful; use of machine learning–based classifiers may address this problem.9 Another limitation stems from the use of geolocated tweets, which have biases regarding spatial distribution, demographics, and topics.11 Lastly, our study is exploratory in nature, and the results serve as a pointer towards areas that exhibit anti-Asian sentiment, which should be further analyzed in follow-up studies.
PUBLIC HEALTH IMPLICATIONS
The ability to clearly delineate areas (clusters) of anti-Asian hate allows for designing and implementing place-based targeted response measures, such as awareness campaigns, adjustments of communication strategies, hate crime prevention, and public safety resource allocation. A next step is to better understand the relationship between anti-Asian hate sentiment and hate incidents. The STSS has proven to be suitable for monitoring hateful tweets because of its 4 outputs: the geographic extents of clusters, duration, strength (RR), and statistical significance. Such monitoring can be conducted in real time as new data become available.12
Hate, xenophobia, and extremist White nationalism within the United States have grown in recent years. Conspiracy theories are rampant and work to create “enemies,” such as the scapegoating of Asians and Asian Americans during the COVID-19 pandemic. This is an example of hate campaigns toward groups of people, and the results of our study may contribute to removing negative associations from Asian Americans during the ongoing public health threat.
Although racism has been recognized as a public health threat,13 further research is needed to assess how anti-Asian hate in social media affects individual and community health. The deleterious effects of hateful racist and xenophobic language on the mental health of Asians and Asian Americans have been documented for COVID-19,14 but whether such effects are long lasting is unknown. Health departments may take additional steps to prevent and mitigate hate-related health needs (e.g., by increasing culturally and linguistically appropriate counseling services or trauma support for individuals in affected areas). This place-based approach can be especially helpful for Asians and Asian Americans in areas with small local Asian populations without appropriate services available otherwise.
ACKNOWLEDGMENTS
This work was supported by the Immunology, Inflammation and Infectious Diseases Initiative and the Office of the Vice President for Research of the University of Utah.
We thank graduate students Serena Madsen, Ronae Matriano, Katherine Chipman, Makaio Kimbrough, and Marco Allain for their indispensable assistance.
CONFLICTS OF INTEREST
The authors report no conflicts of interest.
HUMAN PARTICIPANT PROTECTION
This research was declared exempt from University of Utah, Salt Lake City institutional review board review per protocol IRB_00132748.
Footnotes
See also Hswen, p. 545.
REFERENCES
- 1. Holshue ML, DeBolt C, Lindquist S, et al. First case of 2019 novel coronavirus in the United States. N Engl J Med. 2020;382(10):929–936. doi: 10.1056/NEJMoa2001191.
- 2. Nguyen TT, Criss S, Pallavi D, et al. Exploring US shifts in anti-Asian sentiment with the emergence of COVID-19. Int J Environ Res Public Health. 2020;17(19):7032–7045. doi: 10.3390/ijerph17197032.
- 3. Hswen Y, Xu X, Hing A, Hawkins JB, Brownstein JS, Gee GC. Association of “#covid19” versus “#chinesevirus” with anti-Asian sentiments on Twitter: March 9–23, 2020. Am J Public Health. 2021;111(5):956–964. doi: 10.2105/AJPH.2021.306154.
- 4. Borja M, Jeung R, Yellow Horse A, et al. 2020. https://caasf.org/2020/06/anti-chinese-rhetoric-tied-to-racism-against-asian-americans-stop-aapi-hate-report
- 5. Le TK, Cha L, Han HR, Tseng W. Anti-Asian xenophobia and Asian American COVID-19 disparities. Am J Public Health. 2020;110(9):1371–1373. doi: 10.2105/AJPH.2020.305846.
- 6. Molina N. Fit to be Citizens? Public Health and Race in Los Angeles, 1879–1939. Berkeley, CA: University of California Press; 2006.
- 7. Li L, Goodchild MF, Xu B. Spatial, temporal, and socioeconomic patterns in the use of Twitter and Flickr. Cartogr Geogr Inf Sci. 2013;40(2):61–77. doi: 10.1080/15230406.2013.777139.
- 8. Medina RM, Nicolosi E, Brewer S, Linke AM. Geographies of organized hate in America: a regional analysis. Ann Am Assoc Geogr. 2018;108(4):1006–1021. doi: 10.1080/24694452.2017.1411247.
- 9. He B, Ziems C, Soni S, Ramakrishnan N, Yang D, Kumar S. Racism is a virus: anti-Asian hate and counterspeech in social media during the COVID-19 crisis. arXiv. arXiv:2005.12423v2.
- 10. Kulldorff M. A spatial scan statistic. Commun Stat Theory Methods. 1997;26(6):1481–1496. doi: 10.1080/03610929708831995.
- 11. Sloan L, Morgan J. Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on Twitter. PLoS One. 2015;10(11):e0142209. doi: 10.1371/journal.pone.0142209.
- 12. Kulldorff M. Prospective time periodic geographical disease surveillance using a scan statistic. J R Stat Soc Ser A Stat Soc. 2001;164(1):61–72. doi: 10.1111/1467-985X.00186.
- 13. American Medical. https://www.ama-assn.org/delivering-care/health-equity/ama-racism-threat-public-health
- 14. Quintero Johnson JM, Saleem M, Tang L, Ramasubramanian S, Riewestahl E. Media use during COVID-19: an investigation of negative effects on the mental health of Asian versus white Americans. Front Commun (Lausanne). 2021;6:638031. doi: 10.3389/fcomm.2021.638031.

