Abstract
The COVID-19 pandemic brought about a massive wave of disinformation, exacerbating polarization in the increasingly divided landscape of online discourse. In this context, popular social media users play a major role, as they have the ability to broadcast messages to large audiences and influence public opinion. In this article, we make use of openly available data to study the behavior of popular users discussing the pandemic on Twitter. We tackle the issue from a network perspective, considering users as nodes and following relationships as directed edges. The resulting network structure is modeled by embedding the actors in a latent social space, where users closer to one another have a higher probability of following each other. The results suggest the existence of two distinct communities, which can be interpreted as “generally pro” and “generally against” vaccine mandates, corroborating existing evidence on the pervasiveness of echo chambers on the platform. By focusing on a number of notable users, such as politicians, activists, and news outlets, we further show that the two groups are not entirely homogeneous, and that more than just the two poles are represented. On the contrary, the latent space captures an entire spectrum of beliefs between the two extremes, demonstrating that polarization, while present, is not the only driver of the network, and that more moderate, “central” users are key players in the discussion.
Keywords: polarization, COVID-19, network analysis, Twitter, latent space models
Significance Statement.
Popular social media users play a major role in the COVID-19 infodemic, as they can influence public opinion through their massive reach. Using state-of-the-art statistical network modeling techniques, we embed popular Twitter users discussing the pandemic in a latent social space, producing a map of the COVID-19 social media universe. The results suggest the existence of two distinct communities, which respectively favor and oppose vaccine mandates, thus corroborating the presence of echo chamber effects on the platform. We further show that the two groups are not entirely homogeneous: instead, the social map describes an entire spectrum of beliefs between the two extremes, demonstrating that polarization is not the only relevant factor, and that moderate users are central to the discussion.
Introduction
COVID-19 dramatically affected the lives of billions of people around the globe. Given its massive impact, the pandemic naturally assumed a central role in both private and public discourse, dominating the discussion on- and offline. Social media, in particular, has been extensively used to exchange pandemic-related information as well as disinformation, leading to what has been defined as an “infodemic” alongside the pandemic (1–3). This context saw the emergence of pandemic-related social media elites, accounts with a large number of followers that regularly discuss the pandemic and the issues surrounding it (4, 5). These actors play a central role in public communication, as they can shape popular sentiment and public discourse and thus potentially influence political decision-making (6). This is especially true in a setting characterized by increasing polarization and historically low trust in mainstream news, which allows politically and financially motivated actors to emerge (7–10). Because of this, understanding the role that popular social media users play and the ways in which they operate is crucial for tackling arising challenges in public communication (11). In this article, we tackle this issue with the aim of drawing an explanatory map of the network of COVID-19 Twitter elites. We first identify users who are popular in the discussion related to the pandemic on Twitter, and go on to study their (directed) network, where an edge between two actors is present if one follows the other. To analyze the resulting network structure we make use of latent space models, which postulate that nodes in the network are embedded in a latent social space, where the probability for two actors to connect is inversely related to their distance within the space (12).
We, in particular, make use of the latent cluster random effects model, which incorporates model-based clustering, allowing it to identify cohesive communities in the network, as well as additional nodal parameters to account for actor-specific heterogeneity in the propensity to form edges (13). The results suggest that the network can be partitioned into two macro-communities. By focusing on a number of notable users, such as politicians, activists, and news outlets, we show how the two communities can be interpreted as “generally pro” and “generally against” pandemic containment measures and vaccine mandates. This finding supports the extensive body of literature that demonstrates the existence of significant polarization on social media (14–18). The central role of polarization has also been demonstrated for the specific case of pandemic-related online conversations, especially with respect to opinions on vaccination (19–23). However, our results also demonstrate how polarization, while prevalent, is far from being the only driver in the network. The continuous latent space enables us to see that substantial within-cluster heterogeneity is present: not all users in the two communities have the same opinions, and not just the two polar opposites are represented. On the contrary, a full spectrum of beliefs between the two poles is found. In particular, more radical users are found to be positioned towards the extremes of the latent space, while more moderate and neutral actors, such as health ministers and news outlets, are closer to the center. These central users thus occupy a uniquely powerful position, as they can act as a bridge between the two communities and thereby mitigate polarization. In addition to these results, our analysis demonstrates how, by making use of latent space models, it is possible to accurately map the COVID-19 Twitter landscape by only modeling information on who follows whom within the elite network. 
This finding highlights the strength and the pervasiveness of echo chamber effects on the platform, and showcases the power of latent space network models for studying communication on social media.
Data and methods
Identifying the network of COVID-19 Twitter elites
Social media elites can be broadly understood as users with the ability to influence (24). The term typically refers to a group of highly influential and popular users with considerable reach who significantly impact conversations, trends, and narratives circulating on social media. These users often include celebrities, politicians, journalists, thought leaders and influencers, who have a large and engaged audience and are frequently retweeted, quoted, and mentioned by others. While informative, this characterization is quite broad and does not indicate a unique way of identifying elites in practice. Operational definitions for empirical applications are often based on reach metrics, such as the number of followers of each user, and on engagement metrics, such as likes, shares, quotes, and replies. As our focus lies on analyzing the behavior of actors who actively engage in the discussion of the pandemic and who exert significant influence on the conversation, we here choose to identify elites as those who authored the most popular tweets, where the popularity of a tweet is given by the sum of its likes, replies, and retweets (including quotes). Based on this characterization, we first need to identify popular tweets discussing COVID-19 and then relate those tweets to their authors. More motivation and details on this choice, as well as robustness checks, are included in the Supplementary material. For our study, we make use of the COVID-19 Twitter dataset published by Banda et al. (25), which comprises IDs of tweets containing pandemic-related keywords from January 1st, 2020 onward. These keywords were handpicked and continuously tracked to provide a global and real-time overview of the chatter related to the COVID-19 pandemic. The dataset was collected using Twitter’s streaming API, which allows free access to a random 1% sample of publicly available tweets in real time (26).
At the time of the analysis, the entire dataset contains about 1.32 billion tweet IDs, representing both tweets and retweets in all languages, 340 million of which are unique (without retweets). Each tweet’s creation time and language are also provided. Using the tweet IDs, we are then able to recover additional information on the tweets, such as the text, the author, and metrics such as like and retweet counts.
As a global platform, Twitter is host to speakers of many different languages, which induce the formation of largely separate communities. Since our goal is to map the latent space of COVID-19 elites, we choose to limit our analysis to a single language, as doing otherwise would return a fragmented map shaped mainly by language. In principle, it is possible to work with any single language, and we here opt for using tweets in German. The choice is motivated by the combination of two facts: Firstly, German is predominantly spoken by people from Germany, and to a smaller extent from Austria and parts of Switzerland, thereby guaranteeing a reasonable degree of geographical homogeneity. This prevents the estimated latent positions of the actors (and the resulting clusters) from being predominantly driven by their geographical locations. Secondly, German is used by a relevant proportion of the Twitter user base, allowing for a more than sufficient sample size. As the first COVID-19 vaccines started to be available to the public towards the very end of 2020, and given that one of the points we are most interested in investigating is attitude towards vaccination, we limit our sample to 2021 only, spanning from January 1st to December 31st. Considering all tweets in German from 2021 results in a total of 1.51 million unique tweets from 184,406 accounts. The data, sketched in Table 1, allow us to pinpoint popular users by looking at the authors of tweets with the highest interaction metrics. More specifically, we classify a user as elite if they authored a tweet that achieved a popularity score of at least 2000, where we define popularity as the sum of likes, replies, and retweets (including quotes) gathered. This threshold results in 1024 popular tweets spanning all months of 2021, with each month represented by 53–156 tweets. 
Those 1024 tweets were produced by 372 users, 31.7% of which were granted verified status by Twitter, meaning that the platform deemed them both authentic and of public interest (27). In contrast, only 2.4% of the user base in the initial sample was verified. This confirms that more notable accounts and public figures are, on average, more central to the discussion, as we would expect. The bar plot in Fig. 1 depicts the number of tweets authored by the top 10 most popular users in our final sample, displayed by their Twitter usernames. From it, it is apparent how certain actors play a very prominent role in the conversation, with some accounts having authored more than 100 popular tweets in our 1% sample, meaning that one can expect them to have as much as 100 times more than that overall. This tells us how truly influential elites can be on Twitter, and also indicates that, given the sheer amount of popular tweets by the most prominent accounts, it is quite likely that they will be captured in our 1% sample.
Table 1.
Structure of the analyzed dataset. Only columns relevant to our study are displayed.
Tweet ID | Author | Likes | Replies | Retweets |
---|---|---|---|---|
138712… | AnikaBlub | 1,162 | 61 | 53 |
135224… | goetageblatt | 1 | 2 | 1 |
140697… | galottom | 1 | 0 | 0 |
146632… | 1_FCM | 171 | 26 | 35 |
135269… | covid_watch | 0 | 0 | 0 |
… | … | … | … | … |
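The elite selection rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline actually used in the study: the dictionary keys mirror the columns of Table 1, the toy rows and user names (`user_a`, `user_b`, `user_c`) are invented, and we assume that the retweet count already includes quotes.

```python
from collections import Counter

# Threshold stated in the text: a tweet is "popular" at a score of at least 2000.
POPULARITY_THRESHOLD = 2000


def popularity(tweet):
    """Popularity score: likes + replies + retweets.

    Assumption: the 'retweets' field already includes quote tweets.
    """
    return tweet["likes"] + tweet["replies"] + tweet["retweets"]


def find_elites(tweets, threshold=POPULARITY_THRESHOLD):
    """Map each elite author to their number of popular tweets."""
    counts = Counter()
    for tweet in tweets:
        if popularity(tweet) >= threshold:
            counts[tweet["author"]] += 1
    return counts


# Hypothetical toy rows, not taken from the dataset.
tweets = [
    {"author": "user_a", "likes": 1900, "replies": 80, "retweets": 40},
    {"author": "user_a", "likes": 2500, "replies": 10, "retweets": 300},
    {"author": "user_b", "likes": 1162, "replies": 61, "retweets": 53},
    {"author": "user_c", "likes": 1000, "replies": 500, "retweets": 500},
]

elites = find_elites(tweets)
print(elites.most_common())
```

A user counts as elite as soon as a single tweet reaches the threshold, so the counter also provides the per-user tallies of popular tweets of the kind summarized in Fig. 1.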
Fig. 1.
Number of tweets authored by the 10 most popular users in our sample.
After pinpointing these accounts as COVID-19 elites, we are able to define their following network in a natural way. Specifically, we consider the users as the nodes, and establish that a (directed) edge from actor i to actor j is present if, at the time of the analysis, i follows j. After removing the only nine users with no connections, the resulting network is composed of 363 nodes connected by a total of 12,182 edges, and is visualized in Fig. 2. From the plot, it is immediately apparent that the network is quite dense: In fact, 9.2% of all possible edges are observed. Given that the network is composed of users who produced popular tweets about the same topic, the fact that many of them follow each other makes intuitive sense. Moreover, from the graph representation, laid out using a variant of the Yifan Hu force-directed graph drawing algorithm (28), the network seems to be approximately split into two main groups of different sizes. This already gives a first impression of the two main poles in the network, which will be investigated in more detail in the Results section.
Fig. 2.
Graphical representation of the network of COVID-19 elites on German-speaking Twitter.
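As a rough illustration of how the following network and its density can be derived, the sketch below builds directed edges from follow lists, drops isolated users, and computes the share of possible directed edges that are observed. The input format (a dictionary mapping each user to the set of accounts they follow) and the toy users are assumptions made for the example; with the node and edge counts reported in the text, the same formula gives a density of a little over 9%.

```python
def build_follow_edges(follows, elites):
    """Directed edges (i, j) meaning 'i follows j', restricted to elite users.

    `follows` is a hypothetical mapping: user -> set of accounts they follow.
    """
    elite_set = set(elites)
    return {
        (i, j)
        for i in elite_set
        for j in follows.get(i, ())
        if j in elite_set and j != i
    }


def drop_isolates(nodes, edges):
    """Keep only nodes that appear in at least one edge."""
    connected = {user for edge in edges for user in edge}
    return [node for node in nodes if node in connected]


def directed_density(n_nodes, n_edges):
    """Observed edges divided by the n * (n - 1) possible directed edges."""
    return n_edges / (n_nodes * (n_nodes - 1))


# Toy example: "x" is followed by "a" but is not an elite, "d" is isolated.
follows = {"a": {"b", "c", "x"}, "b": {"a"}, "c": set(), "d": set()}
elites = ["a", "b", "c", "d"]

edges = build_follow_edges(follows, elites)
nodes = drop_isolates(elites, edges)
print(nodes, directed_density(len(nodes), len(edges)))

# Density implied by the counts reported in the text (363 nodes, 12,182 edges).
print(round(directed_density(363, 12182), 4))
```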
Latent space models for social network data
To model the network data, we make use of the latent cluster random effects model for social networks (13). This model is part of the general family of latent space models, originating from the latent distance model proposed by Hoff et al. (12). Latent space network models postulate that each actor has an unobserved position in a d-dimensional Euclidean latent social space, and that the probability for two actors to form an edge is inversely related to their distance in the space. This family of models is particularly suitable for social networks, in which mechanisms such as homophily and triadic closure often play a major role (29). Handcock et al. (30) added the idea of model-based clustering to the original latent distance model, allowing for the actors’ positions in the latent space to come from a mixture of normal distributions, where each mixture component represents a cluster. Krivitsky et al. (13) further extend this by adding nodal random effects to control for actor-specific heterogeneity in the propensity to form edges. More precisely, without the inclusion of nodal or edgewise covariates, the model specifies the probability of an edge between nodes i and j through:
$$\operatorname{logit} P(Y_{ij} = 1 \mid z_i, z_j, \beta, \delta_i, \gamma_j) = \beta - \lVert z_i - z_j \rVert + \delta_i + \gamma_j \qquad (1)$$
where $Y_{ij}$ indicates the presence of an edge from i to j, $z_i, z_j \in \mathbb{R}^d$ are the latent positions of the nodes in the d-dimensional latent space, $\beta$ is an intercept, and $\delta_i$ and $\gamma_j$ are node-specific sender and receiver effects that account for the individual users’ propensity of following or being followed, respectively. Here, the latent positions $z_i$ are assumed to originate from a finite spherical multivariate mixture of independent normal distributions, and the random effects $\delta_i$ and $\gamma_i$ are assumed to be drawn independently from normal distributions with mean 0 and variances $\sigma_\delta^2$ and $\sigma_\gamma^2$, respectively. The model is estimated through the R package latentnet, which implements a Bayesian routine based on the use of a Markov chain Monte Carlo algorithm (31). It is interesting to note that this model can be viewed as a generalization of the (latent) fitness model for networks (32, 33), as the node-specific random effects $\delta_i$ and $\gamma_i$ can be seen as measuring the intrinsic fitness of node i to send and receive ties, while its latent position $z_i$ affects its probability of forming ties differently for each (potential) connection.
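To make the role of each term concrete, the following sketch evaluates the edge probability for given parameter values. It assumes the standard form of the model, in which the log-odds of an edge from i to j equal the intercept, minus the Euclidean distance between the two latent positions, plus the sender effect of i and the receiver effect of j. The function and argument names are ours, and the numbers are arbitrary; the sketch only evaluates the formula and is not the estimation routine of latentnet.

```python
import math


def edge_probability(z_i, z_j, beta, delta_i, gamma_j):
    """P(i follows j) under a latent distance model with random effects.

    Assumed form: logit(p) = beta - ||z_i - z_j|| + delta_i + gamma_j.
    """
    distance = math.dist(z_i, z_j)  # Euclidean distance in the latent space
    log_odds = beta - distance + delta_i + gamma_j
    return 1.0 / (1.0 + math.exp(-log_odds))


# Arbitrary illustrative values.
near = edge_probability((0.0, 0.0), (0.5, 0.0), beta=1.0, delta_i=0.0, gamma_j=0.0)
far = edge_probability((0.0, 0.0), (4.0, 0.0), beta=1.0, delta_i=0.0, gamma_j=0.0)
print(near, far)

# A popular receiver (large gamma_j) is more likely to be followed
# than to follow back, even though the distance is the same.
to_popular = edge_probability((0.0, 0.0), (1.0, 1.0), beta=0.0, delta_i=0.0, gamma_j=2.0)
from_popular = edge_probability((1.0, 1.0), (0.0, 0.0), beta=0.0, delta_i=0.0, gamma_j=0.0)
print(to_popular, from_popular)
```

The second pair of calls shows why the sender and receiver effects matter in a directed network: the distance term is symmetric, so any asymmetry in who follows whom has to be absorbed by the node-specific effects.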
Homophily and triadic closure are generally prevalent in social media, particularly on Twitter and between popular accounts (34, 35). Those mechanisms often lead to the formation of subgroups of actors based on shared beliefs or other characteristics. Identifying such clusters can be helpful in understanding the drivers of polarization and, more in general, grouping behavior. The general task of identifying assortative, tightly knit groups in networks is a large area of research, known under the umbrella term of “community detection” (36). Notable examples of such methods include modularity maximization algorithms (37) and stochastic blockmodels (38). Classical community detection techniques are well suited for finding group structures, but they have the drawback of only returning a discrete partition of the network into clusters, where the connectivity behavior of each actor is fully described by its group label. In other words, two nodes in the same group are considered identical in all aspects. This is generally quite simplistic for social networks, in which cohesive groups often do exist, but where members of each group can also be very different from one another. Within a single group, for example, some nodes might be more “extreme” and isolated from all other communities. In contrast, others might be more central to the network and have many connections to other groups. We expect this to be the case in our network of COVID-19 Twitter elites: While we can assume polarization and grouping behavior to be present, we also expect the social positioning and political beliefs of the actors to be more accurately described through a continuous, multidimensional spectrum rather than with discrete labels. Because of that, we are not only interested in the clear-cut grouping of nodes but also in uncovering the (continuous) social positioning of the users relative to one another. 
The chosen latent cluster random effects model is, therefore, particularly well suited for our application, as it combines clustering and latent position modeling, thereby enabling us to simultaneously capture polarization and grouping behavior as well as the positioning of the actors relative to each other in the socio-political spectrum.
Results
We fit the latent cluster random effects model to our data, setting both the number of clusters k and the number of dimensions d to 2. The choice of two clusters is backed by the approximated Bayesian Information Criterion for data-driven model selection proposed by Handcock et al. (30). Moreover, since much of the literature concerns itself with investigating polarization in the online discussion revolving around the COVID-19 pandemic, and given that polarization suggests the existence of two subgroups (39), setting k = 2 appears to be the natural choice from a substantive perspective. With regard to the choice of d, while dimensionality for latent space network models is generally an open question, setting d = 2 is considered to be the standard for applications in which interpretability of the positions is central, as it simplifies the visualization and description of social relationships (40). We also experimented with different values of d and observed that using higher dimensionality did not greatly impact the cluster assignments.
The results of the model fitting are visualized in Fig. 3. The axes correspond to the two latent dimensions $z_1$ and $z_2$, respectively, and the nodes’ colors indicate the estimated community memberships. More specifically, the node-specific pie charts represent the posterior probabilities for each user to belong to one or the other cluster. Node sizes are scaled by each actor’s total degree within the network. Note that, as defined by the model, two nodes that are closer to one another have a higher probability of forming an edge, i.e. of following each other. Also note that estimates of the node-specific random effects $\delta_i$ and $\gamma_i$, incorporating information on how active specific nodes are with respect to following or being followed, are made available in the supplementary materials. At first glance, we see that the two communities are distributed along the horizontal axis $z_1$, with the more numerous blue community occupying the left and center parts of the figure, and the orange one being located towards the right-hand side. Moreover, from the posterior membership probabilities we can see that group memberships are fairly clear for most nodes. Nonetheless, significant uncertainty can be observed for a non-negligible proportion of the actors, which lie in between the two clear communities in the space.
Fig. 3.
Graphical representation of the latent positions of the actors in the network of COVID-19 Twitter elites estimated via the latent cluster random effects model, where the node size for each actor is scaled by its degree. A number of notable users are highlighted. The axes correspond to the two latent dimensions $z_1$ and $z_2$, while the estimated posterior probabilities for each user to belong to the “pro vaccine mandates” (blue) or “anti compulsory vaccination” (orange) cluster are depicted through the node-specific pie charts. Major German media outlets are found between the two communities.
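The membership probabilities shown in the pie charts can be understood through a simplified calculation. The sketch below computes, for a fixed latent position and fixed mixture parameters, the probability of belonging to each cluster under a spherical normal mixture. This is only a conditional illustration with invented parameter values: in the fitted Bayesian model, positions and mixture parameters are themselves uncertain, and the reported probabilities summarize the posterior draws rather than a single plug-in calculation.

```python
import math


def spherical_normal_pdf(z, mean, variance):
    """Density of a d-dimensional normal with covariance variance * identity."""
    d = len(z)
    squared_distance = sum((a - b) ** 2 for a, b in zip(z, mean))
    normalizer = (2.0 * math.pi * variance) ** (d / 2.0)
    return math.exp(-squared_distance / (2.0 * variance)) / normalizer


def membership_probabilities(z, weights, means, variances):
    """Cluster membership probabilities for position z via Bayes' rule."""
    joint = [
        w * spherical_normal_pdf(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(joint)
    return [value / total for value in joint]


# Invented parameters: two equally weighted clusters along the first dimension.
weights = [0.5, 0.5]
means = [(-2.0, 0.0), (2.0, 0.0)]
variances = [1.0, 1.0]

core = membership_probabilities((-2.0, 0.0), weights, means, variances)
between = membership_probabilities((0.0, 0.0), weights, means, variances)
print(core, between)
```

Actors located near a cluster center receive a membership probability close to 1, while those lying between the two centers receive split probabilities, which corresponds to the mixed pie charts in the middle of the latent space.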
As our task is of unsupervised nature, we do not have a set-in-stone “ground truth” with which to compare the model-based labeling and the estimated positions of the actors. To interpret the results, we therefore need to dig into the data and consider the emerging patterns. As the network is limited in size, and thanks to the naturally high propensity of elite users to voice their opinions, it is relatively straightforward to identify some of the more prominent actors and gauge their views on pandemic-related governmental interventions based on public information. Through this process, we can appreciate how the latent position of each actor in the network is strongly associated with their public stances on government mandates. More specifically, despite substantial within-cluster heterogeneity in stances (and their intensity) on several issues, users in the blue community tend to hold views that can be summarized as “generally for” interventions and vaccine mandates. The opposite is true for actors in the orange community, which can be described as “generally against” such measures. Moreover, the positioning of nodes within communities is also informative on the actors’ beliefs, capturing the within-cluster heterogeneity mentioned. Specifically, more central (external) positions in the overall latent space are associated with more moderate (extreme) stances. To showcase these patterns, we highlighted and labeled some notable users in Fig. 3, where each user is indicated with their Twitter username. The very center of the space is occupied by the most popular actors, most of whom, despite having connections to both groups thanks to their “elite among elites” status, reside firmly in the blue camp: A prime example is Karl_Lauterbach, an exponent of the Social Democratic Party who, at the time of writing, has been serving as the health minister of Germany since December 8th, 2021. 
He is known to be a strong proponent of vaccination and mandatory vaccination for all (41). Two other notable members of this group are Christian Drosten (c_drosten), a prominent virologist who has been described by major media outlets as “the country’s real face of the coronavirus crisis” and “the nation’s corona-explainer-in-chief” (42), and Melanie Brinkmann (BrinkmannLab), another well-known virologist who was among the proponents of the No-COVID strategy (43). Moving a bit further left in the space, another very popular user in the network is Flying__Doc, a medical doctor who has been outspoken in his support for policy proposals such as a vaccine mandate for all adults, and allowing access to events only to people who are both fully vaccinated and tested (“1G+” in the German political jargon). Moving even further left along the $z_1$ dimension, we encounter positions that increasingly favor decisive government interventions. Examples of this are dr_heartbreaker, a medical professional who has expressed his support for hard lockdowns and the aforementioned No-COVID strategy, and NavomDienst and Doktor_Freakout, two anonymous medical doctors who also vehemently voiced their disapproval of what they deemed to be bland policy making, and expressed their support for stronger restrictions. To conclude our outlook on the blue community, we also labeled two more peripheral, less Twitter-popular nodes. On the bottom-left of the plot we find DanZickler, an intensive care doctor who also expressed his support for more decisive action by the government, while on the top left we find MuttivsFaschos, who tweeted using the hashtags #ZeroCovid and #harterLockdownJetzt (“hard lockdown now”). All in all, our analysis highlights how users categorized in the blue group generally tend to openly support governmental efforts to contain the pandemic, and that the estimated $z_1$ dimension is associated with the intensity of the actors’ voiced stances on policy.
We now shift our focus to the orange community, composed of actors who have, on average, significantly fewer followers within this elite network, and tend to more or less strongly oppose pandemic-related government mandates. We start our overview with DrPuerner, the user with the highest number of popular tweets in our dataset. A medical doctor, Puerner rose to prominence during the pandemic for his stark criticism of COVID measures and opposition to government mandates. While not downplaying the dangers posed by COVID-19, he attracted a following and praise from conspiracy theorists and the populist right-wing party “Alternative for Germany” (“AfD”), notorious for its antisystem beliefs (44). Close to DrPuerner in the latent space we can also find wolff_ernst, a self-described journalist and writer, who has openly associated himself with COVID-related and general conspiracy theories (45). We also labeled two more peripheral nodes in this cluster, namely users zukunft37 and Whereismymodel3, anonymous accounts that openly voice their vaccine skepticism and opposition to government mandates. Two elected members of the aforementioned AfD, namely Alice_Weidel, who has been the leader of the party in the Bundestag (German Federal Parliament) since October 2017, and JoanaCotar, another member of the Bundestag who was part of AfD for the whole studied period and until late 2022, are also part of the orange community. Unsurprisingly, the two are close in the latent space, reflecting their similar policy stances. Perhaps more surprisingly, their estimated latent positions are not far from that of Sahra Wagenknecht (SWagenknecht), member of the Bundestag for “The Left” (“Die Linke”) since 2009, and former parliamentary leader of that same party. Despite being on the other end of the political spectrum, she also opposes general vaccination mandates (46).
She is located more towards the middle of the plot and has substantial uncertainty in her community membership, with a posterior probability of approximately 75% of belonging to the orange community. Another actor whose community membership is uncertain is Christian Democratic Union politician Jens Spahn (jensspahn), who served as health minister for most of the analyzed period, i.e. until December 8th, 2021. He is not far in the space from his successor Karl Lauterbach but lies a bit more to the right: He is classified in the blue community but has a posterior probability of approximately 25% of belonging to the orange one. This is in line with the fact that, while he is a proponent of widespread vaccination, he is opposed to the idea of compulsory vaccination for all (47). To conclude our overview of the space, we highlight some other notable accounts located in between the two clusters, namely those belonging to prominent news outlets. Given that we expect them to have a diverse following due to their authority status, their central positioning makes intuitive sense. But even between media outlets, the model is able to draw a distinction: zeitonline and tagesschau, generally reputable news sources, are closer to the center of the space, and, although with substantial uncertainty, labeled as blue. On the other hand, BILD, the most prominent German boulevard newspaper, is located more towards the right, and has a higher probability of belonging to the orange group.
Discussion
In this article, we identified and modeled the network of users leading the conversation revolving around the COVID-19 pandemic on Twitter. More specifically, we made use of the latent cluster random effects model to map these elite users into a 2D Euclidean social space, in which users that are closer to each other have a higher likelihood to connect, i.e. to follow each other. The results suggest the emergence of a natural partition of the network into two dense macro-communities, which are only loosely connected with their opposing counterparts. By focusing on a number of notable users, such as politicians, activists, and news outlets, we show how those two communities can be interpreted as “generally pro” and “generally against” public interventions and vaccine mandates. This finding corroborates recent research demonstrating the polarized nature of pandemic-related online discourse, especially concerning vaccination (19–23). But a deeper inspection of the latent space further reveals that users within communities are only partially homogeneous in their stances. To the contrary, the model is able to uncover a nuanced, continuous spectrum of pandemic-related beliefs and policy positions, ranging from demanding radical containment measures all the way to vaccine skepticism and COVID-denying conspiracy theories, covering everything in between those two extremes. In this context, neutral actors, such as mainstream news outlets, are positioned between the two clusters, which makes intuitive sense given their authority status. From the latent positions of users whose political inclination is known, we can also appreciate how attitudes toward governmental interventions tend to follow political inclination, with left- and right-wing respectively corresponding to more favorable or unfavorable positions towards restrictions and vaccine mandates. This finding echoes recent research showing how ideology can shape trust in scientists and attitudes towards vaccines (48, 49). 
The importance of vaccination as a subtheme within the pandemic-related discussion is corroborated by the fact that “vaccine” is one of the words appearing more often in the data (while not being used as a filtering mechanism), as shown in the Supplementary material (Fig. S3).
A particular feature of the employed methodology is the ability to combine “classical” community detection, which alone would be insufficient to gain a proper understanding of the network at hand, with more refined, continuous latent space modeling. This allows us to map the underlying latent social space with the necessary nuance while simultaneously returning a partition of the network into subgroups, which can be useful for understanding the network at a coarser resolution, or for classification purposes. The modeling results thus allow us to obtain a clearer picture of the network as a whole and can be used for garnering insight on single (politically unaffiliated) users.
We note that the studied network is fairly small as a result of the relatively restrictive popularity threshold we chose for defining a popular tweet: It would thus be possible to decrease the threshold to obtain a larger network. We also note, however, that using a lower value somewhat “loosens” the definition of an elite, as users that are less popular on average would make it into the network. Experimenting with the threshold, we also observed that using different values almost exclusively impacts the size of the network’s periphery and does not change the overall picture. Results of alternative analyses with different threshold values and inclusion criteria are provided in the Supplementary material (Figs. S1 and S2) and corroborate the robustness of our findings. Moreover, a stricter definition of elites incidentally makes the network size more manageable, which is relevant given that model estimation, as it is currently implemented in the R package latentnet, only scales well up to a few thousand nodes. Nonetheless, while latent space models do pose serious computational challenges, different approaches to estimate them for larger networks have been proposed (50, 51). We also note that, as we here only model the behavior of elites, we cannot a priori assume our results to be valid for the overall discussion. While, given the well-documented strong influence of popular users in the conversation, it is reasonable to believe that many of the results could extend to the general Twitter population, further research would be needed to confirm this. Furthermore, there may be different patterns in how elite and nonelite actors follow other users. For example, whereas nonelites are likely to use their follows primarily instrumentally, i.e. to see tweets they are interested in on their timeline, elites could also use theirs for signaling, i.e. to publicly show support or endorsement towards other users, and may thus curate their follows more carefully.
Similarly, highly active elites could be more likely than nonelites to enter conflicts with each other and block opposing elites. On the one hand, these strategic follows (or nonfollows) are indeed relevant to our analysis, as they give information on the potential factions at play in the network, and aid us in identifying them. On the other hand, as a result of these mechanisms, polarization in the elite network may be higher than in the complete one. The latter consideration strengthens the notion that polarization, although undoubtedly present to some extent, is not the only determining factor in network formation, and that the different groups exist on a continuous spectrum rather than being completely isolated from one another.
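The scalability remark above can be made concrete with a small illustration. The following Python sketch is loosely inspired by the case-control idea referenced in (50): for each user, all observed follows are kept, while the much more numerous nonfollows are subsampled and reweighted. It is a simplified sketch under stated assumptions, not the published algorithm and not the estimation routine of latentnet. It assumes a basic distance model without random effects or clusters, and the toy network, positions, intercept, and function names are all hypothetical.

```python
import math
import random

def dyad_log_lik(y, eta):
    """Bernoulli log-likelihood of one dyad with log-odds eta."""
    return y * eta - math.log1p(math.exp(eta))

def log_odds(z, beta, i, j):
    """Log-odds of i following j: intercept minus latent Euclidean distance."""
    return beta - math.dist(z[i], z[j])

def exact_log_lik(follows, z, beta):
    """Sum over all ordered pairs (i, j): cost grows quadratically in the node count."""
    total = 0.0
    for i in z:
        for j in z:
            if i != j:
                y = 1 if j in follows[i] else 0
                total += dyad_log_lik(y, log_odds(z, beta, i, j))
    return total

def case_control_log_lik(follows, z, beta, n_controls, rng):
    """Keep every follow, but only a reweighted random sample of nonfollows."""
    total = 0.0
    for i in z:
        for j in follows[i]:
            total += dyad_log_lik(1, log_odds(z, beta, i, j))
        non_ties = [j for j in z if j != i and j not in follows[i]]
        if not non_ties:
            continue
        k = min(n_controls, len(non_ties))
        sample = rng.sample(non_ties, k)
        weight = len(non_ties) / k  # scale the sample up to all nonfollows
        total += weight * sum(
            dyad_log_lik(0, log_odds(z, beta, i, j)) for j in sample
        )
    return total

# Hypothetical toy data: four users, two loose groups (assumption, not real data).
z = {"a": (-1.0, 0.0), "b": (-0.8, 0.3), "c": (0.9, 0.1), "d": (1.1, -0.2)}
follows = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c", "a"}}
beta = 1.0  # arbitrary illustrative intercept

exact = exact_log_lik(follows, z, beta)
approx = case_control_log_lik(follows, z, beta, n_controls=1, rng=random.Random(0))
print(f"exact: {exact:.3f}, case-control approximation: {approx:.3f}")
```

With a small number of controls the approximation is noisy, and once the number of controls covers all nonfollows it coincides with the exact value. In sparse networks, where nonfollows vastly outnumber follows, this kind of subsampling is what reduces the cost of each likelihood evaluation.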
We further emphasize that our approach is purely unsupervised and completely based on network structure, without including any element of natural language processing. In other words, the two groups emerge only from information on who follows whom. In this sense, we could have simply labeled the two clusters as “blue” and “orange”, or “left” and “right”. The description of the communities with respect to their attitudes towards vaccination, and, more generally, pandemic management, was done after the modeling, to shed some additional light on the data-driven cluster selection, and alternative characterizations would also be viable. While it would certainly be possible to make use of the tweets’ text content to obtain further insight into the users, we here explicitly chose to focus solely on the network component, thus demonstrating how tightly the users’ personal networks are intertwined with their beliefs. Indeed, given that the latent positions of the actors are estimated by the model solely using their follows and followers within the network, it is quite remarkable how consistently actors neighboring each other in the estimated latent space are also near in their stances on COVID-19 and its management, and how closely the space is able to track the belief spectrum. The echo chamber effect is well documented in the literature: users tend to follow those who share similar ideas, and are thus rarely exposed to contrasting views. This, in turn, leads the users’ beliefs to become self-reinforcing (52, 53). However, our analysis shows that this behavior is not only prevalent at the extremes of the socio-political spectrum but also towards the center of the belief space. 
On the one hand, the phenomenon implies that users with radical ideas will tend to follow people with similarly extreme beliefs, leading to further polarization; on the other hand, it also means that users following more moderate voices will tend to gravitate towards more nuanced views. Central actors, who have the ability to act as a bridge between the two communities, are thus uniquely positioned to mitigate the polarization loop.
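To illustrate how positions in the latent space relate to following behavior, the short Python sketch below evaluates the basic latent distance formulation, in which the log-odds of a follow decrease with the Euclidean distance between two users. This is a deliberately simplified illustration: it omits the sender and receiver effects and the cluster structure of richer latent space models, and the positions, the intercept, and the function name are hypothetical choices rather than estimates from the data.

```python
import math

def follow_probability(z_i, z_j, intercept=1.0):
    """Probability that user i follows user j under a basic latent distance model.

    logit P(i follows j) = intercept - ||z_i - z_j||  (Euclidean distance).
    The intercept is an arbitrary illustrative value, not an estimate.
    """
    distance = math.dist(z_i, z_j)
    return 1.0 / (1.0 + math.exp(-(intercept - distance)))

# Hypothetical two-dimensional positions: two poles and one central user.
positions = {
    "pole_a": (-2.0, 0.0),
    "central": (0.0, 0.0),
    "pole_b": (2.0, 0.0),
}

p_pole_to_central = follow_probability(positions["pole_a"], positions["central"])
p_pole_to_pole = follow_probability(positions["pole_a"], positions["pole_b"])

print(f"pole_a -> central: {p_pole_to_central:.3f}")
print(f"pole_a -> pole_b:  {p_pole_to_pole:.3f}")
```

Under these illustrative values, a user at one pole follows the central user with probability of roughly 0.27, compared with roughly 0.05 for a user at the opposite pole, which mirrors the bridging role of central actors described above.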
The fact that following behavior is so closely related to beliefs and attitudes paves the way for latent space models as powerful tools for drawing maps of social media landscapes, which can, in turn, be used to increase our understanding of the underlying social and behavioral structures. Indeed, while we here applied the methodology to map the discussion revolving around COVID-19, it is possible to perform similar types of analysis on other topics of public relevance. Given its explanatory and predictive power, we believe latent space modeling of elite social media networks to have the potential for improving our general understanding of the online landscape, ultimately aiding policymakers in making more informed decisions in their quests against polarization and misinformation worldwide.
Acknowledgments
This manuscript was posted on arXiv as a preprint: arXiv:2207.13352.
Note
At the time of publication, the Twitter social media platform is in the process of rebranding to “X”.
Contributor Information
Giacomo De Nicola, Department of Statistics, Ludwig Maximilian University of Munich, 80539 Munich, Germany.
Victor H Tuekam Mambou, Department of Statistics, Ludwig Maximilian University of Munich, 80539 Munich, Germany; ifo Institute – Leibniz Institute for Economic Research at the University of Munich, 81679 Munich, Germany.
Göran Kauermann, Department of Statistics, Ludwig Maximilian University of Munich, 80539 Munich, Germany.
Supplementary material
Supplementary material is available at PNAS Nexus online.
Funding
This research was partially funded by the Elite Network of Bavaria (ESG Data Science).
Author contributions
G.D.N., V.H.T.M. and G.K. designed research; G.D.N. and V.H.T.M. performed research; G.D.N. and V.H.T.M. analyzed data; G.D.N., V.H.T.M. and G.K. wrote the paper.
Data availability
The network data used in this article and the code to reproduce the analysis are publicly available on our GitHub repository, at https://github.com/gdenicola/latent-space-covid-twitter-elites.
References
- 1. Cinelli M, et al. 2020. The COVID-19 social media infodemic. Sci Rep. 10:16598.
- 2. Zarocostas J. 2020. How to fight an infodemic. Lancet. 395(10225):676.
- 3. Gollust SE, Nagler RH, Fowler EF. 2020. The emergence of COVID-19 in the US: a public health and political communication crisis. J Health Polit Policy Law. 45(6):967–981.
- 4. Gallagher RJ, Doroshenko L, Shugars S, Lazer D, Welles BF. 2021. Sustained online amplification of COVID-19 elites in the United States. Soc Media Soc. 7(2). doi: 10.1177/20563051211024957
- 5. Molyneux L, McGregor SC. 2021. Legitimating a platform: evidence of journalists’ role in transferring authority to Twitter. Inf Commun Soc. doi: 10.1080/1369118X.2021.1874037
- 6. Leader AE, Burke-Garcia A, Massey PM, Roark JB. 2021. Understanding the messages and motivation of vaccine hesitant or refusing social media influencers. Vaccine. 39(2):350–356.
- 7. Finkel EJ, et al. 2020. Political sectarianism in America. Science. 370(6516):533–536.
- 8. Fink K. 2019. The biggest challenge facing journalism: a lack of trust. Journalism. 20(1):40–43.
- 9. Bridgman A, et al. 2020. The causes and consequences of COVID-19 misperceptions: understanding the role of news and social media. Harvard Kennedy School Misinformation Review. doi: 10.37016/mr-2020-028
- 10. Donovan J. 2020. Social-media companies must flatten the curve of misinformation. Nature. doi: 10.1038/d41586-020-01107-z
- 11. Johnson NF, et al. 2020. The online competition between pro- and anti-vaccination views. Nature. 582(7811):230–233.
- 12. Hoff PD, Raftery AE, Handcock MS. 2002. Latent space approaches to social network analysis. J Am Stat Assoc. 97(460):1090–1098.
- 13. Krivitsky PN, Handcock MS, Raftery AE, Hoff PD. 2009. Representing degree distributions, clustering, and homophily in social networks with latent cluster random effects models. Soc Netw. 31(3):204–213.
- 14. Caldarelli G, De Nicola R, Del Vigna F, Petrocchi M, Saracco F. 2020. The role of bot squads in the political propaganda on Twitter. Commun Phys. 3(1):81.
- 15. Conover M, et al. 2011. Political polarization on Twitter. Proceedings of the International AAAI Conference on Web and Social Media. 5(1):89–96.
- 16. Garimella VRK, Weber I. 2017. A long-term analysis of polarization on Twitter. Proceedings of the International AAAI Conference on Web and Social Media. 11(1):528–531.
- 17. Del Vicario M, et al. 2016. Echo chambers: emotional contagion and group polarization on Facebook. Sci Rep. 6(1):37825.
- 18. Del Vicario M, Zollo F, Caldarelli G, Scala A, Quattrociocchi W. 2017. Mapping social dynamics on Facebook: the Brexit debate. Soc Networks. 50:6–16.
- 19. Jiang X, et al. 2021. Polarization over vaccination: ideological differences in Twitter expression about COVID-19 vaccine favorability and specific hesitancy concerns. Soc Media Soc. 7(3). doi: 10.1177/20563051211048413
- 20. Reiter-Haas M, Klösch B, Hadler M, Lex E. 2022. Polarization of opinions on COVID-19 measures: integrating Twitter and survey data. Soc Sci Comput Rev. doi: 10.1177/08944393221087662
- 21. SteelFisher GK, Blendon RJ, Caporello H. 2021. An uncertain public—encouraging acceptance of Covid-19 vaccines. N Engl J Med. 384(16):1483–1487.
- 22. Cowan SK, Mark N, Reich JA. 2021. COVID-19 vaccine hesitancy is the new terrain for political division among Americans. Socius. 7:237802312110236.
- 23. Mønsted B, Lehmann S. 2022. Characterizing polarization in online vaccine discourse—a large-scale study. PLoS One. 17(2):e0263746.
- 24. Dubois E, Gaffney D. 2014. The multiple facets of influence: identifying political influentials and opinion leaders on Twitter. Am Behav Sci. 58(10):1260–1277.
- 25. Banda JM, et al. 2021. A large-scale COVID-19 Twitter chatter dataset for open scientific research—an international collaboration. Epidemiologia. 2(3):315–324.
- 26. Twitter. Volume streams [accessed 2022 Oct 16]. https://developer.twitter.com/en/docs/twitter-api/tweets/volume-streams/introduction
- 27. Edgerly S, Vraga EK. 2019. The blue check of credibility: does account verification matter when evaluating news on Twitter? Cyberpsychol Behav Soc Netw. 22(4):283–287.
- 28. Hu Y. 2005. Efficient, high-quality force-directed graph drawing. Math J. 10(1):37–71.
- 29. Rivera MT, Soderstrom SB, Uzzi B. 2010. Dynamics of dyads in social networks: assortative, relational, and proximity mechanisms. Annu Rev Sociol. 36:91–115.
- 30. Handcock MS, Raftery AE, Tantrum JM. 2007. Model-based clustering for social networks. J R Stat Soc A (Stat. Soc.). 170(2):301–354.
- 31. Krivitsky PN, Handcock MS. 2008. Fitting position latent cluster models for social networks with latentnet. J Stat Softw. 24(5):1–23.
- 32. Caldarelli G, Capocci A, De Los Rios P, Munoz MA. 2002. Scale-free networks from varying vertex intrinsic fitness. Phys Rev Lett. 89(25):258702.
- 33. Bianconi G, Barabási A-L. 2001. Competition and multiscaling in evolving networks. Europhys Lett. 54(4):436–442.
- 34. Lou T, Tang J, Hopcroft J, Fang Z, Ding X. 2013. Learning to predict reciprocity and triadic closure in social networks. ACM Trans Knowl Discov Data. 7(2):1–25.
- 35. Colleoni E, Rozza A, Arvidsson A. 2014. Echo chamber or public sphere? Predicting political orientation and measuring political homophily in Twitter using big data. J Commun. 64(2):317–332.
- 36. Fortunato S, Hric D. 2016. Community detection in networks: a user guide. Phys Rep. 659:1–44.
- 37. Newman MEJ. 2006. Modularity and community structure in networks. Proc Natl Acad Sci USA. 103(23):8577–8582.
- 38. De Nicola G, Sischka B, Kauermann G. 2022. Mixture models and networks: the stochastic blockmodel. Stat Modelling. 22(1–2):67–94.
- 39. Guerra P, Meira Jr W, Cardie C, Kleinberg R. 2013. A measure of polarization on social media networks based on community boundaries. Proceedings of the International AAAI Conference on Web and Social Media. 7(1):215–224.
- 40. Sosa J, Betancourt B. 2022. A latent space model for multilayer network data. Comput Stat Data Anal. 169:107432.
- 41. Zimmermann K. 2022. Karl Lauterbach sieht gescheiterte Impfpflicht als herbe Niederlage. ZeitOnline [accessed 2022 Oct 16]. https://zeit.de/politik/deutschland/2022-04/karl-lauterbach-impfpflicht-niederlage-impfkampagne.
- 42. Henley J. 2020. Coronavirus: meet the scientists who are now household names. The Guardian [accessed 2022 Oct 16]. https://theguardian.com/world/2020/mar/22/coronavirus-meet-the-scientists-who-are-now-household-names.
- 43. Baumann M, et al. 2021. Eine neue proaktive Zielsetzung für Deutschland zur Bekämpfung von SARS-CoV-2. ifo Institute [accessed 2022 Oct 16]. https://ifo.de/en/publikationen/2021/monograph-authorship/proaktive-zielsetzung-bekaempfung-sars-cov-2-handlungsoptionen.
- 44. Stoeppler T. 2021. Thesen vom Amtsarzt. Sueddeutsche Zeitung [accessed 2022 Oct 16]. https://sueddeutsche.de/bayern/bayern-corona-amtsarzt-friedrich-puerner-buch-1.5466845.
- 45. Ayyadi K. 2021. Wenn ein selbsterklärter “Ökonom” mit Antisemitismus Corona erklären will. Belltower News [accessed 2022 Oct 16]. https://belltower.news/youtube-wenn-ein-selbsterklaerter-oekonom-mit-antisemitismus-corona-erklaeren-will-97409/.
- 46. Wagenknecht S. 2022. Deutsche Politik hat sich bei der Impfpflicht verrannt. FOCUS Online [accessed 2022 Oct 16]. https://focus.de/politik/deutschland/weitergedacht/weitergedacht-die-wagenknecht-kolumne-deutsche-politik-hat-sich-bei-der-impfpflicht-verrannt_id_40754360.html.
- 47. Kubitza M. 2021. Spahn will im Bundestag nicht für allgemeine Impfpflicht stimmen. BR24 [accessed 2022 Oct 16]. https://br.de/nachrichten/deutschland-welt/spahn-will-im-bundestag-gegen-allgemeine-impfpflicht-stimmen,SqWKYhF.
- 48. Featherstone JD, Bell RA, Ruiz JB. 2019. Relationship of people’s sources of health information and political ideology with acceptance of conspiratorial beliefs about vaccines. Vaccine. 37(23):2993–2997.
- 49. Kossowska M, Szwed P, Czarnek G. 2021. Ideology shapes trust in scientists and attitudes towards vaccines during the COVID-19 pandemic. Group Process Intergr Relat. 24(5):720–737.
- 50. Raftery AE, Niu X, Hoff PD, Yeung KY. 2012. Fast inference for the latent space network model using a case-control approximate likelihood. J Comput Graph Stat. 21(4):901–919.
- 51. Yin J, Ho Q, Xing EP. 2013. A scalable approach to probabilistic latent space inference of large-scale networks. Adv Neural Inf Process Syst. 26:422–430.
- 52. Cinelli M, De Francisci Morales G, Galeazzi A, Quattrociocchi W, Starnini M. 2021. The echo chamber effect on social media. Proc Natl Acad Sci USA. 118(9):e2023301118.
- 53. Nguyen CT. 2020. Echo chambers and epistemic bubbles. Episteme. 17(2):141–161.