PLOS One. 2023 Jun 28;18(6):e0286150. doi: 10.1371/journal.pone.0286150

Characterizing engagement dynamics across topics on Facebook

Gabriele Etta 1, Emanuele Sangiorgio 2, Niccolò Di Marco 3, Michele Avalle 1, Antonio Scala 4, Matteo Cinelli 1, Walter Quattrociocchi 1,*
Editor: Vincent Antonio Traag
PMCID: PMC10306180  PMID: 37379268

Abstract

Social media platforms have profoundly changed how users consume and digest information and, thus, how the popularity of topics evolves. In this paper, we explore the interplay between the virality of controversial topics and how they may trigger heated discussions and eventually increase users’ polarization. We perform a quantitative analysis on Facebook by collecting ∼57M posts from ∼2M pages and groups between 2018 and 2022, focusing on engaging topics involving scandals, tragedies, and social and political issues. Using logistic functions, we quantitatively assess the evolution of these topics, finding similar patterns in their engagement dynamics. Finally, we show that initial burstiness may predict the rise of users’ future adverse reactions regardless of the discussed topic.

Introduction

The advent of social media platforms changed how users consume information online [1–4]. The micro-blogging features on Twitter and Facebook, combined with a direct interaction between news producers and consumers, have remarkably affected how people get informed, shape their own opinions, and debate with other peers online [5–7]. Over the years, following the business model of social media platforms, news outlets and producers attempted to maximize the time spent by users on their content [8, 9], giving birth to the concept of attention economy [10]. The term refers to the users’ limited capability and time to process all information they interact with [11–13]. The transition toward a news ecosystem shaped on social media platforms unveiled patterns in information consumption at multiple scales [14, 15], which contributed to the emergence of the polarization phenomenon and the formation of like-minded groups called echo chambers [16–18]. Within echo chambers, characterized by homophily in the interaction network and bias in information diffusion towards like-minded peers, selective exposure [19] is a significant driver for news consumption [16]. The combination of echo chambers and selective exposure makes users more likely to ignore dissenting information [20], choosing to interact with narratives adhering to their point of view [15, 21].

Several studies explored the existence of these mechanisms in many topics concerning political elections, public health, climate change, and trustworthiness of the news sources [15, 21–29]. Findings indicate that neither the topic nor the quality of information explains the users’ opinion-formation process. Instead, several studies observed how the virality of discussions can increase the likelihood of inducing polarization, hate speech, and toxic behaviors [30–32], highlighting how recommendation algorithms may have a role in shaping the news diet of users.

Therefore, it is necessary to provide a better understanding of how user interest evolves in online debates. To achieve this goal, we provide a quantitative assessment of the dynamics underlying user interest in news articles about different topics. In this paper, we analyze the engagement patterns produced by ∼57M posts on Facebook related to ∼300 topics, involving a total of ∼2M posting pages and groups over a period that ranges from 2018 to 2022. We first provide a quantitative assessment of topics’ attention through time, extracting insightful parameters from their engagement evolution. Then, we construct a metric called the Love-Hate Score to estimate the level of controversy associated with a topic using the sentiment of users’ engagement, as expressed by the normalized difference between their positive and negative reactions. Our results show that topics are generally characterized by an interest that constantly increases since the appearance of the first post. We find that topics’ interactions grow with persistent intensity, even for prolonged periods, indicating how interest is a cumulative process that takes time. We statistically validate this result by comparing parameters across topic categories, discovering no differences in the evolution of the engagement. Indeed, regardless of their category, topics keep users engaged steadily over time, and their lifetime progression seems thus unrelated to their thematic field. Finally, we find that topics with sudden virality tend to elicit more controversial and heterogeneous interactions. In turn, topics with a steady evolution exhibit more positive and homogeneous reaction types. This difference in the sentiment of reactions, and the protracted duration of topics’ lifetime, are both outcomes consistent with the emergence of selective exposure as a driver of news consumption.

Materials and methods

This section describes the data collection process, the topic extraction process, the models and the metrics employed in assessing collective attention.

Overview of the data collection process

The data collection process comprises several parts, as described in Fig 1. We start by creating a sample of news articles from the GDELT event database [33]. Then, we process the articles’ text to obtain a set of representative terms. Next, we apply the Louvain community detection algorithm [34] on the bipartite projection of the co-occurrence term network to identify the topics of interest. The terms representing these topics serve as input for collecting posts from Facebook.

Fig 1. Summary of the analysis workflow followed in the current study.

Fig 1

News articles are collected from the GDELT Database, and their corpus is extracted, cleaned and analyzed to retrieve the most representative terms. The bipartite projection of the co-occurrence network built upon these terms serves as an input for the Louvain community detection algorithm to identify keyword clusters. Independent labellers then analyze these clusters to identify the subset of words that represents the topic under consideration, which is then used in CrowdTangle to retrieve the Facebook posts relating to those events.

The data collection and analysis processes are compliant with the terms and conditions [35] imposed by CrowdTangle [36]. Therefore, the results described in this paper cannot be exploited to infer the identity of the accounts involved.

News extraction from GDELT

The GDELT (Global Database of Events, Language, and Tone) Project [37], powered by Google Jigsaw, is a database of global human society which monitors the world’s broadcast, print, and web news from nearly every corner of every country in more than 100 languages. It identifies the people, locations, organisations, themes, sources, emotions, counts, quotes, images and events driving our global society every second of every day [38]. We gathered news articles from the GDELT 2.0 Event Database [33], which records the world’s breaking events every 15 minutes and translates the corresponding news articles from 65 languages, representing 98.4% of its daily non-English monitoring volume [33]. The analysis covers a period between 1/1/2018 and 13/5/2022, collecting 50 news articles each day for a total of ∼79K.

Extracting representative keywords from news articles

To clean and extract the most representative keywords of each news article, we employed the newspaper3k Python package [39]. We initially extracted words from the body of the article, excluding stopwords and numbers. Then, we computed the word frequency f(w, i) for each word w in article i. Finally, we sorted words in descending order according to their frequency, keeping the top 10 most frequent words.
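The keyword step can be sketched as follows. This is a minimal stand-in for the full pipeline: the stopword list and sample article are illustrative, whereas the study extracts article bodies with newspaper3k and uses a complete stopword set.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; the real pipeline would use a full list
# (e.g. NLTK's) on text extracted with newspaper3k.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "for"}

def top_keywords(body: str, k: int = 10) -> list[str]:
    """Return the k most frequent non-stopword, non-numeric tokens of an article."""
    tokens = re.findall(r"[a-z]+", body.lower())  # drops numbers and punctuation
    freq = Counter(t for t in tokens if t not in STOPWORDS)
    return [w for w, _ in freq.most_common(k)]

article = "The election results: the candidate conceded the election on Tuesday."
print(top_keywords(article, k=3))  # ['election', 'results', 'candidate']
```

In the study, the top-10 lists produced this way feed the bipartite term–article graph described in the next subsection.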

Topic extraction from news article’s keywords

The list of terms with the corresponding news articles can be formalised as a bipartite graph G = (T, A, E) whose partitions T and A represent the set of terms t ∈ T and the set of articles a ∈ A respectively, for which an edge (t, a) ∈ E exists if a term t is present in an article a. By projecting graph G on its terms T we obtain an undirected graph P made up of nodes t ∈ T, which are connected if they share at least one news article.

We perform community detection on the nodes of P by employing the Louvain algorithm [34]. As a result, we obtain a set of clusters C, where each cluster cC contains a list of keywords that are assumed to be semantically related to a topic. We then asked a pool of three human labellers to select, for each community, from two to three terms they considered the most representative to identify a topic unambiguously.
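A minimal stdlib sketch of the projection step, with toy articles for illustration; community detection on the resulting graph would then be run with a library implementation of Louvain (e.g. networkx's `louvain_communities`):

```python
from itertools import combinations
from collections import defaultdict

# Toy term lists per article; in the study these come from the GDELT corpus.
article_terms = {
    "a1": {"election", "vote", "poll"},
    "a2": {"vote", "turnout"},
    "a3": {"climate", "emissions"},
}

def project_terms(article_terms):
    """Project the bipartite term-article graph onto terms: two terms are
    linked if they co-occur in at least one article."""
    edges = defaultdict(int)
    for terms in article_terms.values():
        for u, v in combinations(sorted(terms), 2):
            edges[(u, v)] += 1  # co-occurrence count (could serve as edge weight)
    return dict(edges)

P = project_terms(article_terms)
print(P)
# Louvain community detection on P would then yield keyword clusters,
# one per candidate topic, to be screened by the human labellers.
```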

Data collection of Facebook posts

The news articles obtained from the GDELT Event Database do not contain information helpful in estimating the attention they generate online. To include the dimension of user engagement, we employ each topic’s set of representative terms to collect Facebook data over a period that goes from 01/01/2018 to 05/05/2022. The data was obtained using CrowdTangle [36], a Facebook-owned tool that tracks interactions on public content from Facebook pages, groups, and verified profiles. CrowdTangle does not include paid ads unless those ads began as organic, non-paid posts that were subsequently “boosted” using Facebook’s advertising tools. CrowdTangle also does not store data regarding the activity of private accounts or posts made visible only to specific groups of followers.

The collection process produced a total of ∼57M posts from ∼2M unique pages and groups, generating ∼8B interactions. The result of the data collection process is described in Table 1.

Table 1. Data Breakdown of the study, including the total amount of news articles and posts collected from GDELT and Facebook respectively, together with the number of topics and the analysis period.
News Articles from GDELT: 79,650
Total Posts from Facebook: 57,031,026
Total Interactions: 8,015,177,602
Total Groups and Pages: 2,224,430
Number of Topics: 296
Collected Period: 1/1/2018–5/5/2022

Topic categorization

To provide a correspondence between topics and their area of interest, we performed a categorization activity under the following labels: Art-Culture-Sport (ACS), Economy, Environment, Health, Human Rights, Labor, Politics, Religion, Social and Tech-Science. Three human labellers carried out the activity to connect topics and categories, choosing as the representative only those categories selected by at least two of the three labellers.

Metrics

We begin by describing a measure for fitting the cumulative engagement evolution. Then, based on the previous step, we outline an index to evaluate the sharpness of the topic’s diffusion. Finally, using Facebook’s reactions, we introduce a sentiment score to assess the topic’s controversy. A topic-aggregated version of the dataset containing all the metrics defined in this section can be found in the Data Breakdown Section of S1 File.

Fitting cumulative engagement evolution

The study of the diffusion of new ideas has been carried out through the years, starting from the Bass diffusion model [40] and then extended to a multitude of topics [41–47], indicating the relevance of s-curves in the analysis of innovation spreading. Therefore, to model the evolution of the engagement received by posts, we fit the cumulative distribution of the overall engagement (i.e., the number of likes, shares and comments) over time employing a function fα,β(t), with α, β ∈ ℝ, defined as

fα,β(t) = 1 / (1 + e^(−α(t − β))). (1)

From a mathematical point of view, Eq 1 defines a general sigmoid function that depends on the parameters α and β. The α parameter represents the slope of the function, describing the steepness of the engagement evolution. On the other hand, β is the point at which the function reaches the value 0.5 and quantifies the time required for a topic to reach half its total interactions.
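Eq 1 translates directly to code; the parameter values below are illustrative, not fitted values from the paper:

```python
import math

def f(t, alpha, beta):
    """General sigmoid of Eq 1: alpha sets the steepness of the engagement
    curve, beta the time at which it reaches 0.5 (half the total interactions)."""
    return 1.0 / (1.0 + math.exp(-alpha * (t - beta)))

# At t = beta the curve sits exactly at its midpoint, regardless of alpha.
print(f(800, alpha=0.004, beta=800))  # 0.5
```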

To provide a representation of the impact that α and β can have on topic engagement evolution, Fig 2 displays four topics with peculiar configurations. Fig 2a shows a sigmoid in which the high values of α and β produce a sharp increment relatively far from t0. Such behaviour corresponds to those topics that require some time before gaining maximum diffusion with the public. Fig 2b instead provides a fit where the sigmoid produces low values for α and β, resulting in a smoother increment in the proximity of t0 than the one described in Fig 2a. Finally, Fig 2c and 2d provide an example of how two curves that share similar values of the β parameter can have a different evolution of their increase by slightly modifying the value of the α parameter.

Fig 2. Representation of a sample of four topics employing their normalized cumulative evolution of engagements and fittings.

Fig 2

The incidence of the α parameter can be observed in the sharpness of the fitting curves. The β parameter instead regulates the shift of the function along the x axis: the higher its value, the longer the delay from t0 before the sigmoid produces its increment.

Speed Index

To provide a measure of how quickly the attention towards a topic reaches its saturation, we define a measure called the Speed Index SI(fα,β) as

SI(fα,β) = (1/T) ∫_0^T fα,β(t) dt. (2)

The SI considers the joint contribution of the α and β parameters, where T represents the time of the last observed value for fα,β(t). Note that the SI is the mean integral value of fα,β, i.e. the normalised area under the curve of fα,β (therefore SI(fα,β) ∈ [0, 1]). The assumption in the definition of this function relies on the fact that high speed values are obtained by sigmoids that reach their plateau in a short time, such as the behaviour represented in Fig 2b.
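Eq 2 can be sketched numerically with the trapezoidal rule; the parameters below are illustrative. As the definition implies, a curve that saturates early (small β) yields a higher SI than a delayed one, all else equal:

```python
import math

def sigmoid(t, alpha, beta):
    return 1.0 / (1.0 + math.exp(-alpha * (t - beta)))

def speed_index(alpha, beta, T, n=2000):
    """Eq 2 via the trapezoidal rule: mean value of the sigmoid on [0, T],
    i.e. the normalised area under the curve, always in [0, 1]."""
    h = T / n
    area = sum(
        0.5 * h * (sigmoid(i * h, alpha, beta) + sigmoid((i + 1) * h, alpha, beta))
        for i in range(n)
    )
    return area / T

# Same steepness, different delay: the early-saturating topic scores higher.
early, late = speed_index(0.01, 100, 1500), speed_index(0.01, 1000, 1500)
print(early, late)
```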

Love-Hate Score

To quantify the level of controversy that a Facebook post may produce, we define a measure called the Love-Hate (LH) Score. In line with previous works that quantified controversy from post reactions [48, 49], we define the LH Score LH(i)∈[−1, 1] as

LH(i) = (li − hi) / (li + hi), (3)

where hi and li are respectively the total number of Angry and Love reactions collected by a post i. A value of LH equal to −1 indicates that the post received only Angry reactions from the users, while a value equal to 1 indicates that the post received only Love reactions. Therefore, a value close to 0 reflects the presence of controversy on a post due to a balance of positive and negative reactions.
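Eq 3 is straightforward to implement. One edge case the paper does not specify is a post with neither Love nor Angry reactions; treating it as neutral (score 0) is our own assumption:

```python
def lh_score(loves: int, angries: int) -> float:
    """Eq 3: +1 = only Love reactions, -1 = only Angry, ~0 = controversial."""
    total = loves + angries
    if total == 0:
        return 0.0  # assumption: no Love/Angry reactions -> treat as neutral
    return (loves - angries) / total

print(lh_score(120, 0))  # 1.0  (unanimously positive)
print(lh_score(0, 40))   # -1.0 (unanimously negative)
print(lh_score(50, 50))  # 0.0  (balanced, i.e. controversial)
```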

Results and discussion

Quantifying topic engagement evolution

We first provide a quantitative assessment of the evolution of engagement with topics on social media. To do so, we perform a Non-linear Least Squares (NLS) regression by fitting the sigmoid function fα,β(t) to the cumulative engagement gained by each topic.
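The fitting step can be sketched in pure Python. Here a coarse grid search stands in for a proper NLS routine (in practice one would use something like `scipy.optimize.curve_fit`), and synthetic data generated from known parameters replaces a topic's real cumulative engagement:

```python
import math

def sigmoid(t, alpha, beta):
    return 1.0 / (1.0 + math.exp(-alpha * (t - beta)))

# Synthetic normalized cumulative engagement with known parameters;
# the study fits each topic's real (normalized) cumulative curve instead.
true_a, true_b = 0.01, 400
ts = list(range(0, 1000, 10))
ys = [sigmoid(t, true_a, true_b) for t in ts]

def sse(alpha, beta):
    """Sum of squared residuals minimized by the fit."""
    return sum((sigmoid(t, alpha, beta) - y) ** 2 for t, y in zip(ts, ys))

# Coarse grid search as a stand-in for the NLS optimizer.
best = min(
    ((a / 1000, b) for a in range(1, 21) for b in range(0, 1000, 50)),
    key=lambda p: sse(*p),
)
print(best)  # recovers (0.01, 400)
```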

The distribution of the α parameter provided in Fig 3 describes how the majority of topics have a value of α belonging to the [0, 0.0047] interval. This result indicates that user interest in a topic does not suddenly increase but results from a long-term process. Instead, the distribution of the β parameter describes a prevalence of topics in the [600, 1000] interval, identifying the tendency of topics to become a matter of interest with some delay with respect to the first post covering them.

Fig 3. Joint distribution of α and β parameters obtained from the NLS regression for each topic.

Fig 3

We observe that topics are generally characterized by low values of α and relatively large values of β, which indicates that user interest in a topic does not increase all of a sudden but is the result of a process that evolves over time.

Evaluating the relationship between topic engagement and controversy

To quantify the interplay between users’ interest in a topic and the associated level of controversy, we compute the Spearman correlation between the Speed Index and the LH Score for each topic. Results from the upper panel of Fig 4 show a general tendency of users to react with negative sentiment when a topic gains engagement faster (ρ = −0.26), leaving positive reactions to those topics that require time to obtain maximum diffusion. Results described in the lower panel of Fig 4 provide further characterisation of the interplay between the Speed Index and the LH Score after classifying the topics according to the four most frequent categories analyzed, i.e., Politics, Labor, Human Rights and Health. We observe how the Politics and Health categories have the lowest correlation scores (ρ = −0.36 and ρ = −0.45), providing an indication of their intrinsic polarizing attitude (see S1 Fig for further details about correlation coefficients). Furthermore, the correlation between α and LH Score produces similar results as with the Speed Index (see S2 Fig for more details).
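The rank correlation can be reproduced with stdlib Python (in practice `scipy.stats.spearmanr` would be the usual choice). The toy Speed Index and LH values below are illustrative, constructed to mimic the negative monotone trend reported above; they are not the paper's data:

```python
from statistics import mean

def rank(xs):
    """Average ranks, handling ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = rank(xs), rank(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Toy data: the faster a topic saturates, the more negative its LH Score.
si = [0.1, 0.3, 0.5, 0.7, 0.9]
lh = [0.8, 0.5, 0.2, -0.1, -0.6]
print(spearman(si, lh))  # -1.0 on this perfectly monotone toy data
```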

Fig 4.

Fig 4

Upper panel: correlation between SI and LH Score for each identified topic. Lower panel: correlation between SI and LH Score for the four most frequent topic categories. Overall, we observe how users react negatively as topics become sharply viral.

Assessing the differences of engagement behaviors across topic categories

To conclude our analysis, we investigate the differences in the evolution of engagement across topic categories. In particular, for each parameter distribution (α, β and SI), we apply a two-tailed Mann–Whitney U test [50] to each pair of categories. Table 2 provides the percentages of the significant p-values for the four measures (α, β, Speed Index and LH Score). Due to the necessity to perform multiple tests, we apply a Bonferroni correction to our standard significance level of 0.05, leading us to reject the null hypothesis only if the p-value p < 0.001. Our results show that the resulting p-values from the tests generally do not lead to rejecting the null hypothesis. Such a result corroborates the hypothesis that, on average, users are characterized by homogeneous engagement patterns that are not influenced by the consumed topic. We further extend the statistical assessment by performing the same test between the LH Score distributions of the different categories.
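The testing procedure can be sketched as follows, using a normal-approximation Mann–Whitney U test without tie correction (`scipy.stats.mannwhitneyu` is the production choice). The Bonferroni divisor of 45 is our reading of the setup: the pairwise comparisons among the ten categories, which yields the paper's p < 0.001 threshold after rounding; the two samples are toy data:

```python
import math

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value via the normal approximation
    (no tie correction; a sketch, not a full implementation)."""
    n1, n2 = len(x), len(y)
    combined = sorted(x + y)
    r1 = sum(combined.index(v) + 1 for v in x)  # rank sum of sample x (no ties here)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

# Two toy "category" distributions of, say, the Speed Index.
a = [0.10, 0.15, 0.22, 0.31, 0.40, 0.18, 0.27]
b = [0.12, 0.19, 0.25, 0.33, 0.41, 0.16, 0.29]
p = mann_whitney_p(a, b)

# Bonferroni: with 45 pairwise tests at the 0.05 level, reject only if
# p < 0.05 / 45 ~= 0.0011 (rounded to the 0.001 threshold in the text).
alpha_corrected = 0.05 / 45
print(p, p < alpha_corrected)
```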

Table 2. Percentage of p-values resulting from the two-sided Mann–Whitney U test between each pair of categories, employing their α, β, Speed Index and LH Score distributions.

            α        β       Speed Index   LH
p < 0.001   2.22%    0%      0%            20%
p ≥ 0.001   97.78%   100%    100%          80%

In contrast with the engagement evolution results, the topic’s category explains differences in the sentiment of reactions in 20% of cases. Such findings reveal that some categories are composed of significantly more negative and controversial topics, indicating how elicited reactions vary according to specific subjects. Knowing that some categories are more prone to induce negative feedback from users could inform how their related topics are introduced into the online debate.

Conclusions

In this work, we perform a quantitative analysis of user interest in a total of ∼57M Facebook posts referring to ∼300 different topics ranging from 2018 to 2022. We initially quantify the distribution of topics’ engagement evolution throughout the analysis. Then, we evaluate the relationship between engagement and controversy. Ultimately, we assess the differences in engagement across different categories of topics. Our findings show that, on average, users’ interest in topics does not increase exponentially right after their appearance but, instead, grows steadily until it reaches a saturation point. From a sentiment perspective, topics that reached a plateau in their engagement evolution right after their initial appearance are more likely to collect negative/controversial reactions, while topics with a steadier growth tend to attract positive users’ interactions. This result suggests that recommendation algorithms should introduce topics gradually, since sudden rises in topic diffusion could be related to the reinforcement of polarization mechanisms. Finally, we find no statistical difference between user interest across different categories of topics, providing evidence that, over a relatively large time window, the evolution of engagement with posts is primarily unrelated to their subject. On the contrary, we observe differences in the sentiment generated by topics with different diffusion speeds, suggesting that people perceive the content they consume online differently according to how suddenly they are exposed to it.

Users’ interest and engagement evolution in the online debate are both aspects of human behaviour on social media whose underlying dynamics still need to be uncovered from an individual point of view. Our findings provide an aggregate perspective on the interplay between major emerging behavioural dynamics and topics’ lifetime progression, deepening our understanding of the relationship between diffusion patterns and users’ reactions. Understanding that topics with an early burst in virality are associated with primarily adverse reactions from users may enable the identification of highly polarizing topics from their initial stage of diffusion.

The present study has some limitations. In data collection, CrowdTangle provides only posts from public Facebook pages with more than 25K Page Likes or Followers, public Facebook groups with at least 95K members, all US-based public groups with at least 2K members, and all verified profiles. These restrictions affect the representativeness of our dataset and the generality of our findings. Moreover, we could not access removed posts, groups, and pages, which could have been a meaningful proxy to characterize the attention dynamics of retracted content. Finally, since CrowdTangle does not provide information about the users interacting with posts, we cannot assess their engagement from an individual perspective, nor model the possible relationship between users and topics employing a network approach.

The results obtained in this work may help to better understand how users consume information, improving social media moderation tools by considering both the “life-cycle” of topics and their potential controversy. Indeed, the introduction of the Speed Index and the Love-Hate Score can be exploited to identify in advance topics with the potential to collect considerable interest and generate heated debates quickly. From a news outlet and content creator perspective, understanding that specific topics may reach broader audiences and produce controversial opinions can improve the quality of the communication produced by these two types of authors.

Supporting information

S1 Data. This CSV file contains, for each identified topic, the statistics of α and β value, the Love Hate Score, the first and last post dates, the topic lifetime (in days), the Speed Index value, the number of posts, total interactions and users posting.

(CSV)

S1 File. This file provides the topic aggregated statistics employed in the study.

Moreover, here are provided the figures reporting the correlations between α and LH Score for each topic and the goodness of the fitting procedures.

(PDF)

S1 Fig. Correlation between α and LH score for each identified topic.

(TIF)

S2 Fig. Joint distribution of the standard errors SE(α̂i) and SE(β̂i) for each topic i, whose cumulative curve was estimated by means of fα,β.

The colour of each point represents the number of posts produced by topic i.

(TIF)

Data Availability

Data cannot be shared publicly because the study mainly relies on Facebook posts obtained from CrowdTangle which, as stated in https://help.crowdtangle.com/en/articles/4558716-understanding-and-citing-crowdtangle-data, cannot be shared in CSV format. However, any researcher can request access to CrowdTangle. Our Supporting information files contain all the contents to guide the interested reader in replicating our study, with information about CrowdTangle access and data collection methodology.

Funding Statement

This study was supported by the 100683 EPID Project “Global Health Security Academic Research Coalition” provided by UK/G7 in the form of funds to WQ, GE, MA, MC [SCH-00001-3391], the SERICS under the NRRP MUR program funded by the European Union - NextGenerationEU in the form of funds to WQ [PE00000014], the project CRESP from the Italian Ministry of Health under the program CCM 2022 granted to WQ, and by the PON project “Ricerca e Innovazione,” funded by Ministero dell’Istruzione, dell’Università e della Ricerca, granted to MC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Taha Yasseri, Patrick Gildersleve, and Lea David. Collective memory in the digital age. arXiv preprint arXiv:2207.01042, 2022. [DOI] [PubMed]
  • 2. Lazaroiu George. The role of social media as a news provider. Review of Contemporary Philosophy, 13:78–84, 2014. [Google Scholar]
  • 3. Ahmad Ali Nobil. Is twitter a useful tool for journalists? Journal of media practice, 11(2):145–155, 2010. doi: 10.1386/jmpr.11.2.145_1 [DOI] [Google Scholar]
  • 4. Notarmuzi Daniele, Castellano Claudio, Flammini Alessandro, Mazzilli Dario, and Radicchi Filippo. Universality, criticality and complexity of information propagation in social media. Nature communications, 13(1):1–8, 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Brown Jo, Broderick Amanda J, and Lee Nick. Word of mouth communication within online communities: Conceptualizing the online social network. Journal of interactive marketing, 21(3):2–20, 2007. doi: 10.1002/dir.20082 [DOI] [Google Scholar]
  • 6. Kahn Richard and Kellner Douglas. New media and internet activism: from the ‘battle of Seattle’ to blogging. New media & society, 6(1):87–95, 2004. doi: 10.1177/1461444804039908 [DOI] [Google Scholar]
  • 7. McGregor Shannon C. Social media as public opinion: How journalists use social media to represent public opinion. Journalism, 20(8):1070–1086, 2019. doi: 10.1177/1464884919845458 [DOI] [Google Scholar]
  • 8.Roope Jaakonmäki, Oliver Müller, and Jan Vom Brocke. The impact of content, context, and creator on user engagement in social media marketing. Proceedings of the 50th Hawaii International Conference on System Sciences, 2017.
  • 9. Di Gangi Paul M and Wasko Molly M. Social media engagement theory: Exploring the influence of user engagement on social media usage. Journal of Organizational and End User Computing (JOEUC), 28(2):53–73, 2016. doi: 10.4018/JOEUC.2016040104 [DOI] [Google Scholar]
  • 10. Simon Herbert A et al. Designing organizations for an information-rich world. Computers, communications, and the public interest, 72:37, 1971. [Google Scholar]
  • 11. Kies Stephen C et al. Social media impact on attention span. Journal of Management & Engineering Integration, 11(1):20–27, 2018. [Google Scholar]
  • 12. Holt Kristoffer, Shehata Adam, Strömbäck Jesper, and Ljungberg Elisabet. Age and the effects of news media attention and social media use on political interest and participation: Do social media function as leveller? European journal of communication, 28(1):19–34, 2013. doi: 10.1177/0267323112465369 [DOI] [Google Scholar]
  • 13. Brooks Stoney. Does personal social media usage affect efficiency and well-being? Computers in Human Behavior, 46:26–37, 2015. doi: 10.1016/j.chb.2014.12.053 [DOI] [Google Scholar]
  • 14. Cinelli Matteo, Brugnoli Emanuele, Schmidt Ana Lucia, Zollo Fabiana, Quattrociocchi Walter, and Scala Antonio. Selective exposure shapes the facebook news diet. PloS one, 15(3):e0229129, 2020. doi: 10.1371/journal.pone.0229129 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Vicario Michela Del, Bessi Alessandro, Zollo Fabiana, Petroni Fabio, Scala Antonio, Caldarelli Guido, et al. The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3):554–559, 2016. doi: 10.1073/pnas.1517441113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Cinelli Matteo, Morales Gianmarco De Francisci, Galeazzi Alessandro, Quattrociocchi Walter, and Starnini Michele. The echo chamber effect on social media. Proceedings of the National Academy of Sciences, 118(9):e2023301118, 2021. doi: 10.1073/pnas.2023301118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Flaxman Seth, Goel Sharad, and Rao Justin M. Filter bubbles, echo chambers, and online news consumption. Public opinion quarterly, 80(S1):298–320, 2016. doi: 10.1093/poq/nfw006 [DOI] [Google Scholar]
  • 18. Cookson J Anthony, Engelberg Joseph, and Mullins William. Echo chambers. The Review of Financial Studies, 36.2 (2023): 450–500. doi: 10.1093/rfs/hhac058 [DOI] [Google Scholar]
  • 19.Joseph T Klapper. The effects of mass communication. 1960.
  • 20. Zollo Fabiana, Bessi Alessandro, Vicario Michela Del, Scala Antonio, Caldarelli Guido, Shekhtman Louis, et al. Debunking in a world of tribes. PloS one, 12(7):e0181821, 2017. doi: 10.1371/journal.pone.0181821 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Bessi Alessandro, Scala Antonio, Rossi Luca, Zhang Qian, and Quattrociocchi Walter. The economy of attention in the age of (mis) information. Journal of Trust Management, 1(1):1–13, 2014. [Google Scholar]
  • 22. Mocanu Delia, Rossi Luca, Zhang Qian, Karsai Marton, and Quattrociocchi Walter. Collective attention in the age of (mis) information. Computers in Human Behavior, 51:1198–1204, 2015. doi: 10.1016/j.chb.2015.01.024 [DOI] [Google Scholar]
  • 23. Cinelli Matteo, Quattrociocchi Walter, Galeazzi Alessandro, Valensise Carlo Michele, Brugnoli Emanuele, Schmidt Ana Lucia, et al. The covid-19 social media infodemic. Scientific reports, 10(1):1–10, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Etta Gabriele, Galeazzi Alessandro, Hutchings Jamie Ray, Smith Connor Stirling James, Conti Mauro, Quattrociocchi Walter, et al. Covid-19 infodemic on facebook and containment measures in italy, united kingdom and new zealand. PloS one, 17(5):e0267022, 2022. doi: 10.1371/journal.pone.0267022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Max Falkenberg, Alessandro Galeazzi, Maddalena Torricelli, Niccolò Di Marco, Francesca Larosa, Madalina Sas, et al. Growing polarization around climate change on social media, Nature Climate Change, pages 50–60, 2022.
  • 26. Candia Cristian, Jara-Figueroa C, Rodriguez-Sickert Carlos, Barabási Albert-László, and Hidalgo César A. The universal decay of collective memory and attention. Nature human behaviour, 3(1):82–91, 2019. [DOI] [PubMed] [Google Scholar]
  • 27. Briand Sylvie C, Cinelli Matteo, Nguyen Tim, Lewis Rosamund, Prybylski Dimitri, Valensise Carlo M, et al. Infodemics: A new challenge for public health. Cell, 184(25):6010–6014, 2021. doi: 10.1016/j.cell.2021.10.031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Bovet Alexandre and Makse Hernán A. Influence of fake news in twitter during the 2016 us presidential election. Nature communications, 10(1):1–14, 2019. doi: 10.1038/s41467-018-07761-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Carlo M Valensise, Matteo Cinelli, Matthieu Nadini, Alessandro Galeazzi, Antonio Peruzzi, Gabriele Etta, et al. Lack of evidence for correlation between covid-19 infodemic and vaccine acceptance. arXiv preprint arXiv:2107.07946, 2021.
  • 30.Fatemeh Tahmasbi, Leonard Schild, Chen Ling, Jeremy Blackburn, Gianluca Stringhini, Yang Zhang, et al. “go eat a bat, chang!”: On the emergence of sinophobic behavior on web communities in the face of covid-19. In Proceedings of the Web Conference 2021, WWW’21, page 1122–1133, New York, NY, USA, 2021. Association for Computing Machinery.
  • 31. Cinelli Matteo, Pelicon Andraž, Mozetič Igor, Quattrociocchi Walter, Novak Petra Kralj, and Zollo Fabiana. Dynamics of online hate and misinformation. Scientific reports, 11(1):1–12, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Social Media and Democracy: The State of the Field, Prospects for Reform. SSRC Anxieties of Democracy. Cambridge University Press, 2020.
  • 33.GDELT. Gdelt 2.0: Our global world in realtime.
  • 34. Blondel Vincent D, Guillaume Jean-Loup, Lambiotte Renaud, and Lefebvre Etienne. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008. doi: 10.1088/1742-5468/2008/10/P10008 [DOI] [Google Scholar]
  • 35.CrowdTangle Team. Understanding and Citing CrowdTangle Data.
  • 36.CrowdTangle Team. Crowdtangle. Facebook, Menlo Park, California, United States, 2020.
  • 37.GDELT. The GDELT project.
  • 38.Kalev Leetaru and Philip A Schrodt. Gdelt: Global data on events, location, and tone, 1979–2012. In ISA annual convention, volume 2, pages 1–49. Citeseer, 2013.
  • 39.Lucas Ou-Yang. Newspaper3k, 2013.
  • 40.Bass Frank M. A new product growth for model consumer durables. Management science, 15(5):215–227, 1969.
  • 41.Gabriel De Tarde. The laws of imitation. H. Holt, 1903.
  • 42. Everett M. Rogers. New Product Adoption and Diffusion. Journal of Consumer Research, 2(4):290–301, 3 1976. doi: 10.1086/208642 [DOI] [Google Scholar]
  • 43.Arnulf Grubler. The rise and fall of infrastructures: dynamics of evolution and technological change in transport. Physica-Verlag, 1990.
  • 44.Carlota Perez. Technological revolutions and financial capital. Edward Elgar Publishing, 2003.
  • 45.Les Robinson. Changeology. How to enable groups, communities and societies to do things they’ve never done before. 272p, 2012.
  • 46. Kanjanatarakul Orakanya, Suriya Komsan, et al. Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function. The Empirical Econometrics and Quantitative Economics Letters, 1(4):89–106, 2012. [Google Scholar]
  • 47. Spann Billy, Mead Esther, Maleki Maryam, Agarwal Nitin, and Williams Therese. Applying diffusion of innovations theory to social networks to understand the stages of adoption in connective action campaigns. Online Social Networks and Media, 28:100201, 2022. [Google Scholar]
  • 48.Beel Jacob, Xiang Tong, Soni Sandeep, and Yang Diyi. Linguistic Characterization of Divisive Topics Online: Case Studies on Contentiousness in Abortion, Climate Change, and Gun Control. Proceedings of the International AAAI Conference on Web and Social Media, 16(1):32–42, 2022.
  • 49.Hessel Jack and Lee Lillian. Something’s brewing! Early prediction of controversy-causing posts from discussion features. arXiv preprint arXiv:1904.07372, 2019.
  • 50.Henry B Mann and Donald R Whitney. On a Test of Whether One of Two Random Variables Is Stochastically Larger than the Other. The annals of mathematical statistics, pages 50–60, 1947.

Decision Letter 0

Vincent Antonio Traag

13 Mar 2023

PONE-D-22-32836
Characterizing Engagement Dynamics across Topics on Facebook
PLOS ONE

Dear Dr. ETTA,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The reviewer raised a few minor issues that should be addressed.

The data availability should be clarified further. In the cited URL of CrowdTangle, it only states that "raw data" cannot be shared publicly. On this basis, it would be possible to share the overall statistics at the topic level as constructed by the authors. Please include the data at this aggregated level, or provide a clear reason why this is also not possible.

In addition, there are a few other minor issues that should be improved:

- There are many Figure references missing; this should be corrected.

- There are some causal interpretations for which it is not clear whether such a causal interpretation is actually warranted. For instance, "topics with sudden virality tend to trigger more controversial and heterogeneous interactions", but it is not clear whether the virality actually *causes* the controversial interactions. There are some other such statements that should be revised.

- It is not really clear whether Speed Index (SI) has much to do with "Speed" or whether this more reflects a quick saturation of attention. Please motivate this measure more clearly.

- For the topic extraction, please make sure that the community detection employed is actually applicable to bipartite networks. There are several possibilities for clustering bipartite networks, and they provide different results from cluster methods that are intended to be used on unipartite networks.

Please submit your revised manuscript by Apr 27 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Vincent Antonio Traag, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Methods section, please include additional information about your dataset and ensure that you have included a statement specifying whether the collection and analysis method complied with the terms and conditions for the source of the data.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

"Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper explores the impact of engagement dynamics on Facebook by examining how the increased engagement of controversial topics may trigger heated discussions and increase polarization among users. The authors analyzed 57M posts from Facebook using logistic functions and an s-curve model to assess the evolution of different topics to show that similar patterns in engagement existed for the controversial topics. They measure the sentiment of posts by users' positive and negative reactions and find that topics with sudden virality tend to trigger more controversial and heterogeneous reactions.

The methodology initially uses the GDELT Event Database to create a sample of news articles, and those articles are reduced to top 10 representative words in the article. The authors then apply Louvain community detection on the co-occurrence term networks to identify topics of interest. Those terms were then used as input for the collection of Facebook posts.

The analysis uses sentiment scores to assess the topic’s controversy using Facebook’s reactions. The author's discussion of the difference in the sentiment of reactions between topics with sudden virality and those with a steady evolution is insightful. However, it’s unclear how the authors justify using sentiment scores as a good measure for controversy. In the results section, they define the measure of a Love-Hate score to measure controversy, but it’s slightly confusing from the beginning of the paper where the authors mention using sentiment scores as the measure of controversy. I think the Love-Hate score serves as the ‘sentiment’ score in this case, but it’s not clear. The authors also quantify the relationship between topic resonance and controversy, but again earlier in the paper it’s not clear how the terms of resonance, controversy, and sentiment align with the defined measures in the results section. Possibly add a table clarifying these terms, or just mention in the paper that these terms will be discussed later in the paper in Section X.

The author provides a good overview of the methodology used and the results obtained. However, the paper could benefit from more detailed explanations of some of the terms used, such as "resonance", "controversy”, and sentiment scores in the context these terms are used. It would also be helpful if the author could provide some recommendations for how news outlets or content producers could use this information to improve their content and engagement with users. There have also been other works that should be cited that use a similar approach of s-curves and the sigmoid function in social media such as (Spann, et al., 2022) and Diffusion of Innovations Theory (Bass, 1969).

To be considered for publication, the minor revisions should be applied:

1.) Consider adding a sentence or two explaining the context of topic resonance, controversy, and that sentiment scores are actually defined by the Love-Hate score (if that is indeed the case)

2.) The following references are relevant to the author's work, especially the discussion on diffusion of innovations and s-curves.

- Spann, B., Mead, E., Maleki, M., Agarwal, N., & Williams, T. (2022). Applying diffusion of innovations theory to social networks to understand the stages of adoption in connective action campaigns. Online Social Networks and Media, 28, 100201. https://doi.org/10.1016/j.osnem.2022.100201

- Bass, F. M. (1969). A new product growth for model consumer durables. Management science, 15(5), 215-227.

3.) Consider adding some recommendations for how news outlets or content producers could use this information to improve their content and engagement with users.

Overall, this is a well written and informative manuscript. The author's discussion of the difference in the sentiment of reactions between topics with sudden virality and those with a steady evolution is insightful. I think with the clarifications and additional citations from above, their quantitative analysis will make a valuable contribution to the research community.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jun 28;18(6):e0286150. doi: 10.1371/journal.pone.0286150.r002

Author response to Decision Letter 0


4 Apr 2023

Reviewer #1

Consider adding a sentence or two explaining the context of topic resonance, controversy, and that sentiment scores are actually defined by the Love-Hate score (if that is indeed the case)

Authors’ response:

We thank the reviewer for the comment. We addressed this ambiguity by clarifying the usage of resonance and controversy and what the Love-Hate Score indicates. As a result, we introduced the following changes in the text:

In the Introduction section, at Page 2, we changed the following sentence

“We first provide a quantitative assessment of topics' resonance through time, extracting insightful parameters from their engagement evolution. Then, we exploit the obtained parameters by assessing relationships with the sentiment expressed by users through their positive and negative reactions”

To

“We first provide a quantitative assessment of topics' attention through time, extracting insightful parameters from their engagement evolution. Then, we construct a metric called the Love-Hate Score to estimate the level of controversy associated with a topic using the sentiment of users' engagement, as expressed by the normalized difference between their positive and negative reactions. ”

Page 5, in Fitting cumulative engagement evolution section, we changed the sentence “Such behaviour corresponds to those topics that require some time before gaining resonance with the public” into “Such behaviour corresponds to those topics that require some time before gaining maximum diffusion with the public”

Page 6, we changed the content of the Love-Hate Score section, from

“To quantify the level of sentiment that a Facebook post produces, we define a measure of controversy called the Love-Hate (LH) Score $LH(i) \in \left[-1,1\right]$ as

\begin{equation}
\label{eq:lhi}
LH(i) = \frac{l_i-h_i}{l_i+h_i},
\end{equation}

where $h_i$ and $l_i$ are respectively the total number of Angry and Love reactions collected by a post $i$. A value of $LH$ equal to $-1$ indicates that the post received only Angry reactions from the users, while a value equal to $1$ indicates that the post received only $Love$ reactions.”

To

“To quantify the level of controversy that a Facebook post may produce, we define a measure called the Love-Hate (LH) Score. In line with previous works that quantified controversy from post reactions [48-49], we define the LH Score $LH(i) \in \left[-1,1\right]$ as

\begin{equation}
\label{eq:lhi}
LH(i) = \frac{l_i-h_i}{l_i+h_i},
\end{equation}

where $h_i$ and $l_i$ are respectively the total number of \textit{Angry} and \textit{Love} reactions collected by a post $i$. A value of $LH$ equal to $-1$ indicates that the post received only \textit{Angry} reactions from the users, while a value equal to $1$ indicates that the post received only $Love$ reactions. Therefore, a value close to $0$ reflects the presence of controversy in a post due to a balance of positive and negative reactions.”
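As a quick illustration of the quoted definition, here is a minimal Python sketch of the LH Score; the function name and the zero-reaction guard are additions for illustration, not from the manuscript:

```python
def love_hate_score(loves: int, angries: int) -> float:
    """Love-Hate Score: LH(i) = (l_i - h_i) / (l_i + h_i), in [-1, 1].

    `loves` and `angries` are the Love and Angry reaction counts of a
    post; the zero-reaction guard is an addition, not from the paper.
    """
    total = loves + angries
    if total == 0:
        raise ValueError("post has no Love or Angry reactions")
    return (loves - angries) / total

print(love_hate_score(10, 0))  # only Love reactions  -> 1.0
print(love_hate_score(0, 10))  # only Angry reactions -> -1.0
print(love_hate_score(5, 5))   # balanced reactions   -> 0.0 (controversial)
```

Values near $0$ thus flag posts whose audience is split between strongly positive and strongly negative reactions.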

We also included two references of papers ([48-49]) that already implemented a similar way of quantifying controversy through post interactions. The papers are:

Beel, J., Xiang, T., Soni, S., & Yang, D. (2022). Linguistic Characterization of Divisive Topics Online: Case Studies on Contentiousness in Abortion, Climate Change, and Gun Control. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 32-42.

Hessel, Jack; LEE, Lillian. Something's brewing! Early prediction of controversy-causing posts from discussion features. arXiv preprint arXiv:1904.07372, 2019.

We changed the title of the Quantifying topic resonance section, at Page 6, to Quantifying topic engagement evolution. Moreover, in this section we changed its initial part from

“We first provide a quantitative assessment of the topics' resonance on social media. To do so, we perform a Non-linear Least Squares (NLS) regression by fitting the sigmoid function $f_{\\alpha,\\beta} (t)$ to the cumulative evolution of the engagement for each topic.”

To

“We first provide a quantitative assessment of the evolution of engagement with topics on social media. To do so, we perform a Non-linear Least Squares (NLS) regression by fitting the sigmoid function $f_{\\alpha,\\beta} (t)$ to the cumulative engagement gained by each topic.”
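The NLS fit described in the quoted passage can be sketched with SciPy's `curve_fit` on synthetic data; since the manuscript's exact parameterization of $f_{\alpha,\beta}(t)$ is not reproduced in this letter, a standard two-parameter logistic is assumed here:

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(t, alpha, beta):
    # assumed two-parameter logistic; the paper's exact f_{alpha,beta} may differ
    return 1.0 / (1.0 + np.exp(-alpha * (t - beta)))

# synthetic cumulative engagement, normalized to [0, 1], with small noise
rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
y = sigmoid(t, 0.2, 40.0) + rng.normal(0.0, 0.01, t.size)

# Non-linear Least Squares fit of the sigmoid to the cumulative series
(alpha_hat, beta_hat), _ = curve_fit(sigmoid, t, y, p0=(0.1, 50.0))
print(alpha_hat, beta_hat)  # should recover roughly (0.2, 40.0)
```

Here $\alpha$ controls how steeply engagement grows and $\beta$ locates the inflection point of the curve.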

On page 7, we changed the Evaluating the relationship between topic resonance and controversy section name into Evaluating the relationship between topic engagement and controversy. Moreover, in that section we replaced “adverse” with “negative” and “gain resonance” with “obtaining maximum diffusion”, and we changed “To quantify the interplay between users’ interest in a topic and the controversy it produces” to “To quantify the interplay between users’ interest in a topic and the associated level of controversy”.

In the Conclusions section, at Page 9, we replaced the first occurrence of “resonance” with “engagement evolution”. In the same section, we then replaced the second occurrence of “gained resonance” with “reached a plateau in their engagement evolution” and the third one with “diffusion”.

The following references are relevant to the author's work, especially the discussion on diffusion of innovations and s-curves.

- Spann, B., Mead, E., Maleki, M., Agarwal, N., & Williams, T. (2022). Applying diffusion of innovations theory to social networks to understand the stages of adoption in connective action campaigns. Online Social Networks and Media, 28, 100201. https://doi.org/10.1016/j.osnem.2022.100201

- Bass, F. M. (1969). A new product growth for model consumer durables. Management science, 15(5), 215-227.

Authors’ response:

We thank the reviewer for pointing out these relevant references, which we included in the manuscript. Following the suggestion, we modified the beginning of the Fitting cumulative engagement evolution subsection, at Page 4, from:

“The diffusion of new ideas has been widely studied in the past [40-45], indicating how the logistic function can effectively model the diffusion of innovations.”

To

“The study of the diffusion of new ideas has been carried on through the years, starting from the Bass diffusion model [41] and then extended to a multitude of topics [42-48], indicating the relevance of s-curves in the analysis of innovation spreading.”

Here, [41] is the reference to Bass’ paper and [48] the reference to Spann’s work.

Consider adding some recommendations for how news outlets or content producers could use this information to improve their content and engagement with users.

Authors’ response:

We thank the reviewer for this important suggestion. Accordingly, we added the following part in the Conclusions section, at Page 10:

The results obtained in this work may help to better understand how users consume information, helping social media regulators improve their moderation tools considering both the "life-cycle" of topics and their potential controversy. Indeed, the introduction of the Speed Index and the Love-Hate score can be exploited to identify in advance topics with the potential to collect considerable interest and generate heated debates quickly. From a news outlet and content creator perspective, understanding that specific topics may reach broader audiences and produce controversial opinions can improve the quality of the communication produced by these two types of authors.

Minor Issues addressed by the Editor:

The data availability should be clarified further. In the cited URL of CrowdTangle, it only states that "raw data" cannot be shared publicly. On this basis, it would be possible to share the overall statistics at the topic level as constructed by the author. Please include the data at this aggregated level, or provide a clear reason why this is also not possible.

Authors’ response:

We thank the editor for the comment. We provided a CSV file named Topic_Aggregated_Data.csv and a corresponding section in SI, at Page 18, called Data breakdown at topic level. Such a section contains the title of the file (Topic_Aggregated_Data.csv File) and the following caption:

“This CSV file contains, for each identified topic, the statistics of $\\alpha$ and $\\beta$ value, the Love Hate Score, the First and Last Post Dates, the Topic Lifetime (in days), the Speed Index value, the number of Posts, Total Interactions and Users posting.”

In the Metrics section, at Page 4, we added the following sentence to refer to the new dataset:

“A topic-aggregated version of the dataset, containing all the metrics defined in this section, can be found in Section \\ref{S1_File} of SI.”

There are many Figure references missing, this should be corrected.

Authors’ response:

We thank the editor for the comment. We reformatted the manuscript by including the figures, thus restoring the original references.

There are some causal interpretations for which it is not clear whether such a causal interpretation is actually warranted. For instance, "topics with sudden virality tend to trigger more controversial and heterogeneous interactions", but it is not clear whether the virality actually *causes the controversial interactions. There are some other such statements that should be revised.

Authors’ response:

We thank the editor for the comment. We reviewed all the sentences that refer to causal interpretations, revising them in all cases where the result was obtained through correlation rather than causal analysis.

According to this purpose, we modified the following sentences:

In the Introduction section, Page 2, we modified the sentence “Finally, we find that topics with sudden virality tend to trigger more controversial and heterogeneous interaction” to “Finally, we find that topics with sudden virality tend to occur with more controversial and heterogeneous interaction”

In the Evaluating the relationship between topic engagement and controversy, Page 7, we modified the “[...] providing further evidence of their intrinsic polarizing attitude [...]” sentence into “[...] providing an indication of their intrinsic polarizing attitude [...]”

In the Conclusions section, Page 9, we modified the “This result provides evidence about how recommendation algorithms should introduce topics adequately since sudden rises in topic resonance tend to reinforce the polarization mechanisms” into “This result provides evidence about how recommendation algorithms should introduce topics adequately since sudden rises in topic diffusion could be related to the reinforcement of polarization mechanisms”

In the Conclusions section, Page 9, we modified the “On the contrary, we observe differences in the sentiment generated by the different topics, providing evidence of how polarisation drives people to perceive the piece of content they consume online in different ways, according to their framing and system of beliefs.” into “On the contrary, we observe differences in the sentiment generated by topics with different diffusion speed, providing evidence of how people perceive the piece of content they consume online in different ways, according to how suddenly they get exposed to it.”

In the Conclusions section, Page 9, we modified the “Understanding that topics with an early burst in virality are associated with primarily adverse reactions from users sheds light on their tendency to react instinctively to new content. This approach enables the identification of highly polarizing topics since their initial stage of diffusion, by observing the heterogeneity of users' reactions.” into “Understanding that topics with an early burst in virality are associated with primarily adverse reactions from users may enable the identification of highly polarizing topics since their initial stage of diffusion.”

It is not really clear whether Speed Index (SI) has much to do with "Speed" or whether this more reflects a quick saturation of attention. Please motivate this measure more clearly.

Authors’ response:

We thank the editor for the comment. To remove the ambiguity in the definition of the Speed Index (SI), in the Speed Index Section, Page 5, we replaced the “To model the evolution of a topic by taking into account the joint contribution of α and β parameters” sentence with “To provide a measure of how quickly the attention towards a topic reaches its saturation, we define a measure called the Speed Index”. We therefore changed the explanation of the equation from “where T represents the time of the last observed value for fα,β (t).” to “The SI considers the joint contribution of α and β parameters, where T represents the time of the last observed value for fα,β (t).”
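Since the manuscript's exact SI formula is not reproduced in this letter, the "quick saturation of attention" idea can only be illustrated with a proxy. The sketch below measures the fraction of the observation window $[0, T]$ needed for a fitted logistic to reach a fixed share of its plateau; this is an illustrative assumption, not the paper's definition of SI:

```python
import math

def saturation_time_fraction(alpha: float, beta: float, T: float,
                             level: float = 0.99) -> float:
    """Fraction of the window [0, T] needed for the logistic
    f(t) = 1 / (1 + exp(-alpha * (t - beta))) to reach `level` of its
    plateau. Illustrative proxy only, NOT the paper's SI formula.
    Smaller values mean quicker saturation of attention."""
    # invert the logistic: f(t) = level  =>  t = beta + ln(level / (1 - level)) / alpha
    t_level = beta + math.log(level / (1.0 - level)) / alpha
    return min(max(t_level / T, 0.0), 1.0)

slow = saturation_time_fraction(alpha=0.2, beta=40.0, T=100.0)
fast = saturation_time_fraction(alpha=1.0, beta=10.0, T=100.0)
print(fast, slow)  # the "fast" topic saturates much earlier in its window
```

Under this proxy, topics with a large $\alpha$ and an early inflection point $\beta$ saturate early, matching the intuition of sudden virality.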

For the topic extraction, please make sure that the community detection employed is actually applicable to bipartite networks. There are several possibilities for clustering bipartite networks, and they provide different results from cluster methods that are intended to be used on unipartite networks.

Authors’ response:

We thank the editor for the comment. We actually employed community detection on the projection of the bipartite network on the keywords. Therefore, in the Overview of the data collection process section, at Page 2, we changed the “Consequently, we apply the Louvain community detection algorithm [34] on the co-occurrence term network to identify the topics of interest” sentence to “Consequently, we apply the Louvain community detection algorithm [34] on the bipartite projection of the co-occurrence term network to identify the topics of interest”.

For coherence, we applied the same clarification in the Figure 1 caption, resulting in the following sentence “[...] The bipartite projection of the co-occurrence network built upon these terms serves as an input for the Louvain community detection algorithm to identify keyword clusters. [...]”
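The clarified pipeline (bipartite article-keyword network, projection onto the keyword side, Louvain on the resulting unipartite network) can be sketched with networkx on a toy example; all node names are illustrative, and networkx >= 2.8 is assumed for `louvain_communities`:

```python
import networkx as nx
from networkx.algorithms import bipartite

# toy bipartite article-keyword network (names are illustrative)
B = nx.Graph()
articles = ["a1", "a2", "a3", "a4"]
B.add_nodes_from(articles, bipartite=0)
B.add_edges_from([
    ("a1", "election"), ("a1", "vote"),
    ("a2", "election"), ("a2", "vote"),
    ("a3", "climate"), ("a3", "emissions"),
    ("a4", "climate"), ("a4", "emissions"),
])

# project onto the keyword side: keywords are linked when they co-occur
# in the same article, weighted by the number of shared articles
keywords = {n for n in B if n not in set(articles)}
K = bipartite.weighted_projected_graph(B, keywords)

# Louvain community detection on the projected (unipartite) network
topics = nx.community.louvain_communities(K, weight="weight", seed=0)
print(topics)  # two keyword clusters: {election, vote} and {climate, emissions}
```

Running Louvain on the projection, rather than on the bipartite graph itself, avoids the modularity issues that arise when a unipartite method is applied directly to a two-mode network.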

Journal Requirements:

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

Authors’ response:

We thank the editor for the comment. We carefully revised the paper in order to meet PLOS ONE’s style requirements, including file naming.

In your Methods section, please include additional information about your dataset and ensure that you have included a statement specifying whether the collection and analysis method complied with the terms and conditions for the source of the data

Authors’ response:

We thank the editor for the comment. In the Overview of data collection process, at Page 2, we added the following part:

“The data collection and analysis process are compliant with the terms and conditions [35] imposed by Crowdtangle [36]. Therefore, the results described in this paper cannot be exploited to infer the identity of the accounts involved.”

In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety.

Authors’ response:

We thank the editor for the comment. As explained in the previous response, we cannot share a minimal dataset to reproduce the results. Therefore, we changed the Data Availability statement accordingly.

Please review your reference list to ensure that it is complete and correct.

Author’s response:

We thank the editor for the comment. We carefully revised the entire reference list, performing the following changes:

For the Collective memory in the digital age paper, we added the arXiv reference (arXiv preprint arXiv:2207.01042).

For the Falkenberg22 paper, we added its title, i.e., Growing polarization around climate change on social media.

For the mann1947test paper, we corrected its title to On a Test of Whether One of Two Random Variables Is Stochastically Larger than the Other.

We removed a duplicate about the GDELT Project.

We changed the Acknowledgments section by including all the funding authorities that supported this study. The section now reads: “The work is supported by IRIS Infodemic Coalition (UK government, grant no. SCH-00001-3391), SERICS (PE00000014) under the NRRP MUR program funded by the European Union - NextGenerationEU, project CRESP from the Italian Ministry of Health under the program CCM 2022, and PON project “Ricerca e Innovazione” 2014-2020.”

Attachment

Submitted filename: Response_to_Reviewers.pdf

Decision Letter 1

Vincent Antonio Traag

10 May 2023

Characterizing Engagement Dynamics across Topics on Facebook

PONE-D-22-32836R1

Dear Dr. ETTA,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Vincent Antonio Traag, Ph.D.

Academic Editor

PLOS ONE

Acceptance letter

Vincent Antonio Traag

1 Jun 2023

PONE-D-22-32836R1

Characterizing Engagement Dynamics across Topics on Facebook

Dear Dr. Quattrociocchi:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Vincent Antonio Traag

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

S1 Data. This CSV file contains, for each identified topic, the statistics of the α and β values, the Love Hate Score, the first and last post dates, the topic lifetime (in days), the Speed Index value, the number of posts, the total interactions, and the number of users posting.

    (CSV)

    S1 File. This file provides the topic-aggregated statistics employed in the study.

    It also provides the figures reporting the correlation between α and the LH Score for each topic, as well as the goodness of fit of the fitting procedures.

    (PDF)

    S1 Fig. Correlation between α and LH score for each identified topic.

    (TIF)

    S2 Fig. Joint distribution of the errors SE(α̂ᵢ) and SE(β̂ᵢ) for each topic i, whose cumulative curve was estimated by means of f(α,β).

    The colour of each point represents the number of posts produced by topic i.

    (TIF)

    Attachment

    Submitted filename: Response_to_Reviewers.pdf

    Data Availability Statement

    Data cannot be shared publicly because the study mainly relies on Facebook posts obtained from CrowdTangle, which, as stated in https://help.crowdtangle.com/en/articles/4558716-understanding-and-citing-crowdtangle-data, cannot be shared in CSV format. However, any researcher can request access to CrowdTangle. Our Supporting information files contain all the information needed to guide the interested reader in replicating our study, including details about CrowdTangle access and the data collection methodology.

