2020 Oct 14;17(4):2145–2155. doi: 10.1109/TNSM.2020.3031034

Critical Impact of Social Networks Infodemic on Defeating Coronavirus COVID-19 Pandemic: Twitter-Based Study and Research Directions

Azzam Mourad 1, Ali Srour 1, Haidar Harmanani 1, Cathia Jenainati 2, Mohamad Arafeh 1
PMCID: PMC8544937

Abstract

News creation and consumption have been changing since the advent of social media. An estimated 2.95 billion people used social media worldwide in 2019. The wide spread of the Coronavirus COVID-19 resulted in a tsunami of social media activity. Most platforms were used to transmit relevant news, guidelines and precautions to people. According to WHO, uncontrolled conspiracy theories and propaganda are spreading faster than the COVID-19 pandemic itself, creating an infodemic and thus causing psychological panic, misleading medical advice, and economic disruption. Accordingly, discussions have been initiated with the objective of moderating all COVID-19 communications, except those initiated from trusted sources such as the WHO and authorized governmental entities. This article presents a large-scale study based on data mined from Twitter. Extensive analysis has been performed on approximately one million COVID-19 related tweets collected over a period of two months. Furthermore, the profiles of 288,000 users were analyzed, including unique users’ profiles, meta-data and tweets’ context. The study reached several notable conclusions, including the critical impact, in terms of reach level, of (1) the exploitation of the COVID-19 crisis to redirect readers to irrelevant topics and (2) the wide spread of unauthentic medical precautions and information. Further data analysis revealed the importance of using social networks in a global pandemic crisis by relying on credible users with a variety of occupations, content developers and influencers in specific fields. In this context, several insights and findings are provided, along with computing and non-computing implications and research directions for potential solutions and social network management strategies during crisis periods.

Keywords: Coronavirus, COVID-19, pandemic, infodemic, misinformation, misleading information, social networks, social networks management, defeating coronavirus, data analytics

I. Introduction

The worldwide spread of the COVID-19 infectious disease resulted in a pandemic that has threatened millions of lives. Social media has been playing a major role in disseminating information about the virus and its impact through a multitude of measures, including the continuous transmission of local and global updates, as well as issuing warnings and guidelines for dealing with the pandemic and its aftermath. According to Statista [1], an estimated 2.95 billion people used social media worldwide in 2019. The number is projected to increase to 3.43 billion in 2023. One remarkable statistic concerns the continually changing demographic of new consumers and the increase in social media penetration reach. The Pew Research Center [2] reported in 2018 that “most Americans continue to get news on social media, even though they may have concerns about its accuracy.” Numerous surveys have been undertaken to capture the online behavior of news consumers worldwide, and the trend seems to be that, for the majority of people, social media platforms are highly influential when it comes to acquiring news stories. In a large-scale study conducted in 2019 by Ofcom [3], the U.K. Government’s regulator for public communications services, it was shown that “half of the adults in the U.K. now use social media to keep up with the latest news.” Furthermore, governments and major centers for disease control, including the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC), are relying on social networks as a means of managing the evolving pandemic by regularly disseminating guidance and updates and by providing emergency responses.

The dark side of social media was exhibited in a tsunami of fake and unreliable news that ranged from selling fake cures to using social media as a platform to launch cyberattacks on critical information systems. This led the United Nations to warn against a proliferation of false information about the virus and the emergence of the COVID-19 infodemic, according to WHO Director-General Tedros Adhanom Ghebreyesus at the Munich Security Conference on Feb. 15, 2020 [4]. Moreover, various researchers and news outlets [5]–[12] tackled the rising infodemic issue and presented real-life case studies detailing actual examples that impeded people from acting appropriately during the infodemic. For example, malicious users have used social media platforms such as Facebook, Twitter, Instagram, YouTube and WhatsApp in order to spread panic and confusion through a deliberate overabundance of misleading information and rumors. A notable false claim that 5G damages the immune system and consequently causes the COVID-19 outbreak went viral and resulted in the vandalism of cell towers in Europe [13]. Other conspiracy theories spread rumors regarding the source of and cure for COVID-19 at a time when people needed to focus on doing the right thing to control the disease and mitigate its impact (e.g., the virus does not infect children, the virus dies in temperatures above 27 degrees, a certain diet cures and provides immunity from the virus, a cure has been discovered). Cyberattacks also flourished during the outbreak [9]. Videos, photos and posts in different languages exploited the COVID-19 context in order to redirect the general public to shady websites and lead them to inadvertently install spyware. Some cybersecurity firms claimed that 3–8% of the newly registered COVID-19 related sites are suspicious, while others claimed that phishing messages about potential cures lead to the installation of malware.

Consequently, organizations, governments and business leaders exerted heavy pressure on social media platforms in order to curtail the flood of fake news and viral misinformation. This became a priority in order to ensure that people in lockdown received accurate and medically sound information. Although social media outlets claimed to have control over identifying and banning harmful content, it soon became apparent that they themselves were not well prepared and needed contingency plans in order to respond to the COVID-19 infodemic. The focus of social network platforms had mainly been on advertising and offering personalized services to both industries and people while analyzing human behavior and preferences. In fact, most platforms are now filtering and banning users who are identified as sources of misinformation. According to [9], the machine learning and artificial intelligence moderation adopted to identify credible content in social media resulted in several unfair bans of user accounts and content due to the shortage of human verification and review during the pandemic.

In this context, this article provides a large-scale quantitative measurement of the critical impact of the social networks infodemic during the COVID-19 pandemic. Please note that we use the term impact to mean the potential reach of tweets, i.e., impact in terms of reach level. The first objective is to explore quantitatively, using a large dataset, the potential reach of this infodemic to billions of users on social media platforms. The second objective is to technically identify the shortcomings that led to the infodemic and provide some research directions to limit the impact of its spread. We tackle the evolving challenges using a large dataset that was extracted from Twitter targeting COVID-19. The study uses a data analytics approach based on tweets’ meta-data, text and context, as well as users’ meta-data and profiles. We extensively explore one million COVID-19-related tweets, belonging to 288K users, that were collected over a period of two months. The analysis of the unique users’ profiles, meta-data and context of the tweets allowed us to deduce various important findings and insights while providing guidance for potential solutions. To the best of our knowledge, except for Li et al. [14], who characterized the propagation of situational information in social media during COVID-19, no computing-related work has yet experimentally analyzed either the reach level or the positive/negative impacts of social networks in propagating the COVID pandemic. Accordingly, this article contributes by highlighting, based on empirical analysis, several findings and directions for a research field expected to become of great importance in the near future.

We provide in the sequel a summary of our findings, using the following terminology: a Tweet refers to a unique tweet excluding retweets, Interactions refer to the total number of retweets and favorites per unique tweet, and Reach refers to the total number of followers of the user who initiated the unique tweet and reflects the number of tweeters that may potentially see and interact with it. The initial results indicate that around 16.1% of the tweets (i.e., 160K Tweets, 2.1M Interactions and 5.6B Reach) are exploiting the COVID-19 context for advertisement, redirecting users to out-of-scope topics or even maliciously misleading the community. A further lexicon-based analysis of the context and users’ meta-data confirms that only 3.5% of the unique users initiating the tweets have a medical profile while 2.8% are virus specialists. Accordingly, at least 93.7% of the COVID-related tweets (i.e., 800K Tweets, 17M Interactions and 30B Reach) may be transmitting misleading or unverified medical information. Conversely, and in order to highlight the importance of non-medical users in spreading important medical information, a deeper analysis was performed to identify unique users with key specialties. Results reaffirmed our initial findings and showed that users with context-relevant occupations such as doctor, writer, reporter, journalist, editor and governor do not even constitute 1% of the total Reach count (i.e., 300M out of 37B). Accordingly, these insights illustrate the need to identify relevant influencers in specific contexts and to seek their help in order to disseminate verified and reliable information.

The contributions of this work are threefold:

  • Quantitative measurements analyzing the potential reach level of the social network infodemic during the COVID pandemic. To the best of our knowledge, no computing-related work has yet experimentally analyzed either the reach level or the positive/negative impacts of social networks in propagating the COVID pandemic.

  • A lexicon-based data analytics approach for social networks users and content using natural language processing techniques. Insights into user profiles and tweet contexts are inferred in order to (1) detect misleading information that is spread using tweets that exploit COVID-19 and (2) measure the credibility and reliability of the disseminated COVID-19-related tweets by classifying the tweeters based on users’ specialties and occupations.

  • Elaboration of both computing and non-computing findings, implications, social networks management strategies and research directions addressing the infodemic-related problems, supported by a thorough literature review, for a field expected to become of great importance in the near future.

This article is organized as follows. In Section II, we describe the research methodology, while in Section III we analyze the impact of misleading Twitter contexts. Section IV provides an empirical analysis of the impact of COVID-19 related posts per user specialty and occupation. Section V details our research findings and directions. In Section VI, a thorough literature review related to the addressed problems is presented, while Section VII concludes with comments.

II. Methodology and Data Processing

The adopted lexicon-based data analytics methodology for social networks users and content, illustrated in Figure 1, is based on natural language processing (NLP) techniques. It starts by choosing the pertinent topic and selecting the top used hashtags. A search query that forms the basis of the data collection scripts was then built and the keywords were selected. The system systematically fetched approximately one million tweets from Twitter along with their corresponding users’ profiles. A descriptive analysis report was then generated by aggregating the collected records. In order to gain deeper insight into the collected data, we developed five different lexicons. The lexicon properties allowed the system to analyze the content and consequently build the targeted aggregations. NLP techniques were used in order to classify the tweets and the users based on the above analysis. Finally, the results were aggregated and inferences were made based on users’ occupations.

Fig. 1. Overview.

In the sequel, we provide an ordered and detailed description of the methodology presented in Figure 1 including the proposed approaches and elaborated solutions within each of the system modules:

  • A crawler Python script was implemented using tweepy [15] to collect one million public tweets that include the terms “corona” or “covid.” To classify hashtags as COVID-related or non-COVID-related, a list of the top used hashtags within the context of the COVID pandemic was created (the COVID Hashtags Lexicon), independently of the keywords (“covid” and “corona”) that were used to collect the tweets. Once the data was fetched, a list of the unique users who initiated the tweets was extracted, and Twitter REST API [16] access tokens were used in order to fetch the public profiles and perform the aggregations and analysis.

  • A set of lexicons was built based on a special list of keywords to classify tweets into corona or non-corona related ones and infer insights from the tweet and user profile datasets. In this regard, the lexicons were used as a base for the NLP entity extractor to classify each tweet based on its content regardless of the hashtags. Similarly, they were also used to classify users that have medical and speciality backgrounds. The following are the five built lexicons: Corona Top Used Hashtags Lexicon, Corona Social Media Context Lexicon for Tweets, Occupation Lexicon for Grouping Users Based on their Biographic Information, Medical Occupation Lexicon for Users and Virus Specialty Occupation Lexicon for Users. We built the NLP model using our manually created lexicons and dictionaries since we could not find previously created dictionaries about COVID at the time of the study. In order to build the COVID related lexicons (hashtags and context), we fetched the top used keywords and hashtags on Twitter and manually built a list of keywords and expressions for each lexicon. However, for the “Occupation” lexicon, we used a previously collected list of keywords (extracted from the Google Cloud and Amazon AWS NLP modules) to classify user profiles based on text similarities and occurrences. Also, we built the “Medical” and “Virus Specialist” lexicons by manually searching and filtering thousands of Twitter user profiles to collect user occupational titles and job descriptions related to these themes. In the sequel, we present more details about the components of each lexicon:
    • COVID Hashtags Lexicon: contains a list of most of the used hashtags about COVID on Twitter.
    • COVID Context Lexicon: contains a dictionary of keywords, expressions and abbreviations that are being used during the COVID pandemic (e.g., stay at home, masks, virus, covid china).
    • Occupation Lexicon: contains a list of keywords and expressions about the most common job titles, descriptions and classes (e.g., engineer, writer, journalist, economist, doctor, musician, consultant).
    • Medical Context Lexicon: contains a dictionary of keywords, expressions and abbreviations that are directly related to the medical family (e.g., doctor, clinic, psychiatrist).
    • Virus Specialty Lexicon: contains a dictionary of keywords, expressions and abbreviations that are directly related to viruses and biological clinical occupations (e.g., virus specialist, bacteriologist, vaccines, immunology).
  • The lexicon-based processing scripts were next built in order to extract entities from tweets as well as from the user biography fields. Accordingly, we inferred credibility measurements using NLP analysis. The distributed scripts simultaneously processed tweets and user records in order to tag each record with its final values (i.e., isCorona, isMedicalProfile, isSpecialtyProfile, isCoronaHashtag, and the list of detected occupations).

  • The dataset was next decorated for advanced filtering and analysis queries by merging the aggregated data into one enriched dataset, which was augmented with the following attributes: Unique Tweet ID, Hashtag Counts per Tweet, Favorite Counts per Tweet, Retweet Counts per Tweet, Mention Counts per Tweet, Interactions (favorite and retweet) Counts per Tweet, Total Reach Count (number of followers per user per unique tweet), Unique User ID, Claimed Locations per User, Occupations per User (extracted from the user profile biography field), isCorona-Related (a Boolean expression), isMedicalProfile-Related (a Boolean expression) and isSpecialtyProfile-Related (a Boolean expression).

  • An occupation classification was next performed in order to better understand the effect of the tweets that were initiated by users with different roles and specialities. Within this context, we counted each user’s unique tweets per occupation group (e.g., journalists), calculated the total Interactions per tweet, and calculated the total Reach counts caused by the mentioned group of users.

  • An analysis of the correlation across users/groups that have medical profiles as well as a specialization in the study of viruses or infectious diseases was performed. Both profile types share similar entities and keywords, and thus we attempted to highlight and study the impact of users with virus specialization profiles rather than those with a general medical background by sub-categorizing users with medical profiles.
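
The tagging step in the list above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicons below contain only the example terms quoted in this section, the case-insensitive whole-word matching rule is an assumption, and the helper names (find_terms, tag_tweet, tag_user) are hypothetical. Only the output field names (isCorona, isMedicalProfile, isSpecialtyProfile) come from the text.

```python
import re

# Miniature lexicons built from the example terms quoted in the text.
# The study's actual lexicons are much larger and were curated manually.
COVID_CONTEXT_LEXICON = ["stay at home", "masks", "virus", "covid china"]
MEDICAL_LEXICON = ["doctor", "clinic", "psychiatrist"]
VIRUS_SPECIALTY_LEXICON = ["virus specialist", "bacteriologist",
                           "vaccines", "immunology"]
OCCUPATION_LEXICON = ["engineer", "writer", "journalist", "economist",
                      "doctor", "musician", "consultant"]


def find_terms(text, lexicon):
    """Return the lexicon entries found in text as whole words or phrases.

    Assumption: matching is case-insensitive and bounded by word edges,
    so "doctor" does not match "doctors" or "doctoral".
    """
    lowered = text.lower()
    return [term for term in lexicon
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)]


def tag_tweet(text):
    """Tag a tweet from its content only, regardless of its hashtags."""
    return {"isCorona": bool(find_terms(text, COVID_CONTEXT_LEXICON))}


def tag_user(biography):
    """Tag a user from the free-text biography field of the profile."""
    return {
        "isMedicalProfile": bool(find_terms(biography, MEDICAL_LEXICON)),
        "isSpecialtyProfile": bool(find_terms(biography,
                                              VIRUS_SPECIALTY_LEXICON)),
        "occupations": find_terms(biography, OCCUPATION_LEXICON),
    }


print(tag_tweet("Please stay at home and wear masks"))
print(tag_user("Bacteriologist and part-time writer"))
```

Strict whole-word matching keeps the sketch predictable but misses plurals and inflected forms; a production pipeline would likely add stemming or a fuller entity extractor, as the text's reference to an NLP entity extractor suggests.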

III. Impact Analysis of Tweets Exploiting COVID-19 Context

In this section, we present the main findings and discuss the insights and the results of the analysis based on the predefined framework approach and KPIs. As we processed around 109.3K hashtags from the one million random unique tweets, it was important to classify each hashtag according to its direct relationship to the COVID family of hashtags. For instance, regardless of the context of the tweet, a hashtag that matches or partially contains #COVID or #CORONA is classified as CORONA since it is explicitly related to the Corona virus, while other hashtags like #China, #U.S. or #Italy are classified as NON-CORONA since they are not directly related to the Corona virus. Figure 2 shows the comparison between the occurrences of the two classes (CORONA and NON-CORONA) in the tweets. It can be noted that 53.5% of the tweets (582.9K Tweets) contain CORONA hashtags and may or may not be discussing COVID, while 46.5% (506.2K Tweets) do not contain CORONA hashtags but might be discussing COVID in general. It should be noted that, since some tweets contain hashtags from both classes, such tweets are counted once in each class, so the per-class totals do not reflect the number of unique tweets. This explains why the numbers of tweets in the two classes do not add up to one million (the same applies to Figure 4).
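
The classification rule and the counting convention described above can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, and treating the match as a case-insensitive substring test is an assumption based on the phrase "matches or partially contains." The last call checks that the reported shares are consistent with dividing each class count by the sum of the two class counts (582.9K + 506.2K) rather than by one million.

```python
def classify_hashtag(tag):
    """Return "CORONA" if the hashtag contains "covid" or "corona".

    Assumption: the partial match is a case-insensitive substring test.
    """
    lowered = tag.lstrip("#").lower()
    if "covid" in lowered or "corona" in lowered:
        return "CORONA"
    return "NON-CORONA"


def class_counts(tweets_hashtags):
    """Count tweets per class; a tweet is counted once in every class
    that at least one of its hashtags belongs to."""
    counts = {"CORONA": 0, "NON-CORONA": 0}
    for hashtags in tweets_hashtags:
        for cls in {classify_hashtag(tag) for tag in hashtags}:
            counts[cls] += 1
    return counts


def shares(counts):
    """Percentage of each class relative to the sum of class counts."""
    total = sum(counts.values())
    return {cls: round(100 * n / total, 1) for cls, n in counts.items()}


# Toy data: the first tweet lands in both classes, the last in neither.
toy = [["#covid19", "#China"], ["#Italy"], []]
print(class_counts(toy))

# Class counts reported in the text, in thousands of tweets.
print(shares({"CORONA": 582.9, "NON-CORONA": 506.2}))
```

Because a tweet with mixed hashtags is counted twice and a tweet without hashtags is not counted at all, the class totals are tallies of class membership rather than a partition of the unique tweets.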

Fig. 2. CORONA vs NON-CORONA Related Hashtags.

Fig. 4. Tweets, Interactions & Reach Counts of COVID Related & Non-Related Hashtags.

Figure 3 displays the top used hashtags from the 109.3K ones, sorted by the total count of occurrences in all tweets. The TreeMap visualization chart displays three dimensions: the position (from left to right), the box size (bigger to smaller), and the color opacity (100% to 1%). All dimensions are based on the total number of occurrences of each hashtag in the entire tweets dataset. It is worth mentioning that the displayed hashtags appear in different dialogs (spelling and case variants) and formats. For example, Covid19, COVID19, and covid19 were counted as separate hashtags in order to measure the different usage for further text analysis. Moreover, other medical terms not included in our search, such as Sars-cov2, may also be relevant depending on the context and type of needed analysis [17].
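
As a small illustration of the counting convention described above, the following Python sketch tallies hashtag occurrences case-sensitively, so that Covid19, COVID19 and covid19 remain separate entries, and then shows how the variants could be merged for a case-insensitive view. The function names and the toy data are hypothetical; the merge step is an optional extra and is not described in the text.

```python
from collections import Counter


def hashtag_occurrences(tweets_hashtags):
    """Count every hashtag occurrence exactly as written (case-sensitive)."""
    return Counter(tag for tags in tweets_hashtags for tag in tags)


def merge_case_variants(occurrences):
    """Fold case variants of the same hashtag into one lowercase key."""
    merged = Counter()
    for tag, count in occurrences.items():
        merged[tag.lower()] += count
    return merged


toy = [["Covid19", "China"], ["COVID19"], ["covid19", "Covid19"]]
occurrences = hashtag_occurrences(toy)

# Sorted by total occurrences, as in a treemap ordered by count.
print(occurrences.most_common())
print(merge_case_variants(occurrences).most_common())
```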

Fig. 3. Top Used Hashtags in Different Dialogs.

Figure 4 shows the total Interactions and Reach counts of each class of hashtags (CORONA and NON-CORONA) using a stacked column chart. It is interesting to note the number of Interactions and the Reach level attained by the COVID hashtags from just a small set of users compared to the actual size of Twitter. The number of Interactions reflects the total Interactions (i.e., retweets and favorites) of all the unique tweets where the classified hashtags were used. The total Reach displays the possible Reach counts of the mentioned unique tweets based on their users’ followers count. Again, the Reach and Interactions of the two classes do not sum to the total Reach and Interactions specified in the header. We notice that the total number of Reach counts of the two classes is 36.6B out of 36.7B (a difference of 86,000,000 possible Reach), which indicates that more than 80% of the users received the Unique Tweets.
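
The two metrics used above can be written down directly from their definitions: Interactions is the sum of retweets and favorites of a unique tweet, and Reach is the follower count of the user who initiated it. The Python sketch below aggregates both per hashtag class. The Tweet record, its field names, and the toy numbers are assumptions made for illustration; they are not the schema of the study's dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tweet:
    retweets: int
    favorites: int
    followers: int  # follower count of the user who initiated the tweet
    hashtags: List[str] = field(default_factory=list)


def interactions(tweet):
    """Interactions = retweets + favorites of a unique tweet."""
    return tweet.retweets + tweet.favorites


def reach(tweet):
    """Reach = number of followers who may potentially see the tweet."""
    return tweet.followers


def hashtag_classes(tweet):
    """Set of classes (CORONA, NON-CORONA) present in a tweet's hashtags."""
    classes = set()
    for tag in tweet.hashtags:
        lowered = tag.lower()
        if "covid" in lowered or "corona" in lowered:
            classes.add("CORONA")
        else:
            classes.add("NON-CORONA")
    return classes


def totals_by_class(tweets):
    """Sum Tweet, Interactions and Reach counts for each hashtag class."""
    totals = {cls: {"tweets": 0, "interactions": 0, "reach": 0}
              for cls in ("CORONA", "NON-CORONA")}
    for tweet in tweets:
        for cls in hashtag_classes(tweet):
            totals[cls]["tweets"] += 1
            totals[cls]["interactions"] += interactions(tweet)
            totals[cls]["reach"] += reach(tweet)
    return totals


sample = [
    Tweet(retweets=10, favorites=5, followers=1000,
          hashtags=["#COVID19", "#Italy"]),   # counted in both classes
    Tweet(retweets=2, favorites=1, followers=200, hashtags=["#corona"]),
    Tweet(retweets=0, favorites=3, followers=50),  # counted in neither
]
per_class = totals_by_class(sample)
overall_reach = sum(reach(t) for t in sample)
print(per_class)
print(overall_reach)
```

In the toy data the per-class Reach values add up to more than the overall Reach because one tweet belongs to both classes, while the tweet without hashtags contributes to neither; either effect is enough to make the class sums differ from the dataset totals.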

Furthermore, a lexicon-based classification of the contexts was performed in order to understand the meaning of the tweets. The lexicon is built from a COVID related dictionary for identifying the tweets diverting from the context to different topics. Figure 5 shows that 16.1% of the tweets (i.e., 160.1K Unique Tweets) were not related to the COVID situation at all, while 83.9% (839.2K Unique Tweets) were related based on their content. Some of the non-related ones were using the trending hashtags to advertise products and other topics, while others were malicious and intended to divert the trend to different subjects. Figure 6 shows the total Interactions and Reach counts of the tweets in each classified category. In addition to the details mentioned in the description of Figure 5, it is important to highlight the large effect of the 16.1% of tweets in terms of Interactions and Reach counts, which recorded around 2M and 5B respectively. It is very important to mention that those counts are subject to increase with time, hence enlarging the misleading ratios. In this context, additional research needs to take place in order to identify the final destination of these tweets and to take the needed actions for immediate remediation.

Fig. 5. Tweets Within and Diverting Out of COVID Context.

Fig. 6. Tweets, Interactions and Reach Counts Within and Diverting Out of COVID Context.

IV. Impact Analysis of COVID-19 Related Tweets Initiated Per User Occupation/Specialty

Additional experiments were performed on the 83.9% of COVID related tweets in order to identify the tweeters initiating the unique tweets with COVID-19 context. The results of the lexicon-based classification allowed us to study the profiles of the 288K tweeters and identify 510 occupations belonging to the COVID tweet initiators. In this regard, we extracted very important insights about the credibility of tweet initiators who might be eligible for broadcasting relevant messages in such a critical period.

Among the 83.9% of tweets, we first filtered the 839.2K Unique Tweets into Medical Profile and Non-Medical Profile categories based on the biographic information of each tweeter having at least one COVID context related tweet. Figure 7 shows the participation of users with medical backgrounds in the overall conversations in order to measure their effect based on their corresponding Interactions and Reach counts. It is clear that only 29.1K Tweets (i.e., 3.5% of the COVID related tweets) were initiated by tweeters that have medical profiles, while the other 96.4% of the tweets were initiated by tweeters that do not have medical profiles or expertise. Likewise, Figure 8 measures the Interactions and Reach counts for the tweeters having virus specialty backgrounds. It shows that only 2.8% of the COVID related tweets were initiated by specialists, while the remaining 97.2% were initiated by tweeters with other profiles. Usually, a specialty profile could be inherited from a medical profile, but not the opposite. We can deduce from Figures 7 and 8 that the total Interactions and Reach counts of tweets initiated by non-specialist tweeters are around 18M and 31B respectively, which are 38.6 and 303 times higher, respectively, than those of the tweets initiated by specialists in the field. This might be very critical since it reflects the extent of unintentionally or intentionally misleading content, which may lead to the spread of unverified and non-credible medical information and guidelines for defeating COVID-19.

Fig. 7. Interactions and Reach Counts of the 3.5% COVID Tweets Initiated by Medical Experts.

Fig. 8. Interactions and Reach Counts of the 2.8% COVID Tweets Initiated by Virus Specialists.

The above implications should neither overshadow nor dominate the need for credible professional tweeters who should contribute to the information that will raise awareness and defeat the virus. Governors, mayors, editors, writers and journalists are obvious examples of tweeters with occupations other than medical ones who should be encouraged to interact and engage in such critical times. The list of credible tweeters could be expanded to include public figures such as actors and artists. Figure 9 presents three wordles (word clouds) that rearrange these occupations into a visual pattern broken down per Tweet, Reach and Interactions counts, where the font size of each occupation reflects its frequency. Figure 10 shows the top 18 occupations of the COVID tweeters initiating related unique tweets, broken down per Tweet, Interactions and Reach. The main objective is to assess each group of tweeters and study their impact and influence rate in terms of Interactions and Reach. Clearly, both figures illustrate visually and numerically that the correlation between the Tweet, Interactions and Reach counts is not linear. In other words, the total Reach count of tweets initiated by the group of tweeters having Arts profiles and backgrounds is much higher than the total Reach of tweets initiated by the group of users having Doctor profiles and backgrounds, regardless of the number of unique tweets initiated by both groups. Furthermore, the correlation between the Tweet and Interactions counts is also not linear, but it is logical. For instance, relevant occupations such as writers and journalists achieve high Interactions levels, while non-related ones such as engineers and retirees get low counts.
Moreover, numerical results illustrate that context-related occupations such as doctors, writers, reporters, journalists, editors and governors do not even constitute 1% of the total Reach counts, i.e., a total of around 300M out of 30B Reach counts. To further highlight the problem, the tweeters behind this 1% are supposed to be the only ones allowed to interact with people during such a critical situation. In this regard, two main implications can be drawn from the presented results. First, accurate techniques are needed in order to verify the authenticity of the reported occupations based on historical and real-time means. Second, detection approaches need to be elaborated for identifying influencers relevant within specific contexts and situations.
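
The per-occupation breakdown discussed above amounts to a group-by over the enriched dataset. The following Python sketch shows one way it could be computed; the record layout, function names and toy numbers are hypothetical, and counting a tweet once for every occupation detected in its author's biography is an assumption, since the text does not state how multi-occupation users were handled.

```python
from collections import defaultdict


def aggregate_by_occupation(records):
    """Sum Tweet, Interactions and Reach counts per occupation group.

    Each record describes one unique tweet and carries the occupations
    detected in its author's biography (hypothetical layout).
    """
    totals = defaultdict(lambda: {"tweets": 0, "interactions": 0, "reach": 0})
    for record in records:
        for occupation in set(record["occupations"]):
            group = totals[occupation]
            group["tweets"] += 1
            group["interactions"] += record["retweets"] + record["favorites"]
            group["reach"] += record["followers"]
    return dict(totals)


def reach_share(totals, occupations, total_reach):
    """Percentage of total_reach contributed by the listed occupations.

    Note: a tweet whose author matches several listed occupations is
    counted once per occupation, so this is an upper bound.
    """
    covered = sum(totals[o]["reach"] for o in occupations if o in totals)
    return 100 * covered / total_reach


records = [
    {"occupations": ["artist"], "retweets": 50, "favorites": 100,
     "followers": 1_000_000},
    {"occupations": ["doctor"], "retweets": 5, "favorites": 10,
     "followers": 2_000},
    {"occupations": ["doctor", "writer"], "retweets": 1, "favorites": 2,
     "followers": 500},
]
totals = aggregate_by_occupation(records)
total_reach = sum(r["followers"] for r in records)
print(totals)
print(reach_share(totals, ["doctor"], total_reach))
```

In the toy data the doctor group initiates two of the three tweets yet accounts for well under 1% of the Reach, which mirrors the non-linear relation between Tweet and Reach counts noted in the text.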

Fig. 9. COVID Tweeters’ Occupations/Specialities (Font Size Reflects the Count Value).

Fig. 10. COVID Tweet, Interactions and Reach Counts by Different Occupations/Specialities.

V. Implications & Future Research Directions

In this section, we provide various computing and non-computing implications, recommendations, limitations of our approach and future research directions in relation to the aforementioned raised problems as inferred empirically and quantitatively:

  • An immediate ban should be placed on all the users, posts and tweets exploiting the COVID-19 context in order to mislead users and disseminate fake news. In this regard, various researchers tackled the detection of spam and misleading information in social networks based on users’ meta-data, texts and contexts [18]–[33]. However, these approaches did not consider critical and crisis times, where high accuracy and time efficiency have a major impact on overall solutions. According to [9], major social network platforms have confirmed that applying current AI techniques without human intervention may lead to unfairness by wrongly banning valid accounts and interactions. Consequently, additional research efforts have to investigate efficient and accurate human-less techniques and methodologies for better understanding the origin of misinformation while identifying both disruptive contexts and users.

  • Although some information broadcasts are not initiated by medical experts or officials, they may at times be essential and useful. Accordingly, allowing only communications by specific categories may be counterproductive as it could block legitimate and helpful information. In this regard, several approaches have addressed reputation and credibility based on user-centered and content-based analysis [34]–[46]. However, to the best of our knowledge, none of these approaches has classified and managed posts and accounts based on their verified roles, occupations and specialities. Consequently, mechanisms should be proposed in order to efficiently and accurately allow postings based on the aforementioned criteria, while at the same time considering credibility, historical engagement, insights and influence rate in related contexts and events. Moreover, there is a need at this time to develop systems that have efficient and highly accurate trust and credibility preserving models to be opportunistically adopted during crisis periods.

  • Results show that the Reach level of professional COVID-19 context-relevant roles and occupations (e.g., doctors, editors, governors) is very low (i.e., only 1% of total Reach). Accordingly, extensive effort should be devoted to elaborating methodologies and recommendation systems for efficiently recognizing credible and convincing influencers in specific events/locations/communities (e.g., based on profile, insights, historical engagement) in order to spread the relevant and cited information provided by trusted scientists and experts at large scale, in the right place and to the right people. In this context, researchers may benefit from the rich literature that targets identifying influencers based on selected events in order to build relevant approaches [47]–[54].

  • The current infodemic sheds light on the urgent need to elaborate methodologies and techniques to be embedded in social network platforms for systematically adopting emergency and crisis mode management strategies and responding to the dangers of the situation. This also includes developing codes of conduct, standards and regulations to abide by during crisis periods, which may differ from the policies applied in regular times. Although a few approaches studied the role and reaction of social network platforms in response to previous natural disasters [55], [56], the research field still lacks solid and sustainable methodologies to deal with epidemic and pandemic contexts before, during and after a crisis.

  • The infodemic has made it difficult for people to find reliable sources of information. Accordingly, the UN is stepping up its communications efforts through global cooperation and viral acts of humanity. Although some are promoting the Chinese model of censored contagion, the solution is for health authorities, governments and social network stakeholders to formulate regular responses to the infodemic, using a strategy of active engagement and communication with those who are spreading inaccurate stories in order to gain a deeper understanding of how an infodemic spreads. Governments should set up official units mandated to combat the spread of inaccurate and unsubstantiated news. For example, the U.K. established a rapid response unit within the Cabinet Office. The Unit will work with social media firms in order to filter fake news and harmful content.

  • The most powerful solution to tackle this, or any future infodemic, lies with the consumers themselves. Taking personal responsibility for the role that each person plays when they receive, read, edit, comment on and then forward a piece of information that originates on a social media platform is, arguably, the most impactful intervention to debunk the myths and falsehoods that are generated on an hourly basis. Targeted campaigns must be launched to educate anyone whose date of birth precedes the year 2000 on the social responsibility that they bear whenever they partake in perpetuating stories on Twitter or any other platform.

Finally, we outline below the limitations of our approach, which relate mainly to the adopted methodology and assumptions and may also be considered as future directions for further research and studies:

  • Having to manually build most of the COVID dictionaries and lexicons may introduce some assumptions that are not supported by enough experimental analysis. Accordingly, more thorough analysis and effort are still needed to elaborate comprehensive lexicons and ontologies related to COVID. Moreover, using only ready-made NLP models such as the Google Cloud or Amazon AWS NLP modules is not enough when assessing small paragraphs and sentences like tweets. Furthermore, extracting entities by combining all tweets together may also affect the results in our case, since we are dealing with unique profiles and single classification of tweets. Accordingly, a combination of ontology-based analysis and user clustering techniques to identify trustable users is needed to reach highly accurate solutions, as indicated in [57].

  • Extracting user occupations from their unverified profiles may affect the analysis results in case of false claims. In our approach, classifying users or profiles as doctors, scientists or artists is based on their claimed biography information, without additional review to confirm the real occupation. For instance, users might be wrongly claiming to be doctors even if they are not related to the field at all.

  • Assuming that all non-medical profiles are potential sources of misinformation may not be valid in many situations. However, we did benefit from this assumption and limitation to elaborate on important future directions for building efficient and accurate methodologies for identifying relevant influencers in specific fields to help disseminate the needed information. Moreover, a deeper analysis of medical profiles is needed to distinguish between specialties. A deeper analysis of the meaning and sentiment of the COVID-related tweet contexts is also needed to infer relevant results about the directions and objectives of the tweets. In this regard, relevant contexts, lexicons and ontologies need to be built and verified.
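
The occupation-extraction limitation above can be made concrete with a minimal sketch. It assumes a small, hypothetical keyword lexicon (`OCCUPATION_KEYWORDS`) and a hypothetical helper (`claimed_occupations`); neither reproduces the dictionaries actually built for the study. The point is only that matching on self-declared biography text cannot tell a real occupation from a false claim.

```python
import re

# Hypothetical lexicon for illustration; the study's own dictionaries
# are not reproduced here.
OCCUPATION_KEYWORDS = {
    "doctor": {"doctor", "physician", "surgeon"},
    "scientist": {"scientist", "researcher"},
    "artist": {"artist", "musician", "actor"},
}


def claimed_occupations(bio):
    """Return the set of occupations whose keywords appear as whole words in a bio."""
    words = set(re.findall(r"[a-z]+", bio.lower()))
    return {
        occupation
        for occupation, keywords in OCCUPATION_KEYWORDS.items()
        if words & keywords
    }


# A genuine and a false claim are indistinguishable to keyword matching.
print(claimed_occupations("ICU physician and dad"))
print(claimed_occupations("Self-proclaimed Doctor of vibes, musician"))
```

Both biographies are labelled "doctor", which is exactly the false-claim risk noted above: a verification step outside the profile text would be needed to separate them.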

VI. Literature Review

In this section, we provide a literature review in relation to the aforementioned implications and proposed research directions, and which may form a solid ground for potential solutions.

A. Spam and Misleading Posts Detection

Detecting spammers on social networks most often relies on analyzing the content of messages [9], [29], [31]–[33], [57]. However, most of the approaches extend their techniques by exploiting users’ profiles and their relations [30]. Sedhai and Sun [18] proposed a semi-supervised technique for spam detection, in which multiple detectors investigate tweets’ contents to classify maliciousness. Similarly, Alghamdi et al. [19] exploited a set of OSN object and URL features for the same purpose. Such features include information related to the user’s profile, and URL-related features including hosts and domains. Similarly to the previous approach, Lee and Kim [26] deployed a real-time malicious URL detector by exploiting URL redundancy driven by the limitation posed on the attackers’ resources. Guille and Favre [25] proposed another approach that takes advantage of the URLs used by the users in their tweets to spot malicious intents. A multi-feature analysis covering unique mentions, trends, hyperlinks, and tweet ratio has been employed by Amleshwaram et al. [20] to distinguish spam accounts in a supervised manner. Moreover, Benevenuto et al. [21] aimed to classify users as promoters, spammers or legitimate users based on their videos. By manually selecting different users and learning their behaviors, the authors were able to employ a supervised machine learning technique capable of classifying malicious users with a relatively small margin of error. Chen et al. [22] described the spamming strategies observed in more than 570 million tweets. Shen and Liu [28] deployed another approach that depends on the tweets’ contents to extract users’ behaviors and supply them to a supervised classifier. However, supervised and semi-supervised techniques cannot classify data by discovering features on their own, and therefore require manual classification in the initial stages. Such involvement requires human intervention, which is by nature prone to errors and thus reduces the accuracy of the results. In the case of online social networks, classifying diverse and large amounts of data has proven difficult.

B. User-Centered & Content-Based Reputation & Credibility Analysis

Beyond the work on detecting spammers in social networks, other approaches took advantage of the abundance of information to rank users based on their influence rate. Such techniques stem from the need to rank the relevance of the users and their tweets, and two main categories of solutions exist to address this issue. The first set of approaches focused on the content to assign reputation using machine learning techniques [47], [58], [59], while the second set relied on users and their relations, modeled as nodes in a graph [35], [48], [58]. Moreover, there are other solutions that depend on both methods to achieve better accuracy. In the following, we overview the main approaches belonging to these categories.

Jain et al. [34] took advantage of the capabilities of graph theory and related algorithms to calculate a score for each user based on their centralities. Such scores are later used to identify universal opinion leaders. Riyanto and Jonathan [35] provided an in-depth analysis of how social distancing and environment can affect trust and trustworthiness between users. Mohammadinejad et al. [38] presented a framework that takes advantage of the consensus opinion within social network relations to infer scores such as the user’s personality in order to derive the most influential users in the network. Zhang et al. [41] benefited from the relations expressed through social network messages and contact frequency to learn the user’s behavior, thus providing a credibility score that describes the risk levels of users’ interactive messages. Wang and Chen [39] provided an empirical analysis of information credibility and a credibility assessment framework. They also emphasized the value of users’ credibility in relation to the credibility of the information. Tsikerdekis and Zeadally [40] drew attention to recent adversarial behaviors on social networks, including identity deception and multiple account creation, and employed a behavioral framework to detect such actions.

Ahmad and Rizvi [46] presented a survey on different approaches used for the detection of rumors on social networks. Curiskis et al. [42] provided a comparison of different document clustering techniques that are mostly used on OSNs and supplied with multiple features. Moreover, they also provided several evaluation measures to assess their accuracy. Jardaneh et al. [43] focused on content in other languages, in this case Arabic, in order to produce a framework that is able to distinguish fake news by allotting a score to each piece of content through sentiment analysis with the help of different classification algorithms. Alrubaian et al. [44] proposed a system with multiple components that work in conjunction to deduce the credibility of users and their related tweets in order to restrain the spread of fake and malicious news.

C. Influence Ranking in Social Networks

Users’ influence rating and ranking have become one of the most important topics when analyzing social networks, especially in microblogs like Twitter. The authors in [52], [53], [59]–[62] showed that user metadata such as follower count, tweet count and following count, together with tweet metadata such as retweet count and favorite count, are enough to calculate the user influence ratio. On the other hand, the authors in [63] analyzed the relationships between users in order to rank them by their influence relationships. Reference [64] analyzed the user’s social activity during a specific event. Anger and Kittl [65] determined a grounded approach to measure the individual’s influence or potential social networking ratio (SNP) using users’ and tweets’ metadata to find the top 10 Twitter users in Austria. Bakshy et al. [66] calculated the user influence rate per event using diffusion trees and cascading methods by selecting only events that have URLs. Then, they applied diffusion algorithms on the shared URLs to measure the reach of the initial tweets. Anjaria and Guddeti [67] used NLTK sentiment analysis and incremental learning algorithms to predict the presidential elections in the U.S. Moreover, Schenk and Sicker [68] categorized influencers into four influence groups using a bagging classification algorithm by studying users’ static and dynamic influence features and comparing them over time.
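
As a rough illustration of how an influence ratio can be derived from user and tweet metadata alone, the sketch below combines the counts named above into a single number. The formula, the `UserStats` container and the `influence_score` function are assumptions made for this example; they do not reproduce the measure of any cited work.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    followers: int   # accounts following the user
    following: int   # accounts the user follows
    tweets: int      # tweets authored by the user
    retweets: int    # total retweets received on those tweets
    favorites: int   # total favorites received on those tweets


def influence_score(user):
    """Illustrative score: audience ratio multiplied by engagement per tweet."""
    # Followers per followed account; +1 avoids division by zero.
    audience_ratio = user.followers / (user.following + 1)
    # Average retweets plus favorites per tweet; max() guards empty timelines.
    engagement = (user.retweets + user.favorites) / max(user.tweets, 1)
    return audience_ratio * engagement


print(influence_score(UserStats(1000, 99, 50, 400, 100)))  # 10.0 * 10.0 = 100.0
```

A multiplicative form is used here only so that an account with a large audience but no engagement scores zero; a weighted sum would be an equally plausible choice.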

In [69], Mei et al. applied an entropy weighting algorithm based on eight data points per user to find their influence ratios. They added the features of new followers and new mentions to measure users’ popularity ratios in order to sort a list of the top hundred users in Australia by their influence rates. Riquelme et al. [47] proposed two linear-threshold-centrality-based approaches to measure the rank of the users and the propagation rate of their contents in the network. Similarly, Li et al. [48] presented an eigenvector-centrality-based approach to measure the influence rate. Lahuerta-Otero and Cordero-Gutiérrez [49] presented a brief analysis of the behavior of a special kind of Twitter users, and evaluated their influence ratio through different data mining techniques. Through their analysis, they were able to spot different techniques to increase users’ influence. Sharma et al. [50] proposed a novel approach to elect influential users by calculating the influence rate through their tweet and trend scores. Huynh et al. [54] focused on the relation between the tags used in the tweets to calculate the influence rate and the speed of their propagation.
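
To make the idea of centrality-based influence concrete, the following is a minimal sketch of plain eigenvector centrality computed by power iteration on an undirected graph. It is the textbook measure, not the conductance model of [48] or the linear threshold model of [47]; the function name and the toy edge list are illustrative assumptions.

```python
def eigenvector_centrality(edges, max_iter=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected graph given as (u, v) pairs.

    Returns a dict mapping each node to a score; scores sum to 1.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    if not neighbors:
        return {}

    score = {node: 1.0 / len(neighbors) for node in neighbors}
    for _ in range(max_iter):
        # Keeping the node's own score (the "+ I" shift) avoids oscillation
        # on bipartite graphs without changing the leading eigenvector.
        new = {
            node: score[node] + sum(score[m] for m in nbrs)
            for node, nbrs in neighbors.items()
        }
        total = sum(new.values())
        new = {node: value / total for node, value in new.items()}
        if max(abs(new[node] - score[node]) for node in new) < tol:
            return new
        score = new
    return score


# Toy graph: a triangle a-b-c with one extra account d connected only to a.
scores = eigenvector_centrality([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
print(sorted(scores, key=scores.get, reverse=True))
```

In the toy graph, `a` ranks highest because it is connected to every other node, while `d` ranks lowest because its only link is to `a`.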

VII. Conclusion

This article investigated the negative impact of the COVID-19 infodemic on the major efforts to defeat the pandemic through a novel large-scale Twitter-based study, which provided a quantitative assessment using real-life experiments reflecting actual environments. The empirical analysis of 1 million COVID-19-related tweets belonging to 288K unique users illustrated the severe impact of misleading people and spreading unreliable information. Inferred insights showed that (1) the potential reachability of the 16.1% non-relevant tweets, which might or might not have misled users by redirecting them to out-of-scope and/or malicious content, is 5.6 billion, and (2) a minimum of 93.7% of the remaining 83.9% within-context tweets (i.e., with around 17M Interactions and 30B Reach counts) were initiated by users without reliable medical and/or relevant specialty profiles, and consequently might be disseminating misleading, non-credible medical information. Moreover, different insights highlighted the low reachability (i.e., 1% of the total Reach counts, which is equivalent to 300M out of 30B) of the unique users with key context-relevant specialties and occupations such as doctor, writer, reporter, journalist, editor and governor. As previously discussed, the number of tweets initiated by users claiming to be “Doctors” with credible medical profiles is very low. On the other hand, those tweets have the highest interaction rate among all tweets but, at the same time, a very low level of reach or impact compared to other occupational groups such as “Arts” or “Journalists.” This reflects their influence rates and reputation levels, which were highlighted as issues to be addressed in future directions. The results shed light on the importance of identifying non-medical key influencers to assist in spreading legitimate information relevant in such situations. Finally, this article elaborated on a few computing and non-computing implications as well as future research directions to highlight the potential solutions and future work in such a promising field.
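
The headline percentages above can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in this conclusion; the variable names are illustrative, and the combined share on the last line is a derived value that is not reported in the study itself.

```python
# Percentages are stored in tenths of a percent to keep the checks in integers.
OFF_TOPIC_TENTHS = 161         # 16.1% of tweets not relevant to COVID-19
IN_CONTEXT_TENTHS = 839        # 83.9% of tweets within context
NON_SPECIALIST_TENTHS = 937    # at least 93.7% of the in-context tweets
TOTAL_REACH = 30_000_000_000   # about 30B Reach for in-context tweets

# The two tweet categories partition the data set.
assert OFF_TOPIC_TENTHS + IN_CONTEXT_TENTHS == 1000

# 1% of the total Reach, attributed to context-relevant occupations.
specialist_reach = TOTAL_REACH // 100

# Derived: share of ALL tweets that are in context and come from users
# without a relevant specialty profile, in percent.
non_specialist_share = NON_SPECIALIST_TENTHS * IN_CONTEXT_TENTHS / 10_000

print(specialist_reach)                # 300000000
print(round(non_specialist_share, 1))  # 78.6
```

Under these figures, roughly 78.6% of all collected tweets are both within context and initiated by users without a relevant specialty profile.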

Biographies

Azzam Mourad (Senior Member, IEEE) received the M.Sc. degree in CS from Laval University, Canada, in 2003, and the Ph.D. degree in ECE from Concordia University, Canada, in 2008. He is currently an Associate Professor of computer science with the Lebanese American University and an Affiliate Associate Professor with the Software Engineering and IT Department, Ecole de Technologie Superieure, Montreal, Canada. He has published more than 100 papers in international journals and conferences on Security, Network and Service Optimization and Management targeting IoT, Cloud/Fog/Edge Computing, Vehicular and Mobile Networks, and Federated Learning. He has served/serves as an Associate Editor for IEEE Transactions on Network and Service Management, IEEE Network, IEEE Open Journal of the Communications Society, IET Quantum Communication, and IEEE Communications Letters, the General Chair of IWCMC2020, the General Co-Chair of WiMob2016, and the track chair, a TPC member, and a reviewer for several prestigious journals and conferences.

Ali Srour is currently pursuing the master’s degree in computer science with the Lebanese American University. He is a Data Scientist and an AI Consultant. His research interests are social network analysis, data science, and artificial intelligence innovations.

Haidar Harmanani (Senior Member, IEEE) received the B.S., M.S., and Ph.D. degrees in computer engineering from the Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA, in 1989, 1991, and 1994, respectively. He is currently a Professor of Computer Science with the Lebanese American University. He serves on the steering committee of the IEEE NEWCAS conference and the IEEE ICECS conference. He has also served on the program committees of various international conferences. His research interests include electronic design automation, high-level synthesis, design for testability, and parallel programming. He is a Senior Member of ACM.

Cathia Jenainati is currently a Professor of English Literature and the Dean of the School of Arts and Sciences, Lebanese American University. She has previously served as the Founding Head of the School for Cross-Faculty Studies, Warwick University, U.K. Her research focuses on women’s activism, oral history, the global sustainable development agenda, and the history of education missions in the Middle East. In addition, she has been recognized as a leader in innovative pedagogies, especially around liberal education, and has received several research grants in the field. She serves as the Associate Editor for the Journal of Coaching Practice, the Chair of the International Advisory Board of Amsterdam University College, an Educational Coach of the Growth Coaching International (Australia–U.K.–USA), and a Founding Fellow of the Warwick Higher Education Academy, U.K.

Mohamad Arafeh received the M.Sc. degree in business computer and information systems from the Lebanese University. He is currently a Research Assistant with the Lebanese American University. His research interests are crowdsensing, social network analysis, and blockchain technology.

Funding Statement

This work was supported by the Lebanese American University.

References

  • [1] Clement J. Number of Social Network Users Worldwide From 2010 to 2023. Accessed: Jul. 15, 2020. [Online]. Available: https://www.statista.com/statistics/278414/number-of-worldwide-socialnetwork-users/
  • [2] Elisa Shearer K. M. (2018). News Use Across Social Media Platforms. [Online]. Available: https://www.journalism.org/2018/09/10/news-use-across-social-media-platforms-2018/
  • [3] (2019). News Consumption in the U.K.: 2019. [Online]. Available: https://www.ofcom.org.uk/
  • [4] Lederer E. U.N. Chief Antonio Guterres: Misinformation About COVID-19 is the New Enemy. Accessed: Mar. 27, 2020. [Online]. Available: https://time.com/5811939/un-chief-coronavirus-misinformation/
  • [5] Wemer D. A. (2020). Addressing the Coronavirus Infodemic. [Online]. Available: https://atlanticcouncil.org/blogs/new-atlanticist/addressing-the-coronavirus-infodemic/
  • [6] Schaake M. (2020). Coronavirus Shows Big Tech Can Fight—Infodemic of Fake News. [Online]. Available: https://www.ft.com/content/b2e2010e-6cf8-11ea-89df-41bea055720b
  • [7] Skopeliti C. (2020). Coronavirus: How Are the Social Media Platforms Responding to the Infodemic? [Online]. Available: https://firstdraftnews.org/latest/how-social-media-platforms-are-responding-tothe-coronavirus-infodemic/
  • [8] Raina N. L. and Merchant M. (2020). Social Media and Emergency Preparedness in Response to Novel Coronavirus. [Online]. Available: https://jamanetwork.com/journals/jama/fullarticle/2763596
  • [9] Macaulay T. (2020). Social Media Firms Will Use More AI to Combat Coronavirus Misinformation, Even if it Makes More Mistakes. [Online]. Available: https://thenextweb.com/neural/2020/03/17/social-media-firms-will-use-more-ai-to-combat-coronavirus-misinformation-even-if-it-makes-more-mistakes/
  • [10] Frenkel D. A. S. and Zhong R. (2020). Surge of Virus Misinformation Stumps Facebook and Twitter. [Online]. Available: https://www.nytimes.com/2020/03/08/technology/coronavirus-misinformation-socialmedia.html
  • [11] Savov V. (2020). COVID-19: Twitter Escalates Moderation of Misleading Content Around Virus. [Online]. Available: https://www.thestar.com.my/tech/tech-news/2020/03/19/covid-19-Twitter-escalates-moderation-of-misleading-content-around-virus
  • [12] Hatmaker T. (2020). Twitter Broadly Bans Any COVID19 Tweets That Could Help the Virus Spread. [Online]. Available: https://techcrunch.com/2020/03/18/Twitter-coronavirus-covid-19-misinformation-policy/
  • [13] Budryk Z. (2020). Conspiracy Theorists Who Claim 5G Linked to Coronavirus Believed to Burn Cell Towers in Europe. [Online]. Available: https://thehill.com/policy/technology/493927-arsonistsfalsely-linking-5g-to-coronavirus-burn-cell-towers-in-europe
  • [14] Li L. et al., “Characterizing the propagation of situational information in social media during COVID-19 epidemic: A case study on weibo,” IEEE Trans. Comput. Social Syst., vol. 7, no. 2, pp. 556–562, Apr. 2020.
  • [15] Tweepy. Tweepy Api Reference. Accessed: 2020. [Online]. Available: http://docs.tweepy.org/en/latest/api.html
  • [16] Twitter. Twitter Api Reference. Accessed: 2020. [Online]. Available: https://developer.Twitter.com/en/docs/api-reference-index
  • [17] Chen E., Lerman K., and Ferrara E., “Tracking social media discourse about the COVID-19 pandemic: Development of a public coronavirus Twitter data set,” JMIR Public Health Surveillance, vol. 6, no. 2, May 2020, Art. no. e19273.
  • [18] Sedhai S. and Sun A., “Semi-supervised SPAM detection in Twitter stream,” IEEE Trans. Comput. Social Syst., vol. 5, no. 1, pp. 169–175, Dec. 2018.
  • [19] Alghamdi B., Watson J., and Xu Y., “Toward detecting malicious links in online social networks through user behavior,” in Proc. IEEE/WIC/ACM Int. Conf. Web Intell. Workshops (WIW), 2016, pp. 5–8.
  • [20] Amleshwaram A. A., Reddy N., Yadav S., Gu G., and Yang C., “CATs: Characterizing automation of Twitter spammers,” in Proc. 5th Int. Conf. Commun. Syst. Netw. (COMSNETS), 2013, pp. 1–10.
  • [21] Benevenuto F., Rodrigues T., Almeida J., Goncalves M., and Almeida V., “Detecting spammers and content promoters in online video social networks,” in Proc. IEEE INFOCOM Workshops, 2009, pp. 1–2.
  • [22] Chen C., Zhang J., Xiang Y., Zhou W., and Oliver J., “Spammers are becoming ‘smarter’ on Twitter,” IT Prof., vol. 18, no. 2, pp. 66–70, 2016.
  • [23] Chew C. and Eysenbach G., “Pandemics in the age of Twitter: Content analysis of tweets during the 2009 H1N1 outbreak,” in PLoS ONE, vol. 5, no. 11, 2009, Art. no. e14118.
  • [24] Fazil M. and Abulaish M., “Why a socialbot is effective in Twitter? A statistical insight,” in Proc. 9th Int. Conf. Commun. Syst. Netw. (COMSNETS), 2017, pp. 564–569.
  • [25] Guille A. and Favre C., “Mention-anomaly-based event detection and tracking in Twitter,” in Proc. IEEE/ACM Int. Conf. Adv. Soc. Netw. Anal. Min. (ASONAM), 2014, pp. 375–382.
  • [26] Lee S. and Kim J., “WarningBird: A near real-time detection system for suspicious urls in Twitter stream,” IEEE Trans. Depend. Secure Comput., vol. 10, no. 3, pp. 183–195, Sep. 2013.
  • [27] Martinez-Romo J. and Araujo L., “Detecting malicious tweets in trending topics using a statistical analysis of language,” Expert Syst. Appl., vol. 40, no. 8, pp. 2992–3000, 2013.
  • [28] Shen H. and Liu X., “Detecting spammers on Twitter based on content and social interaction,” in Proc. Int. Conf. Netw. Inf. Syst. Comput., 2015, pp. 413–417.
  • [29] Shirakawa M., Nakayama K., Hara T., and Nishio S., “Wikipedia-based semantic similarity measurements for noisy short texts using extended Naive Bayes,” IEEE Trans. Emerg. Topics Comput., vol. 3, no. 2, pp. 205–219, Mar. 2015.
  • [30] Shoaib M. and Farooq M., “USPAM—A user centric ontology driven spam detection system,” in Proc. 48th Hawaii Int. Conf. Syst. Sci., 2015, pp. 3661–3669.
  • [31] Sumner C., Byers A., Boochever R., and Park G. J., “Predicting dark triad personality traits from Twitter usage and a linguistic analysis of tweets,” in Proc. 11th Int. Conf. Mach. Learn. Appl., vol. 2, 2012, pp. 386–393.
  • [32] Chen C., Wang Y., Zhang J., Xiang Y., Zhou W., and Min G., “Statistical features-based real-time detection of drifted Twitter spam,” IEEE Trans. Inf. Forensics Security, vol. 12, no. 4, pp. 914–925, Apr. 2017.
  • [33] Meel P., Agrawal H., Agrawal M., and Goyal A., “Analysing tweets for text and image features to detect fake news using ensemble learning,” in Proc. Int. Conf. Intell. Comput. Smart Commun., 2019. [Online]. Available: https://doi.org/10.1007/978-981-15-0633-8_46
  • [34] Jain L., Katarya R., and Sachdeva S., “Opinion leader detection using whale optimization algorithm in online social network,” Expert Syst. Appl., vol. 142, Mar. 2020, Art. no. 113016.
  • [35] Riyanto Y. E. and Jonathan Y. X., “Directed trust and trustworthiness in a social network: An experimental investigation,” J. Econ. Behav. Org., vol. 151, pp. 234–253, Mar. 2018.
  • [36] Shariff S. M., Zhang X., and Sanderson M., “On the credibility perception of news on Twitter: Readers, topics and features,” Comput. Human Behav., vol. 75, pp. 785–796, Oct. 2017.
  • [37] Meo P. D., Fotia L., Messina F., Rosaci D., and Sarné G. M., “Providing recommendations in social networks by integrating local and global reputation,” Inf. Syst., vol. 78, pp. 58–67, Jul. 2018.
  • [38] Mohammadinejad A., Farahbakhsh R., and Crespi N., “Consensus opinion model in online social networks based on influential users,” IEEE Access, vol. 7, pp. 28436–28451, 2019.
  • [39] Wang D. and Chen Y., “A neural computing approach to the construction of information credibility assessments for online social networks,” Neural Comput. Appl., vol. 31, no. S1, pp. 259–275, Sep. 2018.
  • [40] Tsikerdekis M. and Zeadally S., “Multiple account identity deception detection in social media using nonverbal behavior,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 8, pp. 1311–1321, Jun. 2014.
  • [41] Zhang S., Cai Y., and Xia H., “A privacy-preserving interactive messaging scheme based on users credibility over online social networks,” in Proc. IEEE/CIC Int. Conf. Commun. China (ICCC), 2017, pp. 1–6.
  • [42] Curiskis S. A., Drake B., Osborn T. R., and Kennedy P. J., “An evaluation of document clustering and topic modelling in two online social networks: Twitter and reddit,” Inf. Process. Manag., vol. 57, no. 2, 2020, Art. no. 102034.
  • [43] Jardaneh G., Abdelhaq H., Buzz M., and Johnson D., “Classifying arabic tweets based on credibility using content and user features,” in Proc. IEEE Jordan Int. Joint Conf. Elect. Eng. Inf. Technol. (JEEIT), 2019, pp. 596–601.
  • [44] Alrubaian M., Al-Qurishi M., Hassan M. M., and Alamri A., “A credibility analysis system for assessing information on Twitter,” IEEE Trans. Depend. Secure Comput., vol. 15, no. 4, pp. 661–674, Aug. 2018.
  • [45] Shariff S. M., “A review on credibility perception of online information,” in Proc. 14th Int. Conf. Ubiquitous Inf. Manag. Commun. (IMCOM), 2020, pp. 1–7.
  • [46] Ahmad F. and Rizvi S. A. M., “Identification of credibility content measures for twitter and Sina-Weibo social networks,” in Proc. ICETIT, vol. 605, 2019. [Online]. Available: https://doi.org/10.1007/978-3-030-30577-2_32
  • [47] Riquelme F., Gonzalez-Cantergiani P., Molinero X., and Serna M., “Centrality measure in social networks based on linear threshold model,” Knowl. Based Syst., vol. 140, pp. 92–102, Jan. 2018.
  • [48] Li X., Liu Y., Jiang Y., and Liu X., “Identifying social influence in complex networks: A novel conductance eigenvector centrality model,” Neurocomputing, vol. 210, pp. 141–154, Oct. 2016.
  • [49] Lahuerta-Otero E. and Cordero-Gutiérrez R., “Looking for the perfect tweet. the use of data mining techniques to find influencers on Twitter,” Comput. Human Behav., vol. 64, pp. 575–583, Nov. 2016.
  • [50] Sharma P., Agarwal A., and Sardana N., “Extraction of influencers across Twitter using credibility and trend analysis,” in Proc. IEEE 11th Int. Conf. Contemp. Comput. (IC3), Aug. 2018, p. 124, doi: 10.1109/IC3.2018.8530462.
  • [51] Effing R., van Hillegersberg J., and Huibers T., “Social media indicator and local elections in the Netherlands: Towards a framework for evaluating the influence of Twitter, YouTube, and Facebook,” in Public Administration and Information Technology, vol. 15. Cham, Switzerland: Springer, 2016. [Online]. Available: https://doi.org/10.1007/978-3-319-17722-9_15
  • [52] Rios S. A., Aguilera F., Nuñez-Gonzalez J. D., and Graña M., “Semantically enhanced network analysis for influencer identification in online social networks,” Neurocomputing, vols. 326–327, pp. 71–81, Jan. 2019.
  • [53] Liu Y. and Cao J., “IiRank: A novel algorithm for identifying influencers in micro-blog social networks,” in Proc. IEEE Int. Conf. Data Min. Workshops (ICDMW), Nov. 2019, pp. 1–8.
  • [54] Huynh T., Zelinka I., Pham X. H., and Nguyen H. D., “Some measures to detect the influencer on social network based on information propagation,” in Proc. 9th Int. Conf. Web Intell. Min. Semantics (WIMS), 2019, pp. 1–6.
  • [55] Abbasi M.-A. and Liu H., “Measuring user credibility in social media,” in Social Computing, Behavioral-Cultural Modeling and Prediction (SBP) (Lecture Notes in Computer Science), vol. 7812. Berlin, Germany: Springer, 2013. [Online]. Available: https://doi.org/10.1007/978-3-642-37210-0_48
  • [56] Pandey R., Purohit H., Chan J. L., and Johri A., “AI for trustworthiness! Credible user identification on social Web for disaster response agencies,” 2018. [Online]. Available: http://arxiv.org/abs/1810.01013
  • [57] Halawi B., Mourad A., Otrok H., and Damiani E., “Few are as good as many: An ontology-based tweet spam detection approach,” IEEE Access, vol. 6, pp. 63890–63904, 2018.
  • [58] Gün A. and Karagoz P., “A hybrid approach for credibility detection in Twitter,” in Proc. Hybrid Artif. Intell. Syst. (HAIS), vol. 8480, 2014. [Online]. Available: https://doi.org/10.1007/978-3-319-07617-1_45
  • [59] Kwak H., Lee C., Park H., and Moon S., “What is Twitter, a social network or a news media?” in Proc. 19th Int. Conf. World Wide Web (WWW), vol. 19, Jan. 2010, pp. 591–600.
  • [60] Cha M., Haddadi H., Benevenuto F., and Gummadi K. P., “Measuring user influence in Twitter: The million follower fallacy,” in Proc. ICWSM, 2010, pp. 10–15.
  • [61] Luiten M., Kosters W. A., and Takes F. W., “Topical influence on Twitter: A feature construction approach,” M.S. thesis, Master Comput. Sci., Leiden Univ., Leiden, The Netherlands, 2012.
  • [62] Weng J., Lim E.-P., Jiang J., and Qi Z., “TwitterRank: Finding topic-sensitive influential Twitterers,” in Proc. 3rd ACM Int. Conf. Web Search Data Min., Jan. 2010, pp. 261–270.
  • [63] Huberman B., Romero D., and Wu F., “Social networks that matter: Twitter under the microscope,” First Monday, vol. 14, no. 1, pp. 1–5, Jan. 2009.
  • [64] Cappelletti R. and Sastry N. R., “IARank: Ranking users on Twitter in near real-time, based on their information amplification potential,” in Proc. Int. Conf. Soc. Informat., 2012, pp. 70–77.
  • [65] Anger I. and Kittl C., “Measuring influence on Twitter,” in Proc. I-KNOW, Sep. 2011, p. 31.
  • [66] Bakshy E., Hofman J., Mason W., and Watts D., “Everyone’s an influencer: Quantifying influence on Twitter,” in Proc. 4th ACM Int. Conf. Web Search Data Min. (WSDM), Jan. 2011, pp. 65–74.
  • [67] Anjaria M. and Guddeti R. R., “Influence factor based opinion mining of Twitter data using supervised learning,” in Proc. 6th Int. Conf. Commun. Syst. Netw. (COMSNETS), Jan. 2014, pp. 1–8.
  • [68] Schenk C. and Sicker D., “Finding event-specific influencers in dynamic social networks,” in Proc. IEEE 3rd Int. Conf. Privacy Security Risk Trust IEEE 3rd Int. Conf. Soc. Comput., Oct. 2011, pp. 501–504.
  • [69] Mei Y., Zhong Y., and Yang J., “Finding and analyzing principal features for measuring user influence on Twitter,” in Proc. IEEE 1st Int. Conf. Big Data Comput. Service Appl., Mar. 2015, pp. 478–486.

Articles from IEEE Transactions on Network and Service Management are provided here courtesy of Institute of Electrical and Electronics Engineers
