Table 9. Multimodal fake news datasets
Dataset | Year of release | Statistics | Domain | Contents | Labels | Collected from | Used in |
---|---|---|---|---|---|---|---|
Twitter 15 [144] | 2015 | 361 (I) 7032 (F) 5008 (R) | Posts related to 11 events | Text, visual | 2 | | [4, 15, 26–28, 99] |
Twitter 16 [89] | 2016 | 413 (I) 9596 (F) 6225 (R) | Posts related to 17 events | Text, visual | 2 | | [25, 91, 111, 129] |
Weibo [25] | 2016 | 9528 (I) 4749 (F) 4779 (R) | Verified false-rumor posts crawled from May 2012 to Jan 2016 | Text, visual | 2 | Weibo (non-rumor posts are verified by Xinhua News Agency, an authoritative news agency in China) | [4, 15, 25–28, 91, 91, 99] |
PHEME [12] | 2016 | 2672 (I) 1972 (F) 3830 (R) | 9 different events, including 5 cases of breaking news | Tweets, conversational threads | 3 | | [16, 92, 96, 99] |
ALLData [100] | 2018 | 20,015 (I) 11,941 (F) 8074 (R) | 2016 US presidential election | Title, text, image, author, and website | 2 | Fake news scraped from 240 websites; real news scraped from authoritative news websites, e.g., the New York Times and the Washington Post | [100, 110, 111] |
FakeNewsNet [120] | 2019 | 19,200 (I) 5367 (F) 17,222 (R) | Politics, entertainment | Text, image URL, conversational threads, location, and timestamp of engagement | 2 | Content crawled from PolitiFact, GossipCop, and E! Online; user engagements collected via the Twitter API | [94, 95, 97, 98] |
Note: I—Total Number of Images, F—Number of Fake claims, R—Number of Real claims