2021 Jun 17;7:e598. doi: 10.7717/peerj-cs.598

Table 4. Boosted sampling methods of the most commonly studied hate speech datasets (Waseem & Hovy, 2016; Davidson et al., 2017; Founta et al., 2018; Basile et al., 2019).

Descriptions are quoted as they appear in the original publications.

| Dataset | Keywords | Haters | Other |
|---|---|---|---|
| Waseem | “Common slurs and terms used pertaining to religious, sexual, gender, and ethnic minorities” | “A small number of prolific users” | N/A |
| Davidson | HateBase (https://www.hatebase.org/) | “Each user from lexicon search” | N/A |
| Founta | HateBase, NoSwearing (https://www.noswearing.com/dictionary/) | N/A | Negative sentiment |
| HatEval | “Neutral keywords and derogatory words against the targets, highly polarized hashtags” | “Identified haters” | “Potential victims of hate accounts” |

Notes.

N/A: no relevant descriptions found.