Table 4. Boosted sampling methods of the most commonly studied hate speech datasets (Waseem & Hovy, 2016; Davidson et al., 2017; Founta et al., 2018; Basile et al., 2019).
Descriptions are quoted as they appear in the publications.
| Dataset | Keywords | Haters | Other |
|---|---|---|---|
| Waseem | “Common slurs and terms used pertaining to religious, sexual, gender, and ethnic minorities” | “A small number of prolific users” | N/A |
| Davidson | HateBase (https://www.hatebase.org/) | “Each user from lexicon search” | N/A |
| Founta | HateBase, NoSwearing (https://www.noswearing.com/dictionary/) | N/A | Negative sentiment |
| HatEval | “Neutral keywords and derogatory words against the targets, highly polarized hashtags” | “Identified haters” | “Potential victims of hate accounts” |
Notes.
- N/A: no relevant descriptions found