2021 Jun 17;7:e598. doi: 10.7717/peerj-cs.598

Table 4. Boosted sampling methods of the most commonly studied hate speech datasets (Waseem & Hovy, 2016; Davidson et al., 2017; Founta et al., 2018; Basile et al., 2019).

Descriptions are quoted as they appeared in the publications.

Dataset	Keywords	Haters	Other
Waseem	“Common slurs and terms used pertaining to religious, sexual, gender, and ethnic minorities”	“A small number of prolific users”	N/A
Davidson	HateBase (https://www.hatebase.org/)	“Each user from lexicon search”	N/A
Founta	HateBase, NoSwearing (https://www.noswearing.com/dictionary/)	N/A	Negative sentiment
HatEval	“Neutral keywords and derogatory words against the targets, highly polarized hashtags”	“Identified haters”	“Potential victims of hate accounts”
