PeerJ Comput Sci. 2025 Sep 17;11:e3133. doi: 10.7717/peerj-cs.3133

Table 5. Overview of the main model types for Arabic hate speech detection, with example models, use cases, Arabic-specific adaptations, and key studies.

| Model type | Example model | Use case | Arabic-specific adaptations | Key studies |
|---|---|---|---|---|
| Traditional | SVM | Binary classification of hate speech using hand-crafted features. | TF-IDF + Arabic lexicons (e.g., hate word lists). | Alakrot, Murray & Nikolov (2018b) |
| Traditional | Logistic regression | Probabilistic hate speech classification. | Feature engineering for Arabic morphology (e.g., root extraction). | Ousidhoum et al. (2021b) |
| Traditional | Naive Bayes | Lightweight hate speech detection. | Tokenization tailored for Arabic script and stopword removal. | Abozinadah, Mbaziira & Jones (2015) |
| Traditional | Random forest | Ensemble-based classification of offensive language. | Dialect-aware feature selection (e.g., Levantine vs. Gulf Arabic). | Aref et al. (2020) |
| Traditional | k-NN | Nearest-neighbor classification for small datasets. | Normalization of Arabic diacritics and elongations. | Cahyana et al. (2022) |
| Deep learning | CNN | Character/word-level feature extraction for hate speech. | Character-level embeddings to handle Arabic orthography. | Mohaouchane, Mourhir & Nikolov (2019) |
| Deep learning | LSTM | Sequential modeling of hate speech context. | Bidirectional LSTM (BiLSTM) for Arabic morphology and word order. | Al-Ani, Omar & Nafea (2021) |
| Deep learning | GRU | Efficient sequential hate speech detection. | Dialect-specific tokenization (e.g., Egyptian Arabic). | Alshalan & Al-Khalifa (2020) |
| Transformers | AraBERT | Contextual hate speech classification. | Pre-trained on Arabic social media (Twitter) with subword tokenization. | Khezzar, Moursi & Al Aghbari (2023) |
| Transformers | MARBERT | Dialect-aware hate speech detection. | Pre-trained on 1B Arabic tweets covering multiple dialects. | Ben Nessir et al. (2022) |
| Transformers | XLM-RoBERTa | Cross-lingual hate speech detection. | Fine-tuned on Arabic datasets (e.g., OCA, ArSAS). | Felipe et al. (2022) |
| Hybrid | CNN-LSTM | Combining spatial and temporal features. | Multi-channel input for Arabic dialects and MSA. | Mohaouchane, Mourhir & Nikolov (2019) |
| Hybrid | CNN-GRU | Extracts local and sequential features from textual data; the GRU captures sequence order. | | Al-Hassan & Al-Dossari (2021) |
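
The table lists normalization of Arabic diacritics and elongations as a typical preprocessing step for traditional classifiers. The following is a minimal sketch of what such a step might look like, not the procedure used in any of the cited studies. The function name `normalize_arabic`, the `max_repeat` parameter, and the choice to collapse repeated characters to two are illustrative assumptions.

```python
import re

# Arabic diacritics (tashkeel): U+064B (fathatan) through U+0652 (sukun),
# plus the superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")

# Tatweel (kashida), U+0640, used to stretch words visually.
TATWEEL = re.compile(r"\u0640")

# Any single character repeated three or more times in a row.
REPEATS = re.compile(r"(.)\1{2,}")


def normalize_arabic(text: str, max_repeat: int = 2) -> str:
    """Strip diacritics and tatweel, then shorten elongated character runs.

    max_repeat is a hypothetical knob: runs of three or more identical
    characters are collapsed to this many copies.
    """
    text = DIACRITICS.sub("", text)
    text = TATWEEL.sub("", text)
    text = REPEATS.sub(lambda m: m.group(1) * max_repeat, text)
    return text


if __name__ == "__main__":
    samples = [
        "\u0645\u064E\u0631\u0652\u062D\u064E\u0628\u064B\u0627",  # word with diacritics
        "\u062C\u0640\u0640\u0640\u0645\u064A\u0644",              # word stretched with tatweel
        "\u0644\u0627\u0627\u0627\u0627\u0627",                    # word with a repeated letter
    ]
    for s in samples:
        print(s, "->", normalize_arabic(s))
```

The normalized text could then be passed to a feature extractor such as TF-IDF. Whether to collapse repeats to one or two characters is a design choice, since elongation can itself signal emphasis in social media text.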