| Traditional | SVM | Binary classification of hate speech using hand-crafted features. | TF-IDF + Arabic lexicons (e.g., hate word lists). | Alakrot, Murray & Nikolov (2018b) |
| | Logistic regression | Probabilistic hate speech classification. | Feature engineering for Arabic morphology (e.g., root extraction). | Ousidhoum et al. (2021b) |
| | Naive Bayes | Lightweight hate speech detection. | Tokenization tailored for Arabic script and stopword removal. | Abozinadah, Mbaziira & Jones (2015) |
| | Random forest | Ensemble-based classification of offensive language. | Dialect-aware feature selection (e.g., Levantine vs. Gulf Arabic). | Aref et al. (2020) |
| | k-NN | Nearest-neighbor classification for small datasets. | Normalization of Arabic diacritics and elongations. | Cahyana et al. (2022) |
| Deep learning | CNN | Character/word-level feature extraction for hate speech. | Character-level embeddings to handle Arabic orthography such as ( ). | Mohaouchane, Mourhir & Nikolov (2019) |
| | LSTM | Sequential modeling of hate speech context. | Bidirectional LSTM (BiLSTM) for Arabic morphology and word order. | Al-Ani, Omar & Nafea (2021) |
| | GRU | Efficient sequential hate speech detection. | Dialect-specific tokenization (e.g., Egyptian Arabic). | Alshalan & Al-Khalifa (2020) |
| Transformers | AraBERT | Contextual hate speech classification. | Pre-trained on Arabic social media (Twitter) with subword tokenization. | Khezzar, Moursi & Al Aghbari (2023) |
| | MARBERT | Dialect-aware hate speech detection. | Pre-trained on 1B Arabic tweets covering multiple dialects. | Ben Nessir et al. (2022) |
| | XLM-RoBERTa | Cross-lingual hate speech detection. | Fine-tuned on Arabic datasets (e.g., OCA, ArSAS). | Felipe et al. (2022) |
| Hybrid | CNN-LSTM | Combining spatial and temporal features. | Multi-channel input for Arabic dialects and MSA. | Mohaouchane, Mourhir & Nikolov (2019) |
| | CNN-GRU | Extracting local and sequential features from text, with the GRU capturing sequence order. | | Al-Hassan & Al-Dossari (2021) |
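
The table lists normalization of Arabic diacritics and elongations as a typical preprocessing adaptation. A minimal Python sketch of such a step is given below. It is an illustration only and is not taken from any of the cited works: the function name `normalize_arabic`, the choice to strip the tatweel character, and the rule that collapses runs of three or more identical characters to a single one are all assumptions made for this example.

```python
import re

# Arabic diacritics (tashkeel), from fathatan U+064B through sukun U+0652.
DIACRITICS = re.compile(r"[\u064B-\u0652]")
# Tatweel (kashida), U+0640, used to stretch words visually.
TATWEEL = re.compile(r"\u0640")
# Any character repeated three or more times in a row (assumed threshold).
ELONGATION = re.compile(r"(.)\1{2,}")


def normalize_arabic(text: str) -> str:
    """Strip diacritics and tatweel, then collapse elongated letter runs."""
    text = DIACRITICS.sub("", text)
    text = TATWEEL.sub("", text)
    # Collapse each run to one character; collapsing to two instead would
    # be an equally reasonable choice, since some words have real doubles.
    text = ELONGATION.sub(r"\1", text)
    return text


if __name__ == "__main__":
    # An elongated spelling of a word, as often seen in social media text.
    print(normalize_arabic("\u062C\u0645\u064A\u064A\u064A\u064A\u0644"))
```

The normalized text can then be passed to whichever feature extractor a given model uses, for example TF-IDF for the traditional classifiers.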