PeerJ Comput. Sci. 2025 Sep 17;11:e3133. doi: 10.7717/peerj-cs.3133

Table 7. Comparative overview of Arabic hate speech detection studies (2020–2024) using transformer-based models, highlighting preprocessing steps, datasets, model performance, and associated limitations/future research directions.

Ref. Year Preprocessing Dataset details Best model/Performance Limitations/Future Directions
Tanase, Cercel & Chiru (2020) Tokenization using a BERT-specific tokenizer; hashtag normalization; emojis replaced with textual descriptions. OffensEval 2020 dataset/Arabic subset, 7,000 tweets. Classes: offensive, not offensive BERT-based model F1 = 82.19% The authors note that the limited availability of training data for non-English languages constrains the performance of multilingual models. They plan to use transfer learning to leverage similar tasks in the same language to enhance offensive language detection models.
Socha (2020) Multiple consecutive user mentions replaced with a single mention. All tweets truncated or padded to a common length. Dataset in Almaliki et al. (2023)/12,698 tweets. Classes: offensive (OFF), not offensive (NOT) Monolingual Arabic BERT BASE F1 = 86% The article does not mention any limitations, challenges, or future research directions. Observations: monolingual models outperform multilingual models for Arabic; only minimal preprocessing was applied.
Alami et al. (2020) Substituting emojis with a special token, translating emoji meanings from English to Arabic, then concatenating emoji-free tweets with their Arabic meanings. Tokenization. The Arabic dataset used in OffensEval 2020/10,000 tweets. Classes: offensive (OFF), not offensive (NOT OFF) AraBERT F1 = 90.17% The AraBERT model faced a problem with the [MASK] token not being included in the fine-tuning dataset; the issue was resolved by replacing Twitter emojis with [MASK] tokens. Future work includes using advanced word embeddings.
Abdul-Mageed, Elmadany & Nagoudi (2021) Removing diacritics and replacing URLs, user mentions, and hashtags with generic string tokens (URL, USER, HASHTAG). Tokenization. Shared-task dataset (subtasks A and B) in Masadeh, Davanager & Muaad (2022)/10,000 tweets. Classes: social meaning tasks for hate and offensive detection (hate, not-hate / offensive, not-offensive) ARBERT F1 = 83% for hate, F1 = 90% for offensive. MARBERT F1 = 84.79% for hate, F1 = 92.41% for offensive. The authors aim to improve multilingual language models through self-training and to create models that use less energy, as high inference costs and a lack of diversity in non-English pre-training data limit their effectiveness.
Hadj Ameur & Aliane (2021) Removing diacritical marks, links, user references, and elongated and repeated characters. Normalization. Tokenization. AraCOVID19-MFH dataset/10,828 tweets. Classes: Yes, No, Indeterminate AraBERTCov19 F1 = 98.58% The tweet preprocessing used for model training was not released with the dataset. The authors plan to re-annotate using multiple annotators and expand the annotated dataset with COVID-19 events and discussions.
Masadeh, Davanager & Muaad (2022) Removing punctuation, slang, and stop words. Tokenization. Stemming and lemmatization. Dataset used in Alghamdi et al. (2024)/6,164 tweets + Arabic Jordanian General Tweets (AJGT) corpus, with 900 tweets. Classes: hate, non-hate BERT-AJGT Acc = 79% The study focuses on detecting religious hate speech in Arabic, addressing mixed-language issues. Future plans include exploring methods for detecting racism, misogyny, and religious prejudice.
Boulouard et al. (2022) Removing emojis, punctuation, stop words, and extra letters used for emphasis. Some words stemmed and lemmatized. Tokenization. 11,268 Arabic YouTube comments. Classes: hateful: 1, non-hateful: 0 AraBERT Prec = 95%, F1 = 95%, Rec = 96%, Acc = 96% BERT models need to handle Arabic dialects, but their versatility limits multilingual performance. Future plans involve using more Levantine and North-African dialect datasets, including “Arabizi”.
Althobaiti (2022) Removing HTML tags, hashtags, mentions, diacritics, punctuation, mathematical signs and symbols, URLs, retweet (RT) markers, and symbols other than emojis. Normalization. The Arabic dataset in Zaghouani, Mubarak & Biswas (2024), consisting of 12,698 Arabic tweets. Classes: offensive language detection: OFF, NOT OFF; hate speech detection: HS, NOT HS BERT-based Offensive language detection: F1 = 84.3%. Hate speech detection: F1 = 81.8% The article does not mention any limitations, challenges, or future research directions. Observations: additional research is needed to properly understand the influence of emojis and their textual explanations, as the dataset used in the study may be too small and imbalanced.
Alzu’bi et al. (2022) Removing URLs, mentions, diacritics, tatweel, punctuation, and noisy signals in the tweet. Emojis translated to Arabic using an English-to-Arabic model. OSACT5 Arabic hate speech task/12,698 tweets. Classes: OFF, NOT OFF AraBERTv0.2-Twitter-large Prec = 85.2%, F1 = 84.9%, Rec = 84.7%, Acc = 86.4% Dialect mismatch in pre-trained models makes normalizing tweet dialects, extracting relevant features such as POS tags and NER, and recognizing offensive tweets challenging. Future research directions are not explicitly mentioned in the article.
Ben Nessir et al. (2022) Removing extra white spaces, non-Arabic tokens, USER and URL tokens, and emojis. Normalizing hashtags by decomposing them. Dataset in Zaghouani, Mubarak & Biswas (2024)/12,698 tweets. Classes: subtask A: offensive, not offensive; subtask B: hate, not hate; subtask C: fine-grained type of hate speech MARBERT fine-tuned with QRNN Acc = 85.4% for subtask A, Acc = 94.1% for subtask B, Acc = 91.9% for subtask C on the test dataset. Arabic language complexity; cultural, political, and religious dependence; and dialect differences contribute to unbalanced data and class proportions. Future research should explore meta-learning, focal loss, semi-supervised learning, and the inclusion of disabled and religious minorities.
Shapiro, Khalafallah & Torki (2022) Removing repeated characters, emojis, diacritics, and symbols. Normalization. Dataset in Zaghouani, Mubarak & Biswas (2024)/12,698 tweets. Classes: subtask A: offensive, not offensive; subtask B: hate, not hate; subtask C: fine-grained type of hate speech MARBERTv2 Subtask A F1 = 84.1%, subtask B F1 = 81.7%, subtask C F1 = 47.6% Overfitting on small or unbalanced datasets; larger datasets degrade the contrastive loss. Future solutions include using a language-agnostic encoder with a contrastive objective and utilizing data from multiple languages for the same task to address data imbalance.
Almaliki et al. (2023) Removing @username, URLs, hashtags, and punctuation. Tokenization. Normalization. 9,352 tweets. Classes: normal, abusive, hate speech ABMM (Arabic BERT-Mini Model) Prec = Rec = F1 = Acc = 98.6% The study suggests incorporating data from Facebook and exploring text representation methods such as AraVec to improve neural network model training and enhance the dataset, despite hardware limitations.
de Paula et al. (2023) Removing punctuation, special characters, and stop words. Converting lowercase to uppercase. Stemming. Tokenization. Lemmatization. CERIST NLP challenge dataset/10,828 tweets. Classes: hateful, not hateful AraBERT F1 = 60%, Acc = 86% Dataset limited to the COVID-19 disinformation domain; small proportion of hate speech in the dataset (11%). The article does not mention any future research directions.
Khezzar, Moursi & Al Aghbari (2023) Removing hashtags and stop words; filtering out irrelevant symbols. Lemmatization. Normalization. arHateDataset/34,107 tweets. Classes: hate, normal AraBERT F1 = 93% Problems with data imbalance and Arabic dialect complexity. The article does not suggest future research directions.
Chiker (2023) Removing elongations, non-Arabic characters, numbers, symbols, emoticons, punctuation, hashtags, web addresses, empty lines, diacritics, and stop words. Normalization. Provided by CERIST/10,278 comments from Twitter and other social media. Classes: hateful, not hateful BERT + GRU and LSTM. With focal-loss training F1 = 98.02%; with data augmentation F1 = 99.14% Imbalance between “hateful” and “not hateful” classes. The article does not suggest future research directions.
Alghamdi et al. (2024) Removing diacritics, punctuation, repeated characters, symbols, special characters, URLs, English tokens, and emojis. Normalization. AraTar corpus/11,219 tweets. Classes: Task 1: RH (religious hate), EH (ethnic hate), NH (nationality hate), GH (gender hate), UDH (undefined hate), CL (clean) AraBERTv0.2-Twitter (base) F1 = 84.5% Not all Arabic dialects are incorporated. The future plan is to improve the corpus representation of underrepresented hate targets with data augmentation.
Zaghouani, Mubarak & Biswas (2024) Removing unwanted characters, English words, and punctuation. 15,965 tweets. Classes: multi-label; for hate speech and offensive: Yes, No AraBERT F1 = 66% for hate speech detection, F1 = 65% for offensive language detection. The varied Arabic regional backgrounds of annotators may affect labeling accuracy. The article does not suggest future research directions.
Bensoltane & Zaki (2024) Removing dates, times, numbers in both English and Arabic, URLs, and Twitter-specific symbols. OSACT5 dataset/12,698 tweets. Classes: offensive, normal, hate (disability, social class, race, gender, religion, ideology) MARBERTv2 + BiGRU F1 = 61.68% Unbalanced dataset. The future plan is to combine BERT with different neural network designs, investigate further transformer-based models, and find solutions for unbalanced datasets.
Eddine & Boualleg (2024) Removing mentions, URLs, RT markers, hashtags, punctuation, special characters, numerical characters, repeated characters, Arabic stop words, non-Arabic letters, new lines, and diacritics. Normalization. 11,634 tweets + 6,853 tweets used for data augmentation. Classes: non-hate, general hate speech, sexism, racism, and religious hate speech Ensemble learning based on pre-trained models. F1 = 85.48% using majority voting and 85.10% using average voting; data-augmented model F1 = 85.65% Confined to a specific dataset and time period. The future plan is to improve the contextual embedding model, classify Algerian hate speech, and track trends in hate speech.
Asiri & Saleh (2024) Replacing user mentions with “USER”, URLs with “URL”, and newlines with “NL”. 24,500 tweets, augmented to over 35,000 tweets to address class imbalance. Classes: offensive, non-offensive. Multi-class: general insults, hate speech, or sarcasm AraBERT F1 = 91% with data augmentation techniques Limited regional coverage reduces model generalizability, while models such as AraBERT demand substantial computational resources. Future work should prioritize: (1) developing comprehensive dialect-specific datasets, (2) refining dialect-aware NLP tools, and (3) optimizing models for dialectal variations.
Mazari, Benterkia & Takdenti (2024) Replacing e-mail addresses and user mentions with <user>, URLs with <url>, numbers with <number>, etc. Removing Arabic diacritics and elongations, pictographs, symbols, flags, etc. OSACT2020 dataset/10,000 posts. Classes: offensive, not offensive Ensemble learning models based on BERT, F1 = 94.56% Class imbalance presents a significant challenge in the datasets. Future work will evaluate models for detecting other offensive Arabic language forms and will explore pretrained BERT variants and generative AI models to address challenges in detecting such language.
Mousa et al. (2024) Cleaning, normalization, Farasa segmentation, and tokenization. 13,000 tweets. Multi-class: racism, bullying, insult, obscene language, and non-offensive content ArabicBERT–BiLSTM–RBF, F-score = 98.4% Limitations include computational complexity from cascaded models, extended training times due to large datasets, and reliance on combining multiple machine learning models. Future work will focus on: (1) adopting faster contextual models to replace BERT architectures, (2) optimizing parameters and feature extraction for efficiency, (3) integrating attention mechanisms for acceleration, and (4) evaluating cross-lingual performance.
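Most rows above share a common cleaning pipeline: URL and mention replacement with generic tokens, diacritic (tashkeel) and tatweel removal, elongation reduction, and letter normalization. A minimal sketch of such a pipeline follows; the token names URL and USER, the exact regexes, and the step ordering are illustrative assumptions rather than any single study's released code:

```python
import re

# Arabic diacritics (tashkeel, U+064B-U+0652) and tatweel/kashida (U+0640)
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

def preprocess(tweet: str) -> str:
    """Hypothetical composite of the preprocessing steps listed in Table 7."""
    t = re.sub(r"https?://\S+", "URL", tweet)        # replace URLs with a generic token
    t = re.sub(r"@\w+", "USER", t)                   # replace user mentions
    t = DIACRITICS.sub("", t)                        # strip diacritics and tatweel
    t = re.sub(r"(.)\1{2,}", r"\1\1", t)             # reduce elongations (3+ repeats -> 2)
    t = re.sub(r"[\u0622\u0623\u0625]", "\u0627", t) # normalize alef variants to bare alef
    t = re.sub(r"\s+", " ", t).strip()               # collapse whitespace
    return t
```

For example, `preprocess("@user هههههه http://t.co/x")` yields `"USER هه URL"`. Individual studies vary in which steps they apply and in what order, as the preprocessing column shows.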
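Since nearly every row reports F1 (typically macro-averaged across classes for the imbalanced datasets discussed), a from-scratch computation of that metric may help interpret the scores; the HS/NOT labels here are hypothetical, mirroring the binary hate speech subtasks:

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 -- the usual choice for imbalanced classes."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)
```

Macro averaging weights the minority hate class equally with the majority class, which is why several studies above pair it with focal loss or data augmentation to combat class imbalance.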