2022 Apr 15;2022:5211949. doi: 10.1155/2022/5211949

Table 1.

Summary of Twitter spam detection and sentiment analysis.

Techniques used | Key findings
--- | ---
Backpropagation neural network and naïve Bayes are used as classifiers [1] for spam detection. | Spam classification is performed on real-time Twitter data. Naïve Bayes performs better than the backpropagation neural network.
Support vector machine method and sequential minimal optimization algorithm [4] are used for spam detection. | Compared with other spam detection models, this model is reported to be highly reliable, as judged by the correctness of the system.
The decision tree induction algorithm, the naïve Bayes algorithm, and the KNN algorithm are used for spam detection [6]. | The proposed solution is reported to be practical and to deliver much better classification results than other methodologies currently in use.
Relief and information gain are the two approaches used for feature selection. Classifiers used for spam detection are multilayer perceptrons, decision trees, naïve Bayes, and k-nearest neighbors [7]. | The dataset comprises 82 Twitter profiles. The proposed work uses tweets in different languages but fails to achieve better accuracy because the dataset is small.
The support vector machine, K-nearest neighbor (KNN), naïve Bayes, and bagging algorithms are used for spam detection [8]. | Naïve Bayes was compared against bagging (an ensemble classifier) to filter out spam comments. Ensemble classifiers were found to produce better results in the vast majority of cases.
Naïve Bayes classifier (NB), support vector machine (SVM), K-nearest neighbor (KNN), artificial neural network (ANN), and random forest (RF) are used for spam detection [9]. | SMS spam corpora (UCI repository) and Twitter corpora (public live tweets) are used for analysis. These classical classifiers achieved good accuracy in spam classification on both datasets.
The random forest, maximum-entropy (MaxEnt), C-Support vector classification (SVC5), extremely randomized trees (ExtraTrees), gradient boosting, spam post detection (SPD), and multilayer perceptron (MLP) algorithms are used for spam detection [10]. | An automatically annotated spam post detection dataset (SPDautomated), named Honeypot, and a manually annotated spam post detection dataset (SPDmanual) were used, and the different algorithms were evaluated and compared.
Agglomerative hierarchical clustering is used for spam detection [13]. | The movie review dataset is used for the analysis. The lexicon method used is simpler than the methods available in machine learning.
Naïve Bayes and SVM are used for spam detection [14]. | The political dataset is used for analysis. SVM was observed to perform better for the given contextual data.
RapidMiner and NamSor are used for tweet classification [15]. | NamSor, which was used for gender identification, is not very accurate.
An unsupervised machine learning approach is used for tweet spam classification and sentiment analysis [17]. | The proposed unsupervised learning method achieved a recall of 95% in learning the patterns of new spam activities.
Lexicon-based sentiment analysis [18]. | A small Twitter-specific lexicon set is used, which gives good accuracy. For general tweet analysis, the accuracy is reduced.
Bayesian classifier (NB), support vector machines (SVM), part-of-speech tagging, and an SVM and scoring-based hybrid approach (called HS-SVM) are used to classify reviews of scientific articles [19]. | The HS-SVM classifier produces the best results.
Maximum entropy, naïve Bayes, and support vector machine are used for sentiment classification [20]. | Tweets are analyzed across domains such as medicine, social media, and sports.
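
Naïve Bayes appears as a spam classifier in many of the studies summarized in Table 1. The following is only a minimal sketch of that general technique, not the implementation used in any of the cited works: a multinomial naïve Bayes classifier with add-one (Laplace) smoothing, written in plain Python. The toy tweets, the labels "spam" and "ham", the whitespace tokenizer, and the function names `tokenize`, `train_naive_bayes`, and `predict` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per class and word occurrences per class.

    samples: list of (text, label) pairs.
    Returns (class_counts, word_counts, vocab).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Laplace-smoothed log likelihood of the token given the class.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for this sketch only.
TOY_TWEETS = [
    ("win a free prize click this link now", "spam"),
    ("free followers click here", "spam"),
    ("claim your free gift card now", "spam"),
    ("great match tonight with friends", "ham"),
    ("reading a new paper on sentiment analysis", "ham"),
    ("lunch with the team was great", "ham"),
]

model = train_naive_bayes(TOY_TWEETS)
print(predict(model, "click for a free prize"))
print(predict(model, "great lunch with friends"))
```

A real Twitter pipeline would need far more data, proper tokenization of URLs, mentions, and hashtags, and a held-out evaluation set before any accuracy comparison of the kind reported in the table could be made.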