2024 Mar 17;14:6420. doi: 10.1038/s41598-024-56259-z

Table 7.

Results of the adversarial attack mitigation technique (before and after) on multiple datasets.

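| Attack type | Attack | Movie reviews (before) | Movie reviews (after) | Hate speech (before) | Hate speech (after) | Clickbait (before) | Clickbait (after) |
|---|---|---|---|---|---|---|---|
| Word-level | TextFooler | 0.27 | 0.79 | 0.32 | 0.68 | 0.65 | 0.85 |
| Word-level | PWWS | 0.21 | 0.58 | 0.35 | 0.60 | 0.66 | 0.80 |
| Word-level | GENETIC | 0.15 | 0.63 | 0.23 | 0.58 | 0.58 | 0.76 |
| Word-level | SememePSO | 0.40 | 0.76 | 0.67 | 0.79 | 0.79 | 0.90 |
| Word-level | BAE | 0.85 | 0.99 | 0.94 | 0.96 | 0.98 | 0.97 |
| Word-level | BERT-ATTACK | 0.8 | 0.71 | 0.24 | 0.75 | 0.52 | 0.78 |
| Word-level | HotFlip | 0.47 | 0.84 | 0.71 | 0.88 | 0.81 | 0.85 |
| Sentence-level | SEA | 1 | 1 | 1 | 1 | 1 | 1 |
| Sentence-level | GAN | 0.63 | 0.89 | 0.60 | 0.66 | 0.54 | 0.56 |
| Sentence-level | SCPN | 1 | 1 | 1 | 1 | 1 | 1 |
| Char-level | DeepWordBug | 0.58 | 0.92 | 0.74 | 0.83 | 0.62 | 0.85 |
| Char-level | VIPER | 0.94 | 0.93 | 0.99 | 0.97 | 0.98 | 0.97 |
| Char-level | UAT | 0.11 | 0.74 | 0.18 | 0.35 | 0.39 | 0.36 |
| Char-level | TextBugger | 0.5 | 0.64 | 0.25 | 0.65 | 0.61 | 0.74 |
| | Average | 0.565 | 0.815 | 0.587 | 0.764 | 0.723 | 0.813 |
| | Relative ↑% | ↑25% | | ↑17% | | ↑9% | |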

Significant values are in bold.
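
The summary rows of Table 7 can be recomputed from the per-attack scores. The sketch below is not from the paper: the names `ROWS`, `column_means`, and `gains_in_points` are illustrative, and the values are transcribed by hand from the table. Under that transcription, the "Average" row matches the unweighted mean over the 14 attacks to within 0.001, and the "Relative ↑%" row appears to match the difference between the "after" and "before" averages expressed in percentage points (for example 0.815 − 0.565 ≈ 0.25 for movie reviews), rather than a ratio to the "before" average.

```python
# Scores transcribed from Table 7 (assumption: transcription is faithful).
# Each tuple is (movie before, movie after, hate before, hate after,
#                clickbait before, clickbait after).
ROWS = {
    "TextFooler":  (0.27, 0.79, 0.32, 0.68, 0.65, 0.85),
    "PWWS":        (0.21, 0.58, 0.35, 0.60, 0.66, 0.80),
    "GENETIC":     (0.15, 0.63, 0.23, 0.58, 0.58, 0.76),
    "SememePSO":   (0.40, 0.76, 0.67, 0.79, 0.79, 0.90),
    "BAE":         (0.85, 0.99, 0.94, 0.96, 0.98, 0.97),
    "BERT-ATTACK": (0.80, 0.71, 0.24, 0.75, 0.52, 0.78),
    "HotFlip":     (0.47, 0.84, 0.71, 0.88, 0.81, 0.85),
    "SEA":         (1.00, 1.00, 1.00, 1.00, 1.00, 1.00),
    "GAN":         (0.63, 0.89, 0.60, 0.66, 0.54, 0.56),
    "SCPN":        (1.00, 1.00, 1.00, 1.00, 1.00, 1.00),
    "DeepWordBug": (0.58, 0.92, 0.74, 0.83, 0.62, 0.85),
    "VIPER":       (0.94, 0.93, 0.99, 0.97, 0.98, 0.97),
    "UAT":         (0.11, 0.74, 0.18, 0.35, 0.39, 0.36),
    "TextBugger":  (0.50, 0.64, 0.25, 0.65, 0.61, 0.74),
}

DATASETS = ("Movie reviews", "Hate speech", "Clickbait")


def column_means(rows):
    """Unweighted mean of each of the six score columns."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows.values())]


def gains_in_points(means):
    """After-minus-before difference per dataset, in percentage points."""
    return [100 * (means[i + 1] - means[i]) for i in range(0, len(means), 2)]


means = column_means(ROWS)
gains = gains_in_points(means)

for k, name in enumerate(DATASETS):
    before, after = means[2 * k], means[2 * k + 1]
    print(f"{name}: before={before:.4f} after={after:.4f} gain={gains[k]:.1f} points")
```

The per-dataset gains come out slightly above the reported 25, 17, and 9, which suggests the table truncates rather than rounds its summary values; this is an inference from the arithmetic, not something the source states.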