2020 Sep 9;8:165201–165215. doi: 10.1109/ACCESS.2020.3022867

TABLE 2. Accuracy, Error Rate, and Area Under Curve of the Validation Results.

Classification Algorithm (columns DT through XGBoost)

| Metric | Feature Extraction | DT | MNB | BNB | LR | kNN | Perceptron | NN | LSVM | ERF | XGBoost |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | TF | 99.04 | 98.29 | 95.09 | 99.36 | 96.42 | 98.93 | 99.68 | 99.52 | 99.52 | 99.15 |
| | TF-IDF Unigram | 98.93 | 98.01 | 93.96 | 98.18 | 98.67 | 99.57 | 99.63 | 99.52 | 99.15 | 99.25 |
| | Bigram | 98.02 | 97.92 | 90.55 | 97.49 | 97.06 | 98.72 | 98.99 | 98.99 | 98.56 | 96.80 |
| | Trigram | 93.00 | 92.57 | 92.68 | 92.47 | 90.92 | 93.54 | 93.75 | 93.91 | 93.43 | 91.45 |
| | N-gram (n=2:3) | 98.56 | 98.08 | 90.97 | 97.81 | 96.64 | 98.99 | 99.57 | 99.57 | 98.72 | 97.44 |
| | Characters Level | 99.20 | 97.97 | 91.35 | 98.08 | 99.36 | 99.52 | 99.63 | 99.63 | 99.09 | 99.41 |
| | Word Embeddings | 97.33 | 55.45 | 64.80 | 80.61 | 89.16 | 72.12 | 93.86 | 65.99 | 98.72 | 99.52 |
| Error Rate (%) | TF | 0.96 | 1.71 | 4.92 | 0.64 | 3.58 | 1.07 | 0.32 | 0.48 | 0.48 | 0.86 |
| | TF-IDF Unigram | 1.07 | 1.99 | 6.04 | 1.82 | 1.34 | 0.43 | 0.37 | 0.48 | 0.86 | 0.75 |
| | Bigram | 1.98 | 2.08 | 9.46 | 2.51 | 2.94 | 1.28 | 1.02 | 1.02 | 1.44 | 3.21 |
| | Trigram | 7.00 | 7.43 | 7.32 | 7.53 | 9.08 | 6.46 | 6.25 | 6.09 | 6.57 | 8.55 |
| | N-gram (n=2:3) | 1.44 | 1.92 | 9.03 | 2.19 | 3.37 | 1.02 | 0.43 | 0.43 | 1.28 | 2.56 |
| | Characters Level | 0.80 | 2.03 | 8.65 | 1.92 | 0.64 | 0.48 | 0.37 | 0.37 | 0.91 | 0.59 |
| | Word Embeddings | 2.67 | 44.55 | 35.20 | 19.39 | 10.84 | 27.89 | 6.14 | 34.01 | 1.28 | 0.48 |
| Area Under the Curve (%) | TF | 97.91 | 96.87 | 90.58 | 98.74 | 93.45 | 98.83 | 99.41 | 99.03 | 99.03 | 98.33 |
| | TF-IDF Unigram | 97.83 | 97.36 | 88.75 | 97.75 | 98.65 | 99.16 | 99.28 | 99.12 | 98.14 | 98.58 |
| | Bigram | 97.09 | 97.62 | 84.60 | 98.13 | 97.75 | 98.00 | 99.07 | 99.07 | 98.06 | 96.75 |
| | Trigram | 92.86 | 94.16 | 92.29 | 94.98 | 88.41 | 95.06 | 95.06 | 95.70 | 94.34 | 93.26 |
| | N-gram (n=2:3) | 97.15 | 97.67 | 84.66 | 98.43 | 96.91 | 97.96 | 99.34 | 99.34 | 97.68 | 97.24 |
| | Characters Level | 98.19 | 97.23 | 85.12 | 97.30 | 99.20 | 99.03 | 99.47 | 99.28 | 98.20 | 98.78 |
| | Word Embeddings | 94.68 | 63.05 | 65.02 | 70.54 | 82.48 | 60.20 | 89.46 | 57.59 | 97.85 | 99.21 |
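
The accuracy and error-rate values in Table 2 appear to be complements of each other, so each pair should sum to roughly 100%. The sketch below checks this for the TF row only, using values copied from the table. It assumes that error rate is defined as 100 minus accuracy and that gaps of about 0.01 come from rounding each value to two decimals independently. The table itself does not state either point. The function name `complement_gaps` and the constant `TOLERANCE` are illustrative, not taken from the paper.

```python
# Accuracy and error rate (%) for the TF feature row, copied from Table 2.
classifiers = ["DT", "MNB", "BNB", "LR", "kNN",
               "Perceptron", "NN", "LSVM", "ERF", "XGBoost"]
accuracy_tf = [99.04, 98.29, 95.09, 99.36, 96.42,
               98.93, 99.68, 99.52, 99.52, 99.15]
error_tf = [0.96, 1.71, 4.92, 0.64, 3.58,
            1.07, 0.32, 0.48, 0.48, 0.86]


def complement_gaps(accuracy, error):
    """Return |100 - (accuracy + error)| for each classifier."""
    return [abs(100.0 - (a + e)) for a, e in zip(accuracy, error)]


gaps = complement_gaps(accuracy_tf, error_tf)

# Assumption: gaps up to about 0.01 are rounding effects, not data errors.
TOLERANCE = 0.011
consistent = all(gap <= TOLERANCE for gap in gaps)

# Classifier with the highest TF accuracy in this row.
best_index = max(range(len(classifiers)), key=lambda i: accuracy_tf[i])
best_classifier = classifiers[best_index]

print(best_classifier, consistent)
```

The same check can be run on any other feature row by swapping in that row's accuracy and error-rate values.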