Table 3.
Algorithms | ft P | ft R | ft F | ft A | da P | da R | da F | da A | ds P | ds R | ds F | ds A |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SVM + Linear | 67.1 | 62.2 | 62.2 | 63.9 | 57.3 | 44.0 | 44.0 | 47.2 | 63.5 | 54.1 | 54.1 | 56.3 |
SVM + RBF | 70.2 | 51.0 | 51.0 | 40.2 | 58.5 | 53.9 | 53.9 | 55.5 | 63.2 | 51.5 | 51.5 | 54.6 |
XGBoost | 69.0 | 69.5 | 69.5 | 66.7 | 58.3 | 59.8 | 59.8 | 56.3 | 61.5 | 62.3 | 62.3 | 58.9 |
ANN | 63.1 | 63.7 | 63.7 | 63.4 | 56.8 | 58.8 | 58.8 | 54.7 | 62.0 | 61.9 | 61.9 | 58.0 |
RF | 69.6 | 67.5 | 67.5 | 63.5 | 60.8 | 60.7 | 60.7 | 57.0 | 59.9 | 61.9 | 61.9 | 59.5 |
NB | 59.4 | 56.1 | 56.1 | 57.5 | 47.0 | 48.7 | 48.7 | 44.9 | 48.5 | 50.0 | 50.0 | 45.8 |
LR | 65.1 | 67.4 | 67.4 | 64.7 | 54.2 | 56.6 | 56.6 | 52.0 | 63.2 | 61.8 | 61.8 | 61.8 |
K-NN | 65.2 | 65.2 | 65.2 | 60.3 | 51.8 | 57.5 | 57.5 | 52.8 | 61.3 | 61.6 | 61.6 | 57.4 |
Note: P, R, F, and A denote overall precision, recall, F1-score, and accuracy, respectively, reported for three types of embeddings (ft: fastText, da: domain-agnostic, ds: domain-specific). The hyperparameters of the traditional machine learning algorithms are as follows: SVM + Linear (C: 1, gamma: 0.1), SVM + RBF (C: 100, gamma: 0.1), XGBoost (learning-rate: 0.1, max-depth: 7, n-estimators: 150), ANN (hidden-layer-size: 20, learning-rate-init: 0.01, max-iter: 1000), RF (min-sample-leaf: 3, min-sample-split: 6, n-estimators: 200), LR (C: 10, solver: lbfgs, max-iter: 1000), and K-NN (leaf-size: 35, n-neighbor: 120, p: 1). Boldface denotes the highest performance.
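
For readers who wish to reproduce these baselines, the hyperparameters listed in the note can be written down as a configuration. The following is a minimal sketch, not the authors' code: it assumes the models come from scikit-learn and xgboost (the source does not name the libraries), that "hidden-layer-size: 20" means a single hidden layer of 20 units, and that NB is Gaussian naive Bayes. The names `HYPERPARAMS` and `build_models` are hypothetical.

```python
# Hyperparameters from the table note, keyed by the parameter names that
# scikit-learn and xgboost use. The mapping to these libraries is an
# assumption; the source only lists the values.
HYPERPARAMS = {
    "SVM + Linear": {"kernel": "linear", "C": 1, "gamma": 0.1},
    "SVM + RBF": {"kernel": "rbf", "C": 100, "gamma": 0.1},
    "XGBoost": {"learning_rate": 0.1, "max_depth": 7, "n_estimators": 150},
    # Assumption: one hidden layer with 20 units.
    "ANN": {"hidden_layer_sizes": (20,), "learning_rate_init": 0.01, "max_iter": 1000},
    "RF": {"min_samples_leaf": 3, "min_samples_split": 6, "n_estimators": 200},
    "NB": {},  # the note lists no hyperparameters for NB
    "LR": {"C": 10, "solver": "lbfgs", "max_iter": 1000},
    "K-NN": {"leaf_size": 35, "n_neighbors": 120, "p": 1},
}


def build_models():
    """Instantiate one classifier per table row.

    Imports are deferred so that HYPERPARAMS can be inspected even when
    scikit-learn or xgboost is not installed.
    """
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB  # NB variant is an assumption
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC
    from xgboost import XGBClassifier

    constructors = {
        "SVM + Linear": SVC,
        "SVM + RBF": SVC,
        "XGBoost": XGBClassifier,
        "ANN": MLPClassifier,
        "RF": RandomForestClassifier,
        "NB": GaussianNB,
        "LR": LogisticRegression,
        "K-NN": KNeighborsClassifier,
    }
    return {name: constructors[name](**HYPERPARAMS[name]) for name in HYPERPARAMS}
```

Calling `build_models()` returns a dictionary of unfitted estimators keyed by the row labels of the table. Note that scikit-learn's `SVC` ignores `gamma` when the kernel is linear, so that value has no effect for the SVM + Linear row under this assumed mapping.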