Table 4.
Prediction accuracy comparison of our attention-based multi-scale CNN model and other methods. GBDT: Gradient Boosting Decision Tree, KNN: k-nearest neighbor, SVM: support vector machine.
| Method | Feature | Recognition Model | Accuracy (%), Evaluation A (train on A, test on B) | Accuracy (%), Evaluation B (train on B, test on A) |
|---|---|---|---|---|
| Attention-based multi-scale CNN | Time signal | Multi-scale CNN | 93.3 | 82.8 |
| MFCC CNN | MFCC | (b) module | 78.7 | 59.4 |
| MFCC-delta CNN | MFCC-delta | (b) module | 83.3 | 58.8 |
| MFCC-delta-delta CNN | delta-Deltas | (b) module | 82.6 | 57.4 |
| End-to-end stacked CNN | Time–frequency signal | _ | 89.7 | 81.1 |
| Multiple feature + KNN | Multiple feature | KNN | 87.9 | 76.8 |
| Multiple feature + SVM | Multiple feature | SVM | 83.2 | 66.7 |
| Multiple feature + GBDT | Multiple feature | GBDT | 71.5 | 48.4 |