Table 2.
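Feature ablation study on the Random Forest model. Each feature group is removed in turn, and the resulting change in performance relative to the full model is shown in parentheses. The #features column gives the size of the full feature set (first row) or of the removed group (remaining rows).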
| | #features | Validation set | Test set |
|---|---|---|---|
| Full model | 14 | 0.8832 | 0.8246 |
| - Token-based | 5 | 0.8689 (−1.5%) | 0.8129 (−1.2%) |
| - Character-based | 2 | 0.8655 (−1.8%) | 0.8154 (−0.9%) |
| - Sequence-based | 4 | 0.8697 (−1.4%) | 0.8034 (−2.1%) |
| - Semantic-based | 1 | 0.8704 (−1.3%) | 0.8235 (−0.1%) |
| - Entity-based | 2 | 0.8738 (−0.9%) | 0.8150 (−0.9%) |
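
The leave-one-group-out procedure behind this table can be sketched as follows. This is a minimal illustration, not the authors' code: `ablate_feature_groups` and `score_fn` are hypothetical names, and `score_fn` is assumed to train a Random Forest on the given feature columns and return the chosen metric on the validation or test set.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def ablate_feature_groups(
    groups: Dict[str, List[str]],
    score_fn: Callable[[Sequence[str]], float],
) -> Dict[str, Tuple[int, float, float]]:
    """Remove each feature group in turn and record the score change.

    groups   -- maps a group name (e.g. "Token-based") to its feature names
    score_fn -- hypothetical callable: trains a model on the given features
                and returns its score on a held-out set

    Returns {row label: (n_features, score, delta_vs_full_model)}.
    """
    all_features = [f for feats in groups.values() for f in feats]
    full_score = score_fn(all_features)

    # First row: the full model, using every feature.
    results = {"Full model": (len(all_features), full_score, 0.0)}

    for name, removed in groups.items():
        removed_set = set(removed)
        kept = [f for f in all_features if f not in removed_set]
        score = score_fn(kept)
        # Negative delta means the removed group was contributing to performance.
        results[f"- {name}"] = (len(removed), score, score - full_score)

    return results
```

Calling this once with a validation-set scorer and once with a test-set scorer would give the two score columns; whether the differences are expressed as absolute or relative changes is a reporting choice that this sketch leaves to the caller.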