Table 2.
Details of the experimental datasets used
| Dataset | Type of data | #Classes | Exceeding ratio | Vocabulary size | #Training samples | #Testing samples |
|---|---|---|---|---|---|---|
| DATASET-1 | BINARY | 2 | 0.471 | 16449 | 6911 | 1821 |
| DATASET-2 | BINARY | 2 | 0.495 | 39515 | 15000 | 5000 |
| DATASET-3 | MULTICLASS | 8 | 0.297 | 13737 | 4937 | 2175 |
| DATASET-4 | MULTICLASS | 3 | 0.563 | 16230 | 10000 | 3000 |
| DATASET-5 | MULTICLASS | 5 | 0.553 | 24742 | 6000 | 1500 |
Note: Exceeding ratio = (number of samples whose length exceeds the average length) / (number of all training samples)
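
The exceeding ratio defined in the note can be illustrated with a minimal sketch. The function name `exceeding_ratio`, the toy corpus, and the whitespace tokenization are assumptions made for illustration only; the source does not state how sample length is measured or whether "greater than" is strict, so a strict comparison is assumed here.

```python
from statistics import mean


def exceeding_ratio(lengths):
    """Fraction of samples whose length is strictly greater than the mean length.

    `lengths` is a list of per-sample lengths (e.g. token counts).
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    avg = mean(lengths)
    # Count samples longer than the average, then normalize by the sample count.
    return sum(1 for n in lengths if n > avg) / len(lengths)


# Hypothetical toy training set; whitespace tokenization is an assumption.
toy_corpus = ["a b c", "a b", "a b c d e f g", "a"]
lengths = [len(text.split()) for text in toy_corpus]  # [3, 2, 7, 1]
ratio = exceeding_ratio(lengths)  # mean is 3.25, so only one sample exceeds it
print(ratio)
```

A ratio near 0.5 suggests sample lengths are spread roughly symmetrically around the mean, while a lower value (such as the 0.297 reported for DATASET-3) suggests a few long samples pull the mean above most of the data.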