Table 1.
Hyperparameter settings in our model.
| Hyperparameter | Optimum value | Search range |
| Learning rate | 5×10-5 | {1×10-3, 1×10-4, ..., 1×10-6} |
| Batch size | 64 | {16, 32, 64, 128, 256} |
| Sentence vector size | 300 | {100, 200, 300, 400, 500} |
| Dropout rate | 0.5 | {0.1, 0.2, 0.3, ..., 0.8} |
| Down-sampling rate | 0a | {0, 0.1, ..., 1} |
aThe optimum setting had no down-sampling.