Table 8. Results of training and validation for several BERT classifiers (multi-class).
| Sequence length | Batch size | Epochs | Learning rate | Epsilon | Training loss | Validation loss | Validation accuracy | Validation F1-score | Training time | Validation time |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT-64 | 32 | 1 | 2e−5 | 1e−8 | 0.35 | 0.29 | 0.8948 | 0.8422 | 0:02:03 | 0:00:05 |
| BERT-64 | 32 | 2 | 2e−5 | 1e−8 | 0.21 | 0.25 | 0.9083 | 0.8578 | 0:01:59 | 0:00:04 |
| BERT-64 | 32 | 3 | 2e−5 | 1e−8 | 0.13 | 0.35 | 0.9083 | 0.8578 | 0:01:59 | 0:00:05 |
| BERT-64 | 32 | 4 | 2e−5 | 1e−8 | 0.09 | 0.39 | 0.9083 | 0.8578 | 0:01:59 | 0:00:05 |
| BERT-128 | 32 | 1 | 2e−5 | 1e−8 | 0.35 | 0.27 | 0.8948 | 0.8291 | 0:03:42 | 0:00:08 |
| BERT-128 | 32 | 2 | 2e−5 | 1e−8 | 0.21 | 0.25 | 0.9083 | 0.8844 | 0:03:40 | 0:00:08 |
| BERT-128 | 32 | 3 | 2e−5 | 1e−8 | 0.14 | 0.36 | 0.8948 | 0.8291 | 0:03:39 | 0:00:08 |
| BERT-128 | 32 | 4 | 2e−5 | 1e−8 | 0.09 | 0.38 | 0.8948 | 0.8291 | 0:03:38 | 0:00:08 |