Table 4. Results of training and validation for several BERT classifiers (damage vs non-damage).
| Model (sequence length) | Batch size | Epochs | Learning rate | Epsilon | Training loss | Validation loss | Validation accuracy | Validation F1-score | Training time (h:mm:ss) | Validation time (h:mm:ss) |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT-64 | 32 | 1 | 2e−5 | 1e−8 | 0.07 | 0.12 | 0.9094 | 0.8244 | 0:02:11 | 0:00:05 |
| BERT-64 | 32 | 2 | 2e−5 | 1e−8 | 0.09 | 0.24 | 0.9120 | 0.8928 | 0:01:58 | 0:00:04 |
| BERT-64 | 32 | 3 | 2e−5 | 1e−8 | 0.07 | 0.37 | 0.9120 | 0.8928 | 0:01:58 | 0:00:04 |
| BERT-64 | 32 | 4 | 2e−5 | 1e−8 | 0.03 | 0.39 | 0.9055 | 0.8262 | 0:01:58 | 0:00:04 |
| BERT-128 | 32 | 1 | 2e−5 | 1e−8 | 0.29 | 0.23 | 0.9245 | 0.8479 | 0:03:57 | 0:00:09 |
| BERT-128 | 32 | 2 | 2e−5 | 1e−8 | 0.19 | 0.25 | 0.9333 | 0.8633 | 0:03:56 | 0:00:09 |
| BERT-128 | 32 | 3 | 2e−5 | 1e−8 | 0.12 | 0.29 | 0.9245 | 0.8479 | 0:03:56 | 0:00:09 |
| BERT-128 | 32 | 4 | 2e−5 | 1e−8 | 0.08 | 0.33 | 0.9245 | 0.8479 | 0:03:55 | 0:00:09 |