Table 5. Best hyperparameter configuration found for each LLM for the MET detection model.
Model | Epochs | Weight decay | Batch size | Learning rate |
---|---|---|---|---|
BETO | 10 | 0.0448468569 | 16 | 0.000034 |
ALBETO | 10 | 0.0139110539 | 16 | 0.000039 |
DistilBETO | 10 | 0.1950135486 | 8 | 0.000028 |
MarIA | 7 | 0.0734046361 | 16 | 0.000038 |
BERTIN | 8 | 0.0139110539 | 16 | 0.000039 |
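
For reuse in a training script, the values in Table 5 could be kept in a small lookup structure. The following is a minimal sketch, not the authors' code: the names `HyperParams`, `BEST_HYPERPARAMS`, and `to_trainer_kwargs` are hypothetical, and the keyword names returned by `to_trainer_kwargs` assume a Hugging Face `TrainingArguments`-style interface, which the source does not confirm as the framework used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """One row of Table 5 (hypothetical container, for illustration only)."""
    epochs: int
    weight_decay: float
    batch_size: int
    learning_rate: float


# Values copied verbatim from Table 5; learning rates written in
# scientific notation (0.000034 == 3.4e-5).
BEST_HYPERPARAMS = {
    "BETO": HyperParams(10, 0.0448468569, 16, 3.4e-5),
    "ALBETO": HyperParams(10, 0.0139110539, 16, 3.9e-5),
    "DistilBETO": HyperParams(10, 0.1950135486, 8, 2.8e-5),
    "MarIA": HyperParams(7, 0.0734046361, 16, 3.8e-5),
    "BERTIN": HyperParams(8, 0.0139110539, 16, 3.9e-5),
}


def to_trainer_kwargs(model_name: str) -> dict:
    """Map a Table 5 row to keyword arguments.

    Assumption: the key names follow Hugging Face `TrainingArguments`;
    rename them if a different training framework is used.
    """
    hp = BEST_HYPERPARAMS[model_name]
    return {
        "num_train_epochs": hp.epochs,
        "weight_decay": hp.weight_decay,
        "per_device_train_batch_size": hp.batch_size,
        "learning_rate": hp.learning_rate,
    }
```

Calling `to_trainer_kwargs("MarIA")` returns a dictionary that can be unpacked into the training configuration, so each model is fine-tuned with its own row of the table rather than a shared default.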