Table 8. Best hyperparameter combination found for each model and each target entity.
Target entity | LLM | Epochs | Batch size | Weight decay | Learning rate |
---|---|---|---|---|---|
MET | BETO | 4 | 16 | 0.044847 | 0.000034 |
MET | ALBETO | 8 | 8 | 0.011343 | 0.000022 |
MET | DistilBETO | 4 | 16 | 0.073405 | 0.000038 |
MET | MarIA | 8 | 16 | 0.188325 | 0.000017 |
MET | BERTIN | 6 | 8 | 0.081163 | 0.000017 |
Society | BETO | 9 | 16 | 0.073405 | 0.000038 |
Society | ALBETO | 7 | 8 | 0.195014 | 0.000028 |
Society | DistilBETO | 10 | 16 | 0.073405 | 0.000038 |
Society | MarIA | 6 | 16 | 0.188325 | 0.000017 |
Society | BERTIN | 7 | 16 | 0.126252 | 0.000012 |
Other companies | BETO | 9 | 16 | 0.044847 | 0.000034 |
Other companies | ALBETO | 7 | 8 | 0.011343 | 0.000022 |
Other companies | DistilBETO | 7 | 8 | 0.011343 | 0.000022 |
Other companies | MarIA | 8 | 16 | 0.073405 | 0.000038 |
Other companies | BERTIN | 9 | 8 | 0.195014 | 0.000028 |
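
For readers who want to reuse these settings, the table can be encoded as a lookup keyed by target entity and model. The following is a minimal sketch, not the authors' code: the names `HyperParams`, `BEST_HPARAMS`, and `best_hparams` are hypothetical, and only the numeric values come from Table 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """One row of Table 8 (hypothetical container, not from the source)."""
    epochs: int
    batch_size: int
    weight_decay: float
    learning_rate: float


# Values copied from Table 8; keys are (target entity, model).
BEST_HPARAMS = {
    ("MET", "BETO"): HyperParams(4, 16, 0.044847, 0.000034),
    ("MET", "ALBETO"): HyperParams(8, 8, 0.011343, 0.000022),
    ("MET", "DistilBETO"): HyperParams(4, 16, 0.073405, 0.000038),
    ("MET", "MarIA"): HyperParams(8, 16, 0.188325, 0.000017),
    ("MET", "BERTIN"): HyperParams(6, 8, 0.081163, 0.000017),
    ("Society", "BETO"): HyperParams(9, 16, 0.073405, 0.000038),
    ("Society", "ALBETO"): HyperParams(7, 8, 0.195014, 0.000028),
    ("Society", "DistilBETO"): HyperParams(10, 16, 0.073405, 0.000038),
    ("Society", "MarIA"): HyperParams(6, 16, 0.188325, 0.000017),
    ("Society", "BERTIN"): HyperParams(7, 16, 0.126252, 0.000012),
    ("Other companies", "BETO"): HyperParams(9, 16, 0.044847, 0.000034),
    ("Other companies", "ALBETO"): HyperParams(7, 8, 0.011343, 0.000022),
    ("Other companies", "DistilBETO"): HyperParams(7, 8, 0.011343, 0.000022),
    ("Other companies", "MarIA"): HyperParams(8, 16, 0.073405, 0.000038),
    ("Other companies", "BERTIN"): HyperParams(9, 8, 0.195014, 0.000028),
}


def best_hparams(target: str, model: str) -> HyperParams:
    """Return the Table 8 settings for a (target entity, model) pair."""
    return BEST_HPARAMS[(target, model)]


if __name__ == "__main__":
    hp = best_hparams("Society", "MarIA")
    print(hp.epochs, hp.batch_size, hp.weight_decay, hp.learning_rate)
```

The source does not say which training framework was used. If one assumes the Hugging Face `transformers` `Trainer`, these four fields would correspond to the `num_train_epochs`, `per_device_train_batch_size`, `weight_decay`, and `learning_rate` arguments of `TrainingArguments`.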