Table 9. Benchmark of the different LLMs on the whole corpus and on the two splits, evaluated for sentiment towards the MET.
Dataset | Model | W-P | W-R | W-F1 | M-F1 |
---|---|---|---|---|---|
Tweet | BETO | 0.8227 | 0.8266 | 0.8193 | 0.6284 |
Tweet | ALBETO | 0.7813 | 0.8110 | 0.7950 | 0.5386 |
Tweet | DistilBETO | 0.7652 | 0.7992 | 0.7818 | 0.5268 |
Tweet | MarIA | 0.8200 | 0.8292 | 0.8238 | 0.6371 |
Tweet | BERTIN | 0.7884 | 0.8120 | 0.7999 | 0.5409 |
Headlines | BETO | 0.8336 | 0.8409 | 0.8341 | 0.6596 |
Headlines | ALBETO | 0.8034 | 0.8031 | 0.8019 | 0.6629 |
Headlines | DistilBETO | 0.7945 | 0.8018 | 0.7967 | 0.6214 |
Headlines | MarIA | 0.8380 | 0.8370 | 0.8375 | 0.6949 |
Headlines | BERTIN | 0.8006 | 0.8031 | 0.8018 | 0.6564 |
Total | BETO | 0.8428 | 0.8422 | 0.8424 | 0.7259 |
Total | ALBETO | 0.8428 | 0.8435 | 0.8428 | 0.7336 |
Total | DistilBETO | 0.8068 | 0.8136 | 0.8089 | 0.6599 |
Total | MarIA | 0.8580 | 0.8605 | 0.8588 | 0.7597 |
Total | BERTIN | 0.8270 | 0.8201 | 0.8229 | 0.6743 |
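
The column headers are not expanded in the table. Assuming that W-P, W-R and W-F1 denote support-weighted precision, recall and F1, and that M-F1 denotes macro-averaged F1, the following minimal sketch shows how such averages can be computed from per-class scores. The function names and the toy labels are hypothetical and are not taken from the evaluated corpus; the sketch only illustrates why a macro F1 can sit well below a weighted F1 when a minority class is predicted poorly.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # support of each class
    pred_count = Counter(y_pred)   # how often each class was predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # true positives

    scores = {}
    for label in labels:
        # Undefined ratios (no predictions or no true samples) are set to 0.0.
        precision = hits[label] / pred_count[label] if pred_count[label] else 0.0
        recall = hits[label] / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, true_count[label])
    return scores


def averaged_scores(y_true, y_pred):
    """Weighted P/R/F1 (weights = class support) and macro F1 (plain mean)."""
    scores = per_class_scores(y_true, y_pred).values()
    total = sum(support for _, _, _, support in scores)
    return {
        "W-P": sum(p * s for p, _, _, s in scores) / total,
        "W-R": sum(r * s for _, r, _, s in scores) / total,
        "W-F1": sum(f * s for _, _, f, s in scores) / total,
        "M-F1": sum(f for _, _, f, _ in scores) / len(scores),
    }


# Toy, imbalanced example (hypothetical labels, not the paper's data):
# the single "neu" sample is misclassified as "pos".
y_true = ["pos", "pos", "pos", "pos", "neg", "neu"]
y_pred = ["pos", "pos", "pos", "pos", "neg", "pos"]
result = averaged_scores(y_true, y_pred)
print({name: round(value, 4) for name, value in result.items()})
```

On this toy input the weighted F1 is about 0.76 while the macro F1 is about 0.63, because the macro average gives the missed minority class the same weight as the majority class. Under the assumed definitions, a similar gap between W-F1 and M-F1 in the table would be consistent with uneven per-class performance, although the table alone does not confirm this.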