PeerJ Comput. Sci. 2023 May 9;9:e1377. doi: 10.7717/peerj-cs.1377

Table 14. Benchmark of the different LLMs with the whole corpus and the two splits evaluated for sentiment towards other companies.

For each model and dataset, the weighted precision (W-P), weighted recall (W-R), weighted F1 (W-F1) and macro F1 (M-F1) scores are reported.

Dataset    Model       W-P     W-R     W-F1    M-F1
Tweets     BETO        0.7161  0.7288  0.7031  0.6056
Tweets     ALBETO      0.6532  0.6806  0.6600  0.5448
Tweets     DistilBETO  0.6765  0.7001  0.6757  0.5730
Tweets     MarIA       0.7088  0.7158  0.7049  0.6144
Tweets     BERTIN      0.6184  0.6441  0.6271  0.5055
Headlines  BETO        0.6832  0.6728  0.6770  0.6035
Headlines  ALBETO      0.6541  0.6375  0.6438  0.5657
Headlines  DistilBETO  0.6453  0.6375  0.6409  0.5491
Headlines  MarIA       0.6858  0.6780  0.6809  0.6054
Headlines  BERTIN      0.6569  0.5698  0.5924  0.5109
Total      BETO        0.7384  0.7445  0.7400  0.6711
Total      ALBETO      0.7251  0.7327  0.7280  0.6503
Total      DistilBETO  0.7155  0.7223  0.7177  0.6352
Total      MarIA       0.7373  0.7445  0.7382  0.6655
Total      BERTIN      0.7020  0.6741  0.6816  0.6028