2021 Dec 17;22(Suppl 1):598. doi: 10.1186/s12859-021-04141-4

Table 8.

Span detection F1 score results for all algorithms tested against the core evaluation annotation set of the 30 held-out articles

| Ontology | CRF | BiLSTM | BiLSTM-CRF | Char-Embeddings | BiLSTM-ELMo | BioBERT |
|---|---|---|---|---|---|---|
| ChEBI | 0.7234 | 0.6545 | 0.5000 | 0.5280 | 0.0620 | 0.9091* |
| CL | 0.8333 | 0.5882 | 0.3774 | 0.8000 | 0.0000 | 0.9231* |
| GO_BP | 0.8677* | 0.5498 | 0.3661 | 0.6346 | 0.0685 | 0.8646 |
| GO_CC | 0.9412 | 0.1379 | 0.2689 | 0.2581 | 0.1000 | 0.9444* |
| GO_MF | 0.9999* | 0.9999* | 0.9999* | 0.8421 | 0.0000 | 0.9999* |
| MOP | 0.9999* | 0.9999* | 0.9999* | 0.9999* | 0.0000 | 0.9999* |
| NCBITaxon | 0.9959* | 0.8551 | 0.9440 | 0.9569 | 0.0711 | 0.9453 |
| PR | 0.4351 | 0.2979 | 0.2151 | 0.0995 | 0.0339 | 0.8199* |
| SO | 0.9435* | 0.4935 | 0.4897 | 0.8203 | 0.1059 | 0.9081 |
| UBERON | 0.7913 | 0.7206 | 0.4758 | 0.7440 | 0.0854 | 0.8826* |

The best-performing algorithm for each ontology is marked with an asterisk (*); where several algorithms tie for the best score, each is marked.
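
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The following is a minimal sketch of how a span-level F1 score could be computed, assuming exact-match comparison of (start, end) character offsets. The matching criterion, the function name `span_f1`, and the example spans are illustrative assumptions; they are not taken from the article and do not reproduce its evaluation code.

```python
def span_f1(gold, predicted):
    """Return F1 for predicted spans against gold spans.

    Assumption: a predicted span counts as correct only if its
    (start, end) offsets exactly match a gold span.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: two of four predictions match two of three gold spans.
gold = [(0, 5), (10, 18), (25, 31)]
predicted = [(0, 5), (10, 17), (25, 31), (40, 44)]
score = span_f1(gold, predicted)
print(round(score, 4))
```

Under exact matching, a prediction that is off by a single character, such as (10, 17) above, counts as both a false positive and a missed gold span, so the score depends heavily on the chosen matching criterion.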