Table 4.
Average mention length of the false positives (FPs) and false negatives (FNs), and F1-scores on single-token and multi-token entities, for the baselines (B), generic CRF methods (C), and generic LSTM-CRF methods (L), per entity type
| Entity type | Method | F1-score (%), Single-Token | F1-score (%), Multi-Token | Mention length, FP | Mention length, FN |
|---|---|---|---|---|---|
| Chemicals | L | **84.02** | **79.90** | 19.07 | 20.69 |
| | C | 82.53 | 78.07 | 25.1 | 20.11 |
| | B | 76.77 | 76.84 | 19.13 | 16.49 |
| Diseases | L | **86.13** | **76.54** | 14.51 | 15.58 |
| | C | 84.20 | 74.92 | 15.09 | 14.66 |
| | B | 80.42 | 73.80 | 13.90 | 14.36 |
| Species | L | **86.11** | 79.10 | 10.68 | 12.34 |
| | C | 85.57 | 76.75 | 11.55 | 11.52 |
| | B | 67.72 | **79.80** | 8.00 | 9.15 |
| Genes/Proteins | L | **86.49** | **72.64** | 12.85 | 14.13 |
| | C | 84.77 | 69.32 | 13.50 | 13.52 |
| | B | 83.21 | 70.99 | 11.73 | 11.98 |
| Cell Lines | L | **72.94** | **64.93** | 20.02 | 15.65 |
| | C | 64.77 | 62.52 | 19.01 | 15.24 |
| | B | 62.55 | 64.24 | 18.41 | 13.91 |
The highest F1-scores are emphasized in bold.