Table 1.
Tagging | Boundary | R | P | F |
(a) POS/PROTEIN tagging | Full | 52.91 | 43.85 | 47.96 |
| Left | 61.48 | 50.95 | 55.72 |
| Right | 61.38 | 50.87 | 55.63 |
Sequential Labelling | Boundary | R | P | F |
(b) Word feature | Full | 63.23 | 70.39 | 66.62 |
| Left | 68.15 | 75.86 | 71.80 |
| Right | 69.88 | 77.79 | 73.63 |
(c) (b) + orthographic feature | Full | 77.17 | 67.52 | 72.02 |
| Left | 82.51 | 72.20 | 77.01 |
| Right | 84.29 | 73.75 | 78.67 |
(d) (c) + POS feature | Full | 76.46 | 68.41 | 72.21 |
| Left | 81.94 | 73.32 | 77.39 |
| Right | 83.54 | 74.75 | 78.90 |
(e) (d) + PROTEIN feature | Full | 77.58 | 69.18 | 73.14 |
| Left | 82.69 | 73.74 | 77.96 |
| Right | 84.37 | 75.24 | 79.54 |
(f) (e) after adding protein names in the training set to the lexicon | Full | 79.85 | 68.58 | 73.78 |
| Left | 84.82 | 72.85 | 78.38 |
| Right | 86.60 | 74.37 | 80.02 |
Protein name recognition performance of the proposed method, evaluated by recall (R), precision (P), and F-measure (F). Performance was measured for recognition of the left boundary (Left), the right boundary (Right), and both boundaries (Full). (a) Performance of POS/PROTEIN tagging. (b) Performance of sequential labelling using the word feature only. (c) Performance of sequential labelling using the word and orthographic features. (d) Performance of sequential labelling using the word, orthographic, and POS features. (e) Performance of sequential labelling using the word, orthographic, POS, and PROTEIN name features. (f) Performance of sequential labelling with the features used in (e) after adding the protein names appearing in the training set to the lexicon. NB: no retraining was conducted.
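
As a reading aid, the relation between the three reported measures can be sketched as follows. This assumes that F is the balanced F-measure, i.e. the harmonic mean of recall and precision; the caption does not state the weighting, but the tabulated values agree with this definition to within rounding. The function name `f_measure` is chosen here for illustration only.

```python
def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure (harmonic mean) of recall and precision.

    Assumption: equal weighting of R and P; inputs and output are percentages.
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when nothing is recognised
    return 2 * recall * precision / (recall + precision)


# Row (a), Full boundary, from Table 1: R = 52.91, P = 43.85
print(round(f_measure(52.91, 43.85), 2))  # 47.96
```

Because R and P are themselves rounded to two decimals in the table, a recomputed F can differ from the printed value in the last digit for some rows.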