Table 2. Recognition Performance Over the Training Corpus of the BioCreAtIvE II GM Corpus, N = 18,265
| Software | Notes | Precision | Recall | F-Measure |
|---|---|---|---|---|
| BioThesaurus | With all mapping | 0.2253 | 0.8654 | 0.3576 |
| | With all mapping + false-positive list | 0.5000 | 0.8541 | 0.6308 |
| | Above w/longest first mapping | 0.6100 | 0.8378 | 0.7059 |
| ABNER | First-order CRF model | 0.8324 | 0.7246 | 0.7753 |
| | With post-processing module | 0.8361 | 0.7493 | 0.7901 |
| LingPipe | CharLmRescoring with 36-gram | 0.7637 | 0.8204 | 0.7910 |
| | With post-processing module | 0.7661 | 0.8364 | 0.7997 |
| MEMM (MALLET) | Second-order MEMM | 0.8432 | 0.8044 | 0.8233 |
| | With post-processing module | 0.8412 | 0.8175 | 0.8291 |
| CRF (MALLET) | Without BioThesaurus | 0.8621 | 0.7765 | 0.8170 |
| | Without post-processing | 0.8718 | 0.8133 | 0.8415 |
| | Without POS | 0.8717 | 0.8138 | 0.8417 |
| | Without UMLS | 0.8660 | 0.8187 | 0.8417 |
| | Without false-positive list | 0.8772 | 0.8109 | 0.8428 |
| | With longest first mapping | 0.8673 | 0.8212 | 0.8436 |
| | The best configuration | 0.8714 | 0.8261 | 0.8481 |
| BioTagger-GM | Combination of four systems | 0.8658 | 0.8717 | 0.8687 |
GM = gene mention; MEMM = maximum entropy Markov model; CRF = conditional random field.
Reported numbers are averages of performance measures in 5×2-fold cross-validation tests.
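
For readers who want to check the table's arithmetic, the following is a minimal sketch of how such figures could be produced. It makes two assumptions: the F-measure is the balanced harmonic mean of precision and recall, and each measure is averaged separately over the ten folds of a 5×2-fold cross-validation. The function names (`f_measure`, `five_by_two_splits`, `average_scores`) are hypothetical and are not taken from any of the systems listed above.

```python
import random
from statistics import mean


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def five_by_two_splits(n_items, seed=0):
    """Yield 10 (train, test) index pairs: 5 random halvings, each half used once as test."""
    rng = random.Random(seed)
    for _ in range(5):
        indices = list(range(n_items))
        rng.shuffle(indices)
        half = n_items // 2
        first, second = indices[:half], indices[half:]
        yield first, second
        yield second, first


def average_scores(fold_scores):
    """Average per-fold (precision, recall, F) tuples measure by measure."""
    return tuple(mean(column) for column in zip(*fold_scores))


# Check against the BioTagger-GM row of the table (P = 0.8658, R = 0.8717).
print(round(f_measure(0.8658, 0.8717), 4))
```

Because the averaging is done per measure, the reported F-measure need not exactly equal the F-measure computed from the averaged precision and recall, so small differences in the last digit would not be surprising.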