Table 2. Recognition Performance Over the Training Corpus of the BioCreAtIvE II GM Corpus, N = 18,265
| Software | Notes | Precision | Recall | F-Measure |
|---|---|---|---|---|
| BioThesaurus | With all mapping | 0.2253 | 0.8654 | 0.3576 |
| | With all mapping + false-positive list | 0.5000 | 0.8541 | 0.6308 |
| | Above w/longest first mapping | 0.6100 | 0.8378 | 0.7059 |
| ABNER | First-order CRF model | 0.8324 | 0.7246 | 0.7753 |
| | With post-processing module | 0.8361 | 0.7493 | 0.7901 |
| LingPipe | CharLmRescoring with 36-gram | 0.7637 | 0.8204 | 0.7910 |
| | With post-processing module | 0.7661 | 0.8364 | 0.7997 |
| MEMM (MALLET) | Second-order MEMM | 0.8432 | 0.8044 | 0.8233 |
| | With post-processing module | 0.8412 | 0.8175 | 0.8291 |
| CRF (MALLET) | Without BioThesaurus | 0.8621 | 0.7765 | 0.8170 |
| | Without post-processing | 0.8718 | 0.8133 | 0.8415 |
| | Without POS | 0.8717 | 0.8138 | 0.8417 |
| | Without UMLS | 0.8660 | 0.8187 | 0.8417 |
| | Without false-positive list | 0.8772 | 0.8109 | 0.8428 |
| | With longest first mapping | 0.8673 | 0.8212 | 0.8436 |
| | The best configuration | 0.8714 | 0.8261 | 0.8481 |
| BioTagger-GM | Combination of four systems | 0.8658 | 0.8717 | 0.8687 |
GM = gene mention; MEMM = maximum entropy Markov model; CRF = conditional random field.
Reported numbers are averages of performance measures in 5×2-fold cross-validation tests.
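
As a reading aid, the relationship between the three reported measures can be sketched as follows. This assumes the F-measure in Table 2 is the balanced F-measure (the harmonic mean of precision and recall), which the table itself does not state; the function name `f_measure` is illustrative and not taken from the source. Because the table reports averages over cross-validation runs, an F-measure recomputed from the averaged precision and recall may differ from the reported value in the last decimal place.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: Table 2 uses this balanced form; the table does not say so.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is recognized
    return 2 * precision * recall / (precision + recall)


# Recompute F from the averaged precision and recall of two rows in Table 2.
biothesaurus_f = f_measure(0.2253, 0.8654)  # BioThesaurus, with all mapping
biotagger_f = f_measure(0.8658, 0.8717)     # BioTagger-GM, four systems combined

print(f"BioThesaurus: {biothesaurus_f:.4f}")
print(f"BioTagger-GM: {biotagger_f:.4f}")
```

Both recomputed values agree with the reported F-measures to within 0.001, which is consistent with the balanced-F assumption.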