2008 Apr 11;9(Suppl 3):S2. doi: 10.1186/1471-2105-9-S3-S2

Table 7.

Dictionary lookup performance.

This table shows the speed and accuracy of dictionary lookup tasks using the human gene/protein dictionary and gene/protein name snippets. F-score is the harmonic mean of precision and recall. The values in parentheses are the similarity thresholds used in soft string matching.
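As a quick check of the F-score definition in the caption, the following minimal sketch recomputes the harmonic mean from precision and recall. The function name `f_score` is an illustrative choice, not something taken from the paper; the input values are copied from the heuristic and automatic normalization rows of the table.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs copied from two rows of Table 7.
heuristic = f_score(0.730, 0.657)
automatic = f_score(0.767, 0.633)

print(f"{heuristic:.3f} {automatic:.3f}")  # prints "0.692 0.694"
```

Because the table reports precision and recall rounded to three decimals, recomputing an F-score from those rounded values can differ from the printed F-score in the last digit for some rows.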

Method                         Precision  Recall  F-score  Average lookup time (μs)
Bigram similarity (0.97)       0.758      0.587   0.661    6.7 × 10^5
Bigram similarity (0.95)       0.691      0.592   0.638    6.8 × 10^5
Bigram similarity (0.93)       0.612      0.610   0.611    6.8 × 10^5
No normalization               0.809      0.502   0.619    7
Case normalization             0.782      0.582   0.666    8
Heuristic normalization [18]   0.730      0.657   0.692    8
Automatic normalization        0.767      0.633   0.694    29
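To illustrate what a thresholded bigram soft match might look like, here is a minimal sketch. It assumes the Dice coefficient over character bigrams as the similarity measure and a linear scan over the dictionary; neither is confirmed by this table, and the names `bigrams`, `dice_similarity`, and `soft_lookup` as well as the sample terms are hypothetical.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Multiset of overlapping character bigrams in text."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def dice_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (an assumed measure)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a == b else 0.0
    overlap = sum((grams_a & grams_b).values())
    return 2 * overlap / total


def soft_lookup(query: str, dictionary: list[str], threshold: float) -> list[str]:
    """Return every dictionary term whose similarity reaches the threshold."""
    return [term for term in dictionary if dice_similarity(query, term) >= threshold]


# Hypothetical dictionary entries, for illustration only.
terms = ["interleukin 2", "interleukin 12", "tumor necrosis factor"]
matches = soft_lookup("interleukin-2", terms, threshold=0.8)
```

A scan like this compares the query against every dictionary entry, which is one plausible reason a soft-matching lookup would take far longer than an exact lookup on a normalized key.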