Table 7.
Method | Precision (exact) | Recall (exact) | F1 (exact) | Precision (hierarchy) | Recall (hierarchy) | F1 (hierarchy) |
---|---|---|---|---|---|---|
Indri (baseline) | 1% | 3% | 1.5% | 9.9% | 33.1% | 15.2% |
Indri + definition | 0.8% | 3% | 1.3% | 8.5% | 34.7% | 13.7% |
Cosine | 2.4% | 7.6% | 3.6% | 7.2% | 40.6% | 12.2% |
GORank | 5.9% | 14.3% | 8.4% | 13.5% | 31.8% | 19% |
GORank + hierarchy | 10.6% | 10.6% | 10.6% (+606.7%) | 21.6% | 21.2% | 21.4% |
Cosine + frequency | 4.6% | 9.8% | 6.2% | 15.1% | 28.4% | 19.7% |
GORank + frequency | 5.5% | 10.7% | 7.3% | 17.4% | 27.5% | 21.3% |
GORank + frequency + hierarchy (Run 3) | 9.5% | 6.7% | 7.8% | 27.8% | 16.1% | 20.4% |
GORank + frequency + hierarchy (Run 1) | 5.2% | 11.2% | 7.1% | 17% | 32% | 22.2% (+46%) |
GORank + frequency + hierarchy (Run 2) | 4.9% | 14.3% | 7.3% | 12.7% | 36.8% | 18.8% |
‘Indri’ is a language model-based retrieval method (23). ‘Definition’ means that the definitions of GO terms are appended to expand their text representation. ‘Cosine’ is the similarity function in the first part of Formula (2). ‘Frequency’ means that the GO vocabulary is limited to the high-frequency GO terms (Table 3). ‘Hierarchy’ means filtering based on high-level GO classes.
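The F1 columns are consistent with the usual harmonic mean of precision and recall, and the two parenthesized percentages appear to be relative gains over the Indri baseline F1 in the same column. That reading is an assumption inferred from the numbers, not something the table states. The short sketch below recomputes a few entries under that assumption; the function names `f1` and `relative_gain` are illustrative and do not come from the paper.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(score: float, baseline: float) -> float:
    """Relative improvement of `score` over `baseline`, in percent.

    Assumption: the parenthesized values in the table are computed
    this way against the Indri baseline row.
    """
    return 100 * (score - baseline) / baseline


# Values copied from the table (percent).
baseline_f1_exact = 1.5       # Indri (baseline), F1 (exact)
baseline_f1_hier = 15.2       # Indri (baseline), F1 (hierarchy)

# GORank + hierarchy, exact match: P = R = 10.6
best_exact = f1(10.6, 10.6)
print(round(best_exact, 1))                                  # 10.6
print(round(relative_gain(10.6, baseline_f1_exact), 1))      # 606.7

# GORank + frequency + hierarchy (Run 1), hierarchy match: P = 17, R = 32
run1_hier = f1(17.0, 32.0)
print(round(run1_hier, 1))                                   # 22.2
print(round(relative_gain(22.2, baseline_f1_hier)))          # 46
```

The gains are computed from the rounded F1 values as printed in the table, which is why they reproduce the reported +606.7% and +46%.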