Table 1.
Performance of the different combinations of Information Retrieval (IR) component and GO classifier on the micro-reading and macro-reading tasks, reported as top precision (P0) and recall at rank r
| Task | Benchmark | IR component | GO classifier | P0 | R at rank r |
|---|---|---|---|---|---|
| Micro-reading | GOA benchmark | N/A | EAGL | 0.23 | 0.17 |
| | | | GOCat | 0.48* (+109%) | 0.37* (+117%) |
| Macro-reading | CTD benchmark | PubMed | EAGL | 0.34 | 0.15 |
| | | | GOCat | 0.69* (+103%) | 0.33* (+120%) |
| | | GoPubMed | GoPubMed | 0.39 | 0.16 |
| | | Vectorial | EAGL | 0.33 | 0.14 |
| | | | GOCat | 0.66* (+100%) | 0.33* (+135%) |
| | UniProt benchmark | PubMed | EAGL | 0.33 | 0.45 |
| | | | GOCat | 0.58* (+76%) | 0.73* (+62%) |
| | | GoPubMed | GoPubMed | 0.22 | 0.21 |
| | | Vectorial | EAGL | 0.34 | 0.49 |
| | | | GOCat | 0.58* (+70%) | 0.75* (+53%) |
For recall at rank r, the cut-off r was set according to the average number of expected correct answers per benchmark: r = 5 for the GOA and UniProt benchmarks (2.8 and 1.3 expected answers, respectively) and r = 100 for the CTD benchmark (30 expected answers). For the GOCat classifier, performance improvements (+x%) are given relative to the EAGL classifier. Statistically significant improvements (P < 0.05) are marked with an ‘*’.
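As an illustration of how the two measures can be computed, the sketch below (a minimal Python example with names of our own choosing; it is not code from the evaluated systems) scores a single query: P0 checks whether the top-ranked GO term is a correct annotation, and recall at rank r is the fraction of the expected annotations retrieved within the first r predictions. Per-query values are then averaged over a benchmark to obtain the figures reported in Table 1.

```python
def top_precision_and_recall_at_r(ranked_go_terms, gold_terms, r):
    """Score one query.

    ranked_go_terms: list of predicted GO identifiers, best first.
    gold_terms: set of GO identifiers considered correct for the query
                (e.g. curated GOA, CTD or UniProt annotations).
    r: cut-off rank (5 for GOA and UniProt, 100 for CTD in Table 1).
    """
    # P0: 1 if the first returned GO term is a correct annotation, else 0;
    # averaging this value over all queries gives the "P0" column.
    p0 = 1.0 if ranked_go_terms and ranked_go_terms[0] in gold_terms else 0.0

    # Recall at rank r: share of the expected annotations that appear
    # anywhere within the first r returned GO terms.
    retrieved = set(ranked_go_terms[:r])
    recall_at_r = len(retrieved & gold_terms) / len(gold_terms) if gold_terms else 0.0

    return p0, recall_at_r
```

For the CTD benchmark, for instance, r would be set to 100 and the two per-query scores averaged over all queries.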