Table 2.
Statistics for the training datasets built using only annotations with manual and experimental evidence codes, and for the training datasets built using annotations with all evidence codes.
| GO category | Annotation count | # of available levels (manual/experimental only) | # of GO terms (manual/experimental only) | # of annotations (manual/experimental only) | # of available levels (all evidence codes) | # of GO terms (all evidence codes) | # of annotations (all evidence codes) |
|---|---|---|---|---|---|---|---|
| Molecular Function | ≥30 | 9 | 838 | 281 125 | 11 | 2 776 | 6 451 530 |
| Molecular Function | ≥100 | 9 | 605 | 272 235 | 10 | 1 598 | 6 386 105 |
| Molecular Function | ≥200 | 9 | 395 | 257 404 | 10 | 1 174 | 6 326 109 |
| Molecular Function | ≥300 | 8 | 226 | 233 476 | 9 | 942 | 6 269 643 |
| Molecular Function | ≥400 | 8 | 165 | 218 591 | 9 | 809 | 6 223 762 |
| Molecular Function | ≥500 | 8 | 142 | 210 790 | 9 | 698 | 6 173 867 |
| Biological Process | ≥30 | 10 | 4 215 | 1 433 220 | 12 | 8 404 | 16 537 812 |
| Biological Process | ≥100 | 10 | 2 993 | 1 386 588 | 12 | 4 768 | 16 335 538 |
| Biological Process | ≥200 | 9 | 1 782 | 1 302 577 | 11 | 3 299 | 16 129 271 |
| Biological Process | ≥300 | 9 | 1 059 | 1 199 604 | 10 | 2 631 | 15 965 583 |
| Biological Process | ≥400 | 8 | 743 | 1 123 037 | 9 | 2 233 | 15 828 012 |
| Biological Process | ≥500 | 8 | 603 | 1 075 353 | 9 | 1 978 | 15 713 431 |
| Cellular Component | ≥30 | 7 | 606 | 340 995 | 8 | 1 268 | 4 167 000 |
| Cellular Component | ≥100 | 6 | 460 | 335 445 | 8 | 750 | 4 138 327 |
| Cellular Component | ≥200 | 6 | 324 | 325 687 | 7 | 549 | 4 110 383 |
| Cellular Component | ≥300 | 6 | 206 | 309 390 | 6 | 442 | 4 083 834 |
| Cellular Component | ≥400 | 6 | 155 | 296 929 | 6 | 377 | 4 061 654 |
| Cellular Component | ≥500 | 5 | 118 | 283 616 | 6 | 335 | 4 043 150 |