2021 Dec 17;22(Suppl 1):598. doi: 10.1186/s12859-021-04141-4

Table 2.

Statistics for the concept annotation classes used in the training (67-document) and evaluation (30-document) data sets and for those added as additional training data for concept normalization for all ontologies

| Ontology | # training set annotation classes | Avg/median # training set annotation classes per article | # classes added to training set | # evaluation set annotation classes | Avg/median # evaluation set annotation classes per article |
|---|---|---|---|---|---|
| ChEBI | 1463 | 22/18 | 58,214 | 627 | 21/20 |
| ChEBI_EXT | 2852 | 43/38 | 58,439 | 1167 | 39/39 |
| CL | 581 | 9/7 | 2163 | 253 | 8/9 |
| CL_EXT | 651 | 10/8 | 2168 | 286 | 10/10 |
| GO_BP | 1586 | 24/21 | 29,213 | 682 | 23/23 |
| GO_BP_EXT | 2511 | 37/33 | 29,301 | 1090 | 36/37 |
| GO_CC | 677 | 10/9 | 4052 | 212 | 7/6 |
| GO_CC_EXT | 896 | 13/12 | 4086 | 296 | 10/9 |
| GO_MF | 49 | 1/1 | 10,951 | 19 | 1/1 |
| GO_MF_EXT | 738 | 11/11 | 10,031 | 377 | 13/12 |
| MOP | 85 | 1/1 | 3574 | 32 | 1/1 |
| MOP_EXT | 108 | 2/1 | 3578 | 40 | 1/1 |
| NCBITaxon | 690 | 10/9 | 1,175,661 | 315 | 11/9 |
| NCBITaxon_EXT | 757 | 11/10 | 1,175,682 | 346 | 12/10 |
| PR | 1278 | 19/18 | 213,371 | 466 | 16/16 |
| PR_EXT | 1534 | 23/22 | 213,531 | 588 | 20/19 |
| SO | 1216 | 18/18 | 2256 | 544 | 18/19 |
| SO_EXT | 3172 | 47/47 | 2405 | 1409 | 47/48 |
| UBERON | 2048 | 31/24 | 14,057 | 1040 | 35/31 |
| UBERON_EXT | 2409 | 36/29 | 14,113 | 1217 | 41/38 |

Avg: average