Table 1.
Taxon ID | Organism | Number of genes with GO annotation | Number of gene sets, all evidence codes (average number of genes in set) | Number of gene sets, high quality evidence codes (average number of genes in set) |
---|---|---|---|---|
234826 | Anaplasma marginale str. St. Maries | 196 | 48 (40) | |
212042 | Anaplasma phagocytophilum str. HZ | 1288 | 218 (55) | 221 (60) |
3702 | Arabidopsis thaliana | 27942 | 2032 (129) | 1951 (85) |
227321 | Aspergillus nidulans FGSC A4 | 7326 | 1152 (69) | 35 (31) |
198094 | Bacillus anthracis str. Ames | 5097 | 465 (81) | 466 (81) |
9913 | Bos taurus | 5567 | 2634 (67) | 1285 (58) |
6239 | Caenorhabditis elegans | 12642 | 1505 (84) | 1098 (81) |
195099 | Campylobacter jejuni RM1221 | 1826 | 315 (62) | 316 (63) |
246194 | Carboxydothermus hydrogenoformans Z-2901 | 2609 | 363 (64) | 362 (65) |
227377 | Coxiella burnetii RSA 493 | 1798 | 271 (67) | 272 (67) |
214684 | Cryptococcus neoformans var. neoformans JEC21 | 3427 | 969 (68) | |
7955 | Danio rerio | 16957 | 2201 (83) | 1342 (68) |
243164 | Dehalococcoides ethenogenes 195 | 1583 | 265 (72) | 265 (71) |
352472 | Dictyostelium discoideum AX4 | 7694 | 1184 (86) | 801 (72) |
7227 | Drosophila melanogaster | 12560 | 2750 (83) | 2459 (78) |
205920 | Ehrlichia chaffeensis str. Arkansas | 1090 | 221 (56) | 223 (59) |
511145 | Escherichia coli str. K-12 substr. MG1655 | 2518 | 198 (112) | |
9031 | Gallus gallus | 2104 | 1460 (64) | 643 (52) |
243231 | Geobacter sulfurreducens PCA | 3269 | 347 (82) | 348 (82) |
9606 | Homo sapiens | 18106 | 5808 (82) | 4403 (81) |
265669 | Listeria monocytogenes serotype 4b str. F2365 | 2811 | 384 (79) | 385 (79) |
243233 | Methylococcus capsulatus str. Bath | 2902 | 377 (72) | 378 (72) |
10090 | Mus musculus | 24667 | 5615 (79) | 3643 (74) |
222891 | Neorickettsia sennetsu str. Miyayama | 928 | 204 (54) | 206 (56) |
39947 | Oryza sativa Japonica Group | 4266 | 30 (18) | 2 (14) |
36329 | Plasmodium falciparum 3D7 | 1770 | 212 (65) | 219 (67) |
223283 | Pseudomonas syringae pv. tomato str. DC3000 | 3950 | 436 (73) | 439 (77) |
10116 | Rattus norvegicus | 18599 | 5746 (79) | 3081 (75) |
246200 | Ruegeria pomeroyi DSS-3 | 4250 | 497 (85) | 496 (86) |
559292 | Saccharomyces cerevisiae S288c | 6244 | 2005 (75) | 1849 (74) |
284812 | Schizosaccharomyces pombe 972 h- | 5276 | 1627 (82) | 1118 (67) |
211586 | Shewanella oneidensis MR-1 | 4272 | 418 (79) | 419 (79) |
999953 | Trypanosoma brucei brucei strain 927/4 GUTat10.1 | 1073 | 157 (74) | 147 (80) |
9606 | Homo sapiens (MSigDB collection) | 18106 | | 1422 (69)² |
9606 | Homo sapiens (From Affymetrix annotation file) | 18106 | 5383 (80) | |
Gene sets were built from the NCBI gene2go annotation table and GO ontology downloaded on 13th September 2013. Default settings were used, which filter out gene sets containing fewer than 10 or more than 700 genes. Organisms were omitted when their biggest collection contained fewer than 30 sets. In cases where using all evidence codes yields fewer gene sets than using high quality codes only, this is due to maximum set size filtering: the additional annotations push some sets above the 700-gene limit.
¹ For comparison, the currently available MSigDB GO-based human collection and a human set built from the annotation file for the Affymetrix HG-U133 Plus 2.0 array are also shown.
² Set number and sizes were calculated for the MSigDB collection with filtering as above (the full collection contains 1454 gene sets).
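The size filtering described in the table note can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the function names, the dictionary-of-sets representation, and the toy GO term and gene identifiers are all hypothetical, and only the thresholds (10, 700 and 30) come from the note. It shows how adding lower-quality annotations can reduce the number of retained sets, because a set that grows past the maximum size is dropped.

```python
from typing import Dict, Iterable, Set

MIN_SET_SIZE = 10         # from the table note: sets with fewer genes are dropped
MAX_SET_SIZE = 700        # from the table note: sets with more genes are dropped
MIN_COLLECTION_SIZE = 30  # from the table note: organisms below this are omitted

GeneSets = Dict[str, Set[str]]  # hypothetical layout: GO term ID -> gene IDs


def filter_gene_sets(gene_sets: GeneSets,
                     min_size: int = MIN_SET_SIZE,
                     max_size: int = MAX_SET_SIZE) -> GeneSets:
    """Keep only the sets whose size lies within [min_size, max_size]."""
    return {term: genes for term, genes in gene_sets.items()
            if min_size <= len(genes) <= max_size}


def keep_organism(collections: Iterable[GeneSets],
                  min_sets: int = MIN_COLLECTION_SIZE) -> bool:
    """Keep an organism if its biggest filtered collection has enough sets."""
    return max((len(c) for c in collections), default=0) >= min_sets


# Toy data (hypothetical IDs): the first term has 695 genes with high quality
# evidence only, and 705 genes once all evidence codes are included.
high_quality = {
    "GO:0000001": {f"gene{i}" for i in range(695)},
    "GO:0000002": {f"gene{i}" for i in range(12)},
}
all_evidence = {
    "GO:0000001": {f"gene{i}" for i in range(705)},
    "GO:0000002": {f"gene{i}" for i in range(15)},
}

hq_kept = filter_gene_sets(high_quality)
all_kept = filter_gene_sets(all_evidence)
print(len(hq_kept), len(all_kept))
```

In this toy case the high quality collection retains both sets, while the all-evidence collection loses the first set to the maximum size filter, which mirrors the rows in the table where the all-evidence count is slightly lower than the high quality count.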