Table 1. Representation of genomes in the COGs^a
Species | Total no. of encoded proteins | No. of proteins assigned to COGs | Proteins in COGs (%) |
Archaea | |||
Archaeoglobus fulgidus | 2420 | 1817 | 75 |
Methanococcus jannaschii | 1786 | 1301 | 74 |
Methanobacterium thermoautotrophicum | 1873 | 1365 | 73 |
Pyrococcus abyssi | 1767 | 1430 | 81 |
Pyrococcus horikoshii^b | 2080 | 1353 | 66 |
Aeropyrum pernix^b | 2722 | 1157 | 43 |
Bacteria | |||
Aquifex aeolicus | 1560 | 1312 | 84 |
Bacillus subtilis | 4118 | 2767 | 67 |
Borrelia burgdorferi^c | 1637 | 693 | 43 |
Campylobacter jejuni | 1634 | 1282 | 78 |
Chlamydia trachomatis | 895 | 630 | 71 |
Chlamydia pneumoniae | 1053 | 646 | 62 |
Deinococcus radiodurans | 3194 | 2133 | 67 |
Escherichia coli | 4285 | 3308 | 77 |
Haemophilus influenzae | 1695 | 1497 | 88 |
Helicobacter pylori | 1578 | 1070 | 68 |
Mycobacterium tuberculosis | 3924 | 2456 | 63 |
Mycoplasma genitalium | 471 | 374 | 79 |
Mycoplasma pneumoniae | 680 | 419 | 62 |
Neisseria meningitidis | 2081 | 1446 | 70 |
Pseudomonas aeruginosa | 5567 | 4166 | 75 |
Rickettsia prowazekii | 836 | 673 | 81 |
Synechocystis sp. | 3168 | 2048 | 65 |
Thermotoga maritima | 1858 | 1497 | 81 |
Treponema pallidum | 1036 | 705 | 68 |
Vibrio cholerae | 3828 | 2715 | 71 |
Ureaplasma urealyticum | 613 | 398 | 64 |
Xylella fastidiosa | 2766 | 1481 | 54 |
Eukaryotes | |||
Saccharomyces cerevisiae | 5964 | 2158 | 36 |
Total | 68571 | 45350 | 66 |
^a Newly added genomes are underlined.
^b The low fraction of proteins assigned to COGs is probably due to over-prediction of protein-coding genes in the original genome annotation (see text and Table 2).
^c The low fraction of proteins assigned to COGs is due to the fact that part of the genome consists of multiple plasmids that code for poorly conserved proteins.
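The last column of Table 1 can be read as the number of proteins assigned to COGs divided by the total number of encoded proteins, expressed as a whole percentage. The sketch below illustrates that arithmetic for three rows copied from the table. The function name `percent_in_cogs` and the use of ordinary rounding are assumptions made for illustration, not the authors' stated procedure; a few published values differ from this calculation by one point.

```python
def percent_in_cogs(assigned: int, total: int) -> int:
    """Share of proteins assigned to COGs, rounded to a whole percent.

    Hypothetical helper: the rounding rule is an assumption.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * assigned / total)


# (total encoded proteins, proteins assigned to COGs), copied from Table 1
rows = {
    "Archaeoglobus fulgidus": (2420, 1817),
    "Escherichia coli": (4285, 3308),
    "Total": (68571, 45350),
}

for name, (total, assigned) in rows.items():
    print(f"{name}: {percent_in_cogs(assigned, total)}%")
```

For these three rows the computed values agree with the table (75, 77 and 66).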