Table 1.
General | ||||
Size | 5,751,492 bp | |||
% G + C | 42.7% | |||
Protein-coding genes | 4524 | |||
RNA genes | 68 | |||
Pseudogenes | 117 | |||
Percent coding | 74% | |||
Average gene size | 936 bp (312 aa) | |||
Average intergenic space size | 340 bp | |||
Predicted protein-coding sequences | ||||
Identified by similarity to known sequences | 2226 (49%) | |||
Conserved hypothetical proteins | 908 (20%) | |||
Predicted proteins (no similarity to known sequences) | 1390 (31%) | |||
Multigene families | ||||
Number of multigene families | 539 | |||
Number of genes in multigene families | 2178 (48%) | |||
Domain comparison | Genes with top BLASTX hit to domain | Genes with BLASTX hits only to domain | ||
Archaea | 1852 (41%) | 759 (17%) | ||
Bacteria | 945 (21%) | 387 (9%) | ||
Eucarya | 98 (2%) | 47 (1%) | ||
Organism comparison | Genes w/ BLAST hits | Genes w/ top BLASTX hits | Genes w/ BBH hits^a | Conserved clusters^b |
Archaeoglobus fulgidus | 1618 (36%) | 549 (12%) | 967 (21%) | 93 (256 genes) |
Methanobacterium thermoautotrophicum | 1485 (33%) | 377 (8%) | 897 (20%) | 90 (295 genes) |
Methanococcus jannaschii | 1331 (29%) | 237 (5%) | 822 (18%) | 70 (194 genes) |
Pyrococcus abyssi | 1193 (26%) | 116 (3%) | 730 (16%) | 58 (151 genes) |
Pyrococcus horikoshii | 1141 (25%) | 96 (2%) | 690 (15%) | 47 (117 genes) |
Halobacterium sp. NRC-1 | 1067 (24%) | 125 (3%) | 731 (16%) | 65 (195 genes) |
^a Best bidirectional hit (BBH): defined between two species as a pair of proteins, each of which has its top BLASTP hit to the other when all proteins from both species are compared to one another.
^b A cluster in one species was defined as a set of genes in which each member is no further than 10 genes from at least one other gene in the set. A conserved cluster was defined as a cluster in M. acetivorans in which each member has a best bidirectional hit (BBH) with a member of a single cluster in a second species. Thus, a conserved cluster consists of a set of genes “close” in the M. acetivorans genome whose BBH genes are also “close” in the genome of the second species. The gene count in parentheses is the total number of genes in conserved clusters.
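
The two footnote definitions can be illustrated with a short Python sketch. This is one possible reading of the definitions, not the authors' pipeline: the function names, the input layout (a dictionary of each protein's top hit in the other species, and a dictionary of ordinal gene positions along each genome), and the grouping of BBH pairs by connected components are all assumptions made for illustration. "No further than 10 genes" is taken to mean a difference in gene index of at most 10, and wraparound on a circular chromosome is ignored.

```python
def best_bidirectional_hits(top_ab, top_ba):
    """Return (a, b) pairs where a's top hit is b and b's top hit is a.

    top_ab maps each protein of species A to its top hit in species B;
    top_ba is the reverse mapping (hypothetical input layout).
    """
    return [(a, b) for a, b in top_ab.items() if top_ba.get(b) == a]


def conserved_clusters(bbh_pairs, pos_a, pos_b, max_gap=10):
    """Group BBH pairs whose genes are close in BOTH genomes.

    Two pairs are linked when their genes lie within max_gap gene
    positions of each other in species A and also in species B.
    Connected groups of two or more pairs are reported, so every gene
    in a group has at least one neighbour within max_gap in each genome.
    """
    n = len(bbh_pairs)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        a1, b1 = bbh_pairs[i]
        for j in range(i + 1, n):
            a2, b2 = bbh_pairs[j]
            close_in_a = abs(pos_a[a1] - pos_a[a2]) <= max_gap
            close_in_b = abs(pos_b[b1] - pos_b[b2]) <= max_gap
            if close_in_a and close_in_b:
                parent[find(i)] = find(j)

    groups = {}
    for i, pair in enumerate(bbh_pairs):
        groups.setdefault(find(i), []).append(pair)
    return [g for g in groups.values() if len(g) > 1]


# Toy data (invented for illustration, not taken from the table).
top_ab = {"a1": "b1", "a2": "b2", "a3": "b9", "a4": "b4"}
top_ba = {"b1": "a1", "b2": "a2", "b9": "a7", "b4": "a4"}
pos_a = {"a1": 0, "a2": 3, "a3": 5, "a4": 40}
pos_b = {"b1": 100, "b2": 104, "b4": 106, "b9": 7}

bbh = best_bidirectional_hits(top_ab, top_ba)
clusters = conserved_clusters(bbh, pos_a, pos_b)
genes_in_clusters = sum(len(c) for c in clusters)
print(len(bbh), len(clusters), genes_in_clusters)
```

In the toy data, a3 is not part of a BBH because its top hit b9 points back to a different protein, and the pair (a4, b4) is excluded from the cluster because a4 is far from a1 and a2 in the first genome even though b4 is near b1 and b2 in the second. The pairwise comparison is quadratic in the number of BBH pairs, which is acceptable for a few thousand genes; a sorted sweep would be the natural refinement for larger inputs.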