Author manuscript; available in PMC: 2012 Sep 15.
Published in final edited form as: Cell Host Microbe. 2011 Sep 15;10(3):260–272. doi: 10.1016/j.chom.2011.08.005

Table 2. Top organisms functionally similar to SFB based on shared gene families and metabolic modules.

1,191 microbial reference genomes were ranked by how many of the following they share with SFB: specific orthologous gene families (MBGD; Uchiyama et al., 2010), general gene families (KO, from the KEGG Orthology; Kanehisa et al., 2010), or metabolic modules (MO; small KEGG pathways of ~5-20 genes). Genera appearing at least twice among the 20 most similar organisms are shown, using the Tversky similarity index with α=0.25 (emphasizing organisms with few pathways not carried by SFB) and with α=0.75 (emphasizing organisms missing few pathways carried by SFB). Percentages in parentheses give the fraction of the top 20 genomes that fall within the respective genus. In addition, the catalogs were split into core (present in at least 75% of available genomes) and variable (present in 5-25% of available genomes) subsets, and the reference genomes most similar to SFB within each subset are also shown. SFB carries core subsets similar to those of both Firmicutes and minimal organisms, but its variable subsets are most similar to those of Clostridia and Streptococci.
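The caption names the Tversky index and the core/variable split but does not write either out. The following is a minimal sketch, not the authors' code, under several assumptions: the standard Tversky form with β = 1 − α, SFB as the reference set (so that α weights families SFB carries and the other genome lacks), and inclusive prevalence bounds. The function names `tversky` and `split_catalog` and the toy family labels are hypothetical.

```python
from typing import Dict, Set, Tuple


def tversky(sfb: Set[str], other: Set[str], alpha: float) -> float:
    """Tversky similarity of `other` to `sfb`, assuming beta = 1 - alpha."""
    shared = len(sfb & other)
    only_sfb = len(sfb - other)    # carried by SFB, missing from the other genome
    only_other = len(other - sfb)  # carried by the other genome, not by SFB
    denom = shared + alpha * only_sfb + (1.0 - alpha) * only_other
    return shared / denom if denom else 0.0


def split_catalog(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, variable) families by prevalence across all genomes.

    Assumed bounds: core >= 75% of genomes, variable between 5% and 25%
    (both inclusive).
    """
    n = len(genomes)
    counts: Dict[str, int] = {}
    for families in genomes.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1
    core = {f for f, c in counts.items() if c / n >= 0.75}
    variable = {f for f, c in counts.items() if 0.05 <= c / n <= 0.25}
    return core, variable


# Toy data (hypothetical): a reduced genome that is a subset of SFB,
# and a larger genome that is a superset of SFB.
sfb = {"a", "b", "c", "d"}
reduced = {"a", "b"}
expanded = {"a", "b", "c", "d", "e", "f", "g", "h"}

for alpha in (0.25, 0.75):
    print(alpha, tversky(sfb, reduced, alpha), tversky(sfb, expanded, alpha))
```

Under these assumptions, α = 0.25 ranks the reduced genome above the expanded one (it has no families absent from SFB), while α = 0.75 reverses the order (the expanded genome is missing none of SFB's families). This matches the two emphases described in the caption.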

Columns: Genus; Total genomes in genus; MBGD (α = 0.25, α = 0.75); KO (α = 0.25, α = 0.75); MO (α = 0.25, α = 0.75)
# of organisms within the 20 genomes sharing the most gene families

Borrelia 8 6 (30%) 7 (35%)
Clostridium 31 2 (10%) 16 (80%) 12 (60%) 2 (10%)
Lactococcus 4 3 (15%)
Mycoplasma 26 10 (50%) 4 (20%) 11 (55%)
Streptococcus 52 9 (45%)
Thermoanaerobacter 7 4 (20%) 5 (25%) 2 (10%)
Ureaplasma 3 3 (15%)

# of organisms within the 20 genomes sharing the most CORE gene families

Borrelia 8 7 (35%) 7 (35%) 2 (10%)
Clostridium 31 11 (55%)
Lactobacillus 25 5 (25%)
Mycoplasma 26 5 (25%) 2 (10%)
Propionibacterium 3 3 (15%)
Streptococcus 52 6 (30%) 8 (40%) 10 (50%)
Thermoanaerobacter 7 2 (10%) 2 (10%)
Ureaplasma 3 3 (15%)

# of organisms within the 20 genomes sharing the most VARIABLE gene families

Clostridium 30 15 (75%) 18 (90%) 6 (30%) 12 (60%) 6 (30%)
Streptococcus 50 5 (25%) 4 (20%) 9 (45%)
Thermoanaerobacter 7 4 (20%) 2 (10%) 6 (30%) 6 (30%) 2 (10%) 4 (20%)