2010 Aug 5;5(1):122–130. doi: 10.1038/ismej.2010.125

Table 1. Phylogenetic affiliations of major bins in the TA data set identified with the composition-based classifier, PhyloPythia (McHardy et al., 2007)^a.

| Phylogenetic affiliation | No. of DNA contigs | Total sequence (Mb) | Average read depth | Expected genome size (Mb)^b |
|---|---|---|---|---|
| Bacteroidetes (class) | 70 | 0.155 | 2.3±1.1 | |
| Bacteroidales | 120 | 0.235 | 2.2±0.9 | |
| Betaproteobacteria | 73 | 0.096 | 1.7±0.6 | |
| | | | | |
| Deltaproteobacteria | 160 | 0.343 | 2.1±0.8 | |
|   Uncultured Syntrophus | 196 | 0.724 | 2.9±1.3 | 1.9 |
| Geobacter | 264 | 0.578 | 2.1±0.7 | |
| | | | | |
| Firmicutes | 430 | 0.628 | 2.1±0.7 | |
| Clostridia | 66 | 0.372 | 3.3±1.6 | |
|   Uncultured Pelotomaculum spp.^c | 1083 | 4.256 | 3.2±1.5 | 3.6 |
| OP5 | 228 | 1.411 | 3.8±1.9 | 2.8 |
| Spirochaetes (class) | 81 | 0.177 | 2.3±0.9 | |
| | | | | |
| Euryarchaeota | 1560 | 2.648 | 2.1±0.9 | |
| Thermoplasmata | 71 | 0.148 | 1.9±0.8 | |
| Methanomicrobiales | 36 | 0.098 | 2.4±1.8 | |
|   Uncultured Methanolinea spp.^c | 78 | 2.162 | 5.3±3.9 | 3.7 |
| Methanosarcinales | 15 | 0.095 | 3.7±1.9 | |
|   Uncultured Methanosaeta | 1180 | 2.613 | 2.6±1.1 | 3.1 |
|   Uncultured Methanosaeta | 351 | 2.361 | 4.2±1.3 | 2.8 |
| Unclassified | 46 280 | | | |

Abbreviations: rRNA, ribosomal RNA; SNP, single-nucleotide polymorphism; TA, terephthalate.

^a No DNA contigs were binned to WWE1 related to C. acidaminovorans because of insufficient training data for PhyloPythia. The 16S rRNA clone library indicated that 132 sequences were affiliated with WWE1 and grouped into two different clusters. One of the clusters (37/132) was closely related to C. acidaminovorans (similarity=96–98.8%).

^b Expected genome size was calculated based on the percent coverage of the corresponding isolate genomes. For example, 1735 genes in Pelotomaculum thermopropionicum are best-BLAST matches to genes from the metagenome data set. Given that P. thermopropionicum contains 2920 genes, we estimate the genome size of the uncultured Pelotomaculum sp. to be 7.16 Mb (4.256 × 2920/1735). However, at least two strains of Pelotomaculum are present in the sample; therefore, the individual genome size for each strain is estimated to be around 3.6 Mb. To estimate the Methanolinea genome size, we used Methanoculleus marisnigri as the reference genome. For Methanosaeta, 2.6 Mb of sequence gives hits to 1438 proteins in the M. thermophila genome, which has 1730 predicted coding sequences, so the expected genome size would be (2.6 × 1730)/1438 = 3.1 Mb. In the case of OP5, the expected genome size was calculated based on the occurrence of phylogenetic marker clusters of orthologous genes (COGs), defined as COGs that have one or mostly one member in the genomes available in Integrated Microbial Genome/Microbiome (IMG/M). The OP5 bin contained 91 out of 180 phylogenetic marker COGs.
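The proportional scaling described in this footnote can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `expected_genome_size` and its parameters are hypothetical. The OP5 line assumes that the marker-COG estimate uses the same scaling applied to the 1.411 Mb binned for OP5 in the table, an assumption that happens to reproduce the tabulated 2.8 Mb.

```python
def expected_genome_size(binned_mb, reference_total, reference_matched):
    """Scale the binned sequence (Mb) by the fraction of reference features recovered.

    Hypothetical helper: reference_total is the number of genes (or marker COGs)
    in the reference, reference_matched is how many of them were hit by the bin.
    """
    if not 0 < reference_matched <= reference_total:
        raise ValueError("reference_matched must be in (0, reference_total]")
    return binned_mb * reference_total / reference_matched


# Pelotomaculum: 1735 of 2920 P. thermopropionicum genes matched, 4.256 Mb binned.
pelotomaculum_bin = expected_genome_size(4.256, 2920, 1735)
# At least two strains share the bin, so the estimate is halved per strain.
pelotomaculum_per_strain = pelotomaculum_bin / 2

# Methanosaeta: 1438 of 1730 M. thermophila coding sequences hit by 2.6 Mb.
methanosaeta = expected_genome_size(2.6, 1730, 1438)

# OP5 (assumption: same scaling, using 91 of 180 marker COGs and 1.411 Mb).
op5 = expected_genome_size(1.411, 180, 91)

for name, size in [
    ("Pelotomaculum bin", pelotomaculum_bin),
    ("Pelotomaculum per strain", pelotomaculum_per_strain),
    ("Methanosaeta", methanosaeta),
    ("OP5", op5),
]:
    print(f"{name}: {size:.2f} Mb")
```

The estimate is only as good as the assumption that matched reference genes are spread evenly across the genome, so it is best read as a rough figure.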

^c At least two species/strains were observed in each bin. With SNP frequencies of at least 0.03–0.07% (data not shown), we concluded that these species/strains are not clonal.