2012 Jul 23;13:327. doi: 10.1186/1471-2164-13-327

Table 2.

General features of the metagenome datasets

| Feature | KB-1 | DonnaII | ANAS |
| --- | --- | --- | --- |
| Type of sequencing | Sanger | 454 | 454 & Sanger |
| Total number of bases pre-assembly | 106,515,530 | 930,446,714 | 330,964,688 |
| Number of contigs | 6,361 | 47,030 | 10,807 |
| Total length of contigs (bp) | 14,988,108 | 24,573,718 | 30,615,713 |
| Number of singletons | 18,629 | 105,608 | 15,486 |
| Total length of singletons (bp) | 13,487,233 | 57,708,799 | 10,450,264 |
| Largest contig (bp) | 155,970 | 121,460 | 921,258 |
| Average contig size (bp) | 2,356 | 522 | 2,832 |
| Average G+C content (%) | 52.33 | 52.28 | 51.91 |
| Protein coding genes | 40,766 | 194,527 | 60,992 |
| - with COGs | 21,857 | 116,001 | 39,920 |
| - connected to KEGG pathways | 8,077 | 36,685 | 11,878 |
| rRNA genes (5S/16S/23S) | 18 (7/5/6) | 185 (11/62/112) | 40 (23/8/9) |
| tRNA genes | 330 | 818 | 525 |
| CRISPR count | 48 | 7 | 57 |
| MG-RAST data | | | |
| % Dhc in culture* | 43.7 | 31.3 | 18.2 |
| Metagenome size (bp)* | 106,508,248 | 916,191,214 | 330,396,345 |
| Average read length* | 958 | 477 | 547 |
| Number of sequences* | 111,162 | 1,920,396 | 603,841 |
| Number (%) identified for metabolic analysis | 63,352 (57.0) | 363,424 (18.9) | 222,012 (36.8) |
| Number (%) identified for phylogenetic analysis | 88,888 (80.0) | 540,785 (28.2) | 294,470 (48.8) |

* = post-MG-RAST preprocessing, which removed duplicate reads and nonsense reads from the datasets.

† = maximum e-value of 1 × 10⁻⁵, minimum alignment length ~100.
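
The derived entries in the MG-RAST rows of Table 2 can be cross-checked against the raw counts. The sketch below assumes that average read length is metagenome size divided by number of sequences, and that each percentage is taken relative to the post-preprocessing number of sequences; the table does not state these definitions explicitly, and the variable and function names are illustrative only. Under those assumptions the recomputed values agree with the reported ones at the precision given in the table.

```python
# Raw counts transcribed from the MG-RAST rows of Table 2.
# Dictionary keys and function names are illustrative, not from the paper.
datasets = {
    "KB-1": {
        "size_bp": 106_508_248, "n_seqs": 111_162,
        "metabolic": 63_352, "phylogenetic": 88_888,
    },
    "DonnaII": {
        "size_bp": 916_191_214, "n_seqs": 1_920_396,
        "metabolic": 363_424, "phylogenetic": 540_785,
    },
    "ANAS": {
        "size_bp": 330_396_345, "n_seqs": 603_841,
        "metabolic": 222_012, "phylogenetic": 294_470,
    },
}


def derived(row):
    """Recompute derived columns, assuming the sequence count is the denominator."""
    return {
        # Assumed definition: total bases / number of sequences.
        "avg_read_length": round(row["size_bp"] / row["n_seqs"]),
        # Assumed definition: identified sequences as a share of all sequences.
        "pct_metabolic": round(100 * row["metabolic"] / row["n_seqs"], 1),
        "pct_phylogenetic": round(100 * row["phylogenetic"] / row["n_seqs"], 1),
    }


summary = {name: derived(row) for name, row in datasets.items()}

for name, values in summary.items():
    print(name, values)
```

The same pattern does not carry over cleanly to the assembly rows: dividing total contig length by number of contigs gives values that match the reported average contig size only after truncation rather than rounding, so those rows are left out of the check.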