2013 Apr;20(4):359–371. doi: 10.1089/cmb.2012.0098

Table 2.

Comparison of Different Assemblers

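| Assembler (a) | No. of contigs | N50 | N75 | Covered (%) (b) | MA (c) | MM (d) | CG (e) |
|---|---|---|---|---|---|---|---|
| EC215 dataset | | | | | | | |
| Velvet | 198 | 82776 | 42878 | 99.93 | 4 | 1.2 | 4223 |
| SOAPdenovo | 192 | 62512 | 35069 | 97.72 | 1 | 26.1 | 4141 |
| IDBA | 246 | 48825 | 25483 | 99.60 | 3 | 1.3 | 4170 |
| SPAdes | 385 | 86548 | 42441 | 99.53 | 2 | 3.7 | 4223 |
| SPAdes single read | 458 | 54858 | 30309 | 99.56 | 0 | 0.7 | 4239 |
| Pathset | 360 | 91829 | 56830 | 99.56 | 1 | 2.1 | 4249 |
| EC500 dataset | | | | | | | |
| Velvet | 169 | 105637 | 57172 | 99.28 | 5 | 2.5 | 4095 |
| SOAPdenovo | 982 | 57167 | 31582 | 99.88 | 0 | 0.2 | 4196 |
| IDBA | 227 | 57827 | 34421 | 99.26 | 1 | 4.2 | 4158 |
| SPAdes | 215 | 95454 | 46490 | 99.80 | 4 | 1.7 | 4223 |
| SPAdes single read | 493 | 54666 | 35158 | 99.88 | 0 | 0.9 | 4215 |
| Pathset | 320 | 97971 | 58548 | 99.46 | 2 | 2.0 | 4252 |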
(a) For each column, the best assembler by that criterion is indicated in bold.

(b) Percent of genome covered is the ratio of the total number of aligned bases in the assembly to the genome size, expressed as a percentage.

(c) MA: Misassemblies are locations on an assembled contig where the left flanking sequence aligns more than 1 kb away from the right flanking sequence on the reference.

(d) MM: The mismatch (substitution) error rate per 100 kbp, measured in the correctly assembled contigs.

(e) CG: Complete genes is the number of genes contained completely within assembly contigs (using E. coli gene annotations from www.ecogene.org).
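
The column definitions above can be illustrated with a minimal Python sketch. It is not the evaluation pipeline used to produce Table 2. The function names, the toy inputs, and the representation of alignments as (start, end) reference intervals are assumptions made for illustration. The table notes do not define N50 and N75, so the sketch assumes the conventional definition: the largest contig length L such that contigs of length at least L contain at least 50% (or 75%) of the assembled bases. The misassembly check is deliberately simplified and ignores strand and any handling of a circular reference.

```python
from typing import Iterable, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end) on the reference, inclusive; assumed layout


def nx(contig_lengths: Iterable[int], fraction: float) -> int:
    """Nx statistic: fraction=0.5 gives N50, fraction=0.75 gives N75.

    Assumed conventional definition (not stated in the table notes).
    """
    lengths = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= threshold:
            return length
    return 0  # empty assembly


def percent_covered(aligned_bases: int, genome_size: int) -> float:
    """Note (b): aligned bases in the assembly over genome size, as a percentage."""
    return 100.0 * aligned_bases / genome_size


def mismatches_per_100kbp(mismatches: int, aligned_bases: int) -> float:
    """Note (d): substitution errors normalized to 100 kbp of aligned sequence."""
    return mismatches * 100_000 / aligned_bases


def is_misassembly(left_ref_pos: int, right_ref_pos: int, max_gap: int = 1000) -> bool:
    """Note (c), simplified: the two flanks map more than 1 kb apart on the reference."""
    return abs(right_ref_pos - left_ref_pos) > max_gap


def complete_genes(genes: Sequence[Interval], aligned_blocks: Sequence[Interval]) -> int:
    """Note (e): count genes lying entirely inside a single aligned contig block."""
    return sum(
        any(b_start <= g_start and g_end <= b_end for b_start, b_end in aligned_blocks)
        for g_start, g_end in genes
    )


# Toy values, not taken from Table 2.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(nx(toy_contigs, 0.5))   # N50 of the toy contigs -> 70
print(nx(toy_contigs, 0.75))  # N75 of the toy contigs -> 40
print(percent_covered(4_500_000, 5_000_000))
print(mismatches_per_100kbp(21, 1_000_000))
```

Under this definition N75 can never exceed N50 for the same assembly, which agrees with every row of the table.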