2015 Jun 10;31(12):i80–i88. doi: 10.1093/bioinformatics/btv262

Table 1.

Performance comparison of major assembly tools on the F. tularensis dataset (genome length 1 892 775 bp; 6 907 220 reads of 101 bp each), evaluated with QUAST in default mode (Gurevich et al., 2013)

| Assembler | No. contigs (no. unaligned) | N50 (bp) | Largest (bp) | Total (bp) | MA | Local MA | MA (bp) | GF (%) |
|---|---|---|---|---|---|---|---|---|
| Velvet | 358 (3 + 35 part) | 7377 | 39 381 | 1 762 202 | 11 | 36 | 84 965 | 92.09 |
| SOAPdenovo | 307 (3 + 31 part) | 8767 | 39 989 | 2 018 158 | 10 | 35 | 96 258 | 92.05 |
| ABySS | 96 (1 part) | 27 975 | 88 275 | 1 875 628 | 64 | 32 | 1 330 684 | 95.87 |
| SPAdes (−rr) | 102 (2 + 11 part) | 25 148 | 87 449 | 1 788 634 | 11 | 30 | 258 309 | 92.81 |
| SPAdes (+rr) | 100 (2 + 17 part) | 26 876 | 87 891 | 1 797 197 | 23 | 31 | 497 356 | 93.75 |
| IDBA | 109 (1 + 10 part) | 23 223 | 87 437 | 1 768 958 | 10 | 31 | 221 087 | 92.64 |

All statistics are based on contigs no shorter than 500 bp. N50 is the length for which the contigs of that length or longer contain at least half of the total contig length, and the contigs of that length or shorter also contain at least half of the total contig length. No. unaligned is the number of contigs that did not align to the reference genome or that aligned only partially (part). Total is the sum of the lengths of all contigs. MA is the number of (extensively) misassembled contigs. Local MA is the total number of contigs with local misassemblies. MA (bp) is the total length of the MA contigs. GF is the genome fraction, the percentage of genome bases covered by the assembly. −rr and +rr denote before and after repeat resolution, respectively.
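The N50 definition above can be made concrete with a short sketch. This is an illustrative calculation only, not QUAST's implementation: the function name `n50`, the `min_length` parameter, and the example contig lengths are hypothetical. The 500 bp cutoff mirrors the threshold stated in the footnote.

```python
def n50(contig_lengths, min_length=500):
    """Return the N50 of the contigs that are at least `min_length` bp long.

    Illustrative sketch: sort contigs from longest to shortest and return
    the length of the contig at which the running total first reaches
    half of the summed length of all retained contigs.
    """
    # Discard contigs shorter than the cutoff, longest first.
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs at or above the minimum length")

    total = sum(kept)
    running = 0
    for length in kept:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp); the 400 bp contig is filtered out.
example = [400, 500, 1000, 2000, 3000, 4500]
print(n50(example))
```

In this example the retained contigs sum to 11 000 bp, and the two longest contigs (4500 + 3000 = 7500 bp) are the first to cover at least half of that total, so the function returns 3000. Contigs of 3000 bp or shorter sum to 6500 bp, which also covers at least half, so the value satisfies both conditions of the definition.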