Table 1.
Assembler | NGA50 | NA50 | Largest (bp) | Total (bp) | MA | GF (%) | Unaligned (bp) |
---|---|---|---|---|---|---|---|
SOAPdenovo | 32,032 | 35,343 | 101,201 | 4,304,232 | 3 | 95.2 | 3,421 |
ABySS | 31,237 | 32,987 | 110,012 | 4,530,701 | 0 | 97.56 | 0 |
SPAdes | 60,338 | 60,768 | 173,976 | 4,545,775 | 0 | 97.8 | 3,001 |
IDBA | 57,826 | 58,549 | 173,964 | 4,538,426 | 0 | 97.7 | 2,349 |
HyDA | 36,292 | 39,069 | 123,771 | 4,524,075 | 0 | 97.4 | 0 |
HyDA-Vista | 82,838 | 94,910 | 204,602 | 4,544,286 | 0 | 97.9 | 0 |
All statistics are based on contigs no shorter than 500 bp. Since there are no (QUAST-defined) misassemblies in any of the assemblies, the length statistics are based on correct contigs. NGA50 (NA50) is the (QUAST-corrected) contig length such that the contigs of at least that length together cover half of the genome (assembly) size [43,44]. Total is the sum of the lengths of all contigs. MA is the number of misassemblies. GF is the genome fraction, that is, the percentage of genome bases that are covered by the assembly. Unaligned is the total length of all contigs that could not be aligned to the reference.
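The NA50 and NGA50 definitions above can be illustrated with a minimal sketch. It is not QUAST's implementation: it assumes the contigs have already been split into correctly aligned blocks, and the function name `nx50`, the block lengths, and the genome size are hypothetical values chosen only for illustration.

```python
def nx50(block_lengths, target_size):
    """Return the largest length L such that blocks of length >= L
    together cover at least half of target_size.

    Returns None if the blocks never reach half of target_size.
    """
    covered = 0
    # Walk the blocks from longest to shortest, accumulating coverage.
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if 2 * covered >= target_size:
            return length
    return None


# Hypothetical aligned block lengths (bp), not taken from Table 1.
blocks = [50, 40, 30, 20, 10]
genome_size = 200  # hypothetical reference genome length (bp)

# NA50-style statistic: half of the assembly size is the target.
na50 = nx50(blocks, sum(blocks))

# NGA50-style statistic: half of the reference genome size is the target.
nga50 = nx50(blocks, genome_size)

print(na50, nga50)
```

The only difference between the two statistics in this sketch is the denominator: the assembly's own total length for NA50, and the reference genome length for NGA50. This is why NGA50 is never larger than NA50 when the assembly is shorter than the genome, as in every row of the table.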