2017 Aug 30;30(4):1015–1063. doi: 10.1128/CMR.00016-17

TABLE 2.

Performance analysis of assembly tools^a

| Analysis tool (reference[s]) | Concept | Computational requirement | Speed | Assembly quality | Preferred sequencing technology(ies) | Web address(es) | Input format | Output format(s) |
|---|---|---|---|---|---|---|---|---|
| Web based | | | | | | | | |
| Velvet (103, 126) | de Bruijn graph-based assembly that resolves repeat-rich regions; can be used for de novo or reference-guided assembly; requires paired reads with 20- to 25-fold coverage | Mid* | Medium* | Low* | Illumina | https://cge.cbs.dtu.dk/services/Assembler/ | FASTA, FASTQ, SAM, or BAM | AMOS, modified FASTA |
| SPAdes/hybridSPAdes (112) | de Bruijn graph-based assembler for de novo assembly of short and long reads | Low** | Low** | Mid*/** | Mixed input (Illumina, Ion Torrent, PacBio CLR, Oxford Nanopore) | https://cge.cbs.dtu.dk/services/SPAdes/ | FASTA, FASTQ, or BAM | FASTA, FASTQ, FASTG |
| Command line | | | | | | | | |
| IDBA-UD (108) | de Bruijn graph-based assembly designed for assembly of repeat-rich reads of various sequencing depths | Low* | Medium* | Mid* | Illumina | http://i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/ | FASTA | FASTA |
| RAY (96) | de Bruijn graph-based assembly that uses seeds instead of Eulerian walks; used for de novo assembly; designed for short reads | Low*** | Fast*** | Low*** | Mixed input (454, Illumina, Ion Torrent) | http://denovoassembler.sourceforge.net/ | FASTA, FASTQ, or SFF | FASTA, TXT |
| Minimap/miniasm (116) | OLC framework that computes overlaps and performs read trims and unitig construction; can be used for de novo or reference-guided assembly | Low** | High** | High*/** | PacBio, Oxford Nanopore | https://github.com/lh3/minimap, https://github.com/lh3/miniasm | FASTA | GFA, PAF |
| Canu (118) | OLC framework that computes overlaps and performs read correction, read trims, and unitig construction; used for de novo assembly | Mid** | Low** | High*/** | PacBio, Oxford Nanopore | https://github.com/marbl/canu | FASTA or FASTQ | FASTA |
^a All quantitative performance measures were taken from data reported previously, as indicated. CLR, continuous long reads; GFA, graphical fragment assembly; PAF, pairwise mapping format; SFF, standard flowgram format (454 data format); *, E. coli K-12 MG1655 data set (110); **, Enterobacter kobei data set (233); ***, Illumina data from E. coli (SRA accession number SRX000429) (234). Note that for SPAdes, only the nonhybrid tool is accessible as a Web-based tool.
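
Several tools in Table 2 are described as de Bruijn graph-based. As a rough illustration of that concept only, the Python sketch below builds a de Bruijn graph from short reads and extends a contig along an unambiguous path. It is a toy under stated assumptions, not the algorithm of Velvet, SPAdes, IDBA-UD, or Ray: the function names `build_de_bruijn` and `walk_unambiguous`, the example reads, and the choice of k = 4 are all hypothetical, and the sketch ignores sequencing errors, reverse complements, coverage, and branching on the incoming side of a node.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it.

    Every k-mer in every read contributes one edge: prefix -> suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor.

    Stops at a dead end, at a branch (more than one successor), or when a
    node would be revisited (to avoid looping forever on a cycle).
    """
    contig = start
    node = start
    seen = {start}
    while node in graph and len(graph[node]) == 1:
        (nxt,) = graph[node]  # the single successor
        if nxt in seen:
            break
        contig += nxt[-1]  # each step adds one new base
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads sampled from the toy sequence ATGGCGTGCA.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
print(walk_unambiguous(graph, "ATG"))  # prints ATGGCGTGCA
```

Repeat-rich regions are where this naive walk breaks down: a repeated (k-1)-mer gains more than one successor, and the walk stops at the branch. Resolving such branches is the hard part that the tools in the table handle in their own ways.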