Table 3: Overview of the different de novo assembly tools evaluated in this study
| Assembler | Version | MK | Setup | Usage | Runtime (min) | Runtime (max) | Runtime (median) | Memory min (GB) | Memory max (GB) | Memory median (GB) | Source | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Trans-ABySS | 2.0.1 | Yes | | | 16 m | 2 d 6 h 23 m | 11 h 11 m | 0.6 | 49.2 | 19.7 | [9] | 2010 |
| Trinity | 2.8.4 | No | | | 28 m | 1 d 20 h 10 m | 6 h 40 m | 7.2 | 243.9 | 27.7 | [10] | 2011 |
| Oases a | 0.2.08 | Yes | | | 25 m | 8 d 15 h 45 m | 6 h 47 m | 3.1 | 110.2 | 31.3 | [11] | 2012 |
| SPAdes-sc b | 3.13.0 | Yes | | | 16 m | 7 h 52 m | 2 h 26 m | 5.0 | 37.4 | 25.3 | [18] | 2012 |
| SPAdes-rna b | 3.13.0 | Yes c | | | 11 m | 7 h 24 m | 2 h 17 m | 5.0 | 44.2 | 19.5 | [17] | 2018 |
| IDBA-Tran | 1.1.1 | Yes | | | 7 m | 8 h 49 m | 2 h 44 m | 0.6 | 29.1 | 9.6 | [12] | 2013 |
| SOAPdenovo-Trans | 1.03 | No | | | 1 m | 1 h 48 m | 24 m | 2.1 | 45.6 | 26.4 | [13] | 2014 |
| Bridger d | 14-12-01 | No | | | 11 m | 21 h 11 m | 5 h 9 m | 1.6 | 109.3 | 30.4 | [14] | 2015 |
| BinPacker d | 1.0 | No | | | 5 m | 15 h 57 m | 3 h 3 m | 1.5 | 96.2 | 27.9 | [15] | 2016 |
| Shannon | 0.0.2 | No | | | 9 m | 10 h 45 m | 3 h 18 m | 3.8 | 121.4 | 83.6 | [16] | 2016 |
We rated our experiences regarding the installation and usability of each tool on three levels (excellent, good, unsatisfactory). These ratings are subjective; nevertheless, we share them to give inexperienced users an idea of how difficult it is to get each tool installed (Setup) and executed (Usage) (see Methods for details). For Trinity, we observed high memory peaks at the beginning of the calculations for large (human, mouse) data sets; memory usage returned to moderate levels within a few minutes. More details about runtime and memory consumption can be found in Electronic Supplement Fig. S11. MK: presence of a built-in multiple k-mer approach and the ability to automatically integrate the output of different k-mer runs.
a Oases was used on top of the de novo genome assembler Velvet (v1.2.10) [45].
b SPAdes, originally designed as a de novo genome assembler for single-cell data, was used in single-cell mode (--sc) and RNA-Seq mode (--rna).
c When running SPAdes in RNA-Seq mode, two k-mer values are used by default.
d Bridger and BinPacker are based on a splicing graph construction instead of de Bruijn graphs.
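
The runtimes in Table 3 mix days, hours, and minutes, which makes them awkward to compare by eye. The following minimal sketch converts them to minutes and ranks the assemblers by median runtime. The helper name `runtime_to_minutes` and the dictionary layout are illustrative assumptions, not part of the study; the values are copied from the table.

```python
import re

# Minutes per unit, for the "d", "h", "m" abbreviations used in Table 3.
_UNIT_MINUTES = {"d": 24 * 60, "h": 60, "m": 1}


def runtime_to_minutes(text: str) -> int:
    """Convert a runtime such as '2 d 6 h 23 m' into whole minutes.

    Hypothetical helper written for this illustration only.
    """
    parts = re.findall(r"(\d+)\s*([dhm])", text)
    if not parts:
        raise ValueError(f"unrecognised runtime: {text!r}")
    return sum(int(value) * _UNIT_MINUTES[unit] for value, unit in parts)


# Median runtimes copied from Table 3.
median_runtime = {
    "Trans-ABySS": "11 h 11 m",
    "Trinity": "6 h 40 m",
    "Oases": "6 h 47 m",
    "SPAdes-sc": "2 h 26 m",
    "SPAdes-rna": "2 h 17 m",
    "IDBA-Tran": "2 h 44 m",
    "SOAPdenovo-Trans": "24 m",
    "Bridger": "5 h 9 m",
    "BinPacker": "3 h 3 m",
    "Shannon": "3 h 18 m",
}

# Assembler names ordered from fastest to slowest median runtime.
ranked = sorted(
    median_runtime,
    key=lambda name: runtime_to_minutes(median_runtime[name]),
)

for name in ranked:
    print(f"{name}: {runtime_to_minutes(median_runtime[name])} min")
```

The same conversion can be applied to the minimum and maximum columns. The ranking reflects only the median over the data sets in this study and says nothing about assembly quality.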