2019 Jul 5;35(14):i51–i60. doi: 10.1093/bioinformatics/btz350

Table 3.

Comparison between space-efficient colored de Bruijn graph construction methods for 4000, 8000 and 16 000 Salmonella strains using VariMerge versus competing methods

| Dataset (no. of strains) | No. of k-mers (billion) | Program | Output size (GB) | Time | RAM (GB) |
|---|---|---|---|---|---|
| 4000 | 1.1 | Vari/Rainbowfish | 51 | 10 h 25 min | 136 |
| | | Bloom Filter Trie | 99 | 51 h 42 min | 120 |
| | | Multi-BRWT | 1.3 TB | 42 h 23 min | 156 |
| | | Mantis/Method of Almodaresi et al. | 36 | 5 h 58 min | 313 |
| | | VariMerge | 51 | 10 h 25 min | 136 |
| 8000 | 2.4 | Vari/Rainbowfish | 114 | 37 h 27 min | 271 |
| | | Bloom Filter Trie | N/A | N/A | N/A |
| | | Multi-BRWT | N/A | N/A | N/A |
| | | Mantis/Method of Almodaresi et al. | 38 | 13 h 37 min | 370 |
| | | VariMerge | 106 | 26 h 30 min | 137 |
| 16 000 | 5.8 | Vari/Rainbowfish | N/A | N/A | N/A |
| | | Bloom Filter Trie | N/A | N/A | N/A |
| | | Multi-BRWT | N/A | N/A | N/A |
| | | Mantis/Method of Almodaresi et al. | 256 | 36 h 12 min | 316 |
| | | VariMerge | 233 | 69 h 8 min | 254 |

Note: We report N/A for any method that exceeded 140 CPU hours, 4 TB of disk space and 750 GB of memory. We anticipate that add-on methods will compress better but will still consume the resources shown for their base method, because they reuse the base method’s output. We measured RAM as maximum resident set size. The Mantis authors noted that, because their program uses memory-mapped I/O, this figure reflects opportunistic consumption and not necessarily a requirement of their program. To the best of our knowledge, no extra external memory is needed for Bloom Filter Trie, Multi-BRWT, Mantis and the method of Almodaresi et al., so external memory is omitted from the table.
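
The note says RAM was measured as maximum resident set size but does not say which tool was used. As an illustration only, the following minimal Python sketch shows one common way to obtain such a figure on Unix-like systems, using the standard `resource` module. The helper name `run_and_measure` is hypothetical and is not part of any of the programs compared above.

```python
import resource
import subprocess
import sys


def run_and_measure(cmd):
    """Run cmd to completion; return (exit_code, peak_rss_kib).

    Hypothetical helper for illustration. Unix-only: the `resource`
    module is not available on Windows.
    """
    completed = subprocess.run(cmd, check=False)

    # RUSAGE_CHILDREN covers every child this process has waited for,
    # so ru_maxrss is the peak of the largest such child, not
    # necessarily the one just run. Use a fresh interpreter per
    # measurement to avoid mixing runs.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    peak = usage.ru_maxrss

    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        peak //= 1024
    return completed.returncode, peak


if __name__ == "__main__":
    # Example: measure a short-lived Python child process.
    code, peak_kib = run_and_measure(
        [sys.executable, "-c", "data = bytearray(50_000_000)"]
    )
    print(f"exit code {code}, peak RSS about {peak_kib / 1024:.1f} MiB")
```

As the note points out for Mantis, a peak resident set size obtained this way can overstate what a program strictly needs when it uses memory-mapped I/O, since mapped file pages count toward the resident set while they are in memory.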