Table 3.
Comparison between space-efficient colored de Bruijn graph construction methods for 4000, 8000 and 16 000 Salmonella strains using versus competing methods
Dataset | No. of k-mers (billions) | Program | Output size (GB) | Time | RAM (GB) |
---|---|---|---|---|---|
4000 | 1.1 | /Rainbowfish | 51 | 10 h 25 min | 136 |
4000 | 1.1 | Bloom Filter Trie | 99 | 51 h 42 min | 120 |
4000 | 1.1 | Multi-BRWT | 1.3 TB | 42 h 23 min | 156 |
4000 | 1.1 | Mantis/Method of Almodaresi et al. | 36 | 5 h 58 min | 313 |
4000 | 1.1 | | 51 | 10 h 25 min | 136 |
8000 | 2.4 | /Rainbowfish | 114 | 37 h 27 min | 271 |
8000 | 2.4 | Bloom Filter Trie | N/A | N/A | N/A |
8000 | 2.4 | Multi-BRWT | N/A | N/A | N/A |
8000 | 2.4 | Mantis/Method of Almodaresi et al. | 38 | 13 h 37 min | 370 |
8000 | 2.4 | | 106 | 26 h 30 min | 137 |
16 000 | 5.8 | /Rainbowfish | N/A | N/A | N/A |
16 000 | 5.8 | Bloom Filter Trie | N/A | N/A | N/A |
16 000 | 5.8 | Multi-BRWT | N/A | N/A | N/A |
16 000 | 5.8 | Mantis/Method of Almodaresi et al. | 256 | 36 h 12 min | 316 |
16 000 | 5.8 | | 233 | 69 h 8 min | 254 |
Note: We report N/A for any method that exceeded 140 CPU hours, 4 TB of disk space and 750 GB of memory. We anticipate that add-on methods will compress better but will still consume the resources shown for their base method, because they reuse the base method's output. We measured RAM as the maximum resident set size. The Mantis authors noted that, because their program uses memory-mapped I/O, this measurement reflects opportunistic memory consumption and not necessarily a requirement of their program. To the best of our knowledge, no extra external memory is needed for Bloom Filter Trie, Multi-BRWT, Mantis and the method of Almodaresi et al., so external memory is omitted from the table.
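As an illustration of the RAM metric described in the note, the following minimal sketch shows one way to record the maximum resident set size of a child process on a Unix-like system. It is not the measurement setup used for the table, which the note does not describe. The helper names `max_rss_gb` and `run_and_measure` are hypothetical, and the sketch assumes Python's standard `resource` module, whose `ru_maxrss` field is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource
import subprocess
import sys


def max_rss_gb(ru_maxrss, platform=sys.platform):
    """Convert an ru_maxrss reading to gigabytes (10^9 bytes).

    Assumption: Linux reports ru_maxrss in kilobytes (1024 bytes),
    while macOS ("darwin") reports it in bytes.
    """
    bytes_used = ru_maxrss if platform == "darwin" else ru_maxrss * 1024
    return bytes_used / 1e9


def run_and_measure(cmd):
    """Run cmd to completion and return (exit code, peak RSS in GB).

    RUSAGE_CHILDREN reports the largest peak RSS among all child
    processes waited for so far, so use a fresh interpreter per
    measurement if several commands are being compared.
    """
    proc = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, max_rss_gb(usage.ru_maxrss)


if __name__ == "__main__":
    # Placeholder workload: a short Python child that allocates a buffer.
    code, peak = run_and_measure(
        [sys.executable, "-c", "x = bytearray(50_000_000)"]
    )
    print(f"exit={code} peak_rss_gb={peak:.3f}")
```

Because resident set size counts pages that are currently mapped into physical memory, a program that relies on memory-mapped I/O can show a large peak simply because the operating system had free memory to cache file pages. This matches the caveat about Mantis in the note.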