Table 2.
Tool | Algorithm highlights | Data structure | Pros | Cons | Quality scores | Target error types |
---|---|---|---|---|---|---|
Reptile | Explore multiple alternative k-mer decompositions and contextual information of neighboring k-mers for error correction | Hamming graph | Contextual information can help resolve errors without increasing k and lowering local coverage | Uses a single core (non-parallelized) | Used | Substitution, deletion, insertion |
Musket | Multi-stage correction: two-sided conservative, one-sided aggressive and voting-based refinement | Bloom filter | Multi-threading based on a master–slave model results in high parallel scalability | A single static coverage cut-off to differentiate trusted k-mers from weak ones | Not used | Substitution |
Bless | Count k-mer multiplicity; correct errors using Bloom filter; restore false positives | Bloom filter | High memory efficiency; handle genome repeats better; correct read ends | Cannot automatically determine the optimal k value | Not used | Substitution, deletion, insertion |
Bloocoo | Parallelized multi-stage correction algorithm (similar to Musket) | Blocked Bloom filter | Faster and lower memory usage than Musket | Not extensively evaluated | Not used | Substitution |
Trowel | Rely on quality values to identify solid k-mers; use two algorithms (DBG and SBE) for error correction | Hash table | Correct erroneous bases and boost base qualities | Only accept FASTQ files as input | Used | Substitution |
Lighter | Random sub-fraction sampling; parallelized error correction | Pattern-blocked Bloom filter | No k-mer counting; near constant accuracy and memory usage | The user must specify k-mer length, genome length, and sub-sampling fraction α | Used | Substitution, deletion, insertion |
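
Several tools in the table store trusted k-mers in a Bloom filter and separate them from weak ones with a coverage cut-off. The sketch below illustrates that general idea only and does not reproduce any of the listed tools: the class `BloomFilter`, the helpers `kmers`, `build_trusted_filter` and `untrusted_positions`, and the default filter size and hash count are all hypothetical choices made for this example. It counts k-mers exactly before filling the filter, which is simpler than what memory-efficient tools would do in practice.

```python
import hashlib
from collections import Counter


def kmers(read, k):
    """Yield every overlapping k-mer of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


class BloomFilter:
    """Minimal Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def build_trusted_filter(reads, k, cutoff, num_bits=1 << 16, num_hashes=3):
    """Insert every k-mer seen at least `cutoff` times into a Bloom filter."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    trusted = BloomFilter(num_bits, num_hashes)
    for km, count in counts.items():
        if count >= cutoff:  # single static coverage cut-off
            trusted.add(km)
    return trusted


def untrusted_positions(read, k, trusted):
    """Return start positions of k-mers that are absent from the filter."""
    return [i for i, km in enumerate(kmers(read, k)) if km not in trusted]


# Toy data: three copies of one read plus one copy with a substitution.
reads = ["ACGTACGT"] * 3 + ["ACGTTCGT"]
trusted = build_trusted_filter(reads, k=4, cutoff=2)
weak = untrusted_positions("ACGTTCGT", 4, trusted)
```

In this toy run, `weak` lists the k-mer start positions that overlap the substituted base, which is where a corrector would look for a replacement base. Because a Bloom filter can report false positives, an erroneous k-mer is occasionally reported as trusted; a trusted k-mer is never reported as weak.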