Megahit v.0.2.2 | Metagenome assembler using multiple k-mer sizes and succinct de Bruijn graphs | Assembly of unique genomes across a broad abundance range. |
Ray Meta v2.3.2 | Distributed de Bruijn graph metagenome assembler | Assembly of abundant, unique genomes, dependent on k-mer size. |
Meraga v2.0.4 | Meraculous + Megahit | Assembly of unique genomes across a broad abundance range, assembly of high coverage (>600) circular elements. |
Minia 2 and Minia 3 | De Bruijn graph assembler based on a Bloom filter | Assembly of unique genomes across a broad abundance range, assembly of high coverage (>200) circular elements. |
A* | OperaMS Scaffolder using SOAPdenovo2 on medium complexity and Ray assemblies on low and high complexity data sets | Assembly of abundant, unique genomes. |
Velour | De Bruijn graph genome assembler | Assembly of abundant, unique genomes, dependent on k-mer size. |
Genome and taxonomic binners | | |
CONCOCT | Genome binner using differential coverage, tetranucleotide frequencies, paired-end linkage | Near complete (>95%) assignment of datasets at some cost for average genome purity and completeness. |
MaxBin 2.0 | Genome binner using multi-sample coverage, tetranucleotide frequencies | Largest average purity and completeness across entire abundance range. Recovery of 2nd most genomes with high purity and completeness. |
MetaBAT | Genome binner using multi-sample coverage, tetranucleotide frequencies, paired-end linkage | Assignment of a large portion (>88%) of datasets at some cost for average genome purity and completeness. |
MetaWatt-3.5 | Genome binner using tetranucleotide frequencies | Recovery of the most genomes with high purity and completeness; near complete assignment of datasets at some cost for average genome purity and completeness. |
MyCC | Genome binner using short k-mer frequencies, multi-sample coverage, and 40 universal phylogenetic marker genes | Near complete assignment of datasets at some cost for average genome purity and completeness. |
Kraken | Taxonomic binner using long k-mers and Lowest Common Ancestor (LCA) related assignments. Also returns a taxonomic profile. | Good performance until family level; substantial decrease below. When removing small predicted bins, 2nd best sum of purity and completeness for taxon bins, completeness, overall sample assignment accuracy and bases assigned. |
Megan 6 | Taxonomic binner using sequence similarities and LCA-related assignments | Also rank-dependent performance. When removing small predicted bins, 2nd lowest misclassification rate (fraction of false predictions), mid-range performance otherwise. |
PhyloPythiaS+ | Taxonomic binner using k-mer frequencies (4-6mers), structural SVM | Good performance until family level; substantial decrease below. Best sum of purity and completeness, completeness, overall sample assignment accuracy and bases assigned. Best for binning of deep branchers. |
taxator-tk | Taxonomic binner using sequence homology and taxon placement algorithm | When removing small predicted bins, highest purity and lowest misclassification rate, but very low completeness. Suggested application: taxon labeling of genome bins. |
Taxonomic profilers | | |
MetaPhyler | Phylogenetic marker genes | Best inference of taxon relative abundances to the family level, moderately high recall at the cost of very low precision. |
mOTU | Phylogenetic marker genes | Neither best nor worst with any metric, with a slight favoring of precision over recall. |
Quikr/ARK/SEK | k-mer based nonnegative least squares using extracted 16S rRNA sequences | Highest recall with second worst precision. Suitable mostly for higher taxonomic ranks. Relatively good abundance estimation for low complexity samples and at higher taxonomic ranks. |
Taxy-Pro | Mixture model analysis of protein signatures | Very good inference of taxon relative abundances to the family level, high recall and low precision. |
TIPP | Marker genes and SATÉ phylogenetic placement | Accurate inference of taxon relative abundances down to the family level, high recall and low precision. |
CLARK | Phylogenetically discriminative k-mers | High recall and decidedly worst precision for all ranks and complexity levels. |
Common Kmers/MetaPalette | Long k-mer based nonnegative least squares | Comparable to MetaPhlAn 2.0 (high precision with low recall), but more accurate inference of relative taxon abundances at the cost of fewer distinguished species. |
DUDes | Read mapping and deepest uncommon descendant | Tool parameters substantially affect tradeoff between precision and recall, particularly at lower taxonomic ranks and for high complexity samples. |
FOCUS | k-mer based nonnegative least squares | Good inference of relative abundances down to the family level, low precision and recall, especially for lower taxonomic ranks. |
MetaPhlAn 2.0 | Clade-specific marker genes | Most precise method by far with ability to distinguish between a few species, low recall. |
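
The summaries above are phrased in terms of purity and completeness (for bins) and precision and recall (for profiles). As a rough illustration of how such figures can be computed, the following Python sketch uses commonly used definitions, which are assumptions here rather than the exact evaluation procedure behind the table: precision and recall are taken over presence/absence of taxa at a single rank, and a bin is mapped to the genome that contributes most of its base pairs. The function names, the taxon names and the contig data are hypothetical.

```python
from collections import Counter


def precision_recall(predicted, truth):
    """Presence/absence precision and recall of a profile at one rank."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def bin_purity_completeness(bin_contigs, genome_of, length_of):
    """Purity and completeness of one genome bin, measured in base pairs.

    bin_contigs: contig ids placed in the bin (assumed non-empty)
    genome_of:   contig id -> true genome id, for all contigs (assumed input)
    length_of:   contig id -> contig length in bp, for all contigs
    """
    bp_per_genome = Counter()
    for contig in bin_contigs:
        bp_per_genome[genome_of[contig]] += length_of[contig]
    # Map the bin to the genome contributing the most base pairs.
    genome, matched_bp = bp_per_genome.most_common(1)[0]
    bin_bp = sum(bp_per_genome.values())
    genome_bp = sum(length_of[c] for c, g in genome_of.items() if g == genome)
    return matched_bp / bin_bp, matched_bp / genome_bp


# Hypothetical toy data: four true genera, three predicted.
truth = {"Escherichia", "Bacillus", "Vibrio", "Listeria"}
predicted = {"Escherichia", "Bacillus", "Salmonella"}
precision, recall = precision_recall(predicted, truth)

# Hypothetical contigs: genome A spans c1-c3, genome B is c4.
genome_of = {"c1": "A", "c2": "A", "c3": "A", "c4": "B"}
length_of = {"c1": 500, "c2": 300, "c3": 200, "c4": 200}
purity, completeness = bin_purity_completeness(
    ["c1", "c2", "c4"], genome_of, length_of
)
```

In this toy case the profile has two true positives out of three predictions and four true taxa, and the bin holds 800 bp of genome A out of 1000 bp binned, with genome A being 1000 bp long in total. A tool described as "high recall and low precision" would report most true taxa along with many false ones, while "high purity, low completeness" corresponds to clean bins that cover only part of each genome.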