2018 Jan 9;8:2241. doi: 10.3389/fpls.2017.02241

Table 2.

Brief description of GRA estimation tools: advantages and disadvantages.

Bioinformatics tools for Genomes Relative Abundance (GRA) estimation

| Tool | Brief description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| TETRA | Pioneering classifier that uses tetranucleotide-derived z-score correlations to taxonomically classify genomic fragments. | Compositional-based. Provides statistical analysis of tetranucleotide usage patterns in genomic fragments. Available as a web service or as a stand-alone program. | Genus-level accuracy requires long reads (>1 kb). Tends to create multiple clusters for reads originating from highly abundant species when the sample contains multiple species with highly varying levels of abundance. |
| CompostBin | DNA compositional-based algorithm that adopts a weighted Principal Component Analysis (PCA)-based strategy. | Compositional-based. Reduces the dimensionality of compositional space. Bins raw sequence reads without the need for assembly or training. | Genus-level accuracy requires long reads (>1 kb). Tends to create multiple clusters for reads originating from highly abundant species when the sample contains multiple species with highly varying levels of abundance. |
| TACOA | Multi-class taxonomic classifier combining the idea of the k-nearest neighbor with strategies from kernel-based learning. | Compositional-based. Easily installed and run on a desktop computer. Its reference set can be easily updated with newly sequenced genomes. | Genus-level accuracy requires long reads (>1 kb). |
| AbundanceBin | Binning tool, based on the l-tuple content of reads, developed on the assumption that reads are sampled from genomes following a Poisson distribution. | Compositional-based. Returns accurate results even when reads are very short (~75 bp). | Binning efficiency decreases for samples in which species abundances tend to be uniformly distributed. |
| MEGAN | Standalone computer program for analyzing large metagenomic data sets. It uses BLAST or other comparison tools to assign species to each read, and then employs the NCBI taxonomy. | Alignment-based. Allows large data sets to be dissected without the need for assembly or the targeting of specific phylogenetic markers. Provides statistical and graphical output. Quantifies accuracy and specificity. | Uses the bit-score of individual hits as the sole parameter for judging significance, which affects the specificity and accuracy of taxonomic assignments in different scenarios. |
| GRAMMy | Probabilistic framework developed for GRA estimation. It uses mixture model theory. | Usable with mapping, alignment and composition-based tools. Handles very short reads while still giving accurate results. | Accuracy of the estimated abundance decreases for closely related microbes whose genomic sequences are highly similar. |
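
The GRAMMy entry describes GRA estimation as a mixture model problem. The sketch below illustrates that general idea only and is not GRAMMy's actual code: it runs a plain expectation-maximization loop over a matrix of read-to-genome probabilities that is assumed to have been computed beforehand (by mapping, alignment, or a composition-based score). The function names, the input layout, and the final genome-length normalization are assumptions made for illustration.

```python
from typing import List, Sequence


def estimate_mixing_proportions(
    read_probs: Sequence[Sequence[float]],
    max_iter: int = 500,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate mixture weights by EM (hypothetical helper).

    read_probs[i][j] is the assumed probability of read i under genome j.
    """
    if not read_probs:
        raise ValueError("read_probs must contain at least one read")
    n_genomes = len(read_probs[0])
    pi = [1.0 / n_genomes] * n_genomes  # start from uniform weights
    for _ in range(max_iter):
        totals = [0.0] * n_genomes
        for probs in read_probs:
            # E-step: share each read among genomes in proportion to pi * p
            weighted = [w * p for w, p in zip(pi, probs)]
            norm = sum(weighted)
            if norm <= 0.0:
                continue  # read matches no reference genome; ignore it
            for j, value in enumerate(weighted):
                totals[j] += value / norm
        assigned = sum(totals)
        if assigned <= 0.0:
            raise ValueError("no read has a nonzero probability under any genome")
        # M-step: new weights are the average responsibilities
        new_pi = [t / assigned for t in totals]
        converged = max(abs(a - b) for a, b in zip(new_pi, pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


def relative_abundance(
    pi: Sequence[float], genome_lengths: Sequence[int]
) -> List[float]:
    """Convert read-level weights to genome-level abundance.

    Assumption: longer genomes yield proportionally more reads, so each
    weight is divided by its genome length and the result is renormalized.
    """
    if len(pi) != len(genome_lengths):
        raise ValueError("pi and genome_lengths must have the same length")
    per_base = [p / length for p, length in zip(pi, genome_lengths)]
    total = sum(per_base)
    return [x / total for x in per_base]


if __name__ == "__main__":
    # Toy input: three reads match only genome 0, one read matches only genome 1.
    toy_probs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    weights = estimate_mixing_proportions(toy_probs)
    print(weights)
    print(relative_abundance(weights, [3000, 1000]))
```

In the toy input, genome 0 attracts three times as many reads as genome 1 but is also assumed to be three times longer, so the length-corrected abundances come out equal. The disadvantage listed for GRAMMy shows up here as well: when two genomes give nearly identical probabilities for most reads, the responsibilities in the E-step carry little information to separate them.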