Table 1.
Methods | Species-reference (~16.7 million reads) | | | | Order-reference (~20.0 million reads) | | | |
---|---|---|---|---|---|---|---|---|
 | Accurate | Higher | Incorrect | Unassigned | Accurate | Higher | Incorrect | Unassigned |
MetaCluster-TA | 60.9% | 32.9% | 4.0% | 2.2% | 31.8% | 38.1% | 22.6% | 7.5% |
MEGAN4 (contigs) | 60.7% | 12.9% | 22.3% | 4.1% | 12.3% | 13.1% | 65.3% | 9.3% |
MEGAN4 (reads) | 57.7% | 14.8% | 4.6% | 22.8% | 0.7% | 0.7% | 3.4% | 95.2% |
* "Accurate" is the percentage of species-reference/order-reference reads annotated correctly, i.e., with the correct species/order name of the target genome; "Higher" is the percentage of species-reference/order-reference reads annotated correctly, but at a taxonomic level higher than the species/order of the target genome (e.g., reads of E. coli-reference annotated with the family name Enterobacteriaceae); "Incorrect" is the percentage of reads annotated incorrectly; "Unassigned" is the percentage of reads that cannot be assigned to any taxon.
* Running time of MetaCluster-TA is about 8 hours; running time of MEGAN4 (reads) is about 4 days; running time of MEGAN4 (contigs) is about 1 day.
* About 80% of reads can be aligned to contigs longer than 500 bp with <5% mismatches.
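
The four categories defined in the first footnote can be illustrated with a minimal Python sketch. It is not the evaluation procedure used to produce the table. The lineage representation (a list of taxon names from the root down to the target species or order) and the names `classify_annotation` and `summarize` are hypothetical, introduced only for illustration. The sketch also assumes that an assignment below the target rank (e.g., a strain name) does not occur; here it would be counted as "Incorrect".

```python
from collections import Counter
from typing import Dict, Optional, Sequence

CATEGORIES = ("Accurate", "Higher", "Incorrect", "Unassigned")


def classify_annotation(target_lineage: Sequence[str],
                        assigned: Optional[str]) -> str:
    """Place one read's annotation into one of the four table categories.

    target_lineage: taxon names from the root down to the target rank
        (species or order); the last element is the correct name.
    assigned: taxon name given to the read, or None if no taxon was assigned.
    """
    if assigned is None:
        return "Unassigned"
    if assigned == target_lineage[-1]:
        return "Accurate"
    # Correct, but at a higher taxonomic level than the target rank.
    if assigned in target_lineage[:-1]:
        return "Higher"
    return "Incorrect"


def summarize(labels: Sequence[str]) -> Dict[str, float]:
    """Return the percentage of reads falling into each category."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {cat: 0.0 for cat in CATEGORIES}
    return {cat: 100.0 * counts[cat] / total for cat in CATEGORIES}


# Illustrative lineage for an E. coli target (hypothetical input format).
ecoli = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
         "Enterobacterales", "Enterobacteriaceae", "Escherichia",
         "Escherichia coli"]

labels = [
    classify_annotation(ecoli, "Escherichia coli"),
    classify_annotation(ecoli, "Enterobacteriaceae"),
    classify_annotation(ecoli, "Bacillus subtilis"),
    classify_annotation(ecoli, None),
]
print(summarize(labels))
```

With this toy input, each of the four categories receives one read, mirroring the footnote's example of an E. coli-reference read annotated with the family name Enterobacteriaceae being counted as "Higher".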