Table 2. Taxonomic distance analysis of AMD metagenome scaffold assignments to the draft genome assemblies generated for five strains of three different genera in the AMD metagenome project.
Method | Measure | Genus L (543) | Genus T (404) | Genus F (236) | Micro average | Macro average
PPS SS | Assigned | 528 | 404 | 236 | – | –
PPS SS | Const_n_scaff | 0.92 | 0.83 | 0.97 | 0.89 | 0.91
PPS SS | Const_n_bp | 0.97 | 0.89 | 0.99 | 0.95 | 0.95
PPS SS | Tax dist | 0.96 | 1.79 | 2.22 | 1.48 | 1.65
PPS G | Assigned | 540 | 403 | 236 | – | –
PPS G | Const_n_scaff | 0.36 | 0.81 | 0.95 | 0.63 | 0.71
PPS G | Const_n_bp | 0.24 | 0.86 | 0.98 | 0.62 | 0.70
PPS G | Tax dist | 6.90 | 1.96 | 2.53 | 4.32 | 3.80
BLASTN | Assigned | 542 | 403 | 236 | – | –
BLASTN | Const_n_scaff | 0.13 | 0.13 | 0.07 | 0.12 | 0.11
BLASTN | Const_n_bp | 0.06 | 0.08 | 0.02 | 0.05 | 0.05
BLASTN | Tax dist | 9.36 | 3.78 | 4.95 | 6.56 | 6.03
MEGAN | Assigned | 337 | 272 | 194 | – | –
MEGAN | Const_n_scaff | 0.60 | 0.14 | 0.12 | 0.22 | 0.28
MEGAN | Const_n_bp | 0.58 | 0.14 | 0.11 | 0.30 | 0.28
MEGAN | Tax dist | 5.77 | 2.12 | 3.93 | 2.78 | 3.94
NBC | Assigned | 539 | 403 | 235 | – | –
NBC | Const_n_scaff | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
NBC | Const_n_bp | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
NBC | Tax dist | 10.10 | 9.44 | 11.79 | 10.16 | 10.45
The genera are Leptospirillum (L), Thermoplasmatales (T), and Ferroplasma (F). The evaluated methods are the PhyloPythiaS sample-specific model (PPS SS), the PhyloPythiaS generic model (PPS G), BLASTN, MEGAN, and the Naïve Bayesian classifier (NBC). For this analysis, the assignments provided by each method were mapped to the genus or to the corresponding clade at a higher taxonomic rank. The number in parentheses after each population name is the number of scaffolds originating from that genus. The rows show the number of assigned scaffolds (Assigned), the fraction of scaffolds assigned to either the correct clade itself or a parental clade thereof (Const_n_scaff), the fraction of base pairs in the same lineage as the correct taxon (Const_n_bp), and the average taxonomic distance of the assignments with respect to the genus-level clades of the draft reference genomes (Tax dist). See ‘Results’ for the definitions of consistency and taxonomic distance. The micro average is the average over all test scaffolds, and the macro average is the average over the three genera.
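
The difference between the two averages can be illustrated with a minimal sketch. It assumes that the micro average can be approximated by weighting each genus value by its scaffold count (the numbers in parentheses), and it uses the rounded PPS G Const_n_scaff values from the table; the function names `micro_average` and `macro_average` are illustrative and do not come from the original analysis. Because the per-genus values are already rounded, other rows may differ from the tabulated averages in the last digit.

```python
# Sketch: micro vs. macro average from per-genus values (assumed weighting).

def micro_average(values, counts):
    """Average over all scaffolds: weight each genus by its scaffold count."""
    total = sum(counts)
    return sum(v * n for v, n in zip(values, counts)) / total


def macro_average(values):
    """Average over genera: every genus counts equally."""
    return sum(values) / len(values)


# Scaffold counts for L, T, F (from the table header).
counts = [543, 404, 236]
# PPS G, Const_n_scaff for L, T, F (rounded values from the table).
values = [0.36, 0.81, 0.95]

print(round(micro_average(values, counts), 2))  # 0.63
print(round(macro_average(values), 2))          # 0.71
```

The micro average is dominated by the largest population (L, with 543 scaffolds), so a method that performs poorly on L scores lower on the micro average than on the macro average.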