eLife. 2020 Sep 15;9:e59726. doi: 10.7554/eLife.59726

Figure 2. A method for estimating the abundance of phz+ bacteria using shotgun metagenomics.

(A) Schematic representation of the computational approach. Phenazine producer abundance is calculated using the median Reads Per Kilobase (RPK) levels. (B) Validation results using simulated communities. Box plots represent different producer combinations as indicated in the legend. Each box plot represents data from simulations performed with an 80% amino-acid identity threshold and across 5–20M library coverages (n = 48 per individual box plot). The black bar marks the median. (C) Accuracy in measuring the relative abundance of multiple phz+ species in a mixed community. The y-axis shows the difference between known and estimated species abundance across the indicated library coverages, considering samples with >0.1% phz+ bacteria. (D) Example of individual simulation results with two different phz+ bacteria. Similar heights of the dark and light column portions indicate good agreement between known and estimated levels. (E) Scatter plot depicting phenazine degrader (M. fortuitum CT6) frequency estimates across a gradient of known levels in simulated metagenomes. M. fortuitum levels are estimated using either the phdA or the podA gene, shown in pink and green, respectively.
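The RPK-based calculation named in panel A can be illustrated with a minimal sketch. The caption does not give the exact formula, so the normalization used here (median RPK of phz genes divided by median RPK of reference genes) is an assumption, and the function names and read counts are hypothetical.

```python
from statistics import median


def rpk(read_count: int, gene_length_bp: int) -> float:
    """Reads Per Kilobase: mapped reads divided by gene length in kilobases."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return read_count * 1000 / gene_length_bp


def median_rpk(genes):
    """Median RPK over an iterable of (read_count, gene_length_bp) pairs."""
    return median(rpk(count, length) for count, length in genes)


# Hypothetical (read count, gene length in bp) pairs, for illustration only.
phz_genes = [(30, 600), (45, 1000), (20, 500)]
ref_genes = [(900, 900), (1500, 1500), (2400, 1200)]

# Assumed normalization: phz median RPK relative to reference-gene median RPK.
estimated_fraction = median_rpk(phz_genes) / median_rpk(ref_genes)
print(f"estimated phz+ fraction: {estimated_fraction:.3f}")
```

Taking the median rather than the mean keeps a single over- or under-covered gene from dominating the estimate.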


Figure 2—figure supplement 1. Benchmarking with mapping and library coverage parameters.


(A) Cross-validation of simulated communities spiked with phz+ Streptomyces at different frequencies. The simulated metagenomic data were mapped against a database that does not contain the specific Streptomyces producer that was spiked in, and various percent amino-acid identity thresholds were tested (as indicated in the legend). (B) Results for all simulated communities using an 80% amino-acid identity threshold, separated by library coverage.

Figure 2—figure supplement 2. Low variation in bac120 reference gene coverage across samples.


Histogram shows the coefficient of variation (CV; standard deviation/mean) calculated using the 25 chosen bac120 reference genes across all analyzed samples in this study.
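
The CV defined above (standard deviation/mean) can be computed per sample as in the following sketch. The caption does not say whether the population or the sample standard deviation was used, so the population form is assumed here, and the coverage values are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


# Hypothetical reference-gene coverage values for one sample.
coverage = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(coverage)
print(f"CV = {cv:.2f}")
```

A low CV across the reference genes suggests that their coverage is stable enough to serve as a normalization baseline.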