PLoS One. 2010 Jul 29;5(7):e11652. doi: 10.1371/journal.pone.0011652

Figure 6. Minimax contig sizes observed for simulated viral metagenome assemblies.

For a viral metagenome experiment design based on a Poisson number of species, uniformly distributed genome sizes and Pareto-distributed abundances (Poisson parameter 100 for the number of species, Uniform(50000,350000) genome sizes, Pareto parameter 3.5), read counts of 67109, 96992 and 126271 were calculated to have 95% probability of yielding assembled contigs of at least 4, 5 and 6 read lengths, respectively, for all species. In Fig. 6, we show the distribution of minimax contig sizes obtained from 100 simulations of an assembly of these numbers of reads on a pool of 100 species with Uniform(50000,350000)-distributed genome sizes and Pareto(1,3.5)-distributed abundances (solid lines) vs. their targeted sizes (dashed lines). Consistent with previous observations for this case (e.g. Figs. 3 and 4), the contig sizes obtained are generally slightly smaller than the targeted length. The median minimax contig sizes are 3.68, 4.85 and 6.16 read lengths (92–103% of the target length), and 95% of all experiments yield contigs of at least 3.38, 4.43 and 5.63 read lengths from all species (85–94% of the target length).
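The community model and the percent-of-target figures in this caption can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names are hypothetical, the number of species is fixed at 100 rather than drawn from a Poisson distribution, and the rule that a read originates from a species with probability proportional to abundance times genome size is an assumption made here for illustration. No assembly step is modeled.

```python
import random


def draw_community(n_species=100, size_lo=50_000, size_hi=350_000,
                   pareto_shape=3.5, seed=0):
    """Draw genome sizes and abundances for a simulated species pool.

    Genome sizes are Uniform(size_lo, size_hi); abundances are Pareto
    with minimum value 1 and the given shape, i.e. Pareto(1, 3.5).
    """
    rng = random.Random(seed)
    sizes = [rng.uniform(size_lo, size_hi) for _ in range(n_species)]
    # random.paretovariate(alpha) samples a Pareto variate with minimum 1.
    abundances = [rng.paretovariate(pareto_shape) for _ in range(n_species)]
    return sizes, abundances


def expected_reads(sizes, abundances, n_reads):
    """Expected number of reads per species.

    ASSUMPTION (not taken from the caption): a read comes from species i
    with probability proportional to abundance_i * genome_size_i.
    """
    weights = [s * a for s, a in zip(sizes, abundances)]
    total = sum(weights)
    return [n_reads * w / total for w in weights]


def percent_of_target(observed, targets):
    """Observed contig sizes as a percentage of their targets."""
    return [round(100 * o / t, 1) for o, t in zip(observed, targets)]


if __name__ == "__main__":
    sizes, abundances = draw_community()
    for n_reads in (67109, 96992, 126271):
        reads = expected_reads(sizes, abundances, n_reads)
        print(n_reads, "reads; fewest expected for one species:",
              round(min(reads), 1))

    targets = [4, 5, 6]
    print("median / target (%):",
          percent_of_target([3.68, 4.85, 6.16], targets))
    print("5th percentile / target (%):",
          percent_of_target([3.38, 4.43, 5.63], targets))
```

Applied to the values reported above, `percent_of_target` gives ratios that round to the 92–103% and 85–94% ranges quoted in the caption. The expected-read counts depend entirely on the stated sampling assumption and on the random seed, so they are only indicative of how unevenly reads spread across species.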