Figure - PMC

GigaScience. 2022 Mar 24;11:giac028. doi: 10.1093/gigascience/giac028

© The Author(s) 2022. Published by Oxford University Press GigaScience.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 1: Benchmarking analysis of cassava TME204 assemblies from PacBio CLRs and HiFi reads. (a) Assembly size of all resolved alleles. (b) Contig continuity measured as N50 and NG50. N50 is the length of the shortest contig in the set of largest contigs that together make up 50% of the assembly size shown in (a). NG50 is the length of the shortest contig in the set of largest contigs that together make up 50% of the haploid genome size of 750 Mb. (c) Base accuracy of contigs, measured as the sequence similarity between contigs and mapped Illumina reads and as the fraction of k-mers found in both the contigs and the Illumina reads. (d) Structural accuracy of contigs, measured as the percentage of properly paired Illumina paired-end (PE) reads. (e) Assembly completeness, measured as the percentage of mapped Illumina reads and as the fraction of reliable Illumina k-mers retained in the contigs. (f) Phred-scale quality value (QV) of contigs, calculated from the error probability P as QV = −10 × log10(P), where P is the fraction of k-mers found in the contigs but missing from the Illumina reads. (g) Completeness of resolved haplotypes, shown as Merqury copy number spectrum plots. The x-axis shows k-mer multiplicity computed from the Illumina reads. The y-axis shows the abundance of k-mers with a given multiplicity, either in the Illumina reads only (black) or in the contigs (colored by the number of times they are found in the underlying assembly). Red peaks at 45× represent resolved haplotype alleles; red peaks at 90× represent collapsed haplotype alleles. Black humps at either 40× (heterozygotes/1-copy k-mers) or 80× (homozygotes/2-copy k-mers) represent reliable Illumina k-mers missing from the contigs, corresponding to the assembly completeness in (e). Assembly-specific k-mers absent from the Illumina reads are plotted as a bar at zero k-mer multiplicity, corresponding to the error probability in (f).
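
The N50, NG50, and QV definitions in panels (b) and (f) can be illustrated with a minimal Python sketch. The function names (`nx50`, `phred_qv`) and the toy contig lengths are hypothetical and chosen only for illustration; they are not taken from the article or from any assembly tool. The QV function applies the caption's formula literally to a given error probability P.

```python
import math


def nx50(contig_lengths, target_size=None):
    """Length of the shortest contig among the largest contigs that
    together cover at least half of target_size.

    With target_size=None the total assembly size is used (N50);
    passing an assumed haploid genome size gives NG50.
    """
    total = sum(contig_lengths) if target_size is None else target_size
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # the contigs cover less than half of target_size


def phred_qv(error_probability):
    """Phred-scale quality value: QV = -10 * log10(P)."""
    if error_probability <= 0:
        return math.inf  # no observed errors
    return -10 * math.log10(error_probability)


# Toy example with made-up contig lengths (arbitrary units).
contigs = [80, 70, 50, 30, 20]
n50 = nx50(contigs)                     # half of the 250-unit assembly
ng50 = nx50(contigs, target_size=400)   # half of an assumed 400-unit genome
qv = phred_qv(0.001)                    # one erroneous k-mer in a thousand
```

In this toy case the two largest contigs already cover half of the 250-unit assembly, so N50 is 70, while reaching half of the assumed 400-unit genome requires the third contig, so NG50 is 50. NG50 falls below N50 whenever the assembly is smaller than the reference genome size, which is why the caption reports both.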