Figure - PMC

2021 Aug 13;10(8):1026. doi: 10.3390/pathogens10081026

© 2021 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Read Mapping. (A) The HPV E6/E7 (660 bp) gene segment, highlighted in blue on the circular prototypical HPV-16 genome (GenBank ID: K02718), is the target used for amplicon sequencing and genotyping. The Map Reads to Reference workflow output displays the reads mapped onto the linearized HPV reference genome. Zooming in from the whole-genome window (top) allows the sequences to be viewed down to the nucleotide level (bottom). The color-coding legend defines the corresponding read types and nucleotide mismatches. (B) NGS paired-sequence file size for each of the 155 study samples. The bar chart shows the extent of file size variation between samples. The median (▬) was 58.5 MB (range, 8.9–208.5 MB). (C) Scatterplot of merged sequences (n) against NGS paired-sequence file size (MB) for the 155 study samples, showing a near-perfect linear correlation (R² = 0.9945). The regression line (merged sequences = 3415 + 3868 × file size in MB) is shown as (▬). (D) Scatterplot of reads mapping time (s) against merged sequences (n) for the study cohort (n = 155). The two were highly correlated (R² = 0.7233), as shown by the regression line (mapping time in s = 5.7 + 4.44 × 10⁻⁵ × merged sequences) (▬). The two equations above may be used jointly to estimate total mapping time from file size, or the second may be used on its own to estimate mapping time from the number of merged sequences.
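As an illustration of how the two regression equations in panels (C) and (D) can be chained, the following is a minimal Python sketch. The coefficients are taken from the caption; the function names are hypothetical and are not part of the workflow described in the article. The estimates are only meaningful within the observed file size range (8.9–208.5 MB) and would likely differ on other hardware or datasets.

```python
def estimate_merged_sequences(file_size_mb: float) -> float:
    """Panel (C): merged sequences = 3415 + 3868 x file size (MB)."""
    return 3415 + 3868 * file_size_mb


def estimate_mapping_time_s(merged_sequences: float) -> float:
    """Panel (D): mapping time (s) = 5.7 + 4.44e-5 x merged sequences."""
    return 5.7 + 4.44e-5 * merged_sequences


def estimate_mapping_time_from_file_size(file_size_mb: float) -> float:
    """Chain both regressions: file size (MB) -> merged sequences -> time (s)."""
    return estimate_mapping_time_s(estimate_merged_sequences(file_size_mb))


if __name__ == "__main__":
    # Median paired-sequence file size reported in panel (B).
    median_mb = 58.5
    merged = estimate_merged_sequences(median_mb)
    seconds = estimate_mapping_time_from_file_size(median_mb)
    print(f"{median_mb} MB -> {merged:.0f} merged sequences -> {seconds:.1f} s")
```

For the median file size of 58.5 MB, the chained estimate comes to roughly 230,000 merged sequences and about 16 s of mapping time. Because the second regression has a lower R² (0.7233), the time estimate should be read as approximate.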