. 2019 Mar 11;17:100080. doi: 10.1016/j.bdq.2019.01.002

© 2019 The Authors

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Fig. 3. Example of derivation of a single genome amplicon NGS consensus. A) At each alignment position (columns), the reference sequence (top) is compared with the frequencies of nucleotide bases or gaps (rows) tallied from the SAM file (for ease of visualization, frequencies are presented here as a heat map). Whenever the most frequent base or gap differs from the reference (red border), the consensus sequence is modified accordingly (black boxes). B) Analyzing the CIGAR field in the SAM alignment file makes it possible to tally the sequences from the NGS reads encoded as “I”, which corresponds to “insertion to the reference” (i.e., bases present in the NGS reads that have no corresponding position in the reference). The plot depicts, at each position of the alignment (x-axis), the frequency of insertions (y-axis). Data points are color-coded by insertion length. The arrows indicate the sequences of three predominant insertions. Insertions above the operational threshold (dotted line at 50%) are followed up in downstream analysis, where C) the most common motif (in this case “TAA”) is inserted back into the new consensus sequence at the corresponding position.
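
The procedure in panels A to C can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`tally_alignment`, `build_consensus`), the simplified read representation as `(pos, cigar, seq)` tuples, and the choice of the total read count as the denominator for insertion frequency are all assumptions made for illustration. Only the CIGAR operation semantics follow the SAM format, and the 50% threshold is taken from the caption.

```python
import re
from collections import Counter, defaultdict

# CIGAR operations as defined by the SAM format.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def tally_alignment(reads, ref_len):
    """Tally bases/gaps per reference column and inserted motifs (hypothetical helper).

    reads: iterable of (pos, cigar, seq), with pos the 1-based leftmost
    reference position, as in the SAM POS field.
    Returns (columns, insertions). insertions[k] counts the motifs inserted
    after the first k reference bases.
    """
    columns = [Counter() for _ in range(ref_len)]
    insertions = defaultdict(Counter)
    for pos, cigar, seq in reads:
        ref_i = pos - 1   # 0-based index of the next reference column
        read_i = 0        # 0-based index of the next read base
        for length, op in CIGAR_RE.findall(cigar):
            n = int(length)
            if op in "M=X":          # aligned bases: consume read and reference
                for k in range(n):
                    columns[ref_i + k][seq[read_i + k]] += 1
                ref_i += n
                read_i += n
            elif op == "D":          # deletion: a gap in the read
                for k in range(n):
                    columns[ref_i + k]["-"] += 1
                ref_i += n
            elif op == "N":          # skipped region: consume reference only
                ref_i += n
            elif op == "I":          # insertion to the reference (panel B)
                insertions[ref_i][seq[read_i:read_i + n]] += 1
                read_i += n
            elif op == "S":          # soft clip: consume read only
                read_i += n
            # H and P consume neither the read nor the reference
    return columns, insertions

def build_consensus(reference, columns, insertions, n_reads, threshold=0.5):
    """Majority base per column (panel A) plus frequent insertions (panel C).

    Assumption: insertion frequency = reads carrying an insertion at that
    position divided by n_reads.
    """
    out = []
    for i in range(len(reference) + 1):
        motifs = insertions.get(i)
        if motifs and sum(motifs.values()) / n_reads > threshold:
            out.append(motifs.most_common(1)[0][0])  # most common motif
        if i < len(reference):
            counts = columns[i]
            base = counts.most_common(1)[0][0] if counts else reference[i]
            if base != "-":          # a majority gap removes the position
                out.append(base)
    return "".join(out)

# Toy data (invented for illustration, not taken from the figure).
reference = "ACGTAC"
reads = [
    (1, "3M3I3M", "ACTTAATAC"),
    (1, "3M3I3M", "ACTTAATAC"),
    (1, "6M", "ACGTAC"),
]
columns, insertions = tally_alignment(reads, len(reference))
consensus = build_consensus(reference, columns, insertions, n_reads=len(reads))
```

In this toy input, two of three reads carry a T at the third position and a “TAA” insertion after it, so both changes exceed the majority and the 50% threshold respectively and are written into the consensus. A real pipeline would more likely read the alignments with a SAM/BAM library and use per-position coverage as the denominator.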