2021 Feb 2;11(2):224. doi: 10.3390/diagnostics11020224

© 2021 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

NGS bioinformatic workflow. (A) The sequencing reaction generates millions of short reads (40 to 400 nucleotides long). The reads are processed by marking duplicates as well as barcode and adapter sequences. Individual reads are stored in a FASTQ file. They are then aligned to the reference genome, generating a BAM file. Variants are identified (called) at nucleotide positions that differ from the reference sequence and are gathered in a VCF file, which lists genomic coordinates together with the reference sequences, the putative variants and quality scores. These variants are then annotated with information gathered from various databases, including variant frequency, the gene(s) involved, gene products, predicted deleteriousness, and reported pathogenicity or benignity. They are then manually analyzed by filtering (B) and prioritization (C). Variants that have no impact on gene products, occur at high frequency in the general population, or do not segregate with the phenotype are ruled out. The remaining variants are prioritized by various criteria, for example by potential relation to the phenotype or by predicted deleteriousness.
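The filtering (B) and prioritization (C) steps described in the caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` fields, the `NO_IMPACT` consequence labels, the 1% frequency cutoff and the `deleteriousness` score are all hypothetical names and values chosen for illustration. The only part taken from a real format is the order of the first six tab-separated VCF columns (CHROM, POS, ID, REF, ALT, QUAL), and the parser assumes QUAL is numeric rather than the missing value `.`.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float
    consequence: str        # hypothetical annotation label
    pop_freq: float         # allele frequency in the general population
    segregates: bool        # does the variant segregate with the phenotype?
    deleteriousness: float  # hypothetical predicted score, higher = worse


# Hypothetical set of consequence labels treated as "no impact on gene products".
NO_IMPACT = {"synonymous", "intronic", "intergenic"}


def parse_vcf_line(line):
    """Read CHROM, POS, REF, ALT and QUAL from one VCF data line.

    Assumes QUAL holds a number; a real parser must also handle ".".
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, qual = fields[:6]
    return chrom, int(pos), ref, alt, float(qual)


def passes_filter(v, max_freq=0.01):
    """Filtering (B): rule out a variant if any exclusion criterion applies.

    max_freq is an assumed cutoff, not a value from the source.
    """
    return (
        v.consequence not in NO_IMPACT
        and v.pop_freq <= max_freq
        and v.segregates
    )


def prioritize(variants):
    """Prioritization (C): rank by predicted deleteriousness, highest first."""
    return sorted(variants, key=lambda v: v.deleteriousness, reverse=True)


# Invented example records, for illustration only.
annotated = [
    Variant("chr1", 1000, "A", "G", 60.0, "missense", 0.0001, True, 0.70),
    Variant("chr2", 2000, "C", "T", 55.0, "synonymous", 0.0002, True, 0.10),
    Variant("chr3", 3000, "G", "A", 70.0, "missense", 0.25, True, 0.90),
    Variant("chr4", 4000, "T", "C", 65.0, "stop_gained", 0.0, False, 0.95),
    Variant("chr5", 5000, "G", "T", 80.0, "stop_gained", 0.0, True, 0.95),
]

candidates = prioritize([v for v in annotated if passes_filter(v)])
for v in candidates:
    print(v.chrom, v.pos, v.ref, v.alt, v.consequence, v.deleteriousness)
```

In practice the ranking key would combine several criteria, such as the potential relation to the phenotype mentioned in the caption; a single score is used here only to keep the sketch short.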