
GigaScience. 2021 Nov 18;10(11):giab074. doi: 10.1093/gigascience/giab074


© The Author(s) 2021. Published by Oxford University Press GigaScience.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.


Figure 1: Schematics of the core algorithm and data processing steps. (a) Read depth (RD) analysis steps: parsing the alignment file, calculating and storing read depths in 100-bp intervals, binning with a user-specified bin size, correcting RD for GC bias, segmenting by mean-shift, and calling CNVs. (b) B-allele frequency (BAF) analysis steps: reading the variant file, storing data on SNPs and small indels, filtering variants with the strict mask, calculating BAF for heterozygous variants (HETs), and calculating the likelihood function for bins. Within CNVs, the BAF signal splits away from the value of 0.5 expected for HETs. (c) Distribution of variant allele frequency for all variants and for variants within the strict mask as defined by the 1000 Genomes Project. The black line shows a Gaussian fit. (d) An example of the dependence of RD on GC content within bins. Statistics of the RD signal across bins with the same GC percentage are used to correct for GC bias. The white line represents the average RD level for bins with a given GC content. (e) An example of RD and BAF signals for a germline duplication in the NA12878 sample (raw RD signal in grey, GC-corrected RD signal in black; brighter color in the BAF likelihood corresponds to higher likelihood values).
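Three of the steps named in panels (a), (b), and (d) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`bin_read_depth`, `gc_correct`, `baf`) are hypothetical, and the correction formula (scaling each bin by the global mean RD divided by the mean RD of bins with the same GC percentage) and the BAF definition (alternate-allele reads over total reads) are common conventions assumed here, not details stated in the caption.

```python
import numpy as np


def bin_read_depth(rd_100bp, bin_size):
    """Sum read depth stored in 100-bp intervals into bins of bin_size bp.

    Assumption: bin_size is a multiple of 100; an incomplete trailing
    bin is dropped.
    """
    if bin_size % 100 != 0:
        raise ValueError("bin_size must be a multiple of 100")
    per_bin = bin_size // 100
    n_bins = len(rd_100bp) // per_bin
    trimmed = np.asarray(rd_100bp, dtype=float)[: n_bins * per_bin]
    return trimmed.reshape(n_bins, per_bin).sum(axis=1)


def gc_correct(rd, gc_percent):
    """Rescale RD so bins of every GC percentage share the global mean level.

    Assumed formula: corrected = rd * global_mean / mean_rd_for_same_gc.
    """
    rd = np.asarray(rd, dtype=float)
    gc = np.asarray(gc_percent, dtype=int)
    global_mean = rd.mean()
    corrected = rd.copy()
    for g in np.unique(gc):
        mask = gc == g
        level = rd[mask].mean()  # average RD for bins with this GC content
        if level > 0:
            corrected[mask] = rd[mask] * global_mean / level
    return corrected


def baf(ref_count, alt_count):
    """B-allele frequency of one heterozygous variant (assumed definition)."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("variant has no supporting reads")
    return alt_count / total


binned = bin_read_depth([1, 2, 3, 4, 5, 6, 7], bin_size=200)
corrected = gc_correct([10, 20, 30, 60], [40, 40, 50, 50])
```

In this toy input, bins with 50% GC have a higher average depth than bins with 40% GC, and the correction brings both groups to the same mean. A balanced heterozygous variant gives `baf(10, 10) == 0.5`, the value the caption says is expected for HETs outside CNVs.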