Figure - PMC

PeerJ. 2014 Sep 30;2:e600. doi: 10.7717/peerj.600

© 2014 Warden et al.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

For each of the three datasets characterized in this study (1KG targeted exon, n = 14; 1KG exome, n = 12; SRP019719 exome, n = 15), the number of coding SNPs called per sample is plotted along the x-axis and the proportion of novel variants is plotted on the y-axis. To simplify presentation of these results, we focused on the highest-quality variant calls for each variant calling strategy: GATK UnifiedGenotyper with low-quality variants removed (UG-HQ, blue), GATK HaplotypeCaller with low-quality variants removed (HC-HQ, green), and VarScan using a custom set of conservative parameters (VarScan-Cons, red). Additionally, an unfiltered set of variants called via samtools is plotted in black. Only variants subject to GATK indel realignment and quality score recalibration (“Full Pipeline”) are considered for this comparison. The shape of each data point corresponds to the depth of on-target coverage: <50x coverage is represented as an X inside an open circle, 50–100x as an open circle, and >100x as a filled circle. If the novel percentage were tightly correlated with the actual false positive rate and the number of variants were tightly correlated with the actual sensitivity of the variant caller, then the ideal variant caller would show a cluster of data points in the bottom-right corner of the plot.
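
The two quantities encoded in the figure, the coverage-dependent marker shape and the per-sample proportion of novel variants, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the category labels, and the representation of a variant as a `(chrom, pos, ref, alt)` tuple are hypothetical. The sketch also assumes that a variant counts as novel when it is absent from a reference set of known variants, which the caption does not specify, and that depths of exactly 50x and 100x fall in the 50–100x bin.

```python
def coverage_category(depth):
    """Map on-target coverage depth to the marker category described in the caption.

    Assumption: depths of exactly 50x and 100x belong to the 50-100x bin.
    """
    if depth < 50:
        return "x-in-open-circle"   # <50x
    if depth <= 100:
        return "open-circle"        # 50-100x
    return "filled-circle"          # >100x


def novel_proportion(called_snps, known_snps):
    """Return the fraction of called SNPs that are absent from known_snps.

    Assumption: "novel" means not present in a reference set of known
    variants; variants are hypothetical (chrom, pos, ref, alt) tuples.
    """
    called = set(called_snps)
    if not called:
        return 0.0  # no calls, so no novel calls
    return len(called - set(known_snps)) / len(called)
```

Under these assumptions, a sample with many called SNPs and a low `novel_proportion` would fall toward the bottom-right corner of the plot, the region the caption describes as ideal.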