
2022 Sep 1;185(18):3426–3440.e19. doi: 10.1016/j.cell.2022.08.004


© 2022 The Authors

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Comparison of small variant calls to the phase 3 call set

(A and B) Number of SNVs (A) and INDELs (B) across the 2,504 samples in phase 3 and high-coverage datasets, stratified by AF bins and regions of the genome. Secondary y axis: % of autosomal phase 3 variants recalled in the high-coverage call set across SNVs (A) and INDELs (B) in easy and difficult regions of the genome. See also Figure S4C.

(C and D) Comparison of FDR across SNVs (C) and INDELs (D) between the high-coverage and phase 3 call sets, stratified by AF bins and regions of the genome. See also Figure S4B.

(E and F) Sample-level SNV (E) and INDEL (F) counts in the phase 3 versus high-coverage call sets, stratified by 1kGP super-population ancestry. EUR, European; AFR, African; EAS, East Asian; SAS, South Asian; AMR, American. Reported counts are at a locus level.

(G and H) Comparison of predicted functional SNV (G) and INDEL (H) counts in the high-coverage versus phase 3 call set. Log2(ratio) denotes ratio of variant counts in the high-coverage versus phase 3 call set. Top row: cohort-level comparison. Middle row: sample-level comparison. Bottom row: comparison of FDR. Red asterisks mark categories with fewer than 100 sites in sample NA12878 (i.e., categories where FDR estimation is less reliable). See also Figures S4D and S4E.
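The log2(ratio) used in (G) and (H) can be illustrated with a minimal sketch. The function name `log2_ratio` and the example counts are hypothetical and are not taken from the article; the sketch only assumes that the ratio is the high-coverage count divided by the phase 3 count, as the caption describes.

```python
import math


def log2_ratio(high_coverage_count, phase3_count):
    """Return log2(high-coverage count / phase 3 count).

    Returns None when either count is not positive, because the
    log ratio is undefined in that case. (Hypothetical helper.)
    """
    if high_coverage_count <= 0 or phase3_count <= 0:
        return None
    return math.log2(high_coverage_count / phase3_count)


# Illustrative counts only, not data from the figure.
doubled = log2_ratio(200, 100)   # twice as many variants in high-coverage
halved = log2_ratio(50, 100)     # half as many variants in high-coverage
```

Under this convention, a positive value means that the category has more variants in the high-coverage call set, a negative value means that it has fewer, and zero means that the counts are equal.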

FDR in (C), (D), (G), and (H) was estimated by comparing calls in sample NA12878 to the GIAB truth set v3.3.2. Panels (A), (B), and (E)–(H) cover chromosomes (chr) 1–22; panels (C) and (D) cover chr1–22 and X.
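As a rough illustration of how an FDR can be estimated against a truth set, the sketch below treats every call that is absent from the truth set as a false positive and computes FDR = FP / (TP + FP). This formula is an assumption, as are the function name `estimate_fdr` and the variant tuples; the caption does not give the authors' exact procedure. Comparisons against GIAB are usually also restricted to the truth set's high-confidence regions, and the sketch assumes that filtering has already been applied.

```python
def estimate_fdr(called_variants, truth_variants):
    """Estimate FDR as FP / (TP + FP) for one sample.

    Both arguments are iterables of hashable variant keys, here
    (chrom, pos, ref, alt) tuples. Returns None if no calls were made.
    (Hypothetical helper; assumes inputs are already limited to
    the truth set's high-confidence regions.)
    """
    called = set(called_variants)
    truth = set(truth_variants)
    if not called:
        return None
    false_positives = len(called - truth)  # called but not in truth set
    return false_positives / len(called)   # len(called) == TP + FP


# Illustrative variants only, not data from the article.
calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 300, "G", "A"),
    ("chr2", 400, "T", "TA"),
]
truth = [
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 300, "G", "A"),
]
fdr = estimate_fdr(calls, truth)  # one of four calls is unsupported
```

With only a handful of sites in a category, a single false positive shifts the estimate substantially, which is consistent with the caption flagging categories that have fewer than 100 sites in NA12878.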