Table 3.
Sample | HapMap Positions | HapMap Variants | HapMap Heterozygous variants | 1000GP Whole Genome Positions | 1000GP Whole Genome Variants | 1000GP Whole Genome Heterozygous variants |
---|---|---|---|---|---|---|
CEU-D | 42,964 | 11,558 | 6,568 | 24,926,557 | 14,605 | 9,595 |
CEU-M | 42,967 | 11,455 | 6,460 | 7,038,292 | 3,489 | 2,514 |
CEU-F | 43,049 | 11,461 | 6,498 | 12,227,792 | 6,030 | 4,012 |
YRI-D | 43,161 | 12,320 | 7,041 | 25,818,250 | 18,730 | 12,569 |
YRI-M | 43,219 | 12,205 | 6,843 | 2,356,922 | 1,136 | 831 |
YRI-F | 43,246 | 12,202 | 6,734 | 11,322,826 | 6,393 | 4,052 |
For each of the six samples, each gold standard comprises a set of positions within CCDS at which a genotype is given. We use the gold standards to compare these genotypes with the genotype calls we make from our capture data at the same positions. The HapMap gold standard positions are the CCDS positions that were successfully genotyped by the HapMap project using SNP arrays. The 1000 Genomes Project (1000GP) gold standard positions are those at which we obtained high-confidence genotype calls (minimum consensus quality of 100) from the 1000 Genomes Project trio pilot sequence data. For both gold standards, the Positions column gives the number of positions in the gold standard for each sample, the Variants column gives the number of gold standard genotypes that differ from the reference allele at the corresponding position, and the Heterozygous variants column gives the number of heterozygous gold standard genotypes.
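
The three column definitions can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the table: the `GoldGenotype` record, the `summarize` function, and the toy genotypes are hypothetical names and data introduced only for illustration, and the sketch assumes diploid genotypes given as a pair of alleles alongside the reference allele.

```python
from typing import Iterable, NamedTuple, Tuple


class GoldGenotype(NamedTuple):
    """Hypothetical record for one gold-standard position of one sample."""
    ref: str                   # reference allele at this position
    alleles: Tuple[str, str]   # the two alleles of the gold-standard genotype


def summarize(genotypes: Iterable[GoldGenotype]) -> Tuple[int, int, int]:
    """Return (positions, variants, heterozygous variants) for one sample."""
    positions = variants = heterozygous = 0
    for g in genotypes:
        positions += 1
        first, second = g.alleles
        # Variant: the genotype differs from the reference allele.
        if first != g.ref or second != g.ref:
            variants += 1
        # Heterozygous: the two alleles differ from each other.
        if first != second:
            heterozygous += 1
    return positions, variants, heterozygous


# Toy data (not taken from the table).
toy = [
    GoldGenotype("A", ("A", "A")),  # homozygous reference
    GoldGenotype("C", ("C", "T")),  # heterozygous variant
    GoldGenotype("G", ("T", "T")),  # homozygous variant
    GoldGenotype("T", ("A", "C")),  # heterozygous, both alleles non-reference
]
print(summarize(toy))  # (4, 3, 2)
```

Under these definitions every heterozygous genotype also counts as a variant, which matches the table: in each row the Heterozygous variants count is smaller than the Variants count.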