Table 2.
Dataset | Total SNPs | Typed SNPs | Samples | Reference haplotypes | Missing (%) |
---|---|---|---|---|---|
Sim 10K | 62 704 | 22 879 | 1000 | 10 000 | 0.5 |
Sim 100K | 80 029 | 23 594 | 1000 | 100 000 | 0.5 |
Sim 1M | 97 750 | 23 092 | 1000 | 1 000 000 | 0.5 |
1000G chr10 | 3 968 020 | 192 683 | 52 | 4904 | 0.1 |
1000G chr20 | 1 802 261 | 96 083 | 52 | 4904 | 0.1 |
HRC chr10 | 1 809 068 | 191 210 | 1000 | 52 330 | 0.1 |
HRC chr20 | 829 265 | 95 414 | 1000 | 52 330 | 0.1 |
Note: For the 1000G and HRC datasets, the typed SNPs are those that are also present on the Infinium Omni5-4 Kit. Miss % is the percentage of typed SNPs randomly masked to mimic random missing data. For ancestry estimation, we used the 50 000 most ancestry-informative SNPs in each 1000G chromosome.
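The random masking described in the note can be illustrated with a minimal sketch. This is not the authors' code: the function name `mask_typed_genotypes`, the SNP-by-sample list layout, and the choice to mask individual genotype entries uniformly at random (rather than entire SNPs) are all assumptions made for illustration.

```python
import random


def mask_typed_genotypes(genotypes, miss_percent, seed=0):
    """Return a masked copy of `genotypes` plus the masked flat positions.

    `genotypes` is a list of rows (one per typed SNP), each a list of
    per-sample genotype values. About `miss_percent` percent of all
    entries are replaced by None. Hypothetical helper, for illustration.
    """
    rng = random.Random(seed)  # fixed seed keeps the mask reproducible
    n_snps = len(genotypes)
    n_samples = len(genotypes[0]) if n_snps else 0
    n_total = n_snps * n_samples

    # Miss % is a percentage, so 0.5 means 0.5 % of entries.
    n_mask = round(n_total * miss_percent / 100)

    # Sample distinct flat positions without replacement.
    positions = rng.sample(range(n_total), n_mask)

    masked = [row[:] for row in genotypes]  # leave the input untouched
    for pos in positions:
        snp, sample = divmod(pos, n_samples)
        masked[snp][sample] = None
    return masked, sorted(positions)


if __name__ == "__main__":
    # Toy data: 200 typed SNPs x 10 samples, genotypes coded 0/1/2.
    toy = [[(i + j) % 3 for j in range(10)] for i in range(200)]
    masked, positions = mask_typed_genotypes(toy, miss_percent=0.5)
    print(len(positions), "entries masked out of", 200 * 10)
```

Returning the masked positions alongside the data keeps the original values recoverable, so that imputed genotypes at those positions can later be compared against the truth.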