Table I.
Remove SNPs with: | SNPs lostᵃ | |
---|---|---|
Projects using Illumina arrays | ||
Project (array) | Addiction (Illumina Human1M) | Lung Cancer (Illumina HumanHap550) |
Pre-release failures | ||
Missing call rate > 15% | 4,434 | 417 |
>1 discordance in replicate HapMap controls | 2,725 | 331 |
Manual review and other criteriaᵇ | 1,743 | 213 |
Total SNPs failed by Genotyping Center | 8,902 | 961 |
Post-release recommended filters | ||
MAF = 0 | 31,755 | 106 |
Missing call rate ≥ 2% | 28,800 | 7,569 |
Missing call rate ≥ 5% in one or both sexes | 0 | 0 |
>1 family with Mendelian error(s) | 835 | 486 |
>1 subject with discordant call(s) | 843 | 66 |
Sex difference in allelic frequency ≥ 0.2 | 13 | 0 |
Sex difference in heterozygosity > 0.3ᶜ | 0 | 0 |
HWE p-value < 10⁻⁴ in study controls | 2,275 | 1,242 |
MAF < 0.01 | 134,710 | 23,036 |
Initial number of SNPsᵈ | 1,049,008 | 561,466 |
Percent of SNPs lost excluding MAF filter | 7.0% | 1.9% |
Percent of SNPs lost including MAF filter | 19.8% | 6.0% |
Genome coverage at r² > 0.8 for all SNPs on the arrayᵉ | 91.2% | 87.4% |
Genome coverage at r² > 0.8 after filteringᵉ | 90.0% | 86.8% |
Projects with Affymetrix arrays | ||
Project (array) | T2D NHS (Affymetrix 6.0) | T2D HPFS (Affymetrix 6.0) |
Pre-release failures | ||
Missing call rate > 5% | 23,859 | 26,872 |
HWE p-value < 10⁻⁸ in all samples | 3,312 | 2,389 |
Plate associations (single plate p < 10⁻⁸, 2 or more p < 10⁻⁴)ᶠ | 3,380 | 5,844 |
Total SNPs failed by Genotyping Center | 30,551 | 35,105 |
Post-release recommended filters | ||
One member of each pair of duplicate probes (mostly AFFX)ᵍ | 2,839 | 2,903 |
MAF = 0 | 1,438 | 2,782 |
Missing call rate ≥ 3% | 17,802 | 15,987 |
>1 discordance in replicate samples of NA12144 | 7,121 | 5,340 |
HWE p-value < 10⁻⁴ in study controls | 540 | 513 |
MAF < 0.01 | 126,331 | 121,469 |
Initial number of SNPs | 909,622 | 909,623 |
Percent of SNPs lost excluding MAF filter | 6.6% | 6.9% |
Percent of SNPs lost including MAF filter | 20.5% | 20.2% |
Genome coverage at r² > 0.8 for all SNPs on the arrayᵉ | 80.0% | 80.0% |
Genome coverage at r² > 0.8 after filteringᵉ | 78.1% | 77.9% |
ᵃ The number of SNPs lost at each step is counted after losses at the previous steps.
ᵇ Other criteria include gender difference in missing call rate and autosomal heterozygosity, male X heterozygosity, and female Y heterozygosity.
ᶜ For autosomal and pseudo-autosomal SNPs only.
ᵈ The initial number of SNPs assayed is the total number of probes on the Illumina Human1M array (1,072,820) minus the number of intensity-only probes.
ᵉ Calculated with HapMap II data for CEU subjects [Barrett and Cardon 2006] using software by Carl Anderson (see Web resources).
ᶠ These plate association tests were conducted without adjustment for ethnicity differences among plates.
ᵍ The Affymetrix 6.0 array has 3,024 SNPs with the same ‘rs’ number and the same map position. Each of these SNPs is assayed with two different probes, one of which is ‘AFFX’, used for quality control.
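
The percentage rows in Table I can be cross-checked from the per-step counts, because each count excludes SNPs already lost at an earlier step and the counts can therefore be summed directly. The Python sketch below does this with counts copied from the table. The dictionary layout and the function name `percent_lost` are illustrative assumptions, not part of the original analysis.

```python
# Counts copied from Table I. Each count is the number of SNPs lost at that
# step after losses at all previous steps, so the steps do not overlap.
PROJECTS = {
    "Addiction (Illumina Human1M)": {
        "initial": 1_049_008,
        "pre_release": 8_902,  # total failed by the Genotyping Center
        "post_release": [31_755, 28_800, 0, 835, 843, 13, 0, 2_275],
        "maf_below_0_01": 134_710,
    },
    "Lung Cancer (Illumina HumanHap550)": {
        "initial": 561_466,
        "pre_release": 961,
        "post_release": [106, 7_569, 0, 486, 66, 0, 0, 1_242],
        "maf_below_0_01": 23_036,
    },
    "T2D NHS (Affymetrix 6.0)": {
        "initial": 909_622,
        "pre_release": 30_551,
        "post_release": [2_839, 1_438, 17_802, 7_121, 540],
        "maf_below_0_01": 126_331,
    },
    "T2D HPFS (Affymetrix 6.0)": {
        "initial": 909_623,
        "pre_release": 35_105,
        "post_release": [2_903, 2_782, 15_987, 5_340, 513],
        "maf_below_0_01": 121_469,
    },
}


def percent_lost(project, include_maf):
    """Percent of the initial SNPs lost, rounded to one decimal place."""
    lost = project["pre_release"] + sum(project["post_release"])
    if include_maf:
        # The MAF < 0.01 filter is reported separately in the table.
        lost += project["maf_below_0_01"]
    return round(100.0 * lost / project["initial"], 1)


for name, project in PROJECTS.items():
    excluding = percent_lost(project, include_maf=False)
    including = percent_lost(project, include_maf=True)
    print(f"{name}: {excluding}% excluding MAF filter, {including}% including")
```

Under these assumptions the recomputed values agree with the "Percent of SNPs lost" rows for all four projects.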