Author manuscript; available in PMC: 2011 Sep 1.
Published in final edited form as: Genet Epidemiol. 2010 Sep;34(6):591–602. doi: 10.1002/gepi.20516

Table I.

SNP failure and recommended filter criteria with results from 4 GENEVA projects.

Projects using Illumina arrays

| Remove SNPs with: | Addiction (Illumina Human1M), SNPs lost^a | Lung Cancer (Illumina HumanHap550), SNPs lost^a |
| --- | --- | --- |
| Pre-release failures | | |
| Missing call rate > 15% | 4,434 | 417 |
| > 1 discordance in replicate HapMap controls | 2,725 | 331 |
| Manual review and other criteria^b | 1,743 | 213 |
| Total SNPs failed by Genotyping Center | 8,902 | 961 |
| Post-release recommended filters | | |
| MAF = 0 | 31,755 | 106 |
| Missing call rate ≥ 2% | 28,800 | 7,569 |
| Missing call rate ≥ 5% in one or both sexes | 0 | 0 |
| > 1 family with Mendelian error(s) | 835 | 486 |
| > 1 subject with discordant call(s) | 843 | 66 |
| Sex difference in allelic frequency ≥ 0.2 | 13 | 0 |
| Sex difference in heterozygosity > 0.3^c | 0 | 0 |
| HWE p-value < 10^−4 in study controls | 2,275 | 1,242 |
| MAF < 0.01 | 134,710 | 23,036 |
| Initial number of SNPs^d | 1,049,008 | 561,466 |
| Percent of SNPs lost excluding MAF filter | 7.0% | 1.9% |
| Percent of SNPs lost including MAF filter | 19.8% | 6.0% |
| Genome coverage at r^2 > 0.8 for all SNPs on the array^e | 91.2% | 87.4% |
| Genome coverage at r^2 > 0.8 after filtering^e | 90.0% | 86.8% |

Projects with Affymetrix arrays

| Remove SNPs with: | T2D - NHS (Affymetrix 6.0), SNPs lost^a | T2D - HPFS (Affymetrix 6.0), SNPs lost^a |
| --- | --- | --- |
| Pre-release failures | | |
| Missing call rate > 5% | 23,859 | 26,872 |
| HWE p-value < 10^−8 in all samples | 3,312 | 2,389 |
| Plate associations (single plate p < 10^−8, 2 or more p < 10^−4)^f | 3,380 | 5,844 |
| Total SNPs failed by Genotyping Center | 30,551 | 35,105 |
| Post-release recommended filters | | |
| One member of each pair of duplicate probes (mostly AFFX)^g | 2,839 | 2,903 |
| MAF = 0 | 1,438 | 2,782 |
| Missing call rate ≥ 3% | 17,802 | 15,987 |
| > 1 discordance in replicate samples of NA12144 | 7,121 | 5,340 |
| HWE p-value < 10^−4 in study controls | 540 | 513 |
| MAF < 0.01 | 126,331 | 121,469 |
| Initial number of SNPs | 909,622 | 909,623 |
| Percent of SNPs lost excluding MAF filter | 6.6% | 6.9% |
| Percent of SNPs lost including MAF filter | 20.5% | 20.2% |
| Genome coverage at r^2 > 0.8 for all SNPs on the array^e | 80.0% | 80.0% |
| Genome coverage at r^2 > 0.8 after filtering^e | 78.1% | 77.9% |

^a The number of SNPs lost at each step is after losses at the previous step.

^b Other criteria include gender difference in missing call rate and autosomal heterozygosity, male X heterozygosity, female Y heterozygosity.

^c For autosomal and pseudo-autosomal SNPs only.

^d The initial number of SNPs assayed is the total number of probes on the Illumina Human1M array (1,072,820) minus the number of intensity-only probes.

^e Calculated with HapMap II data for CEU subjects [Barrett and Cardon 2006] with software by Carl Anderson (see Web resources).

^f These plate association tests were conducted without adjustment for ethnicity differences among plates.

^g The Affymetrix 6.0 array has 3,024 SNPs with the same ‘rs’ number and the same map position. Each of these SNPs is assayed with two different probes, one of which is ‘AFFX’, used for quality control.
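
The two "Percent of SNPs lost" rows can be checked against the counts in Table I. The sketch below is a minimal illustration, not the authors' code. It assumes that the percentage is (SNPs failed by the Genotyping Center + SNPs lost to post-release filters) divided by the initial number of SNPs, and that "excluding MAF filter" leaves out only the MAF < 0.01 row while MAF = 0 stays in. That reading is an assumption, although it reproduces the published percentages after rounding to one decimal place. The class name, field names, and the constant for intensity-only probes (derived from the two figures in the footnote on the initial number of SNPs) are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProjectCounts:
    """SNP counts for one project, transcribed from Table I."""
    name: str
    initial_snps: int
    pre_release_failed: int                # "Total SNPs failed by Genotyping Center"
    post_release_non_maf: Tuple[int, ...]  # post-release rows except MAF < 0.01
    maf_below_0_01: int                    # the "MAF < 0.01" row

    def percent_lost(self, include_maf_filter: bool) -> float:
        # Assumed formula: all losses divided by the initial number of SNPs.
        lost = self.pre_release_failed + sum(self.post_release_non_maf)
        if include_maf_filter:
            lost += self.maf_below_0_01
        return 100.0 * lost / self.initial_snps


PROJECTS = [
    ProjectCounts("Addiction (Illumina Human1M)", 1_049_008, 8_902,
                  (31_755, 28_800, 0, 835, 843, 13, 0, 2_275), 134_710),
    ProjectCounts("Lung Cancer (Illumina HumanHap550)", 561_466, 961,
                  (106, 7_569, 0, 486, 66, 0, 0, 1_242), 23_036),
    ProjectCounts("T2D - NHS (Affymetrix 6.0)", 909_622, 30_551,
                  (2_839, 1_438, 17_802, 7_121, 540), 126_331),
    ProjectCounts("T2D - HPFS (Affymetrix 6.0)", 909_623, 35_105,
                  (2_903, 2_782, 15_987, 5_340, 513), 121_469),
]

# Derived from the footnote on the initial number of SNPs:
# total Human1M probes minus the initial number of SNPs assayed.
INTENSITY_ONLY_PROBES = 1_072_820 - 1_049_008

for project in PROJECTS:
    excluding = project.percent_lost(include_maf_filter=False)
    including = project.percent_lost(include_maf_filter=True)
    print(f"{project.name}: {excluding:.1f}% excluding MAF filter, "
          f"{including:.1f}% including MAF filter")

print(f"Human1M intensity-only probes (derived): {INTENSITY_ONLY_PROBES:,}")
```

Under these assumptions the loop prints 7.0% and 19.8% for the Addiction project, 1.9% and 6.0% for Lung Cancer, 6.6% and 20.5% for T2D - NHS, and 6.9% and 20.2% for T2D - HPFS, matching the table.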