Author manuscript; available in PMC: 2017 Oct 18.
Published in final edited form as: Am J Addict. 2017 Jul 17;26(5):494–501. doi: 10.1111/ajad.12586

Table 2.

Post-imputation marker filtering by MAF and HWE, by self-reported census race and by genetic population assignment, for an example GWAS of unrelated individuals using 17,461,305 imputed variants.

Group   n      min. MAF   MAF fail     HWE fail    MAF/HWE fail
ALL     5880   0.0034     77,826       2,716,350   2,794,175

WHT     3002   0.0067     6,965,188    10,004      6,974,728
 EUR    2972   0.0067     7,036,314    9,308       7,045,613

BLK     1202   0.0166     3,333,134    4,169       3,337,283
 AFR    1329   0.0150     2,973,215    4,579       2,977,773

ASN     921    0.0217     9,182,668    143,288     9,325,808
 EAS    552    0.0362     10,561,171   2,829       10,563,961
 SAS    447    0.0447     10,093,522   1,774       10,095,181

HSP     357    0.0560     10,450,319   925         10,451,231
 AMR    580    0.0345     9,402,185    2,229       9,404,397

Analysis      nmax   nmin    nmedian   Total SNPs
ALL           5880   -       -         14,667,130
Census Race   5482   1,202   4,204     16,377,112
Pop. Match    5880   1,027   4,104     16,597,801

Note: SNP = single nucleotide polymorphism; n = sample size; MAF = minor allele frequency; HWE = Hardy-Weinberg equilibrium; 1KGP = 1000 Genomes Project; min. MAF = minimum minor allele frequency at which at least 40 minor alleles are observed in the sample; MAF fail = number of SNPs failing the MAF threshold; HWE fail = number of SNPs failing the HWE threshold; MAF/HWE fail = number of SNPs failing the MAF or HWE threshold; ALL = all data available for genetic analysis; SIA = self-identified census race; WHT = SIA white; EUR = matched to 1KGP European population; BLK = SIA black/African American; AFR = matched to 1KGP African population; ASN = SIA Asian; EAS = matched to 1KGP East Asian population; SAS = matched to 1KGP South Asian population; HSP = SIA Hispanic/Latino; AMR = matched to 1KGP Americas population; Census Race = total sample available for analysis using SIA; Pop. Match = total sample available for analysis using genetic population assignment. In the Analysis column, ALL is all subjects analyzed together, and Census Race or Pop. Match is each group analyzed separately and then meta-analyzed; nmax = maximum number of individuals in the analysis; nmin and nmedian = minimum and median number of individuals available per meta-analysis; Total SNPs = total SNPs available for genome-wide association analysis.
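The min. MAF column and the Total SNPs value for the ALL analysis can be reproduced from the other numbers in the table. The sketch below is illustrative only and is not the authors' code. It assumes the 40-minor-allele rule is applied to 2n alleles (diploid genotypes), which matches the tabulated values to four decimal places, and that Total SNPs for ALL is the 17,461,305 imputed variants minus the MAF/HWE failures. The function names `min_maf` and `snps_remaining` are hypothetical.

```python
# Illustrative sketch: recompute two derived quantities from Table 2.
# Assumption: n diploid individuals carry 2n alleles, so the minimum MAF
# that yields at least 40 minor alleles is 40 / (2n).

MIN_MINOR_ALLELES = 40        # threshold stated in the table note
TOTAL_IMPUTED = 17_461_305    # imputed variants stated in the caption


def min_maf(n: int, min_minor_alleles: int = MIN_MINOR_ALLELES) -> float:
    """Minimum MAF giving at least `min_minor_alleles` minor alleles among n people."""
    return min_minor_alleles / (2 * n)


def snps_remaining(total: int, failed: int) -> int:
    """SNPs left for analysis after removing those failing MAF or HWE."""
    return total - failed


# Group sizes copied from the n column of the table.
group_sizes = {
    "ALL": 5880, "WHT": 3002, "EUR": 2972, "BLK": 1202, "AFR": 1329,
    "ASN": 921, "EAS": 552, "SAS": 447, "HSP": 357, "AMR": 580,
}

for group, n in group_sizes.items():
    print(f"{group}: min. MAF = {min_maf(n):.4f}")

# MAF/HWE fail for ALL is 2,794,175 in the table.
print("ALL SNPs remaining:", snps_remaining(TOTAL_IMPUTED, 2_794_175))
```

The threshold rises as the group shrinks (0.0034 for ALL against 0.0560 for HSP), which is why the smaller groups lose far more variants to the MAF filter than the combined sample does.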