2010 Mar 8;19(13):2539–2553. doi: 10.1093/hmg/ddq102

Table 1.

Subjects in the extreme “tail” of the rare-het and rare-hom count distributions of two typical sample sets

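Columns, in the order used by the data rows (letters in parentheses refer to the footnotes):
SubjectID (a)
RHH counts and p-values (b): All SNPs Hets, All SNPs Homs, 1 Mb apart Hets, 1 Mb apart Homs
Genotype confidence (c): All 500K, Rare Hets
“Ethnic” het counts (d): YRI, CHB
Other ethnic outlier methods: WTCCC PC-MDS (e), PLINK Z-score (f)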
Sample set A (UKBS controls)
A1-1-1-1 9063 386 2003 229 0.06 0.04 356 19 YES −21.4
A2-6-2-5 3040 6 787 6 0.04 0.04 164 14 YES −5.5
A3-9-6-7 2090 1 617 1 0.04 0.04 111 5 YES −3.2
A4-3-3-3 1528 48 678 32 0.04 0.04 40 66 YES −14.5
A5-4-4-2 1315 46 659 38 0.03 0.03 50 64 YES −13.4
A6-2-5-2 1314 51 634 38 0.03 0.02 50 65 YES −13.8
A7-5-8-4 949 7 329 7 0.04 0.04 62 12 YES −2.7
A8-9-13-7 810 1 223 1 0.03 0.03 30 6 −1.2
A9-10-9-8 704 0 257 0 0.03 0.04 40 6 YES −2.0
A10-10-7-8 569 0 438 0 0.06 0.20 23 6 −1.4
A11-9-10-7 524 1 248 1 0.05 0.04 35 22 YES −3.7
A12-8-12-6 446 2 236 2 0.03 0.02 33 17 YES −7.2
A13-7-11-6 406 3 244 2 0.05 0.05 38 16 YES −6.1
A14-10-25-8 394 0 117 0 0.04 0.04 26 3 −0.9
A15-9-15-7 361 1 175 1 0.03 0.04 21 19 YES −4.2
A16-10-15-8 337 0 175 0 0.03 0.03 27 18 YES −5.7
A17-9-18-7 311 1 154 1 0.04 0.06 30 4 −1.9
A18-10-20-8 306 0 151 0 0.05 0.04 25 8 −2.0
A19-9-30-7 276 1 85 1 0.04 0.05 23 0 −0.4
A20-9-24-7 262 1 118 1 0.04 0.05 16 2 −1.6
A20-10-26-8 262 0 111 0 0.03 0.04 26 1 −1.1
A21-10-22-8 259 0 120 0 0.04 0.05 21 3 −2.2
A22-10-21-8 258 0 138 0 0.04 0.06 22 7 YES −4.7
A23-9-28-7 251 1 107 1 0.05 0.07 21 7 −0.6
A24-10-26-8 243 0 111 0 0.03 0.02 22 6 −1.0
Sample set B (58BC controls)
B1-3-1-1 2802 4 814 4 0.04 0.03 114 33 YES −6.0
B2-5-2-3 2221 2 556 2 0.03 0.03 102 9 YES −4.2
B3-4-4-2 1525 3 404 3 0.03 0.03 72 5 YES −2.4
B4-6-5-4 1068 1 317 1 0.06 0.05 49 3 YES −2.5
B5-4-12-2 600 3 180 3 0.04 0.05 33 2 −1.4
B6-7-3-5 507 0 411 0 0.07 0.22 24 7 −2.1
B7-7-18-5 488 0 137 0 0.03 0.03 35 1 −1.1
B8-6-16-4 451 1 141 1 0.04 0.04 25 2 −0.9
B9-7-11-5 442 0 185 0 0.07 0.07 22 9 −1.5
B10-5-8-3 432 2 224 2 0.06 0.06 39 17 YES −2.5
B11-2-17-2 413 7 139 3 0.02 0.04 34 2 −4.1
B12-7-15-5 384 0 151 0 0.06 0.07 24 3 −0.5
B13-7-13-5 383 0 161 0 0.06 0.08 25 1 −1.1
B14-7-22-5 352 0 108 0 0.03 0.03 22 1 −0.8
B15-7-17-5 318 0 139 0 0.05 0.08 19 3 −1.4
B16-1-18-1 313 8 122 4 0.03 0.03 28 5 −3.7
B17-7-15-5 298 0 151 0 0.03 0.03 20 5 −2.7
B18-6-24-4 294 1 98 1 0.03 0.04 14 6 −0.3
B19-7-14-5 282 0 157 0 0.05 0.06 24 8 −2.3
B20-7-7-5 269 0 233 0 0.07 0.19 14 7 −0.7
B21-7-19-5 257 0 121 0 0.03 0.03 25 3 −2.1
B22-7-25-5 255 0 96 0 0.03 0.03 23 2 −1.2
B23-7-6-5 252 0 234 0 0.05 0.19 14 4 −1.2
B24-7-20-5 248 0 118 0 0.03 0.03 19 5 −2.3
B24-7-26-5 248 0 88 0 0.04 0.04 15 5 YES −1.8

a. Subjects are sorted from highest to lowest rare-het counts for “All SNPs” (column 2); the subject ID is the sample set followed by the subject's count rank in columns 2, 3, 4 and 5.
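The ID convention in footnote a can be unpacked mechanically. The following is a minimal sketch, assuming the sample set is a single leading letter and the four ranks are hyphen-separated integers; the names `SubjectRanks` and `parse_subject_id` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple


class SubjectRanks(NamedTuple):
    """Sample set plus the count ranks encoded in a subject ID (hypothetical type)."""
    sample_set: str
    rank_all_hets: int   # rank in column 2 (All SNPs, Hets)
    rank_all_homs: int   # rank in column 3 (All SNPs, Homs)
    rank_1mb_hets: int   # rank in column 4 (1 Mb apart, Hets)
    rank_1mb_homs: int   # rank in column 5 (1 Mb apart, Homs)


def parse_subject_id(subject_id: str) -> SubjectRanks:
    """Split an ID such as 'A4-3-3-3' into its sample set and four ranks."""
    parts = subject_id.split("-")
    if len(parts) != 4 or len(parts[0]) < 2:
        raise ValueError(f"unexpected subject ID format: {subject_id!r}")
    # Assumption: the sample set is exactly one character (A or B in this table).
    sample_set, first_rank = parts[0][0], parts[0][1:]
    return SubjectRanks(sample_set, int(first_rank),
                        int(parts[1]), int(parts[2]), int(parts[3]))


print(parse_subject_id("A4-3-3-3"))
```

For subject A4-3-3-3 this gives rank 4 for all-SNP rare hets and rank 3 in each of the other three count columns, which agrees with that row's position in sample set A.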

b. Counts from all Affy500K SNPs or “thinned” to derive only from SNPs at least 1 Mb apart. Counts exceeding the permutation-derived threshold are bold and underlined (signifying p<0.001) or only underlined (signifying p<0.05).
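The thinning in footnote b can be illustrated with a simple greedy pass. This is a sketch under stated assumptions, not the authors' procedure: the table does not say how the thinned SNP set was chosen, so the left-to-right greedy rule, the function name `thin_positions`, and the example positions are all hypothetical. It would be applied to one chromosome at a time.

```python
def thin_positions(positions, min_gap=1_000_000):
    """Greedily keep base-pair positions that are at least `min_gap` apart.

    Assumption: a left-to-right greedy choice on a single chromosome.
    The paper's actual thinning rule is not given in this table.
    """
    kept = []
    for pos in sorted(positions):
        # Keep the first SNP, then any SNP far enough from the last kept one.
        if not kept or pos - kept[-1] >= min_gap:
            kept.append(pos)
    return kept


# Made-up positions, for illustration only.
example = [100, 500_000, 1_000_100, 1_900_000, 2_000_100]
print(thin_positions(example))
```

Rare-het and rare-hom counts would then be recomputed over only the kept SNPs, which is why the "1 Mb apart" counts in the table are smaller than the "All SNPs" counts.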

c. Mean BRLMM confidence for subject genotypes at all Affy500K SNPs and at all rare hets. Mean rare-het confidence above 0.1 is in bold italics to indicate doubtful genotype accuracy and a likely false-positive ethnic outlier.
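The confidence rule in footnote c amounts to a single comparison against 0.1. The sketch below applies it to a few mean rare-het confidence values copied from the table; the function name `doubtful_subjects` and the dictionary layout are assumptions made for illustration.

```python
RARE_HET_CONF_CUTOFF = 0.1  # threshold stated in footnote c


def doubtful_subjects(rare_het_conf, cutoff=RARE_HET_CONF_CUTOFF):
    """Return IDs whose mean rare-het BRLMM confidence exceeds the cutoff."""
    return [sid for sid, conf in rare_het_conf.items() if conf > cutoff]


# Mean rare-het confidence values taken from the table rows for these subjects.
rare_het_conf = {
    "A1-1-1-1": 0.04,
    "A10-10-7-8": 0.20,
    "B6-7-3-5": 0.22,
    "B20-7-7-5": 0.19,
    "B23-7-6-5": 0.19,
}
print(doubtful_subjects(rare_het_conf))
```

Of these five subjects, only A1-1-1-1 stays below the cutoff; the other four would be treated as likely false-positive outliers under footnote c.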

d. Counts of heterozygotes at “ethnic” SNPs at least 1 Mb apart. Statistically excess counts (p<0.001 or p<0.05) are denoted by bold and underline as in footnote b.

e. Subject identified by WTCCC as having “non-Caucasian ancestry” based on PC-MDS analysis (12).

f. Lowest PLINK Z-score from the 1st through 10th nearest-neighbor distributions. Z-scores are bold and underlined if statistically significant (Z<−4.0).
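The rule in footnote f can be sketched as taking the minimum over the ten nearest-neighbor Z-scores and comparing it with −4.0. The helper names below are hypothetical. `parse_z` is included because the table prints negative values with a Unicode minus sign (U+2212), which Python's `float` does not accept directly.

```python
Z_CUTOFF = -4.0  # significance threshold stated in footnote f


def parse_z(text):
    """Convert a table cell such as '−21.4' (Unicode minus) to a float."""
    return float(text.replace("\u2212", "-"))


def lowest_z(z_by_neighbor):
    """Lowest Z-score across the 1st through 10th nearest-neighbor distributions."""
    return min(z_by_neighbor)


def is_plink_outlier(z, cutoff=Z_CUTOFF):
    """True when the lowest Z-score falls strictly below the cutoff."""
    return z < cutoff


# Lowest Z-scores as printed in the table for three subjects.
for subject, cell in [("A1-1-1-1", "−21.4"), ("A3-9-6-7", "−3.2"), ("B11-2-17-2", "−4.1")]:
    print(subject, is_plink_outlier(parse_z(cell)))
```

By this rule B11-2-17-2 is a PLINK outlier although it has no WTCCC PC-MDS flag, while A3-9-6-7 carries the WTCCC flag but does not reach the PLINK cutoff.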