Figure - PMC

Author manuscript; available in PMC: 2011 May 1.

Published in final edited form as: Nat Rev Genet. 2010 May;11(5):356–366. doi: 10.1038/nrg2760

Imputation accuracy is plotted as a function of LD, measured by mean r² at a distance of 10 kb in a genome-wide dataset⁴³. Genotypes in a genome-wide study are hidden and then imputed under two different designs. In the shaded region, genotypes in each population are imputed without an external reference panel, so the information for imputing “missing” genotypes comes from other individuals in the same population. In the unshaded region, genotypes in each population are imputed using an external reference panel, chosen optimally among 36 mixtures of the HapMap CEU (European American), CHB+JPT (Chinese and Japanese), and YRI (Yoruba) panels. Color coding for populations follows that of Fig. 3. The regression lines exclude the African populations and have coefficients of determination of 0.003 (external reference) and 0.953 (internal reference). The figure shows that imputation accuracy based on an internal reference is highly correlated with LD, whereas imputation accuracy based on an external reference is not correlated with LD and instead depends on the composition of the particular reference panels available. The figure is based on the data in scenarios 1, 3, and 6 in Table 1 of Huang et al.⁶⁸.
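
As an illustration of the coefficients of determination quoted for the regression lines, the following minimal sketch computes R² for a simple least-squares line of accuracy on LD. For a one-predictor regression with an intercept, R² equals the squared Pearson correlation. The function name `r_squared` and all numbers are hypothetical placeholders chosen for illustration; they are not the values from Huang et al. or from the figure.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    With one predictor and an intercept, this equals the squared
    Pearson correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-population values (NOT data from the figure):
# mean r^2 at 10 kb, and imputation accuracy under each design.
ld = [0.2, 0.3, 0.4, 0.5]
acc_internal = [0.80, 0.85, 0.90, 0.95]  # rises steadily with LD
acc_external = [0.90, 0.92, 0.92, 0.90]  # no linear trend with LD

print("internal reference R^2:", r_squared(ld, acc_internal))
print("external reference R^2:", r_squared(ld, acc_external))
```

In this toy input, accuracy under the internal design is an exact linear function of LD, so its R² is essentially 1, while accuracy under the external design has no linear trend, so its R² is essentially 0. This mirrors, in exaggerated form, the contrast between the two regression lines described in the caption.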