Fig. 3.
Phasing and imputation based on the HRC1.1 reference panel [mostly European ancestry; Table 1 (A)] were more accurate with EagleImp (green) than with the original Eagle2/PBWT (blue) in all five test datasets HRC.AFR, HRC.AMR, HRC.EAS, HRC.EUR and HRC.SAS [Table 2 (1–5)]. The figure shows boxplots (hinges at the first and third quartiles, whiskers extending up to interquartile range, outliers plotted separately) of sample-wise switch error rates after phasing (top) and genotype error rates after imputation (bottom). Mean values are shown as red dots. The parameter K selects the K-best haplotypes from the reference for phasing; results are shown for four values of K: 10 000 (default setting in Eagle2), 16 384, 32 768 and ‘max’ (corresponding to the maximum number of haplotypes available in the reference panel). As expected, phasing and imputation of input data of European ancestry using the HRC1.1 reference panel, which consists mostly of European samples, yields the best accuracy, and this accuracy improves as K increases. For populations that do not match the predominant population of the reference panel, increasing K may lead to less accurate results. A scaled version of this figure with a stretched Y axis is provided in Supplementary Material Figure S2 to better illustrate the difference between EagleImp and Eagle2/PBWT. (A color version of this figure appears in the online version of this article.)
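
For readers unfamiliar with the metric in the top panel, a sample-wise switch error rate can be sketched as follows. This is a minimal illustration that assumes the common definition (the fraction of consecutive heterozygous site pairs whose relative phase differs from the truth); it is not taken from the EagleImp or Eagle2 code, and the function name `switch_error_rate` and the tuple-based genotype encoding are hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: one (allele on haplotype 0, allele on haplotype 1) pair per site.
Genotype = Tuple[int, int]


def switch_error_rate(truth: Sequence[Genotype], inferred: Sequence[Genotype]) -> float:
    """Fraction of consecutive heterozygous site pairs with a phase switch.

    Assumes both sequences describe the same sites of one sample in the same order.
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must cover the same sites")

    # For each informative site, record whether the inferred phase matches
    # the true phase as-is (True) or with the two haplotypes swapped (False).
    orientations: List[bool] = []
    for t, p in zip(truth, inferred):
        t, p = tuple(t), tuple(p)
        if t[0] == t[1]:
            continue  # homozygous in the truth: carries no phase information
        if p == t:
            orientations.append(True)
        elif p == (t[1], t[0]):
            orientations.append(False)
        # any other case is a genotype mismatch and is skipped in this sketch

    if len(orientations) < 2:
        return 0.0  # no pair of heterozygous sites, so no switch can occur

    # A switch is a change of orientation between neighbouring heterozygous sites.
    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    return switches / (len(orientations) - 1)


truth = [(0, 1), (0, 1), (1, 0), (0, 0), (0, 1)]
inferred = [(0, 1), (1, 0), (0, 1), (0, 0), (1, 0)]
rate = switch_error_rate(truth, inferred)
```

In this toy input the inferred haplotypes flip once, after the first heterozygous site, and stay flipped, so one of the three consecutive heterozygous pairs counts as a switch. Computing one such value per sample gives the distribution that a boxplot like the one described above would summarize.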
