Table 5.
Comparison of the Cross-Platform Performance of the Different Methods
Method | Data | Euclidean Distance: Median [1st, 3rd Quartile] | Distance (km): Median [1st, 3rd Quartile] | Relative to Full |
---|---|---|---|---|
PCA pruned | full | 2.70 [1.56, 4.38] | 238.7 [151.7, 361.7] | 1 |
PCA pruned | imputed | 6.28 [3.27, 11.0] | 513.1 [287.4, 800.2] | 2.15 |
PCA pruned | intersection | 2.92 [1.58, 5.13] | 246.2 [148.1, 420.2] | 1.03 |
SPA pruned | full | 2.57 [1.48, 4.17] | 228.4 [130.7, 350.8] | 1 |
SPA pruned | imputed | 2.60 [1.70, 4.60] | 235.4 [146.3, 381.8] | 1.03 |
SPA pruned | intersection | 3.12 [1.75, 4.65] | 257.2 [146.6, 379.2] | 1.13 |
LOCO-LD | full | 2.19 [1.40, 3.81] | 195.9 [118.7, 321.6] | 1 |
LOCO-LD | imputed | 2.66 [1.62, 4.39] | 232.3 [141.0, 365.6] | 1.19 |
LOCO-LD | intersection | 2.69 [1.57, 4.44] | 227.4 [139.7, 371.3] | 1.16 |
To simulate the localization of Illumina-genotyped samples against the POPRES Affymetrix reference data set, the genotypes of 10% of the POPRES samples were set to missing for all SNPs not contained in the Illumina 650Y array (∼80% of the SNPs). These samples (named the Illumina set) were localized with a training set consisting of the rest of the POPRES samples. “Full” denotes localization using the full Affymetrix SNP set, as in the previous experiments. “Imputed” denotes imputing the test set to the POPRES SNP set with BEAGLE prior to localization. “Intersection” denotes using only the SNPs contained in both arrays for localization. For PCA and SPA, the resulting data sets were pruned for short-range and long-range LD. Reported error measures are the same as in Table 1. “Relative to Full” gives, for each method, the ratio of the median error (in km) on the given data set to that method's median error on the full SNP set.
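As an illustration of how the “Relative to Full” column follows from the medians, the short Python sketch below recomputes it from the median distances (in km) listed in the table above. The dictionary layout and the function name `relative_to_full` are choices made for this example only and are not part of the original analysis; rounding to two decimals is assumed to match the precision shown in the table.

```python
# Median localization errors in km, copied from the table above.
median_km = {
    "PCA pruned": {"full": 238.7, "imputed": 513.1, "intersection": 246.2},
    "SPA pruned": {"full": 228.4, "imputed": 235.4, "intersection": 257.2},
    "LOCO-LD": {"full": 195.9, "imputed": 232.3, "intersection": 227.4},
}


def relative_to_full(medians):
    """Divide each data set's median error by the same method's 'full' median.

    Hypothetical helper for illustration; rounds to two decimals (assumed).
    """
    return {
        method: {
            data: round(km / by_data["full"], 2)
            for data, km in by_data.items()
        }
        for method, by_data in medians.items()
    }


ratios = relative_to_full(median_km)
for method, by_data in ratios.items():
    print(method, by_data)
```

Because each ratio is taken against the same method's own full-SNP result, the column compares data sets within a method; it does not rank the methods against one another.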