Table 3.
Identification accuracies (mean (SE) over 50 replications) when the reference population was genotyped with sequencing and the test population was genotyped with different SNP chips or with sequencing^a
Chip/SEQ | No. MBI SNPs contained^b | Imputation accuracy | KNN identification accuracy, % | RF identification accuracy, % | SVM identification accuracy, % | KSR identification accuracy, % |
---|---|---|---|---|---|---|
50K | 20 | 83.58% | 99.69 (0.01) | 97.87 (0.03) | 98.01 (0.00) | 99.16 (0.02) |
80K | 52 | 88.52% | 99.86 (0.00) | 99.19 (0.04) | 98.16 (0.00) | 99.51 (0.03) |
100K | 65 | 89.41% | 99.33 (0.01) | 98.75 (0.03) | 98.01 (0.00) | 98.84 (0.02) |
150K | 91 | 91.16% | 99.65 (0.01) | 99.04 (0.03) | 98.01 (0.00) | 99.19 (0.03) |
777K | 261 | 94.36% | 99.72 (0.00) | 99.15 (0.03) | 98.01 (0.00) | 99.30 (0.03) |
SEQ | 2,000 | – | 99.86 (0.00) | 99.24 (0.03) | 98.16 (0.00) | 99.65 (0.03) |
^a The chip genotypes were imputed to sequence level. The reference population size was 30 individuals per breed, and the 2,000 most breed-informative SNPs derived by DFI were used.
^b Number of SNPs among the 2,000 most breed-informative (MBI) SNPs derived from the reference population that were contained in the chips.
KNN: K-Nearest Neighbor; RF: Random Forest; SVM: Support Vector Machine; KSR: an integration of KNN, SVM and RF.