Table 2. Estimates of imputation accuracy, expressed as the percentage of imputed genotypes that matched the corresponding original genotypes among all randomly missing SNP genotypes, over 100 simulation runs on each SNP genotype data set (corn, rice, and wheat), with 10–90% of total observations set as randomly missing and imputed by one of three imputation methods (RF, PP, or NI).
Data Set/Missing (%) | RF M | RF aa | RF Aa | RF AA | PP M | PP aa | PP Aa | PP AA | NI M | NI aa | NI Aa | NI AA |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Corn | ||||||||||||
10 | 51.3 | 12.0 | 7.9 | 31.3 | 54.4 | 2.6 | 6.2 | 45.6 | 50.0 | 1.6 | 6.7 | 41.7 |
20 | 50.7 | 11.9 | 8.2 | 30.6 | 54.2 | 2.5 | 6.4 | 45.2 | 43.9 | 1.6 | 7.8 | 34.5 |
30 | 50.6 | 11.7 | 8.0 | 30.8 | 54.1 | 2.5 | 6.3 | 45.3 | 44.7 | 0.8 | 7.2 | 36.6 |
40 | 49.8 | 11.4 | 8.2 | 30.1 | 53.8 | 2.5 | 6.4 | 44.9 | 42.5 | 0.4 | 7.4 | 34.8 |
50 | 48.8 | 10.9 | 8.0 | 29.9 | 53.8 | 2.6 | 6.3 | 45.0 | 34.4 | 0.4 | 8.3 | 25.8 |
60 | 47.6 | 10.4 | 7.7 | 29.5 | 53.0 | 2.6 | 6.2 | 44.2 | 28.7 | 0.0 | 8.6 | 20.1 |
70 | 45.8 | 9.5 | 6.9 | 29.4 | 53.0 | 2.8 | 6.0 | 44.2 | 14.6 | 1.1 | 9.1 | 4.3 |
80 | 43.6 | 8.2 | 5.1 | 30.3 | 52.2 | 3.2 | 5.6 | 43.3 | 15.9 | 10.2 | 5.6 | 0.1 |
90 | 45.3 | 7.6 | 3.0 | 34.7 | 50.4 | 4.6 | 4.5 | 41.3 | 21.1 | 20.9 | 0.2 | 0.0 |
Asd | 1.3 | 0.5 | 0.8 | 1.3 | 1.3 | 0.4 | 0.6 | 1.2 | 1.1 | 0.3 | 0.7 | 1.0 |
Rice | ||||||||||||
10 | 74.1 | 23.1 | 0.1 | 50.9 | 68.6 | 13.1 | 0.3 | 55.2 | 62.7 | 11.0 | 0.3 | 51.4 |
20 | 73.8 | 23.0 | 0.1 | 50.7 | 68.5 | 13.1 | 0.3 | 55.2 | 56.3 | 10.8 | 0.4 | 45.1 |
30 | 73.8 | 23.0 | 0.1 | 50.7 | 68.3 | 13.0 | 0.3 | 55.0 | 51.2 | 7.0 | 0.4 | 43.9 |
40 | 73.6 | 22.9 | 0.1 | 50.5 | 68.1 | 13.1 | 0.3 | 54.8 | 38.5 | 7.1 | 0.4 | 30.9 |
50 | 73.8 | 22.9 | 0.1 | 50.9 | 68.3 | 13.2 | 0.3 | 54.9 | 29.0 | 3.5 | 0.5 | 25.1 |
60 | 73.8 | 22.7 | 0.1 | 51.0 | 66.9 | 12.6 | 0.3 | 54.0 | 13.8 | 5.0 | 0.5 | 8.3 |
70 | 73.7 | 22.7 | 0.1 | 51.0 | 67.7 | 13.2 | 0.2 | 54.2 | 5.3 | 1.9 | 0.5 | 2.9 |
80 | 72.4 | 21.9 | 0.0 | 50.5 | 66.5 | 13.1 | 0.2 | 53.2 | 15.2 | 14.9 | 0.2 | 0.1 |
90 | 65.1 | 18.2 | 0.0 | 46.9 | 61.2 | 12.7 | 0.2 | 48.3 | 28.4 | 28.3 | 0.0 | 0.0 |
Asd | 1.4 | 0.6 | 0.0 | 1.0 | 1.6 | 0.8 | 0.1 | 1.0 | 0.8 | 0.5 | 0.1 | 0.6 |
Wheat | ||||||||||||
10 | 61.4 | 12.8 | 0.1 | 48.4 | 71.5 | 3.7 | 0.2 | 67.6 | 68.6 | 2.8 | 0.3 | 65.5 |
20 | 61.4 | 12.6 | 0.1 | 48.6 | 71.7 | 3.7 | 0.2 | 67.8 | 64.3 | 2.9 | 0.3 | 61.1 |
30 | 60.9 | 12.4 | 0.1 | 48.4 | 71.5 | 3.7 | 0.2 | 67.6 | 63.8 | 1.7 | 0.3 | 61.8 |
40 | 60.8 | 12.1 | 0.1 | 48.7 | 71.2 | 3.5 | 0.2 | 67.4 | 58.4 | 1.2 | 0.3 | 56.9 |
50 | 61.6 | 12.1 | 0.1 | 49.5 | 71.1 | 3.7 | 0.2 | 67.2 | 52.7 | 0.7 | 0.3 | 51.6 |
60 | 62.8 | 11.8 | 0.1 | 50.9 | 70.5 | 3.8 | 0.2 | 66.5 | 44.6 | 0.1 | 0.4 | 44.2 |
70 | 63.9 | 11.3 | 0.1 | 52.5 | 70.7 | 4.0 | 0.2 | 66.5 | 15.7 | 0.8 | 0.4 | 14.4 |
80 | 66.0 | 10.6 | 0.0 | 55.4 | 69.8 | 4.3 | 0.2 | 65.3 | 8.3 | 7.7 | 0.3 | 0.4 |
90 | 67.0 | 8.1 | 0.0 | 58.9 | 67.0 | 4.6 | 0.1 | 62.2 | 17.2 | 17.2 | 0.0 | 0.0 |
Asd | 1.4 | 0.6 | 0.1 | 1.4 | 1.7 | 0.4 | 0.1 | 1.6 | 1.5 | 0.3 | 0.1 | 1.4 |
RF, random forest; PP, probabilistic principal component analysis; NI, nonlinear iterative partial least squares PCA; M, the percentage of the imputed genotypes matched with the corresponding original genotypes over all the random missing SNP genotypes; aa, Aa, or AA, the percentage of the specific imputed genotypes matched with the corresponding original minor (aa), heterozygous (Aa), or major (AA) genotypes over all the random missing SNP genotypes, respectively; Asd, the average of the SDs obtained for an accuracy estimate over the nine levels of missingness.
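
The accuracy measures defined above can be illustrated with a minimal sketch for a single simulation run. It is not the authors' code. The function name `imputation_accuracy`, the string genotype coding ("aa", "Aa", "AA"), and the flat-list data layout are assumptions made for illustration only. Because all four measures share the same denominator (the number of randomly missing genotypes), M is the sum of the aa, Aa, and AA percentages, which agrees with the tabulated values up to rounding.

```python
def imputation_accuracy(original, imputed, masked_positions):
    """Percentages of correctly imputed genotypes for one simulation run.

    original, imputed: sequences of genotype labels "aa", "Aa", or "AA"
        (hypothetical coding chosen for this sketch).
    masked_positions: indices of the genotypes that were set to missing
        at random and then imputed.
    """
    n_missing = len(masked_positions)
    if n_missing == 0:
        raise ValueError("no masked genotypes to evaluate")

    # Count correct imputations, grouped by the original genotype class.
    correct = {"aa": 0, "Aa": 0, "AA": 0}
    for i in masked_positions:
        if imputed[i] == original[i]:
            correct[original[i]] += 1

    # Every measure uses the same denominator: all masked genotypes.
    result = {g: 100.0 * c / n_missing for g, c in correct.items()}
    result["M"] = 100.0 * sum(correct.values()) / n_missing
    return result


# Toy example: 10 masked genotypes, 7 of them imputed correctly.
original = ["AA", "AA", "aa", "Aa", "AA", "aa", "AA", "aa", "AA", "Aa"]
imputed = ["AA", "AA", "aa", "AA", "AA", "AA", "AA", "aa", "aa", "Aa"]
accuracy = imputation_accuracy(original, imputed, list(range(10)))
print(accuracy)
```

In a full simulation, this calculation would be repeated for each of the 100 runs at each level of missingness, and the results summarized across runs.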