Table 1.
Algorithm | Ref 50* (cor) | Ref 100* (cor) | Ref 200* (cor) | Ref 300* (cor) |
---|---|---|---|---|
Low to high marker density | ||||
Beagle | 0.61 | 0.70 | 0.75 | 0.78 |
FImpute | 0.68 | 0.73 | 0.77 | 0.80 |
IMPUTE2 | 0.74 | 0.77 | 0.81 | 0.84 |
Random Forest | 0.56 | 0.61 | 0.66 | 0.69 |
Genotyping-by-sequencing-like | ||||
Beagle | 0.76 | 0.85 | 0.92 | 0.95 |
FImpute | 0.59 | 0.79 | 0.91 | 0.95 |
IMPUTE2 | 0.68 | 0.82 | 0.91 | 0.95 |
Random Forest | 0.54 | 0.64 | 0.75 | 0.83 |
Map-dependent (Beagle, FImpute, and IMPUTE2) and map-independent (Random Forest) algorithms were applied with reference population sizes of 50, 100, 200, and 300 lines out of 371. Imputation was performed for a low to high marker density scenario and for a genotyping-by-sequencing-like (GBS-like) data scenario.
*For the GBS-like imputation scenarios, Ref 50, Ref 100, Ref 200, and Ref 300 refer to missing value rates of 72.8%, 61.5%, 38.8%, and 16.1%, respectively, across all lines of the population. These rates correspond to the scenarios with reference population sizes of 50, 100, 200, and 300 out of the total of 371 lines.
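
As a rough illustration of how a GBS-like scenario and the "cor" column could be reproduced, the sketch below masks a fixed fraction of genotype calls across all lines and then correlates true and imputed values at the masked positions. The table does not state how cor was computed, so the use of Pearson correlation over masked entries is an assumption, as are the function names (`mask_genotypes`, `imputation_cor`) and the 0/2 genotype coding, all of which are hypothetical.

```python
import math
import random


def mask_genotypes(genotypes, missing_rate, seed=0):
    """Set round(missing_rate * n_cells) randomly chosen calls to None.

    Returns the masked copy and the sorted list of masked (line, marker) positions.
    Masking is spread over all lines, as in the GBS-like scenario of the footnote.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    n_mask = round(missing_rate * len(cells))
    masked = set(rng.sample(cells, n_mask))
    out = [
        [None if (i, j) in masked else g for j, g in enumerate(row)]
        for i, row in enumerate(genotypes)
    ]
    return out, sorted(masked)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def imputation_cor(true_g, imputed_g, masked):
    """Assumed accuracy metric: correlation of true vs. imputed calls at masked cells."""
    t = [true_g[i][j] for i, j in masked]
    p = [imputed_g[i][j] for i, j in masked]
    return pearson(t, p)


# Toy data: 10 lines x 20 markers, homozygous calls coded 0 or 2 (assumed coding).
true_g = [[((i + j) % 2) * 2 for j in range(20)] for i in range(10)]
masked_g, masked = mask_genotypes(true_g, missing_rate=0.728)  # Ref 50 rate from the footnote
```

Here `masked_g` would be handed to an imputation tool, and `imputation_cor(true_g, imputed_g, masked)` would then score its output; a perfect imputation gives a correlation of 1.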