2015 Mar 11;16(1):168. doi: 10.1186/s12864-015-1366-y

Table 1.

Accuracies of imputing measured as average correlations (cor) between observed and estimated marker genotypes

| Algorithm | Ref 50* (cor) | Ref 100* (cor) | Ref 200* (cor) | Ref 300* (cor) |
|---|---|---|---|---|
| Low to high marker density | | | | |
| Beagle | 0.61 | 0.70 | 0.75 | 0.78 |
| FImpute | 0.68 | 0.73 | 0.77 | 0.80 |
| IMPUTE2 | 0.74 | 0.77 | 0.81 | 0.84 |
| Random Forest | 0.56 | 0.61 | 0.66 | 0.69 |
| Genotyping-by-sequencing-like | | | | |
| Beagle | 0.76 | 0.85 | 0.92 | 0.95 |
| FImpute | 0.59 | 0.79 | 0.91 | 0.95 |
| IMPUTE2 | 0.68 | 0.82 | 0.91 | 0.95 |
| Random Forest | 0.54 | 0.64 | 0.75 | 0.83 |

Map-dependent (Beagle, FImpute, and IMPUTE2) and map-independent (Random Forest) algorithms were applied with reference population sizes of 50, 100, 200, and 300 lines out of 371. Imputation was performed for a low to high marker density scenario and for a genotyping-by-sequencing-like (GBS-like) data scenario.

*For the GBS-like imputation scenarios, Ref 50, Ref 100, Ref 200, and Ref 300 refer to missing value rates of 72.8%, 61.5%, 38.8%, and 16.1%, respectively, across all lines of the population. These rates correspond to the scenarios with reference population sizes of 50, 100, 200, and 300 of the 371 lines in total.
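
The accuracy measure used in the table, the average correlation between observed and estimated marker genotypes, can be illustrated with a minimal sketch. This is not the authors' code. It assumes that genotypes are coded numerically (for example 0, 1, 2), that a Pearson correlation is computed per marker across lines, and that the result is averaged over markers; the caption does not specify whether averaging is done over markers or over lines. The function name `mean_marker_correlation` and the toy data are hypothetical.

```python
import numpy as np


def mean_marker_correlation(observed, imputed):
    """Average per-marker Pearson correlation between two genotype matrices.

    Both inputs are (lines x markers) arrays of numeric genotype codes.
    Assumption: markers with no variation in either matrix are skipped,
    because the correlation is undefined for them.
    """
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    if observed.shape != imputed.shape:
        raise ValueError("observed and imputed must have the same shape")

    correlations = []
    for j in range(observed.shape[1]):
        x, y = observed[:, j], imputed[:, j]
        if x.std() == 0 or y.std() == 0:
            continue  # correlation undefined for a constant column
        correlations.append(np.corrcoef(x, y)[0, 1])

    return float(np.mean(correlations)) if correlations else float("nan")


# Toy example (hypothetical data): 4 lines x 3 markers, third marker constant.
observed = np.array([[0, 2, 1],
                     [1, 0, 1],
                     [2, 1, 1],
                     [0, 2, 1]])
imputed = observed.copy()  # a perfect imputation, for illustration only
accuracy = mean_marker_correlation(observed, imputed)
```

In practice the correlation would be computed only on the genotypes that were masked before imputation, so that markers already observed in the low-density or GBS-like data do not inflate the estimate.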