Table 3.
Best XGB model parameters for each data set, obtained via grid search with cross-validation
| Data set<sup>a</sup> | Subset of features | Learning rate | Individual tree depth | Data sample |
|---|---|---|---|---|
| Sim 5 | 0.01 | 0.05 | 2 | 0.05 |
| Sim 50 | 0.40 | 0.10 | 2 | 0.90 |
| Sim rand | 0.30 | 0.05 | 6 | 0.80 |
| Sim real | 0.80 | 0.20 | 2 | 0.10 |
| Real data | 0.60 | 0.30 | 6 | 0.90 |
<sup>a</sup>Sim 5 = simulated data with a 5% missing rate for all SNPs; Sim 50 = simulated data with a 50% missing rate for all SNPs; Sim rand = simulated data with a random error rate in the range of 5 to 50%; Sim real = simulated data with a random error rate in the range of 5 to 10% for 5 SNPs and 10 to 50% for 10 SNPs; Real data = data from the Finnish Rainbow Trout Breeding Program.
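
For readers who want to reuse these settings, the table can be written out as a parameter dictionary per data set. The following is a minimal sketch, not the authors' code: it assumes that "Subset of features" corresponds to XGBoost's `colsample_bytree`, "Data sample" to `subsample`, and "Individual tree depth" to `max_depth`. The names `TABLE_3` and `xgb_params` are hypothetical helpers introduced here for illustration.

```python
# Column order of Table 3, mapped to ASSUMED XGBoost parameter names.
# The paper does not state which library parameters the columns refer to.
COLUMNS = ("colsample_bytree", "learning_rate", "max_depth", "subsample")

# Values copied row by row from Table 3.
TABLE_3 = {
    "Sim 5": (0.01, 0.05, 2, 0.05),
    "Sim 50": (0.40, 0.10, 2, 0.90),
    "Sim rand": (0.30, 0.05, 6, 0.80),
    "Sim real": (0.80, 0.20, 2, 0.10),
    "Real data": (0.60, 0.30, 6, 0.90),
}


def xgb_params(dataset):
    """Return the Table 3 row for `dataset` as a keyword-argument dict."""
    return dict(zip(COLUMNS, TABLE_3[dataset]))


# One parameter dict per data set, in table order.
best_params = {name: xgb_params(name) for name in TABLE_3}
```

Under the same assumption about parameter names, a dictionary such as `xgb_params("Real data")` could be unpacked into an XGBoost estimator's constructor as keyword arguments.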