TABLE 2.
Description of each step that is performed during a single iteration of the simulation routine
| Step | Description |
|---|---|
| 1 | Sample 400 vertices from the pedigree graph by means of the ‘forest fire’ algorithm: |
| | Indirect sampling of hybrids |
| | Indirect sampling of multi-environment trials |
| 2 | Partition the sampled inbred lines into c heterotic groups by means of the Dsatur vertex coloring algorithm |
| 3 | Simulate 8 breeding cycles on each of the c heterotic groups |
| 4 | Simulate phenotypic records on the sampled hybrids |
| 5 | Reduce the number of sampled hybrids by gradually increasing CDmin |
| 6 | Reduce the number of genotyped inbred lines by means of the greedy densest k-subgraph algorithm |
| 7 | Select q SSR markers with maximal genome coverage |
| 8 | Determine the prediction accuracy of ε-SVR and BLP using the reduced set of training examples |

The goal is to find the optimal trade-off between the number of genotyped inbred lines and the size of their molecular fingerprint, when the total genotyping budget is fixed.
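
As an illustration of steps 6 and 7 and of the budget trade-off, the following minimal Python sketch may help. It rests on several assumptions that are not stated in the table: inbred lines are treated as vertices and each hybrid as an edge joining its two parents, the greedy densest k-subgraph heuristic is taken to be the common "peeling" variant that repeatedly removes the vertex of lowest degree, and the genotyping cost is assumed to be proportional to the number of lines times the number of markers. The function names `greedy_densest_k_subgraph` and `budget_options` are hypothetical and serve only this example.

```python
def greedy_densest_k_subgraph(edges, k):
    """Peel minimum-degree vertices until only k vertices remain.

    edges: iterable of (line_a, line_b) pairs, one per hybrid (assumed encoding).
    Returns the set of retained vertices (inbred lines).
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not 0 <= k <= len(adj):
        raise ValueError("k must lie between 0 and the number of vertices")

    while len(adj) > k:
        # Lowest current degree first; ties are broken by label so the
        # result is deterministic.
        victim = min(adj, key=lambda v: (len(adj[v]), v))
        for neighbour in adj.pop(victim):
            adj[neighbour].discard(victim)
    return set(adj)


def budget_options(budget, marker_counts):
    """For each fingerprint size q, return the largest k with k * q <= budget.

    Assumes one unit of budget buys one marker score on one inbred line.
    """
    return [(budget // q, q) for q in marker_counts if 0 < q <= budget]


if __name__ == "__main__":
    # Toy pedigree: a 4-clique A-B-C-D with a tail A-E-F.
    hybrids = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
               ("B", "D"), ("C", "D"), ("A", "E"), ("E", "F")]
    print(sorted(greedy_densest_k_subgraph(hybrids, 4)))  # ['A', 'B', 'C', 'D']
    print(budget_options(1200, [50, 100, 200]))  # [(24, 50), (12, 100), (6, 200)]
```

Each pair returned by `budget_options` is one candidate point on the trade-off curve: for a given fingerprint size q, the lines to genotype could then be chosen with `greedy_densest_k_subgraph(hybrids, k)`, so that as many phenotyped hybrids as possible keep both parents genotyped.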