2023 Jan 10;11:802. Originally published 2022 Jul 18. [Version 2] doi: 10.12688/f1000research.122437.2

Table 2. List of datasets. 23 .

| ID | Description | n | p |
|---|---|---|---|
| simdata | Simulated dataset used to explore GP characteristics of trait genetic complexity, population properties and dimensionality. See Methods section 2.1.1 for details. | | |
| Wheat | Real wheat dataset from Norman, Taylor 24 containing 13 traits of varying genetic complexity. These traits are referred to by abbreviations: BM: Biomass, PH: Plant Height, NDVI: Normalised Difference Vegetative Index, LL: Leaf Loss, LW: Leaf Width, GY: Grain Yield, GL: Glaucousness, GP: Grain Protein, Y: Physiological Yellows, TW: Test Weight of grains, TKW: Thousand Kernel Weight, GH: Growth Habit, GR: Greenness | 10,375 | 17,181 |
| STRUCT-simdata | Real structured RegMap panel genotype data of Arabidopsis thaliana with simulated phenotype data, used to analyse the effect of population structure | 1,307 | 15,662 |
| STRUCT-realdata | A subset of the real structured RegMap panel genotype data of Arabidopsis thaliana with real phenotype data for the sodium accumulation trait, used to analyse the effect of population structure | 300 | 169,881 |
| LD-simdata | An unstructured set of accessions from the core set of the Arabidopsis thaliana HapMap population with known genotype data and simulated phenotype data, used to study the impact of LD | 344 | 48,343 |
| LD-soy | Real soybean dataset with real phenotypes (R8, HT: height and YLD: yield) for studying the impact of low SNP-QTN LD 32 | 5,014 | 4,235 |