2022 Nov 4;13:932512. doi: 10.3389/fpls.2022.932512

Table 1.

Overview of simulation settings to create synthetic phenotypes for real Arabidopsis thaliana genotypes.

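| Sim | n | k | y | c | ϵ |
|---|---|---|---|---|---|
| A | 100 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\mathcal{N}(0, \sigma^2)$ |
| B | 500 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\mathcal{N}(0, \sigma^2)$ |
| C | 1,000 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\mathcal{N}(0, \sigma^2)$ |
| D | 2,000 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\mathcal{N}(0, \sigma^2)$ |
| E | 1,000 | 2 | $\beta_1 x_1 + \beta_2 x_2 + \beta_3\, x_1 \circ x_2 + u + \epsilon$ | $c_1 = c_2 = 0.05,\ c_3 = 0.2$ | $\mathcal{N}(0, \sigma^2)$ |
| F | 1,000 | 2 | $\beta_1 x_1 + \beta_2 x_2 + \beta_3\, x_1 \circ x_2 + u + \epsilon$ | $c_1 = c_2 = 0.01,\ c_3 = 0.28$ | $\mathcal{N}(0, \sigma^2)$ |
| G | 1,000 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\Gamma(0, \sigma^2/4)$ |
| H | 1,000 | 1 | $\beta_1 x_1 + u + \epsilon$ | $c_1 = 0.3$ | $\Gamma(0, \sigma^2)$ |
| I | 1,000 | 5 | $\sum_{i=1}^{5} \beta_i x_i + u + \epsilon$ | $c_i \sim \mathcal{N}(6, 2^2)/100,\ i = 1, \dots, 5$ | $\mathcal{N}(0, \sigma^2)$ |
| J | 1,000 | 20 | $\sum_{i=1}^{20} \beta_i x_i + u + \epsilon$ | $c_i \sim \mathcal{N}(1.5, 0.5^2)/100,\ i = 1, \dots, 20$ | $\mathcal{N}(0, \sigma^2)$ |
| K | 1,000 | 50 | $\sum_{i=1}^{50} \beta_i x_i + u + \epsilon$ | $c_i \sim \mathcal{N}(0.6, 0.2^2)/100,\ i = 1, \dots, 50$ | $\mathcal{N}(0, \sigma^2)$ |
| L | 1,000 | 100 | $\sum_{i=1}^{100} \beta_i x_i + u + \epsilon$ | $c_i \sim \mathcal{N}(0.3, 0.1^2)/100,\ i = 1, \dots, 100$ | $\mathcal{N}(0, \sigma^2)$ |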

The first column, Sim, indicates the simulation setting; n is the number of samples, k the number of causal markers, y the formula for the phenotype, c the effect size of the causal markers, and ϵ the added noise. Here, $x_i$ denotes a single marker and $x_i \circ x_j$ is the Hadamard product of two markers.
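
To make the phenotype formulas in Table 1 concrete, the following is a minimal sketch of how a phenotype with additive marker effects, an optional Hadamard interaction term, a background term u, and Gaussian noise could be assembled. It is not the authors' simulation code. The function name `simulate_phenotype`, the random binary genotype matrix, the placeholder draw for u, and all numeric values of β and σ are illustrative assumptions; the table does not state how β is derived from c or how u is generated, and the gamma-noise settings G and H are not covered here.

```python
import numpy as np


def simulate_phenotype(X, causal_idx, betas, u, noise_sd, rng, interaction_beta=None):
    """Sketch of y = sum_i beta_i * x_i (+ beta_int * x_1 ∘ x_2) + u + eps.

    All arguments are hypothetical stand-ins; nothing here is taken from
    the authors' implementation.
    """
    X = np.asarray(X, dtype=float)
    causal = X[:, causal_idx]                       # n x k matrix of causal markers
    y = causal @ np.asarray(betas, dtype=float)     # additive marker effects
    if interaction_beta is not None:
        # Hadamard (element-wise) product of the first two causal markers
        y = y + interaction_beta * causal[:, 0] * causal[:, 1]
    eps = rng.normal(0.0, noise_sd, size=X.shape[0])  # Gaussian noise N(0, sigma^2)
    return y + np.asarray(u, dtype=float) + eps


rng = np.random.default_rng(0)
n, m = 1000, 50
X = rng.integers(0, 2, size=(n, m))   # assumed binary genotypes, not real A. thaliana data
u = rng.normal(0.0, 1.0, size=n)      # placeholder background term (assumption)

# Single causal marker, in the shape of settings A to D
y_single = simulate_phenotype(X, [3], [0.5], u, 1.0, rng)

# Two causal markers plus their interaction, in the shape of settings E and F
y_pair = simulate_phenotype(X, [3, 7], [0.2, 0.2], u, 1.0, rng, interaction_beta=0.8)
```

Passing u and the noise scale as arguments keeps the sketch agnostic about how the background and the effect sizes c are calibrated, which would need to follow the paper's methods section rather than this table.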