#In our simulations, let:

#1. Y represent a continuous phenotype measure.
#2. X represent the allele frequency of a particular SNP.
#3. PC1 and PC2 represent the first two principal components of the genetic relatedness matrix of the colonies.
#4. E represent a normally distributed random error term with mean 0 and variance σ^2.

#We generate observations according to the linear regression model:

#    Y = β0 + β1*X + β2*PC1 + β3*PC2 + E        (1)

#Under this regression model, σ^2 quantifies the residual variance of Y.

#We first generate n = 9 observations of X, PC1, and PC2. The values of X are drawn 
#from a uniform distribution between 0 and 0.5. PC1 and PC2 are drawn from normal distributions 
#and then transformed so that they are orthonormal (unit length and mutually orthogonal), consistent 
#with the behavior of principal components. We fix the seed of the random number generator so that 
#the simulation is reproducible.

## set seed of random number generator
set.seed(1)

## sample size
n = 9

## simulate orthonormal principal components (normalize PC1, then Gram-Schmidt on PC2)
PC1 = rnorm(n)
PC1 =  PC1 / sqrt(sum(PC1^2))
PC2 = rnorm(n)
PC2 = PC2 - sum(PC1 * PC2) * PC1
PC2 = PC2 / sqrt(sum(PC2^2))
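
## Worked check, added for illustration, of why the steps above give orthonormal
## vectors. Write u for the normalized PC1 and v for the raw draw of PC2, with
## "." denoting the inner product sum(u * v):
##   u . u = 1                                   (PC1 was divided by its length)
##   v' = v - (u . v) u                          (projection step)
##   u . v' = u . v - (u . v)(u . u) = 0         (so PC1 and PC2 are orthogonal)
##   v' / sqrt(v' . v') has unit length          (final normalization of PC2)
## Numerically, sum(PC1 * PC2) is then zero only up to floating-point rounding.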

## simulate allele frequency for one SNP 
X = runif(n, 0, 0.5)

#Next, for different values of the residual standard deviation σ (and hence of the residual variance σ^2), 
#we generate different sets of n = 9 values of E. For each set, we generate a corresponding set of values 
#of Y according to model (1). We then fit a linear model regressing Y on X, PC1, and PC2, and record the 
#resulting p-value for the significance of X.

## values of sigma, the residual standard deviation (the residual variance is sigma^2)
sigmas = c(1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11)

## store p-values
pvals = rep(NA_real_, length(sigmas))
for (i in seq_along(sigmas)) {
  ## generate data using sigma = ith value of sigmas
  sigma = sigmas[i]

  ## generate epsilon
  epsilon = rnorm(n, sd = sigma)

  ## generate Y according to model (1), with β0 = 0 and β1 = β2 = β3 = 1;
  ## epsilon plays the role of E
  Y = X + PC1 + PC2 + epsilon
  
  ## fit regression model
  fit = lm(Y ~ X + PC1 + PC2)

  ## save p-value corresponding to X (row "X", column "Pr(>|t|)" of the coefficient table)
  pvals[i] = summary(fit)$coefficients["X", "Pr(>|t|)"]
}

#The simulation can be repeated using different values of X, PC1, PC2, and E, by changing the seed for 
#the random number generator, and the same qualitative results will hold (S2).
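
## A minimal sketch of how the repetition described above could be scripted.
## The function name simulate_pvals and its arguments are hypothetical (they are
## not part of the original analysis). The body repeats the steps above, drawing
## random numbers in the same order. Note that it calls set.seed(), so it resets
## the global random number generator as a side effect.
simulate_pvals = function(seed, sigmas, n = 9) {
  set.seed(seed)

  ## orthonormal principal components, as above
  PC1 = rnorm(n)
  PC1 = PC1 / sqrt(sum(PC1^2))
  PC2 = rnorm(n)
  PC2 = PC2 - sum(PC1 * PC2) * PC1
  PC2 = PC2 / sqrt(sum(PC2^2))

  ## allele frequency for one SNP
  X = runif(n, 0, 0.5)

  ## one p-value for X per value of sigma
  pvals = sapply(sigmas, function(sigma) {
    Y = X + PC1 + PC2 + rnorm(n, sd = sigma)
    fit = lm(Y ~ X + PC1 + PC2)
    summary(fit)$coefficients["X", "Pr(>|t|)"]
  })

  data.frame(seed = seed, sigma = sigmas, p_value = pvals)
}

## example usage: rerun the simulation for three seeds and stack the results
results = do.call(rbind, lapply(1:3, simulate_pvals, sigmas = c(1, 1e-3, 1e-6)))
print(results)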