Scientific Reports. 2021 Nov 10;11:22306. doi: 10.1038/s41598-021-00399-z

Publisher Correction: Robustification of GWAS to explore effective SNPs addressing the challenges of hidden population stratification and polygenic effects

Zobaer Akond 1,2,4,#, Md Asif Ahsan 1,#, Munirul Alam 3, Md Nurul Haque Mollah 1
PMCID: PMC8580988  PMID: 34759286

Correction to: Scientific Reports 10.1038/s41598-021-90774-7, published online 22 June 2021

The original version of this Article contained typographical errors in the Equations.

In Table 1, fourth column, the Equation “Polygenic effect variation” was incorrect:

$\mathrm{var}\left(\sum_{k-m_1+1}^{m_2} b_k Z_k\right)$

now reads:

$\mathrm{var}\left(\sum_{k=m_1+1}^{m_2} b_k Z_k\right)$

In the Materials and methods section, under the subheading “Robustification of LMM based GWAS by using the outlier modification rule (proposed)”,

“$b$ is the vector of random polygenic effects which follows $N(0, \sigma_g^2 K)$, where $\sigma_g^2$ is the polygenic variance component, and $K = (k_{ijt})$ is the $m \times m$ genomic relationship matrix.”

now reads:

“$b$ is the vector of random polygenic effects which follows $N(0, \sigma_g^2 K)$, where $\sigma_g^2$ is the polygenic variance component, and $K = (k_{jt})$ is the $m \times m$ genomic relationship matrix.”

And,

“When the maximum likelihood (ML) or restricted maximum likelihood (REML) variance component $\hat{V} = \hat{\sigma}_g^2 K + \hat{\sigma}_\varepsilon^2 I$ is estimated, the classical F-statistic for testing the null hypothesis $Ma = 0$ for an arbitrary full-rank $p \times q$ matrix $M$ (refs. 13, 60).”

now reads:

“When the maximum likelihood (ML) or restricted maximum likelihood (REML) variance component $\hat{V} = \hat{\sigma}_g^2 ZKZ + \hat{\sigma}_\varepsilon^2 I$ is estimated, the classical F-statistic for testing the null hypothesis $Ma = 0$ for an arbitrary full-rank $p \times q$ matrix $M$ (refs. 13, 60).”

And,

“(ii) Divide the phenotypic data into m groups corresponding to the m genotypic labels of the selected most significant SNP. For example, let

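$$y = (y_1, y_2, \ldots, y_n) = (y_{11}, \ldots, y_{1n_1}, \ldots, y_{m1}, \ldots, y_{mn_m})$$

be the partition of phenotypic observations corresponding to the selected SNP, where, $n = n_1 + n_2 + \cdots + n_k$.

(iii) Detect the outlying observations from the $l$th ($l = 1, 2, \ldots, k$) group using the β-weight function defined by

$$W_\beta(y_{li} \mid \hat{\theta}_l) = \exp\left\{-\frac{\beta}{2\sigma_{li}^2}\,(y_{li} - \hat{\mu}_l)^2\right\} \qquad (4)$$

where $i = 1, 2, \ldots, n$”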

now reads:

“(ii) Divide the phenotypic data into m groups corresponding to the m genotypic labels of the selected most significant SNP. For example, let

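$$y = (y_1, y_2, \ldots, y_n) = (y_{11}, \ldots, y_{1n_1}, \ldots, y_{m1}, \ldots, y_{mn_m})$$

be the partition of phenotypic observations corresponding to the selected SNP, where, $n = n_1 + n_2 + \cdots + n_m$.

(iii) Detect the outlying observations from the $l$th ($l = 1, 2, \ldots, k$) group using the β-weight function defined by

$$W_\beta(y_{li} \mid \hat{\theta}_l) = \exp\left\{-\frac{\beta}{2\sigma_{li}^2}\,(y_{li} - \hat{\mu}_l)^2\right\} \qquad (4)$$

where $i = 1, 2, \ldots, n_l$”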

Additionally, Equation 7 was incorrect:

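“An outlying phenotypic observation $y_{lk}$ in the $l$th group is defined based on the β-weight function mentioned below:

$$W_\beta(x_{jk} \mid \hat{\theta}_j, \beta) = \begin{cases} > \tau_j, & \text{if } x_{jk} \text{ is not an outlier} \\ \le \tau_j, & \text{if } x_{jk} \text{ is an outlier} \end{cases} \qquad (7)$$

where the threshold value $\tau_l$ is the $p$th quantile value of the empirical distribution of $W_\beta(x_{li} \mid \hat{\theta}_l, \beta)$.”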

now reads:

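“An outlying phenotypic observation $y_{li}$ in the $l$th group is defined based on the β-weight function mentioned below:

$$W_\beta(y_{li} \mid \hat{\theta}_j, \beta) = \begin{cases} > \tau_j, & \text{if } y_{li} \text{ is not an outlier} \\ \le \tau_j, & \text{if } y_{li} \text{ is an outlier} \end{cases} \qquad (7)$$

where the threshold value $\tau_l$ is the $p$th quantile value of the empirical distribution of $W_\beta(y_{li} \mid \hat{\theta}_l, \beta)$.”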
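
As a rough illustration of the rule in Equations 4 and 7, the following Python sketch computes β-weights for one genotype group and flags observations whose weight is at or below the pth quantile of the weights. The function names, the default values of `beta` and `p`, and the use of the median and a MAD-based scale in place of the fitted parameters $\hat{\theta}_l$ are assumptions made here for illustration, not the Article's estimators. A single per-group variance also stands in for $\sigma_{li}^2$.

```python
import numpy as np


def beta_weights(y, mu, sigma2, beta=0.2):
    """Equation 4: exp{-beta / (2 * sigma2) * (y - mu)^2} for each observation."""
    y = np.asarray(y, dtype=float)
    return np.exp(-beta / (2.0 * sigma2) * (y - mu) ** 2)


def flag_outliers(y, beta=0.2, p=0.05):
    """Flag observations in one genotype group following Equation 7.

    Assumption: location and scale are estimated by the median and the
    scaled MAD, which are stand-ins for the fitted group parameters.
    """
    y = np.asarray(y, dtype=float)
    mu = np.median(y)
    mad = np.median(np.abs(y - mu))
    sigma2 = (1.4826 * mad) ** 2
    if sigma2 == 0.0:
        raise ValueError("scale estimate is zero; weights are undefined")
    w = beta_weights(y, mu, sigma2, beta)
    tau = np.quantile(w, p)      # threshold: pth quantile of the weights
    return w <= tau, w, tau      # True marks an outlier (weight <= tau)


# Hypothetical phenotype values for one genotype group.
group = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0, 10.1])
flags, weights, tau = flag_outliers(group)
print(group[flags])
```

Because the threshold is a quantile of the weights themselves, roughly a fraction p of each group is flagged by construction, so p behaves as a tuning parameter rather than a significance level.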

In Table 3, fourth column, the Equation “Polygenic effect variation” was incorrect:

$\mathrm{var}\left(\sum_{j=1}^{m_2} b_k x_k\right)$

now reads:

$\mathrm{var}\left(\sum_{k=m_1+1}^{m_2} b_k z_k\right)$

Under the subheading “Genotype simulation”,

“For this purpose, $m = 2000$ SNPs were generated for $n = 1000$ individuals, and these individuals were taken from $k = 3$ distinct population by considering different minor allele frequencies (MAFs). To do this, first, a set of $m$ latent vectors $\{v_1, v_2, \ldots, v_m\}$ was generated from a multivariate normal distribution with mean zero and variance–covariance matrix $\mathrm{Cov}(v_j, v_k) = \rho^{|j-k|}$ (refs. 64, 65). In our simulation, we considered $\rho = 0.5$ to avoid the linkage disequilibrium (LD) between the SNPs. Finally, two cutoff values $s_1$ and $s_2$ were used to convert the design matrix $V = [v_1, v_2, \ldots, v_m] = [v_{ij}]$ of latent vectors to the genotypic score matrix $x_{ij}$ $(i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_1)$ and $z_{ij}$ $(i = m_1, 1, \ldots, n,\ j = 1, 2, \ldots, m_2)$ as follows:

$$x_{ij}, z_{ij} = \begin{cases} -1, & v_{ij} < s_1 \\ 0, & s_1 \le v_{ij} \le s_2 \\ 2, & v_{ij} > s_2 \end{cases}$$

where s1 and s2 determine the minor allele frequency.”

now reads:

“For this purpose, $m^* = 2000$ SNPs were generated for $n = 1000$ individuals, and these individuals were taken from $k = 3$ distinct population by considering different minor allele frequencies (MAFs). To do this, first, a set of latent vectors $\{v_1, v_2, \ldots, v_{m^*}\}$ was generated from a multivariate normal distribution with mean zero and variance–covariance matrix $\mathrm{Cov}(v_j, v_k) = \rho^{|j-k|}$ (refs. 64, 65). In our simulation, we considered $\rho = 0.5$ to avoid the linkage disequilibrium (LD) between the SNPs. Finally, two cutoff values $s_1$ and $s_2$ were used to convert the design matrix $V = [v_1, v_2, \ldots, v_{m^*}] = [v_{ij}]$ of latent vectors to the genotypic score matrix $x_{ij}$ $(i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_1)$ and $z_{ij}$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m_2)$ as follows:

$$x_{ij}, z_{ij} = \begin{cases} 0, & v_{ij} < s_1 \\ 1, & s_1 \le v_{ij} \le s_2 \\ 2, & v_{ij} > s_2 \end{cases}$$

where s1 and s2 determine the minor allele frequency.”
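
The corrected genotype simulation can be sketched in Python along the following lines. This is a minimal sketch rather than the authors' code. The function name `simulate_genotypes`, the AR(1) recursion used to obtain $\mathrm{Cov}(v_j, v_k) = \rho^{|j-k|}$, and the choice of cutoffs from Hardy-Weinberg proportions for a target MAF are all assumptions introduced here, since the Article only states that $s_1$ and $s_2$ determine the minor allele frequency.

```python
from statistics import NormalDist

import numpy as np


def simulate_genotypes(n=1000, m=2000, rho=0.5, maf=0.3, seed=0):
    """Simulate an n x m matrix of 0/1/2 genotype scores from latent normals."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((n, m))
    v = np.empty((n, m))
    v[:, 0] = e[:, 0]
    scale = np.sqrt(1.0 - rho ** 2)
    # AR(1) recursion: unit variances and Cov(v_j, v_k) = rho^|j-k|.
    for j in range(1, m):
        v[:, j] = rho * v[:, j - 1] + scale * e[:, j]
    # Assumption: cutoffs chosen so that P(0) = (1 - maf)^2 and P(2) = maf^2.
    s1 = NormalDist().inv_cdf((1.0 - maf) ** 2)
    s2 = NormalDist().inv_cdf(1.0 - maf ** 2)
    # Corrected coding: 0 if v < s1, 1 if s1 <= v <= s2, 2 if v > s2.
    g = np.where(v < s1, 0, np.where(v > s2, 2, 1))
    return g, (s1, s2)


# Small demonstration; the Article uses n = 1000 and m* = 2000.
g, (s1, s2) = simulate_genotypes(n=500, m=50, maf=0.3, seed=1)
m1 = 4                      # causal SNPs, as in the phenotype simulation
x, z = g[:, :m1], g[:, m1:]  # causal and polygenic score matrices
print(x.shape, z.shape, g.mean() / 2.0)
```

To mimic the three populations described in the text, the function could be called once per population with a different `maf` and the resulting rows stacked; how the Article assigns MAFs to populations is not stated in this notice.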

Furthermore, in the section “Phenotype simulation,”

“In every situation, $m_1 = 4$ SNPs were considered as causal variants and the remaining $m_2 = m - m_1 = 1996$ SNPs were allocated as polygenic variants (effects).”

now reads:

“In every situation, $m_1 = 4$ SNPs were considered as causal variants and the remaining $m_2 = m^* - m_1 = 1996$ SNPs were allocated as polygenic variants (effects).”

Lastly, in the section “Consequence of phenotypic outliers on the partition of total variations”,

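$$h = \frac{\mathrm{var}\left(\sum_{k=1}^{m_1} a_k x_k\right)}{\mathrm{var}(y)} > \frac{\mathrm{var}\left(\sum_{k=1}^{m_1} a_k x_k\right)}{\mathrm{var}(y)} = h^2$$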

now reads:

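$$h^2 = \frac{\mathrm{var}\left(\sum_{k=1}^{m_1} a_k x_k\right)}{\mathrm{var}(y)} > \frac{\mathrm{var}\left(\sum_{k=1}^{m_1} a_k x_k\right)}{\mathrm{var}(y)} = h^2$$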

The original Article has been corrected.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
