2020 May 28;107(1):46–59. doi: 10.1016/j.ajhg.2020.05.004

© 2020 American Society of Human Genetics.

Overview of Non-Parametric Shrinkage (NPS)

(A) For unlinked markers, NPS partitions SNPs into $K$ subgroups by splitting the GWAS effect sizes ($\hat{\beta}_{j}$) at cut-offs $b_{0}, b_{1}, \dots, b_{K}$. Partitioned risk scores $G_{ik}$ are calculated for each partition $k$ and individual $i$ using an independent genotype-level training cohort. The per-partition shrinkage weights $\omega_{k}$ are determined by the separation of $G_{ik}$ between training case subjects and control subjects. Estimating the per-partition shrinkage weights is a far easier problem than estimating per-SNP effects: the training sample size is small but still larger than the number of partitions, whereas for per-SNP effects the GWAS sample size is considerably smaller than the number of markers in the genome. This procedure “shrinks” the estimated effect sizes without relying on any specific assumption about the distribution of true effect sizes.
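The partition-and-reweight idea in panel (A) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the published NPS implementation. It makes three assumptions that the caption does not state: SNPs are binned on the absolute value of the estimated effect, the last cut-off $b_{K}$ is infinite, and "separation" is measured as the case-control difference in mean $G_{ik}$ divided by the pooled within-group variance. The function names are hypothetical.

```python
import numpy as np

def partition_snps(beta_hat, cutoffs):
    """Label each SNP with the partition k where cutoffs[k] <= |beta_hat| < cutoffs[k+1].

    Assumption: binning is on the absolute effect size, and cutoffs is an
    increasing sequence b_0, ..., b_K with b_0 = 0 and b_K = inf.
    """
    # np.digitize returns 1..K for values inside [b_0, b_K); shift to 0..K-1.
    return np.digitize(np.abs(beta_hat), cutoffs) - 1

def partitioned_scores(genotypes, beta_hat, labels, n_partitions):
    """Compute G[i, k]: the risk score of individual i using only SNPs in partition k."""
    scores = np.zeros((genotypes.shape[0], n_partitions))
    for k in range(n_partitions):
        in_k = labels == k
        scores[:, k] = genotypes[:, in_k] @ beta_hat[in_k]
    return scores

def shrinkage_weights(scores, is_case):
    """Per-partition weight from case/control separation in the training cohort.

    Assumption: separation = (mean difference) / (pooled within-group variance).
    Partitions with zero variance (for example, empty ones) get weight 0.
    """
    diff = scores[is_case].mean(axis=0) - scores[~is_case].mean(axis=0)
    pooled_var = 0.5 * (scores[is_case].var(axis=0) + scores[~is_case].var(axis=0))
    return np.divide(diff, pooled_var, out=np.zeros_like(diff), where=pooled_var > 0)

def reweighted_score(scores, weights):
    """Final polygenic score: weighted sum of the partitioned scores."""
    return scores @ weights
```

Under these assumptions only $K$ weights are fitted in the training cohort, which is why a modest training sample suffices; every SNP in partition $k$ has its estimated effect rescaled by the same factor $\omega_{k}$.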

(B) For markers in LD, genotypes and estimated effects are first decorrelated by a linear projection $P$ in non-overlapping windows of ∼2.5 Mb in length, and NPS is then applied to the projected data. The size of the black dots indicates genotype frequencies in the population. Before projection, genotypes at SNPs 1 and 2 are correlated due to LD ($D$), and thus the sampling errors of the estimated effects ($\hat{\beta}_{j} \mid \beta_{j}$) are also correlated between adjacent SNPs. The projection $P$ neutralizes both correlation structures. The axes of projection are marked by red dashed lines. $\beta_{j}$ denotes the true genetic effect at SNP $j$, and $N_{g}$ is the sample size of the GWAS cohort.
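One way to build a projection with the property described in panel (B) is a whitening transform derived from the eigendecomposition of the window's LD matrix. The sketch below assumes this construction; the caption itself does not specify how $P$ is obtained. It further assumes standardized genotypes, so that $D$ is a correlation matrix and the sampling covariance of the estimated effects within a window is approximately $D / N_{g}$. The function names and the eigenvalue threshold are hypothetical.

```python
import numpy as np

def decorrelating_projection(D, min_eigenvalue=1e-8):
    """Return P such that P @ D @ P.T is the identity on the retained axes.

    Assumption: P = Lambda^{-1/2} Q^T from the eigendecomposition D = Q Lambda Q^T.
    Axes with eigenvalues at or below min_eigenvalue are dropped for stability.
    """
    eigvals, eigvecs = np.linalg.eigh(D)  # D is symmetric, so eigh applies
    keep = eigvals > min_eigenvalue
    # Scale each retained eigenvector (column) by 1/sqrt(eigenvalue), then transpose.
    return (eigvecs[:, keep] / np.sqrt(eigvals[keep])).T

def project_window(genotypes, beta_hat, D):
    """Project one window's standardized genotypes and estimated effects.

    genotypes: (n_individuals, n_snps) array for the window.
    beta_hat:  (n_snps,) estimated effects for the same SNPs.
    """
    P = decorrelating_projection(D)
    projected_genotypes = genotypes @ P.T  # columns are uncorrelated if cov = D
    projected_effects = P @ beta_hat       # sampling errors have covariance I / N_g
    return projected_genotypes, projected_effects
```

Because $P D P^{\top} = I$, the same matrix removes the correlation between genotypes and the correlation between sampling errors, so the projected quantities in each window can be handled like the unlinked markers of panel (A).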