2020 May 28;107(1):46–59. doi: 10.1016/j.ajhg.2020.05.004

Table 3.

Accuracy of Polygenic Prediction in Independent Validation Cohorts

| Discovery GWAS | Training (UK Biobank) | Validation (Partners) | Method | R² (Nagelkerke) | 5% Tail OR |
|---|---|---|---|---|---|
| Breast cancer 2017 (n = ~230,000) | n = 3,956/3,956 | n = 754/8,324 | P+T | 0.016 | 1.56 |
| | | | LDPred | 0.015 | 1.78 |
| | | | PRS-CS | 0.034 | 2.23 |
| | | | NPS | 0.034 | 2.32 |
| Inflammatory bowel disease (n = ~35,000) | n = 2,483/2,483 | n = 839/16,000 | P+T | 0.050 | 3.57 |
| | | | LDPred | 0.038 | 3.07 |
| | | | PRS-CS | 0.065 | 4.11 |
| | | | NPS | 0.069 | 4.32 |
| Type 2 diabetes (n = ~160,000) | n = 7,298/7,298 | n = 2,026/14,813 | P+T | 0.038 | 2.10 |
| | | | LDPred | 0.046 | 2.51 |
| | | | PRS-CS | 0.058 | 2.80 |
| | | | NPS | 0.054 | 2.97 |
| Coronary artery disease (n = ~330,000) | n = 2,000/2,000 | n = 268/7,107 | P+T | 0.018 | 2.72 |
| | | | LDPred | 0.016 | 2.31 |
| | | | PRS-CS | 0.027 | 3.16 |
| | | | NPS | 0.025 | 4.10 |

Non-parametric shrinkage (NPS) and PRS-CS outperform both pruning and thresholding (P+T) and LDPred in completely independent validation cohorts drawn from a US white population (Partners Biobank). The prediction models were trained on the same UK Biobank cohorts as in Table 2. The tail odds ratio (OR) is the odds of being a case rather than a control among individuals in the 5% tail of the polygenic score distribution, relative to the same odds among the remaining individuals. For coronary artery disease (CAD) and type 2 diabetes (T2D), all prediction models were trained and validated with sex as a covariate to account for the difference in disease prevalence by sex.
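
The tail odds ratio defined above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the "5% tail" means the highest-scoring 5% of the validation cohort and computes an unadjusted OR from a 2×2 table of case and control counts, with no covariates such as sex. The function name `tail_odds_ratio` and its parameters are hypothetical.

```python
def tail_odds_ratio(scores, is_case, tail_fraction=0.05):
    """Unadjusted odds ratio of case status in the top score tail vs. the rest.

    Assumption: the tail is the highest-scoring `tail_fraction` of individuals.
    Ties at the cut-off are broken by input order.
    """
    if len(scores) != len(is_case):
        raise ValueError("scores and is_case must have the same length")
    n = len(scores)
    n_tail = max(1, int(round(n * tail_fraction)))

    # Indices sorted from highest to lowest polygenic score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    tail = set(order[:n_tail])

    # Fill the 2x2 table: (tail, rest) x (case, control).
    cases_tail = sum(1 for i in tail if is_case[i])
    controls_tail = n_tail - cases_tail
    cases_rest = sum(1 for i in range(n) if i not in tail and is_case[i])
    controls_rest = (n - n_tail) - cases_rest

    if 0 in (controls_tail, cases_rest, controls_rest):
        raise ValueError("odds ratio undefined: a required cell count is zero")

    return (cases_tail / controls_tail) / (cases_rest / controls_rest)


if __name__ == "__main__":
    # Toy data, not from the paper: 100 individuals with scores 0..99.
    scores = list(range(100))
    # Top 5% (scores 95..99): 4 cases, 1 control. Rest: 19 cases, 76 controls.
    is_case = [i < 19 for i in range(95)] + [True, True, True, True, False]
    print(tail_odds_ratio(scores, is_case))  # (4/1) / (19/76) = 16.0
```

In the toy example the odds of being a case are 4:1 in the tail and 19:76 in the rest, giving an OR of 16. The values in the table would additionally reflect any covariate adjustment used in the study, which this sketch omits.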