. 2022 Nov 16;46(5):929–937. doi: 10.2337/dc22-0295

Figure 1.

A: ExWAS design workflow. We classified type 2 diabetes from Health and Exposure Survey data. We selected exposures for ExWAS from PEGS questionnaire responses and modeled each exposure individually for association with type 2 diabetes. For the exposures associated with type 2 diabetes (FDR < 0.10), we used DSA modeling to select the most parsimonious model.

B: Risk score design and workflow. We divided participants into three data sets based on genotyping status. The derivation data set comprised nongenotyped participants and was used to derive risk score features with an ExWAS. We split genotyped participants into training and test data sets. We developed OCS and PXS in the training data set using lasso regression. We computed PXS using 13 exposure variables and OCS using four clinical variables. We developed a PGS using 122,359 SNPs in UK Biobank data. We tested risk scores for association with type 2 diabetes in the training data set. We then used the held-out test data set to evaluate the predictive accuracy of the computed risk scores for type 2 diabetes.

C: PGS computation workflow. We used four data sets for PGS development: summary statistics from the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium meta-analysis, the linkage disequilibrium (LD) reference panel of the 1000 Genomes Project, UK Biobank data as the training data set, and the genotyped PEGS participants as the test data set. We used LDpred2 to derive 200 polygenic risk scores in UK Biobank data, from which we selected the best PGS based on the AUC. We used the adjusted weights of the best PGS to compute PGS for the genotyped PEGS participants.
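Two selection steps in the caption can be illustrated with a minimal Python sketch: keeping exposures below an FDR threshold of 0.10 (panel A) and picking the candidate score with the highest AUC (panel C). This is not the authors' pipeline. The caption does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption. All function names, variable names, p-values, scores, and labels are hypothetical and serve only to show the mechanics.

```python
from typing import Dict, List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in input order.

    Assumption: the caption only states "FDR < 0.10"; BH is one common
    way to control the FDR and is used here purely for illustration.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of case/control pairs where the case scores
    higher than the control; ties count as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def select_best_score(candidates: Dict[str, Sequence[float]],
                      labels: Sequence[int]) -> str:
    """Return the name of the candidate score with the highest AUC."""
    return max(candidates, key=lambda name: auc(candidates[name], labels))


# Hypothetical per-exposure p-values from single-exposure models.
exposure_p = {"exposure_a": 0.001, "exposure_b": 0.04,
              "exposure_c": 0.30, "exposure_d": 0.08}
names = list(exposure_p)
q_values = dict(zip(names, benjamini_hochberg([exposure_p[n] for n in names])))
selected = [n for n in names if q_values[n] < 0.10]

# Hypothetical candidate scores (1 = type 2 diabetes case, 0 = control).
labels = [1, 1, 0, 0, 1, 0]
candidates = {
    "candidate_1": [0.9, 0.8, 0.3, 0.4, 0.7, 0.1],
    "candidate_2": [0.2, 0.9, 0.5, 0.1, 0.6, 0.4],
}
best = select_best_score(candidates, labels)
```

With these made-up inputs, `selected` holds the exposures whose adjusted p-value is below 0.10 and `best` names the candidate with the highest AUC. In the workflow described above, the candidates would be the 200 LDpred2-derived scores evaluated in UK Biobank data rather than two toy lists.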