. 2022 Jun 27;9(7):936–949. doi: 10.1002/acn3.51569

Figure 1.

Overview of the study. The study consisted of two major steps: (1) a GWAS in the 23andMe Cohort to nominate novel variants associated with ICD in PD, and (2) development of a model to predict ICD behavior in PD subjects. The GWAS in the 23andMe Cohort (3286 ICD negative (ICD−) and 1976 ICD positive (ICD+) participants) uncovered four SNPs associated with ICD behavior in PD subjects at p < 1.3e-06. These four SNPs, together with 13 SNPs previously reported in the literature to associate with ICD, were tested for association with ICD behavior and used to develop an ICD risk score in PD subjects. Specifically, we obtained genotypes of the 17 nominated SNPs for 320 PD subjects from the PPMI Cohort (252 ICD− and 68 ICD+) and 188 PD subjects from the UPenn Cohort (139 ICD− and 49 ICD+). We then used model selection to develop a final logistic regression classifier. First, we combined the PPMI and UPenn Cohorts (N = 508) and randomly split the combined dataset into non-overlapping Training and Test datasets in a 2:1 ratio. To select the subset of variables providing the best fit to the data, we used only the Training dataset (261 ICD− and 78 ICD+ PD participants), performing backward feature selection with fivefold cross-validation repeated 100 times. The final model included two SNPs (rs1800497 and rs1799971) as well as cohort, age, sex, dopamine agonist use, levodopa use, disease duration, and ethnicity as predictors; we fit it to the Training dataset using Bayesian logistic regression. We then evaluated the model's ability to predict ICD in the held-out Test dataset (130 ICD− and 39 ICD+ PD participants), where it achieved ROC-AUC = 0.72. For each PD participant in the Test dataset, we used the predictive model to calculate the risk score (log odds) and RR of developing an ICD behavior.
ICD, impulse control disorder; PD, Parkinson's disease; GWAS, Genomewide Association Study; PPMI, Parkinson's Progression Markers Initiative; RR, risk ratio.
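
The split and scoring steps described in the caption can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions. The split is stratified by ICD status, which the caption does not state (it says only "randomly split"), although the reported counts are consistent with it. The coefficients are hypothetical placeholders, not fitted values. RR is taken here as the ratio of a participant's predicted probability to that of a chosen reference profile, which may differ from the definition used in the study. The function names (`stratified_split`, `log_odds`, `relative_risk`, `roc_auc`) are illustrative.

```python
import math
import random


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Split indices into train/test, shuffling each class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


def log_odds(coefs, intercept, features):
    """Risk score: the linear predictor of a logistic model."""
    return intercept + sum(b * x for b, x in zip(coefs, features))


def probability(score):
    """Convert log odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-score))


def relative_risk(score, reference_score):
    """Assumed definition: predicted risk divided by a reference risk."""
    return probability(score) / probability(reference_score)


def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class counts from the caption: 391 ICD- and 117 ICD+ in the combined cohorts.
labels = [0] * 391 + [1] * 117
train_idx, test_idx = stratified_split(labels)

# Hypothetical coefficients for two binary predictors (placeholders only).
coefs, intercept = [0.8, 0.5], -1.5
score = log_odds(coefs, intercept, [1, 1])
reference = log_odds(coefs, intercept, [0, 0])
rr = relative_risk(score, reference)
```

With these class counts, a stratified 2:1 split yields 339 Training and 169 Test participants, matching the totals in the caption. The AUC helper uses the pairwise-comparison definition, which is quadratic in sample size but adequate for a Test set of this size.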