Abstract
Parkinson’s disease is a neurodegenerative movement disorder that currently has no disease-modifying treatment, partly owing to inefficiencies in drug target identification and validation. We use Mendelian randomization to investigate over 3,000 genes that encode druggable proteins and predict their efficacy as drug targets for Parkinson’s disease. We use expression and protein quantitative trait loci to mimic exposure to medications, and we examine the causal effect on Parkinson’s disease risk (in two large cohorts), age at onset and progression. We propose 23 drug-targeting mechanisms for Parkinson’s disease, including four possible drug repurposing opportunities and two drugs which may increase Parkinson’s disease risk. Of these, we put forward six drug targets with the strongest Mendelian randomization evidence. There is remarkably little overlap between our drug targets to reduce Parkinson’s disease risk versus progression, suggesting different molecular mechanisms. Drugs with genetic support are considerably more likely to succeed in clinical trials, and we provide compelling genetic evidence and an analysis pipeline to prioritise Parkinson’s disease drug development.
Subject terms: Genetics, Genetics research, Neurology
There is currently no disease-modifying treatment for Parkinson’s disease, a common neurodegenerative disorder. Here, the authors use genetic variation associated with gene and protein expression to find putative drug targets for Parkinson’s disease using Mendelian randomization of the druggable genome.
Introduction
Parkinson’s disease (PD) is a neurodegenerative movement disorder that currently has no disease-modifying treatment. Despite efforts, around 90% of drugs that enter clinical trials fail, mostly due to insufficient efficacy or safety1–3. This contributes to the staggering $1.3 billion mean price of bringing a new drug to the market1.
Incorporating genetics in drug development could be one of the most efficient ways to improve the process, because drugs with genetic support are considerably more likely to succeed in clinical trials4–6. “Druggable” genes encode proteins that have been targeted by medications or are possible to target with a small molecule or monoclonal antibody7,8. While genome-wide association studies (GWAS) have effectively identified single nucleotide polymorphisms (SNPs) linked to PD risk and progression9–11, the GWAS design cannot reliably pinpoint causal genes and directly inform drug development.
Mendelian randomization (MR) is a genetic technique that can predict the efficacy of a drug by mimicking a randomized controlled trial12–15. SNPs associated with expression levels of a gene (expression quantitative trait loci, eQTLs) may be analogous to lifelong exposure to a medication targeting the encoded protein8,16. The association between the same genetic variants and a disease (the outcome) can then be extracted from a GWAS for the outcome (Fig. 1a). The SNP-gene-expression and SNP-disease associations can be combined using MR to infer the causal effect of the exposure on the outcome. Since the exposure and outcome can be measured in two independent cohorts, openly available data from two large-scale GWASs can be used for one well-powered MR study. Because of Mendel’s law of independent assortment, individuals are “randomized” at conception to have genetically higher or lower expression levels of the druggable gene (Fig. 1b). Individuals are generally unaware of their genotype, so the MR study is effectively blinded.
In this study, we use eQTLs in blood and brain tissue to predict the efficacy of over 3000 drug-targeting mechanisms in two independent PD case-control cohorts and examine several PD progression markers (Fig. 1c). Where possible, we repeat the analysis using SNPs associated with circulating levels of the encoded proteins. Using large-scale, openly available data and MR techniques, we propose a list of genetically-supported drug targets for PD, including repurposing opportunities of already-licensed or clinical-phase drugs.
Results
Mimicking medications with expression quantitative trait loci
The druggable genome encompasses human genes that encode drug targets, including proteins targeted by approved and clinical-phase drugs, proteins similar to approved drug targets and proteins accessible to monoclonal antibodies or drug-like small molecules in vivo7. The most comprehensive version to date includes 4863 genes, and we sought to identify openly available eQTL data for these genes to mimic exposure to the corresponding medications7. Although the transcript level is biologically a step before the protein level, expression GWASs cover many genes across the genome and provide tissue specificity. Gene expression GWAS data thus provide a very good resource for high-level screens to develop drug-targeting hypotheses.
We used eQTL data from blood (31,684 mostly European-ancestry individuals)17 and brain tissue (1387 prefrontal cortex samples of mostly European ancestry, including 679 healthy controls, 497 schizophrenia, 172 bipolar disorder, 31 autism spectrum disorder, 8 affective disorder patients)18. We kept eQTLs with false discovery rate (FDR) < 0.05 and located within 5 kb of the associated gene to increase the specificity of the eQTL.
As such, eQTLs were available for 2786 and 2448 druggable genes in blood and brain tissue, respectively, and these were clumped at r2 = 0.2. Compared to a lower clumping threshold, this increases the number of SNPs available per gene, which in turn improves the power to detect an effect and makes it possible to test for biases in the MR estimate (as discussed later). When clumping at r2 = 0.2, SNPs are not strictly independent. We therefore used MR methods that incorporate a linkage disequilibrium matrix based on the 1000 genomes EUR reference panel in the MR analysis, which accounts for correlation between SNPs19,20. These methods therefore take linkage disequilibrium into account.
Discovery phase identifies 31 potential drug targets to prevent PD
The largest GWAS available for a PD trait studied disease risk in European-ancestry individuals, which we obtained from the International PD Genomics Consortium (IPDGC)9. Our discovery cohort consisted of 13,708 PD patients and 95,282 controls collected for a 2014 GWAS meta-analysis21. The MR effect estimate for each SNP (Wald ratio) was calculated, and where >1 eQTL was available per gene after clumping, Wald ratios were meta-analysed, weighted by inverse-variance (IVW). Genetically-determined expression of 31 genes (11 in blood only; 17 in brain tissue only; three in both blood and brain tissue) was significantly associated with PD risk in the discovery cohort at FDR < 0.05. All remained significant when clumping at r2 = 0.001 (Supplementary Data 1).
15 potential preventative agents replicate in an independent PD case-control cohort
We sought to replicate all genes that reached significance in the discovery phase using the Wald ratio or IVW method in an independent PD case-control cohort (Fig. 1). The replication population consisted of 8036 PD patients and 5803 controls (no overlap with the discovery cohort)9. The MR methods were identical to those used in the discovery phase.
Genetically-predicted expression of 15 genes (four in blood only; nine in brain tissue only; two in both tissues) replicated using the Wald ratio or IVW method (Fig. 2 and Supplementary Data 1). BST1, CD38, CHRNB1, CTSB, GPNMB, LGALS3, MAPT, MMRN1, NDUFAF2, PIGF, VKORC1 and WNT3 reached FDR < 0.05; ACVR2A, HSD3B7 and MAP3K12 reached nominal significance. GPNMB and HSD3B7 reached significance in both blood and brain tissue. Of these 15 potential drug targets to prevent PD, nine were not nominated by the PD risk GWAS meta-analysis9.
Three replicated genes encode targets of approved or clinical-phase drugs with an appropriate direction of effect for PD protection, presenting a possible repurposing opportunity: CHRNB1, NDUFAF2 and VKORC1 (Table 1 and Supplementary Data 1). The GPNMB protein is a receptor targeted by glematumumab, an antibody-drug conjugate that is being evaluated for several types of cancer22. After binding to GPNMB, the drug is internalised by the cell and is cytotoxic. Since this mechanism of action does not reflect a change in GPNMB levels, we do not consider glematumumab a potential candidate for repurposing. We find that CD38-inhibitors such as daratumumab, licensed to treat multiple myeloma, and MAP3K12-inhibitors such as CEP-1347 may increase PD risk. Interestingly, CEP-1347 failed to modify PD progression in a phase 3 clinical trial23, and our data may provide a genetic explanation why CEP-1347 was unsuccessful.
Table 1.
Gene | Outcome (tissue) | Drug name | Clinical phase | Indications/Uses |
---|---|---|---|---|
CHRNB1 | Risk (brain) | Rocuronium | Approved | Muscle relaxant in anaesthesia |
NDUFAF2 | Risk (brain) | Metformin | Approved | Type 2 diabetes mellitus, polycystic ovarian syndrome |
RHD | Dyskinesia (brain) | Roledumab | Phase 2 | Prevent alloimmunisation in Rhesus negative mothers carrying a Rhesus positive child |
VKORC1 | Risk (blood) | Warfarin | Approved |
Prophylactic anticoagulation *high risk of falls in Parkinson’s disease |
These drugs are either approved or in clinical trial phase, and the mechanism of action is consistent with the direction of our MR effect estimate. The second column displays the potential effect on PD and target tissue. Clinical phase and drug indication based on https://clinicaltrials.gov/ and the British National Formulary. Direction of effect was confirmed using https://www.drugbank.ca or https://www.ebi.ac.uk/chembl/ databases.
MR quality control suggests that CD38, CTSB, GPNMB and MAP3K12 have the most robust MR evidence for PD risk
We completed a series of quality control steps to prioritise the replicated genes. The direction of effect was consistent between the discovery and replication phases for all 15 replicated genes (Supplementary Data 2). Previous eQTL-based MR studies have reported heterogeneity in magnitude and direction of effect between tissues8,24, and we found that raised HSD3B7 expression was associated with raised PD risk in blood and reduced PD risk in brain tissue (Fig. 2). This pattern was consistent between the discovery and replication phase. Although this may suggest opposing effects between tissues, there was only one eQTL available for HSD3B7 in brain and two eQTLs in blood (discovery phase). Results based on one or two SNPs should be interpreted with caution, because it is not possible to perform the additional quality control discussed below.
The IVW method assumes that (1) the genetic variant(s) must be associated with the exposure, (2) the genetic variant(s) must not be associated with any confounders, and (3) the genetic variant(s) must not be associated directly with the outcome. This means that the SNP should affect the outcome (PD risk) through the exposure (gene expression) only, so the y-intercept of the IVW regression is fixed at zero25. This assumption is violated if there is genetic pleiotropy, where a SNP affects the outcome through an alternative pathway. This kind of pleiotropy may arise due to measured and unmeasured confounders, for example if the SNP is an eQTL for another gene that is not tested in this MR study. If pleiotropy pushes the effect in one direction, the IVW method yield a biased effect estimate. The MR-Egger method relaxes this assumption by not constraining the y-intercept. If the MR-Egger y-intercept significantly deviates from zero, this suggests that there is directional pleiotropy. This method assumes that any pleiotropic effects are independent of the gene-exposure association26.
If several meta-analysis methods yield a similar result, such as the MR-Egger and maximum likelihood methods, we consider the MR result more robust25,27,28. The latter allows more uncertainty in the SNP-exposure and SNP-outcome associations29. These methods are only possible if >2 SNPs are available per gene, and all genes with >2 SNPs reached at least nominal significance using the maximum likelihood method (uncorrected p < 0.05). The magnitude and direction of effect were largely consistent between methods, except for BST1. For BST1, the MR-Egger estimate was in the opposite direction to the IVW and maximum likelihood results (Supplementary Data 1). All genes with >2 SNPs available passed the MR-Egger intercept test except BST1, explaining the deviant MR-Egger estimate for this gene (Supplementary Data 2).
Nevertheless, if SNPs for a gene are pleiotropic in opposing directions, the MR-Egger y-intercept will still be zero. The Cochran’s Q and I2 tests usefully assess overall heterogeneity between Wald ratios. NDUFAF2, WNT3 and VKORC1 did not pass the Cochran’s Q (p < 0.05) nor I2 (I2 > 0.50) tests (Supplementary Data 2). This means that there is significant heterogeneity in the MR result for these genes, and such heterogeneity among Wald ratios can for example happen if at least one SNP for the gene is pleiotropic30.
We repeated the analysis in the discovery outcome data using only SNPs that were specifically associated with our replicated genes. In other words, we removed any SNPs associated with the expression of any other gene in the original eQTL dataset. All replicated genes remained significant in this analysis (Supplementary Data 8 and 9).
In addition, a spurious MR result may arise from a locus where the SNP-exposure and SNP-outcome associations are rooted in two distinct causal SNPs in close linkage disequilibrium30. When the SNP is significantly associated with both exposure and outcome, this can be probed using colocalization analysis31. There is evidence that proteins with both MR and colocalization evidence are more likely to be successful drug targets32; this may simply reinforce that GWAS-nominated drug targets are more likely to reach approval4. Using the discovery outcome data, we had sufficient power to perform a colocalization analysis for 13 genes (see ‘Methods’ and Supplementary Data 7). Of these, ACVR2A, BST1, CHRNB1, CTSB, GPNMB, HSD3B7, LGALS3, MAPT, MMRN1 and VKORC1 had strong evidence of colocalization . All genes with sufficient power colocalized in the replication data (BST1, CD38, GPNMB, HSD3B7, MAPT, MMRN1, VKORC1 and WNT3). Similarly, Kia et al. recently found that eQTLs in brain tissue for CD38 and GPNMB based on a different eQTL dataset colocalize with PD risk loci33, strengthening the evidence for the encoded proteins as drug targets for PD.
Four potential targets for preventative drugs may also affect PD age at onset
Pharmacologically delaying the age of onset of a debilitating disease may have a considerable impact on both socioeconomic burden of disease and quality of life by providing disability-free years to people at risk. Evidence from polygenic risk score analyses suggests that genetic risk of PD is correlated with PD age at onset11,34–36. We therefore asked whether expression of the genes reaching significance in our MR discovery phase for PD risk also predict PD age of onset. We sourced openly available summary statistics from a PD age of onset GWAS, including 17,996 patients (Fig. 1c). Based on the same analysis pipeline as the replication step for PD risk, expression of four genes predicted PD age of onset at p < 0.05: BST1 in blood, CD38 in brain tissue, CTSB in brain tissue and MMRN1 in brain tissue (Fig. 3 and Supplementary Data 3). CD38 and MMRN1 remained significant when clumping at r2 = 0.001. There were >2 SNPs available for BST1, CD38 and CTSB, and the IVW, maximum likelihood and MR-Egger methods yielded a consistent direction of effect for these genes (Supplementary Data 1). All three genes passed the MR-Egger intercept (), Cochran’s Q (), and I2 tests (. BST1 and MMRN1 remained significant when removing SNPs associated with expression of any other gene in the original eQTL dataset (Supplementary Data 8 and 9). Of the four genes, we had sufficient power to perform a colocalization analysis for BST1, and we found strong evidence of colocalization (; Supplementary Data 7).
If increased expression of a gene predicts reduced PD risk, this gene should be associated with a delayed age at onset. This was consistently the case for all four genes that reached significance for age at onset. Overall, these data suggest that there may be some shared molecular mechanisms driving PD risk and age at onset, yet this overlap may be incomplete.
There is little overlap between drug targets to prevent PD and reduce PD progression
The PD risk GWAS data afford large discovery and replication cohorts, which is a great advantage in MR. Nevertheless, it is currently not possible to reliably predict PD, limiting the immediate usefulness of a drug to prevent or delay this condition. Many clinical trials for PD use progression markers such as the Unified PD Rating Scale (UPDRS) to evaluate drug efficacy, and it remains unclear how the molecular mechanisms driving PD risk relate to clinical progression. We used MR to probe whether expression of any of the 4863 druggable genes is significantly associated with PD progression, measured by the UPDRS (total and parts 1 to 4), mini-mental state examination (MMSE), Montreal cognitive assessment (MOCA), modified Schwab and England activities of daily living scale (SEADL), Hoehn and Yahr stage, dementia, depression, and dyskinesia. The MR pipeline for each progression marker was identical to the discovery phase for PD risk (Fig. 1).
We used openly available summary statistics from a GWAS for these PD progression markers in 4093 European PD patients, followed over a median of 2.97 years10. 3455 druggable genes had an eQTL available for MR using a PD progression marker (2752 in blood, 2353 in brain tissue), and eight genes reached significance across five progression outcomes (Fig. 4 and Supplementary Data 1). One of these, RHD, encodes the target of a clinical-phase medication with an appropriate direction of effect, possibly representing a repurposing opportunity (Table 1).
IRAK3 expression in blood was significantly associated with UPDRS parts 2 and 4, depression, and dyskinesias. LMAN1 expression in blood reached significance for dyskinesias and UPDRS part 2. Reaching significance for several progression markers strengthens the evidence for these two genes. No genes reached significance for both PD risk and progression. Since age at motor symptom onset may be considered an early marker of PD progression, we also used our MR approach to assess whether genes discovered by our progression analysis causally predict age at onset. Of the genes that reached significance for a progression marker, none reached nominal significance using the age at onset data (Supplementary Data 3 and 4). CD177, IRAK3, RHD and STK4 remained significant when removing SNPs associated with expression of any other gene in the original eQTL dataset (Supplementary Data 8 and 9). It was not possible to perform a reliable colocalization analysis in our progression study, since the discovered genes did not have sufficient power to do so (i.e. ).
The direction of effect was consistent between the IVW, maximum likelihood and MR-Egger methods for all genes except RHD, where the MR-Egger method opposed the direction of the IVW and maximum likelihood methods. CD177 (depression), RHD (dyskinesia), PYGL (UPDRS part 4) and STK4 (Hoehn and Yahr) reached significance when clumping at . ADAM32 (Hoehn and Yahr), IRAK3 (dyskinesia), LMAN1 (UPDRS part 2), and RHD (dyskinesia) passed MR-Egger intercept, Cochran’s Q and I2 tests (Supplementary Data 2). Taken together, these five genes have the most robust MR evidence for modifying a PD progression marker.
Protein quantitative trait locus data provide further genetic evidence
Most clinically-used drugs target proteins, not gene expression, and genetic variants associated with protein levels (protein quantitative trait loci, pQTLs) may model drug target effects more accurately than eQTLs8. Even with high throughput protein assays, however, the spectrum of reliable, well-powered GWAS data on protein targets is limited. Many genetic studies on protein levels are based on plasma and lack tissue diversity37–39. Of the 23 proposed targets, we found pQTLs for BST1, CD38, CTSB, GPNMB and LGALS3 for PD risk, as well as PYGL and QDPR for UPDRS part 437–40.
Our MR analysis found that BST1, CTSB and LGALS3 levels were consistently associated with PD risk (p < 0.05; Fig. 5 and Supplementary Data 5). The result for GPNMB (risk) and PYGL (UPDRS part 4) lost significance when using data from different pQTL studies. The direction of effect was consistent between the pQTL and eQTL results for all genes except BST1, and the MR-Egger intercept, Cochran’s Q and I2 tests suggest that the BST1 results may be biased by genetic pleiotropy (Supplementary Data 6). This illustrates the importance of MR quality control—maximizing the number of SNPs available per drug target and validation with different data types and independent replication cohorts is essential for a reliable effect estimate.
Discussion
This work explicitly seeks to identify new drug targets for PD, and we provide genetic evidence in favour of 23 potential disease-modifying drug targets. Tables 2 and 3 summarize the evidence supporting these genes. The genes were prioritised using several meta-analysis methods (IVW, MR-Egger and maximum likelihood), the MR-Egger intercept test, Cochran’s Q test, the I2 test, a pQTL study, colocalization analysis and previously published MR and colocalization evidence. This allowed us to look for pleiotropy due to both measured and unmeasured confounders25. We propose six drug targets with the strongest MR evidence: CTSB, GPNMB, CD38, RHD, IRAK3 and LMAN1.
Table 2.
Gene | PD outcome | Tissue | Replication | Sign. with max. lik. | Sign. with MR-Egger | MR-Egger intercept test | Cochran’s Q test | I2 test | pQTL evidence | Coloc | Previously published MR or coloc support |
---|---|---|---|---|---|---|---|---|---|---|---|
GPNMB | Risk | Blood | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | MR9 |
Risk | Brain | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | MR24; coloc33 | ||
CTSB | Risk | Brain | ✓ | ✓ | x | ✓ | ✓ | ✓ | ✓ | ✓ | MR9 |
Age at onset | Brain | ✓ | x | ✓ | ✓ | ✓ | |||||
VKORC1 | Risk | Blood | ✓ | ✓ | x | ✓ | ✓ | ✓ | ✓ | ||
CD38 | Risk | Brain | ✓ | ✓ | x | ✓ | ✓ | ✓ | x | x | Coloc33 |
Age at onset | Brain | ✓ | x | ✓ | ✓ | ✓ | |||||
MAP3K12 | Risk | Blood | ✓ | ✓ | x | ✓ | ✓ | ✓ | x | ||
NDUFAF2 | Risk | Brain | ✓ | ✓ | x | ✓ | ✓ | ✓ | x | ||
BST1 | Risk | Blood | ✓ | ✓ | x | x | ✓ | ✓ | ✓ | ✓ | MR9 |
Age at onset | Blood | ✓ | x | ✓ | ✓ | ✓ | ✓ | ||||
HSD3B7 | Risk | Blood | ✓ | ✓ | ✓ | ✓ | x | ||||
Risk | Brain | ✓ | ✓ | MR24 | |||||||
LGALS3 | Risk | Brain | ✓ | ✓ | ✓ | ||||||
ACVR2A | Risk | Brain | ✓ | ✓ | |||||||
CHRNB1 | Risk | Brain | ✓ | ✓ | MR9 | ||||||
MAPT | Risk | Brain | ✓ | ✓ | |||||||
MMRN1 | Risk | Brain | ✓ | ✓ | |||||||
Age at onset | Brain | ||||||||||
PIGF | Risk | Brain | ✓ | ||||||||
WNT3 | Risk | Blood | ✓ | ✓ | x | ✓ | x | x | x | MR9 |
✓ pass, x fail, blank not possible to test, coloc colocalization, max. lik. maximum likelihood, MR Mendelian randomization, PD Parkinson’s disease, pQTL protein quantitative trait locus, sign significant.
Table 3.
Gene | PD outcome | Tissue | Sign. with max. lik. | Sign. with MR-Egger | MR-Egger intercept test | Cochran’s Q test | I2 test | pQTL evidence | Coloc |
---|---|---|---|---|---|---|---|---|---|
RHD | Dyskinesia | Brain | ✓ | x | ✓ | ✓ | ✓ | ||
IRAK3 | Dyskinesia | Blood | ✓ | ✓ | ✓ | ✓ | ✓ | ||
Depression | Blood | ✓ | ✓ | x | ✓ | ✓ | |||
UPDRS part 2 | Blood | ✓ | x | x | ✓ | ✓ | |||
UPDRS part 4 | Blood | ✓ | ✓ | x | ✓ | ✓ | |||
ADAM32 | Hoehn and Yahr | Brain | ✓ | x | ✓ | ✓ | ✓ | ||
LMAN1 | UPDRS part 2 | Blood | ✓ | x | ✓ | ✓ | ✓ | ||
Dyskinesia | Blood | ✓ | ✓ | x | ✓ | ✓ | |||
PYGL | UPDRS part 4 | Brain | ✓ | ||||||
CD177 | Depression | Brain | |||||||
QDPR | UPDRS part 4 | Blood | ✓ | ✓ | ✓ | x | x | x | |
STK4 | Hoehn and Yahr | Brain | x | x | x |
✓ pass, x fail, blank not possible to test, coloc colocalization, max. lik. maximum likelihood, MR Mendelian randomization, PD Parkinson’s disease, pQTL protein quantitative trait locus, sign significant, UPDRS Unified Parkinson’s Disease Rating Scale.
We identifed four genes encoding targets for existing drugs warranting further discussion (Table 1). NDUFAF2 encodes a subunit of a target of metformin, an approved medication for type 2 diabetes mellitus. There is extensive evidence for a relationship between diabetes mellitus and PD41, and several rodent studies have investigated the potential of metformin as a neuroprotective agent41–43. We found significant heterogeneity in the MR result for this gene. Although this may be because at least one SNP for this gene is pleiotropic, we speculate that this could occur if the effect is driven by a subset of PD patients. This, however, remains a subject for future research, because the GWAS data used in this study are not stratified by any kind of PD subtype. Epidemiological studies on the relationship between long-term medication use and incidence of a disease are an invaluable contribution to evaluating preventative agents for PD. A retrospective cohort study of over 6000 patients with type 2 diabetes mellitus found that more than four years of metformin use maybe associated with a reduced PD incidence44. Our MR study thus provides further evidence in favour of repurposing anti-diabetic drugs for PD.
Other medications may not be as suitable for repurposing. To our knowledge, there is no evidence linking PD and the drug roledumab, which is currently in a phase II clinical trial to prevent alloimmunisation in Rhesus negative mothers carrying a Rhesus positive child (NCT02287896). Our evidence suggests that RHD expression in brain tissue, rather than blood, is associated with PD dyskinesia. Next, CHRNB1 encodes the beta subunit of the muscle acetylcholine receptor at the neuromuscular junction, which is inhibited by muscle relaxants used during surgical anaesthesia. VKORC1 encodes the catalytic subunit of the vitamin K epoxide reductase, and this enzyme is targeted by the oral anticoagulant warfarin. The key adverse effect of warfarin treatment is haemorrhage, and since PD is a movement disorder where patients experience frequent falls, any potential benefit of warfarin treatment would likely be outweighed by the added risk of haemorrhagic strokes and complications of bleeding.
The two-sample MR design allowed us to explore different tissues and PD traits, and we identified different candidates to prevent, delay onset, and slow progression of PD (Figs. 2, 3 and 4). Although we found that four of the drug targets for PD risk may also affect PD age at onset, we found very different candidates for progression. Age at motor symptom onset can be considered an early sign of PD progression, and it is striking that none of the genes that reached significance for a progression outcome reached significance in the age at onset data. These results are in line with the GWAS data, finding little overlap between loci associated with PD risk, age at onset and progression markers9–11. This may reflect the limited sample size of current PD progression GWAS data. Nevertheless, this raises questions about what drives PD susceptibility versus progression, painting a yet unclear picture of partially overlapping molecular mechanisms.
Our candidates to slow PD progression may be of most immediate relevance, because currently PD cannot be accurately predicted. A preventative agent would need to be highly tolerable and have a very safe side effect profile, and our approach is not well suited for systematically evaluating the safety aspects of our proposed candidates in this study. To our knowledge, the data used here are from the largest openly available progression GWAS to date. We did not find any non-overlapping PD progression GWAS with sufficient power for a replication step in our progression analysis, which would have to measure progression in a similar way to the study used here. As such, the preventative list carries more robust evidence, because each gene reached significance in two large, independent cohorts. Replication is critical to validating scientific findings and eliminating false positives, and this has been an crucial lesson for genetic research45–47. Replication is not common practice in MR yet48, and it is a key strength of our study. Although including all samples available in one analysis would maximise statistical power46,49, using independent discovery and replication cohorts allowed us to validate our proposed drug targets. Since our overarching intention was to provide genetic evidence to improve success rates in clinical trials, we made this decision to reduce the number of false positives.
Our study has valuable advantages compared to previous MR projects studying PD using QTL data. In the latest GWAS meta-analysis for PD risk in Europeans, Nalls et al. selected SNPs associated with PD risk and used MR to identify whether any of these loci alter expression or methylation of genes within 1 Mb of the SNP9. This contrasts with our exposure-centred MR analysis, where we chose SNPs associated with the expression of a druggable gene, rather than the disease outcome. More recently, Baird and colleagues conducted a transcriptome-wide MR study for a series of brain diseases and found six genes whose expression in brain tissue was significantly associated with PD risk24. Two of these were also discovered in our study: GPNMB and HSD3B7. The remaining four were either not part of the druggable genome, rendering the encoded proteins less actionable drug targets, or did not reach significance in our discovery or replication cohorts, illustrating the importance of replication. Furthermore, our MR study is the first to study druggable genes in the context of PD age at onset and progression.
Nevertheless, progression and age at onset studies are particularly affected by collider bias50–52. For example, if expression of a gene and depression are both associated with disease risk, that gene’s expression will be artificially associated with depression in a cohort containing only cases. In a progression study, genetic variants that cause disease will thus be associated with other risk factors for disease. The druggable genes we identified in our progression study did not reach significance in our risk study, so this kind of collider bias is less likely to have occurred for our candidate genes. The age at onset analysis was comparably more affected, since we tested genes that reached significance for PD risk. Overall, this emphasises the importance of MR quality control methods (including replication) for identifying reliable causal effects, representative sampling in GWAS, as well as continued development of methods to formally test for collider bias53,54.
Another key limitation of this study is that MR cannot fully recapitulate a clinical trial. MR mimics lifelong, low-dose exposure to a drug and assumes a linear relationship between exposure and outcome. This differs from a clinical trial, which typically investigates comparably high doses of drug over a much shorter timeframe. The MR result may therefore not directly correspond to the effect size in practice and does not perfectly predict the effect of a drug.
In addition, the eQTL cohorts contained some non-European individuals17,18, three of the pQTL studies sourced were based on Icelandic, Scottish and German cohorts38–40, and the PD populations were comprised of European individuals only9–11. Linkage disequilibrium patterns differ between populations, which may compromise how well our QTLs mimic drug action in the PD cohorts and introduce bias to the MR effect estimate25.
It is difficult to interpret which tissue would be the most appropriate site of action. Whereas the genes that reached significance in both blood and brain tissue may have stronger MR evidence, targeting the protein of a widely expressed gene may lead to systemic side-effects. Brain tissue may be more biologically relevant for neurodegeneration, but a drug acting in the blood stream may not need to cross the blood–brain barrier to exert its effect. A limitation of using brain tissue is that gene expression is quantified post-mortem, and measured expression levels are influenced by RNA degradation occurring after cell death as well as transcriptional changes occurring in response to death55. We included both blood and brain tissue eQTLs to capture as many genes as possible and explore two potential tissue sites of action, but we note that it is difficult to prioritise genes based on which tissue(s) they reached significance in.
Furthermore, the sample size of our blood eQTL data (n = 31,684) is larger than that for brain tissue eQTLs (n = 1387) and the blood pQTL study (n = 750–4137). A larger sample size allows greater power to detect QTLs, meaning there are more SNPs per gene. Nevertheless, it is unclear how well QTL data mimic medications that modulate activity levels of the protein. We are encouraged that five of the seven proteins we were able to probe using both eQTL and pQTL data were successfully validated, adding to existing evidence that regulatory variants may be used for robust causal inferences in drug target MR8. Nevertheless, this MR study does not provide functional evidence for the proposed drug targets, and the MR process does not replace pre-clinical evaluation of drug targets in vitro and in vivo. Genomic approaches serve as adjuncts thereto, promising to better prioritise drug targets carried forward to clinical phase trials.
A 9.6% vs. 13.8% success rate for drugs from phase 1 trials to approval may mean a $480 million difference in the median research and development cost of bringing a new drug to the market1. The druggable genome resource has opened up new avenues for drug target identification using existing genetic data7,56,57, and if genetic evidence increases success rates even by a few percent, this could have a substantial effect on drug development costs4,5. As such, MR a highly compelling, time- and cost-effective adjunct to the randomized controlled trial. We have made our code openly available for use beyond PD research (https://github.com/catherinestorm/mr_druggable_genome_pd/)58, and we have demonstrated ways to prioritise drug targets based on genetic data. We have provided human genetic evidence of drug efficacy for PD, and we hope that these data will serve as a useful resource for prioritising drug development efforts.
Methods
All DNA positions are based on the human reference genome build hg19 (GRCh37). Data processing was completed using R software version 3.6.359.
Exposure data
Tissue-specific eQTL data were obtained from the eQTLGen (https://eqtlgen.org/) and PsychENCODE consortia (http://resource.psychencode.org/); full descriptions of the data are available in the original publications17,18. Briefly, the eQTLGen data consisted of cis-eQTLs for 16,987 genes and 31,684 blood samples, of which most were healthy European-ancestry individuals. We downloaded the full significant cis-eQTL results (FDR < 0.05) and allele frequency information from the eQTLGen consortium on 13 May 2020.
The PsychENCODE data included 1387 prefrontal cortex, primarily-European samples (679 healthy controls, 497 schizophrenia, 172 bipolar disorder, 31 autism spectrum disorder and 8 affective disorder patients). We downloaded all significant eQTLs (FDR < 0.05) for genes with expression >0.1 fragments per kilobase per million mapped fragments (FPKM) in at least ten samples and all SNP information, accessed on 13 May 2020.
We obtained an updated version of the druggable genome containing 4863 genes from the authors of the original publication7, double-checking the druggability level for all genes marked as approved or in clinical trials (“druggability tier 1”). We removed non-autosomal genes, leaving 4560 druggable genes. We filtered both eQTL datasets to include SNPs 5 kb upstream of the target druggable gene start or 5 kb downstream of the target druggable gene end position.
We sought freely available pQTL data from blood or brain tissue for all druggable genes that reached significance for any PD outcome in our study. Out of 23 pQTL studies identified, four studies (1) reported significant pQTLs in individuals of European descent for any of the druggable proteins proposed by our eQTL analysis, (2) provided all the SNP information required for MR and (3) reported SNPs that were available in our PD outcome data37–40. Sun and colleagues measured 3622 proteins in 3301 healthy European blood donors from the INTERVAL study and identified 1927 pQTLs for 1478 proteins. Emilsson and colleagues measured 4137 proteins in the serum of 5457 Icelanders from AGES Reykjavik study. Effect alleles and effect allele frequencies were obtained from the authors. Suhre and colleagues measured 1124 proteins in 1000 blood samples from a German population. Hillary and colleagues measured 92 proteins in the blood of 750 healthy Scottish controls.
In total, we found pQTLs that were available in the appropriate PD outcome data for seven of our druggable proteins of interest: BST1, CD38, CTSB, GPNMB, LGALS3, PYGL and QDPR. All pQTLs included in our analysis had p < 5e−6 in the original pQTL study. All pQTLs were found on the same chromosome as the associated gene except for: rs62143198 for PYGL, rs62143197 for QDPR, rs4253282 for GPNMB, rs2731674 for GPNMB37. These latter four SNPs are therefore acting in trans.
Outcome data
All PD data were obtained from the IPDGC, and details on recruitment and quality control are available in the original publications9–11,21. In the discovery phase for PD risk, we used openly available summary statistics from a 2014 case-control GWAS meta-analysis, which included 13,708 PD patients and 95,282 controls21.
In the replication phase for PD risk, we obtained summary statistics from 11 case-control GWAS studies included in the most recent PD risk GWAS meta-analysis from the authors9. The 11 studies, as named and described in the PD GWAS meta-analysis, were Spanish Parkinson’s, Baylor College of Medicine/University of Maryland, McGill Parkinson’s, Oslo Parkinson’s Disease Study, Parkinson’s Progression Markers Initiative (PPMI), Finnish Parkinson’s, Harvard Biomarker Study (HBS), UK PDMED (CouragePD), Parkinson’s Disease Biomarker’s Program (PDBP), Tübingen Parkinson’s Disease cohort (CouragePD) and Vance (dbGap phs000394). These yielded a total of 8036 PD cases and 5803 controls. We meta-analysed the data using METAL (version 2011-03-25) using default settings, weighting by sample size60. The overall genomic inflation factor was , and when scaled to 1000 cases and 1000 controls . Based on genomic inflation factors and quantile–quantile plots of the original GWASs9,21, we considered our quantile–quantile plot to show adequate agreement with the expected null distribution (Supplementary Fig. 1).
For the progression marker analyses, we used summary statistics from the largest publicly available GWAS meta-analyses for PD age at onset and clinical progression10,11. For age at onset, this includeed 17,996 PD cases, and age at onset was defined as self-reported age at motor symptom onset or PD diagnosis. The authors reported a high correlation between age at diagnosis and age at onset.
The progression GWAS meta-analysis included 4093 PD patients from 12 cohorts, followed over a median of 2.97 years (mean visits per individual over the study period: 5.44). We downloaded summary statistics for nine continuous outcomes and four binomial outcomes (https://pdgenetics.shinyapps.io/pdprogmetagwasbrowser/). Continuous outcomes included Hoehn and Yahr stage (PD progression rating scale), total UPDRS/Movement Disorder Society revised version total (PD progression rating scale), UPDRS parts 1 to 4 (1 = non-motor symptoms, 2 = motor symptoms, 3 = motor examination, 4 = motor complications), MOCA (cognitive impairment), MMSE (cognitive impairment) and SEADL (activities of daily living and independence). The binomial outcomes we used were dementia, depression, dyskinesia, as well as reaching Hoehn and Yahr stage 3 or more.
Mendelian randomization
MR analyses were completed using the R package “TwoSampleMR” (version 0.5.4)61, unless stated otherwise. The exposure and outcome data were loaded and harmonized using in-built functions. SNPs were then clumped at using European samples from the 1000 Genomes Project20,61. Steiger filtering was used to remove genes where SNPs explained a greater proportion of variation in the outcome (PD trait) than variation in the exposure (gene expression). For the eQTL analysis, the Steiger filtering excluded 0–403 genes per outcome tested in a tissue, representing 0–15% of all genes studied per outcome tested in a tissue.
Wald ratios were calculated for all SNPs. These were meta-analysed using the IVW, MR-Egger and maximum likelihood methods, including a linkage disequilibrium matrix to account for correlation between SNPs; this function uses the R package “MendelianRandomization” version 0.4.229. Forest plots were produced using the R package “forestplot”.
Where >2 SNPs were available per exposure, we assessed whether the MR-Egger intercept significantly deviated from zero, as well as Cochran’s Q and I2 methods to test for heterogeneity between Wald ratios62. FDR-corrected p-values were calculated within each exposure-outcome combination to correct for multiple testing. In the discovery study for PD risk and the PD progression studies, we considered FDR < 0.05 significant. In the replication studies for PD risk and age at onset, as well as the pQTL study, we considered nominal p < 0.05 significant.
For genes which reached significance using the IVW method (>1 SNP available), we carried out another MR analysis, clumping at . If >1–2 SNPs were available at this clumping threshold, Wald ratios were meta-analysed using the IVW, MR-Egger, weighted mode and weighted median methods.
Colocalization
We carried out a colocalization analysis for PD risk, age at onset and progression outcomes using the R package “coloc”31. We harmonized exposure and outcome datasets using the “TwoSampleMR” package. We used default priors: , , . p1, p2 and p12 are the prior probabilities that a SNP in the tested region is significantly associated with expression of the tested gene, the tested PD outcome, or both, respectively. The colocalization yields posterior probabilities corresponding to one of five hypotheses: PPH0, no association with either trait; PPH1, association with expression of the gene, but not the PD trait; PPH2, association with the PD trait, but not expression of the gene; PPH3, association with the PD trait and expression of the gene, with distinct causal variants; PPH4, association with the PD trait and expression of the gene, with a shared causal variant31. A low PPH3 and PPH4 in combination with a high PPH0, PPH1 and/or PPH2 indicates limited power in the colocalization analysis31. We therefore restricted our analysis to genes reaching .
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Supplementary information
Acknowledgements
C.S.S. would like to thank Dr. Vishal Rawji for his invaluable support and insightful ideas about the clinical implications and communication of this study. C.S.S. is funded by Rosetrees Trust, John Black Charitable Foundation and the University College London MBPhD Programme. D.A.K. is supported by an MBPhD Award from the International Journal of Experimental Pathology. M.A. is funded by the Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia. N.W.W. and A.D.H. are National Institute for Health Research senior investigators. N.W.W., A.D.H. and C.F. receive support from the National Institute for Health Research University College London Hospitals Biomedical Research Centre. We would like to thank all members of the International Parkinson Disease Genomics Consortium (IPDGC) and the authors of QTL projects referenced here, who make their data openly available. We thank all the patients and families whose decision to donate tissue samples make our research possible.
Author contributions
Conceptualization: C.S.S., D.A.K., M.M.A. and N.W.W.; methodology: C.S.S., D.A.K., M.M.A. and N.W.W.; investigation, formal analysis, visualization: C.S.S.; resources: S.B.C., C.F., A.D.H. and IPDGC; writing—original draft: CSS; writing—review & editing: C.S.S., D.A.K., M.M.A., N.W.W., S.B.C., C.F., A.D.H. and IPDGC.
Data availability
The data generated in this study and a data dictionary (Supplementary Information) are provided in the Supplementary Information. Tissue-specific eQTL data were obtained from the eQTLGen (https://eqtlgen.org/) and PsychENCODE consortia (http://resource.psychencode.org/). The druggable genome data were obtained via from the original authors; an immediately accessible version of the druggable genome is available in the supplementary materials of the original publication7. Subsets of the most recent PD risk GWAS were obtained from the original authors9. GWAS summary statistics for PD age at onset are available on the IPDGC website (http://pdgenetics.org/resources). Parkinson’s progression GWAS data can be found here: https://pdgenetics.shinyapps.io/pdprogmetagwasbrowser/.
Code availability
The code used for this study is openly available on GitHub, accompanied by instructions for use and required tools (https://github.com/catherinestorm/mr_druggable_genome_pd). Data processing was completed using R software (version 3.6.3), with packages TwoSampleMR (version 0.5.4), MendelianRandomization (version 0.4.2), coloc (version 4.0.4), dplyr (version 1.0.0), readr (version 1.3.1), stringr (version 1.4.0), tidyverse (version 1.3.0), forestplot (version 1.9), plyr (version 1.8.6), devtools (version 2.3.0), remotes (version 2.1.1). The GWAS meta-analysis was completed using METAL (version 2011-03-25).
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Neil Davies, Haydeh Payami and Heiko Runz for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A list of authors and their affiliations appears at the end of the paper.
Contributor Information
Nicholas W. Wood, Email: n.wood@ucl.ac.uk
International Parkinson’s Disease Genomics Consortium (IPDGC):
Alastair J. Noyce, Rauan Kaiyrzhanov, Ben Middlehurst, Manuela Tan, Henry Houlden, Huw R. Morris, Helene Plun-Favreau, Peter Holmans, John Hardy, Daniah Trabzuni, John Quinn, Vivien Bubb, Kin Y. Mok, Kerri J. Kinghorn, Patrick Lewis, Sebastian R. Schreglmann, Ruth Lovering, Lea R’Bibo, Claudia Manzoni, Mie Rizig, Mina Ryten, Sebastian Guelfi, Valentina Escott-Price, Viorica Chelban, Thomas Foltynie, Nigel Williams, Karen E. Morrison, Carl Clarke, Kirsten Harvey, Benjamin M. Jacobs, Alexis Brice, Fabrice Danjou, Suzanne Lesage, Jean-Christophe Corvol, Maria Martinez, Claudia Schulte, Kathrin Brockmann, Javier Simón-Sánchez, Peter Heutink, Patrizia Rizzu, Manu Sharma, Thomas Gasser, Susanne A. Schneider, Mark R. Cookson, Cornelis Blauwendraat, David W. Craig, Kimberley Billingsley, Mary B. Makarious, Derek P. Narendra, Faraz Faghri, J. Raphael Gibbs, Dena G. Hernandez, Kendall Van Keuren-Jensen, Joshua M. Shulman, Hirotaka Iwaki, Hampton L. Leonard, Mike A. Nalls, Laurie Robak, Jose Bras, Rita Guerreiro, Steven Lubbe, Timothy Troycoco, Steven Finkbeiner, Niccolo E. Mencacci, Codrin Lungu, Andrew B. Singleton, Sonja W. Scholz, Xylena Reed, Ryan J. Uitti, Owen A. Ross, Francis P. Grenn, Anni Moore, Roy N. Alcalay, Zbigniew K. Wszolek, Ziv Gan-Or, Guy A. Rouleau, Lynne Krohn, Kheireddin Mufti, Jacobus J. van Hilten, Johan Marinus, Astrid D. Adarmes-Gómez, Miquel Aguilar, Ignacio Alvarez, Victoria Alvarez, Francisco Javier Barrero, Jesús Alberto Bergareche Yarza, Inmaculada Bernal-Bernal, Marta Blazquez, Marta Bonilla-Toribio, Juan A. Botía, María Teresa Boungiorno, Dolores Buiza-Rueda, Ana Cámara, Fátima Carrillo, Mario Carrión-Claro, Debora Cerdan, Jordi Clarimón, Yaroslau Compta, Monica Diez-Fairen, Oriol Dols-Icardo, Jacinto Duarte, Raquel Duran, Francisco Escamilla-Sevilla, Mario Ezquerra, Cici Feliz, Manel Fernández, Rubén Fernández-Santiago, Ciara Garcia, Pedro García-Ruiz, Pilar Gómez-Garre, Maria Jose Gomez Heredia, Isabel Gonzalez-Aramburu, Ana Gorostidi Pagola, Janet Hoenicka, Jon Infante, Silvia Jesús, Adriano Jimenez-Escrig, Jaime Kulisevsky, Miguel A. Labrador-Espinosa, Jose Luis Lopez-Sendon, Adolfo López de Munain Arregui, Daniel Macias, Irene Martínez Torres, Juan Marín, Maria Jose Marti, Juan Carlos Martínez-Castrillo, Carlota Méndez-del-Barrio, Manuel Menéndez González, Marina Mata, Adolfo Mínguez, Pablo Mir, Elisabet Mondragon Rezola, Esteban Muñoz, Javier Pagonabarraga, Pau Pastor, Francisco Perez Errazquin, Teresa Periñán-Tocino, Javier Ruiz-Martínez, Clara Ruz, Antonio Sanchez Rodriguez, María Sierra, Esther Suarez-Sanmartin, Cesar Tabernero, Juan Pablo Tartari, Cristina Tejera-Parrado, Eduard Tolosa, Francesc Valldeoriola, Laura Vargas-González, Lydia Vela, Francisco Vives, Alexander Zimprich, Lasse Pihlstrom, Mathias Toft, Pille Taba, Sulev Koks, Sharon Hassin-Baer, Kari Majamaa, Ari Siitonen, Pentti Tienari, Njideka U. Okubadejo, Oluwadamilola O. Ojo, Chingiz Shashkin, Nazira Zharkinbekova, Vadim Akhmetzhanov, Gulnaz Kaishybayeva, Altynay Karimova, Talgat Khaibullin, and Timothy L. Lynch
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-26280-1.
References
- 1.Wouters OJ, McKee M, Luyten J. Estimated research and development investment needed to bring a new medicine to market, 2009–2018. JAMA - J. Am. Med. Assoc. 2020;323:844–853. doi: 10.1001/jama.2020.1166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Smietana K, Siatkowski M, Møller M. Trends in clinical success rates. Nat. Rev. Drug Discov. 2016;15:379–380. doi: 10.1038/nrd.2016.85. [DOI] [PubMed] [Google Scholar]
- 3.Harrison RK. Phase II and phase III failures: 2013–2015. Nat. Rev. Drug Discov. 2016;15:817–818. doi: 10.1038/nrd.2016.184. [DOI] [PubMed] [Google Scholar]
- 4.Nelson MR, et al. The support of human genetic evidence for approved drug indications. Nat. Genet. 2015;47:856–860. doi: 10.1038/ng.3314. [DOI] [PubMed] [Google Scholar]
- 5.King EA, Wade Davis J, Degner JF. Are drug targets with genetic support twice as likely to be approved? Revised estimates of the impact of genetic support for drug mechanisms on the probability of drug approval. PLoS Genet. 2019;15:1–20. doi: 10.1371/journal.pgen.1008489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Hingorani AD, et al. Improving the odds of drug development success through human genomics: modelling study. Sci. Rep. 2019;9:1–25. doi: 10.1038/s41598-019-54849-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Finan C, et al. The druggable genome and support for target identification and validation in drug development. Sci. Transl. Med. 2017;9:eaag1166. doi: 10.1126/scitranslmed.aag1166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Schmidt AF, et al. Genetic drug target validation using Mendelian randomisation. Nat. Commun. 2020 doi: 10.1038/s41467-020-16969-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Nalls MA, et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 2019;18:1091–1102. doi: 10.1016/S1474-4422(19)30320-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Iwaki, H. et al. Genetic risk of Parkinson disease and progression: an analysis of 13 longitudinal cohorts. Neurol. Genet. 5, e354 (2019). [DOI] [PMC free article] [PubMed]
- 11.Blauwendraat, C. et al. Parkinson’s disease age at onset genome-wide association study: defining heritability, genetic loci, and α-synuclein mechanisms. Mov. Disord. 1–10, 10.1002/mds.27659 (2019). [DOI] [PMC free article] [PubMed]
- 12.Katan MB. Apoliporotein E isoforms, serum cholesterol, and cancer. Lancet. 1986;1:507–508. doi: 10.1016/s0140-6736(86)92972-7. [DOI] [PubMed] [Google Scholar]
- 13.Davey Smith G, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 2003;32:1–22. doi: 10.1093/ije/dyg070. [DOI] [PubMed] [Google Scholar]
- 14.Hingorani A, Humphries S. Nature’s randomised trials. Lancet. 2005;366:1906–1908. doi: 10.1016/S0140-6736(05)67767-7. [DOI] [PubMed] [Google Scholar]
- 15.Holmes MV, Ala-Korpela M, Davey Smith G. Mendelian randomization in cardiometabolic disease: challenges in evaluating causality. Nat. Rev. Cardiol. 2017;14:577–590. doi: 10.1038/nrcardio.2017.78. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Zhu Z, et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 2016;48:481–487. doi: 10.1038/ng.3538. [DOI] [PubMed] [Google Scholar]
- 17.Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet.53, 1300–1310 (2021). [DOI] [PMC free article] [PubMed]
- 18.Wang D, et al. Comprehensive functional genomic resource and integrative model for the human brain. Science. 2018;362:eaat8464. doi: 10.1126/science.aat8464. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Burgess S, Dudbridge F, Thompson SG. Combining information on multiple instrumental variables in Mendelian randomization: Comparison of allele score and summarized data methods. Stat. Med. 2016;35:1880–1906. doi: 10.1002/sim.6835. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Nalls MA, et al. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson’s disease. Nat. Genet. 2014;46:989–993. doi: 10.1038/ng.3043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Rose AAN, Biondini M, Curiel R, Siegel PM. Targeting GPNMB with glembatumumab vedotin: current developments and future opportunities for the treatment of cancer. Pharmacol. Ther. 2017;179:127–141. doi: 10.1016/j.pharmthera.2017.05.010. [DOI] [PubMed] [Google Scholar]
- 23.Wang LH, Johnson EM. Mixed lineage kinase inhibitor CEP-1347 fails to delay disability in early Parkinson’s disease. Neurology. 2008;71:462–463. doi: 10.1212/01.wnl.0000324506.93877.5e. [DOI] [PubMed] [Google Scholar]
- 24.Baird DA, et al. Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome. PLoS Genet. 2021;17:1–26. doi: 10.1371/journal.pgen.1009224. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Burgess S, et al. Guidelines for performing Mendelian randomization investigations. Wellcome Open Res. 2020;4:186. doi: 10.12688/wellcomeopenres.15555.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Bowden J, et al. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic. Int. J. Epidemiol. 2016;45:1961–1974. doi: 10.1093/ije/dyw220. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Haycock PC, et al. Statistical Commentary Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies 1. Am. J. Clin. Nutr. 2016;103:965–978. doi: 10.3945/ajcn.115.118216. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Slob, E. A. W. & Burgess, S. A comparison of robust Mendelian randomization methods using summary data. Genet. Epidemiol. 1–17, 10.1002/gepi.22295 (2020). [DOI] [PMC free article] [PubMed]
- 29.Yavorska OO, Burgess S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int. J. Epidemiol. 2017;46:1734–1739. doi: 10.1093/ije/dyx034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Hemani G, Bowden J, Davey Smith G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 2018;27:R195–R208. doi: 10.1093/hmg/ddy163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014). [DOI] [PMC free article] [PubMed]
- 32.Zheng J, et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 2020;52:1122–1131. doi: 10.1038/s41588-020-0682-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Kia DA, et al. Integration of eQTL and Parkinson’s disease GWAS data implicates 11 disease genes. JAMA Neurol. 2020 doi: 10.1101/627216. [DOI] [Google Scholar]
- 34.Escott-Price V, et al. Polygenic risk of Parkinson disease is correlated with disease age at onset. Ann. Neurol. 2015;77:582–591. doi: 10.1002/ana.24335. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Ibanez L, et al. Parkinson disease polygenic risk score is associated with Parkinson disease status and age at onset but not with alpha-synuclein cerebrospinal fluid levels. BMC Neurol. 2017;17:1–9. doi: 10.1186/s12883-017-0978-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Nalls MA, et al. Genetic risk and age in Parkinson’s disease: continuum not stratum. Mov. Disord. 2015;30:850–854. doi: 10.1002/mds.26192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Sun BB, et al. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–79. doi: 10.1038/s41586-018-0175-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Emilsson V, et al. Co-regulatory networks of human serum proteins link genetics to disease. Science. 2018;361:769–773. doi: 10.1126/science.aaq1327. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Suhre K, et al. Connecting genetic risk to disease end points through the human blood plasma proteome. Nat. Commun. 2017;8:14357. doi: 10.1038/ncomms14357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Hillary, R. F. et al. Genome and epigenome wide studies of neurological protein biomarkers in the Lothian Birth Cohort 1936. Nat. Commun. 10, 3160 (2019). [DOI] [PMC free article] [PubMed]
- 41.Foltynie, T. & Athauda, D. Repurposing anti-diabetic drugs for the treatment of Parkinson’s disease: Rationale and clinical experience. Progress in Brain Research 252 (Elsevier B.V., 2020). [DOI] [PubMed]
- 42.Rotermund C, Machetanz G, Fitzgerald JC. The therapeutic potential of metformin in neurodegenerative diseases. Front. Endocrinol. 2018;9:1–26. doi: 10.3389/fendo.2018.00400. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Mor DE, et al. Metformin rescues Parkinson’s disease phenotypes caused by hyperactive mitochondria. PNAS. 2020;117:26438–26447. doi: 10.1073/pnas.2009838117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Shi Q, Liu S, Fonseca VA, Thethi TK, Shi L. Effect of metformin on neurodegenerative disease among elderly adult US veterans with type 2 diabetes mellitus. BMJ Open. 2019;9:e024954. doi: 10.1136/bmjopen-2018-024954. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genet. Med. 2002;4:45–61. doi: 10.1097/00125817-200203000-00002. [DOI] [PubMed] [Google Scholar]
- 46.Chanock SJ, et al. Replicating genotype-phenotype associations. Nature. 2007 doi: 10.1038/447655a. [DOI] [PubMed] [Google Scholar]
- 47.Marigorta UM, Rodríguez JA, Gibson G, Navarro A. Replicability and prediction: lessons and challenges from GWAS. Trends Genet. 2018;34:504–517. doi: 10.1016/j.tig.2018.03.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Burgess, S., Foley, C. N. & Zuber, V. Inferring causal relationships between risk factors and outcomes from genome-wide association study data stephen. Annu. Rev. Genom. Hum. Genet. 303–27, 10.1146/annurev-genom-083117-021731-021731 (2018). [DOI] [PMC free article] [PubMed]
- 49.Huffman JE. Examining the current standards for genetic discovery and replication in the era of mega-biobanks. Nat. Commun. 2018;9:1–4. doi: 10.1038/s41467-018-07348-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Paternoster, L., Tilling, K. & Smith, G. D. Genetic epidemiology and Mendelian randomization for informing disease therapeutics: conceptual and methodological challenges. 1–10, 10.1371/journal.pgen.1006944 (2017). [DOI] [PMC free article] [PubMed]
- 51.Munafò MR, Tilling K, Taylor AE, Evans DM, Davey Smith G. Collider scope: when selection bias can substantially influence observed associations. Int. J. Epidemiol. 2018;47:226–235. doi: 10.1093/ije/dyx206. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Bartlett JW, Harel O, Carpenter JR. Asymptotically unbiased estimation of exposure odds ratios in complete records logistic regression. Am. J. Epidemiol. 2015;182:730–736. doi: 10.1093/aje/kwv114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Gkatzionis A, Burgess S. Contextualizing selection bias in Mendelian randomization: How bad is it likely to be? Int. J. Epidemiol. 2019;48:691–701. doi: 10.1093/ije/dyy202. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Griffith GJ, et al. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat. Commun. 2020;11:1–12. doi: 10.1038/s41467-020-19478-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Ferreira, P. G. et al. The effects of death and post-mortem cold ischemia on human tissue transcriptomes. Nat. Commun. 9, 490 (2018). [DOI] [PMC free article] [PubMed]
- 56.Hegvik, T. A. et al. Druggable genome in attention deficit/hyperactivity disorder and its co-morbid conditions. New avenues for treatment. Mol. Psychiatry10.1038/s41380-019-0540-z (2019). [DOI] [PMC free article] [PubMed]
- 57.Jacobs BM, et al. Summary-data-based Mendelian randomization prioritizes potential druggable targets for multiple sclerosis. Brain Commun. 2020;2:1–17. doi: 10.1093/braincomms/fcaa119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Storm, C. S. Finding genetically-supported drug targets for Parkinson’s disease using Mendelian randomization of the druggable genome. GitHub Repos. (2020). [DOI] [PMC free article] [PubMed]
- 59.R Core Team. R: A Language and Environment for Statistical Computing (2019).
- 60.Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Hemani G, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408. doi: 10.7554/eLife.34408. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Del Greco M F, Minelli C, Sheehan NA, Thompson JR. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat. Med. 2015;34:2926–2940. doi: 10.1002/sim.6522. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The data generated in this study and a data dictionary (Supplementary Information) are provided in the Supplementary Information. Tissue-specific eQTL data were obtained from the eQTLGen (https://eqtlgen.org/) and PsychENCODE consortia (http://resource.psychencode.org/). The druggable genome data were obtained via from the original authors; an immediately accessible version of the druggable genome is available in the supplementary materials of the original publication7. Subsets of the most recent PD risk GWAS were obtained from the original authors9. GWAS summary statistics for PD age at onset are available on the IPDGC website (http://pdgenetics.org/resources). Parkinson’s progression GWAS data can be found here: https://pdgenetics.shinyapps.io/pdprogmetagwasbrowser/.
The code used for this study is openly available on GitHub, accompanied by instructions for use and required tools (https://github.com/catherinestorm/mr_druggable_genome_pd). Data processing was completed using R software (version 3.6.3), with packages TwoSampleMR (version 0.5.4), MendelianRandomization (version 0.4.2), coloc (version 4.0.4), dplyr (version 1.0.0), readr (version 1.3.1), stringr (version 1.4.0), tidyverse (version 1.3.0), forestplot (version 1.9), plyr (version 1.8.6), devtools (version 2.3.0), remotes (version 2.1.1). The GWAS meta-analysis was completed using METAL (version 2011-03-25).