Skip to main content
Briefings in Bioinformatics logoLink to Briefings in Bioinformatics
. 2023 Dec 27;25(1):bbad470. doi: 10.1093/bib/bbad470

Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities

Song Zhai 1, Devan V Mehrotra 2, Judong Shen 3,
PMCID: PMC10782924  PMID: 38152980

Abstract

Polygenic risk scores (PRSs) have emerged as promising tools for the prediction of human diseases and complex traits in disease genome-wide association studies (GWAS). Applying PRSs to pharmacogenomics (PGx) studies has begun to show great potential for improving patient stratification and drug response prediction. However, there are unique challenges that arise when applying PRSs to PGx GWAS beyond those typically encountered in disease GWAS (e.g. Eurocentric or trans-ethnic bias). These challenges include: (i) the lack of knowledge about whether PGx or disease GWAS/variants should be used in the base cohort (BC); (ii) the small sample sizes in PGx GWAS with corresponding low power and (iii) the more complex PRS statistical modeling required for handling both prognostic and predictive effects simultaneously. To gain insights in this landscape about the general trends, challenges and possible solutions, we first conduct a systematic review of both PRS applications and PRS method development in PGx GWAS. To further address the challenges, we propose (i) a novel PRS application strategy by leveraging both PGx and disease GWAS summary statistics in the BC for PRS construction and (ii) a new Bayesian method (PRS-PGx-Bayesx) to reduce Eurocentric or cross-population PRS prediction bias. Extensive simulations are conducted to demonstrate their advantages over existing PRS methods applied in PGx GWAS. Our systematic review and methodology research work not only highlights current gaps and key considerations while applying PRS methods to PGx GWAS, but also provides possible solutions for better PGx PRS applications and future research.

Keywords: polygenic risk score, pharmacogenomics GWAS, challenges and opportunities, Eurocentric bias, Bayesian, multiple traits

INTRODUCTION

Polygenic risk scores (PRSs) have recently emerged as promising tools in disease genome-wide association studies (GWAS) for predicting human diseases and complex traits. This is particularly important for diseases/complex traits with polygenic genetic architectures, where many genetic variants have small but genuine effects that do not reach the genome-wide significant threshold [1]. A PRS combines multiple single nucleotide polymorphisms (SNPs) into a single aggregated score that can be used to predict disease risk. It is an individual-level score calculated based on the number of risk variants that a person carries, weighted by SNP effect sizes that are derived from an independent large-scale discovery GWAS. Thus, this score represents the total genetic risk of a specific individual for a particular trait, which can be used for clinical prediction or screening. To date, many PRSs have been successfully used in disease risk prediction and population stratification. For example, Khera et al. [2] developed a PRS for coronary artery disease (CAD), where the PRS-high group (i.e. the top 8.0% of the population) inherited Inline graphic 3-fold increased risk for CAD. Mavaddat et al. [3] built a PRS that was optimized for predicting estrogen receptor (ER)-specific disease. Individuals with the highest 1% of PRS had 4.37-fold (increased) risk of developing ER-positive disease, whereas those with the lowest 1% of PRS had 0.16-fold (decreased) risks.

In contrast to disease genetic studies, pharmacogenomics (PGx) studies explore how genetic variation influences drug responses, including drug metabolism, efficacy and toxicity, with the ultimate goal of improving and personalizing drug therapy. Such influence is usually via alterations in a drug’s pharmacokinetics (PK, i.e. absorption, distribution, metabolism, elimination) or via modulation of a drug’s pharmacodynamics (PD, i.e. modifying a drug’s target or perturbing biological pathways that alter sensitivity to the drug’s pharmacological effects) [4]. Like many complex traits, most drug responses in PGx are extremely polygenic [5, 6, 7]. For example, Muhammad et al. showed that the six PD and five PK phenotypes they studied were highly heritable, and the majority of the heritability was explained by small-effect and moderate-effect variants instead of large-effect variants, which demonstrates the potential for using PRS approaches in the clinic to improve prediction of PD/PK phenotypes to fulfill the promise of precision medicine [7]. There are some emerging PRS applications in PGx studies published in most recent years. Zhang et al. [8] developed a PRS from schizophrenia GWAS summary statistics reported by the Psychiatric Genomics Consortium for schizophrenia risk and showed that patients with higher PRSs tended to have less improvement with antipsychotic drug treatment. Similarly, Damask et al. [9] found that both the absolute and relative reduction rates of major adverse cardiovascular events by alirocumab treatment compared with placebo were greater in patients with higher PRS in the ODYSSEY OUTCOMES Trial, where they constructed the PRS using disease GWAS summary statistics from a genome-wide meta-analysis of CAD risk. In addition, Marston et al. [10] used a PRS constructed from 27 SNPs derived from a recent large-scale disease GWAS study (a meta-analysis GWAS for CAD) to successfully predict benefit from Evolocumab therapy in patients with atherosclerotic disease. On the other hand, instead of building PRSs with disease GWAS data, a few published studies choose to construct PRSs from drug-related data for safety or efficacy drug response predictions. For example, Lanfear et al. [11] built an efficacy PGx PRS from a PGx genome-wide analysis of Inline graphic-blocker Inline graphic SNP interaction, and successfully predicted all-cause mortality (Inline graphic-blocker benefit) in the European population. These examples highlight the potential benefits of developing PRSs for (safety or efficacy) drug response predictions in PGx studies, as well as their potential utilities in clinic.

PRS applications in PGx studies have been reviewed by several papers with different focuses. Specifically, Johnson et al. [12] and Cross et al. [13] reviewed 51 and 63 PRS application papers, respectively, focusing on the PRSs built with variants from disease GWAS. They included an overview of the PRS applications and successful findings in different disease areas, challenges and reporting guidelines. Siemens et al. [14] reviewed 89 papers, with a focus on the PRSs derived from pharmacogenetic variants associated with drug responses in either candidate gene PGx studies or PGx GWAS. In addition, the authors also reviewed the strategies of PRS performance evaluation and validation. Similarly, Kumuthini et al. [15] focused on the validation of the PRSs in PGx, and the potential impact on their translation into clinical utility. Regarding the PRS methods developed in the disease genetics field, there are several papers reviewing the PRS methodologies using disease GWAS summary statistics, focusing on the genotype (main or prognostic) effects only [16, 17, 18]. In contrast, PRSs built for PGx studies from randomized clinical trials (RCTs) need to handle both prognostic and predictive effects. This is because in PGx studies, a patient’s clinical outcomes are influenced by both prognostic and predictive factors. A prognostic biomarker provides information about an endpoint (i.e. clinical outcome) irrespective of the treatment type. However, a predictive biomarker is associated with an endpoint related to the treatment type (i.e. predicting treatment benefits). In the PRS context, the prognostic effect measures the main genotype (G) association strength with clinical outcome before any treatment intervention. On the other hand, the predictive effect measures the G x T interaction association strength with clinical outcome after treatment. While current practice involves in directly applying disease GWAS-based PRSs to PGx studies, Zhai et al. [19] pointed out that this approach might not fully recover the heritability of drug response since it relied on a stringent assumption which was barely satisfied in real PGx data. The authors further proposed a series of PRS-PGx methods using PGx GWAS summary statistics instead.

PRS modeling in PGx GWAS shares the same challenges as those in disease GWAS. These challenges include trans-ethnic bias across populations (i.e. GWAS sample size for non-European populations is relatively lower and a PRS derived from one population is expected to perform less well in other populations due to differences in allele frequencies, linkage disequilibrium (LD) patterns and effect sizes across different populations) [20, 21, 22] and architectural diversity across multiple correlated traits [23]. Some other challenges include the lack of clear guidelines on clinical interpretation of PGx polygenic models, PGx studies usually requiring more well-defined drug response clinical endpoints, and the lack of PRS reporting guidelines. Despite previous discussions and summaries of these challenges (for example, in the above review papers), possible strategies and solutions to tackling these challenges remain unclear. Therefore, it is critical to take one step further by proposing new PRS application strategies and methods. To gain insights into this landscape, we first conduct a systematic review of the current progress in terms of both the PRS applications and statistical methods development in PGx GWAS. We identify 90 papers published by 11 March 2022 for our PRS PGx applications review, and summarize 23 PRS methods used in these papers in our PRS methods review. We further analyze these systematic review results to provide insights into the status, trends and challenges of PRS applications and method developments in PGx GWAS and discuss potential areas for improvement.

Compared with PRS modeling in disease GWAS, PRS analysis in PGx GWAS with drug response endpoints (efficacy or safety) is more challenging and faces additionally unique challenges. They include the lack of knowledge about whether to use PGx GWAS, disease GWAS or both GWAS/variants in the base cohort (BC) for PRS construction, the small sample sizes in PGx GWAS from RCTs (compared to large disease cohorts) which often result in low power for prediction or association analysis, and the more complex statistical modeling for handling both prognostic and predictive effects simultaneously. There is a trade-off between choosing PGx and disease GWAS (summary statistics) data in the BC to build PRSs. Choosing disease GWAS data, which typically has a large sample size, usually provides large power for prognostic effect prediction, but low power for predictive effect (i.e. genotype-by-treatment interaction) prediction. In contrast, choosing PGx GWAS data, which typically has a relatively small sample size, usually provides lower power for prognostic effect prediction, but likely larger power for predictive effect prediction since PGx variants used for PRS construction are directly drug response related.

In this paper, rather than choosing either PGx or disease GWAS alone in the BC to build a PRS, we propose a new strategy to leverage both disease and PGx GWAS summary statistics. This approach benefits from the large sample size in disease genetic studies and the additionally strong predictive effects in PGx studies. In addition, similar to the cross-population disease GWAS PRS, leveraging and properly modeling trans-ethnic populations can potentially reduce the prediction bias for cross-population prediction in PGx GWAS. To overcome the challenge of Eurocentric or trans-ethnic bias in PGx GWAS, we extend the PRS-PGx-Bayes method [19] to conduct the cross-population PRS modeling with shared shrinkage parameters among multiple populations. The new method simultaneously handles both prognostic and predictive effects. We further perform extensive simulation studies to compare our novel PRS methods using both internal validation with a cross-validation strategy and external validation with an independently simulated validation dataset, as suggested by Siemens et al. [14] and Kumuthini et al. [15]. Furthermore, integrating multiple genetically correlated traits can potentially increase the effective sample size in PGx GWAS, and thus improve the power of PRS association test, PRS prediction accuracy and PRS-based patient stratification. To investigate the impact of complex genetic architectures among multiple traits, Zhai et al. [24] systematically reviewed and compared multiple types of multi-trait PRS methods including regression-based methods (mtPRS-ML and mtPRS-MR), meta−/multi-GWAS-based methods (mtPRS-minP and mtPRS-GSEM), PCA-based method (mtPRS-PCA), and omnibus mtPRS method (mtPRS-O) under various genetic architectures. In this paper, we briefly summarize the current status of applying mtPRS methods to PGx GWAS and main observations from our previous mtPRS method research work [24].

In summary, in this paper, we aim to provide an in-depth overview of the current status of both PRS applications in PGx GWAS and the PRS methods used in those applications, identify the main gaps and challenges, and then propose the possible solutions including two new PRS application strategies and methods for filling the gaps and tackling the challenges. The overall workflow of the paper is summarized in Figure S1.

METHODS

Our review aimed to summarize the current applications of PRSs in PGx GWAS as well as the PRS analysis methods used in the applications. The detailed paper searching workflow is provided in Figure S2. The study screening was conducted by two independent reviewers (Z.S. and S.J.) to minimize selection bias. In the ‘Identification’ step, we used the same search terms as used by Johnson et al. [12], which mainly included two categories: polygenic score related terms and drug response related terms. Our paper search was based on Medline and EMBASE databases up to 11 March 2022, separately. Non-English publications were excluded due to resource constraints. In the ‘Pre-screening’ step, we excluded publications by title and abstract. Publications with abstract only, reviews, letters, notes, etc. were excluded as they did not contain enough details about PRSs [e.g. base and target cohorts (TC)] for our review. The remaining articles that were not drug related and did not use qualified PRSs (e.g. not genome-wide, not genetic variation based or unweighted PRSs) were further excluded. In the ‘Combination’ step, we combined the papers from Medline and EMBASE together and removed duplicate records. In the ‘Screening’ step, records were further excluded based on a full-text screen due to either unqualified publication types or unqualified approaches to build PRSs. Specifically, papers that were not drug related, not drug response PGx study related, not constructing qualified PRSs or only working on TC via cross-validation, were excluded. The whole workflow followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines [25] to ensure completeness of the review.

In addition to summarizing the PRS applications in PGx studies, we also summarized the PRS analysis methods used in the 90 identified PGx PRS application papers as well as some additional PRS methods reviewed in other methodology review papers [16, 17, 18]. Regarding the challenges of applying PRSs to PGx GWAS, in this paper we mainly focus on three key challenges: (i) the lack of knowledge about whether to choose PGx, disease or a combination of both PGx and disease GWAS summary statistics in the BC for PRS construction; (ii) the Eurocentric or trans-ethnic bias in cross-population PRS prediction and (iii) the small sample size, low power and more complex PRS modeling in PGx GWAS. We propose two novel PRS strategies and methods to overcome the challenges, which provide potential solutions for better future PRS applications in the field.

RESULTS

Overview of main challenges of PRS applications in PGx GWAS

Our initial search identified 834 papers from Medline and 2487 papers from EMBASE between 2013 and 2022. After pre-screening and combining the two databases, 127 papers were left for full-text screening. Ultimately, 90 papers were included in our systematic review (Figure S2, Table S1). Despite a steady increase in the number of PRS application articles between 2013 and 2022 (Figure 1), the PRS analysis in PGx GWAS still faces multiple challenges, which have been extensively discussed in the published overview papers [12–15] and briefly summarized in the previous section. In this paper, we focus on the three aforementioned key challenges in greater details.

Figure 1.

Figure 1

Number of PRS applications included in the review over time. (A) Source types of GWAS summary statistics in BC over time. (B) Types of PGx studies in the TC over time. (C) Ancestries of participants in BC over time. (D) Ancestries of participants in the TC over time. (E) Pattern of single−/multi-trait in BC over time (‘Single Trait’ means a paper only uses one single trait for PRS analysis; ‘stPRS’ means a paper explores multiple traits one at a time; and ‘mtPRS’ means a paper construct PRS using multiple traits jointly).

Challenge 1: Lack of knowledge about whether to choose PGx, disease or both GWAS summary statistics in the base cohort for PRS construction

The use of PRSs in PGx requires a BC and a TC. In the BC, summary statistics are generated to obtain risk variant effect sizes, standard errors and p-values to inform PRS calculations; in the TC, the PRS is developed and tested. To date, many researchers leverage GWAS derived from large cohorts of related disease and/or complex trait (i.e. non-disease) phenotypes in the development of polygenic models to predict drug outcome. For example, Figure 1A and B indicate that 74 out of 90 identified articles (~82%) published by 2022 use BC GWAS from disease studies. These include disease phenotypes related to the drug efficacy [8, 9, 26–68], complex trait phenotypes related to the drug efficacy [69–81], disease phenotypes related to the adverse drug reaction [82–91], and complex trait phenotypes related to the adverse drug reaction [92–96]. However, our previous research showed that such disease PRS approach cannot recover the full heritability of drug response unless an extremely stringent assumption that every causal variant has an interaction effect proportionate to its main effect is true [19]. More details are provided in the following section. Therefore, 16 out of 90 papers construct the PRSs for drug response prediction using PGx GWAS summary statistics directly instead of using disease GWAS summary statistics since the PGx variants are directly associated with drug response, which may provide larger power (especially in predicting drug response through the predictive effect of a PRS) [11, 97–111]. However, using PGx GWAS as the BC for PRS modeling presents unique challenges due to its typically small sample size, and the difficulty in identifying two cohorts with uniformly treated patients. These factors can lead to increased PRS modeling uncertainty and result in low PRS prediction power. To date, it remains unclear whether it is optimal to use disease GWAS summary statistics that emphasize disease variants with only prognostic effects, PGx GWAS summary statistics that emphasize PGx variants with both prognostic and predictive effects, or a combination of both in the BC for PRS construction and analysis in PGx GWAS.

Challenge 2: Eurocentric or trans-ethnic bias in cross-population PRS prediction

Recent studies have found that PRSs exhibit reduced cross-population prediction accuracy, particularly in non-European populations [20, 21, 22]. Building PRSs with GWAS data from the same non-European population may yield limited prediction accuracy due to the typically small sample size of non-European GWAS compared with European GWAS. This applies to PRS applications in both disease GWAS and PGx GWAS. Conversely, constructing PRSs using European GWAS with larger sample size may offer limited improvement in prediction accuracy due to Eurocentric or trans-ethnic bias. This bias arises from differences in allele frequencies, LD patterns and effect sizes across populations [20, 21, 22]. With the increasing efforts to diversify genomic study samples, non-European genomic resources have been expanded, leading to a rapid growth in PGx studies leveraging multiple populations (Figure 1C and D), from 1 study in 2016 to 22 in 2021. For instance, Cearns et al. constructed a PRS using meta summary statistics calculated from seven cohorts with European, American, Asian and African populations [55]. To mitigate the impact of trans-ethnic bias, there is a pressing need to develop PRS methods that can handle such trans-ethnic data. Recent publications have introduced new methods for trans-ethnic PRS analysis in disease genetics [112, 113], but it is unclear whether these methods can be directly applied to PGx studies. To date, to our knowledge, no PGx PRS methods have been developed to address trans-ethnic drug response prediction under PGx settings.

Challenge 3: Small sample size, low power and more complex PRS modeling in PGx GWAS

Subjects in the PGx GWAS are usually from RCTs, where the sample sizes are much smaller than those from large disease cohorts. Such small sample size typically results in low power for predicting drug responses with prognostic PRS components. There are many strategies to increase the power of PRS analysis in PGx GWAS. In this sense, integrating multiple traits during PGx PRS construction provides a natural way to increase power. To date, most PRSs are constructed using a single trait (i.e. using univariate disease or PGx GWAS summary statistics). However, most disease and PGx GWAS data are multivariate in nature with multiple correlated traits or drug responses. Intuitively, leveraging information from multiple correlated traits can potentially capture more genetic variance, thus boosting the power of PRS analysis. In fact, the number of published papers that explore multiple traits increased rapidly between 2019 and 2021 from 4 to 25 (Figure 1E). Specifically, 20 out of 25 papers analyze multiple PRSs separately. These PRSs are constructed from multiple diseases and/or complex traits that are potentially related to the drug response phenotype in the TC. For example, in Fanelli et al., the authors investigated the possible association of PRSs for bipolar disorder, major depressive disorder, neuroticism and schizophrenia with antidepressant non-response or non-remission in patients with major depressive disorder [28]. The remaining papers analyzed multiple PRSs jointly using either machine learning (ML) based regression or multivariate regression. For example, Taylor et al. used 10 PRSs for depression and genetically correlated traits as predictors in an elastic net model to predict response in tertiary care patients with resistant depression [81].

There are additional challenges to be addressed while constructing a multi-trait PRS in PGx GWAS compared with disease GWAS. First, a PGx multi-trait PRS requires more complex statistical modeling to handle prognostic and predictive effects simultaneously. In literature, very few papers explore this strategy. For example, Lanfear et al. [11] presented a successful PRS application example in PGx GWAS study, which built a PRS using 44 SNPs with predictive (or treatment-by-SNP interaction) effects only. Second, the genetic architecture is more complicated when considering different effect correlation relationships (e.g. magnitudes and directions) between multiple traits in the BC and the phenotype in the TC. It remains a question whether the current multi-trait PRS (mtPRS) approaches built for disease multi-trait PRS analysis are still robust when applied to PGx GWAS or not.

Overview of the PRS methods applied in PGx GWAS

A large variety of methods are available for PRS construction and analysis. They differ from each other in terms of which variants and weights are used for constructing PRS and conducting PRS association analysis and/or prediction. Existing PRS methods generally fall into one of four categories: (i) clumping and thresholding (C + T) approaches, which shrink effect sizes of non-significant SNPs to zero according to their p-values, and account for LD by clumping variants at a given LD; (ii) methods that select variants jointly using penalized regression in the framework of ML, where the number of selected causal variants is controlled by the penalty parameter, and the LD matrix is intrinsic to the algorithm [114–118]; (iii) methods that account for LD through a linear mixed effects model, and estimate effect sizes as best linear unbiased predictions (BLUP) [119, 120, 121] and (iv) Bayesian approaches that explicitly model causal effects and LD to infer the posterior distribution of causal effect sizes, where the shrinkage is controlled by the prior distributions, and the LD matrix is integral to the algorithm [19, 112, 122–126]. Compared with simple C + T method, genome-wide model fitting with penalized or Bayesian regression generally better accounts for LD and has more efficient bias-variance trade-off, thus achieving better performance. In this paper, we summarized 25 PRS methods (including two novel methods/strategies we propose to tackle the challenges) using a three-level hierarchical structure (Table 1). On the first level, we categorized the methods based on the type of BC (i.e. using disease GWAS only, using PGx GWAS only or leveraging both disease and PGx GWAS summary statistics). On the second level, the methods are further categorized based on trait and ancestry information (i.e. single-trait single-ancestry; single-trait trans-ethnic; and multi-trait single-ancestry approaches). We do not consider the multi-trait trans-ethnic approaches since no such methods are currently available. On the third level, methods are divided into the four categories we mentioned before: C + T, ML, BLUP and Bayesian regression. More details about the 23 existing methods including their software sources are summarized in Table S2.

Table 1.

Summary of 23 existing methods and two newly proposed analysis strategies and methods for PRS analysis, which are organized in a three-level hierarchical structure. On the first level, methods are categorized into using disease GWAS only, using PGx GWAS only, and leveraging both disease and PGx GWAS summary statistics. On the second level, methods are further divided into single-trait single-ancestry, single-trait trans-ethnic, and multi-trait single-ancestry approaches. On the third level, methods are assigned into four categories: C + T, ML, BLUP and Bayesian regression. The new strategies and methods [i.e. PRS-PGx-Bayes (Disease + PGx) and PRS-PGx-Bayesx] are highlighted in bold, which are proposed to address the challenges in the current PRS PGx methods. The number in the ‘Applications (# of papers)’ represents the total number of papers out of the 90 papers that use the method in that row. ‘0’ in the same column indicates this method is not from the 90 papers, but from other method (review) papers

GWAS Scenario Category Method Description Required Data Tuning Parameters Applications (# of papers)
Disease GWAS Single-trait single-ancestry C + T C + T LD based clumping and p-value thresholding Summary statistics P-value threshold 60
ML Lassosum Lasso regression based Summary statistics Penalty (Inline graphic), s 1
MegaPRS A super learning algorithm choosing the best prediction model from Lasso, Ridge, and BOLT-LMM Summary statistics Parameters that maximize prediction for each model 0
BLUP SBLUP BLUP in LMM Summary statistics NA 0
Bayesian LDpred Bayesian shrinkage using point-normal prior Summary statistics Fraction of causal markers, Heritability 4
LDpred2 Bayesian shrinkage method extended from LDpred, by enabling sparsity Summary statistics Fraction of causal markers, Heritability, Sparsity 1
PRS-CS Bayesian shrinkage method using global–local scale mixtures of normals prior Summary statistics Global shrinkage parameter (Inline graphic) 3
DBSLMM Bayesian shrinkage Summary statistics NA 0
SBayesR Bayesian shrinkage Summary statistics NA 0
Single-trait trans-ethnic C + T CT-Meta C + T based on meta-GWAS Summary statistics P-value threshold 22
ML TL-Multi Lasso-regression-based method extended from Lassosum. Conducting transfer learning to learn information from EUR to correct the bias for non-EUR Summary statistics Penalty (Inline graphic), s 0
Multi-ethnic PRS Combine training data from European samples and training data from the target population Summary statistics Mixing weights Inline graphic and Inline graphic 0
BLUP XP-BLUP Two-component LMM Individual data NA 0
Bayesian PRS-CSx Bayesian shrinkage method extended from PRS-CS, by leveraging trans-ethnic populations Summary statistics Global shrinkage parameter (Inline graphic) 0
Multi-trait single-ancestry C + T CT-Meta C + T based on meta-GWAS Summary statistics P-value threshold 0
ML mtPRS-ML Combine multiple PRS constructed by C + T from different traits with elastic-net regression Individual data P-value threshold, Penalty (Inline graphic), Inline graphic 2
mtPRS-MultiReg Combine multiple PRS constructed by C + T from different traits with multivariate regression Individual data NA 2
CTPR A cross-trait penalty function with the Lasso and the minimax concave penalty assuming the secondary trait is beneficial for predicting the primary trait Individual data Lasso penalty (Inline graphic), Minimax concave penalty (Inline graphic) 0
BLUP wMT-SBLUP Extended from SBLUP for multiple traits, where weighted indices are calculated from SBLUP estimates Summary statistics NA 0
Bayesian
PGx GWAS Single-trait single-ancestry C + T PRS-PGx-CT C + T on PGx GWAS summary statistics Summary statistics P-value threshold 0
ML PRS-PGx-L, GL, SGL Penalized regression using Lasso, Group Lasso, and Sparse Group Lasso Individual data Penalty (Inline graphic), s 0
BLUP
Bayesian PRS-PGx-Bayes Bayesian shrinkage method using two-dimensional global–local scale mixtures of normals prior based on PGx GWAS summary statistics Summary statistics Global shrinkage parameter (Inline graphic), v 0
Single-trait trans-ethnic Bayesian PRS-PGx-Bayesx Bayesian shrinkage method extended from PRS-PGx-Bayes, by leveraging trans-ethnic populations Summary statistics Global shrinkage parameter (Inline graphic), v
Multi-trait single-ancestry
Disease and PGx GWAS Single-trait single-ancestry Bayesian PRS-PGx-Bayes (Disease + PGx) Bayesian shrinkage method extended from PRS-PGx-Bayes, by leveraging joint information from both disease and PGx GWAS summary statistics Summary statistics Global shrinkage parameter (Inline graphic), v

Figure 2A presents an overview of the methods used in the 90 articles identified in this study. As of March 11, 2022, 91.1% (82/90) of the papers employed the C + T method for PRS construction, which is simple and user-friendly but may discard informative SNPs, potentially limiting the prediction power. A small proportion of papers (8/90 = 8.9%) utilized Bayesian methods, including PRS-CS, LDpred and LDpred2. Only two papers (2/90 = 2.2%) used ML based approaches, and no study investigated BLUP methods. Despite the prevalence of the C + T method in these studies, the use of more complex Bayesian and ML algorithms has been increasing since 2014, accounting for 39.1% and 34.8% of studies from 2022, respectively (Figure 2B). This trend reflects a growing interest in employing more advanced PRS modeling methods to improve performance in both disease and PGx GWAS.

Figure 2.

Figure 2

Number of PRS methods included in the review over time. (A) Number of papers stratified by four types of PRS methods (C + T, ML, BLUP and Bayesian) applied to PGx GWAS over time. The total number of papers across different methodologies is larger than 90 since some papers explored multiple methods. (B) Number of PRS methods developed over time, stratified by four types of PRS methods.

TACKLING CHALLENGE 1: LEVERAGING BOTH PGX AND DISEASE GWAS SUMMARY STATISTICS IN THE BASE COHORT FOR PRS CONSTRUCTION IN PGX PRS APPLICATIONS

Using disease GWAS in the base cohort may lead to poor drug response prediction performance

PGx PRS applications can be categorized into three types in terms of the BC GWAS type: disease GWAS, the PGx GWAS with treatment arm only, and the full PGx GWAS with two arms (treatment and control); and two types in terms of the TC study type: the PGx study with treatment arm only, and the PGx study with two arms. Using disease GWAS in the BC and then applying it to PGx studies represents the largest percentage (74/90 = 82%) of 90 identified papers, possibly due to the wide publicly available resources of disease GWAS summary statistics, and the difficulty in finding PGx GWAS data with the same or similar drug response endpoints. However, as we noted before, such disease PRS approach can barely recover the full heritability of drug response, which is consistent with what we report in Figure 3A. Specifically, the use of disease GWAS in the BC results in the largest failure rate (defined as the proportion of the papers with no significant association found between the PRS and the drug responses): 20% (15/74); when the BC GWAS are from PGx studies, the failure rate decreases to 6% (2/16). This is not surprising since most papers using disease GWAS-based PRSs in PGx applications rely on the assumption that if a variant is a significant signal for some specific disease, then that variant is also causal for the response of the drug targeting that disease. However, the underlying genetic correlation between complex trait/disease risk and drug response needs to be carefully investigated and may vary case by case. Moreover, more studies now tend to focus on PGx studies with two arms and perform the interaction test of differential treatment effect between the high and low genetic risk subgroups when stratified by the PRS. Many such successful findings have been reported [11, 110].

Figure 3.

Figure 3

PRS application success rates. (A) Success rates under different combinations of study types between BC and TC. (B) Success rates under different combinations of ancestries between BC and TC and (C) Success rates stratified by different types of traits. If a paper has ‘PRS success status = Yes’, then that paper has identified a statistically significant association between the PRS and the endpoint in the TC.

Disease PRS is not able to recover the full heritability of drug response in theory

We have previously proved that, in theory, it is difficult for a disease PRS to recover the full heritability of a drug response in PGx GWAS [19]. Here, we briefly summarize the theoretical derivation and the main conclusions. Consider a high-dimensional regression model where the drug response, after adjusting for the covariates, is determined by three components: treatment, genotype and genotype-by-treatment interaction. Furthermore, to capture the correlation between prognostic and predictive effects, we assume the two effects follow a multivariate-normal distribution. Under these assumptions, we derive the squared correlation coefficient between a disease PRS (Inline graphic) and the drug response (Inline graphic) for the treated subjects, denoted by Inline graphic. We proved that by the Cauchy-Schwarz inequality, Inline graphic, where Inline graphic denotes the underlying heritability of the drug response studied. It directly showed that a disease PRS cannot recover the full genetic variability of a drug response unless the equality holds. We further proved that the equality holds if and only if the interaction effect is proportional to the main effect for every causal variant, which is a very stringent assumption under PGx settings. In [19], the authors used a specific example with a real PGx GWAS data from the IMPROVE-IT RCT [127] to calculate Inline graphic, which demonstrated that any PRSs developed from disease GWAS explained at most 46% variability of the LDL-C drug response. This real data example highlights the importance of switching to using PGx GWAS/variants (like what [19] does) or the combination of PGx and disease GWAS/variants as the BC in the PGx PRS modeling and analysis.

Leveraging both PGx and disease GWAS in the base cohort for improving drug response prediction

It is worth noting that there are certain limitations when calculating PRSs using PGx GWAS summary statistics only. First, it can be challenging to find two independent PGx GWAS data with same or similar traits. Second, the sample size in PGx studies is usually small, which may lead to limited prediction power. In fact, Zhai et al. [19] showed in their simulation studies that PGx PRS methods did not necessarily outperform disease PRS approaches in the control arm. In other words, if we only focus on the control arm of a PGx study, a disease PRS may still be useful compared to a PGx PRS due to its large sample size, which provides greater power in PRS prediction in terms of the prognostic effect component. However, in real scenario we may be more interested in the treatment arm or both treatment and control arms. Therefore, a compromise and a direct alterative solution is to use both PGx GWAS data and disease GWAS summary statistics for PRS construction in the BC if both are available. In this section, we aim to compare different combinations between two PRS construction strategies of using summary statistics (either disease or PGx GWAS only or both disease and PGx GWAS) and four PRS methods (C + T, Lassosum, PRS-CS and PRS-PGx-Bayes). Eight approaches will be compared, which are listed in Table 2. In this table, we mainly focus on how to incorporate strategies of leveraging summary statistics into PRS methods. And the description details of the methods themselves can be found in Table S2.

Table 2.

Summary of eight combinations between two PRS construction strategies of using summary statistics (either disease or PGx GWAS only or both disease and PGx GWAS) and four PRS methods (C + T, Lassosum, PRS-CS and PRS-PGx-Bayes). Detailed descriptions of methods can be found in Table S2

Strategies of using different summary statistics
Using either disease or PGx GWAS only Using both disease and PGx GWAS
PRS Methods C + T C + T (Disease): Train the C + T model using disease GWAS summary statistics only from BC. After the p-value shrinkage, the algorithm outputs prognostic effect sizes of SNPs Inline graphic. The PRS is calculated as Inline graphic. And the drug response is predicted as Inline graphic. C + T (Disease + PGx): Train the C + T model using disease GWAS summary statistics and PGx GWAS summary statistics with genotype-by-treatment interaction information, respectively, from BC. After the p-value shrinkage, the algorithm outputs prognostic effect sizes of SNPs Inline graphic and predictive effect sizes of SNPs Inline graphic from disease and PGx GWAS, respectively. Two PRSs are calculated as Inline graphic and Inline graphic. And the drug response is predicted as Inline graphic.
Lassosum Lassosum (Disease): Similar to the above C + T method, train the Lassosum penalized regression model using disease GWAS summary statistics only from the BC. Lassosum (Disease + PGx): Similar to the above C + T method, train the Lassosum penalized regression model using disease GWAS summary statistics and PGx GWAS summary statistics with genotype-by-treatment information, respectively, from the BC.
PRS-CS PRS-CS (Disease): Similar to the above C + T method, train the PRS-CS Bayesian regression model using disease GWAS summary statistics only from the BC. PRS-CS (Disease + PGx): Similar to the above C + T method, train the PRS-CS Bayesian regression model under disease GWAS summary statistics and PGx GWAS summary statistics with genotype-by-treatment information, respectively, from the BC.
PRS-PGx-Bayes PRS-PGx-Bayes (PGx): Train the PRS-PGx-Bayes model using PGx GWAS summary statistics with genotype and genotype-by-treatment information jointly from BC. After the global–local shrinkage, the algorithm outputs prognostic effect sizes of SNPs Inline graphic and predictive effect sizes of SNPs Inline graphic. Two PRSs are calculated as Inline graphic and Inline graphic. And the drug response is predicted as Inline graphic. PRS-PGx-Bayes (Disease + PGx): Replace the genotype information in PGx GWAS summary statistics with disease GWAS summary statistics and keep the genotype-by-treatment information in PGx GWAS summary statistics unchanged. Then train the PRS-PGx-Bayes model under this composite summary statistics from the BC.

Simulation studies for evaluating the new strategy by leveraging both PGx and disease GWAS in the base cohort for PRS construction in PGx GWAS

We performed extensive simulation studies to compare the performance of methods that use disease GWAS summary statistics only, PGx GWAS summary statistics only, or both. We simulated genotype data using R package sim1000G v1.40 with different sample sizes. To simulate prognostic Inline graphic and predictive Inline graphic effect sizes, we used the same spike-and-slab distribution as described in Zhai et al. [19]. The two effects were either correlated (i.e. a causal variant had both non-zero prognostic and non-zero predictive effects) or fully separated (i.e. a causal variant had either non-zero prognostic or non-zero predictive effect). The prognostic effect is either on the same scale as, dominated by, or dominating the predictive effect. Details of the data generation process are provided in Supplementary Method A.

Figure 4 shows the scenario where prognostic and predictive effects are correlated, and the two effect sizes are on the same scale. Results were assessed via internal 5-fold cross-validation in the TC. Specifically, the whole TC was randomly split into five folds. The tuning parameters were determined using four folds, and the PRS was constructed and recorded in the remaining fold. In Figure 4, we initially employed disease PRS methods (C + T, Lassosum, PRS-CS) using disease GWAS summary statistics only as the base. Subsequently, we incorporated additional PGx data to check the potential improvement of disease PRS methods. Similarly, we utilized PGx PRS method (PRS-PGx-Bayes) using PGx GWAS summary statistics only as the base, and then added additional disease genetics data to assess the improvement. All disease PRS and PGx PRS methods showed substantial improvement in the drug response prediction and patient stratification after leveraging additional GWAS information compared to their counterpart traditional approach using either disease or PGx GWAS alone. For example, PRS-PGx-Bayes (Disease + PGx) generally outperformed PRS-PGx-Bayes (PGx), especially under the higher polygenicity scenario when the sample size of PGx GWAS data was small. In addition, when the sample size of PGx GWAS data was large enough (for example, n = 10 000), PRS-PGx-Bayes (PGx) was superior to PRS-PGx-Bayes (Disease + PGx) as shown in both Figure 4 with internal cross-validation and Figure S3 with external validation (i.e. the optimal tuning parameter was selected using an independently simulated validation PGx dataset with sample size 1000 as shown in Supplementary Method A.5). One possible explanation is that it is due to the difference in the prognostic effect estimate between disease and PGx GWAS. The same pattern was also observed when the proportion of causal SNPs increased from 0.001 to 0.1, although all methods generally had lower performance.

Figure 4.

Figure 4

Methods comparison results (with internal cross-validation) among methods that use either disease GWAS summary statistics only, PGx GWAS summary statistics only, or both. The prognostic and predictive effects were correlated, and in the same scale. Heritability was fixed at 0.3. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 23, 226 and 2263, respectively. The PGx GWAS summary statistics in the BC were calculated with 1000, 5000 and 10 000 subjects; the disease GWAS summary statistics in the BC were calculated with 50 000 subjects. The tuning parameters were selected via cross-validation in the TC. The performance was assessed in terms of (A) prediction accuracy R2, and (B) predictive p-value for the two-sided PRS-by-treatment interaction test. Data were presented as mean values +/− standard deviations (error bars) with 10 000 replications, where results were calculated from the TC.

Sensitivity analyses results are provided in Figures S3S5. Specifically, in Figure S3, we repeated our comparisons using external validation, where the tuning parameters were selected with an independent validation dataset. The same patterns held compared to the case where internal validation was used. Figure S4 shows the scenario where the prognostic and predictive effect sizes were on different scales. When the prognostic effect dominated the predictive effect, it is not surprising that incorporating disease genetics data into the PGx PRS approach had a much larger improvement than incorporating PGx data into disease PRS methods. When the predictive effect dominated the prognostic effect, PRS-PGx-Bayes (Disease + PGx) consistently achieved the highest prediction accuracy. Figure S5 checks the situation when the prognostic and predictive effects were fully separated. In that setting, the disease PRS methods using disease GWAS summary statistics alone had the lowest Inline graphic since they could hardly capture any variability explained by the interaction. Thus, incorporating PGx GWAS information greatly improved the performance of disease PRS methods.

TACKLING CHALLENGE 2: REDUCING EUROCENTRIC BIAS USING A NEW TRANS-ETHNIC PGX PRS METHOD (PRS-PGX-BAYESX)

Overview of the cross-population PRS applications in PGx GWAS

PGx PRS applications could be classified into three categories based on the BC ancestry: European, non-European (e.g. Asian, African) and Multiple (which is the mixture of, in most cases, European and TC non-European population); and three categories based on the TC ancestry: European, non-European and Multiple. No published articles were found to build PRSs with non-European ancestry alone in the BC unless the TC population is also non-European (the percentage is 8/90 = 9% in Figure 3B). On the other hand, applying a PRS built from European ancestry to the European population in the TC (which is referred to as ‘European + European’) remains the largest proportion (44/90 = 49%) of 90 selected studies, which is not surprising. In addition, Multiple + European doesn’t result in a higher success rate (defined as the proportion of the papers with significant associations found between the PRSs and the drug responses) compared to European + European (9/14 = 64% versus 36/44 = 82%). One possible explanation is that adding non-European subjects may add more noise relative to the amount of additional information. When the TC population is non-European, a trans-ethnic PRS analysis method is needed since a traditional PRS built on European population has limited transferability across ancestry groups [20, 21, 22], and a PRS built on non-European GWAS might yield reduced prediction accuracy due to its small sample size. However, among the 90 articles identified, only one paper followed this trans-ethnic PRS strategy, and failed to find any significant association between PRSs and the drug response. Due to the limited exploration of trans-ethnic analysis in PGx applications, it may imply great potential opportunities for applying cross-population analysis in future PGx studies.

A new trans-ethnic PGx PRS method (PRS-PGx-Bayesx) for cross-population PRS analysis in PGx GWAS

A few PRS methods have been published for trans-ethnic analysis in disease genetics, which can be adapted to the PGx setting. A brief description of these methods is listed below, using two populations, EUR and EAS, as an example.

  1. CT-Meta: C + T based on Meta-GWAS summary statistics, which is calculated by aggregating two disease GWAS summary statistics from EUR and EAS together.

  2. Multi-ethnic PRS [117]: Inline graphic, which uses a grid to search the optimal Inline graphic using cross-validation.

  3. PRS-CSx [112]: a method extended from PRS-CS method for trans-ethnic analysis via a shared continuous shrinkage prior across different populations.

However, to our knowledge, no method has been adapted to trans-ethnic PGx PRS analysis yet. In this study, we propose a novel method (PRS-PGx-Bayesx) which is an extension of the PRS-PGx-Bayes method [19] for the cross-population PRS construction and analysis.

Consider Inline graphic high-dimensional Bayesian regression models of Inline graphic patients and Inline graphic SNPs from Inline graphic studies (or populations in the trans-ethnic GWAS case):

graphic file with name DmEquation1.gif

where Inline graphic denote the drug response, the binary treatment assignment, and the Inline graphic genotype matrix in study Inline graphic, respectively. Inline graphic and Inline graphic are the prognostic and predictive effects of SNP Inline graphic in study Inline graphic, respectively.

The regression coefficient Inline graphic is assumed to be random in the PRS-PGx-Bayesx method, following a prior distribution:

graphic file with name DmEquation2.gif

Inline graphic controls the overall degree of shrinkage, while Inline graphic and Inline graphic control the marker-specific degree of shrinkage. Both Inline graphic and Inline graphic are shared across all Inline graphic studies. Further, assume the residual variance Inline graphic follows a non-informative scale-invariant Jeffreys prior, that is, Inline graphic. As suggested by Zhai et al. [19], we propose to use the hierarchical half-t prior [128] on the variance–covariance matrix Inline graphic:

graphic file with name DmEquation3.gif

Inline graphic denotes the inverse Wishart distribution with scale matrix Inline graphic and Inline graphic degrees of freedom, and G denotes the Gamma distribution.

Given the above prior information, we can derive posterior distributions:

graphic file with name DmEquation4.gif

It is worth noting that, as derived in Zhai et al. [19], LD reference panel of population k is required to calculate Inline graphic, for each Inline graphic. Detailed theoretical derivation is provided in Supplementary Method B, and the detailed algorithm of PRS-PGx-Bayesx method is summarized in Algorithm 1. We ran extensive simulations to compare the proposed PRS-PGx-Bayesx method with existing trans-ethnic PRS methods mentioned above.

graphic file with name bbad470fx1.jpg

Simulation studies for evaluating the proposed trans-ethnic PGx PRS method (PRS-PGx-Bayesx)

In this section, we performed extensive simulation studies to compare performances of multiple trans-ethnic methods with single-ethnic methods when the TC population is either European (EUR) or East Asian (EAS). Simulation studies were performed with simulated genotype data using sim1000G software for both EUR and EAS. We used a vector of parameters Inline graphic to denote the underlying true prognostic (Inline graphic)/predictive (Inline graphic) effect size in base (Inline graphic)/target (Inline graphic) cohort of European (with index 1)/non-European (with index 2) population. Following the simulation setup by Ruan et al. [112], we further assumed Inline graphic follows a multivariate-normal distribution with a well-defined variance–covariance matrix Inline graphic, where Inline graphic measures the correlation between prognostic and predictive effects, Inline graphic measures the effect correlation between European and non-European populations, Inline graphic measures the effect correlation between BC and TC. Details of the data generation process are provided in Supplementary Method C.

Figure 5 shows that under our simulation setup (i.e. EUR-EAS effect correlation Inline graphic), constructing PRSs using EUR alone outperforms using EAS alone even when the TC is EAS. And using EAS in the BC to predict EUR TC results in the smallest Inline graphic and the largest interaction p-value. When the TC is EUR, using EUR + EAS in the BC has limited improvements compared to using EUR alone due to the small sample size of EAS and the moderate correlation between EUR and EAS. On the other hand, when the TC is EAS, trans-ethnic analysis of EUR + EAS is superior to using EAS alone clearly by incorporating the information of EUR population. Lastly, PRS-PGx-Bayesx method is superior to all the other methods for drug response prediction across different proportions of causal variants.

Figure 5.

Figure 5

Methods comparisons between single-ethnic methods (using EUR or EAS alone) and trans-ethnic methods (using EUR and EAS jointly) when the effect correlation between EUR and EAS populations Inline graphic. The prognostic and predictive effects were correlated, and in the same scale. Heritability was fixed at 0.3. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 12, 119 and 1188, respectively, for EUR population; and were 11, 106 and 1065, respectively, for EAS population. In the BC, the disease GWAS summary statistics of EUR and EAS populations were calculated based on 50 000 and 10 000 subjects, respectively; the PGx GWAS summary statistics of EUR and EAS populations were calculated based on 5000 and 1000 subjects, respectively. Sample sizes of EUR and EAS populations in the TC were 5000 and 1000, respectively. The tuning parameters were selected via cross-validation in the TC. The performance was assessed in terms of (A) prediction accuracy R2, and (B) predictive p-value for the two-sided PRS-by-treatment interaction test. Data were presented as mean values +/− standard deviations (error bars) with 10 000 replications, where results were calculated from the TC.

For sensitivity analysis, we considered scenarios when the EUR-EAS effect correlation is low (Inline graphic; Figure S6) and when the EUR-EAS effect correlation is high (Inline graphic; Figure S7). When Inline graphic, using EUR for the prediction of EAS and using EAS for the prediction of EUR both retained the lowest prediction accuracy. Furthermore, trans-ethnic predictions are not necessarily superior to single-ethnic ones, since integrating EAS in the BC to predict EUR in the TC may include noise rather than signals, and vice versa. When Inline graphic, all approaches have better performance. Both Figures S6 and S7 indicate that our proposed novel PRS-PGx-Bayesx still outperforms other methods.

TACKLING CHALLENGE 3: USING MULTI-TRAIT PRS METHODS TO INCREASE POWER FOR PRS ANALYSIS IN PGX GWAS

We categorized the 90 selected papers into three categories: those using a single trait, those exploring multiple traits one at a time, and those building PRSs by aggregating multiple traits together. Figure 3C indicates that the three approaches had similar success rates of 82% (53/65), 85% (17/20) and 80% (4/5), respectively. This is possibly due to the very small number of the multi-trait PRS papers. With the success of leveraging information from multiple related traits for signal detection in GWAS [129] and WES (SNP-set analysis) [130], it is also appealing to use multiple traits to increase the power of PRS analysis for drug response prediction and patient stratification in PGx GWAS. Our review results in Figure 3C show that some PGx GWAS are starting to use multi-trait PRS methods. However, it remains unclear which methods are robust under different genetic architectures in PGx studies.

There are a variety of multi-trait PRS analysis methods in literature. For example, the regression-based methods [42, 71, 80, 81] fit a linear or a penalized regression model with individual PRSs from multiple traits as predictors; the meta-GWAS-based methods [131, 132] construct the PRSs using meta-GWAS summary statistics by aggregating individual GWAS summary statistics from multiple traits; the BLUP-based method (wMT-SBLUP) [121] combines the single-trait predictors with BLUP properties in a weighted index calculated from genome-wide SNP heritability, genetic correlation between traits, and expected squared correlations between the phenotype and BLUP predictors; the PCA-based method (mtPRS-PCA) [24] combines PRSs from multiple traits with weights obtained from performing PCA on the genetic correlation matrix, etc. In addition, a more robust method mtPRS-O [24] combines several complimentary multi-trait PRS methods via Cauchy Combination Test. With a variety of multi-trait PRS analysis methods developed, there is an urgent need to systematically evaluate their robustness to different genetic architectures. Zhai et al. [24] provided a comprehensive simulation framework and ran extensive simulations to systematically compare most of the multi-trait PRS methods under various genetic architectures covering different effect directions, signal sparseness and cross-trait correlation structures. Table S3 briefly summarizes the main features of the existing multi-trait PRS methods in terms of their pros, cons and performances in simulation studies from [24] and our additional simulation analyses (the detailed results are not shown). In terms of the PRS association test, mtPRS-O is the most robust method that achieves optimally larger power compared with other multi-trait methods; however, in terms of the drug response prediction, no single method uniformly outperforms the others across all scenarios. In summary, integrating multiple genetically correlated traits from disease GWAS does increase the power for PRS based drug response prediction and patient stratification in PGx GWAS.

DISCUSSION

In this study, we systematically review 90 PRS application papers in PGx GWAS for drug response prediction and patient stratification. We summarize 23 PRS methodologies from these 90 PRS application papers and three other PRS method review papers. From this review, we show that although both PRS application and PRS method development have progressed rapidly in PGx fields, the PRS analysis in PGx GWAS still faces multiple challenges from the PRS analysis method standpoint. In this paper, we mainly dive into three key challenges: (i) the lack of the knowledge on choosing PGx, disease or both GWAS summary statistics in the BC for PRS construction; (ii) the Eurocentric or trans-ethnic bias in the cross-population PRS prediction; (iii) the small sample size, low power and more complex PRS modeling in PGx GWAS. We further propose two new PRS analysis strategies and methods to tackle these challenges.

For the first challenge, under PGx settings, we compare traditional disease GWAS-based methods (C + T, Lassosum, PRS-CS) and PGx GWAS-based method (PRS-PGx-Bayes), with our proposed novel strategy of leveraging both disease and PGx GWAS summary statistics to construct PRS, PRS-PGx-Bayes (Disease + PGx). It is an extension of PRS-PGx-Bayes by replacing the prognostic effect size estimates from PGx GWAS summary statistics with the effect size estimates from disease GWAS summary statistics, which is simple and easy to be implemented. In the simulation studies, we find that the combination of disease GWAS based PRS analysis methods and additional information from PGx GWAS improves the drug response prediction accuracy. Moreover, PRS-PGx-Bayes (Disease + PGx) generally outperforms the other methods especially when the sample size of PGx GWAS in the BC is not large enough. Intuitively, using PGx GWAS summary statistics allows us to model the prognostic effect (i.e. genotype main effect) and predictive effect (i.e. genotype-by-treatment interaction effect) simultaneously, while including disease GWAS summary statistics from disease genetics studies generally provides a much larger sample size than PGx studies, which improves the modeling of the prognostic effect component of a PGx PRS. For the second challenge about the Eurocentric or trans-ethnic bias, although there are already some PRS methods in literature from the disease genetics field, their performance in PGx GWAS setting is unknown. To reduce the Eurocentric or trans-ethnic prediction bias, we propose a novel method, PRS-PGx-Bayesx, which is an extension of PRS-PGx-Bayes for trans-ethnic analysis by updating the global–local shrinkage parameters with cross-population information. Specifically, we assume that the variance–covariance matrix of prognostic and predictive effects of SNP j (Inline graphic) is shared across all K studies/populations, and the posterior distribution of Inline graphic is derived conditional on all K studies. Our simulation studies indicate that PRS-PGx-Bayesx is superior to other methods regardless of the correlation between European and non-European populations. For the third challenge, we focus on reviewing the methods which integrate multiple traits during PGx PRS construction and provide a natural way of increasing power for PRS analysis in PGx GWAS. In addition to the review of the applications of mtPRS methods in PGx GWAS, we further refer to our previous research work [24] and provide a comprehensive summary of the performance of the existing multi-trait performance in PGx GWAS.

Our study has some limitations. First, in this review, the study screening was conducted by two independent reviewers (Z.S. and S.J.) to minimize selection bias. However, there is currently no risk of bias assessment tool for PRS related reviews [133]. Therefore, some evidence may still be missing due to publication bias (e.g. negative results are less likely to be reported) [14, 134, 135]. Second, our efforts of tackling the first challenge demonstrate the fact that leveraging both disease and PGx GWAS summary statistics may further improve the power of PGx PRS analysis. However, it would be more challenging to obtain both disease GWAS and PGx GWAS with similar (or genetic correlated) phenotypes. In addition, it is difficult to evaluate the genetic correlation relationship between a disease phenotype and a relevant drug response phenotype (i.e. by calculating their genetic correlation). Our proposed solution by replacing Inline graphic of the PGx GWAS with Inline graphic from the disease GWAS is simple and intuitive. However, there is a gap between the prognostic effect sizes from disease GWAS and those from PGx GWAS. Therefore, our simple strategy may not increase the prediction accuracy, especially when the sample size of the PGx GWAS is large enough (we have demonstrated this point in the simulation results summarized in Figure 4). In theory, more complicated models can be constructed for further increasing the PRS prediction performance. However, they may face additional barriers in clinical interpretation and implementation. In addition, the PRS-PGx-Bayesx method we develop for tackling the second challenge is a Bayesian based method, which requires a relatively longer computational time (as shown in Figure S8). Therefore, it is recommended to perform PRS-PGx-Bayesx method by LD blocks [19, 124, 136] under the high performance computing or parallel computing environment. Besides, PRS-PGx-Bayesx is an extension from PRS-PGx-Bayes [19] for cross-population analysis, and it shares the same challenges of the PRS-PGx-Bayes method. For example, PRS-PGx-Bayesx uses one of the most popular continuous shrinkage priors, the global–local scale mixtures of normals (i.e. the Horseshoe prior), for effect size shrinkage. There are other candidate priors (e.g. the spike-and-slab prior and the Normal-Gamma shrinkage prior), and the currently existing Bayesian methods do not have a systematic way to determine the optimal prior. Third, we focus our study on the three main challenges in current PRS analyses under PGx settings. Other challenges mentioned before (e.g. the lack of guidance in clinical interpretation of PGx polygenic models) are beyond the scope of this paper, but have been discussed in other PRS PGx review papers [12–15]. Fourth, phenotypic characterization also presents a unique challenge within pharmacogenomic research as many drug outcomes are difficult to measure quantitatively [14]. Discrepant phenotyping may lead to different polygenic models being constructed depending on the definition of the drug outcome since they will result in different effective sample sizes and power for PRS analyses and predictions. Finally, with the rapid development of PRS applications and methodologies as we show in this paper, more efforts are needed to accelerate the PGx PRS application with clinical utilities. Further research should now aim at comparing the drug response prediction accuracy with and without the use of PRSs to demonstrate the benefit of PRSs in PGx applications.

As this field of research continues to grow, we believe that there are many promising applications for the use of PRSs in the context of PGx and precision medicine to further improve treatment outcomes. Our efforts of reviewing the current progress of PRS applications and methods in PGx GWAS, identifying the current main challenges and proposing new analysis strategies and methods to further overcome them in PGx PRS applications help move a step forward in the field and may accelerate the translation of PRSs to clinical practice.

Key Points

  • The application of PRSs in PGx GWAS has begun to show great potentials for improving patient stratification and drug response prediction. However, applying PRSs to PGx GWAS faces multiple challenges including (i) the lack of knowledge about whether PGx, disease or both GWAS/variants should be used in the BC; (ii) the Eurocentric or trans-ethnic bias; (iii) small sample sizes in PGx GWAS with low power and (iv) the more complex PRS modeling while handling both prognostic and predictive effects simultaneously, etc.

  • We conduct a systematic review of current progress in both PRS applications and PRS method developments in PGx GWAS to gain insights about the general trends, challenges and possible solutions.

  • We further propose (i) a novel PRS application strategy by leveraging both PGx and disease GWAS summary statistics in the BC for PRS construction in PGx PRS applications and (ii) a new Bayesian method (PRS-PGx-Bayesx) to reduce Eurocentric or across-population PRS prediction bias. Our extensive simulations demonstrate their advantages over existing PRS methods applied in PGx GWAS.

  • Our systematic review and methodology research work in this paper not only highlights current gaps and key considerations while applying PRS methods to PGx GWAS, but also provides possible solutions for better PGx PRS applications and future research.

Supplementary Material

PGxPRS_Review_ms_Supp_Final_bbad470
Table_S1_S2_bbad470
table_s1_s2_bbad470.xlsx (57.8KB, xlsx)

ACKNOWLEDGMENTS

The authors thank Dr. Hong Zhang for useful discussion and Drs Rachel Marceau West and Wujuan Zhong for their review comments on this manuscript.

Author Biographies

Song Zhai is a Senior Scientist, Biostatistics and Research Decision Sciences, Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. He holds a PhD in Statistics, and his current research focuses on biomarker discovery in pharmacogenomics and disease studies.

Devan V. Mehrotra is Vice President, Biostatistics and Research Decision Sciences at Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. His current research focus is on statistical methods for stratified medicine.

Judong Shen is a Senior Director, Biostatistics and Research Decision Sciences, Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA. His research focuses on developing statistical methods for data analysis in pharmacogenomics, disease genetics, drug target science and translational biomarker studies.

Contributor Information

Song Zhai, Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA.

Devan V Mehrotra, Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA.

Judong Shen, Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA.

FUNDING

The authors acknowledge that they received no funding in support for this research.

COMPETING INTERESTS

S.Z., D.V.M. and J.S. are employees at Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc., Rahway, NJ, USA.

DATA AND MATERIALS AVAILABILITY

1000 Genomes Phase3 v5 data: http://csg.sph.umich.edu/abecasis/mach/download/1000G.Phase3.v5.html. sim1000G R package v1.40: https://cran.r-project.org/web/packages/sim1000G/index.html. HapMap3 data: https://www.ncbi.nlm.nih.gov/variation/news/NCBI_retiring_HapMap/. LD blocks in human population: https://bitbucket.org/nygcresearch/ldetect-data/src/master. Lassosum: https://github.com/tshmak/lassosum. PRS-CS: https://github.com/getian107/PRScs. PRS-CSx: https://github.com/getian107/PRScsx. mtPRS-PCA and mtPRS-O: https://github.com/zhaiso1/mtPRS. PRS-PGx-Bayes R package v0.3.0: https://cran.r-project.org/web/packages/PRSPGx/index.html. R code we used to summarize our review results and the PRS-PGx-Bayesx R function are freely available at https://github.com/zhaiso1/PGxPRSOverview.

References

  • 1. Choi  SW, Mak  TS, O’Reilly  PF. Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc  2020;15:2759–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Khera  AV, Chaffin  M, Aragam  KG, et al.  Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet  2018;50:1219–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Mavaddat  N, Michailidou  K, Dennis  J, et al.  Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am J Human Genetics  2019;104:21–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Relling  MV, Evans  WE. Pharmacogenomics in the clinic. Nature  2015;526:343–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Roden  D, MacLeod  H, Relling  M, et al.  Genomics medicine 2 - pharmacogenomics. Lancet  2019;394:521–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Crouch  DJ, Bodmer  WF. Polygenic inheritance, GWAS, polygenic risk scores, and the search for functional variants. Proc Natl Acad Sci  2020;117:18924–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Muhammad  A, Aka  IT, Birdwell  KA, et al.  Genome-wide approach to measure variant-based heritability of drug outcome phenotypes. Clin Pharmacol Therapeutics  2021;110:714–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Zhang  JP, Robinson  D, Yu  J, et al.  Schizophrenia polygenic risk score as a predictor of antipsychotic efficacy in first-episode psychosis. Am J Psychiatry  2019;176:21–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Damask  A, Steg  PG, Schwartz  GG, et al.  Patients with high genome-wide polygenic risk scores for coronary artery disease may receive greater clinical benefit from alirocumab treatment in the ODYSSEY OUTCOMES trial. Circulation  2020;141:624–36. [DOI] [PubMed] [Google Scholar]
  • 10. Marston  NA, Kamanu  FK, Nordio  F, et al.  Predicting benefit from evolocumab therapy in patients with atherosclerotic disease using a genetic risk score: results from the FOURIER trial. Circulation  2020;141(8):616–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Lanfear  DE, Luzum  JA, She  R, et al.  Polygenic score for β-blocker survival benefit in European ancestry patients with reduced ejection fraction heart failure. Circulation. Heart Failure  2020;13:e007012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Johnson  D, Wilke  MA, Lyle  SM, et al.  A systematic review and analysis of the use of polygenic scores in pharmacogenomics. Clin Pharmacol Therapeutics  2022;111:919–30. [DOI] [PubMed] [Google Scholar]
  • 13. Cross  B, Turner  R, Pirmohamed  M. Polygenic risk scores: an overview from bench to bedside for personalized medicine. Front Genet  2022;13:1000667. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Siemens  A, Anderson  SJ, Rassekh  SR, et al.  A systematic review of polygenic models for predicting drug outcomes. J Personalized Med  2022;12:1394. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Kumuthini  J, Zick  B, Balasopoulou  A, et al.  The clinical utility of polygenic risk scores in genomic medicine practices: a systematic review. Hum Genet  2022;141(11):1697–1704. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Chung  W. Statistical models and computational tools for predicting complex traits and diseases. Genomics Inform  2021;19(4):e36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Ni  G, Zeng  J, Revez  JA, et al.  A comparison of ten polygenic score methods for psychiatric disorders applied across multiple cohorts. Biol Psychiatry  2021;90:611–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Pain  O, Glanville  KP, Hagenaars  SP, et al.  Evaluation of polygenic prediction methodology within a reference-standardized framework. PLoS Genet  2021;17:e1009021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Zhai  S, Zhang  H, Mehrotra  DV, et al.  Pharmacogenomics polygenic risk score for drug response prediction using PRS-PGx methods. Nat Commun  2022;13:1–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Martin  AR, Gignoux  CR, Walters  RK, et al.  Human demographic history impacts genetic risk prediction across diverse populations. Am J Human Genetics  2017;100(4):635–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Wang  Y, Guo  J, Ni  G, et al.  Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat Commun  2020;11(1):3865. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Duncan  L, Shen  H, Gelaye  B, et al.  Analysis of polygenic risk score usage and performance in diverse human populations. Nat Commun  2019;10(1):3328. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Vroom  CR, de  Leeuw  C, Posthuma  D, et al.  The more the merrier? Multivariate approaches to genome-wide association analysis. bioRxiv  2019;610287. 10.1101/610287. [DOI] [Google Scholar]
  • 24. Zhai  S, Guo  B, Wu  B, et al.  Integrating multiple traits for improving polygenic risk prediction in disease and pharmacogenomics GWAS. Brief Bioinform  2023;24(4):bbad181. [DOI] [PubMed] [Google Scholar]
  • 25. Page  MJ, McKenzie  JE, Bossuyt  PM, et al.  The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg  2021;88:105906. [DOI] [PubMed] [Google Scholar]
  • 26. Sánez Tähtisalo  H, Ruotsalainen  S, Mars  N, et al.  Human essential hypertension: no significant association of polygenic risk scores with antihypertensive drug responses. Sci Rep  2020;10(1):1–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Li  QS, Wajs  E, Ochs-Ross  R, et al.  Genome-wide association study and polygenic risk score analysis of esketamine treatment response. Sci Rep  2020;10(1):12649. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Fanelli  G, Benedetti  F, Kasper  S, et al.  Higher polygenic risk scores for schizophrenia may be suggestive of treatment non-response in major depressive disorder. Progress Neuro-Psychopharmacol Biol Psychiatry  2021;108:110170. [DOI] [PubMed] [Google Scholar]
  • 29. Zhong  Y, Yang  B, Su  Y, et al.  The association with quantitative response to attention-deficit/hyperactivity disorder medication of the previously identified neurodevelopmental network genes. J Child Adolesc Psychopharmacol  2020;30(6):348–54. [DOI] [PubMed] [Google Scholar]
  • 30. Amare  AT, Schubert  KO, Hou  L, et al.  Association of polygenic score for schizophrenia and HLA antigen and inflammation genes with response to lithium in bipolar affective disorder: a genome-wide association study. JAMA Psychiatry  2018;75(1):65–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Ward  J, Graham  N, Strawbridge  RJ, et al.  Polygenic risk scores for major depressive disorder and neuroticism as predictors of antidepressant response: meta-analysis of three treatment cohorts. PloS One  2018;13(9):e0203896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Kuo  HC, Wong  HS, Chang  WP, et al.  Prediction for intravenous immunoglobulin resistance by using weighted genetic risk score identified from genome-wide association study in Kawasaki disease. Circ Cardiovasc Genetics  2017;10(5):e001625. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. García-González  J, Tansey  KE, Hauser  J, et al.  Pharmacogenetics of antidepressant response: a polygenic approach. Progress Neuro-Psychopharmacol Biol Psychiatry  2017;75:128–34. [DOI] [PubMed] [Google Scholar]
  • 34. Hettige  NC, Cole  CB, Khalid  S, De Luca  V. Polygenic risk score prediction of antipsychotic dosage in schizophrenia. Schizophr Res  2016;170(2–3):265–70. [DOI] [PubMed] [Google Scholar]
  • 35. Kullo  IJ, Jouni  H, Austin  EE, et al.  Incorporating a genetic risk score into coronary heart disease risk estimates: effect on low-density lipoprotein cholesterol levels (the MI-GENES clinical trial). Circulation  2016;133(12):1181–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Vedak  P, Kroshinsky  D, St. John  J, et al.  Genetic basis of TNF-α antagonist associated psoriasis in inflammatory bowel diseases: a genotype-phenotype analysis. Aliment Pharmacol Ther  2016;43(6):697–704. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Tansey  KE, Guipponi  M, Domenici  E, et al.  Genetic susceptibility for bipolar disorder and response to antidepressants in major depressive disorder. Am J Med Genet B Neuropsychiatr Genet  2014;165(1):77–83. [DOI] [PubMed] [Google Scholar]
  • 38. Kogelman  LJ, Esserlind  AL, Christensen  AF, et al.  Migraine polygenic risk score associates with efficacy of migraine-specific drugs. Neurology Genetics  2019;5(6):e364. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Werner  MC, Wirgenes  KV, Haram  M, et al.  Indicated association between polygenic risk score and treatment-resistance in a naturalistic sample of patients with schizophrenia spectrum disorders. Schizophr Res  2020;218:55–62. [DOI] [PubMed] [Google Scholar]
  • 40. Mayén-Lobo  YG, Martínez-Magaña  JJ, Pérez-Aldana  BE, et al.  Integrative genomic–epigenomic analysis of clozapine-treated patients with refractory psychosis. Pharmaceuticals  2021;14(2):118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Santoro  ML, Ota  V, de  Jong  S, et al.  Polygenic risk score analyses of symptoms and treatment response in an antipsychotic-naive first episode of psychosis cohort. Transl Psychiatry  2018;8(1):174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Marshe  VS, Maciukiewicz  M, Hauschild  AC, et al.  Genome-wide analysis suggests the importance of vascular processes and neuroinflammation in late-life antidepressant response. Transl Psychiatry  2021;11(1):127. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Frank  J, Lang  M, Witt  SH, et al.  Identification of increased genetic risk scores for schizophrenia in treatment-resistant patients. Mol Psychiatry  2015;20(2):150–1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Wigmore  EM, Hafferty  JD, Hall  LS, et al.  Genome-wide association study of antidepressant treatment resistance in a population-based cohort using health service prescription data and meta-analysis with GENDEP. Pharmacogenomics J  2020;20(2):329–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Gasse  C, Wimberley  T, Wang  Y, et al.  Schizophrenia polygenic risk scores, urbanicity and treatment-resistant schizophrenia. Schizophr Res  2019;212:79–85. [DOI] [PubMed] [Google Scholar]
  • 46. Wimberley  T, Gasse  C, Meier  SM, et al.  Polygenic risk score for schizophrenia and treatment-resistant schizophrenia. Schizophr Bull  2017;43(5):1064–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Moran  CJ, Huang  H, Rivas  M, et al.  Genetic variants in cellular transport do not affect mesalamine response in ulcerative colitis. PloS One  2018;13(3):e0192806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Schubert  KO, Thalamuthu  A, Amare  AT, et al.  Combining schizophrenia and depression polygenic risk scores improves the genetic prediction of lithium response in bipolar disorder patients. Transl Psychiatry  2021;11(1):606. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Fanelli  G, Domschke  K, Minelli  A, et al.  A meta-analysis of polygenic risk scores for mood disorders, neuroticism, and schizophrenia in antidepressant response. Eur Neuropsychopharmacol  2022;55:86–95. [DOI] [PubMed] [Google Scholar]
  • 50. Luykx  JJ, Loef  D, Lin  B, et al.  Interrogating associations between polygenic liabilities and electroconvulsive therapy effectiveness. Biol Psychiatry  2022;91(6):531–9. [DOI] [PubMed] [Google Scholar]
  • 51. Fabbri  C, Hagenaars  SP, John  C, et al.  Genetic and clinical characteristics of treatment-resistant depression using primary care records in two UK cohorts. Mol Psychiatry  2021;26(7):3363–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Kim  JO, Schaid  DJ, Vachon  CM, et al.  Impact of personalized genetic breast cancer risk estimation with polygenic risk scores on preventive endocrine therapy intention and uptake. Cancer Prev Res  2021;14(2):175–84. [DOI] [PubMed] [Google Scholar]
  • 53. Wang  M, Hu  K, Fan  L, et al.  Predicting treatment response in schizophrenia with magnetic resonance imaging and polygenic risk score. Front Genet  2022;13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Talarico  F, Costa  GO, Ota  VK, et al.  Systems-level analysis of genetic variants reveals functional and spatiotemporal context in treatment-resistant schizophrenia. Mol Neurobiol  2022;59(5):3170–82. [DOI] [PubMed] [Google Scholar]
  • 55. Cearns  M, Amare  AT, Schubert  KO, et al.  Using polygenic scores and clinical data for bipolar disorder patient stratification and lithium response prediction: machine learning approach. Br J Psychiatry  2022;220(4):219–28. [DOI] [PubMed] [Google Scholar]
  • 56. Jiang  X, Askling  J, Saevarsdottir  S, et al.  A genetic risk score composed of rheumatoid arthritis risk alleles, HLA-DRB1 haplotypes, and response to TNFi therapy–results from a Swedish cohort study. Arthritis Res Ther  2016;18(1):1–0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Khan  Z, Di Nucci  F, Kwan  A, et al.  Polygenic risk for skin autoimmunity impacts immune checkpoint blockade in bladder cancer. Proc Natl Acad Sci  2020;117(22):12288–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Li  J, Yoshikawa  A, Brennan  MD, et al.  Genetic predictors of antipsychotic response to lurasidone identified in a genome wide association study and by schizophrenia risk genes. Schizophr Res  2018;192:194–204. [DOI] [PubMed] [Google Scholar]
  • 59. Tomasik  J, Lago  SG, Vázquez-Bourgon  J, et al.  Association of insulin resistance with schizophrenia polygenic risk score and response to antipsychotic treatment. JAMA Psychiatry  2019;76(8):864–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Emdin  CA, Bhatnagar  P, Wang  M, et al.  Genome-wide polygenic score and cardiovascular outcomes with evacetrapib in patients with high-risk vascular disease: a nested case-control study. Circ Genomic Precision Med  2020;13(1):e002767. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Marderstein  AR, Kulm  S, Peng  C, et al.  A polygenic-score-based approach for identification of gene-drug interactions stratifying breast cancer risk. Am J Hum Genet  2021;108(9):1752–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Khan  Z, Hammer  C, Carroll  J, et al.  Genetic variation associated with thyroid autoimmunity shapes the systemic immune response to PD-1 checkpoint blockade. Nat Commun  2021;12(1):3355. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Bakshi  A, Riaz  M, Orchard  SG, et al.  A polygenic risk score predicts incident prostate cancer risk in older men but does not select for clinically significant disease. Cancer  2021;13(22):5815. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Bakshi  A, Cao  Y, Orchard  SG, et al.  Aspirin and the risk of colorectal cancer according to genetic susceptibility among older individuals. Cancer Prev Res  2022;15(7):447–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Clements  CC, Karlsson  R, Lu  Y, et al.  Genome-wide association study of patients with a severe major depressive episode treated with electroconvulsive therapy. Mol Psychiatry  2021;26(6):2429–39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. van der  Sloot  KW, Voskuil  MD, Blokzijl  T, et al.  Isotype-specific antibody responses to mycobacterium avium paratuberculosis antigens are associated with the use of biologic therapy in inflammatory bowel disease. J Crohn's Colitis  2021;15(8):1253–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Shijo  T, Sakurada  Y, Yoneyama  S, et al.  Association between polygenic risk score and one-year outcomes following as-needed aflibercept therapy for exudative age-related macular degeneration. Pharmaceuticals  2020;13(9):257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Natarajan  P, Young  R, Stitziel  NO, et al.  Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation  2017;135:2091–101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Dauber  A, Meng  Y, Audi  L, et al.  A genome-wide pharmacogenetic study of growth hormone responsiveness. J Clin Endocrinol Metabol  2020;105(10):3203–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Amare  AT, Schubert  KO, Tekola-Ayele  F, et al.  Association of the polygenic scores for personality traits and response to selective serotonin reuptake inhibitors in patients with major depressive disorder. Front Psych  2018;9:65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Tadros  R, Tan  HL, ESCAPE-NET Investigators, et al.  Predicting cardiac electrical response to sodium-channel blockade and Brugada syndrome using polygenic risk scores. Eur Heart J  2019;40(37):3097–107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Tapela  NM, Collister  J, Liu  X, et al.  Are polygenic risk scores for systolic blood pressure and LDL-cholesterol associated with treatment effectiveness, and clinical outcomes among those on treatment?  Eur J Prev Cardiol  2022;29(6):925–37. [DOI] [PubMed] [Google Scholar]
  • 73. Hommers  L, Scherf-Clavel  M, Stempel  R, et al.  Antipsychotics in routine treatment are minor contributors to QT prolongation compared to genetics and age. J Psychopharmacol  2021;35(9):1127–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. D’Erasmo  L, Minicocci  I, Di Costanzo  A, et al.  Clinical implications of monogenic versus polygenic hypercholesterolemia: long-term response to treatment, coronary atherosclerosis burden, and cardiovascular events. J Am Heart Assoc  2021;10(9):e018932. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Luo  J, Martucci  VL, Quandt  Z, et al.  Immunotherapy-mediated thyroid dysfunction: genetic risk and impact on outcomes with PD-1 blockade in non–small cell lung CancerThyroid irAEs: genetic risk and PD-1 blockade response. Clin Cancer Res  2021;27(18):5131–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Hubaceka  JA, Kurcovab  I, Maresovab  V, et al.  SNPs within CHRNA5-A3-B4and CYP2A6/B6, nicotine metabolite concentrations and nicotine dependence treatment success in smokers. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub  2021;165(1):84–9. [DOI] [PubMed] [Google Scholar]
  • 77. Arathimos  R, Ronaldson  A, Howe  LJ, et al.  Vitamin D and the risk of treatment-resistant and atypical depression: a Mendelian randomization study. Transl Psychiatry  2021;11(1):561. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Zwicker  A, Fabbri  C, Rietschel  M, et al.  Genetic disposition to inflammation and response to antidepressants in major depressive disorder. J Psychiatr Res  2018;105:17–22. [DOI] [PubMed] [Google Scholar]
  • 79. Wang  Y, Wactawski-Wende  J, Sucheston-Campbell  LE, et al.  Gene-hormone therapy interaction and fracture risk in postmenopausal women. J Clin Endocrinol Metabol  2017;102(6):1908–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Kowalec  K, Lu  Y, Sariaslan  A, et al.  Increased schizophrenia family history burden and reduced premorbid IQ in treatment-resistant schizophrenia: a Swedish National Register and genomic study. Mol Psychiatry  2021;26(8):4487–95. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Taylor  RW, Coleman  JR, Lawrence  AJ, et al.  Predicting clinical outcome to specialist multimodal inpatient treatment in patients with treatment resistant depression. J Affect Disord  2021;291:188–97. [DOI] [PubMed] [Google Scholar]
  • 82. Manousaki  D, Forgetta  V, Keller-Baruch  J, et al.  A polygenic risk score as a risk factor for medication-associated fractures. J Bone Miner Res  2020;35(10):1935–41. [DOI] [PubMed] [Google Scholar]
  • 83. Jarvis  KB, Nielsen  RL, Gupta  R, et al.  Polygenic risk score-analysis of thromboembolism in patients with acute lymphoblastic leukemia. Thromb Res  2020;196:15–20. [DOI] [PubMed] [Google Scholar]
  • 84. Noyes  JD, Mordi  IR, Doney  AS, et al.  Genetic risk of diverticular disease predicts early stoppage of nicorandil. Clin Pharmacol Therapeutics  2020;108(6):1171–5. [DOI] [PubMed] [Google Scholar]
  • 85. Eusebi  P, Romoli  M, Paoletti  FP, et al.  Risk factors of levodopa-induced dyskinesia in Parkinson’s disease: results from the PPMI cohort. NPJ Parkinson's Dis  2018;4(1):33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. Amare  AT, Schubert  KO, Hou  L, et al.  Association of polygenic score for major depression with response to lithium in patients with bipolar disorder. Mol Psychiatry  2021;26(6):2457–70. [DOI] [PubMed] [Google Scholar]
  • 87. Aittokallio  J, Kauko  A, Vaura  F, et al.  Polygenic risk scores for predicting adverse outcomes after coronary revascularization. Am J Cardiol  2022;167:9–14. [DOI] [PubMed] [Google Scholar]
  • 88. Campbell  C, McCormack  M, Patel  S, et al.  A pharmacogenomic assessment of psychiatric adverse drug reactions to levetiracetam. Epilepsia  2022;63(6):1563–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89. Lu  Z, Zhang  Y, Yan  H, et al.  ATAD3B and SKIL polymorphisms associated with antipsychotic-induced QTc interval change in patients with schizophrenia: a genome-wide association study. Transl Psychiatry  2022;12(1):56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Lago  SG, Tomasik  J, Van Rees  GF, et al.  Exploring cellular markers of metabolic syndrome in peripheral blood mononuclear cells across the neuropsychiatric spectrum. Brain Behav Immun  2021;91:673–82. [DOI] [PubMed] [Google Scholar]
  • 91. Pechlivanis  S, Jung  D, Moebus  S, et al.  Pharmacogenetic association of diabetes-associated genetic risk score with rapid progression of coronary artery calcification following treatment with HMG-CoA-reductase inhibitors—results of the Heinz Nixdorf recall study. Naunyn Schmiedebergs Arch Pharmacol  2021;394(8):1713–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Barceló  C, Guidi  M, Thorball  CW, Hammer  C, Chaouch  A, Scherrer  AU, Hasse  B, Cavassini  M, Furrer  H, Calmy  A, Haubitz  S. Impact of genetic and nongenetic factors on body mass index and waist-hip ratio change in HIV-infected individuals initiating antiretroviral therapy. In Open forum infectious diseases  2020. (Vol. 7, No. 1, p. ofz464). US: Oxford University Press. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93. Maciukiewicz  M, Tiwari  AK, Zai  CC, et al.  Genome-wide association study on antipsychotic-induced weight gain in Europeans and African-Americans. Schizophr Res  2019;212:204–12. [DOI] [PubMed] [Google Scholar]
  • 94. Amare  AT, Schubert  KO, Tekola-Ayele  F, et al.  The association of obesity and coronary artery disease genes with response to SSRIs treatment in major depression. J Neural Transm  2019;126:35–45. [DOI] [PubMed] [Google Scholar]
  • 95. Delacrétaz  A, Santos  PL, Morgui  NS, et al.  Influence of polygenic risk scores on lipid levels and dyslipidemia in a psychiatric population receiving weight gain-inducing psychotropic drugs. Pharmacogenet Genomics  2017;27(12):464–72. [DOI] [PubMed] [Google Scholar]
  • 96. Finch  ER, Smith  CA, Yang  W, et al.  Asparaginase formulation impacts hypertriglyceridemia during therapy for acute lymphoblastic leukemia. Pediatr Blood Cancer  2020;67(1):e28040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Duconge  J, Santiago  E, Hernandez-Suarez  DF, et al.  Pharmacogenomic polygenic risk score for clopidogrel responsiveness among Caribbean Hispanics: a candidate gene approach. Clin Transl Sci  2021;14(6):2254–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98. Koido  M, Kawakami  E, Fukumura  J, et al.  Polygenic architecture informs potential vulnerability to drug-induced liver injury. Nat Med  2020;26(10):1541–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Lacaze  P, Ronaldson  KJ, Zhang  EJ, et al.  Genetic associations with clozapine-induced myocarditis in patients with schizophrenia. Transl Psychiatry  2020;10(1):37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100. Guo  W, Machado-Vieira  R, Mathew  S, et al.  Exploratory genome-wide association analysis of response to ketamine and a polygenic analysis of response to scopolamine in depression. Transl Psychiatry  2018;8(1):280. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Gendep Investigators, Mars Investigators, Star* D Investigators . Common genetic variation and antidepressant efficacy in major depressive disorder: a meta-analysis of three genome-wide pharmacogenetic studies. Am J Psychiatry  2013;170(2):207–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Alemany-Navarro  M, Costas  J, Real  E, et al.  Do polygenic risk and stressful life events predict pharmacological treatment response in obsessive compulsive disorder? A gene–environment interaction approach. Transl Psychiatry  2019;9(1):70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103. Li  JH, Szczerbinski  L, Dawed  AY, et al.  A polygenic score for type 2 diabetes risk is associated with both the acute and sustained response to sulfonylureas. Diabetes  2021;70(1):293–300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104. Sordillo  JE, Lutz  SM, McGeachie  MJ, et al.  Pharmacogenetic polygenic risk score for bronchodilator response in children and adolescents with asthma: proof-of-concept. J Personalized Med  2021;11(4):319. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Biernacka  JM, Coombes  BJ, Batzler  A, et al.  Genetic contributions to alcohol use disorder treatment outcomes: a genome-wide pharmacogenomics study. Neuropsychopharmacology  2021;46(12):2132–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Karmi  N, Bangma  A, Spekhorst  LM, et al.  Polygenetic risk scores do not add predictive power to clinical models for response to anti-TNFα therapy in inflammatory bowel disease. PloS One  2021;16(9):e0256860. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. Suzuki  K, Kakuta  Y, Naito  T, et al.  Genetic background of mesalamine-induced fever and diarrhea in Japanese patients with inflammatory bowel disease. Inflamm Bowel Dis  2022;28(1):21–31. [DOI] [PubMed] [Google Scholar]
  • 108. Pardiñas  AF, Smart  SE, Willcocks  IR, et al.  Interaction testing and polygenic risk scoring to estimate the association of common genetic variants with treatment resistance in schizophrenia. JAMA Psychiatry  2022;79(3):260–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Shen  FC, Weng  SW, Tsai  MH, et al.  Mitochondrial haplogroups have a better correlation to insulin requirement than nuclear genetic variants for type 2 diabetes mellitus in Taiwanese individuals. J Diabetes Investigation  2022;13(1):201–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110. Mariam  A, Miller-Atkins  G, Pantalone  KM, et al.  A type 2 diabetes subtype responsive to ACCORD intensive glycemia treatment. Diabetes Care  2021;44(6):1410–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111. Blackman  RK, Dickinson  D, Eisenberg  DP, et al.  Antipsychotic medication-mediated cognitive change in schizophrenia and polygenic score for cognitive ability. Schizophrenia Res Cognition  2022;27:100223. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112. Ruan  Y, Lin  YF, Feng  YC, et al.  Improving polygenic prediction in ancestrally diverse populations. Nat Genet  2022;54:573–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113. Zhao  Z, Fritsche  LG, Smith  JA, et al.  The construction of cross-population polygenic risk scores using transfer learning. Am J Human Genetics  2022;109:1998–2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114. Mak  TS, Porsch  RM, Choi  SW, et al.  Polygenic scores via penalized regression on summary statistics. Genet Epidemiol  2017;41(6):469–80. [DOI] [PubMed] [Google Scholar]
  • 115. Zhang  Q, Privé  F, Vilhjálmsson  B, Speed  D. Improved genetic prediction of complex traits from individual-level data or summary statistics. Nat Commun  2021;12(1):4192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116. Tian  P, Chan  TH, Wang  YF, et al.  Multiethnic polygenic risk prediction in diverse populations through transfer learning. Front Genet  2022;13:906965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117. Márquez-Luna  C, Loh  PR. South Asian type 2 diabetes (SAT2D) consortium, SIGMA type 2 diabetes consortium, price AL. Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genet Epidemiol  2017;41(8):811–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118. Chung  W, Chen  J, Turman  C, et al.  Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes. Nat Commun  2019;10(1):569. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119. Robinson  MR, Kleinman  A, Graff  M, et al.  Genetic evidence of assortative mating in humans. Nat Hum Behav  2017;1(1):0016. [Google Scholar]
  • 120. Coram  MA, Fang  H, Candille  SI, et al.  Leveraging multi-ethnic evidence for risk assessment of quantitative traits in minority populations. Am J Human Genetics  2017;101(2):218–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121. Maier  RM, Zhu  Z, Lee  SH, et al.  Improving genetic prediction by leveraging genetic correlations among human diseases and traits. Nat Commun  2018;9(1):989. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122. Vilhjálmsson  BJ, Yang  J, Finucane  HK, et al.  Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am J Human Genetics  2015;97(4):576–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123. Privé  F, Arbel  J, Vilhjálmsson  BJ. LDpred2: better, faster, stronger. Bioinformatics  2020;36(22–23):5424–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124. Ge  T, Chen  CY, Ni  Y, et al.  Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun  2019;10(1):1776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125. Yang  S, Zhou  X. Accurate and scalable construction of polygenic scores in large biobank data sets. Am J Human Genetics  2020;106(5):679–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126. Lloyd-Jones  LR, Zeng  J, Sidorenko  J, et al.  Improved polygenic prediction by Bayesian multiple regression on summary statistics. Nat Commun  2019;10(1):5086. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127. Cannon  CP, Blazing  MA, Giugliano  RP, et al.  Ezetimibe added to statin therapy after acute coronary syndromes. New England J Med  2015;372:2387–97. [DOI] [PubMed] [Google Scholar]
  • 128. Huang  A, Wand  MP. Simple marginally noninformative prior distributions for covariance matrices. Bayesian Anal  2013;8:439–52. [Google Scholar]
  • 129. Turley  P, Walters  RK, Maghzian  O, et al.  Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet  2018;50:229–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130. Luo  L, Shen  J, Zhang  H, et al.  Multi-trait analysis of rare-variant association summary statistics using MTAR. Nat Commun  2020;11:1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 131. Fontaine-Bisson  B, Renström  F, Rolandsson  O, et al.  Evaluating the discriminative power of multi-trait genetic risk scores for type 2 diabetes in a northern Swedish population. Diabetologia  2010;53:2155–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132. Allegrini  AG, Selzam  S, Rimfeld  K, et al.  Genomic prediction of cognitive traits in childhood and adolescence. Mol Psychiatry  2019;24:819–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133. Van der Merwe  C, Passchier  R, Mufford  M, et al.  Polygenic risk for schizophrenia and associated brain structural changes: a systematic review. Compr Psychiatry  2019;88:77–82. [DOI] [PubMed] [Google Scholar]
  • 134. Curtis  MJ, Abernethy  DR. Replication-why we need to publish our findings. Pharmacol Res Perspect  2015;3:e00164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135. Joober  R, Schmitz  N, Annable  L, et al.  Publication bias: what are the challenges and can they be overcome?  J Psychiatry Neurosci  2012;37:149–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136. Berisa  T, Pickrell  JK. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics  2016;32:283. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

PGxPRS_Review_ms_Supp_Final_bbad470
Table_S1_S2_bbad470
table_s1_s2_bbad470.xlsx (57.8KB, xlsx)

Articles from Briefings in Bioinformatics are provided here courtesy of Oxford University Press

RESOURCES