GRPa‐PRS: A risk stratification method to identify genetically‐regulated pathways in polygenic diseases

Xiaoyang Li; Brisa S Fernandes; Andi Liu; Jingchun Chen; Xiangning Chen; Zhongming Zhao; Yulin Dai

doi:10.1002/alz.70779

2025 Oct 14;21(10):e70779. doi: 10.1002/alz.70779

PMCID: PMC12519521 PMID: 41085130

Abstract

INTRODUCTION

Polygenic risk score (PRS) assesses genetic risk for diseases, yet some high‐risk individuals avoid illness while low‐risk individuals develop it. We hypothesize that unknown counterfactors may reverse PRS predictions, offering insights into disease mechanisms and interventions.

METHODS

We developed a novel framework to identify genetically‐regulated pathways (GRPas) using PRS‐based stratification in Alzheimer's disease (AD) and schizophrenia (SCZ) cohorts. We calculated PRS models, stratified individuals by risk and diagnosis, and analyzed differential GRPas. For AD, analyses were further conducted with and without apolipoprotein E (APOE) effects, and across APOE haplotypes.

RESULTS

In AD, we identified several well‐known AD‐related pathways, including amyloid‐beta clearance and tau protein binding, as well as resilience‐related pathways such as calcium signaling and divalent inorganic cation homeostasis.

DISCUSSION

Our method offers flexibility for exploring GRPas among PRS‐stratified subgroups using summary statistics or individual‐level data. Fewer GRPas identified in the no‐APOE AD model and SCZ suggest a more polygenic architecture, necessitating larger samples to detect significant GRPas.

Highlights

Characterize genetically‐regulated expression (GReX) among groups stratified by polygenic risk score (PRS)
Leverage GReX and PRS to explore the resilience and susceptibility at the pathway level
Highlight calcium signaling and cation homeostasis functions linked to resilience
Enable personalized prevention by reinforcing the different resilience factors present or absent in each individual
Our genetically‐regulated pathway (GRPa) ‐PRS framework can be further expanded to other complex polygenic traits

Keywords: Alzheimer's disease, genetically‐regulated expression, genetically‐regulated pathway, polygenic risk score, precision psychiatry, resilience, schizophrenia

1. INTRODUCTION

Polygenic risk scores (PRS) estimate an individual's genetic predisposition to complex diseases by aggregating the effects of numerous common variants. ¹ , ² , ³ Early applications of genome‐wide association studies (GWAS), such as in schizophrenia (SCZ), ⁴ demonstrated that many variants jointly influence risk, supporting the polygenic nature of complex traits. ⁵ , ⁶ Since then, risk prediction has advanced with increasingly refined computational methods, ⁷ , ⁸ , ⁹ , ¹⁰ , ¹¹ improving phenotype prediction accuracy. ⁸ , ¹² In diseases like Alzheimer's disease (AD) and SCZ, genetic and environmental factors interact. AD‐PRS models incorporating covariates such as age and sex can achieve an area under the curve (AUC) of 0.75–0.84. ¹² , ¹³ Apolipoprotein E4 (APOE4) is the strongest genetic risk factor for AD, with carriers of one or two alleles showing odds ratios of 4.6 (95% confidence interval [CI]: 4.1–5.2) and 25.3 (95% CI: 20.4–31.2), respectively, compared to non‐carriers. ¹⁴ However, APOE alone does not fully explain late‐onset AD (LOAD), motivating further research into additional genetic contributors. Notably, high educational attainment is associated with reduced AD risk and slower cognitive decline, even in individuals with a high PRS load, underscoring the protective role of cognitive stimulation. ¹⁵ SCZ, in contrast, is a severe psychiatric disorder marked by disturbances in cognition, perception, emotion, and social behavior.

Compared to AD, SCZ has a more polygenic genetic structure with no prominent genetic risk factor akin to APOE in AD. ¹⁶ , ¹⁷ SCZ PRS models with age and sex as covariates can reach an AUC of 0.71–0.74. ¹⁸ , ¹⁹ Environmental factors such as early‐life stress, cannabis use, and urban upbringing also contribute to SCZ risk. Recent work by Hess et al. ²⁰ and Hou et al. ²¹ on high‐risk cases versus high‐risk controls for SCZ and AD identified genetic resilience factors that are orthogonal to PRS, acting as counterfactors rather than inverse risk markers. ²⁰ These findings suggest that resilience to disease in high‐PRS individuals, and susceptibility (extra‐burden) in low‐PRS individuals, may be driven by modifiers not captured by PRS. Identifying these orthogonal genetic factors and their biological pathways could yield novel insights into disease mechanisms and inform early interventions, including lifestyle or pharmacological strategies. ² , ²²

Common genetic variants typically exert small effects, but their aggregation at the gene or pathway level can enhance detection of their collective impact. Such methods ²³ fall into three categories: over‐representation analysis (ORA), functional class scoring (FCS), and single‐sample (SS) approaches. ORA uses contingency table‐based tests to assess whether genes mapped from variants surpassing a GWAS threshold are enriched in specific gene sets. However, traditional ORA tools ²⁴ have notable limitations: they require a predefined gene list, overlook gene correlations, and ignore gene ranking. FCS assesses gene‐set significance based on gene‐level associations from case‐control comparisons, while SS methods estimate gene‐set scores per individual before comparing groups. Tools like MAGMA ²⁵ (FCS), PoPS ²⁶ (FCS), and PRSet ²⁷ (SS) account for inter‐gene or inter‐variant correlations. PRSet uniquely computes polygenic scores per gene set in a single‐sample manner. ²³ Given the context‐specific nature of complex trait variants, transcriptome‐wide association studies (TWAS) offer improved power by incorporating genetically‐regulated expression (GReX), defined as the product of variant dosage and eQTL effect sizes. ²⁸ , ²⁹ , ³⁰ , ³¹ , ³² TWAS‐GSEA, ³³ developed for TWAS/FUSION, ³⁴ models inter‐gene correlations as random effects using a gene‐gene covariance matrix from TWAS. However, as an FCS method requiring GWAS summary statistics and the FUSION format, its compatibility with individual‐level TWAS tools like PrediXcan ³⁵ or TIGAR ³⁶ is limited. ³⁷

Here, we developed a novel computational framework, genetically‐regulated pathway (GRPa) ‐PRS, to systematically explore differential GReX and GRPa among individuals based on their PRS risk strata in AD and SCZ. Specifically, we aimed to: (1) develop new computational methods to identify differential GRPa from biologically meaningful strata comparison; (2) disentangle the GRPa without the APOE factor in AD; (3) disentangle the GRPa by APOE haplotype; (4) identify the GRPa that is orthogonal to their AD or SCZ risks, which potentially unveils resilience and extra‐burden factors related to AD or SCZ; (5) benchmark our framework against another individual variant‐based method, PRSet, for functional enrichment of genetic risks.

2. MATERIALS AND METHODS

2.1. GWAS datasets

We utilized four large‐scale GWAS summary statistics datasets (three for AD and one for SCZ) to calculate polygenic risk scores and evaluate genetically regulated pathways. The first dataset, Kunkle et al.’s (K) meta‐analysis of AD GWAS, focuses on the LOAD phenotype (age of onset > 65 years) and includes 11,480,632 variants from 21,982 AD cases and 41,944 cognitively normal (CN) controls. ³⁸ The second, Schwartzentruber et al.’s (S) meta‐analysis of GWAS for AD and AD‐by‐proxy phenotypes (genome‐wide association by proxy [GWAX]) in the UK BioBank (UKBB), comprises 898 AD cases, 52,791 AD‐by‐proxy cases, and 355,900 controls, with a total of 10,687,077 SNPs. ³⁹ The third is Wightman et al.’s (W) ⁴⁰ meta‐analysis of AD GWAS. We adopted the summary statistics of their GWAS meta‐analysis without the proxy cases from the UKBB and 23andMe, which resulted in 39,918 cases and 358,140 controls, with a total of 12,674,019 SNPs. ⁴⁰ This design aims to test the robustness and consistency of our parallel subgroup results across GWAS datasets and the potential impact of GWAX. ⁴¹ SCZ GWAS summary statistics were adopted from Trubetskoy et al.’s meta‐analysis, which includes 7,659,767 variants derived from 52,017 SCZ cases and 75,889 controls. ¹⁷ To address potential inflation in polygenic score estimates resulting from sample overlap, the GWAS summary statistics were further adjusted using the EraSOR software. ⁴²

RESEARCH IN CONTEXT

Systematic review: Many individuals with high genetic risk avoid disease, while others with low risk develop it. We suspect unknown counterfactors could alter polygenic risk score (PRS) predictions, offering insights into disease origins and interventions. Aggregating minor‐effect variants at the pathway level clarifies their collective impact. Unlike previous methods, which were limited to summary statistics or lacked functional enrichment optimization, our new computational framework, genetically‐regulated pathways (GRPa) ‐PRS, enables comparison of genetically regulated expression across biologically meaningful PRS strata, offering flexibility for personalized research with both summary statistics and individual‐level data. We also explored the effect size of genetically regulated pathways among complex diseases with diverse polygenicity.
Interpretation: Our study highlighted calcium signaling and cation homeostasis not merely as reactive to AD pathogenesis but also as potential counterfactors to it, with a possible role in preventive and early intervention strategies.
Future directions: Our GRPa‐PRS framework could facilitate personalized prevention by tailoring resilience and burden factors to individual PRS strata, rather than a one‐size‐fits‐all approach. This targeted identification of cluster pathways may guide more precise disease prevention strategies.

2.2. Genotyping data composition in discovery and replication cohorts

We collected AD discovery (disc) and replication (rep) cohorts from dbGaP (https://www.ncbi.nlm.nih.gov/gap/) and Synapse (https://www.synapse.org/), respectively. The disc cohort comprises the National Institute on Aging/Late Onset Alzheimer's Disease Study (NIA/LOAD) cohort consents 1 and 2 (ADc12) [phs000168] ⁴³ and the Multi‐Site Collaborative Study for Genotype‐Phenotype Associations in Alzheimer's Disease (GenADA) [phs000192]. ⁴⁴ Raw single‐nucleotide polymorphism (SNP) arrays were downloaded from dbGaP accordingly (accessed on 6/15/2021). The rep cohort includes the Whole Genome Sequencing (WGS) (https://www.synapse.org/#!Synapse:syn5550382) data from the Religious Orders Study and Memory and Aging Project (ROS/MAP) Study, ⁴⁵ the MayoRNAseq (Mayo) study, ⁴⁶ the Mount Sinai Brain Bank (MSBB) study, ⁴⁷ the raw SNP array from the Mount Sinai School of Medicine (MSSM) study (https://www.synapse.org/#!Synapse:syn20808201), and imputed SNP array (https://www.synapse.org/#!Synapse:syn3157325) data from ROS/MAP. The raw SNP array from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database ⁴⁸ was downloaded from https://adni.loni.usc.edu (accessed on 6/15/2021).

We collected SCZ disc and rep cohorts from the Swedish Case‐Control Study of Schizophrenia ⁴⁹ and Molecular Genetics of Schizophrenia [phs000167], ⁵⁰ respectively.

2.3. Genotyping data imputation

We imputed genotyping data for AD and SCZ cohorts using established pipelines. ⁵¹ AD datasets were processed with the Michigan Imputation Server or BEAGLE using 1000 Genomes reference panels, retaining SNPs with imputation quality r² ≥ 0.6. For SCZ, we followed cohort‐specific imputation protocols as previously described. ⁵² Full details are provided in Supplementary Methods.

2.4. Diagnosis criteria

AD case/control definitions were based on clinical diagnoses, including National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS–ADRDA) criteria, ⁵³ consensus diagnosis, ⁵⁴ or Clinical Dementia Ratings, ⁴⁷ depending on cohort. SCZ cases were diagnosed using Diagnostic and Statistical Manual of Mental Disorders criteria (DSM‐IV or DSM‐III‐R). A complete description of diagnosis criteria is available in Supplementary Methods.

2.5. Genotyping QC and cohort summary

We retained only individuals of European ancestry (Figure S1), ⁵⁵ removed related individuals using KING2.2, ⁵⁶ and performed SNP‐ and sample‐level filtering. Variant annotation was done using ANNOVAR, ⁵⁷ and the apolipoprotein E (APOE) region (chr19: 44.4–46.5 Mb on GRCh37) was excluded in Model 2. The final AD discovery and replication cohorts included 2,722 and 2,854 individuals, respectively (Table 1); the SCZ cohorts included 6,628 (SWE) and 5,334 (MGS) individuals (Table S1). Additional details are available in Supplementary Methods.

TABLE 1.

Demographic summary of AD discovery dataset and replication dataset.

	Discovery			Replication
Parameter	Cases (N = 1337)	Controls (N = 1384)	Total (N = 2721)	Cases (N = 1606)	Controls (N = 1248)	Total (N = 2854)	Overall (N = 5575)
Age
Mean (SD)	81.0 (7.72)	75.0 (7.59)	77.9 (8.23)	81.4 (8.32)	79.9 (8.75)	80.8 (8.54)	79.4 (8.51)
Median [min, max]	81.4 [57.3, 103]	75.0 [57.0, 98.0]	78.0 [57.0, 103]	83.0 [55.0, 95.0]	81.2 [56.0, 96.0]	82.3 [55.0, 96.0]	80.0 [55.0, 103]
Sex
Male	531 (39.7%)	550 (39.7%)	1081 (39.7%)	586 (36.5%)	578 (46.3%)	1164 (40.8%)	2245 (40.3%)
Female	806 (60.3%)	834 (60.3%)	1640 (60.3%)	1020 (63.5%)	670 (53.7%)	1690 (59.2%)	3330 (59.7%)
APOE SNP genotype
APOE2/APOE2	1 (0.1%)	6 (0.4%)	7 (0.3%)	3 (0.2%)	7 (0.6%)	10 (0.4%)	17 (0.3%)
APOE2/APOE3	42 (3.1%)	170 (12.3%)	212 (7.8%)	81 (5.0%)	164 (13.1%)	245 (8.6%)	457 (8.2%)
APOE2/APOE4	40 (3.0%)	28 (2.0%)	68 (2.5%)	36 (2.2%)	17 (1.4%)	53 (1.9%)	121 (2.2%)
APOE3/APOE3	412 (30.8%)	895 (64.7%)	1307 (48.0%)	723 (45.0%)	808 (64.7%)	1531 (53.6%)	2838 (50.9%)
APOE3/APOE4	658 (49.2%)	265 (19.1%)	923 (33.9%)	636 (39.6%)	237 (19.0%)	873 (30.6%)	1796 (32.2%)
APOE4/APOE4	184 (13.8%)	20 (1.4%)	204 (7.5%)	127 (7.9%)	15 (1.2%)	142 (5.0%)	346 (6.2%)

Abbreviations: AD, Alzheimer's disease; APOE, apolipoprotein E; SD, standard deviation; SNP, single‐nucleotide polymorphism.

2.6. Framework of GRPa‐PRS

We designed a computational framework, GRPa‐PRS (Figure 1), that assesses individual‐level gene set risk from GReX and compares GRPas among biologically meaningful PRS risk strata (Figure 2A). We designed two statistical approaches to identify differential GRPas and benchmarked them against a variant‐based pathway PRS method, PRSet. ²⁷

FIGURE 1.

GRPa‐PRS workflow and study design. The blue tabulates in the top row indicate the input data: preprocessed individual genotyping data, AD and SCZ GWAS summary statistics, and curated gene sets. The green tabulates and background include the key steps of our GRPa‐PRS framework: 1. PRS strata strategy; 2. GReX imputation; 3.a GRPa‐MAGMA approach and 3.b GRPa‐GSVA; 4.a Differential GRPas summary and 4.b Orthogonality test. The dark blue tabulate includes PRSet, the method we used to benchmark the performance of differential GRPa. The dashed line indicates the comparison between the three approaches: PRSet, GRPa‐MAGMA, and GRPa‐GSVA. The orange tabulates are designed to explore the differential GRPa from the three approaches and their orthogonality. The two strata highlighted in Step 1, the resilience stratum (high‐risk controls) and the extra‐burden stratum (low‐risk cases), are defined in Table 2.

FIGURE 2.

Illustration of six strata comparisons and genetic factor distribution in GRPas. (A) Individuals are stratified by PRS to evaluate the underlying differential GRPas, as shown in Table 2: (1) case–control, (2) TB20all, (3) TB20AD, (4) TB20Ctr, (5) T20, (6) B20. (B) Illustration of the distribution of genetic factors among individuals across the risk strata shown in Table 2. B‐Ctr indicates bottom percentile controls carrying no effective risk GRPa. B‐AD indicates bottom percentile AD cases carrying an effective risk GRPa. T‐Ctr* indicates extreme percentile controls carrying risk factors that are sporadically distributed and form no effective risk GRPa; T‐Ctr** indicates extreme percentile controls carrying both effective risk and resilience‐related GRPas. T‐AD* indicates extreme percentile cases carrying risk factors gathered in an effective risk GRPa. T‐AD** indicates extreme percentile cases carrying risk factors gathered in another effective risk GRPa.

GRPa‐PRS is designed for polygenic diseases in general. In this study, we employed AD and SCZ cohorts to assess the generalizability of the GRPa‐PRS framework, with a focus on AD because it allows two models: one including and one excluding the effect of APOE. To avoid redundancy, AD is used to exemplify our PRS stratification method, as similar PRS stratification was applied to SCZ. First, we adopted the clinical diagnosis as the case and control labels. Next, we estimated the AD PRS for each individual using the effect sizes from three previous large‐scale AD GWAS summary statistics. ³⁸ , ³⁹ , ⁴⁰ The PRS model was optimized using LDpred2, applied separately to each of the three GWAS summary statistics to ensure a less biased estimation.

2.7. Novel risk strata design

Our risk strata rest on several assumptions: (1) the diagnosis of each individual mostly reflects their ultimate outcome status; (2) the individual disease genetic burden is largely reflected by the PRS; (3) the individuals in each strata comparison might carry different disease genetic risks (i.e., different genes), but their overall genetic burden is similar (Figure 2B); (4) stratifying individuals into extreme percentiles (top and bottom 20% in our analyses) maximizes PRS disparity while retaining enough individuals in each group for statistical power; (5) a considerable proportion of common‐variant genetic variation can be captured by the variation in the GReX of genes and their pathways. In other words, the genetic risk difference between strata should ultimately be reflected in GReX differences and functional pathway differences.

With these assumptions, we theorized that strata defined by extremes of PRS load would point to distinct resilience and extra‐burden genetic factors. Some individuals defy genetic odds: they remain disease‐free despite a high disease genetic load, or develop disease despite a low genetic load. Others conform to genetic odds: they remain disease‐free in concordance with a low genetic load, or develop disease in concordance with a high genetic load. By systematically contrasting these strata, our design enabled us to identify genetic factors associated with resilience and extra‐burden in our disease exemplars, AD and SCZ. Identifying these genetic factors is the first step toward a framework for enhancing resilience and inhibiting susceptibility, enabling broad preventative efforts.

The six PRS strata used in our comparisons are summarized in Table 2 and illustrated in Figure 2A. Intuitively, in the case‐control (all cases vs. all controls) and TB20all (top 20% vs. bottom 20% of all participants) comparisons, we examined GRPa differences between AD diagnosis groups or between extreme PRS percentiles (such as 10%, 15%, or 20%) of all participants. In the TB20AD (top 20% cases vs. bottom 20% cases) and TB20Ctr (top 20% controls vs. bottom 20% controls) comparisons, we assessed high‐risk vs. low‐risk burden within AD cases and within controls, reflecting the PRS burden within the same diagnosis label. Finally, in the T20 comparison (AD cases vs. controls within the top 20% PRS) and the B20 comparison (AD cases vs. controls within the bottom 20% PRS), we explored GRPa differences in individuals with similar PRS burden but different diagnoses, highlighting the counterfactors in resilience‐ and extra‐burden‐related GRPas.

TABLE 2.

Risk strata and their comparison group interpretation in the discovery and replication data sets.

Strata comparison	Comparison group 1 in cohort (disc/rep)	Comparison group 2 in cohort (disc/rep)	Interpretation for GRPa difference in strata comparison
Case–control Figure 2A (1)	All clinically diagnosed cases	All clinically diagnosed controls	General GRPa difference between cases vs controls
TB20all Figure 2A (2)	Individuals with the top 20% PRS among total participants ^*	Individuals with the bottom 20% PRS among total participants ^*	PRS burden difference explained by GRPa in extreme‐high percentile vs extreme‐low percentile individuals
TB20AD Figure 2A (3)	Individuals with the top 20% PRS among total cases ^**	Individuals with the bottom 20% PRS among total cases ^**	PRS burden explained by GRPa difference among high‐risk cases vs low‐risk cases
TB20Ctr Figure 2A (4)	Individuals with the top 20% PRS among total controls ^***	Individuals with the bottom 20% PRS among total controls ^***	PRS burden explained by GRPa difference among high‐risk controls vs low‐risk controls
T20 Figure 2A (5)	Cases within the top 20% PRS among all participants starting from the top 2.5 percentiles of outlier controls. [higher than group 2 activity indicates extra burden]	Controls within the top 20% PRS among all participants starting from the top 2.5 percentiles of outlier controls. [higher than group 1 activity indicates resilience]	Resilience or extra burden factor explained by GRPa difference among similar high PRS cases vs controls
B20 Figure 2A (6)	Cases within the bottom 20% PRS among all participants excluding the bottom 2.5 percentiles of outlier cases. [higher than group 2 activity indicates extra burden]	Controls within the bottom 20% PRS among all participants excluding the bottom 2.5 percentiles of outlier cases. [higher than group 1 activity indicates resilience]	Resilience or extra burden factor explained by GRPa difference among similar low PRS cases vs controls

Abbreviations: AD, Alzheimer's disease; B, bottom; case–control, clinical diagnosis; disc, discovery; GRPa, genetically‐regulated pathway; PRS, polygenic risk score; rep, replication; T, top; TB, top vs bottom.

^* Excluding participants whose PRS lies outside the 2.5 to 97.5 percentile range of all participants.

^** Excluding cases whose PRS lies outside the 2.5 to 97.5 percentile range of cases.

^*** Excluding controls whose PRS lies outside the 2.5 to 97.5 percentile range of controls.
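The strata in Table 2 can be sketched as boolean masks over a PRS vector and a diagnosis vector. The snippet below is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the outlier handling for T20 and B20 is simplified by reusing the all‐participant 2.5 to 97.5 percentile trimming, which may differ from the rule stated in Table 2.

```python
import numpy as np

def top_bottom_masks(prs, subset, pct=20.0, trim=2.5):
    """Within `subset`, drop PRS outliers outside the [trim, 100 - trim]
    percentile range, then flag the top and bottom `pct` percent."""
    prs = np.asarray(prs, dtype=float)
    subset = np.asarray(subset, dtype=bool)
    lo, hi = np.percentile(prs[subset], [trim, 100.0 - trim])
    kept = subset & (prs >= lo) & (prs <= hi)
    bottom_cut, top_cut = np.percentile(prs[kept], [pct, 100.0 - pct])
    return kept & (prs >= top_cut), kept & (prs <= bottom_cut)

def build_strata(prs, is_case, pct=20.0, trim=2.5):
    """Return {comparison name: (group 1 mask, group 2 mask)}."""
    is_case = np.asarray(is_case, dtype=bool)
    everyone = np.ones_like(is_case, dtype=bool)
    top_all, bot_all = top_bottom_masks(prs, everyone, pct, trim)
    top_ad, bot_ad = top_bottom_masks(prs, is_case, pct, trim)
    top_ct, bot_ct = top_bottom_masks(prs, ~is_case, pct, trim)
    return {
        "case_control": (is_case, ~is_case),
        "TB20all": (top_all, bot_all),
        "TB20AD": (top_ad, bot_ad),
        "TB20Ctr": (top_ct, bot_ct),
        # Simplification (assumption): same trimming as TB20all.
        "T20": (top_all & is_case, top_all & ~is_case),
        "B20": (bot_all & is_case, bot_all & ~is_case),
    }
```

Changing `pct` to 10 or 15 gives the alternative percentile definitions used in sensitivity analyses.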

2.8. PRS model and risk strata percentile selection

To eliminate inflation caused by potential sample overlap between the target cohort and GWAS summary statistics in polygenic score analyses, we used the software EraSOR ⁴² to adjust the effect sizes from the GWAS summary statistics. We used the variants shared between the HapMap3 Project ⁵⁸ and the corrected GWAS summary statistics to match the variants in the genotyping data of both cohorts. LDpred2 ¹¹ (the LDpred2‐auto model) was applied to calculate the PRS for each individual using the matched variants. Specifically, we estimated the heritability (h²) by LD score regression using the `bigsnpr` R package, ⁵⁹ and supplied h² and the proportion of causal variants p (a sequence of 30 logarithmically spaced values between 1×10⁻⁴ and 0.2) to the auto model to calculate the PRS. Although p‐value thresholding is not required in LDpred2, we explored how applying different variant‐level p‐value thresholds (1, 0.5, 0.2, 0.1, and 0.05) prior to LDpred2 affected the signal‐to‐noise ratio and classification performance. For example, when the variant p‐value threshold was set at 0.5, only variants with p‐values less than 0.5 in the summary statistics were included in the PRS calculation. Prediction performance was compared using the AUC, with the clinical diagnosis label (diagnosis criteria section) as the ground truth. The PRS reaching the highest AUC was used to stratify individuals into strata for comparison (Table 2). Except for the clinical diagnosis group (Diag) comparison between cases and controls, we repeated the above procedure for the three GWAS summary statistics (Kunkle et al. [K], ³⁸ Schwartzentruber et al. [S], ³⁹ and Wightman et al. [W] ⁴⁰) mentioned in the GWAS datasets section and generated three sets with different labels. For SCZ, we used the GWAS by Trubetskoy et al. ¹⁷ to calculate the PRS with an optimized p‐value threshold of 0.05. We conducted sensitivity analyses using percentiles set at 10%, 15%, and 20% to define strata in both the disc and rep cohorts of AD and SCZ for each comparison.
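The threshold selection step amounts to computing an AUC for each candidate PRS and keeping the best one. The sketch below assumes the PRS vectors for each variant‐level p‐value threshold have already been computed elsewhere (for example, by LDpred2); the function names and the dictionary layout are hypothetical, and the AUC is computed by direct pairwise comparison rather than by any particular package. The last line reproduces the grid of 30 logarithmically spaced causal‐variant proportions described in the text.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the probability that a random case scores higher than a
    random control, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def select_best_threshold(prs_by_threshold, labels):
    """Pick the p-value threshold whose PRS gives the highest AUC.

    prs_by_threshold: dict mapping threshold -> PRS vector (hypothetical
    layout; the PRS values themselves come from an external tool).
    """
    aucs = {t: auc(s, labels) for t, s in prs_by_threshold.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs

# 30 log-spaced values between 1e-4 and 0.2 for the causal proportion p.
causal_p_grid = np.logspace(np.log10(1e-4), np.log10(0.2), 30)
```

The pairwise comparison uses memory proportional to cases times controls, which is acceptable for cohorts of a few thousand individuals; a rank‐based formula would scale better.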

2.9. Gene set curation

We obtained gene sets from four different resources, covering AD‐related pathways and terms and brain cell‐type‐specific pathways (together, AD brain function), non‐redundant GO terms, and canonical pathways. We restricted gene sets to those containing between five and 500 genes. Specifically, the AD brain function curation contains 167 gene sets from two major resources: 70 AD‐related function gene sets and 97 brain cell‐type‐specific function gene sets (https://ctg.cncr.nl/software/genesets, accessed on 6/9/2020). The gene ontology (GO) curation (Biological Process [BP], Molecular Function [MF], and Cellular Component [CC]) is composed of 1,303 non‐redundant GO terms from WebGestalt ²⁴ (accessed on 5/7/2021). Lastly, we curated 2,082 canonical pathways from three major resources, the Kyoto Encyclopedia of Genes and Genomes (KEGG), REACTOME, and BioCarta, from the Molecular Signatures Database (MSigDB 7.4 C2 category, accessed on 5/10/2021). Overall, we curated 4,202 unique functional gene sets.

2.10. Semantic similarity analysis of GO terms

To better understand the similarities among the significant GO terms identified in different strata, we utilized the R package ‘rrvgo’, which leverages the hierarchical structure of GO terms for BP, MF, and CC separately. ⁶⁰

2.11. Inferring the GReX for individuals and association study with trait

We used PrediXcan ³⁵ to calculate the GReX via the Multivariate Adaptive Shrinkage in R (MASHR) model ⁶¹ for all individuals using their genotyping data after QC (Table 1). The imputed gene expression was calculated for 13 different brain regions from the Genotype‐Tissue Expression (GTEx) project. ⁶² These served as the input for two differential GRPa approaches: (1) GRPa‐MAGMA (FCS method): TWAS + MAGMA gene set enrichment and (2) GRPa‐GSVA (SS‐based method): gene set variation analysis (GSVA) + logistic likelihood ratio test (LogisticLRT).
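Since GReX is defined in the Introduction as the product of variant dosage and eQTL effect sizes, the imputation step for one tissue reduces to a matrix product. The following sketch uses made‐up dosages and weights purely for illustration; in practice the weights come from trained prediction models such as the MASHR models, and the function name is hypothetical.

```python
import numpy as np

def impute_grex(dosage, weights):
    """Predicted expression for one tissue.

    dosage:  individuals x variants matrix of allele dosages (0 to 2)
    weights: variants x genes matrix of eQTL effect sizes
    returns: individuals x genes matrix of GReX values
    """
    dosage = np.asarray(dosage, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosage.shape[1] != weights.shape[0]:
        raise ValueError("dosage columns must match weight rows (variants)")
    return dosage @ weights

# Toy example: 2 individuals, 3 variants, 2 genes (illustrative values).
dosage = np.array([[0, 1, 2],
                   [2, 0, 1]])
weights = np.array([[0.5, 0.0],
                    [-0.2, 0.3],
                    [0.0, 0.1]])
grex = impute_grex(dosage, weights)
```

Repeating this with tissue‐specific weights would give one GReX matrix per brain region.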

2.12. GRPa‐MAGMA

MultiXcan ⁶³ was first used to integrate TWAS across brain regions and identify the associations between genes and the trait. Specifically, MultiXcan performs logistic regression for each gene individually and uses the F‐test to assess the significance of the joint fit by comparing the null model (formula [1]) with Model 1 (formula [2]). The null model includes only demographic covariates (age, sex, and the top five genotype principal components), while Model 1 includes both the demographic covariates and GReX predictors derived from individual genotypes. For each gene, GReX across tissues was summarized by principal component analysis (PCA), and the resulting components were treated as predictors in the model. The trait outcome for each stratum was defined as a binary label representing the comparison group assignment (e.g., high vs. low PRS, case vs. control) based on polygenic risk scores and clinical diagnosis, as detailed in Table 2.

$$\text{NULL Model}: \quad \text{Outcome} \sim \text{covariates} \tag{1}$$

$$\text{Model 1}: \quad \text{Outcome} \sim \text{GReX (full)} + \text{covariates} \tag{2}$$

Additionally, due to the dominant effect of APOE in AD, we designed Model 2 (the no‐APOE model; formula [3]), which includes the demographic covariates and GReX predictors derived from genotype data excluding the APOE region. The same outcome and covariates were applied in Model 2 as in Model 1. By excluding the genotype and GReX contributions of the APOE region, Model 2 represents a more polygenic genetic architecture.

$$\text{Model 2}: \quad \text{Outcome} \sim \text{GReX (no } \mathit{APOE} \text{ region gene)} + \text{covariates} \tag{3}$$

We first used the genotype data of each cohort to construct the gene‐gene correlation matrix associated with AD using MAGMA. ²⁵ Based on the MultiXcan results, we converted the gene p‐values to z‐scores using the inverse cumulative distribution function (CDF) of the standard normal distribution for a one‐tailed test, $z = \Phi^{-1}(1 - p)$, and used them to replace the original z‐scores in the MAGMA gene‐gene correlation matrix. Then, we adapted the gene set enrichment function in MAGMA with the “competitive” and “self‐contained” options on three gene set curations: AD brain function, GO, and canonical pathways. We defined a significant GRPa as one with a BH‐adjusted p‐value < 0.05 after applying the Benjamini–Hochberg (BH) procedure within each gene set curation.
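The two numerical steps in this paragraph, the one‐tailed p‐to‐z conversion and the BH adjustment, can be illustrated with the Python standard library alone. This is a sketch with hypothetical function names; the conversion uses the identity Φ⁻¹(1 − p) = −Φ⁻¹(p), which avoids rounding 1 − p to 1 for very small p‐values.

```python
from statistics import NormalDist

def p_to_z(p):
    """One-tailed z-score z = Phi^{-1}(1 - p), computed as -Phi^{-1}(p)
    so that very small p-values keep their precision."""
    return -NormalDist().inv_cdf(p)

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Gene sets with an adjusted p-value below 0.05 would be called significant.
significant = [q < 0.05 for q in bh_adjust([0.01, 0.04, 0.03, 0.005])]
```

Applying `bh_adjust` separately to each gene set curation mirrors the per‐curation correction described above.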

2.13. GRPa‐GSVA

Gene set enrichment was performed for all the imputed AD individuals using the R package GSVA (function gsva; arguments: method = “gsva”, mx.diff = TRUE). ⁶⁴ GSVA implements a nonparametric, unsupervised method of gene set enrichment that assesses the relative enrichment of a selected pathway across individuals in a single‐sample manner and increases the power to find differential associations. ⁶⁵ We define the resulting values as GRPa scores, representing genetically regulated pathway activity; each score captures the magnitude difference between the largest positive and negative random walk values of a pathway. GSVA is a kernel‐based method that is less affected by gene‐set length or inter‐gene correlation. ⁶⁶ , ⁶⁷ We calculated GRPa scores for the three gene set curations (with a minimum of five and a maximum of 500 genes per set) using the GReX of each individual in each of the 13 brain regions. As in GRPa‐MAGMA, we proposed two models to capture the conditions with or without APOE region GReX. We used LogisticLRT to assess the improvement in goodness of fit (deviance of the NULL model − deviance of Model 1, and deviance of the NULL model − deviance of Model 2) for each GRPa in each tissue under Model 1 and the no‐APOE Model 2, where the outcome and covariates were the same as in the previous models, and GRPa was calculated by GSVA for each tissue and each individual from the GReX input.

$\text{Model 1}: \text{Outcome} \sim \text{GRPa (full)} + \text{covariates} \qquad (4)$

We defined the significant GRPa after Bonferroni correction for each gene set curation.
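The deviance‐difference test and the Bonferroni threshold can be illustrated as follows. This is a sketch under stated assumptions: the GRPa score enters the model as a single term, so the likelihood ratio statistic is compared with a chi‐squared distribution with one degree of freedom; the deviance values would come from fitted logistic models, which are not shown. The function names are hypothetical.

```python
import math


def lrt_pvalue_1df(deviance_null, deviance_model):
    """P-value of a likelihood ratio test with one added parameter.

    The statistic is the drop in deviance. For one degree of freedom the
    chi-squared survival function equals erfc(sqrt(stat / 2)).
    """
    stat = deviance_null - deviance_model
    if stat <= 0:
        return 1.0
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni_significant(pvals, alpha=0.05):
    """Flag p-values below alpha divided by the number of tests."""
    threshold = alpha / len(pvals)
    return [p < threshold for p in pvals]
```

Here `bonferroni_significant` would be called once per gene set curation, with `len(pvals)` equal to the number of gene sets tested in that curation.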

2.14. PRSet pathway polygenic risk score analysis

To evaluate the performance of our GRPa‐GSVA, we benchmarked it against gene set PRSs estimated for each individual by one of the latest methods, PRSet. ²⁷ Briefly, PRSet employs the classical approach of clumping and thresholding (C+T) to compute personalized PRSs for specific gene sets of interest using individual genotyping information. SNPs within a window of 35 kb upstream and 10 kb downstream of each gene coordinate from the curated gene set were accumulated. ⁶⁸ To evaluate the association between each pathway and AD outcome, a logistic regression model was applied with the same covariates as in the GRPa‐PRS methods; random SNPs were permuted 10,000 times as the background to assess the empirical competitive p‐values. We used the BH procedure to correct for multiple testing across gene sets. The number of pathways with an FDR < 0.05 was used to assess the performance of GRPa‐PRS and PRSet across three curated gene sets in two AD cohorts. We conducted logistic regression for genotyping data with and without SNPs in the APOE region (Model 1/Model 2) to evaluate PRSet's performance independent of the APOE region, applying the same outcomes and covariates as in previous models.

$\text{Model 1}: \text{Outcome} \sim \text{gene set PRS (full)} + \text{covariates} \qquad (5)$

2.15. Comparing the variance explained by GRPa‐GSVA and PRSet

$\text{Model 1}: \text{Outcome} \sim \text{gene set PRS (full)} + \text{GRPa (full)} + \text{covariates} \qquad (6)$

To compare the variance explained by the pathway score from GSVA with that explained by the pathway score from PRSet, we evaluated the goodness of fit of the logistic regression models in formulas (4) and (5) by Nagelkerke's R squared for each gene set in each curation, using the case_control stratum as the outcome. In addition, we used both scores as predictors, as shown in formula (6), and compared the result with formula (5) to quantify the improvement contributed by the pathway score from our GRPa‐GSVA method. The same outcome and covariates were applied as in the previous models. Furthermore, we divided Nagelkerke's R squared of each of the above models by that of the null model to obtain the R squared ratio, which measures the goodness‐of‐fit improvement from the generated pathway score.
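For readers unfamiliar with Nagelkerke's R squared, the following sketch shows the standard definition computed from model log‐likelihoods, together with the ratio described above. It assumes the log‐likelihoods of the fitted logistic models are already available; the function names and the numbers in the usage line are hypothetical and are not results from this study.

```python
import math


def nagelkerke_r2(loglik_model, loglik_intercept_only, n):
    """Nagelkerke's R squared from log-likelihoods of a fitted model
    and an intercept-only model, both fitted on n samples."""
    # Cox-Snell R squared.
    cox_snell = 1.0 - math.exp(2.0 * (loglik_intercept_only - loglik_model) / n)
    # Upper bound of Cox-Snell R squared for a discrete outcome.
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_intercept_only / n)
    return cox_snell / max_cox_snell


def r2_ratio(r2_with_pathway_score, r2_baseline):
    """Ratio of R squared with the pathway score to the baseline R squared."""
    return r2_with_pathway_score / r2_baseline


# Illustrative values only: a model with a pathway score vs. a baseline
# model containing covariates alone, both on 100 samples.
ratio = r2_ratio(nagelkerke_r2(-58.0, -69.3, 100),
                 nagelkerke_r2(-60.0, -69.3, 100))
```

A ratio above 1 indicates that adding the pathway score improved the fit relative to the baseline model.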

2.16. Definition of resilience and extra‐burden effects using orthogonality test

Our orthogonality test aims to establish genuine independence between individual PRS and GRPa‐GSVA scores, yet perfect independence is inherently difficult to quantify. We define an orthogonal effect as a GRPa feature whose Pearson correlation with PRS is statistically equivalent to zero. To address this difficulty, we reformulate the standard Pearson correlation test into an equivalence test by introducing a prespecified equivalence margin Δ (where Δ is chosen to be practically negligible, e.g., 0.050), such that any observed correlation |ρ| < Δ is deemed effectively zero, thereby operationalizing independence. Instead of testing H₀: ρ = 0 versus H₁: ρ ≠ 0, we pose H₀: |ρ| ≥ Δ (a non‐negligible correlation) against H₁: |ρ| < Δ (a negligible or zero correlation). Inspired by the Two One‐Sided Tests (TOST) equivalence test by Lakens, ⁶⁹ we apply Fisher's z‐transform ⁷⁰ to stabilize the sampling distribution of the observed correlation r, translate the equivalence margin Δ into z‐bounds, and reject H₀ only if both $z_{r} > -z_{\Delta}$ and $z_{r} < z_{\Delta}$ hold at level α. Intuitively, by requiring both one‐sided tests to reject, we constrain the true correlation within –Δ to Δ, allowing us to confidently accept “near‐zero” association as evidence of orthogonality. We used the R package `TOSTER` (V0.8.4) ⁶⁹ to conduct the TOST correlation test. Subsequently, resilience or extra‐burden GRPas were defined as those showing significant case vs. control differences by two‐sample t‐tests on GRPa‐GSVA scores within the same stratum (T20 or B20), but no significant differences in the overall case_control or TB20all comparisons (Table 2).
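The logic of the equivalence test can be sketched with a normal approximation on the Fisher z scale. This is an illustration of the idea only and is not the `TOSTER` implementation; the function name `tost_correlation` is hypothetical, and the inputs in the usage lines (a near‐zero and a clearly non‐zero correlation with n = 2722, one of the AD cohort sizes reported in this paper) are chosen only as examples.

```python
import math
from statistics import NormalDist


def tost_correlation(r, n, delta):
    """TOST p-value for H0: |rho| >= delta vs H1: |rho| < delta.

    Uses Fisher's z-transform, whose standard error is approximately
    1 / sqrt(n - 3). The TOST p-value is the larger of the two
    one-sided p-values.
    """
    se = 1.0 / math.sqrt(n - 3)
    z_r = math.atanh(r)
    z_delta = math.atanh(delta)
    norm = NormalDist()
    # One-sided test against the lower bound, H0: rho <= -delta.
    p_lower = 1.0 - norm.cdf((z_r + z_delta) / se)
    # One-sided test against the upper bound, H0: rho >= +delta.
    p_upper = norm.cdf((z_r - z_delta) / se)
    return max(p_lower, p_upper)


p_near_zero = tost_correlation(0.007, 2722, 0.050)  # small: equivalence supported
p_nonzero = tost_correlation(0.118, 2722, 0.050)    # near 1: equivalence not supported
```

Taking the maximum of the two one‐sided p‐values is what makes the procedure require both tests to reject before a correlation is declared negligible.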

2.17. Determination of margin delta

To determine an appropriate equivalence margin Δ for our TOST of Pearson correlations, we conducted a series of sample‐size‐driven power calculations under a two‐sided α = 0.05 and target power of 0.70. We varied Δ iteratively in small increments until the required sample size matched each of our two datasets. For the AD discovery cohort (n≈2800), setting Δ = 0.050 yielded an estimated n≈2874, confirming that our available sample affords adequate power to declare correlations smaller than this bound effectively zero. Similarly, for the SCZ discovery cohort (n≈6600), a tighter margin of Δ = 0.033 produced a required n≈6600. Overall, these empirically calibrated Δ values achieve a balance between statistical rigor and the limitations imposed by sample size.
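The relationship between Δ and the required sample size can be sketched with a closed‐form normal approximation. The assumptions here are ours and are not stated in the text: the true correlation is taken to be exactly zero, α = 0.05 is applied to each one‐sided test, and the Fisher z standard error 1/√(n − 3) is used. The function name `required_n` is hypothetical. Under these assumptions the formula gives values close to the n≈2874 and n≈6600 quoted above.

```python
import math
from statistics import NormalDist


def required_n(delta, alpha=0.05, power=0.70):
    """Approximate sample size for a TOST of a Pearson correlation.

    With a true correlation of zero, the power of the TOST is
    2 * Phi(atanh(delta) / se - z_{1-alpha}) - 1, where
    se = 1 / sqrt(n - 3). Solving for n gives the expression below.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1.0 - alpha)
    z_power = norm.inv_cdf(1.0 - (1.0 - power) / 2.0)
    return math.ceil(((z_alpha + z_power) / math.atanh(delta)) ** 2 + 3)


n_ad = required_n(0.050)   # margin used for the AD cohort
n_scz = required_n(0.033)  # margin used for the SCZ cohort
```

Because the required n grows roughly as 1/Δ², a tighter margin is only affordable in the larger SCZ cohort.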

3. RESULTS

3.1. PRS prediction performance evaluation and selection

We calculated the PRS based on three different GWAS summary statistics (S/K/W) adjusted by EraSOR ⁴² for both disc and rep cohorts. To explore the best signal‐to‐noise value, the same PRS calculation process was performed for a list of variant p‐value thresholds (1, 0.5, 0.2, 0.1, and 0.05). The PRS prediction performance had the highest mean AUC at a variant p‐value threshold of 0.2 for both the disc dataset (AUC S/K/W: 0.69/0.65/0.67) and the rep dataset (AUC S/K/W: 0.68/0.66/0.66). Thus, PRSs calculated at the threshold of 0.2 were used to generate the individual PRS, with their distribution shown in Figure S2. The demographic information for each stratum is shown in Tables S2–S7. For SCZ, we used the same process and obtained the highest AUC with a variant p‐value threshold of 0.05 for both the disc dataset (AUC Trubetskoy et al.: 0.77) and the rep dataset (AUC Trubetskoy et al.: 0.78). The corresponding demographic information for each stratum is shown in Tables S8 and S9. In our analysis, we observed that applying p‐value thresholds led to slightly improved prediction performance compared to using the full set of variants. This finding suggests that, although Bayesian methods like LDpred2 are designed to utilize the entire spectrum of p‐values, they can still perform robustly when some high p‐value variants are filtered out.

3.2. APOE‐region gene sets dominate the differential GRPas in extreme diagnosis groups

As summarized in Table 2 and Figure 2A, we first explored the differential GRPas in PRS burden in the case_control stratum (all cases and controls) and TB20all stratum (top extreme percentile cases and controls vs bottom extreme percentile cases and controls). In the GRPa‐MAGMA Model 1 result (Figure 3A), the disc cohort revealed no significant GO terms when comparing cases and controls. Within the three TB20all subgroups, 16 unique significant GO terms were identified. The top three terms were BP amyloid beta metabolic process (FDR: 6.11 × 10⁻⁵), CC protein lipid complex (FDR: 1.05 × 10⁻⁴), and BP protein lipid complex subunit organization (FDR: 8.19 × 10⁻⁴). In the rep cohort (Figure 3B), two significant pathways were identified when comparing cases vs controls and 11 unique significant GO terms were identified within TB20all subgroups, which replicated 7 out of 16 pathways from the disc cohort (Figure 3G). However, for our no‐APOE Model 2 (Figure 3C and 3D), only one significant pathway (BP amyloid beta metabolic process; FDR: 1.25 × 10⁻²) was identified in the disc cohort and none was identified in the rep cohort, suggesting that APOE‐region genes (Table S10) and their related functions dominate the PRS‐related GRPas.

Enrichment of AD GRPa‐MAGMA results on GO curation. GRPas were identified by GRPa‐MAGMA under Model 1, the model using the full genotype to detect all pathways associated with strata comparisons, and Model 2, the model excluding the *APOE* region genotype to detect pathways associated with six strata comparisons and independent of the *APOE* effect. (A) GO GRPa identified in discovery (disc) dataset Model 1, (B) GO GRPa identified in replication (rep) dataset Model 1, (C) GO GRPa identified in disc dataset Model 2 (no‐APOE model), and (D) GO GRPa identified in rep dataset Model 2 (no‐APOE model), with no significant result (FDR < 0.05) in this condition. * indicates the significant GRPas (FDR < 0.05). Heatmap intensity indicates ‐log₁₀(FDR). The x‐axis of the heatmap lists the subgroup comparisons based on different GWAS summary statistics. S represents Schwartzentruber et al; K represents Kunkle et al; W represents Wightman et al. (E, F) The semantic similarity for significant terms from GRPa‐MAGMA in BP and MF, respectively. (G) The UpSet plot for overlapping signals between the strata among the disc cohort and the rep cohort under Model 1.

In the GRPa‐GSVA GO analyses, the disc cohort revealed 14 significant GO pathways when comparing cases and controls (Figure 4A). The top three pathways and terms were BP amyloid beta clearance (p‐value: 5.50 × 10⁻³¹), MF lipoprotein particle receptor binding (p‐value: 1.45 × 10⁻²⁵), and BP protein containing complex remodeling (p‐value: 2.51 × 10⁻¹⁷). Within the three TB20all subgroups, 27 significant GO pathways were identified. The top three terms were BP response to tumor cell (p‐value: 3.39 × 10⁻⁴¹), MF lipoprotein particle receptor binding (p‐value: 2.14 × 10⁻²⁴), and BP amyloid beta clearance (p‐value: 5.62 × 10⁻²⁰). The UpSet plot (Figure 4G) demonstrated partial replication of Model 1 results from the disc cohort (Figure 4A) in the rep cohort (Figure 4B), ranging from 41.3% to 50.0%. Consistent with the GRPa‐MAGMA findings, Model 2 revealed a limited number of pathways (Figure 4C and 4D). In the discovery cohort, within the TB20all subgroups, amyloid beta‐related function was identified even in APOE‐removed Model 2. Conversely, in the rep cohort, MF clathrin binding (p‐value: 8.61 × 10⁻⁶) and BP vesicle cargo loading (p‐value: 3.42 × 10⁻⁵) emerged only in Model 2, suggesting that those terms were independent of the APOE effect.

AD GRPa‐GSVA results on GO curation. GRPas were identified by GRPa‐GSVA under Model 1, the model using the full genotype to detect all pathways associated with strata comparisons, and Model 2, the model excluding the *APOE* region genotype to detect pathways associated with six strata comparisons and independent of the *APOE* effect. (A) GO GRPa identified in discovery (disc) dataset Model 1, (B) GO GRPa identified in replication (rep) dataset Model 1, (C) GO GRPa identified in disc dataset Model 2 (no‐APOE), and (D) GO GRPa identified in rep dataset Model 2 (no‐APOE). * indicates the significant (p‐value < 0.05 / # of gene sets) GRPas identified in this condition. Heatmap intensity indicates ‐log₁₀(p‐value). The x‐axis of the heatmap lists the subgroup comparisons based on different GWAS summary statistics. S represents Schwartzentruber et al; K represents Kunkle et al; W represents Wightman et al. (E, F) The semantic similarity for significant terms from GRPa‐GSVA in BP and MF, respectively. (G) The UpSet plot for overlapping signals between the disc cohort and rep cohort under Model 1.

We expanded our analysis to include two additional gene set curations (AD brain function terms and canonical pathways, see Methods) by GRPa‐MAGMA (Figures S3 and S4), as well as GRPa‐GSVA (Figures S5 and S6). The results exhibited a pattern similar to that of the GO pathway analysis. First, a limited number of pathways were identified in the no‐APOE Model 2 compared to Model 1. Second, the TB20all stratum (top extreme percentile cases and controls vs. bottom extreme percentile cases and controls) had many more significantly differential GRPas than the case_control stratum. Together, these observations highlight the prominent impact of APOE and the importance of stratifying individuals into strata based on their PRS.

3.3. TB20AD has more PRS‐related GRPas than TB20Ctr does despite their similar PRS burden difference in AD

In the GRPa‐MAGMA analysis of GO terms for the TB20AD stratum (high‐risk cases and low‐risk cases), we observed 12 and five significantly differential GRPas in the disc (Figure 3A) and rep cohorts (Figure 3B), respectively. For the TB20Ctr stratum (high‐risk controls and low‐risk controls), only two pathways were identified in the disc and rep cohorts (Figure 3C and 3D), namely protein lipid complex and protein lipid complex subunit organization. Only protein lipid complex is shared between the disc and rep cohorts in the GRPa‐MAGMA analysis (Figure 3G). Relatively fewer significant GRPas were observed in TB20AD compared to TB20all, yet TB20AD showed more significant GRPas than TB20Ctr. These trends were consistent across results from other gene‐set curations (Figures S3 and S4). Notably, the TB20AD and TB20Ctr strata shared the same PRS burden difference (Figure 2A panel (3) & panel (4)), which indicates that genetic risk differences in top extreme percentiles of controls are more sporadic and less concentrated in a single pathway (Figure 2B). In our no‐APOE model, we only identified significant GRPas in the TB20AD stratum, including BP Notch signaling pathway (FDR: 9.34 × 10⁻³) (Figure 3C), astrocyte and inhibitory synapse functions in the AD brain function gene set curation (Figure S3C and S3D), and REACTOME signaling by notch (FDR: 3.92 × 10⁻²) in the canonical pathway gene set curation (Figure S4C).

In the GRPa‐GSVA analysis of GO terms for the disc cohort (Figure 4A), we found 11 significantly different GRPas in comparisons within the TB20AD stratum, with seven GRPas (19.4%) overlapping with the 32 GRPas in the rep cohort (Figure 4B & 4G). MF RNA polymerase II transcription factor binding and BP cognition were the two significant GRPas that only existed in the TB20AD stratum for GO terms (Figure 4B). For the TB20Ctr stratum, we identified BP myelin maintenance in both the STB20Ctr and WTB20Ctr strata in the AD brain function gene set curation (Figure S5B), and REACTOME assembly of active LPL and LIPC Lipase complexes (Figure S6A) and KEGG calcium signaling pathway (Figure S6B) in the canonical pathway gene set curation. In the no‐APOE Model 2 for the TB20Ctr strata comparison, microglia cell death & apoptosis (Figure 4C) in the AD brain function curation and KEGG calcium signaling pathway (Figure S6B) in the canonical pathway curation were identified as significant GRPas.
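The overlap percentage above is consistent with dividing the number of shared GRPas by the size of the union of the two sets, reading 32 as the number of GRPas on the rep side. This reading is our assumption, since the text does not state the denominator; the function name `overlap_fraction` is hypothetical.

```python
def overlap_fraction(n_first, n_second, n_shared):
    """Shared items as a fraction of the union of two sets."""
    union_size = n_first + n_second - n_shared
    return n_shared / union_size


# 11 GRPas in the disc cohort, 32 in the rep cohort, seven shared.
percent_overlap = round(100 * overlap_fraction(11, 32, 7), 1)
```

With these inputs the union contains 36 GRPas, so seven shared GRPas correspond to 19.4%.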

3.4. Comparisons in T and B strata highlight potential resilience‐related and extra‐burden–related GRPas that differ in AD PRS‐matched strata

Lastly, we assessed the potential differential GRPas in PRS‐matched strata: the T20 stratum (AD cases vs. controls with top PRS) and the B20 stratum (AD cases vs. controls with bottom PRS), which may reveal potential resilience and extra‐burden factors (counterfactors), respectively. In the GRPa‐MAGMA analysis for Model 1 (Figure 3A), the astrocyte immune system (FDR: 4.84 × 10⁻³) (Figure S3A) and REACTOME cargo recognition for clathrin‐mediated endocytosis (FDR: 3.36 × 10⁻²) (Figure S4B) were significant only in the T20 stratum. In the no‐APOE model, the REACTOME pathway cargo recognition for clathrin‐mediated endocytosis (FDR: 3.36 × 10⁻²) (Figure S4B) is exclusively identified in the T20 stratum and does not appear in any other PRS strata, suggesting that this pathway may operate independently of APOE‐related functions.

In the GRPa‐GSVA GO analysis, BP divalent inorganic cation homeostasis (Figure 4A) was detected in the disc cohort. In the GRPa‐GSVA canonical pathway analysis, KEGG calcium signaling pathway (Figure S6B) is significant in the SB20 stratum. Notably, this term is also highlighted in Model 2 (Figure S6D) and the previous WTB20Ctr stratum, which suggests this differential GRPa may indicate a resilience effect, contingent on passing the subsequent orthogonality test.

3.5. Mitochondrial‐related function was highlighted in SCZ

To further generalize our framework to another polygenic disease, we applied Model 1 to two SCZ cohorts on GO terms. We identified a few GRPas that were relatively consistent between GRPa‐MAGMA and GRPa‐GSVA in each cohort. Specifically, we identified MF structural constituent of muscle in the case_control stratum. BP cytokinesis, BP cell surface receptor signaling pathway involved in heart development, and BP protein localization to cytoskeleton were identified in the TBSCZ stratum (Figure S7A & S7C). Mitochondria function‐related terms were highlighted in the T stratum (Figure S7B). Lastly, BP translational elongation, MF phosphatidylinositol 3 kinase binding, and BP muscle tissue development were significant in the B stratum (Figure S7A and S7C). Overall, the GRPas highlighted in the T and B strata point to potential roles of muscle and heart development as well as mitochondrial function, indicating that they may act as extra‐burden or resilience factors.

3.6. Comparison of shared and unique significant GO terms for AD Model 1

We assessed the performance of our GRPa‐PRS by comparing it to PRSet, a method that computes pathway PRSs based on variants and personalized gene sets for each individual. Within GO, there is a significant difference between the signals identified by PRSet in the AD disc and rep cohorts. The disc cohort primarily identifies signals in the T20 and B20 strata, whereas the rep cohort signals exist in the TB20all, TB20AD, and TB20Ctr strata (Figure S8). We observed the same discrepancy between these two cohorts within the other two enrichment curations (Figures S9 and S10). Then, we compared the union set of pathways identified using three GWAS summary statistics in the disc and rep datasets across three different methods. The UpSet plot of GO pathways (Figure 5A) shows that nine of the ten largest intersections consist solely of signals from PRSet, without replication from GRPa‐MAGMA or GRPa‐GSVA, and most are different from established AD‐related pathways. Moreover, PRSet identified a disproportionately high number of pathways: more than 240 of 1,303 (18.5%) total GO terms showed significant differences in strata comparisons of two AD cohorts (Figure S8), raising concerns about the extent to which AD pathogenesis can alter GRPas. Compared with our GRPa‐MAGMA and GRPa‐GSVA, PRSet shows inconsistent replication performance across different cohorts (Figure 5A and Figures S11 and S12). Therefore, we focused on the results obtained from our GRPa‐MAGMA and GRPa‐GSVA. For example, in the GO gene set (Figure 5A), we found two overlapping signals between these two methods in the case versus control comparison, 13 (28.9%–58.8%) overlapping signals in the TB20all stratum, and seven (19.4%–43.8%) overlapping signals in the TB20AD stratum. In the TB20Ctr stratum, only one signal overlapped, while there were no overlapping signals in the other strata. These findings underscore the comparable results between our two methods.

Comparison of results from three methods (GRPa‐MAGMA, GRPa‐GSVA, and PRSet) in GReX with *APOE*. (A) The overlapping signals between GRPa‐MAGMA, GRPa‐GSVA, and PRSet are visualized in the UpSet plot. (B) The R² improvement ratio using GRPa‐GSVA and PRSet from the nested models in formulas (4), (5), and (6) among different gene set curations.

3.7. Semantic similarity analyses reveal key functional modules related to significant GO terms for AD and SCZ

To further summarize our findings with similar functions, we conducted semantic similarity analyses of all significant terms to highlight a few main functional modules (similarity > 0.1) shared between GRPa‐MAGMA and GRPa‐GSVA results for GO terms. Although the terms within each module vary (Figure 3E and Figure 4E), we can still identify major shared terms that cluster into functional modules in BP. These include amyloid beta metabolic processes, amyloid beta clearance, transmembrane and extracellular functions (vesicle, synaptic transmission cholinergic, response to tumor cell, and exploratory behavior). In contrast, MF terms did not cluster into distinct modules; the shared terms are tau protein binding, clathrin binding, proteoglycan binding, and lipoprotein particle receptor binding (Figure 3F and 4F). In the CC category, only protein lipid complex was a shared term between GRPa‐MAGMA and GRPa‐GSVA results (Figure 4G). Lastly, GRPa‐GSVA identified a few unique GO terms, such as anion and cation homeostasis and immune‐related functions in GO BP.

For SCZ, BP cell surface receptor signaling pathway involved in heart development from the TBSCZ stratum and muscle tissue development from the B stratum (extra‐burden factor) have a shared semantic similarity of 0.25 (Figure S7E). For CC, mitochondrial protein complexes have highly shared semantic functions with other mitochondrial‐function‐related GO terms from the T stratum with resilience factors (Figure S7G).

3.8. Gene‐level association in significant GRPas from strata comparisons for AD and SCZ

To further explore the genes that contribute to significant terms, we checked the MultiXcan gene‐level p‐values and GWAS gene‐level p‐values. For GRPa‐MAGMA, we used the MultiXcan gene‐level p‐values to reflect the GReX difference in strata comparisons. In AD, the top 20 associated MultiXcan genes (BH‐adjusted p < 0.05) (Figure S13A and S13B) were positively correlated with the number of significant GRPas in each strata comparison (Figure 3A and 3B). Ten genes were shared between the top 20 associated MultiXcan genes in AD disc and rep cohorts, including BIN1, PICALM, and CD2AP. ⁷¹ In SCZ, no significant MultiXcan genes were identified (Figure S13C and S13D), while well‐known SCZ‐risk genes, ⁷² such as ANK3, DRD2, GRM5, and KIF13A were captured among top genes of either SWE or MGS cohorts.

As GRPa‐GSVA does not provide a gene‐level association with the trait, we used the GWAS gene p‐values derived from MAGMA (Wightman et al. ⁴⁰ for AD and Trubetskoy et al. ¹⁷ for SCZ) to measure the importance of genes from GRPa‐GSVA GO terms. For AD, we visualized the top five GWAS significant genes from each significant GRPa across all significant GO terms in AD disc and rep cohorts (Figure S14A, S14B and S14C). As expected, 28 out of 36 GO BP terms included the APOE gene (Figure S14A), with the exceptions being immune response, cellular response, and transmembrane activities. For SCZ, we included all the significant GO terms from GRPa‐MAGMA and GRPa‐GSVA in either SWE or MGS (Figure S14D). Only the mitochondrial function‐related terms have shared the top five GWAS significant genes between terms, suggesting that a polygenic effect contributes to SCZ GRPas.

3.9. Comparing the variance explained by GRPa‐GSVA vs PRSet in AD

As described in the Methods section, we assessed the improvement brought by GRPa‐GSVA and PRSet, separately and jointly, by comparing Nagelkerke's R square values of formulas (4), (5), and (6) with that of formula (1). The results based on the disc cohort are presented in Figure 5B. Taking the GO gene set as an example, in formula (4), where GRPas generated by GSVA were included as predictors, the ratio of R square values ranged from 1.00 to 1.35 (median: 1.03). In formula (5), which includes gene set PRS from PRSet as predictors, the ratio of R square values ranged from 1.00 to 1.61 (median: 1.08). In formula (6), which includes both gene set PRS and GRPas generated by GSVA as predictors, the ratio of R square values ranged from 1.00 to 1.71 (median: 1.11). From Figure 5B, we observed that the improvement in goodness of fit achieved by GRPa‐GSVA (formula [4]) surpasses that of PRSet (formula [5]) for AD brain function and canonical pathways, though it is slightly smaller than that of PRSet for GO terms. Lastly, adding GRPa‐GSVA to PRSet (formula [6]) consistently yields significantly better performance than PRSet alone (t‐test p‐value: 1.09 × 10⁻⁷).

3.10. Orthogonality test in the resilience and extra‐burden GRPas from GRPa‐GSVA for AD and SCZ

In this work, we aimed to identify novel GRPas that are independent of PRS risk and act as resilience or extra‐burden factors for AD or SCZ, as defined in the orthogonality test section (Methods). Therefore, these resilience and extra‐burden GRPas are defined by the following features: (1) they are significant in the T20 or B20 strata for resilience or extra‐burden factors (Table 2); (2) they are orthogonal to the PRS (correlation with PRS risk is significantly equivalent to 0 via the TOST test). In Figure 6, we illustrate a few pathways and terms as positive and negative examples. For instance, BP divalent inorganic cation homeostasis activity was a notable signal identified in the T20 stratum of the disc cohort. As shown in Figure 6A, the correlation between its term score and PRS within the brain frontal cortex BA9 lies significantly between ‐0.050 and 0.050 (r = 7.147 × 10⁻³, TOST p = 0.012, Δ = 0.050), suggesting that BP divalent inorganic cation homeostasis activity has an orthogonal effect to PRS liability and is a significant resilience‐related GRPa with higher activity in KT20 group 2 than KT20 group 1 (Figure 6B). Similarly, KEGG calcium signaling pathway was another prominent signal, identified in the WTBCtr subgroup and SB subgroup of the rep cohort for both Model 1 and Model 2 (Figure S6B & S6D). Its correlation with PRS within the brain spinal cord cervical c−1 is statistically equivalent to 0 (r = ‐2.840 × 10⁻³, TOST p = 0.006, Δ = 0.050) (Figure 6C), suggesting that the KEGG calcium signaling pathway has an orthogonal effect to PRS liability and is a resilience‐related GRPa (Table 2 & Figure 2A) with higher activity in SB20 group 2 than SB20 group 1 (Figure 6D) and higher activity in WTB20Ctrl group 1 than WTB20Ctrl group 2 (Table S11).
On the contrary, BP amyloid beta clearance, a representative signal identified in both disc and rep cohorts in the case_control, TB20AD, and TB20Ctr strata, but not in the K20 stratum (Figure 6F), demonstrated a non‐zero correlation (r = 0.118, TOST p = 1.000, Δ = 0.050) with the PRS liability. As depicted in Figure 6E, BP amyloid beta clearance is positively correlated with the PRS and significantly different among the strata comparisons, as shown in the heatmap in Figure 4A. For SCZ, BP muscle tissue development demonstrated a correlation statistically equivalent to 0 (r = ‐0.010, TOST p = 0.029, Δ = 0.033) with the SCZ PRS liability (Figure 6G), suggesting that it has an orthogonal effect. Moreover, BP muscle tissue development has higher activity in stratum B20 group 1 than in B20 group 2 (Figure 6H), indicating an extra burden for SCZ (Figure S7C). Through GRPa‐GSVA analysis of the AD brain function curation, we identified two potential resilience‐related pathways: microglia cell death and apoptosis (r = 0.010, TOST p = 0.019, Δ = 0.050) in disc AD TB20Ctr Model 2 (Figure S5C), and myelin maintenance (r = ‐0.010, TOST p = 0.015, Δ = 0.050) in rep AD TB20Ctr Model 1 (Figure S5B and Figure S15).

Orthogonality test for key findings. We visualize the correlation between PRS and (A) divalent inorganic cation homeostasis activity, (C) calcium signaling pathway activity, (E) amyloid beta clearance activity, and (G) muscle tissue development activity, respectively. cor: Pearson correlation coefficient; p: p‐value from the TOST equivalence test. In (B), (D), (F), and (H), we plot the GSVA score distributions of group 1 versus group 2 for the corresponding strata comparisons. The p‐values are from two‐sample t‐tests. kPRS, sPRS, and tPRS represent the PRS calculated from Kunkle et al.’s AD GWAS, Schwartzentruber et al.’s AD GWAS, and Trubetskoy et al.’s SCZ GWAS, respectively.

3.11. Case–control comparisons for each APOE haplotype for AD

We also observed that the APOE4 haplotype tends to be more frequent in the high PRS group than in the low PRS group. Because the APOE4 allele mediates the pathway from PRS to downstream biology, adjusting for it would over‐adjust and attenuate the total PRS effect. In this study, we therefore conducted all differential GRPa comparisons in strata without using APOE4 allele dosage as a covariate, given the dosage differences listed in Tables S2–S9. We further stratified the case‐control comparisons by APOE haplotypes (22, 23, 33, 34, and 44), included the full case‐control comparison without APOE haplotype stratification, and performed GRPa‐MAGMA analyses across three curated gene sets (Figure S16). Only one new significant GRPa (MF_Ran_GTPase_binding) was identified relative to the full case‐control comparison, suggesting that there are few significant GRPas in the APOE haplotype‐stratified case and control comparisons, and that the significant differential GRPas in the case‐control comparison are mainly driven by APOE4 allele dosage differences between the case and control groups.

3.12. Sample size, effect size, and power analysis for SCZ and AD

In this study, we conducted the PRS calculation with sample sizes of 2,722 and 2,854 for the two AD cohorts, and for SCZ, the sample sizes were 6,628 and 5,334, respectively. The smallest comparison group contains 535 individuals for AD and 534 for SCZ. We expected the effect size of a GRPa to have an inverse relationship with the stratum size: when the total sample size is fixed, the effect size (Cohen's d) increases as the percentile threshold becomes more extreme and the strata become smaller. We conducted a sensitivity analysis of different extreme percentiles (10, 15, and 20) to explore the relationship between the extreme percentile and the effect size of terms (Figure S17A and S17B). We identified a consistent decrease in the absolute median effect size of GO terms as the strata grew larger, across all 13 brain tissues and three parallel PRS‐stratification subgroups (K, S, W indicate the GWAS summary statistics used). Consequently, the balance between effect size and sample size (extreme percentile threshold) represents the trade‐off between signal and noise. Using the result of Figure 4A as an example, the current GRPas have the largest absolute effect size in BP amyloid beta clearance, with an effect size of 0.5, and BP divalent inorganic cation homeostasis, with an effect size of 0.25, in the TB20AD stratum.

As shown in the power analysis at an alpha of 0.05 (Figure S17C), a term with an effect size of 0.2 in 100 samples has power similar to that of a term with an effect size of 0.15 in 200 samples. All in all, the power analysis and sensitivity analysis further unveil the relationship between GRPa effect size and sample size (extreme percentile). In our current analysis of the extreme 20 percentiles for AD, we anticipate a power of 0.5 to detect effects in strata with a minimum sample size of 200 and an effect size of at least 0.2. For SCZ across the extreme 10, 15, and 20 percentiles (Figure S17B), we expect to achieve the same power for strata with sample sizes ranging from 200 to 500, with the corresponding minimum detectable effect sizes ranging from 0.2 down to 0.13.
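The trade‐off between effect size and sample size can be illustrated with a normal approximation to the power of a two‐sided, two‐sample t‐test. This is a sketch under our own assumptions, namely that the quoted sample sizes are per group and that the groups are of equal size; it is not the calculation used to produce Figure S17C, and the function name `two_sample_power` is hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_power(cohens_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test.

    Uses the normal approximation, with noncentrality
    d * sqrt(n_per_group / 2) for two equal-sized groups.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    noncentrality = cohens_d * math.sqrt(n_per_group / 2.0)
    # Probability of rejecting in either tail.
    return norm.cdf(noncentrality - z_crit) + norm.cdf(-noncentrality - z_crit)


power_small_group = two_sample_power(0.20, 100)
power_large_group = two_sample_power(0.15, 200)
power_ad_minimum = two_sample_power(0.20, 200)
```

Under these assumptions the first two settings give similar power, and an effect size of 0.2 with 200 samples per group gives power of roughly one half, in line with the statements above.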

3.13. Robustness of GRPa‐PRS framework

To ensure the robustness of our findings, we developed two approaches (GRPa‐MAGMA and GRPa‐GSVA) to systematically characterize the GRPas among risk strata predefined by three GWAS summary statistics and two independent cohorts. Overall, similar trends of GRPa intensity among parallel subgroups were observed (Figures 3 and 4). However, due to slight differences between subgroups, significant pathways might not always be shared. Well‐known AD‐related pathways (including amyloid‐beta clearance, tau protein binding, and astrocyte response to oxidative stress) identified by these two approaches are highlighted consistently in the case_control, TBall, and TBAD strata in Model 1, suggesting that stable GRPas could be observed. Surprisingly, we also found functions such as amyloid‐beta formation and tau protein binding to be significant in Model 2 (no‐APOE), which suggests that other genes, such as BIN1, CLU, and PICALM, could drive the significant differences in GRPas without the APOE region. In addition, GRPa‐MAGMA identified astrocyte, neuron, Notch signaling, and clathrin functions (Figures 3A, 3C, S3C and S3D), while GRPa‐GSVA identified myelin maintenance, clathrin, and microglia functions (Figures 4B, 4D, S5C and S5D). We observed fewer significant GRPas in the TB20Ctr stratum compared to the TB20AD stratum, despite similar genetic burdens and sample sizes. As illustrated in Figure 2B, this suggests that the additive PRS liability in the top risk controls may result in a more sporadic genetic burden distributed across different GRPas, making them less susceptible than top risk cases.

3.14. Polygenicity in AD and SCZ

In AD with‐APOE (Model 1), GRPa‐MAGMA and GRPa‐GSVA identified over 30 well‐established AD risk GRPas in both the discovery and replication cohorts for GO curation, including pathways such as amyloid‐beta clearance, tau protein binding, and astrocyte responses to oxidative stress (Figures 3 and 4). In AD no‐APOE (Model 2), these methods identified only two GRPas in the discovery cohort and five in the replication cohort (Figures 3 and 4). In SCZ, the same analyses yielded nine significant GRPas in the discovery cohort and three in the replication cohort (Figure S7). Both SCZ and AD (no‐APOE Model 2) yielded far fewer significant GRPas compared with AD (Model 1), indicating that, in the absence of the dominant APOE effect, the genetic architecture of these disorders is more polygenic, with individual GRPas showing relatively smaller effect sizes.

4. DISCUSSION

Individuals carry different genetic liabilities to certain traits and diseases. However, which genetic risks act at the pathway and individual levels, and how they act, has not been fully revealed. PRS‐based methods have been widely used to characterize such liabilities. Here, we disentangled the TWAS‐based GRPas that are orthogonal to, not just correlated with, AD and SCZ PRS risks, unveiling both well‐known and novel counterfactors that play the roles of resilience and extra‐burden underlying the PRS risk strata comparison. Unlike previous FCS methods such as MAGMA ²⁵ or TWAS‐GSEA, ³³ which require GWAS summary statistics and are less applicable to SS data, our GRPa‐PRS framework offers both GRPa‐MAGMA (FCS) and GRPa‐GSVA (SS) approaches. Benchmarked against the variant‐based PRSet (SS), it showed more robust performance and explained a higher median proportion of variance.

4.1. Resilience‐related and extra‐burden–related GRPas interpretation

From the orthogonality test on the GRPa‐GSVA results, we observed one verified resilience‐related gene set under two conditions, whereas three potential resilience‐related gene sets in AD and the extra‐burden gene set in SCZ were each identified under one condition.
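To make the idea of orthogonality concrete, the sketch below shows one common way to test that a pathway score is practically uncorrelated with PRS: a two one‐sided tests (TOST) equivalence procedure on the Fisher z‐transformed Pearson correlation. This is an illustration only and is not necessarily the procedure used in this study. The function names and the equivalence bound of 0.1 are hypothetical choices.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tost_correlation(x, y, bound=0.1, alpha=0.05):
    """TOST equivalence test for a correlation (hypothetical bound).

    Tests H0: |rho| >= bound against H1: |rho| < bound using the
    Fisher z-transformation. Returns (r, p_value, is_equivalent).
    """
    n = len(x)
    r = pearson(x, y)
    # Clamp so that atanh stays finite when |r| is (numerically) 1.
    r_clamped = max(min(r, 0.999999), -0.999999)
    z_r, z_bound = atanh(r_clamped), atanh(bound)
    se = 1 / sqrt(n - 3)
    std_normal = NormalDist()
    # One-sided test against the lower bound (H0: rho <= -bound).
    p_lower = 1 - std_normal.cdf((z_r + z_bound) / se)
    # One-sided test against the upper bound (H0: rho >= +bound).
    p_upper = std_normal.cdf((z_r - z_bound) / se)
    p_value = max(p_lower, p_upper)
    return r, p_value, p_value < alpha
```

In this sketch, a pathway score would be called orthogonal to PRS only when both one‐sided tests reject, meaning the correlation is statistically confined within the chosen bound rather than merely failing to differ from zero.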

Our major finding among canonical pathways explicitly related to resilience was the KEGG Calcium signaling pathway (Figure S6B), which showed higher activity in SB20 group 2 compared to group 1 (Figure 6D) and in WTB20Ctrl group 1 compared to group 2 (Table S11) in both Model 1 and Model 2 conditions. Advanced age is a major risk factor for AD, and age‐dependent calcium (Ca²⁺) dysregulation has been reported in animal studies showing that aged rodent brains exhibit elevated cytoplasmic Ca²⁺ ([Ca²⁺]cyt) and increased overall Ca²⁺ levels. ⁷³ , ⁷⁴ Moreover, presenilin mutations, accounting for over 90% of familial AD cases, disrupt cytoplasmic and mitochondrial Ca²⁺ homeostasis, leading to neurodegeneration. ⁷⁵ Thus, our identification of genetically regulated calcium signaling pathways suggests they are not merely reactive to AD pathogenesis, but may actively counteract disease initiation.

Another term related to resilience, divalent inorganic cation homeostasis activity (Figures 4, 6 & 6D), was identified only in AD Model 1 but might be related to the KEGG Calcium signaling pathway. It involves homeostasis of essential biometals (e.g., Fe²⁺, Cu²⁺, Ca²⁺, Mn²⁺, Zn²⁺, and Mg²⁺) via transporters such as DMT1 (Fe²⁺, Cu²⁺, Mn²⁺), SLC39A13/ZIP13 (Zn²⁺), and SLC24A4/NCKX4 (Ca²⁺), ⁷⁶ which are among the top five GWAS risk genes from the divalent inorganic cation homeostasis activity term (Figure S14). Moreover, a previous study ⁷⁷ has shown that ions such as Zn²⁺ and Cu²⁺ interact with Ca²⁺ signaling and exacerbate AD pathology. These metal ions enter neurons via transporters such as Ctr1 (copper), TfR (iron), and ZIP (zinc), which become dysregulated in AD, leading to intracellular accumulation. This overload perturbs calcium signaling pathways, particularly at the endoplasmic reticulum (ER) and mitochondria, resulting in ion cross‐talk toxicity and neuronal dysfunction, which highlights the mechanistic interplay between these two resilience‐associated gene sets. Calcium‐homeostasis therapies have increasingly drawn the attention of researchers as promising strategies for AD. ⁷⁸ Current approaches include both approved and experimental interventions targeting calcium signaling. Memantine, a United States Food and Drug Administration (FDA)‐approved NMDA receptor antagonist, selectively inhibits pathological Ca²⁺ influx while sparing normal signaling, providing limited palliative benefit in moderate‐to‐severe AD. ⁷⁹

From the GRPa‐GSVA of the AD brain function gene set, we also identified two potential resilience‐related gene sets: microglia cell death and apoptosis in discovery AD, and myelin maintenance in replication AD. For microglia cell death and apoptosis, we identified the orthogonal effect (Figure S15A) and higher activity in group 1 [extreme percentile controls] compared to group 2 [bottom percentile controls] (Figure S15B). Microglia facilitate the clearance of Aβ and contribute to maintaining brain homeostasis. Therefore, reducing the number of over‐activated microglia through apoptosis can reduce the levels of inflammatory mediators, consequently mitigating neuroinflammation and its deleterious effects on neuronal integrity. ⁸⁰ Myelin maintenance also showed higher activity in group 1 [extreme percentile controls] than in group 2 [bottom percentile controls] (Figure S15C). Evidence from AD mouse models suggests that maintaining myelin integrity may protect against disease by preventing amyloid deposition, thereby delaying or reducing plaque formation and supporting cognitive resilience. ⁸¹

We identified only one significant extra‐burden BP GO term, muscle tissue development activity in SCZ (Figures S7C, 6G and 6H), with higher activity in B20 group 1 [bottom percentile cases] than in B20 group 2 [bottom percentile controls] (Figure 6H). FLOT1, ERBB4, SGCD, DDX39B, and ALPK3 are the top five most significant genes in muscle tissue development activity (Figure S14D) from SCZ GWAS summary statistics. ¹⁷ Specifically, FLOT1 (Flotillin 1) is involved in lipid raft‐mediated signal transduction, which is important for muscle cell differentiation and regeneration. ⁸² FLOT1 is also involved in neurodevelopmental processes, and alterations in FLOT1 expression have been linked to neuropsychiatric disorders, including schizophrenia. ⁸³ ERBB4 (Erb‐B2 Receptor Tyrosine Kinase 4) plays a role in both neuromuscular junctions ⁸⁴ and neural development, impacting synaptic plasticity and neurotransmission. ⁸⁵ SGCD (Sarcoglycan Delta) is part of the sarcoglycan complex, which is crucial for muscle cell membrane integrity. Mutations in SGCD are linked to muscular dystrophies. ⁸⁶ Overall, these genes illustrate how disturbances in developmental pathways can confer extra burden for both muscular and psychiatric abnormalities, suggesting that impaired neuromuscular function could play a role in SCZ. ⁸⁷

4.2. Limitations

First, because GWAS summary statistics lack individual‐level data, we could not directly detect sample overlap (< 1%), but the potential PRS bias was likely minimized after applying the EraSOR adjustment ⁴² method. After the overlap sample correction, we still obtained PRS models with basic covariates such as age and sex (AUC of ∼0.67 in AD and ∼0.77 in SCZ), which aligns with the AUC performance of ∼0.8 in AD ¹² , ¹³ and ∼0.72 in SCZ ⁸⁸ from previous studies. ¹⁸ , ¹⁹ Second, by borrowing the expression information from disease/trait‐related tissues, our GRPa‐PRS framework has more power to detect GRPas independent of APOE, at the cost of a higher computational burden and longer run time than SNP‐based methods such as PRSet. Third, our newly developed method does not consider the effect of rare variants; this might have had a larger impact in the comparisons involving the extra‐burden group, where rare variants possibly play a larger role due to their higher susceptibility. Fourth, although we identified relatively consistent signals in Model 1 (TBall, TBAD, and TBCtr) among three GWAS summary statistics and two independent cohorts using our two approaches in the GRPa‐PRS framework, we failed to identify relatively stable signals in Model 2 (no‐APOE) or in the T and B strata in either Model 1 or Model 2, suggesting a small effect size of resilience‐related GRPas compared to APOE‐related GRPas, as expected. Fifth, we chose a soft threshold of 20% and a minimum sample size of 500 in each extreme stratum to maximize the difference (Tables S2–S9). Notably, the results for soft thresholds of 10% and 15% are comparable to those for 20%, despite the decrease in effect size (Figures S17A and S17B). We recognize that a larger cohort would increase our power to detect significant GRPas (Figure S17C). Therefore, applying our model to other polygenic diseases with larger cohorts would allow a more stringent percentile threshold, maximizing effect size differences. Sixth, we used MAGMA's LD‐based gene–gene correlation matrix in GRPa‐MAGMA to account for gene correlations. While TWAS focuses on predicted gene expression–phenotype associations, LD‐based matrices complement this by capturing part of the correlation structure. Further research is needed to refine and validate this approach. Seventh, given the limited covariates (only age and sex) in our datasets, we employed regression‐based adjustment rather than matching (e.g., propensity score). Matching is generally more effective with richer covariates. ⁸⁹ In future work, incorporating lifestyle, behavioral, and clinical data from resources such as NIAGADS ⁹⁰ or large biobanks ⁹¹ may enable more comprehensive matching to better control for confounding. Eighth, we observed a higher AUC when restricting to variants below a p‐value threshold (p < 0.2 for AD, p < 0.05 for SCZ), suggesting that retaining only these variants may enhance the signal‐to‐noise ratio. While this reflects our exploratory findings, Bayesian methods such as LDpred2 still recommend including all variants regardless of p‐value. Finally, the minimum age of controls was 57 years, so some top‐risk controls may still develop AD, though this limitation applies less to earlier‐onset disorders such as SCZ.
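The percentile‐based stratification discussed in the fifth limitation can be sketched as follows. This is a minimal illustration, assuming that the percentile cut‐offs are taken over the whole cohort and that each individual is described by an identifier, a PRS value, and a case/control flag. The function name, the stratum labels, and the `min_size` check are hypothetical and are not taken from the GRPa‐PRS code base.

```python
def stratify_by_prs(samples, fraction=0.20, min_size=0):
    """Split a cohort into extreme PRS strata by diagnosis.

    samples  : list of (sample_id, prs, is_case) tuples
    fraction : share of the cohort in each tail (0.20 = extreme 20%)
    min_size : strata with fewer samples than this are flagged

    Returns (strata, too_small), where strata maps a label to a list
    of sample ids and too_small lists the labels below min_size.
    """
    ranked = sorted(samples, key=lambda s: s[1])  # ascending PRS
    k = int(len(ranked) * fraction)
    bottom = ranked[:k]
    top = ranked[len(ranked) - k:]  # safe even when k == 0
    strata = {
        "top_case": [s[0] for s in top if s[2]],
        "top_control": [s[0] for s in top if not s[2]],
        "bottom_case": [s[0] for s in bottom if s[2]],
        "bottom_control": [s[0] for s in bottom if not s[2]],
    }
    too_small = [name for name, ids in strata.items() if len(ids) < min_size]
    return strata, too_small


if __name__ == "__main__":
    # Toy cohort: PRS equals the index; even indices are cases.
    cohort = [(f"s{i}", float(i), i % 2 == 0) for i in range(100)]
    strata, too_small = stratify_by_prs(cohort, fraction=0.20, min_size=5)
    for name, ids in strata.items():
        print(name, len(ids))
    print("below minimum size:", too_small)
```

A stricter `fraction` (for example 0.10) enlarges the expected PRS contrast between the tails but shrinks every stratum, which is the trade‐off between effect size and sample size noted above.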

4.3. Future directions

Although not yet ready for clinical use, our framework enables estimation of disease risk and GRPa scores from individual genotyping data. This may support future advances in clinical management as larger collections of individual genotype and phenotype data become available.

For fine‐grained risk stratification, GRPa‐PRS can help identify pathways linked to genetic risk or resilience. For instance, in AD, dysregulation of amyloid‐beta clearance and tau binding pathways is associated with increased risk, while calcium signaling and divalent cation homeostasis may confer resilience independent of overall PRS. Such information could aid participant selection for clinical trials, for example, by excluding individuals with an overabundance of resilience factors from trials aimed at AD prevention.

Precision therapeutics and precision early intervention are two other areas where GRPa‐PRS can be used to reveal biologically relevant dysregulations and guide targeted therapies. In SCZ, for example, it highlighted mitochondrial and muscle development pathways, suggesting potentially new treatments complementary to antipsychotics. Such insights may enable early identification and proactive care to alter disease trajectories.

CONFLICT OF INTEREST STATEMENT

All authors declare no conflict of interest. Author disclosures are available in the Supporting Information.

CONSENT STATEMENT

All participants provided informed written consent in the original study; therefore the statement for this secondary analysis work is not applicable.

Supporting information

Supporting Information

ALZ-21-e70779-s010.docx (47.2KB, docx)
ALZ-21-e70779-s001.docx (28.2KB, docx)
ALZ-21-e70779-s009.docx (119.4KB, docx)
ALZ-21-e70779-s005.docx (949.4KB, docx)
ALZ-21-e70779-s023.docx (1.3MB, docx)
ALZ-21-e70779-s014.docx (1.5MB, docx)
ALZ-21-e70779-s004.docx (1.3MB, docx)
ALZ-21-e70779-s006.docx (1.5MB, docx)
ALZ-21-e70779-s019.docx (1.3MB, docx)
ALZ-21-e70779-s012.pdf (288.2KB, pdf)
ALZ-21-e70779-s018.pdf (153KB, pdf)
ALZ-21-e70779-s022.pdf (268KB, pdf)
ALZ-21-e70779-s011.docx (1MB, docx)
ALZ-21-e70779-s003.docx (793.1KB, docx)
ALZ-21-e70779-s013.docx (1.3MB, docx)
ALZ-21-e70779-s016.docx (1.3MB, docx)
ALZ-21-e70779-s020.docx (235.5KB, docx)
ALZ-21-e70779-s021.docx (1.1MB, docx)
ALZ-21-e70779-s017.docx (735.2KB, docx)
ALZ-21-e70779-s015.docx (67.7KB, docx)
ALZ-21-e70779-s002.docx (26.6KB, docx)
ALZ-21-e70779-s007.docx (15.4KB, docx)
ALZ-21-e70779-s008.pdf (374.4KB, pdf)

ACKNOWLEDGMENTS

The authors thank all the members of the Bioinformatics and Systems Medicine Laboratory (BSML) for constructive discussion. This research was partially supported by National Institutes of Health grants awarded to Y.D. and Z.Z. (R21AG087299), to Z.Z. (U01AG079847, R03AG077191, R01LM012806, R01DE030122, and R01DE029818), and to J.C. (R15AG083618). We acknowledge the resource support from the Cancer Prevention and Research Institute of Texas (CPRIT RP240610). A.L. is supported by a training fellowship from the Gulf Coast Consortia on Training in Precision Environmental Health Sciences (TPEHS) Training Grant (T32ES027801).

Li X, Fernandes BS, Liu A, et al. GRPa‐PRS: A risk stratification method to identify genetically‐regulated pathways in polygenic diseases. Alzheimer's Dement. 2025;21:e70779. 10.1002/alz.70779

Contributor Information

Zhongming Zhao, Email: Zhongming.zhao@uth.tmc.edu.

Yulin Dai, Email: Yulin.Dai@uth.tmc.edu.

DATA AVAILABILITY STATEMENT

All the data generated or analyzed in this study are available from the authors upon reasonable request. The overall framework can be downloaded from https://github.com/davidroad/GRPa-PRS.

REFERENCES

1. Sims R, Hill M, Williams J. The multiplex model of the genetics of Alzheimer's disease. Nat Neurosci. 2020;23:311‐322.
2. Leonenko G, Baker E, Stevenson‐Hoare J, et al. Identifying individuals with high risk of Alzheimer's disease using polygenic risk scores. Nat Commun. 2021;12:4506.
3. de Rojas I, Moreno‐Grau S, Tesi N, et al. Common variants in Alzheimer's disease and risk stratification by polygenic risk scores. Nat Commun. 2021;12:3417.
4. International Schizophrenia Consortium, Purcell SM, Wray NR, Stone JL, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748‐752.
5. Schizophrenia Psychiatric Genome‐Wide Association Study (GWAS) Consortium. Genome‐wide association study identifies five new schizophrenia loci. Nat Genet. 2011;43:969‐976.
6. Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013;9:e1003348.
7. Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome‐wide association studies. Genome Res. 2007;17:1520‐1528.
8. Jiang W, Chen L, Girgenti MJ, Zhao H. Tuning parameters for polygenic risk score methods using GWAS summary statistics from training data. Nat Commun. 2024;15:24.
9. Euesden J, Lewis CM, O'Reilly PF. PRSice: polygenic risk score software. Bioinformatics. 2015;31:1466‐1468.
10. Ge T, Chen C‐Y, Ni Y, Feng Y‐CA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun. 2019;10:1776.
11. Privé F, Arbel J, Vilhjálmsson BJ. LDpred2: better, faster, stronger. Bioinformatics. 2020;36:5424‐5431.
12. Escott‐Price V, Sims R, Bannister C, et al. Common polygenic variation enhances risk prediction for Alzheimer's disease. Brain. 2015;138:3673‐3684.
13. Escott‐Price V, Myers AJ, Huentelman M, Hardy J. Polygenic risk score analysis of pathologically confirmed Alzheimer disease. Ann Neurol. 2017;82:311‐314.
14. Saddiki H, Fayosse A, Cognat E, et al. Age and the association between apolipoprotein E genotype and Alzheimer disease: a cerebrospinal fluid biomarker‐based case‐control study. PLoS Med. 2020;17:e1003289.
15. Stern Y. Cognitive reserve in ageing and Alzheimer's disease. Lancet Neurol. 2012;11:1006‐1012.
16. Pardiñas AF, GERAD1 Consortium, Holmans P, Pocklington AJ, et al. Common schizophrenia alleles are enriched in mutation‐intolerant genes and in regions under strong background selection. Nat Genet. 2018;50:381‐389.
17. Trubetskoy V, Pardiñas AF, Qi T, et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature. 2022;604:502‐508.
18. Koch S, Schmidtke J, Krawczak M, Caliebe A. Clinical utility of polygenic risk scores: a critical 2023 appraisal. J Community Genet. 2023;14:471‐487.
19. Zheutlin AB, Dennis J, Karlsson Linnér R, et al. Penetrance and pleiotropy of polygenic risk scores for schizophrenia in 106,160 patients across four health care systems. Am J Psychiatry. 2019;176:846‐855.
20. Hess JL, Tylee DS, Mattheisen M, et al, Schizophrenia Working Group of the Psychiatric Genomics Consortium. A polygenic resilience score moderates the genetic risk for schizophrenia. Mol Psychiatry. 2021;26:800‐815.
21. Hou J, Hess JL, Armstrong N, et al. Polygenic resilience scores capture protective genetic effects for Alzheimer's disease. Transl Psychiatry. 2022;12:296.
22. Baker E, Escott‐Price V. Polygenic risk scores in Alzheimer's disease: current applications and future directions. Front Digit Health. 2020;2:14.
23. Tarca AL, Bhatti G, Romero R. A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity. PLoS One. 2013;8:e79217.
24. Liao Y, Wang J, Jaehnig EJ, Shi Z, Zhang B. WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Nucleic Acids Res. 2019;47:W199‐205.
25. de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene‐set analysis of GWAS data. PLoS Comput Biol. 2015;11:e1004219.
26. Weeks EM, Ulirsch JC, Cheng NY, et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat Genet. 2023;55:1267‐1276.
27. Choi SW, García‐González J, Ruan Y, et al. PRSet: pathway‐based polygenic risk score analyses and software. PLoS Genet. 2023;19:e1010624.
28. Barbeira AN, Dickinson SP, Bonazzola R, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun. 2018;9:1825.
29. Wainberg M, Sinnott‐Armstrong N, et al. Opportunities and challenges for transcriptome‐wide association studies. Nat Genet. 2019;51:592‐599.
30. Dai Y, Wang J, Jeong H‐H, Chen W, Jia P, Zhao Z. Association of CXCR6 with COVID‐19 severity: delineating the host genetic factors in transcriptomic regulation. Hum Genet. 2021;140:1313‐1328.
31. Dai Y, Jia P, Zhao Z, Gottlieb A. A method for bridging population‐specific genotypes to detect gene modules associated with Alzheimer's disease. Cells. 2022;11:2219.
32. Dai Y, Pei G, Zhao Z, Jia P. A convergent study of genetic variants associated with Crohn's disease: evidence from GWAS, gene expression, methylation, eQTL and TWAS. Front Genet. 2019;10:318.
33. Dall'Aglio L, Lewis CM, Pain O. Delineating the genetic component of gene expression in major depression. Biol Psychiatry. 2021;89:627‐636.
34. Gusev A, Ko A, Shi H, et al. Integrative approaches for large‐scale transcriptome‐wide association studies. Nat Genet. 2016;48:245‐252.
35. Gamazon ER, Wheeler HE, Shah KP, et al. A gene‐based association method for mapping traits using reference transcriptome data. Nat Genet. 2015;47:1091‐1098. doi: 10.1038/ng.3367
36. Nagpal S, Meng X, Epstein MP, et al. TIGAR: an improved Bayesian tool for transcriptomic data imputation enhances gene mapping of complex traits. Am J Hum Genet. 2019;105:258‐266.
37. Mai J, Lu M, Gao Q, Zeng J, Xiao J. Transcriptome‐wide association studies: recent advances in methods, applications and available databases. Commun Biol. 2023;6:899.
38. Kunkle BW, Grenier‐Boley B, Sims R, et al. Genetic meta‐analysis of diagnosed Alzheimer's disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat Genet. 2019;51:414‐430.
39. Schwartzentruber J, Cooper S, Liu JZ, et al. Genome‐wide meta‐analysis, fine‐mapping and integrative prioritization implicate new Alzheimer's disease risk genes. Nat Genet. 2021;53:392‐402.
40. Wightman DP, Jansen IE, Savage JE, et al. A genome‐wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer's disease. Nat Genet. 2021;53:1276‐1282.
41. Wu Y, Sun Z, Zheng Q, et al. Pervasive biases in proxy genome‐wide association studies based on parental history of Alzheimer's disease. Nat Genet. 2024;56:2696‐2703.
42. Choi SW, Mak TSH, Hoggart CJ, O'Reilly PF. EraSOR: a software tool to eliminate inflation caused by sample overlap in polygenic score analyses. Gigascience. 2022;12:giad043.
43. Lee JH, Cheng R, Graff‐Radford N, Foroud T, Mayeux R, National Institute on Aging Late‐Onset Alzheimer's Disease Family Study Group. Analyses of the National Institute on Aging late‐onset Alzheimer's disease family study: implication of additional loci. Arch Neurol. 2008;65:1518‐1526.
44. Li H, Wetten S, Li L, et al. Candidate single‐nucleotide polymorphisms from a genomewide association study of Alzheimer disease. Arch Neurol. 2008;65:45‐53.
45. Bennett DA, Buchman AS, Boyle PA, Barnes LL, Wilson RS, Schneider JA. Religious orders study and rush memory and aging project. J Alzheimers Dis. 2018;64:S161‐89.
46. Allen M, Carrasquillo MM, Funk C, et al. Human whole genome genotype and transcriptome data for Alzheimer's and other neurodegenerative diseases. Sci Data. 2016;3:160089.
47. Wang M, Beckmann ND, Roussos P, et al. The Mount Sinai cohort of large‐scale genomic, transcriptomic and proteomic data in Alzheimer's disease. Sci Data. 2018;5:180185.
48. Petersen RC, Aisen PS, Beckett LA, et al. Alzheimer's Disease Neuroimaging Initiative (ADNI): clinical characterization. Neurology. 2010;74:201‐209.
49. Bergen SE, O'Dushlaine CT, Ripke S, et al. Genome‐wide association study in a Swedish population yields support for greater CNV and MHC involvement in schizophrenia compared with bipolar disorder. Mol Psychiatry. 2012;17:880‐886.
50. Shi J, Levinson DF, Duan J, et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature. 2009;460:753‐757.
51. Cammann D, Lu Y, Cummings MJ, et al. Genetic correlations between gut microbiome genera, Alzheimer's disease diagnosis, and APOE genotypes: a polygenic risk score study. 2022. doi: 10.21203/rs.3.rs-2292371/v1
52. Chen X, Chen DG, Zhao Z, Zhan J, Ji C, Chen J. Artificial image objects for classification of schizophrenia with GWAS‐selected SNVs and convolutional neural network. Patterns (N Y). 2021;2:100303.
53. McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer's disease: report of the NINCDS‐ADRDA work group under the auspices of Department of Health and Human Services Task Force on Alzheimer's disease. Neurology. 1984;34:939‐944.
54. Wang Q, Chen K, Su Y, Reiman EM, Dudley JT, Readhead B. Deep learning‐based brain transcriptomic signatures associated with the neuropathological and clinical severity of Alzheimer's disease. Brain Commun. 2022;4:fcab293.
55. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, et al. A global reference for human genetic variation. Nature. 2015;526:68‐74.
56. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen W‐M. Robust relationship inference in genome‐wide association studies. Bioinformatics. 2010;26:2867‐2873.
57. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high‐throughput sequencing data. Nucleic Acids Res. 2010;38:e164.
58. International HapMap 3 Consortium, Altshuler DM, Gibbs RA, Peltonen L, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52‐58.
59. Privé F, Aschard H, Ziyatdinov A, Blum MGB. Efficient analysis of large‐scale genome‐wide data with two R packages: bigstatsr and bigsnpr. Bioinformatics. 2018;34:2781‐2787.
60. Sayols S. Rrvgo: a bioconductor package for interpreting lists of gene ontology terms. MicroPubl Biol. 2023;2023. doi: 10.17912/micropub.biology.000811
61. Urbut SM, Wang G, Carbonetto P, Stephens M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat Genet. 2019;51:187‐195.
62. GTEx Consortium. The Genotype‐Tissue Expression (GTEx) project. Nat Genet. 2013;45:580‐585.
63. Barbeira AN, Pividori M, Zheng J, Wheeler HE, Nicolae DL, Im HK. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 2019;15:e1007889.
64. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA‐seq data. BMC Bioinformatics. 2013;14:7.
65. Chatzinakos C, Georgiadis F, Lee D, et al. TWAS pathway method greatly enhances the number of leads for uncovering the molecular underpinnings of psychiatric disorders. Am J Med Genet B Neuropsychiatr Genet. 2020;183:454‐463.
66. Zheng X, Amos CI, Frost HR. Comparison of pathway and gene‐level models for cancer prognosis prediction. BMC Bioinformatics. 2020;21(1):76. doi: 10.1186/s12859-020-3423-z
67. Frost HR. Reconstruction Set Test (RESET): a computationally efficient method for single sample gene set testing based on randomized reduced rank reconstruction error. BioRxivorg. 2023. doi: 10.1101/2023.04.03.535366
68. Dai Y, Hu R, Liu A, et al. WebCSEA: web‐based cell‐type‐specific enrichment analysis of genes. Nucleic Acids Res. 2022;50:W782‐90.
69. Lakens D. Equivalence tests: a practical primer for t tests, correlations, and meta‐analyses. Soc Psychol Personal Sci. 2017;8:355‐362.
70. Fisher RA. On the “Probable Error” of a Coefficient of Correlation Deduced from a Small Sample. Metron. 1921;1:3‐32.
71. Bellenguez C, Küçükali F, Jansen IE, et al. New insights into the genetic etiology of Alzheimer's disease and related dementias. Nat Genet. 2022;54:412‐436.
72. Jia P, Han G, Zhao J, Lu P, Zhao Z. SZGR 2.0: a one‐stop shop of schizophrenia candidate genes. Nucleic Acids Res. 2017;45:D915‐24.
73. Raza M, Deshpande LS, Blair RE, Carter DS, Sombati S, DeLorenzo RJ. Aging is associated with elevated intracellular calcium levels and altered calcium homeostatic mechanisms in hippocampal neurons. Neurosci Lett. 2007;418:77‐81.
74. Thibault O, Gant JC, Landfield PW. Expansion of the calcium hypothesis of brain aging and Alzheimer's disease: minding the store. Aging Cell. 2007;6:307‐317.
75. Sarasija S, Laboy JT, Ashkavand Z, Bonner J, Tang Y, Norman KR. Presenilin mutations deregulate mitochondrial Ca2+ homeostasis and metabolic activity causing neurodegeneration in Caenorhabditis elegans. Elife. 2018;7:e33052.
76. Song N, Jiang H, Wang J, Xie J‐X. Divalent metal transporter 1 up‐regulation is involved in the 6‐hydroxydopamine‐induced ferrous iron influx. J Neurosci Res. 2007;85:3118‐3126.
77. Wang P, Wang Z‐Y. Metal ions influx is a double edged sword for the pathogenesis of Alzheimer's disease. Ageing Res Rev. 2016;35:265‐290.
78. Song L, Tang Y, Law BYK. Targeting calcium signaling in Alzheimer's disease: challenges and promising therapeutic avenues. Neural Regen Res. 2024;19:501‐502.
79. Figueiredo CP, Clarke JR, Ledo JH, et al. Memantine rescues transient cognitive impairment caused by high‐molecular‐weight aβ oligomers but not the persistent impairment induced by low‐molecular‐weight oligomers. J Neurosci. 2013;33:9626‐9634.
80. Gao C, Jiang J, Tan Y, Chen S. Microglia in neurodegenerative diseases: mechanism and potential therapeutic targets. Signal Transduct Target Ther. 2023;8:359.
81. Papuć E, Rejdak K. The role of myelin damage in Alzheimer's disease pathology. Arch Med Sci. 2018;16:345‐351.
82. Banning A, Tomasovic A, Tikkanen R. Functional aspects of membrane association of reggie/flotillin proteins. Curr Protein Pept Sci. 2011;12:725‐735.
83. Chen Y, Guan W, Wang M‐L, Lin X‐Y. PI3K‐AKT/mTOR signaling in psychiatric disorders: a valuable target to stimulate or suppress? Int J Neuropsychopharmacol. 2024;27:pyae010. doi: 10.1093/ijnp/pyae010
84. Tidcombe H, Jackson‐Fisher A, Mathers K, Stern DF, Gassmann M, Golding JP. Neural and mammary gland defects in ErbB4 knockout mice genetically rescued from embryonic lethality. Proc Natl Acad Sci U S A. 2003;100:8281‐8286.
85. Mei L, Xiong W‐C. Neuregulin 1 in neural development, synaptic plasticity and schizophrenia. Nat Rev Neurosci. 2008;9:437‐452.
86. McNally EM, Pytel P. Muscle diseases: the muscular dystrophies. Annu Rev Pathol. 2007;2:87‐109.
87. Raj V, Stogios N, Agarwal SM, Cheng AJ. The neuromuscular basis of functional impairment in schizophrenia: a scoping review. Schizophr Res. 2024;274:46‐56.
88. Ni G, Zeng J, Revez JA, et al. A comparison of ten polygenic score methods for psychiatric disorders applied across multiple cohorts. Biol Psychiatry. 2021;90:611‐620.
89. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci. 2010;25:1‐21.
90. Greenfest‐Allen E, Valladares O, Kuksa PP, et al. NIAGADS Alzheimer's GenomicsDB: a resource for exploring Alzheimer's disease genetic and genomic knowledge. Alzheimers Dement. 2024;20:1123‐1136.
91. Sudlow C, Gallacher J, Allen N, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supporting Information: ALZ-21-e70779-s010.docx (47.2 KB, docx)
Supporting Information: ALZ-21-e70779-s001.docx (28.2 KB, docx)
Supporting Information: ALZ-21-e70779-s009.docx (119.4 KB, docx)
Supporting Information: ALZ-21-e70779-s005.docx (949.4 KB, docx)
Supporting Information: ALZ-21-e70779-s023.docx (1.3 MB, docx)
Supporting Information: ALZ-21-e70779-s014.docx (1.5 MB, docx)
Supporting Information: ALZ-21-e70779-s004.docx (1.3 MB, docx)
Supporting Information: ALZ-21-e70779-s006.docx (1.5 MB, docx)
Supporting Information: ALZ-21-e70779-s019.docx (1.3 MB, docx)
Supporting Information: ALZ-21-e70779-s012.pdf (288.2 KB, pdf)
Supporting Information: ALZ-21-e70779-s018.pdf (153 KB, pdf)
Supporting Information: ALZ-21-e70779-s022.pdf (268 KB, pdf)
Supporting Information: ALZ-21-e70779-s011.docx (1 MB, docx)
Supporting Information: ALZ-21-e70779-s003.docx (793.1 KB, docx)
Supporting Information: ALZ-21-e70779-s013.docx (1.3 MB, docx)
Supporting Information: ALZ-21-e70779-s016.docx (1.3 MB, docx)
Supporting Information: ALZ-21-e70779-s020.docx (235.5 KB, docx)
Supporting Information: ALZ-21-e70779-s021.docx (1.1 MB, docx)
Supporting Information: ALZ-21-e70779-s017.docx (735.2 KB, docx)
Supporting Information: ALZ-21-e70779-s015.docx (67.7 KB, docx)
Supporting Information: ALZ-21-e70779-s002.docx (26.6 KB, docx)
Supporting Information: ALZ-21-e70779-s007.docx (15.4 KB, docx)
Supporting Information: ALZ-21-e70779-s008.pdf (374.4 KB, pdf)

Data Availability Statement

All the data generated or analyzed in this study are available from the authors upon reasonable request. The overall framework can be downloaded from https://github.com/davidroad/GRPa-PRS.
