Author manuscript; available in PMC: 2022 Aug 1.
Published in final edited form as: Stroke. 2021 May 27;52(9):2882–2891. doi: 10.1161/STROKEAHA.120.033670

Predictive performance of a polygenic risk score for incident ischemic stroke in a healthy older population

Johannes T Neumann 1,2,3, Moeen Riaz 1, Andrew Bakshi 1, Galina Polekhina 1, Le T P Thao 1, Mark R Nelson 1,4, Robyn L Woods 1, Gad Abraham 5, Michael Inouye 5,6, Christopher M Reid 1,7, Andrew M Tonkin 1, Jeff D Williamson 8, Geoffrey Donnan 9, Amy Brodtmann 9,10, Geoffrey C Cloud 11,12, John J McNeil 1, Paul Lacaze 1
PMCID: PMC8384668  NIHMSID: NIHMS1698807  PMID: 34039031

Abstract

Background and Purpose:

Polygenic risk scores (PRS) can be used to predict ischemic stroke (IS). However, further validation of PRS performance is required in independent populations, particularly older adults, in whom the majority of strokes occur.

Methods:

We predicted risk of incident IS events in a population of 12,792 healthy older individuals enrolled in the ASPREE trial. The PRS was calculated using 3.6 million genetic variants. Participants had no previous history of cardiovascular events, dementia, or persistent physical disability at enrolment. The primary outcome was IS over 5 years, with stroke sub-types as secondary outcomes. A multivariable model including conventional risk factors was applied and re-evaluated after adding PRS. Area under the curve (AUC) and net reclassification were evaluated.

Results:

At baseline, mean population age was 75 years. In total, 173 incident IS events occurred over a median follow-up of 4.7 years. When PRS was added to the multivariable model as a continuous variable, it was independently associated with IS (hazard ratio [HR] 1.41 [95% confidence interval (CI) 1.20–1.65] per standard deviation of the PRS, p<0.001). The PRS alone was a better discriminator for IS events than most conventional risk factors. PRS as a categorical variable was a significant predictor in the highest tertile (HR 1.74, p=0.004) compared with the lowest. The AUC of the conventional model was 66.6% (95%CI 62.2–71.1), and after inclusion of the PRS, improved to 68.5% (95%CI 64.0–73.0) (p=0.095). In sub-group analysis, the continuous PRS remained an independent predictor for large vessel and cardioembolic stroke subtypes, but not for small vessel stroke. Reclassification was improved, as the continuous net reclassification index after adding PRS to the conventional model was 0.25 (95%CI 0.17–0.43).

Conclusion:

PRS predicts incident IS in a healthy older population, but only moderately improves prediction over conventional risk factors.

Registration:

http://www.clinicaltrials.gov. Unique identifier: NCT01038583.

Keywords: Polygenic risk score, PRS, stroke, ischemic, elderly, risk prediction

Introduction

Stroke is the second most common cause of death and leading cause of neurological disability in adults, with incidence increasing in high- and middle-income countries.(1–3) The majority of stroke cases are ischemic strokes (IS), which differ from hemorrhagic strokes in terms of underlying pathophysiology and risk factors.(3) Established clinical risk factors for IS overlap with those for coronary heart disease and include age, dyslipidemia, elevated blood pressure, diabetes, and smoking.(2, 4–7) In addition to lifestyle factors, genetic factors have been found to be independently associated with IS.(8)

A number of genetic loci have been associated with increased risk of IS from genome-wide association studies (GWAS).(9–12) The contribution of multiple individual loci can be combined into a polygenic risk score (PRS), which can then be evaluated as a risk factor for IS, and used to predict incident IS events in independent populations. Predictive performance may be further improved with a meta-scoring approach (termed meta-genomic risk score, or ‘metaGRS’). This approach was recently used to derive a novel IS PRS based on 3.6 million genetic variants.(13) The metaGRS leverages combined analysis of multiple stroke outcomes and stroke related phenotypes across multiple stroke GWAS cohorts and a UK Biobank derivation set (N=11,995), and incorporates the long tail of many millions of genetic variants that lie beneath the genome-wide significance threshold.(14, 15) This score was found to be an independent predictor for IS in a UK Biobank validation set (N=395,393), with a hazard ratio of 1.25 per standard deviation change in PRS distribution (double that of a previous stroke PRS).(8) The discriminative capability of the metaGRS for IS was found to be similar to other conventional IS risk factors, raising the possibility of genomic risk prediction enabling more personalized prevention and/or individualized treatment for IS.

However, independent validation of the stroke metaGRS in other populations poses challenges. The UK Biobank includes a proportionately limited number of older individuals, reducing the ability to model stroke risk in the age range where the majority of events occur. In addition, most existing large genetic studies of IS have already been used to derive the score, meaning these studies (e.g., all studies in the MEGASTROKE consortium) cannot serve as external datasets for independent validation.(10) The ASPirin in Reducing Events in the Elderly (ASPREE) trial population represents a suitable external dataset for evaluation of the metaGRS. ASPREE recruited a large sample of well-characterized healthy older individuals in an aspirin primary prevention trial, where IS events were adjudicated as trial endpoints. We performed an independent validation of the stroke metaGRS to evaluate its performance in a cohort of healthy elderly individuals without a history of stroke or coronary heart disease events, with an average age of 75 years at study recruitment.

Methods

Study population

The study included genotyped participants from the ASPREE trial. The design and results of the trial have been reported previously.(16–18) Briefly, ASPREE was a randomized double-blind placebo-controlled clinical trial investigating the effect of daily 100mg aspirin on disability-free survival over a median follow-up of 4.7 years (interquartile range 3.6 to 5.7 years). In total, 19,114 individuals aged ≥70 years (≥65 years for US minorities) were recruited. Participants were only included when they did not have prior cardiovascular events (including previous diagnosis of myocardial infarction, heart failure, angina pectoris, stroke, diagnosis of atrial fibrillation, or systolic blood pressure ≥180mmHg) and were free from dementia or physical disability at enrolment. All participants provided written informed consent. The study was approved by local Ethics Committees and registered on Clinicaltrials.gov (NCT01038583). The data that support the findings of this study are available from the corresponding author upon reasonable request.

Ischemic stroke endpoint

Incident IS was among the prespecified secondary endpoints in the ASPREE trial. IS was defined according to the World Health Organization definition and included imaging by computed tomography or magnetic resonance imaging in the majority of cases.(19) All cases of IS were further divided into subtypes of large vessel, small vessel, cardioembolic, and undetermined.(20) Undetermined strokes had undetermined causes, multiple causes identified, or an incomplete evaluation made. Hemorrhagic strokes were not considered for the present analyses. All stroke events were assessed by an Adjudication Committee, blinded to the identity of participants and study treatment group assignment, as described previously.(17)

Conventional risk model

Selection of risk factors for the conventional risk model was based on prior evidence, IS clinical guidelines and availability in ASPREE.(2, 4, 6) The following variables were included in the model: age, sex, smoking (current or former versus never), systolic blood pressure, non-high-density lipoprotein (HDL)-cholesterol, HDL-cholesterol, body-mass-index (BMI), alcohol consumption (current versus former or never consumption), family history of stroke (defined by an event occurring before the age of 50 years in a first degree relative), diabetes, and randomization to aspirin. Other risk factors, such as atrial fibrillation and heart failure, were not considered because the entry criteria for the ASPREE trial excluded participants with these conditions.

Genotyping and calculation of the polygenic risk score

DNA samples provided by 14,052 ASPREE biobank participants were genotyped using the Axiom 2.0 Precision Medicine Diversity Research Array (Thermo Fisher Scientific, CA, USA). Variant calling used a custom pipeline aligned to human reference genome GRCh38. For the present analyses only participants with non-Finnish European descent, age of at least 70 years at enrollment, and self-reported European ancestry were considered, resulting in a total of 12,792 participants (Online Figure I). Genetic ancestry was defined using principal component analysis (PCA) with the 1,000 Genomes reference population, with participants that did not overlap with the Non-Finnish European 1,000 Genomes cluster excluded (Supplementary material, Online Figure II).(21) Imputation was performed using the Haplotype Reference Consortium panel, European samples (University of Michigan imputation server).(22) Post-imputation QC removed variants with low imputation quality scores (r2<0.3). The PRS was calculated in ASPREE using 3,225,583 variants from IS metaGRS downloaded from PGScatalog.(13) A total of 3,219,276 variants were present in ASPREE (6,307 variants were removed due to variant ID mismatch). We used plink version 1.9 to calculate the PRS for each individual as the weighted sum of the effect size for the number of risk alleles at each variant.(23)
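The weighted-sum calculation described above can be illustrated with a minimal Python sketch. It is not the plink pipeline used in the study: the function name, the variant identifiers, and the toy weights are hypothetical, and it assumes that each participant's genotype is available as an effect-allele dosage (0 to 2) keyed by variant ID. Variants in the weight file that have no match in the genotype data are skipped, loosely mirroring the removal of variants with an ID mismatch.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, float],
                         weights: Mapping[str, float]) -> float:
    """Sum of (effect size x effect-allele dosage) over matched variants.

    dosages: effect-allele dosage (0..2) per variant ID for one participant.
    weights: per-variant effect size from a score file (hypothetical here).
    """
    score = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            # Variant in the score file but absent from the genotype data:
            # skipped, analogous to dropping variants with an ID mismatch.
            continue
        score += weight * dosage
    return score


# Hypothetical toy data, for illustration only.
toy_weights = {"var_a": 0.10, "var_b": -0.05, "var_c": 0.20}
toy_dosages = {"var_a": 2, "var_b": 1, "var_d": 2}  # var_c is unmatched

toy_score = polygenic_risk_score(toy_dosages, toy_weights)
print(f"toy PRS = {toy_score:.3f}")
```

The sketch returns a plain sum; it does not reproduce plink's own scoring options or its handling of missing genotypes.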

Statistical analyses

Continuous variables were reported by mean value with standard deviation (SD) and binary variables were reported by absolute and relative numbers. Spearman correlation coefficients for continuous variables were used, and visualized in a correlation matrix using the R package “corrplot”. The association of conventional risk factors with incident IS within 5 years was assessed by a multivariable Cox proportional hazards regression model. Participants who died for reasons other than IS were censored at the time of death. Continuous variables were used as linear predictors. The association was re-assessed by adding the PRS as a continuous variable (per one SD change) and as a categorical variable (divided into tertiles, using the lowest tertile as reference group). Tertiles were chosen based on the overall cohort size and IS event numbers per group, to facilitate robust statistical analysis between groups. Multivariable regression analyses were repeated for each subtype of IS (large vessel, small vessel, cardioembolic). Additionally, sensitivity analyses were performed after adding the genetic ancestry principal components (PCs), intake of antihypertensive drugs, intake of statins, and socioeconomic status (measured by Index of Relative Socio-economic Advantage and Disadvantage [IRSAD] deciles) to the multivariable model (Supplementary results). We also performed univariable Cox regression analysis to identify significant predictors, and then used only these significant predictors in an additional multivariable Cox regression model, as a sensitivity analysis. Kaplan-Meier estimates for the incidence of IS events within 5 years, stratified by PRS tertiles, were calculated using the “survival” package.
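The two ways of entering the PRS into the model (per one SD change, and as tertiles with the lowest tertile as reference) can be sketched as follows. This is an illustration in Python rather than the R code used for the analyses; the function names and the toy PRS values are hypothetical, and the tertile cut points come from the default method of `statistics.quantiles`, which may differ slightly from the cut points an R workflow would produce.

```python
import statistics
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Rescale values to mean 0 and sample SD 1, so that a hazard ratio
    estimated on the result is expressed per one SD of the PRS."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def tertile_labels(values: Sequence[float]) -> List[int]:
    """Assign 1 (lowest), 2, or 3 (highest) using two cut points."""
    lower_cut, upper_cut = statistics.quantiles(values, n=3)
    labels = []
    for v in values:
        if v <= lower_cut:
            labels.append(1)  # reference group
        elif v <= upper_cut:
            labels.append(2)
        else:
            labels.append(3)
    return labels


# Hypothetical PRS values, for illustration only.
toy_prs = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3]
toy_z = standardize(toy_prs)
toy_tertiles = tertile_labels(toy_prs)
print(toy_tertiles)
```

In a regression, the standardized values would be used as a single continuous covariate, while the tertile labels would be expanded into indicator variables for tertiles 2 and 3.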

The area under the curve (AUC) at 5 years, as well as the 95% confidence intervals (CI), were calculated for each single predictor, for the conventional risk model and for the conventional risk model plus the PRS, based on time-dependent receiver-operating-characteristics using the R package “timeROC”.(24) Finally, reclassification analyses were performed to evaluate the impact on risk categories when adding the PRS to conventional risk factors. The R package “nricens” was used to calculate time-to-event continuous and categorical net reclassification improvement (NRI). Risk categories for the categorical reclassification analyses were chosen based on the observed risk in ASPREE and were set to <1.5%, 1.5 to 2.49% and ≥2.5% within five years. Analyses were performed using R version 4.0.2.(25)
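The logic of the continuous and categorical NRI can be sketched in a few lines of Python. This is a simplified illustration, not the “nricens” implementation: it treats the outcome as a binary event indicator and does not account for censored follow-up, which the time-to-event analysis in the study does. The function names and toy risks are hypothetical; the category thresholds are the ones stated above.

```python
from typing import Dict, Sequence


def risk_category(risk: float) -> int:
    """Map a 5-year risk to 0 (<1.5%), 1 (1.5% to <2.5%), or 2 (>=2.5%)."""
    if risk < 0.015:
        return 0
    if risk < 0.025:
        return 1
    return 2


def net_reclassification(old_risk: Sequence[float],
                         new_risk: Sequence[float],
                         event: Sequence[int],
                         categorical: bool = False) -> Dict[str, float]:
    """NRI+ = P(up|case) - P(down|case); NRI- = P(down|ctrl) - P(up|ctrl)."""
    counts = {"up_case": 0, "down_case": 0, "up_ctrl": 0, "down_ctrl": 0}
    n_case = n_ctrl = 0
    for r_old, r_new, is_case in zip(old_risk, new_risk, event):
        if categorical:
            before, after = risk_category(r_old), risk_category(r_new)
        else:
            before, after = r_old, r_new
        group = "case" if is_case else "ctrl"
        if is_case:
            n_case += 1
        else:
            n_ctrl += 1
        if after > before:
            counts["up_" + group] += 1
        elif after < before:
            counts["down_" + group] += 1
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")
    nri_plus = (counts["up_case"] - counts["down_case"]) / n_case
    nri_minus = (counts["down_ctrl"] - counts["up_ctrl"]) / n_ctrl
    return {"nri_plus": nri_plus, "nri_minus": nri_minus,
            "nri": nri_plus + nri_minus}


# Hypothetical predicted risks without (old) and with (new) the PRS.
toy_old = [0.010, 0.020, 0.030, 0.012, 0.020, 0.030]
toy_new = [0.020, 0.030, 0.020, 0.010, 0.014, 0.031]
toy_event = [1, 1, 1, 0, 0, 0]

toy_continuous = net_reclassification(toy_old, toy_new, toy_event)
toy_categorical = net_reclassification(toy_old, toy_new, toy_event,
                                       categorical=True)
print(toy_continuous, toy_categorical)
```

A positive NRI+ means cases tend to be moved to higher predicted risk by the new model, and a positive NRI- means non-cases tend to be moved to lower predicted risk.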

Results

Baseline characteristics

The mean age of the genotyped study population of 12,792 participants was 75.1 years (SD 4.2) and 54.9% were female (Table 1). Mean systolic blood pressure was 139.5 mmHg (SD 16.3), 44.2% were current or former smokers and 9.3% were diagnosed with diabetes at baseline. The mean BMI was 28 kg/m2, 79.7% were current alcohol consumers and 1.2% had a family history of stroke. Comparing the genotyped with the non-genotyped ASPREE population, only minor differences were observed in baseline characteristics (Online Table I).

Table 1:

Baseline characteristics of the investigated study population

Characteristic Numbers
Number of participants 12,792
Age (mean (SD)) 75.06 (4.22)
Female gender (%) 7,027/12,792 (54.9)
Current or former smoker (%) 5,659/12,792 (44.2)
Systolic blood pressure mmHg (mean (SD)) 139.5 (16.3)
Diastolic blood pressure mmHg (mean (SD)) 77.2 (10.0)
Diabetes (%) 1,186/12,792 (9.3)
Randomized to Aspirin (%) 6,370/12,792 (49.8)
Body-mass-index kg/m2 (mean (SD)) 27.97 (4.55)
Current alcohol consumption (%) 10,198/12,792 (79.7)
Hemoglobin g/dL (mean (SD)) 14.23 (1.19)
HDL-c in mmol/L (mean (SD)) 1.59 (0.46)
Non-HDL-c in mmol/L (mean (SD)) 3.69 (0.93)
Creatinine in mg/dL (mean (SD)) 0.90 (0.22)
Family history of stroke (%) 139/11,724 (1.2)
Intake of antihypertensive drugs (%) 6,504/12,792 (50.8)
Polygenic Risk Score (mean (SD)) 1.84 (0.23)

Missing values for continuous variables were: 341 for creatinine, 331 for non-HDL-c, 330 for HDL-c, and 56 for body-mass-index. Abbreviations: SD = standard deviation, HDL-c = high density lipoprotein cholesterol, MI = myocardial infarction.

During the median follow-up time of 4.6 years, 173 IS events were observed. The PRS showed a normal distribution within the genotyped study population (Online Figure III) and showed only weak correlation with other continuous variables (Online Figure IV). When stratified according to tertiles of PRS, differences in baseline characteristics (p<0.001) included slightly higher BMI, diabetes and intake of antihypertensive drugs in the highest PRS tertile (Online Table II).

Risk prediction based on the conventional risk model

In the conventional risk model, significant predictors of incident IS included age (HR 1.08, p-value <0.001), systolic blood pressure (HR 1.16, p-value 0.003), non-HDL-cholesterol (HR 1.24, p-value 0.01), and current alcohol consumption (HR 0.66, p-value 0.025) (Table 2). The AUC of each predictor from the conventional model ranged from 50.1% (95%CI 49.4–50.8%) for family history of stroke to 58.9% (95%CI 53.9–64.0%) for age. The AUC of the conventional model was 66.6% (95%CI 62.2–71.1%) (Figure 1 and Online Table III).

Table 2:

Multivariable Cox Regression model for prediction of incident ischemic stroke

Variable | Conventional model: HR (95%CI) | p-value | Conventional model + continuous PRS: HR (95%CI) | p-value | Conventional model + categorical PRS: HR (95%CI) | p-value | Conventional model + categorical PRS comparing tertile 1 + 2 versus tertile 3: HR (95%CI) | p-value
Age | 1.08 (1.05; 1.12) | <0.001 | 1.09 (1.05; 1.12) | <0.001 | 1.08 (1.05; 1.12) | <0.001 | 1.08 (1.05; 1.12) | <0.001
Female gender | 0.90 (0.63; 1.28) | 0.55 | 0.88 (0.62; 1.26) | 0.49 | 0.89 (0.62; 1.27) | 0.51 | 0.89 (0.62; 1.27) | 0.51
Current or former smoker | 1.29 (0.93; 1.79) | 0.13 | 1.26 (0.91; 1.75) | 0.16 | 1.28 (0.92; 1.77) | 0.15 | 1.28 (0.92; 1.77) | 0.14
Systolic blood pressure per 10mmHg increase | 1.16 (1.05; 1.28) | 0.003 | 1.15 (1.04; 1.26) | 0.004 | 1.15 (1.05; 1.27) | 0.004 | 1.15 (1.05; 1.27) | 0.004
Non-HDL-c | 1.24 (1.05; 1.47) | 0.010 | 1.24 (1.05; 1.46) | 0.010 | 1.24 (1.05; 1.47) | 0.011 | 1.24 (1.05; 1.47) | 0.011
HDL-c | 0.66 (0.42; 1.02) | 0.06 | 0.63 (0.41; 0.98) | 0.041 | 0.64 (0.42; 1.00) | 0.049 | 0.64 (0.42; 1.00) | 0.049
Body-mass-index | 0.97 (0.93; 1.01) | 0.15 | 0.97 (0.93; 1.00) | 0.08 | 0.97 (0.93; 1.01) | 0.10 | 0.97 (0.93; 1.01) | 0.10
Current alcohol consumption | 0.66 (0.46; 0.95) | 0.025 | 0.66 (0.46; 0.95) | 0.025 | 0.66 (0.46; 0.95) | 0.023 | 0.66 (0.46; 0.95) | 0.023
Family history of stroke | 0.64 (0.09; 4.57) | 0.66 | 0.63 (0.09; 4.49) | 0.64 | 0.63 (0.09; 4.53) | 0.65 | 0.63 (0.09; 4.52) | 0.65
Diabetes | 1.60 (0.98; 2.63) | 0.06 | 1.55 (0.95; 2.55) | 0.08 | 1.56 (0.95; 2.56) | 0.08 | 1.56 (0.95; 2.56) | 0.08
Randomized to Aspirin | 0.75 (0.55; 1.02) | 0.07 | 0.75 (0.55; 1.03) | 0.07 | 0.75 (0.55; 1.03) | 0.07 | 0.75 (0.55; 1.03) | 0.07
PRS (continuous) | | | 1.41 (1.20; 1.65) | <0.001 | | | |
PRS tertile 1 | | | | | Reference | | |
PRS tertile 2 | | | | | 1.06 (0.70; 1.61) | 0.79 | |
PRS tertile 3 | | | | | 1.74 (1.19; 2.56) | 0.004 | |
PRS tertile 1 + 2 | | | | | | | Reference |
PRS tertile 3 | | | | | | | 1.69 (1.24; 2.32) | 0.001

Abbreviations: HDL-c = high density lipoprotein cholesterol, PRS = polygenic risk score, HR = hazard ratio, CI = confidence interval. The multivariable model is based on n = 12,405, while 387 observations were deleted due to missing values. For the Cox model, 171 IS events were observed.

Figure 1: Area under the curve for each predictor, the conventional model and the PRS added to the conventional model.

This figure summarizes the AUC and the corresponding confidence intervals for each single predictor, the conventional risk model and the PRS added to the conventional risk model. The p-value comparing the AUC of the conventional model to the conventional model plus PRS is 0.0948. Abbreviations: HDL-c = high density lipoprotein cholesterol, PRS = polygenic risk score, AUC = area under the curve

Addition of the polygenic risk score

When added as a continuous variable to the conventional multivariable model, the PRS was an independent predictor of IS events (HR 1.41, p-value <0.001, Table 2). Evaluating the PRS as a categorical variable, the highest tertile was a predictor (HR 1.74, p-value 0.004), but not the second tertile (HR 1.06, p-value 0.79). When comparing the highest tertile with tertiles 1 and 2 combined, the HR was 1.69 (p-value 0.001). In addition, those individuals in the highest PRS tertile showed the highest IS event rate in Kaplan-Meier analyses (Online Figure V). Although the AUC of the model increased to 68.5% (95%CI 64.0–73.0%) when the PRS was included, this difference was not statistically significant (p=0.095). The calibration plot showed a good agreement between predicted and observed ischemic events, when using the conventional model including the PRS (Online Figure VI). In an additional model using only significant predictors from univariable analysis (age, systolic blood pressure, non-HDL-c, HDL-c and current alcohol consumption), the PRS remained a significant predictor (HR 1.40 per SD [1.20–1.64]) with almost identical HR to the original model (Online Tables VIII and IX).
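As a rough plausibility check on the continuous PRS estimate, the reported HR of 1.41 per SD and its 95%CI of 1.20–1.65 (Table 2) can be converted into an approximate test statistic. The sketch below assumes a Wald-type interval that is symmetric on the log-hazard scale; this is an assumption of the illustration, not something stated in the paper, and the function names are hypothetical. It also shows how a per-SD HR scales under the log-linear form of the Cox model.

```python
import math


def wald_z_and_p(hr: float, ci_low: float, ci_high: float):
    """Approximate z statistic and two-sided p-value from an HR and its
    95% CI, assuming the CI is log(HR) +/- 1.96 * SE (an assumption)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.959964)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p


def hr_for_sd_difference(hr_per_sd: float, n_sd: float) -> float:
    """HR comparing two people whose PRS differs by n_sd standard
    deviations, given a log-linear effect of the PRS."""
    return hr_per_sd ** n_sd


z_stat, p_value = wald_z_and_p(1.41, 1.20, 1.65)
hr_two_sd = hr_for_sd_difference(1.41, 2)
print(f"z = {z_stat:.2f}, p = {p_value:.1e}, HR for 2 SD = {hr_two_sd:.2f}")
```

Under these assumptions the z statistic is a little above 4, which is consistent with the reported p-value of <0.001, and a difference of two SDs in the PRS corresponds to roughly a doubling of the hazard (1.41 squared is about 1.99).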

In reclassification analyses, the continuous NRI was 0.252 (95%CI 0.175, 0.434), when the PRS was added to the conventional model (Table 3). This included more frequent downwards classification (NRI- 0.140, 95%CI 0.116, 0.186) and less frequent upwards classification (NRI+ 0.111, 95%CI 0.041, 0.260). For evaluation of the categorical NRI, the risk categories of <1.5%, 1.5 to 2.49% and ≥2.5% were chosen based on the observed risk within the ASPREE trial. The categorical NRI was 0.054 (95%CI −0.053, 0.174) and included mainly upwards classification (NRI+ 0.054, 95%CI −0.067, 0.172) (Table 3 and Online Table IV).

Table 3:

Estimated continuous and categorical net reclassification improvement for risk of ischemic stroke within 5 years.

Continuous NRI Estimate 95%CI
NRI 0.252 0.175, 0.434
NRI+ 0.111 0.041, 0.260
NRI- 0.140 0.116, 0.186
P(Up|Case) 0.556 0.521, 0.631
P(Down|Case) 0.445 0.370, 0.480
P(Down|Ctrl) 0.570 0.558, 0.593
P(Up|Ctrl) 0.430 0.407, 0.442
Categorical NRI
NRI 0.054 −0.053, 0.174
NRI+ 0.054 −0.067, 0.172
NRI- 0.001 −0.005, 0.029
P(Up|Case) 0.142 0.064, 0.259
P(Down|Case) 0.088 0.042, 0.196
P(Down|Ctrl) 0.126 0.092, 0.161
P(Up|Ctrl) 0.125 0.090, 0.132

Abbreviations: NRI = net reclassification improvement. The analyses are based on 11,385 individuals and 158 IS events.
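The rows of Table 3 are linked by simple arithmetic: NRI+ is P(Up|Case) minus P(Down|Case), NRI- is P(Down|Ctrl) minus P(Up|Ctrl), and the overall NRI is their sum. The short Python check below recomputes these quantities from the rounded probabilities in the table; the dictionary layout and function name are hypothetical, and small differences from the reported totals are expected because the inputs are rounded to three decimals.

```python
# Rounded values transcribed from Table 3.
table3 = {
    "continuous": {"up_case": 0.556, "down_case": 0.445,
                   "down_ctrl": 0.570, "up_ctrl": 0.430,
                   "reported_nri": 0.252},
    "categorical": {"up_case": 0.142, "down_case": 0.088,
                    "down_ctrl": 0.126, "up_ctrl": 0.125,
                    "reported_nri": 0.054},
}


def nri_components(p):
    """Return (NRI+, NRI-, NRI) from the four reclassification proportions."""
    nri_plus = p["up_case"] - p["down_case"]
    nri_minus = p["down_ctrl"] - p["up_ctrl"]
    return nri_plus, nri_minus, nri_plus + nri_minus


recomputed = {name: nri_components(p) for name, p in table3.items()}
for name, (plus, minus, total) in recomputed.items():
    print(f"{name}: NRI+ {plus:.3f}, NRI- {minus:.3f}, NRI {total:.3f}, "
          f"reported {table3[name]['reported_nri']:.3f}")
```

The recomputed totals (about 0.251 and 0.055) differ from the reported 0.252 and 0.054 only in the third decimal place, as expected from rounding.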

Subgroup analyses

We performed subgroup analyses according to adjudicated cause of IS, which included large vessel (N=49 events), small vessel (N=43), and cardioembolic (N=36) stroke (Table 4). The continuous PRS when added to the conventional model remained a significant predictor only for large vessel and cardioembolic stroke (HR 1.43, p-value 0.021 and HR 1.74, p-value 0.001, respectively), not for small vessel stroke (HR 1.18, p-value 0.3).

Table 4:

Multivariable Cox Regression model for prediction of incident ischemic stroke according to stroke subtypes

Large Vessel stroke Small vessel stroke Cardioembolic stroke
HR 95%CI p-value HR 95%CI p-value HR 95%CI p-value
Age 1.06 (0.99; 1.13) 0.08 1.11 (1.05; 1.18) <0.001 1.12 (1.05; 1.20) <0.001
Female sex 0.72 (0.36; 1.46) 0.37 0.90 (0.44; 1.83) 0.77 0.69 (0.32; 1.48) 0.34
Current or former smoker 1.76 (0.92; 3.35) 0.09 1.35 (0.70; 2.59) 0.37 1.23 (0.61; 2.48) 0.56
Systolic blood pressure per 10mmHg increase 1.12 (0.93; 1.35) 0.22 1.01 (0.84; 1.23) 0.88 1.25 (1.02; 1.53) 0.030
Non-HDL-c 1.36 (1.00; 1.85) 0.05 1.58 (1.17; 2.15) 0.003 1.04 (0.72; 1.49) 0.85
HDL-c 0.92 (0.41; 2.09) 0.85 0.40 (0.16; 1.04) 0.06 0.82 (0.33; 2.01) 0.66
Body-mass-index 0.96 (0.89; 1.04) 0.32 0.93 (0.85; 1.01) 0.09 1.01 (0.93; 1.09) 0.84
Current alcohol consumption 0.51 (0.26; 1.01) 0.05 1.12 (0.50; 2.49) 0.78 0.55 (0.26; 1.17) 0.12
Diabetes 2.36 (0.99; 5.59) 0.05 3.70 (1.65; 8.26) 0.001 0.52 (0.12; 2.25) 0.38
Randomized to Aspirin 0.93 (0.51; 1.71) 0.83 0.39 (0.20; 0.79) 0.008 0.53 (0.26; 1.07) 0.08
PRS (continuous) 1.43 (1.05; 1.94) 0.021 1.18 (0.86; 1.62) 0.30 1.74 (1.24; 2.43) 0.001

Abbreviations: HDL-c = high density lipoprotein cholesterol, PRS = polygenic risk score, HR = hazard ratio, CI = confidence interval.

In sensitivity analysis, which also included information on genetic ancestry in the risk model, only the first PC was a predictor for IS, but this did not affect the status of the PRS as an independent predictor (Online Table V). When adding the information on intake of antihypertensive drugs and intake of statins to the conventional model, the PRS remained a significant independent predictor (Online Table VI). Here, only intake of statins (HR 0.64 [95%CI 0.42–0.98]), but not intake of antihypertensive drugs, was an independent predictor of IS events.

In further sensitivity analyses, adding information on the socioeconomic status indicated by IRSAD deciles, none of the deciles was a significant predictor for IS events, while the PRS remained a significant predictor (Online Table VII). Next, when only using significant predictors identified from univariable analysis in the model (age, systolic blood pressure, non-HDL-c, HDL-c, and current alcohol consumption), the PRS remained a significant independent predictor (Online Tables VIII and IX). The HRs produced by each of the different models are summarized in Online Figure VII and showed negligible differences. Finally, we investigated the interaction of treatment with aspirin and the PRS as a continuous variable and found no significant interaction (Online Table X).

Discussion

In this study, we evaluated the prognostic value of a previously derived PRS (metaGRS) to predict future IS events in a population of healthy older individuals from the ASPREE trial. This represents, to our knowledge, the first independent external validation of the stroke metaGRS in a population of older individuals. We found that the PRS, considered as a continuous or categorical variable, was a predictor of IS independent of conventional risk factors. We also found that the metaGRS alone was a better discriminator for IS events than most conventional risk factors, including gender, cholesterol and family history. However, adding the PRS to the conventional risk model only modestly improved prediction, with small effects on reclassification but no statistically significant AUC increase (Figure 2). This suggests the PRS only moderately improves prediction of IS over conventional risk factors, though benefits may extend to risk stratification and classifying individuals close to treatment threshold recommendations.

Figure 2: Central figure.

This figure summarizes the study population, research question, main study findings and conclusion.

Several findings of our study warrant further discussion. Firstly, the older study population investigated represents a distinct high-risk population where the majority of atherothrombotic cardiovascular and stroke events are reported to occur. Earlier studies have indicated that cardiovascular risk factors in the elderly differ from the general population.(26, 27) However, studies of older individuals who have reached the age of at least 70 years without prior cardiovascular event are rare. ASPREE therefore provides an important insight into this specific sub-group. Secondly, our study validates a previously-derived PRS in an independent population, confirming it acts as a significant independent predictor of future IS events, even in a highly selected population of healthy older individuals.(13) This speaks to the robustness of the stroke metaGRS when used across different populations, albeit of European ancestry. Thirdly, and perhaps most importantly, consistent with analysis of the UK Biobank, we found that the metaGRS alone was a better discriminator for IS events than most conventional stroke risk factors, including sex, cholesterol, and family history. Only the AUCs of systolic blood pressure and age were higher than the AUC of the PRS alone. In regression analyses, the PRS remained an independent predictor after adjusting for a wide range of established risk factors.(2, 12) When including only significant predictors from univariable analysis in the model (age, systolic blood pressure, non-HDL-c, HDL-c, and current alcohol consumption), the PRS remained a significant predictor. Finally, we did not find evidence that the effect of aspirin treatment on preventing IS events was modified by PRS. Our results, when taken together, suggest that use of a stroke PRS for prediction of IS events clinically has future potential. However, the appropriate clinical context and timeline for implementation remain unclear.

At baseline, we observed higher BMI, and more frequent diabetes and intake of antihypertensive drugs, in the high-risk PRS tertile versus the low (p<0.001, Online Table II). This likely reflects the PRS capturing underlying genetic variation related to some conventional IS risk factors at baseline, which are also influenced by the environment and lifestyle. IS-associated genetic variants used in the PRS include variants that reflect biological processes that underpin IS risk (e.g. BMI, diabetes, hypertension). Yet importantly, when the PRS was used to predict future IS events during follow-up, it was found to be a significant independent predictor when added to conventional IS risk factors. This reflects the unique property of a PRS, in capturing a diverse spectrum of biological signal, and aggregating that signal into a single risk measure. The ability to capture diverse biological heterogeneity contributing to disease risk, as well as the ability to predict risk from birth, gives the PRS a unique advantage over conventional IS risk factors.

In analyses according to IS subtype, we found that the PRS remained a predictor for large vessel and cardioembolic stroke, but not for small vessel stroke. This result suggests differential performance of the PRS (specificity) for certain IS sub-types over others. Possible explanations for this result may be related to the way stroke cases were ascertained in GWAS, with ascertainment potentially biased towards large vessel and cardioembolic stroke. Alternatively, it may reflect underlying biological differences between IS subtypes, which may have variable heritability and genetic etiology. In addition to the variability in predictive performance of the PRS observed between stroke sub-types, we also found that conventional risk factors for each subtype showed distinct differences demonstrating phenotypic heterogeneity. For example, diabetes was an important predictor for large and small vessel stroke but not for cardioembolic strokes. While intriguing, our IS sub-type finding should not be over-interpreted, predominantly because of the small numbers of events in each sub-type group.

Stroke is known to be a heterogeneous disease, with the likelihood of distinct biological, genetic and phenotypic differences between sub-types that are yet to be fully appreciated and drawn out from GWAS. Recent studies have suggested that underlying genetic differences exist between certain IS sub-types(10, 28), and that hypertension has a causative role in all major stroke subtypes except lobar intracerebral hemorrhage.(29) Larger GWASs with more phenotypic granularity are required to help to resolve the heterogeneity of the stroke phenotype, and allow more distinct subtype analyses. This could lead to an enhanced understanding of the genetic risk for IS and may help to identify optimal target populations, and more specific PRSs for each sub-type. However, we acknowledge that the difference in PRS performance observed between stroke sub-types in the present study may have been caused by limited event numbers in each sub-group. Further studies are required to validate our results in larger datasets.

With regard to clinical applicability, we found the stroke metaGRS was a significant predictor of IS events in our analyses. However, the improvement in prediction when the PRS was added as a continuous variable to a conventional model was modest. This suggests limited immediate clinical utility of the score, above conventional stroke risk factors. However, it is likely that the predictive performance of stroke genetic risk scores will improve in the future, after larger and more phenotypically granular GWAS are performed, and effect sizes for each individual variant are refined. These improvements may add further incremental benefits in prediction and risk stratification, to strengthen the case for clinical usage of the PRS, particularly for informing decisions when risk is close to a treatment threshold. Yet in addition to the predictive performance of the score alone, the appropriate clinical context for implementation remains to be examined, with further studies needed. For example, should a stroke PRS be considered for use in primary prevention to identify those most at risk to intervene earlier in life, or will it be more appropriate to use in more focused contexts, such as stratification of specific high-risk sub-groups?

Several strengths and limitations of our study must be considered. Strengths include a well-characterized, understudied, at-risk population with incident IS clinically adjudicated as part of a randomized trial. No other large clinical trial has recruited this number of healthy elderly individuals without a prior history of IS events, with genotyping data and stroke sub-typing available. All ASPREE participants also received medical assessments by general practitioners at enrolment, to confirm eligibility for the trial, and to rule out previous diagnoses of stroke and other cardiovascular events. This provided confidence that participants were cardiovascular event-free at enrolment, to examine the value of PRS in the context of primary prevention of incident IS events in the elderly. Established, conventional risk factors for IS were also available in ASPREE, to examine alongside polygenic risk. Furthermore, adjudication of IS events in ASPREE included sub-categories, such as large vessel or cardioembolic stroke, which allow unique analyses, although the relatively small number of events by subtype is noted. Our study was limited by a relatively short follow-up period of 4.6 years and a limited number of IS events. This restricted our ability to perform stratified analyses and investigate the question of whether individuals with a higher PRS would benefit from aspirin treatment for primary prevention of IS. Larger studies are needed to investigate this question. The ASPREE population was selected for the purposes of a clinical trial of aspirin, where a healthy-volunteer bias may occur. This might result in underrepresentation of conventional risk factors in this population, as compared to the general population. Finally, the cohort is also of predominantly European descent, limiting the generalizability of our findings to other ethnicities.

In conclusion, we present the first external validation of a previously derived PRS for prediction of IS. Our study demonstrates that the PRS is a significant predictor of incident IS events in a healthy older population. However, the improvement in prediction after adding the PRS to conventional risk factors was modest and incremental. Further studies are required to assess whether genetic risk scores for IS will have genuine clinical utility and, if so, to determine the appropriate clinical context for their use.

Supplementary Material

Supplemental Publication Material

Acknowledgements:

We thank the ASPREE trial staff, participants, and general practitioners, and the Ramaciotti Centre for Genomics.

Funding Sources:

Supported by the National Institute on Aging and the National Cancer Institute at the NIH (grant number U01AG029824); the National Health and Medical Research Council of Australia (grant numbers 334047, 1127060); Monash University and the Victorian Cancer Agency. Genotyping was supported by Bioplatforms Australia, National Framework Initiative (2018-2020). J.N. is the recipient of a fellowship from the Deutsche Forschungsgemeinschaft (NE 2165/1-1). C.M.R. is supported through a National Health and Medical Research Council Principal Research Fellowship (APP 1136372). P.L. is supported by a National Heart Foundation Future Leader Fellowship (102604).

Disclosures:

Dr Nelson reports personal fees from Bayer Healthcare outside the submitted work. Dr Abraham reports speaker honoraria from Amgen One Cardiovascular Academy outside the submitted work. Dr Tonkin reports other from Bayer, personal fees from Amgen, personal fees from Merck, and personal fees from Pfizer outside the submitted work. Dr Brodtmann reports grants from NHMRC, grants from Heart Foundation, and personal fees from Biogen outside the submitted work. Dr McNeil reports grants from National Institute on Aging, US and grants from National Health & Medical Research Council, Australia during the conduct of the study. The other authors do not report any disclosures.

Non-standard Abbreviations and Acronyms:

IS: ischemic stroke

PRS: polygenic risk score

ASPREE: ASPirin in Reducing Events in the Elderly

IRSAD: Index of Relative Socio-economic Advantage and Disadvantage

Footnotes

Supplemental Materials

Expanded Materials & Methods

Online Tables I - X

Online Figures I - VII

