Supplemental Digital Content is available in the text.
Keywords: coronary artery disease, genetic variation, hypercholesterolemia, primary prevention, risk
Abstract
Background:
Coronary artery disease (CAD) represents one of the leading causes of morbidity and mortality worldwide. Given the healthcare risks and societal impacts associated with CAD, its clinical management would benefit from improved prevention and prediction tools. Polygenic risk scores (PRS) based on an individual’s genome sequence are emerging as potentially powerful biomarkers to predict the risk of developing CAD. Two recently derived genome-wide PRS have shown high specificity and sensitivity to identify CAD cases in European-ancestry participants from the UK Biobank. However, validation of the PRS predictive power and transferability in other populations is now required to support their clinical utility.
Methods:
We calculated both PRS (GPSCAD and metaGRSCAD) in French-Canadian individuals from 3 cohorts totaling 3639 prevalent CAD cases and 7382 controls and tested their power to predict prevalent, incident, and recurrent CAD. We also estimated the impact of the founder French-Canadian familial hypercholesterolemia deletion (LDLR delta >15 kb deletion) on CAD risk in one of these cohorts and used this estimate to calibrate the impact of the PRS.
Results:
Our results confirm the ability of both PRS to predict prevalent CAD at a level comparable to the original reports (area under the curve=0.72–0.89). Furthermore, the PRS identified about 6% to 7% of individuals at a CAD risk similar to that of carriers of the LDLR delta >15 kb mutation, consistent with previous estimates. However, the PRS did not perform as well in predicting incident or recurrent CAD (area under the curve=0.56–0.60), possibly because of confounding, as 76% of the participants were on statin treatment. This result suggests that additional work is warranted to better understand how ascertainment biases and study design impact PRS for CAD.
Conclusions:
Collectively, our results confirm that novel, genome-wide PRS are able to predict CAD in French Canadians; with further improvements, they are likely to pave the way towards more targeted strategies to predict and prevent CAD-related adverse events.
Genome-wide association studies (GWAS) have shed light on the polygenic architecture of human quantitative traits, such as height and blood pressure, as well as common diseases, such as type 2 diabetes mellitus and coronary artery disease (CAD).1–4 These studies have shown that complex human phenotypes are controlled by hundreds of genetic variants, each with a small effect size. Although each variant individually explains only a small fraction of the phenotypic variation, together they account for a relatively large fraction of the heritability.5 This observation has raised the possibility of using genetic variants distributed across the genome to calculate polygenic risk scores (PRS) and of using these scores to predict the risk of developing disease.6 The availability of large human genetic data sets, such as the UK Biobank, now allows for calibration and validation of genome-wide PRS in >100 000 individuals.
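As a minimal illustration of the idea described above, a PRS can be viewed as a weighted sum of effect-allele dosages across variants. The sketch below is not the pipeline used in this study (the full methods are in the Data Supplement); the function name `polygenic_score`, the variant identifiers, and the weights are hypothetical and chosen only for illustration.

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Sum effect-allele dosages (0 to 2) weighted by per-variant effect sizes.

    Variants that are absent from `dosages` are skipped, which mirrors the
    situation where part of a published model is missing from a data set.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical per-variant weights (for example, log odds ratios) and the
# effect-allele dosages of one individual. These are not values from the study.
weights = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.20}
dosages = {"rs_a": 2.0, "rs_b": 1.0}  # rs_c was not genotyped or imputed

score = polygenic_score(weights, dosages)
print(round(score, 4))
```

In practice, genome-wide models contain millions of variants, so the same sum is usually computed with dedicated genetics software rather than with dictionaries; the arithmetic is unchanged.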
CAD remains one of the main causes of morbidity and mortality worldwide.7 GWAS have already identified >100 loci associated with CAD, mostly in populations of European ancestry.2,8 Early prediction would benefit prevention, optimal management, and treatment strategies for CAD. Although CAD has high heritability (50%–60%),9,10 genetic testing is not readily used in the clinic, except in the context of Mendelian diseases such as familial hypercholesterolemia (FH). Two recently developed genome-wide PRS for CAD by Khera et al11 (GPSCAD) and Inouye et al12 (metaGRSCAD) suggest that genetic risk prediction for CAD is ready to be applied in the clinical setting. Khera et al11 used the LDpred algorithm to model linkage disequilibrium and variant effect sizes from a CAD GWAS in the UK Biobank to create GPSCAD, which includes >6 million genetic variants throughout the genome.13 In contrast, Inouye et al12 created a PRS termed metaGRSCAD with >1.7 million variants, themselves explaining 26% of CAD heritability, using a meta-analysis of association results from 3 large CAD GWAS.2,14,15 The conclusions from both studies were encouraging. Khera et al11 showed that GPSCAD can identify a significant portion of individuals in the general population with a polygenic CAD risk as high as that of individuals who carry mutations that cause FH. Inouye et al12 found that the CAD risk estimated with metaGRSCAD was higher than the risk conferred by any single traditional risk factor, such as smoking or hypertension.12
Although these results are promising, the introduction of CAD PRS in clinical practice is likely to encounter resistance.16–18 In particular, whether PRS are sufficiently accurate to justify, on their own, early interventions (including pharmaceutical treatments) is an important debate. For this reason, it is critical to validate PRS in additional populations (GPSCAD and metaGRSCAD were initially only tested in European-ancestry participants from the UK Biobank) and determine whether ascertainment biases and study design impact their clinical utility. Khera et al11 recently tested the utility of GPSCAD in Americans from different ethnicities (white, black, Hispanic, and Asian) and compared the predicted risk to individuals with monogenic mutations in hypercholesterolemia genes.19 Their results indicate that GPSCAD can predict CAD risk in non-white individuals, although with lower accuracy. Here, we validated these 2 novel CAD PRS in individuals of French-Canadian descent recruited from population- and hospital-based cohorts. We evaluated the performance of these polygenic predictors on prevalent, incident, and recurrent CAD. Finally, we used whole-genome sequence data to identify participants who carry a known FH mutation and compared its impact on CAD risk with that due to the inheritance of millions of weak-effect common variants.
Methods
The data and materials used to perform the study cannot be made available because of ethical considerations. All analytical methods used are readily available and reported. All participants have provided written, informed consent and the project was approved by the ethics committee of the Montreal Heart Institute (MHI). The full methods are available as part of the Data Supplement.
Results
Genome-Wide PRS for Prevalent CAD in French Canadians
Using both models (GPSCAD and metaGRSCAD), we calculated PRS in French Canadians from 3 studies: 2 hospital-based cohorts from the MHI Biobank (phase 1, n=1964 and phase 2, n=3309),20,21 and 5762 participants from CARTaGENE, a public health research platform in the Province of Quebec, Canada.22 We present demographics and baseline clinical information for all participants in Table 1. After DNA genotyping and variant imputation (Data Supplement), most variants used to calculate GPSCAD and metaGRSCAD were present in our data sets (missingness range: 0.09%–6.96%; Table I in the Data Supplement), suggesting that our data sets can accurately capture the previously proposed CAD polygenic models. Both PRS were strongly correlated with each other in the French-Canadian data sets (Pearson r>0.73, P<2.2×10−16; Figure 1). We tested the association between the CAD PRS and prevalent CAD status in all 3 cohorts. The distributions of both GPSCAD and metaGRSCAD were shifted towards higher values in CAD cases when compared with controls (Figure 2). Combining results across the 3 cohorts, we found that one SD increase in GPSCAD or metaGRSCAD was associated with increased odds of CAD of 1.61 (P=6.18×10−42) and 1.69 (P=3.28×10−49), respectively (Table 2). In terms of prediction of prevalent CAD in French Canadians, the area under the receiver operating characteristic curve for both PRS was 0.72 to 0.89, largely consistent with the original reports (Table 2).
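To make the two quantities reported above more concrete, the following sketch shows one common way to normalize scores (so that effects can be expressed per SD) and to compute the area under the receiver operating characteristic curve as the probability that a randomly chosen case has a higher score than a randomly chosen control. This is an illustration under our own assumptions, not the analysis code of the study; the function names and the toy scores are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(scores: Sequence[float]) -> List[float]:
    """Center scores on 0 and scale them to unit SD (population SD)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Rank-based AUC: P(case score > control score), ties counted as 1/2."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy normalized scores, invented for illustration only.
cases = [0.9, 0.4, 1.5]
controls = [0.1, 0.4, -0.3, 0.7]

print(auc(cases, controls))  # 0.875 for these toy values
```

The pairwise loop is quadratic in sample size and is written for clarity; for cohorts of thousands of participants, a rank-sum formulation or an established statistics library would be the more practical choice.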
Table 1.
Demographics and Clinical Information for the Participants Involved in the Study

Figure 1.

Correlation between normalized GPSCAD and metaGRSCAD. The correlation between GPSCAD and metaGRSCAD in (A) the Montreal Heart Institute (MHI) Biobank phase 1 (Pearson r=0.75, P<2×10−16), (B) the MHI Biobank phase 2 (Pearson r=0.75, P<2×10−16), and (C) CARTaGENE (Pearson r=0.74, P<2×10−16).
Figure 2.

Distributions of GPSCAD and metaGRSCAD in the Montreal Heart Institute (MHI) Biobank phase 2 cohort. Distributions of the normalized polygenic risk score from Khera et al11 (GPSCAD, left column) and Inouye et al12 (metaGRSCAD, right column) in the MHI Biobank phase 2 data for prevalent (A and B), incident (C and D), and recurrent (E and F) coronary artery disease (CAD) events.
Table 2.
Association With and Prediction of CAD by Polygenic Risk Scores in 3 Cohorts

Estimation of CAD Risk for LDLR Delta >15 kb Deletion Carriers
Approximately 60% of FH cases in the French-Canadian population of Quebec are due to the delta >15 kb deletion of the LDLR gene.23 To compare the predictive power of CAD PRS with the impact of penetrant FH mutations on CAD risk in this population, we used whole-genome sequence data available in 1964 MHI Biobank participants to call copy-number variants at the LDLR locus.20 We identified a total of 14 heterozygous carriers of the LDLR delta >15 kb deletion (breakpoints: chr19:11 188 403-11 204 295 [hg19]). The estimated allele frequency in this cohort is 0.36%, which is in the range of the reported frequency for this mutation (≈0.03%–0.38%).24,25 In our data set, the LDLR delta >15 kb deletion was associated with increased low-density lipoprotein–cholesterol levels (1.34 mmol/L increase per copy of the LDLR deletion, P=1.2×10−8). When combining baseline and follow-up data, we found that 12 out of the 14 LDLR deletion carriers were CAD cases (odds ratio [OR]=3.30; 95% CI, 0.72–15.2; P=0.13). Although this result is not statistically significant owing to our limited sample size, it allows us to estimate that French Canadians who carry a strong FH mutation are at ≈3× higher risk of developing CAD. This provides a direct opportunity to identify the proportion of individuals at similar or increased risk for CAD based on their PRS. Using the distributions of GPSCAD and metaGRSCAD, we estimate that 6% to 7% of the French-Canadian population is at the same or higher risk for CAD than carriers of the FH LDLR delta >15 kb deletion. This result is consistent with the estimate by Khera et al11 that 8% of European-ancestry individuals in the UK Biobank have a PRS that confers comparable or higher CAD risk than rare FH mutations.
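The allele frequency quoted above follows directly from the carrier count: each of the 1964 sequenced participants contributes 2 copies of the LDLR locus, and each heterozygous carrier contributes 1 deleted allele. The short sketch below reproduces this arithmetic; the function name `allele_frequency` is ours, and the assumption of no homozygous carriers reflects only the 14 heterozygous carriers mentioned in the text.

```python
def allele_frequency(het_carriers: int, n_individuals: int, hom_carriers: int = 0) -> float:
    """Allele frequency at a diploid autosomal locus from carrier counts."""
    return (het_carriers + 2 * hom_carriers) / (2 * n_individuals)


# Counts taken from the text: 14 heterozygous carriers among 1964 participants.
freq = allele_frequency(het_carriers=14, n_individuals=1964)
print(f"{100 * freq:.2f}%")  # 0.36%

# Check against the previously reported range of about 0.03% to 0.38%.
print(0.0003 <= freq <= 0.0038)  # True
```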
Prediction of Incident and Recurrent CAD
The MHI Biobank is a prospective hospital-based cohort with regularly collected follow-up clinical information. We took advantage of this design to also test the CAD PRS against incident and recurrent CAD events. Because genetic variants are present at birth, it can be argued that PRS analyses of late-onset diseases, such as CAD, are always prospective. However, analysis of clinical information collected retrospectively is subject to selection biases, which might affect the apparent accuracy of the PRS. Inouye et al12 had shown that metaGRSCAD can identify incident cases in the UK Biobank. Among the 1245 controls at baseline with follow-up available in the combined MHI Biobank cohorts, 402 had a first CAD event between recruitment and follow-up (median follow-up time =4 years [range =5 weeks to 7.2 years]). Importantly, we note that most participants in the MHI Biobank, including controls free of CAD, were taking statins at baseline and this may confound our analyses of incident CAD events. With this important caveat in mind, we tested the CAD PRS against incident CAD events on statin treatment. GPSCAD was not associated with incident CAD (OR=1.11, P=0.071), whereas the association between metaGRSCAD and incident CAD was only modest (OR=1.15, P=0.022; Table II in the Data Supplement). The prediction of incident CAD by GPSCAD and metaGRSCAD was also markedly lower than for prevalent CAD (area under the receiver operating characteristic curve=0.57–0.60; Table II in the Data Supplement). Of the 1812 CAD cases at baseline with follow-up information available, 1382 had a recurrent CAD event during the follow-up period (median follow-up time =3.9 years [range =1.1–7). We found that GPSCAD and metaGRSCAD, 2 PRS developed to predict primary CAD events, were also associated with recurrent CAD events (GPSCAD: OR=1.13; P=6.12×10−4; metaGRSCAD: OR=1.17; P=4.33×10−5), although the area under the receiver operating characteristic curve was relatively small (0.57–0.60; Table 2).
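To give a sense of scale for the per-SD odds ratios reported for prevalent, incident, and recurrent CAD, the sketch below converts each OR into the odds multiplier expected for an individual 2 SD above the mean score. It assumes a log-linear (logistic) relationship between the normalized PRS and the odds of CAD, which is our simplifying assumption for illustration; the function name `odds_multiplier` is hypothetical.

```python
def odds_multiplier(or_per_sd: float, z: float) -> float:
    """Odds relative to the mean score for someone z SD above it (log-linear model)."""
    return or_per_sd ** z


# Per-SD odds ratios as reported in the text (GPSCAD, metaGRSCAD).
reported = {
    "prevalent": (1.61, 1.69),
    "incident": (1.11, 1.15),
    "recurrent": (1.13, 1.17),
}

for outcome, (gps, meta) in reported.items():
    print(
        f"{outcome}: GPSCAD x{odds_multiplier(gps, 2):.2f}, "
        f"metaGRSCAD x{odds_multiplier(meta, 2):.2f} at +2 SD"
    )
```

Under this assumption, a score 2 SD above the mean corresponds to roughly 2.6-fold to 2.9-fold higher odds for prevalent CAD, but only about 1.2-fold to 1.4-fold higher odds for incident or recurrent CAD.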
Discussion
Because PRS are simple and relatively inexpensive, their implementation in the clinical setting holds great promise. For CAD, in particular, early detection could lead to simple yet extremely efficacious therapeutic interventions (eg, statins and aspirin). Given this exciting possibility, we tested 2 recently developed CAD PRS in French Canadians recruited from population- and hospital-based cohorts. We validated previous findings that both GPSCAD and metaGRSCAD perform well for prevalent CAD cases. However, their performance was lower for incident and recurrent CAD in the MHI Biobank. Although neither PRS predicted incident CAD events well in the MHI Biobank, these analyses might be confounded given that the majority of participants were on statin treatment at baseline. Using the French-Canadian founder FH LDLR delta >15 kb mutation to calibrate CAD risk, we confirmed that PRS can identify about 6% to 7% of the population that is at equal or higher CAD risk than carriers of an FH monogenic mutation.
Our study raises a few interesting questions. Although it is appreciated that PRS do not transfer well between ancestral populations,26,27 little is known about the transferability of PRS across populations within the same ancestry. Our results indicate that CAD PRS developed in European-ancestry individuals perform quite well in the genetically and environmentally homogeneous French-Canadian population. How well these same PRS would predict CAD in a more diverse European-ancestry population, or in a population living in a different environment, remain critical open questions for further investigation.19 Another important result from our analyses is the lower accuracy of these PRS in predicting incident or recurrent CAD cases when compared with prevalent CAD cases, highlighting the importance of the method used to create the PRS. GPSCAD and metaGRSCAD were built using mainly GWAS for prevalent CAD and are, therefore, particularly suitable to predict prevalent CAD as opposed to incident or recurrent events. In particular, our analyses of incident and recurrent CAD were based on the MHI Biobank, which is a hospital-based cohort. Thus, it is possible that confounders such as the presence of comorbidities and medications (eg, antithrombotic, statin treatment at baseline [discussed above]) would impact PRS performance. Furthermore, because we matched cases and controls based on age at baseline, participants with incident or recurrent CAD were older at the time of their CAD events than prevalent cases. If the cause of CAD at an older age is less polygenic, as suggested,12 it might not be surprising that GPSCAD and metaGRSCAD do not perform as well on incident or recurrent CAD. It is important to clarify these differences to determine what factors in the study design and what ascertainment biases influence the PRS. Furthermore, an extension of our results implies that GWAS that aim to specifically identify the genetic architecture of incident or recurrent CAD events might yield improved predictive power to calibrate risk score models over PRS based on CAD prevalence alone.12
In conclusion, while it may still take some time before PRS become widely applicable in the clinic to predict CAD, their utility is likely to increase as the community continues to improve methods and gain access to large GWAS performed in populations of different ethnic backgrounds. But the true improvement in CAD prediction based on PRS will only occur if the scientific progress is mirrored by an effort to explain the strengths and limitations of this new biomarker to the medical community and the general population.
Acknowledgments
We thank all participants and staff of the André and France Desmarais Montreal Heart Institute (MHI) Biobank. Sequencing of the MHI Biobank samples (phase 1) was performed at the McGill University and Génome Québec Innovation Centre. Genotyping of the MHI Biobank samples (phase 2) was performed at the Université de Montréal Beaulieu–Saucier Pharmacogenomics Centre at the MHI. We would also like to thank the CARTaGENE staff for their support in validating phenotypes. We also thank Aikaterini Kritikou and Rafik Tadros for comments on an earlier version of this article. Web links: Genetic risk score model for GPSCAD Khera et al Nature Genetics 2018: http://www.broadcvdi.org/informational/data. Genetic risk score model for metaGRSCAD by Inouye et al JACC 2018: https://figshare.com/articles/Coronary_Artery_Disease_CAD_MetaGRS/5748096.
Sources of Funding
This work was funded by the Canadian Institutes of Health Research (MOP no. 136979), the Heart and Stroke Foundation of Canada (Grant no. G-18-0021604), the Canada Research Chair Program, Genome Quebec and Genome Canada, and the Montreal Heart Institute Foundation. F. Wünnemann holds a postdoctoral training scholarship from the Fonds de recherche Quebec Santé (FRQS).
Disclosures
None.
Footnotes
The Data Supplement is available at https://www.ahajournals.org/doi/suppl/10.1161/CIRCGEN.119.002481.
Guest editor for this article was Christopher Semsarian, MBBS, PhD, MPH.
References
- 1. Warren HR, et al. International Consortium of Blood Pressure (ICBP) 1000G Analyses; BIOS Consortium; Lifelines Cohort Study; Understanding Society Scientific group; CHD Exome+ Consortium; ExomeBP Consortium; T2D-GENES Consortium; GoT2DGenes Consortium; Cohorts for Heart and Ageing Research in Genome Epidemiology (CHARGE) BP Exome Consortium; International Genomics of Blood Pressure (iGEN-BP) Consortium; UK Biobank CardioMetabolic Consortium BP working group. Genome-wide association analysis identifies novel blood pressure loci and offers biological insights into cardiovascular risk. Nat Genet. 2017;49:403–415. doi: 10.1038/ng.3768.
- 2. Nikpay M, et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat Genet. 2015;47:1121–1130. doi: 10.1038/ng.3396.
- 3. Scott RA, et al. DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes. 2017;66:2888–2902. doi: 10.2337/db16-1253.
- 4. Marouli E, et al. EPIC-InterAct Consortium; CHD Exome+ Consortium; ExomeBP Consortium; T2D-Genes Consortium; GoT2D Genes Consortium; Global Lipids Genetics Consortium; ReproGen Consortium; MAGIC Investigators. Rare and low-frequency coding variants alter human adult height. Nature. 2017;542:186–190. doi: 10.1038/nature21039.
- 5. Yang J, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42:565–569. doi: 10.1038/ng.608.
- 6. Chatterjee N, et al. Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies. Nat Genet. 2013;45:400–405, 405e1–405e3. doi: 10.1038/ng.2579.
- 7. Lerner DJ, Kannel WB. Patterns of coronary heart disease morbidity and mortality in the sexes: a 26-year follow-up of the Framingham population. Am Heart J. 1986;111:383–390. doi: 10.1016/0002-8703(86)90155-9.
- 8. van der Harst P, et al. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ Res. 2018;122:433–443. doi: 10.1161/CIRCRESAHA.117.312086.
- 9. Zdravkovic S, et al. Heritability of death from coronary heart disease: a 36-year follow-up of 20 966 Swedish twins. J Intern Med. 2002;252:247–254. doi: 10.1046/j.1365-2796.2002.01029.x.
- 10. Wienke A, et al. The heritability of mortality due to heart diseases: a correlated frailty model applied to Danish twins. Twin Res. 2001;4:266–274. doi: 10.1375/1369052012399.
- 11. Khera AV, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50:1219–1224. doi: 10.1038/s41588-018-0183-z.
- 12. Inouye M, et al. UK Biobank CardioMetabolic Consortium CHD Working Group. Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention. J Am Coll Cardiol. 2018;72:1883–1893. doi: 10.1016/j.jacc.2018.07.079.
- 13. Vilhjálmsson BJ, et al. Schizophrenia Working Group of the Psychiatric Genomics Consortium, Discovery, Biology; Risk of Inherited Variants in Breast Cancer (DRIVE) study. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am J Hum Genet. 2015;97:576–592. doi: 10.1016/j.ajhg.2015.09.001.
- 14. Deloukas P, et al. CARDIoGRAMplusC4D Consortium. Large-scale association analysis identifies new risk loci for coronary artery disease. Nat Genet. 2013;45:25–33. doi: 10.1038/ng.2480.
- 15. Abraham G, et al. Genomic prediction of coronary heart disease. Eur Heart J. 2016;37:3267–3278. doi: 10.1093/eurheartj/ehw450.
- 16. Torkamani A, et al. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018;19:581–590. doi: 10.1038/s41576-018-0018-x.
- 17. Rosenberg NA, et al. Interpreting polygenic scores, polygenic adaptation, and human phenotypic differences. Evol Med Public Health. 2018;2019:26–34. doi: 10.1093/emph/eoy036.
- 18. Wald NJ, Old R. The illusion of polygenic disease risk prediction [published online January 12, 2019]. Genet Med. doi: 10.1038/s41436-018-0418-5. https://www.nature.com/articles/s41436-018-0418-5.
- 19. Khera AV, et al. Whole-genome sequencing to characterize monogenic and polygenic contributions in patients hospitalized with early-onset myocardial infarction. Circulation. 2019;139:1593–1602. doi: 10.1161/CIRCULATIONAHA.118.035658.
- 20. Low-Kam C, et al. Whole-genome sequencing in French Canadians from Quebec. Hum Genet. 2016;135:1213–1221. doi: 10.1007/s00439-016-1702-6.
- 21. Low-Kam C, et al. Variants at the APOE/C1/C2/C4 locus modulate cholesterol efflux capacity independently of high-density lipoprotein cholesterol. J Am Heart Assoc. 2018;7:e009545. doi: 10.1161/JAHA.118.009545.
- 22. Awadalla P, et al. CARTaGENE Project. Cohort profile of the CARTaGENE study: Quebec’s population-based biobank for public health and personalized genomics. Int J Epidemiol. 2013;42:1285–1299. doi: 10.1093/ije/dys160.
- 23. Hobbs HH, et al. Deletion in the gene for the low-density-lipoprotein receptor in a majority of French Canadians with familial hypercholesterolemia. N Engl J Med. 1987;317:734–737. doi: 10.1056/NEJM198709173171204.
- 24. Vohl MC, et al. Geographic distribution of French-Canadian low-density lipoprotein receptor gene mutations in the Province of Quebec. Clin Genet. 1997;52:1–6. doi: 10.1111/j.1399-0004.1997.tb02506.x.
- 25. Simard LR, et al. The Delta>15 Kb deletion French Canadian founder mutation in familial hypercholesterolemia: rapid polymerase chain reaction-based diagnostic assay and prevalence in Quebec. Clin Genet. 2004;65:202–208. doi: 10.1111/j.0009-9163.2004.00223.x.
- 26. Martin AR, et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet. 2017;100:635–649. doi: 10.1016/j.ajhg.2017.03.004.
- 27. Reisberg S, et al. Comparing distributions of polygenic risk scores of type 2 diabetes and coronary heart disease within different populations. PLoS One. 2017;12:e0179238. doi: 10.1371/journal.pone.0179238.
