Nature Communications. 2022 May 25;13:2922. doi: 10.1038/s41467-022-30675-z

The potential of polygenic scores to improve cost and efficiency of clinical trials

Akl C Fahed 1,2,3, Anthony A Philippakis 4,5, Amit V Khera 1,2,3,6
PMCID: PMC9132885  PMID: 35614072

Abstract

Polygenic scores can identify individuals with high disease risk based on inborn DNA variation. We explore their potential to enrich clinical trials by identifying individuals based on higher risk of disease (‘prognostic enrichment’), or increased probability of benefit (‘predictive enrichment’).

Subject terms: Personalized medicine, Predictive markers, Prognostic markers, Randomized controlled trials


Clinical trials typically study rates of disease in participants randomized to a placebo or a given intervention, serving two primary purposes—first, to provide ‘gold standard’ evidence of efficacy and safety needed to obtain regulatory approval; and second, to demonstrate adequate benefit to convince clinicians and payers to use the drug within clinical practice. Because such trials for common diseases often require tens of thousands of participants followed for several years, the typical cost is $350 million, out of reach for all but the largest pharmaceutical companies or governmental agencies1.

One important approach to increase clinical trial efficiency is to selectively enroll participants based on clinical or molecular characteristics2. Guidance from the U.S. Food and Drug Administration outlines two distinct conceptual approaches for enrichment. The first, termed ‘prognostic enrichment,’ aims to increase statistical power (and thus decrease sample size and cost) by increasing the proportion of patients likely to demonstrate disease onset or progression. Taking COVID-19 vaccine trials as an example, Moderna and other sponsors selectively enrolled participants in areas where the virus was rapidly spreading to more quickly demonstrate benefit3. For a new cholesterol-lowering therapy designed to prevent heart attack and stroke, the pivotal trial enrolled only those with preexisting cardiovascular disease based on data that the event rates in these individuals are much higher4. The second, termed ‘predictive enrichment,’ aims to enroll participants who are more likely to derive an outsized benefit from the trial intervention. The demonstration that patients whose lung cancer contained specific gain-of-function mutations in the epidermal growth factor receptor respond to an inhibitor of this receptor’s signaling, while those without such mutations do not, inspired a new era in oncologic development where predictive enrichment using molecular profiling has substantially reduced development cost and duration2,5.

Despite the frequent use of enrichment strategies, clinical trials still often fail to achieve their aim of allowing the intervention to gain regulatory approval and adoption in clinical practice. These (costly) failures are particularly common when low event rates preclude the preferred trial design or the existing standard of care is already good, thus making the demonstration of a meaningful improvement with a new drug more challenging. For conditions such as Alzheimer’s dementia, enrollment of participants late in the disease process (which aims to increase event rates via prognostic enrichment) is often cited as a potential reason for the failures that have occurred even when the therapeutic target is believed to be pathophysiologically sound, as was the case for an antibody designed to clear amyloid plaques from the brain6,7. In cardiovascular disease, a powerful cholesterol-lowering medicine reduced the frequency of clinical events from 11.8% with placebo to 10.8%, achieving its primary endpoint with a compelling degree of statistical confidence (p = 0.004), but this effect size was deemed inadequate to justify pursuing its commercialization8.

Given these challenges in clinical trial design and execution, are genetic enrichment strategies using ‘polygenic scores’ worthwhile to consider?

The traditional approach to genetic risk stratification has focused on identifying the small subset of the population with rare monogenic mutations that substantially increase risk via disruption of a specific biologic pathway. More recently, polygenic scores (which instead consider the cumulative impact of many common DNA variants scattered across the genome) have gained traction as a promising approach with relevance for much larger subsets of the population. Initially proposed for applications in plant and animal breeding, newer generation polygenic scores have considerable predictive capacity across a range of important common diseases9–11. This stratification allows for the identification of individuals (as early as birth) whose inborn DNA variation places them on a markedly accelerated trajectory of disease onset. For coronary artery disease, we demonstrated that up to 8% of the population inherits triple the normal risk based on genetic variation alone, and these high-risk individuals cannot be reliably identified with traditional risk factors or family history10.

Post hoc analyses of clinical trials involving cholesterol-lowering therapies for cardiovascular disease have suggested that polygenic scores hold promise as a powerful enrichment strategy. Among healthy individuals randomized to statin or placebo to prevent cardiovascular disease, those with the highest polygenic score demonstrated the greatest benefit12,13. This benefit was related to both prognostic enrichment (the rate of developing heart disease in the placebo group was 19.6% for those in the top quintile of the score versus 12.9% in all others) and predictive enrichment, where a 44% relative risk reduction was noted for those with a high score versus only 24% in the remainder of the participants13.
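
The two enrichment effects quoted above compound when expressed as absolute benefit. The following is a minimal sketch of that arithmetic, assuming the absolute risk reduction can be approximated as the placebo event rate multiplied by the relative risk reduction; the function name is hypothetical and the calculation is illustrative rather than a result reported by the cited analyses.

```python
def absolute_benefit(placebo_event_rate, relative_risk_reduction):
    """Return (absolute risk reduction, number needed to treat).

    Assumes ARR = placebo event rate x RRR, a simplification that
    ignores follow-up time and confidence intervals.
    """
    arr = placebo_event_rate * relative_risk_reduction
    return arr, 1.0 / arr


# Placebo event rates and relative risk reductions quoted in the text.
high_arr, high_nnt = absolute_benefit(0.196, 0.44)  # top quintile of the score
rest_arr, rest_nnt = absolute_benefit(0.129, 0.24)  # all other participants

print(f"top quintile: ARR = {high_arr:.1%}, NNT = {high_nnt:.0f}")
print(f"all others:   ARR = {rest_arr:.1%}, NNT = {rest_nnt:.0f}")
```

Under this approximation the absolute risk reduction is close to 8.6 percentage points in the top quintile versus about 3.1 in the remainder, which is why combined prognostic and predictive enrichment is attractive for trial design.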

This observation from statin trials was later extended to two trials focused on preventing a second cardiovascular event in those with existing disease using powerful (and expensive) new injectable medications, where those with the highest polygenic score again derived the greatest benefit due to both prognostic and predictive enrichment14,15. This analysis suggests that—had it been possible to predict this enrichment in advance—the trials could have successfully demonstrated benefit with substantially fewer participants (Fig. 1). In this specific case, we estimate that a trial that enrolled only those participants in the top quintile of the polygenic score might have required only 2360 participants—a greater than 90% reduction from the 27,564 studied—and demonstrated a 31% relative risk reduction as compared to the 20% observed in the overall trial population. For a drug class that faced post-approval access challenges, initial commercialization for a subset of the population who derived greater benefit may have enhanced clinical uptake, perceived cost-effectiveness, and overall public health impact.
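
To make the link between effect size, event rate, and required enrollment concrete, here is a rough sketch of a standard two-proportion sample-size calculation (normal approximation, equal allocation, two-sided alpha of 0.05, 90% power). It is not the method behind the 2360 estimate above, which is not described in this text; the function name and the 10% enriched event rate are assumptions chosen only for illustration, while the 6.4% event rate and 15% relative risk reduction are the FOURIER design assumptions and 31% is the relative risk reduction reported for the top quintile.

```python
import math
from statistics import NormalDist


def total_sample_size(control_rate, rrr, power=0.90, alpha=0.05):
    """Approximate total participants (two equal arms) needed to detect
    a relative risk reduction `rrr` given a control-arm event rate."""
    treated_rate = control_rate * (1.0 - rrr)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    variance = (control_rate * (1.0 - control_rate)
                + treated_rate * (1.0 - treated_rate))
    per_arm = (z_alpha + z_power) ** 2 * variance / (control_rate - treated_rate) ** 2
    return 2 * math.ceil(per_arm)


# Design assumptions quoted for the original trial.
baseline = total_sample_size(control_rate=0.064, rrr=0.15)
# Predictive enrichment only: same event rate, larger effect size.
predictive_only = total_sample_size(control_rate=0.064, rrr=0.31)
# Both effects: the 10% event rate is a hypothetical placeholder.
both = total_sample_size(control_rate=0.10, rrr=0.31)

for label, n in [("baseline", baseline),
                 ("predictive only", predictive_only),
                 ("prognostic + predictive", both)]:
    print(f"{label:>24}: about {n:,} participants")
```

Because the required sample size scales with the inverse square of the absolute difference in event rates, even a modest increase in effect size or event rate shrinks the trial substantially.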

Fig. 1. Power and sample size estimation using prognostic or predictive model for polygenic score enrichment.

The FOURIER clinical trial randomized 27,564 patients with cardiovascular disease to a placebo or evolocumab, a cholesterol-lowering therapy, and followed patients for a median of 2.2 years4. This trial design was based on a power calculation that predicted an event rate of ~6.4% in the control arm and a relative risk reduction (RRR) of 15%28. We used these data to model power calculations using polygenic score enrichment under either of two models. (A) With prognostic enrichment (increasing event rates beyond the 6.4% in the original trial), polygenic score enrichment improves statistical power to detect a benefit despite a fixed effect size (relative risk reduction of 15%). (B) With predictive enrichment (increasing the effect size of the intervention beyond the 15% RRR in the original trial), polygenic score enrichment improves power with a fixed event rate in the placebo arm of 6.4%. The dashed line in both panels denotes 90% power to detect a statistical benefit, a threshold commonly used in trial design. Using polygenic scores to enrich clinical trials could markedly improve power and reduce the number of participants needed by increasing event rates (“prognostic enrichment”) and/or increasing the effect size (“predictive enrichment”).
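
The two models in the caption can be approximated with a simple power calculation. The sketch below assumes a comparison of two event proportions with equal allocation and a normal approximation; outcome trials are usually analysed with time-to-event methods, so this is a simplification and not necessarily the calculation used to draw the figure. The function name and the 10,000-participant trial size are hypothetical, chosen only to show how power moves under each model.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_arm(n_total, control_rate, rrr, alpha=0.05):
    """Approximate power of a 1:1 trial comparing two event proportions.

    Uses an unpooled standard error and ignores the negligible chance
    of a significant result in the wrong direction.
    """
    treated_rate = control_rate * (1.0 - rrr)
    n_per_arm = n_total / 2.0
    std_err = ((control_rate * (1.0 - control_rate)
                + treated_rate * (1.0 - treated_rate)) / n_per_arm) ** 0.5
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf((control_rate - treated_rate) / std_err - z_crit)


# Design assumptions from the caption: 27,564 participants, 6.4%, 15% RRR.
design_power = power_two_arm(27_564, 0.064, 0.15)
print(f"design assumptions: power = {design_power:.2f}")

N_SMALL = 10_000  # hypothetical smaller trial, for illustration only

# Prognostic model: RRR fixed at 15%, control-arm event rate increases.
for rate in (0.064, 0.10, 0.15, 0.20):
    print(f"event rate {rate:.1%}: power = {power_two_arm(N_SMALL, rate, 0.15):.2f}")

# Predictive model: event rate fixed at 6.4%, RRR increases.
for rrr in (0.15, 0.20, 0.25, 0.31):
    print(f"RRR {rrr:.0%}: power = {power_two_arm(N_SMALL, 0.064, rrr):.2f}")
```

With the design assumptions, this approximation gives power above the 90% threshold at the original sample size, and for the smaller hypothetical trial power rises along either axis, mirroring the qualitative shape of the two panels.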

We believe that polygenic risk estimation will play an important role in the future of clinical medicine, enabling targeted screening or prevention strategies to overcome inherited predisposition, and warrants consideration as an enrichment strategy for clinical trials as well. Although the potential requirement of a genetic test as an inclusion criterion for a given trial creates a potential hurdle to recruitment, this has become common within clinical medicine for use cases ranging from targeted cancer therapies and drugs for cystic fibrosis that work only among those with a given genetic mutation in the CFTR gene to a potent fish oil formulation that is approved only for those with high circulating triglyceride levels2,16,17. Compelling use cases might include primary prevention trials where traditional approaches would require a clinical trial that is intractable owing to the very large sample size and long follow-up that would be necessary to show benefit. Beyond conditions such as Alzheimer’s dementia discussed above, an additional public health need relates to nonalcoholic fatty liver disease, which affects up to 20% of the world’s population and is the leading risk factor for liver cirrhosis or cancer, but for which trials have been challenging to conduct since only a small fraction of afflicted individuals progress to more advanced disease in a given year18. We and others have recently developed polygenic scores for this condition, laying the scientific foundation for a new generation of trials that incorporate genetic enrichment strategies19.

Alongside the considerable (and warranted) enthusiasm for the use of polygenic scores to meaningfully enhance clinical development, several potential barriers warrant discussion. First, the predictive capacity of a polygenic score is limited by heritability (proportion of risk explained by common DNA variants) and scores may not have adequate ability to stratify risk for some conditions20. Second, although in principle polygenic scores can be assessed for less than 100 U.S. dollars, few healthcare systems currently offer them clinically to patients, posing a logistical challenge for trial enrollment or medication prescribing. Third, current polygenic scores are typically associated with increased risk across all ancestries, but with an effect size that is highest in those of European ancestry (primarily due to lack of adequate training data in other groups)21,22. Fourth, most scores developed to date are based on case-control datasets for a given disease; additional work is needed to determine whether the genetic basis of disease progression meaningfully differs from disease onset and whether ‘pathway-specific’ scores may provide more reliable predictive enrichment23,24. Fifth, an approach that integrates polygenic risk with additional rare genetic or non-genetic factors such as clinical or biomarker concentrations is likely to outperform strategies based on a polygenic score alone, but few such algorithms have been developed to date25,26. Sixth, the regulatory guidelines surrounding polygenic score use in clinical development have not been fully articulated and scores are likely to evolve over time due to a lack of accepted standards to evaluate performance and reproducibility, increasing the risk to a sponsor of obtaining an approved drug label tied to a given score. Seventh, most investigations of polygenic scores in clinical trials are post hoc analyses, and prospective implementation may still face logistical and scientific challenges that would need to be solved.

Despite potential barriers, the high cost of clinical trials has emerged as arguably the single biggest barrier to the development of innovations that may well have substantial public health benefit—and potential strategies to meaningfully alter this landscape mandate serious consideration27. As observed in trials of cholesterol-lowering therapies, polygenic scores hold the potential to enable substantial predictive or prognostic enrichment and could have a deep impact on enabling a new era in clinical development.

Acknowledgements

Funding support was provided by grants 1K08HG010155 and 1U01HG011719 from the National Human Genome Research Institute (A.V.K.), a Hassenfeld Scholar Award from Massachusetts General Hospital (A.V.K.), a Merkin Institute Fellowship from the Broad Institute of MIT and Harvard (A.V.K.), a sponsored research agreement from Bayer (A.A.P.), the Eric and Wendy Schmidt Center (A.A.P.), and a sponsored research agreement from IBM Research (A.V.K. & A.A.P.).

Author contributions

A.C.F., A.A.P., and A.V.K. jointly drafted the manuscript and critically revised the manuscript for intellectual content.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer for their contribution to the peer review of this work.

Competing interests

A.C.F. is a consultant to and holds equity in Goodpath. A.A.P. is employed as a Venture Partner at GV, a subsidiary of Alphabet Corporation. A.V.K. is an employee of and holds equity in Verve Therapeutics; has served as a scientific advisor to Amgen, Maze Therapeutics, Navitor Pharmaceuticals, Sarepta Therapeutics, Novartis, Silence Therapeutics, Korro Bio, Veritas International, Color Health, Third Rock Ventures, Illumina, Foresite Labs, and Columbia University (NIH); has received speaking fees from Illumina, MedGenome, Amgen, and the Novartis Institute for Biomedical Research; has received a sponsored research agreement from IBM Research; and is listed as a co-inventor on a patent application for use of imaging data in assessing body fat distribution and associated cardiometabolic risk.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Moore TJ, Zhang H, Anderson G, Alexander GC. Estimated costs of pivotal trials for novel therapeutic agents approved by the US Food and Drug Administration, 2015–2016. JAMA Intern. Med. 2018;178:1451–1457. doi: 10.1001/jamainternmed.2018.3931.
  • 2. U.S. Food and Drug Administration. Enrichment strategies for clinical trials to support approval of human drugs and biological products. 2019. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/enrichment-strategies-clinical-trials-support-approval-human-drugs-and-biological-products
  • 3. Baden LR, et al. Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine. N. Engl. J. Med. 2021;384:403–416. doi: 10.1056/NEJMoa2035389.
  • 4. Sabatine MS, et al. Evolocumab and clinical outcomes in patients with cardiovascular disease. N. Engl. J. Med. 2017;376:1713–1722. doi: 10.1056/NEJMoa1615664.
  • 5. Lynch TJ, et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 2004;350:2129–2139. doi: 10.1056/NEJMoa040938.
  • 6. McDade E, Bateman RJ. Stop Alzheimer’s before it starts. Nature. 2017;547:153. doi: 10.1038/547153a.
  • 7. Honig LS, et al. Trial of solanezumab for mild dementia due to Alzheimer’s disease. N. Engl. J. Med. 2018;378:321–330. doi: 10.1056/NEJMoa1705971.
  • 8. Bowman L, et al. Effects of anacetrapib in patients with atherosclerotic vascular disease. N. Engl. J. Med. 2017;377:1217–1227. doi: 10.1056/NEJMoa1706444.
  • 9. Lande R, Thompson R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics. 1990;124:743–756. doi: 10.1093/genetics/124.3.743.
  • 10. Khera AV, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 2018;50:1219–1224. doi: 10.1038/s41588-018-0183-z.
  • 11. Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 2018. doi: 10.1038/s41576-018-0018-x.
  • 12. Mega JL, et al. Genetic risk, coronary heart disease events, and the clinical benefit of statin therapy: An analysis of primary and secondary prevention trials. Lancet. 2015;385:2264–2271. doi: 10.1016/S0140-6736(14)61730-X.
  • 13. Natarajan P, et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation. 2017;135:2091–2101. doi: 10.1161/CIRCULATIONAHA.116.024436.
  • 14. Marston NA, et al. Predicting benefit from evolocumab therapy in patients with atherosclerotic disease using a genetic risk score. Circulation. 2020;141:616–623. doi: 10.1161/CIRCULATIONAHA.119.043805.
  • 15. Damask A, et al. Patients with high genome-wide polygenic risk scores for coronary artery disease may receive greater clinical benefit from alirocumab treatment in the ODYSSEY OUTCOMES trial. Circulation. 2020;141:624–636. doi: 10.1161/CIRCULATIONAHA.119.044434.
  • 16. Collins FS. Realizing the dream of molecularly targeted therapies for cystic fibrosis. N. Engl. J. Med. 2019;381:1863–1865. doi: 10.1056/NEJMe1911602.
  • 17. Bhatt DL, et al. Cardiovascular risk reduction with icosapent ethyl for hypertriglyceridemia. N. Engl. J. Med. 2019;380:11–22. doi: 10.1056/NEJMoa1812792.
  • 18. Loomba R, Friedman SL, Shulman GI. Mechanisms and disease consequences of nonalcoholic fatty liver disease. Cell. 2021;184:2537–2564. doi: 10.1016/j.cell.2021.04.015.
  • 19. Haas ME, et al. Machine learning enables new insights into genetic contributions to liver fat accumulation. Cell Genom. 2021. doi: 10.1016/j.xgen.2021.100066.
  • 20. Zhang Y, Qi G, Park JH, Chatterjee N. Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits. Nat. Genet. 2018;50:1318–1326. doi: 10.1038/s41588-018-0193-x.
  • 21. Fahed AC, et al. Transethnic transferability of a genome-wide polygenic score for coronary artery disease. Circ. Genomic Precis. Med. 2021. doi: 10.1161/CIRCGEN.120.003092.
  • 22. Martin AR, et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x.
  • 23. Liu G, et al. Genome-wide survival study identifies a novel synaptic locus and polygenic score for cognitive progression in Parkinson’s disease. Nat. Genet. 2021;53:787–793. doi: 10.1038/s41588-021-00847-6.
  • 24. McCarthy MI. Painting a new picture of personalised medicine for diabetes. Diabetologia. 2017;60:793–799. doi: 10.1007/s00125-017-4210-x.
  • 25. Fahed AC, et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat. Commun. 2020;11:3635. doi: 10.1038/s41467-020-17374-3.
  • 26. Lee A, et al. BOADICEA: A comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet. Med. 2019;21:1708–1718. doi: 10.1038/s41436-018-0406-9.
  • 27. Moscicki RA, Tandon PK. Drug-development challenges for small biopharmaceutical companies. N. Engl. J. Med. 2017;376:469–474. doi: 10.1056/NEJMra1510070.
  • 28. Sabatine MS, et al. Rationale and design of the further cardiovascular outcomes research with PCSK9 inhibition in subjects with elevated risk trial. Am. Heart J. 2016;173:94–101. doi: 10.1016/j.ahj.2015.11.015.

Articles from Nature Communications are provided here courtesy of Nature Publishing Group
