Communications Medicine. 2024 Nov 8;4:227. doi: 10.1038/s43856-024-00647-z

Inclusiveness of the All of Us Research Program improves polygenic risk scores and fosters genomic medicine for all

Benson R Kidenya, Gerald Mboowa
PMCID: PMC11544250  PMID: 39511400

Polygenic risk scores (PRS) show promise, but their accuracy differs across ancestries because many populations are underrepresented in existing genomic databases. Here, we outline the contribution of the All of Us Research Program to refining PRS and advancing genomic medicine for all.

Subject terms: Genetic testing, Predictive markers


Kidenya and Mboowa discuss the current state of genomic inclusiveness in medicine. They champion the efforts of the All of Us Research Program to broaden diversity in population genomics and reduce disparities across different ancestries.

The nature of genomic data and its use in medicine

For years, clinicians and researchers have sought to predict genetic diseases and traits precisely by analyzing genetic data. Early identification of genetic predisposition to complex diseases allows for timely interventions, potentially averting adverse outcomes. While the genetics of many Mendelian diseases is well understood, predicting genetic susceptibility to complex diseases such as diabetes mellitus, stroke, schizophrenia, cancers, and cardiovascular diseases remains challenging. This is in part because such diseases reflect the cumulative effects of numerous genetic variants, each with a minor individual impact on disease development [1].

Over the past two decades, studies of human genetic variation, such as genome-wide association studies (GWAS), have revealed polygenic variants linked to common complex disorders and certain traits [2]. Although the contribution of any individual variant to disease risk or a trait is modest, this insight has facilitated the development of polygenic risk scores (PRS). A PRS is a weighted sum of the independent, significantly associated genetic variants (typically single nucleotide polymorphisms) that an individual carries [3]. It aims to predict an individual’s susceptibility to developing a disease or propensity towards a specific trait. PRS have found widespread application in research, particularly in elucidating associations between scores and disease status or traits. However, their clinical utility remains largely unestablished and constrained [4]. This is attributed in part to the insufficient inclusiveness of existing genomic databases, which leaves gaps in data representation across different populations.
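
The weighted sum described above can be sketched in a few lines of Python. This is a minimal illustration, not a published pipeline: the helper name `polygenic_risk_score`, the variant identifiers, the weights, and the genotypes are all hypothetical, and we assume that each weight is a per-variant effect size (for example, a GWAS log odds ratio) and each dosage is the number of risk alleles carried (0, 1, or 2).

```python
from typing import Mapping


def polygenic_risk_score(
    effect_sizes: Mapping[str, float],
    dosages: Mapping[str, float],
) -> float:
    """Return the weighted sum of risk-allele dosages.

    effect_sizes: per-variant weights (hypothetical values in this sketch).
    dosages: number of risk alleles carried per variant (0, 1, or 2).
    Variants absent from `dosages` are skipped; this is an assumption of
    the sketch, and real pipelines may handle missing genotypes differently.
    """
    return sum(
        weight * dosages[variant]
        for variant, weight in effect_sizes.items()
        if variant in dosages
    )


# Made-up weights and genotypes, for illustration only.
weights = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(weights, person)
print(round(score, 2))  # 0.19
```

Because the score is only as good as its weights, effect sizes estimated in one population carry their limitations into every score computed from them.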

PRS are developed to forecast, and thus help improve, health outcomes through genomic medicine. Applications include predicting disease risk, traits, treatment outcomes, and disease prognosis. A PRS comprises hundreds to thousands of independent risk variants associated with a disease, drawn from the largest or most informative GWAS available, and combines them into a single score representing an individual’s genetic predisposition to a disease or continuous trait. However, a primary ethical and scientific obstacle to the clinical integration of PRS is the significant discrepancy in accuracy across ancestries. Currently available PRS are notably more accurate in individuals of European descent than in those of other ancestries. This disparity stems from the Eurocentric bias of existing GWAS, which are drawn mostly from white European populations, so the current clinical use of PRS predominantly benefits populations of European descent. Analyses indicate markedly lower accuracy of PRS among non-European populations, such as African and Hispanic populations, posing a substantial challenge to equitable genomic medicine [5].
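
One way to make the accuracy discrepancy described above concrete is to evaluate a score separately within each ancestry group. The sketch below assumes that accuracy is summarized as the squared Pearson correlation between the PRS and a continuous trait; that metric, the function names, and the toy records are our own illustrative choices and do not come from any cited study. The group labels are deliberately neutral, and the numbers are made up.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_squared_by_group(records):
    """Squared PRS-trait correlation within each group.

    records: iterable of (group_label, prs, trait) tuples.
    """
    grouped = defaultdict(lambda: ([], []))
    for group, prs, trait in records:
        grouped[group][0].append(prs)
        grouped[group][1].append(trait)
    return {g: pearson_r(xs, ys) ** 2 for g, (xs, ys) in grouped.items()}


# Toy records with made-up values; they illustrate the calculation only.
toy_records = [
    ("group_1", 0.0, 1.0), ("group_1", 1.0, 3.0), ("group_1", 2.0, 5.0),
    ("group_2", 0.0, 1.0), ("group_2", 1.0, 0.0), ("group_2", 2.0, 2.0),
]

accuracy = r_squared_by_group(toy_records)
for group, r2 in sorted(accuracy.items()):
    print(group, round(r2, 2))
```

Reporting the metric per group, rather than pooled across everyone, is what exposes a gap: a pooled figure can look acceptable while one group is served far worse than another.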

This underrepresentation in genomic data is compounded, particularly among African populations, by high genomic diversity and by the short blocks of genetic variants that are non-randomly inherited together (linkage disequilibrium) [6,7]. In essence, using Eurocentric genomic datasets for PRS training and development leads to less accurate PRS for underrepresented populations. To fully realize the equitable potential of PRS, prioritizing greater diversity in genetic studies is essential, and this requires a concerted effort among researchers, funders, and hosts of genomic databases. Additionally, authors and journals should publicly disseminate summary statistics from all ancestries, to avoid exacerbating health disparities among the most underserved individuals. This will facilitate the construction of PRS that predict complex human diseases and traits more accurately.

Making genomic data more equitable by addressing existing disparities

In an effort to enhance healthcare by prioritizing the genetic and health data of historically marginalized populations, the National Institutes of Health (NIH) in the United States established the All of Us Research Program [8,9]. With more than $3.1 billion in funding from the NIH, this initiative aims to compile detailed health profiles for one million diverse individuals within the US by 2026, thereby bridging existing gaps. From its inception in 2018 [10] to April 2023, the program enrolled 413,000 participants, 46% of whom belong to minority racial or ethnic groups. Impressively, it has shared nearly 250,000 genomes, which form the most extensive collection of African American, Hispanic, and Latin American genomes to date.

Data collected in the All of Us Research Program comprise whole-genome sequences, electronic health records, participant surveys, and data from wearable devices such as Fitbits. The intention is not only to compile data for GWAS but also to provide insight into health across diverse ancestries and levels of access to healthcare. This ambitious endeavor currently ranks as one of the largest and most accessible biomedical research repositories worldwide, and meaningful contributions to our understanding of genetic risk are already being realized from it. Primary analyses of up to 245,000 diverse genomes from the program have revealed over 275 million previously unreported genetic variants, some of which may bear on the risk of complex diseases, including nearly 150 potentially linked to type 2 diabetes mellitus [8]. These results demonstrate the existing disparities in genetics research regarding non-white populations, because novel pathogenic variants are being discovered in these diverse populations that had not previously been identified in European populations. Additionally, new genetic information gathered in the program shows that, in groups that have been less well studied, there are fewer people with pathogenic genetic variants and more with previously unknown variants [11].

The dataset is freely accessible upon reasonable request, facilitating its sharing and enabling the recalibration and development of new PRS to enhance accuracy. In essence, the All of Us Research Program represents a pivotal step towards leveraging genomic diversity to foster inclusive genomic medicine and improve PRS accuracy, thereby advancing healthcare for all [8].

The All of Us Research Program’s diverse genomic dataset is poised to support the development of more accurate PRS and the refinement of existing PRS that were initially constructed using Eurocentric genomic data. Collecting and using additional genomic and health data from varied populations will therefore be essential for generating PRS that accurately assess an individual’s genetic susceptibility to developing a disease [4]. The program’s dataset has already been used to create and validate PRS customized for enhanced performance in clinical settings [4]. Past research has shown that these scores, which are soon to be integrated into clinical practice for personalized healthcare, are often less accurate for minority populations than for majority populations. However, recent studies have leveraged the inclusive All of Us Research Program data to enhance and validate scores for various conditions, including coronary heart disease and diabetes mellitus [12,13].

This underscores the significance of the diverse genome dataset in updating and refining these PRS to enhance their accuracy for use in clinical practice. Nonetheless, clinicians still face the challenge of interpreting these scores precisely. Future research should concentrate on understanding how healthcare professionals interpret these scores and how to apply them in treatment decisions.

Currently, no African country is widely recognized for actively using PRS in clinical settings in the way the United States and the United Kingdom are. However, there are ongoing efforts and research initiatives across Africa to develop and use PRS, especially in the context of enhancing genomic data and precision medicine for African populations. One notable effort is the establishment of biobanks and genomic datasets in several African countries. For instance, the Human Heredity and Health in Africa (H3Africa) initiative is working to increase the understanding of how genomic and environmental factors influence disease in African populations, which could pave the way for future use of PRS [1]. Additionally, there is growing recognition of the need for more inclusive genomic research that represents the genetic diversity of African populations, in order to improve the accuracy and applicability of PRS in these regions [14]. Countries such as South Africa and Nigeria are also part of international collaborations that aim to gather more comprehensive genomic data. These efforts are essential steps toward potentially implementing PRS in healthcare systems across the continent in the future [1]. Although the direct clinical use of PRS in Africa may still be at a developmental stage, these foundational efforts indicate a promising direction for the integration of genetic risk prediction into the continent’s healthcare landscape.

Conclusion

Emerging data from the All of Us Research Program shows that there are fewer people with deleterious genetic variants and more with genetic variants we do not fully understand in groups who have been less well studied in the past. This stresses the need for more genetic research in these groups. Furthermore, the All of Us Research Program’s diverse genomic data represents a pivotal platform for addressing genetic disparities and unlocking the full potential of optimized and accurate PRS. This, in turn, will advance the equitable application of genomic medicine and its tools, including PRS, heralding a new era of personalized healthcare for all.

Acknowledgements

We acknowledge the Human Heredity and Health in Africa Bioinformatics Network (H3ABioNet) for the training that helped us to conceptualize and execute this work.

Author contributions

B.R.K. conceptualized the study while discussing it with G.M. B.R.K. and G.M. gathered the literature and summarized the main findings. B.R.K. drafted the manuscript and B.R.K. and G.M. revised the manuscript critically.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Fatumo, S. et al. Polygenic risk scores for disease risk prediction in Africa: current challenges and future directions. Genome Med. 15, 87 (2023).
  • 2. Uffelmann, E. et al. Genome-wide association studies. Nat. Rev. Methods Prim. 1, 1–21 (2021).
  • 3. Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
  • 4. Lennon, N. J. et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat. Med. 30, 480–487 (2024).
  • 5. Lewis, C. M. & Vassos, E. Polygenic risk scores: from research tools to clinical instruments. Genome Med. 12, 44 (2020).
  • 6. Campbell, M. C. & Tishkoff, S. A. African genetic diversity: implications for human demographic history, modern human origins, and complex disease mapping. Annu. Rev. Genom. Hum. Genet. 9, 403–433 (2008).
  • 7. Lonjou, C. et al. Linkage disequilibrium in human populations. Proc. Natl. Acad. Sci. USA 100, 6069–6074 (2003).
  • 8. Bick, A. G. et al. Genomic data in the All of Us research program. Nature 1–7 (2024). doi: 10.1038/s41586-023-06957-x
  • 9. Kozlov, M. Ambitious survey of human diversity yields millions of undiscovered genetic variants. Nature (2024). doi: 10.1038/d41586-024-00502-0
  • 10. All of Us Research Program Investigators. The “All of Us” research program. N. Engl. J. Med. 381, 668–676 (2019).
  • 11. Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. 7, 1–11 (2024).
  • 12. Mapes, B. M. et al. Diversity and inclusion for the All of Us research program: a scoping review. PLoS ONE 15, e0234962 (2020).
  • 13. Venner, E. et al. Whole-genome sequencing as an investigational device for return of hereditary disease risk and pharmacogenomic results as part of the All of Us research program. Genome Med. 14, 34 (2022).
  • 14. Slunecka, J. L. et al. Implementation and implications for polygenic risk scores in healthcare. Hum. Genom. 15, 46 (2021).

Articles from Communications Medicine are provided here courtesy of Nature Publishing Group
