Multi-ancestry polygenic risk scores for the prediction of type 2 diabetes and complications in diverse ancestries

doi:10.1101/2025.07.21.25331778

This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2025 Jul 23:2025.07.21.25331778. [Version 1] doi: 10.1101/2025.07.21.25331778

Alicia Huerta-Chagoya ¹,²,³,⁴,*, Joohyun Kim ⁵,*, Ravi Mandla ¹,²,³,&, Yingchang Lu ⁵,&, Ken Suzuki ⁶,⁷,⁸,&, Lauren E Petty ⁵,&, Hong Kiat Ng ⁹,&, Jaewon Choi ¹⁰,&, Simon Lee ¹¹,&, Madhusmita Rout ¹²,&, Kuang Lin ¹³,&, Linda S Adair ¹⁴, Adebowale Adeyemo ¹⁵, Habibul Ahsan ¹⁶, Masato Akiyama ¹⁷,¹⁸, Ping An ¹⁹, Sonia S Anand ²⁰,²¹,²², Diane M Becker ²³, Alain G Bertoni ²⁴, Zheng Bian ²⁵, Lawrence F Bielak ²⁶, John Blangero ²⁷, Michael Boehnke ²⁸, Erwin P Bottinger ²⁹,³⁰,¹¹, Donald W Bowden ³¹,³²,³³, Fiona Bragg ¹³,³⁴, Jennifer A Brody ³⁵, Thomas A Buchanan ³⁶, Brian E Cade ³⁷,³⁸, Jin-Fang Chai ³⁹, John C Chambers ⁹, Giriraj R Chandak ⁴⁰,⁴¹, Li-Ching Chang ⁴², Kyong-Mi Chang ⁴³,⁴⁴, Miao-Li Chee ⁴⁵, Chien-Hsiun Chen ⁴², Yuan-Tsong Chen ⁴², Zhengming Chen ¹³, Yii-Der I Chen ⁴⁶, Ji Chen ⁴⁷, Guanjie Chen ¹⁵, Shyh-Huei Chen ⁴⁸, Wei-Min Chen ⁴⁹, Ching-Yu Cheng ⁵⁰,⁵¹,⁴⁵,⁵², Yoon Shin Cho ⁵³, Hyeok Sun Choi ⁵³, Lee-Ming Chuang ⁵⁴,⁵⁵, Miguel Cruz ⁵⁶, Mary Cushman ⁵⁷, Swapan K Das ⁵⁸, Ralph A DeFronzo ⁵⁹, H Janaka de Silva ⁶⁰, Latchezar Dimitrov ³², Ayo P Doumatey ¹⁵, Shufa Du ¹⁴, Qing Duan ⁶¹, Ravindranath Duggirala ²⁷, Leslie S Emery ⁶², James C Engert ⁶³, Daniel S Evans ⁶⁴, Michele K Evans ⁶⁵, Sarah Finer ⁶⁶, Jose C Florez ¹,⁴,⁶⁷, James S Floyd ³⁵, Myriam Fornage ⁶⁸, Elizabeth G Frankel ⁵, Barry I Freedman ⁶⁹, Lourdes García-García ⁷⁰, Pauline Genter ⁷¹, Hertzel C Gerstein ²⁰,²¹,²², Mark O Goodarzi ⁷², Penny Gordon-Larsen ¹⁴, Mariaelisa Graff ⁷³, Myron Gross ⁷⁴, Yu Guo ²⁵, Xiuqing Guo ⁴⁶, Yang Hai ⁴⁶, Craig L Hanis ⁷⁵, M Geoffrey Hayes ⁷⁶,⁷⁷,⁷⁸, Momoko Horikoshi ⁷⁹, Annie-Green Howard ⁸⁰, Sarah Hsu ¹,³, Willa Hsueh ⁸¹, Wei Huang ⁸², Mengna Huang ⁸³,⁸⁴, Yi-Jen Hung ⁸⁵,⁸⁶, Mi Yeong Hwang ⁸⁷, Chii-Min Hwu ⁸⁸,⁸⁹, Sahoko Ichihara ⁹⁰, Michiya Igase ⁹¹, Eli Ipp ⁷¹, Mohammad T Islam ⁹², Masato Isono ⁹³, Hye-Mi Jang ⁸⁷, Farzana Jasmine ¹⁶, Jost B Jonas ⁹⁴,⁹⁵,⁹⁶,⁹⁷,⁹⁸, Yoonjung Y Joo ⁹⁹,⁷⁸, Edmond Kabagambe ¹⁰⁰, Takashi Kadowaki ⁷, Yoichiro Kamatani ¹⁰¹,¹⁰², Fouad R Kandeel ¹⁰³, Sharon LR Kardia ²⁶, Elizabeth W Karlson ¹⁰⁴, Anuradhani Kasturiratne ¹⁰⁵, Norihiro Kato ⁹³, Tomohiro Katsuya ¹⁰⁶,¹⁰⁷, Varinderpal Kaur ¹,²,³, Takahisa Kawaguchi ¹⁰⁸, Jacob M Keaton ³²,¹⁰⁰, Abel N Kho ¹⁰⁹,¹¹⁰, Chiea-Chuen Khor ¹¹¹,⁴⁵,¹¹², Muhammad Kibriya ¹⁶, Bong-Jo Kim ⁸⁷, Woon-Puay Koh ¹¹³,³⁹, Katsuhiko Kohara ¹¹⁴, Jaspal S Kooner ¹¹⁵, Charles Kooperberg ¹¹⁶, Raymond J Kreienkamp ¹,⁴,²,¹¹⁷, Amel Lamri ²⁰,²¹, Leslie A Lange ¹¹⁸, Nanette R Lee ¹¹⁹, Myung-Shik Lee ¹²⁰,¹²¹, Jung-Jin Lee ¹²², Donna M Lehman ⁵⁹, Liming Li ¹²³, Yun Li ¹²⁴, Victor JY Lim ³⁹, Jianjun Liu ¹¹¹,¹²⁵, Yongmei Liu ¹²⁶,²⁴, Simin Liu ¹²⁷,¹²⁸,¹²⁹,¹³⁰, Jirong Long ¹⁰⁰, Tin Louie ⁶², Xi Luo ¹³¹, Jun Lv ¹²³, Julie A Lynch ¹³²,¹³³, Shiro Maeda ⁷⁹,¹³⁴,¹³⁵, Anubha Mahajan ¹³⁶, Nisa M Maruthur ¹³⁷, Fumihiko Matsuda ¹⁰⁸, Mark I McCarthy ¹³⁶,¹³⁸,¹³⁹, Roberta McKean-Cowdin ¹⁴⁰, James B Meigs ¹,⁴,¹⁴¹, Iona Y Millwood ¹³, Karen L Mohlke ⁶¹, Ayesha A Motala ¹⁴², Girish N Nadkarni ¹¹,²⁹, Jerry L Nadler ¹⁴³, Masahiro Nakatochi ¹⁴⁴, Mike A Nalls ¹⁴⁵,¹⁴⁶,¹⁴⁷, Uma Nayak ¹⁴⁸, Aude Nicolas ¹⁴⁵, Kari E North ⁷³, Darryl Nousome ¹⁴⁰, Yukinori Okada ¹⁴⁹,¹⁵⁰,¹⁵¹, Ian Pan ⁸⁴, James S Pankow ¹⁵², Guillaume Paré ¹⁵³,²¹,²², Jaehyun Park ⁵, Kyong Soo Park ¹⁵⁴,¹⁵⁵, Esteban J Parra ¹⁵⁶, Sanjay R Patel ¹⁵⁷, Mark A Pereira ¹⁵², Patricia A Peyser ²⁶, Fraser J Pirie ¹⁴², Michael Preuss ¹¹, Michael A Province ¹⁹, Bruce M Psaty ¹⁵⁸, Leslie J Raffel ¹⁵⁹, Laura M Raffield ⁶¹, Laura J Rasmussen-Torvik ¹⁶⁰, Susan Redline ¹⁶¹,¹⁶²,³⁸, Alexander P Reiner ¹⁶³,¹¹⁶, Stephen S Rich ¹⁶⁴, Rebecca Rohde ⁷³, Kathryn Roll ⁴⁶, Rashedeh Roshani ⁵, Charles N Rotimi ¹⁵, Charumathi Sabanayagam ⁴⁵,⁵⁰, Danish Saleheen ⁴³, Kevin Sandow ⁴⁶, Claudia Schurmann ²⁹,³⁰,¹¹, Mohammad Shahriar ¹⁶, Douglas M Shaw ⁵, Wayne H-H Sheu ⁸⁶,⁸⁸,⁸⁹,¹⁶⁵,¹⁶⁶, Jinxiu Shi ⁸², Xiao-Ou Shu ¹⁰⁰, Megan M Shuey ⁵, Moneeza K Siddiqui ⁶⁶, Jennifer A Smith ²⁶,¹⁶⁷, Tamar Sofer ¹⁶⁸,¹⁶⁹,⁴, Cassandra N Spracklen ⁶¹,¹⁷⁰, Adrienne M Stilp ⁶², Meng Sun ¹⁷¹, Yasuharu Tabara ¹⁰⁸, E-Shyong Tai ¹²⁵,³⁹,¹¹², Salman M Tajuddin ⁶⁵, Atsushi Takahashi ¹⁰¹,¹⁷², Fumihiko Takeuchi ⁹³, Jingyi Tan ⁴⁶, Kent D Taylor ⁴⁶, Katherine Taylor ¹, Farook Thameem ¹⁷³, Lin Tong ¹⁶, Fuu-Jen Tsai ¹⁷⁴, Philip S Tsao ¹⁷⁵,¹⁷⁶, Miriam S Udler ¹,⁴,²,³, Adan Valladares-Salgado ⁵⁶, David A van Heel ¹⁷⁷, Rob M van Dam ³⁹,¹²⁵, Rohit Varma ¹⁷⁸, Maheak Vora ¹,², Niels Wacher-Rodarte ¹⁷⁹, Ya-Xing Wang ¹⁸⁰, Ellie Wheeler ¹⁸¹,⁴⁷, Eric A Whitsel ⁷³,¹⁸², Ananda R Wickremasinghe ¹⁰⁵, Genevieve L Wojcik ¹⁸³, Tien Y Wong ⁴⁵,⁵⁰,⁵¹, Jer-Yuarn Wu ⁴², Yong-Bing Xiang ¹⁸⁴,¹⁸⁵, Anny H Xiang ¹⁸⁶, Chittaranjan S Yajnik ¹⁸⁷, Ken Yamamoto ¹⁸⁸, Toshimasa Yamauchi ⁷, Lisa R Yanek ²³, Jie Yao ⁴⁶, Mitsuhiro Yokota ¹⁸⁸, Canqing Yu ¹²³, Jian-Min Yuan ¹⁸⁹,¹⁹⁰, Salim Yusuf ¹⁹¹,¹⁹²,¹⁹³, Eleftheria Zeggini ¹⁹⁴,¹⁹⁵, Liang Zhang ⁴⁵, Weihua Zhang ¹⁹⁶, Wei Zheng ¹⁹⁷, Alan B Zonderman ⁶⁵; ENSA Genomics Consortium ¹⁹⁸; Genes & Health Research Team ¹⁷⁷; VA Million Veteran Program ¹⁹⁹; Carlos A Aguilar-Salinas ²⁰⁰,²⁰¹, Clicerio González-Villalpando ²⁰², Christopher A Haiman ²⁰³, Young Jin Kim ⁸⁷, Soo Heon Kwak ¹⁵⁴, Aaron Leong ¹,²,²⁰⁴,³,¹⁴¹,²⁰⁵, Ruth JF Loos ¹¹,²⁰⁶, Andres Moreno-Estrada ²⁰⁷, Andrew P Morris ⁸,²⁰⁸, Lorena Orozco ²⁰⁹, Jerome I Rotter ⁴⁶, Dharambir Sanghera ¹², Teresa Tusie-Luna ²¹⁰, Benjamin F Voight ²¹¹,²¹²,²¹³,²¹⁴, Marijana Vujkovic ²¹⁵,²¹¹, Robin G Walters ¹³, Tian Ge ²¹⁶,^, Alisa K Manning ²¹⁷,¹,⁴,^, Marie Loh ⁹,²¹⁸,²¹⁹,^, Jennifer E Below ⁵,^, Xueling Sim ³⁹,^, Josep M Mercader ¹,⁴,²,³,²²⁰,^,#, Maggie CY Ng ⁵,^,#; D-PRISM Consortium

¹Programs in Metabolism and Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA,

²Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA,

³Diabetes Unit, Massachusetts General Hospital, Boston, MA, 02114, USA,

⁴Department of Medicine, Harvard Medical School, Boston, MA, 02215, USA,

⁵Vanderbilt Genetics Institute, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA,

⁶Department of Statistical Genetics, Osaka University, Osaka, 565-0871, Japan,

⁷Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo, 113-8654, Japan,

⁸Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal and Dermatological Sciences, University of Manchester, Manchester, UK,

⁹Lee Kong Chian School of Medicine, Clinical Sciences Building, Nanyang Technological University, Singapore, 308232, Singapore,

¹⁰Department of Internal Medicine, Seoul National University Hospital, Seoul, 03080, South Korea,

¹¹The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA,

¹²Department of Pediatrics, Section of Genetics, University of Oklahoma, Oklahoma, OK, USA,

¹³Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK,

¹⁴Department of Nutrition, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA,

¹⁵Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA,

¹⁶Biological Sciences Division, Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA,

¹⁷Laboratory for Statistical Analysis, RIKEN Centre for Integrative Medical Sciences, Yokohama, 230-0045, Japan,

¹⁸Department of Ophthalmology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, 812-8582, Japan,

¹⁹Division of Statistical Genomics, Washington University School of Medicine, St. Louis, MO, USA,

²⁰Department of Medicine, McMaster University, Hamilton, ON, Canada,

²¹Population Health Research Institute, Hamilton, ON, Canada,

²²Hamilton Health Sciences, Hamilton, ON, Canada,

²³Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA,

²⁴Department of Epidemiology and Prevention, Division of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

²⁵Chinese Academy of Medical Sciences, Beijing, 100730, China,

²⁶Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA,

²⁷Human Genetics and STDOI, University of Texas Rio Grande Valley, Edinburg, TX, USA,

²⁸Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA,

²⁹Digital Health Center, Digital Engineering Faculty of Hasso Plattner Institute and University Potsdam, Potsdam, Germany,

³⁰Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA,

³¹Center for Diabetes Research, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

³²Center for Genomics and Personalized Medicine Research, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

³³Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

³⁴Health Data Research UK Oxford, University of Oxford, Oxford, UK,

³⁵Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA,

³⁶Department of Medicine, Division of Endocrinology and Diabetes, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA,

³⁷Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA,

³⁸Harvard Medical School, Boston, MA, USA,

³⁹Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore, 117549, Singapore,

⁴⁰Genomic Research on Complex diseases, CSIR-Centre for Cellular and Molecular Biology (CSIR-CCMB), Hyderabad, India,

⁴¹Lightening Lives, Hyderabad, India,

⁴²Institute of Biomedical Sciences, Academia Sinica, Taipei, 11529, Taiwan,

⁴³Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, 19104, USA,

⁴⁴Division of Translational Medicine and Human Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA,

⁴⁵Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, 168751, Singapore,

⁴⁶The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA,

⁴⁷Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK,

⁴⁸Department of Biostatistical Sciences, Division of Public Health Sciences, Wake Forest University Health Sciences, Winston-Salem, NC, USA,

⁴⁹Department of Public Health Sciences and Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, VA, USA,

⁵⁰Ophthalmology & Visual Sciences Academic Clinical Program (Eye ACP), Duke-NUS Medical School, Singapore, 169857, Singapore,

⁵¹Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore, 119228, Singapore,

⁵²Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore,

⁵³Biomedical Science, Hallym University, Chuncheon, 24252, South Korea,

⁵⁴Department of Internal Medicine, Division of Endocrinology and Metabolism, National Taiwan University Hospital, Taipei, Taiwan,

⁵⁵College of Medicine, National Taiwan University, Taipei, Taiwan,

⁵⁶Unidad de Investigación Médica en Bioquimica, Hospital de Especialidades, Centro Medico Nacional Siglo XXI, IMSS, Mexico City, Mexico,

⁵⁷Department of Medicine, University of Vermont, Colchester, VT, USA,

⁵⁸Department of Internal Medicine, Section on Endocrinology and Metabolism, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

⁵⁹Department of Medicine, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA,

⁶⁰Department of Medicine, Faculty of Medicine, University of Kelaniya, Sri Lanka,

⁶¹Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA,

⁶²Department of Biostatistics, University of Washington, Seattle, WA, USA,

⁶³Department of Medicine, McGill University, Montreal, QC, Canada,

⁶⁴California Pacific Medical Center Research Institute, San Francisco, CA, USA,

⁶⁵Laboratory of Epidemiology and Population Sciences, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA,

⁶⁶Wolfson Institute of Population Health, Queen Mary University of London, London, UK,

⁶⁷Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA,

⁶⁸Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA,

⁶⁹Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA,

⁷⁰Instituto Nacional de Salud Pública, Cuernavaca, Mexico,

⁷¹Division of Endocrinology & Metabolism, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA,

⁷²Department of Medicine, Division of Endocrinology, Diabetes and Metabolism, Cedars-Sinai Medical Center, Los Angeles, CA, USA,

⁷³Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA,

⁷⁴Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, 55454, USA,

⁷⁵Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA,

⁷⁶Department of Anthropology, Northwestern University, Evanston, IL, USA,

⁷⁷Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA,

⁷⁸Division of Endocrinology, Metabolism, and Molecular Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA,

⁷⁹Laboratory for Endocrinology, Metabolism and Kidney Diseases, RIKEN Centre for Integrative Medical Sciences, Yokohama, 230-0045, Japan,

⁸⁰Department of Biostatistics, Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA,

⁸¹Department of Internal Medicine, Diabetes and Metabolism Research Center, Ohio State University Wexner Medical Center, Columbus, OH, USA,

⁸²Shanghai-MOST Key Laboratory of Health and Disease Genomics, Shanghai Institute for Biomedical and Pharmaceutical Technologies (SIBPT), Shanghai, 200032, China,

⁸³Center for Global Cardiometabolic Health, Brown University, Providence, RI, USA,

⁸⁴Department of Epidemiology, Brown University School of Public Health, Providence, RI, USA,

⁸⁵Division of Endocrine and Metabolism, Tri-Service General Hospital Songshan Branch, Taipei, Taiwan,

⁸⁶School of Medicine, National Defense Medical Center, Taipei, Taiwan,

⁸⁷Department of Precision Medicine, Division of Genome Science, National Institute of Health, Chungcheongbuk-do, 28159, Republic of Korea,

⁸⁸Department of Medicine, Section of Endocrinology and Metabolism, Taipei Veterans General Hospital, Taipei, Taiwan,

⁸⁹School of Medicine, National Yang-Ming University, Taipei, Taiwan,

⁹⁰Department of Environmental and Preventive Medicine, Jichi Medical University School of Medicine, Shimotsuke, 329-0498, Japan,

⁹¹Department of Anti-aging Medicine, Ehime University Graduate School of Medicine, Ehime, 791-0295, Japan,

⁹²U Chicago Research, Uttara, Dhaka, 1230, Bangladesh,

⁹³Department of Gene Diagnostics and Therapeutics, Research Institute, National Center for Global Health and Medicine, Tokyo, 162-8655, Japan,

⁹⁴Rothschild Foundation Hospital, Paris, France,

⁹⁵Singapore Eye Research Institute, Singapore National Eye Center, Singapore,

⁹⁶Privatpraxis Prof Jonas und Dr. Panda-Jonas, Heidelberg, Germany,

⁹⁷Beijing Visual Science and Translational Eye Research Institute (BERI), Beijing Tsinghua Changgung Hospital, Tsinghua Medicine, Tsinghua University, Beijing, China,

⁹⁸L V Prasad Eye Institute, Hyderabad, Telangana, India,

⁹⁹Department of Health and Biomedical Informatics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA,

¹⁰⁰Department of Medicine, Division of Epidemiology, Vanderbilt University Medical Center, Nashville, TN, USA,

¹⁰¹Laboratory for Statistical and Translational Genetics, RIKEN Centre for Integrative Medical Sciences, Yokohama, 230-0045, Japan,

¹⁰²Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, 108-8639, Japan,

¹⁰³Departments of Clinical Diabetes, Endocrinology & Metabolism and Translational Research and Cellular Therapeutics, City of Hope, Duarte, CA, USA,

¹⁰⁴Department of Medicine, Brigham and Women’s Hospital, Boston, MA, 02215, USA,

¹⁰⁵Department of Public Health, Faculty of Medicine, University of Kelaniya, Sri Lanka,

¹⁰⁶Department of Clinical Gene Therapy, Osaka University, Osaka, 565-0871, Japan,

¹⁰⁷Department of Geriatric and General Medicine, Graduate School of Medicine, Osaka University, Osaka, 565-0871, Japan,

¹⁰⁸Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, 606-8507, Japan,

¹⁰⁹Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA,

¹¹⁰Department of Medicine, Division of General Internal Medicine and Geriatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA,

¹¹¹Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, 138672, Singapore,

¹¹²Duke-NUS Medical School, Singapore, 169857, Singapore,

¹¹³Health Services and Systems Research, Duke-NUS Medical School, Singapore, 169857, Singapore,

¹¹⁴Department of Regional Resource Management, Ehime University Faculty of Collaborative Regional Innovation, Ehime, 790-8577, Japan,

¹¹⁵Department of Cardiology, Ealing Hospital, London North West Healthcare NHS Trust, Middlesex, UB1 3HW, UK,

¹¹⁶Fred Hutchinson Cancer Research Center, Seattle, WA, USA,

¹¹⁷Department of Pediatrics, Division of Endocrinology, Boston Children’s Hospital, Boston, MA, USA,

¹¹⁸Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO, USA,

¹¹⁹Department of Anthropology, Sociology and History, University of San Carlos, Cebu City, 6000, Philippines,

¹²⁰Severance Biomedical Science Institute and Department of Internal Medicine, Yonsei University College of Medicine, Seoul, 03722, South Korea,

¹²¹Soonchunhyang Institute of Medi-bio Science, Cheonan, 31151, South Korea,

¹²²Translational Medicine and Human Genetics, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA,

¹²³Department of Epidemiology and Biostatistics, Peking University Health Science Centre, Beijing, 100191, China,

¹²⁴Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA,

¹²⁵Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore, 117549, Singapore,

¹²⁶Department of Medicine, Division of Cardiology, Duke University School of Medicine, Durham, NC, USA,

¹²⁷Department of Epidemiology and Biostatistics, The Joe Wen School of Population & Public Health, UCI, Irvine, CA, USA,

¹²⁸Center for Global Cardiometabolic Health and Nutrition, the University of California, Irvine (UCI), Irvine, CA, USA,

¹²⁹Division of Cardiology, Department of Medicine, School of Medicine, UCI, USA,

¹³⁰Center for Global Cardiometabolic Health, Brown University School of Public Health, Providence, RI, USA,

¹³¹Department of Biostatistics and Data Science, University of Texas Health Science Center at Houston, Houston, TX, USA,

¹³²Salt Lake City VA Medical Center, Salt Lake City, UT, USA,

¹³³University of Utah, School of Medicine, Salt Lake City, UT, USA,

¹³⁴Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa, 903-0213, Japan,

¹³⁵Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Okinawa, 903-0213, Japan,

¹³⁶Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK,

¹³⁷Department of Medicine and Epidemiology, Johns Hopkins University, Baltimore, MD, 21205, USA,

¹³⁸Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, UK,

¹³⁹Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK,

¹⁴⁰Department of Preventative Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA,

¹⁴¹Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA,

¹⁴²Department of Diabetes and Endocrinology, Nelson R Mandela School of Medicine, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa,

¹⁴³Department of Medicine and Endocrinology Division, University of California Davis School of Medicine, Sacramento, CA, USA,

¹⁴⁴Department of Integrated Health Sciences, Nagoya University Graduate School of Medicine, Nagoya, 461-8673, Japan,

¹⁴⁵Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA,

¹⁴⁶Center for Alzheimer’s and Related Dementias, National Institutes of Health, Bethesda, MD, USA,

¹⁴⁷DataTecnica LLC, Washington, DC, USA,

¹⁴⁸Department of Genome Sciences, University of Virginia School of Medicine, Charlottesville, VA, USA,

¹⁴⁹Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan,

¹⁵⁰Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Osaka, 565-0871, Japan,

¹⁵¹Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan,

¹⁵²Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, 55454, USA,

¹⁵³Department of Pathology and Molecular Medicine, McMaster University, Hamilton, ON, Canada,

¹⁵⁴Department of Internal Medicine, Seoul National University College of Medicine and Seoul National University Hospital, Seoul, 03080, South Korea,

¹⁵⁵Department of Internal Medicine, Konkuk University Medical Center, Seoul, 03080, South Korea,

¹⁵⁶Department of Anthropology, University of Toronto at Mississauga, Mississauga, Canada,

¹⁵⁷Department of Medicine, Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Pittsburgh, Pittsburgh, PA, USA,

¹⁵⁸Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Systems and Population Health, University of Washington, Seattle, WA, USA,

¹⁵⁹Department of Pediatrics, Division of Genetic and Genomic Medicine, UC Irvine School of Medicine, Irvine, CA, USA,

¹⁶⁰Department of Preventive Medicine, Northwestern University, Chicago, IL, USA,

¹⁶¹Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA,

¹⁶²Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, MA, USA,

¹⁶³Department of Epidemiology, University of Washington, Seattle, WA, USA,

¹⁶⁴Department of Genome Sciences, University of Virginia, Charlottesville, VA, USA,

¹⁶⁵Department of Medicine, Division of Endocrinology and Metabolism, Taichung Veterans General Hospital, Taichung, Taiwan,

¹⁶⁶Institutes of Molecular and Genomic Medicine, National Health Research Institutes, Taiwan,

¹⁶⁷Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA,

¹⁶⁸Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA,

¹⁶⁹Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA,

¹⁷⁰Department of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA, 01003, USA,

¹⁷¹Nuffield Department of Surgical Sciences, University of Oxford, Oxford, OX3 9DU, UK,

¹⁷²Department of Genomic Medicine, National Cerebral and Cardiovascular Center, Osaka, 565-8565, Japan,

¹⁷³Department of Biochemistry, Faculty of Medicine, Health Science Center, Kuwait University, Safat, Kuwait,

¹⁷⁴Department of Medical Genetics and Medical Research, China Medical University Hospital, Taichung, 40402, Taiwan,

¹⁷⁵Department of Medicine, Stanford University, Palo Alto, CA, USA,

¹⁷⁶VA Palo Alto Health Care System, Palo Alto, CA, USA,

¹⁷⁷Blizard Institute, Queen Mary University of London, London, UK,

¹⁷⁸Southern California Eye Institute, CHA Hollywood Presbyterian Medical Center, Los Angeles, CA, USA,

¹⁷⁹Unidad de Investigación Médica en Epidemiologia Clinica, Hospital de Especialidades, Centro Medico Nacional Siglo XXI, IMSS, Mexico City, Mexico,

¹⁸⁰Beijing Institute of Ophthalmology, Ophthalmology and Visual Sciences Key Laboratory, Beijing Tongren Hospital, Capital Medical University, Beijing, 100005, China,

¹⁸¹MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, UK,

¹⁸²Department of Medicine, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA,

¹⁸³Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA,

¹⁸⁴State Key Laboratory of Oncogene and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200032, China,

¹⁸⁵Department of Epidemiology, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200032, China,

¹⁸⁶Department of Research & Evaluation, Division of Biostatistics Research, Kaiser Permanente of Southern California, Pasadena, CA, USA,

¹⁸⁷Diabetes Unit, King Edward Memorial Hospital and Research Centre, Pune, India,

¹⁸⁸Department of Medical Biochemistry, Kurume University School of Medicine, Kurume, 830-0011, Japan,

¹⁸⁹Division of Cancer Control and Population Sciences, UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, 15232, USA,

¹⁹⁰Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, 15232, USA,

¹⁹¹Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada,

¹⁹²Population Health Research Institute, Hamilton, ON, Canada,

¹⁹³Hamilton Health Sciences, Hamilton, ON, Canada,

¹⁹⁴Institute of Translational Genomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany,

¹⁹⁵Technical University of Munich (TUM), TUM University Hospital, TUM School of Medicine and Health, Munich, 81675, Germany,

¹⁹⁶Department of Epidemiology and Biostatistics, Imperial College London, London, W2 1PG, UK,

¹⁹⁷Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37203, USA,

¹⁹⁸Instituto Nacional de Salud Pública, Cuernavaca, Mexico,

¹⁹⁹Corporal Michael J Crescenz VA Medical Center, Philadelphia, PA, USA,

²⁰⁰Unidad de Investigación de Enfermedades Metabólicas, Research Direction of the Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico,

²⁰¹Tecnológico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Mexico City, Mexico,

²⁰²Centro de Estudios en Diabetes, Unidad de Investigacion en Diabetes y Riesgo Cardiovascular, Centro de Investigacion en Salud Poblacional, Instituto Nacional de Salud Pública, Mexico City, Mexico,

²⁰³Department of Population and Public Health Sciences, Keck School of Medicine of USC, Los Angeles, CA, USA,

²⁰⁴Department of Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA,

²⁰⁵Endocrine Division, Massachusetts General Hospital, Boston, MA, 02114, USA,

²⁰⁶The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA,

²⁰⁷Aging Research Center, Cinvestav Sede Sur, Center for Research and Advanced Studies of the National Polytechnic Institute, Mexico City, Mexico,

²⁰⁸NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK,

²⁰⁹Instituto Nacional de Medicina Genómica, Mexico City, Mexico,

²¹⁰Unidad de Biología Molecular y Medicina Genómica, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México/Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico,

²¹¹Corporal Michael J. Crescenz Philadelphia VA Medical Center, Philadelphia, PA, USA,

²¹²Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA,

²¹³Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA,

²¹⁴Institute for Translational Medicine and Therapeutics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA,

²¹⁵Department of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA,

²¹⁶Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA,

²¹⁷Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, 02114, USA,

²¹⁸Department of Epidemiology and Biostatistics, Imperial College London, St Mary’s Campus, London, UK,

²¹⁹Research Division, National Skin Centre, Singapore,

²²⁰Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

These authors equally contributed to this work.

^These authors jointly directed this work.

Contributors

J.M.M. and M.C.Y.N. conceived the study. A.H.-C. and J.K. led the analyses. R.M., Y.L., K.S., L.E.P., H.K.N., J.C., S.L., M.R., and K.L. performed the training and/or validation of the PRS analyses. Y.J.K., R.G.W., D.S., S.H.K., and R.J.F.L. supervised the training and/or validation of the PRS analyses. T.G., A.K.M., M.L., J.E.B., X.S., J.M.M., and M.C.Y.N. jointly directed this work. The rest of the authors provided data for the different stages of the analyses. A.H.-C., J.K., J.M.M., and M.C.Y.N. wrote the first draft of the manuscript. All authors contributed to interpreting the data and read, revised, and approved the final manuscript.

Corresponding authors: Josep M. Mercader, Programs in Metabolism and Medical and Population Genetics; Broad Institute of Harvard and MIT, 75 Ames St, Cambridge, MA, 02142, United States of America, mercader@broadinstitute.org, Maggie C.Y. Ng, Division of Genetic Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, 37232, United States of America; maggie.ng@vumc.org

PMCID: PMC12330429 PMID: 40778152

Summary

Background

Polygenic risk scores (PRSs) improve type 2 diabetes (T2D) prediction beyond clinical risk factors but perform poorly in non-European populations, where T2D burden is often higher, undermining their global clinical utility.

Methods

We conducted the largest global effort to date to harmonize T2D genome-wide association study (GWAS) meta-analyses across five ancestries—European (EUR), African/African American (AFR), Admixed American (AMR), South Asian (SAS), and East Asian (EAS)—including 360,000 T2D cases and 1·8 million controls (41% non-EUR). We constructed ancestry-specific and multi-ancestry PRSs in training datasets including 11,000 T2D cases and 32,000 controls, and validated their performance in independent datasets including 39,000 T2D cases and 126,000 controls of diverse ancestries. In the All of Us Research Program, we compared these PRSs to those from the Polygenic Score Catalog and assessed their ability to predict diabetes micro- and macrovascular complications.

Findings

Ancestry-specific PRSs showed limited predictive power for T2D in AFR, AMR, and SAS compared to EUR and EAS. In contrast, multi-ancestry PRSs, built using GWAS data from five ancestries, substantially improved T2D prediction across all ancestries. Compared to those in the interquartile range, individuals at the 97·5th percentile of their PRSs had a 6-fold increased T2D risk in AMR, EAS, and EUR, and ≥3-fold in AFR and SAS. These PRSs were also associated with the development of microvascular complications and outperformed all previously reported PRSs for all ancestries.

Interpretation

We developed and extensively validated the most up-to-date T2D PRSs across diverse ancestry groups. These PRSs are publicly available to support further evaluation of their clinical utility in diverse ancestries.

Introduction

Type 2 diabetes (T2D) represents one of the largest health problems of the 21st century, affecting 537 million people globally, and is predicted to increase to 783 million by 2045.¹ Both genetic and environmental factors contribute to T2D susceptibility. Genome-wide association studies (GWAS) have identified 1,289 genetic signals associated with T2D in diverse ancestries.^2,3 The aggregation of risk alleles in polygenic risk scores (PRSs) for T2D^2,4 may provide insights into disease progression and prognosis, or help identify people at risk for prioritization of therapeutic or lifestyle intervention.⁵ Several initiatives are beginning to test their utility in clinical settings^6,7, and an observational study in Electronic Health Record (EHR) data has shown that T2D PRSs provide data orthogonal to standard clinical risk factors and are therefore particularly valuable in identifying at-risk individuals among those perceived to be low risk based on standard clinical risk factors.⁸

T2D disproportionately affects people of ancestral genetic backgrounds other than European, with South Asian, African American, and Hispanic/Latin American populations having a higher prevalence of diabetes and related complications.⁹ Yet most PRSs are based on GWAS that include predominantly individuals of European ancestry and perform poorly in individuals of other ancestries, which may further exacerbate health disparities if PRSs were to be deployed for disease prediction in ancestrally diverse populations.^2,10

While efforts have been made to enhance PRS performance in diverse populations¹¹, a comprehensive, standardized, and harmonized approach for the development and validation of PRSs across continental ancestries for T2D remains lacking. In particular, recent efforts have aimed at improving the transferability of PRS across diverse ancestries. First, the Polygenic Risk Methods in Diverse Populations (PRIMED) consortium and others have developed new methods, such as those incorporating GWAS and linkage disequilibrium (LD) data from multiple ancestries, showing improved prediction.^11,12 Second, large-scale T2D GWAS data for populations other than European have become available, which may improve the power of PRS development in these populations. Third, the emergence of large-scale biobank data from around the world — such as the All of Us Research Program (AoU) — offers valuable opportunities to develop, train, and validate novel PRSs, as data from these participants has not yet been included in GWAS meta-analyses.^13,14

As part of the Type 2 Diabetes Global Genomics Initiative (T2DGGI), we published the largest multi-ancestry T2D GWAS meta-analysis to date, based on 2·5 million individuals, including 428,452 with T2D, and enumerating 1,289 signals for T2D risk.³ Here, in collaboration with T2DGGI, we present results from the Diabetes Polygenic Risk Scores in Multiple ancestries (D-PRISM)^2,11, an international consortium focusing on improving PRS prediction of different types of diabetes and progression across the lifespan in diverse ancestries. We leveraged these two consortia to aggregate extensive T2D GWAS data from five major continental ancestry groups, and we trained and validated PRS models in independent cohorts, using a unified pipeline, to evaluate their performance in predicting T2D and diabetes complications. We tested the optimal PRS model for each of the five ancestries and made these models available to the community for downstream analyses, as well as for future testing of clinical utility and implementation.

Methods

An overview of the overall strategy is shown in Fig.1 and detailed in the supplementary methods. We leveraged T2D GWAS from selected cohorts participating in three large consortia: the Diabetes Meta-analysis of Trans-ethnic Association Studies (DIAMANTE)¹⁵, the Million Veteran Program (MVP)¹⁶, and FinnGen¹⁷. Cohorts were categorized by genetic similarity to one or more of the five ancestries available in the 1000 Genomes (1KG) project¹⁸, or the country of origin: African/African American (AFR), Admixed American (AMR), East Asian (EAS), European (EUR), and South Asian (SAS). While ancestry labels do not represent or account for the full continuum of human genetic diversity, they were necessary for statistical analyses where no individual-level data were available. We included 2,185,548 individuals (359,819 cases, 1,825,729 controls) across 125 T2D GWAS datasets to conduct ancestry-specific meta-analyses, with summary statistics from EUR and EAS ancestry groups representing 86% of the total sample size (EUR=68%, EAS=18%, AFR=6%, AMR=4%, and SAS=4%; Fig.1a, Supplementary Tables 1,2).

We leveraged cohorts not included in previous meta-analyses and, when necessary, conducted held-out meta-analyses so that some cohorts could be reserved for PRS training and validation. We aggregated data from a total of 42,784 individuals (10,992 cases, 31,792 controls) from five ancestries for training (tuning PRS construction parameters; Fig.1b) and 164,972 individuals (39,148 cases, 125,824 controls) for validation (testing PRS performance; Fig.1c). We used the five ancestry-specific T2D GWAS summary statistics to construct single-ancestry and multi-ancestry PRSs, using the PRS-CS (Continuous Shrinkage)¹⁹ and PRS-CSx²⁰ methods, respectively (Fig.1b). Both methods use LD panels from reference datasets to model the pairwise correlations between SNPs in GWAS summary statistics during PRS construction. The SNPs in the most widely used LD panels are based on HapMap3 (HM3), which generally tags variants poorly in non-EUR populations and may miss ancestry-specific signals. In addition, the samples used to model LD structure are either from the 1KG project or the UK Biobank (UKBB). The former has a limited sample size, while the latter lacks sufficient representation of populations other than EUR. To improve tagging, we developed new LD panels with an expanded set of SNPs generated using the Tag(ging) It(erative) of SNPs in multiple populations (TagIt) program²¹ and variants with ancestry-specific minor allele frequency (MAF) ≥0·01 across samples from the 1KG. We also used >8,000 in-house samples to compute pairwise LD in each ancestry, enabling more accurate modeling of the LD structure (Fig.1a).
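Whatever method estimates the per-variant weights, applying a PRS to an individual reduces to a weighted sum of effect-allele dosages, usually standardized afterwards so that effects can be reported per SD. The following is a minimal sketch of that scoring step only, not of PRS-CS or PRS-CSx themselves; the function names, weights, and genotypes are hypothetical and chosen purely for illustration.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

def standardize(scores):
    """Rescale raw scores to mean 0 and SD 1 within the cohort."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical per-variant weights and toy genotypes (rows are individuals).
weights = [0.12, -0.05, 0.30]
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 0],
]
raw_scores = [polygenic_score(g, weights) for g in genotypes]
z_scores = standardize(raw_scores)
```

In practice the weights would come from a published scoring file and the standardization would use a reference distribution matched to the individual's ancestry group.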

We compared the impact of several input parameters (GWAS ancestry, two SNP sets, and two LD sources) on PRS performance in the training cohorts. We defined the best models as those maximizing the incremental AUC (iAUC) for predicting prevalent T2D, that is, the gain in AUC of a full model including the PRS, sex, age, and genetic principal components (PCs) over a model without the PRS (Fig.1d). We validated the best-performing models for association with T2D in at least four independent cohorts from each ancestry group (Fig.1c) and fitted secondary models with body mass index (BMI) as an additional covariate. We estimated the effect size of the PRSs as the odds ratio per standard deviation (OR per SD) unit of the PRS, and calculated ORs for individuals at the 90th, 95th, and 97·5th PRS percentiles, with the interquartile range as the reference. In the AoU cohort, we also compared the association of our best-performing T2D PRSs with that of other published T2D PRSs from the Polygenic Score (PGS) catalog, and tested their association with common diabetes complications (Fig.1d).
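The iAUC can be illustrated with a rank-based AUC applied to the predictions of two already fitted models. This is a minimal sketch under the assumption that predicted risks from a covariates-only model and from a covariates-plus-PRS model are available; the function names and toy values are hypothetical, and fitting the logistic models themselves is outside the scope of the example.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a random case scores higher than a
    random control, counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def incremental_auc(full_scores, base_scores, labels):
    """AUC of the full model minus AUC of the model without the PRS."""
    return auc(full_scores, labels) - auc(base_scores, labels)

# Toy data: 1 = T2D case, 0 = control; predictions are made up.
labels = [1, 1, 1, 0, 0, 0]
base_pred = [0.6, 0.4, 0.5, 0.5, 0.3, 0.45]  # sex + age + PCs (hypothetical)
full_pred = [0.8, 0.5, 0.7, 0.4, 0.2, 0.45]  # same covariates + PRS (hypothetical)
iauc = incremental_auc(full_pred, base_pred, labels)
```

The pairwise loop is quadratic in cohort size, which is fine for a demonstration; a sort-based implementation would be preferable for biobank-scale data.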

Results

Development of new LD reference panels to improve variant coverage and LD estimation in diverse populations.

Compared to the LD panels using the HM3 set of variants, our new TagIt-based LD panels increased the proportion of SNPs being tagged (r²≥0·8) by up to 2-fold. The largest improvement was 7-fold, for SNPs with MAF 0·01–0·05 in AFR ancestry (Supplementary Table 7, Supplementary Fig.1). In all ancestry groups, the best-performing PRSs were those constructed using the expanded TagIt set of variants and/or the recomputed pairwise LD with large sample sizes (Supplementary Table 8, Supplementary Fig.2), suggesting that the prediction performance benefited from increasing SNP coverage, better LD modeling, or both.
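The tagging criterion (r²≥0·8) can be sketched as follows. The example approximates LD by the squared Pearson correlation between genotype dosage vectors, which is an assumption made for simplicity (haplotype-based estimators are also common), and it counts a target variant as tagged if at least one panel variant reaches the threshold. The function names and genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant cannot tag or be tagged
    return (sxy * sxy) / (sxx * syy)

def proportion_tagged(targets, panel, threshold=0.8):
    """Fraction of target variants with r2 >= threshold to any panel variant."""
    tagged = sum(
        1 for t in targets if any(r_squared(t, p) >= threshold for p in panel)
    )
    return tagged / len(targets)

# Toy dosages for six individuals (hypothetical).
panel = [[0, 1, 2, 1, 0, 2]]
targets = [
    [0, 1, 2, 1, 0, 2],  # identical to the panel variant
    [2, 1, 0, 1, 2, 0],  # same variant coded for the other allele
    [0, 0, 1, 2, 1, 0],  # uncorrelated with the panel variant
]
coverage = proportion_tagged(targets, panel)
```

Comparing this proportion between two candidate panels, within a MAF bin and ancestry, gives the kind of fold change in tagging reported above.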

Matched single-ancestry PRS performance is positively correlated with the sample size of GWAS summary statistics.

We first trained PRSs (Supplementary Tables 3,4) using GWAS summary statistics and four different LD panels, each matched to the ancestry of the validation cohorts (Supplementary Tables 5,6). When using single-ancestry summary statistics, matching the GWAS and LD panel ancestries to that of the validation dataset resulted in the best prediction performance for the EUR (iAUC=0·07–0·14) and EAS (iAUC=0·02–0·16) ancestries. However, the prediction was poorer for AFR (iAUC=0·02–0·03), AMR (iAUC=0·02–0·04), and SAS (iAUC=0·02–0·04) (Fig.2a–e, Supplementary Table 9).

Fig. 2 | a–e: Incremental AUC (iAUC) of the T2D PRS in the validation cohorts across ancestry groups: a, AFR, b, AMR, c, EAS, d, EUR, e, SAS. For each ancestry, the best-performing single-ancestry and multi-ancestry PRSs were evaluated. Each bar represents a single cohort. Bar colors represent the ancestry group: purple for AFR, yellow for AMR, green for EAS, red for EUR, and blue for SAS. Line colors represent the ancestry of the T2D GWAS summary statistics and LD panels used to train the PRS, using the same color codes for single-ancestry PRSs, and black for multi-ancestry PRSs. *De Long test p<0·05. f–j: Odds ratio from the meta-analysis of validation cohorts across ancestry groups: f, AFR, g, AMR, h, EAS, i, EUR, j, SAS. Points represent the odds ratio per standard deviation (OR per SD) of the PRS distribution or the odds ratio (OR) comparing different PRS distribution extremes relative to the interquartile range. Error bars show the 95% confidence intervals (95% CI). Point colors represent the ancestry of the T2D GWAS summary statistics and LD panels used to train the PRS. *De Long p<0·05.

The effect size of the PRS on T2D risk showed a similar pattern, with larger OR per SD from the meta-analysis of the EUR (OR per SD [95% CI] = 2·31 [2·26–2·35]) and EAS (2·19 [2·12–2·27]) validation cohorts, compared to the AFR (1·38 [1·34–1·43]), AMR (1·64 [1·57–1·71]) and SAS (1·6 [1·5–1·74]). The EUR and EAS PRSs also showed better power to identify individuals at the highest T2D genetic risk than those trained for the other ancestries. For instance, individuals at the 90th percentile of the respective PRS distribution had ~4-fold increased risk of T2D in the EUR (OR [95% CI] = 4·01 [3·82–4·21]) and in the EAS (3·58 [3·24–3·94]) ancestries, compared to ~2-fold increased risk for individuals in the AFR (1·82 [1·64–2·01]), AMR (2·1 [1·84–2·39]) and SAS (2·22 [1·75–2·81]) ancestries (Fig.2f–j, Supplementary Table 9). We observed consistent results regardless of adjustment for BMI, indicating that the predictive performance of the PRSs was robust to the inclusion of this well-established T2D risk factor (Supplementary Table 10, Supplementary Fig.3).
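The comparison of a high-PRS tail with the interquartile range can be sketched from a 2×2 table. Note the assumptions: this is an unadjusted odds ratio with a Woolf (log-scale) 95% CI, whereas the reported estimates come from covariate-adjusted models meta-analyzed across cohorts; the function names and toy cohort are hypothetical.

```python
import math

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in 0..100) of already sorted values."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def tail_vs_iqr_odds_ratio(prs, is_case, top_q=90.0):
    """Unadjusted OR (with Woolf 95% CI) for scores at or above the top_q-th
    percentile versus scores inside the interquartile range."""
    ranked = sorted(prs)
    q25, q75, q_top = (percentile(ranked, q) for q in (25.0, 75.0, top_q))
    a = b = c = d = 0  # a, b: tail cases/controls; c, d: IQR cases/controls
    for score, case in zip(prs, is_case):
        if score >= q_top:
            a, b = (a + 1, b) if case else (a, b + 1)
        elif q25 <= score <= q75:
            c, d = (c + 1, d) if case else (c, d + 1)
    if 0 in (a, b, c, d):
        raise ValueError("every cell of the 2x2 table must be non-empty")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return odds_ratio, ci

# Toy cohort: 100 hypothetical scores; case labels are made up for illustration.
prs = list(range(100))
is_case = [1 if (25 <= s < 35 or 90 <= s < 96) else 0 for s in prs]
odds_ratio, (ci_low, ci_high) = tail_vs_iqr_odds_ratio(prs, is_case)
```

Individuals between the 75th percentile and the tail cutoff, and those below the 25th percentile, are deliberately left out of both groups, which is what makes the interquartile range the reference.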

Using GWAS data from large non-matched ancestries improves PRS performance in ancestries with limited GWAS sample size.

We then tested whether constructing PRSs using GWAS summary statistics from ancestries with larger sample sizes, such as EUR and EAS, could improve T2D prediction in other ancestries with limited GWAS data, despite the larger ancestral differences between discovery and validation cohorts. Compared to the matched single-ancestry PRSs, those based on the EUR GWAS improved T2D prediction in the AFR, AMR, and SAS validation cohorts but had lower prediction performance in the EAS cohorts.

A PRS based on EAS GWAS showed modestly improved T2D prediction in the AMR and SAS validation cohorts, but worse performance in the AFR and EUR (Fig.2a–e, Supplementary Table 9). The effect sizes of the PRSs were consistent with their prediction performance. The best-performing single-ancestry PRSs were those derived from EUR GWAS for the AFR (OR per SD [95% CI] = 1·75 [1·67–1·82]), AMR (3·07 [2·89–3·27]), SAS (2·21 [2·03–2·42]), and EUR (2·31 [2·26–2·35]) validation cohorts, while for EAS cohorts (2·19 [2·12–2·27]), the best-performing single-ancestry PRS was based on EAS GWAS (Fig.2f–j, Supplementary Table 9).

Multi-ancestry PRSs show the best prediction performance in all ancestries.

We further applied a multi-ancestry PRS method, PRS-CSx²⁰, which jointly models ancestry-specific GWAS and LD panel data from multiple ancestries. This approach leverages the increased statistical power from GWAS of five continental ancestries, which are jointly modelled to maximize the power of variants that are present in all ancestries, while still incorporating the effects of variants that are specific or enriched in specific ancestries, even if their sample size is modest. Compared to the best single-ancestry PRSs, the prediction performance of the multi-ancestry PRSs was the highest across validation cohorts from all five ancestries (iAUC ranging from 0·02–0·06 in AFR, 0·06–0·09 in AMR, 0·04–0·17 in EAS, 0·07–0·14 in EUR, and 0·06–0·10 in SAS) (Fig.2a–e, Supplementary Table 9).

The multi-ancestry PRSs also had the highest effect sizes along with smaller confidence intervals across all ancestries (OR per SD [95% CI] = 1·73 [1·67–1·80] in AFR, 2·82 [2·67–2·97] in AMR, 2·45 [2·36–2·54] in EAS, 2·36 [2·32–2·41] in EUR, and 2·23 [2·05–2·42] in SAS). The improvement was particularly notable for individuals at the extremes of the PRS distributions. For instance, the individuals at the 97·5th percentile of the multi-ancestry PRSs had a 3- to 7-fold increased T2D risk compared to those in the interquartile range (OR [95% CI] = 3·43 [2·8–4·21] in AFR, 7·47 [5·64–9·89] in AMR, 6·62 [5·58–7·85] in EAS, 6·25 [5·72–6·82] in EUR and 4·50 [2·70–7·53] in SAS) (Fig.2f–j, Supplementary Table 9).

Multi-ancestry T2D PRSs outperform previously published T2D PRSs.

We then leveraged the AoU cohort to compare this study’s best-performing multi-ancestry PRSs against the published T2D PRSs from the PGS catalog.²² We tested 55 of the 147 available PRSs for T2D (accessed on October 07, 2024), after excluding pathway-specific PRSs and those for which the AoU cohort was used as a training dataset, to avoid potential model overfitting. In all ancestry groups, our best-performing multi-ancestry PRSs showed better predictive performance than the previously published T2D PRSs from the PGS catalog. The differences with the best published PRSs from the PGS catalog were statistically significant for AFR, AMR, and EUR (De Long p<9×10⁻⁴, Bonferroni-corrected threshold=0·05/55 PRSs tested; iAUC for D-PRISM multi-ancestry PRS-CSx vs. best iAUC from PGS Catalog: 0·041 vs. 0·029 in AFR, 0·086 vs. 0·073 in AMR, 0·143 vs. 0·123 in EUR) (Fig.3a,b,d). The improvement was nominally significant for EAS (0·074 vs. 0·058, p=0·03) (Fig.3c), but not for SAS (p=0·58) (Fig.3e), likely due in part to the limited sample sizes of these two ancestries in the AoU cohort (Supplementary Table 11, Supplementary Fig.4–8).
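The Bonferroni threshold quoted above follows directly from dividing the nominal significance level by the number of tests. A minimal sketch, with a hypothetical helper name, reproduces the value used for the 55 PGS Catalog comparisons and shows how the reported EAS p-value falls between the nominal and corrected thresholds:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

# 55 published PRSs were compared against the multi-ancestry PRS.
pgs_threshold = bonferroni_threshold(0.05, 55)

# The EAS comparison (p = 0.03) is nominally but not Bonferroni significant.
eas_p = 0.03
eas_nominal = eas_p < 0.05
eas_corrected = eas_p < pgs_threshold
```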

Fig. 3 | a–e: Incremental AUC (iAUC) across ancestry groups: a, AFR, b, AMR, c, EAS, d, EUR, e, SAS. Black bars highlight this study’s multi-ancestry PRSs. *De Long p<0·05; **Bonferroni-corrected De Long p<9×10⁻⁴.

We also compared different strategies for PRS construction. Some studies leverage ancestry diversity using the inverse variance-weighted (IVW) meta-analysis results from multi-ancestry GWAS, typically applying a single LD panel — often of EUR ancestry— which may fail to model LD patterns and tag ancestry-specific variants accurately. In contrast, PRS-CSx jointly models GWAS from multiple ancestries while accounting for differences in allele frequencies and LD patterns across ancestries. To assess the value of PRS-CSx, we constructed a PRS-CS model using summary statistics from the IVW GWAS meta-analysis of the D-PRISM ancestry-specific GWAS summary statistics and tested its performance in the validation cohorts. In all ancestry groups, the T2D prediction performance of the multi-ancestry PRS-CSx was better than that of the PRS-CS models constructed using the IVW GWAS meta-analysis. The improvement was statistically significant in the AMR, EAS and EUR (iAUC for the multi-ancestry PRS-CSx vs. the multi-ancestry PRS-CS: 0·086 vs. 0·077 in AMR, 0·074 vs. 0·061 in EAS, and 0·143 vs. 0·126 in EUR, De Long p<0·05) (Fig.3b–d) but not significantly different for AFR and SAS (0·041 vs. 0·038 in AFR, p=0·07, 0·061 vs. 0·060 in SAS, p=0·91) (Fig.3a,e, Supplementary Table 11, Supplementary Figures 4–8).
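For reference, the fixed-effects IVW combination that underlies such a meta-analysis can be sketched for a single variant. This shows only the pooling of per-ancestry effect estimates, assuming log-odds effects with standard errors; it is not the PRS-CS or PRS-CSx model, and the function name and input values are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effects inverse variance-weighted estimate and its standard error."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

# Hypothetical log-odds estimates for one variant in two ancestry-specific GWAS.
betas = [0.10, 0.20]
ses = [0.10, 0.20]  # the larger study has the smaller standard error
pooled_beta, pooled_se = ivw_meta(betas, ses)
```

Because the weights are inverse variances, the pooled estimate sits closer to the estimate from the larger study, which illustrates how ancestry-specific effects from smaller GWAS can be diluted when a single pooled effect is used.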

The largest multi-ancestry T2D GWAS meta-analysis to date, conducted by Suzuki et al.,³ includes a sample size 16% larger than our D-PRISM GWAS datasets (N=2,535,601, 428,452 cases, 2,107,149 controls), as we intentionally held out several cohorts for PRS training and validation in this study. Nevertheless, since the AoU cohort was not included in the discovery GWAS by Suzuki et al., we also evaluated the performance of a PRS-CS model built using those summary statistics. Despite a smaller sample size, our D-PRISM multi-ancestry PRS-CSx showed better prediction performance in the AMR, EUR, and EAS ancestry groups (iAUC for D-PRISM multi-ancestry PRS-CSx vs. Suzuki et al. PRS-CS: 0·086 vs. 0·077 in AMR, 0·074 vs. 0·059 in EAS, and 0·143 vs. 0·128 in EUR, De Long p<0·05) (Fig.3b–d). In the AFR and SAS ancestry groups, the D-PRISM multi-ancestry PRS-CSx yielded prediction performance comparable to that of the Suzuki et al. PRS-CS model (0·041 vs. 0·041, p=0·82 in AFR, 0·061 vs. 0·068, p=0·34 in SAS) (Fig.3a,e).
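The 16% figure can be checked from the sample sizes reported in this paper. The short calculation below uses only the case and control counts stated in the Methods and in the paragraph above; the variable names are ours.

```python
# D-PRISM discovery GWAS (Methods) and Suzuki et al. meta-analysis (this section).
dprism_cases, dprism_controls = 359_819, 1_825_729
suzuki_cases, suzuki_controls = 428_452, 2_107_149

dprism_n = dprism_cases + dprism_controls
suzuki_n = suzuki_cases + suzuki_controls

# Relative excess of the Suzuki et al. sample over the D-PRISM sample.
relative_increase = suzuki_n / dprism_n - 1
percent_larger = round(relative_increase * 100)
```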

In addition, our D-PRISM multi-ancestry PRS-CSx models showed consistently better prediction performance in four of the five ancestries (i.e., AMR, EAS, EUR, and SAS, p<0·05) (Fig.3b–e) compared to a multi-ancestry PRS that was restricted to the 1,289 genome-wide significant variants (rsPRS) identified in the discovery GWAS by Suzuki et al. We did not observe significant improvement in the AFR ancestry (Fig.3a, Supplementary Table 11, Supplementary Figs 4–8).

Multi-ancestry T2D PRSs are associated with the risk of microvascular diabetes complications.

T2D adversely affects the functioning of multiple organs, and the long-term complications accompanying the disease contribute the greatest morbidity for patients with T2D.²³ We leveraged the AoU cohort to assess whether the best-performing multi-ancestry T2D PRSs can help identify individuals who are at high risk of developing diabetes-associated macrovascular [i.e., cardiovascular disease (CAD), ischemic stroke (IS)] and microvascular complications [i.e., diabetic retinopathy (DR), proliferative diabetic retinopathy (PDR), diabetic nephropathy (DN), and end-stage diabetic nephropathy (ESDN)]. Given the sample size constraints of the AoU cohort, we only assessed the AFR, AMR, and EUR ancestries. We restricted microvascular complication analyses to individuals with T2D, as these outcomes are largely diabetes specific. For macrovascular complications, which also occur in those without T2D, we included all individuals and adjusted for T2D status in the models.

In each of the three ancestries, the D-PRISM multi-ancestry PRS-CSx was associated with increased risk of developing DR among individuals with T2D: OR per SD [95% CI] = 1·24 [1·12–1·37] in AFR, 1·44 [1·27–1·63] in AMR, 1·32 [1·21–1·44] in EUR. For the more severe form, PDR, the PRSs were also associated with increased risk in AFR, AMR, and EUR ancestries: 1·35 [1·09–1·67] in AFR, 1·94 [1·51–2·49] in AMR, 1·55 [1·19–2·01] in EUR. Additionally, the multi-ancestry PRSs predicted the risk for developing DN: 1·49 [1·27–1·76] in AMR and 1·23 [1·12–1·35] in EUR, as well as ESDN: 1·75 [1·35–2·27] in AMR and 1·41 [1·1–1·8] in EUR. The multi-ancestry PRS was also associated with increased risk of CAD in EUR only: 1·05 [1·02–1·08] (p<0·008, Bonferroni-corrected threshold from 0·05/6 diabetes complications tested) (Fig.4, Supplementary Table 12).
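An odds ratio and its 95% CI can be converted back to an approximate z-score and two-sided p-value on the log-odds scale, which is a convenient way to compare a reported estimate with a multiple-testing threshold. The sketch below assumes a symmetric Wald interval on the log scale and a normal test statistic; because it starts from rounded values, the result is only approximate and is not a substitute for the p-values in Supplementary Table 12. The function name is hypothetical.

```python
import math

def z_and_p_from_or(odds_ratio, ci_low, ci_high):
    """Approximate Wald z-score and two-sided p-value from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# CAD in EUR, as reported above: OR per SD = 1.05 [1.02-1.08].
cad_z, cad_p = z_and_p_from_or(1.05, 1.02, 1.08)

# Bonferroni threshold for the six complications tested.
complication_threshold = 0.05 / 6
cad_passes = cad_p < complication_threshold
```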

Fig. 4 | Odds ratio of this study’s multi-ancestry PRS for the following outcomes: CAD (cardiovascular disease), IS (ischemic stroke), DN (diabetic nephropathy), ESDN (end-stage diabetic nephropathy), DR (diabetic retinopathy), and PDR (proliferative diabetic retinopathy). Points represent the odds ratios per standard deviation (OR per SD) and are colored according to the genetic ancestry of the individuals tested: purple for AFR, yellow for AMR, and red for EUR. Error bars show the 95% confidence intervals (95% CI).

Discussion

Identifying individuals with high T2D risk is essential to prioritize those who will benefit from lifestyle or therapeutic interventions to delay the disease or its complications.²⁴ While clinical risk factors can identify individuals at risk, polygenic risk scores can be estimated at birth and can identify at-risk individuals who may clinically be perceived as low risk, e.g., those without a family history of T2D, or who are young or lean.⁸ Multiple efforts have been made to identify individuals at high genetic risk for T2D. More than a hundred PRSs for T2D have been published²², yet they have poorer predictive performance in underrepresented populations. Several factors contribute to this low transferability, including the overrepresentation of European populations in T2D GWAS and differential LD patterns between variants across ancestries. Importantly, populations in which PRSs are less predictive, including the AFR, AMR, and SAS, are disproportionately affected by diabetes and its complications, highlighting the risk of exacerbating health disparities by applying PRSs derived only from European genetic information.¹¹ To improve the transferability and predictive accuracy of PRSs across diverse populations, including those underrepresented in GWAS efforts, various multi-ancestry PRS methods have been developed, while in parallel, several efforts have focused on expanding the representation of diverse ancestry populations in GWAS.^3,15,16,25,26 Beyond GWAS discovery, multiple and sufficiently large samples are also essential for training and validating PRSs, and rigorous data aggregation and harmonization following best practices are crucial to determine PRS accuracy.

In this study, we developed the most comprehensive PRSs for T2D across five continental ancestries and conducted extensive evaluations in at least four independent cohorts per ancestry. To accomplish this, we leveraged nearly all available genetic datasets with T2D phenotype information to: i) maximize ancestry diversity by harmonizing 125 T2D GWAS datasets, including up to 2·2 million individuals; ii) enhance the representation and tagging of the genetic variants contributing to the PRS by generating new ancestry-specific LD reference panels; iii) train PRS models across ancestries; and iv) thoroughly validate the PRSs in multiple independent harmonized cohorts and assess their association with diabetes-related complications.

As previously described^12,27, we observed that PRSs have lower accuracy when the validation sample is genetically distant from the discovery GWAS sample. However, GWAS in underrepresented populations have limited power and yield imprecise variant effect size estimates. For this reason, PRSs derived from EUR GWAS continue to outperform those derived from ancestry-matched GWAS in AFR, AMR, and SAS ancestries, likely due to the EUR GWAS sample size being at least 11 times larger. In contrast, applying a EUR PRS yielded worse performance for EAS ancestry than using a matched-ancestry EAS GWAS, even though the latter was four times smaller than the EUR GWAS. This suggests that non-matched ancestry PRSs (i.e., PRSs based on GWAS from a different ancestry than the validation cohort) improve performance only when the GWAS sample size in the other ancestry is roughly an order of magnitude larger than that of the matched-ancestry GWAS, which may lack power to capture ancestry-specific genetic effects. However, when power is sufficient, ancestry-matched GWAS can improve accuracy compared to larger cross-ancestry GWAS. The minimum ancestry-specific GWAS sample size for powerful PRS prediction may vary depending on the trait and ancestry and may require further investigation. Regardless, we observe that for all the ancestry groups, including EUR, which has the largest GWAS sample size¹⁵, the optimal PRS strategy is to combine PRSs from multiple ancestries, here using PRS-CSx. This approach can leverage the most accurate estimates from variants present in all ancestries while still accounting for population-specific or enriched variants, improving the overall prediction accuracy.

For all ancestries, the multi-ancestry PRSs showed the strongest associations with T2D risk among individuals at the extremes of the risk distribution. For instance, individuals of EUR, EAS, or AMR ancestry at the 97·5th percentile of the PRS distribution had a six- to seven-fold higher risk of T2D than those in the interquartile range. Despite limited GWAS representation, the multi-ancestry PRS still showed higher predictive accuracy for individuals of AFR and SAS ancestry than the matched-ancestry PRS. However, overall performance remains lower than in other ancestry groups, with individuals at the 97·5th percentile having three-fold and four-fold increased risk of T2D, respectively. These results underscore that, despite our best efforts, there is still a substantial gap in PRS performance across ancestries, particularly for these two ancestry groups.

Using data from the AoU cohort, we demonstrate that our multi-ancestry PRSs outperformed all previously available PRSs for T2D, likely because we used the largest and most diverse GWAS data and multi-ancestry-based methodology for PRS development. Additionally, we constructed our multi-ancestry PRSs using a standardized and rigorous approach and tested them extensively across diverse ancestry groups, supporting their broader applicability. For example, the multi-ancestry PRSs outperformed PRSs derived from the largest trans-ancestry meta-analysis. These findings underscore that, in addition to increasing representation in GWAS, leveraging methods that jointly model GWAS and LD panels across multiple ancestries can enhance the tagging of causal variants, thereby improving predictive performance. In contrast, standard methods that use multi-ancestry IVW GWAS meta-analysis results rely on a single LD reference panel and may fail to capture ancestry-specific LD patterns, limiting their ability to model genetic risk prediction accurately.

Previous studies have reported significant associations between T2D PRS, proliferative retinopathy, and end-stage diabetic nephropathy across and within ancestries.^15,16 We confirm and extend previous findings by showing that multi-ancestry PRSs predict microvascular complications in T2D individuals of AFR, AMR, and EUR ancestries. Notably, our results reveal stronger associations and capture a broader range of diabetes complications severity than previously reported.

Our PRSs still face some limitations, mainly due to the lack of diversity in available genetic data. First, while constructing multi-ancestry PRS including GWAS from diverse ancestries—even those with smaller sample sizes—has proven beneficial to capture ancestry-specific effects, the SNP effect sizes are still strongly influenced by the largest European cohorts. Second, we acknowledge that using discrete population categories or restricting the analyses to groups defined by genetic similarity is sub-optimal, particularly in highly admixed populations where such groupings do not fully capture heterogeneous ancestry. While our current approach partially addresses these challenges, it remains limited by the need to group individuals into discrete ancestries, rather than a continuum. A critical next step will be adopting continuous genetic ancestry methods, which will require broader data access and new analytical strategies.

Despite criticism regarding PRSs’ limited added value to clinical risk factors, it has been shown that incorporating PRSs can help identify high-risk individuals among people who are clinically perceived as low risk, such as the young, the lean, or those with sparse clinical data⁸, and that PRSs improve prediction over family history alone²⁸. Ongoing randomized controlled trials are starting to evaluate the impact of PRS implementation into adult primary care as an additional risk factor for the primary care provider to consider.^7,29 Importantly, for any PRS-informed interventions to be viable, a state-of-the-art T2D PRS is a necessary first step. Among the characteristics to prioritize when using a PRS are its generalizability, portability, and implementation feasibility. For instance, Lennon et al.⁶ prioritized a multi-ancestry T2D PRS³⁰ trained using EUR, EAS, and AFR GWAS data, and proposed a high-risk cut-off of 2%. Compared to the non-high-risk population, this PRS showed ORs ranging from 4·44 [3·60–5·49] in EUR down to 2·35 [1·54–3·60] in AFR cohorts. Notably, our multi-ancestry PRSs consistently improved T2D prediction over any previously existing PRS, irrespective of the individual’s continental ancestry. Individuals above our high genetic risk cut-off (top 2·5%) have ORs ranging from 3·43 to 7·47 across ancestries compared to those in the interquartile range. We therefore propose using the T2D PRS models presented here for future clinical applications as they show improved predictive ability across all ancestries.

In summary, this study addresses a critical gap by delivering the most comprehensive and rigorously tested set of multi-ancestry PRSs for T2D. These scores improve risk prediction across diverse populations and enhance the identification of individuals at high genetic risk of developing T2D and microvascular complications, including mild and severe forms. By making the PRS weights publicly available, we provide a valuable resource for researchers and clinicians seeking to advance genetic risk stratification and develop prevention strategies for T2D.

Supplementary Material

Supplement 1

media-1.pdf (4MB, pdf)

Supplement 2

media-2.xlsx (248.8KB, xlsx)

Acknowledgments

This work is supported by the National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH), grant U01HG011723. A.H.-C. is supported by the American Diabetes Association (ADA) grant 11-23-PDF-35. Y.L. is supported by R56HL150186, R01HL158884, and R01DK135938. M.O.G. was supported in part by NIH grants from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (P30-DK063491) and from the National Center for Advancing Translational Sciences (NCATS) (UL1TR001420, UL1TR001881) and the Eris M. Field Chair in Diabetes Research. S.S.R. is supported by NHGRI U01HG011723. M.A.N.'s research was supported in part by the Intramural Research Program of the NIH, National Institute on Aging (NIA), NIH, Department of Health and Human Services; project number ZO1 AG000535, as well as the National Institute of Neurological Disorders and Stroke (NINDS). This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov). A.L. is supported by grant 2020096 from the Doris Duke Foundation, the ADA grant 7-22-ICTSPM-23, and NHGRI U01HG011723. A.P.M. acknowledges support from the NIHR Manchester Biomedical Research Centre (NIHR203308). C.N.S. was supported by the NIH (R01DK118011; R01DK136671) and the ADA (11-22-JDFPM-06). J.B.M. is supported by grant UMDK078616. B.F.V. is grateful for support from the NIDDK DK138521 and DK126194. A.K.M. is supported by grant UMDK078616. J.M.M. is supported by ADA grant #11-22-ICTSPM-16, by NHGRI U01HG011723, by the NIDDK under Award Numbers R01DK137993 and U01 DK140757, by an AMP CMD award from RFP 6 from the Foundation for the NIH, and by a Medical University of Bialystok (MUB) grant from the Ministry of Science and Higher Education (Poland). This work is supported by the Novo Nordisk Foundation (NNF21SA0072102). M.C.Y.N. is supported by U01HG011723, R01DK066358 and U01DK105556.

Declaration of interests

H.C.G. holds the McMaster-Sanofi Population Health Institute Chair in Diabetes Research and Care. He reports research grants from Eli Lilly, Novo Nordisk, and Hanmi Pharmaceutical; grants to support continuing education programs from Eli Lilly, Abbott, Sanofi, Novo Nordisk, and Boehringer Ingelheim; honoraria for speaking from AstraZeneca, Eli Lilly, Zuellig, and Jiangsu Hanson; and consulting fees from Abbott, Bayer, Biolinq, Eli Lilly, Novo Nordisk, Pfizer, Shionogi, and Zealand. M.S.U. has consulting activity and research funded in collaboration with Novo Nordisk. A.K.M. has research funded in collaboration with Novo Nordisk. M.S.U. has research funded in collaboration with Novo Nordisk and is an unpaid research collaborator with AstraZeneca. J.M.M. has research funded in collaboration with Novo Nordisk. S.R.P. has had research funded by Philips Respironics and consulting fees from Apnimed, Bayer, Philips Respironics, Mineralys, and SleepRes. M.A.N.'s participation in this project was part of a competitive contract awarded to DataTecnica LLC by the National Institutes of Health to support open science research. He also currently owns stock in Character Bio and Neuron23 Inc.

Data sharing

Individual participant data are not available because they are subject to data protection laws and restrictions imposed by the ethics committee to ensure study participants’ privacy. The study protocol and the individual methods are included in the methods section. The developed PRS weights are available without restrictions through the PGS catalog (https://www.pgscatalog.org).
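For readers who wish to apply published weights, the sketch below shows the general form of a PRS calculation: a weighted sum of effect-allele dosages. It assumes a tab-delimited scoring file with `rsID`, `effect_allele`, and `effect_weight` columns and `#`-prefixed header lines. The variant identifiers, weights, and dosages are placeholders rather than values from this study, and the function names are hypothetical. In practice, allele harmonization, genome build matching, and ancestry normalization are also needed and are not covered here.

```python
import csv
import io


def load_weights(handle):
    """Read variant weights from a tab-delimited scoring file, skipping '#' header lines."""
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {r["rsID"]: (r["effect_allele"], float(r["effect_weight"])) for r in reader}


def polygenic_score(weights, dosages):
    """Weighted sum of effect-allele dosages (0 to 2); missing variants contribute 0."""
    total = 0.0
    for variant, (allele, weight) in weights.items():
        total += weight * dosages.get((variant, allele), 0.0)
    return total


# Placeholder scoring file: identifiers and weights are made up for illustration
EXAMPLE_FILE = (
    "#pgs_id=HYPOTHETICAL\n"
    "rsID\teffect_allele\teffect_weight\n"
    "rs_example_1\tA\t0.10\n"
    "rs_example_2\tG\t-0.05\n"
    "rs_example_3\tT\t0.20\n"
)

weights = load_weights(io.StringIO(EXAMPLE_FILE))

# One individual's effect-allele dosages; rs_example_3 was not genotyped
dosages = {("rs_example_1", "A"): 2.0, ("rs_example_2", "G"): 1.0}

score = polygenic_score(weights, dosages)
print(f"raw PRS = {score:.2f}")
```

Treating missing variants as zero dosage keeps the example short; imputing the expected dosage from a reference allele frequency is a common alternative.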

References

1. Magliano DJ, Boyko EJ, IDF Diabetes Atlas 10th edition scientific committee. IDF Diabetes Atlas. 2021.
2. Mercader JM, Ng MCY, Manning AK, Rich SS. Predicting diabetes risk in diverse populations: what next? Lancet Diabetes Endocrinol 2021; 9: 808–10.
3. Suzuki K, Hatzikotoulas K, Southam L, et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature 2024; 627: 347–57.
4. Udler MS, McCarthy MI, Florez JC, Mahajan A. Genetic Risk Scores for Diabetes Diagnosis and Precision Medicine. Endocr Rev 2019; 40: 1500–20.
5. Diabetes Prevention Program Research Group, Knowler WC, Fowler SE, et al. 10-year follow-up of diabetes incidence and weight loss in the Diabetes Prevention Program Outcomes Study. Lancet 2009; 374: 1677–86.
6. Lennon NJ, Kottyan LC, Kachulis C, et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat Med 2024; 30: 480–7.
7. Vassy JL, Brunette CA, Lebo MS, et al. The GenoVA study: Equitable implementation of a pragmatic randomized trial of polygenic-risk scoring in primary care. Am J Hum Genet 2023; 110: 1841–52.
8. Mandla R, Schroeder P, Porneala B, et al. Polygenic scores for longitudinal prediction of incident type 2 diabetes in an ancestrally and medically diverse primary care physician network: a patient cohort study. Genome Med 2024; 16: 63.
9. Fortmann AL, Savin KL, Clark TL, Philis-Tsimikas A, Gallo LC. Innovative Diabetes Interventions in the U.S. Hispanic Population. Diabetes Spectr 2019; 32: 295–301.
10. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet 2019; 51: 584–91.
11. Kullo IJ, Conomos MP, Nelson SC, et al. The PRIMED Consortium: Reducing disparities in polygenic risk assessment. Am J Hum Genet 2024; 111: 2594–606.
12. Kachuri L, Chatterjee N, Hirbo J, et al. Principles and methods for transferring polygenic risk scores across global populations. Nat Rev Genet 2024; 25: 8–25.
13. Kidenya BR, Mboowa G. Inclusiveness of the All of Us Research Program improves polygenic risk scores and fosters genomic medicine for all. Communications Medicine 2024; 4: 227.
14. All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature 2024; 627: 340–6.
15. Mahajan A, Spracklen CN, Zhang W, et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat Genet 2022; 54: 560–72.
16. Vujkovic M, Keaton JM, Lynch JA, et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat Genet 2020; 52: 680–91.
17. Kurki MI, Karjalainen J, Palta P, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 2023; 613: 508–18.
18. 1000 Genomes Project Consortium, Auton A, Brooks LD, et al. A global reference for human genetic variation. Nature 2015; 526: 68–74.
19. Ge T, Chen C-Y, Ni Y, Feng Y-CA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019; 10: 1776.
20. Ruan Y, Lin Y-F, Feng Y-CA, et al. Improving polygenic prediction in ancestrally diverse populations. Nat Genet 2022; 54: 573–80.
21. Wojcik GL, Fuchsberger C, Taliun D, et al. Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies. G3 (Bethesda) 2018; 8: 3255–67.
22. Lambert SA, Wingfield B, Gibson JT, et al. Enhancing the Polygenic Score Catalog with tools for score calculation and ancestry normalization. Nat Genet 2024; 56: 1989–94.
23. Farmaki P, Damaskos C, Garmpis N, Garmpi A, Savvanis S, Diamantis E. Complications of the Type 2 Diabetes Mellitus. Curr Cardiol Rev 2020; 16: 249–51.
24. Crandall JP, Knowler WC, Kahn SE, et al. The prevention of type 2 diabetes. Nat Clin Pract Endocrinol Metab 2008; 4: 382–93.
25. Loh M, Zhang W, Ng HK, et al. Identification of genetic effects underlying type 2 diabetes in South Asian and European populations. Commun Biol 2022; 5: 329.
26. Spracklen CN, Horikoshi M, Kim YJ, et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature 2020; 582: 240–5.
27. Ding Y, Hou K, Xu Z, et al. Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature 2023; 618: 774–81.
28. Drzymalla E, Raffield L, Kolor K, et al. Additive Value of Polygenic Risk Score to Family History for Type 2 Diabetes Prediction: Results From the All of Us Research Database. Diabetes Care 2025; 48: 212–9.
29. Linder JE, Allworth A, Bland HT, et al. Returning integrated genomic risk and clinical recommendations: The eMERGE study. Genet Med 2023; 25: 100006.
30. Ge T, Irvin MR, Patki A, et al. Development and validation of a trans-ancestry polygenic risk score for type 2 diabetes in diverse populations. Genome Med 2022; 14: 70.

