. 2023 Sep 7;55(9):1448–1461. doi: 10.1038/s41588-023-01462-3

GWAS of random glucose in 476,326 individuals provide insights into diabetes pathophysiology, complications and treatment stratification

Vasiliki Lagou 1,2,3,#, Longda Jiang 4,5,#, Anna Ulrich 5,6,#, Liudmila Zudina 5,6, Karla Sofia Gutiérrez González 7,8, Zhanna Balkhiyarova 5,6,9, Alessia Faggian 5,6,10, Jared G Maina 6,11, Shiqian Chen 12, Petar V Todorov 13, Sodbo Sharapov 14,15, Alessia David 16, Letizia Marullo 17, Reedik Mägi 18, Roxana-Maria Rujan 19, Emma Ahlqvist 20, Gudmar Thorleifsson 21, Ηe Gao 22, Εvangelos Εvangelou 22,23, Beben Benyamin 24,25,26, Robert A Scott 27, Aaron Isaacs 7,28,29, Jing Hua Zhao 30, Sara M Willems 7, Toby Johnson 31, Christian Gieger 32,33,34, Harald Grallert 32,34, Christa Meisinger 35, Martina Müller-Nurasyid 36,37,38,39, Rona J Strawbridge 40,41,42, Anuj Goel 1,43, Denis Rybin 44, Eva Albrecht 36, Anne U Jackson 45, Heather M Stringham 45, Ivan R Corrêa Jr 46, Eric Farber-Eger 47, Valgerdur Steinthorsdottir 21, André G Uitterlinden 7,48, Patricia B Munroe 31,49, Morris J Brown 31, Julian Schmidberger 50, Oddgeir Holmen 51, Barbara Thorand 33,34, Kristian Hveem 52, Tom Wilsgaard 53,54, Karen L Mohlke 55, Zhe Wang 56; GWA-PA Consortium, Aleksey Shmeliov 6, Marcel den Hoed 57, Ruth J F Loos 13,56,58, Wolfgang Kratzer 50, Mark Haenle 50, Wolfgang Koenig 59,60,61, Bernhard O Boehm 62, Tricia M Tan 12, Alejandra Tomas 63, Victoria Salem 64, Inês Barroso 65, Jaakko Tuomilehto 66,67,68, Michael Boehnke 45, Jose C Florez 69,70,71, Anders Hamsten 40,41, Hugh Watkins 1,43, Inger Njølstad 53,54, H-Erich Wichmann 33, Mark J Caulfield 31,49, Kay-Tee Khaw 30, Cornelia M van Duijn 7,72,73, Albert Hofman 7,74, Nicholas J Wareham 27, Claudia Langenberg 27,75,76, John B Whitfield 77, Nicholas G Martin 77, Grant Montgomery 77,78, Chiara Scapoli 79, Ioanna Tzoulaki 22,23, Paul Elliott 22,80,81, Unnur Thorsteinsdottir 21,82, Kari Stefansson 21,82, Evan L Brittain 83, Mark I McCarthy 1,84,92, Philippe Froguel 5,11, Patrick M Sexton 85,86, Denise Wootten 85,86, Leif Groop 20,87, Josée Dupuis 44,88, James B Meigs 70,71,89, Giuseppe Deganutti 19, Ayse Demirkan 6,9,90, Tune H Pers 13, 
Christopher A Reynolds 19,91, Yurii S Aulchenko 7,14,15, Marika A Kaakinen 5,6,9, Ben Jones 12, Inga Prokopenko 6,9,11; Meta-Analysis of Glucose and Insulin-Related Traits Consortium (MAGIC)
PMCID: PMC10484788  PMID: 37679419

Abstract

Conventional measurements of fasting and postprandial blood glucose levels investigated in genome-wide association studies (GWAS) cannot capture the effects of DNA variability on ‘around the clock’ glucoregulatory processes. Here we show that GWAS meta-analysis of glucose measurements under nonstandardized conditions (random glucose (RG)) in 476,326 individuals of diverse ancestries and without diabetes enables locus discovery and innovative pathophysiological observations. We discovered 120 RG loci represented by 150 distinct signals, including 13 with sex-dimorphic effects, two cross-ancestry and seven rare frequency signals. Of these, 44 loci are new for glycemic traits. Regulatory, glycosylation and metagenomic annotations highlight ileum and colon tissues, indicating an underappreciated role of the gastrointestinal tract in controlling blood glucose. Functional follow-up and molecular dynamics simulations of lower frequency coding variants in glucagon-like peptide-1 receptor (GLP1R), a type 2 diabetes treatment target, reveal that optimal selection of GLP-1R agonist therapy will benefit from tailored genetic stratification. We also provide evidence from Mendelian randomization that lung function is modulated by blood glucose and that pulmonary dysfunction is a diabetes complication. Our investigation yields new insights into the biology of glucose regulation, diabetes complications and pathways for treatment stratification.

Subject terms: Diseases, Genetics


Genome-wide association analyses of blood glucose measurements under nonstandardized conditions provide insights into the biology of glucose regulation, diabetes complications and pathways for treatment stratification.

Main

Genetic factors are important determinants of glucose homeostasis and type 2 diabetes (T2D) susceptibility. Heritability of both fasting glucose (FG) and T2D is high, at 35–40%1 and 30–60%2, respectively. To date, more than 400 genetic loci have been associated with T2D3,4. Genome-wide association studies (GWAS) for glycemic traits in individuals without diabetes have identified genetic predictors of blood glucose, insulin and other metabolic responses during fasting or after oral or intravenous glucose challenge tests5–8. However, physiological glucose regulation involves responses to diverse nutritional and other stimuli that were, by design, omitted from such studies. Blood glucose is frequently measured at different times throughout the day in clinical practice and research studies (random glucose (RG)). While RG is inherently more variable than standardized measures, we reasoned that, across a very large number of individuals, it gives a more comprehensive representation of complex glucoregulatory processes occurring in different organ systems. Therefore, to identify and functionally validate genetic effects influencing RG, explore its relationships with other traits and diseases, and use these data to provide pathways for T2D treatment stratification, we performed a large-scale cross-ancestry GWAS meta-analysis for RG in individuals without diabetes.

Results

RG GWAS expands the catalog of glycemia-related genetic associations

We undertook RG GWAS in 476,326 individuals without diabetes of European (n = 459,772) and other ancestries (n = 16,554) with adjustment for age, sex and time since last meal (where available), along with the exclusion of extreme hyperglycemia (RG > 20 mmol l−1) and individuals with diabetes (Supplementary Table 1). Covariates were selected after extensive phenotype modeling (Supplementary Note, Supplementary Table 2 and Extended Data Fig. 1a). We identified 150 distinct signals (P < 10−5) by fine mapping through conditional analysis within 120 loci reaching genome-wide significance (P < 5.0 × 10−8; Fig. 1a and Supplementary Tables 3 and 4). Fifty-three RG signals are reported for glycemic traits for the first time, greatly expanding our knowledge about the genetics of glycemia (Tables 1 and 2 and Supplementary Table 3). Adjustment for last meal timing (Extended Data Fig. 1b) did not change effect size estimates while enabling better power for the analysis. Applying the glycated hemoglobin (HbA1c) cut point for diagnosing diabetes (HbA1c ≥ 6.5%) highlighted stronger associations at G6PC2 and GCK lead RG loci (Extended Data Fig. 1c), suggesting their roles in glucose set-point in normoglycemia9. Neither adjustment for body mass index (BMI) nor a more stringent hyperglycemia cut-off (RG > 11.1 mmol l−1; Extended Data Fig. 1d,e) materially changed the magnitude and significance of the RG effect estimates, although when all covariate models were individually applied, 11 additional signals at genome-wide significance were identified (Table 2 and Supplementary Table 5). Despite previous misconceptions that RG is of limited value for genetic discovery because of its inherent variability, our RG GWAS demonstrates that this trait variability has a clear genetic component.

Extended Data Fig. 1. RG trait models tested and sensitivity plots showing the correlations between association analyses beta coefficients and Z-scores from RG models in UKBB.

a, The models were labeled according to covariates included and RG cut-offs used. Individuals were included based on two RG cut-offs: <20 mmol/l to account for the effect of extreme RG values (20) and <11.1 mmol/l (11), which is an established threshold for T2D diagnosis. Hence, model 1 – AS20 refers to adjustment for age and sex, using a cut-off of <20 mmol/l, and so forth. b-e, For c, 4,138 individuals were excluded based on HbA1c ≥ 6.5%, in addition to the self-reported or diagnosed T2D cases. Variants with a heterogeneity P-value ≤ 0.05 (beta-coefficient plot) or a Z-score difference between the two models compared >3 (Z-score plots) are annotated. f, An enrichment plot showing the effect of RG signals (AS20 + AST20 model) on T2D. RG and T2D effect sizes are plotted along the y- and x-axes, respectively. Point size is proportional to the statistical significance of the variant for T2D, with red color indicating previously established signals and blue novel signals, respectively. The dashed line represents the line of best fit. Variants with T2D P-value in the lowest decile are labeled. g, An enrichment plot showing the effects of RG signals (AS20 + AST20 model) on HOMA-B and HOMA-IR. The effect sizes on HOMA-B and HOMA-IR are plotted along the y- and x-axes, respectively. Point size is proportional to the significance of the variant either in HOMA-B or HOMA-IR, depending on which trait has the smaller P value. Red color indicates previously established signals and blue indicates novel signals, respectively. Variants with suggestive significance (P < 5.0 × 10−6) are labeled.

Fig. 1. Summary of all RG loci identified in this study.

a, Circular Manhattan plot summarizing findings from this study. In the outermost layer, gene names of the 133 distinct RG signals are labeled with different colors indicating the following three clusters defined in cluster analysis: 1a/1b, metabolic syndrome; 2a/2b, insulin release versus insulin action (with additional effects on inflammatory bowel disease for cluster 2a) and 3, defects of insulin secretion. Asterisks annotate RG signals that are new for glycemic traits. Track 1 shows RG Manhattan plot reporting −log10(P value) for RG GWAS meta-analysis. Signals reaching genome-wide significance (P < 5.0 × 10−8) are colored in red. Crosses annotate loci that show evidence of sex heterogeneity (Psex-dimorphic < 5.0 × 10−8 and Psex-heterogeneity < 0.05); blue crosses for larger effects in men, green crosses for larger effects in women. Track 2 shows the effects of the 133 independent RG signals on four GIP/GLP-1-related traits GWAS. The colors of the dotted lines indicate four GIP/GLP-1-related traits: gray dot, signals reaching P < 0.010 for a GIP/GLP-1-related trait; red dot, lead SNP has a significant effect on GIP/GLP-1-related trait (Bonferroni corrected P < 1.0 × 10−4). Track 3 shows the effects (−log10(P value)) of the 133 independent RG signals on 113 glycan PheWAS. Track 4 shows the effects (−log10(P value)) of the 133 independent RG signals on 210 gut-microbiome PheWAS. Track 5 shows MetaXcan results for ten selected tissues for RG GWAS meta-analysis; signals colocalizing with genes (Bonferroni corrected P < 9.0 × 10−7) are plotted for each tissue. All P values were calculated from the two-sided z statistics computed by dividing the estimated coefficients by the estimated standard error, without adjustment. b, Credible set analysis of RG associations in the European ancestry meta-analysis. Variants from each of the RG signal credible sets are grouped based on their posterior probability (the percentiles labeled on the sides of the bar). SNP variants with posterior probability >80%, along with their locus names, are provided. All variants from the credible set of lead signals are highlighted in bold.
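The caption above states that P values come from two-sided z statistics, that is, the estimated coefficient divided by its standard error. A minimal Python sketch of that calculation follows; the function name and the use of rounded Table 1 values as input are our assumptions for illustration, and rounding means the result only approximates the tabulated P value.

```python
import math

GENOME_WIDE = 5.0e-8  # genome-wide significance threshold quoted in the text


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided P value for z = beta / se under a standard normal null."""
    z = beta / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate in the tail
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded effect and standard error from the FAM46C lead signal row of Table 1
beta, se = 0.0030, 0.00032
p = two_sided_p(beta, se)
print(f"z = {beta / se:.2f}, P = {p:.1e}, genome-wide significant: {p < GENOME_WIDE}")
```

For very strong signals the result underflows double precision and is returned as 0.0, so such P values would have to be handled on a logarithmic scale.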

Table 1.

New signals for glycemic traits discovered in GWAS meta-analysis of RG levels in up to 459,772 individuals of European ancestries without diabetes

Signal | Nearest gene(s) | Variants | Chr | Position | Type/model | Alleles | EAF | Effect | SE | P value | P het | n
EUR | KDM4A | rs3791033 | 1 | 44,134,077 | lead/7 | T/C | 0.67 | −0.0017 | 0.00031 | 3.9 × 10−8 | 0.58 | 455,267
EUR | FAM46C | rs1966228 | 1 | 118,144,332 | additional/5 | A/G | 0.75 | 0.0032 | 0.00034 | 1.3 × 10−20 | 0.98 | 412,368
EUR | FAM46C | rs17656269 | 1 | 118,162,139 | lead/7 | T/C | 0.33 | 0.0030 | 0.00032 | 4.3 × 10−21 | 0.075 | 455,647
EUR (a) | EDEM3 | rs78444298 | 1 | 184,672,098 | lead/5 | A/G | 0.020 | 0.0076 | 0.0011 | 2.8 × 10−12 | 0.68 | 398,925
EUR | ACVR1C | rs58288813 | 2 | 158,473,008 | lead/5 | T/C | 0.95 | 0.0037 | 0.00066 | 2.3 × 10−8 | 0.0073 | 415,629
EUR | ACVR1C | rs2848657 | 2 | 158,495,349 | additional/7 | A/T | 0.13 | 0.0026 | 0.00044 | 2.4 × 10−9 | 0.13 | 454,031
EUR | RBMS1 | rs12692596 | 2 | 161,265,910 | lead/7 | T/C | 0.37 | 0.0019 | 0.00030 | 1.2 × 10−9 | 0.84 | 457,182
EUR (a) | NEUROD1 | rs8192556 | 2 | 182,542,998 | lead/7 | T/G | 0.024 | 0.0053 | 0.00096 | 3.0 × 10−8 | 0.50 | 418,468
EUR | CACNA2D3 | rs34222465 | 3 | 55,123,055 | lead/1 | A/G | 0.56 | −0.0019 | 0.00030 | 3.7 × 10−10 | 0.055 | 418,498
EUR | TRIM59, KPNA4 | rs9799314 | 3 | 160,082,071 | lead/7 | T/C | 0.47 | 0.0018 | 0.00030 | 1.1 × 10−9 | 0.025 | 439,182
EUR | MECOM | rs73174306 | 3 | 169,194,244 | lead/5 | A/T | 0.96 | −0.0059 | 0.00074 | 1.3 × 10−15 | 1.00 | 393,841
EUR | LCORL | rs1503884 | 4 | 18,207,538 | lead/5 | T/G | 0.56 | −0.0018 | 0.00030 | 8.8 × 10−10 | 0.65 | 414,134
EUR | SCD5 | rs4693043 | 4 | 83,563,582 | lead/7 | A/G | 0.14 | 0.0023 | 0.00042 | 2.9 × 10−8 | 0.66 | 456,696
EUR | ADRB2 | rs71584073 | 5 | 148,149,418 | lead/5 | T/C | 0.92 | 0.0038 | 0.00056 | 1.7 × 10−11 | 0.91 | 398,925
EUR | SYNGAP1 | rs9461856 | 6 | 33,395,199 | lead/1 | A/G | 0.49 | −0.0017 | 0.00030 | 4.9 × 10−9 | 0.29 | 436,654
EUR | ARMC2, SESN1 | rs118126621 | 6 | 109,304,170 | lead/5 | A/G | 0.025 | 0.0055 | 0.00098 | 2.3 × 10−8 | 0.029 | 393,841
EUR | PEX7 | rs7756291 | 6 | 137,235,325 | lead/7 | T/C | 0.55 | −0.0016 | 0.00030 | 4.4 × 10−8 | 0.47 | 434,769
EUR | POP7, EPO | rs221798 | 7 | 100,287,495 | lead/5 | C/G | 0.11 | −0.0030 | 0.00047 | 7.1 × 10−11 | 0.78 | 415,738
EUR | PRKAR2B | rs3801969 | 7 | 106,711,492 | lead/1 | T/G | 0.44 | 0.0017 | 0.00030 | 1.2 × 10−8 | 0.47 | 458,102
EUR | A1CF | rs61856594 | 10 | 52,637,925 | lead/7 | A/G | 0.70 | 0.0022 | 0.00032 | 7.3 × 10−12 | 0.59 | 451,966
EUR | ADRA2A | rs11195538 | 10 | 113,117,650 | additional/5 | T/C | 0.93 | 0.0031 | 0.00060 | 2.3 × 10−7 | 0.21 | 403,260
EUR | TCF7L2 | rs144155527 | 10 | 114,737,633 | additional/5 | T/C | 0.019 | −0.0061 | 0.0011 | 3.5 × 10−8 | 0.33 | 398,925
EUR | USP47 | rs11022029 | 11 | 11,806,317 | lead/5 | T/C | 0.85 | 0.0023 | 0.00042 | 3.4 × 10−8 | 0.75 | 414,134
EUR | PDE3B | rs141521721 | 11 | 14,763,828 | lead/5 | A/C | 0.024 | 0.0054 | 0.00098 | 2.6 × 10−8 | 0.38 | 398,925
EUR | OR4A5 | rs72913090 | 11 | 50,653,357 | lead/5 | A/C | 0.92 | 0.0033 | 0.00055 | 2.7 × 10−9 | 1.0 | 380,422
EUR | TRIM48 | rs150587121 | 11 | 55,036,391 | lead/5 | T/C | 0.91 | 0.0030 | 0.00054 | 3.3 × 10−8 | 0.12 | 396,388
EUR | OR8K3, OR8K1 | rs2170441 | 11 | 56,095,739 | lead/5 | A/G | 0.078 | −0.0032 | 0.00056 | 9.5 × 10−9 | 0.57 | 398,925
EUR | CCND2 | rs3217791 | 12 | 4,384,669 | additional/7 | T/C | 0.074 | −0.0032 | 0.00059 | 8.2 × 10−8 | 0.69 | 393,841
EUR | SOX5 | rs12581677 | 12 | 24,060,732 | lead/5 | A/G | 0.91 | 0.0032 | 0.00053 | 3.1 × 10−9 | 0.10 | 414,063
EUR | MANSC4, KLHL42 | rs11049144 | 12 | 27,931,511 | lead/5 | A/C | 0.22 | −0.0022 | 0.00036 | 1.2 × 10−9 | 0.012 | 413,498
EUR | RNF6 | rs12874929 | 13 | 26,781,607 | lead/1 | A/G | 0.77 | −0.0026 | 0.00035 | 5.5 × 10−14 | 1.0 | 456,162
EUR | SPRY2 | rs4884144 | 13 | 80,678,136 | lead/5 | A/G | 0.67 | 0.0019 | 0.00032 | 1.2 × 10−9 | 0.38 | 411,619
EUR | HERC1 | rs67507374 | 15 | 64,038,340 | additional/5 | A/T | 0.31 | −0.0024 | 0.00032 | 8.9 × 10−14 | 0.28 | 415,015
EUR | HNF1B | rs10908278 | 17 | 36,099,952 | lead/5 | A/T | 0.52 | −0.0019 | 0.00030 | 2.3 × 10−10 | 0.39 | 398,925
EUR (b) | NMT1 | rs2239923 | 17 | 43,176,804 | lead/1 | T/C | 0.29 | 0.0020 | 0.00030 | 1.1 × 10−9 | 0.54 | 458,104
EUR | WIPI1 | rs2952295 | 17 | 66,447,421 | lead/5 | A/T | 0.23 | 0.0024 | 0.00035 | 4.5 × 10−12 | 0.14 | 398,925
EUR | SKA1, MAPK4 | rs2957989 | 18 | 48,075,733 | lead/1 | A/G | 0.82 | 0.0021 | 0.00039 | 3.4 × 10−8 | 0.67 | 437,935
EUR | RALY | rs7274168 | 20 | 32,435,978 | lead/1 | T/C | 0.48 | 0.0018 | 0.00030 | 4.5 × 10−9 | 0.75 | 443,728
EUR | HNF4A | rs2267850 | 20 | 43,524,963 | lead/7 | T/C | 0.27 | −0.0021 | 0.00033 | 6.2 × 10−10 | 0.92 | 437,057
EUR | TSHZ2 | rs2255805 | 20 | 51,627,634 | lead/5 | T/C | 0.58 | −0.0019 | 0.00030 | 1.5 × 10−10 | 0.90 | 414,134
EUR | STX16–NPEPL1 | rs61285514 | 20 | 57,283,828 | lead/7 | A/G | 0.77 | 0.0021 | 0.00035 | 2.3 × 10−9 | 0.24 | 451,642
EUR | EEF1A2, PPDPF | rs6122466 | 20 | 62,139,177 | lead/5 | A/G | 0.86 | −0.0026 | 0.00043 | 7.8 × 10−10 | 0.70 | 405,111

A lead signal was annotated as ‘EUR’ if it reached genome-wide significance (P < 5.0 × 10−8) in the meta-analysis of European ancestry cohorts in either of our two models of interest with adjustment for age, sex with or without time since last meal (where available) along with the exclusion of extreme hyperglycemia (RG > 20 mmol l−1) or in their combination. Additional distinct signals with a region-wide threshold of P ≤ 1.0 × 10−5 are also reported. Effects and P values reported are from the model indicated in column ‘type/model’ (1, AS20; 5, AST20; 7, AS20 + AST20). Heterogeneity among studies was assessed using the I² index.

a, Nonsynonymous variants.

b, Synonymous variants.

Alleles, effect/other; Chr, chromosome; EAF, effect allele frequency (frequency of allele, for which beta is reported); EUR, individuals of European ancestry; Pos, position GRCh37.

Table 2.

New signals for glycemic traits discovered through UK Biobank (UKBB) (European ancestry only) GWAS in other RG models, UKBB (European ancestry only) GWAS on rare variants and cross-ancestry meta-analysis of up to 476,326 individuals of European or other ancestries (Black, Indian, Pakistani and Chinese) in UKBB

Signal | Nearest gene(s) | Variants | Chr | Position | Type/model | Alleles | EAF | Effect | SE | P value | P het | n
UKBB | PEX7 | rs7756291 | 6 | 137,235,325 | lead/6 | C/T | 0.45 | 0.0018 | 0.00030 | 3.0 × 10−9 | | 379,291
UKBB | INAFM2 | rs882829 | 15 | 40,607,689 | lead/2 | C/G | 0.92 | 0.0032 | 0.00057 | 1.6 × 10−8 | | 379,301
UKBB | INAFM2, C15orf52 | rs4143838 | 15 | 40,622,374 | lead/3 | T/C | 0.95 | −0.0039 | 0.00070 | 1.8 × 10−8 | | 379,947
UKBB | ADCY9, SRL | rs2018506 | 16 | 4,227,922 | lead/6 | C/G | 0.85 | −0.0023 | 0.00042 | 2.2 × 10−8 | | 379,291
UKBB | ERN1 | rs58642235 | 17 | 62,202,689 | lead/5 | T/C | 0.86 | −0.0024 | 0.00044 | 4.5 × 10−8 | | 380,422
UKBB (a) | WIPI1 | rs883541 | 17 | 66,449,122 | In LD with lead/6 | G/A | 0.23 | 0.0023 | 0.00036 | 5.5 × 10−11 | | 380,422
UKBB (b) | RFX1 | rs2305780 | 19 | 14,083,761 | lead/4 | T/C | 0.54 | 0.0016 | 0.00029 | 1.5 × 10−8 | | 378,819
UKBB, rare (a) | ANKH | rs146886108 | 5 | 14,751,305 | rare/1 | T/C | 0.0072 | −0.012 | 0.0018 | 3.2 × 10−12 | | 380,432
Cross-anc | RRNAD1 | rs3806415 | 1 | 156,698,265 | lead/5 | T/C | 0.32 | −0.0017 | 0.00031 | 3.6 × 10−8 | 0.51 | 476,326
Sex-dim (w) | SGIP1 | rs7544505 | 1 | 66,998,618 | lead/5 | T/C | 0.84 | −0.0030 | 0.00053 | 1.8 × 10−8 | 0.019 | 207,903
Sex-dim (m) | SGIP1 | rs7544505 | 1 | 66,998,618 | lead/5 | T/C | 0.84 | −0.0010 | 0.00063 | 0.10 | | 172,529
Sex-dim (w) | POP7, EPO | rs534043 | 7 | 100,312,724 | lead/5 | A/G | 0.11 | −0.0018 | 0.00061 | 0.0029 | 0.0040 | 207,903
Sex-dim (m) | POP7, EPO | rs534043 | 7 | 100,312,724 | lead/5 | A/G | 0.11 | −0.0046 | 0.00073 | 4.8 × 10−10 | | 172,529
Sex-dim (w) | SLC43A2 | rs56405641 | 17 | 1,528,464 | lead/5 | C/T | 0.91 | −0.0040 | 0.00067 | 2.0 × 10−9 | 1.4 × 10−4 | 207,903
Sex-dim (m) | SLC43A2 | rs56405641 | 17 | 1,528,464 | lead/5 | C/T | 0.91 | −4.1 × 10−5 | 0.00081 | 0.96 | | 172,529

Loci showing sex-dimorphic effects on glycemic trait levels for the first time are also shown.

A signal was annotated as ‘UKBB’ if it reached genome-wide significance (P < 5.0 × 10−8) in UKBB (European ancestry) in any of the six RG models. A signal was annotated as ‘UKBB, rare’ if it reached genome-wide significance (P < 5.0 × 10−8) in UKBB (European ancestry) analysis for rare variants. Additional distinct signals with a region-wide threshold of P ≤ 1.0 × 10−5 are also reported. Effects and P values reported are from the model indicated in column ‘type/model’ (2, ASB20; 3, AS11; 4, ASB11; 5, AST20; 6, ASTB20). Heterogeneity among studies was assessed using the I² index. P het values for the sex-dimorphic variants are from Cochran’s Q test (for sex heterogeneity representing the differences in allelic effects between sexes). Sex-dimorphic P values (2 degrees of freedom test of association assuming different effect sizes between the sexes) for the SGIP1, POP7/EPO and SLC43A2 variants were 3.2 × 10−8, 4.3 × 10−11 and 1.5 × 10−8, respectively.

a, Nonsynonymous variants.

b, Synonymous variants.

Cross-anc, cross-ancestry; Sex-dim (m), sex-dimorphic results for men; Sex-dim (w), sex-dimorphic results for women.
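The two sex-specific tests named in the Table 2 notes can be sketched in a few lines of Python. This is a simplified illustration that assumes independent female and male estimates and uses the closed-form chi-square tail probabilities for 1 and 2 degrees of freedom. The function names are ours, and whether the study's software uses exactly these forms is not stated in this section.

```python
import math


def sex_dimorphic_p(beta_w: float, se_w: float, beta_m: float, se_m: float) -> float:
    """2-df association test allowing different effects in women and men."""
    stat = (beta_w / se_w) ** 2 + (beta_m / se_m) ** 2
    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2)
    return math.exp(-stat / 2.0)


def sex_heterogeneity_p(beta_w: float, se_w: float, beta_m: float, se_m: float) -> float:
    """Cochran's Q for two strata, which reduces to a 1-df difference test."""
    q = (beta_w - beta_m) ** 2 / (se_w ** 2 + se_m ** 2)
    # Chi-square survival function with 1 degree of freedom is erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(q / 2.0))


# Rounded SGIP1 (rs7544505) estimates from Table 2: women first, then men
args = (-0.0030, 0.00053, -0.0010, 0.00063)
print(f"sex-dimorphic P = {sex_dimorphic_p(*args):.1e}")
print(f"sex-heterogeneity P = {sex_heterogeneity_p(*args):.2g}")
```

With the rounded table inputs, both results are of the same order as the values reported for SGIP1; exact agreement is not expected because the published estimates are rounded.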

A number of signals identified in individuals of European ancestry showed nominal significance (P < 0.05) in other ancestry groups, including new loci MANSC4/KLHL42 in African, FAM46C and ACVR1C in Indian and RBMS1 in Chinese ancestry groups (Supplementary Table 3). All such signals, except rs540524 at G6PC2, rs183606969 at GCK and rs6006399 at MTMR3/HORMAD2, were directionally concordant across ancestries. At GCK, rs2908286 (r² = 0.83 in 1000 Genomes, all ancestries, with rs2971670, the lead variant in European ancestry individuals) was genome-wide significant in the African ancestry individuals alone (Supplementary Table 6). Cross-ancestry meta-analyses combining European and the other four ancestral groups revealed two new RG signals at RRNAD1 and PROX1 (Table 2 and Supplementary Table 6). Overall, while being only 16,554 individuals larger in sample size than the European ancestry meta-analysis, the cross-ancestry analysis expanded the new locus discovery for RG, confirming the potential of cross-ancestry studies for complex trait genetics.

The strongest associations with RG were detected at G6PC2 (P < 1.0 × 10−746) and GCK (P < 3.7 × 10−277), established loci for FG and with key roles in gluconeogenesis10 and glucose sensing11, respectively (Supplementary Table 3). Notably, only two-thirds of RG signals overlapped with T2D-associated loci (Extended Data Fig. 1f), including three new loci for glycemia (SCD5, RNF6 and TSHZ2). The direction of effects at these loci between RG, T2D and homeostasis model assessment of β-cell function/insulin resistance (HOMA-B/HOMA-IR)6 (Extended Data Figs. 1f,g and 2 and Supplementary Table 7) were consistent with their epidemiological correlation. We also discovered sex dimorphism at 13 RG loci, including male-specific PRDM16 and RSPO3, and female-specific SGIP1, SRRM3 and SLC43A2 (Table 2, Fig. 1a and Supplementary Tables 3 and 8). We conclude that sex dimorphism, characterizing over one-tenth of RG-associated loci, is a widespread feature of glucose metabolism.

Extended Data Fig. 2. Enrichment plots showing the effect of RG signals (AS20 + AST20 model) on glycemic and respiratory-related phenotypes.

a-i, Look-up of effects was done in previously published genome-wide association studies for HbA1c (a), fasting glucose (b), fasting insulin (c), type 2 diabetes (d), forced expiratory volume in one second (FEV1) (e), forced vital capacity (FVC) (f), FEV1/FVC (g), lung cancer (h) and squamous cell lung cancer (i). RG and other phenotype effect sizes are plotted along the y- and x-axes, respectively. Point size and color are proportional to the significance of the variant in each phenotype, with red indicating higher and blue lower significance, respectively. The dashed line represents the line of best fit. P < 5.0 × 10−8 was considered statistically significant after adjusting for multiple testing. Two-tailed P-values are reported. Variants with P-values in the lowest decile are labeled.

Coding, rare and causal variants in RG variability

The lead variants at two new RG loci (NMT1 and RFX1) and three previously reported loci for FG (TET2, THADA and RREB1) were all coding common (minor allele frequency (MAF) ≥ 5%) variants (Supplementary Table 3 and Extended Data Fig. 3). Additionally, lead RG-associated SNPs at glucagon-like peptide-1 receptor (GLP1R), neuronal differentiation 1 (NEUROD1) and ER degradation enhancing α-mannosidase like protein 3 (EDEM3) loci in our analysis were low-frequency (5% > MAF ≥ 1%) coding variants (Table 1, Supplementary Table 3 and Extended Data Fig. 3). NEUROD1 and EDEM3 are plausible candidates for glucose homeostasis, with the former reported for glucosuria12 and the latter linked to renal function13,14. Within the rare allele frequency range (1% > MAF ≥ 0.001%), we first identified 30 RG loci and validated seven in whole-exome sequencing (WES) UK Biobank (UKBB) data (Supplementary Note). These included noncoding variant associations, such as rs2096313127 at CAMK2B (Supplementary Table 9), and the synonymous rs2232324 in G6PC2 (Table 2 and Supplementary Table 9). We expanded the annotation of coding nonsynonymous independent (r² < 0.0010 in 1000 Genomes, all ancestries) rare variant signals associated with RG to the new nondeleterious rs146886108 (Arg187Gln) in ANKH15 and to deleterious variants, including three in G6PC2 with predicted (rs2232326) and established (rs138726309, rs2232323)16 effects (Supplementary Table 9). Thus, a range of coding and rare variants contributes to RG level variability and can be detected in very large genetic studies.
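The three frequency classes used in this paragraph can be written down as a small Python helper. This is an illustrative sketch: the function name and return labels are ours, and the minor allele frequency is derived from the effect allele frequency (EAF) reported in Tables 1 and 2.

```python
def frequency_class(eaf: float) -> str:
    """Classify a variant by minor allele frequency using the cut-offs in the text."""
    maf = min(eaf, 1.0 - eaf)  # the minor allele is whichever allele is less frequent
    if maf >= 0.05:
        return "common"
    if maf >= 0.01:
        return "low-frequency"
    if maf >= 0.00001:  # 0.001% lower bound of the rare range
        return "rare"
    return "below the rare range"


# EAF values taken from Tables 1 and 2
for gene, eaf in [("KDM4A", 0.67), ("EDEM3", 0.020), ("MECOM", 0.96), ("ANKH", 0.0072)]:
    print(gene, frequency_class(eaf))
```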

Extended Data Fig. 3. LocusZoom plots of common variants in UKBB (Europeans) meta-analysis for RG.

a-h, Plots are shown for GCKR (a), TET2 (b), RREB1 (c), NMT1 (d) and WIPI1 (e) loci and low-frequency coding variants at EDEM3 (f), NEUROD1 (g) and GLP1R (h) loci. The x-axis shows the chromosomal position, and the y-axis shows the uncorrected two-sided −log10 P values from the UKBB GWAS conducted using linear mixed-modeling in BOLT-LMM. Horizontal line corresponds to P = 5 × 10−8 and blue peaks show the recombination rate.

Next, we sought to pinpoint the most plausible set of causal variants by calculating 99% credible sets for each RG locus. In the European ancestry-only analysis, 15 RG signals were explained by one variant with a posterior probability of ≥99% of being causal, including low-frequency variants in GLP1R, G6PC2, MECOM and CCND2 (ref. 17), and common variants in LMO1 and CACNA2D3 (Fig. 1b and Supplementary Table 10a). For another 16 signals, such as at RMST, FOXN3 and ADRA2A, a lead variant had a posterior probability ≥80%. Credible sets at WIPI1, GCKR, TET2, RREB1 and RFX6 included coding common variants. RREB1 and RFX6 encode transcription factors implicated in the development and function of pancreatic β cells18,19. The credible sets were narrowed down for several signals in cross-ancestry RG meta-analysis (European ancestry median credible set size = 12.0 and cross-ancestry = 12.0), with improvements observed at DGKB and TP53INP1 lead signals (Supplementary Table 10b,c). These analyses highlight examples of validated and potential targets for therapeutic development15.
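To make the notion of a 99% credible set concrete, the following Python sketch builds one from per-variant posterior probabilities by adding variants in decreasing order of probability until the cumulative total reaches the target coverage. It assumes the posteriors have already been computed for a single signal; the variant names and probabilities are hypothetical, and the fine-mapping method that produced the study's posteriors is not described in this section.

```python
def credible_set(posteriors: dict[str, float], coverage: float = 0.99) -> list[str]:
    """Smallest set of top-ranked variants whose summed posterior reaches `coverage`."""
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: list[str] = []
    total = 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        total += probability
        if total >= coverage:
            break
    return chosen


# Hypothetical signals: one dominated by a single variant, one spread over several
single = {"var_a": 0.995, "var_b": 0.003, "var_c": 0.002}
spread = {"var_a": 0.85, "var_b": 0.10, "var_c": 0.045, "var_d": 0.005}
print(credible_set(single))  # one variant explains the signal
print(credible_set(spread))  # three variants are needed for 99% coverage
```

A signal "explained by one variant with a posterior probability of ≥99%" corresponds to the first case, where the credible set has size one.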

Characterization of RG-associated GLP1R coding variants provides a framework for T2D treatment stratification

Following annotation and definition of likely causal variants, for functional studies, we prioritized GLP1R, which encodes a class B1 GPCR (GLP-1R) important in blood glucose and appetite regulation and a well-established target of the T2D drugs exenatide (exendin-4) and semaglutide20. We used RG data to validate an experimental framework for predicting individual responses to GLP-1R agonists, as this would be a major asset in clinical practice and is currently lacking. Within GLP1R, the lead missense variant at rs10305492 (A316T) has a strong (0.058 mmol l−1 per allele) RG-lowering effect, second by size only to G6PC2 locus variants, and is also associated with FG/T2D21,22.

We functionally tested the impact of rs10305492 (A316T) and 16 other GLP1R coding variants detected in the UKBB dataset, with effect allele frequency ranging from common (G168S, rs6923761, P = 5.20 × 10−5 in the RG GWAS meta-analysis) to rare (R421W, rs146868158, P = 0.036 in the RG GWAS meta-analysis), by measuring GLP-1-induced recruitment of mini-Gαs23 in HEK293 cells stably expressing wild-type (WT) or variant GLP-1R. This approach captures the most proximal part of the Gαs-adenylate cyclase-cyclic adenosine monophosphate pathway, which links GLP-1R activation to insulin secretion. With correction for differences in cell surface expression determined using SNAP-tag labeling24, mini-Gs-coupling efficiency was indeed predictive of the RG effect for these variants (Fig. 2a and Supplementary Table 11), thereby linking experimentally measured GLP-1R function in vitro to blood glucose homeostasis. This relationship was assessed in UKBB WES data (Supplementary Note and Extended Data Fig. 4).
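The relationship between receptor function and RG effect is tested with a weighted linear regression. A minimal Python sketch of weighted least squares for a single predictor is given below. All input values are hypothetical, and using the minor allele frequency directly as the weight is our assumption, since the exact weighting scheme is defined in the Methods rather than in this section.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Weighted coefficient of determination
    ss_res = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical variants: expression-corrected mini-Gs response (x),
# RG effect estimate (y) and minor allele frequency used as weight (w)
response = [-1.2, -0.4, 0.0, 0.3, 0.9]
rg_effect = [0.040, 0.010, 0.002, -0.012, -0.030]
maf = [0.0005, 0.002, 0.30, 0.01, 0.015]
slope, intercept, r2 = weighted_linear_fit(response, rg_effect, maf)
print(f"slope = {slope:.3f}, intercept = {intercept:.4f}, weighted R2 = {r2:.2f}")
```

Weighting by frequency lets well-estimated common variants dominate the fit, while rare variants with noisy RG estimates contribute less.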

Fig. 2. Functional and structural analysis of coding GLP1R variants.

a, Minor allele frequency-weighted linear regression was used to test if mini-Gs response to GLP-1 stimulation substantially predicted point estimates of GLP1R variant effect on RG levels (AST20 βRG as estimated in the UKBB study, nmax = 401,810). Mini-Gs response to GLP-1 stimulation was corrected for variant surface expression (nmax = 22, exact n for each variant is provided in Supplementary Table 11). Error bars extend one standard error above and below the point estimate. Size of the dots is proportional to the weight applied in the regression model. The regression results (coefficient of determination R² = 0.74, F(1, 15) = 47.5, P = 5.1 × 10−6) suggest that mini-Gs coupling in response to GLP-1 stimulation predicts the effect of these coding variants on RG levels (AST20 βRG = −0.030; 95% confidence interval (CI) = −0.039 to −0.020; P = 5.1 × 10−6). The gray shaded area around the regression line corresponds to the 95% CI of predictions from the model. Variants in red showed no detectable surface expression (NDE) and are not included in regression analysis. b, Mean GLP1R variant mini-Gs coupling and receptor endocytosis, with surface expression correction, in response to GLP-1, OXM, glucagon (GCG), exendin-4 (Ex4), semaglutide (Sema) and tirzepatide (TZP), n = 6. Positive deviation indicates variant gain-of-function, with statistical significance inferred when the 95% CIs shown do not cross zero. Responses are also compared between pathways by unpaired t test, with an asterisk indicating statistically significant differences. c, Architecture of the complex formed between the agonist-bound GLP-1R and Gs; the likely effect triggered by residues involved in GLP-1R isoforms A316T, G168S and R421W (in magenta) are reported. d, Distributions of the distance between Y242^3.45 side chain and P312^5.42 backbone computed during molecular dynamics simulations of GLP-1R WT and A316T; the cut-off distance for hydrogen bond is shown. e, Difference in the hydrogen bond network between GLP-1R WT and A316T. f, Analysis of water molecules within the TMD of GLP-1R WT and A316T suggests minor changes in the local hydration of position 5.46 (unperturbed structural water molecule). Also, a stabilizing role for the water molecules at the binding site of the G protein (water cluster α5) cannot be ruled out. g, Distributions of the distance between position 168^1.63 and Y178^2.48 during molecular dynamics simulations of GLP-1R WT and G168S. h, During molecular dynamics simulations, the GLP-1R isoform S168G showed increased flexibility of ICL1 and H8 compared to WT, suggesting a different influence on G-protein intermediate states. i, Contact differences between Gs and GLP-1R WT or W421R; the C terminal of W421R H8 made more interactions with the N terminal segment of the Gs β subunit. j, Mini-Gs and GLP-1R endocytosis responses to 20 nM exendin-4, plotted against surface GLP-1R expression, from 196 missense GLP1R variants transiently transfected in HEK293T cells (n = 5 repeats per assay), with data represented as mean ± s.e.m. after normalization to WT response and log10-transformation. Variants are categorized as ‘LoF1’ when the response 95% CI falls below zero or ‘LoF2’ where the expression-normalized 95% CI falls below zero. k, GLP-1R snake plot created using gpcr.com summarizing the functional impact of missense variants; for residues with >1 variant, classification is applied as LoF2 > LoF1 > tolerated.

Extended Data Fig. 4. Association analysis of GLP-1 receptor function and random glucose effects of coding variants.

Minor allele frequency-weighted linear regression was used to test if mini-Gs response to GLP-1 stimulation significantly predicted point estimates of GLP1R variant effect on RG levels (AST20 βRG as estimated in whole-exome sequencing data from the UKBB study). Mini-Gs response to GLP-1 stimulation was corrected for variant surface expression (nmax = 22, exact n for each variant is provided in Supplementary Table 11). Error bars extend one standard error above and below the point estimate. Size of the dots is proportional to the weight applied in the regression model (Methods). The regression results (coefficient of determination R2 = 0.56, F(1, 14) = 20.1, P = 5.2 × 10−4) suggest that mini-Gs coupling in response to GLP-1 stimulation predicts the effect of these coding variants on RG levels (AST20 βRG = − 0.028; 95% CI = −0.042 to −0.015; P = 5.2 × 10−4). The gray shaded area around the regression line corresponds to the 95% confidence interval of predictions from the model. Variants in red showed no detectable surface expression (NDE) and are not included in regression analysis.

Focusing on the two directly genotyped GLP1R missense variants in UKBB, we also measured mini-Gs responses to several endogenous and pharmacological GLP-1R agonists, observing that A316T (rs10305492-A) showed increased responses and R421W (rs146868158-T) showed reduced responses, to all ligands except exendin-4 (both variants) and semaglutide (A316T only), in line with their RG effects (Fig. 2b). Interestingly, for the late-stage T2D candidate tirzepatide, which has pronounced ‘biased agonism’ at GLP-1R25, the difference between A316T and R421W amounted to a nearly tenfold difference in activity. The common G168S variant, with a relatively small RG-lowering effect (β = −0.0013, s.e. = 3.1 × 10−4), also showed increases in function with pharmacological agonist stimulation. As GLP-1R undergoes extensive agonist-induced endocytosis, a process that modulates the subcellular origin and temporal dynamics of receptor signaling26, we also assessed the endocytic characteristics of A316T, G168S and R421W variants using high content microscopy. Here the most notable observation was that agonist-induced GLP-1R endocytosis with R421W was normal despite its signaling deficit, suggesting a specific alteration to how this variant couples to downstream effectors24. These results, supported by RG data and clinical observations27,28, suggest that in vitro assessments can provide valuable insights into the optimal selection of GLP-1R treatment according to genotype.

Next, we performed molecular dynamics (MD) simulations of human GLP-1R bound to oxyntomodulin (OXM)29 to gain structural insights into the above-described GLP1R variant effects. A316T has a single amino acid substitution in the core of the receptor transmembrane (TM) domain (Fig. 2c) that leads to an alteration of the nearby hydrogen bond network that normally serves to stabilize the GLP-1R inactive state (Supplementary Video 1). Specifically, in A316T, residue T316^5.46 replaces Y242^3.45 (superscripts, shown here after ^, follow the generic GPCR class B1 numbering system of ref. 30, in which the number before the dot indicates the TM helix and the number after the dot gives the sequence position relative to the most conserved residue of that helix, designated 50) in a persistent hydrogen bond with the backbone of P312^5.42, one turn of the helix above T316^5.46 (Fig. 2d,e and Supplementary Video 1). This triggers a local structural rearrangement that could transmit to the intracellular G-protein-binding site through TM3 and TM5, thereby enhancing G-protein coupling. A water molecule is close to position 5.46 in both A316T and WT (water cluster α5; Fig. 2f). Notably, the same water bridges the backbone of Y241^3.44 and A316^5.46 in WT or the backbone of Y241^3.44 and the side chain of T316^5.46 in A316T. Given the importance of conserved water networks in the activation of class A GPCRs31,32, the stability of the hydrated spot close to position 5.46 corroborates the importance of this site for GLP-1R effects. In analogy with A316T, simulations with the G168S variant indicated the formation of a stable new hydrogen bond between the side chain of residue S168^1.63 and A164^1.59, one turn above on the same helix (Fig. 2g and Supplementary Video 2). This moves the C-terminal end of TM1 closer to TM2 and reduces the overall flexibility of intracellular loop 1 (ICL1; Fig. 2h), altering the role of ICL1 in G-protein activation. 
In contrast to A316T and G168S, the site of variant R421W is consistent with persistent interactions with the G protein, and simulations predicted a propensity of R421W to interact with a different region of the G-protein β-subunit compared to WT (Fig. 2i). These results capture the full range of structural features in the current active GLP-1R models and provide clear clues about the dynamics of A316T and other GLP-1R variants, compared to early models that did not benefit from the structural insights obtained from cryo-electron microscopy22.

For a broader view of the impact of GLP1R coding variation, we screened an additional 178 missense variants identified from exome sequencing33 for exendin-4-induced mini-Gs coupling and endocytosis by transient transfection in HEK293T cells (Supplementary Note, Fig. 2j,k and Supplementary Table 12). In total, 110 variants showed a reduced response in either or both pathways (‘LoF1’) and 67 displayed a specific response deficit that was not fully explained by differences in GLP-1R surface expression (‘LoF2’). Many of these defects were larger than those observed in the analysis in Fig. 2a and are likely to result in a major loss of GLP-1R function, meaning that patients carrying these variants are less likely to benefit from GLP-1R agonist drug treatment.

Functional annotation of RG associations and intestinal health

Previous T2D and glycemic trait GWAS have primarily implicated pancreatic, adipose and liver tissues3. We performed a range of complementary functional annotation analyses by leveraging our RG GWAS results to identify additional cell and tissue types with etiological roles in glucose metabolism. Data-driven expression prioritized integration for complex traits (DEPICT)34, which predicts enriched tissue types from prioritized gene sets, highlighted intestinal tissues including ileum and colon, as well as pancreas, adrenal glands5, adrenal cortex and cartilage (false discovery rate, FDR < 0.20; Fig. 3a,b and Supplementary Table 13). Similarly, CELL type expression-specific integration for complex traits (CELLECT)35, which facilitates cell type prioritization based on single-cell RNA-sequencing (scRNA-seq) datasets, ranked large intestinal tissue second only to pancreatic cell types (Fig. 4 and Supplementary Table 14). Interestingly, RG variants were related particularly to enriched expression in pancreatic polypeptide cells, exceeding even the more conventionally implicated insulin-secreting β cells. Supporting evidence was obtained from transcriptome-wide association study (TWAS) analysis, where we identified a total of 216 (119 unique) significant genetically driven associations across the ten tested tissues (Supplementary Table 15a); 51 (25 unique) of the highlighted genes are located at genome-wide significant RG loci (Supplementary Table 15). TWAS signals in skeletal muscle5 showed the largest overlap with RG signals, such as GPSM1 (ref. 36) and WARS. The combined results from ileum and colon also showed high enrichment, including the new NMT1 and the established FADS1/3 and MADD genes (Fig. 1a and Supplementary Table 15). 
Expression quantitative trait locus (eQTL) colocalization analyses, using eQTLGen whole blood expression data from 31,684 individuals37,38 and the COLOC2 approach, identified 14 loci with strong links (posterior probability >70%) to gene expression data, including TET2 (ref. 39), KCNJ11, KLHL42, IKBKAP and CAMK1D, with transcriptional effects in pancreatic islets and kidney mesangial cells (Supplementary Table 16). Similar analyses of human pancreatic islets regulatory variation in the translational human pancreatic islet genotype tissue-expression resource (TIGER) dataset38 defined 58 loci with strong statistical support for colocalization of the effects on RG and tissue expression of ADCY5, RNF6, FADS1, MADD and STARD10 (ref. 40), in addition to KLHL42 and CAMK1D, with the latter overlapping in whole blood. Moreover, epigenetic annotations using the GARFIELD tool highlighted significant (P < 2.5 × 10−5) enrichment of RG-associated variants in the fetal large intestine, as well as blood, liver and other tissues (Extended Data Fig. 5 and Supplementary Table 17). Adult intestinal tissues are not available in GARFIELD except for colon. Prompted by multiple analyses highlighting a potential role for the digestive tract in glucose regulation, we assessed the overlap between our signals and those from the latest gut-microbiome GWAS41 and identified two genera sharing signals and direction of effect with RG at one locus: Collinsella and Lachnospiraceae-FCS020 at ABO-FUT2 (Fig. 1a and Supplementary Table 18). The ABO-FUT2 locus effects on RG could be mediated by the abundance of Collinsella/Lachnospiraceae-FCS020, producing glucose from lactose and galactose42. The Collinsella genus affects gut permeability via interleukin-17A43 and shows higher abundance in individuals with T2D compared to those with normal glucose tolerance and individuals with prediabetes44. Moreover, weight loss decreases Collinsella among obese individuals with T2D45. 
Higher prevalence of the Lachnospiraceae family is associated with metabolic disorders, while genus Lachnospiraceae-FCS020 abundance shows an inverse correlation with serum triglycerides46. However, the mechanism of their enrichment has yet to be studied. This multi-omics annotation provided strong evidence for links between RG and intestinal health.

Fig. 3. Deterioration of glucose homeostasis progressing into T2D and leading to complications in multiple organs and tissues.

Established (left, in peach) and new (right, in green). a, A human figure illustrating the main causes of hyperglycemia (a combination of lifestyle and genetic factors), and how hyperglycemia affects many organs and tissues. Complications on the left panel are well-established for T2D. Those on the right panel are emerging ones and are supported by our current analyses. Figure created with BioRender.com. b, DEPICT prioritization of 134 tissues from the GTEx Project highlights the ileum and pancreas (shown in red, one-sided empirical P value with FDR < 0.05 determined against randomized phenotypes in a null GWAS).

Fig. 4. Cell type prioritization across 17 tissues identified large intestinal tissue ranked second only to pancreatic cell types.

CELLECT prioritization of 115 cell types from Tabula Muris highlights pancreatic polypeptide (PP) cells (shown in black, one-sided Wilcoxon rank-sum test with significance threshold depicted by a dotted line indicating cell types with a nominal PS-LDSC < 4.3 × 10−4).

Extended Data Fig. 5. Epigenetic annotation of the RG GWAS results using GARFIELD.

The analyses were performed using generalized linear modeling in GARFIELD software. We considered enrichment to be statistically significant if the RG GWAS P value reached 1 × 10−8 and the enrichment analysis P value was <2.5 × 10−5 (Bonferroni corrected for 2,040 annotations).

Finally, we observed associations at HNF1A47 with nine total plasma N-glycome traits48 at a Bonferroni-corrected threshold (Fig. 1a and Supplementary Table 19). These traits represent highly branched galactosylated sialylated glycans (attached to an α1-acid protein, an acute-phase protein49), known to lead to chronic low-grade inflammation50,51 and an increased risk of T2D52–54 that might be explained by the role of N-glycan branching of the glucagon receptor in glucose homeostasis55. In addition, ten glycans showed association with five RG loci (HNF1A, BAG1, PLUT) at a suggestive level of significance (Fig. 1a). Among them, three are attached to immunoglobulin G molecules49, and their increased relative abundances are associated with a lower risk of T2D56 and diminished inflammation status57. These observations suggest an overlap between networks regulating RG homeostasis and plasma-protein N-glycosylation.

Genetic relationships between RG and other metabolic or nonmetabolic traits

Using linkage-disequilibrium score regression analyses, we estimated the genetic correlations between RG and other phenotypes to quantify the shared genetic contribution. We detected positive genetic correlations between RG and squamous cell lung cancer (rg = 0.28, P = 0.0015) and lung cancer (rg = 0.12, P = 0.037; Fig. 5 and Supplementary Table 20), as well as inverse genetic correlations with lung function related traits, such as forced vital capacity (FVC, rg = −0.090, P = 0.0059) and forced expiratory volume in 1 second (FEV1, rg = −0.054, P = 0.017; Figs. 3a and 5 and Supplementary Table 20). To investigate this further, we conducted bidirectional Mendelian randomization (MR) analysis, which suggested a causal effect of RG and T2D on lung function, including FEV1 (βMR–RG = −0.66, P = 9.6 × 10−5; βMR–T2D = −0.049, P = 1.3 × 10−13) and FVC (βMR–RG = −0.60, P = 1.5 × 10−4; βMR–T2D = −0.062, P = 1.4 × 10−21), but not vice versa (RG: βMR–FEV1 = −0.0048, P = 0.42; βMR–FVC = −0.01, P = 0.17; T2D: βMR–FEV1 = −0.18, P = 0.040; βMR–FVC = −0.21, P = 0.040; Supplementary Table 21a,b). External factors, such as smoking or sedentary lifestyle, could cause lung function to decline, independent of RG and T2D effects. We implemented multivariable MR (MVMR) and found (Supplementary Table 21c) that RG and T2D causal effects on FVC are independent of both cigarettes smoked per day (CPD; that is, proxy for smoking58) and leisure screen time (LST; that is, proxy for physical activity59). This is important as previous observational studies have highlighted worsening lung function, as defined by FVC, in patients with T2D, but whether this was a causal relationship was not clear60,61. More recently, it was shown that patients with diabetes are at an increased risk of death from the viral infection COVID-19 (ref. 62), with pulmonary dysfunction contributing to mortality63. 
Our data support a causal effect of glycemic dysregulation on a decline in lung function, pointing to it as a new complication of diabetes.
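To make the MR effect estimates above more concrete, the following minimal sketch computes a fixed-effect inverse-variance weighted (IVW) MR estimate from per-variant summary statistics. The MR estimators actually used are not stated in this section, so the choice of IVW is an assumption, as are the function name `ivw_mr` and all toy numbers.

```python
import math

def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the effect of an exposure on an outcome.

    Each instrument j contributes the ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(bx * by / (se * se)
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    estimate = numerator / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return estimate, standard_error

# Toy instruments (hypothetical values, not taken from the study)
bx = [0.02, 0.03, 0.05]         # variant effects on the exposure (for example RG)
by = [-0.012, -0.018, -0.030]   # variant effects on the outcome (for example FEV1)
se = [0.004, 0.005, 0.006]      # standard errors of the outcome effects
estimate, standard_error = ivw_mr(bx, by, se)
```

In a bidirectional design, the same calculation would be repeated with the roles of exposure and outcome swapped, using instruments selected for the second trait.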

Fig. 5. Genome-wide genetic correlation between RG and a range of traits and diseases.

The x axis provides the estimated rg genetic correlation values for traits or diseases (y axis) reaching at least nominal significance (P < 0.05). Correlations reaching P < 0.010 are labeled with the prime symbol, and those with P < 2.1 × 10−4 are labeled with the asterisk symbol. P values were calculated from the two-sided z statistics computed by dividing the estimated rg by the estimated standard error, without adjustment. Each error bar represents the standard error of the estimate.

Genome-wide genetic correlation analyses also showed a strong positive genetic correlation of RG with FG (rg = 0.88, P = 6.93 × 10−61; Fig. 5 and Supplementary Table 20). We meta-analyzed RG studies other than UKBB with FG GWAS summary statistics64, observing 79 signals reaching nominal significance that were directionally consistent in both UKBB and RG + FG (Supplementary Table 3), providing additional support to our RG findings. Given the large genetic overlap between RG, other glycemic traits and T2D, we evaluated the ability of a trait-specific polygenic risk score (PRS) to predict RG, T2D and HbA1c levels using UKBB effect estimates and the Vanderbilt cohort. The RG PRS explained 0.58% of the variance in RG levels when individuals with T2D were included (Supplementary Table 22), and 0.71% of the variance after excluding those who developed T2D within 1 year of their last RG measurement. The RG PRS performance was comparable to that of the FG loci PRS (0.38% versus 0.42% for T2D; 0.40% versus 0.44% for HbA1c), indicating shared genetic variability determining glycemic traits.

We previously highlighted diverse effects of FG and T2D loci on pathophysiological processes related to T2D development by grouping associated loci in relation to their effects on multiple phenotypes6. Cluster analysis of the RG signals with 45 related phenotypes identified three separate clusters (Fig. 1a, Supplementary Table 23 and Extended Data Figs. 6 and 7), including ‘metabolic syndrome’ cluster 1, with 28 loci also leading to higher waist-to-hip ratio, blood pressure, plasma triglycerides, insulin resistance (HOMA-IR) and coronary artery disease risk, as well as lower sex hormone binding globulin levels in both sexes and testosterone in males. Cluster 3 was characterized, in particular, by insulin secretory defects6. Cluster 2 showed a primary effect on insulin release versus insulin action3, but included a subcluster of 11 loci, which exert protective effects on inflammatory bowel disease, a relationship not previously reported. Moreover, cluster 2 was notable for generally reduced T2D risk in comparison to clusters 1 and 3, shaping the partial overlap between genetic determinants of glycemia and T2D that is known to exist65. This RG loci grouping gave innovative insights into the etiology of glucose regulation and associated disease states.

Extended Data Fig. 6. Cluster analysis of effects (as Z-scores) of the distinct 143 RG signals on 45 relevant phenotypes.

All variant effects were aligned to the RG risk allele. HapMap2 based summary statistics were imputed using SS-Imp v0.5.5 (ref. 65) to minimize missingness. Missing summary statistics values were imputed via mean imputation. The heatmap was produced using the Pheatmap package. For visualization, the Z-scores were truncated to the value corresponding to genome-wide significance (Z = 5.45), and 11 phenotypes with the lowest median absolute Z-scores were excluded.
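The truncation value quoted in this legend can be checked with a short calculation. The sketch below derives the |Z| that corresponds to a two-sided P of 5 × 10−8 and clips Z-scores to that range; the function names and example Z-scores are illustrative assumptions.

```python
from statistics import NormalDist

def genome_wide_z(p_two_sided: float = 5e-8) -> float:
    """Absolute Z-score whose two-sided P value equals p_two_sided."""
    return -NormalDist().inv_cdf(p_two_sided / 2.0)

def truncate_z(z: float, limit: float) -> float:
    """Clip a Z-score to the interval [-limit, +limit] for plotting."""
    return max(-limit, min(limit, z))

limit = genome_wide_z()  # approximately 5.45
clipped = [truncate_z(z, limit) for z in (-9.3, 1.2, 7.8)]  # hypothetical Z-scores
```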

Extended Data Fig. 7. Scatter plots of the standardized allelic effect estimates for selected trait pairs.

In each scatter plot, loci were assigned to the groups defined from the cluster analysis and highlighted by different colors. a, Corrected insulin response (CIR) vs. type 2 diabetes (T2D) (clusters 1a/b related to metabolic syndrome). b, Glycated hemoglobin (HbA1c) vs. inflammatory bowel disease (IBD) (cluster 2a) highlights the effects of loci with a protective role in IBD. c, Plasminogen activator inhibitor-1 (PAI-1) vs. CIR (cluster 3) highlights loci linked to insulin secretion defects.

Discussion

Leveraging data from 476,326 individuals, we have expanded by 44 the number of loci associated with glycemic traits. By using RG, our analysis integrates genetic contributions into a wider range of physiological stages, which thus far was not possible with standardized glycemic measures. Moreover, the greater statistical power obtained from large cross-ancestry meta-analysis improves confidence in identifying potentially causal variants, thereby helping to prioritize genes for more detailed functional analyses in the future. Our comprehensive functional characterization of GLP1R coding variation validates its role in blood glucose regulation and, more importantly, shows how GLP-1R-targeting drug responses depend on genetic variation. Notably, additional islet-expressed class B1 GPCRs identified in our current analysis and other glycemic trait/T2D GWAS, including GIPR, GLP2R (refs. 3,66) and SCTR21, are investigational targets for T2D treatment, which should be subjected to similar analysis. Our functional annotation analyses point to underexplored tissue mediators of glycemic regulation, with new evidence highlighting the role of the intestine. This observation supports the profound effects of gastric bypass surgery on T2D resolution67, as well as links between the intestinal microbiome and responses to several diabetes drugs68. In the near future, larger well-phenotyped datasets will enable high-dimensional GWAS investigations, disentangling the role of diet composition, physical activity and lifestyle on RG level variability in relation to genetic effects. Finally, through MR, we identified a causal effect of glucose levels and T2D on lung function, demonstrating the utility of this approach for corroborating findings from observational studies and elevating lung dysfunction as a new complication of diabetes.

Methods

Ethics

All participating studies were approved by their appropriate institutional review boards or committees, and written informed consent was obtained from all study participants. For all the participating studies, approval was received to use their data in the present work. Study-specific ethics statements are provided in the references listed in Supplementary Table 1.

Phenotype definition and model selection for RG GWAS

We used RG (mmol l−1) measured in plasma or in whole blood (corrected to plasma level using the correction factor of 1.13). Individuals were excluded from the analysis if they had a diagnosis of T2D or were on diabetes treatment (oral or insulin). Individual studies applied further sample exclusions, including pregnancy, fasting plasma glucose ≥7 mmol l−1 in a separate visit, when available, and having type 1 diabetes (Supplementary Table 1). Details about RG modeling in the first set of six available cohorts (Supplementary Table 2) can be found in the Supplementary Note. For the GWAS, we included individuals based on the following two RG cut-offs: <20 mmol l−1 (20) to account for the effect of extreme RG values and <11.1 mmol l−1 (11), which is an established threshold for T2D diagnosis. We then evaluated the following six different models in GWAS according to covariates included and cut-offs used: (1) age (A) and sex (S), RG < 20 mmol l−1 (AS20); (2) age, sex and BMI (B), RG < 20 mmol l−1 (ASB20); (3) age and sex, RG < 11.1 mmol l−1 (AS11); (4) age, sex and BMI, RG < 11.1 mmol l−1 (ASB11); (5) age, sex, time since last meal (accounted for as T, T² and T³), RG < 20 mmol l−1 (AST20) and (6) age, sex, T, T² and T³ and BMI, RG < 20 mmol l−1 (ASTB20). Apart from the above, additional adjustments for study site and geographical covariates were also applied.

RG meta-analyses

The GWAS meta-analysis of RG consisted of the following five components: (1) 37,239 individuals from ten European ancestry GWAS imputed up to the HapMap 2 reference panel; (2) 3,156 individuals from three European ancestry GWAS with Metabochip coverage; (3) 21,083 individuals from two European ancestry GWAS imputed up to 1000 Genomes reference panel; (4) 380,432 individuals of white European ancestry from the UKBB and (5) 16,983 individuals from the Vanderbilt cohort imputed to the HRC panel (Supplementary Note). We imputed the GWAS meta-analysis summary statistics of each component to all-ancestries 1000 Genomes reference panel69 using the summary statistics imputation method implemented in the SS-Imp v0.5.5 software70. SNPs with imputation quality scores <0.7 were excluded. We then conducted inverse-variance meta-analyses to combine the association summary statistics from all components using METAL v2011-03-25 (ref. 71). We focused our meta-analyses on models AS20 (17 cohorts, nmax = 459,772) and AST20 (when time from last meal was available in the cohort; 12 cohorts, nmax = 417,290). For the FHS cohort, where no information was available for individuals with RG > 11.1 (an established threshold for 2hGlu concentration, which is a criterion for T2D diagnosis), AS11 model results were used. We also performed a meta-analysis using cohorts with time from the last meal available (AST20 model, 12 cohorts) combined with those lacking this information (AS20, five cohorts) to maximize the association power while taking into account T. We termed this analysis as AS20 + AST20 in the following text (17 cohorts, nmax = 458,862). A signal was considered to be associated with RG if it reached genome-wide significance (P < 5.0 × 10−8) in the meta-analysis of UKBB and other cohorts in either of our two models of interest (AS20) or (AST20) or in their combination (AS20 + AST20).
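For readers unfamiliar with the inverse-variance scheme, the following minimal sketch combines per-component effect estimates for a single variant under a fixed-effect model. It illustrates the general technique rather than METAL's implementation; the function name and the toy inputs are assumptions.

```python
import math
from statistics import NormalDist

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance weighted meta-analysis for one variant."""
    weights = [1.0 / (se * se) for se in standard_errors]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))  # two-sided P value
    return beta, se, p

# Two hypothetical components reporting effects on the same scale
beta, se, p = inverse_variance_meta([0.10, 0.20], [0.10, 0.10])
```

Components with smaller standard errors, typically the larger ones such as UKBB here, dominate the combined estimate under this weighting.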

Of 133 signals detected in the European ancestry subset (Supplementary Note), 105 were directionally consistent in the UK Biobank and other contributing studies grouped together, providing the discovery validation (Supplementary Table 3). We report the P value from the combined model unless otherwise stated. Full results from all models are provided in Supplementary Table 3. We checked for nominal significance (P < 0.05) and directional consistency of the effect sizes for the selected lead SNPs in the combined model in UKBB results versus other cohort results. We further extended the check between UKBB results and meta-analysis of other cohorts including FG GWAS meta-analysis64, excluding overlapping cohorts. This meta-analysis conducted in METAL v2011-03-25 was sample size and P value based due to the measures being at different scales (natural logarithm-transformed RG and untransformed FG).
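When effect sizes are on different scales, as for log-transformed RG and untransformed FG, a sample-size weighted combination of signed Z-scores can be used instead of effect estimates. The sketch below shows that scheme in general form, with weights equal to the square root of each sample size; the function name and the toy inputs are assumptions.

```python
import math
from statistics import NormalDist

def sample_size_meta(p_values, directions, sample_sizes):
    """Combine two-sided P values and effect directions (+1 or -1) across studies.

    Each study's P value is converted to a signed Z-score and weighted by sqrt(N).
    """
    normal = NormalDist()
    numerator = 0.0
    sum_squared_weights = 0.0
    for p, direction, n in zip(p_values, directions, sample_sizes):
        z = -normal.inv_cdf(p / 2.0) * direction
        weight = math.sqrt(n)
        numerator += weight * z
        sum_squared_weights += weight * weight
    z_combined = numerator / math.sqrt(sum_squared_weights)
    p_combined = 2.0 * normal.cdf(-abs(z_combined))
    return z_combined, p_combined

# Hypothetical example: two equally sized studies, both nominally significant
z_same, p_same = sample_size_meta([0.05, 0.05], [+1, +1], [1000, 1000])
z_opposite, p_opposite = sample_size_meta([0.05, 0.05], [+1, -1], [1000, 1000])
```

Directionally consistent studies reinforce each other, whereas opposite directions cancel, which is why directional consistency is checked alongside nominal significance.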

Cross-ancestry analyses and meta-analysis

We performed GWAS in non-European ancestry populations within UKBB that had a sample size of at least 1,500 individuals. These were Black (n = 7,644), Indian (n = 5,660), Pakistani (n = 1,747) and Chinese (n = 1,503). We further meta-analyzed our European ancestry cohorts with the cross-ancestry UKBB cohorts. The analyses were performed with BOLT-LMM v2.3 (ref. 72) and METAL v2011-03-25.

Sex-dimorphic analysis

To evaluate sex dimorphism in our results, we meta-analyzed the UKBB and the Vanderbilt cohort with the GWAMA v2.1 software73, which provides a 2 degrees of freedom (df) test of association assuming different effect sizes between the sexes. We evaluated the evidence for heterogeneity of allelic effects between sexes using Cochran’s Q statistic73,74. We considered a signal to show evidence of sex dimorphism if the sex-dimorphic P value was <5.0 × 10−8 and if the sex heterogeneity P value (1 df) was <0.05.
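The two tests can be sketched as follows for a single variant, given sex-specific effect estimates. This uses one common formulation, in which the 2 df statistic is the sum of the sex-specific chi-squared statistics and Cochran's Q compares each sex-specific effect with the inverse-variance weighted combined effect; GWAMA's internal details are not reproduced here, and the function name and inputs are assumptions.

```python
import math

def sex_dimorphism_tests(beta_m, se_m, beta_f, se_f):
    """Return (P for the 2 df sex-differentiated test, Cochran's Q, 1 df heterogeneity P)."""
    # 2 df association test allowing different effects in males and females
    chi2_2df = (beta_m / se_m) ** 2 + (beta_f / se_f) ** 2
    p_2df = math.exp(-chi2_2df / 2.0)           # chi-squared survival function, 2 df

    # Cochran's Q for heterogeneity between the two sex-specific effects
    w_m, w_f = 1.0 / se_m ** 2, 1.0 / se_f ** 2
    beta_combined = (w_m * beta_m + w_f * beta_f) / (w_m + w_f)
    q = w_m * (beta_m - beta_combined) ** 2 + w_f * (beta_f - beta_combined) ** 2
    p_het = math.erfc(math.sqrt(q / 2.0))       # chi-squared survival function, 1 df
    return p_2df, q, p_het

# Hypothetical variant with opposite effects in the two sexes
p_2df, q, p_het = sex_dimorphism_tests(0.10, 0.05, -0.10, 0.05)
```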

Clumping and conditional analysis

We performed a standard clumping analysis (PLINK v1.90 (ref. 75) criteria: P ≤ 5 × 10−8, r² = 0.01, window-size = 1 Mb, 1000 Genomes Phase 3 data as linkage disequilibrium (LD) reference panel) to select a list of near-independent signals. We then performed a stepwise model selection analysis (approximate conditional analysis) to replicate the analysis using GCTA v1.93.0 (ref. 76) with the following parameters: P ≤ 5 × 10−8 and window-size = 1 Mb. We further checked for additional distinct signals by using a region-wide threshold of P ≤ 1.0 × 10−5 for statistical significance. For validation and comparison, we also performed direct conditional analyses using BOLT-LMM v2.3 (Supplementary Note). We filtered the direct conditional analysis results and BOLT-LMM results by checking the LD between all the variants within the same locus and keeping only independent signals (r² < 0.01). LD was calculated from European reference haplotypes from the 1000 Genomes Project using LDlinkR v1.1.2 library.
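The logic of clumping can be illustrated with a simplified greedy procedure. This is a sketch of the general idea on a single chromosome, not a reimplementation of PLINK: the data layout, the `ld_r2` callback and the interpretation of the window as the maximum distance from the index variant are all assumptions.

```python
def clump(variants, ld_r2, p_max=5e-8, r2_min=0.01, window_bp=1_000_000):
    """Greedy clumping: return index variant IDs in order of increasing P value.

    variants: list of dicts with keys "id", "pos" (bp) and "p".
    ld_r2:    function (id_a, id_b) -> r^2 from a reference panel (hypothetical).
    """
    significant = sorted((v for v in variants if v["p"] <= p_max), key=lambda v: v["p"])
    assigned = set()
    index_ids = []
    for lead in significant:
        if lead["id"] in assigned:
            continue  # already absorbed into an earlier, stronger clump
        index_ids.append(lead["id"])
        assigned.add(lead["id"])
        for other in significant:
            if other["id"] in assigned:
                continue
            close = abs(other["pos"] - lead["pos"]) <= window_bp
            if close and ld_r2(lead["id"], other["id"]) >= r2_min:
                assigned.add(other["id"])
    return index_ids

# Toy data (hypothetical identifiers, positions, P values and LD)
toy_variants = [
    {"id": "rsA", "pos": 100_000, "p": 1e-20},
    {"id": "rsB", "pos": 150_000, "p": 1e-10},   # in LD with rsA
    {"id": "rsC", "pos": 5_000_000, "p": 1e-9},  # distant, independent
    {"id": "rsD", "pos": 120_000, "p": 1e-3},    # not genome-wide significant
]
toy_ld = {frozenset(("rsA", "rsB")): 0.8}
index_variants = clump(toy_variants, lambda a, b: toy_ld.get(frozenset((a, b)), 0.0))
```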

GLP-1R pharmacological and structural analysis

Mini-Gs recruitment assay

Where stable cell lines were used (that is, Fig. 2a,b), WT or variant T-REx-SNAP-GLP-1R-SmBiT cells (Supplementary Note) were seeded in 12-well plates and transfected with 1 µg per well LgBiT-mini-Gs23 (a gift from N. Lambert, Medical College of Georgia). The following day, GLP-1R expression was induced by the addition of tetracycline (0.2 µg ml−1) to the culture medium for 24 h. For transient transfection assays (that is, Fig. 2j), HEK293T cells in poly-d-lysine-coated white 96-well plates were transfected using Lipofectamine 2000 with 0.05 µg per well WT or variant SNAP-GLP-1R-SmBiT plus 0.05 µg per well LgBiT-mini-Gs and the assay performed 24 h later. Cells were then resuspended in Hank’s balanced salt solution + furimazine (Promega) diluted 1:50 and seeded in 96-well half-area white plates, or the same reagent added to adherent cells for transient transfection assays. Baseline luminescence was measured over 5 min using a Flexstation 3 plate reader at 37 °C before the addition of ligand or vehicle. Agonists were applied at a series of concentrations spanning the response range. After agonist addition, luminescent signal was serially recorded over 30 min, and ligand-induced effects were quantified by subtracting individual well baselines. Signals were corrected for differences in cell number as determined by bicinchoninic acid assay.

Analysis of pharmacological data

Technical replicates within the same assay were averaged to give one biological replicate. For concentration-response assays (Fig. 2a,b), ligand-induced responses were analyzed by three-parameter fitting in Prism 8.0 (GraphPad Software). As a composite measure of agonism77, log10-transformed Emax/half maximal effective concentration (EC50) values were obtained for each ligand/variant response. The WT response was subtracted from the variant response to give ∆log(max/EC50), a measure of gain- or loss-of-function for the variant relative to WT. Log10-transformed surface expression levels were obtained for each variant relative to WT; these were then used to correct mini-Gs ∆log(max/EC50) values for differences in variant GLP-1R surface expression levels, by subtraction with error propagation. GLP-1R internalization responses were already normalized to surface expression within each assay. Statistical significance between WT and variant responses was inferred if the 95% confidence intervals for ∆log(max/EC50) did not cross zero77. Changes to the profile of receptor response between mini-Gs recruitment and GLP-1R internalization were inferred if P < 0.05 with unpaired t test analysis, with Holm–Sidak correction for multiple comparisons. For transient transfection assays (Fig. 2j), responses were normalized to WT response and log10 transformed to give Log ∆ responses. Additionally, the impact of differences in the surface expression on functional responses was determined by subtracting the log-transformed normalized expression level from the log-transformed normalized response.
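The subtraction steps described above can be sketched as follows. The sketch assumes independent errors that add in quadrature and a normal-approximation 95% CI (estimate ± 1.96 × s.e.); neither detail is specified in the text, and the function names and numerical values are illustrative only.

```python
import math

def log_max_ec50(emax, ec50):
    """Composite agonism measure log10(Emax / EC50)."""
    return math.log10(emax / ec50)

def subtract_with_error(a, se_a, b, se_b):
    """Return a - b and its standard error, assuming independent errors."""
    return a - b, math.sqrt(se_a ** 2 + se_b ** 2)

def ci95(estimate, se):
    """Normal-approximation 95% confidence interval (an assumption)."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical log10(Emax/EC50) values with standard errors
variant, se_variant = 8.9, 0.10
wild_type, se_wild_type = 8.5, 0.10
delta, se_delta = subtract_with_error(variant, se_variant, wild_type, se_wild_type)

# Correct for a hypothetical log10 surface expression of the variant relative to WT
corrected, se_corrected = subtract_with_error(delta, se_delta, 0.05, 0.03)
lower, upper = ci95(corrected, se_corrected)
differs_from_wt = lower > 0 or upper < 0  # CI does not cross zero
```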

Variance explained in RG effects by mini-Gs recruitment at coding GLP1R variants

RG (AST20 model) effects estimated in the UKBB study at 16 independent (r² < 0.02) coding GLP1R variants (Supplementary Table 11) were regressed on mini-Gs coupling in response to glucagon-like peptide-1 (GLP-1) stimulation (corrected for surface expression) giving more weight to variants with higher minor allele frequency.
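A weighted regression of this kind can be sketched with closed-form weighted least squares. The exact weighting function is not given in this section, so using the minor allele frequency directly as the weight is an assumption, as are the function name and the toy data.

```python
def weighted_linear_regression(x, y, weights):
    """Weighted least squares for y = intercept + slope * x.

    Returns (slope, intercept, weighted R squared).
    """
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum(w * (yi - (intercept + slope * xi)) ** 2
                 for w, xi, yi in zip(weights, x, y))
    ss_tot = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, y))
    return slope, intercept, 1.0 - ss_res / ss_tot

# Toy data (hypothetical): expression-corrected mini-Gs response, RG effect, MAF
mini_gs = [-1.0, -0.5, 0.0, 0.5]
beta_rg = [0.03, 0.02, 0.0, -0.02]
maf = [0.001, 0.01, 0.2, 0.015]
slope, intercept, r_squared = weighted_linear_regression(mini_gs, beta_rg, maf)
```

With this weighting, rare variants with imprecise RG effect estimates contribute little to the fitted slope.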

Computational methods including MD simulations

The active state structure of GLP-1R in complex with OXM29 and Gs protein was used to simulate WT GLP-1R and G168S, A316T and R421W. The WT systems and variants were prepared for MD simulations and equilibrated as reported78. AceMD3 3.3.0 (ref. 79) was used for production runs (four MD replicas of 500 ns each). AquaMMapS v1 analysis80 was performed on 10 ns-long MD simulations of GLP-1R(WT) and GLP-1R(A316T) in complex with OXM, with all the α carbons restrained; coordinates were written every 10 ps of simulation.

Credible set analysis

After selecting the signals within each region based on different meta-analysis results from AS20, AST20 and AS20 + AST20 models, we further performed a credible set analysis to obtain a list of potential causal variants for each of the 133 selected signals (Supplementary Note). We also calculated credible sets for the cross-ancestry meta-analysis and compared the results between the European ancestry-only and cross-ancestry meta-analyses.
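The details of the credible set construction are given in the Supplementary Note, which is not reproduced here. As a generic illustration only, the sketch below uses one widely used approach: approximate Bayes factors computed from effect estimates and standard errors, converted to posterior probabilities under a single-causal-variant assumption. The prior variance of 0.04, the 99% coverage and all names are assumptions.

```python
import math

def log_abf(beta, se, prior_variance=0.04):
    """Log approximate Bayes factor in favor of association for one variant."""
    variance = se * se
    shrinkage = prior_variance / (variance + prior_variance)
    z_squared = (beta / se) ** 2
    return 0.5 * (math.log(1.0 - shrinkage) + shrinkage * z_squared)

def credible_set(variants, coverage=0.99):
    """variants: list of (variant_id, beta, se). Returns IDs in the credible set."""
    logs = [log_abf(beta, se) for _, beta, se in variants]
    top = max(logs)
    relative = [math.exp(value - top) for value in logs]  # avoids overflow
    total = sum(relative)
    ranked = sorted(((v[0], r / total) for v, r in zip(variants, relative)),
                    key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant_id, posterior in ranked:
        chosen.append(variant_id)
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen

# Toy region (hypothetical): one variant carries almost all the evidence
toy_set = credible_set([("rs1", 0.10, 0.01), ("rs2", 0.01, 0.01), ("rs3", 0.005, 0.01)])
```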

DEPICT analysis

DEPICT uses GWAS summary statistics and computes a prioritization of genes in associated loci, which are used to prioritize tissues via enrichment analysis. DEPICT v1_rel 194 was used with default settings and RG GWAS summary statistics as input against a genetic background of SNPsnap data81 derived from the 1000 Genomes Project Phase 3 (ref. 82) to prioritize genes (Supplementary Note).

CELLECT analysis

CELLECT35 v1.0.0 and Cell type EXpression-specificity35 v1.0.0 are two toolkits for genetic identification of likely etiologic cell types using GWAS summary statistics and scRNA-seq data. Tabula Muris gene expression data83, a scRNA-seq dataset derived from 20 organs from adult male and female mice, was preprocessed as described in the Supplementary Note.

Genetically regulated gene expression analysis

We used MetaXcan (S-PrediXcan) v0.6.10 (ref. 84) to identify genes whose genetically predicted gene expression levels are associated with RG in a number of tissues. The tested tissues were chosen based on their involvement in glucose metabolism. Those were adipose visceral omentum, adipose subcutaneous, skeletal muscle, liver, pancreas and whole blood. Additionally, we tested ileum, transverse colon, sigmoid colon and adrenal gland because they were highlighted by DEPICT analysis. The models for the tissues of interest were trained with GTEx Version 7 transcriptome data from individuals of European ancestry85. The tissue transcriptome models and 1000 Genomes86 based covariance matrices of the SNPs used within each model were downloaded from PredictDB Data Repository. The association statistics between predicted gene expression and RG were estimated from the effects and their standard errors coming from the AS20 + AST20 model. Only statistically significant associations after Bonferroni correction for the number of genes tested across all tissues (P ≤ 9.0 × 10−7) were included in the table. Genes for which less than 80% of the SNPs used in the model were found in the GWAS summary statistics were excluded due to the low reliability of the association result.

GARFIELD analysis

We applied the GWAS analysis of regulatory or functional information enrichment with LD correction (GARFIELD) tool v2 (ref. 87) on the RG AS20 + AST20 meta-analysis results to assess the enrichment of the RG-associated variants within functional and regulatory features. GARFIELD integrates various types of data from a number of publicly available cell lines. Those include genetic annotations, chromatin states, DNaseI hypersensitive sites, transcription factor binding sites, FAIRE-seq elements and histone modifications. We considered enrichment to be statistically significant if the RG GWAS P value reached 1 × 10⁻⁸ and the enrichment analysis P value was <2.5 × 10⁻⁵ (Bonferroni corrected for 2,040 annotations).

Genetic association with gut microbiome

We assessed the genetic overlap between RG GWAS results and those for gut microbiome. GWAS of microbiome profiles were publicly available and downloaded from https://mibiogen.gcc.rug.nl/. For each of the 210 taxa, the corresponding P values for the 133 RG GWAS SNPs and their proxies were extracted.

Genetic association with GLP-1 and gastric inhibitory polypeptide (GIP)

We assessed the genetic overlap between RG GWAS results and those for GLP-1 and GIP measured at 0 min and 120 min. We extracted the results for the 133 RG signals from the GWAS summary statistics for GLP-1 and GIP88.

eQTL colocalization analysis

We further performed colocalization analysis for all 133 RG signals using whole-blood eQTL data provided by eQTLGen37 and human pancreatic islet eQTLs provided by TIGER38. We used meta-analysis results from AS20, AST20 or AS20 + AST20 depending on the degree of association of each signal. Only cis-eQTL data from eQTLGen/TIGER were incorporated to reduce the computational burden. The COLOC2 Bayesian-based method89 was used to interrogate the potential colocalization between RG GWAS signals and the genetic control of gene expression. First, for each of the 133 RG signals, we took the model (AS20, AST20 or AS20 + AST20) with the lowest GWAS P value and extracted the RG GWAS test statistics of all SNPs within a ±1 Mb region around the signal. Then, for each RG signal, we matched the eQTLGen/TIGER results with the RG results and performed COLOC2 analysis, evaluating the posterior probability of the following five hypotheses for each region: H0, no association; H1, GWAS association only; H2, eQTL association only; H3, both GWAS and eQTL association, but not colocalized; and H4, both GWAS and eQTL association and colocalized. Only GWAS signals with at least one nearby gene/probe reaching posterior probability (H4) ≥ 0.5 were reported. We considered signals to have strong evidence of colocalization if posterior probability (H4) > 0.7.
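To make the five hypotheses and the reporting thresholds concrete, the following Python sketch combines per-SNP log approximate Bayes factors into posterior probabilities for H0 to H4, following the standard coloc formulation. It is an illustration under stated assumptions, not the COLOC2 implementation used in the study: COLOC2 may differ (for example, in how priors are obtained), the prior values `p1`, `p2` and `p12` are the commonly used coloc defaults rather than values from this paper, and the function names, labels and toy Bayes factors are hypothetical.

```python
import math


def logsum(values):
    """log(sum(exp(v))) computed stably."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiff(a, b):
    """log(exp(a) - exp(b)); -inf when the difference is not positive."""
    if a <= b:
        return -math.inf
    return a + math.log(-math.expm1(b - a))


def coloc_posteriors(labf_gwas, labf_eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region."""
    both = [g + e for g, e in zip(labf_gwas, labf_eqtl)]
    sum_g, sum_e, sum_both = logsum(labf_gwas), logsum(labf_eqtl), logsum(both)
    log_h = [
        0.0,                                                            # H0
        math.log(p1) + sum_g,                                           # H1: GWAS only
        math.log(p2) + sum_e,                                           # H2: eQTL only
        math.log(p1) + math.log(p2) + logdiff(sum_g + sum_e, sum_both),  # H3
        math.log(p12) + sum_both,                                       # H4
    ]
    denom = logsum(log_h)
    return [math.exp(v - denom) for v in log_h]


def classify(pp_h4):
    """Apply the reporting rule from the text to a posterior for H4."""
    if pp_h4 > 0.7:
        return "strong"
    if pp_h4 >= 0.5:
        return "reported"
    return "not reported"


# Toy log Bayes factors for three SNPs (hypothetical values).
shared = coloc_posteriors([10.0, 0.0, 0.0], [10.0, 0.0, 0.0])    # same lead SNP
distinct = coloc_posteriors([10.0, 0.0, 0.0], [0.0, 0.0, 10.0])  # different lead SNPs
print(classify(shared[4]), classify(distinct[4]))
```

When both traits point to the same SNP, almost all posterior mass falls on H4; when the strongest SNPs differ, H3 outweighs H4 and the signal would not be reported under the 0.5 threshold.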

Genetic association with human blood plasma N-glycosylation

We assessed genetic associations between 133 RG signals and 113 human blood plasma N-glycome traits using previously published genome-wide association summary statistics90. The description of the analyzed traits and details of the association analysis can be found elsewhere48. We considered associations to be significant when P < 0.05/113/133 = 3.3 × 10⁻⁶ (after Bonferroni correction). Association was considered suggestive when P < 10⁻⁴.

Genetic correlation analysis

We investigated the shared genetic component between RG and other traits, including glycemic ones, by performing genetic correlation analysis using the bivariate LD score regression method (LDSC v1.0.0)91. To reduce the multiple testing burden, only the GWAS results of the AS20 + AST20 model were used. We used GWAS summary statistics available in LDhub92 and on the Meta-Analysis of Glucose and Insulin-related Traits Consortium (MAGIC) website (https://www.magicinvestigators.org) for several traits, including FG/FI64 and HOMA-B/HOMA-IR93. In total, 228 different traits were included in the genetic correlation analysis with RG. We considered P ≤ 2.2 × 10⁻⁴ (Bonferroni correction for 228 traits) as the statistical significance level and P ≤ 0.05 as the nominal level.

MR analysis

We applied a bidirectional two-sample MR strategy (Supplementary Note) to investigate causality between RG and lung function, as well as T2D and lung function using independent genetic variants as instruments. We looked for evidence for the presence of a causal effect of RG and T2D on the following two lung function phenotypes: FEV1 and FVC in a two-sample MR setting. Genome-wide summary statistics for the lung function phenotypes were available94, involving cohorts from the SpiroMeta consortium and the UKBB study. T2D susceptibility variants and their effects were obtained from the largest-to-date T2D GWAS4.

To avoid confounding due to sample overlap, lung function summary statistics used as outcome data were those estimated in the SpiroMeta consortium alone. Similarly, when testing the effect of lung function on RG, RG genetic effects used as outcome data were estimated in all cohorts except UK Biobank. There was no sample overlap between the lung function and the T2D GWAS, thus allowing the use of T2D effects estimated in all contributing European ancestry studies. Genome-wide T2D summary statistics were available from a previous study3 to test for the causal effect of lung function on T2D. All analyses were conducted using the R software package TwoSampleMR v0.5.4 (ref. 95).

Causal effects were estimated using the inverse-variance weighted method, which combines the causal estimates of individual instrumental variants (Wald ratios; Supplementary Note) in a random-effects meta-analysis96. Instrument heterogeneity Q statistic P values are reported. As a sensitivity analysis, we used MR-Egger regression (Supplementary Note) to test for the presence of horizontal pleiotropy and obtain causal estimates that are more robust to the inclusion of invalid instruments97.
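The estimators named above can be sketched in a few lines of Python. This is a simplified illustration, not the TwoSampleMR implementation: it assumes first-order Wald ratio standard errors (outcome standard error divided by the absolute exposure effect), a multiplicative random-effects adjustment that never shrinks the fixed-effect standard error, and hypothetical toy effect sizes. P values for Q are omitted because the chi-square distribution is not available in the Python standard library.

```python
import math


def wald_ratios(beta_x, beta_y, se_y):
    """Per-variant causal estimates and first-order standard errors."""
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    ses = [sy / abs(bx) for bx, sy in zip(beta_x, se_y)]
    return ratios, ses


def ivw(beta_x, beta_y, se_y):
    """Inverse-variance weighted estimate, its SE, Cochran's Q and its d.f."""
    ratios, ses = wald_ratios(beta_x, beta_y, se_y)
    weights = [1.0 / s ** 2 for s in ses]
    w_sum = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / w_sum
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    # Assumption: multiplicative random effects, overdispersion floored at 1.
    phi = max(1.0, q / df) if df > 0 else 1.0
    return estimate, math.sqrt(phi / w_sum), q, df


def egger(beta_x, beta_y, se_y):
    """MR-Egger intercept and slope by weighted least squares."""
    # Orient each variant so that its exposure effect is positive.
    rows = [
        (abs(bx), by if bx >= 0 else -by, 1.0 / sy ** 2)
        for bx, by, sy in zip(beta_x, beta_y, se_y)
    ]
    w_sum = sum(w for _, _, w in rows)
    x_bar = sum(w * x for x, _, w in rows) / w_sum
    y_bar = sum(w * y for _, y, w in rows) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in rows)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical instruments: outcome effects are exactly half the exposure effects.
bx = [0.1, 0.2, 0.3]
by = [0.05, 0.10, 0.15]
sy = [0.01, 0.01, 0.01]
ivw_est, ivw_se, q_stat, q_df = ivw(bx, by, sy)
egger_intercept, egger_slope = egger(bx, by, sy)
print(ivw_est, ivw_se, q_stat, egger_intercept, egger_slope)
```

A non-zero Egger intercept would indicate directional horizontal pleiotropy; in this toy input every Wald ratio is identical, so Q and the intercept are both essentially zero.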

MVMR is an extension of MR that can be applied with either individual or summary-level data to estimate the effect of multiple, potentially related, exposures on an outcome98. We used the MVMR v0.3 R package to test whether the causal effects of RG and T2D on FVC are independent of possible confounders, such as physical activity and smoking. The same instrument selection criteria as described for the main MR analysis were used. CPD was instrumented by 54 (available out of the 58 in total) independent genome-wide significant variants, obtained from the GWAS discussed in ref. 58. LST served as a continuous proxy phenotype for physical activity from the recent study discussed in ref. 59 with 66 (available out of the 88 in total) independent genome-wide significant variants.

PRS analysis

We tested the ability of the RG genetic effects to predict RG, T2D and HbA1c. We compared that to the predictive power of T2D and FG genetic instruments by computing PRS for RG, T2D and FG and assessing their performance in predicting RG, T2D and HbA1c. PRS analyses require base and target data from independent populations. The base datasets in our analyses were UKBB-only estimates from the present RG GWAS, meta-analysis estimates of 32 studies for T2D15 and meta-analysis estimates from MAGIC for FG64. We used the second largest cohort, the Vanderbilt University Medical Center, as our target dataset. PRS construction and model evaluation (Supplementary Note) were done using the software PRSice v2.2.3 (ref. 99).
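At its core, a PRS is a weighted sum of effect-allele dosages in the target individuals, with weights taken from the base GWAS. The Python sketch below shows only that scoring step with an optional P value threshold. It is not PRSice: clumping, score scaling and model evaluation are not reproduced, and the variant identifiers, weights and dosages are hypothetical.

```python
def polygenic_score(dosages, weights, p_threshold=1.0):
    """Sum beta * dosage over variants passing the base-GWAS P value threshold.

    dosages: dict mapping variant ID -> effect-allele dosage (0 to 2)
    weights: dict mapping variant ID -> (beta, P value) from the base GWAS
    Returns (score, number of variants used).
    """
    score = 0.0
    n_used = 0
    for variant, (beta, pvalue) in weights.items():
        if pvalue <= p_threshold and variant in dosages:
            score += beta * dosages[variant]
            n_used += 1
    return score, n_used


# Hypothetical base weights and one target individual's dosages.
base_weights = {
    "rs_a": (0.10, 1e-9),
    "rs_b": (-0.05, 1e-3),
    "rs_c": (0.02, 0.2),
}
individual = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score_gws, n_gws = polygenic_score(individual, base_weights, p_threshold=5e-8)
score_all, n_all = polygenic_score(individual, base_weights)
print(score_gws, n_gws, score_all, n_all)
```

Keeping base and target data independent, as the text requires, matters here because weights estimated in the target sample would inflate the apparent predictive performance.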

Clustering of the RG signals with results for 45 other phenotypes

We looked up the z scores (regression coefficient β divided by the standard error) of the 133 distinct RG signals in publicly available summary statistics of 45 relevant phenotypes (Supplementary Table 23). All variant effects were aligned to the RG risk allele. HapMap 2-based summary statistics were imputed using SS-Imp v0.5.5 (ref. 70) to minimize missingness. Remaining missing summary statistics values were imputed via mean imputation. The resulting variant–trait association matrix was truncated to 2 s.d. to minimize the effect of outliers. We used agglomerative hierarchical clustering with Ward’s method to partition the variants into groups by their effects on the considered outcomes. The clustering analysis was performed in R using the hclust() function from the built-in stats package.
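The preparation of the variant–trait matrix before clustering can be sketched as follows. This Python illustration covers only z score computation, mean imputation and truncation; the clustering itself was done with hclust() in R and is not reproduced. Two points are assumptions: missing values are replaced by the mean of the observed values for the same trait (column), and "truncated to 2 s.d." is read as clipping z scores to the interval [-2, 2]. The function names and toy matrix are hypothetical.

```python
def z_score(beta, se):
    """z score as defined in the text: regression coefficient over standard error."""
    return beta / se


def impute_and_truncate(matrix, limit=2.0):
    """Mean-impute missing entries (None) per column, then clip to [-limit, limit].

    matrix: list of rows (variants), each a list of z scores per trait.
    """
    n_cols = len(matrix[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        # Assumption: a trait with no observed values is imputed as 0.
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [max(-limit, min(limit, col_means[j] if v is None else v))
         for j, v in enumerate(row)]
        for row in matrix
    ]


# Toy matrix: 3 variants x 2 traits, None marks a missing lookup.
toy_matrix = [
    [1.0, None],
    [3.0, -4.0],
    [None, 0.5],
]
prepared = impute_and_truncate(toy_matrix)
print(prepared)
```

Clipping before clustering limits the influence of a few very strong associations on the distances that Ward's method minimizes.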

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-023-01462-3.

Supplementary information

Supplementary Information (473.2KB, pdf)

Supplementary Note.

Reporting Summary (90.2KB, pdf)
Peer Review File (1.3MB, pdf)
Supplementary Video 1 (39.9MB, zip)

Superpositions between WT (blue) and A316T (red) GLP-1R during MD simulations. GLP-1R residue position 5.46: in WT (blue), Y242^3.45 persistently forms a hydrogen bond with the backbone of P312^5.42; in A316T (red), this interaction is replaced by the hydrogen bond with the side chain of T316^5.46. Red dotted lines indicate hydrogen bonds.

Supplementary Video 2 (29.5MB, zip)

Superpositions between WT (blue) and G168S (gray) GLP-1R during MD simulations. GLP-1R residue position 1.63 and ICL1: G168S forms a persistent hydrogen bond between the S168^1.63 side chain and the backbone of A164^1.59 (red dotted line), not present in the WT (G168^ICL1).

Supplementary Tables (1.6MB, xlsx)

Supplementary Tables 1–23.

Acknowledgements

This research has been conducted using the UK Biobank Resource, project number 37685. We are supported by the following: the Medical Research Council (grants MR/L01341X/1 to P.E., MR/R010676/1 to B.J., MR/R010676/1 to A.T.); PHE (to P.E.); the UK Dementia Research Institute (to P.E. and I.T.); the Alzheimer’s Society (to P.E. and I.T.); the Alzheimer’s Research UK (to P.E. and I.T.); the National Health and Medical Research Council (NHMRC) Fellowship Schemes (552498 to B.B., 339446 and 619667 to G.M.); the NHMRC Ideas (grant 1184726 to P.M.S. and D.W.); the NHMRC (grants 1150083 to P.M.S. and D.W., 1154434 to P.M.S., 1155302 to D.W.); the Swedish Research Council (grants 2017-02688, 2020-02191 to E. Ahlqvist and 2019-01417 to M.d.H.); the Swedish Heart-Lung Foundation (grants 20200781 and 20200602 to M.d.H.); the British Heart Foundation (to A.G.), the European Commission (grants LSHM-CT-2007-037273 and HEALTH-F2-2013-601456 to A.G.); the Novo Nordisk Foundation (grants NNF15CC0018486 to A.G. and NNF18CC0034900 to T.H.P.); the Lundbeck Foundation (grant R190-2014-3904 to T.H.P.); VIAgenomics (grant SP/19/2/344612 to A.G.); the Wellcome Trust (grants 090532/Z/09/Z, 203141/Z/16/Z to H.W., 104955/Z/14/Z to A. David, 090532 to M.I.M., 098381 to M.I.M., 106130 to M.I.M., 203141 to M.I.M., 212259 to M.I.M., 205915/Z/17/Z to I.P.); UKRI Innovation-HDR-UK Fellowship (grant MR/S003061/1 to R.J.S.); European Union’s Horizon 2020 research and innovation program LONGITOOLS (grant H2020-SC1-2019-874739 to M.A.K., A.U., Z.B. and I.P.); the European Foundation for the Study of Diabetes (to B.J. and M.A.K.); the Imperial Post-CCT Post-Doctoral Fellowship (to B.J.); the Academy of Medical Sciences (to B.J.); the National Institute for Health Research Imperial NIHR Biomedical Research Center (to B.J. and T.M.T.); the Engineering and Physical Sciences Research Council (to B.J.); the Society for Endocrinology (to B.J.); the British Society for Neuroendocrinology (to B.J.); Research England ‘Expanding excellence in England’ (to I.B.); the Research Foundation-Flanders (to V.L.); the Diabetes UK (to V. Salem, A.T.; BDA, 20/0006307 to I.P.); the Russian Science Foundation (grant 19-15-00115 to S.S.); the NIDDK (grant U01-DK105535 to M.I.M.); European Federation for the Study of Diabetes (to A.T.); the Agence Nationale de la Recherche (PreciDIAB, grant ANR-18-IBHU-0001 to J.G.M. and I.P.); the University of Lille mobility grant (to J.G.M.); the People-Centered Artificial Intelligence Institute, University of Surrey (Z.B., M.A.K., A. Demirkan and I.P.); the World Cancer Research Fund (to I.P.); the World Cancer Research Fund International (grant 2017/1641 to I.P.); the Royal Society (grant IEC\R2\181075 to I.P. and C.A.R.); the European Union through the ‘Fonds européen de développement regional’ (FEDER; to I.P.); the ‘Conseil Régional des Hauts-de-France’ (Hauts-de-France Regional Council; to I.P.); the ‘Métropole Européenne de Lille’ (MEL, European Metropolis of Lille; to I.P.).

Author contributions

These authors junior-led the study analyses and write-up: V.L., L.J. and A.U. Central analysis and writing group included: V.L., L.J., A.U., L.Z., K.S.G.G., Z.B., A.F., L.M., A.S., M.A.K., B.J. and I.P. Additional analyses were junior-led by: J.G.M., S.C., P.V.T., S.S., A. David, R.M., R.-M.R., E. Ahlqvist, Z.W., T.M.T., A.T. and V. Salem. GWAS cohort analyses were carried out by: G.T., H.G., E.E., B.B., R.A.S., A.I., J.H.Z., S.M.W., T.J., C.G., H.G., C.M., M.M.-N., R.J.S., A.G., D.R., J.D., Y.S.A. and M.A.K. Metabochip cohort analyses were undertaken by: E. Albrecht, A.U.J. and H.M.S. Cohort sample collection, genotyping, phenotyping or additional analyses were led by: I.R.C., E.F.-E., V. Steinthorsdottir, A.G.U., P.B.M., M.J.B., J.S., O.H., B.T., K.H., T.W., K.L.M., Z.W., M.d.H. and R.J.F.L. Metabochip cohort principal investigators were: W. Kratzer, M.H., W. Koenig and B.O.B. GWAS cohort principal investigators were: J.T., M.B., J.C.F., A. Hamsten, H.W., I.N., H.-E.W., M.J.C., K.T.K., C.M.v.D., A. Hofman, N.J.W., C.L., J.B.W., N.G.M., G.M., I.T., P.E., U.T., K.S., E.L.B. and J.B.M. Additional analyses senior leads were: P.M.S., D.W., L.G., G.D., A. Demirkan, T.H.P. and C.A.R. Senior authors who contributed to paper writing: I.B., C.S., M.I.M., P.F., J.D. and J.B.M. Senior author who contributed to analyses and was a member of the writing group: Y.S.A. Senior authors who led the study design, analyses and write-up were: M.A.K., B.J. and I.P.

Peer review

Peer review information

Nature Genetics thanks Marijana Vujkovic and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Data availability

Meta-analysis summary statistics for the GWAS presented in this manuscript are available on the MAGIC website (magicinvestigators.org) and through the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/downloads/summary-statistics, GCP ID: GCP000666; with study accession codes for Europeans-only meta-analysis: GCST90271557; cross-ancestry meta-analysis: GCST90271558; and sex-dimorphic meta-analysis: GCST90271559). UK Biobank individual-level data can be obtained through a data access application available at https://www.ukbiobank.ac.uk/. In this study, we made use of data made available by: 1000 Genomes project (https://www.genome.gov/27528684/1000-genomes-project); SNPsnap (https://data.broadinstitute.org/mpg/snpsnap/index.html); Tabula Muris (https://www.czbiohub.org/tabula-muris/); GTEx Consortium (https://gtexportal.org/home/); microbiome GWAS (https://mibiogen.gcc.rug.nl/); Human Gut Microbiome Atlas (https://www.microbiomeatlas.org); eQTLGen Consortium (https://www.eqtlgen.org/); TIGER expression data (http://tiger.bsc.es/) and LDHub database (http://ldsc.broadinstitute.org/ldhub/).

Competing interests

A.T. has received grant funding from Sun Pharmaceuticals and Eli Lilly. J.B.M. is an academic associate for Quest Diagnostics. They make an HbA1c assay. I.R.C. is an employee of New England Biolabs, a manufacturer and vendor of reagents for life science research. M.J.C. is Chief Scientist for Genomics England, a UK Government company. The views expressed in this article are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. M.I.M. has served on advisory panels for Pfizer, Novo Nordisk and Zoe Global, has received honoraria from Merck, Pfizer, Novo Nordisk and Eli Lilly and research funding from Abbvie, AstraZeneca, Boehringer Ingelheim, Eli Lilly, Janssen, Merck, Novo Nordisk, Pfizer, Roche, Sanofi Aventis, Servier and Takeda. As of June 2019, M.I.M. is an employee of Genentech and a holder of Roche stock. P.M.S. received grant funding from Laboratoires Servier. P.M.S. and D.W. receive funding from Astex Pharmaceuticals and Novo Nordisk. They are both shareholders of Septerna, where P.M.S. is also a founder. P.M.S. is the director and D.W. the Monash Node leader of the Australian Research Council of Australia Center for Cryo-Electron Microscopy of Membrane Proteins that includes the following as Partner Organizations who provide cash or in-kind funding: Astex Pharmaceuticals, AstraZeneca, Boehringer Ingelheim, Catalyst Therapeutics, Dimerix Bioscience, Genentech, Novo Nordisk, Pfizer, Sanofi Aventis, Servier and Thermo Fisher Scientific. T.J. is now a GSK employee. W. Koenig reports consulting fees from AstraZeneca, Novartis, Pfizer, The Medicines Company, DalCor, Kowa, Amgen, Corvidia, Daiichi-Sankyo, Genentech, Novo Nordisk, Esperion, OMEICOS, LIB Therapeutics; speaker honoraria from Amgen, Novartis, Berlin-Chemie, Sanofi and Bristol-Myers Squibb; grants and nonfinancial support from Abbott, Roche Diagnostics, Beckmann and Singulex, all outside the submitted work. Y.S.A. is the owner of Maatschap PolyOmica and PolyKnomics BV, private organizations providing services, research and development in the field of computational and statistical, quantitative and computational (gen)omics. G.T., U.T. and K.S. are employees of deCODE genetics/Amgen. The other authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Vasiliki Lagou, Longda Jiang, Anna Ulrich.

These authors jointly supervised this work: Marika A. Kaakinen, Ben Jones, Inga Prokopenko.

A full list of authors and their affiliations appears at the end of the paper.

Contributor Information

Marika A. Kaakinen, Email: m.kaakinen@imperial.ac.uk

Ben Jones, Email: ben.jones@imperial.ac.uk

Inga Prokopenko, Email: i.prokopenko@surrey.ac.uk

GWA-PA Consortium:

Marcel den Hoed

Meta-Analysis of Glucose and Insulin-Related Traits Consortium (MAGIC):

Cornelia M. van Duijn

Extended data is available for this paper at 10.1038/s41588-023-01462-3.

Supplementary information

The online version contains supplementary material available at 10.1038/s41588-023-01462-3.

References

  • 1. Santos RL, et al. Heritability of fasting glucose levels in a young genetically isolated population. Diabetologia. 2006;49:667–672. doi: 10.1007/s00125-006-0142-6.
  • 2. Almgren P, et al. Heritability and familiality of type 2 diabetes and related quantitative traits in the Botnia study. Diabetologia. 2011;54:2811–2819. doi: 10.1007/s00125-011-2267-5.
  • 3. Scott RA, et al. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes. 2017;66:2888–2902. doi: 10.2337/db16-1253.
  • 4. Vujkovic M, et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat. Genet. 2020;52:680–691. doi: 10.1038/s41588-020-0637-y.
  • 5. Chen J, et al. The trans-ancestral genomic architecture of glycemic traits. Nat. Genet. 2021;53:840–860. doi: 10.1038/s41588-021-00852-9.
  • 6. Dimas AS, et al. Impact of type 2 diabetes susceptibility variants on quantitative glycemic traits reveals mechanistic heterogeneity. Diabetes. 2014;63:2158–2171. doi: 10.2337/db13-0949.
  • 7. Ingelsson E, et al. Detailed physiologic characterization reveals diverse mechanisms for novel genetic loci regulating glucose and insulin metabolism in humans. Diabetes. 2010;59:1266–1275. doi: 10.2337/db09-1568.
  • 8. Scott RA, et al. Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nat. Genet. 2012;44:991–1005. doi: 10.1038/ng.2385.
  • 9. Bahl V, et al. G6PC2 controls glucagon secretion by defining the setpoint for glucose in pancreatic α-cells. Preprint at bioRxiv. 2023. doi: 10.1101/2023.05.23.541901.
  • 10. Bosma KJ, et al. Pancreatic islet β cell-specific deletion of G6pc2 reduces fasting blood glucose. J. Mol. Endocrinol. 2020;64:235–248. doi: 10.1530/JME-20-0031.
  • 11. Rutter GA, Georgiadou E, Martinez-Sanchez A, Pullen TJ. Metabolic and functional specialisations of the pancreatic β cell: gene disallowance, mitochondrial metabolism and intercellular connectivity. Diabetologia. 2020;63:1990–1998. doi: 10.1007/s00125-020-05205-5.
  • 12. Benonisdottir S, et al. Sequence variants associating with urinary biomarkers. Hum. Mol. Genet. 2019;28:1199–1211. doi: 10.1093/hmg/ddy409.
  • 13. Teumer A, et al. Genome-wide association meta-analyses and fine-mapping elucidate pathways influencing albuminuria. Nat. Commun. 2019;10:4130. doi: 10.1038/s41467-019-11576-0.
  • 14. Wuttke M, et al. A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat. Genet. 2019;51:957–972. doi: 10.1038/s41588-019-0407-x.
  • 15. Mahajan A, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 2018;50:1505–1513. doi: 10.1038/s41588-018-0241-6.
  • 16. Mahajan A, et al. Identification and functional characterization of G6PC2 coding variants influencing glycemic traits define an effector transcript at the G6PC2-ABCB11 locus. PLoS Genet. 2015;11:e1004876. doi: 10.1371/journal.pgen.1004876.
  • 17. Pullen TJ, Rutter GA. Roles of lncRNAs in pancreatic β cell identity and diabetes susceptibility. Front. Genet. 2014;5:193. doi: 10.3389/fgene.2014.00193.
  • 18. Deng YN, Xia Z, Zhang P, Ejaz S, Liang S. Transcription factor RREB1: from target genes towards biological functions. Int. J. Biol. Sci. 2020;16:1463–1473. doi: 10.7150/ijbs.40834.
  • 19. Piccand J, et al. Rfx6 maintains the functional identity of adult pancreatic β cells. Cell Rep. 2014;9:2219–2232. doi: 10.1016/j.celrep.2014.11.033.
  • 20. Tomkin GH. Treatment of type 2 diabetes, lifestyle, GLP1 agonists and DPP4 inhibitors. World J. Diabetes. 2014;5:636–650. doi: 10.4239/wjd.v5.i5.636.
  • 21. Spracklen CN, et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature. 2020;582:240–245. doi: 10.1038/s41586-020-2263-3.
  • 22. Wessel J, et al. Low-frequency and rare exome chip variants associate with fasting glucose and type 2 diabetes susceptibility. Nat. Commun. 2015;6:5897. doi: 10.1038/ncomms6897.
  • 23. Wan Q, et al. Mini G protein probes for active G protein-coupled receptors (GPCRs) in live cells. J. Biol. Chem. 2018;293:7466–7473. doi: 10.1074/jbc.RA118.001975.
  • 24. Jones B, et al. Targeting GLP-1 receptor trafficking to improve agonist efficacy. Nat. Commun. 2018;9:1602. doi: 10.1038/s41467-018-03941-2.
  • 25. Willard FS, et al. Tirzepatide is an imbalanced and biased dual GIP and GLP-1 receptor agonist. JCI Insight. 2020;5:e140532. doi: 10.1172/jci.insight.140532.
  • 26. Marzook A, Tomas A, Jones B. The interplay of glucagon-like peptide-1 receptor trafficking and signalling in pancreatic β cells. Front. Endocrinol. (Lausanne) 2021;12:678055. doi: 10.3389/fendo.2021.678055.
  • 27. Chedid V, et al. Allelic variant in the glucagon-like peptide 1 receptor gene associated with greater effect of liraglutide and exenatide on gastric emptying: a pilot pharmacogenetics study. Neurogastroenterol. Motil. 2018;30:e13313. doi: 10.1111/nmo.13313.
  • 28. De Luis DA, Diaz Soto G, Izaola O, Romero E. Evaluation of weight loss and metabolic changes in diabetic patients treated with liraglutide, effect of rs6923761 gene variant of glucagon-like peptide 1 receptor. J. Diabetes Complications. 2015;29:595–598. doi: 10.1016/j.jdiacomp.2015.02.010.
  • 29. Deganutti G, et al. Dynamics of GLP-1R peptide agonist engagement are correlated with kinetics of G protein activation. Nat. Commun. 2022;13:92. doi: 10.1038/s41467-021-27760-0.
  • 30. Wootten D, Simms J, Miller LJ, Christopoulos A, Sexton PM. Polar transmembrane interactions drive formation of ligand-specific and signal pathway-biased family B G protein-coupled receptor conformations. Proc. Natl Acad. Sci. USA. 2013;110:5211–5216. doi: 10.1073/pnas.1221585110.
  • 31. Venkatakrishnan AJ, et al. Diverse GPCRs exhibit conserved water networks for stabilization and activation. Proc. Natl Acad. Sci. USA. 2019;116:3288–3293. doi: 10.1073/pnas.1809251116.
  • 32. Yuan S, Filipek S, Palczewski K, Vogel H. Activation of G-protein-coupled receptors correlates with the formation of a continuous internal water pathway. Nat. Commun. 2014;5:4733. doi: 10.1038/ncomms5733.
  • 33. Karczewski KJ, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7.
  • 34. Pers TH, et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 2015;6:5890. doi: 10.1038/ncomms6890.
  • 35. Timshel PN, Thompson JJ, Pers TH. Genetic mapping of etiologic brain cell types for obesity. eLife. 2020;9:e55851. doi: 10.7554/eLife.55851.
  • 36. Ding Q, et al. Genome-wide meta-analysis associates GPSM1 with type 2 diabetes, a plausible gene involved in skeletal muscle function. J. Hum. Genet. 2020;65:411–420. doi: 10.1038/s10038-019-0720-3.
  • 37. Võsa U, et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 2021;53:1300–1310. doi: 10.1038/s41588-021-00913-z.
  • 38. Alonso L, et al. TIGER: the gene expression regulatory variation landscape of human pancreatic islets. Cell Rep. 2021;37:109807. doi: 10.1016/j.celrep.2021.109807.
  • 39. Yang L, et al. Effect of TET2 on the pathogenesis of diabetic nephropathy through activation of transforming growth factor β1 expression via DNA demethylation. Life Sci. 2018;207:127–137. doi: 10.1016/j.lfs.2018.04.044.
  • 40. Van de Bunt M, et al. Transcript expression data from human islets links regulatory signals from genome-wide association studies for type 2 diabetes and glycemic traits to their downstream effectors. PLoS Genet. 2015;11:e1005694. doi: 10.1371/journal.pgen.1005694.
  • 41. Kurilshikov A, et al. Large-scale association analyses identify host factors influencing human gut microbiome composition. Nat. Genet. 2021;53:156–165. doi: 10.1038/s41588-020-00763-1.
  • 42. Lopera-Maya EA, et al. Effect of host genetics on the gut microbiome in 7,738 participants of the Dutch Microbiome Project. Nat. Genet. 2022;54:143–151. doi: 10.1038/s41588-021-00992-y.
  • 43. Carmichael AJ, Arroyo CM, Cockerham LG. Reaction of disodium cromoglycate with hydrated electrons. Free Radic. Biol. Med. 1988;4:215–218. doi: 10.1016/0891-5849(88)90042-1.
  • 44. Zhang X, et al. Human gut microbiota changes reveal the progression of glucose intolerance. PLoS ONE. 2013;8:e71108. doi: 10.1371/journal.pone.0071108.
  • 45. Frost F, et al. A structured weight loss program increases gut microbiota phylogenetic diversity and reduces levels of Collinsella in obese type 2 diabetics: a pilot study. PLoS ONE. 2019;14:e0219489. doi: 10.1371/journal.pone.0219489.
  • 46. Vojinovic D, et al. Relationship between gut microbiota and circulating metabolites in population-based cohorts. Nat. Commun. 2019;10:5813. doi: 10.1038/s41467-019-13721-1.
  • 47. Segerstolpe A, et al. Single-cell transcriptome profiling of human pancreatic islets in health and type 2 diabetes. Cell Metab. 2016;24:593–607. doi: 10.1016/j.cmet.2016.08.020.
  • 48. Sharapov SZ, et al. Defining the genetic control of human blood plasma N-glycome using genome-wide association study. Hum. Mol. Genet. 2019;28:2062–2077. doi: 10.1093/hmg/ddz054.
  • 49. Clerc F, et al. Human plasma protein N-glycosylation. Glycoconj. J. 2016;33:309–343. doi: 10.1007/s10719-015-9626-2.
  • 50. Novokmet M, et al. Changes in IgG and total plasma protein glycomes in acute systemic inflammation. Sci. Rep. 2014;4:4347. doi: 10.1038/srep04347.
  • 51. Schmidt MI, et al. Markers of inflammation and prediction of diabetes mellitus in adults (Atherosclerosis Risk in Communities study): a cohort study. Lancet. 1999;353:1649–1652. doi: 10.1016/S0140-6736(99)01046-6.
  • 52. Dotz V, et al. Plasma protein N-glycan signatures of type 2 diabetes. Biochim. Biophys. Acta Gen. Subj. 2018;1862:2613–2622. doi: 10.1016/j.bbagen.2018.08.005.
  • 53. Keser T, et al. Increased plasma N-glycome complexity is associated with higher risk of type 2 diabetes. Diabetologia. 2017;60:2352–2360. doi: 10.1007/s00125-017-4426-9.
  • 54. Wittenbecher C, et al. Plasma N-glycans as emerging biomarkers of cardiometabolic risk: a prospective investigation in the EPIC-Potsdam cohort study. Diabetes Care. 2020;43:661–668. doi: 10.2337/dc19-1507.
  • 55. Johswich A, et al. N-glycan remodeling on glucagon receptor is an effector of nutrient sensing by the hexosamine biosynthesis pathway. J. Biol. Chem. 2014;289:15927–15941. doi: 10.1074/jbc.M114.563734.
  • 56. Lemmers RFH, et al. IgG glycan patterns are associated with type 2 diabetes in independent European populations. Biochim. Biophys. Acta Gen. Subj. 2017;1861:2240–2249. doi: 10.1016/j.bbagen.2017.06.020.
  • 57. Liu D, et al. Ischemic stroke is associated with the pro-inflammatory potential of N-glycosylated immunoglobulin G. J. Neuroinflammation. 2018;15:123. doi: 10.1186/s12974-018-1161-1.
  • 58. Liu M, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 2019;51:237–244. doi: 10.1038/s41588-018-0307-5.
  • 59. Wang Z, et al. Genome-wide association analyses of physical activity and sedentary behavior provide insights into underlying mechanisms and roles in disease prevention. Nat. Genet. 2022;54:1332–1344. doi: 10.1038/s41588-022-01165-1.
  • 60. Kopf S, et al. Breathlessness and restrictive lung disease: an important diabetes-related feature in patients with type 2 diabetes. Respiration. 2018;96:29–40. doi: 10.1159/000488909.
  • 61. Sonoda N, et al. A prospective study of the impact of diabetes mellitus on restrictive and obstructive lung function impairment: the Saku study. Metabolism. 2018;82:58–64. doi: 10.1016/j.metabol.2017.12.006.
  • 62. Abdi A, Jalilian M, Sarbarzeh PA, Vlaisavljevic Z. Diabetes and COVID-19: a systematic review on the current evidences. Diabetes Res. Clin. Pract. 2020;166:108347. doi: 10.1016/j.diabres.2020.108347.
  • 63. Zhu L, et al. Association of blood glucose control and outcomes in patients with COVID-19 and pre-existing type 2 diabetes. Cell Metab. 2020;31:1068–1077. doi: 10.1016/j.cmet.2020.04.021.
  • 64. Lagou V, et al. Sex-dimorphic genetic effects and novel loci for fasting glucose and insulin variability. Nat. Commun. 2021;12:24. doi: 10.1038/s41467-020-19366-9.
  • 65. Marullo L, El-Sayed Moustafa JS, Prokopenko I. Insights into the genetic susceptibility to type 2 diabetes from genome-wide association studies of glycaemic traits. Curr. Diab. Rep. 2014;14:551. doi: 10.1007/s11892-014-0551-8.
  • 66.Saxena R, et al. Genetic variation in GIPR influences the glucose and insulin responses to an oral glucose challenge. Nat. Genet. 2010;42:142–148. doi: 10.1038/ng.521. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Mingrone G, et al. Metabolic surgery versus conventional medical therapy in patients with type 2 diabetes: 10-year follow-up of an open-label, single-centre, randomised controlled trial. Lancet. 2021;397:293–304. doi: 10.1016/S0140-6736(20)32649-0. [DOI] [PubMed] [Google Scholar]
  • 68.Whang A, Nagpal R, Yadav H. Bi-directional drug-microbiome interactions of anti-diabetics. EBioMedicine. 2019;39:591–602. doi: 10.1016/j.ebiom.2018.11.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.1000 Genomes Project Consortium. et al. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Rueger S, McDaid A, Kutalik Z. Evaluation and application of summary statistic imputation to discover new height-associated loci. PLoS Genet. 2018;14:e1007371. doi: 10.1371/journal.pgen.1007371. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Loh PR, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 2015;47:284–290. doi: 10.1038/ng.3190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Magi R, Morris AP. GWAMA: software for genome-wide association meta-analysis. BMC Bioinformatics. 2010;11:288. doi: 10.1186/1471-2105-11-288. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Magi R, Lindgren CM, Morris AP. Meta-analysis of sex-specific genome-wide association studies. Genet. Epidemiol. 2010;34:846–853. doi: 10.1002/gepi.20540. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Chang CC, et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Yang J, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 2012;44:369–375. doi: 10.1038/ng.2213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Kenakin T. A scale of agonism and allosteric modulation for assessment of selectivity, bias, and receptor mutation. Mol. Pharmacol. 2017;92:414–424. doi: 10.1124/mol.117.108787. [DOI] [PubMed] [Google Scholar]
  • 78.Zhao P, et al. Activation of the GLP-1 receptor by a non-peptidic agonist. Nature. 2020;577:432–436. doi: 10.1038/s41586-019-1902-z. [DOI] [PubMed] [Google Scholar]
  • 79.Harvey MJ, Giupponi G, Fabritiis GD. ACEMD: accelerating biomolecular dynamics in the microsecond time scale. J. Chem. Theory Comput. 2009;5:1632–1639. doi: 10.1021/ct9000685. [DOI] [PubMed] [Google Scholar]
  • 80.Cuzzolin A, Deganutti G, Salmaso V, Sturlese M, Moro S. AquaMMapS: an alternative tool to monitor the role of water molecules during protein-ligand association. ChemMedChem. 2018;13:522–531. doi: 10.1002/cmdc.201700564. [DOI] [PubMed] [Google Scholar]
  • 81.Pers TH, Timshel P, Hirschhorn JN. SNPsnap: a web-based tool for identification and annotation of matched SNPs. Bioinformatics. 2015;31:418–420. doi: 10.1093/bioinformatics/btu655. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.1000 Genomes Project Consortium. et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073. doi: 10.1038/nature09534. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Tabula Muris Consortium. et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature. 2018;562:367–372. doi: 10.1038/s41586-018-0590-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Barbeira AN, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 2018;9:1825. doi: 10.1038/s41467-018-03621-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Gamazon ER, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 2015;47:1091–1098. doi: 10.1038/ng.3367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Delaneau O, Marchini J, the 1000 Genomes Project Consortium. Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat. Commun. 2014;5:3934. doi: 10.1038/ncomms4934. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Iotchkova V, et al. Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps. Nat. Genet. 2016;48:1303–1312. doi: 10.1038/ng.3668. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Almgren P, et al. Genetic determinants of circulating GIP and GLP-1 concentrations. JCI Insight. 2017;2:e93306. doi: 10.1172/jci.insight.93306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Wallace C. A more accurate method for colocalisation analysis allowing for multiple causal variants. PLoS Genet. 2021;17:1–11. doi: 10.1371/journal.pgen.1009440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Sharapov, S. et al. Genome-wide association summary statistics for human blood plasma glycome. Zenodo. 10.5281/zenodo.1298406 (2018).
  • 91.Finucane HK, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 2015;47:1228–1235. doi: 10.1038/ng.3404. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Zheng J, et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics. 2017;33:272–279. doi: 10.1093/bioinformatics/btw613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Fedko, I. O. et al. Genetics of fasting indices of glucose homeostasis using GWIS unravels tight relationships with inflammatory markers. Preprint at bioRxiv. 10.1101/496802 (2018).
  • 94.Shrine N, et al. New genetic signals for lung function highlight pathways and chronic obstructive pulmonary disease associations across multiple ancestries. Nat. Genet. 2019;51:481–493. doi: 10.1038/s41588-018-0321-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Hemani G, et al. The MR-base platform supports systematic causal inference across the human phenome. eLife. 2018;7:e34408. doi: 10.7554/eLife.34408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Burgess S, et al. Guidelines for performing Mendelian randomization investigations. Wellcome Open Res. 2019;4:186. doi: 10.12688/wellcomeopenres.15555.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 2015;44:512–525. doi: 10.1093/ije/dyv080. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Sanderson E, Spiller W, Bowden J. Testing and correcting for weak and pleiotropic instruments in two-sample multivariable Mendelian randomization. Stat. Med. 2021;40:5434–5452. doi: 10.1002/sim.9133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Choi SW, O’Reilly PF. PRSice-2: Polygenic Risk Score software for biobank-scale data. GigaScience. 2019;8:giz082. doi: 10.1093/gigascience/giz082. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary Information (473.2KB, pdf)

Supplementary Note.

Reporting Summary (90.2KB, pdf)
Peer Review File (1.3MB, pdf)
Supplementary Video 1 (39.9MB, zip)

Superpositions between WT (blue) and A316T (red) GLP-1R during MD simulations. GLP-1R residue position 5.46: in WT (blue), Y242^3.45 persistently forms a hydrogen bond with the backbone of P312^5.42; in A316T (red), this interaction is replaced by the hydrogen bond with the side chain of T316^5.46. Red dotted lines indicate hydrogen bonds.

Supplementary Video 2 (29.5MB, zip)

Superpositions between WT (blue) and G168S (gray) GLP-1R during MD simulations. GLP-1R residue position 1.63 and ICL1: G168S forms a persistent hydrogen bond between the S168^1.63 side chain and the backbone of A164^1.59 (red dotted line), not present in the WT (G168^ICL1).

Supplementary Tables (1.6MB, xlsx)

Supplementary Tables 1–23.

Data Availability Statement

Meta-analysis summary statistics for the GWAS presented in this manuscript are available on the MAGIC website (magicinvestigators.org) and through the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/downloads/summary-statistics, GCP ID: GCP000666; with study accession codes for Europeans-only meta-analysis: GCST90271557; cross-ancestry meta-analysis: GCST90271558; and sex-dimorphic meta-analysis: GCST90271559). UK Biobank individual-level data can be obtained through a data access application available at https://www.ukbiobank.ac.uk/. In this study, we made use of data made available by: 1000 Genomes project (https://www.genome.gov/27528684/1000-genomes-project); SNPsnap (https://data.broadinstitute.org/mpg/snpsnap/index.html); Tabula Muris (https://www.czbiohub.org/tabula-muris/); GTEx Consortium (https://gtexportal.org/home/); microbiome GWAS (https://mibiogen.gcc.rug.nl/); Human Gut Microbiome Atlas (https://www.microbiomeatlas.org); eQTLGen Consortium (https://www.eqtlgen.org/); TIGER expression data (http://tiger.bsc.es/) and LDHub database (http://ldsc.broadinstitute.org/ldhub/).


Articles from Nature Genetics are provided here courtesy of Nature Publishing Group
