Abstract
Genetic studies of nontraditional glycemic biomarkers, glycated albumin and fructosamine, can shed light on unknown aspects of type 2 diabetes genetics and biology. We performed a multiphenotype genome-wide association study of glycated albumin and fructosamine from 7,395 White and 2,016 Black participants in the Atherosclerosis Risk in Communities (ARIC) study on common variants from genotyped/imputed data. We discovered two genome-wide significant loci, one mapping to a known type 2 diabetes gene (ARAP1/STARD10) and another mapping to a novel region (UGT1A complex of genes), using multiomics gene-mapping strategies in diabetes-relevant tissues. We identified additional loci that were ancestry- and sex-specific (e.g., PRKCA in African ancestry, FCGRT in European ancestry, TEX29 in males). Further, we implemented multiphenotype gene-burden tests on whole-exome sequence data from 6,590 White and 2,309 Black ARIC participants. Ten variant sets annotated to genes across different variant aggregation strategies were exome-wide significant only in multiancestry analysis, of which CD1D, EGFL7/AGPAT2, and MIR126 had notable enrichment of rare predicted loss of function variants in African ancestry despite smaller sample sizes. Overall, 8 of 14 discovered loci and genes were implicated to influence these biomarkers via glycemic pathways, and most of them were not previously implicated in studies of type 2 diabetes. This study illustrates improved locus discovery and potential effector gene discovery by leveraging joint patterns of related biomarkers across the entire allele frequency spectrum in multiancestry analysis. Future investigation of the loci and genes potentially acting through glycemic pathways may help us better understand the risk of developing type 2 diabetes.
Article Highlights
Glycated albumin and fructosamine are biomarkers reflecting aspects of the glycemic process different from glycated hemoglobin or blood glucose levels. Thus, they can shed light on unknown aspects of type 2 diabetes genetics and biology.
We leveraged array-based and exome sequence data on multiancestry individuals in the U.S. to discover yet-unidentified genes.
We discovered 14 common variant loci and rare variant genes associated with glycated albumin and/or fructosamine, some of which have been implicated in type 2 diabetes. Locus-specific effects at common variants may vary by sex. Some loci and gene associations were unique to either European or African ancestry.
Introduction
Genetic studies of traditional measures of hyperglycemia, such as fasting glucose and HbA1c, have improved our understanding of the genetic mechanisms that may influence hyperglycemia and lead to type 2 diabetes (1). A large multiancestry study evaluating fasting glucose and HbA1c in >280,000 individuals without diabetes identified 102 loci for fasting glucose and 127 loci for HbA1c (2). However, little is known about the genetics of the nontraditional biomarkers of type 2 diabetes, such as fructosamine and glycated albumin, that reflect average blood glucose levels over the previous 2–3 weeks (3) and show promise as alternatives in the face of limitations of traditional measures (4–9). For instance, fasting glucose measurement requires patient preparation (8-h fasting), exhibits moderate within-person variability, has sample stability issues, and is affected by illness and stress, whereas measurement of HbA1c is costly, requires whole blood, is hemoglobin dependent, and may be inaccurate when the life span of red blood cells is altered (7,10).
Fructosamine and glycated albumin improve risk stratification of diabetes and its long-term complications, provide information complementary to that from traditional glycemic biomarkers, and hold promise in providing unique insights into hyperglycemia and diabetes pathophysiology (11–14). Our previous genome-wide association study (GWAS) on fructosamine and glycated albumin identified several associated genetic variants, including a known type 2 diabetes-related missense variant in GCKR and a novel missense variant in RCN3, which may impact fructosamine in a nonglycemic manner (15). The estimated narrow-sense heritability (i.e., heritability from the additive effects of variants across the entire genome) is substantial for both fructosamine and glycated albumin (h2 = 0.44 [SE 0.13] and 0.45 [SE 0.13], respectively) and even greater than that of HbA1c (h2 = 0.34 [SE 0.13]) (16). Yet, the single nucleotide polymorphism (SNP)-based heritability estimates (h2SNP), which represent heritability from common variants, are 0.11 (SE 0.03) for fructosamine and 0.10 (SE 0.04) for glycated albumin compared with 0.17 (SE 0.04) for HbA1c (16). These estimates suggest that there remain unidentified variants, including possibly rare variants, associated with these biomarkers. Just as GWAS of traditional glycemic biomarkers improved knowledge of type 2 diabetes pathophysiology at a time when type 2 diabetes was studied as a dichotomous trait alone (1), it is important to understand the genetic basis of these complementary nontraditional glycemic biomarkers that may inform yet-unknown mechanisms modulating glucose control and eventually leading to type 2 diabetes. Identifying additional variants associated with fructosamine and glycated albumin not found in our previous GWAS (15) requires methods that exploit additional information on these biomarkers and/or genotypes to improve statistical power.
One way to achieve improved power to detect disease-associated variants is through multivariate analysis of underlying disease-related phenotypes (17–19). Multivariate methods jointly analyze two or more phenotypes (e.g., multiple glycemic biomarkers) by accounting for the correlation structure of the phenotypes and test whether a genetic variant is statistically associated with at least one phenotype. Such approaches can identify pleiotropic variants that are otherwise hard to capture using standard single-phenotype analyses (19–21). In addition, they may capture the genetic association of a single phenotype by harnessing joint patterns across related phenotypes (19). Bivariate heritability estimates (narrow-sense h2 = 0.46, SNP-based ) (16) indicate shared genetics between fructosamine and glycated albumin, and hence, bivariate analysis of these phenotypes may help identify genetic variants associated with both of these biomarkers.
In this study, we aimed to expand the current inventory of genetic variants influencing these nontraditional biomarkers of hyperglycemia. We investigated genetic associations of common variants with minor allele frequency (MAF) ≥5% using a data-adaptive multiphenotype single-variant test (metaUSAT [22]) on genotyped/imputed data, and examined rare genetic associations of variants with MAF <5% using a multiphenotype gene-based method (GAMuT [23]) on exome sequence data from the Atherosclerosis Risk in Communities (ARIC) Study (24). We also investigated whether genetic findings for these biomarkers differed by genetic ancestry and sex. Finally, we examined whether the identified common variant loci and genes containing rare variants were potentially acting through glycemic pathways.
Research Design and Methods
Study Population
We included 9,411 individuals (n = 7,395 self-reported White and n = 2,016 self-reported Black) who were genotyped and passed quality control, and 8,898 individuals (n = 6,589 self-reported White and n = 2,309 self-reported Black) who were exome sequenced and passed quality control from the ARIC study (24) (Supplementary Fig. 1). Because differences in clinical measures and outcomes between racial/ethnic categories do not necessarily indicate biological differences between them, we analyzed all individuals together and also performed genetic ancestry-stratified analyses (Supplementary Methods).
Genotyping, Imputation, Exome Sequencing, and Quality Control
Samples were genotyped on the Affymetrix 6.0 array and imputed to 1000 Genomes Phase I (March 2012) separately by ancestry using IMPUTE2 (25). Poor-quality samples and variants were removed using usual quality control metrics, including removal of first-degree relatives (Supplementary Methods). The final data set included 7,827,582 genotyped/imputed SNPs with MAF ≥5%. Blood-based whole-exome sequencing of the samples was done on the Illumina HiSeq 2000 or 2500 platform (San Diego, CA), and several quality control measures were implemented (Supplementary Methods). The postquality control sample included genetic data on 2,556,859 single nucleotide variants and 76,133 indels from the exome.
Glycemic Biomarkers
Fructosamine and glycated albumin were measured from serum samples collected at visit 2. Here, glycated albumin is expressed as percentage: [(glycated albumin/serum albumin) × 100/1.14] + 2.9.
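The percentage conversion above can be written as a small function. This is a minimal sketch of the stated formula only; the function name, argument names, and the example values are hypothetical, and both concentrations are assumed to be in the same units.

```python
def glycated_albumin_percent(glycated_albumin: float, serum_albumin: float) -> float:
    """Return glycated albumin as a percentage of serum albumin.

    Both inputs must be in the same concentration units; the unit choice is
    an assumption, since only the ratio enters the formula.
    """
    if serum_albumin <= 0:
        raise ValueError("serum_albumin must be positive")
    ratio_percent = glycated_albumin / serum_albumin * 100.0
    # Constants 1.14 and 2.9 are taken from the formula given in the text.
    return ratio_percent / 1.14 + 2.9


# Hypothetical values, for illustration only.
example = glycated_albumin_percent(glycated_albumin=0.5, serum_albumin=4.0)
```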
Statistical Association Analyses
We natural log-transformed both biomarkers to account for skewed distributions and obtained biomarker residuals after adjusting each biomarker for age, sex, race-center variable, and the top 10 genetic principal components (PCs). We used the biomarker residuals as outcomes in all genetic association analyses. To evaluate effect of BMI on these glycemic traits, we also considered BMI-adjusted biomarker residuals. All single-variant analyses included variants with MAF ≥5%.
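The residualization step can be illustrated as follows. This is a sketch under stated assumptions, not the study's pipeline: it uses ordinary least squares with an intercept, is written with the standard library only so that it is self-contained, and the function name and toy data are hypothetical.

```python
import math


def residualize_log_biomarker(values, covariates):
    """Residuals of log(values) after least-squares adjustment for covariates.

    values: list of positive biomarker measurements, one per participant.
    covariates: non-empty list of rows, one per participant (e.g. age, sex, PCs).
    """
    n = len(values)
    y = [math.log(v) for v in values]
    # Design columns: intercept first, then each covariate.
    columns = [[1.0] * n] + [[float(row[j]) for row in covariates]
                             for j in range(len(covariates[0]))]
    basis = []
    for col in columns:
        # Modified Gram-Schmidt: remove what earlier columns already explain.
        for q in basis:
            coef = sum(a * b for a, b in zip(col, q))
            col = [a - coef * b for a, b in zip(col, q)]
        norm = math.sqrt(sum(a * a for a in col))
        if norm > 1e-12:  # skip columns that are (numerically) collinear
            basis.append([a / norm for a in col])
    # Project the outcome off the orthonormal basis to obtain the residuals.
    for q in basis:
        coef = sum(a * b for a, b in zip(y, q))
        y = [a - coef * b for a, b in zip(y, q)]
    return y


# Hypothetical toy data: biomarker values with age and sex as covariates.
fructosamine = [230.0, 245.0, 260.0, 221.0, 250.0, 238.0]
covars = [[54, 0], [61, 1], [66, 1], [49, 0], [58, 1], [57, 0]]
residuals = residualize_log_biomarker(fructosamine, covars)
```

By construction, the residuals sum to zero and are uncorrelated with every covariate, which is what allows them to serve as covariate-adjusted outcomes in the downstream association tests.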
Single-Variant Multiphenotype Analysis
Data-adaptive multiphenotype (multivariate) methods—usually based on GWAS summary statistics—ensure robust power performance across different alternatives (26) that vary from one variant to the next. We first obtained GWAS summary statistics by performing linear regression analysis of the biomarker residuals on each variant from the genotyped/imputed data using an additive genetic model in PLINK (27). We then implemented metaUSAT (20,22) on the GWAS summary statistics and tested the joint association of each variant with the two biomarker residuals (Supplementary Methods).
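To illustrate how per-phenotype summary statistics can be combined into one joint test per variant, the sketch below implements a simple two-degree-of-freedom Wald-type test. It is not metaUSAT, which is data-adaptive and is not reproduced here; it only shows the general idea of testing two correlated z-scores jointly. The function names are hypothetical, and the correlation between the z-scores under the null is assumed to equal the correlation between the two biomarker residuals.

```python
import math
from statistics import NormalDist


def z_from_beta_p(beta, p_value):
    """Signed z-score recovered from an effect estimate and two-sided P value."""
    magnitude = -NormalDist().inv_cdf(p_value / 2.0)
    return math.copysign(magnitude, beta)


def bivariate_wald_p(z1, z2, trait_corr):
    """P value of a 2-df Wald-type joint test for one variant.

    T = z' R^{-1} z with R = [[1, r], [r, 1]]; under the null, T follows a
    chi-square distribution with 2 df, whose upper tail is exp(-T / 2).
    """
    if not -1.0 < trait_corr < 1.0:
        raise ValueError("trait_corr must lie strictly between -1 and 1")
    t_stat = (z1 * z1 - 2.0 * trait_corr * z1 * z2 + z2 * z2) / (1.0 - trait_corr ** 2)
    return math.exp(-t_stat / 2.0)


# Hypothetical summary statistics for one variant and two phenotypes.
z_a = z_from_beta_p(beta=0.02, p_value=1e-4)
z_b = z_from_beta_p(beta=0.03, p_value=1e-5)
joint_p = bivariate_wald_p(z_a, z_b, trait_corr=0.8)
```

A joint statistic of this kind depends on the phenotype correlation as well as on the two marginal z-scores, so a variant can reach joint significance without being significant for either phenotype alone.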
Variant Set-Based Multiphenotype Analysis
A single-variant test of multiple phenotypes is not ideal for exome sequence data due to the large number of low-frequency and rare variants (26). We implemented gene-based multiphenotype analysis of the biomarkers on variants with MAF <5% from the exome sequence data using GAMuT (23), a nonparametric test of the null hypothesis of no association between a set of phenotypes and a set of genetic variants. Similar to previous human whole-exome association studies (28–30), we used SnpEff (31) to define four different, sometimes nested, sets of functionally important rare and/or common variants, called variant masks: 1) protein-truncating variants (PTVs) at any allele frequency; 2) PTVs in mask 1 plus missense variants with MAF <5%; 3) putative loss-of-function variants (pLOFs) at any allele frequency; 4) pLOFs with MAF <5%. We also created a mask involving rare deleterious missense variants and rare pLOFs; however, the resultant mask was identical to mask 4 (see Supplementary Methods for further details on masks and gene annotation). We collapsed all variants into a single-burden genotype (32) and included genes with at least three qualifying variants and a burden minor allele count (MAC) of five or more (33). We constructed the phenotype similarity matrix using both projection and linear kernels (i.e., two different ways of summarizing multiphenotype information). We declared exome-wide significance at 2.5 × 10−6 and additionally examined whether our findings were driven by nearby GWAS signals (Supplementary Methods).
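The burden collapsing and gene inclusion rule can be sketched as below. This is an illustration under assumptions rather than the study's implementation: the burden genotype is taken to be the per-person sum of minor-allele counts across a gene's qualifying variants, a variant is counted only if at least one participant carries it, and the function names and toy genotypes are hypothetical.

```python
def collapse_burden(genotypes):
    """Collapse per-variant minor-allele counts into one burden score per person.

    genotypes: list of rows (one per participant), each a list of 0/1/2
    minor-allele counts for the qualifying variants of one gene and mask.
    """
    return [sum(row) for row in genotypes]


def passes_gene_filter(genotypes, min_variants=3, min_mac=5):
    """Apply the gene inclusion rule: enough observed variants and enough MAC."""
    if not genotypes:
        return False
    n_variants = len(genotypes[0])
    # A variant counts here only if someone in the sample carries it (assumption).
    observed = sum(1 for j in range(n_variants)
                   if any(row[j] > 0 for row in genotypes))
    burden_mac = sum(collapse_burden(genotypes))
    return observed >= min_variants and burden_mac >= min_mac


# Hypothetical gene with three qualifying variants in five participants.
toy_gene = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 2],
    [1, 0, 0],
    [0, 0, 0],
]
burden = collapse_burden(toy_gene)   # one score per participant
keep = passes_gene_filter(toy_gene)  # 3 observed variants, burden MAC = 5
```

Under this rule, a gene whose burden MAC is 4 is excluded even if its association P value is below the exome-wide threshold.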
Stratified Analyses
For sex-stratified analysis, we obtained biomarker residuals in males and females separately, which were used as outcomes in multiancestry models for association analysis. For ancestry-stratified analysis, biomarker residuals and genetic PCs were obtained in each group separately. We only considered BMI-unadjusted models here because results from the sex-combined multiancestry analysis were qualitatively similar with and without BMI adjustment.
Locus Annotation and Functional Gene Prioritization
Independent loci from single-variant analyses were defined using FUMA (34), and each locus was annotated by its most significant SNP (lead SNP) and the name of the gene nearest to that SNP. To characterize the regulatory effects of the significant signals and prioritize functional genes, we searched for overlap between detected genetic determinants from our single-variant multiphenotype analysis and plasma protein quantitative trait loci (pQTLs) in ARIC (35). We additionally used three strategies in FUMA to perform functional gene mapping of the identified loci: positional mapping, expression (e)QTL mapping, and three-dimensional (3D) chromatin interaction mapping. We used GTEx v8 expression data (36) from diabetes-relevant tissues (adipose, liver, skeletal muscle, and pancreas) (37) and TIGER transcript expression data from human pancreatic islets (38) for eQTL mapping, and Hi-C data in GSE87112 (39) for chromatin interaction mapping (Supplementary Methods).
Characterization of Detected Loci and Genes as Glycemic Versus Nonglycemic
To understand biological pathways of these biomarkers that relate to glycemia, we repeated all of our analyses by conditioning on fasting glucose and followed a classification algorithm similar to those used previously (40,41). We classified the significant lead SNPs from our common variant analysis and significant genes from our set-based analyses as potentially “glycemic,” “maybe glycemic,” or “nonglycemic” based on considerable effect size attenuation in mediation analysis in ARIC, association analysis with fasting glucose and HbA1c in ARIC, and using single-variant and gene-based multiancestry meta-analysis results from the Common Metabolic Diseases Knowledge Portal (CMDKP) (42). Details about the classification algorithm, glycemic traits chosen from CMDKP, and other related choices are in the Supplementary Methods.
Data and Resource Availability
The genotyped data and the whole-exome sequence data are available in the database of Genotypes and Phenotypes (dbGaP) (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000090.v7.p1; https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000668.v5.p1). The phenotype data can be requested from the ARIC Coordinating Center.
Results
Study Characteristics
This study included 9,411 multiancestry participants (78% European ancestry, 21% African ancestry) with genotyped data and 8,898 multiancestry participants (67% European ancestry, 23% African ancestry) with exome sequencing data. More than half were females, and the mean age was ∼57 years. Values for all glycemic biomarkers were higher among African ancestry individuals than European ancestry (Supplementary Table 1). Correlation between the biomarkers was strong (r = 0.79), and estimates were similar in the genotyped and the exome-sequenced samples (Supplementary Table 2).
Locus Discovery from Multiphenotype Analysis of Common Variants
At genome-wide significance level of 5 × 10−8, the joint analysis of fructosamine and glycated albumin using the BMI-unadjusted model on sex-combined multiancestry data identified two loci: the well-known glycemic locus 11q13.4 (ARAP1/STARD10, rs116714277-T, P = 2.8 × 10−8) (43) and the locus 17q24.2 (PRKCA, rs59443763-C, P = 1.4 × 10−8) implicated in lipids and cardiovascular traits (42,44) (Table 1). Both the ARAP1/STARD10 and PRKCA loci were only identified in African ancestry individuals as the lead SNPs at these loci were removed due to poor quality in our European ancestry sample (Supplementary Fig. 2). These lead SNPs, rs116714277 and rs59443763, are extremely rare in the non-Finnish European population (MAF 0.03% and 0.06%, respectively) compared with the African/African American population (MAF 7.3% and 6.8%, respectively) (45). There were minimal differences in the genome-wide metaUSAT P values of BMI-unadjusted and BMI-adjusted models (r = 0.83) (Supplementary Fig. 3).
Table 1.
Association results for the most significant SNPs of the loci identified from multiphenotype analysis of fructosamine and glycated albumin using BMI-unadjusted model on sex-combined genotyped and imputed data
| Ancestry | Locus | Nearest gene | rsID (lead SNP) | Position (hg19) | Effect allele | Effect allele frequency† in ARIC data, European | Effect allele frequency† in ARIC data, African | Fructosamine effect size β§ | Fructosamine P value | Glycated albumin effect size β§ | Glycated albumin P value | metaUSAT BMI-unadjusted P value | metaUSAT BMI-adjusted P value | cis-eQTL mapping in diabetes-relevant tissues# |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Multiancestry (n = 9,411) | 11q13.4¶ | ARAP1/STARD10 | rs116714277 | 72473447 | T | — | 0.051 | 0.044 | 2.4 × 10−7 | 0.055 | 5.0 × 10−8 | 2.8 × 10−8 | 3.1 × 10−8 | Lead SNP is not cis-eQTL. Other cis-eQTL in this locus: Muscle_skeletal |
| | 17q24.2 | PRKCA | rs59443763 | 64526988 | C | — | 0.065 | 0.037 | 1.6 × 10−6 | 0.053 | 3.3 × 10−9 | 1.4 × 10−8 | 6.7 × 10−8 | No cis-eQTL in this locus. |
| European (n = 7,359) | 2p23.3 | GCKR | rs1260326 | 27730940 | T | 0.404 | 0.133 | −0.0036 | 2.0 × 10−2 | −0.010 | 4.0 × 10−8 | 5.2 × 10−8 | 6.4 × 10−8 | Lead SNP is cis-eQTL in Adipose_subcutaneous, Adipose_visceral, Muscle_skeletal, Pancreas, Pancreatic islets. Other cis-eQTLs in this locus: Adipose_subcutaneous, Adipose_visceral, Muscle_skeletal, Pancreas, Pancreatic islets |
| | 2q37.1¶ | UGT1A1* | rs887829 | 234668570 | T | 0.327 | 0.437 | 0.0074 | 4.7 × 10−6 | 0.00079 | 6.8 × 10−1 | 2.2 × 10−8 | 2.1 × 10−8 | Lead SNP is cis-eQTL in Liver. Other cis-eQTLs in this locus: Adipose_subcutaneous, Liver, Pancreas, Pancreatic islets |
| | 19q13.33 | FCGRT | rs59774409 | 50016748 | T | 0.055 | — | −0.018 | 1.6 × 10−7 | −0.022 | 1.2 × 10−7 | 2.4 × 10−8 | 7.2 × 10−8 | Lead SNP is cis-eQTL in Adipose_subcutaneous, Adipose_visceral, Muscle_skeletal, Pancreatic islets. Other cis-eQTLs in this locus: Adipose_subcutaneous, Adipose_visceral, Liver, Muscle_skeletal, Pancreas, Pancreatic islets |
| African (n = 2,004) | 11q13.4¶ | ARAP1/STARD10 | rs116714277 | 72473447 | T | — | 0.051 | 0.045 | 1.5 × 10−7 | 0.056 | 2.3 × 10−8 | 2.2 × 10−8 | 3.2 × 10−8 | Lead SNP is not cis-eQTL. Other cis-eQTL in this locus: Muscle_skeletal |
| | 11q22.1 | RP11-115E19.1 | rs2438321 | 98500410 | G | 0.222 | 0.100 | 0.039 | 2.1 × 10−9 | 0.033 | 1.2 × 10−5 | 1.3 × 10−8 | 6.2 × 10−8 | No cis-eQTL in this locus. |
| | 17q24.2 | PRKCA | rs59443763 | 64526988 | C | — | 0.065 | 0.037 | 1.1 × 10−6 | 0.053 | 3.4 × 10−9 | 1.7 × 10−8 | 6.5 × 10−8 | No cis-eQTL in this locus. |
These loci were identified by metaUSAT at the genome-wide threshold of 5 × 10−8 plus an additional locus that just missed this threshold. For a SNP, the metaUSAT P <5 × 10−8 indicates its statistically significant association with at least one of fructosamine and glycated albumin at the genome-wide level, which may or may not be significant in single-phenotype analysis. The effect sizes and P values for individual phenotypes are from the BMI-unadjusted model only.
†Blank cells for ancestry-specific allele frequencies (calculated based on our ancestry-stratified sex-combined ARIC data) could be due to removal of the variant during the QC procedure or to the variant being absent from our data set because of its rare population-level allele frequency.
§β denotes mean change in log(outcome) for every additional copy of the effect allele.
#The eQTL data sets consist primarily of European ancestry individuals, which may lead to limited eQTL findings for the loci identified in African ancestry participants.
¶Novel locus for fructosamine and/or glycated albumin (may or may not be novel for type 2 diabetes).
*This region includes multiple alternatively spliced genes including UGT1A1/UGT1A3/UGT1A4/UGT1A5/UGT1A10.
Effect Size Heterogeneity at Two Loci of Common Variants Between European and African Ancestries
The multiphenotype analysis in African ancestry identified three significant loci: 11q13.4 (ARAP1/STARD10, rs116714277-T, P = 2.2 × 10−8), 11q22.1 (RP11-115E19.1, rs2438321-G, P = 1.3 × 10−8), and 17q24.2 (PRKCA, rs59443763-C, P = 1.7 × 10−8) (Table 1). The locus near RP11-115E19.1 showed significant effect size heterogeneity at the lead SNP (Supplementary Fig. 2) and some allele frequency differences between European (non-Finnish MAF 27.2%) and African/African American (MAF 12.4%) populations (45). We did not find any qualitative difference overall in significance when BMI-adjusted models were used (Supplementary Fig. 4). Among European ancestry individuals, two loci were genome-wide significant: 2q37.1 (UGT1A region, rs887829-T, P = 2.2 × 10−8) and 19q13.33 (FCGRT, rs59774409-T, P = 2.4 × 10−8). An additional well-known lipids and glycemic locus just missed the GWAS threshold: 2p23.3 (GCKR, rs1260326-T, P = 5.2 × 10−8) (46).
The UGT1A locus was not significant in the multiancestry GWAS of either fructosamine or glycated albumin alone. The UGT1A region lead SNP showed some heterogeneity in effect sizes between ancestries (Supplementary Fig. 2) and is quite common in both European (non-Finnish MAF 32.5%) and African/African American (MAF 45.9%) populations (45). The GCKR locus was significantly associated with glycated albumin alone (P = 4.0 × 10−8), as we had previously reported (15). There was no effect size heterogeneity at the GCKR lead SNP, yet this locus is nearly significant in European ancestry but not in African ancestry (P = 0.74). This is likely due to the limited African ancestry sample size exacerbated by population-level allele frequency differences (non-Finnish European MAF 40.9%, African/African American MAF 13.3%) (45). The lead SNP at the FCGRT locus was absent in our African ancestry genotyped/imputed data (removed due to poor quality) despite common allele frequencies in both European and African populations (non-Finnish European MAF 8.3%, African/African American MAF 30%) (45). The FCGRT locus, implicated in GWAS of serum albumin and lipids (44), was found to be a novel type 2 diabetes susceptibility locus, where the lead variant rs142385484 (11 base pairs away from our lead SNP rs59774409) was genome-wide significant only in a multiancestry meta-analysis (47). Lack of complete attenuation of rs59774409’s effect when conditioned on rs142385484 (Supplementary Table 3) indicated the possibility of rs59774409 being related to type 2 diabetes through these nontraditional biomarkers independent of the effect of rs142385484. Although FCGRT was recently implicated as the likely effector transcript for fructosamine through eQTL colocalization of the previously identified locus RCN3 (48), this is likely the first time FCGRT has been directly discovered in a GWAS of these biomarkers.
The lead SNPs at UGT1A1 and FCGRT in European ancestry (i.e., two of the five discovered glycemic trait loci) have moderate to strong evidence of association with type 2 diabetes (P < 10−3) in European ancestry-specific meta-analysis and in multiancestry meta-analysis (47), respectively (Supplementary Table 4), suggesting that they may be important in type 2 diabetes pathophysiology.
Locus-Specific Effects at Common Variants May Vary by Sex
The sex-specific P values of multiphenotype association seemed to indicate that the signal at the ARAP1/STARD10 locus was primarily driven by females, whereas the other loci did not show sex differentiation (Table 2). However, we did not find any significant difference in effect sizes of lead SNPs between females and males at these loci (Supplementary Fig. 5). The overall correlation between the sex-stratified P values at all SNPs in all identified loci was weak, regardless of whether BMI was adjusted or not (Supplementary Fig. 6). When scanning genome-wide, we identified locus 13q34 (TEX29), which was significantly associated in males (rs79276590-C, P = 3.0 × 10−8) but not in females (Supplementary Fig. 7) and exhibited effect size heterogeneity in both biomarkers between sexes (Supplementary Fig. 8). This male-specific locus was mapped to gene TUBGCP3 via 3D chromatin interactions in liver, and no relevant cis-eQTL or cis-pQTL was found (Supplementary Fig. 9). TEX29 has been implicated in BMI studies (49,50); however, BMI adjustment led to only slight attenuation of effects. Note that the effective sample sizes for genome-wide significant SNPs at this locus were low for males (n = 730–751) due to very low MAFs and low imputation quality in White participants (European MAF <0.01%, African/African American MAF 5%) (45). A few other loci were suggestively significant (P < 10−6 but P > 5 × 10−8) in one sex group but not the other (Supplementary Table 5 and Supplementary Fig. 10). We exercise caution in interpreting suggestively significant sex-differentiated loci due to much reduced sample sizes in each group.
Table 2.
Effects of sex and BMI on the most significant SNPs of the loci identified from multiphenotype analysis of fructosamine and glycated albumin in multiancestry sample
| Locus | Nearest gene | rsID (lead SNP) | Position (hg19) | metaUSAT BMI-unadjusted P value, females | metaUSAT BMI-unadjusted P value, males | metaUSAT BMI-unadjusted P value, sex-combined | metaUSAT BMI-adjusted P value, females | metaUSAT BMI-adjusted P value, males | metaUSAT BMI-adjusted P value, sex-combined |
|---|---|---|---|---|---|---|---|---|---|
| 2q37.1 | UGT1A1* | rs4148325 | 234673309 | 1.0 × 10−2 | 7.5 × 10−5 | 5.9 × 10−7 | 9.7 × 10−3 | 8.1 × 10−5 | 2.6 × 10−7 |
| 11q13.4 | ARAP1/STARD10 | rs116714277 | 72473447 | 7.45 × 10−8 | 3.4 × 10−2 | 2.8 × 10−8 | 1.7 × 10−7 | 3.2 × 10−2 | 3.1 × 10−8 |
| 11q22.1 | RP11-115E19.1 | rs2438321 | 98500410 | 3.25 × 10−1 | 9.6 × 10−2 | 3.4 × 10−2 | 2.4 × 10−1 | 1.2 × 10−1 | 2.8 × 10−2 |
| 17q24.2 | PRKCA | rs59443763 | 64526988 | 3.1 × 10−4 | 3.5 × 10−5 | 1.4 × 10−8 | 2.3 × 10−4 | 2.3 × 10−4 | 6.7 × 10−8 |
| 19q13.33 | FCGRT | rs59774409 | 50016748 | 1.3 × 10−4 | 6.1 × 10−4 | 1.3 × 10−7 | 3.4 × 10−4 | 8.4 × 10−4 | 3.0 × 10−7 |
These loci were identified by metaUSAT at the genome-wide threshold of 5 × 10−8. The rsID and position correspond to the most significant SNP (lead SNP) in a locus.
*This region includes multiple alternatively spliced genes including UGT1A1/UGT1A3/UGT1A4/UGT1A5/UGT1A10.
Potential Functional Genes Identified From Plasma Cis-pQTL Mapping of Detected Loci
The lead SNP rs887829 in the UGT1A1 locus was cis-pQTL for proteins encoded by both UGT1A1 (P = 2.0 × 10−278, Uniprot ID P22309) and UGT1A6 (P = 5.6 × 10−104, Uniprot ID P19224) corresponding to isoforms of the uridine 5′-diphospho–glucuronosyltransferase 1A protein complex (Fig. 1). It was identified as cis-eQTL for UGT1A3 and UGT1A8 in the same complex of alternatively spliced genes in liver. This locus was also mapped to gene USP40 in diabetes-relevant tissues via both eQTL and 3D chromatin mapping strategies in FUMA (the protein UBP40 encoded by USP40 was not captured in the ARIC pQTL database). The lead SNP rs59774409 of the FCGRT locus was cis-pQTL for protein encoded by IRF3 (P = 3.4 × 10−28, Uniprot ID Q14653) and also cis-eQTL for the same gene in adipose tissues (Supplementary Fig. 11). No lead SNP identified in African ancestry was a significant cis-pQTL. Only one locus contained significant cis-pQTLs for protein encoded by PRKCA, with rs7222627—47.3 kilobase away from the lead SNP rs59443763—being the most significant cis-pQTL (P = 3.4 × 10−29, Uniprot ID P17252) (Fig. 2). The same gene was mapped in liver tissue via 3D chromatin interactions. A summary of these functional gene prioritizations is provided in Table 3 and Supplementary Figs. 12–17.
Figure 1.
Functional gene prioritization of the chr2q37.1 locus identified from multiphenotype analysis of fructosamine and glycated albumin using BMI-unadjusted model on sex-combined genotyped/imputed data on European ancestry participants. A and B: LocusZoom plots of ±250 kilobase radius around the lead SNP rs887829 are shown for cis-pQTL associations with plasma proteins encoded by UGT1A1 (A) and UGT1A6 (B) genes in ARIC White participants from a previous study. cM, centiMorgan; LD, linkage disequilibrium; Mb, megabase. C: This panel summarizes findings from cis-eQTL and 3D chromatin interaction mapping strategies in diabetes-relevant tissues from external data implemented in FUMA.
Figure 2.
Functional gene prioritization of the chr17q24.2 locus identified from multiphenotype analysis of fructosamine and glycated albumin using BMI-unadjusted model on sex-combined genotyped/imputed data on African ancestry participants. A: LocusZoom plot of ±250 Kb radius around the lead SNP rs59443763 for cis-pQTL associations with plasma protein encoded by PRKCA gene in ARIC Black participants from a previous study. cM, centiMorgan; LD, linkage disequilibrium; Mb, megabase. B: This panel summarizes findings from cis-eQTL and 3D chromatin interaction mapping strategies in diabetes-relevant tissues from external data implemented in FUMA.
Table 3.
Functional gene prioritization of the genome-wide significant loci identified from multiphenotype analysis of fructosamine and glycated albumin using BMI-unadjusted model on genotyped and imputed data
| Ancestry | Locus | Nearest gene | rsID (lead SNP) | Position (hg19) | cis-pQTL gene mapping (plasma)†: gene | cis-pQTL gene mapping (plasma)†: race/ethnicity | cis-eQTL gene mapping: gene | cis-eQTL gene mapping: diabetes-relevant tissue | 3D chromatin gene mapping: gene | 3D chromatin gene mapping: diabetes-relevant tissue | Classified as potentially glycemic? |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Multiancestry (n = 9,411) | 11q13.4¶ | ARAP1/STARD10 | rs116714277 | 72473447 | — | — | STARD10 | Muscle_skeletal | FCHSD2 | Liver, Pancreas | Glycemic |
| | | | | | | | | | P2RY2 | Liver | |
| | 17q24.2 | PRKCA | rs59443763 | 64526988 | PRKCA | Black | — | — | PRKCA, CEP112 | Liver | Glycemic |
| European (n = 7,359) | 2q37.1¶ | UGT1A1* | rs887829 | 234668570 | UGT1A1, UGT1A6 | White | UGT1A3, UGT1A8 | Liver | USP40, TRPM8 | Liver | No |
| | | | | | | | UGT1A7 | Pancreatic islets | | | |
| | | | | | | | USP40 | Adipose_subcutaneous, Pancreas | | | |
| | 19q13.33# | FCGRT | rs59774409 | 50016748 | IRF3 | White | RPS11 | Adipose_subcutaneous, Adipose_visceral, Muscle_skeletal, Pancreatic islets | — | — | Glycemic |
| | | | | | | | FCGRT | Adipose_subcutaneous, Liver, Adipose_visceral | | | |
| | | | | | | | RPL13A | Muscle_skeletal, Pancreas | | | |
| | | | | | | | NUCB1 | Pancreatic islets | | | |
| African (n = 2,004) | 11q13.4¶ | ARAP1/STARD10 | rs116714277 | 72473447 | — | — | STARD10 | Muscle_skeletal | FCHSD2 | Liver, Pancreas | Glycemic |
| | | | | | | | | | P2RY2 | Liver | |
| | 11q22.1 | RP11-115E19.1 | rs2438321 | 98500410 | — | — | — | — | — | — | Glycemic |
| | 17q24.2 | PRKCA | rs59443763 | 64526988 | PRKCA | Black | — | — | PRKCA, CEP112 | Liver | Glycemic |
The rsID and position correspond to the most significant SNP (lead SNP) in a locus. The cis-pQTL gene mapping used race-stratified ARIC data on plasma proteome from a previously published study. The cis-eQTL gene mapping used GTEx v8 data in diabetes-relevant tissues and the TIGER eQTL data on pancreatic islets. The 3D chromatin interaction gene mapping used Hi-C data in diabetes-relevant tissues from GSE87112. Blank cells indicate no significant gene mapping was found. The algorithm used to classify each locus as potentially glycemic or not is detailed in the Supplementary Methods.
†The cis-pQTL results, taken from a previously published study, were available stratified by self-reported race/ethnicity and not by genetic ancestry.
¶Novel locus for fructosamine and/or glycated albumin (may or may not be novel for type 2 diabetes).
*This region includes multiple alternatively spliced genes, including UGT1A1, UGT1A3, UGT1A4, UGT1A5, and UGT1A10.
#This locus reported 13 genes in eQTL analysis: RPS11, FCGRT, RPL13A, NUCB1, PRR12, CPT1C, SCAF1, RPL18, PRRG2, ALDH16A1, ADM5, C19orf73, and IRF3; only the top 4 genes with the highest number of eQTL associations from this locus across tissues are reported in this table.
Gene-Based Multiphenotype Tests Using Whole-Exome Sequence Data Identify Potential Effector Genes by Leveraging Rare Variants Enriched in African Ancestry
Across four different gene masks (i.e., gene-level variant aggregation approaches) and two different kernels (projection and linear kernels for summarizing multiphenotype information), we identified 10 significant genes from gene burden tests performed using GAMuT (Supplementary Fig. 18). In particular, when we considered common and rare PTVs (mask 1), the genes UGDH, TXNDC5, ARHGEF39, C15orf40, and ZNF208 were significantly associated using one or both phenotype kernels (Table 4). The variants included in these significant genes were very rare not only in our data but also in large databases such as the Genome Aggregation Database (gnomAD) (45) (Supplementary Table 6). Of note, the gene QSER1 (four variants, MAC 4, Pprojection = 2.3 × 10−7, Plinear = 3.8 × 10−8), recently implicated in type 2 diabetes by large-scale studies (37,51), was exome-wide significant in our study but failed the minimum MAC gene filter. None of these genes were detected in our stratified analysis due to reduced effective sample size in each ancestry group. We additionally detected FOSL2 and the region with overlapping genes RNF103 and CHMP3 when we added rare missense variants to the set of all PTVs (mask 2). No gene in this mask was significantly detected in African ancestry; only RNF103-CHMP3 was significant in European ancestry. By considering pLOFs across the entire allele-frequency spectrum (mask 3), we identified significant variant sets annotated to the genes CD1D, EGFL7, AGPAT2, and MIR126. The variants included in these significant genes were mostly singletons or doubletons in our multiancestry data; consequently, we obtained the same findings as for mask 3 when we restricted ourselves to rare pLOFs only (mask 4).
The eight pLOFs annotated to EGFL7 and AGPAT2 by SnpEff (31) were identical, of which four were designated as high-confidence pLOFs for EGFL7 by Variant Effect Predictor (52), three were high-confidence pLOFs for AGPAT2, and the remaining variant had no information in gnomAD (45) (Supplementary Table 7). The variant set annotated to MIR126 included seven of the eight pLOFs annotated to EGFL7/AGPAT2; however, Variant Effect Predictor did not predict any of these variants to have a consequence on MIR126 (45). All five variants annotated to CD1D were high-confidence pLOFs for CD1D (45). We found notable enrichment of these rare pLOFs in African ancestry compared with European ancestry, despite the threefold smaller sample size for African ancestry in our study. All but CD1D also exhibited significant gene-level associations in African ancestry but none in European ancestry. Among the significant genes, only FOSL2 was within 1 megabase of a common variant signal (near GCKR), and it remained exome-wide significant after conditioning on that signal (Supplementary Table 8). Overall, the quantile-quantile plots suggest there could be minor inflation in our gene-based tests for all but mask 2 (Supplementary Fig. 19).
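The four variant sets compared above differ only in which annotation classes and allele frequencies they admit. The following Python sketch restates the mask definitions as they are labeled in Table 4. The `Variant` record, its field names, and the toy variants are hypothetical illustrations and are not taken from the study's pipeline.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.05  # masks 2 and 4 use "MAF <5%"


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (field names are illustrative)."""
    variant_id: str
    is_ptv: bool        # protein-truncating variant
    is_missense: bool   # missense variant
    is_plof: bool       # putative loss-of-function variant
    maf: float          # minor allele frequency


def in_mask(v: Variant, mask: int) -> bool:
    """Return True if variant v belongs to the given mask (1 to 4)."""
    if mask == 1:   # PTVs at any allele frequency
        return v.is_ptv
    if mask == 2:   # PTVs at any allele frequency + missense variants with MAF <5%
        return v.is_ptv or (v.is_missense and v.maf < MAF_CUTOFF)
    if mask == 3:   # pLOFs at any allele frequency
        return v.is_plof
    if mask == 4:   # pLOFs with MAF <5%
        return v.is_plof and v.maf < MAF_CUTOFF
    raise ValueError(f"unknown mask: {mask}")


def variant_set(variants, mask):
    """IDs of the variants aggregated for one gene under one mask."""
    return [v.variant_id for v in variants if in_mask(v, mask)]


# Toy gene whose pLOFs are all very rare (made-up values, not study data).
toy_gene = [
    Variant("v1", is_ptv=True, is_missense=False, is_plof=True, maf=0.0001),
    Variant("v2", is_ptv=False, is_missense=True, is_plof=False, maf=0.0100),
    Variant("v3", is_ptv=False, is_missense=True, is_plof=False, maf=0.2000),
    Variant("v4", is_ptv=True, is_missense=False, is_plof=True, maf=0.0002),
]

for mask in (1, 2, 3, 4):
    print(mask, variant_set(toy_gene, mask))
```

When every pLOF in a gene is rare, masks 3 and 4 select the same variants, which is consistent with the matching mask 3 and mask 4 findings reported above.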
Table 4.
Gene-based association results from gene burden multivariate analysis of fructosamine and glycated albumin using the BMI-unadjusted model on sex-combined, multiancestry whole-exome sequence data
| Variant sets (masks) | Chr | Gene§ | Multiancestry (n = 8,898): No. of variants | Multiancestry: Gene burden MAC | Multiancestry: GAMuT P value, projection kernel | Multiancestry: GAMuT P value, linear kernel | European ancestry (n = 5,986): No. of variants | European ancestry: Gene burden MAC | European ancestry: GAMuT P value, projection kernel | European ancestry: GAMuT P value, linear kernel | African ancestry (n = 2,003): No. of variants | African ancestry: Gene burden MAC | African ancestry: GAMuT P value, projection kernel | African ancestry: GAMuT P value, linear kernel | Classified as potentially glycemic? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mask 1: protein-truncating variants at any allele frequency | 4 | UGDH | 3 | 5 | 1.2 × 10−7 | 1.2 × 10−4 | 3 | 5 | 4.6 × 10−2 | 2.7 × 10−2 | 0 | 0 | — | — | No |
| | 6 | TXNDC5 | 8 | 5 | 2.2 × 10−13 | 2.7 × 10−14 | 3 | 1 | — | — | 5 | 4 | — | — | Glycemic |
| | 9 | ARHGEF39 | 9 | 6 | 1.3 × 10−6 | 2.4 × 10−7 | 9 | 6 | 1.4 × 10−1 | 2.3 × 10−2 | 1 | 1 | — | — | Glycemic |
| | 15 | C15orf40 | 3 | 5 | 7.0 × 10−7 | 5.3 × 10−3 | 2 | 5 | 5.1 × 10−2 | 1.6 × 10−1 | 1 | 1 | — | — | No |
| | 19 | ZNF208 | 6 | 7 | 8.1 × 10−10 | 1.2 × 10−9 | 5 | 6 | 6.5 × 10−1 | 8.3 × 10−1 | 1 | 3 | — | — | Glycemic |
| Mask 2: protein-truncating variants at any allele frequency + missense variants with MAF <5% | 2 | FOSL2 | 34 | 60 | 2.5 × 10−6 | 1.0 × 10−6 | 23 | 28 | 2.0 × 10−1 | 2.4 × 10−1 | 12 | 39 | 4.7 × 10−4 | 1.1 × 10−4 | Glycemic |
| | 2 | RNF103-CHMP3 | 23 | 36 | 5.2 × 10−7 | 9.3 × 10−5 | 13 | 30 | 1.8 × 10−7 | 4.5 × 10−6 | 14 | 13 | 2.4 × 10−1 | 5.5 × 10−1 | Maybe |
| Mask 3: putative loss-of-function variants at any allele frequency | 1 | CD1D | 5 | 9 | 1.6 × 10−7 | 3.3 × 10−8 | 2 | 2 | — | — | 4 | 7 | 8.9 × 10−3 | 3.9 × 10−3 | Glycemic |
| | 9 | EGFL7/AGPAT2 | 8 | 7 | 8.5 × 10−11 | 1.2 × 10−3 | 3 | 2 | — | — | 5 | 6 | 1.6 × 10−6 | 4.5 × 10−2 | No |
| | 9 | MIR126 | 7 | 6 | 1.5 × 10−11 | 4.3 × 10−4 | 3 | 2 | — | — | 4 | 5 | 8.4 × 10−7 | 3.4 × 10−2 | No |
| Mask 4: putative loss-of-function variants with MAF <5% | 1 | CD1D | 5 | 9 | 1.6 × 10−7 | 3.3 × 10−8 | 2 | 2 | — | — | 4 | 7 | 8.9 × 10−3 | 3.9 × 10−3 | Glycemic |
| | 9 | EGFL7/AGPAT2 | 8 | 7 | 8.5 × 10−11 | 1.2 × 10−3 | 3 | 2 | — | — | 5 | 6 | 1.6 × 10−6 | 4.4 × 10−2 | No |
| | 9 | MIR126 | 7 | 6 | 1.5 × 10−11 | 4.3 × 10−4 | 3 | 2 | — | — | 4 | 5 | 8.4 × 10−7 | 3.4 × 10−2 | No |
These genes were identified by GAMuT from the multiancestry data at the exome-wide significance threshold of 2.5 × 10−6 from the projection kernel, the linear kernel, or both (here, kernels summarize the multivariate phenotype information). Genes with missing ancestry-specific P values are those that failed the gene filters (minimum of 3 variants and minimum gene burden MAC of 5). The algorithm used to classify each gene as potentially glycemic or not is detailed in the Supplementary Methods.
§All reported genes are novel for fructosamine and/or glycated albumin (may or may not be novel for type 2 diabetes).
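The reporting rule described in the Table 4 notes (gene filters first, then the exome-wide threshold on either kernel) can be sketched in Python as follows. The function names are hypothetical, and the use of a strict inequality at the threshold is an assumption because the notes do not say how ties at 2.5 × 10−6 are handled. The example inputs are the multiancestry values for UGDH and C15orf40 from Table 4 and the QSER1 values quoted in the text.

```python
EXOME_WIDE_ALPHA = 2.5e-6  # exome-wide significance threshold (Table 4 notes)
MIN_VARIANTS = 3           # minimum number of variants in a gene
MIN_BURDEN_MAC = 5         # minimum gene burden minor allele count (MAC)


def passes_gene_filters(n_variants: int, burden_mac: int) -> bool:
    """Apply the two gene filters stated in the Table 4 notes."""
    return n_variants >= MIN_VARIANTS and burden_mac >= MIN_BURDEN_MAC


def is_reported(n_variants: int, burden_mac: int,
                p_projection: float, p_linear: float) -> bool:
    """Gene passes the filters and is significant under at least one kernel.

    Assumption: a strict '<' comparison is used at the threshold.
    """
    if not passes_gene_filters(n_variants, burden_mac):
        return False
    return min(p_projection, p_linear) < EXOME_WIDE_ALPHA


# (no. of variants, gene burden MAC, P projection kernel, P linear kernel)
examples = {
    "UGDH": (3, 5, 1.2e-7, 1.2e-4),      # significant via the projection kernel
    "C15orf40": (3, 5, 7.0e-7, 5.3e-3),  # significant via the projection kernel
    "QSER1": (4, 4, 2.3e-7, 3.8e-8),     # small P values, but MAC 4 fails the filter
}

for gene, values in examples.items():
    print(gene, is_reported(*values))
```

The same filter explains the missing ancestry-specific P values in Table 4: for example, TXNDC5 in European ancestry has 3 variants but a gene burden MAC of 1, so `passes_gene_filters(3, 1)` is false.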
Classification of Detected Loci and Genes Based on Their Likely Biological Pathways
The genetic basis of fructosamine and glycated albumin could be influenced through nonglycemic pathways. Among the significant loci identified from single-variant multiphenotype analysis of fructosamine and glycated albumin, our algorithm classified the ARAP1/STARD10, RP11-115E19.1, and PRKCA loci as glycemic because they are likely mediated by fasting glucose, the FCGRT locus as glycemic due to suggestive evidence of its association with type 2 diabetes in recent literature, and the UGT1A1 locus as nonglycemic (Supplementary Table 9). From the set-based analyses, the significant genes TXNDC5, ARHGEF39, ZNF208, FOSL2, and CD1D were classified as glycemic, whereas RNF103-CHMP3 and AGPAT2 may be glycemic because they were associated with no glycemic trait other than HbA1c (Supplementary Table 10).
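To make the reasoning behind these labels concrete, the Python sketch below encodes one possible decision rule based only on the criteria named in this article: attenuation of the effect estimate after adjustment for fasting glucose, literature support for type 2 diabetes, and association with HbA1c alone. It is not the published algorithm, which is detailed in the Supplementary Methods. The function names, the order in which criteria are checked, and the `cutoff` argument (deliberately left without a default) are all assumptions, and the effect sizes in the usage lines are made up.

```python
def relative_attenuation(beta_unadjusted: float, beta_adjusted: float) -> float:
    """Fraction of the unadjusted effect lost after adjusting for fasting glucose."""
    if beta_unadjusted == 0:
        raise ValueError("unadjusted effect estimate must be nonzero")
    return 1.0 - beta_adjusted / beta_unadjusted


def classify(attenuation: float, cutoff: float,
             t2d_literature_support: bool = False,
             hba1c_only: bool = False) -> str:
    """Hypothetical three-way label mirroring the values used in Table 4.

    The cutoff is an assumption supplied by the caller; the article does not
    state a numeric value.
    """
    if attenuation >= cutoff or t2d_literature_support:
        return "Glycemic"
    if hba1c_only:
        return "Maybe"
    return "No"


# Made-up effect estimates and an arbitrary cutoff, for illustration only.
strong = relative_attenuation(0.10, 0.02)  # most of the effect disappears
weak = relative_attenuation(0.10, 0.09)    # effect largely unchanged

print(classify(strong, cutoff=0.5))
print(classify(weak, cutoff=0.5, hba1c_only=True))
print(classify(weak, cutoff=0.5))
```

A rule of this kind inherits the limitation acknowledged in the Discussion: if fasting glucose is measured with error, attenuation is underestimated and glycemic loci can be mislabeled.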
Discussion
These multiancestry multiphenotype analyses using common and rare variants in the ARIC study revealed novel genetic underpinnings of the nontraditional glycemic biomarkers fructosamine and glycated albumin. The multiphenotype analysis of common variants in European ancestry identified the UGT1A region and the FCGRT locus, which were missed by the typical single-phenotype analysis of fructosamine or glycated albumin alone and by the multiancestry joint analysis. At the lead SNP of the FCGRT locus, the nontraditional biomarkers had effects in a direction opposite to that for type 2 diabetes, HbA1c, and fasting glucose from other multiancestry studies (2,47) (Supplementary Fig. 20). This is not unexpected because epidemiologic studies have indicated these nontraditional biomarkers have both shared and unique aspects compared with traditional biomarkers (10–13,53). The UGT1A lead SNP exhibited considerable effect size heterogeneity for fructosamine between European and African ancestries, whereas the FCGRT lead SNP was absent in the African ancestry data. The UGT1A region is a complex of alternatively spliced genes including UGT1A1, 1A3, 1A4, 1A5, 1A6, 1A7, 1A8, 1A9, and 1A10, some of which were implicated in our cis-pQTL, cis-eQTL, and 3D chromatin gene mapping strategies. These genes are involved in the glucuronidation of bilirubin, which makes bilirubin water soluble. Moderately elevated bilirubin is associated with a decreased risk of diabetes and cardiovascular disease (54–56). Bilirubin can also bind to albumin (57). Although the UGT1A variants showed suggestive significance with fructosamine (a concentration not accounting for total serum protein), this region was not associated with glycated albumin (which, expressed as a percentage, accounts for total serum albumin). It is possible the multiphenotype association is primarily driven by fructosamine via an albumin-related pathway rather than a diabetes-related pathway.
Nonetheless, we found this locus was genome-wide significant in a multiancestry meta-analysis of type 2 diabetes (47), suggesting it may be important in type 2 diabetes pathophysiology.
The joint modeling of fructosamine and glycated albumin using the genotyped/imputed African ancestry data showed three genome-wide significant loci that were also identified in the single-phenotype analyses, and all were likely mediated by fasting glucose. Although the PRKCA and RP11-115E19.1 loci have been previously implicated for glycated albumin and fructosamine, respectively (15), the ARAP1/STARD10 locus, which showed nearly significant sex differentiation, is novel for these glycemic biomarkers. Only the RP11-115E19.1 locus exhibited significant effect size heterogeneity between ancestries; the lead SNPs for the other two were absent in our European ancestry data due to removal of poor-quality SNPs. In the novel locus ARAP1/STARD10, also detected in our multiancestry analysis, we found significant cis-eQTLs for STARD10 in skeletal muscle, and significant 3D chromatin interactions with FCHSD2 in liver and pancreas and with P2RY2 in liver. Common variants in both ARAP1 and STARD10 have known associations with type 2 diabetes and traditional glycemic traits (58–61). This gene-rich region encompassing ARAP1, STARD10, and FCHSD2 is strongly associated with type 2 diabetes in CMDKP (42), which, along with our mediation analysis, indicates this locus likely influences these nontraditional glycemic traits via a diabetes-related pathway. We found a male-specific locus near TEX29 driven by the African ancestry participants; however, its role in type 2 diabetes pathophysiology is not yet clear.
Multiancestry joint analysis using variant sets from the exome sequencing data revealed 10 new genes associated with these nontraditional glycemic biomarkers. In particular, five genes (TXNDC5, ARHGEF39, ZNF208, RNF103-CHMP3, and CD1D) were also significantly associated with fasting glucose or HbA1c in ARIC. UGDH is involved in starch and sucrose metabolism and in pentose and glucuronate interconversions; the Human Protein Atlas (HPA) (62) indicates its tissue specificity (liver) and its involvement in lipid metabolism in intestine and liver based on gene expression clustering; and there is “moderate genetic support” for involvement in 2-h insulin (42) based on the Human Genetic Evidence (HuGE) score (63). TXNDC5 plays an important role in iron metabolism (CMDKP shows “very strong genetic support” for hemoglobin concentration) and appears to play a role in diabetes progression and the response of pancreatic cells to high glucose exposure (64). There is “extreme genetic support” for involvement of ZNF208 and “moderate support” for FOSL2, particularly due to rare variants, in fasting insulin adjusted for BMI and 2-h insulin, respectively (42). There is some evidence in the literature showing the role of FOSL2 in insulin regulation and glucose metabolism in humans and mice (65). CMDKP shows “very strong genetic support” for RNF103 in BMI, and the HPA indicates its involvement in metabolism in liver based on gene expression clustering.
Rare pLOFs have the potential to elucidate gene function (32), and we found four such genes from three exome-wide significant variant sets. CD1D could influence our nontraditional glycemic biomarkers via both diabetes- and blood-related pathways, as evidenced by associations of rare pLOFs in CD1D with fasting glucose (P = 2.2 × 10−6) and HbA1c (P = 1.8 × 10−5) in ARIC, a “very strong” HuGE score in favor of both HbA1c and red blood cell distribution width in CMDKP, and strong associations of multiple common variants close to this gene with blood traits in the Open Targets Genetics portal (66,67). “Moderate genetic support” for involvement of rare variants in EGFL7 in the glycemic trait adiponectin and of common variants in EGFL7 in BMI-adjusted fasting insulin is indicative of this gene’s likely influence via a glycemic pathway (42). EGFL7 may also act via a nonglycemic pathway, particularly a blood-related pathway, as implicated by a “very strong” HuGE score for involvement in lymphocyte count (42) and common variant associations with blood traits (67). There is some evidence of association of dysregulation of microRNA-126 (the RNA product of the MIR126 gene) with type 2 diabetes and related complications (68–70), but MIR126 is not associated with fasting glucose or HbA1c in ARIC, and little is known about this gene. AGPAT2 may be a glycemic gene (in CMDKP, HbA1c rare variant gene-based analysis yielded P = 0.003), but its role in type 2 diabetes is unclear. The lack of overlap between signals from common and rare variants, unlike in some other whole-exome studies (32), is likely because we are limited by sample size and by the small number of studies reporting common variant signals for these nontraditional glycemic biomarkers.
A major limitation of this work is the lack of validation in an external data set. Fructosamine and glycated albumin are not routinely collected in epidemiologic studies. The lack of studies with both genome-wide data on diverse ancestry and rigorous measurements of these biomarkers currently precludes a GWAS larger than the current study. However, by leveraging joint patterns, correlations, and genetic overlap between biomarkers, and by exploiting genotyped and exome sequence data to query variants across the entire allele frequency spectrum, we were able to improve power to unravel the genetic architecture of fructosamine and glycated albumin. We compensated, to some extent, for the lack of replication data by assessing the role of discovered loci and genes in type 2 diabetes using existing literature and large-scale public databases such as CMDKP. Our approach for classifying discovered loci and genes as potentially “glycemic” based on attenuation of effect estimates when adjusted for fasting glucose is not perfect because fasting glucose measurements are prone to error and are an imperfect gold standard. Further follow-up of the potentially glycemic loci and genes from this study is necessary to help uncover mechanisms by which these genetic regions may influence the risk of type 2 diabetes. An area of future work is to investigate pleiotropy between the traditional and these nontraditional glycemic biomarkers and to investigate causal relationships among these biomarkers by leveraging the loci not involved in a diabetes-related pathway. Nonetheless, this study expands the inventory of loci and genes associated with fructosamine and glycated albumin (some of which are unique to these glycemic biomarkers and others that overlap with type 2 diabetes-susceptibility regions), finds evidence of potential cross-ancestry differences in the biology of these biomarkers (e.g., heterogeneous effect sizes at the level of genetic ancestry), and finds suggestive evidence of sex-specific effects.
This article contains supplementary material online at https://doi.org/10.2337/figshare.26018581.
Article Information
Acknowledgments. All analyses were done using the computing cluster—the Joint High Performance Computing Exchange—at the Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health. The authors thank Dr. Candelaria Vergara at the Johns Hopkins Bloomberg School of Public Health for her guidance with the identification of genetic ancestry of study participants. The authors are thankful to the staff and the participants of the ARIC Study for their important contributions.
Funding. This research was supported in part by the National Institutes of Health (NIH), National Institute of Diabetes and Digestive and Kidney Diseases grant R21DK125888 (D.R., S.V., J.Z., and N.C.), the National Human Genome Research Institute grant R21HG012978 (D.R. and N.C.), the National Heart, Lung, and Blood Institute institutional training grant T32 HL007024 (S.J.L.), and the National Heart, Lung, and Blood Institute grant K24HL152440 (E.S.). The ARIC Study has been funded in whole or in part with federal funds from the National Heart, Lung, and Blood Institute, NIH, Department of Health and Human Services (contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I, and HHSN268201700005I and grant numbers R01HL087641 and R01HL086694), National Human Genome Research Institute contract U01HG004402, and NIH contract HHSN268200625226C. Infrastructure was partly supported by grant number UL1RR025005, a component of the NIH and NIH Roadmap for Medical Research. Funding support for whole-exome sequencing “Building on GWAS for NHLBI-diseases: the U.S. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium” was provided by the NIH through the American Recovery and Reinvestment Act of 2009 (ARRA) (5RC2HL102419). Whole-exome sequencing was performed at the Baylor College of Medicine Human Genome Sequencing Center (U54HG003273 and R01HL086694). Reagents for the glycated albumin assays were donated by the Asahi Kasei Corporation. Reagents for the fructosamine assays were donated by Roche Diagnostics Corporation.
Duality of Interest. The following authors report unrelated disclosures: S.J.L., as a Biogen employee, owns stock in the company but contributed to this work while she was a full-time graduate student in the Johns Hopkins Bloomberg School of Public Health. S.V. was employed at Schrödinger, Inc., New York, at the time of revision but contributed to this work when she was a full-time employee of the Johns Hopkins Bloomberg School of Public Health. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. D.R. contributed to software and validation of results. D.R. and S.J.L. contributed to formal analysis, investigation, methodology, and project administration and wrote the original draft. D.R., S.J.L., E.S., and P.D. conceptualized the study. D.R. and J.Z. visualized the study. D.R., N.C., E.S., and P.D. supervised the study. D.R., E.S., and P.D. acquired funding. S.V., J.Z., and A.T. curated the data. D.R., S.J.L., S.V., J.Z., A.T., B.Y., N.C., E.S., and P.D. reviewed and edited the manuscript. D.R. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Prior Presentation. Portions of this study were presented as a poster at the 31st Annual Meeting of the International Genetic Epidemiology Society, Paris, France, 7–9 September 2022.
Web Resources.
ARIC pQTL summary statistics, http://nilanjanchatterjeelab.org/pwas/
FUMA, https://fuma.ctglab.nl/
GAMuT, https://www.biostat.umn.edu/~baolin/research/gamut/
gnomAD (v4.0.0), https://gnomad.broadinstitute.org/
Human Protein Atlas, https://www.proteinatlas.org/
metaUSAT (v1.17), https://github.com/RayDebashree/metaUSAT
PLINK (v1.9), https://www.cog-genomics.org/plink/
SNP2GENE (v1.3.7), FUMA https://fuma.ctglab.nl/
SnpEff (v5.0e), https://pcingola.github.io/SnpEff
TIGER Data Portal, https://tiger.bsc.es
References
- 1. Billings LK, Florez JC. The genetics of type 2 diabetes: what have we learned from GWAS? Ann N Y Acad Sci 2010;1212:59–77
- 2. Chen J, Spracklen CN, Marenne G, et al.; Meta-Analysis of Glucose and Insulin-related Traits Consortium (MAGIC). The trans-ancestral genomic architecture of glycemic traits. Nat Genet 2021;53:840–860
- 3. Armbruster DA. Fructosamine: structure, analysis, and clinical usefulness. Clin Chem 1987;33:2153–2163
- 4. Goldstein DE, Little RR, Lorenz RA, et al. Tests of glycemia in diabetes. Diabetes Care 2004;27:1761–1773
- 5. Koga M, Kasayama S. Clinical impact of glycated albumin as another glycemic control marker. Endocr J 2010;57:751–762
- 6. Cohen RM, Sacks DB. Comparing multiple measures of glycemia: how to transition from biomarker to diagnostic test? Clin Chem 2012;58:1615–1617
- 7. Parrinello CM, Selvin E. Beyond HbA1c and glucose: the role of nontraditional glycemic markers in diabetes diagnosis, prognosis, and management. Curr Diab Rep 2014;14:548.
- 8. Malmström H, Walldius G, Grill V, et al. Fructosamine is a useful indicator of hyperglycaemia and glucose control in clinical and epidemiological studies-cross-sectional and longitudinal experience from the AMORIS cohort. PLoS One 2014;9:e111463.
- 9. Welsh KJ, Kirkman MS, Sacks DB. Role of glycated proteins in the diagnosis and management of diabetes: research gaps and future directions. Diabetes Care 2016;39:1299–1306
- 10. Juraschek SP, Steffes MW, Selvin E. Associations of alternative markers of glycemia with hemoglobin A(1c) and fasting glucose. Clin Chem 2012;58:1648–1655
- 11. Selvin E, Rawlings AM, Grams M, et al. Fructosamine and glycated albumin for risk stratification and prediction of incident diabetes and microvascular complications: a prospective cohort analysis of the Atherosclerosis Risk in Communities (ARIC) study. Lancet Diabetes Endocrinol 2014;2:279–288
- 12. Selvin E, Rawlings AM, Lutsey PL, et al. Fructosamine and glycated albumin and the risk of cardiovascular outcomes and death. Circulation 2015;132:269–277
- 13. Rawlings AM, Sharrett AR, Albert MS, et al. The association of late-life diabetes status and hyperglycemia with incident mild cognitive impairment and dementia: the ARIC study. Diabetes Care 2019;42:1248–1254
- 14. Ding N, Kwak L, Ballew SH, et al. Traditional and nontraditional glycemic markers and risk of peripheral artery disease: the Atherosclerosis Risk in Communities (ARIC) study. Atherosclerosis 2018;274:86–93
- 15. Loomis SJ, Li M, Maruthur NM, et al. Genome-wide association study of serum fructosamine and glycated albumin in adults without diagnosed diabetes: results from the Atherosclerosis Risk in Communities Study. Diabetes 2018;67:1684–1696
- 16. Loomis SJ, Tin A, Coresh J, et al. Heritability analysis of nontraditional glycemic biomarkers in the Atherosclerosis Risk in Communities Study. Genet Epidemiol 2019;43:776–785
- 17. Shen X, Klarić L, Sharapov S, et al. Multivariate discovery and replication of five novel loci associated with immunoglobulin G N-glycosylation. Nat Commun 2017;8:447.
- 18. Shriner D. Moving toward system genetics through multiple trait analysis in genome-wide association studies. Front Genet 2012;3:1.
- 19. Turchin MC, Stephens M. Bayesian multivariate reanalysis of large genetic studies identifies many new associations. PLoS Genet 2019;15:e1008431.
- 20. Ray D, Pankow JS, Basu S. USAT: a unified score-based association test for multiple phenotype-genotype analysis. Genet Epidemiol 2016;40:20–34
- 21. Ray D, Chatterjee N. A powerful method for pleiotropic analysis under composite null hypothesis identifies novel shared loci between type 2 diabetes and prostate cancer. PLoS Genet 2020;16:e1009218.
- 22. Ray D, Boehnke M. Methods for meta-analysis of multiple traits using GWAS summary statistics. Genet Epidemiol 2018;42:134–145
- 23. Broadaway KA, Cutler DJ, Duncan R, et al. A statistical approach for testing cross-phenotype effects of rare variants. Am J Hum Genet 2016;98:525–540
- 24. Wright JD, Folsom AR, Coresh J, et al. The ARIC (Atherosclerosis Risk In Communities) study: JACC Focus Seminar 3/8. J Am Coll Cardiol 2021;77:2939–2959
- 25. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 2009;5:e1000529.
- 26. Ray D, Chatterjee N. Effect of non-normality and low count variants on cross-phenotype association tests in GWAS. Eur J Hum Genet 2020;28:300–312
- 27. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–575
- 28. Fuchsberger C, Flannick J, Teslovich TM, et al. The genetic architecture of type 2 diabetes. Nature 2016;536:41–47
- 29. Locke AE, Steinberg KM, Chiang CWK, et al.; FinnGen Project. Exome sequencing of Finnish isolates enhances rare-variant association power. Nature 2019;572:323–328
- 30. Akbari P, Gilani A, Sosina O, et al.; DiscovEHR Collaboration. Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity. Science 2021;373:eabf8683.
- 31. Cingolani P, Platts A, Wang LL, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 2012;6:80–92
- 32. Backman JD, Li AH, Marcketta A, et al.; DiscovEHR. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 2021;599:628–634
- 33. Dutta D, Scott L, Boehnke M, Lee S. Multi-SKAT: general framework to test for rare-variant association with multiple phenotypes. Genet Epidemiol 2019;43:4–23
- 34. Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun 2017;8:1826.
- 35. Zhang J, Dutta D, Köttgen A, et al.; CKDGen Consortium. Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies. Nat Genet 2022;54:593–602
- 36. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020;369:1318–1330
- 37. Mahajan A, Taliun D, Thurner M, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet 2018;50:1505–1513
- 38. Alonso L, Piron A, Morán I, et al.; MAGIC. TIGER: the gene expression regulatory variation landscape of human pancreatic islets. Cell Rep 2021;37:109807.
- 39. Schmitt AD, Hu M, Jung I, et al. A compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell Rep 2016;17:2042–2059
- 40. Wheeler E, Leong A, Liu C-T, et al.; Lifelines Cohort Study. Impact of common genetic determinants of hemoglobin A1c on type 2 diabetes risk and diagnosis in ancestrally diverse populations: a transethnic genome-wide meta-analysis. PLoS Med 2017;14:e1002383.
- 41. Sarnowski C, Leong A, Raffield LM, et al.; National Heart, Lung, and Blood Institute TOPMed Consortium. Impact of rare and common genetic variants on diabetes diagnosis by hemoglobin A1c in multi-ancestry cohorts: the Trans-Omics for Precision Medicine Program. Am J Hum Genet 2019;105:706–718
- 42. Costanzo MC, von Grotthuss M, Massung J, et al.; AMP-T2D Consortium. The Type 2 Diabetes Knowledge Portal: an open access genetic resource dedicated to type 2 diabetes and related traits. Cell Metab 2023;35:695–710.e6
- 43. Strawbridge RJ, Dupuis J, Prokopenko I, et al.; C4D Consortium. Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes. Diabetes 2011;60:2624–2634
- 44. Buniello A, MacArthur JAL, Cerezo M, et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 2019;47:D1005–D1012
- 45. Chen S, Francioli LC, Goodrich JK, et al.; Genome Aggregation Database Consortium . A genomic mutational constraint map using variation in 76,156 human genomes. Nature 2024;625:92–100 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University; Saxena R, Voight BF, Lyssenko V, et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 2007;316:1331–1336 [DOI] [PubMed] [Google Scholar]
- 47. Vujkovic M, Keaton JM, Lynch JA, et al.; VA Million Veteran Program . Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat Genet 2020;52:680–691 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Riveros-Mckay F, Roberts D, Di Angelantonio E, et al. An expanded genome-wide association study of fructosamine levels identifies RCN3 as a replicating locus and implicates FCGRT as the effector transcript. Diabetes 2022;71:359–364 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Pulit SL, Stoneman C, Morris AP, et al.; GIANT Consortium . Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry. Hum Mol Genet 2019;28:166–174 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Huang J, Huffman JE, Huang Y, et al.; VA Million Veteran Program. Genomics and phenomics of body mass index reveals a complex disease network. Nat Commun 2022;13:7973
- 51. Mahajan A, Spracklen CN, Zhang W, et al.; eMERGE Consortium. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat Genet 2022;54:560–572
- 52. McLaren W, Gil L, Hunt SE, et al. The Ensembl Variant Effect Predictor. Genome Biol 2016;17:122
- 53. Juraschek SP, Steffes MW, Miller ER, Selvin E. Alternative markers of hyperglycemia and risk of diabetes. Diabetes Care 2012;35:2265–2270
- 54. Cheriyath P, Gorrepati VS, Peters I, et al. High total bilirubin as a protective factor for diabetes mellitus: an analysis of NHANES data from 1999 - 2006. J Clin Med Res 2010;2:201–206
- 55. Inoguchi T, Sasaki S, Kobayashi K, Takayanagi R, Yamada T. Relationship between Gilbert syndrome and prevalence of vascular complications in patients with diabetes. JAMA 2007;298:1398–1400
- 56. Ko GT, Chan JC, Woo J, et al. Serum bilirubin and cardiovascular risk factors in a Chinese population. J Cardiovasc Risk 1996;3:459–463
- 57. Vítek L. The role of bilirubin in diabetes, metabolic syndrome, and cardiovascular diseases. Front Pharmacol 2012;3:55
- 58. Voight BF, Scott LJ, Steinthorsdottir V, et al.; GIANT Consortium. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet 2010;42:579–589
- 59. Huyghe JR, Jackson AU, Fogarty MP, et al. Exome array analysis identifies new loci and low-frequency variants influencing insulin processing and secretion. Nat Genet 2013;45:197–201
- 60. Kulzer JR, Stitzel ML, Morken MA, et al. A common functional regulatory variant at a type 2 diabetes locus upregulates ARAP1 expression in the pancreatic beta cell. Am J Hum Genet 2014;94:186–197
- 61. Carrat GR, Hu M, Nguyen-Tu M-S, et al. Decreased STARD10 expression is associated with defective insulin secretion in humans and mice. Am J Hum Genet 2017;100:238–256
- 62. The Human Protein Atlas. Accessed 2 August 2022. Available from https://www.proteinatlas.org/
- 63. Dornbos P, Singh P, Jang D-K, et al. Evaluating human genetic support for hypothesized metabolic disease genes. Cell Metab 2022;34:661–666
- 64. Chawsheen HA, Ying Q, Jiang H, Wei Q. A critical role of the thioredoxin domain containing protein 5 (TXNDC5) in redox homeostasis and cancer development. Genes Dis 2018;5:312–322
- 65. Ahmed SAH, Ansari SA, Mensah-Brown EPK, Emerald BS. The role of DNA methylation in the pathogenesis of type 2 diabetes mellitus. Clin Epigenetics 2020;12:104
- 66. Ghoussaini M, Mountjoy E, Carmona M, et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res 2021;49:D1311–D1320
- 67. Mountjoy E, Schmidt EM, Carmona M, et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat Genet 2021;53:1527–1533
- 68. Liu Y, Gao G, Yang C, et al. The role of circulating microRNA-126 (miR-126): a novel biomarker for screening prediabetes and newly diagnosed type 2 diabetes mellitus. Int J Mol Sci 2014;15:10567–10577
- 69. Suresh Babu S, Thandavarayan RA, Joladarashi D, et al. MicroRNA-126 overexpression rescues diabetes-induced impairment in efferocytosis of apoptotic cardiomyocytes. Sci Rep 2016;6:36207
- 70. Zeinali F, Aghaei Zarch SM, Jahan-Mihan A, et al. Circulating microRNA-122, microRNA-126-3p and microRNA-146a are associated with inflammation in patients with pre-diabetes and type 2 diabetes mellitus: a case control study. PLoS One 2021;16:e0251697


