Journal of the American Society of Nephrology : JASN. 2020 Jan 9;31(2):309–323. doi: 10.1681/ASN.2019030289

Whole-Genome Sequencing of Finnish Type 1 Diabetic Siblings Discordant for Kidney Disease Reveals DNA Variants Associated with Diabetic Nephropathy

Jing Guo 1,2, Owen J L Rackham 2, Niina Sandholm 3,4,5, Bing He 1, Anne-May Österholm 1,2, Erkka Valo 3,4,5, Valma Harjutsalo 3,4,5,6, Carol Forsblom 3,4,5, Iiro Toppila 3,4,5, Maija Parkkonen 3,4,5, Qibin Li 7, Wenjuan Zhu 7, Nathan Harmston 2,8, Sonia Chothani 2, Miina K Öhman 2, Eudora Eng 2, Yang Sun 2, Enrico Petretto 2,9, Per-Henrik Groop 3,4,5,10, Karl Tryggvason 1,2,11
PMCID: PMC7003303  PMID: 31919106

Significance Statement

Although diabetic nephropathy is partly genetic in nature, the underlying pathogenetic mechanisms are obscure. The authors assembled from the homogeneous Finnish population a cohort of 76 sibling pairs with type 1 diabetes who were discordant for diabetic nephropathy. Using whole-genome sequencing and multiple analytic approaches, they identified DNA variants associated with nephropathy or its absence and validated their findings in a 3531-member cohort of unrelated Finns with type 1 diabetes. The genes most strongly associated with diabetic nephropathy encode two protein kinase C isoforms (isoforms ε and ι) not previously implicated in the condition. Besides providing a resource for studies on diabetic complications, these findings support previous hypotheses that the protein kinase C family plays a role in diabetic nephropathy and suggest potential targets for treatment.

Keywords: diabetic nephropathy, diabetic kidney diseases, whole genome sequencing, discordant sibling pairs, association test

Abstract

Background

Several genetic susceptibility loci associated with diabetic nephropathy have been documented, but no causative variants implying novel pathogenetic mechanisms have been elucidated.

Methods

We carried out whole-genome sequencing of a discovery cohort of Finnish siblings with type 1 diabetes who were discordant for the presence (case) or absence (control) of diabetic nephropathy. Controls had diabetes without complications for 15–37 years. We analyzed and annotated variants at genome, gene, and single-nucleotide variant levels. We then replicated the associated variants, genes, and regions in a replication cohort from the Finnish Diabetic Nephropathy study that included 3531 unrelated Finns with type 1 diabetes.

Results

We observed protein-altering variants and an enrichment of variants in regions associated with the presence or absence of diabetic nephropathy. The replication cohort confirmed variants in both regulatory and protein-coding regions. We also observed that diabetic nephropathy–associated variants, when clustered at the gene level, are enriched in a core protein-interaction network representing proteins essential for podocyte function. These genes include protein kinases (protein kinase C isoforms ε and ι) and protein tyrosine kinase 2.

Conclusions

Our comprehensive analysis of a diabetic nephropathy cohort of siblings with type 1 diabetes who were discordant for kidney disease points to variants and genes that are potentially causative or protective for diabetic nephropathy. This includes variants in two isoforms of the protein kinase C family not previously linked to diabetic nephropathy, adding support to previous hypotheses that the protein kinase C family members play a role in diabetic nephropathy and might be attractive therapeutic targets.


With the increase in the incidence of diabetes worldwide, complications such as diabetic nephropathy (DN), retinopathy, neuropathy, skin ulcers, and amputations have become a major global health and socioeconomic threat. In addition to intensive blood glucose control,1 the only drugs providing a significant delay in progression of DN are angiotensin-converting enzyme inhibitors or angiotensin receptor blockers, which reduce intraglomerular pressure and efferent arteriolar vasoconstriction.2 The molecular pathogenesis of DN is still poorly understood. Hyperglycemia, a major risk factor for complications, causes accumulation of toxic glucose derivatives, such as methylglyoxal, that bind covalently to the side chains of amino acids, particularly arginine and lysine, and also methionine and cysteine.3,4 Hyperglycemia alone is not sufficient to trigger the development of complications, as only 30%–40% of individuals with type 1 diabetes (T1D) develop diabetic microangiopathy.1,5,6 Independent familial studies have shown a trend of family aggregation of DN in different populations,7,8 suggesting a genetic predisposition to DN. At least four metabolic pathways have been implicated in the development of complications: polyol flux, increased formation of advanced glycation end products, hyperactivity of the hexosamine pathway, and activation of protein kinase C (PKC) isoforms.4,9,10

Genome-wide association studies (GWAS) and candidate gene approaches have identified several potential genomic loci for DN susceptibility,11 but no variants with a major effect on the risk of complications have been found, suggesting that DN is modulated by a number of variants in genes that cooperate within complex pathways. It is intriguing, however, that several independent, genome-wide linkage analysis studies carried out in white Americans, Pima Indians, black Americans, and Finns have identified the same DN susceptibility locus on chromosome 3q.12–15 The complex interaction between genetics, risk factors such as hyperglycemia, and environmental components makes it more challenging to find specific genes for DN using genetic association studies. To that end, it could be advantageous to search for DN susceptibility genes in populations such as Finns, a uniquely homogeneous European population16 with the world’s highest incidence of T1D.17,18 With a combination of founder effects and genetic isolation, the population has accumulated rare genetic traits referred to as the “Finnish Disease Heritage.”19 In addition, Finland has a good public health care system, including nationwide disease and treatment registries, which facilitates identification of patients and follow-up of their clinical records.

Methods

Experimental Design

To search for DN susceptibility genes, we have assembled a cohort of Finnish T1D siblings with extreme phenotypes regarding the presence (case) or absence (control) of DN. This discovery cohort contained 76 T1D sibling pairs discordant for DN, and three T1D families with three siblings (total of 80 cases and 81 controls). The samples came from two sources: the Finnish National Institutes of Health and Welfare diabetes collections, as described elsewhere15; and the Finnish Diabetic Nephropathy (FinnDiane) study.20 Furthermore, 3531 unrelated individuals with T1D (1344 cases and 2187 controls) (Figure 1A) from FinnDiane were used for replication of findings made in the discovery cohort. The main clinical characteristics of patients in the discovery cohort are summarized in Table 1.

Figure 1.

Study design illustrates patient cohort and principles of DNA variant analysis. (A) Cohorts used in the search for DN susceptibility genes in Finnish patients with T1D: the genomes of a total of 76 sibling pairs concordant for T1D but discordant for diabetic nephropathy (DSPs) were subjected to WGS. Additionally, T1D siblings from three families with three siblings (multiple siblings [MS]) with or without DN were included in the sequencing analyses. The control siblings (n=81) have had diabetes for at least 15 years (range, 15–37 years) without developing DN, and have never been on angiotensin-converting enzyme inhibitor or angiotensin receptor blocker medication for kidney disease. The case siblings (n=80) have had overt proteinuria, been on dialysis, received a kidney transplant, or have died from kidney complications. (B) Multilevel strategy used to analyze the WGS data from Finnish T1D individuals with or without diabetes complications. SNV, single-nucleotide variants.

Table 1.

Clinical characteristics of the Finnish T1D patient discovery cohort

Characteristic | Cases | Controls
N (male %) [a] | 80 (61.3) | 81 (46.9)
T1D duration, yr [b] | Range, 21–38 | Range, 15–37
T1D age at onset, yr | 11.6±8.1 | 16.6±11.3
Systolic BP, mm Hg | 149.2±23.1 (n=60) | 135.4±15.3 (n=59)
Diastolic BP, mm Hg | 82.1±11.3 (n=60) | 79.2±8.0 (n=59)
Antihypertensive medication at baseline (%) | 83.8 | 25.9
Antihypertensive medication during follow-up (%) | 98.0 | 74.0
Hemoglobin A1c (%) | 9.0±2.0 (n=70) | 8.4±1.4 (n=57)
Body mass index, kg/m2 | 26.3±5.0 (n=63) | 26.4±3.9 (n=57)
Total cholesterol, mmol/L | 5.5±1.2 (n=69) | 5.1±1.0 (n=77)
Lipid-lowering medication at baseline (%) | 22.5 | 9.9
Lipid-lowering medication during follow-up (%) | 82.5 | 69.1
ESKD (%) | 46.3 | 0

Data are reported as range or mean±SD.

[a] N, number of subjects.

[b] Duration until 2017.

Study Participants

The discovery cohort consisted of sibling pairs and small families, whereas the replication cohort consisted of unrelated individuals, all having T1D. Renal status was based on the albumin excretion rate (AER) in a 24-hour urine collection or the albumin-to-creatinine ratio (ACR) in a random spot-urine collection. ESKD was defined as undergoing dialysis or having received a kidney transplant. DN was defined by (1) persistent macroalbuminuria (AER≥300 mg/24 h or ACR>30 mg/mmol) in two of three consecutive measurements or the presence of ESKD; and (2) the absence of clinical or laboratory evidence of nondiabetic renal or urinary tract disease. Control status was defined by normoalbuminuria (AER<30 mg/24 h or ACR<3 mg/mmol) despite duration of diabetes for at least 15 years (range, 15–37 years). In the discovery cohort, all study participants had been diagnosed with T1D for at least 15 years, with the age at onset <30 years; in the replication cohort, age at diabetes onset was ≤40 years, with insulin dependence within 1 year after the diagnosis of diabetes (or age at diabetes onset ≤15 years). Controls in the replication cohort had a minimum diabetes duration of 15 years. The replication cohort included 2187 controls with normal AER and 1344 cases with macroalbuminuria or ESKD.
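The case and control definitions above can be written as a small decision rule. The following Python sketch is illustrative only and is not the authors' code: the function names, the representation of each urine measurement as an (AER, ACR) tuple, the reading of "two of three consecutive measurements" as any window of three consecutive entries, and the requirement that every supplied measurement be normoalbuminuric for controls are all assumptions. The exclusion of nondiabetic renal disease is not modeled.

```python
def is_macro(aer=None, acr=None):
    """Macroalbuminuria: AER >= 300 mg/24 h or ACR > 30 mg/mmol."""
    return (aer is not None and aer >= 300) or (acr is not None and acr > 30)


def is_normo(aer=None, acr=None):
    """Normoalbuminuria: AER < 30 mg/24 h or ACR < 3 mg/mmol."""
    return (aer is not None and aer < 30) or (acr is not None and acr < 3)


def classify(measurements, eskd=False, diabetes_years=0.0):
    """Classify one participant as 'case', 'control', or 'unclassified'.

    measurements: chronological list of (aer, acr) tuples; either entry
    of a tuple may be None when that assay was not performed.
    """
    if eskd:
        return "case"
    # Assumption: "two of three consecutive measurements" is checked on
    # every window of three consecutive entries.
    n_windows = max(1, len(measurements) - 2)
    for start in range(n_windows):
        window = measurements[start:start + 3]
        if sum(is_macro(*m) for m in window) >= 2:
            return "case"
    # Assumption: controls need every supplied measurement to be normal.
    all_normal = bool(measurements) and all(is_normo(*m) for m in measurements)
    if all_normal and diabetes_years >= 15:
        return "control"
    return "unclassified"
```

For example, `classify([(350, None), (20, None), (310, None)])` is labeled a case under these assumptions, whereas a participant with only normoalbuminuric measurements and fewer than 15 years of diabetes remains unclassified.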

Ethical Permits

All patients with diabetes gave written, informed consent to participate in the study and the Ethical Committee of the Finnish National Public Institute, the Ethical Committee of the Helsinki and Uusimaa Health District, and Karolinska Institutet approved the protocol for the study. The transgene manipulation in zebrafish was approved by the local ethical committee (the North Stockholm District Court).

Whole-Genome Sequencing

Whole-genome sequencing (WGS) was carried out on the discovery cohort using both Illumina HiSeq 2000 and Complete Genomics platforms. To evaluate the quality of the two different sequencing methods, we sequenced four discordant sibling pairs with both platforms and compared the difference of the called variants across different platforms. The methods used for sequence alignment, quality control, variant calling, and single-nucleotide variant (SNV) annotation can be found in Supplemental Appendix 1.

Bioinformatics Approaches for WGS Analysis

To fully utilize WGS data, we performed the association analysis with DN at three levels (Figure 1B): (1) genome-level analyses to study hotspots of mutations and SNVs affecting regulatory regions; (2) gene-level aggregation tests to identify genes with DN-predisposing (or protecting) variants; and (3) SNV-level tests focusing on the protein-altering variants (PAVs) present only in cases or only in controls, and therefore potentially causal or protective for DN. Each level of analysis uses different criteria for statistical significance; a brief summary of the statistical models and criteria used in each analysis is reported in Table 2. A global snapshot of all DN-associated variants that were replicated in the FinnDiane cohort is provided in Figure 2.

Table 2.

Summary of test results from the genomic, gene, and single-variant levels of data analysis on 76 discordant sibling pairs

Test Name | Test Model | Multi-test Correction and Threshold (Discovery) | Functional Annotation | Results | Replicated in FinnDiane | Result Report

Genome level
RMRs | Hotspot clustering, negative binomial distribution [a] | Bonferroni P<3.7×10−5 | N.A. | 850,137 RMR | N.A. (only DN-RMR are replicated) | Figure 3, Supplemental Figure 1
DN-RMR | FET | FDR<0.05 | Genome location, pathway over-representation (KEGG), protein-protein interaction | 732 DN-RMR and the pathways involved | Bonferroni P<0.01; 141 DN-RMR replicated | Figure 3, Supplemental Table 4
Promoters, enhancer, TFBS | FET | FDR<0.05 | Functional enrichment test, protein-protein interaction | 270 promoters, 44 enhancers, 40 TFBS | Bonferroni P<0.01; 68 promoters, 5 enhancers, 6 TFBS replicated | Figure 3, Supplemental Tables 5–7

Gene level
F-SKAT (76 pairs plus three multisibling families) | F-SKAT on DN-associated SNVs (OR>1.5; P<0.05) | N.A.; nominal P<0.01 | Functional enrichment test, protein-protein interaction | 206 F-SKAT significant genes | 9 genes using strict replication approach | Figure 4, Supplemental Tables 9–11
F-SKAT on rare SNVs (MAF<0.05; MAF<0.01): N.A.; Supplemental Table 8

Single-variant level
Single-variant association test | OR in dominant and recessive model | N.A.; case-only or control-only, PAV or ncRNA exonic [b], [c] | SNV location, SIFT, Polyphen2 | 3562 PAVs, 3259 variants in ncRNA exonic | OR>1.5 and P<0.05; 47 recessive PAVs replicated, 86 recessive ncRNA variants replicated | Table 3, Supplemental Tables 12 and 13

N.A., not available; ncRNA, noncoding RNA.

[a] Variant clustering method proposed by Weinhold et al.25

[b] Case-only or control-only: ≥3 heterozygous individuals in only cases/controls in the dominant model; ≥1 homozygous individual in only cases/controls in the recessive model.

[c] PAV, i.e., nonsynonymous, stop-gain, and stop-loss.

Figure 2.

Analysis of whole-genome sequencing data from Finnish T1D sibling pairs reveals DNA variants and regions that are associated with DN. The Circos plot consists of multiple layers, each of which represents a bioinformatic analysis approach and its significant outcomes in the discovery cohort. The layers are ordered from the outside to the center, with the cytoband as a genome location reference. In the first layer, DN-associated PAVs that are replicated in FinnDiane are highlighted: PAVs that are highly enriched in cases are marked in red and those highly enriched in controls are marked in green. In the second layer, genes with a highly enriched cluster of DN-associated variants that has been prioritized by F-SKAT are depicted in the orange circle, and those passing the stringent replication are marked by their names. From the third layer inward, regions of recurrent mutations that are associated with case or control (DN-RMR) are shown in the light green circle, followed by promoters (±500 bp from a promoter annotated CAGE cluster according to FANTOM5) in light blue, enhancers (±500 bp from an enhancer annotated CAGE cluster according to FANTOM5) in light purple, and TFBS in light red. The details of the statistical models and the call of significance of association for each approach are listed in Table 2.

Association Test for SNVs

For each SNV, we tested the association with DN using four genetic models: (1) case-dominant, (2) case-recessive, (3) control-dominant, and (4) control-recessive.21 To this end, we used the Firth bias-reduced, penalized-likelihood logistic regression method, which accounts for rare variants and reduces bias in small-sample analyses,22,23 to calculate the odds ratio (OR) and the significance of the association (P value) in the discovery, replication, and combined cohorts. The method was implemented with the R package logistf.24 The association test results were used to select SNVs for the gene-level and SNV-level tests; the selection criteria differ between the two (see details below).
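To illustrate why a bias-reduced method matters for variants seen in only one group, the following Python sketch fits a Firth-penalized logistic regression with an intercept and a single binary genotype indicator. It is a minimal illustration under stated assumptions and not the logistf implementation used in the study: the function name `firth_logit` is hypothetical, no additional covariates are modeled, the carrier count in the usage example is invented, and the P value calculation is omitted.

```python
import math


def firth_logit(x, y, max_iter=200, tol=1e-10):
    """Firth-penalized logistic regression of y on [1, x].

    x: list of 0/1 genotype indicators (e.g. carrier under a dominant model)
    y: list of 0/1 phenotypes (1 = case)
    Returns (intercept, slope); exp(slope) is the bias-reduced odds ratio.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for the two-column design [1, x].
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        d = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * d - b * b
        i00, i01, i11 = d / det, -b / det, a / det  # inverse information
        # Diagonal of the hat matrix: h_i = w_i * [1, x_i] I^-1 [1, x_i]'.
        h = [wi * (i00 + 2.0 * i01 * xi + i11 * xi * xi)
             for wi, xi in zip(w, x)]
        # Firth-modified score contributions.
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        s0 = i00 * u0 + i01 * u1
        s1 = i01 * u0 + i11 * u1
        b0 += s0
        b1 += s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1


# Hypothetical example: 3 carriers among 80 cases, none among 81 controls.
x = [1] * 3 + [0] * 77 + [0] * 81
y = [1] * 3 + [1] * 77 + [0] * 81
intercept, slope = firth_logit(x, y)
odds_ratio = math.exp(slope)
```

With ordinary maximum likelihood the odds ratio for such a case-only variant would be infinite; the penalized estimate stays finite, and for this single-indicator model it coincides with the odds ratio obtained after adding 0.5 to each cell of the 2×2 table.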

Genome-Level Analysis

To identify genomic regions with frequent variants associated with DN in the 76 discordant sibling pairs, we set out to (1) identify regions that are significantly recurrently mutated (recurrently mutated regions [RMRs]) compared with the distribution of mutations across the genome, and (2) test each region for significant over-representation of mutations in DN cases or controls. For (1), the identification of RMR was carried out following the method proposed by Weinhold et al.25 Briefly, all mutations located within 50 bp of each other were merged using BEDTools26 into hotspot clusters, and this procedure was repeated until no cluster was found within 50 bp of another cluster. The optimal cluster size was determined empirically given the observed distribution of mutations and their distance in the genome (data not shown). Clusters with fewer than three mutations were removed. For each cluster, a P value was calculated using the negative binomial distribution, taking into account the length of the candidate hotspot region, the number of mutations in the cluster, and the background mutation rate (average mutation rate per sample) for the cluster that was estimated using the genome-wide expectation. The candidate hotspot regions were selected for further analyses on the basis of their P value for significance and using a stringent Bonferroni correction for the number of regions tested (Supplemental Figure 1). To identify recurrently mutated regions associated with DN (DN-RMR), for each region we counted the number of mutations found in DN cases or controls and carried out a Fisher exact test (FET) to assess whether mutations were over-represented in either cases or controls. The Benjamini–Hochberg false discovery rate (FDR) correction to account for the number of regions tested by FET was applied to identify DN-RMR at the genome-wide level.
For details of the analyses performed on transcription factor binding sites (TFBS), promoters, and enhancers, please see Supplemental Appendix 1.
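The clustering, case-versus-control testing, and FDR steps described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the pipeline used in the study: it does not call BEDTools, it omits the negative binomial significance test of Weinhold et al., all function names are hypothetical, and the layout of the 2×2 table passed to the Fisher exact test is left to the caller because the text does not specify it.

```python
from math import comb


def merge_hotspots(positions, max_gap=50, min_mutations=3):
    """Merge mutation positions on one chromosome into hotspot clusters.

    Positions no more than max_gap bp apart are chained together, so no two
    resulting clusters lie within max_gap of each other. Clusters with fewer
    than min_mutations entries are dropped.
    Returns a list of (start, end, n_mutations) tuples.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][1] <= max_gap:
            clusters[-1][1] = pos      # extend the current cluster
            clusters[-1][2] += 1
        else:
            clusters.append([pos, pos, 1])
    return [tuple(c) for c in clusters if c[2] >= min_mutations]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(k):
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):       # walk from largest to smallest P
        i = order[rank - 1]
        running = min(running, pvalues[i] * m / rank)
        adjusted[i] = running
    return adjusted


# Hypothetical positions (bp) on a single chromosome.
regions = merge_hotspots([100, 120, 160, 400, 430, 1000])
```

In this toy input only the first three positions form a cluster of at least three mutations; the pair at 400 and 430 bp and the singleton at 1000 bp are discarded. Regions would then be tested one at a time with `fisher_exact_two_sided`, and the resulting P values passed to `benjamini_hochberg` before applying the FDR<0.05 threshold.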

Gene-Level Analysis

We applied the adjusted sequence kernel association test for familial data of dichotomous traits (F-SKAT27) on the multisibling cohort (n=161). SNVs within a gene region were clustered together for the analysis. The gene region included variants in upstream 1000 bp, downstream 1000 bp, untranslated regions (UTR) at the 3′ side and 5′ side, intron, and exon. Only the PAVs in the exonic region were included, i.e., nonsynonymous, stop-gain, stop-loss, and splicing site variants in RefSeq. We performed the gene-level aggregation test on three different sets of variants: (1) SNVs nominally associated with a case-control phenotype in the discovery cohort (OR>1.5 and nominal P<0.05, Firth test), irrespective of their minor allele frequency (MAF); (2) all SNVs with MAF<0.01, irrespective of their association with DN in discovery cohort; and (3) all SNVs with MAF<0.05, irrespective of their association with DN in discovery cohort. Genes that reached significance in the F-SKAT analysis (nominal F-SKAT P<0.01) have been annotated for functional enrichment test using Enrichr.28

SNV-Level Analysis

To select significant SNVs for replication, we focused on the SNVs that are PAVs (missense, nonsense, stop-loss, splicing site) or located in an exonic region in noncoding RNAs. SNVs present only in cases or only in controls under the dominant (three or more individuals) or recessive (one or more individuals) model were selected for replication in the FinnDiane cohort. For analysis of replication, an association test, as described above, was carried out in the discovery (T1D discordant sibling pairs, n=152), replication (FinnDiane, n=3531), and combined cohorts (discovery plus replication, n=3683). Methods for power calculation for SNV association tests are described in Supplemental Appendix 1.
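The case-only and control-only selection rule can be expressed as a simple filter on genotype counts. The Python sketch below is one possible reading and not the authors' code: the function name is hypothetical, and it assumes that "only in cases" means no carrier of any genotype in controls for the dominant model and no homozygous carrier in controls for the recessive model (and vice versa for control-only variants).

```python
def exclusive_models(case_het, case_hom, ctrl_het, ctrl_hom):
    """Return the genetic models under which an SNV qualifies for replication.

    Arguments are counts of heterozygous and homozygous carriers of the
    variant allele among cases and controls in the discovery cohort.
    """
    labels = []
    # Dominant: at least three heterozygous carriers in one group only.
    if case_het >= 3 and ctrl_het + ctrl_hom == 0:
        labels.append("case-dominant")
    if ctrl_het >= 3 and case_het + case_hom == 0:
        labels.append("control-dominant")
    # Recessive: at least one homozygous carrier in one group only.
    if case_hom >= 1 and ctrl_hom == 0:
        labels.append("case-recessive")
    if ctrl_hom >= 1 and case_hom == 0:
        labels.append("control-recessive")
    return labels
```

Under these assumptions, an SNV with four heterozygous cases and no carriers among controls is selected as case-dominant, whereas an SNV with a single heterozygous control is not selected.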

Analysis of Replication Cohort

Genome-wide genotyping was performed on the Illumina HumanCoreExome Bead arrays 12–1.0, 12–1.1, and 24–1.0. The arrays include a core set of genome-wide variants plus an extensive set of exome variants. Data processing and quality control methods have been described earlier.29 The genotype data were imputed with Minimac3 using the 1000 Genomes reference panel (phase 3, version 5). SNVs with poor quality (R2<0.3) were removed from analysis. Samples overlapping with the discovery cohort were excluded. Candidate SNVs were extracted from the GWAS imputation data and the number of genotypes was counted for controls and cases on the basis of the most likely genotypes using SNPTest.

To evaluate the false positive rate of replication at the SNV level, we performed an empirical test in three steps: (1) select a random set of SNVs from discovery PAVs; (2) test association on this random set, and count the number of significant variants (OR>1.5; P<0.05); and (3) repeat the steps 10,000 times to assess the false positive rate. For gene-level replication, we could not apply F-SKAT to FinnDiane data because that replication cohort does not contain familial data. Instead, we used the sequence kernel association test30 on the same SNV set (if found in FinnDiane) as we used for F-SKAT in the discovery cohort. For replication in the genome level DN-RMR test, we extracted all SNVs within the RMR regions defined by tests of the discovery cohort, and tested enrichment of variants in cases or controls for each region using two-tailed FET, then corrected by Bonferroni P<0.01.
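The three-step empirical test for SNV-level replication can be outlined as a resampling loop. The following Python sketch is a simplified illustration and not the authors' procedure: it assumes that each discovery PAV already carries a precomputed flag indicating whether it meets the replication criteria (OR>1.5; P<0.05), that each random set has the same size as the candidate set, and that the false positive rate is summarized as the fraction of random sets with at least as many significant variants as observed. The function and parameter names are hypothetical.

```python
import random


def empirical_false_positive_rate(flags, set_size, observed,
                                  n_iter=10_000, seed=0):
    """Estimate how often a random SNV set replicates as well as observed.

    flags: list of booleans, one per discovery PAV, True when the variant
           meets the replication criteria in the replication cohort
    set_size: number of SNVs drawn (without replacement) in each iteration
    observed: number of significant variants among the real candidates
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(flags, set_size)
        if sum(sample) >= observed:
            hits += 1
    return hits / n_iter
```

A small returned value indicates that the observed number of replicated variants is unlikely to arise from a randomly chosen set of the same size.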

Results

Variants Detected by WGS

We evaluated the sequencing quality by sequencing four sibling pairs with both Complete Genomics and Illumina HiSeq 2000 platforms. The concordance rate across the two platforms for all eight individuals was 98.8% (Supplemental Table 1, A–C).

WGS of the discovery cohort revealed 12 million SNVs (Supplemental Table 2) and >6 million short insertions and deletions (Supplemental Table 3). Here, we focused on genetic variants functionally associated with the DN phenotype, i.e., variants affecting gene regulatory elements and/or coding regions.

Genome-Level Analysis

We analyzed the complete genome sequences of the 76 T1D discordant sibling pairs to systematically identify genomic regions that are recurrently mutated and over-represented in the DN cases or controls (i.e., individuals with T1D but without DN). A similar approach has been used in studies of genome-wide noncoding regulatory mutations in cancer,25 and is based on (1) genome-wide “hotspot” mutation analysis to identify small regions with frequent (recurrent) mutations and analysis of clusters of recurrent mutations, (2) analysis of DNA variants affecting TFBS, and (3) analysis of annotated regulatory regions (e.g., promoters and enhancers).

For the genome-wide hotspot mutation analysis, we identified a total of 850,137 RMRs. Each RMR represents a genomic locus enclosing a cluster of variants within 50 bp of each other, and genome-wide RMRs have a median size of 436 (4–37,433) bp. Each identified RMR is significantly recurrently mutated compared with a random distribution of mutations across the genome (Bonferroni-corrected P<3.7×10−5; Supplemental Figure 1).

We first tested whether these RMRs are significantly over-represented in DN cases or in controls. After correcting for the number of total RMRs analyzed, we detected 732 RMRs that are over-represented in either DN cases or in controls at FDR<5%, thereby identifying a set of DN-RMRs (Figure 3A, Supplemental Table 4). A total of 141 of these DN-RMRs (19.26%) were replicated in the FinnDiane cohort (Bonferroni P<0.01). We found that 458 (63%) DN-RMRs are intergenic, whereas 274 (37%) overlap with 194 annotated genes. When compared with the whole set of RMRs identified at the genome-wide level in the discovery cohort, the DN-RMRs more frequently overlap with exons, introns, 3′ UTRs, 5′ UTRs, enhancers, and gene promoter regions (Figure 3A). This suggests that DN-associated clusters of mutations are more likely to affect exons and regulatory regions than the RMRs that are not associated with DN. The genes overlapping with DN-RMRs are significantly enriched for several canonical KEGG pathways relevant to the pathobiology of DN (P<0.01), including ECM-receptor interaction, focal adhesion, and T1D (Figure 3B). These pathways have several genes in common, suggesting that the identified DN-RMRs affect multiple genes interacting across overlapping functional pathways (Figure 3B). Interestingly, COL4A1 and COL4A2, which encode the most prominent non-GBM collagens, were shown to be associated with DN, as previously reported.31,32 Both genes were enriched for variants in intronic regulatory regions, but their possible role in the pathogenesis of DN remains obscure, especially as no exonic mutations in these genes differed between cases and controls in the discovery cohort.

Figure 3.

Genome-wide analysis of variants reveals regions and TFBSs associated with DN in the discovery cohort. (A) Annotation of RMRs with respect to overlapping gene regulatory elements; relative frequencies have been calculated with respect to each group: all RMRs (white) and DN-RMR (red). (B) Significantly over-represented KEGG pathways comprise common genes overlapping with DN-RMR. The relationships between genes overlapping with DN-RMR and KEGG pathways are depicted as a network graph, wherein the outer circle comprises genes and the inner circle comprises the pathways. (C) Schematic representation of genome-wide analysis of variants occurring in TFBSs that were derived from 668 chromatin immunoprecipitation sequencing data sets (see Methods). (D) We identified 40 TFs with significantly different variant frequencies in the TFBS between cases and controls; these TFs were significantly enriched for pathways relevant to the pathophysiology of DN. For the top ten enriched KEGG pathways, the known relationships (edges) between TFs (inner circle) and the KEGG pathways (outer circle) are depicted as a network graph. (E) We found an enrichment of variants in cases in the promoter and enhancer regions (±1 kb) of the ALOX5 gene locus. Enhancers and promoter regions were retrieved from FANTOM5 and crosschecked with chromHMM, whereas other gene annotations were obtained from RefSeq (see Methods).

As the second genome-level approach, to investigate the potential regulatory effect of DN-associated variants, we retrieved and annotated experimentally derived TFBS data from a large repository of chromatin immunoprecipitation sequencing data representing DNA binding data for 237 transcription factors (TFs).33 Within each TFBS region, we tested whether there was a significant over-representation of variants in DN-ascertained cases or in controls (Figure 3C). Overall, we found more variants affecting TFBS in controls than in cases, and in some instances these variants are present only in controls and across multiple families. By pooling results for TFs over their corresponding TFBSs, we identified 40 TFs with significantly different variant frequencies between cases and controls (Benjamini–Hochberg corrected P<0.05) and six (out of 20 TFs for which genotype data were available in the replication cohort) were replicated in FinnDiane (Bonferroni P<0.01) (Supplemental Table 5). The 40 TFs were enriched for pathways relevant to the pathophysiology of DN (Figure 3D). These include the EGF receptor-dependent endothelin signaling (implicated in the development and progression of renal fibrosis and hypertrophy of the glomerular basement membrane), which has been proposed for targeting by endothelin antagonist therapy in DN.34 We also found the structurally related transmembrane receptors belonging to the receptor tyrosine kinase superfamily (e.g., ErbB1) that are involved in the development and progression of DN.35 Of note, variants in ERBB4 have previously been suggested to be associated with DN,18,36 although the causal variants were not identified.

The third genome-level analysis approach was to study annotated regulatory regions in the genome (gene promoters and enhancers) that are derived from the FANTOM5 database37 and were further supported by ENCODE38 histone modification data, and to test whether variants in these regions were significantly over-represented in DN cases or controls. We found significant enrichment (FDR<0.05) for DN-associated variants in 270 promoter regions (±1 kb around the annotated gene transcription start site), of which 68 (25.2%) were replicated in the FinnDiane cohort (Bonferroni P<0.01) (Supplemental Table 6A). We also found significant enrichment (FDR<0.05) for DN-associated variants within ±1 kb of 44 predicted enhancers (Supplemental Table 6B). DN-associated variants in five enhancers were replicated in the larger FinnDiane cohort (Bonferroni P<0.01). We further prioritized candidate genes within these replicated enhancers using data related to topologically associated domains, epigenetic regulation, and transcriptome analysis of DN in humans39 (Supplemental Table 7).

Not surprisingly, in a few cases distinct genome-level analyses prioritized the same gene locus. For instance, ALOX5, encoding arachidonate 5-lipoxygenase (a member of the lipoxygenase gene family regulating metabolites of arachidonic acid), was found to overlap with an intragenic DN-RMR spanning 4724 bp and has DN-associated variants in two predicted enhancers and in its annotated promoter region, suggesting potential enhancer–promoter interaction40 (Figure 3E). A role for lipoxygenase inhibitors in DN has been proposed in the rat,41 and 12-lipoxygenase is increased in glucose-stimulated cultured mesangial cells and in the kidney of a rat DN model.42 Furthermore, it has been shown that 5-lipoxygenase contributes to degeneration of retinal capillaries in a mouse model of diabetic retinopathy, suggesting a proinflammatory role of 5-lipoxygenase in the pathogenesis of DN.43

Gene-Level Analysis

To investigate the aggregated gene-level contribution of multiple SNVs, we used the F-SKAT framework.27 We tested different sets of SNVs that were aggregated at the gene level (see Methods). When testing the rare variants, we found only a few genes that reached the nominal significance level of P<0.05 (Supplemental Table 8), and found no associations with any relevant functional pathways or networks. Alternatively, we first identified 28,237 SNVs (within 3745 genes) that were nominally associated with DN susceptibility or protection (OR>1.5; P<0.05). Then we gathered all DN-associated SNVs that were within upstream 1000 bp, downstream 1000 bp, 3′ UTR and 5′ UTR regions, intron, and PAVs, and tested their cumulative effect on each gene. We found 206 genes that reached a significance level of P<0.01 in the F-SKAT analysis (Supplemental Table 9, A and B).

To investigate the potential function of the SNVs in the 206 genes detected by F-SKAT, we analyzed these SNVs using a recent expression quantitative trait locus44 data set from the glomerulus and tubulointerstitium of patients with nephrotic syndrome. We found that these F-SKAT significant genes are more likely to be under cis-acting regulation in the glomeruli of patients with nephrotic syndrome than genes with nonsignificant F-SKAT (OR=3.84; 95% confidence interval, 2.83 to 5.21; P=2.2×10−16). This suggests that the SNVs contributing to the gene-level association with DN (detected by F-SKAT) may exert their pathologic function by regulating gene expression in the kidney. We then used Enrichr28 to test for functional enrichment in the 206 genes identified by F-SKAT, and observed significant enrichment only for protein–protein interactions in the podocyte network expanded by STRING (XPodNet45) (22 out of 808 genes, enrichment P=0.005; Wikipathways46). The F-SKAT–associated genes within the core XPodNet are shown in Figure 4A and Supplemental Table 10. The genes in this subnetwork of XPodNet are enriched for several pathways, including focal adhesion and insulin signaling (Figure 4B, Supplemental Table 11). The top candidate gene from the F-SKAT test is the protein kinase C ɛ gene (PRKCE) (F-SKAT P<0.001), with multiple intronic DN-associated SNVs that overlap with predicted regulatory regions (Figure 4C, Supplemental Table 9B). Protein kinases PRKCE, PTK2 (F-SKAT P=0.004), and PRKCI (F-SKAT P=0.009) are part of a “core protein-interaction network” representing proteins essential for podocyte function. These genes are particularly interesting as PKCs have been implicated in the pathogenesis of DN.10 However, to our knowledge, specific inhibitors for these three kinases have not yet been developed.

Figure 4.

F-SKAT gene-level analysis identifies genes associated with DN in the core podocyte network. (A) Graphical representation of the core podocyte network that includes the genes associated with DN by F-SKAT analysis in the discovery cohort. Node color indicates the statistical significance (P value) of the F-SKAT test. White nodes indicate podocyte network genes not detected in this study. (B) The F-SKAT–associated genes within the podocyte network are enriched (adjusted P<0.05) for several pathways; the top six pathways and contributing genes are reported. Full functional enrichment results are reported in Supplemental Table 11. (C) Details on the PRKCE gene, which showed the strongest association with DN (by F-SKAT), and the location of the intronic SNVs associated with DN. For each SNV, the association with DN is reported as the OR tested in either a recessive or a dominant model. Full statistics and regulatory information on the SNVs are reported in Supplemental Table 9B.

Furthermore, we tested in the replication cohort the 206 genes that were found to be significant using F-SKAT. This replication is limited because fewer SNVs are available in FinnDiane than in the discovery cohort (2316 out of 3755 SNVs) and because FinnDiane does not include family data. Therefore, we applied SKAT using only the same SNVs used by F-SKAT in the discovery cohort. This is a rather stringent replication approach, as it tests for both the genes and the specific SNVs that were found to be associated with DN in the discovery cohort. Out of the 206 genes tested, only 120 genes had at least one F-SKAT SNV available, and nine genes passed the nominal criterion of P<0.05, including the protein kinase gene PTK2 (Supplemental Table 9C). The replicated genes are highlighted in Figure 2.
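The restriction to shared SNVs described above amounts to a per-gene set intersection. The following is a minimal sketch with invented identifiers; the function name and the data layout are assumptions for illustration, and the SKAT test applied afterwards is not shown.

```python
def replicable_genes(gene_to_snvs, available_snvs):
    """Keep, for each gene, only the discovery SNVs that are present in the
    replication data, and drop genes that are left with none."""
    kept = {}
    for gene, snvs in gene_to_snvs.items():
        shared = sorted(set(snvs) & available_snvs)
        if shared:
            kept[gene] = shared
    return kept


# Toy identifiers, purely illustrative (not data from the study).
discovery = {
    "GENE_A": ["snv1", "snv2", "snv3"],
    "GENE_B": ["snv4"],
    "GENE_C": ["snv5", "snv6"],
}
replication_panel = {"snv1", "snv3", "snv6", "snv9"}
testable = replicable_genes(discovery, replication_panel)
print(testable)

# Proportions implied by the counts quoted in the text.
snv_coverage = 2316 / 3755
gene_coverage = 120 / 206
print(f"SNVs available: {snv_coverage:.1%}; genes testable: {gene_coverage:.1%}")
```

With the counts quoted in the text, roughly 62% of the F-SKAT SNVs and 58% of the genes could be carried into the replication test.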

Analyses of PAVs

It has been estimated that about 85% of mutations underlying Mendelian diseases reside in coding sequences or at exon-intron borders.47,48 Numerous reports have described rare but highly penetrant exon mutations in Mendelian disease,49,50 and it is likely that such mutations also frequently contribute to complex disease phenotypes. Our initial exon variant analyses focused on 53,449 PAVs (nonsynonymous, stop-gain, stop-loss, and splice site variants; Supplemental Table 2) that were found exclusively in cases or exclusively in controls in the 76 T1D discordant sibling pairs and are associated with DN susceptibility or DN protection. The PAVs were tested for association with DN in the FinnDiane cohort using a recessive disease model for the homozygous variants detected in one or more cases/controls in the discovery cohort, and a dominant model for the heterozygous SNVs detected in three or more cases/controls in the discovery cohort. In the recessive model, 47 PAVs were replicated in FinnDiane (P<0.05; OR>1.5). Using a permutation-based strategy (see Methods), we estimated that the probability of these 47 PAVs being replicated by chance alone is only 2.3%. However, the false positive rate in the dominant model is estimated to be high (Supplemental Figure 2). Therefore, only candidate SNVs that were replicated in the recessive model are reported (top SNVs in Table 3, and in full in Supplemental Tables 12 and 13). Some of the top-replicated PAVs are within genes that have previously been linked to renal disease, implying a potential role in DN; for example, mutations in WDR73 have been reported to be responsible for late-onset steroid-resistant nephrotic syndrome.51 We also studied the function of ABTB1, in which we found the only case-only homozygous mutation that truncates the protein. Zebrafish knockout of the gene displayed a phenotype that is specific for kidney damage (Supplemental Appendix 1, Supplemental Figure 3).
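The difference between the recessive and dominant models comes down to how genotypes are collapsed into carrier status before a 2×2 comparison. The sketch below is a generic illustration under stated assumptions: it uses a Woolf-type 95% confidence interval with a Haldane-Anscombe correction (0.5 added to every cell when any cell is zero), which is not necessarily the test used in the study (see Methods), and the genotype vectors are invented.

```python
from math import exp, log, sqrt


def is_carrier(alt_allele_count, model):
    """Recessive: two copies of the variant allele; dominant: at least one copy."""
    if model == "recessive":
        return alt_allele_count == 2
    if model == "dominant":
        return alt_allele_count >= 1
    raise ValueError(f"unknown model: {model}")


def odds_ratio_ci(case_genotypes, control_genotypes, model):
    """Return (OR, lower, upper) with a Woolf 95% CI.

    Adds 0.5 to every cell only when a cell is zero (Haldane-Anscombe correction).
    """
    a = sum(is_carrier(g, model) for g in case_genotypes)
    b = len(case_genotypes) - a
    c = sum(is_carrier(g, model) for g in control_genotypes)
    d = len(control_genotypes) - c
    cells = [a, b, c, d]
    if 0 in cells:
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - 1.96 * se), exp(log(estimate) + 1.96 * se)


# Toy genotypes coded as variant-allele counts (0, 1, or 2); not study data.
cases = [2, 2, 1, 0, 2, 1, 0, 0]
controls = [0, 1, 0, 2, 0, 1, 0, 0]
recessive = odds_ratio_ci(cases, controls, "recessive")
dominant = odds_ratio_ci(cases, controls, "dominant")
print("recessive OR (95% CI):", recessive)
print("dominant OR (95% CI):", dominant)
```

With very small carrier counts, as for the case-only and control-only variants considered here, such intervals become wide, which is consistent with the broad confidence intervals seen for some entries in Table 3.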

Table 3.

Top protein-altering variants replicated with criteria P<0.05 and OR>1.5 in the FinnDiane cohort

| Gene Symbol | Gene Description | dbSNP ID | MAF, 1000 Genomes | MAF, ExAC (All\|Finns) | SIFT \| PP2 | AA Change | Discovery, Case\|Control (a) | Replication (n=3531), OR (95% CI) | Replication, P Value | Combined (n=3683), OR (95% CI) | Combined, P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| WDR73 | WD repeat domain 73 | rs72750868 | 0.044 | 0.076 | T\|B | D->G | 2\|0 | 2.52 (1.36-4.77) | 0.002 | 2.64 (1.44-4.96) | 0.001 |
| TPPP2 | Tubulin polymerization promoting protein family member 2 | rs9624 | 0.160 | 0.148 | D\|D | R->L | 1\|0 | 3.28 (1.4-8.32) | 0.003 | 3.4 (1.46-8.55) | 0.002 |
| UBR7 | Ubiquitin protein ligase E3 component n-recognin 7 | rs2286653 | 0.113 | 0.147 | T\|B | A->T | 1\|0 | 3.55 (1.25-11.41) | 0.008 | 3.74 (1.35-11.92) | 0.005 |
| ATP10D | ATPase phospholipid transporting 10D | rs34208443 | 0.077 | 0.141 | T\|B | P->T | 1\|0 | 1.65 (1.11-2.46) | 0.009 | 1.65 (1.11-2.44) | 0.009 |
| ANO9 | Anoctamin 9 | rs114405390 | 0.015 | 0.027 | T\|B | T->A | 1\|0 | 4.09 (1.18-17.89) | 0.01 | 4.41 (1.3-19.01) | 0.007 |
| SIGIRR | Single Ig and TIR domain containing | rs117739035 | 0.016 | 0.029 | D\|D | S->Y | 1\|0 | 3.6 (1.15-13.25) | 0.01 | 3.85 (1.26-13.97) | 0.008 |
| SFT2D1 | SFT2 domain containing 1 | rs11551053 | 0.111 | 0.077 | T\|B | I->V | 1\|0 | 3.04 (1.12-9.02) | 0.02 | 3.21 (1.21-9.41) | 0.009 |
| HKR1 | HKR1, GLI-Kruppel zinc finger family member | rs2921563 | 0.098 | 0.054 | T\|D | R->H | 1\|0 | 5.72 (1.09-56.43) | 0.02 | 6.4 (1.28-62.01) | 0.009 |
| KRT32 | Keratin 32 | rs2604956 | 0.046 | 0.071 | T\|D | D->E | 1\|0 | 2.15 (1.07-4.43) | 0.02 | 2.21 (1.1-4.52) | 0.02 |
| C6orf118 | Chromosome 6 open reading frame 118 | rs17852379 | 0.103 | 0.073 | T\|D | G->E | 1\|0 | 2.55 (1.02-6.69) | 0.03 | 2.67 (1.09-6.95) | 0.02 |
| PPP4R1 | Protein phosphatase 4 regulatory subunit 1 | rs329003 | 0.041 | 0.073 | .\|B | I->V | 2\|0 | 3 (1.01-9.9) | 0.03 | 3.47 (1.23-11.17) | 0.009 |
| ANKRD26 | Ankyrin repeat domain 26 | rs12572862 | 0.067 | 0.036 | T\|B | V->L | 1\|0 | 8.16 (0.91-385.51) | 0.03 | 9.59 (1.16-440.85) | 0.01 |
| PKHD1L1 | Polycystic kidney and hepatic disease 1 | rs117037399 | 0.005 | 0.019 | T\|P | G->V | 1\|0 | 8.16 (0.91-385.51) | 0.03 | 9.59 (1.16-440.74) | 0.01 |
| CSMD1 | CUB and Sushi multiple domains 1 | rs34337712 | 0.021 | 0.069 | T\|B | Q->H | 1\|0 | 1.86 (1-3.49) | 0.03 | 1.9 (1.03-3.53) | 0.03 |
| C6orf10 | Chromosome 6 open reading frame 10 | rs7775397 | 0.019 | 0.060 | T\|P | K->Q | 1\|0 | 1.55 (1-2.38) | 0.04 | 1.55 (1.01-2.37) | 0.04 |
| TMEM176A | Transmembrane protein 176A | rs10378 | 0.128 | 0.139 | D\|D | L->F | 0\|1 | 0.38 (0.17-0.78) | 0.004 | 0.37 (0.16-0.74) | 0.002 |
| C4orf51 | Chromosome 4 open reading frame 51 | rs10008599 | 0.077 | 0.098 | D\|B | D->N | 0\|1 | 0.18 (0.02-0.75) | 0.007 | 0.17 (0.02-0.69) | 0.004 |
| SIGMAR1 | Sigma nonopioid intracellular receptor 1 | rs1800866 | 0.217 | 0.184 | T\|B | Q->P | 0\|2 | 0.43 (0.2-0.86) | 0.01 | 0.4 (0.19-0.8) | 0.005 |
| CPTP | Ceramide-1-phosphate transfer protein | rs150672559 | 0.005 | 0.007 | T\|B | R->H | 0\|1 | 0 (0-0.82) | 0.01 | 0 (0-0.71) | 0.008 |
| NEFH | Neurofilament heavy polypeptide | rs5763269 | 0.151 | 0.182 | D\|B | P->L | 0\|1 | 0.49 (0.26-0.9) | 0.01 | 0.47 (0.25-0.86) | 0.009 |
| TNFRSF14 | TNF receptor superfamily member 14 | rs2234167 | 0.114 | 0.130 | T\|B | V->I | 0\|1 | 0.47 (0.22-0.92) | 0.02 | 0.45 (0.22-0.88) | 0.01 |
| TBC1D9 | TBC1 domain family member 9 | rs13118702 | 0.010 | 0.020 | T\|B | E->K | 0\|1 | 0.13 (0-0.91) | 0.02 | 0.12 (0-0.81) | 0.01 |
| UNC93A | Unc-93 homolog A | rs2235197 | 0.110 | 0.109 | .\|. | W->* | 0\|4 | 0.51 (0.27-0.94) | 0.02 | 0.46 (0.24-0.84) | 0.007 |
| TYR | Tyrosinase | rs1042602 | 0.123 | 0.252 | .\|D | S->Y | 0\|5 | 0.63 (0.41-0.96) | 0.03 | 0.58 (0.38-0.88) | 0.007 |
| ATAD3B | ATPase family, AAA domain containing 3B | rs139902189 | 0.078 | 0.076 | D\|P | C->T | 0\|1 | 0.32 (0.08-0.97) | 0.03 | 0.3 (0.08-0.9) | 0.02 |
| TEX101 | Testis expressed 101 | rs35033974 | 0.041 | 0.084 | D\|D | G->T | 0\|3 | 0.51 (0.25-0.98) | 0.03 | 0.47 (0.23-0.88) | 0.01 |
| AVEN | Apoptosis and caspase activation inhibitor | rs61729120 | 0.007 | 0.016 | D\|D | G->T | 0\|2 | 0.23 (0.03-1.01) | 0.03 | 0.2 (0.02-0.84) | 0.01 |
| ZNF844 | Zinc finger protein 844 | rs76842919 | 0.026 | 0.060 | D\|B | A->G | 0\|1 | 0.23 (0.03-1.01) | 0.03 | 0.21 (0.02-0.91) | 0.02 |
| ZNF844 | Zinc finger protein 844 | rs8102258 | 0.119 | 0.095 | T\|B | T->C | 0\|1 | 0.23 (0.03-1.01) | 0.03 | 0.21 (0.02-0.91) | 0.02 |
| OR6X1 | Olfactory receptor family 6 subfamily X member 1 | rs12364099 | 0.077 | 0.122 | D\|B | C->A | 0\|1 | 0.54 (0.28-0.99) | 0.04 | 0.51 (0.27-0.94) | 0.02 |

Case-only or control-only protein-altering SNVs that remain significant (OR>1.5; P<0.05) after replication in the FinnDiane cohort (1344 cases, 2187 controls). Only the top 15 case-only and top 15 control-only protein-altering SNVs detected in the recessive model are listed here (full results are reported in Supplemental Table 12). MAF in the general population is annotated from the 1000 Genomes project and ExAC. dbSNP, single nucleotide polymorphism database; ExAC, the Exome Aggregation Consortium. The potential effect of a variant on the protein is predicted by SIFT (T: Tolerant; D: Deleterious) and Polyphen2 (PP2; B: Benign; P: Possibly damaging; D: Damaging). “.”, data not available; *, stop codon.

a: The number of homozygous carriers of the SNV in the case group and in the control group. The “|” sign is used to separate the two numbers.

Hyperglycemia causes an increase in intracellular reactive oxygen species that leads to an increase in glucose derivatives, such as methylglyoxal, that readily react with the amino groups of protein amino acid residues, particularly arginine, lysine, cysteine, and methionine.52 Here, PAVs altering amino acid codons to arginine were found to be significantly under-represented in the set of mutations detected in controls only, as compared with all PAVs (OR=0.66; 95% confidence interval, 0.43 to 0.97; P=0.03; Supplemental Figure 4). No other class of amino acid substitution showed significant over-representation or depletion in either cases or controls.

Power Calculation

To estimate the statistical power for detecting association in our sibship discovery cohort, we used a method described by Li et al.53 We estimated the power assuming different levels of penetrance (Supplemental Table 14A). Our sample size of 76 discordant sibling pairs reaches >80% power to detect significant associations (P<4.11×10⁻⁹) for rare variants with high penetrance (penetrance, 90%; MAF=0.01). Furthermore, we estimated the power for the replication study. Similar to a previous report,18 our replication cohort (n=3531) reaches at least 80% power to detect common variants with high OR (OR=2, MAF=0.05 in the dominant model; OR=5, MAF=0.2 in the recessive model).
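For orientation, the replication-cohort scenario can be approximated with a textbook two-proportion power calculation. This is a rough sketch under stated assumptions and is not the method used in the study: it takes carrier frequencies in controls from Hardy-Weinberg proportions at the stated MAF, applies a normal approximation, and uses the genome-wide threshold of 5×10⁻⁸ quoted for the replication cohort in Supplemental Table 14B. The sibling-pair calculation of Li et al. is not reproduced.

```python
from statistics import NormalDist


def carrier_frequency(maf, model):
    """Carrier frequency under Hardy-Weinberg proportions (an assumption)."""
    if model == "recessive":
        return maf ** 2
    if model == "dominant":
        return 1 - (1 - maf) ** 2
    raise ValueError(f"unknown model: {model}")


def approximate_power(maf, odds_ratio, model, n_cases, n_controls, alpha):
    """Normal-approximation power of a two-sided test comparing carrier
    frequencies between cases and controls."""
    std_normal = NormalDist()
    p_controls = carrier_frequency(maf, model)
    odds_cases = odds_ratio * p_controls / (1 - p_controls)
    p_cases = odds_cases / (1 + odds_cases)
    se = (
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    ) ** 0.5
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(abs(p_cases - p_controls) / se - z_critical)


# Cohort sizes and thresholds from the text and Supplemental Table 14B.
dominant_power = approximate_power(0.05, 2.0, "dominant", 1344, 2187, 5e-8)
recessive_power = approximate_power(0.2, 5.0, "recessive", 1344, 2187, 5e-8)
print(f"dominant model power: {dominant_power:.2f}")
print(f"recessive model power: {recessive_power:.2f}")
```

Under these simplifying assumptions both scenarios exceed 80% power, in line with the statement above, although the exact values depend on the approximation chosen.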

Discussion

To the best of our knowledge, this is the first study where WGS has been applied in a search for genomic variants specifically associated with the presence or absence of DN in patients with T1D. The challenge with finding susceptibility genes for diabetes complications is that one searches for mutations that only cause complications if the individual has hyperglycemia. We assembled a unique discovery cohort of T1D siblings from the highly homogeneous Finnish population and replicated key findings in a larger cohort of unrelated T1D Finns. This enabled a direct comparison of whole-genome sequences in individuals with extreme phenotypes, i.e., T1D with progressive DN on one hand, and siblings with no complications for at least 15 years (range, 15–37 years) on the other. The results provide a unique catalog of DNA variants in Finns.

We have developed a comprehensive panel of multiple bioinformatic approaches to detect genetic predisposition to DN in the discovery sibling cohort. The SNV approach, which evaluates PAVs that are present only in cases or only in controls, focuses on the potential protein function in DN. The kernel test (F-SKAT) prioritizes genes with multiple associated variants within the gene region, and hypothesizes that the accumulated burden leads to malfunction of the gene. The genomic approach includes variants in other genome regions and could potentially detect functionally important regions. These approaches identified different individual variants, genes, and regulatory regions that are potentially involved in DN susceptibility.

Although the discovery cohort consisted of only 161 individuals with T1D, we show that, together with the FinnDiane replication cohort, it can provide enough power to identify and replicate potential causative and protective mutations for DN. Here, the use of T1D sibling pairs discordant for DN has been pivotal in increasing the power to identify variants associated with DN susceptibility or protection.

We have also studied the replication of candidates and report candidates with robust signals for each analysis approach. However, although replication of SNVs is commonly used in GWAS, where it is applied to the same loci, replication of statistical tests that involve multiple loci (i.e., RMRs, F-SKAT, and TFBS) has limitations that need to be taken into consideration. For replication of F-SKAT in FinnDiane, about one third of F-SKAT SNVs cannot be found by array genotyping plus imputation. Thus, the number of replicable genes is limited (120 out of 206), and within each gene, SNVs are also less well represented. Additionally, the use of a different statistical model (SKAT versus F-SKAT) might also introduce a bias in the replication test. The constraints caused by limited genotypes in the replication cohort also apply to the RMR and TFBS replications. The data-driven detection of RMRs requires comprehensive SNV data (i.e., WGS data). Using a panel of predefined genotyped single-nucleotide polymorphisms (i.e., single-nucleotide polymorphism array data), even if the panel is large and supported by imputation, might introduce a considerable bias in the replication of DN-associated RMRs.

The analyses of the discovery cohort led to the identification of several novel DN candidate genes in Finns, including PRKCE, PTK2, PRKCI, ABTB1, and ALOX5, as discussed above. The significant association of three protein kinase genes with DN is intriguing, as the large PKC protein family has long been associated with diabetes complications.4,10 Several clinical trials have been carried out for the treatment of DN with ruboxistaurin, a compound that inhibits PKC-β.54 This suggests that hyperglycemia-driven PKC activation, particularly that of the β-isoform, may underlie endothelial dysfunction. In this study, we identified two isoforms of the PKC family (i.e., epsilon and iota) that have not previously been linked to DN. The results strongly support and extend previous hypotheses that protein kinases, especially the PKC family, play a role in the pathogenesis of DN, and could be attractive novel targets for the development of PKC inhibitors for DN treatment.

DN is a disorder characterized by hyperglycemia, which can lead to nonenzymatic glycation of amino acids and formation of advanced glycation end products in both intracellular and extracellular proteins.4,9,55 It can be speculated that glycation of amino acids in functionally important regions of a protein can affect its functionality or promote its degradation.3 Amino acids that are most prone to become nonenzymatically glycated by methylglyoxal and other carbonyls are arginine and, to a lesser extent, lysine,56 cysteine, and methionine.4,9 Our study highlighted mutated arginine codons as being of special interest when considering mutations that can cause pathogenic nonenzymatic glycation of proteins and consequent development of DN.

Previously reported genes/regions associated with DN were not strongly replicated in our discovery cohort (Supplemental Table 15), suggesting that different sets of loci/variants contribute to the pathogenesis of DN. However, despite the scarce replication of previous loci in our cohort, we report the identification of variants/genes in functional pathways relevant to the pathobiology of DN, many of which have been previously reported (e.g., EGF receptor-dependent endothelin signaling34 and PodNet45).

Overall, we have performed a comprehensive study on the genetics of a unique T1D Finnish cohort of siblings discordant for nephropathy using WGS data. Although the sample size is relatively small and the association test for SNVs cannot reach genome-wide significance (P<4×10⁻⁹), efforts were made to optimize the test model to fit the specific sibship cohort, and the top-listed SNVs were replicated (when applicable) in a larger Finnish cohort. Novel potential DN susceptibility genes and regulatory variants are put forward in the hope that they merit further investigation in other populations and animal models.

Disclosures

Dr. Groop reports personal fees from AbbVie, personal fees from Astellas, personal fees from Astra Zeneca, personal fees from Boehringer Ingelheim, personal fees from Eli Lilly, personal fees from Elo Water, personal fees from Janssen, personal fees from Medscape, personal fees from MSD, personal fees from Mundipharma, personal fees from Novo Nordisk, and personal fees from Sanofi, outside the submitted work.

Funding

The work was supported by grants to Prof. Tryggvason from the Novo Nordisk Foundation, Knut and Alice Wallenberg Foundation, Söderberg Foundation, Hedlund Foundation, the Swedish Medical Research Council, the Swedish Foundation for Strategic Research, Sigrid Juselius Foundation, Folkhälsan Research Foundation, the Wilhelm and Else Stockmann Foundation, and the Liv och Hälsa Foundation; and grants to Prof. Groop from Helsinki University Central Hospital Research Funds (EVO), Juvenile Diabetes Research Foundation (17-2013-7 [Diabetic Nephropathy Collaborative Research Initiative]), European Foundation for the Study of Diabetes Young Investigator Research Award funds, and the Academy of Finland grants (38387, 46558, 275614, 299200, and 316664). This work has also been supported by Singapore National Medical Research Council grants (NMRC/OFLCG/001/2017 and NMRC/STaR/0010/2012).


Acknowledgments

All genetic association data presented here are made freely accessible via http://dnc.systems-genetics.net.

We are grateful to the physicians, nurses, and researchers in the FinnDiane study group and at each center participating in the collection of patients. We thank Leena Ollitervo and Maire Jarva for technical assistance on DNA extraction. We also acknowledge Dr. Jaakko Tuomilehto for the original collection of samples. The computational analyses were performed on resources provided by Swedish National Infrastructure for Computing through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Project b2013027, and the High-Performance Computing Cluster in the Duke-National University of Singapore Computational Center.

Footnotes

Published online ahead of print. Publication date available at www.jasn.org.

Supplemental Material

This article contains the following supplemental material online at http://jasn.asnjournals.org/lookup/suppl/doi:10.1681/ASN.2019030289/-/DCSupplemental.

Supplemental Appendix 1. Supplemental methods, results, web resources, and references.

Supplemental Table 1. Comparison of DNA sequencing quality using the Illumina and Complete Genomics platforms.

Supplemental Table 2. Annotation of SNVs and small insertions and deletions (indels) identified in 161 genomes in the discovery cohort by RefSeq.

Supplemental Table 3. Frameshift-causing small insertions and deletions (indels) found in DN cases-only or controls-only individuals of the Finnish T1D discordant sibling pair discovery cohort.

Supplemental Table 4. RMRs significantly over-represented in DN cases or controls (FDR<5% in discovery cohort) and replication in FinnDiane cohort.

Supplemental Table 5. TFBS affected by DN mutations.

Supplemental Table 6. Enhancer (S6a) and promoter (S6b) region with mutations over-represented in DN cases or controls (FDR<0.05 in discovery cohort) and replication statistics in FinnDiane cohort.

Supplemental Table 7. Enhancers replicated in FinnDiane cohort and gene prioritization.

Supplemental Table 8. (A) Genes associated with DN by F-SKAT analysis (P<0.01). (B) Details on the DN-associated SNVs used in the F-SKAT analysis.

Supplemental Table 9. (A) F-SKAT test results on rare SNVs with MAF<0.01. Only top genes with P<0.1 are reported. (B) F-SKAT test on SNVs with MAF<0.05. Only top genes with P<0.1 are reported. (C) Replication of F-SKAT significant (P<0.01) genes in FinnDiane cohort.

Supplemental Table 10. PodNet genes detected by F-SKAT (P<0.01).

Supplemental Table 11. Functional enrichment test (KEGG pathways) on the core genes within the XPodNet network in Figure 4.

Supplemental Table 12. Protein-altering SNVs replicated in FinnDiane cohort (combined P-value<0.05; OR>1.5), in each genetic model.

Supplemental Table 13. Noncoding RNA SNVs replicated in FinnDiane cohort (combined P-value<0.05; OR>1.5), in each genetic model.

Supplemental Table 14. (A) Power estimation of discovery cohort (76 discordant sibling pairs) on the whole genome level of significance (12 million, P<4.11×10⁻⁹) of case-only and control-only variants. Power estimation assuming different levels of penetrance. (B) Power estimation of replication cohort (2187 controls and 1344 cases) with genome-wide significance level (P<5×10⁻⁸) with one-stage study design.

Supplemental Table 15. Test of previously reported SNVs in the discovery cohort. SNVs were downloaded from the GWAS catalog.

Supplemental Figure 1. Manhattan plot of the RMRs identified genome-wide in the 76 T1D discordant sibling pairs.

Supplemental Figure 2. Estimation of replication false positive rate on PAVs in FinnDiane cohort.

Supplemental Figure 3. Expression and functional analysis of Abtb1.

Supplemental Figure 4. Forest plots showing that PAVs altering amino acid codons for arginine (Arg) are less represented in the set of mutations detected in controls as compared with all protein-altering mutations (indicated by solid diamond).

Supplemental Figure 5. Chromosome 3q21 locus for DN susceptibility that was previously identified.

References

1. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). UK Prospective Diabetes Study (UKPDS) Group. Lancet 352: 837–853, 1998
2. Braam B, Koomans HA: Renal responses to antagonism of the renin-angiotensin system. Curr Opin Nephrol Hypertens 5: 89–96, 1996
3. Westwood ME, Argirov OK, Abordo EA, Thornalley PJ: Methylglyoxal-modified arginine residues--a signal for receptor-mediated endocytosis and degradation of proteins by monocytic THP-1 cells. Biochim Biophys Acta 1356: 84–94, 1997
4. Brownlee M: Biochemistry and molecular cell biology of diabetic complications. Nature 414: 813–820, 2001
5. Thomas MC, Groop PH, Tryggvason K: Towards understanding the inherited susceptibility for nephropathy in diabetes. Curr Opin Nephrol Hypertens 21: 195–202, 2012
6. Thomas MC, Brownlee M, Susztak K, Sharma K, Jandeleit-Dahm KA, Zoungas S, et al.: Diabetic kidney disease. Nat Rev Dis Primers 1: 15018, 2015
7. Borch-Johnsen K, Nørgaard K, Hommel E, Mathiesen ER, Jensen JS, Deckert T, et al.: Is diabetic nephropathy an inherited complication? Kidney Int 41: 719–722, 1992
8. Seaquist ER, Goetz FC, Rich S, Barbosa J: Familial clustering of diabetic kidney disease. Evidence for genetic susceptibility to diabetic nephropathy. N Engl J Med 320: 1161–1165, 1989
9. Giacco F, Brownlee M: Oxidative stress and diabetic complications. Circ Res 107: 1058–1070, 2010
10. Geraldes P, King GL: Activation of protein kinase C isoforms and its impact on diabetic complications. Circ Res 106: 1319–1331, 2010
11. Dahlström E, Sandholm N: Progress in defining the genetic basis of diabetic complications. Curr Diab Rep 17: 80, 2017
12. Moczulski DK, Rogus JJ, Antonellis A, Warram JH, Krolewski AS: Major susceptibility locus for nephropathy in type 1 diabetes on chromosome 3q: Results of novel discordant sib-pair analysis. Diabetes 47: 1164–1169, 1998
13. Imperatore G, Hanson RL, Pettitt DJ, Kobes S, Bennett PH, Knowler WC: Sib-pair linkage analysis for susceptibility genes for microvascular complications among Pima Indians with type 2 diabetes. Pima diabetes genes group. Diabetes 47: 821–830, 1998
14. Bowden DW, Colicigno CJ, Langefeld CD, Sale MM, Williams A, Anderson PJ, et al.: A genome scan for diabetic nephropathy in African Americans. Kidney Int 66: 1517–1526, 2004
15. Österholm AM, He B, Pitkäniemi J, Albinsson L, Berg T, Sarti C, et al.: Genome-wide scan for type 1 diabetic nephropathy in the Finnish population reveals suggestive linkage to a single locus on chromosome 3q. Kidney Int 71: 140–145, 2007
16. Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al.: Exome Aggregation Consortium: Analysis of protein-coding genetic variation in 60,706 humans. Nature 536: 285–291, 2016
17. Harjutsalo V, Sund R, Knip M, Groop PH: Incidence of type 1 diabetes in Finland. JAMA 310: 427–428, 2013
18. Sandholm N, Van Zuydam N, Ahlqvist E, Juliusdottir T, Deshmukh HA, Rayner NW, et al.: The FinnDiane Study Group; The DCCT/EDIC Study Group; GENIE Consortium; SUMMIT Consortium: The genetic landscape of renal complications in type 1 diabetes. J Am Soc Nephrol 28: 557–574, 2017
19. Peltonen L, Jalanko A, Varilo T: Molecular genetics of the Finnish disease heritage. Hum Mol Genet 8: 1913–1923, 1999
20. Thorn LM, Forsblom C, Fagerudd J, Thomas MC, Pettersson-Fernholm K, Saraheimo M, et al.: FinnDiane Study Group: Metabolic syndrome in type 1 diabetes: Association with diabetic nephropathy and glycemic control (the FinnDiane study). Diabetes Care 28: 2019–2024, 2005
21. Clarke GM, Anderson CA, Pettersson FH, Cardon LR, Morris AP, Zondervan KT: Basic statistical analysis in genetic case-control studies. Nat Protoc 6: 121–133, 2011
22. Wang X: Firth logistic regression for rare variant association tests. Front Genet 5: 187, 2014
23. Ma C, Blackwell T, Boehnke M, Scott LJ; GoT2D investigators: Recommended joint and meta-analysis strategies for case-control association testing of single low-count variants. Genet Epidemiol 37: 539–550, 2013
24. Georg Heinze M.P. logistf: Firth's Bias-Reduced Logistic Regression. R package version 1.22. 2016. Available at: https://rdrr.io/cran/logistf/man/logistf.html. Accessed October 11, 2019
25. Weinhold N, Jacobsen A, Schultz N, Sander C, Lee W: Genome-wide analysis of noncoding regulatory mutations in cancer. Nat Genet 46: 1160–1165, 2014
26. Quinlan AR, Hall IM: BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26: 841–842, 2010
27. Yan Q, Tiwari HK, Yi N, Gao G, Zhang K, Lin WY, et al.: A sequence kernel association test for dichotomous traits in family samples under a generalized linear mixed model. Hum Hered 79: 60–68, 2015
28. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al.: Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res 44: W90-7, 2016
29. Syreeni A, Sandholm N, Cao J, Toppila I, Maahs DM, Rewers MJ, et al.: DCCT/EDIC Research Group; Groop PH; FinnDiane Study Group: Genetic determinants of glycated hemoglobin in type 1 diabetes. Diabetes 68: 858–867, 2019
30. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X: Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 89: 82–93, 2011
31. Pihlajaniemi T, Myllylä R, Kivirikko KI, Tryggvason K: Effects of streptozotocin diabetes, glucose, and insulin on the metabolism of type IV collagen and proteoglycan in murine basement membrane-forming EHS tumor tissue. J Biol Chem 257: 14914–14920, 1982
32. Mason RM, Wahab NA: Extracellular matrix metabolism in diabetic nephropathy. J Am Soc Nephrol 14: 1358–1373, 2003
33. Griffon A, Barbier Q, Dalino J, van Helden J, Spicuglia S, Ballester B: Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory landscape. Nucleic Acids Res 43: e27, 2015
34. Barton M: Therapeutic potential of endothelin receptor antagonists for chronic proteinuric renal disease in humans. Biochim Biophys Acta 1802: 1203–1213, 2010
35. Zhang MZ, Wang Y, Paueksakon P, Harris RC: Epidermal growth factor receptor inhibition slows progression of diabetic nephropathy in association with a decrease in endoplasmic reticulum stress and an increase in autophagy. Diabetes 63: 2063–2072, 2014
36. Sandholm N, Salem RM, McKnight AJ, Brennan EP, Forsblom C, Isakova T, et al.: DCCT/EDIC Research Group: New susceptibility loci associated with kidney disease in type 1 diabetes. PLoS Genet 8: e1002921, 2012
37. Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, et al.: An atlas of active enhancers across human cell types and tissues. Nature 507: 455–461, 2014
38. ENCODE Project Consortium: An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57–74, 2012
39. Woroniecka KI, Park AS, Mohtat D, Thomas DB, Pullman JM, Susztak K: Transcriptome analysis of human diabetic kidney disease. Diabetes 60: 2354–2369, 2011
40. Nolis IK, McKay DJ, Mantouvalou E, Lomvardas S, Merika M, Thanos D: Transcription factors mediate long-range enhancer-promoter interactions. Proc Natl Acad Sci U S A 106: 20222–20227, 2009
41. Ma J, Natarajan R, LaPage J, Lanting L, Kim N, Becerra D, et al.: 12/15-lipoxygenase inhibitors in diabetic nephropathy in the rat. Prostaglandins Leukot Essent Fatty Acids 72: 13–20, 2005
42. Kang SW, Adler SG, Nast CC, LaPage J, Gu JL, Nadler JL, et al.: 12-lipoxygenase is increased in glucose-stimulated mesangial cells and in experimental diabetic nephropathy. Kidney Int 59: 1354–1362, 2001
43. Gubitosi-Klug RA, Talahalli R, Du Y, Nadler JL, Kern TS: 5-Lipoxygenase, but not 12/15-lipoxygenase, contributes to degeneration of retinal capillaries in a mouse model of diabetic retinopathy. Diabetes 57: 1387–1393, 2008
44. Gillies CE, Putler R, Menon R, Otto E, Yasutake K, Nair V, et al.: Nephrotic Syndrome Study Network (NEPTUNE): An eQTL landscape of kidney tissue in human nephrotic syndrome. Am J Hum Genet 103: 232–244, 2018
45. Warsow G, Endlich N, Schordan E, Schordan S, Chilukoti RK, Homuth G, et al.: PodNet, a protein-protein interaction network of the podocyte. Kidney Int 84: 104–115, 2013
46. Kutmon M, Riutta A, Nunes N, Hanspers K, Willighagen EL, Bohler A, et al.: WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Res 44: D488–D494, 2016
47. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, et al.: Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 42: 30–35, 2010
48. Lalonde E, Albrecht S, Ha KC, Jacob K, Bolduc N, Polychronakos C, et al.: Unexpected allelic heterogeneity and spectrum of mutations in Fowler syndrome revealed by next-generation exome sequencing. Hum Mutat 31: 918–923, 2010
49. Tsui LC, Dorfman R: The cystic fibrosis gene: A molecular genetic perspective. Cold Spring Harb Perspect Med 3: a009472, 2013
50. Sulem P, Helgason H, Oddson A, Stefansson H, Gudjonsson SA, Zink F, et al.: Identification of a large set of rare complete human knockouts. Nat Genet 47: 448–452, 2015
51. Colin E, Huynh Cong E, Mollet G, Guichet A, Gribouval O, Arrondel C, et al.: Loss-of-function mutations in WDR73 are responsible for microcephaly and steroid-resistant nephrotic syndrome: Galloway-Mowat syndrome. Am J Hum Genet 95: 637–648
52. Thao MT, Gaillard ER: The glycation of fibronectin by glycolaldehyde and methylglyoxal as a model for aging in Bruch’s membrane. Amino Acids 48: 1631–1639, 2016
53. Li Z, McKeague IW, Lumey LH: Optimal design strategies for sibling studies with binary exposures. Int J Biostat 10: 185–196, 2014
54. Bansal D, Badhan Y, Gudala K, Schifano F: Ruboxistaurin for the treatment of diabetic peripheral neuropathy: A systematic review of randomized clinical trials. Diabetes Metab J 37: 375–384, 2013
55. Qi W, Keenan HA, Li Q, Ishikado A, Kannt A, Sadowski T, et al.: Pyruvate kinase M2 activation may protect against the progression of diabetic glomerular pathology and mitochondrial dysfunction. Nat Med 23: 753–762, 2017
56. Dhar I, Dhar A, Wu L, Desai K: Arginine attenuates methylglyoxal- and high glucose-induced endothelial dysfunction and oxidative stress by an endothelial nitric-oxide synthase-independent mechanism. J Pharmacol Exp Ther 342: 196–204, 2012

Supplementary Materials

Supplemental Table 6
Supplemental Data

Articles from Journal of the American Society of Nephrology : JASN are provided here courtesy of American Society of Nephrology
