Significance Statement
Genetic differences are possible contributing factors to the substantial unexplained variability in rates of renal function loss in type 1 diabetes. Gene-based testing of protein coding genetic variants in whole-exome scans of individuals with type 1 diabetes with advanced kidney disease, as opposed to genome-wide SNP analyses, revealed that carriers of rarer, disruptive alleles in HSD17B14 experienced net protection against loss of kidney function and development of ESKD. HSD17B14 encodes hydroxysteroid 17-β dehydrogenase 14, which regulates sex steroid hormone metabolism. Paradoxically, proximal tubules from patients and mouse models had high levels of expression of the gene and protein, with downregulation in the presence of kidney injury. Hydroxysteroid 17-β dehydrogenase 14 may therefore be a druggable therapeutic target.
Keywords: type 1 diabetes, hydroxysteroid 17-beta dehydrogenase 14, end stage kidney disease, gene-based tests, rare variants, diabetic nephropathy, diabetic kidney disease
Abstract
Background
Rare variants in gene coding regions likely have a greater impact on disease-related phenotypes than common variants through disruption of their encoded protein. We searched for rare variants associated with onset of ESKD in individuals with type 1 diabetes at advanced kidney disease stage.
Methods
Gene-based exome array analyses of 15,449 genes in five large incidence cohorts of individuals with type 1 diabetes and proteinuria tested for association with survival time to ESKD. The top gene was then tested in a sixth cohort (n=2372 with 1115 events across all cohorts) and replicated in two retrospective case-control studies (n=1072 cases, 752 controls). Deep resequencing of the top associated gene in five cohorts confirmed the findings. We performed immunohistochemistry and gene expression experiments in human control and diseased cells, and in mouse ischemia reperfusion and aristolochic acid nephropathy models.
Results
Protein coding variants in the hydroxysteroid 17-β dehydrogenase 14 gene (HSD17B14), predicted to affect protein structure, had a net protective effect against development of ESKD at exome-wide significance (n=4196; P value=3.3 × 10−7). The HSD17B14 gene and encoded enzyme were robustly expressed in healthy human kidney, maximally in proximal tubular cells. Paradoxically, gene and protein expression were attenuated in human diabetic proximal tubules and in mouse kidney injury models. Expressed HSD17B14 gene and protein levels remained low without recovery after 21 days in a murine ischemic reperfusion injury model. Decreased gene expression was found in other CKD-associated renal pathologies.
Conclusions
The HSD17B14 gene is mechanistically involved in diabetic kidney disease. The encoded sex steroid enzyme is a druggable target, potentially opening a new avenue for therapeutic development.
Diabetic kidney disease (DKD) is one of the most prevalent and costly complications of type 1 diabetes (T1D), and one of the most devastating for quality of life.1,2 It affects about a third of individuals with T1D,3,4 many of whom develop ESKD. Despite improvements in glycemic control and almost universal implementation of reno-protective therapies, the incidence of new cases of ESKD in T1D has not changed over the last 20 years.2 Clearly, further knowledge is needed about the disease process underlying the development of DKD and the mechanisms of progression to ESKD, in order to develop more effective interventions to reduce the risk of ultimate kidney failure.
Recently it has been recognized that in addition to elevated urinary albumin excretion, the most significant clinical feature of DKD that predicts onset of ESKD is progressive decline in kidney function.5,6 Although renal decline is progressive, there is profound heterogeneity in the rate at which different patients lose renal function, as measured by GFR slope and time to develop ESKD.6,7 The disease process underlying this heterogeneity seems to be multifactorial and includes, among other factors, variation in exposures such as hyperglycemia, hypertension, and genetics.8
To date, genome-wide discovery studies in DKD have almost universally focused on testing individual common SNPs to identify associations.9–13 Progress using this approach has been slow, and published results from individual studies and consortium efforts have not been consistently replicated.9,10 Possible reasons include modest sample sizes, pragmatic but imprecise case and control definitions, and phenotypes that do not adequately interrogate the heterogeneous nature of T1D kidney function decline.14–17 Furthermore, most common variants are in non–protein coding segments of the genome18 that are difficult to functionally characterize. They may influence the expression of remote genes that are 100–1000s of kilobases distant from the landmark SNP.19
More recently, a new approach has been developed to test the aggregate association of multiple rare variants within gene coding and splicing regions.20,21 The basis of this method is the expectation that rare variants are likely to have larger individual effects on phenotypes through direct disruption of an encoded protein, whereas aggregate testing of all of the variants in a gene improves the statistical power. A genome scan using this method tests each individual gene in the genome (approximately 20,000) for net association of the variants with the phenotype. Two types of models are tested: (1) assume all of the variants in a gene have the same direction of effect on the phenotype, either risk or protective; or (2) variants in a gene can act in opposite directions, some increasing risk and some protecting. Statistical power is also improved as a result of the reduced multiple testing correction (20,000 gene tests versus 1 million or more SNPs). Finally, because the variants are at least partially likely to act through direct disruption of the gene they are located within, the target protein and its biologic actions are a logical starting point for functional interpretation, and for more detailed tissue and cellular experimentation.
Under the auspices of the JDRF Diabetic Nephropathy Collaborative Research Initiative (JDRF DNCRI), specifically the subproject entitled “Genes determining time of onset of ESRD in type 1 diabetes individuals with proteinuria,” we assembled one of the largest longitudinal multicohort collections to investigate genetic influences contributing to fast progression to ESKD in cohorts with advanced DKD. By analogy with other chronic diseases,22–24 we reasoned that focusing on rare variants of larger effect size could more quickly lead to direct functional insight compared with associated common variants and that a more precise phenotype that captured the heterogeneity in the rate of decline to ESKD would best exploit the power available in the examined cohorts with long-term follow-up.
Methods
Study Design and Participants
We assembled six T1D cohorts, with a baseline of prevalent proteinuria as a biomarker of advanced DKD, to study their rates of progression to ESKD. The recruitment, follow-up, and renal function measurements in four of the six international clinical cohorts (Joslin Kidney Study: USA [Joslin]; Finnish Diabetic Nephropathy Study: Finland [FinnDiane]; T1D patients from Steno Diabetes Center: Denmark [Steno]; INSERM: France) have been described in detail previously. Briefly, individuals with T1D were eligible for the genetic screening cohorts if they had persistent proteinuria at baseline, generally defined as two of three consecutive urinary measurements, or two within 3 years, exceeding 300 mg/g, with slight differences between the cohorts, as described.8 Additional details of the ascertainment and determination of persistent proteinuria for the FinnDiane cohort are included in Supplemental Methods. Undiagnosed ESKD events were defined as the first occurrence of eGFR<10 ml/min per 1.73 m2. Two additional US-based cohorts (Pittsburgh Epidemiology of Diabetes Complications Study: USA [EDC]25; and Wisconsin Epidemiologic Study of Diabetic Retinopathy: USA [WESDR]26) used less stringent inclusion criteria to maximize sample size: incident proteinuria (first occurrence >300 mg/g) with an accompanying eGFR measurement and at least one later eGFR measurement at which the individual’s diagnosed and undiagnosed ESKD status was known. Data from EDC were combined with Joslin, FinnDiane, Steno, and INSERM in the Discovery Stage, whereas WESDR, with limited genetic data, provided data for extension and cohort meta-analysis.
Two T1D case-control replication studies were assembled from T1D cases with prevalent ESKD at study recruitment, and controls without clinical evidence of DKD but with very long duration of T1D. In the first study, through a collaboration of the Joslin and the USA Fresenius Dialysis Centers, East Coast patients with T1D and new-onset ESKD of non-Hispanic white ancestry were randomly selected as cases. They were 20–54 years of age at the initiation of dialysis, had been taking insulin, and had a diagnosis of T1D. The controls were selected from among participants in the Joslin T1D Medalist Study. These individuals had a diabetes duration of at least 50 years without clinical evidence of DKD.27
The second T1D case-control replication group was drawn from the Genetics of Kidneys in Diabetes US Study (GoKinD) recruited through George Washington University (GWU-GoKinD).28 Joslin-recruited GoKinD participants were omitted to avoid overlap with the Joslin Kidney Study cohort. Lacking longitudinal time to ESKD for this group and to enrich for more rapid ESKD in cases versus longer duration in controls, cases were selected as having prevalent ESKD at study entry (either dialysis or transplant) with diabetes duration <33 years (median of all GWU-GoKinD ESKD), whereas controls had prevalent normoalbuminuria and diabetes duration >23 years (median of all GWU-GoKinD normoalbuminuria participants). Protocols for recruitment and data collection for participants in the above studies were approved by the relevant Institutional Review Boards or Ethics Committees.
Combined Genome and Exome Array Genotyping and Quality Control
All study groups were genotyped on a combined GWAS plus Exome array (Illumina HumanCoreExome, San Diego, CA, USA) at the Center for Public Health Genomics Laboratory, University of Virginia. The genotyping and quality control (QC) methods have been described extensively in a previous publication from this JDRF DNCRI Consortium.11 Briefly, the array contained 250K genome-wide SNPs and other variants, and >200K gene/exome-centered variants. The samples were genotyped in batches using Illumina Gentrain2 algorithm and software, and then re-called with zCall, an algorithm specially designed for calling array-genotyped rare variants.29 The sample batches were filtered for low-quality samples (call rate <98%, sex misclassification, extreme heterozygosity) and variants for quality (call rate <95%, extreme deviation from Hardy–Weinberg equilibrium). Duplicates and cryptic relatedness were identified using KING30 and one of each pair was removed. Principal component analysis was performed in each cohort and study group to exclude outliers with evidence of non-European ancestry.
Gene Resequencing and Bioinformatics Processing
DNA samples were selected from five cohorts (Joslin, FinnDiane, Steno, INSERM, and WESDR) that had passed sample QC in the prior genotyping array assays. EDC samples were not available for inclusion at the time of Resequencing Stage design. Cohort samples were resequenced at the same laboratory using Illumina TruSeq custom amplicon assays. Primers for the amplification of targeted regions were designed using the DesignStudio sequencing assay design tool (Illumina, San Diego, CA, USA). Targeted amplicon libraries, consisting of 151 targets, were constructed using Illumina TruSeq custom amplicon assay kit (version 1.5) according to the manufacturer’s protocol (Supplemental Material and Supplemental Figure 1). The amplicon sequencing libraries were sequenced with 150-bp paired-end reads on an Illumina MiSeq Sequencer. The within-amplicon overlapping paired-end FASTQ sequence reads were checked and pairwise assembled using PEAR.31 The assembled contigs were then aligned to genome build hg38 using BWA-MEM,32 and variants called using the Genome Analysis Toolkit (GATK) v4.1 best practices, modified for the amplicon sequencing protocol.33 More details about the methods used for the pipeline and QC are available in the Supplemental Material.
Variant Functional Annotation
Variants were annotated using the Ensembl Variant Effect Predictor (release 93) with LOFTEE plugin for loss-of-function estimation.34 Only nonsense or splice site variants that were predicted with High Confidence by LOFTEE were included. The preselected primary genome scan variant risk set “Missense” included all predicted protein coding missense and nonsense variants, and any intronic variant located in the canonical 2-bp splice donor or acceptor site at the 5′ and 3′ ends of an intron, in any aligned transcript of a gene. Two other secondary risk sets were tested for comparison of the relative magnitudes of variant associations using data from the resequencing experiment. The “Deleterious” variant risk set included any nonsense or splice variant, plus any missense variant that was predicted to be “probably_damaging” (PolyPhen) and “deleterious” (SIFT). The most restrictive “LOF/GOF/Splice” set was defined to include only nonsense and splice site variants.
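The nesting of the three risk sets described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the annotation pipeline used in the study: the `Variant` class, its field names, and the consequence labels are hypothetical stand-ins for the Variant Effect Predictor, LOFTEE, PolyPhen, and SIFT outputs. Frameshift variants are grouped with nonsense variants here because Table 3 lists a frameshift in the LOF/GOF/Splice set.

```python
from dataclasses import dataclass

# Hypothetical consequence labels for truncating and canonical splice site variants.
LOF_LIKE = {"nonsense", "frameshift", "splice_donor", "splice_acceptor"}


@dataclass(frozen=True)
class Variant:
    consequence: str                      # e.g. "missense", "nonsense", "synonymous"
    loftee_high_confidence: bool = False  # LOFTEE High Confidence call
    polyphen: str = ""                    # e.g. "probably_damaging"
    sift: str = ""                        # e.g. "deleterious"


def risk_sets(v: Variant) -> set:
    """Return the names of the risk sets that a variant belongs to."""
    sets = set()
    if v.consequence in LOF_LIKE:
        # Nonsense and splice variants count only with a High Confidence call.
        if v.loftee_high_confidence:
            sets.update({"LOF/GOF/Splice", "Deleterious", "Missense"})
    elif v.consequence == "missense":
        sets.add("Missense")
        # Deleterious requires agreement between both predictors.
        if v.polyphen == "probably_damaging" and v.sift == "deleterious":
            sets.add("Deleterious")
    return sets
```

Under these assumptions the sets are nested: every LOF/GOF/Splice variant is also in Deleterious, and every Deleterious variant is also in Missense.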
Outcomes
The primary outcome for the cohort genetic analysis was the time to ESKD event from the baseline proteinuria cohort entry time (either incident and persistent, or first incident). The ESKD event was either clinically diagnosed or was inferred as an undiagnosed event at the first occurrence of eGFR<10 ml/min per 1.73 m2. Absent an ESKD event for a participant, follow-up time was censored at the last eGFR measure. eGFR was estimated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) serum creatinine formula.35 Membership in the retrospective study groups was used for the dichotomous case-control replication analyses, with inclusion/exclusion criteria as described above.
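As a minimal sketch of how this outcome definition could be turned into a survival record, the Python function below derives a follow-up time and an event indicator for one participant. The function name, argument layout, and the choice to take the earlier of a clinical diagnosis and an eGFR-inferred event are assumptions made for illustration; the cohort-specific derivations are described in the Supplemental Material.

```python
def time_to_eskd(entry_year, egfr_series, diagnosed_eskd_year=None, threshold=10.0):
    """Return (follow_up_years, event) for one participant.

    egfr_series: list of (year, eGFR) pairs measured at or after cohort entry;
                 at least one measurement is required.
    event is 1 for diagnosed or inferred ESKD, 0 if censored.
    """
    visits = sorted(egfr_series)
    # Inferred (undiagnosed) ESKD: first eGFR below the threshold.
    inferred = next(
        (year for year, egfr in visits if year >= entry_year and egfr < threshold),
        None,
    )
    event_years = [y for y in (diagnosed_eskd_year, inferred) if y is not None]
    if event_years:
        # Assumption: use whichever event time comes first.
        return min(event_years) - entry_year, 1
    # No event: censor at the last eGFR measurement.
    return visits[-1][0] - entry_year, 0
```

For example, a participant entering in 2000 whose eGFR first falls below 10 ml/min per 1.73 m2 in 2005 contributes 5 years of follow-up with an event, whereas a participant with no such measurement is censored at the last eGFR visit.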
Whole-Exome Gene-Aggregated Analysis
Gene-aggregated tests of variants were performed locally within each cohort using a minimal proportional hazards model including adjustments for eGFR at study baseline and principal components appropriate for the cohort ancestral composition. For the retrospective case-control study groups, a logistic regression model was used with similar adjustments. More details are available in the Supplemental Material. For each gene, two standard tests were performed: a Burden test, more powerful for genes with multiple variants that wholly or predominantly confer either risk or protection (variant effects in the same direction); and SKAT, which is more powerful for genes that contain multiple variants that confer risk and protection (variant effects in different directions in the same gene).21 The inheritance model for each variant was assumed to be additive, such that two copies of the rarer allele were modeled with double the effect size of one. The variants in each risk set in each gene were combined using a standard β(1,25) weighting scheme on the basis of their minor allele frequency (MAF), such that very rare variants (MAF close to 0) were given a weight of almost 25, whereas common variant weights dropped to nearly 0 at MAF=0.5. No MAF filter was applied to the variants, and hence both rare and common variants were included, but common variants had reduced weights in the variant-aggregated statistic. The burden test (but not the bidirectional SKAT test) also permitted an estimate of the variant-weighted average single direction effect size per minor allele. The gene score and covariance matrix summary statistics were transmitted to the genetic analysis center at the University of Virginia and combined into a meta-analysis of all genes using published methods.
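The weighting scheme can be illustrated with a short Python sketch. It assumes that the weight is the Beta(1, 25) density evaluated at the MAF, which is consistent with the description above (a weight close to 25 as the MAF approaches 0 and close to 0 at MAF=0.5). The function names are hypothetical, and the sketch computes only a per-person weighted allele count, not the score statistics used by the analysis software.

```python
import math


def beta_weight(maf, a=1.0, b=25.0):
    """Beta(a, b) density evaluated at the minor allele frequency."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)


def burden_score(minor_allele_counts, mafs):
    """Weighted sum of minor allele counts (0, 1, or 2) across a gene's variants.

    The additive model means two copies of an allele contribute twice
    the weight of one copy.
    """
    return sum(beta_weight(m) * g for g, m in zip(minor_allele_counts, mafs))
```

With these weights a carrier of a single very rare allele contributes far more to the gene score than a carrier of a common allele, which is why no explicit MAF filter is needed.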
The statistical genetic models and meta-analyses were implemented in R 3.2 or later (R Core Team), using the seqMeta package (v1.6.7; Voorman, Brody, Chen, Lumley, and Davis; https://cran.r-project.org/src/contrib/Archive/seqMeta/). Empirical genomic control was applied to the exome-wide gene-based test results by normalizing the null model residual standard error by the square root of the genomic control parameter (λ^(1/2)), thereby adjusting the median P value to null expectation and all P values toward the null.36 More details are included in the Supplemental Material. For the whole-exome gene-aggregated scan, the study-wide significance threshold was set at P<1.6 × 10−6 calculated with Bonferroni correction (0.05/15449 nonmonomorphic genes/2 tests, burden and SKAT). The overall meta-analysis of cohort survival and case-control statistics was performed using a Liptak–Stouffer method of standardized normal deviates weighted by inverse standard error.37 The overall meta-analysis test was two-sided for burden and one-sided for SKAT.
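The threshold arithmetic and the weighted combination of standardized deviates can be sketched as follows. This is a generic Python illustration of a Bonferroni correction and a Liptak–Stouffer weighted Z method with inverse standard error weights; it is not the seqMeta implementation, the function names are hypothetical, and it omits the genomic control step, so it is not expected to reproduce the reported P values exactly.

```python
import math


def bonferroni_threshold(alpha, n_genes, n_tests_per_gene):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / (n_genes * n_tests_per_gene)


def stouffer_weighted(betas, standard_errors):
    """Combine study effects as a weighted sum of Z scores.

    Each study contributes z = beta / SE with weight 1 / SE.
    Returns the combined Z and its two-sided normal P value.
    """
    z_scores = [b / s for b, s in zip(betas, standard_errors)]
    weights = [1.0 / s for s in standard_errors]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_combined = numerator / denominator
    p_two_sided = math.erfc(abs(z_combined) / math.sqrt(2.0))
    return z_combined, p_two_sided
```

Calling `bonferroni_threshold(0.05, 15449, 2)` gives approximately 1.6 × 10−6, the study-wide threshold stated above.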
Immunofluorescence Staining for HSD17B14 in Human Kidney
Human kidney biopsy specimen paraffin sections were deparaffinized with xylene and ethanol, and pressure cooker treated for antigen retrieval. The sections were blocked with 3% BSA-PBS and were incubated with primary antibodies (anti-HSD17B14 rabbit polyclonal antibody: a generous gift from Dr. A. Jansson, Linköping University, Linköping, Sweden; anti–human KIM-1 mouse monoclonal antibody, Clone AKG7) for 1 hour at room temperature. After washing with PBS, sections were incubated with secondary antibodies for 30 minutes. The sections were then incubated with Vector ABC Elite Kit followed by color development with Vector DAB kit.
Mouse Ischemia-Reperfusion Injury and Aristolochic Acid–Induced Nephrotoxicity Studies
The murine ischemia-reperfusion injury (IRI) and acute aristolochic acid–induced nephropathy (aristolochic acid nephropathy [AAN]) models have previously been described.38 Details of the preparation of mouse kidney samples are also described.39 Mouse kidney frozen sections were thawed and treated with 1% SDS/PBS, then washed in PBS. After blocking with 3% BSA, sections were incubated with primary antibodies for 1 hour at room temperature or overnight at 4°C. The frozen sections were then incubated with secondary antibodies for 30 minutes, then washed. Vectashield (Vector Laboratories, Burlingame, CA) containing DAPI (12.5 μg/ml) was applied and slide cover-slips applied.
Protein Structure Modeling
Available crystal structures of HSD17B14 in the Protein Data Bank (PDB; http://www.rcsb.org) showed very similar conformation and good superposition. PDB entry 6EMM was chosen for structure visualization of HSD17B14 because it contained a fully modeled C-terminal chain, possibly due to the preservation of the intersubunit Cys255-Cys255 disulfide bond (Bertoletti et al. PDB deposited 10/02/2018, publication in process). C-terminal residues Gly271 and Ser272 were excluded from the analysis and presumed to be a cloning product because no predicted translation product of HSD17B14 transcripts (Gencode v32), nor the canonical UniProt sequence Q9BPX1 (270 aa), contain these. CCP4mg was used for structural visualizations.40
Expression Quantitative Trait Locus Analysis
To test the association of specific common variants with gene expression in kidney compartments, existing published datasets were interrogated by variant.41 The methods for data generation are described therein, but briefly, human kidney samples were obtained from surgical nephrectomies, stored at −80°C in RNAlater (Ambion), then microdissected into glomerular and tubular compartments. RNA-seq data were generated using Illumina TruSeq protocols and GWAS data from Affymetrix Axiom Biobank arrays. Expression quantitative trait locus (eQTL) analyses were run on European ancestry samples with absence of significant kidney structural changes (tubular fibrosis <10%, glomerular sclerosis <10%). After QC and filtering, 121 and 119 samples were used for tubule and glomerular eQTL analyses, respectively.
Single Nucleus RNA-Seq Analysis
Single nucleus RNA-seq (snRNA-seq) data were downloaded from NCBI GEO (Series GSE131882), containing transcript counts from experiments on renal cortex from nephrectomies of three human nondiabetic controls and three patients with early diabetic nephropathy.42 The diabetic patients had elevated A1c, evidence of mesangial sclerosis, and glomerular basement membrane thickening. Patient ages ranged from 52 to 74 years and eGFR from 56 to 85 ml/min per 1.73 m2; neither differed between groups. Two diabetic patients had proteinuria with an increased proportion of global glomerulosclerosis and interstitial fibrosis and tubular atrophy. The six samples were analyzed as a single group using Seurat v3.1 with default QC and log normalization. The top 30 principal components were used as input dimensions for the Uniform Manifold Approximation and Projection (UMAP) algorithm for dimensional reduction and visualization of the single nucleus cell types.43
Publicly Available Datasets
Bulk RNA-seq expression results were retrieved from GTEx (https://www.gtexportal.org), October 21, 2019; and human Affymetrix U133 array data from Nephroseq (http://www.nephroseq.org), October 21, 2019, from published experiments.44 Selected gene expression values were extracted together with ascertainment pathology and other covariates. Additional postprocessed snRNA-seq results were retrieved from http://humphreyslab.com, January 2, 2020, and used previously published experimental data.42,45 Previous observations of alleles and frequencies were downloaded from gnomAD v2.1 (http://gnomad.broadinstitute.org).46 CKDGen summary results were retrieved from https://ckdgen.imbi.uni-freiburg.de.
Results
Study Design and Participant Characteristics
We used clinical and phenotype data, together with the exome variant subset of the Illumina HumanCoreExome genotyping data, for six international proteinuria cohorts with longitudinal eGFR data and two case-control sets, all of European ancestry. The study design and participation in the genetic analysis stages are shown in Figure 1A, and the clinical characteristics of the study groups in Table 1. The five discovery cohorts (Joslin, FinnDiane, Steno, INSERM, EDC) contained 2212 total participants who experienced 1095 ESKD events. The extension cohort (WESDR) contributed 160 participants and 20 events, giving 1115 events in 2372 participants for the overall meta-analysis of the leading gene. The two case-control replication studies respectively contained 946 new-onset ESKD cases (Joslin-Fresenius) and 610 controls (Joslin Medalists); and 126 cases (lower 50th percentile of diabetes duration) versus 142 controls (upper 50th percentile of diabetes duration) from GWU-GoKinD.
Figure 1.

Study design and primary gene-based whole-exome scan results. (A) The six T1DKD cohorts and two case-control studies showing participation in each phase of the study: primary whole-exome scan (Scan), replication (Replic), meta-analysis, and resequencing (Reseq). The first five cohorts (Joslin, FinnDiane, Steno, INSERM, EDC) were used for the primary genome screen resulting in the initial exome scan meta-analysis result (15,449 genes), whereas the smaller extension cohort, WESDR, was also included with the replication case-control studies of the primary gene result (HSD17B14) in the overall meta-analysis of the top gene. (B) Manhattan plot for the primary burden test P values. The gene HSD17B14 was the most significant in the primary scan (five cohorts) and improved with the inclusion of the WESDR extension cohort and the two replication case-control studies in the overall meta-analysis (+WESDR +Replication). The experiment-wise corrected P-threshold of 1.6 × 10−6 is shown as a red dashed line. Den, Denmark; Fin, Finland; Fra, France.
Table 1.
Type 1 DKD study groups
| Characteristic | Macroalbuminuria Cohorts | | | | | | Case-Control Groups | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Study group | Joslin | FinnDiane | Steno | INSERM | EDC | WESDR | Joslin-Fresenius ESKD Cases | Joslin Medalist Controls | GWU-GoKinD Cases | GWU-GoKinD Controls |
| [US State,] Country | MA, USA | Finland | Denmark | France | PA, USA | WI, USA | USA | MA, USA | USA | USA |
| Recruitment period | 1993–2002 | 1998–2001 | 1993–1999 | 1993–1998 | 1986–1988 | 1980–1982 | ||||
| Patients w/ Exome array data | 614 | 783 | 414 | 257 | 144 | N/Aa | 946 | 610 | 126 | 142 |
| Age at cohort entry, median yr (IQR) | 35.0 (29.0, 35.8) | 37.9 (31.3, 46.4) | 40.9 (34.4, 49.0) | 41.2 (32.4, 50.1) | 31.7 (26.2, 38.0) | 34.5 (27.5, 43.2) | 50.2 (46.0, 57.0)b | N/A | 43.0 (38.0, 48.0) | 41.5 (36.0, 47.0) |
| Male, N (%) | 339 (55.2) | 474 (60.5) | 251 (60.6) | 154 (59.9) | 82 (56.9) | 100 (64.1) | 527 (55.7) | 280 (45.9) | 58 (46.0) | 57 (40.1) |
| eGFR at cohort entry, median ml/min per 1.73 m2 (IQR) | 81.1 (62.9, 106.2) | 71.7 (60.4, 83.3) | 71.5 (51.8, 94.5) | 66.1 (45.7, 89.0) | 88.4 (66.9, 113.6) | 96.9 (74.9, 114.4) | N/A | 80.9 (70.7, 91.2) | 47.8 (34.7, 66.4) | 87.4 (77.2, 102.5) |
| Time to ESKD, median yr (IQR) | 6.6 (4.3, 9.9) | 6.4 (4.1, 9.9) | 7.0 (4.4, 11.1) | 5.4 (3.0, 10.2) | 10.0 (5.3, 13.3) | 6.0 (5.1, 18.8) | ||||
| Number of ESKD events | 354 | 447 | 132 | 99 | 63 | 20 | 946 | 0 | 126 | 0 |
[US State,] is omitted for USA cohorts or groups that were recruited in multiple states. Recruitment period is the range in years for the 25th and 75th percentiles of the collection.
aFor reasons of study timing and logistics, WESDR analyses were performed using only genotypes from resequencing.
bIn Joslin-Fresenius ESKD cases, age is the age at ESKD onset.
Genes Associated with Survival against ESKD in Participants with T1D and Advanced DKD
Using gene-aggregated tests of association of all coding and splice site–affecting variants on the Illumina HumanCoreExome Array, we identified HSD17B14 (hydroxysteroid 17-β dehydrogenase 14) as the most significantly associated gene in our initial whole-exome scan of 15,449 annotated, nonmonomorphic genes using five T1DKD discovery cohorts (Figure 1B and Table 2). After genomic control correction within each cohort (Supplemental Material and Supplemental Figure 2), the burden test P value was 8.6 × 10−6, just short of the exome-wide significance threshold accounting for all multiple testing (1.6 × 10−6). The top ten genes by significance from this screen are included in Supplemental Table 1. Variant-weighted meta-analysis of the lead gene including the WESDR cohort yielded an overall model burden test β for the log(hazard ratio [HR]) of −0.046 (SEM=0.010), equivalent to an HR=0.955 (95% CI, 0.94 to 0.97). The meta-analysis of the two case-control study sets yielded a nearly identical burden test log(odds ratio [OR]) of −0.045 (SEM=0.019), equivalent to OR=0.955 (95% CI, 0.93 to 0.99). Overall meta-analysis of the standardized effect sizes of the cohort and case-control results gave an experiment-wide significant P value of 3.3 × 10−7 for HSD17B14 (Figure 1B), and this remained the most significant gene by either test. The burden test P value was smaller than the SKAT P value for the same gene by more than two orders of magnitude (Supplemental Table 2), suggesting that the rare variants were predominantly acting in one direction of effect. The detailed study group results in Table 2 demonstrate the consistency of effect direction, with seven of eight study groups having a negative β for log(HR) or log(OR), indicating that the rare/minor alleles for the gene variants exerted an overall protective effect against progression to ESKD from an advanced state of DKD; in the cohorts, this was the ascertained state of proteinuria.
The top genes from the combined cohort and case-control meta-analyses are included in Supplemental Table 3. Because the burden test was empirically more significant for HSD17B14, follow-up analyses focused on this model.
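The conversion from a burden test β on the log scale to a hazard or odds ratio with a confidence interval can be checked with a few lines of Python. The sketch assumes a normal (Wald) interval, β ± 1.96 × SEM, exponentiated; the function name is hypothetical, and the reported intervals may reflect additional adjustments not modeled here.

```python
import math


def ratio_with_ci(beta, sem, z=1.96):
    """Exponentiate a log-scale effect and its Wald confidence limits."""
    estimate = math.exp(beta)
    lower = math.exp(beta - z * sem)
    upper = math.exp(beta + z * sem)
    return estimate, lower, upper


# Cohort meta-analysis values from the text: beta = -0.046, SEM = 0.010.
hr, hr_low, hr_high = ratio_with_ci(-0.046, 0.010)
```

For the cohort meta-analysis this gives an HR of about 0.955 with limits that round to 0.94 and 0.97, matching the values quoted above. Because β is a per-minor-allele effect weighted toward rare variants, the ratio describes the weighted average effect of one additional minor allele rather than the effect of any single variant.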
Table 2.
Results of gene-based testing of HSD17B14 by cohort and case-control group
| Study Group | Events/N | N Cases/Controls | Variants | Burden β (SEM) | Burden P Value |
|---|---|---|---|---|---|
| Discovery cohorts | |||||
| Joslin | 354/614 | | 3 | −0.051 (0.030) | 0.091 |
| FinnDiane | 447/783 | | 5 | −0.045 (0.012) | 1.5 × 10−4 |
| Steno | 132/414 | | 3 | −0.218 (0.081) | 0.0072 |
| INSERM | 99/257 | | 4 | 0.011 (0.043) | 0.80 |
| EDC | 63/144 | | 4 | −0.078 (0.052) | 0.13 |
| Discovery meta-analysis (n=5) | 1095/2212 | | 5 | −0.045 (0.010) | 8.6 × 10−6 |
| Nondiscovery cohort | |||||
| WESDR | 20/160 | | 3 | −0.097 (0.10) | 0.35 |
| Cohort meta-analysis (n=6) | 1115/2372 | | 6 | −0.046 (0.010) | 6.3 × 10−6 |
| Replication case-control | |||||
| Joslin-Fresenius versus Joslin Medalists | | 946/610 | 5 | −0.041 (0.019) | 0.052 |
| GWU-GoKinD cases versus controls | | 126/142 | 5 | −0.057 (0.041) | 0.17 |
| Case-control meta-analysis (n=2) | | 1072/752 | 6 | −0.045 (0.019) | 0.017 |
| Overall meta-analysis | | | | | 3.3 × 10−7 |
Results used the variants present on the Illumina Infinium HumanCoreExome Bead Array. Analysis for all cohorts except WESDR used array genotyping data; WESDR genotypes were derived from resequencing (described later). The variants column shows the number of variants tested in that cohort or case-control analysis of HSD17B14. The number varies because of batch QC. Integers in parentheses in the cohorts or case-control groups (n) are the numbers of studies included. Events are ESKD diagnosed clinically or by proxy from eGFR<10 ml/min per 1.73 m2. β values are the log values of the HR/OR and measure the mean effect of the aggregated rare or less common alleles weighted by the discovery weight function β(1,25). Weights for rare variants (MAF<0.005) are close to 25, whereas those for common SNPs (MAF>0.1) are near 0. SEMs are of the β estimates. The exome-wide gene discovery and case-control results were corrected for genomic control within each cohort or case-control study.
Association of HSD17B14 Variants Discovered by Resequencing
Because the exome array HSD17B14 content was limited to only six variants, we resequenced the HSD17B14 gene exons in five of our T1D cohorts (Figure 1A) to generate a deeper catalog of coding variation. The samples that were submitted for resequencing and passed QC were similar to those with array genotyping, but were augmented with newly eligible cohort samples from individuals who had subsequent DKD progression during the study time interval and now met the inclusion criteria. All of the EDC cohort samples were omitted for study logistics reasons. Hence the sample sizes, variants seen, and distribution of missing genotype data differ from, and are not directly comparable with, the results from the whole-exome array-based scan. Post sequencing QC, the cohort sample numbers were Joslin n=620 (n=614 previous array); FinnDiane 820 (n=783); Steno 416 (n=414); INSERM 254 (n=257); WESDR 312 (N/A) for 2422 samples, which resulted in 2239 samples with phenotypic data that were included in the sequence-based association meta-analyses. We increased the catalog of variation from six array genotyped variants (two SNPs with MAF>0.05 and four rare SNVs with MAF<0.002) to 20 coding SNVs and two noncoding intronic SNVs within the 2-bp splice recognition sites directly flanking intron-exon boundaries (Figure 2 and Table 3). Three of the variants were classified as loss or gain of function or within a splice site (LOF/GOF/splice site risk set) with high quality (technical sequencing quality statistics shown in Supplemental Table 4); an additional six were classified as nonsynonymous but deleterious (Deleterious risk set); and 13 were nondeleterious missense (Missense risk set).
A frameshift variant (p.A249CfsTer55) was found in a Joslin cohort patient, and the clinical record revealed that this patient had the longest duration since diabetes diagnosis without progression to ESKD in the entire Joslin cohort of 620 patients (65.8 years; Figure 3A and Supplemental Table 5), suggesting this could be a protective mutation. The predicted functional consequence of the inserted base allele was to extend the protein length in the most highly expressed protein isoform from the wild-type 270 aa to 302 aa, with the 22 aa at the C-terminal end mutated compared with the wild type (Figure 3B). This frameshift had been seen 14 times previously in 235,000 alleles (gnomAD v2.1; Supplemental Table 6) but no homozygotes had been seen. Using visualization of existing crystal structures for HSD17B14, we found that the wild-type C-terminal residues 249–270 in each subunit of the homo-tetramer contacted the opposite subunit for most of their length (Figure 3C), and residues Val263, Pro266, and Pro269 of the C-terminal tail formed complementary hydrophobic interactions with the core of the subunit (Figure 3D); moreover, Asp267 formed a salt bridge with Arg203 of the core. The most important interaction was possibly the disulfide bond between residues Cys255 of the adjacent subunits (Figure 3E). These models suggested that the p.A249CfsTer55 mutant could have a destabilized tetrameric complex structure and that the elongated mutated C-tail had the potential to hinder the entrance of the substrate into the active site (Figure 3E). Further commentary is available in the Supplemental Material and Supplemental Figure 3.
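The predicted mutant length follows from the variant name itself. In HGVS protein nomenclature, the number after "fsTer" is the position of the new stop codon counted from the first changed residue, which is counted as position 1. The Python sketch below applies that convention to one-letter frameshift descriptions such as the one above; the function name and the restricted pattern it accepts are assumptions made for illustration.

```python
import re


def frameshift_protein_length(hgvs_p):
    """Predicted protein length for a one-letter HGVS frameshift description.

    Example input: "p.A249CfsTer55" (first changed residue 249, new stop
    at position 55 of the shifted frame).
    """
    match = re.fullmatch(r"p\.[A-Z](\d+)[A-Z]fsTer(\d+)", hgvs_p)
    if match is None:
        raise ValueError("unsupported variant description: " + hgvs_p)
    first_changed = int(match.group(1))
    stop_offset = int(match.group(2))
    # The stop codon occupies this position; the last residue is one before it.
    stop_position = first_changed + stop_offset - 1
    return stop_position - 1


mutant_length = frameshift_protein_length("p.A249CfsTer55")
```

For p.A249CfsTer55 the new stop falls at position 303, so the predicted product is 302 aa: the unchanged residues 1–248 followed by 54 altered residues, which replace the 22 wild-type residues 249–270.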
Figure 2.

Genomic positions (build hg38), alleles, and risk sets of all nonsynonymous (protein coding change) and splice site variants discovered by resequencing in five T1D cohorts. The “Missense” risk set contained any predicted nonsynonymous or splice site donor or acceptor genetic variant; the “LOF/GOF” risk set contained only variants predicted to cause loss or gain of function or splice site disruption; the “Deleterious” risk set contained all LOF/GOF and any other missense variant that was predicted by SIFT and PolyPhen to be deleterious. The Genome Array line shows the variants present in the original Illumina HumanCoreExome array genome scan. LOF/GOF variants are in red text. The filled circles beneath the variants are shaded red if the estimated single variant minor allele effect direction increased risk (+ve effect), and shaded blue if they decreased risk (-ve effect).
Table 3.
Association of individual coding and splice site variants in HSD17B14 with survival time to ESKD in five type 1 diabetic incidence cohorts with proteinuria
| Position | Location | REF | ALT | Consequence | P Value | MAC | MAF | β (SEM) | Dir |
|---|---|---|---|---|---|---|---|---|---|
| LOF/GOF/Splice site | | | | | | | | | |
| 48813244 | Exon 9 | A | AC | A249CfsTer55 | 0.17 | 1 | 2.2 × 10−4 | −1.4 (1.0) | - |
| 48835804 | Exon 2 + 1 donor | C | G | Splicing | 0.25 | 1 | 2.2 × 10−4 | −0.8 (0.7) | - |
| 48835844 | Exon 2–1 acceptor | C | T | Splicing | 0.50 | 1 | 2.2 × 10−4 | 1.3 (2.0) | + |
| Deleterious | | | | | | | | | |
| 48815123a | Exon 6 | G | A | R130W | 0.0077 | 234 | 0.052 | −0.3 (0.1) | ----- |
| 48834302a | Exon 3 | C | A | D62Y | 0.033 | 9 | 0.0020 | −1.0 (0.5) | -- |
| 48836333a | Exon 3 | G | A | R27C | 0.064 | 2 | 5.8 × 10−4 | −2.1 (1.1) | - |
| 48836347 | Exon 3 | C | T | G22E | 0.61 | 2 | 5.7 × 10−4 | −0.6 (1.2) | -+ |
| 48836366 | Exon 1 | C | A | G16W | 0.27 | 16 | 0.0045 | 0.6 (0.5) | +--+ |
| 48836387 | Exon 1 | C | T | G9R | 0.10 | 1 | 2.2 × 10−4 | −1.4 (0.9) | - |
| Missense | | | | | | | | | |
| 48813201 | Exon 9 | C | T | V263M | 0.42 | 1 | 2.2 × 10−4 | 1.5 (1.8) | + |
| 48813206 | Exon 9 | G | T | T261N | 0.64 | 1 | 2.2 × 10−4 | −4.4 (9.6) | - |
| 48813307 | Exon 7 | C | T | R159Qb | 0.99 | 1 | 2.2 × 10−4 | −1.4 (301.0) | - |
| 48831700 | Exon 5 | G | T | L113Mc | 0.52 | 1 | 2.3 × 10−4 | −1.4 (2.2) | - |
| 48831714 | Exon 5 | C | T | R108H | 0.25 | 1 | 2.4 × 10−4 | 2.8 (2.5) | + |
| 48831732 | Exon 5 | G | A | T102I | 0.21 | 1 | 2.4 × 10−4 | −2.5 (2.0) | - |
| 48831756 | Exon 5 | G | C | P94Rc | 0.94 | 1 | 2.3 × 10−4 | 12.6 (169.0) | + |
| 48832675 | Exon 4 | C | G | A90P | 0.61 | 1 | 2.2 × 10−4 | 0.7 (1.5) | + |
| 48832699 | Exon 4 | G | A | R82C | 0.53 | 1 | 2.2 × 10−4 | −1.4 (2.3) | - |
| 48834281a | Exon 3 | C | T | V69M | 0.08 | 3 | 6.7 × 10−4 | −1.4 (0.8) | --- |
| 48834301a | Exon 3 | T | C | D62Gc | 0.20 | 6 | 0.0013 | −0.7 (0.6) | -+- |
| 48834320 | Exon 3 | C | T | A56T | 0.49 | 80 | 0.018 | 0.1 (0.2) | ++-+- |
| 48835841a | Exon 2 | T | C | N31D | 0.08 | 1119 | 0.25 | 0.1 (0.06) | ++++- |
Deleterious variants: LOF or GOF + splice donor or acceptor sites AND deleterious missense sites (SIFT=deleterious AND PolyPhen=probably_damaging). Missense: LOF or GOF + splice donor or acceptor sites AND all missense variants. MAC, minor allele count; Dir, direction of the variant effect in the cohorts where the variant was seen. β gives the estimated effect size for the rare or uncommon allele at each variant. Consequence was predicted by VEP with LOFTEE plugin for the canonical full-length transcript. Individual results were corrected using the genomic control parameters from the whole-exome scan within each cohort. n=2239 total samples.
a: Variants that were present in the post-QC Illumina exome-wide array scan.
b: This missense variant was predicted to be present in a secondary transcript (ENST00000595764) only.
c: These just missed inclusion in the Deleterious set and were predicted to be deleterious (SIFT) and possibly_damaging (PolyPhen).
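As a rough consistency check on Table 3, the MAF column can be approximated from the MAC column. The following is a minimal sketch, assuming a diploid autosomal locus and complete genotyping of all 2239 participants; the function name `maf_from_mac` is illustrative and not part of the study pipeline. Sites with missing genotypes have a smaller allele denominator, so the tabulated MAF can differ slightly from this approximation.

```python
# Approximate minor allele frequency (MAF) from minor allele count (MAC),
# assuming a diploid autosomal locus with complete genotyping (an assumption).
N_SAMPLES = 2239  # total samples reported for Table 3


def maf_from_mac(mac: int, n_samples: int = N_SAMPLES) -> float:
    """Return MAC divided by the number of alleles (two per participant)."""
    return mac / (2 * n_samples)


singleton_maf = maf_from_mac(1)   # a variant observed once
r130w_maf = maf_from_mac(234)     # R130W, MAC=234 in Table 3
print(f"{singleton_maf:.1e} {r130w_maf:.3f}")
```

Under these assumptions, a singleton and R130W land close to the tabulated values of 2.2 × 10−4 and 0.052, respectively.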
Figure 3.

Clinical consequences and protein modeling of the loss-of-function p.A249CfsTer55 frameshift variant in HSD17B14. (A) For the Joslin T1D cohort, the distribution of maximum follow-up time since T1D diagnosis terminating in incident ESKD (green color), or censored at the last clinical visit with serum creatinine measurement for eGFR estimation (blue color). The time to ESKD for the participant carrying the frameshift (diagnosed with T1D at age 2 years) is shown with the red dotted line and has the maximum survival time since diagnosis in the entire cohort (65.7 years), in both incident ESKD and censored subgroups at last follow-up time. (B) Schematic diagram of the change in the protein C-terminal tail as a result of the frameshift; blue beads: wild-type sequence; red beads: mutated sequence. The first blue residue is Ala249 which mutated to Cys249, with other key residues indicated. The entire 22 amino acid (aa) tail mutated to 54 aa after the frameshift. Q9BPX1 is the UniProt accession number for the wild-type protein. (C) Homotetrameric organization of wild-type HSD17B14. Each subunit is colored separately. The core of each subunit is shown in molecular surface representation whereas the C-terminal fragments 249–270 are shown in ball-and-stick representation. C-terminal fragments of only two subunits are visible on this panel (green and purple) whereas the C-terminal fragments of the other two subunits are symmetrically located on the opposite nonvisible side. (D) Interactions between the C-terminal fragment 249–270 and the tetramer core, colored by the electrostatic potential of its surface: blue, positive; red, negative; white, hydrophobic. The C-terminal fragment is shown in ball-and-stick representation with white carbon atoms and sulfur in yellow. Key residues mentioned in the text are labeled and those from the adjacent subunit are labeled with B in parenthesis. (E) The entrance to the active site in a subunit of the tetramer. 
The NAD cofactor molecule, shown in a magenta ball-and-stick model, can be seen through the entrance.
Two splice site substitution mutations were found, one in the FinnDiane cohort (g.48835804C>G; exon 2, +1 donor position) and one in INSERM (g.48835844C>T; exon 2, −1 acceptor position). Neither of the two splice site mutations resulted in as extreme a phenotype as in the Joslin case (Supplemental Figure 4 and Supplemental Table 5), nor had they been described previously in gnomAD (Supplemental Table 6). Of all variants, R130W (rs35299026) and D62Y (rs139987974) were individually nominally significant (P<0.05), both were consistently protective against ESKD progression in the cohorts and predicted to be deleterious, and the former was also a common SNP (MAF=0.052). In the Joslin cohort, under a Weibull model of increasing hazard with age and all other factors held constant, each minor allele of the R130W variant was estimated to increase the median time to ESKD by 20%. There were no differences between sexes in the association effect size (Table 4). Stratification of the variants by MAF and equal weighting in the Burden test showed that the rare variants (MAF<0.01) were associated with protection overall (β=−0.65, P=0.0033), whereas the only protective common variant was rs35299026 (R130W). Further comparison of the risk set results is available in the Supplemental Material and Supplemental Table 7.
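To illustrate how a log hazard ratio translates into a change in median survival time under a Weibull model, the sketch below uses the proportional hazards parameterization S(t|x) = exp(−λ t^p e^{βx}), for which the median time scales by exp(−β/p) per unit increase in x. The Joslin-specific fitted parameters are not reported in this section, so the inputs are assumptions: the log(HR) is borrowed from the meta-analysis estimate in Table 3, and the shape value is hypothetical, chosen only to show that a shape greater than 1 (rising hazard) can yield a change of about 20%.

```python
import math


def median_time_ratio(log_hr: float, shape: float) -> float:
    """Ratio of median survival times per one-unit covariate increase under a
    Weibull proportional hazards model with
    S(t|x) = exp(-lam * t**shape * exp(log_hr * x))."""
    return math.exp(-log_hr / shape)


# Hypothetical inputs: log(HR) borrowed from the meta-analysis in Table 3;
# the shape value is invented for illustration (shape > 1 means rising hazard).
ratio = median_time_ratio(log_hr=-0.3, shape=1.65)
print(f"median time ratio per minor allele: {ratio:.2f}")
```

A different shape or cohort-specific log(HR) would change the ratio, so this should be read as an interpretation aid rather than a reproduction of the reported estimate.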
Table 4.
Survival time to ESKD association results for variants in HSD17B14 by variant risk set and sex, genotypes from resequencing
| Variant Risk Set | Sex Stratum | N | Variants Tested | Burden β (SEM) | Burden P Value | Sex Interaction P Value |
|---|---|---|---|---|---|---|
| LOF/GOF/Splice | Both | 2239 | 3 | −0.75 (0.49) | 0.13 | Not tested |
| Deleterious | Both | 2239 | 9 | −0.032 (0.009) | 0.00036 | |
| | Female | 1003 | 6 | −0.032 (0.015) | 0.035 | |
| | Male | 1236 | 7 | −0.034 (0.012) | 0.0040 | 0.90 |
| Missense | Both | 2239 | 22 | −0.018 (0.006) | 0.0055 | |
| | Female | 1003 | 14 | −0.021 (0.010) | 0.029 | |
| | Male | 1236 | 16 | −0.016 (0.009) | 0.065 | 0.72 |
| GWAS variants | Both | 2239 | 6 | −0.044 (0.010) | 1.4 × 10−5 | |
| | Female | 1003 | 5 | −0.044 (0.015) | 0.0043 | |
| | Male | 1236 | 6 | −0.047 (0.014) | 0.00081 | 0.87 |
| Common, MAF>0.01a | Both | 2239 | 3 | 0.024 (0.053) | 0.65 | |
| | Female | 1003 | 3 | 0.011 (0.080) | 0.90 | |
| | Male | 1236 | 3 | 0.004 (0.072) | 0.96 | 0.95 |
| Rare, MAF<0.01a | Both | 2239 | 19 | −0.65 (0.22) | 0.0033 | |
| | Female | 1003 | 19 | −0.62 (0.36) | 0.086 | |
| | Male | 1236 | 19 | −0.69 (0.29) | 0.016 | 0.88 |
The variant risk sets were: LOF/GOF/Splice: loss or gain of function or in a splice donor/acceptor site; Deleterious: LOF/GOF variants plus those predicted by PolyPhen and SIFT to be deleterious; Missense: Deleterious variants plus any other nonsynonymous variants; GWAS variants: any Missense variant present on the original array after QC. Results were corrected using the genomic control parameters from the whole-exome scan within each cohort. Burden P value was derived from a score test. Sample sizes (N), variants seen, and distribution of genotype missing data differ from the sample sets used for the whole-exome gene-based scan and are not directly comparable.
a: For MAF strata, the variants were equally weighted; for all other risk sets, variants were weighted on the basis of their estimated MAF using a β(1,25) function.
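The MAF-based weighting mentioned in the Table 4 footnote can be sketched as follows. This assumes the β(1,25) function is the Beta(1, 25) probability density evaluated at each variant's MAF, which is a common convention in gene-based rare variant tests; the exact normalization used in the study software is not stated here. The helper names and the example genotype vector are hypothetical.

```python
def beta_1_25_weight(maf: float) -> float:
    """Density of a Beta(1, 25) distribution evaluated at the MAF.

    For Beta(1, b) the density is b * (1 - x) ** (b - 1).
    """
    return 25.0 * (1.0 - maf) ** 24


def burden_score(genotypes, mafs):
    """Weighted sum of minor allele counts (0, 1, or 2) across variants."""
    return sum(beta_1_25_weight(m) * g for g, m in zip(genotypes, mafs))


# MAFs taken from Table 3: a singleton, R130W, and N31D.
mafs = [2.2e-4, 0.052, 0.25]
weights = [beta_1_25_weight(m) for m in mafs]
# Hypothetical participant carrying one copy of the singleton and one of N31D.
score = burden_score([1, 0, 1], mafs)
print([round(w, 2) for w in weights], round(score, 2))
```

The weights fall steeply as MAF rises, so singletons dominate the collapsed score while a common variant such as N31D contributes almost nothing. This is the sense in which the weighting encodes an expectation of larger effects for rarer alleles.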
Association of HSD17B14 Common Variant R130W in CKDGen GWAS Results
Because rs35299026 (R130W) was a common SNP, predicted to be deleterious, with the most significant single variant association in our data (log[HR] β[SEM] −0.3[0.1], P=0.0077; Table 3), we reviewed the results for this SNP in publicly available CKDGen results for European ancestry. We found weak nominal association of this SNP with quantitative blood urea nitrogen (β[SEM]= −0.005 [0.0023], P=0.03) in 211K samples but no association with CKD as a binary trait (n=388K), nor eGFR (n=484K) or UACR (n=510K) as continuous traits (Supplemental Table 8).
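The nominal P value quoted for blood urea nitrogen can be approximately recovered from the effect estimate and its standard error. The sketch below assumes a two-sided Wald test with a standard normal reference distribution, which may not be exactly how the CKDGen summary statistics were computed; the function name `two_sided_p` is illustrative. The same calculation is not expected to reproduce the P values in Table 3 exactly, because those were corrected using genomic control parameters.

```python
import math


def two_sided_p(beta: float, sem: float) -> float:
    """Two-sided P value for a Wald z statistic (beta / SEM) under a standard normal."""
    z = abs(beta / sem)
    return math.erfc(z / math.sqrt(2.0))


# Blood urea nitrogen estimate for rs35299026 quoted above.
p_bun = two_sided_p(-0.005, 0.0023)
print(f"P = {p_bun:.2f}")
```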
Normal Human Kidney Cell Types and Other Tissues:
We assessed HSD17B14 protein expression in sections of nephrectomized normal human kidney by immunohistochemical staining. HSD17B14 was expressed in proximal tubules (Figure 4A and 4B, brown staining), with little evidence of staining in glomeruli. Cell type expression estimates from combined snRNA-seq data from adult nephrectomies of three healthy patients and three patients with early DKD corroborated the higher expression within proximal tubule cells (Figure 5A and 5B, clusters PCT-1/2/3), with a lower fraction of cells expressing the HSD17B14 gene, and a lower average per cell expression, in other kidney cell types. A second snRNA-seq dataset for a single healthy adult confirmed the higher percentage of cells and level of expression in proximal tubule segments (Supplemental Figure 5). Publicly available bulk RNA-seq datasets confirmed relatively high HSD17B14 expression in kidney across multiple datasets. Kidney HSD17B14 gene expression was ranked first of 37 tissues in the Human Protein Atlas (mean transcripts/million=53), third of 36 tissues in FANTOM5 (mean tags/million=43), and tenth of 53 tissues in GTEx V7 (median transcripts/million=35) (Supplemental Material and Supplemental Figure 6).45
Figure 4.

HSD17B14 expression in nephrectomized normal human kidney tissue, human DKD biopsy specimens, and mouse models of kidney injury. (A and B) HSD17B14 expression was assessed by immunohistochemical staining. HSD17B14 (brown staining) is primarily expressed in the proximal tubules of the normal human kidney sections from nephrectomized kidneys. (C and D) Differential expression of HSD17B14 (arrows) and KIM-1 (arrowheads) was examined by immunofluorescence analysis in normal kidney tissue (C) and DKD biopsy tissue (D). HSD17B14, red; KIM-1, green; DAPI, blue. White scale bar, 20 μm. (E–G) Immunofluorescence staining of HSD17B14 in sham (E) compared with post–ischemia reperfusion injury (IRI) model kidney at day 2 post-ischemia (F) and 14 days after induction of aristolochic acid nephropathy (AAN) (G) to establish expression levels in tubules with the kidney tissue markers. HSD17B14, red; Endomucin (EMCN, endothelial cells), green; KIM-1 (injured tubule), white; DAPI, blue. (H) Relative mRNA levels by RT-PCR in the IRI model at sequential time points. HSD17B14 transcript expression by RT-PCR normalized to normal kidney, in day 2 (d2) and day 21 (d21) post-ischemia in two replicates each. Bars, 20 μm.
Figure 5.

HSD17B14 gene expression in normal and diseased human kidney. (A and B) Cell type clusters from three control and three early diabetic nephropathy renal cortex nephrectomies, visualized using the uniform manifold approximation and projection (UMAP) algorithm of single nucleus RNA-seq count data. (A) The projected clusters and their cell types: Pod, podocyte; PCT-1/2/3, three proximal convoluted tubule clusters; CFH-1/2, two clusters expressing novel complement factor H (CFH) marker pattern; TAL-1/2, two thick ascending limb clusters; ATL, ascending thin limb; DCT, distal convoluted tubule; CNT, connecting tubule; PC-1/2, two principal cell clusters; ICA-1/2, two collecting duct intercalated type A clusters; ICB, intercalated type B; Mes, mesangium; SMC+P, smooth muscle cells and pericytes; Endo, endothelial; Endo(Fen), fenestrated endothelial; Endo(Lym), lymphatic endothelial; Leuk, leukocytes. (B) Percentage of cells of each cluster with nonzero expression of HSD17B14 corresponding to (A). The relative expression level in each cell is color coded from gray (no expression) to blue (high expression). (C) Log2 expression of bulk HSD17B14 mRNA measured using U133 2.0 Affymetrix arrays in tubulointerstitium of renal biopsy specimens from 17 patients with diabetic nephropathy and 31 healthy controls (data from NephroSeq 3.0). (D) Bulk RNA-seq expression of HSD17B14 in log2(transcripts per million) in proximal tubule specimens from five disease states: Con, control, i.e., nondiseased; CKD, CKD with eGFR<60; DKD, patients with type 2 diabetic kidney disease; DM, patients with type 2 diabetes without DKD; HTN, patients with hypertension. (E and F) Corresponding to samples in (D), the same expression of HSD17B14 plotted against eGFR (E) and percentage of fibrosis (F) with fitted linear relationships shown.
HSD17B14 Expression in Human DKD:
We compared normal and DKD kidney tissue by immunofluorescence for HSD17B14 and KIM-1, a marker of tubular injury and dedifferentiation. In normal kidney sections from non-DKD individuals, no KIM-1–expressing proximal tubules were evident, as expected, but HSD17B14 was highly expressed in the same cells (Figure 4C, arrows). In contrast, we found markedly reduced HSD17B14 expression in DKD, whereas KIM-1 was upregulated in proximal tubules (Figure 4D, arrowheads). Similarly, we found evidence for reduction of HSD17B14 gene expression in DKD. Using previously published Affymetrix U133 microarray data from renal biopsy specimens of 17 patients with diabetic nephropathy and 31 healthy controls,44 we found that bulk mRNA HSD17B14 expression levels in tubulointerstitial cells were significantly lower in the diabetic nephropathy group compared with controls (t test, P=0.017; Figure 5C). We recapitulated this in a second dataset (Figure 5D) discussed in the next section.
HSD17B14 Gene Expression in Multiple Human Kidney Pathologies:
We analyzed HSD17B14 gene expression in 433 microdissected human kidney tubule samples obtained from subjects with diabetes (DM) or hypertension (HTN) and subjects with DKD or hypertensive kidney disease, as measured by RNA-seq (Figure 5D).47 The median gene expression in the DKD group was lower than that in the nonoverlapping control, CKD, DM, and HTN groups from the same study, and expression differed significantly between groups (ANOVA P=0.00054). The expression of HSD17B14 was positively correlated with eGFR (cor=0.27, P=3 × 10−12) among all disease samples (Figure 5E), and negatively correlated with degree of tubulointerstitial fibrosis (cor=−0.56, P=2 × 10−16; Figure 5F). Analogous plots for glomeruli from the same study are shown in Supplemental Figure 7. A replication dataset that included samples with mixed kidney pathologies confirmed the lower HSD17B14 expression in other CKD.44 Mean expression was lower in CKD associated with hypertension, IgA nephropathy, rapidly progressing glomerulonephritis, and lupus at P<0.05 (Supplemental Figure 8).
Association of the Common Missense Variant R130W (rs35299026) with HSD17B14 Gene Expression:
Because the common missense SNP R130W was individually associated with protection against progression to ESKD in T1D (Table 3; P=0.0077, MAF=0.052), we tested whether the coding variant was also associated with HSD17B14 gene expression using a published dataset (Table 5). We found that the minor A allele (W amino acid) of the SNP, which was associated with protection against progression to ESKD in T1D, was also associated with decreased gene expression in the tubule and, with less confidence, with decreased gene expression in glomeruli.
Table 5.
Kidney phenotype associations of the common missense SNP R130W (rs35299026)
| Phenotype | Effect (SD) | P Value | Minor Allele Direction |
|---|---|---|---|
| Tubule HSD17B14 | −0.74 (0.215) | 0.00081 | Decreased gene expression |
| Glomerulus HSD17B14 | −0.54 (0.22) | 0.019 | Decreased gene expression |
| Time to T1D ESKD | −2.9 (1.1) | 0.0008 | Protective against ESKD |
The association results are coded for the minor allele (MAF=0.052), genome allele A, protein amino acid allele W.
HSD17B14 Expression in Mouse Models of Kidney Injury:
We assessed HSD17B14 expression in mouse kidneys by coimmunostaining HSD17B14 and KIM-1 in cortical and outer medullary proximal tubules, in tissue sections of sham-operated control kidneys (Figure 4E). These data were compared with day 2 of post–IRI mouse model kidneys (Figure 4F). In addition, we evaluated the protein expression on day 14 after induction of murine AAN (Figure 4G). The KIM-1–positive tubular cells, which exhibit undifferentiated characteristics including loss of apical brush border, had markedly reduced HSD17B14 staining in both post-IRI and AAN kidneys. There was a strong reduction in HSD17B14 mRNA levels post ischemia injury at day 2 compared with normal mouse control, and this decline persisted over 21 days. The mRNA analysis confirmed the protein expression reduction in proximal tubule after IRI at days 2 and 21 in two mice (Figure 4H).
Discussion
To the best of our knowledge, this is the first genome-wide screen of genes that are associated with longitudinal progressive loss of kidney function to ESKD in study participants with advanced DKD. Using data collected under the JDRF DNCRI, we identified hydroxysteroid 17-β dehydrogenase 14 (HSD17B14) as a novel gene associated with protection against onset of ESKD in individuals with type 1 diabetes. The gene encodes an enzyme that is known to convert estradiol (E2) to estrone (E1) and is a member of an enzyme family that controls the relative balance of estrogen and androgen substrates, with secondary functions such as fatty acid metabolism. The overall effect of the minor and rare genetic coding variants appeared to protect carriers against more rapid terminal loss of renal function, as measured by delayed time to onset of ESKD. Because the genetic analysis focused on protein coding variants, we have greater initial confidence that we have identified an important, or sole, disease-associated gene, unlike reports from GWAS studies. In humans, the gene is highly expressed in proximal tubule cells, with lower levels of expression in other kidney cell types, consistent with previously reported patterns of protein expression for this enzyme, namely strong expression in kidney and particularly in the epithelial cells of proximal and distal tubule sections, whereas Bowman's capsule and glomerular epithelium remained relatively unstained.48 Paradoxical to the presumed disruptive effect of the rare gene variants, we found that the gene was downregulated in advanced DKD, with a suggestion from other data that it is also downregulated in CKD states associated with other kidney pathologies. By comparing normal and DKD kidney tissue by immunofluorescence, we found markedly decreased proximal tubular expression of HSD17B14 in DKD specimens, suggesting that the expression of the gene is positively correlated with protein expression. 
Cell-specific expression of HSD17B14 and KIM-1 (a marker of kidney injury49) was inversely correlated such that tubule cells expressing HSD17B14 lacked KIM-1 reactivity, whereas injured positive KIM-1 cells showed attenuated HSD17B14 expression, an observation which was consistent with our mouse model studies. Hence the downregulation of HSD17B14 appears to be tightly associated with injury and dedifferentiation of the proximal tubules in both the mouse kidney disease model and human kidney diseases. Finally, our findings are strikingly similar to another member of this gene family, HSD17B13, which encodes an analogous enzyme and was very recently discovered to play a role in liver metabolism, in which a loss-of-function gene variant protects against chronic liver disease and fibrosis, yet whose active substrate in liver function also remains unknown.50
Currently, we do not know the mechanism for the association of this gene with loss of kidney function in DKD, but prima facie, our results pose an intriguing paradox. The gene and encoded protein were downregulated in the proximal tubules in advanced DKD, yet the genetic variants that protected against loss of further kidney function are presumed to disrupt wild-type protein function. This relationship between clinical outcome and human molecular expression was corroborated by the observation that the minor allele of the common coding missense SNP in the gene (rs35299026) was associated with less rapid loss of kidney function and protection against ESKD onset, but was also associated with decreased expression of HSD17B14, most significantly in the tubule cells. There are, however, important caveats to this line of reasoning. The human kidney tissues we studied may not have carried protective variants. Also, directly equating loss of gene activity through disruption of protein function with overall decrease in gene expression may be simplistic and overlook the fact that the rare protein variants were almost all heterozygous, leaving a functional gene haplocopy, whereas downregulation of gene expression affects both alleles. Additionally, the loss of gene expression may be wholly or partly a consequence of the advanced diabetic environmental insult to the kidney, increasing the fraction of cells undergoing fibrosis and profoundly altering individual cellular programming. Preserved function in the nonfibrotic cells may be the determinant of risk hidden among the gross molecular and structural changes.47
Our resequencing to develop a broader catalog of variants in the HSD17B14 gene locus identified three variants with putative loss or gain of function or within the canonical intron-exon flanking splice site boundary. The only frameshift variant (p.A249CfsTer55) was predicted to result in an elongated mutated C-terminal tail, and the carrier was found to have had the longest duration since diabetes diagnosis in the cohort, suggesting the possibility that the participant acquired additional genetic protection from progression to ESKD. However, the splice variants did not appear to result in unusual protection or risk for their carriers. Although overall the HSD17B14 gene variants carried by our cohort participants were aggregately associated with protection, we were unable to definitively specify the direction of effect for individual rare variants, because of the very small number of carriers and uncertainty around their estimated effect size. Association tests using our predicted Deleterious risk set resulted in a more significant protective statistical result than the most permissive risk set of any missense or splice variant, suggesting that this Deleterious set did capture more of the larger effect size variants. Remarkably, the initial set of variants on the array resulted in a comparable association and protection in the sequencing data to the Deleterious set. This fortuitous occurrence of variants of relatively strong joint effect on the original array no doubt aided the initial exome-wide discovery of the gene. We did not detect any sex-specific differences in the gene effects although direct comparison between the sex strata was imperfect because the sets of variants seen differed by sex because of the distribution of rare alleles.
HSD17B14 is the last member of the hydroxysteroid 17β dehydrogenase enzyme family which, with varying affinities and kinetics, interconvert 17-keto and 17β-hydroxy steroids, thus catalyzing conversion between E1 and E2, androstenedione (A) and testosterone (T), and 5α-androstanedione and dihydrotestosterone. HSD17B14 preferentially oxidizes E2 to E1, and T to A, although conversion rates are low compared with other family members, suggesting the possibility of other functional roles.48,51 Other HSD17B members are known to be involved in fatty acid metabolism, although in different compartments and at different biochemical pathway steps.52–54 By analogy, it is possible that HSD17B14 also has multiple sex steroid and fatty acid metabolic functions, but the fatty acid substrate(s) remains unknown. Of particular interest is the fact that the HSD superfamily is a key target for drug development of steroid hormonal–stimulated diseases. Inhibitors of HSD17B14 are under active development.55,56
The HSD17B14 gene locus has not previously been identified in any genome-wide scan of kidney-related traits, which now approach sample sizes of 1 million individuals for some traits in CKDGen,57,58 nor in the largest T1DKD studies.10,11 These previous studies focused on common variants whereas our exome-centric tests included rare variants. Additionally, the analyses of eGFR and other traits in CKDGen studies are largely measuring natural population variation, and the CKD binary trait case-control analysis was more likely to identify genetic loci specific for generalized early-stage renal impairment. Despite the larger sample sizes and exhaustive phenotype testing in the very recent T1D genetic study paper from this consortium,11 the analyses employed more pragmatic but less stringent phenotype/study group definitions and would have been underpowered to identify similar disease stage genes. We caution that to further replicate or extend these results, simple permissive ascertainment of cases and controls, even in participants with DKD, may not be an appropriate strategy. In the Joslin and FinnDiane cohorts with proteinuria, the cumulative risk of ESKD after 15 years was about 50%,2 suggesting that the misclassification rate of controls versus cases could be extremely high in unselected proteinuria versus ESKD comparisons.
Our study has multiple strengths. We defined a precise outcome phenotype to be tested in multiple incidence cohorts ascertained under a strict set of criteria and restricted to a particular stage in the natural history of kidney function loss in T1D. We utilized the intensive follow-up data in diabetes clinics to identify the time-to-event from cohort baseline. The samples were uniformly genotyped, sequenced, and underwent QC at centralized laboratory and analysis centers. We performed cellular studies using immunohistochemistry to verify changes in expression of the encoded protein in human specimens and tested the gene in mouse models of kidney injury. However, there are also limitations to our study. Although we demonstrated consistency of effects and replication of our top gene, our sample size was limited and there are undoubtedly additional genes associated with this phenotype still to be discovered. We used a standard weighting function for the aggregated rare and minor allele variant analysis which places disproportionately more weight on rare variants in the collapsed variant statistics to reflect an expectation of larger effect on the phenotype with lower frequency alleles. This weighting function was a necessary prespecified assumption and a different optimized function closer to the true effect size distribution might yield more significant association results. Our study groups were ascertained on advanced type 1 DKD so the generalizability of these findings to type 2 DKD and other CKD pathologies remains an open question, although the reduction of expression in other CKD pathologies provides preliminary data that this gene may have a role in kidney function loss in these other pathologies also. The genetic variants we identified were not confirmed by RNA-seq in the carrier participants and therefore their functional role in these individuals remains predicted rather than proven. 
Our functional experiments on HSD17B14 suggested a possible role for the encoded gene, but more extensive molecular and cellular work is needed to establish the exact mechanism and role of the gene during kidney function loss.
Disclosures
J.V. Bonventre reports being cofounder of Goldfinch Bio and is coinventor with T. Ichimura on KIM-1 patents assigned to Partners Healthcare; being a consultant for Aldeyra, Cerespir, Merck, Mitobridge, and PTC; ownership interest in Amazon; ownership equity in Avexxin, Dicerna, DXNow, Goldfinch, Goldilocks, Innoviva, MediBeacon, Medssenger, Pacific Biosciences, Rubius, Sensor-Kinesis, Sentien, Theravance, and Verinano; consultancy agreements with Aditum, Citrine, Janssen, MediBeacon, Praxis, and Serepta; scientific advisor or membership as Editor of Seminars in Nephrology, the Advisory Board of the Northwest Kidney Center, and Angion; and was supported by National Institutes of Health (NIH)-National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) grants: R37 DK39773, R01 DK072381, and NCATS/NIDDK UG TR002155. C. Forsblom reports Scientific Advisor or Membership as a member of the Advisory Board for Acta Diabetologica, and the Editorial Board of the Journal of Diabetes Research; and Other Interests/Relationships as Secretary of the Board of the Finnish Diabetes Research Society. P. Groop reports consultancy agreements with Bayer and Boehringer Ingelheim; research funding from Eli Lilly, Roche (>5 years ago); honoraria from lecture fees from Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, Eli Lilly, Genzyme, Medscape, MSD, Mundipharma, Novartis, Novo Nordisk, PeerVoice, Sanofi, and SCIARC; scientific advisor or membership as Member of Advisory Boards for Astellas, AbbVie, AstraZeneca, Bayer, Boehringer Ingelheim, Eli Lilly, Janssen, Medscape, MSD, Mundipharma, Nestlé, Novartis, Novo Nordisk, and Sanofi; and other interests/relationships as Chairman of the Board of the Signe and Ane Gyllenberg Foundation, Chairman of the Board of the Finnish Kidney Disease Registry, and Member of the Board of the European Association for the Study of Diabetes. R. Korstanje reports Scientific Advisor or Membership with the Alport Syndrome Foundation. A. Krolewski reports current employment with Joslin Diabetes Center. K. O’Neil is employed by Joslin Diabetes Center. S. Rich reports Scientific Advisor or Membership with the American Diabetes Association (Diabetes Care Associate Editor), and the National Human Genome Research Institute National Advisory Board. P. Rossing reports Research Funding from AstraZeneca and Novo Nordisk; Honoraria from AstraZeneca, Boehringer Ingelheim, and Novo Nordisk, all honoraria to institution; and Scientific Advisor or Membership with Astellas, Astra Zeneca, Bayer, Gilead, MSD, Mundipharma, and Novo Nordisk, all honoraria to institution. E. Satake reports employment with Joslin Diabetes Center; and Research Funding from Novo Nordisk and the Sunstar Foundation. I. Shabalin reports employment with and ownership interest in IDEAYA Biosciences. K. Susztak reports Consultancy Agreements with Astra Zeneca, Bayer, Jnana, Maze, and Pfizer; Ownership Interest in Jnana; Research Funding from Bayer, Boehringer Ingelheim, Calico, Gilead, GSK, Lilly, Maze, Merck, Novartis, Novo Nordisk, and Regeneron; Honoraria from AstraZeneca, Bayer, Jnana, and Maze; and Scientific Advisor or Membership via Editorial board for Kidney International, the Journal of Clinical Investigation, Cell Metabolism, EBioMedicine, the Journal of American Society of Nephrology, and Jnana. S. Hadjadj reports consultancy agreements with Lilly, Boehringer, and Abbott; research funding from Abbott, Novo Nordisk, Novartis, and DinnoSanté; honoraria from Lilly, Boehringer, Abbott, Servier, Sanofi, Novo Nordisk, Bayer, AstraZeneca, and Mundi Pharma; and scientific advisor or membership with Valbiotis. A. Galecki reports scientific advisor or membership with Open Journal of Applied Statistics and was supported by National Institute of Aging Claude D. Pepper Older Americans Independence Center grant AG08808. All remaining authors have nothing to disclose.
Funding
This study was supported by a JDRF grant (3-SRA-2018-529-M-B) to J.C. Mychaleckyj, J.V. Bonventre, R. Klein, and A.S. Krolewski; a JDRF grant (17-2013-8) for DNCRI subproject “Search for genes determining time to onset of ESRD in type 1 diabetes patients with proteinuria” to S. Hadjadj, P. Rossing, P.-H. Groop, and A.S. Krolewski; JDRF grants (6-2010-550, 1-2008-1018) to A.S. Krolewski; National Institutes of Health grant DK-041526 to A.S. Krolewski; and the Joslin Diabetes Research Center grant P30 DK036836. The FinnDiane study was funded by JDRF (17-2013-7), the Novo Nordisk Foundation (OC0013659), the Academy of Finland (275614, 299200, and 316664), Folkhälsan Research Foundation (2019), the Wilhelm and Else Stockmann Foundation (2018), and Helsinki University Hospital Research Funds (TYH2018207) to E. Valo, N. Sandholm, C. Forsblom, and P.-H. Groop. The Pittsburgh Epidemiology of Diabetes Complications study was supported by NIDDK grant DK34818. The Wisconsin Epidemiologic Study of Diabetic Retinopathy study was funded by NEI EY016379.
Acknowledgments
M. Pragnell, S.S. Rich, J.V. Bonventre, S. Hadjadj, P. Rossing, P.-H. Groop, J.H. Warram, and A.S. Krolewski were responsible for study conception. J.C. Mychaleckyj, A.S. Krolewski, and A. Galecki designed the study. J.C. Mychaleckyj, E. Valo, T. Ichimura, T.S. Ahluwalia, C. Dina, R.G. Miller, B. Gyorgy, J. Cao, I.G. Shabalin, S. Onengut-Gumuscu, E. Satake, A.M. Smiles, J.K. Haukka, D.-A. Tregouet, A.D. Paterson, C. Forsblom, H.A. Keenan, K. O’Neil, M.G. Pezzolesi, N. Sandholm, T. Costacou, T.J. Orchard, G.L. King, R. Klein, B.E. Klein, K. Susztak, R. Korstanje, and J.H. Warram performed data acquisition, analysis, or interpretation. J.C. Mychaleckyj and A.S. Krolewski wrote the manuscript. J.V. Bonventre, R. Korstanje, and T. Ichimura reviewed and edited the manuscript. All authors approved the final version of the manuscript.
The Pittsburgh Epidemiology of Diabetes Complications (EDC) study gratefully acknowledges the financial support of the Rossi Memorial Fund.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
See related editorial, “Coding variants in susceptibility to diabetic kidney disease,” on pages 2397–2399.
Data Sharing Statement
The summary statistics for the whole-exome gene-based scan are available in an Open Science Framework (https://osf.io) project with doi 10.17605/OSF.IO/9NEXD, and at the AMP-Type 1 Diabetes Knowledge Portal (http://www.type1diabetesgenetics.org/), under Renal, JDRF Diabetic Nephropathy Collaborative Research Initiative datasets. Genotypic data by individual cannot be shared because of study consent restrictions, and due to European Union and national regulations regarding protection of individual genetic data.
Supplemental Material
This article contains the following supplemental material online at http://jasn.asnjournals.org/lookup/suppl/doi:10.1681/ASN.2020101457/-/DCSupplemental.
Supplemental Methods
Supplemental Results
Supplemental Table 1. Top 10 genes for association with survival against ESKD in five type 1 diabetes cohorts with advanced DKD
Supplemental Table 2. Results of array gene-based testing of HSD17B14 by cohort and case-control group including SKAT results
Supplemental Table 3. Top 10 genes after meta-analysis of the type 1 diabetes cohort and case-control results
Supplemental Table 4. Technical sequencing variant quality statistics of predicted LOF/GOF/splice site variants
Supplemental Table 5. Characteristics of carriers of predicted LOF/GOF/splice site variants
Supplemental Table 6. HSD17B14 coding and splice variants in gnomAD
Supplemental Table 7. Sensitivity of association results for survival time to ESKD by HSD17B14 variant risk set, genotypes from resequencing
Supplemental Table 8. CKDGen Consortium results at rs35299026 (R130W) from publicly available GWAS datasets
Supplemental Figure 1. Amplicon design in the HSD17B14 genomic region for the resequencing project.
Supplemental Figure 2. Quantile-quantile (QQ) plots for the 5-cohort discovery, gene-based, whole-exome scan.
Supplemental Figure 3. Tetrameric organization of the wild-type HSD17B14 protein.
Supplemental Figure 4. Distribution of time duration since diagnosis of T1D in cohorts carrying LOF variants.
Supplemental Figure 5. Single nucleus RNA-seq results for HSD17B14 gene expression in normal, undiseased kidney tissue from a single adult nephrectomy.
Supplemental Figure 6. Bulk non-diseased tissue expression of HSD17B14 in multiple public human data sets.
Supplemental Figure 7. Variation in HSD17B14 expression in kidney tissue from patients in 4 human disease states, and undiseased controls, measured by RNA-seq.
Supplemental Figure 8. HSD17B14 comparative gene expression in multiple chronic kidney disease pathologies.