Journal of Pharmaceutical Analysis. 2025 Mar 14;15(8):101265. doi: 10.1016/j.jpha.2025.101265

Prioritization of potential drug targets for diabetic kidney disease using integrative omics data mining and causal inference

Junyu Zhang a,1, Jie Peng b,1, Chaolun Yu a, Yu Ning c, Wenhui Lin d, Mingxing Ni e, Qiang Xie f,⁎⁎⁎⁎, Chuan Yang b,⁎⁎⁎, Huiying Liang e,⁎⁎, Miao Lin a,
PMCID: PMC12446642  PMID: 40979545

Abstract

Diabetic kidney disease (DKD) is increasingly prevalent worldwide, yet it lacks effective therapeutic targets to halt or reverse its progression. Therapeutic targets supported by causal genetic evidence are more likely to succeed in randomized clinical trials. In this study, we integrated large-scale plasma proteomics, genetic-driven causal inference, and experimental validation to identify prioritized targets for DKD using the UK Biobank (UKB) and FinnGen cohorts. Among 2,844 diabetic patients (528 with DKD), we identified 37 targets significantly associated with incident DKD, supported by both observational and causal evidence. Of these, 22% (8/37) are currently under investigation for DKD or other diseases. Our prospective study confirmed that higher levels of three prioritized targets, insulin-like growth factor binding protein 4 (IGFBP4), family with sequence similarity 3 member C (FAM3C), and prostaglandin D2 synthase (PTGDS), were associated with a 4.35-, 3.51-, and 3.57-fold increased likelihood of developing DKD, respectively. In addition, population-level protein-altering variant (PAV) analysis and in vitro experiments cross-validated FAM3C and IGFBP4 as potential new target candidates for DKD, acting through the classic NLR family pyrin domain containing 3 (NLRP3)-caspase-1-gasdermin D (GSDMD) pyroptotic axis. Our results suggest that integrating omics data mining with causal inference may be a promising strategy for prioritizing therapeutic targets.

Keywords: Diabetic kidney disease, Proteomics, Causal inference, Drug targets

Highlights

  • We prioritized 37 targets for DKD by omics mining and genetic-driven causal inference.

  • 22% (8/37) of prioritized targets are under investigation for DKD or other diseases.

  • FAM3C and IGFBP4 were validated as new targets for DKD by in vivo and in vitro experiments.

  • Genetic-driven causal inference is a promising strategy for prioritizing drug targets.

1. Introduction

Diabetic kidney disease (DKD or diabetic nephropathy) is a major complication of diabetes and remains the leading cause of end-stage renal disease [1,2], currently affecting approximately 700 million individuals globally [3]. Compared to healthy or diabetic individuals, DKD patients face a threefold increased risk of all-cause and cardiovascular mortality, resulting in a 16-year drop in life expectancy. The management of DKD primarily relies on stringent lifestyle management and pharmacological treatments aimed at controlling blood glucose, blood pressure, and blood lipids. Recently, several novel drugs have been approved. Sodium-glucose cotransporter 2 (SGLT2) inhibitors [4] (e.g., dapagliflozin [5] and empagliflozin [6]) primarily function by reducing glucose reabsorption and sodium levels, while glucagon-like peptide-1 (GLP-1) receptor agonists (e.g., liraglutide [7] and semaglutide [8]) affect glucose regulation and exhibit anti-inflammatory effects [9]. These drugs help to delay the progression of kidney dysfunction and prevent cardiovascular complications [10]. Recent studies identified that kidney injury molecule 1 (KIM-1) showed great potential for DKD treatment [11]. However, DKD involves complex pathogenesis characterized by inflammation, fibrosis, apoptosis, and oxidative stress, and currently lacks targeted interventions [12]. Identifying novel molecular targets is crucial for mitigating DKD incidence and potentially halting or reversing its progression.

Genetics, proteomics, and causal inference studies play a crucial role in identifying new drug targets [13]. Increasing evidence suggests that drug targets supported by causal genetic evidence have a higher likelihood of success in randomized clinical trials [14]. Mendelian randomization (MR) is a causal inference method that is becoming increasingly important in observational epidemiology. Because genetic variants are randomly allocated at conception, genetics and MR studies can use them to investigate causal relationships between specific phenotypes, proteins, and diseases, thereby reducing the confounding inherent in traditional observational studies [15]. Integrating findings from proteomic and genetic analyses provides a more comprehensive understanding of the pathogenesis of DKD and helps to identify new molecular targets with causal associations.

In this study, we integrated large-scale plasma proteomics and genetic-driven causal inference to prioritize potential targets for DKD using data from the UK Biobank (UKB) and FinnGen cohorts (Fig. 1A). A community-based prospective cohort was applied to validate the association between prioritized targets and DKD progression. Finally, population-level protein-truncating variants (PTVs) analysis mimicking the effects of gene knockout, in vivo immunofluorescence staining, and in vitro intervention experiments were employed to cross-validate potential drug targets, providing strong evidence for promising new therapeutic opportunities in DKD.

Fig. 1.

Adjusted hazard ratios (HRs) of 2,923 proteins associated with diabetic kidney disease (DKD) in the UK Biobank (UKB). (A) Study design integrating plasma proteomics and genetics to identify drug targets. This figure was created with BioRender.com. (B) Flowchart of participant inclusion and exclusion criteria for DKD in the baseline diabetic adults. (C) Scatter plot of adjusted HRs of 2,923 proteins associated with DKD in diabetic participants, analyzed through Cox regression. P-values were adjusted using Bonferroni correction for each protein. Proteins with an adjusted P-value < 0.05 were considered significant risk proteins. (D) Forest plot presenting the HRs of the top 50 risk proteins for DKD in diabetic adults. GWAS: genome-wide association studies; 95% CI: 95% confidence interval. Full names of abbreviated proteins can be found in Table S3.

2. Materials and methods

2.1. Study participants and ethics

The UKB and FinnGen studies are both large prospective cohort studies with comprehensive phenotypic and genotypic data [16,17]. Individual-level data from the UKB (n = 502,387, recruited between 2006 and 2010) and FinnGen R9 (n = 377,277, recruited between 2017 and 2022) were used to identify individuals with DKD. For each UKB participant, self-reported information, including lifestyle, health conditions, and medication history, was obtained through a questionnaire and interview. Blood samples were collected from participants for plasma proteomics and whole-exome sequencing [18]. We excluded individuals with no proteomics data (n = 452,962), with missing data on diabetes indicators (n = 366), without diabetes (n = 45,562), and with diabetic complications at baseline (n = 653), ultimately including 2,844 individuals for further analysis (Fig. 1B). The UKB project was approved by the North West Multi-centre Research Ethics Committee (ref: 06/MRE08/75) and all participants provided informed consent. The FinnGen study protocol was approved by the Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa (HUS/990/2017) [17]. The outcomes for DKD were defined using individual-level data acquired from hospital episode statistics, death registries, self-reported information, and primary care records. Cases were classified according to International Classification of Diseases, 10th revision (ICD-10) codes. Incident DKD was defined as a primary diagnosis of DKD, established through evaluation by experienced physicians, death registry personnel, and community physicians. A total of 2,844 individuals, with a median age ranging from 70 to 78 years, were included in the study. Detailed information regarding the DKD definition can be found in Table S1.
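The exclusion cascade can be checked with simple arithmetic. The following minimal sketch uses only the counts stated in this section; the variable names are illustrative and are not taken from the study's code.

```python
# Participant flow for the UKB analysis, using the counts stated in Section 2.1.
ukb_total = 502_387
exclusions = [
    ("no proteomics data", 452_962),
    ("missing data on diabetes indicators", 366),
    ("without diabetes", 45_562),
    ("diabetic complications at baseline", 653),
]

remaining = ukb_total
for reason, n_excluded in exclusions:
    remaining -= n_excluded
    print(f"after excluding {reason}: {remaining:,}")

final_n = remaining  # number of individuals retained for analysis
```

The final count agrees with the 2,844 individuals reported in Fig. 1B.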

2.2. Large-scale plasma proteome data mining

For UKB participants, blood samples were collected and randomly selected for Olink protein analysis. The workflow of sample collection, processing, and data preprocessing has been described in detail previously [18]. In summary, plasma samples were collected into ethylenediaminetetraacetic acid (EDTA) tubes, centrifuged at 2,500 g for 10 min at 4 °C, and then stored at −80 °C. After low-temperature transport to the Olink analysis service in Sweden, a total of 2,923 proteins were measured by proximity extension assays in four panels. Quality control documentation for the entire proteomics process can be viewed and downloaded from the UKB website (https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/PPP_Phase_1_QC_dataset_companion_doc.pdf). In general, the quality of the UKB Olink data is high, with an intra-plate coefficient of variation (%CV) of less than 10% and an inter-plate %CV of less than 20%. Protein concentrations were expressed as normalized protein expression (NPX) values. More details on data conversion can be found at https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/Olink-3072_B0-B6_Normalization.pdf. Three proteins (GLIPR1, NPM1, and PCOLCE) were missing in more than 30% of plasma samples and were excluded.

Descriptive statistics were used to compare individuals who developed DKD during follow-up with those who did not. The chi-square test or Fisher's exact test was used for categorical variables, and the t-test or Mann-Whitney U test was used for continuous variables. Cox proportional hazards regression was used to calculate hazard ratios (HRs) and 95% confidence intervals (95% CIs), assessing associations between each plasma protein and future DKD. Follow-up time was defined as the time from baseline to DKD onset, death, or the end of the study (the most recent available date, October 31, 2023), whichever came first. Models were adjusted for a comprehensive range of covariates, including age, sex, ethnicity, education (college or others), Townsend deprivation index, income, smoking (never, previous, or present), alcohol intake (never, previous, or present), healthy diet score, international physical activity questionnaire (IPAQ) score, body mass index (BMI), HbA1c, duration of diabetes, hypertension status, and antidiabetic medications. The healthy diet score was calculated according to a prior study [19]. Antidiabetic medications included metformin, acarbose, gliclazide, insulin, pioglitazone, rosiglitazone, glimepiride, glipizide, glitazone, and repaglinide. Missing covariate values were handled by multiple imputation, generating five imputed datasets, and the analysis results were combined using Rubin's rules. For each protein, individuals with missing NPX values were excluded. P-values for each protein were adjusted by Bonferroni correction, and statistical significance was defined as a two-sided adjusted P-value < 0.05.
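The pooling and multiple-testing steps can be sketched as follows. This is a minimal illustration, not the study's code: the per-imputation log-HRs and standard errors are hypothetical, the P-value uses a normal approximation rather than the t-distribution reference usually paired with Rubin's rules, and the number of tests (2,923) is an assumption based on the number of proteins measured.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates (e.g. log-HRs) with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                      # pooled point estimate
    within = mean(se ** 2 for se in std_errors)   # mean within-imputation variance
    between = variance(estimates)                 # between-imputation variance
    total_var = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total_var)

def two_sided_p(estimate, se):
    """Two-sided P-value under a normal approximation."""
    return math.erfc(abs(estimate / se) / math.sqrt(2))

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical log-HRs and SEs for one protein across five imputed datasets
log_hrs = [1.20, 1.22, 1.19, 1.21, 1.23]
ses = [0.08, 0.08, 0.09, 0.08, 0.08]

pooled_log_hr, pooled_se = pool_rubin(log_hrs, ses)
adjusted_p = bonferroni(two_sided_p(pooled_log_hr, pooled_se), n_tests=2923)
print(f"HR = {math.exp(pooled_log_hr):.2f}, adjusted P = {adjusted_p:.2e}")
```

The between-imputation term inflates the pooled variance, so the pooled standard error is never smaller than the root mean of the within-imputation variances.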

2.3. Genetic-driven causal inference analysis

2.3.1. Instrumental variable (IV) selection

To use the protein quantitative trait loci (pQTLs) of plasma proteins as exposures, we drew on the UKB database and selected the 2,923 Olink proteins as exposure factors. Genome-wide association study (GWAS) summary statistics for these proteins are available at https://www.decode.com/summarydata/. IVs need to satisfy three core assumptions: (1) relevance: IVs are strongly associated with the exposure; (2) independence: IVs are not associated with confounding factors; (3) exclusion restriction: IVs affect the outcome only through the exposure. To satisfy the relevance assumption, single nucleotide polymorphisms (SNPs) associated with each exposure at genome-wide significance (P < 5.0 × 10⁻⁸) were identified [20]. Subsequently, SNPs in linkage disequilibrium were removed by clumping with a threshold of r² < 0.001 and a window of 10,000 kb. Finally, the exposure-related SNPs were extracted from the outcome GWAS, the exposure and outcome data were harmonized, and palindromic SNPs were removed.
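The instrument selection steps above can be sketched as a greedy clumping procedure. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: the `Snp` record, the `ld_r2` lookup (pairwise r² from a reference panel, with unlisted pairs treated as r² = 0), and the example rsIDs are all hypothetical, and every palindromic (A/T or C/G) SNP is dropped regardless of allele frequency.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int            # base-pair position
    effect_allele: str
    other_allele: str
    p_value: float      # association with the exposure (protein level)

P_THRESHOLD = 5e-8            # genome-wide significance
R2_THRESHOLD = 0.001          # clumping r^2 threshold
WINDOW_BP = 10_000 * 1_000    # 10,000 kb clumping window

def is_palindromic(snp):
    return {snp.effect_allele, snp.other_allele} in ({"A", "T"}, {"C", "G"})

def select_instruments(snps, ld_r2):
    """Filter by P-value, clump greedily by LD, then drop palindromic SNPs."""
    candidates = sorted(
        (s for s in snps if s.p_value < P_THRESHOLD), key=lambda s: s.p_value
    )
    kept = []
    for snp in candidates:
        # A SNP is discarded if it is in LD with a more significant kept SNP.
        in_ld = any(
            k.chrom == snp.chrom
            and abs(k.pos - snp.pos) <= WINDOW_BP
            and ld_r2.get(frozenset((k.rsid, snp.rsid)), 0.0) >= R2_THRESHOLD
            for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return [s for s in kept if not is_palindromic(s)]

# Hypothetical example data
snps = [
    Snp("rs1", "1", 1_000_000, "A", "G", 1e-20),
    Snp("rs2", "1", 1_050_000, "C", "T", 1e-10),   # in LD with rs1
    Snp("rs3", "1", 50_000_000, "A", "T", 1e-12),  # palindromic
    Snp("rs4", "2", 5_000_000, "G", "A", 1e-9),
    Snp("rs5", "2", 6_000_000, "C", "A", 1e-3),    # not genome-wide significant
]
ld_r2 = {frozenset(("rs1", "rs2")): 0.8}
instruments = select_instruments(snps, ld_r2)
print([s.rsid for s in instruments])
```

In this toy example only rs1 and rs4 survive: rs2 is clumped away, rs3 is palindromic, and rs5 fails the significance threshold.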

2.3.2. Causal inference analysis

Genetic association data for plasma proteins were sourced from the UKB Pharma Proteomics Project (UKBPPP), while genetic associations for disease outcomes were obtained from FinnGen R9 (FinnGen phenocodes: DM_NEPHROPATHY_EXMORE, E4_DM1REN, and E4_DM2REN). The UKB participants are of British ancestry, while the FinnGen participants are of Finnish ancestry, with little overlap between the two cohorts. In our MR analysis, only SNPs with minor allele frequency (MAF) > 0.01 were kept. The Wald ratio (for a single instrument) or the inverse variance weighted (IVW) method (for multiple instruments) was used to evaluate causal relationships [21]. Associations between proteins and DKD were quantified using odds ratios (ORs) and P-values. Proteins with a P-value < 0.05 were considered to have a potential causal link with DKD. Sensitivity analyses, including tests for horizontal pleiotropy and heterogeneity, were performed for significant proteins. Proteins significant in both the Cox and MR analyses were carried forward for further analysis. To validate the causality between key proteins and DKD, we performed colocalization analysis, evaluating the posterior probability that the protein GWAS and the DKD GWAS share the same causal variant within ±90 kb of each gene encoding a key protein.
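The two estimators named above can be written in a few lines. The sketch below is a generic fixed-effect formulation with a first-order standard error for the Wald ratio; it is not the study's implementation, which may differ (for example, in using random-effects IVW), and the input values are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument causal estimate with a first-order standard error."""
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)

def ivw(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse variance weighted estimate over independent SNPs."""
    weights = [bx ** 2 / so ** 2 for bx, so in zip(betas_exposure, ses_outcome)]
    ratios = [by / bx for bx, by in zip(betas_exposure, betas_outcome)]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return beta, math.sqrt(1 / sum(weights))

def summarize(beta, se):
    """Convert a log-odds estimate to an OR, 95% CI, and two-sided P-value."""
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    return {"or": math.exp(beta), "ci": ci, "p": p}

# Hypothetical SNP effects on protein level (exposure) and on DKD (outcome)
bx = [0.4, 0.5, 0.3]
by = [0.20, 0.25, 0.15]
se_y = [0.08, 0.07, 0.09]

beta_ivw, se_ivw = ivw(bx, by, se_y)
result = summarize(beta_ivw, se_ivw)
print(f"OR = {result['or']:.2f}, P = {result['p']:.3g}")
```

With a single instrument the IVW estimate reduces to the Wald ratio, which is why the two methods can be applied interchangeably depending on how many instruments a protein has.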

2.4. A community-based prospective validation for prioritized targets

For all proteins with both observational and causal evidence, Kaplan-Meier (KM) survival analysis was conducted to validate their prognostic value in DKD progression. The Youden index was used to establish each protein's cutoff. Differences in DKD incidence between individuals with higher and lower protein levels during the follow-up period were used to assess prognostic significance. HRs and P-values were calculated using adjusted Cox models.
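A cutoff based on the Youden index can be found by scanning candidate thresholds. The sketch below treats DKD as a binary outcome and ignores follow-up time, which is a simplifying assumption; the NPX values and outcomes are hypothetical.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Individuals with value >= cutoff are assigned to the higher-level group.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical baseline NPX values and DKD status (1 = developed DKD)
npx = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.6, 2.0]
dkd = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff, j_stat = youden_cutoff(npx, dkd)
print(f"cutoff = {cutoff}, Youden J = {j_stat:.2f}")
```

Participants at or above the cutoff would form the higher-level group compared in the KM curves.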

2.5. Histopathology analysis of tissue samples across humans and mice

2.5.1. Biological samples

Renal biopsy specimens from hospitalized patients diagnosed with diabetic nephropathy, along with adjacent normal tissues collected during nephrectomy as controls, were obtained and processed into paraffin sections for immunohistochemistry. The study was approved by the Ethics Committee of Guangdong Provincial People's Hospital (Approval No.: GDREC2019771H(R1)).

Male C57BL/6 mice, aged 6−8 weeks (weighing 20−23 g), received intraperitoneal injections of streptozotocin (STZ) at a dose of 50 mg/kg for 5 consecutive days to induce a diabetic model. The control group received an equivalent volume of citrate buffer following the same injection schedule. Mice with fasting blood glucose greater than 250 mg/dL (13.9 mmol/L) at 72 h and 7 days post-STZ injection were confirmed to have diabetes mellitus (DM). Blood glucose and urine protein levels were measured weekly. Mice that exhibited negative proteinuria after 4 weeks of diabetes induction were classified into the non-diabetic kidney disease (NDKD) group for sample collection, while mice with positive proteinuria at 8–12 weeks post-diabetes induction were classified into the DKD group. All mice were euthanized and kidney tissues were collected. The experiments were conducted according to the National Institutes of Health (NIH) guidelines for the Care and Use of Laboratory Animals. All animal experiments were approved by the Sun Yat-sen University Animal Ethics Committee (Approval No.: SYSU-IACUC-2024-002503).

Renal histopathology was used to evaluate the animal model. Paraffin-embedded kidney tissues were sectioned using a microtome for each kidney (n = 5 kidneys per group). Individual sections were stained with periodic acid-Schiff (PAS) and periodic-acid silver methenamine (PASM) stains following standard protocols [22].

2.5.2. Immunohistochemistry and immunofluorescence staining

Briefly, 4-μm-thick paraffin-embedded kidney sections were dewaxed, hydrated, subjected to antigen retrieval, and blocked with 0.5% bovine serum albumin in phosphate-buffered saline (PBS) for 30 min. The sections were incubated overnight at 4 °C with specific primary antibodies, followed by incubation with secondary antibodies for 1 h at room temperature. After the nuclei were stained with hematoxylin for 3 min, images were viewed under a Nikon Eclipse microscopy system (200×, 400×; Nikon Ni–U, Sendai, Miyagi Prefecture, Japan). The specific primary antibodies included: family with sequence similarity 3 member C (FAM3C) antibody (HUABIO, Hangzhou, China, ER1908-63, 1:200), insulin-like growth factor binding protein 1 (IGFBP1) antibody (Proteintech, Wuhan, China, 13981-1-AP, 1:2000), prostaglandin D2 synthase (PTGDS) antibody (Proteintech, Wuhan, China, 10754-2-AP, 1:1000) and neuroblastoma suppressor of tumorigenicity 1 (NBL1) antibody (Affinity Biosciences, Nanjing, China, DF3177, 1:200). The semi-quantitative immunoreactivity score (IRS) was determined using quantitative image analysis software (NIH, Bethesda, Maryland, USA). Each slide was assigned an IRS ranging from 0 to 12, based on four fields at 20× magnification. Two independent observers scored the slides using this procedure.

Immunofluorescence was carried out on fresh frozen mouse kidney sections as previously described [23]. Immunofluorescence of the section used FAM3C (Proteintech, Wuhan, China, 14247-1-AP, 1:100), IGFBP4 (Proteintech, Wuhan, China, 18500-1-AP, 1:100), and PTGDS (Proteintech, Wuhan, China, 10754-2-AP, 1:250). Images of all tissues were analyzed with a Carl Zeiss LSM800 with airyscan (Oberkochen, Baden-Württemberg, Germany).

2.6. Virtual screening and TxGNN foundation model for therapeutic agents

Virtual screening for IGFBP4, FAM3C, and PTGDS was performed on Molecular Operating Environment (MOE, Release 2022.02) against a library of 2,619 approved drugs from DrugBank. TxGNN [24] (http://txgnn.org/), a new graph-based foundation model for drug repurposing, was used to prioritize candidates (score > 0.99) from the top 10 compounds with the strongest binding affinities. Key interactions between ligands and protein binding pockets were visualized using PyMOL (V2.5).

2.7. Population-level genetic inhibition of potential targets

The loss of function transcript effect estimator (LOFTEE) algorithm was adopted to predict PTVs [25]. AlphaMissense [26] and the sorting intolerant from tolerant (SIFT) algorithm were used to predict likely pathogenic and damaging variants. Protein-altering variants (PAVs), comprising PTVs and likely pathogenic or damaging variants, were analyzed [27]. Rare genetic variants (MAF < 0.1%) in IGFBP4, FAM3C, PTGDS, and the positive controls SGLT2 and glucagon-like peptide-1 receptor (GLP1R) were analyzed. Whole-exome sequencing data from 18,398 UKB participants with diabetes and no complications at baseline were used. For each gene, individuals were classified as carriers or non-carriers of PAVs. Logistic regression analysis was performed to assess associations between carrier status and DKD outcomes, adjusted for age, sex, genotyping array, and the first 10 genomic principal components [27].

2.8. In vitro cell experiments for prioritized targets’ validation

2.8.1. Cell culture and treatment

HK2 cells were cultured in Dulbecco's modified Eagle medium/nutrient mixture F-12 (DMEM/F-12, Gibco, Waltham, MA, USA) supplemented with 10% fetal bovine serum. Cells were plated at 60%–70% confluence, serum-starved for 12 h, and then exposed to either 50 mM mannitol (control group) or 50 mM glucose (high glucose (HG) group) for 48 h [28]. HK2 cells were exposed to lisinopril (1 μM and 10 μM; MedChemExpress, Monmouth Junction, NJ, USA) and telmisartan (4 μg/mL and 40 μg/mL; MedChemExpress) for 48 h, with the control group treated with solvent. Cell viability was measured using the Cell Counting Kit-8 assay (APExBIO, Houston, TX, USA).

2.8.2. FAM3C and IGFBP4 siRNA transfection

FAM3C and IGFBP4 were silenced using small interfering RNA (siRNA) (Hippobio, Huzhou, China) transfected 48 h after seeding using Lipofectamine™ RNAiMAX (ThermoFisher, Waltham, MA, USA), following the manufacturer's instructions. The siRNA sequences were as follows:

IGFBP4 siRNA sense: 5′-UCACUCAUCCAGCCACCUAAATT-3′;

Antisense: 5′-UUUAGGUGGCUGGAUGAGUGATT-3′;

FAM3C siRNA sense: 5′-GAGGAGAUGUGGCACCAUUUAdTdT-3′;

Antisense: 5′-UAAAUGGUGCCACAUCUCCUCdTdT-3′.

2.8.3. Immunofluorescence staining

HK2 cells were fixed with 4% paraformaldehyde and permeabilized with 0.5% Triton X-100. After fixation, the cells were incubated with primary antibodies at 4 °C overnight. After washing with phosphate-buffered saline (PBS), the cells were treated with CY3 (APExBIO, Houston, TX, USA). Subsequently, the samples were counterstained with 4,6-diamidino-2-phenylindole (DAPI) (Beyotime, Shanghai, China) and mounted with Vectashield mounting medium (Beyotime, Shanghai, China).

2.8.4. Western blot

Proteins from HK2 cells were collected and quantified using the BCA method. Equal amounts of protein were separated by 7.5% or 10% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), transferred to nitrocellulose membranes, and probed with the following primary antibodies: anti-FAM3C (Proteintech, Wuhan, China, 14247-1-AP, 1:1000), anti-IGFBP4 (Proteintech, 18500-1-AP, 1:1000), anti-caspase-1 (Abcam, Cambridge, UK, ab179515, 1:1000), anti-gasdermin D (GSDMD) (Abmart, Shanghai, China, PU224937, 1:1000), and anti-NLR family pyrin domain containing 3 (NLRP3) (Abmart, P60622R3, 1:1000). After incubation with secondary antibodies, protein bands were visualized using the ECL system (APExBIO, Houston, TX, USA), and band intensities were normalized to β-actin using ImageJ software.

2.9. Statistical analysis

To evaluate the druggability of the prioritized targets, we summarized information on drug molecule types, approved indications for specific targets, and clinical trial outcomes from drug-related databases, including Open Targets [29], ClinicalTrials.gov, DrugBank, PharmGKB, and GeneCards. Data analyses were conducted using R and Python, with packages including survival (v3.6-4), mice (v3.16.0), gwaslab (v3.4.45, https://doi.org/10.51094/jxiv.370), gwasglue (v0.0.0.9000), gwasvcf (v0.1.2), TwoSampleMR (v0.6.2), coloc (v5.2.3), and lifelines (v0.28.0).

Experimental data were visualized using the GraphPad Prism 10.0 and expressed as the mean ± standard deviation (SD), with t-tests used to compare differences between two groups. For comparisons among multiple groups, one-way analysis of variance (ANOVA) followed by Tukey's post hoc test was performed. Statistical significance was defined as P < 0.05.

3. Results

A total of 2,844 diabetic patients without additional complications at baseline were included. Among them, 528 participants developed DKD, with a median age of 62.8 years, 40.3% female, and an average follow-up of 7.33 years. The other 2,316 participants, who did not develop DKD or other complications, were considered the diabetic group (Without DKD), with an average follow-up of 12.5 years. Table S2 summarizes the baseline characteristics.

3.1. Identifying potential targets for DKD with observational and causal evidence

Of the 2,923 proteins, after adjusting for age, sex, education, and the other covariates listed in Section 2.2, 597 risk proteins were associated with incident DKD in diabetic adults (Fig. 1C and Table S3). After Bonferroni correction, IGFBP4 (HR = 3.37, P = 3.93 × 10⁻⁵⁹), collagen type VI alpha 3 chain (COL6A3) (HR = 5.07, P = 8.89 × 10⁻⁵⁶), and phosphoinositide-3-kinase interacting protein 1 (PIK3IP1) (HR = 6.78, P = 2.73 × 10⁻⁵⁴) showed the most significant positive associations with DKD. The top 50 risk proteins, ranked by significance, are shown in Fig. 1D. Conversely, uromodulin (UMOD) (HR = 0.52, P = 1.12 × 10⁻¹⁸), neural epidermal growth factor-like 1 (NELL1) (HR = 0.49, P = 2.97 × 10⁻¹⁵), and nephrin (NPHS1) (HR = 0.33, P = 1.41 × 10⁻⁸) were negatively associated with DKD (Table S3).

To assess the causal relationship between the 2,923 proteins and DKD, we used the pQTLs of plasma proteins as exposures and selected SNPs as IVs. Diabetic nephropathy (n = 274,660), type 1 diabetes with renal complications (n = 309,859), and type 2 diabetes with renal complications (n = 310,964) from the FinnGen study were selected as outcome variables. Using a two-sample MR method, we identified 205, 184, and 165 proteins causally associated with these three DKD outcomes, respectively, in the European population, with strict instrument selection parameters (P < 5 × 10⁻⁸). Detailed results are provided in Tables S4 and S5.

We further conducted integrated analyses of proteomic and genetic data. As shown in Fig. 2A, only 37 proteins were significant in both the Cox regression (adjusted P < 0.05) and the MR analysis (instruments selected at P < 5 × 10⁻⁸). Among these 37 proteins, IGFBP4 (HR = 3.37, P = 3.93 × 10⁻⁵⁹; OR = 1.73, P = 0.008), NBL1 (HR = 8.61, P = 2.13 × 10⁻⁵⁰; OR = 1.51, P = 0.031), FAM3C (HR = 6.52, P = 3.15 × 10⁻⁴⁷; OR = 1.98, P < 0.001) and PTGDS (HR = 4.29, P = 3.44 × 10⁻⁴³; OR = 1.30, P = 0.022) were the most significant observational and causal risk proteins for incident DKD (Tables S6 and S7).

Fig. 2.

Observational and causal estimates of 37 risk proteins associated with diabetic kidney disease (DKD). (A) Forest plots illustrate the hazard ratios (HRs, red) and odds ratios (ORs, blue) for 37 shared risk proteins, assessed through Cox regression and two-sample Mendelian randomization (MR) models, respectively. The 95% confidence intervals (95% CIs) are displayed. The heatmap displays the adjusted P-values from both observational and causal estimates, with color representing the direction of effect and association strength with DKD. Red in the heatmap signifies risk factors. Proteins are ranked by the adjusted P-values obtained from Cox regression, listed in ascending order. Ob: observational analysis. Ca: causal analysis. (B–D) Regional Manhattan plots of associations of single nucleotide polymorphisms (SNPs) at the loci of top genes: insulin-like growth factor binding protein 4 (IGFBP4) (B), neuroblastoma suppressor of tumorigenicity 1 (NBL1) (C), and family with sequence similarity 3 member C (FAM3C) (D). The SNPs rs1668339, rs12408663, and rs2707502 were used to proxy serum IGFBP4, NBL1, and FAM3C expression, respectively. Flanking 90 kb regions on either side of the corresponding genes were analyzed. r²: the correlation in allele frequencies between various SNPs and proxy SNPs. Full names of abbreviated proteins can be found in Table S3.

In the MR sensitivity analysis, 35 proteins showed no evidence of horizontal pleiotropy; only advanced glycosylation end-product specific receptor (AGER) and von Willebrand factor C domain containing 2 like (VWC2L) showed pleiotropy, supporting the robustness of our findings (Table S8). Figs. 2B–D present the regional Manhattan plots of SNPs within the flanking 90 kb regions of the IGFBP4, NBL1, and FAM3C loci with DKD. Other proteins are shown in Fig. S1 and Table S9.

3.2. Prioritizing targets by community-based prospective validation

Next, we evaluated the prognostic value of baseline protein levels for DKD progression to further prioritize potential drug targets. For each protein, participants were divided into higher-level (orange) and lower-level (blue) groups, with the cutoff set at the value that maximized the Youden index. All patients, including those who experienced clinical progression, were included. Detailed thresholds are shown in Table S10. Participants with higher baseline levels of IGFBP4 (HR = 4.23, P = 2.3 × 10⁻³⁷), NBL1 (HR = 3.51, P = 1.7 × 10⁻³⁴), PTGDS (HR = 3.54, P = 2.4 × 10⁻³⁴), and FAM3C (HR = 3.33, P = 2.1 × 10⁻³¹) had a significantly increased risk of developing DKD (Fig. 3).

Fig. 3.

Prognostic impact of the baseline levels of the top 9 risk proteins on diabetic kidney disease (DKD) events in diabetic adults. Adjusted Kaplan-Meier (KM) survival curves illustrating the incidence of DKD over a 16-year follow-up period for the top 9 proteins with significant observational and causal associations in baseline diabetic adults. The survival probabilities of individuals with lower (blue line) versus higher (orange line) baseline plasma concentrations of insulin-like growth factor binding protein 4 (IGFBP4), neuroblastoma suppressor of tumorigenicity 1 (NBL1), prostaglandin D2 synthase (PTGDS), family with sequence similarity 3 member C (FAM3C), tumor necrosis factor (TNF) receptor superfamily member 1B (TNFRSF1B), TNF receptor superfamily member 4 (TNFRSF4), CD160 molecule (CD160), shisa family member 5 (SHISA5), and chordin like 1 (CHRDL1) were compared. The hazard ratios (HRs) and 95% confidence intervals (95% CIs) are indicated for each protein. Full names of abbreviated proteins can be found in Table S3.

Notably, individuals with higher IGFBP4 levels were 4.35 times more likely to develop DKD than those with lower IGFBP4 levels. Similarly, those with higher levels of NBL1, PTGDS, and FAM3C had 3.50, 3.57, and 3.51 times higher risks of DKD events, respectively. Significant associations were also found across the 37 prioritized targets (Figs. 3 and S2). From a drug target perspective, reducing these circulating risk proteins to within an appropriate range might mitigate the progression of DKD.

3.3. Confirming prioritized targets’ alterations across humans and mice

As shown in Fig. S3, IGFBP4 and FAM3C protein levels were significantly elevated in DKD patients compared to healthy controls in a DKD kidney tissue dataset (https://data.mendeley.com/datasets/83k89shdx5/, with a total of 10 healthy participants and 23 DKD patients). However, NBL1 levels were significantly decreased in DKD patients, which is contradictory to our findings, suggesting that NBL1 might not be a potential target for DKD.

Immunohistochemistry staining on renal biopsy specimens verified a significantly increased expression of FAM3C, IGFBP4, PTGDS, and IGFBP1 in DKD patients compared to healthy tissues (Fig. 4A). Both renal tissue and circulating plasma samples consistently show elevated levels of FAM3C, IGFBP4, PTGDS, and IGFBP1 in DKD patients, compared to healthy and diabetic controls.

Fig. 4.

Differential expression of prioritized targets in tissue immunostaining. (A) Immunohistochemistry staining validating increased expression of insulin-like growth factor binding protein 4 (IGFBP4), family with sequence similarity 3 member C (FAM3C), prostaglandin D2 synthase (PTGDS), and insulin-like growth factor binding protein 1 (IGFBP1) in renal biopsy specimens from diabetic kidney disease (DKD) patients compared with healthy volunteers. (B) Periodic acid-Schiff (PAS) staining, periodic-acid silver methenamine (PASM) staining, blood glucose, and proteinuria confirming successful modeling of diabetic (NDKD) and DKD mice. (C) Immunohistochemistry showing increased expression of IGFBP1, FAM3C, and PTGDS in DKD mice. (D) Representative immunofluorescence indicating elevated expression of FAM3C (red), IGFBP4 (green), and PTGDS (yellow) in the renal tissues of DKD mice across different groups. Nuclei were counterstained with 4,6-diamidino-2-phenylindole (DAPI) (blue). ∗∗∗P < 0.001, ∗∗P < 0.01, ∗P < 0.05. IRS: immunoreactivity score.

Additionally, diabetic and DKD mouse models were established and verified by PAS staining, PASM staining, blood glucose, and proteinuria measurements (Fig. 4B). Both immunohistochemistry and immunofluorescence staining in the different groups of mice showed significantly increased expression of IGFBP4, FAM3C, PTGDS, and IGFBP1 in DKD mice (Figs. 4C and D). However, no significant difference was observed in NBL1 expression in renal biopsy specimens from DKD patients (Fig. S4).

3.4. Likelihood of drug success of prioritized targets

For these 37 prioritized targets, enrichment analysis using gene ontology terms and the Reactome pathways indicated significant associations with key biological processes of DKD (Table S11). These proteins were enriched in the regulation of response to stimulus (IGFBP4, IGFBP1, and NBL1; false discovery rate (FDR) < 0.001), signaling receptor binding (IGFBP4, IGFBP1, NBL1, FAM3C; FDR = 0.0064), and regulation of the insulin-like growth factor (IGF) transport and uptake (IGFBP4, IGFBP1, follistatin like 1 (FSTL1), shisa family member 5 (SHISA5), chordin like 1 (CHRDL1); FDR = 0.004). These pathways are implicated in the pathogenesis of renal diseases, diabetes, and cardiovascular diseases.
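As a rough illustration of how enrichment P-values and FDR values of this kind can be obtained, the sketch below applies a one-sided hypergeometric test per term followed by Benjamini-Hochberg adjustment. It is not the tool used in this study; the background size (taken here as the 2,923 assayed proteins), the term names, and the term counts are assumptions chosen only for demonstration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric: N draws from M items, n of them annotated."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    background = 2923  # assumption: the assayed panel serves as the background
    n_targets = 37     # number of prioritized targets
    # Hypothetical terms: (targets annotated to the term, term size in the background)
    terms = {"term_A": (5, 60), "term_B": (4, 150), "term_C": (1, 80)}
    pvals = [hypergeom_sf(k, background, size, n_targets) for k, size in terms.values()]
    for name, p, q in zip(terms, pvals, benjamini_hochberg(pvals)):
        print(f"{name}: p={p:.3g}, FDR={q:.3g}")
```

Restricting the background to the assayed proteins, rather than the whole proteome, is a design choice that avoids inflating enrichment for categories the platform over-represents.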

A summary of OpenTargets provided evidence of drug development, including phase II–IV trials, for 8 proteins: tumor necrosis factor (TNF) receptor superfamily member 4 (TNFRSF4), adrenomedullin (ADM), folate receptor alpha (FOLR1), complement factor D (CFD), C-X3-C motif chemokine ligand 1 (CX3CL1), interferon gamma receptor 1 (IFNGR1), AGER, and angiopoietin like 3 (ANGPTL3). Of these, 1 was connected to diabetic nephropathy (AGER) and 1 to hypercholesterolemia (ANGPTL3) (Table 1). Drug target development for clinical approval (Tclin) outcomes revealed that FOLR1, CFD, IFNGR1, and ANGPTL3 each had one “active drug”, while the ligand-Tchem data indicated “active ligands” for some proteins, including FOLR1 and CFD. However, no clinical trials have been conducted for the other 29 proteins, including IGFBP4, FAM3C, and PTGDS.

Table 1.

Genetic information and findings for key proteins associated with diabetic kidney disease (DKD).

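| Protein name | Gene name | Renal expression | Genetic analysis: OR | Genetic analysis: 95% CI | Genetic analysis: P-value | Drug development: drug name | Drug development: outcomes | Drug development: trial phase | Target development level: Drug-Tclin (a) | Target development level: Ligands-Tchem (b) |
|---|---|---|---|---|---|---|---|---|---|---|
| IGFBP4 | IGFBP4 | Yes | 1.73 | [1.16, 2.59] | 0.007 | – | – | – | 0 | 0 |
| NBL1 | NBL1 | Yes | 1.51 | [1.04, 2.19] | 0.032 | – | – | – | 0 | 0 |
| PTGDS | PTGDS | Yes | 1.30 | [1.04, 1.63] | 0.022 | – | – | – | 0 | 0 |
| FAM3C | FAM3C | Yes | 1.98 | [1.34, 2.89] | 4E-04 | – | – | – | 0 | 0 |
| TNFRSF1B | TNFRSF1B | Yes | 2.28 | [1.09, 4.78] | 0.028 | – | – | – | 0 | 0 |
| TNFRSF4 | TNFRSF4 | Yes | 1.31 | [1.01, 1.68] | 0.04 | Ivuxolimab | Advanced malignant neoplasm | III | 0 | 0 |
| CD160 | CD160 | Yes | 1.30 | [1.06, 1.59] | 0.012 | – | – | – | 0 | 0 |
| SHISA5 | SHISA5 | Yes | 1.62 | [1.07, 2.45] | 0.024 | – | – | – | 0 | 0 |
| CHRDL1 | CHRDL1 | Yes | 1.69 | [1.16, 2.48] | 0.008 | – | – | – | 0 | 0 |
| HLA-E | HLA-E | Yes | 1.40 | [1.11, 1.77] | 0.004 | – | – | – | 0 | 0 |
| GPR37 | GPR37 | Yes | 1.12 | [1.00, 1.26] | 0.047 | – | – | – | 0 | 0 |
| ADM | ADM | Yes | 1.24 | [1.02, 1.51] | 0.034 | Enibarcimab | Heart failure | II | 0 | 0 |
| FOLR1 | FOLR1 | Yes | 1.25 | [1.02, 1.53] | 0.025 | Methotrexate | Rheumatoid arthritis | IV | 1 | 50 |
| RNASET2 | RNASET2 | Yes | 1.60 | [1.16, 2.20] | 0.004 | – | – | – | 0 | 0 |
| CFD | CFD | Yes | 1.21 | [1.04, 1.64] | 0.021 | Danicopan | C3 glomerulonephritis | III | 1 | 324 |
| CX3CL1 | CX3CL1 | Yes | 1.42 | [1.01, 1.99] | 0.043 | Quetmolimab | Rheumatoid arthritis | II | 0 | 0 |
| IFNGR1 | IFNGR1 | Yes | 1.18 | [1.01, 1.38] | 0.04 | Interferon gamma-1b | Friedreich's ataxia | III | 1 | 0 |
| PALM2 | PALM2 | Yes | 1.34 | [1.06, 1.68] | 0.014 | – | – | – | 0 | 0 |
| POLR2F | POLR2F | Yes | 2.95 | [1.29, 6.73] | 0.01 | – | – | – | 0 | 0 |
| VWC2L | VWC2L | Yes | 0.56 | [0.35, 0.88] | 0.01 | – | – | – | 0 | 0 |
| NCR1 | NCR1 | Yes | 1.13 | [1.01, 1.26] | 0.038 | – | – | – | 0 | 0 |
| NT-proBNP | NPPB | Yes | 0.80 | [0.65, 0.98] | 0.029 | – | – | – | 0 | 0 |
| CFC1 | CFC1 | No | 0.34 | [0.13, 0.93] | 0.035 | – | – | – | 0 | 0 |
| CD300A | CD300A | Yes | 1.10 | [1.00, 1.21] | 0.045 | – | – | – | 0 | 0 |
| AGER | AGER | Yes | 0.73 | [0.54, 0.98] | 0.039 | Azeliragon | Diabetic nephropathy | II | 1 | 14 |
| FSTL1 | FSTL1 | Yes | 1.20 | [1.02, 1.41] | 0.033 | – | – | – | 0 | 0 |
| CCER2 | CCER2 | Yes | 1.15 | [1.04, 1.28] | 0.007 | – | – | – | 0 | 0 |
| GABARAP | GABARAP | Yes | 0.26 | [0.08, 0.84] | 0.025 | – | – | – | 0 | 0 |
| SCGB3A2 | SCGB3A2 | Yes | 1.13 | [1.01, 1.27] | 0.033 | – | – | – | 0 | 0 |
| IGFBP1 | IGFBP1 | Yes | 0.62 | [0.41, 0.96] | 0.03 | – | – | – | 0 | 0 |
| HIP1R | HIP1R | Yes | 1.49 | [1.04, 2.13] | 0.031 | – | – | – | 0 | 0 |
| MYOC | MYOC | Yes | 1.21 | [1.03, 1.44] | 0.023 | – | – | – | 0 | 0 |
| GCNT1 | GCNT1 | Yes | 1.16 | [1.01, 1.34] | 0.036 | – | – | – | 0 | 0 |
| ANGPTL3 | ANGPTL3 | Yes | 1.21 | [1.05, 1.39] | 0.009 | Evinacumab | Hypercholesterolemia | III | 1 | 0 |
| CGA | CGA | Yes | 2.73 | [1.36, 5.54] | 0.005 | – | – | – | 0 | 0 |
| SCARF1 | SCARF1 | Yes | 1.20 | [1.06, 1.35] | 0.003 | – | – | – | 0 | 0 |
| DCBLD2 | DCBLD2 | Yes | 1.16 | [1.02, 1.31] | 0.024 | – | – | – | 0 | 0 |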

OR: odds ratio; CI: confidence interval; IGFBP4: insulin-like growth factor binding protein 4; NBL1: neuroblastoma suppressor of tumorigenicity 1; PTGDS: prostaglandin D2 synthase; FAM3C: family with sequence similarity 3 member C; TNFRSF1B: tumor necrosis factor (TNF) receptor superfamily member 1B; TNFRSF4: TNF receptor superfamily member 4; CD160: CD160 molecule; SHISA5: shisa family member 5; CHRDL1: chordin like 1; HLA-E: major histocompatibility complex, class I, E; GPR37: G protein-coupled receptor 37; ADM: adrenomedullin; FOLR1: folate receptor alpha; RNASET2: ribonuclease T2; CFD: complement factor D; CX3CL1: C-X3-C motif chemokine ligand 1; IFNGR1: interferon gamma receptor 1; PALM2: paralemmin 2; POLR2F: RNA polymerase II, I and III subunit F; VWC2L: von Willebrand factor C domain containing 2 like; NCR1: natural cytotoxicity triggering receptor 1; NT-proBNP: N-terminal pro B-type natriuretic peptide; CFC1: cryptic, EGF-CFC family member 1; CD300A: CD300a molecule; AGER: advanced glycosylation end-product specific receptor; FSTL1: follistatin like 1; CCER2: coiled-coil glutamate rich protein 2; GABARAP: GABA type A receptor-associated protein; SCGB3A2: secretoglobin family 3A member 2; IGFBP1: insulin-like growth factor binding protein 1; HIP1R: huntingtin interacting protein 1 related; MYOC: myocilin; GCNT1: glucosaminyl (N-acetyl) transferase 1; ANGPTL3: angiopoietin like 3; CGA: glycoprotein hormones, alpha polypeptide; SCARF1: scavenger receptor class F member 1; DCBLD2: discoidin, CUB and LCCL domain containing 2; –: no data.

a Target has at least 1 approved drug.

b Target has at least 1 ChEMBL compound with an activity cutoff of < 30 nmol/L or > 80% or ≤ 10 pmol/mg.
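The ORs, 95% CIs, and P-values in Table 1 can be cross-checked against one another. The sketch below assumes that each CI is a symmetric Wald interval on the log-odds scale, which the source does not state explicitly, and recovers the standard error, z statistic, and two-sided P-value from the reported OR and CI. The function names are illustrative only.

```python
from math import erfc, log, sqrt


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover SE of log(OR), z statistic, and two-sided p from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log scale (an assumption, not a
    detail confirmed by the source).
    """
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value under a standard normal
    return se, z, p


def ci_midpoint(lower, upper):
    """Geometric mean of the CI bounds; should lie close to the reported OR."""
    return sqrt(lower * upper)


if __name__ == "__main__":
    # (protein, OR, CI lower, CI upper, reported P) transcribed from Table 1
    rows = [
        ("IGFBP4", 1.73, 1.16, 2.59, 0.007),
        ("FAM3C", 1.98, 1.34, 2.89, 4e-4),
        ("AGER", 0.73, 0.54, 0.98, 0.039),
    ]
    for name, or_, lo, hi, reported in rows:
        se, z, p = wald_from_ci(or_, lo, hi)
        print(f"{name}: midpoint={ci_midpoint(lo, hi):.2f}, z={z:.2f}, "
              f"p={p:.4f} (reported {reported})")
```

A reported OR that lies far from the geometric midpoint of its interval, or a recomputed P-value far from the reported one, usually points to a transcription or rounding problem rather than a modeling difference.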

3.5. Virtual screening and TxGNN foundation model for therapeutic agents

For the prioritized targets validated by tissue immunostaining (IGFBP4, FAM3C, and PTGDS), virtual screening was conducted to identify potential inhibitors among 2619 approved drugs. Combined with additional insights from the evaluation of the TxGNN foundation model [24], the 10 top-scoring ligands for each target are summarized in Table 2. Notably, valsartan (score −62.10) and several other ligands showed strongly negative docking scores against FAM3C, indicating high predicted binding potential. Representative interaction interfaces of the binding pockets are visualized in Fig. 5A.

Table 2.

Docking results of the top 10 ligands for each potential target from virtual screening.

| Receptor | Ligand | Score | Receptor | Ligand | Score | Receptor | Ligand | Score |
|---|---|---|---|---|---|---|---|---|
| IGFBP4 | Somatostatin | −9.60 | FAM3C | Fondaparinux | −99.14 | PTGDS | Semaglutide | −15.79 |
| IGFBP4 | Linaclotide | −8.99 | FAM3C | Bivalirudin | −93.17 | PTGDS | Pramlintide | −14.88 |
| IGFBP4 | Acarbose | −7.58 | FAM3C | Tannic acid | −85.49 | PTGDS | Tenapanor | −11.20 |
| IGFBP4 | Linagliptin | −7.39 | FAM3C | Valsartan | −62.10 | PTGDS | Cyclosporine | −10.44 |
| IGFBP4 | Moexipril | −7.11 | FAM3C | Daprodustat | −51.68 | PTGDS | Lanreotide | −9.90 |
| IGFBP4 | Fosinopril | −6.91 | FAM3C | Benazepril | −47.40 | PTGDS | Verapamil | −7.64 |
| IGFBP4 | Telmisartan | −6.89 | FAM3C | Fosinopril | −45.95 | PTGDS | Fosinopril | −7.07 |
| IGFBP4 | Verapamil | −6.88 | FAM3C | Cilazapril | −44.99 | PTGDS | Telmisartan | −7.02 |
| IGFBP4 | Daprodustat | −6.71 | FAM3C | Lisinopril | −43.12 | PTGDS | Bexagliflozin | −6.95 |
| IGFBP4 | Valsartan | −6.21 | FAM3C | Irbesartan | −42.22 | PTGDS | Empagliflozin | −6.80 |

Black bold words were validated by the TxGNN platform.

IGFBP4: insulin-like growth factor binding protein 4; FAM3C: family with sequence similarity 3 member C; PTGDS: prostaglandin D2 synthase.
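One simple way to read Table 2 is to ask which approved drugs recur among the top 10 ligands of more than one target. The sketch below transcribes the scores from Table 2 and groups them by ligand; the helper name `shared_ligands` is hypothetical and not part of the original pipeline. Because the score ranges differ markedly between receptors, the scores are listed per receptor rather than pooled or averaged.

```python
from collections import defaultdict

# Top-10 ligands per receptor, transcribed from Table 2 (more negative = better score).
docking = {
    "IGFBP4": {
        "Somatostatin": -9.60, "Linaclotide": -8.99, "Acarbose": -7.58,
        "Linagliptin": -7.39, "Moexipril": -7.11, "Fosinopril": -6.91,
        "Telmisartan": -6.89, "Verapamil": -6.88, "Daprodustat": -6.71,
        "Valsartan": -6.21,
    },
    "FAM3C": {
        "Fondaparinux": -99.14, "Bivalirudin": -93.17, "Tannic acid": -85.49,
        "Valsartan": -62.10, "Daprodustat": -51.68, "Benazepril": -47.40,
        "Fosinopril": -45.95, "Cilazapril": -44.99, "Lisinopril": -43.12,
        "Irbesartan": -42.22,
    },
    "PTGDS": {
        "Semaglutide": -15.79, "Pramlintide": -14.88, "Tenapanor": -11.20,
        "Cyclosporine": -10.44, "Lanreotide": -9.90, "Verapamil": -7.64,
        "Fosinopril": -7.07, "Telmisartan": -7.02, "Bexagliflozin": -6.95,
        "Empagliflozin": -6.80,
    },
}


def shared_ligands(table, min_receptors=2):
    """Return {ligand: {receptor: score}} for ligands listed under >= min_receptors receptors."""
    hits = defaultdict(dict)
    for receptor, ligands in table.items():
        for ligand, score in ligands.items():
            hits[ligand][receptor] = score
    return {lig: recs for lig, recs in hits.items() if len(recs) >= min_receptors}


if __name__ == "__main__":
    for ligand, per_receptor in sorted(shared_ligands(docking).items()):
        print(ligand, per_receptor)
```

Recurrence across targets is only a screening heuristic; it says nothing about selectivity or about whether the predicted binding translates into inhibition.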

Fig. 5.

Virtual screening and genetic intervention analysis of potential targets. (A) Representative visualization of the binding pockets between ligands and core proteins. (B) Logistic regression comparing DKD risk between carriers and non-carriers of at least one damaging variant for each core protein. OR: odds ratio; PAVs: protein-altering variants; ncase: number of diabetic kidney disease (DKD) patients; ncontrol: number of healthy controls.

In addition, we used whole-exome sequencing data to assess the genetic inhibition potential of IGFBP4, FAM3C, and PTGDS at the population level, with SGLT2 and GLP1R as positive controls. Damaging mutations of SGLT2 significantly lowered the risk of DKD, whereas no such association was observed for GLP1R. No significant association was observed for IGFBP4, FAM3C, or PTGDS, likely owing to insufficient sample size; nevertheless, carriers of protein-damaging mutations showed lower odds of DKD (Fig. 5B), suggesting potential intervention effects similar to those of SGLT2.
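To illustrate why a small number of carriers can leave such an association non-significant even when the point estimate is below 1, the following sketch computes an unadjusted carrier-versus-non-carrier odds ratio with a Wald 95% CI from a 2×2 table. This is a simplification of the logistic regression reported in Fig. 5B (no covariates), and the counts in the usage example are hypothetical, not taken from the study.

```python
from math import exp, log, sqrt


def carrier_odds_ratio(case_carriers, case_noncarriers,
                       ctrl_carriers, ctrl_noncarriers, z_crit=1.959964):
    """Unadjusted OR of disease for carriers vs. non-carriers with a Wald 95% CI."""
    cells = [case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers]
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or), exp(log_or - z_crit * se), exp(log_or + z_crit * se)


if __name__ == "__main__":
    # Hypothetical counts: 2 carriers among 528 cases, 20 carriers among 2316 controls.
    or_, lower, upper = carrier_odds_ratio(2, 526, 20, 2296)
    print(f"OR={or_:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With only a handful of carriers among cases, the 1/a term dominates the standard error, so the interval is wide and can span 1 despite a protective point estimate.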

3.6. Validating prioritized targets through in vitro assays

High glucose (HG)-induced HK2 cell injury and apoptosis is a classic in vitro model of DKD, commonly used for screening potential targets and intervention agents. After 48 h of HG treatment, immunofluorescence results revealed a significant increase in components of the classic apoptotic pathway (NLRP3, caspase-1, and GSDMD) and in the potential targets (FAM3C and IGFBP4) (Fig. 6A). We further evaluated the effects of inhibiting FAM3C and IGFBP4 (through siRNAs) on inflammation and apoptosis in HG-induced HK2 cells. As shown in Figs. 6B and C, silencing FAM3C alleviated inflammation and cell damage in HK2 cells, downregulating NLRP3, caspase-1, and GSDMD, while IGFBP4 silencing reduced the active N-terminal domain of GSDMD (NT-GSDMD).

Fig. 6.

The anti-apoptotic effect of family with sequence similarity 3 member C (FAM3C) and insulin-like growth factor binding protein 4 (IGFBP4) in high glucose (HG)-induced HK2 cell apoptosis. (A) Immunofluorescence staining results of FAM3C, IGFBP4, prostaglandin D2 synthase (PTGDS), caspase-1, gasdermin D (GSDMD), and NLR family pyrin domain containing 3 (NLRP3) in HK2 cells under normal conditions (NC) (50 mM mannitol) and HG (50 mM glucose) treatment for 48 h. Nuclei were stained with 4′,6-diamidino-2-phenylindole (DAPI) (blue), and the target proteins are shown in red. (B, C) The anti-apoptotic effect of FAM3C (B) and IGFBP4 (C) silencing. Control and HG group cells were transfected with FAM3C or IGFBP4 small interfering RNA (siRNA) for 48 h, respectively. FAM3C/IGFBP4 siRNA vs. NC siRNA, ∗∗∗P < 0.001, ∗∗P < 0.01, ∗P < 0.05. (D, E) The anti-apoptotic effect of the potential agents Lisinopril (D) and Telmisartan (E). Therapeutic agent vs. NC siRNA, ∗∗∗P < 0.001, ∗∗P < 0.01, ∗P < 0.05.

Additionally, we selected two potential inhibitors, Lisinopril and Telmisartan, to evaluate their effects. Fig. 6D shows that Lisinopril significantly inhibited the expression of NLRP3 and GSDMD in HK2 cells, with a slight reduction in FAM3C and caspase-1. Telmisartan significantly suppressed IGFBP4 and the NLRP3-caspase-1 axis (Fig. 6E). Taken together, both in vivo and in vitro results suggest that FAM3C and IGFBP4 may serve as potential targets for DKD intervention, with their inhibition potentially alleviating DKD progression.

4. Discussion

We identified 37 prioritized targets with observational and causal relevance for DKD by integrating large-scale plasma proteomics and genetic-driven causal inference. Colocalization analyses further supported the causality of several targets, including IGFBP4, FAM3C, and PTGDS. In the UKB prospective study, diabetic individuals with higher levels of IGFBP4, PTGDS, and FAM3C had 4.35, 3.57, and 3.51 times higher risks of incident DKD, respectively. Immunostaining experiments and Nephrotic Syndrome Study cohorts revealed markedly elevated expression of IGFBP4, FAM3C, and PTGDS in DKD kidney tissues across humans and mice. Genetic inhibition at the population level and in vitro experiments suggested that IGFBP4 and FAM3C might be potential targets for DKD.

To our knowledge, this study provides the most comprehensive assessment of associations between 2,923 proteins and DKD based on the UKB and FinnGen cohorts. Among the 37 prioritized targets with cross-validation evidence, 10 proteins have been consistently linked to incident DKD: AGER, ANGPTL3, CFD, CX3CL1, IGFBP1, IGFBP4, NBL1, PTGDS, TNF receptor superfamily member 1B (TNFRSF1B), and TNFRSF4. The AGER inhibitor azeliragon is undergoing phase II clinical trials for DKD (ClinicalTrials.gov: NCT00287183). CX3CL1 and CFD were also reported as potential targets for DKD [30]. Notably, 22% (8/37) of prioritized targets are currently under investigation for DKD or other diseases.

Notably, 5 proteins without drug development are involved in the regulation of IGF transport and uptake (IGFBP4, IGFBP1, FSTL1, SHISA5, and CHRDL1). Our study identified IGFBP4 as the strongest risk factor for DKD, while IGFBP1 showed a negative causal relationship with DKD. IGFBP4 and IGFBP1 are expressed in the liver and kidneys and are involved in systemic IGF signaling, contributing to insulin resistance and glucose regulation. Similar to our findings, a study of younger Asian individuals with type 2 diabetes found a positive association between IGFBP4 and DKD risk [30]. DKD individuals have elevated plasma levels of IGFBP1 and IGFBP4 [31]. IGFBP4 fragments (including N- and C-terminal fragments, NT-IGFBP-4 and CT-IGFBP-4) are related to cardiovascular mortality in type 1 diabetes patients [32]. IGFBP4 is also highly expressed in the serum of CKD patients, correlating with kidney failure and reduced osteogenesis [33]. Our results indicated that genetic inhibition of IGFBP4 alleviates HG-induced apoptosis of HK2 cells and may reduce DKD events. Many studies have found that IGFBP1 is a protective factor for DKD, showing reduced expression in early type 2 DKD regulated by phosphoinositide 3-kinase (PI3K)-forkhead box O1 (FoxO1) activity in podocytes [34]. These findings highlight the potential value of IGFBP1 and IGFBP4 in DKD and warrant further intervention studies.

FAM3C is an important regulator of glucose and lipid metabolism, with evidence [35,36] showing reduced expression in the liver under diabetic conditions. Our in vivo and in vitro experiments demonstrated that FAM3C was significantly upregulated in plasma and kidneys of DKD patients. Genetic inhibition of FAM3C effectively suppressed the classic NLRP3-caspase-1-GSDMD apoptotic axis in HK2 cells. As reported, targeting family with sequence similarity 3 member B (FAM3B) and FAM3C significantly promoted autophagy, inhibited apoptosis, and provided protection against myocardial infarction in mice [37]. These findings suggest that FAM3C might be a potential therapeutic target for DKD.

Besides IGFBP4 and FAM3C, we demonstrated observational and causal associations of NBL1 and PTGDS with DKD development. However, our observational and causal evaluations were based on plasma proteomics data, which might not represent the actual state of kidney tissues. In contrast to the plasma findings, NBL1 was significantly decreased in kidney tissues of DKD patients according to the DKD kidney tissue dataset and related literature [38]. In addition, no significant difference in NBL1 expression was identified in renal biopsy specimens from our DKD patients (Fig. S4). These results suggest that NBL1 might not be a potential target for DKD; therefore, we excluded NBL1 from subsequent analyses. PTGDS, a key enzyme in the synthesis of prostaglandin D2 (PGD2), is positively associated with DKD risk in young Asian patients with type 2 diabetes [39,40].

The key strength of our study is the use of multiple validation approaches to ensure reliability, including cross-validation of observational and causal risk factors, colocalization analysis to confirm causality, prospective cohort studies to verify prognostic value, and genetic inhibition at the population level together with intervention experiments to validate potential drug targets. However, our study also had several limitations. First, owing to the Olink platform, only 2,923 proteins of the human plasma proteome were quantitated, representing just 1.5% of all known proteins, which likely introduced biases in the detection of secreted and low-abundance proteins. Second, although plasma samples are more accessible than kidney biopsy tissues, plasma measurements lack the tissue specificity needed to reliably nominate drug targets in DKD. Third, our study included participants of different ethnicities. Although we adjusted for ethnicity as a covariate and validated the observational findings in a Chinese population to reduce bias, further validation in other populations is needed. Fourth, because both the DKD patients and the diabetic mouse model had type 2 diabetes, further investigation is needed in patients with type 1 diabetes. Future studies involving knock-out animal models and clinical trials are essential, especially in different diabetes subtypes.

5. Conclusions

In conclusion, we identified 27 new and 10 known targets associated with incident DKD, supported by both observational and causal evidence. Notably, FAM3C and IGFBP4 emerge as potential novel targets for DKD, whose inhibition via RNA interference or agents effectively alleviated the classic NLRP3-caspase-1-GSDMD apoptotic pathway. Our findings highlight the effectiveness of integrating omics data mining and causal inference for prioritizing drug targets.

CRediT authorship contribution statement

Junyu Zhang: Writing – original draft, Visualization, Validation, Software, Methodology. Jie Peng: Writing – original draft, Visualization, Validation, Project administration, Methodology, Formal analysis, Data curation. Chaolun Yu: Writing – review & editing, Resources, Funding acquisition, Conceptualization. Yu Ning: Writing – review & editing, Funding acquisition, Conceptualization. Wenhui Lin: Writing – review & editing, Methodology. Mingxing Ni: Writing – review & editing, Formal analysis, Data curation. Qiang Xie: Methodology, Supervision, Validation, Writing – review & editing. Chuan Yang: Writing – review & editing, Supervision, Conceptualization. Huiying Liang: Writing – review & editing, Supervision, Resources, Methodology, Conceptualization. Miao Lin: Writing – review & editing, Writing – original draft, Supervision, Project administration, Methodology, Funding acquisition, Conceptualization.

Data and resource availability

This study utilized the UKB resource under application number 33952. Raw UKB data, including individual medical conditions, proteomics, and others, can be accessed by researchers through registration with and approval by UKB (https://www.ukbiobank.ac.uk/enable-your-research). GWAS summary statistics are available at UKBPPP (https://www.synapse.org/Synapse:syn51364943/wiki/622119) and FinnGen (https://www.finngen.fi/en/access_results) upon registration.

Declaration of competing interest

The authors declare that there are no conflicts of interest.

Acknowledgments

We thank all participants and investigators of the UKB and the FinnGen study. This study is supported by the National Natural Science Foundation of China (Grant Nos.: 82204396, 82304491, and 82400511).

Footnotes

Peer review under responsibility of Xi'an Jiaotong University.

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jpha.2025.101265.

Contributor Information

Qiang Xie, Email: xieqiang@xmu.edu.cn.

Chuan Yang, Email: yangch@mail.sysu.edu.cn.

Huiying Liang, Email: lianghuiying@gdph.org.cn.

Miao Lin, Email: linmiao@gdph.org.cn.

Appendix A. Supplementary data

The following are the Supplementary data to this article.

Multimedia component 1
mmc1.xlsx (1.8MB, xlsx)
Multimedia component 2
mmc2.docx (7.4MB, docx)

References

1. GBD 2021 Diabetes Collaborators. Global, regional, and national burden of diabetes from 1990 to 2021, with projections of prevalence to 2050: A systematic analysis for the Global Burden of Disease Study 2021. Lancet. 2023;402:203–234. doi: 10.1016/S0140-6736(23)01301-6.
2. Chew N.W.S., Ng C.H., Tan D.J.H., et al. The global burden of metabolic disease: Data from 2000 to 2019. Cell Metab. 2023;35:414–428.e3. doi: 10.1016/j.cmet.2023.02.003.
3. de Boer I.H., Khunti K., Sadusky T., et al. Diabetes management in chronic kidney disease: A consensus report by the American diabetes association (ADA) and kidney disease: Improving global outcomes (KDIGO). Diabetes Care. 2022;45:3075–3090. doi: 10.2337/dci22-0027.
4. Tomita I., Kume S., Sugahara S., et al. SGLT2 inhibition mediates protection from diabetic kidney disease by promoting ketone body-induced mTORC1 inhibition. Cell Metab. 2020;32:404–419.e6. doi: 10.1016/j.cmet.2020.06.020.
5. Wiviott S.D., Raz I., Bonaca M.P., et al. Dapagliflozin and cardiovascular outcomes in type 2 diabetes. N. Engl. J. Med. 2019;380:347–357. doi: 10.1056/NEJMoa1812389.
6. Liu X., Xu C., Xu L., et al. Empagliflozin improves diabetic renal tubular injury by alleviating mitochondrial fission via AMPK/SP1/PGAM5 pathway. Metabolism. 2020;111. doi: 10.1016/j.metabol.2020.154334.
7. Lundgren J.R., Janus C., Jensen S.B.K., et al. Healthy weight loss maintenance with exercise, liraglutide, or both combined. N. Engl. J. Med. 2021;384:1719–1730. doi: 10.1056/NEJMoa2028198.
8. Perkovic V., Tuttle K.R., Rossing P., et al. Effects of semaglutide on chronic kidney disease in patients with type 2 diabetes. N. Engl. J. Med. 2024;391:109–121. doi: 10.1056/NEJMoa2403347.
9. Sattar N., Lee M.M.Y., Kristensen S.L., et al. Cardiovascular, mortality, and kidney outcomes with GLP-1 receptor agonists in patients with type 2 diabetes: A systematic review and meta-analysis of randomised trials. Lancet Diabetes Endocrinol. 2021;9:653–662. doi: 10.1016/S2213-8587(21)00203-5.
10. Marso S.P., Bain S.C., Consoli A., et al. Semaglutide and cardiovascular outcomes in patients with type 2 diabetes. N. Engl. J. Med. 2016;375:1834–1844. doi: 10.1056/NEJMoa1607141.
11. Mori Y., Ajay A.K., Chang J.H., et al. KIM-1 mediates fatty acid uptake by renal tubular cells to promote progressive diabetic kidney disease. Cell Metab. 2021;33:1042–1061.e7. doi: 10.1016/j.cmet.2021.04.004.
12. Jun W., Makino H. Innate immunity in diabetes and diabetic nephropathy. Nat. Rev. Nephrol. 2016;12:13–26. doi: 10.1038/nrneph.2015.175.
13. Suhre K., McCarthy M.I., Schwenk J.M. Genetics meets proteomics: Perspectives for large population-based studies. Nat. Rev. Genet. 2021;22:19–37. doi: 10.1038/s41576-020-0268-2.
14. Holmes M.V., Richardson T.G., Ference B.A., et al. Integrating genomics with biomarkers and therapeutic targets to invigorate cardiovascular drug development. Nat. Rev. Cardiol. 2021;18:435–453. doi: 10.1038/s41569-020-00493-1.
15. Eldjarn G.H., Ferkingstad E., Lund S.H., et al. Large-scale plasma proteomics comparisons through genetics and disease associations. Nature. 2023;622:348–358. doi: 10.1038/s41586-023-06563-x.
16. Bycroft C., Freeman C., Petkova D., et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z.
17. Kurki M.I., Karjalainen J., Palta P., et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023;613:508–518. doi: 10.1038/s41586-022-05473-8.
18. Li W. UK Biobank pharma proteomics resource. Nat. Genet. 2023;55. doi: 10.1038/s41588-023-01575-9.
19. Sun B.B., Chiou J., Traylor M., et al. Plasma proteomic associations with genetics and health in the UK Biobank. Nature. 2023;622:329–338. doi: 10.1038/s41586-023-06592-6.
20. Jannot A.S., Ehret G., Perneger T. P < 5 × 10 (-8) has emerged as a standard of statistical significance for genome-wide association studies. J. Clin. Epidemiol. 2015;68:460–465. doi: 10.1016/j.jclinepi.2015.01.001.
21. Hartwig F.P., Davies N.M., Hemani G., et al. Two-sample Mendelian randomization: Avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int. J. Epidemiol. 2016;45:1717–1726. doi: 10.1093/ije/dyx028.
22. Xu B., Sheng J., You Y., et al. Deletion of Smad3 prevents renal fibrosis and inflammation in type 2 diabetic nephropathy. Metabolism. 2020;103. doi: 10.1016/j.metabol.2019.154013.
23. Barutta F., Corbelli A., Mastrocola R., et al. Cannabinoid receptor 1 blockade ameliorates albuminuria in experimental diabetic nephropathy. Diabetes. 2010;59:1046–1054. doi: 10.2337/db09-1336.
24. Huang K., Chandak P., Wang Q., et al. A foundation model for clinician-centered drug repurposing. Nat. Med. 2024;30:3601–3613. doi: 10.1038/s41591-024-03233-x.
25. Gobeil É., Bourgault J., Mitchell P.L., et al. Genetic inhibition of angiopoietin-like protein-3, lipids, and cardiometabolic risk. Eur. Heart J. 2024;45:707–721. doi: 10.1093/eurheartj/ehad845.
26. Cheng J., Novati G., Pan J., et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science. 2023;381. doi: 10.1126/science.adg7492.
27. Rämö J.T., Jurgens S.J., Kany S., et al. Rare genetic variants in LDLR, APOB, and PCSK9 are associated with aortic stenosis. Circulation. 2024;150:1767–1780. doi: 10.1161/CIRCULATIONAHA.124.070982.
28. Liu J., Zhang L., Huang Y., et al. Epsin1-mediated exosomal sorting of Dll4 modulates the tubular-macrophage crosstalk in diabetic nephropathy. Mol. Ther. 2023;31:1451–1467. doi: 10.1016/j.ymthe.2023.03.027.
29. Ochoa D., Hercules A., Carmona M., et al. Open Targets Platform: Supporting systematic drug-target identification and prioritisation. Nucleic Acids Res. 2021;49:D1302–D1310. doi: 10.1093/nar/gkaa1027.
30. Hu Y., Tang W., Liu W., et al. Astragaloside IV alleviates renal tubular epithelial-mesenchymal transition via CX3CL1-RAF/MEK/ERK signaling pathway in diabetic kidney disease. Drug Des. Devel. Ther. 2022;16:1605–1620. doi: 10.2147/DDDT.S360346.
31. Segev Y., Landau D., Marbach M., et al. Renal hypertrophy in hyperglycemic non-obese diabetic mice is associated with persistent renal accumulation of insulin-like growth factor I. J. Am. Soc. Nephrol. 1997;8:436–444. doi: 10.1681/ASN.V83436.
32. Hjortebjerg R., Tarnow L., Jorsal A., et al. IGFBP-4 fragments as markers of cardiovascular mortality in type 1 diabetes patients with and without nephropathy. J. Clin. Endocrinol. Metab. 2015;100:3032–3040. doi: 10.1210/jc.2015-2196.
33. Kiepe D., Ulinski T., Powell D.R., et al. Differential effects of insulin-like growth factor binding proteins-1, -2, -3, and-6 on cultured growth plate chondrocytes. Kidney Int. 2002;62:1591–1600. doi: 10.1046/j.1523-1755.2002.00603.x.
34. Lay A.C., Hale L.J., Stowell-Connolly H., et al. IGFBP-1 expression is reduced in human type 2 diabetic glomeruli and modulates β1-integrin/FAK signalling in human podocytes. Diabetologia. 2021;64:1690–1702. doi: 10.1007/s00125-021-05427-1.
35. Chen Z., Ding L., Yang W., et al. Hepatic activation of the FAM3C-HSF1-CaM pathway attenuates hyperglycemia of obese diabetic mice. Diabetes. 2017;66:1185–1197. doi: 10.2337/db16-0993.
36. Zhang X., Yang W., Wang J., et al. FAM3 gene family: A promising therapeutical target for NAFLD and type 2 diabetes. Metabolism. 2018;81:71–82. doi: 10.1016/j.metabol.2017.12.001.
37. Ruozi G., Bortolotti F., Mura A., et al. Cardioprotective factors against myocardial infarction selected in vivo from an AAV secretome library. Sci. Transl. Med. 2022;14. doi: 10.1126/scitranslmed.abo0699.
38. Kobayashi H., Looker H.C., Satake E., et al. Neuroblastoma suppressor of tumorigenicity 1 is a circulating protein associated with progression to end-stage kidney disease in diabetes. Sci. Transl. Med. 2022;14. doi: 10.1126/scitranslmed.abj2109.
39. Ferrer-Martínez A., Felipe A., Barceló P., et al. Effects of cyclosporine A on Na, K-ATPase expression in the renal epithelial cell line NBL-1. Kidney Int. 1996;50:1483–1489. doi: 10.1038/ki.1996.462.
40. Gurung R.L., Zheng H., Koh H.W.L., et al. Plasma proteomics of diabetic kidney disease among Asians with younger-onset type 2 diabetes. J. Clin. Endocrinol. Metab. 2025;110:e239–e248. doi: 10.1210/clinem/dgae266.

Articles from Journal of Pharmaceutical Analysis are provided here courtesy of Xi'an Jiaotong University
