BMC Cancer. 2025 Jun 1;25:982. doi: 10.1186/s12885-025-14357-9

Targeting INF2 with DiosMetin 7-O-β-D-Glucuronide: a new stratagem for colorectal cancer therapy

Zhirui Zeng 1,2,#, Yun Ke 1,#, Fei Huang 1,#, Hangyi Li 1, Xiaomin Zhang 2, Dahuan Li 2, Yingmin Wu 2, Tengxiang Chen 2, Yunhuan Zhen 1
PMCID: PMC12128233  PMID: 40452021

Abstract

Background and purpose

Colorectal cancer (CRC) is the third most prevalent malignancy in the gastrointestinal tract and the second leading cause of cancer-related deaths. Despite the identification of numerous biomarkers, their non-specific distribution across different cell types complicates the development of targeted therapies. Therefore, this study aims to identify specific biomarkers for CRC and utilize them for the development of targeted therapies.

Methods

Single-cell RNA sequencing and machine learning were used to analyze INF2 localization in CRC tissues and its prognostic value. Immunohistochemistry, cell biology assays, and computational docking were employed to assess INF2's clinical significance and identify inhibitors.

Results

INF2 was identified as a key prognostic marker in CRC, with elevated expression in advanced-stage tissues. Knockdown of INF2 inhibited CRC cell proliferation and mobility. DiosMetin 7-O-β-D-Glucuronide, a natural compound, selectively inhibited INF2-high CRC cells with minimal toxicity to normal cells.

Conclusion

INF2 is a CRC-specific biomarker associated with poor prognosis. DiosMetin 7-O-β-D-Glucuronide, as an INF2 inhibitor, shows promise as a targeted therapeutic agent for CRC treatment.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12885-025-14357-9.

Keywords: Colorectal cancer, INF2, DiosMetin 7-O-β-D-Glucuronide, Therapy

Introduction

Colorectal cancer (CRC) is the third most prevalent malignancy of the gastrointestinal tract, surpassed only by lung cancer in overall incidence [1]. Advancements in endoscopic technology have significantly enhanced the diagnostic and therapeutic capabilities for both precancerous lesions and early-stage CRC, resulting in a 5-year survival rate exceeding 90%, with some patients achieving clinical remission [2]. Conversely, the prognosis for patients with mid to late-stage CRC remains poor, with a 5-year survival rate of less than 20%, despite the implementation of various treatment modalities, including surgical intervention, chemotherapy, and targeted therapies [3, 4]. Consequently, the identification of novel biomarkers and therapeutic drugs for CRC is imperative.

Numerous biomarkers have been identified in CRC. For instance, Wang et al. demonstrated a significant upregulation of POLR1D in CRC through an analysis of the Oncomine database. Subsequent immunohistochemical staining revealed that POLR1D expression levels were positively correlated with tumor size and poor survival rates in CRC patients [5]. Additionally, Gou et al. reported an upregulation of TRIP6 in CRC tissues. Using animal models, they further established that overexpression of TRIP6 accelerated the occurrence and submucosal infiltration of CRC [6]. Liu et al. employed bioinformatics analysis and experimental approaches to demonstrate a significant correlation between FSTL3 expression and poor prognosis in CRC patients. Their findings indicate that FSTL3 facilitates epithelial-mesenchymal transition by enhancing the interaction between FN1 and α5β1 [7]. Similarly, Huang et al. discovered that SPON2, derived from tumor cells, promotes the invasion of CRC cells. Furthermore, high SPON2 expression was associated with reduced overall survival and disease-free survival [8]. However, these biomarkers lack cell-type specificity; they are expressed not only in tumor cells but also in immune cells and stromal cells. Consequently, therapeutic agents targeting these biomarkers may inadvertently affect non-tumor cells, including immune and stromal cells. Therefore, the identification and development of drugs targeting specific biomarkers in CRC could enhance the efficacy and precision of CRC therapy.

INF2 is an atypical diaphanous-related formin that plays a role in regulating the polymerization and depolymerization of actin [9]. Mutations in the INF2 gene have been associated with the development of focal segmental glomerulosclerosis [10]. Furthermore, INF2 expression has been found to be upregulated in various cancer cell types. For instance, INF2 is upregulated in glioblastoma cells, and its knockdown can significantly reduce the migration of glioblastoma cells [11]. The expression of INF2 is notably elevated in triple-negative breast cancer that exhibits markers of basal-like differentiation. Knockdown of INF2 significantly alters cell morphology and reduces cell proliferation [12]. Additionally, INF2 is phosphorylated by AMPK, which promotes the proliferation of endometrial cancer cells [13]. However, the clinical significance of INF2 in CRC remains inadequately understood.

Herein, we demonstrated that INF2 functions as a relatively specific biomarker for CRC, primarily localized in cancer cells of CRC tissues, and significantly influences the prognosis of CRC patients. The knockdown of INF2 led to a reduction in both cell proliferation and motility. Additionally, DiosMetin 7-O-β-D-Glucuronide was identified as a prominent inhibitor of INF2, exhibiting minimal toxicity and side effects. Therefore, DiosMetin 7-O-β-D-Glucuronide presents potential as a therapeutic agent for the treatment of CRC.

Materials and methods

Single-cell RNA (scRNA) sequencing and enrichment analysis

We obtained the scRNA sequencing data from the Gene Expression Omnibus (GEO) dataset GSM5075683 (https://www.ncbi.nlm.nih.gov/gds) [14]. The scRNA-seq data were analyzed using the Seurat package [15]. Initially, genes expressed in fewer than three cells and cells with fewer than 200 features were excluded. Cells exhibiting between 200 and 2,500 uniquely expressed genes and mitochondrial counts of 5% or less were selected for further analysis. Subsequently, the LogNormalize method from the Seurat package was used to normalize the expression levels of all genes in the included cells. Using the FindVariableFeatures function in the Seurat package, we calculated the standardized variance of all genes with the Variance Stabilizing Transformation (VST) method. The 2,000 genes with the highest standardized variance were taken as the most variably expressed genes and used for further study. Single-cell RNA-seq data typically encompass thousands to tens of thousands of genes (high-dimensional data), which makes direct analysis computationally intensive and susceptible to noise. Principal Component Analysis (PCA) is a statistical technique that reduces the dimensionality of a dataset while preserving most of its variability. It transforms a set of correlated variables into a smaller set of uncorrelated variables, known as principal components (PCs) [16]. To facilitate cell clustering, we first performed PCA on the 2,000 genes with the highest expression variance, generating 20 PCs. We then used the ElbowPlot function in the Seurat package to confirm that all 20 PCs were suitable for subsequent cell clustering analysis.
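
The filtering and normalization thresholds described above can be illustrated with a minimal sketch. This is not the Seurat code used in the study; it is a plain-Python approximation in which the function names, the toy cell, and the "MT-CO1" mitochondrial gene label are assumptions made for illustration, and the scale factor of 10,000 is the usual default for log-normalization rather than a value stated in the text.

```python
import math
from collections import Counter

def genes_in_enough_cells(cells, min_cells=3):
    """Return the genes detected (count > 0) in at least `min_cells` cells."""
    detected = Counter(g for counts in cells for g, c in counts.items() if c > 0)
    return {g for g, n in detected.items() if n >= min_cells}

def passes_qc(counts, mito_genes, min_genes=200, max_genes=2500, max_mito_pct=5.0):
    """Check one cell (gene -> raw count) against the stated QC thresholds."""
    total = sum(counts.values())
    if total == 0:
        return False
    n_detected = sum(1 for c in counts.values() if c > 0)
    mito_pct = 100.0 * sum(c for g, c in counts.items() if g in mito_genes) / total
    return min_genes <= n_detected <= max_genes and mito_pct <= max_mito_pct

def log_normalize(counts, scale_factor=10_000):
    """Log-normalize a QC-passing cell: log1p(count / cell total * scale_factor)."""
    total = sum(counts.values())
    return {g: math.log1p(c / total * scale_factor) for g, c in counts.items()}

# Toy cell with hypothetical counts; thresholds are lowered so the example stays small.
toy_cell = {"GENE_A": 3, "GENE_B": 1, "MT-CO1": 0}
toy_ok = passes_qc(toy_cell, mito_genes={"MT-CO1"}, min_genes=2, max_genes=5)
toy_norm = log_normalize(toy_cell)
```

With the real thresholds (200 to 2,500 detected genes, mitochondrial fraction of 5% or less), the same check would be applied to every cell before normalization.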

Based on the 20 PCs, Uniform Manifold Approximation and Projection (UMAP) analysis was used to perform cell clustering with a resolution parameter of 0.5. We utilized the FindMarkers function in the Seurat package to identify differentially expressed genes (DEGs) between cell clusters. Genes with an absolute log fold change ≥ 0.25 and a p-value < 0.05 were considered characteristic genes for the respective subpopulation. The top 10 characteristic genes with the largest fold changes in each cell cluster were selected for cell annotation, which was performed with the online tool CellMarker 2.0 (http://bio-bigdata.hrbmu.edu.cn/CellMarker/) [17]. Following this, clusters of the same cell type were merged. Specific genes exhibiting high abundance in each cell type were selected based on a threshold of expression in 25% of the cell types and a 0.25-fold difference in their expression levels.
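
The selection rule for characteristic genes can be sketched as a simple filter. The snippet below is an illustration only: the function name and the toy rows (genes "G1" to "G4") are hypothetical, and ranking by absolute log fold change is an assumption about how "largest fold change" was applied.

```python
def characteristic_genes(deg_rows, min_abs_logfc=0.25, max_p=0.05, top_n=10):
    """Select marker genes for one cluster.

    deg_rows: iterable of (gene, log_fold_change, p_value) tuples.
    Keeps genes with |logFC| >= min_abs_logfc and p < max_p, then returns
    the top_n gene names ranked by |logFC| (largest first).
    """
    kept = [r for r in deg_rows if abs(r[1]) >= min_abs_logfc and r[2] < max_p]
    kept.sort(key=lambda r: abs(r[1]), reverse=True)
    return [gene for gene, _, _ in kept[:top_n]]

# Hypothetical differential-expression rows for a single cluster.
toy_rows = [
    ("G1", 1.2, 0.001),   # passes both thresholds
    ("G2", 0.1, 0.001),   # fold change too small
    ("G3", -0.8, 0.01),   # passes both thresholds
    ("G4", 2.0, 0.2),     # p-value too large
]
toy_markers = characteristic_genes(toy_rows)
print(toy_markers)  # ['G1', 'G3']
```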

Furthermore, we utilized SangerBox (http://sangerbox.com/) [18] to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses on these genes. In brief, for GO enrichment analysis, SangerBox leveraged the GO annotations from the R package org.Hs.eg.db (version 3.1.0) as the background dataset. The top 200 characteristic genes of cancer cells were mapped to this background, and enrichment analysis was performed using the R package clusterProfiler (version 3.14.3) to identify enriched gene sets. For KEGG enrichment analysis, SangerBox retrieved the latest KEGG Pathway gene annotations via the KEGG REST API (https://www.kegg.jp/kegg/rest/keggapi.html) as the background dataset. The top 200 tumor cell characteristic genes were mapped to this background, and enrichment analysis was conducted using clusterProfiler (version 3.14.3) to obtain enriched gene sets. The minimum gene set size was set to 5, the maximum to 5000, and a p-value < 0.05 was deemed statistically significant.
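
Over-representation analyses of this kind are typically based on a one-sided hypergeometric test. The following sketch assumes that test and applies the gene set size limits (5 to 5000) and the p < 0.05 cutoff given above; the function names and the toy gene sets are hypothetical, and no multiple-testing correction is applied because none is described in the text.

```python
from math import comb

def hypergeom_tail(n_background, n_in_set, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn without replacement
    from n_background genes, of which n_in_set belong to the gene set."""
    upper = min(n_in_set, n_query)
    hits = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return hits / comb(n_background, n_query)

def enrich(query, gene_sets, background, min_size=5, max_size=5000, alpha=0.05):
    """Return {gene set name: p-value} for gene sets enriched in the query."""
    background = set(background)
    query = set(query) & background
    results = {}
    for name, members in gene_sets.items():
        members = set(members) & background
        if not (min_size <= len(members) <= max_size):
            continue  # gene set outside the allowed size range
        overlap = len(query & members)
        p = hypergeom_tail(len(background), len(members), len(query), overlap)
        if p < alpha:
            results[name] = p
    return results

# Toy background of 20 hypothetical genes and three hypothetical gene sets.
toy_background = [f"g{i}" for i in range(20)]
toy_sets = {
    "set_A": [f"g{i}" for i in range(0, 5)],    # 5 genes, 4 of them in the query
    "set_B": [f"g{i}" for i in range(5, 8)],    # 3 genes, skipped (below min size)
    "set_C": [f"g{i}" for i in range(10, 20)],  # 10 genes, 1 of them in the query
}
toy_query = ["g0", "g1", "g2", "g3", "g10"]
toy_enriched = enrich(toy_query, toy_sets, toy_background)
```

In this toy case only set_A is reported, with p = 76/15504 (about 0.0049).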

Data pretreatment and acquisition of TCGA cohort

The expression profiles of CRC tissues from TCGA were obtained through the GDC Data Portal. First, we accessed the GDC website (https://portal.gdc.cancer.gov/) and selected the projects "TCGA-COAD" and "TCGA-READ". We then chose the desired data type, added it to the cart, and downloaded the files. The colon cancer (COAD) and rectal cancer (READ) datasets were merged, and normal samples, as well as samples lacking survival status or survival time, were excluded. This yielded 452 CRC patients with complete overall survival (OS) times and survival status for further study. Next, probe names were converted to gene symbols. Genes (rows) and samples (columns) with more than 50% missing values were removed. Subsequently, missing values were imputed using the impute.knn function in R. Finally, the data were transformed using log2(x + 1).
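
The missing-value filter and the log2(x + 1) transform can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the function names and toy matrix are hypothetical, rows are filtered before columns (the order is an assumption, since the text does not state it), and the k-nearest-neighbour imputation step is deliberately left out.

```python
import math

def drop_sparse(matrix, max_missing=0.5):
    """Remove genes (rows), then samples (columns), with > max_missing missing values.

    matrix: dict mapping gene -> list of values, with None marking a missing value.
    """
    rows = {
        g: v for g, v in matrix.items()
        if sum(x is None for x in v) / len(v) <= max_missing
    }
    if not rows:
        return {}
    n_cols = len(next(iter(rows.values())))
    keep = [
        j for j in range(n_cols)
        if sum(v[j] is None for v in rows.values()) / len(rows) <= max_missing
    ]
    return {g: [v[j] for j in keep] for g, v in rows.items()}

def log2_plus_one(matrix):
    """Apply log2(x + 1) to every non-missing value."""
    return {
        g: [None if x is None else math.log2(x + 1) for x in v]
        for g, v in matrix.items()
    }

# Hypothetical expression matrix: 3 genes x 4 samples.
toy_matrix = {
    "G1": [1.0, None, 3.0, 7.0],
    "G2": [None, None, None, 1.0],   # 75% missing, so the gene is dropped
    "G3": [0.0, None, 15.0, 3.0],
}
toy_clean = log2_plus_one(drop_sparse(toy_matrix))
print(toy_clean)  # {'G1': [1.0, 2.0, 3.0], 'G3': [0.0, 4.0, 2.0]}
```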

Machine learning

The prognostic values of the genes highly expressed in cancer cells of CRC tissues were analyzed in the TCGA cohort via various machine learning methods. First, the expression of the top 200 genes exhibiting specific high expression in cancer cells was extracted from the TCGA cohort and used to perform univariate COX regression analysis to select genes associated with patient prognosis (dead or alive). To further confirm the relationship between the expression of these genes and CRC patient survival status, machine learning methods were applied.

The support vector machine recursive feature elimination (SVM-RFE) method was performed using the 'e1071' package (version: 1.7–16). In detail, a linear SVM model was trained using gene expression data and CRC patient survival status, calculating the weight of each gene’s contribution to survival classification. Genes were then ranked by weight, and the gene with the smallest weight was removed. The SVM was retrained on the remaining genes, weights were updated, and this recursive process continued until the desired number of features was reached or all features were eliminated.
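
The recursive elimination loop itself is independent of the model that supplies the weights, and can be sketched in a few lines. In the sketch below the weight function is a stand-in (absolute difference in class means), not a linear SVM, so it illustrates only the elimination procedure; all names and the toy data are hypothetical.

```python
from statistics import mean

def recursive_feature_elimination(features, fit_weights, n_keep=1):
    """Rank features by repeatedly refitting and dropping the weakest one.

    fit_weights(remaining) must return {feature: weight} for the features
    still in play. Returns all features ordered from most to least important
    (survivors first, then eliminated features in reverse order of removal).
    """
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        weights = fit_weights(remaining)           # refit on remaining features
        weakest = min(remaining, key=lambda f: abs(weights[f]))
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining + eliminated[::-1]

# Hypothetical expression values for three features and a binary outcome.
toy_x = {"A": [0, 0, 5, 5], "B": [1, 1, 2, 2], "C": [3, 3, 3, 3]}
toy_y = [0, 0, 1, 1]

def mean_difference_weights(remaining):
    """Stand-in scorer (NOT an SVM): |mean in class 1 - mean in class 0|."""
    out = {}
    for f in remaining:
        pos = [x for x, y in zip(toy_x[f], toy_y) if y == 1]
        neg = [x for x, y in zip(toy_x[f], toy_y) if y == 0]
        out[f] = abs(mean(pos) - mean(neg))
    return out

toy_ranking = recursive_feature_elimination(toy_x, mean_difference_weights)
print(toy_ranking)  # ['A', 'B', 'C']
```

With a real linear SVM the weights would change after each refit, which is the reason the procedure is recursive rather than a single ranking pass.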

Boruta random forest analysis was performed using the 'Boruta' (version: 8.0.0) and 'randomForest' (version: 3.3.1) packages. In detail, we constructed a random forest model using the expression data of the input genes to evaluate each gene’s importance in predicting CRC patient survival status. Subsequently, “shadow genes” were generated by randomly shuffling the expression values of the original genes, and their importance was compared to that of the original genes. Genes significantly outperforming their shadow counterparts were marked as important. This process was repeated multiple times, with gene importance ranked using Z-scores, ultimately yielding a list of important genes.

Least absolute shrinkage and selection operator (LASSO) regression analysis was performed using the R package 'glmnet' (version: 4.1.8). In detail, we integrated gene expression data with CRC patient survival status. A LASSO-Cox model was constructed using the glmnet package, employing L1 regularization to automatically select important genes. Additionally, cross-validation was performed to identify the optimal regularization parameter (λ), and genes with non-zero coefficients were retained as key genes.
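
For reference, an L1-penalized Cox model of this kind can be written as the minimization below. The notation is introduced here for illustration and is not taken from the study: x_i is the expression vector of patient i, δ_i the event indicator, R(t_i) the set of patients still at risk at time t_i, n the number of patients, and p the number of genes. Tied event times are ignored, and the scaling of the likelihood term in any particular software implementation may differ by a constant factor.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta}
    \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta)
  = \sum_{i:\,\delta_i = 1}
    \left[ x_i^{\top}\beta - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]
```

Larger values of λ shrink more coefficients exactly to zero, which is why genes with non-zero coefficients at the cross-validated λ are the ones retained.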

The Decision Tree method was performed using the R package 'rpart' (version: 4.1.24). In detail, we integrated gene expression data with CRC patient survival status into R software. Then, the rpart package was employed to construct a decision tree model targeting survival status. The model recursively partitions gene expression features to generate a tree structure. Feature importance was evaluated using cross-validation or information gain, enabling the selection of genes with the greatest contribution to classification.

Binomial logistic regression analysis was performed in the R package 'glmnet' (version: 4.1.8). In detail, we integrated gene expression data with CRC patient survival status into R software. Then, we fitted binomial logistic regression models for each gene individually to evaluate its regression coefficient and p-value, identifying significant genes with p < 0.05.

CRC tissues collection and Immunohistochemical analysis

A total of 60 pairs of CRC tissues and corresponding adjacent tissues were collected from the Affiliated Hospital of Guizhou Medical University, following approval by the Human Ethics Committee of the same institution (Approval number: 2024–076). The CRC and adjacent tissues were fixed, dehydrated, embedded in paraffin, and subsequently sectioned into 2-µm slices. These sections were then deparaffinized and rehydrated using xylene (Sinopharm, Beijing, China) and a descending alcohol series (Sinopharm, Beijing, China), respectively. Following antigen retrieval using 100 mM sodium citrate (ZSGB-BIO, Beijing, China), the tissue sections were blocked with 3% hydrogen peroxide (Sinopharm, Beijing, China) and 5% bovine serum albumin (Boster, Wuhan, China) at room temperature. Subsequently, the sections were incubated with an anti-INF2 antibody (1:1000; cat no. 20466–1-AP, Proteintech, Wuhan, China), anti-KI67 (1:4000; cat no. 27309–1-AP, Proteintech), and anti-PCNA (1:5000; cat no. 27309–1-AP, Proteintech) overnight at 4 °C. After this incubation period, the sections were treated with a secondary antibody for 2 h. The slides were then stained using a Cell and Tissue Staining HRP-3,3′-diaminobenzidine kit (ZSGB-BIO, Beijing, China), followed by nuclear counterstaining with 0.2% hematoxylin (ZSGB-BIO, Beijing, China) for 1 min. The staining was visualized using a light microscope at a magnification of × 200 and × 400.

Cell culture

The CRC cell lines SW480 (catalog no. CL-0223) and SW620 (catalog no. CL-0225), along with human normal colon smooth muscle cells (CSMC; catalog no. CP-H037), human normal colonic mucosal epithelial cells (CMET; catalog no. CP-H040), and human peripheral blood mononuclear cells (PBMC; catalog no. CM-H158), were procured from Procell (https://www.procell.com.cn/). The SW480 and SW620 cells were cultured in DMEM medium (Hyclone, USA) supplemented with 10% FBS (Hyclone, USA). The CSMC, CMET, and PBMC cells were maintained in their respective culture media (catalog no. CM-H037, catalog no. CM-H040, and catalog no. CM-H158) obtained from Procell. All cells were cultured in an atmosphere containing 5% CO2 at a temperature of 37 °C.

Transfection of siRNAs and shRNAs

INF2-targeting siRNAs were procured from RIBOBIO, Guangzhou, China. The following siRNA sequences were utilized: si1-INF2, GCAGTACCGCTTCAGCATT; si2-INF2, GAAGCAGGTGTTTGAGCTA; si3-INF2, TGCCCTCTGTGGTCAACTA. The shRNA lentivirus was constructed based on the si2-INF2 sequence and included a puromycin resistance gene.

For siRNA transfection, SW480 and SW620 cells were seeded at a density of 1 × 10⁵ cells per well in 6-well plates. After cell adhesion, 2 μL of Lipofectamine 2000 reagent (Invitrogen, USA) and the corresponding negative control (NC, 20 nM) or INF2-targeting siRNAs (si-INF2, 20 nM) were added. Following a 6-h incubation, the siRNA-containing medium was discarded and replaced with fresh medium supplemented with 10% FBS. After an additional 48 h of culture, silencing efficiency of siRNAs was assessed using qPCR and western blotting.

For shRNA transfection, SW480 cells were seeded at a density of 1 × 10⁴ cells in 25 cm² culture flasks. After cell adhesion, 2 μL of polybrene (Boster, Wuhan, China) was added, followed by the introduction of NC or sh-INF2 lentivirus (MOI = 10). After 24 h, the virus-containing medium was discarded and replaced with fresh medium containing 0.5 mg/mL puromycin (Boster, Wuhan, China). Cells were cultured for 14 days under puromycin selection to establish a stable INF2-knockdown SW480 cell line.

Real-time quantitative PCR

SW480 and SW620 cells were seeded at a density of 1 × 10⁵ cells per well in 6-well plates. After 48 h of siRNA transfection, total RNA was extracted from these cells using the RNA-Quick Purification Kit reagent (Esunbio, Shanghai, China). Reverse transcription was performed with the PrimeScript™ RT Master Mix (Perfect Real Time) (Takara, Japan), followed by PCR amplification using the TB Green® Premix Ex Taq™ II (Tli RNaseH Plus) (Takara, Japan). The relative expression of INF2 was normalized to ACTB. The following primers were utilized in this study: INF2-forward, 5ʹ-ACGTGGTCACCCTGCTTAG-3ʹ; INF2-reverse, 5ʹ-CAGCCCGATAAACTCGTTCC-3ʹ; ACTB-forward, 5ʹ-CCTGGCACCCAGCACAAT-3ʹ; and ACTB-reverse, 5ʹ-CGGCCGGACTCGTCATAC-3ʹ.
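
The text does not name the quantification method, but normalization to a reference gene is commonly done with the 2^-ΔΔCt calculation. The sketch below assumes that method; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative expression by the 2^-ddCt method (assumed, not stated in the text).

    ct_target / ct_reference: Ct of the gene of interest (e.g. INF2) and of the
    reference gene (e.g. ACTB) in the treated sample; the *_ctrl arguments are
    the same measurements in the control sample.
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_sample - delta_control)

# Hypothetical Ct values: knockdown sample vs. negative control.
toy_fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(toy_fold)  # 0.25
```

A value of 0.25 would correspond to a 75% reduction in the target transcript relative to the control.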

Western blotting analysis

SW480 and SW620 cells were seeded at a density of 1 × 10⁵ cells per well in 6-well plates. After 48 h of siRNA transfection, PMSF (Solarbio, Jiangsu, China) and RIPA (Solarbio, Jiangsu, China) reagent were combined in a 1:100 ratio and subsequently added to these treated cells. The protein samples were quantified using the BCA method and then loaded onto a 7.5% SDS-PAGE gel (UElandy, Tianjin, China) at 30 μg per lane. Following electrophoresis at 120 V for 2 h, the proteins were transferred onto a PVDF membrane (Millipore, USA) at 310 mA for 2 h. Following a 2 h blocking period at room temperature with 5% non-fat milk in TBST, the membrane was incubated overnight at 4 °C with primary antibodies including INF2 (1:2000, Cat. No. 20466–1-AP, Proteintech, Wuhan, China) and β-actin (1:5000, Cat. No. AP0060, Bioworld, USA). Subsequently, the membrane was washed three times with TBST, each wash lasting 6 min. The membrane was then incubated with the secondary antibody at room temperature for 2 h, followed by three additional washes with TBST. Finally, the bands were visualized using ECL reagent (Biosharp, Beijing, China). The β-actin standard was employed to normalize the expression levels of INF2.

Cell Counting Kit-8 (CCK-8) assays

Cell proliferation was assessed using the Cell Counting Kit-8 (CCK-8) assay. NC or si-INF2 cells were seeded into 96-well plates at a density of 3000 cells per well in 100 μL of medium and incubated at 37 °C with 5% CO2. At the specified time points, 10 μL of CCK-8 reagent (UElandy, Tianjin, China) was added to each well and incubated for 2 h. Subsequently, the absorbance at 450 nm for each well was measured using a microplate reader.

EdU cell proliferation assay

SW480 and SW620 cells (NC and si-INF2) in the logarithmic growth phase were evenly seeded into 6-well plates containing coverslips at a density of 1 × 10⁴ cells per well. After an overnight incubation to allow for cell adhesion, an EdU working solution (Beyotime, Jiangsu, China) was prepared according to the manufacturer's instructions and added to the 6-well plates for a 2-h incubation at 37 °C. Subsequently, the cells were fixed, washed, and permeabilized. The Click Additive Solution was then added to the 6-well plates to facilitate the click reaction. Finally, following staining and sealing procedures, fluorescence scanning detection was conducted.

Wound healing assays

The NC and si-INF2 cells were uniformly seeded into 6-well plates at a density of 2 × 10⁵ cells per well and cultured in an incubator maintained at 37 °C with 5% CO2. Upon reaching 90% confluence, a uniform wound was created in each well using a 10 μL pipette tip. The cells were then washed twice with PBS to remove any dislodged cells, followed by the addition of serum-free media. Multiple positioning marks were made at the center of the denuded area to ensure comparability of wounds with identical wound areas. Images of the scratch zones were captured using inverted microscopy at 0 and 24 h. ImageJ software was used to measure the wound areas and assess the migration ability of the CRC cells.
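
The text does not give the formula used to quantify migration. A common choice is the percentage of the initial wound area closed after 24 h, sketched below under that assumption; the function name and the area values are hypothetical.

```python
def wound_closure_percent(area_0h, area_24h):
    """Percentage of the initial scratch area closed after 24 h (assumed metric)."""
    if area_0h <= 0:
        raise ValueError("initial wound area must be positive")
    return 100.0 * (area_0h - area_24h) / area_0h

# Hypothetical wound areas measured in pixels at 0 h and 24 h.
toy_closure = wound_closure_percent(2000.0, 500.0)
print(toy_closure)  # 75.0
```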

Transwell assays

CRC cells from the NC group and CRC cells efficiently transfected with INF2-siRNA were resuspended in serum-free DMEM medium and placed in the upper chamber of a cell culture insert (LABSELECT, China) pre-coated with 6% Matrigel (Corning, USA) at a density of 3 × 10⁴ cells per well. The lower chamber contained DMEM medium supplemented with 10% FBS. After a 24-h incubation period, the cell culture inserts were fixed with 4% paraformaldehyde and stained with 0.5% crystal violet solution. Following staining, non-invasive cells were removed using a cotton swab, and the number of invasive cells was counted under a microscope.

Subcutaneous graft tumor model

A total of 10 female BALB/c nude mice (5 weeks old, weighing 18–20 g) were procured from the Laboratory Animal Center of Guizhou Medical University, following approval by the Animal Ethics Committee of Guizhou Medical University (approval number: 2400141). After a 7-day acclimatization period, the mice were randomly assigned to either the NC group or the sh-INF2 group. Subsequently, 3 × 10⁶ NC SW480 cells or INF2-knockdown SW480 cells were subcutaneously injected into the right flank of the mice in the respective group. The tumor proliferation rate in each mouse was recorded weekly. The mice were sacrificed at week 5 by cervical dislocation, and the tumor tissues were excised, weighed, and subjected to IHC.

Computer virtual screening

The high-resolution human INF2 protein structure (PDB ID: 8RV2) was downloaded from the Protein Data Bank (PDB) website (https://www.rcsb.org/). The INF2 structure was subsequently imported into Maestro software (version: 2021.02) for preprocessing and minimization using the 'Protein Preparation Wizard' module. Additionally, the structures of 6688 small molecule natural product compounds were obtained from the COCONUT database (https://coconut.naturalproducts.net/), and 3560 small molecule compounds of FDA-approved drugs were sourced from the MCE website (https://www.medchemexpress.cn/). Before the virtual screening, the molecular structures were imported into the Maestro software and optimized using the 'Ligprep' module. Subsequently, the binding pocket of the INF2 protein was identified using the 'Receptor Grid Generation' module, with the pocket size set to 20 Å for the docking process. The docking was then performed using the extra precision mode in the 'Ligand Docking' module. The docking results were visualized using PYMOL software (version 3.0).

Cellular Thermal Shift Assay (CETSA)

SW480 cells (2 × 10⁵ cells) were treated with an equal volume of DMSO, 10 μM Uridine 5′-diphosphoglucose disodium salt (cat no. HY-N7032, MCE, Wuhan, China), 10 μM Alginic acid (cat no. HY-W127758, MCE), or 10 μM Diosmetin 7-O-β-D-Glucuronide (cat no. HY-N6879, MCE) in culture dishes for 24 h, followed by centrifugation at 6000 g for 2 min at 37 °C. The cells were then resuspended in 1 mL of PBS and subjected to three freeze–thaw cycles, each involving exposure to liquid nitrogen for 3 min followed by heating at 25 °C for 3 min. After centrifugation at 6000 g at 4 °C for 30 min, the supernatants were collected. All samples were incubated at 25 °C for one hour and then heat-treated for 3 min at 46, 49, 52, 55, 58, or 61 °C. Finally, the samples were analyzed by Western blotting.

Statistical analysis

Statistical analyses were conducted on three independent experimental replicates using GraphPad Prism version 7.0 software. Data are expressed as mean ± standard deviation. For comparisons involving more than two groups, statistical significance was assessed using one-way ANOVA followed by Tukey post hoc test. For comparisons between two groups, Student's t-tests were employed. A p-value of less than 0.05 was considered indicative of statistical significance.
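
As an illustration of the two-group comparison, the Student's t statistic for three replicates per group can be computed by hand as sketched below. This is not the GraphPad Prism procedure itself; the function names and replicate values are hypothetical, equal variances are assumed, and the p-value is omitted because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, stdev, variance

def mean_sd(values):
    """Return (mean, sample standard deviation), as reported in 'mean ± SD'."""
    return mean(values), stdev(values)

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical readouts from three independent replicates per group.
toy_control = [4.0, 5.0, 6.0]
toy_knockdown = [1.0, 2.0, 3.0]
toy_t, toy_df = student_t(toy_knockdown, toy_control)
```

Here t is about -3.67 with 4 degrees of freedom; its magnitude exceeds the two-sided 5% critical value of 2.776 for 4 degrees of freedom, so the difference would be called significant at p < 0.05.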

Results

Identification of genes mostly located in cancer cells of CRC tissues via scRNA sequencing analysis

The scRNA sequencing expression profile GSM5075683 was employed in this study. Initially, we quantified the relative abundance of various markers in each cell, including features, cell count, hemoglobin (HB), mitochondria (MT), and ribosomes (Fig. 1A). For subsequent analysis, we selected genes expressed in a minimum of three cells, cells exhibiting a detected gene count between 200 and 2500, and cells with mitochondrial gene expression constituting less than 5% of the total expression (Fig. 1A). Subsequently, the 2,000 genes exhibiting the highest variability of expression between cells were selected (Fig. 1B, Supplemental Table 1). These genes were then subjected to principal component analysis (PCA), and a total of 20 principal components were used for cell clustering (Fig. 1C).

Fig. 1. Identification of genes mostly located in cancer cells of CRC tissues via scRNA sequencing analysis: (A) Features, cell count, hemoglobin (HB), mitochondria (MT), and ribosomes in each cell were quantified. (B) The 2,000 genes exhibiting the highest variability between cells were selected. (C) PCA analysis for the 2,000 genes exhibiting the highest variability. (D) UMAP plot showing the cell clusters. (E) UMAP plot showing the 9 cell types. (F) The cell location of various biomarkers shown in the UMAP plot. (G) Heatmap showing the gene expression similarity of each cell type. (H) GO analysis for the top 200 signature genes in cancer cells. (I) KEGG analysis for the top 200 signature genes in cancer cells

Based on the PCA results, the cells were grouped into 12 distinct clusters (Fig. 1D), and the top 10 specific genes in each cluster were selected (Supplemental Table 2). Following this, the cells were annotated into 9 cell types, including cancer cells, CD4 T cells, plasma cells, CD8 T cells, B cells, epithelial cells, tumor-associated macrophages (TAMs), cancer-associated fibroblasts (CAFs), and mast cells, based on marker gene signatures (Fig. 1E). To validate the accuracy of cell subset annotation, nine cell-specific biomarkers were employed, including B cell biomarker MS4A1, CD4 T cell biomarker IL7R, CD8 T cell biomarker CCL5, plasma cell biomarker IGKC, cancer cell biomarker TFF1, epithelial biomarker PIGR, TAM biomarker CXCL8, CAF biomarker SPARCL1, and mast cell biomarker CPA3. We analyzed the distribution of these cell-specific biomarkers and assessed the consistency of cell expression profiles. All biomarkers were exclusively localized to their respective cell types and exhibited minimal expression in other cell types (Fig. 1F). Furthermore, the expression profiles within the same cell type were highly consistent (Fig. 1G). These results confirm the precise annotation of the cell subpopulations.

Then, the top 200 signature genes in each cell type were calculated (Supplemental Table 3). We focused primarily on the 200 signature genes of cancer cells and used them for subsequent analysis. Gene Ontology (GO) enrichment analysis revealed that the primary enrichment terms for these 200 characteristic genes included 'tissue development' (Biological Process, BP), 'differentiation of epithelial cells' (BP), 'extracellular exosome' (Cellular Component, CC), 'extracellular vesicle' (CC), 'extracellular organelle' (CC), 'extracellular region part' (CC), 'vesicle' (CC), 'extracellular space' (CC), 'extracellular region' (CC), and 'cell adhesion molecule binding' (Molecular Function, MF) (Fig. 1H). KEGG enrichment analysis showed that these 200 characteristic genes were mainly enriched in 'metabolic pathway', 'biosynthesis of amino acids', 'carbon metabolism', 'oxidative phosphorylation', 'glycolysis', 'insulin resistance', 'HIF-1 signaling pathway', and 'fatty acid metabolism' (Fig. 1I).

INF2 was determined to have the most significant impact on the prognosis of CRC patients by machine learning methods

The relationship between the expression of the top 200 signature genes in cancer cells and CRC patient prognosis was analyzed in the TCGA cohort. Through univariate COX regression analysis, three genes including AFAP1-AS1, AMH, and INF2 were identified as unfavorable prognostic markers for CRC patients (Fig. 2A). Conversely, six genes including CD24, EMP2, ETS2, FDFT1, LGALS4, and VIL1 were identified as favorable prognostic markers for CRC patients (Fig. 2A). The expression of these nine signature genes and the survival status of CRC patients were extracted (Supplemental Table 4) and subjected to further analysis.

Fig. 2. INF2 was determined to have the most significant impact on the prognosis of CRC patients by machine learning methods: (A) Univariate COX regression analysis for the top 200 signature genes, which selected nine genes associated with CRC patient prognosis. (B) SVM-RFE analysis for the nine candidate genes. (C) Logistic analysis for the nine candidate genes. (D-E) LASSO analysis for the nine candidate genes. (F) Boruta random forest analysis for the nine candidate genes. (G-H) Decision Tree method analysis for the nine candidate genes. (I) INF2 expression in CRC patients with DSS event, poor therapy outcome and perineural invasion from the TCGA cohort. (J) The cell location of INF2. * represents p < 0.05. Data are shown as mean ± SD

The nine signature genes in cancer cells associated with prognosis were subsequently analyzed using the SVM-RFE method. The results of the SVM-RFE analysis demonstrated that all nine genes were significantly associated with CRC patient prognosis (Fig. 2B). Further logistic regression analysis revealed that, among these genes, INF2 and AFAP1-AS1 were identified as significant risk factors for mortality in CRC patients. In contrast, VIL1, LGALS4, FDFT1, EMP2, and ETS2 were identified as significant protective factors against mortality in CRC patients (Fig. 2C). Similarly, the application of LASSO analysis revealed that AFAP1-AS1, AMH, EMP2, ETS2, FDFT1, INF2, and LGALS4 are significant biomarkers associated with patient prognosis (Fig. 2D-E). Additionally, Boruta random forest analysis identified FDFT1, EMP2, and INF2 as the top three genes with the most substantial impact on CRC patient prognosis (Fig. 2F). Furthermore, results from Decision Tree methods indicated that VIL1, INF2, and LGALS4 are the top three genes contributing most significantly to CRC patient prognosis (Fig. 2G-H). By integrating the results of various machine learning analyses, we identified that the gene INF2 consistently exhibited significant results across all models. Consequently, we concentrated our investigation on INF2.
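
The integration step amounts to intersecting the gene lists reported for each method. The short sketch below reproduces that intersection from the gene names given in this section; only the dictionary layout and variable names are introduced here.

```python
# Genes reported as significant (or top-ranked) by each method in the text.
selected_by_method = {
    "SVM-RFE": {"AFAP1-AS1", "AMH", "INF2", "CD24", "EMP2",
                "ETS2", "FDFT1", "LGALS4", "VIL1"},
    "Logistic regression": {"INF2", "AFAP1-AS1", "VIL1", "LGALS4",
                            "FDFT1", "EMP2", "ETS2"},
    "LASSO": {"AFAP1-AS1", "AMH", "EMP2", "ETS2", "FDFT1", "INF2", "LGALS4"},
    "Boruta (top 3)": {"FDFT1", "EMP2", "INF2"},
    "Decision tree (top 3)": {"VIL1", "INF2", "LGALS4"},
}

# Genes common to every method.
consensus = set.intersection(*selected_by_method.values())
print(sorted(consensus))  # ['INF2']
```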

Subsequent analysis in the TCGA cohort revealed that INF2 expression was upregulated in tissues from CRC patients who experienced disease-specific survival (DSS) events, poor primary therapy outcomes (PD + SD), and perineural invasion (Fig. 2I). Furthermore, analysis of the scRNA sequencing data described above revealed that INF2 mRNA is predominantly localized in cancer cells, is expressed at low levels in epithelial cells, and is essentially absent from immune cells (Fig. 2J). These findings suggested that INF2, a gene specifically distributed in tumor cells, could serve as a significant target for CRC therapy. Targeting INF2 in drug design may result in reduced toxicity and side effects on non-cancerous cells.

INF2 protein was up-regulated in CRC tissues, and positively associated with advanced clinical features and poor outcome

To ascertain the clinical significance of INF2, we collected a total of 60 paired CRC tissues and adjacent non-tumor tissues from our research cohort. Immunohistochemistry (IHC) assays were conducted, revealing that INF2 protein levels were elevated in CRC tissues compared to adjacent non-tumor tissues (Fig. 3A-B). Pairing analysis demonstrated that 78.3% of CRC tissues exhibited upregulated INF2 protein expression (Fig. 3C).

Fig. 3

INF2 protein was up-regulated in CRC tissues, and positively associated with advanced clinical features and poor outcome: (A-B) IHC staining for INF2 protein in CRC tissues and adjacent non-tumor tissues. C Pairing analysis for INF2 protein between CRC tissues and adjacent non-tumor tissues. D The expression of INF2 protein in the tissues from female and male CRC patients. E The expression of INF2 protein in the tissues from CRC patients aged ≥ 60 or < 60 years. F The expression of INF2 protein in the left and right CRC tissues. G The expression of INF2 protein in CRC tissues from patients in N0 stage and N1-N2 stage. H The expression of INF2 protein in CRC tissues from patients in I-II stage and III-IV stage. I The survival months of CRC patients with high and low INF2 expression. J The predictive value of INF2 protein for 1-, 3-, and 5-year survival of CRC patients. K The predictive value of INF2 protein for distinguishing adjacent tissues and CRC tissues. L The predictive value of INF2 protein for distinguishing high and low tumor grade. ** represents p < 0.01. Data are shown as mean ± SD

Subsequently, we conducted further analysis on the correlation between INF2 expression and the clinical characteristics of CRC patients. No significant differences in INF2 protein expression were observed in tissues obtained from either female or male CRC patients (Fig. 3D), nor among CRC patients aged ≥ 60 or < 60 years (Fig. 3E). Similarly, no significant differences in INF2 expression were detected in CRC tissues from various anatomical sites (Fig. 3F). However, elevated INF2 protein expression was noted in CRC tissues from patients with advanced N stage (Fig. 3G) and higher tumor stage (Fig. 3H).

Moreover, CRC patients with high INF2 protein expression (IHC score ≥ 6) had significantly shorter overall survival than those with low INF2 protein expression (IHC score < 6) (HR = 3.67; Fig. 3I). Time-dependent receiver operating characteristic (ROC) curve analysis showed that INF2 protein levels had a high predictive value for 1-year survival, with an area under the curve (AUC) of 0.80, and a moderate predictive value for 3- and 5-year survival, with AUC values of 0.65 and 0.71, respectively (Fig. 3J). ROC analysis further showed that INF2 protein levels had a high predictive value for distinguishing CRC tissues from adjacent tissues (AUC = 0.89; Fig. 3K) and for distinguishing tumor stage (AUC = 0.87; Fig. 3L). In summary, our study demonstrated that the INF2 protein is upregulated in CRC tissues and is positively correlated with advanced clinical features and poor prognosis, suggesting that INF2 serves as a significant diagnostic biomarker for CRC.
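To make the two quantities in this paragraph concrete, the sketch below dichotomizes IHC scores at the stated cut-off (score ≥ 6) and computes a plain, non-time-dependent AUC as the probability that a randomly chosen tumor sample outscores a randomly chosen adjacent sample. The scores are hypothetical values invented for illustration, not data from the study, and the function names are assumptions. The time-dependent ROC analysis in Fig. 3J additionally accounts for censoring and is not reproduced here.

```python
def auc(pos, neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


CUTOFF = 6  # IHC score >= 6 defines "high INF2" in the text


def inf2_group(score):
    """Assign a sample to the high or low INF2 group by IHC score."""
    return "high" if score >= CUTOFF else "low"


# Hypothetical IHC scores, NOT data from the study.
tumor_scores = [8, 6, 9, 4, 12, 6]
adjacent_scores = [2, 4, 3, 6, 1, 2]

print(f"AUC = {auc(tumor_scores, adjacent_scores):.2f}")
print([inf2_group(s) for s in tumor_scores])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the reported values of 0.89 and 0.87 should be read.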

INF2 knockdown inhibited the CRC cell proliferation and mobility in vitro

The biological functions of INF2 were subsequently investigated in CRC cell lines SW480 and SW620 in vitro. Three siRNAs targeting INF2 were designed, and qRT-PCR (Fig. 4A) and western blotting (Figs. 4B-C) revealed that siRNA2 was the most effective in inhibiting INF2 expression in both SW480 and SW620 cells, reducing INF2 mRNA and protein levels by more than 50% (P < 0.01). Therefore, we employed INF2 siRNA2 to generate INF2-knockdown SW480 and SW620 cell lines.

Fig. 4

INF2 knockdown inhibited the CRC cell proliferation and mobility in vitro: (A) qRT-PCR detected the effectiveness of siRNAs in reducing INF2 mRNA levels. B-C Western blotting detected the effectiveness of siRNAs in reducing INF2 protein levels. D CCK-8 assays were used to detect the proliferation rate of INF2-knockdown and NC SW480 and SW620 cells. E-F EDU positive rate was detected in INF2-knockdown and NC SW480 and SW620 cells. G-H Wound healing assay was used to detect the migration ability in INF2-knockdown and NC SW480 and SW620 cells. I-J Transwell assay was used to detect the invasion ability in INF2-knockdown and NC SW480 and SW620 cells. * represents p < 0.05; ** represents p < 0.01. n = 3. The NC group was used for comparison. Data are shown as mean ± SD

CCK-8 assays showed that INF2-knockdown SW480 and SW620 cells proliferated more slowly than NC cells, with the OD value reduced by more than 0.2 at 48 h and by more than 0.1 at 72 h in both cell lines (P < 0.01; Fig. 4D). Consistently, EDU assays demonstrated that the EDU-positive rate decreased by more than 30% in SW480 and SW620 cells with INF2 knockdown (P < 0.01; Fig. 4E-F). Moreover, wound healing assays (Fig. 4G-H) showed that INF2 knockdown lowered the migration rate of SW480 cells by more than 30% (P < 0.05) and of SW620 cells by more than 60% (P < 0.05) compared with NC cells. Furthermore, transwell invasion assays (Fig. 4I-J) showed that INF2 knockdown lowered the invasion rate of SW480 cells by more than 40% (P < 0.01) and of SW620 cells by more than 50% (P < 0.01) compared with NC cells. Taken together, INF2 knockdown inhibited CRC cell proliferation and mobility in vitro.

Knockdown of INF2 significantly reduced SW480 cell proliferation in vivo

We subsequently investigated the biological function of INF2 in vivo. To this end, NC and INF2-knockdown SW480 cells were subcutaneously injected into BALB/c nude mice. Tumors derived from INF2-knockdown SW480 cells exhibited a significantly reduced proliferation rate (Fig. 5A-B) and decreased tumor weight (Fig. 5C). Additionally, the nude mice bearing INF2-knockdown SW480 cells experienced less weight loss (Fig. 5D). IHC analysis further revealed that tumor tissues from INF2-knockdown SW480 cells showed reduced expression of KI67 and PCNA (Fig. 5E-F). These results indicated that knockdown of INF2 significantly reduced SW480 cell proliferation in vivo.

Fig. 5

Knockdown of INF2 significantly reduced SW480 cell proliferation in vivo: (A-B) The proliferation rate of tumor tissues derived from NC SW480 and SW480 cells with INF2 knockdown. C The tumor weight of tumor tissues derived from NC SW480 and SW480 cells with INF2 knockdown. D The weight loss of mice harboring NC SW480 and SW480 cells with INF2 knockdown. E-F The expression of INF2, KI67 and PCNA in the tumor tissues derived from NC SW480 and SW480 cells with INF2 knockdown. * represents p < 0.05; ** represents p < 0.01. n = 5. The NC group was used for comparison. Data are shown as mean ± SD

DiosMetin 7-O-β-D-Glucuronide had high affinity to INF2 protein

Given its specific localization in CRC cells and its role in promoting CRC progression, we considered that INF2 may be a significant therapeutic target for CRC. Therefore, to discover potent inhibitors of the INF2 protein, we performed computer-based virtual screening (Fig. 6A). Initially, we performed molecular docking of a library of FDA-approved drugs against the INF2 protein (PDB ID: 8RV2). A total of 64 FDA-approved drugs were predicted to have the potential to bind the INF2 protein (Supplemental Table 5); the five with the best docking scores were Raltitrexed, Panobinostat, Regorafenib, Bicalutamide, and Bortezomib (Fig. 6B). However, the docking scores of all these drugs failed to reach −10 (Fig. 6B) and therefore did not meet the cut-off for high affinity.

Fig. 6

DiosMetin 7-O-β-D-Glucuronide had high affinity to INF2 protein: (A) The flow chart of computer virtual screening for INF2 protein. B Top 5 FDA-approved drugs with the best docking scores for INF2 protein. C Top 5 natural products with the best docking scores for INF2 protein. Docking model between INF2 protein and Uridine 5′-diphosphoglucose (D), Alginic acid (E), DiosMetin 7-O-β-D-Glucuronide (F), 2-O-β-D-Glucopyranosyl-L-ascorbic acid (G), and GDP-D-mannose (H). I-J Cellular Thermal Shift Assay was performed to detect the interactions between molecules and INF2 protein. * represents p < 0.05; ** represents p < 0.01. n = 3. The DMSO treatment group was used for comparison. Data are shown as mean ± SD

Previous studies demonstrated that natural products exhibit significant structural diversity and frequently possess anti-tumor properties through multi-target mechanisms, making them valuable resources for virtual screening and mining [19, 20]. Consequently, we conducted molecular docking of a natural product library with the INF2 protein. A total of 6176 natural products were predicted to have the potential to bind the INF2 protein (Supplemental Table 6); the five with the best docking scores were Uridine 5′-diphosphoglucose (docking score = −14.743), Alginic acid (docking score = −14.667), Diosmetin 7-O-β-D-Glucuronide (docking score = −14.142), 2-O-β-D-Glucopyranosyl-L-ascorbic acid (docking score = −13.169), and GDP-D-mannose (docking score = −12.971) (Fig. 6C). These molecules were subsequently selected for further analysis.
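The ranking and cut-off logic of the screen can be sketched as follows, using the five docking scores reported above. This is an illustration rather than the authors' pipeline. It assumes that a more negative score means stronger predicted binding and that the high-affinity criterion is a score of −10 or lower; the variable names are hypothetical.

```python
# Docking scores reported in the text for the top five natural products.
docking_scores = {
    "Uridine 5'-diphosphoglucose": -14.743,
    "Alginic acid": -14.667,
    "Diosmetin 7-O-beta-D-Glucuronide": -14.142,
    "2-O-beta-D-Glucopyranosyl-L-ascorbic acid": -13.169,
    "GDP-D-mannose": -12.971,
}

# Assumed reading of the cut-off: a score of -10 or lower counts as high affinity.
CUTOFF = -10.0

# Sort ascending so the most negative (best) score comes first.
ranked = sorted(docking_scores.items(), key=lambda item: item[1])
high_affinity = [name for name, score in ranked if score <= CUTOFF]
top3 = [name for name, _ in ranked[:3]]

print(top3)
```

Under this reading all five natural products pass the cut-off, and the first three in the ranking are the candidates that would be taken forward to experimental validation.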

Multiple groups of Uridine 5′-diphosphoglucose were capable of forming hydrogen bonds with the INF2 protein at Lys213, Thr303, Gly302, Asp157, Gly158, Val159, Gly15, Ser14, and Asp11, as well as salt bridges with Lys18 (Fig. 6D). Alginic acid can form hydrogen bonds with the INF2 protein at Gln137, Asp157, Gly158, Val159, Leu16, Gly15, Ser14, Thr303, Lys336, and Glu214 (Fig. 6E). Diosmetin 7-O-β-D-Glucuronide interacts with the INF2 protein at Lys336, Asp157, Lys213, Glu214, Arg210, Asp211, Lys215, Arg62, and Asp56 (Fig. 6F). 2-O-β-D-Glucopyranosyl-L-ascorbic acid can form hydrogen bonds with the INF2 protein at Lys336, Gly302, Leu16, Gly15, Ser14, Asp154, and Gln137 (Fig. 6G). Moreover, GDP-D-mannose was able to form hydrogen bonds with the INF2 protein at Gln137, Gly74, Val159, Ser14, Gly15, Asp157, Gly158, and Tyr306, in addition to forming salt bridges with Lys18 (Fig. 6H).

Subsequently, the three molecules with the best docking scores were subjected to CETSA. The thermal stability of the INF2 protein was not significantly changed after treatment with Uridine 5′-diphosphoglucose or Alginic acid (Fig. 6I-J). However, after treatment with Diosmetin 7-O-β-D-Glucuronide, the thermal stability of the INF2 protein was significantly elevated (Fig. 6I-J). Taken together, Diosmetin 7-O-β-D-Glucuronide may be a potential inhibitor of the INF2 protein.

DiosMetin 7-O-β-D-Glucuronide exhibited high inhibitory effects for CRC cells with high INF2 expression

Subsequently, we investigated the inhibitory effects of DiosMetin 7-O-β-D-Glucuronide on NC CRC cells and CRC cells with INF2 knockdown over a 24-h period using the CCK-8 assay. The median inhibitory concentration (IC50) of DiosMetin 7-O-β-D-Glucuronide was 14.86 μM for NC SW480 cells (Fig. 7A) and 11.35 μM for NC SW620 cells (Fig. 7B). In contrast, the IC50 values for SW480 (Fig. 7C) and SW620 cells (Fig. 7D) with INF2 knockdown were 33.55 μM and 32.83 μM, respectively. These findings suggested that DiosMetin 7-O-β-D-Glucuronide is highly selective for cells with high INF2 expression.
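The size of the shift can be read directly from the reported IC50 values. The short sketch below computes the fold increase in IC50 after INF2 knockdown for each cell line; the values come from the text, while the dictionary layout and names are illustrative assumptions.

```python
# IC50 values (uM, 24 h, CCK-8 assay) reported in the text.
ic50 = {
    "SW480": {"NC": 14.86, "INF2_knockdown": 33.55},
    "SW620": {"NC": 11.35, "INF2_knockdown": 32.83},
}

# Fold increase in IC50 when INF2 is knocked down.
fold_shift = {
    cell: values["INF2_knockdown"] / values["NC"] for cell, values in ic50.items()
}

for cell, fold in fold_shift.items():
    print(f"{cell}: IC50 rises {fold:.2f}-fold after INF2 knockdown")
```

In both lines the compound becomes between two and three times less potent once its presumed target is depleted, which is the basis for the selectivity claim.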

Fig. 7

DiosMetin 7-O-β-D-Glucuronide exhibited high inhibitory effects for CRC cells with high INF2 expression: CCK-8 assays were performed to detect the IC50 of DiosMetin 7-O-β-D-Glucuronide for NC SW480 cells (A), NC SW620 cells (B), SW480 with INF2 knockdown cells (C), SW620 with INF2 knockdown cells (D), CMET (E), CSMC (F), and PBMC (G). n = 6. Data are shown as mean ± SD

Furthermore, we assessed the nonspecific toxic side effects of DiosMetin 7-O-β-D-Glucuronide on CSMC, CMET, and PBMC cells over a 24-h period using the CCK-8 assay. The IC50 values of DiosMetin 7-O-β-D-Glucuronide for CMET (Fig. 7E), CSMC (Fig. 7F), and PBMC (Fig. 7G) were all greater than 80 μM. Collectively, these findings suggest that DiosMetin 7-O-β-D-Glucuronide has low nonspecific toxicity toward normal cells, indicating its potential as an effective therapeutic agent for CRC.
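One common way to summarize such a comparison is a selectivity index, the ratio of the IC50 in normal cells to the IC50 in cancer cells. The term and the calculation are not used in the study itself and are introduced here only as an illustrative assumption. Because the normal-cell IC50 is reported only as greater than 80 μM, the sketch yields a lower bound rather than an estimate.

```python
# Values reported in the text (uM, 24 h, CCK-8 assay).
ic50_crc = {"SW480": 14.86, "SW620": 11.35}  # NC (INF2-expressing) CRC cells
NORMAL_IC50_LOWER_BOUND = 80.0  # CMET, CSMC and PBMC were all > 80 uM

# The normal-cell IC50 is only known to exceed 80 uM, so each ratio
# is a lower bound on the selectivity index, not a point estimate.
min_selectivity = {
    cell: NORMAL_IC50_LOWER_BOUND / value for cell, value in ic50_crc.items()
}

for cell, ratio in min_selectivity.items():
    print(f"{cell}: selectivity index > {ratio:.1f}")
```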

Discussion

CRC is one of the most lethal and prevalent malignancies globally. Historically, surgery and chemotherapy have been the primary treatment modalities for cancer patients. However, the prognosis for CRC, particularly in cases with metastatic lesions, has remained suboptimal [21, 22]. Targeted therapy has emerged as a novel therapeutic option that has successfully extended overall survival in CRC patients. Following the identification of the epidermal growth factor receptor (EGFR) as a biomarker in CRC, the EGFR inhibitor cetuximab was developed and approved for clinical use. This agent has been demonstrated to significantly improve the prognosis of CRC patients [23, 24]. EGFR is a ubiquitously expressed protein found in the majority of mammalian cells [25]. Following cetuximab therapy, patients with CRC frequently exhibit adverse effects such as dermatological reactions, gastrointestinal disturbances, allergic responses, and cardiovascular complications [26, 27]. Consequently, it is of considerable importance to develop therapeutic targets with specific distribution for the treatment of CRC.

In this study, through scRNA analysis and bulk sequencing, we identified that INF2 is predominantly localized in cancer cells within CRC tissues and has a substantial impact on poor prognosis. Furthermore, INF2 was upregulated in CRC tissues from our research cohort and correlated with decreased survival duration. The knockdown of INF2 significantly inhibited CRC cell proliferation and migration. These findings provide the first evidence that INF2 could serve as a significant biomarker for CRC.

Bioinformatics technologies, such as computer virtual screening and molecular dynamics simulations, represent cutting-edge approaches and key strategies in drug discovery. In computer virtual screening, molecular docking software is used to simulate the interaction between the target and each candidate drug before biological activity screening, and the affinity between the two is calculated to reduce the number of compounds that must be screened experimentally and to improve the efficiency of lead compound discovery [28, 29]. In this study, using computer-based virtual screening and CETSA experiments, DiosMetin 7-O-β-D-Glucuronide was identified as binding to the INF2 protein, suggesting that it may act as an INF2 inhibitor. DiosMetin 7-O-β-D-Glucuronide was recognized as an antioxidant component in the fruits of Luffa cylindrica in a previous study [30]. However, its anti-tumor properties have not been previously characterized. Herein, using CCK-8 assays, we present preliminary evidence that DiosMetin 7-O-β-D-Glucuronide selectively inhibits CRC cells with INF2 expression (IC50 < 20 μM), while exhibiting minimal nonspecific toxicity towards normal cells (IC50 ≥ 80 μM). DiosMetin 7-O-β-D-Glucuronide may be a potential drug for CRC.

There are certain limitations in our current study. Firstly, while we conducted scRNA-seq analysis on CRC tissues and established that INF2 is primarily localized within cancer cells, we did not determine the cellular localization of the INF2 protein in other tissue types. Additionally, the inhibitory effects and potential nonspecific toxicity of Diosmetin 7-O-β-D-Glucuronide require validation through in vivo experiments.

Conclusions

In summary, INF2 serves as a relatively specific biomarker for CRC, predominantly localized in the cancer cells of CRC tissues, and plays a significant role in influencing the prognosis of CRC patients. The silencing of INF2 expression resulted in a decrease in both cell proliferation and motility. Furthermore, DiosMetin 7-O-β-D-Glucuronide has been identified as a potent inhibitor of INF2, demonstrating minimal toxicity and side effects. Consequently, targeting INF2 with DiosMetin 7-O-β-D-Glucuronide may represent a promising therapeutic strategy for CRC treatment.

Supplementary Information

12885_2025_14357_MOESM1_ESM.pdf (731.3KB, pdf)

Supplementary Material 1: Figure S1-S12: Original WB bands for check.

12885_2025_14357_MOESM2_ESM.xlsx (35.1KB, xlsx)

Supplementary Material 2: Table S1: The top 2000 genes with the greatest expression variability between cells.

12885_2025_14357_MOESM3_ESM.xlsx (9.5KB, xlsx)

Supplementary Material 3: Table S2: The top 10 signature genes in each cluster.

12885_2025_14357_MOESM4_ESM.xlsx (159.8KB, xlsx)

Supplementary Material 4: Table S3: The top 200 signature genes in each cell type.

12885_2025_14357_MOESM5_ESM.xlsx (53.7KB, xlsx)

Supplementary Material 5: Table S4: The expression of the nine signature genes and information of CRC patient survival days used for machine learning.

12885_2025_14357_MOESM6_ESM.xlsx (10.4KB, xlsx)

Supplementary Material 6: Table S5: Docking score of FDA-approved drugs with INF2 protein.

12885_2025_14357_MOESM7_ESM.xlsx (187.5KB, xlsx)

Supplementary Material 7: Table S6: Docking score of natural products with INF2 protein.

Acknowledgements

Not applicable.

Abbreviations

CRC

Colorectal cancer

scRNA

Single-cell RNA

GEO

Gene Expression Omnibus

PCA

Principal component analysis

GO

Gene Ontology

KEGG

Kyoto Encyclopedia of Genes and Genomes

TCGA

The Cancer Genome Atlas

OS

Overall survival

SVM-RFE

Support vector machine recursive feature elimination

LASSO

Least absolute shrinkage and selection operator

CSMC

Colon smooth muscle cells

CMET

Colonic mucosal epithelial cells

PBMC

Peripheral blood mononuclear cells

CCK-8

Cell Counting Kit-8

PDB

Protein Data Bank

CETSA

Cellular Thermal Shift Assay

HB

Hemoglobin

MT

Mitochondria

TAM

Tumor associated macrophage

CAF

Cancer associated fibroblast

BP

Biological Process

CC

Cellular Component

MF

Molecular Function

DSS

Disease-specific survival

IHC

Immunohistochemistry

ROC

Receiver operating characteristic curve

AUC

Area under the curve

IC50

The median inhibitory concentration

EGFR

Epidermal growth factor receptor

Authors’ contributions

Conceptualization, Y.Z. and T.C.; data curation, Z.Z.; formal analysis, Z.Z., Y.K. and F.H.; investigation, Y.K. and Z.Z.; methodology, H.L. and X.Z.; resources, Z.Z.; software, D.L. and Y.M.; validation, Z.Z., Y.Z. and T.C.; writing (original draft), Z.Z.; writing (review and editing), Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (82060440), the Key Project of the Basic Research Plan of the Guizhou Science and Technology Department (Qiankehe Basics-zk [2023] Key 043), the Guizhou Provincial Science and Technology Projects (ZK[2022]390), the Continuous Support Fund for Excellent Scientific Research Platform of Colleges and Universities in Guizhou Province (QJJ (2022) 020), the Scientific Research Project of the Higher Education Department of Guizhou Province [Qianjiaoji 2022(187)], and the Chronic Disease Diagnosis and Treatment Transformation Engineering Research Center research project of Guizhou Medical University (2020–002).

Data availability

The raw scRNA sequencing data used in this study were retrieved from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds) under accession number GSM5075683. The raw data used for machine learning were retrieved from The Cancer Genome Atlas (TCGA) program (https://portal.gdc.cancer.gov/). The code used for analysis during the current study is available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate

A total of 60 pairs of CRC tissues and corresponding adjacent tissues were collected from the Affiliated Hospital of Guizhou Medical University, following approval by the Human Ethics Committee of the same institution (approval number: 2024–076). All patients provided written informed consent. The animal experiments were approved by the Animal Ethics Committee of Guizhou Medical University (approval number: 2400141). The study was conducted in accordance with the ethical standards laid out in the 1964 Declaration of Helsinki and its later amendments.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zhirui Zeng, Yun Ke and Fei Huang contributed equally to this work.

Contributor Information

Tengxiang Chen, Email: txch@gmc.edu.cn.

Yunhuan Zhen, Email: yunhuanzhen72@163.com.

References

1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229–63.
2. Choi Y, Sateia HF, Peairs KS, Stewart RW. Screening for colorectal cancer. Semin Oncol. 2017;44(1):34–44.
3. Aykut B, Lidsky ME. Colorectal Cancer Liver Metastases: Multimodal Therapy. Surg Oncol Clin N Am. 2023;32(1):119–41.
4. Malki A, ElRuz RA, Gupta I, Allouch A, Vranic S, Al Moustafa AE. Molecular mechanisms of colon cancer progression and metastasis: recent insights and advancements. Int J Mol Sci. 2020;22(1):130.
5. Wang M, Niu W, Hu R, Wang Y, Liu Y, Liu L, et al. POLR1D promotes colorectal cancer progression and predicts poor prognosis of patients. Mol Carcinog. 2019;58(5):735–48.
6. Gou H, Wong CC, Chen H, Shang H, Su H, Zhai J, et al. TRIP6 disrupts tight junctions to promote metastasis and drug resistance and is a therapeutic target in colorectal cancer. Cancer Lett. 2023;578:216438.
7. Liu Y, Li J, Zeng S, Zhang Y, Zhang Y, Jin Z, et al. Bioinformatic Analyses and Experimental Verification Reveal that High FSTL3 Expression Promotes EMT via Fibronectin-1/α5β1 Interaction in Colorectal Cancer. Front Mol Biosci. 2021;8:762924.
8. Huang C, Ou R, Chen X, Zhang Y, Li J, Liang Y, et al. Tumor cell-derived SPON2 promotes M2-polarized tumor-associated macrophage infiltration and cancer progression by activating PYK2 in CRC. J Exp Clin Cancer Res. 2021;40(1):304.
9. Zhao Y, Zhang H, Wang H, Ye M, Jin X. Role of formin INF2 in human diseases. Mol Biol Rep. 2022;49(1):735–46.
10. Chen YM, Liapis H. Focal segmental glomerulosclerosis: molecular genetics and targeted therapies. BMC Nephrol. 2015;16:101.
11. Heuser VD, Kiviniemi A, Lehtinen L, Munthe S, Kristensen BW, Posti JP, et al. Multiple formin proteins participate in glioblastoma migration. BMC Cancer. 2020;20(1):710.
12. Heuser VD, Mansuri N, Mogg J, Kurki S, Repo H, Kronqvist P, et al. Formin Proteins FHOD1 and INF2 in Triple-Negative Breast Cancer: Association With Basal Markers and Functional Activities. Breast Cancer (Auckl). 2018;12:1178223418792247.
13. Ding Y, Lv Z, Cao W, Shi W, He Q, Gao K. Phosphorylation of INF2 by AMPK promotes mitochondrial fission and oncogenic function in endometrial cancer. Cell Death Dis. 2024;15(1):65.
14. Uhlitz F, Bischoff P, Peidli S, Sieber A, Trinks A, Lüthen M, et al. Mitogen-activated protein kinase activity drives cell trajectories in colorectal cancer. EMBO Mol Med. 2021;13(10):e14123.
15. Yip SH, Sham PC, Wang J. Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data. Brief Bioinform. 2019;20(4):1583–9.
16. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans A Math Phys Eng Sci. 2016;374(2065):20150202.
17. Hu C, Li T, Xu Y, Zhang X, Li F, Bai J, et al. Cell Marker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 2023;51(D1):D870–6.
18. Shen W, Song Z, Zhong X, Huang M, Shen D, Gao P, et al. Sangerbox: A comprehensive, interaction-friendly clinical bioinformatics analysis platform. Imeta. 2022;1(3):e36.
19. Atanasov AG, Zotchev SB, Dirsch VM, Supuran CT. Natural products in drug discovery: advances and opportunities. Nat Rev Drug Discov. 2021;20(3):200–16.
20. de Sousa Luis JA, Barros RPC, de Sousa NF, Muratov E, Scotti L, Scotti MT. Virtual Screening of Natural Products Database. Mini Rev Med Chem. 2021;21(18):2657–730.
21. Dariya B, Aliya S, Merchant N, Alam A, Nagaraju GP. Colorectal Cancer Biology, Diagnosis, and Therapeutic Approaches. Crit Rev Oncog. 2020;25(2):71–94.
22. McQuade RM, Stojanovska V, Bornstein JC, Nurgali K. Colorectal Cancer Chemotherapy: The Evolution of Treatment and New Approaches. Curr Med Chem. 2017;24(15):1537–57.
23. Troiani T, Napolitano S, Della Corte CM, Martini G, Martinelli E, Morgillo F, et al. Therapeutic value of EGFR inhibition in CRC and NSCLC: 15 years of clinical evidence. ESMO Open. 2016;1(5):e000088.
24. Stefani C, Miricescu D, Stanescu-Spinu II, Nica RI, Greabu M, Totan AR, et al. Growth factors, PI3K/AKT/mTOR and MAPK signaling pathways in colorectal cancer pathogenesis: where are we now? Int J Mol Sci. 2021;22(19):10260.
25. Sharifi J, Khirehgesh MR, Safari F, Akbari B. EGFR and anti-EGFR nanobodies: review and update. J Drug Target. 2021;29(4):387–402.
26. Zhang D, Ye J, Xu T, Xiong B. Treatment related severe and fatal adverse events with cetuximab in colorectal cancer patients: a meta-analysis. J Chemother. 2013;25(3):170–5.
27. Chua W, Peters M, Loneragan R, Clarke S. Cetuximab-associated pulmonary toxicity. Clin Colorectal Cancer. 2009;8(2):118–20.
28. Kitchen DB, Decornez H, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov. 2004;3(11):935–49.
29. Shaker B, Ahmad S, Lee J, Jung C, Na D. In silico methods and tools for drug discovery. Comput Biol Med. 2021;137:104851.
30. Du Q, Xu Y, Li L, Zhao Y, Jerz G, Winterhalter P. Antioxidant constituents in the fruits of Luffa cylindrica (L.) Roem. J Agric Food Chem. 2006;54(12):4186–90.

Articles from BMC Cancer are provided here courtesy of BMC
