Abstract
Background
Liver metastasis is the leading cause of mortality in colorectal cancer (CRC) patients, and there is an urgent need for biomarkers to improve the diagnosis and prognosis of CRC. However, the molecular mechanisms driving metastatic progression remain incompletely understood, necessitating the discovery of novel therapeutic targets.
Methods
GEO datasets and machine learning algorithms (Lasso regression, SVM, and RF) were employed to screen for upregulated genes in CRC liver metastasis. Protein-protein interaction (PPI) networks were constructed using STRING. Single-cell RNA sequencing (scRNA-seq) and ATAC-seq were performed to analyze immune infiltration and chromatin accessibility in metastatic tissues. Functional enrichment analysis of RNA-seq data was conducted to explore DAPK1-related pathways. Finally, CRISPR/Cas9-mediated knockout of DAPK1 in LoVo cells was used to validate its role in invasion and migration.
Results
In this study, we identified DAPK1 as a metastasis-specific upregulated gene in CRC. scRNA-seq of CRC metastases highlighted a significant correlation between DAPK1 expression and macrophage infiltration. Furthermore, ATAC-seq analysis demonstrated increased chromatin accessibility of DAPK1 in CRC liver metastasis. CRISPR/Cas9-mediated knockout of DAPK1 in LoVo cells markedly suppressed their invasion and migration.
Conclusions
By combining machine learning and multi-omics bioinformatics approaches, we identified DAPK1 as a novel biomarker for CRC liver metastasis, which has potential implications for prognosis and therapeutic targeting. These findings highlight DAPK1 as a critical driver of metastatic progression and provide a theoretical foundation for the treatment of CRC.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12935-025-04037-w.
Keywords: Liver metastasis of colorectal cancer, DAPK1, Machine learning, Bioinformatics, Biomarker
Introduction
Cancer pathogenesis is primarily driven by somatic and/or germline alterations in oncogenes and tumor suppressor genes. With continued technological progress, cancer treatment has expanded from traditional surgery to a variety of approaches, including gene therapy [1]. CRC is the third most common cancer worldwide, with 1.9 million new cases annually, and is the second leading cause of cancer-related death [2]. Among patients with young-onset CRC, 60% presented with distant metastases [3]. Therefore, identifying biomarkers related to the distant metastasis of CRC is of great significance for advancing the precision treatment of CRC.
The Gene Expression Omnibus (GEO) database is the world’s largest public biomedical database platform for the storage of genomic data, including gene expression data, chromatin status, and genomic variation [4]. Bioinformatics analysis of GEO datasets can identify genes and therapeutic targets with diagnostic value for CRC metastasis. Machine learning has become an important tool in bioinformatics and is widely used for biological data preprocessing, feature extraction, pattern recognition, classification, clustering, and prediction. Its application in medicine not only boosts research efficiency but also offers novel research methods for the bioinformatics field [5]. For example, differential analysis of a blood mRNA expression cohort from patients with major depressive disorder, followed by machine learning techniques (LASSO, SVM-RFE), has been used to obtain key therapeutic targets [6]. In CRC, machine learning algorithms including Lasso, random forest, SVM-RFE, and XGBoost can be employed to screen for specific genes related to liver metastasis [7]. A growing body of research indicates that machine learning plays a significant role in the screening and discovery of biomarkers.
Death-associated protein kinase 1 (DAPK1) is a serine/threonine protein kinase which has been demonstrated to be closely related to the progression and metastasis of various cancers [8]. Abnormal methylation of DAPK1 has been reported to be positively associated with gastrointestinal tumorigenesis and gastric cancer metastasis [9]. High expression of DAPK1 promotes metastasis of gastric cancer [10]. In advanced colon and thyroid cancers, it participates in tumor EMT and stem cell expression [11]. Notably, the expression level of DAPK1 is closely associated with clinical prognosis, and its expression has been verified to be significantly correlated with adverse survival in patients with liver cancer [12], bladder cancer [13], and other malignancies, indicating its potential value as a prognostic marker and therapeutic target. However, the specific regulatory network of DAPK1 in CRC metastasis and its dynamic mechanism of action within the tumor microenvironment remain unclear and require further investigation.
In this study, we analyzed the role of DAPK1 in CRC metastasis by integrating multi-omics data and functional assays. Based on GEO datasets and machine learning, we identified DAPK1 as a gene specifically highly expressed in CRC metastasis. Bioinformatics analysis characterized its expression in colon cancer metastasis, its prognostic and diagnostic value, and its protein interaction network. By further combining scRNA-seq, ATAC-seq, and RNA-seq, we explored the association of DAPK1 with immune infiltration, epigenetic regulation, and molecular function in CRC metastasis. Functionally, CRISPR/Cas9 knockout of DAPK1 was employed to investigate its effect on the invasion and migration of LoVo cells. This study is the first to reveal the heterogeneous regulatory network of DAPK1-driven CRC metastasis from a multi-omics perspective, providing a theoretical basis for precise treatment strategies targeting DAPK1.
Materials and methods
Data collection, processing, and analyses
Transcriptome data and corresponding clinical data for CRC were retrieved from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer) [14]. GSE179979 and GSE213402 were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) to explore differentially expressed genes between primary tumors and liver metastases in CRC. The GSE130488 dataset was downloaded to study differentially expressed genes in colon cancer cells after DAPK1 knockout. Differentially expressed genes in the GEO datasets were identified using GEO2R, with screening criteria of P-value < 0.05 and |log2FoldChange| > 1.5. The single-cell transcription of DAPK1 in CRC metastasis was investigated with the GSE136394 dataset. The GSE153016 dataset was downloaded, and IGV was used to visualize changes in chromatin accessibility at DAPK1 in CRC liver metastases.
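The screening criteria above can be illustrated with a minimal Python sketch. It is not the GEO2R implementation: the `GeneStat` record, the `split_degs` helper, the gene names, and the values are all hypothetical. It also assumes that a positive log2 fold change means higher expression in liver metastasis than in the primary tumor.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    p_value: float


def split_degs(rows, p_cutoff=0.05, lfc_cutoff=1.5):
    """Return (up, down) gene lists passing P < p_cutoff and |log2FC| > lfc_cutoff."""
    up, down = [], []
    for row in rows:
        if row.p_value < p_cutoff and abs(row.log2_fc) > lfc_cutoff:
            # Assumption: positive log2FC = higher in metastasis.
            (up if row.log2_fc > 0 else down).append(row.gene)
    return up, down


# Toy rows with made-up values, for illustration only.
rows = [
    GeneStat("GENE_A", 2.1, 0.001),
    GeneStat("GENE_B", -1.8, 0.020),
    GeneStat("GENE_C", 1.2, 0.0005),  # fails the fold-change cutoff
    GeneStat("GENE_D", 3.0, 0.200),   # fails the P-value cutoff
]
up, down = split_degs(rows)
print("up:", up, "down:", down)
```

Both thresholds must hold at once, so a gene with a large fold change but a non-significant P-value is excluded.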
Functional analysis
The selected differential genes were uploaded to DAVID (https://david.ncifcrf.gov) with the species set to “Homo sapiens”, and all human genes were used as the background reference set for KEGG and GO analysis. GO enrichment analysis encompassed Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). After ID conversion of the molecules in the input data using “MSigDB Collections”, gene set enrichment analysis (GSEA) was conducted using the “clusterProfiler” package. The ggplot2 package in R 4.4.2 was used for plotting.
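The idea behind over-representation analysis of a gene list against a background set can be sketched with a plain hypergeometric tail probability. This is only an illustration: DAVID reports a modified Fisher exact statistic (the EASE score), so its P-values will differ, and the function name and the numbers below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `annotated` belong to the term."""
    if not (0 <= annotated <= background and 0 <= selected <= background):
        raise ValueError("annotated and selected must lie in [0, background]")
    upper = min(annotated, selected)
    total = comb(background, selected)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 in the term, 3 selected, 2 overlapping.
p = hypergeom_enrichment_p(background=10, annotated=4, selected=3, overlap=2)
print(round(p, 4))
```

A small P-value means the term contains more of the selected genes than would be expected from random draws from the background.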
Machine learning
The 89 genes highly expressed in CRC liver metastases were analyzed by machine learning. For the Lasso model, TCGA-COAD and TCGA-READ data were downloaded from the TCGA database and organized, and the TPM-format expression data and clinical data were extracted. The “glmnet” package was used to analyze the cleaned data to obtain the lambda values and the likelihood or C-index, visualize the results, and conduct 10-fold cross-validation. Feature selection was performed using support vector machines, with 10-fold cross-validation used to determine which genes should be regarded as characteristic. RF is an ensemble algorithm designed to prevent the overfitting of a single decision tree and to enhance model performance by combining a large number of decision trees built from the same training set. In this study, genes with importance greater than 1.5 were selected. The “forestplot” package was used to generate forest plots for univariate and multivariate Cox proportional hazards regression analysis, visually presenting the p-value, hazard ratio (HR), and 95% confidence interval (CI) for each variable. Then, based on the results of the multivariate Cox proportional hazards model, nomograms were constructed using the “rms” package to predict the total recurrence rate.
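As a rough illustration of the 10-fold cross-validation used for the Lasso and SVM steps, the sketch below splits sample indices into ten folds. It does not reproduce what “glmnet” or the SVM routine does internally. The helper name `k_fold_indices`, the fixed seed, and the sample count are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample appears in exactly one test fold; the remaining k - 1
    folds form the training set for that round.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical cohort of 25 samples.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), "training samples;", len(test), "held out")
```

The model is fitted on each training set and scored on the held-out fold. The tuning parameter (for Lasso, lambda) with the lowest average held-out error is then chosen.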
Analysis of gene expression and prognosis diagnosis
The expression of related genes in colon cancer and metastatic tissues was analyzed with TNMplot (https://tnmplot.com/analysis/). Genes co-expressed with DAPK1 were obtained from UALCAN (https://ualcan.path.uab.edu/index.html), and the correlation between primary and secondary variables in the data was analyzed. The results were visualized as a co-expression heat map with the “ggplot2” package. GenomicScape (http://www.genomicscape.com/) was used for survival analysis of the markers, and the Kaplan-Meier (KM) method was used to plot survival curves for the two subgroups. ROC analysis was performed using the “pROC” package and visualized with “ggplot2”.
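The area under the ROC curve can be read as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below computes this rank-based form directly. It is a stand-in for illustration rather than the “pROC” code path, and the function name and expression values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as one half."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up expression values for a marker in two groups.
auc = roc_auc(scores_pos=[3, 4, 5], scores_neg=[1, 2, 3])
print(round(auc, 3))
```

An AUC of 0.5 means no discrimination and 1.0 means perfect separation.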
Immune infiltration analysis
Based on the ssGSEA algorithm provided in the R package GSVA [15], the markers of 24 types of immune cells reported in an Immunity article were used to calculate immune infiltration for the corresponding data [16]. Based on the CIBERSORT core algorithm, the CIBERSORTx website (https://cibersortx.stanford.edu/) was used with the markers of 22 types of immune cells to calculate immune infiltration for the uploaded data [17]. TISCH (http://tisch.compbio.cn/home/) was used to analyze the GSE136394 single-cell transcriptome data and study cell typing in CRC metastasis.
Mutation analysis
The frequency of mutation and modification of biomarkers in tumors was analyzed using Phosphosite plus (https://www.phosphosite.org/). The mutations of related genes in CRC were investigated using the cBioportal (https://www.cbioportal.org/). The TCGA dataset was selected to study the mutation and mutation site of DAPK1 in CRC.
Construction of CRISPR/Cas9 plasmid
CRISPick (https://portals.broadinstitute.org/gppx/crispick/public) was used to design sgRNAs, which were synthesized by Sangon. sgRNA1: CACCGCAGCCGGACCGAGCCAACGC; sgRNA2: CACCGCCTCCGACAGCGCTCCGGA. The sgRNAs were annealed and ligated into the px330a dCas9-KRAB vector after BbsI digestion. The ligation product was transformed into recipient cells, which were then spread evenly on solid medium containing antibiotics and cultured overnight at 37 °C. Single colonies were expanded in liquid medium and sequenced. After the sequence was confirmed, plasmids were extracted with a plasmid extraction kit (Tiangen, China).
Cell culture and transfection
LoVo and THLE2 cells were obtained from the Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences (Shanghai, China). The cells were cultured in DMEM (Gibco, USA) supplemented with 10% fetal bovine serum (NEWZERUM, NZL). All cell lines were maintained in a humidified incubator at 37 °C with 5% CO2. The DAPK1 overexpression plasmid was purchased from Miaoling (Wuhan, China). Transfection reagent (Gemma, China) and the constructed plasmids were added to LoVo cells, which were cultured in serum-free medium for 6 h before the medium was replaced with serum-containing medium. After a further 24 h of culture, cell pellets were collected.
RNA extraction and PCR detection
Total RNA was extracted from cells using a Trizol kit (Accurate Biology). Reverse transcription was performed using HiScript II reverse transcriptase (Vazyme Biotech), followed by polymerase chain reaction. Primers: DAPK1-F: CCTGATCTTGGAACTCGTTGC; DAPK1-R: CCAAAAGCATTATGTTCTCAGGC. Cycle threshold (Ct) values were used to determine the expression level of each gene, and relative expression was calculated by the 2^−ΔΔCt method and normalized to the endogenous reference GAPDH.
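The relative quantification described above can be written out as a short worked calculation. The function name and the Ct values are hypothetical. The sketch assumes the standard form of the method, in which ΔCt is the target Ct minus the reference (GAPDH) Ct and ΔΔCt is the treated ΔCt minus the control ΔCt.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(treated) - dCt(control)
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Made-up Ct values: the target crosses threshold 2 cycles later after
# treatment while the reference gene is unchanged.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)
```

A value below 1 indicates lower expression in the treated sample than in the control, and a value above 1 indicates higher expression.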
Scratch assay
The cells were seeded in a 12-well plate. When the cell density reached 80–90%, a straight scratch was made with a sterile 200 µL pipette tip held perpendicular to the bottom of the well, and the medium was replaced with serum-free medium. The cells were observed under an inverted microscope, and the scratches were photographed at 0 and 24 h. The percentage of scratch area was measured using Image J. The experiment was repeated three times, and the average was calculated.
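One common way to turn the measured scratch areas into a migration readout is the percentage of wound closure between 0 and 24 h, averaged over replicates. The exact formula is not stated in the text, so the sketch below is an assumption. The function name and the area values are hypothetical.

```python
def wound_closure_percent(area_0h, area_24h):
    """Percentage of the initial scratch area that closed after 24 h."""
    if area_0h <= 0:
        raise ValueError("initial scratch area must be positive")
    return 100.0 * (area_0h - area_24h) / area_0h


# Made-up scratch areas (arbitrary pixel units) from three replicates.
replicates = [(1000.0, 400.0), (950.0, 380.0), (1020.0, 459.0)]
closures = [wound_closure_percent(a0, a24) for a0, a24 in replicates]
mean_closure = sum(closures) / len(closures)
print([round(c, 1) for c in closures], round(mean_closure, 1))
```

Normalizing to the 0 h area makes wells with slightly different initial scratch widths comparable.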
Transwell assay
The Transwell assay was performed in chambers with a pore size of 8 μm. Cells resuspended in serum-free medium were added to the upper chamber, and serum-containing medium (or THLE2 cells) was added to the lower chamber. After 24 h, the chamber was removed, fixed in 4% paraformaldehyde solution for 30 min, and then stained with 0.5% crystal violet for 15 min. After washing with PBS, the excess crystal violet on the upper side of the chamber was wiped off with a cotton swab, and the chamber was air-dried and photographed under a microscope. Image J was used to analyze the experimental results.
Colony-forming assay
LoVo cells were seeded in 6-well plates at a low density and cultured for 10–14 days. The resulting colonies were fixed with 4% paraformaldehyde, stained with crystal violet, and counted. The colony formation rate was calculated to assess cell proliferative capacity.
Statistical analysis
All results are expressed as mean ± SD. GraphPad Prism 7 was used for statistical analysis. Kaplan-Meier curves and the log-rank test were used to compare overall survival and disease-free survival between the two groups. Statistical differences between the two groups were analyzed by the Wilcoxon test. The correlation between genes and cell infiltration levels was calculated by Pearson correlation analysis. p < 0.05 was considered statistically significant.
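For readers less familiar with survival curves, the Kaplan-Meier estimate can be sketched in a few lines. This is a minimal product-limit estimator, not the GraphPad Prism or GenomicScape implementation. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : True if the event (death) was observed, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Made-up follow-up data (months) for five patients.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
print([(t, round(s, 2)) for t, s in curve])
```

Censored patients lower the number at risk without producing a step in the curve. This is why the estimate differs from a simple proportion of deaths.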
Results
Identifying DEGs in liver metastases of CRC
To identify genes differentially expressed in the tissues of patients with CRC liver metastasis, GSE179979 and GSE213402 were selected from GEO. GEO2R identified 1870 up-regulated and 1613 down-regulated genes in liver metastasis in GSE179979 (Fig. 1A). In GSE213402, 320 genes were up-regulated and 120 genes were down-regulated in liver metastasis (Fig. 1B). We then intersected the highly expressed genes in CRC liver metastasis tissues from the two datasets and obtained 89 genes (Fig. 1C). KEGG enrichment analysis of the genes up-regulated in liver metastasis revealed that they were mainly concentrated in Metabolic pathways and Cell adhesion molecules (Fig. 1D). CRC is a metabolically heterogeneous disease, and cell adhesion is closely related to metastasis. GO analysis found that BP terms were mainly enriched in the positive regulation of tumor necrosis factor production and the innate immune response (Fig. 1E). These results suggest that liver metastasis of CRC is closely related to the immune microenvironment.
Fig. 1.
Identifying DEGs in liver metastases of CRC. (A) Volcano plot of GSE179979. (B) Volcano plot of GSE213402. (C) Venn diagram of upregulated genes in GSE179979 and GSE213402. (D) KEGG analysis of differential genes. (E) GO analysis of differential genes
Machine learning to identify biomarkers of liver metastases in CRC
To further screen biomarkers for liver metastases in CRC, we integrated three machine learning algorithms: Lasso, Random Forest (RF), and support vector machine (SVM). Eighteen genes associated with the prognosis of CRC were screened by Lasso (Fig. 2A, B). Univariate and multivariate Cox regression analysis based on the TCGA cohort further found that CYBB (HR = 0.49, p = 0.00009), DAPK1 (HR = 1.36, p = 0.00342), MAT1A (HR = 1.19, p = 0.04606), SERPINA1 (HR = 0.84, p = 0.01523), SERPINA3 (HR = 2.80, p = 0.01049) and VSIG4 (HR = 1.47, p = 0.02318) were independent prognostic factors for CRC (Figure S1A, B). Combining these genes with clinicopathological parameters (age, pN stage, etc.), the constructed multifactor prognostic nomogram performed well in predicting 1-year, 3-year, and 5-year overall survival (C index = 0.77, AUC = 0.719–0.82) (Figure S1C). The calibration curve showed that the predicted survival rate was highly consistent with the observed values, supporting the reliability of the model (Figure S1D).
Fig. 2.
Machine learning to identify biomarkers of liver metastases in CRC. (A) The prognostic genes and the corresponding nonzero coefficients used to identify the most critical model genes via Lasso regression analysis. (B) Lambda curves showing how the minimum cross-validation error point was determined. (C) The influence of the number of decision trees on the error rate. (D) Genes ranked by relative importance. (E) SVM approach for the selection of feature genes. (F) Venn diagram of the genes identified by Lasso, RF, and SVM
RF analysis was conducted on 89 genes with high expression in liver metastasis, and 13 characteristic genes with importance scores greater than 1.5 were identified (Fig. 2C, D). The SVM results indicated that the accuracy of 16 gene features was 0.892 (Fig. 2E). Based on the analysis results of Lasso, SVM, and RF, three prognostic genes consistent across algorithms were ultimately determined: CYBB, DAPK1 and SLC23A2 (Fig. 2F). These genes have demonstrated significant prognostic value in multiple analytical frameworks, suggesting their potential as therapeutic targets for CRC.
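The consensus step, keeping only genes selected by all three algorithms, amounts to a set intersection. In the sketch below, the three shared genes are those reported above. The placeholder names (`GENE_X`, `GENE_Y`, `GENE_Z`) and the membership of each list are hypothetical and do not reflect the actual per-algorithm outputs.

```python
def consensus_genes(*gene_lists):
    """Return the sorted genes present in every input list."""
    if not gene_lists:
        return []
    shared = set(gene_lists[0])
    for genes in gene_lists[1:]:
        shared &= set(genes)
    return sorted(shared)


# Illustrative lists only; the real Lasso, RF, and SVM outputs are longer.
lasso_genes = ["CYBB", "DAPK1", "SLC23A2", "GENE_X"]
rf_genes = ["DAPK1", "GENE_Y", "SLC23A2", "CYBB"]
svm_genes = ["SLC23A2", "CYBB", "GENE_Z", "DAPK1"]

shared = consensus_genes(lasso_genes, rf_genes, svm_genes)
print(shared)
```

Requiring agreement across methods trades sensitivity for robustness. A gene favored by only one algorithm is dropped even if its score there is high.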
DAPK1 is highly expressed in CRC metastasis and is correlated with a poor prognosis
In order to further investigate the key regulatory genes of liver metastasis in CRC, we analyzed the expression characteristics of candidate genes in metastatic CRC through the TNMplot (Fig. 3A). We discovered that CYBB, DAPK1, and SLC23A2 were significantly elevated in CRC metastases compared to primary tumors (Fig. 3B). Survival analysis further indicated that a high expression of DAPK1 (HR = 2, p = 0.012) and SLC23A2 (HR = 2.4, p = 0.0072) was significantly associated with shorter overall survival in CRC patients. A low expression of CYBB (HR = 0.54, p = 0.015) suggested a poor prognosis (Fig. 3C). It is notable that the diagnostic ROC results demonstrated that DAPK1 (AUC = 0.749, 95%CI 0.704–0.793) and CYBB (AUC = 0.709, 95%CI 0.660–0.758) had good diagnostic value. However, SLC23A2 had low diagnostic efficiency (AUC = 0.559, 95%CI 0.4888-0.630) (Fig. 3D). In conclusion, we found that DAPK1 is highly expressed in CRC metastasis and is associated with a poor prognosis, and has good diagnostic value. Therefore, we selected DAPK1 for further study.
Fig. 3.
DAPK1 is highly expressed in CRC metastasis and is correlated with a poor prognosis. (A) TNMplot analysis of mountain maps of the expression of identified genes in colon cancer metastasis. (B) Expression of the identified gene in colon cancer metastasis. (C) Survival analysis of the identified gene. (D) Diagnostic ROC curve of the identified gene
Functional analysis of DAPK1 co-expressed genes
MF analysis of the differential genes was mainly focused on identical protein binding (Fig. 1E). Hence, we used the UALCAN database to search for genes co-expressed with DAPK1 (Fig. 4A). The STRING database was used to construct the protein interaction network for the top 50 genes co-expressed with DAPK1, which was divided into 6 clusters (Fig. 4B). Further functional analysis of the co-expressed genes was carried out. KEGG analysis indicated that the related genes were mainly involved in the Chemokine signaling pathway (Fig. 4C). BP analysis showed that the related genes were mainly associated with positive regulation of cell migration, positive regulation of the apoptotic process, and the innate immune response (Fig. 4D). These results suggested that the genes co-expressed with DAPK1 were mainly related to the immune microenvironment and the migration of tumor cells. Therefore, we further explored the correlation between DAPK1 and immune infiltration. ssGSEA revealed that DAPK1 was most strongly correlated with macrophages (Fig. 4E), and CIBERSORT showed that DAPK1 was most strongly correlated with M2 macrophages (Fig. 4F). Analysis of DAPK1 expression in macrophages showed that DAPK1 was highly expressed in these cells (Fig. 4G). Finally, we showed the difference in immune infiltration between the DAPK1 high- and low-expression groups using stacked bar graphs (Fig. 4H).
Fig. 4.
Functional analysis of DAPK1 co-expressed genes. (A) Heat map of DAPK1 co-expressed genes. (B) DAPK1 protein interaction network constructed with the STRING database. (C) KEGG analysis of co-expressed genes. (D) GO analysis of co-expressed genes. (E) Association between DAPK1 and immune infiltration analyzed with the ssGSEA algorithm. (F) Association between DAPK1 and immune infiltration analyzed with the CIBERSORT algorithm. (G) Relationship between DAPK1 expression and macrophages. (H) The association between high and low expression of DAPK1 and immune infiltration
Analysis of the immune microenvironment of CRC metastasis based on scRNA-seq
GSE136394 is an scRNA-seq dataset from patients with metastatic CRC. By analyzing it, we identified 22 cell clusters and 4 cell types: CD4Tconv, CD8T, Mono/Macro, and Tprolif (Fig. 5A, B). The marker genes for each cell type are shown in Fig. 5C. We found that DAPK1 was significantly enriched in Mono/Macro cells (Fig. 5D), consistent with the immune infiltration results for DAPK1. Cellular interaction analysis showed that DAPK1+ Mono/Macro cells mainly interact with CD8T, CD4T and Tprolif cells (Fig. 5E). Subsequently, GSEA showed that the HALLMARK_APOPTOSIS, HALLMARK_P53_PATHWAY, HALLMARK_PI3K_AKT_MTOR_SIGNALING and HALLMARK_ANGIOGENESIS gene sets were specifically enriched in CRC metastatic tissues (Fig. 5F-I).
Fig. 5.
Analysis of the immune microenvironment of CRC metastasis based on scRNA-seq. (A) Cell clusters identified in CRC metastatic tissue based on GSE136394. (B) Cell types identified in CRC metastatic tissue based on GSE136394. (C) Representative marker genes for cell type identification in CRC metastatic tissue based on GSE136394. (D) Expression levels of DAPK1 in identified cell types in CRC metastatic tissues based on GSE136394. (E) Interactions between DAPK1+ Mono/Macro and other cells based on GSE136394. (F) Enrichment of the HALLMARK_APOPTOSIS gene set in identified cell types based on GSE136394. (G) Enrichment of the HALLMARK_P53_PATHWAY gene set in identified cell types based on GSE136394. (H) Enrichment of the HALLMARK_PI3K_AKT_MTOR_SIGNALING gene set in identified cell types based on GSE136394. (I) Enrichment of the HALLMARK_ANGIOGENESIS gene set in identified cell types based on GSE136394
DAPK1 mutation analysis and epigenetic analysis
PhosphoSitePlus results suggested that, among tumors, DAPK1 has the highest mutation frequency in CRC (Fig. 6A). We further explored DAPK1 mutations in CRC using the cBioPortal database. The findings indicated that the genetic alterations of DAPK1 in CRC were mainly missense mutations (Fig. 6B). DAPK1 is frequently mutated in CRC (Fig. 6C). Furthermore, we mapped the mutation sites of DAPK1 (Fig. 6D). These results imply that DAPK1 is closely associated with CRC mutations.
Fig. 6.
DAPK1 mutation analysis and epigenetic analysis. (A) Mutation frequency of DAPK1 in tumors. (B) Mutation frequency of DAPK1 in CRC. (C) Types of DAPK1 mutations in CRC. (D) Mutation sites of DAPK1. (E) Modification of DAPK1. (F) Visualization of changes in chromatin accessibility in DAPK1
We investigated the modification sites of DAPK1 using PhosphoSitePlus. The findings indicated that DAPK1 can undergo acetylation, phosphorylation, and ubiquitination (Fig. 6E). Acetylation is closely associated with chromatin. The GSE153016 dataset captures changes in chromatin accessibility between CRC and liver metastasis. We visualized the ATAC-seq results in IGV. The results demonstrated that chromatin accessibility at the DAPK1 locus was significantly increased in CRC liver metastases (Fig. 6F).
DAPK1 facilitates the migration and invasion of LoVo cells
We used the Lncar website to compare the expression of DAPK1 in metastatic and non-metastatic colon cancer tissues. The results indicated that the expression of DAPK1 was higher in metastatic tissues than in non-metastatic tumor tissues (Fig. 7A). Next, we examined RNA-seq data from colon cancer cells after DAPK1 knockout using GSE130488. RNA-seq identified 25 down-regulated genes and 87 up-regulated genes (Fig. 7B). KEGG analysis indicated that the differential genes were mainly involved in pathways in cancer and focal adhesion (Fig. 7C). GO analysis revealed that DAPK1 was mainly associated with cell migration, focal adhesion, and growth factor binding (Fig. 7D). GSEA results demonstrated that DAPK1 was closely related to the cell cycle (Fig. 7E). These results suggest that DAPK1 is closely associated with the migration and invasion of CRC. Therefore, we knocked out DAPK1 in colon cancer cells by CRISPR/Cas9. The sgRNAs were designed using the CRISPick website. Sanger sequencing demonstrated that the CRISPR/Cas9 plasmid was successfully constructed (Fig. 7F). qRT-PCR results indicated that the CRISPR/Cas9 plasmid effectively reduced the expression of DAPK1 (Fig. 7G). We selected LoVo cells, a metastatic colon cancer cell line, for functional experiments to investigate the effect of DAPK1 on colon cancer metastasis. Scratch assays revealed that the migration ability of LoVo cells was significantly decreased after DAPK1 knockout (Fig. 7H). Transwell assays showed that the invasion ability of LoVo cells was inhibited after DAPK1 knockout (Fig. 7I). The colony-forming assay indicated that the proliferation ability of LoVo cells was attenuated following DAPK1 knockout (Fig. 7J). When LoVo cells were co-cultured with normal liver cells, DAPK1 promoted their migration toward the liver cells, and after DAPK1 knockout the invasive ability of LoVo cells toward the liver cells was attenuated (Fig. 7K). Furthermore, we overexpressed DAPK1 in LoVo cells (Fig. 7L). The scratch and transwell assays demonstrated that DAPK1 overexpression enhanced the migration and invasion of LoVo cells (Fig. 7M, N). These results suggest that DAPK1 promotes the migration and invasion of LoVo cells.
Fig. 7.
DAPK1 facilitates the migration and invasion of LoVo cells. (A) Expression of DAPK1 in colon cancer metastatic tissues analyzed with the Lncar website. (B) Volcano plot of GSE130488. (C) KEGG analysis of DAPK1 differential genes. (D) GO analysis of DAPK1 differential genes. (E) GSEA analysis of DAPK1 differential genes. (F) Sanger sequencing of the CRISPR/Cas9 plasmid. (G) qRT-PCR was used to detect the expression of DAPK1 in LoVo cells transfected with the CRISPR/Cas9 plasmid. (H) The migration ability of LoVo cells after DAPK1 knockout was detected by scratch assay. (I) Transwell assay to detect the invasion ability of LoVo cells after DAPK1 knockout. (J) The colony-forming assay was used to detect the effect of DAPK1 on the proliferation of LoVo cells. (K) The effect of DAPK1 on the invasion of LoVo cells toward liver cells was investigated by co-culturing LoVo cells with normal liver cells. (L) The overexpression efficiency of DAPK1 was detected by qRT-PCR. (M) The scratch assay was used to detect the effect of overexpressed DAPK1 on the migration of LoVo cells. (N) The transwell assay was used to investigate the effect of overexpressed DAPK1 on the invasion of LoVo cells
Discussion
Bioinformatics has played an important role in the search for tumor markers. Jackson et al. identified SCN3B as a potential candidate for glioma biomarker through data analysis and further verified it through experiments [18, 19]. In this study, by integrating multiple omics data and conducting functional experiments, we systematically disclosed the cancer-promoting role of DAPK1 in CRC metastasis and its heterogeneous regulatory network for the first time. Through the combination of machine learning screening, epigenetic remodeling, and immune microenvironment analysis, we not only verified the functional necessity of DAPK1 as a driver of CRC metastasis but also provided a theoretical basis for its clinical transformation.
DAPK1 has been demonstrated as a biomarker in various tumors. In ovarian cancer, DAPK1 has been recognized as a hub gene for autophagy, which is employed to predict the prognosis of ovarian cancer [20]. In oral cancer, DAPK1 has been identified as a potential early marker [21]. Bioinformatics discloses DAPK1 as a potential mechanism and biomarker of glioma necrotic apoptosis [22]. In glioma, scRNA-seq has uncovered the mechanism of action of DAPK1 in glioma, with implications for machine diagnosis and prognosis [23]. In our study, we identified DAPK1, a tumor marker that is highly expressed in CRC metastases, through three types of machine learning and bioinformatics.
Li et al. utilized the ESTIMATE algorithm to identify and predict the relevant genes of the tumor microenvironment in ovarian cancer patients [24]. Research findings suggest that the mRNA level of DAPK1 is markedly upregulated in CRC tissues. Its high expression is enriched in subtypes featuring anoikis resistance characteristics. This subtype not only activates the pro-metastasis pathway but also exhibits significant immune cell infiltration. Moreover, through the assessment of the risks of BRAF, TP53, and KRAS mutations, it was discovered that the risk score of BRAF-mutated patients in this subtype is higher than that of non-mutated patients, whereas there is no significant difference regarding TP53 and KRAS mutations [25]. In gastric cancer, the expression level of DAPK1 is positively correlated with immune cell infiltration, as well as the IC50 values of 5-fluorouracil and cisplatin in gastric cancer (GC) tissues [26]. DAPK1 may not only be an effective prognostic factor for cancer patients but also serve as a promising predictive immunotherapeutic biomarker for those treated with immune checkpoint inhibitors [27]. We discovered that DAPK1 is closely related to macrophages through scRNA-seq of CRC metastasis. By intersecting TAM marker genes obtained from scRNA-seq data with M2 macrophage module genes from a large number of RNA-seq data to obtain TAM-M2 related genes, DAPK1 was screened out as a key prognostic gene for CRC. Using single cells and RNA-seq to reveal the interactions between tumor-associated macrophages and common molecular subtypes in CRC, DAPK1 was identified as a prognostic gene [28]. This is consistent with our findings.
In addition, DAPK1 is closely related to epigenetic regulation. There is a strong correlation between DAPK1 promoter methylation and cervical cancer [29]. Alpha-linolenic acid regulates cervical cancer through epigenetic mechanisms that control the methylation of the 5’ CpG island of the DAPK1 promoter [30]. DAPK1 methylation is a potential biomarker for the early diagnosis of gastrointestinal cancer [9]. Spatially resolved multi-omics analysis of primary colorectal cancer reveals the occurrence of DNA mutations in chromatin modification genes and somatic chromatin remodeling in CRC [31]. In our study, ATAC-seq showed that chromatin accessibility at DAPK1 is increased in CRC liver metastases. The co-occurrence of increased accessibility and upregulated expression indicates that the expression differences might originate from direct epigenetic regulation. This provides a more precise direction and a stronger theoretical foundation for subsequent functional verification.
Liu et al. combined limma, KEGG and GO enrichment analysis, cell landscape-based PPI networks, and Kaplan-Meier survival analysis to analyze single-cell and bulk RNA sequencing data, and identified CENPA as a potential biomarker and therapeutic target in cancers [32]. In our study, functional enrichment analysis of RNA-seq data showed that DAPK1 is closely associated with tumor-related pathways. In advanced colon cancer and thyroid cancer, DAPK1 is involved in tumor EMT and cancer stem cell regulation [11]. High DAPK1 expression promotes metastasis of gastric cancer [10]. qPCR showed that DAPK1 is significantly upregulated in colon cancer cells [28]. LoVo cells were derived from a metastatic lesion of a patient with colon cancer [33]. After DAPK1 was knocked out in LoVo cells, their migration and invasion abilities were weakened. Co-culture of LoVo cells with normal liver cells indicated that DAPK1 is associated with a tendency of the tumor cells to migrate toward liver cells, and this invasive ability was attenuated after DAPK1 knockout. Together, these results suggest that DAPK1 promotes the migration and invasion of LoVo cells.
In summary, we combined bioinformatics and machine learning to identify DAPK1 as a biomarker for liver metastasis in CRC. In recent years, machine learning has demonstrated great potential in predicting cancer prognosis. Our study offers preliminary evidence that DAPK1 plays a significant role in the progression of CRC liver metastasis and might serve as a biomarker for the disease. This research builds on an analytical framework based on public databases such as TCGA and follows strategies that have proved successful in prior cancer genomics investigations [34, 35]. However, we acknowledge the intrinsic limitations of bulk transcriptome data such as those from TCGA, including intratumoral heterogeneity and potential technical biases, which may complicate the interpretation of differential expression and pathway enrichment analyses [36, 37]. Future studies should incorporate advanced computational techniques such as generative adversarial networks to enhance model robustness through synthetic data generation and to identify elusive DAPK1-related regulatory patterns beyond the reach of traditional methods [38]. In addition, the absence of in vivo metastasis models limits functional validation of DAPK1 in a physiological microenvironment. As tumor biology enters the era of precision medicine, treatment strategies are increasingly centered on targeting the tumor microenvironment and specific molecular mechanisms [39]. This study identified DAPK1 as a key driver of liver metastasis in CRC, which deepens our understanding of the metastatic mechanism and opens possibilities for clinical application. DAPK1 is expected to become a valuable prognostic biomarker and a potential novel therapeutic target for metastatic CRC.
Electronic Supplementary Material
Below is the link to the electronic supplementary material.
Supplementary Material 1: Figure S1: Related to Figure 2. (A) Univariate Cox regression analyses of the relationship between different clinical parameters and the identified genes with OS. (B) Multivariate Cox regression analyses of the relationship between different clinical parameters and the identified genes with OS. (C) Nomogram comprising the identified genes and clinical parameters for predicting the prognosis probability in CRC. (D) Calibration curves of the nomogram showing consistency between the predicted and observed 1-, 3- and 5-year survival rates.
Acknowledgements
This research was supported by The Second Hospital of Dalian Medical University.
Abbreviations
- CRC: Colorectal cancer
- DAPK1: Death-associated protein kinase 1
- TCGA: The Cancer Genome Atlas
- GEO: Gene Expression Omnibus
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- GO: Gene Ontology
- KM: Kaplan-Meier
- BP: Biological processes
- AUC: Area under curve
- scRNA-seq: Single-cell RNA sequencing
- PPI: Protein-protein interaction
- MF: Molecular function
- CC: Cellular component
- GSEA: Gene set enrichment analysis
- HR: Hazard ratio
- CI: Confidence interval
- CT: Cycle threshold
- RF: Random forest
- SVM: Support vector machine
Author contributions
Y. Zhang, Q. Zhang, and Y. Gao were responsible for study concept and design. Y. Gao, Z. Zhang, Z. Feng, and B. Pan were responsible for performing the experiments, acquisition and analysis of data, and drafting the manuscript. Y. Zhang and Q. Zhang were responsible for study supervision. Y. Zhang, Q. Zhang, and Y. Gao were responsible for critical revision of the manuscript. All authors reviewed the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China [No. 82303979], the Liaoning Provincial Department of Education Project Fund [No. LJ212410161022], the Dalian Medical Scientific Research Program Project [No. 2212021], and the “1 + X” Research Project of the Second Hospital of Dalian Medical University [No. CYQH2024017].
Data availability
No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Qianshi Zhang, Email: zhangqs@dmu.edu.cn.
Yinan Zhang, Email: zyn930812@163.com.
References
- 1. Sonkin D, Thomas A, Teicher BA. Cancer treatments: Past, present, and future. Cancer Genet. 2024;286:18–24.
- 2. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229–63.
- 3. Willauer AN, Liu Y, Pereira AAL, et al. Clinical and molecular characterization of early-onset colorectal cancer. Cancer. 2019;125(12):2002–10.
- 4. Barrett T, Wilhite SE, Ledoux P, et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013;41(Database issue):D991–5.
- 5. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30.
- 6. Xu Z, Rasteh AM, Dong A, et al. Identification of molecular targets of hypericum perforatum in blood for major depressive disorder: a machine-learning Pharmacological study. Chin Med. 2024;19(1):141.
- 7. Zheng S, He H, Zheng J, et al. Machine learning-based screening and validation of liver metastasis-specific genes in colorectal cancer. Sci Rep. 2024;14(1):17679.
- 8. Steinmann S, Scheibe K, Erlenbach-Wuensch K, et al. Death-associated protein kinase: A molecule with functional antagonistic duality and a potential role in inflammatory bowel disease (Review). Int J Oncol. 2015;47(1):5–15.
- 9. Yuan W, Chen J, Shu Y, et al. Correlation of DAPK1 methylation and the risk of Gastrointestinal cancer: A systematic review and meta-analysis. PLoS ONE. 2017;12(9):e0184959.
- 10. Wang Q, Weng S, Sun Y, et al. High DAPK1 expression promotes tumor metastasis of gastric cancer. Biology. 2022;11(10):1488.
- 11. You MH. Mechanism of DAPK1 for regulating cancer stem cells in thyroid cancer. Curr Issues Mol Biol. 2024;46(7):7086–96.
- 12. Li L, Guo L, Wang Q, et al. DAPK1 as an independent prognostic marker in liver cancer. PeerJ. 2017;5:e3568.
- 13. Xie JY, Chen PC, Zhang JL, et al. The prognostic significance of DAPK1 in bladder cancer. PLoS ONE. 2017;12(4):e0175290.
- 14. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, et al. The cancer genome atlas Pan-Cancer analysis project. Nat Genet. 2013;45(10):1113–20.
- 15. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.
- 16. Bindea G, Mlecnik B, Tosolini M, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39(4):782–95.
- 17. Chen B, Khodadoust MS, Liu CL, et al. Profiling Tumor Infiltrating Immune Cells with CIBERSORT. Methods Mol Biol. 2018;1711:243–59.
- 18. Liu H, Weng J, Huang CL, et al. Is the voltage-gated sodium channel β3 subunit (SCN3B) a biomarker for glioma? Funct Integr Genomics. 2024;24(5):162.
- 19. Liu H, Hamaia SW, Dobson L, et al. The voltage-gated sodium channel β3 subunit modulates C6 glioma cell motility independently of channel activity. Biochim Biophys Acta Mol Basis Dis. 2025;1871(6):167844.
- 20. Ding J, Wang C, Sun Y, et al. Identification of an Autophagy-Related signature for prognosis and immunotherapy response prediction in ovarian cancer. Biomolecules. 2023;13(2):339.
- 21. Papadopoulos P, Zisis V, Andreadis D, et al. DAPK-1 as a potential early marker for malignant transformation risk of oral lichen planus. Cureus. 2024;16(10):e71714.
- 22. Gao B, Yan S, Xie W, et al. Bioinformatics reveals the potential mechanisms and biomarkers of necroptosis in neuroblastoma. Translational Cancer Res. 2024;13(7):3599–619.
- 23. Yu TH, Ding YY, Zhao SG, et al. Single-cell sequencing uncovers the mechanistic role of DAPK1 in glioma and its diagnostic and prognostic implications. Front Immunol. 2024;15:1463747.
- 24. Li S, Yao J, Zhang S, et al. Prognostic value of Tumor-microenvironment-associated genes in ovarian cancer. BIO Integr. 2023;4(3):84–96.
- 25. Pan YB, Xu WJ, Huang MS, et al. Anoikis-related signature identifies tumor microenvironment landscape and predicts prognosis and drug sensitivity in colorectal cancer. J Cancer. 2024;15(3):841–57.
- 26. Wang F, Hu D, Lou X, et al. BNIP3 and DAPK1 methylation in peripheral blood leucocytes are noninvasive biomarkers for gastric cancer. Gene. 2024;898:148109.
- 27. Yang J, Liu Y, Geng Q, et al. Death associated protein kinase 1 predicts the prognosis and the immunotherapy response of various cancers. Mol Biol Rep. 2024;51(1):670.
- 28. Shi L, Mao H, Ma J. Integrated analysis of tumor-associated macrophages and M2 macrophages in CRC: unraveling molecular heterogeneity and developing a novel risk signature. BMC Med Genom. 2024;17(1):145.
- 29. Agodi A, Barchitta M, Quattrocchi A, et al. DAPK1 promoter methylation and cervical cancer risk: A systematic review and a Meta-Analysis. PLoS ONE. 2015;10(8):e0135078.
- 30. Ulhe A, Raina P, Chaudhary A, et al. Alpha-linolenic acid-mediated epigenetic reprogramming of cervical cancer cell lines. Epigenetics. 2025;20(1):2451551.
- 31. Heide T, Househam J, Cresswell GD, et al. The co-evolution of the genome and epigenome in colorectal cancer. Nature. 2022;611(7937):733–43.
- 32. Liu H, Karsidag M, Chhatwal K, et al. Single-cell and bulk RNA sequencing analysis reveals CENPA as a potential biomarker and therapeutic target in cancers. PLoS ONE. 2025;20(1):e0314745.
- 33. Sochacka-Ćwikła A, Mączyński M, Czyżnikowska Ż, et al. New Oxazolo[5,4-d]pyrimidines as potential anticancer agents: their Design, Synthesis, and in vitro biological activity research. Int J Mol Sci. 2022;23(19):11694.
- 34. Wang Y, Wang J, Zeng T, et al. Data-mining-based biomarker evaluation and experimental validation of SHTN1 for bladder cancer. Cancer Genet. 2024;288:43–53.
- 35. Gao X, Liu D, Liu J, et al. TCGA-based analysis of oncogenic signaling pathways underlying oral squamous cell carcinoma. Oncol Translational Med. 2024;10(2):87–92.
- 36. Liu H, Guo Z, Wang P. Genetic expression in cancer research: challenges and complexity. Gene Rep. 2024;37:102042.
- 37. Liu H, Li Y, Karsidag M, et al. Technical and biological biases in bulk transcriptomic data mining for cancer research. J Cancer. 2025;16(1):34–43.
- 38. Ai X, Smith MC, Feltus FA. Generative adversarial networks applied to gene expression analysis: an interdisciplinary perspective. Comput Syst Oncol. 2023;3:e1050.
- 39. Joshi RM, Telang B, Soni G, et al. Overview of perspectives on cancer, newer therapies, and future directions. Oncol Translational Med. 2024;10(3):105–9.