Abstract
Background
Pancreatic cancer (PAC) has a complex tumor immune microenvironment, and accurate personalized treatment is currently lacking. This study aimed to establish a novel consensus machine learning-driven signature (CMLS) that provides a predictive model and possible treatment targets for PAC.
Methods
This study integrated multiple omics data of PAC patients, applied ten clustering techniques and ten machine learning approaches to construct molecular subtypes for PAC, and created a new CMLS.
Results
Using multi-omics clustering, we discovered two cancer subtypes (CSs) associated with prognosis, of which CS1 had the poorer prognosis. Subsequently, 13 hub genes were identified through screening; together they constitute the CMLS, which has significant prognostic ability. The low CMLS group had a better prognosis and was more likely to have a “hot” tumor phenotype. The prognosis of the high CMLS group was poor. However, the tumor mutation burden (TMB) and tumor neoantigen burden (TNB) in this group were higher than in the low CMLS group, which may favor a response to immunotherapy.
Conclusion
This study emphasizes that CMLS provides a beneficial instrument for early prediction of patient prognosis and screening of probable patients appropriate for immunotherapy and has broad implications for clinical practice.
Supplementary Information
The online version contains supplementary material available at 10.1007/s12672-025-01841-8.
Keywords: Pancreatic cancer, Immune microenvironment, Molecular subtypes, Machine learning, Multi-omics
Introduction
Pancreatic cancer (PAC) is a highly malignant, deadly cancer with insidious onset and a tendency to metastasize, and its incidence and mortality are very high [1]. Pancreatic ductal adenocarcinoma (PDAC) accounts for approximately 90% of pancreatic cancers. The low survival rate is mostly due to the lack of early symptoms, appropriate diagnostic tools, and effective treatments, so that 80–85% of PAC patients already have local progression and metastasis at the time of diagnosis [2, 3]. Although patients with pancreatic cancer frequently receive combinations of adjuvant therapy, neoadjuvant therapy, immunotherapy, and surgery, only a small percentage of patients with resectable tumors benefit [4–6]. The intricate tumor microenvironment of pancreatic cancer may be the primary cause of these unfavorable outcomes [7]. As a result, a better molecular-level understanding of pancreatic cancer is critical for improving diagnosis and therapy.
Immunotherapy has been proven to be an effective treatment for various malignant tumors [8]. However, its overall effect in PAC patients is not ideal, and the response to immunotherapy varies greatly [5]. The “cold” tumor microenvironment is one of the important reasons for this resistance to immunotherapy [9]. Because tumors are heterogeneous, the same treatment plan can have markedly different therapeutic effects in different patients. Because immunotherapy is expensive and can cause adverse responses, developing prognostic assessment tools with high sensitivity and specificity is critical in patient management. Various multi-gene prognostic models have been created to address the extensive heterogeneity of PAC and have performed rather well in some cohorts [10–12]. However, because these prognostic models are based on expression profiles of mRNA, microRNA (miRNA), or long non-coding RNA (lncRNA) in particular pathways such as immunology, metabolism, and methylation, they use the available data inefficiently. In addition, reliance on a single modeling method, with its inherent shortcomings, limits their widespread application in clinical settings. As a result, integrating different omics data is critical for developing prediction models and improving the prognosis of PAC patients.
In this study, we used genomic mutation data, epigenetic DNA methylation data, and mRNA, lncRNA, and miRNA expression profiles from The Cancer Genome Atlas pancreatic adenocarcinoma (TCGA-PAAD) dataset to establish a comprehensive consensus subtype of PAC through a multi-omics integration strategy. We identified 45 stable prognosis-related genes (SPRGs) based on differential expression between subtypes and created a consensus machine learning-driven signature (CMLS) using ten machine learning methods to predict the prognosis of PAC patients [13]. The CMLS showed significant prognostic value in both the training and validation cohorts. By analyzing the immune characteristics related to CMLS, we found that immune cell infiltration and immunotherapy response differed with the CMLS score. Our findings provide reference points for refining the molecular subtypes of PAC and improving patient prognosis.
Materials and methods
Gathering and preparing data
We first gathered multi-omics data from TCGA (https://portal.gdc.cancer.gov), which included gene mutation data, methylation array data, transcriptome expression data (mRNA, lncRNA, and miRNA expression profiles), and corresponding clinical data from the TCGA-PAAD cohort. To allow a complete analysis, we retained only samples with information in all five omics dimensions, giving a total of 145 patients. In addition, we included the dataset GSE183795 from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo) [14]. Transcripts per million (TPM) values were obtained from the high-throughput transcriptome sequencing data, and all array-based expression profiles were deduplicated and standardized. Patients with a survival time of less than one month were excluded to increase the robustness of the downstream analysis. When more than one probe mapped to the same gene symbol, gene expression was defined as the mean expression level of those probes.
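The two preprocessing rules described above (averaging probes that map to the same gene symbol, and excluding patients with a survival time of less than one month) can be illustrated with a minimal sketch. The function names, the `os_months` field, and the toy values are assumptions made for illustration only; they are not taken from the study's actual pipeline, which was run in R.

```python
from collections import defaultdict
from statistics import mean

def collapse_probes(probe_rows):
    """Average all probes that map to the same gene symbol.

    probe_rows: iterable of (gene_symbol, [expression per sample]) tuples.
    Returns {gene_symbol: [mean expression per sample]}.
    """
    grouped = defaultdict(list)
    for symbol, values in probe_rows:
        grouped[symbol].append(values)
    # zip(*rows) walks the samples column by column across the probes.
    return {
        symbol: [mean(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }

def drop_short_survival(patients, min_months=1.0):
    """Keep patients whose survival time is at least `min_months`."""
    return [p for p in patients if p["os_months"] >= min_months]

# Toy input (hypothetical values): two probes for one gene, one probe for another.
probes = [
    ("KRT19", [8.0, 6.0]),
    ("KRT19", [10.0, 8.0]),
    ("CDH3", [5.0, 7.0]),
]
genes = collapse_probes(probes)

patients = [
    {"id": "P1", "os_months": 0.5},
    {"id": "P2", "os_months": 14.0},
]
kept = drop_short_survival(patients)
```

Averaging is only one way to resolve duplicate probes; it is used here because it is the rule stated in the text.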
Multi-omics integration analysis
In this work, we used the MOVICS (Multi-Omics integration and VIsualization in Cancer Subtyping) package to conduct a preliminary cluster analysis of the 145 pancreatic cancer patients [15]. For the continuous variables (mRNA, miRNA, lncRNA, and methylation), we selected the 1000 most variable features using the “getElites” function of the MOVICS package. Prognostic features were then determined through Cox regression analysis, and those significantly associated with survival (p < 0.05) were kept in each data dimension. For the gene mutation data, we screened by mutation frequency to retain the top 5% most commonly mutated genes. The results from these five dimensions were included in our study for further examination.
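As a rough illustration of this kind of feature screening, the sketch below ranks continuous features by their variability and binary mutation features by their frequency. It assumes standard deviation as the measure of variation and uses hypothetical function names and toy data; it is not the MOVICS implementation, and the Cox filtering step is omitted.

```python
from statistics import pstdev

def top_variable_features(matrix, k):
    """Return the k feature names with the largest standard deviation.

    matrix: {feature_name: [value per sample]}.
    Assumption: variability is measured by the population standard deviation.
    """
    ranked = sorted(matrix, key=lambda name: pstdev(matrix[name]), reverse=True)
    return ranked[:k]

def top_mutated_genes(mutations, fraction):
    """Return the most frequently mutated genes, keeping the top `fraction`.

    mutations: {gene: [0/1 mutation flag per patient]}.
    """
    freq = {gene: sum(flags) / len(flags) for gene, flags in mutations.items()}
    n_keep = max(1, round(len(freq) * fraction))
    ranked = sorted(freq, key=freq.get, reverse=True)
    return ranked[:n_keep]

# Toy data (hypothetical): three expression features, four mutation profiles.
expression = {
    "G1": [1, 1, 1, 1],
    "G2": [0, 10, 0, 10],
    "G3": [4, 5, 6, 5],
}
variable = top_variable_features(expression, k=2)

mutation_flags = {
    "KRAS": [1, 1, 1, 0],
    "TP53": [1, 1, 0, 0],
    "SMAD4": [0, 1, 0, 0],
    "CDKN2A": [0, 0, 0, 0],
}
frequent = top_mutated_genes(mutation_flags, fraction=0.5)
```

In the study the corresponding thresholds were the top 1000 features and the top 5% of mutated genes; the smaller values here only keep the example readable.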
After preliminary feature selection, we used the “getClustNum” function in the MOVICS package, which incorporates the clustering prediction index (CPI), Silhouette score, and gap statistics, to estimate the optimal number of clusters. We finally decided to divide pancreatic cancer into two subtypes. Following that, we performed cluster analysis with ten clustering algorithms (SNF, CIMLR, PINSPlus, NEMO, COCA, MoCluster, LRAcluster, ConsensusClustering, IntNMF, and iClusterBayes). Based on the concept of consensus clustering, we then used the “getConsensusMOIC” function to combine the clustering results of these algorithms and improve clustering accuracy. This integration yielded the final clustering results.
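The idea behind combining several clustering results can be sketched with a consensus (co-clustering) matrix: for every pair of samples, count the fraction of algorithms that place them in the same cluster. The sketch below is a simplified illustration with toy labels and a hypothetical function name; it does not reproduce the internals of MOVICS, and in practice the resulting matrix would still need to be clustered (for instance hierarchically) to obtain the final subtypes.

```python
def consensus_matrix(label_sets):
    """Fraction of clustering runs that put each pair of samples together.

    label_sets: list of label vectors, one per algorithm, all of equal length.
    Cluster names need not match between runs, because only "same cluster
    or not" is compared within each run.
    """
    n_samples = len(label_sets[0])
    n_runs = len(label_sets)
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            together = sum(1 for labels in label_sets if labels[i] == labels[j])
            matrix[i][j] = together / n_runs
    return matrix

# Toy example (hypothetical): three algorithms, four samples.
# The second run uses swapped label names but describes the same partition
# as the first run.
runs = [
    [1, 1, 2, 2],
    [2, 2, 1, 1],
    [1, 1, 1, 2],
]
matrix = consensus_matrix(runs)
```

Values close to 1 or 0 indicate that the algorithms agree on whether two samples belong together; intermediate values mark samples whose assignment depends on the algorithm.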
Molecular properties and stability of consensus subtypes
Using single-sample gene set enrichment analysis (ssGSEA), we determined the enrichment scores of 29 immunological characteristics previously reported in the literature [16]. To elucidate the complex network of gene expression, we constructed a transcriptional regulatory network with the RTN package (reconstruction of transcriptional regulatory networks and analysis of regulons), covering 23 induced/repressed target-related transcription factors and regulators related to chromatin remodeling during carcinogenesis [17, 18]. We then compared the distribution of immune checkpoint genes across subtypes. To further evaluate the immunological milieu within tumor tissue, we used the ESTIMATE package to estimate the proportion of immune and stromal cells [19]. Following established procedures, the DNA methylation score of tumor-infiltrating lymphocytes (MeTIL) was determined. GSVA was used to assess the enrichment of 24 tumor immune microenvironment cell types. To verify the consistency and stability of the subtypes, we used the top 100 genes that were significantly upregulated in each subtype as subtype-specific biomarkers and applied them to the validation cohort. We then compared the consistency of the consensus clustering with the partition around medoids (PAM) and nearest template prediction (NTP) classifiers [20, 21], using Kappa values to quantify the agreement between the multi-omics subtypes and each classifier.
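For readers unfamiliar with the Kappa statistic, the following minimal sketch computes Cohen's kappa between two sets of subtype assignments. The labels are toy values chosen for illustration and the function name is hypothetical; the study's own Kappa values were computed in R.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: probability that both labelings pick the same class
    # if they assigned classes independently with their observed frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1 - expected)

# Toy example (hypothetical): ten samples, one disagreement.
consensus_labels = ["CS1"] * 5 + ["CS2"] * 5
classifier_labels = ["CS1"] * 4 + ["CS2"] * 6
kappa = cohen_kappa(consensus_labels, classifier_labels)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which is why it is preferred over raw accuracy when subtype sizes are unbalanced.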
Establishing consensus machine learning-driven prognostic signature
We used the TCGA-PAAD cohort, which has relatively complete treatment data, as the training set and GSE183795 as the validation set to assess the association between CMLS and immunotherapy. To create a highly accurate and reliable CMLS, we integrated ten machine learning algorithms: random survival forest (RSF), Ridge, Lasso, CoxBoost, stepwise Cox, elastic net (Enet), partial least squares regression for Cox (plsRcox), generalised boosted regression modelling (GBM), supervised principal components (SuperPC), and survival support vector machine (survival-SVM) [22]. We performed a univariate Cox analysis on the TCGA-PAAD and GSE183795 cohorts. Genes with the same hazard ratio (HR) direction and p < 0.05 in both cohorts were defined as SPRGs. We used these genes to construct the most predictive CMLS from 101 combinations of the ten algorithms. We calculated the average c-index of each model, and the model with the highest average c-index was deemed the optimal CMLS. The genes retained by this model were considered model genes. Finally, a multivariate Cox analysis was conducted to generate a CMLS score for each patient. The TCGA cohort was split into high and low CMLS groups at the median CMLS score, and the prognostic differences between the two groups were examined to evaluate the model's accuracy.
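The selection logic described above can be sketched in a few steps: keep genes that are significant with a consistent HR direction in both cohorts, score each candidate model by Harrell's c-index, pick the model with the highest average c-index, and split patients at the median score. All names and numbers below are hypothetical placeholders (including the gene and model labels); the sketch does not fit any of the ten algorithms and only illustrates the bookkeeping around them.

```python
from statistics import mean, median

def stable_genes(train_stats, valid_stats, alpha=0.05):
    """Genes significant in both cohorts with HRs on the same side of 1.

    Each argument maps gene -> (hazard_ratio, p_value).
    """
    keep = []
    for gene, (hr_t, p_t) in train_stats.items():
        if gene not in valid_stats:
            continue
        hr_v, p_v = valid_stats[gene]
        same_direction = (hr_t > 1) == (hr_v > 1)
        if same_direction and p_t < alpha and p_v < alpha:
            keep.append(gene)
    return keep

def concordance_index(times, events, scores):
    """Harrell's c-index; a higher score should mean a shorter survival."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is comparable when i has an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

def median_split(scores):
    """Label each patient 'high' or 'low' relative to the median score."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical univariate Cox results for placeholder genes.
train = {"GENE_A": (1.8, 0.01), "GENE_B": (0.6, 0.03), "GENE_C": (1.4, 0.02)}
valid = {"GENE_A": (1.5, 0.04), "GENE_B": (1.2, 0.01), "GENE_C": (1.3, 0.20)}
sprgs = stable_genes(train, valid)

# Hypothetical follow-up data and risk scores for four patients.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.1]
cindex = concordance_index(times, events, scores)
groups = median_split(scores)

# Hypothetical c-index values per model in [training, validation] cohorts.
cindex_by_model = {"model_a": [0.70, 0.62], "model_b": [0.66, 0.65]}
best_model = max(cindex_by_model, key=lambda m: mean(cindex_by_model[m]))
```

Ranking by the average c-index across cohorts, rather than by the training c-index alone, is what guards against choosing a model that only fits the training data well.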
Prognostic value and clinical applications of CMLS
To assess the CMLS model’s prognostic significance, we compared the prognosis of the high and low CMLS groups using Kaplan–Meier survival curves. Meanwhile, we collected 20 prognostic gene models related to pancreatic cancer from previously published studies and calculated each sample’s risk score according to the published coefficients. Then, we used the c-index to assess the predictive performance of each published signature in each cohort and compared it with that of CMLS. The prognostic significance of the signature and clinical characteristics was assessed using univariate and multivariate Cox regression models.
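As a reminder of what a Kaplan–Meier curve computes, the sketch below implements the product-limit estimator for a single group. The function name and follow-up data are toy assumptions; the study's curves were produced with R packages, and comparing two groups would additionally require a test such as the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    Returns [(time, survival probability)] at each time with at least one death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Toy follow-up data (hypothetical): five patients, two censored.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the number at risk until they leave the study, which is what distinguishes this estimate from a simple proportion of survivors.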
Immune characteristics based on CMLS
Using the IOBR package, we comprehensively examined the immunological differences between high and low CMLS patients, including the tumor microenvironment (TME) cell types, immunotherapy responses, and immune exclusion-related features [23]. We then compared the tumor mutation burden (TMB) and tumor neoantigen burden (TNB) between the high and low CMLS groups. We reclassified patients by combining CMLS with TMB or TNB and compared their survival outcomes. In addition, we evaluated the delayed response survival (3 months) of patients receiving immunotherapy using the IMvigor-MUC cohort. We also used the tumor immune phenotype (TIP) algorithm to evaluate the immunotherapy response of the TCGA-PAAD cohort. Further validation was obtained in GSE78220, GSE135222, and GSE91061. The “c2.cp.Kegg.Hs.symbols.gmt” gene set was used for gene set enrichment analysis (GSEA) to analyze the activation of oncogenic pathways in high and low CMLS patients.
Drug sensitivity analysis
The half-maximal inhibitory concentrations (IC50) of the drugs were obtained from the Genomics of Drug Sensitivity in Cancer database (GDSC, https://www.cancerrxgene.org). The IC50 of each drug in each PAC sample was estimated using the “pRRophetic” package [24].
Statistical analysis
A t-test or Wilcoxon rank-sum test was used to compare continuous variables, depending on the sample size and data distribution. Fisher's exact test or the chi-squared test was used to compare categorical variables. The “surv_cutpoint” function of the “survminer” package was used to determine the cut-off value of the CMLS score. The hazard ratio (HR) and 95% confidence interval (CI) were determined using the Cox proportional hazards regression model after adjusting for confounding variables. R software 4.4.0 was used for all statistical analyses. P < 0.05 was considered statistically significant.
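To make the idea of a data-driven cut-off concrete, the sketch below scans candidate thresholds and keeps the one that gives the largest two-group log-rank statistic. This is a simplified stand-in: the survminer function relies on maximally selected rank statistics, whereas this example substitutes a plain log-rank chi-square scan, and the function names, minimum group size, and toy data are all assumptions.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        n_all = sum(1 for ti in times if ti >= t)
        n_high = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)
        d_all = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d_high = sum(
            1 for ti, e, g in zip(times, events, in_high)
            if ti == t and e == 1 and g
        )
        share = n_high / n_all
        obs_minus_exp += d_high - d_all * share
        if n_all > 1:
            variance += d_all * share * (1 - share) * (n_all - d_all) / (n_all - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0

def best_cutpoint(times, events, scores, min_group=2):
    """Scan observed scores and return (cutpoint, statistic) with the largest
    log-rank statistic, requiring at least `min_group` patients per group."""
    best = (None, -1.0)
    for cut in sorted(set(scores)):
        in_high = [s > cut for s in scores]
        n_high = sum(in_high)
        if min(n_high, len(scores) - n_high) < min_group:
            continue
        stat = logrank_chi2(times, events, in_high)
        if stat > best[1]:
            best = (cut, stat)
    return best

# Toy data (hypothetical): six patients, all with observed events.
times = [2, 3, 4, 10, 12, 14]
events = [1, 1, 1, 1, 1, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
cut, stat = best_cutpoint(times, events, scores)
```

Because the threshold is chosen to maximise separation, the p-value of the resulting split is optimistic and is usually interpreted with that selection in mind.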
Results
Multi-omics consensus prognosis-related molecular subtypes of PAC
Figure 1 depicts the workflow of this study. After processing all data, we determined that two was the optimal number of clusters across the 10 multi-omics clustering algorithms, based on the cluster prediction index and gap statistics (Fig. 2A). The clustering results were combined across the transcriptome (mRNA, miRNA, and lncRNA), epigenetic methylation, and somatic mutation data using the consensus ensemble method (Fig. 2B–D). Our multi-omics molecular subtypes were significantly associated with overall survival (OS) (p < 0.001; Fig. 2E). Notably, cancer subtype 1 (CS1) exhibited higher mutation frequency and methylation characteristics, while cancer subtype 2 (CS2) had better survival rates.
Fig. 1.
A flow diagram
Fig. 2.
The PAC subtypes from multi-omics integrative consensus clustering A The sample similarity of each subgroup was assessed by calculating the Silhouette score. B Comprehensive heatmap of consensus ensemble subtypes, comprising mRNA, lncRNA, miRNA, methylation sites, and mutant genes. C The heatmap displays the distribution of the 10 clustering methods for PAC patients. D Consensus clustering matrix derived from 10 algorithms for the two new prognostic subtypes. E Differences in survival between the two subgroups
Biological differences among molecular subtypes of pancreatic cancer
The emergence of different molecular subtypes may be linked to specific biological roles, so we investigated the molecular properties of each subtype. We employed the ssGSEA algorithm to assess the enrichment status of various molecular features in each sample. Notably, most immune cells had higher levels of infiltration in CS2, which also showed significant activation of immune-related pathways (Fig. 3A). These pathways were related to immune cell function, cytokine signaling, and chemokine signaling. We further explored transcriptome differences by analyzing 23 transcription factors (TFs) and potential regulatory factors associated with cancer chromatin remodeling (Fig. 3B). The EGFR, ERBB2, TP63, fibroblast growth factor receptor 3 (FGFR3), and FOXA1 regulons were considerably more active in CS1, whereas FGFR1, androgen receptor (AR), HIF1A, and STAT3 were specifically enriched in CS2. The activity profile related to chromatin remodeling points to different modes of regulation between CSs, implying that epigenetically driven transcriptional networks may be crucial determinants of these molecular subtypes. Given the importance of tumor immunity in tumor occurrence and development, we assessed immune cell infiltration using the ESTIMATE algorithm. We found that the expression of immune checkpoint genes (such as PDCD1 and CTLA4) was significantly higher in CS2 and relatively low in CS1 (Fig. 3C). The immune cell analysis using the CIBERSORT algorithm also showed significant enrichment of naive B cells and CD8+ T cells in CS2. In addition, the methylation and immune scores indicated that CS2 had a relatively high level of immune infiltration.
Fig. 3.
Molecular landscape of the multi-omics subtypes A Heatmap of immune-related enrichment scores for the two subtypes. B Heatmap of regulon activity profiles for 23 transcription factors (TFs) and potential regulators associated with chromatin remodeling in the two subtypes. C Immune profiles of the TCGA-PAAD cohort. The heatmap’s upper annotation displays the immune enrichment score, stromal enrichment score, and DNA methylation of tumor-infiltrating lymphocytes. The expression of canonical immune checkpoint genes is displayed in the top panel, while the enrichment levels of 24 TME-related immune cell types are displayed in the bottom panel. D Survival analysis of PAC CSs in the GSE183795 cohort. E The consistency of CSs with NTP in the TCGA-PAAD cohort. F The consistency of CSs with PAM in the TCGA-PAAD cohort. G The consistency of NTP with PAM in the GSE183795 cohort
We selected the 100 genes that were specifically upregulated in each subtype as classifiers to validate the subtypes’ stability in an external cohort. NTP assigned each sample in the GSE183795 cohort to one of the identified CSs. Notably, the prognosis of CS2 in the GSE183795 cohort was consistent with the results of the TCGA-PAAD cohort (p = 0.032, Fig. 3D). The consistency of the multi-omics subtypes with the NTP and PAM algorithms was also tested. The Kappa value of the multi-omics subtypes compared with NTP was 0.683 (p < 0.001, Fig. 3E), and that compared with PAM was 0.740 (p < 0.001, Fig. 3F). Similarly, in the GSE183795 cohort, the Kappa value between NTP and PAM was 0.827 (p < 0.001, Fig. 3G).
Establishment of the CMLS prognostic model
We used univariate Cox regression analysis to identify 45 SPRGs whose expression was significantly correlated with OS and used them to construct a prognostic model. In the training cohort, we constructed consensus models based on 101 algorithm combinations. To assess the predictive capacity of all models in the cohorts, we computed the average c-index (Fig. 4A). The final model, generated from 13 genes using the RSF algorithm, had the highest average c-index among the 101 models (Fig. 4B, C). In TCGA-PAAD and GSE183795, patients with high CMLS had poorer clinical outcomes (Fig. 4D, E).
Fig. 4.
The establishment and prognostic value of CMLS A A comprehensive computational framework was used to construct 101 machine learning algorithm combinations. The c-index of each model was determined using the TCGA-PAAD and GSE183795 cohorts and ordered according to the validation set’s average c-index. B The hub genes selected using the RSF algorithm. C The univariate Cox regression analysis results of hub genes in the training and validation cohorts. D–E Survival analysis of PAC patients with high CMLS and low CMLS in the TCGA-PAAD and GSE183795 cohorts
Comparison of prognostic signatures in PAC
We searched for relevant studies published within the last 5 years in order to thoroughly compare CMLS with other prognostic signatures. As a result, we included 20 distinct prognostic models in our analysis (Table S1). These models were based on genes linked to a variety of biological processes, including energy metabolism, ferroptosis, immunotherapy response, and pyroptosis. In the TCGA-PAAD and GSE183795 datasets, CMLS outperformed the other models in terms of c-index (Fig. 5A, B). The CMLS-based risk score was an independent prognostic predictor for PAC, according to univariate and multivariate Cox regression analyses (p < 0.001, Fig. 5C, D).
Fig. 5.
Clinical practice value of CMLS A–B Comparison of the CMLS with the other 20 published models. C A forest plot depicting the univariate Cox analysis of CMLS and other clinical factors in the TCGA-PAAD cohort. D A forest plot illustrating the multivariate Cox analysis of CMLS and other clinical factors in the TCGA-PAAD cohort
Immune characteristics related to CMLS
After a thorough analysis of the TME of PAC, we found that patients with low CMLS had significantly higher levels of immune cell infiltration, including T cells, B cells, and NK cells, than patients with high CMLS (Fig. 6A). This indicated that PAC with low CMLS was more likely to be classified as a “hot” tumor. Patients with high CMLS had significantly higher levels of infiltrating inhibitory immune cells, such as myeloid-derived suppressor cells (MDSCs). Additionally, patients with high CMLS were enriched for molecular markers associated with immune exclusion, such as the epithelial-mesenchymal transition (EMT) pathway, indicating an immunosuppressive state (Fig. 6B); PAC with high CMLS was therefore more likely to be classified as a “cold” tumor. Notably, the gene set related to immunotherapy was more enriched in the high CMLS group (Fig. 6C). To learn more about the immunotherapy characteristics of high CMLS, we used TMB and TNB to assess patient responsiveness to immunotherapy. The high CMLS group had higher TMB and TNB levels, suggesting that these tumors may be more immunogenic and more amenable to immunotherapy (Fig. 6D, E). Survival analysis showed that CMLS could complement TMB and TNB in stratifying patient prognosis (Fig. 6F, G). Patients with PAC who had low CMLS and high TMB or TNB had a better prognosis and a higher survival rate.
Fig. 6.
The TME-related molecular features in patients with high and low CMLS A Comparison of TME immune cell type signatures in high and low CMLS patients. B The distribution of immune exclusion signatures between high and low CMLS patients. C The distribution of immunotherapy biomarkers between high and low CMLS patients. D The distribution of TMB between high- and low-CMLS patients. E The distribution of TNB between high and low CMLS patients. F–G CMLS, TMB, and TNB were integrated in the survival analysis
The predictive ability of CMLS in the immunotherapy cohorts
We analyzed the IMvigor-MUC cohort to assess the role of CMLS in PAC immunotherapy and found differences in patients’ long-term survival after three months of treatment (p < 0.001, Fig. 7A). The prognosis of the low CMLS group was better, suggesting that these patients benefited more from immunotherapy. The analysis of CMLS distribution among patients with different responses revealed that the CMLS score of patients with a partial response (PR) was substantially lower than that of patients with progressive disease (PD) (p < 0.001, Fig. 7B). To investigate possible immunological mechanisms linked to CMLS, we calculated the tracking tumor immune phenotype (TIP). The results indicated that the low CMLS group differed significantly mainly in step 4 (recruitment of tumor-infiltrating immune cells), which was consistent with the findings of the previous analysis (Fig. 7C). We revalidated our findings using prognostic data from several immunotherapy validation cohorts. After immunotherapy, patients with lower CMLS had a better prognosis (GSE78220, p < 0.001, Fig. 7D; GSE135222, p < 0.001, Fig. 7E). Moreover, low CMLS was frequently linked to improved immunotherapy outcomes (GSE91061, p = 0.008, Fig. 7F). GSEA showed significant activation of the P53 signaling pathway in PAC patients with high CMLS (Fig. 8A), while the chemokine signaling pathway and T cell receptor signaling pathway were significantly activated in PAC patients with low CMLS (Fig. 8B). Using the GDSC database, we found that the IC50 values of Acetalax and SCH772984 differed significantly between the high and low CMLS groups, suggesting that these agents may be suitable for patients with high CMLS (Fig. 8C, D).
Fig. 7.
CMLS’s predictive value for immunotherapy response in patients with PAC A The difference in long-term survival (LTS) between the high and low CMLS groups following 3 months of treatment. B The distribution of CMLS in different immunotherapy response groups. C Differences in the degree of activation between the high and low CMLS groups at each step of TIP. D Survival analysis of the high and low CMLS groups in GSE78220. E Survival analysis of the high and low CMLS groups in GSE135222. F Distribution of CMLS in different immunotherapy response groups of GSE91061
Fig. 8.
Potential agents for patients with high CMLS A–B Pathways significantly activated in the high and low CMLS groups, identified through the GSEA algorithm. C–D The IC50 values of Acetalax and SCH772984 in the high and low CMLS groups
Discussion
To further understand the regulatory mechanisms of pancreatic cancer, we combined the expression profiles of mRNA, lncRNA, and miRNA with somatic mutation and DNA methylation data. Our study redefined the molecular subtypes and prognostic model of pancreatic cancer by comprehensively analysing patients’ multi-omics data. Using ten clustering algorithms, we identified two molecular subtypes of PAC with different features. We found that CS1 may resemble the “basal-like” subtype described by Moffitt et al. [25], and its prognosis was significantly worse, which was consistent with similar subtypes of other epithelial tumors. Abnormal enhancement of KRAS signaling can stimulate the differentiation, proliferation, adhesion, and migration of cancer cells, leading to high invasiveness and metastasis [26]. Given that CS2 had noticeably greater immune infiltration, including B cells, CD8 T cells, and helper T cells, it is more in line with the “immunogenic” subtype described by Bailey et al. [27]. Furthermore, the CS2 subtype exhibited activation of CTLA4 and PDCD1, indicating that this molecular subtype may be amenable to treatment with immune checkpoint inhibitors. This may mean that the CS2 subtype, with its lower KRAS mutation rate, is more likely to be classified as a “hot” tumor phenotype. Our classification may further improve traditional classification methods. We built the optimal CMLS using 101 algorithm combinations in order to increase its clinical application value and overcome the limitations of selecting a single algorithm. Previous research has found that RSF has excellent predictive performance. When compared with 20 published models, the CMLS model outperformed the other prognostic tools in predicting the prognosis of PAC patients. Notably, lower CMLS scores were linked to higher T, B, and NK cell levels and a higher likelihood of being categorized as a “hot” tumor phenotype.
Numerous immunosuppressive characteristics were markedly activated in the high CMLS group, increasing the likelihood of a “cold” tumor phenotype. Furthermore, our findings indicated that patients with elevated CMLS scores may benefit especially from Acetalax and SCH772984.
We constructed a CMLS model of 13 hub genes, each of which plays a distinct role in cancer. For example, some studies have shown that serine protease inhibitor B5 (SERPINB5), a tumor suppressor gene, is abnormally highly expressed in many tumors and promotes tumor cell invasion, migration, and proliferation. Elevated expression of SERPINB5 is linked to unfavorable outcomes for patients with lung adenocarcinoma, pancreatic cancer, and colorectal cancer, and it may be a target for treatment [28–30]. Similarly, keratin 19 (KRT19) is an intermediate filament protein that controls the cell cycle and forms part of the cytoskeleton [31]. Its overexpression is a useful biomarker for PAC diagnosis and is linked to tumor development and a poor prognosis in PAC patients [32]. Placental cadherin (CDH3) typically leads to cell dedifferentiation, increased invasiveness of tumor cells, and ultimately metastasis [33]. Overexpression of CDH3 in PAC cells increases cell motility by altering p120ctn transport, thereby enhancing GTPase activity. This may be related to the high invasiveness and metastatic potential of PAC [34]. F10 is a novel gene associated with molar pregnancy. Overexpression of F10 may inhibit apoptosis by downregulating the pro-apoptotic genes BAX and caspase-3 and may reduce the sensitivity of certain lung cancer cells to paclitaxel [35, 36]. The contribution of each gene underscores the complexity of pancreatic cancer and the potential for more targeted and effective therapeutic strategies based on molecular profiles.
Immunotherapy has a significant effect on many malignant tumors, but its results in pancreatic cancer are unsatisfactory. Failure can be attributed to a variety of factors, but the main one is the dynamics and complexity of the TME, which is marked by a low mutation burden predicted to produce very few immunogenic antigens, severe myeloid inflammation, and inadequate infiltration of effector T cells [37–39]. Checkpoint blockade has achieved significant success in other cancers, including renal-cell cancer and melanoma, but has shown little efficacy in PAC [40, 41]. The reasons for the failure of immune checkpoint blockade in PAC are multifactorial. For example, baseline infiltration of PD-1-expressing T cells into tumors is low, and new epitopes are lacking [42, 43]. We separated PAC into high and low CMLS groups using multi-omics clustering and machine learning. Patients with high CMLS showed marked suppression of the immune system and “cold” tumor features. In contrast, patients with low CMLS showed a higher presence of NK and T cells and were more inclined towards “hot” tumors. These differences may explain why traditional immunotherapy is effective only in a small subset of individuals with immunogenic “hot” tumors, but not in those with “cold” tumors. TMB is a moderate predictor of the response to immune checkpoint inhibitors, is associated with neoantigen burden, and is typically low in PAC [44, 45]. Interestingly, TMB and TNB were higher in the group with high CMLS than in the group with low CMLS. This finding suggests that patients with high CMLS may benefit from immunotherapy that targets inhibitory immune cells.
Immunotherapy, such as adoptive cell transfer therapies, T-cell receptor (TCR) engineered T cell therapy, chimeric antigen receptor (CAR) T cell therapy, CAR NK cell therapy, and cytokine-induced killer (CIK) cell therapy, is currently being used in clinical settings for a variety of solid cancers and has demonstrated encouraging results [46–48]. Other potential approaches include cancer vaccines, myeloid cell-targeting strategies, and immune checkpoint blockade (ICB) [49–51]. Even with these significant advances, the intricacy and diversity of cellular components inside the pancreatic tumor microenvironment restrict the effectiveness of immunotherapy. Therefore, effective combination therapies that regulate the TME are required to enhance immunotherapy outcomes. Cutting-edge techniques such as single-cell sequencing and multi-omics analysis may allow the immune cell spectrum in PAC to be explored more thoroughly. This will strengthen the groundwork for the creation of immunotherapy that targets the TME.
There are various new developments in our research when compared to previously published studies. Firstly, in order to fully use the information content of each omics dimension while minimizing the influence of clustering technique selection preferences on our analysis, we integrated omics information from all five dimensions of PAC and applied ten clustering methods. Secondly, in order to reduce the possible influence of overfitting on our research findings, we employed ten popular machine learning techniques to find the model that performed the best on average c-index when building CMLS.
However, our research still has certain limitations. For instance, the sample size may not be large enough to accurately reflect a larger population, which may affect the applicability of CMLS across various demographic groups. Larger, prospective multicenter cohorts should be used to validate the clinical utility of CMLS more thoroughly. In addition, the specific mechanisms by which the CMLS genes contribute to tumorigenesis deserve further exploration.
Conclusion
In conclusion, this work improved the molecular typing of PAC by identifying two molecular subtypes using multi-omics consensus clustering. Using machine learning algorithm frameworks, we defined CMLS, which is closely related to immunotherapy response and provides a valuable framework for predicting patient prognosis and selecting treatment. The high CMLS group showed a stronger immunosuppressive microenvironment and poorer prognosis but relatively higher TMB and TNB, suggesting potential responsiveness to immunotherapy. By integrating multiple omics data and cutting-edge computing algorithms, this study provides a valuable screening model for patients, setting the stage for accurate and timely diagnosis and treatment of patients with PAC.
Author contributions
KP.G and XJ.Z provided ideas and designs. FF.L and YQ.W wrote the main manuscript text and XJ.Z prepared Figs. 1–8. All authors reviewed the manuscript.
Data availability
The datasets presented in this study can be found in public online repositories: https://portal.gdc.cancer.gov, https://www.ncbi.nlm.nih.gov/geo, and https://www.cancerrxgene.org.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Kamisawa T, Wood LD, Itoi T, Takaori K. Pancreatic cancer. Lancet. 2016. 10.1016/s0140-6736(16)00141-0
- 2. Tempero MA. NCCN guidelines updates: pancreatic cancer. J Natl Compr Canc Netw. 2019. 10.6004/jnccn.2019.5007
- 3. Vincent A, et al. Pancreatic cancer. Lancet. 2011. 10.1016/s0140-6736(10)62307-0
- 4. Heinrich S, Lang H. Neoadjuvant therapy of pancreatic cancer: definitions and benefits. Int J Mol Sci. 2017. 10.3390/ijms18081622
- 5. Morrison AH, Byrne KT, Vonderheide RH. Immunotherapy and prevention of pancreatic cancer. Trends Cancer. 2018. 10.1016/j.trecan.2018.04.001
- 6. Rocha FG. Landmark series: immunotherapy and targeted therapy for pancreatic cancer. Ann Surg Oncol. 2021. 10.1245/s10434-020-09367-9
- 7. Ren B, et al. Tumor microenvironment participates in metastasis of pancreatic cancer. Mol Cancer. 2018. 10.1186/s12943-018-0858-1
- 8. Zhang Y, Zhang Z. The history and advances in cancer immunotherapy: understanding the characteristics of tumor-infiltrating immune cells and their therapeutic implications. Cell Mol Immunol. 2020. 10.1038/s41423-020-0488-6
- 9. Micevic G, Bosenberg MW, Yan Q. The crossroads of cancer epigenetics and immune checkpoint therapy. Clin Cancer Res. 2023. 10.1158/1078-0432.Ccr-22-0784
- 10. Tan Z, et al. The value of a metabolic reprogramming-related gene signature for pancreatic adenocarcinoma prognosis prediction. Aging. 2020. 10.18632/aging.104134
- 11. Wang L, et al. Multi-omics landscape and clinical significance of a SMAD4-driven immune signature: Implications for risk stratification and frontline therapies in pancreatic cancer. Comput Struct Biotechnol J. 2022. 10.1016/j.csbj.2022.02.031
- 12. Xiao M, et al. A DNA-methylation-driven genes based prognostic signature reveals immune microenvironment in pancreatic cancer. Front Immunol. 2022. 10.3389/fimmu.2022.803962
- 13. Chu G, Ji X, Wang Y, Niu H. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids. 2023. 10.1016/j.omtn.2023.06.001
- 14. Yang S, et al. Dysregulation of HNF1B/Clusterin axis enhances disease progression in a highly aggressive subset of pancreatic cancer patients. Carcinogenesis. 2022. 10.1093/carcin/bgac092
- 15. Lu X, et al. MOVICS: an R package for multi-omics integration and visualization in cancer subtyping. Bioinformatics. 2021. 10.1093/bioinformatics/btaa1018
- 16. Zhang J, He S, Ying H. Refining molecular subtypes and risk stratification of ovarian cancer through multi-omics consensus portfolio and machine learning. Environ Toxicol. 2024. 10.1002/tox.24222
- 17. Chagas VS, et al. RTNduals: an R/Bioconductor package for analysis of co-regulation and inference of dual regulons. Bioinformatics. 2019. 10.1093/bioinformatics/btz534
- 18. Lu X, et al. Multi-omics consensus ensemble refines the classification of muscle-invasive bladder cancer with stratified prognosis, tumour microenvironment and distinct sensitivity to frontline therapies. Clin Transl Med. 2021. 10.1002/ctm2.601
- 19. Yoshihara K, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013. 10.1038/ncomms3612
- 20. Hoshida Y. Nearest template prediction: a single-sample-based flexible class prediction with confidence assessment. PLoS ONE. 2010. 10.1371/journal.pone.0015543
- 21. Zhang H, et al. The molecular feature of macrophages in tumor immune microenvironment of glioma patients. Comput Struct Biotechnol J. 2021. 10.1016/j.csbj.2021.08.019
- 22. Liu Z, et al. Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in colorectal cancer. Nat Commun. 2022. 10.1038/s41467-022-28421-6
- 23. Zeng D, et al. IOBR: multi-omics immuno-oncology biological research to decode tumor microenvironment and signatures. Front Immunol. 2021. 10.3389/fimmu.2021.687975
- 24. Geeleher P, Cox N, Huang RS. pRRophetic: an R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PLoS ONE. 2014. 10.1371/journal.pone.0107468
- 25. Moffitt RA, et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat Genet. 2015. 10.1038/ng.3398
- 26. Buscail L, Bournet B, Cordelier P. Role of oncogenic KRAS in the diagnosis, prognosis and treatment of pancreatic cancer. Nat Rev Gastroenterol Hepatol. 2020. 10.1038/s41575-019-0245-4
- 27. Bailey P, et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 2016. 10.1038/nature16965
- 28. He X, et al. SERPINB5 is a prognostic biomarker and promotes proliferation, metastasis and epithelial-mesenchymal transition (EMT) in lung adenocarcinoma. Thorac Cancer. 2023. 10.1111/1759-7714.15013
- 29. Liu BX, et al. SERPINB5 promotes colorectal cancer invasion and migration by promoting EMT and angiogenesis via the TNF-α/NF-κB pathway. Int Immunopharmacol. 2024. 10.1016/j.intimp.2024.111759
- 30. Tian C, et al. Cancer cell-derived matrisome proteins promote metastasis in pancreatic ductal adenocarcinoma. Cancer Res. 2020. 10.1158/0008-5472.Can-19-2578
- 31. Coulombe PA, Wong P. Cytoplasmic intermediate filaments revealed as dynamic and multipurpose scaffolds. Nat Cell Biol. 2004. 10.1038/ncb0804-699
- 32. Yao H, et al. Glypican-3 and KRT19 are markers associating with metastasis and poor prognosis of pancreatic ductal adenocarcinoma. Cancer Biomark. 2016. 10.3233/cbm-160655
- 33. Paredes J, et al. Epithelial E- and P-cadherins: role and clinical significance in cancer. Biochim Biophys Acta. 2012. 10.1016/j.bbcan.2012.05.002
- 34. Taniuchi K, et al. Overexpressed P-cadherin/CDH3 promotes motility of pancreatic cancer cells by interacting with p120ctn and activating rho-family GTPases. Cancer Res. 2005. 10.1158/0008.5472.Can-04-3646
- 35. Song Y, et al. F10, a novel hydatidiform mole-associated gene, inhibits the paclitaxel sensitivity of A549 lung cancer cells by downregulating BAX and caspase-3. Oncol Lett. 2017. 10.3892/ol.2017.5749
- 36. Song Y, et al. Overexpression of the hydatidiform mole-related gene F10 inhibits apoptosis in A549 cells through downregulation of BCL2-associated X protein and caspase-3. Oncol Lett. 2012. 10.3892/ol.2012.762
- 37. Balli D, Rech AJ, Stanger BZ, Vonderheide RH. Immune cytolytic activity stratifies molecular subsets of human pancreatic cancer. Clin Cancer Res. 2017. 10.1158/1078-0432.Ccr-16-2128
- 38. Stromnes IM, et al. T-cell localization, activation, and clonal expansion in human pancreatic ductal adenocarcinoma. Cancer Immunol Res. 2017. 10.1158/2326-6066.Cir-16-0322
- 39. Vonderheide RH, Bayne LJ. Inflammatory networks and immune surveillance of pancreatic carcinoma. Curr Opin Immunol. 2013. 10.1016/j.coi.2013.01.006
- 40. Brahmer JR, et al. Safety and activity of anti-PD-L1 antibody in patients with advanced cancer. N Engl J Med. 2012. 10.1056/NEJMoa1200694
- 41. Royal RE, et al. Phase 2 trial of single agent Ipilimumab (anti-CTLA-4) for locally advanced or metastatic pancreatic adenocarcinoma. J Immunother. 2010. 10.1097/CJI.0b013e3181eec14c
- 42. Alexandrov LB, et al. Signatures of mutational processes in human cancer. Nature. 2013. 10.1038/nature12477
- 43. Clark CE, et al. Dynamics of the immune reaction to pancreatic cancer from inception to invasion. Cancer Res. 2007. 10.1158/0008-5472.Can-07-0175
- 44. Imamura T, et al. Characterization of pancreatic cancer with ultra-low tumor mutational burden. Sci Rep. 2023. 10.1038/s41598-023-31579-8
- 45. Yarchoan M, Hopkins A, Jaffee EM. Tumor mutational burden and response rate to PD-1 inhibition. N Engl J Med. 2017. 10.1056/NEJMc1713444
- 46. Good CR, et al. An NK-like CAR T cell transition in CAR T cell dysfunction. Cell. 2021. 10.1016/j.cell.2021.11.016
- 47. Leidner R, et al. Neoantigen T-cell receptor gene therapy in pancreatic cancer. N Engl J Med. 2022. 10.1056/NEJMoa2119662
- 48. Zhang L, et al. Clinical outcome of immunotherapy with dendritic cell vaccine and cytokine-induced killer cell therapy in hepatobiliary and pancreatic cancer. Mol Clin Oncol. 2016. 10.3892/mco.2015.660
- 49. Bauer C. Refined strategies for the treatment of pancreatic carcinoma: targeting myeloid cells in order to overcome T cell exhaustion. Gut. 2017. 10.1136/gutjnl-2016-312427
- 50. Huang X, et al. Personalized pancreatic cancer therapy: from the perspective of mRNA vaccine. Mil Med Res. 2022. 10.1186/s40779-022-00416-w
- 51. Zhang Z, et al. Targeting Plk1 sensitizes pancreatic cancer to immune checkpoint therapy. Cancer Res. 2022. 10.1158/0008-5472.Can-22-0018