Abstract
Breast cancer (BRCA) remains the most common malignancy and the leading cause of cancer-related death among women worldwide. Recent studies have highlighted dysregulated copper homeostasis as a contributor to tumorigenesis, with cuproptosis, a copper-dependent form of regulated cell death, emerging as a potential therapeutic target. In this study, we systematically evaluated the prognostic significance of cuproptosis-related genes (CRGs) in BRCA by integrating multi-omics data from TCGA and GEO cohorts. Through weighted gene co-expression network analysis (WGCNA), we identified four key CRGs (CCDC24, TMEM65, XPOT, and NUDCD1) to construct a prognostic signature. The resulting risk score effectively stratified patients into high- and low-risk groups, with the high-risk group showing significantly worse overall survival, higher TP53 mutation frequency, and features of an immunosuppressive tumor microenvironment (TME). Functional enrichment analyses further implicated these genes in immune evasion and metabolic reprogramming pathways. Single-cell RNA sequencing (scRNA-seq) confirmed heterogeneous expression of the signature genes across distinct cell populations, supporting their involvement in both tumor biology and immune modulation. Clinically, this CRG-based model provides a promising tool for individualized risk assessment and treatment planning. High-risk patients may benefit from intensified therapies or immunomodulatory strategies, while low-risk individuals could be considered for treatment de-escalation. In summary, our findings suggest that cuproptosis may play a regulatory role in BRCA progression and offer a novel, clinically actionable framework for prognostic stratification.
Supplementary Information
The online version contains supplementary material available at 10.1007/s12672-025-03525-9.
Keywords: BRCA, Cuproptosis, Prognostic signature, TME, Single-cell analysis
Introduction
Breast cancer (BRCA), a malignancy arising from ductal epithelial cells, remains the most frequently diagnosed cancer and the leading cause of cancer-related death among women worldwide, accounting for 24.5% of new cancer cases and 15.0% of cancer-related mortality in 2020 [1]. Despite advances in early detection and targeted therapies, the clinical management of BRCA is still challenged by its pronounced molecular heterogeneity and the emergence of treatment resistance, underscoring the need for novel prognostic markers and therapeutic strategies.
Recent studies have highlighted the pivotal role of transition metal homeostasis in tumor biology, particularly the oncogenic potential of copper dysregulation. Copper is an essential trace element involved in mitochondrial respiration, redox balance, and iron metabolism. Under physiological conditions, its intracellular levels are tightly regulated; deviations have been implicated in a spectrum of pathologies including neurodegenerative and hematologic disorders [2]. Notably, excessive copper accumulation has been shown to trigger a newly defined form of regulated cell death—cuproptosis—characterized by mitochondrial copper overload, aggregation of lipoylated tricarboxylic acid (TCA) cycle proteins, loss of Fe–S cluster proteins, proteotoxic stress, and eventual cell death [3, 4].
Genome-wide CRISPR-Cas9 screens have delineated a core molecular framework for cuproptosis regulation, identifying seven positive regulators (FDX1, LIAS, LIPT1, DLD, DLAT, PDHA1, and PDHB) and three suppressors (MTF1, GLS, and CDKN2A) [5]. Clinically, elevated copper levels have been associated with poor prognosis in several cancer types, while induction of cuproptosis in preclinical models has shown tumor-suppressive effects, supporting its dual potential as a prognostic biomarker and therapeutic target [6–8].
However, the functional relevance of cuproptosis-related genes (CRGs) in breast cancer progression remains largely unexplored. In particular, how CRG expression shapes tumor behavior, influences the tumor microenvironment (TME), and correlates with clinical outcomes has not been systematically investigated. Moreover, most existing studies rely on bulk transcriptomic data, which masks the heterogeneity of gene expression across different cell populations within the tumor. In contrast, single-cell RNA sequencing (scRNA-seq) enables high-resolution profiling of individual cell types, allowing for the dissection of cell-type-specific expression patterns and functional states. Integrating bulk and single-cell RNA-seq therefore offers complementary insights—capturing both overall transcriptomic trends and cellular-level specificity—thereby improving the mechanistic interpretation and clinical relevance of gene-based prognostic models [9].
In this study, we present a comprehensive analysis of CRG dysregulation and prognostic relevance in BRCA by integrating multi-omics data from the TCGA and GEO cohorts. Through weighted gene co-expression network analysis (WGCNA) and machine learning-based prognostic modeling, we identify a four-gene signature (CCDC24, TMEM65, XPOT, and NUDCD1) capable of stratifying patients into distinct risk cohorts. Through comprehensive molecular profiling, we further elucidate how high-risk tumors exhibit TP53 mutation enrichment, immunosuppressive TME remodeling, and metabolic reprogramming linked to immune evasion. These findings establish cuproptosis as a key regulatory process in BRCA and offer a novel framework for risk prediction and therapeutic targeting.
Materials and methods
BRCA data source and preprocessing
Gene expression and clinical data for breast cancer (BRCA) were obtained from The Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/) using the R package TCGAbiolinks (version 2.27.1). A total of 1,059 female breast cancer samples and 99 normal breast tissue samples with expression data (log2(FPKM + 1)) and survival information were included for analysis.
For external validation of the predictive model, we used the GSE20685 dataset, which includes gene expression and clinical data for 327 BRCA cases. Data were downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo). Probe annotations were matched to gene symbols based on the corresponding platform file. Probes mapping to multiple genes were excluded. If multiple probes corresponded to the same gene symbol, their median expression value was used.
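The probe-to-gene collapsing rule described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R); the function name `collapse_probes`, the dictionary layout of the annotation, and the toy probe IDs and values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def collapse_probes(probe_to_genes, probe_expr):
    """Collapse probe-level values to gene-level values.

    probe_to_genes: probe ID -> list of gene symbols (hypothetical layout).
    probe_expr: probe ID -> list of expression values, one per sample.
    """
    rows_by_gene = defaultdict(list)
    for probe, genes in probe_to_genes.items():
        # Drop unannotated probes and probes that map to multiple genes.
        if len(genes) != 1 or probe not in probe_expr:
            continue
        rows_by_gene[genes[0]].append(probe_expr[probe])
    # For each gene, take the per-sample median across its probes.
    return {
        gene: [median(values) for values in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }


# Toy data for illustration only.
annotation = {
    "p1": ["XPOT"],
    "p2": ["XPOT"],
    "p3": ["TMEM65", "NUDCD1"],  # ambiguous probe, excluded
    "p4": ["CCDC24"],
}
expression = {
    "p1": [2.0, 4.0],
    "p2": [4.0, 8.0],
    "p3": [9.0, 9.0],
    "p4": [1.0, 3.0],
}
gene_expr = collapse_probes(annotation, expression)
```

With two probes for the same gene, the median equals their per-sample mean; with three or more it is less sensitive to a single outlying probe.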
The CRGs were selected for analysis based on previously published literature [3]. Due to limitations in available clinical annotations within TCGA, certain subtype and phenotype data were obtained from previous reports [10], while triple-negative status was obtained via cBioPortal (http://www.cbioportal.org/datasets). Gene sets for hallmark pathway enrichment analysis were downloaded from the Molecular Signatures Database (MSigDB, version 7.4; file: msigdb.v7.4.symbols.gmt; https://www.gsea-msigdb.org/gsea/msigdb/).
Landscape of genetic expression and variation of CRGs in breast cancer
To explore the expression landscape of CRGs in breast cancer, we first compared CRG expression between BRCA tumor samples and paired normal tissues using the Wilcoxon rank-sum test. Differences in gene expression across clinical subgroups were evaluated using the Wilcoxon rank-sum test for two-group comparisons and the Kruskal–Wallis test for comparisons involving more than two groups.
Somatic mutation profiles of CRGs were visualized using the maftools package (version 2.16.0). Copy number variation (CNV) analysis was performed using Fisher’s exact test, and the results were visualized with the ggplot2 package (version 3.4.4). The chromosomal locations of CRGs were mapped using the RCircos package (version 1.2.2), generating a circos plot to illustrate their genomic distribution.
Weighted gene co-expression network construction
A weighted gene co-expression network was constructed using the WGCNA package (version 1.72-1) [11]. The analysis followed the standard workflow: (1) define the similarity matrix; (2) choose the soft-thresholding power and convert the similarity matrix to an adjacency matrix; (3) convert the adjacency matrix to a topological overlap matrix (TOM); (4) perform hierarchical clustering on the TOM-based dissimilarity (dissTOM) to obtain the clustering tree; (5) identify modules from the clustering tree using the dynamic tree cut method; (6) compute the module eigengene (ME) of each module. Each ME is the first principal component of its module and summarizes the overall expression pattern of the genes in that module across samples.
The minimum module size was set to 30 genes. Relationships among modules were assessed by computing Pearson correlation coefficients between their MEs. The MEs were then clustered by average-linkage hierarchical clustering on the dissimilarity 1 - Pearson correlation, and highly similar modules were merged to refine the co-expression network.
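Steps (2) to (4) of the workflow can be sketched in plain Python as follows. This is an illustrative reimplementation, not the WGCNA package code. It assumes an unsigned network, in which adjacency is the absolute Pearson correlation raised to the soft-thresholding power; the text does not state the network type. The toy expression values and the power of 6 are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta):
    """Unsigned soft-threshold adjacency a_ij = |cor(i, j)| ** beta.

    The diagonal is set to 0 so that connectivity excludes self-links.
    """
    g = len(expr)
    return [
        [0.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
         for j in range(g)]
        for i in range(g)
    ]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u != i and u != j)
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Toy matrix: three genes (rows) measured in four samples (columns).
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 9.0],
    [4.0, 1.0, 3.0, 2.0],
]
adj = adjacency(expr, beta=6)  # beta = 6 is a hypothetical choice
tom = topological_overlap(adj)
diss_tom = [[1.0 - v for v in row] for row in tom]  # input to clustering
```

The dissimilarity `diss_tom` is what would be passed to hierarchical clustering in step (4).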
To explore associations between modules and clinical traits, Pearson correlation coefficients were calculated between MEs and clinical variables. Modules with significant correlations (P < 0.05) were considered clinically relevant. To further identify key regulatory genes, we used the network screening function in the WGCNA package (version 1.72-1), which integrates two metrics: gene significance (GS), indicating the correlation between a gene and a clinical trait, and module membership (MM), representing the correlation between a gene and its module eigengene [11]. Hub genes were defined as those with high GS and MM values within trait-associated modules.
Functional enrichment analyses
To investigate the biological functions and signaling pathways associated with the identified hub genes, we performed Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses using the clusterProfiler package (version 4.10.0) [12, 13]. Default parameters were applied for both analyses. Enrichment results with a P value < 0.05 were considered statistically significant.
Identification of hub genes
To identify prognostic CRGs, we first performed univariate Cox proportional hazards regression analysis for overall survival (OS) using the survival package (version 3.5-7). Genes with P < 0.05 were considered significantly associated with prognosis. To reduce model complexity and avoid overfitting, we applied the least absolute shrinkage and selection operator (LASSO) algorithm using the glmnet package (version 4.1-8). This approach penalizes the absolute size of the regression coefficients, shrinking some of them to exactly zero and retaining only the most informative predictors. A risk score was then calculated for each patient using a linear combination of the expression levels of the selected genes weighted by their corresponding LASSO-derived coefficients, according to the following formula:
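$$\text{Risk score} = \sum_{i=1}^{n} \beta_i \chi_i$$
where $\beta_i$ denotes the regression coefficient for gene $i$, $\chi_i$ represents its normalized expression level, and $n$ is the number of genes retained by LASSO.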
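The risk score described above is a weighted sum, which can be written as a short Python sketch. The coefficient and expression values below are hypothetical placeholders chosen only to show the arithmetic; they are not the fitted values from this study.

```python
def risk_score(expression, coefficients):
    """Linear combination of expression values weighted by model coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


# Hypothetical coefficients for illustration only (not the fitted model).
coefficients = {"CCDC24": -0.2, "TMEM65": 0.1, "XPOT": 0.3, "NUDCD1": 0.2}

# Hypothetical normalized expression values for one patient.
patient = {"CCDC24": 2.0, "TMEM65": 1.0, "XPOT": 3.0, "NUDCD1": 0.5}

score = risk_score(patient, coefficients)
```

A gene with a negative coefficient lowers the score as its expression rises, while genes with positive coefficients raise it.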
Predictive ability of the prognostic signature of CRGs
After LASSO regression analysis, patients were stratified into high- and low-risk groups based on the median risk score derived from the selected prognostic CRGs. Kaplan–Meier (KM) survival analysis was conducted using the survival package (version 3.5-7) and survminer package (version 0.4.9) to evaluate differences in overall survival between the two groups.
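As a rough illustration of the stratification and survival estimation described above, the following Python sketch splits patients at the median risk score and computes a Kaplan–Meier (product-limit) curve. It is a simplified stand-in for the survival and survminer packages, not their implementation. Assigning scores equal to the median to the low-risk group is an assumption, and all patient IDs and values are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Scores equal to the median are assigned to 'low' (an assumption).
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times: follow-up time per patient.
    events: 1 if death was observed, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    survival = 1.0
    curve = []
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Hypothetical risk scores and follow-up data.
scores = {"p1": 0.1, "p2": 0.5, "p3": 0.9, "p4": 1.3}
groups = split_by_median(scores)
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

In practice one curve would be computed per risk group and the two compared with the log-rank test.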
To further assess the prognostic independence of the CRG-based signature, univariate and multivariate Cox proportional hazards regression analyses were performed on clinical variables in both the training and validation cohorts using the coxph function from the survival package. A P value < 0.05 was considered statistically significant. Forest plots were generated to visualize the hazard ratios and confidence intervals, allowing evaluation of whether the CRG signature remained an independent prognostic factor after adjusting for clinical covariates.
Cuproptosis signature variation analysis and functional annotation
Simple nucleotide variation (SNV) and copy number variation (CNV) analyses were conducted based on data from the training cohort. Using the maftools package (version 2.16.0), we compared mutation profiles between the high and low CRG signature groups using Fisher’s exact test. The results were visualized as waterfall plots. Genomic distribution of CNVs in relation to the CRG signature was displayed using the ggplot2 package (version 3.4.4).
To explore potential functional differences between groups, gene set variation analysis (GSVA) was performed using the GSVA package (version 1.48.3). The HALLMARK gene set (msigdb.v7.4.symbols.gmt) was downloaded from the MSigDB database (https://www.gsea-msigdb.org/gsea/index.jsp). Enrichment scores were calculated for each sample across the hallmark pathways. The distribution of these scores between high and low CRG signature groups was visualized using the pheatmap package (version 1.0.12).
CRG signature tumor microenvironment infiltration characteristics
The abundance of stromal and immune components within tumor samples was estimated using the ESTIMATE algorithm via the estimate package (version 1.0.13) in R. Stromal Score and Immune Score were estimated separately, and the ESTIMATE Score was derived by combining both.
To further assess immune cell infiltration, we applied single-sample gene set enrichment analysis (ssGSEA), CIBERSORT, and xCell algorithms. The resulting immune infiltration scores were compared between high- and low-risk CRG signature groups. Differences were evaluated using the Wilcoxon rank-sum test, and the results were visualized with the ggplot2 package (version 3.4.4).
Single-cell RNA-seq analysis to reveal the expression of key genes across different cell types
To investigate the expression landscape of key genes within the TME of BRCA, we conducted a single-cell RNA sequencing (scRNA-seq) analysis using publicly available data from the GEO database (GSE176078), which includes scRNA-seq data from 26 primary BRCA tumor tissues [14].
Data preprocessing and analysis were performed using the Seurat package (version 4.3.0) in R. Quality control was applied to filter out low-quality cells based on the following criteria: (1) number of detected genes per cell > 200; (2) number of detected genes per cell < 5,000; (3) percentage of mitochondrial gene expression < 20%.
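The three quality-control criteria can be expressed as a simple per-cell filter. The sketch below uses plain Python rather than Seurat. The field names and toy cells are hypothetical; the thresholds are those stated above.

```python
def passes_qc(n_genes, pct_mito, min_genes=200, max_genes=5000, max_pct_mito=20.0):
    """Return True if a cell meets all three quality-control criteria."""
    return min_genes < n_genes < max_genes and pct_mito < max_pct_mito


# Hypothetical per-cell summaries.
cells = [
    {"barcode": "cell_1", "n_genes": 150, "pct_mito": 5.0},    # too few genes
    {"barcode": "cell_2", "n_genes": 2500, "pct_mito": 8.0},   # retained
    {"barcode": "cell_3", "n_genes": 6200, "pct_mito": 4.0},   # too many genes
    {"barcode": "cell_4", "n_genes": 1800, "pct_mito": 35.0},  # high mitochondrial fraction
]
kept = [c["barcode"] for c in cells if passes_qc(c["n_genes"], c["pct_mito"])]
```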
Following quality control, gene expression values were normalized using the LogNormalize method, and highly variable genes were identified using the FindVariableFeatures function. Principal component analysis (PCA) was used for initial dimensionality reduction, and Uniform Manifold Approximation and Projection (UMAP) was subsequently applied for nonlinear dimensionality reduction and visualization. Cell clustering was performed using k-nearest neighbor (kNN) graph construction and Louvain clustering via the FindNeighbors and FindClusters (resolution = 0.8) functions in Seurat. Cell type composition was quantified by calculating the proportion of each cluster, and visualized using bar plots generated with ggplot2 (version 3.4.4). Expression patterns of selected key genes (CCDC24, TMEM65, XPOT, NUDCD1) across different cell types were visualized using DotPlot, applying a blue–white–red color gradient to reflect expression intensity. UMAP plots (DimPlot) were also used to display the spatial distribution of clusters.
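For readers unfamiliar with the LogNormalize method, the sketch below shows the transformation for a single cell in plain Python. Each count is divided by the total counts of the cell, multiplied by a scale factor, and transformed with the natural logarithm of one plus the value. The scale factor of 10,000 is the Seurat default and is an assumption here, since the text does not state it. The toy counts are hypothetical.

```python
from math import log1p


def log_normalize(counts, scale_factor=10_000):
    """Normalize one cell: log1p(count / total_counts * scale_factor).

    Assumes the cell has at least one count, as is the case after quality control.
    """
    total = sum(counts.values())
    return {gene: log1p(c / total * scale_factor) for gene, c in counts.items()}


# Hypothetical raw counts for one cell ("OTHER" pools all remaining genes).
cell_counts = {"XPOT": 1, "TMEM65": 3, "CCDC24": 0, "OTHER": 96}
normalized = log_normalize(cell_counts)
```

Because each cell is scaled by its own total, the normalized values are comparable between cells of different sequencing depth, and zero counts remain zero.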
Statistical analysis
All statistical analyses were conducted using R software (version 4.2.2). Differences in gene expression between two groups were evaluated using the Wilcoxon rank-sum test while comparisons among more than two groups were assessed using the Kruskal–Wallis test. Kaplan–Meier survival curves were generated using the survival (version 3.5-7) and survminer (version 0.4.9) packages, and differences between groups were tested by the log-rank test. Univariate and multivariate prognostic analyses were performed using Cox proportional hazards regression models via the coxph function. For mutation comparisons, Fisher’s exact test was applied. Boxplots, bar charts, and other visualizations were generated using ggplot2 (version 3.4.4).
A P value < 0.05 was considered statistically significant. Statistical significance in figures was annotated as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001; “ns” indicates no statistically significant difference.
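The annotation scheme above maps directly onto a small helper function, sketched here in Python. The function name `significance_label` is hypothetical.

```python
def significance_label(p_value):
    """Map a P value to the annotation used in the figures."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.03, 0.004, 0.0005, 0.00001)]
```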
Results
Differential expression and genetic alterations of CRGs in BRCA
Analysis of 10 CRGs in 1,059 BRCA samples from the TCGA dataset revealed significant transcriptional dysregulation compared to 99 normal breast tissues (P < 0.001, Wilcoxon rank-sum test). Among these, DLD, LIAS, PDHA1, FDX1, LIPT1, MTF1, and GLS were markedly upregulated in tumor tissues, whereas CDKN2A and PDHB were significantly downregulated. DLAT showed no significant difference between tumor and normal tissues (Fig. 1A). These results suggest that altered expression of CRGs may play a critical role in BRCA pathogenesis.
Fig. 1.
Transcriptional and Genomic Landscape of CRGs in BRCA. (A) Differential expression of 10 CRGs between normal breast tissues (n = 99) and BRCA samples (n = 1,059) based on the TCGA Cohort; (B) Expression patterns of CRGs in triple-negative breast cancer (TNBC, n = 180) versus non-TNBC subgroups (n = 879); (C) Somatic mutation profiles of CRGs in BRCA (TCGA BRCA cohort, n = 986). The frequency and types of mutations are shown, including missense, nonsense, frameshift, and splice site variants; (D) Genomic locations of CRGs and their copy number variation (CNV) events across human chromosomes. Differences in gene expression were evaluated using the Wilcoxon rank-sum test (wilcox.test). A P value < 0.05 was considered statistically significant. Statistical significance in figures was annotated as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001; “ns” indicates no statistically significant difference
BRCA is typically classified into four major molecular subtypes—Luminal A, Luminal B, HER2-enriched, and triple-negative breast cancer (TNBC)—based on immunohistochemical markers and gene expression profiling. TNBC, which accounts for approximately 15–20% of invasive breast cancers, is known for its pronounced molecular heterogeneity, aggressive clinical behavior, and increased risk of recurrence [15].
To further explore the relationship between CRGs and BRCA subtypes, we stratified the TCGA cohort into TNBC and non-TNBC groups. Comparative expression analysis revealed that six CRGs (DLD, GLS, PDHA1, FDX1, CDKN2A, and DLAT) were significantly upregulated in TNBC compared to non-TNBC, while three genes (LIAS, LIPT1, and LIPT2) were downregulated (Fig. 1B). In addition, evaluation across PAM50 molecular subtypes demonstrated significant differences in the expression of all ten CRGs, with nine genes—excluding MTF1—showing distinct subtype-specific patterns (Supplementary Fig. 1A). These findings underscore the heterogeneity of CRG expression across BRCA subtypes and suggest that disruptions in copper metabolism pathways may contribute to tumor development and progression.
Genomic analysis further identified frequent CNVs and somatic mutations among CRGs (Fig. 1C), with several genes exhibiting recurrent alterations at specific chromosomal loci (Fig. 1D). Notably, MTF1 was primarily affected by amplification events, while FDX1, DLAT, and PDHB exhibited higher frequencies of deletions (Supplementary Fig. 1B). These genomic alterations may partially account for the observed transcriptional dysregulation and implicate CRGs in the genomic instability of BRCA.
Network analysis identifies CRG-associated co-expression modules in BRCA
To construct a robust gene co-expression network, we first conducted scale-free topology analysis to determine the optimal soft-thresholding power (β). As shown in Fig. 2A, the left panel illustrates the scale-free topology fit index, while the right panel displays mean connectivity across a range of β values. Based on these metrics, an appropriate soft-threshold was selected to ensure network scale-freeness.
Fig. 2.
Identification of Co-expression Modules Associated with Cuproptosis in BRCA. (A) Determination of the optimal soft-thresholding power (β) for network construction. The left panel shows the scale-free topology model fit, and the right panel shows mean connectivity at various β values; (B) Heatmap showing the correlations between module eigengenes (MEs) and clinical traits in the TCGA-BRCA cohort (n = 1,059). Each cell displays the Pearson correlation coefficient and corresponding P value; (C) Scatter plot of gene significance (GS) for cuproptosis versus module membership (MM) in the pink module. Red lines indicate the threshold values (MM > 0.6, GS > 0.4) used to identify hub genes
We then integrated CRG expression data with clinical traits to identify biologically and clinically relevant gene modules. Hierarchical clustering analysis of MEs revealed distinct co-expression patterns, and highly similar modules were subsequently merged to refine network structure (Supplementary Fig. 2A). Correlations between MEs and clinical variables were evaluated using Pearson correlation analysis, highlighting several modules with significant trait associations (Supplementary Fig. 2B). In particular, the pink, yellow, and midnight blue modules exhibited strong correlations with CRG expression profiles (Fig. 2B).
To identify hub genes within CRG-associated modules, we applied stringent selection criteria: MM > 0.6 and GS > 0.4. A total of 108 candidate hub genes met both thresholds, indicating high intramodular connectivity and strong association with CRG-related phenotypes (Fig. 2C and Supplementary Fig. 2C). These genes are likely to represent key regulatory components within the CRG co-expression network in BRCA.
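The hub-gene criteria can be illustrated with a short Python sketch that computes module membership (MM) and gene significance (GS) as Pearson correlations and applies the two thresholds. It is not the WGCNA network screening function. Taking absolute correlations is an assumption, and the gene names, eigengene, and trait values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_hub_genes(expr, eigengene, trait, mm_cut=0.6, gs_cut=0.4):
    """Keep genes whose |MM| and |GS| both exceed the thresholds."""
    hubs = []
    for gene, values in expr.items():
        mm = abs(pearson(values, eigengene))  # gene vs. module eigengene
        gs = abs(pearson(values, trait))      # gene vs. trait of interest
        if mm > mm_cut and gs > gs_cut:
            hubs.append(gene)
    return sorted(hubs)


# Hypothetical module eigengene, trait, and expression across five samples.
eigengene = [1.0, 2.0, 3.0, 4.0, 5.0]
trait = [1.0, 2.0, 3.0, 5.0, 4.0]
expr = {
    "geneA": [2.0, 4.0, 6.0, 8.0, 10.0],  # tracks the eigengene closely
    "geneB": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related to the eigengene
}
hubs = select_hub_genes(expr, eigengene, trait)
```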
Construction of the prognostic signature based on CRGs in BRCA
To develop a prognostic model, we used the 108 CRG-associated hub genes identified by WGCNA as candidate predictors. Univariate Cox regression analysis revealed 23 genes significantly associated with overall survival (P < 0.05). Patients were stratified into high- and low-expression groups based on median expression values of these genes, and Kaplan–Meier survival analysis demonstrated significant survival differences for these candidates (Supplementary Fig. 3A). Among the top six prognostic genes (ranked by P value), high expression of CCDC24 was associated with favorable survival, whereas low expression of CCT6A, CDCA7, TMEM65, SLC7A5, and CCNE1 correlated with improved clinical outcomes.
To refine the model and reduce the risk of overfitting, we performed LASSO regression analysis, which identified four core prognostic genes: CCDC24, TMEM65, XPOT, and NUDCD1 (Fig. 3A and B). These genes were incorporated into a multivariate risk score model, with coefficients derived from the LASSO algorithm:
Fig. 3.
Construction and Validation of a CRG-based Prognostic Model Using Machine Learning. (A) LASSO regression analysis of 23 prognosis-associated CRGs. The left panel shows the coefficient profiles, and the right panel presents the optimal penalty parameter (λ) selection based on 10-fold cross-validation using the minimum criteria; (B) LASSO-derived coefficients of the four genes included in the final prognostic model; (C) Kaplan–Meier survival analysis comparing high- and low-risk groups in the TCGA training cohort (n = 1,045); (D) Time-dependent ROC curves assessing the predictive accuracy of the risk model in the training cohort at 1, 3, and 5 years; (E) Kaplan–Meier survival curves for the high- and low-risk groups in the GEO validation cohort (GSE20685, n = 327); (F) Time-dependent ROC curves for the validation cohort at 1, 3, and 5 years; (G-H) Heatmaps showing the expression patterns of the four signature genes (CCDC24, TMEM65, XPOT, NUDCD1) in high- and low-risk groups within the TCGA training cohort (G) and GEO validation cohort (H), respectively
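$$\text{Risk score} = \beta_{\text{CCDC24}}\,\chi_{\text{CCDC24}} + \beta_{\text{TMEM65}}\,\chi_{\text{TMEM65}} + \beta_{\text{XPOT}}\,\chi_{\text{XPOT}} + \beta_{\text{NUDCD1}}\,\chi_{\text{NUDCD1}}$$
where each $\beta$ is the LASSO-derived coefficient of the corresponding gene (Fig. 3B) and each $\chi$ is its normalized expression level.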
The negative coefficient for CCDC24 suggests a potential protective role, consistent with its association with better survival in the univariate analysis. This CRG-based risk model provides a quantitative tool for prognosis stratification in BRCA (Supplementary Fig. 3A).
Predictive performance of the CRG-based prognostic signature
To assess the clinical relevance of the CRG-derived risk score, patients in both the TCGA-BRCA (training) and GEO (GSE20685, validation) cohorts were stratified into high- and low-risk groups based on the median risk score (Supplementary Fig. 3B). Kaplan–Meier survival analysis showed that low-risk patients had significantly longer overall survival compared to high-risk patients in both cohorts (Fig. 3C and E; Supplementary Fig. 3B), supporting the prognostic robustness of the model.
We further evaluated the predictive accuracy of the signature using time-dependent receiver operating characteristic (ROC) curve analysis. In the training cohort, the model yielded area under the curve (AUC) values of 0.595, 0.619, and 0.597 for 1-year, 3-year, and 5-year overall survival predictions, respectively (Fig. 3D). The validation cohort showed improved performance, with AUCs of 0.738, 0.655, and 0.641 for the corresponding time points (Fig. 3F).
Expression profiling of the four signature genes revealed distinct patterns across risk groups. CCDC24, associated with a negative coefficient in the risk score model, was significantly upregulated in the low-risk group, consistent with its protective role. In contrast, TMEM65, XPOT, and NUDCD1—each with positive coefficients—were more highly expressed in the high-risk group, aligning with their association with poorer outcomes (Fig. 3G and H).
Validation and functional characterization of the CRG-based prognostic signature
To evaluate the independent prognostic value of the CRG signature, univariate and multivariate Cox regression analyses were performed. The signature showed a trend toward statistical significance as an independent prognostic factor (P = 0.055, Fig. 4A). Further validation within the training cohort confirmed its prognostic value across PAM50 subtypes (Fig. 4B).
Fig. 4.
Evaluation of the Prognostic and Clinical Relevance of the CRG-Based Risk Score. (A) Forest plot showing the results of univariate Cox regression analysis for clinical variables and the CRG risk score in the TCGA-BRCA cohort (training cohort, n = 1,059); (B) Forest plot displaying multivariate Cox regression analysis, confirming the independent prognostic value of the risk score after adjusting for age, stage, immune subtype, and TNBC status; (C) Distribution of risk scores across different clinical and molecular subgroups, including age, tumor stage, race, PAM50 subtype, immune subtype, and TNBC status. Statistical significance was assessed using Wilcoxon rank-sum test (for two-group comparisons) or Kruskal–Wallis test (for comparisons across multiple groups), as indicated in each panel. A P value < 0.05 was considered statistically significant. Statistical significance in figures was annotated as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001; “ns” indicates no statistically significant difference
Stratified analyses uncovered notable associations between risk score and clinicopathological features. Patients aged younger than 60 years exhibited significantly higher risk scores than older individuals (Fig. 4C). Risk scores also varied markedly across molecular subtypes, with Basal-like tumors displaying the highest scores among PAM50-defined categories. Additionally, tumors classified as immunotype C4 showed elevated risk scores. In line with these findings, TNBC cases consistently exhibited higher risk scores, regardless of ER, PR, or HER2 status.
Functional enrichment analysis highlights divergent biological pathways across risk groups
To explore the biological mechanisms associated with the prognostic signature, we conducted GO and KEGG enrichment analyses of the 108 CRG-associated genes (Supplementary Fig. 4A). In the GO analysis: (1) Biological Processes (BP): Genes were predominantly associated with chromosomal segregation and organelle fission; (2) Cellular Components (CC): Significant enrichment was observed in condensed chromosomes and chromosomal regions; (3) Molecular Functions (MF): Genes were strongly linked to serine kinase activity and single-stranded DNA binding.
KEGG pathway analysis identified cell cycle regulation and oocyte meiosis as the top enriched pathways (Supplementary Fig. 4A). In addition, GSVA revealed distinct metabolic signatures between risk groups (Supplementary Fig. 4B). The low-risk group showed enrichment in xenobiotic metabolism and adipogenesis, whereas the high-risk group was characterized by activation of spermatogenesis and mTORC1 signaling pathways.
Furthermore, stratification by molecular subtype, immune phenotype, and TNBC status revealed significant differences between the high- and low-risk groups, underscoring the biological heterogeneity underlying the prognostic stratification.
To further investigate the molecular characteristics associated with the CRG-based risk stratification, we performed functional and genomic profiling analyses. Somatic mutation analysis revealed distinct mutational landscapes between the high- and low-risk groups. TP53 and PIK3CA were the most frequently mutated genes across the cohort, with TP53 mutations predominantly classified as missense variants (Supplementary Fig. 5A). Comparative analysis of genomic alterations showed that the high-risk group harbored significantly higher frequencies of deletion events and copy number alterations relative to the low-risk group (Supplementary Fig. 5B).
Immune microenvironment characteristics and therapeutic implications of the CRG-based risk signature
To explore the relationship between CRGs and the tumor immune microenvironment in BRCA, we performed ssGSEA to estimate the infiltration levels of 28 immune cell types across high- and low-risk groups. This approach allowed us to assess how the CRG-derived risk signature may shape immune cell composition within the TME. In parallel, we applied the ESTIMATE algorithm to evaluate stromal content, immune infiltration, and overall tumor purity. Significant differences in TME features were observed between the risk groups (Fig. 5A). In particular, the high-risk subgroup exhibited markedly lower stromal scores, indicating potential stromal depletion or remodeling associated with elevated CRG risk scores.
Fig. 5.
TME Characteristics and Predicted Therapeutic Response Associated with the CRG Risk Score. (A) Comparison of ESTIMATE, Immune, and Stromal Scores between high- and low-risk groups, as calculated by the ESTIMATE algorithm; (B) ssGSEA-based analysis of 28 immune cell populations showing differential enrichment patterns across risk groups; (C) Drug sensitivity analysis identifying compounds for which high-risk patients showed significantly increased responsiveness; (D) Additional drug sensitivity analysis identifying compounds for which high-risk patients showed significantly reduced responsiveness. Statistical significance was assessed using Wilcoxon rank-sum test (for two-group comparisons). A P value < 0.05 was considered statistically significant. Statistical significance in figures was annotated as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001; “ns” indicates no statistically significant difference
Immune cell composition was further analyzed using the CIBERSORT algorithm, revealing distinct associations between risk scores and specific immune cell subsets (Fig. 5B). High-risk scores were negatively correlated with anti-tumor immune populations, including CD56 dim natural killer (NK) cells and plasmacytoid dendritic cells (pDC), while positively correlated with immune subsets often linked to immunosuppression or tumor progression, such as activated CD4+ T cells and activated dendritic cells (DC). These results suggest that elevated CRG risk may contribute to an immunosuppressive TME, potentially facilitating tumor progression.
Association between CRG risk score and drug sensitivity
To further investigate the clinical relevance of the CRG signature, we analyzed its association with drug response using estimated IC50 values across various compounds. High-risk patients demonstrated significantly reduced sensitivity to several agents, including AZD2014 and GNE-317, compared to their low-risk counterparts (Fig. 5C). Boxplot comparisons of drug response between risk groups further highlighted distinct differences in sensitivity to multiple compounds (Fig. 5D). Notably, resistance to AZD5991 and UMI-77 was pronounced in the high-risk group.
These findings support the potential utility of the CRG-based risk score not only as a prognostic marker but also as a predictive biomarker for therapeutic response, offering insights into drug resistance mechanisms and informing personalized treatment strategies in BRCA.
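The two-group drug sensitivity comparison can be sketched as follows, assuming a two-sided Wilcoxon rank-sum (Mann-Whitney U) test and the star notation given in the legend of Fig. 5. The IC50 values are synthetic and "drug_X" is a hypothetical compound name, not one of the agents analyzed in the study.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def significance_label(p_value):
    """Map a P value to the star notation used in the figure legends."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"

def compare_groups(high, low):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test between two groups."""
    result = mannwhitneyu(high, low, alternative="two-sided")
    return float(result.pvalue), significance_label(result.pvalue)

rng = np.random.default_rng(1)
# Synthetic estimated IC50 values for a hypothetical compound ("drug_X").
ic50_high = rng.normal(loc=3.0, scale=1.0, size=150)
ic50_low = rng.normal(loc=2.0, scale=1.0, size=150)

p_value, label = compare_groups(ic50_high, ic50_low)
print(f"drug_X: P={p_value:.3g} {label}")
```

In this convention a higher estimated IC50 in one group is read as lower predicted sensitivity in that group. When many compounds are screened, a multiple-testing correction would usually be applied to the resulting P values; whether one was applied here is not stated in this section.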
Single-cell transcriptomic analysis reveals TME heterogeneity and cell type-specific expression of CRGs in BRCA
To further characterize the TME of BRCA, we performed scRNA-seq analysis to investigate the distribution and expression patterns of CRGs across different cell types. UMAP-based dimensionality reduction revealed a heterogeneous cellular landscape within BRCA tissues, comprising malignant epithelial cells, T cells, fibroblasts, endothelial cells, and other stromal components (Fig. 6A). Cell composition analysis showed that tumor cells were the predominant population, while immune cells, including T and B lymphocytes, were also present at appreciable levels, highlighting their potential involvement in immune evasion or anti-tumor responses (Fig. 6B).
Fig. 6.
Single-Cell Transcriptomic Profiling Reveals Cellular Heterogeneity and Differential Expression of CRGs in BRCA. (A) UMAP visualization of single-cell clusters, showing the distribution of major cell types within BRCA samples; (B) Proportional composition of identified cell types across the dataset; (C–F) Expression patterns of four CRGs (CCDC24, XPOT, TMEM65, NUDCD1) across distinct cell populations; (G) Validation of cell-type annotation using canonical marker genes
Expression profiling of the four signature CRGs (CCDC24, TMEM65, XPOT, and NUDCD1) across cell types demonstrated striking heterogeneity (Fig. 6C–F). These genes displayed cell-type-specific expression patterns, suggesting possible roles in tumor metabolic reprogramming and TME modulation. Notably, CCDC24 and XPOT were enriched in epithelial clusters, while TMEM65 and NUDCD1 showed broader distribution among stromal and immune compartments. To confirm the identity of annotated cell populations, canonical marker genes were assessed (Fig. 6G).
Overall, these findings highlight the cellular and molecular heterogeneity of the BRCA TME and underscore the potential role of CRGs in shaping tumor-immune interactions. The distinct cell-type-specific expression of CRGs provides additional rationale for targeting cuproptosis pathways in precision therapy.
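The cell composition and per-cell-type expression summaries described in this section can be illustrated with a small sketch. It assumes each cell already carries a cell-type annotation and normalized expression values; the records, the values, and the function names below are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter, defaultdict

# Hypothetical per-cell records: (annotated cell type, {gene: normalized expression}).
cells = [
    ("Epithelial", {"CCDC24": 1.2, "XPOT": 0.8}),
    ("Epithelial", {"CCDC24": 0.0, "XPOT": 1.6}),
    ("T cell", {"CCDC24": 0.0, "XPOT": 0.4}),
    ("Fibroblast", {"CCDC24": 0.0, "XPOT": 0.0}),
]

def cell_type_proportions(records):
    """Return the fraction of all cells assigned to each cell type."""
    counts = Counter(cell_type for cell_type, _ in records)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}

def expression_by_cell_type(records, gene):
    """Return {cell_type: (mean expression, fraction of cells with expression > 0)}."""
    values = defaultdict(list)
    for cell_type, expression in records:
        values[cell_type].append(expression.get(gene, 0.0))
    return {
        cell_type: (sum(v) / len(v), sum(x > 0 for x in v) / len(v))
        for cell_type, v in values.items()
    }

proportions = cell_type_proportions(cells)
ccdc24_summary = expression_by_cell_type(cells, "CCDC24")
print(proportions)
print(ccdc24_summary)
```

Reporting the fraction of expressing cells alongside the mean is a common way to separate broad low-level expression from strong expression in a small subset, which matters when comparing epithelial with stromal or immune compartments.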
Discussion
In this study, we comprehensively evaluated the prognostic relevance of CRGs in BRCA by integrating co-expression network construction, survival modeling, and immune landscape characterization. Using WGCNA, we identified key CRG-based modules and constructed a robust prognostic signature (CCDC24, TMEM65, XPOT, and NUDCD1), which demonstrated consistent predictive value in both TCGA and GEO cohorts. Multivariate Cox regression analysis confirmed the signature as an independent predictor of prognosis. Differential expression analysis revealed marked dysregulation of CRGs between tumor and normal tissues, with further variation across molecular subtypes, particularly the PAM50 classification.
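For readers less familiar with signature-based stratification, the sketch below shows one common way a four-gene risk score is computed and used to split a cohort. It assumes a weighted sum of expression values and a median cutoff; both are assumptions, since neither the coefficients nor the cutoff rule is given in this section. The coefficients and expression values are hypothetical placeholders, not the fitted values from the study.

```python
from statistics import median

# Hypothetical coefficients for illustration only (not the fitted values).
coefficients = {"CCDC24": -0.20, "TMEM65": 0.15, "XPOT": 0.25, "NUDCD1": 0.10}

def risk_score(expression, coefs=coefficients):
    """Weighted sum of signature-gene expression for one patient."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())

def stratify(scores):
    """Label each patient 'high' or 'low' relative to the cohort median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Synthetic expression values for four hypothetical patients.
cohort = {
    "P1": {"CCDC24": 2.0, "TMEM65": 1.0, "XPOT": 1.0, "NUDCD1": 1.0},
    "P2": {"CCDC24": 0.5, "TMEM65": 3.0, "XPOT": 4.0, "NUDCD1": 2.0},
    "P3": {"CCDC24": 1.0, "TMEM65": 2.0, "XPOT": 2.0, "NUDCD1": 1.0},
    "P4": {"CCDC24": 3.0, "TMEM65": 0.5, "XPOT": 0.5, "NUDCD1": 0.5},
}
scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
groups = stratify(scores)
print(groups)
```

Applying the same coefficients and a cutoff fixed in the training cohort to an independent cohort is what allows a validation set to test the signature rather than refit it.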
Functional annotation of CRG-related modules revealed enrichment in pathways associated with chromosome segregation and cell cycle regulation, suggesting a link between cuproptosis activity and proliferative signaling in BRCA [16]. Genomic analyses further revealed that high-risk patients exhibited higher frequencies of TP53 mutations and copy number alterations, features commonly associated with genomic instability in basal-like tumors. These findings highlight a possible mechanistic link between cuproptosis activity and BRCA progression via genome-wide regulatory disruption.
Importantly, our model showed potential for predicting therapeutic response. Drug sensitivity analysis indicated that high-risk patients were less responsive to several agents, including AZD2014, GNE-317, and UMI-77, suggesting that CRG risk scores may help identify patients who require alternative or intensified treatment strategies.
The TME, comprising tumor cells and their surrounding stromal components, including lymphocytes, tumor-infiltrating immune cells, and vascular systems, plays a crucial role in tumor progression and therapeutic resistance [17, 18]. Our study identified significant differences in the abundance of 23 tumor-infiltrating immune cell types and in TME characteristics between the high-risk and low-risk groups. High CRG risk scores were associated with lower stromal content and an immunosuppressive immune cell composition, characterized by reduced infiltration of CD56dim NK cells and plasmacytoid dendritic cells, along with enrichment of activated CD4+ T cells and activated dendritic cells. These patterns suggest that cuproptosis activity may influence immune evasion mechanisms in BRCA, although the directionality and causality of these interactions warrant further exploration [19].
While our integrative analysis provides novel insights, several limitations must be acknowledged. First, the study was based on retrospective data from publicly available cohorts (TCGA, n = 1,059; GSE20685, n = 327) and lacks prospective validation. Second, the prognostic signature was derived from bulk transcriptomic data, which may obscure cell type–specific expression patterns; although we incorporated single-cell RNA-seq analysis for partial validation, experimental confirmation of gene function is still lacking. Finally, the observed associations between CRG expression and immune infiltration are correlative, and further mechanistic studies are needed to determine causality [20].
In summary, our findings highlight cuproptosis as a previously underappreciated regulatory process in BRCA, with implications for tumor progression, immune modulation, and therapeutic response. The CRG-based risk signature offers a promising tool for prognostic stratification and personalized treatment planning. Future studies should focus on functional validation of CRGs and clinical translation through prospective trials.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary Material 1. Supplementary Fig. 1. Expression and genomic alterations of cuproptosis-related genes across clinical subgroups. (A) Boxplots showing differential expression of CRGs across PAM50 molecular subtypes (TCGA-BRCA cohort, n = 1,059); (B) CNV profiles of CRGs, highlighting the distribution of amplification and deletion events. Statistical significance was assessed using the Kruskal–Wallis test. A P value < 0.05 was considered statistically significant. Statistical significance in figures was annotated as follows: * P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001; “ns” indicates no statistically significant difference.
Supplementary Material 2. Supplementary Fig. 2. Construction of the weighted gene co-expression network. (A) Gene clustering dendrogram based on topological overlap matrix (1 − TOM) dissimilarity. The color bands below the dendrogram indicate module assignment derived from dynamic tree cutting; (B) Scatter plot showing the correlation between gene significance and module membership (MM) in the yellow module; (C) Scatter plot showing the correlation between gene significance and module membership (MM) in the midnight blue module.
Supplementary Material 3. Supplementary Fig. 3. Survival analysis and risk score distribution of prognostic CRG candidates. (A) Kaplan–Meier survival curves for the top 6 of 23 candidate CRGs significantly associated with overall survival in the TCGA-BRCA cohort (n = 1,059); (B) Distribution of CRG-based risk scores and survival status across patients in the TCGA (training, n = 1,059) and GSE20685 (validation, n = 327) cohorts.
Supplementary Material 4. Supplementary Fig. 4. Functional enrichment and somatic mutation landscape in risk-stratified groups. (A) Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of 108 tumor differentiation–associated genes; (B) Waterfall plot showing the distribution and frequency of somatic mutations in the high- and low-risk groups.
Supplementary Material 5. Supplementary Fig. 5. Landscape of TP53 mutations and tumor mutational burden in risk groups. (A) Frequency and distribution of TP53 mutations in the high-risk group (TCGA-BRCA cohort, n = 986); (B) Comparison of tumor mutational burden (TMB) between high- and low-risk groups, with the high-risk group exhibiting significantly elevated TMB levels.
Acknowledgements
Not applicable.
Author contributions
R.R.P. and Y.L. developed the concept and design of this study. R.R.P. and Y.L. analyzed and interpreted the data. R.R.P. and Y.L. wrote the manuscript. Y.L. designed and illustrated the figures. B.X. conducted a critical review of the manuscript and provided constructive feedback and suggestions for revisions. B.X. provided guidance and supervision throughout the writing process.
Funding
Jiangxi Provincial Health Commission Youth Project (202510933).
Data availability
This study utilizes publicly available datasets from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). As both databases consist of anonymized, de-identified patient data, patient involvement was not required, and informed consent was not necessary in compliance with the ethical guidelines governing the use of these resources.
Declarations
Ethics approval
This study used publicly available data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Therefore, ethical approval was not required.
Patient consent
Not applicable.
Consent to participate
Not applicable. This study did not involve any human participants, animal experiments, or interventions requiring informed consent.
Consent to publish
Not applicable. No individual person’s data in any form (including images, videos, or personal details) is included in this article.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Cao W, Chen HD, Yu YW, Li N, Chen WQ. Changing profiles of cancer burden worldwide and in China: a secondary analysis of the global cancer statistics 2020. Chin Med J. 2021;134:783–91. 10.1097/cm9.0000000000001474.
2. Chen J, et al. The molecular mechanisms of copper metabolism and its roles in human diseases. Pflug Arch: Eur J Physiol. 2020;472:1415–29. 10.1007/s00424-020-02412-2.
3. Tsvetkov P, et al. Copper induces cell death by targeting lipoylated TCA cycle proteins. Science. 2022;375:1254–61. 10.1126/science.abf0529.
4. Duan F, et al. Establishment and validation of prognostic nomograms based on serum copper level for patients with Early-Stage Triple-Negative breast cancer. Front Cell Dev Biology. 2021;9:770115. 10.3389/fcell.2021.770115.
5. Cao S, et al. Role of Cuproptosis in Understanding diseases. Hum Cell. 2023;36:1244–52. 10.1007/s13577-023-00914-6.
6. Bian Z, Fan R, Xie L. A novel Cuproptosis-Related prognostic gene signature and validation of differential expression in clear cell renal cell carcinoma. Genes. 2022;13. 10.3390/genes13050851.
7. Han J, Hu Y, Liu S, Jiang J, Wang H. A newly established Cuproptosis-Associated long Non-Coding RNA signature for predicting prognosis and indicating immune microenvironment features in soft tissue sarcoma. J Oncol. 2022;2022:8489387. 10.1155/2022/8489387.
8. Lv H, et al. Comprehensive analysis of Cuproptosis-Related genes in immune infiltration and prognosis in melanoma. Front Pharmacol. 2022;13. 10.3389/fphar.2022.930041.
9. Wang T, et al. Integrating bulk and single-cell RNA sequencing reveals cellular heterogeneity and immune infiltration in hepatocellular carcinoma. Mol Oncol. 2022;16:2195–213. 10.1002/1878-0261.13190.
10. Thorsson V, et al. The immune landscape of cancer. Immunity. 2018;48:812–830.e14. 10.1016/j.immuni.2018.03.023.
11. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9. 10.1186/1471-2105-9-559.
12. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4:44–57. 10.1038/nprot.2008.211.
13. Yu G, Wang LG, Han Y, He QY. ClusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–7. 10.1089/omi.2011.0118.
14. Wu SZ, et al. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet. 2021;53:1334–47. 10.1038/s41588-021-00911-1.
15. Cocco S, et al. Biomarkers in Triple-Negative breast cancer: State-of-the-Art and future perspectives. Int J Mol Sci. 2020;21. 10.3390/ijms21134579.
16. Liao AQ, et al. Syntheses, crystal structures of copper (II)-based complexes of sulfonamide derivatives and their anticancer effects through the synergistic effect of anti-angiogenesis, anti-inflammation, pro-apoptosis and Cuproptosis. Eur J Med Chem. 2024;280:116954. 10.1016/j.ejmech.2024.116954.
17. Tang T, et al. Advantages of targeting the tumor immune microenvironment over blocking immune checkpoint in cancer immunotherapy. Signal Transduct Target Therapy. 2021;6. 10.1038/s41392-020-00449-4.
18. Fathi M, et al. Identifying signatures of EV secretion in metastatic breast cancer through functional single-cell profiling. iScience. 2023;26:106482. 10.1016/j.isci.2023.106482.
19. Wang N, et al. Copper-Based composites nanoparticles improve Triple-Negative breast cancer treatment with induction of Apoptosis-Cuproptosis and immune activation. Adv Healthc Mater. 2024;13:e2401646. 10.1002/adhm.202401646.
20. Liao Q, et al. p53 induces circFRMD4A to suppress cancer development through glycolytic reprogramming and Cuproptosis. Mol Cell. 2025;85:132–149.e7. 10.1016/j.molcel.2024.11.013.