Abstract
Background
Colorectal cancer (CRC) exhibits high heterogeneity, affecting variable outcomes and response to therapy. Tumor stroma drives progression and immunosuppression. Although tumor–stroma ratio (TSR) is a validated prognostic marker, TSR remains subjective and poorly reproducible. Artificial intelligence (AI) enables standardized TSR quantification on hematoxylin and eosin (HE) whole-slide images (WSI), supporting clinical integration and personalized therapy.
Methods
A total of 3411 CRC patients (Cohorts 1–3) were included for survival analysis. HE-stained WSIs were processed using tumor detection and tissue segmentation models to automatically calculate TSR-AI, classified as low, intermediate, or high. Prognostic value for overall survival (OS) and disease-free survival (DFS) was assessed, along with correlations to immune infiltration. Stromal-immune interactions were further validated using spatial transcriptomics data from publicly available CRC samples profiled with Visium HD platform.
Results
TSR-AI strongly correlated with reference TSR from CK-stained WSIs (Pearson’s r = 0.93, 95% confidence intervals (CI) 0.90–0.94) and with standardized pathologist assessments (p < 0.05). Patients with TSR-AI-low had significantly prolonged OS compared with TSR-AI-high, with unadjusted hazard ratios of 2.44 (95% CI 1.61–3.70, p < 0.001) in Cohort 1, 3.29 (2.29–4.72, p < 0.001) in Cohort 2, and 2.98 (2.07–4.28, p < 0.001) in Cohort 3; similar trends were observed for DFS. TSR-AI-high was associated with reduced immune cell infiltration. Spatial transcriptomics further revealed stromal-immune interactions, with stroma-high tumors showing elevated cancer-associated fibroblast signatures and enrichment of profibrotic transforming growth factor-β signaling.
Conclusion
TSR-AI enables automated, objective, reproducible, and whole-slide quantification of TSR from routine HE-stained WSIs. TSR-AI provides robust prognostic information beyond TNM staging and may inform decisions on postoperative adjuvant therapy. Large-cohort analysis further confirms stroma as a key driver of an immunosuppressive tumor microenvironment in CRC.
Clinical trial number
Not applicable.
Supplementary information
The online version contains supplementary material available at 10.1186/s12967-026-07681-6.
Keywords: Deep learning, Whole-slide images, Tumor-stroma ratio, Colorectal cancer, Immune
Introduction
As a highly heterogeneous disease, colorectal cancer (CRC) exhibits variable clinical outcomes and therapeutic responses even among patients with similar tumor–node–metastasis stages [1, 2]. Furthermore, such heterogeneity translates into differential benefits from adjuvant chemotherapy (ACT), with some patients being exposed to potential overtreatment. This heterogeneity cannot be fully explained by cancer cell–intrinsic factors alone; alterations in the surrounding tumor microenvironment (TME) also play critical roles. Among its components, the tumor stroma is the most abundant compartment and orchestrates cross-communication within tumors, contributing to progression and therapy resistance [3]. Besides, recent studies further suggest that cancer-associated fibroblasts (CAFs), a key stromal cell type, interact with immune cells to establish an immunosuppressive microenvironment [4, 5]. Given the central role of stromal components in shaping the TME, the tumor–stroma ratio (TSR), a relative quantitative parameter, has emerged as a robust prognostic marker across epithelial cancers, with the UNITED study prospectively validating its clinical relevance and potential for integration into international guidelines [6]. However, most studies to date have focused solely on the prognostic value of TSR [7–10]. Therefore, we not only need studies to validate the prognostic value of TSR, but also investigations into its interactions with TME, which are crucial for guiding therapeutic strategies and facilitating clinical translation.
Developing artificial intelligence (AI) algorithms for automated TSR assessment provides a reproducible and objective approach that can accelerate its clinical translation. With advances in digital pathology and deep learning, tissue components on hematoxylin and eosin (HE)-stained slides can now be automatically and accurately identified and quantified [11, 12]. In our prior research, we established a deep learning model to quantify TSR based on histologic whole-slide images (WSI) of CRC and demonstrated its prognostic validity for patient stratification for overall survival (OS) [13]. After identifying tissue components on HE-stained WSIs, the automatically calculated TSR can overcome potential obstacles in clinical implementation, such as variability in cutoff thresholds [14, 15] and evaluated regions [13, 16], as it allows flexible adaptation and reproducible evaluation. In addition, we recently developed a preliminary tumor stroma immune score for quantifying tumor-infiltrating lymphocytes on HE-stained WSIs, which showed prognostic relevance for OS [17]. Therefore, leveraging objective and reproducible analysis, AI technologies present a transformative opportunity to help elucidate the barriers to TSR’s clinical implementation and facilitate its transition from an exploratory biomarker to a clinically applicable tool.
The aim of this study was two-fold. First, leveraging the efficiency of AI, we sought to identify the optimal regions and thresholds to fully realize the clinical utility of TSR and to evaluate its prognostic value in CRC patients across multiple centers. Second, we aimed to further investigate whether high stromal content is directly associated with an immunosuppressive TME, thereby assessing the potential of TSR as a reliable biomarker to guide future therapeutic decisions.
Methods
Patients
Our study recruited patients with histologically confirmed CRC who had undergone surgery with the intent of curing their cancer and had paraffin-embedded tumor samples available. Five cohorts were enrolled in this study. Cohort 1 consists of patients from Guangdong Provincial People’s Hospital (GDPH, January 2008 to July 2016) and Yunnan Cancer Hospital (December 2012 to April 2015). Cohort 2 included patients recruited from Shanxi Cancer Hospital between January 2014 and December 2016. Cohort 3 were patients from the Molecular and Cellular Oncology (MCO) cohort [18, 19] and Cohort 4 were patients from Linköping University Hospital (1983–1999). Cohort 5 consisted of two publicly available human CRC samples obtained from the 10× Genomics Visium HD spatial gene expression dataset portal (https://www0.10xgenomics.com/products/visium-hd-spatial-gene-expression/dataset-human-crc). Additionally, GDPH-RNA cohort comprised of 134 patients from GDPH, for whom transcriptome sequencing was performed (2015 to 2019). This study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional review board of GDPH. The requirement for informed consent was waived due to the retrospective nature of the study. Samples were excluded if the patients received neo-adjuvant therapy (radiotherapy and chemotherapy), missing of follow-up information, death within 30 days of surgery, or unavailability of HE-stained WSIs.
Clinicopathological characteristics information was collected, including age, sex, stage, tumor location, grade, carcinoembryonic antigen (CEA) level, microsatellite instability (MSI) and post-surgery treatment (surgery alone or ACT). Stage was performed according to the Union for International Cancer Control guideline [20]. Preoperative CEA level was binarized as low and high groups, with the cut-off of 5 ng/mL. Missing values were handled using complete-case analysis for models requiring variables not available in all patients (e.g., MSI and ACT). The prespecified major endpoint was OS, referring to the time from the diagnosis as CRC to death for any reason. Disease-free survival (DFS), the secondary endpoint of interest in this study, was defined as the date from the diagnosis as CRC to the first event of tumor recurrence. Patients without events were censored at the date of the last follow-up, with a follow-up duration of 5 years.
Tissue segmentation of HE-stained WSIs
The primary CRC tissue sections fixed with formalin and embedded in paraffin were cut into 4–5 μm slides and stained with HE. These HE-stained slides were all scanned using digital slide scanner systems at 40× objective magnification, including NanoZoomer S60 (Hamamatsu Photonics, Japan), Aperio AT2 (Leica Biosystems, Germany), and Aperio GT450 (Leica Biosystems, Germany).
In routine pathology, TSR assessment is typically performed on the HE-stained section corresponding to the deepest invasive part of the tumor (i.e., the T stage). To reduce potential bias from misidentification of stromal tissue outside tumor areas, we applied a two-step, resolution-based approach for tumor region restriction and TSR calculation on the same WSI. To quantify TSR-AI automatically, two deep learning models were trained using HE-stained image blocks from WSIs. First, the WSI was downsampled to 5× magnification, and a 2D nnUNet model (Model-5×) was used to automatically detect tumor regions. Details of the model training method are in Supplementary Method S1. Second, another semantic segmentation model (Model-10×) for tissue type quantification. We defined 12 tissue types in HE-stained WSIs, namely tumor epithelium, tumor stroma, necrosis, normal gland, normal stroma, submucosa or serosa, muscle, lymphocyte aggregates, mucus, adipose, blood, and background [21]. A 2D semantic segmentation model was developed using the nnUNet framework with the ResEncUNetL configuration. The model architecture was based on a residual encoder UNet, consisting of 8 stages with increasing feature dimensions (from 32 to 512), 3 × 3 convolutional kernels, instance normalization (InstanceNorm2d), and Leaky ReLU activations. The dataset included 5615 RGB histopathology image patches derived from multiple digital pathology datasets, including TCGA-COAD, GDPH, and PAIP2020. All images were normalized using Z-score normalization per channel. In addition to the default data augmentation strategies provided by nnUNet, we further incorporated customized color perturbation and random noise augmentation methods, the details of which are described in Supplementary Method S2. For the Model-10×, these patches were randomly split into a training set (80%) and a validation set (20%) following the same data partitioning strategy used for the Model-5×. To strictly avoid patient- or slide-level leakage, we performed dataset partitioning at the WSI-level. Patch size was fixed at 896 × 768 pixels, and training was conducted with a batch size of 13 for 1000 iterations. Dice loss was computed per batch to optimize segmentation performance. Data preprocessing and resampling were handled by the default nnUNet modules using third-order interpolation for images and first-order interpolation for labels. To assess segmentation performance, we calculated the Dice similarity coefficient (DSC) for each tissue class, which reflects the accuracy of tissue classification and supports reliable TSR quantification.
Calculation of TSR-AI in tumor block area
During inference, 5× and 10× WSIs were tiled using the original patch sizes with a 50% overlap, and patch-level predictions were merged into full-slide maps. The 5× tumor segmentation was then resized to 10× resolution and used as a mask to restrict the 10× tissue-type segmentation to tumor regions, ensuring that TSR was calculated only within the tumor-containing areas. By overlaying the high-resolution segmentation results within the tumor regions identified at 5×, we calculated TSR-AI as AreaSTR/(AreaTUM+AreaSTR)×100%, where the calculation was restricted to the tumor block area. Patients were stratified into TSR-AI-low, intermediate, and high groups using 50% and 75% as cutoffs. A brief description of the TSR-AI processing pipeline provided in Supplementary Method S3. The 50% cutoff represents the most widely adopted and clinically validated TSR threshold across retrospective and prospective studies [6]. The 75% cutoff enables further stratification of tumors with very high stromal content, allowing identification of patients at particularly elevated risk. Detailed information on the validation analyses is provided in the Supplementary Method S4.
Immunohistochemistry (IHC) validation
To validate the accuracy of TSR-AI identified by our model, cytokeratin (CK) IHC was performed on available samples from a subset of Cohort 1 (N = 215). HE-stained sections were de-stained, re-stained with pan-cytokeratin (CK8/18, DAB), and rescanned to label all glandular epithelial components (benign and malignant), serving as a reference for TSR calculation (TSR-Ref). This staining procedure results in a slide where all relevant tissue is highlighted, providing us with a clear ground truth. Re-staining, instead of making consecutive slides, results in an HE and IHC WSI pair for each patient that contains the same tissue. We converted all images from color to gray scale and inversed the images against a black background. After excluding necrosis and background, the white area reflects tumor stroma, while black area reflects the tumor epithelium. We defined TSR-Ref as Areawhite/(Areawhite+Areablack) × 100% within tumor block area.
TSR-virtual tissue microarray (TSR-vTMA) calculation using a standardized protocol
To evaluate the consistency of TSR assessment, WSIs from Cohorts 3 and 4 were used, with regions manually selected by a pathologist (S.Y.). The areas for TSR assessment was selected by a pathologist according to a recently published detailed protocol describing the microscopic determination of TSR [22]. In brief, the conventional requirement for the field used to assess TSR is that tumor cells must be present at all borders of the selected image field. Under this premise, an area with the largest amount of stroma in the entire slice was selected. According to the above requirements, pathologists delineated the visual field using a circle with a diameter of 1 mm at 10× objective. TSR-vTMA was obtained by calculating the TSR in the pathologist-annotated regions on the segmentation results of HE-stained WSI. In addition, to compare the consistency of the selected regions, the same pathologist made a second TSR region selection in the same batch of data.
Assessment of immune cells in TME
To investigate the relationship between stromal content and immune cell infiltration within the TME, we performed nuclei segmentation and classification on HE-stained WSIs. A StarDist model with a U-Net backbone was used to segment individual nuclei and classify them into six major cell types: neutrophil cell, epithelial cell, lymphocyte cell, plasma cell, eosinophil cell, and cells belonging to connective tissue. The model was trained from scratch using a multi-task loss function that combines object detection, shape regression, and classification objectives. Detailed training procedures followed the protocol described in a previous study [23].
Immune cell deconvolution analysis using CIBERSORTx
Bulk RNA-seq data from the GDPH-RNA cohort were subjected to immune cell deconvolution using CIBERSORTx (RRID:SCR_016955). This approach applies a reference expression matrix of immune cell gene signatures to estimate the relative fractions of diverse immune cell subsets within the TME.
Spatial transcriptomic analysis using Visium HD
Two publicly available human CRC samples were downloaded from the 10× Genomics Visium HD spatial gene expression dataset portal, including P2CRC and P5CRC. Unsupervised clustering identified 23 transcriptionally distinct clusters, which were further annotated into 9 major cell lineages based on canonical marker genes: tumor epithelial cells, enterocytes, endothelial cells, smooth muscle cells, T cells, fibroblasts, B cells, myeloid cells, and neurons. Cluster annotations were spatially visualized across the tissue sections to assess cellular composition and distribution patterns between P2CRC and P5CRC. To further explore the relationship between stromal features and immune phenotypes, we examined the spatial expression profiles of well-established marker genes. Immune-related genes included PTPRC (CD45), CD3D, CD8A, FOXP3, PDCD1, CD68, CD163, and CD1C, while stromal-associated markers included COL1A1, COL3A1, FAP, PDGFRA, VIM, and FN1. To investigate transcriptional differences associated with stromal composition, experienced pathologists manually delineated stroma regions on the Visium HD sections for each sample. Differential gene expression analysis was then performed comparing these annotated stromal regions between the two samples. Subsequently, significantly differentially expressed genes were subjected to pathway enrichment analysis using the KEGG database (RRID:SCR_027172), with pathways considered significant at an adjusted p < 0.05.
Statistical analysis
Clinicopathological characteristics were compared by Student t-test for a continuous variable or Chi-square test for a category variable. Kaplan-Meier survival analysis was applied for the analysis of the survival curves, and log-rank statistics were used to test the differences in survival distributions. Uni- and multivariable survival analyses were performed using the Cox proportional hazard model for TSR-AI and clinicopathological variables. The discriminative performance of TSR classifications was further evaluated by calculating the concordance index (C-index) with 95% confidence intervals (CI). Two-side p value less than 0.05 was considered statistically significant. All statistical analyses were conducted using R software (version 4.0.3, R Project for Statistical Computing (RRID:SCR_001905)).
Results
Patients
A total of 3411 CRC patients were included for survival analysis. Among them, 945 patients comprised Cohort 1, 1261 patients comprised Cohort 2, and 1205 patients comprised Cohort 3. WSIs from Cohort 4 were used solely for consistency analysis of TSR assessment and were not included in survival analysis. In this study, both OS and DFS analyses were truncated at 5 years. A comparison of clinicopathologic characteristics between three cohorts is shown in Supplementary Table S1. There were significant differences between three cohorts in age, sex, stage, location, CEA level and grade (all p < 0.05, Supplementary Table S1). The overall workflow of our study is illustrated in Fig. 1. Supplementary Fig. S1 illustrates an example of the 3-category TSR-AI calculated in tumor block area and case-level TSR-AI values with overlay visualizations are provided in Supplementary Fig. S2. Representative failure-case visualizations of TSR-AI are also provided in Supplementary Fig. S3.
Fig. 1.
Study workflow. (A) Evaluation process for TSR-AI: following digitization of HE-stained slides, tissue segmentation and tumor region detection models were trained separately to automatically calculate 3-category TSR-AI as AreaSTR/(AreaTUM+AreaSTR)×100%. (B) Calculation method of TSR gold standard: following IHC re-staining of HE-stained slices, TSR-Ref was calculated within the tumor block area based on tumor region detection. TSR-Ref as Areawhite/(Areawhite+Areablack) × 100% within tumor block area. (C) Comparison of TSR-AI and TSR-vTMA: the areas for TSR was selected by a pathologist using a standardized protocol and directly calculate TSR of the same position on the segmentation results of HE-stained (TSR-vTMA). (D) Analysis: included consistency analysis, prognostic analysis, immune correlation analysis, and spatial transcriptomics analysis. TSR, tumor-stroma ratio; AI, artificial intelligence; TUM, tumor epithelium; STR, stroma; BAC, background; NEC, necrosis; NOR, normal gland; NSM, normal stroma; SUB, submucosa or serosa; MUS, muscle; LYM, lymphocyte aggregates; MUC, mucus; ADI, adipose; BLD, blood; IHC, immunohistochemistry; HE, hematoxylin and eosin; vTMA, virtual tissue microarray; CAF, cancer-associated fibroblast; mRegdc, mature regulatory dendritic cells; pDC, plasmacytoid dendritic cells; cDC1, conventional dendritic cells type I
Performance of the tissue segmentation model (Model-10×)
The segmentation model was trained with 80% of the dataset and validated on the remaining 20%. The segmentation model achieved high accuracy across the 12 defined tissue types. The average DSC for tumor epithelium and tumor stroma was 0.9629 and 0.9636, respectively, indicating reliable differentiation between key components required for TSR calculation. In addition, the model demonstrated satisfactory segmentation performance for other tissue types, as detailed in Supplementary Table S2.
Agreement between TSR-Ref and TSR-AI
To validate the accuracy of the TSR-AI identified by our model, agreement between TSR-Ref and TSR-AI was assessed. Figure 2A-C shows examples of TSR-Ref (calculated from CK-stained WSI) and TSR-AI (estimated by model from HE-stained WSI). A strong positive correlation was observed between TSR-Ref and TSR-AI (Pearson’s r = 0.93, 95% CI 0.90–0.94; Fig. 2D), indicating a robust linear association between the two methods. Bland-Altman plot showed good agreement between TSR stained by HE and CK, with the mean difference of ∆TSR was −0.02 (95% CI −0.18–0.15) (Fig. 2E). As illustrated in the confusion matrix, TSR-AI demonstrated substantial agreement with TSR-Ref, with the majority of cases falling along the diagonal line (Fig. 2F).
Fig. 2.
Consistency analysis between TSR-AI and TSR-Ref. (A) IHC re-staining of HE-stained slices. (B) Paired HE and IHC WSIs of one patient. (C) Up panel: HE-stained block, AI segmentation, and result of TSR-AI. Low panel: CK-stained block, DAB channel, and result of TSR-Ref. (D) Consistency analysis between TSR-AI and TSR-Ref (N = 215) using Pearson correlation coefficient (two-sided, α = 0.05). (E) Bland–Altman plot assessing agreement between TSR values derived from HE- and CK-stained WSIs (N = 215); solid line indicates the mean difference and dashed lines represent the 95% limits of agreement. (F) Confusion matrix of TSR-AI versus TSR-Ref (N = 215). TSR, tumor-stroma ratio. AI, artificial intelligence; HE, hematoxylin and eosin; IHC, immunohistochemistry; CK, cytokeratin; WSI, whole-slide images
Agreement between TSR-AI and TSR-vTMA
Figure 3A illustrates an HE-stained WSI alongside its segmentation results, which were used for TSR quantification on the whole slide (TSR-WSI). Figure 3B shows the vTMA-selected region according to the standardized protocol by the pathologist used for TSR calculation (TSR-vTMA). First, a correlation was observed between TSR-AI and TSR-vTMA in Cohort 3 (Pearson’s r = 0.53, 95% CI 0.49–0.57, p < 0.0001; Fig. 3C) and in Cohort 4 (Pearson’s r = 0.31, 0.07–0.52, 0.013; Fig. 3D). In addition, to assess the consistency of region selection, the same pathologist performed a second TSR region selection in a subset of patients from Cohort 3. Figure 3E-F illustrates the TSR regions in the two assessments. TSR-vTMA1 and TSR-vTMA2 showed a moderate positive correlation (Pearson’s r = 0.56, 95% CI 0.49–0.62, p < 0.0001; Fig. 3G), indicating that the variability observed between TSR-WSI and TSR-vTMA falls within the expected range attributable to manual region selection. These findings support the reliability and potential utility of whole-slide TSR assessment as a robust alternative to traditional region-based evaluation.
Fig. 3.
Consistency analysis between TSR-AI and TSR-vTMA. (A) HE-stained WSI alongside its segmentation results. (B) The vTMA-selected region according to the standardized protocol by the pathologist used for TSR calculation. (C) Consistency analysis between TSR-AI and TSR-vTMA in Cohort 3 (N = 1205) using Pearson correlation coefficient (two-sided, α = 0.05). (D) Consistency analysis between TSR-AI and TSR-vTMA in Cohort 4 (N = 62) using Pearson correlation coefficient (two-sided, α = 0.05). (E) Intra-observer consistency in region selection. (F) vTMA and corresponding tissue segmentation in two assessments. (G) Consistency analysis between TSR-vTMA1 and TSR-vTMA2 (N = 400) using Pearson correlation coefficient (two-sided, α = 0.05). TUM, tumor epithelium; STR, stroma; BAC, background; NEC, necrosis; NOR, normal gland; NSM, normal stroma; SUB, submucosa or serosa; MUS, muscle; LYM, lymphocyte aggregates; MUC, mucus; ADI, adipose; BLD, blood; WSI, whole-slide images; vTMA, virtual tissue microarray; TSR, tumor-stroma ratio; AI, artificial intelligence; HE, hematoxylin and eosin
Prognostic value of TSR-AI
To investigate whether a 3-category TSR classification provides a more refined stratification compared to the conventional binary classification (cutoff of 50%), we conducted a comparative analysis of two cutoff approaches. Obtained outcomes showed that both 3- and 2-category TSR-AI were significantly related to OS (p < 0.0001; Supplementary Fig. S4). Utilizing a 3-category TSR-AI enabled more refined risk stratification. To further quantify discriminative performance, we compared the C-index values across TSR classifications derived by AI and pathologist assessment (Supplementary Table S3). We observed that, irrespective of whether TSR was estimated automatically or subjectively, 3-category TSR consistently outperformed 2-category TSR. In a pooled cohort analysis, the C-index of the 3-category TSR-AI for OS was 0.586 (95% CI 0.561–0.610), compared to 0.579 (95% CI 0.555–0.603) for 2-category TSR-AI. Similarly, for the pathologist-derived classifications, the C-index for 3-category TSR was 0.580 (95% CI 0.554–0.606), whereas 2-category TSR showed a lower C-index of 0.566 (95% CI 0. 0.542–0.590). Comparable trends were observed for DFS (Supplementary Table S3). In addition, time-dependent discrimination analyses demonstrated that the 3-category TSR-AI consistently achieved higher C-index values than the conventional 2-category classification for both OS and DFS across cohorts (Supplementary Table S4). The prognostic effect of TSR-AI is insensitive to the exact choice of cutoffs, supporting the robustness of the selected thresholds (Supplementary Table S5). The optimism-adjusted calibration plot demonstrated good agreement between the predicted and observed survival probabilities across all TSR categories (Supplementary Fig. S5). As shown in the decision curve, the TSR model yielded a consistently higher standardized net benefit than both the treat-all and treat-none strategies across a clinically relevant range of threshold probabilities (Supplementary Fig. S6). Collectively, these findings indicate that the selected 50 and 75% cutoff values are both statistically robust and clinically meaningful.
Figure 4 presents the Kaplan–Meier curves for OS and DFS stratified by 3-category TSR-AI in different cohort. Patients with TSR-AI-low demonstrated significant prolonged OS. The unadjusted hazard ratio (HR) for high versus low was 2.44 (95%CI 1.61–3.70, p < 0.001; Fig. 4A; Table 1) in Cohort 1. In Cohort 2 and 3, these findings were confirmed. The unadjusted HR for high versus low was 3.29 (Cohort 2: 2.29–4.72, < 0.001; Fig. 4B; Table 1) and 2.98 (Cohort 3: 2.07–4.28, < 0.001; Fig. 4C; Table 1). Obtained results showed that TSR-AI was significantly correlated with DFS, with unadjusted HR for high vs. low being 3.00 (95% CI 2.09–4.31, p < 0.001, Fig. 4D; Supplementary Table S6) in Cohort 1 and 3.15 (2.25–4.39, p < 0.001, Fig. 4E; Supplementary Table S6) in Cohort 2.
Fig. 4.
Kaplan–Meier plots for colorectal cancer patients according to 3-category TSR-AI. (A) OS in Cohort 1 (N = 945). (B) OS in Cohort 2 (N = 1261). (C) OS in Cohort 3 (N = 1205). (D) DFS in Cohort 1 (N = 945). (E) DFS in Cohort 2 (N = 1261). Survival differences were assessed using the log-rank test (two-sided, α = 0.05). TSR, tumor-stroma ratio; AI, artificial intelligence; OS, overall survival; DFS, disease-free survival
Table 1.
Uni – and multivariable analyses including TNM, sex, age, location, CEA, grade and TSR-AI for OS in the three cohorts
| Variable | Levels | Cohort 1 | Cohort 2 | Cohort 3 | |||
|---|---|---|---|---|---|---|---|
| HR (95% CI) | P | HR (95% CI) | P | HR (95% CI) | P | ||
| Univariable analysis with Cox model | |||||||
| Age | 1.02 (1.01–1.03) | < 0.001 | 1.02 (1.01–1.04) | < 0.001 | 1.02 (1.01–1.03) | < 0.001 | |
| Sex | Female vs. Male | 0.92 (0.70–1.20) | 0.534 | 1.11 (0.90–1.39) | 0.329 | 0.71 (0.59–0.87) | 0.001 |
| Grade | High vs. Low | 1.50 (1.14–1.99) | 0.004 | 2.03 (1.58–2.60) | < 0.001 | 1.67 (1.30–2.13) | < 0.001 |
| Location | Right vs. Left | 0.99 (074–1.34) | 0.972 | 1.46 (1.17–1.83) | 0.001 | 0.89 (0.73–1.09) | 0.259 |
| Stage | II vs. I | 1.83 (0.83–4.02) | 0.135 | 3.59 (1.56–8.27) | 0.003 | 1.33 (0.92–1.93) | 0.127 |
| III vs. I | 5.30 (2.48–11.31) | < 0.001 | 11.74 (5.20–26.46) | < 0.001 | 2.50 (1.77–3.55) | < 0.001 | |
| IV vs. I | 12.00 (4.79–30.10) | < 0.001 | 56.19 (24.25–130.22) | < 0.001 | 10.94 (7.64–15.67) | < 0.001 | |
| CEA | Abnormal vs. Normal | 2.36 (1.80–3.08) | < 0.001 | 1.89 (1.50–2.37) | < 0.001 | NA | NA |
| TSR–AI | Intermediate vs. Low | 1.59 (1.19–2.11) | 0.002 | 2.07 (1.54–2.79) | < 0.001 | 1.71 (1.39–2.09) | < 0.001 |
| High vs. Low | 2.44 (1.61–3.70) | < 0.001 | 3.29 (2.29–4.72) | < 0.001 | 2.98 (2.07–4.28) | < 0.001 | |
| Multivariable analysis with Cox model | |||||||
| Age | 1.03 (1.01–1.04) | < 0.001 | 1.03 (1.02–1.04) | < 0.001 | 1.04 (1.04–1.05) | < 0.001 | |
| Sex | Female vs. Male | NA | NA | NA | NA | 0.71 (0.58–0.87) | 0.001 |
| Grade | Low vs. High | NA | NA | 1.67 (1.29–2.17) | < 0.001 | NA | NA |
| Location | Right vs. Left | NA | NA | NA | NA | NA | NA |
| Stage | II vs. I | 1.70 (0.73–3.98) | 0.221 | 3.43 (1.38–8.55) | 0.008 | 1.22 (0.84–1.76) | 0.300 |
| III vs. I | 4.47 (1.95–10.23) | < 0.001 | 10.77 (4.40–26.35) | < 0.001 | 2.57 (1.80–3.67) | < 0.001 | |
| IV vs. I | 9.31 (3.44–25.20) | < 0.001 | 48.71 (19.24–123.31) | < 0.001 | 13.69 (9.39–19.95) | < 0.001 | |
| CEA | Abnormal vs. Normal | 1.86 (1.42–2.44) | < 0.001 | 1.30 (1.03–1.64) | 0.029 | NA | NA |
| TSR–AI | Intermediate vs. Low | 1.59 (1.19–2.11) | 0.002 | 1.51 (1.12–2.04) | 0.008 | 1.24 (1.01–1.53) | 0.044 |
| High vs. Low | 2.44 (1.61–3.70) | < 0.001 | 1.95 (1.33–2.84) | 0.001 | 2.27 (1.57–3.28) | < 0.001 | |
Abbreviations: TNM, tumor-node-metastasis; CEA, carcinoembryonic antigen; TSR, tumor-stroma ratio; AI, artificial intelligence; OS, overall survival; HR, Hazard ratio; CI, confidence interval
The 5-year OS rates declined with TSR-AI increased. The 5-year survival rates were 81.2% in TSR-AI-low group, 72.0% in intermediate group, and 60.7% in high group in Cohort 1 (p < 0.001; Fig. 4A). In Cohort 2, the 5-year survival rate was 84.9% in the TSR-AI-low group compared with 58.3% in the high group (p < 0.001; Fig. 4B). Similarly, in Cohort 3, the rates were 72.6% and 39.2%, respectively (p < 0.001; Fig. 4C). Similar trends were observed for DFS.
TSR-AI as an independent prognostic factor for OS and DFS
In Table 1, the univariable association between clinicopathological characteristics and OS is presented. We identified age, grade, stage, CEA level, and TSR-AI as prognostic factors for OS (p < 0.05). In the multivariable analysis, TSR-AI was an independent prognostic marker for improved OS, independent of age, CEA level, and stage (Cohort 1: adjusted HR for high vs. low 2.44, 95% CI 1.61–3.70, p < 0.001; Cohort 2: 1.95, 1.33–2.84, 0.001; Cohort 3: 2.27 (1.57–3.28), < 0.001; Table 1). We identified age, grade, CEA level, TNM stage, and TSR-AI as independent predictors for DFS (Supplementary Table S6, all p < 0.05). In multivariable analysis, TSR-AI was still associated with DFS, independent of age, CEA level, and TNM stage (Cohort 1: adjusted HR for high vs. low 2.25, 95% CI 1.54–3.27, p < 0.001; Cohort 2: 1.93, 1.36–2.73, < 0.001, Supplementary Table S6).
Additionally, in a multivariable model restricted to patients with available MSI and ACT data (N = 732 for DFS; N = 1788 for OS), TSR-AI remained independently associated with both DFS and OS (Supplementary Table S7). No significant interaction between TSR-AI and ACT was observed for either endpoint (all interaction p > 0.09), indicating that the prognostic effect of TSR is consistent across treated and untreated patients. Detailed results are presented in Supplementary Table S8.
Risk stratification by TSR-AI in stage II–III patients with different post-surgery treatments
In the ACT-treated subgroup, TSR-AI effectively stratified patients into TSR-high and low groups. Among stage II patients, 5-year OS was 81.0%, 72.4%, and 59.9% for TSR-AI-low, intermediate, and high groups, respectively, while 73.5%, 64.8%, and 55.8% in stage III (both p < 0.001; Supplementary Fig. S7A, B). Among stage II patients, 3-year DFS was 84.6%, 74.9% and 61.0% across TSR-AI-low, intermediate, and high groups while 78.9%, 67.9%, and 58.6% in stage III (both p < 0.001; Supplementary Fig. S7C, D). These findings suggest that high stroma contributes to tumor progression and resistance to therapy.
In the non-ACT subgroup, TSR-AI also effectively distinguished patients with differential OS and DFS. Among stage II patients, 5-year OS was 82.1%, 74.4%, and 59.7% for TSR-AI-low, intermediate, and high groups, respectively, while 76.6%, 65.8%, and 55.8% in stage III patients (both p < 0.001; Supplementary Fig. S7E, F). Among stage II patients, 3-year DFS, 3-year DFS was 89.1%, 80.9%, and 64.4% across TSR-AI-low, intermediate, and high groups while 83.6%, 69.4%, and 56.8% in stage III (both p < 0.001; Supplementary Fig. S7G, H). Among non-ACT patients, TSR-AI distinguished significantly different OS and DFS, indicating that it could potentially improve the identification of high-risk individuals compared with existing clinical criteria.
Correlation of TSR-AI with immune cells
To evaluate the potential correlation between TSR and immune cells within TME, additional analyses were performed to explore the relationship between different TSR groups and specific immune cell subsets, including lymphocyte cell, neutrophil cell, plasma cell, and eosinophil cell. The distribution of immune cells within stroma versus TSR-AI was shown in Fig. 5. Figure 5A presents the tissue and nuclei segmentation results on the same HE-stained WSI and Fig. 5B presents the HE-stained WSI of the vTMA selection and the corresponding tissue and nuclear segmentation results. TSR-AI-low was correlated more closely with high lymphocyte cells (1807 cells/mm2) within stroma, while TSR-AI-high was related more closely to low lymphocyte cells (1226.128 cells/mm2) within stroma (Fig. 5C). Neutrophils were more abundant in the TSR-AI-low group (88 cells/mm2) compared to TSR-AI-high group (54 cells/mm2) (Fig. 5D). Similar trends were observed in plasma cells. The highest density in the TSR-AI-low group (456 cells/mm2), followed by the middle (390 cells/mm2) and high groups (286 cells/mm2) (Fig. 5E). Eosinophils showed a decreasing trend from TSR-AI-low (54 cells/mm2) to TSR-AI-high (30 cells/mm2), suggesting reduced innate immune infiltration in stroma-rich tumors (Fig. 5F). In multivariable linear regression adjusting for MSI status, tumor location, and tumor grade, TSR-AI remained independently associated with reduced stromal lymphocyte density. Compared with TSR-AI-low tumors, TSR-AI-intermediate and TSR-AI-high tumors showed significantly lower lymphocyte densities (β = −421, p = 6.1 × 10− 8; β = −574, p = 0.0013, respectively). MSI status and tumor location also contributed to immune variability, whereas tumor grade did not show a significant association. These findings confirm that the inverse relationship between TSR-AI and immune infiltration is independent of major clinical and molecular confounders.
Fig. 5.
Correlation of TSR-AI with immune cells. (A) The tissue and nuclei segmentation results on the same HE-stained WSI. (B) HE-stained WSI of vTMA selection and the corresponding tissue and nuclei segmentation results. (C) The distribution of TSR-AI versus lymphocyte cells. (D) The distribution of TSR-AI versus neutrophils. (E) The distribution of TSR-AI versus plasma cells. (F) The distribution of TSR-AI versus eosinophils. Group comparisons were performed using Student’s t-test (N = 767, two-sided, α = 0.05). WSI, whole-slide images; TUM, tumor epithelium; STR, stroma; BAC, background; NEC, necrosis; NOR, normal gland; NSM, normal stroma; SUB, submucosa or serosa; MUS, muscle; LYM, lymphocyte aggregates; MUC, mucus; ADI, adipose; BLD, blood; TSR, tumor-stroma ratio; AI, artificial intelligence; HE, hematoxylin and eosin; vTMA, virtual tissue microarray. (nsp > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001, Student’s t-test)
To further validate these findings at the transcriptomic level, we next investigated the association between TSR-AI and immune cell composition using CIBERSORT in the GDPH-RNA cohort. As shown in Supplementary Fig. S8A, a heatmap visualization revealed distinct immune infiltration stratified by TSR groups. The detailed immune cell composition across individual patients is presented in Supplementary Fig. S8B, while the average proportions of each immune cell subset in different TSR groups are summarized in Supplementary Fig. S8C. In GDPH-RNA cohort, TSR-AI-high exhibited significantly higher proportions of M2 macrophages, accompanied by lower infiltration of NK cell activated, CD8+ cytotoxic T cells, T cell CD4+ memory activated and T cells regulatory (Tregs). These findings indicate that TSR-AI robustly reflects the transcriptional landscape of immune components in CRC, linking histopathological stromal abundance to an immunosuppressive phenotype.
Spatial transcriptomic profiling reveals transcriptional and functional heterogeneity
For P2CRC and P5CRC samples, representative stroma regions were first selected on the HE-stained WSIs, while non-representative stroma areas were masked, as shown in Visium HD HE-stained WSIs in Fig. 6A. Using our model, TSR-AI was calculated as 52.4% for P2CRC (classified as TSR-intermediate) and 30.5% for P5CRC (classified as TSR-low). Figure 6A presents HE-stained WSIs of P2CRC and P5CRC, the unsupervised clustering results, and the stromal regions derived from unsupervised clustering mapped back onto the HE-stained slides. The corresponding tissue segmentation results are presented in Fig. 6B. Both Visium HD and tissue segmentation results demonstrated a high stroma in P2CRC and a low stroma in P5CRC. Figure 6C shows the spatial mapping and visualization of two CRC samples by combining unsupervised clustering labels with deconvolution-based cell type annotations and Fig. 6D highlights the relationship between cancer-associated fibroblast (CAF) and immune cells. Figure 6E displays heatmaps of gene expression for differentially expressed genes distinguishing immune and stroma components, shown across annotated cell types. In P2CRC (upper panel), CAF-related genes, including COL1A2, COL3A1, and SPARC, were highly expressed, consistent with a stroma-rich microenvironment. Conversely, in P5CRC (lower panel), higher expression of immune-related markers was observed, including S100A8 and LYZ in neutrophils, IGKC and IGHG1 in plasma cells, and CXCL8 in neutrophils and macrophages. Pathway enrichment analysis comparing the manually annotated stromal regions from P2CRC and P5CRC samples revealed distinct transcriptional profiles (Fig. 6F). P2CRC exhibited a marked enrichment of transforming growth factor (TGF)-β signaling pathway, one of the key profibrotic pathways, which was absent in P5CRC. In contrast, P5CRC showed enrichment of immune-related and inflammatory pathways, such as chemokine signaling and IL-17 signaling (Fig. 6F).
Fig. 6.
Spatial transcriptomic profiling reveals transcriptional and functional heterogeneity. (A) Visium HD HE-stained WSIs, the unsupervised clustering results and stroma-related cluster mapped back onto the HE-stained slides of P2CRC and P5CRC. (B) Corresponding tissue segmentation results of P2CRC and P5CRC. (C) Spatial mapping of P2CRC and P5CRC: unsupervised clustering and deconvolution-based cell type labels. (D) Relationship between CAF and immune cells. (E) Heatmaps of gene expression for differentially expressed genes distinguishing immune and stromal components. (F) Pathway enrichment analysis. HE, hematoxylin and eosin; WSI, whole-slide images; AI, artificial intelligence; CRC, colorectal cancer; TUM, tumor epithelium; STR, stroma; BAC, background; NEC, necrosis; NOR, normal gland; NSM, normal stroma; SUB, submucosa or serosa; MUS, muscle; LYM, lymphocyte aggregates; MUC, mucus; ADI, adipose; BLD, blood; CAF, cancer-associated fibroblast; NK, natural killer cells; cDC1, conventional dendritic cells type I; mRegdc, mature regulatory dendritic cells; pDC, plasmacytoid dendritic cells; vSM, vascular smooth muscle cells; TGF, transforming growth factor
Discussion
In this study, the AI-based approach enabled quantitative analysis of tissue composition in CRC HE-stained WSIs, thereby allowing automated calculation and objective comparison of TSR across different regions and thresholds. We found that 3-category TSR-AI on whole slides was a stable prognosis factor for CRC and outperformed the current calculation of TSR using a standardized protocol. In addition, by leveraging AI in a large-scale cohort, we confirmed that stroma is a critical determinant in shaping an immunosuppressive TME.
The complexity of tumorigenesis has been recognized as a bidirectional communication between cancer cells and certain cell types within TME [24]. Tumor biology can no longer be understood solely by listing the intrinsic traits of cancer cells; instead, it must also account for the contributions of TME [25]. It has been shown that stroma mediates tumor growth, invasion, and metastasis [26]. This stroma modulates several tumor-promoting processes, including epithelial-to-mesenchymal transition [27]. A key marker for assessing the quantity of stroma is TSR, our previous analysis and studies by other scholars have shown that in CRC patients, abundant stroma was associated with poor prognosis [6, 13]. Results of our present work also suggested that patients with TSR-AI-low demonstrated significant prolonged OS and DFS. We adopted a 3-category TSR approach, providing an intermediate state perspective that avoids the oversimplification of 2-category analysis. Regardless of whether TSR was assessed manually or automatically, 3-category TSR consistently outperformed the 2-category method, enabling more refined risk stratification while maintaining compatibility with the widely accepted 50% threshold. Moreover, subgroup analysis by post-surgery treatment showed that even among patients in ACT-treated subgroup, TSR-AI-high patients demonstrated worse OS and DFS compared with TSR-AI-low patients, supporting a potential association with therapy resistance. CAF-mediated drug sequestration should be a significant factor in the biological mechanisms underlying the poor therapeutic response in TSR-high. CAFs can secrete extracellular matrix components that trap chemotherapeutic agents. This reduces the amount of drugs reaching the tumor cells, thereby leading to a poor therapeutic response in TSR-high. For example, some studies have shown that CAFs produce high levels of hyaluronic acid, which forms a dense matrix around the tumor. This matrix acts as a physical barrier, preventing the efficient diffusion of drugs into the tumor tissue [28, 29]. In TSR-high tumors, the elevated levels of TGF-β can inhibit the infiltration of cytotoxic T lymphocytes and natural killer cells into the TME. Without the effective immune surveillance and killing by these immune cells, the tumor cells can survive and proliferate even in the presence of chemotherapy [30, 31]. In addition, stroma-induced impaired drug penetration is also a key aspect. The dense stromal network in TSR-high tumors can disrupt the normal blood vessel structure and function. Abnormal blood vessels in the tumor stroma may have reduced blood flow, which limits the delivery of drugs to the tumor cells [28, 32–34]. In patients non-ACT subgroup, TSR-AI also effectively stratified risk, indicating it may outperform conventional clinical criteria in identifying high-risk individuals.
With the continuous development of AI technology, the challenges of applying TSR in clinical settings due to its subjectivity and reproducibility are expected to be gradually addressed. While a moderate agreement between TSR-AI and TSR-vTMA was observed in our study, this level of consistency is expected and mainly reflects the known limitations of manual, region-based TSR protocols. Importantly, this observation further underscores the motivation for developing automated whole-slide TSR assessment, which is designed to minimize subjectivity and enhance reproducibility. Our results showed that repeated TSR region selection by the same pathologist achieved only moderate agreement (Pearson’s r = 0.56), which is comparable to the correlation observed between TSR-AI and TSR-vTMA (Pearson’s r = 0.53), indicating that the variability observed between TSR-WSI and TSR-vTMA falls within the expected range of variability arising from manual region selection. Moreover, Firmbach et al. demonstrated that observers’ individual estimates for a specific region of interest (ROI) can differ notably, with even the most similar estimates exhibiting discrepancies of around 20% points [35]. Unlike conventional region-based TSR, TSR-AI developed in this study analyzes the entire WSI, enabling a more comprehensive representation of TME and significantly reducing the subjectivity and poor reproducibility associated with region selection and manual quantification. In previous studies on automated TSR assessment, it was observed that TSR assessment with AI showed lower agreement with TSR from experienced observers. Furthermore, they noted that high consistency among human observers’ TSR values does not necessarily mean these values objectively reflect TSR within the considered ROI [35]. To validate whether TSR-AI identified by our model could objectively represent the proportion of tumor and stroma, we used CK-stained WSI to effectively label tumor cells and enable precise TSR calculation. The results demonstrated that although the correlation between TSR-AI and TSR-vTMA was moderate, there was a strong correlation between TSR-AI and TSR-Ref, confirming the accuracy of our model in predicting TSR.
An increasing number of studies have explored more comprehensive aspects of immune regulation and immunotherapy response [36–38]. Interestingly, in addition to its prognostic value in primary solid tumors, TSR has drawn increasing attention for its potential association with the tumor immune microenvironment. Many research papers have reported similar findings, that CAFs are one of the major cell types within tumor stroma, which has been widely recognized as a crucial component in TME. On the one hand, the dense and rigid fibrotic extracellular matrix (ECM) generated by CAF activity can physically obstruct the infiltration of immune cells into the tumor, thereby restricting the interactions between T cells and cancer cells [39]. In CRC, matrix stiffening, driven by CAF-mediated ECM deposition and cross-linking, acts as a physical and immunological barrier that hampers immunotherapy. Rigidity impairs the infiltration of T-cells and macrophages, enforces immune-checkpoint expression (e.g., PD-L1), and fosters epithelial-to-mesenchymal transition, collectively promoting tumor progression and immune escape [40]. On the other hand, the stromal CAF secretome can modify tumor immunity by influencing innate immune cell recruitment and activation and polarizing the adoptive immune response towards a pro-tumor phenotype [41]. CAFs could recruit and stimulate immunosuppressive cells and inhibit various immune effector cells. IL-6 secreted by CAFs is instrumental in attracting tumor-associated macrophages and encouraging their transformation into an M2 immunosuppressive phenotype [42]. Additionally, CAFs can secrete TGF-β, which promote the recruitment and differentiation of Tregs, contributing to the suppression of anti-tumor immune responses [43, 44]. Moreover, CAFs can diminish the activation of CD8+ cytotoxic T cells and natural killer cells through the expression of inhibitory immune checkpoint molecules [45, 46]. Targeting this biomechanical checkpoint via ECM-degrading enzymes, FAK/LOX/TGF-β inhibitors, or stiffness-tuned biomaterials is showing early promise in restoring immune cell access and augmenting checkpoint-blockade, adoptive-cell, oncolytic-virus, and vaccine therapies [30]. In our study, nuclei segmentation and classification on HE-stained WSIs, together with spatial transcriptomics analysis, revealed consistent trends. TSR-low correlated with higher levels of lymphocytes, neutrophils, plasma cells, and eosinophils. Furthermore, pathway enrichment analysis validated these findings: P2CRC exhibited a marked enrichment of the TGF-β signaling pathway and P5CRC showed enrichment of immune-related and inflammatory pathways. Our study demonstrates, at a large-scale population, that stromal abundance is an important factor of immunosuppressive TME. TSR-AI not only serves as a prognostic marker but also provides a cost-effective tool for immune phenotyping, probable enabling the identification of patients suitable for anti-fibrotic plus immunotherapy trials.
This work has inherent defects in retrospective research. Therefore, it is necessary to conduct a prospective study to verify the effectiveness of TSR-AI in routine clinical practice, to improve the reliability and persuasiveness of the evidence. Additionally, the relationship between TSR and immune was still in the initial exploration stage, including analyses based on immune cell deconvolution and spatial transcriptomics from a limited number of cases. Further studies with larger sample sizes and comprehensive analyses are needed to better characterize the complex interactions within TME.
In conclusion, we proposed a TSR-AI for fully automatic quantity using HE-stained WSIs in CRC. Evidence suggests that TSR-AI was a stable prognosis factor and particularly useful for adjusting risk stratification of CRC. Furthermore, TSR-AI serves as a reliable biomarker for identifying immunosuppressive phenotypes, paving the way for microenvironment-based precision therapeutic strategies.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Acknowledgements
This work was supported by National Natural Science Foundation of China (82372042, 82271946), National Science Foundation for Young Scientists of China (82202267), Natural Science Foundation of Guangdong Province (No. 2025A1515011834), Guangdong Basic and Applied Basic Research Foundation (2023A1515011339), Nansha District People’s Livelihood Science and Technology Project (No. 2024SM005), Heyuan City 2024 Guangdong Provincial Science and Technology Support for the ‘Hundreds of Thousands of Tens of Thousands Project’ Special Funds Second Batch Project Plan (No.25) and Xuhui District Campus Local Cooperation Project (Life and Health Field) (25XHYD-12).
Author contributions
Study design: HFY, KZ, MEZ, ZYL, JJ, TT and SY. Performed the research and collected data: HFY, KZ, YFC, HTH, ZHL, HZ, CWF, NJH, RLW, XFS and JMS. Analyzed the data: HFY, KZ, YFC, ZHL, HZ, JMS, ZYL, JJ, TT and SY. Manuscript drafting: HFY, KZ, MEZ, YFC, ZHL and HZ. Provided discussion, critical feedback and manuscript editing: All authors.
Funding
Funding sources can be found in the acknowledgements section.
Data availabilty
The spatial transcriptomics datasets used in this study were obtained from the publicly accessible 10× Genomics portal: https://www0.10xgenomics.com/products/visium-hd-spatial-gene-expression/dataset-human-crc.
Declarations
Ethics approval and consent to participate
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Research Ethics Committees of Guangdong Provincial People’s Hospital (KY2023-046–01), with the need for informed consent waived for this retrospective study.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Huifen Ye, Ke Zhao, Yanfen Cui, Zhenhui Li, Huan Zhang and Min-Er Zhong contributed equally to this work.
Contributor Information
Zaiyi Liu, Email: liuzaiyi@gdph.org.cn.
Jitendra Jonnagaddala, Email: Jitendra.Jonnagaddala@unsw.edu.au.
Tong Tong, Email: t983352@126.com.
Su Yao, Email: yaosu@gdph.org.cn.
References
- 1.Chen Y, Liang Z, Lai M. Targeting the devil: strategies against cancer-associated fibroblasts in colorectal cancer. Transl Res. 2024 Aug 1;270:81–93. [DOI] [PubMed] [Google Scholar]
- 2.Wang W, Kandimalla R, Huang H, Zhu L, Li Y, Gao F, et al. Molecular subtyping of colorectal cancer: recent progress, new challenges and emerging opportunities. Semin Cancer Biol. 2019 Apr;55:37–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Hu JL, Wang W, Lan XL, Zeng ZC, Liang YS, Yan YR, et al. Cafs secreted exosomes promote metastasis and chemotherapy resistance by enhancing cell stemness and epithelial-mesenchymal transition in colorectal cancer. Mol Cancer. 2019 May 7;18:91. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.de Visser KE, Joyce JA. The evolving tumor microenvironment: from cancer initiation to metastatic outgrowth. Cancer Cell. 2023 Mar 13;41(3):374–403. [DOI] [PubMed] [Google Scholar]
- 5.Cheng B, Yu Q, Wang W. Intimate communications within the tumor microenvironment: stromal factors function as an orchestra. J Biomed Sci. 2023 Jan 4;30(1):1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Polack M, Smit MA, van Pelt GW, Roodvoets AGH, Meershoek-Klein Kranenbarg E, Putter H, et al. Results from the UNITED study: a multicenter study validating the prognostic effect of the tumor-stroma ratio in colon cancer. ESMO Open. 2024 Apr;9(4):102988. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Huijbers A, Tollenaar RAEM, van Pelt GW, Zeestraten ECM, Dutton S, McConkey CC, et al. The proportion of tumor-stroma as a strong prognosticator for stage II and III colon cancer patients: validation in the VICTOR trial. Ann Oncol. 2013 Jan 1;24(1):179–85. [DOI] [PubMed] [Google Scholar]
- 8.Lv Z, Cai X, Weng X, Xiao H, Du C, Cheng J, et al. Tumor-stroma ratio is a prognostic factor for survival in hepatocellular carcinoma patients after liver resection or transplantation. Surgery. 2015 Jul;158(1):142–50. [DOI] [PubMed] [Google Scholar]
- 9.Roeke T, Sobral-Leite M, Dekker TJA, Wesseling J, Smit VTHBM, Tollenaar RAEM, et al. The prognostic value of the tumour-stroma ratio in primary operable invasive cancer of the breast: a validation study. Breast Cancer Res Treat. 2017 Nov;166(2):435–45. [DOI] [PubMed] [Google Scholar]
- 10.Zhang T, Xu J, Shen H, Dong W, Ni Y, Du J. Tumor-stroma ratio is an independent predictor for survival in NSCLC. Int J Clin Exp Pathol. 2015 Sep 1;8(9):11348–55. [PMC free article] [PubMed] [Google Scholar]
- 11.Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A. Artificial intelligence in digital pathology - new tools for diagnosis and precision oncology. Nat Rev Clin Oncol. 2019 Nov;16(11):703–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Zhao Y, Ye H, Yang J, Yao S, Lv M, Chen Z, et al. AutoLDP: an accurate and efficient artificial intelligence-based tool for automatic labeling of digital pathology. EngMedicine. 2025 Mar;2(1):100060. [Google Scholar]
- 13.Zhao K, Li Z, Yao S, Wang Y, Wu X, Xu Z, et al. Artificial intelligence quantified tumour-stroma ratio is an independent predictor for overall survival in resectable colorectal cancer. EBioMedicine. 2020 Nov;61:103054. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Cheng N, Wang B, Xu J, Xue L, Ying J. Tumor stroma ratio, tumor stroma maturity, tumor-infiltrating immune cells in relation to prognosis, and neoadjuvant therapy response in esophagogastric junction adenocarcinoma. Virchows Arch. 2025 Feb;486(2):257–66. 10.1007/s00428-024-03755-2 [DOI] [PubMed]
- 15.Yan D, Ju X, Luo B, Guan F, He H, Yan H, et al. Tumour stroma ratio is a potential predictor for 5-year disease-free survival in breast cancer. BMC Cancer. 2022 Oct 21;22(1):1082. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Smit M, Van Pelt G, Roodvoets A, Meershoek-Klein Kranenbarg E, Putter H, Tollenaar R, et al. Uniform noting for international application of the tumor-stroma ratio as an easy diagnostic tool: protocol for a multicenter prospective Cohort study. JMIR Res Protoc. 2019 Jun 14;8(6):e13464. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Xu Y, Yang S, Zhu Y, Yao S, Wu L, Zhang S, et al. Annotation-free whole-slide image analysis method to assess immune infiltration in colorectal cancer. JCO Precis Oncol. 2025 May;9:e2400791. [DOI] [PubMed] [Google Scholar]
- 18.Wang X, Jiang Y, Yang S, Wang F, Zhang X, Wang W, et al. Foundation model for predicting prognosis and adjuvant therapy benefit from digital pathology in GI Cancers. J Clin Oncol. 2025 Nov 10;43(32):3468–81. 10.1200/JCO-24-01501 [DOI] [PubMed]
- 19.Jonnagaddala J, Croucher JL, Jue TR, Meagher NS, Caruso L, Ward R, et al. Integration and analysis of heterogeneous colorectal cancer data for translational Research. Stud Health Technol Inf. 2016;225:387–91. [PubMed]
- 20.Edge SB, Compton CC. The American joint committee on cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM. Ann Surg Oncol. 2010 Jun;17(6):1471–74. [DOI] [PubMed] [Google Scholar]
- 21.Ye H, Ye Y, Wang Y, Tong T, Yao S, Xu Y, et al. Automated assessment of necrosis tumor ratio in colorectal cancer using an artificial intelligence-based digital pathology analysis. Med Adv. 2023 Mar;1(1):30–43. [Google Scholar]
- 22.van Pelt GW, Kjær-Frifeldt, van Krieken JHJM, Al Dieri R, Morreau H, Tollenaar RAEM, et al. Scoring the tumor-stroma ratio in colon cancer: procedure and recommendations. Virchows Arch. 2018 Oct;473(4):405–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Weigert M, Schmidt U. Nuclei instance segmentation and classification in histopathology images with Stardist. Proc IEEE Int Symp Biomed Imaging Challenges (ISBIC). 2022;1–4.
- 24.Zunder SM, Gelderblom H, Tollenaar RA, Mesker WE. The significance of stromal collagen organization in cancer tissue: an in-depth discussion of literature. Crit Rev oncol/Hematol. 2020 Jul;151:102907. [DOI] [PubMed] [Google Scholar]
- 25.Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011 Mar 4;144(5):646–74. [DOI] [PubMed] [Google Scholar]
- 26.Conti J, Thomas G. The role of tumour stroma in colorectal cancer invasion and metastasis. Cancers (Basel). 2011 Apr 26;3(2):2160–68. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Bakir B, Chiarella AM, Pitarresi JR, Rustgi AK. EMT, MET plasticity, and tumor metastasis. Trends Cell Biol. 2020 Oct;30(10):764–76. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Provenzano PP, Cuevas C, Chang AE, Goel VK, Von Hoff DD, Hingorani SR. Enzymatic targeting of the stroma ablates physical barriers to treatment of pancreatic ductal adenocarcinoma. Cancer Cell. 2012 Mar 20;21(3):418–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Hingorani SR, Zheng L, Bullock AJ, Seery TE, Harris WP, Sigal DS, et al. HALO 202: randomized phase II study of PEGPH20 plus Nab-Paclitaxel/Gemcitabine versus Nab-Paclitaxel/Gemcitabine in patients with untreated, metastatic pancreatic ductal adenocarcinoma. J Clin Oncol. 2018 Feb 1;36(4):359–66. [DOI] [PubMed] [Google Scholar]
- 30.Sahai E, Astsaturov I, Cukierman E, DeNardo DG, Egeblad M, Evans RM, et al. A framework for advancing our understanding of cancer-associated fibroblasts. Nat Rev Cancer. 2020 Mar;20(3):174–86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Kalluri R. The biology and function of fibroblasts in cancer. Nat Rev Cancer. 2016 Aug 23;16(9):582–98. [DOI] [PubMed] [Google Scholar]
- 32.Jain RK. Normalizing tumor vasculature with anti-angiogenic therapy: a new paradigm for combination therapy. Nat Med. 2001 Sep;7(9):987–89. [DOI] [PubMed] [Google Scholar]
- 33.Olive KP, Jacobetz MA, Davidson CJ, Gopinathan A, McIntyre D, Honess D, et al. Inhibition of Hedgehog signaling enhances delivery of chemotherapy in a mouse model of pancreatic cancer. Science. 2009 Jun 12;324(5933):1457–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Stylianopoulos T, Jain RK. Combining two strategies to improve perfusion and drug delivery in solid tumors. Proc Natl Acad Sci USA. 2013 Nov 12;110(46):18632–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Firmbach D, Benz M, Kuritcyn P, Bruns V, Lang-Schwarz C, Stuebs FA, et al. Tumor-stroma ratio in colorectal cancer-comparison between human estimation and automated assessment. Cancers. 2023 May 9;15(10):2675. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Sui X, Ji J, Zhang H. Ctcs detection methods in vivo and in vitro and their application in tumor immunotherapy. J Surg Oncol. 2025 Jul;132(1):80–87. [DOI] [PubMed] [Google Scholar]
- 37.Xie J, Zhang P, Liu Y, Wu D, Ou X, Wang M, et al. USP5-Mediated PD-L1 deubiquitination regulates immunotherapy efficacy in melanoma. J Transl Med. 2025 Jul 10;23(1):778. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Xie J, Ma C, Zhao S, Wu D, Zhang P, Tang Q, et al. Deubiquitination by USP7 stabilizes JunD and Activates AIFM2 (FSP1) to inhibit ferroptosis in melanoma. J Invest Dermatol. 2025 Oct;145(10):2562–75.e5. [DOI] [PubMed] [Google Scholar]
- 39.Salmon H, Franciszkiewicz K, Damotte D, Dieu-Nosjean MC, Validire P, Trautmann A, et al. Matrix architecture defines the preferential localization and migration of T cells into the stroma of human lung tumors. J Clin Invest. 2012 Mar;122(3):899–910. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Chen E, Zeng Z, Zhou W. The key role of matrix stiffness in colorectal cancer immunotherapy: mechanisms and therapeutic strategies. Biochim Biophys Acta Rev Cancer. 2024 Nov;1879(6):189198. [DOI] [PubMed] [Google Scholar]
- 41.Piersma B, Hayward MK, Weaver VM. Fibrosis and cancer: a strained relationship. Biochim et Biophys Acta (BBA) - Rev Cancer. 2020 Apr;1873(2):188356. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Higashino N, Koma YI, Hosono M, Takase N, Okamoto M, Kodaira H, et al. Fibroblast activation protein-positive fibroblasts promote tumor progression through secretion of CCL2 and interleukin-6 in esophageal squamous cell carcinoma. Lab Invest. 2019 Jun;99(6):777–92. [DOI] [PubMed] [Google Scholar]
- 43.Chen W, Jin W, Hardegen N, Lei KJ, Li L, Marinos N, et al. Conversion of peripheral CD4+CD25- naive T cells to CD4+CD25+ regulatory T cells by TGF-beta induction of transcription factor Foxp3. J Exp Med. 2003 Dec 15;198(12):1875–86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Liu Y, Sinjab A, Min J, Han G, Paradiso F, Zhang Y, et al. Conserved spatial subtypes and cellular neighborhoods of cancer-associated fibroblasts revealed by single-cell spatial multi-omics. Cancer Cell. 2025 May 12;43(5):905–24.e6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Nazareth MR, Broderick L, Simpson-Abelson MR, Kelleher RJ, Yokota SJ, Bankert RB. Characterization of human lung tumor-associated fibroblasts and their ability to modulate the activation of tumor-associated T cells. J Immunol. 2007 May 1;178(9):5552–62. [DOI] [PubMed] [Google Scholar]
- 46.Feig C, Jones JO, Kraman M, Wells RJB, Deonarine A, Chan DS, et al. Targeting CXCL12 from FAP-expressing carcinoma-associated fibroblasts synergizes with anti-PD-L1 immunotherapy in pancreatic cancer. Proc Natl Acad Sci USA. 2013 Dec 10;110(50):20212–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The spatial transcriptomics datasets used in this study were obtained from the publicly accessible 10× Genomics portal: https://www0.10xgenomics.com/products/visium-hd-spatial-gene-expression/dataset-human-crc.






