Summary
Single-cell RNA sequencing (scRNA-seq) studies have uncovered distinct cancer-associated fibroblast (CAF) populations. While these populations provide a useful biological framework, no study has conclusively defined CAF subtypes with clinical significance. We define restraining (rest) and promoting (pro) CAFs in patient samples that are both prognostic and predictive of therapy response in multiple tumor types. We uncover distinct clinical and spatial interactions between pro- and restCAF subtypes and basal-like and classical tumor subtypes that support tumor-stroma crosstalk. Finally, we find striking differences in the immune contexture of pro- and restCAF tumors where restCAF-dominant tumors are more responsive to immune checkpoint inhibition and proCAF-dominant tumors are more responsive to myeloid inhibition in clinical trials. This work defines CAF subtypes that are clinically robust, prognostic, and predictive of immunotherapy response and provides a single-sample classifier, determination of pro- and restCAF subtypes (DeCAF), which is clinically actionable.
Keywords: pancreatic cancer, PDAC, cancer-associated fibroblast, CAF, CAF subtype, tumor microenvironment, spatial transcriptomics, classifier, TME, tumor-stroma interaction
Highlights
• DeCAF classifies promoting and restraining CAF subtypes in patient tumors
• DeCAF subtypes are prognostic and predictive of immunotherapy response
• DeCAF and tumor subtypes have multidimensional relationships with prognostic impact
• Tumor and CAF subtypes show specific interactions in patient tumors
Peng et al. show that tumor-promoting and tumor-restraining fibroblast subtypes, defined by the single-sample classifier DeCAF, shape cancer progression and immunotherapy response across multiple tumor types. DeCAF subtypes reveal multidimensional interactions with tumor subtypes and immune cells, providing a clinically meaningful framework for predicting outcomes and guiding treatment strategies.
Introduction
It is widely recognized that the pancreatic ductal adenocarcinoma (PDAC) tumor microenvironment (TME) plays an important role with both tumor-restraining and tumor-promoting properties.1,2,3,4 PDAC is characterized by an extremely dense desmoplasia, represented as a complex mixture of extracellular matrix (ECM), blood vessels, immune cells, as well as cancer-associated fibroblasts (CAFs).5,6,7,8,9,10 CAFs are key regulators in the TME. Clinical trials attempting to target the PDAC TME have been disappointing.11,12,13,14 These results may be partially explained by the loss of tumor restraint with genetic depletion of CAFs in genetically engineered mouse models.2,3,15
We previously reported two PDAC stroma groups, “activated” and “normal,” where patients with “activated stroma” had decreased survival relative to normal stroma.16 Maurer et al. used microdissected patient samples to derive two TME groups called “ECM-rich” and “immune-rich” stroma, with ECM-rich showing shorter survival.17 With recent advances in single-cell RNA sequencing (scRNA-seq) technology, studies on the PDAC stroma have rapidly shifted to the study of individual CAF and immune cell populations. In a landmark study, Elyada et al. identified “myofibroblastic CAF (myCAF)” and “inflammatory CAF (iCAF)” cell clusters described initially in preclinical murine studies and validated in a human scRNA-seq dataset (Elyada-sc) that enriched for CAF cells using fluorescence-activated cell sorting (FACS).18,19 More recently, additional CAF subpopulations20,21,22,23 (e.g., complement-secreting CAF [csCAF], CAF with a highly activated metabolic state [meCAF], MAPKhigh CAF [mapCAF], etc.) and CAF with certain markers24,25,26 (e.g., LRRC15+ and TGFB1+) have been described, as well as CAF with differential histology features27,28 and gene programs.29 Despite considerable interest in targeting iCAFs due to their tumor-promoting behavior in preclinical studies,30,31,32,33,34 the clinical translation of CAF subpopulations remains unclear.
Through integration of scRNA-seq, bulk RNA sequencing (RNA-seq), spatial transcriptomics (ST), pathology, and clinical data, we uncover CAF-intrinsic genes that redefine the understanding of CAF biology and their clinical importance. We introduce a clinically actionable classification of promoting (proCAF) and restraining CAF (restCAF) subtypes with multidimensional associations with established tumor-intrinsic subtypes and show functional and immune contextual differences that underpin their prognostic significance and their predictive value in immunotherapy clinical trials.
Results
Development and external validation of DeCAF
To overcome the constraints of insufficient clinical data in scRNA-seq datasets, we first set out to identify exemplar or marker genes from scRNA-seq with unique expression in CAF subpopulations that would be translatable to bulk transcriptomics data. We previously demonstrated that the SCISSORS35 algorithm captured cell-subpopulation-specific marker genes derived from scRNA-seq and that these scRNA-seq marker genes were able to recapitulate the clinically validated basal-like and classical PDAC tumor subtypes with near-perfect concordance in bulk RNA-seq. SCISSORS identifies rare cell subpopulations as low as 0.092% and accurately identifies marker genes of high specificity. Thus, we hypothesized that CAF cell-subpopulation-specific marker genes derived by SCISSORS (Table S1) may similarly be utilized to cluster bulk RNA-seq to determine clinical significance in the bulk setting where samples with clinical outcome information are abundant, and enable the training of a single-sample classifier (SSC) to predict CAF subtypes in patient samples.
First, we collated 11 publicly available bulk RNA-seq and microarray datasets containing 1,432 primary PDAC patients and clustered them each using SCISSORS-derived CAF genes by consensus clustering (CC), which separated patient samples into two clusters in each of the datasets (STAR Methods, Tables S2 and S3). Meta-analysis revealed one patient cluster with a significantly shorter median overall survival (mOS) of 20.33 months that we call tumor proCAF, while the other patient cluster had a longer mOS of 30.19 months that we call tumor restCAF (p = 0.0036, hazard ratio [HR] = 1.40 [95% confidence interval (CI) 1.317–1.725], Figure S1A; Table S2). The clinical relationships between clusters were largely consistent across all studies (Table S2).
However, clustering techniques have many real-world limitations that prohibit their application to clinical settings. For example, clustering cannot be applied to individual patients, and attempting to cluster new samples to prior data may change existing subtype assignments. As an unsupervised learning technique, clustering can also be inherently deceptive when applied to samples of limited numbers and diversity.
Thus, we set out to develop a robust SSC, determination of pro- and restCAF subtypes (DeCAF) (Figure 1A; Table S1), to predict proCAF and restCAF subtypes in individual patients through a supervised learning approach. This model was trained using expression data and corresponding CC-based training labels derived from our training group of samples, consisting of four large bulk gene expression datasets (STAR Methods, training datasets: CPTAC,36 Dijk,37 Moffitt_GEO_array,16 and TCGA_PAAD38; Figure S1B). A key element of our method is the use of CAF-specific marker genes derived from the SCISSORS algorithm, which avoids potential confounding of their expression by tumor and other cells such as immune cells. DeCAF uses rank-derived predictors through the k Top Scoring Pairs (kTSP) approach, employed previously for the PurIST PDAC tumor subtype classifier.39 This approach avoids the direct use of raw expression values for prediction, which reduces the dependence on and impact of between-sample and between-study normalization and also simplifies data integration over different studies for model training, as well as during prediction on new samples, since the model’s predictions are on a rank-based scale.39,40,41,42
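The rank-based logic of a top-scoring-pair classifier can be illustrated with a minimal Python sketch. This is not the DeCAF implementation: only the ITGA11/FBLN5 pairing comes from this study, while the second gene pair (GENE_A/GENE_B), the equal-weight voting rule, and all function names are hypothetical choices made for illustration. The fraction of pairs voting proCAF stands in for the model's proCAF probability.

```python
from typing import Dict, List, Tuple

# Each pair is (proCAF gene, restCAF gene). ITGA11/FBLN5 is a pair named in
# this study; GENE_A/GENE_B is a hypothetical placeholder pair.
PAIRS: List[Tuple[str, str]] = [("ITGA11", "FBLN5"), ("GENE_A", "GENE_B")]


def tsp_votes(expr: Dict[str, float], pairs: List[Tuple[str, str]]) -> List[int]:
    """Return 1 for each pair in which the proCAF gene outranks its partner."""
    return [1 if expr[pro] > expr[rest] else 0 for pro, rest in pairs]


def pro_caf_score(expr: Dict[str, float], pairs: List[Tuple[str, str]] = PAIRS) -> float:
    """Fraction of pairs voting proCAF (an assumed stand-in for a fitted model)."""
    votes = tsp_votes(expr, pairs)
    return sum(votes) / len(votes)


def call_subtype(expr: Dict[str, float], threshold: float = 0.5) -> str:
    """Convert the score into a subtype call at the given threshold."""
    return "proCAF" if pro_caf_score(expr) >= threshold else "restCAF"


# Toy expression values for one sample (hypothetical numbers).
sample = {"ITGA11": 8.1, "FBLN5": 3.2, "GENE_A": 5.0, "GENE_B": 4.0}
# Any monotone rescaling of the sample leaves the within-pair ordering intact.
rescaled = {gene: 1000 * value + 7 for gene, value in sample.items()}

print(call_subtype(sample), call_subtype(rescaled))
```

Because only the within-sample ordering of each gene pair is used, the call is unchanged under monotone rescaling of a sample, which is the property that makes this family of classifiers insensitive to study-level normalization.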
Figure 1.
Development and external validation of DeCAF
(A) Framework of DeCAF, using 9 pairs of TSP genes to derive the proCAF probability, which is then converted to the final proCAF vs. restCAF call for a single sample.
(B) ROC curve for the evaluation of DeCAF calls against SCISSORS-CC labels using SCISSORS CAF genes.
(C) Inter-study variability for the evaluation of DeCAF calls.
(D) Summary statistics of the evaluation of DeCAF subtype calls in each validation dataset. N indicates the total number of samples with SCISSORS-CC calls.
(E) Heatmap showing SCISSORS-CC calls and DeCAF subtype calls in the pooled samples.
(F and I) Clustering of CAF cells by applying SCISSORS on the Elyada-sc and UNC-sc scRNA-seq datasets, respectively.
(G and J) Gene set enrichment VAM scores for the DeCAF, SCISSORS, and Elyada CAF genes in each cell shown on the uniform manifold approximation and projection (UMAP) of all cells in Elyada-sc and UNC-sc. Elyada CAF genes showing co-expression in non-CAF cells circled in red.
(H and K) Gene set enrichment VAM scores for the DeCAF, SCISSORS, and Elyada CAF genes in each cell shown on the UMAP of CAF-related cells in Elyada-sc and UNC-sc. Elyada CAF genes showing co-expression in non-CAF cells circled in red.
To assess the quality of our prediction model, we first evaluated the cross-validation error of the final model in our training group samples. We found that the internal leave-one-out cross-validation error for DeCAF in the training group was low (4.0%), using the CC-derived subtypes from the SCISSORS CAF gene list as the “gold standard” to assess the ability of the model to mirror the original subtype assignments in the training group. To evaluate the overall classification performance of DeCAF, we compared the DeCAF subtype calls to the CC-subtype calls in the validation group of 7 datasets17,27,39,43,44,45 (Table S3). First, we applied a nonparametric meta-analysis approach to obtain a consensus receiver operating characteristic (ROC) curve evaluating DeCAF’s ability to discriminate between CAF subtypes, based on the individual ROC curves from each validation dataset. We found that the overall consensus area under the curve (AUC) is high, with a value of 0.961 (Figure 1B). The estimated interstudy variability of these ROC curves is very low at our standard classification threshold of t = 0.5 or greater on the predicted proCAF probability scale (Figure 1C). Furthermore, sensitivities and specificities were often high at this threshold, and the overall AUC values were similarly strong (>0.8) in each validation dataset (Figure 1D).
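As an illustration of the per-dataset metrics reported here, the following Python sketch computes an AUC and the sensitivity and specificity at the t = 0.5 threshold for a single dataset. The probabilities and labels are hypothetical toy values, the function names are illustrative, and the sketch does not reproduce the nonparametric meta-analysis used to build the consensus ROC curve.

```python
from typing import List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: List[float], labels: List[int],
              threshold: float = 0.5) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called proCAF."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted proCAF probabilities and reference labels (1 = proCAF).
probs = [0.95, 0.80, 0.65, 0.40, 0.55, 0.30, 0.20, 0.05]
truth = [1, 1, 1, 1, 0, 0, 0, 0]

print(auc(probs, truth))        # 0.9375
print(sens_spec(probs, truth))  # (0.75, 0.75)
```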
Across validation datasets, we find that samples strongly segregate by CC subtype when sorted by their predicted proCAF probability and show little segregation by study membership, despite diverse origins (Figure 1E). This suggests that our methodology avoids potential study-level effects. The relative expression of classifier genes within each classifier Top Scoring Pair (TSP) (paired rows) strongly discriminates between subtypes in each sample, forming the basis of our robust TSP-oriented approach for subtype prediction (Figure 1E). We found that the proCAF probability is positively correlated with the proportion of proCAF cells among all the CAF cells (rho = 0.92, p < 0.001, Figure S1C), supporting that proCAF probability is capturing CAF-intrinsic signals. Our results support that DeCAF is robust and replicable across multiple diverse validation datasets, setting the basis for our comprehensive investigation of the biological, pathological, and clinical relevance of proCAF vs. restCAF in patient samples.
Cell specificity of DeCAF genes in scRNA-seq
To understand the cell-type specificity of DeCAF genes, we used the scRNA-seq dataset generated by Elyada and colleagues (Elyada-sc) where they described two dominant PDAC CAF subtypes: myCAF and iCAF.18,19 We found that DeCAF- and SCISSORS-derived genes are uniquely expressed within only CAF cells and are not expressed by other cell types (e.g., epithelial and immune cells) in contrast to my/iCAF genes, demonstrating the high specificity of our CAF marker genes (Figures 1F and 1G). Furthermore, within CAF-related clusters, unlike my/iCAF genes, DeCAF proCAF and restCAF genes are not expressed in the perivascular cluster (Figure 1H). DeCAF- and SCISSORS-derived genes showed limited overlap with CAF marker genes identified in other scRNA-seq-based studies16,17,18,20,21,22,23,24,25,27,28,29 (Figure S1D).
Next, we evaluated an independent scRNA-seq dataset of six primary PDAC samples (UNC-sc, Table S4) using SCISSORS de novo. We again identified fibroblast cells, along with other known cell types in PDAC samples, including epithelial cells, endothelial cells, as well as different types of immune cells in UNC-sc (Figure 1I). Using gene set enrichment scores derived by variance-adjusted Mahalanobis (VAM),46 we found that the DeCAF genes are more uniquely expressed within fibroblast cells, compared to Elyada iCAF genes, which are expressed in the macrophage cluster, and myCAF genes, which are expressed in the endothelial cluster (Figure 1J). We then reclustered the CAF-related clusters and show that, in contrast to my/iCAF genes, proCAF and restCAF genes are not expressed in the perivascular cluster (Figure 1K). Therefore, DeCAF genes are unique and specific markers for human PDAC CAF subpopulations.
Prognostic impact of DeCAF subtypes
To benchmark the prognostic impact of DeCAF against other comparable methods, we performed a meta-analysis of the pooled datasets with overall survival (OS) data available (Table S2). We found that the patients with proCAF subtype tumors (mOS 17.70 months) have significantly shorter OS than patients with restCAF subtype tumors (mOS 29.04 months) (p < 0.0001, stratified HR = 1.634 [95% CI 1.375–1.943], Figure 2A). Next, we used published CAF subtyping gene signatures from Elyada et al.,18 Moffitt et al.,16 and Maurer et al.17 to derive subtype calls in each of the 11 datasets for comparison, similarly using CC (Table S3). In contrast to DeCAF, we found that the Elyada my/iCAF gene sets were not prognostic (mOS: 21.32 vs. 25.20 months, p = 0.339, HR = 1.203 [95% CI 0.945–1.532], Figure S1A). The Moffitt stroma schema demonstrated shorter survival for patients with tumors with activated stroma relative to normal stroma but did not reach significance (mOS: 21.45 vs. 30.0 months, p = 0.082, HR = 1.279 [95% CI 0.991–1.651], Figure S1A). We found that Maurer ECM-rich and immune-rich bulk RNA-seq gene signatures are associated with differences in survival (mOS: 20.53 vs. 30.03 months, p = 0.005, HR = 1.33 [95% CI 1.047–1.69], Figure S1A).
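A quick arithmetic check can help when reading the hazard ratios above. Assuming the intervals are Wald intervals from a Cox model (an assumption, since the interval type is not stated here), they are symmetric on the log scale, so each reported HR should sit close to the geometric mean of its confidence bounds. The Python sketch below applies that check to three of the values reported in this paragraph; the function name is illustrative.

```python
import math


def hr_from_ci(lower: float, upper: float) -> float:
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


# (reported HR, lower 95% bound, upper 95% bound) as given in the text.
reported = {
    "DeCAF": (1.634, 1.375, 1.943),
    "Moffitt stroma": (1.279, 0.991, 1.651),
    "Maurer": (1.33, 1.047, 1.69),
}

for schema, (hr, lower, upper) in reported.items():
    implied = hr_from_ci(lower, upper)
    print(f"{schema}: reported {hr}, implied {implied:.3f}")
```

Small differences in the last digit are expected because the published bounds are rounded.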
Figure 2.
DeCAF independently predicting overall survival
(A and F) Kaplan-Meier curves showing patient overall survival (OS) of DeCAF subtypes in pooled public datasets (p < 0.0001, stratified log-rank test) and UNC-bulk, an independent primary PDAC dataset (p = 0.00038, log-rank test).
(B) Multivariable stratified Cox-PH model comparing different stroma schemas.
(C) Sankey diagram comparing samples called as Elyada myCAF/iCAF subtypes vs. DeCAF proCAF/restCAF subtypes in pooled public datasets.
(D and G) Kaplan-Meier curves showing patient OS stratified by combined DeCAF and PurIST subtypes in the pooled public (p < 0.0001, log-rank test, stratified by dataset) and UNC-bulk (p < 0.0001, log-rank test) datasets.
(E and H) Multivariable Cox-PH model in the pooled public (stratified by dataset) and UNC-bulk datasets.
To compare the different schemas, we used the Bayesian information criterion (BIC) to evaluate model fit relative to the complexity of the model, where the model with the lowest BIC in a series of competing candidate models is preferred in statistical applications and is agnostic to the magnitude of the difference. DeCAF had the lowest BIC compared to other CC-based schemas, including SCISSORS, Maurer, Elyada, and Moffitt stroma (Figure S1A). Furthermore, using a multivariable stratified Cox proportional hazard (Cox-PH) model, DeCAF subtypes were prognostic when adjusting for other subtyping schemas (p < 0.0001, adj. HR = 1.63 [95% CI 1.345–1.978], Figure 2B). These results support the independent prognostic effect and relative clinical utility of DeCAF over alternative schemas.
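For readers unfamiliar with the criterion, the Python sketch below shows how a BIC comparison of this kind works in principle. The log-likelihood values, the schema names, and the choice of sample size are hypothetical; for Cox models the effective sample size is often taken as the number of events, and the convention used in this study is not assumed here.

```python
import math
from typing import Dict


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical maximized log (partial) likelihoods for two competing
# one-covariate models fitted to the same samples.
log_likelihoods: Dict[str, float] = {"schema_A": -1502.3, "schema_B": -1498.7}

scores = {name: bic(ll, n_params=1, n_obs=1000) for name, ll in log_likelihoods.items()}
best = min(scores, key=scores.get)  # the lowest BIC is preferred

print(scores, best)
```

With equal numbers of parameters and samples, the penalty term is identical across models, so the ranking is driven entirely by the fitted likelihood.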
To understand the differences between the CAF schemas, we compared my/iCAF subtypes18 vs. pro/restCAF subtype samples. We found that 62.6% (n = 655) of myCAF tumors were proCAF and 14.8% (n = 41) iCAF tumors were proCAF (κ = 0.326, Figure 2C), suggesting that the prognostic strength of DeCAF stems from the CAF selectivity of DeCAF genes and more precise identification of patients with proCAF tumors.
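The agreement statistic can be approximately reproduced from the reported percentages. The Python sketch below computes Cohen's κ for a 2×2 table whose counts are back-calculated under the assumption that n denotes the number of proCAF tumors within each Elyada subtype (655 of about 1,046 myCAF tumors and 41 of about 277 iCAF tumors). These reconstructed totals are an inference from the text, not values taken from the underlying data.

```python
from typing import List


def cohens_kappa(table: List[List[int]]) -> float:
    """Cohen's kappa for a square table of counts (rows: schema 1, columns: schema 2)."""
    size = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(size)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Rows: myCAF, iCAF. Columns: proCAF, restCAF.
# Counts are back-calculated from the reported percentages (an assumption).
reconstructed = [[655, 391], [41, 236]]

print(round(cohens_kappa(reconstructed), 3))
```

The result lands within rounding distance of the reported κ = 0.326, which corresponds to only fair agreement between the two schemas.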
It is well validated that patients with basal-like and classical PDAC tumors have significantly different OS.16,38,39,47,48,49 We derived the basal-like and classical tumor-intrinsic subtypes as defined by PurIST39 and analyzed the OS in the context of the combinations of the PurIST and DeCAF subtypes in each patient. Patients with PurIST basal-like and DeCAF proCAF subtype tumors had the shortest OS, while patients with PurIST classical and DeCAF restCAF subtype tumors had the longest OS (11.01 months vs. 30.43 months, p < 0.001, Figure 2D). We then performed a multivariable stratified Cox-PH model for the pooled public datasets including DeCAF and PurIST subtypes as variables and found that both PurIST tumor subtype and the DeCAF subtype were independently associated with survival (p < 0.001, Figure 2E). Next, we applied DeCAF to another independent patient cohort of primary PDAC at UNC where clinical and pathology variables were available (UNC-bulk, n = 129, Table S4) and found that DeCAF subtypes are again associated with OS (HR = 2.255 [95% CI 1.423–3.574], p < 0.001, Figure 2F). We saw similar additive effects of DeCAF and PurIST subtypes and their relationship to OS (p < 0.001, Figure 2G). These findings agree with our prior findings that patients with classical and restCAF subtype tumors after neoadjuvant chemotherapy had a significantly longer OS.50
In a univariable analysis, the restCAF subtype was associated with chronic pancreatitis on pathology (p = 0.008), although chronic pancreatitis had no association with OS (p = 0.300, Table S5). In a multivariable analysis of patients who underwent complete (R0) resections, DeCAF and PurIST subtypes remained independently prognostic when including the variables stage, differentiation, and lymphovascular invasion (LVI) (DeCAF p < 0.001 and PurIST p = 0.005, Figure 2H).
Interaction between tumor and CAF subtypes
Next, we investigated subtype-specific interactions between tumor and CAF subtypes. We found that 64.5% (167/259) of basal-like tumors had a proCAF subtype (Figures 3A and 3B). No difference was seen in CAF subtypes within classical tumors: 49.7% proCAF (529/1,065), suggesting an affinity between basal-like and proCAF subtypes (p = 2.166e−05, Figures 3A and 3B). Basal-like tumors also showed higher proCAF probability scores, suggesting an increase in proCAFness in the TME of basal-like tumors (Figure 3C). Next, we interrogated the cell-cell interactions of these subtype-associated cell clusters at the single-cell level. Using CellPhoneDB (v.2),51 we generated predictions on ligand-receptor interactions between each cell cluster in Elyada-sc (Table S6). We found that basal-like tumor and proCAF subtypes were clustered together, indicating close interactions (Figure 3D). We found that among all the tumor-CAF interactions, the most frequent interaction was predicted to be from proCAF to basal-like cells (n = 98, Figure 3E), suggesting potential tumor-stromal crosstalk. Differential analysis revealed specifically activated ligand-receptor pairs in the tumor-CAF interactions; IL-6 was predicted as a significant ligand in restCAF interactions with basal-like and classical 2 (Figure 3F). In addition, a11b1 (integrin alpha-11 [encoded by ITGA11] and beta-1) and COL11A1 were significantly involved in proCAF-basal-like interactions. Both ITGA11 and COL11A1 are proCAF TSPs in DeCAF.
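The enrichment of proCAF calls among basal-like tumors can be checked with a self-contained Fisher's exact test. The Python sketch below uses the counts stated in this paragraph (167 of 259 basal-like and 529 of 1,065 classical tumors called proCAF); the restCAF counts are obtained by subtraction. The implementation is a generic two-sided test written for illustration, and the function names are not taken from the study's code.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_pmf(x: int) -> float:
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    observed = log_pmf(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        lp = log_pmf(x)
        if lp <= observed + 1e-9:  # small tolerance for floating-point ties
            p_value += math.exp(lp)
    return min(1.0, p_value)


# Rows: basal-like, classical. Columns: proCAF, restCAF.
basal_pro, basal_rest = 167, 259 - 167
classical_pro, classical_rest = 529, 1065 - 529

p = fisher_exact_two_sided(basal_pro, basal_rest, classical_pro, classical_rest)
odds_ratio = (basal_pro * classical_rest) / (basal_rest * classical_pro)
print(f"p = {p:.3g}, odds ratio = {odds_ratio:.2f}")
```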
Figure 3.
Interaction of the tumor and CAF subtypes
(A and B) Proportion of patients in PurIST basal-like and classical tumor subtypes with DeCAF proCAF or restCAF subtypes (p = 2.166e−05, Fisher’s exact test).
(C) Comparison of proCAF probability between PurIST subtypes (p = 6e−09, Wilcoxon rank-sum test, n shown in B).
(D) Number of ligand-receptor interactions between cell types predicted by CellPhoneDB using the Elyada-sc dataset.
(E) Circos plot showing the number of directed ligand-receptor interactions between tumor and CAF cell subpopulations.
(F) Differential ligand-receptor gene pairs in the tumor and CAF cell interactions.
DeCAF subtypes are spatially and histologically distinct
Next, we performed immunofluorescence (IF) staining of a DeCAF TSP pair, ITGA11 (proCAF) and FBLN5 (restCAF), in 10 primary PDAC samples (Figures S2A and S2B). We confirmed co-staining with known CAF markers αSMA and FAP (Figure S2C). We found high ITGA11 intensity in the proCAF sample and high FBLN5 in the restCAF sample (Figures 4A, 4B, S2A, and S2B). We show that CD45 and CD11b staining is distinct from ITGA11 and FBLN5, supporting that ITGA11 and FBLN5 are not confounded by immune cell expression (Figures 4C, 4D, and S2D). In addition, we found that FBLN5 and ITGA11 staining were spatially distinct and characterized by different histopathology features27,28 (Figures 4E and S3A). The proCAF ITGA11 areas were more cellular, formed by reactive fibroblasts with larger nuclei, surrounded by immature collagen, giving the ECM a “myxoid” appearance (Figure 4E). The restCAF FBLN5-stained stroma was comparatively hypocellular formed of fibrocytes with small nuclei with a histologically “mature” appearance and abundant dense ECM formed of mature collagen that we call “fibrous” (Figure 4E). Areas with a mixture of the two features were called “fibromyxoid” (Figure 4E). We used Movat pentachrome staining52 to differentiate stroma components, where immature myxoid areas stained blue (Figures 4C, 4F, S3B, and S3C), mature “fibrous” areas stained yellow (Figures 4D and 4G), and the “fibromyxoid” areas stained as a mixture of yellow and blue (Figures 4H and S3C).
Figure 4.
Pathology differences in DeCAF subtypes
(A and B) IF staining of a representative proCAF and a restCAF sample showing ITGA11 and FBLN5 in stroma.
(C and D) H&E, IF, and Movat pentachrome staining of a representative proCAF and restCAF sample. ITGA11 and FBLN5 staining is distinct from immune cell staining. Movat pentachrome staining showed distinct pathology features for proCAF (blue) and restCAF (yellow).
(E) IF and H&E staining of a representative sample section with myxoid, fibromyxoid, and fibrous pathology features present on the same slide.
(F, G, and H) H&E and pentachrome staining showing additional areas of myxoid (blue), fibrous (yellow), and fibromyxoid (blue-yellow mixed) areas.
(I and J) Boxplots comparing proCAF probability in samples described as having a predominant myxoid vs. fibrous stroma (p = 0.007, Wilcoxon rank-sum test, n = 41 and 65, respectively) or having a myxoid, fibromyxoid, and fibrous stroma (p = 0.002, Kruskal-Wallis test, n = 25, 42, and 39, respectively).
(K and L) Kaplan-Meier curves showing patient OS with the different stroma histologies in the UNC-bulk dataset using a predominant classification of myxoid and fibrous stroma (p = 0.004, log-rank test), or including a fibromyxoid classification (p < 0.001, log-rank test).
(M) Sankey diagram comparing Grünwald subTME calls and DeCAF subtype calls in TCGA_PAAD samples.
To evaluate histology associations with DeCAF subtypes, a pathologist blinded to the subtype calls reviewed the stroma histology of 106 samples (Table S4). We found that the restCAF subtype tumors were associated with a fibrous (51/74, 68.9%) compared to a myxoid histology (23/74, 31.1%) (p = 0.018). Samples with myxoid-dominant histology showed significantly more proCAFness (i.e., higher proCAF probability) compared to fibrous histology samples (p = 0.0066, Figure 4I), while the intermediate fibromyxoid type showed an intermediate level of proCAFness (p = 0.0021, Figure 4J). Patients with myxoid features had shorter mOS of 18.4 months compared to 20.6 months in patients with fibrous stroma (p = 0.0043, Figure 4K), while the intermediate fibromyxoid type had the intermediate level of OS (p = 0.0038, Figure 4L). Prior subTME subtypes have been described by Grünwald et al. using The Cancer Genome Atlas (TCGA) PAAD data but were not prognostic.27 In TCGA-PAAD, 43.5% (40/92) of restCAF subtype tumors had a reactive/intermediate subTME compared to a deserted subTME (56.5% [52/92]) and 89.6% (60/67) of proCAF subtype tumors had a reactive/intermediate subTME, compared to a deserted subTME (10.4% [7/67], p = 9.582e−10, Figure 4M).
Spatial proximity of proCAF to cancer cells
To further examine the spatial localization of DeCAF subtypes, we performed ST on 7 samples (UNC-st) using the 10× Genomics Visium platform (Table S4). Each spot on the Visium ST samples was annotated by a pathologist, blinded to the molecular information. The annotations (e.g., adenocarcinoma, stroma, acinar, inflammatory, and blood vessel) were based on the dominant cell types observed by the pathologist and serve as the benchmark for our gene expression-based analysis (Figures 5A and S4). For each spot, we first used a deconvolution tool, DECODER, to estimate the weights of the seven major PDAC compartments that we previously described.47 We found that tissue areas that encompassed spots annotated as acinar by histology showed high DECODER acinar weights with clear segmentation between acinar and surrounding areas (Figure 5B). Similarly, DECODER immune weights accurately highlighted spots that were annotated as “inflammatory,” high DECODER activated stroma weights were associated with myxoid/fibromyxoid/fibrous stroma areas, and high DECODER tumor weights (sum of basal-like and classical indicating tumor purity) were associated with adenocarcinoma spots (Figures 5B, S5A, and S5B). The agreement of DECODER weights and pathology annotations supports the robustness of our ST analyses.
Figure 5.
ST reveals distinct localization of DeCAF subtypes
(A) Pathology annotation for each Visium spot of a representative section.
(B) Over-representation of DECODER compartment weights across Visium spots.
(C) proCAF probability of stroma spots derived by DeCAF.
(D) Basal-like probability of adenocarcinoma spots derived by PurIST.
(E) Annotation of adenocarcinoma spots and adjacent stroma layers.
(F) Boxplot showing the proCAF probability of the different stroma layers in a representative sample, ST5473. Kruskal-Wallis test. n = 444, 233, 118, and 310 spots in each group of layer1, layer2, layer3, and distant were analyzed.
(G and J) Co-localization of over-expressed proCAF genes with basal-like genes, as well as high basal-like probability in bulk-based basal-like samples.
(H, I, K, L, N, O, Q, R, and S) Radius plots showing DeCAF probability around pooled basal-like (H, K, and N) compared to classical spots (I, L, O, and Q) for representative samples (H, I, K, L, N, O, and Q) and pooled samples (R and S).
(M and P) Co-localization of over-expressed proCAF (M) and restCAF (P) genes with classical genes in bulk-based classical samples.
(T) Curves showing proCAF probability of spots by distance from pooled basal-like compared to classical spots.
We applied DeCAF and PurIST to each ST spot and derived the proCAF and basal-like probabilities (Figures 5C and 5D). We found that stroma spots adjacent to the adenocarcinoma spots showed higher proCAF probability (Figure 5C). Similarly, positive ITGA11 IF staining was observed adjacent to tumor cells (Figure S3D). For a systematic comparison, we annotated stroma spots adjacent to the adenocarcinoma spots as layers 1 to 3 moving outward from the adenocarcinoma, and the rest as “distant” (Figure 5E). Stroma spots on layer 1 showed the highest proCAF and lowest restCAF gene expression; distant stroma spots showed the lowest proCAF gene expression and probabilities (Figure 5F). These results are consistent with previous findings that myCAF and αSMA staining is closest to the tumor,18,19 as well as a recent study showing enrichment of myCAF in basal-like tumor-bed areas and juxtalesional regions.26
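The layer annotation described above can be sketched as a breadth-first search outward from the adenocarcinoma spots. The Python example below is a simplified illustration, not the procedure used in this study: it uses a toy 4-neighbour square lattice (Visium spots sit on a hexagonal lattice with 6 neighbours), it measures distance only through contiguous stroma spots, and the function names are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set, Tuple

Spot = Tuple[int, int]


def grid4(spot: Spot) -> Iterable[Spot]:
    """Toy 4-neighbour rule; a hexagonal lattice would need a 6-neighbour rule."""
    row, col = spot
    return [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]


def stroma_layers(tumor: Set[Spot], stroma: Set[Spot],
                  neighbors: Callable[[Spot], Iterable[Spot]] = grid4,
                  max_layer: int = 3) -> Dict[Spot, str]:
    """Label each stroma spot by its step distance from the nearest tumor spot."""
    steps: Dict[Spot, int] = {}
    queue = deque((spot, 0) for spot in tumor)
    while queue:
        spot, dist = queue.popleft()
        if dist == max_layer:
            continue  # anything further out is labelled "distant" below
        for nb in neighbors(spot):
            if nb in stroma and nb not in steps:
                steps[nb] = dist + 1
                queue.append((nb, dist + 1))
    return {s: f"layer{steps[s]}" if s in steps else "distant" for s in stroma}


# Toy slide: one tumor spot, a strip of stroma, and one isolated stroma spot.
tumor_spots = {(0, 0)}
stroma_spots = {(0, 1), (0, 2), (0, 3), (0, 4), (5, 5)}
layers = stroma_layers(tumor_spots, stroma_spots)
print(sorted(layers.items()))
```

Per-layer summaries such as the boxplot in Figure 5F would then follow by grouping each spot's proCAF probability by its layer label.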
Spatial co-localization of proCAF with basal-like subtypes
Next, we examined the spatial associations between the basal-like and proCAF subtypes using PurIST and DeCAF (Figures 5A, 5C, 5D, and S6; STAR Methods). We found a strong co-localization of the basal-like genes in the adenocarcinoma spots, and proCAF genes in the adjacent stroma spots (Figures 5G and 5J). A summary radius plot across all adenocarcinoma spots (ST5473: n = 411, ST5425: n = 306) in these two representative samples showed that basal-like spots (ST5473: n = 23, ST5425: n = 29) were associated with higher proCAF probabilities than classical spots (ST5473: n = 388, ST5425: n = 277) (Figures 5H, 5I, 5K, and 5L), in agreement with previous findings that basal-like-expressing cells show close proximity with myofibroblasts.53
While basal-like tumor spots were found to be associated with higher proCAFness, classical tumor spots did not show a preference for proCAF vs. restCAF proximity. For example, in ST5473 (bulk subtype: classical, proCAF), we observed classical spots to be associated with both proCAF and restCAF (Figures 5M and S6), although the few basal-like spots (n = 23) showed higher proCAF probability than the classical spots (n = 388) (Figures 5N and 5O). In ST11592, where the tumor spots are all classical (n = 39), we found an overall low proCAF probability (Figures 5P and 5Q). These findings recapitulate our bulk RNA-seq results where classical subtype tumors may be either pro- or restCAF subtype (Figure 3A). To perform a systematic evaluation across all samples, we pooled the basal-like (n = 259) and classical (n = 926) spots for all 7 samples. We found that the overall proCAF probability is higher in basal-like than classical, with a progressive decrease as distance increases from the tumor spots (Figures 5R, 5S, and 5T). Taken together, our ST analysis demonstrates a strong physical interaction between the DeCAF proCAF and basal-like tumor subtypes, supporting the significant proCAF-basal subtype associations in bulk RNA-seq cohorts.
DeCAF subtypes in mesothelioma, kidney, and bladder cancer
Similarities in CAFs across cancer types have been reported.54,55 To determine if DeCAF subtypes are seen in other cancer types, we evaluated TCGA datasets56,57 and found that DeCAF subtypes are prognostic in malignant pleural mesothelioma (MESO, p = 0.021, HR = 2.056 [95% CI 1.1–3.84]), kidney renal clear cell carcinoma (KIRC, p = 0.0011, HR = 2.138 [95% CI 1.342–3.407]), urothelial bladder carcinomas (bladder cancer [BLCA], p = 0.043, HR = 1.6 [95% CI 1–2.6]), invasive breast carcinoma (BRCA, p = 0.041, HR = 1.404 [95% CI 1.012–1.948]), and lung adenocarcinoma (LUAD, p = 0.025, HR = 1.394 [95% CI 1.041–1.867], Figures 6A–6C and S7A).
Figure 6.
DeCAF subtypes in other tumor types
(A–C) Kaplan-Meier curves showing OS in patients with proCAF vs. restCAF subtypes in TCGA MESO, KIRC, and BLCA datasets. Log-rank test.
(D) Representative H&E staining showing a proCAF subtype sample with myxoid stroma and a restCAF subtype sample with fibrous stroma in TCGA MESO.
(E) Proportion of patients in each TCGA MESO histological subtype with proCAF and restCAF subtypes (p < 0.001, Fisher’s exact test).
(F) Comparison of proCAF probability between myxoid (n = 35) vs. fibrous (n = 40) stroma in TCGA MESO (p = 0.001, Wilcoxon rank-sum test).
(G) Proportion of patients with proCAF and restCAF subtypes within each bladder cancer consensus subtype in TCGA BLCA subtypes (p < 0.001, Fisher’s exact test).
(H) Boxplot comparing proCAF probability in different TCGA BLCA consensus subtypes (p < 0.001, Kruskal-Wallis test, n shown in G).
(I) Kaplan-Meier plot showing OS in patients in the context of combined tumor (basal vs. luminal) and DeCAF subtypes (p = 0.011, log-rank test).
(J) Kaplan-Meier curves showing progression-free survival of patients before developing MIBC with proCAF and restCAF subtypes in the UROMOL dataset. Log-rank test.
(K, M, and O) Proportion of patients with proCAF and restCAF subtypes in the UROMOL dataset using the bladder consensus subtype schema (K), grade (LG, low grade; HG, high grade; CIS, carcinoma in situ) (M), and class (O), Fisher’s exact test.
(L, N, and P) Boxplots comparing DeCAF probability using the bladder consensus subtype schema (L, n shown in K), using grade (N, n shown in M), and class (P, n shown in O) in the UROMOL dataset. Kruskal-Wallis test.
Given the clinical similarities of fibrosis that characterizes both MESO58 and PDAC, we hypothesized that there may also be similar pathology findings. We found that DeCAF subtypes were associated with histological type (p = 0.001) with 93% (53/57) of epithelioid type tumors having a restCAF subtype (Figures 6D and 6E). Pathologist review of the stroma showed similar findings as PDAC, where a myxoid stroma was associated with higher proCAF probability (i.e., proCAFness) compared to a fibrous histology (p = 0.001, Figure 6F).
In bladder cancer, consensus subtypes have been described including a basal bladder subtype (Ba/Sq) with similar gene expression to basal-like PDAC.16 In TCGA BLCA, we found that similar to PDAC, the Ba/Sq bladder consensus subtype was enriched in the proCAF subtype, with 52.0% (80/152, p = 1.22e−13, Figure 6G) of Ba/Sq subtype patients having a proCAF subtype and highest proCAF probability (Figure 6H), and associated with OS (p = 0.011, Figure 6I). In a dataset of non-muscle-invasive bladder cancer (NMIBC), UROMOL,59 we find that patients with proCAF subtype tumors have a shorter progression-free survival to developing muscle-invasive bladder cancer (MIBC) (PFS, p = 0.00082, Figure 6J). Using the BLCA consensus subtyping schema, 75% (3/4, p = 0.00166) of Ba/Sq tumors had a proCAF subtype and had the highest proCAF probability (Figures 6K and 6L). Using standard clinical groupings of NMIBC, we found that proCAF prevalence and higher proCAF probability (i.e., proCAFness) were associated with higher grade (p = 8.7e−06, Kruskal-Wallis test, Figures 6M and 6N) and most enriched in class 2a NMIBC (18% [25/142], p = 0.000301), the most aggressive class of NMIBC (Figures 6O and 6P).
Taken together, our results show that the presence of the proCAF subtype is associated with poor prognosis in multiple cancer types with similar histology and tumor subtype associations to our findings in PDAC.
Immune landscape of proCAF and restCAF subtype tumors
While we showed that DeCAF genes have minimal expression in immune cells in two scRNA-seq datasets and by IF (Figures 1G, 1J, 4C, and 4D), we hypothesized that CAF subtypes may support different immune landscapes. Therefore, we used CIBERSORT60 with LM22 as the reference to deconvolve the fractions of 22 types of immune cells in each of our 12 PDAC bulk datasets (Figure 7A; Table S7). We found that the immune landscape was more immunosuppressive in proCAF subtype tumors, with an average enrichment of 1.9-fold in regulatory T cells (Tregs) in 6 datasets, 2.1-fold in neutrophils in 5 datasets, and 1.9-fold in M0 macrophages in 10 datasets (Figure 7A). In contrast, restCAF subtype tumors had an average enrichment of 1.5-fold in M1 macrophages in 4 datasets, 1.5-fold in CD8+ T cells in 7 datasets, and 2.4-fold in naive B cells in 6 datasets (Figure 7A). RestCAF subtype tumors also showed significantly higher CD8/Treg ratios in 8 datasets (Figure 7B), suggesting that patients with restCAF subtype tumors may show a more favorable response to certain immunotherapies than patients with proCAF subtype tumors.61,62 Using Ecotyper,63 we estimated the fractions of the “carcinoma ecotypes” (CEs), which were defined as cohesive cellular communities with cell states. We found that proCAF subtype tumors were associated with higher levels of CE1 and CE2 (Figure 7C), which were linked to higher risk of death and elevated levels of basal-like epithelial cells,63 consistent with our findings that patients with proCAF tumors have shorter OS and are enriched in the basal-like tumor subtype (Figures 3 and 5). In contrast, restCAF subtype tumors showed higher levels of CE6 and CE10 (Figure 7C), where CE10 was proinflammatory and associated with longer OS.63 We then sought to evaluate the relationship between proCAF and myeloid TMEs using ITGA11 (proCAF), FBLN5 (restCAF), and CD11b (monocyte/macrophage) staining. We found that ITGA11+ (proCAF) areas (n = 3, 35 regions, 7,162,078 μm2) were associated with significantly higher CD11b staining (n = 3, 21 regions, 5,835,855 μm2) (p value = 1.878e−08), supporting that proCAF areas are enriched in macrophages/monocytes (Figure 7D).
Figure 7.
Immune landscape and immunotherapy response of proCAF and restCAF subtypes
(A) Comparison of immune cell fractions deconvolved using CIBERSORT and LM22 as the reference across 12 datasets. The color of the dots represents the log2 fold change (FC) of the fractions between proCAF and restCAF subtype tumors. The size of the dots represents the p value tested by Wilcoxon rank-sum test.
(B) Violin plots comparing the log2 FC of the ratio between the CD8+ T cell and Treg fractions in proCAF vs. restCAF subtype tumors. Wilcoxon rank-sum test.
(C) Boxplots comparing the state fractions of carcinoma ecotypes (CEs) between proCAF and restCAF subtypes using Ecotyper. Wilcoxon rank-sum test.
(D) Proportion of CD11b+ cells in ITGA11+ (proCAF) and FBLN5+ (restCAF) regions. n = 21 and 25 areas of interest analyzed for ITGA11+ and FBLN5+, respectively, derived from three tumor sections per group. Wilcoxon rank-sum test.
(E and F) Correlation of proCAF probability in pre-treatment samples with the tumor size change in proCAF subtype (E, rho = −0.581, p = 0.048, n = 12), and restCAF subtype (F, rho = −0.796, p = 0.002, n = 12) tumors. Spearman correlation.
(G and H) Correlation of the change in neutrophil fraction (rho = 0.487, p = 0.016), and M2 macrophages fraction (rho = −0.479, p = 0.018) between pre- and post-treatment tumor samples (n = 24) with the percent change in tumor size. Spearman correlation.
(I and J) Boxplot showing the neutrophil and M2 macrophage fractions in pre- and post-treatment samples of proCAF and restCAF subtype tumors. n = 12 and 12 paired pre-post samples for proCAF and restCAF, respectively. Paired Wilcoxon rank-sum test.
(K) Correlation of proCAF probability between pre- and post-treatment samples (n = 24, r = 0.772, p < 0.001, Pearson correlation).
(L and M) Boxplot comparing proCAF probability in the patients stratified by ORR in the IMmotion150 (L, p = 0.077, t test, n = 20 and 61 for CR/PR and SD/PD) and IMvigor 210 trials (M, p = 0.020, t test, n = 65 and 230 for CR/PR and SD/PD. CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease).
(N) Kaplan-Meier curves showing patient OS by DeCAF subtypes in the IMvigor210 trial. Log-rank test.
(O) Proportion of patients with proCAF and restCAF subtype tumors within each bladder consensus subtype in IMvigor210 (p < 0.001, Fisher’s exact test).
(P) Boxplots comparing proCAF probability in patients with different bladder consensus subtype tumors in IMvigor210 (p < 0.001, Kruskal-Wallis test, n shown in O).
To investigate if DeCAF subtypes are predictive of immunotherapy response in PDAC patients, we examined the phase 1b trial of FOLFIRINOX in combination with PF-04136309, a CCR2 inhibitor (FFX+PF) which has both pre- and post-treatment samples (Linehan dataset).64 Within each subtype, we found that increasing proCAF probability (i.e., increasing proCAFness) in the pre-treatment sample was correlated with a greater percent decrease in tumor size in patients with proCAF (rho = −0.581, p = 0.048, Figure 7E) and restCAF (rho = −0.796, p = 0.002, Figure 7F) subtype tumors. As CCR2 inhibition targets the recruitment of inflammatory monocytes, our findings of proCAF responsiveness may be explained by a macrophage-rich stroma in the pre-treatment proCAF subtype samples (Figure 7A). We next looked at the change in tumor neutrophil and M2 macrophages before and after treatment. We found that decreases in the neutrophil fraction (rho = 0.487, p = 0.016) and increases in the M2 macrophage fraction (rho = −0.479, p = 0.018) were correlated with tumor response (Figures 7G and 7H). We found that the decrease in neutrophil and increase in M2 macrophage fractions were specific to proCAF subtype tumors and not found in restCAF subtype tumors (Figures 7I and 7J). DeCAF probabilities between pre- and post-treatment samples were correlated (r = 0.772, p < 0.001, Figure 7K), suggesting that there were no dramatic shifts in CAF subtype. Thus, our results suggest that DeCAF subtypes associate with specific immune microenvironments that may predict immunotherapy response.
DeCAF subtypes and immunotherapy response in bladder and kidney cancer
As DeCAF subtypes were prognostic in bladder cancer and kidney renal clear cell carcinoma (cRCC), we evaluated the IMvigor210 trial (NCT02108652) in BLCA65 and the IMmotion150 trial (NCT01984242) in cRCC66 where patients were treated with the anti-PD-L1 antibody, atezolizumab. In the IMmotion150 trial of metastatic RCC patients, in the atezolizumab only arm, lower proCAF probability (i.e., increased restCAFness) was numerically, but not significantly associated with having a complete response (CR) or partial response (PR) (p = 0.077, t test, Figure 7L). In the IMvigor210 trial for metastatic urothelial cancers, we found that patients who had a CR/PR had significantly lower proCAF probability (i.e., increased restCAFness) (p = 0.020, t test, Figure 7M). In addition, patients with proCAF subtype tumors had a mOS of 6.7 months compared to 9.9 months for patients with restCAF tumors (p = 0.043, HR 1.4 [95% CI 1.01, 1.95], Figure 7N). While LRRC15 CAF signatures have previously been shown to be increased in immune-excluded patients who failed to respond in the IMvigor210 trial,24,25 we did not find a consistent predictive effect in the full dataset in either the IMvigor210 or IMmotion150 trials (Figure S7B). Finally, similar to PDAC and the TCGA BLCA dataset, in IMvigor210, we found that Ba/Sq subtype tumors were most enriched in the proCAF subtype with 33% (36/109) of Ba/Sq subtype harboring a proCAF subtype (p = 3.81e−05, Figure 7O). As expected, higher DeCAF probability was associated with a Ba/Sq tumor subtype as well (p = 5.5e−13, Kruskal-Wallis test, Figure 7P). Therefore, cRCC and bladder cancer patients with restCAF subtype tumors may have an increased overall response rate (ORR) to immune checkpoint inhibition (ICI).
Discussion
Establishing the clinically meaningful aspects of CAF heterogeneity is a critical step in bridging the gap between preclinical studies and the clinic. Although scRNA-seq analysis has enabled a better understanding of CAF heterogeneity,18,19,20,21,22,24,25,26,29 the generalizability and clinical translatability of the resulting CAF classifications remain unclear. iCAFs have been considered inflammatory and tumor promoting, with many preclinical studies concluding that therapeutic approaches targeting iCAFs are needed.18,19,30,31,32,33,34 However, we find that 85.2% (n = 236) of patients with iCAF tumors are restCAF, exhibiting restraining properties with improved prognosis, while proCAF tumors are associated with shorter OS. This suggests that targeting iCAF populations may not be beneficial, which may provide additional context for the disappointing trials to date.18,19,30,31,32,33,34 Our results are in agreement with a recent study that found that myCAFs may be pro-metastatic.67 Thus, we propose a paradigm based on the clinical function of tumor-promoting and tumor-restraining CAF subtypes in patients.68
Clustering tends to be unstable because between-sample normalization changes when a new sample is added; it can therefore be misleading for small sample sizes and is unfeasible in the clinical setting, where only a single sample is assayed. Unlike previous comparable CAF subtyping schemas,16,17 DeCAF avoids clustering-based methods. DeCAF subtypes showed the lowest BIC and higher HR, and only DeCAF subtypes were independently prognostic when compared with previous comparable CAF schemas. DeCAF, which uses nine pairs of TSP genes to call proCAF vs. restCAF subtypes, is robust and replicable. In addition, the TSP method depends only on the relative ranking of the genes within each pair. Therefore, it is one of the most robust methods across different unbiased platforms (RNA-seq, microarray, or NanoString), quantification methods (STAR or Salmon), or reference versions (Ensembl or RefSeq). This considerably increases the flexibility and practicality of integrating and studying CAF subtypes in clinical contexts. Biased platforms that are confounded by probe or primer efficiencies or sensitivities may not be suitable for this method.
A key element of our method is the use of CAF-intrinsic subpopulation marker genes identified using SCISSORS, which precisely identifies CAF clusters and marker genes in scRNA-seq data without contamination from other cell types16,17,18,20,21,22,27,28,29,35 and is critical to translating CAF-intrinsic subtypes to bulk RNA-seq data. We show that pro- and restCAF subtypes are not confounded by immune cells, suggesting that the CAF subtypes support distinct immune cell landscapes.69,70,71
We find that the histologic features of the stroma in PDAC and mesothelioma have direct associations with the DeCAF subtypes with proCAF subtype tumors having myxoid stroma, and restCAF subtype tumors having fibrous stroma. The DeCAF score allows us to look at mixtures where fibromyxoid (mixed) stroma have intermediate DeCAF scores. Our findings of fibrous histology are consistent with the previously described “deserted subTME” by Grünwald et al. and a “collagen-rich stroma” or “C-stroma” by Ogawa et al. In contrast to prior studies, we do find that the myxoid and fibrous features found on histology associate with outcome.27
We found that DeCAF subtypes are prognostic in mesothelioma, cRCC, and bladder carcinomas. In bladder cancer, where the PDAC basal-like gene signature can be used to accurately recall the Ba/Sq bladder consensus subtype,16 both of which have an enrichment of cytokeratins, we find that the proCAF subtype is enriched in both basal bladder Ba/Sq and PDAC basal-like subtype tumors. Furthermore, both tumor and CAF subtype affect prognosis in an additive fashion in patients with bladder cancer and PDAC.
Finally, our findings of differential immune microenvironments associated with the proCAF vs. restCAF subtype have implications for ICI response, as seen in two large clinical trials in cRCC and bladder cancer, and for CCR2 inhibitor response, as seen in a PDAC clinical trial. We showed that the DeCAF proCAF probabilities are superior to a previous LRRC15 CAF signature24,25 in predicting treatment response to anti-PD-L1 inhibitors. These findings suggest that the prognostic and therapeutic implications of DeCAF are relevant across these cancer types.
Our results provide a clinical framework for understanding and targeting poor prognostic proCAF subtypes. We provide critical insights into the relationship between preclinical CAF models and CAF subtypes in patients, enabling better alignment between experimental systems and patient biology to advance future translatability. Studies of DeCAF subtypes in non-malignant72 and high-risk chronic pancreatitis cohorts73 may shed light on their role in tumor initiation. Future studies will aim to understand the biological underpinnings of DeCAF subtypes and the tumor-stromal crosstalk that drives prognosis and treatment response in patients.
Limitations of the study
While we integrated scRNA-seq, spatial transcriptomics, and pathology data to characterize CAF subtypes, our study does not capture the dynamic changes in CAF states during treatment or disease progression. Longitudinal sampling and functional assays are needed to elucidate these temporal dynamics. In addition, our analyses are correlative and do not establish causality. Experimental studies will be required to define the mechanistic basis of tumor-CAF-immune interactions and their impact on therapy response. Clinical trials will be needed to determine the impact of CAF subtypes on treatment response.
Resource availability
Lead contact
Requests for further information and resources should be directed to and will be fulfilled by the lead contact, Jen Jen Yeh (jjyeh@med.unc.edu).
Materials availability
This study did not generate new reagents. All materials involved can be found in the STAR Methods.
Data and code availability
• Newly generated datasets have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE311789, with GSE310957 (UNC-bulk), GSE311788 (UNC-sc), and GSE311783 (UNC-st). Public bulk datasets used in this study are summarized in Table S2.
• The DeCAF classifier was deposited as a GitHub repository at https://github.com/jjyeh-unc/decaf. All scripts involved in generating results and figures are available at https://github.com/jjyeh-unc/decaf_manuscript.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Acknowledgments
This study was funded by U.S. National Institutes of Health (NIH) R01 CA199064 (N.U.R. and J.J.Y.), NIH U01 CA274298 (N.U.R., J.J.Y., and X.L.P.), NIH P50 CA257911 (N.U.R., J.J.Y., X.L.P., and Y.P.-G.), NIH U24 CA211000 (X.L.P. and J.J.Y.), NC TraCS 2KR1472106 (X.L.P. and J.J.Y.), NIH T32 CA106209 (E.V.K.), U.S. National Science Foundation (NSF) DGE-204-435 (E.V.K.), NIH T32 CA244125 (J.F.K. and M.E.L.), NIH R01 CA168863 (B.A.B. and D.C.L.), and NIH R01 CA270792 (N.P.K. and Y.P.-G.).
We thank members of the Yeh, Kim, and Rashid labs, the NIH National Cancer Institute (NCI) Pancreatic Cancer Stromal Reprogramming Consortium (PSRC), and Dr. Barbara Grünwald for insights and constructive discussions. We thank the UNC Lineberger Preclinical Studies, Tissue Procurement and Microscopy Services Cores (P30 CA016086), the Hooker Imaging Facility, and the Center for Gastrointestinal Biology of Disease Core (P30 DK034987). We thank the UNC Research Computing group. The study design, data collection and analysis, decision to publish, and manuscript preparation were conducted independently, without involvement from the funders.
Author contributions
J.J.Y., N.U.R., W.Y.K., and X.L.P. conceived the study and supervised the project. J.J.Y., W.Y.K., H.J.K., S.T., B.A.B., R.Z.P., D.C.L., and A.C.I. provided clinical oversight, patient samples, and pathological expertise. X.L.P., I.C.M., E.V.K., Y.X., R.T.Z., C.L., J.F.K., N.P.K., A.H., J.J.L., J.S., P.S.C., A.B.M., M.E.L., S.G.H.L., A.C., and Y.P.-G. performed the experiments and generated datasets. N.U.R., X.L.P., E.V.K., R.T.Z., C.L., N.P.K., A.H., J.J.L., J.S., A.C., S.G., and J.S.D. performed computational analyses and data interpretation. J.J.Y., N.U.R., W.Y.K., X.L.P., I.C.M., and E.V.K. drafted the manuscript with editing and critical revisions from all authors. All authors reviewed and approved the final manuscript.
Declaration of interests
Patent applications have been filed for work in this manuscript by the University of North Carolina Chapel Hill. Authors/inventors: J.J.Y., N.U.R., E.V.K., and X.L.P.
PurIST, used in the manuscript, is covered by an issued patent WO2020205993A1 held by the University of North Carolina Chapel Hill. Inventors who are authors: J.J.Y. and N.U.R. PurIST was licensed to GeneCentric Therapeutics, Inc., which had no participation in the current work.
STAR★Methods
Key resources table
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| Anti-ITGA11 [EPR28689-21] | Abcam | Ab316249 |
| Anti-Fibulin 5 [1G6A4] | Abcam | Ab66339; RRID: AB_2231921 |
| Anti-CD45 (D9M81) Alexa Fluor 647 | Cell Signaling | 19744S; RRID: AB_3720269 |
| Anti-CD11b/ITGAM (D6X1N) Alexa Fluor 647 | Cell Signaling | 79750S; RRID: AB_3720244 |
| Anti-Fibulin 5 | Abcam | Ab202977 |
| Anti-Pan Cytokeratin (AE1/AE3) Alexa Fluor 488 | Invitrogen | 53900382; RRID: AB_1834350 |
| Goat anti-Mouse IgG (H + L) Alexa Fluor 546 | Invitrogen | A11030; RRID: AB_144695 |
| Goat anti-Rabbit IgG (H + L) Alexa Fluor 647 | Invitrogen | A21245; RRID: AB_2535813 |
| Anti-Vimentin (D21H3) | Cell Signaling | 5741S; RRID: AB_10695459 |
| Anti-αSMA (D4K9N) | Cell Signaling | 19245S; RRID: AB_2734735 |
| Anti-FAP (F1A4G) | Cell Signaling | 52818S; RRID: AB_3674735 |
| Biological samples | ||
| Deidentified patient samples | UNC TPF | Table S4 |
| Chemicals, peptides, and recombinant proteins | ||
| Goat Serum | Gibco | 16210072 |
| Xylenes | Sigma | 214736 |
| Harris Hematoxylin Solution Modified | Sigma | HHS32-1L |
| Eosin Y Solution Alcoholic with Phloxine | Sigma | HT110332-1L |
| DAPI (4′, 6-Diamidino-2-Phenylindole) | Invitrogen | D3571 |
| Critical commercial assays | ||
| Movat Pentachrome Stain Kit (Modified Russell-Movat) | Abcam | Ab245884 |
| Visium v1 Spatial Platform | 10X Genomics | |
| Chromium Next GEM Single Cell | 10X Genomics | PN-1000121 |
| Chromium Next GEM Chip G Single Cell Kit | 10X Genomics | PN-1000127 |
| Single Index Kit T Set A | 10X Genomics | PN-1000213 |
| Miltenyi human dissociation kits | Miltenyi | 130-095-929 |
| Red blood cell lysis solution | Miltenyi | 130-094-183 |
| Chromium 3′ v3 reagents | 10X Genomics | N/A |
| Qubit dsDNA Assay Kit | Thermo | Q32851 |
| Tapestation D5000 screen tapes | Agilent | N/A |
| NextSeq 500/550 High Output Kit v2.5 (150 Cycles) | Illumina | 20024907 |
| TruSeq Stranded mRNA kit | Illumina | N/A |
| KAPA RNA HyperPrep Kit with RiboErase | KAPA Biosystems | N/A |
| Cytassist Spatial Gene Expression slide and reagent kit | 10X Genomics | N/A |
| Deposited data | ||
| UNC-bulk | Gene Expression Omnibus (GEO) | GSE310957 |
| UNC-sc | Gene Expression Omnibus (GEO) | GSE311788 |
| UNC-st | Gene Expression Omnibus (GEO) | GSE311783 |
| Software and algorithms | ||
| DeCAF | This paper | https://github.com/jjyeh-unc/decaf |
| PurIST | Rashid and Peng et al.39 | https://github.com/naimurashid/PurIST |
| SCISSORS | Leary et al.35 | https://github.com/jr-leary7/SCISSORS |
| DECODER | Peng et al.47 | https://github.com/laurapeng/decoderr |
| CIBERSORT | Newman et al.60 | https://cibersort.stanford.edu |
| Ecotyper | Luca et al.63 | https://ecotyper.stanford.edu |
| bcl2fastq2 (v2.19.0) | Illumina | https://support.illumina.com/downloads/bcl2fastq-conversion-software-v2-20.html |
| Salmon 1.9.0 | Patro et al.74 | https://combine-lab.github.io/salmon/ |
| ConsensusClusterPlus R package v1.56.0 | Wilkerson et al.75 | https://bioconductor.org/packages/devel/bioc/html/ConsensusClusterPlus.html |
| switchBox R package v1.28.0 | Afsari et al.76 | https://bioconductor.org/packages/switchBox |
| ncvreg R package v3.13.0 | Breheny et al.77 | https://cran.r-project.org/web/packages/ncvreg |
| nsROC R package v1.1 | Pérez-Fernández et al.78 | https://cran.r-project.org/web/packages/nsROC/index.html |
| Survival R package v3.2-13 | Therneau et al.79 | https://cran.r-project.org/web/packages/survival/index.html |
| Seurat R package v4 (for scRNAseq) | Hao and Hao et al.80 | https://satijalab.org/seurat/ |
| Seurat R package v5 (for ST) | Hao et al.81 | https://satijalab.org/seurat/ |
| Variance-adjusted Mahalanobis (VAM) | Frost et al.46 | https://cran.r-project.org/web/packages/VAM/index.html |
| Cell Ranger 6.1.2 | 10x Genomics | https://www.10xgenomics.com/support/software/cell-ranger/latest |
| CellPhoneDB v2 | Efremova et al.51 | https://github.com/ventolab/CellphoneDB |
| CellChat R package v1.6.1 | Jin et al.82 | https://github.com/sqjin/CellChat |
| SpaceRanger v2.1.0 | 10x Genomics | https://www.10xgenomics.com/support/software/space-ranger/latest |
| QuPath v0.6.0 | Bankhead et al.83 | https://github.com/qupath/qupath/releases |
| FIJI | National Institutes of Health | https://imagej.net/software/fiji/ |
| Other | ||
| Elyada-sc | Elyada et al.18 | dbGaP: phs001840.v1.p1 |
| Public bulk transcriptomics data | This paper | Table S2 |
| TCGA RNAseq data | Broad GDAC firehose | https://gdac.broadinstitute.org |
| RefSeq assembly (GCF_000001405.40) of the human reference genome GRCh38.p14 | RefSeq | https://www.ncbi.nlm.nih.gov/datasets/genome/ |
Experimental model and study participant details
A total of 130 de-identified samples were obtained following IRB exemption in accordance with the U.S. Common Rule from the University of North Carolina (UNC) Lineberger Comprehensive Cancer Center (LCCC) Tissue Procurement Core Facility (TPF) under the UNC LCCC institutional biorepository LCCC9001 where informed consent and HIPAA authorization allows for immediate and future usage of malignant and normal tissue under IRB approved or exempted reliant research projects. Available demographic and clinical information is provided in Table S4, including age, race, gender, stage, margin, etc. No significant association was observed between DeCAF subtypes and demographic characteristics (Table S5). The allocation of patient samples across groups is described within each analysis. Clinical trial data included in this study were obtained from publicly available resources and are summarized in Table S2.
Method details
Marker gene identification in Elyada-sc
SCISSORS includes a carefully designed function, which is a two-step method for the identification of highly cell subpopulation specific genes.35 Briefly, SCISSORS first derives a candidate gene list by comparing the cell subpopulation of interest to the most related cell subpopulation; then the highly expressed genes from other unrelated cell types are removed from this candidate gene list. In this study using the Elyada-sc human dataset, the human apCAF cell cluster was confirmed by observing overrepresentation of mouse apCAF genes in this cluster. Then the proCAF cells were compared to restCAF and apCAF cells, and the restCAF cells were compared to proCAF and apCAF cells, to derive a candidate gene list for the cell subpopulation of proCAF and restCAF separately (p < 0.05, Wilcoxon rank-sum test, log2 fold change >0). The proCAF and restCAF candidate gene lists were subjected to a filtering step, in which the highly expressed genes of the non-CAF cells were removed. The highly expressed genes were defined as the top 10% expressed by averaging all the cells within each broad cell cluster. As a result, final gene lists were identified for proCAF and restCAF cells (Table S1).
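The two-step filter described above can be illustrated with a minimal Python sketch. This is not the SCISSORS implementation (SCISSORS is an R package); the function names `top_expressed` and `marker_genes`, the gene names, and all numeric values are hypothetical, and the differential expression statistics are assumed to have been computed beforehand.

```python
def top_expressed(mean_expr, fraction=0.10):
    """Genes in the top `fraction` of one cluster, ranked by mean expression."""
    ranked = sorted(mean_expr, key=mean_expr.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_top])


def marker_genes(de_stats, other_cluster_means, p_cutoff=0.05, fraction=0.10):
    """Two-step filter: DE candidates, minus genes high in unrelated cell types."""
    # Step 1: candidates up-regulated in the subpopulation of interest.
    candidates = {
        gene for gene, (p_value, log2_fc) in de_stats.items()
        if p_value < p_cutoff and log2_fc > 0
    }
    # Step 2: remove genes that are highly expressed in any non-CAF cluster.
    for mean_expr in other_cluster_means.values():
        candidates -= top_expressed(mean_expr, fraction)
    return sorted(candidates)


# Toy input (hypothetical genes and values, not taken from the study).
de_stats = {
    "geneA": (0.001, 1.2),   # significant and up-regulated: kept
    "geneB": (0.010, 0.4),   # candidate, but top-expressed in the immune cluster
    "geneC": (0.200, 0.8),   # not significant
    "geneD": (0.030, -0.5),  # down-regulated
}
immune_means = {f"gene{c}": 1.0 for c in "ACDEFGHIJ"}
immune_means["geneB"] = 50.0
markers = marker_genes(de_stats, {"immune": immune_means})
print(markers)
```

In this toy case only geneA survives both steps, because geneB falls in the top 10% of the hypothetical immune cluster.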
Public bulk datasets and sample inclusion
Seventeen bulk transcriptomics datasets were obtained from public sources (Table S2). Gene expression quantifications were used ‘as-is’ with respect to the original publications, when possible, i.e., data were not re-aligned or re-quantified; gene-level expression estimates were used either in the unit of TPM (transcripts per million) or FPKM (fragments per kilobase per million reads), depending on the study. When preprocessed gene expression data were not available, for training datasets, the most similar methods were used to process the data; for validation and independent datasets, Salmon 1.9.074 using RefSeq assembly (GCF_000001405.40) of the human reference genome GRCh38.p14 was used to derive gene expression levels. In PDAC datasets, only non-metastatic primary PDAC samples were included. For the Grünwald and Olive datasets, which are microdissected, only stroma samples were included. Bladder cancer subtyping calls were made on log2 transformed upper-quartile normalized expression data using the BLCA subtyping and consensusMIBC R package.84 Within the UROMOL and IMvigor210 datasets, all samples with gene expression data were used in the analysis. For the TCGA_BLCA cohort, only tumors from patients with stage M0 disease were included in the analysis.
Consensus clustering (CC)
SCISSORS proCAF and restCAF genes were derived as described above and ranked by fold enrichment to generate the top25 genes for each subtype. Gene sets of the Moffitt stroma, Elyada and Maurer schemas were collected from each study respectively. For each of the subtyping schemas in each of the 11 public transcriptomics datasets, unsupervised CC was applied using the ConsensusClusterPlus (version 1.56.0)75 R package for genes (rows) and samples (columns) separately. Data matrices were subjected to log2 transformation and column-wise quantile normalization. Note that the data were not normalized row-wise, as that may force the clusters to have similar sizes, instead of deriving the reflective number of patients in each cluster. For clustering of samples, a distance matrix was derived based on Pearson correlation. Then CC was applied to this distance matrix to derive two sample clusters (K = 2), which consisted of 1,000 iterations of k-means clustering using Euclidean distance, with 80% items hold-out at each iteration. The number of K was determined empirically by visual inspection to derive clusters of samples that were most representative of the CAF subtypes. For clustering of genes, a distance matrix was derived based on Pearson correlation. Then CC was applied to this distance matrix to derive two gene clusters (K = 2), which consisted of 200 iterations of k-means clustering using Euclidean distance, with 80% items hold-out at each iteration.
Generation of CC labels for classifier training and validation
To derive confident labels for classifier training and validation, the CC method described above was adapted. Specifically, CC started at sample-wise K = 2, with K increased step by step to inspect clustering performance and sample-gene associations. The resultant clusters were then labeled as “proCAF”, “Mixed proCAF”, “Mixed”, “Mixed restCAF”, “restCAF”, and “Absent” based on gene expression. A dataset does not necessarily contain all 6 cluster categories. “Mixed proCAF” was then merged with “proCAF”, and “Mixed restCAF” was merged with “restCAF”. These merged “proCAF” and “restCAF” labels were used for training and validation of DeCAF.
DeCAF classifier training
Candidate gene ranking
SCISSORS CAF genes were ranked based on the consistency of their differential expression (DE) statistics between CC-based subtypes in each individual training dataset. A cross-study DE consistency score was obtained for each gene by summing its -log10 p values (Wilcoxon rank-sum test) across datasets, and genes were ranked by this score from high to low. The top 25% of this set (consistent DE genes) were considered for model training, and genes whose direction of regulation (up or down) between subtypes was not consistent across datasets were removed. The remaining genes formed our final candidate gene set for downstream steps.
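As an illustration of this ranking, the following Python sketch computes a summed -log10 p value score per gene, keeps the top-ranked fraction, and then drops genes whose direction of change disagrees between datasets. It is a sketch under stated assumptions, not the authors' code: the function names, the encoding of direction as +1/-1, the floor applied to p values to avoid taking the logarithm of 0, and the toy values are all hypothetical.

```python
import math


def consistency_scores(pvalues_by_gene):
    """Sum of -log10(p) across training datasets for each gene."""
    return {
        gene: sum(-math.log10(max(p, 1e-300)) for p in pvalues)
        for gene, pvalues in pvalues_by_gene.items()
    }


def candidate_genes(pvalues_by_gene, directions_by_gene, top_fraction=0.25):
    """Top-ranked genes whose direction of change agrees in every dataset."""
    scores = consistency_scores(pvalues_by_gene)
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, math.ceil(len(ranked) * top_fraction))
    return [g for g in ranked[:n_top] if len(set(directions_by_gene[g])) == 1]


# Toy input: two training datasets, four hypothetical genes.
pvalues = {
    "g1": [1e-5, 1e-4],
    "g2": [1e-6, 1e-2],
    "g3": [0.1, 0.1],
    "g4": [0.5, 0.5],
}
directions = {"g1": [1, 1], "g2": [1, -1], "g3": [1, 1], "g4": [-1, -1]}

# top_fraction=0.5 is used only so that the four-gene toy keeps two genes.
selected = candidate_genes(pvalues, directions, top_fraction=0.5)
print(selected)
```

Here g1 and g2 have the highest scores, but g2 is removed because it is up-regulated in one dataset and down-regulated in the other.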
Rationale of using kTSP for binary classification
Let us define a gene pair $(g_{dis}, g_{dit})$, where $g_{dis}$ is the expression of gene $s$ for subject $i$ in study $d$, and $g_{dit}$ is the expression of gene $t$ for the same subject and study. A TSP is an indicator variable based on this gene pair, $I(g_{dis} > g_{dit}) - \tfrac{1}{2}$, where the value represents which gene in the pair has higher expression in subject $i$ from study $d$ ($\tfrac{1}{2}$ if $g_{dis} > g_{dit}$, and $-\tfrac{1}{2}$ otherwise). The TSP method was originally proposed in the context of binary classification.40,85 In traditional applications, a single TSP (k = 1) is selected out of the set of all possible gene pairs, in which case $I(g_{dis} > g_{dit}) - \tfrac{1}{2} > 0$ implies subtype A with high probability; otherwise subtype B is implied.42 We view such binary variables as “biological switches” indicating how pairs of genes are expressed relative to clinical outcome.
In the kTSP setting, class prediction reduces to verifying whether the sum across k selected TSPs is greater than 0:
$$\sum_{l=1}^{k} \left[ I(g_{dis_l} > g_{dit_l}) - \tfrac{1}{2} \right] > 0$$
This reduces to a majority vote across the selected k TSPs, where the contribution of each of the k TSPs are equally weighted to select subtype A if the above sum is greater than 0, and subtype B otherwise. However, some TSPs may be more informative than others, so we utilized penalized logistic regression77 to jointly estimate the effect of each of the k selected TSPs in predicting binary subtype, and to further remove TSPs with weak or redundant effects. Predicted probabilities of proCAF subtype membership (DeCAF score) may then be obtained from the fitted logistic regression model on our training samples, where values greater than 0.5 indicate predicted membership to the proCAF subtype and restCAF otherwise.
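The unweighted majority vote can be illustrated with a short Python sketch. It assumes each TSP contributes +1/2 when the first gene of the pair is more highly expressed and -1/2 otherwise, so that a positive sum selects subtype A. The gene names and expression values are placeholders.

```python
def ktsp_vote_sum(expression, pairs):
    """Sum of TSP indicators, each coded +0.5 (first gene higher) or -0.5."""
    return sum((1.0 if expression[s] > expression[t] else 0.0) - 0.5
               for s, t in pairs)

def ktsp_predict(expression, pairs):
    """Equal-weight majority vote: subtype A if the vote sum is positive."""
    return "A" if ktsp_vote_sum(expression, pairs) > 0 else "B"

# Placeholder genes and values for one sample.
sample = {"g1": 5.0, "g2": 3.0, "g3": 1.0, "g4": 2.0, "g5": 7.0, "g6": 4.0}
selected_pairs = [("g1", "g2"), ("g3", "g4"), ("g5", "g6")]
vote = ktsp_vote_sum(sample, selected_pairs)
call = ktsp_predict(sample, selected_pairs)
```

Two of the three pairs vote for subtype A here, so the sum is 0.5 and the call is A. With an even k a tie gives a sum of exactly 0, which this sketch assigns to subtype B.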
Horizontal data integration and kTSP selection via switchBox
To apply the top scoring pairs transformation, we utilize the switchBox R package (version 1.28.0)76 to enumerate all possible gene pairs based on our final candidate gene list and training samples (function SWAP.KTSP.Train, with optimal parameters featureNo = 100, krange = 50, FilterFunc = NULL). Given the large number of potential gene pairs based on this list, in addition to the strong correlation between gene pairs sharing the same genes, the switchBox package utilizes a greedy algorithm to select from this list a subset of gene pairs that are helpful for prediction, given the set of training labels. We merge data from each training dataset without normalization prior to applying switchBox, as the method only looks at the relative gene expression ranking within each sample from each study. The method then selects a subset of k TSPs, where k is determined through a greedy optimization procedure.
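For intuition, the classic TSP score from the rank-based classification literature can be sketched as below: for each gene pair, it is the absolute difference between the two classes in how often the first gene exceeds the second. This is a simplified illustration under that assumption. It does not reproduce switchBox's internal filtering or its greedy selection, and the toy expression values are made up.

```python
from itertools import combinations

def tsp_pair_scores(samples, labels, genes, class_a="proCAF", class_b="restCAF"):
    """Score each pair as |P(g_s > g_t | class_a) - P(g_s > g_t | class_b)|.
    Only within-sample orderings are used, so no cross-study normalization
    is needed."""
    def freq(cls, s, t):
        group = [x for x, y in zip(samples, labels) if y == cls]
        return sum(x[s] > x[t] for x in group) / len(group)

    return {
        (s, t): abs(freq(class_a, s, t) - freq(class_b, s, t))
        for s, t in combinations(genes, 2)
    }

# Toy training data (made-up values): two samples per class.
train_samples = [
    {"g1": 5, "g2": 1, "g3": 3},
    {"g1": 6, "g2": 2, "g3": 7},
    {"g1": 1, "g2": 4, "g3": 2},
    {"g1": 2, "g2": 5, "g3": 3},
]
train_labels = ["proCAF", "proCAF", "restCAF", "restCAF"]
pair_scores = tsp_pair_scores(train_samples, train_labels, ["g1", "g2", "g3"])
```

A score of 1 means the ordering of the pair separates the two classes perfectly in the training data, and a score near 0 means the pair carries little information.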
Model training based on selected kTSPs
To remove redundant TSPs and to jointly estimate their contribution in predicting subtype in our training samples, we utilize the ncvreg R package (version 3.13.0)77 to fit a penalized logistic regression model based upon the TSPs selected by switchBox. Our design matrix is an N × (k+1) matrix, where the first column pertains to the intercept and the remaining k columns pertain to the k TSPs selected by switchBox. Here N is the total number of training samples. Each TSP in the design matrix is represented as a binary vector, taking on the value of 1 if gene A’s expression is greater than gene B’s expression and 0 otherwise. Our outcome variable here is binary subtype (1 = proCAF, 0 otherwise). We utilize optional parameters alpha = 0.05 and nfolds = N. Setting the ncvreg alpha parameter to 0.05 allows for correlation between TSPs by shrinking the coefficients of highly correlated TSPs while also removing correlated, uninformative TSPs from the model. Setting nfolds = N applies leave-one-out cross-validation to choose the MCP penalty tuning parameter for variable selection, where the optimal tuning parameter is the one that minimizes the cross-validation error of the fitted model. Our final model then reports the set of coefficients estimated for each of the k TSPs, where each coefficient may be interpreted as the change in log odds of a patient being part of the proCAF subtype when the lth TSP is equal to 1, given the others in the model. TSPs with a coefficient of 0 are those that have been removed from the model because of either weak effect or redundancy with other TSPs.
Final kTSP model
As illustrated in Figure 1A, 9 pairs of kTSP genes were evaluated for their relative ranking in a new patient. A value of 1 is assigned if the proCAF gene in a TSP has greater expression than the restCAF gene in that patient (and 0 otherwise), creating $X_{i,\mathrm{new}}$, a 1 × (k+1) TSP predictor vector. These values are then multiplied by the corresponding set of estimated TSP model coefficients, $\hat{\beta}$, obtained from the fitted penalized logistic regression model. The intercept term is included to correct for estimated baseline effects. These values are summed to get the patient “DeCAF Score”, $X_{i,\mathrm{new}}\hat{\beta}$. This score is then converted to a predicted probability of belonging to the proCAF subtype (proCAF probability), $\hat{p}_{i,\mathrm{new}}$, by computing its inverse logit:
$$\hat{p}_{i,\mathrm{new}} = \frac{\exp\left(X_{i,\mathrm{new}}\hat{\beta}\right)}{1 + \exp\left(X_{i,\mathrm{new}}\hat{\beta}\right)}$$
Values greater than or equal to 0.5 indicate predicted proCAF subtype membership, and values less than 0.5 indicate predicted restCAF subtype membership. This is equivalent to determining whether $X_{i,\mathrm{new}}\hat{\beta} \geq 0$ (proCAF subtype) vs. $X_{i,\mathrm{new}}\hat{\beta} < 0$ (restCAF subtype). Thus, the DeCAF score, $X_{i,\mathrm{new}}\hat{\beta}$, may also be utilized as a continuous score for classification. Therefore, prediction in new samples, such as from our validation datasets, reduces to simply checking the relative expression of each gene within the set of TSPs.
For all discussions regarding classifier performance, we obtain the predicted subtypes in the manner described above. The level of confidence in the prediction can be determined based upon the distance of $\hat{p}_{i,\mathrm{new}}$ from 0.5, where values closer to 0.5 indicate lower confidence in the predicted subtype and higher confidence otherwise.
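The single-sample prediction described above can be sketched in Python as follows. The gene names, coefficients, and intercept are placeholders invented for illustration. They are not the published DeCAF pairs or weights, which are listed in the supplemental tables.

```python
import math

def decaf_predict(expression, pairs, coefficients, intercept):
    """Return (linear score, proCAF probability, predicted label) for one sample.
    Each pair is (proCAF gene, restCAF gene); the indicator is 1 when the
    proCAF gene is more highly expressed, and 0 otherwise."""
    indicators = [1.0 if expression[pro] > expression[rest] else 0.0
                  for pro, rest in pairs]
    score = intercept + sum(b * x for b, x in zip(coefficients, indicators))
    probability = 1.0 / (1.0 + math.exp(-score))  # inverse logit
    label = "proCAF" if probability >= 0.5 else "restCAF"
    return score, probability, label

# Placeholder model and sample; NOT the published DeCAF genes or coefficients.
toy_pairs = [("PRO_GENE_1", "REST_GENE_1"), ("PRO_GENE_2", "REST_GENE_2")]
toy_coefs = [1.5, 2.0]
toy_intercept = -1.0
toy_sample = {"PRO_GENE_1": 10.0, "REST_GENE_1": 4.0,
              "PRO_GENE_2": 1.0, "REST_GENE_2": 3.0}
score, prob, label = decaf_predict(toy_sample, toy_pairs, toy_coefs, toy_intercept)
confidence = abs(prob - 0.5)  # distance from 0.5, as described above
```

Because only within-sample comparisons enter the indicators, the same function applies to any expression scale on which the ordering of genes within a sample is preserved.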
Validation of DeCAF
The performance of the final DeCAF model in the training set was measured using leave-one-out cross-validation. This was implemented using the cv.ncvreg function from the ncvreg R package (version 3.13.0).77 Since the outcome is binary, the cross-validation error is measured via the leave-one-out cross-validated deviance of the logistic regression model.
Next, the performance of the DeCAF model was evaluated in the validation sets. The validation set is composed of seven independent studies. To account for the variability between studies, a fully non-parametric ROC curve was used for this meta-analysis. The ROC curve was constructed and the inter-study variability was measured using the study random-effects model86 and implemented in the metaROC function from the nsROC R package (version 1.1).78 All validation metrics compared the DeCAF classifier to the combined “Mixed proCAF” and “proCAF” calls and combined “Mixed restCAF” and “restCAF” clustering calls. Correlation between proCAF probability (using pseudo-bulk) and the proportion of proCAF cells was derived by using Elyada-sc.
Sample processing
All samples were processed in accordance with the manufacturer’s standard protocols for UNC-bulk, UNC-sc and UNC-st respectively. For UNC-sc, six de-identified primary PDAC samples were involved. Fresh tissue was dissociated into single cells using Miltenyi human dissociation kits (Miltenyi, 130-095-929) and red blood cells were removed using red blood cell lysis solution (Miltenyi, 130-094-183). Cell counts were performed using an automated cell counter and live cell counts were determined using trypan blue staining. Up to 10,000 cells were encapsulated into droplets for droplet-based 3′ end scRNAseq using Chromium 3′ v3 reagents (10X Genomics). cDNA libraries were quantified using the Qubit dsDNA Assay Kit (Thermo, Q32851) and library quality was assessed with the 4150 Tapestation System (Agilent) and D5000 screen tapes (Agilent, 5067–5588). cDNA libraries were sequenced on a NextSeq500 (Illumina) using NextSeq 500/550 High Output Kit v2.5 (150 Cycles) (Illumina, 20024907) at 200M reads per sample.
For UNC-bulk, 129 de-identified primary PDAC patient samples were included, of which 114 have been previously reported.50 Formalin-fixed paraffin-embedded (FFPE) samples were prepared and H&E stained. RNA expression libraries were generated from flash-frozen or FFPE samples with TruSeq Stranded mRNA kits or with the KAPA RNA HyperPrep Kit with RiboErase (HMR) according to the manufacturer’s instructions. Sequencing was performed on the NextSeq500 and NovaSeq6000 Sequencing Systems (Illumina).
For UNC-st, seven samples passed quality control (QC) and were included in this study. The 10X Genomics Visium CytAssist platform was used with 5 μm tissue sections mounted onto glass slides. Tumor areas of interest were aligned with a 6.5 × 6.5 mm capture area on a CytAssist Spatial Gene Expression slide. Following the manufacturer’s protocol, sections were deparaffinized, H&E stained, imaged, and then decrosslinked to release mRNAs, which bind to the spatially barcoded oligos on the 4992 spots of each capture area, at a spatial resolution of 55 μm. Once probe pairs hybridized and ligated to the RNA, the single-stranded ligation product was released from the tissue and captured on the Visium slide. Probe products were reverse transcribed and extended to incorporate a unique molecular identifier (UMI) and the spatial barcode. Following amplification, libraries were constructed from the cDNA. qPCR of pre-amplification material was used to determine the cycle number for the sample index PCR to generate the final library. After a SPRIselect bead cleanup, generated libraries were sequenced on the Illumina NextSeq2000 platform using a P3 100 cycle kit at a depth of ∼300M reads per section, resulting in >60,000 reads per spot.
UNC-sc data processing and scRNAseq analysis
Cell Ranger (v.6.1.2) was used. Raw base call (BCL) files were converted into fastq files using “cellranger mkfastq” based on bcl2fastq2 (v2.20.0). Fastq files were processed by “cellranger count” to derive unique molecular identifier (UMI) counts for each gene, using the human GRCh38 genome. Samples were aggregated by “cellranger aggr”. Aggregated data (filtered_feature_bc_matrix) were analyzed by SCISSORS,35 which is wrapped around Seurat (v4),80 for cell clustering. QC steps included 1) the inclusion of genes expressed in more than 2 cells, 2) the inclusion of cells with between 200 and 2500 captured genes, and 3) the inclusion of cells with mitochondrial reads accounting for less than 5%. The filtered data were processed with the PrepareData() function in SCISSORS35 to obtain the initial clusters with the following parameter settings: n.HVG = 3000, regress.mt = TRUE, n.PC = 20, random.seed = 629, with other parameters using default values. The first-round clusters were annotated using SingleR. Clusters identified as “activated_stellate” were categorized as CAFs. A second-round clustering analysis was performed on the CAF-related clusters (0,4,6) using the ReclusterCells function in SCISSORS, with the following parameter settings: merge.clusters = TRUE, use.sct = TRUE, n.HVG = 3000, regress.mt = TRUE, n.PC = 20, resolution.vals = 0.2, k.vals = 57, with other parameters using default values.
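The three QC filters can be sketched in plain Python as below. This is a simplified stand-in for the Seurat-based filtering, under several assumptions that the text does not state: gene filtering is applied before cell filtering, the gene-count bounds are inclusive, mitochondrial genes are recognized by an "MT-" prefix, and the mitochondrial fraction is computed on all counts of a cell. The toy call uses small thresholds so the example fits in a few lines.

```python
def qc_filter(counts, mito_prefix="MT-", min_cells=3,
              min_genes=200, max_genes=2500, max_mito_frac=0.05):
    """counts: {cell: {gene: count}}. Returns (kept genes, kept cells)."""
    # 1) Keep genes detected in more than 2 cells (i.e. at least min_cells).
    cells_per_gene = {}
    for gene_counts in counts.values():
        for gene, n in gene_counts.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    kept_cells = []
    for cell, gene_counts in counts.items():
        # 2) Number of retained genes detected in this cell.
        n_detected = sum(1 for g, n in gene_counts.items()
                         if n > 0 and g in kept_genes)
        # 3) Fraction of the cell's counts that come from mitochondrial genes.
        total = sum(gene_counts.values())
        mito = sum(n for g, n in gene_counts.items() if g.startswith(mito_prefix))
        mito_frac = mito / total if total else 1.0
        if min_genes <= n_detected <= max_genes and mito_frac < max_mito_frac:
            kept_cells.append(cell)
    return kept_genes, kept_cells

# Toy matrix with made-up counts; thresholds scaled down for illustration.
toy_counts = {
    "c1": {"A": 50, "B": 30, "C": 20},
    "c2": {"A": 40, "B": 10, "MT-CO1": 50},
    "c3": {"A": 10},
    "c4": {"A": 20, "B": 20, "C": 20, "D": 5},
}
genes_kept, cells_kept = qc_filter(toy_counts, min_genes=2, max_genes=3)
```

In the toy data, c2 is removed for its high mitochondrial fraction and c3 for having too few detected genes.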
Variance-adjusted Mahalanobis (VAM)46 was used to estimate the enrichment score for gene sets. Cell-cell ligand-receptor interactions were predicted using CellPhoneDB v251 with default settings. Circular plots were generated using the netVisual_circle function from the CellChat package (version 1.6.1).82
UNC-bulk data processing and RNAseq analysis
BCL files were converted to fastq files using bcl2fastq2 (v2.19.0). The RefSeq assembly (GCF_000001405.40) of the human reference genome GRCh38.p14 was used as the reference for gene quantification by Salmon 1.9.074 (“--gcBias --seqBias”). The total expected read counts per gene were normalized to transcripts per million (TPM).
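As a reminder of what the TPM scale means, the standard definition can be sketched as follows. It divides each count by an effective length, then rescales the resulting rates to sum to one million. This is a generic illustration with made-up numbers. Salmon reports TPM values itself, and how gene-level values were aggregated in this study is not specified here.

```python
def counts_to_tpm(counts, effective_lengths):
    """Convert expected counts to TPM using per-feature effective lengths."""
    rates = {g: counts[g] / effective_lengths[g] for g in counts}
    total_rate = sum(rates.values())
    return {g: rate / total_rate * 1e6 for g, rate in rates.items()}

# Made-up counts and lengths for two features of equal length.
toy_tpm = counts_to_tpm({"G1": 100, "G2": 300}, {"G1": 1000, "G2": 1000})
```

Because TPM values sum to the same total in every sample, they are comparable in relative terms, and rank-based classifiers such as kTSP depend only on the within-sample ordering of genes.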
CIBERSORT87 was employed to estimate the fraction of 22 human hematopoietic cell phenotypes using the LM22 signature matrix. Cell fractions were compared between DeCAF subtypes in each of the 12 PDAC bulk datasets. The Wilcoxon rank-sum test was employed to derive the p-values and the fold change of the mean in two groups was utilized to quantify the magnitude of the observed differences. Ecotyper63 was employed to derive the levels of “carcinoma ecotypes” (CEs).
Survival analysis
For pooled survival analysis, patients with subtype calls, OS time and event were included. For the Linehan dataset, where patients received treatments, only pre-treatment samples were included when both pre-treatment and post-treatment samples were available. For samples that are duplicated in the PACA_AU_seq and PACA_AU_array datasets, only the sample from PACA_AU_seq was used for survival analysis.
OS estimates were calculated using the Kaplan-Meier method. Association between OS and individual covariates such as subtype were evaluated via the Cox proportional hazard (Cox-PH) models using the coxph function from the ‘survival’ R package (version 3.2–13),79 where a given subtyping schema was considered as a multi-level categorical predictor. The log rank test was used to evaluate overall association of a subtyping schema with overall survival and derive the p-values. In the pooled analyses, a stratified Cox-PH model was utilized, where dataset of origin was used as a stratification factor to account for variation in baseline hazard across studies.
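The Kaplan-Meier product-limit estimator used for the OS estimates can be written out in a few lines. The sketch below is a plain-Python illustration with made-up follow-up times. It is not the `survival` package implementation and it does not cover the Cox-PH or log rank steps.

```python
def kaplan_meier(times, events):
    """Return [(event time, survival estimate)] for right-censored data.
    events: 1 = death observed, 0 = censored."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve

# Made-up follow-up times (months) and event indicators.
km_curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

For these five subjects the estimate drops to 0.8 at month 5, to 0.6 at month 8, and to 0.3 at month 12. Censored subjects leave the risk set without causing a drop.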
UNC-st data processing and analysis
Space Ranger (v2.1.0) was used, following 10X Genomics guidelines, to process the raw data and generate gene count files. Briefly, “spaceranger mkfastq” with default settings was used to convert BCL files to FASTQ files. Then “spaceranger count” was applied with default settings and the required input files, including FASTQ files, the brightfield microscope image, the CytAssist image, and the manual alignment JSON file. The human GRCh38 genome was used as the reference genome. Seurat (v5.1.0)81 was used for subsequent data processing. Space Ranger output was loaded using the Load10X_Spatial function with filter.matrix = TRUE, to.upper = FALSE, and other parameters as default. Gene expression values were normalized using the NormalizeData function with default settings.
For spot deconvolution, DECODER47 (https://github.com/laurapeng/decoderr), with the TCGA PAAD dataset as reference, was applied to each spot of the ST samples, treating each spot as a bulk RNAseq sample. For tumor and CAF subtyping, PurIST39 (https://github.com/naimurashid/PurIST) and DeCAF were applied to each spot.
Tumor patch and stroma layer annotations were manually performed in the cLoupe browser. Stroma spots directly adjacent to a tumor patch were labeled as layer 1, with layers 2 and 3 extending outward. Spots not classified as stroma (e.g., acinar or inflammatory) were excluded from layer analysis. When layers overlapped between tumor patches, assignments were made randomly if distances were equal, or to the closer tumor patch if not.
For radius plots, adenocarcinoma spot coordinates were selected from SpaceRanger output based on pathologist annotations. Spots were classified as classical or basal-like using PurIST probabilities. Adjacent stroma spots were identified, and their angular position and distance (radius) from each carcinoma spot were recorded. Stroma spots were binned by 3° angle increments and 60 μm distance increments, matching Visium-level resolution. The mean DeCAF probability was calculated per bin to generate a spatial map of DeCAF probability around carcinoma spots. Bin averages were plotted with ggplot2’s polar_plot function. Additionally, DeCAF z-scores among stroma spots were computed per sample and averaged across samples (n = 7) to obtain representative distributions for basal-like and classical carcinoma spots.
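The polar binning behind the radius plots can be sketched as below for a single carcinoma spot. The sketch assumes spot coordinates are already expressed in μm and that angles are measured counterclockwise from the positive x axis. The coordinates and probabilities are made up, and pooling over many carcinoma spots would simply accumulate into the same bins.

```python
import math
from collections import defaultdict

def bin_stroma_spots(center, stroma_spots, angle_step=3.0, radius_step=60.0):
    """Mean DeCAF probability per (angle bin, radius bin) around one carcinoma spot.
    stroma_spots: iterable of (x, y, decaf_probability)."""
    cx, cy = center
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, prob in stroma_spots:
        dx, dy = x - cx, y - cy
        radius = math.hypot(dx, dy)
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        key = (int(angle // angle_step), int(radius // radius_step))
        sums[key][0] += prob
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

# Made-up stroma spots around a carcinoma spot at the origin.
toy_bins = bin_stroma_spots(
    (0.0, 0.0),
    [(100.0, 0.0, 0.8), (100.0, 1.0, 0.6), (10.0, 50.0, 0.2)],
)
```

The first two spots fall in the same bin (angle bin 0, radius bin 1) and are averaged. The third lies closer to the carcinoma spot, at roughly 79°, and gets its own bin.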
Pathology annotations
Annotation was performed on hematoxylin and eosin (H&E) stained samples by a gastrointestinal (GI) pathologist blinded to the DeCAF subtype calls. The samples were annotated as “fibrous” if extracellular densely packed collagen fibers with eosinophilic aspect were seen, “myxoid” if myxoid stroma with light blue-grey cytoplasm was predominant or “fibromyxoid” if the sample is in between. Spot annotation for Visium ST samples was performed by a GI pathologist blinded to gene features for each spot. The dominant tissue type of the spot was used as the annotation.
Staining analysis
FFPE blocks were sectioned at 5 μm thickness and deparaffinized with xylenes, followed by a gradient of ethanol washes to rehydrate. For H&E staining, tumors were incubated in 10% hematoxylin for 2 min, dehydrated, and stained with 1% eosin for 1 min before clearing and mounting. Movat-Pentachrome staining was performed following the manufacturer’s protocol for a Modified Russell-Movat kit (ab#235884).
IF staining was performed on tissue sections that underwent antigen retrieval in a Tris pH 9 buffer in a microwave for 15 min. Samples were then blocked (10 mM Tris, 100 mM magnesium chloride, 10% goat serum, 1% BSA, 0.5% Tween 20) for 60 min at room temperature. Primary antibodies were diluted in dilution buffer (2% BSA in PBS) and applied to tissue sections overnight at 4°C. Secondary Alexa Fluor antibodies were diluted in dilution buffer and applied to sections for 60 min at room temperature. After thorough washing, conjugated pan-cytokeratin and DAPI were added to sections and incubated for 60 min at room temperature. Coverslips were mounted over the specimens with Prolong Glass Antifade mountant (Invitrogen#P36980).
Stained tumors were imaged with an Olympus VS200 slide scanner at 20x or a Zeiss LSM900 confocal microscope at 40x magnification and analyzed using FIJI and QuPath analysis platforms.83 To quantify CD11b density in stromal regions, cells were segmented and manually thresholded to classify CD11b positive cells.
To quantify FBLN5 and ITGA11 staining, whole tumor samples were segmented by nuclei in QuPath. Cells were classified by presence or absence of panCK, FBLN5, and ITGA11. Total number of cells negative for panCK with FBLN5 only, ITGA11 only, or both FBLN5 and ITGA11 were recorded for each sample.
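The per-sample tally described above amounts to a simple count over classified cells. The sketch below assumes each segmented cell has already been given boolean marker calls. The dictionary layout and the example cells are hypothetical and do not reflect QuPath's export format.

```python
def count_stromal_marker_classes(cells):
    """Count panCK-negative cells that are FBLN5 only, ITGA11 only, or both."""
    counts = {"FBLN5_only": 0, "ITGA11_only": 0, "double_positive": 0}
    for cell in cells:
        if cell["panCK"]:
            continue  # panCK-positive (epithelial) cells are excluded
        if cell["FBLN5"] and cell["ITGA11"]:
            counts["double_positive"] += 1
        elif cell["FBLN5"]:
            counts["FBLN5_only"] += 1
        elif cell["ITGA11"]:
            counts["ITGA11_only"] += 1
    return counts

# Hypothetical classified cells from one sample.
toy_cells = [
    {"panCK": True, "FBLN5": True, "ITGA11": False},
    {"panCK": False, "FBLN5": True, "ITGA11": False},
    {"panCK": False, "FBLN5": False, "ITGA11": True},
    {"panCK": False, "FBLN5": True, "ITGA11": True},
    {"panCK": False, "FBLN5": False, "ITGA11": False},
]
marker_counts = count_stromal_marker_classes(toy_cells)
```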
Quantification and statistical analysis
All statistical analyses were performed using R (version 4.x) unless otherwise specified. Statistical tests, exact n values, definitions of n (e.g., number of patients, spatial transcriptomics spots, or regions analyzed), and p-values are reported in the corresponding figures, figure legends, Results, and Supplementary Tables associated with each analysis.
Published: February 17, 2026
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.xcrm.2026.102611.
Contributor Information
Xianlu Laura Peng, Email: laura.peng@med.unc.edu.
William Y. Kim, Email: wykim@med.unc.edu.
Naim U. Rashid, Email: nur2@email.unc.edu.
Jen Jen Yeh, Email: jjyeh@med.unc.edu.
Supplemental Information
(A) SCISSORS CAF genes. (B) DeCAF genes and coefficients.
(A) Summaries of bulk datasets. (B) Overall survival analysis of different CAF subtyping schemas.
(A) PDAC. (B) Notes for survival analysis in PDAC. (C) TCGA MESO. (D) TCGA BLCA. (E) TCGA KIRC. (F) UROMOL. (G) IMmotion150. (H) IMvigor210.
(A) Univariate overall survival analysis. (B) Association analysis with molecular subtypes.
(A) p value. (B) Means. (C) Manually corrected ligand-receptor pairs.
References
- 1. Elahi-Gedwillo K.Y., Carlson M., Zettervall J., Provenzano P.P. Antifibrotic Therapy Disrupts Stromal Barriers and Modulates the Immune Landscape in Pancreatic Ductal Adenocarcinoma. Cancer Res. 2019;79:372–386. doi: 10.1158/0008-5472.CAN-18-1334.
- 2. Lee J.J., Perera R.M., Wang H., Wu D.C., Liu X.S., Han S., Fitamant J., Jones P.D., Ghanta K.S., Kawano S., et al. Stromal response to Hedgehog signaling restrains pancreatic cancer progression. Proc. Natl. Acad. Sci. USA. 2014;111:E3091–E3100. doi: 10.1073/pnas.1411679111.
- 3. Rhim A.D., Oberstein P.E., Thomas D.H., Mirek E.T., Palermo C.F., Sastra S.A., Dekleva E.N., Saunders T., Becerra C.P., Tattersall I.W., et al. Stromal Elements Act to Restrain, Rather Than Support, Pancreatic Ductal Adenocarcinoma. Cancer Cell. 2014;25:735–747. doi: 10.1016/j.ccr.2014.04.021.
- 4. Jiang H., Torphy R.J., Steiger K., Hongo H., Ritchie A.J., Kriegsmann M., Horst D., Umetsu S.E., Joseph N.M., McGregor K., et al. Pancreatic ductal adenocarcinoma progression is restrained by stromal matrix. J. Clin. Investig. 2020;130:4704–4709. doi: 10.1172/JCI136760.
- 5. Kalluri R. The biology and function of fibroblasts in cancer. Nat. Rev. Cancer. 2016;16:582–598. doi: 10.1038/nrc.2016.73.
- 6. Ho W.J., Jaffee E.M., Zheng L. The tumour microenvironment in pancreatic cancer — clinical challenges and opportunities. Nat. Rev. Clin. Oncol. 2020;17:527–540. doi: 10.1038/s41571-020-0363-5.
- 7. Karamitopoulou E. Tumour microenvironment of pancreatic cancer: immune landscape is dictated by molecular and histopathological features. Br. J. Cancer. 2019;121:5–14. doi: 10.1038/s41416-019-0479-5.
- 8. Hessmann E., Buchholz S.M., Demir I.E., Singh S.K., Gress T.M., Ellenrieder V., Neesse A. Microenvironmental Determinants of Pancreatic Cancer. Physiol. Rev. 2020;100:1707–1751. doi: 10.1152/physrev.00042.2019.
- 9. Sherman M.H., Di Magliano M.P. Cancer-Associated Fibroblasts: Lessons from Pancreatic Cancer. Annu. Rev. Cancer Biol. 2023;7:43–55.
- 10. Perez V.M., Kearney J.F., Yeh J.J. The PDAC Extracellular Matrix: A Review of the ECM Protein Composition, Tumor Cell Interaction, and Therapeutic Strategies. Front. Oncol. 2021;11 doi: 10.3389/fonc.2021.751311.
- 11. Kim E.J., Sahai V., Abel E.V., Griffith K.A., Greenson J.K., Takebe N., Khan G.N., Blau J.L., Craig R., Balis U.G., et al. Pilot Clinical Trial of Hedgehog Pathway Inhibitor GDC-0449 (Vismodegib) in Combination with Gemcitabine in Patients with Metastatic Pancreatic Adenocarcinoma. Clin. Cancer Res. 2014;20:5937–5945. doi: 10.1158/1078-0432.CCR-14-1269.
- 12. Ramanathan R.K., McDonough S.L., Philip P.A., Hingorani S.R., Lacy J., Kortmansky J.S., Thumar J., Chiorean E.G., Shields A.F., Behl D., et al. Phase IB/II Randomized Study of FOLFIRINOX Plus Pegylated Recombinant Human Hyaluronidase Versus FOLFIRINOX Alone in Patients With Metastatic Pancreatic Adenocarcinoma: SWOG S1313. J. Clin. Oncol. 2019;37:1062–1069. doi: 10.1200/JCO.18.01295.
- 13. Van Cutsem E., Tempero M.A., Sigal D., Oh D.Y., Fazio N., Macarulla T., Hitre E., Hammel P., Hendifar A.E., Bates S.E., et al. Randomized Phase III Trial of Pegvorhyaluronidase Alfa With Nab-Paclitaxel Plus Gemcitabine for Patients With Hyaluronan-High Metastatic Pancreatic Adenocarcinoma. J. Clin. Oncol. 2020;38:3185–3194. doi: 10.1200/JCO.20.00590.
- 14. Ko A.H., LoConte N., Tempero M.A., Walker E.J., Kate Kelley R., Lewis S., Chang W.C., Kantoff E., Vannier M.W., Catenacci D.V., et al. A Phase I Study of FOLFIRINOX Plus IPI-926, a Hedgehog Pathway Inhibitor, for Advanced Pancreatic Adenocarcinoma. Pancreas. 2016;45:370–375. doi: 10.1097/MPA.0000000000000458.
- 15. Özdemir B.C., Pentcheva-Hoang T., Carstens J.L., Zheng X., Wu C.C., Simpson T.R., Laklai H., Sugimoto H., Kahlert C., Novitskiy S.V., et al. Depletion of Carcinoma-Associated Fibroblasts and Fibrosis Induces Immunosuppression and Accelerates Pancreas Cancer with Reduced Survival. Cancer Cell. 2014;25:719–734. doi: 10.1016/j.ccr.2014.04.005.
- 16. Moffitt R.A., Marayati R., Flate E.L., Volmar K.E., Loeza S.G.H., Hoadley K.A., Rashid N.U., Williams L.A., Eaton S.C., Chung A.H., et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat. Genet. 2015;47:1168–1178. doi: 10.1038/ng.3398.
- 17. Maurer C., Holmstrom S.R., He J., Laise P., Su T., Ahmed A., Hibshoosh H., Chabot J.A., Oberstein P.E., Sepulveda A.R., et al. Experimental microdissection enables functional harmonization of pancreatic cancer subtypes. Gut. 2019;68:1034–1043. doi: 10.1136/gutjnl-2018-317706.
- 18. Elyada E., Bolisetty M., Laise P., Flynn W.F., Courtois E.T., Burkhart R.A., Teinor J.A., Belleau P., Biffi G., Lucito M.S., et al. Cross-species single-cell analysis of pancreatic ductal adenocarcinoma reveals antigen-presenting cancer-associated fibroblasts. Cancer Discov. 2019;9:1102–1123. doi: 10.1158/2159-8290.CD-19-0094.
- 19. Öhlund D., Handly-Santana A., Biffi G., Elyada E., Almeida A.S., Ponz-Sarvise M., Corbo V., Oni T.E., Hearn S.A., Lee E.J., et al. Distinct populations of inflammatory fibroblasts and myofibroblasts in pancreatic cancer. J. Exp. Med. 2017;214:579–596. doi: 10.1084/jem.20162024.
- 20. Chen K., Wang Q., Li M., Guo H., Liu W., Wang F., Tian X., Yang Y. Single-cell RNA-seq reveals dynamic change in tumor microenvironment during pancreatic ductal adenocarcinoma malignant progression. EBioMedicine. 2021;66 doi: 10.1016/j.ebiom.2021.103315.
- 21. Wang Y., Liang Y., Xu H., Zhang X., Mao T., Cui J., Yao J., Wang Y., Jiao F., Xiao X., et al. Single-cell analysis of pancreatic ductal adenocarcinoma identifies a novel fibroblast subtype associated with poor prognosis but better immunotherapy response. Cell Discov. 2021;7:36. doi: 10.1038/s41421-021-00271-4.
- 22. Oh K., Yoo Y.J., Torre-Healy L.A., Rao M., Fassler D., Wang P., Caponegro M., Gao M., Kim J., Sasson A., et al. Coordinated single-cell tumor microenvironment dynamics reinforce pancreatic cancer subtype. Nat. Commun. 2023;14:5226. doi: 10.1038/s41467-023-40895-6.
- 23. Veghini L., Pasini D., Fang R., Delfino P., Filippini D., Neander C., Vicentini C., Fiorini E., Lupo F., D'Agosto S.L., et al. Differential activity of MAPK signalling defines fibroblast subtypes in pancreatic cancer. Nat. Commun. 2024;15 doi: 10.1038/s41467-024-54975-8.
- 24. Krishnamurty A.T., Shyer J.A., Thai M., Gandham V., Buechler M.B., Yang Y.A., Pradhan R.N., Wang A.W., Sanchez P.L., Qu Y., et al. LRRC15+ myofibroblasts dictate the stromal setpoint to suppress tumour immunity. Nature. 2022;611:148–154. doi: 10.1038/s41586-022-05272-1.
- 25. Dominguez C.X., Müller S., Keerthivasan S., Koeppen H., Hung J., Gierke S., Breart B., Foreman O., Bainbridge T.W., Castiglioni A., et al. Single-Cell RNA Sequencing Reveals Stromal Evolution into LRRC15+ Myofibroblasts as a Determinant of Patient Response to Cancer Immunotherapy. Cancer Discov. 2020;10:232–253. doi: 10.1158/2159-8290.CD-19-0644.
- 26. Pei G., Min J., Rajapakshe K.I., Branchi V., Liu Y., Selvanesan B.C., Thege F., Sadeghian D., Zhang D., Cho K.S., et al. Spatial mapping of transcriptomic plasticity in metastatic pancreatic cancer. Nature. 2025;642:212–221. doi: 10.1038/s41586-025-08927-x.
- 27. Grünwald B.T., Devisme A., Andrieux G., Vyas F., Aliar K., McCloskey C.W., Macklin A., Jang G.H., Denroche R., Romero J.M., et al. Spatially confined sub-tumor microenvironments in pancreatic cancer. Cell. 2021;184:5577–5592.e18. doi: 10.1016/j.cell.2021.09.022.
- 28. Ogawa Y., Masugi Y., Abe T., Yamazaki K., Ueno A., Fujii-Nishimura Y., Hori S., Yagi H., Abe Y., Kitago M., Sakamoto M. Three Distinct Stroma Types in Human Pancreatic Cancer Identified by Image Analysis of Fibroblast Subpopulations and Collagen. Clin. Cancer Res. 2021;27:107–119. doi: 10.1158/1078-0432.CCR-20-2298.
- 29. Hwang W.L., Jagadeesh K.A., Guo J.A., Hoffman H.I., Yadollahpour P., Reeves J.W., Mohan R., Drokhlyansky E., Van Wittenberghe N., Ashenberg O., et al. Single-nucleus and spatial transcriptome profiling of pancreatic cancer identifies multicellular dynamics associated with neoadjuvant treatment. Nat. Genet. 2022;54:1178–1191. doi: 10.1038/s41588-022-01134-8.
- 30. Biffi G., Oni T.E., Spielman B., Hao Y., Elyada E., Park Y., Preall J., Tuveson D.A. IL1-Induced JAK/STAT Signaling Is Antagonized by TGFβ to Shape CAF Heterogeneity in Pancreatic Ductal Adenocarcinoma. Cancer Discov. 2019;9:282–301. doi: 10.1158/2159-8290.CD-18-0710.
- 31. Mace T.A., Shakya R., Pitarresi J.R., Swanson B., McQuinn C.W., Loftus S., Nordquist E., Cruz-Monserrate Z., Yu L., Young G., et al. IL-6 and PD-L1 antibody blockade combination therapy reduces tumour progression in murine models of pancreatic cancer. Gut. 2018;67:320–332. doi: 10.1136/gutjnl-2016-311585.
- 32. Donahue K.L., Watkoske H.R., Kadiyala P., Du W., Brown K., Scales M.K., Elhossiny A.M., Espinoza C.E., Lasse Opsahl E.L., Griffith B.D., et al. Oncogenic KRAS-Dependent Stromal Interleukin-33 Directs the Pancreatic Microenvironment to Promote Tumor Growth. Cancer Discov. 2024;14:1964–1989. doi: 10.1158/2159-8290.CD-24-0100.
- 33. Steele N.G., Biffi G., Kemp S.B., Zhang Y., Drouillard D., Syu L., Hao Y., Oni T.E., Brosnan E., Elyada E., et al. Inhibition of Hedgehog Signaling Alters Fibroblast Composition in Pancreatic Cancer. Clin. Cancer Res. 2021;27:2023–2037. doi: 10.1158/1078-0432.CCR-20-3715.
- 34. Zhang Y., Yan W., Collins M.A., Bednar F., Rakshit S., Zetter B.R., Stanger B.Z., Chung I., Rhim A.D., di Magliano M.P. Interleukin-6 Is Required for Pancreatic Cancer Progression by Promoting MAPK Signaling Activation and Oxidative Stress Resistance. Cancer Res. 2013;73:6359–6374. doi: 10.1158/0008-5472.CAN-13-1558-T.
- 35. Leary J.R., Xu Y., Morrison A.B., Jin C., Shen E.C., Kuhlers P.C., Su Y., Rashid N.U., Yeh J.J., Peng X.L. Sub-Cluster Identification through Semi-Supervised Optimization of Rare-Cell Silhouettes (SCISSORS) in single-cell RNA-sequencing. Bioinformatics. 2023;39 doi: 10.1093/bioinformatics/btad449.
- 36. Cao L., Huang C., Cui Zhou D., Hu Y., Lih T.M., Savage S.R., Krug K., Clark D.J., Schnaubelt M., Chen L., et al. Proteogenomic characterization of pancreatic ductal adenocarcinoma. Cell. 2021;184:5031–5052.e26. doi: 10.1016/j.cell.2021.08.023.
- 37. Dijk F., Veenstra V.L., Soer E.C., Dings M.P.G., Zhao L., Halfwerk J.B., Hooijer G.K., Damhofer H., Marzano M., Steins A., et al. Unsupervised class discovery in pancreatic ductal adenocarcinoma reveals cell-intrinsic mesenchymal features and high concordance between existing classification systems. Sci. Rep. 2020;10:337. doi: 10.1038/s41598-019-56826-9.
- 38. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell. 2017;32:185–203.e13. doi: 10.1016/j.ccell.2017.07.007.
- 39. Rashid N.U., Peng X.L., Jin C., Moffitt R.A., Volmar K.E., Belt B.A., Panni R.Z., Nywening T.M., Herrera S.G., Moore K.J., et al. Purity Independent Subtyping of Tumors (PurIST), a clinically robust, single sample classifier for tumor subtyping in pancreatic cancer. Clin. Cancer Res. 2020;26:82–92. doi: 10.1158/1078-0432.CCR-19-1467.
- 40. Afsari B., Braga-Neto U.M., Geman D. Rank discriminants for predicting phenotypes from RNA expression. Ann. Appl. Stat. 2014;8:1469–1491.
- 41. Patil P., Bachant-Winner P.-O., Haibe-Kains B., Leek J.T. Test set bias affects reproducibility of gene signatures. Bioinformatics. 2015;31:2318–2323. doi: 10.1093/bioinformatics/btv157.
- 42. Leek J.T. The tspair package for finding top scoring pair classifiers in R. Bioinformatics. 2009;25:1203–1204. doi: 10.1093/bioinformatics/btp126.
- 43. Bailey P., Chang D.K., Nones K., Johns A.L., Patch A.M., Gingras M.C., Miller D.K., Christ A.N., Bruxner T.J.C., Quinn M.C., et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 2016;531:47–52. doi: 10.1038/nature16965.
- 44. Puleo F., Nicolle R., Blum Y., Cros J., Marisa L., Demetter P., Quertinmont E., Svrcek M., Elarouci N., Iovanna J., et al. Stratification of Pancreatic Ductal Adenocarcinomas Based on Tumor and Microenvironment Features. Gastroenterology. 2018;155:1999–2013.e3. doi: 10.1053/j.gastro.2018.08.033.
- 45. Hayashi A., Fan J., Chen R., Ho Y.J., Makohon-Moore A.P., Lecomte N., Zhong Y., Hong J., Huang J., Sakamoto H., et al. A unifying paradigm for transcriptional heterogeneity and squamous features in pancreatic ductal adenocarcinoma. Nat. Cancer. 2020;1:59–74. doi: 10.1038/s43018-019-0010-1.
- 46. Frost H.R. Variance-adjusted Mahalanobis (VAM): a fast and accurate method for cell-specific gene set scoring. Nucleic Acids Res. 2020;48:e94. doi: 10.1093/nar/gkaa582.
- 47. Peng X.L., Moffitt R.A., Torphy R.J., Volmar K.E., Yeh J.J. De novo compartment deconvolution and weight estimation of tumor samples using DECODER. Nat. Commun. 2019;10:4729. doi: 10.1038/s41467-019-12517-7.
- 48. Chan-Seng-Yue M., Kim J.C., Wilson G.W., Ng K., Figueroa E.F., O'Kane G.M., Connor A.A., Denroche R.E., Grant R.C., McLeod J., et al. Transcription phenotypes of pancreatic cancer are driven by genomic events during tumor evolution. Nat. Genet. 2020;52:231–240. doi: 10.1038/s41588-019-0566-9.
- 49. Suurmeijer J.A., Soer E.C., Dings M.P.G., Kim Y., Strijker M., Bonsing B.A., Brosens L.A.A., Busch O.R., Groen J.V., Halfwerk J.B.G., et al. Impact of classical and basal-like molecular subtypes on overall survival in resected pancreatic cancer in the SPACIOUS-2 multicentre study. Br. J. Surg. 2022;109:1150–1155. doi: 10.1093/bjs/znac272.
- 50. Lee J.J., Kearney J.F., Trembath H.E., Hariharan A., LaBella M.E., Kharitonova E.V., Chan P.S., Morrison A.B., Cliff A., Meyers M.O., et al. Tumor-intrinsic and Cancer-associated Fibroblast Subtypes Independently Predict Outcomes in Pancreatic Cancer. Ann. Surg. 2024;280:659–666. doi: 10.1097/SLA.0000000000006416.
- 51.Efremova M., Vento-Tormo M., Teichmann S.A., Vento-Tormo R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat. Protoc. 2020;15:1484–1506. doi: 10.1038/s41596-020-0292-x. [DOI] [PubMed] [Google Scholar]
- 52.Herrera J.A., Dingle L., Montero M.A., Venkateswaran R.V., Blaikley J.F., Lawless C., Schwartz M.A. The UIP/IPF fibroblastic focus is a collagen biosynthesis factory embedded in a distinct extracellular matrix. JCI Insight. 2022;7 doi: 10.1172/jci.insight.156115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Loveless I.M., Kemp S.B., Hartway K.M., Mitchell J.T., Wu Y., Zwernik S.D., Salas-Escabillas D.J., Brender S., George M., Makinwa Y., et al. Human Pancreatic Cancer Single-Cell Atlas Reveals Association of CXCL10+ Fibroblasts and Basal Subtype Tumor Cells. Clin. Cancer Res. 2025;31:756–772. doi: 10.1158/1078-0432.CCR-24-2183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Luo H., Xia X., Huang L.B., An H., Cao M., Kim G.D., Chen H.N., Zhang W.H., Shu Y., Kong X., et al. Pan-cancer single-cell analysis reveals the heterogeneity and plasticity of cancer-associated fibroblasts in the tumor microenvironment. Nat. Commun. 2022;13:6619. doi: 10.1038/s41467-022-34395-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Galbo P.M., Zang X., Zheng D. Molecular Features of Cancer-associated Fibroblast Subtypes and their Implication on Cancer Pathogenesis, Prognosis, and Immunotherapy Resistance. Clin. Cancer Res. 2021;27:2636–2647. doi: 10.1158/1078-0432.CCR-20-4226. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Liu J. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018;173:400–416.e11. doi: 10.1016/j.cell.2018.02.052. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Robertson A.G., Kim J., Al-Ahmadie H., Bellmunt J., Guo G., Cherniack A.D., Hinoue T., Laird P.W., Hoadley K.A., Akbani R., et al. Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer. Cell. 2017;171:540–556.e25. doi: 10.1016/j.cell.2017.09.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Perryman L., Gray S.G. Fibrosis in Mesothelioma: Potential Role of Lysyl Oxidases. Cancers. 2022;14:981. doi: 10.3390/cancers14040981. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Lindskrog S.V., Prip F., Lamy P., Taber A., Groeneveld C.S., Birkenkamp-Demtröder K., Jensen J.B., Strandgaard T., Nordentoft I., Christensen E., et al. An integrated multi-omics analysis identifies prognostic molecular subtypes of non-muscle-invasive bladder cancer. Nat. Commun. 2021;12:2301. doi: 10.1038/s41467-021-22465-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Newman A.M., Liu C.L., Green M.R., Gentles A.J., Feng W., Xu Y., Hoang C.D., Diehn M., Alizadeh A.A. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods. 2015;12:453–457. doi: 10.1038/nmeth.3337. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Koh C.-H., Lee S., Kwak M., Kim B.-S., Chung Y. CD8 T-cell subsets: heterogeneity, functions, and therapeutic potential. Exp. Mol. Med. 2023;55:2287–2299. doi: 10.1038/s12276-023-01105-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Tanaka A., Sakaguchi S. Regulatory T cells in cancer immunotherapy. Cell Res. 2017;27:109–118. doi: 10.1038/cr.2016.151. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Luca B.A., Steen C.B., Matusiak M., Azizi A., Varma S., Zhu C., Przybyl J., Espín-Pérez A., Diehn M., Alizadeh A.A., et al. Atlas of clinically distinct cell states and ecosystems across human solid tumors. Cell. 2021;184:5482–5496.e28. doi: 10.1016/j.cell.2021.09.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Nywening T.M., Wang-Gillam A., Sanford D.E., Belt B.A., Panni R.Z., Cusworth B.M., Toriola A.T., Nieman R.K., Worley L.A., Yano M., et al. Targeting tumour-associated macrophages with CCR2 inhibition in combination with FOLFIRINOX in patients with borderline resectable and locally advanced pancreatic cancer: a single-centre, open-label, dose-finding, non-randomised, phase 1b trial. Lancet Oncol. 2016;17:651–662. doi: 10.1016/S1470-2045(16)00078-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Mariathasan S., Turley S.J., Nickles D., Castiglioni A., Yuen K., Wang Y., Kadel E.E., III, Koeppen H., Astarita J.L., Cubas R., et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature. 2018;554:544–548. doi: 10.1038/nature25501. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.McDermott D.F., Huseni M.A., Atkins M.B., Motzer R.J., Rini B.I., Escudier B., Fong L., Joseph R.W., Pal S.K., Reeves J.A., et al. Clinical activity and molecular correlates of response to atezolizumab alone or in combination with bevacizumab versus sunitinib in renal cell carcinoma. Nat. Med. 2018;24:749–757. doi: 10.1038/s41591-018-0053-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Mucciolo G., Araos Henríquez J., Jihad M., Pinto Teles S., Manansala J.S., Li W., Ashworth S., Lloyd E.G., Cheng P.S.W., Luo W., et al. EGFR-activated myofibroblasts promote metastasis of pancreatic cancer. Cancer Cell. 2023;42:101. doi: 10.1016/j.ccell.2023.12.002. [DOI] [PubMed] [Google Scholar]
- 68.Chen Y., McAndrews K.M., Kalluri R. Clinical and therapeutic relevance of cancer-associated fibroblasts. Nat. Rev. Clin. Oncol. 2021;18:792–804. doi: 10.1038/s41571-021-00546-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Qiang L., Hoffman M.T., Ali L.R., Castillo J.I., Kageler L., Temesgen A., Lenehan P., Wang S.J., Bello E., Cardot-Ruffino V., et al. Transforming Growth Factor-β Blockade in Pancreatic Cancer Enhances Sensitivity to Combination Chemotherapy. Gastroenterology. 2023;165:874–890.e10. doi: 10.1053/j.gastro.2023.05.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Bianchi A., De Castro Silva I., Deshpande N.U., Singh S., Mehra S., Garrido V.T., Guo X., Nivelo L.A., Kolonias D.S., Saigh S.J., et al. Cell-Autonomous Cxcl1 Sustains Tolerogenic Circuitries and Stromal Inflammation via Neutrophil-Derived TNF in Pancreatic Cancer. Cancer Discov. 2023;13:1428–1453. doi: 10.1158/2159-8290.CD-22-1046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Abrego J., Sanford-Crane H., Oon C., Xiao X., Betts C.B., Sun D., Nagarajan S., Diaz L., Sandborg H., Bhattacharyya S., et al. A Cancer Cell–Intrinsic GOT2–PPARδ Axis Suppresses Antitumor Immunity. Cancer Discov. 2022;12:2414–2433. doi: 10.1158/2159-8290.CD-22-0661. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Carpenter E.S., Elhossiny A.M., Kadiyala P., Li J., McGue J., Griffith B.D., Zhang Y., Edwards J., Nelson S., Lima F., et al. Analysis of Donor Pancreata Defines the Transcriptomic Signature and Microenvironment of Early Neoplastic Lesions. Cancer Discov. 2023;13:1324–1345. doi: 10.1158/2159-8290.CD-23-0013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Dimitrieva S., Harrison J.M., Chang J., Piquet M., Mino-Kenudson M., Gabriel M., Sagar V., Horn H., Lage K., Kim J., et al. Dynamic Evolution of Fibroblasts Revealed by Single-Cell RNA Sequencing of Human Pancreatic Cancer. Cancer Res. Commun. 2024;4:3049–3066. doi: 10.1158/2767-9764.CRC-23-0489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Patro R., Duggal G., Love M.I., Irizarry R.A., Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14:417–419. doi: 10.1038/nmeth.4197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Wilkerson M.D., Hayes D.N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26:1572–1573. doi: 10.1093/bioinformatics/btq170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Afsari B., Fertig E.J., Geman D., Marchionni L. switchBox: an R package for k-Top Scoring Pairs classifier development. Bioinformatics. 2015;31:273–274. doi: 10.1093/bioinformatics/btu622. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Breheny P., Huang J. COORDINATE DESCENT ALGORITHMS FOR NONCONVEX PENALIZED REGRESSION, WITH APPLICATIONS TO BIOLOGICAL FEATURE SELECTION. Ann. Appl. Stat. 2011;5:232–253. doi: 10.1214/10-AOAS388. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Pérez-Fernández S., Martínez-Camblor P., Filzmoser P., Corral N. nsROC: An R package for Non-Standard ROC Curve Analysis. The R Journal. 2018;10(2):55–77. https://digitalcommons.unl.edu/r-journal/356 [Google Scholar]
- 79.Therneau T., Grambsch P. Springer; 2000. Modeling Survival Data: Extending the Cox Model. [Google Scholar]
- 80.Hao Y., Hao S., Andersen-Nissen E., Mauck W.M., 3rd, Zheng S., Butler A., Lee M.J., Wilk A.J., Darby C., Zager M., et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Hao Y., Stuart T., Kowalski M.H., Choudhary S., Hoffman P., Hartman A., Srivastava A., Molla G., Madad S., Fernandez-Granda C., Satija R. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 2024;42:293–304. doi: 10.1038/s41587-023-01767-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Jin S., Guerrero-Juarez C.F., Zhang L., Chang I., Ramos R., Kuan C.H., Myung P., Plikus M.V., Nie Q. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 2021;12:1088. doi: 10.1038/s41467-021-21246-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Bankhead P., Loughrey M.B., Fernández J.A., Dombrowski Y., McArt D.G., Dunne P.D., McQuaid S., Gray R.T., Murray L.J., Coleman H.G., et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 2017;7 doi: 10.1038/s41598-017-17204-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Kamoun A., de Reyniès A., Allory Y., Sjödahl G., Robertson A.G., Seiler R., Hoadley K.A., Groeneveld C.S., Al-Ahmadie H., Choi W., et al. A Consensus Molecular Classification of Muscle-invasive Bladder Cancer. Eur. Urol. 2020;77:420–433. doi: 10.1016/j.eururo.2019.09.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Geman D., d’Avignon C., Naiman D.Q., Winslow R.L. Classifying gene expression profiles from pairwise mRNA comparisons. Stat. Appl. Genet. Mol. Biol. 2004;3 doi: 10.2202/1544-6115.1071. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Martínez-Camblor P. Fully non-parametric receiver operating characteristic curve estimation for random-effects meta-analysis. Stat. Methods Med. Res. 2017;26:5–20. doi: 10.1177/0962280214537047. [DOI] [PubMed] [Google Scholar]
- 87.Chen B., Khodadoust M.S., Liu C.L., Newman A.M., Alizadeh A.A. Profiling tumor infiltrating immune cells with CIBERSORT. Methods Mol. Biol. 2018;1711:243–259. doi: 10.1007/978-1-4939-7493-1_12. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
(A) SCISSORS CAF genes. (B) DeCAF genes and coefficients.
(A) Summaries of bulk datasets. (B) Overall survival analysis of different CAF subtyping schemas.
(A) PDAC. (B) Notes for survival analysis in PDAC. (C) TCGA MESO. (D) TCGA BLCA. (E) TCGA KIRC. (F) UROMOL. (G) IMmotion150. (H) IMvigor210.
(A) Univariate overall survival analysis. (B) Association analysis with molecular subtypes.
(A) p value. (B) Means. (C) Manually corrected ligand-receptor pairs.
Data Availability Statement
- Newly generated datasets have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE311789, with GSE310957 (UNC-bulk), GSE311788 (UNC-sc), and GSE311783 (UNC-st). Public bulk datasets used in this study are summarized in Table S2.
- The DeCAF classifier is available as a GitHub repository at https://github.com/jjyeh-unc/decaf. All scripts used to generate the results and figures are available at https://github.com/jjyeh-unc/decaf_manuscript.
- Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
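For readers who want to locate the deposited datasets programmatically, the following is a minimal sketch, not part of the authors' pipeline. It only builds the public GEO landing-page URL for each accession listed above. The helper name `geo_url`, the dictionary `ACCESSIONS`, and the label `combined` for GSE311789 are assumptions introduced for illustration.

```python
from urllib.parse import urlencode

# Public GEO accession viewer endpoint.
GEO_QUERY_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Accessions as listed in the Data Availability Statement.
# The keys are illustrative labels; "combined" is an assumed name for GSE311789.
ACCESSIONS = {
    "combined": "GSE311789",
    "UNC-bulk": "GSE310957",
    "UNC-sc": "GSE311788",
    "UNC-st": "GSE311783",
}


def geo_url(accession: str) -> str:
    """Return the GEO landing-page URL for a GEO series (GSE) accession."""
    # Reject anything that is not "GSE" followed by digits.
    if not (accession.startswith("GSE") and accession[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"{GEO_QUERY_BASE}?{urlencode({'acc': accession})}"


if __name__ == "__main__":
    # Print one tab-separated line per dataset: label, accession, URL.
    for label, accession in ACCESSIONS.items():
        print(f"{label}\t{accession}\t{geo_url(accession)}")
```

The sketch performs no network access; it only formats URLs, so whether a record is publicly released at a given time still has to be checked on GEO itself.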







