Nature Communications. 2022 Mar 28;13:1642. doi: 10.1038/s41467-022-29164-0

Single-cell transcriptomic analysis suggests two molecularly distinct subtypes of intrahepatic cholangiocarcinoma

Guohe Song 1,#, Yang Shi 2,#, Lu Meng 3,#, Jiaqiang Ma 1,3,#, Siyuan Huang 4, Juan Zhang 1, Yingcheng Wu 1, Jiaxin Li 4, Youpei Lin 1, Shuaixi Yang 1, Dongning Rao 1, Yifei Cheng 1, Jian Lin 5, Shuyi Ji 5, Yuming Liu 1, Shan Jiang 3, Xiaoliang Wang 6, Shu Zhang 1, Aiwu Ke 1, Xiaoying Wang 1, Ya Cao 7, Yuan Ji 8, Jian Zhou 1,9, Jia Fan 1,9, Xiaoming Zhang 3, Ruibin Xi 2, Qiang Gao 1
PMCID: PMC8960779  PMID: 35347134

Abstract

Intrahepatic cholangiocarcinoma (iCCA) is a highly heterogeneous cancer whose classification and tumor microenvironment remain poorly understood. Here, by performing single-cell RNA sequencing on 144,878 cells from 14 pairs of iCCA tumors and non-tumor liver tissues, we find that S100P and SPP1 are markers for the perihilar large duct type (iCCAphl) and the peripheral small duct type (iCCApps) of iCCA, respectively. S100P+SPP1− iCCAphl has significantly fewer infiltrating CD4+ T cells and CD56+ NK cells, and more CCL18+ macrophages and PD1+CD8+ T cells, than S100P−SPP1+ iCCApps. The transcription factor CREB3L1 is identified as a regulator of S100P expression that promotes tumor cell invasion. S100P−SPP1+ iCCApps has significantly more SPP1+ macrophage infiltration, is less aggressive, and has better survival than S100P+SPP1− iCCAphl. Moreover, S100P−SPP1+ iCCApps harbors tumor cells in different states of differentiation, such as ALB+ hepatocyte differentiation and ID3+ stemness. Our study extends the understanding of the diversity of tumor cells in iCCA.

Subject terms: Cancer genomics, Tumour biomarkers, Tumour heterogeneity, Bile duct cancer


The molecular classification and tumour microenvironment in intrahepatic cholangiocarcinoma (iCCA) need further characterisation. Here, the authors perform single cell RNA-sequencing from 14 pairs of iCCA tumours and non-tumour liver tissues and propose S100P and SPP1 as markers for patient classification.

Introduction

Intrahepatic cholangiocarcinoma (iCCA) is the second most common primary liver malignancy after hepatocellular carcinoma, with poor outcome and rising incidence globally1. As a highly heterogeneous disease, iCCA can originate from cholangiocytes located at any point of a biliary tree above the second-order bile ducts. Recently, the World Health Organization and European Network for the Study of Cholangiocarcinoma have recognized that iCCA can be classified into two main histologically distinct subtypes, including perihilar large duct type (iCCAphl) and peripheral small duct type (iCCApps), according to the level or size of the affected bile duct2,3. Indeed, emerging evidence has indicated that the two histological subtypes of iCCA harbored distinct cellular origins and pathogenesis4.

Generally, iCCAphl is considered to be derived from large intrahepatic bile ducts and is mainly composed of mucin-producing cholangiocytes. This subtype of iCCA is characterized by mucus hypersecretion and has higher lymph node metastasis rates and worse survival5 than iCCApps. MUC5AC, one of the main components of mucus, is frequently overexpressed in iCCAphl and associated with aggressive tumor behavior6. S100P, a member of the S100 family of EF-hand calcium-binding proteins, which are highly expressed in various types of cancer and play crucial roles in tumor progression7, is also upregulated in mucin-producing iCCAs and has been suggested as an important marker8,9 for iCCAphl. In contrast, iCCApps is commonly believed to originate from small intrahepatic bile ducts with no or minimal mucin production. iCCApps expresses CDH2 more frequently than iCCAphl and presents distinctive clinical and molecular features10. Moreover, NCAM, a marker of hepatic progenitor cells, is also expressed in iCCApps, as well as in cholangiolocellular carcinoma (CLC), which is thought to originate from canals of Hering/bile ductules3,5. Although the two subtypes of iCCA display significant differences in mucin production, the shape of tumor cells, and patient prognosis3,4, there is no consensus on a definitive panel of markers to distinguish them, and our knowledge of their biological, molecular, and therapeutic differences is still limited. Single-cell RNA sequencing (scRNA-seq) is a powerful technology for cancer research. Previous scRNA-seq studies have reported the complexity of the tumor microenvironment in iCCAs without taking the histological classification into account, and so may not accurately reflect the diversity of this tumor11,12.

Here, we identify and independently validate S100P and SPP1 as optimal discriminatory biomarkers for iCCAphl and iCCApps. Compared with S100P−SPP1+ iCCApps, S100P+SPP1− iCCAphl has increased CCL18+ macrophage infiltration, fewer SPP1+ macrophages, a more aggressive phenotype, and a worse prognosis. Our data further the understanding of the diversity of tumor cells in iCCA.

Results

Single-cell profiling of the tumor ecosystem in iCCA

We applied scRNA-seq and whole-exome sequencing (WES) to tumor and paired adjacent non-tumor liver tissues from fourteen treatment-naïve iCCA patients (Fig. 1a and Supplementary Fig. 1a). All tumors were negative for Hep-Par1 and Arg-1 (specific markers for hepatocellular carcinoma) expression (Supplementary Fig. 1b). The patient clinicopathological characteristics are presented in Supplementary Data 1. We obtained single-cell transcriptomes for 144,878 cells after quality control. Thirteen main cell clusters were identified by the expression of known marker genes: epithelial cells, monocytes, macrophages, dendritic cells (DC), natural killer (NK) cells, CD4+ T cells, regulatory T cells (Treg), CD8+ T cells, mucosal-associated invariant T (MAIT) cells, B cells, plasma B cells, fibroblasts, and endothelial cells (Fig. 1b). In total, we identified 23,667 malignant cells by inferring large-scale copy number variations (CNVs) in epithelial cells with high expression of KRT19 (Fig. 1c and Supplementary Fig. 1c). Consistent with previous findings in other tumors, malignant cells showed strong intertumoral heterogeneity and formed patient-specific clusters13,14 (Fig. 1d). Infiltrating immune cells also differed significantly among patients and between tumor and peri-tumor tissues (Supplementary Fig. 1d, e). For example, macrophages, CD4+ T cells, and Tregs were enriched in the tumor, while MAIT cells were mainly distributed in the adjacent liver tissues (Fig. 1e).
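
The CNV inference mentioned above rests on smoothing expression along the genome; the legend of Fig. 1c describes averaging over 100-gene stretches. The following is a minimal sketch of that smoothing step only, assuming genes are already ordered by chromosomal position and expression has been centered against reference cells. The function name `smooth_expression` and the toy values are illustrative and are not taken from the authors' pipeline.

```python
def smooth_expression(values, window=100):
    """Moving average over `window` consecutive genes ordered by position.

    Windows are truncated at the chromosome ends, so the output has the
    same length as the input.
    """
    n = len(values)
    half = window // 2
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Toy profile of 10 genes in which genes 3-6 are amplified (hypothetical values).
profile = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
smoothed = smooth_expression(profile, window=3)
print(smoothed)
```

A short window is used here only so that the toy profile stays readable; with real data the window would span on the order of 100 genes, as stated in the legend.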

Fig. 1. ScRNA-seq profiling of 14 iCCAs.

a Schematic representation of the experimental strategy. WES whole-exome sequencing, TMA tissue microarray. Part of the picture was adapted from motifolio.com. b Heatmap showing the expression of marker genes in the indicated cell types. c Chromosomal landscape of inferred large-scale copy number variations (CNVs) in nonmalignant epithelial cells (top) and malignant cells from 14 iCCA samples. Rows represent individual cells and columns represent chromosomal positions. Amplifications (red) or deletions (blue) were inferred by averaging expression over 100-gene stretches on the respective chromosomes. d t-SNE plot of malignant and nonmalignant cells from 14 iCCAs. e Boxplot showing the fraction of nonmalignant cells in peri-tumor and tumor. (Peri-tumor n = 14, Tumor n = 14; **P < 0.01; two-sided Wilcoxon matched-pairs signed-rank test; Macrophage: P = 0.00012; CD4: P = 0.0012; Treg: P = 0.00012; MAIT: P = 0.00012; Fibroblast: P = 0.0017; Endothelial: P = 0.0012). The central mark indicates the median, and the bottom and top edges of the box indicate the first and third quartiles, respectively. The top and bottom whiskers extend the boxes to a maximum of 1.5 times the interquartile range. Source data are provided as a Source Data file.
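
Several P values in panel e equal 0.00012. This is the value an exact two-sided Wilcoxon signed-rank test gives when all 14 paired differences have the same sign, because only 2 of the 2^14 possible sign assignments are that extreme. The sketch below computes the exact P value by dynamic programming. It assumes no zero differences and no tied absolute differences; the function name and the toy differences are illustrative, not the authors' code.

```python
def signed_rank_exact_p(diffs):
    """Exact two-sided Wilcoxon signed-rank P value.

    Assumes no zero differences and no ties among absolute differences.
    """
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # Sum of the ranks (1-based) that belong to positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    max_sum = n * (n + 1) // 2
    # counts[s] = number of sign assignments whose positive ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    w = min(w_plus, max_sum - w_plus)
    tail = sum(counts[: w + 1])
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical tumor-minus-peri-tumor differences, all positive, for 14 pairs.
diffs = [0.01 * (k + 1) for k in range(14)]
p_value = signed_rank_exact_p(diffs)
print(p_value)  # 2 / 2**14, which rounds to 0.00012
```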

SPP1 is a representative marker for iCCApps

To explore the subtypes of iCCA with different cell origins at the single-cell level, we examined the expression of previously proposed markers of iCCAphl (S100P, MUC5AC, and MUC6) and iCCApps (NCAM1 and CDH2) in malignant cells2,3 (Fig. 2a and Supplementary Fig. 2a). We found that 7 out of 14 iCCAs (P02, P03, P04, P06, P16, P17, and P18) exhibited high expression of iCCAphl markers such as S100P and MUC5AC, indicating their origin from large intrahepatic bile ducts. Notably, S100P+ cells accounted for 91.14% of total tumor cells from these seven iCCAs, so S100P was more representative and more extensively expressed than the other markers (MUC5AC: 42.37%, and MUC6: 22.97%). The 14 iCCAs can be divided into two groups based on S100P expression, which was confirmed by immunohistochemistry (Supplementary Fig. 2b, c). The remaining seven S100P− iCCAs (P09, P10, P12, P13, P14, P15, and P19) expressed the iCCApps markers NCAM1 and CDH2, which were mutually exclusive with the expression of S100P, confirming the different origins of these tumor cells. NESTIN, which has been proposed as a possible biomarker for diagnosing combined hepatocellular carcinoma-intrahepatic cholangiocarcinoma (cHCC-ICC)15, was mostly expressed in S100P− iCCA cases, suggesting possible similarities between cHCC-ICC and S100P− iCCA. However, NCAM1+ (2.36%) and CDH2+ (31.86%) cells accounted for a low proportion of the total tumor cells in these seven S100P− iCCAs. To find more representative markers for iCCApps, we searched for genes mutually exclusive with S100P but expressed extensively in iCCApps. Genes such as SPP1 had low expression in S100P+ cells and high expression in S100P− cells, making them potential biomarkers (Fig. 2b and Supplementary Data 2). SPP1, also known as osteopontin (OPN), is highly expressed in a variety of tumors and plays important roles in tumor progression and tumor cell evolution in response to therapy16,17. We confirmed that the seven S100P− iCCAs showed high expression of SPP1 at both the cellular (87.17% of tumor cells in S100P− iCCAs) and tissue levels (Fig. 2c and Supplementary Fig. 2d). Thus, we divided the 14 iCCAs into S100P+SPP1− iCCAphl and S100P−SPP1+ iCCApps subgroups based on the expression of S100P and SPP1.

Fig. 2. iCCA can be classified into two subtypes according to the expression of S100P and SPP1.

a t-SNE plot showing the expression level of S100P in malignant cells. b Proportion of positive cells with gene expression in S100P + (x-axis) and S100P- cells (y-axis). c t-SNE plot showing the expression level of SPP1 in malignant cells. d Representative images of immunohistochemical expression of S100P and SPP1 in iCCAs from TMA cohort (n = 201). Patient 1: S100P +  SPP1−, Patient 2: S100P-SPP1+. Scale bar, 100 μm. The experiment was repeated once with similar results. e Kaplan–Meier plot of the S100P + SPP1− and S100P-SPP1+ based on TMA data. Two-sided log-rank test. f The scatter diagrams showing the differences in carbohydrate antigen 19-9 (CA19-9, S100P + SPP1− n = 63, S100P-SPP1 + n = 114), carcinoembryonic antigen, (CEA, S100P + SPP1 − n = 63, S100P − SPP1 + n = 115), Ki67 (S100P + SPP1− n = 68, S100P−SPP1 + n = 118), and tumor size (S100P + SPP1− n = 68, S100P-SPP1 + n = 118) between the two groups (*P < 0.05; ***P < 0.001; two-sided Mann–Whitney U-test; CA19-9: P < 0.0001; CEA: P < 0.0001; Ki67: P = 0.025; tumor size: P = 0.019). Data were presented as median with interquartile range. g Scatterplot of S100P and SPP1 expression in Jusakul et al. dataset20. A Gaussian mixture model with two mixture components was used to identify S100P +/− and SPP1 +/− patients (right and top distribution curves). Solid circles represent iCCA and open circles represent extrahepatic cholangiocarcinoma (ECC). Red represents S100P + SPP1- while blue represents S100P-SPP1+. h Graphical representation of the proportion of S100P + SPP1- and S100P-SPP1+ in iCCA and ECC. i Kaplan–Meier plot of the S100P + SPP1− and S100P-SPP1+ based on Jusakul et al. dataset20. Two-sided log-rank test. Source data are provided as a Source Data file.

According to our scRNA-seq data, most of the tumor cells expressed either S100P (23.95%) or SPP1 (60.05%), while only 10.01% and 5.98% of tumor cells were double negative or double positive, respectively (Supplementary Fig. 2e, f). We performed the same analyses in Ma et al.’s iCCA scRNA-seq dataset and likewise found that the expression of S100P and SPP1 was mutually exclusive in iCCA cells17 (Supplementary Fig. 3a, b). A small number of S100P+SPP1+ cells (5.62%) also existed in their iCCA cases (Supplementary Fig. 3c, d). Through dimension reduction, we found that the global expression profile of S100P+SPP1+ cells was more similar to that of S100P+SPP1− cells than to that of S100P−SPP1+ cells (Supplementary Fig. 3e). Immunohistochemical results revealed that these S100P+SPP1+ cells were mostly present in the invasive regions of cancer nodules in certain iCCAphl cases (Supplementary Fig. 3f). Accumulating evidence has revealed that SPP1 is a significant mediator of tumor invasion and metastasis18, implying that these double-positive cells may be involved in the progression of iCCA.
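
The four-way split of tumor cells described above can be sketched as a simple tally. This is an illustration only: it assumes a cell is called positive when its expression exceeds a threshold (zero here), which the text does not specify, and the cell values are hypothetical.

```python
from collections import Counter


def classify_cell(s100p, spp1, threshold=0.0):
    """Label a cell by S100P/SPP1 status; expression above `threshold` is positive."""
    s_sign = "+" if s100p > threshold else "-"
    p_sign = "+" if spp1 > threshold else "-"
    return f"S100P{s_sign}SPP1{p_sign}"


def quadrant_fractions(cells, threshold=0.0):
    """Fraction of cells in each S100P/SPP1 quadrant."""
    counts = Counter(classify_cell(s, p, threshold) for s, p in cells)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical (S100P, SPP1) expression pairs for eight tumor cells.
cells = [(2.1, 0.0), (0.0, 3.4), (0.0, 1.2), (0.0, 0.0),
         (1.5, 0.8), (0.0, 2.2), (0.0, 4.0), (3.0, 0.0)]
fractions = quadrant_fractions(cells)
print(fractions)
```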

To further explore whether the expression of S100P and SPP1 in iCCA was mutually exclusive in a larger cohort, immunohistochemistry was performed on a tissue microarray (TMA) containing 201 iCCAs. We found that 92.54% of iCCAs could be clearly divided into S100P+SPP1− (33.83%, 68 patients) and S100P−SPP1+ (58.71%, 118 patients) iCCAs, while only 5.97% (12 patients) and 1.49% (three patients) were classified as S100P−SPP1− and S100P+SPP1+ iCCAs, respectively (Fig. 2d and Supplementary Fig. 4). Alcian blue staining (to detect the mucus secreted by mucous tumor cells), immunohistochemical staining of MUC5AC (essential for mucus production), and HE staining (morphology) of these samples showed that S100P+SPP1− iCCAs were more like iCCAphl (all were positive for MUC5AC and mucin production), while S100P−SPP1+ iCCAs were more like iCCApps (most were negative for MUC5AC and mucin production) (Supplementary Fig. 4). Consistent with a previous study, it is difficult to accurately distinguish iCCAphl from iCCApps by morphology alone19. Survival analysis revealed that S100P+SPP1− iCCAs had a significantly worse prognosis than S100P−SPP1+ iCCAs (P = 0.008, Fig. 2e), which was further confirmed by multivariate Cox regression analysis (HR, 1.922; 95% CI, 1.257–2.939; P = 0.003, Supplementary Data 3). Also, S100P+SPP1− status correlated significantly with higher CA19-9 (P < 0.01), CEA (P < 0.01), and Ki67 expression (P = 0.025), lymph node metastasis (P = 0.013), and advanced TNM stage (P = 0.021), but negatively with tumor size (P = 0.019), HBsAg status (P < 0.01), chronic hepatitis (P = 0.002), and liver cirrhosis (P = 0.049) (Fig. 2f and Supplementary Data 3). The higher percentage of HBsAg-positive status, chronic hepatitis, and liver cirrhosis in S100P−SPP1+ iCCAs further supports the notion that iCCApps usually develops on a background of chronic liver disease5.
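
As a quick consistency check, the TMA percentages quoted above follow directly from the stated patient counts out of 201. The snippet uses only numbers given in the text.

```python
total = 201
groups = {
    "S100P+SPP1-": 68,
    "S100P-SPP1+": 118,
    "S100P-SPP1-": 12,
    "S100P+SPP1+": 3,
}

# Percentage of the cohort in each group, rounded to two decimals as in the text.
percent = {label: round(100 * count / total, 2) for label, count in groups.items()}
clearly_divided = round(100 * (groups["S100P+SPP1-"] + groups["S100P-SPP1+"]) / total, 2)

print(percent)
print(clearly_divided)
```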

We further evaluated how well S100P and SPP1 distinguish iCCAphl and iCCApps in two RNA-seq datasets of cholangiocarcinoma. We found that 81.48% of iCCAs could be divided into two groups according to the expression of S100P and SPP1 in Jusakul et al.’s dataset20 (Fig. 2g and Supplementary Data 4). The S100P−SPP1+ samples existed almost exclusively in iCCA rather than extrahepatic cholangiocarcinoma (ECC), further supporting their distinct origins (Fig. 2h). Survival analysis showed that the prognosis of S100P+SPP1− iCCAs was significantly worse than that of S100P−SPP1+ iCCAs (P < 0.01, Fig. 2i). Similar results were obtained from Job et al.’s dataset21 (Supplementary Fig. 5a, b).
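
The legend of Fig. 2g states that a Gaussian mixture model with two components was used to call S100P+/− and SPP1+/− patients in the bulk data. A minimal one-dimensional sketch of that idea is given below, fitted by expectation-maximisation in plain Python. The initialisation, iteration count, variance floor, and expression values are all assumptions made for illustration; the authors' actual implementation and settings are not described in this section.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def fit_two_gaussians(values, iterations=200):
    """Fit a 1-D mixture of two Gaussians by EM; returns (weights, means, variances)."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    pooled = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [pooled, pooled]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in (0, 1)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


def is_positive(x, weights, means, variances):
    """True if x is more likely to come from the component with the higher mean."""
    hi = 0 if means[0] > means[1] else 1
    lo = 1 - hi
    return (weights[hi] * normal_pdf(x, means[hi], variances[hi])
            > weights[lo] * normal_pdf(x, means[lo], variances[lo]))


# Hypothetical log-expression values of one marker across 12 patients.
values = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 5.8, 6.1, 6.0, 5.9, 6.2, 6.3]
weights, means, variances = fit_two_gaussians(values)
print(sorted(means))
```

Applying the same fit separately to S100P and SPP1 would give each patient a +/− call for both genes, which is how the four marker combinations in the figure could be assigned under these assumptions.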

Analysis of the WES data found that S100P + SPP1− iCCAs tended to have more TP53 (4/14), SYNE1 (3/14), and EPHA2 (3/14) mutations, while S100P-SPP1 + iCCAs harbored more BAP1 (3/14) mutations, which was consistent with previous studies10,20 (Supplementary Fig. 5c, d). We also found that the DNA methylation level of S100P in S100P + SPP1− was significantly lower than that in S100P-SPP1+, while no apparent difference was observed in CNVs20, indicating potential epigenetic regulation of S100P in these two iCCA subtypes (Supplementary Fig. 5e, f). Taken together, these results indicate that S100P and SPP1 are two optimal biomarkers for distinguishing iCCAphl and iCCApps, which can effectively divide the iCCA patients into two subtypes with different cell origins and clinicopathological characteristics.

Molecular profiles and transcription networks of S100P + SPP1− and S100P-SPP1 + iCCAs

The presence of two main subgroups of malignant cells in iCCA prompted us to investigate their unique gene expression profiles. We first evaluated their intratumor heterogeneity (ITH) at the genomic and single-cell transcriptome levels. The results showed no significant difference in genomic ITH, but a significantly higher transcriptomic ITH in S100P+SPP1− iCCAs (Fig. 3a). This is consistent with a previous study showing that higher transcriptomic ITH predicted poor survival11. Subsequently, we identified 755 differentially expressed genes between these two groups of malignant cells (|logFC| > 1.5 and P < 0.01, Supplementary Fig. 6a and Supplementary Data 5). Genes upregulated in S100P−SPP1+ cells were mainly enriched in the regulation of coagulation and complement activation, which are involved in hepatocyte function (Fig. 3b). These cells showed high expression of hepatocyte-specific genes such as SERPINE2, APOB, and CPB2, further supporting their hepatocyte-like differentiation (Supplementary Fig. 6b). In contrast, genes upregulated in S100P+SPP1− cells were related to mucus secretion, protein localization to the endoplasmic reticulum (ER), and epithelial structure maintenance. Remarkably, we found that PSCA, which encodes a tumor antigen and is upregulated in prostate22 and bladder23 cancers, was highly expressed in S100P+SPP1− iCCAs, making it a promising candidate for immunotherapy of iCCAphl (Supplementary Fig. 6c).
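
The selection rule quoted above (|logFC| > 1.5 and P < 0.01) can be written as a short filter. The sketch assumes each result is a (gene, logFC, P) tuple and that positive logFC means higher expression in S100P+SPP1− cells; that sign convention, the placeholder genes `GENE_A` and `GENE_B`, and all numeric values are hypothetical.

```python
def select_de_genes(results, min_abs_logfc=1.5, max_p=0.01):
    """Split genes passing |logFC| > min_abs_logfc and P < max_p by direction."""
    up, down = [], []
    for gene, logfc, p_value in results:
        if abs(logfc) > min_abs_logfc and p_value < max_p:
            (up if logfc > 0 else down).append(gene)
    return up, down


# Hypothetical test results: (gene, logFC, P).
results = [
    ("S100P", 3.2, 1e-50),
    ("SPP1", -2.8, 1e-40),
    ("GENE_A", 1.2, 1e-10),   # fails the fold-change cutoff
    ("GENE_B", -2.0, 0.2),    # fails the P value cutoff
]
up, down = select_de_genes(results)
print(up, down)
```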

Fig. 3. Different gene expression profiles between S100P + SPP1− and S100P-SPP1+ cells.

a Boxplot of the genomic heterogeneity (left) and transcriptomic heterogeneity (right) of S100P + SPP1−(n = 7) and S100P-SPP1 + (n = 7) iCCAs. (**P < 0.01; two-sided Wilcoxon rank-sum test; Transcriptomic heterogeneity: P = 0.0041; NS not significant). The central mark indicates the median, and the bottom and top edges of the box indicate the first and third quartiles, respectively. The top and bottom whiskers extend the boxes to a maximum of 1.5 times the interquartile range. b Top enriched pathways for genes with specific expression in S100P + SPP1− and S100P-SPP1 + cells. c Network representation of selected differentially expressed transcription factors between S100P + SPP1− and S100P-SPP1 + cells, as analyzed by SCENIC. Transcription factors in S100P + SPP1− are shown in red; transcription factors in S100P-SPP1 + are shown in blue. Bar graph showing the difference score for the selected set of differentially expressed transcription factors in S100P + SPP1− (red) and S100P-SPP1 + (blue). d Scatterplot showing the correlation of CREB3L1 expression (x-axis) with S100P expression (y-axis). Correlation is evaluated by the Spearman correlation coefficient. e The relative luciferase activity in HEK-293T cells following co-transfection with plasmid containing S100P promoter and increasing doses of the CREB3L1 expression vector (***P < 0.001; two-sided student’s t-test; CREB3L1 50 ng: P < 0.0001; 100 ng: P < 0.0001; 200 ng: P < 0.0001; n = 12 biologically independent samples). f, g Representative images of the Transwell invasion assay (f) and a statistical histogram (g) (***P < 0.001; two-sided student’s t-test; HuCCT1: P < 0.0001; RBE: P < 0.0001; n = 6 biologically independent samples). Scale bar, 100 μm. h Heatmap displaying expression levels of differentially expressed genes in Si-CREB3L1 versus Si-Ctl in HuCCT1 cells. i Top enriched pathways for downregulated genes in Si-CREB3L1 HuCCT1 cells. 
Si-Ctl Small interfering control (f, g, and h); NES normalized enrichment score (i). Error bars of (e and g) represent the means ± SD. Source data are provided as a Source Data file.

We further applied SCENIC analysis to characterize the transcription networks of S100P+SPP1− and S100P−SPP1+ cells24. Transcription factors such as ATF3, CREB5, MEIS2, and EGR1 were upregulated in S100P−SPP1+ cells, while S100P+SPP1− cells showed upregulation of transcription factors such as CREB3L1, PPARG, CDX2, and HOXB7 (Fig. 3c and Supplementary Fig. 6d). Survival analysis of Jusakul et al.’s dataset20 showed that transcription factors highly expressed in iCCAphl (PPARG, MECOM, HOXB7, IRF7, FOXA3) and iCCApps (ONECUT1, HNF1B, MEIS2) were associated with worse and better prognosis, respectively (Supplementary Fig. 6e). Notably, the SCENIC analysis revealed that CREB3L1, which is induced by ER stress and contributes to maximal induction of the unfolded protein response25, was a potential transcription factor regulating S100P. CREB3L1 expression also correlated positively with S100P expression (r = 0.58, P < 2.2e-16, Fig. 3d). To determine whether S100P is a direct target of CREB3L1, we performed a dual-luciferase reporter assay and found that S100P promoter activity increased markedly and in a dose-dependent manner after overexpression of CREB3L1 (Fig. 3e). Transwell assays showed that CREB3L1 knockdown significantly weakened the invasion capacity of HuCCT1 and RBE cells (Fig. 3f, g). RNA-seq analysis showed that CREB3L1 not only modulated the expression of S100P, but also affected the expression of various genes upregulated in S100P+SPP1− cells, such as OASL, RCN3, and OAS1 (Fig. 3h). Pathway analysis indicated that CREB3L1 was involved in co-translational protein targeting to membrane, the establishment of protein localization to ER, and actin filament reorganization (Fig. 3i). Together, these results reveal the distinct transcriptional profiles of S100P+SPP1− and S100P−SPP1+ cells and identify CREB3L1 as a potential transcriptional regulator of S100P that promotes invasion of iCCAphl.
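
The CREB3L1 and S100P relationship above is summarised by a Spearman correlation coefficient, which is the Pearson correlation computed on ranks. A self-contained sketch follows; ties receive average ranks, and the example vectors are hypothetical rather than the study's data.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across five samples.
tf_expression = [1, 2, 3, 4, 5]
target_expression = [2, 1, 4, 3, 5]
rho = spearman(tf_expression, target_expression)
print(rho)
```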

Different polarization of infiltrated macrophages in iCCAphl and iCCApps

Although studies have profiled the tumor immune microenvironment of iCCA by scRNA-seq11,12, the difference in immune landscape between iCCAphl and iCCApps remains unclear. First, we evaluated the infiltration of T cells, B cells, NK cells, and macrophages in 186 iCCAs from the TMA cohort by immunostaining. More CD3+ T cells (P < 0.01) and CD56+ NK cells (P < 0.01) infiltrated S100P−SPP1+ iCCApps (118 patients) than S100P+SPP1− iCCAphl (68 patients). Further analysis of T cell subsets revealed that iCCAphl harbored more CD8+ T cells but fewer CD4+ T cells than iCCApps. In addition, iCCAphl displayed significantly higher PD1+CD8+ T cell infiltration than iCCApps (P < 0.01), with no significant difference in FOXP3+CD4+ Treg cell infiltration (Supplementary Fig. 7a). Although there was no significant difference in CD68+ macrophages or CD20+ B cells, more CD68+CD206+ macrophages infiltrated S100P+SPP1− iCCAphl (P < 0.01) (Supplementary Fig. 7b, c). We then focused on the distinct macrophage subsets infiltrating the two subtypes of iCCA.

A total of six clusters were present in the myeloid lineage, each expressing specific marker genes: one monocyte cluster (Mono_FCN1), two macrophage clusters (Macro_c1_SPP1 and Macro_c2_CCL18), and three DC clusters (DC_c1_CD1C, DC_c2_XCR1, and DC_c3_CD1A) (Fig. 4a, b and Supplementary Data 6). Macrophages and CD1a+ DCs (DC_c3_CD1A) were significantly enriched in tumors compared with paired non-tumor tissues, while monocytes, CD1c+ DCs (DC_c1_CD1C), and cDC1s (DC_c2_XCR1) showed the opposite trend (Supplementary Fig. 7d). We observed that SPP1+ macrophages, which have been reported in colon cancer and closely interact with cancer-associated fibroblasts (CAFs)26, were more abundant in S100P−SPP1+ iCCApps, while CCL18+ macrophages, which are abundant in advanced hepatocellular carcinoma27, were mostly found in S100P+SPP1− iCCAphl (Fig. 4c, d). Though both macrophage subsets have been defined as tumor-associated macrophages, they varied in signaling pathways and metabolic features28 (Supplementary Fig. 8a, b). Consistently, we found that SPP1+ macrophages showed an increased level of oxidative phosphorylation and glycine, serine, threonine, and tyrosine metabolism, while CCL18+ macrophages had elevated cytokine–cytokine receptor interaction, nitrogen, and riboflavin metabolism (Supplementary Fig. 8c). By calculating pro-/anti-inflammatory and M1/M2 polarization scores29, we found that SPP1+ macrophages were more potent in both pro- and anti-inflammatory responses and skewed toward M1 polarization (Fig. 4e, f). In contrast, CCL18+ macrophages showed a dominant M2-like phenotype with high expression of CD163, MARCO, and CSF1R, suggesting a stronger tumor-promoting role than SPP1+ macrophages (Supplementary Fig. 8d). Immunostaining on the TMA cohort further confirmed that SPP1+CCL18− macrophages were more abundant in S100P−SPP1+ iCCApps, while SPP1−CCL18+ macrophages were mostly enriched in S100P+SPP1− iCCAphl (Fig. 4g, h), which was again validated by the results from Jusakul et al.’s dataset20 (Supplementary Fig. 8e). Together, these results indicate that iCCAphl has a unique immune ecosystem, with increased CCL18+ macrophages and reduced CD3+ T and CD56+ NK cells compared with iCCApps.
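
The polarization scores above are computed from gene signatures (ref. 29); the exact gene lists and scoring procedure are not reproduced in this section. As a rough sketch of the general idea only, a signature score can be taken as the mean expression of the signature genes in a cell. That averaging rule is an assumption made here, the three-gene set reuses only the M2-like markers named in the text rather than the full signature, and the expression values are hypothetical.

```python
def signature_score(cell_expression, genes):
    """Mean expression of the signature genes found in the cell (0.0 if none)."""
    present = [cell_expression[g] for g in genes if g in cell_expression]
    return sum(present) / len(present) if present else 0.0


# Markers named in the text for the M2-like phenotype; not the full signature.
M2_LIKE = ["CD163", "MARCO", "CSF1R"]

# Hypothetical normalised expression for a single macrophage.
cell = {"CD163": 2.0, "MARCO": 1.0, "CSF1R": 3.0, "SPP1": 0.5}
score = signature_score(cell, M2_LIKE)
print(score)
```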

Fig. 4. Two different subsets of macrophages infiltrated in iCCAphl and iCCApps.

a t-SNE plot showing the subtypes of myeloid cells derived from iCCA peri-tumor and tumor. b Heatmap showing the expression of marker genes in each subtype of myeloid cells. c t-SNE plot of myeloid cells from S100P+SPP1− (red dots) and S100P−SPP1+ (blue dots). d Bar plot showing the proportion of macrophage subsets from S100P+SPP1− and S100P−SPP1+. e, f Boxplots showing pro-/anti-inflammatory scores (e) and M1/M2 scores (f) for two macrophage subsets. Macro_c1_SPP1, n = 4016 cells; Macro_c2_CCL18, n = 3447 cells. (***P < 0.001; two-sided Wilcoxon rank-sum test; Anti-inflammatory score: P < 2.22e-16; Pro-inflammatory score: P < 2.22e-16; M2 polarization score: P < 2.22e-16; M1 polarization score: P < 2.22e-16). The central mark indicates the median, and the bottom and top edges of the box indicate the first and third quartiles, respectively. The top and bottom whiskers extend the boxes to a maximum of 1.5 times the interquartile range. g, h Representative mIHC images (left) and statistical graphs (right) showing the distribution of CD68+SPP1+CCL18− and CD68+SPP1−CCL18+ macrophages in S100P+SPP1− (g) and S100P−SPP1+ (h), respectively: CK19 (green), S100P (red), SPP1 (purple), CD68 (white), CCL18 (yellow), and DAPI (blue) (S100P+SPP1− n = 68, S100P−SPP1+ n = 112). White arrows (CD68+SPP1+CCL18−), yellow arrows (CD68+SPP1−CCL18+). (***P < 0.001; two-sided Mann–Whitney U-test; CD68+SPP1+CCL18− (%): P < 0.0001; CD68+SPP1−CCL18+ (%): P < 0.0001). Data were presented as median with interquartile range (g and h). Scale bar, 50 μm. Source data are provided as a Source Data file.

iCCApps contains tumor cells in different states of differentiation

The expression of ALB is generally considered a marker of hepatocytes. Several studies have demonstrated the expression of ALB in iCCA, but the features of these ALB+ tumor cells are still unclear8,30,31. Here, we detected a group of ALB-expressing tumor cells at the single-cell level, most of which (79.4%) were present in S100P−SPP1+ iCCApps (Supplementary Fig. 9a–c). Because iCCAphl and iCCApps have different origins, we focused only on the seven S100P−SPP1+ iCCApps to explore their heterogeneity. By comparing the gene expression profiles of ALB+ and ALB− cells, we found that ALB− cells highly expressed ID3, which negatively regulates basic helix-loop-helix transcription factors and is involved in cell differentiation and neoplastic transformation32 (Fig. 5a, Supplementary Fig. 9d, and Supplementary Data 7). ALB+ cells highly expressed hepatocyte-specific genes such as CPB2, ASGR1, and FGA, as well as the cholangiocyte markers KRT19, KRT18, and EPCAM, but did not express AFP, a marker of hepatic progenitor cells (Fig. 5b and Supplementary Fig. 9e). Genes highly expressed in ALB+ cells were mainly involved in hepatocyte-specific processes, such as complement activation, detoxification, fatty acid catabolic process, and bile acid secretion, suggesting hepatocyte differentiation (Supplementary Fig. 9f). SCENIC analysis showed that genes specifically upregulated in ALB+ cells were regulated by NR5A2, BATF, and NFIA (Supplementary Fig. 9g). In contrast, ID3+ cells highly expressed genes such as MDK, ZEB1, and LGR5 that play important roles in tumor stemness33–36. The SCENIC analysis predicted that the transcription factors SOX11, PAX2, IRX2, IRX3, FOXC1, and EN2 were responsible for the genes upregulated in these cells.

Fig. 5. Tumor cells in different states of differentiation exist in S100P−SPP1+ iCCAs.

a t-SNE plot showing expression levels of ALB and ID3 in 7 S100P-SPP1 + iCCAs. b Heatmap showing expression levels of differentially expressed genes (rows) between ALB + and ALB- S100P-SPP1 + tumor cells (columns). c Trajectory of tumor cells from P09 and P10 separately in a two-dimensional state-space defined by Monocle. d Differentially expressed genes along the pseudo-time were clustered hierarchically into two profiles. The representative gene functions and pathways were shown. e Heatmap showing expression of representative genes. Color key from blue to red indicates relative expression levels from low to high. f Heatmap of ALB + and ALB- specific genes (rows) and hierarchical clustering result in 34 S100P-SPP1 + iCCA (columns) from Jusakul et al. dataset20. g Correlation between expression of ID3 and expression of ALB and MKI67. Blue line represents the linear regression curve. The gray band represents the 95% confidence interval of the regression line. Correlation is evaluated by the two-sided Spearman correlation coefficient. Source data are provided as a Source Data file.

Previous studies have designated ID3+ cells as hepatoblasts, which can give rise to both hepatocytes and cholangiocytes37. To reveal the differentiation process in iCCA, we explored the gene expression patterns along this transition by trajectory analysis. Tumor cells from P09 and P10 were selected for this analysis because they contained comparable numbers of ALB+ and ID3+ cells (Fig. 5c and Supplementary Fig. 9h). We found that ALB+ cells were mainly located at the terminal end of the trajectory, and that genes involved in the regulation of coagulation, ER lumen, and response to ER stress increased gradually along the trajectory (Fig. 5d). The expression of MKI67 showed the same trend as ALB, implying an increased proliferation capacity of ALB+ cells. ID3+ cells were located opposite to ALB+ cells in the trajectory and were enriched for pathways related to the collagen-containing extracellular matrix and negative regulation of cell adhesion. For example, the expression of COL12A1, which encodes the alpha chain of type XII collagen and is overexpressed in several cancer types38,39, decreased gradually along the transition from ALB− cells to ALB+ cells (Fig. 5e).

By evaluating the expression of 16 identified marker genes of ID3+ and ALB+ cells in Jusakul et al.’s dataset20, we validated that S100P−SPP1+ patients can also be clearly divided into two subclasses with mutually exclusive expression of these 16 genes (Fig. 5f). In addition to the exclusivity between ALB and ID3, a significant negative correlation between ID3 and MKI67 expression was also observed, suggesting the slow proliferation of these tumors (Fig. 5g). Taken together, these results demonstrate that iCCApps is a heterogeneous tumor whose cells are in various states of differentiation, such as hepatocyte differentiation or stemness.

ID3 + tumor cells indicate abundant stroma components and worse prognosis in iCCApps

We next explored the clinical and histological characteristics of ID3+ iCCApps. By immunostaining, we found that ID3 was predominantly expressed in the nuclei of tumor cells that were located in the tumor center and surrounded by rich stromal components (Fig. 6a). To further explore the relationship between ID3 expression and tumor stroma, we analyzed the correlation between ID3 expression and CAFs in two public datasets20,21. ID3 expression correlated positively with CAF signature genes such as PDGFRB, COL1A1, and PDPN (Fig. 6b and Supplementary Fig. 9i).

Fig. 6. Prognostic significance of CK19 + ID3 + tumor cells in S100P-SPP1 + iCCAs.

a Representative immunostaining of ID3 in the indicated S100P−SPP1+ iCCAs. ID3+ tumor cells were predominantly located in the intratumor region. Scale bar, 400 μm (top) and 100 μm (bottom). Images were collected from 17 additional iCCA slides that contained both tumor and corresponding paracancerous tissues. The experiment was repeated once with similar results. b Correlation between ID3 expression and CAFs. iCCAs from Jusakul et al.’s dataset20 were ordered by their ID3 expression level, as shown by the bar plot (top). Heatmap (middle) showing expression levels of selected CAF markers (rows) for each tumor (columns). Colored bar (bottom) showing the CAF score of each tumor estimated by MCP-Counter. c Representative mIHC images showing the distribution of CK19+ID3+ and CK19+ID3− tumor cells and CK19−PDGFRβ+ cells in S100P−SPP1+ iCCA (n = 118) from the TMA cohort: CK19 (green), ID3 (yellow), PDGFRβ (red), and DAPI (blue). White arrows (CK19+ID3+), yellow arrows (CK19+ID3−), red arrows (CK19−PDGFRβ+). The experiment was repeated once with similar results. Scale bar, 200 μm. d Correlation between the proportion of CK19+ID3+ (top) or CK19+ID3− (bottom) cells within CK19+ tumor cells and the proportion of CK19−PDGFRβ+ cells within CK19− cells per core (two-sided Spearman correlation coefficient). e Kaplan–Meier analysis of overall survival (OS) in S100P−SPP1+ iCCA tumors according to the proportion of CK19+ID3+ cells within CK19+ tumor cells (top) and of CK19−PDGFRβ+ cells within CK19− cells (bottom) in the TMA cohort. Two-sided log-rank test. Source data are provided as a Source Data file.

Since CAFs play important roles in tumor progression and chemoresistance40, we speculated that ID3 expression was related to iCCA prognosis. We selected 118 S100P-SPP1 + iCCApps from our TMA cohort to explore the prognostic values of ID3 + tumor cells and PDGFRβ + stromal cells (most of which were CAFs) (Fig. 6c). As expected, the proportion of CK19 + ID3 + tumor cells positively correlated with the proportion of CK19-PDGFRβ + cells (r = 0.46, P < 0.001), while the proportion of CK19 + ID3- tumor cells negatively correlated with the proportion of CK19-PDGFRβ + cells (r = −0.46, P < 0.001) (Fig. 6d). Survival curves indicated that the proportion of CK19 + ID3 + tumor cells (P = 0.016) and CK19-PDGFRβ + cells (P = 0.005) both significantly correlated with poor prognosis in iCCApps (Fig. 6e). Thus, these results demonstrate that ID3 + cells commonly correlated with the presence of CAFs and patient survival in iCCApps.

Discussion

iCCAs can be divided into two main histological subtypes, iCCAphl and iCCApps, according to the anatomical location of the tumor and the origin of the tumor cells. In this study, we generated scRNA-seq profiles of 14 primary iCCAs and identified SPP1 as a representative marker for iCCApps. We found that 92.5% of iCCAs can be classified as iCCAphl or iCCApps according to the expression of S100P and SPP1, and that there are significant differences in clinicopathological characteristics, gene regulatory networks, and immune infiltration between these two subtypes. Moreover, we confirmed at the single-cell level the presence of tumor cells at various states of differentiation in iCCApps (Fig. 7).

Fig. 7. Schematics for the classification of iCCA. Two major subtypes of iCCA were identified in this study.

Morphological features, cellular component, immune infiltration, and prognosis varied significantly between these two iCCA subtypes. Part of the picture was adapted from motifolio.com.

Cholangiocarcinoma can be divided into iCCA (which arises above the second-order bile ducts) and ECC (including perihilar CCA and distal CCA) according to the tumor location in the biliary tree. Compared with iCCApps, iCCAphl comprises mucin-producing columnar tumor cells and shows high invasiveness and high expression of S100P, making it more similar to ECC8. Here we found at the single-cell level that S100P+SPP1− cells, which were mostly present in iCCAphl, highly expressed mucus-related genes such as MUC5AC and MUC6. Mucin synthesis begins in the ER, and mucins are extremely susceptible to misfolding because of their large size and structural complexity, which can eventually lead to ER stress41. Indeed, we observed that many genes associated with mucin synthesis or ER stress, such as XBP1 (ref. 42), AGR2 (ref. 43), and CREB3L1 (ref. 25), were upregulated in iCCAphl and may be involved in the progression of this subtype. We also found that although S100P+SPP1− iCCAphl often had a smaller tumor size, it had more lymph node metastases and higher levels of CA19-9, Ki67, and CEA than S100P−SPP1+ iCCApps. This further suggests that there are marked differences in clinical characteristics between these two iCCA subtypes. Notably, there were also significant differences in the infiltration of several important immune cell types between them. S100P+SPP1− iCCAphl had fewer CD3+ T and CD56+ NK cells but more CCL18+ macrophage infiltration than S100P−SPP1+ iCCApps, indicating a dampened anti-tumor immune response that may contribute to its higher invasive potential.

SPP1 is considered to play a cancer-promoting role and is often associated with a worse prognosis in various tumors, but its prognostic significance in iCCA remains controversial44,45. One important reason for this inconsistency is that the classification of iCCA has not been properly considered. iCCApps is believed to originate from mucin-negative cuboidal cholangiocytes or from ductules containing hepatic progenitor cells. CDH2 and NCAM1 have been reported as representative markers of these iCCAs3,5. Based on our results, the expression of SPP1 is mutually exclusive with that of S100P, and SPP1 shows better specificity and sensitivity than CDH2 or NCAM1 as a marker of iCCApps. S100P−SPP1+ iCCApps had less lymph node metastasis, larger tumor volume, and better prognosis than S100P+SPP1− iCCAphl. The occurrence of these two subtypes of iCCA has been linked to different pathogenic factors46–48. iCCApps usually develops on a background of chronic viral hepatitis or liver cirrhosis, whereas iCCAphl often develops in the setting of primary sclerosing cholangitis (PSC) or liver fluke infection3. Our results revealed significantly higher percentages of HBsAg positivity, chronic hepatitis, and liver cirrhosis in S100P−SPP1+ iCCApps than in S100P+SPP1− iCCAphl, further highlighting their distinct pathogenic backgrounds. One study reported that iCCA with cholangiolocellular differentiation highly expressed CRP and CDH2, whereas iCCA without cholangiolocellular differentiation highly expressed TFF1 and S100P, and that the two groups showed significant differences in clinicopathological characteristics and patient outcomes9. These results are very similar to our findings: S100P−SPP1+ iCCApps showed high expression of CRP and CDH2, corresponding to the iCCAs with cholangiolocellular differentiation.
Studies have revealed that iCCApps often occurs on a background of chronic hepatitis or liver cirrhosis5. We observed that SPP1+ macrophages, which have been reported to be involved in liver inflammation and fibrosis49, were highly infiltrated in iCCApps, indicating that these macrophages may be involved in the occurrence and development of iCCApps. It should be noted that two isoforms of SPP1 (iOPN and sOPN) with distinct functions can be generated by alternative translation; we could not determine whether the form of SPP1 expressed by SPP1+ macrophages was the same as that expressed by SPP1+ tumor cells, which needs to be further explored16.

Heterogeneity in tumor cell differentiation has been observed in iCCA because of its complicated cell origin and formation. In the present study, two major subsets of tumor cells, ALB+ and ID3+ tumor cells, were identified in iCCApps. The expression of ALB mRNA has been detected by in situ hybridization in about 40% of all iCCAs50, but the specific biology of these ALB+ cells is still not clear. Our results showed that these ALB+ cells have the characteristics of hepatocyte differentiation. However, these cells also expressed EPCAM and KRT19, indicating that they may be hepatocyte-like cells in the early stage of differentiation rather than mature hepatocytes. The stem-like ID3+ cells coexisting in iCCApps may be the precursors of ALB+ cells. There are several reasons for this conjecture. First, these ID3+ cells highly expressed many stemness-related genes, such as ID4, MDK, ZEB1, and LGR5. Of note, the expression of ID3 and LGR5 has been reported to promote stem cell features in iCCA33,36. Second, a previous study identified ID3+ cells at the early stages of development in human and mouse fetal livers that are able to differentiate into either hepatocytes or cholangiocytes37. Therefore, the presence of ID3+ cells may be one of the reasons for the diversity of iCCApps. Additionally, we found that ID3+ cells were generally located in the interior of iCCApps and correlated positively with CAF content. The location of these ID3+ cells and the large number of CAFs surrounding them may be an important reason for the poor prognosis of this type of iCCA.

A few limitations of the current study should not be ignored. There were 1.49% S100P + SPP1 + and 5.97% S100P-SPP1− iCCAs in our validation cohort. We did not analyze the clinicopathological features of these iCCAs because of their small number. Also, due to the small number of S100P + SPP1 + and S100P-SPP1− cells in scRNA-seq data, we could not evaluate the molecular characteristics of these two types of tumor cells at the single-cell level accurately. Therefore, future studies with a larger sample size containing these two iCCA subtypes may help to resolve this issue. Furthermore, the lack of functional data in our study restricts our understanding regarding the molecular mechanisms underlying the tumorigenesis of these two iCCA subtypes. Further animal experiments may shed light on this issue and validate our results in the future.

In summary, our findings suggest that iCCAphl and iCCApps have distinct cell origins. Nevertheless, it is often difficult or impossible to distinguish them accurately by conventional methods, such as evaluating cellular morphology, architectural features, or mucin productivity, because a certain proportion of iCCAs contain mixtures of the large duct and small duct types or display atypical histology. The combined detection of multiple tissue markers may facilitate their distinction. Here we suggest two markers, S100P and SPP1, to differentiate between iCCAphl and iCCApps, which may provide insights into iCCAs with different cells of origin.

Methods

Patient samples

Fourteen patients who underwent liver resection and were pathologically diagnosed with iCCA between January 2019 and January 2020 were enrolled for scRNA-seq. None of the patients received chemotherapy, radiotherapy, or any other anti-tumor therapy before surgery. Fresh paired tumor and non-tumor liver tissues were obtained during surgical resection. The adjacent normal tissues were at least 3 cm away from the matched tumor tissue. This study was conducted in accordance with the ethical standards of the Research Ethics Committee of Zhongshan Hospital with patients’ informed consent. Written informed consent was obtained from all patients involved in this study for the use of their tissue samples and clinical information.

Tissue microarray, immunohistochemistry, and Alcian blue staining

Paraffin-embedded tissue samples were selected from 201 iCCA patients who underwent primary and curative resection of their tumors at the Liver Cancer Institute, Zhongshan Hospital of Fudan University (Shanghai, China) between 2012 and 2015. All cases were pathologically diagnosed as iCCA and had been verified experimentally in a previous study51. The tissue microarrays were baked at 60 °C for 1 h, dewaxed in xylene, rehydrated through a graded concentration series, and treated with 3% hydrogen peroxide to block endogenous peroxidase activity. The sections were incubated with 10% goat serum for 30 min to block nonspecific binding sites and then incubated at 4 °C overnight with primary antibodies, including S100P (1:1500 dilution, ab133554, Abcam), SPP1 (1:2000 dilution, ab214050, Abcam), Hep-Par1 (1:2000 dilution, ab190706, Abcam), ARG1 (1:1000 dilution, ab133543, Abcam), and MUC5AC (1:1000 dilution, ab3649, Abcam). Detailed information on the antibodies is provided in Supplementary Data 8. After repeated washing, the sections were incubated at room temperature with goat anti-mouse or goat anti-rabbit secondary antibody (Vector Lab, CA), visualized with DAB solution, and counterstained with hematoxylin. IHC staining was scored by two independent pathologists who were blinded to the patients’ clinicopathological data. IHC intensity was scored as 0 for no IHC signal, 1 for weak, 2 for moderate, and 3 for strong. A positive IHC stain was defined by a visible staining pattern (score 1 to 3) compared with the negative control (score 0). Alcian blue staining was performed to evaluate mucin content using an Alcian blue staining kit (C0155M, Beyotime) following the manufacturer’s instructions. Mucin content was scored as 0 for accumulation of mucin within <10% of glandular lumens; 1 for accumulation of mucin within 10 to 50% of glandular lumens; and 2 for accumulation of mucin within >50% of glandular lumens or frequent intracytoplasmic mucin, as described previously19.
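
The two scoring rules above can be written down as a minimal sketch. The function and parameter names are hypothetical and chosen only for illustration; the thresholds are the ones stated in the text, and the pathologists' visual estimates are taken as given inputs.

```python
def ihc_positive(intensity_score):
    """Return True when an IHC intensity score (0-3) counts as positive.

    Score 0 is the negative control; scores 1 to 3 are visible staining.
    """
    if intensity_score not in (0, 1, 2, 3):
        raise ValueError("IHC intensity score must be 0, 1, 2, or 3")
    return intensity_score >= 1


def mucin_score(percent_lumens_with_mucin, frequent_intracytoplasmic_mucin=False):
    """Map the estimated percentage of mucin-containing glandular lumens to 0, 1, or 2."""
    if not 0 <= percent_lumens_with_mucin <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    # >50% of lumens, or frequent intracytoplasmic mucin, gives the top score.
    if frequent_intracytoplasmic_mucin or percent_lumens_with_mucin > 50:
        return 2
    # 10 to 50% of lumens.
    if percent_lumens_with_mucin >= 10:
        return 1
    # <10% of lumens.
    return 0
```

For example, `mucin_score(30)` falls in the 10 to 50% band, while a core with 5% of lumens involved but frequent intracytoplasmic mucin still receives the top score.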

Preparation of single-cell suspensions

Fresh iCCA tumor tissues and adjacent non-tumor liver tissues were obtained immediately following tumor resection, transferred to a 50 mL centrifuge tube filled with RPMI-1640 medium (Gibco) containing 10% fetal bovine serum (Gibco), and transported rapidly to the laboratory on ice. Specimens were then washed twice with cold 1× PBS (Gibco) and digested with the Miltenyi Tumor Dissociation Kit and the GentleMACS (Miltenyi, Bergisch Gladbach, Germany) following the manufacturer’s instructions. The dissociated cells were subsequently passed through a 70 µm cell strainer (BD) to remove clumps and undigested tissue. After centrifugation, the cell pellet was washed twice with MACS buffer (PBS containing 1% FBS, 0.5% EDTA, and 0.05% gentamycin) and then re-suspended in sorting buffer (PBS supplemented with 1% FBS). Single-cell suspensions were stained with DRAQ5 (1:200, 10 min, 4084, CST) and DAPI (1:200, 5 min, 422801, Biolegend). Finally, DRAQ5+DAPI− cells were sorted into RPMI-1640 medium supplemented with 10% FBS by FACSAria (BD Biosciences).

Single-cell RNA sequencing

Libraries for scRNA-seq were generated using the Chromium Single Cell 3′ Library and Gel Bead & Multiplex Kit from 10x Genomics. The 10x Genomics Chromium barcoding system was used to construct 10x barcoded cDNA libraries following the manufacturer’s instructions. All libraries were sequenced on an Illumina HiSeq 4000 until sufficient saturation was reached.

scRNA-seq data processing

CellRanger (v3.1.0) was applied for read mapping and gene expression quantification. Cells with fewer than 1000 UMIs or >20% mitochondrial genes were excluded. We also used three algorithms (DoubletFinder, DoubletDetection, and Scrublet)52–54 to find doublets and removed cells that were identified as doublets by at least one algorithm. The total number of transcripts in each cell was normalized to 10,000, followed by log transformation. Then we used Seurat (v3)55 to detect highly variable genes and to perform PCA, graph-based clustering, and t-SNE.
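
The per-cell filtering and normalization steps can be illustrated with a short, library-free sketch. It is not the authors' pipeline: the function names are hypothetical, ">20% mitochondrial genes" is assumed to mean the fraction of a cell's UMIs that map to mitochondrial genes, and the log transformation is assumed to be the natural log of (1 + x), as is common in Seurat-style workflows.

```python
import math


def passes_qc(umi_counts, mito_genes, min_umis=1000, max_mito_fraction=0.20):
    """Check one cell. umi_counts maps gene name -> UMI count for that cell."""
    total = sum(umi_counts.values())
    if total < min_umis:
        return False
    mito = sum(count for gene, count in umi_counts.items() if gene in mito_genes)
    return mito / total <= max_mito_fraction


def is_doublet(detector_calls):
    """A cell is removed if at least one doublet detector flags it."""
    return any(detector_calls)


def normalize_log(umi_counts, scale=10_000):
    """Scale a cell's counts to `scale` total transcripts, then apply log(1 + x)."""
    total = sum(umi_counts.values())
    return {gene: math.log1p(count / total * scale)
            for gene, count in umi_counts.items()}
```

Taking the union of the three detectors' calls, as described in the text, is the conservative choice: it removes more cells than a majority vote would, at the cost of discarding some singlets.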

Classification of malignant cells

As malignant cells harbor significantly more copy number variation (CNV) than normal cells, we estimated CNV from scRNA-seq following the steps described in the previous study56 and made some minor improvements. In brief, we first restricted our target cells to epithelial cells defined by both SingleR. Then, genes were sorted according to their genomic location at each chromosome, and a sliding window of 100 genes was applied to calculate the average relative expression values to derive CNVi (CNV of the ith window). Epithelial cells from P02 and P04 (peripheral normal liver tissue) were used as a reference in the above step. Next, we defined the CNV score of each cell as the mean of squared CNVi across all windows. In addition, we calculated the CNV correlation score by computing the Spearman correlation of the CNVi of a cell and the average CNVi of the single-cells with the top 3% CNV scores from the same tumor. Malignant cells were then defined as those with CNV signal above 0.04 and CNV correlation above 0.5.
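
A minimal sketch of the scoring logic described above, written without external libraries. It assumes the relative expression values have already been computed against the reference cells and sorted by genomic position within a chromosome; all function names are hypothetical, and the "CNV signal" threshold is assumed here to refer to the CNV score.

```python
import math


def window_means(rel_expr, window=100):
    """Mean relative expression over each run of `window` consecutive genes (CNVi)."""
    n_windows = max(len(rel_expr) - window + 1, 0)
    return [sum(rel_expr[i:i + window]) / window for i in range(n_windows)]


def cnv_score(cnv_windows):
    """CNV score of one cell: mean of squared CNVi across all windows."""
    return sum(v * v for v in cnv_windows) / len(cnv_windows)


def average_ranks(xs):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / math.sqrt(vx * vy)


def is_malignant(score, correlation, min_score=0.04, min_correlation=0.5):
    """Apply both thresholds stated in the text."""
    return score > min_score and correlation > min_correlation
```

In use, `correlation` would be `spearman(cell_cnv, tumor_reference)`, where `tumor_reference` is the average CNVi profile of the cells with the top 3% of CNV scores in the same tumor. Requiring both thresholds guards against cells with noisy but unstructured expression, which can have a high score yet correlate poorly with the tumor's own CNV pattern.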

Classification of nonmalignant cells

For all nonmalignant cells, we first used SingleR57 to classify cells into seven major cell types: myeloid cell, NK cell, CD8+ T cell, CD4+ T cell, B cell, endothelial cell, and fibroblast. Other cell types (e.g., hepatocyte, neutrophil, mast cell, and normal epithelial cells) with fewer than 500 cells are excluded. Then we applied the graph-based clustering method implemented in Seurat to group cells into subtypes and each subtype was further annotated according to its marker genes.

Bulk whole-exome sequencing and data processing

DNA was extracted from iCCA tumor and non-tumor liver tissues from these fourteen patients using a DNeasy Blood and Tissue kit (Qiagen), and DNA concentration and purity were determined using a NanoQuant Plate Infinite M200 PRO reader (Tecan Austria GmbH). After enrichment of exonic DNA fragments with a SureSelect Human All Exon Kit (Agilent, 50 Mb V5), sequencing was performed on Illumina NovaSeq 6000.

Raw sequencing reads were mapped to human genome version 38 (hg38) using BWA-MEM58. After removing duplicated reads, SNV and indel were detected using Mutect2 (10.1101/861054) and annotated with Oncotator59. Copy number alteration (CNA) was identified using FACETS60.

Tumor heterogeneity analysis

For WES data, the cancer cell fraction (CCF) and clonality of each mutation were determined following the process described in Nicholas et al.61 Genomic heterogeneity was calculated as the proportion of subclonal mutations in a tumor. For scRNA-seq data, we estimated transcriptomic heterogeneity according to the method in Ma et al.11

Differential expression and pathway analysis

Differentially expressed genes (fold change >4 and P value < 0.001) were identified using the QLF model implemented in edgeR (v3.26.3)62. Pathway enrichment analysis was performed using clusterProfiler63 based on GOBP gene sets from MSigDB.
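
As an illustration of the cutoff only (not of the QLF test itself), the filter can be sketched as follows. It assumes the fold change is reported on a log2 scale, as in edgeR output, and that the cutoff applies in either direction, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def is_differentially_expressed(log2_fold_change, p_value, min_fold=4.0, max_p=0.001):
    """Apply the fold-change and P-value cutoffs to one gene's test result."""
    # A fold change above 4 corresponds to |log2 fold change| above 2.
    return abs(log2_fold_change) > math.log2(min_fold) and p_value < max_p
```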

Gene regulatory network inference

Gene regulatory networks were identified using SCENIC (v1.1.0)24 with default settings. To reduce computing time, the Python implementation in SCENIC (GRNBoost) was used.

Developmental trajectory analysis

Monocle64 was applied to infer the developmental trajectory within each tumor. Only the top 1000 variable genes identified by differentialGeneTest were selected for constructing the developmental tree.

Dual-luciferase assay

The dual reporter plasmid expressing firefly luciferase under the human S100P promoter and Renilla luciferase under the SV40 promoter was constructed. Different concentrations of expression plasmids were transiently transfected into the HEK-293T cells (purchased from ATCC) with Renilla luciferase plasmid. Firefly luciferase activity was measured with a Dual-Luciferase Assay Kit (Promega) 24 h after transfection and normalized with a Renilla luciferase reference plasmid. Results are assessed as the ratio of Firefly luciferase activity to Renilla luciferase activity.
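
The normalization amounts to a per-well ratio. The sketch below uses hypothetical function names and simply averages the ratio over replicate wells; how replicates were actually combined is not stated in the text.

```python
def relative_luciferase(firefly, renilla):
    """Firefly signal normalized to the Renilla reference for one well."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive to normalize")
    return firefly / renilla


def mean_relative_luciferase(wells):
    """Average the normalized ratio over replicate wells given as (firefly, renilla) pairs."""
    ratios = [relative_luciferase(firefly, renilla) for firefly, renilla in wells]
    return sum(ratios) / len(ratios)
```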

RNAi and transfection

Human CREB3L1 siRNA (si-CREB3L1) lentivirus vectors and nonspecific siRNA (si-Ctrl) lentivirus vectors were synthesized by GeneChem Technology (Shanghai, China). The si-CREB3L1 sequence is at nucleotide positions 131–149 (CGGAGAACATGGAGGACTT), as reported previously65. Non-targeting siRNA was used as the negative control. Lentivirus transfection was performed following the manufacturer’s instructions, and the efficiency of silencing was confirmed by immunoblotting.

Transwell invasion assay

Cell invasion was determined by Transwell invasion assay. Briefly, Transwell inserts were first coated with Matrigel (BD, USA). Then, 1 × 10^5 HuCCT1 (purchased from the Chinese Academy of Sciences Shanghai Branch Cell Bank, Shanghai, China) or RBE (purchased from the Cell Resource Center of Tohoku University, Tohoku, Japan) cells suspended in 0.2 mL serum-free medium were added to the inserts, and 0.5 mL medium containing 20% FBS was added to the lower compartment as a chemoattractant. After culturing for 48 h, the cells on the upper membrane were carefully removed with a cotton bud, and cells on the lower surface were fixed with methanol for 15 min and then stained with 0.1% crystal violet solution for 10 min. Photographs were then taken, and the number of cells that had passed through the Matrigel was counted. Assays were performed in duplicate in three independent experiments.

Multiplex immunohistochemistry and quantitative analysis

In brief, 4-μm FFPE TMA sections were deparaffinized in xylene and then rehydrated in 100, 90, and 70% alcohol successively. Antigen unmasking was performed with a preheated epitope retrieval solution, and endogenous peroxidase was inactivated by incubation in 3% H2O2 for 20 min. Next, the sections were pre-incubated with 10% normal goat serum and then incubated overnight with primary antibodies. Panel 1: CK19 (1:3500 dilution, ab52625, Abcam), S100P (1:3000 dilution, ab133554, Abcam), SPP1 (1:2000 dilution, ab214050, Abcam), CD68 (1:2000 dilution, 76437, CST), CCL18 (1:1000 dilution, ab104867, Abcam); panel 2: CK19 (1:3500 dilution, ab52625, Abcam), ID3 (1:2000 dilution, A5375, ABclonal), PDGFRβ (1:3000 dilution, ab32570, Abcam); panel 3: EPCAM (1:2000 dilution, ab223582, Abcam), S100P (1:3000 dilution, ab133554, Abcam), PSCA (1:2000 dilution, sc-80654, Santa Cruz Biotechnology); panel 4: CD45 (1:2500 dilution, ab40763, Abcam), CD3 (1:2000 dilution, ab16669, Abcam), CD68 (1:3000 dilution, ab213363, Abcam), CD206 (1:2000 dilution, 91992, CST), CD20 (1:2500 dilution, ab78237, Abcam), CD56 (1:2000 dilution, ab220360, Abcam); panel 5: CD4 (1:2000 dilution, ab133616, Abcam), CD8 (1:2500 dilution, ab237709, Abcam), PD1 (1:3000 dilution, ab52587, Abcam), FOXP3 (1:2000 dilution, ab215206, Abcam). Detailed information on the antibodies is provided in Supplementary Data 8. Next, sections were incubated with the corresponding HRP-conjugated goat anti-mouse or goat anti-rabbit secondary antibodies (Vector Lab, CA) for 30 min at room temperature. The antigenic binding sites were visualized using Opal dyes: Opal 520, Opal 570, Opal 620, Opal 650, and Opal 690 (PerkinElmer Inc.) were applied to each antibody, respectively.

Data were analyzed as previously described66. Images were analyzed and quantified with inForm software (v2.3, PerkinElmer Inc.), using its active machine-learning algorithm with a pre-visual cutoff followed by single-cell-based mean pixel fluorescence intensity. A threshold value for each marker was identified and displayed using both FCS Express 6 Plus v6.04.0034 (De Novo Software), with FACS-like density plots, and the inForm score, which allows the cutoff to be adjusted based on the score map and the original staining images to improve accuracy.

Statistical analysis

Statistical analysis was performed with R (v3.6.1), SPSS (v22, IBM, Armonk, NY), and Prism 6.0 (San Diego, CA). Comparisons were performed using the χ2 test and the unpaired two-sided Wilcoxon rank-sum test unless otherwise specified. Cumulative survival was estimated with the Kaplan–Meier estimator and compared with a log-rank test.
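
For readers unfamiliar with the Kaplan–Meier (product-limit) estimator, a minimal library-free sketch follows. It is illustrative only and is not the R, SPSS, or Prism routine used in the study; the function name and input layout are hypothetical. At each distinct event time, the running survival probability is multiplied by 1 − d/n, where d is the number of deaths at that time and n is the number of patients still at risk.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time of each patient
    events: True if death was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

Censored patients do not lower the curve directly; they only shrink the risk set for later event times, which is why the step sizes grow as follow-up thins out.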

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Acknowledgements

We thank the High-Performance Computing Platform of the Center for Life Sciences (Peking University) for providing the data analysis platform. The study was supported by project grants from the National Natural Science Foundation of China (Nos. 81961128025, 91859105, 11971039, and 31800743), Programs of the Science and Technology Commission of Shanghai (No. 22YF1407200, 20JC1418900 and 19XD1420700), the Strategic Priority Research Program (No. XDPB0303), Frontier Science Key Research Project (No. QYZDB-SSW-SMC036), Chinese Academy of Sciences, Sanming Project of Medicine in Shenzhen (No. SZSM202003009), and National Key Basic Research Project of China (No. 2020YFE0204000). This study was also supported by the Sino-Russian Mathematics Center.

Author contributions

Q.G., R.X., X.Z., and J.F. contributed to study design and supervised the study. G.S. and Q.G. contributed to writing the manuscript. Y.S., S.H., Y.W., and J.-X.L. assisted in the data analysis. G.S., L.M., and J.M. performed immunohistochemical staining and image analysis. Ju.Z., Yp.L., J.L., and Sy.J. assisted in the preparation of the experiments. S.Y., D.R., Yf.C., and Ym.L. aided in the collection of tissue samples. S.J. and Xl.W. assisted with data collection. S.Z., A.K., Xy.W., and Y.J. assisted in histopathological analysis. Y.C., J.Z., and J.F. made intellectual contributions.

Peer review

Peer review information

Nature Communications thanks Tim Greten, Hidewaki Nakagawa and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Data availability

The raw sequencing data reported in this paper (including scRNA-seq and WES data) have been deposited in the Genome Sequence Archive in National Genomics Data Center under the accession number HRA000863, which is accessible at. The raw sequencing data are available for non-commercial purposes under controlled access because of data privacy laws, and access can be obtained by request to the corresponding authors. The request will be passed within 1 week and then the users will be given a download link valid for 1 year to download the raw data. For public datasets analysis, Jusakul et al.’s dataset20 (including 81 iCCAs and 34 ECCs) was retrieved from GSE89749 and GSE89803, and Job et al.’s dataset21 (including 78 iCCAs) was retrieved from ArrayExpress with accession number E‐MTAB‐6389. The remaining data are available within the Article, Supplementary Information, or Source Data file. Source data are provided with this paper.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Guohe Song, Yang Shi, Lu Meng, Jiaqiang Ma.

Change history

5/17/2022

A Correction to this paper has been published: 10.1038/s41467-022-30599-8

Contributor Information

Jia Fan, Email: fan.jia@zs-hospital.sh.cn.

Xiaoming Zhang, Email: xmzhang@ips.ac.cn.

Ruibin Xi, Email: ruibinxi@math.pku.edu.cn.

Qiang Gao, Email: gaoqiang@fudan.edu.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-022-29164-0.

References

  • 1. Valle JW, Lamarca A, Goyal L, Barriuso J, Zhu AX. New horizons for precision medicine in biliary tract cancers. Cancer Discov. 2017;7:943–962. doi: 10.1158/2159-8290.CD-17-0245.
  • 2. Nakanuma, Y., Klimstra, D., Komuta, M. & Zen, Y. in WHO Classification of Tumours: Digestive System Tumours 5th edn (ed. WHO Classification of Tumors Editorial Board.) 8, 254–259 (World Health Organization, 2019).
  • 3. Banales JM, et al. Cholangiocarcinoma 2020: the next horizon in mechanisms and management. Nat. Rev. Gastroenterol. Hepatol. 2020;17:557–588. doi: 10.1038/s41575-020-0310-z.
  • 4. Kendall T, et al. Anatomical, histomorphological and molecular classification of cholangiocarcinoma. Liver Int. 2019;39:7–18. doi: 10.1111/liv.14093.
  • 5. Aishima S, Oda Y. Pathogenesis and classification of intrahepatic cholangiocarcinoma: different characters of perihilar large duct type versus peripheral small duct type. J. Hepatobiliary Pancreat. Sci. 2015;22:94–100. doi: 10.1002/jhbp.154.
  • 6. Aishima S, et al. Gastric mucin phenotype defines tumour progression and prognosis of intrahepatic cholangiocarcinoma: gastric foveolar type is associated with aggressive tumour behaviour. Histopathology. 2006;49:35–44. doi: 10.1111/j.1365-2559.2006.02414.x.
  • 7. Prica F, Radon T, Cheng Y, Crnogorac-Jurcevic T. The life and works of S100P - from conception to cancer. Am. J. Cancer Res. 2016;6:562–576.
  • 8. Komuta M, et al. Histological diversity in cholangiocellular carcinoma reflects the different cholangiocyte phenotypes. Hepatology. 2012;55:1876–1888. doi: 10.1002/hep.25595.
  • 9. Rhee H, et al. Transcriptomic and histopathological analysis of cholangiolocellular differentiation trait in intrahepatic cholangiocarcinoma. Liver Int. 2018;38:113–124. doi: 10.1111/liv.13492.
  • 10. Liau JY, et al. Morphological subclassification of intrahepatic cholangiocarcinoma: etiological, clinicopathological, and molecular features. Mod. Pathol. 2014;27:1163–1173. doi: 10.1038/modpathol.2013.241.
  • 11. Ma L, et al. Tumor cell biodiversity drives microenvironmental reprogramming in liver cancer. Cancer Cell. 2019;36:418–430 e416. doi: 10.1016/j.ccell.2019.08.007.
  • 12. Zhang M, et al. Single-cell transcriptomic architecture and intercellular crosstalk of human intrahepatic cholangiocarcinoma. J. Hepatol. 2020;73:1118–1130. doi: 10.1016/j.jhep.2020.05.039.
  • 13. Karthaus WR, et al. Regenerative potential of prostate luminal cells revealed by single-cell analysis. Science. 2020;368:497–505. doi: 10.1126/science.aay0267.
  • 14. Chen Z, et al. Single-cell RNA sequencing highlights the role of inflammatory cancer-associated fibroblasts in bladder urothelial carcinoma. Nat. Commun. 2020;11:5077. doi: 10.1038/s41467-020-18916-5.
  • 15. Xue R, et al. Genomic and transcriptomic profiling of combined hepatocellular and intrahepatic cholangiocarcinoma reveals distinct molecular subtypes. Cancer Cell. 2019;35:932–947 e938. doi: 10.1016/j.ccell.2019.04.007.
  • 16. Moorman HR, et al. Osteopontin: a key regulator of tumor progression and immunomodulation. Cancers. 2020;12:3379.
  • 17. Ma L, et al. Single-cell atlas of tumor cell evolution in response to therapy in hepatocellular carcinoma and intrahepatic cholangiocarcinoma. J. Hepatol. 2021;75:1397–1408.
  • 18. Shevde LA, Samant RS. Role of osteopontin in the pathophysiology of cancer. Matrix Biol. 2014;37:131–141. doi: 10.1016/j.matbio.2014.03.001.
  • 19. Hayashi A, et al. Distinct clinicopathologic and genetic features of 2 histologic subtypes of intrahepatic cholangiocarcinoma. Am. J. Surg. Pathol. 2016;40:1021–1030. doi: 10.1097/PAS.0000000000000670.
  • 20. Jusakul A, et al. Whole-genome and epigenomic landscapes of etiologically distinct subtypes of cholangiocarcinoma. Cancer Discov. 2017;7:1116–1135. doi: 10.1158/2159-8290.CD-17-0368.
  • 21. Job S, et al. Identification of four immune subtypes characterized by distinct composition and functions of tumor microenvironment in intrahepatic cholangiocarcinoma. Hepatology. 2020;72:965–981. doi: 10.1002/hep.31092.
  • 22. Saeki N, Gu J, Yoshida T, Wu X. Prostate stem cell antigen: a Jekyll and Hyde molecule? Clin. Cancer Res. 2010;16:3533–3538. doi: 10.1158/1078-0432.CCR-09-3169.
  • 23. Fu YP, et al. Common genetic variants in the PSCA gene influence gene expression and bladder cancer risk. Proc. Natl Acad. Sci. USA. 2012;109:4974–4979. doi: 10.1073/pnas.1202189109.
  • 24. Aibar S, et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods. 2017;14:1083–1086. doi: 10.1038/nmeth.4463.
  • 25. Sampieri L, Di Giusto P, Alvarez C. CREB3 transcription factors: ER-golgi stress transducers as hubs for cellular homeostasis. Front. Cell Dev. Biol. 2019;7:123. doi: 10.3389/fcell.2019.00123.
  • 26.Zhang L, et al. Single-cell analyses inform mechanisms of myeloid-targeted therapies in colon cancer. Cell. 2020;181:442–459 e429. doi: 10.1016/j.cell.2020.03.048. [DOI] [PubMed] [Google Scholar]
  • 27.Song G, et al. Global immune characterization of HBV/HCV-related hepatocellular carcinoma identifies macrophage and T-cell subsets associated with disease progression. Cell Disco. 2020;6:90. doi: 10.1038/s41421-020-00214-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Xiao Z, Dai Z, Locasale JW. Metabolic landscape of the tumor microenvironment at single cell resolution. Nat. Commun. 2019;10:3763. doi: 10.1038/s41467-019-11738-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Azizi E, et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. Cell. 2018;174:1293–1308 e1236. doi: 10.1016/j.cell.2018.05.060. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Collins K, Newcomb PH, Cartun RW, Ligato S. Utility and limitations of albumin mRNA in situ hybridization detection in the diagnosis of hepatobiliary lesions and metastatic carcinoma to the liver. Appl. Immunohistochem. Mol. Morphol. 2021;29:180–187. doi: 10.1097/PAI.0000000000000885. [DOI] [PubMed] [Google Scholar]
  • 31.Lin F, et al. Detection of albumin expression by RNA in situ hybridization is a sensitive and specific method for identification of hepatocellular carcinomas and intrahepatic cholangiocarcinomas. Am. J. Clin. Pathol. 2018;150:58–64. doi: 10.1093/ajcp/aqy030. [DOI] [PubMed] [Google Scholar]
  • 32.Wang LH, Baker NE. E proteins and ID proteins: helix-loop-helix partners in development and disease. Dev. Cell. 2015;35:269–280. doi: 10.1016/j.devcel.2015.10.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Huang L, et al. ID3 promotes stem cell features and predicts chemotherapeutic response of intrahepatic cholangiocarcinoma. Hepatology. 2019;69:1995–2012. doi: 10.1002/hep.30404. [DOI] [PubMed] [Google Scholar]
  • 34.Filippou PS, Karagiannis GS, Constantinidou A. Midkine (MDK) growth factor: a key player in cancer progression and a promising therapeutic target. Oncogene. 2020;39:2040–2054. doi: 10.1038/s41388-019-1124-8. [DOI] [PubMed] [Google Scholar]
  • 35.Caramel J, Ligier M, Puisieux A. Pleiotropic roles for ZEB1 in cancer. Cancer Res. 2018;78:30–35. doi: 10.1158/0008-5472.CAN-17-2476. [DOI] [PubMed] [Google Scholar]
  • 36.Kawasaki K, et al. LGR5 induces beta-catenin activation and augments tumour progression by activating STAT3 in human intrahepatic cholangiocarcinoma. Liver Int. 2021;41:865–881. doi: 10.1111/liv.14747. [DOI] [PubMed] [Google Scholar]
  • 37.Wang X, et al. Comparative analysis of cell lineage differentiation during hepatogenesis in humans and mice at the single-cell transcriptome level. Cell Res. 2020;30:1109–1126. doi: 10.1038/s41422-020-0378-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Poschel A, et al. Identification of disease-promoting stromal components by comparative proteomic and transcriptomic profiling of canine mammary tumors using laser-capture microdissected FFPE tissue. Neoplasia. 2021;23:400–412. doi: 10.1016/j.neo.2021.03.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.van Huizen, et al. Up-regulation of collagen proteins in colorectal liver metastasis compared with normal liver tissue. J. Biol. Chem. 2019;294:281–289. doi: 10.1074/jbc.RA118.005087. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Chen X, Song E. Turning foes to friends: targeting cancer-associated fibroblasts. Nat. Rev. Drug Disco. 2019;18:99–115. doi: 10.1038/s41573-018-0004-1. [DOI] [PubMed] [Google Scholar]
  • 41.Bansil R, Turner BS. The biology of mucus: composition, synthesis and organization. Adv. Drug Deliv. Rev. 2018;124:3–15. doi: 10.1016/j.addr.2017.09.023. [DOI] [PubMed] [Google Scholar]
  • 42.Glimcher LH, Lee AH, Iwakoshi NN. XBP-1 and the unfolded protein response (UPR) Nat. Immunol. 2020;21:963–965. doi: 10.1038/s41590-020-0708-3. [DOI] [PubMed] [Google Scholar]
  • 43.Schroeder BW, et al. AGR2 is induced in asthma and promotes allergen-induced mucin overproduction. Am. J. Respir. Cell Mol. Biol. 2012;47:178–185. doi: 10.1165/rcmb.2011-0421OC. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Zhou KQ, et al. Circulating osteopontin per tumor volume as a prognostic biomarker for resectable intrahepatic cholangiocarcinoma. Hepatobiliary Surg. Nutr. 2019;8:582–596. doi: 10.21037/hbsn.2019.03.14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Zheng Y, et al. Osteopontin promotes metastasis of intrahepatic cholangiocarcinoma through recruiting MAPK1 and mediating Ser675 phosphorylation of beta-Catenin. Cell Death Dis. 2018;9:179. doi: 10.1038/s41419-017-0226-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Asayama Y, et al. Coexpression of neural cell adhesion molecules and bcl-2 in intrahepatic cholangiocarcinoma originated from viral hepatitis: relationship to atypical reactive bile ductule. Pathol. Int. 2002;52:300–306. doi: 10.1046/j.1440-1827.2002.01349.x. [DOI] [PubMed] [Google Scholar]
  • 47.Tsai JH, et al. S100P immunostaining identifies a subset of peripheral-type intrahepatic cholangiocarcinomas with morphological and molecular features similar to those of perihilar and extrahepatic cholangiocarcinomas. Histopathology. 2012;61:1106–1116. doi: 10.1111/j.1365-2559.2012.04316.x. [DOI] [PubMed] [Google Scholar]
  • 48.Liau JY, et al. Morphological subclassification of intrahepatic cholangiocarcinoma: etiological, clinicopathological, and molecular features. Mod. Pathol. 2014;27:1163–1173. doi: 10.1038/modpathol.2013.241. [DOI] [PubMed] [Google Scholar]
  • 49.Song Z, et al. Osteopontin takes center stage in chronic liver disease. Hepatology. 2021;73:1594–1608. doi: 10.1002/hep.31582. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Avadhani V, Cohen C, Siddiqui MT, Krasinskas A. A subset of intrahepatic cholangiocarcinomas express albumin RNA as detected by in situ hybridization. Appl. Immunohistochem. Mol. Morphol. 2021;29:175–179. doi: 10.1097/PAI.0000000000000882. [DOI] [PubMed] [Google Scholar]
  • 51.Gao Q, et al. Activating mutations in PTPN3 promote cholangiocarcinoma cell proliferation and migration and are associated with tumor recurrence in patients. Gastroenterology. 2014;146:1397–1407. doi: 10.1053/j.gastro.2014.01.062. [DOI] [PubMed] [Google Scholar]
  • 52.DePasquale EAK, et al. DoubletDecon: deconvoluting doublets from single-cell RNA-sequencing data. Cell Rep. 2019;29:1718–1727 e1718. doi: 10.1016/j.celrep.2019.09.082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.McGinnis CS, Murrow LM, Gartner ZJ. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 2019;8:329–337 e324. doi: 10.1016/j.cels.2019.03.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Wolock SL, Lopez R, Klein AM. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 2019;8:281–291 e289. doi: 10.1016/j.cels.2018.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Stuart T, et al. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902 e1821. doi: 10.1016/j.cell.2019.05.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Tirosh I, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352:189–196. doi: 10.1126/science.aad0501. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Aran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 2019;20:163–172. doi: 10.1038/s41590-018-0276-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Ramos AH, et al. Oncotator: cancer variant annotation tool. Hum. Mutat. 2015;36:E2423–E2429. doi: 10.1002/humu.22771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Shen R, Seshan VE. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 2016;44:e131. doi: 10.1093/nar/gkw520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.McGranahan N, et al. Clonal status of actionable driver events and the timing of mutational processes in cancer evolution. Sci. Transl. Med. 2015;7:283ra254. doi: 10.1126/scitranslmed.aaa1408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–287. doi: 10.1089/omi.2011.0118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Qiu X, et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods. 2017;14:979–982. doi: 10.1038/nmeth.4402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Denard B, et al. The membrane-bound transcription factor CREB3L1 is activated in response to virus infection to inhibit proliferation of virus-infected cells. Cell Host Microbe. 2011;10:65–74. doi: 10.1016/j.chom.2011.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Ma J, et al. PD1(Hi) CD8(+) T cells correlate with exhausted signature and poor clinical outcome in hepatocellular carcinoma. J. Immunother. Cancer. 2019;7:331. doi: 10.1186/s40425-019-0814-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Peer Review File (2.9MB, pdf)
41467_2022_29164_MOESM3_ESM.pdf (109.6KB, pdf)

Description of Additional Supplementary Files

Supplementary Data 1 (15.7KB, xlsx)
Supplementary Data 2 (770KB, xlsx)
Supplementary Data 3 (15.2KB, xlsx)
Supplementary Data 4 (33KB, xlsx)
Supplementary Data 5 (1.4MB, xlsx)
Supplementary Data 6 (232.8KB, xlsx)
Supplementary Data 7 (1.4MB, xlsx)
Supplementary Data 8 (10.3KB, xlsx)
Reporting Summary (307.9KB, pdf)

Data Availability Statement

The raw sequencing data reported in this paper (including scRNA-seq and WES data) have been deposited in the Genome Sequence Archive in National Genomics Data Center under the accession number HRA000863, which is accessible at. The raw sequencing data are available for non-commercial purposes under controlled access because of data privacy laws, and access can be obtained by request to the corresponding authors. Requests will be processed within 1 week, after which users will be given a download link valid for 1 year to download the raw data. For the analysis of public datasets, Jusakul et al.'s dataset (ref. 20; including 81 iCCAs and 34 ECCs) was retrieved from GSE89749 and GSE89803, and Job et al.'s dataset (ref. 21; including 78 iCCAs) was retrieved from ArrayExpress with accession number E-MTAB-6389. The remaining data are available within the Article, Supplementary Information, or Source Data file. Source data are provided with this paper.


Articles from Nature Communications are provided here courtesy of Nature Publishing Group