Abstract
Primary liver cancer arises either from hepatocytic or biliary lineage cells, giving rise to hepatocellular carcinoma (HCC) or intrahepatic cholangiocarcinoma (ICCA). Combined hepatocellular-cholangiocarcinomas (cHCC-CCA) exhibit equivocal or mixed features of both, causing diagnostic uncertainty and difficulty in determining proper management. Here, we perform a comprehensive deep learning-based phenotyping of multiple cohorts of patients. We show that deep learning can reproduce the diagnosis of HCC vs. CCA with a high performance. We analyze a series of 405 cHCC-CCA patients and demonstrate that the model can reclassify the tumors as HCC or ICCA, and that the predictions are consistent with clinical outcomes, genetic alterations and in situ spatial gene expression profiling. This type of approach could improve treatment decisions and ultimately clinical outcome for patients with rare and biphenotypic cancers such as cHCC-CCA.
Subject terms: Liver cancer, Tumour heterogeneity, Biomedical engineering, Machine learning, Pathology
Combined hepatocellular-cholangiocarcinomas (cHCC-CCA) are challenging to diagnose, as they exhibit features of hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICCA). Here, the authors use deep learning to re-classify cHCC-CCA tumours into HCC or ICCA based on histopathology images.
Introduction
Primary liver cancer is the fourth leading cause of cancer-related death worldwide and an increasing public health problem1. The two most common types of primary liver cancer are hepatocellular carcinoma (HCC), which derives from hepatocytes, and intrahepatic cholangiocarcinoma (ICCA), which is thought to originate from biliary epithelial cells1. These two entities represent the two ends of the primary liver tumor spectrum and have completely different risk factors, clinical outcomes, treatment strategies and genetic/molecular features1,2.
Combined hepatocellular-cholangiocarcinoma (cHCC-CCA) is a rare variant of liver cancer which can present as a mixture or a coexistence of tumor tissue with hepatocellular and biliary morphological differentiation3. Most cases, however, display equivocal features that cannot be easily classified as either HCC or ICCA. This explains why the diagnosis is often very difficult for pathologists. Clinical management of patients with cHCC-CCA is also highly challenging, and, due to the rarity of this cancer, there are no consensus guidelines. Treatment strategies are usually extrapolated from HCC and ICCA, but the regulatory approval of modern therapies is usually restricted to “pure” HCCs or ICCAs. As a result, patients with cHCC-CCA often do not respond well to therapies and have detrimental clinical outcomes3. Interestingly, several studies showed that cHCC-CCA displays overlapping genetic alterations and gene expression profiles with those of HCC or ICCA, and it is debated whether cHCC-CCA represents a true molecular entity3–5. A recent study has suggested that cHCC-CCA arise from liver progenitor cells, and that its development is dependent on IL-6 trans-signaling6. Another hypothesis is that these tumors may indeed arise from the dedifferentiation or transdifferentiation of a preexisting conventional HCC or ICCA, but maintain a phylogenetic proximity to their ancestral differentiation3.
Artificial intelligence (AI) is widely used in pathology image analysis. We and others have applied AI to digitized whole slide images (WSI) of different cancers, including primary liver tumors, and showed that AI can extract clinically actionable information directly from routinely available tissue slides stained with hematoxylin and eosin (H&E)7–10. In this work, we aim to determine if AI allows the reclassification of cHCC-CCA as pure HCC or ICCA (Fig. 1A), and if this classification has both a clinical (in terms of prognostication) and a molecular (in terms of concordance with genetic defects and spatial molecular profiles) relevance.
Results
AI model performance in differentiating HCC and ICCA
To investigate whether an AI model can re-classify cHCC-CCA tumors into “pure” HCC or ICCA categories, we trained an AI pipeline based on a self-supervised feature extractor11 with an attention-MIL aggregation model12–14 (Fig. 1B) to distinguish pure HCCs (785 WSIs from n = 424 patients) from pure ICCAs (239 WSIs from n = 167 patients) (Methods, Supplemental Tables 1 and 2). In this cohort (“Discovery cohort”, Fig. 1C), the model achieved a cross-validated area under the receiver operating characteristic curve (AUROC) of 0.99 [0.01], corresponding to an almost perfect separability of the classes (Fig. 2A), reaching a sensitivity of 97.9% and specificity of 97.6%. As another piece of evidence for the plausibility of the model’s predictions, we subsequently evaluated the model on another patient cohort, the publicly available TCGA cohort, which was composed of n = 333 HCCs (TCGA-LIHC) and n = 27 ICCAs (TCGA-CHOL). The model never saw the labels of the TCGA cohort during training; however, the feature extractor was exposed to some TCGA image data during self-supervised pretraining, which might affect this intermediary result but not the subsequent results. We found that the model reached an AUROC of 0.94 [0.05], representing a very good generalizability to this additional dataset (Fig. 2B). Next, we asked which tissue structures were used by the model to make its prediction and found that the model paid high attention to areas with an ICCA-like phenotype (glandular structures and fibrous stroma) (Supplemental Fig. 1). Together, these data show that the AI model can robustly distinguish pure HCC from pure ICCA tumors (Fig. 2C). We used this model as the starting point for our subsequent experiments.
AI model application on cHCC-CCA samples
Subsequently, we applied the trained model to a large multicentric cohort of cases which were initially diagnosed as cHCC-CCA (Supplemental Table 3). We investigated the spatial prediction maps and found that, generally, regions with HCC-like morphology were assigned a high “HCCness” by the model, while regions with ICCA-like morphology were assigned a high “ICCAness” by the model (Fig. 3A). For tumors with a significant proportion of equivocal or intermediate features, it was however much more difficult for pathologists to determine the morphology and proportion of areas with high “HCCness” or “ICCAness”. As region-specific predictions are not clinically actionable, we further investigated the patient-level prediction scores. We found that these scores followed a bimodal distribution, with a subset of cases peaking at a high HCC prediction and the remainder of the cases peaking at a high ICCA prediction (Fig. 3B). Importantly, there was no association between the predictions and the clinical centers managing the patients (p = 0.62). Together, these data show that the AI model can process tissue samples of cHCC-CCA cases and re-classify them as HCC or CCA. We then sought to determine if a simple pathological reclassification of cHCC-CCA tumors by microscopic examination was associated with AI predictions. To this aim, all cases were reviewed in a blinded way by an expert liver pathologist (JC) and cHCC-CCA were reclassified as HCC or ICCA according to the more abundant morphological component (Methods). Interestingly, only a slight concordance was observed between the pathological analysis and the model’s predictions (Cohen’s Kappa 0.19, Supplemental Fig. 2), indicating that the AI model does not simply assess the more abundant tissue component in the way a human pathologist would.
Clinical outcomes based on AI-based reclassification
Next, we investigated the potential clinical and biological relevance of the AI-based reclassification of cHCC-CCAs. One of the major differences between HCC and ICCA is their clinical outcome, with worse overall 5-year survival rates for patients with ICCA3. We thus aimed to determine if our reclassification had an impact on the prognosis. Indeed, patients with cHCC-CCA reclassified as ICCA had a shorter median survival (29 months) than patients with a tumor reclassified as HCC (median survival not reached, p = 0.052, hazard ratio = 1.76, 95% CI 0.98-3.14, Fig. 3C and Supplemental Table 4). Survival prediction is particularly relevant in patients who receive a liver transplant, as donor organs should be prioritized for patients with a good prognosis. A diagnosis of cHCC-CCA is currently considered a contra-indication to this therapeutic modality, which remains a curative option for patients with HCC. As observed for resection, patients with cHCC-CCA reclassified as HCC showed a prolonged 5-year overall survival (76.4% survival rate, median survival not reached), which is similar to that usually observed in patients with conventional HCC (Fig. 3D and Supplemental Table 5). Transplanted patients with an incidental diagnosis of cHCC-CCA at transplant who were re-classified as ICCA by the AI model had a poor median survival of 48 months and a 5-year overall survival of only 45.4% (hazard ratio = 2.69, 95% CI 1.15-6.31). As opposed to resected patients, the prognostic impact remained significant on multivariate analysis. This observation may be explained by the fact that the competing mortality from cirrhosis and the risk of tumor recurrence are minimized by liver transplantation, as the whole diseased liver is replaced by the intervention.
We also observed differences in patient characteristics according to the 2 different therapeutic modalities: transplanted patients were more frequently male (p = 0.039), cirrhotic (p < 0.001), with a higher frequency of alcohol consumption (p < 0.001) and a lower rate of HBV infection (p = 0.004) (Supplemental Table 6).
We further aimed to determine if the conventional reclassification of cHCC-CCA (according to the more abundant contingent assessed by a blinded pathologist) yielded any prognostic value, but observed that it had no significant impact on survival in either resected (p = 0.16) or transplanted (p = 0.32) patients (Supplemental Figs. 3 and 4). In order to assess the degree of inter-observer variability of this histological reclassification, slides were also reviewed by another pathologist. The overall agreement was only fair (Cohen’s Kappa of 0.37), supporting the use of a more standardized and reproducible system such as our model. Altogether, these data suggest that our AI-based reclassification of cHCC-CCA allows us to make more clinically relevant predictions about disease outcomes than a classical pathological assessment.
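For readers unfamiliar with the agreement statistic reported above, the following is a minimal Python sketch of Cohen's Kappa for two raters assigning categorical labels. It is only an illustration: the software list in this article names the R package irr, and the function name `cohen_kappa` and the toy labels are assumptions introduced here, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[k] * freq_b[k] for k in labels) / (n * n)
    if expected == 1.0:  # both raters used one single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels (not study data): AI call vs. pathologist call.
ai_calls = ["HCC", "HCC", "ICCA", "ICCA"]
pathologist_calls = ["HCC", "ICCA", "ICCA", "ICCA"]
print(cohen_kappa(ai_calls, pathologist_calls))
```

A value near 0 indicates agreement no better than chance and a value of 1 indicates perfect agreement, which is why the values of 0.19 and 0.37 reported in this section are described as slight and fair.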
AI-based reclassification and genomic alterations
We next investigated if the model’s predictions were concordant with known genetic differences of HCC and ICCA. We performed targeted next-generation sequencing with a panel that includes all major genes involved in HCC or ICCA development for n = 104 randomly selected cases. We identified several cases with alterations in TERT promoter, CTNNB1 and NFE2L2, which typically occur in HCCs, and several cases with FGFR2 fusions and IDH1/2, KRAS, NRAS, BRAF and HER2 mutations, which typically occur in ICCAs. We found that all genetic alterations in HCC-specific genes occurred in the tumor subset which the AI model had re-classified as HCC (Fig. 3E). Eleven out of 16 genetic alterations which are typical for ICCA occurred in tumors that were re-classified as CCA. In other words, the AI predictions match the genomic alterations of cHCC-CCA (p = 0.0009 in Fisher’s exact test), suggesting that the model detects patterns directly linked to the genetic defects identified by genomic profiling of the tumor tissue.
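The association test mentioned above can be sketched in a few lines of Python. This is a generic two-sided Fisher's exact test on a 2x2 table, written under the common convention that the p-value sums the probabilities of all tables no more likely than the observed one. The function name and the example counts are hypothetical; the full contingency table of the study is not given in this section, so the example does not reproduce the reported p-value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables at least as extreme (small tolerance for float ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (not study data):
# rows = alteration type (HCC-type, ICCA-type), columns = AI call (HCC, ICCA).
print(fisher_exact_two_sided(3, 1, 1, 3))
```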
Spatial transcriptomics analysis and AI predictions
To gain further insights into the in situ relationships between the model’s predictions and the underlying biology, we performed spatial transcriptomics on tissue sections obtained from formalin-fixed, paraffin embedded blocks of 6 randomly selected cHCC-CCA cases. We then applied our model on the corresponding WSI and matched the prediction heatmaps with the gene expression profiling data. We investigated, within each case, the differences between the 100 image tiles most highly associated with a prediction of ICCA and the 100 tiles most highly associated with a prediction of HCC. We observed that the model’s predictions matched the underlying in situ gene expression profiles of the tumors. For Sample #A4 (Fig. 4A and B), areas predicted as ICCA-like were indeed associated with increased expression of genes related to cholangiocyte differentiation (e.g., EPCAM, HNF1B and KRT7) and decreased expression of well-known hepatocytic markers (ALB, FABP1 and APOB) (Fig. 4B and Supplementary Data 1). Similar findings were obtained for 2 additional cases (#A1 and #A2), while the results for the remaining samples were less clear, with few significantly dysregulated genes (#A3, #A5 and #A6) (Supplementary Data 1), possibly due to the constraints of performing spatial transcriptomics analyses on formalin-fixed paraffin-embedded material.
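The tile selection described above can be sketched as follows. This is an illustrative simplification, assuming each tile carries a single ICCA score between 0 and 1 and that a low ICCA score is equivalent to a high HCC score; the exact tile-level quantity used for ranking in the study (for example, whether attention weights enter the ranking) is not stated in this section. The function name and tile identifiers are hypothetical.

```python
def top_tiles_per_class(tile_scores, k=100):
    """Split tiles into the k most ICCA-like and the k most HCC-like.

    tile_scores maps a tile identifier to an ICCA score in [0, 1]
    (assumption: the HCC score is the complement of the ICCA score).
    """
    ranked = sorted(tile_scores, key=tile_scores.get, reverse=True)
    # Cap k so that the two groups can never share a tile.
    k = min(k, len(ranked) // 2)
    most_icca = ranked[:k]
    most_hcc = ranked[::-1][:k]
    return most_icca, most_hcc


# Hypothetical tile scores for one slide (not study data).
scores = {"t1": 0.9, "t2": 0.1, "t3": 0.5, "t4": 0.8, "t5": 0.2}
icca_tiles, hcc_tiles = top_tiles_per_class(scores, k=2)
print(icca_tiles, hcc_tiles)
```

The two tile groups would then be compared for differential gene expression using the spots that overlap them.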
Discussion
In summary, our study shows that AI-based reclassification of cHCC-CCA into one of the “pure” HCC or CCA categories could improve prognostication, which is critical given the therapeutic implications, and also help to determine if a given cHCC-CCA tumor is genetically more similar to HCC or ICCA. A diagnosis of cHCC-CCA remains a formidable challenge for physicians and little or no evidence is available to guide the treatment options for the patients. Hence, oncologists often recommend treatment according to HCC or CCA therapeutic strategies, but the responses are often poor and their outcome dire3,15. Recent large scale molecular studies of cHCC-CCA have failed to demonstrate any specific genetic alterations, and most cases have a similar gene expression profile to that of HCC or ICCA3,4. This reclassification could be performed by genetic profiling, however these approaches are not universally available and are lengthy and costly. By definition, routine histopathological slides are available for every single one of these patients as histopathological evaluation is needed to make a diagnosis of cHCC-CCA in the first place.
Here, we have shown that an AI system can make a clear call for either HCC or ICCA. Reclassifying tumors as ICCA could be clinically useful as some of the associated alterations, including FGFR2 fusions and BRAF and IDH mutations, can be targeted by specific drugs. Measurable antitumor activity has indeed been reported with pemigatinib, futibatinib (FGFR inhibitors), ivosidenib (IDH1 inhibitor) or neratinib (pan-HER tyrosine kinase inhibitor)16–18. The standard of care for patients with advanced disease is also different between HCC (atezolizumab plus bevacizumab or durvalumab plus tremelimumab) and ICCA (durvalumab plus chemotherapy), and prospective clinical trials, although very challenging to carry out, are needed to determine if patients with cHCC-CCA may benefit from our reclassification approach to be allocated the systemic treatment that fits with the predicted class.
A limitation of our study is that by the nature of this problem, there is no ground truth for our proposed reclassification. We rely on a combination of clinical and genomic markers to demonstrate the plausibility and utility of our proposed reclassification scheme. We further observe that as for any biomarker, model predictions close to the decision cutoff are associated with a higher ambiguity. For such cases falling in the mid-range of the AI score, pathologists could integrate other factors such as clinical probability or imaging results to enhance the prediction confidence. Our software is currently suitable for research use only, and regulatory approvals will be needed to ensure its reliability and efficacy. The sharing of our source code however encourages further development and application by the wider research community.
The next step for the implementation of such models will be their validation on biopsies, as they are the only type of samples that can be obtained before surgery (resection or transplantation) or in patients with advanced disease not amenable for curative therapies. It may be challenging as biopsy is rarely performed due to the existence of non-invasive HCC diagnostic criteria. There is however a renewed interest in biopsy, in particular in the context of clinical trials, and this validation process could be undertaken in the near future. This could also be an opportunity to determine whether the addition of biological features (AFP/CA 19-9) or radiological findings may improve the classification.
In conclusion, our study demonstrates that AI may be useful for tumors that do not fit into common nosological frameworks. Developing evidence-based guidelines for such rare and challenging entities is indeed difficult (if not impossible). Our method could be applied to other cancer subtypes with mixed or biphenotypic differentiation that present a therapeutic challenge, such as combined adenocarcinomas / neuroendocrine tumors or adenosquamous carcinomas. We also believe that the combination of deep-learning heatmaps with spatial transcriptomics is a useful approach to provide insights into the molecular profile of highly predictive areas, and thus demonstrates that AI can be used as a tool for understanding tumor tissue in a research context.
Methods
Ethics statement
This study reports a retrospective analysis of archival tissue samples of primary liver tumors collected in a multicentric way. The protocol was approved by the review board of Université Paris Est Creteil, France (ID n° APHP22012), and the study was conducted in accordance with the Declaration of Helsinki and the legislation of each participating center. In this international multicentric cohort, informed consent was obtained from patients when required by local regulations. Centers with informed written consent obtained: Hamburg, Barcelona, Mondor, Chinese University of Hong Kong, Beaujon, Paul Brousse. Centers with waiver of consent after IRB approval: University of Texas Southwestern, Stanford, Aachen, Pitié-Salpêtrière, Michigan University, Chennai, Rouen, Saint Antoine, Lille, Angers, Milano, Amiens, Hong Kong, Poitiers, St Louis University, Seoul National University College of Medicine, Prince of Songkhla, Montpellier, Brest, Reims, Yale School of Medicine, Bachmai Hospital, Mayo Clinic Rochester, Regensburg.
Patients and samples
Slides used for training (pure HCC and ICCA) were obtained from the archives of five pathology departments. Inclusion criteria were as follows: (1) patients with HCC or ICCA treated by surgical resection, (2) lack of preoperative antitumor treatment and (3) available WSI and baseline clinical, biological and pathological features. They were scanned using a Hamamatsu Nanozoomer S360 (ndpi encoding format) or a Leica Aperio (svs encoding format) scanning device. For the validation cohort, we used TCGA-LIHC and TCGA-CHOL cohorts (HCC n = 333 and ICCA n = 27). Slides of cHCC-CCA were obtained from European (n = 18), American (n = 6) and Asian (n = 6) liver centers. Inclusion criteria were: patients treated by surgical resection or liver transplantation, diagnosis of cHCC-CCA as defined by the World Health Organization and available histological slides and baseline clinical data.
Pathological reviewing
For all cases, an expert liver pathologist (JC) reviewed the cHCC-CCA histological slides and quantified each contingent (HCC, ICCA, and intermediate/equivocal). To compare the AI model’s prediction with a conventional morphological reclassification, cHCC-CCA cases were reclassified as HCC if the HCC contingent was more abundant than the ICCA contingent or as ICCA if the ICCA contingent was more abundant than the HCC contingent.
Development and validation of the deep learning model
Processing of WSIs was performed according to a pre-defined protocol19. Digitized WSIs were preprocessed by tessellation into non-overlapping small patches of size pixels at an edge length of 256 µm. Background and blurry tiles were removed in order to provide the deep learning model with clean and informative input. As described before, the Canny Edge detector module with a threshold of 2 from OpenCV was used20. The primary analysis was carried out using the raw image tiles, and was repeated after color-normalization with the Macenko method to investigate potential batch effects21. Then, we used our previously published pipeline “Marugoto” for supervised Deep Learning12,13. The pipeline consists of a feature extraction module which transforms each tile into a feature vector of size . Feature vectors of all tiles for each slide are subsequently processed by an aggregation module which outputs a single score for a given WSI. As the feature extraction module, we used a resnet50 which was pre-trained in a self-supervised way with the RetCCL method in a previous study11. The pretraining included TCGA image data but no labels. As the aggregation module, we used a custom-built attention-based multiple instance learning (attMIL) model22. AttMIL incorporates an attention mechanism that involves two fully connected layers that compute an attention score for each tile, resulting in a bag-level feature vector obtained by scaling the embeddings of each tile using the softmax of its attention score, and adding them up22. This bag-level feature vector is then transformed into a final classification through another fully connected layer. To train the model on multiple patients in each batch, a subset of tiles from each patient is considered sufficient, and in each epoch, the tiles are re-sampled20,22. We trained the model for 100 epochs with a patience of 16 for an early-stopping callback. We used a batch size of 64 for the training subset and defined a fixed bag size of 512.
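The attention-based aggregation step described above can be illustrated with a minimal, dependency-free Python sketch. It is not the Marugoto implementation (which uses torch and fastai): the tanh nonlinearity between the two attention layers, the layer sizes, and all function and variable names are assumptions made here for illustration only.

```python
import math


def linear(x, weights, bias):
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tiles, w1, b1, w2, b2):
    """Aggregate tile embeddings into one bag-level vector.

    Two fully connected layers give one attention score per tile
    (tanh in between is an assumption); the softmax of the scores
    weights the tile embeddings, which are then summed.
    """
    scores = []
    for tile in tiles:
        hidden = [math.tanh(h) for h in linear(tile, w1, b1)]
        scores.append(linear(hidden, w2, b2)[0])
    weights = softmax(scores)
    dim = len(tiles[0])
    bag = [sum(w * tile[i] for w, tile in zip(weights, tiles)) for i in range(dim)]
    return bag, weights


def classify(bag, w_out, b_out):
    """Final fully connected layer followed by softmax over the classes."""
    return softmax(linear(bag, w_out, b_out))


# Hypothetical 2-dimensional embeddings of two tiles and toy weights.
tiles = [[1.0, 0.0], [3.0, 2.0]]
bag, attention = attention_pool(tiles, [[0.5, -0.5]], [0.0], [[1.0]], [0.0])
print(attention, classify(bag, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

Because the attention weights sum to one, the bag-level vector is a weighted average of the tile embeddings, and the per-tile weights can be displayed as a heatmap over the slide.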
For every training epoch, the instances for each bag have been randomly re-sampled. We used all the tiles of the WSIs for prediction in the test and validation cohorts, with a batch size of 1. In order to evaluate the model’s internal performance, we performed a 5-fold cross-validation at the level of patients.
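The two sampling rules in this paragraph, splitting at the level of patients and re-drawing a fixed-size bag of tiles per epoch, can be sketched as below. This is an assumption-laden illustration rather than the published code: the function names are hypothetical, the folds are not stratified by diagnosis, and slides with fewer tiles than the bag size are simply returned whole.

```python
import random


def patient_level_folds(slide_to_patient, n_folds=5, seed=0):
    """Return (train_slides, test_slides) pairs with no patient in both sets."""
    patients = sorted(set(slide_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)
    # Assign whole patients (not slides) to folds in round-robin order.
    fold_of = {patient: i % n_folds for i, patient in enumerate(patients)}
    folds = []
    for k in range(n_folds):
        test = sorted(s for s, p in slide_to_patient.items() if fold_of[p] == k)
        train = sorted(s for s, p in slide_to_patient.items() if fold_of[p] != k)
        folds.append((train, test))
    return folds


def sample_bag(tile_ids, bag_size=512, rng=None):
    """Draw a fresh random subset of tiles; call once per training epoch."""
    rng = rng or random.Random()
    if len(tile_ids) <= bag_size:
        return list(tile_ids)
    return rng.sample(tile_ids, bag_size)


# Hypothetical slide-to-patient mapping (some patients have several slides).
slides = {"s1": "p1", "s2": "p1", "s3": "p2", "s4": "p3", "s5": "p4", "s6": "p5"}
for train, test in patient_level_folds(slides):
    print(test)
```

Splitting by patient rather than by slide matters here because the discovery cohort contains more slides than patients, and slides from the same tumor in both training and test sets would inflate the cross-validated performance.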
Identification of molecular alterations
Tumor areas were first macro-dissected from formalin-fixed, paraffin embedded tissue blocks, and mRNA and DNA extractions were performed using the Maxwell RSC Plus DNA FFPE Kit IVD and the Maxwell RSC RNA FFPE Kit IVD (Promega, France). They were further quantified using a Qubit fluorimeter in combination with the Qubit™ dsDNA HS Assay Kit and Qubit™ RNA HS Assay Kit (ThermoFisher Scientific). RNA was reverse transcribed to cDNA using SuperScript IV VILO Master Mix (ThermoFisher Scientific). The Oncomine Comprehensive Library Assay v3C was used to amplify 50 nanograms of DNA and RNA (as measured by fluorimetry). Amplicons were digested, barcoded and amplified with the Ion Ampliseq Library and Ion Xpress barcode adapter kits (ThermoFisher Scientific). After quantification, 50 pM of each library were multiplexed and clonally amplified on ion-sphere particles using an Ion Chef instrument (ThermoFisher Scientific). The ISP templates were loaded onto an Ion-540 chip and sequenced using an Ion S5 device and the Ion 540™ Kit–Chef. The Ion Reporter Software was used to assess performance and analyze sequencing data using specific stringent filters (allele frequency between 5 and 90%; only exonic location, read depth >300X).
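The variant filters listed above amount to three conditions per variant call, which can be written down as a short Python sketch. The filtering in the study was done in the Ion Reporter Software; the dictionary keys, the function name, the inclusive treatment of the 5% and 90% bounds, and the example records are assumptions for illustration, not actual calls from the cohort.

```python
def passes_filters(variant, min_af=0.05, max_af=0.90, min_depth=300):
    """Apply the three stated filters to one variant call.

    `variant` is a dict with hypothetical keys: 'location',
    'allele_frequency' (fraction between 0 and 1) and 'read_depth'.
    """
    return (
        variant["location"] == "exonic"
        and min_af <= variant["allele_frequency"] <= max_af  # bounds assumed inclusive
        and variant["read_depth"] > min_depth
    )


# Made-up example records (gene names are from the text, values are invented).
calls = [
    {"gene": "CTNNB1", "location": "exonic", "allele_frequency": 0.32, "read_depth": 850},
    {"gene": "IDH1", "location": "exonic", "allele_frequency": 0.02, "read_depth": 1200},
    {"gene": "KRAS", "location": "intronic", "allele_frequency": 0.40, "read_depth": 900},
]
kept = [call["gene"] for call in calls if passes_filters(call)]
print(kept)
```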
Spatial transcriptomics
Samples were first screened for RNA quality (DV200 scores > 50%, Tapestation), as recommended on the Visium Tissue Preparation Guide (10X Genomics). Visium spatial gene expression slides and reagents kits were used according to manufacturer instructions.
Five-micrometer thick tissue sections were cut from the FFPE tumor block and placed within the fiducial frames (n = 4) of the FFPE Visium Spatial Gene Expression Slide. Each capture area has ~5000 gene expression spots that include a partial read 1 sequencing primer (Illumina TruSeq Read 1), a 16 nt spatial barcode, a 12 nt unique molecular identifier (UMI) and a 30 nt poly(dT) sequence (captures ligation product). Spots provide a resolution of ~5–10 cells. Slides were deparaffinized and stained. They were further coverslipped and scanned at 40X resolution using a Hamamatsu S360 scanning device. Coverslips were removed, and a decrosslinking step was performed.
Probes were hybridized using the Visium Hybridization Mix. After a post-hybridization wash (FFPE Post-Hyb Wash and SSC Buffers), a ligase was added to seal the junction between the probe pairs that had hybridized to RNA, forming a ligation product. The ligation products were released from the tissue section upon RNase treatment and permeabilization, and further captured on the slide. Probes were extended by the addition of UMI, partial read 1 and spatial barcodes. We then obtained spatially barcoded products for library preparation.
A qPCR was performed to determine the cycle numbers, and the ligated and spatially barcoded products underwent indexing via Sample Index PCR. Sequencing of libraries was performed on a NextSeq 2000 instrument with a P3 flow cell (100 cycles, Illumina, CA, USA).
Survival analyses
Statistical analysis and visualization were performed using R software version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org) and Bioconductor packages (version 3.4). Overall survival was defined by the interval between surgical resection/liver transplantation and death or last follow-up. Survival curves were represented using the Kaplan-Meier method compared with log-rank statistics. Univariate analysis was performed using the Cox proportional-hazards regression model with variables with a P-value < 0.05 selected for multivariate analysis. All tests were two-tailed and a P-value < 0.05 was considered significant. For patients treated by surgical resection, inclusion criteria were lack of pre-operative treatment, lack of metastatic or macroscopic residual disease at the time of surgery, and uninodular tumors. For liver transplantation, all patients with available clinical follow-up were included.
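To make the survival estimates easier to interpret, the following is a minimal Python sketch of the Kaplan-Meier estimator and of the median survival read from it. The study itself used R (the software list names the survival and survminer packages); the function names and the follow-up times below are hypothetical and serve only to show how a median can be "not reached".

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival_probability)] at each death time.

    times: follow-up in months; events: 1 = death, 0 = censored at last follow-up.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 50% or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data in months (not study data).
curve = kaplan_meier([5, 8, 12, 12, 20, 30], [1, 0, 1, 1, 0, 1])
print(curve, median_survival(curve))
```

When the curve never falls to 50% within the available follow-up, `median_survival` returns `None`, which corresponds to the "median survival not reached" wording used in the Results.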
Statistics, reproducibility and other tools
Measurements were taken from distinct samples, i.e., the same sample was never measured repeatedly. For supervised classification experiments, the primary endpoint was the area under the receiver operating characteristic curve (AUROC) with 95% confidence intervals obtained by 1000x bootstrapping. The MI-CLAIM checklist is provided in Supplemental Table 7. In accordance with the COPE (Committee on Publication Ethics) position statement of 13 February 2023 (https://publicationethics.org/cope-position-statements/ai-author), the authors hereby disclose the use of the following artificial intelligence model during the writing of this article: GPT-4 (OpenAI), for checking spelling and grammar.
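The primary endpoint can be illustrated with a short Python sketch that computes the AUROC from its rank-based definition and derives a bootstrap confidence interval. The percentile method, the resampling of individual cases with replacement, and the function names are assumptions made here; the text states only that 1000 bootstrap iterations were used.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # skip resamples that lost one of the two classes
        stats.append(auroc(resampled_labels, [scores[i] for i in idx]))
    stats.sort()
    k = int(n_boot * alpha / 2)
    return stats[k], stats[n_boot - 1 - k]


# Hypothetical labels (1 = ICCA, 0 = HCC) and model scores, not study data.
y = [0, 0, 1, 1, 0, 1, 0, 1]
p = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.30, 0.70]
print(auroc(y, p), bootstrap_ci(y, p))
```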
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
J.C. is supported by Institut National du Cancer (PRT-K22-146 and TRANSCAN TANGERINE project), Bayer, Fondation Bristol Myers Squibb pour la Recherche en Immuno-Oncologie, Cancéropole Ile-de-France Emergence, Fondation ARC, Ipsen, Ligue Contre le Cancer, and Fondation de l’Avenir. JNK is supported by the Max-Eder-Programme of the German Cancer Aid (grant #70113864), the German Federal Ministry of Education and Research (PEARL, 01KD2104C; CAMINO, 01EO2101; SWAG, 01KD2215A), the German Academic Exchange Service (SECAI, 57616814), the German Federal Joint Committee (Transplant.KI, 01VSF21048), the European Union’s Horizon Europe and innovation programme (ODELIA, 101057091; GENIAL, 101096312) and the National Institute for Health and Care Research (NIHR, NIHR213331) Leeds Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. JNK and TL received funding from the Federal Ministry of Health under grant number 2520DAT111 (Deep Liver) and from the German Federal Ministry of Education and Research under grant number 031L0312B (TransformLiver). S.C. is supported by Fondation ARC and Les Entreprises contre le Cancer (Gefluc). H.K. is supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2022R1A2C2010348). Q.Z. is funded by the China Scholarship Council (Grant n°201908070052). A.D. is supported by a grant from Instituto de Salud Carlos III (PI22/01427). A.F. is supported by a grant from Instituto de Salud Carlos III (PI18/00542). M.R. is supported by grants from Instituto de Salud Carlos III (PI15/00145, PI18/0358, PI22/01427 and PMP22/00054). CIBERehd is funded by the Instituto de Salud Carlos III. Some of the authors of this article are members of the European Reference Network (ERN) RARE-LIVER. This work was funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
Author contributions
J.C., S.C. and J.N.K. conceptualized the study. N.G.L and Q.Z. developed the software. J.C., N.G.L., S.C., Q.Z. and J.N.K. had full access to the raw data, performed the analysis, evaluated the results and wrote the first draft of the manuscript. P.M., L.F., A.P., C.K., C.B., L.R.H., A.U., T.L., L.D.T., A.B., A.C., D.G., C.T.N., H.N.-C., K.N.T., V.G., R.P.G., F.C., D.W., M.V., D.S.A., F.A., A.D., B.R., A.H., K.E., D.F.C., J.A., W.Q.L., H.H.W.L., E.B., M.R.e.l., A.F., A.W.-C.H.C., A.F., M.R.e.i., M.A., O.S., D.C., C.B.-R., N.S., B.M., E.F., D.T., C.T., E.K., H.K., M.N., S.M.P., P.G., R.B., E.V., K.S., D.F.R., S.A.W., R.R., J.M.P., X.Z., A.Lu., S.M., A.La., G.A., H.R., E.D.M., C.S., P.N., M.W., R.C.L.L., J.B., A.G., C.G., M.L., K.H., P.S., P.W.W., N.L., J.T., A.K., J.S., V.P. and S.C. contributed samples, contributed to the interpretation of the results and contributed to writing the final version of the manuscript.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Data availability
Some of the data that support the findings of this study are publicly available, and some are proprietary datasets provided for this analysis under collaboration agreements. All data (including histological images) from the TCGA database are available at https://portal.gdc.cancer.gov. Raw sequencing data for the proprietary cohorts have been uploaded to the European Nucleotide Archive (ENA) (accession number PRJEB62487). All other histopathology image data with accompanying metadata are under controlled access according to the local ethical guidelines and can only be requested directly from the respective study groups that independently manage data access for their study cohorts. The central data collection was managed by JC to whom sharing requests can be directed and will be responded to within 4 weeks. Source data for figures are provided with this paper.
Code availability
The data was analyzed using custom-developed open-source software. Our deep learning methods use Python with h5py v3.6, numpy v1.22, openpyxl v3.0, pandas v1.4, torch v1.8, fastai v2.5, fire v0.4. The transcriptomics analysis and statistical analysis methods use R v4.1.2, Seurat v4.1.1 for differential gene expression analysis, Seurat v4.3.0 for visualization, glmGamPoi v1.6.0, dplyr v1.0.10, crayon v1.5.2, ggplot2 v3.4.0, gridExtra v2.3, MAST v1.20.0, ggrepel v0.9.2, readxl v1.4.3, survminer v0.4.9, rms v6.4-1, survival v3.5-5, ComplexHeatmap v2.13.1, irr v0.84.1, vcd v1.4-11. All source codes are publicly available: https://github.com/KatherLab/preprocessing-ng for WSI tessellation, https://github.com/KatherLab/preProcessing for color-normalization and https://github.com/KatherLab/marugoto for model training and deployment and https://github.com/qinghezeng/ST_cHCC-CCA for spatial transcriptomics analysis.
Competing interests
J.N.K. declares consulting services for Owkin, France; DoMore Diagnostics, Norway; Panakeia, UK; and Histofy, UK. Furthermore, he holds shares in StratifAI GmbH and has received honoraria for lectures from AstraZeneca, Bayer, Eisai, MSD, BMS, Roche, Pfizer and Fresenius. JC reports consulting for Crosscope. HR reports consulting service for Boston. The remaining authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Julien Calderaro, Email: julien.calderaro@aphp.fr.
Jakob Nikolas Kather, Email: jakob_nikolas.kather@tu-dresden.de.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-43749-3.
References
- 1. Llovet JM, et al. Hepatocellular carcinoma. Nat. Rev. Dis. Prim. 2021;7:6. doi: 10.1038/s41572-020-00240-3.
- 2. Calderaro J, Ziol M, Paradis V, Zucman-Rossi J. Molecular and histological correlations in liver cancer. J. Hepatol. 2019;71:616–630. doi: 10.1016/j.jhep.2019.06.001.
- 3. Beaufrère A, Calderaro J, Paradis V. Combined hepatocellular-cholangiocarcinoma: an update. J. Hepatol. 2021;74:1212–1224. doi: 10.1016/j.jhep.2021.01.035.
- 4. Xue R, et al. Genomic and transcriptomic profiling of combined hepatocellular and intrahepatic cholangiocarcinoma reveals distinct molecular subtypes. Cancer Cell. 2019;35:932–947.e8. doi: 10.1016/j.ccell.2019.04.007.
- 5. Nguyen CT, et al. Immune profiling of combined hepatocellular-cholangiocarcinoma reveals distinct subtypes and activation of gene signatures predictive of response to immunotherapy. Clin. Cancer Res. 2022;28:540–551. doi: 10.1158/1078-0432.CCR-21-1219.
- 6. Rosenberg N, et al. Combined hepatocellular-cholangiocarcinoma derives from liver progenitor cells and depends on senescence and IL-6 trans-signaling. J. Hepatol. 2022;77:1631–1641. doi: 10.1016/j.jhep.2022.07.029.
- 7. Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A. Artificial intelligence in digital pathology: new tools for diagnosis and precision oncology. Nat. Rev. Clin. Oncol. 2019;16:703–715. doi: 10.1038/s41571-019-0252-y.
- 8. Shmatko A, Ghaffari Laleh N, Gerstung M, Kather JN. Artificial intelligence in histopathology: enhancing cancer research and clinical oncology. Nat. Cancer. 2022;3:1026–1038. doi: 10.1038/s43018-022-00436-4.
- 9. Saillard C, et al. Predicting survival after hepatocellular carcinoma resection using deep-learning on histological slides. J. Hepatology. 2020. doi: 10.1002/hep.31207.
- 10. Zeng Q, et al. Artificial intelligence predicts immune and inflammatory gene signatures directly from hepatocellular carcinoma histology. J. Hepatol. 2022. doi: 10.1016/j.jhep.2022.01.018.
- 11. Wang X, et al. RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval. Med. Image Anal. 2022;83:102645. doi: 10.1016/j.media.2022.102645.
- 12. Seraphin TP, et al. Prediction of heart transplant rejection from routine pathology slides with self-supervised deep learning. Eur. Heart J. Digit. Health. 2023;4:265–274.
- 13. Saldanha OL, et al. Self-supervised attention-based deep learning for pan-cancer mutation prediction from histopathology. NPJ Precis Oncol. 2023;7:35. doi: 10.1038/s41698-023-00365-0.
- 14. Niehues JM, et al. Generalizable biomarker prediction from cancer pathology slides with self-supervised deep learning: A retrospective multi-centric study. Cell Rep. Med. 2023;4:100980. doi: 10.1016/j.xcrm.2023.100980.
- 15. Calderaro J, et al. Nestin as a diagnostic and prognostic marker for combined hepatocellular-cholangiocarcinoma. J. Hepatol. 2022;77:1586–1597. doi: 10.1016/j.jhep.2022.07.019.
- 16. Harding JJ, et al. Antitumour activity of neratinib in patients with HER2-mutant advanced biliary tract cancers. Nat. Commun. 2023;14:630. doi: 10.1038/s41467-023-36399-y.
- 17. Goyal L, et al. Futibatinib for FGFR2-rearranged intrahepatic cholangiocarcinoma. N. Engl. J. Med. 2023;388:228–239. doi: 10.1056/NEJMoa2206834.
- 18. Abou-Alfa GK, et al. Pemigatinib for previously treated, locally advanced or metastatic cholangiocarcinoma: a multicentre, open-label, phase 2 study. Lancet Oncol. 2020;21:671–684. doi: 10.1016/S1470-2045(20)30109-1.
- 19. Muti HS, et al. The Aachen protocol for deep learning histopathology: a hands-on guide for data preprocessing. Zenodo. 2020. doi: 10.5281/zenodo.3694994.
- 20. Ghaffari Laleh N, et al. Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology. Med. Image Anal. 2022;79:102474. doi: 10.1016/j.media.2022.102474.
- 21. Macenko M, et al. A method for normalizing histology slides for quantitative analysis. In 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro 1107–1110 (University of Carolina, 2009).
- 22. Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. In Proceedings of the 35th International Conference on Machine Learning 1st edn, Vol. 80 (eds. Dy, J. & Krause, A.) 2127–2136 (PMLR, 2018).