Summary
Background
Differentiating intrahepatic cholangiocarcinomas (iCCA) from hepatic metastases of pancreatic ductal adenocarcinoma (PAAD) is challenging. Both tumours have similar morphological and immunohistochemical patterns and share multiple driver mutations. We hypothesised that DNA methylation-based machine-learning algorithms may help perform this task.
Methods
We assembled genome-wide DNA methylation data for iCCA (n = 259), PAAD (n = 431), and normal bile duct (n = 70) from publicly available sources. We split this cohort into a reference (n = 399) and a validation set (n = 361). Using the reference cohort, we trained three machine learning models to differentiate between these entities. Furthermore, we validated the classifiers on the technical validation set and used an internal cohort (n = 72) to test our classifier.
Findings
On the validation cohort, the neural network, support vector machine, and the random forest classifiers reached accuracies of 97.68%, 95.62%, and 96.5%, respectively. Filtering by anomaly detection and thresholds improved the accuracy to 99.07% (37 samples excluded by filtering), 96.22% (17 samples excluded), and 100% (44 samples excluded) for the neural network, support vector machine and random forest, respectively. Because it offered the best balance between accuracy and number of predictable cases, we tested the neural network with applied filters on the in-house cohort, obtaining an accuracy of 95.45%.
Interpretation
We developed a classifier that can differentiate between iCCAs, intrahepatic metastases of a PAAD, and normal bile duct tissue with high accuracy. This tool can be used for improving the diagnosis of pancreato-biliary cancers of the liver.
Funding
This work was supported by Berlin Institute of Health (JCS Program), DKTK Berlin (Young Investigator Grant 2022), German Research Foundation (493697503 and 314905040 – SFB/TRR 209 Liver Cancer B01), and German Cancer Aid (70113922).
Keywords: Pathology, Machine learning, Oncology, Molecular diagnosis, Epigenetic
Research in context.
Evidence before this study
The therapy of choice for an intrahepatic cholangiocarcinoma (iCCA) is surgical resection followed by adjuvant chemotherapy, whereas hepatic metastasized pancreatic ductal adenocarcinomas (PAAD) are not resected and are treated with palliative chemotherapy. These two lesions have a very similar morphology, immunohistochemistry, and mutational profile making this differential diagnosis very difficult, relying mostly on the patient history and imaging methods. Starting from the premise that genome wide methylation profiles are tissue specific, we hypothesised that a methylation-based classifier can be ideal for solving this diagnostic problem.
Added value of this study
To the best of our knowledge, our study is by far the largest to characterize the methylation landscape of pancreato-biliary tumours. We observed that there are two groups of iCCA, a distinct one (iCCA group 1) and a second one located in the proximity of PAAD samples (iCCA group 2), each group being characterized by specific mutations and distinct pathogenesis. Starting from this landscape analysis, we generated a methylation-based machine learning classifier that distinguishes between normal bile duct tissue, metastases of PAAD, and iCCA with very high accuracy. Moreover, we showed the feasibility of our diagnostic tool using small core needle biopsies and small diagnostic resections, mimicking a real-life diagnostic situation. Additionally, we demonstrated that our classifier, with added filters for the interpretation of the results, is able to exclude other malignancies, such as stomach adenocarcinoma and colon adenocarcinoma, that are other frequent causes of hepatic metastases, and is able to detect high-grade precursor lesions of iCCA.
Implications of all the available evidence
Herein, we developed a neural network classifier that could aid pathologists in the diagnosis of iCCA versus liver metastases of PAAD. This classifier can be immediately implemented in the clinical practice and have important diagnostic and therapeutic impact. Additionally, this classifier represents the basis for the development of more rapid intraoperative classifiers that can have real-time impact on the surgical decision making.
Introduction
Differentiating an intrahepatic cholangiocarcinoma (iCCA) from a hepatic metastasized pancreatic ductal adenocarcinoma (PAAD) remains a challenging surgical pathology diagnosis. Well to poorly formed, monomorphic glands, lined by a columnar or cuboidal epithelium in a fibrous stroma, morphologically characterize these two entities, making their differentiation challenging. Morphological characteristic features with low specificity for iCCA are the “never-ending” glandular pattern in which the glands merge with one another, and the formation of small solid nests with clear cell changes.1
At the immunohistochemistry level, the two tumour entities share multiple features, both being CK7, CK19 positive, and CK20 variably positive. Similarly, at the genomic level the two neoplasia often share a similar mutational profile, both displaying KRAS, TP53, and SMAD4 mutations and CDKN2A deletions, in different proportions.2, 3, 4
Some molecular traits have been demonstrated to differentiate between iCCA and PAAD: IDH1/2 mutations are specific to iCCA but are only present in one fifth of iCCA.5 Further, around 10% of iCCA may demonstrate alterations of BAP1,5 but frequently these patients are also IDH1/2 mutated.6 However, despite these differences, most iCCA display the same driver mutations as PAAD,6 or have no known driver mutation.7 Additionally, characteristic for iCCA are gene fusions of FGFR2, which were described only in a minority of patients.8 Therefore, adding DNA and RNA sequencing to the diagnostic algorithm is of limited value.
The differentiation of iCCA and PAAD often has to be performed on biopsies as the only available material, further complicating the distinction due to limited tissue. Recently, genome-wide DNA methylation patterns, which are highly tissue specific, have been demonstrated to be valuable for differentiating between classes of tumours and predicting their origin.9, 10, 11, 12, 13 Furthermore, it was shown that DNA methylation data are suitable for developing machine learning based models to be used as diagnostic tools.9, 10, 11, 12, 13 Array based DNA methylation assays require only small amounts of input DNA and are thus suitable for analysis of biopsies.
Machine learning is a useful tool for analysing complex biological data, such as DNA methylation patterns in cancer.9, 10, 11, 12, 13 It can identify molecular signatures that differentiate between tumours and improve diagnostic accuracy. However, there are different approaches to machine learning. Neural networks can handle high-dimensional data with complex relationships, but may overfit. Support vector machines are good for classification tasks but are prone to overfitting with noisy or large numbers of features. Random Forest models are robust to noisy and correlated data but their performance decreases with a large number of features. The best model should be chosen based on its observed performance on several cohorts.
Therefore, we hypothesise that genome-wide DNA methylation can be used to build machine-learning classifiers that can support the pathologists in differentiating iCCA from hepatic PAAD metastases. In order to replicate the real-life scenario, we also used biopsy specimens to test the newly developed diagnostic tool.
Methods
Study design
We retrieved publicly available DNA methylation data (450k and EPIC) from two TCGA and eight GEO datasets and further performed EPIC DNA methylation array analysis on 72 in-house samples. Using the reference cohort (n = 399), we developed three different machine learning classifiers: random forest, support vector machine and neural network. We next evaluated the classifiers and performed rejection class analysis, determining a set of thresholds for the prediction probability, below which we cannot confidently predict the sample class. For these analyses we used a technical validation set (n = 361).
We further tested the classifier on our clinical test set composed of 72 samples. No data from the technical validation set or the clinical test set was used for the development of the classifiers.
External methylation data sets—reference, validation, and landscape sets
We collected 760 methylation data samples from GEO and TCGA and divided them into 2 cohorts, the reference and technical validation cohort.
The reference cohort (Table S1), containing a total of 399 samples (50.6% of total samples), consisted of 205 PAAD samples (105 from TCGA,3 50 from GSE155353,14 and 50 from GSE49149,15), 144 iCCA samples (23 from TCGA,4 85 from GSE89803,6 and 36 from GSE201241,16), and 50 normal bile duct samples (from GSE156299,17). Raw DNA methylation data in the form of IDAT files were available for all the reference cohort samples and were pre-processed using the same methods. Samples from GSE32079,18 and GSE49656,19 were not considered for the reference cohort because no IDAT files were available and thus we could not ensure the same pre-processing methods.
The reference cohort was built in order to satisfy typical machine learning requirements while also taking key biological factors into account. We first preselected iCCA samples (n = 144) for which the IDH1/2 mutation status was known, pulling randomly from all studies, in order to be able to analyse the impact of IDH1/2 mutations on the distribution of the samples and classifier performance. We continued by randomly dividing the 70 normal bile duct samples between the reference and technical validation cohorts (split 70%/30%), leading to the inclusion of 50 normal bile duct samples into the reference cohort. The normal bile duct tissue class is underrepresented in both cohorts due to the fact that only 70 samples were available. Finally, we added 205 random PAAD samples to the reference cohort, the rest being included in the technical validation cohort (split 45%/55%), in order to balance out the other two classes. Due to the fact that methylation is tissue specific and iCCA and normal bile duct samples are believed to share the same tissue origin, as well as the same background stroma, we decided that the number of PAAD samples should be similar to the combined number of iCCA and normal bile duct samples.
The technical validation cohort (Table S2) consisted of the remainder of the samples, not included in the reference cohort. It contained 361 samples: 226 PAAD samples (6 from GSE149250,20 32 from GSE155353,14 117 from GSE49149,15 and 71 from TCGA3), 115 iCCA samples (9 from GSE156299,17 16 from GSE201241,16 50 from GSE32079,18 26 from GSE49656,19 10 from GSE89803,6 and 4 from TCGA4), and 20 normal bile duct samples (12 from GSE201241,16 4 from GSE49656,19 and 4 from GSE89803,6).
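The random allocation of samples to the two cohorts can be illustrated with a minimal sketch. The sample identifiers, the helper name `split_ids`, and the seed are hypothetical; only the class sizes (70 normal bile duct samples split 50/20, 431 PAAD samples split 205/226) are taken from the text. It is not the procedure actually used to draw the cohorts.

```python
import random


def split_ids(sample_ids, n_reference, seed=0):
    """Randomly assign n_reference samples to the reference cohort.

    The remaining samples form the technical validation cohort.
    """
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_reference], shuffled[n_reference:]


# Hypothetical identifiers; the counts per class are those reported above.
normal_ids = [f"normal_{i}" for i in range(70)]
paad_ids = [f"paad_{i}" for i in range(431)]

normal_ref, normal_val = split_ids(normal_ids, n_reference=50)
paad_ref, paad_val = split_ids(paad_ids, n_reference=205)

print(len(normal_ref), len(normal_val), len(paad_ref), len(paad_val))
```

Because every sample ends up in exactly one of the two lists, no sample from the technical validation cohort can leak into the reference cohort.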
For all samples of the reference and technical validation cohort, we further retrieved from the original studies: mutational status (IDH1/2, KRAS, TP53, SMAD4, BAP1, MLL3, ARID1A), FGFR2 fusion status, liver fluke status (Opisthorchis viverrini), array type, material type, and if necessary, the pre-processing method of the IDAT.
Additionally, from the same studies, we retrieved 63 samples of methylation data of normal pancreatic tissue, 16 distal (d)CCAs, 22 distal high-grade intraductal papillary neoplastic lesions of the biliary tract (dIPNB), 8 intrahepatic IPNBs (iIPNBs), 8 intrahepatic high-grade intraductal tubulopapillary neoplastic lesions of the biliary tract (iITPN), 9 ITPNs of the pancreas (ITPN-P), 50 perihilar (p)CCAs, 10 perihilar IPNBs (pIPNBs), 2 perihilar ITPNs (pITPNs), and 8 extrahepatic (e)CCAs (samples that are pCCA/dCCA but no further details were available) (Table S3), in order to construct the DNA methylation landscape of pancreato-biliary tumours.
Lastly, all samples from TCGA stomach adenocarcinoma (STAD) and colon adenocarcinoma (COAD) were used in order to create an anomaly detection model capable of identifying other cancer types not included in the reference cohort. We randomly selected 50 STAD and 50 COAD samples which were used together with the reference cohort to create the training dataset for the anomaly detection, while the remaining 347 STAD samples and 303 COAD samples were used to validate the anomaly detection. STAD and COAD samples were chosen because these tumours are among the most common other sources of liver metastases.
Testing set
The clinical test set consisted of 72 archival formalin fixed paraffin embedded (FFPE) samples from the Institute of Pathology, Charité—Universitätsmedizin Berlin and Heidelberg University Hospital. Of these, the PAAD samples were comprised of 16 primary PAAD from resections, and 20 core biopsies and small intraoperative diagnostic resection samples from PAAD liver metastases. The iCCA samples included 23 primary iCCA samples from resection specimens, 12 iCCA core biopsies, and one iCCA small intraoperative diagnostic resection (Table S4).
For all the included samples, the diagnosis was reliably determined by the standard histopathological investigation and was reconfirmed for this study. The diagnosis of iCCA biopsies (n = 13) was confirmed on the later surgical resection specimens or on the biopsy plus imaging. The diagnosis of PAAD liver metastases (n = 20) was supported either by a history of known PAAD and/or by imaging data supporting the diagnosis of a primary PAAD.
For all clinical test set samples, in addition to the diagnosis, we retrieved sex, tumour grade, T stage, N stage, tumour location and sample type (biopsy, resection, diagnostic intraoperative resections). When available, additional molecular data for clinical test set samples were also retrieved from the patient records (n = 17).
DNA extraction
For samples of the clinical test set, tumour areas with the highest tumour cell content were identified using a light microscope (Olympus, BX46). From the tumour region of interest, the FFPE block was punched for DNA extraction. Semi-automated DNA extraction was performed according to the manufacturer's instructions (Maxwell RSC FFPE Plus DNA Purification Kit, Custom, Promega). DNA quantities were measured using Qubit HS DNA assay (Thermo Fisher Scientific).
Methylation analysis
DNA restoration was performed using the Infinium HD FFPE DNA Restore Kit according to the manufacturer's protocol and methylation analysis was performed using the Illumina Infinium MethylationEPIC BeadChip according to protocols supplied by the manufacturer.
Methylation array processing
All methylation data pre-processing was conducted in R using various packages as implemented in ChAMP.21 Raw signals were loaded from the IDAT files using the minfi package,22,23 combining Illumina EPIC and 450k samples. A number of CpG sites were filtered out: EPIC array sites not included in 450k arrays; all CpG sites with a detection p-value above 0.01; low quality sites, with fewer than 3 beads in at least 5% of the samples; all SNP-related sites24; multi-hit sites25; and CpGs located on chromosomes X and Y; leaving a total of 301,439 CpG sites. Lastly, the beta values of the remaining CpG sites were normalized using FunNorm26 followed by BMIQ,27 the combination being shown to outperform individual methods.28 Each cohort was pre-processed independently.
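The probe exclusion rules listed above can be summarised in a short sketch. The actual filtering was done in R with ChAMP; the plain-Python version below only mirrors the stated criteria. The record layout, the helper `make_probe`, and the probe names are hypothetical, and it is an assumption that a detection p-value above 0.01 in any single sample is enough to remove a site.

```python
def filter_probes(probes, p_cutoff=0.01, min_beads=3, low_bead_fraction=0.05):
    """Return the IDs of probes that pass every exclusion rule."""
    kept = []
    for probe in probes:
        if not probe["on_450k"]:
            continue  # EPIC-only site, absent from 450k arrays
        if probe["snp_related"] or probe["multi_hit"]:
            continue  # SNP-related or multi-hit site
        if probe["chrom"] in ("chrX", "chrY"):
            continue  # sex chromosomes
        # Assumption: one sample above the cutoff is enough to drop the site.
        if max(probe["detection_p"]) > p_cutoff:
            continue
        # Fewer than min_beads beads in at least 5% of the samples.
        low = sum(1 for n in probe["bead_counts"] if n < min_beads)
        if low / len(probe["bead_counts"]) >= low_bead_fraction:
            continue
        kept.append(probe["id"])
    return kept


def make_probe(probe_id, **overrides):
    """Build a hypothetical probe record that passes all filters by default."""
    probe = {
        "id": probe_id, "chrom": "chr1", "on_450k": True,
        "snp_related": False, "multi_hit": False,
        "detection_p": [0.001, 0.002, 0.0005],
        "bead_counts": [12, 9, 15],
    }
    probe.update(overrides)
    return probe


probes = [
    make_probe("cg_ok"),
    make_probe("cg_epic_only", on_450k=False),
    make_probe("cg_chrX", chrom="chrX"),
    make_probe("cg_high_detp", detection_p=[0.001, 0.05, 0.002]),
    make_probe("cg_low_beads", bead_counts=[2, 9, 15]),
]
kept_ids = filter_probes(probes)
print(kept_ids)
```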
Tumour purity estimation
The tumour purity of the samples was estimated using the InfiniumPurify R package29 using the provided informative differentially methylated CpG sites. For estimating the purity of iCCA and PAAD metastases samples, we selected “CHOL” as tumour type, and “PAAD” as tumour type for the primary PAAD resections. The tumour type “CHOL” was chosen for iCCA and PAAD metastases tumour purity estimation as these two entities share the same background normal cells (liver parenchyma).30
Copy number analysis
As previously described,9 calculation of copy number profiles from DNA methylation array data was done using the conumee package, version 1.3.0.31 Briefly, a set of 50 control samples derived from histologically confirmed normal bile duct was used as reference for iCCA samples, while for PAAD tumours we used a set of 63 histologically confirmed normal pancreas samples. The evaluation was carried out manually with consideration of the tumour cell content for the evaluation of chromosomal gains or losses. In general, changes were considered potentially relevant if the intensity ratio of a segment deviated from the baseline by more than 0.1.32 In addition, we created summary copy number profiles for various groups. This analysis was done using an adaptation of the conumee script (provided by Dr. Damian Stichel, Neuropathology Heidelberg).
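The 0.1 cut-off for potentially relevant copy number changes can be written as a simple rule. This is only a sketch of the numerical criterion: the evaluation itself was manual and took tumour cell content into account. The function name and labels are hypothetical, and the baseline is assumed to be an intensity ratio of 0 on the log scale.

```python
def call_segment(segment_ratio, cutoff=0.1, baseline=0.0):
    """Label a segment by its deviation from the baseline intensity ratio."""
    deviation = segment_ratio - baseline
    if deviation > cutoff:
        return "gain"
    if deviation < -cutoff:
        return "loss"
    return "balanced"


# Hypothetical segment values for illustration.
for value in (0.25, -0.30, 0.05):
    print(value, call_segment(value))
```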
Unsupervised clustering: t-SNE and heatmap
In order to create the t-distributed stochastic neighbour embedding (t-SNE) plots, beta values of CpG sites were decomposed into eigenvectors, and then processed using the R package Rtsne33 with 5000 iterations. The number of eigenvectors (k) and the perplexity (p) were selected individually for each plot in order to account for the different number of samples.
For the pancreato-biliary landscape t-SNE, we used the top 2048 CpG sites with the highest standard deviation in their methylation level across all the 1028 samples collected (k = 50, p = 15).
The reference cohort t-SNE plot was created using the 2048 CpG sites with the highest standard deviation in their methylation level across the samples in the reference cohort (k = 30, p = 15). The same CpG beta values as for the reference cohort plot were used for the reference and test cohorts combined t-SNE plot. These CpG sites also constitute the features of the classifiers.
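Selecting the most variable CpG sites amounts to ranking probes by the standard deviation of their beta values across samples and keeping the top entries. The following is a minimal sketch with hypothetical probe names and beta values; the original analysis was carried out in R on 2048 sites.

```python
from statistics import pstdev


def top_variable_probes(beta, n_features):
    """Return the n_features probe IDs with the highest standard deviation.

    beta maps each probe ID to its beta values across all samples. With the
    same number of samples per probe, population and sample standard
    deviation give the same ranking.
    """
    ranked = sorted(beta, key=lambda probe: pstdev(beta[probe]), reverse=True)
    return ranked[:n_features]


# Hypothetical beta values for three probes measured in three samples.
beta = {
    "cg_variable": [0.1, 0.9, 0.5],
    "cg_constant": [0.5, 0.5, 0.5],
    "cg_mild": [0.4, 0.6, 0.5],
}
features = top_variable_probes(beta, n_features=2)
print(features)
```

Fixing the selected probe IDs on the reference cohort and reusing them for other cohorts keeps the feature space identical across plots and classifiers.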
The heatmap was created using the R package ComplexHeatmap34 on the same 2048 CpG sites used for t-SNE of the reference cohort.
Classifier development
We trained three machine learning classifiers, random forest, radial basis function kernel support vector machine, and a neural network on the reference cohort and used threefold cross-validation in order to tune the hyperparameters of the models. The random forest and support vector machine models were built using the PyCaret Python library,35 while the neural network was built using the Python library keras.36
The CpG sites with the highest standard deviation in their methylation level across the reference cohort samples were chosen as features for the classifiers. In order to determine the optimal number of features, we trained the models using between 64 (2^6) and 32,768 (2^15) CpGs. The selected features were used to fit a regularized empirical Bayes model as implemented in the Python library reComBat that eliminates material induced variation in samples outside the reference cohort, before being passed on to the classifiers.
The optimal parameters were established by optimizing the cross-entropy loss using threefold cross-validation. We found 100 trees to be optimal for the random forest model, as higher numbers of trees led to overfitting. A “C” of 7 and an “auto” setting for the kernel gamma coefficient were determined to be the optimal parameters for the support vector machine model. For the neural network, preliminary tests showed that a progressively decreasing number of neurons per hidden layer outperformed a constant number of neurons, therefore we settled for a network shape where the width of the hidden layers decreases by a factor of two per layer, down to a minimum of 16 neurons. A depth of 8 layers and a width of 256 for the first hidden layer performed optimally. Furthermore, the optimal learning rate of 0.00895, dropout rate of 0, and L1 regularization of 0.00441 were selected. The SGD optimizer with default parameters was used. The Python library optuna37 was used to find the hyperparameters for the neural network, while the random search from the scikit-learn library as implemented by PyCaret35 was used for the random forest and support vector machine models.
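The halving rule for the hidden layer widths can be made explicit with a small worked example. The function name is hypothetical, and it is an assumption that the reported depth of 8 counts hidden layers only.

```python
def hidden_layer_widths(first_width=256, depth=8, min_width=16):
    """Halve the width at every layer, never going below min_width."""
    widths = []
    width = first_width
    for _ in range(depth):
        widths.append(max(width, min_width))
        width //= 2
    return widths


widths = hidden_layer_widths()
print(widths)
```

Under these assumptions the widths are 256, 128, 64, 32, and then 16 for each of the remaining four layers.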
After the hyperparameter tuning, the models were finalized by being trained on the complete reference cohort dataset. Additionally, the random forest and support vector machine models were calibrated in order to output probability scores for each class. This step was not needed for the neural network, due to the usage of a softmax output layer.
Rejection class analysis
In order to improve the accuracy of the classifiers, to increase the confidence in the output of the model, and to exclude other entities for which the classifiers were not designed, we added 2 filtering layers. A sample must pass both layers in order to be considered a valid prediction. The filtering consists of an anomaly detection model that establishes if the sample is indeed iCCA, PAAD or normal bile duct tissue, followed by specific thresholds the classifier output has to pass. Samples which do not meet the threshold are considered “no match”, therefore unclassifiable.
The anomaly detection layer assesses whether a specific sample corresponds to one of the three classes predicted by our classifiers, or if it is a different adenocarcinoma (stomach adenocarcinoma (STAD) or colon adenocarcinoma (COAD)). It was built by training a neural network ensemble on the same features as the classifiers, using the reference cohort together with 100 randomly selected STAD and COAD samples. The Python library optuna was used to conduct hyperparameter optimisation, which found that a network with 8 layers and 128 neurons per layer, a learning rate of 0.0034, a dropout rate of 0 and L1 regularization of 0.0038 performed optimally.
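The two filtering layers can be sketched as a single decision function. The function name, the class labels used as dictionary keys, and the threshold values are placeholders chosen for illustration, not the thresholds selected in the study. It is also an assumption that samples flagged by the anomaly detection receive the same "no match" label as samples below the threshold.

```python
def classify_with_rejection(class_probs, is_known_entity, thresholds):
    """Apply anomaly detection first, then the class-specific threshold."""
    # Layer 1: the anomaly detector decides whether the sample is iCCA,
    # PAAD or normal bile duct at all (assumed here to yield "no match").
    if not is_known_entity:
        return "no match"
    # Layer 2: the top class probability must reach that class's threshold.
    label = max(class_probs, key=class_probs.get)
    if class_probs[label] < thresholds[label]:
        return "no match"
    return label


# Placeholder thresholds, for illustration only.
thresholds = {"iCCA": 0.8, "PAAD": 0.8, "normal bile duct": 0.8}

confident = {"iCCA": 0.95, "PAAD": 0.04, "normal bile duct": 0.01}
uncertain = {"iCCA": 0.55, "PAAD": 0.40, "normal bile duct": 0.05}

print(classify_with_rejection(confident, True, thresholds))
print(classify_with_rejection(uncertain, True, thresholds))
print(classify_with_rejection(confident, False, thresholds))
```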
In order to choose the appropriate threshold, we generated classification scores for all samples in the technical validation cohort and inferred the classification accuracy and number of predictable samples at all thresholds between 0.5 and 0.95 in 0.05 increments. Additionally, we looked at the distribution of the probabilities for each true class independently. Based on this analysis, we selected thresholds for each of the three classes.
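The threshold scan can be sketched as follows: for every candidate threshold, count how many samples remain predictable and how accurate the predictions on those samples are. The scored samples below are invented for illustration, and the sketch applies one threshold to all classes, whereas the study selected a separate threshold for each class.

```python
def sweep_thresholds(scored, thresholds):
    """Return (threshold, n_predictable, accuracy) for each threshold.

    scored holds (true_label, predicted_label, top_probability) tuples.
    Accuracy is None when no sample passes the threshold.
    """
    rows = []
    for threshold in thresholds:
        kept = [(true, pred) for true, pred, prob in scored if prob >= threshold]
        n_kept = len(kept)
        if n_kept:
            accuracy = sum(true == pred for true, pred in kept) / n_kept
        else:
            accuracy = None
        rows.append((threshold, n_kept, accuracy))
    return rows


# Thresholds between 0.5 and 0.95 in 0.05 increments.
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Hypothetical classifier outputs.
scored = [
    ("iCCA", "iCCA", 0.99),
    ("PAAD", "iCCA", 0.60),
    ("PAAD", "PAAD", 0.80),
    ("normal bile duct", "normal bile duct", 0.52),
]
rows = sweep_thresholds(scored, thresholds)
for row in rows:
    print(row)
```

Raising the threshold typically trades predictable samples for accuracy, which is the balance the selection of the final model has to weigh.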
Selection of final classifier models
Using the 361 samples of the technical validation cohort, we evaluated the overall accuracy of each model, the number of samples it can correctly classify once additional filtering is applied, as well as the accuracy for each individual class separately.
The performance on the technical validation set was similar to the performance on the reference set, indicating that the models do not suffer from overfitting.
We selected the model with the best combination of overall and class specific accuracy, while still being capable of classifying most samples when the additional filtering is applied.
Deconvolution of methylation data using single cell RNAseq
To perform cell deconvolution, we utilized the EpiSCORE R library,38 which estimates the proportion of cell types present in a mixed cell population using single cell RNAseq data. We provided EpiSCORE with the pre-processed beta values of all CpGs, and the liver and pancreas reference DNA methylation signature matrices, which are publicly offered by the authors of EpiSCORE. The EpiSCORE algorithm utilizes a non-negative least squares regression to estimate cell proportions, which we subsequently used for downstream analyses.
Differentially methylated CpGs
In order to identify differentially methylated CpG probes (DMPs) between sample groups, we utilized the limma R package.39 Limma uses a linear model and empirical Bayes method to estimate the mean difference in methylation between the groups, and then calculates adjusted p-values to account for multiple testing. CpGs with an adjusted p-value<0.01 and an absolute logFC value >0.3 were considered differentially methylated between groups.
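The selection rule for differentially methylated probes can be illustrated in plain Python. The sketch assumes that the adjusted p-values come from the Benjamini-Hochberg procedure, which is the default in limma but is not stated in the text; the probe names, p-values, and logFC values are hypothetical, and the linear modelling step itself is not reproduced.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_dmps(probe_ids, p_values, log_fc, p_cutoff=0.01, fc_cutoff=0.3):
    """Keep probes with adjusted p < p_cutoff and |logFC| > fc_cutoff."""
    adjusted = benjamini_hochberg(p_values)
    return [
        probe
        for probe, adj_p, fc in zip(probe_ids, adjusted, log_fc)
        if adj_p < p_cutoff and abs(fc) > fc_cutoff
    ]


# Hypothetical test results for four probes.
probe_ids = ["cg1", "cg2", "cg3", "cg4"]
p_values = [0.001, 0.02, 0.03, 0.5]
log_fc = [0.5, 0.6, -0.1, 0.4]

dmps = select_dmps(probe_ids, p_values, log_fc)
print(dmps)
```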
Pathways and cell type analysis
Integrated pathway analyses were performed with WikiPathway 2021 Human using Enrichr bioinformatics resources (http://amp.pharm.mssm.edu/Enrichr/). An adjusted p-value<0.01 was considered significant. For visualizing and clustering the pathways of genes corresponding to the 2048 CpGs with the highest standard deviation, the Appyters scatter plot tool was used.40 The clusters were built using the Leiden algorithm. Points were plotted on the first two Uniform Manifold Approximation and Projection dimensions. Cell type deconvolution was performed with Descartes Cell Types and Tissue 2021 using Enrichr bioinformatics resources (http://amp.pharm.mssm.edu/Enrichr/). An adjusted p-value<0.01 was considered significant. Results were visualized using bar charts.
Gene expression
In order to check the concordance between the differentially methylated CpGs and the transcriptome, RNA expression data in the log2 (tpm + 0.001) format for the matching genes were retrieved from the UCSC Xena TCGA Pan-Cancer (PANCAN) (https://xenabrowser.net/datapages/). We downloaded the RNA expression data from the iCCAs of the CHOL TCGA cohort (n = 27) and for ductal adenocarcinomas from the PAAD TCGA cohort (n = 171). Differentially expressed genes were determined using an unpaired t-test. All analyses were performed using R version 4.2.1.
Tissue microarray construction and immunohistochemistry
For all FFPE samples for which material was still available (n = 60), we constructed two tissue microarrays (TMA) containing two 1.5 mm cores for each tumour. For two samples, only one core was included in the TMA because of insufficient material. Additionally, for seven biopsy samples that were too thin to be included in the TMA, we performed whole slide staining. Hence, in total we performed immunohistochemistry analysis on 67 of the 72 samples of the testing cohort (93.1%). Next, the FFPE TMA and tumour blocks were cut into 3 μm sections. One section was used for H&E staining and three others for Ki67, Annexin 1 and Annexin 10 immunohistochemistry staining. For the immunohistochemical staining, a BenchMark XT immunostainer (Ventana Medical Systems, Tucson, AZ) was used. For antigen retrieval, sections were incubated in CC1 mild buffer (Ventana Medical Systems, Tucson, AZ) for 30 min at 100 °C or in protease 1 for 8 min. The sections were stained with anti-MIB1 (Ki67) antibody (M7240, Dako, 1:50, RRID:AB_2142367), anti-Annexin A10 (PA5-52151, Invitrogen, 1:2,000, RRID:AB_2638018), and anti-Annexin I (610066, BD Biosciences, 1:5,000, RRID:AB_397477) for 60 min at room temperature, and visualized using the avidin–biotin complex method and DAB. We stained the cell nuclei by additionally incubating for 12 min with haematoxylin and bluing reagent (Ventana Medical Systems, Tucson, AZ). Histological images were acquired with the digital slide scanner PANNORAMIC 1000 (3DHISTECH).
Histological analysis and immunohistochemistry scoring
H&E whole slides were used to re-evaluate the grading of all tumours and to determine their growth pattern. For PAAD and metastatic PAAD we used the recently introduced morphological classification by Kalimuthu et al.41 Briefly, the tumours were classified as gland forming and non-gland forming. The gland forming tumours were further divided into conventional (small and well-differentiated glands lined by pancreaticobiliary-type epithelium) and tubule-papillary (large, dilated glands lined by foveolar/pancreaticobiliary-type epithelium). The non-gland forming PAADs were divided into squamous (>30% squamous differentiation) and composite (multiple non-glandular morphologies: cribriform, single cells, single files, cords, nests, ribbons, buds, angulated glands). For iCCA we used the macroscopic description from the pathology report and all H&E slides from routine diagnosis to define the tumours as mass forming, intraductal or periductal infiltrating.42 The degree of fibrosis for each tumour was estimated in percent on the H&E whole slides.
For the immunohistochemistry classification into iCCA and PAAD we used the scoring and classification system proposed by Padden et al.43 Briefly, we evaluated the immunohistochemistry by using the immunoreactive score (IRS). Each tumour was graded as 0 (none), 1 (weak), 2 (intermediate), or 3 (strong) for the intensity of the expression and 0, 1 (≤5%), 2 (6–10%), 3 (11–50%), or 4 (>50%) for the percentage of positive tumour cells. The two scores were multiplied resulting in values between 0 and 12 for each antibody. We used the IRS thresholds proposed by the same study to consider a marker as suggestive for a given diagnosis. Therefore, an IRS of 5 or higher for Annexin 1 and an IRS of 0.5 or higher for Annexin 10 was diagnostic for PAAD. If one of the two markers was above or equal to the IRS cut-off, we considered that tumour to be a PAAD.43 For Ki67 we estimated the percentage of positive tumour cells.
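The IRS calculation and the resulting classification rule can be written out as a short worked example. The function names are hypothetical. It is an assumption that a percentage score of 0 corresponds to no positive tumour cells, and that tumours with both markers below their cut-offs are read as iCCA.

```python
def percent_score(percent_positive):
    """Map the percentage of positive tumour cells to a score from 0 to 4."""
    if percent_positive <= 0:
        return 0  # assumption: score 0 means no positive cells
    if percent_positive <= 5:
        return 1
    if percent_positive <= 10:
        return 2
    if percent_positive <= 50:
        return 3
    return 4


def immunoreactive_score(intensity, percent_positive):
    """Multiply staining intensity (0 to 3) by the percentage score (0 to 4)."""
    return intensity * percent_score(percent_positive)


def ihc_call(annexin1_irs, annexin10_irs, annexin1_cut=5, annexin10_cut=0.5):
    """PAAD if either marker reaches its cut-off, otherwise iCCA (assumed)."""
    if annexin1_irs >= annexin1_cut or annexin10_irs >= annexin10_cut:
        return "PAAD"
    return "iCCA"


# Hypothetical tumour: strong Annexin 1 staining in 60% of cells,
# no Annexin 10 staining.
a1 = immunoreactive_score(intensity=3, percent_positive=60)
a10 = immunoreactive_score(intensity=0, percent_positive=0)
print(a1, a10, ihc_call(a1, a10))
```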
Statistical analysis
Statistical analyses and graphics were carried out and designed with GraphPad Prism 9.4.1. For the comparison of categorical variables, we used the chi-square test. For correlations between continuous variables, we used the Pearson test. All tests were two-sided, and a p-value<0.05 was considered statistically significant. For further evaluation of the performance of the models in the study, the receiver operating characteristic (ROC) curve and area under the curve (AUC) were calculated using the scikit-learn library in Python.44
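For a binary comparison, the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The study used scikit-learn for this calculation; the plain-Python sketch below only illustrates the quantity being computed, with hypothetical labels and scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 for the positive class and 0 for the negative class.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("Both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical one-vs-rest scores for a single class.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```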
Ethics
The study was approved by the ethics committees of Charité, Universitätsmedizin Berlin (EA1/079/22) and Heidelberg University Hospital (S-519/2019). FFPE tissue from patients was acquired with informed consent following the local institutional review and the Declaration of Helsinki.
Role of the funding source
The funding sources had no role in the study design, data collection, data analyses, interpretation, or writing of the report.
Results
The methylation landscape of pancreato-biliary tumours
We started our analysis by looking at the methylation data of normal bile duct, normal pancreas, intrahepatic, perihilar, and distal high-grade intraductal papillary neoplastic lesions of the biliary tract (iIPNB, pIPNB and dIPNB), intrahepatic and perihilar high-grade intraductal tubulopapillary neoplastic lesions of the biliary tract (iITPN, pITPN) and pancreas (ITPN-P), iCCA, dCCA, pCCA, eCCA, and PAAD (Fig. 1a) from The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO) databases and our own samples (Charité and Heidelberg). We visualized these data (n = 1028 individual biological samples) by t-SNE creating a DNA methylation landscape of pancreato-biliary tumours. This initial plot showed that normal bile duct, normal pancreatic tissue, PAAD, and the larger fraction of iCCA samples group as separate clusters, while pCCA, dCCA, a subset of iCCA, and most of the matching CCA precursors (ITPN and IPNB) cluster together in the proximity of the PAAD cluster (Fig. 1b). We further investigated if the t-SNE clustering is affected by the dataset origin, but the samples grouped by biological groups and not by data origin (Fig. S1). These results once more indicate that DNA methylation is associated with tissue of origin.
We next split our external cohort into a reference set and a technical validation set and further tested the classifier on an in-house clinical cohort consisting of primary and metastatic PAAD and iCCA (Fig. 1c). We hypothesised that DNA methylation data would allow a precise differentiation of PAAD and iCCA, which have a very similar morphology (Fig. 1d).
An in-depth biological analysis of the reference cohort
We next focused on the particularities of the tissues to be classified, iCCA, PAAD, and normal bile duct samples in the reference cohort (Fig. 2a), leaving aside all other samples. We started the detailed biological analysis of the reference cohort by checking the key cell types45 of the bulk methylation data using tissue-specific single-cell RNA-sequencing data.46 We performed decomposition of bulk DNA methylation data from iCCA, PAAD, and normal bile duct on both normal liver and normal pancreas, which also allowed us to assess how similar these entities are. On normal liver background, as expected, the main cell type for iCCA and normal bile duct was the cholangiocyte (Chol) (iCCA: Chol fraction 0.34, and normal bile duct: Chol fraction 0.37) (Fig. S2a). Against normal pancreas, we confirmed that PAAD samples were mainly composed of ductal cells (PAAD: Ductal fraction 0.32) (Fig. S2b). Interestingly, PAAD methylation data on normal liver background were highly enriched in cholangiocytes (PAAD: Chol fraction 0.26), while iCCA and normal bile duct on normal pancreas background showed higher fractions of ductal cells than PAAD (iCCA: Ductal fraction 0.39, and normal bile duct: Ductal fraction 0.41) (Fig. S2a and b). These data underline how similar the transcriptome of PAAD and iCCA is and explain the difficulty of this differential diagnosis.
We next analysed the available biological factors specific to these entities. Using the top 2048 most variable CpG sites we calculated a t-SNE for the reference set. As previously observed, normal bile duct and PAAD samples clustered separately, while iCCA formed two main clusters, a distinct one (iCCA group 1) and a second cluster located in the proximity of the PAAD cluster (iCCA group 2) (Fig. 2b). We confirmed this observation using a second analysis, hierarchical clustering, iCCA forming the same two main groups: iCCA group 1 composed of well separated samples, and iCCA group 2 with most samples intermingled with PAAD samples (Fig. S3a).
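The feature selection step described above can be illustrated with a minimal sketch. It assumes that "most variable" means ranking CpGs by the standard deviation of their beta values across samples; the function name `top_variable_features` and the toy matrix are hypothetical and are not taken from the study code.

```python
from statistics import pstdev


def top_variable_features(beta, k):
    """Return the (sorted) row indices of the k CpGs with the largest
    standard deviation across samples.

    beta: list of rows, one row per CpG, each row a list of beta values.
    """
    sds = [pstdev(row) for row in beta]
    # Rank CpG indices from most to least variable.
    order = sorted(range(len(beta)), key=lambda i: sds[i], reverse=True)
    return sorted(order[:k])


# Hypothetical toy matrix: 4 CpGs x 5 samples (not real data).
toy_beta = [
    [0.10, 0.12, 0.11, 0.09, 0.10],  # nearly constant
    [0.05, 0.95, 0.10, 0.90, 0.50],  # highly variable
    [0.40, 0.60, 0.45, 0.55, 0.50],  # moderately variable
    [0.80, 0.80, 0.80, 0.80, 0.80],  # constant
]
selected = top_variable_features(toy_beta, k=2)
print(selected)  # [1, 2]
```

In the study setting, `k` would be 2048 and the selected rows would then be passed to a t-SNE implementation; that step is omitted here because it depends on an external library.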
We observed that tumour purity has a minimal impact on the formation of these main clusters. iCCA samples showed a higher purity on average compared to PAAD samples, implying a higher tumour cell density (Fig. 2c). Because the reference set is composed of samples from multiple studies (Fig. 2a), we tested for batch effect. However, the samples from different cohorts clustered according to biological groups, implying no batch effect (Fig. 2d).
Next, we overlaid the t-SNE plot with additional molecular information available for our reference set. Regarding mutations, we noticed that iCCA group 1 contained all IDH1/2 mutated samples, while iCCA group 2 had no IDH1/2 mutations (p = 0.0012, chi-square test). Moreover, nearly all IDH1/2 mutant samples clustered close together, likely representing a subgroup of iCCA group 1 (Fig. 2e). We observed a similar distribution for BAP1 mutations: iCCA group 1 contained significantly more BAP1 mutant samples than group 2 (p = 0.0061, chi-square test, Fig. S3b). Next, we checked the status of KRAS, TP53, and SMAD4 in the two iCCA groups and found significantly more mutations in iCCA group 2 than in iCCA group 1 (p = 0.0036, p < 0.0001, and p < 0.0001, respectively, chi-square test, Fig. 2f–h). Furthermore, the Fluke-associated MLL3 mutation47 was more common in iCCA group 2 (p = 0.0379, chi-square test, Fig. S3c). In contrast, for ARID1A mutations, we did not observe significant differences between the two iCCA groups (p = 0.9637, chi-square test, Fig. S3d). We observed a strong enrichment of liver Fluke (Opisthorchis viverrini) associated tumours in iCCA group 2 compared to group 1 (p < 0.0001, chi-square test); these tumours formed a subgroup of group 2 (Fig. 2i). Additionally, we detected FGFR2 fusions only in iCCA group 1, while group 2 contained only samples with no FGFR2 fusion (p = 0.0438, chi-square test, Fig. S3e), and observed that iCCA group 1 had significantly more grade 2 and 3 samples than iCCA group 2 (p = 0.0004, chi-square test, Fig. S3f).
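The group comparisons above rely on chi-square tests of mutation status against iCCA group. The sketch below shows one way to compute such a test for a 2x2 table. It assumes Pearson's chi-square without continuity correction (the text does not state whether a correction was applied), and the counts are hypothetical, chosen only to illustrate the calculation.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the
    2x2 table [[a, b], [c, d]]. Returns (statistic, p_value), 1 d.f."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (not the study data):
# rows = iCCA group 1, iCCA group 2; columns = mutant, wild-type.
stat, p_value = chi_square_2x2(12, 48, 0, 40)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

For tables with small expected counts, an exact test is often preferred; that choice is outside the scope of this sketch.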
These data align well with the concept of small- and large-duct iCCA, with group 1 representing small-duct iCCA enriched for IDH1/2 mutations and iCCA group 2 representing large-duct iCCA enriched for TP53, KRAS, and SMAD4 mutations and frequently Fluke-associated.6,48, 49, 50 This indicates that our reference cohort covers these previously described biological iCCA classes.
Copy number alteration profiles using DNA methylation data showed a relatively distinct profile for the iCCA group 1 characterized by frequent 1p, 3p, 6q, 9q, 14q deletions and 1q amplification, often involving the entire chromosome arm (Fig. S4a). Many of these copy number alterations were previously described for iCCA.17 On the other hand, the iCCA group 2 was characterized mainly by 1p, 4p, 8p, 9p, 17p, 18q and 19p deletions, and 8q amplifications, again involving large chromosome regions (Fig. S4b) resembling the copy number profiles of pCCA and dCCA.17 For PAAD, we noticed fewer and more focal copy number alterations, the most notable being 3p, 5p, 7q, 18q, and 21p deletions (Fig. S4c). Interestingly, PAAD and the iCCA group 2 shared few copy number alterations, the most notable being a large loss of chromosome arm 18q.
Developing and validating a DNA methylation classifier that can distinguish between iCCA, PAAD and normal bile duct
Using DNA methylation data as input, we generated three machine learning models: random forest, support vector machine, and neural network, in order to classify samples into iCCA, PAAD, or normal bile duct tissue. The models were calibrated to generate probability scores for the three different outcomes. We optimized the number of features used focusing on the best performance and settled for 2048 CpGs (Fig. S5a–h), which are identical to the 2048 CpGs used for generating the t-SNE of the reference cohort.
We therefore analysed the biological function of the 2048 CpGs used for developing the classifier and explored their biological relevance. Of the 2048 CpGs, 1608 probes are associated with annotated genomic regions, while the rest are located in intergenic regions (IGR) (Fig. S6a). In total, the 1608 CpGs are associated with 1074 unique genes, with several CpG probes being associated with the same gene (Table S5). We observed that the most significantly enriched pathways of these 1074 unique genes were mainly involved in tissue and organ development, WNT/beta-catenin signalling, GDNF/RET signalling axis, and calcium regulation in the cardiac cell (Fig. S6b). Using the Descartes cell types and Tissue 2021 database, we observed that the cell types matching these 1074 genes were mainly associated with the enteric nervous system neurons in the intestine, pancreas, and stomach, visceral neurons in the lung, SATB2 positive cells in heart, inhibitory neurons in cerebellum and cerebrum, Schwann cells in muscle, astrocytes in eye, and stromal cells in kidney (Fig. S6c). This association of the most relevant CpGs for classification with morphogenesis and neuronal pathways was previously reported.12 In order to obtain a multiomics understanding51 of the 2048 CpGs used to develop the classifiers we also analysed their RNA expression. For these CpGs we checked their methylation level and observed that 359 are differentially methylated in iCCA versus PAAD. Of these, 239 CpGs mapped to known genes, with several CpGs mapping to the same gene (for example, 23 CpGs mapped to the gene MEIS1), whereas the other CpGs were localized in IGR. Hence, we ended up with 155 potentially differentially expressed genes (Table S6). For the genes with multiple CpGs we did not observe any discordant methylation levels between the CpGs, all being differentially methylated in the same direction.
Interestingly, almost all CpGs were hypermethylated in iCCA versus PAAD, with only two exceptions: cg15073906, which maps to the 3′UTR of RALGPS2, and cg03019505, which maps to the body of TFIP11. We were able to identify the RNA transcripts for 152 of the 155 genes and observed that 107 are significantly differentially expressed between iCCA and PAAD (p < 0.05, Student's t-test), with 79 (73.83%) transcripts being expressed in the direction expected from the methylation level (i.e. overexpressed when hypomethylated) (Fig. S6d).
The technical validation cohort was assembled from 10 methodologically different studies from all over the world (Fig. 3a), introducing a high degree of heterogeneity. An additional requirement for the classifier was to identify samples not belonging to any of the prediction classes. Similar to well-established methylation classifiers,9,13 we used an additional anomaly detection filter52 to exclude other potential metastatic adenocarcinomas to the liver (colorectal adenocarcinoma, COAD, and gastric adenocarcinoma, STAD). Moreover, we added thresholds on the models’ output probability in order to increase the accuracy and confidence in the results. This led to the generation of a “no match” output for the classifiers.
First, we ran the samples through the anomaly detection model. Almost all iCCA, PAAD and normal bile duct samples passed the filter (iCCA 97.39%, PAAD 97.35%, normal bile duct 100%), while the adenocarcinomas from other sites were mostly rejected by the filter (rejected: STAD 92.46%, COAD 100%) (Fig. S7a). Second, we compared the accuracy against the number of predictable cases for each model at different thresholds. We identified a threshold of 0.8 as offering the best balance between accuracy and number of classifiable samples (Fig. 3b–d). Looking closer at the percentage of predictable cases and accuracy at different thresholds, we observed that the scores for the normal bile duct class were on average lower (Fig. S7b–d). Normal bile duct samples in the random forest (Fig. S7b) and neural network (Fig. S7d) models had a lower number of predictable cases compared to the carcinomas, even at low thresholds, while still having a comparably good accuracy. On the other hand, for normal bile duct, the support vector machine had a high number of predictable cases and lower accuracy starting from a threshold of 0.5 (Fig. S7c). Therefore, we decided to lower the threshold for the normal bile duct tissue class to 0.5.
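The two-stage filtering described above (anomaly detection followed by class-specific probability thresholds) can be summarised as a simple decision rule. The sketch below is only an illustration: the thresholds of 0.8 for the carcinomas and 0.5 for normal bile duct come from the text, whereas the function name `assign_label`, the use of a "greater than or equal" comparison, and the example probabilities are assumptions.

```python
# Class-specific thresholds as stated in the text.
CLASS_THRESHOLDS = {"iCCA": 0.8, "PAAD": 0.8, "normal bile duct": 0.5}


def assign_label(probabilities, is_anomaly, thresholds=CLASS_THRESHOLDS):
    """Return the predicted class, or "no match" if the sample is rejected.

    probabilities: dict mapping class name -> calibrated probability score.
    is_anomaly: True if the anomaly detection model rejected the sample.
    """
    if is_anomaly:
        return "no match"
    best_class = max(probabilities, key=probabilities.get)
    # Assumption: a score equal to the threshold counts as passing.
    if probabilities[best_class] >= thresholds[best_class]:
        return best_class
    return "no match"


# Hypothetical probability vectors (not study outputs).
confident = {"iCCA": 0.91, "PAAD": 0.06, "normal bile duct": 0.03}
uncertain = {"iCCA": 0.70, "PAAD": 0.25, "normal bile duct": 0.05}
normal = {"iCCA": 0.30, "PAAD": 0.10, "normal bile duct": 0.60}

print(assign_label(confident, is_anomaly=False))  # iCCA
print(assign_label(uncertain, is_anomaly=False))  # no match
print(assign_label(normal, is_anomaly=False))     # normal bile duct
print(assign_label(confident, is_anomaly=True))   # no match
```

Keeping the thresholds in a per-class mapping makes it straightforward to sweep them and plot accuracy against the fraction of classifiable samples, as done in Fig. 3b–d.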
Next, we analysed the accuracy of the classifiers without and with additional filtering (anomaly detection and thresholds for normal bile duct samples (th. = 0.5) and carcinomas (th. = 0.8)) (Table S7). Without filtering, the random forest showed an accuracy of 95.01% (Fig. 3e), the support vector machine correctly classified 93.35% of the samples (Fig. 3f), and neural network 96.4% (Fig. 3g). The most frequently misclassified samples were normal bile samples: 45% misclassified for random forest, 60% for support vector machine, and 15% for neural network respectively.
After applying the anomaly detection and the two thresholds, the accuracy increased to 100% for the random forest, with 44 samples labelled as “no match” (Fig. 3e), to 96.22% for the support vector machine, with 17 samples labelled as “no match” (Fig. 3f), and to 99.07% for the neural network, with 37 samples labelled as “no match” (Fig. 3g). Even with the filters, the support vector machine misclassified normal bile duct samples as PAAD, a problematic outcome, with the other two classifiers performing much better in this respect. For the random forest and neural network, applying filters led to the reclassification of most or all misclassified samples into the new “no match” class, a desirable outcome. Unfortunately, some of the correctly classified samples were also reclassified as “no match”, especially for the normal bile duct tissue, where the neural network outperformed the random forest model by 30 percentage points (30% of normal bile duct samples classified as “no match” by the neural network versus 60% by the random forest). These results showed that the neural network classifier offered the best balance between accuracy and number of classifiable samples and therefore, we focused on this tool for further analysis.
Finally, we also looked for elements that could impact the probability score for the true class in each of the three models. As already noted, all three classifiers showed lower probability scores for the normal bile duct true class, and the classifiers performed slightly better for frozen tissue than for FFPE (Fig. 3h–j). All three models produced the lowest true class probabilities for the samples from GSE201241 (Fig. 3h–j), the study with the most normal samples and with FFPE material. When looking at the mutational status, the classifiers performed slightly better for tumours with a known driver mutation than for wild-type (WT) tumours. For iCCA, we also compared samples with large bile duct type specific mutations (KRAS and/or TP53) versus small bile duct type mutations (IDH1/2 and BAP1) and no important differences were observed. Furthermore, the classifiers showed slightly higher probability scores for the true class for 450K array samples versus EPIC (Fig. S7e–g). Next, we analysed the effect of tumour purity on the probability score of the true class, and observed a significant direct correlation for random forest (Pearson r = 0.1615, p = 0.0028) and support vector machine (Pearson r = 0.1208, p = 0.0257), while for the neural network (Pearson r = 0.0845, p = 0.1192) the correlation was not significant (Fig. S7h–j). Additionally, we used single cell data to perform a decomposition of bulk DNA methylation in order to estimate the immune cell fraction of the samples and correlated this to the probability scores for the true class. For all three classifiers we obtained significant negative correlations: random forest (Pearson r = −0.1570, p = 0.0028), support vector machine (Pearson r = −0.2090, p < 0.0001), and neural network (Pearson r = −0.3037, p < 0.0001) (Fig. S7k–m).
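The correlations reported above relate a per-sample covariate (tumour purity or immune cell fraction) to the probability score of the true class. A minimal sketch of the Pearson coefficient is given below; the helper `pearson_r` and the toy values are hypothetical, and the associated p-value (which requires the t distribution) is deliberately left out.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (not study data).
purity = [0.20, 0.35, 0.50, 0.65, 0.80]
true_class_score = [0.55, 0.70, 0.68, 0.90, 0.95]
immune_fraction = [0.40, 0.30, 0.32, 0.15, 0.10]

print(round(pearson_r(purity, true_class_score), 3))
print(round(pearson_r(immune_fraction, true_class_score), 3))
```

With real data, a positive coefficient for purity and a negative one for immune fraction would correspond to the directions reported in the text.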
Testing the classifier on an independent clinical cohort
We proceeded to test our classifiers on the in-house samples (n = 72): resected primary tumours (n = 39), and diagnostic biopsies of PAAD liver metastases and primary iCCA (PAAD metastases = 20, iCCA = 13).
First, we plotted the clinical test samples together with the reference cohort into a t-SNE. The iCCA samples grouped into the existing iCCA groups. Despite introducing FFPE PAAD samples for the first time to the analysis, these samples clustered with or next to the existing PAAD group (Fig. 4a). We also used our in-house samples to check how much the DNA methylation pattern changes between primary and metastatic PAAD tumours and iCCAs. Among the 2048 CpGs with the highest standard deviation, only 11 were significantly differentially methylated between primary PAAD and metastatic PAAD, while 515 CpGs were differentially methylated when comparing primary PAAD to iCCA, and 418 CpGs for metastatic PAAD versus iCCA. Of the 515 and 418 CpGs, 331 were shared by primary and metastatic PAAD (Fig. S8a).
Second, we looked at the results of the neural network classifier (Table S8). Without filtering, we obtained an accuracy of 94.44%, with two PAAD samples misclassified as normal bile duct and two iCCA samples classified as normal bile duct and PAAD, respectively (Fig. 4b, upper panel). After applying the thresholds, we reached an accuracy of 95.45%, with two PAAD samples misclassified as normal bile duct, one iCCA as PAAD, and 6 (8.33%) additional samples designated as “no match” (Fig. 4b, lower panel). This demonstrates that the neural network classifier performed well and very rarely misclassified samples. We also checked the accuracy of the other two models with filtering. The support vector machine achieved an accuracy of 90.14% with 1.39% of the samples labelled “no match”. The random forest achieved an accuracy of 95.24%, but the number of unclassified samples was much higher: 41.67% (Fig. 4c). These data indicated that the neural network classifier was superior.
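The percentages above can be reproduced from the sample counts given in the text, provided that accuracy is computed only over samples that received a class label and that the "no match" rate is computed over all 72 samples. The short sketch below makes this reading explicit; the helper name `summarize` is hypothetical.

```python
def summarize(n_total, n_no_match, n_misclassified):
    """Accuracy among classified samples and "no match" rate among all samples."""
    n_classified = n_total - n_no_match
    accuracy = (n_classified - n_misclassified) / n_classified
    return {
        "classified": n_classified,
        "accuracy_pct": round(100 * accuracy, 2),
        "no_match_pct": round(100 * n_no_match / n_total, 2),
    }


# Counts taken from the text: 72 in-house samples.
unfiltered = summarize(n_total=72, n_no_match=0, n_misclassified=4)
filtered = summarize(n_total=72, n_no_match=6, n_misclassified=3)

print(unfiltered)  # {'classified': 72, 'accuracy_pct': 94.44, 'no_match_pct': 0.0}
print(filtered)    # {'classified': 66, 'accuracy_pct': 95.45, 'no_match_pct': 8.33}
```

Under this reading, filtering removes one of the four errors while excluding six samples, which is why the accuracy gain on the clinical cohort is modest.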
Third, we compared the results of the neural network classifier with those of an established two-marker immunohistochemistry-based (IHC) classifier composed of ANXA1 and ANXA10.43 The IHC classifier (Fig. S8b) reached an accuracy of 86.57%. The area under the curve (AUC) of the IHC classifier in all samples for differentiating PAAD from iCCA was 0.8705, while the neural network classifier with the anomaly detection filter and thresholds reached an AUC of 0.9946 and an accuracy of 95.45% (Fig. S8c). More specifically, the neural network performed considerably better than IHC at distinguishing PAAD metastases from iCCA (neural network: accuracy 94.28%, AUC = 0.9903 versus IHC: accuracy 75%, AUC = 0.8427, Fig. S8c).
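For readers less familiar with the AUC used in this comparison, the sketch below computes it through its rank-based interpretation: the probability that a randomly chosen positive sample (here, hypothetically, PAAD) receives a higher score than a randomly chosen negative sample (iCCA), with ties counted as one half. The function `roc_auc` and the toy scores are illustrative assumptions, not the procedure or data used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: classifier scores for the positive class.
    labels: 1 for positive samples, 0 for negative samples.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy scores (not study data): 1 = PAAD, 0 = iCCA.
toy_scores = [0.90, 0.80, 0.40, 0.35, 0.10]
toy_labels = [1, 1, 0, 1, 0]
print(round(roc_auc(toy_scores, toy_labels), 4))  # 0.8333
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for cohorts of this size; larger datasets would normally use a rank-based implementation from a statistics library.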
We further looked at clinicopathological parameters that could have impacted the accuracy of our classifiers (Fig. 4c). We observed a direct correlation between tumour purity and probability score of the true class (Pearson r = 0.3618, p = 0.0018) and an inverse correlation between immune infiltrate and the true class score (Pearson r = −0.4044, p = 0.0004). In contrast, no associations were observed for degree of desmoplasia (fibrosis), proliferation index (Ki67), differentiation grade, IHC classifier results, growth pattern, iCCA subtype, or FGFR2 status (Fig. S8d).
As a last step, we checked the performance of the neural network classifier with sample filtering on other tumour types (Table S9). We started by checking if the classifier was able to exclude samples that are not iCCA, PAAD or normal bile. We tested the classifier on the 266 primary colon adenocarcinomas (COAD) and 345 primary stomach adenocarcinomas (STAD) from TCGA that were not used in the fitting of the anomaly detection model. The neural network classifier labelled all COAD and 95.36% of the STAD samples as “no match” (Fig. S9a).
We also checked the classifier's capacity to identify precursor lesions of iCCA, high-grade iITPN and iIPNB. Out of 16 precursor lesions, the classifier labelled 11 as iCCA and 5 as “no match” (Fig. S9b), a desirable outcome if we consider that high-grade iITPN and iIPNB are resected similarly to iCCA. Hence, the DNA methylation profile of iCCA used to build this classifier seemed to be present already in the premalignant phases. Lastly, we checked the classifier performance on pCCA. Out of 50 samples, it labelled 32 as iCCA and 14 as “no match”, with only 4 samples classified as either PAAD or normal bile duct tissue. This is a desirable outcome that suggests a strong ability to also classify large bile duct type iCCA, as these are more similar to pCCA (Fig. S9b).
Discussion
We show here that genome wide DNA methylation can be used to solve one of the most challenging diagnostic questions in oncologic liver pathology, the differentiation between iCCA and intrahepatic PAAD metastases. We assembled a large DNA methylation cohort of iCCA, PAAD and normal bile, containing samples from all over the world. Using this data, we constructed a reference cohort and trained three machine learning models, which we further tested and compared on a validation set. We finally tested the classifier on our in-house samples. The neural network classifier with additional filtering achieved the best results across all cohorts: 99.07% accuracy on the technical validation set with 10.25% as “no match” samples and 95.45% accuracy on the clinical test set with 8.33% as “no match” samples.
To our knowledge, no other study has used methylation data to solve this differential diagnosis problem. In a manuscript using immunohistochemistry, Lok et al. discovered a set of four protein markers that could aid the diagnosis. They observed that a certain immunohistochemistry pattern is specific for 59% of iCCA and that other combinations are more common for PAAD.53 More recently, Padden et al. used proteomics analysis to discover new diagnostic immunohistochemistry markers. They discovered a two-marker signature composed of ANXA1 and ANXA10 that can classify iCCA with an accuracy of 61% and PAAD with an accuracy of 78%.43 Furthermore, Ferrone et al. proposed DNA-enhanced albumin RNA in situ hybridization for the diagnosis of intrahepatic versus extrahepatic lesions. The staining was positive in 99% of iCCA, but also in 100% of hepatocellular carcinomas (HCCs), and was negative in primary and metastatic PAADs.54 In a subsequent study from a different group, albumin expression by RNA in situ hybridization was positive in 97% of HCCs, only 64% of iCCA (71% in resections and 50% in biopsies), none of the primary PAAD, and in all normal liver samples.55 Hence, this tool seems to have low specificity for iCCA, especially in biopsies. These results show once more that this differential diagnostic problem cannot be solved using immunohistochemistry or in situ hybridization, and that much more specific markers for the cell of origin, such as DNA methylation, are necessary.
The other important observation in our study is the confirmation, based solely on DNA methylation, that there are two important subtypes of iCCA: group 1, which forms a distinct cluster, is characterized by IDH1/2 and BAP1 mutations, and is likely the molecular counterpart to small-duct type iCCA,48, 49, 50 and group 2, which overlaps with pCCA and dCCA, is located close to the PAAD cluster, is enriched for KRAS, TP53, and SMAD4 mutations, and is likely the molecular counterpart to large-duct type iCCA. More specifically, we observed that iCCA group 1 further contains an IDH1/2 subgroup and iCCA group 2 is also strongly enriched for a subgroup of Fluke positive samples. A multiomics analysis on CCA (mostly iCCA) showed that there are four clusters of CCA: cluster 1 containing most Fluke positive samples, cluster 2 characterized by TP53 mutations, cluster 3 with high copy number alterations, and cluster 4 characterized by IDH1/2 mutations. Not surprisingly, clusters 1 and 2 contained both iCCA and eCCA, and clusters 3 and 4 were predominantly iCCA.6 Our DNA methylation results match these data well: cluster 1 matches our Fluke positive subgroup, cluster 2 matches our iCCA group 2 without the Fluke positive samples, for cluster 3 we did not find a clear match, and cluster 4 matches our IDH1/2 mutated subgroup. Furthermore, Goeppert et al. described, based on methylation and mutational data, four iCCA clusters: low (L), medium (M), IDH and high (H).16 From a mutational point of view, our iCCA group 1 with its specific IDH1/2 mutated subgroup resembled the H and IDH clusters, respectively, and from the point of view of copy number alterations, iCCA group 2 resembles the M group. We could not find a clear match for the L group in our analysis. The iCCA analysis of Goeppert et al. is based on data from a single geographical region and no samples from Fluke endemic areas were included, hence presenting a limited perspective on this topic.
Regarding the classification of iCCA, we believe that there are two clear groups of iCCA, that match the well described histological subtypes: small duct iCCA (group 1) and large duct iCCA (group 2). Each of these two groups contains an additional subgroup: the IDH1/2 and the Fluke subgroup. Additionally, we consider that there is a continuum between the two main groups, tumours arising probably from medium sized bile ducts, that share characteristics of both.
Both iCCA and PAAD metastases are diagnoses of exclusion that rely heavily on clinical and imaging data. In cases where these data are inconclusive, we believe that our DNA methylation classifier could be an extremely valuable clinical tool. PAAD frequently presents with distant metastases at diagnosis, with the liver being the most commonly affected site.56 While imaging has a sensitivity of 73–94% for the initial diagnosis of primary PAAD, specificity ranges from 60 to 89%.57, 58, 59 As a result, a significant number of patients with PAAD liver metastases have an undetectable primary tumour, and up to 17% of cancers of unknown primary (CUP) in the liver are metastatic PAAD.60 We believe that these patients would benefit greatly from our classifier.
On the other hand, when it comes to imaging, a solitary liver metastasis is the most important mimicker of mass forming iCCA.61 In an MRI study comparing the diagnosis of iCCA to hepatic adenocarcinoma of unknown primary, there was 67% agreement between the medical record, radiologists, and pathology report.62 Therefore, whenever an iCCA lesion with atypical features is detected on imaging and no specific markers for other adenocarcinomas are found by pathologists, our classifier may be of great use.
We envision multiple improvements and additional applications of our classifier that could bring this tool into clinical practice. PAAD is a frequent cause of CUP,60 so we need to test whether the classifier can be used for diagnosing metastatic PAAD at other locations. This tool would be extremely valuable in the perioperative setting for the diagnosis of liver tumours discovered during planned curative pancreatectomies. We are planning to use the much quicker nanopore sequencing and run the diagnostic test during the operation, as was already done for brain tumours.63 Recently, DNA methylation-based classifiers made their way into standard diagnostic algorithms, and are now considered desirable diagnostic criteria by the WHO.64 This implies that genome wide DNA methylation tools will be widely available, and classifiers for other tumour entities will be easily translatable into clinical practice. Additionally, DNA methylation nanopore sequencing is cheaper65 and would make such classifiers available also for centres that lack high throughput DNA methylation detection platforms.
Moreover, our tool clearly shows that the identification of subtypes of iCCA based solely on methylation-based data is possible. Future studies using well annotated clinical samples from the point of view of therapy response could make this tool suitable for predicting subgroups of patients that respond to new targeted therapies developed for this carcinoma.66
Our manuscript has limitations that need to be mentioned. Firstly, this is a retrospective study and the classifier was not tested prospectively. Another shortcoming is the inclusion of samples from only one geographic region in the test set. Recent data shows that molecular markers of iCCA differ between regions, mainly because of different aetiologies.19 Despite this, taking into account that our reference and validation cohorts are from all over the world, we consider that our classifier would perform well with samples from other geographical regions. Furthermore, the classifier was built and validated using primary PAAD, being however designed to detect PAAD liver metastases. The capacity of the classifier to diagnose PAAD liver metastases was proven using the testing cohort, and exactly this limitation can be interpreted as a strength: the DNA methylation signal used for the classifier development is specific for a carcinoma type; therefore, like in previous studies,12 the background tissue only marginally interferes with the results.
Secondly, in order to check if the methylation signals used for classifier development are specific for iCCA, PAAD and normal bile duct or pan-cancer specific,67 the classifier should be tested on samples other than those to be classified, ideally on liver metastases. Moreover, other signal selection methods51,68,69 could be considered for classifier development that could improve pan-cancer performance. Our data regarding this scenario are preliminary, and we do not have a full understanding on how the neural network model will classify other entities. Our results show that the neural network classifier with the added filters correctly labels all COAD and most STAD as “no match”.
Thirdly, the small number of normal bile duct samples in the reference cohort leads to lower probability scores for this class. We partially solved this issue by applying different thresholds for normal tissue and carcinomas, which led to a high accuracy for the neural network.
In summary, we developed a neural network classifier that could aid pathologists in the diagnosis of iCCA versus liver metastases of PAAD. This classifier can be immediately implemented in clinical practice and may have an important diagnostic impact.
Contributors
Conception and design: M.P. Dragomir, T.G. Calina, F. Roßner, and D. Capper. Development of methodology: M.P. Dragomir, T.G. Calina, and D. Capper. Acquisition of data: M.P. Dragomir, T.G. Calina, E. Perez, S. Schallenberg, M. Chen, T. Albrecht, I. Koch, and P. Wolkenstein. Analysis and interpretation of data: M.P. Dragomir, T.G. Calina, E. Perez, M. Chen, G.A. Calin, and D. Capper. Writing, review and/or revision of the manuscript: M.P. Dragomir, T.G. Calina, S. Roessler, G.A. Calin, F. Roßner, and D. Capper. Administrative, technical, or material support: M.P. Dragomir, T.G. Calina, B. Goeppert, S. Roessler, G.A. Calin, C. Sers, D. Horst, F. Roßner, and D. Capper. Study supervision: M.P. Dragomir, and D. Capper. All authors read and approved the final version of the manuscript. M.P. Dragomir, and T.G. Calina have accessed and verified the data. M.P. Dragomir was responsible for the decision to submit the manuscript.
Data sharing statement
The in-house clinical dataset analysed in this study is available from the Gene Expression Omnibus (GEO) repository under the following accession number: GSE217384. All other publicly available data are from previous studies deposited in GEO and TCGA, an overview of these studies and their origin is provided in our Methods section of the manuscript.
Declaration of interests
The authors declare no conflicts of interest.
Acknowledgements
We acknowledge the technical assistance of Daniel Teichmann and Sandra Meier from the Institute for Neuropathology, Charité-Universitätsmedizin Berlin, and the technical help of Barbara Meyer-Bartell, Kerstin Witkowski, and the workers from the archive of the Institute of Pathology Charité-Universitätsmedizin Berlin in retrieving the samples. M.P.D. and T.G.C. want to thank their great teacher, Karl Schümann from Rostock, who taught them the pleasure of science and the value of team work in a place and time where these ideals were almost forgotten. Dr. Calin is the Felix L. Haas Endowed Professor in Basic Science.
This work was supported by Berlin Institute of Health (JCS Program) and DKTK Berlin (Young Investigator Grant 2022) to M.P. Dragomir and by German Research Foundation (493697503 and 314905040 – SFB/TRR 209 Liver Cancer B01), and German Cancer Aid (70113922) to Stephanie Roessler.
Footnotes
Supplementary data related to this article can be found at https://doi.org/10.1016/j.ebiom.2023.104657.
Appendix A. Supplementary data
References
- 1. Bledsoe J.R., Shinagare S.A., Deshpande V. Difficult diagnostic problems in pancreatobiliary neoplasia. Arch Pathol Lab Med. 2015;139(7):848–857. doi: 10.5858/arpa.2014-0205-RA.
- 2. Lowery M.A., Ptashkin R., Jordan E., et al. Comprehensive molecular profiling of intrahepatic and extrahepatic cholangiocarcinomas: potential targets for intervention. Clin Cancer Res. 2018;24(17):4154–4161. doi: 10.1158/1078-0432.CCR-18-0078.
- 3. Cancer Genome Atlas Research Network. Integrated genomic characterization of pancreatic ductal adenocarcinoma. Cancer Cell. 2017;32(2):185–203.e13. doi: 10.1016/j.ccell.2017.07.007.
- 4. Farshidfar F., Zheng S., Gingras M.C., et al. Integrative genomic analysis of cholangiocarcinoma identifies distinct IDH-mutant molecular profiles. Cell Rep. 2017;18(11):2780–2794. doi: 10.1016/j.celrep.2017.02.033.
- 5. Jiao Y., Pawlik T.M., Anders R.A., et al. Exome sequencing identifies frequent inactivating mutations in BAP1, ARID1A and PBRM1 in intrahepatic cholangiocarcinomas. Nat Genet. 2013;45(12):1470–1473. doi: 10.1038/ng.2813.
- 6. Jusakul A., Cutcutache I., Yong C.H., et al. Whole-genome and epigenomic landscapes of etiologically distinct subtypes of cholangiocarcinoma. Cancer Discov. 2017;7(10):1116–1135. doi: 10.1158/2159-8290.CD-17-0368.
- 7. Wang X.Y., Zhu W.W., Wang Z., et al. Driver mutations of intrahepatic cholangiocarcinoma shape clinically relevant genomic clusters with distinct molecular features and therapeutic vulnerabilities. Theranostics. 2022;12(1):260–276. doi: 10.7150/thno.63417.
- 8. Arai Y., Totoki Y., Hosoda F., et al. Fibroblast growth factor receptor 2 tyrosine kinase fusions define a unique molecular subtype of cholangiocarcinoma. Hepatology. 2014;59(4):1427–1434. doi: 10.1002/hep.26890.
- 9. Capper D., Jones D.T.W., Sill M., et al. DNA methylation-based classification of central nervous system tumours. Nature. 2018;555(7697):469–474. doi: 10.1038/nature26000.
- 10. Moran S., Martinez-Cardus A., Sayols S., et al. Epigenetic profiling to classify cancer of unknown primary: a multicentre, retrospective analysis. Lancet Oncol. 2016;17(10):1386–1395. doi: 10.1016/S1470-2045(16)30297-2.
- 11. Jurmeister P., Scholer A., Arnold A., et al. DNA methylation profiling reliably distinguishes pulmonary enteric adenocarcinoma from metastatic colorectal cancer. Mod Pathol. 2019;32(6):855–865. doi: 10.1038/s41379-019-0207-y.
- 12. Jurmeister P., Bockmayr M., Seegerer P., et al. Machine learning analysis of DNA methylation profiles distinguishes primary lung squamous cell carcinomas from head and neck metastases. Sci Transl Med. 2019;11(509). doi: 10.1126/scitranslmed.aaw8513.
- 13. Koelsche C., Schrimpf D., Stichel D., et al. Sarcoma classification by DNA methylation profiling. Nat Commun. 2021;12(1):498. doi: 10.1038/s41467-020-20603-4.
- 14. Endo Y., Fujimoto M., Ito N., et al. Clinicopathological impacts of DNA methylation alterations on pancreatic ductal adenocarcinoma: prediction of early recurrence based on genome-wide DNA methylation profiling. J Cancer Res Clin Oncol. 2021;147(5):1341–1354. doi: 10.1007/s00432-021-03541-6.
- 15. Bailey P., Chang D.K., Nones K., et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 2016;531(7592):47–52. doi: 10.1038/nature16965.
- 16. Goeppert B., Toth R., Singer S., et al. Integrative analysis defines distinct prognostic subgroups of intrahepatic cholangiocarcinoma. Hepatology. 2019;69(5):2091–2106. doi: 10.1002/hep.30493.
- 17. Goeppert B., Stichel D., Toth R., et al. Integrative analysis reveals early and distinct genetic and epigenetic changes in intraductal papillary and tubulopapillary cholangiocarcinogenesis. Gut. 2022;71(2):391–401. doi: 10.1136/gutjnl-2020-322983.
- 18. Wang P., Dong Q., Zhang C., et al. Mutations in isocitrate dehydrogenase 1 and 2 occur frequently in intrahepatic cholangiocarcinomas and share hypermethylation targets with glioblastomas. Oncogene. 2013;32(25):3091–3100. doi: 10.1038/onc.2012.315.
- 19. Chan-On W., Nairismagi M.L., Ong C.K., et al. Exome sequencing identifies distinct mutational patterns in liver fluke-related and non-infection-related bile duct cancers. Nat Genet. 2013;45(12):1474–1478. doi: 10.1038/ng.2806.
- 20.Gregorio C., Soares-Lima S.C., Alemar B., et al. Calcium signaling alterations caused by epigenetic mechanisms in pancreatic cancer: from early markers to prognostic impact. Cancers (Basel) 2020;12(7) doi: 10.3390/cancers12071735. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Tian Y., Morris T.J., Webster A.P., et al. ChAMP: updated methylation analysis pipeline for Illumina BeadChips. Bioinformatics. 2017;33(24):3982–3984. doi: 10.1093/bioinformatics/btx513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Aryee M.J., Jaffe A.E., Corrada-Bravo H., et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30(10):1363–1369. doi: 10.1093/bioinformatics/btu049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Fortin J.P., Triche T.J., Jr., Hansen K.D. Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics. 2017;33(4):558–560. doi: 10.1093/bioinformatics/btw691. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Zhou W., Laird P.W., Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2017;45(4):e22. doi: 10.1093/nar/gkw967. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Nordlund J., Backlin C.L., Wahlberg P., et al. Genome-wide signatures of differential DNA methylation in pediatric acute lymphoblastic leukemia. Genome Biol. 2013;14(9) doi: 10.1186/gb-2013-14-9-r105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Fortin J.P., Labbe A., Lemire M., et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. 2014;15(12):503. doi: 10.1186/s13059-014-0503-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Teschendorff A.E., Marabita F., Lechner M., et al. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics. 2013;29(2):189–196. doi: 10.1093/bioinformatics/bts680. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Liu J., Siegmund K.D. An evaluation of processing methods for HumanMethylation450 BeadChip data. BMC Genomics. 2016;17:469. doi: 10.1186/s12864-016-2819-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Qin Y., Feng H., Chen M., Wu H., Zheng X. InfiniumPurify: an R package for estimating and accounting for tumor purity in cancer methylation research. Genes Dis. 2018;5(1):43–45. doi: 10.1016/j.gendis.2018.02.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Zhang N., Wu H.J., Zhang W., Wang J., Wu H., Zheng X. Predicting tumor purity from methylation microarray data. Bioinformatics. 2015;31(21):3401–3405. doi: 10.1093/bioinformatics/btv370. [DOI] [PubMed] [Google Scholar]
- 31.Hovestadt V., Zapatka M. 2017. conumee: enhanced copy-number variation analysis using Illumina DNA methylation arrays. R package version 1.9.0.http://bioconductor.org/packages/conumee/ Available from. [Google Scholar]
- 32.Capper D., Stichel D., Sahm F., et al. Practical implementation of DNA methylation and copy-number-based CNS tumor diagnostics: the Heidelberg experience. Acta Neuropathol. 2018;136(2):181–210. doi: 10.1007/s00401-018-1879-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Krijthe J.H. Rtsne: T-distributed stochastic neighbor embedding using a Barnes-Hut implementation. 2015. https://github.com/jkrijthe/Rtsne Available from:
- 34.Gu Z., Eils R., Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32(18):2847–2849. doi: 10.1093/bioinformatics/btw313. [DOI] [PubMed] [Google Scholar]
- 35.Moez A. 2020. PyCaret: an open source, low-code machine learning library in Python.https://www.pycaret.org Available from: [Google Scholar]
- 36.Abadi M., Agarwal A., Barham P., et al. 2015. TensorFlow: large-scale machine learning on heterogeneous systems.https://www.tensorflow.org/about/bib Available from: [Google Scholar]
- 37.Akiba T., Sano S., Yanase T., Ohta T., Koyama M. 2019. Optuna: a next-generation hyperparameter optimization framework. KDD 2019 applied data science track. [Google Scholar]
- 38.Teschendorff A.E., Zhu T., Breeze C.E., Beck S. EPISCORE: cell type deconvolution of bulk tissue DNA methylomes from single-cell RNA-Seq data. Genome Biol. 2020;21(1):221. doi: 10.1186/s13059-020-02126-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Smyth G.K. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3(Article3):1–25. doi: 10.2202/1544-6115.1027. [DOI] [PubMed] [Google Scholar]
- 40.Clarke D.J.B., Jeon M., Stein D.J., et al. Appyters: turning jupyter notebooks into data-driven web apps. Patterns (N Y) 2021;2(3) doi: 10.1016/j.patter.2021.100213. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.N Kalimuthu S., Wilson G.W., Grant R.C., et al. Morphological classification of pancreatic ductal adenocarcinoma that predicts molecular subtypes and correlates with clinical outcome. Gut. 2020;69(2):317–328. doi: 10.1136/gutjnl-2019-318217. [DOI] [PubMed] [Google Scholar]
- 42.Nakanuma Y., Sato Y., Harada K., Sasaki M., Xu J., Ikeda H. Pathological classification of intrahepatic cholangiocarcinoma based on a new concept. World J Hepatol. 2010;2(12):419–427. doi: 10.4254/wjh.v2.i12.419. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Padden J., Ahrens M., Kalsch J., et al. Immunohistochemical markers distinguishing cholangiocellular carcinoma (CCC) from pancreatic ductal adenocarcinoma (PDAC) discovered by proteomic analysis of microdissected cells. Mol Cell Proteomics. 2016;15(3):1072–1082. doi: 10.1074/mcp.M115.054585. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Pedregosa F., Varoquaux G., Gramfort A., et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12(85):2825–2830. [Google Scholar]
- 45.Chen L., Pan X., Zhang Y.H., Huang T., Cai Y.D. Analysis of gene expression differences between different pancreatic cells. ACS Omega. 2019;4(4):6421–6435. [Google Scholar]
- 46.Zhu T., Liu J., Beck S., et al. A pan-tissue DNA methylation atlas enables in silico decomposition of human tissue methylomes at cell-type resolution. Nat Methods. 2022;19(3):296–306. doi: 10.1038/s41592-022-01412-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Ong C.K., Subimerb C., Pairojkul C., et al. Exome sequencing of liver fluke-associated cholangiocarcinoma. Nat Genet. 2012;44(6):690–693. doi: 10.1038/ng.2273. [DOI] [PubMed] [Google Scholar]
- 48.Akita M., Fujikura K., Ajiki T., et al. Dichotomy in intrahepatic cholangiocarcinomas based on histologic similarities to hilar cholangiocarcinomas. Mod Pathol. 2017;30(7):986–997. doi: 10.1038/modpathol.2017.22. [DOI] [PubMed] [Google Scholar]
- 49.Shibata T., Arai Y., Totoki Y. Molecular genomic landscapes of hepatobiliary cancer. Cancer Sci. 2018;109(5):1282–1291. doi: 10.1111/cas.13582. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Aishima S., Oda Y. Pathogenesis and classification of intrahepatic cholangiocarcinoma: different characters of perihilar large duct type versus peripheral small duct type. J Hepatobiliary Pancreat Sci. 2015;22(2):94–100. doi: 10.1002/jhbp.154. [DOI] [PubMed] [Google Scholar]
- 51.Zhu J.H., Yan Q.L., Wang J.W., et al. The key genes for perineural invasion in pancreatic ductal adenocarcinoma identified with monte-carlo feature selection method. Front Genet. 2020;11 doi: 10.3389/fgene.2020.554502. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Jurmeister P., Gloss S., Roller R., et al. DNA methylation-based classification of sinonasal tumors. Nat Commun. 2022;13(1):7148. doi: 10.1038/s41467-022-34815-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Lok T., Chen L., Lin F., Wang H.L. Immunohistochemical distinction between intrahepatic cholangiocarcinoma and pancreatic ductal adenocarcinoma. Hum Pathol. 2014;45(2):394–400. doi: 10.1016/j.humpath.2013.10.004. [DOI] [PubMed] [Google Scholar]
- 54.Ferrone C.R., Ting D.T., Shahid M., et al. The ability to diagnose intrahepatic cholangiocarcinoma definitively using novel branched DNA-enhanced albumin RNA in situ hybridization technology. Ann Surg Oncol. 2016;23(1):290–296. doi: 10.1245/s10434-014-4247-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Lin F., Shi J., Wang H.L., et al. Detection of albumin expression by RNA in situ hybridization is a sensitive and specific method for identification of hepatocellular carcinomas and intrahepatic cholangiocarcinomas. Am J Clin Pathol. 2018;150(1):58–64. doi: 10.1093/ajcp/aqy030. [DOI] [PubMed] [Google Scholar]
- 56.Louvet C., Philip P.A. Accomplishments in 2007 in the treatment of metastatic pancreatic cancer. Gastrointest Cancer Res. 2008;2(3 Suppl):S37–S41. [PMC free article] [PubMed] [Google Scholar]
- 57.Izuishi K., Yamamoto Y., Sano T., Takebayashi R., Masaki T., Suzuki Y. Impact of 18-fluorodeoxyglucose positron emission tomography on the management of pancreatic cancer. J Gastrointest Surg. 2010;14(7):1151–1158. doi: 10.1007/s11605-010-1207-x. [DOI] [PubMed] [Google Scholar]
- 58.Lytras D., Connor S., Bosonnet L., et al. Positron emission tomography does not add to computed tomography for the diagnosis and staging of pancreatic cancer. Dig Surg. 2005;22(1–2):55–61. doi: 10.1159/000085347. discussion 2. [DOI] [PubMed] [Google Scholar]
- 59.Rijkers A.P., Valkema R., Duivenvoorden H.J., van Eijck C.H. Usefulness of F-18-fluorodeoxyglucose positron emission tomography to confirm suspected pancreatic cancer: a meta-analysis. Eur J Surg Oncol. 2014;40(7):794–804. doi: 10.1016/j.ejso.2014.03.016. [DOI] [PubMed] [Google Scholar]
- 60.Lazaridis G., Pentheroudakis G., Fountzilas G., Pavlidis N. Liver metastases from cancer of unknown primary (CUPL): a retrospective analysis of presentation, management and prognosis in 49 patients and systematic review of the literature. Cancer Treat Rev. 2008;34(8):693–700. doi: 10.1016/j.ctrv.2008.05.005. [DOI] [PubMed] [Google Scholar]
- 61.Kovac J.D., Jankovic A., Dikic-Rom A., Grubor N., Antic A., Dugalic V. Imaging spectrum of intrahepatic mass-forming cholangiocarcinoma and its mimickers: how to differentiate them using MRI. Curr Oncol. 2022;29(2):698–723. doi: 10.3390/curroncol29020061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Al Ansari N., Kim B.S., Srirattanapong S., et al. Mass-forming cholangiocarcinoma and adenocarcinoma of unknown primary: can they be distinguished on liver MRI? Abdom Imaging. 2014;39(6):1228–1240. doi: 10.1007/s00261-014-0172-3. [DOI] [PubMed] [Google Scholar]
- 63.Djirackor L., Halldorsson S., Niehusmann P., et al. Intraoperative DNA methylation classification of brain tumors impacts neurosurgical strategy. Neurooncol Adv. 2021;3(1) doi: 10.1093/noajnl/vdab149. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Louis D.N., Perry A., Wesseling P., et al. The 2021 WHO classification of tumors of the central nervous system: a summary. Neuro Oncol. 2021;23(8):1231–1251. doi: 10.1093/neuonc/noab106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Kuschel L.P., Hench J., Frank S., et al. Robust methylation-based classification of brain tumours using nanopore sequencing. Neuropathol Appl Neurobiol. 2023;49(1) doi: 10.1111/nan.12856. [DOI] [PubMed] [Google Scholar]
- 66.Cho S.M., Esmail A., Raza A., Dacha S., Abdelrahim M. Timeline of FDA-approved targeted therapy for cholangiocarcinoma. Cancers (Basel) 2022;14(11):2641. doi: 10.3390/cancers14112641. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Ding S., Li H., Zhang Y.H., et al. Identification of pan-cancer biomarkers based on the gene expression profiles of cancer cell lines. Front Cell Dev Biol. 2021;9 doi: 10.3389/fcell.2021.781285. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Li J., Xu Q., Wu M., Huang T., Wang Y. Pan-cancer classification based on self-normalizing neural networks and feature selection. Front Bioeng Biotechnol. 2020;8:766. doi: 10.3389/fbioe.2020.00766. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Lu J., Li J., Ren J., et al. Functional and embedding feature analysis for pan-cancer classification. Front Oncol. 2022;12 doi: 10.3389/fonc.2022.979336. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data