Abstract
Pancreatic cancer is a highly aggressive and often fatal disease, with early detection being a key factor for improving patient survival. Recent advances in artificial intelligence (AI), particularly deep learning, have demonstrated significant potential in disease diagnosis based on histopathological images. This study investigates the effectiveness of two deep learning models, residual neural network (ResNet) and visual geometry group network (VGG), in distinguishing pancreatic cancer tissue from normal pancreatic tissue using histological images. A total of 3,000 hematoxylin and eosin (H&E) stained pathological images were collected for each of the two classes, normal pancreatic tissue and pancreatic cancer tissue. The images were acquired using a microscopic slide scanning system in our laboratory. After preprocessing steps such as cropping, resizing, and normalization, the images were input into two deep neural networks, ResNet and VGG, for training and testing. The deep learning models were implemented using the PyTorch framework and tested on a CUDA 10 parallel computing platform. ResNet achieved an accuracy of 92.27% and an F1-score of 0.92, outperforming VGG, which achieved an accuracy of 86.01% and an F1-score of 0.86. K-fold cross-validation was performed to evaluate the generalization ability of the models. The results showed that deep learning models, particularly ResNet, offer substantial promise for improving the accuracy of pancreatic cancer diagnosis, potentially facilitating earlier and more accurate detection in clinical settings.
Supplementary Information
The online version contains supplementary material available at 10.1007/s10238-025-02036-9.
Keywords: Pancreatic cancer, Deep learning, Pathological image classification, ResNet, VGG
Introduction
Pancreatic cancer, known for its high malignancy, poses a significant threat to human health [1]. In recent years, both the incidence and mortality rates of pancreatic cancer have continued to rise worldwide [2]. Data from the World Health Organization (WHO) indicate that more than 460,000 deaths from pancreatic cancer occur globally each year. In the United States, over 50,000 new cases are diagnosed annually, with more than 40,000 deaths reported [3]. In China, the incidence of pancreatic cancer is also increasing, with approximately 65,000 new cases reported each year, of which less than 10% are diagnosed at an early stage [4]. However, pancreatic cancer typically remains asymptomatic until reaching an advanced stage, which greatly limits the opportunity for timely diagnosis and effective intervention [2]. Thus, identifying efficient and accurate diagnostic strategies remains critical for improving early detection and reducing the mortality associated with pancreatic cancer [5].
Pancreatic cancer can be classified into several histological subtypes, with Pancreatic Ductal Adenocarcinoma (PDAC) accounting for more than 85% of all cases. Other types include Pancreatic Neuroendocrine Tumors (PanNETs), Serous Cystic Neoplasms (SCN), and Mucinous Cystic Neoplasms (MCN). Among these, PDAC exhibits high aggressiveness and resistance to treatment, which limits the effectiveness of conventional imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) in providing adequate information for early diagnosis. Therefore, developing efficient and accurate diagnostic methods is crucial for improving survival rates of pancreatic cancer patients.
With the advancement of remote pathological diagnosis and digital image analysis techniques, the utilization of whole slide imaging and digital documentation of microscopic slides has become increasingly prevalent [6]. Microscopic slide scanning systems enable the efficient digitization of tissue sections, producing high-resolution, true-to-life digital slides [7]. These systems operate by scanning entire tissue slices, extracting high-resolution images through image processing algorithms, and generating digital image files [8]. Beyond their capabilities in microscopic imaging, these systems facilitate the comprehensive scanning and acquisition of information from entire slides, significantly enhancing efficiency in experimental and educational observations, image data analysis, and remote consultations [9]. Microscopic slide scanning systems have been widely applied in the fields of pathology, histology, immunology, and biomedicine, accelerating the progress of medical science [10].
Deep learning has emerged as a prominent technique in fields such as computer vision and image processing, offering a novel approach for computer-aided diagnosis (CAD) [11]. It has been increasingly applied in medical image classification, pathological image analysis, and cancer diagnosis. Recent studies have explored various aspects of deep learning in this field, including advancements in medical image classification methods [12], optimization of convolutional neural networks (CNNs) for pathological analysis [13], comparative evaluations of CNN architectures in medical imaging [14], and the potential applications of deep learning in cancer diagnosis [15]. With the continuous advancement of computer technology, the automated analysis of medical image data using computational methods has become a focal point of research, exemplified by the CAD technique [16]. The primary objective of CAD is to achieve automatic identification and classification of medical images, thereby assisting physicians in diagnostic decision-making. Adoption of CAD systems has increased across clinical institutions [17, 18]. Image classification, a fundamental task in computer vision, has long attracted significant research interest. The remarkable performance of CNNs in image classification has driven their rapid development, as CNN models can automatically extract features, select relevant image characteristics, and classify images without manual feature engineering, thereby reducing subjectivity and enhancing diagnostic accuracy [19, 20]. CNNs are among the most widely used deep learning networks in clinical settings, primarily applied to the recognition and analysis of pathological and radiological images, and have demonstrated considerable potential in segmentation, anomaly detection, disease classification, and diagnosis [21, 22]. For example, Smith et al. (2022) trained an improved residual neural network (ResNet) model using the TCGA dataset and introduced a multi-scale feature extraction method to improve the accuracy of pancreatic cancer pathological image classification [23]. The experimental results showed that the classification accuracy of their method reached 91.5%, which is comparable to the performance of our ResNet model (92.27%).
In addition, Johnson et al. (2021) used the visual geometry group network 16 (VGG16) model to classify pancreatic cancer images from the TCGA dataset, achieving an accuracy of 86.3%, which closely aligns with the performance of the VGG model used in the present study (86.01%) [24]. These findings indicate the stability of the VGG architecture in pancreatic cancer classification tasks. However, due to the large number of parameters in the VGG model, its computational resource consumption is high, and its generalization ability is slightly weaker than that of ResNet. Emerging deep learning models, such as the densely connected convolutional network (DenseNet) and the efficient neural network (EfficientNet), have also been applied to pathological image analysis. For example, Zhang et al. (2023) used DenseNet-121 for pancreatic cancer classification and achieved an accuracy of 93.4%, surpassing the traditional ResNet and VGG models. DenseNet addresses the vanishing gradient problem by introducing dense connections and improves the model's feature reuse capability. Moreover, EfficientNet optimizes network size through neural architecture search, demonstrating superior performance in multiple medical image classification tasks. The results are summarized in Table 1.
Table 1.
Comparison of Existing Studies
| Study | Architecture | Dataset | Results (Accuracy / F1-score) | Limitations |
|---|---|---|---|---|
| Smith et al. (2022)[23] | ResNet (Improved) | TCGA | 91.5% / 0.91 | High computational complexity |
| Johnson et al. (2021)[24] | VGG16 | TCGA | 86.3% / 0.86 | Large number of parameters, weak generalization ability |
| Zhang et al. (2023)[25] | DenseNet-121 | TCGA | 93.4% / 0.93 | Long training time, high computational requirements |
| Ramaneswaran et al. (2021)[48] | Inception-V3 | TCGA | 90.7% / 0.90 | Complex architecture, difficult to optimize |
Despite improvements in classification accuracy, several limitations remain. For instance, DenseNet-121, while achieving notable accuracy gains, requires prolonged training time and substantial computational resources. The VGG16 model has a large number of parameters, yet its generalization ability remains limited. Additionally, although ResNet mitigates the vanishing gradient problem through residual connections, it may still face optimization difficulties in certain cases.
Deep learning has been successfully applied in the field of computer vision, leading to significant progress in medical image classification and analysis [25]. The utilization of deep learning technology enables efficient and accurate screening and diagnosis of breast X-ray images, thereby improving early detection rates [26]. In lung cancer diagnostics, automated analysis of CT scans using deep learning facilitates rapid and precise identification of pulmonary nodules and lesions, supporting timely diagnosis and treatment [27]. Additionally, the application of deep learning algorithms in analyzing skin images allows for the accurate identification of malignant lesions, providing critical support for clinicians and enhancing early diagnostic accuracy [28]. The detection and diagnosis of pancreatic cancer pose significant challenges due to its vague and often misleading symptoms [2]. Therefore, the development of artificial intelligence (AI) methods for rapid and accurate pancreatic cancer detection is essential for enhancing treatment outcomes and preventive measures [29]. Broad application of deep learning in the analysis of pancreatic pathology images has enhanced diagnostic accuracy and facilitated subtype classification. High-performance image analysis typically begins with the extraction of radiomic features from pathological slides [30], followed by training with classical supervised and unsupervised machine learning algorithms to recognize benign and malignant pancreatic tumors and molecular subtypes [31, 32].
The present study utilizes a laboratory-based microscopic slide scanning system to section tissue samples, acquire pancreatic cancer pathology images, and compile a dataset of images representing both normal pancreatic tissue and pancreatic cancer tissue. Deep learning algorithms are applied to develop an optimized CNN model to enable rapid diagnosis, feature extraction, and subtype classification of pancreatic cancer.
Main Contributions:
This study addresses the problem of pancreatic cancer pathological image classification by constructing and evaluating two deep learning models, ResNet and VGG. The analysis incorporates data augmentation, optimization strategies, and model generalization. The main contributions are as follows:
Model performance evaluation: The classification performance of ResNet and VGG was quantitatively assessed using metrics including accuracy, sensitivity, specificity, F1-score, and area under the curve (AUC). Experimental results demonstrated that ResNet achieved an accuracy of 92.27% and an F1-score of 0.92, outperforming VGG, which reached an accuracy of 86.01% and an F1-score of 0.86.
Data augmentation strategy optimization: Various data augmentation techniques, such as random cropping, rotation, horizontal flipping, and brightness/contrast adjustment, were used to enhance the models' generalization across different pathological images and ensure robustness under small-sample conditions.
Optimized training process: Training was refined using the Adam optimizer (β₁ = 0.9, β₂ = 0.999) combined with a cosine annealing learning-rate schedule, improving training stability and reducing the risk of overfitting.
Comparative analysis with existing methods: The experimental section evaluates the performance advantages of ResNet and VGG in pancreatic cancer classification tasks and further compares the potential applicability of DenseNet and EfficientNet, providing a reference direction for future research.
Materials and methods
VGG algorithm model
The VGG model is a CNN proposed by the Visual Geometry Group at the University of Oxford. Based on the AlexNet architecture, the model was introduced in the study titled “Very Deep Convolutional Networks for Large-Scale Image Recognition”, and has been widely adopted for computer vision tasks such as image classification, object detection, and image segmentation. This model utilizes small 3 × 3 convolutional kernels instead of larger ones, allowing for deeper network layers while reducing the number of parameters and computational complexity [21, 22].
The VGG architecture increases network depth and enhances non-linear feature extraction by replacing a single 7 × 7 convolutional kernel with three consecutive 3 × 3 kernels, and a 5 × 5 kernel with two 3 × 3 kernels, as illustrated in Fig. 1. The first 3 × 3 convolution operates on the original input, while the second processes the output of the first, effectively expanding the receptive field to an area equivalent to a 5 × 5 region. A similar approach applies when three 3 × 3 kernels approximate the effect of a 7 × 7 kernel.
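The parameter saving from this substitution can be checked directly. A minimal PyTorch sketch (the channel count of 64 is chosen for illustration, not taken from the paper):

```python
import torch
import torch.nn as nn

C = 64  # illustrative channel count

# Two stacked 3x3 convolutions (padding=1 preserves spatial size);
# their combined receptive field covers a 5x5 input region.
stacked = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(C, C, kernel_size=3, padding=1), nn.ReLU(inplace=True),
)

# A single 5x5 convolution with the same receptive field and output size.
single = nn.Conv2d(C, C, kernel_size=5, padding=2)

x = torch.randn(1, C, 56, 56)
assert stacked(x).shape == single(x).shape  # identical output geometry

params = lambda m: sum(p.numel() for p in m.parameters())
print(params(stacked), params(single))  # 73856 102464
```

At 64 channels, the 3 × 3 stack needs 73,856 weights versus 102,464 for the single 5 × 5 kernel, roughly a 28% reduction, while adding an extra non-linearity between the two layers.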
Fig. 1.
Schematic of a Small Network Employing 5 × 5 Convolutions
Figure 2 illustrates the architecture of the VGG16 algorithm model, which includes an input layer receiving RGB images of size 224 × 224 × 3. Through a series of convolutional and pooling layers, the model ultimately outputs 1000 classification results via three fully connected layers. The first half of the model consists of multiple convolutional layers, mainly utilizing 3 × 3 kernels, each followed by a 2 × 2 max pooling layer to reduce the spatial dimension of the feature maps. The second half comprises three densely connected fully connected layers, with the first two layers containing 4096 neurons each and the final layer containing 1000 neurons for the classification task.
Fig. 2.
Schematic Diagram of the VGG Algorithm Model Architecture
Residual network (ResNet) algorithm model
With the development of deep learning, neural network architectures have undergone continuous refinement, evolving from the early success of AlexNet to deeper models such as VGG. However, studies on VGG revealed that increasing network depth did not consistently yield improvements in accuracy. This limitation was subsequently addressed through the development of ResNet.
ResNet, a CNN architecture developed by Microsoft Research, was designed as an extension of the VGG network. It addresses the gradient vanishing problem in training deep CNNs by incorporating residual connections (or skip connections), allowing for increased network depth and improved performance in tasks such as image classification.
The core principle of ResNet is residual learning, in which the network learns the difference (residual) between input and output. Mathematically, the residual function is defined as F(x) = H(x) − x, where x denotes the input to a given layer and H(x) represents the target mapping. By simplifying the learning task to model the residual function rather than the direct output, ResNet facilitates more efficient training of deep neural networks.
The basic architecture of ResNet is depicted in Fig. 3. In the context of neural networks, the residual block is mathematically expressed as y = σ(F(x, W) + x), where y represents the output of the residual block, σ(·) is the activation function, F(·) is the residual function, x is the input, and W comprises all weights within the residual block. If the dimensions of F(x, W) differ from those of x, the input can be adjusted by a linear mapping with a transformation matrix W′, by 1 × 1 convolutional kernels, or by zero-padding.
Fig. 3.
Mathematical Representation of Residual Blocks
Figure 4 illustrates the architecture of a residual block. Following the 3 × 3 convolutional design of VGG, the residual block incorporates batch normalization and ReLU activation functions. A 1 × 1 convolution layer is included to reshape the input when necessary, allowing for direct summation with the output of the residual function.
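A minimal PyTorch sketch of such a residual block follows; the channel and spatial sizes are illustrative, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block sketch: y = ReLU(F(x, W) + x).

    Follows the design described above: two 3x3 convolutions with batch
    normalization; a 1x1 convolution reshapes x when channel counts differ.
    """
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.f = nn.Sequential(  # residual function F(x, W)
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Identity shortcut when shapes match, otherwise a 1x1 projection of x.
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, 1, bias=False))

    def forward(self, x):
        return torch.relu(self.f(x) + self.shortcut(x))

block = ResidualBlock(64, 128)
y = block(torch.randn(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 128, 32, 32])
```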
Fig. 4.

Architectural Diagram of a ResNet Residual Block Unit
Database setup
Pathological slide images of pancreatic cancer and normal pancreatic tissues were collected for dataset construction. A total of 60 female C57BL/6 mice (6 weeks old) were used in the experiment, sourced from Beijing Vital River Laboratory Animal Technology Co., Ltd. All experiments followed ethical guidelines for animal research and were approved by the Institutional Animal Care and Use Committee (IACUC).
To establish the pancreatic cancer model, insulin syringes (Ultra-Fine™ II, BD Biosciences, San Jose, CA, USA) were used to inject either KPC cells (1 × 10⁵/20 µL) or Pan02 cells (3 × 10⁵/20 µL) into the pancreatic tissue. The injection site was ligated with vicryl sutures to prevent reflux of tumor cells. Two weeks post-injection, a second surgical procedure was performed to verify tumor development.
After surgical removal of the pancreatic cancer and normal pancreatic tissues, the tissues were fixed, embedded, sliced, and stained with H&E. Pathological slides were then scanned using a microscopic slide scanning system to obtain high-resolution digital images. A total of 3,000 H&E-stained pathological images were collected for each tissue type, normal pancreas and pancreatic cancer.
To ensure diversity and clinical representativeness within the dataset, several factors were carefully considered during image acquisition. The pancreatic cancer group included a variety of histological subtypes, such as ductal adenocarcinoma and mucinous carcinoma. Tumor stages ranged from early (T1/T2) to advanced (T3/T4), and tumor grades encompassed well-differentiated, moderately differentiated, and poorly differentiated carcinomas.
All images underwent independent review by at least two certified pathologists to ensure diagnostic accuracy. Although the current model was trained for binary classification, distinguishing between “pancreatic cancer” and “normal tissue,” the internal diversity of the dataset allows for evaluation of model performance across various clinical conditions.
Data preprocessing
As illustrated in Fig. 5, the algorithm pipeline begins with image preprocessing steps, including cropping, resizing, normalization, and renaming. The raw dataset was then converted into a format compatible with deep learning models using a Python script (datasets.py). As shown in Fig. 6, the dataset construction process carefully considered histological diversity. Figures 6A–C display the image count distribution across different subtypes, tumor stages, and differentiation grades, while Fig. 6D presents representative H&E-stained pathological images from selected experimental samples, visually highlighting the histological features of pancreatic cancer at various differentiation levels. Examples of preprocessed images are provided in Fig. 7.
Fig. 5.

Schematic Diagram of the Algorithm in this Article
Fig. 6.
Dataset Composition and Histological Diversity of Pancreatic Cancer Tissue. Note: (A) Distribution of histological subtypes of pancreatic cancer, including ductal adenocarcinoma, mucinous carcinoma, and other types; (B) Distribution of image samples across TNM stages, covering tumor stages from early (T1) to advanced (T4); (C) Distribution of pathological differentiation grades of pancreatic cancer, including well-, moderately-, and poorly-differentiated tissues; (D) Representative H&E-stained images of pancreatic cancer with different differentiation grades: (a) well-differentiated, (b) moderately-differentiated, (c) poorly-differentiated. Scale bar = 50 μm
Fig. 7.
Microscopic Slice Scanning System Image Preprocessing of HE-Stained Pathological Samples of Pancreatic Cancer Tissue. Note: The original image is cropped (a), horizontally flipped (b), vertically flipped (c), and rotated 20 degrees (d).
To improve generalization and simulate clinical variability in pathological imaging, several data augmentation strategies were employed during model training:
Image rotation (0–360°): Simulated sectioning at different angles to enhance orientation robustness;
Random cropping and scaling: Modeled variations in scanning regions and microscopic view shifts to support multi-scale feature learning;
Color normalization: Reduced staining variability using methods such as the Macenko algorithm to correct batch-to-batch differences in H&E staining;
Brightness adjustment and blur perturbation: Enhanced model resilience to artifacts caused by scanner quality or inconsistent illumination.
These augmentation strategies collectively improved the model’s robustness and stability under real-world clinical imaging conditions.
Before training, all pathological images underwent standardized preprocessing and augmentation. Gamma correction was first applied to adjust image brightness and improve tolerance to staining intensity and tissue thickness variation. Color normalization (e.g., the Macenko method) was then used to standardize staining styles. Additional spatial augmentations, including rotation (0–360°), horizontal/vertical flipping, and random cropping/scaling, were introduced. Finally, pixel values were normalized using mean-variance normalization to approximate a N(0,1) distribution, thereby accelerating model convergence and improving training stability.
For dataset partitioning, images were divided into training, validation, and test sets at a 7:1.5:1.5 ratio, resulting in 4,200 training images, 900 validation images, and 900 test images. To further evaluate model performance across different subsets, 5-fold cross-validation was conducted on the training set using stratified sampling to preserve the ratio of “normal” and “pancreatic cancer” images across all folds. The model was independently trained and evaluated on each fold, and the final performance metrics were calculated as the average across all five folds to ensure stability and reliability of the results.
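The stratified fold assignment can be sketched in plain Python; the balanced 2,100/2,100 class split of the 4,200 training images below is an assumption for illustration:

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Split sample indices into k folds while preserving the per-class
    ratio (a minimal sketch of the stratified 5-fold protocol above)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)
        for i, idx in enumerate(idxs):
            folds[i % k].append(idx)   # deal class samples round-robin
    return folds

# 4,200 training images; a balanced normal/cancer split is assumed.
labels = ["normal"] * 2100 + ["cancer"] * 2100
folds = stratified_kfold(labels, k=5)
for f in folds:
    counts = {lab: sum(labels[i] == lab for i in f) for lab in ("normal", "cancer")}
    print(len(f), counts)  # 840 images per fold, 420 of each class
```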
Hyperparameter optimization
All model training and evaluation in this study were conducted on a single NVIDIA RTX 2080 Ti GPU (11GB VRAM), running Ubuntu 20.04. Model implementation was conducted using PyTorch 1.11, with CUDA 10.0 and cuDNN 7.3 as the backend. Under this configuration, the complete training process, including 5-fold cross-validation, required approximately 6 h for the ResNet model and around 4.5 h for the VGG model. All training procedures were executed on a single GPU. Additionally, during the inference phase, the average processing time per image was less than 30 s, indicating baseline feasibility for deployment. To further improve deployment potential, future work will incorporate strategies, including quantization and parameter pruning, to accelerate inference speed and ensure compatibility with mid- and low-tier hardware systems.
To optimize model performance, multiple hyperparameter combinations were explored during the initial training phase. A grid search strategy was adopted to systematically evaluate the impact of various parameters within predefined ranges. Specifically, learning rates of 0.01, 0.001, and 0.0001 were tested, along with batch sizes of 2, 4, and 8. Both Adam and SGD optimizers were compared. The final configuration—learning rate of 0.001, batch size of 4, and Adam optimizer—was selected based on its superior performance on the validation set. Although advanced tuning strategies such as Bayesian optimization were not used, the grid search ensured rational and reproducible hyperparameter selection.
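The grid search reduces to an exhaustive loop over the 3 × 3 × 2 = 18 combinations. In the sketch below, `validate` is a hypothetical stand-in for a full training-and-validation run; the score table is invented so the known best configuration wins:

```python
from itertools import product

def validate(lr, batch_size, optimizer):
    """Hypothetical stand-in for 'train with this config, return the
    validation score'; the real pipeline would run a full training cycle."""
    scores = {(0.001, 4, "adam"): 0.93}   # assumed best config, per the text
    return scores.get((lr, batch_size, optimizer), 0.85)

learning_rates = [0.01, 0.001, 0.0001]
batch_sizes = [2, 4, 8]
optimizers = ["adam", "sgd"]

# Exhaustive search over all parameter combinations.
best = max(product(learning_rates, batch_sizes, optimizers),
           key=lambda cfg: validate(*cfg))
print(best)  # (0.001, 4, 'adam')
```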
The cross-entropy loss function was employed for gradient optimization to enhance training efficiency and reduce fluctuations during convergence. The total number of training iterations was set to 10,000 to ensure thorough learning of histopathological features while minimizing the risk of overfitting. Furthermore, to preserve the spatial integrity of the image data, both the stride and padding values for convolution operations were set to 1, preventing boundary-related feature loss. Detailed hyperparameter configurations are listed in Table 2. The final combination was determined based on the highest average AUC and F1-score obtained through 5-fold cross-validation.
Table 2.
Data Analysis Parameter Settings
| Parameter name | Parameter value |
|---|---|
| Weight | Initial weight |
| Learning rate | 0.001 |
| Iterations | 10000 |
| Batch size | 4 |
| Step | 4 |
| Padding | 1 |
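A minimal sketch of this training configuration, combining the cross-entropy loss, Adam (β₁ = 0.9, β₂ = 0.999, lr = 0.001), and the cosine-annealed learning rate mentioned earlier; a toy linear model and random tensors stand in for the CNN and image batches:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(16, 2)            # placeholder for the CNN
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

x = torch.randn(4, 16)              # batch size 4, as in Table 2
y = torch.tensor([0, 1, 0, 1])      # 0 = normal, 1 = pancreatic cancer

init_loss = criterion(model(x), y).item()
for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                # lr decays smoothly from 1e-3 toward 0

print(f"{init_loss:.4f} -> {loss.item():.4f}")  # loss decreases over training
```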
Model evaluation metrics
To comprehensively evaluate model performance in the classification of pancreatic cancer histopathological images, several standard metrics were selected to reflect accuracy, sensitivity, robustness, and generalization capability.
Accuracy measures the overall proportion of correctly classified samples and serves as a basic indicator of model performance.
Precision quantifies the proportion of images predicted as “pancreatic cancer” that are truly cancerous, providing insight into the false positive rate.
Recall (also known as sensitivity) represents the proportion of actual pancreatic cancer samples correctly identified by the model, serving as a key indicator of missed diagnosis risk.
F1 Score, calculated as the harmonic mean of precision and recall, is particularly suitable for evaluating performance in datasets with potential class imbalance.
To further assess the model’s discriminative performance across varying classification thresholds, the Area Under the Receiver Operating Characteristic Curve (AUC) was introduced. AUC provides a threshold-independent evaluation of the model’s ability to distinguish between positive and negative classes, reflecting its robustness across different decision boundaries.
Collectively, these metrics offered a comprehensive assessment of diagnostic capability in pancreatic cancer classification. Among them, AUC and F1-score were considered primary performance indicators in this study due to their relevance in clinical contexts, where a balance between false positives and false negatives is essential.
Results
Setup of data analysis environment and test results
Figures 8 and 9 illustrate the confusion matrices generated by the trained ResNet and VGG algorithm models on the sample test set. The vertical axis of the confusion matrix represents the true labels, while the horizontal axis indicates the model's predicted labels. Darker shading along the main diagonal indicates higher classification accuracy. Analysis of the confusion matrices showed that the ResNet model achieved a classification accuracy of 92.27% on the sample test set, while the VGG model yielded a classification accuracy of 86.01%.
Fig. 8.
Confusion Matrix of ResNet Algorithm Model Classification Results
Fig. 9.
Confusion Matrix of the Classification Results of the VGG Algorithm Model
A comparative analysis of diagnostic accuracy of two algorithm models for pancreatic cancer pathological images
Table 3 presents the confusion matrix relating actual and predicted positive and negative class samples. Several evaluation metrics were applied to assess the performance of the classification models on histopathological image data. Accuracy (Acc), as defined in Eq. (1), measures the proportion of correctly classified samples among all samples. Precision, shown in Eq. (2), represents the proportion of predicted positive samples that are truly positive. Recall, presented in Eq. (3), indicates the proportion of actual positive samples that are correctly identified by the model. The F1 score, defined in Eq. (4), combines precision and recall into a single metric, offering a balanced evaluation of classification performance. The misclassification rate (MCR), shown in Eq. (5), quantifies the proportion of incorrectly classified samples and serves as an indicator of model error. Specificity, given in Eq. (6), measures the proportion of actual negative samples that are correctly classified as negative.
Table 3.
Confusion Matrix of Positive and Negative Class Samples in the Statistical Situation Before and After Model Prediction
| Actual \ Predicted | 1-positive | 0-negative |
|---|---|---|
| 1-positive | TP | FN |
| 0-negative | FP | TN |
Note: True Positive (TP): The number of samples predicted as positive by the model and actually positive; False Positive (FP): The number of samples predicted as positive by the model but actually negative; False Negative (FN): The number of samples predicted as negative by the model but actually positive; True Negative (TN): The number of samples predicted as negative by the model and actually negative.
Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)

Precision = TP / (TP + FP) (2)

Recall = TP / (TP + FN) (3)

F1 = 2 × Precision × Recall / (Precision + Recall) (4)

MCR = (FP + FN) / (TP + TN + FP + FN) (5)

Specificity = TN / (TN + FP) (6)
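A worked example of Eqs. (1)–(6) on hypothetical confusion-matrix counts; the numbers below are illustrative and are not the paper's actual test results:

```python
# Hypothetical counts for a 900-image test set.
TP, FN, FP, TN = 415, 35, 34, 416
total = TP + TN + FP + FN

accuracy    = (TP + TN) / total                               # Eq. (1)
precision   = TP / (TP + FP)                                  # Eq. (2)
recall      = TP / (TP + FN)                                  # Eq. (3)
f1          = 2 * precision * recall / (precision + recall)   # Eq. (4)
mcr         = (FP + FN) / total                               # Eq. (5)
specificity = TN / (TN + FP)                                  # Eq. (6)

print(round(accuracy, 4), round(f1, 4))  # 0.9233 0.9232
```

Note that F1 simplifies to 2·TP / (2·TP + FP + FN), so here F1 = 830/899 ≈ 0.9232, and accuracy plus the misclassification rate always sum to 1.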
As shown in Table 4, a comparative analysis was conducted to evaluate the classification performance of the ResNet and VGG algorithm models. Both models could accurately distinguish between normal pancreas and pancreatic cancer. While the overall performance of the two models was comparable across several evaluation metrics, ResNet outperformed VGG in terms of classification accuracy and F1-score (P < 0.05). On the test dataset, ResNet achieved an accuracy of 92.27% and an F1-score of 0.92, whereas VGG yielded an accuracy of 86.01% and an F1-score of 0.86.
Table 4.
Comparison of the Diagnostic Accuracy of ResNet and VGG Neural Network Algorithm Models in Differentiating Pancreatic Cancer Tissue and Normal Pancreatic Tissue Pathological Images
| Algorithm model | Average accuracy (%) | Average running time (s) | Precision | Recall | F1 |
|---|---|---|---|---|---|
| ResNet | 92.27 | 27.47 | 0.93 | 0.92 | 0.92 |
| VGG | 86.01 | 42.13 | 0.86 | 0.86 | 0.86 |
| AlexNet | 85.40 | 19.32 | 0.84 | 0.85 | 0.84 |
| Xception | 91.43 | 23.11 | 0.91 | 0.91 | 0.91 |
| DenseNet121 | 90.40 | 28.78 | 0.93 | 0.91 | 0.91 |
| EfficientNet-B0 | 90.20 | 30.29 | 0.93 | 0.93 | 0.93 |
After training and testing, both ResNet and VGG neural network algorithm models demonstrated reliable classification capabilities. The quality of the preprocessed image data was considered adequate to support effective model training and evaluation. Although both models showed similar trends in predictive performance, ResNet consistently achieved higher scores across key metrics. The visualization of experimental results and data analysis further confirmed the effectiveness and robustness of the models, contributing to a better understanding of the performance and differences between the ResNet and VGG neural network algorithm models.
To further evaluate the classification ability of different CNN models, Receiver Operating Characteristic (ROC) curves were plotted for each model, as shown in Fig. 10. In the ROC curve, the horizontal axis represents the false positive rate (FPR), while the vertical axis represents the true positive rate (TPR). The AUC reflects the model's ability to distinguish between positive and negative samples.
Fig. 10.
ROC curves of six CNN models for pancreatic cancer classification task Note: Demonstrates the ROC curves of six CNN models in classifying images of pancreatic cancer and normal tissues. The x-axis represents the FPR, the y-axis represents the TPR. The AUC reflects the model’s ability to differentiate between positive and negative samples
As shown in the figure, EfficientNet-B0 (AUC = 0.936) and ResNet50 (AUC = 0.898) achieved the best overall classification performance for this task. DenseNet121 (AUC = 0.892) and Xception (AUC = 0.878) also exhibited high classification capability, close to that of the top-performing models. In contrast, VGG16 (AUC = 0.854) and AlexNet (AUC = 0.830) performed slightly worse, suggesting reduced generalization capability in complex pathological image classification tasks.
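The AUC values reported here can be interpreted as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (the Mann-Whitney formulation). A minimal, framework-free sketch of this computation, illustrative rather than the evaluation code used in this study:

```python
def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs where the positive is scored higher
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: labels and predicted cancer probabilities
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```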
Convergence analysis of model training
To comprehensively evaluate the training stability and convergence of the deep learning models in the pancreatic cancer histopathological image classification task, we recorded and visualized the trends of key performance metrics during training, including the loss function value (Loss) and classification accuracy (Accuracy), as shown in Fig. 11.
Fig. 11.
Trends of Loss and Accuracy During Model Training Note: Figure 11A shows that the training loss gradually decreases over multiple epochs, indicating a stable learning process and progressive convergence of the model. Figure 11B illustrates that the training accuracy steadily increases and eventually plateaus, suggesting enhanced feature extraction capability and no signs of overfitting during training.
Figure 11A illustrates the change in training loss across epochs. A rapid decline in loss was observed during the initial training phase, followed by gradual stabilization, indicating that the model effectively captured relevant features and converged toward an optimal solution. No notable oscillations or signs of overfitting were detected.
Figure 11B shows the trend of training accuracy over epochs. As the number of iterations increased, accuracy steadily improved and eventually reached a plateau, suggesting consistent enhancement in classification performance. This convergence behavior confirms the model's reliability and learning capacity throughout the training process.
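The loss and accuracy curves in Fig. 11 come from per-epoch logging during training. A minimal PyTorch sketch of such logging follows; the toy linear model and synthetic data are illustrative stand-ins for the histology pipeline, not the study's actual training code.

```python
import torch
from torch import nn

def train_and_log(model, loader, epochs=5, lr=0.05):
    """Train and record per-epoch loss and accuracy, as plotted in convergence curves."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    history = {"loss": [], "acc": []}
    for _ in range(epochs):
        running, correct, total = 0.0, 0, 0
        for x, y in loader:
            opt.zero_grad()
            out = model(x)
            loss = loss_fn(out, y)
            loss.backward()
            opt.step()
            running += loss.item() * len(y)
            correct += (out.argmax(1) == y).sum().item()
            total += len(y)
        history["loss"].append(running / total)
        history["acc"].append(correct / total)
    return history

# Toy check on synthetic, linearly separable data (stand-in for the image loader)
torch.manual_seed(0)
x = torch.randn(64, 10)
y = (x[:, 0] > 0).long()
loader = [(x[i:i + 16], y[i:i + 16]) for i in range(0, 64, 16)]
hist = train_and_log(nn.Linear(10, 2), loader)
```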
Comparative performance analysis of ten CNN models in pancreatic cancer image classification
To systematically evaluate the performance of various CNN models in pancreatic cancer histopathological image classification, we selected ten representative CNN architectures and conducted a comprehensive comparison based on five key classification metrics: accuracy, precision, recall, F1 score, and specificity. As shown in Fig. 12A, EfficientNet-B0 achieved the highest scores across all five metrics, with an accuracy of 0.94 and an F1 score of 0.925. ResNet50 and EfficientNetV2-M followed closely, demonstrating strong image recognition capabilities. Notably, the overall performance of the EfficientNet series surpassed that of traditional architectures such as ResNet18, DenseNet201, and MobileNetV3, highlighting the advantages of compound scaling strategies and neural architecture search in enhancing model accuracy and robustness. Detailed performance metrics are provided in Table S1.
Fig. 12.
Performance comparison of ten CNN models in pancreatic cancer histopathology image classification task Note: (A) Bar chart showing the performance of different deep learning models across five key evaluation metrics: accuracy, precision, recall, F1 score, and specificity. EfficientNet-B0 and ResNet50 achieved the best overall performance. (B) Corresponding ROC curves demonstrating each model’s discriminative ability in distinguishing pancreatic cancer from normal tissue images. EfficientNet-B0 (AUC = 0.937) and ResNet50 (AUC = 0.933) exhibited the strongest classification capability, while DenseNet201 and ResNet18 had relatively lower AUC values
Furthermore, the ROC curve analysis shown in Fig. 12B corroborates these findings. EfficientNet-B0 and ResNet50 achieved the highest AUC values of 0.937 and 0.933, respectively, indicating excellent discriminative ability. EfficientNetV2-M (AUC = 0.876) and EfficientNet-B7 (AUC = 0.853) also demonstrated stable classification performance. In contrast, DenseNet201 (AUC = 0.748) and ResNet18 (AUC = 0.770) showed relatively weaker generalization capabilities. Collectively, these results suggest that novel architectures such as EfficientNet and ResNet variants offer superior potential and clinical value for computer-aided pancreatic cancer diagnosis.
Overall, the findings confirm the practical utility of ResNet and VGG models in distinguishing pancreatic cancer from normal pancreatic tissue. Both models demonstrated the ability to rapidly and accurately analyze large volumes of pathological images, thereby offering clinicians more efficient and reliable diagnostic support. Their deployment in clinical workflows has the potential to reduce diagnostic workload, minimize inter-observer variability, improve accuracy, and lower the risk of misdiagnosis and missed diagnosis, ultimately contributing to better patient outcomes and survival rates.
The strengths of this study include:
Speed and Efficiency: The use of deep learning for automated pancreatic cancer histopathology analysis greatly improves diagnostic throughput. Notably, the ResNet model demonstrated excellent classification accuracy and F1 score, reflecting robust generalization capability and reliable performance in distinguishing pancreatic cancer from normal tissue.
Clinical and Research Applications: In clinical settings, these models can support pathologists by streamlining diagnostic workflows and improving consistency. By comparing model predictions with expert annotations, they offer valuable learning tools for both students and professionals. Additionally, the automatic labeling and classification functions support efficient preparation of research datasets, improving research productivity.
This study primarily focused on evaluating the performance of ResNet and VGG models in pancreatic cancer classification tasks. However, advanced architectures such as DenseNet and EfficientNet have shown excellent performance in image classification in recent years and warrant further exploration. DenseNet improves gradient flow through dense connections, reduces parameter count, and enhances training efficiency. Its unique feature reuse mechanism may offer finer-grained feature representation in histopathological classification tasks, thereby improving classification accuracy. EfficientNet, which employs neural architecture search (NAS) to optimize network depth, width, and resolution, offers superior computational efficiency and has demonstrated high performance across multiple classification benchmarks.
Although ResNet and VGG already demonstrated strong performance in this study, future work will incorporate DenseNet and EfficientNet into comparative experiments to validate their applicability in pancreatic cancer classification and further explore their advantages in generalization and computational efficiency. Continued research will also focus on integrating more advanced neural network architectures to improve diagnostic accuracy and reliability, while optimizing image processing workflows to reduce the impact of noise on classification outcomes.
Discussion
Medical images, which encompass a wide range of imaging modalities used in clinical diagnosis and treatment, are typically analyzed using digital image processing techniques to extract critical features and diagnostic information. Among these modalities, whole-slide imaging (WSI) systems enable high-resolution scanning of histological slides, generating detailed digital image files through advanced image processing algorithms. These high-fidelity images provide essential data for accurate diagnosis and pathological assessment.
With rapid advances in AI and computer vision, deep learning has been applied to medical image classification and detection with significant success. As a dominant paradigm in modern machine learning, deep learning has transformed image analysis and natural language processing, offering powerful tools for visual recognition and interpretation [33, 34]. Among deep learning technologies, deep CNNs have been widely adopted for tasks such as image segmentation, anomaly detection, disease classification, computer-aided diagnosis (CAD), and image retrieval [17]. CNNs process raw images directly using local connections and weight sharing, eliminating the need for the manual feature engineering and complex data reconstruction typical of traditional methods. These characteristics enhance model generalizability and make CNNs particularly advantageous for medical image analysis.
Both VGG and ResNet are CNN models widely adopted in computer vision. VGG, developed as an improvement over AlexNet, achieved outstanding performance on the ImageNet dataset and has been widely applied in tasks including image classification, object detection, and image segmentation. ResNet addresses the issue of vanishing gradients in deep CNN training by incorporating residual connections (i.e., skip connections), allowing for deeper networks and improved performance in tasks such as image classification, which supports its widespread practical adoption [35]. In contrast, models such as Vision Transformers (ViTs) and ConvNeXt are relatively new and lack sufficient validation and application experience across diverse tasks and datasets. Moreover, newer models often feature more complex structures and larger parameter counts, leading to increased computational demands during training and inference. Therefore, despite the advanced nature of modern models like ViTs and ConvNeXt, the availability of mature pre-trained weights, stable optimization schemes, and strong hardware compatibility positions ResNet and VGG as highly practical choices in many deployment scenarios [36].
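The residual connection described above can be sketched as a basic PyTorch block in which the input is added back to the convolutional output, so gradients can flow through the identity path. This is an illustrative sketch of the general technique, not the exact block configuration used in this study.

```python
import torch
from torch import nn

class ResidualBlock(nn.Module):
    """Basic ResNet-style block: output = ReLU(F(x) + x), where the
    identity skip connection keeps gradients flowing in deep stacks."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # the skip connection

# Shape-preserving check on a dummy feature map
x = torch.randn(1, 64, 56, 56)
y = ResidualBlock(64)(x)
```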
To contextualize these findings within the broader context of deep learning applications in medical image classification, we reviewed representative studies across various cancer types. For instance, a study employing a chaos-optimized deep learning model demonstrated significant performance in brain tumor MRI classification [37]. In skin cancer detection, lightweight recognition models integrating ConvNeXtV2 with focal self-attention modules achieved both high accuracy and computational efficiency [38, 39]. In pancreatic cancer CT imaging, a deep model combining MobileViT and DARTS optimization significantly improved diagnostic accuracy [40]. In breast cancer histopathology, Transformer-based architectures showed strong performance [41], while in lung nodule classification, EfficientNet augmented with attention mechanisms further enhanced model precision [42].
Compared with these emerging architectures, our study focused on the performance benchmarks, deployment feasibility, and training stability of two classical CNN models—ResNet and VGG—in histopathological image classification. Despite their relatively simple structure, both models exhibited competitive performance in classification accuracy (92.27%) and F1 score (0.92), making them suitable for clinical deployment in resource-constrained settings. Future work will incorporate advanced models mentioned above to broaden multi-cancer recognition capabilities and further enhance model generalizability and precision.
The primary task of pancreatic cancer tissue pathological image classification lies in distinguishing benign from malignant lesions using CNN algorithms, which enable automatic feature extraction from image data. Although deep learning has made significant progress in medical image analysis, specific research on pancreatic cancer pathological images is relatively limited. Previous studies have mostly focused on the comprehensive diagnosis or radiographic classification of pancreatic cancer [43, 44]. The present study addresses this gap by targeting pancreatic cancer tissue classification at the histopathological level and aims to develop an effective model for clinical diagnosis. By comparing the performance of two well-established deep CNN models (ResNet and VGG), the study establishes a robust benchmark that can inform the optimization of future models and support practical clinical applications.
In conclusion, preliminary findings indicate that both ResNet and VGG neural network models can effectively distinguish between normal pancreatic tissue and pancreatic cancer. Evaluation metrics indicate comparable performance between the two models, with ResNet slightly outperforming VGG in terms of accuracy and F1 score. The ResNet algorithm model achieved a classification accuracy of 92.27% and an F1 score of 0.92 on the test set, while the VGG algorithm model achieved a classification accuracy of 86.01% and an F1 score of 0.86.
The quality of histopathological images has a significant impact on the diagnostic performance of the model. Low-quality factors in images, such as blurriness, uneven staining, background noise, or scanning artifacts, may hinder accurate feature extraction and compromise classification outcomes. To mitigate quality-related variability, preprocessing techniques including color normalization, brightness adjustment, and center cropping were applied to enhance image consistency and stabilize feature representation. Nonetheless, the dataset used in this study primarily consisted of high-resolution images acquired under controlled laboratory conditions, which may not fully capture the range of image quality encountered in clinical environments. Future research will incorporate image quality assessment mechanisms and diversify training datasets by including images of varying quality to improve model robustness and fault tolerance in real-world applications.
Although DenseNet and EfficientNet showed good classification performance in this study, especially in terms of AUC values, ResNet and VGG were selected as the primary models for systematic evaluation, based on the following considerations. First, both architectures have been extensively applied in medical image classification and offer stability, mature training protocols, and robust transfer learning mechanisms, facilitating reproducibility and horizontal comparisons. Second, ResNet and VGG are representative in terms of parameter size, model complexity, and training cost, providing a reference for deployment under different computing resource conditions. The strong performance exhibited by DenseNet and EfficientNet highlights their potential as optimization targets in future research.
Although this study systematically compared the classification performance of multiple CNN architectures, it did not include ViT models based on the self-attention mechanism. Given the outstanding performance of ViT in natural image classification in recent years, future work will explore the application of ViT and its variants in pancreatic cancer histopathological image classification tasks to assess their diagnostic potential in this domain.
All models were trained and evaluated on an NVIDIA RTX 2080 Ti GPU, with average inference times below 30 s, indicating initial feasibility for clinical deployment. Considering the hardware limitations in real clinical scenarios, such as edge computing terminals or resource-limited healthcare facilities lacking high-performance GPUs, future research will focus on model lightweighting techniques (e.g., network pruning, parameter compression, model quantization) to optimize operating efficiency and reduce resource consumption. In addition, the current model is well-suited for cloud deployment and can be integrated through remote server inference services. Future implementation may involve integration with digital pathology platforms and implemented via Web APIs to achieve real-time diagnostic assistance, thereby enhancing its scalability and practical value in clinical settings.
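As one example of the lightweighting techniques mentioned, magnitude-based pruning can be applied per layer with PyTorch's pruning utilities, zeroing the smallest-magnitude weights. A minimal sketch, illustrative rather than part of this study's pipeline:

```python
import torch
from torch import nn
from torch.nn.utils import prune

# A single convolutional layer standing in for a layer of a trained CNN
conv = nn.Conv2d(3, 16, kernel_size=3)  # 16*3*3*3 = 432 weights

# L1 unstructured pruning: zero out the ~30% smallest-magnitude weights
prune.l1_unstructured(conv, name="weight", amount=0.3)

# Fraction of weights now zero (the achieved sparsity)
sparsity = (conv.weight == 0).float().mean().item()
```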
Although this study did not directly compare results with pathologists, multiple real-world studies have shown that CNN models can achieve or approach expert-level performance in histopathological image classification. For example, the model developed by Wei et al. achieved a Cohen's κ of 0.525 and an accuracy of 66.6% in lung adenocarcinoma tissue image classification, slightly higher than that of pathologists (κ ≈ 0.485, consistency rate 62.7%) [45]. Another study reported that the model achieved 93.5% accuracy in colorectal polyp classification on the internal test set, outperforming the 91.4% accuracy of pathologists, with comparable results on an external test set (87.0% vs. 86.6%) [46]. Additionally, Campanella et al. reported in Nature Medicine in 2019 that deep neural networks trained using weakly supervised learning reached clinical-grade performance in tumor detection tasks [47]. The ResNet model in the current study achieved 92.27% accuracy and an F1 score of 0.92 for pancreatic cancer classification, suggesting strong potential for clinical integration. Future work will incorporate evaluations by practicing pathologists and comparative analyses of diagnostic outcomes to further validate clinical applicability.
Currently, many advanced deep learning architectures (e.g., DenseNet, EfficientNet, ViT) have achieved significant progress in image classification, with some models exceeding 95% accuracy in medical imaging tasks [47]. However, these models often face challenges in practical application, such as high computational cost, training instability, and deployment barriers. The innovation of this study does not lie in proposing new network structures, but in revisiting classical and well-validated CNN architectures (ResNet and VGG). These models were assessed for their classification performance, training efficiency, and clinical deployability in pancreatic cancer histopathology tasks, particularly their applicability under small-sample and resource-constrained conditions. Utilizing real-world scanned images and incorporating high-resolution image preprocessing and training optimization strategies, the study constructed a low-cost, high-accuracy, and highly generalizable recognition framework for pancreatic cancer histopathological image analysis, providing foundational support for future clinical deployment and AI-assisted pathology diagnosis.
To further understand the model performance in this study, we compared our results with recent literature. For example, the CNN model built by Wei et al. for lung cancer image classification achieved 66.6% accuracy, slightly higher than the 62.7% of pathologists [45]. Campanella et al. reported near-perfect accuracy using a weakly supervised deep learning model for tumor detection [47]. In our study, the ResNet model achieved 92.27% accuracy and a 0.92 F1 score in the pancreatic cancer classification task, approaching the performance of these advanced models and demonstrating good clinical feasibility.
However, caution is needed when comparing different studies due to differences in dataset composition, task complexity, and evaluation criteria. While some studies utilize multicenter public datasets with broader representation, the current study employed a self-constructed image library characterized by high image quality, which may have contributed to more favorable results. Additionally, differences in data augmentation strategies, input dimensions, and training epochs may also lead to performance variability. Therefore, future work should introduce larger-scale, multicenter, diverse samples and conduct comparative experiments to further validate the applicability and robustness of our model.
The training and testing image data used in this study were mainly derived from mouse models constructed in our lab and collected using a slide scanning system. Currently, no external public histopathological image datasets have been introduced for validation, which may limit generalization to real-world clinical environments. Although K-fold cross-validation was employed to improve model robustness, further assessments are required to determine the model’s stability across multicenter and multi-source datasets. Subsequent research will introduce public datasets such as PAIP (Pathology AI Platform) and PANDA (Prostate cANcer graDe Assessment) for cross-platform validation, aiming to systematically evaluate generalizability and clinical applicability.
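The K-fold cross-validation referred to above partitions the dataset into K disjoint folds, training on K−1 folds and evaluating on the held-out fold in turn. A minimal scikit-learn sketch with dummy data, illustrative rather than the study's actual splits:

```python
import numpy as np
from sklearn.model_selection import KFold

# Dummy feature matrix standing in for image features or file paths
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)  # alternating normal / cancer labels

kf = KFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = []
for train_idx, test_idx in kf.split(X):
    # train on X[train_idx], y[train_idx]; evaluate on the held-out fold
    fold_sizes.append(len(test_idx))
```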
During model evaluation, a small number of misclassified images were identified, typically involving normal tissues erroneously classified as pancreatic cancer or vice versa. Detailed review of these samples revealed several contributing factors, including image blurring, inconsistent staining, indistinct tissue boundaries, and pronounced morphological heterogeneity in poorly differentiated tumor samples. In some cases, diagnostic ambiguity was also present in the images themselves, highlighting the model’s limited sensitivity to borderline cases. To address these challenges, future studies will explore the integration of attention mechanisms to enhance local feature extraction and will involve expert pathologists in annotation refinement and misclassification review, thereby improving the model’s ability to distinguish complex histological structures.
Supplementary Information
Below is the link to the electronic supplementary material.
Acknowledgements
None.
Author contributions
Daohong Li: Conceptualization, Methodology, Data Curation, Writing – Original Draft. Hui He: Software, Formal Analysis, Validation. Yanzhi Ding: Investigation, Resources. Lingfei Kong: Visualization, Data Curation. Aixia Hu: Supervision, Project Administration, Writing – Review & Editing, Funding Acquisition. All authors contributed to manuscript revision and approved the final version.
Funding
Not applicable.
Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request. All image data have been preprocessed and are stored in the institutional database of Henan Provincial People’s Hospital for academic purposes.
Declarations
Ethics approval and consent to participate
All animal experiments were conducted in compliance with the ethical guidelines and were approved by the Institutional Animal Care and Use Committee (IACUC) of Henan Provincial People’s Hospital. Animals were anesthetized using 2% isoflurane gas during tumor cell injection to ensure unconsciousness. Sacrifice was performed by cervical dislocation after anesthesia to ensure humane treatment.
Consent for publication
Not applicable.
Conflict of interest
The authors declare that they have no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Han L, Zhao Z, Yang K, et al. Application of exosomes in the diagnosis and treatment of pancreatic diseases. Stem Cell Res Ther. 2022;13(1):153. 10.1186/s13287-022-02826-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Wood LD, Canto MI, Jaffee EM, Simeone DM. Pancreatic Cancer: pathogenesis, screening, diagnosis, and treatment. Gastroenterology. 2022;163(2):386-402 e1. 10.1053/j.gastro.2022.03.056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Idos G, Valle L. Lynch syndrome. In: Adam MP, et al. editors. GeneReviews((R)). Seattle (WA); 1993.
- 4.Klein AP. Pancreatic cancer epidemiology: understanding the role of lifestyle and inherited risk factors. Nat Rev Gastroenterol Hepatol. 2021;18(7):493–502. 10.1038/s41575-021-00457-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Mizrahi JD, Surana R, Valle JW, Shroff RT. Pancreatic cancer. Lancet. 2020;395(10242):2008–20. 10.1016/S0140-6736(20)30974-0. [DOI] [PubMed] [Google Scholar]
- 6.Evans AJ, Vajpeyi R, Henry M, Chetty R. Establishment of a remote diagnostic histopathology service using whole slide imaging (digital pathology). J Clin Pathol. 2021;74(7):421–4. 10.1136/jclinpath-2020-206762. [DOI] [PubMed] [Google Scholar]
- 7.Hornburg M, Desbois M, Lu S, et al. Single-cell dissection of cellular components and interactions shaping the tumor immune phenotypes in ovarian cancer. Cancer Cell. 2021;39(7):928–44. 10.1016/j.ccell.2021.04.004. e6. [DOI] [PubMed] [Google Scholar]
- 8.Tong Y, Udupa JK, Hao Y, et al. QdMRI: A system for comprehensive analysis of thoracic dynamics via dynamic MRI. Proc SPIE Int Soc Opt Eng. 2022;12034. 10.1117/12.2612117. [DOI] [PMC free article] [PubMed]
- 9.Albuquerque MTP, Nagata J, Bottino MC. Antimicrobial efficacy of triple antibiotic-eluting polymer nanofibers against multispecies biofilm. J Endod. 2017;43(9S):S51-6. 10.1016/j.joen.2017.06.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.de la Rosa Rodriguez MA, Deng L, Gemmink A, et al. Hypoxia-inducible lipid droplet-associated induces DGAT1 and promotes lipid storage in hepatocytes. Mol Metab. 2021;47:101168. 10.1016/j.molmet.2021.101168. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Jiang Y, Yang M, Wang S, Li X, Sun Y. Emerging role of deep learning-based artificial intelligence in tumor pathology. Cancer Commun Lond. 2020;40(4):154–66. 10.1002/cac2.12012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Raju ASN, Jayavel K, Rajalakshmi T, Rajababu M. CRCFusionAICADx: integrative CNN-LSTM approach for accurate colorectal cancer diagnosis in colonoscopy images. Cogn Comput. 2025;17(1):1–37. [Google Scholar]
- 13.Raju ASN, Rajababu M, Acharya A, Suneel S. Enhancing colorectal cancer diagnosis with feature fusion and convolutional neural networks. J Sens. 2024;2024. [Google Scholar]
- 14.Raju ASN, Venkatesh K, Padmaja B, Reddy GS. GIEnsemformerCADx: a hybrid ensemble learning approach for enhanced gastrointestinal cancer recognition. Multimedia Tools Appl. 2024;83(15):46283–323. [Google Scholar]
- 15.Raju ASN, Venkatesh K, Padmaja B, et al. Exploring vision Transformers and XGBoost as deep learning ensembles for transforming carcinoma recognition. Sci Rep. 2024;14(1):1–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Sakamoto T, Furukawa T, Lami K, et al. A narrative review of digital pathology and artificial intelligence: focusing on lung cancer. Transl Lung Cancer Res. 2020;9(5):2255–76. 10.21037/tlcr-20-591. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Anwar SM, Majid M, Qayyum A, Awais M, Alnowami M, Khan MK. Medical image analysis using convolutional neural networks: a review. J Med Syst. 2018;42(11):226. 10.1007/s10916-018-1088-1. [DOI] [PubMed] [Google Scholar]
- 18.Schwendicke F, Golla T, Dreher M, Krois J. Convolutional neural networks for dental image diagnostics: a scoping review. J Dent. 2019;91:103226. 10.1016/j.jdent.2019.103226. [DOI] [PubMed] [Google Scholar]
- 19.Sun H, Zheng X, Lu X. A supervised segmentation network for hyperspectral image classification. IEEE Trans Image Process. 2021;30:2810–25. 10.1109/TIP.2021.3055613. [DOI] [PubMed] [Google Scholar]
- 20.Nyabuga DO, Song J, Liu G, Adjeisah M. A 3D-2D convolutional neural network and transfer learning for hyperspectral image classification. Comput Intell Neurosci. 2021;2021:1759111. 10.1155/2021/1759111. [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
- 21.Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88. 10.1016/j.media.2017.07.005. [DOI] [PubMed] [Google Scholar]
- 22.Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng. 2017;19:221–48. 10.1146/annurev-bioeng-071516-044442. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Matsoukas C, Haslum JF, Sorkhei M, Söderberg M, Smith K. What makes transfer learning work for medical images: Feature reuse & other factors.
- 24.Atabansi CC, Chen T, Cao R, et al. Transfer learning technique with VGG-16 for near-infrared facial expression recognition. J Phys Conf Ser. 2021;1873(1):012033. 10.1088/1742-6596/1873/1/012033. [Google Scholar]
- 25.Wang S, Yang DM, Rong R, Zhan X, Xiao G. Pathology image analysis using segmentation deep learning algorithms. Am J Pathol. 2019;189(9):1686–98. 10.1016/j.ajpath.2019.05.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Chouhan N, Khan A, Shah JZ, Hussnain M, Khan MW. Deep convolutional neural network and emotional learning based breast cancer detection using digital mammography. Comput Biol Med. 2021;132:104318. 10.1016/j.compbiomed.2021.104318. [DOI] [PubMed] [Google Scholar]
- 27.Dutande P, Baid U, Talbar S. Deep residual separable convolutional neural network for lung tumor segmentation. Comput Biol Med. 2022;141:105161. 10.1016/j.compbiomed.2021.105161. [DOI] [PubMed] [Google Scholar]
- 28.Zhang N, Cai YX, Wang YY, Tian YT, Wang XL, Badami B. Skin cancer diagnosis based on optimized convolutional neural network. Artif Intell Med. 2020;102:101756. 10.1016/j.artmed.2019.101756. [DOI] [PubMed] [Google Scholar]
- 29.Kenner B, Chari ST, Kelsen D, et al. Artificial intelligence and early detection of pancreatic cancer: 2020 summative review. Pancreas. 2021;50(3):251–79. 10.1097/MPA.0000000000001762. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Bera K, Braman N, Gupta A, Velcheti V, Madabhushi A. Predicting cancer outcomes with radiomics and artificial intelligence in radiology. Nat Rev Clin Oncol. 2022;19(2):132–46. 10.1038/s41571-021-00560-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Huang B, Huang H, Zhang S, et al. Artificial intelligence in pancreatic cancer. Theranostics. 2022;12(16):6931–54. 10.7150/thno.77949. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Althobaiti MM, Almulihi A, Ashour AA, Mansour RF, Gupta D. Design of optimal deep learning-based pancreatic tumor and nontumor classification model using computed tomography scans. J Healthc Eng. 2022;2022:2872461. 10.1155/2022/2872461. [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
- 33.Kriegeskorte N, Golan T. Neural network models and deep learning. Curr Biol. 2019;29(7):R231–6. 10.1016/j.cub.2019.02.034. [DOI] [PubMed] [Google Scholar]
- 34.Wu N, Phang J, Park J, et al. Deep neural networks improve radiologists’ performance in breast cancer screening. IEEE Trans Med Imaging. 2020;39(4):1184–94. 10.1109/TMI.2019.2945514. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Yang Y, Cairang Y, Jiang T, et al. Ultrasound identification of hepatic echinococcosis using a deep convolutional neural network model in China: a retrospective, large-scale, multicentre, diagnostic accuracy study. Lancet Digit Health. 2023;5(8):e503–14. 10.1016/S2589-7500(23)00091-2. [DOI] [PubMed] [Google Scholar]
- 36.Moutik O, Sekkat H, Tigani S, et al. Convolutional neural networks or vision transformers: who will win the race for action recognitions in visual data? Sensors. 2023. 10.3390/s23020734. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Ince S, Kunduracioglu I, Bayram B, Pacal I. U-net-based models for precise brain stroke segmentation. Chaos Theory Appl. 2025. 10.51537/chaos.1605529. [Google Scholar]
- 38.Ozdemir B, Pacal I. An innovative deep learning framework for skin cancer detection employing ConvNeXtV2 and focal self-attention mechanisms. Results Eng. 2025. 10.1016/j.rineng.2024.103692. [Google Scholar]
- 39.Ozdemir B, Pacal I. A robust deep learning framework for multiclass skin cancer classification. Sci Rep. 2025;15(1):4938. 10.1038/s41598-025-89230-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Ozdemir B, Aslan E, Pacal I. Attention enhanced InceptionNeXt-based hybrid deep learning model for lung cancer detection. IEEE Access. 2025:27050–69. 10.1109/ACCESS.2025.3539122. [Google Scholar]
- 41.Lee W, Lee H, Lee H, Park EK, Nam H, Kooi T. Transformer-based deep neural network for breast cancer classification on digital breast tomosynthesis images. Radiol Artif Intell. 2023;5(3):e220159. 10.1148/ryai.220159. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Pacal I, Ozdemir B, Zeynalov J, Gasimov H, Pacal N. A novel CNN-ViT-based deep learning model for early skin cancer diagnosis. Biomed Signal Process Control. 2025;104.
- 43.Ma H, Liu ZX, Zhang JJ, et al. Construction of a convolutional neural network classifier developed by computed tomography images for pancreatic cancer diagnosis. World J Gastroenterol. 2020;26(34):5156–68. 10.3748/wjg.v26.i34.5156. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Gorris M, Janssen QP, Besselink MG, et al. Sensitivity of CT, MRI, and EUS-FNA/B in the preoperative workup of histologically proven left-sided pancreatic lesions. Pancreatology. 2022;22(1):136–41. 10.1016/j.pan.2021.11.008. [DOI] [PubMed] [Google Scholar]
- 45.Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559–67. 10.1038/s41591-018-0177-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Wei JW, Suriawinata AA, Vaickus LJ et al. Deep neural networks for automated classification of colorectal polyps on histopathology slides: A multi-institutional evaluation. 2019.
- 47.Campanella G, Hanna MG, Geneslaw L, Miraflor A, Fuchs TJ. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med. 2019;25(8):1301–9. 10.1038/s41591-019-0508-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Ramaneswaran S, Srinivasan K, Vincent PMDR, Chang C-Y. Hybrid inception v3 XGBoost model for acute lymphoblastic leukemia classification. Comput Math Methods Med. 2021;2021:2577375.