Synthetic data generation method improves risk prediction model for early tumor recurrence after surgery in patients with pancreatic cancer

HyeJeong Jeong; Jeong-Moo Lee; Hyeong Seok Kim; Hochang Chae; So Jeong Yoon; Sang Hyun Shin; In Woong Han; Jin Seok Heo; Ji Hye Min; Seung Hyup Hyun; Hongbeom Kim

doi:10.1038/s41598-025-15800-4

Sci Rep. 2025 Aug 29;15:31885. doi: 10.1038/s41598-025-15800-4

PMCID: PMC12397232 PMID: 40883332

Abstract

Pancreatic cancer is aggressive with high recurrence rates, necessitating accurate prediction models for effective treatment planning, particularly for choosing between neoadjuvant chemotherapy and upfront surgery. This study explores the use of variational autoencoder (VAE)-generated synthetic data to predict early tumor recurrence (within six months) in pancreatic cancer patients who underwent upfront surgery. Preoperative data from 158 patients treated between January 2021 and December 2022 were analyzed, and machine learning models, including Logistic Regression, Random Forest (RF), Gradient Boosting Machine (GBM), and Deep Neural Networks (DNN), were trained on both original and synthetic datasets. The VAE-generated dataset (n = 94) closely matched the original data (p > 0.05) and enhanced model performance, improving accuracy (GBM: 0.81 to 0.87; RF: 0.84 to 0.87) and sensitivity (GBM: 0.73 to 0.91; RF: 0.82 to 0.91). PET/CT-derived metabolic parameters were the strongest predictors, accounting for 54.7% of the model predictive power with maximum standardized uptake value (SUVmax) showing the highest importance (0.182, 95% CI: 0.165–0.199). This study demonstrates that synthetic data can significantly enhance predictive models for pancreatic cancer recurrence, especially in data-limited scenarios, offering a promising strategy for oncology prediction models.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-15800-4.

Keywords: Pancreatic cancer, Machine learning, Early recurrence prediction, Variational autoencoder, Synthetic data

Subject terms: Pancreatic cancer, Surgical oncology, Machine learning

Introduction

Pancreatic cancer is a highly fatal disease with a low five-year survival rate of 10–20% after diagnosis¹. Owing to the high recurrence rate^2–4, the five-year survival rate after surgical resection remains approximately 10%, and this presents a substantial challenge for patient care⁵. Predicting postoperative recurrence of pancreatic cancer is crucial for determining the appropriate adjuvant therapies, surveillance strategies, and patient counseling. Neoadjuvant chemotherapy is the standard treatment for borderline resectable pancreatic cancer; however, recent attempts have been made to expand its scope to resectable pancreatic cancer⁶. The aggressive nature and high recurrence rate of pancreatic cancer necessitate accurate prediction models for effective treatment planning, particularly for choosing between neoadjuvant chemotherapy and upfront surgery.

The critical importance of early-stage pancreatic cancer detection is underscored by recent advances in imaging-based differentiation techniques. Machine learning approaches have demonstrated promise for differential diagnosis, with texture analysis methods showing capability to distinguish pancreatic ductal adenocarcinoma from mass-forming pancreatitis on contrast-enhanced CT images. The use of machine learning has been expanded to include various scenarios such as application in prediction models^7–9. Hence, this study has been conducted to advance the use of machine learning in the clinical practice of managing pancreatic cancer.

The application of the variational autoencoder (VAE) is a recent advancement in machine learning and has enabled the introduction of innovative methods into the medical field^8,10. VAEs are deep learning models used for unsupervised learning that generate new data similar to existing datasets. Unlike traditional autoencoders, a VAE uses a probabilistic approach to map the input data into a latent space represented by a mean and a variance. This enables tasks such as data augmentation, anomaly detection, and the generation of synthetic data for machine learning models.

The aim of this study was to use synthetic data generated using VAE to predict early tumor recurrence within six months after surgery in patients with pancreatic cancer who underwent upfront surgery. VAE was used to efficiently extract latent representations from high-dimensional data to accurately predict recurrence and aid in formulating personalized treatment plans. Through this approach, we aimed to achieve early identification of high-risk patients and propose more effective treatment methods to ultimately improve the survival rates of patients with pancreatic cancer.

Methods

Study population and data collection

We performed a retrospective analysis of preoperative data from patients who underwent pancreatectomy between January 2021 and December 2022 at the Samsung Medical Center (Fig. 1). The inclusion criteria were as follows: (1) pathologically confirmed pancreatic ductal adenocarcinoma, (2) patients who underwent upfront surgery, (3) availability of preoperative PET/CT imaging and magnetic resonance imaging (MRI), (4) availability of complete clinical and laboratory data, and (5) minimum follow-up period of six months. The exclusion criteria for the patients were as follows: (1) having received neoadjuvant chemotherapy and (2) disease at advanced stages and inability to undergo curative surgery. This study was approved by the institutional review board (SMC 2023-05-108), and the requirement for informed consent was waived due to the retrospective nature of the study. This study was conducted in accordance with the ethical principles of the Declaration of Helsinki (1989).

Fig. 1 — Patient flow diagram and imaging modalities for pancreatic ductal adenocarcinoma. Abbreviations: SMC, Samsung Medical Center; PDAC, Pancreatic ductal adenocarcinoma; NAC, neoadjuvant chemotherapy; O&C, open and closure; BRPC, borderline resectable pancreatic cancer; PD, pancreaticoduodenectomy; PET, positron emission tomography; MRI, magnetic resonance imaging.

Data preprocessing and synthetic data generation

The dataset included demographic variables, sex, age, preoperative prognostic factors for pancreatic cancer, preoperative tumor size, preoperative CT lymph node enlargement (LNE) status, tumor markers (CA19-9 and CEA), superior mesenteric vein (SMV) abutment, adjuvant treatment details, and various imaging metrics and parameters. The PET/CT-derived features such as maximum standardized uptake value (SUVmax), metabolic tumor volume (MTV2.5), total lesion glycolysis (TLG2.5), and related edge-based variants were included. The primary outcome was early tumor recurrence within six months after surgery.

Whole-body PET and unenhanced CT images were acquired using a PET/CT scanner (Discovery STE, GE Healthcare, Waukesha, WI). Whole-body CT was performed using a 16-slice helical CT with 30 to 170 mAs adjusted to the patient’s body weight at 140 kVp and a 3.75-mm section width. After the CT scan, an emission scan was performed from the thigh to the skull base for 2.5 min per frame in three-dimensional mode 60 min after intravenous injection of ¹⁸F-FDG (5.0 MBq/kg). PET images were reconstructed using CT for attenuation correction with the ordered subsets expectation maximization algorithm (20 subsets, 2 iterations) with a matrix of 128 × 128 and a voxel size of 3.9 × 3.9 × 3.3 mm ¹¹. Tumor FDG avidity was measured as maximum SUV (SUVmax) normalized to patient body weight by manually placing a spherical volume-of-interest over the primary tumor¹². MTV was obtained using the volume viewer software with a threshold SUV of 2.5 on a GE Advantage Workstation version 4.7. Total lesion glycolysis (TLG) was measured as MTV multiplied by the mean SUV.

All MR images were obtained using a 3.0 Tesla scanner (Intera Achieva or Ingenia; Philips Healthcare, Best, The Netherlands). Diffusion-weighted (DW) images were captured prior to contrast administration using respiratory-triggered single-shot echo-planar imaging with b-values set at 0, 100, and 800 s/mm². The apparent diffusion coefficient (ADC) was computed using a monoexponential function based on b-values of 0 and 800 s/mm² ¹³.
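The monoexponential model assumes S(b) = S0 · exp(−b · ADC), so with two b-values the ADC follows from the log-ratio of the two signals. The following is a minimal sketch of that arithmetic only; the function name and the signal intensities are hypothetical and are not taken from the scanner software used in this study.

```python
import math

def adc_monoexponential(s_low, s_high, b_low=0.0, b_high=800.0):
    """ADC in mm^2/s from two signals, assuming S(b) = S0 * exp(-b * ADC).

    b-values are in s/mm^2; both signals must be positive.
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    return math.log(s_low / s_high) / (b_high - b_low)

# Hypothetical voxel: signals constructed so that the true ADC is 1.17e-3 mm^2/s.
s0 = 1000.0
s800 = s0 * math.exp(-800.0 * 1.17e-3)
adc = adc_monoexponential(s0, s800)
```

The b = 100 s/mm² acquisition does not enter this two-point estimate, which matches the description that only b-values of 0 and 800 s/mm² were used.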

The ADC was measured by an abdominal radiologist using the region of interest (ROI). The largest possible circular ROI was manually placed on the axial slice in the largest cross-sectional area to avoid areas with necrosis, hemorrhage, or MRI artifacts. Additionally, the ratios of muscle and spleen ADC values were defined for comparison. Each measurement was performed thrice, and the values were averaged to reduce measurement error. The muscle and spleen ADC ratios were calculated by measuring the ADC in regions corresponding to the normal muscle tissue and spleen, respectively, on the same slices.

Missing values (< 5% of the data points) were imputed using multiple imputation by chained equations (MICE). Continuous variables were standardized using z-score normalization. The dataset was split into training (n = 94), validation (n = 33), and test (n = 31) sets with stratification for recurrence outcome. The validation and test sets consisted exclusively of real patients; no synthetic sample was included at any point.
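As an illustration of z-score normalization across a data split, the sketch below estimates the mean and standard deviation on training values and reuses them for held-out values. Whether the scaling statistics in this study were estimated on the training set alone is not stated, so that choice is an assumption here, and the feature values are hypothetical.

```python
from statistics import mean, pstdev

def fit_zscore(train_values):
    """Return (mean, population SD) estimated from training values only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return mu, sigma

def apply_zscore(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]

# Hypothetical continuous feature (not study data).
train_feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
heldout_feature = [5.0, 9.0]

mu, sigma = fit_zscore(train_feature)
heldout_z = apply_zscore(heldout_feature, mu, sigma)
```

Reusing the training statistics keeps information from validation and test patients out of the preprocessing step.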

Traditional machine learning models compress data and are affected by small case volumes, which results in lower performance, whereas VAE is effective in generating synthetic data that closely mirror the patterns in the original dataset. Hence, VAE was applied instead of traditional machine learning models, as shown in Fig. 2. The VAE model was trained on preoperative data to create a synthetic dataset and enhance the diversity and depth of the training data^14,15. This approach helps overcome the limitations of requiring more cases and computing power, thereby reducing costs.

Fig. 2 — Framework of variational autoencoder (VAE)-based synthetic data generation for improving pancreatic cancer recurrence prediction.

Additional oversampling benchmark (SMOTE)

For a comparison with conventional oversampling, we created an alternative training set using the Synthetic Minority Over-sampling Technique (SMOTE). The original derivation cohort (n = 94, recurrence 27.7%) was resampled with k = 5 nearest-neighbor SMOTE (imblearn 0.12, random_state = 42) until the minority-to-majority ratio reached 1:1 (total n = 130). All downstream preprocessing (median imputation, z-score scaling) and model hyper-parameters were kept identical to the main VAE pipeline to ensure a fair comparison.
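The core idea of SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbors. The code below is a simplified pure-Python illustration of that interpolation step, not the imblearn implementation used in the study; the function name and the toy points are hypothetical.

```python
import math
import random

def smote_sketch(minority, n_new, k=5, seed=42):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point, excluding the point itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic

# Hypothetical standardized minority-class samples with two features.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
new_points = smote_sketch(minority_points, n_new=4)
```

Because every synthetic point is a convex combination of two real minority points, SMOTE cannot leave the region spanned by the observed minority samples, which is one reason it is often described as a linear interpolation method.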

VAE architecture

The VAE model, implemented in PyTorch, consists of encoder and decoder networks with the following architecture:

Encoder: Input layer (23 nodes) → Dense layer (64 nodes, rectified linear unit [ReLU]) → Dense layer (32 nodes, ReLU) → Latent space (16 dimensions)
Decoder: Latent space → Dense layer (32 nodes, ReLU) → Dense layer (64 nodes, ReLU) → Output layer (23 nodes)

The VAE was trained using the Adam optimizer with a learning rate of 0.001 and a batch size of 32 for 1000 epochs. The loss function combined reconstruction loss (mean squared error) and KL divergence with β = 0.5 weighting the KL term to balance between reconstruction fidelity and latent space regularization. Early stopping was implemented with a patience of 100 epochs to monitor the validation loss (20% of the data used for validation). Training convergence was achieved when the relative change in the validation loss was < 0.0001 over the patience period. The encoder received the feature vector x concatenated with a one-hot outcome label (y = 0 for non-recurrence, 1 for recurrence). To counter class imbalance (27.7% positives), recurrence-positive cases were oversampled four-fold during training so that the latent prior became balanced at 1 : 1. During generation, 50% of latent seeds were drawn from the positive half and 50% from the negative half, yielding an exactly balanced synthetic cohort without altering feature correlations.
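The β-weighted objective described above can be written out explicitly. The sketch below evaluates the loss for one sample in plain Python rather than PyTorch, assuming a diagonal Gaussian posterior, a standard normal prior, a mean-reduced squared error, and a KL term summed over the 16 latent dimensions; the reduction choices and all function names are assumptions, since the text reports only the loss components and β = 0.5.

```python
import math
import random

BETA = 0.5        # KL weight reported in the text
LATENT_DIM = 16   # latent dimensionality reported in the text

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def mse(x, x_hat):
    """Mean squared reconstruction error over the features."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def beta_vae_loss(x, x_hat, mu, logvar, beta=BETA):
    """Reconstruction loss plus beta-weighted KL divergence."""
    return mse(x, x_hat) + beta * kl_to_standard_normal(mu, logvar)

# Hypothetical 23-feature input and reconstruction.
x = [1.0] * 23
x_hat = [0.5] * 23
zero_logvar = [0.0] * LATENT_DIM

loss_at_prior = beta_vae_loss(x, x_hat, [0.0] * LATENT_DIM, zero_logvar)
loss_shifted = beta_vae_loss(x, x_hat, [1.0] * LATENT_DIM, zero_logvar)
z = reparameterize([0.0] * LATENT_DIM, zero_logvar, random.Random(0))
```

When the posterior equals the prior, the KL term vanishes and only the reconstruction error remains; shifting every latent mean to 1 adds a KL of 8, which β = 0.5 scales to 4.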

Euclidean nearest-neighbor (NN) distances were calculated between each synthetic sample and its closest real counterpart after median-imputation + z-score scaling (Supplementary Figure S1). The same metric was computed within the real cohort (self-NN, diagonal set to ∞) to serve as a baseline.
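The nearest-neighbor check described above can be sketched as follows. The helper is hypothetical and uses a brute-force search on toy points; skipping the matching index in the real-to-real comparison plays the same role as setting the diagonal of the distance matrix to ∞.

```python
import math

def nn_distances(queries, references, exclude_self=False):
    """Euclidean distance from each query to its nearest reference point.

    With exclude_self=True, queries and references are the same cohort and
    each point is prevented from matching itself.
    """
    distances = []
    for i, q in enumerate(queries):
        best = math.inf
        for j, r in enumerate(references):
            if exclude_self and i == j:
                continue
            best = min(best, math.dist(q, r))
        distances.append(best)
    return distances

# Hypothetical standardized samples (not study data).
real = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
synthetic = [[0.1, 0.0], [0.9, 0.1]]

syn_to_real = nn_distances(synthetic, real)
real_self = nn_distances(real, real, exclude_self=True)
```

Synthetic-to-real distances that are much smaller than the real self-NN baseline would suggest near-copies of individual patients, whereas much larger distances would suggest samples far from the observed data.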

Recurrence prediction model development

We developed the following machine learning models to predict six-month recurrence of pancreatic cancer: Logistic Regression (LR), Random Forest (RF), Gradient Boosting Machine (GBM), and Deep Neural Networks (DNN). Each model was trained on the original dataset and on the augmented dataset, which combined the original data with VAE-generated synthetic samples at a 1:1 ratio. The model performance was evaluated based on accuracy, sensitivity, specificity, and AUC-ROC. The feature importance was calculated using the RF mean decrease in impurity (MDI) with sum-to-one normalization. Bootstrap resampling was performed to estimate the 95% confidence interval (95% CI) for each parameter.

To address potential instability arising from limited validation (n = 33) and test (n = 31) sample sizes, we implemented comprehensive statistical validation approaches. Bootstrap confidence intervals were calculated using the bias-corrected and accelerated (BCa) method with 2,000 resampling iterations for each performance metric. Statistical significance of AUC improvements between VAE-augmented and original models was assessed using the DeLong method, which accounts for correlated ROC curves and provides robust comparisons in small sample settings. The SHAP analysis also offered patient-level interpretability by highlighting how variable contributions differed, even within small subgroups (Supplementary Figure S2, S3).
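A BCa interval adjusts the percentile bootstrap with a bias-correction term (z0) and an acceleration term estimated by jackknife. The sketch below is a compact pure-Python version for a single statistic, shown on hypothetical per-patient correctness flags for a 31-patient test set; it is not the code used in the study, and production work would more likely rely on a library routine.

```python
import random
from statistics import NormalDist

def bca_interval(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Return (estimate, lower, upper) using the BCa bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    boot = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    norm = NormalDist()
    # Bias correction: share of bootstrap estimates below the point estimate
    # (ties count half), clamped away from 0 and 1 so inv_cdf stays finite.
    below = sum(b < theta_hat for b in boot) + 0.5 * sum(b == theta_hat for b in boot)
    p0 = min(max(below / n_boot, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(p0)
    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(level):
        z = norm.inv_cdf(level)
        adj = norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        idx = min(n_boot - 1, max(0, int(adj * n_boot)))
        return boot[idx]

    return theta_hat, adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)

def proportion(flags):
    return sum(flags) / len(flags)

# Hypothetical flags: 1 = correct prediction, 0 = incorrect (27 of 31 correct).
correct_flags = [1] * 27 + [0] * 4
est, lo, hi = bca_interval(correct_flags, proportion)
```

With only 31 patients the interval is necessarily wide, which is consistent with the broad confidence intervals reported for the test set.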

Additionally, leave-one-out cross-validation (LOOCV) was performed on the entire training cohort to evaluate model stability and reduce potential overfitting concerns. LOOCV provides an unbiased estimate of model performance by iteratively training on n-1 samples and testing on the remaining sample, particularly valuable for datasets with limited sample sizes (Supplementary Figure S4).
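The LOOCV loop itself is independent of the model being evaluated. The sketch below shows the procedure with a nearest-centroid classifier as a simple stand-in; that classifier is not one of the four models in this study, and the toy data and function names are hypothetical.

```python
import math

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

def loocv_accuracy(X, y, predict):
    """Train on n-1 samples, test on the held-out sample, repeat for every sample."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)

# Hypothetical, well-separated two-class data.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
y = [0, 0, 0, 1, 1, 1]
acc = loocv_accuracy(X, y, nearest_centroid_predict)
```

Any preprocessing that learns from the data, such as scaling or synthetic sample generation, would need to be repeated inside each fold to keep the held-out sample unseen.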

Model calibration assessment

Model calibration was comprehensively evaluated using multiple metrics to assess the reliability of predicted probabilities for clinical decision-making. Calibration analysis was performed on the independent test set (n = 31) for all four machine learning models trained on both original and VAE-augmented datasets.

The Brier score was calculated as the mean squared difference between predicted probabilities and actual outcomes, where lower scores indicate better calibrated models:

$$\mathrm{BS} = \frac{1}{N}\sum_{i=1}^{N}\left(p_i - y_i\right)^2$$

where p_i represents the predicted probability for patient i, y_i represents the actual outcome (0 or 1), and N is the number of patients.
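The Brier score defined above is straightforward to compute directly. The following short sketch uses hypothetical probabilities and outcomes to show the calculation.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Hypothetical predictions for four patients.
probs = [0.9, 0.2, 0.6, 0.1]
outcomes = [1, 0, 1, 0]
score = brier_score(probs, outcomes)  # (0.01 + 0.04 + 0.16 + 0.01) / 4
```

A model that always predicts 0.5 scores 0.25, which gives a useful reference point when reading reported values.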

Calibration metrics were derived using logistic regression of observed outcomes on predicted probabilities. The calibration intercept measures systematic over- or under-estimation (ideal value = 0), while the calibration slope indicates the degree of over-confidence or under-confidence (ideal value = 1).
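One common way to obtain these two quantities is to fit a logistic model of the observed outcome on the logit of the predicted probability and read off the fitted intercept and slope. The sketch below does this with Newton-Raphson in plain Python. Using the logit of the prediction as the covariate and estimating both parameters jointly are assumptions about the convention, and the function names and toy data are hypothetical.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of p, with clipping to avoid infinities at 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def calibration_intercept_slope(probs, outcomes, n_iter=50):
    """Fit outcome ~ a + b * logit(prob) by maximum likelihood; return (a, b)."""
    xs = [logit(p) for p in probs]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = mu * (1.0 - mu)
            g0 += y - mu          # gradient w.r.t. intercept
            g1 += (y - mu) * x    # gradient w.r.t. slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b

# Hypothetical over-confident model: predicts 0.1 or 0.9, but the observed
# event rates in those two groups are 0.25 and 0.75.
probs = [0.1] * 4 + [0.9] * 4
outcomes = [1, 0, 0, 0, 1, 1, 1, 0]
intercept, slope = calibration_intercept_slope(probs, outcomes)
```

In this toy case the fitted slope is 0.5: the predictions are twice as extreme on the log-odds scale as the data support, while the intercept of 0 indicates no systematic shift.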

Statistical analysis

Continuous variables are presented as mean ± standard deviation or median (interquartile range, IQR) based on normality testing (Shapiro-Wilk test). Categorical variables are presented as numbers and percentages. Differences between groups were assessed using Student’s t-test or Mann-Whitney U test for continuous variables and chi-square or Fisher’s exact test for categorical variables. Model performance metrics are presented with 95% CIs. Statistical significance was set at p < 0.05. All statistical analyses were performed using Python 3.12 with scikit-learn 1.5.1.

Results

Clinical and imaging characteristics of data sets

Table 1 shows the comparable distribution of clinical and imaging characteristics across all datasets. Of the 158 patients included in this study, 94 were assigned to the training set and 33 and 31 were allocated to the validation and test sets, respectively. The median age was 66.2 years (IQR: 60–74), and 61.7% of the patients were male. The six-month recurrence rate was 27.7% (26/94) in the training set with comparable distributions observed across the validation (16/33, 48.5%) and test (11/31, 35.5%) sets.

Table 1.

Baseline clinical, radiologic, and laboratory characteristics of study population.

Characteristic	Training Set (n = 94)	VAE-Augmented Set (n = 94)	Validation Set (n = 33)	Test Set (n = 31)	p-value
Age (years)	66.2 ± 10.3 [95% CI: 62.98–66.91]	66.2 ± 10.3 [95% CI: 62.98–66.91]	65.1 ± 8.8 [95% CI: 64.23–70.96]	64.6 ± 9.2 [95% CI: 61.59–69.28]	0.78
Female, n (%)	36 (38.3)	36 (38.3)	16 (48.5)	12 (38.7)	0.68
Preoperative size (cm)^a	2.5 (2.0–2.9)	2.5 (2.1–2.9)	2.5 (1.7–3.2)	2.4 (2.0–3.0)	0.91
Lymph node enlargement, n (%)					0.89
Normal	42 (44.7)	42 (44.7)	14 (42.4)	14 (45.2)
Borderline	35 (37.2)	35 (37.2)	13 (39.4)	12 (38.7)
Enlarged	17 (18.1)	17 (18.1)	6 (18.2)	5 (16.1)
PET LN positive, n (%)	25 (26.6)	25 (26.6)	9 (27.3)	8 (25.8)	0.92
Vascular involvement, n (%)					0.88
None	54 (57.4)	54 (57.4)	19 (57.6)	18 (58.1)
Vein	28 (29.8)	28 (29.8)	10 (30.3)	9 (29.0)
Artery	12 (12.8)	12 (12.8)	4 (12.1)	4 (12.9)
Tumor location, n (%)					0.90
Head	45 (47.9)	45 (47.9)	16 (48.5)	15 (48.4)
Body	31 (33.0)	31 (33.0)	11 (33.3)	10 (32.3)
Tail	18 (19.1)	18 (19.1)	6 (18.2)	6 (19.3)
Vessel contact, n (%)	21 (22.3)	21 (22.3)	7 (21.2)	7 (22.6)	0.91
Tumor markers
CEA (ng/mL)^a	2.1 (1.4–3.0)	2.1 (1.4–3.0)	2.1 (2.0–3.2)	2.1 (1.3–3.1)	0.87
CA19-9 (U/mL)^a	90.0 (34.3–316.8)	88.7 (34.7–319.9)	96.7 (42.1–440.6)	144.4 (70.4–417.8)	0.72
PET/CT Parameters
SUVmax^a	4.9 (3.2–7.8)	4.9 (3.2–7.7)	5.0 (3.7–6.9)	4.9 (3.5–7.2)	0.98
SUVavg2.5^a	2.8 (2.3–3.3)	2.8 (2.3–3.3)	2.8 (2.3–3.4)	2.7 (2.2–3.3)	0.95
MTV2.5 (cm³)^a	6.6 (3.1–9.1)	6.6 (3.1–9.2)	8.2 (3.4–15.5)	6.1 (3.8–9.4)	0.51
TLG2.5^a	19.3 (7.7–34.8)	19.3 (7.7–34.5)	22.3 (9.1–63.2)	18.8 (10.5–37.6)	0.85
SUVavg_edge	3.1 ± 1.3 [95% CI: 2.89–3.48]	3.1 ± 1.4 [95% CI: 2.87–3.48]	3.35 ± 1.42 [95% CI: 2.86–3.84]	2.7 (2.0–3.9) [95% CI: 2.49–3.37]	0.68
MTV_edge (cm³)^a	8.5 (4.3–14.5)	8.7 (4.3–14.6)	8.1 (4.2–18.1)	8.3 (4.0–13.4)	0.94
TLG_edge^a	23.5 (15.4–38.3)	23.7 (15.4–37.9)	23.7 (10.7–60.8)	20.7 (10.5–35.6)	0.85
MRI Parameters
ADC1 (× 10⁻³ mm²/s)	1.17 ± 0.25 [95% CI: 1.10–1.21]	1.17 ± 0.25 [95% CI: 1.10–1.21]	1.13 ± 0.20 [95% CI: 1.07–1.20]	1.14 ± 0.24 [95% CI: 1.11–1.26]	0.84
ADC2 (× 10⁻³ mm²/s)	1.26 ± 0.24 [95% CI: 1.19–1.30]	1.26 ± 0.24 [95% CI: 1.19–1.30]	1.22 ± 0.19 [95% CI: 1.15–1.29]	1.20 ± 0.22 [95% CI: 1.19–1.34]	0.56
ADC1/muscle ratio	0.76 ± 0.15 [95% CI: 0.71–0.78]	0.76 ± 0.15 [95% CI: 0.71–0.78]	0.75 ± 0.17 [95% CI: 0.70–0.80]	0.75 ± 0.18 [95% CI: 0.73–0.83]	0.92
ADC2/muscle ratio	0.81 ± 0.15 [95% CI: 0.77–0.83]	0.81 ± 0.15 [95% CI: 0.77–0.83]	0.81 ± 0.17 [95% CI: 0.76–0.85]	0.79 ± 0.17 [95% CI: 0.78–0.88]	0.85
ADC1/spleen ratio	1.47 ± 0.36 [95% CI: 1.38–1.55]	1.47 ± 0.36 [95% CI: 1.38–1.55]	1.46 ± 0.46 [95% CI: 1.35–1.56]	1.49 ± 0.35 [95% CI: 1.37–1.62]	0.89
ADC2/spleen ratio	1.58 ± 0.36 [95% CI: 1.49–1.66]	1.58 ± 0.36 [95% CI: 1.49–1.66]	1.59 ± 0.47 [95% CI: 1.47–1.68]	1.58 ± 0.37 [95% CI: 1.47–1.72]	0.91

VAE, variational autoencoder; CEA, carcinoembryonic antigen; CA19-9, cancer antigen 19-9; CT, computed tomography; PET, positron emission tomography; LN, lymph node; SUV, standardized uptake value; avg, average; MTV, metabolic tumor volume; TLG, total lesion glycolysis; ADC, apparent diffusion coefficient; MRI, magnetic resonance imaging.

^aValues are presented as median (interquartile range) for non-normally distributed variables.

p-values were calculated using the chi-square test for categorical variables and ANOVA or Kruskal-Wallis test for continuous variables according to normality.

The VAE-generated synthetic dataset (n = 94) showed statistical similarity to the original training dataset across all key variables. Preoperative tumor size showed a consistent distribution across all groups (median, 2.4–2.5 cm; p = 0.91). The tumor marker levels were not significantly different between the groups for either CEA (median, 2.1 ng/mL in all groups; p = 0.87) or CA19-9 (median range, 88.7–144.4 U/mL; p = 0.72), although CA19-9 showed notable variability, particularly in the test set.

The metabolic parameters derived from the PET/CT imaging showed notable consistency between the original and VAE-augmented datasets. SUVmax values were highly comparable (training: 4.9 [3.2–7.8] vs. VAE-augmented: 4.9 [3.2–7.7]; p = 0.98), as were MTV2.5 (6.6 [3.1–9.1] cm³ vs. 6.6 [3.1–9.2] cm³) and TLG2.5 (19.3 [7.7–34.8] vs. 19.3 [7.7–34.5]). This consistency was extended to edge-based measurements, with SUVavg_edge showing nearly identical distributions (3.1 ± 1.3 vs. 3.1 ± 1.4).

ADC values from MRI showed strong consistency between the original and synthetic data (ADC1: 1.17 ± 0.25 × 10⁻³ mm²/s in both groups; ADC2: 1.26 ± 0.24 × 10⁻³ mm²/s in both groups), with no significant differences across the study groups (p = 0.84 and p = 0.56, respectively).

Kolmogorov-Smirnov tests showed no significant differences between the original and synthetic data distributions for any of the features (all p > 0.05). Especially strong similarities were observed for the metabolic parameters (SUVmax: KS statistic = 0.085, p = 0.923; TLG2.5: KS statistic = 0.092, p = 0.891) and tumor markers (CA19-9: KS statistic = 0.089, p = 0.908) (Fig. 3).
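The two-sample KS statistic is the largest vertical gap between the empirical cumulative distribution functions of the two samples. The sketch below computes only that statistic on toy values; it does not compute the p-value, for which a library routine such as `scipy.stats.ks_2samp` is the usual choice. The software used for the KS tests in this study is not stated, and the function name and data here are hypothetical.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical feature values for "original" and "synthetic" samples.
original = [1.0, 2.0, 3.0, 4.0]
shifted = [3.0, 4.0, 5.0, 6.0]

d_same = ks_statistic(original, original)
d_shift = ks_statistic(original, shifted)
```

A statistic near 0, as reported for SUVmax, TLG2.5, and CA19-9, means the two empirical distributions nearly coincide; note that a marginal test of this kind does not by itself assess whether relationships between features are preserved.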

Fig. 3 — Density plots comparing distributions of clinical variables between original (blue) and synthetic (red) data. Probability density distributions of the following clinical variables are shown: age (years), carcinoembryonic antigen (CEA; ng/mL), cancer antigen 19-9 (CA19-9^†; U/mL), maximum standardized uptake value (SUVmax), metabolic tumor volume (MTV; cm³), and total lesion glycolysis (TLG). Synthetic data maintain the statistical properties of the original dataset, as evidenced by Kolmogorov-Smirnov (KS) test results; p > 0.05 for all variables. ^†log-transformed for visualization.

Performance metrics of different models with and without VAE augmentation

As shown in Fig. 4, all machine learning models performed better when trained using the VAE-augmented data than with the original dataset alone. Furthermore, performance enhancement through VAE augmentation was consistent across all models with average improvements of 6.0–6.5 percentage points in accuracy and 7.0–9.0 percentage points in sensitivity. Pronounced improvements were observed in the GBM and RF models, which showed identical optimal performance metrics (Table 2A). Both models showed an accuracy of 0.88, sensitivity of 0.87, and specificity of 0.89 with an area under the receiver operating characteristic curve (AUC-ROC) of 0.88 when using the VAE-augmented training data. Notably, this enhancement was consistently observed in the validation set, which indicates the strong generalizability of the models augmented using synthetic data.

Table 2A.

Model performance in validation set (n = 33).

Model	Training Data	Accuracy	Sensitivity	Specificity	AUC-ROC	F1-score
GBM	Original	0.82 (0.70–0.92)	0.80 (0.65–0.92)	0.83 (0.66–0.95)	0.82 (0.68–0.93)	0.80 (0.65–0.91)
	VAE-augmented	0.88 (0.76–0.96)	0.87 (0.72–0.97)	0.89 (0.74–0.97)	0.88 (0.76–0.96)	0.87 (0.73–0.96)
Random Forest	Original	0.82 (0.69–0.93)	0.82 (0.66–0.94)	0.83 (0.69–0.94)	0.82 (0.68–0.92)	0.82 (0.67–0.93)
	VAE-augmented	0.88 (0.75–0.96)	0.87 (0.71–0.97)	0.89 (0.74–0.97)	0.88 (0.75–0.96)	0.87 (0.73–0.96)
Logistic Regression	Original	0.79 (0.65–0.90)	0.82 (0.67–0.93)	0.78 (0.63–0.89)	0.79 (0.65–0.90)	0.77 (0.63–0.88)
	VAE-augmented	0.85 (0.71–0.94)	0.87 (0.72–0.97)	0.83 (0.68–0.94)	0.85 (0.72–0.94)	0.84 (0.70–0.94)
DNN	Original	0.79 (0.64–0.90)	0.73 (0.55–0.87)	0.83 (0.68–0.93)	0.78 (0.63–0.89)	0.76 (0.60–0.88)
	VAE-augmented	0.85 (0.71–0.94)	0.80 (0.63–0.92)	0.89 (0.74–0.97)	0.84 (0.70–0.93)	0.83 (0.68–0.93)

GBM, Gradient Boosting Machine; VAE, variational autoencoder; DNN, Deep Neural Network; AUC-ROC, area under the receiver operating characteristic curve.

Performance in external validation

As shown in Fig. 5, the GBM model trained with the VAE-augmented data performed better in the test set than the model trained on the original dataset. Accuracy increased from 0.81 to 0.87, sensitivity from 0.73 to 0.91, and AUC-ROC from 0.81 to 0.88, whereas specificity remained consistent at 0.85. The RF model exhibited comparable enhancements with the VAE-augmented training data. In the test set, accuracy improved from 0.84 to 0.87, sensitivity from 0.82 to 0.91, and AUC-ROC from 0.82 to 0.88 with specificity stable at 0.85. The LR model exhibited moderate improvements with VAE augmentation. In the test set, accuracy increased from 0.77 to 0.84 and sensitivity from 0.64 to 0.82, whereas specificity remained at 0.85. Additionally, the DNN model exhibited consistent improvement in the test set: accuracy improved from 0.77 to 0.84, sensitivity from 0.73 to 0.82, specificity from 0.80 to 0.85, and AUC-ROC from 0.76 to 0.83. These findings confirm the effectiveness of VAE-based synthetic data augmentation in improving the predictive performance of machine learning models across diverse metrics, particularly sensitivity, while maintaining high specificity (Table 2B).

Table 2B.

Model performance in test set (n = 31).

Model	Training Data	Accuracy	Sensitivity	Specificity	AUC-ROC	F1-score
GBM	Original	0.81 (0.64–0.91)	0.73 (0.40–0.89)	0.85 (0.65–0.95)	0.81 (0.55–0.97)	0.70 (0.37–0.87)
	VAE-augmented	0.87 (0.70–0.97)	0.91 (0.73-1.00)	0.85 (0.66–0.96)	0.88 (0.76–0.97)	0.78 (0.57–0.92)
Random Forest	Original	0.84 (0.66–0.95)	0.82 (0.63–0.95)	0.85 (0.66–0.96)	0.82 (0.69–0.93)	0.75 (0.55–0.89)
	VAE-augmented	0.87 (0.70–0.97)	0.91 (0.73-1.00)	0.85 (0.66–0.96)	0.88 (0.76–0.97)	0.78 (0.57–0.92)
Logistic Regression	Original	0.77 (0.59–0.90)	0.64 (0.43–0.82)	0.85 (0.66–0.96)	0.76 (0.62–0.88)	0.67 (0.47–0.83)
	VAE-augmented	0.84 (0.66–0.95)	0.82 (0.63–0.95)	0.85 (0.66–0.96)	0.83 (0.70–0.93)	0.75 (0.55–0.89)
DNN	Original	0.77 (0.59–0.90)	0.73 (0.54–0.89)	0.80 (0.61–0.92)	0.76 (0.62–0.88)	0.69 (0.49–0.84)
	VAE-augmented	0.84 (0.66–0.95)	0.82 (0.63–0.95)	0.85 (0.66–0.96)	0.83 (0.70–0.93)	0.75 (0.55–0.89)

GBM, Gradient Boosting Machine; VAE, variational autoencoder; DNN, Deep Neural Network; AUC-ROC, area under the receiver operating characteristic curve.
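To make the granularity of a 31-patient test set concrete, the sketch below recomputes accuracy, sensitivity, and specificity from a confusion matrix. The counts are not reported in the text; they are one hypothetical set chosen to be consistent with the 11 recurrences among 31 test patients and with the rounded values reported for the VAE-augmented GBM and RF models.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts (assumption): 11 recurrence and 20 non-recurrence patients.
metrics = classification_metrics(tp=10, fn=1, tn=17, fp=3)
rounded = {name: round(value, 2) for name, value in metrics.items()}

# Change in sensitivity if one more recurrence case were missed.
one_more_miss = classification_metrics(tp=9, fn=2, tn=17, fp=3)["sensitivity"]
```

With 11 positive cases, each additional correctly or incorrectly classified recurrence shifts sensitivity by about 9 percentage points, which helps explain the wide confidence intervals in Table 2B.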

Comprehensive calibration analysis revealed significant improvements in all calibration metrics (Supplementary Table S1 and Figure S5). Brier scores improved substantially across all models: GBM (12.1% improvement, 0.165→0.145), Random Forest (10.1% improvement, 0.158→0.142), Logistic Regression (12.4% improvement, 0.185→0.162), and DNN (10.1% improvement, 0.178→0.160). The average Brier score improvement was 11.4%, indicating significantly better calibrated probability estimates. Systematic bias, as measured by calibration intercept, was reduced across all models, with values moving closer to the ideal of zero. Original models showed intercepts ranging from 0.12 to 0.22, which improved to 0.08–0.14 with VAE augmentation. Calibration slopes also improved toward the ideal value of 1.0, with GBM achieving near-perfect calibration (slope = 1.05) and all other models showing substantial improvements.

Performance of SMOTE-Augmented Models

To evaluate the effectiveness of VAE-based data augmentation, we compared its performance against SMOTE across all model architectures (Supplementary Table S2).

GBM with VAE achieved substantial improvements in AUC-ROC (0.88 vs. 0.81, +8.6%) and sensitivity (0.91 vs. 0.73) compared to both original data and SMOTE augmentation, which showed minimal improvement. Similarly, Random Forest with VAE demonstrated better performance compared to SMOTE (AUC: 0.88 vs. 0.82).

By contrast, Logistic Regression favored SMOTE, which achieved a higher AUC than VAE (0.98 vs. 0.85). This pattern aligns with theoretical expectations, as SMOTE’s linear interpolation approach matches the linear decision boundaries of logistic regression models.

Feature importance analysis

RF feature importance analysis showed that metabolic parameters obtained from PET/CT imaging were the strongest predictors of early recurrence. SUVmax was the most important predictor (importance score: 0.182, 95% CI: 0.165–0.199), followed among the metabolic parameters by TLG2.5 (0.145, 95% CI: 0.130–0.160), MTV2.5 (0.128, 95% CI: 0.114–0.142), SUVavg_edge (0.092, 95% CI: 0.080–0.104), and TLG_edge (0.045, 95% CI: 0.037–0.053). The collective contribution of the PET/CT-derived metabolic parameters accounted for 59.2% of the predictive power of the model (total importance score: 0.592), which substantially outweighs the contributions of conventional imaging and clinical parameters. Tumor markers constituted the second most important category, contributing 24.3% of the overall predictive capability; CA19-9 and CEA showed importance scores of 0.158 and 0.085, respectively.

MRI-derived ADC values showed limited usefulness for predicting pancreatic cancer recurrence. In this study, ADC values, including those derived from ROIs placed on the pancreas, did not significantly correlate with the recurrence of pancreatic cancer.

Conventional clinicopathological parameters such as preoperative tumor size showed relatively low importance (0.075, 95% CI: 0.065–0.085), whereas patient age exhibited minimal predictive value (0.017, 95% CI: 0.012–0.022). The complete feature importance rankings are presented in Table 3 and Fig. 6.

Table 3.

Feature importance of final random forest model.

Feature	Importance Score	95% CI
SUVmax	0.182	0.165–0.199
CA19-9	0.158	0.142–0.174
TLG2.5	0.145	0.130–0.160
MTV2.5	0.128	0.114–0.142
SUVavg_edge	0.092	0.080–0.104
CEA	0.085	0.074–0.096
Preop size	0.075	0.065–0.085
TLG_edge	0.045	0.037–0.053
ADC1	0.038	0.031–0.045
ADC2	0.035	0.028–0.042
Age	0.017	0.012–0.022

SUV, standardized uptake value; CA19-9, cancer antigen 19-9; TLG, total lesion glycolysis; MTV, metabolic tumor volume; avg, average; CEA, carcinoembryonic antigen; ADC, apparent diffusion coefficient.
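Because the importance scores are normalized to sum to one, category-level contributions are simple sums over the rows of Table 3. The short sketch below reproduces that arithmetic from the tabulated values; the grouping of features into categories follows the text, and the variable names are illustrative.

```python
# Importance scores copied from Table 3.
importance = {
    "SUVmax": 0.182, "CA19-9": 0.158, "TLG2.5": 0.145, "MTV2.5": 0.128,
    "SUVavg_edge": 0.092, "CEA": 0.085, "Preop size": 0.075,
    "TLG_edge": 0.045, "ADC1": 0.038, "ADC2": 0.035, "Age": 0.017,
}

groups = {
    "PET/CT metabolic": ["SUVmax", "TLG2.5", "MTV2.5", "SUVavg_edge", "TLG_edge"],
    "Tumor markers": ["CA19-9", "CEA"],
    "MRI ADC": ["ADC1", "ADC2"],
}

group_totals = {
    name: sum(importance[feature] for feature in features)
    for name, features in groups.items()
}
total_importance = sum(importance.values())
```

The PET/CT group sums to 0.592 and the tumor markers to 0.243, and the eleven scores together sum to 1.000, consistent with the sum-to-one normalization described in the Methods.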

Fig. 6 — Feature importance of final prediction model using VAE-augmented dataset.

Discussion

In this study, we have demonstrated that VAE-based synthetic data generation significantly improves the performance of machine-learning models in predicting early recurrence after pancreatic cancer surgery. As recent attempts are being made to expand the scope of neoadjuvant chemotherapy to treat resectable pancreatic cancer^16,17, these findings present several important implications both for current clinical practice and the course of future research.

Including an adequate number of cases is critical for predictive modeling using artificial intelligence (AI). Hence, predicting various aspects of low-incidence diseases or rare events requires extensive datasets that often require considerable time and financial investment to develop. Therefore, AI-related research is inherently resource-intensive.

In contrast, VAE-based data augmentation effectively enhances the sample size in clinical research and offers significant savings in terms of time and cost¹⁸. In medical imaging, VAE-generated synthetic data help create diverse and realistic images, which are particularly valuable when real-world datasets are scarce.

The VAE-generated synthetic data enhance the performance of the model through several key mechanisms. A major challenge associated with the original dataset was the significant class imbalance, as only 27.7% of patients experienced recurrence within six months. VAE-based augmentation effectively addressed this issue by generating synthetic data that balanced the distribution. The augmented models showed improved sensitivity, which reduces the chance of missing recurrence cases and enables early intervention. Importantly, this improvement did not compromise specificity. Therefore, the reliability of the model in identifying true negatives has been retained.

The superior performance of VAE in tree-based models reflects the fundamental alignment between VAE’s capacity to capture complex, non-linear feature interactions and the ability of ensemble methods to leverage these sophisticated patterns. Pancreatic cancer recurrence involves intricate biological relationships characterized by multi-factorial interactions between clinical, pathological, and molecular markers that are inherently non-linear. VAE’s encoder-decoder architecture learns the underlying probability distribution of these complex relationships, generating synthetic samples that preserve biological relevance and enhance model performance in algorithms capable of modeling non-linear decision boundaries.
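For reference, the standard VAE training objective (the evidence lower bound described in the introductory text cited as ref. 10) can be written as below. The symbols are the conventional ones and are assumptions here, since this section does not define its own notation: x is a patient feature vector, z the latent code, q_φ the encoder, p_θ the decoder, and p(z) a standard normal prior.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]
  - D_{\mathrm{KL}}\left(q_\phi(z \mid x) \,\|\, p(z)\right),
\qquad
z = \mu_\phi(x) + \sigma_\phi(x) \odot \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I)
```

The first term rewards faithful reconstruction of the input, and the second keeps the latent distribution close to the prior, so that decoding draws from p(z) yields new samples resembling the training data.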

In contrast, the superior performance of SMOTE with Logistic Regression aligns with theoretical expectations, as SMOTE’s linear interpolation approach optimally matches the linear decision boundaries of logistic regression models. These findings establish GBM with VAE augmentation as the optimal combination for pancreatic cancer recurrence prediction, providing enhanced sensitivity (0.91). The architecture-dependent effectiveness demonstrates that optimal augmentation strategies must consider both the characteristics of the underlying biological data and the capabilities of the chosen modeling approach, with VAE particularly suited for complex medical prediction tasks where non-linear relationships predominate.
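The linear interpolation that characterizes SMOTE can be sketched as follows. This is a simplified illustration of the interpolation step only: the nearest-neighbor search among minority-class samples is omitted, and the function name and feature values are hypothetical rather than taken from the study.

```python
import random


def smote_like_sample(x, neighbor, rng):
    """Create one synthetic point on the line segment between x and a minority-class neighbor."""
    lam = rng.random()  # interpolation weight drawn uniformly from [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)]


rng = random.Random(0)   # fixed seed for reproducibility
x = [4.0, 10.0]          # hypothetical minority-class sample (two features)
neighbor = [6.0, 20.0]   # hypothetical neighboring minority-class sample
synthetic = smote_like_sample(x, neighbor, rng)
print(synthetic)
```

Because every synthetic point lies on a straight segment between two existing samples, this kind of augmentation cannot produce feature combinations outside those segments, which is consistent with its good fit to linear decision boundaries.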

For this reason, the VAE-augmented models showed improved predictive performance with complex variables. The VAE was able to preserve complex nonlinear relationships among PET/CT-derived metabolic parameters such as SUVmax, TLG2.5, and MTV2.5. These parameters were consistently ranked as top predictors in feature importance analyses, which emphasizes their relevance in assessing tumor metabolic activity. By preserving these complex relationships, VAE-based augmentation helped the model capture the metabolic signals, which improved the predictive performance.

The augmented models also handled extreme values well, especially for tumor markers such as CA19-9, which exhibited a wide range (1.51 to 117,557.63). The VAE effectively captured these extreme value distributions and generated synthetic data that appropriately represented them. This improved the stability, robustness, and generalizability of the model and reduced the influence of outliers.

The enhanced performance of the GBM and RF models, as evidenced by an increase in AUC-ROC from 0.82 to 0.88, could be primarily attributed to their inherent ability to capture nonlinear relationships in the VAE-augmented feature space. These improvements were consistent across multiple metrics and external datasets, which indicates the strong generalizability of the findings.
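As a reminder of what the AUC-ROC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen recurrence case receives a higher predicted score than a randomly chosen non-recurrence case, with ties counted as one half. The function name, labels, and scores are hypothetical and for illustration only.

```python
def auc_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted recurrence probabilities (not study data).
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = auc_roc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

Here five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6. Unlike accuracy, this value does not depend on a decision threshold or on the class ratio.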

The most significant finding of this study was the predominant role of the metabolic imaging parameters in predicting early recurrence. The combination of PET/CT-derived features (SUVmax, TLG2.5, and MTV2.5) and CA19-9 accounted for 61.3% of the total predictive power and significantly outperformed conventional anatomical and clinical parameters. This observation, consistent with previous studies^19–22, suggests that tumor metabolism, rather than tumor size alone, may be a more reliable indicator of aggressive disease behavior and the risk of recurrence.

In contrast to the contribution of PET/CT parameters, MRI-driven factors exerted limited impact on predicting recurrence. Furthermore, the role reported for ADC in pancreatic cancer has been inconsistent, likely because of differences in imaging protocols, patient populations, and study methodologies^23–29. Several studies have shown that changes in the ADC values are significant predictors of treatment outcomes, such as progression-free and overall survival, in patients with unresectable pancreatic cancer undergoing chemotherapy^30,31. These findings suggest that longitudinal monitoring of ADC values during treatment is a more nuanced and reliable approach for assessing therapeutic responses than static single time-point measurements. However, in this study, ADC was found to have a minimal impact on predicting early recurrence of pancreatic cancer (importance score: 0.038; 95% CI: 0.031–0.045). Specifically, preoperative ADC measurements in resectable pancreatic cancer lacked significant correlation with histological characteristics or overall survival, which highlights their limited clinical relevance in this context.

Despite the significant enhancements achieved in the predictive performance of machine learning models through VAE-based data augmentation, several limitations have been identified, warranting future studies. First, the original dataset comprised a relatively small sample size from a single institution, which may limit the generalizability of the findings and increase the risk of institutional bias. Although validation demonstrated promising results, the predictive power of the model remains constrained by the limited number of cases. Larger, multi-institutional studies are required to validate the model across diverse clinical settings and to ensure robust, real-world applicability.

Another limitation of this study was the exclusion of emerging biomarkers such as circulating tumor DNA (ctDNA) and sequencing data. These biomarkers have recently gained significant attention owing to their potential to provide accurate and individualized predictions of cancer recurrence³². Several ongoing studies on biomarkers to predict pancreatic cancer recurrence have shown promising results with regard to their predictive accuracy^33–35. Thus, the absence of these advanced markers in this study could be considered a limitation. In fact, incorporating these additional biomarkers could significantly improve the predictive accuracy of the model. However, these tests typically cost thousands of dollars and require several weeks for the delivery of the results; hence, these biomarkers are not particularly suitable for use before surgery in clinical settings. Therefore, this study has focused on developing a model that leverages a limited set of easily accessible clinical indicators in combination with synthetic data to overcome these challenges. Thus, this approach demonstrates the possibility of creating practical and effective models for real-world clinical applications.

Methodologically, various other data generation techniques are available apart from VAE. One such class of frameworks is the Generative Adversarial Networks (GANs), which are continuing to evolve³⁶. Modern AI techniques in image analysis commonly use GANs as they are particularly effective for creating diverse synthetic datasets. However, GANs have exhibited several major limitations when used to generate medical data. These issues include mode collapse, which reduces the variety of samples; instability during training; and the potential to amplify biases inherent in the original data. Given the critical need for accuracy and reliability in the medical field, healthcare professionals increasingly rely on ‘digital twins,’ datasets that closely replicate original medical records. To address the challenges associated with synthetic data generation, advanced architectures such as VAE-GAN hybrids and conditional GANs have been introduced. These approaches aim to preserve the original data characteristics more effectively while producing high-quality synthetic datasets. For this study, the VAE approach was identified as the most appropriate, as it enabled stable data synthesis while preserving the original data distribution—an essential feature for analyzing risk factors. Although alternative methods, such as ctGAN (designed specifically for tabular data), were evaluated, increasing the dataset size did not result in significant performance improvements. Therefore, the VAE approach was ultimately chosen for this study.

Another limitation of our study is the model’s heavy reliance on PET/CT parameters for performance. When PET/CT variables (SUV, MTV, TLG) were excluded from the model, the predictive capability dropped significantly (AUC from 0.81 to 0.59, F1-score from 0.70 to 0.33), indicating that centers without routine PET/CT imaging may not achieve clinically acceptable prediction accuracy (Supplementary Table S3). Although synthetic data augmentation provided some improvement in the non-PET model (AUC: 0.59 to 0.67, F1-score: 0.33 to 0.48), the performance remained suboptimal for reliable clinical application. This limitation restricts the generalizability of our findings to real-world practice where PET/CT is not routinely available. Nevertheless, in Korea, PET/CT scans are commonly performed as part of the standard metastasis workup for patients with pancreatic cancer, making our approach clinically applicable in most Korean medical centers.

Nevertheless, the findings of this study are valuable because they show, using data from a single institution, that integrating VAE-based synthetic data effectively enhances the predictive performance of machine-learning models. In particular, combining metabolic imaging parameters with synthetic data for early recurrence prediction achieved higher accuracy than the use of traditional anatomical and clinical variables. This approach has the potential to stably synthesize data and develop practically applicable models. These findings may be further validated and strengthened through multi-institutional studies and the integration of advanced biomarkers.

Additionally, privacy and ethical considerations are critical when applying synthetic data generated using VAE. Although the privacy metrics obtained with the approach described in this study were within acceptable ranges, future studies should focus on enhancing privacy-preserving techniques and ensuring compliance with strict privacy standards. Developing ethical guidelines and regulatory frameworks for using synthetic data in clinical settings is crucial to gain broader acceptance for this approach.

Conclusions

In conclusion, this study has shown that VAE-based synthetic data could potentially improve machine learning models for early recurrence prediction in pancreatic cancer. By addressing data limitations and class imbalance while preserving critical patterns, VAE-based methods can significantly enhance accuracy and sensitivity. Thus, integrating metabolic imaging parameters with synthetic data is valuable for developing practical high-performance models. Furthermore, validation through multi-institutional studies and the incorporation of advanced biomarkers is essential to improve clinical applicability.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (782.1 KB, docx)

Abbreviations

ADC: Apparent diffusion coefficient
AI: Artificial intelligence
AUC-ROC: Area under the receiver operating characteristic curve
ctDNA: Circulating tumor DNA
DNN: Deep Neural Networks
DW: Diffusion-weighted
GANs: Generative Adversarial Networks
GBM: Gradient Boosting Machine
IQR: Interquartile range
LNE: Lymph node enlargement
LR: Logistic Regression
MTV2.5: Metabolic tumor volume
MDI: Mean decrease in impurity
MICE: Multiple imputations with chained equations
RF: Random Forest
ROI: Region of interest
SMV: Superior mesenteric vein
SUVmax: Maximum standardized uptake value
TLG2.5: Total lesion glycolysis
VAE: Variational autoencoder

Author contributions

H.J., J-M.L., and H.K. conceived and designed the study and wrote the main manuscript. H.J., HS.K., H.C., SJ.Y., JH.M., and SH.H. collected and analyzed the data. J-M.L. visualized the results. All authors interpreted the data, reviewed the manuscript, and approved the submitted version.

Funding

This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant No.: HI23C1591).

Data availability

The data that support the findings of this study are available on request from the corresponding author, H Kim. The data are not publicly available due to privacy and ethical restrictions.

Declarations

Competing interests

The authors declare no competing interests.

Ethics approval

This study was approved by the institutional review board (SMC 2023-05-108). The Institutional Review Board of Samsung Medical Center waived the need for written informed consent from the participants since the research involved no more than minimal risk to subjects, and there was no reason to assume rejection of agreement.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

HyeJeong Jeong and Jeong-Moo Lee contributed equally to this work.

Contributor Information

Ji Hye Min, Email: minjh1123@gmail.com.

Seung Hyup Hyun, Email: shnm.hyun@samsung.com.

Hongbeom Kim, Email: surgeonkhb@gmail.com.

References

1. Gooden, H., Tiller, K., Mumford, J. & White, K. Integrated psychosocial and supportive care needed for patients with pancreatic cancer. Cancer Forum 40, 66–69. 10.3316/informit.998824554245881 (2016).
2. Kolbeinsson, H. et al. Recurrence patterns and postrecurrence survival after curative intent resection for pancreatic ductal adenocarcinoma. Surgery 169, 649–654. 10.1016/j.surg.2020.06.042 (2021).
3. Macedo, F. I. et al. Survival outcomes associated with clinical and pathological response following neoadjuvant FOLFIRINOX or Gemcitabine/Nab-Paclitaxel chemotherapy in resected pancreatic cancer. Ann. Surg. 270, 400–413. 10.1097/SLA.0000000000003468 (2019).
4. Kim, R. Y. et al. Total neoadjuvant therapy for operable pancreatic cancer. Ann. Surg. Oncol. 28, 2246–2256. 10.1245/s10434-020-09149-3 (2021).
5. McGuigan, A. et al. Pancreatic cancer: A review of clinical diagnosis, epidemiology, treatment and outcomes. World J. Gastroenterol. 24, 4846–4861. 10.3748/wjg.v24.i43.4846 (2018).
6. Mizrahi, J. D., Surana, R., Valle, J. W. & Shroff, R. T. Pancreatic cancer. Lancet 395, 2008–2020. 10.1016/S0140-6736(20)30974-0 (2020).
7. Wang, Z. et al. Improving semiconductor device modeling for electronic design automation by machine learning techniques. IEEE Trans. Electron. Devices 71, 263–271. 10.1109/TED.2023.3307051 (2023).
8. Chen, M., Hao, Y., Hwang, K., Wang, L. & Wang, L. Disease prediction by machine learning over big data from healthcare communities. IEEE Access 5, 8869–8879 (2017).
9. Hung, A. J., Chen, J. & Gill, I. S. Automated performance metrics and machine learning algorithms to measure surgeon performance and anticipate clinical outcomes in robotic surgery. JAMA Surg. 153, 770–771. 10.1001/jamasurg.2018.1512 (2018).
10. Kingma, D. P. & Welling, M. An introduction to variational autoencoders. Found. Trends® Mach. Learn. 12, 307–392. 10.1561/2200000056 (2019).
11. Rhu, J. et al. Maximum standardized uptake value on 18F-fluorodeoxyglucose positron emission tomography/computed tomography improves outcome prediction in retroperitoneal liposarcoma. Sci. Rep. 9, 6605 (2019).
12. Lee, J. H., Song, K. D., Cha, D. I. & Hyun, S. H. New intra-abdominal mass after operation for colorectal cancer: desmoid tumor versus peritoneal seeding. Abdom. Radiol. 43, 2923–2927 (2018).
13. Min, J. H. et al. Hepatic neuroendocrine tumour: apparent diffusion coefficient as a potential marker of prognosis associated with tumour grade and overall survival. Eur. Radiol. 28, 2561–2571 (2018).
14. Shamout, F., Zhu, T. & Clifton, D. A. Machine learning for clinical outcome prediction. IEEE Rev. Biomed. Eng. 14, 116–126. 10.1109/RBME.2020.3007816 (2021).
15. Way, G. P., Zietz, M., Rubinetti, V., Himmelstein, D. S. & Greene, C. S. Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations. Genome Biol. 21, 109. 10.1186/s13059-020-02021-3 (2020).
16. Oba, A. et al. New criteria of resectability for pancreatic cancer: A position paper by the Japanese society of Hepato-Biliary-Pancreatic surgery (JSHBPS). J. Hepatobiliary Pancreat. Sci. 29, 725–731. 10.1002/jhbp.1049 (2022).
17. Ghaneh, P. et al. Immediate surgery compared with short-course neoadjuvant gemcitabine plus capecitabine, FOLFIRINOX, or chemoradiotherapy in patients with borderline resectable pancreatic cancer (ESPAC5): a four-arm, multicentre, randomised, phase 2 trial. Lancet Gastroenterol. Hepatol. 8, 157–168. 10.1016/S2468-1253(22)00348-X (2023).
18. Papadopoulos, D. & Karalis, V. D. Variational autoencoders for data augmentation in clinical studies. Appl. Sci. 13, 8793. 10.3390/app13158793 (2023).
19. Moon, D. et al. Preoperative carbohydrate antigen 19-9 and standard uptake value of positron emission tomography-computed tomography as prognostic markers in patients with pancreatic ductal adenocarcinoma. J. Hepatobiliary Pancreat. Sci. 29, 1133–1141. 10.1002/jhbp.845 (2022).
20. Panda, A. et al. Borderline resectable and locally advanced pancreatic cancer: FDG PET/MRI and CT tumor metrics for assessment of pathologic response to neoadjuvant therapy and prediction of survival. AJR Am. J. Roentgenol. 217, 730–740. 10.2214/AJR.20.24567 (2021).
21. Lee, W. et al. Metabolic activity by FDG-PET/CT after neoadjuvant chemotherapy in borderline resectable and locally advanced pancreatic cancer and association with survival. Br. J. Surg. 109, 61–70. 10.1093/bjs/znab229 (2021).
22. Ushida, Y. et al. High CA19-9 level in resectable pancreatic cancer is a potential indication of neoadjuvant treatment. Pancreatology 21, 130–137. 10.1016/j.pan.2020.11.026 (2021).
23. Xie, P., Liu, K., Peng, W. & Zhou, Z. The correlation between Diffusion-Weighted imaging at 3.0-T magnetic resonance imaging and histopathology for pancreatic ductal adenocarcinoma. J. Comput. Assist. Tomogr. 39, 697–701. 10.1097/RCT.0000000000000274 (2015).
24. Ma, W. et al. Apparent diffusion coefficient and dynamic Contrast-Enhanced magnetic resonance imaging in pancreatic cancer: characteristics and correlation with histopathologic parameters. J. Comput. Assist. Tomogr. 40, 709–716. 10.1097/RCT.0000000000000434 (2016).
25. Hayano, K. et al. Correlation of apparent diffusion coefficient measured by diffusion-weighted MRI and clinicopathologic features in pancreatic cancer patients. J. Hepatobiliary Pancreat. Sci. 20, 243–248. 10.1007/s00534-011-0491-5 (2013).
26. Zaboriene, I., Zviniene, K., Lukosevicius, S., Ignatavicius, P. & Barauskas, G. Dynamic perfusion computed tomography and apparent diffusion coefficient as potential markers for poorly differentiated pancreatic adenocarcinoma. Dig. Surg. 38, 128–135. 10.1159/000511973 (2021).
27. Kitajima, K. et al. Correlation of the SUVmax of FDG-PET and ADC values of diffusion-weighted MR imaging with pathologic prognostic factors in breast carcinoma. Eur. J. Radiol. 85, 943–949. 10.1016/j.ejrad.2016.02.015 (2016).
28. Aydin, H. et al. Is there any relationship between ADC values of diffusion-weighted imaging and the histopathological prognostic factors of invasive ductal carcinoma? Br. J. Radiol. 91, 20170705. 10.1259/bjr.20170705 (2018).
29. Jackson, A. et al. MRI apparent diffusion coefficient (ADC) as a biomarker of tumour response: Imaging-Pathology correlation in patients with hepatic metastases from colorectal cancer (EORTC 1423). Cancers (Basel) 15, 3580. 10.3390/cancers15143580 (2023).
30. Nishiofuku, H. et al. Increased tumour ADC value during chemotherapy predicts improved survival in unresectable pancreatic cancer. Eur. Radiol. 26, 1835–1842. 10.1007/s00330-015-3999-2 (2016).
31. Niwa, T. et al. Advanced pancreatic cancer: the use of the apparent diffusion coefficient to predict response to chemotherapy. Br. J. Radiol. 82, 28–34. 10.1259/bjr/43911400 (2009).
32. Qiu, B. et al. Dynamic recurrence risk and adjuvant chemotherapy benefit prediction by ctDNA in resected NSCLC. Nat. Commun. 12, 6770. 10.1038/s41467-021-27022-z (2021).
33. Guan, S. et al. Circulating tumor DNA in prediction of prognosis and response to Nab-paclitaxel based First-line chemotherapy in metastatic pancreatic cancer. Pancreas 48, 1435 (2019).
34. Pek, M. et al. Development and evaluation of a multi-cancer screening (MCS) test for cancers common in Asia. J. Clin. Oncol. 41. 10.1200/JCO.2023.41.16_suppl.3052 (2023).
35. Osayi, S. N., Bloomston, M., Schmidt, C. M., Ellison, E. C. & Muscarella, P. Biomarkers as predictors of recurrence following curative resection for pancreatic ductal adenocarcinoma: a review. Biomed. Res. Int. 2014, 468959. 10.1155/2014/468959 (2014).
36. Goodfellow, I. J. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).


