The British Journal of Surgery. 2023 Jul 4;110(12):1623–1627. doi: 10.1093/bjs/znad201

Artificial intelligence-based models to assess the risk of malignancy on radiological imaging in patients with intraductal papillary mucinous neoplasm of the pancreas: scoping review

Alberto Balduzzi, Boris V Janssen, Matteo De Pastena, Tommaso Pollini, Giovanni Marchegiani, Henk Marquering, Jaap Stoker, Inez Verpalen, Claudio Bassi, Marc G Besselink, Roberto Salvia; for the Pancreatobiliary and Hepatic Artificial Intelligence Research (PHAIR) consortium
PMCID: PMC10638536  PMID: 37402951

Introduction

Increased detection of pancreatic cystic neoplasms has drawn the attention of the medical community1,2. Among these, intraductal papillary mucinous neoplasms (IPMNs) represent a serious challenge for clinicians because of their (low) premalignant potential. Despite extensive efforts, the treatment of IPMN remains controversial, as reflected by differences among the three current major guidelines1,3,4.

Most patients diagnosed with IPMN will be kept under surveillance, aimed at monitoring progression of the cyst, which may require surgical resection in highly selected patients. Still, the risk of clinicians missing IPMN progression to malignancy is concerning5, with burdensome consequences for the patient. This concern must be balanced against the risk of complications after major pancreatic surgery. Therefore, patient selection is crucial both to avoid unnecessary surgery for benign lesions and to continue surveillance safely. Typically, diagnostic imaging plays a central role in guiding patient selection for, and the timing of, surgery. However, current imaging approaches fall short of supporting optimal decision-making.

Machine learning assessment of radiological imaging may improve the assessment of IPMNs and add to the decision-making for surgery. This scoping review provides an overview of the available evidence on this topic.

Methods

Literature search

The Joanna Briggs Institute and PRISMA Extension for Scoping Reviews criteria were used for this scoping review6,7. The search was performed in PubMed and Embase up to 16 December 2021. The search strategy is provided in Table S1. Two assessors worked independently on literature screening, evaluation of eligibility, and inclusion, with conflicts handled through discussion. Literature remaining after title and abstract screening was subjected to full-text analysis. Original studies on imaging-based machine learning models (Table S2) in IPMN that reported on model performance in terms of malignancy assessment were included.

Data extraction and analysis

Two reviewers extracted data independently. Disagreements were resolved by discussion; when the two reviewers could not reach agreement, a third independent assessor was involved. Data from the included studies were analysed descriptively. The primary outcome was the discriminatory ability of the models measured by the area under the curve (AUC), accuracy, C-index, and P value. A model performance of 0.75 or more was considered sufficient for studies reporting AUC, C-index, or accuracy.
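As an illustration of the primary outcome measure, the following is a minimal sketch of how an AUC can be estimated from model scores and then checked against the 0.75 threshold stated above. It uses the rank-based (Mann-Whitney) formulation; the function names and the toy labels and scores are hypothetical and are not taken from any included study.

```python
from itertools import product

SUFFICIENT_THRESHOLD = 0.75  # threshold for "sufficient" performance stated in the review


def auc_from_scores(labels, scores):
    """Estimate the AUC as the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(positives) * len(negatives))


def is_sufficient(metric_value, threshold=SUFFICIENT_THRESHOLD):
    """Return True when a reported AUC, C-index, or accuracy meets the threshold."""
    return metric_value >= threshold


# Hypothetical toy data: 1 = high-grade IPMN or PDAC, 0 = low-grade IPMN.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
auc = auc_from_scores(labels, scores)
print(f"AUC = {auc:.3f}, sufficient: {is_sufficient(auc)}")
```

In this toy example, seven of the nine positive-negative pairs are ranked correctly, so the estimate is 7/9.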

Specific attention was paid to determining whether the discriminative models were compared with the reference standard of clinical care based on guidelines.

Methodological assessment

The methodological quality of the included studies was assessed using the modified Radiomics Quality Score (mRQS)8. Within the mRQS (Table S3), radiomics approaches can reach a total of 36 points on 16 aspects, and deep learning approaches can reach a total of 32 points on 14 aspects. Two independent assessors performed the methodological assessment. Discrepancies were resolved by discussion.

Results

Literature search

The literature search yielded 49 studies, of which 33 were excluded based on title and abstract screening, and four more studies after full-text screening. Eventually, 12 studies9–20 fulfilled the eligibility criteria and were included in this scoping review (Fig. S1). Table 1 details the included studies and the data extracted.

Table 1.

Studies on imaging-based machine learning models assessing the malignant potential of intraductal papillary mucinous neoplasm

| Reference | Country | Imaging | IPMN type | Target variable | Total n | Machine learning component | Data type | Outcome(s) |
|---|---|---|---|---|---|---|---|---|
| Hanania et al.9 | USA | CT | BD and MD | Low-grade IPMN versus high-grade IPMN or PDAC | Total 52 (10-fold cross-validation) | PCA and LR | Radiomics | AUC 0.96 |
| Permuth et al.10 | USA | CT | BD, MD, and mixed | Low-grade IPMN versus high-grade IPMN or PDAC | Total 38 (10-fold cross-validation) | LR and PCA | Radiomics | AUC 0.93 |
| Attiyeh et al.11 | USA | CT | BD | Low-grade IPMN versus high-grade IPMN or PDAC | Total 103 (10-fold cross-validation) | RF | Radiomics | AUC 0.76 |
| Chakraborty et al.12 | USA | CT | BD | Low-grade IPMN versus high-grade IPMN or PDAC | Total 103 (10-fold cross-validation) | RF and SVM | Radiomics | AUC 0.77 |
| Kuwahara et al.19 | Japan | EUS | Undefined | Low-grade IPMN versus high-grade IPMN or PDAC | Total 50 (10-fold cross-validation) | NN | Image | AUC 0.98 |
| Corral et al.20 | USA | MRI | BD and MD | Low-grade IPMN versus high-grade IPMN or PDAC | Total 139 (10-fold cross-validation) | NN | Image | AUC 0.78 |
| Jeon et al.18 | South Korea | MRI | BD, MD, and mixed | Low-grade IPMN versus high-grade IPMN or PDAC | 248 (no validation) | LR | Radiomics | Entropy: OR 1.49–1.52; Compactness 2: OR 0.977–0.981 |
| Harrington et al.13 | USA | CT and EUS | BD and MD | Low-grade IPMN versus high-grade IPMN or PDAC | Training 103; testing 33 | RF | Radiomics | AUC 0.83 |
| Polk et al.14 | USA | CT | Undefined | Low-grade IPMN versus high-grade IPMN or PDAC | Total 51 (5-fold cross-validation) | LR | Radiomics | AUC 0.90 |
| Tobaly et al.17 | France | CT | BD, MD, and mixed | Low-grade IPMN versus high-grade IPMN or PDAC | Training 296; testing 112 | LASSO and LR | Radiomics | AUC 0.84 |
| Cui et al.16 | China | MRI | BD | Low-grade IPMN versus high-grade IPMN or PDAC | Training 107; testing 99 | LASSO and LR | Radiomics | AUC 0.81–0.82 |
| Cheng et al.15 | China | CT and MRI | Undefined | Low-grade IPMN versus high-grade IPMN or PDAC | Total 60 (10-fold cross-validation) | LR and SVM | Radiomics | MRI: AUC 0.879–0.940; CT: AUC 0.811–0.864 |

IPMN, intraductal papillary mucinous neoplasm; BD, branch duct; MD, main duct; PDAC, pancreatic ductal adenocarcinoma; PCA, principal component analysis; LR, logistic regression; AUC, area under the curve; EUS, endoscopic ultrasound imaging; RF, random forest; SVM, support vector machine; NN, neural network; LASSO, least absolute shrinkage and selection operator; OR, odds ratio.

Radiomics models

Of the 10 IPMN radiomics models, 9 were based on CT and 1 on MRI. All models aimed to distinguish low- or intermediate-grade dysplasia from high-grade dysplasia or IPMN with concomitant pancreatic ductal adenocarcinoma. Nine studies9–17 had AUCs ranging between 0.76 and 0.96; one study18 showed that two features were independent variables for malignant IPMN, with ORs ranging from 1.49 to 1.52 and from 0.977 to 0.981.
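The reported AUC range can be checked against the radiomics rows of Table 1. The short sketch below is illustrative only: it transcribes the AUC values from Table 1, keeps both ends where a study reported a range (for Cheng et al., the lowest and highest values across CT and MRI), and omits Jeon et al., which reported ORs rather than an AUC. The variable names are assumptions made for this example.

```python
# (study, lowest reported AUC, highest reported AUC), transcribed from Table 1.
# Jeon et al. is omitted because it reported odds ratios rather than an AUC.
RADIOMICS_AUCS = [
    ("Hanania et al.", 0.96, 0.96),
    ("Permuth et al.", 0.93, 0.93),
    ("Attiyeh et al.", 0.76, 0.76),
    ("Chakraborty et al.", 0.77, 0.77),
    ("Harrington et al.", 0.83, 0.83),
    ("Polk et al.", 0.90, 0.90),
    ("Tobaly et al.", 0.84, 0.84),
    ("Cui et al.", 0.81, 0.82),
    ("Cheng et al.", 0.811, 0.940),  # spans the CT and MRI models
]

SUFFICIENT_THRESHOLD = 0.75  # threshold stated in the Methods

lowest_auc = min(low for _, low, _ in RADIOMICS_AUCS)
highest_auc = max(high for _, _, high in RADIOMICS_AUCS)
all_sufficient = all(low >= SUFFICIENT_THRESHOLD for _, low, _ in RADIOMICS_AUCS)

print(f"Radiomics AUC range: {lowest_auc:.2f} to {highest_auc:.2f}")
print(f"All studies at or above {SUFFICIENT_THRESHOLD}: {all_sufficient}")
```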

Deep learning models

Of the two deep learning models, one was based on MRI and the other on endoscopic ultrasound imaging. The models aimed to distinguish low- or intermediate-grade dysplasia from high-grade dysplasia or IPMN with concomitant pancreatic ductal adenocarcinoma. These models reached AUCs of 0.78 (Corral et al.20) and 0.98 (Kuwahara et al.19).

Comparison with reference standard and added value

Two of the included studies compared the developed model with available guidelines. Corral et al.20 reported similar diagnostic performance for MRI-based deep learning (AUC 0.78) and for the American Gastroenterological Association4 and Fukuoka1 (AUC 0.77) guidelines. Conversely, the CT- and MRI-based radiomics models of Cheng et al.15 had superior discriminative performance (MRI: AUC 0.94; CT: AUC 0.86) to that of the clinical and imaging model based on the Fukuoka guidelines (AUC 0.77)1.

Methodological assessment

The median mRQS score for the radiomics studies was 11.5 of 36, ranging from 4 to 17. The deep learning studies had a median mRQS score of 9 of 32, ranging from 8 to 10. The included studies consistently scored poorly (25 per cent or fewer of the studies received all points) on 10 of 16 items in the mRQS: 2, 3, 4, 8, 10, 11, 12, 13, 15, and 16. Four items scored consistently positive (at least 75 per cent of the studies received all points): 5, 7, 9, and 14. Finally, two items scored moderately consistently (received all points in more than 25 per cent and less than 75 per cent of the studies): 1 and 6. Figure 1 summarizes the mRQS scoring.
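The item-level grouping described above can be written as a simple rule. The sketch below is a hedged illustration, not the scoring procedure of the mRQS itself: the function names are hypothetical, and the denominator is assumed to be the number of studies to which an item applies. It also expresses the reported median scores as a percentage of the maximum attainable score.

```python
def classify_item(n_full_points, n_applicable_studies):
    """Group an mRQS item by the share of studies that received all points.

    Thresholds follow the text: 25 per cent or fewer is 'poor', at least
    75 per cent is 'positive', anything in between is 'moderate'.
    """
    if n_applicable_studies <= 0:
        raise ValueError("item must apply to at least one study")
    fraction = n_full_points / n_applicable_studies
    if fraction <= 0.25:
        return "poor"
    if fraction >= 0.75:
        return "positive"
    return "moderate"


def percent_of_max(score, max_score):
    """Express a score as a percentage of the maximum attainable score."""
    return 100 * score / max_score


# Median scores reported in the text, relative to the mRQS maxima.
radiomics_pct = percent_of_max(11.5, 36)      # radiomics studies, maximum 36
deep_learning_pct = percent_of_max(9, 32)     # deep learning studies, maximum 32
print(f"Radiomics median: {radiomics_pct:.1f}% of maximum")
print(f"Deep learning median: {deep_learning_pct:.1f}% of maximum")

# Hypothetical counts, for illustration only (not extracted from Fig. 1).
print(classify_item(3, 12), classify_item(6, 12), classify_item(9, 12))
```

Expressed this way, the median radiomics score is roughly 32 per cent of the maximum and the median deep learning score roughly 28 per cent.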

Fig. 1.

Methodological assessment of included studies using modified Radiomics Quality Score

Green, all points given; orange, not all points given; red, no points or negative points given; –, not applicable to this study owing to data type.

Discussion

This scoping review identified 12 artificial intelligence-based models to assess the risk of malignancy in patients with IPMN on radiological imaging. Although model performance was generally promising, the methodological quality of the studies was relatively poor. Furthermore, none of the models were applied in a prospective clinical setting or determined the added value compared with current guidelines or a clinical expert panel. If methodologically robust models are developed, and evaluated in a prospective setting, they may have the potential to enhance decision-making in finding the best time for surgery in patients diagnosed with IPMN.

This review has several limitations. First, owing to publication bias, models that were not discriminative may not have been submitted or accepted for publication. The model performance presented in this review may therefore be optimistic. Second, all studies are based on retrospective surgical series. However, most IPMNs are referred for surgery only after surveillance, if cyst progression is observed. The selection bias originating from including only patients with surgically resected tumours makes the value of these models unclear in the unresected population. Third, the methods of the included studies varied extensively. Therefore, extracting generalizable results from this overview is difficult.

Future research should concentrate on developing methodologically sound, generalizable, and clinically validated models. Multiple methodological elements are frequently missed or ignored, as is evident from the mRQS scores of the included studies. Once robust and generalizable models have been constructed, their performance and value should be validated in clinical settings. Currently available studies have focused on assessing the discriminative performance of machine learning models for malignant IPMNs. However, ideally, models would exclude the presence of malignancy with a high negative predictive value and ‘safely’ advise surveillance in patients who would have been selected for surgical treatment according to current criteria. This would represent a true added value to current clinical practice.
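To make the negative predictive value mentioned above concrete, the following minimal sketch computes it from the counts of a confusion matrix. The function name and the counts are hypothetical and serve only to illustrate the definition; they are not results from any included study.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Share of model-negative patients who truly have no malignancy:
    NPV = TN / (TN + FN)."""
    total_predicted_negative = true_negatives + false_negatives
    if total_predicted_negative == 0:
        raise ValueError("no patients were classified as negative")
    return true_negatives / total_predicted_negative


# Hypothetical example: of 100 patients the model labels as low risk,
# 95 have low-grade IPMN on pathology and 5 have high-grade IPMN or PDAC.
npv = negative_predictive_value(true_negatives=95, false_negatives=5)
print(f"NPV = {npv:.2f}")
```

A model used to advise surveillance would need this value to be high, because each false negative corresponds to a malignancy left under surveillance.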

This scoping review has provided evidence that 12 artificial intelligence-based machine learning models have sufficient capacity to evaluate the risk of malignancy in IPMN. However, the methodological quality of the included studies is inadequate, and the clinical value of the proposed models has not been proven. As a result, caution is advised when interpreting these results, and the findings must be corroborated by additional high-quality studies. Future research should focus on developing rigorous models and investigating their usefulness in clinical practice to ensure that they are dependable tools for assessing the risk of malignancy in IPMN.


Acknowledgements

A.B. and B.V.J. are joint first authors, and M.G.B. and R.S. are joint last authors, of this article.

Contributor Information

Alberto Balduzzi, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Boris V Janssen, Department of Surgery, Amsterdam UMC, location University of Amsterdam, Amsterdam, the Netherlands; Department of Pathology, Amsterdam UMC, location University of Amsterdam, Amsterdam, the Netherlands; Cancer Centre Amsterdam, Amsterdam, the Netherlands.

Matteo De Pastena, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Tommaso Pollini, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Giovanni Marchegiani, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Henk Marquering, Cancer Centre Amsterdam, Amsterdam, the Netherlands; Department of Biomedical Engineering and Physics, Amsterdam UMC, location University of Amsterdam, Amsterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam UMC, Location University of Amsterdam, Amsterdam, the Netherlands.

Jaap Stoker, Cancer Centre Amsterdam, Amsterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam UMC, Location University of Amsterdam, Amsterdam, the Netherlands.

Inez Verpalen, Cancer Centre Amsterdam, Amsterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam UMC, Location University of Amsterdam, Amsterdam, the Netherlands.

Claudio Bassi, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Marc G Besselink, Department of Surgery, Amsterdam UMC, location University of Amsterdam, Amsterdam, the Netherlands; Cancer Centre Amsterdam, Amsterdam, the Netherlands.

Roberto Salvia, Department of Surgery and Oncology, Unit of General and Pancreatic Surgery, University of Verona Hospital Trust, Verona, Italy.

Funding

The authors have no funding to declare.

Author contributions

Alberto Balduzzi (Conceptualization, Methodology, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Project administration), Boris Janssen (Conceptualization, Methodology, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Project administration), Matteo De Pastena (Data Curation, Writing - Review & Editing), Tommaso Pollini (Data Curation, Writing - Review & Editing), Giovanni Marchegiani (Conceptualization, Writing - Review & Editing, Methodology, Supervision), Henk Marquering (Conceptualization, Writing - Review & Editing, Methodology, Supervision), Jaap Stoker (Conceptualization, Writing - Review & Editing, Methodology, Supervision), Inez Verpalen (Conceptualization, Writing - Review & Editing, Methodology, Supervision), Claudio Bassi (Conceptualization, Writing - Review & Editing, Methodology, Supervision), Marc Besselink (Conceptualization, Writing - Review & Editing, Methodology, Supervision), and Roberto Salvia (Conceptualization, Writing - Review & Editing, Methodology, Supervision).

Disclosure

The authors declare no conflict of interest.

Supplementary material

Supplementary material is available at BJS online.

Data availability

The raw data required to reproduce the scoping review findings were extracted from published articles.

References

1. Tanaka M, Fernández-del Castillo C, Kamisawa T, Jang JY, Levy P, Ohtsuka T et al. Revisions of international consensus Fukuoka guidelines for the management of IPMN of the pancreas. Pancreatology 2017;17:738–753
2. Balduzzi A, Salvia R, Löhr M. Risk stratification tools for branch-duct intraductal papillary mucinous neoplasms of the pancreas. United European Gastroenterol J 2022;10:145–146
3. Del Chiaro M, Besselink MG, Scholten L, Bruno MJ, Cahen DL, Gress TM et al. European evidence-based guidelines on pancreatic cystic neoplasms. Gut 2018;67:789–804
4. Vege SS, Ziring B, Jain R, Moayyedi P, Adams MA, Dorn SD et al. American gastroenterological association institute guideline on the diagnosis and management of asymptomatic neoplastic pancreatic cysts. Gastroenterology 2015;148:819–822
5. Balduzzi A, Marchegiani G, Pollini T, Biancotto M, Caravati A, Stigliani E et al. Systematic review and meta-analysis of observational studies on BD-IPMNS progression to malignancy. Pancreatology 2021;21:1135–1145
6. Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med 2018;169:467–473
7. Lockwood C, Tricco AC. Preparing scoping reviews for publication using methodological guides and reporting standards. Nurs Health Sci 2020;22:1–4
8. Janssen BV, Verhoef S, Wesdorp NJ, Huiskens J, de Boer OJ, Marquering H et al. Imaging-based machine-learning models to predict clinical outcomes and identify biomarkers in pancreatic cancer: a scoping review. Ann Surg 2022;275:560–567
9. Hanania AN, Bantis LE, Feng Z, Wang H, Tamm EP, Katz MH et al. Quantitative imaging to evaluate malignant potential of IPMNs. Oncotarget 2016;7:85776–85784
10. Permuth JB, Choi J, Balarunathan Y, Kim J, Chen DT, Chen L et al. Combining radiomic features with a miRNA classifier may improve prediction of malignant pathology for pancreatic intraductal papillary mucinous neoplasms. Oncotarget 2016;7:85785
11. Attiyeh MA, Chakraborty J, Gazit L, Langdon-Embry L, Gonen M, Balachandran VP et al. Preoperative risk prediction for intraductal papillary mucinous neoplasms by quantitative CT image analysis. HPB (Oxford) 2019;21:212–218
12. Chakraborty J, Midya A, Gazit L, Attiyeh M, Langdon-Embry L, Allen PJ et al. CT radiomics to predict high-risk intraductal papillary mucinous neoplasms of the pancreas. Med Phys 2018;45:5019–5029
13. Harrington KA, Williams TL, Lawrence SA, Chakraborty J, Al Efishat MA, Attiyeh MA et al. Multimodal radiomics and cyst fluid inflammatory markers model to predict preoperative risk in intraductal papillary mucinous neoplasms. J Med Imaging (Bellingham) 2020;7:1
14. Polk SL, Choi JW, McGettigan MJ, Rose T, Ahmed A, Kim J et al. Multiphase computed tomography radiomics of pancreatic intraductal papillary mucinous neoplasms to predict malignancy. World J Gastroenterol 2020;26:3458–3471
15. Cheng S, Shi H, Lu M, Wang C, Duan S, Xu Q et al. Radiomics analysis for predicting malignant potential of intraductal papillary mucinous neoplasms of the pancreas: comparison of CT and MRI. Acad Radiol 2022;29:367–375
16. Cui S, Tang T, Su Q, Wang Y, Shu Z, Yang W et al. Radiomic nomogram based on MRI to predict grade of branching type intraductal papillary mucinous neoplasms of the pancreas: a multicenter study. Cancer Imaging 2021;21:26
17. Tobaly D, Santinha J, Sartoris R, Burgio MD, Matos C, Cros J et al. CT-based radiomics analysis to predict malignancy in patients with intraductal papillary mucinous neoplasm (IPMN) of the pancreas. Cancers (Basel) 2020;12:3089
18. Jeon SK, Kim JH, Yoo J, Kim JE, Park SJ, Han JK. Assessment of malignant potential in intraductal papillary mucinous neoplasms of the pancreas using MR findings and texture analysis. Eur Radiol 2021;31:3394–3404
19. Kuwahara T, Hara K, Mizuno N, Okuno N, Matsumoto S, Obata M et al. Usefulness of deep learning analysis for the diagnosis of malignancy in intraductal papillary mucinous neoplasms of the pancreas. Clin Transl Gastroenterol 2019;10:1–8
20. Corral JE, Hussein S, Kandel P, Bolan CW, Bagci U, Wallace MB. Deep learning to classify intraductal papillary mucinous neoplasms using magnetic resonance imaging. Pancreas 2019;48:805–810



Articles from The British Journal of Surgery are provided here courtesy of Oxford University Press
