Abstract
Objective:
Our aim was to propose a preoperative computer-aided diagnosis scheme to differentiate pancreatic serous cystic neoplasms from other pancreatic cystic neoplasms, providing supportive opinions for clinicians and avoiding overtreatment.
Materials and Methods:
In this retrospective study, 260 patients with pancreatic cystic neoplasm were included. Each patient underwent a multidetector row computed tomography scan and pancreatic resection. In all, 200 patients constituted a cross-validation cohort, and 60 patients formed an independent validation cohort. Demographic information, clinical information, and multidetector row computed tomography images were obtained from Picture Archiving and Communication Systems. The peripheral margin of each neoplasm was manually outlined by experienced radiologists. A radiomics system containing 24 guideline-based features and 385 radiomics high-throughput features was designed. After feature extraction, least absolute shrinkage and selection operator regression was used to select the most important features. A support vector machine classifier with 5-fold cross-validation was applied to build the diagnostic model. The independent validation cohort was used to validate the performance.
Results:
Only 31 of 102 serous cystic neoplasm cases in this study were recognized correctly by clinicians before the surgery. Twenty-two features were selected from the radiomics system after 100 bootstrapping repetitions of the least absolute shrinkage and selection operator regression. In the cross-validation cohort, the diagnostic scheme achieved an area under the receiver operating characteristic curve of 0.767, sensitivity of 0.686, and specificity of 0.709. In the independent validation cohort, we acquired similar results, with an area under the receiver operating characteristic curve of 0.837, sensitivity of 0.667, and specificity of 0.818.
Conclusion:
The proposed radiomics-based computer-aided diagnosis scheme could increase preoperative diagnostic accuracy and assist clinicians in making accurate management decisions.
Keywords: pancreatic cancer, pancreatic serous cystic neoplasms, computer-aided diagnosis, radiomics, MDCT image
Introduction
Pancreatic cancer is one of the most lethal malignant tumors in the world, with an overall 5-year survival rate of only 8%.1 Thanks to the wide use of high-definition imaging scans, more and more pancreatic cystic neoplasms (PCNs) have been discovered and have attracted increasing attention.2 There are 4 typical types of PCNs: serous cystic neoplasms (SCNs), intraductal papillary mucinous neoplasms (IPMNs), mucinous cystic neoplasms (MCNs), and solid pseudopapillary neoplasms (SPNs). SCNs are almost always benign, indolent tumors with slow growth and a very low risk of malignant progression.3,4 Most patients with SCN have no obvious symptoms and need only conservative management and periodic imaging rather than surgical resection.4,5 In contrast, IPMNs, MCNs, and SPNs have a relatively significant rate of malignancy, and the best treatment choice is surgical resection before these neoplasms progress to a high-grade cancer.6,7 Previous studies showed that the clinical diagnostic accuracy of SCNs was far from satisfactory, and more than half of patients who should have been managed conservatively underwent unnecessary surgeries.8–12
For this reason, an accurate preoperative distinction between SCNs and non-SCNs is important. Imaging plays an indispensable role here. Imaging examinations such as endoscopic ultrasound, computed tomography (CT), and magnetic resonance imaging are widely used clinically to provide radiologic information and increase diagnostic accuracy. In particular, abdominal multidetector row computed tomography (MDCT) is an effective method that allows enhanced thin-section scanning of the pancreas and has become the preferred imaging modality in the early diagnosis of PCNs.13,14 Unfortunately, correctly classifying different types of PCNs by manual examination of radiological images is still a huge challenge, even for an experienced radiologist.15
In recent years, computer-aided diagnosis (CAD) systems have been increasingly designed to provide second opinions for radiologists. After selecting the region of interest (ROI) and extracting features from medical images (manually or automatically), classification models are established through machine learning algorithms to identify different types of tumors with higher reliability and objectivity. Many algorithms have been proposed for the diagnosis of tumors in various organs, such as the thyroid, lung, breast, and brain, while there are relatively few CAD studies on PCNs.16 Furthermore, in the majority of existing studies on PCNs, CT image features such as tumor size, cyst number, and the presence or absence of calcifications were all recorded manually.15,17,18 This approach depends mainly on radiologists’ experience and ignores a large amount of image information such as morphological details, texture characteristics, and intensity changes. To solve this problem, the radiomics method has been proposed and has become an emerging field of research.19 It refers to the automatic high-throughput extraction and analysis of features from medical images to build robust models that identify different tumors or predict patient survival.20 In a recent study, a radiomics approach was used to effectively enhance the preoperative prediction of IPMN malignancy, but to our knowledge, there are currently no such studies for the classification of PCNs.21
The aim of our study was to automatically extract quantitative features from MDCT images and develop a radiomics-based CAD classification scheme that improves the preoperative diagnostic accuracy of SCNs and provides supportive opinions for clinicians to avoid overtreatment.
Materials and Methods
Patients
In this retrospective study, we included 260 patients who underwent a pancreatic resection for a PCN from March 2007 to November 2016 at the Department of Pancreatic Surgery, Huashan Hospital of Fudan University, Shanghai, China. All patients had provided written informed consent for imaging and clinical data to be donated for the research. The research was approved by the institutional review board of Huashan Hospital (KY2018-019).
All patients had demographic information (age, sex) recorded in the hospital information system. The ages of patients (94 males and 166 females) ranged from 15 to 86 years (mean age 53.4 [15.1] years). Definite histopathological diagnosis for each patient was performed by experienced pathologists after the surgical resection. The database consisted of the 4 most common types of PCNs: 102 cases of SCNs, 74 cases of IPMNs, 35 cases of MCNs, and 49 cases of SPNs. We created a data set of 200 patients, including 75 SCN cases, 58 IPMN cases, 28 MCN cases, and 39 SPN cases, who underwent MDCT scan before May 2015 as the cross-validation cohort. The other 60 patients, who underwent MDCT scan between May 2015 and November 2016, constituted the independent validation cohort containing 27 SCN cases, 16 IPMN cases, 7 MCN cases, and 10 SPN cases. The IPMNs, MCNs, and SPNs were classified into the non-SCN category. A summary of the patient characteristics in different cohorts is shown in Table 1.
Table 1.
Category | Cross-Validation Cohort: SCNs | Cross-Validation Cohort: Non-SCNs | Independent Validation Cohort: SCNs | Independent Validation Cohort: Non-SCNs |
---|---|---|---|---|
Age (mean [SD]) | 54.1 (14.0) | 52.5 (15.7) | 57.6 (11.0) | 51.7 (17.0) |
Sex (case [%]): Male | 20 (26.7) | 56 (44.8) | 7 (25.9) | 11 (33.3) |
Sex (case [%]): Female | 55 (73.3) | 69 (55.2) | 20 (74.1) | 22 (66.7) |
Total | 75 | 125 | 27 | 33 |
Total per cohort (SCNs + non-SCNs): 200 in the cross-validation cohort, 60 in the independent validation cohort.
Abbreviations: SCN, serous cystic neoplasm; SD, standard deviation.
MDCT Protocol
All included patients had undergone abdominal 64-MDCT before the surgery. All scans were obtained using a dedicated dual-phase pancreatic protocol. More than 90% of patients had 1.5 mm slice images of each phase (noncontrast, arterial phase, and venous phase), and the rest had slice images of 1 or 3 mm thickness. Our operating procedure included acquiring CT slice images of the abdomen from the superior liver capsule to the iliac crests without contrast. Nonionic-iodinated contrast material (370 mg I/mL) was then injected intravenously through a power injector at a rate of 4 mL/s. The volume of contrast material injected was based on the weight of the patient. Arterial phase images were acquired 25 to 30 seconds after contrast injection. For the venous phase, images were acquired 60 to 65 seconds after contrast injection.
Image Analysis
Venous phase images were used for all cases because of better tumor-background contrast.21 Two well-trained radiologists with more than 5 years of experience reviewed all MDCT images. For each patient, one of the readers selected the central slice of the imaging scan and then outlined the peripheral margin of each neoplasm within the pancreas, capturing both solid and cystic components. The other reader rechecked the slices to finalize the boundaries and marked the ROI on each central slice. This two-reader process provided a reliable delineation of the tumor region and helped ensure the accuracy and reliability of the extracted features.
We proposed a radiomics system containing 409 quantitative features. The feature set consisted of 2 parts: 24 guideline-based features and 385 radiomics high-throughput features. In the first part, the demographic information (age, sex) of patients was obtained from Picture Archiving and Communication Systems, and radiologists recorded the location of tumors in the pancreas (head, neck, body, or tail). In addition, we designed and extracted 21 morphological features based on the clinical guideline of PCNs, focusing specifically on the following factors: cyst information, tumor shape, tumor wall, calcification, and central scar.22 In the second part, we designed and extracted 16 intensity features, 61 texture features, and 308 wavelet features to uncover and quantify image information that could not be observed by the naked eye, such as the intensity distribution and subtle texture changes. Details of the radiomics system are shown in Supplemental Appendix 1.
Statistical Analysis
The data set was divided into 2 parts: the cross-validation cohort was used to select the most valuable features and build a classification model, and the independent validation cohort was used only to assess the performance of our model. The division was based on the date of MDCT scans to avoid other interference factors.
Feature selection refers to the search for the best subset of features and is indispensable for accurate classification. In this step, we performed 5-fold cross-validation with 100 bootstrapping repetitions on the cross-validation cohort to obtain a reliable and effective feature subset. In each 5-fold cross-validation, the cross-validation cohort was split into 5 folds. One fold was held out as the testing set, and the other 4 folds were combined into a training set to build a model. The process continued until each fold had been used as the testing set. The least absolute shrinkage and selection operator (LASSO) model was applied to the training set to select the most important features. We recorded the features selected in each bootstrapping repetition and sorted all features by their occurrence frequency across all repetitions. The top 20% of sorted features were retained as the final optimal feature subset.
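The frequency-ranking step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Matlab implementation used in the study: `select_fn` is a hypothetical stand-in for fitting a LASSO model on one training split, each repetition is modeled as a freshly shuffled 5-fold split, and the 20% cutoff is applied to the features selected at least once, a detail the text does not specify.

```python
import math
import random
from collections import Counter


def k_fold_splits(n_samples, k, rng):
    """Shuffle sample indices and yield (train, test) index lists for k folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def rank_by_selection_frequency(n_samples, select_fn, k=5, repetitions=100,
                                keep_fraction=0.2, seed=0):
    """Count how often each feature is selected across repeated k-fold runs.

    select_fn(train_indices) is a hypothetical stand-in for a LASSO fit on the
    training split; it must return the names of the selected features.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(repetitions):
        for train, _test in k_fold_splits(n_samples, k, rng):
            counts.update(set(select_fn(train)))
    # Sort by descending frequency; ties are broken by name for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    # Assumption: the kept fraction refers to features selected at least once.
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    return ranked[:n_keep], counts
```

Ranking by frequency across many resampled splits favors features that are selected consistently rather than features that happen to be selected on a single favorable split.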
After feature selection, we repeated the bootstrapped cross-validation to reduce overfitting and objectively evaluate the model. In each repetition, a support vector machine (SVM) model was built and tested to obtain the classification predictions. The SVM is one of the most popular supervised learning algorithms in the machine learning field and performs effectively in classification problems.23,24 We normalized the features into the range [−1,1] and then used an SVM with a linear kernel after comparing the classification performance. The receiver operating characteristic (ROC) curves were constructed, and the area under the ROC curve (AUC), sensitivity (SEN), and specificity (SPEC) were calculated to evaluate the classification performance of the model. All values were averaged, and the 95% confidence intervals (CIs) of the AUC were also calculated. The classification model was then applied to the independent validation cohort to test its robustness and generalization. The performance was also assessed by AUC, SEN, and SPEC.
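Two of the steps above, scaling a feature into [−1,1] and computing the AUC from classifier scores, can be illustrated with a short Python sketch. It is an illustration only: the function names are hypothetical, min-max scaling is assumed as the normalization (the text gives only the target range), and the AUC is computed in its rank-based (Mann-Whitney) form rather than by tracing the ROC curve.

```python
def scale_to_unit_range(values):
    """Linearly map one feature column onto [-1, 1] (min -> -1, max -> 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels use 1 for the positive class (here SCN) and 0 for the negative
    class; tied scores count as half a correctly ranked pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))
```

In a cross-validation setting, the minimum and maximum would normally be taken from the training folds and reused on the held-out fold, so that no information from the test data enters the scaling.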
All mentioned image and data processing were performed in Matlab R2015b (Mathworks, Inc, Natick, Massachusetts).
Results
The 260 patients enrolled in the retrospective study were divided into a cross-validation cohort of 200 patients and an independent validation cohort of 60 patients. For each patient, the central slice of the imaging scan was manually selected and the peripheral margin of each neoplasm was outlined. Examples of different PCNs after manually outlining the tumor region are shown in Figure 1.
After 100 bootstrapping repetitions of the feature selection by the LASSO regression, we selected 22 features that were the most statistically significant and appeared most frequently in the repeated selection, from 409 quantitative features. The final optimal feature subset is shown in Supplemental Appendix 2.
The subset could be divided into 2 main categories: 5 guideline-based features reflected demographic and morphological information such as sex, location, shape, and cyst size; 17 radiomics high-throughput features revealed intensity and texture characteristics, indicating calcification, central scar, and other density differences. Representative features in each category are shown in Table 2.
Table 2.
Category | Feature | SCNs (Mean [SD]) | Non-SCNs (Mean [SD]) | P Value |
---|---|---|---|---|
Guideline-based features | Sex | 1.735 (0.443) | 1.576 (0.496) | .009 |
Guideline-based features | Tumor location | 2.377 (1.180) | 2.190 (1.264) | .273 |
Guideline-based features | Moment difference | 0.029 (0.012) | 0.022 (0.012) | <.001 |
Guideline-based features | Cyst size (mm2) | 217.2 (245.1) | 702.8 (1571.0) | .039 |
Radiomics high-throughput features | Intensity T-range | 171.6 (48.01) | 158.2 (38.78) | .007 |
Radiomics high-throughput features | Wavelet intensity T-median | 0.333 (0.840) | 0.077 (0.850) | .005 |
Radiomics high-throughput features | Wavelet NGTDM busyness | 0.159 (0.116) | 0.255 (0.273) | .009 |
Abbreviations: NGTDM, neighborhood gray-tone difference matrix; SCN, serous cystic neoplasm; SD, standard deviation.
Guideline-Based Features
Sex
To quantify gender differences among patients, male cases were marked as 1 and female cases were marked as 2. The mean sex value of SCN cases was 1.735 (0.443) and that of non-SCN cases was 1.576 (0.496; P value = .009).
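Because sex is coded as 1 or 2, the mean of the coded value is 1 plus the proportion of female patients. The short Python check below recomputes the reported summary values from the sex counts in Table 1 (both cohorts pooled). It assumes the reported SD is the sample standard deviation (n − 1 denominator); the function name is hypothetical.

```python
import statistics


def coded_sex_summary(n_male, n_female):
    """Mean and sample SD of sex coded as 1 (male) and 2 (female)."""
    values = [1] * n_male + [2] * n_female
    return statistics.mean(values), statistics.stdev(values)


# Counts pooled from Table 1: SCN 20 + 7 males and 55 + 20 females;
# non-SCN 56 + 11 males and 69 + 22 females.
scn_mean, scn_sd = coded_sex_summary(n_male=27, n_female=75)
non_scn_mean, non_scn_sd = coded_sex_summary(n_male=67, n_female=91)

print(round(scn_mean, 3), round(scn_sd, 3))          # 1.735 0.443
print(round(non_scn_mean, 3), round(non_scn_sd, 3))  # 1.576 0.496
```

The recomputed values agree with those reported in the text, so a mean of 1.735 corresponds to 73.5% female patients among SCN cases and 1.576 to 57.6% among non-SCN cases.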
Location
The location of the tumor in the pancreas was recorded by radiologists, with the pancreatic head, neck, body, and tail coded as 1 to 4, respectively. In our study, the mean location value of SCN cases was 2.377 (1.180), slightly greater than that of non-SCN cases (2.190 [1.264]). Although the P value of the location feature was higher than .05 (P value = .273), this feature was selected in our radiomics system; this issue will be discussed later.
Shape
Moment difference and rectangle-fitting factor were used to describe the tumor shape. Moment difference was designed to quantify the roughness of the tumor edge, and the rectangle-fitting factor was defined as the ratio of the tumor area to the area of its minimum enclosing rectangle. The mean moment difference of SCN cases was 0.029 (0.012) and that of non-SCN cases was 0.022 (0.012; P value <.001). The mean rectangle-fitting factor of SCN cases was 0.715 (0.055) and that of non-SCN cases was 0.733 (0.055; P value = .004). The SCN cases had a greater moment difference value and a lower rectangle-fitting factor value, which meant that SCNs had a lobulated contour and non-SCNs had a relatively smooth contour.
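The rectangle-fitting factor can be illustrated on a binary tumor mask. The Python sketch below is a simplification under a stated assumption: it uses the axis-aligned bounding box as the enclosing rectangle, whereas the study's feature may use a rotated minimum rectangle (the exact definition is given in the supplemental material, not here). The function name is hypothetical.

```python
def rectangle_fitting_factor(mask):
    """Ratio of region area to the area of its axis-aligned bounding box.

    mask is a list of rows holding 0 (background) or 1 (tumor).
    Assumption: an axis-aligned box stands in for the minimum enclosing rectangle.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    area = sum(1 for row in mask for value in row if value)
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return area / (height * width)
```

A region that fills its rectangle gives a value of 1, while indentations in the contour leave more of the rectangle empty and lower the value, which is the direction of the difference reported for lobulated SCNs.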
Cyst size
This feature was specially designed to extract the information of cysts inside the tumor and automatically calculate their average area. The mean cyst size of SCN cases was 217.2 (245.1) mm2 and that of non-SCN cases was 702.8 (1571.0) mm2 (P = .039). We found that the cyst size was an effective feature in distinguishing SCNs from non-SCNs.
Radiomics High-Throughput Features
A total of 17 intensity and texture features were selected, showing differences between SCNs and non-SCNs. Typically, the intensity T-range, wavelet intensity T-median, and wavelet neighborhood gray-tone difference matrix (NGTDM) busyness were the most distinguishable. The mean intensity T-ranges of SCN and non-SCN cases were 171.6 (48.01) and 158.2 (38.78), respectively (P value = .007). The mean wavelet intensity T-medians of SCN and non-SCN cases were 0.333 (0.840) and 0.077 (0.850), respectively (P value = .005). The mean wavelet NGTDM busyness of SCN and non-SCN cases was 0.159 (0.116) and 0.255 (0.273), respectively (P value = .009). Thus, SCNs had a relatively wider intensity range, higher overall density, and more homogeneously distributed local density than non-SCNs. These features will be further discussed later.
Then, we used an SVM classifier to combine these 22 selected features and build the classification model. The model achieved an AUC of 0.767 (95% CI, 0.763-0.770), SEN of 0.686, and SPEC of 0.709 in the cross-validation cohort and a higher AUC of 0.837, SEN of 0.667, and SPEC of 0.818 in the independent validation cohort. These metrics indicated that our classification model could identify most SCN cases. We also compared the performance of our SVM classifier with 4 classifiers that used feature subsets selected by other classic feature selection methods.25 In the Wilcoxon rank-sum test, we selected statistically significant features with a P value lower than .01. In the χ2 test and relief method, features were sorted according to the calculated value and then the feature subset with the best classification performance was chosen. The logistic regression was used in the same procedure as the LASSO regression to select the top 20% of sorted features. The SVM classification performance metrics for each feature selection method are listed in Table 3. The ROC curves are shown in Figure 2.
Table 3.
Method of Feature Selection | Number of Selected Features | Cross-Validation Cohort: AUC (95% CI) | Cross-Validation Cohort: SEN | Cross-Validation Cohort: SPEC | Independent Validation Cohort: AUC | Independent Validation Cohort: SEN | Independent Validation Cohort: SPEC |
---|---|---|---|---|---|---|---|
WRST | 17 | 0.658 (0.653-0.663) | 0.605 | 0.644 | 0.736 | 0.593 | 0.727 |
Relief | 21 | 0.644 (0.639-0.648) | 0.612 | 0.625 | 0.679 | 0.630 | 0.636 |
Logistic regression | 20 | 0.628 (0.624-0.633) | 0.621 | 0.564 | 0.667 | 0.630 | 0.576 |
χ2 Test | 16 | 0.667 (0.663-0.670) | 0.573 | 0.671 | 0.733 | 0.630 | 0.697 |
LASSO | 22 | 0.767 (0.763-0.770) | 0.686 | 0.709 | 0.837 | 0.667 | 0.818 |
Cross-validation cohort results are from 5-fold cross-validation with 100 bootstrapping repetitions.
Abbreviations: AUC, area under the ROC curve; CI, confidence interval; LASSO, least absolute shrinkage and selection operator; SEN, sensitivity; SPEC, specificity; WRST, Wilcoxon rank-sum test.
To evaluate the improvement brought by radiomics high-throughput features, we compared the classification performance of the SVM classifier using only the selected guideline-based features with that of the SVM classifier using the full selected feature set. The comparison result is shown in Table 4. AUC and SPEC were higher with the full selected feature set in both cohorts, whereas SEN was lower, indicating that high-throughput radiomics features could utilize more image information than traditional guideline-based features and improve overall discrimination. Detailed information regarding diagnostic discrepancies between the radiomics CAD result and the definitive histological diagnosis in the independent validation cohort is shown in Table 5.
Table 4.
Feature Set | Number of Features | Cross-Validation Cohort: AUC (95% CI) | Cross-Validation Cohort: SEN | Cross-Validation Cohort: SPEC | Independent Validation Cohort: AUC | Independent Validation Cohort: SEN | Independent Validation Cohort: SPEC |
---|---|---|---|---|---|---|---|
Selected guideline-based features | 5 | 0.707 (0.704-0.710) | 0.747 | 0.602 | 0.774 | 0.778 | 0.636 |
Full selected feature set (a) | 22 | 0.767 (0.763-0.770) | 0.686 | 0.709 | 0.837 | 0.667 | 0.818 |
Cross-validation cohort results are from 5-fold cross-validation with 100 bootstrapping repetitions.
Abbreviations: AUC, area under the ROC curve; CI, confidence interval; SEN, sensitivity; SPEC, specificity; SVM, support vector machine.
(a) The full selected feature set contained both selected guideline-based features and radiomics high-throughput features.
Table 5.
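Radiomics CAD Result | Histological Diagnosis: SCN | Histological Diagnosis: Non-SCN (IPMN) | Histological Diagnosis: Non-SCN (MCN) | Histological Diagnosis: Non-SCN (SPN) | Histological Diagnosis: Non-SCN (Total) |
---|---|---|---|---|---|
SCN | 18 | 3 | 2 | 1 | 6 |
Non-SCN | 9 | 13 | 5 | 9 | 27 |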
Abbreviations: CAD, computer-aided diagnosis; IPMN, intraductal papillary mucinous neoplasm; MCN, mucinous cystic neoplasm; SCN, serous cystic neoplasm; SPN, solid pseudopapillary neoplasm.
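The sensitivity and specificity reported for the independent validation cohort follow directly from the counts in Table 5, with SCN treated as the positive class. A brief Python check (the function name is illustrative):

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts from Table 5: 18 SCNs classified as SCN, 9 SCNs classified as non-SCN,
# 6 non-SCNs classified as SCN, 27 non-SCNs classified as non-SCN.
sen, spec = sensitivity_specificity(tp=18, fn=9, fp=6, tn=27)
print(round(sen, 3), round(spec, 3))  # 0.667 0.818
```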
Discussion
Current Research Status of SCN Diagnosis
Due to different tumor characteristics and management strategies, the classification of different types of PCNs has generated much interest. Patients need a noninvasive and affordable approach to accurately distinguish SCNs from MCNs, IPMNs, and SPNs, so that those with SCNs can avoid the morbidity and high economic costs of surgery. However, the preoperative diagnostic accuracy of PCNs by clinicians is currently far from satisfactory. According to a recent study of 141 patients with histology-proven PCN, the overall preoperative diagnostic accuracy of PCNs was 61.0% (86 of 141) while the diagnostic accuracy of SCNs was only 24.2% (8 of 33).11 In our retrospective study of 260 patients with PCN, we were surprised to find that the overall preoperative diagnostic accuracy by clinicians was 37.3% (97 of 260), and only 30.4% (31 of 102) of SCN cases were correctly diagnosed. This meant that more than two-thirds of patients with SCN underwent unnecessary pancreatic resection.
Radiologic imaging technologies, especially MDCT scans, play an important role in the preoperative diagnosis of PCNs. In many recent studies, radiologists recorded descriptive morphologic features from CT images based on their experience, including the tumor size, location (pancreatic head, neck, body, or tail), contour shape (smooth, lobulated), calcification (absent, central, or peripheral), septa (absent, present), and central scar (absent, present).13,15,17,18,26,27 Then, statistical methods and even machine learning algorithms were used to analyze recorded radiologic features to improve the diagnostic accuracy.
According to the results of Kim et al, significant differences in tumor shape were found between serous oligocystic adenomas (SOAs) and the other macrocystic neoplasms (MCNs and IPMNs) (P < .05).26 Their research focused on the macrocystic types of PCN and convinced us of the importance of tumor shape features. Goh et al found that SCNs differ from MCNs by their relatively higher male-to-female ratio (P = .004), higher frequency of tumors occurring in the head of the pancreas, and smaller cyst size (P < .001).27 In one study, Cohen-Scali et al compared the CT appearance of 12 macrocystic SCNs, 11 MCNs, and 10 pseudocysts.13 They found that location in the pancreatic head (P < .05), lobulated contour (P < .005), and lack of wall enhancement (P < .005) were specific for macrocystic SCNs compared with other PCNs. Li et al adopted a CAD scheme to distinguish SOAs from MCNs.17 They found the tumor size, contour, and location features effective in an SVM classifier and achieved an accuracy of 88.37% (38 of 43). However, their database was small and contained only SOA and MCN cases, which greatly simplified the classification problem.
These studies contributed to the accurate preoperative diagnosis of PCNs but had their limitations. Furthermore, the manual feature extraction in these studies was inefficient in dealing with large amounts of data, and the accuracy of these descriptive features relied heavily on radiologists’ subjective judgment. In the feature extraction section of our study, we drew on the valuable results of these studies and implemented automatic extraction and quantification of these morphological features.
Our Findings and Advantages of Radiomics Analysis
Here, we conducted the first retrospective study to evaluate the clinical utility of radiomics features in automatic CAD of SCNs. In our research procedure, we manually outlined the tumor boundary for each of the 260 patients and then designed programs to extract 409 quantitative features, containing both guideline-based features and radiomics high-throughput features. Quantitative guideline-based features provided information similar to the descriptive morphological features manually recorded in former studies, and the automatic feature extraction was more efficient. Furthermore, radiomics high-throughput features, comprising intensity features, texture features, and their wavelet decomposition forms, made fuller use of the image information and captured image details that were hard to discover with the naked eye.28 Then, instead of simply adding up the factors, we used LASSO regression to select the most statistically important features and build an optimal feature subset. With LASSO, the redundancy between highly correlated variables can be eliminated and the most influential variables are selected to obtain a refined statistical model with high accuracy and interpretability.29,30 The results showed that the LASSO regression selected the most effective feature subset and achieved a better performance than the other methods in each indicator value. Using the LASSO regression, we acquired an optimal feature subset of 22 features from 409 radiomics features. Then, we adopted the SVM, a supervised-learning model, and achieved a marked improvement in diagnostic performance compared with clinical diagnosis. Machine learning requires large data sets to produce valid results. Compared with existing CAD studies, our database had a clear advantage in sample size, which supported the reliability and applicability of our research.17,31
According to the results of our study, guideline-based features were effective in the CAD scheme, with an AUC of 0.707. Sex was an important factor in the diagnosis of SCNs, with a P value of .009, because IPMNs are more common in men while SCNs, MCNs, and SPNs are more common in women.22 Age was not a statistically significant feature in our study, with a P value of .3238, which was consistent with the earlier studies of Kim et al and Goh et al.26,27 However, age is considered an important discriminator in many recent studies.22,32,33 To explain the discrepancy, we calculated patients’ average age in each category. The average age (mean [SD]) in the SCN and non-SCN categories was 55.0 (13.4) and 52.3 (16.1) years, respectively. In the non-SCN category, the average ages of patients with IPMNs, MCNs, and SPNs were 62.4 (8.70), 49.4 (15.2), and 39.2 (15.0) years, respectively. When IPMNs, MCNs, and SPNs were all considered as non-SCNs, age became a weak feature in distinguishing SCNs from other PCNs. This was one of the limitations of the LASSO algorithm and was also due to the insufficient database. The tumor location showed no statistical significance, with a P value of .273, but surprisingly this feature was selected in the 100 bootstrapping repetitions of LASSO. According to the results of Goh et al and Scoazec et al, SCNs occur more frequently in the pancreatic head than in the body or tail, MCNs are more likely to appear in the pancreatic tail of women, and IPMNs have a higher probability of appearing in the pancreatic head and neck of men.27,34 The tumor location was therefore an effective feature when combined with other features such as sex and tumor shape, and capturing such combined effects is an advantage of the LASSO regression over traditional statistical methods. Features describing the tumor shape showed clear statistical significance; in particular, the P value of moment difference was less than .001.
Relatively speaking, non-SCNs tended to have a regular oval shape with a smooth contour, while SCNs tended to have a multicystic or lobulated contour. This finding was also consistent with the results of Sahani et al and Kim et al.22,26 The cyst size was also an effective feature in our study. According to existing reports, SCNs usually have many small cysts or a few macrocysts; compared with SCNs, the cysts of MCNs are fewer in number and larger in size; SPNs usually present as single heterogeneous masses with solid and cystic components and have a large size; the cysts of IPMNs, however, are typically small.27,32 In our study, the average cyst size (mean [SD]) in SCN and non-SCN categories was 217.2 (245.1) mm2 and 702.8 (1571.0) mm2, respectively. In the non-SCN category, the average cyst sizes of patients with IPMNs, MCNs, and SPNs were 201.3 (481.0), 1223.4 (2332.6), and 1088.3 (1769.1) mm2, respectively. This feature alone may not be effective in distinguishing SCNs from IPMNs, but the combination of several features achieved a better performance in the SVM classifier.
High-throughput radiomics features captured image information in MDCT scans that guideline-based features did not. When these features were added into the CAD scheme, the classification performance improved: the AUC increased from 0.707 to 0.767 in the cross-validation cohort and from 0.774 to 0.837 in the independent validation cohort. According to the values of intensity, texture, and wavelet features of tumor density, SCNs had relatively higher overall density and more homogeneously distributed local density than non-SCNs. Typically, the P value of the wavelet intensity T-median feature was .005 and that of the wavelet NGTDM busyness feature was .009. T-median referred to the median of gray values in the tumor region. The NGTDM reflected the differences between a pixel and its surrounding neighbors, showing local texture details. The cyst fluid of SCNs is usually described as a clear, thin, watery fluid, while MCNs and IPMNs contain thick, viscous, and turbid fluid; SPNs contain heterogeneous solid and cystic components.35 This partly accounted for the difference between SCNs and non-SCNs in the intensity, texture, and wavelet features. We also found that tumors with calcification or central scar usually had a wider range of intensity and a stronger ROI contrast because calcification or scar regions in CT images appeared as bright spots with higher intensity than general tumor tissue. Typically, the P value of the intensity T-range feature was .007.
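To make the NGTDM idea concrete, the following Python sketch computes the matrix entries and the busyness measure for a small gray-level image. It follows the commonly cited Amadasun and King formulation with an 8-pixel neighborhood and is an assumption-laden illustration: the study computed this feature on wavelet-decomposed images inside the tumor ROI, and its exact settings (neighborhood size, gray-level quantization, ROI handling) are not described in this section. Function names are hypothetical.

```python
from collections import defaultdict


def ngtdm(image):
    """Return (s, counts) for interior pixels of a 2D gray-level image.

    s[i] is the sum, over interior pixels with gray level i, of the absolute
    difference between i and the mean of the pixel's 8 neighbours;
    counts[i] is the number of such pixels.
    """
    n_rows, n_cols = len(image), len(image[0])
    s = defaultdict(float)
    counts = defaultdict(int)
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            neighbours = [image[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            level = image[r][c]
            s[level] += abs(level - sum(neighbours) / 8)
            counts[level] += 1
    return dict(s), dict(counts)


def busyness(image):
    """Busyness = sum_i p_i * s_i / sum_{i,j} |i * p_i - j * p_j|."""
    s, counts = ngtdm(image)
    total = sum(counts.values())
    p = {level: n / total for level, n in counts.items()}
    numerator = sum(p[i] * s[i] for i in p)
    denominator = sum(abs(i * p[i] - j * p[j]) for i in p for j in p)
    # A single gray level gives a zero denominator; report 0.0 in that case.
    return numerator / denominator if denominator else 0.0
```

Under this definition, busyness rises when gray levels change rapidly between neighboring pixels, so the lower mean busyness reported for SCNs is in line with the more homogeneous local density described above.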
There are 4 morphological patterns of SCNs: microcystic, macrocystic (also known as oligocystic), mixed, and solid.4,27 Microcystic SCNs are multilobular tumors formed by numerous tiny cysts and usually have a honeycomb or sponge appearance with a central calcified scar.36 Features of the central scar and calcification were especially effective in distinguishing microcystic SCNs from other PCNs. In contrast, macrocystic SCNs have a relatively small (countable) number of cysts and lack the central scars typically seen in microcystic SCNs.37 Features of sex, tumor location, shape, and density were more valuable in distinguishing macrocystic SCNs from other PCNs, especially MCNs.13 Mixed SCNs are defined by the combination of microcystic and macrocystic patterns, and solid SCNs are tumors without distinguishable cystic lesions on images. Mixed and solid SCNs account for a small proportion of SCNs and are relatively hard to distinguish from other PCNs.4 We will design specific features for these 2 types in our further research.
There are some other limitations of our study. First, we only outlined the peripheral margin of tumors in central slices to extract 2-dimensional image features. We will design automatic segmentation algorithms in the future to replace manual segmentation and reconstruct tumors to extract 3-dimensional features. Second, most of the misclassified tumors were found to be smaller tumors. We will pay more attention to these cases and continue to improve classification accuracy. Lastly, the database of our current study was still insufficient, and we will include more patients with PCN from other hospitals to create a multicenter database. With larger amounts of data, we can more convincingly pick out effective features and accurately classify each type of PCN.
In conclusion, our study proposed a radiomics-based CAD scheme and stressed the role of radiomics analysis as a novel noninvasive method for improving the preoperative diagnostic accuracy of SCNs. In all, 409 quantitative features were automatically extracted, and a subset containing the 22 most statistically significant features was selected after 100 bootstrapping repetitions. The proposed method improved diagnostic accuracy, with an AUC of 0.767 in the cross-validation cohort and 0.837 in the independent validation cohort. These results suggest that our CAD scheme could provide a useful reference for clinicians, helping to reduce misjudgment and avoid overtreatment.
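The performance metrics reported for the scheme (AUC, sensitivity, and specificity) can be computed as in the following Python sketch. The label coding (1 for SCN, 0 for non-SCN), the decision threshold of 0.5, and the toy scores are assumptions made for illustration only; they are not data from this study, and the text does not state how the AUC was computed. Here the AUC is obtained from its rank interpretation: the probability that a randomly chosen SCN case receives a higher score than a randomly chosen non-SCN case, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up classifier outputs for 7 hypothetical patients (1 = SCN, 0 = non-SCN).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.35]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

sen, spec = sensitivity_specificity(labels, predictions)
print(f"AUC={auc(labels, scores):.3f} SEN={sen:.3f} SPEC={spec:.3f}")
```

Because the AUC depends only on the ranking of the scores, it is unaffected by the choice of threshold, whereas sensitivity and specificity change with it.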
Supplemental Material
Supplemental_Material for Computer-Aided Diagnosis of Pancreas Serous Cystic Neoplasms: A Radiomics Method on Preoperative MDCT Images by Ran Wei, Kanru Lin, Wenjun Yan, Yi Guo, Yuanyuan Wang, Ji Li, and Jianqing Zhu in Technology in Cancer Research & Treatment
Abbreviations
- AUC: area under the receiver operating characteristic curve
- CAD: computer-aided diagnosis
- CI: confidence intervals
- CT: computed tomography
- IPMN: intraductal papillary mucinous neoplasms
- LASSO: least absolute shrinkage selection operator
- NGTDM: neighborhood gray-tone difference matrix
- MCN: mucinous cystic neoplasms
- MDCT: multidetector row computed tomography
- PCN: pancreatic cystic neoplasms
- PDAC: pancreatic ductal adenocarcinomas
- ROC: receiver operating characteristic
- ROI: region of interest
- SCN: serous cystic neoplasm
- SD: standard deviation
- SEN: sensitivity
- SOA: serous oligocystic adenomas
- SPEC: specificity
- SPN: solid pseudopapillary neoplasms
- SVM: support vector machine
Footnotes
Authors’ Note: Ran Wei and Kanru Lin contributed equally to this work. All authors certify that this article has not been published in whole or in part nor is it being considered for publication elsewhere.
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grants 61771143 and 81772566).
ORCID iD: Yi Guo, PhD https://orcid.org/0000-0002-7142-2871
Supplemental Material: Supplemental material for this article is available online.
References
- 1. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 2014;74(11):2913–2921.
- 2. Farrell JJ, Fernandez-del Castillo C. Pancreatic cystic neoplasms: management and unanswered questions. Gastroenterology. 2013;144(6):1303–1315.
- 3. Galanis C, Zamani A, Cameron JL, et al. Resected serous cystic neoplasms of the pancreas: a review of 158 patients with recommendations for treatment. J Gastrointest Surg. 2007;11(7):820–826.
- 4. Jais B, Rebours V, Malleo G, et al. Serous cystic neoplasm of the pancreas: a multinational study of 2622 patients under the auspices of the International Association of Pancreatology and European Pancreatic Club (European Study Group on Cystic Tumors of the Pancreas). Gut. 2016;65(2):305–312.
- 5. Malleo G, Bassi C, Rossini R, et al. Growth pattern of serous cystic neoplasms of the pancreas: observational study with long-term magnetic resonance surveillance and recommendations for treatment. Gut. 2012;61(5):746–751.
- 6. Farrell JJ. Prevalence, diagnosis and management of pancreatic cystic neoplasms: current status and future directions. Gut Liver. 2015;9(5):571–589.
- 7. Brugge WR, Lauwers GY, Sahani D, Fernandez-del Castillo C, Warshaw AL. Cystic neoplasms of the pancreas. N Engl J Med. 2004;351(12):1218–1226.
- 8. Cho CS, Russ AJ, Loeffler AG, et al. Preoperative classification of pancreatic cystic neoplasms: the clinical significance of diagnostic inaccuracy. Ann Surg Oncol. 2013;20(9):3112–3119.
- 9. Salvia R, Malleo G, Marchegiani G, et al. Pancreatic resections for cystic neoplasms: from the surgeon’s presumption to the pathologist’s reality. Surgery. 2012;152(3 suppl 1):S135–S142.
- 10. Sawhney MS, Al-Bashir S, Cury MS, et al. International consensus guidelines for surgical resection of mucinous neoplasms cannot be applied to all cystic lesions of the pancreas. Clin Gastroenterol Hepatol. 2009;7(12):1373–1376.
- 11. Del CM, Segersvard R, Pozzi MR, et al. Comparison of preoperative conference-based diagnosis with histology of cystic tumors of the pancreas. Ann Surg Oncol. 2014;21(5):1539–1544.
- 12. Hines OJ, Reber HA. Pancreatic surgery. Curr Opin Gastroenterol. 2008;24(5):603–611.
- 13. Cohen-Scali F, Vilgrain V, Brancatelli G, et al. Discrimination of unilocular macrocystic serous cystadenoma from pancreatic pseudocyst and mucinous cystadenoma with CT: initial observations. Radiology. 2003;228(3):727–733.
- 14. Sainani NI, Saokar A, Deshpande V, Fernandez-del Castillo C, Hahn P, Sahani DV. Comparative performance of MDCT and MRI with MR cholangiopancreatography in characterizing small pancreatic cysts. Am J Roentgenol. 2009;193(3):722–731.
- 15. Sahani DV, Sainani NI, Blake MA, Crippa S, Mino-Kenudson M, Fernandez-del Castillo C. Prospective evaluation of reader performance on MDCT in characterization of cystic pancreatic lesions and prediction of cyst biologic aggressiveness. Am J Roentgenol. 2011;197(1):W53–W61.
- 16. Guo Y, Hu Y, Qiao M, et al. Radiomics analysis on ultrasound for prediction of biologic behavior in breast invasive ductal carcinoma. Clin Breast Cancer. 2017;S1526-S8209(17):30146–30155.
- 17. Li C, Lin X, Hui C, Lam KM, Zhang S. Computer-aided diagnosis for distinguishing pancreatic mucinous cystic neoplasms from serous oligocystic adenomas in spectral CT Images. Technol Cancer Res Treat. 2016;15(1):44–54.
- 18. Lv P, Mahyoub R, Lin X, Chen K, Chai W, Xie J. Differentiating pancreatic ductal adenocarcinoma from pancreatic serous cystadenoma, mucinous cystadenoma, and a pseudocyst with detailed analysis of cystic features on CT scans: a preliminary study. Korean J Radiol. 2011;12(2):187–195.
- 19. Kickingereder P, Götz M, Muschelli J, et al. Large-scale radiomic profiling of recurrent glioblastoma identifies an imaging predictor for stratifying anti-angiogenic treatment response. Clin Cancer Res. 2016;22(23):5765–5771.
- 20. Kumar V, Gu Y, Basu S, et al. Radiomics: the process and the challenges. Magn Reson Imaging. 2012;30(9):1234–1248.
- 21. Permuth JB, Choi J, Balarunathan Y, et al. Combining radiomic features with a miRNA classifier may improve prediction of malignant pathology for pancreatic intraductal papillary mucinous neoplasms. Oncotarget. 2016;7(52):85785–85797.
- 22. Sahani DV, Kambadakone A, Macari M, Takahashi N, Chari S, Fernandez-del Castillo C. Diagnosis and management of cystic pancreatic lesions. Am J Roentgenol. 2013;200(2):343–354.
- 23. Amari S, Wu S. Improving support vector machine classifiers by modifying kernel functions. Neural Netw. 1999;12(6):783–789.
- 24. Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, Haussler D. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics. 2000;16(10):906–914.
- 25. Parmar C, Grossmann P, Bussink J, Lambin P, Aerts HJ. Machine learning methods for quantitative radiomic biomarkers. Sci Rep. 2015;5:13087 doi:10.1038/srep13087.
- 26. Kim SY, Lee JM, Kim SH, et al. Macrocystic neoplasms of the pancreas: CT differentiation of serous oligocystic adenoma from mucinous cystadenoma and intraductal papillary mucinous tumor. Am J Roentgenol. 2006;187(5):1192–1198.
- 27. Goh BK, Tan YM, Yap WM, et al. Pancreatic serous oligocystic adenomas: clinicopathologic features and a comparison with serous microcystic adenomas and mucinous cystic neoplasms. World J Surg. 2006;30(8):1553–1559.
- 28. Kickingereder P, Burth S, Wick A, et al. Radiomic profiling of glioblastoma: identifying an imaging predictor of patient survival with improved performance over established clinical and radiologic risk models. Radiology. 2016;280(3):880–889.
- 29. Zou H. The adaptive LASSO and its Oracle properties. J Am Stat Assoc. 2006;101(476):1418–1429.
- 30. Zhao P, Yu B. On model selection consistency of LASSO. J Mach Learn Res. 2006;7(12):2541–2563.
- 31. Dmitriev K, Kaufman AE, Javed AA, et al. Classification of pancreatic cysts in computed tomography images using a random forest and convolutional neural network ensemble. MICCAI. 2017;10435:150–158.
- 32. Buerke B, Domagk D, Heindel W, et al. Diagnostic and radiological management of cystic pancreatic lesions: important features for radiologists. Clin Radiol. 2012;67(8):727–737.
- 33. Lennon AM, Wolfgang C. Cystic neoplasms of the pancreas. J Gastrointest Surg. 2013;17(4):645–653.
- 34. Scoazec JY, Vullierme MP, Barthet M, Gonzalez JM, Sauvanet A. Cystic and ductal tumors of the pancreas: diagnosis and management. J Visc Surg. 2013;150(2):69–84.
- 35. Compton CC. Serous cystic tumors of the pancreas. Semin Diagn Pathol. 2000;17(1):43–55.
- 36. Atiq M, Suzuki R, Khan AS, et al. Clinical decision making in the management of pancreatic cystic neoplasms. Expert Rev Gastroenterol Hepatol. 2013;7(4):353–360.
- 37. Tseng JF, Warshaw AL, Sahani DV, Lauwers GY, Rattner DW, Fernandez-del Castillo C. Serous cystadenoma of the pancreas: tumor growth rates and recommendations for treatment. Ann Surg. 2005;242(3):413–419.