Abstract
Objectives:
Abnormalities of the main pancreatic duct may be an early indicator of pancreatic ductal adenocarcinoma (PDAC). We developed and validated algorithms that predict the risk of PDAC using features identified on cross-sectional imaging and other clinical characteristics collected through electronic medical records.
Methods:
Adult patients with abdominal computed tomography or magnetic resonance imaging in 01/2006–06/2016 demonstrating dilatation of main pancreatic duct were identified. Pancreas-related morphologic features were extracted from radiology reports using natural language processing. The cumulative incidence of PDAC with death as a competing risk was estimated using multi-state models. Model discrimination was assessed using c-index. The models were internally validated using bootstrapping.
Results:
The cohort consisted of 7819 patients (mean age, 71 years; 65% female). A total of 781 (10%) patients developed PDAC within three years after the first eligible imaging study. The final models achieved reasonable discrimination (c-index 0.825–0.833). The three-year average risk of PDAC in the top 5% of the total eligible patients was 56.0%, more than 20 times the average risk among the bottom 50% of patients.
Conclusions:
Prediction models combining imaging features and clinical measures can be used to further stratify the risk of pancreatic cancer among patients with pancreas ductal dilatation.
Keywords: risk prediction, pancreatic duct, duct dilatation, atrophy
1. INTRODUCTION
Pancreatic cancer is newly diagnosed in 53,000 individuals and kills 43,000 individuals in the US annually, with a 5-year survival of less than 10%.1,2 Of the several types of pancreatic cancer, pancreatic ductal adenocarcinoma (PDAC) is the most common and lethal. Due to the lack of early detection strategies and curative interventions, the mortality of pancreatic cancer is projected to grow continuously, making it the 2nd leading cause of cancer death in the US by 2030.3
An ideal target for early detection of pancreatic cancer would consist of high-grade precursor lesions such as pancreatic intraepithelial neoplasia (PanIN).4 Although current imaging techniques such as computed tomography (CT), magnetic resonance imaging (MRI) with magnetic resonance cholangiopancreatography, and endoscopic ultrasound are limited in identifying PanINs,5–10 features identifiable through these imaging modalities have been associated with PanIN.11
Of several abnormalities in the pancreas, the presence of main pancreatic duct dilatation has been suggested to be a significant risk factor for both PDAC12 and malignant transformation of pancreatic cystic lesions.13 A previous retrospective review of pre-diagnostic CT images from 28 patients with pancreatic cancer found the presence of abnormalities (predominantly duct dilatation and duct cutoff/stricture) in 50% of CT scans obtained 2–6 months prior to cancer diagnosis.14 However, the extent to which abnormal findings detected on routine imaging, in particular abnormalities of the main pancreatic duct, are related to the risk of PDAC remains unclear.
Statistical models or machine learning approaches incorporating clinical features have been used in attempts to identify patients at high risk of PDAC.15–18 However, these models are typically developed based on patient demographic and clinical characteristics. The ability to characterize high-risk precursor lesions may contribute significantly to the development of effective prediction models. In this study, we aimed to characterize the risk of pancreatic cancer associated with specific imaging abnormalities of the pancreas detected in the context of routine clinical care. We further sought to develop and validate risk prediction models that combine imaging features and clinical data, including laboratory measures, to identify patients at highest risk of developing PDAC in the setting of pancreatic duct dilatation.
2. MATERIALS AND METHODS
2.1. Study Design
This is a retrospective cohort study utilizing information on patient demographics, diagnosis and procedure codes, laboratory measures as well as radiology notes extracted from the Research Data Warehouse of an integrated Health Maintenance Organization which serves more than 4.6 million health plan enrollees. Race/ethnicity distribution, demographics and socioeconomic status are representative of the Southern California region.19 The study protocol was approved by the organization’s Institutional Review Board.
2.2. Study Cohort and Follow-up
Patients whose abdominal CT or MRI indicated duct dilatation between 2006 and June of 2016 were identified based on natural language processing (NLP) algorithms (Supplemental Digital Content 1). The date of the first imaging study that met the inclusion criteria above was defined as the index date (t0). Patients who (1) were <18 years of age, (2) had a clearly defined mass (>2 cm) in the pancreas or a history of PDAC (cancer site code C25.0–25.9 in the Cancer Registry (CR) of the organization) on or prior to t0, or (3) were not continuously enrolled in the health plan in the 12 months prior to t0 (baseline window) were excluded. The requirement of continuous enrollment allowed adequate data to define study variables.
For each patient in the cohort, follow-up started on t0 and ended with the earliest of the following events: disenrollment from the health plan, end of the study (07/31/2017), reaching the maximum length of follow-up (1 year, 2 years, and 3 years, respectively, for the three outcomes), non-PDAC-related death, or PDAC diagnosis or death (outcome).
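The censoring rule above amounts to taking the earliest of several candidate dates. The following Python sketch illustrates that logic only; the function name, argument names, and the 365.25-day year are assumptions made for illustration and are not taken from the study code.

```python
from datetime import date, timedelta

STUDY_END = date(2017, 7, 31)  # end of study as stated in the text


def follow_up_end(t0, max_years, disenrollment=None,
                  non_pdac_death=None, pdac_event=None):
    """Return (end_date, reason) for one patient.

    All argument names are hypothetical. Optional dates are None when the
    corresponding event did not occur. A year is approximated as 365.25 days.
    """
    candidates = [
        (STUDY_END, "end of study"),
        (t0 + timedelta(days=round(365.25 * max_years)), "maximum follow-up"),
    ]
    optional = [
        (disenrollment, "disenrollment"),
        (non_pdac_death, "non-PDAC death"),
        (pdac_event, "PDAC"),
    ]
    for event_date, reason in optional:
        if event_date is not None:
            candidates.append((event_date, reason))
    # The earliest candidate date ends follow-up.
    return min(candidates, key=lambda pair: pair[0])
```

For example, a patient indexed on 2016-06-01 with no recorded event is censored at the end of the study, because 07/31/2017 precedes the three-year maximum.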
2.3. Outcome Identification
Pancreatic ductal adenocarcinoma was identified by querying the CR and by linking the health plan enrollees’ information with the California State Death Master files. Individuals were considered to have developed incident PDAC if they had cancer at the site of the pancreas by Site Codes C25.x in the CR, or Tenth Revision of International Classification of Diseases, Clinical Modification (ICD-10-CM) codes C25.x as the cause of death in the State Death Master files. If a patient was found in both data sources, the information from the CR served as the source of record.
2.4. Pancreatic Morphologic Features
The following pancreas-specific morphologic features were extracted from the reports of the abdominal CT or MRI prior to PDAC diagnosis using NLP algorithms (Supplemental Digital Content 1): atrophy, calcification, pancreatic cyst, pancreatic ductal irregularity, focal pancreatic duct stricture with distal (upstream) dilatation, focal pancreatic side branch dilation, granular pancreatic duct filling defects, as well as intra-ductal calculi (duct stone). For each feature, the output of the NLP algorithms was positive, possible, negative or unknown (no mention of such a feature). In this study, the ‘positive’ and ‘possible’ categories were both classified as “presence”, and ‘negative’ and ‘unknown’ were classified as “absence”.
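The four-to-two recoding described above can be written as a small lookup. This is a sketch only; the function name and the lower-case label strings are assumptions, since the actual NLP output format is described in the supplement rather than here.

```python
PRESENCE_LABELS = {"positive", "possible"}
ABSENCE_LABELS = {"negative", "unknown"}


def dichotomize(nlp_label):
    """Map a four-level NLP label to 'presence' or 'absence'.

    Label spellings are assumed; unexpected labels raise an error rather
    than being silently assigned to either class.
    """
    label = nlp_label.strip().lower()
    if label in PRESENCE_LABELS:
        return "presence"
    if label in ABSENCE_LABELS:
        return "absence"
    raise ValueError(f"unexpected NLP label: {nlp_label!r}")
```

Note that treating 'unknown' as absence means a feature that was simply not mentioned in a report is counted the same as one that was explicitly ruled out.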
2.5. Patient Demographic and Clinical Features at Baseline
Age, sex, race/ethnicity, tobacco and alcohol use, medical insurance type, years since health plan enrollment, neighborhood educational level (% of population with high school completion), family history of pancreatic cancer, diabetes, acute and chronic pancreatitis, dyspepsia, gallstone disorders, depression, insulin resistance (ICD-9 code 790.29 or ICD-10 code R73.03) and weight change in the 12 months prior to the index date were captured. Laboratory measures including fasting glucose, hemoglobin A1c, creatinine, cholesterol, alanine transaminase, aspartate aminotransferase, alkaline phosphatase, bilirubin, total protein, conjugated bilirubin, and albumin that were within one year prior to and most proximal to t0 were also extracted.
2.6. Imputation of Missing Data
In the current study, the missingness of data was not negligible for lab measurements or weight loss (Table 1). We imputed 10 datasets for each of the lab measurements with <60% missingness and for weight loss using the R package missForest.20,21 Lab measures with ≥60% missingness were not included in the model development process.
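As an illustration of the 60% rule, the sketch below computes the missing fraction for a few laboratory measures from the non-missing counts reported in Table 1 and the cohort size of 7819. The helper names are hypothetical, and the imputation step itself (missForest in R) is not reproduced.

```python
N_COHORT = 7819  # cohort size reported in the text


def missing_fraction(n_observed, n_total):
    """Fraction of patients without a recorded value."""
    return 1.0 - n_observed / n_total


def eligible_for_imputation(n_observed, n_total, cutoff=0.60):
    """True when missingness is strictly below the cutoff (<60% in the text)."""
    return missing_fraction(n_observed, n_total) < cutoff


# Non-missing counts as listed in Table 1.
observed_counts = {"fasting glucose": 3461, "HgbA1c": 3773, "total protein": 1604}
decisions = {
    name: eligible_for_imputation(n, N_COHORT)
    for name, n in observed_counts.items()
}
```

Under these counts, fasting glucose (about 56% missing) and HgbA1c (about 52% missing) fall below the cutoff, whereas total protein (about 79% missing) does not.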
TABLE 1.
Demographics | |
Age, mean (SD) | 71.3 (13.3) |
Sex, female, n (%) | 5091 (65.1) |
Race/ethnicity, n (%) | |
White | 4078 (52.2) |
African American | 1512 (19.3) |
Hispanic | 1543 (19.7) |
Asian and Pacific Islanders | 612 (7.9) |
Multiple/other/unknown | 74 (0.9) |
Tobacco user, n (%) | |
Never | 3424 (43.8) |
Passive | 36 (0.5) |
Quit | 2767 (35.4) |
Yes | 834 (10.7) |
Missing | 758 (9.7) |
Alcohol user, n (%) | |
No | 1485 (19.0) |
Yes | 4286 (54.8) |
Missing | 2048 (26.2) |
Insurance, n (%) | |
Commercial | 2618 (33.5) |
Medicaid | 210 (2.7) |
Medicare | 4996 (63.9) |
Private pay | 3095 (39.6) |
Years since first enrollment, mean (SD), y | 19.7 (14.3) |
Percent with high school or higher education (geocoded), mean (SD) | 21.8 (7.9) |
Family history of pancreatic cancer, n (%) | 226 (2.9) |
Diabetes, n (%) | 3004 (38.4) |
Body mass index, kg/m2, n (%) | |
Underweight (<18.5) | 578 (7.4) |
Normal weight (18.5–25) | 3276 (41.9) |
Overweight (25–30) | 2071 (26.5) |
Obese (30+) | 1123 (14.4) |
Missing | 771 (9.9) |
Clinical characteristics | |
Gallstone disorders, n (%) | 1004 (12.8) |
Chronic pancreatitis, n (%) | 355 (4.5) |
Insulin resistance, n (%) | 814 (10.4) |
Depression, n (%) | 2010 (25.7) |
Weight change, n = 6545, n (%), kg | |
≤−6 | 1386 (21.2) |
>−6 & ≤−4 | 769 (11.8) |
>−4 & ≤−2 | 989 (15.1) |
>−2 & <2 | 2384 (36.4) |
≥2 & <4 | 500 (7.6) |
≥4 & <6 | 249 (3.8) |
≥6 | 269 (4.1) |
Fasting glucose, n = 3461, n (%), mg/dL | |
Low (<100) | 1625 (47.0) |
Medium (<126 & ≥100) | 1216 (35.1) |
High (≥126) | 620 (17.9) |
HgbA1c, n = 3773, n (%) | |
Low (<5.7) | 628 (16.6) |
Medium (<6.5 & ≥ 5.7) | 1671 (44.3) |
High (≥ 6.5) | 1474 (39.1) |
Creatinine, n = 7012, n (%), mg/dL | |
Low (M: <0.7; F: <0.6) | 457 (6.5) |
Medium (M: <1.3 & ≥0.7; F: <1.1 & ≥0.6) | 5155 (73.5) |
High (M: ≥1.3; F: ≥1.1) | 1400 (20.0) |
Alanine transaminase (ALT), n = 7154, n (%), U/L | |
Low (M: <64; F: <55) | 5966 (83.4) |
High (M: ≥ 64; F: ≥55) | 1188 (16.6) |
Aspartate aminotransferase (AST), n = 4708, n (%), U/L | |
Low (M: <35; F: <31) | 2874 (61.0) |
High (M: ≥35; F: ≥31) | 1834 (39.0) |
Alkaline phosphatase (ALP), n = 5744, n (%), U/L | |
Low/Medium (≤125) | 4370 (76.1) |
High (>125) | 1374 (23.9) |
Total protein, n = 1604, g/dL | |
Median (IQR) | 6.8 (6.2–7.3) |
Bilirubin (total), n = 5706, n (%), mg/dL | |
Low (≤0.1) | 25 (0.4) |
Medium (≤1.0 & >0.1) | 4075 (71.4) |
High (>1.0) | 1606 (28.1) |
Albumin, n = 3618, n (%), g/dL | |
Median (IQR) | 3.5 (2.9–3.9) |
Low (≤3.3) | 1592 (44.0) |
Medium (≤4.8 & >3.3) | 2004 (55.4) |
High (>4.8) | 22 (0.6) |
Pancreatic morphologic features | |
Atrophy, n (%) | |
Absence | 6148 (78.6) |
Presence | 1671 (21.4) |
Calcification, n (%) | |
Absence | 6836 (87.4) |
Presence | 983 (12.6) |
Pancreatic cyst, n (%) | |
Absence | 7366 (94.2) |
Presence | 453 (5.8) |
Pancreatic ductal irregularity, n (%) | |
Absence | 7468 (95.5) |
Presence | 351 (4.5) |
Focal pancreatic duct stricture with distal (upstream) dilatation, n (%) | |
Absence | 7463 (95.5) |
Presence | 356 (4.5) |
Focal pancreatic side branch dilation, n (%) | |
Absence | 7381 (94.4) |
Presence | 438 (5.6) |
Granular pancreatic duct filling defects, n (%) | |
Absence | 7671 (98.1) |
Presence | 148 (1.9) |
Intra-ductal calculi (duct stone), n (%) | |
Absence | 7402 (94.7) |
Presence | 417 (5.3) |
2.7. Model Development
We estimated the cumulative incidence of PDAC with death as a competing risk since t0 using a multi-state model available in R (package ‘mstate’).22–24 For each of the follow-up periods (1 year, 2 years, and 3 years), three models were constructed. Model 1 involved only pancreatic morphologic features. Model 2 additionally considered patient demographics and other clinical characteristics, and Model 3 further added laboratory measures. The three final models were not nested (ie, a predictor selected into Model 1 may not appear in Model 2); rather, features were selected separately for each model. To generate parsimonious models with optimal predictive power, a forward selection method was applied to select the most influential predictors maximizing the c-index, a concordance measure. Starting with a null model, the predictors were added one at a time to maximize the improvement of the average c-index of the 10 imputed datasets, until the addition of a new predictor could not improve the c-index of any imputed dataset by more than 0.005.
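The forward selection procedure can be sketched generically as follows. This is not the study code: the `score` callback (which would fit a model on one imputed dataset and return its c-index) is a hypothetical placeholder, and the stopping rule reflects one reading of the text, namely that selection stops when the best candidate improves the c-index by no more than 0.005 in every imputed dataset.

```python
from statistics import mean


def forward_select(candidates, score, n_datasets, min_gain=0.005):
    """Greedy forward selection on the average c-index across imputed datasets.

    score(predictors, dataset_index) is a user-supplied (hypothetical)
    function returning the c-index of a model with the given predictors,
    fitted on the given imputed dataset. It should return 0.5 for an
    empty predictor tuple (null model).
    """
    selected = []
    remaining = list(candidates)
    current = [score(tuple(selected), d) for d in range(n_datasets)]
    while remaining:
        best_name, best_scores = None, None
        for name in remaining:
            trial = tuple(selected + [name])
            scores = [score(trial, d) for d in range(n_datasets)]
            # Keep the candidate with the highest average c-index.
            if best_scores is None or mean(scores) > mean(best_scores):
                best_name, best_scores = name, scores
        gains = [new - old for new, old in zip(best_scores, current)]
        # Stop when no imputed dataset improves by more than min_gain.
        if max(gains) <= min_gain:
            break
        selected.append(best_name)
        remaining.remove(best_name)
        current = best_scores
    return selected
```

Because candidates are ranked on the average c-index but the stopping rule looks at each dataset separately, a predictor can still enter the model if it helps in only one imputed dataset; a stricter rule based on the average gain would yield smaller models.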
2.8. Internal Validation
Internal validation was performed using bootstrapping.25,26 We generated 50 bootstrapped samples (5 for each imputed dataset), each the same size as the original sample, and developed prediction models for each sample following the approach described in the “Model Development” section.
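The resampling scheme (5 bootstrap samples from each of 10 imputed datasets, each drawn with replacement at the original sample size) can be sketched as below. The function name and the fixed seed are assumptions for illustration; model refitting on each sample is not shown.

```python
import random


def bootstrap_indices(n_patients, n_imputed=10, per_dataset=5, seed=0):
    """Return a list of (imputed_dataset_index, row_indices) pairs.

    Each row_indices list has n_patients entries drawn with replacement,
    so every bootstrap sample matches the original sample size.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    samples = []
    for dataset_index in range(n_imputed):
        for _ in range(per_dataset):
            rows = [rng.randrange(n_patients) for _ in range(n_patients)]
            samples.append((dataset_index, rows))
    return samples
```

With the defaults this yields 10 × 5 = 50 samples, matching the number of validation samples described above.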
2.9. Model Performance
Calibration was assessed by comparing the observed risk of the event and the model-based predicted risk at 1, 2, and 3 years based on each of the 50 validation samples. More specifically, each patient was placed into one of the risk groups (<50th, 50–74th, 75–89th, 90–94th, 95–100th percentiles) based on the predicted risk derived from each of the 50 validation samples. For each group, the average predicted risk of the patients in the risk group was evaluated against the average observed risk, where the average predicted (or observed) risk was the mean of the 50 predicted (or observed) risk estimates. The 95% confidence intervals (CI) were estimated for both predicted and observed risks.
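The grouping step can be sketched as follows. The function name is hypothetical, the group labels use non-overlapping boundaries, and ties in predicted risk are broken by sort order, which is an assumption; the comparison of predicted against observed cumulative incidence within each group is not shown.

```python
def assign_risk_groups(predicted_risk):
    """Assign each patient to a percentile-based risk group.

    Percentile rank is computed from the position in the sorted list of
    predicted risks, so group sizes are approximately 50%, 25%, 15%, 5%
    and 5% of the sample.
    """
    n = len(predicted_risk)
    order = sorted(range(n), key=lambda i: predicted_risk[i])
    groups = [None] * n
    for rank, i in enumerate(order):
        pct = 100.0 * rank / n  # percentile rank in [0, 100)
        if pct < 50:
            groups[i] = "<50th"
        elif pct < 75:
            groups[i] = "50-74th"
        elif pct < 90:
            groups[i] = "75-89th"
        elif pct < 95:
            groups[i] = "90-94th"
        else:
            groups[i] = "95-100th"
    return groups
```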
The discriminative power of each model for each follow-up window (1, 2, and 3 years) was evaluated using the c-index.27 The mstate package estimated the standard errors of the estimates by a bootstrap procedure (reference 23, page 27). Sensitivity, specificity and positive predictive value (PPV) were estimated by averaging the estimates derived from the 50 validation samples.
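For a single validation sample, sensitivity, specificity, and PPV at a given risk threshold can be computed as in the sketch below. It treats the outcome as a simple yes/no indicator and ignores censoring and competing risks, which is a simplification relative to the analysis described here; the function and argument names are hypothetical.

```python
def threshold_metrics(predicted_risk, had_event, threshold):
    """Sensitivity, specificity and PPV when flagging risk > threshold."""
    tp = fp = tn = fn = 0
    for risk, event in zip(predicted_risk, had_event):
        flagged = risk > threshold
        if flagged and event:
            tp += 1
        elif flagged:
            fp += 1
        elif event:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios are reported as NaN rather than raising.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }
```

Averaging the three values over all validation samples would then give pooled estimates of the kind reported for each threshold.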
2.10. Statistical Analysis
To estimate the pooled effect of each predictor selected into the final models, we combined the 10 imputed datasets using Rubin’s rule implemented in the mi.meld function within the R package Amelia.28–30 Estimates were reported with 95% CI. The cumulative incidence rates were derived by mstate. Sensitivity analyses were conducted (Supplemental Digital Content 2). All analyses were performed using SAS (version 9.4 for Unix, SAS Institute, Cary, NC) except for those using the R packages mentioned above. All computations and analyses carried out in R were based on R version 3.4.3 (R Foundation, Vienna, Austria).
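Rubin's rule combines the per-dataset estimates into one pooled estimate whose variance is the average within-imputation variance plus an inflated between-imputation variance. The Python sketch below re-expresses that standard formula for a single coefficient; it is an illustration and not the Amelia implementation, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets.

    Returns (pooled_estimate, pooled_standard_error), where the total
    variance is W + (1 + 1/m) * B, with W the mean squared standard
    error and B the sample variance of the estimates.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)
    between = variance(estimates)  # sample variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)
```

When the estimates agree closely across imputations, the between-imputation term is small and the pooled standard error approaches the average per-dataset standard error.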
2.11. Implementation
The predicted three-year risks of PDAC and non-PDAC death, based on the coefficients and the baseline cumulative hazard derived from our datasets using the mstate package, can be estimated using the R code in Supplemental Digital Content 3.
3. RESULTS
3.1. Characteristics of the Study Cohort
Of a total of 11,233 patients identified with pancreatic duct abnormality, 3414 were excluded (Supplemental Digital Content 4, Supplemental Fig. 1). A manual chart review of 220 radiological reports by a trained gastroenterologist revealed a positive predictive value of 98%. Patient characteristics are displayed in Table 1. Of the 7819 included patients, 52.2% were white, 65.1% were female and 63.9% were on Medicare. On average, the patients were 71.3 years of age and had been enrolled in the health plan for 19.7 years. Tobacco and alcohol use were common (10.7% current and 35.4% former smokers, 54.8% alcohol drinkers). Nearly 3% had a family history of pancreatic cancer. Nearly half (48.1%) of the patients lost at least 2 kilograms and more than one-fifth (21.2%) lost at least 6 kilograms in the year prior to t0. High fasting glucose (≥126 mg/dL) and high hemoglobin A1c (≥6.5%) were seen in 17.9% and 39.1% of patients, respectively.
The two most common additional pancreatic morphologic abnormalities reported were atrophy (21.4%) and calcification (12.6%) (Table 1). Pancreatic ductal irregularity (4.5%), focal pancreatic duct stricture with distal (upstream) dilatation (4.5%), focal pancreatic side branch dilation (5.6%), granular pancreatic duct filling defects (1.9%) and intra-ductal calculi (duct stone) (5.3%) were infrequent.
3.2. Incidence of PDAC
Among the 7819 eligible patients, 781 developed PDAC within 3 years, of whom 712 (91%) and 756 (97%) were diagnosed within 1 and 2 years, respectively (Table 2). Of the 781 PDAC cases, 664 (85%) were captured from the CR and the rest (117 or 15%) died of PDAC based on information from the State Death files. The total follow-up time in years, number (and incidence rate) of PDAC, number (and incidence rate) of non-PDAC deaths, and time to PDAC or death are reported in Table 2 for each of the three follow-up periods. The incidence rates of PDAC were 11.1 (95% CI, 10.3–12.0), 6.4 (95% CI, 5.9–6.8), and 4.7 (95% CI, 4.4–5.0) per 100 person-years for 1-, 2-, and 3-year follow-up, respectively.
TABLE 2.
Follow-up Period | Total Follow-up Time, y | No. of PDAC | Incidence Rate of PDAC/100 Person-Years (95% CI) | Time to PDAC, median (IQR), d | No. of Deaths | Death Rate/100 Person-Years (95% CI) | Time to Death, median (IQR), d |
---|---|---|---|---|---|---|---|
One year | 6396 | 712 | 11.1 (10.3–12.0) | 22 (8–62.5) | 1056 | 16.5 (15.5–17.5) | 105 (40–213) |
Two years | 11,825 | 756 | 6.4 (5.9–6.8) | 25 (9–79.5) | 1458 | 12.3 (11.7–13.0) | 188 (60–390) |
Three years | 16,678 | 781 | 4.7 (4.4–5.0) | 27 (9–91.0) | 1778 | 10.7 (10.2–11.2) | 266.5 (78–604) |
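The incidence rates in Table 2 are event counts divided by total person-years, scaled to 100 person-years. The sketch below recomputes a rate with an approximate 95% CI. The interval method is an assumption (a normal approximation to the Poisson count), because the text does not state how the published intervals were derived, so results should land close to the published values but need not match them to the last digit.

```python
from math import sqrt


def rate_per_100_person_years(events, person_years, z=1.96):
    """Return (rate, lower, upper) per 100 person-years.

    The interval assumes the event count is Poisson distributed and uses
    a normal approximation: events +/- z * sqrt(events).
    """
    rate = 100.0 * events / person_years
    half_width = 100.0 * z * sqrt(events) / person_years
    return rate, rate - half_width, rate + half_width


# One-year PDAC row of Table 2: 712 events over 6396 person-years.
one_year = rate_per_100_person_years(712, 6396)
```

For the one-year row this gives a rate of about 11.1 per 100 person-years, in line with the tabulated value.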
The cumulative incidence of PDAC at 1 year (9.2%), 2 years (9.8%) and 3 years (10.1%) is displayed in Figure 1. The cumulative incidence of non-PDAC death was 13.7%, 19.2%, and 23.7% at 1, 2 and 3 years, respectively (data not shown).
Among the patients whose cancer stage was known (n = 571), 253 (44.3%), 58 (10.2%), and 162 (28.4%) had stage II, stage III and stage IV cancer, respectively (data not shown).
3.3. Performance of Prediction Models
The predictors included in the final nine models are listed in the footnote of Table 3. The pooled effects of these selected predictors are displayed in Supplemental Digital Content 5–Supplemental Table 3. Atrophy positively and calcification negatively predicted the risk of PDAC in all the nine models (Table 3). The c-indexes for Model 1 were in the range of 0.661–0.663. They increased to 0.745–0.751 for Model 2 (with atrophy, calcification, focal pancreatic duct stricture with distal (upstream) dilatation, sex, race/ethnicity and change in weight as predictors) and to 0.825–0.833 for Model 3 (with atrophy, calcification, race/ethnicity, change in weight, fasting glucose, hemoglobin A1c and alkaline phosphatase as predictors) (Table 3).
TABLE 3.
Follow-up Period | Model 1 | Model 2 | Model 3 |
---|---|---|---|
One year | 0.661 ± 0.013 | 0.751 ± 0.010 | 0.833 ± 0.008 |
Two years | 0.663 ± 0.012 | 0.748 ± 0.010 | 0.830 ± 0.007 |
Three years | 0.663 ± 0.009 | 0.745 ± 0.011 | 0.825 ± 0.008 |
Data presented as c-index ± standard error.
Predictors included in the final models:
Model 1 (pancreatic morphologic features only): atrophy, calcification, focal pancreatic duct stricture with distal (upstream) dilatation, and intra-ductal calculi (duct stone).
Model 2 (pancreatic morphologic features + patient demographics/clinical features): atrophy, calcification, focal pancreatic duct stricture with distal (upstream) dilatation, sex, race/ethnicity and change in weight.
Model 3 (pancreatic morphologic features + patient demographics/clinical features + lab measures): atrophy, calcification, race/ethnicity, change in weight, fasting glucose, hemoglobin A1c and alkaline phosphatase.
The average predicted and observed risks of PDAC based on Model 3 for the patients in each of the five risk groups are presented in Figure 2. The average predicted three-year risk of PDAC among the top 5% of patients with the highest predicted risks was 56.0%, more than 20 times the average risk of the patients whose risks were among the bottom 50% (Fig. 2, rightmost panel).
The accuracy of model performance is reported in Table 4 (Model 3 only). With a 3-year risk threshold of 5%, which included 52.1% of all eligible patients, approximately 90% of PDAC cases were identified (Table 4). The corresponding PPV and specificity were 17.3% and 52.1%, respectively. When the 3-year risk threshold was increased to 10%, the sensitivity decreased to 75.4% and the PPV increased to 25.6%.
TABLE 4.
Follow-up Period | Sensitivity, % | | | | Specificity, % | | | | PPV, % | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Risk threshold, % | 5 | 10 | 20 | 30 | 5 | 10 | 20 | 30 | 5 | 10 | 20 | 30 |
One year | 88.1 | 73.5 | 51.4 | 36.2 | 57.6 | 78.8 | 90.9 | 95.4 | 17.3 | 25.9 | 36.3 | 44.2 |
Two years | 90.5 | 74.3 | 53.5 | 38.6 | 53.4 | 76.5 | 90.2 | 95.1 | 17.2 | 25.6 | 36.7 | 44.7 |
Three years | 89.7 | 75.4 | 53.3 | 37.8 | 52.1 | 75.6 | 89.9 | 94.9 | 17.3 | 25.6 | 37.1 | 45.3 |
Percent of eligible patients whose risk was above each risk threshold:
One year: 5% threshold: 46.6%; 10% threshold: 26.0%; 20% threshold: 13.0%; 30% threshold: 7.5%.
Two years: 5% threshold: 50.8%; 10% threshold: 28.5%; 20% threshold: 13.9%; 30% threshold: 8.0%.
Three years: 5% threshold: 52.1%; 10% threshold: 29.5%; 20% threshold: 14.4%; 30% threshold: 8.4%.
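The entries in Table 4 are linked by simple identities: given sensitivity, specificity, and the proportion of patients who develop PDAC, both the PPV and the share of patients above the threshold follow. The sketch below applies these identities to the three-year, 5% threshold row as a rough consistency check. Using the crude proportion 781/7819 as the prevalence is an assumption that ignores censoring and competing risks, so only approximate agreement is expected.

```python
def ppv_and_flagged(sensitivity, specificity, prevalence):
    """Return (ppv, fraction_flagged) from sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    flagged = true_pos + false_pos
    return true_pos / flagged, flagged


# Three-year row, 5% threshold: sensitivity 89.7%, specificity 52.1%.
crude_prevalence = 781 / 7819  # crude three-year proportion with PDAC
ppv, flagged = ppv_and_flagged(0.897, 0.521, crude_prevalence)
```

The resulting values, a PPV of roughly 17% with roughly 52% of patients above the threshold, are close to the tabulated 17.3% and 52.1%.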
The results of the sensitivity analyses can be found in Supplemental Digital Content 2.
4. DISCUSSION
In the present study, we developed and validated prediction models involving pancreatic morphologic features and other clinical characteristics in patients with pancreatic duct dilatation detected in the context of routine clinical care. Pancreatic parenchymal atrophy further predicted the risk of PDAC in both main and sensitivity analyses. Weight loss, elevated fasting glucose, hemoglobin A1c and alkaline phosphatase were important predictors of PDAC in this population. Targeting patients with a predicted risk of PDAC above 5% could reach approximately 90% of the PDAC cases in the study population. Raising the risk threshold (eg, from 5% to 10%) increases specificity and PPV and decreases sensitivity. The findings provide an opportunity to better characterize the risk of pancreatic cancer associated with selected abnormal imaging features, thereby improving the ability of clinicians to engage with patients regarding the appropriateness of further cancer surveillance in the setting of a negative initial evaluation.
A number of the features included in the present study have been previously suggested to relate to early forms of pancreatic cancer. A review of 62 CT scans in 28 pancreatic cancer patients revealed that definite or suspicious findings (predominantly duct dilatation) were present in 50% of the CTs obtained in the 6–18 months prior to the diagnosis of pancreatic cancer.14 The relationship between glycemic changes and increased risk of cancer in the setting of pancreatic imaging abnormalities was suggested in a prior study among cancer patients with pre-diagnostic imaging.31 The associations of precursor lesions and lobular parenchymal atrophy were reported in previous literature as well.5 Conversely, pancreatic calcifications indicate a benign lesion31 and thus are less likely to progress to cancer. The present study extends the previous findings by characterizing the overall risk of cancer among patients with duct abnormalities of the pancreas.
Prediction models have been developed to assess the risks of PDAC in various populations.15–18,32,33 With a sample of 1561 subjects and 16 PDAC cases, risk-prediction models potentially suitable for application to patients with new onset of diabetes (NOD) were developed.32 The risk increased from 0.8% in all NOD patients to 3.6% in a high-risk group, which comprised 18% of NOD patients (sensitivity and specificity 80%). Boursi et al described a model based on the UK primary care database including clinical and routine laboratory parameters.17 The study investigators were able to identify a subgroup of patients comprising 6.2% of the NOD patients that were at 2.6% risk of developing pancreatic cancer within 3 years of diagnosis, with a sensitivity of 44.7%. Yu et al developed prediction models with c-statistics of 0.84 for men and 0.80 for women based on a cohort of 1.3 million men and 0.6 million women in Korea.15 Sensitivity, specificity and PPV were not reported. In all these studies, death due to non-PDAC causes was not considered as a competing risk of PDAC events. Our models were distinct from previous attempts to characterize risk of pancreatic cancer in that they involved pancreatic morphologic features and accounted for non-PDAC death as a competing risk while achieving reasonable discrimination (c-index 0.825–0.833).
The present findings need to be interpreted with caution. First, the study population was heterogeneous including patients with abdominal imaging obtained for a variety of indications. This would influence the pre-test probability of cancer based on an image finding. Second, it is possible that some of the desired features may not have been reported by radiologists as part of a clinical reading for a non-pancreas related indication. Thus, the presence of the abnormalities may actually be higher than what was reported. Third, the studies used for analysis were acquired in the context of routine clinical care and as a result there was variation in types of studies and imaging protocols used. This may have caused inconsistency in the interpretation of the imaging reports. Finally, compared to the pancreatic morphologic features that are commonly reported in imaging reports, those that are not frequently reported may suffer from lower accuracy due to the lack of training in the process of NLP algorithm development. This may explain why they were not selected into the prediction models.
Despite the aforementioned limitations, the present study findings have several clinical and research implications. First, we have confirmed that duct dilatation may be an indicator of pancreatic cancer. However, it is important to note that the majority of patients (90%) with duct dilatation did not develop pancreatic cancer during the study period. Second, the prediction models developed in the context of the present study highlight the importance of specific key variables including changes in weight and glycemic parameters that were recently included in a clinical prediction model for pancreatic cancer in patients with NOD.33 Taken together, this information can support clinical decision making by helping to identify a subset of high-risk patients in whom further surveillance or intervention may be indicated despite an initial negative evaluation. The study findings also have implications for development of a potential future screening platform based on NOD by providing a potential target lesion, given that the incidence exceeded 50% among patients with the highest predicted risks (top 5%).
The current analysis did not include evaluation of newer oncologic imaging techniques in MRI, such as diffusion weighted imaging (DWI). Studies have shown that DWI is helpful in distinguishing pancreatic cancer from acute or chronic pancreatitis.34 Because of this, there is interest in the ability of DWI to aid in early detection of PDAC. Other imaging techniques such as FDG positron emission tomography (PET)/CT and PET/MRI have also been studied, although to a lesser degree. Many novel radiotracers for PET, including some aimed at early detection of pancreatic cancer, are in the development or testing stages and could prove useful in the future.
In conclusion, we have further characterized the risk of pancreatic cancer among patients with findings of duct dilatation on cross-sectional imaging and demonstrated the ability of prediction algorithms to provide improved estimates of the risk of pancreatic cancer among patients with this finding. The models developed in the current study still require external validation. Nevertheless, they are revealing in terms of highlighting key factors including changes in weight and elevations in fasting glucose, hemoglobin A1c and alkaline phosphatase that can be used to further stratify patients according to risk of pancreatic cancer.
Supplementary Material
ACKNOWLEDGMENTS
We thank Dianne Taylor and Melanie Balasanian for the assistance with formatting the manuscript.
Grant Support
This study was in part supported through funding from the Consortium for the Study of Chronic Pancreatitis, Diabetes and Pancreatic Cancer (CPDPC-U01). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Abbreviations
- CI: confidence intervals
- CR: cancer registry
- CT: computed tomography
- DWI: diffusion weighted imaging
- ICD-10-CM: Tenth Revision of International Classification of Diseases, Clinical Modification
- NLP: natural language processing
- NOD: new onset of diabetes
- PanIN: pancreatic intraepithelial neoplasia
- PDAC: pancreatic ductal adenocarcinoma
- PET: positron emission tomography
- PPV: positive predictive value
Footnotes
Disclosures
The authors declare that they have no conflict of interest for this study.
Address where the work was conducted: 100 S Los Robles, 2nd Floor, Pasadena, CA 91101
REFERENCES
- 1.American Cancer Society. Cancer Facts & Figures 2016. Atlanta, GA: American Cancer Society, 2016. Available at: https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2016.html. Accessed November 21, 2018. [Google Scholar]
- 2.SEER Program (National Cancer Institute (U.S.)). SEER Stat Fact Sheets: Pancreas Cancer. Volume 2016: NCI’s Division of Cancer Control and Population Sciences, 2016. Accessed November 21, 2019. [Google Scholar]
- 3.Rahib L, Smith BD, Aizenberg R, et al. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 2014;74:2913–2921. [DOI] [PubMed] [Google Scholar]
- 4.Hruban RH, Maitra A, Goggins MJ, et al. Update on pancreatic intraepithelial neoplasia. Int J Clin Exp Pathol. 2008;1:306–316. [PMC free article] [PubMed] [Google Scholar]
- 5.Al-Sukhni W, Borgida A, Rothenmund H, et al. Screening for pancreatic cancer in a high-risk cohort: an eight-year experience. J Gastrointest Surg. 2012;16:771–783. [DOI] [PubMed] [Google Scholar]
- 6.Poley J, Kluijt I, Gouma D, et al. The yield of first-time endoscopic ultrasonography in screening individuals at a high risk of developing pancreatic cancer. Am J Gastroenterol. 2009;104:2175. [DOI] [PubMed] [Google Scholar]
- 7.Vasen HF, Wasser M, Van Mil A, et al. Magnetic resonance imaging surveillance detects early-stage pancreatic cancer in carriers of a p16-Leiden mutation. Gastroenterology. 2011;140:850–856. [DOI] [PubMed] [Google Scholar]
- 8.Verna EC, Hwang C, Stevens PD, et al. Pancreatic cancer screening in a prospective cohort of high-risk patients: a comprehensive strategy of imaging and genetics. Clinical Cancer Res. 2010;16:5028–5037. [DOI] [PubMed] [Google Scholar]
- 9. Barral M, Taouli B, Guiu B, et al. Diffusion-weighted MR imaging of the pancreas: current status and recommendations. Radiology. 2014;274:45–63.
- 10. Jang KM, Kim SH, Kim YK, et al. Missed pancreatic ductal adenocarcinoma: Assessment of early imaging findings on prediagnostic magnetic resonance imaging. Eur J Radiol. 2015;84:1473–1479.
- 11. Yokode M, Akita M, Fujikura K, et al. High-grade PanIN presenting with localised stricture of the main pancreatic duct: A clinicopathological and molecular study of 10 cases suggests a clue for the early detection of pancreatic cancer. Histopathology. 2018;73:247–258.
- 12. Tanaka S, Nakao M, Ioka T, et al. Slight dilatation of the main pancreatic duct and presence of pancreatic cysts as predictive signs of pancreatic cancer: a prospective study. Radiology. 2010;254:965–972.
- 13. Wu BU, Sampath K, Berberian CE, et al. Prediction of malignancy in cystic neoplasms of the pancreas: a population-based cohort study. Am J Gastroenterol. 2014;109:121–129; quiz 130.
- 14. Gangi S, Fletcher JG, Nathan MA, et al. Time Interval Between Abnormalities Seen on CT and the Clinical Diagnosis of Pancreatic Cancer: Retrospective Review of CT Scans Obtained Before Diagnosis. AJR Am J Roentgenol. 2004;182:897–903.
- 15. Yu A, Woo SM, Joo J, et al. Development and validation of a prediction model to estimate individual risk of pancreatic cancer. PLoS One. 2016;11:e0146473.
- 16. Klein AP, Lindström S, Mendelsohn JB, et al. An absolute risk model to identify individuals at elevated risk for pancreatic cancer in the general population. PLoS One. 2013;8:e72311.
- 17. Boursi B, Finkelman B, Giantonio BJ, et al. A clinical prediction model to assess risk for pancreatic cancer among patients with new-onset diabetes. Gastroenterology. 2017;152:840–850.e3.
- 18. Munigala S, Singh A, Gelrud A, et al. Predictors for pancreatic cancer diagnosis following new-onset diabetes mellitus. Clin Transl Gastroenterol. 2015;6:e118.
- 19. Koebnick C, Langer-Gould AM, Gould MK, et al. Sociodemographic characteristics of members of a large, integrated health care system: comparison with US Census Bureau data. Perm J. 2012;16:37–41.
- 20. Stekhoven DJ, Bühlmann P. MissForest—non-parametric missing value imputation for mixed-type data. Bioinformatics. 2011;28:112–118.
- 21. Stekhoven DJ. missForest: Nonparametric missing value imputation using random forest. R package version 1.4. December 31, 2013. Available at: https://CRAN.R-project.org/package=missForest. Accessed November 21, 2018.
- 22. de Wreede LC, Fiocco M, Putter H. The mstate package for estimation and prediction in non- and semi-parametric multi-state and competing risks models. Comput Methods Programs Biomed. 2010;99:261–274.
- 23. de Wreede LC, Fiocco M, Putter H. mstate: an R package for the analysis of competing risks and multi-state models. J Stat Softw. 2011;38:1–30.
- 24. Putter H, de Wreede L, Fiocco M, et al. mstate: Data Preparation, Estimation and Prediction in Multi-State Models. R package version 0.2. Available at: https://cran.r-project.org/web/packages/mstate/index.html. Accessed November 21, 2018.
- 25. Efron B, Tibshirani RJ. An introduction to the bootstrap. New York, NY: Chapman and Hall/CRC; 1993:436.
- 26. Davison AC, Hinkley DV. Bootstrap methods and their application (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge, UK: Cambridge University Press; 1997.
- 27. Gerds TA, Kattan MW, Schumacher M, et al. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat Med. 2013;32:2173–2184.
- 28. Rubin DB. Multiple imputation for survey nonresponse. New York, NY: Wiley; 1987.
- 29. Marshall A, Altman DG, Holder RL, et al. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009;9:57.
- 30. Honaker J, King G, Blackwell M, et al. Package ‘Amelia’. Version 1.7.5. May 8, 2018. Available at: https://www.rdocumentation.org/packages/Amelia/version/1.7.5. Accessed November 21, 2018.
- 31. Paulino-Netto A, Dreiling DA, Baronofsky ID. The relationship between pancreatic calcification and cancer of the pancreas. Ann Surg. 1960;151:530.
- 32. Sharma A, Kandlakunta H, Nagpal SJS, et al. Model to determine risk of pancreatic cancer in patients with new-onset diabetes. Gastroenterology. 2018;155:730–739.e3.
- 33. Risch HA, Yu H, Lu L, et al. Detectable symptomatology preceding the diagnosis of pancreatic cancer and absolute risk of pancreatic cancer diagnosis. Am J Epidemiol. 2015;182:26–34.
- 34. Ichikawa T, Erturk SM, Motosugi U, et al. High-b value diffusion-weighted MRI for detecting pancreatic adenocarcinoma: preliminary results. AJR Am J Roentgenol. 2007;188:409–414.