Abstract
Purpose:
Acute pancreatitis (AP) is a frequently encountered adverse drug reaction. However, the validity of diagnostic codes for AP is unknown. We aimed to determine the positive predictive value (PPV) of a diagnostic code-based algorithm for identifying patients with AP within the US Veterans Health Administration and evaluate the value of adding readily available structured laboratory information.
Methods:
We first identified patients with possible AP events based on the presence of a single hospital discharge ICD-9 or ICD-10 diagnosis of AP (Algorithm 1). We then expanded Algorithm 1 by including relevant laboratory test results (Algorithm 2). Specifically, we considered amylase or lipase serum values obtained between 2 days before admission and the end of the hospitalization. Medical records of a random sample of patients identified by each algorithm were reviewed by two separate gastroenterologists to adjudicate AP events. The PPV (with 95% confidence interval [CI]) was calculated for each algorithm.
Results:
Algorithm 2, consisting of one ICD-9 or ICD-10 hospital discharge diagnosis of AP plus a lipase serum value ≥200 U/L, had a PPV of 89.1% (95% CI 83.0%–95.2%), improving on the PPV of Algorithm 1 (57.9% [95% CI 46.8–69.0]).
Conclusions:
An algorithm consisting of an ICD-9 or ICD-10 diagnosis of AP with a lipase value ≥200 U/L achieved a high PPV. This simple algorithm can be readily implemented in any electronic health record (EHR) system and could be useful for future pharmacoepidemiologic studies on AP.
Keywords: acute pancreatitis, case-finding, electronic health record, validation
1 ∣. INTRODUCTION
Acute pancreatitis (AP) is one of the most common gastrointestinal diagnoses in North America and Europe and a common cause of emergency department visits, hospitalizations, and mortality,1 costing over US$2 billion annually.1 Individuals may develop AP for many reasons, including medications; AP is recognized as an adverse drug reaction (ADR) of over 500 drugs by the World Health Organization.2,3 Electronic databases serve as a valuable data source to study the risk factors, incidence, epidemiology, and natural history of AP in real-world settings. However, the validity of such investigative work is predicated upon the availability of automated algorithms that can accurately and efficiently identify cases of AP from electronic data sources.
Few studies have evaluated the performance of such electronic algorithms for AP. Relying on AP diagnostic codes alone, two algorithms reportedly attained positive predictive values (PPVs) of 83% and 97% in the Swedish and Danish populations, respectively.4,5 However, the performance of such algorithms has been much poorer in the US setting. For example, the PPV of an AP algorithm based on the presence of at least one International Classification of Diseases, 9th Revision (ICD-9) code for AP was only 48% among US veterans with alcohol use disorder.6 A systematic review of validated methods for developing case-finding algorithms using administrative data suggested the use of multicenter cohorts and the incorporation of laboratory values to improve algorithm performance;7 we included both of these features in our study design. An algorithm developed in a single US hospital did achieve better performance by incorporating additional laboratory biochemical data,8 but it was only briefly reported in a letter to the editor. Furthermore, that algorithm requires knowledge of the specific reference ranges of laboratory values, which may not be readily available in other datasets.
To meet the need to identify AP cases accurately and efficiently in US electronic health record (EHR) databases, we developed and compared the performance characteristics of two electronic case-finding algorithms for AP: one using ICD diagnosis codes alone and one combining ICD diagnosis codes with laboratory values.
2 ∣. METHODS
We conducted a retrospective study using electronic health record data from the Veterans Health Administration (VHA) between January 1, 1998 and December 31, 2017. The VHA is the largest integrated healthcare system in the US, with a robust EHR system; its 171 VA medical centers and 1112 outpatient sites provide comprehensive care for over nine million individuals. Data collected by the VHA include demographic information, outpatient and hospital diagnoses, procedures, inpatient and outpatient laboratory results, and dispensed medications.9 We used national data from inpatient, outpatient, and emergency department encounters as well as discharge summaries, admission notes, progress notes, laboratory values, demographics, and radiology reports within the VA Informatics and Computing Infrastructure (VINCI) environment. The study was approved by the Institutional Review Board at the Corporal Michael J. Crescenz VA Medical Center.
2.1 ∣. Case-finding algorithms
We evaluated two separate approaches to identifying hospital AP events within the VHA data (Figure 1). Algorithm 1 identified AP events based on ICD diagnostic codes alone (Table 1). Using this algorithm, we identified Veterans who had a hospitalization between January 1, 1998 and December 31, 2017 associated with a hospital discharge diagnosis of AP (in the principal or a secondary position). We evaluated all hospital discharge AP diagnoses, regardless of diagnosis position, because our goal was to capture hospital AP events that originated both in the outpatient setting and during the hospitalization. We included only the first AP-associated hospitalization among those with more than one episode. For Algorithm 2, we built on Algorithm 1 by additionally considering the maximal amylase or lipase level measured from 2 days before the inpatient admission date to the date of hospital discharge. The eligible study patients for the evaluation of Algorithm 2 therefore consisted of those included in the evaluation of Algorithm 1 who had at least one measurement of amylase or lipase between 2 days before admission and the end of the hospitalization. We chose these laboratory values because they are routinely obtained in suspected AP cases and are included in the clinical guideline definition of AP.10,11 Of the two laboratory values, we ultimately included in Algorithm 2 the one associated with better discriminatory performance. In addition, we aimed to identify a laboratory threshold that could yield both a high PPV (i.e., ≥80%) and high sensitivity (≥80%). A sketch of this selection logic follows Table 1.
FIGURE 1.
Diagram describing criteria for selection of veterans for acute pancreatitis (AP) validation. ICD, International Classification of Diseases.
TABLE 1.
Acute pancreatitis related diagnosis codes and descriptions
| Code type | Code number | Code description |
|---|---|---|
| ICD-9 | 577.0 | Acute pancreatitis |
| ICD-10 | K85.0 | Idiopathic acute pancreatitis |
| ICD-10 | K85.1 | Biliary acute pancreatitis |
| ICD-10 | K85.2 | Alcohol-induced acute pancreatitis |
| ICD-10 | K85.3 | Drug-induced acute pancreatitis |
| ICD-10 | K85.8 | Other acute pancreatitis |
| ICD-10 | K85.9 | Acute pancreatitis, unspecified |
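For concreteness, the following is a minimal Python/pandas sketch of the selection logic described above. The table and column names (`diagnoses`, `labs`, `patient_id`, `admit_date`, and so on) are illustrative assumptions, not the actual VINCI schema.

```python
import pandas as pd

# ICD discharge codes for acute pancreatitis (Table 1)
AP_CODES = {"577.0", "K85.0", "K85.1", "K85.2", "K85.3", "K85.8", "K85.9"}

def algorithm_1(diagnoses: pd.DataFrame) -> pd.DataFrame:
    """Algorithm 1: first hospitalization per patient carrying an AP
    discharge diagnosis in any position (principal or secondary)."""
    ap = diagnoses[diagnoses["icd_code"].isin(AP_CODES)]
    return (ap.sort_values("admit_date")
              .drop_duplicates(subset="patient_id", keep="first"))

def algorithm_2(alg1: pd.DataFrame, labs: pd.DataFrame,
                test: str = "lipase", cutoff: float = 200.0) -> pd.DataFrame:
    """Algorithm 2: restrict Algorithm 1 to patients with a qualifying
    enzyme value between 2 days before admission and hospital discharge."""
    merged = alg1.merge(labs[labs["test_name"] == test], on="patient_id")
    in_window = (
        (merged["lab_date"] >= merged["admit_date"] - pd.Timedelta(days=2))
        & (merged["lab_date"] <= merged["discharge_date"])
    )
    qualifying = merged.loc[in_window & (merged["value"] >= cutoff), "patient_id"]
    return alg1[alg1["patient_id"].isin(qualifying)]
```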
2.2 ∣. Confirmation of outcomes
VHA electronic data were reviewed by a trained abstractor and abstracted into a VHA Research Electronic Data Capture (REDCap) database.12 The abstraction form included the following elements of the VHA electronic health record: (1) pertinent clinical information from clinician hospital admission notes, progress notes, consultant notes, and discharge summaries; (2) laboratory results (lipase and amylase values); and (3) radiology reports (to evaluate for the presence of AP). A definite diagnosis of AP was based on References 10, 11, and 13 and confirmed if at least two of the following criteria were documented during the hospitalization: (1) upper abdominal pain; (2) blood levels of amylase or lipase more than 3 times the upper limit of the normal reference value; or (3) signs of AP on abdominal CT or MRI. Two gastroenterologists independently reviewed the abstraction form to determine the presence, absence, or uncertainty of AP during a hospitalization. A third gastroenterologist reviewed any disagreements to arbitrate the event.
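The two-of-three case definition used in adjudication can be expressed directly as a check over the abstracted elements; a minimal sketch, with illustrative parameter names:

```python
def is_definite_ap(upper_abdominal_pain: bool,
                   max_enzyme_value: float,
                   enzyme_uln: float,
                   imaging_shows_ap: bool) -> bool:
    """Definite AP requires at least two of: (1) upper abdominal pain,
    (2) amylase or lipase more than 3x the upper limit of normal (ULN),
    (3) signs of AP on abdominal CT or MRI."""
    criteria = [
        upper_abdominal_pain,
        max_enzyme_value > 3 * enzyme_uln,
        imaging_shows_ap,
    ]
    return sum(criteria) >= 2
```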
2.3 ∣. Statistical analyses
We focused on a high PPV to enhance confidence that identified cases are true events. We estimated that a sample of ≥75 individuals would allow PPV determination with a sufficiently narrow 95% CI (69%–88%), given an a priori PPV goal of 80%. We first calculated the PPV based on a random sample of individuals who met the criteria for Algorithm 1. Because the PPV of Algorithm 1 was lower than our a priori-defined minimum adequacy level of 80%, we developed Algorithm 2 with the addition of laboratory values. We derived the areas under the receiver operating characteristic (ROC) curves for versions of Algorithm 2 that included maximal amylase and lipase levels, respectively. The laboratory value yielding the better area under the curve (AUC) was chosen for Algorithm 2. We then tabulated the PPVs, sensitivities, and specificities of Algorithm 2 at various cutoffs for the chosen laboratory value. The finalized Algorithm 2 was defined using the cutoff associated with the best tradeoff between sensitivity and specificity. We then performed a separate adjudication in a separate sample of patients who met the criteria for the finalized Algorithm 2 (i.e., at least one ICD discharge code for AP and at least one lipase measurement ≥200 U/L between 2 days before admission and the end of the hospitalization) and measured the corresponding PPV. We also calculated the percentage of observed agreement between the two adjudicators and the corresponding kappa statistic. We analyzed all data using Stata 16.1 (StataCorp).
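For illustration, a Python sketch of the PPV and agreement calculations; the paper used Stata 16.1, and the Wald (normal-approximation) interval assumed here reproduces the CIs reported in the Results:

```python
import math

def ppv_with_ci(n_confirmed: int, n_sampled: int, z: float = 1.96):
    """PPV as the proportion of algorithm-positive sampled cases confirmed
    on chart review, with a Wald (normal-approximation) 95% CI."""
    p = n_confirmed / n_sampled
    half_width = z * math.sqrt(p * (1 - p) / n_sampled)
    return p, p - half_width, p + half_width

def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Chance-corrected agreement between two adjudicators' binary calls
    (1 = definite AP, 0 = not definite AP)."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    pa, pb = sum(rater_a) / n, sum(rater_b) / n
    p_exp = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

print(ppv_with_ci(44, 76))   # Algorithm 1: 0.579 (0.468, 0.690)
print(ppv_with_ci(90, 101))  # Algorithm 2: 0.891 (0.830, 0.952)
```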
3 ∣. RESULTS
For Algorithm 1, 76 574 individuals met the criteria, and 76 were randomly sampled for manual chart review. Additional information on cohort characteristics is provided in Table S1. The PPV was low: 44 of the sampled individuals had definite AP, yielding a PPV of 57.9% (95% CI 46.8–69.0). The percent agreement between adjudicators was 100.0% (kappa statistic: 1.00). Using the same 76 individuals sampled for Algorithm 1, we developed Algorithm 2 by re-evaluating performance with the addition of maximal laboratory values (i.e., lipase or amylase) (Figure 2A,B). Of these 76 individuals, 68 (89.5%) had a lipase value drawn and 57 (75.0%) had an amylase value drawn. The combination of one discharge diagnosis code and lipase had an AUC of 0.844, whereas the combination of one discharge diagnosis code and amylase had a lower AUC of 0.740. Given the higher AUC associated with lipase, we developed Algorithm 2 using lipase values. We evaluated the PPV, negative predictive value (NPV), sensitivity, and specificity of various lipase thresholds (Table 2). A lipase cutoff of ≥200 U/L (units per liter) yielded the optimal tradeoff between sensitivity (88.4%) and specificity (56.5%). Therefore, the finalized definition for AP based on Algorithm 2 was as follows: at least one ICD-9 or ICD-10 diagnostic code for AP and any lipase value ≥200 U/L from 2 days before admission through the end of the hospital encounter.
FIGURE 2.
A, Receiver operating characteristic (ROC) curve for amylase values. B, ROC curve for lipase values.
TABLE 2.
Performance of Algorithm 2 at various lipase cutoffs
| Lipase cutpoint (U/L) | Sensitivity (%): confirmed AP with lipase ≥ cutoff | Specificity (%): no confirmed AP with lipase < cutoff | Positive predictive value (PPV, %) | Negative predictive value (NPV, %) |
|---|---|---|---|---|
| 100 | 93.0 | 30.4 | 71.4 | 70.0 |
| 150 | 90.7 | 47.8 | 76.4 | 73.3 |
| 200 | 88.4 | 56.5 | 79.1 | 72.2 |
| 250 | 86.1 | 60.9 | 80.4 | 70.0 |
| 300 | 83.7 | 60.9 | 80.0 | 66.7 |

Note: Calculated among the patients sampled for Algorithm 1 who had a lipase value drawn.
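The rows of Table 2 correspond to sweeping candidate cutoffs over the adjudicated sample; a minimal sketch, assuming parallel arrays of maximal lipase values and chart-review labels:

```python
import numpy as np

def tabulate_cutoffs(max_lipase: np.ndarray, confirmed: np.ndarray, cutoffs):
    """For each candidate cutoff, compute sensitivity, specificity, PPV,
    and NPV against chart-confirmed AP (boolean array) as gold standard."""
    rows = []
    for c in cutoffs:
        positive = max_lipase >= c
        tp = np.sum(positive & confirmed)
        fp = np.sum(positive & ~confirmed)
        fn = np.sum(~positive & confirmed)
        tn = np.sum(~positive & ~confirmed)
        rows.append({
            "cutoff": c,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        })
    return rows

# e.g., tabulate_cutoffs(lipase_values, definite_ap, [100, 150, 200, 250, 300])
```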
Based on the finalized Algorithm 2, 50 609 patients in our study cohort met the definition for AP. We randomly sampled 101 individuals from this cohort and estimated the PPV of Algorithm 2, again using manual chart review as the gold standard. The PPV of the finalized Algorithm 2 was 89.1% (95% CI 83.0–95.2) for definite AP. The percent agreement between adjudicators for Algorithm 2 was 99.0% (kappa statistic: 0.95). We also stratified algorithm performance by demographics and medical conditions such as age, diabetes, and obesity, and the results were similar (Table S2).
Table 3 reports the most common reasons for false-positive AP events in the cohorts for Algorithms 1 and 2. For Algorithm 1, the most common reasons for not meeting the conditions for definite AP were miscoding (n = 10; e.g., no mention of AP in notes, no laboratory values or imaging performed) and unlikely pancreatitis (n = 13; elevated laboratory values without mention of abdominal pain or imaging). For Algorithm 2, the most common reason was probable rather than definite AP (i.e., meeting clinical criteria but laboratory values were not three times the upper limit of normal, or without appropriate abdominal imaging). In one instance, an individual had chronic pancreatitis rather than AP.
TABLE 3.
Positive predictive values (with 95% confidence intervals) of case-identifying algorithms to identify acute pancreatitis among patients in the VHA
| Algorithm | N confirmed AP | Total examined | PPV (95% CI) | Probable AP^a | Unlikely AP^b | Chronic pancreatitis | Completely miscoded^c |
|---|---|---|---|---|---|---|---|
| ICD-9/10 code | 44 | 76 | 57.9 (46.8–69.0) | 9 | 13 | – | 10 |
| ICD-9/10 code + lipase ≥200 U/L | 90 | 101 | 89.1 (83.0–95.2) | 7 | 3 | 1 | – |

Note: The final four columns report the number of events not confirmed, by reason. Abbreviation: AP, acute pancreatitis.
^a Abdominal pain present, lipase values elevated but not ≥3× the upper limit of normal (ULN), and imaging (ultrasound, CT, or MRI) either not performed or normal.
^b No abdominal pain, normal imaging, elevated lipase value.
^c No mention of acute pancreatitis in any notes; no lipase or amylase values ordered; no imaging; no abdominal pain.
4 ∣. DISCUSSION
In this study, we evaluated the PPV of two ICD-9 and ICD-10 based coding algorithms to identify AP within VHA data. Algorithm 1, which used hospital discharge codes alone, performed poorly (PPV 57.9%). Algorithm 2 additionally required a lipase value ≥200 U/L between 2 days before admission and the end of the hospitalization, which improved the predictive performance of the case-finding algorithm (PPV 89.1%).
Algorithm 1 was likely insufficient because it also captured individuals coded as having clinically probable, but not definite, AP. To enhance accuracy, we added a lipase cutoff of ≥200 U/L, which yielded the optimal balance between sensitivity and specificity. We found that lipase offered better discrimination for AP than amylase, consistent with the clinical observation that lipase is a more specific marker of pancreatic abnormality than amylase. By requiring an additional laboratory value that is routinely measured in the clinical evaluation of suspected AP, we improved the algorithm's accuracy without compromising its feasibility: 89.5% of individuals with a hospital discharge diagnosis code had a lipase value drawn in association with the hospital encounter.

Our algorithm differed from previous algorithms. A purely diagnostic code-based algorithm had low accuracy in our data, in contrast to the Swedish and Danish literature, in which a single diagnostic code was sufficient for an adequate PPV (≥80%).4,5 Our results are more similar to the US-based literature, which has also shown that algorithms based solely on ICD codes are inadequate.6 We hypothesize that these differences in diagnostic accuracy may reflect the health care systems surveyed. Both Sweden and Denmark have universal healthcare and central patient registries with comprehensive information on inpatient and outpatient visits, whereas the VHA may not fully capture information on individuals transferred to non-VA hospitals.

Our algorithm currently relies solely on structured data (e.g., administrative codes and laboratory values). Incorporating imaging findings would likely refine the algorithm further, as imaging is part of the clinical definition of AP. At present, however, assessing imaging findings from unstructured data (e.g., free-text radiology reports) requires substantial manual chart review. As clinical informatics tools such as natural language processing are deployed more widely within the VA, this algorithm could be further refined.
Our study has several strengths. First, we used a rigorous, widely accepted clinical case definition of AP, based on societal guidelines,10 to identify definite AP. Second, we required the clinical criteria for the case definition to be present during the hospital admission or, in the case of the laboratory value, up to 2 days before the hospital admission date (to capture values obtained in the emergency department before admission). Third, our gold standard for AP diagnosis was established by manual chart review performed independently by two gastroenterologists.
This study has several potential limitations. First, the algorithm could miss individuals diagnosed with AP who were discharged from the emergency department and never admitted to the hospital. However, we suspect that virtually all AP cases are admitted to the hospital given the acuity of the condition. Second, we did not evaluate ICD-9 and ICD-10 case-finding algorithms separately; the majority of the codes captured by our algorithms were ICD-9 codes, and the sample size was too small to interpret the performance of each code set on its own. However, given the direct correspondence between the ICD-9 and ICD-10 codes for AP and the straightforward nature of AP coding, we have no reason to suspect that the captured patient populations would differ. Third, our cohort was predominantly male and may not be generalizable to populations with a different sex distribution, although there is no suggestion of sex differences in AP. Finally, our second algorithm was defined by an absolute lipase cutoff (≥200 U/L) rather than by its relationship to the reference range (e.g., three times the upper limit of the normal reference value). We chose this approach for ease of implementation, given the difficulty of extracting reference ranges from the EHR: our laboratory data included multiple reference ranges, and a relative cutoff would yield drastically different absolute values across them. A single absolute cutoff maximizes the practical implementability of the case-finding algorithm, and even with this limitation, our algorithm had good performance.
In conclusion, we evaluated two ICD-9 and ICD-10 based case-finding algorithms to identify AP within the national VHA data set. A case-finding algorithm consisting of an ICD hospital discharge code for AP and a lipase value ≥200 U/L from 2 days before admission through the end of the hospitalization identified AP with a PPV of 89.1%. This algorithm could be used in future EHR-based epidemiologic studies of AP.
Supplementary Material
Key Points.
The electronic health record (EHR) is a rich resource for evaluating the risk and epidemiology of acute pancreatitis (AP), a common adverse drug reaction. However, algorithms to identify AP have not been fully validated in US EHR data.
We developed two case-finding algorithms based on International Classification of Diseases, Ninth Revision (ICD-9) and Tenth Revision (ICD-10) hospital discharge diagnoses and laboratory serum values (amylase or lipase) associated with the hospital encounter.
We identified patients with the following: (i) a hospital discharge ICD-9 or ICD-10 diagnosis of AP (Algorithm 1), or (ii) a hospital discharge ICD-9 or ICD-10 diagnosis of AP plus an amylase or lipase serum value obtained between 2 days before admission and the end of the hospitalization (Algorithm 2).
An algorithm consisting of an ICD-9 or ICD-10 diagnosis of AP with a lipase value ≥200 U/L had a PPV of 89.1% (95% CI 83.0%–95.2%) for chart-confirmed AP events.
This validated case-finding algorithm could be used in future studies to identify AP cases within US Veterans Health Administration data.
Plain Language Summary.
The validity of diagnostic codes for acute pancreatitis (AP) is unknown in large population-based electronic health records (EHRs). We aimed to develop an accurate diagnostic code-based algorithm for identifying patients with AP within the US Veterans Health Administration and to evaluate the value of adding laboratory information. Initially, we identified patients with possible AP events based on the presence of a single hospital discharge International Classification of Diseases, Ninth Revision (ICD-9) or Tenth Revision (ICD-10) code for AP (Algorithm 1). Subsequently, we expanded the algorithm by additionally including relevant laboratory test results, namely amylase or lipase serum values obtained between 2 days before admission and the end of the hospitalization (Algorithm 2). Medical records of a random sample of patients identified by each algorithm were reviewed by two separate gastroenterologists to adjudicate AP events. Algorithm 2, consisting of one ICD-9 or ICD-10 hospital discharge diagnosis of AP plus a lipase serum value ≥200 U/L, had a positive predictive value (PPV) of 89.1% (95% CI 83.0%–95.2%), improving on the PPV of Algorithm 1 (57.9% [95% CI 46.8–69.0]). This simple algorithm can be readily implemented in the EHR for future pharmacoepidemiologic studies on AP.
FUNDING INFORMATION
This research was supported by an NIH T32 training grant in gastrointestinal epidemiology (5T32DK007740).
CONFLICT OF INTEREST
The authors have no conflicts of interest to disclose.
ETHICS STATEMENT
The study was approved by the Institutional Review Boards of the Corporal Michael J. Crescenz Philadelphia VA Medical Center with a waiver of informed consent.
SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.
REFERENCES
1. Peery AF, Crockett SD, Murphy CC, et al. Burden and cost of gastrointestinal, liver, and pancreatic diseases in the United States: update 2018. Gastroenterology. 2019;156:254–272.e11. doi:10.1053/j.gastro.2018.08.063
2. Nitsche CJ, Jamieson N, Lerch MM, Mayerle JV. Drug induced pancreatitis. Best Pract Res Clin Gastroenterol. 2010;24:143–155. doi:10.1016/j.bpg.2010.02.002
3. Lancashire RJ, Cheng K, Langman MJS. Discrepancies between population-based data and adverse reaction reports in assessing drugs as causes of acute pancreatitis. Aliment Pharmacol Ther. 2003;17:887–893. doi:10.1046/j.1365-2036.2003.01485.x
4. Kirkegård J, Mortensen MR, Johannsen IR, Mortensen FV, Cronin-Fenton D. Positive predictive value of acute and chronic pancreatitis diagnoses in the Danish National Patient Registry: a validation study. Scand J Public Health. 2020;48:14–19. doi:10.1177/1403494818773535
5. Razavi D, Ljung R, Lu Y, Andrén-Sandberg Å, Lindblad M. Reliability of acute pancreatitis diagnosis coding in a national patient register: a validation study in Sweden. Pancreatology. 2011;11:525–532. doi:10.1159/000331773
6. Yadav D, Eigenbrodt ML, Briggs MJ, Williams DK, Wiseman EJ. Pancreatitis: prevalence and risk factors among male veterans in a detoxification program. Pancreas. 2007;34:390–398. doi:10.1097/mpa.0b013e318040b332
7. Carnahan RM. Mini-Sentinel's systematic reviews of validated methods for identifying health outcomes using administrative data: summary of findings and suggestions for future research. Pharmacoepidemiol Drug Saf. 2012;21:90–99. doi:10.1002/pds.2318
8. Podugu A, Lee PJW, Bhatt A, Holmes J, Lopez R, Stevens T. Positive predictive value of ICD-9 discharge diagnosis of acute pancreatitis. Pancreas. 2014;43:969. doi:10.1097/MPA.0000000000000145
9. Fihn SD, Francis J, Clancy C, et al. Insights from advanced analytics at the Veterans Health Administration. Health Aff. 2014;33:1203–1211. doi:10.1377/hlthaff.2014.0054
10. Tenner S, Baillie J, DeWitt J, Vege SS. American College of Gastroenterology guideline: management of acute pancreatitis. Am J Gastroenterol. 2013;108:1400–1415. doi:10.1038/ajg.2013.218
11. Gardner TB. Acute pancreatitis. Ann Intern Med. 2021;174:ITC17–ITC32. doi:10.7326/AITC202102160
12. Harris PA, Taylor R, Minor BL, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208. doi:10.1016/j.jbi.2019.103208
13. Mederos MA, Reber HA, Girgis MD. Acute pancreatitis: a review. JAMA. 2021;325:382. doi:10.1001/jama.2020.20317