Abstract
Background & Aims:
Acute on chronic liver failure (ACLF) causes high short-term mortality in patients with previously stable chronic liver disease. To date there are no models to predict which patients are likely to develop ACLF, and existing models to predict ACLF mortality are based on limited cohorts. We sought to create novel risk prediction scores using a large cohort of patients with cirrhosis.
Methods:
We performed a retrospective cohort study of 74,790 patients with incident cirrhosis in the Veterans Health Administration database using randomized 70% derivation/30% validation sets. ACLF events were identified per the European ACLF criteria. Multivariable logistic regression was used to derive prediction models for developing ACLF at three, six, and twelve months, and ACLF mortality at 28 and 90 days. Mortality models were compared to MELD, MELD-sodium, and the CLIF-C ACLF score.
Results:
Models for developing ACLF had very good discrimination (concordance [C] statistics 0.83–0.87) at all timepoints. Models for ACLF mortality also had good discrimination at 28 and 90 days (C-statistics 0.79–0.82), and were superior to MELD, MELD-sodium, and the CLIF-C ACLF score. The calibration of the novel models was excellent at all timepoints.
Conclusion:
We have obtained highly-predictive models for developing ACLF, as well as for ACLF short-term mortality in a diverse United States cohort. These may be used to identify outpatients at significant risk of ACLF, which may prompt closer follow-up or early transplant referral, and facilitate decision-making for patients with diagnosed ACLF, including escalation of care, expedited transplant evaluation, or palliation.
Keywords: European Association for the Study of the Liver (EASL), cirrhosis, chronic liver disease, prediction modeling
Lay Summary
Acute on chronic liver failure (ACLF) results in extremely high short-term mortality in patients with previously stable chronic liver disease. We lack prediction models to identify which patients are at highest risk for ACLF, and only limited models exist to predict ACLF mortality. Using a large, diverse group of patients with liver disease in the United States, we have created highly-accurate models to perform these functions.
Introduction
The burden of cirrhosis in the developed world is rising due to the epidemic of non-alcoholic fatty liver disease (NAFLD) and continued impact of hepatitis C virus (HCV). Despite the paradigm shift in our ability to cure HCV, many patients remain undiagnosed or untreated, and a sizable proportion of cured patients still harbor advanced fibrosis or cirrhosis. As such, the morbidity and mortality of cirrhosis is increasing,1 with patients at risk for hepatocellular carcinoma (HCC), decompensated cirrhosis, and acute on chronic liver failure (ACLF). The latter entity is particularly formidable, being characterized by an acute insult in a stable patient with chronic liver disease that results in extremely high short-term mortality, in excess of 50% at 90 days.2–4 As such, ACLF has become an increasing area of focus in hepatology.
While a number of models to predict ACLF mortality have emerged, there are no models to predict which patients are likely to develop ACLF, significantly limiting our ability to identify patients who require closer follow-up, referral for transplant evaluation outside of standard indications, and focused areas of risk mitigation. This is crucial given the large number of patients with compensated cirrhosis that are followed infrequently as outpatients. Likewise, there is a need to predict the trajectory of patients who experience ACLF, as there are clear implications for medical triage, expedited transplant evaluation, utilization of emerging liver-directed therapies, and for research purposes. Although some studies have developed prediction scores for ACLF mortality, including the Chronic Liver Failure Consortium (CLIF-C) ACLF score,5 none were developed using large cohorts reflective of diverse etiologies of liver disease. These questions reflect high-priority areas for systems improvement and resource utilization for patients with liver disease,6 and impact clinicians across the spectrum of hospital care (i.e. hospitalists, intensivists, and consultants). To address these gaps, we used a large, well-defined cohort of patients with compensated cirrhosis to derive and validate prediction scores to improve the management of patients at risk for, or suffering from, ACLF.
Methods
Definition of Predictive Models
We sought to create risk prediction models for two distinct scenarios. The first series of models predicted the development of ACLF events at three, six, or twelve months. The time intervals were defined from the end of a six-month baseline period after the diagnosis of cirrhosis, to allow sufficient time to obtain clinical/laboratory data and ensure initial stability of liver disease. The second series of models predicted mortality at 28 or 90 days after ACLF development. Although we acknowledge the presence of more than a dozen definitions of ACLF,7 we created models focused on the European Association for the Study of the Liver (EASL) criteria, arguably the most widely-accepted definition.2
Study Design, Cohort Creation, and Ethical Considerations
We performed a retrospective cohort study using a large dataset collected in conjunction with the Veterans Outcomes and Costs Associated with Liver Disease (VOCAL) study group,8 which has been used in numerous chronic liver disease studies.9–12 We previously described the creation of a sub-cohort of patients using VOCAL data from the Veterans Health Administration (VHA), the largest single provider of liver care in the United States.13 Briefly, we identified patients with incident cirrhosis between 2008 and 2016 using a validated algorithm (one inpatient or two outpatient International Classification of Disease [ICD]-9/10 cirrhosis codes).14,15 We included patients age ≥18 years who were actively engaged in VHA care (defined as ≥2 completed outpatient visits in the baseline period),11,16 and excluded patients who received liver transplantation prior to the end of the baseline period, died during the baseline period, or did not have follow-up data beyond the baseline period. We also excluded patients with baseline HCC, congestive heart failure, or chronic kidney disease, as patients such as these (including patients with significant extra-hepatic chronic disease) were excluded from the seminal EASL ACLF study.2 For models predicting development of ACLF, we used the complete cohort of patients meeting selection criteria. For models predicting ACLF mortality, we used a sub-cohort restricted to patients diagnosed with ACLF. Institutional review board approval was obtained for this study at both the Philadelphia Veterans Affairs Hospital and the Hospital of the University of Pennsylvania.
Variable Collection
For each patient, we gathered data on demographics (age, sex, race), comorbidities (hypertension, diabetes, coronary artery disease, atrial fibrillation, pulmonary embolism, and cerebrovascular accident), baseline Child-Turcotte-Pugh (CTP) scores, and liver transplantation. Comorbidity data were obtained from combinations of ICD-9/10 codes, vital signs, and laboratory values through methods previously described.8,9,13 We ascertained etiology of liver disease using a validated algorithm,17 subsequently classified as HCV, hepatitis B (HBV), alcoholic liver disease (EtOH), HCV/EtOH, NAFLD, or other (including autoimmune and inborn metabolic disorders). We also obtained HCV viral load data during the baseline period to establish additional categories of HCV or HCV/EtOH with and without viremia. As a proxy of hazardous drinking behavior, alcohol use disorders identification test (AUDIT-C) scores were collected. These were coded as a binary variable, with a score ≥4 positive for men and ≥3 positive for women.18 Finally, laboratory data were obtained at the end of the baseline period and at the time of onset of ACLF. These included white blood cell (WBC) count, hemoglobin, platelet count, sodium, creatinine, international normalized ratio (INR), albumin, and total bilirubin.
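The binary AUDIT-C coding described above can be illustrated with a short sketch. The function name and the `"M"`/`"F"` sex labels are hypothetical conventions chosen for illustration; only the thresholds (≥4 for men, ≥3 for women) come from the text.

```python
def audit_c_positive(score: int, sex: str) -> bool:
    """Return True when an AUDIT-C score meets the sex-specific threshold.

    Thresholds follow the text: >=4 for men, >=3 for women.
    The 'M'/'F' labels are an illustrative assumption, not the study's coding.
    """
    if not 0 <= score <= 12:  # AUDIT-C totals range from 0 to 12
        raise ValueError("AUDIT-C score must be between 0 and 12")
    thresholds = {"M": 4, "F": 3}
    if sex not in thresholds:
        raise ValueError("sex must be 'M' or 'F'")
    return score >= thresholds[sex]
```

A score of 3, for example, would be coded as positive for a woman but negative for a man.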
Regarding classification of ACLF events by EASL criteria, we previously identified all first ACLF events in the analytic cohort, the methods of which are exhaustively detailed in our prior published work.13 This entailed gathering data on liver, kidney, brain, coagulation, circulation, and respiratory organ failures,2 as well as data on four inciting acute decompensations (hepatic encephalopathy [HE], ascites, gastrointestinal bleeding [GIB], and bacterial infection). Based on the number and type of organ failures present, ACLF events were also graded from 1 to 3 in accordance with published definitions,19 with 3 being the most severe.
Modeling Approach
For each series of models, the cohort was randomly divided into 70% derivation/30% validation sets. We chose this method because the large size of the dataset limits the risk of optimism bias,20 and it is similar to approaches taken by other groups to create prediction models for ACLF mortality.21,22 Variables of interest were compared between sets using Wilcoxon rank-sum and Chi-squared tests for continuous and categorical variables, respectively, using α=0.05 as a threshold for statistical significance. Logistic regression, which performs similarly to machine learning approaches in this context,23 was used for predictive modeling, where the outcome was development of ACLF. We chose logistic regression as opposed to time-to-event approaches because of the a priori timepoints of evaluation, which were based on clinical importance, interpretability, and consistency with the literature. However, we modeled outcomes at later timepoints based on restricted cohorts of patients who did not develop the outcome at the previous timepoint. For example, ACLF mortality at 90 days was only modeled among patients who survived to 28 days. We then used conditional probabilities to properly compute the predicted probabilities of the outcome, which approximates a time-to-event approach. Furthermore, this modeling approach ensures that all predicted probabilities are monotonically increasing at subsequent timepoints.
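The conditional-probability step described above can be illustrated with a small worked sketch. The function name and the input values are hypothetical and are not taken from the study; the sketch simply assumes that the later-timepoint model returns the probability of the outcome among patients who were event-free at the earlier timepoint.

```python
def cumulative_risk(p_first: float, p_second_given_free: float) -> float:
    """Cumulative probability of the outcome by the later timepoint.

    p_first: predicted probability of the outcome by the earlier timepoint.
    p_second_given_free: predicted probability of the outcome between the
        two timepoints, among patients event-free at the earlier timepoint.
    """
    for p in (p_first, p_second_given_free):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # P(event by t2) = P(event by t1) + P(no event by t1) * P(event in (t1, t2] | no event by t1)
    return p_first + (1.0 - p_first) * p_second_given_free


# Hypothetical inputs: 20% mortality risk by day 28, then 25% risk between
# day 28 and day 90 among 28-day survivors.
risk_90 = cumulative_risk(0.20, 0.25)
print(round(risk_90, 2))  # 0.4
```

Because the second term is never negative, the cumulative risk at the later timepoint can never fall below the risk at the earlier one, which is the monotonicity property noted in the text.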
Using the derivation set, we first assessed missing data and, for continuous variables, explored the appropriateness of linearity assumptions for regression models. Because data missingness was less than ~5%, we used complete case analysis rather than methods such as multiple imputation. Locally-weighted scatterplot smoothing (LOWESS) curves were plotted for each continuous variable against the outcome of interest. If plots displayed notable non-linearity, continuous variables were modeled using restricted cubic splines. We selected the number and location of knots in order to achieve good agreement between a LOWESS curve fit to the raw data and the fitted spline model on the basis of visual inspection of the two curves. Our initial spline model was constructed using 3 knots located at equally spaced percentiles of the predictor variable, as recommended by Harrell.24 If notable deviation between the LOWESS curve and fitted model was observed, we refined the spline model by adding an additional knot, or incrementally changing a knot value in the region of notable deviation. We prioritized achieving reasonable agreement between the LOWESS curve and spline model while using as few knots as possible. The final knots selected to form the spline basis functions are given in Supplemental Table 1. In many cases the relationship between exposure and outcome was complex, and close modeling of the observed data was achieved using restricted cubic splines by visual assessment. As an example, WBC count conferred increased risk of ACLF mortality at both very low and very high values (Supplemental Figure 1).
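For readers unfamiliar with restricted cubic splines, the sketch below builds the spline terms for a single predictor value using the standard truncated-power construction (unnormalized). It is an illustration only: the function name is hypothetical, the knot values in the usage note are arbitrary, and the study's actual basis functions were generated in STATA with the knots listed in Supplemental Table 1.

```python
from typing import List, Sequence


def rcs_terms(x: float, knots: Sequence[float]) -> List[float]:
    """Restricted cubic spline terms for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. Each nonlinear term is
    zero below the first knot and linear above the last knot.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least 3 knots are required")
    if len(set(t)) != k:
        raise ValueError("knots must be distinct")

    def pos_cube(u: float) -> float:
        # Truncated power function (u)_+^3
        return max(u, 0.0) ** 3

    span = t[-1] - t[-2]  # distance between the last two knots
    terms = [x]           # the linear term
    for j in range(k - 2):
        terms.append(
            pos_cube(x - t[j])
            - pos_cube(x - t[-2]) * (t[-1] - t[j]) / span
            + pos_cube(x - t[-1]) * (t[-2] - t[j]) / span
        )
    return terms
```

With three knots the function returns two terms (one linear, one nonlinear), so a three-knot spline adds only one extra coefficient per predictor. The linear tails are what allow a U-shaped relationship, such as the one described for WBC count, to be modeled without erratic behavior at extreme values.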
After all spline functions were created, univariate logistic regression models were estimated for each combination of predictor and outcome, with an α=0.10 threshold used to select predictor variables for multivariable analysis. Variables evaluated as possible predictors are summarized in Supplemental Table 2. In evaluating etiology of liver disease as a univariate predictor, we did not find HCV viremia or HBV viremia to be significant predictors of ACLF development or mortality, and thus we simplified categorization to HCV, HBV, EtOH, NAFLD, or other. For multivariable logistic regression, we used backward stepwise model building with an entry/exit p-value threshold of 0.05. Once an initial model was obtained, we then evaluated multiple clinician-driven models where variables felt to be clinically meaningful were reintroduced. Joint hypothesis tests were used for spline basis functions, where relevant. We used the Bayesian Information Criterion to select final models, favoring those with minimum values. Additionally, we prioritized models that were more parsimonious (i.e. ≤10 variables), recognizing that a practical clinical risk calculator should not require an excessive number of inputs. STATA 14.2/IC (College Station, TX) was used for data management and analyses.
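The model-selection rule above (prefer the candidate with the lowest Bayesian Information Criterion) can be sketched as follows. The candidate names, log-likelihoods, parameter counts, and sample size are invented for illustration and do not correspond to any model in the study.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters)
candidates = {
    "larger_model": (-1500.0, 12),
    "smaller_model": (-1504.0, 9),
}
n_obs = 50_000  # hypothetical derivation-set size

best = min(candidates, key=lambda name: bic(*candidates[name], n_obs))
print(best)  # smaller_model
```

In this invented example the smaller model is selected even though its likelihood is slightly worse, because the ln(n) penalty per parameter is large when n is large. This is consistent with the stated preference for parsimonious models.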
Prediction Model Evaluation
Using the final models obtained from the derivation sets, predictions were generated in the derivation and validation sets. Each predictive model was evaluated along three axes: overall performance, discrimination, and calibration, consistent with recommended best practice.25 Overall performance was evaluated using the Brier score, which ranges from 0 (perfect prediction) to 1 (completely inaccurate prediction). Model discrimination was assessed using receiver operating characteristic (ROC) curves with computation of concordance statistics (C-statistics) and 95% confidence intervals. The C-statistic is equivalent to area under the ROC curve (AUC), where 0.5 corresponds to random chance and 1.0 reflects perfect sensitivity and specificity. Finally, we assessed model calibration through visual inspection of plots of observed versus predicted number of events (perfect calibration would be indicated by points lying on a 45-degree line), with guidance from recent literature.26 We did not use the Hosmer-Lemeshow test for model calibration because of known issues with power as group size changes, especially with large datasets where the likelihood of inappropriately rejecting the null hypothesis is high.27
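To make the two summary metrics concrete, the sketch below computes a Brier score and a C-statistic from first principles on a toy example. The function names and the four toy observations are hypothetical; the study itself used STATA, and confidence intervals are not computed here.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted probability; ties count as one half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): two non-events followed by two events
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
print(round(brier_score(y, p), 4))  # 0.1581
print(c_statistic(y, p))            # 0.75
```

The pairwise loop is quadratic in the number of patients and is written for clarity; on a cohort of this size a rank-based computation would be preferable.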
Comparison of ACLF Mortality Predictions to Clinical Standards
We additionally aimed to compare our ACLF mortality models to existing clinical standards, including the model for end-stage liver disease (MELD), MELD-sodium (MELD-Na), and the CLIF-C ACLF score. MELD and MELD-Na are the most commonly used risk prediction scores in the field of liver disease,28 and the CLIF-C ACLF score is arguably the gold standard in mortality risk prediction for patients with ACLF. The CLIF-C ACLF score, as developed by the EASL Consortium, incorporates CLIF-C organ failures, age, and WBC count into risk prediction.5 We computed MELD, MELD-Na, and the CLIF-C ACLF score at the time of ACLF diagnosis for each patient. The latter was calculated using the published formula.29 After computation of scores, we compared overall performance, discrimination, and calibration of our risk scores (denoted as “VOCAL-Penn” in figures) to those of MELD, MELD-Na, and CLIF-C ACLF at both 28 and 90 days.
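As a point of reference for the comparator scores, the sketch below computes MELD and MELD-Na. The text does not state which formulation or bounding rules were applied, so this is an assumption: it follows the widely used UNOS/OPTN conventions (bilirubin and creatinine in mg/dL, sodium in mEq/L, laboratory values below 1.0 set to 1.0, creatinine capped at 4.0, sodium bounded to 125–137, the sodium adjustment applied only when MELD exceeds 11, scores capped at 40). The dialysis rule and integer rounding are omitted, and the function names are hypothetical.

```python
import math


def meld(bilirubin: float, inr: float, creatinine: float) -> float:
    """MELD score (unrounded), assuming UNOS-style bounds on inputs."""
    bili = max(bilirubin, 1.0)             # lower bound of 1.0 (assumed convention)
    inr_b = max(inr, 1.0)
    creat = min(max(creatinine, 1.0), 4.0)  # bounded to [1.0, 4.0]
    raw = (3.78 * math.log(bili)
           + 11.2 * math.log(inr_b)
           + 9.57 * math.log(creat)
           + 6.43)
    return min(raw, 40.0)


def meld_na(meld_score: float, sodium: float) -> float:
    """MELD-Na (unrounded), assuming the 2016 OPTN formulation."""
    if meld_score <= 11.0:
        return meld_score                   # no sodium adjustment at low MELD
    na = min(max(sodium, 125.0), 137.0)     # bounded to [125, 137]
    score = meld_score + 1.32 * (137.0 - na) - 0.033 * meld_score * (137.0 - na)
    return min(score, 40.0)
```

Under these assumed bounds, a patient with all three laboratory values at or below 1.0 receives the minimum score of 6.43, which rounds to 6.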
Results
Cohort Characteristics and Predictions for Developing ACLF
The selected VHA analytic cohort (N = 74,790) was predominantly male but reflected diverse racial backgrounds and etiologies of liver disease (Supplemental Table 3). Over a median follow-up of 3.5 years (IQR 1.8 – 5.8), we observed 6,072 ACLF events (Supplemental Table 4). Short-term mortality rose with increasing ACLF grade, and was highest in ACLF grade 3 patients (83.0% at 90 days). The randomized derivation (N=52,483) and validation (N=22,307) cohorts were similar across the full range of demographic, comorbidity, and laboratory data (all p>0.05; Table 1). In all multivariable models, baseline laboratory values for albumin, INR, total bilirubin, creatinine, and hemoglobin were significant predictors of developing ACLF (Supplemental Tables 5–7). Established diabetes mellitus was associated with increased odds of ACLF at each timepoint, and AUDIT-C and etiology of liver disease were additional significant predictors at selected timepoints. At three-, six-, and 12-month timepoints, overall performance and discrimination for all models were excellent in both the derivation and validation cohorts (Brier scores ranging from <0.01 – 0.03, C-statistics 0.83 – 0.87; Table 2, Figure 1a-c). The calibration of the models was also excellent, with no systematic overprediction or underprediction across the entire range of ACLF risk (Supplemental Figure 2).
Table 1 –
Patient Characteristics after Randomization of Full Analytic Cohort (all p > 0.05)
| Variable | Derivation (N = 52,483) | Validation (N = 22,307) |
|---|---|---|
| Age, median (IQR) | 61 (56, 65) | 61 (56, 65) |
| Male Sex | 50,749 (96.7%) | 21,549 (96.6%) |
| Race | ||
| White | 30,078 (57.3%) | 12,819 (57.5%) |
| Black | 8,096 (15.4%) | 3,404 (15.3%) |
| Hispanic | 4,778 (9.1%) | 2,074 (9.3%) |
| Asian/Pacific Islander | 693 (1.3%) | 311 (1.4%) |
| Other/Unknown | 8,838 (16.8%) | 3,699 (16.6%) |
| BMI, median (IQR) | 27.83 (24.36, 31.85) | 27.83 (24.39, 31.85) |
| Etiology of Liver Disease | ||
| HCV | 8,654 (16.5%) | 3,699 (16.6%) |
| HBV | 1,274 (2.4%) | 555 (2.5%) |
| EtOH | 16,730 (31.9%) | 7,169 (32.1%) |
| NAFLD | 23,963 (45.7%) | 10,105 (45.3%) |
| Other | 1,862 (3.5%) | 779 (3.5%) |
| MELD (baseline), median (IQR) | 6 (6, 9) | 6 (6, 9) |
| MELD-Na (baseline), median (IQR) | 8 (6, 11) | 8 (6, 11) |
| INR (baseline), median (IQR) | 1.11 (1.03, 1.29) | 1.11 (1.03, 1.29) |
| Total bilirubin (baseline), median (IQR) | 0.9 (0.6, 1.4) | 0.9 (0.6, 1.4) |
| Creatinine (baseline), median (IQR) | 0.9 (0.78, 1.05) | 0.9 (0.78, 1.06) |
| Sodium (baseline), median (IQR) | 138 (136, 140) | 138 (136, 140) |
| Albumin (baseline), median (IQR) | 3.7 (3.2, 4.1) | 3.7 (3.2, 4.07) |
| Hemoglobin (baseline), median (IQR) | 13.7 (12.2, 14.9) | 13.6 (12.2, 14.9) |
| Platelets (baseline), median (IQR) | 133 (91, 188) | 133 (91, 187) |
| Hypertension | 47,662 (90.8%) | 20,281 (90.9%) |
| Coronary Artery Disease | 7,522 (14.3%) | 3,174 (14.2%) |
| Diabetes Mellitus | 23,782 (45.3%) | 10,184 (45.7%) |
| Atrial fibrillation | 2,281 (4.3%) | 1,006 (4.5%) |
| Pulmonary Embolism | 411 (0.8%) | 166 (0.7%) |
| Cerebrovascular Accident | 1,380 (2.6%) | 570 (2.6%) |
Table 2 –
Model Performance Characteristics for Predicting Development of ACLF
| Timepoint | Measure | Derivation Cohort (N = 52,483) | Validation Cohort (N = 22,307) |
|---|---|---|---|
| Three Months | Brier score (overall performance) | <0.01 | <0.01 |
| Three Months | C-statistic (discrimination) | 0.87 (0.85 – 0.89) | 0.85 (0.82 – 0.88) |
| Six Months | Brier score (overall performance) | 0.02 | 0.02 |
| Six Months | C-statistic (discrimination) | 0.85 (0.84 – 0.87) | 0.84 (0.82 – 0.86) |
| One Year | Brier score (overall performance) | 0.03 | 0.03 |
| One Year | C-statistic (discrimination) | 0.83 (0.82 – 0.84) | 0.83 (0.82 – 0.85) |
Figure 1 – Receiver Operating Characteristic (ROC) Curves for Predicting Development of ACLF at Three Months (a), Six Months (b), and One Year (c)
Predictions for ACLF Mortality
Among the cohort of patients with diagnosed ACLF (N = 6,072), the randomized derivation and validation cohorts were again similar across all potential predictors (all p>0.05, data not shown). In the multivariable models, we found the following variables to be significantly associated with ACLF mortality at both 28 and 90 days: age, albumin, INR, total bilirubin, creatinine, sodium, WBC count, and etiology of liver disease (Supplemental Tables 8 and 9). Increasing WBC count was significantly associated with increased odds of mortality at 28 and 90 days. The overall performance and discrimination of our models were superior in predicting mortality at both timepoints compared to MELD, MELD-Na, and CLIF-C ACLF scores (Table 3, Figure 2a/b, Figure 3a/b). The calibration of our model was also excellent at both timepoints (Figure 2c, Figure 3c), again neither overpredicting nor underpredicting across the entire spectrum of risk. The calibration of the MELD, MELD-Na, and CLIF-C ACLF models systematically overpredicted risk among lower-risk patients, and underpredicted risk among higher-risk patients.
Table 3 –
Model Performance Characteristics for Predicting ACLF Mortality
| Timepoint | Measure | Model | Derivation Cohort (N = 4,275) | Validation Cohort (N = 1,797) |
|---|---|---|---|---|
| 28-Day | Brier score (overall performance) | VOCAL-Penn | 0.16 | 0.17 |
| 28-Day | Brier score (overall performance) | MELD | 0.18 | 0.20 |
| 28-Day | Brier score (overall performance) | MELD-Na | 0.18 | 0.20 |
| 28-Day | Brier score (overall performance) | CLIF-C | 0.17 | 0.18 |
| 28-Day | C-statistic (discrimination) | VOCAL-Penn | 0.82 (0.80 – 0.83) | 0.80 (0.78 – 0.82) |
| 28-Day | C-statistic (discrimination) | MELD | 0.77 (0.76 – 0.79) | 0.74 (0.72 – 0.77) |
| 28-Day | C-statistic (discrimination) | MELD-Na | 0.76 (0.75 – 0.78) | 0.74 (0.71 – 0.76) |
| 28-Day | C-statistic (discrimination) | CLIF-C | 0.77 (0.76 – 0.79) | 0.78 (0.76 – 0.80) |
| 90-Day | Brier score (overall performance) | VOCAL-Penn | 0.18 | 0.19 |
| 90-Day | Brier score (overall performance) | MELD | 0.20 | 0.21 |
| 90-Day | Brier score (overall performance) | MELD-Na | 0.21 | 0.21 |
| 90-Day | Brier score (overall performance) | CLIF-C | 0.20 | 0.20 |
| 90-Day | C-statistic (discrimination) | VOCAL-Penn | 0.79 (0.78 – 0.81) | 0.79 (0.77 – 0.81) |
| 90-Day | C-statistic (discrimination) | MELD | 0.75 (0.73 – 0.76) | 0.74 (0.72 – 0.76) |
| 90-Day | C-statistic (discrimination) | MELD-Na | 0.75 (0.73 – 0.76) | 0.74 (0.62 – 0.77) |
| 90-Day | C-statistic (discrimination) | CLIF-C | 0.74 (0.73 – 0.76) | 0.75 (0.73 – 0.77) |
Figure 2 – Receiver Operating Characteristic (ROC) Curves for Predicting ACLF Mortality at 28 Days in Derivation (a) and Validation (b) Cohorts, and Associated Calibration Curves (c).

Caption: In panel C, “D” denotes derivation cohort and “V” denotes validation cohort. Plotted points reflect groups across the risk spectrum.
Figure 3 – Receiver Operating Characteristic (ROC) Curves for Predicting ACLF Mortality at 90 Days in Derivation (a) and Validation (b) Cohorts, and Associated Calibration Curves (c).

Caption: In panel C, “D” denotes derivation cohort and “V” denotes validation cohort. Plotted points reflect groups across the risk spectrum.
Model Summaries
A summary of the final model predictors for both ACLF development and ACLF mortality is given in Table 4. Detailed derivation of model formulas is provided in the Supplement Appendix.
Table 4 –
Summary of Model Predictors for ACLF Models
| ACLF Development | ACLF Mortality | ||||
|---|---|---|---|---|---|
| Variable | 3 month | 6 month | 1 year | 28 day | 90 day |
| Age | X | X | |||
| AUDIT-C | X | X | |||
| Albumin | X | X | X | X | X |
| INR | X | X | X | X | X |
| T. bilirubin | X | X | X | X | X |
| Creatinine | X | X | X | X | X |
| Hemoglobin | X | X | X | ||
| Platelets | X | ||||
| Sodium | X | X | X | X | |
| WBC | X | X | |||
| Etiology of liver disease | X | X | X | ||
| DM | X | X | X | ||
AUDIT-C = alcohol use disorders identification test; INR = international normalized ratio; T. bilirubin = total bilirubin; WBC = white blood cell count; DM = diabetes
Discussion
In this study of 74,790 patients with incident cirrhosis of diverse etiologies, we report the derivation and validation of highly-predictive models for developing ACLF and ACLF mortality (to be available at www.aclfcalc.com). These models will be of immediate use to practitioners in multiple specialties, as MELD-Na does not accurately risk-stratify many of these patients from an ACLF perspective.30 In the outpatient setting, primary care physicians, gastroenterologists, and hepatologists may shorten follow-up intervals for patients at high risk of developing ACLF. Importantly, because a low MELD-Na does not preclude a high risk of developing ACLF, our prediction scores may also prompt clinicians to refer patients for transplant evaluation who might not otherwise meet standard indications (i.e., MELD ≥15). Finally, practitioners may focus on mitigating modifiable risk factors for ACLF development, such as alcohol cessation and improved management of diabetes and/or obesity.31,32 While this may seem intuitive, in a context where clinicians and patients must realistically select a handful of issues to address, the models highlight those which are most important.
There are limited data evaluating predictors for developing ACLF, and prior to this study, there were no available prediction scores. The only study to evaluate possible predictors used an outpatient cohort of 406 patients, of whom 61 developed ACLF within 12 months.4 Significant predictors included age, mean arterial blood pressure, MELD, and anemia. Despite an excellent C-statistic (C=0.87), this model has limited generalizability as two-thirds of the cohort had decompensated cirrhosis and the ACLF incidence rate was an order of magnitude higher than that of a general population-based cohort. In contrast, our study incorporates more NAFLD and represents a ~200-fold larger sample size, and would thus generalize more broadly. Additionally, the outcome of ACLF in their study includes many patients with non-hepatic organ failures, as shown previously.13 This implies that the model may lack face validity for a liver-specific outcome of interest. Our publicly-available risk prediction scores are the first of their kind to address the development of ACLF, and incorporate diverse groups of liver disease. Furthermore, they can be computed quickly based on standard laboratory testing and easily-obtained historical information.
We also developed accurate prediction models for ACLF short-term mortality. The novelty of these models lies in the diversity and scale of the underlying cohort, excellent calibration, and demonstrated improvement over other clinical standards. These risk scores will be useful in the inpatient setting in several ways. Hospitalists may elect to transfer those with high projected ACLF mortality to the intensive care unit (ICU) or to a tertiary care facility. Critical care physicians and transplant hepatologists may rely on these scores to facilitate patient selection for expedited transplant evaluation, novel liver-directed therapies, or palliation. They may also be used for risk stratification in clinical trials or for research purposes. Perhaps most importantly, the ability to prognosticate around ACLF will help patients and families manage expectations, plan for the future, and make challenging decisions in the setting of critical illness.
In comparison to MELD and MELD-Na, which are frequently used as general predictors of short-term mortality in liver disease, our ACLF mortality models substantially improved discrimination and overall performance. The majority of studies predicting ACLF mortality focus on EASL ACLF criteria, as we have done here. Although EASL ACLF grades correlate with 28-day mortality, the early clinical course of ACLF has more prognostic value than the initial grade.33,34 This is important, as numerous studies attempting to model short-term EASL ACLF mortality incorporate the definition itself into models. The largest effort comes from the EASL Consortium, where a cohort of 275 EASL ACLF patients was used to generate a new prognostic score, the CLIF-C ACLF score.5 This incorporated CLIF-C organ failures in addition to age and WBC count. C-statistics for 28 and 90-day mortality were 0.744 and 0.736, respectively. We found similar C-statistics for the CLIF-C ACLF score in our cohort, ranging from 0.74 to 0.78. In a separate study of academic Canadian hospitals, the CLIF-C ACLF score was also used to predict short-term mortality in 867 ACLF patients admitted to the ICU.35 This yielded C-statistics of 0.69 and 0.68 for 28 and 90-day mortality, respectively. The models developed in our study were superior to CLIF-C ACLF in direct comparison at both 28 and 90 days, and offer several additional advantages over CLIF-C ACLF. First, the above studies predominantly comprised patients with alcohol-induced cirrhosis, most of whom developed ACLF on the basis of alcoholic hepatitis. In contrast, the models in our study represent diverse etiologies of liver disease, in particular a substantial proportion of patients with NAFLD, who have been sorely underrepresented in the ACLF literature. Second, our cohort is substantially larger in size, resulting in narrower confidence intervals for all parameter estimates. Third, the CLIF-C ACLF score incorporates the ACLF diagnostic criteria themselves into the risk prediction model, and we agree with existing sentiment that the two must be de-linked,36 as we have done here. Fourth, our models have superior calibration as compared to CLIF-C ACLF, which results directly from using restricted cubic splines to accurately model complex relationships between exposure and outcome. Finally, the VHA dataset has the advantage of spanning the full spectrum of care, from outpatient clinics to academic-affiliated ICUs. These features contribute to high confidence in the models obtained, and improved external validity.
There are several important limitations that we acknowledge in this study. First, as this study utilizes an administrative database, there is possible misclassification of exposures and outcomes. Although we relied primarily on ICD-9/10 codes, laboratory values, and medication administration data to classify ACLF events, we used validated algorithms wherever possible, and were able to employ far more granular data than other published ACLF studies that utilize United Network for Organ Sharing or Nationwide/National Inpatient Sample data. Furthermore, our approach is similar to other recent published ACLF literature using the VHA dataset.37 Regarding outcomes misclassification, ACLF events in patients admitted to non-VHA hospitals would not be captured in the VHA dataset. To address this, we only included patients who were actively engaged in outpatient VHA care. Incomplete ascertainment of ACLF outcomes, however, would result in some underestimation of risk in models. Second, it is possible that liver transplantation biased survival estimates in our study, as we did not perform competing risks analyses. However, in the VHA population liver transplantation is far less common than in other settings (see Supplemental Table 4), making this dataset ideal for studying the natural history of liver diseases. Thus, we believe our prediction models are less susceptible to bias induced by liver transplantation than other published efforts. Third, our models require numerous inputs for accurate predictions: 10 in the case of ACLF development and 8 in the case of ACLF mortality. However, these are fewer than the number of inputs required for the CLIF-C ACLF score (12),38 and all variables in our models are readily available through routine laboratory studies and patient history. Finally, we do not present external validation in an independent cohort. Indeed, while we believe this study reflects a diversity of liver disease etiologies, the VHA dataset may not generalize well in other ways, such as sex distribution, psychiatric comorbidities, and substance abuse.39,40 Using our online calculators, future studies may aim to externally validate these scores in other contexts.
In conclusion, we have developed highly-accurate and accessible models to predict ACLF events in a diverse United States cohort of patients with chronic liver disease. We have also created novel prediction scores for ACLF mortality in this group that exceed the performance of existing gold standards. These models reflect diverse patients with chronic liver disease, and apply to providers in the outpatient, hospital ward, and intensive care settings. The ACLF development risk scores may impact risk factor management, early transplant evaluation referral, and follow-up intervals, while the ACLF mortality risk scores may ultimately direct clinicians to escalate level of care, expedite transplant evaluation, or engage palliative care.
Supplementary Material
Supplemental Figure 1 – Example of Data Modeled with Restricted Cubic Splines
Caption: In panel C, “D” denotes derivation cohort and “V” denotes validation cohort. Plotted points reflect groups across the risk spectrum.
Acknowledgments:
This work was supported by resources and facilities available through the Philadelphia Veterans Affairs Healthcare System and central data repositories maintained by the Veterans Affairs Information Resource Center. The views expressed herein do not reflect the position or policy of the Department of Veterans Affairs or the United States government.
Funding/Acknowledgments: Nadim Mahmud is supported by a National Institutes of Health T32 grant (2-T32-DK007740–21A1).
Footnotes
Disclosures: The authors have no conflicts of interest to disclose.
References
1. Tapper EB, Parikh ND. Mortality due to cirrhosis and liver cancer in the United States, 1999–2016: observational study. BMJ. 2018;362:k2817.
2. Moreau R, Jalan R, Gines P, et al. Acute-on-chronic liver failure is a distinct syndrome that develops in patients with acute decompensation of cirrhosis. Gastroenterology. 2013;144(7):1426–1437. e1429.
3. Sarin SK, Kedarisetty CK, Abbas Z, et al. Acute-on-chronic liver failure: consensus recommendations of the Asian Pacific Association for the Study of the Liver (APASL) 2014. Hepatol Int. 2014;8(4):453–471.
4. Piano S, Tonon M, Vettore E, et al. Incidence, predictors and outcomes of acute-on-chronic liver failure in outpatients with cirrhosis. J Hepatol. 2017;67(6):1177–1184.
5. Jalan R, Saliba F, Pavesi M, et al. Development and validation of a prognostic score to predict mortality in patients with acute-on-chronic liver failure. J Hepatol. 2014;61(5):1038–1047.
6. Williams R, Alexander G, Aspinall R, et al. Gathering momentum for the way ahead: fifth report of the Lancet Standing Commission on Liver Disease in the UK. The Lancet. 2018;[epub ahead of print].
7. Wlodzimirow KA, Eslami S, Abu-Hanna A, Nieuwoudt M, Chamuleau RA. A systematic review on prognostic indicators of acute on chronic liver failure and their predictive value for mortality. Liver Int. 2013;33(1):40–52.
8. Kaplan DE, Dai F, Aytaman A, et al. Development and performance of an algorithm to estimate the Child-Turcotte-Pugh score from a national electronic healthcare database. Clin Gastroenterol Hepatol. 2015;13(13):2333–2341. e2336.
9. Serper M, Taddei TH, Mehta R, et al. Association of provider specialty and multidisciplinary care with hepatocellular carcinoma treatment and mortality. Gastroenterology. 2017;152(8):1954–1964.
10. Goldberg DS, Taddei TH, Serper M, et al. Identifying barriers to hepatocellular carcinoma surveillance in a national sample of patients with cirrhosis. Hepatology. 2017;65(3):864–874.
11. Goldberg DS, French B, Forde KA, et al. Association of distance from a transplant center with access to waitlist placement, receipt of liver transplantation, and survival among US veterans. JAMA. 2014;311(12):1234–1243.
12. Mahmud N, Sundaram V, Kaplan DE, Taddei TH, Goldberg DS. Grade 1 Acute on Chronic Liver Failure is a Predictor for Subsequent Grade 3 Failure. Hepatology. 2019.
13. Mahmud N, Kaplan DE, Taddei TH, Goldberg DS. Incidence and Mortality of Acute on Chronic Liver Failure using Two Definitions in Patients with Compensated Cirrhosis. Hepatology. 2019.
14. Goldberg D, Lewis J, Halpern S, Weiner M, Lo Re III V. Validation of three coding algorithms to identify patients with end-stage liver disease in an administrative database. Pharmacoepidemiol Drug Saf. 2012;21(7):765–769.
15. Lapointe-Shaw L, Georgie F, Carlone D, et al. Identifying cirrhosis, decompensated cirrhosis and hepatocellular carcinoma in health administrative data: A validation study. PLoS One. 2018;13(8):e0201120.
16. Katz IR, McCarthy JF, Ignacio RV, Kemp J. Suicide among veterans in 16 states, 2005 to 2008: comparisons between utilizers and nonutilizers of Veterans Health Administration (VHA) services based on data from the National Death Index, the National Violent Death Reporting System, and VHA administrative records. Am J Public Health. 2012;102(S1):S105–S110.
17. Beste LA, Leipertz SL, Green PK, Dominitz JA, Ross D, Ioannou GN. Trends in burden of cirrhosis and hepatocellular carcinoma by underlying liver disease in US veterans, 2001–2013. Gastroenterology. 2015;149(6):1471–1482. e1475.
18. Bradley KA, DeBenedetti AF, Volk RJ, Williams EC, Frank D, Kivlahan DR. AUDIT-C as a brief screen for alcohol misuse in primary care. Alcoholism: Clinical and Experimental Research. 2007;31(7):1208–1217.
19. Hernaez R, Solà E, Moreau R, Ginès P. Acute-on-chronic liver failure: an update. Gut. 2017;66(3):541–553.
20. Smith GC, Seaman SR, Wood AM, Royston P, White IR. Correcting for optimistic prediction in small data sets. Am J Epidemiol. 2014;180(3):318–324.
21. Choudhury A, Jindal A, Maiwall R, et al. Liver failure determines the outcome in patients of acute-on-chronic liver failure (ACLF): comparison of APASL ACLF research consortium (AARC) and CLIF-SOFA models. Hepatol Int. 2017;11(5):461–471.
22. O’Leary JG, Reddy KR, Garcia-Tsao G, et al. NACSELD acute-on-chronic liver failure (NACSELD-ACLF) score predicts 30-day survival in hospitalized patients with cirrhosis. Hepatology. 2018;67(6):2367–2374.
23. Jie M, Collins GS, Steyerberg EW, Verbakel JY, van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019.
24. Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer; 2015.
25. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology (Cambridge, Mass). 2010;21(1):128.
26. Paul P, Pennell ML, Lemeshow S. Standardizing the power of the Hosmer–Lemeshow goodness of fit test in large data sets. Stat Med. 2013;32(1):67–80.
27. Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited. Crit Care Med. 2007;35(9):2052–2056.
28. Kim HY, Jang JW. Sarcopenia in the prognosis of cirrhosis: going beyond the MELD score. World Journal of Gastroenterology: WJG. 2015;21(25):7637.
29. European Association for the Study of the Liver. EASL Clinical Practice Guidelines for the management of patients with decompensated cirrhosis. J Hepatol. 2018;69(2):406–460.
30. Sundaram V, Jalan R, Wu T, et al. Factors Associated with Survival of Patients With Severe Acute on Chronic Liver Failure Before and After Liver Transplantation. Gastroenterology. 2018.
31. Sundaram V, Jalan R, Ahn JC, et al. Class III obesity is a risk factor for the development of acute-on-chronic liver failure in patients with decompensated cirrhosis. J Hepatol. 2018;69(3):617–625.
32. Mahmud N, Kaplan DE, Taddei TH, Goldberg DS. Reply to Letters to the Editor: Diverging Definitions and Dividing Lines in Acute-on-Chronic Liver Failure. Hepatology.
33. Arroyo V, Moreau R, Jalan R, Ginès P, Study E-CCC. Acute-on-chronic liver failure: a new syndrome that will re-classify cirrhosis. J Hepatol. 2015;62(1):S131–S143.
34. Gustot T, Fernandez J, Garcia E, et al. Clinical course of acute-on-chronic liver failure syndrome and effects on prognosis. Hepatology. 2015;62(1):243–252.
35. Karvellas CJ, Garcia-Lopez E, Fernandez J, et al. Dynamic prognostication in critically ill cirrhotic patients with multiorgan failure in ICUs in Europe and North America: a multicenter analysis. Crit Care Med. 2018;46(11):1783–1791.
36. Bajaj JS, Wong F, Kamath PS. Defining Acute on Chronic Liver Failure: More Elusive Than Ever. Hepatology. 2019;[in press].
37. Hernaez R, Kramer JR, Liu Y, et al. Prevalence and Short-term Mortality of Acute-on-Chronic Liver Failure: a national cohort study from the USA. J Hepatol. 2018.
38. CLIF-C ACLF Calculator. http://www.efclif.com/scientific-activity/score-calculators/clif-c-aclf. Accessed February 20, 2019.
39. Kazis LE, Miller DR, Clark J, et al. Health-related quality of life in patients served by the Department of Veterans Affairs: results from the Veterans Health Study. Arch Intern Med. 1998;158(6):626–632.
40. Fortney JC, Curran GM, Hunt JB, et al. Prevalence of probable mental disorders and help-seeking behaviors among veteran and non-veteran community college students. Gen Hosp Psychiatry. 2016;38:99–104.
