Abstract
BACKGROUND AND AIMS:
Disease progression in children with primary sclerosing cholangitis (PSC) is variable. Prognostic and risk-stratification tools exist for adult-onset PSC, but not for children. We aimed to create a tool that accounts for the biochemical and phenotypic features and early disease stage of pediatric PSC.
APPROACH AND RESULTS:
We used retrospective data from the Pediatric PSC Consortium. The training cohort contained 1,012 patients from 40 centers. We generated a multivariate risk index (Sclerosing Cholangitis Outcomes in Pediatrics [SCOPE] index) that contained total bilirubin, albumin, platelet count, gamma glutamyltransferase, and cholangiography to predict a primary outcome of liver transplantation or death (TD) and a broader secondary outcome that included portal hypertensive, biliary, and cancer complications termed hepatobiliary complications (HBCs). The model stratified patients as low, medium, or high risk based on progression to TD at rates of <1%, 3%, and 9% annually and to HBCs at rates of 2%, 6%, and 13% annually, respectively (P < 0.001). C-statistics to discriminate outcomes at 1 and 5 years were 0.95 and 0.82 for TD and 0.80 and 0.76 for HBCs, respectively. Baseline hepatic fibrosis stage was worse with increasing risk score, with extensive fibrosis in 8% of the lowest versus 100% with the highest risk index (P < 0.001). The model was validated in 240 children from 11 additional centers and performed well.
CONCLUSIONS:
The SCOPE index is a pediatric-specific prognostic tool for PSC. It uses routinely obtained, objective data to predict a complicated clinical course. It correlates strongly with biopsy-proven liver fibrosis. SCOPE can be used with families for shared decision making on clinical care based on a patient’s individual risk, and to account for variable disease progression when designing future clinical trials.
Children with primary sclerosing cholangitis (PSC) are at risk for developing cirrhosis and end-stage liver disease (ESLD). Within 10 years of diagnosis as children, 30% of patients require liver transplantation (LT).(1) Individual patient progression to these outcomes is variable and currently unpredictable. Differentiating patient populations at low or high risk of complications is critical for patient education, clinical management, and trial design.(2) Several predictive models have been derived using data from adult-onset PSC populations, but no consensus exists regarding the optimal model.(3) There is no prognostic or risk-stratification tool for children with PSC.
Important clinical differences exist between pediatric and adult-onset PSC patients. At diagnosis, dominant strictures are present in only 4% of children(1,4) versus as many as 45% of adults.(5) Cholangiocarcinoma (CCA) is rare in children, occurring in 1% by 10 years,(1,4) whereas it occurs in 7%–13% of adults,(6–8) often in the year following diagnosis. A small-duct phenotype is present in up to 20% of children,(1,4) but only 10% of adults.(9,10) Features of overlap with autoimmune hepatitis (AIH) are present in >33% of children,(1,4) but only 7% of adults,(11,12) affecting the predictive utility of models that rely on aspartate aminotransferase (AST). Alkaline phosphatase (ALP) is a key marker of cholestatic liver diseases in adults but, because of bone turnover in growing children, is not a reliable marker of liver disease in children. Other risk factors for liver disease progression, such as chronic alcohol abuse, smoking, obesity, and polypharmacy, are less common in children. Given these differences, it is unclear whether risk models derived from adult patients can or should be generalized to children.(13)
We used data from the Pediatric PSC Consortium to create and validate a pediatric-specific risk stratification and prognostic tool: the Sclerosing Cholangitis Outcomes in Pediatrics (SCOPE) index. We aimed to create a tool with clinical utility for providers, patients, and families by relying on objective, readily available clinical data.
Materials and Methods
DATA SOURCE
The Pediatric PSC Consortium is a retrospective research registry involving 51 sites throughout Europe, North and South America, the Middle East, and Asia.(1) We collected all known cases of PSC with onset before age 18 years that were diagnosed between 1990 and 2019. Data were entered by collaborating physician investigators at each site. We collected demographics, laboratory data, and reports of histopathology, cholangiography, and endoscopy on each patient. PSC diagnosis was based on a cholestatic laboratory profile together with cholangiography showing multifocal stricturing and segmental dilations of the biliary tree, liver biopsy showing periductal, concentric fibrosis, fibro-obliterative cholangitis, or primary ductular involvement, or both.(3) Patients with abnormal cholangiograms were classified as large-duct PSC. Patients with normal cholangiograms but abnormal liver biopsy were classified as small-duct PSC. Features of overlap with AIH were defined as present in patients who met a “probable” or “definite” score on the pediatric-modified simplified AIH criteria.(14) Metavir fibrosis stage was extracted from the trichrome stain on liver histopathology reports of biopsies done within 3 months of PSC diagnosis. Complete laboratory records were available in 86% of patients. For the remainder, to account for missing data, we performed multivariate imputation using iteratively chained equations, combining the results of 10 imputed data sets.
PRIMARY END POINT
We followed all patients from the time of PSC diagnosis to a primary end point of transplant or death (TD), which was the date of listing for LT or date of death from liver disease, whichever was earliest. Date of listing for LT was chosen over the date of LT itself because time from listing to transplantation varies by surgeon, center, and region in ways that are unrelated to disease progression, especially the rate of utilization of living donor transplantation. Listing date and transplant date were both analyzed and were similar, but given that wait times after listing can be long in PSC and follow-up time at pediatric centers before transition to adult programs can be relatively short, using date of listing allowed us to capture outcomes in patients who had not yet undergone transplant before the end of follow-up.
SECONDARY END POINT
Given that cirrhosis and ESLD may precede LT listing or death by months or years, we assessed performance of a model to predict a broader outcome termed hepatobiliary complications (HBCs). This composite end point consisted of the earliest date of any of: (1) diagnosis of portal hypertensive complications (esophageal varices [EV], ascites, or hepatic encephalopathy [HE]); (2) diagnosis of biliary complications (biliary stricture requiring balloon dilation, stenting or external drainage, or hospitalization for acute bacterial cholangitis); (3) diagnosis of CCA; (4) the date of listing for LT; or (5) death from liver disease. Dates of diagnosis of portal hypertensive or biliary complications were included in the composite end point because they represent objective, irreversible points in progression of liver disease. EV were diagnosed when first observed on endoscopy. Ascites was diagnosed when observed on abdominal imaging or at the start of diuretic medications. Encephalopathy was diagnosed clinically when first noted in documentation or at the start of rifaximin or lactulose. Endoscopic or interventional radiology management of strictures was at the local center’s discretion when patients were jaundiced and cholangiography showed prominent strictures with proximal ductal dilation. Acute bacterial cholangitis was diagnosed when patients had an increase in liver biochemistry from baseline with jaundice, fever, right upper quadrant pain, or positive blood culture that responded clinically to antibiotics and/or endoscopic or radiological interventional procedures. We previously showed that 2.8 years is the median survival with native liver after a portal hypertensive or biliary complication is diagnosed in children.(1) Given that most pediatric-onset PSC is identified before any such complications are present, predicting which patients will experience HBCs, not just the primary outcome of TD, is of paramount importance to patients and clinicians.
DERIVATION AND VALIDATION COHORT SELECTION
A total of 1,333 patients from 51 centers were available for model creation. Given that the number of patients and outcomes was sufficiently high to avoid overfitting with a multivariate model, we elected to split our sample into model derivation and model validation cohorts at an ~4:1 ratio. We did this at the level of randomly selected centers, rather than randomly selected patients, to reflect unmeasured differences in center- or country-specific clinical practice. We randomly selected centers, one at a time, and placed all of their patients into a validation cohort until the size exceeded 250 (~20% of all available patients). These observations were set aside for validation and not used in model derivation. All remaining observations served as the model derivation cohort. We eliminated patients from all analyses who had events already present at the time of diagnosis and entry into the database. Person-time was censored at the date of the last known clinical encounter. We used the Kaplan-Meier method to calculate rates of survival each year after diagnosis.
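The center-level split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_by_center`, the `center_sizes` mapping, and the seed are hypothetical, and the authors' actual randomization procedure is not described beyond the text above.

```python
import random


def split_by_center(center_sizes, threshold, seed=0):
    """Randomly draw whole centers into a validation cohort until the
    number of validation patients exceeds `threshold`.

    center_sizes: dict mapping a (hypothetical) center id to its patient count.
    Returns (derivation_centers, validation_centers) as sorted lists.
    """
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    remaining = sorted(center_sizes)   # sort first so the shuffle is deterministic
    rng.shuffle(remaining)

    validation, n_validation = [], 0
    # Draw centers one at a time; stop once the size exceeds the threshold.
    while remaining and n_validation <= threshold:
        center = remaining.pop()
        validation.append(center)
        n_validation += center_sizes[center]

    return sorted(remaining), sorted(validation)


# Toy example: 20 hypothetical centers with 30 patients each.
sizes = {f"center_{i:02d}": 30 for i in range(20)}
derivation, validation = split_by_center(sizes, threshold=250)
```

Splitting on centers rather than patients keeps every patient from a given center in the same cohort, so validation performance also reflects between-center differences in practice.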
VARIABLE SELECTION
Potential predictor variables were chosen a priori, based on earlier studies and clinical experience. We previously showed that total bilirubin, gamma glutamyltransferase (GGT), AST, platelet count, large-duct phenotype, and presence of inflammatory bowel disease (IBD) were important predictors of prognosis at PSC diagnosis.(1,15,16) We additionally assessed hemoglobin, international normalized ratio (INR), ALP, alanine aminotransferase (ALT), sodium, creatinine, age at PSC diagnosis, sex, presence of features of overlap with AIH, and IBD phenotype as predictors. Univariate Cox proportional hazards regression was used to assess the association between each potential baseline predictor and the primary outcome: TD. The proportional hazards assumption was tested graphically in each case. We explored possible interactions between variables and nonlinear (e.g., quadratic) relationships between variables. Predictors and interaction terms with a P value <0.1 in univariate analyses were included in a multivariate model. Backward elimination of variables was then performed in a step-wise fashion until the final model contained only variables with a P value <0.05.
RISK INDEX GENERATION
We used the method of Sullivan et al.(17) to generate a risk index from the final Cox proportional hazards model. Continuous variables were divided into three or four discrete categories, as appropriate, with cutoffs chosen to optimize the individual predictor’s discrimination ability (c-statistic). The number of points assigned to each covariate was its regression coefficient divided by the parameter estimate in the model with the smallest absolute value, rounded to the nearest whole number. Each patient’s final score was calculated as the sum of the total points from each predictor.(18) We assessed the probability of the primary outcome: TD, and the secondary outcome: HBCs, in patients at each risk index score from 0 to the maximum total points. Cutoffs for low-, medium-, and high-risk groups were selected at thresholds where the largest increases in occurrence of TD during follow-up between successive risk-score groups were observed. To ensure that minimal loss of prognostic information occurred with conversion from a full model with continuous variables to a risk index with discrete cutoffs, we compared receiver operating characteristic curves of each strategy. The final SCOPE index was applied to each patient in the derivation and validation cohorts.
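The point-assignment step can be illustrated with a short Python sketch. It assumes that each regression coefficient equals the natural logarithm of the rounded hazard ratio reported in Table 3; the authors' unrounded coefficients are not given, so this is an approximation, and the function name `sullivan_points` and the dictionary keys are hypothetical labels.

```python
import math


def sullivan_points(hazard_ratios):
    """Convert hazard ratios into integer points: each coefficient (ln HR)
    is divided by the smallest absolute coefficient and rounded."""
    coefs = {name: math.log(hr) for name, hr in hazard_ratios.items()}
    base = min(abs(c) for c in coefs.values())
    # Note: Python's round() rounds exact halves to the nearest even integer;
    # none of the ratios below falls on an exact half.
    return {name: round(c / base) for name, c in coefs.items()}


# Rounded hazard ratios as reported in Table 3 (assumed inputs).
table3_hr = {
    "bilirubin 0.7-2.7": 1.89,
    "bilirubin 2.8-4.8": 2.97,
    "bilirubin >=4.9": 6.01,
    "albumin 3.2-3.9": 2.05,
    "albumin <=3.1": 3.60,
    "platelets 136-224": 2.04,
    "platelets <=135": 3.51,
    "ggt 101-249": 3.61,
    "ggt >=250": 5.01,
    "abnormal cholangiography": 2.04,
}
points = sullivan_points(table3_hr)
```

With these assumed inputs, the rounded values agree with the points listed in Table 4 (for example, 3 points for the highest bilirubin category and 2 points for GGT of 101–249 U/L).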
ASSESSING GOODNESS OF FIT
Discriminatory ability of the model was assessed with Harrell’s concordance statistic (c-statistic). The c-statistic expresses the proportion of all possible pairs of patients in a cohort in which the patient with the higher risk index has the shorter survival.(19) C-statistic values range from 0.5 (no discrimination) to 1.0 (perfect discrimination), with values of ≥0.8 generally regarded as “good” and ≥0.9 as “excellent” discrimination.(20) We compared discrimination in the derivation and validation cohorts, and in patients grouped by important demographic and phenotypic variables. C-statistics were calculated for SCOPE for the primary and secondary outcomes.
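To make the pairwise definition concrete, the following is a minimal Python sketch of Harrell's c-statistic for right-censored data. It is not the Stata routine used in the study; the function name `harrell_c` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped for brevity.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance statistic.

    times:  follow-up time for each patient
    events: 1 if the end point occurred, 0 if censored
    scores: risk score (higher means higher predicted risk)
    A pair is comparable when the patient with the shorter time had an event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the "earlier event" of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # higher score, shorter survival
                elif scores[i] == scores[j]:
                    concordant += 0.5   # tied scores count as half
    return concordant / comparable


# Toy data: four hypothetical patients, the third is censored.
c_good = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [8, 5, 4, 1])
c_poor = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [1, 5, 4, 8])
```

In the first call the scores rank patients exactly in order of survival, so every comparable pair is concordant; in the second call the ranking is mostly reversed.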
We evaluated the ability of the risk index to yield accurate survival probabilities for a given patient graphically, by comparing side-by-side plots of survival in the derivation and validation cohorts. We assessed the ability of the risk index to stratify patients into low-, medium-, and high-risk groups with distinctly different survival using the log-rank test. The log-rank test is used to test the null hypothesis that there is no difference between the risk groups in the probability of an event at any time point.(21)
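For readers unfamiliar with the log-rank test, a two-group version can be sketched in plain Python as follows. This is an illustrative implementation under assumptions, not the study's analysis code: the function name `logrank_two_groups` and the toy data are hypothetical, and the comparison of three risk groups in the study would require pairwise or multi-group versions of the same idea.

```python
from math import erfc, sqrt


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, P value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d = sum(1 for u, e, _ in data if u == t and e)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of events in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: identical groups versus clearly separated groups.
same_chi2, same_p = logrank_two_groups([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
diff_chi2, diff_p = logrank_two_groups(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1], [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
)
```

Identical groups give a statistic of zero, whereas groups whose events occur at clearly different times give a large statistic and a small P value.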
Calibration of SCOPE, that is, whether predicted and observed survival were aligned, was assessed graphically. With observed probability of events on the x-axis and predicted probability on the y-axis, perfect predictions should plot on a 45-degree line, with relatively worse predictions falling further from this line. Discrimination and calibration were assessed across risk strata in the derivation and validation cohorts, and in patients grouped by important demographic and phenotypic variables. We also used the Grønnesby and Borgan test,(22) a version of the Hosmer-Lemeshow goodness-of-fit test(23) adapted to survival data, to ensure that observed and predicted probabilities were similar. With patients grouped into deciles of risk score, P values for observed and expected probabilities should be >0.1 in a well-calibrated model.
REPEATABILITY
To assess whether SCOPE performed well when used with data from time points after PSC diagnosis, we assessed goodness of fit in patients who survived without the primary end point for at least 2 years. We entered data from 2 years after diagnosis (±6 weeks) and assessed survival without the primary end point at years 3 through 13 after PSC diagnosis.
COMPARISON TO OTHER MODELS
Several prognosticating tools exist for adult patients with PSC: The Mayo Clinic revised natural history model,(24) the Amsterdam-Oxford model,(25) the UK short- and long-term models,(26) and the most recent, machine-learning–based Primary Sclerosing Cholangitis Risk Estimate Tool (PREsTo).(27) Direct comparison between models and with SCOPE is impossible. Each defines its primary end point differently, and each is designed to make predictions over different time spans (from 2 to 15 years). None were generated with pediatric data. In a general sense, however, each model assesses for ESLD leading to LT. Thus, to compare their utility in children to that of the SCOPE index, we assessed the utility of each model’s risk score to predict the primary outcome of this study, TD, in children in the derivation cohort, annually for 10 years after diagnosis. For the UK long-term model, we used data from 2 years after diagnosis and assessed the primary outcome at years 3–13 after diagnosis.
STATISTICAL ANALYSIS AND ETHICS
Calculations were done using Stata software (version 16; StataCorp LP, College Station, TX). Continuous variables were compared between groups using the Wilcoxon rank-sum test, and discrete variables were compared using the chi-squared test. All research work was approved by the institutional review board of each participating center. All research was conducted in accordance with the Declaration of Helsinki guidelines of good practice. Since we collected only retrospective, de-identified data, informed consent was waived at each center.
Results
COHORT CHARACTERISTICS
A total of 1,333 patients at 51 centers were available for analysis. Patients were 39% female, had a median age of 12.7 years at diagnosis, and had a median of 4.1 years of follow-up. A large-duct phenotype was present in 88%, IBD in 78%, and features of overlap with AIH in 32%. After randomly assigning 11 centers with 256 patients to the validation cohort and the remaining 40 centers with 1,077 patients to the derivation cohort, we excluded 81 patients with end points already present at diagnosis and entry into the database: 6% (16 of 256) of the validation cohort and 6% (65 of 1,077) of the derivation cohort. The final validation cohort contained 240 patients, and the final derivation cohort contained 1,012 patients, as shown in Supporting Information Appendix S1. In 6,914 person-years of follow-up, 98 patients were listed for LT at a median Model for End-Stage Liver Disease (MELD) score of 14 (interquartile range [IQR], 10–21), 83 were successfully transplanted at a median MELD score of 22 (IQR, 16–29), 2 died, and 13 were awaiting transplantation at the end of follow-up. Five patients died of liver disease before being listed for transplantation. Of patients with an initial small-duct PSC phenotype and at least 5 years of follow-up, 15% (21 of 151) eventually developed large-duct lesions and complications. Detailed characteristics of the two cohorts are noted in Table 1. The validation cohort contained fewer patients of Asian race and more patients with a small-duct phenotype, and its patients progressed to HBCs less frequently. Centers in each cohort are listed in Supporting Information Appendix S2.
TABLE 1.
Baseline Patient Characteristics at PSC Diagnosis in Derivation vs. Validation Cohorts
Characteristic | Derivation (n = 1,012; 40 centers) | Validation (n = 240; 11 centers) | P Value |
---|---|---|---|
Phenotype | |||
Age at diagnosis (years) | 12.6 [8.8–15.0] | 13.2 [9.9–16.0] | 0.030 |
Sex (% female) | 40 | 38 | 0.568 |
Race/ethnicity | |||
White, non-Hispanic/Latino (%) | 73% | 84% | <0.001 |
Hispanic/Latino (%) | 9% | 8% | 0.600 |
Black (%) | 7% | 6% | 0.411 |
Asian (%) | 11% | 2% | <0.001 |
IBD phenotype | |||
Ulcerative colitis/IBD-U | 61% | 56% | 0.135 |
Crohn’s disease | 17% | 18% | 0.404 |
No IBD | 22% | 26% | 0.013 |
Features of overlap with AIH (% with) | 32 | 33 | 0.105 |
Large-duct phenotype (% with) | 76 | 53 | <0.001 |
Biochemistry | |||
Hemoglobin (g/dL) | 12.4 [11.3–13.5] | 12.3 [10.2–13.4] | 0.094 |
Platelets (k/μL) | 322 [237–396] | 309 [254–386] | 0.306 |
INR | 1.1 [1.0–1.2] | 1.1 [1.0–1.1] | 0.099 |
Albumin (g/dL) | 4 [3.6–4.3] | 4.1 [3.7–4.4] | 0.041 |
GGT (U/L) | 234 [115–375] | 205 [104–324] | 0.049 |
ALP (U/L) | 377 [228–668] | 312 [186–472] | <0.001 |
AST (U/L) | 106 [47–178] | 88 [40–156] | 0.026 |
ALT (U/L) | 133 [66–230] | 117 [48–215] | 0.026 |
Total bilirubin (mg/dL) | 0.6 [0.4–1.2] | 0.5 [0.4–0.8] | 0.002 |
Sodium (meq/L) | 139 [137–141] | 139 [137–141] | 0.994 |
Creatinine (mg/dL) | 0.6 [0.5–0.7] | 0.6 [0.5–0.7] | 0.392 |
Liver biopsy fibrosis stage | |||
Metavir stage 0–2 (%) | 65 | 63 | 0.204 |
Metavir stage 3–4 (%) | 35 | 37 | 0.204 |
Medications | |||
Ursodeoxycholic acid | 74% | 85% | <0.001 |
Oral vancomycin therapy | 12% | 8% | 0.187 |
Clinical events (5-year rates) | |||
Endoscopic biliary procedure (stent/balloon/drain) | 9% (n = 68) | 6% (n = 9) | 0.179 |
Hospitalization for bacterial cholangitis | 5% (n = 36) | 3% (n = 4) | 0.234 |
Detection of EV | 17% (n = 121) | 13% (n = 22) | 0.155 |
Variceal hemorrhage (VH) | 1% (n = 17) | 1% (n = 2) | 0.496 |
CCA | <1% (n = 1) | 0% (n = 0) | 0.967 |
Listing for LT | 12% (n = 81) | 9% (n = 17) | 0.731 |
LT | 11% (n = 70) | 8% (n = 13) | 0.602 |
Death from liver disease | 1% (n = 4) | 0% (n = 0) | 0.226 |
Any event | 25% (n = 189) | 19% (n = 35) | 0.110 |
Data presented as median [IQR] or proportion.
MODEL CREATION IN THE TRAINING COHORT
Associations between predictor variables and the primary outcome of TD were tested in univariate analysis, as noted in Table 2. Features of overlap with AIH (vs. absence), presence of a large-duct phenotype (vs. small-duct phenotype), hemoglobin, platelet count, total protein, albumin, GGT, ALP, total bilirubin, and sodium were associated with the outcome at P < 0.1 and were included in the multivariate model. After backward step-wise elimination of predictors with a multivariate P value >0.05, the final model contained large- versus small-duct phenotype (renamed abnormal vs. normal cholangiography), platelet count, albumin, GGT, and total bilirubin. Weighting of each predictor range yielded the SCOPE index. Optimal discrete cutoffs and associated hazard ratios (HRs) of each predictor are noted in Table 3. The final SCOPE index ranged from 0 to 11, as shown in Table 4. Patients with SCOPE of 0–3 were classified as low risk and developed the primary outcome at a rate of <1% per year. Patients with SCOPE of 4–5 were classified as medium risk and developed the primary outcome at a rate of 3% per year. Patients with SCOPE of 6–11 were classified as high risk and developed the primary outcome at a rate of 9% per year. Figure 1 illustrates the annual probability of the primary and secondary outcomes by SCOPE index for 5 years following diagnosis. After calculating an individual patient’s SCOPE index, users can reference this figure to understand the corresponding risk of events. Each 1-point increase in the SCOPE index increased the hazard of HBCs by 81% (HR, 1.81; 95% confidence interval, 1.64–1.99; P < 0.001). Information loss in the model was minimal when converting from continuous variables to discrete cutoffs in a risk index, as shown in Supporting Information Appendix S3. Bilirubin was the most discriminatory variable in SCOPE, whereas cholangiography was the least, as noted in Supporting Information Appendix S4.
TABLE 2.
Univariate Predictors of LT Listing or Liver-Related Death After PSC Diagnosis
Predictor | HR | P Value |
---|---|---|
Age at diagnosis (per 1.0 year) | 1.03 | 0.161 |
Male sex (vs. female) | 0.83 | 0.373 |
Presence of IBD (vs. no IBD) | 1.01 | 0.311 |
Ulcerative colitis phenotype (vs. no IBD) | 0.93 | 0.720 |
Crohn’s disease phenotype (vs. no IBD) | 0.92 | 0.752 |
Features of overlap with AIH (vs. absence) | 0.68 | 0.096 |
Large-duct phenotype (vs. small duct) | 1.76 | 0.038 |
Hemoglobin (per 1.0-g/dL increase) | 0.89 | 0.036 |
Platelet count (per 100,000/μL increase) | 0.62 | <0.001 |
INR (per 1.0 increase) | 1.07 | 0.572 |
Total protein (per 1.0-g/dL increase) | 0.71 | 0.008 |
Albumin (per 1.0-g/dL increase) | 0.37 | <0.001 |
GGT (per 100 U/L) | 1.10 | <0.001 |
ALP (per 1.0 × ULN for age) | 1.16 | 0.001 |
AST (per 100 U/L) | 1.02 | 0.460 |
ALT (per 100 U/L) | 0.99 | 0.549 |
Total bilirubin (per 1.0-mg/dL increase) | 1.16 | <0.001 |
Sodium (per 1-meq/L increase) | 1.07 | 0.062 |
Creatinine (per 1.0-mg/dL increase) | 0.41 | 0.123 |
Abbreviation: ULN, upper limit of normal.
TABLE 3.
Final Cox Proportional Hazards Regression Model for Risk of HBCs After Diagnosis of PSC
Predictor | Reference | HR | P Value |
---|---|---|---|
Total bilirubin 0.7–2.7 mg/dL | ≤0.6 mg/dL | 1.89 | 0.041 |
Total bilirubin 2.8–4.8 mg/dL | ≤0.6 mg/dL | 2.97 | 0.007 |
Total bilirubin ≥4.9 mg/dL | ≤0.6 mg/dL | 6.01 | <0.001 |
Albumin 3.2–3.9 g/dL | ≥4 g/dL | 2.05 | 0.010 |
Albumin ≤3.1 g/dL | ≥4 g/dL | 3.60 | <0.001 |
Platelets 136,000–224,000/μL | ≥225,000/μL | 2.04 | 0.026 |
Platelets ≤135,000/μL | ≥225,000/μL | 3.51 | <0.001 |
GGT 101–249 U/L | ≤100 U/L | 3.61 | 0.010 |
GGT ≥250 U/L | ≤100 U/L | 5.01 | 0.001 |
Abnormal cholangiography | Normal | 2.04 | 0.040 |
TABLE 4.
The SCOPE Index
Attribute | Value (traditional units) | Value (SI units) | Points |
---|---|---|---|
Total bilirubin (mg/dL; μmol/L) | ≤0.6 | ≤11 | 0 |
Total bilirubin | 0.7–2.7 | 11.1–46.7 | +1 |
Total bilirubin | 2.8–4.8 | 46.8–82.4 | +2 |
Total bilirubin | ≥4.9 | ≥82.5 | +3 |
Albumin (g/dL; g/L) | ≥4 | ≥40 | 0 |
Albumin | 3.2–3.9 | 32–39 | +1 |
Albumin | ≤3.1 | ≤31 | +2 |
Platelet count (×10³/μL; ×10⁹/L) | ≥225 | ≥225 | 0 |
Platelet count | 136–224 | 136–224 | +1 |
Platelet count | ≤135 | ≤135 | +2 |
GGT (U/L) | ≤100 | ≤100 | 0 |
GGT | 101–249 | 101–249 | +2 |
GGT | ≥250 | ≥250 | +3 |
Cholangiography* | Normal | Normal | 0 |
Cholangiography* | Large-duct involvement | Large-duct involvement | +1 |
Total: low risk | | | 0–3 |
Total: medium risk | | | 4–5 |
Total: high risk | | | 6–11 |
The index is the sum of the points attributed to each of the patient’s total bilirubin, albumin, platelet count, GGT, and cholangiography.
*Positive findings of multifocal large-duct beading and strictures on magnetic resonance or endoscopic retrograde cholangiopancreatography.
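As an illustration of how the index in Table 4 is applied, the scoring can be written as a short Python function. This is a sketch for exposition only, not a validated clinical calculator: the function and parameter names are hypothetical, inputs are in traditional units, and values falling between the printed category bounds (e.g., bilirubin of 0.65 mg/dL) are assigned to the lower-risk category, which is an assumption.

```python
def scope_index(bilirubin_mg_dl, albumin_g_dl, platelets_k_per_ul,
                ggt_u_l, abnormal_cholangiography):
    """Sum the Table 4 points for one patient (traditional units)."""
    points = 0

    # Total bilirubin (mg/dL): <=0.6 -> 0, 0.7-2.7 -> 1, 2.8-4.8 -> 2, >=4.9 -> 3
    if bilirubin_mg_dl >= 4.9:
        points += 3
    elif bilirubin_mg_dl >= 2.8:
        points += 2
    elif bilirubin_mg_dl >= 0.7:
        points += 1

    # Albumin (g/dL): >=4 -> 0, 3.2-3.9 -> 1, <=3.1 -> 2
    if albumin_g_dl < 3.2:
        points += 2
    elif albumin_g_dl < 4.0:
        points += 1

    # Platelet count (x10^3/uL): >=225 -> 0, 136-224 -> 1, <=135 -> 2
    if platelets_k_per_ul < 136:
        points += 2
    elif platelets_k_per_ul < 225:
        points += 1

    # GGT (U/L): <=100 -> 0, 101-249 -> 2, >=250 -> 3
    if ggt_u_l >= 250:
        points += 3
    elif ggt_u_l >= 101:
        points += 2

    # Cholangiography: large-duct involvement adds 1 point
    if abnormal_cholangiography:
        points += 1

    return points


def risk_group(index):
    """Map a SCOPE index (0-11) to the risk strata in Table 4."""
    if index <= 3:
        return "low"
    if index <= 5:
        return "medium"
    return "high"


# Hypothetical patient: mildly abnormal laboratory values, large-duct disease.
example_index = scope_index(1.0, 3.5, 300, 150, True)   # 1 + 1 + 0 + 2 + 1
example_group = risk_group(example_index)
```

The maximum attainable score is 3 + 2 + 2 + 3 + 1 = 11, matching the stated range of the index.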
FIG. 1.
What does the SCOPE index mean for future complications?
GOODNESS OF FIT AND VALIDATION
SCOPE had good to excellent discrimination of the primary outcome at all time points in the derivation and validation cohorts, with Harrell c-statistics of 0.95 versus 0.88 at 1 year, 0.82 versus 0.83 at 5 years, and 0.80 versus 0.77 at 10 years, respectively. Discrimination ability was unchanged in demographic and phenotypic subtypes of PSC. The c-statistic at 5 years after diagnosis was >0.80 for males, females, patients younger or older than 12.5 years, small-duct disease, large-duct disease, patients with or without IBD, and patients with or without features of overlap with AIH, as shown in Supporting Information Appendix S5. SCOPE performed equally well when laboratory studies from 2 years after diagnosis were input to assess occurrence of LT listing or liver-related death at years 3–13 after diagnosis, with c-statistics at all time points >0.8.
SCOPE also discriminated the secondary outcome well, in the derivation and validation cohorts, with c-statistics of 0.8 versus 0.84 at 1 year, 0.76 versus 0.83 at 5 years, and 0.75 versus 0.80 at 10 years, respectively. SCOPE stratified patients into low-, medium-, and high-risk groups in similar proportions and with similar survival in the derivation and validation cohorts, with log-rank P between all group combinations <0.001, as detailed in Fig. 2A,B. SCOPE was well calibrated in all risk groups in the derivation and validation cohorts and between demographic and phenotypic subtypes, as detailed in Fig. 2C,D. P values in the Grønnesby and Borgan test of fit were >0.19 in all deciles of risk, indicating that predicted and observed probabilities were similar.
FIG. 2.
SCOPE model performance and validation. (A) Survival without LT listing after diagnosis of PSC in the derivation cohort and (B) validation cohort. Log-rank P value between all groups in both cohorts <0.001. (C) Calibration of SCOPE model for LT listing or liver-related death within 5 years of PSC diagnosis by risk strata in derivation and validation cohorts and (D) by demographic and phenotypic subtypes.
RISK INDEX AND FIBROSIS
Liver biopsies obtained within 3 months of diagnosis were available in 610 patients (from derivation and validation cohorts combined). Metavir fibrosis stage was F0 in 9%, F1 in 26%, F2 in 29%, F3 in 21%, and F4 (cirrhosis) in 15%. Increasing SCOPE index correlated strongly with worse fibrosis stage, as shown in Fig. 3. Extensive fibrosis (F3 or F4) was present in 8% of patients with a SCOPE index of 0 and 100% of patients with a SCOPE index ≥8 (P < 0.001). Transient elastography was obtained within 3 months of diagnosis in 113 patients. Median liver stiffness increased progressively in each risk group: 3.8 kilopascals (kPa; IQR, 2.7–5.4) in low-risk patients, 6.4 kPa (IQR, 3.5–10.6) in medium-risk patients, and 11.0 kPa (IQR, 6.4–16.6) in high-risk patients, as shown in Supporting Information Appendix S6.
FIG. 3.
Metavir fibrosis stage on liver biopsy within 3 months of diagnosis of PSC by SCOPE index.
COMPARISON WITH ADULT-DERIVED PREDICTIVE MODELS
SCOPE using biochemistry at diagnosis or using biochemistry at 2 years showed superior discrimination for TD compared to the Mayo Clinic revised natural history model, the Amsterdam-Oxford model, the UK short- and long-term models, and PREsTo, as shown in Fig. 4.
FIG. 4.
A comparison of model discrimination of LT listing or liver-related death over time in the pediatric PSC derivation cohort. *Data from 2 years after diagnosis were entered into the model for patients who survived this long, and outcomes were date shifted so that years 3–13 in these patients can be plotted as years 1–10.
Discussion
We created the SCOPE index, a risk-stratification tool to predict liver-related outcomes in children and adolescents with PSC. The SCOPE index was derived from the largest-ever cohort of pediatric PSC patients. This work is important for multiple reasons. First, we showed that routinely obtained, objective biochemical markers can accurately predict a child’s clinical course. Second, we demonstrated the validity of the SCOPE index in a diverse global cohort, showing that this tool will be useful in varied clinical settings. Third, we demonstrated that the SCOPE index correlates well with biopsy-proven fibrosis stage. Fourth, we showed demonstrable heterogeneity in PSC outcomes that must be accounted for in study designs.
The SCOPE index has excellent discrimination ability, calibration characteristics, repeatability, and validity in a variety of centers. There are currently no data sets of children with PSC that compare in size to the Pediatric PSC Consortium, and there are no other pediatric risk tools to compare with SCOPE. Previously, clinicians and families have relied on adult-derived prognostic tools to estimate risk in children with PSC, but these tools were never designed for this purpose. We previously showed calibration problems with some of the most commonly used adult PSC prognostic models when pediatric data were entered.(13) SCOPE outperformed these models in our analysis and was superior at discriminating the future need for transplant or risk of death. Importantly for children, the SCOPE index does not rely on ALP or AST, as all of the adult-derived models do. ALP is highly variable in children because of bone turnover and growth and is thus an unreliable marker of hepatobiliary inflammation.(28) GGT is a better marker of cholestatic liver disease in children and is routinely obtained by pediatric clinicians. AST is also variable in children because of a high prevalence of features of overlap with AIH,(13) a phenotype that does not impact overall prognosis.(1)
LT and liver-related death are generally the outcome of interest in adult-derived prognostic tools because so many adults with PSC already have EV, variceal hemorrhage (VH), dominant stricture, and more-advanced liver disease at diagnosis. In contrast, children rarely present with these complications. The ability of SCOPE to also predict the broader outcome of HBCs is particularly relevant for children. Inclusion of portal hypertensive and biliary stricture events in the combined end point of HBCs captures a larger group of patients with progressive liver outcomes than TD alone. For most patients, portal hypertensive and biliary complications are the earliest signs of a clinical course that will progress to TD. Rather than functioning as another transplant triage tool for patients with already advanced liver disease, analogous to the MELD,(29) risk stratification with SCOPE is identifying patients who have disease that is at an early stage, yet is deemed higher risk because their disease is highly progressive.
Total bilirubin, serum albumin, and platelet count were the strongest predictors of long-term outcome in pediatric-onset PSC. These predictors have been identified repeatedly in numerous studies in adults with PSC(24,25,30–37) and are surrogates of hepatobiliary fibrosis in virtually every other liver disease. GGT added minimal discrimination to the model, but is also an established prognostic indicator. Although the model contains no new or unique predictive variables, the weighting is tuned specifically to a pediatric population. Prediction tools using biomarkers more directly related to the pathogenesis of PSC may be possible in the future. Such tools would need to prove that they outperform SCOPE, however, which works remarkably well using routinely obtained tests.
Cholangiography showing large-duct involvement was also a predictor in the model, though it added little to the model’s performance compared to biochemical markers. Not all patients with large-duct strictures have progression of disease, and those who do show progressive aberrations in bilirubin and the other biochemical predictors, ultimately making biochemistry better at discriminating outcomes. It is possible that inclusion of more-detailed cholangiographic predictors that classify the degree of intra- and extrahepatic biliary lesions would further refine our model, as was described in adult patients.(35,37) Availability of equipment, software protocols, and pediatric radiologists with expertise in PSC was highly variable among this global cohort, so the present SCOPE index, with a binary positive or negative cholangiography predictor, was felt to be the most broadly applicable to the most patients.
It is unclear whether small-duct PSC is a true phenotype in children or rather an early stage of disease that will eventually progress to large-duct PSC. In our series, 15% of patients labeled small-duct PSC, followed for at least 5 years, progressed to large-duct PSC complications. An adult series with long-term follow-up similarly showed that 23% of patients with small-duct PSC developed large-duct lesions over a median of 7 years.(38) It is also possible that some patients labeled as small-duct PSC are misclassified. Magnetic resonance cholangiography on modern 3-Tesla scanners is preferred over 1.5-Tesla scanners for PSC(39) because of superior spatial resolution that may be more sensitive to duct abnormalities.(39) Such scanners are not routinely used at most pediatric centers, however. Also, progressive familial intrahepatic cholestasis type 3 shares histological similarities with small-duct PSC, but genetic testing is not routinely performed at all centers. Prospective analysis of the small-duct phenotype with serial cholangiography is needed to further elucidate this phenotype.
We expect that the main utility of the SCOPE index will be helping pediatric patients, families, and providers understand a child’s current risk stratum. Families are often frightened when a child is diagnosed with a progressive liver disease that has no effective therapy. Most information found through Internet searches on PSC refers to adult patients, who generally have a much worse prognosis and far more complications. Children in the low-risk SCOPE group have a very small probability of developing life-threatening liver events or needing an LT in the subsequent decade. SCOPE provides a quantitative tool for explaining this to families and guiding providers. Such information should be considered in shared decision making between patient and clinician when considering clinical and laboratory monitoring frequency, cost, and utility of repeat liver biopsy or cross-sectional imaging to serially assess fibrosis, the risk/benefit ratio of unproven and off-label therapeutic regimens, and the applicability of new studies and clinical trials performed in adults. Future practice guidelines for children could be based around this information with recommendations specific to each SCOPE index or risk tier.
Medium- and high-risk patients with SCOPE index ≥4 should be referred to LT centers for more-frequent subspecialty evaluation. They may benefit from early nutritional optimization, focused screening for micro- and macronutrient deficiencies, verification of immunity and updating of vaccinations in anticipation of possible future transplant listing, and screening for minimal hepatic encephalopathy (HE). More research is needed to expand our understanding of the occurrence of PSC-related complications, like CCA or colorectal cancer, within each risk stratum.
The SCOPE index will also be useful for clinical trial design and enrollment. We showed broad heterogeneity in PSC outcomes on an individual patient level. Randomizing to treatment within SCOPE index or risk strata would improve a future pediatric trial. Targeted enrollment of medium-risk patients who have a reasonable probability of clinical events within the time frame of a clinical trial, but who likely do not yet have ESLD, may be optimal to allow for smaller sample sizes. Treatment effects must also be interpreted in the context of a participant’s baseline SCOPE index. Without risk stratification, a large proportion of children who do well in a trial are likely low risk and likely would have had favorable outcomes on no therapy. Because of the relative rarity of PSC in children, most future research will be observational or small, uncontrolled, prospective trials. It will be especially important for such publications to report the SCOPE index of patients and stratify results by risk groups to help interpret the magnitude of benefit.
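As an illustration of randomizing within risk strata, the sketch below assigns participants to two arms using permuted blocks held separately for each SCOPE risk group, so that the arms stay balanced within each group. It is a hypothetical example and not a protocol from this study: the function name `stratified_block_randomize`, the block size of 4, the 1:1 allocation, and the patient identifiers are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_block_randomize(patients, block_size=4, seed=0):
    """Assign arms within each risk stratum using permuted blocks.

    patients : list of (patient_id, stratum) tuples, in enrollment order
    Returns a dict mapping patient_id -> "treatment" or "control".
    """
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments left in each stratum's block
    allocation = {}
    for patient_id, stratum in patients:
        if not pending[stratum]:
            # Start a fresh, shuffled block for this stratum.
            block = ["treatment", "control"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        allocation[patient_id] = pending[stratum].pop()
    return allocation


# Hypothetical enrollment list: 8 patients in each risk group.
patients = [(f"p{i}", tier) for i, tier in enumerate(["low", "medium", "high"] * 8)]
allocation = stratified_block_randomize(patients, block_size=4, seed=1)
for tier in ("low", "medium", "high"):
    arms = [allocation[pid] for pid, t in patients if t == tier]
    print(tier, arms.count("treatment"), arms.count("control"))
```

Because every completed block contains equal numbers of each arm, the arms within a stratum can differ by at most half a block at any point during enrollment.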
Although most pediatric providers and families currently use GGT to follow disease activity, the SCOPE index may provide a more useful surrogate. We previously established that GGT normalization by 1 year is an important prognostic factor in pediatric PSC.(16) However, GGT levels fluctuate widely and often, especially early in the disease course in children with PSC. It is likely that GGT normalization is confounded by a patient’s stage of hepatobiliary fibrosis. Low-risk SCOPE patients generally have minimal fibrosis and slow disease progression and frequently normalize GGT on or off therapy. Total bilirubin, serum albumin, and platelet count appear to be better surrogate markers of hepatobiliary fibrosis and are more important to long-term prognosis than the biliary inflammation reflected in GGT.
The strengths of this study include the large size of our patient cohort. Despite the relative rarity of PSC in children compared to adults, our derivation cohort was larger than that of any previous adult-based, model-creation effort to date. Patients came from a mix of population-based centers, secondary- and tertiary-care centers, and centers with and without LT programs, broadly representing patients with PSC from many settings. SCOPE used objective, reproducible, and routinely obtained predictors and important clinical end points. We confirmed the utility of the model by validating it among a cohort of patients from centers not used in the model derivation. Weaknesses include relatively limited long-term follow-up data given that most children see a pediatrician for ~5 years before they move and/or transition to an adult gastroenterologist around age 18. Misclassification of patients was possible given that diagnostic workup was not uniform at each site. Whereas prospective data are preferred, it was only feasible to capture such a large number of children with a rare disease retrospectively. Prospective validation of SCOPE will occur as part of a planned multicenter, prospective PSC study in the Childhood Liver Diseases Research Network (NCT04181138).
In conclusion, we created the SCOPE index: a simple, validated pediatric PSC risk tool that can be used to predict probability of HBCs in children. It relies on routinely obtained, objective tests. It has excellent discrimination and calibration characteristics and identifies three distinct risk groups of patients with different outcomes. It correlates strongly with a patient’s underlying biopsy-proven fibrosis. Patients and providers can use this tool together to share decision making about frequency of follow-up, risk/benefit ratio of the use of off-label agents, enrollment in clinical trials, the applicability of results of studies in adults with PSC, and the need for transplantation referral. The SCOPE index is a powerful tool to predict individual patient risk in pediatric PSC.
Acknowledgments
Research reported in this publication was supported by PSC Partners Seeking A Cure, the Primary Children’s Hospital Foundation, and the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Numbers KL2TR001065 and 8UL1TR000105 (formerly UL1RR025764). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Abbreviations:
- ALP: alkaline phosphatase
- AIH: autoimmune hepatitis
- ALT: alanine aminotransferase
- AST: aspartate aminotransferase
- CCA: cholangiocarcinoma
- ESLD: end-stage liver disease
- EV: esophageal varices
- GGT: gamma glutamyltransferase
- HBCs: hepatobiliary complications
- HR: hazard ratio
- IBD: inflammatory bowel disease
- INR: international normalized ratio
- IQR: interquartile range
- kPa: kilopascals
- LT: liver transplantation
- MELD: Model for End-Stage Liver Disease
- PREsTo: Primary Sclerosing Cholangitis Risk Estimate Tool
- PSC: primary sclerosing cholangitis
- SCOPE: Sclerosing Cholangitis Outcomes in Pediatrics
- TD: transplantation or death
Footnotes
Supporting Information
Additional Supporting Information may be found at onlinelibrary.wiley.com/doi/10.1002/hep.31393/suppinfo.
Potential conflict of interest: Dr. Schwarz consults for and received grants from Roche/Genentech and Gilead. She consults for Up To Date. Dr. Mack consults for Albireo. Dr. Kamath consults for and received grants from Mirum and Albireo. Dr. Kerkar consults for High Tide. Dr. Loomes consults for and received grants from Mirum and Albireo. Dr. Miloh consults for and is on the speakers’ bureau for Alexion. Dr. Soufi received grants from Albireo, Mirum, and Gilead.
REFERENCES
- 1) Deneau MR, El-Matary W, Valentino PL, Abdou R, Alqoaer K, Amin M, et al. The natural history of primary sclerosing cholangitis in 781 children: a multicenter, international collaboration. Hepatology 2017;66:518–527.
- 2) Trivedi PJ, Corpechot C, Pares A, Hirschfield GM. Risk stratification in autoimmune cholestatic liver diseases: opportunities for clinicians and trialists. Hepatology 2016;63:644–659.
- 3) Chapman R, Fevery J, Kalloo A, Nagorney DM, Boberg KM, Shneider B, et al. Diagnosis and management of primary sclerosing cholangitis. Hepatology 2010;51:660–678.
- 4) Valentino PL, Wiggins S, Harney S, Raza R, Lee CK, Jonas MM. The natural history of primary sclerosing cholangitis in children: a large single-center longitudinal cohort study. J Pediatr Gastroenterol Nutr 2016;63:603–609.
- 5) Bjornsson E, Lindqvist-Ottosson J, Asztely M, Olsson R. Dominant strictures in patients with primary sclerosing cholangitis. Am J Gastroenterol 2004;99:502–508.
- 6) Burak K, Angulo P, Pasha TM, Egan K, Petz J, Lindor KD. Incidence and risk factors for cholangiocarcinoma in primary sclerosing cholangitis. Am J Gastroenterol 2004;99:523–526.
- 7) Kornfeld D, Ekbom A, Ihre T. Survival and risk of cholangiocarcinoma in patients with primary sclerosing cholangitis. A population-based study. Scand J Gastroenterol 1997;32:1042–1045.
- 8) Bergquist A, Ekbom A, Olsson R, Kornfeldt D, Lööf L, Danielsson Å, et al. Hepatic and extrahepatic malignancies in primary sclerosing cholangitis. J Hepatol 2002;36:321–327.
- 9) Angulo P, Maor-Kendler Y, Lindor KD. Small-duct primary sclerosing cholangitis: a long-term follow-up study. Hepatology 2002;35:1494–1500.
- 10) Bjornsson E, Boberg KM, Cullen S, Fleming K, Clausen OP, Fausa O, et al. Patients with small duct primary sclerosing cholangitis have a favourable long term prognosis. Gut 2002;51:731–735.
- 11) van Buuren HR, van Hoogstraten HJE, Terkivatan T, Schalm SW, Vleggaar FP. High prevalence of autoimmune hepatitis among patients with primary sclerosing cholangitis. J Hepatol 2000;33:543–548.
- 12) Kaya M, Angulo P, Lindor KD. Overlap of autoimmune hepatitis and primary sclerosing cholangitis: an evaluation of a modified scoring system. J Hepatol 2000;33:537–542.
- 13) Deneau M, Valentino P, Mack C, Alqoaer K, Amin M, Amir AZ, et al. Assessing the validity of adult-derived prognostic models for primary sclerosing cholangitis outcomes in children. J Pediatr Gastroenterol Nutr 2020;70:e12–e17.
- 14) Mileti E, Rosenthal P, Peters MG. Validation and modification of simplified diagnostic criteria for autoimmune hepatitis in children. Clin Gastroenterol Hepatol 2012;10:417–421.e1–e2.
- 15) Deneau M, Perito E, Ricciuto A, Gupta N, Kamath BM, Palle S, et al. Ursodeoxycholic acid therapy in pediatric primary sclerosing cholangitis: predictors of gamma glutamyltransferase normalization and favorable clinical course. J Pediatr 2019;209:92–96.
- 16) Deneau MR, Mack C, Abdou R, Amin M, Amir A, Auth M, et al. Gamma glutamyltransferase reduction is associated with favorable outcomes in pediatric primary sclerosing cholangitis. Hepatol Commun 2018;2:1369–1378.
- 17) Sullivan LM, Massaro JM, D’Agostino RB Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med 2004;23:1631–1660.
- 18) van Walraven C, Dhalla IA, Bell C, Etchells E, Stiell IG, Zarnke K, et al. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community. CMAJ 2010;182:551–557.
- 19) Harrell FE Jr., Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA 1982;247:2543–2546.
- 20) Caetano SJ, Sonpavde G, Pond GR. C-statistic: a brief explanation of its construction, interpretation and limitations. Eur J Cancer 2018;90:130–132.
- 21) Bland JM, Altman DG. The logrank test. BMJ 2004;328:1073.
- 22) Gronnesby JK, Borgan O. A method for checking regression models in survival analysis based on the risk score. Lifetime Data Anal 1996;2:315–328.
- 23) Hosmer DW, Lemeshow S. Goodness of fit tests for the multiple logistic regression model. Commun Stat Theory Methods 1980;9:1043–1069.
- 24) Kim WR, Therneau TM, Wiesner RH, Poterucha JJ, Benson JT, Malinchoc M, et al. A revised natural history model for primary sclerosing cholangitis. Mayo Clin Proc 2000;75:688–694.
- 25) de Vries EM, Wang J, Williamson KD, Leeflang MM, Boonstra K, Weersma RK, et al. A novel prognostic model for transplant-free survival in primary sclerosing cholangitis. Gut 2018;67:1864–1869.
- 26) Goode EC, Clark AB, Mells GF, Srivastava B, Spiess K, Gelson WTH, et al. Factors associated with outcomes of patients with primary sclerosing cholangitis and development and validation of a risk scoring system. Hepatology 2019;69:2120–2135.
- 27) Eaton JE, Vesterhus M, McCauley BM, Atkinson EJ, Schlicht EM, Juran BD, et al. Primary Sclerosing Cholangitis Risk Estimate Tool (PREsTo) predicts outcomes of the disease: a derivation and validation study using machine learning. Hepatology 2020;71:214–224.
- 28) Turan S, Topcu B, Gökce İ, Güran T, Atay Z, Omar A, et al. Serum alkaline phosphatase levels in healthy children and evaluation of alkaline phosphatase z-scores in different types of rickets. J Clin Res Pediatr Endocrinol 2011;3:7–11.
- 29) Kim WR, Lake JR, Smith JM, Schladt DP, Skeans MA, Harper AM, et al. OPTN/SRTR 2014 annual data report: liver. Am J Transplant 2016;16(Suppl. 2):69–98.
- 30) Wiesner RH, Grambsch PM, Dickson ER, Ludwig J, Maccarty RL, Hunter EB, et al. Primary sclerosing cholangitis: natural history, prognostic factors and survival analysis. Hepatology 1989;10:430–436.
- 31) Farrant JM, Hayllar KM, Wilkinson ML, Karani J, Portmann BC, Westaby D, et al. Natural history and prognostic variables in primary sclerosing cholangitis. Gastroenterology 1991;100:1710–1717.
- 32) Dickson ER, Murtaugh PA, Wiesner RH, Grambsch PM, Fleming TR, Ludwig J, et al. Primary sclerosing cholangitis: refinement and validation of survival models. Gastroenterology 1992;103:1893–1901.
- 33) Broome U, Olsson R, Loof L, Bodemar G, Hultcrantz R, Danielsson A, et al. Natural history and prognostic factors in 305 Swedish patients with primary sclerosing cholangitis. Gut 1996;38:610–615.
- 34) Boberg KM, Rocca G, Egeland T, Bergquist A, Broomé U, Caballeria L, et al. Time-dependent Cox regression model is superior in prediction of prognosis in primary sclerosing cholangitis. Hepatology 2002;35:652–657.
- 35) Ponsioen CY, Vrouenraets SM, Prawirodirdjo W, Rajaram R, Rauws EAJ, Mulder CJJ, et al. Natural history of primary sclerosing cholangitis and prognostic value of cholangiography in a Dutch population. Gut 2002;51:562–566.
- 36) Tischendorf JJ, Hecker H, Kruger M, Manns MP, Meier PN. Characterization, outcome, and prognosis in 273 patients with primary sclerosing cholangitis: a single center study. Am J Gastroenterol 2007;102:107–114.
- 37) Ponsioen CY, Reitsma JB, Boberg KM, Aabakken L, Rauws E, Schrumpf E. Validation of a cholangiographic prognostic model in primary sclerosing cholangitis. Endoscopy 2010;42:742–747.
- 38) Schramm C, Eaton J, Ringe KI, Venkatesh S, Yamamura J. Recommendations on the use of magnetic resonance imaging in PSC-A position statement from the International PSC Study Group. Hepatology 2017;66:1675–1688.
- 39) Isoda H, Kataoka M, Maetani Y, Kido A, Umeoka S, Tamai K, et al. MRCP imaging at 3.0 T vs. 1.5 T: preliminary experience in healthy volunteers. J Magn Reson Imaging 2007;25:1000–1006.