Abstract
Background
We applied the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework to evaluate the performance of fecal calprotectin (FC) as an alternative to endoscopy in patients with moderate-severe ulcerative colitis (UC) treated with a biologic agent or tofacitinib.
Methods
Individual participant data from trials of infliximab, golimumab, vedolizumab, and tofacitinib for UC were pooled to generate prevalence of endoscopic activity (Mayo endoscopy score) across different combinations of rectal bleeding (RBS) and stool frequency (SFS) scores. These estimates were then combined with data from an updated systematic review of the operating properties of FC to generate clinical scenario-specific assessments of the performance of FC as a predictor of endoscopic disease activity. A pre-specified threshold of acceptability for false negative and false positive test results was set at 5%.
Results
For patients with UC achieving RBS 0 + SFS 0/1, FC≤50μg/g may avoid endoscopy in 50% of patients with a false negative rate <5%. Likewise, for patients with RBS 2/3 + SFS 2/3, FC≥250μg/g potentially avoids endoscopy in approximately 50% of patients with a false positive rate <5%. The greatest uncertainty in diagnostic performance for FC was observed in UC patients achieving RBS 0 but having SFS 2/3, where false negative and false positive rates were consistently >10%, and endoscopic evaluation may be warranted.
Conclusion
Two clinical scenarios were identified where FC can be used with confidence for monitoring treatment response to biologics or tofacitinib in UC patients without the requirement for endoscopy.
Keywords: Biomarker, monitoring, calprotectin, treat-to-target
INTRODUCTION
The management of ulcerative colitis (UC) has improved considerably over the past two decades due to both the introduction of more effective medical treatment and a systematic treat-to-target approach. Introduction of the treat-to-target concept has resulted in the identification of specific goals of therapy that should be evaluated at defined times during induction and maintenance. Patients with UC are recommended to undergo an endoscopy prior to, and 4–6 months after, treatment initiation and/or adjustment.1 The intent of this recommendation is to increase the likelihood that endoscopic remission, an end-point closely associated with reductions in disease-related complications such as flares, hospitalization, colorectal cancer, and surgery, is achieved.2 Unfortunately, in the real world, endoscopy-based monitoring is only performed in approximately 50% of patients initiating biologic therapy. Importantly, a reduced rate of endoscopic monitoring in these patients is associated with an increased risk of disease-related complications.3 Although the reasons for this gap between current treatment guidelines and clinical practice have not been formally studied, endoscopy is both costly and poorly accepted by patients.4 Non-invasive biomarkers of endoscopic inflammation are therefore an attractive alternative for disease monitoring. Nevertheless, despite the widespread availability of fecal calprotectin (FC) in the US, only 2–5% of patients with UC undergo testing in routine practice.3 Besides reimbursement and issues with stool collection, one of the primary reasons for low uptake of the test may be a lack of understanding of how FC results should be interpreted in specific clinical situations, given the operating properties of the test.
We addressed this gap by assessing the performance of a combination of patient-reported outcomes and predefined FC cut-offs, in estimating the presence of moderate to severe endoscopic activity (Mayo endoscopy score 2 or 3) or endoscopic improvement (Mayo endoscopy score 0 or 1, currently proposed treatment target) in patients with UC. We applied the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework for diagnostic accuracy studies to evaluate the performance of FC as a test replacement strategy to endoscopy in patients with moderate-severe UC being treated with a biologic agent or tofacitinib.
METHODS
Through an iterative process, we developed focused clinical questions deemed relevant for clinical practice related to the diagnostic performance of FC in adult patients with moderate to severely active UC, being treated with a biologic agent or tofacitinib. From these focused clinical questions, well-defined statements in terms of patients, intervention, comparator and outcome (PICO) were outlined (Supplementary Table 1), and these formed the framework for formulating the study inclusion and exclusion criteria and guided the literature search. FC was considered as a test replacement strategy for detection of moderate to severe endoscopic activity (Mayo endoscopic sub-score 2 or 3), i.e., in patients with valid results, FC could replace routine use of lower endoscopy and limit its use to cases with inconclusive FC results or diagnostic equipoise. For patients with clinically quiescent disease, FC was considered a test replacement strategy for confirming endoscopic improvement (Mayo endoscopic sub-score 0 or 1, ruling out moderate to severe endoscopic activity).
Systematic Literature Review to Inform Diagnostic Performance of Fecal Calprotectin
To inform the diagnostic performance of FC, a previous systematic literature search for FC in UC was updated by an experienced medical librarian using a combination of controlled vocabulary supplemented with keywords, with input from the authors (Supplementary Material).5 The search was updated through August 27, 2018, and the databases included: Ovid Medline In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Ovid Cochrane Central Register of Controlled Trials, Ovid Cochrane Database of Systematic Reviews, Scopus, and Web of Science. The search was supplemented with a recursive search of the bibliographies of recently published systematic reviews on this topic, to identify any additional studies. No language restrictions were applied.
Pairs of reviewers screened and identified studies, based on the following inclusion criteria: studies in adult patients with UC, evaluating the diagnostic performance of FC as the index test, using endoscopy (sigmoidoscopy or colonoscopy) as the gold standard, reporting endoscopic disease using comparable scoring indices (Mayo Endoscopy Score, UC disease activity index, Rachmilewitz Endoscopic Index). From these studies, we included only studies that reported the performance of FC at pre-defined cut-offs of ≤50(±10)μg/g (for use in patients with quiescent disease to rule out endoscopic inflammation) or ≥250(±20)μg/g (for use in patients with active disease to rule in endoscopic inflammation). These cut-offs have been most consistently reported as being associated with endoscopic improvement and moderate-severe endoscopic activity, respectively; data on other cut-offs (for example, ≤100μg/g) were infrequently reported. Conflicts in study selection were resolved by consensus, referring back to the original article in consultation with clinical content experts.
The following data from each study was abstracted using a standardized data abstraction form: (a) study characteristics: primary author, time period of study/year of publication, country of study and cohort size, inclusion and exclusion criteria (to determine population in which the diagnostic performance of FC has been studied); (b) endoscopic activity assessment using FC: assay/platform, cut-off corresponding to maximum sensitivity, maximum specificity, and cut-off corresponding to best trade-off of sensitivity and specificity corresponding to area-under-the-receiver-operator-curve (AUROC) or Youden index; (c) outcomes reported: presence or absence of endoscopic activity, endoscopic index and definitions used for endoscopic improvement; (d) test performance of pre-specified FC cut-offs [≤50(±10)μg/g or ≥250(±20)μg/g]: sensitivity, specificity, prevalence of outcome of interest in study (to impute numbers of true-positive [TP], true-negative [TN], false-positive [FP], and false-negative [FN] results).
The quality assessment of included studies was performed by a single author (SS) using the quality assessment of diagnostic accuracy studies (QUADAS) questionnaire, which was designed to assess the internal and external validity of diagnostic accuracy studies included in systematic reviews.6 This tool is a 14-item instrument that allows for the identification of important design elements in diagnostic accuracy studies such as patient spectrum, the presence or absence of observer blinding and verification bias, handling of indeterminate results, and reporting of patient loss to follow-up evaluation.
The paired values of sensitivity and specificity were pooled using the bivariate random-effects regression model proposed by Reitsma et al, using STATA 14.0 software (College Station, TX).7 Statistical assessment of heterogeneity was performed using the inconsistency index (I²), which estimates what proportion of total variation across studies was due to heterogeneity rather than chance. We tested for the presence of publication bias by regressing the diagnostic log odds ratio against 1/√(effective sample size), weighting according to the effective sample size, with a p-value <0.10 indicating asymmetry consistent with publication bias.
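The regression described above resembles what is often called Deeks' funnel plot asymmetry test. The following is a minimal Python sketch of that regression, not the STATA routine used in this study. The effective sample size formula, 4·n1·n2/(n1+n2), the 0.5 continuity correction, and the example 2×2 counts are all assumptions made for illustration; none are taken from the included studies.

```python
import math

def effective_sample_size(n_diseased, n_non_diseased):
    """Assumed definition: ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)

def log_dor(tp, fp, fn, tn):
    """Natural log of the diagnostic odds ratio.
    Adds 0.5 to every cell if any cell is zero (assumed correction)."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))

def asymmetry_regression(studies):
    """Weighted least squares of lnDOR on 1/sqrt(ESS), weights = ESS.
    studies: list of (tp, fp, fn, tn) tuples.
    Returns (slope, t_statistic, degrees_of_freedom)."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, fp + tn)
        xs.append(1.0 / math.sqrt(ess))
        ys.append(log_dor(tp, fp, fn, tn))
        ws.append(ess)
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    df = len(studies) - 2
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / df / sxx)
    return slope, slope / se_slope, df

# Hypothetical 2x2 counts (tp, fp, fn, tn), for illustration only.
hypothetical_studies = [
    (40, 10, 10, 40),
    (80, 15, 20, 85),
    (20, 8, 5, 17),
    (150, 30, 40, 180),
]
slope, t_stat, df = asymmetry_regression(hypothetical_studies)
print(f"slope={slope:.2f}, t={t_stat:.2f}, df={df}")
```

The p-value for the slope would come from a t distribution with the returned degrees of freedom; that step is left out here to keep the sketch free of external dependencies.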
Disease State Prevalence Estimates based on Clinical Scenarios
Specific clinical scenarios based upon combinations of rectal bleeding score (RBS) and stool frequency score (SFS), derived from the Mayo Clinic Score for UC, were defined a priori. Specifically, scenarios around ruling out moderate to severe endoscopic activity were anchored around resolution of rectal bleeding (RBS 0; patients with quiescent disease), and scenarios around ruling in moderate to severe endoscopic activity were anchored around increased stool frequency (SFS 2 or 3; patients with active disease). The corresponding prevalence of endoscopic improvement (Mayo endoscopic sub-score 0 or 1) and moderate to severe endoscopic activity (Mayo endoscopic sub-score 2 or 3) for each clinical scenario was obtained through published data from analysis of individual participant data from phase 2 and 3 clinical trial programs for infliximab (ACT-1 and −2),8 golimumab (PURSUIT-SC),9, 10 vedolizumab (MLN002 and GEMINI-1),11, 12 and tofacitinib (OCTAVE-1, −2 and SUSTAIN),13 in patients with moderate to severely active UC.14 For the OCTAVE trials, both local and central endoscopy scoring were available; however, local endoscopy scores were used for consistency with other trial designs and to more accurately represent clinical practice. Data were included irrespective of treatment arm or dosing regimen.
Data Synthesis
The diagnostic accuracy of any test in terms of rates of TP, TN, FP and FN depends on the pre-test probability, i.e., the prevalence of the condition. Using the pooled sensitivity and specificity from included studies for pre-specified FC cut-offs [≤50(±10)μg/g or ≥250(±20)μg/g], we calculated TP, FP, TN and FN rates using clinical scenario-specific prevalence estimates for endoscopic disease activity.
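The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code; the function name `results_per_1000` is hypothetical. The example inputs are the pooled sensitivity (0.82) and specificity (0.74) reported in the Results for the 50(±10)μg/g cut-off and the anchored low-likelihood prevalence of 25%.

```python
def results_per_1000(sensitivity, specificity, prevalence, n=1000):
    """Expected test outcomes among n patients tested.

    'Positive' means the test indicates moderate to severe endoscopic
    activity; prevalence is the proportion who truly have it.
    """
    diseased = n * prevalence
    non_diseased = n - diseased
    tp = sensitivity * diseased          # active disease, test positive
    fn = diseased - tp                   # active disease, test negative
    tn = specificity * non_diseased      # improvement, test negative
    fp = non_diseased - tn               # improvement, test positive
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}

# Low-likelihood scenario (prevalence 25%), FC cut-off of 50 ug/g.
counts = results_per_1000(sensitivity=0.82, specificity=0.74, prevalence=0.25)
for name, value in counts.items():
    print(f"{name}: {value:.0f} per 1000")
```

With these inputs the counts come out at approximately 205 TP, 45 FN, 555 TN and 195 FP per 1000 patients tested, matching the point estimates in Table 2.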
Consequences of Diagnostic Test Results and Outcomes of Interest
Corresponding to each possible outcome (TP, FP, TN, FN), presumed downstream consequences on patient-important outcomes were considered. For example, for PICO statements #1–4 on detection of endoscopic activity:
True positives (patients correctly diagnosed as having moderate to severe endoscopic activity) would be eligible to undergo treatment adjustment, which may eventually decrease disease-related complications and morbidity, without being subjected to the risks of invasive testing with endoscopy.
False positives (patients incorrectly labeled as having moderate to severe endoscopic activity, when actually they have endoscopic improvement) may receive unnecessary testing (endoscopy) and/or treatment adjustment, and have avoidable anxiety, potential testing- or treatment-related complications and excessive resource utilization.
True negatives (patients correctly diagnosed as having endoscopic improvement) would be reassured and would not need invasive testing with endoscopy, although they may need to undergo serial assessment of FC at periodic intervals.
False negatives (patients incorrectly labeled as having endoscopic improvement, when actually they have moderate to severe endoscopic activity) would be falsely reassured, and may not receive appropriate treatment, potentially leading to increased disease-related complications, morbidity and mortality.
For PICO statements #1 and #2, pertaining to the ability of FC to rule out moderate to severe endoscopic activity among patients with complete resolution of rectal bleeding, the primary outcome of interest was the FN rate (with a pre-defined maximal tolerable FN rate of 5%). For PICO statements #3 and #4, pertaining to the ability of FC to rule in moderate to severe endoscopic activity among patients with increased stool frequency ≥3 stools above their baseline (stool frequency score of 2 or 3), the primary outcome of interest was the FP rate (with a pre-defined maximal tolerable FP rate of 5%).
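The 5% thresholds can be turned into prevalence limits. The sketch below assumes, consistent with how the results are tabulated per 1000 patients tested, that the FN and FP rates are proportions of all patients tested, i.e. FN rate = prevalence × (1 − sensitivity) and FP rate = (1 − prevalence) × (1 − specificity). The function names are hypothetical, and the example inputs are the pooled sensitivity for the 50μg/g cut-off and the pooled specificity for the 250μg/g cut-off reported in the Results.

```python
def max_prevalence_for_fn(sensitivity, max_fn=0.05):
    """Highest prevalence of active disease at which the FN rate
    (as a share of all patients tested) stays at or below max_fn."""
    if sensitivity >= 1.0:
        return 1.0
    return min(1.0, max_fn / (1.0 - sensitivity))

def min_prevalence_for_fp(specificity, max_fp=0.05):
    """Lowest prevalence of active disease at which the FP rate
    (as a share of all patients tested) stays at or below max_fp."""
    if specificity >= 1.0:
        return 0.0
    return max(0.0, 1.0 - max_fp / (1.0 - specificity))

# Rule-out use (cut-off 50 ug/g, pooled sensitivity 0.82).
print(f"FN <= 5% while prevalence <= {max_prevalence_for_fn(0.82):.1%}")
# Rule-in use (cut-off 250 ug/g, pooled specificity 0.79).
print(f"FP <= 5% while prevalence >= {min_prevalence_for_fp(0.79):.1%}")
```

Under these assumptions, the rule-out strategy meets the FN threshold only when the prevalence of moderate to severe endoscopic activity is below roughly 28%, and the rule-in strategy meets the FP threshold only when it is above roughly 76%.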
Certainty of Evidence
We rated the certainty of evidence using the GRADE approach for diagnostic tests and strategies.15 In this approach, all evidence from randomized controlled trials (comparing different diagnostic tests or cut-offs of the same test) and observational diagnostic accuracy studies starts as high quality, but can be rated down for any of the following factors:
Risk of bias in included studies (inferred based on QUADAS instrument),16
Indirectness (e.g., deemed present if there are important differences between the populations studied and those for whom the recommendation is intended). In this updated GRADE approach for diagnostic accuracy studies, TP, FP, TN, and FN derived from sensitivity/specificity are not considered surrogate outcomes.
Inconsistency (e.g., deemed present if there were considerable unexplained differences between studies in the accuracy estimates, or if FC cut-offs for endoscopic improvement or moderate to severe endoscopic activity were not pre-specified but primarily obtained post hoc corresponding to AUROC),
Imprecision (deemed present if there were wide confidence intervals for true and false positive and true and false negative rates), and
Publication bias, if it was strongly suspected.
RESULTS
Systematic Literature Review and Diagnostic Performance of Fecal Calprotectin
Supplementary Figure 1 shows the study selection flowchart. After reviewing 2410 studies, a total of 23 unique studies provided 47 reports on the diagnostic performance of FC at predefined cut-offs (Supplementary Table 2).5, 17–33 A single study was a post-hoc analysis of a phase 3 clinical trial program,31 with the remaining studies being prospective observational cohort studies. None of the studies used central blinded scoring of endoscopy, and the majority used the Mayo endoscopic scoring system, with a minority utilizing the Rachmilewitz index (n=3) or the Ulcerative Colitis Endoscopic Index of Severity (UCEIS; n=1) to classify endoscopic activity status. All studies reporting on assay specifications used ELISA-based assays or the Bühlmann Quantum Blue lateral flow assay (Supplementary Material).
A total of 14 studies, with 1,658 patients, reported on the diagnostic accuracy of FC at a cut-off of ≤50(±10)μg/g. The pooled sensitivity and specificity were 0.82 (95% CI, 0.74–0.89) and 0.74 (95% CI, 0.65–0.82), respectively, with significant heterogeneity across studies (I²=97%) (Supplementary Figure 2). The corresponding area under the receiver operator curve (AUROC) was 0.85 (95% CI, 0.82–0.88), positive likelihood ratio (PLR) was 3.2 (95% CI, 2.3–4.5) and negative likelihood ratio (NLR) was 0.24 (95% CI, 0.16–0.36). A total of 12 studies, with 1,228 patients, reported on the diagnostic accuracy of FC at a cut-off of ≥250(±20)μg/g. The pooled sensitivity and specificity were 0.76 (95% CI, 0.65–0.84) and 0.79 (95% CI, 0.73–0.84), respectively, with significant heterogeneity across studies (I²=97%) (Supplementary Figure 3). The corresponding AUROC, PLR and NLR were 0.84 (95% CI, 0.81–0.87), 3.7 (95% CI, 2.8–4.8) and 0.30 (95% CI, 0.21–0.45), respectively.
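As a rough cross-check, likelihood ratios can be recomputed from the rounded pooled sensitivity and specificity using the standard definitions PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The sketch below is illustrative only. The reported values come from the bivariate model, so small differences from this back-of-envelope calculation are expected.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr

# Rounded pooled estimates as reported for each cut-off.
for label, sens, spec in [("FC cut-off 50 ug/g", 0.82, 0.74),
                          ("FC cut-off 250 ug/g", 0.76, 0.79)]:
    plr, nlr = likelihood_ratios(sens, spec)
    print(f"{label}: PLR={plr:.2f}, NLR={nlr:.2f}")
```

For the 50μg/g cut-off this gives values close to the reported PLR of 3.2 and NLR of 0.24. For the 250μg/g cut-off the recomputed PLR is slightly lower than the reported 3.7, which is plausible given rounding of the inputs and the model-based estimation.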
Disease Prevalence Estimates
Across 6 possible patient reported outcome (PRO) combinations for RBS and SFS sub-components of the partial Mayo score, the highest prevalence of endoscopic improvement (Mayo endoscopic sub-score of 0 or 1) was observed among patients with a RBS 0 + SFS 0 post-induction (81.2%; 95% CI 77.7, 84.3) and during maintenance (90.7%; 95% CI 87.3, 93.2) (Table 1, Supplementary Figure 4). Participants with RBS 0 + SFS 0 or 1 had a prevalence of endoscopic improvement comparable to that of participants with a RBS 0 + SFS 0 post-induction (75.3% vs. 81.2%) and during maintenance (87.6% vs. 90.7%). Participants with RBS 0 + SFS 2 or 3 had a lower prevalence of endoscopic improvement of 34.5% (95% CI 30.5, 38.7) post-induction and 44.6% (95% CI 35.0, 54.6) during maintenance. Participants with RBS 1 had a prevalence of endoscopic improvement of 24.0% (95% CI 21.4, 26.9) post-induction and 28.5% (95% CI 21.3, 37.0) during maintenance. The lowest prevalence of endoscopic improvement was seen among study participants with a RBS 2 or 3 + SFS 2 or 3 post-induction (9.8%, 95% CI 7.5, 12.6) and during maintenance (14.5%, 95% CI 7.8, 25.4).
Table 1.
PRO measure | INDUCTION (ACT-1/2, PURSUIT, OCTAVE-1/2, GEMINI1, MLN002) | MAINTENANCE (ACT-1/2, GEMINI1, OCTAVE-Sustain) |
---|---|---|
RBS 0 + SFS 0/1 | 75.3% (73.0–77.5), 1120/1471 | 87.6% (84.9–89.9), 592/673 |
RBS 0 + SFS 0 | 81.2% (77.7–84.3), 515/618 | 90.7% (87.3–93.2), 372/407 |
RBS 0 + SFS 2/3 | 34.5% (30.5–38.7), 181/535 | 44.6% (35.0–54.6), 45/101 |
RBS 0 | 64.3% (62.2–66.4), 1299/2006 | 82.0% (79.2–84.6), 637/774 |
RBS 0/1 | 48.9% (46.9–50.9), 1229/2531 | 73.5% (70.6–76.3), 675/916 |
RBS 2/3 + SFS 2/3 | 9.8% (7.5–12.6), 53/580 | 14.5% (7.8–25.4), 9/88 |
RBS: rectal bleeding score; SFS: stool frequency score; PRO: patient reported outcome
Based on these estimates we anchored baseline prevalence of moderate to severe endoscopic activity into three categories for ease of interpretation of data in day-to-day practice – low-likelihood of moderate to severe endoscopic activity (RBS 0 + SFS 0 or 1), intermediate-likelihood of moderate to severe endoscopic activity (RBS 0 + SFS 2 or 3), and high-likelihood of moderate to severe endoscopic activity (RBS 2 or 3 + SFS 2 or 3).
Application of Fecal Calprotectin Across Clinical Scenarios
#1 and #2 In adults with moderate-severe UC achieving resolution of rectal bleeding after induction therapy (6–8 weeks) or during maintenance therapy (6–12 months) with a biologic agent or tofacitinib, how accurate is a fecal calprotectin cut-off of 50(±10)μg/g for ruling out moderate to severe endoscopic activity (Mayo endoscopy score 2 or 3), obviating the need for routine endoscopic assessment?
In adults with UC who have completed induction therapy with a biologic or tofacitinib and have achieved resolution of rectal bleeding and normalization of stool frequency to <3 stools above their baseline (RBS 0 + SFS 0/1, low-likelihood of moderate to severe endoscopic activity), an FC cut-off of 50(±10)μg/g correctly classifies 76% of patients as either having endoscopic improvement (TN, 55.5%) or having moderate to severe endoscopic activity (TP, 20.5%). That is, 55.5% of patients may be able to avoid endoscopy post-induction if they achieve resolution of rectal bleeding and normalization of stool frequency to <3 stools above their baseline, with an FC≤50μg/g. Only a minority of patients are falsely classified as being in endoscopic improvement (FN, 4.5%), which is below our pre-specified maximal acceptable FN rate of 5% (Table 2). For a similar low-likelihood patient during maintenance therapy, 65.1% of patients may be able to avoid endoscopy if they achieve resolution of rectal bleeding and normalization of stool frequency to <3 stools above their baseline, with an FC≤50(±10)μg/g, and only a minority are falsely classified as being in endoscopic improvement (FN, 2.2%) (Supplementary Table 3).
Table 2: Q1. In adults with moderate-severe UC achieving resolution of rectal bleeding after induction therapy with a biologic or tofacitinib, how accurate is a fecal calprotectin cut-off of 50(±10)μg/g for ruling out moderate to severe endoscopically active disease (Mayo endoscopy score 2/3), obviating the need for routine endoscopic assessment?
Test result | Results per 1000 patients tested (95% CI): low-likelihood (prevalence 25%) | Results per 1000 patients tested (95% CI): intermediate-likelihood (prevalence 65%) | Number of studies, participants | Quality of the evidence (GRADE) | Comments |
---|---|---|---|---|---|
True positives (patients with moderate to severe endoscopically active disease) | 205 (185 to 223) | 533 (481 to 579) | 14 studies, 1658 patients | ⊕⊕⊕○ MODERATE¹ (Inconsistency) | TP may lead to diagnostic endoscopy for confirmation and modification/optimization of therapy (if FC is used as a triage strategy), potentially reducing risk of disease-related complications. TP will have further testing (lower endoscopy) and/or intervention which may lead to side effects. |
False negatives (patients incorrectly classified as being in endoscopic remission or having mildly active disease) | 45 (27 to 65) | 117 (71 to 169) | | | FN may lead to inadequate treatment and potentially increased risk of disease-related complications due to delay in detection of moderate to severe endoscopically active disease. |
True negatives (patients in endoscopic remission or having mildly active disease) | 555 (488 to 615) | 259 (227 to 287) | | | TN will likely be reassured and avoid an invasive test, but may still be retested with fecal calprotectin periodically. |
False positives (patients incorrectly classified as having moderate to severe endoscopically active disease) | 195 (135 to 262) | 91 (63 to 123) | | | FP will likely have further testing (if fecal calprotectin is used as a triage strategy) or may be over-treated (if fecal calprotectin is used as a test replacement strategy), which will increase anxiety, complications and resource use. |
¹ High unexplained heterogeneity, selective inclusion of studies corresponding to cut-off of ≤50(±10)μg/g.
In adults with UC who have completed induction therapy with a biologic or tofacitinib and have achieved resolution of rectal bleeding yet they continue to have a stool frequency of ≥3 stools above their baseline (RBS 0 + SFS 2/3, intermediate-likelihood of moderately to severely active endoscopic inflammation), a FC cut-off of 50(±10)μg/g correctly classifies 79.2% patients, as either having moderate to severe endoscopic activity (TP, 53.3%) or being in endoscopic improvement (TN, 25.9%). Misclassification among these patients with a FC cut-off of 50(±10)μg/g is secondary to both falsely classifying them as having moderate to severe endoscopic activity (FP, 9.1%), and falsely classifying them as being in endoscopic improvement (FN, 11.7%) (Table 2). For a similar intermediate-likelihood patient during maintenance therapy, the proportion of patients falsely classified as being in endoscopic improvement (FN) with FC≤50(±10)μg/g was 9.9% (Supplementary Table 3). The incremental value of FC≤50μg/g in patients with resolution of rectal bleeding with varying stool frequency and in patients with RBS 1, after induction therapy and during maintenance therapy, is presented in Figures 1A and B. Summary positive and negative predictive value of FC≤50μg/g in different clinical scenarios is presented in Supplementary Table 4.
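Predictive values follow directly from the per-1000 counts: PPV = TP / (TP + FP) and NPV = TN / (TN + FN). The sketch below applies these standard formulas to the point estimates in Table 2 for illustration. The values in Supplementary Table 4 are not reproduced here, so this is not claimed to match them exactly, and the function name is hypothetical.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from expected counts of test outcomes."""
    ppv = tp / (tp + fp)   # probability of active disease given FC above cut-off
    npv = tn / (tn + fn)   # probability of endoscopic improvement given FC at or below cut-off
    return ppv, npv

# Point estimates per 1000 patients from Table 2 (FC cut-off 50 ug/g).
scenarios = {
    "low-likelihood (prevalence 25%)": dict(tp=205, fp=195, tn=555, fn=45),
    "intermediate-likelihood (prevalence 65%)": dict(tp=533, fp=91, tn=259, fn=117),
}
for label, counts in scenarios.items():
    ppv, npv = predictive_values(**counts)
    print(f"{label}: PPV={ppv:.1%}, NPV={npv:.1%}")
```

Under these inputs, a result of FC≤50μg/g carries an NPV of about 92.5% in the low-likelihood scenario but only about 69% in the intermediate-likelihood scenario, which illustrates why the same cut-off is less reassuring when stool frequency remains elevated.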
#3 and #4 In adults with moderate-severe UC with persistent increased stool frequency after induction therapy (6–8 weeks) or during maintenance therapy (6–12 months) with a biologic agent or tofacitinib, how accurate is a fecal calprotectin cut-off of 250(±20)μg/g for ruling in moderate to severe endoscopic activity (Mayo endoscopy score 2/3), obviating the need for routine endoscopic assessment?
In adults with UC who have completed induction therapy with a biologic or tofacitinib and continue to have a stool frequency of ≥3 stools above their baseline despite resolution of rectal bleeding (RBS 0 + SFS 2/3, intermediate-likelihood of moderately to severely active endoscopic inflammation), an FC cut-off of 250(±20)μg/g correctly classifies 77% of patients as either having moderate to severe endoscopic activity (TP, 49.4%) or being in endoscopic improvement (TN, 27.6%). The majority of misclassification among these patients with an FC cut-off of 250μg/g is secondary to falsely classifying them as being in endoscopic improvement (FN, 15.6%), with a minority of patients being falsely classified as having moderate to severe endoscopic activity (FP, 7.4%) (Table 3). For a similar intermediate-likelihood patient during maintenance therapy, the proportion of patients falsely classified as having moderate to severe endoscopic activity (FP) with an FC cut-off of 250μg/g is 9.4%; both post-induction and during maintenance, these FP rates are higher than our pre-specified maximal acceptable FP threshold of 5% (Supplementary Table 5).
Table 3: Q3. In adults with moderate-severe UC with persistent increased stool frequency after induction therapy with a biologic or tofacitinib, how accurate is a fecal calprotectin cut-off of 250(±20)μg/g for ruling in moderate to severe endoscopically active disease (Mayo endoscopy score 2/3), obviating the need for routine endoscopic assessment?
Test result | Results per 1000 patients tested (95% CI): intermediate-likelihood (prevalence 65%) | Results per 1000 patients tested (95% CI): high-likelihood (prevalence 90%) | Number of studies, participants | Quality of the evidence (GRADE) | Comments |
---|---|---|---|---|---|
True positives (patients with moderate to severe endoscopically active disease) | 494 (423 to 546) | 684 (585 to 756) | 12 studies, 1228 patients | ⊕⊕⊕○ MODERATE¹ (Inconsistency) | TP may lead to diagnostic endoscopy for confirmation and modification/optimization of therapy (if FC is used as a triage strategy), potentially reducing risk of disease-related complications. TP will have further testing (lower endoscopy) and/or intervention which may lead to side effects. In contrast, if FC is used as a test replacement strategy, this may lead to treatment modification. |
False negatives (patients incorrectly classified as being in endoscopic remission or having mildly active disease) | 156 (104 to 227) | 216 (144 to 315) | | | FN may lead to inadequate treatment and potentially increased risk of disease-related complications due to delay in detection of moderate to severe endoscopically active disease. |
True negatives (patients in endoscopic remission or having mildly active disease) | 276 (256 to 294) | 79 (73 to 84) | | | TN will likely be reassured and avoid an invasive test, but may still be retested with fecal calprotectin periodically. |
False positives (patients incorrectly classified as having moderate to severe endoscopically active disease) | 74 (56 to 94) | 21 (16 to 27) | | | FP will likely have further testing (if fecal calprotectin is used as a triage strategy) or may be over-treated (if fecal calprotectin is used as a test replacement strategy), which will increase anxiety, complications and resource use. |
¹ High unexplained heterogeneity, selective inclusion of studies corresponding to cut-off of ≥250μg/g.
In adults with UC who have completed induction therapy with a biologic agent or tofacitinib and have rectal bleeding in >50% of bowel movements or pass blood alone, along with continued increased stool frequency of ≥3 stools above their baseline (RBS 2/3 + SFS 2/3, high-likelihood of moderate to severe endoscopic activity), an FC cut-off of 250μg/g correctly classifies 76.3% of patients as either having moderate to severe endoscopic activity (TP, 68.4%) or being in endoscopic improvement (TN, 7.9%), i.e., 68.4% of patients may be able to avoid endoscopy post-induction if they have RBS 2/3 + SFS 2/3 with an FC≥250μg/g. Only a minority of patients are falsely classified as having moderate to severe endoscopic activity (FP, 2.1%), which is lower than our pre-specified maximal acceptable FP threshold of 5% (Table 3). For a similar high-likelihood patient during maintenance therapy, 64.6% of patients may be able to avoid endoscopy if they have rectal bleeding in >50% of bowel movements or pass blood alone, along with continued increased stool frequency of ≥3 stools above their baseline, with an FC≥250μg/g. The proportion of patients falsely classified as having moderate to severe endoscopic activity (FP) by an FC cut-off of 250μg/g increases to 3.1% during maintenance compared to post-induction (2.1%), but it remains below our pre-specified maximal acceptable FP threshold of 5% (Supplementary Table 5). The incremental value of FC≥250μg/g in patients with varying combinations of RBS and SFS, after induction therapy and during maintenance therapy, is presented in Figures 2A and B. Summary positive and negative predictive values of FC≥250μg/g in different clinical scenarios are presented in Supplementary Table 4.
Across both scenarios of quiescent and active disease, and during induction and maintenance therapy, the overall body of evidence was rated as moderate quality, due to high heterogeneity and selective inclusion of studies corresponding to pre-specified FC cut-offs.
DISCUSSION
Although endoscopic confirmation of treatment response is recommended,1 only 50% of UC patients initiating biologic therapy in the US have a follow-up procedure within 2 years of treatment initiation. Although the reasons for this deficit are likely multifactorial, it is relevant that patients rank endoscopy as the least acceptable monitoring strategy.3, 4 FC, a non-invasive stool-based biomarker of mucosal inflammation, is widely available, yet it is infrequently used for monitoring treatment response in UC. We attempted to address this gap by combining PRO-based, disease state-specific prevalence rates for endoscopic activity with the diagnostic performance of pre-defined FC cut-offs, to provide an understanding of diagnostic accuracy for specific clinical scenarios.
Among UC patients achieving complete resolution of rectal bleeding and normalization of stool frequency to <3 stools above their baseline (RBS 0 + SFS 0/1) post-induction or during maintenance therapy with a biologic agent or tofacitinib, FC testing has the potential to avoid 55–65% of endoscopies by correctly ruling out the presence of moderate to severe endoscopic activity. Given the associated false negative rate of 4.5% post-induction and 2.2% during maintenance, there is a low chance for providers to be misled by an FC value of ≤50μg/g in this clinical scenario. Among patients achieving complete resolution of rectal bleeding but with persistent increased stool frequency of ≥3 stools above their baseline (RBS 0 + SFS 2/3) post-induction or during maintenance, however, FC has the potential to avoid only 25–33% of endoscopies by correctly ruling out the presence of moderate to severe endoscopic activity. With a false negative rate of 9–12%, confidence in FC testing in this clinical scenario is limited, and endoscopy likely remains a necessary strategy. Among UC patients with a persistent increased stool frequency of ≥3 stools above their baseline post-induction or during maintenance, FC had the greatest potential value when used in patients with rectal bleeding in >50% of bowel movements or who were passing blood alone (RBS 2/3 + SFS 2/3). For these patients, an FC cut-off of ≥250μg/g would correctly classify approximately 50% of patients as having moderate to severe endoscopic activity, thereby obviating the need for endoscopy to confirm inflammation. With a false positive rate of 2.1–3.1%, we can have a high degree of confidence when FC≥250μg/g in this clinical scenario.
Collectively, these observations indicate that physicians who have already incorporated FC testing into their practice as an alternative to endoscopic monitoring may want to consider integrating our findings, because the non-selective use of FC as a replacement for endoscopic monitoring in all patients, without augmentation with clinical data, may lead to important rates of misclassification.
Our findings have important implications for clinical practice. Endoscopy is not without burden, costs, risks, or anxiety for patients, and more appropriate utilization would help to maximize overall diagnostic value and patient acceptance, particularly in clinical scenarios where confidence in non-invasive assessments with PROs or biomarkers such as FC is limited. Furthermore, the appropriate utilization of FC in clinical scenarios where a high degree of confidence can be achieved with test results could help to expedite treatment decision making and improve access to disease monitoring, while avoiding unnecessary invasive testing. Based on our study, the most appropriate utilization of FC would be to rule out moderate to severe endoscopic activity with a cut-off of 50μg/g in patients achieving complete resolution of rectal bleeding and normalization of stool frequency to <3 bowel movements above their baseline, and to rule in moderate to severe endoscopic activity with a cut-off of 250μg/g in patients having rectal bleeding in >50% of bowel movements or passing blood alone, along with continued increased stool frequency of ≥3 stools above their baseline. Figure 3 summarizes the implications for practice when treatment decisions are based on ruling out or ruling in moderate to severe endoscopic activity.
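The two scenarios in which FC performed with high confidence can be written as a simple decision rule. The Python sketch below is an illustration only: the function name `fc_triage` and its return labels are hypothetical, and every combination of symptoms and FC value that falls outside the two scenarios is treated as inconclusive, which is a simplifying assumption of this sketch.

```python
def fc_triage(rbs, sfs, fc_ug_per_g):
    """Classify a patient by rectal bleeding score (RBS, 0-3), stool
    frequency score (SFS, 0-3), and fecal calprotectin in ug/g.

    Returns one of three hypothetical labels:
      "rule_out"     - moderate to severe endoscopic activity unlikely
      "rule_in"      - moderate to severe endoscopic activity likely
      "inconclusive" - FC alone is not sufficient; endoscopy may be warranted
    """
    if rbs not in range(4) or sfs not in range(4):
        raise ValueError("RBS and SFS must be integers from 0 to 3")
    if fc_ug_per_g < 0:
        raise ValueError("FC concentration cannot be negative")

    # Scenario 1: RBS 0 + SFS 0/1 with FC at or below 50 ug/g
    if rbs == 0 and sfs <= 1 and fc_ug_per_g <= 50:
        return "rule_out"

    # Scenario 2: RBS 2/3 + SFS 2/3 with FC at or above 250 ug/g
    if rbs >= 2 and sfs >= 2 and fc_ug_per_g >= 250:
        return "rule_in"

    # Everything else, including RBS 0 + SFS 2/3 and intermediate FC values
    return "inconclusive"
```

For instance, `fc_triage(0, 1, 40)` returns `"rule_out"`, whereas `fc_triage(0, 2, 40)` returns `"inconclusive"` because persistent stool frequency limits confidence in a low FC value.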
Our study has several strengths, including the use of pooled individual participant data from clinical trials encompassing a broad spectrum of treatment options for moderate-severe UC to generate the diagnostic test performance estimates for FC, and a scenario-specific integration that incorporated state-specific prevalence estimates.14 Our study is, however, not without limitations. We were unable to comment on the more stringent definition of endoscopic remission using a Mayo endoscopic subscore of 0 (excluding patients with a Mayo endoscopic subscore of 1, who have mildly active disease) or on evolving regulatory definitions for mucosal healing that incorporate histologic disease activity. As our understanding of the importance of more complete resolution of both endoscopic and histologic inflammation grows, and treatment end-points continue to evolve towards more stringent definitions and outcomes potentially more closely linked with reductions in disease-related complications, further work will be needed to understand the diagnostic accuracy of FC for these evolving definitions. In addition, locally read endoscopy tends to overestimate the presence of endoscopic inflammation as compared with central reading, which could affect the results of our study. We focused only on the diagnostic performance of two pre-specified FC cut-offs, ≤50μg/g and ≥250μg/g, since these have been most commonly reported and applied in clinical practice. The number of studies reporting diagnostic accuracy at pre-defined FC cut-offs of 100μg/g and 150μg/g was limited. Cut-offs between 50μg/g and 250μg/g will have sensitivity and specificity in between those of these two cut-offs and, when they are applied in specific clinical scenarios, FN and FP rates are likely to exceed the pre-specified 5% threshold. Hence, FC values between 50 and 250μg/g may be inconclusive, and endoscopic examination may be warranted for these patients.
There may be intra-individual variability in FC, as well as inter-assay variability across different FC tests, neither of which we were able to account for in our analysis. Finally, our study is limited to the integration of FC, and further analyses will be needed to understand whether alternative emerging biomarkers may be better suited to certain situations.
In summary, FC has a potential role in monitoring response for UC patients being treated with a biologic agent or tofacitinib. The greatest confidence in test performance can be achieved when ruling out moderate to severe endoscopic activity with a cut-off of 50μg/g in patients who have resolution of rectal bleeding and normalization of stool frequency, and when ruling in moderate to severe endoscopic activity with a cut-off of 250μg/g in patients who have persistent rectal bleeding and increased stool frequency. These data will be of value for clinicians when integrating FC into routine monitoring algorithms and will allow for a more focused utilization of endoscopy for assessment of treatment response, thereby improving the overall value of disease monitoring in UC.
Supplementary Material
Acknowledgments
Disclosures: Parambir S. Dulai is supported by an American Gastroenterological Association Research Scholar Award. Siddharth Singh is supported by an American College of Gastroenterology Junior Faculty Development Award #144271, Crohn’s and Colitis Foundation Career Development Award #404614, and the National Institute of Diabetes and Digestive and Kidney Diseases K23DK117058. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Footnotes
Conflicts of Interest: PSD has received research support from Takeda, Pfizer, Abbvie, Janssen, Polymedco, ALPCO, Buhlmann, and consulting fees from Takeda, Pfizer, Abbvie and Janssen. CM has served as a consultant for Robarts Clinical Trials, Inc., served on advisory boards for Janssen and AbbVie, and received speaker’s fees from Janssen and Pfizer. NN holds a McMaster University Department of Medicine Internal Career Award and has received honoraria from Janssen, Abbvie, Takeda, Pfizer, Merck, and Ferring. MM has received speaker and consultant fees from Takeda, Abbvie, Janssen, and Pfizer. NVC has received research support from R-Biopharm and Takeda and consulting fees from Boehringer Ingelheim, Janssen, Pfizer, Progenity, Prometheus and Takeda, outside of the submitted work. BSB has received research support from Takeda and Janssen, and consulting fees from Abbvie and Prometheus laboratories. GDH has served as advisor for Abbvie, Ablynx, Amakem, AM Pharma, Avaxia, Biogen, Bristol Myers Squibb, Boehringer Ingelheim, Celgene, Celltrion, Cosmo, Covidien, Ferring, DrFALK Pharma, Engene, Galapagos, Gilead, Glaxo Smith Kline, Hospira, Immunic, Johnson and Johnson, Lycera, Medimetrics, Millennium/Takeda, Mitsubishi Pharma, Merck Sharp Dohme, Mundipharma, Novonordisk, Pfizer, Prometheus laboratories/Nestle, Protagonist, Receptos, Robarts Clinical Trials, Salix, Sandoz, Setpoint, Shire, Teva, Tigenix, Tillotts, Topivert, Versant and Vifor and received speaker fees from Abbvie, Ferring, Johnson and Johnson, Merck Sharp Dohme, Mundipharma, Norgine, Pfizer, Shire, Millennium/Takeda, Tillotts and Vifor.
BGF has received grant/research support from Millennium Pharmaceuticals, Merck, Tillotts Pharma, AbbVie, Novartis Pharmaceuticals, Centocor, Elan/Biogen, UCB Pharma, Bristol-Myers Squibb, Genentech, ActoGenix and Wyeth Pharmaceuticals; consulting fees from Millennium Pharmaceuticals, Merck, Centocor, Elan/Biogen, Janssen-Ortho, Teva Pharmaceuticals, Bristol-Myers Squibb, Celgene, UCB Pharma, AbbVie, AstraZeneca, Serono, Genentech, Tillotts Pharma, Unity Pharmaceuticals, Albireo Pharma, Given Imaging, Salix Pharmaceuticals, Novonordisk, GSK, ActoGenix, Prometheus Therapeutics and Diagnostics, Athersys, Axcan, Gilead, Pfizer, Shire, Wyeth, Zealand Pharma, Zyngenia, GiCare Pharma and Sigmoid Pharma; and speaker’s bureau fees from UCB, AbbVie and J&J/Janssen. WJS has received research grants from Atlantic Healthcare Limited, Amgen, Genentech, Gilead Sciences, Abbvie, Janssen, Takeda, Lilly, Celgene/Receptos; consulting fees from Abbvie, Allergan, Amgen, Arena Pharmaceuticals, Avexegen Therapeutics, BeiGene, Boehringer Ingelheim, Celgene, Celltrion, Conatus, Cosmo, Escalier Biosciences, Ferring, Forbion, Genentech, Gilead Sciences, Gossamer Bio, Incyte, Janssen, Kyowa Kirin Pharmaceutical Research, Landos Biopharma, Lilly, Oppilan Pharma, Otsuka, Pfizer, Precision IBD, Progenity, Prometheus Laboratories, Reistone, Ritter Pharmaceuticals, Robarts Clinical Trials (owned by Health Academic Research Trust, HART), Series Therapeutics, Shire, Sienna Biopharmaceuticals, Sigmoid Biotechnologies, Sterna Biologicals, Sublimity Therapeutics, Takeda, Theravance Biopharma, Tigenix, Tillotts Pharma, UCB Pharma, Ventyx Biosciences, Vimalan Biosciences, Vivelix Pharmaceuticals; and stock or stock options from BeiGene, Escalier Biosciences, Gossamer Bio, Oppilan Pharma, Precision IBD, Progenity, Ritter Pharmaceuticals, Ventyx Biosciences, Vimalan Biosciences.
Spouse: Ophthotech - consultant, stock options; Progenity - consultant, stock; Oppilan Pharma - employee, stock options; Escalier Biosciences - employee, stock options; Precision IBD - employee, stock options; Ventyx Biosciences - employee, stock options; Vimalan Biosciences - employee, stock options. VJ has received consulting fees from AbbVie, Eli Lilly, GlaxoSmithKline, Arena pharmaceuticals, Genentech, Pendopharm, Sandoz, Merck, Takeda, Janssen, Robarts Clinical Trials, Topivert, Celltrion; speaker’s fees from Takeda, Janssen, Shire, Ferring, Abbvie, Pfizer. SS has received research grants from AbbVie, and consulting fees from AbbVie, Takeda, Pfizer, AMAG Pharmaceuticals. All other authors declare no potential conflicts of interest.
REFERENCES
- 1. Peyrin-Biroulet L, Sandborn W, Sands BE, et al. Selecting Therapeutic Targets in Inflammatory Bowel Disease (STRIDE): Determining Therapeutic Goals for Treat-to-Target. Am J Gastroenterol 2015;110:1324–38.
- 2. Dulai PS, Levesque BG, Feagan BG, et al. Assessment of mucosal healing in inflammatory bowel disease: review. Gastrointest Endosc 2015;82:246–55.
- 3. Limketkai BN, Singh S, Jairath V, et al. US Practice Patterns and Impact of Monitoring for Mucosal Inflammation after Biologic Initiation in Inflammatory Bowel Disease. Inflamm Bowel Dis 2019.
- 4. Buisson A, Gonzalez F, Poullenot F, et al. Comparative Acceptability and Perceived Clinical Utility of Monitoring Tools: A Nationwide Survey of Patients with Inflammatory Bowel Disease. Inflamm Bowel Dis 2017;23:1425–1433.
- 5. Mosli MH, Zou G, Garg SK, et al. C-Reactive Protein, Fecal Calprotectin, and Stool Lactoferrin for Detection of Endoscopic Activity in Symptomatic Inflammatory Bowel Disease Patients: A Systematic Review and Meta-Analysis. Am J Gastroenterol 2015;110:802–19; quiz 820.
- 6. Whiting P, Rutjes AW, Reitsma JB, et al. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:25.
- 7. Reitsma JB, Glas AS, Rutjes AW, et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005;58:982–90.
- 8. Rutgeerts P, Sandborn WJ, Feagan BG, et al. Infliximab for induction and maintenance therapy for ulcerative colitis. N Engl J Med 2005;353:2462–76.
- 9. Sandborn WJ, Feagan BG, Marano C, et al. Subcutaneous golimumab induces clinical response and remission in patients with moderate-to-severe ulcerative colitis. Gastroenterology 2014;146:85–95; quiz e14–5.
- 10. Sandborn WJ, Feagan BG, Marano C, et al. Subcutaneous golimumab maintains clinical response in patients with moderate-to-severe ulcerative colitis. Gastroenterology 2014;146:96–109 e1.
- 11. Parikh A, Leach T, Wyant T, et al. Vedolizumab for the treatment of active ulcerative colitis: a randomized controlled phase 2 dose-ranging study. Inflamm Bowel Dis 2012;18:1470–9.
- 12. Feagan BG, Rutgeerts P, Sands BE, et al. Vedolizumab as induction and maintenance therapy for ulcerative colitis. N Engl J Med 2013;369:699–710.
- 13. Sandborn WJ, Su C, Sands BE, et al. Tofacitinib as Induction and Maintenance Therapy for Ulcerative Colitis. N Engl J Med 2017;376:1723–1736.
- 14. Dulai PS, Singh S, Jairath V, et al. Prevalence of endoscopic improvement and remission according to patient-reported outcomes in ulcerative colitis. Aliment Pharmacol Ther 2019.
- 15. Schunemann HJ, Mustafa RA, Brozek J, et al. GRADE guidelines: 22. The GRADE approach for tests and strategies-from test accuracy to patient-important outcomes and recommendations. J Clin Epidemiol 2019;111:69–82.
- 16. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529–36.
- 17. Buisson A, Vazeille E, Minet-Quinard R, et al. Faecal chitinase 3-like 1 is a reliable marker as accurate as faecal calprotectin in detecting endoscopic activity in adult patients with inflammatory bowel diseases. Alimentary Pharmacology & Therapeutics 2016;43:1069–79.
- 18. Chen J-M, Liu T, Gao S, et al. Efficacy of noninvasive evaluations in monitoring inflammatory bowel disease activity: A prospective study in China. World Journal of Gastroenterology 2017;23:8235–8247.
- 19. Ferreiro R, Barreiro-De Acosta M, Vallejo N, et al. Accuracy of a rapid fecal calprotectin test as a predictor of mucosal healing in patients with Ulcerative Colitis (UC). Journal of Crohn’s and Colitis 2015;1):S194.
- 20. Hart L, Kherad O, Lemieux C, et al. Fecal calprotectin correlates to endoscopic and histologic remission in ulcerative colitis: A prospective study. Journal of Crohn’s and Colitis 2017;11 (Supplement 1):S156–S157.
- 21. Hassan EA, Ramadan HK, Ismael AA, et al. Fecal Calprotectin Levels Are Closely Correlated with the Absence of Relevant Mucosal Lesions in Postoperative Crohn’s Disease. Saudi Journal of Gastroenterology 2017;23:238–245.
- 22. Jusue V, Chaparro M, Gisbert JP. Accuracy of fecal calprotectin for the prediction of endoscopic activity in patients with inflammatory bowel disease. Digestive & Liver Disease 2018;50:353–359.
- 23. Kanmura S, Hamamoto H, Morinaga Y, et al. Fecal Human Neutrophil Peptide Levels Correlate with Intestinal Inflammation in Ulcerative Colitis. Digestion 2016;93:300–8.
- 24. Kristensen V, Klepp P, Cvancarova M, et al. Prediction of endoscopic disease activity in ulcerative colitis by two different assays for fecal calprotectin. [Erratum appears in J Crohns Colitis. 2015 Jul;9(7):595; PMID: 26040316]. Journal of Crohn’s & colitis 2015;9:164–9.
- 25. Labaere D, Smismans A, Van Olmen A, et al. Comparison of six different calprotectin assays for the assessment of inflammatory bowel disease. United European Gastroenterology Journal 2014;2:30–7.
- 26. Langhorst J, Boone J, Lauche R, et al. Faecal Lactoferrin, Calprotectin, PMN-elastase, CRP, and White Blood Cell Count as Indicators for Mucosal Healing and Clinical Course of Disease in Patients with Mild to Moderate Ulcerative Colitis: Post Hoc Analysis of a Prospective Clinical Trial. Journal of Crohn’s & colitis 2016;10:786–94.
- 27. Ma C, Lumb R, Walker EV, et al. Noninvasive Fecal Immunochemical Testing and Fecal Calprotectin Predict Mucosal Healing in Inflammatory Bowel Disease: A Prospective Cohort Study. Inflammatory Bowel Diseases 2017;23:1643–1649.
- 28. Magro F, Lopes S, Coelho R, et al. Accuracy of Faecal Calprotectin and Neutrophil Gelatinase B-associated Lipocalin in Evaluating Subclinical Inflammation in UlceRaTIVE Colitis-the ACERTIVE study. Journal of Crohn’s & colitis 2017;11:435–444.
- 29. Mak WY, Buisson A, Andersen MJ Jr., et al. Fecal Calprotectin in Assessing Endoscopic and Histological Remission in Patients with Ulcerative Colitis. Digestive Diseases & Sciences 2018;63:1294–1301.
- 30. Munoz Villafranca C, Perez De Arenaza A, Gomez L, et al. Faecal calprotectin as a biomarker of early mucosal healing in patients with ulcerative colitis naive to adalimumab treatment. Journal of Crohn’s and Colitis 2016;10 (Supplement 1):S339–S340.
- 31. Sandborn WJ, Panes J, Zhang H, et al. Correlation Between Concentrations of Fecal Calprotectin and Outcomes of Patients With Ulcerative Colitis in a Phase 2 Trial. Gastroenterology 2016;150:96–102.
- 32. Scaioli E, Scagliarini M, Cardamone C, et al. Clinical application of faecal calprotectin in ulcerative colitis patients. European Journal of Gastroenterology & Hepatology 2015;27:1418–24.
- 33. Voiosu T, Bengus A, Balanescu P, et al. Rapid Fecal Calprotectin Testing Predicts Mucosal Healing Better than C-reactive Protein and Serum Tumor Necrosis Factor alpha in Patients with Ulcerative Colitis. Romanian Journal of Internal Medicine 2015;53:253–60.