Abstract
Importance:
Surveillance testing is performed after primary treatment for colorectal cancer (CRC), but it is unclear if this testing decreases time to detection of recurrence or affects patient survival.
Objective:
To determine if intensity of post-treatment surveillance is associated with time to detection of CRC recurrence, rate of recurrence, resection for recurrence, or overall survival (OS).
Design:
Observational retrospective cohort study.
Setting:
Patient data abstracted from the medical record as part of a Commission on Cancer (CoC) Special Study merged with records from the National Cancer Database (NCDB).
Participants:
Random sample of Stage I-III CRC patients treated at CoC-accredited facilities during 2006–2007 with follow-up through December 31, 2014.
Exposure:
Intensity of imaging and carcinoembryonic antigen (CEA) surveillance testing [higher (HI) and lower intensity (LI)] derived empirically at the facility level using the observed-to-expected ratio for surveillance testing during a 3-year observation period.
Main Outcomes and Measures:
The primary outcomes were time to detection of CRC recurrence and recurrence rates; secondary outcomes included rates of resection for recurrent disease and overall survival (OS).
Results:
A total of 8,529 patients at 1,175 facilities underwent surveillance imaging and CEA testing in the 3 years after CRC treatment. Patients with HI imaging (49.1%) or CEA (48.5%) had a mean of 2.9 (95% CI 2.8–2.9) imaging and 4.3 (95% CI 4.2–4.4) CEA tests. Patients with LI underwent a mean of 1.6 (95% CI 1.6–1.7) imaging and 1.6 (95% CI 1.6–1.7) CEA tests. There was no difference in median time to detection of recurrence by imaging [HI 15.1 (95% CI 8.2–26.3) vs. LI 16.0 (95% CI 7.9–27.2) months] or CEA intensity [HI 15.9 (95% CI 8.5–27.5) vs. LI 15.3 (95% CI 7.9–25.7) months] (p=0.60 and p=0.39, respectively). Imaging and CEA surveillance intensity were not associated with a difference in recurrence rates [HR 0.98 (95% CI 0.89–1.09) and 1.00 (95% CI 0.90–1.10)], resection for recurrence [HR 0.99 (95% CI 0.89–1.09) and HR 1.00 (95% CI 0.90–1.10)], or overall survival [HR 1.00 (95% CI 0.94–1.08) and HR 0.96 (95% CI 0.89–1.03)], respectively.
Conclusions and Relevance:
Among patients treated for Stage I-III colorectal cancer, there was no significant association between surveillance intensity and recurrence, resection, or overall survival.
INTRODUCTION
Colorectal cancer (CRC) is the third most common cancer and second leading cause of cancer death for men and women in the United States, affecting an estimated 140,250 patients in 2018.1 Approximately 80% of CRC patients present with localized disease (AJCC Stage I-III) for which surgical therapy is curative, and as of 2014, patients with CRC represent the third largest group of cancer survivors.2
After completion of definitive treatment, surveillance is recommended with the goal of improving disease-specific (DSS) and overall survival (OS) by detecting disease recurrence or a second primary cancer early, such that a patient has an opportunity for potentially curative surgery. Additionally, surveillance can be effective in monitoring long-term treatment toxicity, managing patient anxiety, and assuring continuation of cancer survivorship care.
However, the optimal surveillance strategy is unknown, and recent data from randomized trials have not demonstrated a significant survival benefit from intensive follow-up.3,4 Data to inform which testing is most beneficial and how often testing should be performed are limited, and the survival benefit of surveillance in contemporary practice is unknown. National and international consensus guidelines vary, although many still recommend frequent testing, and adherence to these guidelines is inconsistent, resulting in both over-testing and under-testing in clinical practice.5–13 While the frequency with which more intensive testing can identify recurrence earlier and lead to an intervention that improves outcomes is unknown, recent evidence suggests it would have little clinical impact.
The purpose of this study was to determine the association between surveillance intensity (SI) and the detection of CRC recurrence and survival within a large U.S. population outside the parameters of a clinical trial. The hypothesis was that more intensive surveillance would not be associated with earlier recurrence detection or better overall survival.
METHODS
Study Design & Data Sources
Patients aged >18 years with American Joint Committee on Cancer (AJCC) Stage I-III CRC treated with definitive surgical resection in 2006–2007 were identified from the National Cancer Data Base (NCDB). Demographic data were obtained from pre-defined NCDB variables,14 and stage was defined by AJCC 7th edition.15 A random sample of up to 10 CRC patients from each Commission on Cancer (CoC) facility was selected for detailed primary data collection regarding surveillance testing, recurrence, and treatment as part of a CoC Special Study. Pre-defined primary data were collected by cancer registrars from primary chart review of records at the treating facility and physician offices, and then merged with corresponding NCDB records. If patients sought care at separate facilities, registrars obtained records from those facilities and outpatient offices.
The enrollment period (2006–2007) was chosen to ensure a minimum of 5 years of follow-up for recurrence and DSS; OS was assessed over a total of 7 years. Surveillance and recurrence data were collected through December 2012, with vital status through December 2014.
These data were collected via a secure web form housed at the NCDB and provided to the principal investigator as de-identified data in compliance with the Health Insurance Portability and Accountability Act (HIPAA). Study analysis was considered exempt by The MD Anderson Cancer Center Institutional Review Board.
Surveillance start date was defined as 90 days after curative resection. Exclusion criteria included lack of surveillance start date (due to recurrence, death, or loss to follow-up within 90 days of surgery), lack of tumor site or stage, and non-surgical management. Registrars screened patients based on the above criteria; if a patient was excluded, a new randomly selected patient was assigned prior to data collection.
Ascertainment of Surveillance Testing
To assess surveillance intensity, the observed number of imaging studies [computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET)], and carcinoembryonic antigen (CEA) levels obtained for each patient during the surveillance period was recorded. Complementary imaging studies performed within 30 days, such as CT Chest + CT Abdomen/Pelvis or CT Chest + MRI Abdomen/Pelvis, were counted as a single study.
Exposure/Definition of Surveillance Intensity
An observed-to-expected (O/E) ratio of testing, calculated separately for imaging and CEA at both the individual and facility level, was used to account for differences in risk-related surveillance. The observed number (O) of tests within 3 years was calculated for each patient. The expected number (E) was then estimated from the fixed effects and the posterior means of the random effects of a two-level random-intercept negative binomial model predicting O; the model included a random effect for hospital and fixed effects for the variables indicated in Table 1.16
Table 1.
Patient demographics and tumor characteristics among patients who underwent surveillance by facility-level testing intensity (N=8529).
| Characteristics | Imaging, Lower Intensity (N=4,341) | Imaging, Higher Intensity (N=4,188) | Imaging P-value | CEAᵃ, Lower Intensity (N=4,393) | CEA, Higher Intensity (N=4,136) | CEA P-value |
|---|---|---|---|---|---|---|
| Number of tests | | | | | | |
| Mean (95% CI) | 1.63 (1.58-1.68) | 2.87 (2.79-2.94) | <.001 | 1.63 (1.55-1.71) | 4.31 (4.18-4.44) | <.001 |
| Median (IQR) | 1 (0, 2) | 2 (1, 4) | <.001 | 1 (0, 2) | 3 (1, 7) | <.001 |
| Min-Max | 0-14 | 0-27 | NA | 0-33 | 0-41 | NA |
| Age | | | 0.82 | | | 0.77 |
| 18-49 | 502 (50.8) | 487 (49.2) | | 519 (52.5) | 470 (47.5) | |
| 50-64 | 1,357 (51.4) | 1,284 (48.6) | | 1,339 (50.7) | 1,302 (49.3) | |
| 65-74 | 1,138 (49.8) | 1,145 (50.2) | | 1,172 (51.3) | 1,111 (48.7) | |
| 75-90 | 1,285 (51.3) | 1,218 (48.7) | | 1,301 (52) | 1,202 (48) | |
| >90 | 59 (52.2) | 54 (47.8) | | 62 (54.9) | 51 (45.1) | |
| Sex | | | 0.99 | | | 0.35 |
| Male | 2,118 (50.8) | 2,051 (49.2) | | 2,158 (51.8) | 2,011 (48.2) | |
| Female | 2,222 (51) | 2,136 (49) | | 2,233 (51.2) | 2,125 (48.8) | |
| Unknown | 1 (50) | 1 (50) | | 2 (100) | 0 (0) | |
| Race | | | 0.02 | | | 0.03 |
| White | 3,714 (50.5) | 3,642 (49.5) | | 3,783 (51.4) | 3,573 (48.6) | |
| Black | 449 (55.6) | 359 (44.4) | | 441 (54.6) | 367 (45.4) | |
| Others | 178 (48.8) | 187 (51.2) | | 169 (46.3) | 196 (53.7) | |
| Comorbidity score | | | 0.23 | | | 0.85 |
| 0 | 3,065 (50.8) | 2,969 (49.2) | | 3,115 (51.6) | 2,919 (48.4) | |
| 1 | 957 (52.2) | 878 (47.8) | | 935 (51) | 900 (49) | |
| 2 | 319 (48.3) | 341 (51.7) | | 343 (52) | 317 (48) | |
| Insurance status | | | 0.05 | | | 0.49 |
| Private | 1,698 (51.8) | 1,580 (48.2) | | 1,704 (52) | 1,574 (48) | |
| Uninsured | 132 (50.4) | 130 (49.6) | | 134 (51.1) | 128 (48.9) | |
| Medicaid | 174 (54.9) | 143 (45.1) | | 177 (55.8) | 140 (44.2) | |
| Medicare | 2,250 (50.4) | 2,218 (49.6) | | 2,279 (51) | 2,189 (49) | |
| Managed Care | 25 (36.2) | 44 (63.8) | | 31 (44.9) | 38 (55.1) | |
| Unknown | 62 (45.9) | 73 (54.1) | | 68 (50.4) | 67 (49.6) | |
| Median income quartile | | | 0.56 | | | 0.02 |
| $46,000 + | 1,216 (51.9) | 1,126 (48.1) | | 1,261 (53.8) | 1,081 (46.2) | |
| $35,000 - $45,999 | 1,127 (49.9) | 1,132 (50.1) | | 1,166 (51.6) | 1,093 (48.4) | |
| $30,000 - $35,000 | 1,114 (50.6) | 1,089 (49.4) | | 1,122 (50.9) | 1,081 (49.1) | |
| < $30,000 | 884 (51.2) | 841 (48.8) | | 844 (48.9) | 881 (51.1) | |
| Proportion without high school degree by ZIP code | | | 0.42 | | | 0.49 |
| < 14% | 908 (52.1) | 834 (47.9) | | 915 (52.5) | 827 (47.5) | |
| 14% - 19.9% | 1,228 (51.6) | 1,153 (48.4) | | 1,225 (51.4) | 1,156 (48.6) | |
| 20% - 28.9% | 1,360 (50.2) | 1,348 (49.8) | | 1,404 (51.8) | 1,304 (48.2) | |
| 29% + | 845 (49.8) | 853 (50.2) | | 849 (50) | 849 (50) | |
| Population density of residence | | | 0.42 | | | <.01 |
| Metro area | 3,445 (51.3) | 3,265 (48.7) | | 3,543 (52.8) | 3,167 (47.2) | |
| Urban area | 708 (49.3) | 729 (50.7) | | 659 (45.9) | 778 (54.1) | |
| Rural area | 79 (47.6) | 87 (52.4) | | 70 (42.2) | 96 (57.8) | |
| Unknown | 109 (50.5) | 107 (49.5) | | 121 (56) | 95 (44) | |
| Cancer site | | | 0.68 | | | 0.13 |
| Colon | 3,396 (51) | 3,261 (49) | | 3,458 (51.9) | 3,199 (48.1) | |
| Rectum | 945 (50.5) | 927 (49.5) | | 935 (49.9) | 937 (50.1) | |
| Tumor stage* | | | 0.63 | | | 0.71 |
| I | 1,104 (51.7) | 1,032 (48.3) | | 1,091 (51.1) | 1,045 (48.9) | |
| II | 1,519 (50.7) | 1,480 (49.3) | | 1,555 (51.9) | 1,444 (48.1) | |
| III | 1,718 (50.6) | 1,676 (49.4) | | 1,747 (51.5) | 1,647 (48.5) | |
| Tumor histology | | | 0.39 | | | 0.65 |
| Non-mucinous adenocarcinoma | 3,907 (51.1) | 3,735 (48.9) | | 3,949 (51.7) | 3,693 (48.3) | |
| Signet-ring cell | 39 (45.9) | 46 (54.1) | | 42 (49.4) | 43 (50.6) | |
| Mucinous | 395 (49.3) | 407 (50.7) | | 402 (50.1) | 400 (49.9) | |
| Tumor grade | | | 0.08 | | | 0.43 |
| Well/Moderately differentiated | 3,632 (51.2) | 3,464 (48.8) | | 3,661 (51.6) | 3,435 (48.4) | |
| Poorly | 650 (48.7) | 686 (51.3) | | 680 (50.9) | 656 (49.1) | |
| Undifferentiated | 58 (61.1) | 37 (38.9) | | 52 (54.7) | 43 (45.3) | |
| Unknown | 1 (50) | 1 (50) | | 0 (0) | 2 (100) | |
| Total lymph nodes accessed | | | 0.19 | | | 0.13 |
| 0-11 | 1,302 (52.4) | 1,184 (47.6) | | 1,238 (49.8) | 1,248 (50.2) | |
| 12+ | 3,009 (50.3) | 2,978 (49.7) | | 3,125 (52.2) | 2,862 (47.8) | |
| Unknown | 30 (53.6) | 26 (46.4) | | 30 (53.6) | 26 (46.4) | |
| Lymphovascular invasion | | | 0.19 | | | 0.25 |
| Yes | 879 (50.1) | 875 (49.9) | | 880 (50.2) | 874 (49.8) | |
| No | 2,582 (50.9) | 2,492 (49.1) | | 2,603 (51.3) | 2,471 (48.7) | |
| Unknown | 861 (52.2) | 790 (47.8) | | 884 (53.5) | 767 (46.5) | |
| Not applicable | 19 (38) | 31 (62) | | 26 (52) | 24 (48) | |
| Perineural invasion | | | 0.70 | | | 0.03 |
| Yes | 228 (51.9) | 211 (48.1) | | 223 (50.8) | 216 (49.2) | |
| No | 1,992 (51.4) | 1,881 (48.6) | | 1,981 (51.1) | 1,892 (48.9) | |
| Unknown | 2,065 (50.4) | 2,036 (49.6) | | 2,144 (52.3) | 1,957 (47.7) | |
| Not applicable | 56 (48.3) | 60 (51.7) | | 45 (38.8) | 71 (61.2) | |
| Surgical margin status | | | 0.45 | | | 0.14 |
| Negative | 4,080 (50.8) | 3,948 (49.2) | | 4,120 (51.3) | 3,908 (48.7) | |
| Positive | 217 (53.3) | 190 (46.7) | | 228 (56) | 179 (44) | |
| Unknown | 44 (46.8) | 50 (53.2) | | 45 (47.9) | 49 (52.1) | |
| Treatment sequence | | | 0.57 | | | 0.80 |
| No chemotherapy | 2,213 (51.6) | 2,078 (48.4) | | 2,228 (51.9) | 2,063 (48.1) | |
| Neo-adjuvant | 124 (50.4) | 122 (49.6) | | 130 (52.8) | 116 (47.2) | |
| Adjuvant | 1,824 (50) | 1,821 (50) | | 1,861 (51.1) | 1,784 (48.9) | |
| Other | 180 (51.9) | 167 (48.1) | | 174 (50.1) | 173 (49.9) | |
| Facility type | | | <0.01 | | | <0.01 |
| Community | 1,150 (48.8) | 1,206 (51.2) | | 1,103 (46.8) | 1,253 (53.2) | |
| Comprehensive | 2,318 (50.5) | 2,271 (49.5) | | 2,422 (52.8) | 2,167 (47.2) | |
| Academic/Research | 850 (54.8) | 700 (45.2) | | 851 (54.9) | 699 (45.1) | |
| Others/Unknown | 23 (67.6) | 11 (32.4) | | 17 (50) | 17 (50) | |
| Distance traveled for care (miles) | | | 0.64 | | | 0.01 |
| 0-23 | 1,107 (51.7) | 1,035 (48.3) | | 1,108 (51.7) | 1,034 (48.3) | |
| 24-60 | 1,073 (51.3) | 1,018 (48.7) | | 1,137 (54.4) | 954 (45.6) | |
| 61-141 | 1,092 (49.8) | 1,100 (50.2) | | 1,125 (51.3) | 1,067 (48.7) | |
| 142+ | 1,069 (50.8) | 1,035 (49.2) | | 1,023 (48.6) | 1,081 (51.4) | |

ᵃ Carcinoembryonic antigen (CEA)
Comparison of the multilevel model to a non-multilevel model (likelihood-ratio test) indicated a highly significant clustering effect of testing intensity by facility (p<.0001). Therefore, the O/E for each facility was calculated based on the sum of the individuals from that facility, and the facility was categorized into higher intensity (HI; O/E ≥ 1) or lower intensity (LI; O/E < 1) categories for comparison. Because patients with documented recurrence undergo more testing during evaluation of recurrence, only those disease-free with at least 3-year survival were included for the O/E ratio estimation (n=6279).
To minimize the effect of provider-level bias and the potential to misclassify confirmatory tests in the setting of suspected recurrence as surveillance tests, patients were assigned a surveillance intensity group (HI vs. LI) based on the facility at which they received care. After HI and LI facility-based groups were empirically defined, patients who died or developed documented recurrence were reintroduced into the analysis and assigned to the intensity group based on their treatment facility. Analyses were then performed at the individual patient level using a facility-based HI versus LI assignment.
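The facility-level classification described above can be illustrated with a minimal Python sketch. It assumes that the facility O/E ratio is the sum of observed tests divided by the sum of expected tests across that facility's patients, which is one reading of "based on the sum of the individuals". The expected counts are hypothetical inputs standing in for the output of the multilevel negative binomial model, which is not fitted here. The function name `classify_facilities` and the example records are illustrative only.

```python
from collections import defaultdict


def classify_facilities(records):
    """Aggregate patient-level counts to a facility O/E ratio and HI/LI label.

    records: iterable of (facility_id, observed_tests, expected_tests).
    Assumption: facility O/E = sum(observed) / sum(expected).
    """
    observed_sum = defaultdict(float)
    expected_sum = defaultdict(float)
    for facility_id, observed, expected in records:
        observed_sum[facility_id] += observed
        expected_sum[facility_id] += expected

    groups = {}
    for facility_id, total_expected in expected_sum.items():
        if total_expected <= 0:
            continue  # O/E is undefined without a positive expected count
        ratio = observed_sum[facility_id] / total_expected
        # Threshold from the text: HI if O/E >= 1, LI if O/E < 1
        groups[facility_id] = (ratio, "HI" if ratio >= 1 else "LI")
    return groups


# Hypothetical patients: (facility, observed tests in 3 years, model-expected tests)
records = [
    ("A", 3, 2.0), ("A", 2, 2.5), ("A", 4, 2.5),
    ("B", 1, 2.0), ("B", 0, 1.5), ("B", 2, 2.5),
]
groups = classify_facilities(records)
for facility_id, (ratio, label) in sorted(groups.items()):
    print(facility_id, round(ratio, 2), label)
```

Every patient treated at a facility then inherits that facility's label, regardless of how many tests the individual patient received.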
Primary and Secondary Outcomes
The primary outcomes of interest were time to detection of recurrence (locoregional or distant) and cumulative recurrence rates, confirmed pathologically or clinically. Pre-specified secondary outcomes included resection for recurrent distant disease and OS.
Post-hoc sensitivity analyses for internal validity were performed: 1) only Stage II-III patients for O/E ratio estimation; 2) facilities with the highest and lowest quartiles of imaging and CEA testing; 3) hospital-level agreement for intensity by imaging vs. CEA; 4) analysis using missing/unavailable data as a separate category.
Statistical Analysis
Baseline characteristics were compared across intensity groups using the Chi-square test for categorical variables, t-test for means, and Kruskal–Wallis test for medians of continuous variables. Cumulative recurrence rates, resection rates, and overall survival were determined using the Kaplan-Meier (KM) method and compared by log-rank test. Adjusted analyses were performed using multivariable Cox regression, controlling for the fixed effect variables used for intensity estimation. The proportional-hazards assumption for HI versus LI was verified graphically using the “log-log” plot, where the -ln{-ln(survival)} curves of the covariate versus ln(analysis time) demonstrated reasonably parallel curves. Patients were censored at time of death, loss to follow-up, or end of surveillance study period (5 years).
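For readers less familiar with the Kaplan-Meier product-limit estimate, a minimal pure-Python sketch is shown here. It is not the Stata routine used in the study. The function name `kaplan_meier` and the follow-up times are hypothetical. Censored patients remain in the risk set until their censoring time, and cumulative incidence of an event at time t is taken as 1 minus the estimate.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times:  follow-up time for each patient (e.g., months)
    events: 1 if the event was observed, 0 if the patient was censored
    Returns a list of (time, survival_probability) tuples.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t
        n_at_risk -= n_leaving
    return curve


# Hypothetical cohort: months of follow-up, with administrative censoring at 60 months
times = [6, 12, 12, 20, 30, 60]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t} months: S(t)={s:.3f}, cumulative incidence={1 - s:.3f}")
```

In this toy cohort the estimate drops only at months 6, 12, and 20; the censored observations change the size of the risk set but do not lower the curve.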
Agreement of imaging/CEA intensity was measured in a continuous form using a Bland-Altman plot. The limits of agreement were defined as the mean difference ± 1.96 standard deviation of the differences.
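The limits of agreement defined above can be computed as in the following sketch. The facility O/E values are hypothetical, the function name `bland_altman_limits` is illustrative, and the use of the sample (n-1) standard deviation is an assumption, since the text does not specify which form was used.

```python
from statistics import mean, stdev


def bland_altman_limits(a, b):
    """Return mean difference, lower/upper limits of agreement, and the
    fraction of paired differences falling outside those limits."""
    diffs = [x - y for x, y in zip(a, b)]
    d_bar = mean(diffs)
    sd = stdev(diffs)  # assumption: sample standard deviation
    lower = d_bar - 1.96 * sd
    upper = d_bar + 1.96 * sd
    n_outside = sum(1 for d in diffs if d < lower or d > upper)
    return d_bar, lower, upper, n_outside / len(diffs)


# Hypothetical facility-level O/E ratios for imaging and CEA testing
imaging_oe = [0.8, 1.1, 1.3, 0.9, 1.0]
cea_oe = [0.7, 1.2, 1.1, 1.0, 1.0]
d_bar, lower, upper, frac_outside = bland_altman_limits(imaging_oe, cea_oe)
print(f"mean difference {d_bar:.3f}, limits of agreement {lower:.3f} to {upper:.3f}")
print(f"fraction of facilities outside limits: {frac_outside:.1%}")
```

If the differences are approximately normally distributed, about 5% of facilities would be expected to fall outside the limits by construction, which is the benchmark against which the observed proportion is read.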
Two variables, perineural invasion (PNI) and lymphovascular invasion (LVI), had > 5% missing values. Multiple imputation was used for these variables in multivariable Cox regression. All tests were two-sided, with alpha 0.05, using SAS software (version 9.1.3; SAS Institute, Cary, NC) for data processing and Stata MP (version 13.1; StataCorp, College Station, TX) for statistical analyses.
RESULTS
Study Population
Of the 11,100 patients randomly selected for detailed primary data collection (Figure 1), 8,542 had surveillance testing within 3 years after primary resection, with a median follow-up of 5 years (IQR 3.75–5) because surveillance data collection was truncated at 5 years. Of the 8,542 patients who underwent surveillance testing, 6,279 remained disease-free at 3 years and were eligible for O/E estimation. The final study cohort (N=8,529) included patients who underwent surveillance testing and were treated at a facility allowing O/E assignment (Figure 1). Demographic data by surveillance intensity are shown in Table 1. Five- and 7-year OS for the final cohort (N=8,529) were 73.7% and 65.6%, respectively.
Figure 1.
Patient selection
Surveillance Intensity
Based on the estimated O/E ratio, 613 facilities were designated LI and 562 HI for imaging and 636 facilities LI and 539 HI based on CEA testing frequency. The mean number of imaging tests per patient within 3 years for the cohorts was 1.63 (95%CI 1.58–1.68) for LI and 2.87 (95%CI 2.79–2.94) for HI (p<0.001). The mean number of CEA tests per patient within 3 years was 1.63 (95%CI 1.55–1.71) for LI and 4.31 (95%CI 4.18–4.44) for HI (p<0.001)(Table 1). Mean number of tests performed was associated with stage, but stage was not associated with intensity assignment (Supplemental Figure 2).
Time to Detection of Recurrence
There was no difference in median time to recurrence (any) detection by imaging [HI 15.1 (IQR 8.2–26.3) vs. LI 16.0 (IQR 7.9–27.2) months, p=0.60] or CEA intensity [HI 15.9 (IQR 8.5–27.5) vs. LI 15.3 (IQR 7.9–25.7) months, p=0.39]. Additionally, there was no association between surveillance intensity and detection of locoregional recurrence by imaging [HI 12.2 (IQR 7.6–23.1) vs. LI 13.7 (IQR 7.2–25.6) months, p=0.41] or CEA [HI 11.3 (IQR 6.6–23.1) vs. LI 14.1 (IQR 8.2–26.2) months, p=0.10]. Similar findings were noted for detection of distant recurrence [HI imaging 16.0 (IQR 8.7–27.6) vs. LI 16.6 (IQR 7.9–27.0) months, p=0.61 and HI CEA 17.0 (IQR 8.8–29.2) vs. LI 15.5 (IQR 7.8–26.1) months, p=0.07].
Recurrence Rate
Overall 5-year recurrence rates did not differ based on intensity of surveillance imaging or CEA testing (20.8% vs. 21.1%, HI vs. LI imaging, p=0.83; 21.0% vs. 20.9%, HI vs. LI CEA, p=0.99). This finding persisted after stratification by tumor site (Figures 2A-B). Additionally, no difference in 5-year recurrence rates was observed based on intensity of surveillance stratified by stage of the primary tumor [Stage I: 5.9% vs. 7.1%, HI vs. LI imaging, p=0.35; 6.1% vs. 7.0%, HI vs. LI CEA, p=0.49; Stage II: 17.0% vs. 18.3%, HI vs. LI imaging, p=0.35; 18.0% vs. 17.3%, HI vs. LI CEA, p=0.76; Stage III: 33.3% vs. 32.7%, HI vs. LI imaging, p=0.64; 32.9% vs. 33.1%, HI vs. LI CEA, p=0.97].
Figure 2.
a. Cumulative incidence of recurrence by disease site and facility-level imaging testing intensity and log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 5(4.9–5), 5(5–5), 5(4.95–5) and 5(5–5) for O/E<1 colon, O/E<1 rectum, O/E>=1 colon, and O/E>=1 rectum, respectively, as derived using reversed Kaplan-Meier method.
b. Cumulative incidence of recurrence by disease site and facility-level CEA testing intensity using log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 5(4.8–5), 5(5–5), 5(5–5) and 5(5–5) for O/E<1 colon, O/E<1 rectum, O/E>=1 colon, and O/E>=1 rectum, respectively, as derived using reversed Kaplan-Meier method.
Multivariable Cox regression adjusting for the patient and tumor related factors demonstrated no association between imaging and CEA surveillance intensity and recurrence rates [HR 0.98 (95%CI 0.89–1.09) and 1.00 (95%CI 0.91–1.10)].
Resection Rates and Overall Survival
There was no difference in the proportion of patients who underwent resection for recurrence at 3 or 5 years by imaging surveillance intensity (LI 3.53% vs. HI 4.19% and LI 4.59% vs. HI 5.43%, respectively, p=0.08), nor by CEA intensity (LI 3.67% vs. HI 4.05% and LI 4.76% vs. HI 5.27%, respectively, p=0.34). When stratified by stage, patients with Stage III CRC who underwent HI imaging surveillance had a slightly higher 5-year resection rate for recurrence than patients with LI imaging (8.9% vs. 7.5%; p=0.27) (Figure 3A), although this difference was not significant. A similar trend was identified among Stage II CRC patients (4.7% vs. 3.8%, p=0.41). Higher intensity of CEA testing was not associated with higher resection rates among Stage II or III patients (4.5% vs. 4.0%; p=0.84 and 8.8% vs. 7.6%; p=0.17, respectively) (Figure 3B). Multivariable Cox regression showed no difference by imaging (HR 1.22; 95%CI 0.99–1.50) or CEA intensity (HR 1.12; 95%CI 0.91–1.38).
Figure 3.
a. Cumulative incidence of recurrence treated with curative resection by stage at primary diagnosis and facility-level imaging intensity using log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 5(4.6–5), 5(2.9–5), 5(1.8–5), 5(4.5–5), 5(3.3–5), and 5(1.9–5) for O/E<1 stage I, O/E<1 stage II, O/E<1 stage III, O/E>=1 stage I, O/E>=1 stage II, O/E>=1 stage III, respectively, as derived using reversed Kaplan-Meier method.
b. Cumulative incidence of recurrence treated with curative resection by stage at primary diagnosis and facility-level carcinoembryonic antigen (CEA) intensity using log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 5(4.7–5), 5(3.1–5), 5(1.7–5), 5(4.5–5), 5(3.2–5), and 5(2.0–5) for O/E<1 stage I, O/E<1 stage II, O/E<1 stage III, O/E>=1 stage I, O/E>=1 stage II, O/E>=1 stage III, respectively, as derived using reversed Kaplan-Meier method.
Finally, 5- and 7-year OS rates did not differ based on imaging or CEA intensity. Among patients undergoing HI imaging surveillance, 5- and 7-year OS rates were 73.7% (95%CI 72.4%–75.0%) and 65.6% (95%CI 64.1%–67.0%), compared to 73.7% (95%CI 72.4%–75.0%) and 65.5% (95%CI 64.1%–67.0%) in the LI group (p=0.74). Similarly, among patients undergoing HI CEA surveillance testing, 5- and 7-year OS rates were 74.3% (95%CI 72.9%–75.6%) and 66.4% (95%CI 64.9%–67.9%) vs. 73.1% (95%CI 71.8%–74.4%) and 64.7% (95%CI 63.2%–66.2%) in the LI group (p=0.19). Further, survival rates did not differ based on imaging or CEA intensity when stratifying by tumor site (Figure 4A-B).
Figure 4.
a. Overall survival by disease site and facility-level imaging testing intensity using log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 7.8(7.2–8.4), 7.7(6.9–8.4), 7.8(7.1–8.4) and 7.7(7.0–8.4) for O/E<1 colon, O/E<1 rectum, O/E>=1 colon, and O/E>=1 rectum, respectively, as derived using reversed Kaplan-Meier method. Survival curves were truncated at 7 years.
b. Overall survival by disease site and facility-level CEA testing intensity using log-rank test for comparisons (N=8,529)*
O/E: Observed/Expected ratio
*Median years of follow-up (interquartile range) were 7.8(7.1–8.4), 7.7(6.9–8.4), 7.8(7.2–8.4) and 7.7(7.0–8.3) for O/E<1 colon, O/E<1 rectum, O/E>=1 colon, and O/E>=1 rectum, respectively, as derived using reversed Kaplan-Meier method. Survival curves were truncated at 7 years.
Multivariable Cox regression failed to demonstrate an association between imaging and CEA surveillance intensity and OS [HR 1.00 (95%CI 0.91–1.10) and HR 0.96 (95%CI 0.89–1.03), respectively].
Sensitivity Analyses
Several sensitivity analyses were performed: 1) analysis using only Stage II and III patients for O/E ratio estimation [Recurrence: HR 1.06 (95%CI 0.96–1.17) for imaging and 0.97 (95%CI 0.88–1.07) for CEA; OS: HR 0.99 (95%CI 0.93–1.07) for imaging and 0.94 (95%CI 0.89–1.01) for CEA]; 2) comparison of recurrence detection and OS between facilities at the highest versus lowest quartiles of testing using multivariable Cox regression with imputation [Recurrence: HR 1.17 (95%CI 1.02–1.34) for imaging and 1.06 (95%CI 0.93–1.22) for CEA; OS: HR 1.09 (95%CI 0.98–1.20) for imaging and 0.99 (95%CI 0.90–1.10) for CEA]; 3) a Bland-Altman plot, which indicated agreement of intensity measured by either imaging or CEA, with only 5.9% of the hospitals outside the limits of agreement (Supplemental Figure 1);17 and 4) PNI and LVI modeled in multivariable Cox regression as a separate category (without imputation) [Recurrence: HR 0.98 (95%CI 0.89–1.09) for imaging and 1.00 (95%CI 0.9–1.10) for CEA; OS: HR 0.98 (95%CI 0.90–1.06) for imaging and 0.95 (95%CI 0.87–1.04) for CEA].
DISCUSSION
In this study of a large, national population of patients with Stage I-III colorectal cancer, intensity of follow-up testing by imaging or CEA was not associated with time to detection of disease recurrence. Additionally, no significant association was identified between surveillance testing intensity and overall survival. Stage II-III patients undergoing HI imaging surveillance had a slightly higher rate of resection for recurrence compared to patients with LI imaging surveillance, but this difference was not significant.
These findings differ from traditional assumptions regarding the association between follow-up testing and survival after curative treatment of localized colorectal cancer. In this study, intensity was empirically defined. Test utilization at HI facilities was consistent with current U.S. guidelines for annual CT (2.9 imaging tests/3 years) but less frequent than the guideline recommendation of CEA testing every 3–6 months (4.3 CEA tests/3 years).5,8–10,18 LI facilities performed follow-up testing at rates consistent with less intensive follow-up schedules, including at least 1 imaging test and 1 CEA test during the first 3 years of follow-up.6,7,19 These observed rates reflect everyday practice outside the rigorously controlled context of clinical trials. Current surveillance recommendations for more intensive follow-up are derived from historical data associating intensive surveillance with earlier detection of recurrence, and therefore improved overall, but not cancer-specific, survival.20–23 However, two recently completed randomized trials of high intensity surveillance vs. minimal follow-up failed to demonstrate improved survival, and thus have called these traditional assumptions into question,3,4 leaving recent guidelines unable to recommend an optimal strategy.24
The Follow-up after Colorectal Surgery (FACS) randomized trial showed a small increase in rates of curative resection with intensive imaging and CEA testing compared to minimal follow-up. However, the absolute difference was only 5%, and combining CEA and imaging did not increase resection rates.3 Moreover, there was no significant difference in the number of deaths with intensive follow-up. Similarly, the CEAwatch trial found a higher proportion of recurrences amenable to curative resection among patients followed in an intensive surveillance protocol compared to standard follow-up, although overall recurrence rates in this study were low (7.5%) and OS was not assessed.
The Gruppo Italiano Lavoro per la Diagnosi Anticipata (GILDA) trial compared intensive testing with semiannual liver ultrasound, annual colonoscopy, and chest imaging to less intensive follow-up with 2 ultrasounds within 16 months and colonoscopy at years 1 and 4 for patients with Dukes stage B2 and C CRC. Although recurrence was detected a median of 5.9 months earlier with intensive surveillance, there was no difference in OS.4 Earlier evidence in support of high intensity testing is likely attributable to early recurrence from occult metastases present at diagnosis, an effect mitigated by the marked improvements in preoperative staging over time.3 It is also plausible that the historical benefits from follow-up may stem, at least in part, from regular contact with a healthcare provider, as opposed to earlier detection of recurrent disease. Additional insight from the COLOFOL trial, comparing the impact of high vs. low intensity surveillance on 5-year DSS and OS, is highly anticipated.25
In the present study, as in recent trials, absolute rates of salvage resection were up to 1.4% higher in facilities with HI imaging surveillance among Stage III patients; however, this small difference is not likely to be clinically meaningful. Moreover, rates of recurrence detection did not differ between HI and LI, suggesting that recurrences are identified through a combination of imaging, CEA, and symptom-driven evaluation. It is conceivable that even higher surveillance intensity or higher rates of curative resection for recurrent disease in the HI group could have resulted in a greater effect. This may explain the lack of association of HI CEA testing intensity on rates of salvage surgery, which were lower than those observed with HI CEA testing in the FACS or CEAwatch trials.3,26 Sensitivity analyses comparing recurrence and survival among patients treated at the highest vs. lowest quartile facilities showed no overall differences in survival by surveillance intensity. While there was a small observed effect on recurrence detection when comparing the highest vs. lowest quartile for imaging intensity, this was not associated with differences in survival.
One explanation for the lack of OS effect by surveillance intensity is the low event rate of salvage surgery observed in the present study and in recently reported trials. This study may be underpowered to detect a difference in OS based on resection rates, although the absolute difference in OS would be expected to be small. Furthermore, LI was not equivalent to no follow-up, but rather reduced follow-up during the first 3 years, when the majority of recurrences occur. It is also possible, if not likely, that patients at highest risk of recurrence (e.g. stage IIIC) might receive more intensive surveillance, but may also have recurrences not amenable to curative resection. While it is generally accepted that salvage surgery can improve survival, the risk of disease relapse remains high, and the benefit may be limited for some patients. Thus, the value of earlier detection without survival impact is likely to be dependent upon the personal values and preferences of the patient, highlighting the importance of individualizing surveillance plans. It is also important to recognize that surveillance testing is not without potential harm, including false positives leading to unnecessary tests or procedures; radiation exposure; contrast toxicity; and other testing-related complications.
Based on these data and the recent FACS trial, current NCCN guideline recommendations (CT testing every six months for three years) could be considered over-testing given the absence of improvement in recurrence detection or survival. Moreover, these data suggest that the recommendation of 2 CT scans in the first 3 years and CEA testing every 6 months in the first 3 years by the National Institute for Health and Care Excellence in the UK is appropriate.27
Limitations
This study has several limitations. First, this is a retrospective, observational cohort study; patients were not randomized, and the individual decision-making driving testing frequency could not be accounted for. To control for this bias, surveillance intensity was assigned at the level of the treatment facility, leveraging the clustered facility practice effect, in addition to adjusting for patient factors. Second, because we collected data on up to 10 patients for each facility regardless of volume, it is possible that low-volume facilities are overrepresented. Third, although the cohort may appear less contemporary, patients were selected from 2006–2007 to allow at least 5 years of follow-up data regarding surveillance and recurrence, and were followed through 2014 for vital status. Although there have been advances in systemic therapy for metastatic disease over the study period, in the absence of significant differences in the detection and treatment of recurrence these advances affect the LI and HI cohorts similarly, and therefore do not affect the primary findings of this study. Fourth, retrospectively collected data are dependent upon the quality of data abstraction. It is possible that data loss occurred if patients completed follow-up or sought treatment at another institution. However, registrars were asked to follow up with outside institutions to ensure data completeness and, in a post-review survey, reported a high level of confidence (median score 95, IQR 75–100) in the completeness of their review. Finally, the empiric definition of HI surveillance in this study included less intensive follow-up than is practiced in some centers, so a benefit of even higher intensity cannot be excluded. However, the observed surveillance testing is reflective of contemporary community practice, and therefore provides relevant data for comparison.
Conclusions
Among patients treated for Stage I-III colorectal cancer, there was no significant association between higher intensity vs. lower intensity imaging or CEA surveillance and time to detection of cancer recurrence. Additionally, surveillance intensity was not associated with overall survival.
Supplementary Material
Key Points:
Question:
Is there an association between intensity of post-treatment surveillance for Stage I-III colorectal cancer and time to recurrence detection?
Findings:
This retrospective cohort study included 8529 patients with Stage I-III colorectal cancer. Median time to recurrence was 15.1 (95% CI 8.2–26.3) and 15.9 (95% CI 8.5–27.5) months in the higher intensity imaging and carcinoembryonic antigen (CEA) surveillance groups, and 16.0 (95% CI 7.9–27.2) and 15.3 (95% CI 7.9–25.7) months in the lower intensity surveillance groups, a difference that was not statistically significant.
Meaning:
There was no significant association between intensity of surveillance and time to detection of colorectal cancer recurrence.
Acknowledgements
Dr. George Chang had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The funding organizations, including the Patient-Centered Outcomes Research Institute (PCORI), National Cancer Institute (NCI), and Alliance for Clinical Trials in Oncology, had no role in the collection and management of the data, approval of the manuscript, or the decision to submit the manuscript for publication.
Support: Research reported in this publication was supported by the Patient-Centered Outcomes Research Institute (PCORI) Award [CE13–04-6855, GJC]. This work was also supported, in part, by the National Cancer Institute of the National Institutes of Health under the Award Numbers P30CA016672 (The University of Texas, MD Anderson Cancer Center Support Grant) and U10CA180821 (to the Alliance for Clinical Trials in Oncology). All statements in this publication, including its findings, are solely those of the authors and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute (PCORI), its Board of Governors or Methodology Committee, or the National Institutes of Health.
Footnotes
Conflicts of interest: The authors have no conflicts of interest to disclose.
ClinicalTrials.gov Identifier: NCT02217865
REFERENCES
- 1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Clin 2018;68(1):7–30.
- 2. Howlader N, Noone AM, Krapcho M, Garshell J, Miller D, Altekruse SF, Kosary CL, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA. SEER Cancer Statistics Review, 1975–2013, National Cancer Institute. Bethesda, MD.
- 3. Primrose JN, Perera R, Gray A, et al. Effect of 3 to 5 years of scheduled CEA and CT follow-up to detect recurrence of colorectal cancer: the FACS randomized clinical trial. JAMA. 2014;311(3):263–270.
- 4. Rosati G, Ambrosini G, Barni S, et al. A randomized trial of intensive versus minimal surveillance of patients with resected Dukes B2-C colorectal carcinoma. Ann Oncol. 2016;27(2):274–280.
- 5. Benson AB 3rd, Venook AP, Bekaii-Saab T, et al. Rectal Cancer, Version 2.2015. J Natl Compr Canc Netw 2015;13(6):719–728; quiz 728.
- 6. Glimelius B, Tiret E, Cervantes A, Arnold D, ESMO Guidelines Working Group. Rectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2013;24 Suppl 6:vi81–88.
- 7. Labianca R, Nordlinger B, Beretta GD, et al. Early colon cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2013;24 Suppl 6:vi64–72.
- 8. Meyerhardt JA, Mangu PB, Flynn PJ, et al. Follow-up care, surveillance protocol, and secondary prevention measures for survivors of colorectal cancer: American Society of Clinical Oncology clinical practice guideline endorsement. J Clin Oncol 2013;31(35):4465–4470.
- 9. Steele SR, Chang GJ, Hendren S, et al. Practice Guideline for the Surveillance of Patients After Curative Treatment of Colon and Rectal Cancer. Dis Colon Rectum 2015;58(8):713–725.
- 10. Group CCS. Follow-up Care, Surveillance Protocol, and Secondary Prevention Measures for Survivors of Colorectal Cancer 2016.
- 11. National Institute for Health and Care Excellence. Colorectal cancer: the diagnosis and management of colorectal cancer. National Institute for Health and Clinical Excellence; 2016.
- 12. New Zealand Guidelines Group. Management of Early Colorectal Cancer. In: Ministry of Health, ed. Wellington, New Zealand: New Zealand Guidelines Group; 2011.
- 13. Benson AB 3rd, Venook AP, Bekaii-Saab T, et al. Colon cancer, version 3.2014. J Natl Compr Canc Netw 2014;12(7):1028–1059.
- 14. Boffa DJ, Rosen JE, Mallin K, et al. Using the National Cancer Database for Outcomes Research. JAMA Oncol 2017;3(12):1722–1728.
- 15. Edge SB, American Joint Committee on Cancer. AJCC cancer staging manual. 7th ed. New York: Springer; 2010.
- 16. Rabe-Hesketh S, Skrondal A. Multilevel and longitudinal modeling using Stata. 3rd ed. College Station, Tex.: Stata Press Publication; 2012.
- 17. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310.
- 18. Benson AB 3rd, Venook AP, Cederquist L, et al. Colon Cancer, Version 1.2017, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2017;15(3):370–398.
- 19. National Institute for Health and Care Excellence. Colorectal cancer: diagnosis and management. Clinical guideline (CG131). 2011; https://www.nice.org.uk/guidance/CG131. Accessed August 27, 2017.
- 20. Jeffery M, Hickey BE, Hider PN. Follow-up strategies for patients treated for non-metastatic colorectal cancer. Cochrane Database Syst Rev. 2007(1):CD002200.
- 21. Tjandra JJ, Chan MK. Follow-up after curative resection of colorectal cancer: a meta-analysis. Dis Colon Rectum. 2007;50(11):1783–1799.
- 22. Pita-Fernandez S, Alhayek-Ai M, Gonzalez-Martin C, Lopez-Calvino B, Seoane-Pillado T, Pertega-Diaz S. Intensive follow-up strategies improve outcomes in nonmetastatic colorectal cancer patients after curative surgery: a systematic review and meta-analysis. Ann Oncol 2015;26(4):644–656.
- 23. Renehan AG, Egger M, Saunders MP, O’Dwyer ST. Impact on survival of intensive follow up after curative resection for colorectal cancer: systematic review and meta-analysis of randomised trials. BMJ (Clinical research ed). 2002;324(7341):813.
- 24. Leong K, Hartley J, Karandikar S. Association of Coloproctology of Great Britain & Ireland (ACPGBI): Guidelines for the Management of Cancer of the Colon, Rectum and Anus (2017) - Follow Up, Lifestyle and Survivorship. Colorectal Dis. 2017;19 Suppl 1:67–70.
- 25. Wille-Jorgensen P, Laurberg S, Pahlman L, et al. An interim analysis of recruitment to the COLOFOL trial. Colorectal Dis. 2009;11(7):756–758.
- 26. Verberne CJ, Zhan Z, van den Heuvel E, et al. Intensified follow-up in colorectal cancer patients using frequent Carcino-Embryonic Antigen (CEA) measurements and CEA-triggered imaging: Results of the randomized “CEAwatch” trial. Eur J Surg Oncol. 2015;41(9):1188–1196.
- 27. National Institute for Health and Care Excellence. Colorectal cancer: diagnosis and management. Clinical guideline (CG131). 2011 (updated 2014); https://www.nice.org.uk/guidance/CG131. Accessed January 15, 2018.