Author manuscript; available in PMC: 2016 Jun 1.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2015 Mar 23;24(6):913–920. doi: 10.1158/1055-9965.EPI-14-1321

Development and validation of a clinical score for predicting risk of adenoma at screening colonoscopy

Aasma Shaukat 1,2, Timothy R Church 3, Ryan Shanley 4, Noah D Kauff 5, Michael J O’Brien 6, Glenn M Mills 7, Paul A Jordan 8, John A Allen 2,9, Adam Kim 9, Andrew D Feld 10, Ann Graham Zauber 11, Sidney J Winawer 12
PMCID: PMC4452431  NIHMSID: NIHMS674174  PMID: 25800242

Abstract

Background

Currently no clinical tools use demographic and risk factor information to predict the risk of finding an adenoma in individuals undergoing colon cancer screening. Such a tool would be valuable for identifying those who would most benefit from screening colonoscopy.

Methods

We used baseline data from men and women who underwent screening colonoscopy in the randomized, multicenter National Colonoscopy Study (NCS) to develop and validate an adenoma risk model. The study, conducted at three sites in the U.S. (Minneapolis, MN; Seattle, WA; and Shreveport, LA), asked all participants to complete baseline questionnaires on clinical risk factors and family history. Model parameters estimated from logistic regression yielded an area under the receiver operating characteristic curve (AUROCC) used to assess prediction.

Results

541 subjects were included in the development model, and 1334 in the validation of the risk score. Variables in the prediction of adenoma risk for colonoscopy screening were age (likelihood ratio test for overall contribution to model, p<0.001), male sex (p<0.001), body mass index (BMI) (p<0.001), family history of at least one first-degree relative with colorectal cancer (p = 0.036), and smoking history (p<0.001). The adjusted AUROCC of 0.67 (95% CI 0.61, 0.74) for the derivation cohort was not statistically significantly different from that in the validation cohort. The adjusted AUROCC for the entire cohort was 0.64 (95% CI: 0.60, 0.67).

Conclusion

We developed and validated a simple, well-calibrated risk score.

Impact

This tool may be useful for estimating risk of adenomas in screening-eligible men and women.

Keywords: Colorectal Cancer, prediction: risk score, adenoma

INTRODUCTION

Colorectal cancer (CRC) is the second leading cause of cancer-related death in the U.S.[1] Current guidelines recommend initiating screening for asymptomatic men and women at age 50, using a menu of screening options.[2] Most CRCs are thought to arise from precursor lesions called adenomas.[3–5] The 2008 U.S. Multi-Society Task Force screening guidelines emphasized that the primary goal of screening should be prevention of CRC by detection and removal of asymptomatic adenomas.[2] Recent guidelines on CRC screening by the American College of Physicians recommend that individualized risk assessment for risk of CRC should be performed in all adults, and a screening modality should be selected based on their risk.[6]

Several demographic and clinical risk factors for harboring adenomas in asymptomatic men and women age 50 and over have been identified in large cohort and case-control studies and include increasing age, male sex, race, and a family history of colorectal cancer in a first-degree relative.[7–10] Other identified risk factors include higher body mass index (BMI), current smoking, and heavy alcohol use.[11–18] However, there is a lack of clinical tools to reliably risk-stratify men and women based on these factors. Several authors [19] have reported developing and validating risk scores for advanced neoplasia. However, there are no such tools for risk of adenomas, nor any that have been developed or validated in a US cohort. Such clinical risk-stratification tools, or risk scores, are used not only for breast cancer[20] but also in several other areas of medicine--such as for stratifying individuals by risk of heart disease,[21] for organ allocation (MELD score),[22] for severity of liver disease (Child-Pugh score)[23] and for hospital mortality (APACHE II)[24]--where they have diagnostic or prognostic value.

An adenoma risk score would identify the absolute risk for an individual for harboring advanced neoplasia. Based on their absolute risk, individuals could be stratified into low- or high-risk groups, and those in the high-risk group could be prioritized for screening colonoscopy, while those in the low-risk group could be offered a choice of screening modalities, including colonoscopy. Given the limited capacity for colonoscopy in the U.S., along with its cost and complications, the ability to risk-stratify men and women adequately would be a first step in improving resource utilization, allocating capacity, and reducing costs and complications. The objective of our study was to develop and validate a risk prediction model by using data from a randomized multicenter clinical trial to combine the risk factors associated with adenomas into an adenoma risk score among men and women undergoing colonoscopic screening.

MATERIALS AND METHODS

We used data from phases I and II of the National Colonoscopy Study (NCS), a randomized trial of colonoscopy screening, for model development and validation. The study, comparing the clinical results of colonoscopic screening with usual care, was conducted in two phases, between 2000 and 2002 (phase I) and between 2004 and 2007 (phase II), on a general population of men and women at three clinical centers: Group Health Cooperative, a managed care organization in the Puget Sound area of Washington State; a collaboration of the University of Minnesota and Minnesota Gastroenterology, a large group practice in Minneapolis, MN; and a wellness clinic for underserved and minority populations at Louisiana State University in Shreveport, LA. The study sites were chosen to represent different areas of the country and different healthcare delivery systems. The study coordinating center was Memorial Sloan Kettering Cancer Center (MSKCC) and the central pathology center was at the Department of Pathology, Boston Medical Center (Mallory Institute of Pathology), Boston University.

Men and women age 50 years and older (40 years and older for Louisiana State University) were invited to participate. Eligibility criteria were: no personal history of colorectal cancer, familial adenomatous polyposis or inflammatory bowel disease; no prior colonoscopy (phase I and II) and no prior flexible sigmoidoscopy in the previous five years (phase II); no serious active co-morbidities such as myocardial infarction, congestive heart failure, active treatment for cancer or on anticoagulation; no currently implanted cardioverter/defibrillator; and ability and willingness to provide informed consent. After giving consent, those eligible were randomized to colonoscopy or usual care. For the purposes of this study, we restricted our analysis to the cohort at least age 50 undergoing screening colonoscopy in phase I or II. All colonoscopies were performed by gastroenterologists with expertise in endoscopy and clinical research. Colonoscopies were performed using standard preparations and conscious sedation, with removal of all polyps. The polyps underwent histopathological review locally and by the study pathologist (Michael J. O’Brien) at the central pathology center.

Participants were asked to complete detailed questionnaires on various demographic, dietary and lifestyle factors, and on personal and family history of cancer, prior to or shortly after their colonoscopy. The family history data were entered into a database and linked to a pedigree drawing program. Each pedigree was reviewed by a Genetics Review Committee at MSKCC to identify participants with possible hereditary nonpolyposis colon cancer (HNPCC) or familial adenomatous polyposis (FAP), who were subsequently counseled to see their physicians and discuss CRC screening.

Statistical Analysis

For the purpose of risk-score derivation and validation, only those subjects over the age of 50 at time of colonoscopy with a complete colonoscopy to the cecum and adequate bowel preparation and who completed all the relevant items on the baseline questionnaire were included.

Risk score derivation

The outcome variable was one or more adenomas or CRC found by the screening colonoscopy. Because the analysis was predictive rather than causal, we restricted our analysis to individuals who had complete information for the variables of interest. The independent variables were age at colonoscopy, sex, race, BMI, smoking history, current use of aspirin or non-steroidal anti-inflammatory drugs (NSAIDs) at least once per week for a year or more, history of colorectal cancer in at least one first-degree relative, and history of colorectal cancer in at least one second-degree relative (if no first-degree relative is affected). The effects of these variables, along with interactions selected before model fitting, were estimated by logistic regression. Likelihood ratio tests, combining the main effects with the interaction effects, measured the overall effect of variables involved in interactions. In addition to the listed variables, effects of the three NCS clinical centers on the risk of adenomatous polyps were estimated in a separate regression. The receiver operating characteristic curve (ROCC) was estimated with the convex hull approach,[25] and the area under this curve (AUROCC) was computed. Because the same data were used both to fit the model and to assess its discrimination, the unadjusted AUROCC is optimistically biased; to mitigate this bias, the bootstrap method for estimation of prediction error was applied.[26] Since the adjustments are based on bootstrap sample means of the parameter estimates, 200 iterations of bootstrap resampling were used. Confidence intervals (CIs) for the adjusted estimates were produced by a second, outer bootstrap applied to the entire estimation and adjustment process, yielding a double bootstrap.[38] To obtain adequate precision for the CIs, the outer bootstrap consisted of 2,000 iterations.
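The bias adjustment described above can be illustrated with a minimal sketch of one common form of bootstrap correction (the optimism correction). Whether this matches the authors' exact procedure is an assumption. The sketch also swaps in a rank-based (Mann-Whitney) AUC instead of the convex hull approach, uses a toy scoring rule (`fit_mean_difference`, a hypothetical stand-in for logistic regression), and runs on synthetic data. Only the inner 200-iteration loop is shown; the outer bootstrap for CIs would wrap the whole function.

```python
import math
import random


def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC; tied case/non-case pairs count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(X, y):
    """Toy stand-in for logistic regression (hypothetical): weight each
    feature by its case-minus-non-case difference in means."""
    cases = [x for x, t in zip(X, y) if t == 1]
    others = [x for x, t in zip(X, y) if t == 0]
    w = [sum(c[j] for c in cases) / len(cases)
         - sum(o[j] for o in others) / len(others)
         for j in range(len(X[0]))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def optimism_corrected_auc(X, y, fit, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-adjusted AUC)."""
    rng = random.Random(seed)
    model = fit(X, y)
    apparent = auc([model(x) for x in X], y)
    n = len(y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lost one outcome class; draw again
        Xb = [X[i] for i in idx]
        boot_model = fit(Xb, yb)
        auc_boot = auc([boot_model(x) for x in Xb], yb)  # on its own resample
        auc_orig = auc([boot_model(x) for x in X], y)    # on the original data
        optimism.append(auc_boot - auc_orig)
    return apparent, apparent - sum(optimism) / n_boot


# Synthetic data only: three covariates, the first weakly predictive.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if rng.random() < 1 / (1 + math.exp(1.0 - x[0])) else 0 for x in X]
apparent, adjusted = optimism_corrected_auc(X, y, fit_mean_difference)
print(f"apparent AUC {apparent:.3f}, adjusted AUC {adjusted:.3f}")
```

Passing the fitting routine as a function keeps the resampling logic independent of the model, so a real logistic regression could replace the toy rule without touching the bootstrap loop.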

Risk score validation

The risk score developed on phase I participants was applied to phase II participants. To verify the bias adjustment procedure non-parametrically, another 2,000 iteration bootstrap was applied to estimate the difference between the adjusted AUROCC from phase I and the AUROCC of the same risk score applied to phase II. This difference was not statistically significantly different from zero, establishing the adequacy of the bias correction. A new logistic regression model was then fit using combined phase I and II data, and the same double bootstrap was applied to estimate the final adjusted AUROCC and CI.

Deciles of predicted probabilities were plotted against observed relative frequencies of participants with adenomas to examine the calibration of our model. The cumulative distribution of estimated risk was plotted to determine proportions of the population falling below any particular risk.
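The two plots described above reduce to simple summaries of the predicted probabilities. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the data are synthetic, and deciles are formed by equal-sized groups after sorting, which may differ from the grouping the authors used.

```python
import random


def calibration_by_decile(pred, outcome, n_groups=10):
    """Sort by predicted risk, split into equal-sized groups, and return
    (mean predicted risk, observed frequency, group size) for each group."""
    pairs = sorted(zip(pred, outcome))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer observations than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows


def fraction_at_or_below(pred, risk):
    """Empirical cumulative distribution of predicted risk at one value."""
    return sum(p <= risk for p in pred) / len(pred)


# Synthetic, perfectly calibrated toy data: outcome drawn with probability pred.
rng = random.Random(0)
pred = [i / 100 for i in range(1, 101)]
outcome = [1 if rng.random() < p else 0 for p in pred]
rows = calibration_by_decile(pred, outcome)
for mean_pred, observed, size in rows:
    print(f"predicted {mean_pred:.3f}  observed {observed:.2f}  n={size}")
print("fraction with risk <= 0.2:", fraction_at_or_below(pred, 0.2))
```

In a well-calibrated model the observed frequency in each group tracks the mean predicted risk; large gaps in particular deciles would indicate miscalibration in that risk range.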

RESULTS

In phase I (derivation cohort), a total of 700 men and women were enrolled at the three sites, of whom 691 filled out some part of the baseline questionnaire, and 622 underwent a colonoscopy. One person who underwent colonoscopy did not fill out the baseline questionnaire, so 621 individuals had both a colonoscopy and a baseline questionnaire. Of these, 541 individuals remained in the final derivation cohort after excluding those 40–49 years old (n=56) and those with missing information on their baseline questionnaire on one or more variables included in the model (n=34). A few individuals age 49 at randomization but 50 at colonoscopy were included.

In phase II (validation cohort), 1763 were enrolled, of whom 1693 were 50 years or older and 1450 filled some part of the baseline questionnaire and underwent a colonoscopy. Of these, 1334 had complete information on baseline variables and colonoscopy, and were included in the validation analyses. The demographic and baseline characteristics of the derivation and validation cohorts are presented in Tables 1 and 2.

Table 1.

Phase I (derivation cohort): distribution of baseline variables. Values are N (%) or mean ± SD.

Columns, left to right:
(1) Overall, n = 691
(2) Individuals who underwent colonoscopy and filled some part of baseline questionnaire (n = 621): adenoma detected, n = 116
(3) Individuals who underwent colonoscopy and filled some part of baseline questionnaire (n = 621): no adenoma detected, n = 505
(4) Individuals included in final analysis (n = 541): adenoma detected, n = 106
(5) Individuals included in final analysis (n = 541): no adenoma detected, n = 435

Age at baseline
 40–49 70 (10%) 5 (4%) 51 (10%) 0 1 (0.2%)
 50–59 346 (50%) 58 (50%) 258 (51%) 54 (51%) 190 (44%)
 60–69 275 (40%) 53 (46%) 196 (39%) 52 (49%) 244 (56%)

Sex
 Male 317 (46%) 66 (57%) 227 (45%) 60 (57%) 204 (47%)
 Female 374 (54%) 50 (43%) 278 (55%) 46 (43%) 231 (53%)

Race
 White non-hispanic 572 (83%) 101 (88%) 421 (83%) 96 (91%) 383 (88%)
 Other 116 (17%) 14 (12%) 82 (16%) 10 (9%) 52 (12%)
 Missing 3 (0.5%) 1 (0.9%) 2 (0.4%)

Center location
 Minnesota 290 (42%) 53 (46%) 223 (44%) 49 (46%) 214 (49%)
 Louisiana 200 (29%) 22 (19%) 146 (29%) 17 (16%) 93 (21%)
 Washington 201 (29%) 41 (35%) 136 (27%) 40 (38%) 128 (29%)

Body mass index
 < 20 15 (2%) 2 (2%) 9 (2%) 2 (2%) 7 (2%)
 20–24 198 (29%) 23 (20%) 153 (30%) 21 (20%) 143 (33%)
 25–29 273 (40%) 50 (43%) 198 (39%) 47 (44%) 178 (41%)
 ≥ 30 187 (27%) 38 (33%) 130 (26%) 36 (34%) 107 (25%)
 Missing 18 (3%) 3 (3%) 15 (3%)

CRC in at least one first-degree relative
 Yes 90 (13%) 21 (18%) 61 (12%) 20 (19%) 54 (12%)
 No 599 (87%) 94 (81%) 443 (88%) 86 (81%) 381 (88%)
 Missing 2 (0.3%) 1 (0.9%) 1 (0.2%)

CRC in at least one second-degree relative
 Yes 82 (12%) 21 (18%) 55 (11%) 20 (19%) 50 (11%)
 No 607 (88%) 94 (81%) 449 (89%) 86 (81%) 385 (89%)
 Missing 2 (0.3%) 1 (0.9%) 1 (0.2%)

Smoking
 Ever 379 (55%) 69 (59%) 272 (54%) 65 (61%) 242 (56%)
 Never 312 (45%) 47 (41%) 233 (46%) 41 (39%) 193 (44%)

Pack years for ever smokers 17.2 ± 18.2 17.8 ± 20.0 17.0 ± 17.7 18.4 ± 20.4 17.5 ± 17.9

Regular intake of aspirin
 Yes 254 (37%) 41 (35%) 196 (39%) 40 (38%) 174 (40%)
 No 436 (63%) 75 (65%) 308 (61%) 66 (62%) 261 (60%)
 Missing 1 (0.1%) 0 1 (0.2%)

Regular intake of NSAIDS
 Yes 175 (25%) 22 (19%) 132 (26%) 22 (21%) 113 (26%)
 No 514 (74%) 94 (81%) 371 (73%) 84 (79%) 322 (74%)
 Missing 2 (0.3%) 0 2 (0.4%)

Table 2.

Phase II (validation cohort): distribution of baseline variables. Values are N (%) or mean ± SD.

Columns, left to right:
(1) Overall, n = 1609
(2) Individuals who underwent colonoscopy and filled some part of baseline questionnaire (n = 1450): adenoma detected, n = 330
(3) Individuals who underwent colonoscopy and filled some part of baseline questionnaire (n = 1450): no adenoma detected, n = 1120
(4) Individuals included in final analysis (n = 1334): adenoma detected, n = 307
(5) Individuals included in final analysis (n = 1334): no adenoma detected, n = 1027

Age at baseline
 40–49 137 (9%) 19 (6%) 94 (8%) 3 (1%) 8 (1%)
 50–59 1110 (69%) 218 (66%) 788 (70%) 214 (70%) 781 (76%)
 60–69 362 (23%) 93 (28%) 283 (21%) 90 (29%) 238 (23%)

Sex
 Male 784 (49%) 199 (60%) 521 (47%) 187 (61%) 490 (48%)
 Female 825 (51%) 131 (40%) 599 (53%) 120 (39%) 537 (52%)

Race
 White non-hispanic 1301 (81%) 267 (81%) 925 (83%) 253 (82%) 884 (86%)
 Other 308 (19%) 63 (19%) 195 (17%) 54 (18%) 143 (14%)

Center location
 Minnesota 954 (59%) 209 (63%) 677 (60%) 205 (67%) 675 (66%)
 Louisiana 425 (26%) 87 (26%) 280 (25%) 68 (22%) 190 (19%)
 Washington 230 (14%) 34 (10%) 163 (15%) 34 (11%) 162 (16%)

Body mass index
 < 20 34 (2%) 1 (0.3%) 28 (3%) 1 (0.3%) 25 (2%)
 20–24 416 (26%) 67 (20%) 314 (28%) 66 (22%) 293 (29%)
 25–29 642 (40%) 141 (43%) 447 (40%) 133 (43%) 420 (41%)
 ≥ 30 513 (32%) 118 (36%) 330 (29%) 107 (35%) 289 (28%)
 Missing 4 (0.3%) 3 (0.9%) 1 (0.1%)

CRC in at least one first-degree relative
 Yes 161 (10%) 41 (12%) 105 (9%) 37 (12%) 91 (9%)
 No 1437 (89%) 287 (87%) 1009 (90%) 270 (88%) 936 (91%)
 Missing 11 (0.7%) 2 (0.6%) 6 (0.5%)

CRC in at least one second-degree relative
 Yes 196 (12%) 44 (13%) 138 (12%) 37 (12%) 126 (12%)
 No 1402 (87%) 284 (86%) 976 (87%) 270 (88%) 901 (88%)
 Missing 11 (0.7%) 2 (0.6%) 6 (0.5%)

Smoking
 Ever 807 (50%) 187 (57%) 545 (49%) 174 (57%) 507 (49%)
 Never 802 (50%) 143 (43%) 575 (51%) 133 (43%) 520 (51%)

Pack years for ever smokers 18.0 ± 18.6 23.1 ± 19.7 16.3 ± 16.8 22.7 ± 19.6 16.6 ± 17.0

Regular intake of aspirin
 Yes 490 (30%) 114 (34%) 322 (29%) 106 (35%) 311 (30%)
 No 1119 (70%) 216 (66%) 798 (71%) 201 (65%) 716 (70%)

Regular intake of NSAIDS
 Yes 345 (21%) 67 (20%) 250 (22%) 60 (20%) 234 (23%)
 No 1264 (79%) 263 (80%) 870 (78%) 247 (80%) 793 (77%)

Of the 541 individuals in the derivation cohort, one or more adenomas were found in 106 (19%) individuals. One or more advanced adenomatous polyps (villous histology, size larger than 1cm, high-grade dysplasia or cancer) were found in 33 participants (6%). Of the 1334 individuals in the validation cohort, one or more adenomas were found in 307 (23%) individuals, and one or more advanced adenomas were found in 76 (5%) individuals. The histopathology findings at colonoscopy are presented in Table 3 for the individuals in the derivation and validation cohort included in the analyses.

Table 3.

Distribution of colonoscopy findings for the derivation and validation cohorts. Values are n (%).

Columns, left to right:
(1) Phase I, patients who had colonoscopy, n = 621
(2) Phase I, patients included in analysis, n = 541
(3) Phase II, patients who had colonoscopy, n = 1450
(4) Phase II, patients included in analysis, n = 1334

Center location
 Minnesota 276 (44%) 263 (49%) 886 (61%) 880 (66%)
 Louisiana 168 (27%) 110 (20%) 367 (25%) 258 (19%)
 Washington 177 (29%) 168 (31%) 197 (14%) 196 (15%)

Total number of polyps per colonoscopy
 0 387 (62%) 328 (61%) 846 (58%) 781 (59%)
 1 110 (18%) 91 (17%) 321 (22%) 290 (22%)
 >1 124 (20%) 122 (23%) 283 (20%) 263 (20%)

Total number of adenomas per colonoscopy
 0 505 (81%) 435 (80%) 1120 (77%) 1027 (77%)
 1 81 (13%) 72 (13%) 230 (16%) 213 (16%)
 >1 35 (6%) 34 (6%) 100 (7%) 94 (7%)

Total number of right sided adenomas per colonoscopy
 0 555 (89%) 481 (89%) 1248 (86%) 1146 (86%)
 1 56 (9%) 50 (9%) 159 (11%) 151 (11%)
 >1 10 (2%) 10 (2%) 43 (3%) 37 (3%)

Total number of left sided adenomas per colonoscopy
 0 553 (89%) 477 (88%) 1273 (88%) 1169 (88%)
 1 54 (9%) 51 (9%) 135 (9%) 125 (9%)
 >1 14 (2%) 13 (2%) 42 (3%) 40 (3%)

Total number of advanced adenomas per colonoscopy
 0 587 (95%) 508 (94%) 1370 (94%) 1258 (94%)
 1 30 (5%) 29 (5%) 63 (4%) 60 (4%)
 >1 4 (0.6%) 4 (0.7%) 17 (1%) 16 (1%)

Total number of hyperplastic polyps per colonoscopy
 0 500 (81%) 424 (78%) 1200 (83%) 1104 (83%)
 1 68 (11%) 65 (12%) 167 (12%) 152 (11%)
 >1 53 (9%) 51 (9%) 83 (6%) 78 (6%)

The adjusted AUROCC for the phase I derivation model was 0.67 (95% CI 0.61, 0.74). That risk score applied to phase II resulted in an AUROCC of 0.61 (95% CI 0.59, 0.65). The bootstrap 95% confidence interval for the difference of these two AUROCCs was (−0.13, +0.01), which was not significantly different from zero. Thus, we combined phase I and II participants and re-estimated and bias-adjusted the risk score using the entire sample.

In multiple logistic regression, we found age (likelihood ratio test for overall contribution to the model, p<0.001), male sex (p < 0.001), body mass index (BMI) (p<0.001), family history of at least one first-degree relative with colorectal cancer (p = 0.036), and smoking history (p<0.001) to be individually associated with risk of harboring one or more adenomatous polyps in the entire cohort. The effects of clinical centers were examined in a separate regression and found not to be statistically significant. After expanding the model and adding interaction terms, the fitted values for all coefficients are illustrated in Figure 1.

Figure 1.

Odds ratios (unadjusted) estimated from the logistic model with 95% CIs.

Validation

The unadjusted, all-data ROCC based on our model is plotted in Figure 2. The adjusted area under the curve (AUROCC) was 0.64 (95% CI: 0.60, 0.67). The predicted probability ranges from 0.03 to 0.7. For example, the predicted risk score for a 50-year-old female with a BMI of 20 who is a non-smoker, has no family history of CRC and uses aspirin daily is low (0.1), while that for a 61-year-old male with a BMI of 46 and a 62 pack-year smoking history, with no family history of CRC, who does not use aspirin is high (0.69).

Figure 2.

The unadjusted, all-data ROCC based on our model. Bootstrap-adjusted sensitivities are shown as black dots along with 95% CIs for given values of (1 − specificity). Our final logistic model was:

logit[Prob(adenoma)] = −1.536 + 0.057(bmi) − 0.026(smoker) + 0.011(pack years) + 0.13(aspirin) + 0.082(age) + 0.367(CRC FDR) + 0.334(CRC SDR) + 0.526(male) − 0.432(NSAID) + 0.232(nonwhite) − 0.053(age*aspirin) + 0.011(age*NSAID) − 0.044(age*male) − 0.025(age*bmi) − 0.373(male*aspirin) + 0.314(male*NSAID)
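As a minimal sketch, the published coefficients can be turned into a predicted probability by summing the linear predictor and applying the inverse logit. One caveat matters here: the text does not state how the covariates were coded (for example, whether age and BMI were centered or rescaled), and entering raw ages and BMIs does not appear to reproduce the worked examples quoted in the text. The function below therefore assumes its inputs are already coded as in the authors' model. The dictionary keys and the function name are hypothetical labels.

```python
import math

# Coefficients as printed with the final logistic model; key names are
# illustrative labels, and "a*b" marks an interaction term.
COEF = {
    "intercept": -1.536, "bmi": 0.057, "smoker": -0.026, "pack_years": 0.011,
    "aspirin": 0.13, "age": 0.082, "crc_fdr": 0.367, "crc_sdr": 0.334,
    "male": 0.526, "nsaid": -0.432, "nonwhite": 0.232,
    "age*aspirin": -0.053, "age*nsaid": 0.011, "age*male": -0.044,
    "age*bmi": -0.025, "male*aspirin": -0.373, "male*nsaid": 0.314,
}


def adenoma_probability(x):
    """Inverse logit of the linear predictor.

    `x` maps covariate names to values ALREADY coded as in the authors'
    model (coding not given in the text); missing covariates default to 0.
    """
    z = COEF["intercept"]
    for name, beta in COEF.items():
        if name == "intercept":
            continue
        value = 1.0
        for part in name.split("*"):  # product of factors for interactions
            value *= x.get(part, 0.0)
        z += beta * value
    return 1.0 / (1.0 + math.exp(-z))


# With every covariate at its reference/zero coding, only the intercept remains.
print(round(adenoma_probability({}), 3))
print(round(adenoma_probability({"male": 1}), 3))
```

Storing interactions as `"a*b"` keys keeps the coefficient table in one place and makes it easy to compare term by term against the printed model.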

A plot of predicted risk of adenoma detection (horizontal axis) by fraction of the population at or below that risk (vertical axis) is shown in Figure 3. This figure illustrates the impact of potential use of such a model in clinical practice. For example, if we define ‘high risk’ as a predicted probability of an adenoma ≥ 0.2, 58% of individuals would be classified as high risk and prioritized for colonoscopy, while 42% would be classified as low risk and could be offered other modalities of screening. This classification would accurately capture 75% of all adenomas and 73% of advanced adenomas. Of the high-risk individuals undergoing colonoscopy, an adenoma or advanced adenoma would be found in 35%, improving the number of therapeutic screening colonoscopies compared to all comers. Of the entire cohort, 7% of individuals harboring an adenoma or advanced adenoma would be classified as ‘low risk’ and not be screened with colonoscopy initially. If the cut-off for ‘high risk’ is changed to a predicted probability of an adenoma ≥ 0.15, 78% would be classified as high risk and prioritized for colonoscopy, while 22% of individuals would be classified as low risk. This classification would accurately capture 88% of all adenomas and 90% of advanced adenomas. Of the high-risk individuals undergoing colonoscopy, an adenoma or advanced adenoma would be found in 32%. Also, of the entire cohort, 3% of individuals harboring an adenoma or advanced adenoma would be classified as ‘low risk’ and not be screened with colonoscopy initially. As the risk criterion increases, so does the proportion offered alternatives to colonoscopy. Careful cost-benefit analyses would determine the criterion actually used.
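The quantities quoted for each cut-off (fraction classified high risk, fraction of adenomas captured, yield among high-risk individuals, and fraction of the cohort missed) can be computed from predicted risks and observed outcomes as in the following sketch. The function name and the ten-person data set are hypothetical and serve only to show the arithmetic.

```python
def threshold_summary(pred, outcome, cutoff):
    """Summarize a 'high risk' rule defined as predicted risk >= cutoff."""
    n = len(pred)
    high = [o for p, o in zip(pred, outcome) if p >= cutoff]
    cases = sum(outcome)
    cases_high = sum(high)
    return {
        # share of the cohort prioritized for colonoscopy
        "fraction_high_risk": len(high) / n,
        # share of all cases that fall in the high-risk group
        "capture": cases_high / cases if cases else float("nan"),
        # share of high-risk individuals who have a finding
        "yield_in_high_risk": cases_high / len(high) if high else float("nan"),
        # cases labelled low risk, as a share of the whole cohort
        "missed_fraction_of_cohort": (cases - cases_high) / n,
    }


# Hypothetical toy cohort of ten people (1 = adenoma found).
pred = [0.05, 0.10, 0.12, 0.18, 0.22, 0.25, 0.30, 0.35, 0.40, 0.60]
outcome = [0, 0, 1, 0, 0, 1, 0, 1, 0, 1]
summary = threshold_summary(pred, outcome, cutoff=0.2)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

These four quantities are linked: the fraction at high risk times the yield, plus the missed fraction, equals the overall prevalence. Applied to the figures reported for the 0.2 cut-off, 0.58 × 0.35 ≈ 0.20 and 0.20 + 0.07 ≈ 0.27, so the implied capture is about 0.20 / 0.27 ≈ 74%, close to the reported 75%.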

Figure 3.

Plot of estimated risk of adenoma detection (horizontal axis) by proportion of the population at or below that risk (vertical axis). For example, approximately 40% of our cohort had a predicted risk < 0.2.

DISCUSSION

Our study aimed to create and validate a risk model for quantifying an individual’s risk of harboring adenomas. Our final AUROCC was 0.64, indicating that the model has good predictive utility.

Such a model could be used to determine which men and women are most likely to harbor adenomas and would likely benefit from a therapeutic colonoscopy. While we did not directly compare colonoscopic screening to other modalities of screening, our results are a first step towards allowing payers, patients and physicians to use risk thresholds to decide who should be offered screening colonoscopy as opposed to other modalities of screening, or who should be prioritized for screening colonoscopy, based on their estimated risk of harboring adenomas. Once externally validated in sufficient populations, and possibly improved, we envision a model such as this to be a web-based or mobile application tool readily accessible and easy to use. As illustrated in Figure 3, a selected threshold for the cumulative probability of harboring adenomas can be used to stratify individuals for priority colonoscopy versus other modalities of screening. Combined with appropriate cost-effectiveness analysis, this information can determine the appropriate thresholds for which colonoscopic screening should be prioritized versus some other modality. For example, if we set the risk threshold for harboring an adenoma at 0.2 or above (Figure 3), the cumulative fraction of the population below this cut-off would be about 40%, and would be offered another less invasive modality of screening, such as fecal occult blood test (FOBT) or even no screening. Only about 60% of the population would be prioritized to colonoscopic screening, which may greatly improve efficiency. Of course, the actual cut-off for predicted risk would have to be based on a careful analysis and comparison of the cost-effectiveness for each potential cut-off, or multiple cut-offs.

Others have used similar approaches to predict an individual’s risk for harboring colorectal cancer.[27] Imperiale et al.[28] developed a score based on three clinical factors to predict risk of harboring proximal lesions based on findings at flexible sigmoidoscopy. Others have developed risk scores for advanced neoplasia, but not adenoma, in Asian and European cohorts. Tao et al.[19] used nine risk factors to derive a risk stratification tool for a German cohort. They also took into account prior colonoscopy and detection of polyps. Kaminski et al.[29] used factors similar to ours: age, sex, family history of CRC, cigarette smoking and BMI. Others have reported similar approaches in Asian patients.[30–32] However, no clinical tool that systematically evaluates risk factors to predict an individual’s risk of harboring an adenoma has been developed in a US cohort.

We included demographic and baseline variables that would be practical and easy to obtain for the score to be clinically useful. We also favored a parsimonious model, and included variables that others have shown to be risk factors for adenomas,[8–10, 33, 34] and variables that are easy to obtain from the chart or from the patient. Also, since the risk score changes with time, it can be used to recalculate an individual’s risk for harboring adenoma periodically, and perhaps to change the individual’s priority for screening colonoscopy.

Because our analysis is predictive, not causal,[35] our estimated coefficients for individual factors are not necessarily unbiased estimates of their causal effects.[36] Although we made no attempt to adjust for potential confounding in our model, the results were similar to findings by others, i.e. increasing age and male sex were associated with risk of adenomas.[9, 27] We also found risk associations with family history of CRC in one or more first-degree relatives. In their large cohort of 3,121 mostly male veteran patients undergoing screening colonoscopy, Lieberman, et al.[34] reported family history of CRC in one or more first-degree relatives and current smoking as risk factors for advanced adenomas (10 mm or more, 25% villous histology or more, high grade dysplasia or invasive cancer) while use of NSAIDs including aspirin was inversely associated with risk of advanced adenomas and no association was found with BMI. Others have not found family history in a first-degree relative to be associated with adenomas.[10] We did not find use of aspirin to be associated with risk of adenomatous polyps, but did find higher BMI to be associated with risk, as has been reported by others.[17] We also did not find an association with non-white race and did not examine physical activity, the evidence for both of which in the literature is mixed.[13, 37, 38]

The strengths of our study are the multiple sites in diverse geographic locations; inclusion of community-dwelling men and women in the U.S. and of non-white races; complete colonoscopy and histology information with one central pathologist review; comprehensive collection and review of family cancer history for both first- and second-degree relatives; rigorous derivation and validation of the risk score; and good model calibration. The adenoma/advanced adenoma prevalence rate of 25% and 28% observed in our cohorts is similar to the rates reported by others [39, 40] and to those recommended by guidelines as indicating high quality in colonoscopy examinations.[41] The high-quality colonoscopy exam, along with risk factors comparable to the general population, also supports the validity of our modeling and risk score assessment. Our robust statistical approach included both internal and second-sample validation and a double bootstrap to address both overfitting bias and variance.

Limitations of our study include a small number of advanced adenomatous polyps, precluding a separate robust model to predict only advanced adenomatous polyps and cancers. Our study is cross-sectional and predictive and thus cannot identify causal factors for progression of adenomatous polyps to cancer. Finally, there is a possibility of recall error in self-reporting of risk factors, although this error is unlikely to be biased by outcomes because nearly all of the baseline data were collected prior to colonoscopy.

While our risk score is validated and well calibrated, it would benefit from further validation in other cohorts of average risk men and women in the U.S. While the AUROCC for our model of 0.64 is good, and similar to that reported by others,[29] improvements will enhance its efficiency. Development of a clinically useful risk stratification score is important to enhancing our capacity to deliver effective colonoscopic screening targeted towards those who may benefit the most. Cost-effectiveness analyses should be performed to determine appropriate thresholds. Similar work needs to be applied to surveillance colonoscopy intervals.

Acknowledgments

FUNDING

R01 CA079572, National Cancer Institute, Screening Colonoscopy Feasibility Trial (National Colonoscopy Study) (A.G. Zauber and S.J. Winawer)

Center for Chronic Disease Outcomes Research, a VA HSR&D Center of Innovation (CIN 13-406) (A. Shaukat)

Footnotes

CONFLICTS OF INTEREST

The authors disclose no conflicts.

References

  • 1.Siegel R, Naishadham D, Jemal A. Cancer statistics, 2012. CA Cancer J Clin. 2012;62(1):10–29. doi: 10.3322/caac.20138. [DOI] [PubMed] [Google Scholar]
  • 2.Levin B, Lieberman DA, McFarland B, Andrews KS, Brooks D, Bond J, et al. Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: a joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. Gastroenterology. 2008;134(5):1570–1595. doi: 10.1053/j.gastro.2008.02.002. [DOI] [PubMed] [Google Scholar]
  • 3.Winawer SJ, Zauber AG, Stewart E, O’Brien MJ. The natural history of colorectal cancer. Opportunities for intervention. Cancer. 1991;67(4 Suppl):1143–1149. doi: 10.1002/1097-0142(19910215)67:4+<1143::aid-cncr2820671507>3.0.co;2-d. [DOI] [PubMed] [Google Scholar]
  • 4.Winawer SJ, Zauber AG, Ho MN, O’Brien MJ, Gottlieb LS, Sternberg SS, et al. Prevention of colorectal cancer by colonoscopic polypectomy. The National Polyp Study Workgroup. The New England journal of medicine. 1993;329(27):1977–1981. doi: 10.1056/NEJM199312303292701. [DOI] [PubMed] [Google Scholar]
  • 5.O’Brien MJ, O’Keane JC, Zauber A, Gottlieb LS, Winawer SJ. Precursors of colorectal carcinoma. Biopsy and biologic markers. Cancer. 1992;70(5 Suppl):1317–1327. doi: 10.1002/1097-0142(19920901)70:3+<1317::aid-cncr2820701519>3.0.co;2-x. [DOI] [PubMed] [Google Scholar]
  • 6.Qaseem A, Denberg TD, Hopkins RH, Jr, Humphrey LL, Levine J, Sweet DE, et al. Screening for Colorectal Cancer: A Guidance Statement From the American College of Physicians. Ann Intern Med. 2012;156(5):378–386. doi: 10.7326/0003-4819-156-5-201203060-00010. [DOI] [PubMed] [Google Scholar]
  • 7.Hassan C, Pickhardt PJ, Marmo R, Choi JR. Impact of lifestyle factors on colorectal polyp detection in the screening setting. Dis Colon Rectum. 2010;53(9):1328–1333. doi: 10.1007/DCR.0b013e3181e10daa. [DOI] [PubMed] [Google Scholar]
  • 8.Lieberman DA, Prindiville S, Weiss DG, Willett W. Risk factors for advanced colonic neoplasia and hyperplastic polyps in asymptomatic individuals. JAMA. 2003;290(22):2959–2967. doi: 10.1001/jama.290.22.2959. [DOI] [PubMed] [Google Scholar]
  • 9.Lipkin M, Winawer SJ, Sherlock P. Early identification of individuals at increased risk for cancer of the large intestine. Part I: definition of high risk populations. Clinical bulletin. 1981;11(1):13–21. [PubMed] [Google Scholar]
  • 10.Lynch KL, Ahnen DJ, Byers T, Weiss DG, Lieberman DA. First-degree relatives of patients with advanced colorectal adenomas have an increased prevalence of colorectal cancer. Clinical Gastroenterology & Hepatology. 2003;1(2):96–102. doi: 10.1053/cgh.2003.50018. [DOI] [PubMed] [Google Scholar]
  • 11.Anderson JC, Stein B, Kahi CJ, Rajapakse R, Walker G, Alpern Z. Association of smoking and flat adenomas: results from an asymptomatic population screened with a high-definition colonoscope. Gastrointestinal endoscopy. 71(7):1234–1240. doi: 10.1016/j.gie.2009.12.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Sandler RS, Lyles CM, McAuliffe C, Woosley JT, Kupper LL. Cigarette smoking, alcohol, and the risk of colorectal adenomas. Gastroenterology. 1993;104(5):1445–1451. doi: 10.1016/0016-5085(93)90354-f. [DOI] [PubMed] [Google Scholar]
  • 13.Hermann S, Rohrmann S, Linseisen J. Lifestyle factors, obesity and the risk of colorectal adenomas in EPIC-Heidelberg. Cancer Causes Control. 2009;20(8):1397–1408. doi: 10.1007/s10552-009-9366-3. [DOI] [PubMed] [Google Scholar]
  • 14.Anderson JC, Latreille M, Messina C, Alpern Z, Grimson R, Martin C, et al. Smokers as a high-risk group: data from a screening population. Journal of clinical gastroenterology. 2009;43(8):747–752. doi: 10.1097/MCG.0b013e3181956f33. [DOI] [PubMed] [Google Scholar]
  • 15.Kang HW, Kim D, Kim HJ, Kim CH, Kim YS, Park MJ, et al. Visceral obesity and insulin resistance as risk factors for colorectal adenoma: a cross-sectional, case-control study. The American journal of gastroenterology. 105(1):178–187. doi: 10.1038/ajg.2009.541. [DOI] [PubMed] [Google Scholar]
  • 16.Tsilidis KK, Brancati FL, Pollak MN, Rifai N, Clipp SL, Hoffman-Bolton J, et al. Metabolic syndrome components and colorectal adenoma in the CLUE II cohort. Cancer Causes Control. 21(1):1–10. doi: 10.1007/s10552-009-9428-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Stein B, Anderson JC, Rajapakse R, Alpern ZA, Messina CR, Walker G. Body mass index as a predictor of colorectal neoplasia in ethnically diverse screening population. Digestive diseases and sciences. 55(10):2945–2952. doi: 10.1007/s10620-009-1113-9. [DOI] [PubMed] [Google Scholar]
  • 18.Burke CA. Colonic complications of obesity. Gastroenterology clinics of North America. 39(1):47–55. doi: 10.1016/j.gtc.2009.12.005. [DOI] [PubMed] [Google Scholar]
  • 19.Tao S, Hoffmeister M, Brenner H. Development and validation of a scoring system to identify individuals at high risk for advanced colorectal neoplasms who should undergo colonoscopy screening. Clin Gastroenterol Hepatol. 2014;12(3):478–485. doi: 10.1016/j.cgh.2013.08.042. [DOI] [PubMed] [Google Scholar]
  • 20.Gail MH, Benichou J. Validation studies on a model for breast cancer risk. Journal of the National Cancer Institute. 1994;86(8):573–575. doi: 10.1093/jnci/86.8.573. [DOI] [PubMed] [Google Scholar]
  • 21.Frikke-Schmidt R, Tybjaerg-Hansen A, Schnohr P, Jensen GB, Nordestgaard BG. Common clinical practice versus new PRIM score in predicting coronary heart disease risk. Atherosclerosis. 2010;213(2):532–538. doi: 10.1016/j.atherosclerosis.2010.07.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Kamath PS, Kim WR. The model for end-stage liver disease (MELD). Hepatology. 2007;45(3):797–805. doi: 10.1002/hep.21563. [DOI] [PubMed] [Google Scholar]
  • 23.Merkel C, Zoli M, Siringo S, van Buuren H, Magalotti D, Angeli P, et al. Prognostic indicators of risk for first variceal bleeding in cirrhosis: a multicenter study in 711 patients to validate and improve the North Italian Endoscopic Club (NIEC) index. The American journal of gastroenterology. 2000;95(10):2915–2920. doi: 10.1111/j.1572-0241.2000.03204.x. [DOI] [PubMed] [Google Scholar]
  • 24.Beck DH, Smith GB, Pappachan JV, Millar B. External validation of the SAPS II, APACHE II and APACHE III prognostic models in South England: a multicentre study. Intensive care medicine. 2003;29(2):249–256. doi: 10.1007/s00134-002-1607-9. [DOI] [PubMed] [Google Scholar]
  • 25.Provost F, Fawcett T. Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97); 1997. [Google Scholar]
  • 26.Efron B, Tibshirani RJ, editors. An Introduction to the Bootstrap. New York: Chapman and Hall; 1993. [Google Scholar]
  • 27.Freedman AN, Slattery ML, Ballard-Barbash R, Willis G, Cann BJ, Pee D, et al. Colorectal cancer risk prediction tool for white men and women without known susceptibility. J Clin Oncol. 2009;27(5):686–693. doi: 10.1200/JCO.2008.17.4797. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Imperiale TF, Wagner DR, Lin CY, Larkin GN, Rogge JD, Ransohoff DF. Using risk for advanced proximal colonic neoplasia to tailor endoscopic screening for colorectal cancer. Ann Intern Med. 2003;139(12):959–965. doi: 10.7326/0003-4819-139-12-200312160-00005. [DOI] [PubMed] [Google Scholar]
  • 29.Kaminski MF, Polkowski M, Kraszewska E, Rupinski M, Butruk E, Regula J. A score to estimate the likelihood of detecting advanced colorectal neoplasia at colonoscopy. Gut. 2014;63(7):1112–1119. doi: 10.1136/gutjnl-2013-304965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Yeoh KG, Ho KY, Chiu HM, Zhu F, Ching JY, Wu DC, Matsuda T, et al. The Asia-Pacific Colorectal Screening score: a validated tool that stratifies risk for colorectal advanced neoplasia in asymptomatic Asian subjects. Gut. 2011;60(9):1236–1241. doi: 10.1136/gut.2010.221168. [DOI] [PubMed] [Google Scholar]
  • 31.Cai QC, Yu ED, Xiao Y, Bai WY, Chen X, He LP, et al. Derivation and validation of a prediction rule for estimating advanced colorectal neoplasm risk in average-risk Chinese. Am J Epidemiol. 2012;175(6):584–593. doi: 10.1093/aje/kwr337. [DOI] [PubMed] [Google Scholar]
  • 32.Park HW, Hans S, Lee JS, Chang HS, Lee D, Choe JW. Risk stratification for advanced proximal colon neoplasm and individualized endoscopic screening for colorectal cancer by risk-scoring model. Gut. 2014;63(7):1112–1119. doi: 10.1016/j.gie.2012.06.013. [DOI] [PubMed] [Google Scholar]
  • 33.Lipkin M, Blattner WA, Gardner EJ, Burt RW, Lynch H, Deschner E, et al. Classification and risk assessment of individuals with familial polyposis, Gardner’s syndrome, and familial non-polyposis colon cancer from [3H]thymidine labeling patterns in colonic epithelial cells. Cancer research. 1984;44(9):4201–4207. [PubMed] [Google Scholar]
  • 34.Lieberman DA, Holub JL, Moravec MD, Eisen GM, Peters D, Morris CD, et al. Prevalence of colon polyps detected by colonoscopy screening in asymptomatic black and white patients. JAMA. 2008;300(12):1417–1422. doi: 10.1001/jama.300.12.1417. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Rubin DB. Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. J Educ Psychol. 1974;66(5):688–701. [Google Scholar]
  • 36.Hernan MA, Hernandez-Diaz S, Werler MM, Mitchell AA. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176–184. doi: 10.1093/aje/155.2.176. [DOI] [PubMed] [Google Scholar]
  • 37.Giovannucci E, Ascherio A, Rimm EB, Colditz GA, Stampfer MJ, Willett WC. Physical activity, obesity, and risk for colon cancer and adenoma in men. Ann Intern Med. 1995;122(5):327–334. doi: 10.7326/0003-4819-122-5-199503010-00002. [DOI] [PubMed] [Google Scholar]
  • 38.Giovannucci E, Colditz GA, Stampfer MJ, Willett WC. Physical activity, obesity, and risk of colorectal adenoma in women (United States). Cancer Causes Control. 1996;7(2):253–263. doi: 10.1007/BF00051301. [DOI] [PubMed] [Google Scholar]
  • 39.Schoenfeld P, Cash B, Flood A, Dobhan R, Eastone J, Coyle W, et al. Colonoscopic screening of average-risk women for colorectal neoplasia. The New England journal of medicine. 2005;352(20):2061–2068. doi: 10.1056/NEJMoa042990. [DOI] [PubMed] [Google Scholar]
  • 40.Kim DH, Lee SY, Choi KS, Lee HJ, Park SC, Kim J, et al. The usefulness of colonoscopy as a screening test for detecting colorectal polyps. Hepato-gastroenterology. 2007;54(80):2240–2242. [PubMed] [Google Scholar]
  • 41.Rex DK, Lieberman D. ACG colorectal cancer prevention action plan: update on CT-colonography. The American journal of gastroenterology. 2006;101(7):1410–1413. doi: 10.1111/j.1572-0241.2006.00585.x. [DOI] [PubMed] [Google Scholar]
