Author manuscript; available in PMC: 2025 Mar 1.
Published in final edited form as: J Rural Health. 2023 Aug 29;40(2):268–271. doi: 10.1111/jrh.12792

Urban-rural differences in cancer mortality: operationalizing rurality

Elizabeth S Davis 1, Jeffrey A Franks 2, Smita Bhatia 3,4, Kelly M Kenzik 1,5
PMCID: PMC10902199  NIHMSID: NIHMS1927504  PMID: 37644650

Abstract

Objective:

To assess urban-rural differences in cancer mortality across definitions of rurality operationalized as 1) established binary cut-points, 2) data-driven binary cut-points, and 3) continuous measures.

Methods:

We used Surveillance, Epidemiology, and End Results (SEER) data from 2000 to 2016 to identify incident screening-related cancers in adults. Analyses were based on one testing and four validation cohorts (n=26,587 each). Urban-rural status was defined by Rural-Urban Continuum Codes, National Center for Health Statistics codes, and the Index of Relative Rurality. Each was modeled using established binary cut-points, data-driven binary cut-points, and as a continuous variable. The primary outcome was 5-year cancer-specific mortality.

Results:

Compared to established cut-points, data-driven cut-points classified more patients as rural, resulted in larger White populations in rural areas, and yielded 7-14% lower estimates of urban-rural differences in cancer mortality. Further, hazard of cancer mortality increased 4-67% with continuous rurality measures, revealing important between-unit differences.

Conclusions:

Different cut-points introduce variation in urban-rural differences in mortality across definitions, whereas using urban-rural measures as continuous allows rurality to be conceptualized as a continuum, rather than a simple aggregation.

Policy Implications:

Findings provide alternative cut-points for multiple measures of rurality and support considering continuous measures of rurality to guide future research and policymaking.

Keywords: urban-rural status, cancer, mortality

INTRODUCTION

Rural cancer populations experience inferior survival compared to those living in urban areas, regardless of how rurality is defined.1 County-level definitions include Rural-Urban Continuum Codes (RUCC),2 National Center for Health Statistics (NCHS) codes,3 and the Index of Relative Rurality (IRR).4 The RUCC classifies counties based on population size, urbanization, and proximity to metropolitan areas.2 The NCHS scheme focuses on metropolitan areas and contains only one category each for micropolitan and rural areas.3 The RUCC has been found to better differentiate cancer risk across rural counties, while the NCHS is better suited to discriminate between urban counties.5 In addition, the IRR is a continuous measure of rurality ranging from 0 to 1, based on population size, density, percentage of urban residents, and distance to a metropolitan area.4 Despite multiple categories for the RUCC and NCHS (9 and 6 categories, respectively) and the continuous nature of the IRR, these definitions are frequently operationalized as binary variables (urban vs. rural).4,6 This dichotomization may conceal meaningful differences in health outcomes between levels of rurality and limit analyses of what level or degree of rurality is associated with worse outcomes.5,7 Prior work has identified differences in effect size, model fit, and statistical significance when rurality is modeled as binary, ternary, or continuous, and has examined whether treating urban-rural definitions as continuous is appropriate.1,7,8 Notably, estimates of urban-rural differences in mortality varied by up to 24% between definitions and by 3-61% within definitions.1 The greatest variation was seen among cancers with screening recommendations (prostate, breast, lung, cervical, colorectal),1 which may have important implications for health research, policy decision-making, and resource allocation.
Further, there is disagreement regarding which categories should be “urban” and “rural,” raising further questions about how to operationalize measures of rurality in research. The present study examined how various methods of operationalizing urban-rural definitions affect estimates of cancer mortality and provides data-driven cut-points as alternatives to established ones.

METHODS

Using Surveillance, Epidemiology, and End Results (SEER) data from 2000 to 2016, we previously demonstrated the greatest variation across urban-rural definitions for cancers with screening recommendations (prostate, breast, lung, cervical, colorectal).1 From this cohort (n=2,658,753), we extracted a 5% random sample (n=132,935), which we randomly divided into one testing and four validation samples (n=26,587 each).
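The sampling scheme described above can be illustrated with a minimal sketch. It is not the authors' code (the analyses were run in Stata); the function name `split_into_cohorts`, the seeds, and the use of integer row identifiers in place of SEER records are assumptions made for illustration only.

```python
import random

def split_into_cohorts(ids, n_cohorts=5, seed=0):
    """Shuffle ids and deal them into n_cohorts equally sized, disjoint groups."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_cohorts
    return [shuffled[k * size:(k + 1) * size] for k in range(n_cohorts)]

# Sizes reported in the text; the integer ids are hypothetical stand-ins for patients.
full_cohort_size = 2_658_753
sample_size = 132_935  # the 5% random sample

rng = random.Random(2023)  # arbitrary seed, for reproducibility of the sketch
sample_ids = rng.sample(range(full_cohort_size), sample_size)

cohorts = split_into_cohorts(sample_ids, n_cohorts=5, seed=2023)
testing, validation = cohorts[0], cohorts[1:]
print(len(testing), [len(v) for v in validation])
```

Because 132,935 is exactly 5 x 26,587, the five groups are equal in size and no sampled record is left unassigned.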

Parametric competing risk survival models9 estimated urban-rural differences in the hazard of 5-year cancer-specific mortality. For each definition (RUCC, NCHS, IRR), three models were constructed, each using one of the following as the primary exposure: 1) an established binary cut-point defined by the Economic Research Service, CDC, or a prior study [Supplementary Table 1], 2) a data-driven binary cut-point, or 3) the continuous variable. The established cut-points for the RUCC and NCHS classify metropolitan areas as “urban” (RUCC values 1-3; NCHS values 1-4), and the established cut-point for the IRR is 0.50. Data-driven cut-points were estimated using the optimal survival time-related cut-point method developed by Liu & Jin.10 This method uses receiver operating characteristic curves to maximize the concordance probability function (c), which is equal to the product of sensitivity and specificity.10 Agreement between the established and data-driven cut-points was assessed using Cohen’s kappa (κ),11 and models were evaluated using the Bayesian information criterion (BIC), Somers’ D, and Harrell’s c statistic.12 Models were adjusted for sex, race/ethnicity, stage, age at diagnosis, and year of diagnosis. Analyses were conducted using Stata v17. This study was deemed non-human subjects research by the Boston University Institutional Review Board.
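To make the cut-point criterion concrete, the sketch below searches for the threshold that maximizes the product of sensitivity and specificity. It is a deliberately simplified illustration: it treats the outcome as a plain binary indicator and ignores censoring and competing risks, which the Liu & Jin method handles and this sketch does not. The function name, the toy data, and the convention that values above the cut-point count as "rural" are assumptions.

```python
def concordance_probability_cutpoint(values, events):
    """Return (cutpoint, product) maximizing sensitivity * specificity.

    values: exposure scores (higher = more rural).
    events: 1 if the outcome occurred, 0 otherwise (binary simplification).
    An observation is classified 'rural' when value > cutpoint (assumed convention).
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    best_cut, best_product = None, -1.0
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if v > cut and e == 1)
        true_neg = sum(1 for v, e in zip(values, events) if v <= cut and e == 0)
        product = (true_pos / n_pos) * (true_neg / n_neg)
        if product > best_product:
            best_cut, best_product = cut, product
    return best_cut, best_product

# Toy RUCC-like codes and outcomes; these are invented, not SEER data.
toy_codes = [1, 1, 1, 2, 2, 3, 4, 6, 7, 9]
toy_events = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

cut, product = concordance_probability_cutpoint(toy_codes, toy_events)
print(f"cut-point: {cut}, sensitivity x specificity: {product:.2f}")
```

In this toy example the search selects a cut-point of 2, where sensitivity and specificity are both 0.8.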

RESULTS

Supplementary Table 2 describes demographics. Established and data-driven cut-points had slight to fair agreement, as data-driven cut-points classified larger proportions (26-40%) of the population as “rural” [Table 1]. Additionally, rural populations defined by data-driven cut-points had 5-8% more non-Hispanic White patients than rural populations defined by established cut-points (all p-values<0.01) [Table 2].
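The agreement statistic can be sketched as follows. The example classifies the same hypothetical IRR values at the established (0.50) and data-driven (0.306) cut-points and computes Cohen's κ between the two binary classifications. The IRR values are invented, the rule that values above the cut-point are "rural" is an assumption, and the verbal labels follow the commonly used Landis and Koch bands, which the "slight to fair" wording suggests but the article does not explicitly cite.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary (0/1) classifications of the same units."""
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)

def agreement_label(kappa):
    """Verbal label using the Landis and Koch bands (an assumed convention)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical county IRR values; rural = 1 when IRR exceeds the cut-point.
toy_irr = [0.10, 0.20, 0.25, 0.31, 0.35, 0.40, 0.45, 0.55, 0.60, 0.70]
rural_established = [1 if v > 0.50 else 0 for v in toy_irr]
rural_data_driven = [1 if v > 0.306 else 0 for v in toy_irr]

kappa = cohens_kappa(rural_established, rural_data_driven)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

Under these bands, the κ values reported in Table 1 (0.35 and 0.20) fall in the "fair" and "slight" ranges, respectively.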

Table 1.

Cut-points and their associated measures of model fit, calibration, and discrimination

| Measure and model | Cut-point value | % rural | Agreement between cut-points: Cohen’s κ (SE) | Agreement between cut-points: p-value | Model fit and calibration: HR (95% CI) | Model fit and calibration: BIC | Model discrimination: Somers’ D | Model discrimination: Harrell’s c |
|---|---|---|---|---|---|---|---|---|
| RUCC, established binary | 3.000 | 11.67 | 0.35 (0.005) | <0.001 | 1.25 (1.18-1.33) | 49597.92 | 0.6075 | 0.8038 |
| RUCC, data-driven binary | 1.000 | 38.07 | | | 1.11 (1.07-1.15) | 49609.56 | 0.6068 | 0.8034 |
| RUCC, continuous | - | - | - | - | 1.04 (1.03-1.05) | 49583.93 | 0.6085 | 0.8042 |
| NCHS, established binary | 3.000 | 11.67 | 0.35 (0.005) | <0.001 | 1.20 (1.14-1.26) | 49597.92 | 0.6075 | 0.8038 |
| NCHS, data-driven binary | 2.000 | 38.07 | | | 1.11 (1.07-1.15) | 49609.56 | 0.6068 | 0.8034 |
| NCHS, continuous | - | - | - | - | 1.05 (1.04-1.06) | 49321.66 | 0.6083 | 0.8041 |
| IRR, established binary | 0.500 | 10.01 | 0.20 (0.004) | <0.001 | 1.19 (1.13-1.26) | 49604.40 | 0.6074 | 0.8037 |
| IRR, data-driven binary | 0.306 | 50.25 | | | 1.12 (1.07-1.18) | 49606.70 | 0.6073 | 0.8036 |
| IRR, continuous | - | - | - | - | 1.67 (1.44-1.93) | 49592.67 | 0.6078 | 0.8039 |
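The two discrimination measures in Table 1 are linked by the identity Somers' D = 2c - 1. The sketch below computes Harrell's c for right-censored data from comparable pairs and then checks that identity against the reported values. The function is a basic textbook version (pairs with tied times are skipped, and competing risks are not handled), and the toy data are invented; it is not the routine used for the article's estimates.

```python
def harrells_c(times, events, risk_scores):
    """Harrell's c: share of comparable pairs in which the subject who fails
    earlier has the higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

def somers_d_from_c(c):
    """Somers' D rescales c from the 0.5-1 range to 0-1."""
    return 2 * c - 1

# Toy follow-up times, event indicators, and model risk scores (invented).
toy_times = [2, 4, 5, 7, 9]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.5, 0.1]

c_stat = harrells_c(toy_times, toy_events, toy_risk)
d_stat = somers_d_from_c(c_stat)
print(c_stat, d_stat)

# Reported (Harrell's c, Somers' D) pairs from Table 1 agree with D = 2c - 1
# to within rounding of the published four-decimal values.
reported = [(0.8038, 0.6075), (0.8034, 0.6068), (0.8042, 0.6085),
            (0.8041, 0.6083), (0.8037, 0.6074), (0.8036, 0.6073),
            (0.8039, 0.6078)]
max_gap = max(abs(somers_d_from_c(c) - d) for c, d in reported)
print(f"largest discrepancy: {max_gap:.4f}")
```

This also explains why the two columns rank the models identically: they carry the same information on different scales.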

Table 2.

Demographic comparison between individuals classified as urban and rural at both the established cut-point and at the data-driven cut-point (concordant) and those classified as rural by the data-driven cut-point only (discordant).

| Characteristic | RUCC: Concordant | RUCC: Discordant | RUCC: p-value | NCHS: Concordant | NCHS: Discordant | NCHS: p-value | IRR: Concordant | IRR: Discordant | IRR: p-value |
|---|---|---|---|---|---|---|---|---|---|
| Total, n (%) | 18517 (69.65) | 8070 (30.35) | | 19569 (73.60) | 7018 (26.40) | | 15888 (59.76) | 10699 (40.24) | |
| Age at diagnosis, n (%) | | | 0.01 | | | 0.08 | | | 0.06 |
| 20-44 | 1154 (6.23) | 471 (5.84) | | 1213 (6.20) | 412 (5.87) | | 994 (6.26) | 631 (5.90) | |
| 45-54 | 2887 (15.59) | 1204 (14.92) | | 3032 (15.49) | 1059 (15.09) | | 2483 (15.63) | 1608 (15.03) | |
| 55-64 | 4914 (26.54) | 2181 (27.03) | | 5194 (26.54) | 1901 (27.09) | | 4171 (26.25) | 2924 (27.33) | |
| 65-74 | 5237 (28.28) | 2404 (29.79) | | 5586 (28.55) | 2055 (29.28) | | 4486 (28.2) | 3155 (29.49) | |
| 75-84 | 3371 (18.20) | 1379 (17.09) | | 3542 (18.10) | 1208 (17.21) | | 2904 (18.28) | 1846 (17.25) | |
| 85+ | 954 (5.15) | 431 (5.34) | | 1002 (5.12) | 383 (5.46) | | 850 (5.35) | 535 (5.00) | |
| Sex, n (%) | | | 0.54 | | | 0.56 | | | 0.62 |
| Male | 9299 (50.22) | 4086 (50.63) | | 9831 (50.24) | 3554 (50.64) | | 7979 (50.22) | 5406 (50.53) | |
| Female | 9218 (49.78) | 3984 (49.37) | | 9738 (49.76) | 3464 (49.36) | | 7909 (49.78) | 5293 (49.47) | |
| Race, n (%) | | | <0.001 | | | <0.001 | | | <0.001 |
| White | 13017 (70.30) | 6251 (77.46) | | 13938 (71.22) | 5330 (75.95) | | 10985 (69.14) | 8283 (77.42) | |
| Black | 2326 (12.56) | 897 (11.12) | | 2398 (12.25) | 825 (11.76) | | 2232 (14.05) | 991 (9.26) | |
| Hispanic | 1703 (9.20) | 620 (7.68) | | 1737 (8.88) | 586 (8.35) | | 1373 (8.64) | 950 (8.88) | |
| Other | 1471 (7.94) | 302 (3.74) | | 1496 (7.64) | 277 (3.95) | | 1298 (8.17) | 475 (4.44) | |
| Stage at diagnosis, n (%) | | | 0.51 | | | 0.83 | | | 0.23 |
| Early | 14837 (80.13) | 6438 (79.78) | | 15653 (79.99) | 5622 (80.11) | | 12752 (80.26) | 8523 (79.66) | |
| Late | 3680 (19.87) | 1632 (20.22) | | 3916 (20.01) | 1396 (19.89) | | 3136 (19.74) | 2176 (20.34) | |

Rurality defined using established and data-driven RUCC cut-points was associated with 25% (hazard ratio [HR], 1.25; 95% confidence interval [CI], 1.18-1.33) and 11% (HR, 1.11; 95% CI, 1.07-1.15) increased hazard of 5-year cancer mortality, respectively [Table 1]. For NCHS, rural residence defined by established and data-driven cut-points was associated with 20% (HR, 1.20; 95% CI, 1.14-1.26) and 11% (HR, 1.11; 95% CI, 1.07-1.15) increased hazard of cancer mortality, respectively, compared to urban residence. For IRR, the corresponding increases were 19% (HR, 1.19; 95% CI, 1.13-1.26) and 12% (HR, 1.12; 95% CI, 1.07-1.18). Overall, established cut-points yielded estimates 7-14% higher than data-driven cut-points. When modeled as continuous, hazard of mortality increased by 4% and 5% for each one-unit increase in RUCC and NCHS, respectively, and by 67% across the full range (0 to 1) of the IRR. For each measure, model fit was best when rurality was modeled continuously, though differences were minimal [Table 1]. Results were verified in the four validation samples.
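The arithmetic behind these comparisons can be reproduced from the hazard ratios in Table 1. The sketch below converts each HR to a percent increase in hazard, takes the gap between established and data-driven estimates (a difference in percentage points), and shows what a per-unit HR implies across several units. The compounding step assumes the hazard is log-linear in the rurality code, as a continuous term in a proportional hazards model implies; the endpoints chosen (RUCC 1 to 9, NCHS 1 to 6) are illustrative rather than results reported in the article.

```python
def percent_increase(hazard_ratio):
    """Percent increase in hazard implied by a hazard ratio."""
    return (hazard_ratio - 1) * 100

# Hazard ratios from Table 1.
established = {"RUCC": 1.25, "NCHS": 1.20, "IRR": 1.19}
data_driven = {"RUCC": 1.11, "NCHS": 1.11, "IRR": 1.12}

# Gap between the two binary definitions, in percentage points of excess hazard.
gaps = {
    measure: round(percent_increase(established[measure])
                   - percent_increase(data_driven[measure]))
    for measure in established
}
print(gaps)

def cumulative_hr(per_unit_hr, units):
    """HR across several units, assuming a log-linear (multiplicative) effect."""
    return per_unit_hr ** units

# Illustrative only: most-urban to most-rural category for each coded scheme.
rucc_1_to_9 = cumulative_hr(1.04, 8)
nchs_1_to_6 = cumulative_hr(1.05, 5)
print(f"RUCC 1 vs 9: HR {rucc_1_to_9:.2f}; NCHS 1 vs 6: HR {nchs_1_to_6:.2f}")
```

The computed gaps of 14, 9, and 7 points correspond to the 7-14% range quoted in the text, and the compounding shows why small per-unit hazard ratios can still describe sizeable differences between the ends of a scale.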

DISCUSSION

The ongoing conversation on defining rurality and how (or if) variables should be categorized is complex and requires further investigation.4,6,8 Using county-level rural definitions, we assessed differences in model fit and estimated the hazard of primary cancer death using established cut-points, data-driven cut-points, and rurality modeled as continuous. While model fit and discrimination were comparable, the magnitude of the estimated urban-rural difference varied across models. Compared to data-driven cut-points, established cut-points categorized a smaller share of the population as rural (10-12% vs 38-50%), and these rural populations contained fewer White patients (69-71% vs 76-77%). Established cut-points categorized smaller metropolitan areas as urban, whereas data-driven cut-points retained only metropolitan areas with populations >1 million as urban. Hazard ratios estimated using established cut-points were 7-14% higher than those using data-driven cut-points, while modeling rurality as a continuous variable revealed significant between-unit differences in mortality for all three measures. Thus, the measure of rurality used in future studies, as well as the choice of cut-points, should be guided by the specific research question.

Larger mortality differences from models using established cut-points may be due to the sociodemographic makeup of “urban” vs “rural” counties under established versus data-driven cut-points. Using data-driven cut-points, “rural” counties contained 5-8% more White individuals than those classified as rural using established cut-points. Historically, rural areas have been predominantly White, though diversity in these areas has increased in the past decade (from 80% White in 2010 to 76% in 2020).13 However, race and rurality intersect such that rural White populations experience better health outcomes than rural Black populations,14,15 and these differences widen as rurality increases.15 Notably, Black-White differences in income, education, and unemployment are greater among residents of fringe and medium metropolitan areas than among residents of large metropolitan areas.15 Prior findings, along with the current results, emphasize the importance of considering sociodemographic factors when deciding how to operationalize rural measures. These findings are particularly striking given that the most commonly used urban-rural measures are based on arbitrary geographical units such as county, zip code, or census tract, and treat rurality as a measure of location rather than a multifaceted contextual measure.6 Thus, urban-rural measures are not interchangeable or comparable with respect to identifying disparities in health outcomes, and they are also sensitive to the choice of cut-point, which affects the sociodemographic makeup of rural areas.

The hazard of primary cancer mortality increased with the level of rurality for all three measures when modeled as continuous. The observed between-unit differences suggest that categorizing measures of rurality may conceal important differences between categories. Efficiency and cost-effectiveness may be maximized by targeting policies and interventions at high-risk individuals within rural areas, and by considering degree of rurality rather than absolute categories. Dichotomizing at different cut-points yields varying estimates, ignores within-group differences, and reduces statistical power. Further, while the RUCC and NCHS are better suited for differentiating cancer mortality risk across rural and urban counties, respectively,5 the present results suggest that the IRR may be better suited to capture differences in risk across the entire rural-urban spectrum. While there is no single correct method of operationalizing urban-rural measures, we encourage the use of continuous variables (when appropriate) and hypothesis-driven cut-points to identify disparities, guide policy decisions, and implement interventions.

Supplementary Material

Supinfo

Funding:

Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R37CA266193 (PI: Kenzik). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

Disclosures: None

Conflicts of interest: None

REFERENCES

1. Franks JA, Davis ES, Bhatia S, Kenzik KM. Defining rurality: an evaluation of rural definitions and the impact on survival estimates. J Natl Cancer Inst. Published online February 10, 2023. doi: 10.1093/jnci/djad031
2. USDA Economic Research Service. Rural-Urban Continuum Codes. US Department of Agriculture. Published 2020. Accessed May 20, 2022. https://www.ers.usda.gov/data-products/rural-urban-continuum-codes.aspx
3. National Center for Health Statistics. NCHS Urban-Rural Classification Scheme for Counties. CDC. Published December 2, 2019. Accessed November 9, 2020. https://www.cdc.gov/nchs/data_access/urban_rural.htm
4. Waldorf B, Kim A. Defining and Measuring Rurality in the US: From Typologies to Continuous Indices. In: ; 2018:26.
5. Hirko KA, Xu H, Rogers LQ, et al. Cancer disparities in the context of rurality: risk factors and screening across various U.S. rural classification codes. Cancer Causes Control. 2022;33(8):1095–1105. doi: 10.1007/s10552-022-01599-2
6. Zahnd WE, Mueller-Luckey GS, Fogleman AJ, Jenkins WD. Rurality and Health in the United States: Do Our Measures and Methods Capture Our Intent? J Health Care Poor Underserved. 2019;30(1):70–79. doi: 10.1353/hpu.2019.0008
7. Yaghjyan L, Cogle C, Deng G, et al. Continuous Rural-Urban Coding for Cancer Disparity Studies: Is It Appropriate for Statistical Analysis? Int J Environ Res Public Health. 2019;16(6):1076. doi: 10.3390/ijerph16061076
8. Schiefelbein AM, Taylor AK, Krebsbach JK, et al. Same People, Different Results: Categorizing Cancer Registry Cases across the Rural-Urban Continuum. Published online December 29, 2021. doi: 10.21203/rs.3.rs-1200114/v1
9. Lambert PC, Royston P. Further Development of Flexible Parametric Models for Survival Analysis. Stata J Promot Commun Stat Stata. 2009;9(2):265–290. doi: 10.1177/1536867X0900900206
10. Liu X, Jin Z. Optimal survival time-related cut-point with censored data. Stat Med. 2015;34(3):515–524. doi: 10.1002/sim.6360
11. Ranganathan P, Pramesh CS, Aggarwal R. Common pitfalls in statistical analysis: Measures of agreement. Perspect Clin Res. 2017;8(4):187–191. doi: 10.4103/picr.PICR_123_17
12. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–387.
13. Johnson KM, Lichter D. Growing Racial Diversity in Rural America: Results from the 2020 Census. University of New Hampshire Carsey School of Public Policy; 2022. doi: 10.34051/p/2022.09
14. Probst JC, Zahnd WE, Hung P, Eberth JM, Crouch EL, Merrell MA. Rural-Urban Mortality Disparities: Variations Across Causes of Death and Race/Ethnicity, 2013–2017. Am J Public Health. 2020;110(9):1325–1327. doi: 10.2105/AJPH.2020.305703
15. Owens-Young J, Bell CN. Structural Racial Inequities in Socioeconomic Status, Urban-Rural Classification, and Infant Mortality in US Counties. Ethn Dis. 2020;30(3):389–398. doi: 10.18865/ed.30.3.389
