Abstract
The National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) program is continuously exploring opportunities to augment its already extensive collection of data, enhance the quality of reported cancer information, and contribute to more comprehensive analyses of cancer burden. This manuscript describes a recent linkage of the LexisNexis longitudinal residential history data with 11 SEER registries and provides estimates of the inter-state mobility of SEER cancer patients. To identify mobility from one state to another, we used state postal abbreviations to generate state-level residential histories. From this, we determined how often cancer patients moved from state-to-state. The results in this paper provide information on the linkage with LexisNexis data and useful information on state-to-state residential mobility patterns of a large portion of US cancer patients for the most recent 1-, 2-, 3-, 4-, and 5-year periods. We show that mobility patterns vary by geographic area, race/ethnicity and age, and cancer patients tend to move less than the general population.
Keywords: data linkage; exposure estimates; residential history; social determinants; Surveillance, Epidemiology, and End Results (SEER) program
Introduction
The National Cancer Institute's (NCI's) Surveillance, Epidemiology, and End Results (SEER) program is a rich source of cancer-related information including diagnostic data, patient demographics, tumor characteristics, initial treatment at the time of diagnosis, and outcomes.1 The SEER Program is continuously exploring opportunities to augment its already extensive collection of data, enhance the quality of reported cancer information, and contribute to more comprehensive analysis of cancer burden. Emerging sources of cancer-related data, coupled with novel technologies for data extraction and linkage, present an opportunity for cancer registries to integrate larger-scale longitudinal data pre- and post-diagnosis into the existing cancer surveillance data infrastructure.
While cancer registries collect the patient's residential address at the time of diagnosis, historical and updated address histories are not generally available. Having residential history pre- and post-cancer diagnosis would facilitate data linkages with multiple sources of longitudinal data, enhance the quality of data linkage in the absence of patient identifier information, and provide research opportunities to investigate the association of exposures to neighborhood social and environmental conditions with risks of developing cancer over the life course2-7 as well as the impact of a cancer diagnosis on cancer survivorship issues.2,8-10 For example, incorporating residential history records into cancer research can enhance our understanding of the impacts of neighborhood sociodemographic and physical conditions, poverty and social deprivation, accessibility to healthcare resources, quality and availability of cancer care, tobacco and alcohol consumption, food environments, and contaminants in water, soil, and air at various places of residence on cancer risk and outcomes. Once diagnosed with cancer, patients may move for a variety of reasons: to be closer to their families, for better access to treatment, for other survivorship considerations,9 or as a result of losing a job due to poor health or disability. Until recently, individual residential history data have been difficult and expensive to obtain. Studies requiring residential history records for cancer patients often relied on patients' self-reported addresses, introducing recall bias with no means of assessing this error, or on incomplete addresses derived from electronic medical records, introducing collection bias.11 Increasingly, commercial sources of residential history data such as LexisNexis12 offer easier access to more complete individual address information, which presents an opportunity for the cancer control research community to reconstruct residential histories of cancer patients.
In 2016, NCI sponsored a pilot study to assess the accuracy and completeness of residential history data provided by three vendors, including LexisNexis, compared with self-reported addresses from 66 volunteer participants at NCI and NIEHS who represented varying age and migratory history. Of the three vendors, LexisNexis was identified as the source of the most complete, accurate, and available residential history data dating back to the 1980s.13 Other studies, limited to a single registry, assessed LexisNexis residential history data6,11 and concluded that LexisNexis address records can be used for reconstructing residential histories in cancer surveillance and epidemiological research.
This manuscript describes a recent enhanced linkage of the LexisNexis longitudinal residential history data with 11 SEER registries and provides estimates of the inter-state mobility of SEER cancer patients based on this linkage. Because most data received by cancer registries come from within the state, knowing how often cancer patients move out of the state of diagnosis can indicate the percentage of patients who may not link to state data. To our knowledge, no study has investigated the inter-state mobility patterns of a large population-based database of cancer patients.
Methods
Linkage
LexisNexis maintains a commercially available database containing information from a variety of data sources on more than 276 million US individuals.12 Based on the prior linkage with LexisNexis,13 11 SEER registries (10 state registries and one metropolitan-area registry [Seattle]) that had already established confidentiality agreements with LexisNexis were included in this study. We included cancer patients who were at least 21 years old and had been diagnosed between 2009 and 2015 because the residential history data for younger ages and earlier diagnosis years were not as complete. Death-certificate-only cases were excluded since only limited address information is available for these cases. The cohort included approximately 3,247,000 cancer patients. For each cancer patient in the cohort, the following data items were sent to LexisNexis to conduct the linkage: first name, middle name, last name, suffix, Social Security Number (SSN), address at diagnosis (street, city, state, and zip code), date of birth, and phone number. The linkage was conducted in 2019. The percentage of cases in the SEER data with a complete SSN was approximately 96%.
Developing Residential Histories and Conducting State-to-State Mobility Analysis
Data returned by LexisNexis included any address associated with an individual and a range of dates when that address was used. The data often contained multiple records for the same residence with minor differences, multiple unique residence records for overlapping time periods, or gaps in residence records during the time period. To construct each patient's residential history, i.e., a single address at any particular time point, the data need to be reconciled and adjusted for overlaps and gaps in addresses. To identify mobility from one state to another, we used state postal abbreviations, which are rarely misspelled and can be easily reconciled, to generate state-level residential histories. From this, we determined how often cancer patients moved from state to state. For each patient's final state of residence, we determined the number of years in this state and noted patients who moved to a different state within 1 year, 2 years, 3 years, 4 years, and 5 years. From this we calculated the state-level move rates as the percent of patients who moved to a different state within the most recent number of years. Note that this time period varies for each patient depending on the end date of the most recent address returned by LexisNexis. These time periods look backwards in time from the most recent residence reported by LexisNexis and, thus, include residence periods both before and after the date of diagnosis. For this study, we looked only at the LexisNexis address data, so we were not able to differentiate between pre- and post-diagnosis locations.
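The state-level reconstruction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the record layout (state abbreviation, start date, end date), the function names `state_history`, `moved_within`, and `move_rate`, and the rule of ordering records by start date to resolve overlaps are all hypothetical, and the two patients are invented.

```python
from datetime import date

def state_history(records):
    """Collapse (state, start, end) address records into ordered state-level runs."""
    runs = []
    # Assumption: ordering by start date is enough to resolve overlapping records.
    for state, start, end in sorted(records, key=lambda r: (r[1], r[2])):
        if runs and runs[-1][0] == state:
            # Same state as the previous run: extend it, absorbing
            # duplicate, overlapping, or gapped records for that state.
            runs[-1] = (state, runs[-1][1], max(runs[-1][2], end))
        else:
            runs.append((state, start, end))
    return runs

def moved_within(records, years):
    """True if the patient entered the final state of residence from a
    different state within `years` of the end date of that final residence."""
    runs = state_history(records)
    if len(runs) < 2:
        return False  # only one state on record, so no inter-state move
    _, final_start, final_end = runs[-1]
    years_in_final_state = (final_end - final_start).days / 365.25
    return years_in_final_state <= years

def move_rate(patients, years):
    """Percent of patients with an inter-state move in the most recent `years`."""
    moved = sum(moved_within(recs, years) for recs in patients.values())
    return 100.0 * moved / len(patients)

# Invented records for illustration only.
patients = {
    "p1": [("CA", date(2005, 1, 1), date(2016, 6, 30)),
           ("CA", date(2016, 7, 1), date(2019, 1, 1))],
    "p2": [("NY", date(2000, 1, 1), date(2017, 3, 1)),
           ("FL", date(2017, 3, 1), date(2019, 1, 1))],
}
for n in (1, 2, 3, 4, 5):
    print(n, move_rate(patients, n))
```

Because the window is anchored on each patient's own most recent address end date, the sketch mirrors the point made in the text that the reference period differs from patient to patient. A production version would need a more careful rule for records in different states that overlap in time.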
For the United States, data on the residential mobility of the general population are available from the Census Bureau14 and these data have been analyzed for older adults.15 We used 5-year data from the Census Bureau's American Community Survey for 2015-2019 to calculate state-level move rates for the general population stratified by geographic area, sex, race/ethnicity, and age group. Since cancer patients are generally older than the general population and previous studies have shown that older adults move less frequently, we used age group profiles of the cancer population to create weighted state-level move rates. These rates provide estimates of the state-level move rates for a subset of the general population with matching age profiles.
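The age weighting described above amounts to averaging the general-population move rates over age groups, with each group weighted by its share of the cancer cohort. The following Python sketch shows that calculation under assumptions: the function name `age_weighted_rate`, the three coarse age groups, and all numbers are hypothetical and are not American Community Survey or SEER values.

```python
def age_weighted_rate(rate_by_age, cancer_count_by_age):
    """Weight general-population move rates (percent) by the cancer cohort's age profile."""
    total = sum(cancer_count_by_age.values())
    return sum(rate_by_age[group] * count
               for group, count in cancer_count_by_age.items()) / total

# Hypothetical inputs for one registry area (not ACS or SEER values).
general_rate = {"20-44": 3.0, "45-64": 1.0, "65+": 0.8}   # 1-year move rate, %
cancer_count = {"20-44": 100, "45-64": 400, "65+": 500}   # patients per age group

weighted = age_weighted_rate(general_rate, cancer_count)
print(round(weighted, 2))
```

With a cohort concentrated in the older groups, the weighted rate sits closer to the older groups' rates than a simple population-wide rate would, which is the adjustment the weighted comparison is meant to capture.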
Results
As shown in Table 1, LexisNexis was able to link and return address information on 3,177,258 (98.5%) of the patients sent for linkage. We received up to the maximum of 20 address records for each patient, with an average of 7.7 records per patient. The percentage linked by registry was highest for Connecticut, Georgia, Kentucky, Louisiana, Seattle, and Utah (over 99%) and lowest for New York (97.7%). Linkage rates were very similar by sex but were lower for non-Hispanic Asian and Pacific Islander (API) (94.3%) and Hispanic (95.7%) patients. By age at first diagnosis, linkage rates were highest for those diagnosed between 50 and 64 years (98.9%) and lowest for patients diagnosed in the youngest (97.5%) and oldest (97.1%) age groups. By diagnosis year, linkage rates were very similar.
Table 1.
SEER Residential History Data Linkage Results by Registry, Demographic Characteristics, and Diagnosis Year
| | No. patients submitted | No. linked and returned with address information (%) |
|---|---|---|
| Total | 3,226,404 | 3,177,258 (98.5) |
| Registry | ||
| California | 1,093,698 | 1,072,072 (98.0) |
| Connecticut | 149,405 | 148,344 (99.3) |
| Georgia | 332,737 | 330,892 (99.4) |
| Iowa | 123,331 | 122,072 (99.0) |
| Idaho | 54,252 | 53,694 (99.0) |
| Kentucky | 186,414 | 185,233 (99.4) |
| Louisiana | 172,361 | 171,319 (99.4) |
| New Mexico | 64,062 | 63,245 (98.7) |
| New York | 792,594 | 774,250 (97.7) |
| Seattle | 184,198 | 183,138 (99.4) |
| Utah | 73,352 | 72,999 (99.5) |
| Sex | ||
| Male | 1,559,511 | 1,537,859 (98.6) |
| Female | 1,666,450 | 1,638,969 (98.4) |
| Other/unknown | 443 | 430 (97.1) |
| Race/ethnicity | ||
| NH White | 2,309,662 | 2,292,590 (99.3) |
| NH Black | 335,079 | 329,553 (98.4) |
| NH AI/AN | 14,152 | 14,024 (99.1) |
| NH API | 184,377 | 173,879 (94.3) |
| Hispanic | 346,317 | 331,587 (95.7) |
| Unknown | 36,817 | 35,625 (96.8) |
| Age at first diagnosis (y) | ||
| 20-24 | 18,386 | 17,919 (97.5) |
| 25-29 | 34,396 | 33,719 (98.0) |
| 30-34 | 50,703 | 49,818 (98.3) |
| 35-39 | 70,902 | 69,765 (98.4) |
| 40-44 | 118,189 | 116,448 (98.5) |
| 45-49 | 187,984 | 185,493 (98.7) |
| 50-54 | 289,516 | 286,192 (98.9) |
| 55-59 | 370,108 | 365,919 (98.9) |
| 60-64 | 437,590 | 432,652 (98.9) |
| 65-69 | 457,842 | 452,263 (98.8) |
| 70-74 | 384,928 | 379,158 (98.5) |
| 75-79 | 318,363 | 312,503 (98.2) |
| 80-84 | 249,814 | 244,727 (98.0) |
| ≥85 | 237,674 | 230,677 (97.1) |
| Unknown | 9 | 5 (55.6) |
| Diagnosis year | ||
| 2009 | 466,879 | 459,564 (98.4) |
| 2010 | 458,222 | 451,186 (98.5) |
| 2011 | 461,623 | 454,724 (98.5) |
| 2012 | 456,053 | 449,496 (98.6) |
| 2013 | 457,223 | 450,333 (98.5) |
| 2014 | 460,335 | 453,408 (98.5) |
| 2015 | 466,069 | 458,547 (98.4) |
AI/AN, American Indian and Alaska Native; API, Asian/Pacific Islander; NH, non-Hispanic.
The percentage of cancer patients who moved to a different state within the most recent 1 year, 2 years, 3 years, 4 years, and 5 years is shown in Table 2 for 11 SEER registry areas. About 1 percent or less of cancer patients have moved to a different state within the most recent 1 year; whereas between 2.5 and 4.7 percent have moved within the last 5 years. Cancer patients in New York have the most state-to-state moves and patients in Louisiana have the least. Among cancer patients in these registries, females move from state-to-state a bit more often than males. By race/ethnic groups, non-Hispanic API patients move from state-to-state the most frequently with non-Hispanic White patients moving the least often. As expected, younger patients move from state-to-state more often than older patients.
Table 2.
State-Level Move Rates for Cancer Patients by Registry and by Demographic Characteristics for the Most Recent 1-Year to 5-Year Periods with Comparative 1-Year Move Rates for the General Population
| | Percent of cancer patients who moved to a different state within the most recent N years a | | | | | One-year state move rates for general population b | |
|---|---|---|---|---|---|---|---|
| | 1 y | 2 y | 3 y | 4 y | 5 y | Unweighted | Weighted c |
| Registry | |||||||
| California | 0.95 | 1.6 | 2.4 | 3.2 | 3.9 | 1.30 | 0.78 |
| Connecticut | 0.78 | 1.5 | 2.3 | 3.3 | 4.2 | 2.31 | 1.21 |
| Georgia | 0.77 | 1.4 | 2.0 | 2.8 | 3.5 | 2.75 | 1.78 |
| Idaho | 0.77 | 1.5 | 2.3 | 3.2 | 4.2 | 4.33 | 3.14 |
| Iowa | 0.51 | 1.0 | 1.5 | 2.1 | 2.7 | 2.50 | 1.16 |
| Kentucky | 0.57 | 1.0 | 1.6 | 2.1 | 2.8 | 2.46 | 1.28 |
| Louisiana | 0.50 | 0.9 | 1.4 | 2.0 | 2.5 | 1.70 | 1.00 |
| New Mexico | 0.75 | 1.5 | 2.4 | 3.3 | 4.3 | 2.92 | 2.08 |
| New York | 1.06 | 1.9 | 2.7 | 3.7 | 4.7 | 1.34 | 0.62 |
| Seattle | 0.80 | 1.6 | 2.4 | 3.2 | 4.1 | 3.49 | 1.88 |
| Utah | 0.72 | 1.5 | 2.3 | 3.2 | 4.1 | 3.26 | 2.29 |
| Sex | |||||||
| Male | 0.80 | 1.4 | 2.1 | 2.8 | 3.6 | 1.94 | 1.80 |
| Female | 0.93 | 1.7 | 2.5 | 3.4 | 4.2 | 1.81 | 1.95 |
| Race/ethnicity | |||||||
| NH White | 0.71 | 1.3 | 2.0 | 2.7 | 3.4 | 2.18 | 2.22 |
| NH Black d | 0.85 | 1.5 | 2.3 | 3.1 | 4.0 | 1.97 | 1.87 |
| NH API d | 2.00 | 3.1 | 4.2 | 5.5 | 7.0 | 1.87 | 1.52 |
| NH AI/AN d | 0.78 | 1.4 | 2.1 | 2.8 | 3.5 | 1.75 | 1.71 |
| Hispanic | 1.30 | 2.1 | 3.0 | 4.2 | 5.3 | 1.07 | 0.79 |
| Age at diagnosis (y) | |||||||
| 20-24 | 2.45 | 5.3 | 8.3 | 11.7 | 15.2 | 4.24 | 4.26 |
| 25-29 | 2.16 | 4.6 | 7.1 | 9.8 | 12.6 | 3.93 | 3.96 |
| 30-34 | 1.62 | 3.5 | 5.4 | 7.6 | 9.7 | 2.78 | 2.80 |
| 35-39 | 1.28 | 2.6 | 4.0 | 5.6 | 7.2 | 1.98 | 1.99 |
| 40-44 | 1.15 | 2.1 | 3.3 | 4.5 | 5.8 | 1.46 | 1.45 |
| 45-49 | 0.98 | 1.9 | 2.9 | 3.9 | 4.8 | 1.18 | 1.17 |
| 50-54 | 0.91 | 1.7 | 2.6 | 3.5 | 4.4 | 1.07 | 1.07 |
| 55-59 | 0.85 | 1.6 | 2.4 | 3.3 | 4.1 | 0.98 | 0.99 |
| 60-64 | 0.81 | 1.5 | 2.2 | 3.1 | 3.8 | 0.98 | 0.99 |
| 65-69 | 0.74 | 1.3 | 2.0 | 2.7 | 3.4 | 0.92 | 0.94 |
| 70-74 | 0.73 | 1.2 | 1.8 | 2.4 | 3.0 | 0.82 | 0.83 |
| ≥75 | 0.78 | 1.2 | 1.6 | 2.1 | 2.7 | 0.81 | 0.79 |
a Source: state-level residential history of cancer patients included in the SEER-LN linkage, ages 21 and older, diagnosed between 2009 and 2015.
b Source: Census American Community Survey, moves from a different state within the last year, 5-year results 2015-2019.
c Census results are weighted by the age-group profiles of the cancer patients in each of the registry areas.
d Bridged race/ethnicity categories for non-Hispanic (NH) Black, NH API, and NH AI/AN are not available in Census tables. Because of this, move rates for NH Black cancer patients are compared with the single-race Black population of any Hispanic origin; NH API with single-race API of any Hispanic origin; and NH AI/AN with single-race AI/AN of any Hispanic origin.
For comparison, Table 2 includes state-level move rates for the general population. The unweighted state-level move rates of the general population are generally higher than the state-level move rates for the cancer population. The weighted move rates, which estimate the state-level move rates for a subset of the general population with matching age profiles, are also generally higher than those for cancer patients. By registry area, the exceptions are the states of California and New York, where cancer patients have higher state-to-state move rates than their counterparts in the general public. By race/ethnicity, the exceptions are non-Hispanic API and Hispanic cancer patients.
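The registry-level comparison can be checked directly from the 1-year columns of Table 2. The short Python sketch below copies the cancer-patient and age-weighted general-population rates from the table and lists the registry areas where the cancer-patient rate is higher; the variable names are illustrative only.

```python
# One-year state move rates (%) by registry area, copied from Table 2.
cancer_1y = {
    "California": 0.95, "Connecticut": 0.78, "Georgia": 0.77, "Idaho": 0.77,
    "Iowa": 0.51, "Kentucky": 0.57, "Louisiana": 0.50, "New Mexico": 0.75,
    "New York": 1.06, "Seattle": 0.80, "Utah": 0.72,
}
weighted_general_1y = {
    "California": 0.78, "Connecticut": 1.21, "Georgia": 1.78, "Idaho": 3.14,
    "Iowa": 1.16, "Kentucky": 1.28, "Louisiana": 1.00, "New Mexico": 2.08,
    "New York": 0.62, "Seattle": 1.88, "Utah": 2.29,
}

# Registry areas where cancer patients move more often than the
# age-weighted general population.
exceptions = [area for area, rate in cancer_1y.items()
              if rate > weighted_general_1y[area]]
print(exceptions)
```

The list contains only California and New York, in agreement with the exceptions named in the text.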
Discussion
This paper demonstrates the feasibility of obtaining residential histories for almost all adult cancer patients diagnosed in recent years in SEER. In addition, this is the first large-scale assessment of the state-to-state mobility patterns of US cancer patients, covering 30% of the US population, and can provide some initial insights into how often cancer patients move between states for different geographic areas. Knowledge of state-to-state move patterns for cancer patients plays an important role in understanding the need to include out-of-state data in data linkages. For example, requests for supplemental prescription drug data for a given state registry can include data from neighboring states with significant move rates.
There is some geographic variation in the state-to-state move rates, with New York rates being the highest and Louisiana rates being the lowest. There is also some variation by race/ethnicity, with non-Hispanic API rates the highest and non-Hispanic White rates the lowest. Older cancer patients move less frequently than younger patients. This is consistent with previous studies that indicate older adults move less frequently.15 Comparison of the state-to-state move rates of cancer patients with those of the general public shows generally lower rates for cancer patients. However, state-to-state move rates were in general very low, under 5% over the most recent 5 years in every registry area, indicating that fewer than 5% of cancer patients will be missed in state-specific data linkages.
This study has limitations. We focused on state level moves as a first step to identify the need to acquire and link with out-of-state data. We only included 11 registries representing 30% of the US population. A recent study showed that LexisNexis address information near the time of death may not be accurate.16
The results in this paper provide information on the linkage with LexisNexis data and useful information on state-to-state residential mobility patterns of a large portion of US cancer patients for the most recent 1-, 2-, 3-, 4-, and 5-year periods. Mobility patterns vary by geographic area, race/ethnicity, and age. Finally, cancer patients tend to move less than the general population.
Work is currently being done to develop an algorithm to construct detailed residential histories that identify unique addresses for a patient with a single address at any particular time point. Once the complete residential history data is created and validated, it will be a unique and valuable resource for extending our understanding of the residential mobility of cancer patients throughout the cancer control continuum as well as providing research opportunities to investigate the association of exposures on outcomes.
Acknowledgement
We would like to acknowledge members of the SEER/LexisNexis residential address linkage working group: Mary Charlton (Iowa Cancer Registry), Iona Cheng (Greater Bay Area Cancer Registry), Rosemary Cress (Cancer Registry of Greater California), Dennis Deapen (Los Angeles County Cancer Surveillance Program), Will Howe (Information Management Services, Inc.), Tina Lefante (Louisiana Tumor Registry), and Bozena Morawski (Cancer Data Registry of Idaho).
References
- 1. National Cancer Institute's Surveillance, Epidemiology, and End Results Program website. Accessed December 7, 2022. https://seer.cancer.gov/
- 2. Namin S, Zhou Y, Neuner J, Beyer K. The role of residential history in cancer research: a scoping review. Soc Sci Med. 2021;270:113657. doi: 10.1016/j.socscimed.2020.113657
- 3. Tatalovich Z, Wilson JP, Mack T, Yan Y, Cockburn M. The objective assessment of lifetime cumulative ultraviolet exposure for determining melanoma risk. J Photochem Photobiol B. 2006;85(3):198–204. doi: 10.1016/j.jphotobiol.2006.08.002
- 4. Paulu C, Aschengrau A, Ozonoff D. Exploring associations between residential location and breast cancer incidence in a case-control study. Environ Health Perspect. 2002;110(5):471–478. doi: 10.1289/ehp.02110471
- 5. Jacquez GM, Kaufmann A, Meliker J, Goovaerts P, AvRuskin G, Nriagu J. Global, local and focused geographic clustering for case-control data with residential histories. Environ Health. 2005;4(1):4. doi: 10.1186/1476-069x-4-4
- 6. Wheeler DC, Wang A. Assessment of residential history generation using a public-record database. Int J Environ Res Public Health. 2015;12(9):11670–11682.
- 7. Wheeler DC, Waller LA, Cozen W, Ward MH. Spatial-temporal analysis of non-Hodgkin lymphoma risk using multiple residential locations. Spat Spatiotemporal Epidemiol. 2012;3(2):163–171. doi: 10.1016/j.sste.2012.04.009
- 8. Wiese D, Stroup AM, Maiti A, et al. Residential mobility and geospatial disparities in colon cancer survival. Cancer Epidemiol Biomarkers Prev. 2020;29(11):2119–2125.
- 9. Liu B, Lee FF, Boscoe F. Residential mobility among adult cancer survivors in the United States. BMC Pub Health. 2020;20(1):1–11.
- 10. Gomez SL, Shariff-Marco S, DeRouen M, et al. The impact of neighborhood social and built environment factors across the cancer continuum: current research, methodological considerations, and future directions. Cancer. 2015;121(14):2314–2330. doi: 10.1002/cncr.29345
- 11. Jacquez GM, Slotnick MJ, Meliker JR, AvRuskin G, Copeland G, Nriagu J. Accuracy of commercially available residential histories for epidemiologic studies. Am J Epidemiol. 2011;173(2):236–243.
- 12. LexID. LexisNexis Risk Solutions website. Accessed April 15, 2022. https://risk.lexisnexis.com/our-technology/lexid
- 13. Stinchcomb DG, Roeser A. NCI/SEER Residential History Project: Technical Report. Westat, Inc; 2016. https://www.westat.com/sites/default/files/NCISAS/NCI_Res_Hist_Proj_Tech_Rpt_v2sec.pdf
- 14. Frost R. Are Americans Stuck in Place? Declining Residential Mobility in the US. Joint Center for Housing Studies of Harvard University; 2020. https://www.jchs.harvard.edu/sites/default/files/harvard_jchs_are_americans_stuck_in_place_frost_2020.pdf
- 15. Choi JH, Goodman L, Zhu J, Walsh J. Senior Housing and Mobility: Recent Trends and Implications for The Housing Market. Urban Institute; 2019. https://www.urban.org/sites/default/files/publication/100953/senior_housing_and_mobility.pdf
- 16. Woolpert KM, Ward KC, England CV, Lash TL. Validation of LexisNexis Accurint in the Georgia Cancer Registry's Cancer Recurrence and Information Surveillance Program. Epidemiology. 2021;32(3):434–438.
