Journal of Registry Management. 2022 Dec 1;49(4):109–113.

Assessment of Interstate Residential Mobility of SEER Patients: SEER and LexisNexis Residential Address Linkage

Zaria Tatalovich a, David G Stinchcomb b, Angela Mariotto a, Diane Ng b, Jennifer L Stevens c, Linda M Coyle c, Lynne Penberthy a
PMCID: PMC10229186  PMID: 37260810

Abstract

The National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) program is continuously exploring opportunities to augment its already extensive collection of data, enhance the quality of reported cancer information, and contribute to more comprehensive analyses of cancer burden. This manuscript describes a recent linkage of the LexisNexis longitudinal residential history data with 11 SEER registries and provides estimates of the inter-state mobility of SEER cancer patients. To identify mobility from one state to another, we used state postal abbreviations to generate state-level residential histories. From this, we determined how often cancer patients moved from state-to-state. The results in this paper provide information on the linkage with LexisNexis data and useful information on state-to-state residential mobility patterns of a large portion of US cancer patients for the most recent 1-, 2-, 3-, 4-, and 5-year periods. We show that mobility patterns vary by geographic area, race/ethnicity and age, and cancer patients tend to move less than the general population.

Keywords: data linkage; exposure estimates; residential history; social determinants; Surveillance, Epidemiology, and End Results (SEER) program

Introduction

The National Cancer Institute's (NCI's) Surveillance, Epidemiology, and End Results (SEER) program is a rich source of cancer-related information including diagnostic data, patient demographics, tumor characteristics, initial treatment at the time of diagnosis, and outcomes.1 The SEER Program is continuously exploring opportunities to augment its already extensive collection of data, enhance the quality of reported cancer information, and contribute to more comprehensive analysis of cancer burden. Emerging sources of cancer-related data, coupled with novel technologies for data extraction and linkage, present an opportunity for cancer registries to integrate larger-scale longitudinal data pre- and post-diagnosis into the existing cancer surveillance data infrastructure.

While cancer registries collect the patient's residential address at the time of diagnosis, historical and updated address histories are not generally available. Having residential history pre- and post-cancer diagnosis would facilitate data linkages with multiple sources of longitudinal data, enhance the quality of data linkage in the absence of patient identifier information, and provide research opportunities to investigate the association of exposures to neighborhood social and environmental conditions with risks of developing cancer over the life course2-7 as well as the impact of a cancer diagnosis on cancer survivorship issues.2,8-10 For example, incorporating residential history records into cancer research can enhance our understanding of the impacts of neighborhood sociodemographic and physical conditions, poverty and social deprivation, accessibility to healthcare resources, quality and availability of cancer care, tobacco and alcohol consumption, food environments, and contaminants in water, soil, and air at various places of residence on cancer risk and outcomes. Once diagnosed with cancer, patients may move for a variety of reasons: to be closer to their families, for better access to treatment, for other survivorship considerations,9 or as a result of losing a job due to poor health or disability. Until recently, individual residential history data have been difficult and expensive to obtain. Studies requiring residential history records for cancer patients often relied on patients' self-reported addresses, introducing recall bias with no means of assessing this error, or on incomplete addresses derived from electronic medical records, introducing collection bias.11 Increasingly, commercial sources of residential history data such as LexisNexis12 offer easier access to more complete individual address information, which presents an opportunity for the cancer control research community to reconstruct residential histories of cancer patients.

In 2016, NCI sponsored a pilot study to assess the accuracy and completeness of residential history data provided by three vendors, including LexisNexis, compared to self-reported addresses from 66 volunteer participants at NCI and NIEHS who represented varying ages and migratory histories. Of the three vendors, LexisNexis was identified as the source of the most complete, accurate, and available residential history data dating back to the 1980s.13 Other studies, limited to a single registry, assessed LexisNexis residential history data6,11 and concluded that LexisNexis address records can be used for reconstructing residential histories in cancer surveillance and epidemiological research.

This manuscript describes a recent enhanced linkage of the LexisNexis longitudinal residential history data with 11 SEER registries and provides estimates of the inter-state mobility of SEER cancer patients based on this linkage. Because most data received by cancer registries come from within the state, knowing how often cancer patients move out of the state of diagnosis can indicate the percentage of patients who may not link to state data. To our knowledge, no study has investigated the inter-state mobility patterns of a large population-based database of cancer patients.

Methods

Linkage

LexisNexis maintains a commercially available database containing information from a variety of data sources on more than 276 million US individuals.12 Based on the prior linkage with LexisNexis,13 11 SEER registries (10 state registries and one metropolitan-area registry, Seattle) that had already established confidentiality agreements with LexisNexis were included in this study. We included cancer patients who were at least 21 years old and had been diagnosed between 2009 and 2015 because the residential history data for younger ages and earlier diagnosis years were not as complete. Death certificate only cases were excluded since only limited address information is available for these cases. The cohort included approximately 3,226,000 cancer patients. For each cancer patient in the cohort, the following data items were sent to LexisNexis to conduct the linkage: first name, middle name, last name, suffix, Social Security Number (SSN), address at diagnosis (street, city, state, and zip code), date of birth, and phone number. The linkage was conducted in 2019. The percentage of cases in the SEER data with a complete SSN was approximately 96%.

Developing Residential Histories and Conducting State-to-State Mobility Analysis

Data returned by LexisNexis included any address associated with an individual and a range of dates when that address was used. The data often contained multiple records for the same residence with minor differences, multiple unique residence records for overlapping time periods, or a gap in residence records during the time period. To construct each patient's residential history, i.e., a single address at any particular time point, the data need to be reconciled and adjusted for overlaps and gaps in addresses. To identify mobility from one state to another, we used state postal abbreviations, which are rarely misspelled and can be easily reconciled, to generate state-level residential histories. From this, we determined how often cancer patients moved from state to state. For each patient's final state of residence, we determined the number of years in this state and noted patients who had moved to a different state within 1 year, 2 years, 3 years, 4 years, and 5 years. From this we calculated the state-level move rates as the percent of patients who had moved to a different state within the most recent number of years. Note that this time period varies for each patient depending on the end date of the most recent address returned by LexisNexis. These time periods look backwards in time from the most recent residence reported by LexisNexis and, thus, include residence periods both before and after the date of diagnosis. For this study, we looked only at the LexisNexis address data, so we were not able to differentiate between pre- and post-diagnosis locations.
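The state-level steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the study's implementation: the record layout (`state`, `start`, `end`), the function names, the rule of ordering records by start date and merging consecutive records from the same state, and the 365.25-day year are all hypothetical choices made for the example. Overlapping records from different states are handled only naively here.

```python
from datetime import date

def state_history(records):
    """Collapse address records into ordered (state, start, end) spells.

    `records` is a list of dicts with hypothetical keys 'state', 'start', 'end'.
    Consecutive records from the same state are merged, which absorbs
    duplicate addresses and within-state moves.
    """
    spells = []
    for rec in sorted(records, key=lambda r: (r["start"], r["end"])):
        state = rec["state"].strip().upper()
        if spells and spells[-1][0] == state:
            _, prev_start, prev_end = spells[-1]
            spells[-1] = (state, prev_start, max(prev_end, rec["end"]))
        else:
            spells.append((state, rec["start"], rec["end"]))
    return spells

def years_in_final_state(spells):
    """Years spent in the final state; None if the patient never changed state."""
    if len(spells) < 2:
        return None
    _, start, end = spells[-1]
    return (end - start).days / 365.25

def move_rates(histories, horizons=(1, 2, 3, 4, 5)):
    """Percent of patients who moved to a different state within the last N years."""
    rates = {}
    for horizon in horizons:
        moved = 0
        for spells in histories:
            years = years_in_final_state(spells)
            if years is not None and years <= horizon:
                moved += 1
        rates[horizon] = 100.0 * moved / len(histories)
    return rates

# Two made-up patients: one moved from NY to FL about 1.5 years before the
# latest reported date; the other has two overlapping records in CA only.
patients = [
    state_history([
        {"state": "NY", "start": date(2005, 1, 1), "end": date(2017, 6, 30)},
        {"state": "fl", "start": date(2017, 7, 1), "end": date(2019, 1, 1)},
    ]),
    state_history([
        {"state": "CA", "start": date(1998, 3, 1), "end": date(2010, 1, 1)},
        {"state": "CA", "start": date(2009, 6, 1), "end": date(2019, 1, 1)},
    ]),
]
rates = move_rates(patients)
```

In this toy example the first patient counts as a mover for the 2- to 5-year horizons but not for the 1-year horizon, and the second patient never counts as a mover.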

For the United States, data on the residential mobility of the general population are available from the Census Bureau14 and these data have been analyzed for older adults.15 We used 5-year data from the Census Bureau's American Community Survey for 2015-2019 to calculate state-level move rates for the general population stratified by geographic area, sex, race/ethnicity, and age group. Since cancer patients are generally older than the general population and previous studies have shown that older adults move less frequently, we used age group profiles of the cancer population to create weighted state-level move rates. These rates provide estimates of the state-level move rates for a subset of the general population with matching age profiles.
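The age weighting amounts to a weighted average of age-specific general-population move rates, with weights given by the share of cancer patients in each age group. The sketch below illustrates that calculation; the function name, the age groups, and all numbers are hypothetical and are not taken from the study or from Census tables.

```python
def age_weighted_rate(pop_rates, patient_counts):
    """Weight age-specific general-population move rates by the patient age profile.

    pop_rates: {age_group: one-year move rate in percent} for the general population
    patient_counts: {age_group: number of cancer patients} in the same area
    """
    missing = set(patient_counts) - set(pop_rates)
    if missing:
        raise KeyError(f"no population rate for age groups: {sorted(missing)}")
    total = sum(patient_counts.values())
    if total == 0:
        raise ValueError("patient_counts must contain at least one patient")
    return sum(pop_rates[group] * n for group, n in patient_counts.items()) / total

# Illustrative numbers only (not from the study)
pop_rates = {"20-44": 4.0, "45-64": 2.0, "65+": 1.0}
patient_counts = {"20-44": 100, "45-64": 400, "65+": 500}
weighted = age_weighted_rate(pop_rates, patient_counts)  # (400 + 800 + 500) / 1000
```

Because the patient profile is concentrated in the older groups, the weighted rate in this example (1.7%) sits closer to the rates of older adults than a simple population-wide rate would.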

Results

As shown in Table 1, LexisNexis was able to link and return address information on 3,177,258 (98.5%) of the patients sent for linkage. We received up to the maximum of 20 address records for each patient, with an average of 7.7 records per patient. The percentage linked by registry was highest for Connecticut, Georgia, Kentucky, Louisiana, Seattle, and Utah (over 99%) and lowest for New York (97.7%). Linkage rates were very similar by sex but were lower for non-Hispanic Asian and Pacific Islander (API) (94.3%) and Hispanic (95.7%) patients. By age at first diagnosis, linkage rates were highest for those diagnosed between 50 and 64 years (98.9%) and were lowest for patients diagnosed in the youngest (97.5%) and oldest (97.1%) age groups. By diagnosis year, linkage rates were very similar.

Table 1.

SEER Residential History Data Linkage Results by Registry, Demographic Characteristics, and Diagnosis Year

| | No. patients submitted | No. linked and returned with address information (%) |
|---|---|---|
| Total | 3,226,404 | 3,177,258 (98.5) |
| Registry | | |
| California | 1,093,698 | 1,072,072 (98.0) |
| Connecticut | 149,405 | 148,344 (99.3) |
| Georgia | 332,737 | 330,892 (99.4) |
| Iowa | 123,331 | 122,072 (99.0) |
| Idaho | 54,252 | 53,694 (99.0) |
| Kentucky | 186,414 | 185,233 (99.4) |
| Louisiana | 172,361 | 171,319 (99.4) |
| New Mexico | 64,062 | 63,245 (98.7) |
| New York | 792,594 | 774,250 (97.7) |
| Seattle | 184,198 | 183,138 (99.4) |
| Utah | 73,352 | 72,999 (99.5) |
| Sex | | |
| Male | 1,559,511 | 1,537,859 (98.6) |
| Female | 1,666,450 | 1,638,969 (98.4) |
| Other/unknown | 443 | 430 (97.1) |
| Race/ethnicity | | |
| NH White | 2,309,662 | 2,292,590 (99.3) |
| NH Black | 335,079 | 329,553 (98.4) |
| NH AI/AN | 14,152 | 14,024 (99.1) |
| NH API | 184,377 | 173,879 (94.3) |
| Hispanic | 346,317 | 331,587 (95.7) |
| Unknown | 36,817 | 35,625 (96.8) |
| Age at first diagnosis (y) | | |
| 20-24 | 18,386 | 17,919 (97.5) |
| 25-29 | 34,396 | 33,719 (98.0) |
| 30-34 | 50,703 | 49,818 (98.3) |
| 35-39 | 70,902 | 69,765 (98.4) |
| 40-44 | 118,189 | 116,448 (98.5) |
| 45-49 | 187,984 | 185,493 (98.7) |
| 50-54 | 289,516 | 286,192 (98.9) |
| 55-59 | 370,108 | 365,919 (98.9) |
| 60-64 | 437,590 | 432,652 (98.9) |
| 65-69 | 457,842 | 452,263 (98.8) |
| 70-74 | 384,928 | 379,158 (98.5) |
| 75-79 | 318,363 | 312,503 (98.2) |
| 80-84 | 249,814 | 244,727 (98.0) |
| >85 | 237,674 | 230,677 (97.1) |
| Unknown | 9 | 5 (55.6) |
| Diagnosis year | | |
| 2009 | 466,879 | 459,564 (98.4) |
| 2010 | 458,222 | 451,186 (98.5) |
| 2011 | 461,623 | 454,724 (98.5) |
| 2012 | 456,053 | 449,496 (98.6) |
| 2013 | 457,223 | 450,333 (98.5) |
| 2014 | 460,335 | 453,408 (98.5) |
| 2015 | 466,069 | 458,547 (98.4) |

AI/AN, American Indian and Alaska Native; API, Asian/Pacific Islander; NH, non-Hispanic.
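As a consistency check on Table 1, the registry rows can be summed and the linkage percentages recomputed. The short Python sketch below uses the registry counts exactly as printed in the table; the variable and function names are illustrative only.

```python
# (submitted, linked) counts copied from the registry rows of Table 1
registries = {
    "California": (1_093_698, 1_072_072),
    "Connecticut": (149_405, 148_344),
    "Georgia": (332_737, 330_892),
    "Iowa": (123_331, 122_072),
    "Idaho": (54_252, 53_694),
    "Kentucky": (186_414, 185_233),
    "Louisiana": (172_361, 171_319),
    "New Mexico": (64_062, 63_245),
    "New York": (792_594, 774_250),
    "Seattle": (184_198, 183_138),
    "Utah": (73_352, 72_999),
}

def linkage_rate(submitted, linked):
    """Percent of submitted patients returned with address information."""
    return round(100 * linked / submitted, 1)

total_submitted = sum(s for s, _ in registries.values())
total_linked = sum(l for _, l in registries.values())
overall = linkage_rate(total_submitted, total_linked)

# Registry with the lowest linkage rate
lowest = min(registries, key=lambda name: registries[name][1] / registries[name][0])
```

The registry rows sum to the totals shown in the first row of the table (3,226,404 submitted and 3,177,258 linked, or 98.5%), and New York has the lowest registry-level rate.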

The percentage of cancer patients who moved to a different state within the most recent 1 year, 2 years, 3 years, 4 years, and 5 years is shown in Table 2 for 11 SEER registry areas. By registry, about 1 percent or less of cancer patients moved to a different state within the most recent year, whereas between 2.5 and 4.7 percent moved within the last 5 years. Cancer patients in New York have the most state-to-state moves and patients in Louisiana have the fewest. Among cancer patients in these registries, females move from state to state a bit more often than males. By race/ethnicity, non-Hispanic API patients move from state to state the most frequently and non-Hispanic White patients the least often. As expected, younger patients move from state to state more often than older patients.

Table 2.

State-Level Move Rates for Cancer Patients by Registry and by Demographic Characteristics for the Most Recent 1-Year to 5-Year Periods with Comparative 1-Year Move Rates for the General Population

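Columns 1 y to 5 y: percent of cancer patients who moved to a different state within the most recent N years (a). Columns Unweighted and Weighted (c): one-year state move rates for the general population (b).

| | 1 y | 2 y | 3 y | 4 y | 5 y | Unweighted | Weighted (c) |
|---|---|---|---|---|---|---|---|
| Registry | | | | | | | |
| California | 0.95 | 1.6 | 2.4 | 3.2 | 3.9 | 1.30 | 0.78 |
| Connecticut | 0.78 | 1.5 | 2.3 | 3.3 | 4.2 | 2.31 | 1.21 |
| Georgia | 0.77 | 1.4 | 2.0 | 2.8 | 3.5 | 2.75 | 1.78 |
| Idaho | 0.77 | 1.5 | 2.3 | 3.2 | 4.2 | 4.33 | 3.14 |
| Iowa | 0.51 | 1.0 | 1.5 | 2.1 | 2.7 | 2.50 | 1.16 |
| Kentucky | 0.57 | 1.0 | 1.6 | 2.1 | 2.8 | 2.46 | 1.28 |
| Louisiana | 0.50 | 0.9 | 1.4 | 2.0 | 2.5 | 1.70 | 1.00 |
| New Mexico | 0.75 | 1.5 | 2.4 | 3.3 | 4.3 | 2.92 | 2.08 |
| New York | 1.06 | 1.9 | 2.7 | 3.7 | 4.7 | 1.34 | 0.62 |
| Seattle | 0.80 | 1.6 | 2.4 | 3.2 | 4.1 | 3.49 | 1.88 |
| Utah | 0.72 | 1.5 | 2.3 | 3.2 | 4.1 | 3.26 | 2.29 |
| Sex | | | | | | | |
| Male | 0.80 | 1.4 | 2.1 | 2.8 | 3.6 | 1.94 | 1.80 |
| Female | 0.93 | 1.7 | 2.5 | 3.4 | 4.2 | 1.81 | 1.95 |
| Race/ethnicity | | | | | | | |
| NH White | 0.71 | 1.3 | 2.0 | 2.7 | 3.4 | 2.18 | 2.22 |
| NH Black (d) | 0.85 | 1.5 | 2.3 | 3.1 | 4.0 | 1.97 | 1.87 |
| NH API (d) | 2.00 | 3.1 | 4.2 | 5.5 | 7.0 | 1.87 | 1.52 |
| NH AI/AN (d) | 0.78 | 1.4 | 2.1 | 2.8 | 3.5 | 1.75 | 1.71 |
| Hispanic | 1.30 | 2.1 | 3.0 | 4.2 | 5.3 | 1.07 | 0.79 |
| Age at diagnosis (y) | | | | | | | |
| 20-24 | 2.45 | 5.3 | 8.3 | 11.7 | 15.2 | 4.24 | 4.26 |
| 25-29 | 2.16 | 4.6 | 7.1 | 9.8 | 12.6 | 3.93 | 3.96 |
| 30-34 | 1.62 | 3.5 | 5.4 | 7.6 | 9.7 | 2.78 | 2.80 |
| 35-39 | 1.28 | 2.6 | 4.0 | 5.6 | 7.2 | 1.98 | 1.99 |
| 40-44 | 1.15 | 2.1 | 3.3 | 4.5 | 5.8 | 1.46 | 1.45 |
| 45-49 | 0.98 | 1.9 | 2.9 | 3.9 | 4.8 | 1.18 | 1.17 |
| 50-54 | 0.91 | 1.7 | 2.6 | 3.5 | 4.4 | 1.07 | 1.07 |
| 55-59 | 0.85 | 1.6 | 2.4 | 3.3 | 4.1 | 0.98 | 0.99 |
| 60-64 | 0.81 | 1.5 | 2.2 | 3.1 | 3.8 | 0.98 | 0.99 |
| 65-69 | 0.74 | 1.3 | 2.0 | 2.7 | 3.4 | 0.92 | 0.94 |
| 70-74 | 0.73 | 1.2 | 1.8 | 2.4 | 3.0 | 0.82 | 0.83 |
| >75 | 0.78 | 1.2 | 1.6 | 2.1 | 2.7 | 0.81 | 0.79 |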
a. Source: state-level residential history of cancer patients included in the SEER-LN linkage, ages 21 and older, diagnosed between 2009 and 2015.

b. Source: Census American Community Survey, moves from a different state within the last year, 5-year results 2015-2019.

c. Census results are weighted by the age-group profiles of the cancer patients in each of the registry areas.

d. Bridged race/ethnicity categories for non-Hispanic (NH) Black, NH API, and NH AI/AN are not available in Census tables. Because of this, move rates for NH Black cancer patients are compared with the single-race Black population of any Hispanic origin, NH API with single-race API of any Hispanic origin, and NH AI/AN with single-race AI/AN of any Hispanic origin.

For comparison, Table 2 includes state-level move rates for the general population. The unweighted state-level move rates of the general population are generally higher than the state-level move rates for the cancer population. The weighted move rates, which estimate the state-level move rates for a subset of the general population with matching age profiles, are also generally higher than those for cancer patients. By registry area, the exceptions are California and New York, where cancer patients have higher state-to-state move rates than the age-weighted general population. By race/ethnicity, the exceptions are non-Hispanic API and Hispanic cancer patients.
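The registry-level exceptions can be read directly off Table 2 by comparing each registry's 1-year move rate for cancer patients with the age-weighted 1-year rate for the general population. The Python sketch below does this with the values as printed in the table; the variable names are illustrative only.

```python
# 1-year rates (percent) from Table 2:
# (cancer patients, general population weighted by the patient age profile)
one_year = {
    "California": (0.95, 0.78),
    "Connecticut": (0.78, 1.21),
    "Georgia": (0.77, 1.78),
    "Idaho": (0.77, 3.14),
    "Iowa": (0.51, 1.16),
    "Kentucky": (0.57, 1.28),
    "Louisiana": (0.50, 1.00),
    "New Mexico": (0.75, 2.08),
    "New York": (1.06, 0.62),
    "Seattle": (0.80, 1.88),
    "Utah": (0.72, 2.29),
}

# Registries where cancer patients move more often than the weighted general population
exceptions = sorted(
    name for name, (cancer, weighted) in one_year.items() if cancer > weighted
)
```

With these values, `exceptions` contains only California and New York, matching the text. Against the unweighted general-population rates (1.30 and 1.34), the cancer patient rates in both registries are lower.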

Discussion

This paper demonstrates the feasibility of obtaining residential histories for almost all adult cancer patients diagnosed in recent years in SEER. In addition, this is the first large-scale assessment of the state-to-state mobility patterns of US cancer patients, covering 30% of the US population, and it provides some initial insights into how often cancer patients move between states for different geographic areas. Knowledge of state-to-state move patterns for cancer patients is important for understanding the need to include out-of-state data in data linkages. For example, requests for supplemental prescription drug data for a given state registry can include data from neighboring states with significant move rates.

There is some geographic variation in the state-to-state move rates, with New York rates being the highest and Louisiana rates being the lowest. There is also some variation by race/ethnicity, with non-Hispanic API rates the highest and non-Hispanic White rates the lowest. Older cancer patients move less frequently than younger patients. This is consistent with previous studies that indicate older adults move less frequently.15 Comparison of the state-to-state move rates of cancer patients with those of the general public shows generally lower rates for cancer patients. Overall, state-to-state move rates were very low, under 5% over 5 years in every registry, indicating that less than 5% of cancer patients will be missed in state-specific data linkages.

This study has limitations. We focused on state-level moves as a first step to identify the need to acquire and link with out-of-state data. We included only 11 registries representing 30% of the US population. A recent study showed that LexisNexis address information near the time of death may not be accurate.16

The results in this paper provide information on the linkage with LexisNexis data and useful information on state-to-state residential mobility patterns of a large portion of US cancer patients for the most recent 1-, 2-, 3-, 4-, and 5-year periods. Mobility patterns vary by geographic area, race/ethnicity, and age. Finally, cancer patients tend to move less than the general population.

Work is currently being done to develop an algorithm to construct detailed residential histories that identify unique addresses for a patient with a single address at any particular time point. Once the complete residential history data is created and validated, it will be a unique and valuable resource for extending our understanding of the residential mobility of cancer patients throughout the cancer control continuum as well as providing research opportunities to investigate the association of exposures on outcomes.

Acknowledgement

We would like to acknowledge members of the SEER/LexisNexis residential address linkage working group: Mary Charlton (Iowa Cancer Registry), Iona Cheng (Greater Bay Area Cancer Registry), Rosemary Cress (Cancer Registry of Greater California), Dennis Deapen (Los Angeles County Cancer Surveillance Program), Will Howe (Information Management Services, Inc.), Tina Lefante (Louisiana Tumor Registry), and Bozena Morawski (Cancer Data Registry of Idaho).

References


Articles from Journal of Registry Management are provided here courtesy of National Cancer Registrars Association
