Determining Fitness for Use of SEER Cause-Specific Cause of Death in Analyses of Cause-Specific Survival

Bożena M Morawski; Mei-Chin Hsieh; Manxia Wu; Recinda Sherman; Angela B Mariotto; Christopher J Johnson

2022 Dec 1;49(4):177–189.

PMCID: PMC10229190 PMID: 37260823

Abstract

Background:

Net and crude cancer survival statistics can be calculated using cause of death or expected survival from life tables. In some instances, using cause of death information may be advantageous. The Surveillance, Epidemiology, and End Results (SEER) Program cause-specific cause of death variable (North American Association of Central Cancer Registries [NAACCR] item #1914) designates that a patient died of their cancer. We evaluated how missingness in NAACCR item #1914 impacted survival estimates to determine fitness for use in NAACCR Cancer in North America (CiNA) products.

Methods:

We used CiNA survival and prevalence data (November 2020 submission) to calculate 60-month cause-specific survival among persons aged 15-99 years at time of diagnosis using NAACCR item #1914. We treated missing/unknown causes of death in 3 ways: excluded from analysis, included as dead from this cancer, or included as censored at time of last follow-up. Autopsy/death-certificate-only cases were excluded from survival analyses. We calculated the proportion of deaths with unknown/missing cause of death by registry and demographic variables.

Results:

Generally, 60-month cause-specific survival estimates differed by <1% between the 3 approaches when NAACCR item #1914 was missing/unknown for <3% of deaths. When applying a <3% fit-for-use standard to SEER cause-specific cause of death, data from 34 registries were included in cause-specific survival analyses. The proportion of deaths with missing/unknown cause of death varied by primary site, age at diagnosis, race/ethnicity, year of diagnosis, and registry.

Conclusion:

We have identified missingness cut points for NAACCR item #1914, which strike a balance between scientific integrity and registry inclusiveness, to designate data in NAACCR CiNA data products as fit for use in cause-specific survival analyses.

Keywords: survival estimates, cause-specific cause of death variable, cause of death

Background

Each year, members of the North American Association of Central Cancer Registries (NAACCR) voluntarily submit data to develop an aggregated resource for cancer surveillance and research. This aggregated resource is used to create multiple data products, including the Cancer in North America (CiNA) monographs,^1-5 and the CiNA research data set. For inclusion in incidence data products, registry data must meet certification criteria for submission timeliness, completeness, and accuracy. The highest level of certification is given to registries that meet the following criteria: case ascertainment of >95% completeness; <3% of cases are only identified via a death certificate; <0.1% of tumors are duplicates per the NAACCR duplicate protocol; all fields used to calculate incidence statistics (cancer type, sex, race, age, and county) are error-free (ie, pass edits); <2% of tumors are missing meaningful information on age, sex, and county; <3% of tumors are missing meaningful information on race (United States only); and the file is submitted to NAACCR within 23 months of the end of the submission diagnosis year.⁶

For inclusion in the CiNA survival and prevalence volumes and data sets, registries must additionally meet Surveillance, Epidemiology, and End Results (SEER) Program standards for follow-up or ascertain all deaths through the study cutoff date (including state mortality file linkage and National Death Index [NDI] linkage for US registries and provincial/territorial mortality file linkage for Canadian registries). In 2021, these criteria—which are applied to the years of data included in survival and prevalence estimates—received their own NAACCR recognition, "Fitness for Use for Survival & Prevalence Recognition." The criteria for this recognition only pertain to overall vital status and not cause of death. Accordingly, current recognition criteria are well aligned with the most commonly used population-based cancer survival statistics that do not require cause of death information: relative survival ratios and the Pohar-Perme estimator.⁷

Net cancer survival can be calculated using a relative or cause-specific survival approach. Relative survival estimates represent the ratio of the observed-to-expected survival among cancer patients in the absence of competing causes of death, where expected survival is determined from matched life tables. Relative survival methods have the advantage of not requiring cause of death information from death certificates, which can be inaccurate.^8,9 Additionally, relative survival has the advantage of representing any excess mortality experienced by cancer patients (eg, late cardiotoxic effects of therapy as opposed to mortality attributable to only the cancer). The accuracy of relative survival measures can, however, be greatly influenced by the appropriateness of life tables selected to represent the expected survival of the study population.^10-12 Life tables should be matched on factors influencing cancer-specific and overall survival, such as age, sex, race/ethnicity, socioeconomic status, geography,¹⁰ and factors relevant to the cancer under study.¹³ In instances where life tables are not well matched to the study population, cause-specific survival may be more informative to researchers, public health professionals, and policymakers than relative survival. Appropriate life tables may be difficult to find for study populations that are not well characterized (eg, calculating survival for screening-related cancers that may be diagnosed among populations that are healthier than the general population, or for cancers where incidence and mortality are related to underlying risk factors such as smoking). Finally, large shifts in population-level mortality due to specific events (eg, decreases in life expectancy at birth due to COVID-19 or the US opioid crisis) may render previously appropriate life tables inappropriate.

As there is increased interest in estimating cause-specific survival at the population level,^8,14,15 it becomes increasingly important to ensure that researchers have access to high-quality, population-level cause of death information. More precisely, it is important that researchers be aware of jurisdiction-specific limitations of population-level data, in particular when they intend to present regional, registry-specific, or other subpopulation estimates. For example, some registries may not be able to release cause of death information to national bodies (ie, NAACCR, SEER, and the National Program of Cancer Registries [NPCR]), resulting in large proportions of jurisdiction-specific missingness. Registry operations, in particular the timing and sources of linkages, also impact the completeness of cause of death information, whereby a particular follow-up or diagnosis year is missing a higher-than-average proportion of cause of death among deceased cancer patients. There are numerous approaches to handling missing data, including cause of death, and some of these ad hoc methods have been shown to bias estimates.^14,16-18 However, because many researchers analyze cancer surveillance data (eg, CiNA, SEER) in SEER*Stat alone, data that are fit for use in cause-specific survival analyses must be identified upstream of data release to researchers. The impact of cause of death missingness on survival estimates as calculated in SEER*Stat has not been evaluated in the NAACCR CiNA data set.

This study used the CiNA survival and prevalence data to assess how the proportion of missingness in the SEER cause-specific cause of death variable (NAACCR item #1914) impacted survival estimates and to establish fit-for-use cut points indicating when data should be excluded from cause-specific survival analyses. The ultimate goal of these efforts is to increase visibility around completeness of cause of death information and improve the completeness of cause of death data, such that no otherwise qualifying tumors will be censored or excluded from analyses due to missing cause of death information.

Methods

We evaluated the impact of excluding registries with >0.5%, >2%, and >3% missing/unknown SEER cause-specific cause of death on (1) the number of registries that would be included in cause-specific survival analyses and (2) the survival estimates themselves. We used NAACCR CiNA survival and prevalence data for the United States and Canada (December 2020 submission)¹⁹ to calculate 60-month cause-specific survival among persons aged 15-99 years at time of diagnosis with a malignant tumor during 2011-2017. Following the methods in the NAACCR CiNA monograph, estimates were age-standardized using the International Cancer Survival Standards, which include patients diagnosed at ages 15-99 years.⁴ Follow-up time was calculated using a blended approach.⁴ For registries meeting SEER follow-up standards, follow-up time was calculated through the earliest of date of death, date of last contact, or end of the study period (December 31, 2017). For all other registries, the presumed alive method was used, meaning follow-up time for patients not known to be deceased was calculated through the end of the study period. For registries conducting active follow-up (ie, ascertaining vital status and date of last contact via linkages with administrative or hospital databases), alive cases with no follow-up time were excluded (about 0.17%). Tumors reported only via death certificate or autopsy were excluded, but proportions of these cases were evaluated by primary site, patient demographics, and registry.

Cause of death was determined by the SEER cause-specific cause of death field (NAACCR item #1914),²⁰ which is described as follows: To capture deaths related to the specific cancer but not coded as such, the SEER cause-specific death classification variables are defined by taking into account causes of deaths in conjunction with tumor sequence (ie, only 1 tumor or the first of subsequent tumors), site of the original cancer diagnosis, and comorbidities (eg, AIDS and/or site-related diseases).^13,15 Other survival analysis parameters matched those used in the CiNA survival volume (https://www.naaccr.org/wp-content/uploads/2022/06/CiNA.2015-2019.v4.survival.pdf).⁴
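
The blended follow-up rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SEER*Stat implementation: the function and argument names are hypothetical, and the conversion from days to months (365.24/12 days per month) is an assumption made here for the example.

```python
from datetime import date
from typing import Optional

STUDY_END = date(2017, 12, 31)   # end of the study period stated in the Methods
DAYS_PER_MONTH = 365.24 / 12     # assumed conversion, for illustration only


def follow_up_end(death: Optional[date], last_contact: Optional[date],
                  meets_seer_followup: bool,
                  study_end: date = STUDY_END) -> date:
    """Return the date at which follow-up stops for one tumor record."""
    if meets_seer_followup:
        # Earliest of date of death, date of last contact, or study end.
        known = [d for d in (death, last_contact, study_end) if d is not None]
        return min(known)
    # Presumed alive method: patients not known to be deceased are
    # followed through the end of the study period.
    if death is not None:
        return min(death, study_end)
    return study_end


def follow_up_months(diagnosis: date, death: Optional[date],
                     last_contact: Optional[date], meets_seer_followup: bool,
                     study_end: date = STUDY_END) -> float:
    """Follow-up time in months from diagnosis to the end of follow-up."""
    end = follow_up_end(death, last_contact, meets_seer_followup, study_end)
    return max((end - diagnosis).days, 0) / DAYS_PER_MONTH
```

For a living patient last contacted on June 30, 2015, the first branch stops follow-up at that date, whereas the presumed alive branch carries the same patient through December 31, 2017.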

Survival calculations were performed on blended survival time in SEER*Stat version 8.4.0.1 using the actuarial method.²¹ Three sets of survival estimates were calculated by classifying tumors with missing/unknown cause of death in the following ways: (1) excluding these tumors from analyses; (2) including these tumors in analyses with a cause of death of the cancer under study; (3) including these tumors in analyses and censoring them at time of death, with the assumption that the cause of death is not cancer. All eligible tumors were included in analyses rather than restricting to first primary.¹⁵ We calculated the proportion of deaths with unknown or missing cause of death codes by registry, primary site, patient demographics (age, race/ethnicity, rural vs urban residence at time of diagnosis), and other covariates. We then compared absolute differences in 60-month cause-specific survival estimates as calculated by each method described in (1)-(3) above, with particular attention to instances where missingness in cause of death yielded >1% absolute difference in survival estimates.
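
To make the three treatments of missing/unknown cause of death concrete, the following Python sketch recodes a toy cohort under each approach and applies a simple actuarial (life-table) estimator. It is an illustration under stated assumptions rather than a reproduction of SEER*Stat: the status labels, function names, toy data, and 12-month interval width are hypothetical, and no age standardization is applied.

```python
def recode(records, method):
    """Turn (months, status) records into (months, is_event) pairs.

    status is one of "alive", "cancer_death", "other_death", "unknown_death"
    (hypothetical labels). method is "exclude", "cancer", or "censor" and
    controls how deaths with missing/unknown cause are handled.
    """
    recoded = []
    for months, status in records:
        if status == "unknown_death":
            if method == "exclude":
                continue                          # (1) drop the tumor
            recoded.append((months, method == "cancer"))  # (2) event, (3) censored
        else:
            # Deaths from other causes are censored in cause-specific survival.
            recoded.append((months, status == "cancer_death"))
    return recoded


def actuarial_survival(data, horizon=60, width=12):
    """Life-table survival at `horizon` months with half-interval withdrawals."""
    survival = 1.0
    at_risk = len(data)
    for start in range(0, horizon, width):
        in_interval = [(t, e) for t, e in data if start <= t < start + width]
        deaths = sum(1 for _, e in in_interval if e)
        withdrawn = len(in_interval) - deaths
        effective = at_risk - withdrawn / 2       # actuarial adjustment
        if effective > 0:
            survival *= 1 - deaths / effective
        at_risk -= len(in_interval)
    return survival


# Toy cohort: 10 tumors, 2 with unknown cause of death.
cohort = [(60, "alive")] * 4 + [
    (6, "cancer_death"), (18, "cancer_death"), (30, "other_death"),
    (8, "unknown_death"), (40, "unknown_death"), (20, "alive"),
]
estimates = {m: actuarial_survival(recode(cohort, m))
             for m in ("censor", "exclude", "cancer")}
```

In this toy cohort the censored estimate is the highest and the dead-from-cancer estimate the lowest, which is the same ordering seen in the registry-level results.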

Among registries with <3% missing/unknown SEER cause-specific cause of death, we quantified differences in the proportion missing/unknown SEER cause-specific cause of death by registry and age at diagnosis (15-64 years vs ≥65 years), race/ethnicity (race and origin recode: non-Hispanic white, non-Hispanic Black, non-Hispanic American Indian/Alaska Native, non-Hispanic Asian or Pacific Islander, Hispanic), county-level urban/rural residence at diagnosis (2013 Beale codes), type of reporting source, follow-up source central, and primary site (SEER site recode ICD-O-3/WHO 2008) to identify potential factors driving missing/unknown cause of death. Follow-up source central (NAACCR item #1791) indicates the source of consolidated vital status, date of last contact, and cause of death information, as applicable. Finally, we examined patterns of missing/unknown cause of death by year of diagnosis within registry to describe how the proportion of deaths with missing/unknown cause may vary by changes in registry practice or policy over time.
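
The stratified missingness described above amounts to a grouped proportion. A minimal Python sketch follows, assuming each death is a dictionary with a hypothetical `cod_status` field and using deaths as the denominator, as stated in the Methods; the field names and toy records are illustrative only.

```python
from collections import defaultdict


def percent_missing_cod(deaths, group_key):
    """Percent of deaths with missing/unknown cause of death, per group."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in deaths:
        group = record[group_key]
        totals[group] += 1
        if record["cod_status"] == "unknown":   # hypothetical coding
            missing[group] += 1
    return {group: 100.0 * missing[group] / totals[group] for group in totals}


# Toy data: the same records can be grouped by any covariate.
toy_deaths = [
    {"registry": "A", "age_group": "15-64", "cod_status": "unknown"},
    {"registry": "A", "age_group": "65+", "cod_status": "known"},
    {"registry": "B", "age_group": "15-64", "cod_status": "known"},
    {"registry": "B", "age_group": "65+", "cod_status": "known"},
]
by_registry = percent_missing_cod(toy_deaths, "registry")
by_age = percent_missing_cod(toy_deaths, "age_group")
```

Passing a different `group_key` (for example, race/ethnicity or primary site) gives the other stratifications without changing the function.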

Results

We evaluated data from 58 central cancer registries (50 US states, the District of Columbia, 7 provincial Canadian registries) receiving the CiNA "Fitness for Use for Survival & Prevalence Recognition." The percent missing/unknown SEER cause-specific cause of death among 11,757,022 tumors diagnosed during 2011-2017 with follow-up through the end of 2017 ranged by registry, from 100% in 6 US or Canadian registries to 0.02% for 1 US registry; the median proportion of missing/unknown SEER cause-specific cause of death across registries was 1.52% (Table 1). We saw evidence that missingness in cause of death information was impacted by year-to-year variation in registry operations. Within registry, missingness varied across diagnosis years (data not shown). Among registries where cause of death was missing for <100% and >3% of tumors for all study years combined, the number of years with <3% missing/unknown SEER cause-specific cause of death information ranged from 0 to 6 of the 7 diagnosis years. There was a high degree of correlation between which registries had >3% missing/unknown SEER cause-specific cause of death overall and those registries with >10% missing/unknown SEER cause-specific cause of death for a specific year, indicating that >10% missing/unknown SEER cause-specific cause of death for a specific year was an additional informative marker of biased survival estimates (data not shown).
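
The two screening markers discussed in this paragraph, overall missingness above 3% and any single year above 10%, can be checked side by side. The sketch below is a hypothetical illustration: the function name, input layout, and example counts are invented, and the two flags are reported separately rather than combined into a formal rule.

```python
def screen_registry(yearly, overall_cut=3.0, year_cut=10.0):
    """Summarize cause of death missingness for one registry.

    `yearly` maps diagnosis year -> (deaths_missing_cod, total_deaths).
    """
    total_missing = sum(missing for missing, _ in yearly.values())
    total_deaths = sum(deaths for _, deaths in yearly.values())
    if total_deaths == 0:
        raise ValueError("registry has no deaths to evaluate")
    overall_pct = 100.0 * total_missing / total_deaths
    year_pct = {year: 100.0 * missing / deaths
                for year, (missing, deaths) in yearly.items() if deaths > 0}
    return {
        "overall_pct": overall_pct,
        "meets_overall_cut": overall_pct < overall_cut,
        "years_over_year_cut": sorted(
            year for year, pct in year_pct.items() if pct > year_cut),
    }


# Invented registry: low missingness overall, but one poor linkage year.
example = {year: (5, 1000) for year in range(2011, 2017)}
example[2017] = (150, 1000)
summary = screen_registry(example)
```

Here the registry passes the overall cut point (about 2.6% missing across all years) while 2017 alone is 15% missing, which is the kind of within-registry pattern an all-years summary can hide.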

Table 1.

Ranking of Percent Missing Cause of Death and Impact on Cause-Specific 5-Year Survival Estimates, 2011-2017 Diagnosis Years (Site Recode ICD-O-3/WHO 2008, All Sites Combined)

NAACCR registry number	% DCO/autopsy	% Missing COD	n	5-year cause-specific survival: censored	5-year cause-specific survival: excluded	5-year cause-specific survival: cancer death	Absolute difference: censored vs excluded	Absolute difference: censored vs dead	Absolute difference: excluded vs dead
Excluded per >3% missing/unknown criterion (registry n = 16)
60	0.74	-	164,106	100.0	100.0	100.0	0.0	0.0	0.0
62	1.22	-	5,988	100.0	100.0	100.0	0.0	0.0	0.0
11	1.82	100.0	436,323	100.0	100.0	58.2	0.0	41.8	41.8
55	0.34	100.0	31,972	100.0	100.0	100.0	0.0	0.0	0.0
59	0.71	100.0	23,203	100.0	100.0	100.0	0.0	0.0	0.0
58	0.81	100.0	36,351	100.0	100.0	100.0	0.0	0.0	0.0
54	0.28	86.03	43,200	93.1	90.8	54.7	2.3	38.4	36.0
39	0.72	81.48	17,594	91.5	89.6	59.7	1.9	31.8	29.9
40	2.29	48.75	39,187	79.6	78.5	65.2	1.2	14.5	13.3
50	3.36	9.57	195,327	71.0	70.8	68.6	0.3	2.5	2.2
44	0.18	8.00	98,293	70.3	70.1	68.8	0.2	1.6	1.4
19	1.41	6.85	351,005	66.8	66.8	66.6	0.0	0.2	0.2
35	2.92	5.25	136,701	68.8	68.7	67.6	0.1	1.2	1.1
37	1.98	4.73	24,074	71.9	71.9	71.0	0.1	0.9	0.9
23	1.87	4.49	170,860	64.4	64.2	63.1	0.2	1.3	1.1
38	1.60	4.31	328,945	71.0	70.7	69.6	0.2	1.4	1.2
Excluded per >2% missing/unknown criterion (registry n = 22)
16	0.83	2.56	29,752	66.8	66.8	66.7	0.0	0.0	0.0
271	1.02	2.50	211,430	70.8	70.7	70.1	0.1	0.6	0.5
4	3.05	2.42	197,943	65.7	65.5	64.9	0.1	0.7	0.6
272	1.01	2.30	257,052	67.1	67.0	66.4	0.1	0.7	0.6
3	0.92	2.18	134,915	70.2	70.1	69.5	0.1	0.6	0.5
7	1.03	2.16	24,374	69.9	69.9	69.4	0.1	0.6	0.5
27	1.19	2.11	1,077,506	68.2	68.1	67.6	0.1	0.6	0.5
41	2.30	1.94	147,373	70.8	70.7	70.4	0.1	0.4	0.4
273	1.32	1.90	608,521	67.6	67.6	67.1	0.1	0.6	0.5
21	2.58	1.77	702,655	67.2	67.1	66.7	0.1	0.5	0.4
12	0.95	1.72	173,361	62.7	62.7	62.4	0.1	0.3	0.2
13	1.53	1.71	18,573	67.3	67.2	66.8	0.1	0.5	0.4
1	2.75	1.57	105,279	64.1	64.0	63.6	0.1	0.5	0.4
42	2.67	1.53	59,198	64.5	64.4	64.0	0.1	0.5	0.4
56	0.19	1.51	117,475	67.8	67.7	67.3	0.1	0.5	0.4
20	1.75	1.49	208,898	68.3	68.3	68.0	0.0	0.3	0.3
15	1.43	1.49	63,098	67.7	67.6	67.3	0.1	0.4	0.3
5	1.32	1.44	37,221	67.3	67.2	67.0	0.1	0.3	0.3
26	1.07	1.43	720,505	69.7	69.7	69.3	0.1	0.4	0.3
33	1.40	1.37	46,069	67.4	67.3	67.0	0.1	0.4	0.3
311	0.57	1.09	165,574	70.4	70.3	70.1	0.0	0.3	0.3
18	1.19	1.05	161,946	64.8	64.7	64.4	0.0	0.3	0.3
Excluded per >0.5% missing/unknown criterion (registry n = 9)
9	2.43	0.95	506,900	67.3	67.3	67.1	0.0	0.2	0.2
17	3.37	0.93	125,652	62.9	62.9	62.7	0.0	0.2	0.2
34	2.18	0.86	55,787	66.2	66.2	66.0	0.0	0.2	0.2
6	1.41	0.86	315,450	66.9	66.8	66.6	0.0	0.2	0.2
461	1.51	0.85	146,412	67.1	67.1	66.9	0.0	0.3	0.2
49	3.64	0.83	414,100	65.9	65.9	65.7	0.0	0.2	0.2
36	1.96	0.74	186,434	70.4	70.4	70.2	0.0	0.2	0.2
29	2.51	0.62	232,863	64.5	64.5	64.3	0.0	0.2	0.1
25	0.57	0.62	66,474	73.1	73.0	72.9	0.0	0.2	0.1
Never excluded (registry n = 11)
24	1.72	0.50	52,815	70.5	70.5	70.4	0.0	0.1	0.1
31	1.61	0.50	231,496	69.7	69.7	69.6	0.0	0.1	0.1
47	2.44	0.45	212,591	64.0	64.0	63.9	0.0	0.1	0.1
8	2.09	0.42	51,483	67.3	67.3	67.2	0.0	0.1	0.1
28	1.97	0.41	38,100	66.5	66.4	66.3	0.0	0.1	0.1
10	0.99	0.33	104,837	62.4	62.3	62.3	0.0	0.1	0.1
2	1.67	0.28	114,955	66.2	66.2	66.2	0.0	0.1	0.1
32	0.89	0.23	77,156	62.2	62.2	62.1	0.0	0.1	0.0
53	1.53	0.22	496,534	67.5	67.5	67.5	0.0	0.1	0.0
45	2.39	0.21	172,339	65.1	65.1	65.0	0.0	0.0	0.0
30	1.61	0.02	783,858	67.5	67.5	67.5	0.0	0.0	0.0

COD, cause of death; DCO, death certificate only; ICD-O-3, International Classification of Diseases for Oncology, Third Edition; NAACCR, North American Association of Central Cancer Registries; WHO, World Health Organization. NAACCR member registries listed above include state, metropolitan, provincial, and territorial registries.

As the percentage of tumors with missing/unknown SEER cause-specific cause of death information increased, absolute differences in all sites combined survival estimates using the 3 methods also increased (Table 1, Figure 1). For 10 registries with <100% and >3% missing/unknown SEER cause-specific cause of death, the median absolute difference in cause-specific survival estimates was 0.2% (interquartile range [IQR], 0.1-1.0) between methods that censored patients with unknown cause of death at date of last follow-up vs excluded them, 1.5% (IQR, 1.2-11.5) between methods that censored patients with unknown cause of death at date of last follow-up vs included these patients as dead from their cancer, and 1.3% (IQR, 1.1-10.5) between methods that excluded patients from analyses vs classifying these patients as dead from their cancer. An inclusion cut point of <3% missing/unknown SEER cause-specific cause of death strikes a balance between scientific integrity in survival estimates and registry inclusiveness; ie, a minimal number of registries are excluded from 5-year cause-specific survival calculations and differences in estimates from included registries demonstrated <1% differences in 5-year cause-specific survival by method as calculated in SEER*Stat. The selection of other cutoff points (eg, missingness of <2% or <0.5%) yielded smaller median differences in survival estimates (Table 1). However, these cut points exclude 22 and 9 additional registries, respectively, from cause-specific survival analyses, with minimal corresponding benefit in reducing bias in cause-specific survival estimates. Thus, registries meeting the <3% missing/unknown cut point were deemed fit for use.
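
The medians and interquartile ranges for the 10 registries with >3% and <100% missing/unknown cause of death can be recomputed from the rounded differences printed in Table 1. The sketch below assumes quartiles are obtained by linear interpolation (the "inclusive" method of Python's `statistics.quantiles`); the quantile convention used in the original analysis is not stated, so this choice is an assumption.

```python
from statistics import quantiles

# Absolute differences (percentage points) copied from Table 1 for
# registries 54, 39, 40, 50, 44, 19, 35, 37, 23, and 38.
censored_vs_excluded = [2.3, 1.9, 1.2, 0.3, 0.2, 0.0, 0.1, 0.1, 0.2, 0.2]
censored_vs_dead = [38.4, 31.8, 14.5, 2.5, 1.6, 0.2, 1.2, 0.9, 1.3, 1.4]
excluded_vs_dead = [36.0, 29.9, 13.3, 2.2, 1.4, 0.2, 1.1, 0.9, 1.1, 1.2]


def median_iqr(values):
    """Return (median, Q1, Q3), each rounded to 1 decimal place."""
    # Assumed convention: linear interpolation between order statistics.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return round(q2, 1), round(q1, 1), round(q3, 1)


summary = {
    "censored vs excluded": median_iqr(censored_vs_excluded),
    "censored vs dead": median_iqr(censored_vs_dead),
    "excluded vs dead": median_iqr(excluded_vs_dead),
}
```

Under this assumed convention, the Table 1 values give 0.2 (IQR, 0.1-1.0) for censored vs excluded, 1.5 (IQR, 1.2-11.5) for censored vs dead from cancer, and 1.3 (IQR, 1.1-10.5) for excluded vs dead from cancer.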

Figure 1.

Scatterplot of Percent Missingness in Cause of Death vs Absolute Difference in Cause-Specific Survival Estimates (Censored vs Dead from This Cancer)

Subsequent analyses evaluated patterns of missingness using data from 34 US registries deemed fit for use (<3% missing/unknown SEER cause-specific cause of death). Substate registries covered by their entire state were excluded from these analyses (ie, Greater California, Greater Bay, Los Angeles, and Seattle) and substate registries not covered by their entire state were included (ie, Detroit). An examination of the percent of tumors missing cause of death information by primary site (Table 2) demonstrated that cancers of the blood (ie, leukemia, Hodgkin lymphoma, and myeloma) had the highest mean proportions of missing cause of death (3.7%, 2.7%, and 2.1%, respectively). Tumors of the larynx (9.1%), liver and intrahepatic bile duct (5.4%), and stomach (5.1%) also had particularly high maximum values of missingness, indicating that identifying cause of death for these cancers might be more difficult in some registry jurisdictions. Cause-specific survival estimates for specific primary sites were impacted less when not stratifying by registry, with the largest absolute differences between methods being 0.2 (censored vs excluded) for liver and intrahepatic bile duct and for stomach, 0.6 (censored vs dead from cancer) for stomach, and 0.4 (excluded vs dead from cancer) for stomach and cervix uteri.

Table 2.

Percent of Survival Records Missing Cause of Death (Maximum and Mean) and Survival Statistics for 34 US Registries with < 3% Missing/Unknown Cause-Specific Cause of Death by Primary Site for Diagnosis Years 2011-2017

Primary site at diagnosis	Max	Mean	Survival, n	Cause spec censored	Cause spec exclude	Cause spec dead	Diff censored exclude	Diff censored dead	Diff exclude dead
All sites	2.4	1.1	7,048,111	67.0	66.9	66.7	0.0	0.3	0.3
Oral cavity and pharynx	2.8	0.8	199,700	67.9	67.9	67.6	0.0	0.3	0.3
Esophagus	4.5	1.7	77,217	23.3	23.2	23.1	0.1	0.2	0.2
Stomach	5.1	1.5	106,817	35.6	35.4	35.1	0.2	0.6	0.4
Colon and rectum	2.8	1.2	632,661	64.7	64.7	64.4	0.1	0.3	0.3
Liver and intrahepatic bile duct	5.4	1.5	145,021	23.4	23.2	22.9	0.2	0.5	0.3
Pancreas	1.8	0.6	209,139	12.1	12.0	11.9	0.1	0.2	0.1
Larynx	9.1	1.7	57,895	65.7	65.7	65.4	0.1	0.3	0.3
Lung and bronchus	3.3	1.2	959,490	26.2	26.1	25.9	0.1	0.3	0.2
Melanoma of the skin	4.6	1.2	351,506	89.6	89.6	89.5	0.0	0.1	0.1
Breast	3.5	1.2	1,078,023	88.6	88.5	88.4	0.0	0.2	0.2
Cervix uteri	2.9	1.2	60,254	68.6	68.5	68.1	0.1	0.4	0.4
Corpus and uterus, NOS	1.9	0.7	240,459	81.0	81.0	80.8	0.0	0.2	0.2
Ovary	4.6	1.7	96,486	49.4	49.4	49.1	0.1	0.3	0.2
Prostate	2.9	0.5	895,181	92.7	92.7	92.5	0.0	0.2	0.2
Testis	3.8	1.2	40,157	95.3	95.3	95.3	0.0	0.1	0.1
Urinary bladder	2.8	1.2	329,458	77.2	77.2	76.9	0.0	0.3	0.2
Kidney and renal pelvis	2.8	1.2	275,887	77.5	77.5	77.2	0.0	0.3	0.2
Brain and other nervous system	4.3	0.9	92,316	27.7	27.6	27.3	0.1	0.3	0.2
Thyroid	1.8	0.7	213,979	97.1	97.1	97.0	0.0	0.1	0.1
Hodgkin lymphoma	5.6	2.7	38,400	87.2	87.2	87.0	0.0	0.2	0.2
Non-Hodgkin lymphoma	3.0	1.0	315,332	72.5	72.5	72.2	0.0	0.3	0.3
Myeloma	12.5	2.1	114,643	59.7	59.6	59.3	0.1	0.3	0.3
Leukemia	12.8	3.7	219,241	61.1	61.0	60.7	0.1	0.3	0.3
Mesothelioma	3.4	1.4	14,752	12.1	12.0	11.9	0.1	0.2	0.1

Max, maximum; NOS, not otherwise specified; spec, specific.

Given the differences in cause of death completeness by primary site and registry, we further analyzed tumor missingness by patient demographic characteristics and registry among the 34 registries meeting the <3% overall cause of death missingness criterion (Table 3). The mean and median values of cause of death missingness by age category (age 15-64 years at diagnosis vs age ≥65 years at diagnosis) were similar (median, 1.04% in 15-64 years vs 0.85% in ≥65 years; mean, 1.22% in 15-64 years vs 1.00% in ≥65 years). Proportion of missingness was also similar across diagnosis year and urbanicity of county of residence at time of diagnosis. Differences in missingness by race/ethnicity were larger, with a median proportion of missingness of 3.46% among Hispanic patients (all races) and 3.79% in non-Hispanic Asian or Pacific Islander patients vs 0.90% in non-Hispanic American Indian/Alaska Native patients, 1.25% in non-Hispanic Black patients, and 0.79% in non-Hispanic White patients. We noted that the proportion missing/unknown also differentially impacted survival estimates by method within registry by race/ethnicity: the largest median differences between estimates that treated unknown causes as dead from their cancer vs censored (up to 1.1%) were among non-Hispanic Asian or Pacific Islander patients and Hispanic patients (all races) (Table 4).

Table 3.

Maximum, Mean, and Median Percent Missing COD by Demographic Characteristics, All Sites Combined, for 34 US Registries with < 3% Missing SEER Cause-Specific Cause of Death (Site Recode ICD-O-3/WHO 2008, All Sites) for Diagnosis Years 2011-2017

Demographic category	Diagnosis year	Maximum (%)	Mean (%)	Median (%)
Age group (y)
15-64	2011-2017	2.97	1.22	1.04
≥65	2011-2017	2.28	1.00	0.85
Race and ethnicity
Hispanic (all races)	2011-2017	5.22	3.13	3.46
Non-Hispanic American Indian/Alaska Native	2011-2017	6.67	1.28	0.90
Non-Hispanic Asian or Pacific Islander	2011-2017	23.08	4.55	3.79
Non-Hispanic Black	2011-2017	15.38	1.90	1.25
Non-Hispanic White	2011-2017	2.06	0.85	0.79
Urban/rural (2013)
Metropolitan counties	2011-2017	2.93	1.10	0.87
Counties in metropolitan areas > 1 million population	2011-2017	4.92	1.23	0.95
Counties in metropolitan areas of 250,000 to 1 million population	2011-2017	2.26	1.02	0.90
Counties in metropolitan areas of < 250,000 population	2011-2017	2.93	1.09	0.94
Nonmetropolitan counties	2011-2017	2.62	1.02	0.88
Urban population of > 20,000 adjacent to a metropolitan area	2011-2017	2.71	0.87	0.62
Urban population of > 20,000 not adjacent to a metropolitan area	2011-2017	6.39	1.24	0.74
Urban population of 2,500 to 19,999, adjacent to a metropolitan area	2011-2017	2.98	1.00	0.91
Urban population of 2,500 to 19,999, not adjacent to a metropolitan area	2011-2017	3.15	1.06	0.89
Comp rural < 2,500 urban population, adjacent to a metropolitan area	2011-2017	6.25	1.17	0.81
Comp rural < 2,500 urban population, not adjacent to metropolitan area	2011-2017	3.31	1.01	0.90
Missing or unknown state/county includes XX, YY, ZZ or 999	2011-2017	33.33	7.50	3.35
By year
	2011	3.23	1.16	1.06
	2012	3.07	1.07	0.99
	2013	3.37	1.03	0.94
	2014	2.39	1.00	0.86
	2015	4.36	1.11	0.85
	2016	2.85	1.00	0.79
	2017	3.01	1.18	0.98

COD, cause of death; ICD-O-3, International Classification of Diseases for Oncology, Third Edition; SEER, Surveillance, Epidemiology, and End Results Program; WHO, World Health Organization.

Table 4.

Differences in Survival Estimates by Method and Race/Ethnicity Among Registries with < 3% Missing/Unknown Cause-Specific Cause of Death, Diagnosis Years 2011–2017

NAACCR registry number	Absolute percentage difference in estimates
	Excluded vs dead from cancer						Dead from cancer vs censored						Excluded vs censored
	NHW	NHB	NAIAN	NAPI	H	NU	NHW	NHB	NAIAN	NAPI	H	NU	NHW	NHB	NAIAN	NAPI	H	NU
Total	0.29	0.34	0.33	0.89	0.86	0.59	0.34	0.41	0.39	1.05	1.00	0.61	0.05	0.07	0.06	0.16	0.14	0.02
13	0.29	1.09	0.19	2.84	1.17	0.00	0.36	1.24	0.25	3.50	1.36	0.00	0.07	0.15	0.06	0.65	0.18	0.00
4	0.48	0.67	0.81	0.98	1.35	0.29	0.58	0.80	0.99	1.20	1.60	0.33	0.10	0.14	0.18	0.21	0.24	0.03
1	0.33	0.68	0.60	0.96	0.89	0.72	0.40	0.80	0.74	1.14	1.01	0.73	0.07	0.13	0.14	0.17	0.11	0.01
27	0.26	0.33	0.31	0.95	1.08	0.59	0.31	0.41	0.35	1.13	1.26	0.59	0.05	0.07	0.05	0.18	0.18	0.01
41	0.33	0.65	0.32	1.02	0.57	0.26	0.39	0.77	0.37	1.17	0.67	0.27	0.06	0.12	0.05	0.15	0.09	0.00
3	0.44	0.68	0.00	1.23	1.31	1.00	0.52	0.79	0.00	1.38	1.52	1.04	0.08	0.12	0.00	0.15	0.20	0.03
5	0.22	0.34	0.00	0.34	0.77	2.25	0.27	0.40	0.00	0.39	0.86	2.45	0.05	0.07	0.00	0.05	0.09	0.19
30	0.01	0.01	0.00	0.00	0.00	0.00	0.01	0.01	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
6	0.12	0.29	0.00	1.37	0.89	0.41	0.14	0.35	0.00	1.58	0.99	0.42	0.02	0.05	0.00	0.21	0.10	0.01
33	0.23	0.34	0.00	0.41	0.19	0.70	0.28	0.38	0.00	0.51	0.23	0.72	0.05	0.05	0.00	0.10	0.04	0.02
8	0.05	0.64	0.00	0.51	0.85	0.00	0.06	0.74	0.00	0.61	1.00	0.00	0.01	0.10	0.00	0.10	0.15	0.00
2	0.05	0.23	0.00	1.28	0.73	0.00	0.06	0.27	0.00	1.50	0.82	0.00	0.01	0.04	0.00	0.22	0.10	0.00
12	0.23	0.35	1.06	0.95	0.93	0.30	0.29	0.44	1.26	1.11	1.08	0.30	0.06	0.08	0.20	0.16	0.14	0.00
18	0.21	0.32	0.43	1.25	1.24	1.60	0.24	0.38	0.58	1.47	1.45	1.69	0.04	0.07	0.15	0.22	0.22	0.09
34	0.19	0.64	0.26	0.35	0.34	0.00	0.21	0.70	0.28	0.35	0.35	0.00	0.03	0.07	0.02	0.00	0.01	0.00
461	0.15	0.34	0.15	0.63	0.74	1.06	0.17	0.41	0.16	0.77	0.86	1.07	0.03	0.07	0.02	0.14	0.12	0.01
36	0.12	0.74	0.18	1.37	1.32	1.08	0.14	0.87	0.21	1.60	1.51	1.11	0.02	0.13	0.03	0.23	0.20	0.02
10	0.04	0.11	0.00	1.08	0.93	0.00	0.05	0.13	0.00	1.29	1.06	0.00	0.01	0.03	0.00	0.21	0.13	0.00
47	0.07	0.16	0.34	1.02	0.33	0.70	0.08	0.19	0.42	1.17	0.39	0.79	0.01	0.03	0.08	0.15	0.06	0.09
28	0.10	0.00	0.06	1.21	0.52	0.00	0.13	0.00	0.08	1.50	0.64	0.00	0.03	0.00	0.02	0.29	0.12	0.00
15	0.29	0.42	0.33	2.00	1.18	0.74	0.35	0.51	0.36	2.35	1.39	0.75	0.06	0.09	0.03	0.35	0.21	0.01
24	0.09	0.35	0.00	1.74	0.30	0.31	0.10	0.40	0.00	2.00	0.33	0.34	0.01	0.05	0.00	0.26	0.03	0.03
42	0.29	0.31	0.58	0.71	0.55	0.60	0.35	0.36	0.68	0.82	0.66	0.68	0.06	0.05	0.10	0.11	0.11	0.08
26	0.20	0.41	0.28	0.81	0.95	1.28	0.24	0.49	0.32	0.94	1.11	1.33	0.03	0.08	0.04	0.13	0.16	0.06
7	0.46	0.75	0.70	4.90	0.00	0.84	0.54	0.75	0.80	5.14	0.00	0.94	0.07	0.00	0.11	0.24	0.00	0.10
49	0.14	0.18	0.50	0.95	0.93	0.44	0.16	0.22	0.53	1.09	1.09	0.46	0.02	0.03	0.04	0.14	0.17	0.01
17	0.13	0.18	0.11	0.79	1.09	0.53	0.15	0.21	0.13	0.95	1.24	0.54	0.02	0.03	0.02	0.16	0.16	0.01
9	0.18	0.28	0.22	0.65	0.70	0.35	0.21	0.33	0.25	0.75	0.80	0.36	0.03	0.05	0.04	0.10	0.09	0.01
45	0.03	0.03	0.00	0.14	0.32	0.09	0.04	0.04	0.00	0.15	0.34	0.09	0.01	0.01	0.00	0.01	0.02	0.00
16	0.03	0.00	0.04	0.00	0.73	0.00	0.04	0.00	0.05	0.00	0.83	0.00	0.01	0.00	0.01	0.00	0.10	0.00
29	0.11	0.19	0.99	0.63	1.07	0.69	0.13	0.22	1.12	0.75	1.20	0.76	0.02	0.03	0.14	0.12	0.13	0.07
21	0.23	0.37	0.49	1.17	0.83	1.98	0.27	0.43	0.57	1.34	0.96	2.03	0.04	0.06	0.08	0.17	0.13	0.05
25	0.08	0.59	0.00	0.74	0.91	0.00	0.09	0.66	0.00	0.87	1.05	0.00	0.01	0.07	0.00	0.13	0.15	0.00
31	0.09	0.08	0.11	0.13	0.22	0.37	0.10	0.08	0.11	0.13	0.23	0.37	0.01	0.00	0.00	0.01	0.02	0.00
32	0.04	0.09	0.00	0.90	0.31	0.35	0.05	0.10	0.00	1.08	0.34	0.38	0.01	0.02	0.00	0.18	0.04	0.03
20	0.29	0.19	0.45	0.33	0.31	1.02	0.32	0.22	0.49	0.37	0.34	1.10	0.03	0.03	0.04	0.04	0.03	0.09
Median diff	0.16	0.33	0.19	0.95	0.80	0.43	0.19	0.39	0.23	1.10	0.91	0.44	0.03	0.05	0.02	0.15	0.12	0.01

H, Hispanic (all races); diff, difference; NAIAN, non-Hispanic American Indian/Alaska Native; NAPI, non-Hispanic Asian or Pacific Islander; NHB, non-Hispanic Black; NHW, non-Hispanic white; NU, non-Hispanic unknown race.

Discussion

This is the first time that the impact of registry-specific cause of death missingness on survival estimates has been evaluated using data for the United States and Canada. This study used CiNA survival data to establish fit-for-use criteria that indicate when data from specific registries should be excluded from CiNA data products that intend to present cause-specific survival. The results of these analyses, which were based on the November 2020 data submission, support a recommendation that registries should be deemed fit for use for cause-specific survival analyses when <3% of tumors have missing/unknown SEER cause-specific cause of death.

Additionally, we noted how patterns in cause of death ascertainment mirror issues in ascertainment of vital status overall,^22,23 illustrating how important it is for researchers to investigate patterns of missingness in SEER cause-specific cause of death for their specific study questions. Differential cause of death missingness was noted in specific racial/ethnic populations and for specific primary sites. For example, a higher proportion of SEER cause-specific cause of death was missing for persons diagnosed with gastric cancers (rates of which are higher among Hispanic and non-Hispanic non-white populations^24,25) among registries with higher proportions of non-Hispanic Asian or Pacific Islander and Hispanic patients.²⁶ In general, higher proportions of SEER cause-specific cause of death were missing for non-Hispanic Asian or Pacific Islander patients and Hispanic patients. One factor contributing to higher proportions of missing cause of death among Hispanic and non-Hispanic Asian or Pacific Islander patients may be NDI scoring (ie, these populations have, on average, lower linkage match scores, for which NDI will return vital status but not cause of death).

Missing or inaccurate Social Security numbers may be the underlying driver of many of the cause of death missingness patterns that we see in these analyses. Among Hispanic^22 and non-Hispanic Asian or Pacific Islander patients (personal communication with Dr. Scarlett Lin Gomez, March 2022), the distribution of follow-up source central differs from that of non-Hispanic white patients, which may be due to difficulty in linking to death certificates or to patient emigration. In other words, updates are made to vital status from sources that do not include cause of death, such as hospital registrars. The relatively high proportion of missing cause of death for specific primary sites (in particular, leukemia, lymphoma, and Hodgkin lymphoma) may be the result of those cancers being more frequently reported from pathology sources alone, which typically do not include Social Security number, resulting in fewer matches to death data.

We also noted large differences in cause of death missingness within registry by diagnosis year. These observations speak to the dynamic nature of missingness for this field and the importance of examining missingness within registry by year and at each analysis. Inadvertently including data with a high proportion of missingness for a given year based on a low percentage missingness for all years, especially when presenting results by diagnosis year or period, could yield biased or otherwise misleading results.

One important limitation of these analyses is that by changing the method of including tumors that did not have cause of death information (excluded, dead from the cancer under study, or censored), survival estimates were not calculated using an identical tumor set, potentially biasing our comparisons. However, these calculations represent what would happen under real-life circumstances of using these data in SEER*Stat. Thus, we felt that these comparisons were appropriate. Additionally, we attempted to evaluate cause of death missingness by follow-up source central. We found that follow-up source central did not reliably capture the data source of follow-up information. For example, although linkage with the Social Security Administration - Service for Epidemiological Researchers (SSA-SER) data does not provide cause of death information, 14.7% of tumors with follow-up source central listed as SSA-SER were reported as having died of their cancer (Table 5).

Table 5.

Cause of Death Missingness by Follow-up Source Central for Deceased Patients Diagnosed While Residents of Registries with <3% Missing Cause of Death, 2011-2017 Diagnosis Years, All Sites

	Alive or dead of other cause		Dead (attributable to this cancer)		Dead (missing/unknown COD)		Total
Follow-up source central	Count	Row %	Count	Row %	Count	Row %	Count
Follow-up not performed for this patient	905	8.70	6,989	67.20	2,506	24.10	10,400
Medicare/Medicaid File	0	0.00	3	5.60	51	94.40	54
Center for Medicare and Medicaid Services (formerly Health Care Finance Administration [HCFA])	4	0.90	17	4.00	405	95.10	426
Department of Motor Vehicle Registration	3	16.70	6	33.30	9	50.00	18
National Death Index (NDI)	23,197	26.30	64,683	73.40	205	0.20	88,088
State death tape/death certificate file	525,157	23.30	1,730,082	76.70	1,627	0.10	2,256,873
County/municipality death tape/death certificate file	9	1.70	508	97.90	2	0.40	519
Social Security Administration Death Master File	34,071	20.10	130,498	76.80	5,321	3.10	169,890
Hospital discharge data	135	8.70	1,302	84.40	106	6.90	1,543
Health maintenance organization (HMO) file	5	3.90	9	7.00	114	89.10	128
Social Security epidemiological vital status data	163	5.90	406	14.70	2,197	79.40	2,766
Voter registration file	6	25.00	8	33.30	10	41.70	24
Linkages, NOS	10,894	22.20	37,520	76.30	746	1.50	49,160
Hospitals and treatment facilities	7,179	18.60	26,464	68.50	4,998	12.90	38,642
Physicians	388	7.10	4,690	85.30	423	7.70	5,501
Patient	1	0.70	4	2.70	141	96.60	146
Central or regional cancer registry	288	18.20	1,129	71.40	165	10.40	1,582
Other	161	5.30	1,559	51.20	1,326	43.50	3,046
Blank(s)	26	11.90	184	84.00	9	4.10	219
Total	602,592	22.90	2,006,061	76.30	20,361	0.80	2,629,025


COD, cause of death; NOS, not otherwise specified.
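The row percentages in Table 5 can be rechecked from the counts. The short sketch below does this for the Social Security epidemiological vital status (SSA-SER) row cited in the text; the counts are copied from the table, while the dictionary keys and the function name are illustrative choices rather than NAACCR or SEER*Stat names.

```python
# Counts copied from the Table 5 row
# "Social Security epidemiological vital status data".
# The key names are hypothetical labels for the three column groups.
ssa_ser_counts = {
    "alive_or_dead_other_cause": 163,
    "dead_this_cancer": 406,
    "dead_missing_unknown_cod": 2197,
}


def row_percentages(row):
    """Return each count as a percentage of the row total, rounded to 1 decimal."""
    total = sum(row.values())
    return {name: round(100 * count / total, 1) for name, count in row.items()}


ssa_ser_pct = row_percentages(ssa_ser_counts)
print(ssa_ser_pct)
```

For this row, the share recorded as dead of the cancer comes out at 14.7% and the share with missing/unknown cause of death at 79.4%, matching the table.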

Based on this evaluation of cause-specific cause of death, we recommend that the following guidelines be implemented by anyone conducting cause-specific cause of death analyses, in particular researchers and others using CiNA data:

Registry-specific data should be excluded from cause-specific survival calculations if >3% of cases are missing cause-specific cause of death for that registry.
Registry-specific data should be excluded from cause-specific survival if >10% of cases are missing cause-specific cause of death for a single year of data for that registry.
Because cause-specific cause of death missingness varies by primary site and race/ethnicity, researchers and others using these data should apply the above rules to the data used for their specific research questions, including analyses by the data strata of interest to their research.
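As a minimal sketch, the first 2 guidelines can be applied mechanically to a table of death counts. The input layout, the function name, and the example registries below are assumptions made for illustration, not part of CiNA or SEER*Stat. The sketch also assumes that the denominator is deaths (as in the Abstract) and treats a registry at exactly 3% overall as not fit for use, a boundary case the guidelines leave open.

```python
from collections import defaultdict

OVERALL_CUTOFF_PCT = 3.0   # fit for use only when overall missingness is below this
YEARLY_CUTOFF_PCT = 10.0   # excluded when any single year exceeds this


def fit_for_use(death_counts):
    """Apply the overall and single-year cut points to each registry.

    death_counts: iterable of (registry, diagnosis_year, n_deaths, n_missing_cod)
    tuples (a hypothetical layout). Returns {registry: True/False}.
    """
    by_registry = defaultdict(list)
    for registry, _year, n_deaths, n_missing in death_counts:
        by_registry[registry].append((n_deaths, n_missing))

    result = {}
    for registry, rows in by_registry.items():
        total_deaths = sum(n for n, _ in rows)
        total_missing = sum(m for _, m in rows)
        overall_pct = 100.0 * total_missing / total_deaths if total_deaths else 0.0
        # Highest single-year percentage; years with no deaths are skipped.
        worst_year_pct = max((100.0 * m / n for n, m in rows if n), default=0.0)
        result[registry] = (
            overall_pct < OVERALL_CUTOFF_PCT and worst_year_pct <= YEARLY_CUTOFF_PCT
        )
    return result


# Made-up counts for three hypothetical registries.
example = [
    ("A", 2011, 1000, 10), ("A", 2012, 1000, 20),   # 1.5% overall
    ("B", 2011, 1000, 50), ("B", 2012, 1000, 40),   # 4.5% overall
    ("C", 2011, 100, 15), ("C", 2012, 2000, 10),    # about 1.2% overall, 15% in 2011
]
flags = fit_for_use(example)
print(flags)
```

Registry C shows why the single-year rule matters: its overall missingness is low, but one diagnosis year exceeds 10%. Per the third guideline, the same check would be rerun on the subset of data (primary site, race/ethnicity, or other strata) used for a given analysis.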

Researchers may need to exclude data from additional registries depending on their study population of interest. Accordingly, researchers conducting analyses in SEER*Stat should also conduct sensitivity analyses to evaluate the impact of any missingness by registry or other subcategory on the survival estimates using these 3 options for classifying cause of death: exclusion, censoring, or dead from the cancer. Researchers conducting analyses outside of SEER*Stat should consider multiple imputation as a potential solution to missing cause-specific cause of death, with special consideration for analyses involving registries impacted by legislation- or registry operations-related reasons for high missingness.
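For analyses outside of SEER*Stat, the 3-way sensitivity analysis described above might be sketched as follows. This is an illustration under stated assumptions, not the SEER*Stat implementation: it uses a plain Kaplan-Meier product-limit estimator, whereas SEER*Stat offers its own survival methods, and the status labels, function names, and patient records are all hypothetical.

```python
def km_survival(records, horizon):
    """Kaplan-Meier survival at `horizon` months from (months, event) pairs."""
    survival = 1.0
    at_risk = len(records)
    for t in sorted({months for months, _ in records}):
        if t > horizon:
            break
        deaths = sum(1 for months, event in records if months == t and event)
        leaving = sum(1 for months, _ in records if months == t)
        if deaths:
            survival *= 1 - deaths / at_risk
        at_risk -= leaving  # deaths and censored records both leave the risk set
    return survival


def cause_specific_survival(patients, strategy, horizon=60):
    """Cause-specific survival under one handling of unknown cause of death.

    patients: (months, status) pairs with status in
    {"cancer", "other", "alive", "unknown"} (hypothetical labels).
    strategy: "exclude", "dead" (count as dead from this cancer), or "censor".
    """
    records = []
    for months, status in patients:
        if status == "unknown":
            if strategy == "exclude":
                continue
            event = strategy == "dead"
        else:
            # Deaths from other causes and living patients are censored.
            event = status == "cancer"
        records.append((months, event))
    return km_survival(records, horizon)


# Made-up follow-up records for five hypothetical patients.
patients = [(12, "cancer"), (24, "unknown"), (36, "other"), (60, "alive"), (60, "alive")]
estimates = {s: cause_specific_survival(patients, s) for s in ("exclude", "dead", "censor")}
print(estimates)
```

With these made-up records, the 60-month estimates are 0.75 (excluded), 0.60 (dead from the cancer), and 0.80 (censored). Counting unknown causes as cancer deaths can never give a higher estimate than censoring them, so those 2 options bracket the plausible range when cause of death is the only uncertainty.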

These analyses underscore that registries differ in their interpretation of, or have different policies governing, cause of death release. Cause of death ascertained from state death records may not be releasable by central cancer registries per agreements with state vital statistics departments. Cause of death information is, however, used to calculate SEER cause-specific cause of death, which central cancer registries may be able to release in the absence of specific cause of death information. Additionally, as described in the data release guidelines published by NAACCR,^27 data on fact, date, and cause of death identified through NDI linkages may be released to approved researchers after review and approval by the cancer registry, provided that the registry annually provides NDI with information describing the release of these data (ie, researcher name, organization, study title, date). NDI fact, date, and cause of death may also be included in annual data submissions to NPCR, the SEER Program, and NAACCR.

The results of these analyses also underscore the importance of data processing sequence in the annual data submissions. Specifically, if death linkage is conducted prior to a particular case being reported to the registry (eg, late reporting of interstate data or delayed reporting by hospitals), vital status may not be appropriately recorded for that patient. Central registries should be particularly cognizant of the potential need to reconduct death linkages for these new cases.

Conclusion

This paper aims to establish a standard for when registry data are or are not fit for use for cause-specific survival, with the ultimate goal of encouraging registries to improve their data quality. To this end, we have established a recommended cut point of <3% missing/unknown SEER cause-specific cause of death by registry and/or any strata for which cause-specific survival is reported. We have also established that any registry with <3% overall missing/unknown SEER cause-specific cause of death but >10% missing/unknown SEER cause-specific cause of death for 1 or more individual diagnosis years should also be excluded from cause-specific survival analyses in SEER*Stat. This 3% standard is a reasonable request of most registries, as most US registries already met the standard prior to its quantification. This cause-specific cause of death fit-for-use criterion is a direct analogue to how other fit-for-use metrics are applied to CiNA data products, and maintaining a similar approach to cause-specific survival is sensible for CiNA data products and surveillance publications.

This paper also serves to call researchers’ attention to how missingness in cancer surveillance data is differential and likely to impact survival estimates. In the absence of multiple imputation or other more advanced statistical techniques, we recommend that researchers working with any subnational data set or who intend to present survival estimates by subpopulations exclude data per the above criteria. At minimum, we recommend conducting sensitivity analyses as described.

References

1. Sherman R, Firth R, Kahl A, et al., eds. Cancer in North America: 2015-2019: Volume One: Combined Cancer Incidence for the United States, Canada and North America. North American Association of Central Cancer Registries, Inc; 2022.
2. Sherman R, Firth R, Kahl A, et al., eds. Cancer in North America: 2015-2019: Volume Two: Registry-Specific Cancer Incidence in the United States and Canada. North American Association of Central Cancer Registries, Inc; 2022.
3. Sherman R, Firth R, Kahl A, et al., eds. Cancer in North America: 2015-2019: Volume Three: Registry-Specific Cancer Mortality in the United States and Canada. North American Association of Central Cancer Registries, Inc; 2022.
4. Johnson CJ, Wilson R, Mariotto A, et al., eds. Cancer in North America: 2015-2019: Volume Four: Cancer Survival in the United States and Canada 2012-2018. North American Association of Central Cancer Registries, Inc; 2022.
5. Johnson CJ, Wilson R, Mariotto A, et al., eds. Cancer in North America: 2015-2019: Volume Five: Cancer Prevalence in the United States and Canada 2009-2018. North American Association of Central Cancer Registries, Inc; 2022.
6. Weir HK, Tucker TC. NAACCR registry certification. In: Menck HR, Deapen D, Phillips JL, Tucker TC, eds. Central Cancer Registries: Design, Management and Use. 2nd ed. Kendall/Hunt Publishing Company; 2007:223–236.
7. Perme MP, Stare J, Esteve J. On estimation in relative survival. Biometrics. 2012;68(1):113–120.
8. Forjaz de Lacerda G, Howlader N, Mariotto AB. Differences in cancer survival with relative versus cause-specific approaches: an update using more accurate life tables. Cancer Epidemiol Biomarkers Prev. 2019;28(9):1544–1551.
9. Wissing MD, Greenwald ZR, Franco EL. Improving the reporting of cancer-specific mortality and survival in research using cancer registry data. Cancer Epidemiol. 2019;59:232–235.
10. Mariotto AB, Zou Z, Johnson CJ, Scoppa S, Weir HK, Huang B. Geographical, racial and socio-economic variation in life expectancy in the US and their impact on cancer relative survival. PLoS One. 2018;13(7):e0201034.
11. Rachet B, Maringe C, Woods LM, Ellis L, Spika D, Allemani C. Multivariable flexible modelling for estimating complete, smoothed life tables for sub-national populations. BMC Public Health. 2015;15:1240.
12. Spika D, Bannon F, Bonaventure A, et al. Life tables for global surveillance of cancer survival (the CONCORD programme): data sources and methods. BMC Cancer. 2017;17(1):159.
13. Howlader N, Ries LA, Mariotto AB, Reichman ME, Ruhl J, Cronin KA. Improved estimates of cancer-specific survival rates from population-based data. J Natl Cancer Inst. 2010;102(20):1584–1598.
14. Lauseker M, Zu Eulenburg C. Analysis of cause of death: competing risks or progressive illness-death model? Biom J. 2019;61(2):264–274.
15. Forjaz G, Howlader N, Scoppa S, Johnson CJ, Mariotto AB. Impact of including second and later cancers in cause-specific survival estimates using population-based registry data. Cancer. 2022;128(3):547–557.
16. Nur U, Shack LG, Rachet B, Carpenter JR, Coleman MP. Modelling relative survival in the presence of incomplete data: a tutorial. Int J Epidemiol. 2010;39(1):118–128.
17. Carpenter JR, Kenward MG. Multiple Imputation and Its Application. John Wiley & Sons; 2013.
18. Binder N, Schumacher M. Missing information caused by death leads to bias in relative risk estimates. J Clin Epidemiol. 2014;67(10):1111–1120.
19. SEER*Stat Database: NAACCR Incidence Data - CiNA Production File, 1995-2018, for U.S. and CDN - Survival & Prevalence (which includes data from CDC's National Program of Cancer Registries (NPCR), CCCR's Provincial and Territorial Registries, and the NCI's Surveillance, Epidemiology and End Results (SEER) Registries), certified by the North American Association of Central Cancer Registries (NAACCR) as meeting high-quality incidence data standards for the specified time periods, submitted December 2020. 2021.
20. Thornton ML, ed. Standards for Cancer Registries Volume II: Data Standards and Data Dictionary. 24th ed. North American Association of Central Cancer Registries; 2022.
21. SEER*Stat software [computer program]. Version 8.4.0.1. www.seer.cancer.gov/seerstat.
22. Pinheiro PS, Morris CR, Liu L, Bungum TJ, Altekruse SF. The impact of follow-up type and missed deaths on population-based cancer survival studies for Hispanics and Asians. J Natl Cancer Inst Monogr. 2014;2014(49):210–217.
23. Morawski BM, Qiao B, Coyle L, Rycroft RK, Schymura MJ, Johnson CJ. Impact of linkage to the Social Security Administration on follow-up completeness and cancer relative survival estimates in 2 new SEER registries: 2000-2016 diagnosis years. J Registry Manag. 2020;47(2):37–47.
24. Haile RW, John EM, Levine AJ, et al. A review of cancer in U.S. Hispanic populations. Cancer Prev Res (Phila). 2012;5(2):150–163.
25. Everhart JE, Kruszon-Moran D, Perez-Perez GI, Tralka TS, McQuillan G. Seroprevalence and ethnic differences in Helicobacter pylori infection among adults in the United States. J Infect Dis. 2000;181(4):1359–1363.
26. Zavala VA, Bracci PM, Carethers JM, et al. Cancer health disparities in racial/ethnic minorities in the United States. Br J Cancer. 2021;124(2):315–332.
27. Linking Central Cancer Registry Data with National Death Index (NDI) Data. NAACCR; 2021. https://www.naaccr.org/wp-content/uploads/2021/04/NDI-Factsheet-for-NAACCR_Final_Updated-4.29.21.pdf

