Population-Based Cancer Survival in the United States: Data, Quality Control, and Statistical Methods

Claudia Allemani; Rhea Harewood; Christopher J Johnson; Helena Carreira; Devon Spika; Audrey Bonaventure; Kevin Ward; Hannah K Weir; Michel P Coleman

doi:10.1002/cncr.31025

Author manuscript; available in PMC: 2018 Mar 14.

Published in final edited form as: Cancer. 2017 Dec 15;123(Suppl 24):4982–4993. doi: 10.1002/cncr.31025

PMCID: PMC5851448 NIHMSID: NIHMS945472 PMID: 29205302

Abstract

BACKGROUND

Robust comparisons of population-based cancer survival estimates require tight adherence to the study protocol, standardized quality control, appropriate life tables of background mortality, and centralized analysis. The CONCORD program established worldwide surveillance of population-based cancer survival in 2015, analyzing individual data on 26 million patients (including 10 million US patients) diagnosed between 1995 and 2009 with 1 of 10 common malignancies.

METHODS

In this Cancer supplement, we analyzed data from 37 state cancer registries that participated in the second cycle of the CONCORD program (CONCORD-2), covering approximately 80% of the US population. Data quality checks were performed in 3 consecutive phases: protocol adherence, exclusions, and editorial checks. One-, 3-, and 5-year age-standardized net survival was estimated using the Pohar Perme estimator and state- and race-specific life tables of all-cause mortality for each year. The cohort approach was adopted for patients diagnosed between 2001 and 2003, and the complete approach for patients diagnosed between 2004 and 2009.

RESULTS

Articles in this supplement report population coverage, data quality indicators, and age-standardized 5-year net survival by state, race, and stage at diagnosis. Examples of tables, bar charts, and funnel plots are provided in this article.

CONCLUSIONS

Population-based cancer survival is a key measure of the overall effectiveness of services in providing equitable health care. The high quality of US cancer registry data, 80% population coverage, and use of an unbiased net survival estimator ensure that the survival trends reported in this supplement are robustly comparable by race and state. The results can be used by policymakers to identify and address inequities in cancer survival in each state and for the United States nationally.

Keywords: cancer, National Program of Cancer Registries (NPCR), population-based survival, statistical methods, Surveillance, Epidemiology, and End Results (SEER)

INTRODUCTION

Population-based cancer survival is a measure of the overall effectiveness of the health system in dealing with cancer.¹ Comparisons of population-based cancer survival require adherence to a well-designed protocol, standardized quality control procedures, appropriate life tables of background mortality, and centralized analysis with the latest statistical methods.^2,3

The second cycle of the CONCORD program (CONCORD-2) established worldwide surveillance of cancer survival in 2015, with estimates of 5-year net survival based on individual data for more than 25 million cancer patients (approximately 10 million patients in the United States) diagnosed between 1995 and 2009. Adults (15–99 years old) had been diagnosed with 1 of 10 common cancers: stomach, colon, rectum, liver, lung, breast (women), cervix, ovary, prostate, or leukemia. Children (0–14 years old) had been diagnosed with acute lymphoblastic leukemia.³ Patients were followed up to December 31, 2009.

For the articles in this Cancer supplement, we analyzed data from 37 statewide cancer registries (27 funded by the National Program of Cancer Registries [NPCR], 5 funded by the Surveillance, Epidemiology, and End Results [SEER] program, and 5 funded by both NPCR and SEER) that participated in CONCORD-2. These registries, which cover approximately 80% of the US population, agreed to the inclusion of their data in more detailed analyses by stage at diagnosis and by race (Fig. 1). The CONCORD protocol required data on stage only for patients diagnosed from January 1, 2001 onward; these analyses, focusing mainly on survival by race and stage at diagnosis, were therefore restricted to patients diagnosed between 2001 and 2009.

Figure 1. Map of the participating states. NPCR indicates National Program of Cancer Registries; SEER, Surveillance, Epidemiology, and End Results.

Public health surveillance using data from population-based cancer registries is a key component of cancer control.⁴ The North American Association of Central Cancer Registries (NAACCR) develops and promotes uniform data standards for all cancer registries in North America.⁵ Participating US registries had to meet the NAACCR certification criteria and to have conducted record linkage with both state vital records and the National Death Index to update the vital status of registered patients. NAACCR members developed a detailed SAS program to map the NAACCR database record structure to the CONCORD protocol and thus to enable all North American registries to exclude cases that would not have been considered reportable primaries according to the International Association of Cancer Registries’ (IACR) multiple primary rules,⁶ before their data sets for 1995–2009 were extracted for CONCORD-2. This was necessary because North American registries define multiple primary cancers under the rules of the SEER program,⁷ whereas registries in the European Network of Cancer Registries and in other continents generally use the rules of the IACR, which are more conservative.

Topography and morphology were coded according to the International Classification of Diseases for Oncology, Third Edition (ICD-O-3).⁸ Solid tumors were defined by anatomical site (Table 1). For ovarian cancer, we included the fallopian tube, uterine ligaments, and adnexa as well as the peritoneum and retroperitoneum, where high-grade serous ovarian carcinomas are often detected; this was done to improve the international comparability of the data sets. Kaposi’s sarcoma and solid tumors with a lymphoma morphology were excluded from analysis.

TABLE 1.

Definition of Malignancies

Malignancy	Topography or Morphology Codes	Description
Stomach	C16.0-C16.6, C16.8-C16.9	Stomach
Colon	C18.0-C18.9, C19.9	Colon and rectosigmoid junction
Rectum	C20.9, C21.0-C21.2, C21.8	Rectum, anus, and anal canal
Liver	C22.0-C22.1	Liver and intrahepatic bile ducts
Lung	C34.0-C34.3, C34.8-C34.9	Lung and bronchus
Breast (women)	C50.0-C50.6, C50.8-C50.9	Breast
Cervix	C53.0-C53.1, C53.8-C53.9	Cervix uteri
Ovary	C48.0-C48.2, C56.9, C57.0-C57.4, C57.7-C57.9	Ovary, fallopian tube and uterine ligaments, other and unspecified female genital organs, peritoneum, and retroperitoneum
Prostate	C61.9	Prostate gland
Leukemia (children)	9727, 9728, 9729, 9835, 9836, 9837	Precursor-cell acute lymphoblastic leukemia

Leukemias were defined by morphology. In this supplement, we cover only precursor-cell acute lymphoblastic leukemia in children (ICD-O-3 morphology codes 9727, 9728, 9729, 9835, 9836, and 9837). Estimates of survival by race, state, and subtype of adult leukemia will be presented in other publications.

Only primary invasive cancers (ICD-O-3 behavior code 3) were included in survival analyses. We included cancers at a given site regardless of whether the patient had had a previous cancer. If a patient had been diagnosed with 2 or more cancers of a given organ, including paired organs, between 2001 and 2009, only the first was considered in survival analyses.

FOLLOW-UP

US registries were asked to submit follow-up data (the vital status and the date of the last known vital status) as of December 31, 2009, after conducting linkages of all cancer registrations with both state vital records systems and the National Death Index. Patients whose cancer registration could not be linked to a death record were considered to be alive on December 31, 2009 (passive follow-up, which is also known as the “presumed alive” method).

SEER registries are required to meet a specific standard for the completeness and recency of follow-up. At least 90% of registered patients not known to be deceased were required to have a date of last known vital status on or after January 1, 2010. These follow-up dates could have been obtained from either passive or active follow-up.⁹

Patients whose survival time was unknown were excluded from analyses. This group comprised patients registered solely from a death certificate or diagnosed at autopsy.

DATA QUALITY CONTROL

We performed data quality checks in 3 consecutive phases: protocol adherence, exclusions, and editorial checks. After each phase, a detailed report was sent to each cancer registry.

Phase 1: Protocol Adherence

We first checked whether each of the 37 variables (demographic characteristics, basis of diagnosis, date of diagnosis, topography, morphology, behavior, stage, vital status, and date of last known vital status) in each tumor record of each data set complied with the CONCORD-2 protocol. Any value not specified in the protocol was considered noncompliant. Each registry was sent a table of the number of records and the percentage compliance for each variable and for each cancer. Minor issues were corrected by the CONCORD Central Analytic Team after discussion with the registry. For major structural issues, 5 registries corrected and resubmitted their data.
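The compliance table sent to each registry can be illustrated with a minimal Python sketch. The variable names and the sets of allowed codes below are hypothetical stand-ins; the actual CONCORD-2 protocol defines 37 variables and is not reproduced here.

```python
from collections import defaultdict

# Hypothetical subset of protocol-specified codes (illustration only).
ALLOWED = {
    "sex": {"1", "2", "9"},
    "vital_status": {"1", "2", "9"},
    "behavior": {"0", "1", "2", "3", "6", "9"},
}

def compliance_by_variable(records, allowed=ALLOWED):
    """Return {variable: % of records whose value is specified in the protocol}."""
    ok = defaultdict(int)
    for rec in records:
        for var, values in allowed.items():
            # A missing value counts as noncompliant.
            if rec.get(var) in values:
                ok[var] += 1
    n = len(records)
    return {var: (100.0 * ok[var] / n if n else float("nan")) for var in allowed}

records = [
    {"sex": "1", "vital_status": "1", "behavior": "3"},
    {"sex": "2", "vital_status": "X", "behavior": "3"},  # unspecified vital status code
    {"sex": "2", "vital_status": "2"},                   # behavior missing
    {"sex": "1", "vital_status": "1", "behavior": "3"},
]
report = compliance_by_variable(records)
print(report)
```

In practice, such a table would be produced separately for each cancer and each registry.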

Phase 2: Exclusions

Next, we checked for logical inconsistencies between the variables in each tumor record, for each cancer site. Exclusion criteria were defined a priori on the basis of the experience within the Cancer Survival Group, the checks performed in the first CONCORD study, the data quality checks of EURO-CARE (a European cancer registry–based study of the survival and care of cancer patients), the checks proposed by the International Agency for Research on Cancer, the descriptions of morphology in the World Health Organization/International Agency for Research on Cancer classification of tumors for each cancer, and, finally, clinical expertise.³

We produced exclusion tables summarizing the quality of each data set. Data quality indicators were tabulated separately for patients diagnosed in 1995–1999, 2000–2004, and 2005–2009 to enable evaluations of trends in data quality over time. We defined 3 broad categories for exclusion: ineligibility (eg, an in situ neoplasm), definite error (eg, a sex-site mismatch), and possible error (eg, an apparent inconsistency between site and morphology). We had requested records of in situ neoplasms to assess the intensity of diagnostic activity, particularly for cancers of the breast and cervix, but in situ neoplasms were not included in survival analyses. The number and percentage of patients excluded from analyses are shown in Table 2.
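The three exclusion categories can be sketched as an ordered series of checks on each tumor record. This is only an illustration: the field names, the sex coding, and the single implausible site-morphology pair are assumptions, not the rules actually applied in CONCORD-2.

```python
FEMALE_ONLY = {"C53", "C56"}  # cervix uteri, ovary
MALE_ONLY = {"C61"}           # prostate gland
# Hypothetical example of an implausible site-morphology combination:
# hepatocellular carcinoma morphology (8170) recorded for a breast tumor.
IMPLAUSIBLE = {("C50", 8170)}

def classify_record(rec):
    """Return 'ineligible', 'definite_error', 'possible_error', or 'eligible'."""
    # Ineligibility: only primary invasive tumors (behavior code 3) are analyzed.
    if rec["behavior"] != 3:
        return "ineligible"
    site = rec["topography"][:3]
    # Definite error: sex-site mismatch (hypothetical coding "M"/"F").
    if (rec["sex"] == "M" and site in FEMALE_ONLY) or (
        rec["sex"] == "F" and site in MALE_ONLY
    ):
        return "definite_error"
    # Possible error: apparent inconsistency between site and morphology.
    if (site, rec["morphology"]) in IMPLAUSIBLE:
        return "possible_error"
    return "eligible"

in_situ = {"behavior": 2, "sex": "F", "topography": "C50.9", "morphology": 8500}
mismatch = {"behavior": 3, "sex": "M", "topography": "C53.9", "morphology": 8070}
odd_pair = {"behavior": 3, "sex": "F", "topography": "C50.9", "morphology": 8170}
valid = {"behavior": 3, "sex": "F", "topography": "C50.9", "morphology": 8500}
print([classify_record(r) for r in (in_situ, mismatch, odd_pair, valid)])
```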

TABLE 2.

Data Quality Indicators for Cancer Patients Diagnosed Between 1995 and 2009, by US State (All Solid Cancers Combined)

Registry	Calendar Period	Patients Submitted, No.	Ineligible: In Situ, %^a	Ineligible: Other, %^a	Eligible Patients, No.	Exclusions: DCO, %^b	Exclusions: Other, %^b	Available for Analyses, No.	MV, %^c	Nonspecific Morphology, %^c	Lost to Follow-Up, %^c	Censored, %^c	Type of Follow-Up^d
US Registries	1995–2009	10,115,271	6.4	1.4	9,325,815	1.9	0.2	9,142,718	99.7	1.3	1.0	<0.1
Alabama	1996–2009	184,581	0.0	1.3	182,156	1.9	0.2	178,484	95.9	1.1	0.0	0.0	P
Alaska	1996–2009	19,959	6.9	2.9	18,002	0.7	0.2	17,852	95.8	1.7	0.0	0.0	P
California	1995–2009	1,326,462	6.1	2.1	1,218,053	1.2	0.2	1,202,096	95.8	0.8	3.2	0.0	P&A
Colorado	1995–2009	162,405	6.4	1.4	149,860	2.4	0.2	146,306	95.8	0.8	0.0	0.0	P
Connecticut	1995–2009	180,154	7.8	1.1	164,128	1.3	0.2	161,865	97.0	0.7	4.5	0.0	P&A
Delaware	1995–2009	41,768	6.1	1.2	38,717	2.0	0.2	37,956	96.1	0.7	0.0	0.0	P
Florida	1995–2009	928,713	5.0	0.8	874,825	3.3	0.2	846,156	97.2	1.5	0.0	<0.1	P
Georgia	2000–2009	241,967	6.0	0.6	225,831	1.8	0.2	221,675	96.2	0.7	2.1	0.0	P&A
Hawaii	1995–2009	55,510	7.6	1.0	50,774	1.2	0.2	50,116	96.6	0.3	4.1	0.0	P&A
Idaho	1995–2009	51,319	4.6	1.3	48,277	2.5	0.2	47,086	95.8	0.4	0.0	0.0	P
Iowa	1995–2009	146,231	5.0	1.3	136,938	1.5	0.2	134,776	95.2	0.5	1.4	0.0	P
Kentucky	1995–2009	208,365	4.7	0.8	197,074	1.4	0.2	194,119	93.7	1.4	1.1	0.0	P&A
Louisiana	1995–2009	205,149	4.4	1.1	193,856	1.5	0.2	190,693	95.2	0.4	2.2	0.0	P&A
Maryland	1996–2009	225,540	10.6	1.1	199,088	2.9	0.3	193,230	94.8	1.8	0.0	0.0	P
Massachusetts	1995–2009	336,858	9.1	0.8	303,282	1.7	0.2	297,992	95.8	1.4	0.0	<0.1	P
Michigan	1995–2009	522,531	12.9	1.0	449,914	1.1	0.2	444,382	94.7	3.0	0.0	0.1	P&A
Mississippi	2003–2009	64,396	4.7	0.9	60,833	3.1	0.2	58,974	96.7	0.4	0.0	0.0	P
Montana	1995–2009	48,221	10.1	1.7	42,512	3.5	0.2	41,087	95.8	0.3	9.8	0.0	P&A
Nebraska	1995–2009	88,971	4.9	1.5	83,231	1.5	0.2	81,980	96.1	1.2	0.0	<0.1	P
New Hampshire	1995–2009	60,507	7.6	0.9	55,347	1.7	0.2	54,345	95.1	1.1	0.0	0.0	P
New Jersey	1995–2009	440,395	6.7	1.6	403,724	1.3	0.2	398,191	96.2	1.0	2.8	0.0	P&A
New Mexico	1995–2009	70,628	5.4	0.8	66,269	3.1	0.2	64,241	94.8	1.1	5.1	0.0	P&A
New York	1995–2009	940,361	7.7	2.6	842,888	1.7	0.2	827,621	94.8	0.9	0.0	0.0	P
North Carolina	1995–2009	375,205	5.9	0.7	350,656	1.7	0.1	344,750	96.3	0.7	0.0	0.0	P
Ohio	2001–2009	334,006	6.1	0.7	311,520	2.8	0.2	303,146	96.3	2.0	0.0	0.0	P
Oklahoma	1997–2009	147,158	4.3	1.0	139,322	2.9	0.2	135,165	93.1	1.6	0.0	0.0	P
Oregon	1996–2009	155,767	5.3	0.8	146,145	1.8	0.2	143,473	94.2	1.4	0.0	<0.1	P
Pennsylvania	1995–2009	682,922	6.5	1.5	628,121	1.3	0.2	619,287	95.7	0.7	0.0	0.0	P
Rhode Island	1995–2009	55,914	6.3	0.8	51,937	1.6	0.2	51,052	96.0	2.7	0.0	0.0	P
South Carolina	1996–2009	184,660	5.3	0.6	173,791	2.0	0.2	170,159	94.8	2.0	0.0	0.0	P
Tennessee	2003–2009	133,826	5.1	0.9	125,694	2.9	0.2	122,080	96.7	0.4	0.0	<0.1	P
Texas	1995–2009	814,295	5.0	1.4	762,429	3.2	0.2	737,811	94.0	2.5	0.0	0.0	P
Utah	1995–2009	63,227	5.9	1.2	58,729	0.4	0.2	58,373	97.0	0.4	3.0	0.0	P&A
Washington	1995–2008	246,015	5.9	1.3	228,416	1.2	0.2	225,458	95.1	0.8	1.9	<0.1	P&A
West Virginia	1995–2009	101,396	4.6	1.1	95,644	1.8	0.2	93,880	93.6	0.9	0.0	0.0	P
Wisconsin	1995–2009	248,955	7.1	1.3	228,035	<0.1	0.3	227,213	96.4	2.3	0.0	0.0	P
Wyoming	1995–2009	20,934	4.7	0.8	19,797	0.6	0.2	19,648	95.6	0.6	0.0	0.0	P&A

Abbreviations: DCO, death certificate only; ICD-O-3, International Classification of Diseases for Oncology, Third Edition; MV, microscopically verified; P, passive (presumed alive); P&A, passive and active (reported alive).

^a In situ malignancy (ICD-O-3 behavior code 2): some registries do not register in situ cancers; other registries did not submit them. Other: records with incomplete data, or tumors that are benign (behavior code 0), of uncertain behavior (1), metastatic from another organ (6), or unknown if primary or metastatic (9); or for patients with age outside the range 15–99 years (adults).

^b DCO: tumors registered from a death certificate only (DCO), or detected solely at autopsy. Other: vital status or sex unknown; invalid sequence of dates; inconsistency of sex-site, site-morphology, age-site, age-morphology, or age-site-morphology.

^c MV: microscopically verified. Nonspecific morphology (solid tumors only): ICD-O-3 morphology code in the range 8000–8005. Censored: patients diagnosed during 1995–2004, with last known vital status “alive”, but less than 5 years of follow-up.

^d P, Passive (“presumed alive”) method; P&A, Passive and Active (“reported alive”) methods; see text.

The majority of the patients (99.6%) had only a single tumor record for any 1 cancer between 1995 and 2009. However, because a small proportion of patients had more than 1 tumor record for a given cancer (“multiple tumor, same site”), it was necessary to apply the quality control checks to every tumor record independently before we selected the single tumor record to be included in survival analyses. For example, if a woman had an in situ neoplasm of the breast diagnosed in 2001 followed by an invasive primary breast cancer in 2007, the invasive cancer record was selected for inclusion in the analyses as long as it was free of error.
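The selection of a single record per patient and cancer can be sketched as follows. The field names and the `is_error_free` callback are hypothetical. The sketch assumes that the earliest invasive, error-free record is the one retained; the text does not spell out how ties or later error-free records were handled.

```python
from datetime import date

def select_index_tumors(records, is_error_free):
    """Keep the earliest invasive, error-free record per (patient, cancer)."""
    selected = {}
    for rec in sorted(records, key=lambda r: r["diagnosis_date"]):
        # Every record is checked independently before selection.
        if rec["behavior"] != 3 or not is_error_free(rec):
            continue
        key = (rec["patient_id"], rec["cancer"])
        selected.setdefault(key, rec)  # the first qualifying record wins
    return list(selected.values())

# Example from the text: in situ breast neoplasm (2001), then invasive cancer (2007).
records = [
    {"patient_id": 1, "cancer": "breast", "behavior": 2,
     "diagnosis_date": date(2001, 5, 1)},
    {"patient_id": 1, "cancer": "breast", "behavior": 3,
     "diagnosis_date": date(2007, 3, 15)},
]
chosen = select_index_tumors(records, is_error_free=lambda rec: True)
print(chosen)
```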

Phase 3: Editorial Tables

We evaluated the distribution of key data quality indicators for each cancer and for each registry. These indicators included the proportion of cancers in the final data set that had been microscopically verified and the proportion of patients who had been lost to follow-up. We also checked the distributions of the day and the month of the dates of birth, diagnosis, and last known vital status. These distributions should be approximately flat: one would expect about 8% of births, diagnoses, and deaths to occur in each month and about 3% on each day of a given month, with fewer on days 29 to 31. Spikes in these distributions, often on the 1st, 15th, or 16th day of the month or in June or July, helped to identify where registries had imputed missing elements of a date.
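A check of this kind for the day of the month can be sketched in a few lines of Python. The threshold (`ratio=2.0`) and the function name are assumptions chosen for illustration, and leap years are ignored; the text does not state how spikes were formally identified.

```python
from collections import Counter

# Number of months in a non-leap year that contain each day of the month.
MONTHS_WITH_DAY = {d: 12 for d in range(1, 29)}
MONTHS_WITH_DAY.update({29: 11, 30: 11, 31: 7})

def heaped_days(days, ratio=2.0):
    """Return days of the month whose observed share exceeds `ratio` times
    the share expected if dates were spread evenly over the year."""
    n = len(days)
    if n == 0:
        return []
    counts = Counter(days)
    flagged = []
    for day in range(1, 32):
        expected_share = MONTHS_WITH_DAY[day] / 365  # about 3.3% for days 1-28
        if counts[day] / n > ratio * expected_share:
            flagged.append(day)
    return flagged

# Synthetic data: 30 records on each of days 1-28, plus 160 extra on the 15th,
# as if a missing day had been imputed as mid-month.
days = [d for d in range(1, 29) for _ in range(30)] + [15] * 160
print(heaped_days(days))
```

The same comparison applies to months, with an expected share of roughly one-twelfth each.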

Table 2 provides a summary of the exclusions and data quality indicators for adults (15–99 years) diagnosed between 1995 and 2009 with 1 of 9 common cancers (all solid cancers), by US state. The calendar periods within which survival analyses could be performed by stage at diagnosis were constrained by the availability of data on stage only from 2001 onward, and by the change in coding from 2004 (discussed later). Therefore, the periods for which data quality indicators are presented do not exactly match the periods used for survival analysis. However, data quality has generally been very high in all US registries, and it tended to improve over the 15 years from 1995 to 2009. Only about 2% of tumors were registered from a death certificate only (DCO) or detected solely at autopsy. These records must be excluded from survival analyses because the follow-up time for these patients is unknown. However, the proportion of DCO registrations in the United States was low overall (1.9%) and in all states (range <0.1% to 3.5%). The proportion of other errors was very low (0.2%). Therefore, approximately 98% of eligible patients were included in survival analyses. Practically all tumors (99.7%) were microscopically verified: this proportion was more than 95% in almost all US states.
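The headline proportions can be reproduced from the US Registries row of Table 2. The short calculation below uses only the counts given in that row.

```python
# Counts from the "US Registries" row of Table 2.
submitted = 10_115_271
eligible = 9_325_815
available = 9_142_718

pct_eligible = round(100 * eligible / submitted, 1)   # share of submitted patients that were eligible
pct_analyzed = round(100 * available / eligible, 1)   # share of eligible patients available for analysis
print(pct_eligible, pct_analyzed)
```

The first figure is consistent with the 7.8% of patients classed as ineligible (6.4% in situ plus 1.4% other), and the second with the statement that approximately 98% of eligible patients were included.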

The proportion of the US population covered by this study is 80.6%. Table 3 shows the population coverage by US state, as well as the number of patients diagnosed between 1995 and 2009 and included in the analyses.

TABLE 3.

Population Coverage and Number of Men and Women Diagnosed Between 1995 and 2009, by US State

	Population Covered, No.^a	% of National	Stomach, No.	Colon, No.	Rectum, No.	Liver, No.	Lung, No.	Breast (women), No.	Cervix, No.	Ovary, No.	Prostate, No.	Leukemia, No.	ALL, No.^b	Total, No.
Northeast	53,343,618	17.2	71,864	364,080	98,955	45,343	548,066	558,858	36,809	73,576	612,802	96,851	5,974	2,513,178
Connecticut	3,518,288	1.1	4,890	23,612	6,708	2,817	36,756	39,027	1,987	4,717	41,351	6,435	412	168,712
Massachusetts	6,593,587	2.1	7,881	43,253	12,295	5,223	68,739	73,117	3,524	8,761	75,199	10,951	766	309,709
New Hampshire	1,324,575	0.4	1,119	7,570	2,272	691	12,905	13,540	706	1,608	13,934	2,310	145	56,800
New Jersey	8,707,739	2.8	12,299	59,971	15,731	7,074	86,105	90,804	6,766	12,189	107,252	15,908	1,081	415,180
New York	19,541,453	6.3	27,815	124,529	33,385	18,226	182,745	192,993	14,457	26,279	207,192	34,253	2,113	863,987
Pennsylvania	12,604,767	4.1	16,277	97,350	26,642	10,501	148,024	137,529	8,655	18,753	155,556	25,362	1,388	646,037
Rhode Island	1,053,209	0.3	1,583	7,795	1,922	811	12,792	11,848	714	1,269	12,318	1,632	69	52,753
South	101,946,182	32.9	83,760	497,182	136,823	63,616	906,514	801,768	62,815	94,898	877,756	139,280	9,699	3,674,111
Alabama	4,708,708	1.5	4,164	25,870	6,761	2,672	47,987	40,510	3,029	5,019	42,472	6,354	384	185,222
Delaware	885,122	0.3	882	5,235	1,471	555	9,670	8,320	605	1,014	10,204	1,320	93	39,369
Florida	18,537,969	6.0	20,634	124,316	31,640	14,460	215,900	184,051	13,971	24,081	217,103	34,954	1,779	882,889
Georgia	9,829,211	3.2	5,038	28,739	8,876	3,764	54,615	52,462	3,905	6,327	57,949	8,344	660	230,679
Kentucky	4,314,113	1.4	3,803	28,207	8,311	2,622	61,642	40,803	3,451	5,062	40,218	7,656	448	202,223
Louisiana	4,492,076	1.5	5,228	27,450	8,001	3,519	49,414	40,902	3,494	4,313	48,372	7,109	443	198,245
Maryland	5,699,478	1.8	4,443	26,837	7,112	3,051	45,181	47,220	2,865	5,364	51,157	6,171	271	199,672
Mississippi	2,951,996	1.0	1,403	8,436	2,353	1,027	15,995	12,124	965	1,327	15,344	2,007	137	61,118
North Carolina	9,380,884	3.0	7,405	46,540	13,170	5,200	88,642	82,085	5,357	9,650	86,701	12,557	909	358,216
Oklahoma	3,687,050	1.2	2,619	19,103	5,209	2,243	37,451	31,244	2,282	3,763	31,251	5,760	373	141,298
South Carolina	4,561,242	1.5	4,129	23,391	6,362	2,429	42,490	38,694	2,932	4,369	45,363	6,045	372	176,576
Tennessee	6,296,254	2.0	2,534	16,679	4,817	1,982	35,098	27,996	1,921	3,113	27,940	4,621	331	127,032
Texas	24,782,302	8.0	19,523	102,010	28,511	18,846	173,838	176,373	16,384	18,903	183,423	32,603	3,330	773,744
West Virginia	1,819,777	0.6	1,955	14,369	4,229	1,246	28,591	18,984	1,654	2,593	20,259	3,779	169	97,828
Midwest	31,971,621	10.3	25,906	172,649	48,545	18,747	289,855	274,527	16,538	35,847	308,883	51,542	3,095	1,246,134
Iowa	3,007,856	1.0	2,771	22,550	6,012	1,747	32,616	30,562	1,726	4,321	32,471	6,737	348	141,861
Michigan	9,969,727	3.2	10,491	59,825	16,607	7,499	108,252	99,647	6,373	13,398	122,290	18,459	1,136	463,977
Nebraska	1,796,619	0.6	1,670	13,064	3,837	1,226	18,417	19,170	1,229	2,399	20,968	3,929	223	86,132
Ohio	11,542,645	3.7	6,685	43,767	12,779	4,777	80,262	69,842	4,279	8,504	72,251	11,605	726	315,477
Wisconsin	5,654,774	1.8	4,289	33,443	9,310	3,498	50,308	55,306	2,931	7,225	60,903	10,812	662	238,687
West	62,329,218	20.1	57,922	268,833	84,204	48,429	419,871	505,928	34,950	64,828	530,771	83,321	8,293	2,107,350
Alaska	698,473	0.2	480	2,305	819	450	4,073	4,555	349	470	4,351	649	86	18,587
California	36,961,664	11.9	38,136	162,637	50,069	32,671	249,281	301,418	22,916	38,622	306,346	47,886	5,370	1,255,352
Colorado	5,024,748	1.6	3,214	19,086	5,905	2,696	27,649	39,349	2,375	4,942	41,090	6,703	593	153,602
Hawaii	1,295,178	0.4	2,576	7,559	2,428	1,766	10,079	12,011	825	1,314	11,558	1,685	150	51,951
Idaho	1,545,801	0.5	969	6,058	2,034	580	9,772	11,548	655	1,548	13,922	2,279	181	49,546
Montana	974,989	0.3	860	5,466	1,743	451	9,345	9,336	509	1,243	12,134	1,694	101	42,882
New Mexico	2,009,671	0.6	1,937	8,468	2,765	1,763	12,024	15,681	1,176	1,967	18,460	2,859	274	67,374
Oregon	3,825,657	1.2	2,883	18,637	5,868	2,431	34,325	36,514	1,886	4,537	36,392	5,419	427	149,319
Utah	2,784,572	0.9	1,266	7,430	2,564	864	7,440	14,803	905	2,044	21,057	2,983	353	61,709
Washington	6,664,195	2.2	5,200	28,488	9,253	4,484	51,862	56,176	3,045	7,536	59,414	10,349	708	236,515
Wyoming	544,270	0.2	401	2,699	756	273	4,021	4,537	309	605	6,047	815	50	20,513
US registries	249,590,639	80.6	239,452	1,302,744	368,527	176,135	2,164,306	2,141,081	151,112	269,149	2,330,212	370,994	27,061	9,540,773

Abbreviation: ALL, acute lymphoblastic leukemia.

^a Data are from the UN Population Division for 2009.

^b Acute lymphoblastic leukemia, children (0–14 years) only.

STUDY DESIGN

The focus of this monograph is the striking differences in survival by race and stage at diagnosis. Because differences in survival between men and women were generally very small compared to the differences in survival between blacks and whites (Table 4), we do not show survival estimates by sex in the articles on each cancer.

TABLE 4.

Age-Standardized 5-Year NS for Men and Women (15–99 Years) Diagnosed With 1 of 10 Common Malignancies and for Children (0–14 Years) Diagnosed With ALL Between 2004 and 2009, by Race and Sex: United States

	All Races						Whites						Blacks						Difference, %
	Both Sexes		Men		Women		Both Sexes		Men		Women		Both Sexes		Men		Women		Men and women^a			Blacks and Whites^b
Cancer	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	NS (%)	95% CI	All Races	Whites	Blacks	Men	Women
Stomach	29.0	28.6–29.5	26.5	25.9–27.0	33.4	32.7–34.1	28.0	27.5–28.5	25.3	24.7–26.0	32.7	31.8–33.5	28.3	27.1–29.4	24.5	23.0–25.9	33.7	32.0–35.4	−6.9	−7.3	−9.2	−0.9	1.0
Colon	64.6	64.4–64.9	63.7	63.3–64.0	65.7	65.4–66.0	65.4	65.2–65.7	64.5	64.1–64.8	66.4	66.1–66.8	56.6	55.9–57.3	54.5	53.4–55.5	58.6	57.7–59.5	−2.0	−1.9	−4.1	−10.0	−7.8
Rectum	64.0	63.6–64.4	62.4	61.8–63.0	66.1	65.5–66.7	64.2	63.7–64.7	62.8	62.1–63.4	66.1	65.5–66.8	57.5	56.0–59.0	53.6	51.3–55.9	61.7	59.7–63.8	−3.7	−3.4	−8.1	−9.2	−4.4
Liver	14.8	14.4–15.2	14.3	13.8–14.8	16.8	16.1–17.6	14.3	13.8–14.8	13.8	13.3–14.4	16.4	15.5–17.3	11.4	10.3–12.5	10.8	9.4–12.3	14.2	12.2–16.2	−2.5	−2.6	−3.4	−3.0	−2.1
Lung	19.0	18.8–19.1	16.1	16.0–16.3	22.4	22.2–22.6	19.4	19.2–19.5	16.5	16.3–16.7	22.7	22.4–22.9	14.9	14.5–15.2	12.3	11.9–12.8	18.3	17.7–19.0	−6.2	−6.1	−6.0	−4.2	−4.3
Breast (women)					88.6	88.4–88.8					89.6	89.4–89.8					78.4	77.7–79.1	–	–	–	–	−11.2
Cervix					62.8	62.2–63.5					63.5	62.7–64.2					55.5	53.9–57.1	–	–	–	–	−7.9
Ovary					41.0	40.5–41.5					41.7	41.2–42.2					31.1	29.5–32.7	–	–	–	–	−10.6
Prostate			96.9	96.7–97.1					96.9	96.7–97.1					92.7	92.1–93.3			–	–	–	−4.2	–
Leukemia	52.1	51.7–52.5	52.0	51.5–52.5	52.3	51.7–52.9	52.7	52.3–53.1	52.5	51.9–53.0	52.9	52.3–53.6	41.9	40.4–43.5	41.1	38.9–43.3	42.7	40.6–44.9	−0.3	−0.4	−1.6	−11.3	−10.2
ALL^c	88.1	87.2–88.9	87.4	86.2–88.6	88.9	87.6–90.2	88.6	87.6–89.5	88.1	86.8–89.4	89.1	87.7–90.6	83.6	80.6–86.6	82.3	78.4–86.2	85.1	80.1–90.1	−1.5	−1.0	−2.8	−5.8	−4.0

Abbreviations: ALL, acute lymphoblastic leukemia; CI, confidence interval; NS, net survival.

The population coverage represents 80.6% of the US population in 2009 (according to the UN Population Division).

^a A negative value means that men have lower survival than women.

^b A negative value means that blacks have lower survival than whites.

^c Children (0–14 years old) only.

The CONCORD protocol required information on the stage at diagnosis only for patients diagnosed from 2001 onward, because the completeness of data on stage in the United States and many other countries was known to be much lower before 2001. For the analyses of survival by stage at diagnosis, patients were grouped by year of diagnosis into 2 calendar periods (2001–2003 and 2004–2009) to reflect changes in the methods used by US registries to collect data on the stage at diagnosis. From 2001 onward, most registries coded stage directly from the source data to SEER Summary Stage 2000.¹⁰ From 2004 onward, all registries began to derive SEER Summary Stage 2000 from 15 pathological and clinical data items, using the Collaborative Staging System.¹¹ Data on stage at diagnosis were not available for Maryland or Wisconsin, or for patients diagnosed between 2004 and 2009 in Rhode Island.

We estimated net survival with the cohort approach for patients diagnosed in 2001–2003 because all patients had been followed up for at least 5 years by December 31, 2009. We used the complete approach to estimate net survival for patients diagnosed from 2004 to 2009 because 5 years of follow-up data were not available for all patients.

Cohort Approach

The cohort approach is the classic approach to survival analysis: all patients who are included in the analysis have had the opportunity to be followed for the full duration of the survival analysis (in this case, 5 years). The cohort of patients is defined by the year or calendar period during which they were diagnosed, and each patient is followed for the same length of time. In our analyses, at least 5 years of follow-up for vital status were available by the end of 2009 for all patients diagnosed between 2001 and 2003. Each patient, regardless of his or her actual year of diagnosis, contributes survival information at each point in follow-up time, and this information, taken cumulatively, makes up the survival estimate at 5 years.

The cohort approach is considered the gold standard^12,13 because it provides a survival estimate for a group of patients who have been diagnosed during the same year or period, who are likely to have been treated in a similar fashion, and who have all been followed for at least the duration of survival required. It is the natural approach to estimation of the outcome and is easy to interpret, but other approaches may be required if sufficient data are not available.

Complete Approach

The complete approach can be applied to estimate survival for patients who have been diagnosed more recently, and for whom 5 full years of follow-up data may not be available at the closing date of the study. For example, some patients diagnosed in 2004–2009 were followed for less than 5 years. The cohort approach can be used to estimate 5-year survival for patients diagnosed in 2004, but 5-year survival can be estimated for the whole calendar period with the complete approach, in which all the available follow-up data for patients diagnosed between 2004 and 2009 are used. The potential follow-up time for these patients varies between 1 and 5 years.
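The distinction between the two approaches comes down to whether every patient in the period could have been followed for 5 full years by the closing date. A minimal sketch, with hypothetical function names and the closing date of December 31, 2009 taken from the text:

```python
from datetime import date

CLOSING_DATE = date(2009, 12, 31)

def analysis_approach(diagnosis_year):
    """Map the year of diagnosis to the approach used in these analyses."""
    if 2001 <= diagnosis_year <= 2003:
        return "cohort"
    if 2004 <= diagnosis_year <= 2009:
        return "complete"
    raise ValueError("year of diagnosis outside 2001-2009")

def potential_follow_up_years(diagnosis_date, closing=CLOSING_DATE):
    """Maximum follow-up a patient could have had by the closing date."""
    return (closing - diagnosis_date).days / 365.25

# A patient diagnosed on the last day of 2003 could still be followed for 5 years...
late_2003 = potential_follow_up_years(date(2003, 12, 31))
# ...whereas a patient diagnosed in mid-2008 could not.
mid_2008 = potential_follow_up_years(date(2008, 6, 30))
print(analysis_approach(2003), late_2003 >= 5)
print(analysis_approach(2008), mid_2008 >= 5)
```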

Age Standardization

We compared survival estimates between US states, between blacks and whites, and between calendar periods of diagnosis. For age-specific survival estimates, a comparison between populations or over time is straightforward, but comparing overall (all-ages) survival estimates requires age standardization. The reasons are essentially the same as for comparisons of overall incidence or mortality rates: net survival may vary widely with age at diagnosis, and the age profile of cancer patients may differ between the populations, or change between the calendar periods, that are being compared.

For age standardization of incidence or mortality rates, what matters is the age structure of the general population at risk of cancer. With cancer survival, however, what matters is the age profile of cancer patients, which is very different from the age profile of the general population. The weights used for age standardization of cancer survival estimates are thus completely different from those required for standardizing incidence or mortality rates. The weight for each age group is provided by the proportion of cancer patients in that age group in a standard population of cancer patients.

The International Cancer Survival Standard (ICSS) weights¹⁴ are strongly recommended for international comparisons of cancer survival. They comprise 3 sets of standard age weights, derived from discriminant analysis to find the smallest number of sets of weights that enable adequate standardization of survival. Each standard is applicable to a range of different cancers, and provides age-standardized survival estimates that are not too different from the unstandardized estimates. The same age weights can be used for men and women, and for direct comparisons of age-standardized net survival between patient groups defined by sex and race.

STATISTICAL METHODS

We estimated net survival up to 5 years after diagnosis with 95% confidence intervals (CI), using the Pohar Perme estimator, implemented in the Stata algorithm.^15–17 We analyzed survival by state, race, stage at diagnosis, and calendar period of diagnosis. Net survival is the probability of surviving up to a given time since diagnosis after controlling for other causes of death (background mortality). To control for the wide differences in background mortality among participating states and racial/ethnic groups, we constructed life tables of all-cause mortality in the general population of each state from the number of deaths and the population by single year of age, sex, calendar year, and, where possible, race (black and white).
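For readers unfamiliar with the notation, the quantities involved can be written out as follows. This is a sketch of the usual formulation of net survival and of the Pohar Perme estimator, in notation chosen here for illustration: $\lambda_{O,i}$, $\lambda_{P,i}$, and $\lambda_{E,i}$ denote the observed, background (population), and excess hazards for patient $i$; $S_{P,i}$ is the expected survival of patient $i$ from the life table; and $N_i$ and $Y_i$ are the death-counting and at-risk processes.

```latex
% Additive hazards: observed hazard = background hazard + excess hazard
\lambda_{O,i}(t) = \lambda_{P,i}(t) + \lambda_{E,i}(t)

% Net survival: average survival if cancer were the only possible cause of death
S_N(t) = \frac{1}{n} \sum_{i=1}^{n} \exp\!\left( -\int_0^t \lambda_{E,i}(u)\,du \right)

% Pohar Perme estimator: each patient is weighted by the inverse of
% his or her expected survival, w_i(u) = 1 / S_{P,i}(u)
\hat{\Lambda}_E(t) =
  \int_0^t \frac{\sum_i dN_i(u) / S_{P,i}(u)}{\sum_i Y_i(u) / S_{P,i}(u)}
  - \int_0^t \frac{\sum_i \bigl( Y_i(u) / S_{P,i}(u) \bigr)\, d\Lambda_{P,i}(u)}{\sum_i Y_i(u) / S_{P,i}(u)},
\qquad
\hat{S}_N(t) = \exp\!\bigl\{ -\hat{\Lambda}_E(t) \bigr\}
```

The inverse-probability weights compensate for patients who are removed from the risk set by deaths from other causes, which occur more often among the elderly.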

Net survival in adults was estimated for 5 age groups (15–44, 45–54, 55–64, 65–74, and 75–99 years; for prostate cancer, 15–54, 55–64, 65–74, 75–84, and 85–99 years). We obtained age-standardized survival estimates using the International Cancer Survival Standard weights. For children, survival was estimated for the age groups 0 to 4, 5 to 9, and 10 to 14 years. We obtained age-standardized estimates by assigning equal weights to the 3 age-specific estimates.¹⁸
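Direct age standardization as described above can be sketched as a weighted sum of the age-specific estimates. The function name, the survival values, and the adult weights below are hypothetical: the weights are placeholders for illustration and are not the published ICSS weights. The standard error formula assumes the age-specific estimates are independent.

```python
from math import sqrt

def age_standardize(estimates, std_errors, weights):
    """Directly age-standardize age-specific survival estimates (in %).

    Returns the weighted mean and its standard error, treating the
    age-specific estimates as independent.
    """
    if not (len(estimates) == len(std_errors) == len(weights)):
        raise ValueError("inputs must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    survival = sum(w * s for w, s in zip(weights, estimates))
    se = sqrt(sum((w * e) ** 2 for w, e in zip(weights, std_errors)))
    return survival, se

# Hypothetical 5-year net survival (%) and standard errors, 5 adult age groups
adult_survival = [90.0, 88.0, 86.0, 84.0, 75.0]
adult_se = [1.0, 0.8, 0.7, 0.8, 1.2]
# Placeholder weights for illustration only; NOT the published ICSS weights
placeholder_weights = [0.10, 0.15, 0.25, 0.25, 0.25]
adult_std, adult_std_se = age_standardize(adult_survival, adult_se, placeholder_weights)

# Children: equal weights for the 3 age groups (0-4, 5-9, 10-14 years)
child_survival = [88.0, 90.0, 85.0]
child_se = [1.5, 1.4, 1.6]
child_std, child_std_se = age_standardize(child_survival, child_se, [1 / 3] * 3)

print(f"adults: {adult_std:.2f}% (SE {adult_std_se:.2f})")
print(f"children: {child_std:.2f}% (SE {child_std_se:.2f})")
```

Because the weights describe a standard population of cancer patients rather than the general population, the same function applies to any patient group once the appropriate set of weights is supplied.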

We derived standard errors for both unstandardized and age-standardized survival estimates with the Greenwood method.¹⁹ The 95% CIs were constructed assuming a normal distribution and were truncated to the range 0–100. We did not estimate survival if fewer than 10 patients were available for analysis. Age standardization was performed only if there were at least 10 patients in each of the age categories specified above. If an age-specific estimate could not be obtained, we merged data for adjacent age groups and assigned the combined estimate to both age groups. If 2 or more age-specific estimates could not be obtained, we present only the pooled, unstandardized estimates for all ages combined: these estimates are italicized in Supporting Tables 2 and 3 in other articles of this supplement.²⁰⁻²⁹
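The interval construction and the reporting rules in the paragraph above can be sketched as follows. The function names are hypothetical, and the rule function assumes that an age-specific estimate "could not be obtained" exactly when that age group has fewer than 10 patients, which is a simplification made here for illustration.

```python
from statistics import NormalDist

MIN_PATIENTS = 10  # threshold stated in the text

def normal_ci(estimate, se, level=0.95, lower=0.0, upper=100.0):
    """Normal-approximation confidence interval, truncated to [lower, upper]."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return max(lower, estimate - z * se), min(upper, estimate + z * se)

def reporting_rule(patients_by_age_group):
    """Decide what kind of estimate to report for one patient group.

    Assumption: an age group with fewer than MIN_PATIENTS patients is
    treated as having no obtainable age-specific estimate.
    """
    if sum(patients_by_age_group) < MIN_PATIENTS:
        return "no estimate"
    sparse = sum(1 for n in patients_by_age_group if n < MIN_PATIENTS)
    if sparse == 0:
        return "age-standardized"
    if sparse == 1:
        return "age-standardized after merging adjacent age groups"
    return "unstandardized only"

# A survival estimate of 98% with SE 2% has its upper limit truncated at 100
ci_low, ci_high = normal_ci(98.0, 2.0)
print(f"95% CI: {ci_low:.1f}-{ci_high:.1f}")
print(reporting_rule([50, 40, 30, 20, 5]))
```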

For each of the 37 states, we present estimates of age-standardized net survival for each cancer up to 5 years after diagnosis. For convenience, we report cumulative survival probabilities (range, 0–1) as percentages in the range of 0% to 100%.

LIFE TABLES

For the analyses presented in this supplement, we used the life tables for background mortality that were constructed for the CONCORD-2 study.³⁰

To control for variation between US states in background mortality by age, sex, race, and calendar year while estimating net survival, we used life tables of all-cause mortality rates by single year of age (0–99 years) for each state, race, calendar year (2001–2010), and sex. For a few states in which the black population is small, it was not possible to construct adequately robust life tables of all-cause mortality by single year of age and sex for blacks, so net survival estimates for blacks in those states are not presented separately. These life tables can be downloaded from the CONCORD library of over 12,000 life tables.³¹ The library includes detailed statistical and graphical reports on the robustness of the life tables for each US state.

We received raw data on death counts and populations for each US state. To produce life tables for each US state by race, sex, and calendar year (state- and race-specific life tables), we used a flexible Poisson model³² that enables creation of single-year-of-age life tables even when the raw data are sparse. We checked the life tables by examination of semilog plots of the age-sex-mortality rates, the life expectancy at birth, the probability of death in the age bands 15 to 59, 60 to 84, and 85 to 99 years, and, where necessary, the model residuals, to examine the goodness of fit of the models by age and sex.
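The summary quantities used to check a life table (life expectancy at birth and the probability of death within an age band) can be derived from single-year-of-age mortality rates, as in the sketch below. This is not the Poisson model used to build the life tables. The function names and the Gompertz-like rates are hypothetical, and the calculation assumes a constant hazard within each year of age, deaths at mid-year on average, and a table that stops at age 99.

```python
from math import exp

def death_probabilities(rates):
    """Convert single-year-of-age mortality rates m_x into probabilities of
    death q_x, assuming a constant hazard within each year of age."""
    return [1.0 - exp(-m) for m in rates]

def prob_death_between(qx, start_age, end_age):
    """Probability that a person alive at start_age dies before end_age + 1."""
    alive = 1.0
    for age in range(start_age, end_age + 1):
        alive *= 1.0 - qx[age]
    return 1.0 - alive

def life_expectancy_at_birth(qx):
    """Approximate life expectancy at birth, truncated at the last age in the
    table; people dying in a year are assumed to live half of it."""
    alive, years = 1.0, 0.0
    for q in qx:
        years += alive * (1.0 - q / 2.0)
        alive *= 1.0 - q
    return years

# Hypothetical Gompertz-like all-cause mortality rates for ages 0-99
rates = [0.0001 * exp(0.085 * age) for age in range(100)]
qx = death_probabilities(rates)

# Age bands used in the checks described in the text
bands = {(a, b): prob_death_between(qx, a, b) for a, b in [(15, 59), (60, 84), (85, 99)]}
e0 = life_expectancy_at_birth(qx)

for (a, b), p in bands.items():
    print(f"probability of death, ages {a}-{b}: {p:.3f}")
print(f"life expectancy at birth (truncated at 100): {e0:.1f} years")
```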

GRAPHICAL REPRESENTATION

In each cancer-specific article in this supplement, trends, geographic variations, and differences in age-standardized survival by race are presented graphically in bar charts and funnel plots.³³

Bar Charts

Results are summarized in bar charts of 5-year age-standardized net survival by calendar period (2001–2003 and 2004–2009), for each state, grouped within the 4 US Census geographic regions (Northeast, South, Midwest, and West). The results for each region are presented with a different color. Within each region, darker shades indicate NPCR registries, whereas lighter shades indicate SEER registries. Five registries funded by both SEER and NPCR were grouped with SEER because they use both passive and active follow-up; they are indicated with an asterisk.

The survival estimates for each state in 2004–2009 are ranked from high to low within each US Census region. The same ranking is then applied to the results for 2001–2003, to facilitate examination of changes in survival from 2001–2003 to 2004–2009 within each state. The absolute difference in 5-year net survival between the 2 periods is also shown for each state as a percentage.
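The ordering used in the bar charts can be sketched as follows: states are ranked on the 2004–2009 estimate within each region, and the same order is reused for 2001–2003 so that the two periods line up state by state. The state names and survival values are hypothetical.

```python
from collections import defaultdict

# Hypothetical 5-year age-standardized net survival (%) by state and period:
# (region, state, survival 2001-2003, survival 2004-2009)
estimates = [
    ("Northeast", "State A", 85.0, 87.5),
    ("Northeast", "State B", 86.0, 89.0),
    ("South", "State C", 82.0, 84.0),
    ("South", "State D", 83.5, 83.0),
]

def rank_for_bar_chart(rows):
    """Group states by region and rank them from high to low on the
    2004-2009 estimate; each row keeps its 2001-2003 value and the
    absolute change between the two periods."""
    by_region = defaultdict(list)
    for region, state, early, late in rows:
        by_region[region].append({
            "state": state,
            "2001-2003": early,
            "2004-2009": late,
            "change": round(late - early, 1),
        })
    return {
        region: sorted(states, key=lambda r: r["2004-2009"], reverse=True)
        for region, states in by_region.items()
    }

ranked = rank_for_bar_chart(estimates)
for region, states in ranked.items():
    for row in states:
        print(region, row["state"], row["2001-2003"], row["2004-2009"], row["change"])
```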

Each graphic includes the pooled survival estimates for all 37 participating states combined.

Funnel Plots

Funnel plots are graphical representations designed to detect excessive variation in performance indicators by simple visual inspection of the data.³⁴ They provide a simple and informative display of geographical variation or time trends in population-based cancer survival measures (eg, age-standardized net survival).

A funnel plot comprises 4 elements³³: the target (or reference) value for the outcome, a set of control limits (the funnel), data points for the outcome variable (indicator), and the associated precision parameter for each data point. Data points outside the control limits (the funnel) indicate variation in the indicator beyond what would be expected by chance, while taking account of precision.³⁴

The funnel plot in Figure 2 shows, as an example, 5-year age-standardized net survival for breast cancer in the United States between 2004 and 2009, by race and state. It is constructed by plotting the 37 state-specific survival estimates for breast cancer between 2004 and 2009 (on the y-axis), against their associated precision (on the x-axis), forming a scatter plot. Fewer data points are available for blacks (28 states) than whites (37 states) because of the difficulty in constructing robust life tables for blacks in every state. The precision parameter in this example is, in fact, the precision of each age-standardized net survival estimate (the inverse of its variance). This is a natural choice to represent the statistical precision of each estimate, but it could be any function that is proportional to the inverse of the variance.

Figure 2. Five-year age-standardized net survival for women (15–99 years old) who were diagnosed with breast cancer in 2004–2009, by state and race. Each data point represents the survival estimate for a US state for either blacks (28 states) or whites (37 states; see text).

The target (the solid, horizontal line in Fig. 2) is then superimposed. This is a constant value, considered independent of the observations, and it specifies the expected value for the outcome. The target shown in Figure 2 is the 5-year age-standardized net survival estimate for the pooled US data for women diagnosed with breast cancer between 2004 and 2009. The pooled US estimate was selected as the target, to show the extent to which survival for blacks and whites in each state varies around the overall survival estimate for the United States.

The control limits (the dashed lines in Fig. 2) are also independent of the individual survival estimates. They depend only on the target value, and their correct formulation depends on the underlying theoretical distribution of the target value. The control limits for a given level of significance (α) are drawn around the target value across the entire observed range of precision of the individual estimates. The most common levels of significance are α = 5% and α = 0.2%, so the resulting 95% and 99.8% control limits represent approximately 2 and 3 standard deviations, respectively, on either side of the target value at each level of precision. An estimate that appears outside the control limits is identified as diverging from the target value and is an “out-of-control” estimate; in other words, it is a probable outlier that may need to be investigated further.
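A minimal sketch of control limits under a normal approximation is shown below, with precision defined as the inverse of the variance as in the text. The normal approximation, the truncation to 0–100, the function names, and the data points are assumptions made for illustration; the funnel plot methodology cited in this article may use a different formulation.

```python
from math import sqrt
from statistics import NormalDist

def control_limits(target, precision, alpha):
    """Approximate (lower, upper) control limits around `target` (in %) for
    an estimate with the given precision (1 / variance), using a normal
    approximation and truncating to the range 0-100."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z / sqrt(precision)
    return max(0.0, target - half_width), min(100.0, target + half_width)

def outside_limits(estimate, target, precision, alpha=0.002):
    """True if the estimate falls outside the control limits."""
    lower, upper = control_limits(target, precision, alpha)
    return estimate < lower or estimate > upper

# Hypothetical target (pooled estimate, %) and state-level data points:
# name -> (survival estimate in %, precision = 1 / variance)
target = 88.0
points = {
    "State A": (86.9, 4.0),
    "State B": (84.0, 1.0),
    "State C": (87.5, 0.25),
}

# States outside the 99.8% control limits (alpha = 0.2%)
flagged = [name for name, (est, prec) in points.items()
           if outside_limits(est, target, prec)]
print(flagged)
```

The limits narrow as precision increases, which produces the funnel shape: a given distance from the target is more notable for a precise estimate than for an imprecise one.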

In Figure 2, as with all the funnel plots reported in this supplement, 5-year age-standardized net survival is represented by open circles for white patients and by solid circles for black patients. Funnel plots are extremely powerful tools for visual examination of variation in an indicator: we can perceive at first glance that 5-year survival in blacks is persistently lower than would be expected (the pooled US survival estimate, ie, the target) and that survival for blacks is generally lower than survival for whites.

DISCUSSION

This article summarizes the data quality control procedures, analytic methods, and graphical presentations that have been deployed for all the data sets reported in this supplement. The quality of the population-based data from the 37 participating US cancer registries was impressively high (Table 2). More details about the quality indicators for each cancer can be found in the online supplementary appendix (http://www.thelancet.com/journals/lancet/article/PIIS0140-6736(14)62038-9/supplemental) for the CONCORD-2 article.³

For NPCR registries that use only passive follow-up to determine the vital status of registered cancer patients (the “presumed alive” method), survival estimates may be inflated if the cancer registrations for some patients who have in fact died could not be successfully linked to the data from their death certificate. The vital statistics offices in each state have reported all death certificate information to the National Death Index since 1979. Passive methods of follow-up are known to be efficient because of the completeness and accuracy of the National Death Index, which tends to capture 1% to 3% more deaths than linkage to the state death index alone.³⁵ Most of the extra deaths captured in this way will be those of patients who migrated to a different state after their cancer diagnosis. The registries included in these analyses had all matched their data against the National Death Index before data submission, so the completeness of vital status ascertainment is expected to be extremely high, although it may not capture out-of-country deaths.

A major strength of this study is the use of life tables that are specific for each state, each race (white, black), and each calendar year, to control for differences and changes in background mortality by single year of age, sex, race, state, and single calendar year. This approach provided the tightest possible control of background mortality with the available data. More specific life tables may be considered in future studies, subject to the availability of high-quality data on death and population counts for Hispanics or other major racial or ethnic groups.

The CONCORD-2 protocol required registries to provide information on the stage at diagnosis for patients diagnosed in 2001 or later. Calendar years of diagnosis were then grouped for analyses of survival by stage into 2001–2003 and 2004–2009, to reflect a change in the US stage coding system in 2004.

This choice of calendar periods determined the analytic approach. We were able to estimate 5-year net survival with the “cohort” approach for patients diagnosed in 2001–2003, because all patients had at least 5 years of potential follow-up. However, the “period” approach³⁶ that was adopted to estimate 5-year survival for patients diagnosed between 2005 and 2009 in the CONCORD-2 study³ could have been used to estimate 5-year survival by stage for patients diagnosed between 2004 and 2009 only if stage data had also been available for patients diagnosed in 2000. We therefore adopted the “complete” approach for 2004–2009. In this approach, more information is available for estimating survival in the early years of follow-up than in the later years. Only patients diagnosed in 2004 had the potential to be followed up for a full 5 years by December 31, 2009, and only patients diagnosed in 2004 or 2005 had the potential to contribute to the conditional survival probabilities between 4 and 5 years after diagnosis (and so on for earlier intervals). This leads to somewhat greater variation around the 5-year survival estimates for 2004–2009 than around those for 2001–2003, which is reflected in the wider confidence intervals and the slightly lower precision seen in the funnel plots for 2004–2009 in some of the site-specific articles.
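The follow-up arithmetic behind the cohort and complete approaches can be checked with a short sketch. The function names are hypothetical, and follow-up is counted in whole years by treating January 1 to December 31 as one year, which is a simplification.

```python
END_OF_FOLLOW_UP_YEAR = 2009  # follow-up ends December 31, 2009

def potential_follow_up(year_of_diagnosis, end_year=END_OF_FOLLOW_UP_YEAR):
    """(minimum, maximum) potential follow-up in whole years for patients
    diagnosed on December 31 and January 1 of the given year."""
    minimum = end_year - year_of_diagnosis
    return minimum, minimum + 1

def contributing_years(interval_end, diagnosis_years, end_year=END_OF_FOLLOW_UP_YEAR):
    """Diagnosis years in which at least some patients can be observed during
    the interval from (interval_end - 1) to interval_end years after diagnosis."""
    return [y for y in diagnosis_years
            if potential_follow_up(y, end_year)[1] > interval_end - 1]

cohort_years = range(2001, 2004)    # cohort approach: 2001-2003
complete_years = range(2004, 2010)  # complete approach: 2004-2009

# Cohort approach: every patient has at least 5 years of potential follow-up
cohort_ok = all(potential_follow_up(y)[0] >= 5 for y in cohort_years)

# Complete approach: years in which all patients have a full 5 years
full_five_years = [y for y in complete_years if potential_follow_up(y)[0] >= 5]

for k in range(1, 6):
    print(f"interval {k - 1}-{k} years:", contributing_years(k, complete_years))
```

The list of contributing diagnosis years shrinks with each successive interval, which is why the later conditional probabilities in the complete approach rest on fewer patients.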

This is the most extensive analysis of 5-year population-based cancer survival in the United States to date, with survival trends for 10 common cancers in 37 states that include 80% of the US population. Here, we have focused on variations in survival by race and stage at diagnosis for patients diagnosed between 2001 and 2009.

Population-based cancer survival is a key measure of the overall effectiveness of the health system in dealing with cancer. The high quality of the data from the US cancer registries, implementation of the most up-to-date and unbiased estimator of net survival, combined with the use of state- and race-specific life tables, all help to ensure that these cancer survival estimates are robust and comparable. We believe that they can be confidently used by policy-makers to identify inequities in cancer survival by race in each state and for the United States as a whole, and to plan cancer control strategies that promote equal opportunity for the best possible outcomes after a cancer diagnosis.

Acknowledgments

FUNDING SUPPORT

This study was funded by the US Centers for Disease Control and Prevention (12FED03123 and ACO12036).

Footnotes

CONFLICT OF INTEREST DISCLOSURES

The authors made no disclosures.

The findings and conclusions in this report are those of the authors and do not necessarily reflect the official position of the CDC.

This Supplement edition of Cancer has been sponsored by the U.S. Centers for Disease Control and Prevention (CDC), an Agency of the Department of Health and Human Services.

The CONCORD-2 study was approved by the Ethics and Confidentiality Committee of the UK’s statutory National Information Governance Board (now the Health Research Authority) (ref ECC 3-04(i)/2011) and by the National Health Service Research Ethics Service (Southeast; 11/LO/0331).

AUTHOR CONTRIBUTIONS

Claudia Allemani: Conceptualization, methodology, writing–original draft, supervision and funding acquisition. Rhea Harewood: Data validation, formal analysis and visualization. Christopher J. Johnson: Writing–review and editing. Helena Carreira: Data validation, formal analysis and visualization. Devon Spika: Life tables, data validation, formal analysis, and visualization. Audrey Bonaventure: Writing–review and editing. Kevin Ward: Writing–review and editing. Hannah K. Weir: Writing–review and editing. Michel P. Coleman: Conceptualization, methodology, writing–review and editing, and funding acquisition.

References

1. Coleman MP. Cancer survival: global surveillance will stimulate health policy and improve equity. Lancet. 2014;383:564–573. doi: 10.1016/S0140-6736(13)62225-4.
2. Coleman MP, Quaresma M, Berrino F, et al. Cancer survival in five continents: a worldwide population-based study (CONCORD). Lancet Oncol. 2008;9:730–756. doi: 10.1016/S1470-2045(08)70179-7.
3. Allemani C, Weir HK, Carreira H, et al. Global surveillance of cancer survival 1995–2009: analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet. 2015;385:977–1010. doi: 10.1016/S0140-6736(14)62038-9.
4. White MC, Babcock F, Hayes NS, et al. The history and use of cancer registry data by public health cancer control programs in the United States. Cancer. 2017;123:4969–4976. doi: 10.1002/cncr.30905.
5. North American Association of Central Cancer Registries. https://www.naaccr.org/data-standards-data-dictionary/
6. IACR Working Group. International rules for multiple primary cancers (ICD-O third edition). Eur J Cancer Prev. 2005;14:307–308. doi: 10.1097/00008469-200508000-00002.
7. Surveillance, Epidemiology, and End Results Program. Multiple primary and histology coding rules manual. http://seer.cancer.gov/tools/mphrules/download.html. Accessed September 24, 2016.
8. Fritz AG, Percy C, Jack A, et al. International Classification of Diseases for Oncology (ICD-O). Geneva, Switzerland: World Health Organization; 2013.
9. Johnson CJ, Weir HK, Mariotto AB, Nishri D, Wilson R, editors. Cancer in North America: 2008–2012. Volume Four: Cancer Survival in the United States and Canada 2005–2011. Springfield, IL: North American Association of Central Cancer Registries; 2016.
10. Young JL, Roffers SD, Ries LAG, Fritz AG, Hurlbut AA. SEER Summary Staging Manual–2000: Codes and Coding Instructions. Bethesda, MD: National Cancer Institute; 2001. NIH publication 01–4969.
11. Cronin KA, Ries LAG, Edwards BK. Preface. Cancer. 2014;120:3755–3757. doi: 10.1002/cncr.29049.
12. Estève J, Benhamou E, Raymond L. Statistical Methods in Cancer Research, Volume IV: Descriptive Epidemiology. Lyon, France: International Agency for Research on Cancer; 1994. IARC Scientific Publication 128.
13. Cutler SJ, Ederer F. Maximum utilisation of the life table method in analyzing survival. J Chronic Dis. 1958;8:699–712. doi: 10.1016/0021-9681(58)90126-7.
14. Corazziari I, Quinn MJ, Capocaccia R. Standard cancer patient population for age standardising survival ratios. Eur J Cancer. 2004;40:2307–2316. doi: 10.1016/j.ejca.2004.07.002.
15. Pohar Perme M, Stare J, Estève J. On estimation in relative survival. Biometrics. 2012;68:113–120. doi: 10.1111/j.1541-0420.2011.01640.x.
16. StataCorp. Stata Statistical Software. College Station, TX: Stata Corporation; 2015.
17. Clerc-Urmés I, Grzebyk M, Hédelin G. Net survival estimation with stns. Stata J. 2014;14:87–102.
18. Stiller CA, Bunch KJ. Trends in survival for childhood cancer in Britain diagnosed 1971–85. Br J Cancer. 1990;62:806–815. doi: 10.1038/bjc.1990.383.
19. Greenwood M. The Natural Duration of Cancer. London, United Kingdom: Her Majesty’s Stationery Office; 1926.
20. Jim MA, Pinheiro PS, Carreira H, Espey DK, Wiggins CL, Weir HK. Stomach cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:4994–5013. doi: 10.1002/cncr.30881.
21. White A, Joseph DA, Rim SH, Johnson CJ, Coleman MP, Allemani C. Colon cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5014–5036. doi: 10.1002/cncr.31076.
22. Joseph DA, Johnson CJ, White A, Wu M, Coleman MP. Rectal cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5037–5058. doi: 10.1002/cncr.30882.
23. Momin BR, Pinheiro PS, Carreira H, Li C, Weir HK. Liver cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5059–5078. doi: 10.1002/cncr.30820.
24. Richards TB, Henley SJ, Puckett MC, et al. Lung cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5079–5099. doi: 10.1002/cncr.31029.
25. Miller JW, Lee Smith J, Ryerson AB, Tucker TC, Allemani C. Disparities in breast cancer survival in the United States (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5100–5118. doi: 10.1002/cncr.30988.
26. Benard V, Watson M, Saraiya M, et al. Cervical cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5119–5137. doi: 10.1002/cncr.30906.
27. Stewart SL, Harewood R, Matz M, et al. Disparities in ovarian cancer survival in the United States (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5138–5159. doi: 10.1002/cncr.31027.
28. Steele CB, Li J, Huang B, Weir HK. Prostate cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5160–5177. doi: 10.1002/cncr.31026.
29. Tai E, Ward KC, Bonaventure A, Siegel D, Coleman MP. Survival among children diagnosed with acute lymphoblastic leukemia in the United States by race and age, 2001–2009: findings from the CONCORD-2 study. Cancer. 2017;123:5178–5189. doi: 10.1002/cncr.30899.
30. Spika D, Bannon F, Bonaventure A, et al. Life tables for global surveillance of cancer survival (the CONCORD programme): data sources and methods. BMC Cancer. 2017;17:159. doi: 10.1186/s12885-017-3117-8.
31. Spika D, Rachet B, Bannon F, et al. Life tables for the CONCORD-2 study. http://csg.lshtm.ac.uk/tools-analysis/. Accessed September 24, 2016.
32. Rachet B, Maringe C, Woods LM, Ellis L, Spika D, Allemani C. Multivariable flexible modelling for estimating complete, smoothed life tables for sub-national populations. BMC Public Health. 2015;15:1240. doi: 10.1186/s12889-015-2534-3.
33. Quaresma M, Coleman MP, Rachet B. Funnel plots for population-based cancer survival: principles, methods and applications. Stat Med. 2014;33:1070–1080. doi: 10.1002/sim.5953.
34. Spiegelhalter DJ. Funnel plots for comparing institutional performance. Stat Med. 2005;24:1185–1202. doi: 10.1002/sim.1970.
35. Johnson CJ, Weir HK, Fink AK, et al. The impact of National Death Index linkages on population-based cancer survival rates in the United States. Cancer Epidemiol. 2013;37:20–28. doi: 10.1016/j.canep.2012.08.007.
36. Brenner H, Gefeller O. An alternative approach to monitoring cancer patient survival. Cancer. 1996;78:2004–2010.
