Abstract
Monitoring and comparing trends in cancer rates across geographic regions or over different time periods is a main task of the National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) Program, which uses these comparisons to profile health care quality and to decide on health care resource allocation within a spatial-temporal framework. A fundamental difficulty arises, however, when such comparisons have to be made for regions or time intervals that overlap, e.g. comparing the change in trends of mortality rates in a local area (e.g. the mortality rate of breast cancer in California) with that at a more global level (e.g. the national mortality rate of breast cancer). Given the scarcity of available methods, this paper develops a simple corrected Z-test that accounts for such overlap. The performance of the proposed test relative to the two-sample “pooled” t-test, which assumes independence across comparison groups, is assessed via the Pitman asymptotic relative efficiency as well as Monte Carlo simulations and applications to the SEER cancer data. The proposed test will be important for the SEER*STAT software, maintained by the NCI, for the analysis of the SEER data.
Keywords: Age-adjusted cancer rates, Annual percent change (APC), Surveillance, Trends, Hypothesis testing, Pitman asymptotic relative efficiency (ARE)
1 Introduction
Cancer remains a major public health concern in the United States, where it is the second leading cause of death each year. For instance, cancer resulted in approximately 570,280 deaths in 2005 (American Cancer Society, 2005), while the overall cost of cancer, including the costs of diagnosis, treatment, lost person-hours, and education and research, was as much as $189.8 billion for 2004 (Ghosh and Tiwari, 2007).
Many public and private agencies dealing with cancer and related problems depend on the rates of cancer deaths or new cases as an estimate of cancer burden for planning and resource allocation. Among these agencies, the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI) is the most authoritative and comprehensive source of information on cancer incidence and deaths in the United States. It currently collects and publishes cancer incidence and survival data from population-based cancer registries covering over a quarter of the entire US population. One main task of the SEER program is to routinely monitor and compare trends in cancer mortality and incidence rates across geographic regions or over different time periods. The data are analyzed by the SEER*STAT software, which is maintained by the NCI, with the results periodically published at http://seer.cancer.gov/csr/. This surveillance task has important social and economic ramifications, ranging from deciding which cancer programs get funded to deciding how the funds are allocated among various regions. Having reliable and accurate comparisons of trends in cancer rates is thus of tremendous importance.
However, a fundamental statistical difficulty arises when such comparisons, made largely for policy-making purposes, involve regions or time intervals that overlap. A typical case is comparing the most recent changes in trends of cancer rates in a local area (e.g. the mortality rate of breast cancer in California) with those at a more global level (e.g. the national mortality rate) over two time periods that overlap because of data availability. For example, as detailed in the data analysis section, it is of substantial interest to compare the changes in California cancer mortality rates with the national cancer mortality rates over the last 15 years. However, for a 15-year block, the California cancer rates were available for 1990–2004, while the national data were available for 1988–2002.
As the current SEER*STAT software utilizes the two-sample pooled t-test (Kleinbaum et al., 1988) that assumes independence across comparison groups, it is not appropriate for the aforementioned settings. In this paper, we develop a simple corrected Z-test that accounts for the overlap and that will be available for the NCI SEER program.
The rest of this article is structured as follows. In Section 2, we introduce the cancer rate regression model that has been used in the SEER analysis. Section 3 reviews the classical t-test, employed by the current SEER*STAT software, for comparing the trends between two independent regressions. In Section 4, we propose a corrected Z-test that properly accounts for correlation when the comparison has to be made across two overlapping regions or time intervals. In Section 5, the performance of the proposed test is assessed via applications to the SEER cancer data, and its validity is confirmed by simulations. We conclude with a short summary in Section 6. Technical details are relegated to the Web Appendix (http://www.biometrics.tibs.org/).
2 Age-adjusted Cancer Rate Regression Model and Annual Percent Change
Let $n_{ji}$ and $d_{ji}$ be the mid-year population at risk and the count of deaths or incident cases, respectively, for age group $j$ ($j = 1,\ldots,J$) at time $t_i$, $i = 1,\ldots,I$. The age-adjusted rate at time $t_i$ is typically computed as
$$ r_i = \sum_{j=1}^{J} w_j \frac{d_{ji}}{n_{ji}}, \tag{1} $$
where $w_j > 0$, $j = 1,\ldots,J$, are the known standards for age group $j$, so that $\sum_{j=1}^{J} w_j = 1$. In the SEER program, there are $J = 19$ standard age groups consisting of 0–1, 1–4, 5–9,…,85+, and the specific weights $w_j$ are given in Fay et al. (2006).
To describe the trend in mortality or incidence, we often take the logarithm of $r_i$ and fit a linear regression on calendar time. However, for rare cancers, $r_i$ defined in (1) can be zero, in which case its logarithm is undefined. To avoid this situation, we introduce a correction factor, which amounts to distributing a count of 1 uniformly over all $J$ categories, and hence adding $1/J$ to $d_{ji}$, yielding a zero-corrected rate (Tiwari et al., 2006)
$$ r_i^{*} = \sum_{j=1}^{J} w_j \frac{d_{ji} + 1/J}{n_{ji}}. \tag{2} $$
Numerically, the difference between (1) and (2) is negligible; however, the logarithm of the latter is always defined. A simple linear regression has been established by a number of authors (Kim et al. 2000; Tiwari et al., 2005; Fay et al., 2006) to link the logarithm of the mortality or incidence rate, say, $y_i = \log(r_i^{*})$, to the calendar time $t_i$, via
$$ y_i = \beta_0 + \beta_1 t_i + e_i, \quad i = 1,\ldots,I, \tag{3} $$
where the $e_i$ are i.i.d. normal with mean 0 and variance $\sigma^2$, which measures the fluctuation of rates over years.
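As a minimal sketch of the age adjustment and zero correction described above, the following Python snippet computes a weighted age-adjusted rate and its zero-corrected counterpart, in which 1/J is added to every age-specific count. The function name `age_adjusted_rate` and the three-group toy data are hypothetical and serve only for illustration; they are not SEER values or SEER*STAT code.

```python
import math

def age_adjusted_rate(deaths, population, weights, zero_correct=False):
    """Weighted sum of age-specific rates; optionally add 1/J to each count."""
    J = len(weights)
    if not (len(deaths) == len(population) == J):
        raise ValueError("inputs must have one entry per age group")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    shift = 1.0 / J if zero_correct else 0.0
    return sum(w * (d + shift) / n
               for w, d, n in zip(weights, deaths, population))

# Hypothetical toy data with J = 3 age groups (not SEER values).
weights = [0.5, 0.3, 0.2]
population = [100_000, 80_000, 20_000]
deaths = [0, 0, 0]  # a rare cancer with no deaths in this year

raw = age_adjusted_rate(deaths, population, weights)
corrected = age_adjusted_rate(deaths, population, weights, zero_correct=True)

# The raw rate is zero, so its logarithm is undefined;
# the zero-corrected rate is strictly positive, so its logarithm exists.
y = math.log(corrected)
```

With realistic counts the two rates differ only slightly, so the correction matters mainly when some age-specific counts are zero.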
Model (3) is commonly referred to as the (transformed) Cancer Rate Regression Model in the SEER analysis (see e.g. Kim et al. 2000; Tiwari et al., 2005; Fay et al., 2006), and it can be conveniently fitted to the observed data $(t_i, y_i)$, $i = 1,\ldots,I$, by least squares or maximum likelihood. The resulting estimates of $\beta = (\beta_0, \beta_1)$ are denoted by $\hat\beta = (\hat\beta_0, \hat\beta_1)$.
The regression coefficient $\beta_1$ in (3) has been of main interest, as it describes the trend of mortality or incidence. Indeed, the annual percent change (APC), defined as $\mathrm{APC} = 100(e^{\beta_1} - 1)$, has been used by the NCI (see e.g. Fay et al., 2006) for describing the trends of cancer incidence and mortality. Its estimate, $\widehat{\mathrm{APC}} = 100(e^{\hat\beta_1} - 1)$, along with its variance, obtained via the delta method (Ries et al., 2003; Fay et al., 2006), constitutes the basis for drawing inference on the trend, e.g. constructing confidence intervals or testing hypotheses. Here, $\mathrm{Var}(\hat\beta_1) = \sigma^2 / \sum_{i=1}^{I} (t_i - \bar t)^2$ with $\bar t = I^{-1}\sum_{i=1}^{I} t_i$, and $\sigma^2$ is estimated by the unbiased estimator $\hat\sigma^2 = (I-2)^{-1}\sum_{i=1}^{I} (y_i - \hat y_i)^2$, where $\hat y_i$ is a prediction of $y_i$ based on (3), namely, $\hat y_i = \hat\beta_0 + \hat\beta_1 t_i$.
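The estimation of the APC can be sketched in a few lines of Python, assuming ordinary least squares for model (3) and the delta-method approximation SE(APC) ≈ 100·exp(β̂1)·SE(β̂1). The helper `fit_apc` and the synthetic rates (an exact 2% annual decline with a small alternating perturbation) are illustrative assumptions, not the SEER*STAT implementation.

```python
import math

def fit_apc(times, log_rates):
    """Least-squares fit of y = b0 + b1*t, then APC = 100*(exp(b1) - 1)."""
    I = len(times)
    t_bar = sum(times) / I
    y_bar = sum(log_rates) / I
    s2 = sum((t - t_bar) ** 2 for t in times)
    b1 = sum((t - t_bar) * (y - y_bar)
             for t, y in zip(times, log_rates)) / s2
    b0 = y_bar - b1 * t_bar
    rss = sum((y - (b0 + b1 * t)) ** 2 for t, y in zip(times, log_rates))
    sigma2_hat = rss / (I - 2)           # unbiased residual variance
    se_b1 = math.sqrt(sigma2_hat / s2)   # standard error of the slope
    apc = 100.0 * (math.exp(b1) - 1.0)
    se_apc = 100.0 * math.exp(b1) * se_b1  # delta method
    return {"b0": b0, "b1": b1, "se_b1": se_b1,
            "apc": apc, "se_apc": se_apc}

# Synthetic log rates: a 2% annual decline plus a small alternating
# perturbation that, over 15 equally spaced years, leaves the slope unchanged.
years = list(range(1990, 2005))
log_rates = [math.log(200.0) + (t - 1990) * math.log(0.98)
             + 0.01 * (-1) ** i for i, t in enumerate(years)]

fit = fit_apc(years, log_rates)
```

Here `fit["apc"]` recovers the 2% annual decline built into the synthetic data, while `fit["se_apc"]` reflects the size of the perturbation.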
For the purpose of health-care evaluations, it is of substantial interest to compare the APC of one region (e.g., county or state level) to that of another region, or to a more global level (e.g. state or national level). One may also be interested in comparing the APCs over two overlapping intervals. In the following, we derive the tests for comparing APCs of two overlapping regions within two overlapping time intervals, which includes the aforementioned local-vs-global comparison as a special case.
3 Test for Equality of APCs for Two Independent Regressions
To start, we briefly review the test for comparing APCs for two independent comparison groups, e.g. for two non-overlapping regions or time intervals. That is, we consider two independent linear regressions
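$$ y_{ki} = \beta_{k0} + \beta_{k1} t_{ki} + e_{ki}, \quad i = 1,\ldots,I_k, \tag{4} $$
for $k = 1, 2$, indexing groups 1 and 2, respectively, where $I_k$ is the number of time points in group $k$.
Let $\mathrm{APC}_1$ and $\mathrm{APC}_2$ be the corresponding APC values for these two regressions. Often, we wish to test the null hypothesis $H_0: \mathrm{APC}_1 = \mathrm{APC}_2$ versus the alternative hypothesis $H_1: \mathrm{APC}_1 \neq \mathrm{APC}_2$, which is equivalent to testing $H_0': \beta_{11} = \beta_{21}$ versus $H_1': \beta_{11} \neq \beta_{21}$. Under the assumption that the error variances for the two groups are equal, a test for the latter is given by Kleinbaum et al. (1988):
$$ T = \frac{\hat\beta_{11} - \hat\beta_{21}}{\hat\sigma \sqrt{1/s_1^2 + 1/s_2^2}} \sim t_{I_1 + I_2 - 4} \quad \text{under } H_0', \tag{5} $$
where
$$ s_k^2 = \sum_{i=1}^{I_k} (t_{ki} - \bar t_k)^2, \qquad \bar t_k = I_k^{-1} \sum_{i=1}^{I_k} t_{ki}, $$
for $k = 1, 2$, and $\hat\sigma^2$ is the “pooled” unbiased estimate of $\sigma^2$ given by
$$ \hat\sigma^2 = \frac{1}{I_1 + I_2 - 4} \sum_{k=1}^{2} \sum_{i=1}^{I_k} (y_{ki} - \hat y_{ki})^2, $$
where $\hat y_{ki} = \hat\beta_{k0} + \hat\beta_{k1} t_{ki}$ are the predictions for $k = 1, 2$. Test (5) is currently employed by the NCI SEER*STAT software (http://seer.cancer.gov/seerstat). We remark that an implicit assumption in these tests is that the total population in each time period is stable, so that the variance of $y_{ki}$ is time-independent. This is indeed the case for the SEER incidence/mortality data. Hence, we will make the same assumption throughout.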
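The pooled test for equal slopes can be sketched as follows, assuming the textbook form of the statistic: the difference of the two least-squares slopes divided by the pooled residual standard deviation times the square root of 1/s1² + 1/s2², with I1 + I2 − 4 degrees of freedom. The function names `slope_fit` and `pooled_slope_t` and the synthetic series are hypothetical; this is an illustration, not the SEER*STAT code.

```python
import math

def slope_fit(times, ys):
    """Return the OLS slope, the sum of squares of centered times, and the RSS."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(ys) / n
    s2 = sum((t - t_bar) ** 2 for t in times)
    b1 = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, ys)) / s2
    b0 = y_bar - b1 * t_bar
    rss = sum((y - b0 - b1 * t) ** 2 for t, y in zip(times, ys))
    return b1, s2, rss

def pooled_slope_t(times1, y1, times2, y2):
    """t statistic and degrees of freedom for H0: equal slopes (independent groups)."""
    b11, s1_sq, rss1 = slope_fit(times1, y1)
    b21, s2_sq, rss2 = slope_fit(times2, y2)
    df = len(times1) + len(times2) - 4
    sigma2_hat = (rss1 + rss2) / df  # pooled residual variance
    se = math.sqrt(sigma2_hat * (1.0 / s1_sq + 1.0 / s2_sq))
    return (b11 - b21) / se, df

# Synthetic log rates for two groups treated as independent (toy values).
times1 = list(range(1990, 2005))
times2 = list(range(1988, 2003))
y1 = [5.0 - 0.010 * (t - 1990) + 0.01 * (-1) ** i for i, t in enumerate(times1)]
y2 = [5.2 - 0.020 * (t - 1988) + 0.01 * (-1) ** i for i, t in enumerate(times2)]

t_stat, df = pooled_slope_t(times1, y1, times2, y2)
```

The statistic is then referred to a t distribution with `df` degrees of freedom; the calculation is only justified when the two groups are independent.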
4 A Corrected Z-test for Two Dependent Regressions
Much difficulty arises as (5) is no longer valid if the independence assumption is violated. Suppose we are interested in comparing the APCs of two overlapping regions, say, Region 1 and Region 2, with data collected over two time intervals [t1, tm] and [ts+1, ts+I], which possibly overlap. That is, t1 ≤ ts+1 < tm ≤ ts+I. We modify (4) to accommodate this situation
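$$ y_{1i} = \beta_{10} + \beta_{11} t_i + e_{1i}, \quad i = 1,\ldots,m, \tag{6} $$
for Region 1, and
$$ y_{2i} = \beta_{20} + \beta_{21} t_i + e_{2i}, \quad i = s+1,\ldots,s+I, \tag{7} $$
for Region 2. Let $\hat\beta_{11}$ and $\hat\beta_{21}$ be the estimates of the slope parameters of the regression lines for these two regions, respectively. In particular,
$$ \hat\beta_{11} = \frac{\sum_{i=1}^{m} (t_i - \bar t_1)\, y_{1i}}{s_1^2}, \qquad \hat\beta_{21} = \frac{\sum_{i=s+1}^{s+I} (t_i - \bar t_2)\, y_{2i}}{s_2^2}, $$
where $\bar t_1 = m^{-1}\sum_{i=1}^{m} t_i$, $\bar t_2 = I^{-1}\sum_{i=s+1}^{s+I} t_i$, $s_1^2 = \sum_{i=1}^{m} (t_i - \bar t_1)^2$ and $s_2^2 = \sum_{i=s+1}^{s+I} (t_i - \bar t_2)^2$.
When Regions 1 and 2 overlap, the two regressions may not be independent and, hence, (5) will not be valid, as it fails to account for the correlation between $\hat\beta_{11}$ and $\hat\beta_{21}$. Indeed, under the assumption that the errors $e_{1i}$ and $e_{2i}$ are i.i.d. normal with mean 0 and equal variance $\sigma^2$ for the two regions,
$$ \mathrm{Var}(\hat\beta_{11} - \hat\beta_{21}) = \mathrm{Var}(\hat\beta_{11}) + \mathrm{Var}(\hat\beta_{21}) - 2\,\mathrm{Cov}(\hat\beta_{11}, \hat\beta_{21}), $$
where $\mathrm{Var}(\hat\beta_{k1}) = \sigma^2 / s_k^2$ for $k = 1, 2$. It turns out that the derivation of $\mathrm{Cov}(\hat\beta_{11}, \hat\beta_{21})$, when the two time intervals $[t_1, t_m]$ and $[t_{s+1}, t_{s+I}]$ under consideration overlap, is nontrivial, as it requires careful consideration of the overlap of the two regions. The detailed derivation is given in the Web Appendix, which shows that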
(8) |
where Here, we have used superscript ‘O’ to denote the intersection of Regions 1 and 2, and denoted by nkji and the numbers of underlying population at risk for age group j at time ti in Region k(k = 1, 2), and in the overlapping subregion, respectively.
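The cross term in (8),
$$ \sigma_{12} = \sum_{i=s+1}^{m} (t_i - \bar t_1)(t_i - \bar t_2), $$
where $\bar t_1$ and $\bar t_2$ are the means of the time points in $[t_1, t_m]$ and $[t_{s+1}, t_{s+I}]$, respectively, merits attention as it determines the sign of (8) and is completely determined by how $[t_1, t_m]$ overlaps with $[t_{s+1}, t_{s+I}]$. For example, when $[t_1, t_m]$ coincides with $[t_{s+1}, t_{s+I}]$ (i.e. $s = 0$, $m = I$), then $\sigma_{12} = \sum_{i=1}^{I} (t_i - \bar t_1)^2 > 0$. On the other hand, when $[t_1, t_m]$ only partially overlaps with $[t_{s+1}, t_{s+I}]$, $\sigma_{12}$ can be negative, causing a negative covariance in (8). For example, when $s$ is close to $m$ such that $t_i > \bar t_1$ and $t_i < \bar t_2$ for any $i \in [s+1, m]$, every summand is negative, leading to $\sigma_{12} < 0$.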
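The behaviour of the cross term can be checked numerically. The sketch below assumes that σ12 is the sum, over the time points shared by the two intervals, of the products of the deviations from the two interval means; this form reproduces the values σ12 = 152.75 and σ12 = −141.25 quoted for the two simulation scenarios in Section 5. The function name `sigma12` is hypothetical.

```python
def sigma12(years1, years2):
    """Sum over shared years of (t - mean of interval 1) * (t - mean of interval 2)."""
    mean1 = sum(years1) / len(years1)
    mean2 = sum(years2) / len(years2)
    shared = sorted(set(years1) & set(years2))
    return sum((t - mean1) * (t - mean2) for t in shared)

# Scenario 1 of Section 5: 1986-2001 versus 1989-2004 (large overlap).
scenario1 = sigma12(range(1986, 2002), range(1989, 2005))   # 152.75

# Scenario 2 of Section 5: 1978-1993 versus 1989-2004 (small overlap).
scenario2 = sigma12(range(1978, 1994), range(1989, 2005))   # -141.25

# Identical intervals: the cross term is a sum of squares, hence positive.
identical = sigma12(range(1990, 2005), range(1990, 2005))   # 280.0

# Disjoint intervals: no shared years, so the cross term vanishes.
disjoint = sigma12(range(1970, 1985), range(1990, 2005))    # 0
```

The sign change between the two scenarios arises because, with a small overlap, the shared years lie above the mean of the first interval and below the mean of the second.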
Note that when the overlapping region is an empty set, n(O) = 0, and . When s + 1 > m (i.e. the time intervals are non-overlapping), σ12 = 0 and, hence, as well. On the other hand, if, for example, Region 1 is completely contained in Region 2, then n1 = n(O), and
So, in summary, if the two regions are non-overlapping (or time intervals are non-overlapping)
(9) |
and if Region 1 is completely contained in Region 2,
where n1/n2 is typically termed as the overlapping ratio. In general, for two regions that overlap partially,
(10) |
Eq. (10) reveals that its asymptotic efficacy (AE), defined by its noncentrality, is
(11) |
compared to the AE of the naive test that ignores overlapping [cf. (9) or (5)]
(12) |
Hence, the Pitman Asymptotic Relative Efficiency (ARE), which is the ratio of (11) and (12), and measures the gain of efficiency by accounting for overlapping, is
Several points are worth mentioning. First, when n(O) = 0 (corresponding to disjoint regions) or ts+1 > tm (corresponding to disjoint time intervals), the Pitman ARE is 1, justifying the use of the classical test (as used in the current SEER*STAT software). Secondly (and interestingly), depending on the sign of σ12, i.e the mixing of the time intervals, the ARE can be greater or less than 1. Specifically, when σ12 > 0, then ARE > 1, indicating the naive test will be too conservative; otherwise, ARE < 1, hinting that the naive test will be too aggressive and will not maintain the nominal type I error, all of which calls for a new test that accounts for overlapping. Finally, as a simple example, when s = 0, m = I (i.e. two time intervals are identical), then indicating that the naive test will always be too conservative and the efficiency loss will become more severe as the overlapping population n(O) becomes larger.
In practice, as σ2 is unknown, we have to replace it with a consistent estimate, leading to the following Z-test,
(13) |
Under the null hypothesis, Z in (13) approximately follows a normal distribution, where an unbiased estimate for σ2 is given by
5 Analysis of SEER Mortality Data and Simulation Studies
It is of substantial interest to compare the changes in cancer mortality rates in California with those at the national level, as a California law (Health and Safety Code, Section 103885) passed in the late 1980s mandated the reporting of malignancies diagnosed throughout the state. For this purpose, we applied the proposed methodology to compare the annual percent change (APC) in the age-adjusted mortality rates for the United States (US) for the period 1988–2002 with that for California (CA) for the period 1990–2004. We fitted the simple linear models (4) to the logarithms of the age-adjusted mortality rates for both males and females for a number of cancer sites from the Cancer Facts & Figures (American Cancer Society, 2007). The mortality data for the United States are compiled by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (www.cdc.gov/nchs) and are available from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program (http://www.seer.cancer.gov). The ratio of the total population for all age groups combined for CA to that for the US for the overlapping years (i.e. n1/n2) was around 11% for females and 11.5% for males. Because of space constraints, the results are summarized in Tables A.1 and A.2 in the Web Appendix. The tables give the estimates of the slope parameters for CA and the US and their standard errors, along with the p-values for the comparisons based on the naive t-test and the corrected Z-test. The estimate of the common residual variance σ2 is also provided. We also calculated the residual variances for all the cancer sites for CA and the US separately (not reported in the tables), and found that they were close, supporting our common variance assumption.
The tables show that the corrected Z-test detects the difference between the two APCs more aggressively than the t-test, yielding smaller p-values for all the cancer sites. For example, the corrected Z-test detected a significant difference in the APC between CA and the US for stomach cancer in men (meaning that CA has a more rapid decrease in the stomach cancer mortality rate than the US), while the naive t-test failed to detect such a difference at the 5% significance level.
We also compared the annual percent change (APC) in the most recent 15 years of age-adjusted mortality rates for California (1990–2004) with that in the national mortality rates during the 1980s and early 1990s (1980–1994). Indeed, it has been common practice for policy-makers to evaluate the progress made at a state level by comparing it with historical national trends (see e.g. http://statecancerprofiles.cancer.gov/historicaltrend). Statistically, this comparison is also of interest. In particular, as σ12 < 0 in this case, the theoretical results in Section 4 suggest that the naive t-test would be too aggressive and, hence, might ‘exaggerate’ the progress made in California. The ratio of the total population for all age groups combined for CA to that for the US for the overlapping years (1990–1994) (i.e. n1/n2) was around 11.1% for females and 11.4% for males. The results are summarized in Tables B.1 and B.2 in the Web Appendix. These tables show that the naive t-test was slightly more aggressive than the corrected Z-test, yielding slightly smaller p-values for all the cancer sites. This is consistent with our theoretical results.
To further confirm our analysis results, simulation studies were performed to compare the characteristics of the naive t-test, based on (5), with those of the corrected Z-test (13), which properly accounts for overlap. The simulations compared the APCs for two regions, mimicking the comparison between, say, the Southern Region (Region 1) consisting of Georgia (GA), South Carolina (SC) and North Carolina (NC), and the Eastern Region (Region 2) consisting of NC, Virginia (VA) and Maryland (MD), with NC the overlapping state. The time periods, with varying degrees of overlap, were taken to be: (Scenario 1) years 1986,…,2001 for Region 1 and years 1989,…,2004 for Region 2, so that there is a considerable overlap of 13 calendar years (1989–2001) between the two intervals and σ12 = 152.75; (Scenario 2) years 1978,…,1993 for Region 1 and years 1989,…,2004 for Region 2, so that there is only a small overlap of five calendar years (1989–1993) between the two intervals and σ12 = −141.25 < 0. For generating the counts $d_{kji}$, we assume that $d_{kji} \sim_{ind} \mathrm{Poisson}(n_{kji}\lambda_{kji})$, where $\log(\lambda_{kji}) = \beta_{kj,0} + \beta_{k1} t_i$, with $t_i$ taking values in the intervals corresponding to the two regions stated above. Define the age-adjusted rate at time $t_i$ in Region $k$ as $r_{ki} = \sum_{j=1}^{J} w_j d_{kji}/n_{kji}$. Then the specification for $\lambda_{kji}$ leads to
$$ E(r_{ki}) = \sum_{j=1}^{J} w_j \lambda_{kji} = B_{k,0}\, e^{\beta_{k1} t_i}, \qquad B_{k,0} = \sum_{j=1}^{J} w_j e^{\beta_{kj,0}}. $$
Hence, the delta method would yield that $E(y_{ki}) \equiv E(\log r_{ki}) \approx \log(B_{k,0}) + \beta_{k1} t_i$, where $\mathrm{APC}_k = 100(e^{\beta_{k1}} - 1)$.
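The data-generating mechanism for a single region can be sketched as follows: counts are drawn as Poisson with mean n·λ, where log λ is linear in time with a common slope across age groups, and the log age-adjusted rates are then regressed on time. All numerical values below (three age groups, weights, baseline rates, populations, the seed) are hypothetical, time is measured in years since the baseline year as a simplifying assumption, and the overlap between regions is not modelled here.

```python
import numpy as np

def simulate_log_rates(years, beta1, beta_j0, pop, weights, rng):
    """Draw d_ji ~ Poisson(n_ji * lambda_ji) and return log age-adjusted rates."""
    years = np.asarray(years, dtype=float)
    beta_j0 = np.asarray(beta_j0, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # log(lambda_ji) = beta_j0 + beta1 * t_i, arranged as a (J, I) array
    lam = np.exp(beta_j0[:, None] + beta1 * years[None, :])
    deaths = rng.poisson(pop * lam)
    rates = (weights[:, None] * deaths / pop).sum(axis=0)
    return np.log(rates)

rng = np.random.default_rng(2024)       # arbitrary seed
years = np.arange(16)                   # years since baseline (assumption)
apc = -0.5                              # true APC in percent
beta1 = np.log(apc / 100.0 + 1.0)       # slope implied by the APC
weights = np.array([0.5, 0.3, 0.2])     # toy standard weights, sum to 1
beta_j0 = np.log([1e-4, 5e-4, 2e-3])    # toy baseline log rates by age group
pop = np.full((3, 16), 1_000_000.0)     # toy populations at risk

y = simulate_log_rates(years, beta1, beta_j0, pop, weights, rng)
slope, intercept = np.polyfit(years, y, 1)
apc_hat = 100.0 * (np.exp(slope) - 1.0)

# Intercept implied by the model: log of the weighted baseline rates.
log_B0 = np.log(np.sum(weights * np.exp(beta_j0)))
```

The fitted slope and intercept should lie close to `beta1` and `log_B0`, which is the linear relationship for the expected log rate stated above.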
Now to specify the regression for λkji, we take βk1 = log(100−1APCk + 1), based on the specified values of APCk ranging from −0.3% to 1.0%, and compute where dkj,0 and nkj,0 are, respectively, the observed number of deaths and the number of at risk population at the “baseline” year, the beginning of the time interval considered for Region k. The age-specific counts for the overlapping state, NC at the overlapping time ti are generated from Poisson distributions with means denotes the number of at-risk population in the overlapping region (e.g. NC) in year ti. In our simulation, the number of at-risk population and the observed number of deaths were obtained from the SEER database for all malignant male cancers and prostate cancer within the time intervals specified in Scenarios 1 and 2.
Tables 1 and 2 display the empirical powers of the corrected Z-test and the naive t-test (5) of Kleinbaum et al. for various APCs of the two regions, under the two time-overlapping scenarios listed above. For each parameter configuration, a total of 10,000 Monte Carlo samples were generated and the empirical powers were calculated. The results in Tables 1 and 2 clearly show that the corrected test maintained the nominal type I error of 0.05 under both time-overlapping scenarios and had good power, which approached 1 quickly as the difference between the two APCs increased. In contrast, the naive test did not maintain the nominal type I error. It was too conservative in Scenario 1 (where σ12 > 0, Table 1), with a type I error of around 0.031 under the null hypothesis, roughly 40% below the nominal level, and its power was lower than that of the corrected test. In Scenario 2 (where σ12 < 0, Table 2), its type I error rate was around 0.075, about 50% above the nominal level. Hence, our simulation results verified the theoretical results.
Table 1. Empirical rejection rates of the naive t-test and the corrected Z-test under time-overlapping Scenario 1 (type I error when APC1 = APC2, power otherwise).
Site | APC1 (%) | APC2 (%) | Naive t-test | Corrected Z-test | Estimated residual variance |
---|---|---|---|---|---|
All Malignant Cancers | −0.5 | −0.5 | 0.030 | 0.050 | 5.71E-05 |
All Malignant Cancers | −0.5 | −0.3 | 0.940 | 0.960 | 5.68E-05 |
All Malignant Cancers | −0.5 | −0.1 | 0.99 | 1.00 | 5.63E-05 |
All Malignant Cancers | −0.3 | −0.3 | 0.032 | 0.047 | 5.63E-05 |
All Malignant Cancers | −0.3 | −0.1 | 0.930 | 0.960 | 5.59E-05 |
All Malignant Cancers | −0.3 | 0.1 | 0.99 | 1.00 | 5.56E-05 |
All Malignant Cancers | −0.1 | −0.1 | 0.032 | 0.048 | 5.55E-05 |
All Malignant Cancers | −0.1 | 0.1 | 0.94 | 0.96 | 5.52E-05 |
All Malignant Cancers | −0.1 | 0.3 | 0.99 | 1.00 | 5.49E-05 |
All Malignant Cancers | 0.1 | 0.1 | 0.030 | 0.050 | 5.47E-05 |
All Malignant Cancers | 0.1 | 0.3 | 0.94 | 0.96 | 5.44E-05 |
All Malignant Cancers | 0.1 | 0.5 | 0.99 | 1.00 | 5.44E-05 |
All Malignant Cancers | 0.3 | 0.3 | 0.032 | 0.050 | 5.40E-05 |
All Malignant Cancers | 0.3 | 0.5 | 0.94 | 0.96 | 5.36E-05 |
All Malignant Cancers | 0.3 | 0.7 | 0.99 | 1.00 | 5.33E-05 |
All Malignant Cancers | 0.5 | 0.5 | 0.031 | 0.050 | 5.32E-05 |
All Malignant Cancers | 0.5 | 0.7 | 0.94 | 0.97 | 5.26E-05 |
All Malignant Cancers | 0.5 | 1.0 | 0.99 | 1.00 | 5.26E-05 |
Prostate Cancer | −0.5 | −0.5 | 0.031 | 0.050 | 0.000480 |
Prostate Cancer | −0.5 | −0.3 | 0.188 | 0.242 | 0.000477 |
Prostate Cancer | −0.5 | −0.1 | 0.655 | 0.716 | 0.000475 |
Prostate Cancer | −0.5 | 0.1 | 0.94 | 0.97 | 0.000472 |
Prostate Cancer | −0.3 | −0.3 | 0.031 | 0.050 | 0.000474 |
Prostate Cancer | −0.3 | −0.1 | 0.191 | 0.245 | 0.000471 |
Prostate Cancer | −0.3 | 0.1 | 0.660 | 0.721 | 0.000468 |
Prostate Cancer | −0.3 | 0.3 | 0.95 | 0.97 | 0.000466 |
Prostate Cancer | −0.1 | −0.1 | 0.031 | 0.049 | 0.000467 |
Prostate Cancer | −0.1 | 0.1 | 0.193 | 0.247 | 0.000465 |
Prostate Cancer | −0.1 | 0.3 | 0.665 | 0.724 | 0.000462 |
Prostate Cancer | −0.1 | 0.5 | 0.95 | 0.97 | 0.000459 |
Prostate Cancer | 0.1 | 0.1 | 0.031 | 0.049 | 0.000461 |
Prostate Cancer | 0.1 | 0.3 | 0.196 | 0.250 | 0.000458 |
Prostate Cancer | 0.1 | 0.5 | 0.670 | 0.727 | 0.000456 |
Prostate Cancer | 0.1 | 0.7 | 0.953 | 0.970 | 0.000453 |
Prostate Cancer | 0.3 | 0.3 | 0.031 | 0.049 | 0.000455 |
Prostate Cancer | 0.3 | 0.5 | 0.198 | 0.250 | 0.000452 |
Prostate Cancer | 0.3 | 0.7 | 0.673 | 0.733 | 0.000450 |
Prostate Cancer | 0.3 | 1.0 | 0.990 | 0.994 | 0.000446 |
Prostate Cancer | 0.5 | 0.5 | 0.032 | 0.050 | 0.000449 |
Prostate Cancer | 0.5 | 0.7 | 0.200 | 0.253 | 0.000446 |
Prostate Cancer | 0.5 | 1.0 | 0.860 | 0.894 | 0.000443 |
Table 2. Empirical rejection rates of the naive t-test and the corrected Z-test under time-overlapping Scenario 2 (type I error when APC1 = APC2, power otherwise).
Site | APC1 (%) | APC2 (%) | Naive t-test | Corrected Z-test | Estimated residual variance |
---|---|---|---|---|---|
All Malignant Cancers | −0.5 | −0.5 | 0.075 | 0.052 | 6.81E-05 |
All Malignant Cancers | −0.5 | −0.3 | 0.840 | 0.830 | 6.84E-05 |
All Malignant Cancers | −0.5 | −0.1 | 1.00 | 0.99 | 6.82E-05 |
All Malignant Cancers | −0.3 | −0.3 | 0.075 | 0.058 | 6.77E-05 |
All Malignant Cancers | −0.3 | −0.1 | 0.843 | 0.831 | 6.78E-05 |
All Malignant Cancers | −0.3 | 0.1 | 1.00 | 1.00 | 6.74E-05 |
All Malignant Cancers | −0.1 | −0.1 | 0.074 | 0.052 | 6.67E-05 |
All Malignant Cancers | −0.1 | 0.1 | 0.851 | 0.831 | 6.65E-05 |
All Malignant Cancers | −0.1 | 0.3 | 1.00 | 1.00 | 6.62E-05 |
All Malignant Cancers | 0.1 | 0.1 | 0.073 | 0.054 | 6.58E-05 |
All Malignant Cancers | 0.1 | 0.3 | 0.854 | 0.831 | 6.54E-05 |
All Malignant Cancers | 0.1 | 0.5 | 1.00 | 0.99 | 6.53E-05 |
All Malignant Cancers | 0.3 | 0.3 | 0.071 | 0.054 | 6.62E-05 |
All Malignant Cancers | 0.3 | 0.5 | 0.859 | 0.846 | 6.51E-05 |
All Malignant Cancers | 0.3 | 0.7 | 1.00 | 1.00 | 6.44E-05 |
All Malignant Cancers | 0.5 | 0.5 | 0.075 | 0.054 | 6.42E-05 |
All Malignant Cancers | 0.5 | 0.7 | 0.862 | 0.849 | 6.40E-05 |
All Malignant Cancers | 0.5 | 1.0 | 1.00 | 1.00 | 6.35E-05 |
Prostate Cancer | −0.5 | −0.5 | 0.076 | 0.052 | 0.00061 |
Prostate Cancer | −0.5 | −0.3 | 0.198 | 0.189 | 0.00061 |
Prostate Cancer | −0.5 | −0.1 | 0.540 | 0.517 | 0.00061 |
Prostate Cancer | −0.5 | 0.1 | 0.850 | 0.837 | 0.00061 |
Prostate Cancer | −0.3 | −0.3 | 0.075 | 0.050 | 0.00060 |
Prostate Cancer | −0.3 | −0.1 | 0.199 | 0.183 | 0.00059 |
Prostate Cancer | −0.3 | 0.1 | 0.544 | 0.523 | 0.00058 |
Prostate Cancer | −0.3 | 0.3 | 0.854 | 0.840 | 0.00057 |
Prostate Cancer | −0.1 | −0.1 | 0.075 | 0.058 | 0.00057 |
Prostate Cancer | −0.1 | 0.1 | 0.201 | 0.185 | 0.00057 |
Prostate Cancer | −0.1 | 0.3 | 0.545 | 0.526 | 0.00057 |
Prostate Cancer | −0.1 | 0.5 | 0.95 | 0.97 | 0.00057 |
Prostate Cancer | 0.1 | 0.1 | 0.074 | 0.053 | 0.00057 |
Prostate Cancer | 0.1 | 0.3 | 0.203 | 0.186 | 0.00057 |
Prostate Cancer | 0.1 | 0.5 | 0.550 | 0.530 | 0.00057 |
Prostate Cancer | 0.1 | 0.7 | 0.859 | 0.845 | 0.00057 |
Prostate Cancer | 0.3 | 0.3 | 0.075 | 0.051 | 0.00057 |
Prostate Cancer | 0.3 | 0.5 | 0.205 | 0.188 | 0.00057 |
Prostate Cancer | 0.3 | 0.7 | 0.555 | 0.533 | 0.00057 |
Prostate Cancer | 0.3 | 1.0 | 0.941 | 0.933 | 0.00057 |
Prostate Cancer | 0.5 | 0.5 | 0.075 | 0.052 | 0.00056 |
Prostate Cancer | 0.5 | 0.7 | 0.205 | 0.190 | 0.00056 |
Prostate Cancer | 0.5 | 1.0 | 0.736 | 0.717 | 0.00056 |
6 Discussion
In this paper, we have considered an important problem in which comparisons have to be made for regions or time intervals that overlap. We have shown that the existing methodology, which does not properly account for such overlap, is inappropriate because it does not maintain the type I error. We have proposed a simple test that solves this fundamental difficulty and correctly accounts for the overlap. Simulations have indicated good performance of the proposed methodology. We have applied the developed methodology to the analysis of the major cancer sites from the SEER Program and have found that the corrected Z-test renders more power than the naive t-test. Hence, the proposed Z-test will be an important addition to the SEER*STAT software, which only handles independent comparisons at this time.
We have focused on the local linearity for the cancer rates by considering time periods of short or moderate length. Indeed, linearity assumption for the cancer rates is a debatable issue in cancer surveillance, which is likely to be violated over a longer period (e.g. ≥ 30 years). A detailed discussion on this issue has been made in Fay et al. (2006), which proposed a joinpoint linear regression for long-term cancer rate analysis. In a similar context, we plan to pursue APC comparisons for longer periods by considering joinpoint linear regressions, and will report the results in a subsequent communication.
7 Supplementary Materials
Web Appendix for the derivation of Equation (8) and Tables A.1, A.2, B.1 and B.2 referenced in Section 5 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.
Acknowledgements
The authors would like to thank the Editor, the AE and an anonymous referee for insightful suggestions that improved the original manuscript. The authors thank Steve Scoppa and Joe Zou of Information Management Services (IMS), Inc., the company that provides biomedical computing support to the NCI, for their valuable contributions. The authors also thank Dr Rocky Feuer for carefully reading an early draft of this work and for many insightful suggestions.
Contributor Information
Yi Li, Harvard School of Public Health and Dana-Farber Cancer Institute.
Ram C. Tiwari, National Cancer Institute
References
- American Cancer Society. Cancer Facts & Figures. Atlanta, Georgia; 2007.
- Fay M, Tiwari R, Feuer E, Zou Z. Estimating average annual percent change for disease rates without assuming constant change. Biometrics. 2006;62:847–854. doi: 10.1111/j.1541-0420.2006.00528.x.
- Ghosh K, Tiwari R. Prediction of US cancer mortality counts using semi-parametric Bayesian techniques. Journal of the American Statistical Association. 2007;102:7–15.
- Kim H, Fay M, Feuer E, Midthune D. Permutation tests for joinpoint regression with applications to cancer rates. Statistics in Medicine. 2000;19:335–351. doi: 10.1002/(sici)1097-0258(20000215)19:3<335::aid-sim336>3.0.co;2-z.
- Kleinbaum D, Kupper L, Muller K. Applied Regression Analysis and Other Multivariable Methods. 2nd edition. Boston: PWS-Kent; 1988.
- Pickle LW, White AA. Effects of the choice of age-adjustment method on maps of death rates. Statistics in Medicine. 1995;14:615–627. doi: 10.1002/sim.4780140519.
- Ries L, Eisner M, Kosary C, Hankey B, Miller B, Clegg L, Mariotto A, Feuer E, Edwards BK, editors. SEER Cancer Statistics Review, 1975–2002. Bethesda, MD: National Cancer Institute; 2003. http://seer.cancer.gov/csr/1975-2002/
- Tiwari R, Cronin K, Davis W, Feuer E, Yu B, Chib S. Bayesian model selection for join point regression with application to age-adjusted cancer rates. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2005;54:919–939.
- Tiwari R, Clegg L, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Statistical Methods in Medical Research. 2006;15:547–569. doi: 10.1177/0962280206070621.