Author manuscript; available in PMC: 2024 Oct 1.
Published in final edited form as: Theor Popul Biol. 2023 Jun 7;153:50–68. doi: 10.1016/j.tpb.2023.05.001

Decomposition of Disparities in Life Expectancy with Applications to Administrative Health Claims and Registry Data

I Akushevich 1, A Yashkin 1, M Kovtun 1, E Stallard 1, AI Yashin 1, J Kravchenko 2
PMCID: PMC10526891  NIHMSID: NIHMS1906939  PMID: 37295513

Abstract

Research shows that geographic disparities in life expectancy between leading and lagging states are increasing over time while racial disparities between Black and White Americans have been going down. In the 65+ age strata morbidity is the most common cause of death, making differences in morbidity and associated adverse health-related outcomes between advantaged and disadvantaged groups an important aspect of disparities in life expectancy at age 65 (LE65). In this study, we used Pollard’s decomposition to evaluate the disease-related contributions to disparities in LE65 for two types of data with distinctly differing structures: population/registry and administrative claims. To do so, we analyzed Pollard’s integral, which is exact by construction, and developed exact analytic solutions for both types of data without the need for numerical integration. The solutions are broadly applicable and easily implemented. Applying these solutions, we found that the largest relative contributions to geographic disparities in LE65 were chronic lower respiratory diseases, circulatory diseases, and lung cancer; and, to racial disparities: arterial hypertension, diabetes mellitus, and cerebrovascular diseases. Overall, the increase in LE65 observed over 1998–2005 and 2010–2017 was primarily due to a reduction in the contributions of acute and chronic ischemic diseases; this was partially offset by increased contributions of diseases of the nervous system including dementia and Alzheimer’s disease.

Keywords: Life expectancy at 65, Disparities, Decomposition methods, Medicare, Time trends

1. Introduction

Persistent disparities in health outcomes1 between sex, race/ethnicity, and geographic area-specific subgroups in the United States are an important barrier to improving total life expectancy (LE)2. Attempts at reducing such disparities have met with limited success3-5, and the mechanisms by which health disparities develop and propagate are not fully understood2,6,7. In the 65+ age strata morbidity is the most common cause of death, making differences in associated adverse health-related outcomes between advantaged and disadvantaged groups an important aspect of disparities in life expectancy at age 65 (LE65). However, isolating the effects of individual diseases/disease groups is not straightforward due to the need to balance estimate accuracy, type and number of methodological assumptions, range of included diseases, and the statistical power of the available data.

Methods of decomposition in demography can provide important insight into the causes of the differences in aggregate measures (such as LE) between well-defined population groups. As we will demonstrate shortly, Pollard’s decomposition8,9 is the method of choice for representing cause-of-death-specific contributions to disparities in LE10-12. The difference in LEs for two subpopulations is represented through an integral of age-functions of subpopulation-specific survival functions, LEs, and cause-specific hazard functions. For LE65, the integration is performed over age intervals from 65 to the maximal age in a specific dataset. This means that all functions in the integrand have to be well-defined and computable from available data. This is not trivial for cause-specific hazard functions. Many commonly used data sources report/update information at random age points (e.g., the age and time that a medical insurance claim was submitted to an administrative claim database) or aggregated across a fixed time interval (e.g., calendar-year-specific data on cause-specific mortality aggregated over fixed time periods, usually single-year age groups, in registry datasets). In the first case (administrative data), cause-specific survival functions can be evaluated using the Kaplan-Meier product-limit estimator13. This estimator is discontinuous, however, and results in infinite hazards at exact times or ages of death and zero hazards at all other age/time points. Mathematically, this requires us to deal with discontinuous and generalized functions. Although Pollard's decomposition extends (in theory) to discontinuous and generalized functions, numerical integration of such functions is not possible, restricting its use in such applications. In the second case (registry data), cause-specific survival functions and LEs can be evaluated using the life table approach. 
The life table approach also yields estimates of cause-specific hazards aggregated (or integrated) over discrete age intervals. No comparable life table approach has yet been proposed for cause-of-death decomposition using Pollard’s integral, the result being that practitioners (e.g., ref.14) continue to use Pollard’s midpoint approximation8 (eqn. 14) without considering the associated uncertainties and/or biases.

Pollard’s decomposition is generally applied to aggregated mortality data, allowing the inputs to be obtained from the associated life tables14,15. Evaluations of Pollard’s decomposition that require assumptions affecting how the input measures are obtained from data, or that use numerical approximations for integration over age-intervals, can introduce bias into the resulting cause-specific decompositions. Systematic biases are generally not considered in such analyses. Pollard’s decomposition has never been applied to administrative data in a way that exploits the additional information provided by having the exact dates of death. In order to address these limitations, we determined how Pollard’s decomposition8,9 could be adapted for use with administrative claims records and population/registry data. We used a product-limit estimator for left-truncated (or delayed entry) censored administrative data16 and a life table approach for registry/population data; we developed an exact representation of Pollard’s decomposition for discrete age intervals that does not require numerical integration and does not require an assumption of independence among the causes of death selected for use in the decomposition.

Pollard’s decomposition is exact in continuous time—i.e., the sum of all relative contributions over all causes of death is exactly 1.0, or 100% of the difference in life expectancy. The property of additivity is a major advantage of Pollard’s decomposition; without it, distinguishing small cause-specific contributions to disparities from accumulated uncertainties due to methodological assumptions could be a challenge. The additivity property of Pollard’s decomposition continues to hold exactly in our solutions shown in eqns. (8) and (9), even after applying the multiple operations necessary to solve Pollard’s integral for the two types of data used in this study.

The assumption of independence of cause-specific forces of mortality is not required for our approach; in contrast, independence is required by alternative approaches such as proposed by Beltrán-Sánchez et al.12. Our method is more broadly applicable because we do not have to limit its use to sets of causes for which independence is a reasonable approximation. This allows us to deal with any specified set of causes without having to consider joint dependence on common risk factors such as diet, exercise, smoking, environmental exposures, etc. Indeed, Mokdad et al.17 proposed that such variables were not just risk factors but were in fact actual causes for close to half the deaths in the U.S in 2000. This suggests that dependencies between cause-specific contributions to the differences in life expectancy may be explained by common causes that are amenable to behavioral modifications or other forms of intervention. For example, diabetes mellitus, hypertension, and atherosclerosis all increase the risk of death from cerebrovascular disease and cardiovascular disease. The existence of such major dependencies is one reason to further emphasize that Pollard’s decomposition as well as the adaptations performed in this study do not require an assumption of independence of causes of death.

This paper is structured as follows: Section 2.1 provides preliminary results demonstrating that Pollard’s decomposition8,9 is the method of choice for representing cause-specific contributions to disparities in LE10-12. Section 2.2 provides a description of our methodological development. Section 2.3 provides the derivation of the explicit formulas for Pollard’s decomposition for administrative and population registry data. Section 3 applies the method to Multiple Cause of Death (MCD) data and Medicare administrative claims, to identify the relative contributions of high-impact morbidities to disparities in LE65 between i) Black and White Americans, ii) the top and bottom eight U.S. states by LE65, and iii) the 1998–2005 and 2010–2017 time-periods. Extensive discussions of methodologic and substantive developments including detailed comparisons of our approach and other decomposition approaches available in the literature, are presented in Section 4. Our conclusions are in Section 5.

2. Methods

2.1. Pollard’s decomposition is the method of choice

Our refinements to Pollard’s decomposition8,9 are premised on the proposition that Pollard’s decomposition is the method of choice for calculating cause-of-death-specific contributions to differences in LE between two life tables, two populations, or the same population at two time points. Support for this proposition was published in this journal by Beltrán-Sánchez and Soneji18 who conducted an extensive review of the relevant literature, including prior work by Beltrán-Sánchez et al.12 that established a close connection between Pollard’s decomposition and cause-elimination life table techniques—the former dealing with observed changes, the latter dealing with hypothetical changes. This close connection is the key to understanding why the Pollard decomposition is the method of choice for the pairwise comparisons noted above.

Our literature review indicated that the close connection between Pollard’s decomposition and the standard cause-elimination methods could have been noted prior to Beltrán-Sánchez’s papers in 2008 and 2011. For example, it could have been noted by Pollard in his 1982 and 1988 papers8,9. At that time, however, the question of the day was how Pollard’s (1982) method differed from that of Arriaga19 (1984) whose method was gaining popularity among demographers because it was easy to use. Pollard (1988) resolved the question9: the methods were equivalent. Arriaga’s (1984) and later (1989) method19,20 was a discrete time formulation of Pollard’s continuous time method. This equivalence, combined with the close connection between Pollard’s decomposition and cause-elimination methods, simplifies our presentation substantially and makes our results accessible without dependence on the extensive subsequent literature.

Our development relies on two papers by Makeham21,22 that extended life table theory to include cause-specific forces of mortality (same as hazard rates) that are additive over any set of independent causes of death. More than three-quarters of a century later, Greville23 (1948; p. 288) considered the gain in life expectancy at age x that would occur if deaths from cause i were eliminated under Makeham’s theory. He published the following formula for computing such a change:

$$e_x^{(i)} - e_x = \int_0^\infty {}_tp_x \, \mu_{x+t}^{i} \, e_{x+t}^{(i)} \, dt \tag{1}$$

where the left side is the difference between the initial LE at age x, denoted $e_x$, and the new, higher LE after eliminating deaths from cause i, denoted $e_x^{(i)}$, where the superscript (i) is an indexing symbol, not an exponent. The integrand on the right side has three factors. The first is the probability of surviving from age x to age x+t in the initial life table; the second is the force of mortality at age x+t due to cause i in the initial life table; the third is the LE at age x+t in the new life table. By setting x=0, one can obtain the formula for the gain in LE at birth due to elimination of deaths from cause i. Thus, Greville’s equation (1) can be rewritten as:

$$e_0^{(i)} - e_0 = \int_0^\infty \mu_t^{i} \, {}_tp_0 \, e_t^{(i)} \, dt \tag{2}$$

where the first two factors in the integrand were interchanged.

Pollard9 (1988; eqn. (6)) presented the following formula for the change in LE at birth between any two life tables, indexed by the superscripts 1 and 2:

$$e_0^2 - e_0^1 = \int_0^\infty (\mu_t^1 - \mu_t^2) \, {}_tp_0^1 \, e_t^2 \, dt. \tag{3}$$

Comparing eqns. (2) and (3) term-by-term, one can see that life table 1 corresponds to Greville’s initial life table, and life table 2 corresponds to the life table with deaths from cause i eliminated. The probability of surviving from age 0 to age t is indexed to life table 1; the new, higher LE is indexed to life table 2. The change in the force of mortality at age t between the two life tables corresponds to the force of mortality due to cause i in Greville’s initial life table, i.e., $(\mu_t^1 - \mu_t^2) \equiv \mu_t^i$. Thus, Greville’s cause-elimination formula (2) and Pollard’s decomposition formula (3) can be matched term-by-term to reveal the formal equivalence between them.
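As a quick plausibility check of Pollard’s formula (3), consider the simplest special case in which both life tables have forces of mortality that are constant over age. This case is an illustrative assumption introduced here, not one used in the paper; the constants $\mu^1$ and $\mu^2$ are hypothetical. The survival probability and LE then have closed forms, and the integral can be evaluated by hand:

```latex
% Illustrative assumption: constant forces of mortality \mu^1 and \mu^2 (both > 0).
% Then {}_tp_0^1 = e^{-\mu^1 t}, \quad e_t^2 = 1/\mu^2, \quad e_0^j = 1/\mu^j .
\begin{aligned}
\int_0^\infty (\mu^1 - \mu^2)\, {}_tp_0^1 \, e_t^2 \, dt
  &= \frac{\mu^1 - \mu^2}{\mu^2} \int_0^\infty e^{-\mu^1 t}\, dt
   = \frac{\mu^1 - \mu^2}{\mu^1 \mu^2} \\
  &= \frac{1}{\mu^2} - \frac{1}{\mu^1}
   = e_0^2 - e_0^1 .
\end{aligned}
```

The right side of eqn. (3) therefore reproduces the difference in LE exactly in this special case, and the sign of the difference follows the sign of $\mu^1 - \mu^2$.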

Table 1. Study diseases/disease groups and associated cause-specific age-adjusted mortality

| Disease | ICD-9 | ICD-10 | Mortality* |
| --- | --- | --- | --- |
| Septicemia | 038 | A40-A41 | 67 |
| Other Infectious and Parasitic Diseases | 001-139 | A00-B99 | 32 |
| Malignant neoplasms of colon, rectum and anus | 153-154 | C18-C21 | 94 |
| Malignant neoplasm of pancreas | 157 | C25 | 64 |
| Malignant neoplasms of trachea, bronchus and lung | 162 | C33-C34 | 276 |
| Malignant neoplasm of breast | 174-175 | C50 | 60 |
| Malignant neoplasm of prostate | 185 | C61 | 65 |
| Non-solid (leukemias and lymphomas) | 200-208 | C81-C96 | 104 |
| Other solid fast progressive | see** | see** | 216 |
| Other solid slow progressive | see*** | see*** | 35 |
| Secondary malignant neoplasm | 196-198 | C77-C79 | 12 |
| Other non-specified cancers | 199 | C80, C45, C97 | 61 |
| In situ, benign uncertain, or unknown neoplasms | 210-239 | D00-D48 | 29 |
| Diseases of Blood and Blood Forming Organs | 280-289 | D50-D89 | 16 |
| Diabetes mellitus | 250 | E10-E14 | 132 |
| Other Endocrine, Nutritional, Metabolic, Immunity | 240-279 | E00-E89 | 54 |
| Dementia | 290, 294.2 | F01, F03 | 208 |
| Other Mental Disorders | 290-319 | F01-F99 | 12 |
| Alzheimer's disease | 331.0 | G30 | 188 |
| Parkinson's disease | 332 | G20-G21 | 53 |
| Other Diseases of Nervous System and Sense Organs | 320-389 | G00-G99, H00-H95 | 62 |
| Hypertension | 401-405 | I10-I15 | 117 |
| Acute myocardial infarction and other acute ischemic | 410, 411 | I21-I22, I24 | 275 |
| Atherosclerotic cardiovascular disease, so described | 429.2 | I25.0 | 109 |
| All other forms of chronic ischemic heart disease | 412-414 | I20, I25.1-I25.9 | 447 |
| Heart failure | 428 | I50 | 126 |
| Cerebrovascular diseases | 430-434, 436-438 | I60-I69 | 303 |
| Other diseases of Circulatory System | 390-459 | I00-I99 | 320 |
| Influenza and Pneumonia | 480-488 | J09-J18 | 121 |
| Chronic lower respiratory diseases | 490-494, 496 | J40-J47 | 291 |
| Other Diseases of Respiratory System | 460-519 | J00-J99 | 104 |
| Diseases of Digestive System | 520-579 | K00-K95 | 139 |
| Skin/Subcutaneous/Musculoskeletal/Connective | 680-739 | L00-L99, M00-M99 | 33 |
| Renal failure | 584-586 | N17-N19 | 89 |
| Other Diseases of Genitourinary System | 580-629 | N00-N99 | 42 |
| Signs, Symptoms and Ill-Defined Conditions | 780-799 | R00-R99 | 52 |
| Injury and Poisoning | 800-999, E800-E999 | S00-T88, V00-Y99 | 125 |

* Age-adjusted cause-specific mortality rates (per 100,000).

** ICD-9: 150-152, 155, 156, 158, 159, 163-165, 170-172, 176, 179, 182, 183, 188, 189, 191, 192, 194, 195, 209; ICD-10: C15-C17, C22-C24, C26, C37-C41, C43, C46-C49, C54-C56, C64-C68, C70-C72, C74-C76

*** ICD-9: 140-149, 160, 161, 173, 180, 181, 184, 186, 187, 190, 193; ICD-10: C00-C14, C30-C32, C44, C51-C53, C57, C58, C60, C62, C63, C69, C73

Greville’s formula (2) represents the increase in LE that would hypothetically occur upon elimination of deaths from cause i; Pollard’s formula (3) represents the change in LE that actually occurs upon replacement of the force of mortality in life table 1 with the force of mortality in life table 2. This replacement is typically thought of as reductions in the force of mortality at all ages, but this is not required by Pollard’s formula (3); there may be ages where the force of mortality increases.

Moreover, using the additivity of the cause-specific forces of mortality in Makeham’s theory21,22, Pollard’s decomposition can be rewritten in the form:

$$e_0^2 - e_0^1 = \int_0^\infty \sum_{i=1}^{I} (\mu_{ti}^1 - \mu_{ti}^2) \, {}_tp_0^1 \, e_t^2 \, dt. \tag{4}$$

By setting $\mu_{ti}^2 = 0$ for some i and $\mu_{tj}^2 = \mu_{tj}^1$ for all $j \neq i$, one can reproduce the right side of Greville’s formula (2) for elimination of deaths from cause i with $e_0^2 = e_0^{(i)}$ and $e_0^1 = e_0$.

The above analysis could have been done in 1982 by Pollard when publishing his paper8, or by any reader familiar with Greville23. Greville’s formula (2) was well-known; it formed the basis of the cause-elimination life tables published by the National Center for Health Statistics beginning with the U.S. Decennial Life Tables for 1959–196124. Arriaga’s formulas19,20 were also well-known; they formed the basis of the cause decomposition tables published by the National Center for Health Statistics beginning with calendar years 1984–198925. The equivalence of Arriaga’s19,20 and Pollard’s8,9 formulas was known in 1988 but their relationship to Greville’s formula23 was not recognized until much later. Arriaga19,20 made no mention of the concurrent work by Pollard8,9 or prior work by Greville23.

Pollard8,9 further observed that the order of superscripts 1 and 2 used to index the two life tables is arbitrary; they can be interchanged in eqn. (4) to yield, after a change in signs:

$$e_0^2 - e_0^1 = \int_0^\infty \sum_{i=1}^{I} (\mu_{ti}^1 - \mu_{ti}^2) \, {}_tp_0^2 \, e_t^1 \, dt, \tag{5}$$

which, although equal to the formula above, may differ when separated into components that are additive over cause and/or age (i.e., assuming that the integral is separated into age-specific components). Because eqns. (4) and (5) are identities, Pollard recommended they be combined as a convex combination with equal weights. Thus, Pollard’s decomposition has two forms based on the ordering of the life table indexes, 1–2, 2–1, and a third form based on a convex combination of the first two. The convex combination form with equal weights will be used in the present paper for comparisons of life tables for pairs of subpopulations, as recommended by Pollard.

The calculations in the present paper represent discrete-time alternatives to Arriaga’s19,20 discrete-time formulas; our calculations derive directly from Pollard’s decomposition and, hence, connect directly to Greville’s23 formulas (1) and (2) and Makeham’s life table theory21,22 for independent causes of death.

2.1.1. The assumption of independence of cause-specific forces of mortality is not required for Pollard’s decomposition by cause of death in equations (4) and (5).

Greville commented23 that the independence assumption was required for the cause elimination calculations in eqns. (1) and (2) but independence was not required for the representation of the total force of mortality as a sum of cause-specific forces indexed by i; the only assumption required for additivity was that the set of I causes, $i = 1, 2, \ldots, I$, be mutually exclusive and exhaustive. The same result follows from Gail26 (eqn. 4): the additivity of the cause-specific forces arises solely as a consequence of the chain rule for partial derivatives of multivariable functions. Thus, the additivity over causes in eqns. (4) and (5) applies even if the causes of death are dependent. This is important for our applications because many causes of death above age 65 are chronic degenerative diseases with common risk factors that generate complex forms of dependence that cannot be resolved using available population/registry and administrative claims data.

Makeham22 and Greville23 used the term “independent” to describe cause-specific forces of mortality that would remain unchanged upon elimination of deaths due to one or more other causes, the implication being that “dependent” cause-specific forces of mortality would allow such changes to occur. Gail26 considered Makeham’s22 use of the term “independent” in the context of competing risk theory and determined that the condition of no change in a given force of mortality upon elimination of deaths due to other causes is slightly weaker than the usual assumption of statistical independence of the multiple theoretical times to death associated with any given set of independent competing risks. The slight difference is of little practical significance given that only the first of the multiple theoretical times to death can actually be observed. Gail26 commented that the use of either form of independence assumption is “always suspect.” Pollard’s decomposition can be formulated without either form of independence assumption making it highly attractive for use in decomposing disparities in LE.

Beltrán-Sánchez et al.12 (2008) and Beltrán-Sánchez and Soneji18 (2011) recognized the close connection between Pollard’s decomposition and cause-elimination life table techniques, but they did not mention that Pollard’s decomposition can be performed without the independence assumption. They wrote: “An important limitation in this area of demographic research is the assumption of independence among causes of death.” (Beltrán-Sánchez and Soneji18, 2011; p. 44). The independence assumption is required only for cause elimination calculations, not for Pollard’s decomposition. Whereas two life tables are observed in Pollard’s formulas (3)-(5), only one is observed in Greville’s formulas (1) and (2). The penalty for the lack of observed information regarding Greville’s second life table is the need for additional assumptions, the simplest of which is independence26.

The impact of assuming independence between causes of death may not be that large in practice. For example, Beltrán-Sánchez et al.12 developed an alternative decomposition of LE differences based on cause elimination calculations with an explicit independence assumption. Comparisons of their alternative with Pollard’s decomposition led to their conclusion: “Differences are found to be minute between our approach and Pollard’s.” (Beltrán-Sánchez et al.,12 p. 1327). They found substantially larger differences in comparisons with Arriaga’s approach20. Given the aforementioned equivalence of Arriaga’s20 and Pollard’s8 formulas, however, it is highly likely that the differences reported by Beltrán-Sánchez et al.12 were due to the use of a “1–2” form of Arriaga’s formulas20 rather than the convex-combination form recommended by Pollard8. Unlike Pollard8 who identified three forms of decomposition consistent with formulas (4) and (5), Arriaga19,20 published just one form for his decomposition. Preston et al.27 simplified the indexing in Arriaga’s formulas and introduced the “1–2” notation to refer to two different populations in the same way as Pollard did. Once the existence of the different 1–2 and 2–1 forms is taken into account and the discrepancy with Arriaga’s20 approach is explained, it then follows that Beltrán-Sánchez et al.’s12 results support our conclusion that Pollard’s8,9 decomposition is the method of choice.

2.2. Pollard’s Decomposition and Cause-Specific Contributions

The approach developed by Pollard8,9 provides the fundamental decomposition of LE65 which is represented herein as the sum over all cause-of-death-specific contributions:

$$e_W(65) - e_B(65) = \sum_i \int_{65}^\infty (\mu_i^B(a) - \mu_i^W(a)) \, \frac{\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a)}{2} \, da \tag{6}$$

where i indexes causes of death, and $e_j(a)$, $\mu_i^j(a)$, and $\ell_j(a)$ are the LE, cause-specific force of mortality, and survival function at age a for the groups being compared (j indexes the groups); we use $\ell_j(a) \equiv {}_{a-65}p_{65}^j$ to simplify Pollard’s notation in eqns. (3)-(5). Term-by-term comparison shows that eqn. (6) is a convex combination of eqns. (4) and (5) with equal weights, as recommended by Pollard.

We denote the population groups by “B” and “W” instead of Pollard’s “1–2” to mirror the analysis of race-related disparities between Black and White Americans; but the formula is applicable to any two well-defined groups. We will shift the location of the “B” and “W” from subscript to superscript freely in the following development, if such shifts simplify the associated equations.

Each individual term in the sum over i in eqn. (6) can be divided by $e_W(65) - e_B(65) \neq 0$ to represent the relative contribution of cause i to the disparity in LE65, with the ratio denoted by $RC_i$,

$$RC_i = 100\% \times \int_{65}^\infty \frac{(\mu_i^B(a) - \mu_i^W(a)) \, (\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a))}{2\,(e_W(65) - e_B(65))} \, da, \tag{7}$$

where $\sum_i RC_i = 100\%$. The ratios are defined only when the LE65s differ between the two groups (i.e., division by zero is undefined). Moreover, the cause-specific $RC_i$ values may differ in sign, implying that the sum of their absolute values may be substantially greater than 100% and that the differences in LE65s reflect offsetting effects for one or more causes.
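The normalization in eqn. (7) and the effect of offsetting signs can be illustrated with a minimal Python sketch. The function name `relative_contributions` and the three absolute contributions below are hypothetical values chosen only for illustration; they are not results from this study.

```python
def relative_contributions(contributions):
    """Convert absolute cause-specific contributions (years of LE65)
    into relative contributions RC_i in percent, as in eqn. (7)."""
    total = sum(contributions.values())
    if total == 0:
        # RC_i is undefined when the two LE65 values coincide.
        raise ValueError("relative contributions are undefined for a zero LE65 gap")
    return {cause: 100.0 * value / total for cause, value in contributions.items()}


# Hypothetical absolute contributions (years); illustrative numbers only.
example = {"cause A": 0.9, "cause B": 0.6, "cause C": -0.5}
rc = relative_contributions(example)

sum_rc = sum(rc.values())                       # adds up to 100%
sum_abs_rc = sum(abs(v) for v in rc.values())   # exceeds 100% when signs offset
```

In this toy case the signed contributions add to 100% while their absolute values add to 200%, which is the offsetting pattern described above.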

Numerical evaluation of integrals always requires approximations. We leverage the specific properties of two types of widely available epidemiological data to evaluate the integral in eqn. (6) without additional assumptions beyond those that are standard for the two types of data. The two types of data are: i) population/registry data (MCD data in our case) where the number of cause-specific deaths and the associated population counts in each age group are known; and ii) administrative claims data (5%-Medicare data in our case) where the age and time at the beginning and end of each individual’s follow-up period and, if deceased, the date and cause of death are known. The standard assumptions for these two types of data are those necessary for life table construction and Kaplan-Meier estimation of group-specific survival functions, respectively.

For population/registry data we evaluate the integral assuming the availability of the numbers of cause-specific deaths between integer ages (i.e., between x and x+1, where x is an integer) and observed or estimated exposed-subpopulation counts at exact age x. Chiang28 eqn. (23) showed that exposed-population counts at exact age x can be estimated using mid-year population counts for ages between x and x+1. The life table calculations provide the age-specific survival functions $l_x^j$ (using $l_x^j \equiv \ell_j(x)$) and life expectancies $e_x^j$ (using $e_x^j \equiv e_j(x)$). Pollard’s integral can then be evaluated, yielding the following decomposition (details are presented in Section 2.3):

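$$e_{65}^W - e_{65}^B = \sum_{i=1}^{I} \sum_{x=65}^{x_1} \left( \frac{\hat\mu_{ix}^B}{\hat\mu_x^B}\left(\exp(\hat\mu_x^B) - 1\right) - \frac{\hat\mu_{ix}^W}{\hat\mu_x^W}\left(\exp(\hat\mu_x^W) - 1\right) \right) \times W_x, \tag{8}$$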

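where $W_x = \frac{1}{2}\left( l_{x+1}^W \left(e_x^B - (\hat a_x - x)\right) + l_{x+1}^B \left(e_x^W - (\hat a_x - x)\right) \right)$ for $65 \le x < x_1$, and $W_{x_1} = \frac{1}{2}\left(l_{x_1}^B + l_{x_1}^W\right) \frac{(1 - q_{x_1}^B)(1 - q_{x_1}^W)}{q_{x_1}^B \, q_{x_1}^W}$. The difference $\hat a_x - x$ can be set to ½ in the majority of applications.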

The total (all-cause) 1-year cumulative hazards are estimated as $\hat\mu_x^B = \log(l_x^B / l_{x+1}^B)$ and $\hat\mu_x^W = \log(l_x^W / l_{x+1}^W)$. Age $x_1$ denotes the left bound of the last age group, which is open-to-the-right (e.g., the group 100+ if $x_1 = 100$). The total hazard in this group is estimated as $\hat\mu_{x_1}^j = -\log(1 - q_{x_1}^j)$, where $q_{x_1}^j$ denotes the annual probability of death at age $x_1$, which is assumed to be constant at age $x_1$ and above. It follows from Greville23 eqn. (5) that if the cause-specific hazards are proportional between ages x and x+1, then the constant of proportionality for each cause i is equal to the fraction $r_{ix}^j = d_{ix}^j / d_x^j$, where $d_x^j = \sum_i d_{ix}^j$ and $d_{ix}^j$ are the total and cause-specific numbers of deaths between ages x and x+1 for $65 \le x < x_1$, or, for the last age interval, the total and cause-specific numbers of deaths at age $x_1$ and above. The cause-specific one-year cumulative hazards, $\hat\mu_{ix}^j$, are estimated using $\hat\mu_{ix}^B = r_{ix}^B \hat\mu_x^B$ and $\hat\mu_{ix}^W = r_{ix}^W \hat\mu_x^W$; see Appendix A for proof.
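The life table form of the decomposition in eqn. (8) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the three-interval toy life tables (ages 65, 66, and an open group 67+), and the death counts are all hypothetical; deaths within each closed interval are assumed to occur at its midpoint (the difference between the mean age at death and x is set to 0.5); survival is normalized to 1 at age 65; and LE in the open interval is computed under a constant annual probability of death.

```python
def life_expectancies(l, q_last, a=0.5):
    """LE at each exact age for survivors l[0..n] (l[0] = 1).
    l[n] opens the last age group, where the annual death probability is q_last."""
    n = len(l) - 1
    e = [0.0] * (n + 1)
    e[n] = a + (1.0 - q_last) / q_last              # constant q above the last bound
    for x in range(n - 1, -1, -1):
        person_years = l[x + 1] + a * (l[x] - l[x + 1])
        e[x] = (person_years + l[x + 1] * e[x + 1]) / l[x]
    return e


def decompose(lB, qB, dB, lW, qW, dW, a=0.5):
    """Cause-specific contributions to e_W - e_B in the form of eqn. (8).
    dB[x][i], dW[x][i]: deaths from cause i in interval x (last row = open group)."""
    n = len(lB) - 1
    eB, eW = life_expectancies(lB, qB, a), life_expectancies(lW, qW, a)
    contrib = [0.0] * len(dB[0])
    for x in range(n + 1):
        if x < n:
            growth_B = lB[x] / lB[x + 1] - 1.0      # exp(total cumulative hazard) - 1
            growth_W = lW[x] / lW[x + 1] - 1.0
            weight = 0.5 * (lW[x + 1] * (eB[x] - a) + lB[x + 1] * (eW[x] - a))
        else:
            growth_B = qB / (1.0 - qB)              # hazard is -log(1 - q) in the open group
            growth_W = qW / (1.0 - qW)
            weight = 0.5 * (lB[n] + lW[n]) * (1 - qB) * (1 - qW) / (qB * qW)
        total_B, total_W = sum(dB[x]), sum(dW[x])
        for i in range(len(contrib)):
            share_B = dB[x][i] / total_B            # r_ix = cause-specific / total hazard
            share_W = dW[x][i] / total_W
            contrib[i] += (share_B * growth_B - share_W * growth_W) * weight
    return contrib, eB[0], eW[0]


# Hypothetical toy inputs: two causes, ages 65, 66 and the open group 67+.
lB, qB, dB = [1.0, 0.90, 0.75], 0.30, [[6, 4], [9, 6], [40, 35]]
lW, qW, dW = [1.0, 0.94, 0.85], 0.20, [[3, 3], [5, 4], [45, 40]]
contrib, e65_B, e65_W = decompose(lB, qB, dB, lW, qW, dW)
```

Under these assumptions the cause-specific terms add up to the LE65 gap to within floating-point error, with no numerical integration involved. Only the cause shares of deaths enter the calculation, so the toy death counts do not need to match the survival columns.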

For administrative claims, the product-limit estimates of subpopulation-specific survival functions for left-truncated (or delayed entry) data16 are constructed using individual follow-up data. Because the product-limit survival functions are constant between exact ages of death, LE65 is easily calculated as the area under the survival function. Pollard’s integral is then represented as a sum of the respective estimates taken at ages of death recorded in the data for one or the other subpopulation:

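$$e_W(65) - e_B(65) = \frac{1}{2} \sum_{i=1}^{I} \sum_{k_i=1}^{K_i} \left( \left( \frac{\hat\mu_i^B(a_{ik_i})}{\hat\mu^B(a_{ik_i})}\left(\exp(\hat\mu^B(a_{ik_i})) - 1\right) - \frac{\hat\mu_i^W(a_{ik_i})}{\hat\mu^W(a_{ik_i})}\left(\exp(\hat\mu^W(a_{ik_i})) - 1\right) \right) \left( \ell_W(a_{ik_i}^+)\, e_B(a_{ik_i}^-) + \ell_B(a_{ik_i}^+)\, e_W(a_{ik_i}^-) \right) \right) \tag{9}$$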

where i indexes cause of death, $a_{ik_i}$ represents the age at death from cause i, and the number of such deaths is $K_i$; $\{a_k\}$ is the set of ages of death in either or both subpopulations; $a_k^-$ and $a_k^+$ indicate that the left and right limits of the respective functions have to be taken. The product-limit estimator has jump discontinuities of survival functions at these points, so the survival functions (and therefore life expectancies) at $a_k^-$ and $a_k^+$ are not equal but are well-defined at these points. The quantities $\hat\mu^j(a_k)$ and $\hat\mu_i^j(a_k)$ denote total and cause-specific cumulative hazards that are estimated as logarithms of the ratios of the respective survival functions at $a_k^-$ and $a_k^+$.

Standard errors and confidence intervals for LE65 and RCi are calculated using a simulation strategy29-31. Although most such approaches provide analytical formulae for the variance in life expectancies, we prefer the version involving Monte Carlo simulations because such an approach allows us to obtain the standard errors and confidence intervals for the cause-specific relative contributions RCi. Specifically, the number of deaths for an age interval is assumed to be binomially distributed and simulated when a life table is constructed based on the age-specific probability of death and the number of individuals at risk in the input dataset. The standard deviation of the distribution of LEs, simulated multiple times, gives an estimate of the standard error of the life-table-based LE. For our study the standard approach (based on the assumption of a binomial distribution of the number of deaths) was improved in several aspects. First, because there were many different causes of death, the multinomial distribution was used to simulate cause-of-death-specific probabilities of death at each exact age. Technically we used the conditional method to simulate a multinomial draw using a series of binomial draws32. Second, although the standard approach provides for the calculation of standard errors for LE, we also needed standard errors for cause-specific relative contributions RCi. Therefore, we calculated the RCi for each simulated life table and evaluated the standard errors of the RCi using their simulated distributions.
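The conditional method mentioned above, in which a multinomial draw is built from a sequence of binomial draws, can be sketched as follows. This is an illustrative sketch only: the function names, the number at risk, and the cause-specific probabilities are hypothetical, and the simple Bernoulli-sum binomial sampler is chosen for transparency rather than speed.

```python
import random
import statistics


def binomial_draw(rng, n, p):
    """Number of successes in n Bernoulli(p) trials (simple O(n) sampler)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def multinomial_by_conditional_binomials(rng, n, probs):
    """Draw cause-specific death counts among n individuals at risk.
    probs[i] is the probability of dying from cause i (sum at most 1);
    individuals not assigned to any cause are survivors."""
    counts, remaining_n, remaining_p = [], n, 1.0
    for p in probs:
        # Probability of cause i conditional on not having died from earlier causes.
        cond_p = min(1.0, p / remaining_p) if remaining_p > 0 else 0.0
        k = binomial_draw(rng, remaining_n, cond_p)
        counts.append(k)
        remaining_n -= k
        remaining_p -= p
    return counts, remaining_n


# Hypothetical inputs: 1000 at risk, three causes of death in one age interval.
rng = random.Random(2023)
at_risk = 1000
cause_probs = [0.020, 0.010, 0.005]
replicates = [multinomial_by_conditional_binomials(rng, at_risk, cause_probs)
              for _ in range(200)]

cause1_counts = [counts[0] for counts, _ in replicates]
mean_cause1 = statistics.mean(cause1_counts)
se_cause1 = statistics.stdev(cause1_counts)     # simulated spread of cause-1 deaths
```

In a full application each simulated set of counts would be passed through the life table and decomposition calculations, and the standard deviation of the resulting simulated values would serve as the standard error, as described above.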

2.3. Derivation of Equations (8) and (9)

Formulas (6) and (7) are exact. The sum of the relative contributions $\sum_i RC_i$ equals 1 (or 100%). Only the cause-specific hazards $\mu_i^j(a)$ depend on i, so the sum over i in (6) or (7) results in $\sum_i \mu_i^j(a) = \mu^j(a)$, where $\mu^j(a)$ is the total subpopulation-specific hazard function for subpopulation j. The integral in the resulting formulas can be calculated analytically between two arbitrary ages $a_1$ and $a_2$:

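$$\int_{a_1}^{a_2} (\mu^B(a) - \mu^W(a)) \, (\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a)) \, da = (\ell_B(a_1) + \ell_W(a_1))(e_W(a_1) - e_B(a_1)) - (\ell_B(a_2) + \ell_W(a_2))(e_W(a_2) - e_B(a_2)). \tag{10}$$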

The four-step proof of this statement is (where each line is a step):

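$$\begin{aligned}
\int_{a_1}^{a_2} (\mu^B(a) - \mu^W(a)) \, (\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a)) \, da
&= \int_{a_1}^{a_2} \frac{\ell_W'(a)\,\ell_B(a) - \ell_B'(a)\,\ell_W(a)}{\ell_B(a)\,\ell_W(a)} \, (\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a)) \, da \\
&= -\int_{a_1}^{a_2} \ell_W(a)\, e_W(a) \, d\!\left[\frac{\ell_B(a)}{\ell_W(a)}\right] + \int_{a_1}^{a_2} \ell_B(a)\, e_B(a) \, d\!\left[\frac{\ell_W(a)}{\ell_B(a)}\right] \\
&= -\ell_B(a)\, e_W(a)\Big|_{a_1}^{a_2} - \int_{a_1}^{a_2} \ell_B(a) \, da + \ell_W(a)\, e_B(a)\Big|_{a_1}^{a_2} + \int_{a_1}^{a_2} \ell_W(a) \, da \\
&= (\ell_B(a_1) + \ell_W(a_1))(e_W(a_1) - e_B(a_1)) - (\ell_B(a_2) + \ell_W(a_2))(e_W(a_2) - e_B(a_2)),
\end{aligned}$$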

where we used i) the equation $\mu^j(x)\,\ell_j(x) = -\ell_j'(x)$ in step one (j indexes the groups being compared, e.g., j = B or W); ii) properties of derivatives of a ratio, such as $(\ell_B(a)/\ell_W(a))' = (\ell_B'(a)\,\ell_W(a) - \ell_W'(a)\,\ell_B(a))/\ell_W^2(a)$, for step two; iii) integration by parts and the equation $\ell_j(x) = -(\ell_j(x)\, e_j(x))'$, which follows from the definition of LE, $e_j(x) = \frac{1}{\ell_j(x)} \int_x^\infty \ell_j(a) \, da$, for step three; and iv) the equation $\int_{a_1}^{a_2} \ell_j(a) \, da = \ell_j(a_1)\, e_j(a_1) - \ell_j(a_2)\, e_j(a_2)$, which also follows from the definition of LE, for step four. The result of the integration can be represented as $F(a_1) - F(a_2)$, where $F(a) \equiv (\ell_B(a) + \ell_W(a))(e_W(a) - e_B(a))$. Pollard’s integral is obtained when $a_1 = 65$ and $a_2 \to \infty$ because $F(65) = 2(e_W(65) - e_B(65))$ and $F(\infty) = 0$.
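The closed form of the integral can be checked by hand in a simple special case. The following worked example assumes constant total hazards above age 65 in both groups, which is an illustrative assumption made here and not part of the paper's method; the constants $\mu^B$ and $\mu^W$ are hypothetical.

```latex
% Illustrative assumption: constant total hazards \mu^B and \mu^W above age 65, with t = a - 65.
% Then \ell_j(a) = e^{-\mu^j t} and e_j(a) = 1/\mu^j for j = B, W.
\begin{aligned}
\int_{65}^{\infty} (\mu^B - \mu^W)\left(\ell_B(a)\, e_W(a) + \ell_W(a)\, e_B(a)\right) da
  &= (\mu^B - \mu^W) \int_0^\infty \left( \frac{e^{-\mu^B t}}{\mu^W} + \frac{e^{-\mu^W t}}{\mu^B} \right) dt \\
  &= (\mu^B - \mu^W) \, \frac{2}{\mu^B \mu^W}
   = 2\left( \frac{1}{\mu^W} - \frac{1}{\mu^B} \right) \\
  &= 2\left(e_W(65) - e_B(65)\right).
\end{aligned}
```

This agrees with the boundary values of the function $(\ell_B(a) + \ell_W(a))(e_W(a) - e_B(a))$, which equals $2(e_W(65) - e_B(65))$ at age 65 (where both survival functions equal 1) and tends to zero as age increases.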

An important property of Pollard’s integral is that it is exactly zero if taken over a time interval with no deaths in either subpopulation under comparison. This follows from eqn. (10) because the subpopulation-specific survival functions are constant, and therefore, changes in life expectancies for both subpopulations are equal: $e_W(a_1)-e_W(a_2) = e_B(a_1)-e_B(a_2) = a_2-a_1$, and finally, $F(a_1)=F(a_2)$. This implies that the entire integration region can be separated into subregions containing and not containing death events and only the former contributes to Pollard’s integral. The area around an event can be chosen to be arbitrarily small. Thus, if all events occurred at different times (from a theoretical point of view this is always true), we can express Pollard’s integral as a sum over all death events as

$$e^{o}_W(65)-e^{o}_B(65) = \frac{1}{2}\sum_{k=1}^{K}\left[F(a_k^-)-F(a_k^+)\right] \tag{11}$$

where $a_k^-$ and $a_k^+$ are left and right limits of age $a_k$ (i.e., $a_k^- = a_k-\epsilon$ and $a_k^+ = a_k+\epsilon$, where $\epsilon$ is an arbitrarily small value or even $\epsilon \to 0$), and $k$, $k=1,\dots,K$, indexes the sequence numbers of the death events. The factor $\frac{1}{2}$ arises because $F(65)=2\big(e_W(65)-e_B(65)\big)$. Each death is associated with a certain cause $i$; hence, we can index each ordered death time by cause as $a_{ik_i}^-$, $a_{ik_i}^+$, and $a_{ik_i}$. All terms in eqn. (11) can be regrouped by cause, resulting in a decomposition of the disparity in LE in terms of cause-specific contributions:

$$e^{o}_W(65)-e^{o}_B(65) = \frac{1}{2}\sum_{i=1}^{I}\sum_{k_i=1}^{K_i}\left[F(a_{ik_i}^-)-F(a_{ik_i}^+)\right] \tag{12}$$

where $i$ indexes causes of death, $a_{ik_i}$ represent ages of death from cause $i$, and the number of deaths from cause $i$ is $K_i$, where $\sum_{i=1}^{I}K_i = K$, the total number of deaths from all causes.

We need to specify the details of this approach for the two types of data used in the present study: i) administrative data, where we have individual follow-up information including initial age, final age of follow-up, an indicator of death/censoring at the final age, and cause of death; and ii) population/registry data where we know the number of deaths by cause and age (or age groups) and the associated exposed-population counts.

2.3.1. Administrative data

The first step of analysis using administrative data is to obtain the product-limit estimates for left-truncated (or delayed entry) data16 both for total and cause-specific survival functions. The cause-specific survival functions are calculated in the same way as the total survival function, but ignoring all other causes of death, i.e., treating the other causes as additional “censoring events”. Software to obtain the estimates of such total and cause-specific survival functions is available in all standard statistical packages, e.g., Proc PHREG with the BASELINE statement provides such estimates in SAS software33. The resulting subpopulation-specific total ($\ell_j(a)$) and cause-specific ($\ell_{ij}(a)$) survival functions then satisfy the equality $\ell_j(a)=\prod_{i=1}^{I}\ell_{ij}(a)$ for each age $a$. The survival functions obtained using the above product-limit method are constant between events (i.e., times of death) and change only at these time points. This means that the subpopulation-specific hazard functions $\mu_B(a)$ and $\mu_W(a)$ are zero everywhere except at age-points where deaths occur, i.e., the hazard functions have to be described in terms of the Dirac delta-function. However, the cumulative hazards, i.e., hazard functions integrated over an age interval that includes the age-at-death of a death event ($a_k$), are well-defined and can be expressed in terms of logarithms of the ratios of the respective survival functions taken at the age-points just before ($a_k^-$) and just after ($a_k^+$) the death event (e.g., see ref.34). Specifically, the estimates of the total and cause-specific cumulative hazards are: $\hat\mu_j(a_k)=\log\!\left(\ell_j(a_k^-)/\ell_j(a_k^+)\right)$ for total and $\hat\mu_{ij}(a_k)=\log\!\left(\ell_{ij}(a_k^-)/\ell_{ij}(a_k^+)\right)$ for cause-specific cumulative hazards. The equality $\sum_i\hat\mu_{ij}(a_k)=\hat\mu_j(a_k)$ is then satisfied exactly, because:

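$$\sum_i\hat\mu_{ij}(a_k) = \sum_i\log\!\left(\frac{\ell_{ij}(a_k^-)}{\ell_{ij}(a_k^+)}\right) = \log\!\left(\frac{\prod_i\ell_{ij}(a_k^-)}{\prod_i\ell_{ij}(a_k^+)}\right) = \log\!\left(\frac{\ell_j(a_k^-)}{\ell_j(a_k^+)}\right) = \hat\mu_j(a_k).$$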
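
The jump in the cumulative hazard at each death age can be illustrated on a toy dataset. The sketch below is an assumption-laden illustration rather than the authors' code: the records are hypothetical, everyone is assumed to enter at the same age (no left truncation), and no two different causes share the same exact age at death. Under those assumptions the cause-specific jumps, computed by treating other causes as censoring, add up to the total jump.

```python
import math

# Hypothetical records: (exit_age, cause); cause is None for a censored exit
records = [
    (66.1, "heart"), (67.4, None), (68.0, "cancer"), (68.0, "cancer"),
    (70.2, "heart"), (71.5, None), (73.3, "stroke"), (75.0, None),
]

def cumulative_hazard_jumps(records, cause=None):
    """Return {age: log(l(a-)/l(a+))} from the product-limit estimator.

    With cause=None all deaths count as events; otherwise only deaths from
    `cause` count and every other exit is treated as censoring.
    """
    jumps = {}
    for age in sorted({a for a, c in records if c is not None}):
        at_risk = sum(1 for a, _ in records if a >= age)
        d = sum(
            1 for a, c in records
            if a == age and c is not None and (cause is None or c == cause)
        )
        if d:
            jumps[age] = -math.log(1.0 - d / at_risk)
    return jumps

total = cumulative_hazard_jumps(records)
by_cause = {c: cumulative_hazard_jumps(records, c) for c in ("heart", "cancer", "stroke")}

for age, mu in total.items():
    summed = sum(j.get(age, 0.0) for j in by_cause.values())
    print(age, round(mu, 6), round(summed, 6))
```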

As noted above, Pollard’s integral is exactly zero if taken over a time interval with no deaths in either subpopulation under comparison. Therefore, only intervals involving death events contribute to the integral, and Pollard’s integral is represented as a sum of the respective estimates taken at ages of death detected in the data for one of the subpopulations.

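If exactly one death occurred in the age interval between $a_k^-$ and $a_k^+$, so that $\mu_j(a)=\mu_{ij}(a)$, then the integral over the area around the death event from cause $i$ at age $a_k$ can be calculated using (10), yielding:

$$\begin{aligned}
&\int_{a_k^-}^{a_k^+}\big(\mu_{iB}(a)-\mu_{iW}(a)\big)\big(\ell_B(a)e_W(a)+\ell_W(a)e_B(a)\big)\,da = F(a_k^-)-F(a_k^+) \\
&\quad= \big(\ell_B(a_k^-)+\ell_W(a_k^-)\big)\big(e_W(a_k^-)-e_B(a_k^-)\big) - \big(\ell_B(a_k^+)+\ell_W(a_k^+)\big)\big(e_W(a_k^+)-e_B(a_k^+)\big) \\
&\quad= \ell_B(a_k^-)e_W(a_k^-) - \ell_W(a_k^-)e_B(a_k^-) - \ell_B(a_k^+)e_W(a_k^+) + \ell_W(a_k^+)e_B(a_k^+)
\end{aligned} \tag{13}$$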

where we used the equalities

$$\ell_j(a_k^-)\,e_j(a_k^-) = \ell_j(a_k^+)\,e_j(a_k^+) \tag{14}$$

which are valid for each subpopulation $j$ (e.g., $j=B$ or $W$). The equality holds because both products in the left and right sides of eqn. (14) are equal to $\int_{a_k}^{\infty}\ell_j(u)\,du$. Recall that $a_k^-$ and $a_k^+$ mean that the left and right limits of the respective functions have to be taken. Also recall that we have the freedom to choose any lower and upper integration limit in eqn. (13) and that there is only one age at death ($a_k$) inside the integration region. This allows us to choose $a_k^-$ and $a_k^+$ infinitely close to $a_k$. Because of the discontinuity of the survival function at the point $a_k$, it follows that life expectancy is not continuous at this point either. Recall, however, that the cumulative hazard function is estimated as the logarithm of the ratio of the survival functions at $a_k^-$ and $a_k^+$. It follows that the survival and life expectancy functions can each be expressed in terms of the cumulative hazard function:

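$$\begin{aligned}
\ell_j(a_k^-) &= \ell_j(a_k^+)\exp\!\big(\hat\mu_j(a_k)\big) \\
e_j(a_k^-) &= e_j(a_k^+)\exp\!\big(-\hat\mu_j(a_k)\big)
\end{aligned} \tag{15}$$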

where the first line of eqn. (15) follows from the definition of the cumulative hazard and the second uses the definition of LE, i.e., $e_j(x)=\frac{1}{\ell_j(x)}\int_x^{\infty}\ell_j(a)\,da$, combined with the property that the integral $\int_{a}^{\infty}\ell_j(u)\,du$ is a continuous function of age $a$, in which case $\int_{a_k^-}^{\infty}\ell_j(u)\,du=\int_{a_k^+}^{\infty}\ell_j(u)\,du=\int_{a_k}^{\infty}\ell_j(u)\,du$. Multiplication of the first and second lines of eqn. (15) yields eqn. (14), thereby confirming its validity.

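Eqn. (15) allows us to express eqn. (13) in terms of $\ell_j(a_k^-)$ and $e_j(a_k^+)$:

$$F(a_k^-)-F(a_k^+) = \left(e^{-\hat\mu_W(a_k)}-e^{-\hat\mu_B(a_k)}\right)\left(\ell_W(a_k^-)\,e_B(a_k^+)+\ell_B(a_k^-)\,e_W(a_k^+)\right), \tag{16}$$

or, in terms of $\ell_j(a_k^+)$ and $e_j(a_k^-)$:

$$F(a_k^-)-F(a_k^+) = \left(e^{\hat\mu_B(a_k)}-e^{\hat\mu_W(a_k)}\right)\left(\ell_W(a_k^+)\,e_B(a_k^-)+\ell_B(a_k^+)\,e_W(a_k^-)\right). \tag{17}$$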

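One more expression involves the difference between $\ell_j(a_k^-)$ and $\ell_j(a_k^+)$, or jump discontinuity, defined as $J_j(a_k)=\ell_j(a_k^-)-\ell_j(a_k^+)$:

$$F(a_k^-)-F(a_k^+) = \left(\frac{J_B(a_k)}{\ell_B(a_k^+)}-\frac{J_W(a_k)}{\ell_W(a_k^+)}\right)\left(\ell_W(a_k^+)\,e_B(a_k^-)+\ell_B(a_k^+)\,e_W(a_k^-)\right). \tag{18}$$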
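
The equivalence of eqns. (13), (16), (17), and (18) at a single death age can be checked numerically. In the sketch below all input values are hypothetical: the survival levels just before and after the jump in each group, and the remaining area under each survival curve (which is continuous at the jump, as eqn. (14) requires).

```python
import math

# Hypothetical survival levels just before / after a death age a_k
lB_minus, lB_plus = 0.80, 0.76
lW_minus, lW_plus = 0.85, 0.84
# Hypothetical integrals of the survival functions from a_k to infinity
area_B, area_W = 9.0, 11.0

# Life expectancies on each side of the jump follow from eqn. (14)
eB_minus, eB_plus = area_B / lB_minus, area_B / lB_plus
eW_minus, eW_plus = area_W / lW_minus, area_W / lW_plus

# Cumulative hazards as log-ratios of survival before and after the jump
mu_B = math.log(lB_minus / lB_plus)
mu_W = math.log(lW_minus / lW_plus)

def F(lB, lW, eB, eW):
    """F = (l_B + l_W)(e_W - e_B)."""
    return (lB + lW) * (eW - eB)

direct = F(lB_minus, lW_minus, eB_minus, eW_minus) - F(lB_plus, lW_plus, eB_plus, eW_plus)
eq16 = (math.exp(-mu_W) - math.exp(-mu_B)) * (lW_minus * eB_plus + lB_minus * eW_plus)
eq17 = (math.exp(mu_B) - math.exp(mu_W)) * (lW_plus * eB_minus + lB_plus * eW_minus)
J_B, J_W = lB_minus - lB_plus, lW_minus - lW_plus
eq18 = (J_B / lB_plus - J_W / lW_plus) * (lW_plus * eB_minus + lB_plus * eW_minus)

print(direct, eq16, eq17, eq18)
```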

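Eqns. (16), (17), and (18), and the property that the integral over an interval without any deaths is exactly zero, allow us to represent the disparities in LE65 ($e_W(65)-e_B(65)$) as a sum over times when deaths occurred:

$$e_W(65)-e_B(65) = \frac{1}{2}\sum_k\big(F(a_k^-)-F(a_k^+)\big) = \frac{1}{2}\sum_k\left(\left(\frac{J_B(a_k)}{\ell_B(a_k^+)}-\frac{J_W(a_k)}{\ell_W(a_k^+)}\right)\left(\ell_W(a_k^+)\,e_B(a_k^-)+\ell_B(a_k^+)\,e_W(a_k^-)\right)\right). \tag{19}$$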

Similar formulas can be obtained using eqns. (16) and (17).

If each term in the sum over k is associated with a certain cause of death, then all terms in (19) can be regrouped to combine all cause-specific terms resulting in decomposition of the disparity in LE in terms of cause-of-death-related contributions:

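$$e_W(65)-e_B(65) = \frac{1}{2}\sum_{i=1}^{I}\sum_{k_i=1}^{K_i}\left(\left(\frac{J_B(a_{ik_i})}{\ell_B(a_{ik_i}^+)}-\frac{J_W(a_{ik_i})}{\ell_W(a_{ik_i}^+)}\right)\left(\ell_W(a_{ik_i}^+)\,e_B(a_{ik_i}^-)+\ell_B(a_{ik_i}^+)\,e_W(a_{ik_i}^-)\right)\right) \tag{20}$$

where $i$ indexes cause of death, $a_{ik_i}$ represent age at death from cause $i$, and the number of such deaths is $K_i$.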

Eqn. (20) is valid only when no deaths from different causes occur on the same day, assuming that time of day of each death is not available to break such ties. Because such situations are likely to occur occasionally, a minor modification of eqn. (20) is required. The modification is based on the property that the jump discontinuities $J_j(a_k)$ are proportional to the cause-specific cumulative hazards: $J_j(a_k)=\sum_i\frac{\hat\mu_{ij}(a_k)}{\hat\mu_j(a_k)}J_j(a_k)$. This works for one or several deaths from one cause as well as many deaths from different causes, all occurring at exact age $a_k$ (i.e., with attained age measured in days since birth). Substituting for $J_j(a_k)$ in eqn. (20), we obtain an expression for disparities in LE valid in the general case, i.e., when multiple deaths with multiple causes can occur at an exact age $a_k$:

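$$e_W(65)-e_B(65) = \frac{1}{2}\sum_{i=1}^{I}\sum_{k_i=1}^{K_i}\left(\left(\frac{\hat\mu_{iB}(a_{ik_i})}{\hat\mu_B(a_{ik_i})}\frac{J_B(a_{ik_i})}{\ell_B(a_{ik_i}^+)}-\frac{\hat\mu_{iW}(a_{ik_i})}{\hat\mu_W(a_{ik_i})}\frac{J_W(a_{ik_i})}{\ell_W(a_{ik_i}^+)}\right)\left(\ell_W(a_{ik_i}^+)\,e_B(a_{ik_i}^-)+\ell_B(a_{ik_i}^+)\,e_W(a_{ik_i}^-)\right)\right), \tag{21}$$

or, alternatively:

$$e_W(65)-e_B(65) = \frac{1}{2}\sum_{i=1}^{I}\sum_{k_i=1}^{K_i}\left(\left(\frac{\hat\mu_{iB}(a_{ik_i})}{\hat\mu_B(a_{ik_i})}\left(e^{\hat\mu_B(a_{ik_i})}-1\right)-\frac{\hat\mu_{iW}(a_{ik_i})}{\hat\mu_W(a_{ik_i})}\left(e^{\hat\mu_W(a_{ik_i})}-1\right)\right)\left(\ell_W(a_{ik_i}^+)\,e_B(a_{ik_i}^-)+\ell_B(a_{ik_i}^+)\,e_W(a_{ik_i}^-)\right)\right). \tag{22}$$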

Equation (22) is the final expression for the decomposition of life-expectancy disparities using administrative data; it is the same as eqn. (9). The explicit expressions for the cause-specific relative contributions are:

$$RC_i = \frac{1}{2}\sum_{k_i=1}^{K_i}\frac{\left(\frac{\hat\mu_{iB}(a_{ik_i})}{\hat\mu_B(a_{ik_i})}\left(e^{\hat\mu_B(a_{ik_i})}-1\right)-\frac{\hat\mu_{iW}(a_{ik_i})}{\hat\mu_W(a_{ik_i})}\left(e^{\hat\mu_W(a_{ik_i})}-1\right)\right)\left(\ell_W(a_{ik_i}^+)\,e_B(a_{ik_i}^-)+\ell_B(a_{ik_i}^+)\,e_W(a_{ik_i}^-)\right)}{e_W(65)-e_B(65)}.$$

Because all derivations used to obtain eqn. (22) were exact, and the only action taken to isolate the contributions of specific causes of death was a regrouping of the terms by combining deaths with the same cause to create the sum over $i$, the property $\sum_{i=1}^{I}RC_i=1$ is exactly satisfied. The $RC_i$s may include offsetting effects for one or more causes, in which case some $RC_i$s will differ in sign from that of the overall sum.
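
To make eqn. (22) concrete, the following sketch applies it to a small hypothetical dataset and checks that the cause-specific contributions sum to the difference in life expectancy. Several simplifications are assumptions of the example, not features of the method: everyone enters at age 65, survivors are administratively censored at a common age `T_END` so that life expectancy is restricted to that horizon, and no two different causes share an exact age at death within a group. All names and numbers are illustrative.

```python
import math
from collections import defaultdict

T_END = 100.0  # assumed administrative end of follow-up (hypothetical)

def make_group(deaths, n):
    """deaths: list of (age_at_death, cause); the other n - len(deaths) people survive to T_END."""
    exits = [age for age, _ in deaths] + [T_END] * (n - len(deaths))
    return {"deaths": deaths, "exits": exits, "n": n}

def ell_minus(g, a):
    """Survival just before age a."""
    return sum(1 for t in g["exits"] if t >= a) / g["n"]

def ell_plus(g, a):
    """Survival just after age a."""
    return sum(1 for t in g["exits"] if t > a) / g["n"]

def area(g, a):
    """Integral of the survival function from age a to T_END."""
    return sum(max(0.0, t - a) for t in g["exits"]) / g["n"]

def e_minus(g, a):
    """Restricted life expectancy just before age a."""
    return area(g, a) / ell_minus(g, a)

def hazard_term(g, a, cause):
    """(mu_i / mu) * (exp(mu) - 1) at age a for one group and one cause."""
    at_risk = sum(1 for t in g["exits"] if t >= a)
    d_all = sum(1 for t, _ in g["deaths"] if t == a)
    d_i = sum(1 for t, c in g["deaths"] if t == a and c == cause)
    if d_i == 0:
        return 0.0
    mu = -math.log(1.0 - d_all / at_risk)
    mu_i = -math.log(1.0 - d_i / at_risk)
    return mu_i / mu * math.expm1(mu)

def decompose(black, white):
    """Cause-specific contributions to e_W(65) - e_B(65) following eqn. (22)."""
    ages = sorted({a for a, _ in black["deaths"]} | {a for a, _ in white["deaths"]})
    causes = {c for _, c in black["deaths"]} | {c for _, c in white["deaths"]}
    contrib = defaultdict(float)
    for a in ages:
        weight = ell_plus(white, a) * e_minus(black, a) + ell_plus(black, a) * e_minus(white, a)
        for c in causes:
            contrib[c] += 0.5 * (hazard_term(black, a, c) - hazard_term(white, a, c)) * weight
    return dict(contrib)

black = make_group([(67.2, "heart"), (70.5, "cancer"), (72.0, "heart"), (80.1, "stroke")], n=6)
white = make_group([(69.0, "cancer"), (75.3, "heart"), (85.4, "stroke")], n=6)

contrib = decompose(black, white)
gap = area(white, 65.0) - area(black, 65.0)  # e_W(65) - e_B(65); survival at 65 is 1
relative = {c: v / gap for c, v in contrib.items()}
print(relative, sum(relative.values()))
```

Real administrative data would additionally require delayed-entry risk sets and censoring before the end of follow-up, which this toy example does not handle.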

2.3.2. Population/registry data

In population/registry data, data for integration are usually available in the form of life table functions. We assume that the total and cause-specific survival functions ($l_x^j$ and $l_{ix}^j$) are available for integer ages. We use the notation $l_x^j$ and $l_{ix}^j$ in order to underscore that these are life table functions in contrast to the continuous survival functions $\ell_j(a)$ and $\ell_{ij}(a)$ that we used in Section 2.1.1. Decomposition of disparities in LE65 can be obtained from eqn. (10) if we consider $a_1$ and $a_2$ as consecutive integer ages. The integration between the two integer ages is performed as shown in Appendix B, yielding:

$$e_W(65)-e_B(65) = \frac{1}{2}\sum_{i=1}^{I}\sum_{x=65}^{\infty}\left(\left(\frac{\hat\mu_{ix}^B}{\hat\mu_x^B}\left(e^{\hat\mu_x^B}-1\right)-\frac{\hat\mu_{ix}^W}{\hat\mu_x^W}\left(e^{\hat\mu_x^W}-1\right)\right)\left(l_{x+1}^W\left(e_x^B-(\hat a_x-x)\right)+l_{x+1}^B\left(e_x^W-(\hat a_x-x)\right)\right)\right), \tag{23}$$

where the total and cause-specific cumulative hazards are defined as: $\hat\mu_x^j \equiv \log\!\left(l_x^j/l_{x+1}^j\right)$ and $\hat\mu_{ix}^j \equiv \log\!\left(l_{ix}^j/l_{i,x+1}^j\right)$. No specific assumption on the form of the distribution of deaths within an age interval is required. The age $\hat a_x$ is an effective mean age of death in this interval, which is calculated as a weighted sum of the individual ages of death (Appendix B). Equivalently, we can consider the age $\hat a_x$ as the point at which all deaths in the interval occurred; no specific assumptions are required to do so. In this case, eqn. (23) can be obtained directly from eqn. (22). The assumption that $\hat a_x-x=\frac{1}{2}$ can be made for one-year intervals, but it is not necessary for our development.

We need to explicitly demonstrate that the sum over all age intervals, indexed by x, results in the difference in life expectancies shown in eqn. (23). The ratios of cause-specific and total cumulative hazards sum to one, after summing over i, so we need to deal only with terms derived from eqn. (23) having the following form:

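$$\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\left(l_{x+1}^W\left(e_x^B-\tfrac{1}{2}\right)+l_{x+1}^B\left(e_x^W-\tfrac{1}{2}\right)\right) = \left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\left(l_{x+1}^W\sum_{y=x}^{\infty}\frac{l_{y+1}^B}{l_x^B}+l_{x+1}^B\sum_{y=x}^{\infty}\frac{l_{y+1}^W}{l_x^W}\right), \tag{24}$$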

where we used the following expressions for LE in terms of life table functions

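$$e_x^j = \sum_{y=x}^{\infty}\frac{L_y^j}{l_x^j} = \hat a_x-x+\sum_{y=x}^{\infty}\frac{l_{y+1}^j}{l_x^j}. \tag{25}$$

That is true when $\hat a_x-x$ is independent of $x$. The standard expression for LE is obtained using $\hat a_x-x=\frac{1}{2}$.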

To complete the demonstration, we change the order of summations in the right side of (24) and reintroduce the summation over x, starting at x0=65:

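$$\begin{aligned}
&\frac{1}{2}\sum_{x=x_0}^{\infty}\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\left(\sum_{y=x}^{\infty}\left(\frac{l_{x+1}^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^B\,l_{y+1}^W}{l_x^W}\right)\right) \\
&\quad= \frac{1}{2}\sum_{x=x_0}^{\infty}\sum_{y=x}^{\infty}\left(\frac{l_{x+1}^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^B\,l_{y+1}^W}{l_x^W}\right)\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right) \\
&\quad= \frac{1}{2}\sum_{x=x_0}^{\infty}\sum_{y=x}^{\infty}\left(\frac{l_x^B\,l_{y+1}^W}{l_x^W}-\frac{l_x^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^W\,l_{y+1}^B}{l_{x+1}^B}-\frac{l_{x+1}^B\,l_{y+1}^W}{l_{x+1}^W}\right) \\
&\quad= \frac{1}{2}\sum_{y=x_0}^{\infty}\sum_{x=x_0}^{y}\left(\frac{l_x^B\,l_{y+1}^W}{l_x^W}-\frac{l_x^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^W\,l_{y+1}^B}{l_{x+1}^B}-\frac{l_{x+1}^B\,l_{y+1}^W}{l_{x+1}^W}\right) \\
&\quad= \frac{1}{2}\sum_{y=x_0}^{\infty}\left(\frac{l_{x_0}^B\,l_{y+1}^W}{l_{x_0}^W}-\frac{l_{x_0}^W\,l_{y+1}^B}{l_{x_0}^B}+\frac{l_{y+1}^W\,l_{y+1}^B}{l_{y+1}^B}-\frac{l_{y+1}^B\,l_{y+1}^W}{l_{y+1}^W}\right) \\
&\quad= \sum_{y=x_0}^{\infty}\left(l_{y+1}^W-l_{y+1}^B\right) = e_{x_0}^W-e_{x_0}^B,
\end{aligned}$$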

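where, with $x_0=65$, the final term is seen to be equal to the first term in eqn. (23). In these calculations $x_0$ denotes the left bound of the first age interval (e.g., $x_0=65$), so the equation $l_{x_0}^W=l_{x_0}^B=1$ was used for consistency with the initialization of the continuous survival functions $\ell_j(x_0)=1$. For any other starting age $a$, where $l_a^W\ne l_a^B\ne 1$, the result of the above calculation is:

$$\sum_{x=a}^{\infty}\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\left(\sum_{y=x}^{\infty}\left(\frac{l_{x+1}^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^B\,l_{y+1}^W}{l_x^W}\right)\right) = \left(l_a^W+l_a^B\right)\left(e_a^W-e_a^B\right). \tag{26}$$

This calculation can be repeated for any initial and final interval resulting in:

$$\sum_{x=a_1}^{a_2}\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\left(\sum_{y=x}^{\infty}\left(\frac{l_{x+1}^W\,l_{y+1}^B}{l_x^B}+\frac{l_{x+1}^B\,l_{y+1}^W}{l_x^W}\right)\right) = \left(l_{a_1}^W+l_{a_1}^B\right)\left(e_{a_1}^W-e_{a_1}^B\right)-\left(l_{a_2+1}^W+l_{a_2+1}^B\right)\left(e_{a_2+1}^W-e_{a_2+1}^B\right).$$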

The sums in this calculation were treated as follows:

x=a1a2y=xy=a1x=a1min(y,a2)y=a1(y=a1y=a2+1).

The last age group, with left bound denoted as $x_1$, should be chosen such that both $l_{x_1+1}^B$ and $l_{x_1+1}^W$ are negligible. In practice, data for life tables contain a last group that is open to the right, e.g., the group 100+ in our study. The total and cause-specific probabilities of death are assumed constant within the last group. The contribution of the last group to the difference in LE65 can be calculated based on eqn. (23), in which all terms starting from $x_1$ in the sum over $x$ have to be analytically summed for fixed probabilities of death $q_{x_1}^j$, where $q_{x_1}^j=1-\exp(-\hat\mu_{x_1}^j)$. Life expectancies for $x\ge x_1$ are time independent (i.e., constant) if the probabilities of death are time independent; they are estimated as:

$$e_{x_1}^j-(\hat a_x-x) = \sum_{y=x_1}^{\infty}\frac{l_{y+1}^j}{l_{x_1}^j} = \frac{1}{l_{x_1}^j}\sum_{y=x_1}^{\infty}\left(l_y^j-l_y^j\,q_{x_1}^j\right) = \left(1-q_{x_1}^j\right)\sum_{y=x_1}^{\infty}\left(1-q_{x_1}^j\right)^{y-x_1} = \frac{1}{q_{x_1}^j}-1. \tag{27}$$

The survival functions $l_{x+1}^B$ and $l_{x+1}^W$ in eqn. (23) are the only functions that change with changing $x$ for $x\ge x_1$, and therefore, sums of these functions need to be calculated to evaluate the last-age-group contribution:

$$\sum_{x=x_1}^{\infty}l_{x+1}^j = l_{x_1}^j\,\frac{1-q_{x_1}^j}{q_{x_1}^j}.$$
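
The closed form for the open-ended age group is a geometric series, which can be checked numerically. The sketch below uses a hypothetical constant probability of death and a hypothetical number of survivors entering the last group; the function name and the truncation length are choices made for the example.

```python
def tail_sum(q, l_start, n_terms=10_000):
    """Truncated sum of l_{x+1} over the open-ended group for a constant q."""
    l, total = l_start, 0.0
    for _ in range(n_terms):
        l *= 1.0 - q          # survivors to the next integer age
        total += l
    return total

q_last = 0.35     # hypothetical constant probability of death at ages 100+
l_last = 0.02     # hypothetical share of the cohort surviving to age 100

numeric = tail_sum(q_last, l_last)
closed_form = l_last * (1.0 - q_last) / q_last
print(abs(numeric - closed_form) < 1e-12)
```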

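Then the last age-group contribution in eqn. (23) is:

$$\frac{1}{2}\sum_{i=1}^{I}\left(\frac{\hat\mu_{ix_1}^B}{\hat\mu_{x_1}^B}\left(\exp(\hat\mu_{x_1}^B)-1\right)-\frac{\hat\mu_{ix_1}^W}{\hat\mu_{x_1}^W}\left(\exp(\hat\mu_{x_1}^W)-1\right)\right)\left(\frac{\left(1-q_{x_1}^B\right)\left(1-q_{x_1}^W\right)}{q_{x_1}^B\,q_{x_1}^W}\right)\left(l_{x_1}^B+l_{x_1}^W\right). \tag{28}$$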

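Thus, the life-table-based decomposition with $x_0$ and $x_1$ denoting the left bounds of the first and last (open-to-the-right) age intervals is:

$$e_{x_0}^W-e_{x_0}^B = \sum_{i=1}^{I}\sum_{x=x_0}^{x_1}\left(\frac{\hat\mu_{ix}^B}{\hat\mu_x^B}\left(e^{\hat\mu_x^B}-1\right)-\frac{\hat\mu_{ix}^W}{\hat\mu_x^W}\left(e^{\hat\mu_x^W}-1\right)\right)W_x, \tag{29}$$

where

$$W_x = \frac{1}{2}\left(l_{x+1}^W\left(e_x^B-(\hat a_x-x)\right)+l_{x+1}^B\left(e_x^W-(\hat a_x-x)\right)\right)\quad\text{for } x_0\le x<x_1,$$

and

$$W_{x_1} = \frac{1}{2}\left(l_{x_1}^B+l_{x_1}^W\right)\frac{\left(1-q_{x_1}^B\right)\left(1-q_{x_1}^W\right)}{q_{x_1}^B\,q_{x_1}^W}.$$

These quantities correspond to Pollard’s weights in his original notation. The final expression using eqn. (29) with $x_0=65$ is shown in eqn. (8).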
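
A compact sketch of the life-table form in eqn. (29) is given below. It is only an illustration under stated assumptions: the inputs are hypothetical cause-specific cumulative hazards per one-year age interval, the last listed interval is treated as the open-ended group with constant probabilities of death, and the function names are invented for the example. The code checks that the cause-specific contributions sum to the difference in life expectancy at the starting age.

```python
import math

def build_table(mu_by_cause):
    """mu_by_cause: {cause: [cumulative hazard per age interval]}; last interval is open-ended."""
    n = len(next(iter(mu_by_cause.values())))
    mu = [sum(m[k] for m in mu_by_cause.values()) for k in range(n)]
    l = [1.0]                                   # l_{x0} = 1
    for k in range(n - 1):
        l.append(l[-1] * math.exp(-mu[k]))
    q_last = 1.0 - math.exp(-mu[-1])
    # s[k] = sum over y >= x of l_{y+1}, so e_x - (a_hat_x - x) = s[k] / l[k]
    s = [0.0] * n
    s[-1] = l[-1] * (1.0 - q_last) / q_last
    for k in range(n - 2, -1, -1):
        s[k] = l[k + 1] + s[k + 1]
    return mu, l, s, q_last

def decompose(mu_B, mu_W):
    """Cause-specific contributions to e_W - e_B at the first age, following eqn. (29)."""
    tB, lB, sB, qB = build_table(mu_B)
    tW, lW, sW, qW = build_table(mu_W)
    n = len(lB)
    contrib = {c: 0.0 for c in mu_B}
    for k in range(n):
        if k < n - 1:   # regular interval: weight W_x
            w = 0.5 * (lW[k + 1] * sB[k] / lB[k] + lB[k + 1] * sW[k] / lW[k])
        else:           # open-ended interval: weight W_{x1}
            w = 0.5 * (lB[k] + lW[k]) * (1 - qB) * (1 - qW) / (qB * qW)
        for c in contrib:
            term_B = mu_B[c][k] / tB[k] * math.expm1(tB[k])
            term_W = mu_W[c][k] / tW[k] * math.expm1(tW[k])
            contrib[c] += (term_B - term_W) * w
    gap = sW[0] - sB[0]   # e_W - e_B; the (a_hat_x - x) terms cancel
    return contrib, gap

# Hypothetical cumulative hazards for ages 65, 66, and an open-ended 67+ group
mu_B = {"heart": [0.020, 0.025, 0.20], "cancer": [0.010, 0.012, 0.15]}
mu_W = {"heart": [0.015, 0.018, 0.18], "cancer": [0.011, 0.012, 0.14]}

contrib, gap = decompose(mu_B, mu_W)
print(contrib, gap)
```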

3. Application

We now apply the new approach to identify cause-specific contributions to disparities in LE65 between i) Black and White subpopulations, ii) U.S. regions leading and lagging the national average with respect to LE65, and iii) two time periods: 1998–2005 and 2010–2017. Leading and lagging U.S. regions were identified by ranking all U.S. states by their LE65 using the Centers for Disease Control and Prevention (CDC) Wide-ranging OnLine Data for Epidemiologic Research (WONDER)35 and selecting the eight states with the lowest and the eight with the highest LE65 as lagging and leading states, respectively. The lagging states were: Arkansas, Tennessee, Louisiana, Oklahoma, Kentucky, Alabama, Mississippi, and West Virginia. The leading states were: California, New York, Hawaii, Florida, Arizona, Connecticut, Minnesota, and Colorado. We then selected 37 cause groups (Table 1) that cover all possible causes of death that are recorded on death certificates. When an individual cause (e.g., hypertension, heart failure) in a chapter of the International Classification of Diseases, 10th Revision (ICD-10) (e.g., diseases of the circulatory system) demonstrated rates in excess of 50 per 100,000, it was treated as its own cause group; the remaining diseases in the chapter were combined into an other-cause group (e.g., other diseases of the circulatory system). The total sum of death rates for these 37 cause groups was 4,533/100,000, which was equal to the total age-adjusted death rate obtained from CDC-WONDER for the study period.

3.1. Data

Two datasets were used in this study. The Multiple Cause of Death database (MCD) provides cause-specific mortality and population counts for the U.S.; the mortality data are derived from death certificates; the mortality and population counts can be aggregated to form race- or geographic-region-specific population strata. The mortality data for each decedent are summarized as a single underlying cause of death, supplemented with up to twenty additional causes, with each cause coded using 4-digit ICD-10 codes; basic demographic data including place of residence, age, race/ethnicity, and sex are also provided. We used the underlying cause of death provided by the CDC-WONDER MCD tool to evaluate race- and geography-specific age-adjusted mortality rates in older adults age 65+ over the 2000–2017 period.

The MCD database is a commonly used source of information for evaluation and analyses of strata-specific LE. However, because of known limitations at ages 65+ (e.g., unavailability of year-specific population counts at ages 85+, erroneous assignment of underlying cause-of-death), another source of data is needed for comparison. Because most legal U.S. residents become eligible for health insurance provided by the Medicare system at age 65, Medicare administrative health insurance claims can serve as an alternative source of information for analyses of LE65. In the present study, a nationally representative sample of 5% of the total Medicare population (5%-Medicare) over the 1991–2017 period was used. The enrollment file provided information on place of residence, age, race, sex and date of all-cause mortality while ICD-9/10 codes in the diagnoses fields of Medicare Part A (facility-based services) or Medicare Part B (professional services) records were used to ascertain morbidity.

Each of the utilized datasets underwent additional processing to make them suitable for the planned analysis. For the MCD data, cause-specific death rates were calculated as the ratio of the number of deaths from a given underlying cause within a given age group to the total mid-year population count within that age group. Our analysis required cause-specific death rates calculated in 36 age groups (single-year age groups from 65 to 99 and the age group 100+ that aggregates all ages above 99) across four population strata (White and Black population, leading and lagging state of residence) for each calendar year of the analysis (2000–2017). CDC-WONDER provides all necessary data with one exception: population counts by single-year age groups are available only for ages younger than 85 and are aggregated into a single group for ages 85+. Therefore, we used additional information to disaggregate the year-specific population counts in the 85+ age group into single-year age groups for ages 85-99 and a single aggregated group for age 100+. For geographic disparities, we used state-specific life tables from the Human Mortality Database for all years of the analysis (see Akushevich et al.36, Supplementary Methodologic Remark 1 for methodological details). For racial disparities, we used the Annual Projections of the Resident Population by Age, Sex, Race, and Hispanic Origin, 1999–2100, from the 2000 National Population Projections Datasets37. Reconstructed distributions were smoothed for ages 83–86.

For the 5%-Medicare data, an assigned underlying cause of death was not available. Therefore, we developed a simulation procedure to assign a cause of death from the set of diagnoses associated with each individual decedent. Specifically, we reconstructed individual health trajectories utilizing all 37 cause-specific groups (Table 1) and used them to simulate the cause of death. In the simulation procedure, individual assignment of a cause of death occurred through a random selection of a cause from the lists of health conditions individuals accumulated by the end of their life. Selection of the final cause of death used probabilities proportional to the cause-specific weights, which were iteratively recalculated in such a way as to have the final cause-specific mortality rates equal to those found in the MCD data.
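
The random assignment step can be illustrated as follows. This is a rough sketch under several assumptions: the decedents, their lists of conditions, and the target cause shares are hypothetical, and the multiplicative weight update is one simple way to recalibrate weights toward target shares, not necessarily the iterative scheme used in the study.

```python
import random
from collections import Counter

def assign_causes(decedents, weights, rng):
    """Pick one cause per decedent from that person's conditions, proportional to the weights."""
    assigned = []
    for conditions in decedents:
        w = [weights[c] for c in conditions]
        assigned.append(rng.choices(conditions, weights=w, k=1)[0])
    return assigned

def calibrate(decedents, target_share, n_iter=20, seed=0):
    """Iteratively rescale cause weights toward target shares (assumed update rule)."""
    rng = random.Random(seed)
    weights = {c: 1.0 for c in target_share}
    for _ in range(n_iter):
        counts = Counter(assign_causes(decedents, weights, rng))
        for c in weights:
            observed = counts[c] / len(decedents)
            if observed > 0:
                weights[c] *= target_share[c] / observed
    return weights, assign_causes(decedents, weights, rng)

# Hypothetical decedents with accumulated conditions, and hypothetical target shares
decedents = [
    ["heart", "diabetes"], ["cancer"], ["heart", "cancer", "stroke"],
    ["stroke", "diabetes"], ["heart"], ["cancer", "diabetes"],
] * 50
target_share = {"heart": 0.4, "cancer": 0.3, "stroke": 0.2, "diabetes": 0.1}

weights, assigned = calibrate(decedents, target_share)
print(Counter(assigned))
```

Each assigned cause is always drawn from the conditions recorded for that individual, so targets that are infeasible given the recorded conditions can only be approached, not matched.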

3.2. Results

Time patterns of LE65 for leading and lagging states and for the White and Black populations, as well as the related differences, are shown in Figure 1. Both the magnitude of LE65 and the shapes of the time patterns observed in the MCD and Medicare data were similar. Geographic disparities were stronger in their impact on LE65 than race-related disparities over the entire study period. In the 5%-Medicare data, a slow increase in both geographic and race-related disparities can be seen between 2000 and 2005, after which racial disparities decreased sharply until about 2015 while geographic disparities continued to increase, but at a much higher rate. The MCD data suggest that the decline in racial disparities occurred much sooner than observed in the Medicare data. In general, the MCD data showed much sharper increases in geographic disparities and decreases in racial disparities than the 5%-Medicare data. Even so, both datasets show 2005–2015 as a period of active change.

Figure 1.


Life expectancy at age 65 (left plot) and subpopulation-specific disparities (right plot). State disparities reflect leading states (Hawaii, Florida, Arizona, California, New York, Connecticut, Minnesota, and Colorado) vs. lagging states (Arkansas, Tennessee, Louisiana, Oklahoma, Kentucky, Alabama, Mississippi, and West Virginia). Race disparities reflect White vs. Black. B-splines with two internal nodes in 2005 and 2011 were used to smooth the curves.

The full set of absolute and relative cause-specific contributions to the differences in life expectancies are shown in Supplementary Tables 1-3. Time trends of the cause-specific contributions are shown in Supplementary Figures 1 and 2. The causes with the largest contributions are shown in Figure 2. A cause was included in Figure 2 if one of its contributions exceeded one month of LE65. The life-table-based cause-specific decomposition was obtained by applying eqn. (8) to the MCD (open circles; all colors), and Medicare-based estimates were obtained on a day-by-day basis using the product-limit form of Pollard’s decomposition using eqn. (9) (closed circles; all colors). Cause-specific contributions for racial (red) and geographic (blue) disparities, as well as the trend of the effect of this condition over time (green) are also shown. In sensitivity analysis, we also obtained Medicare-based estimates on a grouped data basis using the life-table form of Pollard’s decomposition (Supplementary Tables 1-3). The differences between the two Medicare-based estimates were minor; this gives us confidence in the stability of the initial results and allowed us to conclude that the results differed somewhat due to differences in how the respective causes of death were assigned, and possible contributions of individuals without Medicare records available. Standard errors and confidence intervals were evaluated using the life-table approach with simulations based on the multinomial distribution. The last line in Figure 2 presents the contributions to the disparities from the group of individuals without identified Medicare diagnoses for the 37 disease groups included in this study; therefore, a cause of death cannot be assigned for these individuals. 
Jointly, this group accounts for 10.0% of total available deaths and is not homogeneous; it includes four subgroups: i) individuals on Medicare Part C plans that do not contribute claims data to the database (38.9%), ii) individuals eligible for Medicare services but residing outside the U.S. (2.2%), iii) individuals without Medicare service records (38.5%), and iv) individuals with Medicare service records but without diagnoses identified in our algorithms (20.5%). The first two subgroups represent individuals with incomplete data due to a known reason (entry, exit, and duration in this stage are known and, in some cases, partial health records are available); the third represents individuals not utilizing the Medicare system for unknown reasons; and the last represents relatively healthy people. All subgroups contribute to negative geographic disparities (i.e., fractions of membership in leading states for the group shown in Figure 2 are much higher than the fractions in the total sample), and the last two subgroups contribute to the positive racial disparities and the increased time trend in Figure 2. Specifically, the fractions of individuals without a diagnosis are higher for leading states (85.8%) compared to the total group (72.9%) and for the Black subpopulation (10.3% vs. 8.2%). These estimates reflect the observed disparities for the group of individuals without identified Medicare diagnoses in Figure 2.

Figure 2.


The cause-specific contributions to the differences in life expectancy at 65 given by eqns. (8) and (9). Estimates are shown for racial (red), geographic (blue), and time-related (green) disparities using Medicare (closed dots) and Multiple Cause of Death data (open dots). The last line shows disparities for the group of individuals without established diagnoses in Medicare records (the blue point that shows the geographic disparity has to be multiplied by 3 to obtain actual value).

The greatest contributions to racial disparities were observed for arterial hypertension, diabetes mellitus, cerebrovascular diseases, and renal disease. While the majority of causes contributing to LE65 disparities were those with higher rates among the Black population, there were also several causes with substantial contributions to the gap in LE65 whose rates were higher among Whites: chronic respiratory diseases had the greatest contribution in the MCD data and slow progressive cancers in the Medicare data followed by the contribution of Parkinson’s disease in both datasets. The largest relative contributions to geographic disparities in LE65 were from chronic lower respiratory diseases, lung cancer, and circulatory diseases including myocardial infarction, heart failure, and stroke. Chronic ischemic disease had large contributions in both MCD and Medicare data but these contributions were in opposite directions. Circulatory diseases and lung cancer were among the leading causes contributing to increases in LE65 with time. Dementia and Alzheimer’s disease as well as renal disease (in Medicare) had the greatest negative contributions, suppressing LE65 with time.

4. Discussion

We extended Pollard’s decomposition for application to administrative and registry data. We demonstrated that if the continuous-time cause-specific survival functions are estimated using the Kaplan-Meier product-limit estimator, then Pollard’s integral for the difference of life expectancies between two subpopulations can be solved exactly. The analytic formulae for the difference can be specified for administrative data, and then extended to discrete-time registry data. The formulae do not require numerical integration; no parametric assumptions have to be made for the cause-specific hazards and other functions in Pollard’s integrand. An important property of the solutions for Pollard’s integral for registry and administrative data is that the sums of cause-specific contributions in expressions (8) and (9) are exactly 100% of the respective differences in life expectancy. This means that biases of technical or methodological origin do not impact the estimates, or that their contributions mutually cancel out. The new methodology was applied to an exhaustive set of causes of death observed in Multiple Causes of Death and Medicare data, and a series of new results on the racial and geographic disparities as well as their cause-of-death decompositions was presented and discussed.

4.1. Methodologic development: Solving Pollard’s decomposition

Pollard’s decomposition for specific causes of death is given by eqn. (6). To calculate the integral, we need to calculate each function in the integrand at each exact age at death, a. Practical applications generally require estimating the required functions from an administrative dataset and, possibly, numerical evaluation of the integral. Medicare data (as well as any other administrative dataset) provide individual measurements including age/time of initial follow-up, age/time of the end of follow-up, and the censoring/death (with a cause) indicator at the end of follow-up. The survival functions can be calculated from such data using the Kaplan-Meier estimator (or its generalization for left-truncated data). To calculate the life expectancy at exact age a, the survival function can be integrated from a to infinity. This can be done using a numerical method with specific assumptions concerning the behavior of the survival function after the last observed age at death. The race-specific and cause-specific hazard functions can be evaluated using parametric functions or non-parametric procedures that yield numerical values at each exact age of death and handle multiple deaths with different causes occurring on the same day (usually the smallest time-interval in administrative data). Thus, Pollard’s integral can be numerically calculated under various simplifying assumptions using administrative data with exact ages at death.

Our approach to evaluation for such data of all functions contributing to the integral in eqn. (6) used Kaplan-Meier product-limit estimators for the total and cause-specific survival functions. The survival functions estimated using the Kaplan-Meier approach are discontinuous and the respective instantaneous hazards are equal to zero everywhere except for exact ages at death where the hazards go to infinity. Numerical evaluation of Pollard's integral using such hazard functions is not possible. Instead, we performed the integration analytically, obtaining expressions for the cumulative hazard functions as logarithms of the ratios of the associated survival functions just before and just after each exact age of death. The exact solution for the integral in eqn. (6) is shown in eqn. (9). A remarkable property of this solution is that Pollard’s decomposition is represented as a sum of terms obtained from the Kaplan-Meier estimators for the total and cause-specific survival functions, without numerical integration. Moreover, because the Kaplan-Meier estimator is a maximum likelihood estimator, it follows from Zehna38 that the solution for Pollard’s decomposition in eqn. (9) is likewise a maximum likelihood estimator.

Having solved Pollard’s decomposition for exact ages of death, we then considered the corresponding problem for complete life tables using population/registry data with death recorded using single-year age at last birthday. The expressions for complete life tables were obtained by integration of the expressions for administrative data over each age interval of the life table, using the procedure described in subsection 2.3.2. The expressions in eqns. (8) and (29) reflect the close-out of the life tables at age 100; this can be increased to age 110 or 120 if the life table data are available. The close-out age should be high enough that the impact of the close-out term is negligible. Thus, Pollard’s decomposition can be represented as a sum of terms obtained from the respective life tables for the total and cause-specific survival functions, without numerical integration. Also, given that the life table survival function is a maximum likelihood estimator29, it follows from Zehna38 that the resulting solution for Pollard’s decomposition in eqn. (8) is likewise a maximum likelihood estimator.

Thus, we developed a consistent methodology for solving Pollard’s decomposition for exact ages of death and for grouped ages of death, which are exact in the sense that they do not require numerical integration or parametric models for the cause-specific hazards and other functions in Pollard’s integrand. In addition, methods for generating reliable estimates of standard errors and associated confidence intervals for the cause-specific contributions were also provided. The solutions for Pollard’s decomposition were applied to two types of data that are often used in biodemography for LE evaluation at the national level: i) Medicare administrative claims data, and ii) MCD data with population counts. Each dataset must be analytically extended to be used in these analyses. Population counts for MCD data end at age 84, with the 85+ population represented by an aggregate group. Since this is too imprecise for our study, we applied procedures to distribute population counts aggregated at age 85+ into single-year age groups for ages 85-99 and a single aggregated group for ages 100+. These distributions had to be calculated for each year, had to be race-specific for analyses of racial disparities, and had to be state-specific for analyses of geographic disparities. Medicare data do not contain entries for cause of death. Therefore, we developed a Monte Carlo procedure to assign underlying causes of death for all decedents. The procedure provides equal cause-specific death rates in the 5%-Medicare and MCD data.

The final solutions (8) and (9) for Pollard’s decomposition of LE65 for MCD and Medicare data, respectively, possess the main properties of a decomposition: exact normalization and additivity. The cause-specific contributions to the decomposition related to different causes of death are appropriately normalized, i.e., the sum of the cause-specific relative contributions for each decomposition computed using eqn. (8) and (9) is exactly 1, or 100% of the difference in life expectancy. This property was proved analytically for both types of datasets and was used for additional cross-checks of our analysis software: in our SAS code these sums were exactly 100% for each year of our analysis. Importantly, as long as this property holds, any contribution obtained through Pollard’s integral is directly interpretable regardless of size, i.e., researchers can be equally confident in both large and small cause-specific contributions. The additive property of Pollard’s decomposition holds exactly in our final expressions; even after the multiple operations necessary to apply Pollard’s integral to the two types of data used in this study. For example, this allows us to sum all cancer-related or circulatory-disease-related contributions to obtain the decomposition involving the all-cancer or all circulatory disease groups. The exact normalization will hold in this case. We note that our main decomposition presented by formulae (8) and (9) is valid even in the case when the difference in life expectancies at age 65 is exactly zero. In this purely theoretical scenario, the absolute decomposition shows how effects of causes with positive contributions are exactly compensated by causes with negative contributions. Although relative differences are undefined, the sum of all cause-specific contributions gives zero years exactly.
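The normalization and additivity properties described above can be illustrated with a minimal Python sketch. The cause names, the contribution values, and the grouping are hypothetical placeholders chosen for illustration; they are not estimates from this study, and the helper names `relative_contributions` and `aggregate` are assumptions of the sketch.

```python
import math

# Hypothetical absolute contributions (years) to an LE65 gap; not from the paper.
contributions = {
    "lung cancer": 0.24,
    "colorectal cancer": 0.06,
    "acute ischemic heart disease": 0.21,
    "heart failure": 0.22,
    "COPD": 0.29,
    "chronic ischemic heart disease": -0.12,
}

# Hypothetical mapping of individual causes to broader groups.
groups = {
    "lung cancer": "all cancers",
    "colorectal cancer": "all cancers",
    "acute ischemic heart disease": "circulatory diseases",
    "heart failure": "circulatory diseases",
    "chronic ischemic heart disease": "circulatory diseases",
}


def relative_contributions(absolute):
    """Divide each absolute contribution by the total gap (undefined for a zero gap)."""
    total = sum(absolute.values())
    if total == 0:
        raise ValueError("relative contributions are undefined when the gap is zero")
    return {cause: value / total for cause, value in absolute.items()}


def aggregate(absolute, mapping):
    """Sum contributions within groups; causes without a group keep their own name."""
    grouped = {}
    for cause, value in absolute.items():
        key = mapping.get(cause, cause)
        grouped[key] = grouped.get(key, 0.0) + value
    return grouped


relative = relative_contributions(contributions)
grouped = aggregate(contributions, groups)
```

Because the contributions are additive, grouping changes neither the total gap nor the normalization, and the zero-gap case is handled by reporting absolute contributions only.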

This study opens potential new avenues for research on the origin of substantial geographic and racial disparities in LE in older U.S. adults. For example, each type of disparity could be expressed using health measures recorded in Medicare data to further identify contributing factors behind the observed time trends and observed disparities. After ranking all causes according to their effects on LE in selected geographic regions, it then becomes feasible to apply partitioning analyses39-44 to the causes that contribute most to the disparities to assess the extent to which their contributions reflect higher incidence vs. poorer survival. Such analyses are feasible with Medicare data because the longitudinal sub-files for each enrollee beyond age 65 record all covered services before and after onset of each chronic disease allowing measures of incidence, prevalence, and case fatality to be generated and analyzed. Moreover, regression-based approaches, e.g., Oaxaca-Blinder decomposition extended for censored data45-47, can be used to help understand the role of clinical factors (e.g., screening, effective diagnostic procedures, treatment choice, and treatment adherence), demographic and socioeconomic characteristics, behavioral risks, environmental exposures, as well as access to and quality of available medical care. Ultimately, identification and evaluation of such contributing factors may reveal a role as fundamental drivers of the observed disparities, which could inform the design of strategies to improve public health and the healthcare system48.

4.2. Pollard’s decomposition and other methods of cause-specific decomposition of life expectancies.

Two distinct demographic approaches are commonly used to evaluate the cause-specific contributions to the difference in life expectancies for two well-defined subpopulations12. The first considers each subpopulation separately and estimates the gains in life-expectancy that would occur under the assumption that the causes of death are independent and that the deaths from each cause can be eliminated one at a time, in turn, with the deaths from all other causes retained, i.e., the force of mortality for each selected cause is set to zero for all ages with the forces of mortality for all other causes held unchanged from their initial values49. The cause-elimination life expectancy gain for cause i can be computed using Greville’s method23 as shown in eqn. (2). The difference between the cause-elimination life expectancy gains for cause i in the two subpopulations can then be treated as a measure of the contribution of cause i to the overall life expectancy difference between the two subpopulations. Unfortunately, the contributions are not additive; additional procedures are needed to ensure that the sum of the relative contributions over all causes equals 1. Beltrán-Sánchez et al.12 developed an alternative decomposition of LE differences based on cause-elimination principles (including the independence assumption) that does not make direct use of the cause-elimination life expectancy gain for cause i in either subpopulation; instead, the underlying cause-specific survival functions in the two subpopulations are used to estimate cause-specific contributions for cause i that are approximately additive for the overall life expectancy difference between the two subpopulations.
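The cause-elimination idea can be sketched in Python as follows. This is not Greville's method: the sketch assumes piecewise-constant hazards over single-year age intervals, independent causes, and a life table truncated at age 110, and all hazard values are hypothetical Gompertz-like curves rather than estimates from the data.

```python
import math

AGES = range(65, 110)  # single-year intervals; truncation at 110 is an assumption


def gompertz(level, slope):
    """Hypothetical cause-specific hazard rising exponentially with age."""
    return [level * math.exp(slope * (age - 65)) for age in AGES]


def life_expectancy(hazards):
    """Life expectancy at the first age for piecewise-constant total hazards."""
    surv, years = 1.0, 0.0
    for m in hazards:
        # person-years lived in a one-year interval with constant hazard m
        years += surv * ((1.0 - math.exp(-m)) / m if m > 0 else 1.0)
        surv *= math.exp(-m)
    return years


def total_hazard(cause_hazards):
    return [sum(values) for values in zip(*cause_hazards.values())]


def cause_elimination_gain(cause_hazards, cause):
    """Gain in life expectancy when one cause's hazard is set to zero at all ages."""
    total = total_hazard(cause_hazards)
    without = [t - m for t, m in zip(total, cause_hazards[cause])]
    return life_expectancy(without) - life_expectancy(total)


leading = {"circulatory": gompertz(0.006, 0.10),
           "cancer": gompertz(0.005, 0.06),
           "other": gompertz(0.006, 0.11)}
lagging = {"circulatory": gompertz(0.008, 0.10),
           "cancer": gompertz(0.006, 0.06),
           "other": gompertz(0.007, 0.11)}

gap = life_expectancy(total_hazard(leading)) - life_expectancy(total_hazard(lagging))
gain_differences = {cause: cause_elimination_gain(lagging, cause)
                    - cause_elimination_gain(leading, cause)
                    for cause in leading}
```

Comparing `sum(gain_differences.values())` with `gap` for inputs of this kind shows why an additional normalization step is needed: nothing in the construction forces the two quantities to agree.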

The second approach is based on demographic decomposition theory. Pollard’s decomposition belongs to the second approach. The properties of Pollard’s decomposition were discussed in Section 2.1. Pollard’s decomposition is exact by construction and represents the difference in life expectancies for two subpopulations as the sum of cause-specific components. An important property of Pollard’s decomposition is that the assumption of independence of cause-specific forces of mortality is not required. The two approaches, i.e., Beltrán-Sánchez’s decomposition and Pollard’s decomposition, are not equivalent though the respective results are close12.

Historically, methods for decomposing differences in LE were presented in several mathematically equivalent approaches8,19,50-52, as demonstrated by Pollard9 and Shkolnikov et al.53, although their implementation via various discrete approximations can give rise to different results12. Several other approaches were presented to further analyze the difference in LE between subpopulations in terms of age and cause-of-death contributions.

Horiuchi et al.54 presented a general approach for decomposition of a demographic measure that can be represented as a function of multiple covariates. For example, race-specific LE65 can be considered such a demographic measure, and differences in race-specific LE65s can be represented as functions of age and cause specific mortality rates. Eqn. (5) of ref.54 gives the general decomposition formula for an arbitrary demographic measure, and eqns. (9-10) of ref.54 give the decomposition formula for differences in life expectancy. Two types of numerical integration are required to represent the difference in LE65 in a form comparable to our eqn. (6): i) integration over age groups, and ii) integration over a linear variable that connects pairs of age-specific mortality rates (e.g., two race-specific mortality rates from lung cancer in age group 65–69). The second integration is specific to Horiuchi’s method and can be performed only under the assumption of a linear relationship between specified covariates (e.g., as in eqn. (11) of ref.54). The necessity for this linearity assumption and the two-way numerical integration are limitations of this method that are not present in our approach. Horiuchi reported that an approximation to Pollard’s decomposition (using stepwise replacement55) by broad age groups for Japanese females was equivalent to their method when applied on an annual basis over 54 separate calendar years (Table 3 of ref.54) and was close when applied across the entire 54-year period (Table 4 of ref.54). This equivalence/closeness occurs because our solution of Pollard’s formula in eqn. (8) is an exact solution to Horiuchi’s eqns. (5) and (9)54. Horiuchi’s method has been implemented in DemoDecomp R-package56 (Decompose Demographic Functions) and used in recent analyses57,58.
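The two numerical steps discussed above can be sketched in Python. The sketch follows the general idea of a line-integral decomposition along a straight path between two rate vectors, evaluated with a midpoint rule and finite differences; it is not the DemoDecomp implementation, and the function names, the toy life-expectancy function, and the rates are assumptions made for illustration.

```python
import math


def line_integral_decomposition(func, x_from, x_to, steps=20):
    """Approximate each covariate's effect on func(x_to) - func(x_from).

    The covariates move linearly from x_from to x_to (the linearity assumption);
    the path is cut into `steps` pieces and each piece is handled at its midpoint.
    """
    n = len(x_from)
    delta = [(b - a) / steps for a, b in zip(x_from, x_to)]
    effects = [0.0] * n
    for s in range(steps):
        midpoint = [a + (s + 0.5) * d for a, d in zip(x_from, delta)]
        for i in range(n):
            upper = list(midpoint)
            lower = list(midpoint)
            upper[i] += delta[i] / 2.0
            lower[i] -= delta[i] / 2.0
            effects[i] += func(upper) - func(lower)
    return effects


def life_expectancy(rates, n_ages):
    """Toy measure: rates holds cause-1 rates for all ages, then cause-2 rates."""
    surv, years = 1.0, 0.0
    for k in range(n_ages):
        m = rates[k] + rates[n_ages + k]
        years += surv * ((1.0 - math.exp(-m)) / m if m > 0 else 1.0)
        surv *= math.exp(-m)
    return years


# Hypothetical age- and cause-specific rates for two subpopulations (5 ages x 2 causes).
rates_b = [0.012, 0.014, 0.017, 0.020, 0.024, 0.025, 0.028, 0.032, 0.036, 0.041]
rates_w = [0.010, 0.012, 0.014, 0.016, 0.018, 0.020, 0.023, 0.026, 0.030, 0.034]

effects = line_integral_decomposition(lambda r: life_expectancy(r, 5), rates_b, rates_w)
cause_1_effect = sum(effects[:5])
cause_2_effect = sum(effects[5:])
```

In this sketch the effects sum to the total difference only approximately, with the error shrinking as `steps` grows, which is the contrast with an exact analytic solution drawn in the text.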

Age decomposition in the general replacement method55 was developed based on the idea that the difference between life expectancies can be constructed using a general algorithm that replaces, step by step, the elements of one vector of age-specific mortality rates with the respective elements of another vector, with the replacements ordered by ascending age. The resulting formula (eqn. (3) of ref.55) for age decomposition with predetermined age groups is a particular case of our formula (10), which is valid for continuous ages. The cause-of-death decomposition in our approach is then developed based on eqn. (10) using exact methods of calculus for continuous age. Depending on the dataset type, the resulting formulae for continuous age can be reduced to age-interval-specific formulae: one day for administrative data and one year for registry data. In contrast, incorporation of cause-of-death decomposition in the replacement approach requires another set of replacements of age-specific rates with vectors of age- and cause-specific rates. The final effect for a specific cause of death is computed by averaging over all possible combinations of replacements using all other causes.59
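The stepwise replacement idea can be sketched in a few lines of Python. The function name, the optional `order` argument, and the toy measure are assumptions of this sketch; the published algorithm includes further details (for example, the averaging over orderings needed for causes of death) that are not reproduced here.

```python
def stepwise_replacement(func, rates_from, rates_to, order=None):
    """Replace elements of rates_from by those of rates_to one at a time.

    The effect assigned to element i is the change in func produced by its
    replacement; by default the replacements run in ascending index (age) order.
    """
    order = list(range(len(rates_from))) if order is None else list(order)
    current = list(rates_from)
    effects = [0.0] * len(current)
    previous = func(current)
    for i in order:
        current[i] = rates_to[i]
        value = func(current)
        effects[i] = value - previous
        previous = value
    return effects


def toy_measure(v):
    """A deliberately nonlinear toy measure (product of two elements)."""
    return v[0] * v[1]


ascending = stepwise_replacement(toy_measure, [1.0, 2.0], [3.0, 5.0])
descending = stepwise_replacement(toy_measure, [1.0, 2.0], [3.0, 5.0], order=[1, 0])
```

The effects telescope, so their sum equals the total difference for any ordering, but the split between elements depends on the ordering. This is why extending the approach to causes of death requires averaging over combinations of replacements.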

Another common approach59-63 is Arriaga’s decomposition19. Pollard showed9 that the age decomposition presented by Arriaga18 was mathematically equivalent to the "1–2" form of his decomposition. An important difference between our approach and Arriaga's, however, is that our cause-specific decomposition was theoretically derived; Arriaga's formula was an assumed allocation formula that happens to be consistent with Pollard's result. One limitation of Arriaga cause-of-death decomposition is that the calculations may become unstable if the denominator (the difference between the age-specific mortality rates in the two subpopulations being compared) is small relative to the corresponding cause-specific differences. This could happen, for example, if sex differences at some ages were small except for breast and prostate cancers, which could be large but in opposite directions. Our discrete time formula (8) is equivalent to Arriaga's discrete time formula, but: i) ours is easier to interpret given that it can be matched to formula (9) that covers the Kaplan-Meier estimator; and ii) ours does not become unstable for small, or zero, differences between the age-specific mortality rates in the two subpopulations being compared.

A complementary indicator to cause-elimination gains in life expectancy is the average years of life lost prior to some chosen threshold age, which can be evaluated for the age interval between birth (or other fixed age) and any selected upper age limit (e.g., ages 85 or 95). Using this strategy, Andersen et al.64 presented a new approach for defining and evaluating life years lost by cause of death, and generating cause-specific decompositions of the differences in restricted (or “temporary”) life expectancy between two subpopulations, where the restricted life expectancy is $\int_0^x \ell(a)\,da$, assuming $\ell(0)=1$; the higher the value of $x$, the closer the restricted life expectancy will be to the unrestricted life expectancy. The difference between the restricted life expectancies in any two subpopulations is exactly equal to the difference in their respective life years lost (see page 1135 of ref.64), and this difference can be readily decomposed by cause and age. The additivity of the cause-specific forces of mortality implies that the decomposition can be done without assuming that the causes of death are independent (see Gail25 eqn. (4)). The need to choose an upper age limit and use numerical integration are the main limitations of this approach compared to the approach developed in this paper.
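The relationship between restricted life expectancy and cause-specific life years lost can be sketched in Python. The sketch assumes piecewise-constant cause-specific hazards over single-year intervals counted from the baseline age, which allows the integrals within each year to be written in closed form; the function name and the hazard values are hypothetical, and this is not the estimator used by Andersen et al.

```python
import math


def restricted_le_and_years_lost(cause_hazards, horizon):
    """Return the restricted life expectancy over `horizon` years and life years lost by cause.

    cause_hazards maps each cause to a list of annual hazards (constant within a year).
    """
    causes = list(cause_hazards)
    surv = 1.0                               # survival at the start of the year
    cum_inc = dict.fromkeys(causes, 0.0)     # cumulative incidence by cause
    lost = dict.fromkeys(causes, 0.0)        # integral of cumulative incidence
    restricted_le = 0.0
    for k in range(horizon):
        total = sum(cause_hazards[c][k] for c in causes)
        died = surv * (1.0 - math.exp(-total))
        person_years = died / total if total > 0 else surv
        restricted_le += person_years
        for c in causes:
            share = cause_hazards[c][k] / total if total > 0 else 0.0
            # integral over the year of the cause-specific cumulative incidence
            lost[c] += cum_inc[c] + share * (surv - person_years)
            cum_inc[c] += share * died
        surv -= died
    return restricted_le, lost


# Hypothetical hazards for 20 years after the baseline age (e.g., ages 65 to 85).
hazards = {
    "circulatory": [0.008 * math.exp(0.10 * k) for k in range(20)],
    "cancer": [0.006 * math.exp(0.06 * k) for k in range(20)],
    "other": [0.007 * math.exp(0.11 * k) for k in range(20)],
}
restricted_le, years_lost = restricted_le_and_years_lost(hazards, horizon=20)
```

In this construction the cause-specific life years lost add up to the horizon minus the restricted life expectancy, so a difference in restricted life expectancy between two subpopulations can be split by cause without assuming independent causes.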

A new approach for decomposition of the young adult mortality hump by cause of death was presented in Remund et al.65. Under this approach, a non-parametric spline-based model was used to parameterize the age-dependence of the hazard functions. The approach allowed the separation of the trends in the hump from background mortality, yielding multiple insights into the cause-specific structure of the hump in the U.S. The spline-based approach for non-parametric evaluation of the cause-specific hazards could be considered for evaluating Pollard’s integral; however, the error-accumulation from model fitting and the necessity for numerical integration limit the applicability of such an approach when precise estimates are needed. In any case, the property that the sum of all relative cause-specific contributions to the decomposition is exactly 100% of the difference in life expectancy does not hold.

4.3. Differences in time trends of LE65 and its racial and geographic disparities

We found that LE65 in the U.S. increased during the study period for all subpopulations (Figure 1). The increase was more rapid for leading states and for the Black population. The shapes of the time patterns were similar for all study populations as well as for results obtained using MCD and Medicare data. The difference in estimates for LE65 in 2005 and 2015 did not exceed 1% with the exceptions of Blacks (2.12%) and leading states (2.52%) in 2015, for which LE65 was lower in the Medicare data, as well as lagging states (1.63%) in 2005, for which LE65 was lower in the MCD data. A tendency for stagnation in LE65 in later years was seen in the Medicare data. However, even though LE65 increased for all subgroups, notable racial and geographic disparities remained. The geographic disparity in LE65 between leading and lagging states was 2.75 years (2.32 years based on Medicare data) in 2015; this was a 29% increase from its 2005 value of 2.13 years (1.83 based on Medicare data). Over the same time-period the Black/White racial disparity in LE65 decreased by 41% from 1.55 years (1.68 based on Medicare data) in 2005 to 0.91 years (1.33 based on Medicare data) in 2015.

We found that the dynamics of racial and geographic disparities in LE65 demonstrated opposing tendencies. The loss of LE65 associated with racial disparities narrowed between the years 2000 and 2017, predominantly due to increasing LE65 in the Black population. This is consistent with estimates of change in race-specific disparities in life expectancy at birth recently reported by the Global Burden of Diseases Collaboration66. Over the same time-period the size of the geographic disparity in LE65 widened, predominantly due to more rapid improvements of LE65 in leading states. Although the race-specific differences in age-adjusted mortality between 2005 and 2015 looked similar in leading and lagging states (913/1077/983 per 100,000 in leading/lagging/all states for the Black population and 559/486/497 for the White population), there were three types of contributions to the time trend in the geographic disparities. A lower difference in mortality for the White population in lagging states provided the main contribution to the overall increase in geographic disparities. The stronger declines in mortality in the Black population of lagging states partly offset the overall widening of disparities in LE65 between leading and lagging states. Although the decline in mortality was much larger for the Black population, the contribution of the Black population to the trend in geographic disparities was lower due to the relative size of the Black population in these states (8.5% in leading states and 14.5% in lagging states in 2010). One more contribution to the geographic disparities came from changes in the fraction of the Black population, especially in leading states, e.g., an increase of the fraction from 8.0% in 2005 to 9.0% in 2015 in leading states. Our estimates showed that the effect of such changes on the geographic disparity trend was as large as 7%.

It was reported67 that the Black-White gap in all-cause mortality (i.e., death rate disparity relative to Whites) narrowed from 33% in 1999 to 16% in 2015. Over a similar time period our findings show a decrease in Black-White disparities in LE65 from 1.70 years in 2000 to 0.83 years in 2015. Furthermore, it was reported that the non-Hispanic Black population in 2000–2014 had the greatest increase in LE at birth. Our findings also show that LE65 for Blacks increased sharply over the 2000–2014 period. The magnitude of this growth was much higher than that observed in White individuals, which is consistent with a previous finding that non-Hispanic Whites had the lowest increase in LE among all race/ethnicity-related groups68,69. However, at ages < 65, Blacks continue to have higher mortality than Whites67, and 60% of the decline in the racial gap in LE at birth in 2000–2010 has been attributed to declining differences in the age at which Black and White individuals die of chronic diseases, with the remaining 40% of the reduction in the LE gap associated with changes in the distribution of causes of death among Blacks and Whites at younger ages (e.g., due to greater declines in HIV/AIDS and perinatal deaths in Blacks, and increasing death rates caused by accidental poisoning, mostly drug related, among Whites)69-71. A recent study of Medicare beneficiaries aged 65+ supports this rationale by showing that the average decline in LE65 with an additional comorbid condition was higher in the Black population (1.8 years per additional condition) than in the White population (1.7 years), with the Black population being more likely to have 7+ chronic conditions72.

Widening disparities in state-specific LE65 across the U.S. have been observed for midlife mortality since the early 1980s73. More recently, persistent geographic inequalities in the risk of death at ages 65–85 have received increasing attention74,75 with suggestions that the increases in geographic inequality in LE65 in the U.S. over the past three decades have been driven largely by unfavorable dynamics of mortality at older ages74. Furthermore, geographic variation in the disparities in LE between Black and White Americans has been documented, suggesting that racial and geographic disparities are intertwined76; indeed, there are indications that the narrower racial disparities in LE in certain U.S. states were not due to the Black population living longer but rather to the White population having a shorter LE compared to the national average77. Our findings confirmed these tendencies, finding that over time racial disparities decreased but geographic disparities increased. These results were replicable across two distinct data types: Medicare and MCD.

4.4. Causes contributing to racial and geographic disparities in LE65 and the time trend in LE65

Our study showed that trends in the cause-related determinants of racial disparities were characterized by a relatively small number of high-impact contributors, with some acting through increased mortality rates in the White population. Lower LE65 in the Black population was associated with death from arterial hypertension (24.4%/9.1% of total 2015 disparity in LE65 using MCD and Medicare data, respectively), cerebrovascular disease (17.0%/11.0%), diabetes mellitus (21.3%/14.5%), renal disease (16.0%/10.5%), prostate cancer (9.9%/5.9%), and the group of other circulatory diseases (8.0%/7.5%). Disease-specific contributions have not previously been identified for LE65; however, some of these causes have been reported as important determinants of racial disparities in LE at birth. For example, heart disease was found to be a leading contributor that accounted for roughly 25% of the racial gap70, with cardiovascular disease in general accounting for 34%48 of the racial difference in all-cause mortality. COPD was the only major contributor to Black/White disparities in LE65 that acted through higher mortality rates in the White population48,78. We note that an important contribution of the Firebaugh study70 (which applied the approach of Beltrán-Sánchez et al.12 to the decomposition of MCD data in order to evaluate the role of age-at-death and incidence of 19 causes in forming the gap in overall LE between Blacks and Whites) was the finding that the decline in Black/White disparities in overall LE was attributable to a reduction in the contribution of age-at-death, largely due to the declining differences in the ages at which Blacks and Whites died of chronic diseases.

In our study, trends in cause-related determinants of geographic disparities were characterized by a relatively large number of causes with relatively small individual impacts, all acting through increased mortality rates in lagging states. The largest contributions to geographic disparities in LE65 were associated with CVD, including acute ischemic heart diseases (7.8%/7.9%), the other circulatory diseases group (8.4%/7.4%), and heart failure (8.1%/3.5%), as well as COPD (10.7%/9.8%), lung cancer (8.7%/9.9%), and Alzheimer’s disease (7.1%/5.3%). The contributions of COPD, Alzheimer’s disease, and arterial hypertension increased over time. Although not directly comparable due to differences in study design, our results are concordant with several existing studies focused on geographic disparities in cause-specific mortality and LE. For example, MCD-based studies have shown that CVD and smoking-associated diseases were the principal drivers of geographic variations in LE65, accounting for over 64% of the geographic disparities in changes in LE65 from 2000 to 2016 in the 65+ population78,79. This effect persisted despite overall declines in CVD mortality in all states78. These studies also stressed improvement in smoking-related mortality (lung cancer and respiratory diseases) as being second (following CVD) in contributions to LE65 gain in the U.S.78,79. Finally, rising mortality from Alzheimer’s disease and other diseases of the mental or nervous systems, including non-Alzheimer’s dementia, has been identified as a negative contributor to changes in LE65 among older U.S. adults79.

In our study the shapes of the time patterns in LE65 were similar for all subpopulations: LE65 increased over the entire study time-period, with the speed of the gains decelerating after 2010. The increase in LE65 between 2005 and 2015 was more than one year for all subpopulations, and Pollard’s approach was then used to decompose this gain into cause-specific contributions for each of the four study groups. Our results showed that the increase in LE65 between 2005 and 2015 was primarily due to reduced mortality from chronic ischemic disease, acute ischemic disease, lung cancer, and cerebrovascular disease (Table 2). This was offset, in part, by increased mortality from Alzheimer’s disease and dementia. For most causes, the relative contributions to temporal reductions in mortality in Blacks were less than in Whites. We note that approximately 65% of the total gains in LE65 occurred in 2005–2010 and only 35% in 2010–2015, which was consistent with other research80,81.

Our results are consistent with a comprehensive report on the burden of disease and its patterns in the U.S. states from 1990 to 2016, which also identified wide disparities in the burden of disease at the state level82. While the U.S. overall is experiencing improvements in health outcomes, the patterns of health burden at the state level vary substantially across geography, and monitoring location-specific disease trends is essential given the geographic differences in various aspects of health and social policy (e.g., enrollment in the Medicaid program, the use of private insurance companies)83 and the local socioeconomic environment82. In this study, we identified a number of exact disease-specific causes which, if targeted, can act to mitigate these tendencies. Although geographic differences were aggregated at the state-group level, stronger sources of data can now be combined with the methods developed in this study to obtain results specific to smaller localities. The consistency between Medicare and MCD data demonstrated in this study suggests that high-power Medicare samples can be used for such analysis, though additional linkages to Medicaid claims may be necessary to better reflect socioeconomically disadvantaged individuals, many of whom are also members of other disadvantaged populations.

4.5. Study strengths and limitations

The study has several strengths. Within the same study design, we analyzed the trends of both racial and geographic disparities in LE65 among older U.S. adults over a period spanning 17 years and obtained complementary results from two different large nationally representative datasets. Much of the existing research on disparities in LE is focused on variations in mortality at working ages82,83. However, unlike working-age individuals, in the 65+ age strata morbidity is the most common cause of death79,82, making differences in morbidity and associated adverse health-related outcomes between advantaged and disadvantaged groups an important aspect of disparities in LE65. The analysis of individual contributions of discrete diseases to this problem is sub-optimal due to both increasing levels of multi-morbidity in older adults and the inherent relationships between disease trends such as the recent declines in mortality from circulatory disease and certain cancers (lung, colorectal, breast) being partially offset by increases in mortality from other chronic diseases84. We used Pollard’s decomposition to evaluate the contributions of 37 cause groups to the size and the time trend in both geographic and racial disparities in LE65 in the U.S.

We also acknowledge the following limitations. An underlying cause of death is not provided in Medicare data, and it may be hard to determine accurately in the MCD data, particularly among the elderly, where multiple mortality-related diseases are likely to be present at the time of death. Involving multiple causes of death (MCD data) in future analyses (e.g., through random assignment of a cause of death from the list of multiple causes) could estimate the sensitivity of our estimates. Furthermore, we developed an approach to assign a cause of death in Medicare records, and these analyses can be repeated when linked cause-of-death data become more widely available for Medicare beneficiaries. Population estimates in the MCD could be biased, especially for minority populations, because of census undercounts, age misreporting, and race misclassification. However, because the MCD and 5%-Medicare results were broadly consistent, this limitation may be of lesser concern. Finally, changes in diagnostic and coding practices have likely affected some of the trends that we have identified in cause-specific mortality. For example, coding changes implemented by the National Center for Health Statistics in 2006 produced an increase in deaths from “unspecified dementia” and “vascular dementia”, and this increase occurred at the expense of anemias, nutritional deficiencies, heart disease, and cerebrovascular disease85.

5. Conclusions

Although notable progress has been made in improving the health and well-being of Americans, health inequities between population groups and geographic areas persist representing a major area of policy concern78. Using an integral representation of Pollard’s decomposition8,9 applied to Multiple Cause of Death and Medicare administrative health insurance claims data, this study identified the morbidity-related causes of race and geography-related disparities in LE65; quantified their respective magnitudes to isolate cause-groups with the highest overall impact; and measured the trends in the relative impact of such cause-groups over time. These analyses required methodologic development involving an analytic solution to Pollard’s decomposition to evaluate the cause-specific contributions for administrative and registry data. We demonstrated that if the cause-specific survival functions are estimated using the Kaplan-Meier estimator, then Pollard’s integral for the difference of life expectancies of two subpopulations can be exactly solved. The analytic formulae for the difference were specified for both types of data. They do not require numerical integration; no parametric assumptions for the cause-specific hazards and other functions in Pollard’s integrand have to be made for their derivation. An important property of the expressions (8) and (9) for Pollard’s integral for administrative and registry data is that the sum of all cause-specific contributions is exactly 100% of the difference in life expectancy. This means that biases of technical or methodological origin do not impact our estimates, or that their contributions mutually cancel out. The additivity of the cause-specific contributions arises without the assumption of independence among the selected causes of death. Dependencies between the cause-specific contributions may be due to common risk factors that impact multiple causes.

The developed methodology was applied to an exhaustive set of causes of death observed in Multiple Cause of Death and Medicare data, and a series of new results on the racial and geographic disparities, as well as their cause-of-death decompositions, were obtained and discussed. We found that the temporal improvement in LE65 was more substantial for the Black than the White population and that it was less pronounced in the lagging compared to the leading states. These temporal trends resulted in a narrower LE65 gap between the White and Black populations and a wider gap between the leading and lagging states. Estimates using Medicare data reproduced those obtained using MCD data, though Medicare-based trends were less pronounced. The largest relative contributions to geographic and racial disparities in LE65 were: for geographic disparities, chronic lower respiratory diseases, Alzheimer’s disease, circulatory diseases, lung cancer, and renal failure; for racial disparities, arterial hypertension, diabetes, cerebrovascular diseases, and renal disease. The ranks of the diseases were in agreement between analyses using MCD and Medicare data. Based on the MCD, chronic ischemic diseases for geographic disparities and chronic respiratory diseases for racial disparities were higher in the advantaged groups, thereby reducing LE65 disparities; however, these effects were not replicated in the Medicare data. Overall, the increase in LE65 observed in 1998–2005 and 2010–2017 was primarily due to a reduction in the contributions of acute and chronic ischemic diseases, although this was partially offset by increased contributions of diseases of the nervous system, including dementia and Alzheimer’s disease.

Supplementary Material


Acknowledgements:

This study was supported by the National Institute on Aging (R01-AG066133, RF1-AG046860, R01-AG057801, R01-AG063971) and the Department of Defense (W81XWH-20-1-0253). The sponsors had no role in the design and conduct of this study.

Appendix A

Here we demonstrate that if the cause-specific hazard functions are proportional over the interval $[t_0, t_1]$, with constant of proportionality $r_i(t_0)$ for cause $i$, then the life-table number of deaths from cause $i$, denoted $d_i(t_0)$, is proportional to the life-table number of deaths from all causes, denoted $d(t_0)$, where $d(t_0)=\sum_{i=1}^{I} d_i(t_0)$, with the same constant of proportionality $r_i(t_0)$. The sum over all such intervals of the life-table number of deaths from all causes is scaled to equal $\ell(0)$, which for simplicity is initialized to $\ell(0)=1$.

Because $\mu(t)=\mu_1(t)+\dots+\mu_I(t)$ and $\ell(t)=\ell_1(t)\cdots\ell_I(t)$ (see Gail25 eqns. (4) and (5)), the total density of deaths from all causes is:

$$f(t)=\frac{d}{dt}\bigl(1-\ell_1(t)\cdots\ell_I(t)\bigr)=f_1(t)\,\ell_2(t)\cdots\ell_I(t)+\dots+\ell_1(t)\cdots\ell_{I-1}(t)\,f_I(t).$$

The first term in this sum, $f_1(t)\,\ell_2(t)\cdots\ell_I(t)=f_1(t)\,\ell(t)/\ell_1(t)=\mu_1(t)\,\ell(t)$, is interpreted as the density of deaths from cause 1; more generally, $\mu_i(t)\,\ell(t)$ is the density of deaths from cause $i$. Therefore, the integrals of these cause-specific densities over the interval $[t_0, t_1]$ give the probabilities of death from the specific causes. Hence,

$$d_i(t_0)=\int_{t_0}^{t_1}\mu_i(t)\,\ell(t)\,dt=r_i(t_0)\int_{t_0}^{t_1}\mu(t)\,\ell(t)\,dt=r_i(t_0)\int_{t_0}^{t_1}f(t)\,dt=r_i(t_0)\,d(t_0),$$

as required. It follows that the total and cause-specific life-table numbers of deaths are discrete probability functions.
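The proportionality result of Appendix A can be illustrated numerically. The sketch below is not part of the original analysis: the Gompertz-type baseline hazard, the three cause shares, and the function name `cause_specific_deaths` are hypothetical choices made only for illustration. It integrates $\mu_i(t)\,\ell(t)$ over one interval with a midpoint rule and compares each ratio $d_i(t_0)/d(t_0)$ with the corresponding share $r_i$.

```python
import math

def cause_specific_deaths(hazards, t0, t1, steps=20000):
    """Midpoint-rule estimates of d_i = integral of mu_i(t) * l(t) over [t0, t1].

    Survival is rescaled so that l(t0) = 1, which leaves the ratios d_i / d unchanged.
    """
    h = (t1 - t0) / steps
    deaths = [0.0] * len(hazards)
    cum_hazard = 0.0  # integrated all-cause hazard from t0 to the left edge of the cell
    for n in range(steps):
        t_mid = t0 + (n + 0.5) * h
        mu = [hazard(t_mid) for hazard in hazards]
        total = sum(mu)
        surv_mid = math.exp(-(cum_hazard + 0.5 * h * total))
        for i, mu_i in enumerate(mu):
            deaths[i] += mu_i * surv_mid * h
        cum_hazard += h * total
    return deaths

def baseline(t):
    # Hypothetical all-cause hazard (assumption, not taken from the paper).
    return 0.02 * math.exp(0.09 * (t - 65.0))

shares = [0.5, 0.3, 0.2]  # hypothetical constants of proportionality r_i, summing to 1
hazards = [lambda t, r=r: r * baseline(t) for r in shares]

d_i = cause_specific_deaths(hazards, 65.0, 66.0)
d = sum(d_i)
ratios = [x / d for x in d_i]
print(ratios)  # each ratio should agree with the matching entry of `shares`
```

Because the shares sum to one, the all-cause hazard equals the baseline, so the ratios depend only on the shares and not on the shape of the baseline hazard.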

Appendix B

Integrating between two integer ages $x$ and $x+1$, starting from (10) and using the properties of the integral and the approach that led to (11), allows us to rewrite (10) as

$$e_W(65)-e_B(65)=\frac{1}{2}\sum_{x=x_0}^{\infty}\bigl(F(x)-F(x+1)\bigr).$$

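Consider one time interval $[x, x+1]$. We need to sum over all deaths that occurred in this interval. The contribution of a single death is given by (17), so:

$$F(x)-F(x+1)=\sum_{k:\,x\le a_k<x+1}\bigl(F(a_k)-F(a_k^+)\bigr)=\sum_k\bigl(e^{\hat\mu_B(a_k)}-e^{\hat\mu_W(a_k)}\bigr)\bigl(\ell_W(a_k^+)\,e_B(a_k)+\ell_B(a_k^+)\,e_W(a_k)\bigr).$$

We need to prove that

$$F(x)-F(x+1)=\left(\frac{l_x^B}{l_{x+1}^B}-\frac{l_x^W}{l_{x+1}^W}\right)\Bigl(l_{x+1}^W\bigl(e_x^B-(\hat a_x-x)\bigr)+l_{x+1}^B\bigl(e_x^W-(\hat a_x-x)\bigr)\Bigr).\qquad(30)$$

To do this we use eqn. (15) in the form $\ell_j(a_k)=s_k^j\,\ell_j(a_k^+)$ and $e_j(a_k^+)=s_k^j\,e_j(a_k)$, where we define $s_k^j=\exp\bigl(\hat\mu_j(a_k)\bigr)$ to simplify the formulae. Furthermore, we use the relations between survival functions and life expectancies at age points with no deaths between them: $\ell_j(a_k^+)=\ell_j(a_{k+1})$ and $e_j(a_{k-1}^+)=e_j(a_k)+(a_k-a_{k-1})$. These equations are sufficient to express all functions through the life expectancies at $x$ and the survival functions at $x+1$.

$$\begin{aligned}F(x)-F(x+1)&=\sum_k\Bigl(\ell_B(a_k)\,e_W(a_k)-\ell_W(a_k)\,e_B(a_k)-\ell_B(a_k^+)\,e_W(a_k^+)+\ell_W(a_k^+)\,e_B(a_k^+)\Bigr)\\&=\sum_k\Bigl(s_k^B\,\ell_B(a_k^+)\,e_W(a_k)-s_k^W\,\ell_W(a_k^+)\,e_B(a_k)-s_k^W\,\ell_B(a_k^+)\,e_W(a_k)+s_k^B\,\ell_W(a_k^+)\,e_B(a_k)\Bigr)\\&=\sum_k\bigl(s_k^B-s_k^W\bigr)\Bigl(\ell_B(a_k^+)\,e_W(a_k)+\ell_W(a_k^+)\,e_B(a_k)\Bigr).\end{aligned}$$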
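The link between the LE65 gap and the jumps of $F$ can be checked on a toy example. The sketch below rests on several assumptions that are not taken from the paper: $F(a)=\ell_B(a)\,e_W(a)-\ell_W(a)\,e_B(a)$ as read from the expansion above, survival functions that are pure step functions with $\ell(65)=1$, life expectancy truncated at a hypothetical maximum age `omega`, and made-up death ages and discrete hazards. The helper name `step_life_table` is likewise hypothetical.

```python
import math

def step_life_table(ages, mu_hat, start, omega):
    """Step-function life table with l(start) = 1 and life expectancy truncated at omega.

    Returns l just after each death age, e just before each death age, and e(start).
    """
    l_before, l_after, l = [], [], 1.0
    for mu in mu_hat:
        l_before.append(l)
        l *= math.exp(-mu)            # l(a_k+) = l(a_k) * exp(-mu_hat(a_k))
        l_after.append(l)
    next_age = list(ages[1:]) + [omega]
    e_before, years = [0.0] * len(ages), 0.0
    for k in range(len(ages) - 1, -1, -1):
        years += l_after[k] * (next_age[k] - ages[k])   # person-years lived after a_k
        e_before[k] = years / l_before[k]
    e_start = (ages[0] - start) + years   # l = 1 until the first death age
    return l_after, e_before, e_start

ages = [66.0, 70.5, 75.25, 81.0, 88.5]    # hypothetical death ages
mu_b = [0.10, 0.00, 0.35, 0.20, 0.90]     # hypothetical discrete hazards, Black population
mu_w = [0.00, 0.15, 0.10, 0.40, 0.60]     # hypothetical discrete hazards, White population

l_b, e_b, e65_b = step_life_table(ages, mu_b, 65.0, 100.0)
l_w, e_w, e65_w = step_life_table(ages, mu_w, 65.0, 100.0)

# Sum over deaths of (s_k^B - s_k^W) * (l_B(a_k+) e_W(a_k) + l_W(a_k+) e_B(a_k)).
jumps = sum(
    (math.exp(mb) - math.exp(mw)) * (lb * ew + lw * eb)
    for mb, mw, lb, lw, eb, ew in zip(mu_b, mu_w, l_b, l_w, e_b, e_w)
)
gap = e65_w - e65_b
print(gap, 0.5 * jumps)  # the two numbers should coincide up to rounding
```

Under these assumptions, half of the summed jumps reproduces the gap in truncated life expectancy, with no numerical integration involved.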

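Continuing recursively, we can obtain an expression in which all functions are evaluated at the bounds of the interval. Specifically, for the case of three deaths inside the interval $[x, x+1]$ we obtain:

$$\begin{aligned}F(x)-F(x+1)={}&\bigl(s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W\bigr)\Bigl(\ell_B(x+1)\bigl(e_W(x)-(a_1-x)\bigr)+\ell_W(x+1)\bigl(e_B(x)-(a_1-x)\bigr)\Bigr)\\&-\bigl(s_3^B-s_3^W\bigr)(a_3-a_2)\bigl(\ell_B(x+1)+\ell_W(x+1)\bigr)\\&-\bigl(s_2^Bs_3^B-s_2^Ws_3^W\bigr)(a_2-a_1)\bigl(\ell_B(x+1)+\ell_W(x+1)\bigr).\end{aligned}$$

Then we introduce $\hat a$ to represent the average age at death:

$$\begin{aligned}F(x)-F(x+1)={}&\bigl(s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W\bigr)\Bigl(\ell_B(x+1)\bigl(e_W(x)-(\hat a-x)\bigr)+\ell_W(x+1)\bigl(e_B(x)-(\hat a-x)\bigr)\Bigr)\\&-\bigl(\ell_B(x+1)+\ell_W(x+1)\bigr)\Bigl(\bigl(s_3^B-s_3^W\bigr)(a_3-a_2)+\bigl(s_2^Bs_3^B-s_2^Ws_3^W\bigr)(a_2-a_1)+\bigl(s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W\bigr)(a_1-\hat a)\Bigr).\end{aligned}$$

Choosing $\hat a$ so that the second term vanishes, we obtain

$$F(x)-F(x+1)=\bigl(s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W\bigr)\Bigl(\ell_B(x+1)\bigl(e_W(x)-(\hat a-x)\bigr)+\ell_W(x+1)\bigl(e_B(x)-(\hat a-x)\bigr)\Bigr),$$

where $\hat a$ is defined as $\hat a=w_1a_1+w_2a_2+w_3a_3$, with weights:

$$\begin{aligned}w_1&=1-\frac{s_2^Bs_3^B-s_2^Ws_3^W}{s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W},\\w_2&=\frac{s_2^Bs_3^B-s_2^Ws_3^W}{s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W}-\frac{s_3^B-s_3^W}{s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W},\\w_3&=\frac{s_3^B-s_3^W}{s_1^Bs_2^Bs_3^B-s_1^Ws_2^Ws_3^W}.\end{aligned}$$

Because $w_1+w_2+w_3=1$, $\hat a$ can be interpreted as a weighted average of ages at death. In addition, because $s_1^js_2^js_3^j=\ell_j(x)/\ell_j(x+1)=l_x^j/l_{x+1}^j$, eqn. (30) is proven for the case of three deaths in the interval $[x, x+1]$, as well as for the general case, because it does not depend on the number of deaths except through $\hat a$.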

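The formula for $\hat a$ in the general case is $\hat a=\sum_k w_ka_k$, with weights:

$$w_k=\frac{\prod_{k'=k}^{K_x}s_{k'}^B-\prod_{k'=k}^{K_x}s_{k'}^W}{\prod_{k'=1}^{K_x}s_{k'}^B-\prod_{k'=1}^{K_x}s_{k'}^W}-\frac{\prod_{k'=k+1}^{K_x}s_{k'}^B-\prod_{k'=k+1}^{K_x}s_{k'}^W}{\prod_{k'=1}^{K_x}s_{k'}^B-\prod_{k'=1}^{K_x}s_{k'}^W}.$$

This formula is valid for the first and last age at death (using the convention $\prod_{k'=K_x+1}^{K_x}(\cdot)=1$).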
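The general weights and eqn. (30) can be verified together on a small example. The sketch below is illustrative only: the death ages, discrete hazards, and the starting values of survival and life expectancy at age $x$ are hypothetical, as are the function names. It assumes $F(a)=\ell_B(a)\,e_W(a)-\ell_W(a)\,e_B(a)$, sums the jumps of $F$ directly over the deaths in $[x, x+1]$, and compares the result with the right-hand side of (30) evaluated at $\hat a=\sum_k w_ka_k$.

```python
import math

def mean_age_weights(mu_b, mu_w):
    """Weights w_k built from s_k^j = exp(mu_hat_j(a_k)); an empty product equals 1."""
    def tail(k):
        # product of s^B over deaths k..K minus the same product of s^W
        return math.exp(sum(mu_b[k:])) - math.exp(sum(mu_w[k:]))
    full = tail(0)
    return [(tail(k) - tail(k + 1)) / full for k in range(len(mu_b))]

def jump_sum(x, ages, mu_b, mu_w, l_b, l_w, e_b, e_w):
    """Sum of F(a_k) - F(a_k+) over the deaths in [x, x+1], with F = l_B e_W - l_W e_B.

    l_* and e_* are the values at age x; returns the sum and the survival values at x+1.
    """
    total, prev = 0.0, x
    for a, mb, mw in zip(ages, mu_b, mu_w):
        e_b -= a - prev                    # no deaths since prev: e falls by elapsed time
        e_w -= a - prev
        before = l_b * e_w - l_w * e_b
        l_b, e_b = l_b * math.exp(-mb), e_b * math.exp(mb)
        l_w, e_w = l_w * math.exp(-mw), e_w * math.exp(mw)
        total += before - (l_b * e_w - l_w * e_b)
        prev = a
    return total, l_b, l_w

# Hypothetical inputs for one age interval [70, 71].
x = 70.0
ages = [70.1, 70.3, 70.45, 70.8, 70.95]
mu_b = [0.010, 0.000, 0.012, 0.000, 0.009]
mu_w = [0.000, 0.008, 0.004, 0.007, 0.000]
lx_b, lx_w, ex_b, ex_w = 0.80, 0.85, 13.0, 14.5

weights = mean_age_weights(mu_b, mu_w)
a_hat = sum(w * a for w, a in zip(weights, ages))
lhs, lx1_b, lx1_w = jump_sum(x, ages, mu_b, mu_w, lx_b, lx_w, ex_b, ex_w)
rhs = (lx_b / lx1_b - lx_w / lx1_w) * (
    lx1_w * (ex_b - (a_hat - x)) + lx1_b * (ex_w - (a_hat - x))
)
print(lhs, rhs, sum(weights))  # lhs and rhs should agree; the weights should sum to 1
```

The weights are undefined when the two products of $s_k^j$ coincide, that is, when both populations have the same survival ratio over the interval; the sketch does not guard against that case.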

The above formulae are valid independently of assumptions on the form of the distribution of ages at death in the age interval. The expressions for $\hat a=\hat a_x$ can be derived for specific cases and under specific assumptions. Assume that the race-specific hazards are time-independent (i.e., constant) for the interval $[x, x+1]$ and that the number of deaths for each race ($K_x^W$ and $K_x^B$, with $K_x=K_x^W+K_x^B$) is known. In this case $\hat a$ has to be calculated through $K_x$-dimensional integration. For example, if $K_x^W=1$ and $K_x^B=1$, we need to calculate two two-dimensional integrals, $\hat a_x=I_1+I_2$, where:

$$\begin{aligned}I_1&=\int_x^{x+1}dx_1\int_x^{x+1}dx_2\,\rho_B(x_1)\,\rho_W(x_2)\bigl(x_1w_1^{BW}+x_2w_2^{BW}\bigr)I(x_1<x_2),\\I_2&=\int_x^{x+1}dx_1\int_x^{x+1}dx_2\,\rho_W(x_1)\,\rho_B(x_2)\bigl(x_1w_1^{WB}+x_2w_2^{WB}\bigr)I(x_1<x_2).\end{aligned}$$

Here $I(\cdot)$ is the indicator function and $\rho$ denotes the conditional density of age at death for the interval $[x, x+1]$, i.e.,

$$\rho_j(y)=\frac{\mu_j\exp(-\mu_jy)}{\exp(-\mu_jx)-\exp\bigl(-\mu_j(x+1)\bigr)}.$$

The weights in this case, with $s^j=\exp(\mu_j)$, are:

$$w_1^{BW}=1-\frac{1-s^W}{s^B-s^W},\quad w_2^{BW}=\frac{1-s^W}{s^B-s^W},\quad w_1^{WB}=1-\frac{s^B-1}{s^B-s^W},\quad w_2^{WB}=\frac{s^B-1}{s^B-s^W}.$$

The integrals $I_1$ and $I_2$ are calculated analytically and can be presented in the form:

$$\hat a_x=x+\frac{1}{2}-\frac{\mu_B+\mu_W}{12}+o(\mu_B,\mu_W).$$

The integration for more than two deaths results in:

$$\hat a_x=x+\frac{1}{2}-\frac{K_x^B\mu_B+K_x^W\mu_W}{12}+o(\mu_B,\mu_W).$$

The formula for an arbitrary age interval $[a_1, a_2]$ is:

$$\hat a_x=\frac{a_1+a_2}{2}-(a_2-a_1)^2\,\frac{K_x^B\mu_B+K_x^W\mu_W}{12}+o(\mu_B,\mu_W).$$

In these illustrations, $\hat a_x$ is less than halfway between the start and end of the indicated age interval. This property follows from the assumption that the hazard rates are constant over the interval. An assumption of increasing hazard rates is reasonable for human mortality at age 65 and above. The most convenient assumption for such increasing hazard rates is that deaths are uniformly distributed over each interval, with the mean time to death among decedents exactly halfway between the start and end of the interval. The expression $\hat a_x=x+\frac{1}{2}$ is consistent with this assumption.
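The two-death approximation can be checked by brute-force integration. The sketch below is a minimal illustration under stated assumptions, not the authors' derivation: the hazards $\mu_B=0.03$ and $\mu_W=0.02$ are hypothetical, the weights are written as $(s^B-1)/(s^B-s^W)$ for the Black death and $(1-s^W)/(s^B-s^W)$ for the White death with $s^j=\exp(\mu_j)$, and $I_1+I_2$ is approximated with a two-dimensional midpoint rule in which a grid cell on the diagonal is split evenly between the two orderings.

```python
import math

def cond_density(mu, x):
    """Density of age at death on [x, x+1] under constant hazard mu, given death in the interval."""
    norm = math.exp(-mu * x) - math.exp(-mu * (x + 1.0))
    return lambda y: mu * math.exp(-mu * y) / norm

def mean_age_two_deaths(mu_b, mu_w, x, n=200):
    """Midpoint-rule value of I1 + I2 for one Black and one White death (requires mu_b != mu_w)."""
    s_b, s_w = math.exp(mu_b), math.exp(mu_w)
    w_b = (s_b - 1.0) / (s_b - s_w)   # weight of the Black death (w1^BW and w2^WB)
    w_w = (1.0 - s_w) / (s_b - s_w)   # weight of the White death (w2^BW and w1^WB)
    rho_b, rho_w = cond_density(mu_b, x), cond_density(mu_w, x)
    h = 1.0 / n
    mids = [x + (i + 0.5) * h for i in range(n)]
    pb = [rho_b(m) for m in mids]
    pw = [rho_w(m) for m in mids]
    i1 = i2 = 0.0
    for i in range(n):
        for j in range(i, n):
            share = 0.5 if i == j else 1.0   # diagonal cell: half to each ordering
            x1, x2 = mids[i], mids[j]        # x1 is the earlier death, x2 the later one
            i1 += share * pb[i] * pw[j] * (x1 * w_b + x2 * w_w) * h * h   # Black death first
            i2 += share * pw[i] * pb[j] * (x1 * w_w + x2 * w_b) * h * h   # White death first
    return i1 + i2

x, mu_b, mu_w = 70.0, 0.03, 0.02               # hypothetical values
a_hat = mean_age_two_deaths(mu_b, mu_w, x)
approx = x + 0.5 - (mu_b + mu_w) / 12.0        # first-order expression for two deaths
print(a_hat, approx)
```

For hazards of this size the numerical value lies slightly below $x+\frac{1}{2}$ and differs from the first-order expression only by terms of higher order in the hazards.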

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Data Availability:

The 5%-Medicare data used in this study cannot be shared publicly due to confidentiality reasons. Please contact the Centers for Medicare and Medicaid Services and/or ResDAC for information on how to obtain the data. The datasets containing the Multiple-Cause-of-Death data analyzed during the current study are available in the CDC Wide-ranging OnLine Data for Epidemiologic Research (WONDER) portal, https://wonder.cdc.gov/.

References

  • 1. Ezzati M, Friedman AB, Kulkarni SC, Murray CJ. The reversal of fortunes: trends in county mortality and cross-county mortality disparities in the United States. PLoS Medicine. 2008;5(4):e66.
  • 2. National Research Council, Committee on Population. International differences in mortality at older ages: Dimensions and sources. National Academies Press; 2011.
  • 3. DHHS U. US Department of Health and Human Services: Healthy People 2010; Midcourse review. 2007. http://www.healthypeople.gov/data/midcourse.
  • 4. DHHS. Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. Healthy People 2010: Final review. US Government Printing Office; 2012.
  • 5. Kaul P, Armstrong PW, Chang W-C, et al. Long-term mortality of patients with acute myocardial infarction in the United States and Canada: comparison of patients enrolled in Global Utilization of Streptokinase and t-PA for Occluded Coronary Arteries (GUSTO)-I. Circulation. 2004;110(13):1754–1760.
  • 6. Woolf SH, Aron LY. The US health disadvantage relative to other high-income countries: findings from a National Research Council/Institute of Medicine report. JAMA. Feb 27 2013;309(8):771–2. doi: 10.1001/jama.2013.91
  • 7. National Research Council, Committee on Population. Explaining divergent levels of longevity in high-income countries. National Academies Press; 2011.
  • 8. Pollard JH. The expectation of life and its relationship to mortality. Journal of the Institute of Actuaries. 1982;109(2):225–240.
  • 9. Pollard JH. On the decomposition of changes in expectation of life and differentials in life expectancy. Demography. May 1988;25(2):265–76.
  • 10. Canudas-Romo V. Decomposition methods in demography. Rozenberg Publishers Amsterdam; 2003.
  • 11. Vaupel JW, Romo VC. Decomposing change in life expectancy: A bouquet of formulas in honor of Nathan Keyfitz’s 90th birthday. Demography. 2003;40(2):201–216.
  • 12. Beltrán-Sánchez H, Preston SH, Canudas-Romo V. An integrated approach to cause-of-death analysis: cause-deleted life tables and decompositions of life expectancy. Demographic Research. 2008;19:1323.
  • 13. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association. 1958;53(282):457–481.
  • 14. White A, McKee M, de Sousa B, et al. An examination of the association between premature mortality and life expectancy among men in Europe. The European Journal of Public Health. 2014;24(4):673–679.
  • 15. Yamunadevi A, Sulaja S. Old Age Mortality in India–An Exploration from Life Expectancy at Age 60. International Journal of Asian Social Science. 2016;6(12):698–704.
  • 16. Tsai W-Y, Jewell NP, Wang M-C. A note on the product-limit estimator under right censoring and left truncation. Biometrika. 1987;74(4):883–886.
  • 17. Beltrán-Sánchez H, Soneji S. A unifying framework for assessing changes in life expectancy associated with changes in mortality: The case of violent deaths. Theoretical Population Biology. 2011;80(1):38–48.
  • 18. Arriaga EE. Measuring and explaining the change in life expectancies. Demography. 1984;21(1):83–96.
  • 19. Arriaga EE. Changing trends in mortality decline during the last decades. Differential mortality: Methodological issues and biosocial factors. Oxford: …; 1989.
  • 20. Makeham WM. On the law of mortality. Journal of the Institute of Actuaries. 1867;13(6):325–358.
  • 21. Makeham W. On an application of the theory of the composition of decremental forces. Journal of the Institute of Actuaries. 1874;18(5):317–322.
  • 22. Greville TN. Mortality tables analyzed by cause of death. Record of the American Institute of Actuaries. 1948;37(76):283–294.
  • 23. Bayo F. United States Life Tables by Causes of Death, 1959-61. National Center for Health Statistics, Washington, DC. 1968.
  • 24. Kochanek KD, Maurer JD, Rosenberg HM. Causes of death contributing to changes in life expectancy: United States, 1984-1989. vol 23. Department of Health and Human Services Public Health Ters for Di; 1994.
  • 25. Gail M. A review and critique of some models used in competing risk analysis. Biometrics. 1975:209–222.
  • 26. Preston S, Heuveline P, Guillot M. Demography: measuring and modeling population processes. Malden, MA: Blackwell Publishers; 2001.
  • 27. Chiang CL. A stochastic study of the life table and its applications: II. Sample variance of the observed expectation of life and other biometric functions. Human Biology. 1960;32(3):221–238.
  • 28. Chiang CL. Life table and its applications. Robert E. Krieger Publishing Company; 1984.
  • 29. Scherbov S, Ediev D. Significance of life table estimates for small populations: Simulation-based study of standard errors. Demographic Research. 2011;24:527–550.
  • 30. Silcocks P, Jenner D, Reza R. Life expectancy as a summary of mortality in a population: statistical considerations and suitability for use by health authorities. Journal of Epidemiology & Community Health. 2001;55(1):38–43.
  • 31. Gentle JE. Random number generation and Monte Carlo methods. vol 381. Springer; 2003.
  • 32. SAS Institute Inc. SAS/STAT 9.4 User’s Guide. Cary, NC: SAS Institute Inc.; 2017.
  • 33. Peterson AV Jr. Expressing the Kaplan-Meier estimator as a function of empirical subsurvival functions. Journal of the American Statistical Association. 1977;72(360a):854–858.
  • 34. CDC. Underlying cause of death 1999-2017 on CDC WONDER online database, released 2019. Centers for Disease Control and Prevention; National Center for Health Statistics; Data are from the multiple cause of death files. 1999;2019
  • 35. Akushevich I, Yashkin AP, Yashin AI, Kravchenko J. Geographic disparities in mortality from Alzheimer's disease and related dementias. Journal of the American Geriatrics Society. 2021.
  • 36. Hollmann FW, Kallan JE, Mulder TJ. Methodology and assumptions for the population projections of the United States: 1999-2100. US Department of Commerce, Bureau of the Census, Population Division; …; 1999.
  • 37. Zehna PW. Invariance of maximum likelihood estimators. Annals of Mathematical Statistics. 1966;37(3):744.
  • 38. Akushevich I, Yashkin A, Kravchenko J, et al. Theory of Partitioning of Disease Prevalence and Mortality in Observational Data. Theoretical Population Biology. 2017;114:117–127.
  • 39. Akushevich I, Yashkin AP, Kravchenko J, et al. Identifying the causes of the changes in the prevalence patterns of diabetes in older US adults: A new trend partitioning approach. Journal of Diabetes and its Complications. 2018;32(4):362–367.
  • 40. Akushevich I, Kravchenko J, Yashkin AP, Fang F, Yashin AI. Partitioning of Time Trends in Prevalence and Mortality of Lung Cancer. Statistics in Medicine. 2019;(in press)
  • 41. Akushevich I, Yashkin AP, Inman BA, Sloan F. Partitioning of time trends in prevalence and mortality of bladder cancer in the United States. Annals of Epidemiology. 2020;47:25–29.
  • 42. Akushevich I, Yashkin AP, Kravchenko J, Yashin AI. Analysis of Time Trends in Alzheimer’s Disease and Related Dementias Using Partitioning Approach. Journal of Alzheimer's Disease. 2021;(Preprint):1–13.
  • 43. Akushevich I, Yashkin A, Kovtun M, Yashin A, Kravchenko J. Underlying mechanisms of change in cancer prevalence in older US adults: contributions of incidence, survival, and ascertainment at early stages. Cancer Causes & Control. 2022;33(9):1161–1172.
  • 44. Yun M-S. Decomposing differences in the first moment. Economics Letters. 2004;82(2):275–280.
  • 45. Powers DA, Yun M-S. Multivariate decomposition for hazard rate models. Sociological Methodology. 2009;39(1):233–263.
  • 46. Akushevich I, Kolpakov S, Yashkin AP, Kravchenko J. Vulnerability to hypertension is a major determinant of racial disparities in Alzheimer’s disease risk. American Journal of Hypertension. 2022;35(8):745–751.
  • 47. Akushevich I, Kravchenko J, Yashkin A, Doraiswamy M, Hill C, for the “Alzheimer’s Disease and Related Dementia Health Disparities Collaborative Group”. Expanding the Scope of Health Disparities Research in Alzheimer’s Disease and Related Dementias: Recommendations from the “Leveraging Existing Data and Analytic Methods for Health Disparities Research Related to Aging and Alzheimer’s Disease and Related Dementias” Workshop Series. Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring. 2023:in press.
  • 48. Wong MD, Shapiro MF, Boscardin WJ, Ettner SL. Contribution of major diseases to disparities in mortality. New England Journal of Medicine. 2002;347(20):1585–1592.
  • 49. Arriaga E. A note on the use of temporary life expectancies for analyzing changes and differentials of mortality. Geneva: World Health Organization. 1982:559–562.
  • 50. Andreev EM. The method of components in the analysis of length of life. Vestnik statistiki. 1982;9:42–47.
  • 51. Pressat R. Contribution des écarts de mortalité par âge à la différence des vies moyennes. Population (French edition). 1985:766–770.
  • 52. Shkolnikov VM, Valkonen T, Begun A, Andreev EM. Measuring inter-group inequalities in length of life. Genus. 2001:33–62.
  • 53. Horiuchi S, Wilmoth JR, Pletcher SD. A decomposition method based on a model of continuous change. Demography. 2008;45(4):785–801.
  • 54. Andreev EM, Shkolnikov VM, Begun AZ. Algorithm for decomposition of differences between aggregate demographic measures and its application to life expectancies, healthy life expectancies, parity-progression ratios and total fertility rates. Demographic Research. 2002;7:499–522.
  • 55. Riffe T. DemoDecomp: decompose demographic functions. R Package version. 2018;101
  • 56. Aburto JM, Schöley J, Kashnitsky I, et al. Quantifying impacts of the COVID-19 pandemic through life-expectancy losses: a population-level study of 29 countries. International Journal of Epidemiology. 2022;51(1):63–74.
  • 57. Aburto JM, Tilstra AM, Floridi G, Dowd JB. Significant impacts of the COVID-19 pandemic on race/ethnic differences in US mortality. Proceedings of the National Academy of Sciences. 2022;119(35):e2205813119.
  • 58. Andreev EM, Shkolnikov VM. An Excel spreadsheet for the decomposition of a difference between two values of an aggregate demographic measure by stepwise replacement running from young to old ages. Rostock: Max Planck Institute for Demographic Research (MPIDR Technical Report TR–2012–002). 2012.
  • 59. Auger N, Feuillet P, Martel S, Lo E, Barry AD, Harper S. Mortality inequality in populations with equal life expectancy: Arriaga's decomposition method in SAS, Stata, and Excel. Annals of Epidemiology. 2014;24(8):575–580.e1.
  • 60. Zheng Y, Chang Q, Yip PSF. Understanding the increase in life expectancy in Hong Kong: contributions of changes in age-and cause-specific mortality. International Journal of Environmental Research and Public Health. 2019;16(11):1959.
  • 61. Currie J, Boyce T, Evans L, et al. Life expectancy inequalities in Wales before COVID-19: an exploration of current contributions by age and cause of death and changes between 2002 and 2018. Public Health. 2021;193:48–56.
  • 62. Currie J, Schilling HT, Evans L, et al. Contribution of avoidable mortality to life expectancy inequalities in Wales: a decomposition by age and by cause between 2002 and 2020. Journal of Public Health. 2022.
  • 63. Mehregan M, Khosravi A, Farhadian M, Mohammadi Y. The age and cause decomposition of inequality in life expectancy between Iranian provinces: application of Arriaga method. BMC Public Health. 2022;22(1):1–9.
  • 64. Andersen PK, Canudas-Romo V, Keiding N. Cause-specific measures of life years lost. Demographic Research. 2013;29:1127–1152.
  • 65. Remund A, Camarda CG, Riffe T. A cause-of-death decomposition of young adult excess mortality. Demography. 2018;55(3):957–978.
  • 66. Dwyer-Lindgren L, Kendrick P, Kelly YO, et al. Life expectancy by county, race, and ethnicity in the USA, 2000–19: a systematic analysis of health disparities. The Lancet. 2022;400(10345):25–38.
  • 67. Cunningham TJ, Croft JB, Liu Y, Lu H, Eke PI, Giles WH. Vital signs: racial disparities in age-specific mortality among blacks or African Americans—United States, 1999–2015. MMWR Morbidity and Mortality Weekly Report. 2017;66(17):444.
  • 68. Kochanek K, Murphy S, Xu J, Arias E. Mortality in the United States, 2016. NCHS Data Brief, no 293. Hyattsville, MD: National Center for Health Statistics. 2017.
  • 69. Harper S, Rushani D, Kaufman JS. Trends in the black-white life expectancy gap, 2003-2008. JAMA. 2012;307(21):2257–2259.
  • 70. Firebaugh G, Acciai F, Noah AJ, Prather C, Nau C. Why the racial gap in life expectancy is declining in the United States. Demographic Research. 2014;31:975.
  • 71. Kochanek KD, Arias E, Anderson RN. How did cause of death contribute to racial differences in life expectancy in the United States in 2010? (NCHS Data Brief No. 125). National Center for Health Statistics, US Centers for Disease Control and Prevention [Online] http://www.cdc.gov/nchs/data/databriefs/db125.pdf Accessed April. 2013;4:2016.
  • 72. DuGoff EH, Canudas-Romo V, Buttorff C, Leff B, Anderson GF. Multiple chronic conditions and life expectancy: a life table analysis. Medical Care. 2014:688–694.
  • 73. Woolf SH, Schoomaker H. Life expectancy and mortality rates in the United States, 1959-2017. JAMA. 2019;322(20):1996–2016.
  • 74. Dwyer-Lindgren L, Bertozzi-Villa A, Stubbs RW, et al. Inequalities in life expectancy among US counties, 1980 to 2014: temporal trends and key drivers. JAMA Internal Medicine. 2017;177(7):1003–1011.
  • 75. Currie J, Schwandt H. Inequality in mortality decreased among the young while increasing for older adults, 1990–2010. Science. 2016;352(6286):708–712.
  • 76. Riddell CA, Morrison KT, Kaufman JS, Harper S. Trends in the contribution of major causes of death to the black-white life expectancy gap by US state. Health & Place. 2018;52:85–100.
  • 77. Bharmal N, Tseng C-H, Kaplan R, Wong MD. State-level variations in racial disparities in life expectancy. Health Services Research. 2012;47(1 Pt 2):544.
  • 78. Singh GK, Daus GP, Allender M, et al. Social determinants of health in the United States: addressing major health inequality trends for the nation, 1935-2016. International Journal of MCH and AIDS. 2017;6(2):139.
  • 79. Vierboom YC, Preston SH. Life beyond 65: Changing spatial patterns of survival at older ages in the United States, 2000–2016. The Journals of Gerontology: Series B. 2020;75(5):1093–1103.
  • 80. Ma J, Ward EM, Siegel RL, Jemal A. Temporal Trends in Mortality in the United States, 1969-2013. JAMA. Oct 27 2015;314(16):1731–9. doi: 10.1001/jama.2015.12319
  • 81. National Center for Health Statistics. Health, United States, 2016, With chartbook on long-term trends in health. Government Printing Office; 2017.
  • 82. Elo IT, Hendi AS, Ho JY, Vierboom YC, Preston SH. Trends in non-Hispanic white mortality in the United States by metropolitan-nonmetropolitan status and region, 1990–2016. Population and Development Review. 2019;45(3):549.
  • 83. Woolf SH, Chapman DA, Buchanich JM, Bobby KJ, Zimmerman EB, Blackburn SM. Changes in midlife death rates across racial and ethnic groups in the United States: systematic analysis of vital statistics. BMJ. 2018;362.
  • 84. Bauer UE, Briss PA, Goodman RA, Bowman BA. Prevention of chronic disease in the 21st century: elimination of the leading preventable causes of premature death and disability in the USA. The Lancet. 2014;384(9937):45–52.
  • 85. Kramarow EA, Tejada-Vera B. Dementia mortality in the United States, 2000-2017. National Vital Statistics Reports: From the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System. 2019;68(2):1–29.
