Author manuscript; available in PMC: 2023 Aug 1.
Published in final edited form as: Sociol Methodol. 2022 Jul 23;52(2):254–286. doi: 10.1177/00811750221112398

Bayesian Multistate Life Table Methods for Large and Complex State Spaces: Development and Illustration of a New Method

Scott M Lynch 1, Emma Zang 2
PMCID: PMC10241463  NIHMSID: NIHMS1852062  PMID: 37284595

Abstract

Multistate life table methods are an important tool for producing easily understood measures of population health. Most contemporary uses of these methods involve sample data, thus requiring techniques for capturing uncertainty in estimates. In recent decades, several methods have been developed to do so. Among these methods, the Bayesian approach proposed by Lynch and Brown has several unique advantages. However, the approach is limited to estimating years to be spent in only two living states, such as “healthy” and “unhealthy.” In this article, the authors extend this method to allow for large state spaces with “quasi-absorbing” states. The authors illustrate the new method and show its advantages using data from the Health and Retirement Study to investigate U.S. regional differences in years of remaining life to be spent with diabetes, chronic conditions, and disabilities. The method works well and yields rich output for reporting and subsequent analyses. The expanded method also should facilitate the use of multistate life tables to address a wider array of social science research questions.

Keywords: multistate life table, Bayesian estimation, discrete-time Markov processes


Multistate life tables (MSLT), also known as increment-decrement life tables, are useful tools for studying the life-course implications of transitions into and out of states defined by health or other statuses. Because the key output of MSLTs—remaining years to be lived in each measured state (“state expectancies”)—can be easily understood by policymakers, laypeople, and others, MSLT methods have been used widely in many different fields. Originally, MSLT methods relied on census and aggregate vital statistics data and were used exclusively by demographers to study a variety of topics, including marital status transitions, working-life transitions, and health transitions (e.g., Billingsley 1968; Fix and Neyman 1951; Land and Rogers 1982; Meier 1955; Sverdrup 1965).

In recent decades, use of survey data has become increasingly common for MSLT estimation in studies of health transitions. Survey data provide richer information than census data, allowing researchers to provide more detail regarding subpopulation disparities in state expectancies, and, in theory, to investigate more complex state spaces. Several MSLT methods have been developed over the past two decades to facilitate the production of MSLTs using sample data, including using regression-based smoothing methods to compensate for small cell sizes and using bootstrapping and Markov chain Monte Carlo (MCMC) methods to produce interval estimates, as is appropriate for sample data (Calhoun 1997; Hayward, Rendall, and Crimmins 1999; Land, Guralnik, and Blazer 1994; Lee and Rendall 2001; Lynch and Brown 2005). The methods and software developed to construct MSLTs have been used in numerous studies to elucidate social disparities in state expectancies (e.g., Crimmins et al. 2009; Reynolds, Saito, and Crimmins 2005; Zaninotto and Steptoe 2019).

Although these newer MSLT methods have enabled us to learn more about disparities in state expectancies, the state spaces investigated using these methods have remained relatively limited. The most common applications of MSLT methods in demographic studies of health using survey data in recent decades have involved simple two-living-states-plus-death state spaces—such as healthy, unhealthy, and deceased—with health typically defined by a single measure, such as the presence or absence of a disability. The reason is that extant methods for constructing MSLTs from survey data are computationally costly because (1) constructing interval estimates for state expectancies is not straightforward but is necessary when using sample (versus population) data, and (2) larger state spaces complicate estimation of the larger transition probability matrices that underlie MSLT construction.

Still, adult life-course health processes and outcomes are complex and multifaceted, as evidenced by a large literature on multistate health processes outside of demography that does not estimate state expectancies (see, e.g., Willekens and Putter 2014) but simply elucidates multistate processes. For example, diabetes is one of the most rapidly evolving threats to population health, because it entails downstream health complications such as heart disease, disability, and premature mortality. Thus, in this article, we extend a Bayesian approach developed by Lynch and Brown (2005) to handle large state spaces with “quasi-absorbing” states: states that, once visited, do not allow a return to certain other states (e.g., once diagnosed as “diabetic,” one cannot reverse this diagnosis; on prior use of this terminology, see Abner et al. 2012).

Lynch and Brown’s (2005) approach involves (1) using MCMC methods to sample parameters from the posterior distribution for a discrete-time multinomial probit model predicting outcome states, conditional on starting states, over time intervals (panel study waves); (2) generating sets of predicted age-specific transition probability matrices from the parameter samples; (3) applying standard MSLT calculations to the transition probability matrices; and (4) summarizing quantities of interest from the resulting collection of life tables. The Bayesian approach offers some key advantages over other approaches. First, prior information can be incorporated formally into Bayesian models. Second, the Bayesian paradigm enables direct, probabilistic interpretations of estimates, whereas interpretations of estimates obtained from other approaches are not straightforward or necessarily theoretically justified. Third, from a computational perspective, the Bayesian approach can more easily handle sparse cells that often arise from complex state spaces with sample data than data bootstrapping and other methods can (Gelman et al. 2013; Ghosh, Mukhopadhyay, and Lu 2006).

Until recently, sampling parameters from a posterior distribution for multinomial models with many outcomes has been a substantial challenge, thus limiting the practical size of the state space that could be investigated with a Bayesian approach. The MSLT method developed by Lynch and Brown (2005), for example, handles only simple state spaces with two living states: “healthy” and “unhealthy.” Fortunately, recent advances have facilitated the rapid sampling of parameters from multinomial logit models. We show how, with these new developments and a key change in the modeling strategy, the Lynch-Brown method can be extended to handle large state spaces with quasi-absorbing states. We illustrate the utility of such an approach by addressing the following substantive questions: How do diabetic life expectancy (LE) and the percentage of remaining life to be spent diabetic at age 50 vary by birth region and current region? How do total years of life remaining at age 50 with activities of daily living (ADLs) or chronic conditions vary by birth and current region for persons with diabetes? How does the percentage of remaining life to be spent with ADLs or chronic conditions at age 50 vary by birth and current region for persons with diabetes?

BACKGROUND

The single decrement life table has been a staple of demography since its initial development more than three centuries ago. Its key output is LE, which is an estimate of the mean of the age-at-death distribution for a synthetic cohort whose mortality rates are assumed to be “stationary,” or fixed at the current prevailing rates in a given year (Centers for Disease Control and Prevention 2022; Ofstedal et al. 2019). LE remains a key indicator of population health, but the development of MSLT methodology began early in the past century with an interest in differentiating qualitatively different living states, such as “healthy” and “unhealthy,” and capturing transitions between them prior to death. However, this development was relatively slow until the late 1970s, when computing power facilitated conducting the tedious calculations required for MSLT construction (Schoen 1988). The 1980s saw considerable development in the mathematics underlying multistate processes, including linking multistate processes to continuous-time Markov process theory (Land and Rogers 1982; Schoen 1988). Numerous applications of the methodology emerged (e.g., Billingsley 1968; Fix and Neyman 1951; Land and Rogers 1982; Meier 1955; Sverdrup 1965). However, the development and application of MSLT methods implicitly required that population-level transition rates were available for estimation.

Interest in using survey data to produce MSLTs began in the early 1990s, as general demographic interest expanded to investigating more finely grained subpopulation differences in not only the traditional population dynamics (fertility, mortality, and migration) but also phenomena related to them, such as marital transitions, health transitions, and labor market transitions. Whereas population-level data typically can be disaggregated only at coarse levels, such as by age, sex, and race, survey data contain much richer sets of covariates, allowing much more detailed subpopulation comparisons. However, survey data are typically much smaller in size than population data, making disaggregation along multiple dimensions difficult because of small cell sizes. In 1990, scholars began to use regression modeling techniques to produce smoothed estimates of transition probabilities, or rates, from which MSLTs could be constructed (Hayward and Grady 1990; Land et al. 1994).

In the MSLT literature, a general strategy for producing smoothed estimates of transition probabilities or rates is to model changes in states between two time points (“transitions”) by treating the state at time t + k as the outcome of a multinomial (or other type of) regression model in which the state at time t is a predictor and age and other variables are covariates. For example, in a two-living-state-plus-death state space, there are six possible transitions between two time points: (1) retention in state 1 (with corresponding transition/retention probability represented by p11), (2) transition from state 1 to state 2 (p12), (3) transition from state 1 to death (p13), (4) transition from state 2 to state 1 (p21), (5) retention in state 2 (p22), and (6) transition from state 2 to death (p23). A multinomial regression model with three outcome categories (state at t + k) and a dichotomous predictor (living state at t), along with age and other variables, is required to estimate the transition probabilities. Land et al. (1994) used Markov panel regression models following this general strategy, and Laditka and Wolf (1998) used multinomial logistic regression models. Rather than conditioning on the state at the start of a time interval by using a covariate for it, however, the latter estimated separate multinomial logistic regression models for each starting state.
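As a worked illustration of the two-living-state-plus-death example above, the six listed probabilities fill the first two rows of a 3 × 3 transition probability matrix. The third row, for death as an absorbing state, is fixed rather than estimated. The row-sum constraint shown is a standard property of transition probability matrices, not a result specific to this article.

$$
P=\begin{bmatrix}
p_{11} & p_{12} & p_{13}\\
p_{21} & p_{22} & p_{23}\\
0 & 0 & 1
\end{bmatrix},
\qquad p_{i1}+p_{i2}+p_{i3}=1 \ \text{ for } i=1,2.
$$

Because each row sums to 1, only two probabilities per living starting state are free, which is why a multinomial model with three outcome categories and one omitted reference category suffices.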

Given a set of coefficient estimates, covariates can be fixed at values for which a life table is to be generated (a “covariate profile”; e.g., White men with 12 years of schooling), and age can be incremented across the age range to obtain a complete set of predicted, smoothed age-specific 3 × 3 transition probability or rate matrices. Standard MSLT calculations can then be used to produce life tables from the transition matrices (Palloni 2001). This strategy represented a significant advance in MSLT methodology, but sample data contain sampling error, and this error was largely ignored in these efforts. Later in the 1990s, scholars began to develop methods for obtaining interval estimates of MSLT quantities to compensate for sampling (and other) uncertainty. These methods involve three broad approaches: bootstrapping, estimating embedded Markov chains (EMCs; with and without microsimulation), and Bayesian estimation.
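The step of fixing a covariate profile and incrementing age can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in any of the cited studies: the coefficient values, the variable names, the age range of 50 to 100, and the use of a multinomial logit with the starting state as a covariate are all hypothetical choices made for the example.

```python
import math

STATES = ["healthy", "unhealthy", "dead"]  # states 1, 2, 3 in the text

# Hypothetical coefficients (NOT estimates from any study). Order of terms:
# [intercept, age, starts_unhealthy, years_of_schooling].
# "healthy" is the reference ending state, so its coefficients are all zero.
BETA = {
    "healthy": [0.0, 0.0, 0.0, 0.0],
    "unhealthy": [-4.0, 0.04, 2.5, -0.05],
    "dead": [-8.0, 0.08, 1.0, -0.03],
}


def transition_row(age, starts_unhealthy, schooling):
    """Predicted probabilities of each ending state for one starting state."""
    x = [1.0, float(age), float(starts_unhealthy), float(schooling)]
    scores = [math.exp(sum(xi * bi for xi, bi in zip(x, BETA[s]))) for s in STATES]
    total = sum(scores)
    return [score / total for score in scores]


def transition_matrix(age, schooling):
    """3 x 3 matrix: rows are starting states, columns are ending states."""
    return [
        transition_row(age, 0, schooling),  # starting healthy
        transition_row(age, 1, schooling),  # starting unhealthy
        [0.0, 0.0, 1.0],                    # death is absorbing
    ]


# Covariate profile: 12 years of schooling; age incremented across the range.
matrices = {age: transition_matrix(age, schooling=12) for age in range(50, 101)}
```

The result is one smoothed matrix per age, which is the input that the life table calculations require.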

The bootstrapping approach involves either resampling data and estimating a multivariate hazard model for each bootstrap sample (Hayward et al. 1999) or estimating a single hazard model and drawing parameter samples from the sampling distribution of the parameters implied by the estimates (Calhoun 1997; Lee and Rendall 2001). In the first approach, multiple data sets are obtained via bootstrapping. For each bootstrapped sample, a discrete-time multinomial logit model is estimated and a life table is constructed using the approach described previously: generating sets of age-specific transition matrices for a desired covariate profile and applying standard MSLT calculations to them. In the second approach, a single multinomial logit model is estimated using the data. Then, drawing on maximum likelihood asymptotic theory, parameters are simulated from a multivariate normal distribution with the mean vector and the covariance matrix implied by the estimates of parameters and standard errors obtained from the logit model (a “bootstrap normal approximation,” see Efron and Tibshirani 1995). A life table is constructed for each parameter sample.

Both approaches work well for small state spaces, but significant limitations arise as the state space increases in size. Under the first approach, bootstrap samples in which no cases experience a given rare transition become increasingly likely, and the implications for inference of simply dropping such samples, or of increasing the size of the bootstrap samples (which violates the traditional mathematical assumptions of the bootstrap), are not clear (Efron and Tibshirani 1995). Under the second approach, the sampling distribution for multinomial logit parameters becomes increasingly nonnormal as the dimensionality of the parameter space increases, so that the asymptotic assumptions justifying simulating the parameters from a multivariate normal distribution are potentially unreasonable even in moderately large samples.
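The first difficulty can be made concrete with a short calculation. It rests on a simplifying assumption made here for illustration only: the bootstrap draws n records independently with replacement from n original records, of which exactly m experience a given rare transition. The probability that a bootstrap sample contains none of those m records is then

$$
\Pr(\text{no cases}) = \left(1-\frac{m}{n}\right)^{n} \approx e^{-m} \quad \text{for } n \gg m .
$$

With m = 1 this is about 0.37, and with m = 3 it is about 0.05. A large state space contains many rare transitions, so the chance that at least one of them is missing from a given bootstrap sample is higher still.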

The original EMC and microsimulation approach in MSLT methods involves estimating transition probabilities using a set of multinomial logit models (one for each starting state as mentioned previously), finding the EMC implied by the logit results, and simulating individual life histories on the basis of the EMC (Laditka and Wolf 1998). Lievre, Brouard, and Heathcote (2003) developed the statistical program IMaCh, which extends this approach to generate interval estimates for the life table quantities using the delta method. Cai et al. (2010) developed a further extension of this approach, with corresponding software SPACE in SAS. Unlike the IMaCh approach, however, bootstrapping is applied to estimate the variances for life table quantities in SPACE, and this approach handles the challenges posed by data obtained from complex sampling methods.

There are clear strengths to using an EMC approach (Wolf and Gill 2009). Specifically, panel surveys with longer intervals between waves of data collection (e.g., greater than one year) tend to miss some transitions between states, because most panel studies only ask about respondents’ current status at the time of interview. The goal of an EMC approach is to identify transition probabilities over shorter time intervals that are “embedded” in the observed current status data. This approach, when coupled with microsimulation of life histories, facilitates the inclusion of duration dependence in the life table construction process, making the approach “semi-Markovian,” a feature SPACE incorporates.

The EMC approach is theoretically pleasing and more consistent than other methods with the continuous-time Markov process theory underlying MSLTs (Schoen 1988), but EMC approaches have some significant limitations. First, it is not always possible to find an embedded continuous-time process underlying observed probabilities—the “embeddability problem” (Singer and Spilerman 1976). This is not necessarily problematic when embedding a discrete-time process with shorter time intervals within another discrete-time process with longer intervals and using ML estimation rather than direct mathematical computation, as in Laditka and Wolf (1998), but it implies that the process is still a discrete-time one, just with shorter time intervals within which transitions are not observed. Thus, the approach simply makes a different assumption about transitions than do other approaches, such as the bootstrapping approaches discussed previously and the Bayesian approach discussed later (“event-history approaches”).

Second, the ML estimation process for the EMC approach is extremely costly and limits most applications to relatively simple state spaces and very few covariates. Thus, there is a trade-off between one type of “realism” provided by an EMC approach and another type of realism: the ability to investigate disparities across multiple covariates using more detailed state spaces potentially afforded by event-history approaches.

Third, the practical payoff of the EMC approach for estimating state expectancies may be modest. Wolf and Gill (2009) used real data in which monthly status was recorded. They estimated hazard models for transitions between nondisabled and disabled states and produced life tables using the full, monthly data. They compared model and life table results with results obtained using event-history methods applied to data at 12- and 24-month intervals. Although they found that the EMC approach performed better than event-history approaches in capturing model parameters and transitions, differences in state expectancy estimates were not substantial even when observed intervals were two years long. Both EMC and event-history approaches tended to underestimate disabled years by half a year to a year. We suspect a reason for the similarity between the methods’ performances is that, although individuals may experience multiple transitions over a two-year interval, state expectancy calculations depend on assumptions regarding person-years lived in each state over an interval. Whether an individual transitions back and forth between disabled and nondisabled states continuously over an interval or makes only a single transition, the same number of years lived in both states over the interval may be obtained by both approaches.

The Bayesian approach to MSLT interval estimation developed by Lynch and Brown (2005) resolves some of the problems of the other approaches but has its own limitations. In their approach, parameters are sampled from a discrete-time multinomial probit model using MCMC sampling methods (Gamerman and Lopes 2006). Thus, like the bootstrapping approaches, the approach assumes only one transition occurs between observed survey waves. As with the bootstrapping approaches, a life table is produced for each parameter sample, and (credible) interval estimates of any life table quantity of interest can be computed from the life tables using a typical Bayesian approach of sorting estimates and selecting the values at the desired quantiles (see Brooks et al. 2011). Lynch and Brown (2005) developed a pair of C and R programs to facilitate implementation (sometimes called GSMLT for “Gibbs sampling for multistate life tables”).
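The interval computation described above (sort the per-sample estimates, then read off values at the desired quantiles) can be sketched as follows. This is a generic illustration, not the GSMLT code: the function name, the index rounding rule, and the synthetic draws standing in for a state expectancy are all assumptions made for the example.

```python
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed empirical interval: sort the draws, pick the tail quantiles."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    # Nearest-rank indices; other quantile conventions differ only slightly.
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


# Synthetic stand-in for one life table quantity computed once per parameter
# sample (e.g., years of healthy life at age 50). These are NOT real results.
random.seed(1)
draws = [random.gauss(12.0, 0.8) for _ in range(5000)]
low, high = credible_interval(draws)
```

The same function applies unchanged to any quantity computed from the life tables, such as a difference or ratio of state expectancies between two covariate profiles.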

Aside from the ability of this approach to incorporate prior information if desired, the method does not resample data and therefore does not risk obtaining samples that are missing transitions. Furthermore, the method does not assume normality of the posterior distribution of the model parameters, so it does not produce only asymptotically approximate interval estimates, as the delta method does. Finally, the method is fast and has virtually no limit on the number of covariates that can be included. A closely related method was subsequently developed by Lynch and Brown (2010) that adapts Sullivan’s method in a Bayesian framework to produce interval estimates for MSLT quantities for subpopulations using cross-sectional (i.e., prevalence), rather than panel, data.

The key limitation of Lynch and Brown’s (2005) method, however, is that, as developed, the method can handle only two living states. Additionally, the method treats the state at the start of an interval as a covariate, with the state at the end of the interval as the multinomial outcome. This approach does not allow state spaces in which some states do not communicate, because some outcome states cannot be predicted by some starting states. For example, in our illustration below, one state is “healthy” and another is “diagnosed diabetic.” Once an individual has been diagnosed with diabetes, the person cannot return to the “healthy” state. We refer to diabetes and similar states as “quasi-absorbing states.” A consequence is that, in some state spaces, a given starting state may imply a reduced set of outcome states, potentially implying a partial or complete separation of the sample were one to estimate a logit model. A solution to this problem may be to establish separate models for each state in the state space, with restricted sets of starting state covariates. However, piecing together results from such an approach would ignore dependencies across equations and lead to incorrect interval estimates. Ideally, a single model should be constructed that (1) can handle large state spaces with quasi-absorbing states, (2) can handle numerous covariates to improve the precision of estimates or estimation of MSLTs for very specific subpopulations, and (3) can be estimated within a reasonable time frame.

We conclude this section by noting that our discussion of the literature has been limited to the development of MSLT methods in demography and not multistate methods in statistics more generally. The general multistate literature has advanced substantially in recent years. A recent textbook by Cook and Lawless (2018) introduces major advances in multistate analysis in statistics, and Willekens and Putter (2014) discussed available software for multistate analysis, including the only software available for MSLT estimation, which was discussed previously. The state spaces investigated in the multistate statistics literature are often much more complex than those investigated in demography, but there are significant differences between the focus of the general literature and that of demography. First, the processes investigated in the general multistate literature tend to be time homogeneous, meaning that transition probabilities are constant over time. In contrast, MSLT methods in demography involve time-inhomogeneous processes: transition probabilities change with age. This complicates both the theory underlying the limiting behavior of the assumed Markov process, including mathematical expectations for time lived in states, and computation. Second, most applications tend to be relatively short term, not over decades of the life-course, and exact timing of transitions is often known, facilitating a closer linkage between continuous-time theory and the observed data. Third, the focus of most multistate applications is not on state expectancies. Thus, there has been little overlap between the demographic and statistics literatures.

A NEW METHOD FOR MSLT ESTIMATION

Our MSLT method involves three main steps: (1) sampling parameters from the posterior distribution for a Bayesian discrete-time multinomial logit model of transitions observed in panel data, with covariates, including age, as predictors; (2) computing predicted age-specific transition probability matrices from the posterior samples; and (3) generating MSLTs using typical demographic calculations applied to the age-specific transition probability matrices. This approach is similar to that of Lynch and Brown (2005) but with some key changes to the model structure and estimation strategy.

Step 1: Data and Model Setup and Sampling Parameters

We assume a discrete-time first-order Markov process with a finite state space that has at least two living states, as well as death as an absorbing state. Each state may or may not communicate with all other states, but all states should allow for a direct transition to death. All states other than death in such a state space are transient, by definition. This is a typical setup for MSLTs. Figure 1 shows such a state space, with all possible paths between living states; death, which can be reached from any other state, is excluded from the figure to avoid clutter. Also excluded from the figure, but implicit, are arrows indicating that individuals may stay in a given state (“retention”) rather than transition out of it in a given time interval. The states in this state space are the following: being healthy (H); having diabetes (D); having at least one chronic condition (heart disease, stroke, cancer, or lung disease) (C); having at least one ADL disability (difficulty with dressing, bedding, bathing, toileting, walking, or eating) (A); having diabetes and at least one ADL disability (DA); having diabetes and at least one chronic condition (DC); having at least one chronic condition and at least one ADL disability (CA); having diabetes, at least one chronic condition, and at least one ADL disability (DCA); and death (X). Some states are quasi-absorbing, because a return to certain other states from them is not possible. In particular, once an individual has been diagnosed with diabetes, the person cannot transition to any nondiabetic state (i.e., H, C, A, and CA), and once a person has been diagnosed with a chronic condition, the person cannot transition to any state that does not include a condition (i.e., H, D, A, and DA).

Figure 1.

State space of interest.

Note: Retention in each state is not shown but is allowed. The parenthetical numbers next to each state indicate how many transitions are possible from the given state. States include being healthy (H), being diabetic (D), having at least one chronic condition (C), having at least one activities of daily living (ADL) disability (A), being diabetic with at least one condition (DC), being diabetic with at least one ADL disability (DA), having at least one condition and at least one ADL disability (CA), being diabetic with at least one condition and at least one ADL disability (DCA), and death. Death is not shown but transition to death is allowed from all states.

Probabilities of transitioning between states over time can be represented with a square matrix in which rows represent origin (“starting”) states and columns represent destination (“ending”) states:

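$$
P(t)=\begin{bmatrix}
p_{h,h} & p_{h,a} & p_{h,c} & p_{h,ca} & p_{h,d} & p_{h,da} & p_{h,dc} & p_{h,dca} & p_{h,x}\\
p_{a,h} & p_{a,a} & p_{a,c} & p_{a,ca} & p_{a,d} & p_{a,da} & p_{a,dc} & p_{a,dca} & p_{a,x}\\
0 & 0 & p_{c,c} & p_{c,ca} & 0 & 0 & p_{c,dc} & p_{c,dca} & p_{c,x}\\
0 & 0 & p_{ca,c} & p_{ca,ca} & 0 & 0 & p_{ca,dc} & p_{ca,dca} & p_{ca,x}\\
0 & 0 & 0 & 0 & p_{d,d} & p_{d,da} & p_{d,dc} & p_{d,dca} & p_{d,x}\\
0 & 0 & 0 & 0 & p_{da,d} & p_{da,da} & p_{da,dc} & p_{da,dca} & p_{da,x}\\
0 & 0 & 0 & 0 & 0 & 0 & p_{dc,dc} & p_{dc,dca} & p_{dc,x}\\
0 & 0 & 0 & 0 & 0 & 0 & p_{dca,dc} & p_{dca,dca} & p_{dca,x}\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}, \tag{1}
$$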

where P(t) is a transition probability matrix representing the probabilities of transitioning between t and t + k, with pi,j (or just pij) being the probability of transitioning from state i to j over the interval of length k. Note that P(t) ≠ P; that is, the magnitude of the transition probabilities changes over time (age) so that the process is time inhomogeneous. Thus, there is a transition probability matrix for each time/age interval, and constructing an MSLT requires a collection of such matrices over the age range covered by the MSLT estimates, usually from some base age t = 0 to the oldest age (Ω). The transition matrices contain a number of structural zeros representing impossible transitions from quasi-absorbing to certain other states. For example, p31 = pc,h = 0, because it is impossible to return to the “healthy” state once one has been diagnosed with a chronic condition. Given a collection of P(t) across time, t = 0 … Ω, an MSLT can be produced to yield years to be lived in each state using typical MSLT calculations involving manipulations of matrices, as described later. The initial goal is thus to obtain estimates of the transition probabilities.
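The pattern of structural zeros in equation (1) follows from two rules stated earlier: diabetes and chronic conditions cannot be reversed, and death is reachable from every state. The sketch below derives the zero pattern from those rules. It is an illustration only; the encoding of states as sets of flags and all names are assumptions made here, not part of the authors' software.

```python
# Living states encoded as sets of flags:
# D = diabetes, C = chronic condition, A = ADL disability.
LIVING = {
    "h": set(), "a": {"A"}, "c": {"C"}, "ca": {"C", "A"},
    "d": {"D"}, "da": {"D", "A"}, "dc": {"D", "C"}, "dca": {"D", "C", "A"},
}
ORDER = ["h", "a", "c", "ca", "d", "da", "dc", "dca", "x"]  # as in equation (1)
IRREVERSIBLE = {"D", "C"}  # ADL disability (A) can be gained or lost


def allowed(origin, dest):
    """True if a transition (or retention) origin -> dest is possible."""
    if origin == "x":
        return dest == "x"  # death is absorbing
    if dest == "x":
        return True         # death is reachable from every living state
    # Every irreversible flag held at the origin must still be held at dest.
    return (LIVING[origin] & IRREVERSIBLE) <= LIVING[dest]


# 1 marks a free probability; 0 marks a structural zero.
mask = [[1 if allowed(i, j) else 0 for j in ORDER] for i in ORDER]
counts = {i: sum(row) for i, row in zip(ORDER, mask)}
```

Here `counts` includes retention and death, so it gives the number of nonzero entries in each row of equation (1). The parenthetical numbers in Figure 1 may follow a different counting convention.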

The data required for the method are panel data structured in “long” format in which each row of the data set represents a time interval between t and t + k months, years, or some other time unit, for individual i. We assume the intervals are evenly spaced. Columns in the data set must include (1) the state an individual is in at time t (“starting state”), (2) the state an individual is in at time t + k (“ending state”), and (3) the age of the individual at t. As many covariates as desired can also be included. Individuals contribute as many interval records to the data as there are survey waves, less 1, if they survive over the survey period (in which case they are right-censored at the end of the study period). Individuals who die prior to the end of the survey period contribute only as many interval records as they have available prior to their death. Those who drop because of attrition contribute as many interval records as are available; they are also censored cases. This is a typical setup for a discrete-time hazard modeling approach (Allison 1984). Note that, given this setup, for all persons, the ending state for the record that began at time t is the starting state for the record that begins at t + k.
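A minimal sketch of this long-format layout is given below. The function name, the field names, the two-unit spacing between waves, and the coding of death as "x" and of a missed wave as `None` are all assumptions made for illustration. The sketch also assumes that attrition is permanent, so a person's records stop at the first missed wave.

```python
def person_intervals(person_id, first_age, states, gap=2):
    """Turn one person's wave-by-wave states into person-interval records.

    states: observed state at each wave, in order; "x" = dead,
            None = not observed (attrition).
    gap:    hypothetical spacing between waves, in years.
    """
    records = []
    for wave in range(len(states) - 1):
        start, end = states[wave], states[wave + 1]
        if start is None or start == "x" or end is None:
            break  # death or attrition ends the person's contribution
        records.append({
            "id": person_id,
            "age": first_age + wave * gap,  # age at the start of the interval
            "start": start,
            "end": end,
        })
    return records


survivor = person_intervals(1, 50, ["h", "h", "d", "da"])   # waves less 1 records
decedent = person_intervals(2, 70, ["c", "ca", "x", "x"])   # stops at death
dropout = person_intervals(3, 60, ["h", "a", None, None])   # stops at attrition
```

Within a person, the ending state of each record is the starting state of the next one.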

Most approaches to MSLT estimation that include covariates and produce interval estimates of state expectancies use a regression modeling strategy that involves conditioning on the starting state, either by estimating separate models for each starting state or by including the starting state as a covariate. A problem with the former approach is that estimating separate models produces a piecewise approach to constructing transition probability matrices that ignores dependencies between model parameters across equations, as mentioned earlier. Another problem with that approach is that some starting states in a large state space may have very few sample members, making estimation of a model difficult or at least inefficient. A problem with the latter approach is that there may be many structural zeros, so that some starting state covariates imply a limited number of outcome states (or, some outcome states may be perfectly predicted by some starting states). In such cases, there may be a “separation” that prohibits estimation of parameters or, worse, produces estimates that are wrong (Cook, Niehaus, and Zuhlke 2018). This difficulty may be overcome by imposing parameter constraints that imply probabilities of zero for some transitions (something our approach can easily do via inclusion of strong priors), but a more direct approach is to model the transition as the outcome of the model, rather than the ending state as a function of the starting state. Modeling transitions as the outcome has the added advantage of allowing the effect of age to vary across transitions without necessitating the addition of interactions between age and the starting state covariate in each equation. Transition probabilities can then be recovered from the model results as described later.

An appropriate model for obtaining transition probabilities with multiple possible transitions is the multinomial logit model. Let p(yi = j) represent the probability that record i (a person-interval record) experiences outcome (transition) j. Then,

$$
p(y_i=j)=\frac{\exp(X_i\beta_j)}{\sum_{s=1}^{J}\exp(X_i\beta_s)} \tag{2}
$$

represents the relationship between a 1 × m vector of covariates for record i, Xi, and the probabilities of experiencing one of J possible transitions, through an m × J matrix of coefficients β, where βj is the jth column of the matrix. These probabilities can be inserted into a multinomial mass function to obtain a likelihood function:

$$
L(\beta \mid Y)\propto\prod_{i=1}^{n}\left(\prod_{j=1}^{J}p(y_i=j)^{I(y_i=j)}\right), \tag{3}
$$

where I(yi = j) is the indicator function indicating whether record i experienced outcome j. It is well known that the coefficients are not identified without constraints, so an arbitrary outcome is omitted as the reference outcome. In a Bayesian framework, priors for the matrix β can be incorporated to produce a posterior distribution: p(β | Y) ∝ p(β)L(β | Y), where p(β) is the prior for the parameter matrix, β.
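The multinomial logit probabilities, the log of the likelihood, and the identification problem can be illustrated with a small numerical sketch. All values below are hypothetical: m = 2 covariates (an intercept and age in decades), J = 3 outcomes, and three made-up records. The function names are likewise chosen only for this example.

```python
import math


def outcome_probs(x, beta):
    """Multinomial logit probabilities for one record.

    x:    list of m covariate values (the 1 x m vector X_i).
    beta: m x J nested list; column j holds the coefficients for outcome j.
    """
    n_outcomes = len(beta[0])
    scores = [sum(xk * row[j] for xk, row in zip(x, beta)) for j in range(n_outcomes)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(X, y, beta):
    """Sum over records of the log probability of the observed outcome."""
    return sum(math.log(outcome_probs(x, beta)[j]) for x, j in zip(X, y))


# Hypothetical coefficients; column 0 is the reference outcome (all zeros).
beta = [[0.0, -1.0, -3.0],   # intercepts
        [0.0, 0.2, 0.4]]     # age (in decades)

# Adding the same amount to every column of a row leaves probabilities unchanged.
shift = [0.7, -0.3]
beta_shifted = [[b + c for b in row] for row, c in zip(beta, shift)]

X = [[1.0, 5.0], [1.0, 6.5], [1.0, 8.0]]  # made-up records
y = [0, 1, 2]                             # made-up observed outcomes
```

Because `beta` and `beta_shifted` yield identical probabilities and therefore identical likelihoods, the data cannot distinguish between them. Fixing one column at zero removes this indeterminacy.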

Defining transitions as the outcome—rather than using the ending state as the outcome with a covariate for the starting state—resolves one limitation of the Lynch-Brown method: transitions that are not possible (i.e., structural zeros) are simply not included as an outcome in the multinomial regression model. However, defining transitions as the outcome increases the dimensionality of the multinomial outcome, posing another problem. On paper, the number of outcomes in a model is theoretically unlimited, but parameters of the multinomial logit model become increasingly difficult to estimate when the dimensionality becomes large, in both maximum likelihood and Bayesian settings.

In the Bayesian setting, the goal is to summarize the posterior distribution for parameters. Common summaries (e.g., the mean, the median, and quantiles) involve integral calculus, which is difficult in high dimensions. Thus, contemporary Bayesian analyses typically involve simulating parameters from the posterior distribution and constructing sample values for the desired summary measures. A benefit of the sampling approach is that functions (or functionals) of the samples can be used directly to construct distributions of quantities (e.g., state expectancies) that were not directly estimated in the model.

MCMC methods are a class of methods used in Bayesian statistics that involve (1) constructing rules for producing a Markov chain with a stationary distribution that is proportional to the posterior distribution of interest and (2) randomly generating sequences of values from the Markov chain. Because of its efficiency and speed, the most commonly used sampler is the Gibbs sampler, which involves iteratively sampling from conditional distributions for model parameters. In general, consider a posterior distribution f(θ | Y) in which θ is a collection of parameters θ1 … θq. A Gibbs sampler would involve finding $f(\theta_i \mid \theta_{-i}, Y)$ for each $\theta_i$ ($\theta_{-i}$ is the vector θ excluding the ith component) and then iterating the following steps:

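  1. Select starting values for all θ and set g = 1;

  2. simulate $\theta_1^{(g)} \sim f(\theta_1 \mid \theta_2^{(g-1)}, \ldots, \theta_q^{(g-1)}, Y)$;

  3. simulate $\theta_2^{(g)} \sim f(\theta_2 \mid \theta_1^{(g)}, \theta_3^{(g-1)}, \ldots, \theta_q^{(g-1)}, Y)$;

  4. continuing likewise through the remaining parameters, simulate $\theta_q^{(g)} \sim f(\theta_q \mid \theta_1^{(g)}, \ldots, \theta_{q-1}^{(g)}, Y)$;

  5. increment g and return to step 2;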

where in each subsequent conditional distribution, θ is set to its most recently sampled value, as indicated by the parenthetical superscripts.
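The Gibbs steps can be illustrated with a deliberately simple target. The sketch below assumes a standard bivariate normal "posterior" with correlation ρ, for which both full conditionals are known normal distributions. It is a toy example, not the sampler used for the multinomial models discussed here, and the function name `gibbs_bivariate_normal` is hypothetical.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampler for (theta_1, theta_2) ~ N(0, [[1, rho], [rho, 1]]).

    Full conditionals: theta_1 | theta_2 ~ N(rho * theta_2, 1 - rho^2),
    and symmetrically for theta_2 | theta_1.
    """
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho ** 2)
    theta = np.zeros((n_iter + 1, 2))        # row 0 holds the starting values
    for g in range(1, n_iter + 1):
        # theta_1^(g) is drawn given theta_2^(g-1)
        theta[g, 0] = rng.normal(rho * theta[g - 1, 1], sd)
        # theta_2^(g) is drawn given the freshly sampled theta_1^(g)
        theta[g, 1] = rng.normal(rho * theta[g, 0], sd)
    return theta[1:]

draws = gibbs_bivariate_normal(rho=0.6, n_iter=20000)
sample_corr = np.corrcoef(draws[1000:].T)[0, 1]   # discard 1,000 draws as burn-in
```

After burn-in, the sample correlation of the draws should be close to the ρ used to define the conditionals, which is a quick check that the chain targets the intended joint distribution.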

The multinomial probit model is often used in the Bayesian context for models with mutually exclusive outcomes, because Gibbs samplers are fairly easy to implement using data augmentation strategies; that is, additional steps within the Gibbs sampler in which the observed multinomial outcomes are replaced by continuous, latent variables that follow truncated normal distributions (Albert and Chib 1993; Imai and Van Dyk 2005; McCulloch and Rossi 1994). Lynch and Brown (2005) discussed this strategy in depth in developing their method. In brief, if β represents the parameters of a multinomial probit regression model, Y is the vector of observed multinomial outcomes, and Z is the vector of augmented (latent) data, the conditional distribution f(β | Z) is multivariate normal, and f(Z | β, Y) is truncated multivariate normal, so the Gibbs sampler simply involves two steps, both of which are straightforward to implement, as long as the dimensionality of Y is not too large.

In contrast, simulation of multinomial logit model parameters requires more general Metropolis-Hastings or other MCMC samplers, because no analogous data augmentation strategy yields a conditional posterior distribution for β that is a known form. One such general sampling strategy is the “independence sampler.” In the independence sampler, a “proposal distribution,” g(θ), is used to generate “candidate” values, θc, that are evaluated for acceptance as coming from the posterior distribution of interest, f(θ | Y). The proposal distribution may be any distribution from which it is easy to sample, but the more closely it follows the posterior distribution, the better it performs. A general independence sampler can be implemented as follows:

  1. Select starting values for all θ and set g = 1.

  2. Simulate θc ~ g(θ).

  3. Compute the ratio $R = \dfrac{f(\theta^{c} \mid Y)\, g(\theta^{(g-1)})}{f(\theta^{(g-1)} \mid Y)\, g(\theta^{c})}$.

  4. Generate u ~ U(0,1). If R > u, set θ(g) = θc; otherwise set θ(g) = θ(g–1).

  5. Increment g and return to step 2.

In this algorithm, evaluating candidates using the ratio R ensures that the sampled values follow the posterior distribution, despite having been generated from the proposal distribution: proposed values are accepted in proportion to their relative probability under the posterior distribution (the first part of the ratio), adjusting for differences in probability of being proposed under the proposal distribution (the latter part of the ratio).
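A minimal sketch of the independence sampler follows, working on the log scale to avoid numerical underflow. The target and proposal used here (an unnormalized N(1, 1) target and a wider N(0, 2²) proposal) are assumptions chosen only so the result can be checked. In the multinomial logit setting, the target would be the log posterior for β and the proposal a multivariate normal. The function name `independence_sampler` is hypothetical.

```python
import numpy as np

def independence_sampler(log_target, log_proposal, draw_proposal,
                         theta0, n_iter, seed=0):
    """Metropolis-Hastings with a proposal that ignores the current value."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_iter)
    current = theta0
    n_accept = 0
    for g in range(n_iter):
        cand = draw_proposal(rng)
        # log R = [log f(cand) - log g(cand)] - [log f(current) - log g(current)]
        log_r = ((log_target(cand) - log_proposal(cand))
                 - (log_target(current) - log_proposal(current)))
        if np.log(rng.uniform()) < log_r:    # accept when R > u
            current = cand
            n_accept += 1
        draws[g] = current                   # on rejection, repeat the old value
    return draws, n_accept / n_iter

def log_target(theta):       # unnormalized log density of N(1, 1)
    return -0.5 * (theta - 1.0) ** 2

def log_proposal(theta):     # unnormalized log density of N(0, 2^2)
    return -0.5 * (theta / 2.0) ** 2

draws_is, accept_rate = independence_sampler(
    log_target, log_proposal, lambda rng: rng.normal(0.0, 2.0),
    theta0=0.0, n_iter=20000)
```

Normalizing constants of both densities cancel in R, so unnormalized log densities are sufficient. The proposal is deliberately wider than the target, since a proposal with lighter tails than the posterior can leave the chain stuck in the tails.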

In the context of the multinomial logit model, we found that the independence sampler with a multivariate normal proposal distribution with mean equal to the ML estimate, and a covariance matrix equal to the covariance matrix of estimates obtained from common software such as Stata, works fairly well when (1) the sample size is large so there are few rare transitions in the data, (2) prior distributions for model parameters are relatively noninformative, and (3) the dimensionality of the multinomial outcome is small to medium (e.g., fewer than 15–20 transitions are possible in the state space).

In general, for models in which the number of possible transitions is low (e.g., <10), the Gibbs sampler for the probit model and the independence sampler for the logit model both work well. For models in which the number of possible transitions exceeds 10 but is less than 25 or so, the independence sampler for the logit model works better than the Gibbs sampler for the probit model. However, for models in which the number of possible transitions exceeds 25, both the Gibbs sampler for the probit model and Metropolis-Hastings routines for the logit model, including the independence sampler, become computationally difficult or even infeasible.

Recent development of a Gibbs sampling routine for the logit model has resolved much of the previous difficulty in simulating parameters in high-dimensional logit models. Polson, Scott, and Windle (2013) developed a Gibbs sampler for the multinomial logit model using a data augmentation strategy involving Polya-Gamma latent variables. Details of the approach can be found in the authors’ original article and in their technical supplement. Compared with other Bayesian strategies for estimating multinomial logit models, this data augmentation strategy is more efficient and easier to use when the dimensionality of the outcome is large. In our own trials, this strategy converged and mixed orders of magnitude faster than an independence sampler and other MCMC algorithms. Thus, we rely on this new Gibbs sampling approach in our example here but note that an independence sampler may work well for state spaces that are larger than that of Lynch and Brown (2005) but smaller than our illustration here. See Zang, Lynch, and West (2021) and Zang et al. (forthcoming) for recent examples involving the independence sampler approach applied to smaller state spaces.

Step 2: Computing Age-Specific Transition Probability Matrices

For our purposes, all J possible transitions across a time interval but one are treated as an outcome in the multinomial logit model, with all covariates x1 … xm (including an intercept and an age variable) included as predictors of each transition. After implementing either a Gibbs sampler or an independence (or other MCMC) sampler, we will have g = 1 … G samples of β[m × (J–1)]. For life table construction, we first determine a covariate profile for which we wish to generate a life table, and we construct an N × m matrix Z, where N is the number of age groups included in the life table calculations. We use a to represent age groups, rather than specific ages, to simplify subsequent notation. If study intervals are k years long, then a = 1 corresponds to the youngest age at which life table estimation begins (age t = 0), a = 2 corresponds to age k, a = 3 corresponds to age 2k, and so on. a = N corresponds to the final, open-ended age group traditionally referred to as Ω in demography.

Each row in Z consists of the fixed covariate values (and intercept) plus a value of age that is incremented across rows by the value k. The product Zβ(g) yields an N × (J – 1) matrix of predicted values (for parameter sample g) from which probabilities for each transition can be computed row-wise:

$$\mathrm{pr}_j(a) = \frac{\exp\left((Z\beta^{(g)})_{aj}\right)}{1 + \sum_{s=2}^{J} \exp\left((Z\beta^{(g)})_{as}\right)}.$$

The probability for the omitted transition can be computed by subtracting the sum of the computed probabilities from 1. The result is a matrix, f[Zβ(g)], from which age-specific transition probability matrices, P(a), a = 1 … N, can be constructed. Specifically, each row of f[Zβ(g)] is placed in a square matrix of dimension d × d on the basis of the starting and ending states represented by the probabilities. Structural zeros are included, and a final row of zeros with a trailing 1 is also included to reflect the fact that individuals cannot transition from the deceased state. Note that d is the number of states in the state space, and J is the total number of estimated possible transitions between them, so d < J < d².
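The construction of Z and of the matrix of transition probabilities for one posterior draw can be sketched as follows. The covariate profile (intercept, age, and a single covariate held at 0.444), the number of transitions J = 5, and the randomly generated stand-in for a posterior draw `beta_g` are all assumptions made for illustration; in practice β(g) comes from the sampler output.

```python
import numpy as np

k = 2                                         # interval width in years
ages = np.arange(50, 111, k)                  # 50, 52, ..., 110
N = len(ages)                                 # number of age groups

# Hypothetical covariate profile: intercept, age, one covariate fixed at 0.444
Z = np.column_stack([np.ones(N), ages, np.full(N, 0.444)])

rng = np.random.default_rng(1)
m, J = Z.shape[1], 5
beta_g = rng.normal(0.0, 0.02, size=(m, J - 1))   # stand-in for one posterior draw

eta = Z @ beta_g                                   # N x (J - 1) linear predictors
expeta = np.exp(eta)
denom = 1.0 + expeta.sum(axis=1, keepdims=True)    # the 1 is exp(0) for the reference
pr_estimated = expeta / denom                      # the J - 1 modeled transitions
pr_reference = 1.0 - pr_estimated.sum(axis=1, keepdims=True)
f_Zbeta = np.hstack([pr_reference, pr_estimated])  # N x J, each row sums to 1
```

Each row of `f_Zbeta` holds the probabilities of all J transitions at one age group, which is the input needed to fill the d × d matrix for that age.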

The rows of these square matrices are then normalized so that each row sums to 1 by adding all elements in a given row and dividing each element in the row by the row sum. This process of normalizing the estimated probabilities by row follows from the conditional probability rule that p(j | i) = p(i, j)/p(i).¹ Here, the probability of transitioning from state i to state j is the probability of being in state j at time t + k conditional on being in state i at time t. The multinomial logit model estimates the joint probability of being in state i at t and state j at t + k. The unconditional probability of being in state i at t is obtained by marginalizing the joint probability over j. Because the logit outcomes are mutually exclusive, this is just the sum of p(i, j) over all j. Dividing the joint probability by the unconditional probability of being in state i at t produces the conditional probability by definition. This produces a set of right-stochastic matrices, one for each age group, and the process is repeated across all parameter samples.
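The placement and row normalization can be sketched for a small, hypothetical state space with d = 3 states (0 = healthy, 1 = unhealthy, 2 = dead) and J = 6 modeled transitions. The joint probabilities below are made-up values that sum to 1, and the function name `transition_matrix` is an assumption.

```python
import numpy as np

def transition_matrix(joint_probs, transitions, d):
    """Place joint probabilities p(i, j) in a d x d matrix and normalize rows.

    transitions : list of (start_state, end_state) pairs, one per probability
    Returns the right-stochastic matrix and the unnormalized row sums p(i).
    """
    P = np.zeros((d, d))                       # structural zeros stay at 0
    for p, (i, j) in zip(joint_probs, transitions):
        P[i, j] = p
    row_sums = P.sum(axis=1, keepdims=True)    # p(i): marginal over end states
    P = np.divide(P, row_sums, out=np.zeros_like(P), where=row_sums > 0)
    P[d - 1, :] = 0.0                          # deceased state is absorbing
    P[d - 1, d - 1] = 1.0
    return P, row_sums.ravel()

transitions = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
joint_probs = [0.50, 0.10, 0.02, 0.05, 0.28, 0.05]
P, start_probs = transition_matrix(joint_probs, transitions, d=3)
```

The returned row sums are the unnormalized starting-state probabilities, which can also serve as a population-based radix at the first age group.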

In developing the method, we compared estimated transition probabilities computed under this approach to those obtained by treating the starting state as a covariate. Estimated probabilities are very similar, usually differing only in the third decimal place. These slight differences are attributable to the fact that, when the transition is the outcome in the model, the effect of age (and all other covariates) varies across all transitions. In contrast, when the ending state is the model outcome and the starting state is a covariate, no such interaction between covariates and the starting state is implied.

Step 3: Generating MSLTs

With a complete set of age-specific transition probability matrices for a specific covariate profile, we can construct MSLTs in a straightforward fashion using mostly traditional calculations. Let l be an N × d matrix of counts of individuals in each of the d states at the start of each age interval, and let l(a) reference row a of the matrix. Then, l(1) is the radix population (the number or proportion of persons in the synthetic cohort who begin the life table in each state), l(2) is the number of persons in each state at age k, l(3) is the number of persons in each state at age 2k, and so on, and l(a = N) is the number of persons alive at the start of the open-ended interval.

The radix population can either be derived from the row sums of the unnormalized probabilities in f[Zβ(g)] at age 0 (a = 1) to produce “population-based” life tables, or it can be fixed at specific values to produce “status-based” life tables. For example, we could set the radix such that all persons in the population are in a single, specific state at age 0 (e.g., diabetic without other chronic conditions defined previously or ADL disabilities) to estimate state expectancies for persons who enter the life table at age 0 in that state.

Given a radix, we can compute l(a) for all age groups except the last as

l(a+1)=l(a)P(a).

Person-years lived in each state in an age group can be computed using the linear assumption

L(a)=.5k[l(a)+l(a+1)],

where, again, k is the width of the age interval.

As with any life table procedure, the final age interval requires a different computation than other intervals. In particular, we compute L(N) as

$$L(N) = k\, l(N)\, [I - P(N)]^{-1},$$

where the elements corresponding to the deceased state (the last column of l(N) and the last row and column of P(N)) are omitted, and I is an identity matrix of matching dimension.

The more conventional calculation for years lived in the open-ended interval is $L(\Omega) = k\, l(a)\, \mu^{-1}$, where μ is the intensity or rate matrix of transitions in the oldest age group. In a continuous-time framework, this calculation provides the waiting time for absorption (death), assuming constant transition rates from age Ω forward. Our calculation simply assumes constant transition probabilities, rather than rates, from age Ω forward, so that waiting times are geometrically distributed. $[I - P(N)]^{-1}$ is the sum of the infinite geometric series implied by the transition probability matrix at age Ω. Given that we generate predicted transition probability matrices to age 110+ (the oldest age for which we often have some data in panels), there is almost no difference in estimates of state expectancies at age 0, nor at age Ω, between these approaches. We prefer this calculation to converting P(N) to a rate matrix and performing the usual computation as a matter of coherence with the discrete-time modeling strategy.
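The geometric-series interpretation can be checked numerically. The sketch below assumes a hypothetical 2 × 2 matrix of transition probabilities among living states in the open-ended interval (rows sum to less than 1 because the remainder die) and compares a long partial sum of its powers with the closed form.

```python
import numpy as np

# Hypothetical transition probabilities among two living states at age Omega
P_live = np.array([[0.70, 0.15],
                   [0.10, 0.60]])

closed_form = np.linalg.inv(np.eye(2) - P_live)   # [I - P]^{-1}

# Partial sum I + P + P^2 + ... ; converges because every row sum is below 1
partial_sum = np.zeros((2, 2))
term = np.eye(2)
for _ in range(500):
    partial_sum += term
    term = term @ P_live
```

Entry (i, j) of either matrix is the expected number of intervals spent in living state j before death for someone starting the open-ended interval in state i; multiplying by k converts intervals to years.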

The vector T(a), the person-years to be lived in each state from age (group) a forward, can be computed as

$$T(a) = \sum_{i=a}^{N} L(i).$$

Finally, state expectancies can be computed for each age group by dividing T(a) by Σl(a), where Σl(a) is the total number of persons alive at age a, regardless of state. This computation apportions the total years to be lived in each state from a given age forward across all persons surviving to the beginning of age group a.
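The life table calculations in this step can be collected into one short function. The sketch below is an illustration under simplifying assumptions: a hypothetical three-state space (healthy, unhealthy, dead), the same transition matrix at every age, and a radix placing everyone in the healthy state. The function name `multistate_life_table` is not from the original method's code.

```python
import numpy as np

def multistate_life_table(P_list, radix, k):
    """Compute l(a), L(a), T(a), and state expectancies e(a).

    P_list : list of N right-stochastic (d x d) matrices, death as last state
    radix  : length-d starting distribution l(1)
    k      : width of each age interval in years
    L, T, and e are reported for the d - 1 living states only.
    """
    N, d = len(P_list), len(radix)
    live = slice(0, d - 1)
    l = np.zeros((N, d))
    l[0] = radix
    for a in range(N - 1):
        l[a + 1] = l[a] @ P_list[a]                       # l(a+1) = l(a) P(a)
    L = np.zeros((N, d - 1))
    L[:-1] = 0.5 * k * (l[:-1, live] + l[1:, live])       # linear assumption
    I = np.eye(d - 1)
    L[-1] = k * l[-1, live] @ np.linalg.inv(I - P_list[-1][live, live])
    T = L[::-1].cumsum(axis=0)[::-1]                      # T(a) = sum_{i>=a} L(i)
    alive = l[:, live].sum(axis=1, keepdims=True)         # survivors to age a
    return l, L, T, T / alive

P_const = np.array([[0.8, 0.1, 0.1],
                    [0.2, 0.6, 0.2],
                    [0.0, 0.0, 1.0]])
P_list = [P_const] * 5
l, L, T, e = multistate_life_table(P_list, radix=np.array([1.0, 0.0, 0.0]), k=2)
```

Passing a radix concentrated in one state, as here, corresponds to a status-based table; passing the unnormalized starting-state probabilities at the first age group would correspond to a population-based table. Repeating the call for each posterior draw of the transition matrices yields a posterior sample of state expectancies.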

This method produces estimates that are a compromise between a period and a cohort life table. Panel data generally follow an accelerated longitudinal design: multiple birth cohorts are followed over an extended time period, but no birth cohort is observed over the complete age range observed in the study. Here, if birth cohort is included as a covariate in the multinomial regression model (and we argue it should be), then it must be fixed at a value for life table estimation. Fixing cohort at a specific value means the resulting life table is a cohort life table, but estimates will obviously be informed by patterns of other cohorts, implying that the results assume some degree of stationarity.

REGIONAL DIFFERENCES IN DIABETES AND ITS SEQUELAE

The prevalence of diabetes is increasing in the United States, and it is reducing LE in the population (Arias et al. 2003; Olshansky et al. 2005; Preston et al. 2018). The effect of diabetes on LE is indirect and due to its influence on downstream, potentially fatal health conditions, including heart disease and stroke, and physical disability, such as difficulty with performing ADLs (Giovannucci et al. 2010; Laditka and Laditka 2015; Pitocco et al. 2012; Wu et al. 2003). Preventing diabetes is thus an important concern for public health, as is understanding its implications once it is acquired. Few existing studies, however, have examined the implications of diabetes for years to be lived with other health conditions and disabilities (but see Laditka and Laditka 2015). Instead, most studies focus only on the implications of diabetes for the risk for a single downstream health outcome. Nonetheless, examination of the influence of diabetes on remaining life to be lived with subsequent conditions and disabilities is important, because it reflects the broader impact of diabetes on quality of life prior to death and has a much clearer interpretation than do regression coefficients or odds ratios.

Regional differences in diabetes prevalence in the United States are well documented, but the root causes of these differences are poorly understood (Barker et al. 2011; Danaei et al. 2009). Importantly, most existing studies on geographic disparities in health in the United States focus only on the effect of current residence (e.g., Dwyer-Lindgren et al. 2017). These studies consistently show that moving to an economically better area may increase LE and decrease rates of disease such as diabetes and obesity (e.g., Ludwig et al. 2011, 2012).

A small but growing body of literature has begun to consider the role birth region, rather than region of residence at time of interview, plays in affecting a variety of health conditions (e.g., Gilsanz et al. 2017; Glymour et al. 2013; Zheng and Tumin 2015), but most of these studies involve unrepresentative samples and focus on singular health outcomes. We contend that many disparities in health are, in fact, attributable to where respondents were born and raised, rather than where respondents live as adults at the time of the interview. The distinction between birth and current region has important implications for understanding the root causes of contemporary regional disparities in diabetes prevalence and its consequences. From our perspective, birth region reflects socialization into cultural norms, including diet and exercise patterns, as well as historical regional disparities in infrastructure and environment, whereas current region reflects current health care infrastructure disparities across locales (Zheng and Tumin 2015). Observed regional differences based on current region of residence are probably due to lack of geographic mobility for many (He and Schachter 2003; U.S. Census Bureau 2015), so current region simply acts as a reasonable proxy for birth region.

Including birth region along with current region clarifies the contribution of each, reducing measurement error bias in estimating the true influence of current regional conditions on health. Moreover, if our hypothesis is correct, remedying regional disparities in health requires policy interventions that influence cultural norms in childhood, and not necessarily structural changes in the physical environment. At a minimum, recognizing that disparities are driven by conditions in childhood can help redirect future studies to more precisely estimate the effect of structural deficiencies by modeling them at the right “lags,” rather than using current structural conditions as an imperfect proxy for historical ones.

In this application, we address the following questions: (1) How do diabetic LE and the percentage of remaining life to be spent diabetic at age 50 vary by birth region and current region? (2) How many years, on average, can persons with diabetes who also have ADLs or chronic conditions live after age 50, and does this vary by birth and current region? and (3) How does the percentage of remaining life to be spent with ADLs or chronic conditions at age 50 vary by birth and current region for persons with diabetes?

To answer these questions, we use the state space shown in Figure 1 and discussed earlier. We use data from the 1998 to 2014 waves of the Health and Retirement Study (HRS), and we use the RAND version of the data file to enhance replicability. The HRS is a popular data source for MSLT analyses (e.g., Ofstedal et al. 2019; Reuser, Bonneux, and Willekens 2009; Reuser, Willekens, and Bonneux 2011; Zaninotto et al. 2020; Zimmer and Rubin 2016). However, most MSLT applications are limited to two-living-states-plus-death state spaces, with relatively few exceptions (e.g., Dudel and Myrskylä 2017). We restrict our sample to individuals 50 years old or older who were interviewed in 1998 or were brought into the survey as members of new cohorts added in 2004 and 2010. We further restrict the sample to one individual per household, and we eliminate the foreign-born and individuals who ever lived out of the United States in any wave. After these restrictions, our sample size is 17,686, with a potential for 83,962 person-intervals. However, a number of respondents are missing health information at the beginning or ending of some intervals. After eliminating missing transitions, our analytic sample consists of 16,983 persons measured on 80,146 intervals (4 percent of respondents missing; 4.5 percent of transitions missing).

Our main covariates of interest in the multinomial logit model include the respondent’s region of birth and region of residence at the time of interview. Each measure has four categories, Northeast, Midwest, South, and West, as defined by the U.S. Census Bureau and measured here with dummy variables (reference: southern birth and southern current residence). Birth region is time invariant, but current region is observed on each survey occasion. The HRS collects three region measures: region at birth, region during adolescence, and region at each interview wave. We use birth region, rather than adolescence, because the measurement of adolescent region depends on respondents’ educational experience: the question asks where respondents lived when they were in school. Very few respondents differ in their reported birth and adolescent regions, so we opt to use birth region, given its cleaner measurement. Other predictors include age (in decimal form on the basis of month of birth and month of interview; range = 50.2–109.7 years), sex (male = 1), race (Black or other race = 1; reference: White), Hispanic ethnicity (Hispanic = 1), birth cohort (birth year − 1900), marital status (married = 1, unmarried = 0), and years of schooling. Some covariates are included so that regional differences in them can be controlled; others are included because there are known sociodemographic differences in LE and health across their values. For example, we may be interested in producing life tables for both men and women because of the substantial sex differences in life expectancies. Furthermore, there are large compositional differences in education across U.S. regions, in part reflecting selective migration. Thus, we control for education to address this issue. In both cases, large health and mortality disparities exist across the variables, so their inclusion in the multinomial model facilitates the construction of precise life table estimates for specific subpopulations for “apples-to-apples” comparisons. Table 1 shows descriptive statistics for these variables.

Table 1.

Descriptive Statistics for Covariates Included in the Multinomial Logit Model for Health Transitions

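| Variable | Mean (S.D.) [Range] or Percentage |
| --- | --- |
| Birth cohort | 35.3 (11.7) [–8 to 59] |
| Age (years) | 69.4 (10.7) [50.2 to 109.7] |
| Male | 44.40 |
| Race: White (reference) | 79 |
| Race: Black | 18.00 |
| Race: Other race | 3.00 |
| Hispanic | 4.60 |
| Education (years) | 12.5 (3.0) [0 to 17] |
| Married | 52.10 |
| Birth region: South (reference) | 40.30 |
| Birth region: Northeast | 20.70 |
| Birth region: Midwest | 30.00 |
| Birth region: West | 9.00 |
| Current region: South (reference) | 41.40 |
| Current region: Northeast | 15.30 |
| Current region: Midwest | 26.10 |
| Current region: West | 17.20 |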

Source: Health and Retirement Study data, 1998 to 2014.

Note: The cohort is computed as birth year − 1900. Descriptive statistics include all transitions (n = 80,146).

Our modeled outcomes are the transitions experienced by respondents across time intervals, as indicated by the state space shown in Figure 1. Table 2 shows the counts of observed transitions in the person-interval data set across all intervals. The cell representing a transition from having ADL disability (A) to being diabetic with chronic conditions (DC) had only 10 observations. Because of convergence issues, we coded these respondents as ending their interval with diabetes, chronic conditions, and ADL disability (DCA), simply assuming that ADL disability status was misreported. After omitting this transition type and one transition as the reference, we are left with a 42-dimensional multinomial outcome, which is the count of the nonzero entries shown in equation (1).

Table 2.

Counts of Observed Transitions between States across All Transition Intervals

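Rows give the state at Time 1 (transition start); columns give the state at Time 2 (end of transition).

| Time 1 (Transition Start) | H | A | C | CA | D | DA | DC | DCA | Dead |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Healthy (H) | 27,954 | 1,867 | 2,227 | 434 | 755 | 60 | 96 | 30 | 680 |
| ADL (A) | 1,308 | 2,062 | 152 | 371 | 42 | 79 | 10ᵃ | 25 | 504 |
| Condition (C) | 0 | 0 | 14,501 | 2,346 | 0 | 0 | 522 | 108 | 1,431 |
| Condition and ADL (CA) | 0 | 0 | 1,346 | 3,710 | 0 | 0 | 59 | 193 | 1,968 |
| Diabetic (D) | 0 | 0 | 0 | 0 | 4,182 | 471 | 471 | 123 | 169 |
| Diabetic and ADL (DA) | 0 | 0 | 0 | 0 | 315 | 595 | 60 | 152 | 181 |
| Diabetic and condition (DC) | 0 | 0 | 0 | 0 | 0 | 0 | 3,540 | 976 | 581 |
| Diabetic, condition, and ADL (DCA) | 0 | 0 | 0 | 0 | 0 | 0 | 615 | 1,968 | 907 |
| Dead | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | All |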

Source: Health and Retirement Study data, 1998 to 2014.

Note: ADL means having at least one ADL disability; condition means having at least one chronic condition.

n = 80,146 total transitions observed. ADL = activities of daily living.

ᵃ Observations in this cell were assigned to DCA because of convergence issues.

We used the Gibbs sampler, discussed earlier, to sample parameters from the posterior distribution for the multinomial logit model, rather than a multinomial probit or independence sampler for the logit, because of the high dimensionality of the multinomial outcome (Polson et al. 2013; Rossi, Allenby, and McCulloch 2005). We ran the Gibbs sampler twice from two randomly selected sets of starting values, generating 2,500 posterior draws for each. After monitoring convergence and mixing across the two chains, we dropped the first 500 values from each and thinned each chain to every fourth iteration, yielding 1,000 posterior draws.
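The burn-in and thinning arithmetic can be verified with array slicing. In this sketch the draws themselves are random placeholders for three hypothetical parameters; only the chain counts, lengths, burn-in, and thinning interval are taken from the text.

```python
import numpy as np

n_chains, n_draws, burn_in, thin = 2, 2500, 500, 4

rng = np.random.default_rng(0)
# Placeholder array standing in for sampler output: chains x draws x parameters
chains = rng.normal(size=(n_chains, n_draws, 3))

kept = chains[:, burn_in::thin, :]                # drop burn-in, keep every 4th
posterior = kept.reshape(-1, kept.shape[-1])      # pool the two chains
```

Each chain contributes (2,500 − 500) / 4 = 500 retained draws, so pooling two chains gives the 1,000 posterior draws used for life table construction.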

Trace plots, and the 1,000 posterior draws themselves, are available in the online supplement. Polson et al. (2013) produced an R package to implement their method, but at the time of writing, this package was (and is currently) unsupported by R. Thus, we downloaded the relevant files from GitHub (https://github.com), “cannibalized” the Polya-Gamma simulation functions from the original package, and adapted them for our purposes.

For each posterior sample, we computed a set of 31 age-specific probabilities for each outcome transition from age 50 to Ω = 110 +, by 2-year intervals, and then renormalized them by row, as discussed previously. We generated MSLTs for each of the possible 16 combinations of birth and current region, but set values for sex, race, marital status, and birth cohort to the overall means for the sample, effectively controlling for birth and current regional differences in these covariates. For illustrative purposes, education was set to region-specific means, so its value is not controlled across regions. However, subsequent life tables could be generated in which education is set to the sample mean, and those results could be compared to the ones we present here to evaluate the extent to which regional differences in educational attainment account for regional differences in state expectancies.
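The 16 birth-by-current region profiles can be enumerated as dummy-variable rows, with South as the reference category for both measures. The variable names (`birth_NE`, `current_W`, and so on) and the helper `region_dummies` are hypothetical; the remaining covariates, fixed at their chosen values, would be appended to each row before building Z.

```python
from itertools import product

regions = ["S", "NE", "MW", "W"]               # South is the reference category
dummies = [r for r in regions if r != "S"]

def region_dummies(region, prefix):
    """Return 0/1 indicators for the non-reference regions."""
    return {f"{prefix}_{r}": int(region == r) for r in dummies}

profiles = []
for birth, current in product(regions, regions):
    row = {"label": f"{birth}-{current}"}
    row.update(region_dummies(birth, "birth"))
    row.update(region_dummies(current, "current"))
    profiles.append(row)

labels = [p["label"] for p in profiles]
```

The resulting labels (S-S, S-NE, S-MW, S-W, NE-S, and so on) follow the birth-current ordering used in the results table.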

Table 3 shows posterior mean estimates for life expectancies with 84 percent credible intervals, including total LE, healthy LE, diabetic LE, LE with chronic conditions, LE with ADL disability, and LE with all three health issues. Our choice of 84 percent credible intervals is motivated by the literature on frequentist inference, which suggests that using 83 percent or 84 percent confidence intervals when comparing two intervals gives comparable results to a formal t test on the difference between means at the α = .05 level, under the assumption the standard errors are roughly equal (Payton, Greenstone, and Schenker 2003). That is, observing that two 84 percent credible intervals do not overlap is roughly equivalent to rejecting the null hypothesis that the two values are equal at the p < .05 level.
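Equal-tailed credible intervals are simple quantiles of the posterior draws. The sketch below computes 84 percent intervals and checks whether two of them overlap; the two sets of draws are synthetic normal samples generated only for illustration and are not results from the analysis. The helper names `credible_interval` and `intervals_overlap` are assumptions.

```python
import numpy as np

def credible_interval(draws, level=0.84):
    """Equal-tailed interval: cut (1 - level) / 2 from each tail."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [tail, 1.0 - tail])
    return float(lo), float(hi)

def intervals_overlap(a, b):
    """True when intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

rng = np.random.default_rng(0)
draws_a = rng.normal(10.0, 0.35, size=1000)    # synthetic stand-in, group A
draws_b = rng.normal(12.0, 0.50, size=1000)    # synthetic stand-in, group B

ci_a = credible_interval(draws_a)
ci_b = credible_interval(draws_b)
overlap = intervals_overlap(ci_a, ci_b)
```

Because the posterior draws for two profiles come from the same parameter samples, the difference between two expectancies can also be computed draw by draw and summarized directly, which avoids relying on the equal-standard-error approximation behind the overlap rule.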

Table 3.

Posterior Means and 84 Percent Credible Intervals for Various State Expectancies at Age 50, by Birth Region and Region of Current Residence

Population-Based Results: years

| Birth-Current Region | TLE | HLE | XLE | CLE | DLE | XCDLE |
| --- | --- | --- | --- | --- | --- | --- |
| S-S | 28.4 [27.9, 28.9] | 12.0 [11.6, 12.5] | 6.8 [6.3, 7.3] | 12.7 [12.2, 13.2] | 5.6 [5.4, 5.8] | 1.8 [1.6, 1.9] |
| S-NE | 29.7 [28.7, 30.8] | 12.5 [11.7, 13.2] | 7.3 [6.4, 8.2] | 13.4 [12.3, 14.4] | 6.0 [5.6, 6.4] | 1.9 [1.6, 2.2] |
| S-MW | 29.7 [28.8, 30.6] | 12.0 [11.4, 12.7] | 8.0 [7.1, 9.0] | 13.7 [12.9, 14.6] | 6.0 [5.6, 6.4] | 2.0 [1.7, 2.3] |
| S-W | 30.2 [29.3, 31.0] | 13.3 [12.6, 14.0] | 6.4 [5.6, 7.2] | 13.3 [12.5, 14.2] | 6.0 [5.6, 6.4] | 1.7 [1.4, 2.0] |
| NE-S | 29.7 [28.9, 30.6] | 14.0 [13.3, 14.7] | 5.9 [5.2, 6.7] | 12.4 [11.6, 13.2] | 4.7 [4.3, 5.0] | 1.3 [1.1, 1.6] |
| NE-NE | 30.9 [30.2, 31.6] | 14.2 [13.6, 14.8] | 6.2 [5.6, 6.9] | 13.0 [12.3, 13.6] | 5.1 [4.8, 5.4] | 1.4 [1.2, 1.6] |
| NE-MW | 31.3 [30.2, 32.4] | 14.1 [13.3, 14.9] | 7.0 [6.0, 8.0] | 13.5 [12.6, 14.6] | 5.0 [4.6, 5.4] | 1.5 [1.2, 1.8] |
| NE-W | 31.2 [30.2, 32.2] | 14.9 [14.1, 15.8] | 5.6 [4.8, 6.4] | 12.9 [12.0, 14.0] | 5.1 [4.7, 5.6] | 1.3 [1.0, 1.6] |
| MW-S | 30.1 [29.3, 30.9] | 13.9 [13.2, 14.5] | 6.2 [5.6, 7.0] | 12.8 [12.0, 13.5] | 4.8 [4.5, 5.1] | 1.3 [1.1, 1.6] |
| MW-NE | 32.4 [31.3, 33.6] | 14.9 [14.0, 15.8] | 6.6 [5.6, 7.6] | 13.8 [12.7, 14.9] | 5.0 [4.6, 5.5] | 1.3 [1.0, 1.6] |
| MW-MW | 31.2 [30.5, 31.9] | 13.7 [13.2, 14.2] | 7.2 [6.5, 7.9] | 13.7 [13.1, 14.3] | 5.2 [4.9, 5.4] | 1.5 [1.4, 1.7] |
| MW-W | 31.4 [30.5, 32.3] | 14.7 [14.0, 15.5] | 5.7 [4.9, 6.5] | 13.2 [12.4, 14.1] | 5.3 [4.9, 5.6] | 1.3 [1.0, 1.6] |
| W-S | 30.1 [28.9, 31.3] | 13.7 [12.7, 14.6] | 7.1 [6.0, 8.3] | 12.8 [11.7, 13.9] | 4.9 [4.4, 5.4] | 1.5 [1.2, 1.9] |
| W-NE | 31.6 [30.2, 33.2] | 14.2 [13.1, 15.4] | 7.6 [6.2, 9.1] | 13.6 [12.3, 15.1] | 5.2 [4.7, 5.8] | 1.7 [1.3, 2.2] |
| W-MW | 31.2 [29.8, 32.5] | 13.4 [12.4, 14.4] | 8.4 [7.0, 10.0] | 13.9 [12.5, 15.3] | 5.3 [4.8, 5.9] | 1.9 [1.4, 2.4] |
| W-W | 30.8 [30.0, 31.7] | 14.2 [13.5, 15.0] | 6.5 [5.7, 7.4] | 13.1 [12.3, 13.9] | 5.4 [5.1, 5.8] | 1.5 [1.2, 1.8] |

Population-Based Results: percentages

| Birth-Current Region | %HLEᵃ | %XLEᵃ | %CLEᵃ | %DLEᵃ | %XCDLEᵃ |
| --- | --- | --- | --- | --- | --- |
| S-S | 42.4 [41.1, 43.7] | 24.1 [22.5, 25.7] | 44.8 [43.4, 46.3] | 19.7 [19.1, 20.4] | 6.2 [5.7, 6.8] |
| S-NE | 42.0 [39.5, 44.3] | 24.6 [21.7, 27.3] | 44.9 [42.4, 47.7] | 20.1 [18.9, 21.4] | 6.4 [5.3, 7.5] |
| S-MW | 40.5 [38.4, 42.6] | 26.9 [24.3, 29.7] | 46.2 [44.0, 48.5] | 20.1 [19.0, 21.2] | 6.8 [5.9, 7.8] |
| S-W | 44.0 [41.9, 46.0] | 21.2 [18.9, 23.6] | 44.2 [42.0, 46.5] | 19.8 [18.6, 21.0] | 5.6 [4.7, 6.5] |
| NE-S | 47.0 [44.7, 49.1] | 19.9 [17.7, 22.4] | 41.8 [39.5, 44.1] | 15.7 [14.7, 16.7] | 4.4 [3.7, 5.3] |
| NE-NE | 46.1 [44.4, 48.0] | 20.1 [18.2, 22.0] | 42.0 [40.2, 43.9] | 16.5 [15.8, 17.3] | 4.5 [3.9, 5.1] |
| NE-MW | 45.1 [42.6, 47.6] | 22.2 [19.4, 25.1] | 43.2 [40.7, 45.9] | 15.9 [14.8, 17.1] | 4.8 [3.9, 5.8] |
| NE-W | 47.9 [45.3, 50.4] | 17.8 [15.4, 20.3] | 41.5 [38.9, 44.1] | 16.4 [15.2, 17.7] | 4.1 [3.3, 5.1] |
| MW-S | 46.1 [44.0, 48.0] | 20.7 [18.6, 22.9] | 42.4 [40.4, 44.5] | 15.8 [14.9, 16.7] | 4.4 [3.7, 5.2] |
| MW-NE | 46.0 [43.4, 48.6] | 20.2 [17.5, 23.0] | 42.4 [39.6, 45.1] | 15.4 [14.3, 16.6] | 4.0 [3.2, 4.9] |
| MW-MW | 44.0 [42.4, 45.5] | 23.0 [21.2, 25.1] | 43.9 [42.3, 45.5] | 16.6 [16.0, 17.2] | 4.9 [4.4, 5.4] |
| MW-W | 47.0 [44.8, 49.1] | 18.3 [15.9, 20.6] | 42.1 [39.9, 44.4] | 16.7 [15.8, 17.7] | 4.1 [3.3, 4.9] |
| W-S | 45.5 [42.7, 48.2] | 23.6 [20.3, 27.2] | 42.6 [39.6, 45.7] | 16.2 [14.9, 17.5] | 5.1 [4.0, 6.3] |
| W-NE | 45.0 [41.5, 48.4] | 24.0 [20.0, 28.2] | 42.9 [39.2, 46.6] | 16.6 [15.0, 18.2] | 5.3 [4.0, 6.9] |
| W-MW | 43.1 [40.0, 45.9] | 26.8 [22.9, 31.2] | 44.6 [41.1, 48.0] | 17.1 [15.6, 18.7] | 6.0 [4.4, 7.6] |
| W-W | 46.2 [44.1, 48.2] | 21.0 [18.6, 23.6] | 42.4 [40.1, 44.7] | 17.6 [16.6, 18.7] | 4.8 [4.0, 5.6] |

Status-Based Results: years

| Birth-Current Region | XLE | CLE | DLE | XCDLE |
| --- | --- | --- | --- | --- |
| S-S | 22.0 [20.4, 23.6] | 10.0 [9.0, 10.9] | 5.7 [5.2, 6.3] | 4.0 [3.6, 4.4] |
| S-NE | 22.9 [20.8, 25.2] | 10.0 [8.4, 11.6] | 6.2 [5.3, 7.4] | 4.0 [3.3, 4.8] |
| S-MW | 23.8 [21.6, 26.1] | 11.0 [9.4, 12.5] | 6.3 [5.4, 7.2] | 4.2 [3.6, 4.9] |
| S-W | 23.2 [21.0, 25.3] | 11.3 [9.8, 12.9] | 6.1 [5.2, 7.0] | 4.6 [3.8, 5.3] |
| NE-S | 21.9 [19.1, 24.4] | 9.0 [7.6, 10.5] | 4.7 [3.9, 5.5] | 3.3 [2.7, 4.0] |
| NE-NE | 23.1 [21.1, 25.2] | 9.0 [7.9, 10.1] | 5.3 [4.7, 6.1] | 3.3 [2.9, 3.8] |
| NE-MW | 24.0 [21.1, 27.0] | 10.1 [8.1, 12.0] | 5.2 [4.2, 6.1] | 3.5 [2.7, 4.3] |
| NE-W | 23.1 [20.3, 25.9] | 10.3 [8.6, 12.1] | 5.3 [4.4, 6.3] | 3.9 [3.1, 4.7] |
| MW-S | 23.0 [20.8, 25.2] | 9.6 [8.3, 10.9] | 4.8 [4.1, 5.4] | 3.4 [2.8, 4.0] |
| MW-NE | 24.6 [22.0, 27.5] | 9.7 [7.9, 11.5] | 5.0 [4.1, 5.9] | 3.3 [2.5, 4.0] |
| MW-MW | 24.8 [22.9, 26.7] | 10.4 [9.4, 11.4] | 5.4 [4.8, 6.0] | 3.6 [3.3, 4.0] |
| MW-W | 23.5 [21.1, 25.9] | 10.5 [9.0, 12.0] | 5.3 [4.5, 6.1] | 3.9 [3.2, 4.7] |
| W-S | 23.3 [20.0, 26.5] | 9.8 [7.8, 11.8] | 5.0 [4.0, 6.1] | 3.3 [2.5, 4.3] |
| W-NE | 24.8 [21.1, 28.8] | 9.8 [7.6, 12.3] | 5.6 [4.3, 6.9] | 3.3 [2.4, 4.3] |
| W-MW | 25.2 [21.7, 28.7] | 10.9 [8.5, 13.5] | 5.8 [4.5, 7.0] | 3.7 [2.7, 4.8] |
| W-W | 24.1 [22.0, 26.2] | 10.7 [9.3, 12.4] | 5.8 [5.0, 6.7] | 3.9 [3.2, 4.6] |

Status-Based Results: percentages

| Birth-Current Region | %CLEᵃ | %DLEᵃ | %XCDLEᵃ |
| --- | --- | --- | --- |
| S-S | 45.3 [41.8, 48.8] | 26.1 [24.0, 28.5] | 18.1 [16.4, 19.8] |
| S-NE | 43.6 [38.3, 50.0] | 27.3 [23.5, 31.5] | 17.6 [14.6, 20.8] |
| S-MW | 46.2 [40.7, 51.6] | 26.3 [23.3, 29.6] | 17.8 [15.3, 20.5] |
| S-W | 48.7 [42.7, 54.5] | 26.4 [22.8, 30.2] | 19.7 [16.8, 23.0] |
| NE-S | 41.4 [36.4, 46.8] | 21.5 [18.8, 24.3] | 15.1 [12.6, 17.5] |
| NE-NE | 39.0 [35.0, 43.3] | 23.2 [20.7, 25.8] | 14.5 [12.7, 16.5] |
| NE-MW | 42.0 [35.8, 48.6] | 21.5 [18.5, 24.8] | 14.7 [12.0, 17.6] |
| NE-W | 44.8 [37.9, 51.7] | 23.0 [19.5, 26.5] | 17.1 [13.9, 20.3] |
| MW-S | 41.7 [36.6, 46.8] | 20.8 [18.3, 23.5] | 14.7 [12.5, 17.2] |
| MW-NE | 39.5 [33.2, 46.1] | 20.4 [17.2, 23.8] | 13.3 [10.6, 16.4] |
| MW-MW | 41.9 [38.3, 45.4] | 21.8 [19.9, 23.7] | 14.7 [13.1, 16.3] |
| MW-W | 44.8 [38.4, 50.7] | 22.4 [19.2, 26.0] | 16.6 [13.7, 19.8] |
| W-S | 42.1 [34.8, 49.3] | 21.7 [18.0, 25.5] | 14.4 [11.0, 18.4] |
| W-NE | 39.7 [31.4, 48.5] | 22.5 [18.0, 27.2] | 13.3 [9.8, 17.6] |
| W-MW | 43.3 [34.7, 51.4] | 23.0 [18.7, 27.5] | 14.7 [11.1, 18.7] |
| W-W | 44.4 [38.8, 50.6] | 24.1 [20.9, 27.3] | 16.1 [13.4, 19.1] |

Note: Regions include South (S), Northeast (NE), Midwest (MW), and West (W). State expectancies include total life expectancy (TLE), healthy life expectancy (HLE), diabetic life expectancy (XLE), life expectancy with chronic conditions (CLE), life expectancy with activities of daily living disability (DLE), and life expectancy with all three health issues (XCDLE).

a Percentage of remaining TLE.

The table contains a considerable amount of information. The top half of the table reports results of population-based life tables for persons born and currently living in all 16 possible birth and current region combinations. This portion of the table reports years to be lived in each selected state for the “average” person at age 50. The bottom half of the table reports results from status-based life tables. This portion of the table shows years to be lived in each selected state for persons who are diabetic at age 50 but who do not yet have another chronic condition (defined previously) or disability. There are numerous differences across regions in both the population-based and status-based results, but these are perhaps best displayed visually. Thus, we provide this table for detailed perusal, but we highlight some of the key results and advantages of our approach over alternative MSLT methods.

First, as Table 3 shows, state expectancies can be aggregated or divided in various ways to obtain meaningful life table quantities, with uncertainty measured by credible intervals. For example, if we are interested in diabetic LE, we can simply sum all expectancies in states that include diabetes, that is, eD, eDC, eDA, and eDCA. Similarly, we can calculate LE with chronic conditions by summing the expectancies in states that include chronic conditions: eC, eDC, eCA, and eDCA. LE with ADL disability can be calculated by summing the expectancies in states that include ADL disability: eA, eDA, eCA, and eDCA. Another useful statistic we can calculate from our life tables is the percentage of remaining life an individual can expect to live in a given state. For example, if we are interested in the percentage of remaining life a respondent can expect to live healthy, we can simply divide healthy life years by total life years within each posterior sample and then summarize the resulting ratios to obtain a credible interval for this quantity.
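The aggregation described above can be sketched in a few lines. This is a minimal illustration, not the authors' R code: the posterior draws are simulated with made-up numbers, the healthy state label "H" and the helper names (`aggregate`, `summarize`, `quantile`) are hypothetical, and equal-tailed intervals are assumed.

```python
import random

def quantile(sorted_vals, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval of a list of draws."""
    s = sorted(draws)
    tail = (1 - level) / 2
    return sum(s) / len(s), quantile(s, tail), quantile(s, 1 - tail)

def aggregate(draw, letter):
    """Sum expectancies over all states whose label contains `letter`."""
    return sum(v for label, v in draw.items() if letter in label)

# Hypothetical state labels: H = healthy; D, C, A = diabetes, chronic
# condition, ADL disability, alone or in combination.
STATES = ["H", "D", "C", "A", "DC", "DA", "CA", "DCA"]

# Simulated stand-in for posterior draws of state expectancies (made-up values).
rng = random.Random(1)
draws = [{s: rng.uniform(0.5, 6.0) for s in STATES} for _ in range(2000)]

# Aggregate within each draw first, then summarize across draws.
xle = [aggregate(d, "D") for d in draws]          # eD + eDC + eDA + eDCA
tle = [sum(d.values()) for d in draws]            # total life expectancy
pct_xle = [100 * x / t for x, t in zip(xle, tle)]

xle_mean, xle_lo, xle_hi = summarize(xle, level=0.84)
pct_mean, pct_lo, pct_hi = summarize(pct_xle, level=0.84)
```

Because sums and ratios are formed draw by draw before summarizing, the resulting intervals reflect the joint posterior uncertainty in the component expectancies.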

Figure 2 shows how diabetic LE (left panel) and the percentage of remaining life spent diabetic (right panel) at age 50 vary by birth region–current region combination, with 84 percent credible intervals. Both panels are ordered by the value of the posterior mean. A longer diabetic LE or a higher percentage of remaining life to be spent diabetic implies a person will spend more years or a greater percentage of remaining life with diabetes. Diabetic LE is calculated by summing across eD, eDC, eDA, and eDCA. Percentage of diabetic LE is calculated by dividing years spent diabetic by total years in life for each of the posterior samples. The posterior means of diabetic LE at age 50 in the left panel lie within a range of 5.6 to 8.4 years. Individuals born in the West or the South and currently living in the Midwest or the Northeast will spend the most years diabetic. However, this result could be driven by the possibility that persons born or living in those areas also have the longest total LE, either because diabetes is more deadly for persons born or living in other regions or for other reasons. Therefore, it is useful to examine the percentage of years of life remaining to be spent diabetic. Percentages in the right panel show that the percentage of remaining life to be spent diabetic is generally greater for individuals born in the South or the West compared with those born in the Northeast or the Midwest. Persons born in the South or the West and currently living in the Midwest spend, on average, 27 percent of their life after age 50 being diabetic, whereas the corresponding percentages for individuals born in the Northeast or the Midwest and currently living in the West are just about 18 percent. The 84 percent credible intervals for the two groups do not overlap, indicating that a classical null hypothesis of equality can be rejected. 
From a Bayesian perspective, we could conclude there is a very high posterior probability that individuals born in the South have more adverse outcomes.
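The correspondence between non-overlapping 84 percent intervals and a classical test at roughly the .05 level can be checked numerically. The sketch below assumes two independent, approximately normal posterior distributions with equal standard deviations; these simplifying assumptions are made here only for illustration.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

# Half-width multiplier of an equal-tailed 84 percent interval (about 1.405).
z84 = std_normal.inv_cdf(1 - (1 - 0.84) / 2)

# Two intervals fail to overlap when |m1 - m2| > 2 * z84 * sd.
# The difference m1 - m2 has standard deviation sd * sqrt(2), so the
# implied z statistic at the point where the intervals just touch is:
z_diff = 2 * z84 / math.sqrt(2)

# Two-sided p value at that threshold.
p_two_sided = 2 * (1 - std_normal.cdf(z_diff))
```

Under these assumptions the threshold z statistic is close to 1.96, which is why non-overlap of 84 percent intervals is read as approximate rejection at the .05 level.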

Figure 2.

Life expectancy with diabetes in years (XLE) and percentages of total life expectancy (%XLE), with 84 percent credible intervals for 16 birth–current region combinations. (a) Diabetic life expectancy. (b) Diabetic life expectancy percentage.

Note: Regions include West (W), South (S), Northeast (NE), and Midwest (MW). XLE is calculated by adding expectancies in states D (having diabetes), DA (having diabetes and at least one activities of daily living disability), DC (having diabetes and at least one chronic condition), and DCA (having diabetes, at least one chronic condition, and at least one activities of daily living disability).

A second useful feature of our method is that both population-based and status-based life tables can easily be constructed. Compared with population-based life tables, status-based tables allow us to evaluate implications of having a certain disease, such as diabetes, for future life with chronic conditions and ADL disability. When making population-based life tables, the radix population is determined by the percentage of the population in each state at the youngest age. When making status-based life tables, the radix is set so that all persons enter the life table in a selected state. As discussed previously, some degree of stationarity is assumed.
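The difference between the two kinds of table lies only in the radix. A minimal sketch, assuming a toy three-state space with made-up, age-constant transition probabilities and a linear approximation to person-years within each interval (the function name `state_expectancies` and all numbers are hypothetical, not taken from the article):

```python
def state_expectancies(radix, transition_matrices, interval=1.0):
    """Years spent in each state per person entering the table.

    radix: starting proportions over all states (sums to 1).
    transition_matrices: one row-stochastic matrix per age interval.
    Person-years in an interval are approximated by the average of the
    occupancy at its start and end (a linearity assumption made here).
    """
    occupancy = list(radix)
    k = len(radix)
    years = [0.0] * k
    for P in transition_matrices:
        nxt = [sum(occupancy[i] * P[i][j] for i in range(k)) for j in range(k)]
        for j in range(k):
            years[j] += interval * (occupancy[j] + nxt[j]) / 2
        occupancy = nxt
    return years

# Toy states: 0 = healthy, 1 = diabetic, 2 = dead (absorbing).
P = [[0.90, 0.06, 0.04],
     [0.00, 0.90, 0.10],
     [0.00, 0.00, 1.00]]
matrices = [P] * 50  # same matrix at every age, purely for illustration

# Population-based: radix reflects the state distribution at the youngest age.
population_based = state_expectancies([0.85, 0.15, 0.0], matrices)

# Status-based: everyone enters the table in the selected state (diabetic).
status_based = state_expectancies([0.0, 1.0, 0.0], matrices)
```

The last entry of each result is time spent in the absorbing state and would be discarded; in the Bayesian setting the calculation would be repeated for every posterior draw of the transition matrices.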

Figure 3 shows how many years “average” individuals and persons with diabetes at age 50 can expect to live with ADL disability, by birth and current region. The left panel of the figure shows population-based life table estimates, and the right panel shows status-based estimates. Estimates in both panels are sorted by the value of the posterior mean estimates, and 84 percent credible intervals are displayed. In both panels, individuals born in the South tend to spend more years being disabled. Although the posterior means for disabled LE are slightly larger for persons with diabetes than for “average” persons, they are all roughly within a range of four to seven years. On the basis of the 84 percent credible intervals, “significant” regional disparities are observed in the population-based results but not in the status-based results. One interpretation of this result is that diabetes is not more consequential for persons born in or living in different regions. Nonetheless, from a Bayesian perspective, posterior probabilities clearly favor persons born somewhere other than the South. As discussed earlier, this result could be driven by reduced total LE among persons with diabetes from the South.

Figure 3.

Population-based and diabetic status–based expectancies with ADL disability, with 84 percent credible intervals for 16 birth–current region combinations. (a) Population based. (b) Status based.

Note: Regions include West (W), South (S), Northeast (NE), and Midwest (MW). Life expectancy with activities of daily living (ADL) disability (DLE) is calculated by adding expectancies in states A (having at least one ADL disability), DA (having diabetes and at least one ADL disability), CA (having at least one chronic condition and one ADL disability), and DCA (having diabetes, at least one chronic condition, and at least one ADL disability).

Figure 4 shows the percentages of remaining life to be spent with ADL disability for the population-based (left panel) and diabetic status–based (right panel) tables. In the population-based figure in the left panel, individuals born in the South experience the greatest percentage of life with ADL disability (around 20 percent) compared with respondents born in other regions. In the diabetic status–based figure in the right panel, persons with diabetes who were born in the South also have the highest percentages of life spent with ADL disability, exceeding 26 percent. On the basis of the 84 percent credible intervals, the disparities between individuals born in the South and their counterparts born in other regions are “significant” in the population-based results, but there is considerable overlap in intervals in the status-based results.

Figure 4.

Population-based and diabetic status-based percentages of remaining life to be spent with ADL disability, with 84 percent credible intervals for 16 birth–current region combinations. (a) Population based. (b) Status based.

Note: Regions include West (W), South (S), Northeast (NE), and Midwest (MW). Life expectancy with activities of daily living (ADL) disability (DLE) is calculated by adding expectancies in states A (having at least one ADL disability), DA (having diabetes and at least one ADL disability), CA (having at least one chronic condition and one ADL disability), and DCA (having diabetes, at least one chronic condition, and at least one ADL disability).

The third advantage of our method is that the probability one group has worse outcomes relative to another group can easily be computed. Our approach is fundamentally Bayesian, so the rejection of null hypotheses/“statistical significance” discussed previously is of less concern than consideration of posterior probabilities reflecting the likelihood of (dis)advantages to birth and residence in one region versus another. The foregoing results suggest that being born in the South is linked to worse health outcomes, even when these differences do not reach “statistical significance” by classical standards. The extent to which the evidence supports this view can be evaluated in the Bayesian framework by computing the percentage of sampled life table quantities for which southerners are worse off relative to persons from other regions for each of the health outcomes examined. Among the life expectancies examined, shorter total LE or healthy LE, and longer LE with diabetes, LE with chronic conditions, LE with ADL disability, and LE with all three health issues indicate worse health outcomes. Results of these probabilistic calculations are shown in Table 4. The results are for population-based life tables only. The left panel shows the results of comparing birth in the Northeast, the Midwest, and the West with the South; the right panel shows the results of comparing current residence in each non-South region to current residence in the South. The posterior probabilities in the left-hand columns of the table are almost uniformly close to 1, indicating that the health outcomes of people born in the South are, on average, worse than those of people born elsewhere.
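The posterior probability calculation amounts to counting draws. A minimal sketch, assuming draws for the two groups are paired by iteration and using simulated, made-up values in place of real posterior samples (the function name `prob_worse` is hypothetical):

```python
import random

def prob_worse(group_draws, reference_draws, higher_is_worse=True):
    """Share of paired posterior draws in which the group is worse off.

    For outcomes such as TLE or HLE, where shorter is worse, pass
    higher_is_worse=False.
    """
    pairs = list(zip(group_draws, reference_draws))
    if higher_is_worse:
        worse = sum(g > r for g, r in pairs)
    else:
        worse = sum(g < r for g, r in pairs)
    return worse / len(pairs)

# Simulated stand-ins for posterior draws of LE with ADL disability.
rng = random.Random(7)
south_dle = [rng.gauss(6.0, 0.4) for _ in range(5000)]
northeast_dle = [rng.gauss(4.5, 0.4) for _ in range(5000)]

p_south_worse = prob_worse(south_dle, northeast_dle)
```

The same function applies to any derived quantity, including the percentage measures, as long as it is computed within each draw first.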

Table 4.

Posterior Probabilities That Southern Health Outcomes Are Worse Than Those of Other Regions (from Population-Based Life Tables)

Birth Region Current Region
Measure NE MW W NE MW W
TLE (<) .970 .996 .937 .985 .985 .961
HLE (<) 1.000 1.000 .980 .828 .444 .953
XLE .966 .932 .385 .269 .021 .821
CLE .704 .462 .491 .118 .031 .200
DLE 1.000 .999 .956 .088 .042 .031
XCDLE .996 .998 .749 .361 .089 .609
%HLE (<) .998 .994 .910 .383 .060 .768
%XLE .996 .992 .550 .471 .060 .948
%CLE .974 .949 .822 .474 .148 .599
%DLE 1.000 1.000 .996 .380 .219 .181
%XCDLE .999 1.000 .839 .519 .160 .761

Note: Regions include West (W), South (S), Northeast (NE), and Midwest (MW). Life expectancies include total life expectancy (TLE), healthy life expectancy (HLE), diabetic life expectancy (XLE), life expectancy with chronic conditions (CLE), life expectancy with activities of daily living disabilities (DLE), and life expectancy with all three health issues (XCDLE). “Worse” means shorter total life expectancy, shorter healthy life expectancy, longer diabetic life expectancy, and so on.

One exception is that persons born in the South probably do not spend more years with chronic conditions than do persons born in other regions. The posterior probabilities in the right-hand columns are not uniformly high, which suggests that, although living in the South may be more detrimental to health than living elsewhere, this claim is less convincing when birth region is controlled.

These results suggest birth region is more important for health than current region. As we discussed earlier, much of the recent work on regional disparities in health uses current region as its measure for region and has sought explanations in terms of disparities in contemporary living conditions. However, current region may simply be a proxy measure for birth region, given the lack of geographic mobility for many, so true explanations for some regional disparities in health may have roots in historical, rather than contemporary, living conditions.

DISCUSSION AND CONCLUSIONS

In this article, we extend the approach to generating MSLTs developed by Lynch and Brown (2005) to handle large and complex state spaces. By treating transitions between states over time intervals as the outcome of a multinomial logit model and normalizing predicted probabilities by starting state, rather than treating the ending state as the multinomial outcome and conditioning on starting by including it as a covariate, we are able to overcome one of the limitations of their and other MSLT methods for sample data. The consequence of this change in specification, however, is an increase in the dimensionality of the multinomial outcome used in the model for estimating transition probabilities used as input for MSLT generation. Nonetheless, as we discussed, we can use an independence sampler to sample parameters from the posterior distribution when the multinomial dimensionality is small to medium, and we are able to take advantage of a recent advance in the Bayesian estimation of multinomial logit models when the dimensionality is large enough that independence sampling does not work. Our approach can easily and flexibly generate detailed life table quantities that can be used by policymakers and understood by anyone, a general advantage of MSLTs. Our application using the HRS provides several examples of useful statistics one can calculate from the generated life tables. These statistics tell us that southern birth often has worse outcomes than birth in other regions, but current southern residence is not necessarily much worse than residence in other regions. Our substantive findings also suggest that status-based tables may be underused: status-based tables provide more clinically relevant information, which becomes more apparent with complex state spaces.
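The normalization by starting state can be sketched as follows. This is an illustration under stated assumptions: the three-state space (h, u, d) follows the simple space used in the simulation footnote, the linear predictor values are made up, and the function name `transition_matrix` is hypothetical.

```python
import math

def transition_matrix(linear_predictors, starts, ends):
    """Convert multinomial-logit linear predictors for (start, end)
    transition cells into row-stochastic transition probabilities.

    linear_predictors: dict mapping (start, end) -> linear predictor for
    one covariate profile at one age. Missing cells count as impossible.
    """
    # Softmax over all transition cells gives the raw cell probabilities.
    m = max(linear_predictors.values())
    expd = {cell: math.exp(v - m) for cell, v in linear_predictors.items()}
    total = sum(expd.values())
    cell_prob = {cell: v / total for cell, v in expd.items()}

    # Renormalize within each starting state so every row sums to 1.
    matrix = {}
    for s in starts:
        row_total = sum(cell_prob.get((s, e), 0.0) for e in ends)
        matrix[s] = {e: cell_prob.get((s, e), 0.0) / row_total for e in ends}
    return matrix

# Made-up linear predictors for the six possible transitions from living states.
eta = {("h", "h"): 2.0, ("h", "u"): 0.0, ("h", "d"): -1.5,
       ("u", "h"): -0.5, ("u", "u"): 1.0, ("u", "d"): -0.5}

tm = transition_matrix(eta, starts=["h", "u"], ends=["h", "u", "d"])
```

After renormalization, each row is a softmax over the cells that share a starting state, so the overall scaling of the cell probabilities drops out.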

Our approach in its current form is not without limitations, which should be addressed in future studies. First, most sample data are not from simple random samples, but our discussion assumes that they are. Indeed, despite the explosion in the use of Bayesian methods over the past few decades, comparatively less attention has been devoted to conducting Bayesian analyses using data from complex samples. This is due to a variety of reasons, including that the appropriate approach to weighting, especially in panel studies, is unclear even in the classical statistical paradigm (Gelman 2007). Nonetheless, there is general agreement that sampling strategies cannot simply be ignored in estimating population quantities. A key first question is whether adjusting for sample design produces any changes in interval estimates that have substantive implications. In MSLTs, we have generally found that explicitly compensating for sample design makes little meaningful difference to state expectancy interval estimates. In part, controlling for relevant weighting and clustering variables, and then specifying covariate profiles for specific subpopulations upon which weights are based, obviates the need for weighting (Kennedy and Gelman 2019; Winship and Radbill 1994). In part, state expectancy estimates tend to be larger quantities than regression coefficients and are derived through multiple steps so that, even if compensating for weighting affects the regression coefficients, the ultimate consequence to state expectancies tends to be only a few hundredths or tenths of a year in the state expectancy metric.

Recent work on incorporating sample design into Bayesian analyses suggests two straightforward approaches to incorporating design (weights): (1) constructing a pseudo-likelihood function that replaces the usual likelihood in obtaining the posterior distribution and (2) introducing a weighted bootstrapping step within the MCMC procedure used to sample from the posterior. Both approaches have been tried in previous research on Bayesian MSLT methods (Lynch and Brown 2010; Lynch, Brown, and Harmsen 2003) and were found to change results, but not substantially. More recent work has conducted simulations with both approaches in a more general modeling context and has found the latter approach is preferable (Gunawan et al. 2017). Such an approach is simple to incorporate: one simply draws a weighted bootstrap sample at each iteration of either a Gibbs sampler or independence sampler in sampling from the posterior for the multinomial logit model parameters. We followed such an approach in unreported results and found it produces only minor changes relative to estimates obtained ignoring weighting altogether.
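The weighted bootstrap step can be illustrated with a deliberately simple stand-in model. The sketch below swaps the multinomial logit model for a Bernoulli outcome with a conjugate Beta(1, 1) prior, so the posterior update is a single exact draw; the data, weights, and function name are all hypothetical, and this is not the procedure used in the article.

```python
import random

def weighted_bootstrap_posterior(y, weights, n_iter=2000, seed=0):
    """Posterior draws of a proportion with a weighted bootstrap step
    inside each iteration of the sampler."""
    rng = random.Random(seed)
    n = len(y)
    indices = range(n)
    draws = []
    for _ in range(n_iter):
        # Step 1: resample records with probability proportional to weight.
        sample = rng.choices(indices, weights=weights, k=n)
        successes = sum(y[i] for i in sample)
        # Step 2: draw the parameter given the resampled data. The conjugate
        # Beta update stands in for a Gibbs or independence sampler step.
        draws.append(rng.betavariate(1 + successes, 1 + n - successes))
    return draws

# Toy data: 30 "successes" that are oversampled relative to their
# population share, so each carries three times the weight.
y = [1] * 30 + [0] * 70
weights = [3.0] * 30 + [1.0] * 70

draws = weighted_bootstrap_posterior(y, weights)
posterior_mean = sum(draws) / len(draws)
```

In this toy example the posterior mean moves from the unweighted sample proportion of 0.30 toward the weighted proportion of about 0.56, showing how the resampling step carries the design information into the posterior.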

A second limitation of the new method (and other, similar methods) is that we assume time intervals between survey waves are equal for all observations (person-interval records). In many surveys, including the HRS, nominal intervals between survey waves are not necessarily consistent with de facto intervals between interviews. For example, HRS interviews are nominally two years apart, but they are not exactly two years apart: some interview intervals are much shorter than two years, and some are longer, on the basis of interview dates. In simulations (not reported), we found that differing interval widths do not change estimates, under the assumption that interval widths are unrelated to covariates. A comparison between two other MSLT packages, SPACE and IMaCh, found similar results (Cai et al. 2010).

A third, and related, limitation is that our approach assumes that “current status” data, which is what panel studies in fact measure, capture all actual transitions. We know that this is not the case, and Wolf and Gill (2009) showed that panel data tend to produce underestimates of some state expectancies, regardless of approach, including the EMC approach. More work needs to be done in this area for all MSLT methods.

Fourth, our method is a discrete-time, rather than continuous-time, method. MSLT methods, from their initial development until the present, generally assume a continuous-time Markov process produces our observed data (Schoen 1988). That we use a discrete-time approach may be viewed as a limitation, but we are uncertain about this. The availability of true, population-level rate data facilitates MSLT estimation under a continuous-time assumption. But relying on continuous-time assumptions while using panel data for estimation of MSLTs requires assuming more than we think is justifiable. We must make assumptions regarding when, and how many, transitions occur between periods of observations. No single method can adjudicate between assumptions about this in the absence of actual continuous-time data. Thus, we concur with Cai et al. (2010) in arguing that the goal of MSLTs in demographic research is to provide a crude approximation when the information is incomplete. We would argue that this also is consistent with Bayesian thinking. State expectancies are not directly observed quantities for which we can produce estimates of their population counterparts. Instead, we must model them indirectly. Our uncertainty about them, therefore, includes not only sampling uncertainty but also parametric uncertainty, which includes uncertainty about how well the parameters we estimate reflect the true process underlying our observed data. The Bayesian interpretation of interval estimates derived from the multistep process of MSLT construction more closely represents our true uncertainty about state expectancies than do classical sampling theory-based interval estimates. Thus, we believe Bayesian approaches to MSLT construction should be used more often.

Our MSLT method allows for easier and more widespread use of the life table approach, which has more utility than just for questions of interest to demographers. For example, family scholars have used the life table approach to study transitions in marital statuses (e.g., Schoen and Canudas-Romo 2006). Economists have used life tables to study business closing patterns (e.g., Nucci 1999). Clinicians have used life tables to study disability and other health issues (e.g., Gill et al. 2021). Biologists have used life tables to study the survival of animals and insects (e.g., Liu and Tsai 2000). More generally, life table methods have the potential to help us better understand any transition patterns. However, they have been underused in fields other than demography and health. To facilitate greater use of our method, we are developing a general R package that performs both the MCMC sampling for obtaining samples of multinomial logit model parameters and the life table computations using the MCMC results. In the interim, our somewhat less general R functions are available, along with detailed instructions.

Supplementary Material

Acknowledgment

We thank Matthew van Adelsberg, Justin T. Max, Seth G. Sanders, and Maria-Giovanna Merli for their helpful comments.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Dr. Lynch received support from the National Institute on Aging (grant R01AG040199). Dr. Zang received support from the National Institute on Aging (grant R21AG074238-01), the Research Education Core of the Claude D. Pepper Older Americans Independence Center at Yale School of Medicine (grant P30AG021342), and the Institution for Social and Policy Studies at Yale University.

Biographies

Author Biographies

Scott M. Lynch, PhD, is a professor of sociology, director of the Center for Population Health and Aging, and associate director of the Duke University Population Research Institute at Duke University. His methodological interests are in Bayesian statistics and demographic methods. His substantive interests are in health disparities across age and birth cohorts. He has published two books and several chapters in edited volumes on applied Bayesian statistics in social science research and has published articles in the top demography, methodology, gerontology, and health journals in sociology.

Emma Zang, PhD, is an assistant professor of sociology and biostatistics at Yale University. She researches family demography and population health and develops methods to model trajectories and life transitions. Her articles have appeared in many leading social science and medical journals, such as the American Journal of Sociology, Demography, Social Science & Medicine, and JAMA Internal Medicine.

Footnotes

Supplemental Material

Supplemental material for this article is available online.

1.

Because we use a single discrete-time multinomial logit regression model to estimate raw probabilities of transitions, rather than probabilities of transitioning to each state conditional on one’s current state, our cell probabilities, when placed in a cross-tabulation of starting and ending states across time intervals, can be summed meaningfully into row and column marginal probabilities. This means that it is possible that our model-implied column marginal probabilities for the time interval that ends at time t may differ from the row marginal probabilities for the next time interval, which begins at time t. This possibility at first seems to imply some incoherence in our method and may suggest that some constraint should be imposed to ensure column and row marginal probabilities “align” over time. However, we think this is an unnecessary condition for our approach. First, the data come from multiple birth cohorts entering the study at different points in time and represent survivors (and respondents/nonattriters) at each age. Given attrition and mortality, the data themselves do not follow this condition if one were to construct a set of cross-tabulations from the raw data. The model is simply smoothing out the transitions observed among a heterogeneous set of cohorts to produce an underlying synthetic cohort. It is hard to imagine that if the data do not imply continuity, any model could or should. Second, the construction of transition probability matrices using the parameter samples involves renormalizing the obtained cell probabilities so they sum to 1 across rows. At this point, the marginals will not align, because the rows sum to 1, and the column sums are meaningless. If we followed the previous strategy in the literature and used the starting health state to predict the outcome state, the probabilities obtained would immediately be true transition probabilities. 
Both approaches yield the same transition probabilities, assuming age interacts with the starting state under the former approach. Thus, it seems unnecessary for the marginals under the new approach to align. Third, to explore this idea further, we conducted a simulation study (see the online supplement for details). For the simulation, we kept the state space simple—healthy (h), unhealthy (u), dead (d)—and generated transition probabilities from separate logit specifications. The conclusion from the simulation is that our new method involving estimating transitions as outcomes rather than outcome states conditional on starting states works well, and it does so despite not producing the alignment of marginals.

References

  1. Abner Erin L., Kryscio Richard J., Cooper Gregory E., Fardo David W., Jicha Gregory A., Mendiondo Marta S., Nelson Peter T., et al. 2012. “Mild Cognitive Impairment: Statistical Models of Transition Using Longitudinal Clinical Data.” International Journal of Alzheimer’s Disease 2012:291920. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Albert James H., and Chib Siddhartha. 1993. “Bayesian Analysis of Binary and Polychotomous Response Data.” Journal of the American Statistical Association 88(422):669–79. [Google Scholar]
  3. Allison Paul D. 1984. Event History Analysis: Regression for Longitudinal Event Data. Beverly Hills, CA: Sage. [Google Scholar]
  4. Arias Elizabeth, Anderson Robert N., Kung Hsiang-Ching, Murphy Sherry L., and Kochanek Kenneth D.. 2003. “Deaths: Final Data for 2001.” National Vital Statistics Reports 52(3):1–115. [PubMed] [Google Scholar]
  5. Barker Lawrence E., Kirtland Karen A., Gregg Edward W., Geiss Linda S., and Thompson Theodore J.. 2011. “Geographic Distribution of Diagnosed Diabetes in the U.S.: A Diabetes Belt.” American Journal of Preventive Medicine 40(4):434–39. [DOI] [PubMed] [Google Scholar]
  6. Billingsley Patrick. 1968. Statistical Inference for Markov Processes, Vol. 2. Chicago: University of Chicago Press. [Google Scholar]
  7. Brooks Steve, Gelman Andrew, Jones Galin L., and Meng Xiao-Li, eds. 2011. Handbook of Markov Chain Monte Carlo. Boca Raton, FL: Chapman & Hall. [Google Scholar]
  8. Cai Liming, Hayward Mark D., Saito Yasuhiko, Lubitz James, Hagedorn Aaron, and Crimmins Eileen. 2010. “Estimation of Multi-state Life Table Functions and Their Variability from Complex Survey Data Using the SPACE Program.” Demographic Research 22(6):129–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Calhoun Charles. 1997. “Bootstrapping the Multi-state Life Table: Preliminary Results.” Paper presented at the 1997 annual meeting of the Population Association of America, Washington, DC. [Google Scholar]
  10. Centers for Disease Control and Prevention. 2022. “Life Expectancy.” Retrieved July 5, 2022. https://www.cdc.gov/nchs/nvss/life-expectancy.htm.
  11. Cook Richard J., and Lawless Jerald F.. 2018. Multistate Models for the Analysis of Life History Data. Boca Raton, FL: CRC Press. [Google Scholar]
  12. Cook Scott J., Niehaus John, and Zuhlke Samantha. 2018. “A Warning on Separation in Multinomial Logistic Models.” Research & Politics 5(1):1–5. [Google Scholar]
  13. Crimmins Eileen M., Hayward Mark D., Hagedorn Aaron, Saito Yasuhiko, and Brouard Nicolas. 2009. “Change in Disability-Free Life Expectancy for Americans 70 Years Old and Older.” Demography 46(3):627–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Danaei Goodarz, Friedman Ari B., Oza Shefali, Murray Christopher J. L., and Ezzati Majid. 2009. “Diabetes Prevalence and Diagnosis in US States: Analysis of Health Surveys.” Population Health Metrics 7(1):16–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Dudel Christian, and Myrskyla Mikko¨. 2017. “Working Life Expectancy at Age 50 in the United States and the Impact of the Great Recession.” Demography 54(6):2101–23. [DOI] [PubMed] [Google Scholar]
  16. Dwyer-Lindgren Laura, Bertozzi-Villa Amelia, Stubbs Rebecca W., Morozoff Chloe, Mackenbach Johan P., van Lenthe Frank J., Mokdad Ali H., et al. 2017. “Inequalities in Life Expectancy among US Counties, 1980 to 2014: Temporal Trends and Key Drivers.” JAMA Internal Medicine 177(7):1003–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Efron Bradley, and Tibshirani Robert J.. 1995. An Introduction to the Bootstrap. Boca Raton, FL: CRC Press. [Google Scholar]
  18. Fix Evelyn, and Neyman Jerzy. 1951. “A Simple Stochastic Model of Recovery, Relapse, Death and Loss of Patients.” Human Biology 23:205–41. [PubMed] [Google Scholar]
  19. Gamerman Dani, and Lopes Hedibert. 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference. 2nd ed. Boca Raton, FL: Chapman & Hall/CRC. [Google Scholar]
  20. Gelman Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22(2):153–64. [Google Scholar]
  21. Gelman Andrew, Carlin John B., Stern Hal S., Dunson David B., Vehtari Aki, and Rubin Donald B.. 2013. Bayesian Data Analysis. Boca Raton, FL: CRC Press. [Google Scholar]
  22. Ghosh Sujit K., Mukhopadhyay Pabak, and Lu Jye-Chyi J. C.. 2006. “Bayesian Analysis of Zero-Inflated Regression Models.” Journal of Statistical Planning and Inference 136:1360–75. [Google Scholar]
  23. Gill Thomas M., Zang Emma X., Murphy Terrence E., Leo-Summers Linda, Gahbauer Evelyne A., Festa Natalia, Falvey Jason R., et al. 2021. “Association between Neighborhood Disadvantage and Functional Well-Being in Community-Living Older Persons.” JAMA Internal Medicine 181(10):1297–1304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Gilsanz Paola, Mayeda Elizabeth Rose, Glymour M. Maria, Quesenberry Charles P., and Whitmer Rachel A.. 2017. “Association between Birth in a High Stroke Mortality State, Race, and Risk of Dementia.” JAMA Neurology 74(9):1056–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Giovannucci Edward, Harlan David M., Archer Michael C., Bergenstal Richard M., Gapstur Susan M., Habel Laurel A., Pollak Michael, et al. 2010. “Diabetes and Cancer: A Consensus Report.” CA: A Cancer Journal for Clinicians 60(4):207–21. [DOI] [PubMed] [Google Scholar]
  26. Glymour M. Maria, Benjamin Emelia J., Kosheleva Anna, Gilsanz Paola, Curtis Lesley H., and Patton Kristen K.. 2013. “Early Life Predictors of Atrial Fibrillation-Related Mortality: Evidence from the Health and Retirement Study.” Health & Place 21:133–39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Gunawan David, Panagiotelis Anastasios, Griffiths William, and Chotikapanich Duangkamon. 2017. “Bayesian Weighted Inference from Surveys.” Australian & New Zealand Journal of Statistics 62(1):1–39. [Google Scholar]
  28. Hayward Mark D., and Grady William R.. 1990. “Work and Retirement among a Cohort of Older Men in the United States, 1966–1983.” Demography 27(3):337–56. [PubMed] [Google Scholar]
  29. Hayward Mark D., Rendall Michael, and Crimmins Eileen. 1999. “Evaluating Group Differences in Healthy Life Expectancy: The Estimation of Confidence Intervals for Multistate Life Table Expectancies.” Paper presented at the annual meeting of the Gerontological Society of America, San Francisco, CA. [Google Scholar]
  30. He Wan, and Schachter Jason. 2003. “Internal Migration of the Older Population, 1995 to 2000.” Census 2000 Special Reports. Washington, DC: U.S. Census Bureau. [Google Scholar]
  31. Imai Kosuke, and Van Dyk David A.. 2005. “A Bayesian Analysis of the Multinomial Probit Model Using Marginal Data Augmentation.” Journal of Econometrics 124(2):311–34. [Google Scholar]
  32. Kennedy Lauren, and Gelman Andrew. 2019. “Know Your Population and Know Your Model: Using Model-Based Regression and Poststratification to Generalize Findings Beyond the Observed Sample.” arXiv. Retrieved July 5, 2022. https://arxiv.org/abs/1906.11323. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Laditka Sarah B., and Laditka James N.. 2015. “Active Life Expectancy of Americans with Diabetes: Risks of Heart Disease, Obesity, and Inactivity.” Diabetes Research and Clinical Practice 107(1):37–45. [DOI] [PubMed] [Google Scholar]
  34. Laditka Sarah B., and Wolf Douglas A. 1998. “New Methods for Analyzing Active Life Expectancy.” Journal of Aging and Health 10(2):214–41. [Google Scholar]
  35. Land Kenneth C., Guralnik Jack M., and Blazer Dan G.. 1994. “Estimating Increment-Decrement Life Tables with Multiple Covariates from Panel Data: The Case of Active Life Expectancy.” Demography 31(2):297–319. [PubMed] [Google Scholar]
  36. Land Kenneth C., and Rogers Andrei. 1982. “Multidimensional Mathematical Demography: An Overview.” Pp. 1–41 in Multidimensional Mathematical Demography, edited by Land KC and Rogers A. New York: Academic Press. [Google Scholar]
  37. Lee MA, and Rendall Michael S.. 2001. “Self-Employment Disadvantage in the Working Lives of Blacks and Females.” Population Research and Policy Review 20(4):291–320. [Google Scholar]
  38. Lièvre Agnès, Brouard Nicolas, and Heathcote Christopher. 2003. “The Estimation of Health Expectancies from Cross-Longitudinal Surveys.” Mathematical Population Studies 10:211–48. [Google Scholar]
  39. Liu Ying Hong, and Tsai James H.. 2000. “Effects of Temperature on Biology and Life Table Parameters of the Asian Citrus Psyllid, Diaphorina citri Kuwayama (Homoptera: Psyllidae).” Annals of Applied Biology 137(3):201–06. [Google Scholar]
  40. Ludwig Jens, Duncan Greg J., Gennetian Lisa A., Katz Lawrence F., Kessler Ronald C., Kling Jeffrey R., and Sanbonmatsu Lisa. 2012. “Neighborhood Effects on the Long-Term Well-Being of Low-Income Adults.” Science 337(6101):1505–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Ludwig Jens, Sanbonmatsu Lisa, Gennetian Lisa, Adam Emma, Duncan Greg J., Katz Lawrence F., Kessler Ronald C., et al. 2011. “Neighborhoods, Obesity, and Diabetes—A Randomized Social Experiment.” New England Journal of Medicine 365(16):1509–19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Lynch Scott M., and Brown J. Scott. 2005. “A New Approach to Estimating Life Tables with Covariates and Constructing Interval Estimates of Life Table Quantities.” Sociological Methodology 35:177–225. [Google Scholar]
  43. Lynch Scott M., and Brown J. Scott. 2010. “Obtaining Multistate Life Table Distributions for Highly Refined Subpopulations from Cross-Sectional Data: A Bayesian Extension of Sullivan’s Method.” Demography 47(4):1053–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Lynch Scott M., Brown J. Scott, and Harmsen KG. 2003. “The Effect of Altering ADL Thresholds on Active Life Expectancy Estimates among Older Individuals.” Journals of Gerontology: Social Sciences 58B:171–78. [DOI] [PubMed] [Google Scholar]
  45. McCulloch Robert, and Rossi Peter E.. 1994. “An Exact Likelihood Analysis of the Multinomial Probit Model.” Journal of Econometrics 64(1–2):207–40. [Google Scholar]
  46. Meier Paul. 1955. “Note on Estimation in a Markov Process with Constant Transition Rates.” Human Biology 27:121–25. [PubMed] [Google Scholar]
  47. Nucci Alfred R. 1999. “The Demography of Business Closings.” Small Business Economics 12(1):25–39. [Google Scholar]
  48. Ofstedal Mary Beth, Chiu Chi-Tsun, Jagger Carol, Saito Yasuhiko, and Zimmer Zachary. 2019. “Religion, Life Expectancy, and Disability-Free Life Expectancy among Older Women and Men in the United States.” Journals of Gerontology, Series B: Psychological Sciences and Social Sciences 74(8):e107–18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Olshansky S. Jay, Passaro Douglas J., Hershow Ronald C., Layden Jennifer, Carnes Bruce A., Brody Jacob, Hayflick Leonard, et al. 2005. “A Potential Decline in Life Expectancy in the United States in the 21st Century.” New England Journal of Medicine 352(11):1138–45. [DOI] [PubMed] [Google Scholar]
  50. Palloni Alberto. 2001. “Increment-Decrement Life Tables.” Pp. 256–72 in Demography: Measuring and Modeling Population Processes, edited by Preston SH, Heuveline P, and Guillot M. Oxford, UK: Blackwell. [Google Scholar]
  51. Payton Mark E., Greenstone Matthew H., and Schenker Nathaniel. 2003. “Overlapping Confidence Intervals or Standard Error Intervals: What Do They Mean in Terms of Statistical Significance?” Journal of Insect Science 3(1):34–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Pitocco Dario, Fuso Leonello, Conte Emanuele G., Zaccardi Francesco, Condoluci Carola, Scavone Giuseppe, Incalzi Raffaele Antonelli, et al. 2012. “The Diabetic Lung: A New Target Organ?” Review of Diabetic Studies 9(1):23–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Polson Nicholas G., Scott James G., and Windle Jesse. 2013. “Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables.” Journal of the American Statistical Association 108(504):1339–49. [Google Scholar]
  54. Preston Samuel H., Choi Daesung, Elo Irma T., and Stokes Andrew. 2018. “Effect of Diabetes on Life Expectancy in the United States by Race and Ethnicity.” Biodemography and Social Biology 64(2):139–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Reuser Mieke, Bonneux Luc G., and Willekens Frans J.. 2009. “Smoking Kills, Obesity Disables: A Multistate Approach of the US Health and Retirement Survey.” Obesity 17(4):783–89. [DOI] [PubMed] [Google Scholar]
  56. Reuser Mieke, Willekens Frans J., and Bonneux Luc. 2011. “Higher Education Delays and Shortens Cognitive Impairment: A Multistate Life Table Analysis of the US Health and Retirement Study.” European Journal of Epidemiology 26(5):395–403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Reynolds Sandra L., Saito Yasuhiko, and Crimmins Eileen M.. 2005. “The Impact of Obesity on Active Life Expectancy in Older American Men and Women.” Gerontologist 45(4):438–44. [DOI] [PubMed] [Google Scholar]
  58. Rossi Peter E., Allenby Greg M., and McCulloch Rob. 2005. Bayesian Statistics and Marketing. Hoboken, NJ: John Wiley. [Google Scholar]
  59. Schoen Robert. 1988. Modeling Multigroup Populations. New York: Plenum. [Google Scholar]
  60. Schoen Robert, and Canudas-Romo Vladimir. 2006. “Timing Effects on Divorce: 20th Century Experience in the United States.” Journal of Marriage and Family 68(3):749–58. [Google Scholar]
  61. Singer Burton, and Spilerman Seymour. 1976. “The Representation of Social Processes by Markov Models.” American Journal of Sociology 82(1):1–54. [Google Scholar]
  62. Sverdrup Erling. 1965. “Estimates and Test Procedures in Connection with Stochastic Models for Deaths, Recoveries and Transfers between Different States of Health.” Scandinavian Actuarial Journal 1965:184–211. [Google Scholar]
  63. U.S. Census Bureau. 2015. “Geographical Mobility: 2013 to 2014.” Retrieved July 5, 2022. https://www.census.gov/data/tables/2014/demo/geographic-mobility/cps-2014.html.
  64. Willekens Frans, and Putter Hein. 2014. “Software for Multistate Analysis.” Demographic Research 31:381–420. [Google Scholar]
  65. Winship Christopher, and Radbill Larry. 1994. “Sampling Weights and Regression Analysis.” Sociological Methods & Research 23(2):230–57. [Google Scholar]
  66. Wolf Douglas A., and Gill Thomas M.. 2009. “Modeling Transition Rates Using Panel Current-Status Data: How Serious Is the Bias?” Demography 46(2):371–86. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Wu Jasmanda H., Haan Mary N., Liang Jersey, Ghosh Debashis, Gonzalez Hector M., and Herman William H.. 2003. “Diabetes as a Predictor of Change in Functional Status among Older Mexican Americans: A Population-Based Cohort Study.” Diabetes Care 26(2):314–19. [DOI] [PubMed] [Google Scholar]
  68. Zang Emma, Lynch Scott M., Liu Chen, Lu Nancy, and Banas Julia. Forthcoming. “Racial/Ethnic and Educational Disparities in the Impact of Diabetes on Population Health among the US-Born Population.” Journals of Gerontology, Series B: Psychological Sciences and Social Sciences. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Zang Emma, Lynch Scott M., and West Jessica. 2021. “Regional Differences in the Impact of Diabetes on Population Health in the USA.” Journal of Epidemiology and Community Health 75(1):56–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Zaninotto Paola, Batty George David, Stenholm Sari, Kawachi Ichiro, Hyde Martin, Goldberg Marcel, Westerlund Hugo, et al. 2020. “Socioeconomic Inequalities in Disability-Free Life Expectancy in Older People from England and the United States: A Cross-National Population-Based Study.” Journals of Gerontology, Series A: Biological Sciences and Medical Sciences 75(5):906–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Zaninotto Paola, and Steptoe Andrew. 2019. “Association between Subjective Well-Being and Living Longer without Disability or Illness.” JAMA Network Open 2(7):e196870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Zheng Hui, and Tumin Dmitry. 2015. “Variation in the Effects of Family Background and Birth Region on Adult Obesity: Results of a Prospective Cohort Study of a Great Depression-Era American Cohort.” BMC Public Health 15:535–45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Zimmer Zachary, and Rubin Sara. 2016. “Life Expectancy with and without Pain in the US Elderly Population.” Journals of Gerontology, Series A: Biological Sciences and Medical Sciences 71(9):1171–76. [DOI] [PMC free article] [PubMed] [Google Scholar]
