Summary
Many longitudinal studies are designed to monitor participants for major events related to the progression of diseases. Data arising from such studies are usually subject to interval censoring, since the events are only known to have occurred between two monitoring visits. In this work, we propose a new method to handle interval-censored multistate data within a proportional hazards framework, where the hazard rate of events is modeled as a nonparametric function of time and the covariates affect the hazard rate proportionally. The main idea of this method is to simplify the likelihood functions of a discrete-time multistate model through an approximation and the application of data augmentation techniques, in which augmenting the observed data with the unobserved censored information facilitates a simpler parameterization. The expectation-maximization (EM) algorithm is then used to estimate the parameters in the model. The performance of the proposed method is evaluated by numerical studies. Finally, the method is employed to analyze a dataset tracking the progression of coronary allograft vasculopathy following heart transplantation.
Keywords: data augmentation, interval censoring, multistate model, proportional hazards model, time-to-event data
1 |. INTRODUCTION
The progression of a disease can usually be described in terms of several stages or states based on its etiology, severity, and presentation. For example, the stages of cancer are defined by the spread and size of cancerous tumors in tissues. The stages of Alzheimer's disease are characterized by cognitive impairment, and its progression is further complicated by various disease subtypes. Many clinical studies are designed to understand the natural history of disease progression. These studies follow individuals with elevated risks of disease and monitor the disease progression through scheduled clinical visits. The typical research questions that these studies aim to answer are whether individuals are at high risk of progressing to the next state and which factors accelerate or delay the state transitions. To analyze data from these studies, multistate models are an indispensable tool, especially when there are multiple disease states of interest.
In this research, we focus on the problem of interval censoring in multistate models. In clinical studies, the disease status of individuals is evaluated at their clinical visits, so the investigators usually only know that a change of status occurred sometime between two consecutive visits; the exact times of state transitions are not observed. In practice, some study individuals may not strictly adhere to the monitoring schedules, and missed visits are anticipated. As a consequence, the sequences of monitoring times are irregularly spaced, and the problem of interval censoring is not negligible. In the literature, a variety of methods have been proposed to analyze interval-censored data in single-event survival models. Lindsey1 considered a parametric likelihood-based method; Turnbull2, Finkelstein and Wolfe3 and Farrington4 used nonparametric maximum likelihood estimation methods in semiparametric models; Tanner and Wong5 considered a data augmentation algorithm using multiple imputation; Wang et al.6 considered a two-stage data augmentation method using a latent Poisson process. Interval censoring in multistate models is usually a more complex problem. Unlike single-event models, which have at most one unobserved event per interval, there can be multiple unobserved events occurring in an interval, where both the number of events and the actual event sequence are unknown. Because of these difficulties, many multistate models in the literature impose restrictive assumptions on the structure of the multistate model or the distribution of state transition times. In the frequentist framework, both Marshall7 and Satten8 considered using stationary Markov chains to model the multistate data; Alioum and Commenges9 used piecewise-constant intensities in Markov models to relax the stationarity assumptions; Frydman and Szarek10 considered nonparametric estimation of transition intensities in a three-state "illness-death" model; Pak et al.11 considered the semiparametric estimation of a progressive three-state model; Zhang et al.12 considered a Monte Carlo expectation-maximization algorithm in the semiparametric estimation of a four-state model with informative missingness. In the Bayesian framework, multistate models have been considered by Sharples13, Pan14, Van Den Hout and Matthews15, Kneib and Hennerfeind16 and De Iorio et al.17 with different prior specifications and model structures; all of these methods deal with the censored information by sampling from the posterior distribution of the latent variables using Markov chain Monte Carlo. In sum, the existing methods are usually limited by restrictive assumptions (e.g., stationarity and parametric forms), limited flexibility to accommodate complicated multistate structures, and computationally intensive algorithms (e.g., constrained optimization and Monte Carlo methods). To address these problems, we propose a new method for fitting interval-censored multistate models with arbitrary model structures. The proposed method is semiparametric: the transition hazards are a nonparametric function of time, and model covariates affect the transition hazards proportionally as in a proportional hazards model. Parameters in the model can be estimated using an expectation-maximization (EM) algorithm.
In this paper, we develop this novel method for fitting interval-censored multistate data. The proposed method applies an approximation that reduces the number of parameters in the likelihood and increases the computational efficiency of the algorithm. The remainder of the paper is organized as follows. In Section 2, we introduce the proposed method along with the model estimation technique. Simulation studies evaluating the proposed method are presented in Section 3. A real data application to the heart transplant data is presented in Section 4. Section 5 concludes the paper with some discussion and directions for future research. Appendix A presents the proposed method for single-event survival models as a special case of the multistate model, and some technical details about the model estimation are provided in Appendix B.
2 |. METHOD
2.1 |. Data and Notations
In this section, we give a full description of the method for modeling multistate interval-censored data. Let us first consider a typical multistate interval-censored dataset. The relationship between different states can be described by a directed graph , where the set of vertices represents the collection of states and the set of directed edges represents all possible transitions from one state to the next. In this paper, we number the states by (i.e., ), and if and only if the individuals can make a transition from state to state . We let denote all states that can be transitioned to starting from state . Suppose that there are individuals in the dataset. The ith individual is sequentially monitored at the time sequence for the occurrences of transitions. Here we let be the state occupied by the ith individual at time , and . The exact times of state transitions are not directly observed and can only be inferred from the observed data. We let be the design interval, where can be chosen arbitrarily large to cover all observation times, so the design interval does not have to be specified a priori.

Next, we introduce some additional notations needed to describe the discrete-time multistate model in this paper. Unlike other discrete-time survival models, which assume that event time distributions have discrete masses over the design interval, we formulate our model in terms of intervals instead of discrete times. This choice simplifies both the analysis of interval-censored data and the interpretation of results. The design interval is discretized into a union of disjoint small intervals , where the time sequence encompasses all observation times . Here we let be the event indicator process such that if the ith individual transitions from state to at time and 0 otherwise. Let be the at-risk process such that if the ith individual is at risk of transitioning out of state at time . For ease of exposition, we introduce the following notations to allow and to be functions of intervals. For an interval , we let , which means if and only if the transition from state to is observed before censoring and and in the interval . Similarly, we let , which means if and only if the individual is at risk of transitioning out of state before entering the interval and has not been censored in the interval . We let if the ith individual transitions from state to in the interval and 0 otherwise. Also, we let if the ith individual is at risk of transitioning out of in the interval and 0 otherwise. Let be a p-dimensional vector of covariates; we assume the following discrete-time multistate model with a complementary log-log link
where is a p-dimensional vector of regression coefficients and are parameters that characterize the rate of transitions in the interval . Many existing works have pointed out the similarity of discrete-time survival models with a complementary log-log link to continuous-time survival models.3,18 In discrete-time survival models, the probability mass of the event time distribution is placed on the observed event times, while the proposed method allocates the mass of the transition time distribution to intervals. can similarly be interpreted as the cause-specific baseline hazard in competing risks models and is usually treated as a nuisance parameter when the focus of statistical inference is on the regression coefficient . We will further explore the relationship between the proposed model and the Cox proportional hazards model in the next section. In this manuscript, we assume that the observation time process is independent of the event time process, which is usually referred to as the "independent inspection time" model. As noted by Lawless19, the "independent inspection time" model satisfies the constant-sum condition of Oller et al.20
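To fix ideas, the complementary log-log link means the per-interval transition probability has the form p = 1 − exp{−exp(α + x′β)}. The following minimal Julia sketch evaluates this quantity; the names `alpha_k`, `beta`, and `x` are illustrative and not taken from the paper's code.

```julia
using LinearAlgebra  # for dot

# Probability that an at-risk individual makes a given transition in
# interval k under the complementary log-log model:
# cloglog(p) = alpha_k + x'beta, i.e., p = 1 - exp(-exp(alpha_k + x'beta)).
transition_prob(alpha_k, beta, x) = 1 - exp(-exp(alpha_k + dot(beta, x)))

# Example: baseline log-rate -2.0 in an interval, one covariate with effect 0.6.
p = transition_prob(-2.0, [0.6], [1.0])   # ≈ 0.218
```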
2.2 |. The Observed and Complete Likelihood
Following the frequentist inference approach, we estimate parameters by maximizing the observed likelihood, that is, the likelihood function given all the observed data. However, due to interval censoring, the observed likelihood takes a complicated form that is hard to work with. To overcome this difficulty, we utilize the technique of data augmentation, as described by Van Dyk and Meng21. The data augmentation technique assumes the existence of certain unknown parameters and data to create an augmented dataset, which allows us to derive a complete likelihood function of a simpler form. This approach is particularly useful when the statistical problem is complicated by unobserved data and missing information.22,23 According to the theory of the EM algorithm, maximum likelihood estimation can be achieved by maximizing the expected complete log-likelihood, enabling us to work directly with the complete likelihood instead of the observed likelihood.22 Throughout this paper, we follow this idea to develop the method for handling interval-censored data. In this section, we first derive the observed and complete likelihood functions. In addition, we propose to apply an approximation to the complete likelihood function to further simplify the parameter estimation and numerical computation. The benefits of utilizing the complete likelihood and the approximation will be demonstrated in this section.
We let and denote the underlying event and at-risk processes, denote the observed data, and and denote the collections of parameters. To simplify the notation, we let be the -th element of matrix . The log probability density function of the observed data is
where is the stochastic matrix given by
Therefore, the observed log-likelihood function for and is . Calculating the derivatives of with respect to and requires considerable effort because it involves products of matrices that depend on and . However, we can construct augmented data that give a simple complete likelihood function by assuming complete knowledge of and . Given the values of and , we can easily write down the joint log probability density function of , , and the observed data
We note that is the probability density function that specifies the probability of observing the interval-censored outcomes given the true event process. By our assumption that the observation process is independent of the event process, should not depend on the parameters and . The complete log-likelihood function for and is then given by
where the term that does not depend on and can be dropped. We can see from the expression for that taking derivatives of the complete likelihood is as easy as taking derivatives of the likelihood function of a complementary log-log model.
2.3 |. An Approximation Strategy for Attaining the Partial Likelihood
Next, we apply an approximation to the complete likelihood function that eventually makes the nuisance parameters implicit in the likelihood function. The result of the approximation resembles the partial likelihood of the Cox cause-specific hazard multistate model, and we will see the connections between the proposed discrete-time survival model and the continuous-time Cox proportional hazards model. Let be the rate of transition, and consider the following Taylor expansion
which holds for small, positive values of , say, when the intervals are small enough. If we apply the Taylor expansion to the zeroth order, , the complete log-likelihood function can be approximated by
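To fix notation for a hedged illustration, suppose the transition rate is $\lambda = \exp(\alpha + x^\top\beta)$, so that under the complementary log-log link the interval event probability is $1 - e^{-\lambda}$. One expansion consistent with this discussion (our reconstruction, not necessarily the authors' exact display) is

$$\log\!\left(1 - e^{-\lambda}\right) \;=\; \log\lambda \;-\; \frac{\lambda}{2} \;+\; \frac{\lambda^{2}}{24} \;+\; O(\lambda^{3}),$$

in which the zeroth-order approximation retains $\log\lambda$ and the first-order approximation retains $\log\lambda - \lambda/2$.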
We can re-write as a partial log-likelihood for , following the arguments by Murphy and Van der Vaart.24 Given the importance of this technique in the context of this paper, we provide a detailed derivation in this section. Notice that when the complete likelihood is maximized, the following equation holds
Solving the equation, we can derive the following expression for
After we plug in the above expression back to , then can be written as a partial likelihood function that is free of the nuisance parameters
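To make the profiling step concrete, here is a hedged single-transition sketch in our own notation: let $N_{ik}$ and $Y_{ik}$ denote the event and at-risk indicators of individual $i$ in interval $k$, and let $\lambda_{ik} = e^{\alpha_k + x_i^\top\beta}$. Under the zeroth-order approximation, the score equation for $\alpha_k$ yields a Breslow-type estimator, and substituting it back gives a Cox-type partial likelihood:

$$e^{\hat\alpha_k} = \frac{\sum_i N_{ik}}{\sum_i (Y_{ik} - N_{ik})\, e^{x_i^\top\beta}}, \qquad pl_0(\beta) = \sum_{i,k} N_{ik}\Big[x_i^\top\beta - \log\sum_j (Y_{jk} - N_{jk})\, e^{x_j^\top\beta}\Big] + \text{const}.$$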
The zeroth-order approximation reveals that the complete log-likelihood function can be expressed in a form that closely resembles the partial likelihood function of the Cox cause-specific hazard multistate model.25 Similarly, we can apply the first-order approximation, , to , which reduces the log-likelihood function to
We can similarly derive the partial log-likelihood form using the same technique. To re-write the complete log-likelihood function as a partial log-likelihood function, we can solve for
which will give us the following expression for ,
After plugging in the expression for back into the complete log-likelihood , we obtain the corresponding partial log-likelihood function
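Continuing the same hedged sketch, the first-order term $-\lambda/2$ only changes the risk-set weights: the score equation for $\alpha_k$ now gives

$$e^{\hat\alpha_k} = \frac{\sum_i N_{ik}}{\sum_i \big(Y_{ik} - N_{ik}/2\big)\, e^{x_i^\top\beta}}, \qquad pl_1(\beta) = \sum_{i,k} N_{ik}\Big[x_i^\top\beta - \log\sum_j \big(Y_{jk} - N_{jk}/2\big)\, e^{x_j^\top\beta}\Big] + \text{const},$$

so an individual who makes the transition in interval $k$ contributes only half weight to that interval's risk set.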
We note that the partial log-likelihood form cannot be obtained for approximations of second order or higher; due to limitations on space, we omit the details. In this manuscript, we focus on the first-order approximation: it tends to improve the efficiency of parameter estimation compared to the zeroth-order approximation, as discussed in Appendix A.5, and the simple partial log-likelihood form associated with the first-order approximation significantly simplifies the problem and the numerical computations.
2.4 |. Model Estimation by EM Algorithm
In the literature on interval-censored single-event data, there are two major types of methods for parameter estimation: the EM algorithm introduced by Turnbull2, originally known as the self-consistency algorithm, and the iterative convex minorant algorithm introduced by Groeneboom and Wellner26. The self-consistency algorithm has a concise form that makes it easy to implement, but it may converge slowly, especially when the nonparametric component has a large number of parameters.27 The iterative convex minorant algorithm is a gradient-based method that optimizes the observed likelihood function directly and is typically much faster. However, its implementation is complicated by the high-dimensional parameter space and the need to satisfy bound constraints. Many existing works on interval-censored multistate data have relied on optimization algorithms to directly maximize the observed likelihood.8,9,28 Recently, Wang et al.6 discovered a novel use of the EM algorithm by considering a two-stage data augmentation using Poisson processes to improve the self-consistency algorithm. Motivated by this method, we present an EM algorithm for estimating the parameters of the proposed model in this section. We have already set up our problem to apply the EM algorithm in the previous sections. The complete likelihood based on the augmented data has a simple form that is convenient to work with. The approximation we apply leads to a partial likelihood function that is free of the nuisance parameters . Therefore, it has the potential to overcome the slow convergence of the self-consistency algorithm, especially when the dimension of is high. The theory of the EM algorithm by Dempster et al.22 suggests that maximizing the observed log-likelihood can be achieved by maximizing the expected complete log-likelihood given the observed data. In calculating the expected complete log-likelihood, we adopt the method of the fractionally re-weighted at-risk process proposed by Datta and Satten29 when the survival status of the individual is unknown. Combining all the techniques mentioned above, we derive an algorithm that is easy to implement and computationally efficient.
The EM algorithm consists of the expectation step and the maximization step. In the expectation step, we evaluate the expectation of the complete log-likelihood given the observed data and the current parameter estimates. To simplify the notation, we let and be the conditional probability and conditional expectation of a random variable given all observed data , and and be the conditional probability and conditional expectation of a random variable given all observed data from the individual , . The expected complete log-likelihood given the observed data is
The quantities that we need to evaluate are and . Since and can only take values of 0 and 1, we have
and
The above conditional probabilities can be calculated by routine manipulations of stochastic matrices. Given and , we can construct a stochastic matrix representing the transition probabilities in the interval for the ith individual, where is the probability of transitioning from state to when , and is the probability of leaving state when . Namely,
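As a hedged code sketch of this construction (our own illustrative convention: off-diagonal entries hold the per-interval transition probabilities, and the diagonal is set so each row sums to one):

```julia
# Assemble the interval stochastic matrix from a matrix q of per-interval
# transition probabilities: q[r, s] is the probability of moving r -> s in
# the interval given occupancy of r at its start; impossible pairs are 0.
function interval_matrix(q::Matrix{Float64})
    P = copy(q)
    for r in axes(P, 1)
        P[r, r] = 0.0
        P[r, r] = 1 - sum(P[r, :])   # staying probability: rows sum to one
    end
    return P
end
```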
Let be the unique interval among that contains . To calculate the above conditional probabilities,
Similarly,
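These matrix manipulations can be sketched in code. Assuming per-interval stochastic matrices `P[k]` constructed as above and monitoring visits at the boundaries of intervals `a` and `b` (all names illustrative, not the paper's code), the conditional probability that the r → s jump occurred in interval k, given states u and v at the bracketing visits, is a ratio of entries of matrix products:

```julia
using LinearAlgebra

# Conditional probability of an r -> s jump in interval k, given that the
# individual occupied state u at the visit before interval a and state v
# at the visit after interval b (forward-backward decomposition).
function jump_prob(P::Vector{Matrix{Float64}}, a, b, k, u, v, r, s)
    n = size(P[1], 1)
    prodmat(i, j) = i > j ? Matrix{Float64}(I, n, n) : reduce(*, P[i:j])
    before = prodmat(a, k - 1)[u, r]   # reach state r just before interval k
    after  = prodmat(k + 1, b)[s, v]   # continue from s to v afterwards
    return before * P[k][r, s] * after / prodmat(a, b)[u, v]
end
```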
We can see that the expected complete log-likelihood is a re-weighted version of the complete log-likelihood , where the weights can be interpreted as the probabilities of observing certain information. The first term is weighted by , representing the expected values of the event process, and the second term is weighted by , representing the expected values of the at-risk process. We can easily draw a parallel between the proposed method and the fractionally re-weighted at-risk process considered by others, in which the contribution to the risk set is determined by the probability that the individual is still at risk in the interval.30,29 When the survival status of an individual is unknown, the individual contributes a fractional weight . The proposed method extends the idea of the fractionally re-weighted at-risk process by Datta et al.30 from right-censored cases to interval-censored cases.
In the maximization step, we update the parameters to maximize the expected complete log-likelihood function . Directly taking derivatives of to update and via a Newton-Raphson step is possible but impractical due to the large number of parameters in and the constraint that . In practice, we found that working directly with is inconvenient and leads to slow convergence. By contrast, the partial log-likelihood depends only on the parameters and is free of bound constraints. Since and are equivalent under different parameterizations, we suggest updating using the partial log-likelihood function and then updating by plugging in the updated values of . The algorithm for the maximization step is described below
where and are the first and second derivatives of the following expected partial likelihood with respect to
Compared to directly taking derivatives of the expected complete log-likelihood, the above procedure typically speeds up the convergence of the parameter estimates and reduces the likelihood of numerical singularities caused by the bound constraints on . The EM algorithm proceeds by iterating between the expectation step and the maximization step until convergence. The convergence properties of the proposed EM algorithm are established as a proposition in Appendix B. A step-by-step implementation of the algorithm is described in Section S1 of the Supplementary Materials.
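A hedged sketch of the resulting update, assuming the first-order partial likelihood form with fractional event weights `wN[i, k]` and at-risk weights `wY[i, k]` from the E-step for a single transition type (illustrative names; the halving of event weights follows the first-order approximation discussed above):

```julia
# One Newton-Raphson ascent step for beta on the expected partial
# log-likelihood; X is the n x p covariate matrix.
function newton_step(beta, X, wN, wY)
    p = size(X, 2)
    grad, info = zeros(p), zeros(p, p)
    eta = exp.(X * beta)
    for k in axes(wN, 2)
        w  = (wY[:, k] .- wN[:, k] ./ 2) .* eta   # adjusted risk-set weights
        S0 = sum(w)
        S1 = X' * w
        d  = sum(wN[:, k])                        # expected events in interval k
        grad .+= X' * wN[:, k] .- d .* (S1 ./ S0)
        info .+= d .* (X' * (w .* X) ./ S0 .- (S1 * S1') ./ S0^2)
    end
    return beta .+ info \ grad                    # info is the negative Hessian
end
```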
The proposed EM algorithm is relatively easy to implement and has a clear interpretation. By comparison, many other methods that directly optimize the likelihood functions involve calculating the derivatives of complicated functions and checking the bounds and singularities of parameter estimates.8,9,31 We can now further discuss how the proposed method differs from the EM methods of Wang et al.6 and Gu et al.32 Both use data augmentation and the EM algorithm for parameter estimation, but we highlight the following main differences between their methods and the proposed method. First, the proposed method is derived from a discrete-time survival model in which the event rate in each interval is described by a binomial model with a complementary log-log link, while the method of Wang et al. is derived from a continuous-time survival model in which the survival events are described by Poisson point processes. Second, the proposed method treats both the event process and the at-risk process as incompletely observed data and applies the fractionally re-weighted process of Datta et al.30 to account for missing information. Finally, Wang et al. and Gu et al. employ a two-stage data augmentation approach that partitions the Poisson processes into two levels, while our proposed method unifies the two stages into a single-stage procedure.
3 |. SIMULATION STUDIES
We performed simulation studies to evaluate the proposed method for interval-censored multistate data. We let the design interval be (0, 1], discretize it into 200 sub-intervals, and let the covariates follow a multivariate normal distribution with mean and variance . In Case (A), we consider a 4-state model as shown in the left panel of Figure 1. In Case (B), we consider a more complicated 6-state model as shown in the right panel of Figure 1. The true values of are given in Table 1. In Case (A), and , where is the right end of the interval . In Case (B), and , where is the right end of the interval .
FIGURE 1. The 4-state model considered in Case (A) (left) and the 6-state model considered in Case (B) (right) in the simulation studies.
TABLE 1.
Biases and RMSE of the estimated regression coefficients in the simulation study of interval-censored multistate data. Values in parentheses are the corresponding standard errors.
| From | To | Method | Bias | Truth | Bias | Truth | Bias | Truth | Bias | Truth | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Case (A) | |||||||||||
| 1 | 2 | 1st Order Approx | 0.008 (0.003) | 0.6 | 0.003 (0.003) | 0.2 | 0.000 (0.003) | 0.2 | 0.003 (0.003) | 0.2 | 0.129 (0.002) |
| 1 | 2 | Package msm | 0.048 (0.003) | 0.6 | 0.041 (0.003) | 0.2 | 0.018 (0.003) | 0.2 | 0.023 (0.003) | 0.2 | 0.151 (0.003) |
| 1 | 3 | 1st Order Approx | −0.000 (0.003) | 0.2 | 0.006 (0.003) | 0.6 | 0.001 (0.003) | 0.2 | 0.002 (0.003) | 0.2 | 0.126 (0.002) |
| 1 | 3 | Package msm | 0.038 (0.003) | 0.2 | 0.045 (0.003) | 0.6 | 0.020 (0.003) | 0.2 | 0.021 (0.003) | 0.2 | 0.147 (0.002) |
| 2 | 4 | 1st Order Approx | 0.004 (0.005) | 0.2 | 0.004 (0.004) | 0.2 | 0.016 (0.004) | 0.6 | 0.008 (0.004) | 0.2 | 0.199 (0.004) |
| 2 | 4 | Package msm | −0.058 (0.004) | 0.2 | −0.056 (0.004) | 0.2 | −0.090 (0.004) | 0.6 | −0.038 (0.004) | 0.2 | 0.211 (0.003) |
| 3 | 4 | 1st Order Approx | 0.012 (0.004) | 0.2 | 0.008 (0.005) | 0.2 | 0.006 (0.004) | 0.2 | 0.010 (0.004) | 0.6 | 0.201 (0.003) |
| 3 | 4 | Package msm | −0.048 (0.004) | 0.2 | −0.055 (0.004) | 0.2 | −0.037 (0.004) | 0.2 | −0.093 (0.004) | 0.6 | 0.211 (0.003) |
| Case (B) | |||||||||||
| 1 | 2 | 1st Order Approx | 0.004 (0.002) | 0.5 | −0.000 (0.002) | 0.1 | 0.000 (0.002) | 0.1 | 0.002 (0.002) | 0.1 | 0.105 (0.002) |
| 1 | 2 | Package msm | 0.075 (0.003) | 0.5 | 0.074 (0.004) | 0.1 | 0.022 (0.003) | 0.1 | 0.028 (0.003) | 0.1 | 0.163 (0.003) |
| 1 | 3 | 1st Order Approx | −0.001 (0.002) | 0.1 | 0.008 (0.002) | 0.5 | −0.003 (0.002) | 0.1 | 0.002 (0.002) | 0.1 | 0.101 (0.002) |
| 1 | 3 | Package msm | 0.068 (0.003) | 0.1 | 0.077 (0.004) | 0.5 | 0.021 (0.003) | 0.1 | 0.027 (0.003) | 0.1 | 0.159 (0.003) |
| 2 | 4 | 1st Order Approx | 0.001 (0.004) | 0.1 | 0.006 (0.004) | 0.1 | 0.005 (0.004) | 0.4 | 0.004 (0.004) | 0.2 | 0.182 (0.003) |
| 2 | 4 | Package msm | −0.123 (0.004) | 0.1 | −0.060 (0.005) | 0.1 | −0.085 (0.004) | 0.4 | −0.054 (0.005) | 0.2 | 0.231 (0.004) |
| 2 | 5 | 1st Order Approx | 0.011 (0.004) | 0.5 | −0.004 (0.004) | 0.1 | 0.002 (0.004) | 0.1 | −0.005 (0.004) | 0.1 | 0.169 (0.003) |
| 2 | 5 | Package msm | −0.139 (0.004) | 0.5 | −0.069 (0.004) | 0.1 | −0.080 (0.004) | 0.1 | −0.061 (0.004) | 0.1 | 0.234 (0.004) |
| 3 | 5 | 1st Order Approx | 0.003 (0.004) | 0.1 | 0.009 (0.004) | 0.5 | 0.001 (0.004) | 0.1 | 0.008 (0.004) | 0.1 | 0.172 (0.003) |
| 3 | 5 | Package msm | −0.062 (0.004) | 0.1 | −0.139 (0.004) | 0.5 | −0.051 (0.004) | 0.1 | −0.073 (0.004) | 0.1 | 0.227 (0.003) |
| 3 | 6 | 1st Order Approx | 0.001 (0.004) | 0.1 | −0.014 (0.004) | 0.1 | 0.001 (0.004) | 0.2 | 0.009 (0.004) | 0.4 | 0.181 (0.003) |
| 3 | 6 | Package msm | −0.063 (0.004) | 0.1 | −0.136 (0.005) | 0.1 | −0.064 (0.004) | 0.2 | −0.092 (0.005) | 0.4 | 0.245 (0.004) |
We compared the proposed method with the method in the R package "msm", which is one of the few R packages that can implement regression analysis on multistate models with arbitrary structures. That method similarly assumes a proportional hazards relationship between transition times and covariates, so the estimates from "msm" are comparable to the estimates from the proposed model. However, the method in "msm" assumes that transition times follow a parametric exponential multistate model. In contrast, the proposed method is semiparametric, with nonparametric baseline hazard functions, so it is more flexible than the method in "msm". In Table 1, we compare the bias and RMSE of the estimates produced by the package "msm" and by the proposed method with the first-order approximation, denoted by "1st Order Approx". From the table, we can see that the proposed method generally gives estimates with smaller biases and smaller RMSE than the method in the package "msm". Furthermore, in Case (B), the algorithm in the package "msm" failed to converge in 195 simulations, with error messages indicating that the Hessian matrices were not positive-definite. The "msm" algorithm is based on the scoring procedure of Kalbfleisch and Lawless33. In formulating the scoring procedure, the chain rule is used to differentiate the likelihood function first with respect to the transition probabilities and then with respect to the coefficients. As a result, the transition probabilities appear in the denominator of the Hessian matrix, as shown in Equation (3.6) of Kalbfleisch and Lawless33. Consequently, their method is more likely to suffer from a singular Hessian matrix. By contrast, the proposed EM algorithm is less prone to numerical singularities because of the techniques we use to simplify the likelihood function being optimized.
4 |. DATA APPLICATION TO HEART TRANSPLANT DATA
In this section, we apply the proposed method to analyze a real dataset that tracks the progression of coronary allograft vasculopathy (CAV) after heart transplantation. The dataset can be accessed through the R package "msm" on CRAN: https://cran.r-project.org/web/packages/msm/index.html. The dataset comprises 2846 visits from 622 patients, with each visit recording the severity of CAV. We categorize CAV into States 1, 2, and 3, representing no, mild, and severe CAV, respectively, and State 4 signifies death. The time origin is chosen to be the time of transplantation. We note that in this model, patients can transition from a more severe state to a less severe state (e.g., from State 2 to State 1 and from State 3 to State 2). In this dataset, we observed such transitions in 58 patients (10.3%): 53 patients (9.4%) had one such transition and 5 patients had two. State 4 is an absorbing state from which patients cannot recover. The multistate models in many other works assume that there are no bidirectional transitions12,32,34, and the backward transitions are usually assumed to be misclassifications and removed from the data analysis (e.g., Section 1.3.1 of Van Den Hout34); such assumptions could potentially lead to an underestimation of the transition risks, and the proposed method is not constrained by them. The model incorporates two covariates: the age group of organ recipients (coded as 0 for "under 50 years" and 1 for "over 50 years") and the age group of organ donors (coded as 0 for "under 30 years" and 1 for "over 30 years"). Figure 2 illustrates the transitions between states, Table 2 summarizes the frequency of each observed transition, and Table 3 summarizes the demographics and distribution of covariates in the analysis population.
FIGURE 2. A four-state model for describing the progression of severity of CAV following heart transplantation.
TABLE 2.
Observed number of transitions in the heart transplant dataset.
| | To State 1 | To State 2 | To State 3 | To State 4 |
|---|---|---|---|---|
| From State 1 | 1367 | 204 | 44 | 148 |
| From State 2 | 46 | 134 | 54 | 48 |
| From State 3 | 4 | 13 | 107 | 55 |
TABLE 3.
Summary of the demographics and distribution of covariates in the analysis population.
| Variable | Frequency (Percent) |
|---|---|
| Sex (Female) | 87 (14.0%) |
| Age of Organ Recipient ≥ 50 | 291 (46.8%) |
| Age of Organ Donor ≥ 30 | 297 (47.7%) |
Table 4 presents the estimated regression coefficients. We find that the risk of transitioning from State 1 to State 2 is higher for the older donor age group (p < 0.001), and donors in the older group are also associated with a reduced likelihood of recovering from State 2 to State 1 (p < 0.001) and from State 3 to State 2 (p = 0.038). Recipients in the older age group face a higher risk of transitioning from State 1 to State 4 (p < 0.001) and a lower chance of recovering from State 3 to State 2 (p = 0.005). At the 0.05 significance level, we do not find the other estimated regression coefficients to be statistically significant.
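For interpretation, the proportional hazards coefficients translate into hazard ratios: for example, the donor age group coefficient of 0.409 for the State 1 → State 2 transition corresponds to a hazard ratio of $e^{0.409} \approx 1.51$, i.e., about a 51% higher transition hazard when the donor is over 30 years old.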
TABLE 4.
Estimated regression coefficients and the corresponding 95% CI and p-values in the application to heart transplant data.
| Variable | Estimate | 95% CI | P value |
|---|---|---|---|
| State 1 → State 2 | |||
| Recipient Age Group | 0.033 | (−0.188,0.253) | 0.772 |
| Donor Age Group | 0.409 | (0.195,0.623) | < .001 |
| State 1 → State 4 | |||
| Recipient Age Group | 0.983 | (0.592,1.374) | < .001 |
| Donor Age Group | 0.157 | (−0.222,0.535) | 0.417 |
| State 2 → State 1 | |||
| Recipient Age Group | 0.245 | (−0.085,0.575) | 0.146 |
| Donor Age Group | −0.583 | (−0.915,−0.250) | < .001 |
| State 2 → State 3 | |||
| Recipient Age Group | −0.252 | (−0.699,0.195) | 0.269 |
| Donor Age Group | 0.023 | (−0.357,0.403) | 0.904 |
| State 2 → State 4 | |||
| Recipient Age Group | −0.074 | (−0.856,0.709) | 0.853 |
| Donor Age Group | 0.143 | (−0.569,0.856) | 0.694 |
| State 3 → State 2 | |||
| Recipient Age Group | −1.121 | (−1.908,−0.334) | 0.005 |
| Donor Age Group | −0.645 | (−1.254,−0.036) | 0.038 |
| State 3 → State 4 | |||
| Recipient Age Group | 0.307 | (−0.234,0.848) | 0.266 |
| Donor Age Group | −0.138 | (−0.626,0.350) | 0.579 |
Figure 3 presents our estimates of the 1-year state occupation probabilities, within 10 years of transplantation, for recipients and donors in the younger age groups. The results indicate that, given a patient is currently in State 1, the probability of leaving State 1 is generally less than 0.2. Patients in State 2 have a higher probability of transitioning to State 3, and the probability of returning to State 1 decreases over time. Notably, the 1-year probability of death (State 4) increases with the severity of the current state (State 1 < State 2 < State 3). The Julia code for implementing the data application can be found in the GitHub repository at https://github.com/luyouepiusf/approximation_method.
FIGURE 3. Estimated probabilities of the recipient's future state occupation after one year, based on the current state and the time elapsed since transplantation. These probabilities are calculated assuming both the recipient and the donor belong to the younger age groups.
5 |. CONCLUSIONS AND DISCUSSIONS
In this paper, a novel method based on the idea of data augmentation is proposed to handle interval censoring in both single-event survival models and multistate models. An efficient EM algorithm is proposed to estimate the parameters in the model. Theoretical and numerical results have shown that the proposed method gives sound parameter estimates and is computationally efficient. The proposed method is applied to the heart transplant dataset to model the advancement of CAV following heart transplantation.
There are still some questions and future research topics that can be explored following this research effort. First, the proposed method relies on an approximation that compromises the precision of parameter estimates. It is worth investigating whether a higher-order approximation or a correction method can improve the estimation. Second, in many observational studies, longitudinal measures of risk factors are also collected from the participants. These longitudinal risk factors can be important predictors of disease progression and can be included as covariates in the model. The proposed method can be extended to incorporate longitudinal data. Third, the proportional hazards assumption can be relaxed to allow more flexibility in our model. For example, we may consider introducing time-dependent coefficients or using a semiparametric single-index model.35,36 Finally, the current research only discusses the case of independent censoring, but in some applications censoring can depend on the covariates or on past state occupation. Methods to handle dependent censoring need to be considered in such cases.37,38
Supplementary Material
ACKNOWLEDGMENTS
Research reported in this publication was supported by the National Institute Of Diabetes And Digestive And Kidney Diseases of the National Institutes of Health under Award Number R03DK135437. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Abbreviations:
- EM: expectation-maximization
- CAV: coronary allograft vasculopathy
- RMSE: root-mean-squared error
APPENDIX
A. METHOD FOR INTERVAL-CENSORED SINGLE-EVENT DATA
We give a description of the method for single-event interval-censored data as a special case of the multistate model.
A.1. Data and Notations
While we keep the notation consistent with Section 2, some adaptations are needed to represent single-event data. We consider a typical interval-censored dataset where each individual can experience the survival event at most once. Suppose that there are individuals in the dataset. The ith individual is sequentially monitored at the time sequence for the survival status. We let if the ith individual is known to have experienced the survival event at one of the monitoring times, and otherwise. For individuals with , we know that the true event time falls in one of the intervals , and we denote the interval by , with the censoring time defined by . For those who did not have an event, we know that the individual remains event-free until , and we let the censoring time be the last follow-up time . Let be the design interval that encompasses all monitoring times . The design interval is discretized into a union of disjoint small intervals , where , , and the time sequence encompasses all , and . Here we let be the event indicator process and be the at-risk process. For ease of exposition, we introduce the following notations to allow and to be functions of intervals. For an interval , we let , which means if and only if the event is observed before censoring and in the interval . Similarly, we let , which means if and only if the individual is at risk of the event before entering the interval and has not been censored in the interval . Let be a p-dimensional vector of covariates for the ith individual. The survival times are modeled by a discrete-time survival model with a complementary log-log link
where is a p-dimensional vector of regression coefficients, and are parameters that characterize the rate of events in the interval .
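To illustrate the data-generating mechanism, here is a minimal Julia sketch that simulates one individual's event interval under this discrete-time model; all parameter names (`alpha`, `beta`, `x`) are ours, not the paper's.

```julia
using Random

# Walk through the sub-intervals; in each, the event occurs with
# probability 1 - exp(-exp(alpha[k] + x'beta)). Returns the index of the
# interval containing the event, or 0 if no event occurs.
function simulate_event_interval(rng, alpha, beta, x)
    for k in eachindex(alpha)
        p = 1 - exp(-exp(alpha[k] + sum(beta .* x)))
        rand(rng) < p && return k
    end
    return 0
end

rng = MersenneTwister(2024)
alpha = fill(log(0.05), 100)   # constant baseline rate over 100 intervals
k = simulate_event_interval(rng, alpha, [0.5], [1.0])
```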
A.2. The Observed, Complete, and Partial Likelihood
Given the observed data, we only know the values of and partially. When , we know that is 0 before , and at least one of is non-zero in . So the log probability density function of the observed data is
(A1)
When , we know that is 0 before , so the log probability density function of the observed data is
Therefore, the observed log-likelihood function for and is given by . Similarly, we can construct augmented data that give a simple complete likelihood function by assuming complete knowledge of and . Given the values of and , the joint log probability density function of , , and the observed data is
We note that is the probability density function that specifies the probability of observing the interval-censored outcomes given the true survival information. By our assumption that the observation process is independent of the event process, should not depend on the parameters and . The complete log-likelihood function for and is then given by
where the term that does not depend on and can be dropped. We can similarly apply the Taylor approximation of Section 2.3, and the complete log-likelihood function can be approximated by
We can similarly re-write as a partial log-likelihood. By solving , we can derive the following expression for
After we plug in the above expression back to , then can be written as a partial likelihood function that is free of the nuisance parameters
After applying the zeroth-order approximation, the complete log-likelihood function closely resembles the Cox partial likelihood function.25 Similarly, we can apply the first-order approximation, , to , which reduces the log-likelihood function to
We can similarly derive the partial log-likelihood form using the technique
A.3. Model Estimation by EM Algorithm
As before, the EM algorithm consists of the expectation step and the maximization step. In the expectation step, we evaluate the expectation of the complete log-likelihood given the observed data and the current parameter estimates. We let and be the conditional probability and conditional expectation of a random variable given all observed data , and and be the conditional probability and conditional expectation of a random variable given all observed data from the individual . The expected complete log-likelihood given the observed data is
(A2)
The quantities that we need to evaluate in the expectation step are essentially and . Since and can only take values of 0 or 1, we have and .
Therefore, for individuals with , we have
For individuals with , we have and
So the problem boils down to calculating and when and . Given the observed information, we know the event occurred in the interval , so for at least one of the in the collection . Let ; then we have
Similarly,
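A hedged sketch of this calculation: given per-interval event probabilities `p` and the knowledge that the event fell somewhere in intervals `a:b`, the fractional event weights renormalize the unconditional interval probabilities (names illustrative, not the paper's code):

```julia
# E[N_k | event in intervals a..b]: probability of surviving to interval k
# and failing there, renormalized by the probability of failing in a..b.
function event_weights(p::Vector{Float64}, a::Int, b::Int)
    w = zeros(length(p))
    surv = 1.0
    for k in a:b
        w[k] = surv * p[k]
        surv *= 1 - p[k]
    end
    w[a:b] ./= 1 - surv   # condition on the event occurring in a..b
    return w
end

event_weights([0.1, 0.2, 0.3, 0.4], 2, 4)   # weights sum to one over 2:4
```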
We can see that the expected complete log-likelihood is a re-weighted version of the complete log-likelihood, where the weights can be interpreted as the probabilities of observing certain information. The first term is weighted by , representing the expected values of the event process, and the second term is weighted by , representing the expected values of the at-risk process. We can easily draw a parallel between the proposed method and the fractionally re-weighted at-risk process considered by others.30,29 When the survival status of an individual is unknown, the individual contributes to the risk set a fractional weight determined by the probability that the individual is still at risk in the interval . The proposed method extends the idea of the fractionally re-weighted at-risk process by Datta et al.30 from right-censored cases to interval-censored cases.
In the maximization step, we update the parameters to maximize the expected complete log-likelihood function . As in Section 2.4, we suggest updating using the partial log-likelihood function and then updating by plugging in the updated values of . The algorithm for the maximization step is described below
where and are the first- and second-order derivatives of . The EM algorithm proceeds by iterating between the expectation step and the maximization step until convergence. The convergence properties of the proposed EM algorithm are established as a proposition in Appendix B.
A.4. Comparing with Turnbull’s Method
We give a comparison between the proposed method and the seminal idea proposed by Turnbull2. Based on Turnbull’s idea, the log-likelihood function can be written as follows
where
and . By contrast, our current parameterization gives the following form of
where . From the above equations, we can observe that the difference between and is whether the event time probabilities are modeled conditionally on the at-risk process. It is noteworthy that the parameterization with enables us to operate with logarithms of products rather than logarithms of sums. Also, the parameterization with is subject to the constraints that and . These nuances explain why the parameterization with simplifies the problem.
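Schematically, and in our own illustrative notation: Turnbull's parameterization works with probability masses $p_k$ assigned to the sub-intervals $I_k$,

$$\ell_T(\mathbf{p}) = \sum_i \log\Big(\sum_{k:\, I_k \subseteq (L_i, R_i]} p_k\Big), \qquad p_k \ge 0, \quad \sum_k p_k = 1,$$

whereas the hazard parameterization works with conditional interval probabilities $\lambda_k$ for which

$$\Pr(T_i > t_k) = \prod_{j \le k} (1 - \lambda_j), \qquad 0 \le \lambda_k \le 1,$$

so the complete likelihood factors into per-interval Bernoulli contributions and involves logarithms of products rather than logarithms of sums.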
A.5. Simulation Studies
We present simulation studies to evaluate the proposed method for interval-censored single-event data. In the simulations, we let the design interval be (0,1]. The design interval is discretized into 100 sub-intervals and , where is the right end of the interval , and the covariates follow a multivariate normal distribution with mean and variance . The true regression coefficients are . The true survival outcomes are generated according to the assumed hazard rate and regression coefficients . The monitoring time sequences are generated such that follows an exponential distribution with mean , where , , , and are rounded to multiples of 0.01. We considered five different cases. In Cases (I), (II), and (III), we let the sample size be , respectively, and . In Cases (IV) and (V), we let , respectively, and . All simulations are repeated 500 times. We compare the following five methods. The proposed methods that approximate the log-likelihood functions by zeroth- and first-order Taylor expansions are denoted by "0th Order Approx" and "1st Order Approx". The method that directly maximizes the log-likelihood functions without approximation is denoted by "No Approx". The standard Cox proportional hazards model that does not account for interval censoring is denoted by "Cox PH". The method in the R package "icenReg", which implements a state-of-the-art gradient ascent algorithm for fitting interval-censored data, is denoted by "Package icenReg". In Table A1, we summarize the results by evaluating the biases of the estimated regression coefficients and the root-mean-squared error (RMSE) of the regression coefficients . In addition, we report the medians and interquartile ranges of the CPU times and the numbers of iterations taken to converge.
TABLE A1.
Estimation results and computational performance of the five methods used in modeling interval-censored single-event data. The accuracy of the estimation is evaluated by measuring the biases and RMSE of the estimated regression coefficients, with the corresponding standard errors in parentheses. The computational performance is evaluated by the median CPU times in seconds and median number of iterations required to converge, along with the corresponding inter-quartile ranges in brackets.
| Method | CPU Time (Seconds) | Iterations | |||||
|---|---|---|---|---|---|---|---|
|
| |||||||
| Case (I) | |||||||
| 0th Order Approx | −0.038 (0.002) | −0.031 (0.002) | −0.021 (0.002) | −0.006 (0.002) | 0.110 (0.002) | 0.21 [0.20,0.21] | 5 [5,5] |
| 1st Order Approx | −0.034 (0.002) | −0.028 (0.002) | −0.019 (0.002) | −0.005 (0.002) | 0.109 (0.002) | 0.22 [0.21,0.22] | 5 [5,5] |
| No Approx | 0.009 (0.003) | 0.004 (0.002) | 0.002 (0.002) | 0.005 (0.002) | 0.109 (0.002) | 46.41 [39.64,54.79] | 940 [832,1046] |
| Cox PH | −0.043 (0.003) | −0.034 (0.002) | −0.024 (0.002) | −0.007 (0.002) | 0.120 (0.002) | 0.01 [0.01,0.01] | 3 [3,3] |
| Package icenReg | 0.009 (0.003) | 0.005 (0.002) | 0.002 (0.002) | 0.006 (0.002) | 0.110 (0.002) | 0.09 [0.08,0.09] | 10 [9,11] |
| Case (II) | |||||||
| 0th Order Approx | −0.045 (0.002) | −0.033 (0.002) | −0.020 (0.001) | −0.012 (0.002) | 0.091 (0.001) | 0.49 [0.48,0.49] | 5 [5,5] |
| 1st Order Approx | −0.041 (0.002) | −0.030 (0.002) | −0.018 (0.002) | −0.011 (0.002) | 0.088 (0.001) | 0.51 [0.50,0.52] | 5 [5,5] |
| No Approx | 0.002 (0.002) | 0.002 (0.002) | 0.003 (0.002) | 0.000 (0.002) | 0.076 (0.001) | 100.52 [69.91,115.81] | 957 [885,1034] |
| Cox PH | −0.051 (0.002) | −0.037 (0.002) | −0.024 (0.002) | −0.014 (0.002) | 0.100 (0.001) | 0.02 [0.02,0.02] | 3 [3,3] |
| Package icenReg | 0.002 (0.002) | 0.002 (0.002) | 0.003 (0.002) | 0.000 (0.002) | 0.077 (0.001) | 0.75 [0.73,0.78] | 8 [8,8] |
| Case (III) | |||||||
| 0th Order Approx | −0.043 (0.001) | −0.035 (0.001) | −0.024 (0.001) | −0.009 (0.001) | 0.077 (0.001) | 0.76 [0.75,0.77] | 5 [5,5] |
| 1st Order Approx | −0.039 (0.001) | −0.032 (0.001) | −0.022 (0.001) | −0.008 (0.001) | 0.074 (0.001) | 0.79 [0.78,0.80] | 5 [5,5] |
| No Approx | 0.002 (0.001) | −0.001 (0.001) | −0.002 (0.001) | 0.002 (0.001) | 0.054 (0.001) | 137.35 [124.93,146.15] | 970 [921,1024] |
| Cox PH | −0.049 (0.001) | −0.039 (0.001) | −0.027 (0.001) | −0.011 (0.001) | 0.086 (0.001) | 0.02 [0.02,0.02] | 3 [3,3] |
| Package icenReg | 0.002 (0.001) | −0.001 (0.001) | −0.002 (0.001) | 0.002 (0.001) | 0.054 (0.001) | 0.43 [0.38,0.48] | 12 [11,14] |
| Case (IV) | |||||||
| 0th Order Approx | −0.033 (0.002) | −0.023 (0.002) | −0.014 (0.001) | −0.008 (0.002) | 0.081 (0.001) | 0.69 [0.62,0.71] | 5 [5,5] |
| 1st Order Approx | −0.028 (0.002) | −0.020 (0.002) | −0.011 (0.001) | −0.007 (0.002) | 0.079 (0.001) | 0.76 [0.66,0.78] | 5 [5,5] |
| No Approx | 0.002 (0.002) | 0.002 (0.002) | 0.004 (0.002) | 0.000 (0.002) | 0.075 (0.001) | 118.86 [99.55,129.15] | 918 [849,987] |
| Cox PH | −0.036 (0.002) | −0.025 (0.002) | −0.015 (0.002) | −0.010 (0.002) | 0.086 (0.001) | 0.01 [0.01,0.01] | 3 [3,3] |
| Package icenReg | 0.002 (0.002) | 0.003 (0.002) | 0.004 (0.002) | 0.000 (0.002) | 0.075 (0.001) | 0.76 [0.74,0.79] | 8 [8,8] |
| Case (V) | |||||||
| 0th Order Approx | −0.054 (0.001) | −0.040 (0.001) | −0.025 (0.001) | −0.014 (0.002) | 0.099 (0.001) | 0.41 [0.41,0.42] | 5 [5,5] |
| 1st Order Approx | −0.051 (0.001) | −0.038 (0.002) | −0.023 (0.001) | −0.013 (0.002) | 0.096 (0.001) | 0.43 [0.42,0.43] | 5 [5,5] |
| No Approx | 0.003 (0.002) | 0.003 (0.002) | 0.004 (0.002) | 0.001 (0.002) | 0.078 (0.001) | 144.31 [118.43,157.38] | 1015 [937,1080] |
| Cox PH | −0.062 (0.002) | −0.046 (0.002) | −0.029 (0.002) | −0.017 (0.002) | 0.111 (0.001) | 0.01 [0.01,0.01] | 3 [3,3] |
| Package icenReg | 0.003 (0.002) | 0.002 (0.002) | 0.004 (0.002) | 0.001 (0.002) | 0.078 (0.001) | 0.16 [0.15,0.17] | 9 [9,10] |
Here we briefly summarize the results. In terms of RMSE, "Cox PH" has the largest RMSE, implying that parameter estimates from models that do not account for interval censoring are not as efficient as those from methods that do. Among the three methods proposed in this manuscript ("0th Order Approx", "1st Order Approx", and "No Approx"), "No Approx" gives the estimates with the smallest RMSE while "0th Order Approx" gives the largest RMSE in most of the cases considered here. Since "1st Order Approx" improves on the order of approximation in "0th Order Approx", it gives a smaller RMSE than "0th Order Approx". The RMSE of the parameter estimates by "Package icenReg" is close to that of "No Approx". As a state-of-the-art algorithm for fitting interval-censored data, "Package icenReg" demonstrates good computational efficiency overall, with short CPU times and convergence in around 10 iterations. The CPU times of "0th Order Approx" and "1st Order Approx" are comparable to those of "Package icenReg", and the algorithms take only around 5 iterations to converge. By contrast, "No Approx" is relatively slow and usually requires a large number of iterations to achieve convergence. The results show that the techniques of approximating the log-likelihood and eliminating the nuisance parameters greatly facilitate the convergence of the algorithms. Among the three proposed methods, we found "1st Order Approx" the most promising: it is computationally efficient enough to be extended to interval-censored multistate data, and its loss of estimation efficiency due to the approximation is smaller than that of "0th Order Approx".
B. CONVERGENCE OF THE EM ALGORITHMS
B.1. Convergence of the EM Algorithm for Single-Event Models
We begin by demonstrating the convergence of the EM algorithm for single-event data, as it represents the more straightforward scenario. The convergence of the EM algorithm for multistate data can be obtained similarly with some slight modifications. In this subsection, we follow the notations introduced in Appendix A.
We present the following proposition to show the convergence property of the EM algorithm.
Proposition 1
(Convergence of the EM algorithm for the single-event model). Let be the augmented data and and be the parameters in the model. Then under assumptions (a)-(d) listed below, there exists a neighborhood of such that, for any initial value in , the sequence of parameter estimates generated by the EM algorithm converges to the maximizer , where and .
- (a) Across all values of , the event time processes and the processes of monitoring times are jointly independent. The processes of monitoring times are independent of the parameters and .
- (b) The survival times can be modeled by a discrete-time survival model with a complementary log-log link
- (c) The parameter space is compact with non-empty interior, and the maximizer of the log-likelihood function lies in the interior of .
- (d) has finitely many stationary points in .
Proof of Proposition 1. First, we will verify the following smoothness properties.
- (i) has at least second-order continuous derivatives with respect to .
- (ii) has at least second-order continuous derivatives with respect to both and .
To show (i), we note that
Since both and can only take values of 0 and 1, is defined on a discrete measure space , and the Lebesgue integral over reduces to a finite sum. As a consequence, the smoothness of follows from the smoothness of . Property (i) then follows from the fact that
has at least second-order continuous derivatives with respect to . Similarly, by Bayes' rule, we can write
For the same reason, the smoothness of in (ii) follows from the smoothness of and .
The smoothness properties imply that the derivatives of and should be 0 at their stationary points. The finiteness of the stationary points and the uniqueness of the maximizer imply that there exists some such that, for any in , is negative definite, and if and only if . Here we remark that the finiteness of the stationary points is needed to rule out the possibility that the observed data are non-informative about the underlying survival process indicated by . In the extreme case that is completely non-informative about the underlying survival process, is constant and all in are stationary points.
Next, we apply Theorem 6 of Wu39 to complete the proof. Suppose the initial value is in . By Theorem 1 of Dempster et al.22, is non-decreasing, so all subsequent are also in . The finiteness of and the continuity of imply that is a closed set. As a result, is bounded in , and there exists a constant such that all eigenvalues of are smaller than . Applying Taylor's expansion to at , we have
where the first-order term is zero by the fact that maximizes , and is some point on the line segment joining and . Therefore, we have
At this point, the remainder of the proof follows from Theorem 6 of Wu39.
B.2. Convergence of the EM Algorithm for Multistate Models
We also present the following proposition to show the convergence property of the EM algorithm for multistate models. In this subsection, we will follow the notations introduced in Section 2.
Proposition 2
(Convergence of the EM algorithm for multistate models). Let be the augmented data and . Let be the parameters in the model, which lie in a compact space with a non-empty interior. Then under assumptions (a'), (b'), (c), and (d), there exists a neighborhood of such that, for any initial value in , the sequence of parameter estimates generated by the EM algorithm converges to the maximizer , where , and .
- (a') Across all values of , the event time processes and the processes of monitoring times are jointly independent. The processes of monitoring times are independent of the parameters and .
- (b') The multistate data can be modeled by a discrete-time multistate model with a complementary log-log link
Proof of Proposition 2. The proof is omitted for brevity since it closely mirrors the proof of Proposition 1.
DATA AVAILABILITY STATEMENT
The data in the case study and the Julia code for implementing the proposed methods can be found online at https://github.com/luyouepiusf/approximation_method.
References
- 1. Lindsey J. A study of interval censoring in parametric regression models. Lifetime Data Analysis 1998; 4(4): 329–354.
- 2. Turnbull BW. The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society: Series B (Methodological) 1976; 38(3): 290–295.
- 3. Finkelstein DM, Wolfe RA. A semiparametric model for regression analysis of interval-censored failure time data. Biometrics 1985: 933–945.
- 4. Farrington C. Interval censored survival data: a generalized linear modelling approach. Statistics in Medicine 1996; 15(3): 283–292.
- 5. Tanner MA, Wong WH. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 1987; 82(398): 528–540.
- 6. Wang L, McMahan CS, Hudgens MG, Qureshi ZP. A flexible, computationally efficient method for fitting the proportional hazards model to interval-censored data. Biometrics 2016; 72(1): 222–231.
- 7. Marshall G, Jones RH. Multi-state models and diabetic retinopathy. Statistics in Medicine 1995; 14(18): 1975–1983.
- 8. Satten GA, Longini IM. Markov chains with measurement error: Estimating the "true" course of a marker of the progression of human immunodeficiency virus disease. Journal of the Royal Statistical Society: Series C (Applied Statistics) 1996; 45(3): 275–295.
- 9. Alioum A, Commenges D. MKVPCI: a computer program for Markov models with piecewise constant intensities and covariates. Computer Methods and Programs in Biomedicine 2001; 64(2): 109–119.
- 10. Frydman H, Szarek M. Nonparametric estimation in a Markov "illness–death" process from interval censored observations with missing intermediate transition status. Biometrics 2009; 65(1): 143–151.
- 11. Pak D, Li C, Todem D, Sohn W. A multistate model for correlated interval-censored life history data in caries research. Journal of the Royal Statistical Society: Series C (Applied Statistics) 2017; 66(2): 413–423.
- 12. Zhang H, Kelvin EA, Carpio A, Allen Hauser W. A multistate joint model for interval-censored event-history data subject to within-unit clustering and informative missingness, with application to neurocysticercosis research. Statistics in Medicine 2020; 39(23): 3195–3206.
- 13. Sharples LD. Use of the Gibbs sampler to estimate transition rates between grades of coronary disease following cardiac transplantation. Statistics in Medicine 1993; 12(12): 1155–1169.
- 14. Pan SL, Wu HM, Yen AMF, Chen THH. A Markov regression random-effects model for remission of functional disability in patients following a first stroke: a Bayesian approach. Statistics in Medicine 2007; 26(29): 5335–5353.
- 15. Van Den Hout A, Matthews FE. Estimating dementia-free life expectancy for Parkinson's patients using Bayesian inference and microsimulation. Biostatistics 2009; 10(4): 729–743.
- 16. Kneib T, Hennerfeind A. Bayesian semiparametric multi-state models. Statistical Modelling 2008; 8(2): 169–198.
- 17. De Iorio M, Gallot N, Valcarcel B, Wedderburn L. A Bayesian semiparametric Markov regression model for juvenile dermatomyositis. Statistics in Medicine 2018; 37(10): 1711–1731.
- 18. Huang J. Efficient estimation for the proportional hazards model with interval censoring. The Annals of Statistics 1996; 24(2): 540–568.
- 19. Lawless JF. A note on interval-censored lifetime data and the constant-sum condition of Oller, Gómez & Calle (2004). Canadian Journal of Statistics 2004; 32(3): 327–331.
- 20. Oller R, Gómez G, Calle ML. Interval censoring: identifiability and the constant-sum property. Biometrika 2007; 94(1): 61–70.
- 21. Van Dyk DA, Meng XL. The art of data augmentation. Journal of Computational and Graphical Statistics 2001; 10(1): 1–50.
- 22. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological) 1977; 39(1): 1–22.
- 23. Meng XL, Van Dyk D. Fast EM-type implementations for mixed effects models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 1998; 60(3): 559–578.
- 24. Murphy SA, Van Der Vaart AW. On profile likelihood. Journal of the American Statistical Association 2000; 95(450): 449–465.
- 25. Cox DR. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological) 1972; 34(2): 187–202.
- 26. Groeneboom P, Wellner JA. Information Bounds and Nonparametric Maximum Likelihood Estimation. Vol. 19. Springer Science & Business Media. 1992.
- 27. Zhang Z, Sun J. Interval censoring. Statistical Methods in Medical Research 2010; 19(1): 53–70.
- 28. Marshall G, Guo W, Jones RH. MARKOV: A computer program for multi-state Markov models with covariables. Computer Methods and Programs in Biomedicine 1995; 47(2): 147–156.
- 29. Datta S, Satten GA. Estimating future stage entry and occupation probabilities in a multistage model based on randomly right-censored data. Statistics & Probability Letters 2000; 50(1): 89–95.
- 30. Datta S, Satten GA, Datta S. Nonparametric estimation for the three-stage irreversible illness–death model. Biometrics 2000; 56(3): 841–847.
- 31. Jackson CH, Sharples LD, Thompson SG, Duffy SW, Couto E. Multistate Markov models for disease progression with classification error. Journal of the Royal Statistical Society: Series D (The Statistician) 2003; 52(2): 193–209.
- 32. Gu Y, Zeng D, Heiss G, Lin DY. Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data. Biometrika 2023: asad073.
- 33. Kalbfleisch J, Lawless JF. The analysis of panel data under a Markov assumption. Journal of the American Statistical Association 1985: 863–871.
- 34. Van Den Hout A. Multi-State Survival Models for Interval-Censored Data. CRC Press. 2016.
- 35. Tian L, Zucker D, Wei L. On the Cox model with time-varying regression coefficients. Journal of the American Statistical Association 2005; 100(469): 172–183.
- 36. Sun J, Kopciuk KA, Lu X. Polynomial spline estimation of partially linear single-index proportional hazards regression models. Computational Statistics & Data Analysis 2008; 53(1): 176–188.
- 37. Ma L, Hu T, Sun J. Cox regression analysis of dependent interval-censored failure time data. Computational Statistics & Data Analysis 2016; 103: 79–90.
- 38. Finkelstein DM, Goggins WB, Schoenfeld DA. Analysis of failure time data with dependent interval censoring. Biometrics 2002; 58(2): 298–304.
- 39. Wu CJ. On the convergence properties of the EM algorithm. The Annals of Statistics 1983; 11(1): 95–103.