Summary:
Prediction models for clinical decision making are of great importance and need to be updated frequently as the patient population and clinical practice change. Existing methods either are ad hoc, such as model recalibration, or focus on studying the relationship between predictors and outcome rather than on prediction. In this article, we propose a dynamic logistic state space model that continuously updates its parameters whenever new information becomes available. The proposed model allows for both time-varying and time-invariant coefficients. The varying coefficients are modeled using smoothing splines to account for their smooth trends over time, and the smoothing parameters are objectively chosen by maximum likelihood. The model is updated using batch data accumulated at prespecified time intervals, which allows for better approximation of the underlying binomial density. In simulations, we show that the new model has significantly higher prediction accuracy than existing methods. We apply the method to predict one-year survival after lung transplantation using United Network for Organ Sharing (UNOS) data.
Keywords: Dynamic prediction, Laplace approximation, Smoothing spline
1. Introduction
Accurately predicting the probability of clinical events is important for assessing disease risks, evaluating different treatment strategies, and making clinical decisions. A well-performing prediction model should maintain its accuracy over time (Steyerberg et al. 2013; Justice et al. 1999; Reilly and Evans, 2006; Debray et al. 2015). However, many clinical examples have shown that the performance of a risk prediction model worsens over time (Janssen et al. 2008). This is often due to changes in patient populations and standard of care, and, as a result, the model originally developed no longer predicts the outcome accurately. Different methods have been proposed to address this issue, including model refitting and recalibration. However, these approaches are often ad hoc and have been examined only for specific clinical examples (Kappen et al. 2012; Hickey et al. 2013; Toll et al. 2008; Vergouwe et al. 2017; Steyerberg et al. 2004; Janssen et al. 2008). For a detailed overview, see Su et al. (2018). In this paper, we propose a dynamic state space model in which the parameter estimates and the predictions of future outcomes can be sequentially updated whenever new information becomes available.
Our work is motivated by the Lung Allocation Score (LAS) that has been used to prioritize wait-listed patients for receiving lung transplant (Egan et al. 2006; Chan et al. 2015). The LAS score is derived from two risk prediction models that take into account waitlist urgency and post-transplant survival respectively. Over the years, there were important changes that may have altered the performance of both models. These include the changing composition of waitlisted patients toward older and sicker patients (Valapour et al. 2015), changes in the management of waitlisted patients (e.g., increasing use of extracorporeal membrane oxygenation (ECMO) as a bridge to transplantation) (Chiumello et al. 2015; Cypel and Keshavjee, 2011; Pal et al. 2015), and changing odds that waitlisted patients will be transplanted due to greater use of non-standard organs (e.g., organs from older donors, Valapour et al. 2015). Similarly, the second LAS model, which predicts post-transplant survival, does not reflect changes in recipient characteristics that are associated with outcomes (Singer et al. 2015; Lederer et al. 2011; Baldwin et al. 2012), changes in operative techniques (e.g., single vs. bilateral transplant, Schaffer et al. 2015), new pre-transplant donor management algorithms (Angel et al. 2006) or the advent of ex vivo lung perfusion prior to transplantation (Cypel et al. 2011). The prediction models for deriving the LAS score were first developed in 2005 and revised in 2010 and 2015 (Chan et al. 2015). Due to the infrequent model updating, there is a discrepancy between the population for which the model was developed and the population to which the model is applied. Consequently, the performance of LAS in clinical practice is unsatisfactory. Since the changes happen continuously over time, a better strategy is to fit a model that dynamically updates with those changes.
State space models are powerful tools for modeling dynamic systems in various disciplines (Durbin and Koopman, 2012). In this approach, the dynamic system is modeled over time by a set of latent parameters that determine the changing distribution of the observed data. The hierarchical model structure allows for both making statistical inference and predicting future outcomes, taking into account the observed past trajectory. State space models have been developed for both Gaussian (Chapter 3, Durbin and Koopman, 2012) and non-Gaussian outcomes (Chapter 9, Durbin and Koopman, 2012), with applications in biology (Wade, 2000; Bhattacharjee et al. 2018), epidemiology (Guihenneuc-Jouyaux, 2000; O’Neill, 2002), ecology (Reckhow, 1990) and other fields. Under the Gaussian models, there is a closed-form solution for estimating the parameters. For the non-Gaussian models, such as the binary outcome in the motivating LAS example, the solution often relies on complicated numerical integration. Normal approximations such as the Laplace approximation are commonly used (Lewis and Raftery, 1997) for computational convenience. To our knowledge, the most closely related dynamic modeling method for binary outcomes was proposed in McCormick et al. (2012), although they focused on examining the associations between the predictors and the outcome. Based on the state space framework, they proposed a logistic state space model in which all model coefficients vary over time. The state equation was modeled as a random walk in which the coefficients at the next time point are white noise perturbations of their current values. The white noise magnitude at each time point is a tuning parameter, which is chosen subjectively. In addition, the model did not take into account the trend of the model coefficients over time, and was updated one observation at a time. Since the estimation involves a normal approximation to the Bernoulli distribution, this may create a large approximation error that accumulates over time.
In clinical applications, it is often believed that the system characteristics change slowly over time. A commonly used approach is to model the parameter of interest as a smooth function over time, e.g., a smoothing spline. Wahba (1978, 1983) showed that the estimate of a smoothing spline for a Gaussian outcome can be obtained through the posterior estimate of a Gaussian stochastic process, and that the posterior variance can be used to construct Bayesian confidence intervals for the spline. Gu (1992) showed this also holds for non-Gaussian outcomes under a Laplace approximation. Wecker and Ansley (1983) showed how to fit smoothing splines using the state space model formulation. Motivated by the LAS model for post lung transplantation survival, we propose a dynamic logistic state space model that includes both time-varying and time-invariant coefficients modeled by the state equations. The time-varying coefficients are modeled using smoothing splines, similar to Wecker and Ansley (1983). The smoothing parameters are estimated via maximum likelihood on the training data. We compute K-step-ahead predictions using an algorithm similar to the Kalman filter. The smoothing parameters, model coefficients and predictions are updated when new data become available. Because of the hidden Markov structure of state space models, the likelihood calculation, coefficient estimation and prediction can all be computed recursively in O(n) operations; when new data become available, the updates only need to be computed on the additional observations. A standard state space model is updated as each new observation becomes available. Since the filtering step often involves a Laplace approximation for binary outcomes, updating one observation at a time can lead to large cumulative approximation errors. In addition, it is often not feasible or desirable to update the model one patient at a time in clinical applications.
Consequently, we propose to update the model at prespecified fixed time intervals, when a batch of patient data becomes available. Although the method is motivated by the LAS example, it can be applied to many other clinical settings, e.g., prediction of aortic valve replacement mortality (Clark et al. 2012), cardiac surgery risk prediction (Hickey et al. 2013) and prediction of mortality risk in ICU patients (Knaus et al. 1985). While computational speed is not critical in our application, the computational efficiency of our proposed method enables it to be implemented for online monitoring, which has a wide range of applications.
The rest of this article is organized as follows. In Section 2, we describe the model formulation. Section 3 describes the estimation and prediction procedures. In Section 4, we conduct simulation studies to evaluate the performance of the proposed method and compare it to a few established methods. Analyses of the lung transplant data are presented in Section 5 followed by a brief discussion in Section 6.
2. Modeling
2.1. The General Setup
Suppose the observed data for each subject i are (ti, xi, yi), i = 1, 2, ⋯ , n, where 0 ⩽ t1 < t2 < ⋯ < tn ⩽ 1 are observation times, xi = (xi1, ⋯ , xiq)T is a q-dimensional vector of covariates, and yi is a binary outcome. We propose the following model for yi with predictors xi:
logit{Pr(yi = 1 ∣ xi)} = xi1Tβ(ti) + xi2Tα,   (1)
where xi1 = (1, xi1, ⋯ , xiq1)T are the covariates with varying coefficients, xi2 = (xi,q1+1, ⋯ , xiq)T are the covariates with time-invariant coefficients, β(t) = (β0(t), β1(t), ⋯ , βq1(t))T, α = (α1, ⋯ , αq2)T are the corresponding coefficients, and q1 + q2 = q. The model can also include interaction terms defined as the product of covariates.
The parameters can be estimated through maximizing the following penalized log-likelihood function
Σi=1n [ yi{xi1Tβ(ti) + xi2Tα} − log(1 + exp{xi1Tβ(ti) + xi2Tα}) ] − Σj=0q1 λj ∫01 {βj″(t)}² dt.   (2)
It is well known that the maximizing functions β̂j(t), j = 0, 1, ⋯ , q1, are cubic smoothing splines. For data from exponential families, Gu (1992) established a connection between smoothing spline models and a Bayesian model from which the estimate of a smoothing spline can be obtained through the posterior estimate of a Gaussian stochastic process. Following Gu’s Bayesian approach, βj(t) is modeled as
βj(t) = B1j + B2j t + √bj ∫0t Wj(s) ds,   (3)
where B1j and B2j have a diffuse prior ([B1j, B2j]T ~ N(0, τI), with τ → ∞), Wj(s) are Wiener processes and bj are smoothing parameters. Gu (1992) showed that under the diffuse prior, if one approximates the posterior via Laplace’s method, the mean of the approximate posterior is the smoothing spline estimator when bj = 1/λj. Wecker and Ansley (1983) further showed that the stochastic model (3) can be written with the function βj(t) and its first derivative in the state vector as
(βj(t + δ), βj′(t + δ))T = Tj(δ) (βj(t), βj′(t))T + uj(t),   (4)

where δ > 0 is the time increment, Tj(δ) is the 2 × 2 matrix with rows (1, δ) and (0, 1), and uj(t) is a bivariate normal vector with mean zero and covariance matrix bj Ω(δ), with Ω(δ) having rows (δ³/3, δ²/2) and (δ²/2, δ).
The parameter λj is the smoothing parameter that controls the trade-off between smoothness and bias. When λj → ∞, βj(t) is restricted to be a straight line with intercept βj(0) and slope βj′(0). Thus our model (1) can be written in a state space representation as
logit{Pr(yi = 1 ∣ γi)} = ziTγi,   γi+1 = Tiγi + ui,   ui ~ N(0, Qi),

where γi = (β0(ti), β0′(ti), ⋯ , βq1(ti), βq1′(ti), α1, ⋯ , αq2)T is the state vector, zi places the covariates in xi1 next to the corresponding βj(ti) (with zeros in the derivative positions) and contains xi2 in the positions of α, Ti is block-diagonal with the 2 × 2 spline transition blocks Tj(ti+1 − ti) followed by an identity block for α, and Qi is block-diagonal with the blocks bjΩ(ti+1 − ti) followed by a zero block for α.
The distinctions between model specifications for varying and constant coefficients are reflected in both the transition matrix Ti and variance matrix Qi. The identity transition in Ti that corresponds to α and zero variance in Qi implies that α are constant coefficients. In state space models, the estimates of the coefficients are updated as each new observation becomes available. For non-Gaussian state space models, numerical integrations or normal approximations are typically used in the sequential updates. Numerical approximation of the binary distribution has a large approximation error that can accumulate over time. Instead, we propose to update the estimates over fixed time intervals with batch data, which can lead to more accurate approximation.
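As an illustration, the block-diagonal structure of Ti and Qi can be sketched in a few lines of code. This is a Python (numpy) sketch, not the authors' implementation (which is in R); the helper names spline_blocks and build_system are ours, and the 2 × 2 spline blocks follow the standard integrated-Wiener-process form for cubic smoothing splines with time increment delta and variance parameter b.

```python
import numpy as np

def spline_blocks(delta, b):
    """2x2 transition and noise-covariance blocks for one cubic-spline
    coefficient, with state [beta_j(t), beta_j'(t)]."""
    T = np.array([[1.0, delta],
                  [0.0, 1.0]])
    Q = b * np.array([[delta**3 / 3.0, delta**2 / 2.0],
                      [delta**2 / 2.0, delta]])
    return T, Q

def build_system(delta, b_list, q2):
    """Block-diagonal T and Q: one spline block per varying coefficient,
    then an identity transition / zero variance block for q2 constants."""
    d = 2 * len(b_list) + q2
    T = np.eye(d)                       # identity block keeps alpha constant
    Q = np.zeros((d, d))                # zero variance block for alpha
    for j, b in enumerate(b_list):
        Tj, Qj = spline_blocks(delta, b)
        sl = slice(2 * j, 2 * j + 2)
        T[sl, sl] = Tj
        Q[sl, sl] = Qj
    return T, Q

# Two varying coefficients, three constant coefficients, step delta = 0.01.
T, Q = build_system(delta=0.01, b_list=[1.0, 0.5], q2=3)
```

The identity and zero blocks in the last three rows/columns are exactly what makes the corresponding coefficients time-invariant.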
2.2. Model Specifications for batch data
We first divide the time domain [0, 1] into S equally spaced intervals [(s − 1)/S, s/S], s = 1, 2, ⋯ , S. Let ys = {ysm, m = 1, 2, ⋯ , ns} be the set of observations ysm with tsm ∈ [(s − 1)/S, s/S], with corresponding covariates xs = {xsm, m = 1, 2, ⋯ , ns}. Let ts* = (2s − 1)/(2S) be the mid-point of the s-th interval. In the following, we use the batch data (xs, ys) to update the model; specifically, we evaluate the coefficient functions for the data in the interval [(s − 1)/S, s/S] at the mid-point ts*. We propose the following state space model for batch data,
logit{Pr(ysm = 1 ∣ xsm)} = xsm,1Tβ(ts*) + xsm,2Tα,  m = 1, ⋯ , ns,   (5)

γs+1 = Tγs + us,  us ~ N(0, Q),   (6)

where γs = (β0(ts*), β0′(ts*), ⋯ , βq1(ts*), βq1′(ts*), α1, ⋯ , αq2)T, and T and Q are the block-diagonal transition and variance matrices defined in Section 2.1 with time increment δ = 1/S.
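The batching scheme is straightforward to implement. The following Python sketch (with simulated placeholder covariates and outcomes, since the real data come from the application) bins observations on [0, 1] into S = 100 equal intervals and records each interval's mid-point:

```python
import numpy as np

rng = np.random.default_rng(0)
n, S = 2000, 100
t = np.sort(rng.uniform(0.0, 1.0, n))      # observation times in [0, 1]
x = rng.uniform(-3.0, 3.0, size=(n, 2))    # placeholder covariates
y = rng.integers(0, 2, size=n)             # placeholder binary outcomes

# Interval index s = 0..S-1 for each observation (t = 1 goes to the last one).
s_idx = np.minimum((t * S).astype(int), S - 1)
midpoints = (np.arange(S) + 0.5) / S       # mid-point of each interval
batches = [(x[s_idx == s], y[s_idx == s]) for s in range(S)]
```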
3. Estimation and Prediction
Due to the recursive nature of state space models, for given smoothing parameters the model coefficients can be efficiently estimated using a Kalman-filter-like algorithm that involves a prediction step and a filtering step at each time point. Using the same algorithm, the likelihood can also be calculated and maximized to estimate the smoothing parameters. Given the estimates of the smoothing parameters, the K-step-ahead prediction can be obtained by running the Kalman filtering prediction steps without the filtering steps. When new data become available, the coefficients can be efficiently updated by running the algorithm from the original estimates over the new observations.
We first outline the general structure of our algorithm:
1. Fit the model on the training dataset.
   (a) Run the Kalman filtering algorithm described in Section 3.1 to compute the likelihood function defined in Section 3.2.
   (b) Numerically maximize the likelihood function to obtain estimates of the smoothing parameters.
   (c) Based on the estimated smoothing parameters, run the Kalman filtering and smoothing algorithm to estimate both the time-varying and time-invariant coefficients in the model.
2. K-step-ahead prediction: run the Kalman filter K steps ahead, treating the corresponding observations as missing data as described in Section 3.3.
3. Update the smoothing parameters and coefficients when new data become available.
   (a) Update the smoothing parameters based on the likelihood of all available data.
   (b) Based on the new estimates of the smoothing parameters, repeat steps (1c) and (2) for estimation and prediction.
We next describe the Kalman filtering and smoothing, the maximum likelihood estimates for the smoothing parameters and the dynamic prediction steps.
3.1. Kalman filtering and smoothing
In this section we describe how to fit our state space model using a Kalman-filter-like algorithm. Let Ys = (y1, ⋯ , ys) and Xs = (x1, ⋯ , xs) be the sets of past observations of the outcome and covariates up to time t = s/S. For given smoothing parameters λj, j = 0, 1, ⋯ , q1, the algorithm sequentially repeats the prediction step and the filtering step to estimate the coefficients. For simplicity of notation, write γs for the state vector corresponding to the s-th batch. The general Kalman filtering algorithm proceeds in the following steps:
i) Start with s = 0 and take the diffuse prior p(γ0 ∣ Y0) ~ N(0, cI), where c is a large constant, as the initial distribution.

ii) Prediction step: calculate

E(γs ∣ Ys−1, Xs−1) = T E(γs−1 ∣ Ys−1, Xs−1),
Var(γs ∣ Ys−1, Xs−1) = T Var(γs−1 ∣ Ys−1, Xs−1) TT + Q.

iii) Filtering step: calculate the posterior mean E(γs ∣ Ys−1, Xs−1, ys, xs) and variance Var(γs ∣ Ys−1, Xs−1, ys, xs) through a normal approximation.

Repeat steps ii) and iii) successively for s = 1, 2, ⋯ , S.
In the filtering step, the posterior distribution of coefficients γs can be written as
p(γs ∣ Ys, Xs) ∝ p(ys ∣ γs, xs) p(γs ∣ Ys−1, Xs−1).   (7)
Here p(γs ∣ Ys−1, Xs−1) is a normal distribution with mean E(γs ∣ Ys−1, Xs−1) and variance Var(γs ∣ Ys−1, Xs−1). However, since p(ys ∣ γs, xs) is a product of Bernoulli densities, the posterior distribution (7) does not have a closed-form expression. We approximate it with a normal distribution whose mean is the mode of the posterior distribution.
Denote
ϕ(γs) = log p(ys ∣ γs, xs) + log p(γs ∣ Ys−1, Xs−1).   (8)
The Newton-Raphson method can be used to maximize ϕ(γs) to find its mode. Take the predicted mean E(γs ∣ Ys−1, Xs−1) as the starting value and iterate

γs(k+1) = γs(k) − {D2ϕ(γs(k))}−1 Dϕ(γs(k))

until convergence, where Dϕ and D2ϕ are the first and second derivative operators. Let γ̂s be the converged value; the posterior mean and variance are then approximated by
E(γs ∣ Ys, Xs) ≈ γ̂s,  Var(γs ∣ Ys, Xs) ≈ −{D2ϕ(γ̂s)}−1.   (9)
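As an illustration of the filtering step, the following Python (numpy) sketch implements the Newton-Raphson maximization of ϕ(γs) and the resulting posterior approximation for one batch. It is a minimal sketch under simplifying assumptions (the design matrix X maps the state directly to the logits, and filter_step is our own name), not the authors' R implementation.

```python
import numpy as np

def filter_step(m_pred, P_pred, X, y, n_iter=50, tol=1e-10):
    """One Laplace-approximation filtering step for a batch of binary
    outcomes: Newton-Raphson maximization of
        phi(g) = log p(y | g, X) + log p(g | past),
    started from the predicted mean m_pred. Returns the approximate
    posterior mean (the mode) and variance (-(D^2 phi)^{-1} at the mode)."""
    P_inv = np.linalg.inv(P_pred)
    g = np.asarray(m_pred, dtype=float).copy()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ g)))
        grad = X.T @ (y - p) - P_inv @ (g - m_pred)     # D phi
        hess = -(X.T * (p * (1.0 - p))) @ X - P_inv     # D^2 phi
        step = np.linalg.solve(hess, grad)
        g = g - step
        if np.max(np.abs(step)) < tol:
            break
    # Recompute the Hessian at the converged mode for the variance.
    p = 1.0 / (1.0 + np.exp(-(X @ g)))
    hess = -(X.T * (p * (1.0 - p))) @ X - P_inv
    return g, np.linalg.inv(-hess)

# Toy batch: 40 observations, 3 state entries entering the logit.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
truth = np.array([0.5, -1.0, 0.2])
y = (rng.uniform(size=40) < 1.0 / (1.0 + np.exp(-(X @ truth)))).astype(float)
m_post, P_post = filter_step(np.zeros(3), 10.0 * np.eye(3), X, y)
```

Because the log-concave Bernoulli likelihood is combined with a Gaussian prior, the Newton iteration converges quickly and the posterior variance is never larger (in the Loewner order) than the prior variance.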
Based on the Kalman filtering results, a fixed-interval smoothing algorithm can be used to compute the posterior means and variances of the coefficients given all the data. It proceeds by the backward recursion, for s = S − 1, S − 2, ⋯ , 2, 1,

E(γs ∣ YS, XS) = E(γs ∣ Ys, Xs) + As{E(γs+1 ∣ YS, XS) − E(γs+1 ∣ Ys, Xs)},
Var(γs ∣ YS, XS) = Var(γs ∣ Ys, Xs) + As{Var(γs+1 ∣ YS, XS) − Var(γs+1 ∣ Ys, Xs)}AsT,

where As = Var(γs ∣ Ys, Xs) TT {Var(γs+1 ∣ Ys, Xs)}−1.
3.2. Smoothing parameter selection
In the model, the smoothing parameters λj control the smoothness of βj(t), j = 0, 1, ⋯ , q1, respectively. The selection of the λj plays a key role in model fitting and prediction. We propose to estimate the smoothing parameters by maximizing the log-likelihood

Σs=1S log p(ys ∣ Ys−1, Xs),

where

p(ys ∣ Ys−1, Xs) = ∫ p(ys ∣ γs, xs) p(γs ∣ Ys−1, Xs−1) dγs
and is not available in closed form. A commonly used remedy is the Laplace approximation, which Lewis and Raftery (1997) found to be quite accurate. The Laplace approximation yields

p(ys ∣ Ys−1, Xs) ≈ (2π)d/2 ∣−D2ℓs(γ̂s)∣−1/2 exp{ℓs(γ̂s)},

where ℓs(γ) = log p(ys ∣ γ, xs) + log p(γ ∣ Ys−1, Xs−1), γ̂s is its mode, and d is the dimension of γs. We can then directly optimize the resulting log-likelihood using existing optimizers in R, such as optim.
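As a small sanity check of the Laplace approximation used here, the following Python sketch compares it with a dense numerical integration in a one-dimensional toy version of the integral: a batch of Bernoulli outcomes sharing a scalar logit with a normal prior. The setup is illustrative, not taken from the paper.

```python
import numpy as np

def log_joint(g, y, prior_mean, prior_var):
    """log p(y | g) + log p(g) for Bernoulli outcomes sharing logit g."""
    loglik = np.sum(y * g - np.log1p(np.exp(g)))
    logprior = (-0.5 * (g - prior_mean) ** 2 / prior_var
                - 0.5 * np.log(2 * np.pi * prior_var))
    return loglik + logprior

y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
mu, v = 0.0, 4.0

# Newton-Raphson for the mode of log_joint.
g = mu
for _ in range(100):
    p = 1.0 / (1.0 + np.exp(-g))
    grad = np.sum(y - p) - (g - mu) / v
    hess = -len(y) * p * (1 - p) - 1.0 / v
    g -= grad / hess

# Laplace: integral ~ exp{l(mode)} * sqrt(2*pi / -l''(mode)).
p = 1.0 / (1.0 + np.exp(-g))
hess = -len(y) * p * (1 - p) - 1.0 / v
laplace = log_joint(g, y, mu, v) + 0.5 * np.log(2 * np.pi / -hess)

# Dense-grid reference value of the log marginal likelihood.
grid = np.linspace(-10.0, 10.0, 20001)
vals = np.exp([log_joint(gg, y, mu, v) for gg in grid])
exact = np.log(np.sum(vals) * (grid[1] - grid[0]))
```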
3.3. K-step-ahead Prediction
After fitting models (5) and (6) on the training dataset, the K-step-ahead predictions are calculated by running the prediction step forward from the last time point of the training dataset. The K-step-ahead predicted model coefficients follow

E(γS+K ∣ YS, XS) = TK E(γS ∣ YS, XS),
Var(γS+K ∣ YS, XS) = TK Var(γS ∣ YS, XS) (TK)T + Σk=0K−1 Tk Q (Tk)T,

where K = 1, 2, 3, ⋯ . The predicted probability of y = 1 for given covariates can then be calculated from equation (5) by replacing the coefficients with their predicted expectations.
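The K-step-ahead recursion amounts to repeating the prediction step with no filtering updates. A minimal Python sketch, with hypothetical helper names predict_k_steps and predicted_prob:

```python
import numpy as np

def predict_k_steps(m, P, T, Q, K):
    """Run the prediction step K times with no filtering updates,
    i.e., treat the K future batches as missing data."""
    means, variances = [], []
    for _ in range(K):
        m = T @ m
        P = T @ P @ T.T + Q
        means.append(m.copy())
        variances.append(P.copy())
    return means, variances

def predicted_prob(z, m):
    """Plug-in predicted probability of y = 1 for covariate row z."""
    return 1.0 / (1.0 + np.exp(-(z @ m)))

# With an identity transition and no state noise, the forecast is static;
# with state noise, only the variance grows.
m0, P0 = np.array([0.3, 0.0]), np.eye(2)
means, variances = predict_k_steps(m0, P0, np.eye(2), np.zeros((2, 2)), 3)
means3, variances3 = predict_k_steps(m0, P0, np.eye(2), np.eye(2), 3)
```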
4. Simulation Study
In this section, we evaluate the predictive performance of the proposed model using simulated data. We also compare our proposed method with the method proposed by McCormick et al. (2012) (referred to as the dynamic model averaging (DMA) method) and standard logistic regression in which all coefficients are constant (referred to as the GLM method).
We simulated a binary outcome from the model

logit{Pr(Yi = 1 ∣ Xi1, Xi2)} = β0(ti) + β1(ti)Xi1 + αXi2,

where ti was generated from a uniform distribution U[0, 1] and Xi1 and Xi2 were independently generated from a uniform distribution U[−3, 3]. β0(t) was the time-varying intercept, β1(t) = cos(2πt) was the time-varying coefficient for Xi1, and α = 1 was a constant coefficient. The average outcome prevalence, i.e., Pr(Y = 1), over time was around 25%. We evaluated the model performance under three sample sizes, n = 2000, 4000, and 8000. The first 75% of the data was used to fit the model, and the remaining 25% was used to test model performance. The simulation was repeated 200 times for each scenario.
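The simulation design can be reproduced with a short Python script. Note that β1(t) = cos(2πt) and α = 1 follow the text, but the exact form of the varying intercept β0(t) is not reproduced here, so β0(t) = −2 + sin(2πt) is an assumed illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2024)

def simulate(n):
    """Generate one simulated dataset of size n. beta1(t) = cos(2*pi*t)
    and alpha = 1 follow the text; the shape of the varying intercept
    beta0(t) below is an assumed illustrative choice."""
    t = np.sort(rng.uniform(0.0, 1.0, n))
    X1 = rng.uniform(-3.0, 3.0, n)
    X2 = rng.uniform(-3.0, 3.0, n)
    beta0 = -2.0 + np.sin(2.0 * np.pi * t)   # assumed intercept trajectory
    beta1 = np.cos(2.0 * np.pi * t)
    alpha = 1.0
    p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * X1 + alpha * X2)))
    y = (rng.uniform(size=n) < p).astype(int)
    return t, X1, X2, y, p

t, X1, X2, y, p = simulate(8000)
n_train = int(0.75 * len(y))   # first 75% of observations for training
```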
For our proposed DLSSM method, we first divided the data into batches: we divided the time domain into 100 equally spaced intervals, [(s − 1)/100, s/100], s = 1, 2, ⋯ , 100. On average, there were 20, 40, and 80 subjects within each time interval for total sample sizes of 2000, 4000, and 8000, respectively. We used a normal distribution N(0, 100 · I5) as the initial prior distribution of the coefficients. We used the criterion proposed in Section 3.2 to select the smoothing parameters for β0(t) and β1(t), denoted λ0 and λ1, using the first 75% of the data. Both λ0 and λ1 were then held fixed when making predictions on the remaining data.
We applied the DMA method and fit the following model using the R package dma (McCormick et al. 2018):

logit{Pr(Yi = 1 ∣ Xi1, Xi2)} = β0(ti) + β1(ti)Xi1 + β2(ti)Xi2.

All coefficients were treated as time varying and were updated each time a new observation became available. We chose the automatic tuning option in the package to select the forgetting factor and used the default option of generating initial values from the first 10% of the data. We also fit a standard logistic model in which all coefficients were fixed, logit{Pr(Yi = 1)} = γ0 + γ1Xi1 + γ2Xi2.
We compared the estimated functional curves and corresponding 95% confidence bands for each coefficient using the three methods. We also evaluated the accuracy of the predicted probability of outcome using the scaled root of mean square error (SRMSE) criterion
| (10) |
where ni is the sample size of the ith batch, and p̂ij and pij are the estimated and true probabilities of Y = 1 for subject j in batch i. Note that in real data analysis this criterion cannot be used because the true probability pij is unknown. We therefore also evaluated the prediction performance using the Brier score (Brier, 1950), i.e.,
BSi = (1/ni) Σj=1ni (p̂ij − yij)².   (11)
To evaluate how far ahead the model can predict the outcome well, we compared the performance of the three methods for doing K-step ahead predictions, K = 1, 2, 3, 4, 5. For example, when K = 1, we predicted the model coefficients and the probability of outcome one time step ahead in the future.
Figure 1 shows the estimated coefficients for the intercept (left panel), X1 (middle panel), and X2 (right panel) with corresponding 95% confidence bands from each method when the sample size is n = 8000. The coefficient plots are divided at t = 0.75, indicated by the vertical dotted line, separating the training and validation data. The figure shows the estimated coefficients in the training set (t ⩽ 0.75) and one-step-ahead predicted coefficients in the validation set (t > 0.75). The estimated coefficient curves from our proposed method were close to the true curves (red lines), and the 95% confidence bands covered the true values in both the training and validation data. The GLM, a static model, captured only the average of the coefficients over time and, as one would expect, did not perform well. The estimated coefficient curves from the DMA method had larger variation over time and wider confidence bands than our proposed method. This larger variation occurs because the DMA coefficients at the next time point are random perturbations of their current values. In addition, DMA updates the coefficients one observation at a time, which incurs a larger approximation error than our batch updates, and this error may accumulate over time.
Figure 1.
The plot of coefficients for the intercept (left), X1 (middle) and X2 (right) with 95% confidence bands using the GLM (blue line), DMA (gray line) and DLSSM (black line) method. The sample size is n = 8000. The red line represents the true value for each parameter. The coefficient plots were divided at t = 0.75, corresponding to the estimated coefficient in the training data and one-step ahead prediction in the validation data respectively.
Figure 2 shows box plots of the SRMSE across 100 simulations for K-step-ahead predictions, K = 1, 2, ⋯ , 5, under sample sizes n = 2000, 4000, and 8000. Our proposed method had much smaller SRMSE than the other two methods in all scenarios. As the sample size increased, the SRMSE of all three methods decreased, especially for the DMA method. Figure 3 shows box plots of the Brier score for the same simulations as in Figure 2. Similarly, the Brier scores of our proposed method were smaller than those of the other two methods in all scenarios. Performance worsened as the batch size decreased, because the error of the normal approximation increases with smaller batches. We also found a small loss of efficiency when all coefficients were allowed to change over time while some were actually constant.
Figure 2.
Boxplots of the SRMSE from the 100 simulations using the three methods (GLM, DMA, DLSSM) for K step-ahead prediction, K = 1, 2, 3, 4, 5 with different sample size n = 2000, 4000, 8000.
Figure 3.
Boxplots of the Brier score from the 100 simulations using the three methods (GLM, DMA, DLSSM) for K step-ahead prediction, K = 1, 2, 3, 4, 5 with different sample size n = 2000, 4000, 8000.
5. Data Analysis
Lung transplantation is the only life-saving therapy for many forms of advanced lung disease (ALD). Since May 1, 2005, all patients with ALD aged 12 or older who were placed on a waitlist for transplant have been prioritized to receive organs using the LAS (Egan et al. 2006). The LAS represents the difference between transplant benefit and waiting list urgency, with transplant benefit defined as post-transplant survival minus waiting list survival, and waiting list urgency defined as waiting list survival. The estimates of waiting list and post-transplant survival are obtained from two prediction models: one predicts survival up to one year on the waitlist without transplant, and the other predicts survival one year after transplant. The introduction of the LAS has been associated with fewer deaths and shorter time on the waitlist for patients ultimately transplanted (Valapour et al., 2015; Gries et al. 2007). However, many studies have concluded that the LAS is broken because it prioritizes older, sicker patients with reduced post-transplant survival (Tsuang et al., 2013; Kotloff, 2013); excludes key prognostic variables (Chen et al., 2009), which leads to higher post-transplant costs (Maxwell et al., 2015); and has failed to improve upon pre-existing problems such as gender-based (Wille et al., 2013) and geographic disparities in access (Russo et al., 2013; Thabut et al., 2012). Some of these limitations may be attributed to how infrequently the model is updated, so that the model currently in use does not accurately reflect the evolving patient population and changes in standard of care.
Our lung transplant data came from the United Network for Organ Sharing (UNOS) database, which includes all patients listed for single or bilateral lung transplantation in the U.S. Transplant procedures are distributed across an average of 60 lung transplant centers. As required by law, all U.S. transplant centers must upload data for all waitlisted patients to UNOS in order for such patients to be considered for transplantation. Patients younger than 12 are excluded because the LAS was not developed for children and is not used to prioritize them. In this analysis, we included 14,613 patients who received a single lung transplant from 2007 to 2016, after removing missing data. The outcome is a binary variable indicating whether a patient died within one year after lung transplant. We included the six most significant predictors from the LAS equation: age, creatinine, six-minute walk distance (SMWD), mechanical ventilation (MV), functional status (FS), and diagnosis, which includes cystic fibrosis (CF), obstructive disease (OD), pulmonary fibrosis (PF), and pulmonary hypertension (PH). We created four binary indicators to represent the diagnosis categories. The distributions of these covariates are plotted by calendar year in Figure 4. Figure 5 shows the mortality risk by year, which decreased over time; the average mortality risk was 13.5%.
Figure 4.
Distribution of the covariates by calendar year in the UNOS data. MV: mechanical ventilation. For diagnosis, 1: cystic fibrosis; 2: obstructive disease; 3: pulmonary fibrosis; 4: pulmonary hypertension.
Figure 5.
Mortality risk from year 2007 to 2016 in the UNOS data.
We first fit separate models for each predictor. Within each model, we performed a formal test of whether the coefficient changed over time. We also fit a multivariable model that allowed the coefficients of all predictors to change over time. We found that the only coefficient that changed over time was that of functional status. In the final model, we therefore modeled the coefficient for functional status as time varying, in addition to the time-varying intercept, and kept the other coefficients constant:

logit{Pr(Yi = 1)} = β0(ti) + β1(ti)FSi + α1Agei + α2MVi + α3SMWDi + α4Creatininei + α5ODi + α6PFi + α7PHi.
We combined the data within each calendar month and used the data from 2007-2013 to fit the model, which was then used to dynamically predict one-year mortality risk in the remaining 2014-2016 data. N(0, 100 · I11) was used as the diffuse prior for the coefficients. Using the training data, the smoothing parameters were selected as 3.5e−20 and 1.5e−44 for β0(t) and β1(t) respectively, suggesting close-to-linear functions. Finally, we performed one-month-ahead prediction on the 2014-2016 data, with the smoothing parameters held fixed.
Figure 6 shows the estimated β0(t) and β1(t) using the training data up to year 2013 and one-step ahead prediction of the two coefficients after year 2013. The intercept β0(t) decreased over time, corresponding to the decreased mortality risk over time. β1(t) had an increasing trend toward zero, which suggested that the effect of Functional Status became weaker over time. Table 1 summarizes the estimated constant coefficients for the other covariates, which were all significant predictors of the outcome.
Figure 6.
The plots of time-varying intercept β0(t) and coefficient for functional status β1(t), with 95% confidence bands in the UNOS data. The coefficient plots were divided at year 2013, corresponding to the filtering estimate (black line) and smoothing estimate (red lines) in the training set and one-step ahead prediction in the validation set respectively.
Table 1.
Parameter estimate and standard error (SE) of the constant coefficients in the UNOS data.
| Estimator | Age (α1) | MV (α2) | SMWD (α3) | Creatinine (α4) | OD (α5) | PF (α6) | PH (α7) |
|---|---|---|---|---|---|---|---|
| Estimate | 0.0229 | −0.3367 | 0.3456 | 0.5775 | 0.2743 | 0.2864 | 0.1840 |
| SE | 0.0035 | 0.0720 | 0.1351 | 0.1088 | 0.1798 | 0.1554 | 0.0714 |
We evaluated the model performance using a scaled Brier score (SBS), defined as
| (12) |
where

ȳi = (1/ni) Σj=1ni yij   (13)

is the average outcome among the ni subjects at time ti.
We also evaluated the prediction performance using the true classification rate. We defined a binary prediction of death based on the predicted probability, using as cutoff the average mortality risk in the most recent month. The true classification rate is the percentage of subjects for which the model correctly predicts the outcome.
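As an illustration, the true classification rate can be computed as follows; the function name and the toy numbers are ours, with the threshold playing the role of the most recent month's average mortality risk.

```python
import numpy as np

def true_classification_rate(y, p_hat, threshold):
    """Fraction of subjects whose binary outcome is correctly predicted
    when predicted probabilities are dichotomized at `threshold`."""
    y_pred = (np.asarray(p_hat) >= threshold).astype(int)
    return float(np.mean(y_pred == np.asarray(y)))

# Toy example with threshold set to a recent average mortality risk.
y = np.array([0, 0, 1, 0, 1])
p_hat = np.array([0.05, 0.20, 0.40, 0.10, 0.30])
tcr = true_classification_rate(y, p_hat, threshold=0.135)
```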
Table 2 shows the true classification rate and scaled Brier score of the three methods for K-step ahead prediction. The proposed DLSSM method had a higher true classification rate and smaller scaled Brier score compared to the DMA and GLM method.
Table 2.
Summary of the true classification rate (TCR) and scaled Brier score (SBS) for K-step ahead prediction, K = 1, 2, …, 5 in the validation set of the UNOS data.
| Criterion | Method | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 |
|---|---|---|---|---|---|---|
| TCR | DLSSM | 0.5832 | 0.5863 | 0.5933 | 0.6047 | 0.6061 |
| DMA | 0.5006 | 0.5118 | 0.5060 | 0.4936 | 0.5019 | |
| GLM | 0.3348 | 0.3567 | 0.3581 | 0.3637 | 0.3585 | |
| SBS | DLSSM | 2.7794 | 2.7755 | 2.7781 | 2.8064 | 2.8288 |
| DMA | 2.8115 | 2.8104 | 2.8069 | 2.8380 | 2.8815 | |
| GLM | 2.8191 | 2.8147 | 2.8173 | 2.8483 | 2.8730 |
6. Discussion
A new dynamic logistic state space prediction model (DLSSM) was proposed to dynamically predict binary outcomes, in which the time-varying coefficients were modeled as cubic smoothing splines. Smoothing splines of higher order can also be used in the model; Wecker and Ansley (1983) provided a general framework for smoothing splines of any order. For the Laplace approximation, we used a first-order approximation, which showed reasonably good performance. A second-order Laplace approximation could also be used, although we do not expect a substantial improvement in prediction performance. While our application only requires updating the model every six months, the algorithm can also be applied to online monitoring, which has a wide range of applications.
A reviewer suggested that we could potentially adapt the Polya-Gamma data augmentation method (Polson et al. 2013) for the filtering step so that the Laplace approximation would not be needed. A similar idea was proposed by Carlin, Polson, and Stoffer (1992), who performed Gibbs sampling at each filtering and prediction step. It was later found that the approximation errors accumulate over time and lead to very slow convergence. The state of the art in Bayesian analysis of state space models is to draw the entire history of the state vector over time using forward filtering/backward sampling (Durbin and Koopman 2012, Ch. 13). For a fixed dataset, this leads to reasonable convergence. However, it would be computationally intensive to implement online, because for each new batch of observations we would have to restart the MCMC process and could not exploit the recursive nature of state space models. We therefore adopted the Laplace approximation in the filtering step.
The performance of the proposed DLSSM depends on the normal approximation of the product of binary distributions, which is related to the batch size and the event rate of the binary outcome. In our simulation and application, the algorithm performed well for batch sizes greater than 10. The algorithm is computationally efficient: the computational burden is mainly related to smoothing parameter selection, which took about two minutes for a sample size of n = 8000 on a laptop with a standard configuration. In contrast, the prediction and filtering steps can be computed almost instantaneously.
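The batch-size consideration can be illustrated with a toy one-dimensional check (our own sketch, not the paper's code): for a scalar logit with a standard normal prior, compare the Laplace (mode-based) posterior summary against a quadrature-based "exact" posterior mean. As the batch grows, the posterior becomes closer to Gaussian and the mode-versus-mean gap shrinks.

```python
import numpy as np

GRID = np.linspace(-8.0, 8.0, 4001)

def exact_posterior_mean(y):
    """'Exact' posterior mean of a scalar logit under a N(0,1) prior, by quadrature."""
    k, n = y.sum(), len(y)
    loglik = k * GRID - n * np.log1p(np.exp(GRID))   # Bernoulli log-likelihood
    logpost = loglik - 0.5 * GRID**2                 # add the N(0,1) log prior
    w = np.exp(logpost - logpost.max())              # unnormalized posterior weights
    return float((GRID * w).sum() / w.sum())

def laplace_posterior_mean(y, iters=50):
    """Laplace approximation: the posterior mode, found by Newton-Raphson."""
    k, n = y.sum(), len(y)
    theta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-theta))
        grad = k - n * p - theta                     # score of log posterior
        hess = -n * p * (1.0 - p) - 1.0              # curvature of log posterior
        theta -= grad / hess
    return theta

# Mode-vs-mean gap for a small batch versus a larger batch of binary outcomes.
y_small = np.array([1.0, 0.0, 0.0, 0.0, 0.0])                # batch of 5, one event
y_large = np.concatenate([np.ones(15), np.zeros(35)])        # batch of 50, 15 events
gap_small = abs(laplace_posterior_mean(y_small) - exact_posterior_mean(y_small))
gap_large = abs(laplace_posterior_mean(y_large) - exact_posterior_mean(y_large))
```

In this sketch `gap_large` comes out smaller than `gap_small`, consistent with the observation that the approximation behaves well once batches exceed a modest size.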
In our application to the lung transplantation data, we found that the intercept and the coefficient associated with functional status varied over time, while the other coefficients were constant. The intercept decreased over time, which corresponded to a decreasing mortality risk. The coefficient of functional status was attenuated over time, which may be due to improvements in technology and clinical care. Regarding prediction performance, the DLSSM had a higher true classification rate and a smaller scaled Brier score than the DMA and GLM methods. Our proposed dynamic prediction model, which gradually updates the model coefficients, performs well and provides a powerful tool for clinical decision making.
Acknowledgments
Drs. Kimmel and Guo are co-senior authors. This research was supported by NIH grants R01HL141294, UL1TR001878, R01DK117208 and R01DK130067. We thank the joint editor, the associate editor, and two reviewers for the helpful comments that substantially improved the presentation of the paper.
Supporting Information
R code is available with this paper at the Biometrics website on Wiley Online Library.
Data availability statement
The data that support the findings of this paper are available from the United Network for Organ Sharing (UNOS). The authors do not have the authority to share UNOS data. Researchers interested in accessing these data must submit a request to UNOS directly (https://optn.transplant.hrsa.gov/data/request-data).
References
- Angel LF, Levine DJ, Restrepo MI, Johnson S, Sako E, Carpenter A, Calhoon J, Cornell JE, Adams SG, Chisholm GB, Nespral J, Roberson A, Levine SM (2006). Impact of a lung transplantation donor-management protocol on lung donation and recipient outcomes. American journal of respiratory and critical care medicine, 174, 710–716.
- Baldwin MR, Arcasoy SM, Shah A, Schulze PC, Sze J, Sonett JR, Lederer DJ (2012). Hypoalbuminemia and early mortality after lung transplantation: a cohort study. American journal of transplantation, 12, 1256–1267.
- Bhattacharjee A, Vishwakarma GK, & Thomas A (2018). Bayesian state-space modeling in gene expression data analysis: An application with biomarker prediction. Mathematical biosciences, 305, 96–101.
- Brier GW (1950). Verification of forecasts expressed in terms of probability. Monthly weather review, 78, 1–3.
- Carlin BP, Polson NG and Stoffer DS (1992). A Monte Carlo Approach to Nonnormal and Nonlinear State-Space Modeling. Journal of the American Statistical Association, 87, 493–500.
- Chan K, Robbins-Callahan L, Valapour M, Skeans M, Wozniak T, Edwards L (2015). Early Effects After the First Major Revision of the Lung Allocation Score (LAS) in the United States. Chest, 1079A.
- Chen H, Shiboski SC, Golden JA, Gould MK, Hays SR, Hoopes CW, De MT (2009). Impact of the lung allocation score on lung transplantation for pulmonary arterial hypertension. American journal of respiratory and critical care medicine, 180, 468–474.
- Chiumello D, Coppola S, Froio S, Colombo A, Del SL (2015). Extracorporeal life support as bridge to lung transplantation: a systematic review. Critical care, 19, 19.
- Clark MA, Duhay FG, Thompson AK, Keyes MJ, Svensson LG, Bonow RO, Stockwell BT, Cohen DJ (2012). Clinical and economic outcomes after surgical aortic valve replacement in Medicare patients. Risk Management and Healthcare Policy, 5, 117–126.
- Cypel M, Keshavjee S (2011). Extracorporeal life support as a bridge to lung transplantation. Clinics in chest medicine, 32, 245–251.
- Cypel M, Yeung JC, Liu M, Anraku M, Chen F, Karolak W, Sato M, Laratta J, Azad S, Madonik M, Chow CW, Chaparro C, Hutcheon M, Singer LG, Slutsky AS, Yasufuku K, de PM, Pierre AF, Waddell TK, Keshavjee S (2011). Normothermic ex vivo lung perfusion in clinical lung transplantation. The New England journal of medicine, 364, 1431–1440.
- Debray TP, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KG (2015). A new framework to enhance the interpretation of external validation studies of clinical prediction models. Journal of clinical epidemiology, 68, 279–289.
- Durbin J, & Koopman SJ (2012). Time series analysis by state space methods. Oxford University Press.
- Egan TM, Murray S, Bustami RT, Shearon TH, McCullough KP, Edwards LB, Coke MA, Garrity ER, Sweet SC, Heiney DA, Grover FL (2006). Development of the new lung allocation system in the United States. American journal of transplantation, 6, 212–627.
- Gries CJ, Mulligan MS, Edelman JD, Raghu G, Curtis JR, Goss CH (2007). Lung allocation score for lung transplantation: impact on disease severity and survival. Chest, 132, 1954–1961.
- Gu C (1992). Penalized likelihood regression: a Bayesian analysis. Statistica Sinica, 2, 255–264.
- Guihenneuc-Jouyaux C, Richardson S, & Longini IM Jr (2000). Modeling markers of disease progression by a hidden Markov process: application to characterizing CD4 cell decline. Biometrics, 56, 733–741.
- Hickey GL, Grant SW, Caiado C, Kendall S, Dunning J, Poullis M, … & Bridgewater B (2013). Dynamic prediction modeling approaches for cardiac surgery. Circulation: Cardiovascular Quality and Outcomes, 6, 649–658.
- Janssen KJM, Moons KGM, Kalkman CJ, Grobbee DE, & Vergouwe Y (2008). Updating methods improved the performance of a clinical prediction model in new patients. Journal of clinical epidemiology, 61, 76–86.
- Justice AC, Covinsky KE, Berlin JA (1999). Assessing the generalizability of prognostic information. Annals of internal medicine, 130, 515–524.
- Kappen TH, Vergouwe Y, van Klei WA, van Wolfswinkel L, Kalkman CJ, & Moons KG (2012). Adaptation of clinical prediction models for application in local settings. Medical Decision Making, 32, 1–10.
- Knaus WA, Draper EA, Wagner DP, & Zimmerman JE (1985). APACHE II: a severity of disease classification system. Critical care medicine, 13, 818–829.
- Kotloff RM (2013). Risk stratification of lung transplant candidates: implications for organ allocation. Annals of internal medicine, 158, 699–700.
- Lederer DJ, Kawut SM, Wickersham N, Winterbottom C, Bhorade S, Palmer SM, Lee J, Diamond JM, Wille KM, Weinacker A, Lama VN, Crespo M, Orens JB, Sonett JR, Arcasoy SM, Ware LB, Christie JD (2011). Obesity and primary graft dysfunction after lung transplantation: the Lung Transplant Outcomes Group Obesity Study. American journal of respiratory and critical care medicine, 184, 1055–1061.
- Lewis SM and Raftery AE (1997). Estimating Bayes factors via posterior simulation with the Laplace-Metropolis estimator. Journal of the American Statistical Association, 92, 648–655.
- Maxwell BG, Mooney JJ, Lee PH, Levitt JE, Chhatwani L, Nicolls MR, Zamora MR, Valentine V, Weill D, Dhillon GS (2015). Increased resource use in lung transplant admissions in the lung allocation score era. American journal of respiratory and critical care medicine, 191, 302–308.
- McCormick TH, Raftery AE, Madigan D, & Burd RS (2012). Dynamic logistic regression and dynamic model averaging for binary classification. Biometrics, 68, 23–30.
- McCormick TH, Raftery AE, Madigan D, Kandanaarachchi S, & Sevcikova H (2018). dma: Dynamic Model Averaging. R package version 1.4-0.
- O'Neill PD (2002). A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods. Mathematical biosciences, 180, 103–114.
- Pal AF, Mohite P, Rosenberg A, Hernandez C, Saez DG, De Asua I, Popov A, Simon A (2015). Escalation of extracorporeal life support as a bridge to lung transplantation in end-stage lung disease. European Journal of Heart Failure, 17–147.
- Polson NG, Scott JG, Windle J (2013). Bayesian Inference for Logistic Models Using Pólya-Gamma Latent Variables. Journal of the American Statistical Association, 108, 1339–1349.
- Reckhow KH (1990). Bayesian inference in non-replicated ecological studies. Ecology, 71, 2053–2059.
- Reilly BM, Evans AT (2006). Translating clinical research into clinical practice: impact of using prediction rules to make decisions. Annals of internal medicine, 144, 201–209.
- Russo MJ, Meltzer D, Merlo A, Johnson E, Shariati NM, Sonett JR, Gibbons R (2013). Local allocation of lung donors results in transplanting lungs in lower priority transplant recipients. The Annals of thoracic surgery, 95, 1231–1234.
- Schaffer JM, Singh SK, Reitz BA, Zamanian RT, Mallidi HR (2015). Single- vs double-lung transplantation in patients with chronic obstructive pulmonary disease and idiopathic pulmonary fibrosis since the implementation of lung allocation based on medical need. Journal of the American Medical Association, 313, 936–948.
- Singer JP, Diamond JM, Gries CJ, McDonnough J, Blanc PD, Shah R, Dean MY, Hersh B, Wolters PJ, Tokman S, Arcasoy SM, Ramphal K, Greenland JR, Smith N, Heffernan P, Shah L, Shrestha P, Golden JA, Blumenthal NP, Huang D, Sonett J, Hays S, Oyster M, Katz PP, Robbins H, Brown M, Leard LE, Kukreja J, Bacchetta M, Bush E, D'Ovidio F, Rushefski M, Raza K, Christie JD, Lederer DJ (2015). Frailty Phenotypes, Disability, and Outcomes in Adult Candidates for Lung Transplantation. American journal of respiratory and critical care medicine, 192, 1325–1334.
- Steyerberg EW, Borsboom GJ, van Houwelingen HC, Eijkemans MJ, & Habbema JD (2004). Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Statistics in Medicine, 23, 2567–2586.
- Steyerberg EW, Moons KG, van der Windt DA, Hayden JA, Perel P, Schroter S, Riley RD, Hemingway H, Altman DG (2013). Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS medicine, 10, e1001381.
- Su TL, Jaki T, Hickey GL, Buchan I, & Sperrin M (2018). A review of statistical updating methods for clinical prediction models. Statistical methods in medical research, 27, 185–197.
- Thabut G, Munson J, Haynes K, Harhay MO, Christie JD, Halpern SD (2012). Geographic disparities in access to lung transplantation before and after implementation of the lung allocation score. American journal of transplantation, 12, 3085–3093.
- Toll DB, Janssen KJ, Vergouwe Y, Moons KG (2008). Validation, updating and impact of clinical prediction rules: a review. Journal of clinical epidemiology, 61, 1085–1094.
- Tsuang WM, Vock DM, Finlen Copeland CA, Lederer DJ, Palmer SM (2013). An acute change in lung allocation score and survival after lung transplantation: a cohort study. Annals of internal medicine, 158, 650–657.
- Valapour M, Skeans MA, Heubner BM, Smith JM, Hertz MI, Edwards LB, Cherikh WS, Callahan ER, Snyder JJ, Israni AK, Kasiske BL (2015). OPTN/SRTR Annual Data Report: lung. American journal of transplantation, 15, 1–28.
- Vergouwe Y, Nieboer D, Oostenbrink R, Debray TP, Murray GD, Kattan MW, … & Steyerberg EW (2017). A closed testing procedure to select an appropriate method for updating prediction models. Statistics in medicine, 36, 4529–4539.
- Wade PR (2000). Bayesian methods in conservation biology. Conservation biology, 14, 1308–1316.
- Wahba G (1978). Improper priors, spline smoothing and the problem of guarding against model errors in regression. Journal of the Royal Statistical Society: Series B (Methodological), 40, 364–372.
- Wahba G (1983). Bayesian confidence intervals for the cross-validated smoothing spline. Journal of the Royal Statistical Society: Series B (Methodological), 45, 133–150.
- Wecker WE, & Ansley CF (1983). The signal extraction approach to nonlinear regression and spline smoothing. Journal of the American Statistical Association, 78, 81–89.
- Wille KM, Harrington KF, Andrade JA, Vishin S, Oster RA, Kaslow RA (2013). Disparities in lung transplantation before and after introduction of the lung allocation score. The Journal of heart and lung transplantation, 32, 684–692.