Summary
A recurring objective in longitudinal studies on aging and longevity has been the investigation of the relationship between age-at-death and current values of a longitudinal covariate trajectory that quantifies reproductive or other behavioral activity. We propose a novel technique for predicting age-at-death distributions for situations where an entire covariate history is included in the predictor. The predictor trajectories up to current time are represented by time-varying functional principal component scores, which are continuously updated as time progresses and are considered to be time-varying predictor variables that are entered into a class of time-varying functional regression models that we propose. We demonstrate for biodemographic data how these methods can be applied to obtain predictions for age-at-death and estimates of remaining lifetime distributions, including estimates of quantiles and of prediction intervals for remaining lifetime. Estimates and predictions are obtained for individual subjects, based on their observed behavioral trajectories, and include a dimension-reduction step that is implemented by projecting on a single index. The proposed techniques are illustrated with data on longitudinal daily egg-laying for female medflies, predicting remaining lifetime and age-at-death distributions from individual event histories observed up to current time.
Keywords: Aging, Biodemography, Conditional distribution, Dimension reduction, Event history analysis, Functional data analysis, Functional principal component, Longitudinal data, Prediction interval, Quantile estimation, Varying coefficient model
1. Introduction
In longitudinal studies, frequently one obtains measurements on time-dependent covariates and a time-to-event for each individual. The association between these covariates and remaining survival is of interest in biodemography, where relationships between longevity and behavioral or reproductive longitudinal trajectories are studied. Typical biodemographic data obtained in studies on aging are uncensored and are obtained under controlled conditions with longitudinal covariates recorded on densely spaced regular time grids. Our methods are specifically addressing a need in biology and biodemography to analyze this type of data. A classical framework to characterize the relationship between longitudinal covariates and time-to-event would be to apply the proportional hazards regression model (Cox, 1972) with time-varying covariates. This model uses covariate information only at event times and determines the hazard rate at a given time by current covariate levels.
In contrast to the Cox model and similar approaches, we explore here models where remaining lifetime may depend on the entire event history as captured by the covariate trajectory, and not just on current covariate levels. That such an assumption is biologically reasonable was demonstrated, for example, in Müller et al. (2001), where a parametric model summarizing egg-laying trajectories of female medflies (Mediterranean fruit fly, Ceratitis capitata) was shown to define remaining egg-laying potential. Thus a connection between the entire egg-laying trajectory up to current time and remaining lifetime was established. This model illustrated that age-at-death is related to early reproductive activity and shed new light on the cost of reproduction hypothesis (Partridge and Harvey, 1985). Issues in biodemographic hazard rate analysis and their implications were discussed in Vaupel et al. (1998), and are of ongoing interest in biodemography and the evolution of longevity (Oeppen and Vaupel, 2002).
In this article, we develop flexible functional models by using methods and tools from functional data analysis. We aim at nonparametric predictors of current remaining lifetime distributions, and especially mean remaining lifetime, by extracting information from the entire available covariate trajectory. For a situation where a sample of random trajectories is given with the same domain for each trajectory, commonly used functional data analysis techniques can be applied such as functional principal component analysis (Rice and Silverman, 1991) or functional regression (Cardot et al., 2003). Functional principal component analysis (FPCA) is an extension of multivariate PCA where one describes a sample of random trajectories by a mean function and the eigenfunctions of the covariance operator, also referred to as principal component functions. These provide a parsimonious orthonormal basis in which to represent the observed trajectories and correspond to the “modes of variation” of random trajectories (Castro, Lawton, and Sylvestre, 1987). The coefficients of the eigenfunctions in an eigenbasis representation of a given random trajectory are random variables, and a finite number of these are commonly used as predictors for functional regression models.
We propose two main extensions of this methodology: first, to the case of a response that is a distribution function rather than a number. This will allow the estimation of quantiles and prediction intervals for remaining lifetimes, which are highly desirable for survival outcomes. We will introduce appropriate assumptions in Section 2.3 and discuss estimation of remaining lifetime distributions in Section 3.2. Second, in the situation we consider, the predictor trajectories for each individual are only observed until age-at-death, and therefore, do not share a common domain. At each given time only a random number of subjects is still alive for whom the covariate trajectories can be observed. Accordingly, any postulated functional regression model must evolve over time. This motivates the time-varying functional regression models that we introduce in Section 2.2, with appropriate estimates being discussed in Section 3.1.
Besides the practical purpose of prediction, it is usually of scientific interest to determine which features of a covariate history are related to survival and longevity. A related area is joint modeling (Self and Pawitan, 1992; Tsiatis, DeGruttola, and Wulfsohn, 1995; Bycott and Taylor, 1998), where statistical models reflect the impact of current covariate values on hazard ratio and current mortality, while our goal is to predict the distribution of remaining lifetimes for each subject still alive at age t, and also to address the simpler problem of predicting mean remaining lifetime, given the entire event history information for a subject up to current time t. As an illustration, the upper panel of Figure 1 provides the egg-laying curves for two randomly selected medflies, with respective age-at-death of 36 and 52 days. Assuming that only the solid segments of the covariate processes are observed from birth until age t = 25 days, these observed trajectories then form the basis for these predictions.
Figure 1.
Smoothed daily egg-laying curves for two randomly selected medflies, from birth to death (upper panel). Data from birth to day 25 (solid curves in upper panel) are used for predicting density functions of remaining lifetime. Actually observed age-at-death is marked as a circle, predicted age-at-death as a cross for each fly. Predicted densities of remaining lifetime for each of the two flies are shown in the lower panel, with line patterns matched to the egg-laying curves in the upper panel.
The article is organized as follows: In Section 2, we introduce assumptions and the time-varying functional regression models that result from coupling time-varying functional principal components with ideas from functional regression. In Section 3, we describe estimation of the components of the model. Application of the proposed model to a biodemographic study, using egg-laying trajectories of female medflies as predictors, and comparisons of the performance with other possible estimators are the theme of Section 4, followed by discussion and concluding remarks in Section 5.
2. Proposed Model
2.1 Modeling Predictor Trajectories
We denote lifetime or age-at-death of a subject by T, and assume that for each subject a covariate X is recorded continuously or on a dense regular grid until death. Then {X(t), T}, t ∈ [0, T] are the data available for each subject, where X(·) is the covariate trajectory, with support [0, T] determined by the lifetime T. We denote the observed covariate trajectory for an individual that is still alive at time t by X̃ (s, t), 0 ≤ s ≤ t, that is, X̃ (s, t) = X(s) for 0 ≤ s ≤ t, conditional on T > t.
Our aim is to predict the distribution of remaining lifetime T − t, given X̃ (s, t), 0 ≤ s ≤ t, and in particular mean remaining lifetime.
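For t ∈ [0, τ], with some large τ, define
$$\mu(s, t) = E\{X(s) \mid T > t\}, \qquad G_t(s_1, s_2) = \operatorname{cov}\{X(s_1), X(s_2) \mid T > t\}, \qquad 0 \le s, s_1, s_2 \le t, \tag{1}$$
and the eigenfunctions or principal component functions of the conditional covariances as solutions of the eigenequations
$$\int_0^t G_t(s_1, s_2)\, \rho_{jt}(s_1)\, ds_1 = \lambda_{jt}\, \rho_{jt}(s_2), \tag{2}$$
where λ1t ≥ λ2t ≥ ··· > 0 are eigenvalues and ρ1t(·),ρ2t(·), …, are orthonormal eigenfunctions associated with these eigenvalues. Then one has the representation, for 0 ≤ s1, s2 ≤ t,
$$G_t(s_1, s_2) = \sum_{j=1}^{\infty} \lambda_{jt}\, \rho_{jt}(s_1)\, \rho_{jt}(s_2). \tag{3}$$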
We find that
E{X̃ (s, t)} = μ(s, t) = E{X(s) |T > t}, and
cov{X̃ (s1, t), X̃ (s2, t)} = cov{X(s1), X(s2) | T > t}.
As one can easily show, the trajectories observed conditional on survival beyond t are independent, and therefore, the observed trajectories X̃ (s, t) for subjects with T > t can be represented by the Karhunen–Loève expansion (Ash and Gardner, 1975; Rice and Silverman, 1991),
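$$\tilde X(s, t) = \mu(s, t) + \sum_{j=1}^{\infty} \varepsilon_{jt}\, \rho_{jt}(s), \qquad 0 \le s \le t, \tag{4}$$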
where the random variables εjt are conditional functional principal component scores with E(εjt) = 0, var(εjt) = λjt, and cov(εjt, εkt) = 0 (for j ≠ k). Here X̃ (s, t) are trajectories that are observed on the entire interval [0, t], and the functional principal component scores can be represented as
$$\varepsilon_{jt} = \int_0^t \{\tilde X(s, t) - \mu(s, t)\}\, \rho_{jt}(s)\, ds. \tag{5}$$
The number Nt of trajectories X̃ (s, t) observable up to time t is random. With F̄(t) = P(T > t) denoting the survival function, we have Nt ~ Binomial(n, F̄(t)), where n is the total number of subjects. Denoting by R(t) = {i: Ti > t} the risk set at time t, the trajectories X̃i (s, t), 0 ≤ s ≤ t, are available for all i ∈ R(t).
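The construction of this subsection can be sketched on a regular grid as follows. This is a minimal illustration under simplifying assumptions, not the estimation procedure of Section 3: the conditional mean and covariance are plain sample moments over the risk set (no smoothing, no measurement-error correction), the integrals are Riemann sums, and the function name `fpca_at_time` and the simulated data are hypothetical.

```python
import numpy as np

def fpca_at_time(X, T, grid, t, M=2):
    """Time-varying FPCA at current time t on a regular grid (illustrative sketch).

    X: (n, len(grid)) trajectories, T: (n,) lifetimes, grid: regular time grid.
    """
    at_risk = T > t                         # risk set R(t)
    cols = grid <= t                        # restrict the domain to [0, t]
    Xt = X[at_risk][:, cols]
    ds = grid[1] - grid[0]                  # grid spacing for Riemann sums
    mu = Xt.mean(axis=0)                    # sample version of mu(s, t)
    Xc = Xt - mu
    G = Xc.T @ Xc / (Xt.shape[0] - 1)       # sample version of the conditional covariance
    evals, evecs = np.linalg.eigh(G * ds)   # discretized eigenequation
    order = np.argsort(evals)[::-1][:M]     # keep the M largest eigenvalues
    lam = evals[order]
    rho = evecs[:, order] / np.sqrt(ds)     # rescale so that sum(rho^2) * ds = 1
    scores = Xc @ rho * ds                  # Riemann sum for the score integral
    return mu, lam, rho, scores, at_risk

# Simulated example (hypothetical data, two modes of variation plus noise).
rng = np.random.default_rng(0)
n, grid = 200, np.arange(0.0, 60.0)
T = rng.uniform(10, 60, size=n)
a, b = rng.normal(size=n), rng.normal(size=n)
X = (20 + 5 * a[:, None] * np.sin(grid / 10) + 2 * b[:, None] * np.cos(grid / 10)
     + rng.normal(scale=0.1, size=(n, grid.size)))
mu, lam, rho, scores, at_risk = fpca_at_time(X, T, grid, t=25.0)
```

With this discretization the sample variance of each score column equals the corresponding eigenvalue, mirroring var(εjt) = λjt.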
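The mean remaining lifetime function at t is
$$r_0(t) = E(T - t \mid T > t), \tag{6}$$
and the distribution function of remaining lifetime at y, where y ≥ 0, is
$$F_t(y) = P(T - t \le y \mid T > t), \tag{7}$$
so that $r_0(t) = \int_0^\infty \{1 - F_t(y)\}\, dy$. It is well known (Cox, 1972) that the corresponding survival F̄(·) and hazard λ(·) functions are $\bar F(t) = \{r_0(0)/r_0(t)\} \exp\{-\int_0^t r_0(u)^{-1}\, du\}$ and $\lambda(t) = \{r_0'(t) + 1\}/r_0(t)$, so that the function r0(·) defines the survival schedule.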
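As a numerical illustration of how the mean remaining lifetime function determines the survival schedule, the sketch below applies the standard inversion formulas F̄(t) = {r0(0)/r0(t)} exp{−∫0t du/r0(u)} and λ(t) = {r0′(t) + 1}/r0(t). The function names and the test case (lifetimes uniform on [0, 1], for which r0(t) = (1 − t)/2) are assumptions chosen for illustration only.

```python
import numpy as np

def survival_from_mrl(r0, t_grid):
    """Survival function recovered from mean remaining lifetime r0 on a grid."""
    r = r0(t_grid)
    f = 1.0 / r
    # cumulative trapezoidal integral of 1 / r0(u) from 0 to t
    cum = np.concatenate([[0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t_grid))])
    return r[0] / r * np.exp(-cum)

def hazard_from_mrl(r0, t_grid):
    """Hazard function recovered from mean remaining lifetime r0 on a grid."""
    r = r0(t_grid)
    dr = np.gradient(r, t_grid)             # numerical derivative of r0
    return (dr + 1.0) / r

# Check with lifetimes uniform on [0, 1]: r0(t) = (1 - t) / 2.
t = np.linspace(0.0, 0.9, 901)
r0_uniform = lambda s: (1.0 - s) / 2.0
surv = survival_from_mrl(r0_uniform, t)     # should be close to 1 - t
haz = hazard_from_mrl(r0_uniform, t)        # should be close to 1 / (1 - t)
```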
2.2 Modeling Mean Remaining Lifetime
Our aim is to relate the remaining lifetime T − t for a given arbitrary t ∈ [0, τ] to the observed trajectory X̃ on [0, t], that is, to estimate
$$r_{\tilde X}(t) = E\{T - t \mid \tilde X(s, t),\ 0 \le s \le t,\ T > t\}. \tag{8}$$
We assume that there exists a family of smooth link functions ht indexed by t, with ht (s) = H(t, s):[0, τ] × R → R, for a function H that is continuous in s and t, and an associated evaluation function β(s, t), satisfying β ∈ L2(Ct), Ct = {(s, t), 0 ≤ s ≤ t, 0 ≤ t ≤ τ}, such that
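$$r_{\tilde X}(t) = h_t\Big[\, r_0(t) + \int_0^t \beta(s, t)\, \{\tilde X(s, t) - \mu(s, t)\}\, ds \Big]. \tag{9}$$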
This assumption puts our model into the framework of an extension of functional regression (Ramsay and Silverman, 1997; Cardot et al., 2003; Müller, 2005), where the extension is the inclusion of a time-varying feature, while the classical linear functional regression model would be $E(Y \mid X) = \beta_0 + \int_0^S \beta(s) X(s)\, ds$ with a fixed S > 0, with generalized version $E(Y \mid X) = g\{\beta_0 + \int_0^S \beta(s) X(s)\, ds\}$, where g is a link function (James, 2002; Müller and Stadtmüller, 2005). Another type of extension of functional regression where current observed values are to be predicted from past observations was proposed in Malfait and Ramsay (2003).
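For given t and any orthonormal basis, ψjt (·), j = 1, 2, …, on L2([0, t]), the evaluation function β(·, t) can be represented by $\beta(s, t) = \sum_{j=1}^{\infty} \beta_{jt}\, \psi_{jt}(s)$, 0 ≤ s ≤ t, 0 ≤ t ≤ τ, with varying coefficients βjt. A special choice for the basis are the eigenfunctions ρjt (·) of the conditional covariance cov{X(u), X(v) | T > t}. Then the model for mean remaining lifetime becomes
$$r_{\tilde X}(t) = h_t\Big[\, r_0(t) + \sum_{j=1}^{\infty} \beta_{jt} \int_0^t \rho_{jt}(s)\, \{\tilde X(s, t) - \mu(s, t)\}\, ds \Big], \tag{10}$$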
and because r0(t) is a nonrandom function, we may introduce another (nonrandom) smooth link function gt (·) such that gt {z(t)} = ht {r0(t) + z(t)} for an arbitrary function z(t). This leads to
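$$r_{\tilde X}(t) = g_t\Big[ \int_0^t \beta(s, t)\, \{\tilde X(s, t) - \mu(s, t)\}\, ds \Big], \tag{11}$$
where $\eta(t) = \int_0^t \beta(s, t)\, \{\tilde X(s, t) - \mu(s, t)\}\, ds$ plays the same role as a linear predictor in a generalized linear model, with the additional feature that it is a function of t, and therefore, is referred to as the linear predictor function. If X̃ (·, t) and β(·, t) are expressed in terms of the orthonormal basis ρ1t(·), ρ2t(·), …, then by the orthogonality of the basis functions, the linear predictor function becomes $\eta(t) = \sum_{j=1}^{\infty} \beta_{jt}\, \varepsilon_{jt}$, where εjt are as in (5) and
$$\beta_{jt} = \int_0^t \beta(s, t)\, \rho_{jt}(s)\, ds. \tag{12}$$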
We assume there exists a finite number M of components (where M = M(n) may increase with sample size) such that processes X̃ (·, ·) are sufficiently well approximated by projecting on the function space spanned by the first M eigenfunctions. It then suffices to consider truncated linear predictor functions given by
$$\eta(t) = \sum_{j=1}^{M} \beta_{jt}\, \varepsilon_{jt}. \tag{13}$$
These developments suggest a varying coefficient generalized linear model for remaining lifetime, with the time-varying principal component scores εjt as predictors. Once a number M of component scores to be included in the model as predictors has been chosen, the trajectory X̃ (·, t) is summarized by the random M-vector (ε1t, …, εMt). We then fit the varying coefficient generalized linear model (Hastie and Tibshirani, 1993)
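$$E(T - t \mid \varepsilon_{1t}, \ldots, \varepsilon_{Mt}) = g_t\Big( \sum_{j=1}^{M} \beta_{jt}\, \varepsilon_{jt} \Big), \tag{14}$$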
where gt (·) is the link function at time t which may be assumed to be known or unknown but smooth. For quasi-likelihood-based estimating equations, we also need to specify a variance function, V (·, ·),
| (15) |
which depends on the conditional mean rX̃ (t) only via the smooth function σ2(·). This function also can be assumed to be known (such as in quasi-Poisson or quasi-Gamma regression) or unknown but smooth.
When the link function gt (·) is chosen as identity function for all t, model (14) is replaced by a varying coefficient linear regression model,
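$$E(T - t \mid \varepsilon_{1t}, \ldots, \varepsilon_{Mt}) = r_0(t) + \sum_{j=1}^{M} \beta_{jt}\, \varepsilon_{jt}, \tag{16}$$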
where r0(t) is the mean remaining lifetime function at t, corresponding to a varying intercept function, εjt are the individual random components (uncorrelated and with zero mean), and βjt, j = 1, 2,…, M are the varying coefficients.
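At a fixed time t, the varying coefficient linear model amounts to an ordinary linear regression of remaining lifetime on the principal component scores of the subjects still at risk. A minimal sketch, assuming the scores have already been computed and using hypothetical names (`fit_varying_coef`, simulated scores):

```python
import numpy as np

def fit_varying_coef(scores, T_at_risk, t):
    """Least squares fit of remaining lifetime on FPC scores at fixed time t.

    scores: (N_t, M) scores for subjects in the risk set,
    T_at_risk: (N_t,) ages-at-death of those subjects.
    Returns (beta_0t, beta_1t, ..., beta_Mt).
    """
    y = T_at_risk - t                                   # remaining lifetimes
    Z = np.column_stack([np.ones(len(y)), scores])      # intercept + scores
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

# Noise-free toy example (hypothetical values) to show the coefficients are recovered.
rng = np.random.default_rng(0)
s = rng.normal(size=(150, 2))
s -= s.mean(axis=0)                                     # scores have mean zero
t_now = 25.0
T_obs = t_now + 12.0 - 0.5 * s[:, 0] - 2.0 * s[:, 1]
beta_hat = fit_varying_coef(s, T_obs, t_now)
```

Because the scores are centered over the risk set, the fitted intercept coincides with the average remaining lifetime of the subjects at risk, which is the role played by r0(t) in the model.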
2.3 Modeling Conditional Distributions of Remaining Lifetime
The conditional distribution of remaining lifetime T − t, at a current time (age) t of a subject, and given an observed trajectory X̃ up to time t, is defined as FX̃,t(y) = P{T − t ≤ y| X̃ (s, t), 0 ≤ s ≤ t, t ≤ T}. Once these conditional distributions have been determined, other quantities that are functionals of conditional distribution functions such as conditional quantiles and conditional densities can be constructed as well.
Having summarized the covariate trajectory X̃ (·, t) by the linear predictor function η(·) (13), we add the assumption that the linear predictor function determines the conditional distribution,
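$$P\{T - t \le y \mid \tilde X(s, t),\ 0 \le s \le t,\ t \le T\} = P\{T - t \le y \mid \eta(t)\} = \varphi_{t,y}\{\eta(t)\}, \tag{17}$$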
where the unknown functions φt,y(·) are assumed to be smooth in y and t and in their argument; alternatively, one may choose a parametric model for φt,y(·). Estimating conditional remaining lifetime distributions then is equivalent to estimating functions φt,y(·) and η(·). By substituting the linear predictor function η(t) in lieu of the observed event history process to time t, X̃ (s, t), 0 ≤ s ≤ t, we have reduced the initially infinite-dimensional predictor to dimension one. This corresponds to a major dimension reduction step. We obtain P(T − t ≤ y|η) by estimating the functions φt,y(·) nonparametrically, as detailed in Section 3.3.
3. Estimating the Model Components
3.1 Preliminaries
For smoothing purposes, we find it convenient to use local linear scatterplot smoothers for the nonparametric estimation of a regression function E (y|X = x). We note that many other smoothing methods could be used, for example, various spline smoothers. Given data {(xi, yi) ∈ R2, i = 1, …, n}, these are implemented by weighted local least squares, where
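$$\sum_{i=1}^{n} K\Big(\frac{x - x_i}{h}\Big) \{y_i - b_0 - b_1 (x - x_i)\}^2 \tag{18}$$
is minimized with respect to (b0, b1). Here, K ≥ 0 is a nonnegative kernel function, chosen as a probability density, and h a suitably chosen bandwidth (smoothing parameter). The resulting nonparametric regression estimate is
$$S\{x, (x_i, y_i)_{i=1,\ldots,n}, h\} = \hat b_0(x) = \sum_{i=1}^{n} w_i(x, h)\, y_i, \tag{19}$$
which is linear in the data yi, with weight functions wi (x, h) given by
$$w_i(x, h) = \frac{K\{(x - x_i)/h\}\, \{s_2(x) - (x - x_i)\, s_1(x)\}}{s_0(x)\, s_2(x) - s_1(x)^2}, \qquad s_k(x) = \sum_{l=1}^{n} K\Big(\frac{x - x_l}{h}\Big) (x - x_l)^k, \tag{20}$$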
(see, e.g., Fan and Gijbels, 1996).
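A local linear scatterplot smoother of this kind can be sketched in a few lines. The weights below are the closed-form solution of the weighted local least squares problem; the Epanechnikov kernel and the function name `local_linear` are assumptions made for illustration, and no bandwidth selection is included.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear regression estimate at x0 with bandwidth h."""
    d = x - x0
    u = d / h
    K = np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)   # Epanechnikov kernel (assumed)
    s0, s1, s2 = K.sum(), (K * d).sum(), (K * d**2).sum()  # local moments
    w = K * (s2 - d * s1) / (s0 * s2 - s1**2)              # equivalent kernel weights
    return float(w @ y)

# A local linear fit reproduces an exactly linear relationship.
x_obs = np.linspace(0.0, 10.0, 101)
y_obs = 2.0 + 3.0 * x_obs
est = local_linear(4.2, x_obs, y_obs, h=1.5)
```

The weights sum to one and annihilate linear terms, which is why the estimate is unbiased for linear regression functions, also near the boundary of the design.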
A first step is to estimate the mean function μ(·, ·) (1) at current time t for all subjects who are at risk at t, i.e., for which T > t, where T is age-at-death. The estimate of μ(s, t) is
| (21) |
where R(t) is the risk set at time t, the tij ’s are the pooled time points of all observations at which the ith subject was observed prior to time t, provided that Ti > t, and h is the smoothing parameter of the scatterplot smoother S, which is evaluated at the argument s (0 ≤ s ≤ t). The number M of components that are included in model (14) or (16) and the necessary bandwidths can be determined by one-curve-leave-out cross-validation (Rice and Silverman, 1991) or according to the amount of variation explained by the first M components.
We use two-dimensional local linear smoothers to obtain the covariance surface of observed processes up to time t and then estimate the eigenfunctions and eigenvalues. Additional measurement errors may contaminate the observed covariates, i.e., the observed data are (X̃i, Ti), i = 1, …, n, where Ti is observed age-at-death for the ith subject and observed covariates are
$$\tilde X_{ij} = X_i(t_{ij}) + \varepsilon_{ij}, \qquad 0 \le t_{ij} \le T_i, \tag{22}$$
with E(εij) = 0, var(εij) = σ2, and cov(εij, εil) = 0 for any j ≠ l. In this case,
$$\operatorname{cov}(\tilde X_{ij}, \tilde X_{il} \mid T_i > t) = \operatorname{cov}\{X(t_{ij}), X(t_{il}) \mid T > t\} + \sigma^2 \delta_{jl}, \tag{23}$$
where δjl is 1 if j = l and 0 otherwise. The raw covariances $\{\tilde X_{ij} - \hat\mu(t_{ij}, t)\}\{\tilde X_{il} - \hat\mu(t_{il}, t)\}$, 0 ≤ tij, til ≤ t, with μ̂ from (21), contain the measurement errors on the diagonal; unrestricted 2-dimensional smoothing is therefore not feasible, as it would produce biased covariance estimates in the neighborhood of the diagonal. We, therefore, apply a 2-dimensional scatterplot smoother to off-diagonal elements only, to obtain the smoothed covariance surface denoted as Ĝt(r, s). Details can be found in Yao et al. (2003); compare also Staniswalis and Lee (1998) for similar considerations.
Estimated eigenfunctions and eigenvalues are solutions of estimated eigenequations,
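$$\int_0^t \hat G_t(r, s)\, \hat\rho_{jt}(r)\, dr = \hat\lambda_{jt}\, \hat\rho_{jt}(s), \tag{24}$$
where eigenfunctions are subject to the constraints $\int_0^t \hat\rho_{jt}(s)^2\, ds = 1$ and $\int_0^t \hat\rho_{jt}(s)\, \hat\rho_{lt}(s)\, ds = 0$ for j < l. These solutions are found by discretizing (24), obtaining corresponding discrete solutions, and then smoothing them to obtain eigenfunction estimates; appropriate modifications as described in Yao et al. (2003) are made in case Ĝt is not positive definite. The functional principal component scores are then determined by
$$\hat\varepsilon_{ijt} = \int_0^t \{\tilde X_i(s, t) - \hat\mu(s, t)\}\, \hat\rho_{jt}(s)\, ds, \tag{25}$$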
which can be obtained by numerical integration or, in the Gaussian case, by a conditioning argument. Consistency results for ρ̂jt, μ̂, and ε̂ijt can be found in an unpublished manuscript by P. Hall and M. Hosseini-Nasab (2003).
3.2 Estimating Mean Remaining Lifetime
Denoting the least squares estimates of βt = (β0t, …, βMt) in model (16) by β̂t,
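$$\hat\beta_t = \arg\min_{\beta_0, \ldots, \beta_M} \sum_{i \in R(t)} \Big\{ (T_i - t) - \beta_0 - \sum_{j=1}^{M} \beta_j\, \hat\varepsilon_{ijt} \Big\}^2, \tag{26}$$
the fitted model for remaining lifetime is
$$\hat E(T - t \mid \hat\varepsilon_{1t}, \ldots, \hat\varepsilon_{Mt}) = \hat\beta_{0t} + \sum_{j=1}^{M} \hat\beta_{jt}\, \hat\varepsilon_{jt}, \tag{27}$$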
where β̂0t is the mean remaining lifetime function. In some cases (as it turns out, not in our application example), smoothness in t for varying coefficients βjt may be exploited to improve the varying coefficient estimates by a smoothing step (compare, e.g., Cai, Fan, and Li [2000] or Wu and Yu [2002] for smoothing varying coefficients). We require that the eigenfunctions ρjt(s) are continuous in a suitable function norm as t varies. The sign of the eigenfunctions is generally undetermined. If t is increased by a small δ > 0, we choose the sign of the eigenfunction at t + δ by determining
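$$\hat\zeta = \arg\min_{\zeta \in \{-1, 1\}} \int_0^t \{\hat\rho_{jt}(s) - \zeta\, \hat\rho_{j(t+\delta)}(s)\}^2\, ds, \tag{28}$$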
and declaring ζ̂ρ̂j(t+δ)(s) to be the jth eigenfunction at time t + δ.
For the fitted model (27), the one-leave-out prediction for the ith subject is
| (29) |
where ε̃ijt are the coefficients for the ith subject obtained for eigenfunctions , which are the estimated eigenfunctions after removing the ith subject’s trajectory. The one-leave-out predictions lead to the root squared prediction errors at t,
| (30) |
where Nt is the number of subjects in the risk set R(t). Root squared prediction error functions for various predictors that are described in Section 4 are displayed below in Figure 5.
Figure 5.
Pointwise average root squared prediction errors (RSPE, (30)) for various predictors of remaining lifetime: time-varying Cox model (38) (dotted), average remaining lifetime (dashed), functional linear varying coefficient model (16) (dash-dot), and semiparametric generalized linear time-varying regression model implemented with the QLUE algorithm (14) (solid), for the daily egg-laying data.
For generalized linear regression models, Chiou and Müller (1998, 1999) developed a nonparametric quasi-likelihood with unknown link and variance function estimation (QLUE) algorithm by substituting nonparametric estimates for link and variance functions in lieu of the true link and variance functions in the usual definition of quasi-likelihood. The estimating procedure iterates three updating steps: nonparametric updating of the link function, nonparametric updating of the variance function, and updating of the regression parameters. Adapting to the current situation, and setting ε̂it = (ε̂i1t, …, ε̂ipt)ᵀ, one aims at solving the semiparametric estimating equation
| (31) |
with respect to βt. Here is the linear predictor function with estimated predictors ε̂ijt (25), ĝ(·; β̂t) and ĝ1(·; β̂t) are nonparametric smooth estimates of the link function, respectively, its first derivative, and σ̂2(·) is a smooth estimate of the variance function.
When applying QLUE, we use data-based bandwidth selection for smoothing link and variance functions based on Pearson’s chi-square by obtaining
| (32) |
where μ̂b0i denotes the estimated predictors based on the link function obtained with bandwidth b0, the nonparametric variance function estimate is obtained with the bandwidth b, and p is the number of predictors, with p = M in our case. For details see Chiou and Müller (1998). The QLUE algorithm can be shown to provide consistent estimates β̂t, ĝ(·), and σ̂2(·) for the components of models (14) and (15), for each fixed t, given predictors εijt, and this result can be extended to the case where true predictors εijt are replaced by consistent estimates ε̂ijt. The resulting estimate of the linear predictor function η(t) (13) is
$$\hat\eta(t) = \sum_{j=1}^{M} \hat\beta_{jt}\, \hat\varepsilon_{jt}. \tag{33}$$
3.3 Estimating Conditional Distributions
Our approach to conditional distribution estimation is based on the single index assumption (17), which will provide the necessary dimension reduction, P{T − t ≤ y|X̃ (s, t), 0 ≤ s ≤ t, t ≤ T} = P{T − t ≤ y|η(t)}. Here $\eta(t) = \sum_{j=1}^{M} \beta_{jt}\, \varepsilon_{jt}$ is the single index or linear predictor function, which can be estimated via (31).
We then obtain nonparametric smooth estimates of the conditional distribution with relative ease: Consider i.i.d. pairs, {(X1, Y1), …, (Xn, Yn)}, where (X, Y) ∈ R2. Then to estimate the conditional distribution function F(y|x) = P(Y ≤ y|X = x) from this sample, note that F(y|x) = E{I(Yi ≤ y)|X = x}, where I is the indicator function. Estimation of the conditional distribution function can thus be framed as a regression problem. Applying nonparametric regression, for example, via Nadaraya–Watson kernel estimators,
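$$\hat F(y \mid x) = \frac{\sum_{i=1}^{n} K_h(x - X_i)\, I(Y_i \le y)}{\sum_{i=1}^{n} K_h(x - X_i)}, \tag{34}$$
where $K_h(u) = K(u/h)/h$ for a bandwidth h. Combining estimate (34) with (17),
$$\hat P\{T - t \le y \mid \eta(t) = \eta\} = \frac{\sum_{i \in R(t)} K_h\{\eta - \hat\eta_i(t)\}\, I(T_i - t \le y)}{\sum_{i \in R(t)} K_h\{\eta - \hat\eta_i(t)\}}, \tag{35}$$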
where η̂i(t) (33) is the linear predictor for the ith individual and R(t) the risk set at time t.
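The kernel estimate of the conditional distribution of remaining lifetime given the single index, and the conditional quantiles derived from it, can be sketched as follows. This is an illustrative Nadaraya–Watson implementation under stated assumptions: a Gaussian kernel, a fixed bandwidth, hypothetical function names, and simulated index values in place of η̂i(t).

```python
import numpy as np

def cond_cdf(y, eta0, eta, rem, h):
    """Kernel estimate of P(remaining lifetime <= y | index = eta0)."""
    K = np.exp(-0.5 * ((eta0 - eta) / h) ** 2)      # Gaussian kernel (assumed choice)
    return float(np.sum(K * (rem <= y)) / np.sum(K))

def cond_quantile(p, eta0, eta, rem, h):
    """Smallest observed remaining lifetime whose estimated conditional CDF is >= p."""
    ys = np.sort(rem)
    F = np.array([cond_cdf(v, eta0, eta, rem, h) for v in ys])  # nondecreasing in y
    return float(ys[np.searchsorted(F, p)])

# Simulated risk set (hypothetical): remaining lifetime increases with the index.
rng = np.random.default_rng(1)
eta_i = rng.uniform(0.0, 10.0, size=500)
rem_i = 5.0 + eta_i + rng.normal(scale=1.0, size=500)
q_low = cond_quantile(0.5, 2.0, eta_i, rem_i, h=1.0)    # conditional median at a small index
q_high = cond_quantile(0.5, 8.0, eta_i, rem_i, h=1.0)   # conditional median at a large index
```

A 95% prediction interval for remaining lifetime at a given index value can be read off in the same way, for example from the conditional quantiles at levels 0.025 and 0.975.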
We can also apply a conditional density estimator (see, e.g., Yu and Jones, 1998; Hall and Müller, 2003), extending the notion of a conditional density to functional predictors. Assuming hy is the bandwidth for the density estimation step, Ky is the corresponding kernel, and h and K are bandwidth and kernel for the predictor η, respectively, we may estimate the conditional density f(y|εjt, j = 1, …, M) = ft (y|η(t) = η) by
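$$\hat f_t(y \mid \eta) = \frac{\sum_{i \in R(t)} h_y^{-1} K_y\{(y - (T_i - t))/h_y\}\, K\{(\eta - \hat\eta_i(t))/h\}}{\sum_{i \in R(t)} K\{(\eta - \hat\eta_i(t))/h\}}. \tag{36}$$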
Analogously, we obtain estimates for conditional quantile and quantile density functions.
In order to avoid boundary bias, we implement (36) with boundary kernels (Müller and Wang, 1994). For example, for arguments y in the left boundary region, 0 ≤ y ≤ hy, we define , and when using Epanechnikov kernels for estimating the conditional density, the kernels in the interior and at the boundary may be chosen as
If we use local polynomial smoothers, these will automatically include boundary adjustments through “equivalent” boundary kernels.
3.4 Inference via Bootstrap
We propose to use bootstrapping to obtain inference for coefficients βjt (12), based on estimates derived for either the linear model (16) or the generalized linear model (14). To generate bootstrap samples, we sample n times with replacement from the set of all subjects (Xi, Ti), i = 1, …, n. For each sampled subject we include its observed covariate trajectory and the associated age-at-death.
For each bootstrap sample, we then estimate the coefficients βjt at each time point t, using the method of choice such as (26) or (31). Repeating this for B bootstrap samples yields B bootstrap estimates of βjt for each t. Finally, from the resulting bootstrap distributions we may construct pointwise 100(1 − α)% confidence intervals for each t, by finding the empirical percentiles at levels α/2 and 1 − α/2 of the B bootstrap estimates.
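The percentile bootstrap at a fixed time t can be sketched as follows. The helper `bootstrap_ci` and the simulated data are hypothetical; for brevity the estimator here refits only a least squares regression on fixed scores, whereas the procedure described above re-estimates all model components, including the functional principal components, for each resampled set of subjects.

```python
import numpy as np

def bootstrap_ci(n, estimate, B=1000, alpha=0.05, seed=0):
    """Pointwise percentile bootstrap intervals.

    estimate: function mapping an array of resampled subject indices
    to a vector of coefficient estimates.
    """
    rng = np.random.default_rng(seed)
    boot = np.array([estimate(rng.integers(0, n, size=n)) for _ in range(B)])
    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2], axis=0)
    return lower, upper

# Toy data (hypothetical): remaining lifetime regressed on two scores.
rng = np.random.default_rng(2)
n_sub = 300
sc = rng.normal(size=(n_sub, 2))
y_rem = 20.0 - 1.0 * sc[:, 0] - 3.0 * sc[:, 1] + rng.normal(scale=2.0, size=n_sub)
Z = np.column_stack([np.ones(n_sub), sc])
fit = lambda idx: np.linalg.lstsq(Z[idx], y_rem[idx], rcond=None)[0]
ci_lower, ci_upper = bootstrap_ci(n_sub, fit, B=200)
```

Subjects, not individual observations, are the resampling units, so each resampled index carries the whole trajectory and age-at-death of that subject.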
4. Remaining Lifetime for Female Medflies in Dependence on Daily Egg-Laying Trajectories
4.1 Estimating Remaining Lifetime
Individual egg-laying counts were recorded daily until death for 1000 mated female medflies (Mediterranean fruit fly, Ceratitis capitata) at the medfly mass-rearing facility in Metapa, Mexico. Age-at-death was recorded for each fly. Details of this study are provided in Carey et al. (1998). The aim of the study was to investigate relationships between patterns of reproduction and longevity. One of the basic questions of evolutionary theory is to what extent life extension is driven by enabling increased reproduction. Diverting resources that otherwise would be used for maintenance and repair into reproductive activity may shorten lifespan (Partridge and Harvey, 1985; see also the discussion in Westendorp and Kirkwood, 1999). Data from 893 medflies whose total number of eggs was not less than 20 were entered into this analysis.
We use local linear weighted least squares for smoothing the mean function and the variance–covariance surface as described above. The bandwidth for smoothing the mean function from one-curve-leave-out cross-validation was 2.4d (in days), which we rounded up to take h = 3d. The bandwidths for smoothing the covariance function were visually chosen as [10d, 10d], as cross-validation led to an undersmoothing choice. We define the evolution of the jth eigenfunction over time t as the function family
| (37) |
The time-evolutions of the mean egg-laying function and of the first two eigenfunctions are displayed in Figure 2 (for t = 30, 40, 50 days, respectively). These components describe the time-evolution of egg-laying trajectories. We find that mean and eigenfunction evolution is quite smooth, essentially appending new parts smoothly to the previous functions as their domain increases with increasing t, while not dramatically altering function shape on previously included domains. In particular the first eigenfunction evolution is very stable, while the second eigenfunction shows a small amount of flattening for increasing domains. The resulting smooth behavior of the varying coefficients β̂jt is confirmed in Figure 3.
Figure 2.
Evolution of mean functions (left column), and of first (middle column) and second (right column) eigenfunctions for current times t = 30 (first row), 40 (second row), and 50 (third row) days, for daily egg-laying data.
Figure 3.
The 95% pointwise bootstrap confidence intervals based on B = 1000 bootstrap samples for the varying coefficients β0t (top panel), β1t (middle panel), and β2t (bottom panel) of the functional linear varying coefficient model (16), for the daily egg-laying data. The solid lines are the coefficient estimates β̂0t, β̂1t, and β̂2t (26).
While smooth behavior is expected for the mean function evolution, smoothness of the eigenfunction evolution demonstrates that the short-range covariances are relatively stable over time. For all domains, the first eigenfunction delineates a broad rapidly rising peak and subsequent slow decline. The second eigenfunction is positively associated with a sharp and narrower peak with subsequent more rapid decline, and negatively with a less sharp peak followed by extensive egg-laying, a correlation that is biologically highly plausible as the “cost of reproduction,” shown here as a cost of early rapid reproduction. The average leave-one-out RSPEs (30) for QLUE regression for possible candidate values M = 1, 2, 3, 4, 5 for the number of included components were 10.22, 10.06, 10.12, 10.20, 10.21, respectively, which suggested the choice M = 2.
Once the evolution of mean and eigenfunctions has been determined, estimated functional principal component scores ε̂ijt for each fly, 0 ≤ t ≤ T, are obtained via (25). These serve as predictors in various regression models that can be considered for predicting remaining lifetime:
- Proportional hazards regression, using current egg-laying values. This is a standard method of survival analysis that has been implemented in many software packages (compare, e.g., Klein and Moeschberger, 1997). At each fixed time point t, we use the current value of the covariate as predictor and fit a Cox proportional hazards model (38) by partial likelihood, where Xi (t) is the realization of the random process X for the ith subject at t. The estimated coefficients γ̂t obtained at each time point t define the estimated varying-predictor Cox model, and from the hazard function estimates that this model produces we then obtain mean remaining lifetime function estimates.
- The functional linear varying coefficient estimates based on model (16).
- The functional time-varying generalized linear model quasi-likelihood regression estimate (QLUE), based on (14), (15). Link and variance functions for this nonparametric quasi-likelihood are estimated from the data (in our application with bandwidths 17 and 2, respectively, determined according to (32)). The egg-laying trajectories are then summarized by estimated linear predictor functions η̂(t) (33).
The functional varying coefficient estimates β̂jt (26) obtained for the linear model (16) and also the pointwise bootstrap confidence intervals are illustrated in Figure 3. The coefficients (solid lines) are seen to vary smoothly in t. Coefficients β̂0t (solid line in upper panel) are an estimate of the overall mean remaining life function r0(t) in (16) and taper off accordingly with increasing t. Coefficient estimates β̂1t (solid line in middle panel) show quite a bit of fluctuation around a small negative value; the bootstrap confidence bands provide evidence that the underlying values are significantly negative most of the time. The higher the cost in terms of diminished remaining lifetime, the larger is the corresponding estimated predictor ε̂i1t, which in turn indicates stronger alignment of an individual’s trajectory with the first eigenfunction; it is helpful to take another look at the shape of these eigenfunctions (Figure 2) to see that this corresponds to a broadly elevated level of overall egg-laying. In contrast, coefficient estimates β̂2t show a clear upward trend from a significantly negative level with the trend tapering off at higher ages. The peaky egg-laying spike that is reflected in the second eigenfunction with rapid subsequent decline is thus seen to be a portender of short remaining lifetime, and more so at younger ages t; once a fly with the spiky egg-laying feature has made it to higher age, the prognosis is increasingly less affected by the spiky peak in early reproduction. Biologically, this could be interpreted as an attenuation of the cost of early reproduction for older flies.
It is also of interest to directly visualize the evolution of the estimate of the time-varying evaluation function β(s, t) in (11), i.e., for the generalized linear model (14). This time-evolution is depicted in Figure 4. It is surprisingly smooth. We find that there is a cost of reproduction, as the evaluation function is found to associate reproductive trajectories with higher early peaks with shorter remaining lifespans. Again this effect attenuates for the larger domains and corresponding older ages, as can be seen from the contour plot which indicates a flatter, i.e., less negative, trough for increasing domains. In comparison with the previous analysis of the varying coefficient estimates, this more direct approach may be somewhat easier to interpret, while the conclusions remain similar.
Figure 4.
Evolution of the evaluation functions β(s, t), 0 ≤ s ≤ t, (11) over time in the semiparametric regression model (14) (simultaneous display of surface and contour plots).
4.2 Comparing Estimates and Estimating Remaining Lifetime Distribution
A baseline in the comparative evaluation of the performance of various remaining lifetime predictors is to simply take average remaining lifetime as the predictor for individual remaining lifetime; this predictor does not use the covariate process. We present pointwise root squared prediction errors RSPE (30) to compare this baseline method with other predictors for the egg-laying data (Figure 5). Cox regression and the baseline average remaining lifetime have quite similar root squared prediction errors, both clearly larger than those of the linear and the semiparametric generalized linear varying coefficient estimates; the semiparametric generalized linear estimate uniformly has the best performance.
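The baseline comparison can be sketched in a few lines of Python. The definition (30) of RSPE is not restated in this section, so the sketch assumes a pointwise version: the square root of the average squared difference between observed and predicted remaining lifetime among subjects still alive at age t, evaluated in-sample. The function names and the toy ages-at-death are hypothetical.

```python
import numpy as np

def rspe(death_age, predicted_remaining, t):
    """Assumed pointwise root squared prediction error at age t,
    computed over the subjects that are still alive at t."""
    alive = death_age > t
    observed = death_age[alive] - t
    return float(np.sqrt(np.mean((observed - predicted_remaining[alive]) ** 2)))

def baseline_prediction(death_age, t):
    """Baseline predictor: average remaining lifetime of the survivors at t,
    assigned to every subject regardless of the covariate trajectory."""
    alive = death_age > t
    return np.full(death_age.shape, (death_age[alive] - t).mean())

# Toy ages-at-death in days (assumption); the first subject dies before t.
death_age = np.array([5.0, 20.0, 30.0, 40.0, 50.0])
t = 10

baseline_error = rspe(death_age, baseline_prediction(death_age, t), t)
perfect_error = rspe(death_age, death_age - t, t)
```

Under these assumptions the baseline error equals the standard deviation of remaining lifetime among survivors, so any predictor that uses the covariate history has to fall below that value to be useful. An out-of-sample version would hold out each subject when forming its prediction.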
Conditional distributions of remaining lifetime, given individual covariate trajectories X̃ up to age t, are of interest for predicting characteristics of remaining lifetime for individuals, for example, by constructing 95% prediction regions. The lower panel of Figure 6 displays the predicted conditional density functions (36) for remaining lifetime at various levels of the linear predictor η̂(t) (33) at time t = 30d. Smaller values of η̂(t) are seen to be associated with densities that have higher peaks at earlier ages and are therefore associated with decreased longevity, while larger values lead to flatter peaks and heavier tails with overall increased longevity. It is interesting to compare this with the observed egg-laying trajectories, sorted according to the value of η̂(t), which are displayed in the upper panel of Figure 6. We find that the trajectories with an early rapid rise in egg-laying are those with more peaked densities and reduced survival. The speed of decline in egg-laying also seems to play a role. This adds to the evidence that reproduction is intimately connected with longevity, and that there is a cost in terms of reduced remaining lifetime for early steep increases in egg-laying activity, especially at ages immediately after the reproductive peak.
Figure 6.
Observed egg-laying curves (upper panel) from birth to current age t = 30d and estimated densities of remaining lifetime (36) after age 30d (lower panel), both shown as functions of the value of the estimated single index function η̂(t) (33), obtained from the semiparametric estimating equation (31), for the daily egg-laying data.
Revisiting Figure 1, the lower panel displays the estimated conditional density functions of remaining lifetime based on the observed egg-laying trajectories from birth to age 25 (solid segments in upper panel of Figure 1) for two individual flies. In accordance with the overall findings, the egg-laying curve with the higher peak around age 10 leads to a lower longevity prediction and a left-shifted predicted remaining lifetime density. The egg-laying trajectory with the smaller and right-shifted peak of egg-laying activity before age 25 corresponds to increased predicted survival and a density estimate with a much longer right tail and a broader peak. The predicted lifetimes are 33d and 46d, marked as crosses in the figure, and correspond to observed ages-at-death of 36d and 52d, respectively, marked as open circles.
5. Discussion and Concluding Remarks
The time-evolution of evaluation functions β (9) and of mean and eigenfunctions is a concept that provides a stepping stone to extend the reach of functional data analysis to the analysis of trajectories that are truncated by death or other events. The interpretation of Figures 2 and 4 depicting this time-evolution and implying an attenuation of the cost of reproduction for older flies indicates that the proposed analysis tools can lead to interesting insights that would be hard to come by with traditional methods. In the medfly example, these insights shed light on the interplay between cost of reproduction and longevity. In other applications it might be a practical matter to predict remaining lifetime and remaining lifetime distributions. In studies on aging, predicted remaining lifetime is a useful measure for senescence (as opposed to age). The proposed methods yield estimates for such measures based on observed covariate trajectories.
In the model comparison reported in Section 4.2, the ubiquitous Cox regression model with time-varying predictors (which takes into account only the last value of the covariate process) performs relatively poorly and in fact slightly worse than the average remaining lifetime estimator, which does not take the covariate into account at all. This demonstrates that reducing the information contained in the covariate process to just the last observed value is suboptimal in cases where the entire event history plays a role in determining remaining survival. In fact, the proposed methodology indicates that the shape of the egg-laying peak influences the cost of reproduction in an age-dependent fashion; such findings would not be possible with methods that only use the current value of the observed trajectories.
Comparing the two better performing models that both take the entire covariate trajectory into account, the semiparametric generalized linear time-varying approach (11), (31), using the QLUE algorithm, was found to be somewhat better than the linear time-varying model (16), (27). The generalized model allows for more flexibility as it includes smooth link and variance functions, and therefore can be expected to perform better in many situations than the linear model; this turns out to be particularly useful for the estimation of remaining lifetime distributions, where the constraints of dimension reduction through a single index are partly offset by the added flexibility. At the same time, this approach is more demanding numerically and harder to interpret.
The proposed methods allow for straightforward inclusion of more than one predictor trajectory per subject and also of additional multivariate covariates that may be available for each subject in some applications. Additional functional predictors will give rise to additional functional principal component scores, which can easily be added to the existing predictors in the single index η(t) (13) by extending the range of summation. Testing the significance of individual effects may be carried out by bootstrapping; eventually an asymptotic theory might emerge. For the case of functional predictors, such a theory requires increasing-dimension asymptotics, which are not straightforward.
The estimation of conditional distributions of remaining lifetime given a random trajectory as predictor was implemented here under simplifying assumptions, such as assuming a single index as the relevant predictor that can then be used for nonparametric conditional distribution function estimation with kernels. More complex approaches could be of interest, such as multiple indices. Nearest-neighbor methods hold some promise (Rice, 2004). Once the conditional distribution has been estimated in a satisfactory manner, we may obtain estimates of tail probabilities. The probability that an individual will live beyond a certain age is of actuarial interest and of scientific interest in studies of the mortality of the oldest-old. As we have demonstrated, conditional densities are quite easily obtained and prove useful in demonstrating the effect of covariate trajectories. Other functionals of conditional distributions such as conditional medians and more general conditional quantiles are also easily estimated, leading to notions of conditional quantile regression for functional predictors.
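To make the single-index kernel approach and the derived functionals concrete, the following Python sketch estimates a conditional distribution function of remaining lifetime given an index value and reads off quantiles, a 95% prediction interval, and a tail probability. A Nadaraya-Watson estimator with a Gaussian kernel is used here as a stand-in; it is not necessarily the estimator behind (36). The function names, the bandwidth, and the synthetic data are assumptions.

```python
import numpy as np

def conditional_cdf(y_grid, index_values, remaining, eta0, bandwidth):
    """Kernel-weighted estimate of P(remaining <= y | index = eta0) on y_grid.
    Assumes eta0 lies within the range of the observed index values, so that
    the kernel weights do not all underflow to zero."""
    u = (index_values - eta0) / bandwidth
    weights = np.exp(-0.5 * u ** 2)              # Gaussian kernel weights
    weights = weights / weights.sum()
    indicators = (remaining[None, :] <= y_grid[:, None]).astype(float)
    return indicators @ weights

def conditional_quantile(p, y_grid, cdf):
    """Smallest grid value at which the estimated conditional CDF reaches p."""
    idx = np.searchsorted(cdf, p, side="left")
    return y_grid[min(idx, len(y_grid) - 1)]

# Synthetic data (assumption): remaining lifetime increases with the index.
rng = np.random.default_rng(1)
index_values = rng.normal(size=500)
noise = rng.normal(scale=2.0, size=500)
remaining = np.maximum(30.0 + 6.0 * index_values + noise, 1.0)
y_grid = np.linspace(0.0, remaining.max(), 400)

cdf_low = conditional_cdf(y_grid, index_values, remaining, eta0=-1.0, bandwidth=0.3)
cdf_high = conditional_cdf(y_grid, index_values, remaining, eta0=1.0, bandwidth=0.3)

median_low = conditional_quantile(0.5, y_grid, cdf_low)
median_high = conditional_quantile(0.5, y_grid, cdf_high)
interval_high = (conditional_quantile(0.025, y_grid, cdf_high),
                 conditional_quantile(0.975, y_grid, cdf_high))
# Estimated probability of surviving more than 40 further days given index 1.
tail_prob = 1.0 - np.interp(40.0, y_grid, cdf_high)
```

In this sketch the bandwidth governs the trade-off between bias and variance of the conditional estimates and would need to be selected from the data in practice. A density estimate would require an additional smoothing step in the direction of remaining lifetime.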
Acknowledgments
This research was supported in part by NSF grants DMS02-04869 and DMS03-54448, and NIH grants P01-AG022500-01 and P01-AG08761-10. We are grateful to an associate editor and two referees for their comments which led to many improvements in the article.
References
- Ash RB, Gardner MF. Topics in Stochastic Processes. New York: Academic Press; 1975.
- Bycott P, Taylor J. A comparison of smoothing techniques for CD4 data measured with error in a time-dependent Cox proportional hazards model. Statistics in Medicine. 1998;17:2061–2077. doi: 10.1002/(sici)1097-0258(19980930)17:18<2061::aid-sim896>3.0.co;2-o.
- Cai Z, Fan J, Li R. Efficient estimation and inferences for varying-coefficient models. Journal of the American Statistical Association. 2000;95:888–902.
- Cardot H, Ferraty F, Mas A, Sarda P. Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics. 2003;30:241–255.
- Carey J, Liedo P, Müller HG, Wang JL, Chiou JM. Relationship of age patterns of fecundity to mortality, longevity and lifetime reproduction in a large cohort of Mediterranean fruit fly females. Journal of Gerontology, Biological Sciences and Medical Sciences. 1998;53:245–251. doi: 10.1093/gerona/53a.4.b245.
- Castro PE, Lawton WH, Sylvestre EA. Principal modes of variation for processes with continuous sample curves. Technometrics. 1986;28:329–337.
- Chiou JM, Müller HG. Quasi-likelihood regression with unknown link and variance functions. Journal of the American Statistical Association. 1998;93:1376–1387.
- Chiou JM, Müller HG. Nonparametric quasi-likelihood. Annals of Statistics. 1999;27:36–64.
- Cox DR. Regression models and life tables (with Discussion). Journal of the Royal Statistical Society, Series B. 1972;34:187–200.
- Fan J, Gijbels I. Local Polynomial Modelling and Its Applications. London: Chapman and Hall; 1996.
- Hall P, Müller HG. Order-preserving non-parametric regression, with applications to conditional distribution and quantile function estimation. Journal of the American Statistical Association. 2003;98:598–608.
- Hastie T, Tibshirani R. Varying-coefficient models. Journal of the Royal Statistical Society, Series B. 1993;55:757–796.
- James GM. Generalized linear models with functional predictors. Journal of the Royal Statistical Society, Series B. 2002;64:411–432.
- Klein JP, Moeschberger ML. Survival Analysis Techniques for Censored and Truncated Data. New York: Springer; 1997.
- Malfait N, Ramsay JO. The historical functional linear model. Canadian Journal of Statistics. 2003;31:115–128.
- Müller HG. Functional modeling and classification of longitudinal data. Scandinavian Journal of Statistics. 2005;32:223–240.
- Müller HG, Stadtmüller U. Generalized functional linear models. Annals of Statistics. 2005;33:774–805.
- Müller HG, Wang JL. Hazard rate estimation under random censoring with varying kernels and bandwidths. Biometrics. 1994;50:61–76.
- Müller HG, Carey JR, Wu D, Liedo P, Vaupel JW. Reproductive potential predicts longevity of female Mediterranean fruit flies. Proceedings of the Royal Society B. 2001;268:445–450. doi: 10.1098/rspb.2000.1370.
- Oeppen J, Vaupel JW. Broken limits to life expectancy. Science. 2002;296:1029–1031. doi: 10.1126/science.1069675.
- Partridge L, Harvey PH. Costs of reproduction. Nature. 1985;316:20–21.
- Ramsay JO, Silverman BW. Functional Data Analysis. New York: Springer-Verlag; 1997.
- Rice J. Functional and longitudinal data analysis: Perspectives on smoothing. Statistica Sinica. 2004;14:631–647.
- Rice JA, Silverman BW. Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society, Series B. 1991;53:233–243.
- Self S, Pawitan Y. Modeling a marker of disease progression and onset of disease. In: Jewell NP, Dietz K, Farewell VT, editors. AIDS Epidemiology: Methodological Issues. Boston: Birkhäuser; 1992. pp. 231–255.
- Staniswalis JG, Lee JJ. Nonparametric regression analysis of longitudinal data. Journal of the American Statistical Association. 1998;93:1403–1418.
- Tsiatis AA, DeGruttola V, Wulfsohn MS. Modeling the relationship of survival to longitudinal data measured with error: Application to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association. 1995;90:27–37.
- Vaupel JW, Carey JR, Christensen K, et al. Biodemographic trajectories of longevity. Science. 1998;280:855–860. doi: 10.1126/science.280.5365.855.
- Westendorp RGJ, Kirkwood TBL. Human longevity at the cost of reproductive success. Nature. 1999;396:743–746. doi: 10.1038/25519.
- Wu CO, Yu KF. Nonparametric varying coefficient models for the analysis of longitudinal data. International Statistical Review. 2002;70:373–393.
- Yao F, Müller HG, Clifford AJ, Dueker SR, Follett J, Lin Y, Buchholz BA, Vogel JS. Shrinkage estimation for functional principal component scores with application to the population kinetics of plasma folate. Biometrics. 2003;59:676–685. doi: 10.1111/1541-0420.00078.
- Yu KM, Jones MC. Local linear quantile regression. Journal of the American Statistical Association. 1998;93:228–237.






