2026 Feb 13;45(3-5):e70442. doi: 10.1002/sim.70442

A Functional Joint Model for Survival and Multivariate Sparse Functional Data in Multi‐Cohort Alzheimer's Disease Study

Wenyi Wang 1, Luo Xiao 1, Ruonan Li 1, Sheng Luo 2; Alzheimer's Disease Neuroimaging Initiative
PMCID: PMC12902813  PMID: 41684276

ABSTRACT

We develop an integrative joint model for multivariate sparse functional and survival data to analyze Alzheimer's disease (AD) across multiple studies. To address missing‐by‐design outcomes in multi‐cohort studies, our approach extends the multivariate functional mixed model (MFMM), which integrates longitudinal outcomes to extract shared disease progression trajectories and links these outcomes to time‐to‐event data through a parsimonious survival model. This framework balances flexibility and interpretability by modeling shared progression trajectories while accommodating cohort‐specific mean functions and survival parameters. For efficient estimation, we incorporate penalized splines into an EM algorithm. Application to three AD cohorts demonstrates the model's ability to capture disease trajectories and account for inter‐cohort variability. Simulation studies confirm its robustness and accuracy, highlighting its value in advancing the understanding of AD progression and supporting clinical decision‐making in multi‐cohort settings.

Keywords: EM algorithm, functional data, multivariate longitudinal data, penalized splines

1. Introduction

Alzheimer's disease (AD) is a progressive brain disorder that significantly impairs cognitive and behavioral functions. In the United States, AD is the fifth leading cause of death among people 65 years and older, with an estimated 6.9 million affected in 2024 [1]. The increasing prevalence of AD has led to the establishment of large‐scale longitudinal studies, such as the Alzheimer's Disease Neuroimaging Initiative (ADNI) [2], the National Alzheimer's Coordinating Center (NACC) [3], and the Religious Orders Study and Rush Memory and Aging Project (ROSMAP) [4]. These studies collect various longitudinal data, including neuropsychological and behavioral measurements, to monitor disease trajectories and assess risk factors over time (Table 1). Such measurements are critical for understanding disease progression and its associations with survival outcomes (e.g., dementia onset), ultimately informing clinical interventions and policy.

TABLE 1.

List of longitudinal outcomes and number of Mild Cognitive Impairment (MCI) subjects in three AD cohorts.

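| Order | Longitudinal outcome | Scaling coefficient ($\beta_j$) | ADNI | NACC | ROSMAP |
|---|---|---|---|---|---|
| 1 | MMSE | 1.00 | ✓ | ✓ | ✓ |
| 2 | WMSLM | 1.54 | ✓ | ✓ | ✓ |
| 3 | RAVLT | 3.68 | ✓ | ✗ | ✗ |
| 4 | SDMT | 3.65 | ✗ | ✗ | ✓ |
| 5 | CDR‐SB | −0.60 | ✓ | ✓ | ✗ |
| 6 | ADAS | −2.70 | ✓ | ✗ | ✗ |
| 7 | FAQ | −1.84 | ✓ | ✗ | ✗ |
| 8 | TRAILA | −1.08 | ✗ | ✓ | ✗ |
| | Number of MCI subjects | | 715 | 3707 | 522 |

Note: ✓, data available; ✗, data unavailable.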

Abbreviations: ADAS, Alzheimer's Disease Assessment Scale‐Cognitive; CDR‐SB, Clinical Dementia Rating; FAQ, Functional Assessment Questionnaire; MMSE, Mini‐Mental State Exam; RAVLT, Rey Auditory Verbal Learning Test immediate recall; SDMT, Symbol Digit Modalities Test; TRAILA, Trail Making Test Part A; WMSLM, Wechsler Memory Scale Logical Memory.

The analysis of data from multiple cohorts of AD presents significant opportunities and challenges. Pooling data from multiple cohorts increases the sample size and statistical power, enabling the investigation of complex risk and protective factors, including nonlinear and time‐varying effects. Additionally, multi‐cohort analysis improves the robustness and generalizability of predictive models across diverse subgroups, settings, and countries, making findings more clinically applicable. However, heterogeneity in baseline characteristics, study designs, and data collection protocols complicates integration, particularly when some longitudinal outcomes are systematically missing by design (Table 1). Modeling longitudinal outcomes in this context requires methods capable of capturing nonlinear subject‐specific trajectories while accounting for cohort‐specific differences. Traditional parametric models often fail to address these complexities, especially in the presence of missing‐by‐design outcomes.

Joint models (JMs) are widely used for analyzing longitudinal and survival data [5] and have been extended to handle multiple longitudinal outcomes [6, 7, 8]. However, traditional JMs often rely on parametric assumptions for the longitudinal sub‐model, which can be overly restrictive and prone to misspecification when trajectories are intricate and nonlinear. To address this, longitudinal outcomes can alternatively be modeled as sparse functional data [9, 10], leveraging flexible nonparametric methods [11] to capture complex subject‐specific patterns over time. Joint models of survival and sparse functional data are termed functional joint models (FJMs) [12, 13]. Recent advances have extended FJMs to incorporate multivariate sparse functional data [8, 14, 15, 16], multi‐dimensional sparse functional data [17], recurrent event data [18], and imaging data [19]. Despite the successes of FJMs, their estimation remains a major challenge, and many existing works adopt a computationally more feasible two‐stage estimation, often at the expense of substantial bias. This challenge becomes more severe when developing an FJM that addresses systematic missingness and inter‐cohort heterogeneity in multi‐cohort studies.

Motivated by these challenges, this paper develops an integrative FJM for multivariate sparse functional data and survival data from multiple cohorts, accompanied by an efficient and feasible estimation procedure. First, to accommodate missing‐by‐design outcomes while leveraging shared information across cohorts in the longitudinal sub‐model, we decompose longitudinal outcomes into shared and outcome‐specific disease trajectories as in the multivariate functional mixed model (MFMM) [14], which effectively balances model flexibility and interpretability, thereby enabling comprehensive insights into disease processes. Crucially, we assume consistent variation patterns across cohorts in modeling the shared and outcome‐specific disease trajectories, allowing the integration of longitudinal data from diverse sources while maintaining a parsimonious survival model. Indeed, our approach integrates longitudinal and survival data through shared latent trajectories, offering a unified framework that leverages all available information to improve predictive accuracy. Then, to address cohort‐specific variability, we incorporate cohort‐specific mean functions in the longitudinal sub‐model and cohort‐specific regression coefficients in the survival sub‐model. As shown in the application to AD studies, the integrative model is capable of revealing insights that cannot be identified by a single‐cohort analysis. These modeling choices and assumptions balance leveraging shared information across cohorts with accommodating their inherent heterogeneity.

To estimate the proposed multi‐cohort FJM, we develop an efficient and computationally feasible expectation‐maximization (EM) algorithm, another major contribution of the paper. FJMs often have multiple nonparametric smooth functions (more than 10 in our model) to estimate, and their estimation makes EM algorithms computationally very challenging. Indeed, regression splines were previously used [14], for which overfitting may not only affect estimates but also result in slow convergence of EM. Notice that overfitting is almost inevitable in joint models due to the right truncation of longitudinal outcomes by the survival outcome. Therefore, we propose to use penalized splines [20] to estimate nonparametric smooth functions in the longitudinal sub‐model, allowing for the modeling of complex nonlinear relationships while reducing overfitting risks associated with regression splines. Penalized splines also allow for varying smoothness across functions, accommodating diverse data structures and trajectories within and across cohorts. For example, Figure 3 in the data application shows that penalized splines provide not only highly nonlinear estimates but also essentially linear estimates. However, selecting smoothing parameters for multiple nonparametric functions presents great computational challenges, particularly in iterative EM algorithms. A fast algorithm for the global selection of smoothing parameters in nonparametric regression such as in generalized additive models [21] is inapplicable for FJMs because it can only deal with regression functions while for FJMs, nonparametric smooth functions (eigenfunctions) are also used to model variances in functional data models. To address this, we propose using a local selection of smoothing parameters, exploiting the iterative nature of EM algorithms. Specifically, at each iterative step of EM, we reformulate the estimation of each smooth function as a weighted least squares problem for nonparametric regression. 
This reformulation enables efficient simultaneous estimation of smooth functions and smoothing parameters using standard software, such as the well‐developed mgcv R package. This approach stabilizes parameter estimation and substantially accelerates EM convergence, making it feasible for multi‐cohort studies such as AD studies.

FIGURE 3.

Estimated cohort‐specific mean trajectories of longitudinal outcomes for ADNI (blue lines), NACC (red lines) and ROSMAP (green lines). Columns 1 and 3 compare multi‐cohort FJM (solid lines) with separate FJM (dashed lines), while columns 2 and 4 show results from the parametric MJM.

The remainder of this paper is organized as follows. Section 2 introduces the proposed multi‐cohort functional joint model. Section 3 details model estimation using the Monte Carlo Expectation‐Maximization (EM) algorithm. Section 4 outlines the selection of key model parameters, such as the principal components and the smoothing parameters. In Section 5, we apply the model to three AD cohorts, demonstrating its ability to capture shared and cohort‐specific disease progression patterns. Section 6 evaluates the performance of the model through simulation studies, and Section 7 concludes with a discussion. The R code for implementing the proposed method is available at https://github.com/wenyiwang2000/Multi-Cohort-FJM.

2. The Multi‐Cohort Functional Joint Model

2.1. Data Structure and Notation

As shown in Table 1, the longitudinal outcomes measured in each cohort differ, with some outcomes observed across all cohorts and others specific to one or two cohorts. Let $n$ be the total number of subjects across all cohorts, indexed by $i$, and let $n_c$ be the number of cohorts. The mapping $c(i)$ indicates the cohort to which subject $i$ belongs. Let $J$ denote the total number of longitudinal outcomes and $\mathcal{J}_c$ the set of outcomes observed in cohort $c$. For the multi‐cohort AD studies and their outcomes listed in Table 1, $J=8$, with $\mathcal{J}_1=\{1,2,3,5,6,7\}$ in ADNI, $\mathcal{J}_2=\{1,2,5,8\}$ in NACC, and $\mathcal{J}_3=\{1,2,4\}$ in ROSMAP. For subject $i$ in cohort $c(i)$ and outcome $j\in\mathcal{J}_{c(i)}$, let $Y_{ijk}$ denote the $k$th observation of the $j$th longitudinal outcome at time $t_{ijk}$, with $1\le k\le m_{ij}$ and $m_{ij}$ the number of observations. Observation times $t_{ijk}$ are assumed to lie within a compact interval $\mathcal{T}$, representing the study follow‐up period.

The survival outcome, representing the time from baseline to the onset of AD dementia, is denoted by $S_i$ for subject $i$. Because $S_i$ may be right‐censored, we observe $T_i=\min(S_i,C_i)$, where $C_i$ is the censoring time, assumed independent of both the event time $S_i$ and the longitudinal outcomes. The binary event indicator $\Delta_i=1\{S_i\le C_i\}$ specifies whether $S_i$ is observed. Longitudinal times are restricted to $t_{ijk}\in[0,T_i]\subseteq\mathcal{T}$, implying no observations of longitudinal outcomes after $T_i$. Finally, let $z_i=(Z_{i1},\ldots,Z_{iP})^\top\in\mathbb{R}^P$ denote a common set of baseline covariates collected across all cohorts, including demographic and clinical characteristics such as age, sex, and APOE genotype, which are relevant to AD progression.
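The observation scheme above can be made concrete with a minimal sketch. The function name `observe` and the numeric inputs are illustrative assumptions, not part of the original study; the code only encodes the definitions of the observed time, the event indicator, and the truncation of longitudinal visits at the observed time.

```python
def observe(event_time, censor_time, visit_times):
    """Return (T_i, Delta_i, retained visit times) for one subject.

    T_i = min(S_i, C_i), Delta_i = 1{S_i <= C_i}, and longitudinal
    visits after T_i are dropped (no observations after T_i).
    """
    t_obs = min(event_time, censor_time)
    delta = 1 if event_time <= censor_time else 0
    kept = [t for t in visit_times if 0.0 <= t <= t_obs]
    return t_obs, delta, kept


# Hypothetical subject whose event occurs before censoring.
print(observe(3.2, 5.0, [0.0, 1.0, 2.0, 3.0, 4.0]))
# Hypothetical subject who is censored before the event.
print(observe(6.0, 4.5, [0.0, 1.5, 3.0, 4.5, 6.0]))
```

The truncation step is what makes the longitudinal record informative about survival: a subject with an early event simply contributes fewer visits.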

2.2. Multivariate Functional Mixed Model for Multi‐Cohort Longitudinal Data

We model multiple longitudinal outcomes as multivariate functional data and extend the multivariate functional mixed model (MFMM) [14] to multiple cohorts:

$$Y_{ijk}=X_{ij}(t_{ijk})+\epsilon_{ijk},\qquad X_{ij}(t)=\mu_{jc(i)}(t)+\beta_j\{U_i(t)+V_{ij}(t)\},\quad t\in\mathcal{T}, \tag{1}$$

where $X_{ij}(t)$ is a smooth latent stochastic process, $\mu_{jc(i)}(t)$ is the fixed mean function for the $j$th outcome in cohort $c(i)$, the random process $U_i(t)$ captures shared variation patterns and induces correlations among multiple outcomes for the $i$th subject, $V_{ij}(t)$ is the deviation of the $j$th outcome from $\mu_{jc(i)}(t)$ for the $i$th subject, $\beta_j$ is the outcome‐specific scaling parameter for the $j$th outcome, and $\epsilon_{ijk}$ is measurement error with variance $\sigma_j^2$. These components jointly capture both shared disease progression and individual variations in AD longitudinal outcomes. It is assumed that $U_i(t)$ is independent across subjects. The outcome‐specific deviation $V_{ij}(t)$ and the measurement error $\epsilon_{ijk}$ are assumed to be independent of $U_i(t)$, and they are mutually independent across subjects and outcomes.

The shared latent trajectory $U_i(t)$ captures overall AD progression, reflecting variation patterns shared among outcomes. It is modeled as a Gaussian process with zero mean and covariance $\mathcal{C}(s,t)=\mathrm{Cov}\{U_i(s),U_i(t)\}$. Outcome‐specific variations are captured by $V_{ij}(t)$, another zero‐mean Gaussian process with covariance $\mathcal{G}(s,t)=\mathrm{Cov}\{V_{ij}(s),V_{ij}(t)\}$. The scaling coefficients $\beta_j$ adjust for differences in outcome magnitudes, ensuring comparability. Identifiability is ensured by fixing $\beta_1=1$ for a sentinel outcome, such as MMSE or WMSLM. Note that different scaling parameters might be used for $U_i(t)$ and $V_{ij}(t)$, but we found that such flexibility is unnecessary for our data application.

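The covariance function of the shared latent trajectory $U_i(t)$ is decomposed using eigendecomposition as $\mathcal{C}(s,t)=\sum_{d\ge 1}\lambda_{1d}\phi_d(s)\phi_d(t)$, where $\lambda_{11}\ge\lambda_{12}\ge\cdots\ge 0$ are the eigenvalues, and $\phi_d(t)$ are the associated orthonormal eigenfunctions, satisfying $\int_{\mathcal{T}}\phi_d(t)\phi_{d'}(t)\,dt=1\{d=d'\}$. Similarly, the covariance of the outcome‐specific process $V_{ij}(t)$ is decomposed as $\mathcal{G}(s,t)=\sum_{d\ge 1}\lambda_{2d}\psi_d(s)\psi_d(t)$, where $\lambda_{21}\ge\lambda_{22}\ge\cdots\ge 0$ are the eigenvalues and $\psi_d(t)$ are orthonormal eigenfunctions that satisfy $\int_{\mathcal{T}}\psi_d(t)\psi_{d'}(t)\,dt=1\{d=d'\}$.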

To facilitate estimation and interpretation, we assume that both $U_i(t)$ and $V_{ij}(t)$ can be represented by a finite number of functional principal components (FPC). Specifically, $U_i(t)=\sum_{d=1}^{D_1}\xi_{id}\phi_d(t)$ with FPC scores $\xi_{id}\sim N(0,\lambda_{1d})$, and $V_{ij}(t)=\sum_{d=1}^{D_2}\zeta_{ijd}\psi_d(t)$ with FPC scores $\zeta_{ijd}\sim N(0,\lambda_{2d})$. Here, the parameters $D_1$ and $D_2$ represent the number of principal components retained for shared and outcome‐specific processes, respectively, and are adaptively selected using data‐driven criteria such as cross‐validation or AIC/BIC. Finally, measurement errors $\epsilon_{ijk}$ are assumed to be independently distributed as $N(0,\sigma_j^2)$. Given the eigenfunctions $\phi_d(t)$ and $\psi_d(t)$, Model (1) can be expressed as a mixed model

$$Y_{ijk}=\mu_{jc(i)}(t_{ijk})+\beta_j\Big\{\sum_{d_1=1}^{D_1}\xi_{id_1}\phi_{d_1}(t_{ijk})+\sum_{d_2=1}^{D_2}\zeta_{ijd_2}\psi_{d_2}(t_{ijk})\Big\}+\epsilon_{ijk}.$$
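To illustrate how the mixed-model form generates data, the sketch below simulates one outcome for one subject. Everything numeric here is a labeled assumption: the cosine and sine eigenfunctions on [0, 1], the eigenvalues, the linear mean function, and the error standard deviation are hypothetical stand-ins, not estimates from the paper.

```python
import math
import random


def phi(d, t):
    """Hypothetical shared eigenfunctions on [0, 1]: 1, sqrt(2)cos(pi t), ..."""
    return 1.0 if d == 0 else math.sqrt(2.0) * math.cos(d * math.pi * t)


def psi(d, t):
    """Hypothetical outcome-specific eigenfunctions: sqrt(2)sin((d+1) pi t)."""
    return math.sqrt(2.0) * math.sin((d + 1) * math.pi * t)


def latent_value(t, mean_fn, beta_j, xi, zeta):
    """X_ij(t) = mu(t) + beta_j * {U_i(t) + V_ij(t)} with truncated FPC sums."""
    u = sum(score * phi(d, t) for d, score in enumerate(xi))
    v = sum(score * psi(d, t) for d, score in enumerate(zeta))
    return mean_fn(t) + beta_j * (u + v)


def simulate_outcome(times, mean_fn, beta_j, xi, zeta, sigma_j, rng):
    """Add independent N(0, sigma_j^2) measurement error at each visit."""
    return [latent_value(t, mean_fn, beta_j, xi, zeta) + rng.gauss(0.0, sigma_j)
            for t in times]


rng = random.Random(1)
lam1, lam2 = [1.0, 0.25], [0.5]  # hypothetical eigenvalues (D1 = 2, D2 = 1)
xi = [rng.gauss(0.0, math.sqrt(lam)) for lam in lam1]    # shared scores
zeta = [rng.gauss(0.0, math.sqrt(lam)) for lam in lam2]  # outcome-specific scores
y = simulate_outcome([0.0, 0.25, 0.5, 0.75], lambda t: 27.0 - 2.0 * t,
                     1.0, xi, zeta, 0.5, rng)
print(y)
```

Because the shared scores `xi` enter every outcome of the same subject, reusing them with a different `beta_j` and mean function would produce a second, correlated outcome.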

We make a few remarks about Model (1). First, heterogeneity across cohorts is modeled via the outcome‐ and cohort‐specific fixed mean functions, ensuring flexibility in modeling cohort‐specific baseline differences. Second, the proposed model integrates the cohorts by imposing shared variation patterns across the outcomes, allowing the scores $\xi_{id_1}$ and $\zeta_{ijd_2}$ to be directly comparable among subjects across cohorts. This comparability facilitates a parsimonious survival model that integrates information across cohorts, even when each cohort collects different sets of longitudinal outcomes. Using shared latent trajectories and cohort‐specific mean functions, MFMM naturally accommodates missing data by design, allowing outcomes observed in only a subset of cohorts to contribute to the model. Indeed, Proposition 1 below shows that MFMM is identifiable under loose conditions. Figure 1 illustrates this decomposition by plotting the estimated components for two outcomes, MMSE and CDR‐SB, for one subject in the ADNI study and another subject in the NACC study, highlighting the model's ability to separate shared and outcome‐specific variations.

FIGURE 1.

Estimated components of the multi‐cohort MFMM in Model (1) for two outcomes: MMSE and CDR‐SB for one subject from the ADNI study (top two rows) and one subject from the NACC study (bottom two rows). Columns represent the following: (1) observed MMSE/CDR‐SB values (black dots) and estimated latent processes $X_{ij}(t)$; (2) cohort‐specific mean functions $\mu_{jc(i)}(t)$; (3) shared latent disease trajectory $U_i(t)$ scaled by coefficient $\beta_j$; (4) outcome‐specific deviations $V_{ij}(t)$ scaled by $\beta_j$.

Proposition 1

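Under Model (1), suppose that $\int_{\mathcal{T}}\mathcal{C}(t,t)\,dt>0$. Assume that $\beta_1=1$ and $1\in\bigcap_{c=1}^{n_c}\mathcal{J}_c$. Then, Model (1) is identifiable. Specifically, the model parameters $\{\mu_{jc}(t),\,j\in\mathcal{J}_c,\,1\le c\le n_c\}$, $\{\beta_j\}$, $\mathcal{C}(s,t)$, $\mathcal{G}(s,t)$ and $\{\sigma_j^2,\,1\le j\le J\}$ are identifiable.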

The proof of Proposition 1 is similar to that of identifiability for the original MFMM for one cohort [14] and hence is omitted. The condition $\int_{\mathcal{T}}\mathcal{C}(t,t)\,dt>0$ ensures the presence of shared variation among outcomes, while fixing $\beta_1=1$ anchors the scaling of the shared latent trajectories. These conditions are easily satisfied in AD studies because all cohorts collect the MMSE and WMSLM outcomes, one of which can serve as the sentinel outcome with $\beta_1=1$. Moreover, the conditions in Proposition 1 can be relaxed. For example, it is not necessary to have one or more outcomes common to all cohorts, which demonstrates the model's flexibility in accommodating different outcome collection across cohorts.
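The sufficient condition in Proposition 1 (the sentinel outcome belongs to every cohort's outcome set) is easy to check mechanically. The sketch below uses the outcome sets stated in Section 2.1; the function names are illustrative assumptions.

```python
# Outcome sets from Section 2.1 (indices follow the ordering in Table 1).
OUTCOME_SETS = {
    "ADNI": {1, 2, 3, 5, 6, 7},
    "NACC": {1, 2, 5, 8},
    "ROSMAP": {1, 2, 4},
}


def common_outcomes(outcome_sets):
    """Outcomes collected by every cohort (intersection of all sets)."""
    return set.intersection(*outcome_sets.values())


def sentinel_is_valid(outcome_sets, sentinel=1):
    """True when the sentinel outcome (beta = 1) is observed in all cohorts."""
    return sentinel in common_outcomes(outcome_sets)


print(common_outcomes(OUTCOME_SETS))    # outcomes 1 (MMSE) and 2 (WMSLM)
print(sentinel_is_valid(OUTCOME_SETS))  # MMSE as the sentinel outcome
```

This checks only the sufficient condition as stated; as noted above, identifiability can hold under weaker conditions that this check does not cover.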

2.3. Joint Model for Longitudinal and Survival Data

Let $h_i(t)$ be the hazard function for the $i$th subject. In addition, let $\xi_i=(\xi_{i1},\ldots,\xi_{iD_1})^\top$ be the vector of scores corresponding to the shared latent trajectory among the outcomes. We consider the following cohort‐specific proportional hazards model,

$$h_i(t)=h_0(t)\exp\{z_i^\top\gamma_{1c(i)}+\xi_i^\top\gamma_{2c(i)}\}, \tag{2}$$

where $h_0(t)$ is the baseline hazard function, which may be specified parametrically or estimated nonparametrically, $\gamma_{1c(i)}$ is the coefficient vector for the baseline covariates $z_i$ in cohort $c(i)$, and $\gamma_{2c(i)}=(\gamma_{2c(i),1},\ldots,\gamma_{2c(i),D_1})^\top$ is the coefficient vector associated with the shared latent score $\xi_i$ for cohort $c(i)$. The cohort‐specific coefficients $\gamma_{1c(i)}$ account for baseline differences in covariate effects among cohorts, while $\gamma_{2c(i)}$ captures the influence of the shared latent trajectory on survival outcomes. This hazard model is parsimonious and applicable across cohorts because it incorporates longitudinal outcomes via shared random scores $\xi_i$, enabling integration across cohorts without dependence on the specific set of outcomes collected. Using shared latent scores, the model avoids the complexity of cohort‐specific dependence on collected outcomes, reducing dimensionality while maintaining interpretability and statistical power.
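A small sketch may help fix ideas about Model (2). The constant baseline hazard and all coefficient values below are hypothetical choices made only for illustration; the paper leaves the baseline hazard unspecified.

```python
import math


def linear_predictor(z, xi, gamma1, gamma2):
    """z^T gamma_1c + xi^T gamma_2c for one subject in cohort c."""
    return (sum(a * b for a, b in zip(z, gamma1))
            + sum(a * b for a, b in zip(xi, gamma2)))


def hazard(t, z, xi, gamma1, gamma2, baseline):
    """h_i(t) = h_0(t) * exp(linear predictor), as in Model (2)."""
    return baseline(t) * math.exp(linear_predictor(z, xi, gamma1, gamma2))


# Hypothetical cohort-specific coefficients (two covariates, D1 = 2 scores).
coef = {
    "cohort_a": {"gamma1": [0.03, 0.40], "gamma2": [0.80, 0.10]},
    "cohort_b": {"gamma1": [0.02, 0.55], "gamma2": [0.60, 0.20]},
}


def constant_baseline(t):
    return 0.05  # assumed constant baseline hazard, for illustration only


z, xi = [1.5, 1.0], [0.7, -0.2]
for name, g in coef.items():
    print(name, hazard(2.0, z, xi, g["gamma1"], g["gamma2"], constant_baseline))
```

Under this model, a one-unit increase in the first shared score multiplies the hazard by the exponential of the first entry of `gamma2`, and that multiplier is allowed to differ by cohort.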

Although the proportional hazards assumption is used here, the framework can be extended to accommodate time‐varying effects or more complex survival models, providing additional flexibility for diverse applications. Moreover, although we employ the random effects model as the linking function between longitudinal outcomes and survival data, other functional forms can also be utilized. For example, derivative forms incorporating the rate of change in the shared latent trajectory, cumulative forms accounting for historical influence, or lag models emphasizing the weighted impact of recent history, could offer alternative approaches [16]. These extensions provide further flexibility to capture nuanced relationships between longitudinal and survival data, enhancing the applicability of the model across various contexts.

2.4. Likelihood of Joint Model

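Let $y_{ij}=(Y_{ij1},\ldots,Y_{ijm_{ij}})^\top$ be the column vector of observations for the $j$th outcome of the $i$th subject. Let $\xi_i=(\xi_{i1},\ldots,\xi_{iD_1})^\top$ be the vector of FPC scores corresponding to the shared latent trajectory, with a diagonal covariance matrix $\Lambda_1=\mathrm{diag}(\lambda_{11},\ldots,\lambda_{1D_1})$. Similarly, let $\zeta_{ij}=(\zeta_{ij1},\ldots,\zeta_{ijD_2})^\top$ be the vector of FPC scores corresponding to the outcome‐specific trajectory, with a diagonal covariance matrix $\Lambda_2=\mathrm{diag}(\lambda_{21},\ldots,\lambda_{2D_2})$. Define $\phi_{d,ij}=\{\phi_d(t_{ij1}),\ldots,\phi_d(t_{ijm_{ij}})\}^\top$ and $\Phi_{ij}=[\phi_{1,ij},\ldots,\phi_{D_1,ij}]$ as the eigenfunctions for the shared latent trajectory evaluated at the observation times. Similarly, let $\psi_{d,ij}=\{\psi_d(t_{ij1}),\ldots,\psi_d(t_{ijm_{ij}})\}^\top$ and $\Psi_{ij}=[\psi_{1,ij},\ldots,\psi_{D_2,ij}]$ represent the evaluated eigenfunctions for the outcome‐specific trajectory.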

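Let $\mu_{ij}=\{\mu_{jc(i)}(t_{ij1}),\ldots,\mu_{jc(i)}(t_{ijm_{ij}})\}^\top$ be the mean function evaluated at the observation times. In line with Model (1), the predicted values for the $j$th outcome of the $i$th subject are given by $x_{ij}=\mu_{ij}+\beta_j\Phi_{ij}\xi_i+\beta_j\Psi_{ij}\zeta_{ij}$. The predicted values $x_{ij}$ provide the functional link between latent FPC scores and observed longitudinal data, capturing the modeled relationship between shared and outcome‐specific trajectories. Concatenate the vectors $y_{ij}$ ($j\in\mathcal{J}_{c(i)}$) into a long column vector $y_i$ and define $x_i$ similarly. Let $\eta_i=(\xi_i^\top,\zeta_{i1}^\top,\ldots,\zeta_{iJ_{c(i)}}^\top)^\top$ be the collection of all FPC scores (treated as random effects in the likelihood framework), where $J_{c(i)}$ denotes the number of outcomes in $\mathcal{J}_{c(i)}$. Finally, denote the covariance matrix of the measurement errors by $\Sigma_i=\mathrm{blockdiag}(\sigma_1^2I_{m_{i1}},\ldots,\sigma_{J_{c(i)}}^2I_{m_{iJ_{c(i)}}})$. The conditional likelihood of the multivariate longitudinal data given $\eta_i$ is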

$$f(y_i\mid x_i,\Sigma_i)=|2\pi\Sigma_i|^{-1/2}\exp\Big\{-\tfrac{1}{2}(y_i-x_i)^\top\Sigma_i^{-1}(y_i-x_i)\Big\}.$$

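FPC scores (random effects) $\xi_i$ and $\zeta_{ij}$ are assumed to follow multivariate normal distributions: $f(\xi_i\mid\Lambda_1)=|2\pi\Lambda_1|^{-1/2}\exp(-\tfrac{1}{2}\xi_i^\top\Lambda_1^{-1}\xi_i)$ and $f(\zeta_{ij}\mid\Lambda_2)=|2\pi\Lambda_2|^{-1/2}\exp(-\tfrac{1}{2}\zeta_{ij}^\top\Lambda_2^{-1}\zeta_{ij})$.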

The conditional partial likelihood of the time‐to‐event data is given by

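$$f(T_i,\Delta_i\mid h_0,z_i,\xi_i,\gamma_{1c(i)},\gamma_{2c(i)})=\big\{h_0(T_i)\exp(z_i^\top\gamma_{1c(i)}+\xi_i^\top\gamma_{2c(i)})\big\}^{\Delta_i}\times\exp\big\{-\Lambda_0(T_i)\exp(z_i^\top\gamma_{1c(i)}+\xi_i^\top\gamma_{2c(i)})\big\}, \tag{3}$$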

where $\Lambda_0(t)=\int_0^t h_0(u)\,du$ is the baseline cumulative hazard function.
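As a worked instance of the survival likelihood (3), the sketch below evaluates its logarithm under an assumed constant baseline hazard, so that the cumulative baseline hazard is simply the hazard times the elapsed time. The constant hazard value and the function name are illustrative assumptions; the paper estimates the baseline hazard nonparametrically.

```python
import math


def log_survival_likelihood(t_obs, delta, lin_pred, h0=0.1):
    """log of (3) with a constant baseline hazard h0 (an assumption).

    With h_0(t) = h0, Lambda_0(t) = h0 * t, so
    log f = delta * (log h0 + lin_pred) - h0 * t_obs * exp(lin_pred).
    """
    return delta * (math.log(h0) + lin_pred) - h0 * t_obs * math.exp(lin_pred)


# A censored subject contributes only the survival term.
print(log_survival_likelihood(t_obs=2.0, delta=0, lin_pred=0.0))
# An observed event adds the log-hazard at the event time.
print(log_survival_likelihood(t_obs=2.0, delta=1, lin_pred=0.5))
```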

The multivariate longitudinal data and the time‐to‐event data are assumed to be conditionally independent given the random effects $\eta_i$. This assumption reflects the idea that the shared latent scores $\xi_i$ and outcome‐specific scores $\zeta_{ij}$ adequately capture the dependence structure between the longitudinal and survival components. To account for differences in cohort sizes, a weighted log‐likelihood is adopted. The smallest cohort (cohort 1) is designated as the reference cohort, ensuring that variability in sample sizes does not disproportionately influence parameter estimation. Let $N_c=\sum_{i:c(i)=c}\sum_{j\in\mathcal{J}_c}m_{ij}$ denote the total number of longitudinal observations in cohort $c$. In particular, we choose cohort 1 such that $N_1$ is the smallest. Then the weights for subjects in different cohorts are defined as $\omega_c=N_1/N_c$.
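The cohort weights can be computed directly from the observation counts. In the sketch below the counts are hypothetical placeholders (the paper does not report the total number of longitudinal observations per cohort in this section), and `cohort_weights` is an illustrative name.

```python
def cohort_weights(n_obs):
    """omega_c = N_ref / N_c, where N_ref is the smallest observation count."""
    n_ref = min(n_obs.values())
    return {cohort: n_ref / count for cohort, count in n_obs.items()}


# Hypothetical total numbers of longitudinal observations per cohort.
n_obs = {"cohort_1": 2000, "cohort_2": 8000, "cohort_3": 4000}
print(cohort_weights(n_obs))
```

The reference cohort always receives weight 1, and every other cohort is down-weighted in proportion to how many more observations it contributes.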

The overall marginal likelihood is obtained by integrating the product of the longitudinal and survival likelihoods over the distributions of the random effects ηi. The weighted marginal log‐likelihood is

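$$\sum_{i=1}^{n}\omega_{c(i)}\log\int f(y_i\mid x_i,\Sigma_i)\,f(T_i,\Delta_i\mid h_0,z_i,\xi_i,\gamma_{1c(i)},\gamma_{2c(i)})\,f(\xi_i\mid\Lambda_1)\prod_{j\in\mathcal{J}_{c(i)}}f(\zeta_{ij}\mid\Lambda_2)\,d\eta_i. \tag{4}$$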

3. Model Estimation via Monte Carlo EM

3.1. Spline Approximation of Smooth Functions

Estimating the smooth mean and covariance functions in the MFMM model requires a flexible yet computationally efficient approach. We employ a spline approximation, which provides these properties and is particularly suited to complex functional data. Let $b(t)=\{B_1(t),\ldots,B_K(t)\}^\top$ be the vector of B‐spline basis functions, where $K$ is the number of equally‐spaced interior knots plus the order (degree + 1) of the B‐splines. The mean function $\mu_{jc}(t)$ for the $j$th outcome in cohort $c$ is modeled as $\mu_{jc}(t)=b(t)^\top\alpha_{jc}$, where $\alpha_{jc}$ is the corresponding coefficient vector.

To ensure numerical stability, we orthonormalize the B‐spline bases using the Gram matrix $G=\int b(t)b(t)^\top dt\in\mathbb{R}^{K\times K}$. Let $\tilde b(t)=G^{-1/2}b(t)$ denote the resulting orthonormalized B‐spline basis functions. The eigenfunctions of the covariance functions $\mathcal{C}(s,t)$ (shared latent trajectory) and $\mathcal{G}(s,t)$ (outcome‐specific trajectory) are approximated as $\phi_d(t)=\tilde b(t)^\top\theta_{1d}$ and $\psi_d(t)=\tilde b(t)^\top\theta_{2d}$, where $\theta_{1d}$ and $\theta_{2d}$ are coefficient vectors for the $d$th eigenfunctions of $\mathcal{C}$ and $\mathcal{G}$, respectively. The orthonormality of the eigenfunctions imposes the constraints $\theta_{1d_1}^\top\theta_{1d_2}=1\{d_1=d_2\}$ and $\theta_{2d_1}^\top\theta_{2d_2}=1\{d_1=d_2\}$.
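As a brief check of why the orthonormality constraint on the eigenfunctions reduces to a constraint on the coefficient vectors, the following worked step uses only the definitions above, under the assumption that $G$ is symmetric and positive definite (which holds when the basis functions are linearly independent):

```latex
\int \tilde b(t)\,\tilde b(t)^\top dt
  = G^{-1/2}\Big\{\int b(t)\,b(t)^\top dt\Big\}G^{-1/2}
  = G^{-1/2}\,G\,G^{-1/2}
  = I_K,
\qquad
\int_{\mathcal{T}} \phi_{d_1}(t)\,\phi_{d_2}(t)\,dt
  = \theta_{1d_1}^\top\Big\{\int \tilde b(t)\,\tilde b(t)^\top dt\Big\}\theta_{1d_2}
  = \theta_{1d_1}^\top\theta_{1d_2}.
```

The same argument applies to the outcome-specific eigenfunctions with $\theta_{2d}$ in place of $\theta_{1d}$.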

3.2. E‐Step

The spline approximation of nonparametric smooth functions enables the use of parametric estimation methods for the MFMM model. The random score vector $\eta_i$, which captures both shared and outcome‐specific variations, introduces significant dimensionality to the model. In practice, the dimension of $\eta_i$ often exceeds four, making direct maximization of the marginal log‐likelihood in (4) computationally expensive or infeasible. To address this, we employ the EM algorithm [22], which treats $\eta_i$ as latent (missing) data. The EM algorithm iteratively alternates between two steps until convergence: (1) computing the conditional expectation of the complete‐data log‐likelihood given the observed data and the current parameter estimates (E‐step), and (2) maximizing the expected log‐likelihood to update the parameter estimates (M‐step).

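Let the observed data for the $i$th subject be $\mathbb{Y}_i=\{y_i,z_i,T_i,\Delta_i\}$, where $y_i$ represents the collection of longitudinal data. Let $\alpha=\{\alpha_{jc}:j\in\mathcal{J}_c,\,1\le c\le n_c\}$, $\beta=(\beta_1,\ldots,\beta_J)^\top$, $\sigma^2=(\sigma_1^2,\ldots,\sigma_J^2)^\top$, $\Theta_1=[\theta_{11},\ldots,\theta_{1D_1}]$ and $\Theta_2=[\theta_{21},\ldots,\theta_{2D_2}]$. Let $\gamma=\{\gamma_{1c},\gamma_{2c}:1\le c\le n_c\}$. Denote the complete collection of parameters by $\Omega=\{h_0,\alpha,\beta,\sigma^2,\gamma,\Lambda_1,\Lambda_2,\Theta_1,\Theta_2\}$. The E‐step calculates the expected value of the weighted conditional log‐likelihood given the observed data and current parameter estimates, $\hat\Omega$. This is expressed as follows.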

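$$Q(\Omega)=\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(y_i\mid x_i,\Sigma_i)\}+\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(T_i,\Delta_i\mid h_0,z_i,\xi_i,\gamma)\}+\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(\xi_i\mid\Lambda_1)\}+\sum_{i=1}^{n}\omega_{c(i)}\sum_{j\in\mathcal{J}_{c(i)}}\mathbb{E}_i\{\log f(\zeta_{ij}\mid\Lambda_2)\}, \tag{5}$$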

where $\mathbb{E}_i(\cdot)$ is the expectation with respect to the conditional distribution of $\eta_i$ given the observed data $\mathbb{Y}_i$ and current parameter estimates $\hat\Omega$.

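Let $g(\cdot)$ be any smooth function of $\eta_i$. The conditional expectation $\mathbb{E}_i\{g(\eta_i)\}$ is given by:

$$\mathbb{E}_i\{g(\eta_i)\}=\frac{\int g(\eta_i)\,f(T_i,\Delta_i\mid z_i,\eta_i,\hat\Omega)\,f(\eta_i\mid y_i,\hat\Omega)\,d\eta_i}{\int f(T_i,\Delta_i\mid z_i,\eta_i,\hat\Omega)\,f(\eta_i\mid y_i,\hat\Omega)\,d\eta_i},$$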

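where $f(T_i,\Delta_i\mid z_i,\eta_i,\hat\Omega)$ corresponds to the survival likelihood defined in (3), and

$$f(\eta_i\mid y_i,\hat\Omega)\propto f(y_i\mid x_i,\hat\Omega)\,f(\xi_i\mid\hat\Lambda_1)\prod_{j\in\mathcal{J}_{c(i)}}f(\zeta_{ij}\mid\hat\Lambda_2).$$

The conditional distribution $f(\eta_i\mid y_i,\hat\Omega)$ is multivariate normal (see Appendix A).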

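To efficiently compute $\mathbb{E}_i\{g(\eta_i)\}$, we use Monte Carlo integration. Specifically, we draw $R$ samples $\eta_i^{(r)}$ from the multivariate normal distribution $f(\eta_i\mid y_i,\hat\Omega)$ and compute the approximation:

$$\mathbb{E}_i\{g(\eta_i)\}\approx\frac{\sum_{r=1}^{R}g(\eta_i^{(r)})\,f(T_i,\Delta_i\mid z_i,\eta_i^{(r)},\hat\Omega)}{\sum_{r=1}^{R}f(T_i,\Delta_i\mid z_i,\eta_i^{(r)},\hat\Omega)}.$$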

The Monte Carlo approximation enables efficient implementation of the E‐step, ensuring computational feasibility even in high‐dimensional parameter spaces.
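The Monte Carlo E-step is a self-normalized weighted average: draws from the normal distribution of the scores given the longitudinal data are reweighted by the survival likelihood. The sketch below illustrates this for a single scalar shared score. The normal mean and standard deviation, the constant baseline hazard, and all coefficient values are hypothetical, and the helper names are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def survival_likelihood(t_obs, delta, lin_pred, h0):
    """Likelihood (3) under an assumed constant baseline hazard h0."""
    risk = math.exp(lin_pred)
    return (h0 * risk) ** delta * math.exp(-h0 * t_obs * risk)


def mc_expectation(g, draws, t_obs, delta, z_effect, gamma2, h0):
    """Self-normalized Monte Carlo estimate of E_i{g(xi)}.

    `draws` are samples of the score from f(xi | y_i); each is weighted
    by the survival likelihood evaluated at that draw.
    """
    weights = [survival_likelihood(t_obs, delta, z_effect + gamma2 * xi, h0)
               for xi in draws]
    total = sum(weights)
    return sum(w * g(xi) for w, xi in zip(weights, draws)) / total


rng = random.Random(7)
# Hypothetical normal distribution of the score given longitudinal data.
draws = [rng.gauss(0.3, 0.8) for _ in range(2000)]
post_mean = mc_expectation(lambda x: x, draws, t_obs=1.0, delta=1,
                           z_effect=0.2, gamma2=0.9, h0=0.1)
post_second = mc_expectation(lambda x: x * x, draws, t_obs=1.0, delta=1,
                             z_effect=0.2, gamma2=0.9, h0=0.1)
print(post_mean, post_second)
```

When the survival coefficient on the score is zero, all weights are equal and the estimate reduces to the plain sample mean of the draws, which is a useful sanity check.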

3.3. M‐Step

In the M‐step, updated estimates of $\Omega$ are obtained by maximizing the function $Q(\Omega)$ in (5). Specifically, estimates of the longitudinal parameters $\{\alpha,\beta,\sigma^2,\Theta_1,\Theta_2\}$, the survival parameters $\{h_0,\gamma\}$, and the eigenvalue parameters $\{\Lambda_1,\Lambda_2\}$ are updated iteratively. In the following, we provide details on the estimation process for each set of parameters.

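First, the longitudinal parameters $\{\alpha,\beta,\sigma^2,\Theta_1,\Theta_2\}$ are estimated iteratively by minimizing $-\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(y_i\mid x_i,\Sigma_i)\}$, expressed as:

$$\sum_{i=1}^{n}\omega_{c(i)}\sum_{j\in\mathcal{J}_{c(i)}}\Big\{\frac{m_{ij}}{2}\log(2\pi\sigma_j^2)+\frac{1}{2\sigma_j^2}\,\mathbb{E}_i\big\|y_{ij}-B_{ij}\alpha_{jc(i)}-(\beta_j\tilde B_{ij}\Theta_1\xi_i+\beta_j\tilde B_{ij}\Theta_2\zeta_{ij})\big\|^2\Big\}, \tag{6}$$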

where $B_{ij}=[b(t_{ij1}),\ldots,b(t_{ijm_{ij}})]^\top$ is the matrix of B‐spline bases evaluated at time points $t_{ij1},\ldots,t_{ijm_{ij}}$, and $\tilde B_{ij}=B_{ij}G^{-1/2}$ is the orthonormalized basis obtained by multiplying $B_{ij}$ by the inverse square root of the Gram matrix $G$. The outcome‐specific scaling parameter $\beta_j$ is obtained through weighted least squares of linear regression, as detailed in Appendix C. To enforce smoothness in the estimates of the mean functions and eigenfunctions, quadratic penalties are added to (6). Compared to regression spline estimation [14], penalized spline estimation stabilizes iterative updates, reduces the number of iterations required for EM convergence, and produces smoother and more interpretable estimates. Empirical and theoretical studies of penalized splines show that the number of knots does not matter as long as a relatively large number of knots are used [23, 24]. In this study, seven spline bases constructed at equally spaced knots are used for both simulations and real data. The orthonormality of eigenfunctions is also ensured by post‐processing the updates.

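Second, the baseline hazard function $h_0(t)$ in the Cox regression is estimated by minimizing $-\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(T_i,\Delta_i\mid h_0,z_i,\xi_i,\gamma_{1c(i)},\gamma_{2c(i)})\}$, expressed as:

$$\sum_{i=1}^{n}\omega_{c(i)}\Big\{-\Delta_i\big\{\log h_0(T_i)+z_i^\top\gamma_{1c(i)}+\mathbb{E}_i(\xi_i^\top\gamma_{2c(i)})\big\}+\Lambda_0(T_i)\,\mathbb{E}_i\{\exp(z_i^\top\gamma_{1c(i)}+\xi_i^\top\gamma_{2c(i)})\}\Big\}.$$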

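The baseline hazard $h_0(t)$ is then estimated using a weighted ratio of observed events to expected contributions at each time $t$:

$$\hat h_0(t)=\frac{\sum_{i=1}^{n}\omega_{c(i)}\Delta_i\,1\{T_i=t\}}{\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\exp(z_i^\top\hat\gamma_{1c(i)}+\xi_i^\top\hat\gamma_{2c(i)})\}\,1\{T_i\ge t\}}.$$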
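The weighted baseline hazard estimator has the same structure as a Breslow-type estimator: weighted events at time t divided by the weighted expected risk among subjects still under observation. A minimal sketch follows, in which `expected_risk[i]` stands for the E-step quantity (the conditional expectation of the exponentiated linear predictor) and all inputs are hypothetical.

```python
def baseline_hazard(t, times, deltas, weights, expected_risk):
    """Weighted ratio of events at time t to expected risk among T_i >= t."""
    events = sum(w * d for w, d, ti in zip(weights, deltas, times) if ti == t)
    at_risk = sum(w * r for w, r, ti in zip(weights, expected_risk, times)
                  if ti >= t)
    return events / at_risk


# Hypothetical data: observed times, event indicators, cohort weights,
# and E-step expectations of exp(linear predictor).
times = [1.0, 2.0, 3.0, 3.0]
deltas = [1, 0, 1, 1]
weights = [1.0, 0.5, 0.5, 1.0]
expected_risk = [1.2, 0.8, 1.0, 1.5]

for t in sorted({ti for ti, d in zip(times, deltas) if d == 1}):
    print(t, baseline_hazard(t, times, deltas, weights, expected_risk))
```

The estimate is nonzero only at observed event times, so in practice it is evaluated on the set of distinct event times, as in the loop above.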

The parameter vector $\gamma_c=(\gamma_{1c}^\top,\gamma_{2c}^\top)^\top$, representing the effects of baseline covariates and latent trajectories, is updated iteratively using the Newton‐Raphson algorithm. The $\ell$th iteration is given by $\hat\gamma_c^{(\ell)}=\hat\gamma_c^{(\ell-1)}+I_{\hat\gamma_c^{(\ell-1)}}^{-1}S_{\hat\gamma_c^{(\ell-1)}}$, where $S_{\hat\gamma_c^{(\ell-1)}}$ and $I_{\hat\gamma_c^{(\ell-1)}}$ are the score vector and the observed information matrix, respectively. By differentiating the survival likelihood (3) with respect to $\gamma_{c(i)}$, the score for subject $i$ is:

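$$s_i(\gamma_{c(i)})=\Delta_i\,(z_i^\top,\mathbb{E}_i\xi_i^\top)^\top-\sum_{v=1}^{V}h_0(t_v)\,\mathbb{E}_i\big\{(z_i^\top,\xi_i^\top)^\top\exp(z_i^\top\gamma_{1c(i)}+\xi_i^\top\gamma_{2c(i)})\big\}\,1\{T_i\ge t_v\},$$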

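where $t_v$, $v=1,\ldots,V$, are the distinct observed event times across all cohorts. Writing $\mathcal{I}_c=\{i:c(i)=c\}$ for the set of subjects in cohort $c$, the cohort‐level score $S_{\gamma_c}$ and the information matrix $I_{\gamma_c}$ are calculated as $S_{\gamma_c}=\sum_{i\in\mathcal{I}_c}s_i(\gamma_c)$ and $I_{\gamma_c}=\sum_{i\in\mathcal{I}_c}s_i(\gamma_c)s_i^\top(\gamma_c)-S_{\gamma_c}S_{\gamma_c}^\top/|\mathcal{I}_c|$, where $|\mathcal{I}_c|$ is the number of subjects in cohort $c$.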

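Third, the diagonal matrices of eigenvalues $\Lambda_1$ and $\Lambda_2$ are estimated by minimizing the weighted negative log‐likelihoods $-\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i\{\log f(\xi_i\mid\Lambda_1)\}$ and $-\sum_{i=1}^{n}\omega_{c(i)}\sum_{j\in\mathcal{J}_{c(i)}}\mathbb{E}_i\{\log f(\zeta_{ij}\mid\Lambda_2)\}$, respectively. For $\Lambda_1$, this minimization simplifies to minimizing $\sum_{i=1}^{n}\omega_{c(i)}\{\log|\Lambda_1|+\mathbb{E}_i(\xi_i^\top\Lambda_1^{-1}\xi_i)\}$. The estimator for the eigenvalues is $\hat\lambda_{1d}=\frac{1}{\sum_{i=1}^{n}\omega_{c(i)}}\sum_{i=1}^{n}\omega_{c(i)}\mathbb{E}_i(\xi_{id}^2)$, where $\mathbb{E}_i(\xi_{id}^2)$ represents the expected squared random scores for the shared latent trajectory. Similarly, for $\Lambda_2$, the estimator is $\hat\lambda_{2d}=\frac{1}{\sum_{i=1}^{n}\omega_{c(i)}}\sum_{i=1}^{n}\omega_{c(i)}\frac{1}{J_{c(i)}}\sum_{j\in\mathcal{J}_{c(i)}}\mathbb{E}_i(\zeta_{ijd}^2)$, where $\mathbb{E}_i(\zeta_{ijd}^2)$ represents the expected squared random scores for the outcome‐specific deviations and $J_{c(i)}$ is the number of outcomes observed in cohort $c(i)$. The inclusion of cohort‐specific weights $\omega_{c(i)}$ ensures that the estimates account for differences in cohort sizes, so that the largest cohort does not dominate the eigenvalue estimates.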
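Both eigenvalue updates are weighted averages of E-step second moments. The sketch below implements the two estimators as written, for a single component d; the function names and the numeric inputs are illustrative assumptions.

```python
def update_lambda1(weights, e_xi_sq):
    """Weighted mean of E_i(xi_id^2) over subjects, for one component d."""
    return sum(w * e for w, e in zip(weights, e_xi_sq)) / sum(weights)


def update_lambda2(weights, e_zeta_sq):
    """Weighted mean over subjects of the per-subject average of
    E_i(zeta_ijd^2) across that subject's observed outcomes.

    e_zeta_sq[i] lists one value per outcome observed for subject i,
    so its length equals the number of outcomes in the subject's cohort.
    """
    per_subject = [sum(vals) / len(vals) for vals in e_zeta_sq]
    return sum(w * m for w, m in zip(weights, per_subject)) / sum(weights)


# Hypothetical E-step outputs for three subjects from two cohorts.
weights = [1.0, 1.0, 0.25]
e_xi_sq = [0.9, 1.4, 2.0]
e_zeta_sq = [[0.3, 0.5, 0.4], [0.6, 0.2, 0.4], [0.8, 1.0]]
print(update_lambda1(weights, e_xi_sq))
print(update_lambda2(weights, e_zeta_sq))
```

Averaging within subject before averaging across subjects means a subject from a cohort with many outcomes does not count more than a subject from a cohort with few.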

4. Model Selection

Penalized splines [20] are used to estimate the mean functions and eigenfunctions in the longitudinal data model, which require the selection of appropriate smoothing parameters. These parameters are determined using generalized cross‐validation (GCV) at each iteration of the EM algorithm. By reformulating function estimation as a weighted least squares problem in nonparametric regression, we enable efficient implementation using the gam function from the R package mgcv [21, 25] (see Appendix C for details). Although the smoothing parameters may vary between iterations, they stabilize rapidly as the algorithm converges, ensuring consistent estimation of smooth functions and improving computational efficiency.

The numbers of principal components, $D_1$ for the shared latent process and $D_2$ for the outcome‐specific latent process, are critical tuning parameters in the model. To select these parameters, we use the Bayesian information criterion (BIC), defined as $\mathrm{BIC}=-2\ell(\hat\Omega)+\sum_{c=1}^{n_c}\omega_c\log(N_c)\cdot\mathrm{df}_c$, where $\ell(\hat\Omega)$ is the weighted marginal log‐likelihood of the data, approximated as:

$$\sum_{i=1}^{n}\omega_{c(i)}\Big[\log\{f(y_i\mid\hat\Omega)\}+\log\Big\{R^{-1}\sum_{r=1}^{R}f\big(T_i,\Delta_i\mid\hat h_0,z_i,\xi_i^{(r)},\hat\gamma_{c(i)}\big)\Big\}\Big].$$

Here, $f(y_i\mid\hat\Omega)$ is the marginal density, which follows a normal distribution (see Appendix A). The samples $\xi_i^{(r)}$ are drawn from $f(\eta_i\mid y_i,\hat\Omega)$; see Appendix B for the derivation. To ensure an accurate approximation of the marginal log‐likelihood, the number of samples $R$ must be much larger than in the EM algorithm.

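The degrees of freedom (DOF) for cohort $c$, denoted as $\mathrm{df}_c$, are calculated as:

$$\mathrm{df}_c=\sum_{j\in\mathcal{J}_c}\mathrm{df}_{\mu_{jc}}+\frac{1}{n_c}\Big\{\sum_{d=1}^{D_1}\mathrm{df}_{\phi_d}-\frac{D_1(D_1-1)}{2}\Big\}+\frac{1}{n_c}\Big\{\sum_{d=1}^{D_2}\mathrm{df}_{\psi_d}-\frac{D_2(D_2-1)}{2}\Big\}+\frac{2J-1}{n_c}+(p+D_1).$$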

Each term in this formula corresponds to a specific model component:

  • $\sum_{j\in\mathcal{J}_c}\mathrm{df}_{\mu_{jc}}$: DOF for estimating the mean functions in cohort $c$.

  • $\sum_{d=1}^{D_1}\mathrm{df}_{\phi_d}-D_1(D_1-1)/2$: DOF for the shared covariance function's eigenpairs.

  • $\sum_{d=1}^{D_2}\mathrm{df}_{\psi_d}-D_2(D_2-1)/2$: DOF for the outcome‐specific covariance function's eigenpairs.

  • $(2J-1)/n_c$: DOF for estimating the $J$ error variances and the $J-1$ free entries of the scaling coefficient vector $\beta$.

  • $p$: Number of baseline covariates in the Cox regression.

  • $D_1$: Number of cohort‐specific coefficients for shared latent scores in the Cox regression.

Cohort c's DOF has a division factor nc to appropriately reflect cohort‐level contributions when parameters are shared across cohorts. A two‐dimensional grid is used to identify the optimal values of D1 and D2, balancing model complexity with goodness‐of‐fit.
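As a worked illustration of the bookkeeping above, the following sketch computes the cohort‐level DOF and the BIC, and then picks the pair of component numbers with the smallest BIC from a grid. It assumes the effective degrees of freedom of each smooth and the weighted marginal log‐likelihood are already available from the fitted model. The function names are hypothetical, the numbers are invented, and `n_obs` stands in for $N_c$, whose definition is not given in this section.

```python
import math


def cohort_dof(df_mu, df_phi, df_psi, n_outcomes, n_cohorts, p):
    """DOF for one cohort; shared components are divided by the number of cohorts."""
    d1, d2 = len(df_phi), len(df_psi)
    shared = (sum(df_phi) - d1 * (d1 - 1) / 2) / n_cohorts
    specific = (sum(df_psi) - d2 * (d2 - 1) / 2) / n_cohorts
    variances_and_scaling = (2 * n_outcomes - 1) / n_cohorts
    return sum(df_mu) + shared + specific + variances_and_scaling + (p + d1)


def bic(loglik, dofs, omega, n_obs):
    """BIC = -2 * loglik + sum_c omega_c * log(N_c) * df_c."""
    penalty = sum(w * math.log(n) * df for w, n, df in zip(omega, n_obs, dofs))
    return -2.0 * loglik + penalty


def select_components(bic_by_pair):
    """Return the (D1, D2) pair with the smallest BIC."""
    return min(bic_by_pair, key=bic_by_pair.get)


# Hypothetical effective DOF for one cohort with two outcomes, D1 = 2, D2 = 1.
dof = cohort_dof(df_mu=[4.0, 5.0], df_phi=[6.0, 6.0], df_psi=[5.0],
                 n_outcomes=8, n_cohorts=3, p=4)
score = bic(loglik=-100.0, dofs=[10.0], omega=[1.0], n_obs=[100])
best = select_components({(1, 1): 120.0, (2, 1): 101.5, (2, 2): 103.0})
print(dof, score, best)
```

In an actual grid search, each entry of the dictionary would come from refitting the model with the corresponding $(D_1, D_2)$ and re‐evaluating the marginal log‐likelihood.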

5. Data Application

We apply the proposed multi‐cohort functional joint model (FJM) to data from three AD cohorts, integrating longitudinal outcomes via the extended MFMM and linking them to survival outcomes through a Cox regression model. This application jointly characterizes the progression of multivariate longitudinal outcomes and their association with time to the diagnosis of AD dementia. By leveraging cohort‐specific variations while incorporating shared latent processes, the multi‐cohort FJM allows for a unified yet flexible representation of disease trajectories across diverse datasets. We include the following baseline covariates [14]: age, sex, years of education, and the number of apolipoprotein E ϵ 4 alleles (APOE4), in the hazard model, reflecting known demographic and genetic risk factors for AD progression (see Table S1 in Appendix D for details). To assess performance, we compare the multi‐cohort FJM with the single‐cohort FJM‐MFMM [14] and the parametric multivariate joint model (MJM) [26], each applied separately to the cohorts.

Using the BIC criterion described in Section 4, we selected four principal components for the shared covariance structure ($D_1=4$) and three for the outcome‐specific covariance structure ($D_2=3$). The estimated scaling coefficients $\beta$, shown in Table 1, have signs that align with expectations: lower values of MMSE, WMSLM, RAVLT, and SDMT and higher values of CDR‐SB, ADAS, FAQ, and TRAILA indicate AD progression. Figure 2 illustrates the first two eigenfunctions estimated using the multi‐cohort FJM and separate FJM models. Consistent trends in eigenfunctions across models support the utility of the multi‐cohort FJM in capturing subject‐specific random trajectories. For example, the first eigenfunction of the shared latent process ($\phi_1(t)$) is negative throughout, and its loading coefficients in $\beta$ (Table 1) suggest that positive scores for the shared trajectory correspond to worsening outcomes (lower MMSE, WMSLM, RAVLT, and SDMT values, and higher CDR‐SB, ADAS, FAQ, and TRAILA values). This pattern indicates accelerated AD progression as time progresses. A similar interpretation holds for the first outcome‐specific eigenfunction ($\psi_1(t)$), highlighting its role in capturing deviations specific to individual outcomes.

FIGURE 2.


Top two estimated eigenfunctions with associated eigenvalues for the shared and outcome‐specific latent processes, comparing multi‐cohort FJM (solid lines) and separate FJM (dashed lines). These eigenfunctions characterize the primary patterns of variability in the longitudinal outcomes across cohorts.

Figure 3 compares the estimated mean trajectories of longitudinal outcomes across three models: multi‐cohort FJM, single‐cohort FJM and MJM. Columns one and three display results for the FJM and multi‐cohort FJM, while columns two and four depict results for MJM. Unlike MJM, which assumes linear trends, both FJM and multi‐cohort FJM effectively capture nonlinear trajectories. Three important observations emerge. First, while FJM and multi‐cohort FJM produce similar mean functions, multi‐cohort FJM captures steeper declines in WMSLM and RAVLT and sharper increases in TRAILA, reflecting more rapid deterioration in some AD‐related measures. Second, the flexibility of penalized splines in the FJM models enables them to capture nonlinear trends in some outcomes (e.g., WMSLM, RAVLT, and TRAILA), which indicate an acceleration in deterioration in AD‐related measures. In contrast, MJM is linear and does not capture such dynamics, which exposes its limitations in modeling complex trends in AD studies. Third, clear cohort‐specific differences in progression are evident. For example, patients in the ROSMAP cohort demonstrate faster progression, as reflected in lower MMSE and WMSLM scores, compared to ADNI and NACC, where ADNI patients show the slowest progression. These differences highlight the importance of multi‐cohort modeling in uncovering population‐specific patterns.

Table 2 presents the estimated Cox regression coefficients of the multi‐cohort FJM. APOE4 consistently emerges as a significant risk factor in all cohorts, while sex does not show a significant association. Age is significant in ADNI and NACC but not in ROSMAP, and years of education have a significant protective effect in NACC and ROSMAP but not in ADNI, potentially reflecting differences in population characteristics. The shared progression scores ($\xi_{i1},\xi_{i2},\xi_{i3},\xi_{i4}$) are critical predictors of AD risk. Most scores show significant associations with time to AD diagnosis, emphasizing the utility of the shared latent trajectory for capturing meaningful variability in disease progression. For example, positive values of $\xi_{i1}$ are associated with higher AD risk, consistent with the interpretation of the shared trajectory as reflecting overall deterioration. These findings underscore the importance of incorporating longitudinal outcomes into survival models to better understand the progression and prediction of AD.

TABLE 2.

Estimated Cox regression coefficients (with standard errors) from the multi‐cohort functional joint model, evaluating the association between baseline covariates, shared latent scores, and time to AD diagnosis.

| Covariate | ADNI: Estimate (s.e.) | ADNI: p | NACC: Estimate (s.e.) | NACC: p | ROSMAP: Estimate (s.e.) | ROSMAP: p |
|---|---|---|---|---|---|---|
| Age | 0.02 (0.01)* | 0.01 | 0.01 (0.00)* | 0.00 | 0.00 (0.01) | 0.96 |
| Sex (female) | 0.10 (0.17) | 0.55 | 0.09 (0.08) | 0.21 | −0.31 (0.24) | 0.20 |
| Education | 0.05 (0.03) | 0.08 | 0.09 (0.01)* | 0.00 | 0.11 (0.03)* | 0.00 |
| APOE4 | 0.28 (0.13)* | 0.03 | 0.37 (0.06)* | 0.00 | 0.40 (0.19)* | 0.03 |
| Shared score ($\xi_{i1}$) | 0.63 (0.04)* | 0.00 | 0.59 (0.02)* | 0.00 | 0.81 (0.06)* | 0.00 |
| Shared score ($\xi_{i2}$) | 0.68 (0.22)* | 0.00 | 0.48 (0.07)* | 0.00 | 0.10 (0.30) | 0.74 |
| Shared score ($\xi_{i3}$) | −1.23 (0.26)* | 0.00 | −1.71 (0.12)* | 0.00 | −0.66 (0.39) | 0.09 |
| Shared score ($\xi_{i4}$) | 2.86 (0.68)* | 0.00 | 3.90 (0.31)* | 0.00 | −3.88 (0.71)* | 0.00 |

Note: Covariates include baseline age, sex, years of education, and APOE4 status. Shared latent scores reflect subject‐specific progression in longitudinal outcomes. An asterisk indicates significance at the 0.05 level.

In multi‐cohort FJM, distinct mean functions for longitudinal outcomes and cohort‐specific Cox regression coefficients are employed to account for inter‐cohort variability. To investigate the impact of simplifying assumptions, we evaluate alternative models with either shared mean functions or shared Cox coefficients across cohorts. Using AIC/BIC for model selection, we assess the trade‐offs between model flexibility and parsimony. When both mean functions and Cox coefficients are shared across cohorts, the degrees of freedom dfc for cohort c are defined as:

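$$df_c=\sum_{j\in\mathcal{J}_c}\frac{1}{n_{y_j}}df_{\mu_{jc}}+\frac{1}{n_c}\left\{\sum_{d=1}^{D_1}df_{\phi_d}-\frac{D_1(D_1-1)}{2}\right\}+\frac{1}{n_c}\left\{\sum_{d=1}^{D_2}df_{\psi_d}-\frac{D_2(D_2-1)}{2}\right\}+\frac{2J-1}{n_c}+\frac{p+D_1}{n_c},$$

where $n_{y_j}$ represents the number of cohorts in which the $j$th outcome is collected.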

For each model, we apply the multi‐cohort FJM with different values of the tuning parameters $D_0$ and $D_1$ ($D_0, D_1\in\{1,2,3,4,5\}$) to the three AD cohorts; see Appendix D for details. The results show that cohort‐specific Cox coefficients are consistently required, as determined by both AIC and BIC. However, the two criteria diverge regarding mean functions: BIC favors shared mean functions across cohorts, reflecting its preference for simpler models, while AIC supports cohort‐specific mean functions, prioritizing flexibility. This divergence likely arises from the limited overlap of outcomes across cohorts, with only three of the eight outcomes shared by at least two cohorts. Allowing cohort‐specific mean functions in such a scenario does not significantly increase model flexibility. Results for the simpler model selected by BIC are provided in Appendix D and align with the interpretations of the full model. These findings underscore the critical role of cohort‐specific Cox coefficients in capturing inter‐cohort heterogeneity, while the choice to use cohort‐specific mean functions depends on the extent of outcome overlap across cohorts.

6. Simulation Study

6.1. Simulation Settings

We evaluate the performance of the multi‐cohort FJM by generating data similar to the AD cohorts. Longitudinal data for three cohorts are generated according to Model (1) with $J=8$ outcomes, following the structure of the AD cohorts. Each cohort includes the same set of outcomes as in the real data, with cohort‐specific mean functions, scaling parameters, and error variances derived from the estimates obtained in the application study. To simplify the analysis, the multi‐cohort FJM is fitted with two principal components for both the shared covariance function and the outcome‐specific covariance functions. The eigenfunctions used in the simulations are the estimated eigenfunctions from the three AD cohorts, corresponding to two shared principal components and two outcome‐specific principal components.

The shared random scores $\xi_{i1}$ and $\xi_{i2}$ are generated from normal distributions $N(0,\lambda_{1d})$, where $\lambda_{11}=12.02$ and $\lambda_{12}=0.38$. Similarly, the outcome‐specific random scores $\zeta_{ijd}$ are generated from $N(0,\lambda_{2d})$, with $\lambda_{21}=3.49$ and $\lambda_{22}=0.16$. Measurement errors, denoted as $\epsilon_{ijk}$, are sampled from a normal distribution $N(0,\sigma_j^{2})$. The observation times $t_{ijk}=t_{ik}$ correspond to 11 fixed time points mapped to the interval $[0,1]$ and are subject to truncation based on censoring and event times, as described below.

The event data are generated using the Cox regression model in (2), incorporating shared random scores and four baseline covariates from the data application. The linear predictor of the hazard is specified as $z_i^{T}\gamma_{1c(i)}+\xi_{i1}\gamma_{2c(i)1}+\xi_{i2}\gamma_{2c(i)2}$, where the Cox coefficients are based on estimates from the data application. Specifically, $\gamma_{11}=(0.02,0.14,0.05,0.30)^{T}$, $\gamma_{12}=(0.02,0.13,0.09,0.35)^{T}$, and $\gamma_{13}=(0.02,0.19,0.07,0.40)^{T}$, with coefficients for the shared scores $\gamma_{211}=0.53$, $\gamma_{212}=1.04$, $\gamma_{221}=0.45$, $\gamma_{222}=1.11$, $\gamma_{231}=0.59$, and $\gamma_{232}=0.09$. A Weibull baseline hazard $h_0(t)=\rho t^{\rho-1}$ with $\rho=20$ is used, and event times $S_i$ are generated via the inverse probability method [27]. Censoring times $C_i$ are sampled independently from a Beta distribution, with $\alpha_{0c(i)}$ and $\beta_{0c(i)}$ chosen to approximate censoring rates of 66%, 66%, and 59%, matching the three AD cohorts. For each subject, only measurements at $t_{ik}\le T_i=\min(S_i,C_i)$ are retained. The cohort sample sizes are 715, 3707, and 522 for cohorts 1, 2, and 3, respectively, with an average number of observations per subject of 7.8, 6.7, and 7.3. These settings closely mimic the AD study, ensuring realistic simulation conditions. We simulate data 100 times.
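The event‐time generation described above can be sketched in a few lines. With baseline hazard $h_0(t)=\rho t^{\rho-1}$ the cumulative baseline hazard is $t^{\rho}$, so the survival function given linear predictor $\eta$ is $\exp\{-t^{\rho}e^{\eta}\}$, and inverting it at a uniform draw $U$ gives $S=\{-\log(U)/e^{\eta}\}^{1/\rho}$. The code below is a minimal illustration under these assumptions. The Beta parameters, the linear predictor value, and the seed are hypothetical, since this section does not report the censoring parameters.

```python
import math
import random


def event_time_from_uniform(u, linear_predictor, rho):
    """Invert S(t) = exp(-t**rho * exp(lp)), the survival function implied by
    the Weibull baseline hazard h0(t) = rho * t**(rho - 1)."""
    return (-math.log(u) / math.exp(linear_predictor)) ** (1.0 / rho)


def simulate_subject(linear_predictor, rho, alpha, beta, visit_times, rng):
    """Draw one subject's observed time, event indicator, and retained visits."""
    u = 1.0 - rng.random()                    # uniform on (0, 1]
    s = event_time_from_uniform(u, linear_predictor, rho)
    c = rng.betavariate(alpha, beta)          # independent censoring time in [0, 1]
    t_obs = min(s, c)
    delta = int(s <= c)                       # 1 = event observed, 0 = censored
    kept = [t for t in visit_times if t <= t_obs]
    return t_obs, delta, kept


rng = random.Random(2026)                     # hypothetical seed
visit_times = [k / 10 for k in range(11)]     # 11 fixed time points on [0, 1]
subjects = [
    simulate_subject(linear_predictor=0.3, rho=20.0, alpha=2.0, beta=1.0,
                     visit_times=visit_times, rng=rng)
    for _ in range(5)
]
for t_obs, delta, kept in subjects:
    print(round(t_obs, 3), delta, len(kept))
```

In a full simulation the linear predictor would be computed per subject from the baseline covariates and the simulated shared scores, and the Beta parameters would be tuned per cohort to reach the target censoring rates.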

6.2. Simulation Results

To assess model performance, we first fit the multi‐cohort FJM using the true number of shared and outcome‐specific principal components. Figure 4 shows the estimated mean functions across 100 replications. Gray lines represent estimates from individual replicates, dashed blue lines show the average of these estimates, and solid red lines represent the true mean functions. The close alignment of the dashed blue and solid red lines indicates accurate estimation. The estimated eigenfunctions and a comprehensive assessment of the estimated model parameters are presented in Appendix E. These parameters include the eigenvalues $\Lambda$, the outcome‐specific scaling parameter $\beta$, the Cox regression coefficients $\gamma$, and the white noise variance $\sigma_j^{2}$. The estimated eigenfunctions demonstrate strong agreement with the ground truth, and the estimated model parameters align closely with their true values, confirming the good performance of the proposed multi‐cohort FJM.

FIGURE 4.


Estimated mean functions from 100 simulation replications. Gray lines: individual estimates; dashed blue lines: average of these estimates; solid red lines: true mean functions.

Finally, we evaluate model selection using AIC and BIC to determine the number of eigenfunctions in the covariance structures. Both criteria select the correct numbers with high accuracy, with correct selection rates of 0.98 for AIC and 1.00 for BIC. These results highlight the robustness of BIC in correctly identifying model complexity, making it the preferred criterion for model selection.

In summary, these simulation results substantiate the reliability and effectiveness of multi‐cohort FJM and demonstrate its ability to accurately estimate model parameters and select the appropriate number of eigenfunctions.

6.3. Comparison With Regression Spline Estimation

To evaluate the computational efficiency of our EM implementation with penalized splines, we compare it with an EM implementation using regression splines [14]. Under the same simulation design as above, we generate 50 independent replications.

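Estimation accuracy for the mean functions and covariance functions in the longitudinal sub‐model was quantified using the relative integrated squared error (RISE). Specifically, letting $\mu_{jc}$ denote the true mean function for outcome $j$ in cohort $c$ and $\hat{\mu}_{jc}$ its estimate, and letting $\mathcal{C}$ and $\mathcal{G}$ denote the true shared and outcome‐specific covariance functions with estimates $\hat{\mathcal{C}}$ and $\hat{\mathcal{G}}$, we define

$$\mathrm{RISE}_{\mu}=\frac{\sum_{c=1}^{n_c}\sum_{j\in\mathcal{J}_c}\int_0^1\{\mu_{jc}(t)-\hat{\mu}_{jc}(t)\}^{2}\,dt}{\sum_{c=1}^{n_c}\sum_{j\in\mathcal{J}_c}\int_0^1\{\mu_{jc}(t)\}^{2}\,dt},$$

$$\mathrm{RISE}_{\mathcal{C}}=\frac{\int_0^1\int_0^1\{\mathcal{C}(s,t)-\hat{\mathcal{C}}(s,t)\}^{2}\,ds\,dt}{\int_0^1\int_0^1\{\mathcal{C}(s,t)\}^{2}\,ds\,dt},$$

$$\mathrm{RISE}_{\mathcal{G}}=\frac{\int_0^1\int_0^1\{\mathcal{G}(s,t)-\hat{\mathcal{G}}(s,t)\}^{2}\,ds\,dt}{\int_0^1\int_0^1\{\mathcal{G}(s,t)\}^{2}\,ds\,dt}.$$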
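The RISE definitions above can be evaluated numerically once the true and estimated functions are tabulated on a grid over $[0,1]$. The sketch below uses the trapezoidal rule for the integrals. It assumes a common evaluation grid for all functions; the toy functions, the grid size, and the function names are hypothetical.

```python
def trapezoid(values, grid):
    """Trapezoidal rule for values tabulated on grid."""
    return sum(0.5 * (values[k] + values[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))


def rise_mean(true_funcs, est_funcs, grid):
    """RISE pooled over all (cohort, outcome) mean functions."""
    num = sum(trapezoid([(t - e) ** 2 for t, e in zip(tf, ef)], grid)
              for tf, ef in zip(true_funcs, est_funcs))
    den = sum(trapezoid([t ** 2 for t in tf], grid) for tf in true_funcs)
    return num / den


def double_integral(surface, grid):
    """Iterated trapezoidal rule for a surface tabulated on grid x grid."""
    inner = [trapezoid(row, grid) for row in surface]
    return trapezoid(inner, grid)


def rise_cov(true_cov, est_cov, grid):
    """RISE for a covariance surface."""
    sq_err = [[(t - e) ** 2 for t, e in zip(tr, er)]
              for tr, er in zip(true_cov, est_cov)]
    sq_true = [[t ** 2 for t in tr] for tr in true_cov]
    return double_integral(sq_err, grid) / double_integral(sq_true, grid)


# Toy check: two mean functions, one estimated with a 10% scaling error.
grid = [k / 100 for k in range(101)]
true_means = [[t for t in grid], [1.0 - t for t in grid]]
est_means = [[1.1 * t for t in grid], [1.0 - t for t in grid]]
print(rise_mean(true_means, est_means, grid))

# Toy check: a constant covariance surface estimated at half its value.
n = len(grid)
true_cov = [[1.0] * n for _ in range(n)]
est_cov = [[0.5] * n for _ in range(n)]
print(rise_cov(true_cov, est_cov, grid))
```

Because numerator and denominator share the same quadrature rule, the ratio is fairly insensitive to the grid resolution in these toy cases.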

Figure 5 provides boxplots of RISEs over 50 replications for the two estimation methods. The penalized spline EM algorithm yields uniformly lower RISE values for the mean functions and shared covariance 𝒞(s,t) compared to the regression‐spline EM. For outcome‐specific covariance 𝒢(s,t), the two approaches show comparable accuracy across replications.

FIGURE 5.


Boxplots of RISE of the estimated mean functions, shared covariance 𝒞(s,t), and outcome‐specific covariance 𝒢(s,t) over 50 replications, comparing the penalized‐spline EM algorithm with the regression‐spline EM algorithm.

We also compare the computation time of the two estimation methods. The median (median absolute deviation) runtime for the penalized‐spline EM is 2.14 (1.04) h, whereas the regression‐spline EM requires 3.72 (1.74) h. Boxplots of computation times are provided in Appendix E. Across identical simulation settings, the penalized‐spline EM achieves better or comparable estimation accuracy and shorter runtimes than the regression‐spline EM. These results provide direct empirical support for the efficiency claim of our proposed algorithm.

7. Discussion

We proposed a novel extension of the multivariate functional mixed model (MFMM) [14] to jointly analyze longitudinal outcomes from multiple cohorts. Our approach offers a principled solution to the challenges posed by disparate outcome collections across cohorts. By leveraging shared variation patterns extracted from the MFMM, we achieved a parsimonious, yet flexible framework for linking longitudinal and survival data. This approach enables robust characterization of disease trajectories and their associations with survival outcomes in heterogeneous cohorts. The application to the three AD cohorts underscores the utility of the model to uncover shared and cohort‐specific disease progression patterns. By identifying differences in disease trajectories across cohorts, such as faster progression in ROSMAP compared to ADNI and NACC, the model revealed insights that would be unavailable from a single‐cohort analysis. Additionally, inclusion of cohort‐specific survival model coefficients allowed us to capture inter‐cohort heterogeneity in baseline covariate effects, offering a more comprehensive understanding of AD risk factors, including the differential impact of age, education, and APOE4 status across cohorts.

For the proposed functional joint model, we developed a computationally feasible algorithm that effectively combines EM and penalized splines. The iterative nature of EM matches well with the local selection of smoothing parameters for penalized splines, overcoming the computational complexity of functional joint models. The success of our algorithm will no doubt encourage further uses of penalized splines in iterative algorithms. It should be noted that the local selection strategy of smoothing parameters is generally applicable to any iterative algorithm with nonparametric smooth functions to estimate.

Despite its strengths, the proposed model has limitations that warrant further investigation. Currently, the longitudinal and survival sub‐models are linked parametrically via the principal scores. Although this approach ensures model simplicity, it may not fully capture more complex associations between longitudinal outcomes and survival data. Future work could incorporate nonparametric linking mechanisms [16] to enhance flexibility and accommodate complex relationships between the two sub‐models.

Another limitation is the reliance on a single shared variation pattern to represent commonalities across longitudinal outcomes. Although sufficient for a moderate number of outcomes, this may become restrictive when analyzing a large set of outcomes with diverse dependencies. The recent latent functional factor model [28] allows multiple shared variation patterns and offers a promising direction to extend MFMM to handle such complexity. Incorporating multiple shared variation patterns into the functional joint model could further improve its ability to capture the rich structure of longitudinal data and enhance its applicability to more complex multi‐cohort studies.

Funding

The research of Sheng Luo was supported by the National Institute on Aging (grant numbers R01AG064803, P30AG072958, and P30AG028716). The research of Wenyi Wang and Luo Xiao was partially supported by the National Institute of Neurological Disorders and Stroke (R01NS112303).

Conflicts of Interest

The authors declare no conflicts of interest.

Supporting information

Data S1. Supporting information.

SIM-45-0-s001.pdf (813.4KB, pdf)

Acknowledgments

Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (https://adni.loni.usc.edu/). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp‐content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Wang W., Xiao L., Li R., Luo S., Alzheimer's Disease Neuroimaging Initiative, “A Functional Joint Model for Survival and Multivariate Sparse Functional Data in Multi‐Cohort Alzheimer's Disease Study,” Statistics in Medicine 45, no. 3‐5 (2026): e70442, 10.1002/sim.70442.

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

References

  • 1. Alzheimer's Association, “2024 Alzheimer's Disease Facts and Figures,” Alzheimer's & Dementia 20, no. 5 (2024): 3708–3821.
  • 2. Weiner M. W., Veitch D. P., Aisen P. S., et al., “2014 Update of the Alzheimer's Disease Neuroimaging Initiative: A Review of Papers Published Since Its Inception,” Alzheimer's & Dementia 11, no. 6 (2015): e1–e120.
  • 3. Besser L., Kukull W., Knopman D. S., et al., “Neuropsychology Work Group, Directors, and Clinical Core Leaders of the National Institute on Aging‐Funded US Alzheimer's Disease Centers Version 3 of the National Alzheimer's Coordinating Center's Uniform Data Set,” Alzheimer Disease and Associated Disorders 32, no. 4 (2018): 351–358.
  • 4. Bennett D. A., Buchman A. S., Boyle P. A., Barnes L. L., Wilson R. S., and Schneider J. A., “Religious Orders Study and Rush Memory and Aging Project,” Journal of Alzheimer's Disease 64, no. s1 (2018): S161–S189.
  • 5. Wulfsohn M. S. and Tsiatis A. A., “A Joint Model for Survival and Longitudinal Data Measured With Error,” Biometrics 53, no. 1 (1997): 330–339.
  • 6. Lin H., McCulloch C. E., and Mayne S. T., “Maximum Likelihood Estimation in the Joint Analysis of Time‐To‐Event and Multiple Longitudinal Variables,” Statistics in Medicine 21, no. 16 (2002): 2369–2382.
  • 7. Murray J. and Philipson P., “Fast Estimation for Generalised Multivariate Joint Models Using an Approximate EM Algorithm,” Computational Statistics and Data Analysis 187 (2023): 107819.
  • 8. He Y., Song X., and Kang K., “Joint Mixed Membership Modeling of Multivariate Longitudinal and Survival Data for Learning the Individualized Disease Progression,” Annals of Applied Statistics 18, no. 3 (2024): 1924–1946.
  • 9. James G., Hastie T., and Sugar C., “Principal Component Models for Sparse Functional Data,” Biometrika 87, no. 3 (2000): 587–602.
  • 10. Yao F., Müller H. G., and Wang J. L., “Functional Data Analysis for Sparse Longitudinal Data,” Journal of the American Statistical Association 100, no. 470 (2005): 577–590.
  • 11. Ramsay J. and Silverman B., Functional Data Analysis (Springer, 2005).
  • 12. Yao F., “Functional Principal Component Analysis for Longitudinal and Survival Data,” Statistica Sinica 17 (2007): 965–983.
  • 13. Yan F., Lin X., and Huang X., “Dynamic Prediction of Disease Progression for Leukemia Patients by Functional Principal Component Analysis of Longitudinal Expression Levels of an Oncogene,” Annals of Applied Statistics 11, no. 3 (2017): 1649–1670.
  • 14. Li C., Xiao L., and Luo S., “Joint Model for Survival and Multivariate Sparse Functional Data With Application to a Study of Alzheimer's Disease,” Biometrics 78, no. 2 (2022): 435–447.
  • 15. Li K. and Luo S., “Dynamic Prediction of Alzheimer's Disease Progression Using Features of Multiple Longitudinal Outcomes and Time‐To‐Event Data,” Statistics in Medicine 38, no. 24 (2019): 4804–4818.
  • 16. Zou H., Zeng D., Xiao L., and Luo S., “Bayesian Inference and Dynamic Prediction for Multivariate Longitudinal and Survival Data,” Annals of Applied Statistics 17, no. 3 (2023): 2574–2595.
  • 17. Shi H., Jiang S., Ma D., Beg M. F., and Cao J., “Dynamic Survival Prediction Using Sparse Longitudinal Images via Multi‐Dimensional Functional Principal Component Analysis,” Journal of Computational and Graphical Statistics 33 (2024): 1240–1251.
  • 18. Hong Y., Su L., Song S., and Yan F., “Dynamic Prediction of Disease Processes Based on Recurrent History and Functional Principal Component Analysis of Longitudinal Biomarkers: Application for Ovarian Epithelial Cancer,” Statistics in Medicine 40, no. 8 (2021): 2006–2023.
  • 19. Zou H., Xiao L., Zeng D., and Luo S., “Dynamic Prediction With Multivariate Longitudinal Outcomes and Longitudinal Magnetic Resonance Imaging Data,” Annals of Applied Statistics 19, no. 1 (2025): 505–528.
  • 20. Eilers P. H. C. and Marx B. D., “Flexible Smoothing With B‐Splines and Penalties,” Statistical Science 11, no. 2 (1996): 89–102.
  • 21. Wood S. N., Pya N., and Säfken B., “Smoothing Parameter and Model Selection for General Smooth Models,” Journal of the American Statistical Association 111, no. 516 (2016): 1548–1563.
  • 22. Dempster A. P., Laird N. M., and Rubin D. B., “Maximum Likelihood From Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society: Series B: Methodological 39, no. 1 (1977): 1–22.
  • 23. Ruppert D., “Selecting the Number of Knots for Penalized Splines,” Journal of Computational and Graphical Statistics 11, no. 4 (2002): 735–757.
  • 24. Xiao L., “Asymptotic Theory of Penalized Splines,” Electronic Journal of Statistics 13, no. 1 (2019): 747–794.
  • 25. Wood S. N., “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models,” Journal of the Royal Statistical Society (B) 73, no. 1 (2011): 3–36.
  • 26. Henderson R., Diggle P., and Dobson A., “Joint Modelling of Longitudinal Measurements and Event Time Data,” Biostatistics 1, no. 4 (2000): 465–480.
  • 27. Bender R., Augustin T., and Blettner M., “Generating Survival Times to Simulate Cox Proportional Hazards Models,” Statistics in Medicine 24, no. 11 (2005): 1713–1723.
  • 28. Li R. and Xiao L., “Latent Factor Model for Multivariate Functional Data,” Biometrics 79, no. 4 (2023): 3307–3318.



Articles from Statistics in Medicine are provided here courtesy of Wiley
