Author manuscript; available in PMC: 2018 Nov 10.
Published in final edited form as: Stat Med. 2017 Aug 7;36(25):4028–4040. doi: 10.1002/sim.7401

A Joint Modeling and Estimation Method for Multivariate Longitudinal Data with Mixed Types of Responses to Analyze Physical Activity Data Generated by Accelerometers

Haocheng Li a,*, Yukun Zhang b, Raymond J Carroll c,d, Sarah Kozey Keadle e, Joshua N Sampson f, Charles E Matthews f
PMCID: PMC5656438  NIHMSID: NIHMS886387  PMID: 28786180

Abstract

A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasilikelihood type approximation for non-linear variables, and transform the proposed model into a multivariate linear mixed model framework for estimation and inference. Via an extension to the EM approach, an efficient algorithm is developed to fit the model. The method is applied to physical activity data, which uses a wearable accelerometer device to measure daily movement and energy expenditure information. Our approach is also evaluated by a simulation study.

Keywords: Accelerometers, Longitudinal data, Mixed effects model, Multivariate longitudinal data, Penalized Quasilikelihood

1. Introduction

We propose a new framework to model multivariate longitudinal data that consist of multiple types of variables. Our methodology is motivated by physical activity data recorded by wearable accelerometer devices [1]. These devices have several advantages over traditional questionnaire-based measures of physical activity and are increasingly used in studies investigating physical activity and health. One clear advantage is the ability to measure activity continuously over days or weeks at relatively high frequencies (e.g., 80 Hz). However, since these data can be summarized into many metrics that are potentially associated, the increased information presents new analytic challenges.

Recent literature [2] shows that a complete description of an individual’s pattern of physical activity requires specification of multiple variables, such as those listed in Figure 1 (a)–(h). Among these eight variables, we note that two are continuous measures, two are proportions, two are counts, and two are binary. Our motivating example is an experimental study that examined the impact of 12 weeks of exercise training on health. The 63 participants wore monitors during baseline and weeks 3, 6, 9 and 12 of the intervention, and we extracted the eight variables for one day’s wear in each week.

Figure 1.


Sample data from two subjects on weeks 0, 3, 6, 9 and 12. Solid lines and “X” labels display the observations from the individual with ID 4. Dashed lines and “O” labels represent the outcomes from the individual with ID 5. (a) $Y^{(1)}$: continuous variable for daily sedentary hours; (b) $Y^{(2)}$: continuous variable for energy expenditure; (c) $Y^{(3)}$: proportion of sedentary time in bouts greater than 20 minutes; (d) $Y^{(4)}$: proportion of active time in bouts greater than 5 minutes; (e) $Y^{(5)}$: count of daily standing up behaviors; (f) $Y^{(6)}$: count of daily steps; (g) $Y^{(7)}$: binary variable for whether daily moderate to vigorous physical activity (MVPA) time is greater than one hour; (h) $Y^{(8)}$: binary variable for whether the highest energy expenditure rate measured by metabolic equivalents (METs) in 10 minutes is greater than 3.

The main purpose of this work is to develop statistical methodology for jointly modeling the longitudinal patterns of the eight variables and their association structure. The association pattern has many uses in physical activity data analysis, including, but not limited to, studying daily sedentary time and energy expenditure levels under certain conditions. For example, we could explore the trend of these two outcomes for participants who, at baseline, are less likely to be sedentary and tend to take more daily steps, spend more time in moderate to vigorous physical activity (MVPA), and reach higher activity intensity. Although one may simply select the individuals who meet all criteria for the inference, this would cause a severe loss of information, and there could be very few or even zero subjects satisfying all criteria in our small sample. On the other hand, as we will show in Section 5, the association structure obtained from our joint modeling uses the information from all participants and is flexible enough to handle user-specified conditions for different research purposes.

In this manuscript, we propose a flexible statistical model that handles a large number of response variables of multiple types in longitudinal studies. A naive solution is to ignore the correlation among multiple outcomes and fit them separately as if they were independent, but this potentially loses important information from the data. Another simple approach is to consider the association only among variables of the same type (e.g. bivariate normal), while ignoring the association across different types of responses. As we will show, estimating outcomes under specified conditions requires the association patterns among all variables, and ignoring such correlation would lead to biased conclusions. Among joint modeling approaches, most of the current multivariate methods focus on one or two types of outcomes [3]. For example, Fieuws and Verbeke [4] and Fieuws et al. [5] respectively discuss methods to handle multiple normal and binary outcomes, Gueorguieva and Agresti [6] develop a modeling strategy for one normal and one binary response, and Buu et al. [7] propose a joint model for one count and one binary variable to handle zero-inflated count data. These models are limited in practice because they cover at most two types of outcomes. Instead, we develop a multivariate mixed effect model for various types of outcomes, which incorporates the within-individual correlation across variables through random effects.

The difficulty for this study is finding a method that can fit the proposed model. With the increasing number of variables of different types, the number of random effects also increases. Current maximum likelihood methods for joint modeling are only feasible for low dimensional random effects. For multivariate linear mixed models, where the analytical formulation of the marginal density is known, model fitting becomes infeasible when the number of variables exceeds four, because the number of parameters in the covariance matrix of the random effects increases rapidly [4]. For models of non-linear variables, large dimensional integration problems are more difficult to handle in the presence of multiple random effects. For example, the maximum likelihood approach using Monte Carlo or Gauss-Hermite quadrature is computationally extremely difficult for the integration of seven random effects in Fieuws et al. [5]. For our accelerometer data, we set a random intercept and a random slope for each of the eight variables. Therefore, our model involves 2 × 8 = 16 dimensional random effects, which exceeds the practical limit for maximum likelihood.

An alternative model fitting approach is to use a pairwise likelihood strategy by splitting joint estimation into a series of bivariate joint mixed models [4, 5]. However, it could still be intractable as the number of pairs quadratically increases with the number of variables. Moreover, as our proposed model involves four types of variables, the paired structures involve different formulations (i.e. bivariate normal, normal and proportion, binary and count, etc), which is difficult to implement in practice.

We use an upgraded penalized quasilikelihood approach to fit the data [8, 9]. This method extends the previous approach [10] by accommodating the data with continuous, proportion, count, and binary variables. Based on the penalized quasilikelihood method, all of the non-linear variables are transformed and approximately postulated by linear mixed models. Therefore, four types of measurements can be postulated by an approximated multivariate linear mixed model. In addition, the idea of an ECME algorithm [11] is extended to our joint model to solve the estimation problem for a large dimension of random effects. Our method avoids the numerical integrations for non-linear variables, and it only has modest computational workload. In addition, the updating of parameters in maximum likelihood method requires tedious computation of the first and second order derivatives with various formulations from different types of outcomes [12, 13], while our algorithm is easy to implement since it only involves the first derivatives of log-likelihood functions for linear mixed model.

The paper is organized as follows. Section 2 describes the models for multiple types of variables, and Section 3 describes the algorithm we developed to fit the model. Section 4 gives results from our simulation studies. Section 5 analyzes the motivating physical activity data obtained from a wearable accelerometer-based device. Concluding remarks are in Section 6.

2. Mixed Effects Model

2.1. Model Specification

Let $Y_{ij}^{(\ell)}$ be the $\ell$th outcome at occasion $j = 1, \ldots, J_i$ for subject $i = 1, \ldots, n$. To accommodate the data, we set $\ell = 1, 2$ for continuous responses, $\ell = 3, 4$ for proportional responses, $\ell = 5, 6$ for count responses and $\ell = 7, 8$ for binary responses, respectively. We shall assume that $Y_{ij}^{(\ell)}$ comes from a specified distribution with mean $\mu_{ij}^{(\ell)}$, linear predictor $\eta_{ij}^{(\ell)}$, and link functions to connect $\mu_{ij}^{(\ell)}$ and $\eta_{ij}^{(\ell)}$ in various forms for different types of outcomes. Then we define the linear predictor $\eta_{ij}^{(\ell)} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)}$, where $X_{ij}^{(\ell)}$ and $Z_{ij}^{(\ell)}$ are covariate vectors for fixed and random effects, respectively, $\beta^{(\ell)}$ is a vector of fixed effect coefficients, and $u_i^{(\ell)}$ is a vector of correlated random effects distributed as Normal$(0, \Psi^{(\ell\ell)})$.

For continuous data ($\ell = 1, 2$), we assume that the response follows a normal distribution and let $\eta_{ij}^{(\ell)} = \mu_{ij}^{(\ell)}$ to have

$$Y_{ij}^{(\ell)} = \mu_{ij}^{(\ell)} + \varepsilon_{ij}^{(\ell)} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)} + \varepsilon_{ij}^{(\ell)}, \quad \ell = 1, 2, \tag{1}$$

where $\varepsilon_{ij}^{(\ell)}$ is independent random noise distributed as Normal$(0, \sigma^{2(\ell)})$.

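For proportional data, based on the Beta regression framework proposed by Ferrari and Cribari-Neto [14], the density function for the proportional outcome $Y_{ij}^{(\ell)}$ given the random effects $u_i^{(\ell)}$ is assumed to be

$$\frac{\Gamma\{\phi^{(\ell)}\}}{\Gamma\{\mu_{ij}^{(\ell)}\phi^{(\ell)}\}\,\Gamma[\{1 - \mu_{ij}^{(\ell)}\}\phi^{(\ell)}]}\,\{y_{ij}^{(\ell)}\}^{\mu_{ij}^{(\ell)}\phi^{(\ell)} - 1}\,\{1 - y_{ij}^{(\ell)}\}^{\{1 - \mu_{ij}^{(\ell)}\}\phi^{(\ell)} - 1}, \quad \ell = 3, 4,$$

where $\Gamma(\cdot)$ is the gamma function, and we reparameterize $\sigma^{2(\ell)} = 1/\{1 + \phi^{(\ell)}\}$ ($\ell = 3, 4$) to unify the notation with the continuous outcomes. We also have $\mathrm{var}\{Y_{ij}^{(\ell)} \mid u_i^{(\ell)}\} = \sigma^{2(\ell)}\mu_{ij}^{(\ell)}\{1 - \mu_{ij}^{(\ell)}\}$ and use the logit link function to connect $\mu_{ij}^{(\ell)}$ and $\eta_{ij}^{(\ell)}$ as

$$\mathrm{logit}\{\mu_{ij}^{(\ell)}\} = \eta_{ij}^{(\ell)} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)}, \quad \ell = 3, 4. \tag{2}$$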
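As a brief check of the reparameterization, the conditional variance stated above follows from the standard variance formula of a Beta$(a, b)$ distribution. The shorthand $a = \mu\phi$ and $b = (1 - \mu)\phi$ (indices dropped) is introduced here only for this derivation, so that $a + b = \phi$:

$$\mathrm{var}(Y \mid u) = \frac{ab}{(a + b)^2 (a + b + 1)} = \frac{\mu(1 - \mu)\phi^2}{\phi^2(\phi + 1)} = \frac{\mu(1 - \mu)}{1 + \phi} = \sigma^2 \mu(1 - \mu).$$

For instance, a dispersion of $\sigma^2 = 1/30$ corresponds to $\phi = 1/\sigma^2 - 1 = 29$.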

Poisson distributions with log link function and binomial distributions with logit link function are employed to model the count and binary data, respectively. They lead to

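$$\log\{\mu_{ij}^{(\ell)}\} = \eta_{ij}^{(\ell)} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)}, \quad \ell = 5, 6, \tag{3}$$

$$\mathrm{logit}\{\mu_{ij}^{(\ell)}\} = \eta_{ij}^{(\ell)} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)}, \quad \ell = 7, 8. \tag{4}$$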

We further assume that given $u_i^{(\ell)}$ ($\ell = 1, \ldots, 8$), the observations are independent across all occasions, subjects, and different types of responses. Therefore, the random effects model the association pattern across visit time points and also postulate the correlation structure across all variables.

For ease of exposition, we combine notations as $Y_i^{(\ell)} = \{Y_{i1}^{(\ell)}, \ldots, Y_{iJ_i}^{(\ell)}\}^T$ and $Y_i = \{Y_i^{(1)T}, \ldots, Y_i^{(8)T}\}^T$ for responses; $X_i^{(\ell)} = \{X_{i1}^{(\ell)T}, \ldots, X_{iJ_i}^{(\ell)T}\}^T$, with $X_i$ the block diagonal matrix with elements $X_i^{(\ell)}$ ($\ell = 1, \ldots, 8$); and $Z_i^{(\ell)} = \{Z_{i1}^{(\ell)T}, \ldots, Z_{iJ_i}^{(\ell)T}\}^T$, with $Z_i$ the block diagonal matrix with elements $Z_i^{(\ell)}$ ($\ell = 1, \ldots, 8$).

We also denote $\beta = \{\beta^{(1)T}, \ldots, \beta^{(8)T}\}^T$ for fixed effect coefficients, and $u_i = \{u_i^{(1)T}, \ldots, u_i^{(8)T}\}^T$ for random effect variables. Thus, the random effect variables have $u_i \sim$ Normal$(0, \Psi)$, where the block matrix $\Psi$ has diagonal blocks $\Psi^{(\ell\ell)}$ and off-diagonal blocks $\Psi^{(\ell\tilde\ell)}$ ($\ell, \tilde\ell = 1, 2, \ldots, 8$; $\ell \neq \tilde\ell$). The block $\Psi^{(\ell\ell)}$ denotes the covariance of $u_i^{(\ell)}$. The block $\Psi^{(\ell\tilde\ell)}$ determines the covariance of $u_i^{(\ell)}$ and $u_i^{(\tilde\ell)}$, which measures the association level across two variables. We further denote $\sigma^2 = \{\sigma^{2(1)}, \ldots, \sigma^{2(4)}\}^T$.

Therefore, the multivariate longitudinal data model for various types of variables involves three sets of parameters to be estimated: (1) fixed effect coefficients β; (2) ui’s covariance matrix Ψ; (3) dispersion parameters σ2.
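The four response families of this section can be summarized by their conditional mean and variance given the random effects. The following Python sketch is for illustration only: the function names and the string labels for the response types are hypothetical and are not part of any software described in this paper.

```python
import math

def inv_logit(eta):
    """Inverse logit link: exp(eta) / {1 + exp(eta)}."""
    return 1.0 / (1.0 + math.exp(-eta))

def conditional_mean_var(kind, eta, sigma2=None):
    """Return (mean, variance) of Y given the random effects.

    kind   : "continuous", "proportion", "count" or "binary" (hypothetical labels)
    eta    : linear predictor X * beta + Z * u
    sigma2 : dispersion, used only for continuous and proportion responses
    """
    if kind == "continuous":       # identity link, normal noise, Equation (1)
        return eta, sigma2
    if kind == "proportion":       # logit link, Beta response, Equation (2)
        mu = inv_logit(eta)
        return mu, sigma2 * mu * (1.0 - mu)
    if kind == "count":            # log link, Poisson response, Equation (3)
        mu = math.exp(eta)
        return mu, mu
    if kind == "binary":           # logit link, Bernoulli response, Equation (4)
        mu = inv_logit(eta)
        return mu, mu * (1.0 - mu)
    raise ValueError(f"unknown response type: {kind}")

# Evaluate each family at the same illustrative linear predictor
for kind, s2 in [("continuous", 1.0), ("proportion", 1 / 30),
                 ("count", None), ("binary", None)]:
    print(kind, conditional_mean_var(kind, 0.5, s2))
```

Only the continuous and proportional responses carry a free dispersion parameter, which is why $\sigma^2$ has four elements rather than eight.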

2.2. Approximated Linear Mixed Model

We approximate the proportion, count and binary variables using the penalized quasilikelihood method (PQL) proposed by Breslow and Clayton [8]. The approximation is upgraded by Goldstein and Rasbash [9] with second order approximation terms to improve performance. As we will show in the simulation study, the upgraded approach employed in this manuscript leads to smaller bias compared with the regular PQL method without the second order approximation. We briefly describe the formula for proportional variables here, and the formulas for count and binary data are displayed in Appendix A.1. Let $H(\cdot)$ be the inverse of the logit link, $H(\cdot) = \exp(\cdot)/\{1 + \exp(\cdot)\}$, and let $H'(\cdot)$ and $H''(\cdot)$ be its first and second derivatives. Given specified values of $(\hat\beta, \hat u_i)$, the proportional data model in (2) has $\hat\eta_{ij}^{(\ell)} = X_{ij}^{(\ell)}\hat\beta^{(\ell)} + Z_{ij}^{(\ell)}\hat u_i^{(\ell)}$ ($\ell = 3, 4$). Following the approximation methods discussed in Lindstrom and Bates [15], Wolfinger and O’Connell [16], Goldstein and Rasbash [9] and Molenberghs and Verbeke [17], the formulation is displayed as follows, with detailed derivations described in Appendix A.2,

$$Y_{ij}^{*(\ell)} \equiv \frac{1}{H'\{\hat\eta_{ij}^{(\ell)}\}}\left[Y_{ij}^{(\ell)} - H\{\hat\eta_{ij}^{(\ell)}\}\right] + \hat\eta_{ij}^{(\ell)} - \frac{1}{2H'\{\hat\eta_{ij}^{(\ell)}\}}\,H''\{\hat\eta_{ij}^{(\ell)}\}\left[Z_{ij}^{(\ell)}\,\widehat{\mathrm{var}}\{u_i^{(\ell)} - \hat u_i^{(\ell)}\}\,Z_{ij}^{(\ell)T}\right] \approx X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)} + \varepsilon_{ij}^{*(\ell)}, \quad \ell = 3, 4, \tag{5}$$

where $\varepsilon_{ij}^{*(\ell)}$ has mean zero and variance $\sigma^{2(\ell)}/H'\{\hat\eta_{ij}^{(\ell)}\}$.
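To make the transformation in Equation (5) concrete, the sketch below computes the working response for a single proportional observation, with and without the second order correction term. It is a minimal illustration under stated assumptions: the function names `working_response` and `working_variance` are hypothetical, and the numerical inputs are arbitrary rather than taken from the data.

```python
import numpy as np

def H(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + np.exp(-eta))

def H1(eta):
    """First derivative of H."""
    p = H(eta)
    return p * (1.0 - p)

def H2(eta):
    """Second derivative of H."""
    p = H(eta)
    return p * (1.0 - p) * (1.0 - 2.0 * p)

def working_response(y, eta_hat, z, var_u, second_order=True):
    """Transformed outcome of Equation (5) for one observation.

    y       : observed proportion
    eta_hat : current linear predictor X * beta_hat + Z * u_hat
    z       : random-effect covariate vector Z_ij (1-D array)
    var_u   : estimate of var(u_i - u_hat_i) for this outcome (2-D array)
    """
    y_star = (y - H(eta_hat)) / H1(eta_hat) + eta_hat
    if second_order:
        quad = float(z @ var_u @ z)   # Z var(u - u_hat) Z^T
        y_star -= 0.5 * H2(eta_hat) / H1(eta_hat) * quad
    return y_star

def working_variance(eta_hat, sigma2):
    """Variance of the working error term: sigma2 / H'(eta_hat)."""
    return sigma2 / H1(eta_hat)

z = np.array([1.0, -2.0])             # random intercept and slope at t = -2
var_u = 0.1 * np.eye(2)               # arbitrary illustrative value
print(working_response(0.7, 0.4, z, var_u, second_order=False))
print(working_response(0.7, 0.4, z, var_u, second_order=True))
print(working_variance(0.4, 1 / 30))
```

Setting `second_order=False` drops the term involving $H''$, which corresponds to the regular first order PQL approximation.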

After performing similar transformations for the binary and count variables, we now have six transformed outcomes $Y_{ij}^{*(\ell)}$ ($\ell = 3, \ldots, 8$) that have approximated mean structure $E\{Y_{ij}^{*(\ell)} \mid u_i^{(\ell)}\} = X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)}u_i^{(\ell)}$ and variance structure $\mathrm{var}\{Y_{ij}^{*(\ell)} \mid u_i^{(\ell)}\} = \mathrm{var}\{\varepsilon_{ij}^{*(\ell)}\}$. The estimation of model parameters can thus be implemented as fitting a linear mixed model. Because each of these variables can be fitted as an approximated linear mixed model and the two continuous responses are also postulated by linear mixed models, a multivariate linear mixed model can be used to jointly fit all variables. Therefore, despite the various types of outcomes, they are transformed into a unified framework which can be conveniently estimated by the algorithm described in Section 3.

3. Model Estimation

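We estimate the parameters by extending the idea of an ECME algorithm [11]. The ECME algorithm updates the fixed effects parameters $(\beta, \sigma^2)$ by the Newton-Raphson method and updates the random effects parameters $\Psi$ by the EM algorithm. Liu and Rubin [18] study the convergence properties of this algorithm. We provide detailed procedures for the model fitting here. Similar to the notation of $Y_i$, we define the approximated response vector as $Y_i^{*(\ell)} = \{Y_{i1}^{*(\ell)}, \ldots, Y_{iJ_i}^{*(\ell)}\}^T$ ($\ell = 3, 4, \ldots, 8$), and combine these vectors with the continuous outcomes as $Y_i^* = \{Y_i^{(1)T}, Y_i^{(2)T}, Y_i^{*(3)T}, \ldots, Y_i^{*(8)T}\}^T$. Let $\Sigma_i$ be the diagonal matrix with the variances of $\varepsilon_{ij}^{*(\ell)}$ as

$$\Sigma_i \equiv \mathrm{diag}\left[\sigma^{2(1)}I_{J_i},\ \sigma^{2(2)}I_{J_i},\ \sigma^{2(3)}G_i^{(3)},\ \sigma^{2(4)}G_i^{(4)},\ F_i^{(5)},\ F_i^{(6)},\ G_i^{(7)},\ G_i^{(8)}\right], \tag{6}$$

where $I_{J_i}$ is the $J_i \times J_i$ identity matrix, $F_i^{(\ell)} = \mathrm{diag}[\exp\{-\hat\eta_{i1}^{(\ell)}\}, \ldots, \exp\{-\hat\eta_{iJ_i}^{(\ell)}\}]$ and $G_i^{(\ell)} = \mathrm{diag}[1/H'\{\hat\eta_{i1}^{(\ell)}\}, \ldots, 1/H'\{\hat\eta_{iJ_i}^{(\ell)}\}]$. We also let $V_i$ be the approximated covariance of $Y_i^*$,

$$V_i \equiv \Sigma_i + Z_i \Psi Z_i^T. \tag{7}$$

The approximated conditional covariance of $u_i$ given $Y_i^*$, assuming known $\beta$, is

$$U_i \equiv \Psi - \Psi Z_i^T V_i^{-1} Z_i \Psi, \tag{8}$$

and for unknown $\beta$ it is

$$\widehat{\mathrm{var}}(u_i - \hat u_i) = U_i + \Psi Z_i^T V_i^{-1} X_i \left(\sum_{i=1}^n X_i^T V_i^{-1} X_i\right)^{-1} X_i^T V_i^{-1} Z_i \Psi, \tag{9}$$

where the estimate of $\widehat{\mathrm{var}}\{u_i^{(\ell)} - \hat u_i^{(\ell)}\}$ for (5), (A.1) and (A.2) can be obtained by extracting the corresponding elements.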
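Equations (7) to (9) are plain matrix computations once $\Sigma_i$ is available. The sketch below evaluates them with NumPy for a toy input of three subjects, a single continuous outcome, and a random intercept and slope. The function name `subject_covariances` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def subject_covariances(X_list, Z_list, Sigma_list, Psi):
    """Evaluate Equations (7)-(9) for all subjects.

    X_list, Z_list : per-subject design matrices X_i and Z_i
    Sigma_list     : per-subject diagonal matrices Sigma_i from Equation (6)
    Psi            : covariance matrix of the random effects
    Returns lists of V_i^{-1}, U_i and var_hat(u_i - u_hat_i).
    """
    # Equation (7): V_i = Sigma_i + Z_i Psi Z_i^T (stored as its inverse)
    V_inv = [np.linalg.inv(S + Z @ Psi @ Z.T) for Z, S in zip(Z_list, Sigma_list)]
    # Equation (8): U_i = Psi - Psi Z_i^T V_i^{-1} Z_i Psi
    U = [Psi - Psi @ Z.T @ Vi @ Z @ Psi for Z, Vi in zip(Z_list, V_inv)]
    # Inverse of sum_i X_i^T V_i^{-1} X_i, shared by all subjects
    info_inv = np.linalg.inv(sum(X.T @ Vi @ X for X, Vi in zip(X_list, V_inv)))
    var_u = []
    for X, Z, Vi, Ui in zip(X_list, Z_list, V_inv, U):
        A = Psi @ Z.T @ Vi @ X                   # Psi Z_i^T V_i^{-1} X_i
        var_u.append(Ui + A @ info_inv @ A.T)    # Equation (9)
    return V_inv, U, var_u

# Toy input: three subjects, one continuous outcome, random intercept and slope
t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
X = np.column_stack([np.ones_like(t), t])
Psi = np.array([[0.5, 0.25], [0.25, 0.5]])
V_inv, U, var_u = subject_covariances([X] * 3, [X] * 3, [np.eye(5)] * 3, Psi)
print(np.round(U[0], 3))
print(np.round(var_u[0], 3))
```

The second term in Equation (9) is positive semidefinite, so accounting for the estimation of $\beta$ can only increase the uncertainty about $u_i$ relative to $U_i$.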

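The algorithm proceeds as follows.

  1. Based on the current working values $\{\beta_{\mathrm{curr}}, \Psi_{\mathrm{curr}}, \sigma^2_{\mathrm{curr}}, \hat u_{i,\mathrm{curr}}^{(\ell)}\ (\ell = 3, \ldots, 8)\}$, calculate $\hat\eta_{i,\mathrm{curr}}^{(\ell)}$ ($\ell = 3, \ldots, 8$), update $\Sigma_{i,\mathrm{curr}}$, $V_{i,\mathrm{curr}}$, $U_{i,\mathrm{curr}}$ and $\widehat{\mathrm{var}}_{\mathrm{curr}}(u_i - \hat u_i)$ sequentially by Equations (6) to (9), and then update $Y_{i,\mathrm{curr}}^*$ by Equations (5), (A.1) and (A.2).

  2. Update $\beta$ as
    $$\beta_{\mathrm{new}} = \left(\sum_{i=1}^n X_i^T V_{i,\mathrm{curr}}^{-1} X_i\right)^{-1}\left(\sum_{i=1}^n X_i^T V_{i,\mathrm{curr}}^{-1} Y_{i,\mathrm{curr}}^*\right). \tag{10}$$

  3. Based on $\{\beta_{\mathrm{new}}, \Psi_{\mathrm{curr}}, \sigma^2_{\mathrm{curr}}, \hat u_{i,\mathrm{curr}}^{(\ell)}\ (\ell = 3, \ldots, 8)\}$, repeat step 1 to calculate $\hat\eta_{i,\mathrm{new}}^{(\ell)}$ ($\ell = 3, \ldots, 8$), $\Sigma_{i,\mathrm{new}}$, $V_{i,\mathrm{new}}$, $U_{i,\mathrm{new}}$, $\widehat{\mathrm{var}}_{\mathrm{new}}(u_i - \hat u_i)$ and $Y_{i,\mathrm{new}}^*$, sequentially.

  4. Update $\sigma^{2(\ell)}$ as $\sigma^{2(\ell)}_{\mathrm{new}} = N^{-1}\sum_{i=1}^n \{Y_{i,\mathrm{new}}^{*(\ell)} - X_i^{(\ell)}\beta_{\mathrm{new}}^{(\ell)}\}^T \{W_i^{(\ell)}\}^{-1} \{Y_{i,\mathrm{new}}^{*(\ell)} - X_i^{(\ell)}\beta_{\mathrm{new}}^{(\ell)}\}$, where $N = \sum_{i=1}^n J_i$, $W_i^{(\ell)} = I_{J_i} + Z_i^{(\ell)}\Psi_{\mathrm{curr}}^{(\ell\ell)}Z_i^{(\ell)T}/\sigma^{2(\ell)}_{\mathrm{curr}}$ ($\ell = 1, 2$), and $W_i^{(\ell)} = \mathrm{diag}[1/H'\{\hat\eta_{i,\mathrm{new}}^{(\ell)}\}] + Z_i^{(\ell)}\Psi_{\mathrm{curr}}^{(\ell\ell)}Z_i^{(\ell)T}/\sigma^{2(\ell)}_{\mathrm{curr}}$ ($\ell = 3, 4$).

  5. Update $\hat u_i$ as $\hat u_{i,\mathrm{new}} = \Psi_{\mathrm{curr}} Z_i^T V_{i,\mathrm{new}}^{-1}(Y_{i,\mathrm{new}}^* - X_i\beta_{\mathrm{new}})$.

  6. Update $\Psi$ as $\Psi_{\mathrm{new}} = n^{-1}\sum_{i=1}^n (\hat u_{i,\mathrm{new}}\hat u_{i,\mathrm{new}}^T + U_{i,\mathrm{new}})$.

We keep $\hat u_{i,\mathrm{new}}^{(\ell)}$ ($\ell = 3, \ldots, 8$) for the next iteration by selecting subvectors from $\hat u_{i,\mathrm{new}}$. The procedure is iterated until convergence of $\{\beta, \Psi, \sigma^2, \hat u_i^{(\ell)}\ (\ell = 3, \ldots, 8)\}$.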
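The update steps can be sketched in a few lines for the simplest special case: a single continuous outcome with a random intercept and slope. In that case no PQL transformation is needed, so steps 1 and 3 reduce to recomputing $V_i$, and $W_i = V_i/\sigma^2$. The simulated data, the zero starting value for $\beta$, the iteration cap and the tolerance are assumptions made for this illustration; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 200, np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
J = t.size
X = np.column_stack([np.ones(J), t])       # same design for every subject
Z = X.copy()                               # random intercept and slope
beta_true = np.array([0.5, 0.2])
Psi_true = np.array([[0.5, 0.25], [0.25, 0.5]])
u = rng.multivariate_normal(np.zeros(2), Psi_true, size=n)
Y = X @ beta_true + u @ Z.T + rng.normal(0.0, 1.0, size=(n, J))

# Starting values (illustrative): small dispersion, diagonal Psi
beta, Psi, sigma2 = np.zeros(2), 0.1 * np.eye(2), 0.01
for iteration in range(500):
    V_inv = np.linalg.inv(sigma2 * np.eye(J) + Z @ Psi @ Z.T)
    # Step 2: generalized least squares update of beta, Equation (10)
    beta_new = np.linalg.solve(n * X.T @ V_inv @ X, X.T @ V_inv @ Y.sum(axis=0))
    R = Y - X @ beta_new                   # residuals, one row per subject
    # Step 4: dispersion update; W^{-1} = sigma2 * V^{-1} for a continuous outcome
    sigma2_new = sigma2 * np.einsum("ij,jk,ik->", R, V_inv, R) / (n * J)
    # Step 5: predicted random effects, rows are u_hat_i^T
    u_hat = R @ V_inv @ Z @ Psi
    # Step 6: EM update of Psi using U from Equation (8)
    U = Psi - Psi @ Z.T @ V_inv @ Z @ Psi
    Psi_new = u_hat.T @ u_hat / n + U
    change = max(np.abs(beta_new - beta).max(), abs(sigma2_new - sigma2),
                 np.abs(Psi_new - Psi).max())
    beta, sigma2, Psi = beta_new, sigma2_new, Psi_new
    if change < 1e-8:
        break

print(np.round(beta, 3), round(float(sigma2), 3))
print(np.round(Psi, 3))
```

In the full model, the same loop would additionally recompute the working responses and $\Sigma_i$ for the proportion, count and binary outcomes at every iteration, which is where the second order PQL terms enter.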

According to Equation (10), the covariance matrix for $\hat\beta$ can be calculated by $\left(\sum_{i=1}^n X_i^T V_i^{-1} X_i\right)^{-1}$. This variance estimator leads to satisfactory results in our simulation studies and data analysis. However, the estimator is based on the approximation for $Y_i^*$ and assumes that the other parameters are known. The improved estimators discussed in [8] can be used to reduce bias.

Remark 1

We fit the data by a generalized linear model without random effects to obtain an initial value of $\beta$. We also set all elements in $\sigma^2$ to be 0.01 and all elements in $\hat u_i^{(\ell)}$ ($\ell = 3, \ldots, 8$) to be 0 as starting values. Following the discussion in [8], we set the diagonal elements in the covariance matrix $\Psi$ to be small positive values and the off-diagonal elements to be 0. In a preliminary simulation study (not reported), the diagonal variances in $\Psi$ were all set to 0.1. However, during the estimation iterations, this initial value could lead to fairly extreme values in $\Sigma_i$ for count outcomes, so that $V_i$ could be singular and not invertible. In the preliminary study, 36% of the simulation runs had this problem for 1 to 3 subjects. We solve the issue by resetting the variances in $\Psi$ corresponding to the count outcomes to 0.001. With the new initial values, none of the runs in the moderate sample scenario has this issue in the study reported in Section 4, and fewer than 1% of the runs in the large sample scenario have one subject with singular $V_i$. For those rare runs with singular $V_i$ under the new starting values, we remove the corresponding subjects from the iteration procedure, and then all of the runs successfully achieve convergence.

4. Simulation Studies

In this section, we use a simulation study to demonstrate the performance of our proposed approach (labeled as JOINT-PQL2). As a comparison, two naive approaches are also explored, where the first method fits the eight variables by assuming them to be independent (labeled as NAIVE1), and the other approach only models the association among variables of the same type (labeled as NAIVE2). We also study the performance of the regular PQL method, which models the complete association structure but does not use second order approximations for the proportion, count and binary outcomes (labeled as JOINT-PQL1).

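In the simulation study of 500 runs, we have n = 200 subjects and each subject has $J_i = 5$ visits. We also consider a larger sample size with n = 400 and $J_i = 9$. At visit time j, subject i has eight observations $\{Y_{ij}^{(1)}, \ldots, Y_{ij}^{(8)}\}$ as described in Section 2. The continuous variables $Y_{ij}^{(1)}$ and $Y_{ij}^{(2)}$ are generated according to

$$Y_{ij}^{(\ell)} = \beta_0^{(\ell)} + \beta_1^{(\ell)}t_{ij} + \beta_2^{(\ell)}x_{ij}^{(\ell)} + \beta_3^{(\ell)}t_{ij}x_{ij}^{(\ell)} + u_{i,0}^{(\ell)} + t_{ij}u_{i,1}^{(\ell)} + \varepsilon_{ij}^{(\ell)}, \quad \ell = 1, 2,$$

where $t_{ij} = j - 3$ ($j = 1, \ldots, 5$) for the moderate sample, $t_{ij} = (j - 5)/2$ ($j = 1, \ldots, 9$) for the large sample, and $x_{ij}^{(\ell)}$ are generated from the standard normal distribution and independent across i, j, ℓ. The other types of variables have a similar linear predictor formulation with

$$\eta_{ij}^{(\ell)} = \beta_0^{(\ell)} + \beta_1^{(\ell)}t_{ij} + \beta_2^{(\ell)}x_{ij}^{(\ell)} + \beta_3^{(\ell)}t_{ij}x_{ij}^{(\ell)} + u_{i,0}^{(\ell)} + t_{ij}u_{i,1}^{(\ell)}, \quad \ell = 3, 4, \ldots, 8.$$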

We set β0()=0.5,β1()=0.2,β2()=0.2,β3()=0.1 for ℓ = 1, 3, 5, 7, and β0()=0.5,β1()=0.2,β2()=0.2,β3()=0.1 for ℓ = 2, 4, 6, 8, and the covariance matrix Ψ for ui has Ψ(ℓℓ) (ℓ = 1, 2, …, 8) in 2 × 2 matrix with diagonal elements to be Ψ[1,1]()=Ψ[2,2]()=0.5 and off-diagonal elements to be Ψ[1,2]()=0.25, and Ψ(ℓℓ̃) in 2 × 2 matrix with all elements to be 0.1. We also set σ2(1) = σ2(2) = 1 and σ2(3) = σ2(4) = 1/30.
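The data-generating mechanism of the moderate sample scenario can be sketched as follows for one subject. The covariance blocks and dispersion values follow the description above; the coefficient array `beta` is a placeholder that uses one set of values for all eight outcomes and is not meant to reproduce the exact settings of the study. The function name and the random seed are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 8                                     # number of outcomes
t = np.arange(1, 6) - 3.0                 # t_ij = j - 3 for j = 1, ..., 5

# Random-effects covariance: 2 x 2 diagonal blocks, all other entries 0.1
Psi = np.full((2 * L, 2 * L), 0.1)
for l in range(L):
    Psi[2 * l:2 * l + 2, 2 * l:2 * l + 2] = [[0.5, 0.25], [0.25, 0.5]]

sigma2 = [1.0, 1.0, 1 / 30, 1 / 30]       # dispersions for outcomes 1 to 4
beta = np.tile([0.5, 0.2, 0.2, 0.1], (L, 1))   # placeholder coefficients

def simulate_subject():
    """Draw an 8 x 5 array of outcomes (rows: outcomes, columns: visits)."""
    u = rng.multivariate_normal(np.zeros(2 * L), Psi).reshape(L, 2)
    x = rng.standard_normal((L, t.size))
    Y = np.empty((L, t.size))
    for l in range(L):
        b = beta[l]
        eta = b[0] + b[1] * t + b[2] * x[l] + b[3] * t * x[l] + u[l, 0] + t * u[l, 1]
        if l < 2:                          # continuous: normal noise
            Y[l] = eta + rng.normal(0.0, np.sqrt(sigma2[l]), t.size)
        elif l < 4:                        # proportion: Beta with phi = 1/sigma2 - 1
            mu = np.clip(1.0 / (1.0 + np.exp(-eta)), 1e-8, 1 - 1e-8)
            phi = 1.0 / sigma2[l] - 1.0
            Y[l] = rng.beta(mu * phi, (1.0 - mu) * phi)
        elif l < 6:                        # count: Poisson with log link
            Y[l] = rng.poisson(np.exp(eta))
        else:                              # binary: Bernoulli with logit link
            Y[l] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
    return Y

Y_example = simulate_subject()
print(np.round(Y_example, 2))
```

Because every off-diagonal block of $\Psi$ is non-zero, all sixteen random effects of a subject are correlated, which is what links the eight outcomes.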

As discussed in Section 1, we are interested in the estimation of model parameters as well as the longitudinal pattern of outcomes under certain conditions. The conditional patterns can be used to study the physical activity outcomes for participants who meet particular criteria, and they will be discussed in Section 5. As an example, we use three approaches to fit the simulated data, and the estimates are used to calculate the following conditional expectations by averaging the results from one million Monte Carlo samples:

E{Yij()|xij()=0,Yi1(3)0.5,Yi1(4)0.5,Yi1(5)15,Yi1(6)5,Yi1(7)=1,Yi1(8)=1},

where ℓ = 1, 2, j = 1, …, 5 for the moderate sample, and j = 1, …, 9 for the large sample.
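The Monte Carlo evaluation of such a conditional expectation can be illustrated on a reduced problem with one continuous and one binary outcome, each with a random intercept only. Random effects are drawn from their joint distribution, the baseline binary outcome is simulated, and the conditional mean of the continuous outcome is averaged over the draws that satisfy the condition. This is a simplified sketch: the function name, the parameter values and the number of draws are hypothetical, and the full problem conditions on six baseline outcomes rather than one.

```python
import numpy as np

def conditional_expectation(beta_cont, beta_bin, Psi, t, n_draws=200_000, seed=0):
    """Estimate E{Y_cont at each visit | Y_bin at the first visit = 1}, with x = 0.

    beta_cont, beta_bin : (intercept, time slope) of the two outcomes
    Psi                 : 2 x 2 covariance of the two random intercepts
    t                   : visit times
    """
    rng = np.random.default_rng(seed)
    u = rng.multivariate_normal(np.zeros(2), Psi, size=n_draws)
    # Probability that the baseline binary outcome equals 1, given u
    p_first = 1.0 / (1.0 + np.exp(-(beta_bin[0] + beta_bin[1] * t[0] + u[:, 1])))
    keep = rng.random(n_draws) < p_first          # draws meeting the condition
    # Mean of the continuous outcome given u, for the retained draws
    means = beta_cont[0] + beta_cont[1] * t[None, :] + u[keep, 0][:, None]
    return means.mean(axis=0)

t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
beta_cont, beta_bin = (0.5, 0.2), (0.5, 0.2)
marginal = beta_cont[0] + beta_cont[1] * t
joint = conditional_expectation(beta_cont, beta_bin, np.array([[0.5, 0.1], [0.1, 0.5]]), t)
naive = conditional_expectation(beta_cont, beta_bin, np.array([[0.5, 0.0], [0.0, 0.5]]), t)
print(np.round(joint - marginal, 3))
print(np.round(naive - marginal, 3))
```

When the cross-outcome covariance is set to zero, as a fit that ignores the association effectively does, conditioning on the binary outcome no longer shifts the continuous outcome. This is the mechanism by which ignoring the association can bias the estimated conditional expectations.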

Tables 1 and 2 present the relative bias percent (Bias%), the average model-based standard error (ASE), the empirical standard error (ESE) and the 95% coverage rate (CR) for the β estimates in count and binary outcomes, respectively. The β estimates for continuous and proportional outcomes are listed in the Online Supplementary Material. The results show that our proposed JOINT-PQL2 method yields small biases and that the coverage rates are appropriate for β. ASE and ESE agree reasonably well, suggesting that the variance estimates for β are appropriate. The JOINT-PQL1 method, however, has larger bias and poor coverage rates for some parameters in β. For the estimation of β, the NAIVE1 and NAIVE2 approaches lead to conclusions as acceptable as those of the proposed JOINT-PQL2 method. We also report the estimates for $\sigma^2$ and the diagonal elements of Ψ using our JOINT-PQL2 approach in the Online Supplementary Material, which indicate fairly good performance in the estimation of the model parameters.
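The four summary measures can be computed from the per-run estimates and standard errors as sketched below. Two conventions are assumptions here, since they are not spelled out in the text: the relative bias is taken as the mean estimate minus the truth, divided by the truth, and coverage is based on Wald intervals of the form estimate ± 1.96 × standard error. The function name and the toy inputs are hypothetical.

```python
from statistics import mean, stdev

def summarize(estimates, std_errors, truth, z=1.96):
    """Summarize simulation runs for one parameter.

    estimates  : point estimates, one per simulation run
    std_errors : model-based standard errors, one per run
    truth      : true parameter value (must be non-zero for relative bias)
    """
    covered = [abs(est - truth) <= z * se for est, se in zip(estimates, std_errors)]
    return {
        "Bias%": 100.0 * (mean(estimates) - truth) / truth,  # relative bias percent
        "ASE": mean(std_errors),                             # average model-based SE
        "ESE": stdev(estimates),                             # empirical SE across runs
        "CR": sum(covered) / len(covered),                   # coverage of Wald intervals
    }

# Toy input with five runs (illustrative numbers only)
result = summarize([0.48, 0.52, 0.50, 0.55, 0.45], [0.03] * 5, truth=0.5)
print(result)
```

Close agreement between ASE and ESE indicates that the model-based variance estimator reflects the actual sampling variability of the estimates.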

Table 1.

Simulation results for β(5) and β(6) in count outcomes. As defined in Section 4, “JOINT-PQL2” denotes our proposed method, “NAIVE1” and “NAIVE2” represent the approaches completely or partially ignoring the association structure among responses, and “JOINT-PQL1” is the method involving full association pattern but not using second order approximation. Displayed are the relative bias percent (Bias%), the average model-based standard error (ASE), the empirical standard error (ESE) and 95% coverage rate (CR) for the estimates. Relative bias greater than 10% is highlighted in bold.

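Columns 3 to 6: n = 200, $J_i$ = 5. Columns 7 to 10: n = 400, $J_i$ = 9.

| Parameter | Method | Bias% | ASE | ESE | CR | Bias% | ASE | ESE | CR |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_0^{(5)}$ | NAIVE1 | −1.42 | 0.06 | 0.06 | 0.95 | −0.90 | 0.04 | 0.04 | 0.93 |
| | NAIVE2 | −1.62 | 0.06 | 0.06 | 0.96 | −0.93 | 0.04 | 0.04 | 0.93 |
| | JOINT-PQL1 | **18.66** | 0.05 | 0.05 | 0.59 | **11.53** | 0.04 | 0.04 | 0.67 |
| | JOINT-PQL2 | −1.22 | 0.06 | 0.06 | 0.95 | −1.23 | 0.04 | 0.04 | 0.96 |
| $\beta_1^{(5)}$ | NAIVE1 | −1.49 | 0.05 | 0.05 | 0.96 | −0.92 | 0.04 | 0.04 | 0.96 |
| | NAIVE2 | −2.67 | 0.05 | 0.05 | 0.94 | −0.98 | 0.04 | 0.04 | 0.96 |
| | JOINT-PQL1 | 1.26 | 0.05 | 0.05 | 0.94 | −0.64 | 0.04 | 0.04 | 0.94 |
| | JOINT-PQL2 | −1.68 | 0.05 | 0.05 | 0.94 | −1.43 | 0.04 | 0.04 | 0.95 |
| $\beta_2^{(5)}$ | NAIVE1 | −0.50 | 0.03 | 0.03 | 0.93 | 0.05 | 0.01 | 0.01 | 0.94 |
| | NAIVE2 | −0.05 | 0.03 | 0.03 | 0.94 | 0.08 | 0.01 | 0.01 | 0.94 |
| | JOINT-PQL1 | −1.44 | 0.03 | 0.03 | 0.95 | −0.40 | 0.01 | 0.01 | 0.95 |
| | JOINT-PQL2 | 0.27 | 0.03 | 0.03 | 0.95 | 0.34 | 0.01 | 0.01 | 0.94 |
| $\beta_3^{(5)}$ | NAIVE1 | 0.22 | 0.02 | 0.02 | 0.97 | 0.01 | 0.01 | 0.01 | 0.95 |
| | NAIVE2 | −0.38 | 0.02 | 0.02 | 0.95 | 0.09 | 0.01 | 0.01 | 0.94 |
| | JOINT-PQL1 | −0.91 | 0.02 | 0.02 | 0.95 | 0.13 | 0.01 | 0.01 | 0.95 |
| | JOINT-PQL2 | −0.62 | 0.02 | 0.02 | 0.96 | 0.30 | 0.01 | 0.01 | 0.94 |
| $\beta_0^{(6)}$ | NAIVE1 | 4.83 | 0.07 | 0.08 | 0.90 | 1.95 | 0.04 | 0.05 | 0.93 |
| | NAIVE2 | 4.99 | 0.07 | 0.07 | 0.91 | 1.86 | 0.04 | 0.05 | 0.93 |
| | JOINT-PQL1 | **−34.49** | 0.06 | 0.06 | 0.19 | **−25.18** | 0.04 | 0.04 | 0.12 |
| | JOINT-PQL2 | 4.55 | 0.07 | 0.07 | 0.93 | 1.36 | 0.04 | 0.05 | 0.94 |
| $\beta_1^{(6)}$ | NAIVE1 | 5.61 | 0.06 | 0.06 | 0.95 | 0.11 | 0.04 | 0.04 | 0.94 |
| | NAIVE2 | 5.57 | 0.06 | 0.06 | 0.94 | 0.05 | 0.04 | 0.04 | 0.94 |
| | JOINT-PQL1 | −9.75 | 0.05 | 0.05 | 0.94 | −9.25 | 0.04 | 0.04 | 0.93 |
| | JOINT-PQL2 | 4.23 | 0.06 | 0.06 | 0.95 | 0.56 | 0.04 | 0.04 | 0.95 |
| $\beta_2^{(6)}$ | NAIVE1 | −0.23 | 0.04 | 0.04 | 0.97 | 0.07 | 0.02 | 0.02 | 0.95 |
| | NAIVE2 | −0.32 | 0.04 | 0.04 | 0.97 | −0.00 | 0.02 | 0.02 | 0.95 |
| | JOINT-PQL1 | −4.25 | 0.04 | 0.04 | 0.95 | −1.09 | 0.02 | 0.02 | 0.95 |
| | JOINT-PQL2 | 0.80 | 0.04 | 0.04 | 0.97 | 0.26 | 0.02 | 0.02 | 0.96 |
| $\beta_3^{(6)}$ | NAIVE1 | 1.04 | 0.03 | 0.03 | 0.97 | −0.16 | 0.01 | 0.01 | 0.97 |
| | NAIVE2 | −1.31 | 0.03 | 0.03 | 0.97 | −0.06 | 0.01 | 0.01 | 0.97 |
| | JOINT-PQL1 | −5.04 | 0.03 | 0.03 | 0.94 | 0.37 | 0.01 | 0.01 | 0.95 |
| | JOINT-PQL2 | 0.42 | 0.03 | 0.03 | 0.97 | 0.99 | 0.01 | 0.01 | 0.97 |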

Table 2.

Simulation results for β(7) and β(8) in binary outcomes. As defined in Section 4, “JOINT-PQL2” denotes our proposed method, “NAIVE1” and “NAIVE2” represent the approaches completely and partially ignoring the association structure among responses, respectively, and “JOINT-PQL1” is the method involving full association pattern but not using second order approximation. Displayed are the relative bias percent (Bias%), the average model-based standard error (ASE), the empirical standard error (ESE) and 95% coverage rate (CR) for the estimates. Relative bias greater than 10% is highlighted in bold.

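Columns 3 to 6: n = 200, $J_i$ = 5. Columns 7 to 10: n = 400, $J_i$ = 9.

| Parameter | Method | Bias% | ASE | ESE | CR | Bias% | ASE | ESE | CR |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_0^{(7)}$ | NAIVE1 | 1.20 | 0.09 | 0.10 | 0.91 | 0.92 | 0.05 | 0.05 | 0.93 |
| | NAIVE2 | 2.39 | 0.09 | 0.10 | 0.92 | 0.74 | 0.05 | 0.06 | 0.92 |
| | JOINT-PQL1 | **−15.06** | 0.08 | 0.08 | 0.84 | **−13.18** | 0.05 | 0.04 | 0.70 |
| | JOINT-PQL2 | 1.63 | 0.09 | 0.10 | 0.92 | 0.17 | 0.05 | 0.05 | 0.94 |
| $\beta_1^{(7)}$ | NAIVE1 | 3.57 | 0.07 | 0.08 | 0.92 | 2.83 | 0.05 | 0.05 | 0.93 |
| | NAIVE2 | 6.71 | 0.07 | 0.08 | 0.91 | 2.56 | 0.05 | 0.05 | 0.93 |
| | JOINT-PQL1 | **−21.84** | 0.06 | 0.06 | 0.87 | **−19.72** | 0.04 | 0.04 | 0.83 |
| | JOINT-PQL2 | 2.13 | 0.07 | 0.08 | 0.92 | 1.44 | 0.05 | 0.05 | 0.95 |
| $\beta_2^{(7)}$ | NAIVE1 | −0.73 | 0.08 | 0.08 | 0.94 | −1.40 | 0.04 | 0.04 | 0.94 |
| | NAIVE2 | 0.26 | 0.08 | 0.08 | 0.93 | −1.26 | 0.04 | 0.04 | 0.93 |
| | JOINT-PQL1 | −9.87 | 0.07 | 0.07 | 0.92 | −7.92 | 0.04 | 0.04 | 0.94 |
| | JOINT-PQL2 | −0.45 | 0.08 | 0.08 | 0.94 | −0.54 | 0.04 | 0.04 | 0.95 |
| $\beta_3^{(7)}$ | NAIVE1 | −0.97 | 0.06 | 0.06 | 0.93 | −1.79 | 0.03 | 0.03 | 0.96 |
| | NAIVE2 | −3.75 | 0.06 | 0.06 | 0.94 | −1.72 | 0.03 | 0.03 | 0.95 |
| | JOINT-PQL1 | **−11.40** | 0.05 | 0.05 | 0.96 | **−12.47** | 0.03 | 0.03 | 0.95 |
| | JOINT-PQL2 | −2.35 | 0.06 | 0.07 | 0.93 | −2.28 | 0.03 | 0.03 | 0.94 |
| $\beta_0^{(8)}$ | NAIVE1 | 3.31 | 0.09 | 0.10 | 0.93 | 0.68 | 0.05 | 0.05 | 0.95 |
| | NAIVE2 | 3.47 | 0.09 | 0.10 | 0.93 | 0.50 | 0.05 | 0.05 | 0.95 |
| | JOINT-PQL1 | **−14.35** | 0.08 | 0.08 | 0.85 | **−13.04** | 0.05 | 0.05 | 0.72 |
| | JOINT-PQL2 | 2.98 | 0.09 | 0.09 | 0.95 | 0.62 | 0.05 | 0.06 | 0.94 |
| $\beta_1^{(8)}$ | NAIVE1 | 6.16 | 0.07 | 0.08 | 0.92 | 3.23 | 0.05 | 0.05 | 0.93 |
| | NAIVE2 | 6.43 | 0.07 | 0.08 | 0.90 | 3.20 | 0.05 | 0.05 | 0.93 |
| | JOINT-PQL1 | **−21.24** | 0.06 | 0.06 | 0.90 | **−19.48** | 0.04 | 0.04 | 0.82 |
| | JOINT-PQL2 | 5.89 | 0.07 | 0.08 | 0.92 | 2.71 | 0.05 | 0.05 | 0.93 |
| $\beta_2^{(8)}$ | NAIVE1 | 2.95 | 0.08 | 0.08 | 0.94 | −0.57 | 0.04 | 0.04 | 0.95 |
| | NAIVE2 | 1.14 | 0.08 | 0.08 | 0.93 | −0.32 | 0.04 | 0.04 | 0.94 |
| | JOINT-PQL1 | **−10.23** | 0.07 | 0.07 | 0.94 | −9.68 | 0.04 | 0.04 | 0.92 |
| | JOINT-PQL2 | 1.53 | 0.08 | 0.08 | 0.93 | 0.23 | 0.04 | 0.04 | 0.94 |
| $\beta_3^{(8)}$ | NAIVE1 | 4.15 | 0.06 | 0.06 | 0.93 | 0.32 | 0.03 | 0.03 | 0.94 |
| | NAIVE2 | 0.70 | 0.06 | 0.06 | 0.93 | 0.17 | 0.03 | 0.03 | 0.94 |
| | JOINT-PQL1 | **−12.04** | 0.05 | 0.05 | 0.94 | **−12.39** | 0.03 | 0.03 | 0.92 |
| | JOINT-PQL2 | 6.81 | 0.06 | 0.07 | 0.92 | 0.02 | 0.03 | 0.03 | 0.93 |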

Figure 2 shows the true and the averaged estimates of the conditional expectations for ℓ = 1, 2 across j. Our JOINT-PQL2 approach generally captures the true patterns, while the NAIVE1 and NAIVE2 methods show obvious bias in describing the trends of the conditional expectations. We do not present the results from the JOINT-PQL1 method in Figure 2 because it leads to patterns similar to those of the JOINT-PQL2 method.

Figure 2.


Simulation results for the conditional expectations for ℓ = 1, 2 defined in Section 4. (a)(b) moderate sample size scenario with n = 200 and Ji = 5, (c)(d) large sample size scenario with n = 400 and Ji = 9. Dotted lines denote the true conditional expectation values. Solid lines represent the averaged values of the estimates from our JOINT-PQL2 method. Shadowed areas display the 10% to 90% quantiles of the estimated values in 500 simulation runs. Thick and thin dashed lines represent the averaged estimates by the NAIVE1 and NAIVE2 methods, respectively.

Therefore, the simulation results suggest that the JOINT-PQL1 method, which does not use the second order approximation, may give biased estimates of β. The NAIVE1 and NAIVE2 approaches, which either completely or partially ignore the correlation among responses, may provide misleading conclusions in the estimation of conditional expectations with respect to multiple outcomes. On the other hand, our proposed JOINT-PQL2 method has satisfactory estimation results for both model parameters and conditional expectations.

5. Application to Physical Activity Data

In this section the proposed method is applied to the physical activity dataset collected by wearable devices. The project investigates the metabolic effects of exercise interventions to increase activity and reduce sedentary time in a group of office workers [1]. The raw activity data are obtained by a device named the ActivPAL™ (www.paltech.plus.com), which is taped to the front of the thigh and uses a vertical-axis accelerometer to measure the angle of the thigh and the frequency of body movement. For this project, Zhang et al. [19] develop an R package “PAactivPAL” to handle the ActivPAL device’s raw records, which summarizes the dense time activity information into daily averages. We focus on the following eight variables: daily sedentary hours, daily energy expenditure (measured by METs hours), the proportion of time in sedentary bouts greater than 20 minutes, the proportion of time in active bouts greater than 5 minutes, daily number of standing up activities, daily number of steps, whether daily MVPA time is greater than one hour, and whether the highest METs in 10 minutes is greater than 3.

According to the data description in Section 1, the dataset has n = 63 subjects and $Y_{ij}^{(\ell)}$ ($\ell = 1, \ldots, 8$) corresponds to the eight selected responses. All individuals are scheduled to have one day’s measurement in each of five weeks with $t_{ij} = -2, -1, 0, 1, 2$, respectively. Incomplete daily observations are removed from our analysis. Moreover, an individual i enrolled in the exercise group has $X_{ij}^{(\ell)} = 1$ for all j and ℓ, while those in the control group have $X_{ij}^{(\ell)} = 0$.

The model used in the simulation study of Section 4 is used to fit the data. This model includes intercept, group, time and a group-time interaction for fixed effects, and intercept and time for random effects. Our proposed JOINT-PQL2 approach is applied to estimate the parameters. Table 3 presents the β estimates for the eight variables. The table illustrates that four outcomes, namely daily energy expenditure levels, the daily number of steps, the probability that daily MVPA time is greater than one hour, and the probability that the highest METs in 10 minutes is greater than 3, have significantly higher levels in the exercise treatment group than in the control group. This implies that the treatment group will gain the health benefits associated with increased physical activity.
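For one subject and one outcome, the fixed and random effect design matrices implied by this model can be sketched as below. The column order (intercept, time, group, group-time interaction) is an assumption made by analogy with the simulation model of Section 4, and the function name and the optional `observed` mask for dropping incomplete days are hypothetical.

```python
import numpy as np

def design_matrices(group, t=(-2, -1, 0, 1, 2), observed=None):
    """Build X_i (fixed effects) and Z_i (random effects) for one outcome.

    group    : 1 for the exercise group, 0 for the control group
    t        : scheduled visit times
    observed : optional boolean mask; rows for incomplete days are dropped
    """
    t = np.asarray(t, dtype=float)
    g = np.full_like(t, float(group))
    # Columns: intercept, time, group indicator, group-time interaction
    X = np.column_stack([np.ones_like(t), t, g, t * g])
    # Columns: random intercept, random slope on time
    Z = np.column_stack([np.ones_like(t), t])
    if observed is not None:
        keep = np.asarray(observed, dtype=bool)
        X, Z = X[keep], Z[keep]
    return X, Z

X_ex, Z_ex = design_matrices(group=1)
X_ct, Z_ct = design_matrices(group=0, observed=[True, True, False, True, True])
print(X_ex)
print(X_ct)
```

Stacking these blocks over the eight outcomes in block diagonal form gives the matrices $X_i$ and $Z_i$ of Section 2.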

Table 3.

Physical activity data analysis results for β using our proposed JOINT-PQL2 approach. Displayed are the estimates (Est.), the standard error (SE), and p–values.

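Continuous (columns 1 to 4) and proportion (columns 5 to 8) outcomes:

| Parameter | Est. | SE | p–value | Parameter | Est. | SE | p–value |
|---|---|---|---|---|---|---|---|
| $\beta_0^{(1)}$ | 9.77 | 0.30 | < 0.01 | $\beta_0^{(3)}$ | −1.27 | 0.09 | < 0.01 |
| $\beta_1^{(1)}$ | −0.13 | 0.14 | 0.38 | $\beta_1^{(3)}$ | −0.02 | 0.04 | 0.55 |
| $\beta_2^{(1)}$ | −0.05 | 0.40 | 0.91 | $\beta_2^{(3)}$ | −0.01 | 0.12 | 0.92 |
| $\beta_3^{(1)}$ | −0.17 | 0.19 | 0.38 | $\beta_3^{(3)}$ | −0.01 | 0.05 | 0.92 |
| $\beta_0^{(2)}$ | 20.98 | 0.47 | < 0.01 | $\beta_0^{(4)}$ | −0.75 | 0.07 | < 0.01 |
| $\beta_1^{(2)}$ | 0.13 | 0.25 | 0.59 | $\beta_1^{(4)}$ | 0.01 | 0.03 | 0.80 |
| $\beta_2^{(2)}$ | 1.67 | 0.64 | 0.01 | $\beta_2^{(4)}$ | 0.10 | 0.10 | 0.30 |
| $\beta_3^{(2)}$ | 0.12 | 0.34 | 0.73 | $\beta_3^{(4)}$ | 0.04 | 0.05 | 0.36 |

Count (columns 1 to 4) and binary (columns 5 to 8) outcomes:

| Parameter | Est. | SE | p–value | Parameter | Est. | SE | p–value |
|---|---|---|---|---|---|---|---|
| $\beta_0^{(5)}$ | 3.75 | 0.05 | < 0.01 | $\beta_0^{(7)}$ | −3.70 | 0.54 | < 0.01 |
| $\beta_1^{(5)}$ | −0.01 | 0.02 | 0.70 | $\beta_1^{(7)}$ | 0.29 | 0.37 | 0.43 |
| $\beta_2^{(5)}$ | −0.00 | 0.07 | 0.97 | $\beta_2^{(7)}$ | 4.79 | 0.69 | < 0.01 |
| $\beta_3^{(5)}$ | −0.02 | 0.03 | 0.49 | $\beta_3^{(7)}$ | 0.83 | 0.48 | 0.08 |
| $\beta_0^{(6)}$ | 8.67 | 0.06 | < 0.01 | $\beta_0^{(8)}$ | −1.58 | 0.41 | < 0.01 |
| $\beta_1^{(6)}$ | 0.05 | 0.03 | 0.11 | $\beta_1^{(8)}$ | 0.52 | 0.31 | 0.10 |
| $\beta_2^{(6)}$ | 0.49 | 0.09 | 0.00 | $\beta_2^{(8)}$ | 4.08 | 0.60 | < 0.01 |
| $\beta_3^{(6)}$ | 0.05 | 0.04 | 0.25 | $\beta_3^{(8)}$ | 1.00 | 0.46 | 0.03 |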

Based on the model of the association structure among eight activity factors, it is of interest to study the pattern of daily sedentary hour and energy expenditure among particular subgroups across five weeks. We select two subsets of subjects with active and inactive behaviors at Week 0, respectively. The criteria for active performance have: (1) the proportion of sedentary bout greater than 20 minutes is less than 20%; (2) the active bouts greater than 5 minutes are more than 30%; (3) daily standing up behaviors are more than 40 times; (4) daily steps are more than 6000; (5) daily MVPA time is greater than one hour; (6) the highest METs in 10 minutes is greater than 3. The individual who reaches all six criteria at Week 0 is defined as active participant, while those who do not meet any term is taken into inactive group. Therefore, the conditional expectations to be estimated here are

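$$E\{Y_{ij}^{(\ell)} \mid Y_{i1}^{(3)} < 0.2,\ Y_{i1}^{(4)} \ge 0.3,\ Y_{i1}^{(5)} \ge 40,\ Y_{i1}^{(6)} \ge 6000,\ Y_{i1}^{(7)} = 1,\ Y_{i1}^{(8)} = 1\},$$

and

$$E\{Y_{ij}^{(\ell)} \mid Y_{i1}^{(3)} \ge 0.2,\ Y_{i1}^{(4)} < 0.3,\ Y_{i1}^{(5)} < 40,\ Y_{i1}^{(6)} < 6000,\ Y_{i1}^{(7)} = 0,\ Y_{i1}^{(8)} = 0\},$$

where $\ell = 1, 2$ and $j = 1, \ldots, 5$. We study the situations with $X_{ij}^{(\ell)} = 0$ and $X_{ij}^{(\ell)} = 1$, respectively.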
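The two conditioning events can be sketched as a small baseline classifier. This is only an illustration: the function name `classify_baseline` and its argument names are hypothetical, and the treatment of values that fall exactly on a threshold is an assumption.

```python
# Illustrative sketch; names and boundary handling are assumptions.

def classify_baseline(y3, y4, y5, y6, y7, y8):
    """Classify a participant from the six Week 0 summaries.

    y3: proportion of sedentary bouts longer than 20 minutes
    y4: proportion of active bouts longer than 5 minutes
    y5: daily number of stand-ups
    y6: daily number of steps
    y7: 1 if daily MVPA time exceeds one hour, else 0
    y8: 1 if the highest METs in 10 minutes exceeds 3, else 0
    """
    meets = [y3 < 0.2, y4 >= 0.3, y5 >= 40, y6 >= 6000, y7 == 1, y8 == 1]
    if all(meets):
        return "active"      # meets all six criteria
    if not any(meets):
        return "inactive"    # meets none of the six criteria
    return "neither"         # falls in neither conditioning event

label = classify_baseline(0.10, 0.40, 50, 7000, 1, 1)
```

Participants who meet only some of the criteria belong to neither subgroup, so the two conditional expectations describe the two ends of the baseline activity spectrum rather than a partition of the sample.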

Figures 3(a) and 3(b) show the estimated conditional expectations for active and inactive participants, with and without exercise training, across five weeks. For daily sedentary hours, the estimates suggest that active participants have lower sedentary time than inactive participants across weeks. In addition, sedentary time decreases over time in both the exercise and control groups, and it decreases faster in the exercise group. A reasonable explanation is that many subjects in both groups received lifestyle suggestions to increase spontaneous activity (e.g., taking the stairs), which reduces sedentary time. The supervised, structured exercise training for the exercise group led to further reductions in sedentary behavior. For energy expenditure, in both the exercise and control groups, active participants have higher levels than inactive ones at Week 0, but the difference is fairly small at Week 12. Subjects who were active at baseline, in both the exercise and control groups, decreased their energy expenditure by Week 12, while the less active subjects increased theirs. In this study, all participants in the exercise groups completed the same amount of exercise each week (~200 min) regardless of their activity status at Week 0. Although it is standard practice in such trials to ensure that all participants complete the same dose, these data suggest that participants who were more active at baseline decreased their energy expenditure as a result of the standard intervention. Future studies could consider tailoring exercise recommendations to activity status at Week 0 to promote increases in energy expenditure for all participants.

Figure 3.

Estimates of the conditional expectations across five weeks for daily sedentary hours ($Y_{ij}^{(1)}$) and energy expenditure levels ($Y_{ij}^{(2)}$) defined in Section 5. A participant who meets all of the criteria in Section 5 at Week 0 is defined as active, while a participant who meets none of them at Week 0 is defined as inactive. Thick and thin lines represent the estimates for active and inactive participants, respectively. Solid and dotted lines display the exercise treatment group and the control group, respectively.

6. Discussion

We have proposed a joint modeling and estimation strategy for longitudinal data with continuous, proportion, count and binary variables. To avoid the computational difficulty caused by the high dimension of the random effects, our algorithm uses an improved quasilikelihood approximation to handle non-linear outcomes and an efficient estimation method to fit the resulting multivariate linear mixed model. The simulation results are promising and suggest that the proposed method has little bias and outperforms naive approaches that ignore the association among multiple responses. The analysis of the physical activity data collected with a wearable accelerometer device demonstrates the utility of our method in applications.

The penalized quasilikelihood (PQL) method is often criticized for its biased estimates [20]. However, Goldstein and Rasbash [9] suggest that the second order approximation greatly improves the PQL approach. Vonesh et al. [21] study the asymptotic results for the PQL method and prove that it provides a consistent estimator if the number of subjects and the number of measurements per subject go to infinity. The simulation studies in this manuscript agree with these conclusions. We note that our proposed approach has a small bias in the estimation of both β and the conditional expectations with moderate sample sizes, and that the bias becomes negligible with larger sample sizes.

To compare our method with commonly used model fitting approaches, we further investigate the performance of two maximum likelihood approaches, based respectively on the Laplace approximation (implemented by the R package “lme4” [22, 23]) and on Gauss-Hermite quadrature (with 11 quadrature points). Both methods are used to fit only one binary outcome, Yi(8). The estimates of the parameter β(8) are briefly reported in the Online Supplementary Material, and they are comparable to the estimates produced by our newly proposed method. However, in our multivariate data structure with eight responses, these maximum likelihood approaches are computationally intractable.

Finally, we have discussed the approach to selecting initial values in Section 3. Following the suggestion in [8], we use small positive values for the diagonal elements of Ψ (0.001 for count outcomes and 0.1 for other types of responses). This choice works well in our simulation and application studies. It would also be interesting to consider selecting the initial Ψ from the original data. This could be implemented with either a maximum likelihood or a restricted maximum likelihood estimation procedure; see, for example, [8] and [16].
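The starting values described above can be written down explicitly. The sketch below assumes two random effects per outcome (intercept and time, as in Section 5), stacks them outcome by outcome, and sets all off-diagonal entries to zero. The stacking order, the zero off-diagonals, and the names `OUTCOME_TYPES` and `initial_psi` are assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch; ordering and zero off-diagonals are assumptions.

OUTCOME_TYPES = [
    "continuous", "continuous",   # outcomes 1-2
    "proportion", "proportion",   # outcomes 3-4
    "count", "count",             # outcomes 5-6
    "binary", "binary",           # outcomes 7-8
]

def initial_psi(outcome_types, n_random=2, count_value=0.001, other_value=0.1):
    """Diagonal starting matrix for the random-effects covariance."""
    diagonal = []
    for kind in outcome_types:
        value = count_value if kind == "count" else other_value
        diagonal.extend([value] * n_random)
    size = len(diagonal)
    return [[diagonal[r] if r == c else 0.0 for c in range(size)]
            for r in range(size)]

psi0 = initial_psi(OUTCOME_TYPES)  # 16 x 16 under the assumptions above
```

A diagonal start of this kind is positive definite by construction; the estimation iterations are then responsible for filling in the cross-outcome covariances.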

Acknowledgments

Li was supported by the Discovery Grants program of the Natural Sciences and Engineering Research Council of Canada (NSERC, RGPIN-2015-04409). Carroll was supported by a grant from the National Cancer Institute (U01-CA057030). The authors thank Sarah Kozey Keadle for making the physical activity data available to them.

Appendix

A.1. Second Order Approximation for Count and Binary Variables

For count variables, we have

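$$Y_{ij}^{*(\ell)} \equiv \exp\{-\hat\eta_{ij}^{(\ell)}\}\,[Y_{ij}^{(\ell)} - \exp\{\hat\eta_{ij}^{(\ell)}\}] + \hat\eta_{ij}^{(\ell)} - 0.5\, Z_{ij}^{(\ell)}\, \widehat{\mathrm{var}}\{u_i^{(\ell)} - \hat u_i^{(\ell)}\}\, Z_{ij}^{(\ell)T} \approx X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)} u_i^{(\ell)} + \varepsilon_{ij}^{(\ell)}, \quad \ell = 5, 6, \tag{A.1}$$

where $\varepsilon_{ij}^{(\ell)}$ has mean zero and variance $\exp\{-\hat\eta_{ij}^{(\ell)}\}$.

For binary variables, we have

$$Y_{ij}^{*(\ell)} \equiv \frac{1}{H'\{\hat\eta_{ij}^{(\ell)}\}}\,[Y_{ij}^{(\ell)} - H\{\hat\eta_{ij}^{(\ell)}\}] + \hat\eta_{ij}^{(\ell)} - \frac{1}{2H'\{\hat\eta_{ij}^{(\ell)}\}}\, H''\{\hat\eta_{ij}^{(\ell)}\}\,[Z_{ij}^{(\ell)}\, \widehat{\mathrm{var}}\{u_i^{(\ell)} - \hat u_i^{(\ell)}\}\, Z_{ij}^{(\ell)T}] \approx X_{ij}^{(\ell)}\beta^{(\ell)} + Z_{ij}^{(\ell)} u_i^{(\ell)} + \varepsilon_{ij}^{(\ell)}, \quad \ell = 7, 8, \tag{A.2}$$

where $\varepsilon_{ij}^{(\ell)}$ has mean zero and variance $1/H'\{\hat\eta_{ij}^{(\ell)}\}$.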
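To make the working responses in (A.1) and (A.2) concrete, the sketch below evaluates them for a single observation. It reads the count case as a log link and, as an assumption not stated in this appendix, uses the logistic function for H in the binary case. The working variance is taken as 1/H′(η̂), following the form given with (A.6). The function names are hypothetical, and `z_var_z` stands for the scalar Z var̂(u − û) Zᵀ, which is assumed to have been computed already.

```python
import math

def count_working_response(y, eta_hat, z_var_z):
    """Second-order working response for a count outcome (log link)."""
    mu = math.exp(eta_hat)                      # H = H' = H'' = exp(eta)
    y_star = (y - mu) / mu + eta_hat - 0.5 * z_var_z
    variance = 1.0 / mu                         # 1 / H'(eta_hat)
    return y_star, variance

def binary_working_response(y, eta_hat, z_var_z):
    """Second-order working response for a binary outcome.
    The logistic link is an assumption made for this sketch."""
    p = 1.0 / (1.0 + math.exp(-eta_hat))        # H(eta_hat)
    h1 = p * (1.0 - p)                          # H'(eta_hat)
    h2 = h1 * (1.0 - 2.0 * p)                   # H''(eta_hat)
    y_star = (y - p) / h1 + eta_hat - 0.5 * (h2 / h1) * z_var_z
    variance = 1.0 / h1                         # 1 / H'(eta_hat)
    return y_star, variance

count_star, count_var = count_working_response(y=3, eta_hat=0.0, z_var_z=0.0)
binary_star, binary_var = binary_working_response(y=1, eta_hat=0.0, z_var_z=0.2)
```

Once every non-linear outcome has been replaced by such a working response, all eight outcomes can be treated within one multivariate linear mixed model, which is the step the estimation algorithm relies on.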

A.2. Second Order Approximation in Penalized Quasilikelihood

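The derivation of the penalized quasilikelihood (PQL) based approximations for proportion outcomes follows the techniques in Molenberghs and Verbeke [17]. For simplicity, we drop the superscript $(\ell)$ from the notation. Suppose that values $(\hat\beta, \hat u_i)$ are given, so that $\hat\eta_{ij} = X_{ij}\hat\beta + Z_{ij}\hat u_i$. A first-order Taylor series expansion of $H(\eta_{ij})$ around $\hat\eta_{ij}$ gives, for $Y_{ij}$,

$$Y_{ij} \approx H(\hat\eta_{ij}) + H'(\hat\eta_{ij})(\eta_{ij} - \hat\eta_{ij}) + \varepsilon_{ij}, \tag{A.3}$$

where $\varepsilon_{ij}$ has mean zero and variance $\sigma^2 H'(\hat\eta_{ij})$. However, the approximation in (A.3) may lead to a severely biased estimator. To address this problem, we could use a quadratic Taylor expansion,

$$Y_{ij} \approx H(\hat\eta_{ij}) + H'(\hat\eta_{ij})(\eta_{ij} - \hat\eta_{ij}) + \tfrac{1}{2} H''(\hat\eta_{ij})(\eta_{ij} - \hat\eta_{ij})^2 + \varepsilon_{ij}. \tag{A.4}$$

In practice, the term $(\eta_{ij} - \hat\eta_{ij})^2$ may be intractable. Goldstein and Rasbash [9] suggest working with a simplified version of (A.4) in which the second order expansion is taken only with respect to $u_i$:

$$Y_{ij} \approx H(\hat\eta_{ij}) + H'(\hat\eta_{ij})(\eta_{ij} - \hat\eta_{ij}) + \tfrac{1}{2} H''(\hat\eta_{ij})(Z_{ij}u_i - Z_{ij}\hat u_i)^2 + \varepsilon_{ij}, \tag{A.5}$$

where the term $(Z_{ij}u_i - Z_{ij}\hat u_i)^2$ can be replaced by $Z_{ij}\,\widehat{\mathrm{var}}(u_i - \hat u_i)\,Z_{ij}^T$ in computation. Subtracting $H(\hat\eta_{ij})$ from both sides of (A.5), dividing by $H'(\hat\eta_{ij})$, and rearranging gives

$$\frac{1}{H'(\hat\eta_{ij})}\{Y_{ij} - H(\hat\eta_{ij})\} + \hat\eta_{ij} - \frac{1}{2H'(\hat\eta_{ij})} H''(\hat\eta_{ij})\{Z_{ij}\,\widehat{\mathrm{var}}(u_i - \hat u_i)\,Z_{ij}^T\} \approx \eta_{ij} + \varepsilon_{ij} = X_{ij}\beta + Z_{ij}u_i + \varepsilon_{ij}, \tag{A.6}$$

where $\varepsilon_{ij}$ now has mean zero and variance $\sigma^2/H'(\hat\eta_{ij})$. Define the left-hand side of (A.6) as the variable $Y_{ij}^{*}$,

$$Y_{ij}^{*} \equiv \frac{1}{H'(\hat\eta_{ij})}\{Y_{ij} - H(\hat\eta_{ij})\} + \hat\eta_{ij} - \frac{1}{2H'(\hat\eta_{ij})} H''(\hat\eta_{ij})\{Z_{ij}\,\widehat{\mathrm{var}}(u_i - \hat u_i)\,Z_{ij}^T\},$$

so that (A.6) implies

$$Y_{ij}^{*} \approx X_{ij}\beta + Z_{ij}u_i + \varepsilon_{ij}.$$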
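As a consistency check, the count-outcome form in (A.1) can be read as the special case of (A.6) under the log link. The short derivation below is a sketch under that reading, with the superscript $(\ell)$ dropped as in the rest of this subsection.

$$H(\eta) = \exp(\eta) \quad\Longrightarrow\quad H'(\eta) = H''(\eta) = \exp(\eta),$$

$$\frac{1}{H'(\hat\eta_{ij})} = \exp(-\hat\eta_{ij}), \qquad \frac{H''(\hat\eta_{ij})}{2H'(\hat\eta_{ij})} = \frac{1}{2},$$

$$\frac{1}{H'(\hat\eta_{ij})}\{Y_{ij} - H(\hat\eta_{ij})\} + \hat\eta_{ij} - \frac{H''(\hat\eta_{ij})}{2H'(\hat\eta_{ij})}\, Z_{ij}\,\widehat{\mathrm{var}}(u_i - \hat u_i)\,Z_{ij}^T = \exp(-\hat\eta_{ij})\{Y_{ij} - \exp(\hat\eta_{ij})\} + \hat\eta_{ij} - 0.5\, Z_{ij}\,\widehat{\mathrm{var}}(u_i - \hat u_i)\,Z_{ij}^T .$$

The right-hand side has the same three terms as the working response for count variables in (A.1): a scaled residual, the current linear predictor, and a second order correction with coefficient 0.5.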

Footnotes

Supplementary Materials

Supporting tables referenced in Sections 4 and 6 are available with this paper at the Statistics in Medicine website on Wiley Online Library.

References

  • 1. Kozey-Keadle S, Libertine A, Lyden K, Staudenmayer J, Freedson PS. Changes in sedentary time and spontaneous physical activity in response to an exercise training and/or lifestyle intervention. Journal of Physical Activity and Health. 2014;11:1324–1333. doi: 10.1123/jpah.2012-0340.
  • 2. Keadle SK, Sampson J, Li H, Lyden K, Matthews CE, Carroll RJ. An evaluation of accelerometer-derived metrics to assess daily behavioral patterns. Medicine & Science in Sports & Exercise. 2017;49:54–63. doi: 10.1249/MSS.0000000000001073.
  • 3. Verbeke G, Fieuws S, Molenberghs G, Davidian M. The analysis of multivariate longitudinal data: A review. Statistical Methods in Medical Research. 2014;23:42–59. doi: 10.1177/0962280212445834.
  • 4. Fieuws S, Verbeke G. Pairwise fitting of mixed models for the joint modeling of multivariate longitudinal profiles. Biometrics. 2006;62:424–431. doi: 10.1111/j.1541-0420.2006.00507.x.
  • 5. Fieuws S, Verbeke G, Boen F, Delecluse C. High dimensional multivariate mixed models for binary questionnaire data. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2006;55:449–460.
  • 6. Gueorguieva RV, Agresti A. A correlated probit model for joint modeling of clustered binary and continuous responses. Journal of the American Statistical Association. 2001;96:1102–1112.
  • 7. Buu A, Li R, Tan X, Zucker RA. Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field. Statistics in Medicine. 2012;31:4074–4086. doi: 10.1002/sim.5510.
  • 8. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association. 1993;88:9–25.
  • 9. Goldstein H, Rasbash J. Improved approximations for multilevel models with binary responses. Journal of the Royal Statistical Society. Series A. 1996;159:505–513.
  • 10. Li H, Staudenmayer J, Carroll RJ. Hierarchical functional data with mixed continuous and binary measurements. Biometrics. 2014;70:802–811. doi: 10.1111/biom.12211.
  • 11. Schafer JL. Some improved procedures for linear mixed models. Technical report, The Methodological Center, The Pennsylvania State University; 1998.
  • 12. Lindstrom MJ, Bates DM. Newton-Raphson and EM algorithms for linear mixed-effects models for repeated-measures data. Journal of the American Statistical Association. 1988;83:1014–1022.
  • 13. Broström G, Holmberg H. Generalized linear models with clustered data: Fixed and random effects models. Computational Statistics and Data Analysis. 2011;55:3123–3134.
  • 14. Ferrari S, Cribari-Neto F. Beta regression for modelling rates and proportions. Journal of Applied Statistics. 2004;31:799–815.
  • 15. Lindstrom MJ, Bates DM. Nonlinear mixed effects models for repeated measures data. Biometrics. 1990;46:673–687.
  • 16. Wolfinger R, O’Connell M. Generalized linear mixed models: a pseudo-likelihood approach. Journal of Statistical Computation and Simulation. 1993;48:233–243.
  • 17. Molenberghs G, Verbeke G. Models for Discrete Longitudinal Data. Springer; 2005.
  • 18. Liu C, Rubin DB. The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika. 1994;81:633–648.
  • 19. Zhang Y, Li H, Kozey-Keadle S, Matthews CE, Carroll RJ. PAactivPAL: Summarize Daily Physical Activity from ’activPAL’ Accelerometer Data. R package version 1.0. 2015.
  • 20. Rodriguez G, Goldman N. Improved estimation procedures for multilevel models with binary response: A case-study. Journal of the Royal Statistical Society. Series A (Statistics in Society). 2001;164:339–355.
  • 21. Vonesh EF, Wang H, Nie L, Majumdar D. Conditional second-order generalized estimating equations for generalized linear and nonlinear mixed-effects models. Journal of the American Statistical Association. 2002;97:271–283.
  • 22. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2017.
  • 23. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. Journal of Statistical Software. 2015;67:1–48.
