Summary
For longitudinal data, mixed models include random subject effects to indicate how subjects influence their responses over repeated assessments. The error variance and the variance of the random effects are usually considered to be homogeneous. These variance terms characterize the within-subjects (i.e., error variance) and between-subjects (i.e., random-effects variance) variation in the data. In studies using ecological momentary assessment (EMA), up to 30 or 40 observations are often obtained for each subject, and interest frequently centers around changes in the variances, both within and between subjects. In this article, we focus on an adolescent smoking study using EMA where interest is on characterizing changes in mood variation. We describe how covariates can influence the mood variances, and also extend the standard mixed model by adding a subject-level random effect to the within-subject variance specification. This permits subjects to have influence on the mean, or location, and variability, or (square of the) scale, of their mood responses. Additionally, we allow the location and scale random effects to be correlated. These mixed-effects location scale models have useful applications in many research areas where interest centers on the joint modeling of the mean and variance structure.
Keywords: Complex variation, Heteroscedasticity, Log-linear variance, Multilevel, Variance modeling
1. Introduction
Mixed-effects regression models (MRMs) have become a primary method for analysis of longitudinal data (Verbeke and Molenberghs, 2000). A basic characteristic of these models is the inclusion of random subject effects into regression models in order to account for the influence of subjects on their repeated observations. These random effects reflect each person’s growth or development across time, and the variance of these random effects indicates the degree of variation that exists in the population of subjects. Typically, the error variance, which characterizes the within-subjects (WS) variance, and the variance of the random effects, which characterizes the between-subjects (BS) variance, are treated as being homogeneous across subject groups or levels of covariates. However, these homogeneity of variance assumptions can be relaxed by modeling differences in variances, both between and within, across subject groups. The study of intraindividual variability has received increasing attention (Hertzog and Nesselroade, 2003; Fleeson, 2004; Martin and Hofer, 2004; Nesselroade, 2004); these articles describe many of the conceptual issues and some traditional statistical approaches for examining such variation. MRMs can be used to broaden this notion by assessing the determinants of both intraindividual (within-subjects) and interindividual (between-subjects) variation (Wolfinger and Tobias, 1998).
Modern data collection procedures, such as ecological momentary assessments (EMA) and/or real-time data captures, have been developed to record the momentary events and experiences of subjects in daily life (Bolger, Davis, and Rafaeli, 2003). These procedures yield relatively large numbers of subjects and observations per subject, and data from such designs are sometimes referred to as intensive longitudinal data (Walls and Schafer, 2006). Such designs are in keeping with the “bursts of measurement” approach described by Nesselroade and McCollam (2000), who called for such an approach in order to assess intraindividual variability. As noted by Nesselroade and McCollam (2000), such bursts of measurement increase the research burden in several ways; however, they are necessary for studying intraindividual variation and to explain why subjects differ in variability rather than solely in mean level (Bolger et al., 2003). In this article we describe data from a natural history study of adolescent smoking, using EMA, where interest was on determinants of the variation in the adolescents’ moods.
Mixed model analysis of EMA data is well described in Schwartz and Stone (2007). Additionally, a few articles have described approaches for examining determinants of BS and WS variance from EMA studies. Penner et al. (1994) used basic descriptive statistical methods to examine relationships among WS variation in several mood variables. More recently, Hedeker, Berbaum, and Mermelstein (2006) and Hedeker and Mermelstein (2007) have described mixed model approaches incorporating a log-linear structure for determinants of the WS variance. In this article, we extend these approaches in several ways. First, we include log-linear models for both the WS and BS variance, allowing covariates to potentially influence both sources of variation. More importantly, we also allow the inclusion of a random subject effect to the WS variance specification. This permits the WS variance to vary at the subject level, above and beyond the influence of covariates on this variance.
Models with random WS variance effects have been considered to some extent in the statistical literature. In these models, there are one or more random effects that characterize an individual’s mean response or location, and an additional random effect that characterizes the variability around an individual’s mean response. The latter is typically specified in terms of the standard deviation, and dubbed a random scale parameter. The technical report by Cleveland, Denby, and Liu (2002) provides a detailed description of this general class of models and summarizes much of the relevant work. These models have been developed using both Bayesian (Lindley, 1971; Leonard, 1975; Myles et al., 2003) and frequentist approaches (James et al., 1994; Chinchilli, Esinhart, and Miller, 1995; Lin et al., 1997). Most of these authors take the random scale distribution to be square root inverse gamma, though some consider the log-normal distribution. In this regard, we also specify the log-normal distribution for the (square of the) random scale effects. Furthermore, we allow the random scale effect to be correlated with the random location effect. This yields a more general and realistic specification for the random effects. Additionally, because the log-normal specification is equivalent to assuming a normal distribution for the log of the (square of the) random scale effects, standard software (i.e., SAS Institute, 2000, PROC NLMIXED) can be used to estimate these models, which broadens the potential application of this approach. A syntax example is provided in the Web Appendix to facilitate this.
2. Adolescent Smoking Study
Data from a natural history study of adolescent smoking motivated the application of the location scale mixed model. Students included in this study were either in 9th or 10th grade at baseline, 55.1% female, and self-reported on a screening questionnaire 6–8 weeks prior to baseline that they had smoked at least one cigarette in their lifetime. The majority (57.6%) had smoked at least one cigarette in the past month at baseline. Written parental consent and student assent were required for participation. A total of 461 students completed the baseline measurement wave. The study utilized a multi-method approach to assess adolescents in terms of self-report questionnaires, a week-long time/event sampling method via palmtop computers (EMA), and in-depth interviews.
Here, we focus on the EMA data. Adolescents carried the hand-held computers with them at all times during a data collection period of seven consecutive days and were trained both to respond to random prompts from the computers and to event record (initiate a data collection interview) smoking episodes. Questions included ones about place, activity, companionship, mood, and other subjective items. The hand-held computers date- and time-stamped each entry. For the analyses reported here, we used only the responses obtained from the random prompts. In all, there were 14,105 random prompts obtained from the 461 students with an approximate average of 30 prompts per student (range = 7–71).
Two outcomes were considered: measures of the subject’s negative and positive affect (NA and PA) at each random prompt. Both of these measures consisted of the average of several individual mood items, each rated from 1 to 10, that were identified via factor analysis. Specifically, PA consisted of the following items that reflected a subject’s assessment of their positive mood before the prompt signal: I felt happy, I felt relaxed, I felt cheerful, I felt confident, and I felt accepted by others. Similarly, NA consisted of the following items assessing preprompt negative mood: I felt sad, I felt stressed, I felt angry, I felt frustrated, and I felt irritable. Over all prompts, and ignoring the clustering of the data, the marginal mean of PA was 6.797 (SD = 1.935), whereas the NA marginal mean was 3.455 (SD = 2.253).
Of interest to the investigators was the degree of heterogeneity in these mood measures in terms of both WS and BS variation. Furthermore, it was of interest to examine whether certain covariates could explain some of the variation in these two sources of heterogeneity, over and above the influence of these covariates on the mean response. It also seemed reasonable to allow random subject effects for both the mean response (to allow for subjects with different average levels of mood) and for a subject’s WS variance (to allow for different levels of mood consistency). These considerations led to the application of the mixed location scale model.
3. Mixed Location Scale Model
Consider the following MRM for the affect measurement y, either NA or PA, of student i (i = 1, 2, …, N subjects) on occasion j (j = 1, 2, …, ni occasions):
$$y_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \upsilon_i + \varepsilon_{ij} \tag{1}$$
where xij is the p × 1 vector of regressors (typically including a “1” for the intercept as the first element) and β is the corresponding p × 1 vector of regression coefficients. The random subject effect υi indicates the influence of individual i on his/her repeated mood assessments. The population distribution of these random effects is usually assumed to be a normal distribution with zero mean and variance $\sigma^2_\upsilon$. The errors εij are also assumed to be normally distributed in the population with zero mean and variance $\sigma^2_\varepsilon$, and independent of the random effects. Here, $\sigma^2_\upsilon$ represents the BS variance and $\sigma^2_\varepsilon$ is the WS variance. To allow covariates to influence these variances, we can utilize a log-linear representation, as has been described in the context of heteroscedastic (fixed-effects) regression models (Harvey, 1976; Aitkin, 1987), namely,
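$$\sigma^2_{\upsilon_i} = \exp(\mathbf{u}_i'\boldsymbol{\alpha}) \tag{2}$$
$$\sigma^2_{\varepsilon_{ij}} = \exp(\mathbf{w}_{ij}'\boldsymbol{\tau}) \tag{3}$$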
The variances are subscripted by i and j to indicate that their values change depending on the values of the covariates ui and wij (and their coefficients). The number of parameters associated with these variances does not vary with i or j. Both ui and wij would usually include a (first) column of ones for the reference BS and WS variances (α0 and τ0), respectively. Thus, the BS variance equals exp(α0) when the subject-level covariates ui equal 0, and is increased or decreased as a function of these covariates and their coefficients α. Specifically, for a particular covariate u*, if α* > 0, then the BS variance increases as u* increases (and vice versa if α* < 0). Note that the exponential function ensures a positive multiplicative factor for any finite value of α, and so the resulting variance is guaranteed to be positive. The WS variance is modeled in the same way, with the exception that both time-varying and subject-varying covariates can influence the WS variance. For this reason, the covariate vector is indicated as wij for the WS variance. Thus, this model allows both subject-varying and time-varying covariates to influence the WS variance, but only subject-varying variables to influence the BS variance. The coefficients in α and τ indicate the degree of influence on the BS and WS variances, respectively, and the ordinary random intercept model is obtained as a special case if α = τ = 0 for all covariates in ui and wij (i.e., excluding the reference variances α0 and τ0).
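The log-linear variance models in (2) and (3) can be illustrated numerically. The following is a minimal sketch only: the function names and the coefficient values are hypothetical and chosen for illustration, not taken from the study.

```python
import numpy as np

def bs_variance(u, alpha):
    """BS variance exp(u' alpha), as in equation (2)."""
    return np.exp(np.asarray(u, dtype=float) @ np.asarray(alpha, dtype=float))

def ws_variance(w, tau):
    """WS variance exp(w' tau), as in equation (3)."""
    return np.exp(np.asarray(w, dtype=float) @ np.asarray(tau, dtype=float))

# Hypothetical coefficients: an intercept plus one binary covariate.
alpha = np.array([0.4, -0.3])  # reference log BS variance, covariate effect
tau = np.array([0.6, 0.2])     # reference log WS variance, covariate effect

reference = bs_variance([1, 0], alpha)  # covariate = 0, so exp(alpha_0)
shifted = bs_variance([1, 1], alpha)    # covariate = 1, so exp(alpha_0 + alpha_1)
ratio = shifted / reference             # multiplicative effect, exp(alpha_1)

ws_reference = ws_variance([1, 0], tau)
ws_shifted = ws_variance([1, 1], tau)
```

Because the covariate effect enters through the exponential, it acts multiplicatively: a negative coefficient shrinks the variance by the factor exp(α*) and a positive one inflates it, and the result is positive in either case.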
We can further allow the WS variance to vary across individuals, above and beyond the contribution of covariates, namely,
$$\sigma^2_{\varepsilon_{ij}} = \exp(\mathbf{w}_{ij}'\boldsymbol{\tau} + \omega_i) \tag{4}$$
where the random subject effects ωi are distributed in the population of subjects with mean 0 and variance $\sigma^2_\omega$. Note that taking logs yields $\log(\sigma^2_{\varepsilon_{ij}}) = \mathbf{w}_{ij}'\boldsymbol{\tau} + \omega_i$, which indicates that if the distribution of ωi is specified as normal, then the random effects serve as log-normal subject-specific perturbations of the WS variance. In other words, the WS variances follow a log-normal distribution at the individual level. The skewed, nonnegative nature of the log-normal distribution makes it a reasonable choice for representing variances, and it has been used in many diverse research areas for this purpose (Leonard, 1975; Shenk, White, and Burnham, 1998; Fowler and Whitlock, 1999; Vasseur, 1999; Renò and Rizza, 2003).
In this model, υi is a random effect that influences an individual’s mean or location, and ωi is a random effect that influences an individual’s variance or (square of the) scale. Thus, we dub the model with both types of random effects as a mixed-effects location scale model. These two random effects are correlated with covariance parameter συω. This covariance parameter indicates the degree to which the random location and scale effects are associated with each other. As we will see in the results, this parameter can be useful to account for ceiling and floor effects of measurement.
It is convenient to represent the random effects in standardized form (i.e., as standard normals). For this, we can use the Cholesky factorization (Bock, 1975).
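$$\begin{bmatrix} \upsilon_i \\ \omega_i \end{bmatrix} = \begin{bmatrix} \sigma_{\upsilon_i} & 0 \\ \sigma_{\upsilon\omega}/\sigma_{\upsilon_i} & \sqrt{\sigma^2_\omega - \sigma^2_{\upsilon\omega}/\sigma^2_{\upsilon_i}} \end{bmatrix} \begin{bmatrix} \theta_{1i} \\ \theta_{2i} \end{bmatrix} \tag{5}$$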
Here, we include the subscript i on the Cholesky elements because the BS variance varies with subjects. The model can now be written as
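$$y_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \sigma_{\upsilon_i}\theta_{1i} + \varepsilon_{ij} \tag{6}$$
where $\sigma_{\upsilon_i} = \sqrt{\exp(\mathbf{u}_i'\boldsymbol{\alpha})}$, and the errors εij have variance given by
$$\sigma^2_{\varepsilon_{ij}} = \exp\!\left(\mathbf{w}_{ij}'\boldsymbol{\tau} + \frac{\sigma_{\upsilon\omega}}{\sigma_{\upsilon_i}}\,\theta_{1i} + \sqrt{\sigma^2_\omega - \frac{\sigma^2_{\upsilon\omega}}{\sigma^2_{\upsilon_i}}}\;\theta_{2i}\right) \tag{7}$$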
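The standardized random effects θ1i and θ2i are both normally distributed with mean 0 and variance 1, and are independent of each other. The expectation of yij, E(yij), is simply $\mathbf{x}_{ij}'\boldsymbol{\beta}$. Additionally, because $E(\sigma^2_{\varepsilon_{ij}}) = \exp(\mathbf{w}_{ij}'\boldsymbol{\tau} + \tfrac{1}{2}\sigma^2_\omega)$, the variance of yij is given by
$$V(y_{ij}) = \exp(\mathbf{u}_i'\boldsymbol{\alpha}) + \exp(\mathbf{w}_{ij}'\boldsymbol{\tau} + \tfrac{1}{2}\sigma^2_\omega) \tag{8}$$
The covariance for any two observations nested within the same individual i equals
$$C(y_{ij}, y_{ij'}) = \exp(\mathbf{u}_i'\boldsymbol{\alpha}), \quad j \neq j' \tag{9}$$
This covariance can be expressed as a correlation, in which case it yields the intraclass correlation (ICC), denoted as rij,
$$r_{ij} = \frac{\exp(\mathbf{u}_i'\boldsymbol{\alpha})}{\exp(\mathbf{u}_i'\boldsymbol{\alpha}) + \exp(\mathbf{w}_{ij}'\boldsymbol{\tau} + \tfrac{1}{2}\sigma^2_\omega)} \tag{10}$$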
Note that the ICC, which represents the proportion of total unexplained variation that is at the subject level, can be obtained for specific values of the covariates ui and wij. Thus, based on the current model, the ICC is allowed to vary as a function of both time-varying and time-invariant covariates.
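To make the standardized representation concrete, the sketch below simulates data from an intercept-only version of the model using the Cholesky form of the random effects, and compares the empirical marginal variance with the sum of the BS variance and the expected WS variance. The function name `simulate`, the sample sizes, and the seed are assumptions for illustration; the parameter values are the covariate-free positive affect estimates reported in Table 1. Simulated responses are not restricted to the 1 to 10 range of the mood scale.

```python
import numpy as np

def simulate(n_subjects, n_obs, beta0, alpha0, tau0, var_omega, cov_uw, seed=0):
    """Simulate an intercept-only mixed-effects location scale model."""
    rng = np.random.default_rng(seed)
    sd_u = np.sqrt(np.exp(alpha0))      # BS standard deviation of location
    s2 = cov_uw / sd_u                  # off-diagonal Cholesky element
    s3 = np.sqrt(var_omega - s2 ** 2)   # needs var_omega > cov_uw**2 / exp(alpha0)
    theta1 = rng.standard_normal(n_subjects)
    theta2 = rng.standard_normal(n_subjects)
    upsilon = sd_u * theta1             # random location effect
    omega = s2 * theta1 + s3 * theta2   # random scale effect, correlated with location
    ws_sd = np.sqrt(np.exp(tau0 + omega))          # subject-specific WS SD
    errors = rng.standard_normal((n_subjects, n_obs))
    return beta0 + upsilon[:, None] + ws_sd[:, None] * errors

# Covariate-free positive affect estimates from Table 1.
y = simulate(n_subjects=20000, n_obs=30, beta0=6.779, alpha0=0.367,
             tau0=0.622, var_omega=0.518, cov_uw=-0.386, seed=1)

empirical_var = y.var()
implied_var = np.exp(0.367) + np.exp(0.622 + 0.5 * 0.518)

# Association between subject means and subject log variances.
mean_logvar_corr = np.corrcoef(y.mean(axis=1),
                               np.log(y.var(axis=1, ddof=1)))[0, 1]
```

With a negative location-scale covariance, as estimated for positive affect, the simulated subject means and log variances should be negatively associated, mirroring the interpretation given in the results.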
4. Estimation
The model can be written in terms of the ni × 1 vector yi of affect responses of student i, either NA or PA, as
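$$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \sigma_{\upsilon_i}\theta_{1i}\mathbf{1}_i + \exp\!\left\{\tfrac{1}{2}\left[\mathbf{W}_i\boldsymbol{\tau} + \left(\frac{\sigma_{\upsilon\omega}}{\sigma_{\upsilon_i}}\,\theta_{1i} + \sqrt{\sigma^2_\omega - \frac{\sigma^2_{\upsilon\omega}}{\sigma^2_{\upsilon_i}}}\;\theta_{2i}\right)\mathbf{1}_i\right]\right\} \odot \mathbf{e}_i \tag{11}$$
where Xi is the ni × p (location) covariate matrix influencing the mean of yi, Wi is the ni × r (scale) covariate matrix influencing the WS variance of yi, and 1i is a ni × 1 vector of ones. Here the exponential is applied elementwise and ⊙ denotes the elementwise product. Similar to the standardized random effects θ1i and θ2i, the errors, which comprise the vector ei, are each standard normals in this representation of the model.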
Marginally, the yi are distributed as independent normals with mean Xiβ and variance–covariance matrix $\sigma^2_{\upsilon_i}\mathbf{1}_i\mathbf{1}_i' + \mathrm{diag}(\sigma^2_{\varepsilon_{ij}})$, where these variances are given in (2) and (7), respectively. The marginal density of yi can be expressed as h(yi) = ∫θ f(yi | θ) g(θ) dθ, where f(yi | θ) represents the normal distribution of yi, given the random effects θ1i and θ2i, and g(θ) represents the standard bivariate normal density. The marginal log likelihood from the sample of N subjects is then obtained as $\log L = \sum_{i=1}^{N} \log h(\mathbf{y}_i)$. Maximizing this log likelihood yields maximum likelihood (ML) estimates, which are sometimes referred to as maximum marginal likelihood estimates (Bock, 1989) because integrating the joint likelihood of random effects and responses over the distribution of random effects translates to marginalization of the data distribution. SAS PROC NLMIXED can be used to readily obtain the ML estimates for this model. An example of the syntax necessary for this is provided in the Web Appendix.
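The integral defining h(yi) has no closed form once the random scale effect is included, so it must be approximated numerically. As a rough illustration, the sketch below evaluates log h(yi) for one subject with ordinary (non-adaptive) Gauss–Hermite quadrature over the two standardized random effects. This is an assumption-laden illustration and not a description of the algorithm used by PROC NLMIXED; the function name `subject_loglik`, the number of nodes, and the example responses are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def subject_loglik(y, X, W, u, beta, alpha, tau, var_omega, cov_uw, n_nodes=15):
    """Approximate log h(y_i) by Gauss-Hermite quadrature over (theta_1, theta_2)."""
    nodes, weights = hermegauss(n_nodes)                  # weight function exp(-x^2 / 2)
    log_w = np.log(weights) - 0.5 * np.log(2.0 * np.pi)   # rescale to a N(0, 1) density
    sd_u = np.sqrt(np.exp(u @ alpha))                     # BS standard deviation
    s2 = cov_uw / sd_u                                    # Cholesky elements for the
    s3 = np.sqrt(var_omega - s2 ** 2)                     # random scale effect
    t1, t2 = np.meshgrid(nodes, nodes, indexing="ij")     # all pairs of nodes
    pair_log_w = log_w[:, None] + log_w[None, :]
    mean = (X @ beta)[:, None, None] + sd_u * t1          # conditional mean of each y_ij
    log_var = (W @ tau)[:, None, None] + s2 * t1 + s3 * t2  # conditional log WS variance
    resid = y[:, None, None] - mean
    log_f = -0.5 * (np.log(2.0 * np.pi) + log_var + resid ** 2 / np.exp(log_var))
    terms = pair_log_w + log_f.sum(axis=0)                # log weight + log f(y_i | theta)
    top = terms.max()
    return top + np.log(np.exp(terms - top).sum())        # stable log-sum-exp

# Hypothetical responses for one subject, intercept-only design matrices.
y_i = np.array([6.5, 7.0, 5.8, 7.4])
ones = np.ones((y_i.size, 1))
ll = subject_loglik(y_i, ones, ones, np.array([1.0]),
                    beta=np.array([6.779]), alpha=np.array([0.367]),
                    tau=np.array([0.622]), var_omega=0.518, cov_uw=-0.386)
```

Summing such terms over subjects gives the marginal log likelihood, which a general-purpose optimizer could then maximize. With many observations per subject the integrand becomes sharply peaked, so a fixed grid of nodes can be inaccurate; adaptive quadrature, which recenters and rescales the nodes for each subject, is the usual remedy.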
5. Results
Table 1 lists the estimates of the location scale model of both outcomes, NA and PA, without any covariates.
Table 1.
Positive and negative affect—maximum likelihood estimates and standard errors (SE)
| Parameter | Positive affect: Estimate | Positive affect: SE | Negative affect: Estimate | Negative affect: SE |
|---|---|---|---|---|
| Mean β0 | 6.779 | 0.058 | 3.482 | 0.071 |
| WS variance τ0 | 0.622 | 0.036 | 0.741 | 0.047 |
| BS variance of location α0 | 0.367 | 0.069 | 0.793 | 0.069 |
| BS variance of scale σ²ω | 0.518 | 0.039 | 0.963 | 0.069 |
| Covariance συω | −0.386 | 0.048 | 0.765 | 0.080 |
| BS variance = exp(α0) | 1.443 | | 2.210 | |
| WS variance = exp(τ0 + ½σ²ω) | 2.413 | | 3.396 | |
| ICC | 0.374 | | 0.394 | |
As can be seen from the results in Table 1, there is considerable heterogeneity of scale BS. In other words, subjects differ in terms of their PA and NA variation. For both PA and NA, the estimates of $\sigma^2_\omega$ greatly exceed their standard error estimates. Also, the covariance estimates are relatively large and of opposite sign for PA and NA. For NA, the positive covariance indicates that subjects who are high in terms of their NA mean also exhibit greater variation in NA. Thus, subjects with relatively poor moods (higher NA) fluctuate more in their mood. This could reflect the notion that better moods (less NA) may be more “trait-like” and not as reactive to different situational cues. Alternatively, this positive association might reflect a floor effect of measurement. Namely, as noted, the marginal mean of NA is about 3.5, or relatively low and toward the minimum of this scale. Thus, subjects who are lower than average have relatively less room to vary downward than those who are above average. Conversely, the negative covariance for PA indicates that subjects who are relatively high in terms of their mean PA are less varied across prompts in their PA responses. Again, this could suggest that more positive moods reflect trait-like positivity, or this could be a ceiling effect of measurement, because the marginal mean of PA is about 6.8, or toward the maximum value for this scale. Thus, one might argue that subjects with greater than average PA have relatively less room for upward movement, and so less variation. Finally, though the estimated ICCs are similar for both variables, both the BS and WS variances are larger for NA, relative to PA.
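The derived quantities in the bottom rows of Table 1 follow directly from the log-scale estimates: the BS variance is the exponential of α0, the (expected) WS variance is the exponential of τ0 plus half the scale variance, and the ICC is their ratio to the total. A short sketch of that arithmetic, with the helper name `derived_variances` being an assumption for illustration:

```python
import math

def derived_variances(alpha0, tau0, var_omega):
    """Return (BS variance, expected WS variance, ICC) for a covariate-free model."""
    bs = math.exp(alpha0)
    ws = math.exp(tau0 + 0.5 * var_omega)   # mean of a log-normal WS variance
    return bs, ws, bs / (bs + ws)

# Log-scale estimates from Table 1.
pa_bs, pa_ws, pa_icc = derived_variances(alpha0=0.367, tau0=0.622, var_omega=0.518)
na_bs, na_ws, na_icc = derived_variances(alpha0=0.793, tau0=0.741, var_omega=0.963)
```

The half-variance term matters: ignoring it would understate the average WS variance, and more so for NA, whose scale variance is nearly twice that of PA.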
Based on these models without covariates, the empirical Bayes estimates of the random effects υi and ωi were obtained. For PA, Table 2 presents a comparison of the empirical Bayes estimates of the location parameters (specifically, β̂0 + υ̂i) with observed subject means, and also a similar comparison of the empirical Bayes scale parameter estimates (i.e., $\sqrt{\exp(\hat{\tau}_0 + \hat{\omega}_i)}$) with observed subject standard deviations.
Table 2.
Observed versus model-based subject means, standard deviations (SDs), and slopes of positive affect location and scale
| Sample | N | Location: mean of ȳi | Location: mean of ŷi | Location: SD of ȳi | Location: SD of ŷi | Location: slope | Scale: mean of Syi | Scale: mean of σ̂yi | Scale: SD of Syi | Scale: SD of σ̂yi | Scale: slope |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall | 461 | 6.78 | 6.78 | 1.24 | 1.18 | 0.95 | 1.44 | 1.40 | 0.51 | 0.44 | 0.86 |
| ni ≤ 15 | 21 | 6.63 | 6.63 | 1.04 | 0.90 | 0.86 | 1.49 | 1.41 | 0.62 | 0.43 | 0.69 |
| ni ≥ 43 | 17 | 6.50 | 6.49 | 1.39 | 1.35 | 0.97 | 1.56 | 1.52 | 0.62 | 0.55 | 0.89 |
| Erratic | 13 | 5.63 | 5.66 | 0.92 | 0.74 | 0.80 | 2.71 | 2.48 | 0.21 | 0.19 | 0.81 |
| Consistent | 16 | 8.36 | 8.36 | 1.11 | 1.10 | 0.99 | 0.49 | 0.55 | 0.08 | 0.08 | 0.95 |
ȳi = observed mean of responses for subject i.
ŷi = β̂0 + υ̂i = model-based mean for subject i.
Syi = observed standard deviation of responses for subject i.
σ̂yi = $\sqrt{\exp(\hat{\tau}_0 + \hat{\omega}_i)}$ = model-based standard deviation for subject i.
Results are presented for the entire sample, for subjects with relatively few and many prompts, and for subjects with large estimates of scale ω̂i (erratic subjects) and small estimates of scale ω̂i (consistent subjects). The table lists the means and standard deviations of the sample statistics and model-based estimates, as well as the slope obtained from regressing the model-based estimates on the sample statistics. As can be seen from Table 2, overall, the model-based estimates of location are nearly identical to the sample means, with a small degree of shrinkage (i.e., smaller SD for the model-based estimates and slope < 1). This is to be expected given the empirical Bayes nature of these estimates. Table 2 also shows that the shrinkage is greater for subjects with fewer observations and for more erratic subjects, and less for subjects with many observations and for more consistent subjects. Similar observations apply to the scale estimates, although the degree of shrinkage is greater in all cases.
Next, we included several covariates into the models. Subject-level covariates included Smoker (an indicator of whether the student is a current smoker, coded no = 0 or yes = 1; this was determined based on whether or not the subject provided at least one smoking event during the week-long data collection period), Male (coded 0 = female or 1 = male), Grade10 (coded 0 = 9th or 1 = 10th grade), NovSeek (a measure of novelty seeking), and NegMoodReg (a measure of negative mood regulation). It is worth noting that these subject-level covariates were relatively uncorrelated with each other. All pairwise correlations among these five variables were below 0.10 with the exception of Male with NegMoodReg, in which a positive correlation of 0.23 was observed (males were higher in terms of negative mood regulation). In terms of prompt-level covariates, we considered the day of the week and whether the subject was alone or not (coded 0 = not alone or 1 = alone) at the time of the prompt. For the latter, we created both a BS and WS version (AloneBS and AloneWS) as described in Neuhaus and Kalbfleisch (1998), namely, Xij = X̄i + (Xij − X̄i). Note that AloneBS, the first term on the right-hand side, equals the proportion of random prompts in which a subject was alone, and AloneWS, the latter term on the right-hand side, is the prompt-specific deviation relative to this proportion (i.e., it equals either 0 – AloneBS or 1– AloneBS if the subject was not alone or was alone, respectively, for the given random prompt). For day of week, we created six indicator variables using Monday as the reference.
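The decomposition of a prompt-level covariate into its BS and WS parts can be sketched as follows. The function name `decompose` and the small example data are hypothetical; only the decomposition itself (subject mean plus deviation from the subject mean) comes from the text.

```python
import numpy as np

def decompose(subject_ids, x):
    """Split a prompt-level covariate into a subject mean and a within-subject deviation."""
    subject_ids = np.asarray(subject_ids)
    x = np.asarray(x, dtype=float)
    _, index = np.unique(subject_ids, return_inverse=True)  # subject index per prompt
    sums = np.bincount(index, weights=x)                    # per-subject totals
    counts = np.bincount(index)                             # prompts per subject
    x_bs = (sums / counts)[index]   # e.g., AloneBS: proportion of prompts alone
    x_ws = x - x_bs                 # e.g., AloneWS: prompt-specific deviation
    return x_bs, x_ws

# Hypothetical data: two subjects with four and two prompts.
ids = [1, 1, 1, 1, 2, 2]
alone = [1, 0, 0, 1, 0, 0]
alone_bs, alone_ws = decompose(ids, alone)
```

By construction the WS part sums to zero within each subject, so it is uncorrelated with the BS part and the two effects can be estimated separately.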
Tables 3 and 4 present the results for NA and PA, respectively, of the mixed location scale model and, for comparison purposes, a simpler random intercept model (i.e., a model which, in terms of the variance modeling, does not include covariate effects on the BS and WS variances, nor random scale variance, but does include the usual scalar BS and WS variance terms, albeit parameterized as α0 and τ0, respectively). In these tables, the first column lists the estimated regression coefficients (β̂) and their standard errors of the simpler random intercept model. The second column of these tables then lists the estimated regression coefficients (β̂) and their standard errors from the mixed location scale model. The third column lists the estimated effects pertaining to the BS variance in the mixed location scale model, namely, α̂0 (i.e., the estimate of the reference BS variance, which is listed in the row labeled “Intercept”) and the covariate effects in α̂. These estimates are on the natural log scale. We modeled the BS variance in terms of subject-level variables to examine how subject heterogeneity varies as a function of these specific subject characteristics. The final column lists the mixed location scale model estimates of the WS variance, both τ̂0 (i.e., the estimate of the reference WS variance, listed in the row labeled “Intercept”) and the covariate effects in τ̂. Again, these estimates are on the natural log scale. For the WS variance, we included both subject- and prompt-varying covariates. Finally, estimates pertaining to the random scale effects, the random WS variance and covariance (συω), are listed in the final column near the bottom.
Table 3.
Mixed models of negative affect, N = 461 and Σni = 14,105, maximum likelihood estimates (standard errors)
| Covariate | Random intercept: Mean (β) | Mixed location scale: Mean (β) | Mixed location scale: BS var (α) | Mixed location scale: WS var (τ) |
|---|---|---|---|---|
| Intercept | 4.464*** (0.390) | 4.382*** (0.383) | 1.244*** (0.359) | 0.617* (0.269) |
| Smoker | 0.236 (0.128) | 0.242 (0.124) | 0.120 (0.119) | 0.211* (0.088) |
| NovSeek | 0.180 (0.099) | 0.190 (0.097) | −0.145 (0.087) | 0.222** (0.068) |
| NegMoodReg | −0.799*** (0.097) | −0.765*** (0.095) | −0.245* (0.095) | −0.277*** (0.067) |
| Male | −0.428** (0.136) | −0.366** (0.130) | −0.217 (0.126) | −0.375*** (0.094) |
| Grade10 | 0.085 (0.129) | 0.094 (0.124) | 0.020 (0.120) | −0.074 (0.089) |
| AloneBS | 0.988** (0.335) | 0.848** (0.321) | 0.542 (0.306) | 0.415 (0.231) |
| AloneWS | 0.396*** (0.031) | 0.173*** (0.021) | | 0.070* (0.029) |
| Tuesday | 0.016 (0.053) | 0.058 (0.031) | | 0.037 (0.051) |
| Wednesday | 0.223*** (0.053) | 0.127*** (0.032) | | 0.098 (0.051) |
| Thursday | 0.234*** (0.053) | 0.123*** (0.032) | | 0.180*** (0.051) |
| Friday | 0.107* (0.053) | 0.083* (0.032) | | 0.249*** (0.051) |
| Saturday | −0.110* (0.054) | −0.022 (0.033) | | 0.264*** (0.053) |
| Sunday | −0.126* (0.053) | −0.037 (0.030) | | 0.027 (0.051) |
| BS variance of scale σ²ω | | | | 0.812*** (0.059) |
| Covariance συω | | | | 0.527*** (0.061) |
***p < 0.001, **p < 0.01, *p < 0.05.
Table 4.
Mixed models of positive affect, N = 461 and Σni = 14,105, maximum likelihood estimates (standard errors)
| Covariate | Random intercept: Mean (β) | Mixed location scale: Mean (β) | Mixed location scale: BS var (α) | Mixed location scale: WS var (τ) |
|---|---|---|---|---|
| Intercept | 5.801*** (0.318) | 5.861*** (0.318) | 0.627 (0.371) | 0.535* (0.210) |
| Smoker | −0.170 (0.105) | −0.121 (0.101) | 0.017 (0.124) | 0.072 (0.069) |
| NovSeek | 0.107 (0.080) | 0.054 (0.081) | −0.261** (0.092) | 0.130* (0.053) |
| NegMoodReg | 0.591*** (0.079) | 0.585*** (0.077) | −0.086 (0.098) | −0.167** (0.052) |
| Male | 0.240* (0.111) | 0.187 (0.106) | −0.133 (0.130) | −0.233** (0.073) |
| Grade10 | 0.033 (0.105) | −0.014 (0.103) | −0.290* (0.122) | −0.151* (0.069) |
| AloneBS | −1.491*** (0.273) | −1.316*** (0.268) | 1.103*** (0.312) | 0.343 (0.180) |
| AloneWS | −0.518*** (0.027) | −0.364*** (0.023) | | 0.070* (0.028) |
| Tuesday | −0.071 (0.048) | −0.036 (0.036) | | 0.039 (0.050) |
| Wednesday | −0.144** (0.048) | −0.065 (0.038) | | 0.137** (0.050) |
| Thursday | −0.121* (0.048) | −0.096* (0.039) | | 0.227*** (0.050) |
| Friday | −0.029 (0.048) | 0.002 (0.039) | | 0.259*** (0.050) |
| Saturday | 0.162*** (0.048) | 0.174*** (0.038) | | 0.152** (0.050) |
| Sunday | 0.116* (0.048) | 0.149*** (0.036) | | −0.017 (0.050) |
| BS variance of scale σ²ω | | | | 0.461*** (0.036) |
| Covariance συω | | | | −0.306*** (0.040) |
***p < 0.001, **p < 0.01, *p < 0.05.
First, in comparing the mean effects between the random intercept and mixed location scale models, some differences emerge. In general, the former yields a few more significant results than the latter. Specifically, from Table 3, the two weekend indicators (Saturday and Sunday) are deemed to significantly lower NA in the simpler random intercept model, whereas they are not significant in the mixed location scale model. Similarly, from Table 4, the Wednesday indicator significantly lowers positive affect and Male significantly increases it, whereas neither is significant in the proposed model. Thus, there may be some benefit in fitting the more flexible mixed location scale model even when the main interest centers on changes in the mean.
For the mixed location scale model, in terms of NA, the results in Table 3 show that several variables significantly increase (AloneBS, AloneWS, Wednesday, Thursday, and Friday), and decrease (NegMoodReg and Male) the mean level of this variable. Thus, being a loner (i.e., higher on AloneBS) as well as being alone (i.e., AloneWS) increase NA. Also, NA is increased in the middle and toward the end of the week. Conversely, being a male and having better negative mood regulation lower NA. In terms of BS heterogeneity, those with better negative mood regulation (i.e., higher on NegMoodReg) are less varied and more homogeneous. There are many significant determinants of WS variance including those that increase this variance (Smoker, NovSeek, AloneWS, Thursday, Friday, and Saturday), and those that diminish this variance (NegMoodReg and Male). Thus, the WS data are more varied from smokers and novelty seekers, and less varied from males and negative mood regulators. It is particularly interesting to note the opposite effect on WS variance of novelty seeking (positive) and negative mood regulation (negative). In terms of prompt-level variables, we see increased WS variation toward the end of the week and on Saturday, and also increased variation when one is alone.
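Because the variance effects are estimated on the natural log scale, exponentiating a coefficient gives the multiplicative change in the variance per unit increase in the covariate, holding the other covariates fixed. A small sketch of that conversion for two WS variance estimates from Table 3 (the helper name `variance_ratio` is an assumption for illustration):

```python
import math

def variance_ratio(coefficient):
    """Multiplicative change in a variance per unit increase in the covariate."""
    return math.exp(coefficient)

# WS variance effects for negative affect, from Table 3.
smoker_ratio = variance_ratio(0.211)   # smokers versus non-smokers
male_ratio = variance_ratio(-0.375)    # males versus females

smoker_pct = 100.0 * (smoker_ratio - 1.0)   # percent increase in WS variance
male_pct = 100.0 * (1.0 - male_ratio)       # percent decrease in WS variance
```

Under these estimates, the WS variance of NA is roughly 23% larger for smokers and roughly 31% smaller for males, relative to the respective reference groups.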
As can be seen from Table 3, the random WS variance (i.e., the BS variance of scale) and covariance parameters are both highly significant. There is clear evidence that the WS variance varies across individuals, above and beyond the contribution of the many covariates in the WS variance model. In other words, subjects differ in terms of their NA variation. As noted, the covariance parameter συω is estimated to be positive (and is highly significant). This indicates that subjects who are high in terms of their NA mean also exhibit greater variation in NA, and possibly a floor effect of measurement.
Turning to the mixed location scale model results for PA in Table 4, we see that, in terms of the mean, several variables have opposite effects to those observed for NA, either increasing (NegMoodReg) or decreasing (AloneBS, AloneWS, and Thursday) the average PA. Additionally, the indicators for the weekend days (Saturday, Sunday) are seen to increase the mean of PA. In terms of BS heterogeneity, loners (i.e., AloneBS) are more varied in PA, whereas novelty seekers and 10th grade students are less varied. For the WS variance, nearly all of the results observed for NA are replicated for PA. Namely, several variables increase the heterogeneity of an individual’s responses (NovSeek, AloneWS, Thursday, Friday, Saturday), whereas others diminish this variance (NegMoodReg, Male). Additionally, decreased WS variance is observed for 10th grade students and increased variance for the Wednesday indicator.
As was the case for NA, the random WS variance and covariance parameters are both highly significant for PA. The significant variance of the random WS variance effects indicates that subjects vary in terms of their PA variation. Also, the random WS covariance parameter is estimated to be negative. Thus, subjects who are relatively high in terms of their mean PA are less varied across prompts in their PA responses. As mentioned, this could be due to a ceiling effect of measurement.
6. Discussion
This article has illustrated how mixed models for longitudinal data can be used to model differences in variances, and not just means, across subject- and time-varying covariates. As such, these models can help to identify predictors of both WS and BS variation, and to test hypotheses about these variances. Additionally, by including a random subject effect on the WS variance, this model can examine the degree to which subjects are heterogeneous in terms of their variation on the outcome variable. Our examples with NA and PA clearly show that subjects are quite heterogeneous in terms of their mood variation, as one might expect.
More applications of this class of models clearly exist. For example, many questions of both normal development and the development of psychopathology address the issue of variability or stability in emotional responses to various situations and/or contexts. Often, a concern is with the range of responses an individual gives to a variety of stimuli or situations, and not just with the overall mean level of responsivity. These models also allow us to examine hypotheses about cross-situational consistency of responses as well.
In this article, we have only considered the case of a single random subject effect for location. This could be generalized to allow multiple location random effects. For example, in longitudinal studies in which time is a factor, it is typical to consider a random subject intercept as well as random time trend parameters. However, for EMA data, there is not necessarily a notion that a person has some kind of systematic trend over the random prompts. In any case, the model could clearly be extended to allow multiple random location effects, and SAS PROC NLMIXED could still be used to estimate such a model.
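One way such an extension might be written is sketched below. The notation is an assumption made for illustration and may differ from the symbols used in the earlier sections: $z_{ij}$ is a hypothetical design vector for the location random effects (e.g., intercept and time), $\upsilon_i$ is the vector of random location effects, $\omega_i$ is the random scale effect, and $w_{ij}$ holds the covariates for the WS variance. For simplicity, the covariance matrix $\Sigma_\upsilon$ of the location effects is left unstructured here rather than modeled as a function of covariates.

```latex
\begin{aligned}
y_{ij} &= x_{ij}'\beta + z_{ij}'\upsilon_i + \epsilon_{ij}, \\
\epsilon_{ij} \mid \omega_i &\sim N\!\left(0,\ \sigma^2_{\epsilon_{ij}}\right),
\qquad \log \sigma^2_{\epsilon_{ij}} = w_{ij}'\tau + \omega_i, \\
\begin{pmatrix} \upsilon_i \\ \omega_i \end{pmatrix}
&\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},\
\begin{pmatrix} \Sigma_\upsilon & \sigma_{\upsilon\omega} \\
\sigma_{\upsilon\omega}' & \sigma^2_\omega \end{pmatrix}
\right)
\end{aligned}
```

Under this sketch, the single-location-effect model considered in this article corresponds to $z_{ij} = 1$, so that $\Sigma_\upsilon$ reduces to a scalar variance and $\sigma_{\upsilon\omega}$ to a single covariance.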
Modern data collection procedures, such as EMA and/or real-time data captures, usually provide a fair amount of both WS and BS data, and so give rise to the opportunity for modeling of both WS and BS variances as a function of covariates. Clearly, the data from this EMA study, and the questions of the investigators, motivated the development of the model presented in this article. One might wonder how much WS and BS data are necessary for estimation and variance-modeling purposes. For random coefficient models, Longford (1993) noted the difficulty of providing general guidelines about the degree of complexity, for the variation part of a model, that a given data set could support. This would also seem to be true here. Nonetheless, carrying out some simulations with relatively small sample sizes (e.g., 20 subjects with 5 observations each) gives the general impression that the primary issue in small-sample situations is that the estimation algorithm often fails to converge, running into estimation difficulties of one sort or another.
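As a rough illustration of why few observations per subject carry little information about subject-level variation, the sketch below compares how well per-subject log sample variances recover the true log WS variances when each of 20 subjects contributes 5 versus 40 observations. This is only a crude proxy under stated assumptions: it does not fit the mixed-effects location scale model, it does not reproduce the simulations mentioned above, and the function name, the parameter values (`tau0`, `sd_scale`), and the number of replications are hypothetical.

```python
import numpy as np

def log_variance_error(n_subjects, n_obs, n_reps=200, tau0=0.0,
                       sd_scale=0.5, seed=1):
    """Mean squared error of per-subject log sample variances around the
    true log WS variances tau0 + omega_i (illustrative values only)."""
    rng = np.random.default_rng(seed)
    sq_errors = []
    for _ in range(n_reps):
        omega = rng.normal(0.0, sd_scale, size=n_subjects)  # random scale effects
        true_log_var = tau0 + omega
        # Random location effects shift each subject's mean but cancel out of
        # the within-subject sample variance, so they are omitted here.
        ws_sd = np.exp(0.5 * true_log_var)
        y = ws_sd[:, None] * rng.standard_normal((n_subjects, n_obs))
        est_log_var = np.log(y.var(axis=1, ddof=1))
        sq_errors.append((est_log_var - true_log_var) ** 2)
    return float(np.mean(sq_errors))

small = log_variance_error(n_subjects=20, n_obs=5)   # small-sample design
large = log_variance_error(n_subjects=20, n_obs=40)  # EMA-like design
print(small, large)
```

With 5 observations per subject, the error in each subject's log variance is much larger than with 40, which is consistent with the general impression that the variance part of the model is hard to estimate in small samples.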
This article has focused on continuous outcomes. Because ordinal data are often obtained in many research areas as well, we are currently extending these procedures for ordinal data. Admittedly, there is more information in continuous than ordinal responses, so the ability to model variances in ordinal data may not be as general as what is possible using the methods presented here. Thus, we hope to examine the degree to which these models of variation can be applied to ordinal outcomes. Thus far, we have described an ordinal model that allows covariates to influence the WS and BS variances (Hedeker et al., 2006), which can be extended to also allow for the WS variance random effect, as presented here.
7. Supplementary Materials
The Web Appendix referenced in Sections 1 and 4 is available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.
Acknowledgments
This work was supported by National Cancer Institute grant 5PO1 CA98262. Portions of this article were presented at the 2007 ENAR meeting. The authors thank Dr Joseph Hogan for organizing the session at which it was presented, and Dr Rema Raman for delivering the presentation. The authors also thank a referee, the associate editor, and co-editor for helpful comments that led to an improved article.
References
- Aitkin M. Modelling variance heterogeneity in normal regression using GLIM. Applied Statistics. 1987;36:332–339.
- Bock RD. Multivariate Statistical Methods in Behavioral Research. New York: McGraw-Hill; 1975.
- Bock RD. Measurement of human variation: A two stage model. In: Bock RD, editor. Multilevel Analysis of Educational Data. New York: Academic Press; 1989. pp. 319–342.
- Bolger N, Davis A, Rafaeli E. Diary methods: Capturing life as it is lived. Annual Review of Psychology. 2003;54:579–616. doi: 10.1146/annurev.psych.54.101601.145030.
- Chinchilli VM, Esinhart JD, Miller WG. Partial likelihood analysis of within-unit variances in repeated measurement experiments. Biometrics. 1995;51:205–216.
- Cleveland W, Denby L, Liu C. Random scale effects. Technical report, Bell Labs. 2002. http://cm.bell-labs.com/cm/ms/departments/sia/wsc/webpapers.html.
- Fleeson W. Moving personality beyond the person-situation debate. Current Directions in Psychological Science. 2004;13:83–87.
- Fowler K, Whitlock MC. The distribution of phenotypic variance with inbreeding. Evolution. 1999;53:1143–1156. doi: 10.1111/j.1558-5646.1999.tb04528.x.
- Harvey AC. Estimating regression models with multiplicative heteroscedasticity. Econometrica. 1976;44:461–465.
- Hedeker D, Mermelstein RJ. Mixed-effects regression models with heterogeneous variance: Analyzing ecological momentary assessment data of smoking. In: Little TD, Bovaird JA, Card NA, editors. Modeling Ecological and Contextual Effects in Longitudinal Studies of Human Development. Mahwah, NJ: Erlbaum; 2007. pp. 183–206.
- Hedeker D, Berbaum M, Mermelstein R. Location-scale models for multilevel ordinal data: Between- and within-subjects variance modeling. Journal of Probability and Statistical Science. 2006;4:1–20.
- Hertzog C, Nesselroade JR. Assessing psychological change in adulthood: An overview of methodological issues. Psychology and Aging. 2003;18:639–657. doi: 10.1037/0882-7974.18.4.639.
- James AT, Venables WN, Dry IB, Wiskich JT. Random effects and variances as a synthesis of nonlinear regression analysis of mitochondrial electron transport. Biometrika. 1994;81:219–235.
- Leonard T. A Bayesian approach to the linear model with unequal variances. Technometrics. 1975;17:95–102.
- Lin X, Raz J, Harlow S. Linear mixed models with heterogeneous within-cluster variances. Biometrics. 1997;53:910–923.
- Lindley DV. The estimation of many parameters. In: Godambe VP, Sprott DA, editors. Foundations of Statistical Inference. Toronto: Holt, Rinehart, and Winston; 1971. pp. 435–455.
- Longford NT. Random Coefficient Models. New York: Oxford University Press; 1993.
- Martin M, Hofer SM. Intraindividual variability, change, and aging: Conceptual and analytical issues. Gerontology. 2004;50:7–11. doi: 10.1159/000074382.
- Myles JP, Price GM, Hunter N, Day M, Duffy SW. A potentially useful distribution model for dietary intake data. Public Health Nutrition. 2003;6:513–519. doi: 10.1079/PHN2003459.
- Nesselroade JR. Intraindividual variability and short-term change. Gerontology. 2004;50:44–47. doi: 10.1159/000074389.
- Nesselroade JR, McCollam KMS. Putting the process in developmental processes. International Journal of Behavioral Development. 2000;24:295–300.
- Neuhaus JM, Kalbfleisch JD. Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics. 1998;54:638–645.
- Penner LA, Shiffman S, Paty JA, Fritzsche BA. Individual differences in intraperson variability in mood. Journal of Personality and Social Psychology. 1994;66:712–721. doi: 10.1037//0022-3514.66.4.712.
- Renò R, Rizza R. Is volatility lognormal? Evidence from Italian futures. Physica A: Statistical Mechanics and its Applications. 2003;322:620–628.
- SAS Institute Inc. SAS/STAT User’s Guide, Version 8. Cary, NC: SAS Publishing; 2000.
- Schwartz JE, Stone A. The analysis of real-time momentary data: A practical guide. In: Stone AA, Shiffman SS, Atienza A, Nebeling L, editors. The Science of Real-Time Data Capture: Self-Report in Health Research. Oxford: Oxford University Press; 2007.
- Shenk TM, White GC, Burnham KP. Sampling-variance effects on detecting density dependence from temporal trends in natural populations. Ecological Monographs. 1998;68:445–463.
- Vasseur H. Prediction of tropospheric scintillation on satellite links from radiosonde data. IEEE Transactions on Antennas and Propagation. 1999;47:293–301.
- Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag; 2000.
- Walls TA, Schafer JL. Models for Intensive Longitudinal Data. New York: Oxford University Press; 2006.
- Wolfinger RD, Tobias RD. Joint estimation of location, dispersion and random effects in robust design. Technometrics. 1998;40:62–71.
