Tutorial on Biostatistics: Longitudinal Analysis of Correlated Continuous Eye Data

Gui-Shuang Ying; Maureen G Maguire; Robert J Glynn; Bernard Rosner

doi:10.1080/09286586.2020.1786590

Author manuscript; available in PMC: 2021 May 26.

Published in final edited form as: Ophthalmic Epidemiol. 2020 Aug 2;28(1):3–20. doi: 10.1080/09286586.2020.1786590


PMCID: PMC8150110 NIHMSID: NIHMS1699782 PMID: 32744149

Abstract

Purpose:

To describe and demonstrate methods for analyzing longitudinal correlated eye data with a continuous outcome measure.

Methods:

We described fixed effects, mixed effects and generalized estimating equations (GEE) models, and applied them to data from the Complications of Age-Related Macular Degeneration Prevention Trial (CAPT) and the Age-Related Eye Disease Study (AREDS). In CAPT (N = 1052), we assessed the effect of eye-specific laser treatment on change in visual acuity (VA). In AREDS, we evaluated the effects of systemic supplement treatment among 1463 participants with AMD category 3.

Results:

In CAPT, the inter-eye correlations (0.33 to 0.53) and longitudinal correlations (0.31 to 0.88) varied. There was a small treatment effect on VA change (approximately one letter) at 24 months for all three models (p = .009 to 0.02). Model fit was better with the mixed effects model than the fixed effects model (p < .001).

In AREDS, there was no significant treatment effect in any model (p > .55). Current smokers had a significantly greater VA decline than non-current smokers in the fixed effects model (p = .04) and the mixed effects model with random intercept (p = .0003), but the difference was only marginally significant in the mixed effects model with random intercept and slope (p = .08) and in the GEE models (p = .054 to .07). The model fit was better with the fixed effects model than the mixed effects model (p < .0001).

Conclusion:

Longitudinal models using the eye as the unit of analysis can be implemented using available statistical software to account for both inter-eye and longitudinal correlations. Goodness-of-fit statistics may guide the selection of the most appropriate model.

Keywords: Linear regression models, correlated data, inter-eye correlation, longitudinal correlation, fixed effects model, mixed effects model, generalized estimating equations

Introduction

Many ocular diseases affect both eyes of a subject, such as age-related macular degeneration (AMD), glaucoma and myopia. Interventions to prevent or treat these ocular diseases can be eye-specific or person-specific. For example, in the Complications of Age-related Macular Degeneration Prevention Trial (CAPT), laser treatment was eye-specific. One eye of a subject received laser treatment and the contralateral (fellow) eye was not treated.¹ In the Age-related Eye Disease Study (AREDS),² the intervention of dietary supplements was a systemic treatment so that both eyes received the same treatment. The primary outcome measure from clinical trials in ophthalmology is commonly eye-specific, such as measurement of visual acuity, visual field, or refractive error. These outcomes are often measured multiple times during the follow-up period, providing longitudinal data. The data from these studies have three types of correlation, including cross-sectional inter-eye correlation within the same subject, longitudinal repeated-measures correlation within the same eye over time, and cross-correlation between the outcome for one eye at one time point and the outcome for the fellow eye at a different time point. Inter-eye correlation at one point in time or longitudinal correlation for one eye over time individually are well recognized and statistical models have been developed to appropriately account for both of these types of correlation.^3,5 However, statistical models that simultaneously account for these types of correlation are less widely known. The purpose of this paper is to introduce appropriate statistical models to account for these correlations from longitudinal data from both eyes, and to demonstrate how to apply these models to datasets from the CAPT and the AREDS.

Methods

Data structure of longitudinal correlated eye data

To analyze longitudinal data from both eyes using statistical software, the data usually are laid out with each row representing an outcome measure from an eye at a specific time point (Table 1). The data usually consist of an ID variable indicating subject identification number, an EYE variable indicating which eye (left eye, right eye) was measured, a TIME variable indicating the visit when the outcome measure was taken, a GROUP variable indicating which treatment group the eye was in (GROUP has the same value for both eyes if treatment is systemic), an OUTCOME variable for an eye-specific continuous outcome measure, and possibly additional covariates X₁, X₂, …, X_k. The covariates can be either subject-specific or eye-specific.

Table 1.

Layout of longitudinal correlated eye data for statistical analysis.

ID	EYE (L,R)	TIME	GROUP (0,1)	OUTCOME
1	L	0	0	y_1L1
1	R	0	1	y_1R1
1	L	1	0	y_1L2
1	R	1	1	y_1R2
1	L	2	0	y_1L3
1	R	2	1	y_1R3
2	L	0	1	y_2L1
2	R	0	0	y_2R1
2	L	1	1	y_2L2
2	R	1	0	y_2R2
2	L	2	1	y_2L3
2	R	2	0	y_2R3
∷	∷	∷	∷	∷
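As an illustration of the Table 1 layout, the following is a minimal sketch of reshaping records that store both eyes side by side into one row per eye per visit. The wide-format column names (GROUP_L, GROUP_R, VA_L, VA_R) and all values are hypothetical and are not taken from CAPT or AREDS.

```python
# Hypothetical wide records: one row per subject and visit, both eyes side by side.
wide_rows = [
    {"ID": 1, "TIME": 0, "GROUP_L": 0, "GROUP_R": 1, "VA_L": 82, "VA_R": 84},
    {"ID": 1, "TIME": 1, "GROUP_L": 0, "GROUP_R": 1, "VA_L": 80, "VA_R": 83},
    {"ID": 2, "TIME": 0, "GROUP_L": 1, "GROUP_R": 0, "VA_L": 79, "VA_R": 78},
]


def to_long(rows):
    """Return one record per eye per visit, following the Table 1 layout."""
    long_rows = []
    for row in rows:
        for eye in ("L", "R"):
            long_rows.append({
                "ID": row["ID"],
                "EYE": eye,
                "TIME": row["TIME"],
                "GROUP": row["GROUP_" + eye],   # eye-specific treatment group
                "OUTCOME": row["VA_" + eye],    # eye-specific outcome
            })
    return long_rows


long_rows = to_long(wide_rows)
for record in long_rows:
    print(record)
```

With a systemic treatment, GROUP_L and GROUP_R would simply hold the same value for a given subject.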


For the analysis of longitudinal data from both eyes of a subject, three types of correlations need to be accounted for: (1) the cross-sectional inter-eye correlation at each time point; (2) the longitudinal correlation among repeated measures in the same eye over time; and (3) the cross-correlation between the outcome for one eye at one time point and the outcome for the fellow eye at a different time point. We previously described the analysis for correlated eye data using the fixed effects model, the mixed effects model,³ and the population-average model using generalized estimating equations (GEE)⁴ to account for the inter-eye correlation in continuous eye data from cross-sectional studies.⁵ These modelling approaches can be extended to analyze longitudinal eye data by specifying the model and the correlation structures to account for all three types of correlation as described below.

Fixed effects model

The general equation of the fixed effects model for longitudinal eye data with covariates (x) and time (t) as independent variables is:

y_{i j k} = α + β t_{k} + \sum_{l = 1}^{L} γ_{l} x_{i j l} + \sum_{l = 1}^{L} δ_{l} t_{k} x_{i j l} + e_{i j k}

(1)

where i represents subject, j represents eye, k represents time point, α is the intercept (when t and all covariates are set to a value of 0); β is the rate of change in y for every unit change of t (i.e., slope) when all covariates are set to a value of 0; γ_l is the difference in intercept of y per unit change in the l^th covariate at baseline, and δ_l is the change in slope of y per unit change in the l^th covariate.

The covariates x_ijl in the model do not change with time k; however, the model can be expanded to include time-varying covariates x_ijkl.

When there are two treatment groups (treatment can be either eye-specific or systemic) and no other covariates, the above equation can be simplified to:

y_{i j k} = α + β t_{k} + γ x_{i j} + δ t_{k} x_{i j} + e_{i j k}

(2)

where x_ij = 1 for the active treatment group, x_ij = 0 for the control group. If both eyes of a subject are in the same treatment group, then x_ij = x_i. In this specified model, α is the intercept (when t and x are set to a value of 0), β is the slope per unit time in the control group, β + δ is the slope per unit time in the active group, and γ is the mean difference between the active group and the control group at baseline.
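The interpretation of the coefficients in Equation 2 can be checked numerically. The sketch below evaluates the mean part of the model (without the error term) for hypothetical coefficient values; the function name and the numbers are illustrative assumptions only.

```python
def mean_outcome(alpha, beta, gamma, delta, t, x):
    """Mean of y under Equation 2: alpha + beta*t + gamma*x + delta*t*x."""
    return alpha + beta * t + gamma * x + delta * t * x


# Hypothetical coefficients (not estimates from CAPT or AREDS).
alpha, beta, gamma, delta = 82.0, -1.5, 0.25, 0.5

# Slope per unit time in each group: change in the mean from t = 0 to t = 1.
control_slope = (mean_outcome(alpha, beta, gamma, delta, 1, 0)
                 - mean_outcome(alpha, beta, gamma, delta, 0, 0))
active_slope = (mean_outcome(alpha, beta, gamma, delta, 1, 1)
                - mean_outcome(alpha, beta, gamma, delta, 0, 1))

# Difference between groups at baseline (t = 0).
baseline_difference = (mean_outcome(alpha, beta, gamma, delta, 0, 1)
                       - mean_outcome(alpha, beta, gamma, delta, 0, 0))

print("control slope (beta):", control_slope)
print("active slope (beta + delta):", active_slope)
print("baseline difference (gamma):", baseline_difference)
```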

In the fixed effects model, the regression coefficients are assumed to be the same for all subjects (as regression coefficients α, β, γ and δ are all not associated with subject i or eye j). The fixed effects models are most applicable to longitudinal studies with fixed visit times as in clinical trials with scheduled visits, because the visit variable used to specify the correlation structure has to be categorical. In SAS, a fixed effects model is fit using PROC MIXED with the REPEATED option, where the covariance (or correlation) structures for both inter-eye correlation and longitudinal correlation of repeated measures are specified.

SAS offers three correlation structures that can accommodate both inter-eye correlation and longitudinal correlation in the same model: UN@UN, UN@CS and UN@AR (Table 2). Here UN stands for an unstructured covariance that allows all components of the covariance matrix to be different, CS stands for compound symmetry, AR stands for first-order autoregressive, and @ denotes the matrix direct (Kronecker) product.

Table 2.

The features of the covariance structures UN@UN, UN@CS and UN@AR for the fixed effects model of longitudinal correlated data with three time points.

	UN@UN	UN@CS	UN@AR
Inter-eye covariance matrix	$(\begin{matrix} σ_{L}^{2} & σ_{L R} \\ σ_{L R} & σ_{R}^{2} \end{matrix})$	$(\begin{matrix} σ_{L}^{2} & σ_{L R} \\ σ_{L R} & σ_{R}^{2} \end{matrix})$	$(\begin{matrix} σ_{L}^{2} & σ_{L R} \\ σ_{L R} & σ_{R}^{2} \end{matrix})$
Longitudinal covariance matrix	$(\begin{matrix} 1 & λ_{12} & λ_{13} \\ λ_{12} & λ_{22} & λ_{23} \\ λ_{13} & λ_{23} & λ_{33} \end{matrix})$	$(\begin{matrix} 1 & λ & λ \\ λ & 1 & λ \\ λ & λ & 1 \end{matrix})$	$(\begin{matrix} 1 & λ & λ^{2} \\ λ & 1 & λ \\ λ^{2} & λ & 1 \end{matrix})$
Product of inter-eye and longitudinal covariance matrix	$\begin{matrix} σ_{L}^{2} & σ_{L}^{2} λ_{12} & σ_{L}^{2} λ_{13} & σ_{L R} & σ_{L R} λ_{12} & σ_{L R} λ_{13} \\ σ_{L}^{2} λ_{12} & σ_{L}^{2} λ_{22} & σ_{L}^{2} λ_{23} & σ_{L R} λ_{12} & σ_{L R} λ_{22} & σ_{L R} λ_{23} \\ σ_{L}^{2} λ_{13} & σ_{L}^{2} λ_{23} & σ_{L}^{2} λ_{33} & σ_{L R} λ_{13} & σ_{L R} λ_{23} & σ_{L R} λ_{33} \\ σ_{L R} & σ_{L R} λ_{12} & σ_{L R} λ_{13} & σ_{R}^{2} & σ_{R}^{2} λ_{12} & σ_{R}^{2} λ_{13} \\ σ_{L R} λ_{12} & σ_{L R} λ_{22} & σ_{L R} λ_{23} & σ_{R}^{2} λ_{12} & σ_{R}^{2} λ_{22} & σ_{R}^{2} λ_{23} \\ σ_{L R} λ_{13} & σ_{L R} λ_{23} & σ_{L R} λ_{33} & σ_{R}^{2} λ_{13} & σ_{R}^{2} λ_{23} & σ_{R}^{2} λ_{33} \end{matrix}$	$\begin{matrix} σ_{L}^{2} & σ_{L}^{2} λ & σ_{L}^{2} λ & σ_{L R} & σ_{L R} λ & σ_{L R} λ \\ σ_{L}^{2} λ & σ_{L}^{2} & σ_{L}^{2} λ & σ_{L R} λ & σ_{L R} & σ_{L R} λ \\ σ_{L}^{2} λ & σ_{L}^{2} λ & σ_{L}^{2} & σ_{L R} λ & σ_{L R} λ & σ_{L R} \\ σ_{L R} & σ_{L R} λ & σ_{L R} λ & σ_{R}^{2} & σ_{R}^{2} λ & σ_{R}^{2} λ \\ σ_{L R} λ & σ_{L R} & σ_{L R} λ & σ_{R}^{2} λ & σ_{R}^{2} & σ_{R}^{2} λ \\ σ_{L R} λ & σ_{L R} λ & σ_{L R} & σ_{R}^{2} λ & σ_{R}^{2} λ & σ_{R}^{2} \end{matrix}$	$\begin{matrix} σ_{L}^{2} & σ_{L}^{2} λ & σ_{L}^{2} λ^{2} & σ_{L R} & σ_{L R} λ & σ_{L R} λ^{2} \\ σ_{L}^{2} λ & σ_{L}^{2} & σ_{L}^{2} λ & σ_{L R} λ & σ_{L R} & σ_{L R} λ \\ σ_{L}^{2} λ^{2} & σ_{L}^{2} λ & σ_{L}^{2} & σ_{L R} λ^{2} & σ_{L R} λ & σ_{L R} \\ σ_{L R} & σ_{L R} λ & σ_{L R} λ^{2} & σ_{R}^{2} & σ_{R}^{2} λ & σ_{R}^{2} λ^{2} \\ σ_{L R} λ & σ_{L R} & σ_{L R} λ & σ_{R}^{2} λ & σ_{R}^{2} & σ_{R}^{2} λ \\ σ_{L R} λ^{2} & σ_{L R} λ & σ_{L R} & σ_{R}^{2} λ^{2} & σ_{R}^{2} λ & σ_{R}^{2} \end{matrix}$
Variance of outcome for the left (right) eyes at visit k	$σ_{L}^{2} λ_{k k} (σ_{R}^{2} λ_{k k})$	$σ_{L}^{2} (σ_{R}^{2})$	$σ_{L}^{2} (σ_{R}^{2})$
Inter-eye correlation at one visit k	$\frac{σ_{L R}}{σ_{L} σ_{R}}$	$\frac{σ_{L R}}{σ_{L} σ_{R}}$	$\frac{σ_{L R}}{σ_{L} σ_{R}}$
Longitudinal correlation between visits k₁ and k₂	$\frac{λ_{k_{1} k_{2}}}{\sqrt{λ_{k_{1} k_{1}} λ_{k_{2} k_{2}}}}$	$λ$	$λ^{| k_{1} - k_{2} |}$
Cross correlation between one eye at visit k₁ and the fellow eye at another visit k₂	$(\frac{σ_{L R}}{σ_{L} σ_{R}}) \frac{λ_{k_{1} k_{2}}}{\sqrt{λ_{k_{1} k_{1}} λ_{k_{2} k_{2}}}}$	$(\frac{σ_{L R}}{σ_{L} σ_{R}}) λ$	$(\frac{σ_{L R}}{σ_{L} σ_{R}}) λ^{| k_{1} - k_{2} |}$
Total number of parameters to estimate	8	4	4


UN = unstructured, CS = compound symmetry, AR = autoregressive

The UN@UN correlation structure specifies that both the correlation matrix between eyes and the longitudinal correlation over time are unstructured. The first UN corresponds to the cross-sectional covariance between left and right eyes and the second UN corresponds to the longitudinal covariance structure. If there are K time points, the direct product of these two covariance matrices is a matrix of dimension 2K × 2K. For example, with three time points, the direct product of the inter-eye covariance matrix and the longitudinal covariance matrix is a 6 × 6 matrix with 8 parameters to estimate (3 from the cross-sectional component, and 5 from the longitudinal component, whose first diagonal element is fixed at 1).

As shown in Table 2, the covariance matrix of UN@UN has the attributes of allowing different variances across different time points and across different eyes, and can accommodate the inter-eye correlation (assumed to be the same at different time points), longitudinal correlation (allowed to be different for different pairs of time points, but the same for left eye and right eye) and cross correlation.

Similar to UN@UN, the UN@CS and UN@AR can also accommodate all correlations (i.e., inter-eye correlation, longitudinal correlation and cross correlation), but with stronger assumptions for the longitudinal correlation structure. UN@CS assumes an unstructured inter-eye covariance (three parameters), and a compound symmetry correlation structure (with only one parameter) for longitudinal correlation, while UN@AR assumes an unstructured inter-eye covariance (three parameters), and a first-order autoregressive correlation structure (with only one parameter) for longitudinal correlation. As both UN@CS and UN@AR only have four parameters to estimate, they are less computationally intensive and less likely to have a convergence problem than UN@UN. The features of UN@CS and UN@AR as compared to UN@UN are shown in Table 2. In the longitudinal data analysis literature, the model in Equations 1 and 2 is also sometimes referred to as a Covariance Pattern model.⁶
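To make the direct product structures concrete, the following minimal sketch builds the 6 × 6 UN@AR and UN@CS covariance matrices for three visits and recovers the inter-eye, longitudinal and cross correlations listed in Table 2. The variance, covariance and λ values are hypothetical, and the helper functions (kron, ar1, cs, corr) are written here for illustration rather than taken from any statistical package.

```python
def kron(a, b):
    """Kronecker (direct) product of two matrices stored as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def ar1(n, lam):
    """First-order autoregressive correlation matrix: lam ** |k1 - k2|."""
    return [[lam ** abs(r - c) for c in range(n)] for r in range(n)]


def cs(n, lam):
    """Compound symmetry correlation matrix: 1 on the diagonal, lam elsewhere."""
    return [[1.0 if r == c else lam for c in range(n)] for r in range(n)]


def corr(v, r, c):
    """Correlation implied by covariance matrix v for positions r and c."""
    return v[r][c] / (v[r][r] * v[c][c]) ** 0.5


# Hypothetical inter-eye covariance matrix (left eye first, then right eye).
var_l, var_r, cov_lr = 100.0, 121.0, 55.0
inter_eye = [[var_l, cov_lr], [cov_lr, var_r]]
lam = 0.8

# Rows 0-2: left eye at visits 1-3; rows 3-5: right eye at visits 1-3.
v_ar = kron(inter_eye, ar1(3, lam))  # UN@AR
v_cs = kron(inter_eye, cs(3, lam))   # UN@CS

inter_eye_corr = corr(v_ar, 0, 3)     # both eyes at visit 1
longitudinal_corr = corr(v_ar, 0, 2)  # left eye, visits 1 and 3
cross_corr = corr(v_ar, 0, 5)         # left eye visit 1, right eye visit 3

print(inter_eye_corr, longitudinal_corr, cross_corr)
```

Under UN@AR the cross correlation equals the inter-eye correlation multiplied by λ raised to the visit separation, whereas under UN@CS the same entry of v_cs would use λ itself regardless of separation.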

Mixed effects model

The mixed effects model contains both fixed effects and random effects. Regression coefficients for random effects are assumed to vary among different individuals. Random effects may be specified for both the intercept and time parameters for both the subject and the eye within the subject. Through specification of random effects for longitudinal measures over time and measures from two eyes of a subject, the mixed effects model explicitly accounts for the repeated measures correlation and inter-eye correlation. An important assumption of the mixed effects model is that the distribution of the random effects is assumed to be normal (i.e., Gaussian) and random effects are independent of all the covariates. In addition, covariate effects can be estimated for both a single individual as well as an average over many individuals. The mixed effects model requires correct specifications for both fixed effects and random effects. In SAS, the mixed effects model is executed using PROC MIXED through a RANDOM statement.

Below, we describe two common mixed effects models including a mixed effects model with random intercept, and a mixed effects model with random intercept and slope.

Mixed effects model with random intercept

The mixed effects model with random intercept for longitudinal eye data with treatment group (x) and time (t) as independent variables is specified as:

y_{i j k} = (α + u_{i} + v_{i j}) + β t_{k} + γ x_{i j} + δ t_{k} x_{i j} + e_{i j k}, i = 1, \ldots, N; j = 1, 2; k = 1, \ldots, K

(3)

where u_i and v_ij are person-specific and eye-specific random effects, respectively, and e_ijk is the random error. u_i is distributed as $N (0, σ_{u}^{2})$, v_ij is distributed as $N (0, σ_{v}^{2})$, and e_ijk is distributed as $N (0, σ_{e}^{2})$. u_i, v_ij and e_ijk are assumed to be mutually independent. The covariance matrix of the three random effects terms u_i, v_i0 and v_i1 is denoted by G (Table 3).

Table 3.

Features of the covariance structure under random effects models for longitudinal correlated data with 3-time points.

	Random intercept for subject and eye within subject (equation 3)	Random intercept and slope for subject; random intercept for eye within subject (equation 6)	Random intercept and slope for subject; random slope for eye within subject (equation 6)
Vector of random effects (transpose)	(u_i, v_i0, v_i1)	(u_i, w_i, v_i0, v_i1)	(u_i, w_i, z_i0, z_i1)
Covariance matrix for random effects (G)	$(\begin{matrix} σ_{u}^{2} & 0 & 0 \\ 0 & σ_{v}^{2} & 0 \\ 0 & 0 & σ_{v}^{2} \end{matrix})$	$(\begin{matrix} σ_{u}^{2} & σ_{uw} & 0 & 0 \\ σ_{uw} & σ_{w}^{2} & 0 & 0 \\ 0 & 0 & σ_{v}^{2} & 0 \\ 0 & 0 & 0 & σ_{v}^{2} \end{matrix})$	$(\begin{matrix} σ_{u}^{2} & σ_{uw} & 0 & 0 \\ σ_{uw} & σ_{w}^{2} & 0 & 0 \\ 0 & 0 & σ_{z}^{2} & 0 \\ 0 & 0 & 0 & σ_{z}^{2} \end{matrix})$
Variance of outcome in an eye at visit k	$σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}$	$σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2} + σ_{v}^{2} + σ_{e}^{2}$	$σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2} + t_{k}^{2} σ_{z}^{2} + σ_{e}^{2}$
Inter-eye correlation at one visit k	$\frac{σ_{u}^{2}}{σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}}$	$\frac{σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2}}{σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2} + σ_{v}^{2} + σ_{e}^{2}}$	$\frac{σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2}}{σ_{u}^{2} + 2 t_{k} σ_{uw} + t_{k}^{2} σ_{w}^{2} + t_{k}^{2} σ_{z}^{2} + σ_{e}^{2}}$
Longitudinal correlation between visits k₁ and k₂	$\frac{σ_{u}^{2} + σ_{v}^{2}}{σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}}$	$\frac{σ_{u}^{2} + (t_{k_{1}} + t_{k_{2}}) σ_{uw} + t_{k_{1}} t_{k_{2}} σ_{w}^{2} + σ_{v}^{2}}{\sqrt{(σ_{u}^{2} + 2 t_{k_{1}} σ_{uw} + t_{k_{1}}^{2} σ_{w}^{2} + σ_{v}^{2} + σ_{e}^{2}) (σ_{u}^{2} + 2 t_{k_{2}} σ_{uw} + t_{k_{2}}^{2} σ_{w}^{2} + σ_{v}^{2} + σ_{e}^{2})}}$	$\frac{σ_{u}^{2} + (t_{k_{1}} + t_{k_{2}}) σ_{uw} + t_{k_{1}} t_{k_{2}} σ_{w}^{2} + t_{k_{1}} t_{k_{2}} σ_{z}^{2}}{\sqrt{(σ_{u}^{2} + 2 t_{k_{1}} σ_{uw} + t_{k_{1}}^{2} σ_{w}^{2} + t_{k_{1}}^{2} σ_{z}^{2} + σ_{e}^{2}) (σ_{u}^{2} + 2 t_{k_{2}} σ_{uw} + t_{k_{2}}^{2} σ_{w}^{2} + t_{k_{2}}^{2} σ_{z}^{2} + σ_{e}^{2})}}$


With x_ij = 1 for the active treatment group, x_ij = 0 for the control group, the regression coefficient α can be interpreted as the estimated mean outcome at baseline for the control group. The intercept for a specific person i and eye j in the control group is α + u_i + v_ij.

γ is the mean difference in outcome at baseline between the treatment and control group

β is the slope in the control group, assumed to be the same for all subjects in the control group

β + δ is the slope in the active group, assumed to be the same for all subjects in the active group,

and δ is the difference of slope over time between treatment group and control group, which is of primary interest in clinical trials.

In the mixed effects model with a random intercept, the variance of y_ijk can be estimated as $σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}$, where $σ_{u}^{2}$ is between person variance, $σ_{v}^{2}$ is between eye (within person) variance, and $σ_{e}^{2}$ is within eye (replicate) variance. Since the variance of y_ijk is not a function of j (eye) or k (visit), the variance is the same across all time points and the same for the left eye and right eye. This may not be true for some longitudinal data.

The mixed effects model with random intercept also assumes that the longitudinal correlation between outcomes for the same eye over time is the same for each pair of different time points, since

Corr (y_{i j k_{1}}, y_{i j k_{2}}) = (σ_{u}^{2} + σ_{v}^{2}) ∕ (σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}), k_{1} \neq k_{2}

(4)

Finally, the inter-eye correlation is assumed to be the same at each visit and is the same as the cross correlation, which can be estimated as

Corr (y_{i j_{1} k_{1}}, y_{i j_{2} k_{2}}) = σ_{u}^{2} ∕ (σ_{u}^{2} + σ_{v}^{2} + σ_{e}^{2}), j_{1} \neq j_{2}

(5)

So the mixed effects model with random intercept has seven parameters including four mean parameters (α, β, γ, δ), and three variance parameters $(σ_{u}^{2}, σ_{v}^{2}, σ_{e}^{2})$ to estimate.
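The correlations implied by the random intercept model follow directly from the three variance components. The sketch below evaluates the total variance, the longitudinal correlation (Equation 4) and the inter-eye correlation (Equation 5); the variance component values are hypothetical, not estimates from either trial.

```python
def random_intercept_correlations(var_u, var_v, var_e):
    """Variance and correlations implied by the random intercept model."""
    total = var_u + var_v + var_e
    return {
        "variance": total,
        "longitudinal": (var_u + var_v) / total,  # Equation 4, same eye
        "inter_eye": var_u / total,               # Equation 5, fellow eyes
    }


# Hypothetical variance components: between person, between eye, within eye.
result = random_intercept_correlations(var_u=40.0, var_v=20.0, var_e=40.0)
print(result)
```

Because the eye-level variance enters only the numerator of Equation 4, the longitudinal correlation within an eye can never be smaller than the inter-eye correlation under this model.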

Mixed effects model with random intercept and slope

The equation for the mixed effects model with random intercept and slope that includes treatment group x and time t is:

y_{i j k} = (α + u_{i} + v_{i j}) + (β + w_{i} + z_{i j}) t_{k} + γ x_{i j} + δ t_{k} x_{i j} + e_{i j k}

(6)

where $u_{i} \sim N (0, σ_{u}^{2})$ , $w_{i} \sim N (0, σ_{w}^{2})$ are person-specific random effects for the intercept and slope respectively,

$v_{i j} \sim N (0, σ_{v}^{2})$ , $z_{i j} \sim N (0, σ_{z}^{2})$ are eye-specific random effects for the intercept and slope respectively,

and $e_{i j k} \sim N (0, σ_{e}^{2})$ is the residual effect controlling for the person and eye-specific fixed and random effects. The person and eye-specific random effects are independent of each other, but the random effects for the slope and intercept for a subject (u_i, w_i) may be correlated and the random effects for the slope and intercept of an eye (v_ij, z_ij) may also be correlated. The covariance matrix of the random effects (u_i, w_i, v_i0, v_i1, z_i0, z_i1) is denoted G. In many examples, the estimated G matrix for the model in Equation 6 may not be positive definite. In this case, a reduced model that drops either the two random eye-specific intercepts (v_i0, v_i1) or the two random eye-specific slopes (z_i0, z_i1) can often be fitted.

With x_ij = 1 for the active treatment group, x_ij = 0 for the control group,

α, β are the average intercept and slope, respectively, in the control group,

(α+γ) and (β+δ) are the average intercept and slope, respectively, in the treatment group,

α + u_i + v_ij is the intercept for the j^th eye of the i^th subject in the control group,

β + w_i + z_ij is the slope for the j^th eye of the i^th subject in the control group,

α + γ + u_i + v_ij is the intercept for the j^th eye of the i^th subject in the treatment group,

β + δ + w_i + z_ij is the slope for the j^th eye of the i^th subject in the treatment group.

The model has 11 parameters to estimate, including (α, β, γ, δ) for the mean, $(σ_{u}^{2}, σ_{v}^{2}, σ_{w}^{2}, σ_{z}^{2}, σ_{e}^{2})$ for the variance, and ρ_uw and ρ_vz for correlations between random effects.

Let c_k = (1, 1, t_k, t_k) be a 1 × 4 vector and Σ_uvwz be the variance-covariance matrix of (u, v, w, z); then

$var (y_{i j k}) = c_{k} Σ_{u v w z} c_{k}^{'} + σ_{e}^{2}$ depends on t_k.

Thus, the variance of y_ijk may increase or decrease over time, but is assumed to be the same for left and right eyes.

Similarly, $cov (y_{i j k_{1}}, y_{i j k_{2}}) = c_{k_{1}} Σ_{u v w z} c_{k_{2}}^{'}$

also depends on time, thus the longitudinal correlation may depend on t_k₁ and t_k₂.

In the mixed effects model with random intercept and random slope, the variance of y_ijk is allowed to vary over time, but the variance for the left and right eye are assumed to be the same. Similarly, the longitudinal correlations are allowed to be different depending on the time points, and both the inter-eye correlation and cross correlation are also allowed to vary over time. So the mixed effects model with random intercept and random slope offers more flexibility than both the fixed effects model and mixed effects model with random intercept.
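The dependence of the variance and longitudinal covariance on time can be illustrated by evaluating the quadratic forms numerically. This is a minimal sketch assuming the random effects are ordered (u, v, w, z), so that the covariance between u and w and between v and z are the only non-zero off-diagonal terms; all numeric values and function names are hypothetical.

```python
def quad_form(a, m, b):
    """Compute a * M * b' for row vectors a, b and square matrix m."""
    return sum(a[r] * m[r][c] * b[c]
               for r in range(len(a)) for c in range(len(b)))


def sigma_uvwz(var_u, var_v, var_w, var_z, cov_uw, cov_vz):
    """Covariance matrix of (u, v, w, z); person and eye effects independent."""
    return [
        [var_u, 0.0, cov_uw, 0.0],
        [0.0, var_v, 0.0, cov_vz],
        [cov_uw, 0.0, var_w, 0.0],
        [0.0, cov_vz, 0.0, var_z],
    ]


def same_eye_cov(t1, t2, sigma, var_e):
    """Covariance of two measurements on the same eye at times t1 and t2."""
    c1 = (1.0, 1.0, t1, t1)
    c2 = (1.0, 1.0, t2, t2)
    cov = quad_form(c1, sigma, c2)
    # The residual variance contributes only when both times are the same visit.
    return cov + var_e if t1 == t2 else cov


# Hypothetical variance components.
sigma = sigma_uvwz(var_u=30.0, var_v=10.0, var_w=1.0, var_z=0.5,
                   cov_uw=-1.0, cov_vz=0.0)
var_e = 20.0

var_t0 = same_eye_cov(0.0, 0.0, sigma, var_e)
var_t2 = same_eye_cov(2.0, 2.0, sigma, var_e)
cov_t0_t2 = same_eye_cov(0.0, 2.0, sigma, var_e)
corr_t0_t2 = cov_t0_t2 / (var_t0 * var_t2) ** 0.5

print(var_t0, var_t2, cov_t0_t2, corr_t0_t2)
```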

In both the mixed effects model and fixed effects model, the goodness of fit can be assessed using −2 times the log-likelihood (−2lnL) or Akaike’s Information Criterion (AIC). The −2lnL and AIC can be used to identify the most appropriate covariance structure. A smaller −2lnL or AIC indicates a better fitting model.
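As a small illustration of model selection by AIC, the sketch below applies the standard definition AIC = −2lnL + 2p to three candidate covariance structures. The −2lnL values are invented for illustration, and which parameters count toward p depends on the software and estimation method, so the counts used here (covariance parameters only) are an assumption.

```python
def aic(neg2_log_lik, n_params):
    """Akaike's Information Criterion from -2 ln L and a parameter count."""
    return neg2_log_lik + 2 * n_params


# Hypothetical fit results: structure -> (-2 ln L, number of parameters).
fits = {
    "UN@CS": (61250.0, 4),
    "UN@AR": (61010.0, 4),
    "UN@UN": (60990.0, 8),
}

aic_values = {name: aic(neg2ll, p) for name, (neg2ll, p) in fits.items()}
best_structure = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(name, value)
print("smallest AIC:", best_structure)
```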

Population-average model

The population-average model (or marginal model) using the GEE approach provides an estimate of changes in the population mean corresponding to changes in covariates. Although GEE was initially developed to analyze correlated data from longitudinal repeated measures,⁴ it also applies to the analysis of correlated eye data.⁵ Different from the fixed effects model or mixed effects model, the GEE approach does not require distributional assumptions because estimation of the population-average model depends only on correctly specifying the linear function relating the mean outcome to the covariates. GEE models only estimate effects for an average person (or eye) with specific covariate values. As GEE is a marginal model approach, no random effects are introduced. Instead, empirical methods such as the robust sandwich estimator for the variance of the estimated regression coefficients are employed to adjust the SE’s of the regression estimates for the correlated outcome data. The marginal model takes account of all correlations by estimating the covariance among the residuals from a single subject, assuming the residuals from a subject are correlated, while the standard linear regression model assumes the residuals are independent with a constant variance. The marginal model is usually executed in SAS using PROC GENMOD. The REPEATED statement in this procedure allows the specification of various correlation structures including Unstructured (UN), compound symmetry (CS) and the working independence (IND).

For longitudinal data measured from two eyes of each subject at three time points, the 6 × 6 correlation structures of UN, CS and IND are as follows:

UN = (\begin{matrix} σ_{1}^{2} & ρ_{1} & ρ_{2} & ρ_{3} & ρ_{4} & ρ_{5} \\ ρ_{1} & σ_{2}^{2} & ρ_{6} & ρ_{7} & ρ_{8} & ρ_{9} \\ ρ_{2} & ρ_{6} & σ_{3}^{2} & ρ_{10} & ρ_{11} & ρ_{12} \\ ρ_{3} & ρ_{7} & ρ_{10} & σ_{4}^{2} & ρ_{13} & ρ_{14} \\ ρ_{4} & ρ_{8} & ρ_{11} & ρ_{13} & σ_{5}^{2} & ρ_{15} \\ ρ_{5} & ρ_{9} & ρ_{12} & ρ_{14} & ρ_{15} & σ_{6}^{2} \end{matrix})

CS = (\begin{matrix} σ^{2} & ρ & ρ & ρ & ρ & ρ \\ ρ & σ^{2} & ρ & ρ & ρ & ρ \\ ρ & ρ & σ^{2} & ρ & ρ & ρ \\ ρ & ρ & ρ & σ^{2} & ρ & ρ \\ ρ & ρ & ρ & ρ & σ^{2} & ρ \\ ρ & ρ & ρ & ρ & ρ & σ^{2} \end{matrix})

IND = (\begin{matrix} σ^{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & σ^{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & σ^{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & σ^{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & σ^{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & σ^{2} \end{matrix})

The number of parameters to estimate is 21 for UN, 2 for CS, and 1 for working Independence.
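The parameter counts can be reproduced with a short calculation, which also shows how quickly the unstructured matrix grows with the number of visits. In the sketch below, the count for a fully unstructured matrix is dim(dim + 1)/2; the UN@UN formula generalizes the "3 + 5" count given for three visits and is our extrapolation, not a formula stated in the source.

```python
def n_unstructured(dim):
    """Free parameters in an unstructured dim x dim covariance matrix."""
    return dim * (dim + 1) // 2


def n_un_kron_un(n_visits):
    """Assumed count for UN@UN: 3 inter-eye parameters plus the longitudinal
    matrix with its first diagonal element fixed at 1."""
    return 3 + n_unstructured(n_visits) - 1


counts = {}
for n_visits in (3, 6):
    dim = 2 * n_visits  # two eyes per subject
    counts[n_visits] = {
        "UN": n_unstructured(dim),
        "CS": 2,   # one common variance, one common correlation
        "IND": 1,  # one common variance
        "UN@UN": n_un_kron_un(n_visits),
    }
    print(n_visits, counts[n_visits])
```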

Although the “working independence” correlation structure appears to ignore the inter-eye correlation and longitudinal correlation by specifying a correlation of 0, the GEE approach uses a robust variance estimator that provides asymptotically unbiased estimates for the regression coefficients, which are the same as those from the standard linear regression model, but their standard errors are adjusted for the correlated data. With the compound symmetry correlation structure and unstructured correlation structure, both the estimated regression coefficients and standard errors may differ from the standard linear regression model. When there are several time points with repeated measures, using an unstructured correlation structure requires estimating many covariance parameters, and substantially increases the computation time and possibly leads to computational convergence problems.
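The effect of the robust variance estimator under working independence can be seen in the simplest possible case. The following is a minimal sketch assuming an intercept-only model, where each subject's measurements form one cluster and no small-sample correction is applied; the data are hypothetical and the function is illustrative, not the estimator as implemented in any particular package.

```python
import math


def mean_with_standard_errors(clusters):
    """Return (mean, naive SE, cluster-robust SE) for an intercept-only model."""
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    mean = sum(values) / n  # same point estimate as ordinary least squares

    # Naive SE: treats all n measurements as independent.
    sum_sq = sum((y - mean) ** 2 for y in values)
    naive_se = math.sqrt(sum_sq / (n - 1) / n)

    # Robust (sandwich) SE: residuals are summed within each subject first,
    # so correlated measurements are not counted as independent information.
    meat = sum(sum(y - mean for y in cluster) ** 2 for cluster in clusters)
    robust_se = math.sqrt(meat) / n

    return mean, naive_se, robust_se


# Hypothetical data: three subjects whose two eyes give identical scores.
clusters = [[80.0, 80.0], [84.0, 84.0], [88.0, 88.0]]
mean, naive_se, robust_se = mean_with_standard_errors(clusters)
print(mean, naive_se, robust_se)
```

In this extreme example the second eye adds no new information, so the robust standard error is larger than the naive one while the estimated mean is unchanged.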

In GEE, when there is little knowledge available to choose among correlation structures, model goodness-of-fit statistics can be used to find an acceptable working correlation structure, including the Quasi-likelihood (Q) under the Independence model Criterion (QIC) and the related QICu.⁷ QIC is analogous to the AIC statistic used for comparing model fits with likelihood-based methods. Since GEE is not a likelihood-based method, the AIC statistic is not available in GEE. QIC can be used to compare GEE models with the same covariates under various correlation structures, thus guiding the selection of a proper correlation structure. The model with the smaller QIC statistic is preferred.⁸ QICu can be used to compare various models with different covariates but with the same correlation structure.

We demonstrate the application of the fixed effects model, mixed effects model, and marginal models to analyze longitudinal correlated eye data from two clinical trials as described below. The institutional review board associated with each clinical center approved the study protocol and informed consent was obtained from each patient, and each study adhered to the tenets of the Declaration of Helsinki.

In the analyses of clinical trial data using fixed effects models and mixed effects models, we used the DDFM = KR option to calculate the degrees of freedom for testing fixed effects as detailed by Kenward and Roger (1997).⁹ A simulation study by Schaalje et al. found that the KR method works reasonably well with various covariance structures when sample sizes are moderate to small and the design is reasonably balanced, while the other methods for DDFM did not work as well as the KR method in some settings.¹⁰ All statistical analyses were performed in SAS 9.4 (SAS Institute Inc., Cary, NC), and the SAS code is included in Appendices 1 and 2. Similar code in STATA (Appendix 3) and R (Appendix 4) is also provided.

Example 1: Analysis of visual acuity data from the complications of age-related macular degeneration prevention trial (CAPT)

The CAPT was a multi-center randomized clinical trial to evaluate whether low-intensity laser treatment of eyes with drusen prevents vision loss from AMD.¹ The study enrolled 1,052 participants (2,104 eyes) with age at least 50 years, at least 10 large drusen (retinal deposits) in each eye, and visual acuity (VA) at least 20/40 in each eye. The study randomly assigned one eye to laser treatment, and the fellow eye of a participant to observation (without any intervention). The visual acuity of each eye was measured using modified ETDRS charts at baseline, 6 months, and annually for at least 5 years. The primary outcome was the visual acuity score calculated as the total number of letters read correctly from the ETDRS charts. For this example, we analyzed visual acuity measured at baseline, and months 12, 24, 36, 48, 60. We evaluated the treatment effect and the effect of smoking on visual acuity using the fixed effects model, random effects models and GEE. In these models, time (in months) was fitted as a categorical variable with 6 levels (e.g., 0, 12, 24, 36, 48 and 60 months) because the visual acuity change over time did not appear to be linear. For the mixed effects model with random intercept and random slope, including both random intercept and slope in any of the random statements for subject or eye within subject caused a non-positive-definite G-matrix (matrix of random effects). Thus, we fitted the mixed effects model using a random visit effect for both subject and eye within the subject, but not a random intercept.

Example 2: Analysis of data from the age-related eye disease study (AREDS)

The AREDS AMD trial was a multi-center randomized clinical trial to evaluate the effect of high-dose vitamin C and E, beta carotene and zinc supplements on AMD progression and visual acuity.

The study enrolled 1063 participants who had extensive small drusen, retinal pigment abnormalities, or at least 1 intermediate size drusen (AMD category 2), 1463 participants who had extensive intermediate drusen, non-central geographic atrophy, or at least one large druse (AMD category 3); and 956 participants who had advanced AMD or visual acuity less than 20/32 due to AMD in one eye (AMD category 4).² All participants were randomly assigned to receive daily oral tablets containing: (1) antioxidants (vitamin C, 500 mg; vitamin E, 400 IU, and beta carotene, 15 mg); (2) zinc, 80 mg, as zinc oxide and copper, 2 mg, as cupric oxide; (3) anti-oxidants plus zinc; or (4) placebo. The participants were followed for outcome assessment every 6 months for at least 5 years. For the purpose of demonstration, we restricted the analyses to the 1463 participants (2334 eyes, 60% bilateral) with AMD category 3 at baseline.

We analyzed the AREDS data to evaluate the effects of treatment, age, hypertension status and smoking on the rate of change in visual acuity during follow-up using a fixed effects model, random effects models and GEE. In these models, time (in months) was fitted as a continuous variable. We initially fitted a mixed effects model with random intercept and random slope for both subject and eye within a subject. However, there was little variation in visual acuity attributed to the random intercept for eye within a subject, leading to a non-positive-definite G-matrix. Thus, we fitted the mixed effects model using a random intercept and random slope for subject, and only random slope for eye within a subject.
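The final AREDS mixed effects model described above can be written out explicitly. The following is only a sketch in notation introduced here for illustration (the symbols are not the authors'), and it assumes an unstructured covariance for the subject-level random intercept and slope:

```latex
\[
y_{ijk} = \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta}
        + b_{0i} + b_{1i}\,t_{ijk} + c_{1ij}\,t_{ijk} + \varepsilon_{ijk},
\]
\[
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix}
\sim N\!\left(\mathbf{0},
\begin{pmatrix} \sigma_{b0}^{2} & \sigma_{b01} \\ \sigma_{b01} & \sigma_{b1}^{2} \end{pmatrix}\right),
\qquad c_{1ij} \sim N\!\left(0, \sigma_{c1}^{2}\right),
\qquad \varepsilon_{ijk} \sim N\!\left(0, \sigma^{2}\right).
\]
```

Here y_{ijk} is the VA score of eye j of subject i at visit k, t_{ijk} is the follow-up time, x_{ijk} collects the fixed-effect covariates (treatment, age, hypertension, smoking and their interactions with time), b_{0i} and b_{1i} are the subject-level random intercept and slope, and c_{1ij} is the random slope for eye within subject. Under these assumptions the model has five covariance parameters (three at the subject level, one eye-level slope variance and one residual variance), which would agree with the count reported for this model in Table 7.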

Results

Effect of treatment and smoking on visual acuity in CAPT

Among 1052 participants enrolled into the CAPT, 917 (87%) participants completed the 5-year follow-up. The inter-eye correlations assessed using Pearson correlation coefficients ranged from 0.33 to 0.53 and were highest at baseline. The longitudinal correlation coefficients for pairs of visits ranged from 0.31 to 0.88, diminished with a longer time between repeated measures, and tended to be higher between pairs of visits at later time points (Table 4). The longitudinal correlation coefficients were similar in treated eyes and fellow eyes (Table 4).

Table 4.

Visual acuity in the treated eye and control eye over time and their cross-sectional and longitudinal correlations (Pearson Correlation) in CAPT participants (N = 1052).

		Time points (months)
		0	12	24	36	48	60
# of Patients	N	1052	1035	1008	970	941	917
Control eye	Mean (SD) VA score in letters	82.1 (6.1)	80.7 (8.9)	79.0 (11.4)	76.8 (13.6)	75.3 (15.6)	73.1 (17.7)
Treated eye	Mean (SD) VA score in letters	82.2 (6.2)	81.0 (9.0)	80.1 (10.7)	77.9 (13.1)	75.9 (15.1)	72.9 (17.7)
		Pearson Correlation Coefficient
Control eye	0	1.00	0.58	0.49	0.41	0.37	0.32
	12		1.00	0.75	0.65	0.57	0.51
	24			1.00	0.79	0.70	0.64
	36				1.00	0.85	0.77
	48					1.00	0.88
	60						1.00
Treated eye	0	1.00	0.59	0.45	0.41	0.36	0.31
	12		1.00	0.70	0.63	0.55	0.46
	24			1.00	0.81	0.69	0.60
	36				1.00	0.84	0.73
	48					1.00	0.86
	60						1.00
Combined	0	1.00	0.58	0.47	0.41	0.36	0.32
	12		1.00	0.73	0.64	0.56	0.48
	24			1.00	0.80	0.70	0.62
	36				1.00	0.84	0.75
	48					1.00	0.87
	60						1.00
Inter-eye correlation	Pearson ρ	0.53	0.36	0.33	0.36	0.36	0.34

Over 5 years of follow-up, the mean visual acuity decreased over time, while the variation (e.g., standard deviation [SD]) of visual acuity increased over time, with mean (SD) of visual acuity 82 (6) letters at baseline and 73 (18) letters at year 5 in both treated eyes and observation eyes (Table 4).
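Descriptive correlations of the kind reported in Table 4 could be obtained along the following lines. This is a minimal sketch, not the authors' code: it assumes the long-format data set and variable names used in Appendix 1 (model_data, id, eye, visit, VAscore), one record per eye per visit, and visit stored as an unformatted numeric month (0, 12, ..., 60). The output data set names are hypothetical.

```sas
/* Sketch only: names follow Appendix 1; va_long, va_by_visit, va_eyes and
   va_by_eye are hypothetical working data sets */

/* Longitudinal correlations: one row per eye, one column per visit */
proc sort data=model_data out=va_long;
  by id eye visit;
run;

proc transpose data=va_long out=va_by_visit(drop=_name_) prefix=VA_;
  by id eye;
  id visit;          /* creates VA_0, VA_12, ..., VA_60 */
  var VAscore;
run;

proc sort data=va_by_visit;
  by eye id;
run;

proc corr data=va_by_visit pearson;
  by eye;            /* separate correlation matrix for each eye */
  var VA_0 VA_12 VA_24 VA_36 VA_48 VA_60;
run;

/* Inter-eye correlations: one row per subject and visit, one column per eye */
proc sort data=model_data out=va_eyes;
  by id visit eye;
run;

proc transpose data=va_eyes out=va_by_eye(drop=_name_) prefix=eye_;
  by id visit;
  id eye;            /* column names depend on how eye is coded */
  var VAscore;
run;

proc sort data=va_by_eye;
  by visit id;
run;

proc corr data=va_by_eye pearson;
  by visit;          /* inter-eye correlation at each time point */
  var eye_:;         /* all columns created with the eye_ prefix */
run;
```

PROC CORR also prints the mean and standard deviation of each variable, so the first call gives the visit-specific means and SDs as well. Because pairwise-complete observations are used by default, the number of eyes contributing to each coefficient falls as follow-up lengthens.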

The multivariable analysis results for treatment effect and other covariates from the naïve model (ignoring both inter-eye and longitudinal correlations), fixed effects model, mixed effects models (random intercept or random visit), and GEE models (using compound symmetry, or working independence) are shown in Table 5. Overall, the mean visual acuity decreased over time with about a 14-letter decline from baseline to 5 years (p < .0001, Table 5). There was a non-monotone effect of treatment over time with a small but significant effect of treatment on VA (estimated as mean difference in VA between treated eyes and control eyes) for all five models at 24 months (fixed effects model: mean (SE) = 0.98 ± 0.43 letters, p = .02; mixed effects model with random visit: mean (SE) = 0.98 ± 0.40 letters, p = .02; mixed effects model with random intercept: mean (SE) = 1.03 ± 0.47 letters, p = .03; GEE model using compound symmetry or working independence: mean (SE) = 1.05 ± 0.40 letters, p = .009). At 36 months, the estimated treatment effect was similar (mean difference approximately 1 letter, all p < .05, Table 5), but the treatment effect on VA at 60 months was minimal (mean VA difference ≤0.25 letters) and not statistically significant (all p ≥ 0.67). Unexpectedly, in four models (fixed effects model, mixed effects model with random visit, and GEE model using compound symmetry or working independence), current cigarette smokers had better VA than former or non-smokers at baseline (differed by 1.5 letters, p < .05) and at 12 months (differed by 2.1 letters, p < .05), but the difference was not significant by 60 months. 
The mixed effects model using a random intercept only and the naïve model gave somewhat different results for the smoking effect than the other models: the difference was not statistically significant at baseline or at 12 months, mainly because the estimated standard errors were much larger than in the other models, but it was statistically significant at 60 months (p < .05, Table 5). The goodness of model fit, assessed using −2lnL and AIC (the smaller the better) for the fixed effects model and mixed effects models and using QIC or QICu for the GEE models, is shown in Table 5. The model fit was significantly better with the mixed effects model using random visit than with the fixed effects model: Δ(−2 ln L) = 209.6, chi-square with 19 df, p < .001. The mixed effects model using random intercept and the naïve model fit the data poorly, as indicated by high values of −2lnL and AIC (Table 5). The GEE models using compound symmetry and working independence provided similar goodness of fit values.
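The p-value for the comparison of the fixed effects and random visit models can be reproduced from the reported quantities. The sketch below takes the chi-square reference distribution used in the text as given; the difference in −2lnL and the covariance parameter counts (43 and 24) come from the text and Table 5, and the variable names are illustrative.

```sas
/* Sketch only: likelihood ratio comparison using the values reported for CAPT */
data _null_;
  delta = 209.6;       /* difference in -2lnL between the two models */
  df    = 43 - 24;     /* difference in number of covariance parameters */
  p     = sdf('chisquare', delta, df);   /* upper-tail probability */
  put delta= df= p=;
run;
```

The SDF function is used instead of 1 − PROBCHI(delta, df) because the upper-tail probability is extremely small here and the subtraction would lose it to rounding. The resulting value, written to the SAS log, is far below .001.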

Table 5.

The comparison of results for effects of treatment and cigarette smoking on visual acuity (letters) change over time in the CAPT study (n = 1052 subjects, 11846 eye visits).

Effect	Time (months)	Naïve model (ignoring inter-eye and longitudinal correlations)		Fixed effects^†		Mixed effects^§ (Random visit)^Δ		Mixed effects^§ (Random intercept)		GEE^£ (Compound symmetry)		GEE^£ (Working independence)
Effect	Time (months)	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value
Visit			<0.0001		<0.0001		<0.0001		<0.0001		<0.0001		<0.0001
	0	(ref)		(ref)		(ref)		(ref)		(ref)		(ref)
	12	−0.78 (0.65)	0.65	−0.88 (0.80)	0.27	−0.88 (0.77)	0.25	−0.86 (1.12)	0.44	−0.86 (0.61)	0.16	−0.78 (0.62)	0.21
	24	−4.27 (1.73)	0.01	−4.63 (1.08)	<0.0001	−4.63 (1.08)	<0.0001	−4.64 (1.14)	<0.0001	−4.62 (1.06)	<0.0001	−4.27 (1.11)	0.0001
	36	−6.42 (1.73)	0.0002	−6.79 (1.34)	<0.0001	−6.78 (1.37)	<0.0001	−6.69 (1.14)	<0.0001	−6.68 (1.36)	<0.0001	−6.42 (1.39)	<0.0001
	48	−7.89 (1.76)	<0.0001	−8.50 (1.60)	<0.0001	−8.51 (1.65)	<0.0001	−8.23 (1.16)	<0.0001	−8.19 (1.49)	<0.0001	−7.89 (1.55)	<0.0001
	60	−13.2 (1.77)	<0.0001	−13.5 (1.89)	<0.0001	−13.6 (1.94)	<0.0001	−13.5 (1.17)	<0.0001	−13.5 (1.99)	<0.0001	−13.2 (2.06)	<0.0001
Laser vs. no treatment			0.19		0.01		0.03		0.06		0.03		0.03
	0	0.03 (0.55)	0.96	0.04 (0.23)	0.86	0.03 (0.18)	0.89	0.03 (0.46)	0.96	0.03 (0.18)	0.89	0.03 (0.18)	0.89
	12	0.30 (0.55)	0.59	0.28 (0.34)	0.41	0.28 (0.31)	0.38	0.29 (0.46)	0.54	0.30 (0.31)	0.34	0.30 (0.31)	0.34
	24	1.05 (0.56)	0.06	0.98 (0.43)	0.02	0.98 (0.40)	0.02	1.03 (0.47)	0.03	1.05 (0.40)	0.009	1.05 (0.40)	0.009
	36	1.10 (0.57)	0.054	1.02 (0.52)	0.049	1.04 (0.48)	0.03	1.06 (0.48)	0.03	1.10 (0.49)	0.02	1.10 (0.49)	0.02
	48	0.59 (0.58)	0.31	0.72 (0.60)	0.23	0.73 (0.56)	0.19	0.61 (0.48)	0.21	0.59 (0.56)	0.30	0.59 (0.56)	0.30
	60	−0.19 (0.58)	0.75	−0.25 (0.70)	0.72	−0.25 (0.66)	0.71	−0.21 (0.49)	0.67	−0.19 (0.67)	0.78	−0.19 (0.67)	0.78
Current smoking vs. former/no smoking			0.14		0.003		0.003		0.002		0.01		0.01
Current smoking vs. former/no smoking	0	1.49 (1.20)	0.21	1.48 (0.63)	0.02	1.49 (0.73)	0.04	1.49 (1.34)	0.27	1.49 (0.65)	0.02	1.49 (0.65)	0.02
	12	2.17 (1.23)	0.08	2.11 (0.96)	0.03	2.11 (1.02)	0.04	2.13 (1.37)	0.12	2.13 (0.81)	0.009	2.17 (0.82)	0.008
	24	0.26 (1.26)	0.84	−0.01 (1.21)	0.99	0.02 (1.27)	0.99	−0.01 (1.38)	0.99	0.01 (1.25)	1.00	0.26 (1.27)	0.84
	36	0.32 (1.25)	0.80	0.19 (1.47)	0.90	0.25 (1.56)	0.87	0.23 (1.38)	0.87	0.24 (1.54)	0.88	0.32 (1.55)	0.84
	48	0.30 (1.20)	0.81	0.29 (1.72)	0.86	0.33 (1.83)	0.86	0.34 (1.40)	0.81	0.35 (1.65)	0.83	0.30 (1.69)	0.86
	60	−2.90 (1.31)	0.03	−2.61 (2.00)	0.19	−2.64 (2.12)	0.21	−2.81 (1.41)	0.046	−2.81 (2.11)	0.18	−2.90 (2.17)	0.18
−2lnL		93,445		81,293		81,083		87,520
AIC		93,447		81,339		81,191		87,526
QIC										11867		11868
QICu										11864		11864
Covariance parameters		1		24		43		3		2		0

^†

Fixed effects model using SAS PROC MIXED with the REPEATED OPTION with a UN@UN correlation structure.

^§

Mixed effects model using SAS PROC MIXED with the RANDOM OPTION with random effects for subject and eye within subject for both the intercept and the Visit parameter.

^£

Generalized estimating equation (GEE) model.

^Δ

Time was modelled as categorical with 6 levels (i.e., months 0, 12, 24, 36, 48, 60) due to the non-linear decline in visual acuity.

Effect of treatment and factors associated with visual acuity outcome in AREDS

Among 1463 participants (871 bilateral) with AMD category 3 at baseline in their eligible eyes (2334 eyes) for this analysis, 1217 (83%) participants (1932 eyes) completed 5 years of follow-up. The inter-eye correlations assessed among bilateral cases using Pearson correlation coefficients ranged from 0.26 to 0.48 and were highest at baseline. The longitudinal correlation coefficients ranged from 0.26 to 0.90, diminished with longer time between repeated measures, and tended to be higher between pairs of visits at later time points (Table 6). The longitudinal correlation coefficients were similar in left eyes and right eyes (Table 6).

Table 6.

Visual acuity in left eye and right eye over time and their cross-sectional and longitudinal correlations (Pearson Correlation) among eyes with AMD category 3 at baseline in the AREDS participants (N = 1463 participants, 871 bilateral, and 592 unilateral).

		Time points (Months)
		0	12	24	36	48	60
Left eye	Mean (SD) VA score	83.6 (5.8)	82.8 (7.2)	81.4 (9.2)	80.0 (11.2)	78.6 (12.7)	76.7 (15.0)
Right eye	Mean (SD) VA score	84.0 (5.7)	82.7 (7.7)	81.5 (10.1)	79.7 (12.6)	78.1 (14.7)	76.3 (16.5)
		Pearson Correlation Coefficient
Left eye	0 (N = 1156)	1.00	0.42	0.33	0.30	0.29	0.26
	12 (N = 1104)		1.00	0.74	0.63	0.56	0.51
	24 (N = 1056)			1.00	0.81	0.72	0.63
	36 (N = 1031)				1.00	0.79	0.71
	48 (N = 996)					1.00	0.85
	60 (N = 953)						1.00
Right eye	0 (N = 1178)	1.00	0.39	0.34	0.29	0.26	0.26
	12 (N = 1122)		1.00	0.73	0.58	0.56	0.52
	24 (N = 1086)			1.00	0.80	0.75	0.70
	36 (N = 1061)				1.00	0.85	0.80
	48 (N = 1024)					1.00	0.90
	60 (N = 979)						1.00
Combined	0 (N = 2334)	1.00	0.40	0.33	0.29	0.27	0.26
	12 (N = 2226)		1.00	0.73	0.60	0.56	0.52
	24 (N = 2142)			1.00	0.81	0.74	0.67
	36 (N = 2092)				1.00	0.82	0.76
	48 (N = 2020)					1.00	0.88
	60 (N = 1932)						1.00
Inter-eye correlation	Pearson ρ	0.48	0.34	0.26	0.29	0.33	0.35
	N of bilateral subjects	871	832	797	777	746	715

Over the 5-year follow-up, the mean visual acuity decreased while its variation (e.g., SD) increased, with mean (SD) visual acuity of 84 (6) letters at baseline for both left eyes and right eyes, and 77 (15) letters for left eyes and 76 (17) letters for right eyes at 5 years (Table 6).

The multivariable analysis results from the naïve model, fixed effects model, mixed effects models (random intercept with or without random slope) and GEE models (using compound symmetry or working independence) are shown in Table 7. In these models that considered time as a continuous variable, the mean VA decreased over time with mean annual decline of approximately 1.5 letters (p < .0001, Table 7). The baseline VA means were similar across the four treatment groups (p ≥ 0.15) and there was no significant treatment effect on VA in all models (p ≥ 0.55 for test of interaction between time and treatment).

Table 7.

The comparison of results from various models for evaluating the factors associated with visual acuity among eyes with AMD category 3 at baseline in the AREDS Study (N = 1463 subjects, 2334 eyes, 13277 eye visits).

Effect	Naïve model (ignoring inter-eye and longitudinal correlations)		Fixed effects^†		Mixed effects^§ (Random intercept and slope)^Δ		Mixed effects^§ (Random intercept)		GEE^£ (Compound symmetry)		GEE^£ (Working independence)
Effect	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value
Intercept	85.0 (0.35)	<0.0001	84.7 (0.23)	<0.0001	85.0 (0.29)	<0.0001	85.2 (0.46)	<0.0001	85.4 (0.29)	<0.0001	85.0 (0.30)	<0.0001
Treatment Group		0.46		0.21		0.20		0.60		0.15		0.30
Placebo	(ref)		(ref)		(ref)		(ref)		(ref)		(ref)
Antioxidants only	0.02 (0.46)	0.96	0.04 (0.30)	0.90	0.30 (0.38)	0.42	0.17 (0.60)	0.77	0.26 (0.39)	0.51	0.02 (0.40)	0.96
Zinc only	−0.12 (0.46)	0.80	−0.17 (0.30)	0.58	−0.10 (0.38)	0.79	−0.18 (0.60)	0.77	−0.23 (0.39)	0.52	−0.12 (0.39)	0.76
Antioxidants+zinc	−0.63 (0.46)	0.17	−0.53 (0.30)	0.08	−0.51 (0.38)	0.18	−0.60 (0.60)	0.31	−0.59 (0.38)	0.13	−0.63 (0.39)	0.11
Time (Year)	−1.46 (0.12)	<0.0001	−1.39 (0.15)	<0.0001	−1.40 (0.15)	<0.0001	−1.47 (0.08)	<0.0001	−1.46 (0.16)	<0.0001	−1.46 (0.16)	<0.0001
Age (per year)	−0.34 (0.03)	<0.0001	−0.34 (0.02)	<0.0001	−0.34 (0.03)	<0.0001	−0.34 (0.04)	<0.0001	−0.34 (0.03)	<0.0001	−0.34 (0.03)	<0.0001
Hypertension: yes vs. No	−0.59 (0.34)	0.08	−0.39 (0.22)	0.07	−0.48 (0.28)	0.08	−0.55 (0.44)	0.21	−0.59 (0.29)	0.04	−0.59 (0.29)	0.04
Current Smoking: Yes vs. no	−1.99 (0.62)	0.002	−1.80 (0.40)	<0.0001	−1.71 (0.51)	0.0008	−1.67 (0.81)	0.04	−1.50 (0.57)	0.01	−1.99 (0.57)	0.0005
Age*time	−0.08 (0.01)	<0.0001	−0.08 (0.01)	<0.0001	−0.08 (0.01)	<0.0001	−0.08 (0.01)	<0.0001	−0.08 (0.01)	<0.0001	−0.08 (0.02)	<0.0001
Current smoking*time	−0.52 (0.22)	0.02	−0.55 (0.27)	0.04	−0.48 (0.27)	0.08	−0.55 (0.15)	0.0003	−0.55 (0.28)	0.054	−0.52 (0.29)	0.07
Hypertension*time	0.05 (0.11)	0.64	−0.05 (0.14)	0.73	−0.07 (0.15)	0.61	−0.01 (0.08)	0.88	−0.01 (0.15)	0.94	0.05 (0.16)	0.73
Treatment group*time		0.92		0.73		0.75		0.55		0.87		0.96
Placebo	(ref)		(ref)		(ref)		(ref)		(ref)		(ref)
Antioxidants only	−0.03 (0.16)	0.87	−0.16 (0.20)	0.41	−0.15 (0.20)	0.45	−0.06 (0.11)	0.57	−0.06 (0.20)	0.76	−0.03 (0.21)	0.91
Zinc only	−0.01 (0.16)	0.95	−0.02 (0.19)	0.90	−0.03 (0.20)	0.88	−0.01 (0.11)	0.98	−0.01 (0.22)	0.99	−0.01 (0.23)	0.97
Antioxidants+zinc	0.07 (0.16)	0.63	−0.05 (0.20)	0.80	0.06 (0.20)	0.76	0.09 (0.11)	0.39	0.09 (0.21)	0.65	0.08 (0.21)	0.72
−2lnL	96518		84836		85634		91241
AIC	96520		84882		85646		91247
QIC									12814		12825
QICu									12760		12760
Covariance parameters	1		24		5		3		2		0

^†

Fixed effects model using SAS PROC MIXED with the REPEATED OPTION with a UN@UN correlation structure.

^§

Mixed effects model using SAS PROC MIXED with the RANDOM OPTION with random effects for subject and eye within subject.

^Δ

Using random intercept and slope for the subject level, and random slope only for the eye level.

^£

Generalized estimating equation (GEE) model.

Note: Age is normalized using mean age (i.e., age-69).

Older age was associated with worse baseline VA (0.34 letters worse for every year difference in age, p < .0001, Table 7), and was also associated with more decline in VA during follow-up (an additional 0.08 letters of annual decline for every year increase in age, p < .0001). Participants with hypertension tended to have worse VA at baseline (mean difference of 0.4 to 0.6 letters, with p-values of 0.04 to 0.21 depending on the model, Table 7). However, hypertension had no effect on the VA change over time (p ≥ 0.61). Current smokers at baseline had worse baseline VA than non-current smokers, with the mean difference ranging from 1.5 letters (GEE model with compound symmetry) to 2.0 letters (naïve model and GEE model with working independence) and p-values ranging from 0.04 (mixed effects model with random intercept) to <0.0001 (fixed effects model, Table 7). Current smokers also had a greater VA decline over time than non-current smokers (annual decline difference of approximately 0.5 letters in all models); the difference was significant in the naïve model (p = .02), the fixed effects model (p = .04) and the mixed effects model with random intercept (p = .0003), but only marginally significant in the mixed effects model with random intercept and slope (p = .08) and in the GEE models using compound symmetry (p = .054) and working independence (p = .07).
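As a worked illustration of how a main effect and its interaction with time combine, the model-based difference in mean VA between current and non-current smokers after t years, with the other covariates held fixed, is the smoking coefficient plus t times the smoking-by-time coefficient. Using the fixed effects estimates in Table 7 (point estimates only; the symbols are introduced here for illustration):

```latex
\[
\Delta(t) = \hat{\beta}_{\text{smoke}} + \hat{\beta}_{\text{smoke}\times\text{time}}\; t,
\qquad
\Delta(0) = -1.80,
\qquad
\Delta(5) = -1.80 + 5 \times (-0.55) = -4.55 \ \text{letters}.
\]
```

A standard error for Δ(5) cannot be derived from Table 7 alone, because the covariance between the two coefficient estimates is not reported.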

The model fit was significantly better with the fixed effects model than with the mixed effects model with random intercept and random slope: Δ(−2 ln L) = 798, chi-square with 19 df, p < .0001, although coefficient estimates and SEs were similar in these two models. The mixed effects model using random intercept and the naïve model that ignores both inter-eye correlation and repeated measure correlation fit the data poorly, as indicated by high −2lnL and AIC values (Table 7). The GEE model using compound symmetry fit the data slightly better than the working independence model (ΔQIC = 11, 2 df, p < 0.001), although coefficients and SEs were similar.

Effect of sample size on the difference from various models for analysis of a random sample of AREDS participants

To evaluate the various models with smaller sample sizes, we analyzed the data from a random sample of 200 AREDS participants (114 bilateral) with AMD category 3 at baseline in their eligible eyes (314 eyes). The mean VA (SD), the magnitude and pattern of inter-eye correlation coefficient (0.20 to 0.49) among bilateral cases and the longitudinal correlation coefficients (0.20 to 0.91) in visual acuity were similar to those from the whole AREDS sample (Table 8).
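A participant-level random sample of this kind could be drawn as sketched below. This is not the authors' code: areds_cat3 is a hypothetical long-format data set (one record per eye per visit) with participant identifier id, the other data set names are likewise hypothetical, and the seed is arbitrary. Sampling is done on participants, not on eyes or visits, so that both eyes and all visits of a selected participant are retained.

```sas
/* Sketch only: areds_cat3 and all output data set names are hypothetical */

/* one record per participant */
proc sort data=areds_cat3(keep=id) out=participants nodupkey;
  by id;
run;

/* simple random sample of 200 participants; the seed is arbitrary */
proc surveyselect data=participants out=sample_ids
                  method=srs sampsize=200 seed=20200802;
run;

proc sort data=sample_ids;
  by id;
run;

proc sort data=areds_cat3 out=areds_sorted;
  by id;
run;

/* keep every eye and visit of the sampled participants */
data areds_sample200;
  merge areds_sorted sample_ids(in=selected);
  by id;
  if selected;
run;
```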

Table 8.

Visual acuity in left eye and right eye over time and their cross-sectional and longitudinal correlations (Pearson Correlation) among eyes with AMD category 3 at baseline in 200 randomly selected AREDS participants (114 bilateral, 86 unilateral).

		Time points (Months)
		0	12	24	36	48	60
Left eye	Mean (SD) VA score	83.5 (6.2)	82.2 (7.7)	81.1 (10.3)	80.1 (11.0)	79.9 (10.2)	78.0 (12.0)
Right eye	Mean (SD) VA score	83.6 (5.6)	82.9 (7.1)	81.9 (10.2)	80.5 (11.5)	77.6 (16.3)	76.2 (18.4)
Left eye		Pearson Correlation Coefficient
	0 (N = 161)	1.00	0.34	0.20	0.23	0.33	0.29
	12 (N = 155)		1.00	0.63	0.59	0.52	0.48
	24 (N = 145)			1.00	0.76	0.69	0.56
	36 (N = 144)				1.00	0.76	0.64
	48 (N = 134)					1.00	0.89
	60 (N = 131)						1.00
Right eye
	0 (N = 153)	1.00	0.42	0.30	0.30	0.26	0.25
	12 (N = 149)		1.00	0.69	0.59	0.50	0.44
	24 (N = 137)			1.00	0.86	0.67	0.58
	36 (N = 136)				1.00	0.79	0.69
	48 (N = 129)					1.00	0.91
	60 (N = 126)						1.00
Combined
	0 (N = 314)	1.00	0.38	0.25	0.27	0.27	0.26
	12 (N = 304)		1.00	0.66	0.59	0.49	0.44
	24 (N = 282)			1.00	0.81	0.67	0.57
	36 (N = 280)				1.00	0.77	0.67
	48 (N = 263)					1.00	0.90
	60 (N = 257)						1.00
Inter-eye correlation	Pearson ρ	0.49	0.23	0.23	0.20	0.40	0.44
	N of bilateral subjects	114	111	101	101	94	92

The multivariable analysis results from the naïve model, fixed effects model, mixed effects models (random intercept with and without random slope) and GEE models (using compound symmetry or working independence) are shown in Table 9. Similar to the analysis using the whole AREDS sample, there was no significant treatment effect (p ≥ 0.12), but there was a significant age effect (p ≤ 0.03) on VA change over time. However, in this small sample analysis, the smoking effect and hypertension effect on VA varied substantially across models. The smoking effect was statistically significant in the mixed effects model using random intercept (current smokers had 1.34 letters more decline annually than non-current smokers, p = .004), but was not significant in any other model, mainly due to the much larger SEs from the other models (Table 9). The hypertension effect was significant in the naïve model (hypertensive patients had 0.65 letters less decline annually than non-hypertensive patients, p = .04), and marginally significant in the mixed effects model using random intercept (slope difference of 0.44 letters, p = .06) and the GEE model using working independence (slope difference of 0.65 letters, p = .08), but was not significant in the fixed effects model (slope difference of 0.27 letters, p = .51) or the mixed effects model using random intercept and random slope (slope difference of 0.21 letters, p = .63). The goodness of model fit showed that the fixed effects model fit the data better than the mixed effects model using random intercept and random slope (Δ(−2 ln L) = 188, chi-square with 19 df, p < .0001), while the naïve model and the mixed effects model using random intercept fit the data much worse. The GEE models using compound symmetry and working independence provided similar goodness of fit.

Table 9.

The comparison of results from various models for evaluating the factors associated with visual acuity among eyes with AMD category 3 at baseline in 200 randomly selected participants in the AREDS Study (N = 200 subjects, 314 eyes, 1799 eye visits).

	Naïve model (ignoring inter-eye and longitudinal correlations)		Fixed effects^†		Mixed effects^§ (Random intercept and slope)^Δ		Mixed effects^§ (Random intercept)		GEE^£ (Compound symmetry)		GEE^£ (Working independence)
Effect	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value	Beta (SE)	P-value
Intercept	83.4 (0.93)	<0.0001	83.0 (0.63)	<0.0001	83.2 (0.79)	<0.0001	83.5 (1.19)	<0.0001	83.7 (0.75)	<0.0001	83.4 (0.83)	<0.0001
Treatment Group		0.44		0.11		0.19		0.65		0.23		0.44
Placebo	(ref)		(ref)		(ref)		(ref)		(ref)		(ref)
Antioxidants only	0.48 (1.24)	0.70	1.10 (0.85)	0.19	1.39 (1.06)	0.19	0.79 (1.60)	0.62	1.03 (1.08)	0.34	0.48 (1.17)	0.68
Zinc only	1.88 (1.23)	0.13	1.87 (0.84)	0.03	2.14 (1.06)	0.04	1.91 (1.59)	0.23	1.92 (0.94)	0.04	1.88 (0.99)	0.06
Antioxidants+zinc	1.28 (1.24)	0.30	1.68 (0.84)	0.048	1.75 (1.06)	0.10	1.46 (1.60)	0.36	1.52 (1.08)	0.16	1.28 (1.14)	0.26
Time (Year)	−1.33 (0.32)	<0.0001	−1.14 (0.42)	0.007	−1.27 (0.44)	0.004	−1.32 (0.24)	<0.0001	−1.31 (0.35)	0.0002	−1.33 (0.37)	0.0004
Age (per year)	−0.32 (0.08)	0.0001	−0.35 (0.06)	<0.0001	−0.32 (0.07)	<0.0001	−0.31 (0.11)	0.004	−0.31 (0.08)	0.0001	−0.32 (0.08)	<0.0001
Hypertension: yes vs. No	−1.33 (0.93)	0.15	−0.79 (0.63)	0.21	−0.93 (0.80)	0.24	−1.25 (1.20)	0.30	−1.34 (0.85)	0.12	−1.33 (0.89)	0.13
Current Smoking: Yes vs. no	2.11 (1.79)	0.24	1.23 (1.21)	0.31	2.18 (1.52)	0.15	2.78 (2.29)	0.23	2.78 (2.12)	0.19	2.11 (2.17)	0.33
Age*time	−0.09 (0.03)	0.002	−0.09 (0.04)	0.01	−0.09 (0.04)	0.03	−0.09 (0.02)	<0.0001	−0.09 (0.04)	0.02	−0.09 (0.04)	0.03
Current smoking*time	−1.04 (0.63)	0.10	−0.86 (0.79)	0.28	−1.03 (0.82)	0.21	−1.34 (0.47)	0.004	−1.33 (1.13)	0.24	−1.04 (1.21)	0.39
Hypertension*time	0.65 (0.32)	0.04	0.27 (0.41)	0.51	0.21 (0.43)	0.63	0.44 (0.23)	0.06	0.45 (0.35)	0.20	0.65 (0.37)	0.08
Treatment group*time		0.31		0.39		0.55		0.12		0.62		0.59
Placebo	(ref)		(ref)		(ref)		(ref)		(ref)		(ref)
Antioxidants only	−0.18 (0.43)	0.68	−0.74 (0.56)	0.19	−0.58 (0.58)	0.32	−0.29 (0.32)	0.37	−0.28 (0.53)	0.59	−0.18 (0.56)	0.75
Zinc only	−0.58 (0.42)	0.16	−0.66 (0.54)	0.23	−0.52 (0.57)	0.36	−0.55 (0.31)	0.07	−0.55 (0.56)	0.32	−0.58 (0.58)	0.31
Antioxidants+zinc	0.16 (0.43)	0.71	−0.05 (0.55)	0.93	0.08 (0.58)	0.90	0.11 (0.31)	0.72	0.11 (0.47)	0.81	0.16 (0.49)	0.74
−2lnL	12809		11358		11546		12212
AIC	12811		11403		11559		12218
QIC									1766		1777
QICu									1714		1714
Covariance parameters	1		24		5		3		2		0

^†

Fixed effects model using SAS PROC MIXED with the REPEATED OPTION with a UN@UN correlation structure.

^§

Mixed effects model using SAS PROC MIXED with the RANDOM OPTION with random effects for subject and eye within subject.

^Δ

Using random intercept and slope for the subject level, and random slope only for the eye level.

^£

Generalized estimating equation (GEE) model.

Note: Age is normalized using mean age (i.e., age-69).

Discussion

In this paper, we introduced three statistical modelling approaches for longitudinal correlated continuous eye data including the fixed effects model, mixed effects model and modelling with generalized estimating equations. We demonstrated these three modelling approaches in SAS by analyzing datasets from two clinical trials with two eyes in different treatment groups (paired design) in one study (CAPT) and two eyes in the same comparison groups (parallel design) in another study (AREDS). We illustrated these modelling approaches with different covariance structures and compared their goodness of fit based on −2lnL and AIC for fixed effects and mixed effects models and QIC and QICu for the GEE model.

In the analysis of longitudinal data using mixed effects models, we need to choose which variables to model as fixed effects, which variables to model as random effects, and the covariance structure to account for the correlations from longitudinal repeated measures. There are a variety of considerations when selecting the covariance structure, including the number of parameters, the interpretation of the structure and the goodness of fit. Most statistical software can fit mixed effects models with a range of covariance structures.¹¹ In choosing the covariance structure for analyzing correlated data, we can use information criteria such as −2lnL, AIC,¹² and BIC.¹³ Ferron et al.¹⁴ found that, on average, using the AIC led to the selection of the correct covariance structure about 79% of the time. The −2lnL and AIC can also be used to compare fixed effects models vs. mixed effects models, and to compare mixed effects models with vs. without a random slope, as we demonstrated in our examples. However, there is no direct way to compare the goodness of fit of the fixed effects and mixed effects models with that of the GEE models, because the fixed effects and mixed effects models are likelihood-based methods, while GEE is not.

The QIC statistic proposed by Pan⁷ and further discussed by Hardin and Hilbe⁸ is analogous to the familiar AIC statistic used for comparing the fit of models with likelihood-based methods. QIC can be used to compare working correlation structures for a given GEE model.
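For reference, the two criteria are commonly written as below. This is a sketch in notation introduced here, not taken from the paper:

```latex
\[
\mathrm{QIC}(R) = -2\,Q\!\left(\hat{\boldsymbol{\beta}}(R);\, I\right)
                + 2\,\operatorname{trace}\!\left(\hat{\Omega}_{I}\,\hat{V}_{R}\right),
\qquad
\mathrm{QIC}_{u}(R) = -2\,Q\!\left(\hat{\boldsymbol{\beta}}(R);\, I\right) + 2p .
\]
```

Here Q is the quasi-likelihood computed under the independence working model and evaluated at the GEE estimate obtained with working correlation R, Ω̂_I is the inverse of the model-based covariance estimate under independence, V̂_R is the robust (sandwich) covariance estimate under R, and p is the number of regression parameters. Smaller values indicate better fit. Because QICu replaces the trace term with 2p, it does not depend on the working correlation through the penalty and is therefore generally suggested for comparing sets of covariates, not correlation structures.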

The goodness of fit statistics showed the fixed effects model provides the best fit (smallest −2lnL and AIC) for AREDS, while the mixed effects model with random visit provides a better fit than the fixed effects model for CAPT. For both CAPT and AREDS, GEE with a compound symmetry structure provided better fit than GEE with a working independence structure as the QIC from compound symmetry was smaller. We recommend fitting the data using various models and reporting the results from the model with the best fit.

For the examples in this paper, the fixed effects model and the mixed effects model using random intercept and random slope provided a better fit than the mixed effects model with random intercept and the naïve model. However, the fixed effects model under the UN@UN covariance structure usually requires longer computation time, and sometimes the covariance matrix may be non-positive-definite or the model may not converge because of the large number of parameters that need to be estimated, particularly when the sample size is small. Mixed effects models using random intercept and slope often encounter the problem of a non-positive-definite G matrix, due either to over-parametrization from modelling time as a categorical variable (e.g., CAPT) or to insufficient variation of the outcome attributable to the random effect (e.g., AREDS). For a G matrix (i.e., covariance matrix of random effects) to be a valid covariance matrix, it must be positive-definite. If it is non-positive-definite, it is recommended to remove the corresponding random effect from the model. In our two examples, after we removed the random intercept from the mixed effects model for the analysis of the CAPT data, and removed the random intercept for eye within subject in the analysis of the AREDS data, the G matrix was positive-definite and the goodness of fit statistics were almost identical to those of the mixed effects model with both random intercept and random slope (data not shown). In addition, we used the SAS default REML method to estimate the parameters of the mixed effects models. However, we found that the ML method provided very similar results to REML (data not shown).
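Both checks mentioned above, inspecting the estimated G matrix and repeating the fit with ML instead of REML, can be sketched with small changes to the Appendix 1 code. The example below is illustrative only: it reuses the CAPT data set and variable names from Appendix 1, refits the random intercept model, and adds the G and GCORR options to the first RANDOM statement.

```sas
/* Sketch only: random intercept model of Appendix 1, refitted with ML */
/* DDFM=KR from Appendix 1 is omitted to keep the sketch minimal */
proc mixed data=model_data noitprint covtest method=ML order=data;
  class id eye cigsmk visit group;
  model VAscore=visit group*visit cigsmk*visit/solution;
  random int/sub=ID g gcorr;      /* print estimated G and its correlation form */
  random int/subject=ID(eye);
  format cigsmk cig2smkf.;
  title "Mixed effects model with random intercepts, ML estimation";
run;
```

When the estimated G matrix is not positive-definite, PROC MIXED writes a note to that effect in the SAS log, so the log should be checked after each fit. Fit statistics from ML and REML runs are not on the same scale, so −2lnL and AIC should be compared only between models fitted with the same estimation method.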

All statistical models have assumptions, and if the assumptions are not met, the model can yield biased estimates of regression coefficients and their standard errors (SEs), and invalid p-values. For the mixed effects model using random intercept, it is assumed that the variance of the outcome measure is the same at all time points. This assumption is often violated in longitudinal data, as in both of our examples, where the variance increased over time. The mixed effects model with random intercept also assumes that the inter-eye correlation remains constant across all time points and that the longitudinal correlations are the same between any pair of time points. The latter assumption was not met in the CAPT and AREDS datasets. The mixed effects model assumes the same longitudinal correlation in the left eye and right eye; this assumption seems to have been met in both CAPT and AREDS. However, these assumptions may not hold when an eye-specific treatment is effective in the study eye compared with the untreated fellow eye.

Compared to the mixed effects model with random intercept, the fixed effects model and the mixed effects model using random intercept and random slope allow the variance to change over time and allow different longitudinal correlations between different time points; they thus offer more flexibility. Interestingly, in CAPT, we found that the mixed effects model using random visits (modelled as categorical) provided a better model fit than the fixed effects model, while in AREDS, we found that the fixed effects model provided a better model fit than the mixed effects model with random intercept and slope, based on the −2lnL. The mixed effects model with random intercept did not fit the data well, mainly because in both CAPT and AREDS the variance and the longitudinal correlation were not constant across follow-up time points, thus not meeting the assumptions of this model. Before statistical modelling of longitudinal correlated eye data, checking the inter-eye correlation at each time point and the longitudinal correlation for left eyes and right eyes separately may provide insights into the selection of the appropriate statistical models.

The various statistical modelling approaches all need a sample size sufficient to provide robust estimates of regression coefficients and variance parameters. As different models have different numbers of parameters in the covariance structure, the impact of a smaller sample size on these models may vary. In our analysis of data from a random sample of 200 participants from the AREDS study, we found that the differences across the statistical models became more substantial. For example, the smoking effect on VA change over time was statistically significant or marginally significant (all p ≤ 0.08) in every model fitted to the full AREDS dataset, but was significant only in the mixed effects model with random intercept when the analysis was restricted to the random sample of 200 subjects.

In conclusion, longitudinal models (fixed effects models, random effects models or GEE) using the eye as the unit of analysis can be implemented using available statistical software (SAS, Stata and R) and offer many advantages for valid and efficient analysis of longitudinal ophthalmologic data with both inter-eye and longitudinal correlations. One limitation is that the fixed effects model with a UN@UN, UN@CS or UN@AR correlation structure is currently available only in SAS, while mixed effects and GEE models are also available in other statistical packages (e.g., Stata, R). Different models require different assumptions and may yield different results. Goodness-of-fit statistics may guide the selection of the appropriate model for a specific dataset.

Funding

Supported by grants [R01EY022445 and P30 EY01583-26] from the National Eye Institute, National Institutes of Health, Department of Health and Human Services.

Appendix 1: SAS codes for analyzing CAPT data

/* Naïve fixed effects model without accounting for correlations */
proc mixed data=model_data order=data;
  class id cigsmk eye visit group;
  model VAscore=visit group*visit cigsmk*visit/solution;
  format cigsmk cig2smkf.;
  title "Naïve fixed effects model without accounting for correlations";
run;

/* Fixed effects model using un@un */
proc mixed data=model_data noitprint covtest method=REML noclprint order=data;
  class id cigsmk eye visit group;
  model VAscore=visit group*visit cigsmk*visit/solution DDFM=KR;
  repeated eye visit/type=un@un sub=ID R RCORR;
  format cigsmk cig2smkf.;
  title "Fixed effects model using un@un covariance structure";
run;

/* Mixed effects model using random intercept */
proc mixed data=model_data noitprint covtest method=REML order=data;
  class id eye cigsmk visit group;
  model VAscore=visit group*visit cigsmk*visit/solution DDFM=KR;
  random int/type=un sub=ID;
  random int/type=un subject=ID(eye);
  format cigsmk cig2smkf.;
  title "Mixed effects model: two random statements for intercept";
run;

/* Mixed effects model using random visit */
proc mixed data=model_data noitprint covtest method=REML order=data;
  class id eye cigsmk visit group;
  model VAscore=visit group*visit cigsmk*visit/solution DDFM=KR;
  random visit/type=un sub=ID;
  random visit/type=un subject=ID(eye);
  format cigsmk cig2smkf.;
  title "Mixed effects model: two random statements for visit treated as categorical";
run;

/* GEE model using compound symmetry */
proc genmod data=model_data order=data;
  class id eye cigsmk visit group;
  model VAscore=visit group*visit cigsmk*visit/type3;
  repeated subject=ID/type=cs;
  format cigsmk cig2smkf.;
  title "Using GEE type=cs";
run;

/* GEE model using working independence */
proc genmod data=model_data order=data;
  class id eye cigsmk visit group;
  model VAscore=visit group*visit cigsmk*visit/type3;
  repeated subject=ID/type=ind;
  format cigsmk cig2smkf.;
  title "Using GEE type=ind";
run;

Appendix 2: SAS codes for analyzing AREDS data among participants with baseline AMD category 3

/* Naïve fixed effects model without accounting for correlations */
proc mixed data=VA_eye_elig_cat3 noitprint covtest method=REML order=internal;
  class id eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/solution DDFM=KR;
  title "Naïve fixed effects model without accounting for correlations";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

/* Fixed effects model using type=un@un */
proc mixed data=VA_eye_elig_cat3 noitprint covtest method=REML order=internal;
  class id visit eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/solution DDFM=KR;
  repeated eye visit/type=un@un sub=ID;
  title "Fixed effects model with un@un and time as continuous";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

/* Mixed effects model with random intercepts */
proc mixed data=VA_eye_elig_cat3 noitprint covtest method=REML order=internal;
  class id eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/solution DDFM=KR;
  random int/type=un sub=ID;
  random int/type=un subject=ID(eye);
  title "Random effects model: two random statements for intercept";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

/* Mixed effects model with random intercept and random slope */
proc mixed data=VA_eye_elig_cat3 noitprint covtest method=REML order=internal;
  class id eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/solution DDFM=KR;
  random int year/type=un sub=ID;
  random year/type=un subject=ID(eye);
  title "Mixed effects model: random intercept and random slope, time as continuous";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

/* GEE model using type=cs */
proc genmod data=VA_eye_elig_cat3 order=internal;
  class id eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/type3;
  repeated subject=ID/type=cs;
  title "Using GEE type=cs";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

/* GEE model using type=ind */
proc genmod data=VA_eye_elig_cat3 order=internal;
  class id eye group smk_now hypstat/ref=first;
  model VA=group year age_norm hypstat smk_now age_norm*year smk_now*year
           hypstat*year group*year/type3;
  repeated subject=ID/type=ind;
  title "Using GEE type=ind";
  format hypstat hypstatf. AMDCAT_eye amdcatf.;
run;

Appendix 3: STATA Code for the Mixed Effect Model and GEE Models

use "U:\Working Projects\Bernie Rosner Project\Statistical analysis for longitudinal eye data\data\areds_cat3.dta"

*** Mixed effects model: random intercept, unstructured
mixed VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year || ID: || eye:, covariance(unstructured)

*** Mixed effects model: random intercept, exchangeable (i.e., compound symmetry)
mixed VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year || ID: || eye:, covariance(exchangeable)

*** Mixed effects model: random intercept and slope, exchangeable
mixed VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year || ID: year || eye: year, covariance(exchangeable)

*** Mixed effects model: random intercept and slope, unstructured
mixed VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year || ID: year || eye: year, covariance(unstructured)

*** GEE: working independence (variable names are case-sensitive, so the panel variable is ID as above)
xtgee VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year, i(ID) family(gaussian) link(identity) corr(independent)

*** GEE: exchangeable (i.e., compound symmetry)
xtgee VA i.group##c.year c.age_norm##c.year smk_now##c.year hypstat##c.year, i(ID) family(gaussian) link(identity) corr(exchangeable)

Appendix 4: R Codes for the Mixed Effects Model and GEE Models

areds <- read.csv(file="U:\\Working Projects\\Bernie Rosner Project\\Statistical analysis for longitudinal eye data\\data\\areds_cat3.csv", header=TRUE)

## make categorical variable for treatment group
group.ct <- factor(areds$group)

## fit the mixed effects models
library(lme4)

## mixed effects model: random intercepts for person and for eye within person
areds.mixmodel <- lmer(VA ~ group.ct+age_norm+year+smk_now+hypstat + age_norm*year + smk_now*year + hypstat*year + group.ct*year + (1|ID) + (1|eye:ID), data=areds)
summary(areds.mixmodel)

## mixed effects model: random intercept + random slope
areds.mixmodel <- lmer(VA ~ group.ct+age_norm+year+smk_now+hypstat + age_norm*year + smk_now*year + hypstat*year + group.ct*year + (1+year|ID) + (1+year|eye:ID), data=areds)
summary(areds.mixmodel)

## fit GEE models
## (gee() assumes that rows belonging to the same ID are contiguous)
library(gee)

## working independence
areds.gee.ind <- gee(VA ~ group.ct+age_norm+year+smk_now+hypstat + age_norm*year + smk_now*year + hypstat*year + group.ct*year, id=ID, corstr="independence", data=areds)
summary(areds.gee.ind)

## compound symmetry (called "exchangeable" in gee())
areds.gee.cs <- gee(VA ~ group.ct+age_norm+year+smk_now+hypstat + age_norm*year + smk_now*year + hypstat*year + group.ct*year, id=ID, corstr="exchangeable", data=areds)
summary(areds.gee.cs)

Footnotes

Disclosure statement

The authors have no conflicts of interest to disclose.

References

1. The CAPT Research Group. The complications of age-related macular degeneration prevention trial (CAPT): rationale, design and methodology. Clin Trials. 2004;1(1):91–107. doi: 10.1191/1740774504cn007xx.
2. The AREDS Research Group. The age-related eye disease study (AREDS): design implications. AREDS report no. 1. Control Clin Trials. 1999;20(6):573–600. doi: 10.1016/S0197-2456(99)00031-8.
3. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38(4):963–974. doi: 10.2307/2529876.
4. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42(1):121–130. doi: 10.2307/2531248.
5. Ying GS, Maguire MG, Glynn R, Rosner B. Tutorial on biostatistics: linear regression analysis of continuous correlated eye data. Ophthalmic Epidemiol. 2017;24(2):130–140. doi: 10.1080/09286586.2016.1259636.
6. Jennrich RI, Schluchter MD. Unbalanced repeated measures models with structural covariance matrices. Biometrics. 1986;42(4):805–820. doi: 10.2307/2530695.
7. Pan W. Akaike’s information criterion in generalized estimating equations. Biometrics. 2001;57(1):120–125. doi: 10.1111/j.0006-341X.2001.00120.x.
8. Hardin JW, Hilbe JM. Generalized Estimating Equations. Boca Raton, FL: Chapman & Hall/CRC; 2003.
9. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53(3):983–997. doi: 10.2307/2533558.
10. Schaalje GB, McBride JB, Fellingham GW. Adequacy of approximations to distributions of test statistics in complex mixed linear models. J Agric Biol Environ Stat. 2002;7(4):512–524. doi: 10.1198/108571102726.
11. Littell RC, Milliken GA, Stroup WW, Wolfinger RD. SAS System for Mixed Models. 1st ed. Cary: SAS Institute, Incorporated; 1999.
12. Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19(6):716–723. doi: 10.1109/TAC.1974.1100705.
13. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–464. doi: 10.1214/aos/1176344136.
14. Ferron J, Dailey R, Yi Q. Effects of misspecifying the first-level error structure in two-level models of change. Multivariate Behav Res. 2002;37(3):379–403. doi: 10.1207/S15327906MBR3703_4.

