A power approximation for the Kenward and Roger Wald test in the linear mixed model

Sarah M Kreidler; Brandy M Ringham; Keith E Muller; Deborah H Glueck

2021 Jul 21;16(7):e0254811. doi: 10.1371/journal.pone.0254811

Editor: Lei Shi⁵

PMCID: PMC8294572 PMID: 34288958

Abstract

We derive a noncentral $F$ power approximation for the Kenward and Roger test. We use a method of moments approach to form an approximate distribution for the Kenward and Roger scaled Wald statistic, under the alternative. The result depends on the approximate moments of the unscaled Wald statistic. Via Monte Carlo simulation, we demonstrate that the new power approximation is accurate for cluster randomized trials and longitudinal study designs. The method retains accuracy for small sample sizes, even in the presence of missing data. We illustrate the method with a power calculation for an unbalanced group-randomized trial in oral cancer prevention.

1 Introduction

1.1 Motivation

Linear mixed models are widely used in biomedical research for inference in analyses with missing data. Kenward and Roger [1] described a scaled Wald statistic and null case reference distribution for tests of fixed effects in the linear mixed model. Despite the widespread use of the Kenward and Roger [1] method for data analysis, no general methods are available to calculate power for the Kenward and Roger [1] test.

Several authors have described power approximations for related tests and models. Helms [2] described a noncentral $F$ power approximation for a Wald test, but used a different null case reference distribution than the one derived by Kenward and Roger [1]. Stroup [3] suggested an “exemplary data” approach for calculating power for mixed models with missing data. Tu et al. [4, 5] developed an asymptotic power approximation based on generalized estimating equations. Shieh [6] provided noncentral $F$ power approximations for multivariate models with random covariates and no missing data. Chi, Glueck, and Muller [7] demonstrated that power methods for the general linear multivariate model may be used in complete, balanced, homoscedastic mixed models.

We derive a noncentral $F$ power approximation for the Kenward and Roger [1] test for a broad range of models. We use a method of moments approach [8] to form an approximate distribution of the Kenward and Roger [1] scaled Wald statistic, F_R, under the alternative. The reference distribution of F_R under the alternative depends on the approximate moments of the unscaled Wald statistic.

The remainder of the manuscript is organized as follows. In Section 2, we introduce notation for the general linear mixed model and briefly review the methods of Kenward and Roger [1]. In Section 3, we describe a noncentral $F$ power approximation for the Kenward and Roger [1] test. In Section 4, we summarize the Monte Carlo simulation study used to evaluate the power approximation. In Section 5, we demonstrate a power calculation for an unbalanced cluster-randomized trial in oral cancer prevention. In Section 6, we provide concluding remarks.

2 Notation, models, and hypothesis testing

2.1 Notation

For i ∈ {1, …, n}, let a = {a_i} denote an n × 1 column vector. Furthermore, for i ∈ {1, …, n} and j ∈ {1, …, m}, let A = {a_ij} indicate an n × m matrix with transpose A′ = {a_ji}. Let I_d be a (d × d) identity matrix. For a matrix A = [a₁ a₂ … a_n], let $vec (A) = [\begin{matrix} a_{1}^{'} & a_{2}^{'} & \dots & a_{n}^{'} \end{matrix}]^{'}$ . Define the Kronecker product of two matrices A and B as A ⊗ B = {a_ij B} [9, Section 1.3].

Extend the direct sum operator [9, Section 1.3] to sets of arbitrarily sized matrices as follows. Let {A₁, …, A_J} be a set of matrices such that A_j has dimension (r_j × c_j). Let $0_{r_{i}, c_{j}}$ be an (r_i × c_j) matrix of zeros. Define the direct sum of {A₁, …, A_J} as

\begin{matrix} \oplus_{j = 1}^{J} A_{j} = [\begin{matrix} A_{1} & 0_{r_{1}, c_{2}} & \dots & 0_{r_{1}, c_{J}} \\ 0_{r_{2}, c_{1}} & A_{2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0_{r_{J - 1}, c_{J}} \\ 0_{r_{J}, c_{1}} & \dots & 0_{r_{J}, c_{J - 1}} & A_{J} \end{matrix}] . \end{matrix}

(1)
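As an illustration of Eq 1, the following minimal sketch builds the direct sum of arbitrarily sized matrices. It is not taken from the authors' software; the function name `direct_sum` and the list-of-rows representation are assumptions made for this example.

```python
def direct_sum(blocks):
    """Direct sum (Eq 1) of matrices stored as lists of rows.

    Each block may have a different number of rows and columns.
    """
    total_cols = sum(len(block[0]) for block in blocks)
    result = []
    col_offset = 0
    for block in blocks:
        n_cols = len(block[0])
        for row in block:
            # Pad with zeros to the left and right of the current block.
            left = [0] * col_offset
            right = [0] * (total_cols - col_offset - n_cols)
            result.append(left + list(row) + right)
        col_offset += n_cols
    return result


A1 = [[1, 2]]      # a (1 x 2) block
A2 = [[3], [4]]    # a (2 x 1) block
summed = direct_sum([A1, A2])
# summed is the (3 x 3) matrix [[1, 2, 0], [0, 0, 3], [0, 0, 4]]
```

The off-diagonal zero blocks take their row count from the block to their side and their column count from the block above or below, which is why the result is well defined even when the blocks are not square.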

Let δ ∈ {1, …, (2^p − 1)} be the number of distinct observation patterns. For d ∈ {1, …, δ}, define the set R_d ⊆ {1, …, p} with cardinality p_d, where 1 ≤ p_d ≤ p. For every R_d, let D_p,d, a deletion matrix, be the (p_d × p) submatrix of I_p formed by keeping each row i of I_p such that i ∈ R_d. For example, with p = 3, a (3 × 3) matrix A, and R_d = {1, 3},

\begin{matrix} D_{3, d} = [\begin{matrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix}] \end{matrix}

(2)

and

\begin{matrix} D_{3, d} A D_{3, d}^{'} = [\begin{matrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{matrix}] . \end{matrix}

(3)
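The deletion matrix example in Eqs 2 and 3 can be checked numerically. The sketch below is illustrative only; the helper names (`deletion_matrix`, `matmul`, `transpose`) are hypothetical, and the entries of A are coded as 10i + j so that each element a_ij can be recognized in the result.

```python
def deletion_matrix(p, kept):
    """Rows of I_p whose 1-based index lies in `kept` (the matrix D_{p,d})."""
    return [[1 if j == i else 0 for j in range(1, p + 1)] for i in sorted(kept)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


D = deletion_matrix(3, {1, 3})                       # Eq 2
A = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]       # a_ij coded as 10*i + j
sub = matmul(matmul(D, A), transpose(D))             # Eq 3
# sub keeps rows and columns 1 and 3 of A: [[11, 13], [31, 33]]
```

The same operation, applied to Σ_max, yields the covariance matrix of an independent sampling unit that is missing some of its p planned observations.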

Let E₀(u) and E_A(u) be the expectations of the random variable u under the null and alternative hypotheses, respectively. Similarly, let $V_{0} (u)$ and $V_{A} (u)$ indicate the variance under the null and alternative hypotheses. For random matrix variates, denote the covariance under the null and alternative hypotheses as $V_{0} (A)$ and $V_{A} (A)$ , respectively.

Let $X \sim D$ indicate that random variable X follows a distribution D exactly, while $X \dot{\sim} D$ indicates that X follows D approximately. Let $F \sim F (ν_{n}, ν_{d}, γ)$ indicate that the random variable F follows a noncentral $F$ distribution [10] with numerator degrees of freedom ν_n, denominator degrees of freedom ν_d, and noncentrality parameter γ. For γ = 0, F is said to follow a central $F$ distribution, written $F \sim F (ν_{n}, ν_{d})$ . Let $F (f; ν_{n}, ν_{d}, γ)$ denote the corresponding cumulative distribution function evaluated at f, and define the quantile function $F^{- 1} (b; ν_{n}, ν_{d}, γ)$ such that for 0 ≤ b ≤ 1

\begin{matrix} F (f; ν_{n}, ν_{d}, γ) = b \Leftrightarrow F^{- 1} (b; ν_{n}, ν_{d}, γ) = f . \end{matrix}

(4)

Use $Y \sim N_{N, p} (M, Ξ, Σ)$ to indicate that the (N × p) matrix Y follows a matrix Gaussian distribution, with M an (N × p) matrix of means, Ξ an (N × N) symmetric, positive definite column covariance matrix, and Σ a (p × p) symmetric, positive definite row covariance matrix [9, Chapter 8]. Write $W \sim W_{p} (N, Σ)$ to indicate that the (p × p) matrix W follows a central Wishart distribution of dimension p, degrees of freedom N, and covariance Σ. For Ψ = Σ⁻¹, write $W^{- 1} \sim I W_{p} \{(N + p + 1), Ψ\}$ to indicate that W⁻¹ follows a central inverse Wishart distribution of dimension p, degrees of freedom N + p + 1, and precision matrix Ψ [11, p. 111, Theorem 3.4.1].

2.2 The general linear mixed model

We describe the general linear mixed model for Gaussian outcomes using the notation of Muller and Stewart [9, Chapter 5]. Let i ∈ {1, …, N} indicate the ith independent sampling unit [9, Chapter 5]. An independent sampling unit may be a single participant, as in a clinical trial, or a group of participants, as in a cluster-randomized study. Observations from two different independent sampling units are statistically independent. Observations within an independent sampling unit may be correlated. For example, for a participant in a longitudinal trial, repeated measurements over time will be correlated.

Let p_i be the number of observations for the ith independent sampling unit, with p = max_i(p_i). For the ith independent sampling unit, let y_i be the (p_i × 1) vector of observed outcomes, X_i be the (p_i × r) fixed effects design matrix of rank r, and e_i be the (p_i × 1) vector of random errors. Assume that for i ≠ j, e_i ⊥ e_j and y_i ⊥ y_j. Let Σ_i be a (p_i × p_i) symmetric, positive definite matrix, with

\begin{matrix} e_{i} \sim N_{p_{i}} (0, Σ_{i}) . \end{matrix}

(5)

Let β be the (r × 1) vector of regression parameters. The linear mixed model for the ith independent sampling unit is

\begin{matrix} y_{i} = X_{i} β + e_{i} . \end{matrix}

(6)

Let $n = \sum_{i = 1}^{N} p_{i}$ . Define the (n × 1) vectors $y_{s} = [\begin{matrix} y_{1}^{'} & y_{2}^{'} & \dots & y_{N}^{'} \end{matrix}]^{'}$ and $e_{s} = [\begin{matrix} e_{1}^{'} & e_{2}^{'} & \dots & e_{N}^{'} \end{matrix}]^{'}$ . Stack the fixed effect design matrices into the (n × r) matrix

\begin{matrix} X_{s} = [\begin{matrix} X_{1}^{'} & X_{2}^{'} & \dots & X_{N}^{'} \end{matrix}]^{'} . \end{matrix}

(7)

Throughout, we assume that predictor values are not allowed to change within an independent sampling unit, i.e., that there are no repeated covariates. In addition, we assume that all predictor values are fixed as part of the study design. The population-averaged form of the linear mixed model is

\begin{matrix} y_{s} = X_{s} β + e_{s} . \end{matrix}

(8)

Define

\begin{matrix} Σ_{s} = \oplus_{i = 1}^{N} Σ_{i} . \end{matrix}

(9)

The distribution of y_s is

\begin{matrix} y_{s} \sim N_{n} (X_{s} β, Σ_{s}) . \end{matrix}

(10)

2.3 Tests for fixed effects in mixed models

Let α be the Type I error rate. Let C be the (a × r) matrix of fixed effects contrasts. Define the (a × 1) matrix θ = Cβ, and let θ₀ be the (a × 1) matrix of null values. The general linear hypothesis may be stated as

\begin{matrix} H_{0} : θ = θ_{0} . \end{matrix}

(11)

In order to conduct power analysis for the general linear hypothesis in the mixed model, we must consider the target estimation method. Several estimation methods have been described for mixed models [12, Chapter 5]. Common estimation methods include restricted maximum likelihood and maximum likelihood.

Let m indicate the estimation method. Let ${\hat{Σ}}_{s, m}$ and ${\hat{β}}_{m}$ be the estimates of Σ_s and β obtained from method m. Define ${\hat{θ}}_{m} = C {\hat{β}}_{m}$ . The Wald statistic for the linear mixed model is

\begin{matrix} w_{m} = ({\hat{θ}}_{m} - θ_{0})^{'} [C (X_{s}^{^{'}} {\hat{Σ}}_{s, m}^{- 1} X_{s})^{- 1} C^{'}]^{- 1} ({\hat{θ}}_{m} - θ_{0}) / a . \end{matrix}

(12)

The distribution of the Wald statistic is not known exactly for any m. Various reference distributions have been suggested for each estimation method m. In general, the distributions share a common form, with

\begin{matrix} w_{m} \dot{\sim} F (ν_{n, m}, ν_{d, m}, γ_{m}) . \end{matrix}

(13)

Under the null hypothesis, γ_m = 0 and $w_{m} \dot{\sim} F (ν_{n, m}, ν_{d, m})$ .

2.4 The Kenward-Roger test for fixed effects

Kenward and Roger [1] suggested using restricted maximum likelihood estimation (m = R) and a scaled Wald statistic.

\begin{matrix} F_{R} & = λ ({\hat{θ}}_{R} - θ_{0})^{'} [C (X_{s}^{^{'}} {\hat{Σ}}_{s, R}^{- 1} X_{s})^{- 1} C^{'}]^{- 1} ({\hat{θ}}_{R} - θ_{0}) / a \\ & = λ w_{R} . \end{matrix}

(14)

Kenward and Roger [1] used Taylor expansion to estimate E₀(w_R) and $V_{0} (w_{R})$ from observed data. Kenward and Roger [1] substituted E₀(w_R) and $V_{0} (w_{R})$ into method of moments approximations for λ and the reference distribution of F_R under the null. With $F_{R} \dot{\sim} F (a, ν)$ ,

ρ = \frac{V_{0} (w_{R})}{2 \{E_{0} (w_{R})\}^{2}},

(15)

ν = 4 + \frac{a + 2}{a ρ - 1},

(16)

and

\begin{matrix} λ = \frac{ν}{(ν - 2) E_{0} (w_{R})} . \end{matrix}

(17)

3 Power approximation for the Kenward-Roger test in the linear mixed model

3.1 The approximate moments of the Wald statistic

We derive a noncentral $F$ power approximation for the Kenward and Roger [1] test. The method of moments approach [8] is used to form an approximate distribution of the Kenward and Roger [1] scaled Wald statistic, F_R, under the alternative. The reference distribution of F_R under the alternative depends on the approximate moments of the unscaled Wald statistic.

We demonstrate that the Wald statistic has an approximately noncentral $F$ reference distribution under the alternative and a central $F$ reference distribution under the null. The result depends on approximate distributional results for both $({\hat{θ}}_{R} - θ_{0})$ and $C (X_{s}^{^{'}} {\hat{Σ}}_{s, R}^{- 1} X_{s})^{- 1} C^{'}$ . Because distributional results are, in general, not available for restricted maximum likelihood estimation, we instead use distributional results based on other techniques.

Let m = W indicate weighted least squares, and m = M denote multivariate methods. Approximate $({\hat{θ}}_{R} - θ_{0})$ by $({\hat{θ}}_{W} - θ_{0})$ , which is Gaussian, conditional on Σ_s. The term $C (X_{s}^{^{'}} {\hat{Σ}}_{s, R}^{- 1} X_{s})^{- 1} C^{'}$ can be approximated by $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ . We show that $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ is approximately Wishart. Finally, under the assumption of independence, we combine the terms to obtain an approximate $F$ distribution.

3.1.1 The conditional distribution of $({\hat{θ}}_{W} - θ_{0})$

The weighted least squares estimate [12] of β is

\begin{matrix} {\hat{β}}_{W} = (X_{s}^{'} Σ_{s}^{- 1} X_{s})^{- 1} (X_{s}^{'} Σ_{s}^{- 1} y_{s}) . \end{matrix}

(18)

With ${\hat{θ}}_{W} = C {\hat{β}}_{W}$ ,

\begin{matrix} ({\hat{θ}}_{W} - θ_{0}) | Σ_{s} \sim N_{a} \{(θ - θ_{0}), C (X_{s}^{'} Σ_{s}^{- 1} X_{s})^{- 1} C^{'}\} . \end{matrix}

(19)
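To make the covariance term in Eq 19 concrete, the sketch below evaluates C (X_s′ Σ_s⁻¹ X_s)⁻¹ C′ for a small, hypothetical cluster randomized design. Because Σ_s is block diagonal, X_s′ Σ_s⁻¹ X_s can be accumulated one independent sampling unit at a time. The function names, the use of NumPy, the design (3 clusters of size 4 in each of 2 groups), and the compound symmetric covariance with variance 2 and correlation 0.1 are all assumptions chosen for illustration.

```python
import numpy as np


def compound_symmetric(p, sigma2, rho):
    """sigma2 * {rho * 1 1' + (1 - rho) * I_p}."""
    return sigma2 * (rho * np.ones((p, p)) + (1.0 - rho) * np.eye(p))


def wls_contrast_covariance(X_list, Sigma_list, C):
    """C (X_s' Sigma_s^{-1} X_s)^{-1} C', summing over sampling units."""
    r = X_list[0].shape[1]
    information = np.zeros((r, r))
    for X_i, Sigma_i in zip(X_list, Sigma_list):
        # solve(Sigma_i, X_i) computes Sigma_i^{-1} X_i without an explicit inverse.
        information += X_i.T @ np.linalg.solve(Sigma_i, X_i)
    return C @ np.linalg.inv(information) @ C.T


# Hypothetical design: cell-means coding, 3 clusters of size 4 per group.
p, sigma2, rho, clusters_per_group = 4, 2.0, 0.1, 3
ones = np.ones((p, 1))
X_group1 = np.kron(ones, np.array([[1.0, 0.0]]))
X_group2 = np.kron(ones, np.array([[0.0, 1.0]]))
X_list = [X_group1] * clusters_per_group + [X_group2] * clusters_per_group
Sigma_list = [compound_symmetric(p, sigma2, rho)] * (2 * clusters_per_group)
C = np.array([[1.0, -1.0]])

W = wls_contrast_covariance(X_list, Sigma_list, C)
```

For this balanced toy design the result can be checked by hand: each group mean has variance σ²{1 + (p − 1)ρ}/(p × 3), and the contrast variance is twice that value.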

3.1.2 The approximate distribution of $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$

We approximate the distribution of

\begin{matrix} C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'} \end{matrix}

(20)

with a single central Wishart. The result follows from Theorems 1, 2 and 3 in Appendix A. The theorems provide an approximate distribution for a positive definite sum of potentially singular quadratic forms in independent inverse central Wishart matrices.

The accuracy of the approximation depends on the degrees of freedom of the component quadratic forms. To ensure sufficient degrees of freedom, we make the following homoscedasticity assumptions. Recall p = max_i(p_i). With Σ_max a symmetric, positive definite matrix, assume Σ_i ≡ Σ_max for all i ∈ {1, …, N} such that p_i = p. Let N_d indicate the number of independent sampling units with observation pattern R_d. Note $N = \sum_{d = 1}^{δ} N_{d}$ . For independent sampling units with observation pattern R_d, assume

\begin{matrix} Σ_{d} = D_{p, d} Σ_{max} D_{p, d}^{'} . \end{matrix}

(21)

Without loss of generality, permute the independent sampling units in Eq 8 so that

\begin{matrix} Σ_{s} = \overset{δ}{\underset{d = 1}{\oplus}} \oplus_{i = 1}^{N_{d}} Σ_{d} . \end{matrix}

(22)

Estimate Σ_s with

\begin{matrix} {\hat{Σ}}_{s} = \overset{δ}{\underset{d = 1}{\oplus}} \oplus_{i = 1}^{N_{d}} {\hat{Σ}}_{d} \end{matrix}

(23)

The following thought experiment gives reasonable approximations for the distribution of each ${\hat{Σ}}_{d}$ . All independent sampling units with observed data pattern R_d have p_d observations. For each R_d, suppose we form a complete, balanced mixed model containing only the independent sampling units with observed data pattern R_d. For each balanced mixed model, assume that X_s includes the full time by treatment interaction. This permits recasting each balanced mixed model as an equivalent general linear multivariate model [9, Chapter 14]. For cluster randomized designs, we assume that the mixed model is recast as a two-stage model of cluster means [13, Chapter 4], a special case of the multivariate model.

For the dth multivariate model, let q be the rank of the multivariate design matrix and ${\hat{E}}_{d}$ be the (N_d × p_d) matrix of residuals. Assume N_d > (q + p_d + 1). Then an unbiased, consistent estimate of Σ_d, ${\hat{Σ}}_{d, M}$ , can be formed using known results for the multivariate model. Thus,

\begin{matrix} {\hat{Σ}}_{d, M} = {\hat{E}}_{d}^{'} {\hat{E}}_{d} / (N_{d} - q), \end{matrix}

(24)

with distribution

\begin{matrix} {\hat{Σ}}_{d, M} \sim W_{p_{d}} \{N_{d} - q, Σ_{d} / (N_{d} - q)\} . \end{matrix}

(25)

Recall that in the Wald statistic (Eq 12),

\begin{matrix} X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s} = \sum_{d = 1}^{δ} \sum_{i = 1}^{N_{d}} X_{i}^{'} {\hat{Σ}}_{d, M}^{- 1} X_{i} . \end{matrix}

(26)

Using Eq 25 and Theorem 3 in Appendix A, approximate the distribution of $X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s}$ with a single inverse central Wishart,

\begin{matrix} X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s} \dot{\sim} I W_{r} (N_{*}, Σ_{*}^{- 1}) . \end{matrix}

(27)

From the linear properties of Wishart matrices [11, p. 111, Theorem 3.4.1],

\begin{matrix} C (X_{s}^{'} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'} \dot{\sim} W_{a} \{(N_{*} - r - 1), C Σ_{*} C^{'}\} . \end{matrix}

(28)

3.1.3 Combining $({\hat{θ}}_{W} - θ_{0})$ and $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ to form an approximate $F$

We now combine $({\hat{θ}}_{W} - θ_{0})$ and $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ as described in Sections 3.1.1 and 3.1.2 to form a Wald statistic,

\begin{matrix} w = ({\hat{θ}}_{W} - θ_{0})^{'} [C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}]^{- 1} ({\hat{θ}}_{W} - θ_{0}) / a . \end{matrix}

(29)

We assume that w ≈ w_R. From Eq 19, $({\hat{θ}}_{W} - θ_{0})$ is approximately Gaussian. From Eq 28, $C (X_{s}^{^{'}} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ is approximately Wishart.

For conciseness of notation, write μ = (θ − θ₀), with estimate $\hat{μ} = ({\hat{θ}}_{W} - θ_{0})$ , $W = C (X_{s}^{'} Σ_{s}^{- 1} X_{s})^{- 1} C^{'}$ and $\hat{W} = C (X_{s}^{'} {\hat{Σ}}_{s, M}^{- 1} X_{s})^{- 1} C^{'}$ . Define $Q = [V (\hat{W})]^{- 1} V (\hat{μ})$ and $h = μ^{'} [V (\hat{W})]^{- 1} μ$ . Assume that ${\hat{θ}}_{W} ⊥ {\hat{Σ}}_{s, M}$ . The assumption rests on the following logic. If we had estimated both Σ_s and β using multivariate techniques, independence would follow [14, p. 291, Theorem 8.2.2]. Applying Theorem 4 in Appendix A,

\begin{matrix} w \dot{\sim} \{a (N_{*} - r + a - 2)\}^{- 1} tr (Q) F \{n_{u}, (N_{*} - r + a - 2), δ_{u}\}, \end{matrix}

(30)

where

\begin{matrix} δ_{u} = \frac{h tr (Q) + 2 h^{2}}{tr (Q Q^{'}) + 2 μ^{'} \{Q [V (\hat{W})]^{- 1}\} μ} \end{matrix}

(31)

and

\begin{matrix} n_{u} = δ_{u} h^{- 1} tr (Q) . \end{matrix}

(32)

From Eq 30, we calculate E₀(w), E_A(w), and $V_{A} (w)$ , using standard results for central and noncentral $F$ distributions [10].
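The standard moment results referred to here can be sketched as follows. For F ~ F(ν_n, ν_d, γ), the mean is ν_d(ν_n + γ)/{ν_n(ν_d − 2)} for ν_d > 2, and the variance is 2(ν_d/ν_n)²{(ν_n + γ)² + (ν_n + 2γ)(ν_d − 2)}/{(ν_d − 2)²(ν_d − 4)} for ν_d > 4. The function names are hypothetical. The `scale` argument stands for the multiplier of F in Eq 30, and the choice of numerator degrees of freedom under the null is left to the caller, since it is not spelled out in this section.

```python
def ncf_mean(nu_n, nu_d, gamma=0.0):
    """Mean of a noncentral F(nu_n, nu_d, gamma); requires nu_d > 2."""
    if nu_d <= 2:
        raise ValueError("the mean exists only for nu_d > 2")
    return nu_d * (nu_n + gamma) / (nu_n * (nu_d - 2.0))


def ncf_variance(nu_n, nu_d, gamma=0.0):
    """Variance of a noncentral F(nu_n, nu_d, gamma); requires nu_d > 4."""
    if nu_d <= 4:
        raise ValueError("the variance exists only for nu_d > 4")
    numerator = (nu_n + gamma) ** 2 + (nu_n + 2.0 * gamma) * (nu_d - 2.0)
    return 2.0 * (nu_d / nu_n) ** 2 * numerator / ((nu_d - 2.0) ** 2 * (nu_d - 4.0))


def scaled_f_moments(scale, nu_n, nu_d, gamma=0.0):
    """Mean and variance of w = scale * F, as needed for E_0(w), E_A(w), V_A(w)."""
    mean = scale * ncf_mean(nu_n, nu_d, gamma)
    variance = scale ** 2 * ncf_variance(nu_n, nu_d, gamma)
    return mean, variance


# Setting gamma = 0 recovers the central F moments.
central_mean, central_variance = scaled_f_moments(1.0, 1, 10)
```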

3.2 A three-moment approximation for the distribution of the Kenward and Roger scaled Wald statistic under the alternative hypothesis

We use a method of moments approach [8] to form the approximate distribution of Kenward and Roger [1] scaled Wald statistic, F_R, under the alternative. The parameters of the distribution depend on the approximate Wald moments derived in Section 3.1. We approximate the distribution of the Kenward and Roger [1] statistic, F_R = λw_R, by the distribution of F = λw, where $F \dot{\sim} F (a, ν, γ)$ . Thus

\begin{matrix} F_{R} \dot{\sim} F (a, ν, γ) . \end{matrix}

(33)

To obtain values for λ, ν, and γ under the alternative, we match three moments, setting

E_{A} (F) = E_{A} (λ w),

(34)

V_{A} (F) = V_{A} (λ w),

(35)

and

\begin{matrix} E_{0} (F) = E_{0} (λ w) . \end{matrix}

(36)

With

\begin{matrix} ρ = \frac{V_{A} (w)}{2 {E_{0} (w)}^{2}}, \end{matrix}

(37)

we obtain

λ = \frac{ν}{(ν - 2) E_{0} (w)},

(38)

ν = 4 + \frac{2 (a + 2 γ) + (a + γ)^{2}}{ρ a^{2} - a - 2 γ},

(39)

and

\begin{matrix} γ = a \left\{\frac{E_{A} (w)}{E_{0} (w)} - 1\right\} . \end{matrix}

(40)

When γ = 0, Eq 39 reduces to

\begin{matrix} ν = 4 + \frac{a + 2}{a ρ - 1}, \end{matrix}

(41)

which shares the same form as the result obtained by Kenward and Roger (Eq 16). The exact values of ρ, and hence ν, will differ due to the disparate techniques used to obtain moments for the Wald statistics, w and w_R.
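The moment matching in Eqs 37 to 40 can be written as a short function. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `kr_power_parameters` and its argument names are hypothetical, and the three Wald moments are taken as given inputs.

```python
def kr_power_parameters(a, e0_w, ea_w, va_w):
    """Return (lambda, nu, gamma) from E_0(w), E_A(w) and V_A(w).

    a    : number of rows of the contrast matrix C
    e0_w : E_0(w), mean of the Wald statistic under the null
    ea_w : E_A(w), mean under the alternative
    va_w : V_A(w), variance under the alternative
    """
    gamma = a * (ea_w / e0_w - 1.0)                                  # Eq 40
    rho = va_w / (2.0 * e0_w ** 2)                                   # Eq 37
    denominator = rho * a ** 2 - a - 2.0 * gamma
    if denominator <= 0:
        raise ValueError("moments do not yield finite denominator degrees of freedom")
    nu = 4.0 + (2.0 * (a + 2.0 * gamma) + (a + gamma) ** 2) / denominator  # Eq 39
    lam = nu / ((nu - 2.0) * e0_w)                                   # Eq 38
    return lam, nu, gamma


# Round-trip check with hypothetical values: if w were exactly c * F(a, nu0, gamma0),
# the matching should return lambda = 1/c, nu = nu0 and gamma = gamma0.
a, nu0, gamma0, c = 2, 30.0, 4.0, 0.5
e0 = c * nu0 / (nu0 - 2.0)
ea = c * nu0 * (a + gamma0) / (a * (nu0 - 2.0))
va = c ** 2 * 2.0 * (nu0 / a) ** 2 * ((a + gamma0) ** 2 + (a + 2.0 * gamma0) * (nu0 - 2.0)) / (
    (nu0 - 2.0) ** 2 * (nu0 - 4.0)
)
lam, nu, gamma = kr_power_parameters(a, e0, ea, va)
```

The round trip also shows why three moments suffice: the ratio of the two means fixes γ, the variance then fixes ν, and the null mean fixes the scale λ.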

3.3 Power calculation for the Kenward and Roger test

We calculate power for the Kenward and Roger test as follows. Define α, Σ_max, β, C and θ₀. For i ∈ {1, …, N}, specify X_i and R_d. Calculate a, ν, and γ as described in Section 3.2. Form the reference distribution of $F_{R} \dot{\sim} F (a, ν, γ)$ . Using the approximate reference distribution of F_R under the null, $F_{R} \dot{\sim} F (a, ν, 0)$ , find the critical value

\begin{matrix} f_{crit} \approx F^{- 1} (1 - α; a, ν, 0) . \end{matrix}

(42)

Finally, using the approximate reference distribution of F_R under the alternative, $F_{R} \dot{\sim} F (a, ν, γ)$ , calculate power as

\begin{matrix} Power \approx 1 - F (f_{crit}; a, ν, γ) . \end{matrix}

(43)
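Once a, ν, and γ are available, Eqs 42 and 43 reduce to two distribution function evaluations. The sketch below uses SciPy's central and noncentral $F$ distributions and assumes that SciPy's noncentrality argument matches the parameter γ used here. The function name and the numerical inputs are hypothetical and are not values from the simulation study.

```python
from scipy import stats


def kr_approximate_power(alpha, a, nu, gamma):
    """Approximate power from Eqs 42 and 43.

    alpha : Type I error rate
    a     : numerator degrees of freedom (rows of C)
    nu    : approximate denominator degrees of freedom
    gamma : approximate noncentrality parameter (gamma >= 0)
    """
    f_crit = stats.f.ppf(1.0 - alpha, a, nu)          # Eq 42
    if gamma == 0:
        # Under the null the reference distribution is a central F.
        return stats.f.sf(f_crit, a, nu)
    return stats.ncf.sf(f_crit, a, nu, gamma)         # Eq 43


# Hypothetical inputs, for illustration only.
power_small_effect = kr_approximate_power(0.05, 2, 30.0, 5.0)
power_large_effect = kr_approximate_power(0.05, 2, 30.0, 10.0)
```

With γ = 0 the function returns α, and for fixed a and ν the power increases with γ, which gives a quick sanity check on any set of inputs.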

4 Simulation study

4.1 Methods

We compared approximate power values, calculated as in Section 3.3, with empirical power for two types of study designs: unbalanced, cluster randomized trials and longitudinal studies with known dropout patterns. Approximate power was calculated using our mixedPower package for R version 4.0.2 [15].

Empirical power was calculated by Monte Carlo simulation in SAS [16, version 9.4]. We defined α, Σ_max, β, C and θ₀. For i ∈ {1, …, N}, we specified X_i and R_d. We generated 10,000 replicates of e_s and computed y_s as in Eq 8. For each replicate, we tested the linear contrast C using SAS PROC MIXED with the DDFM = KenwardRoger flag to request Kenward and Roger [1] denominator degrees of freedom. Empirical power was estimated as the proportion of replicates for which the null hypothesis was rejected. Source code is available at http://github.com/SampleSizeShop/mixedPower.
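Because empirical power is a proportion over independent replicates, its Monte Carlo uncertainty can be gauged with the usual binomial standard error. The sketch below is an illustrative aside rather than part of the reported analysis; the function names are hypothetical.

```python
import math


def empirical_power(rejections):
    """Proportion of replicates in which the null hypothesis was rejected."""
    return sum(1 for rejected in rejections if rejected) / len(rejections)


def monte_carlo_standard_error(power, n_replicates):
    """Binomial standard error of an empirical power estimate."""
    return math.sqrt(power * (1.0 - power) / n_replicates)


# With 10,000 replicates the standard error is largest at a true power of 0.5.
worst_case_se = monte_carlo_standard_error(0.5, 10_000)   # 0.005
```

Under this calculation, differences between approximate and empirical power of a few thousandths are within the noise of the simulation itself.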

4.1.1 Cluster randomized designs

We compared approximate and empirical power for 36 cluster randomized designs. We assumed that each design had a single Gaussian outcome. Half of the clusters were assumed to have complete data, with the remaining clusters assumed to have some amount of missing data. We varied the number of treatment groups, t ∈ {2, 4}, the number of clusters randomized to each treatment, N_treatment ∈ {10, 40}, the total number of participants in a complete cluster, p ∈ {5, 50} and the ratio of the incomplete cluster size to the complete cluster size s ∈ {0.6, 0.8, 1}. We only included designs which met the assumption that N_d > (q + p_d + 1) for all R_d.

For each design, we repeated the simulations for several intraclass correlation values ρ ∈ {0.04, 0.1, 0.2, 0.5}, with

\begin{matrix} Σ_{max} = 2 \times \{1_{p} 1_{p}^{'} ρ + I_{p} (1 - ρ)\} . \end{matrix}

(44)

The β matrix had the form

\begin{matrix} β = b \times [\begin{matrix} 1 & 0 \end{matrix}]^{'} \end{matrix}

(45)

for designs with 2 treatments and

\begin{matrix} β = b \times [\begin{matrix} 1 & 0 & 0 & 0 \end{matrix}]^{'} \end{matrix}

(46)

for designs with 4 treatments. The scale factor b was selected so that the approximate power was roughly 0.2, 0.5 or 0.8. In each scenario, we calculated power for the null hypothesis of no difference among treatment groups at α = 0.05. We used the Wald test with denominator degrees of freedom as described by Kenward and Roger [1].

4.1.2 Longitudinal designs

We calculated approximate and empirical power for 36 longitudinal study designs. Each design had 5 repeated measures and 50 participants per treatment group. We varied the number of treatment groups, t ∈ {2, 4}, the pattern of missing data, either monotone (missing the 4th and 5th observations), or non-monotone (missing the 2nd and 4th observations), and the number of participants in each treatment group with some amount of missing data, N_incomplete ∈ {0, 10, 20}. For observations within a given participant, we assumed a first-order auto-regressive correlation structure [12, p. 99], with ρ = 0.4 and σ² = 1. The β matrix had the form

\begin{matrix} β = b \times [\begin{matrix} 1 & 0_{9}^{'} \end{matrix}]^{'} \end{matrix}

(47)

for designs with 2 treatments and

\begin{matrix} β = b \times [\begin{matrix} 1 & 0_{19}^{'} \end{matrix}]^{'} \end{matrix}

(48)

for designs with 4 treatments. The scale factor and hypothesis testing were as described for the cluster randomized designs with one exception: we calculated power for the null hypothesis of no time by treatment interaction.

4.1.3 Performance criteria

For each design, we computed the deviation as approximate power minus empirical power. We produced box plots summarizing the deviations overall, within all cluster randomized trials, and within all longitudinal designs. For the cluster randomized trials, we produced box plots stratified by the number of treatment groups, the cluster size, and the ratio of the incomplete cluster size to the complete cluster size. For the longitudinal designs, we produced box plots summarizing the deviations stratified by the number of treatment groups, the pattern of missing observations, and the number of incomplete independent sampling units per treatment.

Positive deviations indicated that the approximate power values were larger than the empirical power values. Negative deviations indicated that the approximate power values were smaller than the empirical power values.

4.2 Results

Fig 1 summarizes the deviations between the approximate and the empirical power values. The three box plots show results for all designs, for cluster randomized trials, and for longitudinal studies. Overall, the median deviation between the approximate and the empirical power values was 0.010 (min: −0.010, 1st quartile: 0.005, 3rd quartile: 0.015, max: 0.064). For cluster randomized trials, the median deviation was 0.011 (min: −0.001, 1st quartile: 0.006, 3rd quartile: 0.017, max: 0.064). For longitudinal designs, the median deviation was 0.003 (min: −0.010, 1st quartile: 0.000, 3rd quartile: 0.009, max: 0.016).

Further details for cluster-randomized designs are shown in Fig 2. The accuracy of the power approximation improved with larger cluster sizes. The approximation retained accuracy regardless of the ratio of incomplete to complete cluster sizes. As shown in Table 1, accuracy was similar across ICC values, with slight improvements with increasing correlation.

Table 1. Deviations between approximate and empirical power in cluster randomized designs by ICC.

ICC	Minimum	1st Quartile	Median	3rd Quartile	Maximum
0.04	-0.001	0.006	0.012	0.019	0.064
0.1	0.002	0.009	0.012	0.017	0.054
0.2	0.001	0.005	0.010	0.016	0.059
0.5	0.001	0.006	0.010	0.014	0.038

Results for longitudinal designs are shown in Fig 3. The power approximation was highly accurate for all longitudinal designs tested.

5 Applied example

We demonstrate a power calculation for an unbalanced cluster-randomized trial of an intervention to reduce oral cancer risk behaviors. The example is based on a hypothetical study examining the impact of workplace smoking cessation programs on tobacco use. We used a synthetic, rather than a real example, so that the power calculation is easy to follow. In a real power calculation, values of differences in means, standard deviations and intra-class correlation coefficients could be drawn from the literature, as described in Guo et al. [17].

For our demonstration, we assume that 80 worksites will be randomized to 2 smoking cessation programs, with 40 sites per treatment condition. Of the 40 sites randomized to each smoking cessation program, 25 worksites will have 30 participants, and the remaining 15 will have 20 participants. The outcome for the analysis will be urinary cotinine. We wish to detect a difference of 25 ng/ml. We assume a standard deviation of 125 ng/ml, and an intraclass correlation of 0.04. We will calculate power for the Kenward and Roger [1] test of the smoking cessation program effect. We set α = 0.05.

To begin the calculation, we first identify the patterns of observations in the study, including complete clusters with 30 participants, and incomplete clusters with 20 participants. Table 2 summarizes the design matrices and patterns of observations by cluster size and treatment assignment.

Table 2. Design matrices and patterns of observations for proposed study of smoking cessation programs.

	p_i = 30	p_i = 20
Program 1	X_i = 1₃₀ ⊗ [1 0]	X_i = 1₂₀ ⊗ [1 0]
Program 1	R_d = {1, …, 30}	R_d = {1, …, 20}
Program 2	X_i = 1₃₀ ⊗ [0 1]	X_i = 1₂₀ ⊗ [0 1]
Program 2	R_d = {1, …, 30}	R_d = {1, …, 20}

In addition, we define

Σ_{max} = 125^{2} \times \{1_{30} 1_{30}^{'} \times 0.04 + I_{30} (1 - 0.04)\},

(49)

C = [\begin{matrix} 1 & - 1 \end{matrix}],

(50)

θ_{0} = [\begin{matrix} 0 \end{matrix}]

(51)

and

\begin{matrix} β = [\begin{matrix} 25 & 0 \end{matrix}]^{'} . \end{matrix}

(52)

At an α level of 0.05, the approximate power to detect a treatment difference of 25 ng/ml was 0.87 for the Wald test with Kenward and Roger [1] denominator degrees of freedom.

6 Discussion

We describe a power approximation for the Kenward and Roger (1997) test of fixed effects in the linear mixed model. The method was accurate to within about ±0.06 for all designs, with the best accuracy observed for longitudinal designs. We note that Kenward and Roger (2009) have since described a refinement which improves estimation of the non-linear covariance structures in small samples. We have restricted our discussion to the Kenward and Roger (1997) approach, since it is most commonly used in statistical practice.

The method has several limitations. The assumption of N_d > (q + p_d + 1) may be too restrictive for multilevel designs with large cluster sizes. In addition, we assume that the pattern of missing data is known. The method does not apply to repeated covariates, which often appear in biomedical studies. However, the method does apply to baseline covariates, a common study design. We make a strong homoscedasticity assumption of equal variance for each independent sampling unit. This assumption means that the power computations are not appropriate for random regression, for models with group differences in variance, or for certain spatial-temporal applications. Nevertheless, the assumption of homoscedasticity is widely made for randomized controlled clinical trials, laboratory studies, and observational studies, which makes the method useful for a variety of cases. Lastly, the method has not been evaluated for binary or Poisson data.

The analytic results from this manuscript suggest several future extensions. We may be able to calculate power for linear mixed models with random missing data patterns by invoking conditional distribution theory and calculating expected power across patterns of missingness. In addition, the approach used to form the distribution of ${\hat{Σ}}_{s, M}$ provides the first step towards a non-iterative alternative to restricted maximum likelihood estimation for some mixed models. For big data applications, such a non-iterative approach may facilitate highly parallel computation of parameter estimates in mixed models.

Our power approximation provides a general, flexible, accurate and rapid method to calculate power for the Kenward and Roger (1997) test. For studies in which the Kenward and Roger (1997) test is the planned method of data analysis, our power approximation should be used. By aligning power analysis with the planned data analysis, researchers can more accurately assess power for biomedical studies. Accurate power analysis is an ethical imperative for research with human participants.

7 Appendix

A Appendix: Theorems and proofs

Theorem 1. For m ∈ {1, …, k}, let p_m ∈ {1, 2, …, p}, N_m > (p_m + 3) and define Ψ_m = {ψ_mij} to be a (p_m × p_m) symmetric, positive definite matrix. Define a set of k ≥ 2, independent, non-identically distributed inverse central Wishart random matrices, such that for m ∈ {1, …, k}, $S_{m}^{- 1} \sim I W_{p_{m}} (N_{m}, Ψ_{m})$ . For i ∈ {1, …, q} and R_m ⊂ {1, 2, …, p} of cardinality p_m, define X_m to be a (p_m × qp) matrix of rank p_m < qp with the form

\begin{matrix} X_{m} = I_{q} ({i}) \otimes I_{p} (R_{m}) . \end{matrix}

(53)

If for each i ∈ {1, …, q}, there exists at least one m such that X_m = I_q({i}) ⊗ I_p, then

\begin{matrix} Q^{- 1} = \sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m} \end{matrix}

(54)

is positive definite.

Proof. Let Q_i = {X_m: X_m = I_q({i}) ⊗ I_p(R_m)}. Then

\begin{matrix} \sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m} \\ = & \sum_{i = 1}^{q} \sum_{Q_{i}} X_{m}^{'} S_{m}^{- 1} X_{m} \\ = & \sum_{i = 1}^{q} \sum_{Q_{i}} [I_{q} ({i}) \otimes I_{p} (R_{m})]^{'} S_{m}^{- 1} [I_{q} ({i}) \otimes I_{p} (R_{m})] \\ = & \sum_{i = 1}^{q} \sum_{Q_{i}} [I_{q} ({i})^{'} \otimes I_{p} (R_{m})^{'}] (I_{1} \otimes S_{m}^{- 1}) [I_{q} ({i}) \otimes I_{p} (R_{m})] \\ = & \sum_{i = 1}^{q} \sum_{Q_{i}} [I_{q} ({i})^{'} \otimes I_{p} (R_{m})^{'} S_{m}^{- 1}] [I_{q} ({i}) \otimes I_{p} (R_{m})] \\ = & \sum_{i = 1}^{q} \sum_{Q_{i}} [I_{q} ({i})^{'} I_{q} ({i}) \otimes I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})] \\ = & \sum_{i = 1}^{q} {I_{q} ({i})^{'} I_{q} ({i}) \otimes \sum_{Q_{i}} [I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})]} . \end{matrix}

(55)

Note that for i ∈ {1, 2, …, q}, I_q({i})′ I_q({i}) is a (q × q) matrix for which the ith diagonal element is 1 and all remaining elements are 0. Therefore, Eq 55 can be equivalently expressed as a direct sum.

\begin{matrix} \sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m} = \oplus_{i = 1}^{q} {\sum_{Q_{i}} [I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})]} . \end{matrix}

(56)
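As a concrete illustration of Eqs 55 and 56, the following minimal Python sketch builds one X_m and shows that X_m′ S_m⁻¹ X_m places the entries of S_m⁻¹ inside the ith diagonal block of a (qp × qp) matrix. It assumes that I_p(R) denotes the rows of I_p indexed by R, which agrees with the stated (p_m × qp) dimension of X_m. The helper names and the numerical values are hypothetical.

```python
def identity_rows(n, rows):
    """Rows of the n x n identity matrix indexed by `rows` (1-based); assumed meaning of I_n(R)."""
    return [[1.0 if c == r - 1 else 0.0 for c in range(n)] for r in rows]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

q, p = 2, 3
# Hypothetical case: i = 2 and R_m = {1, 3}, so p_m = 2.
x_m = kron(identity_rows(q, [2]), identity_rows(p, [1, 3]))   # (2 x 6)
s_inv = [[2.0, 0.5], [0.5, 1.0]]                               # illustrative S_m^{-1}

# X_m' S_m^{-1} X_m is (6 x 6); only the second (3 x 3) diagonal block is non-zero.
block = matmul(matmul(transpose(x_m), s_inv), x_m)
for row in block:
    print(row)
```

The first three rows and columns of the printed matrix are zero, so the contribution of this X_m is confined to the diagonal block for i = 2, as the direct sum in Eq 56 requires.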

From Mathai and Provost [18, p.18, Theorem 2.2b.1], it follows that each $I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})$ is positive semi-definite. By assumption, for each Q_i, there exists a c_i such that $X_{c_{i}} \in Q_{i}$ and $X_{c_{i}} = I_{q} ({i}) \otimes I_{p}$ . Then

\begin{matrix} \sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m} & = & \oplus_{i = 1}^{q} {I_{p} S_{c_{i}}^{- 1} I_{p} + \sum_{Q_{i} / X_{c_{i}}} [I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})]} \\ = & \oplus_{i = 1}^{q} {S_{c_{i}}^{- 1} + \sum_{Q_{i} / X_{c_{i}}} [I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})]} . \end{matrix}

(57)

Because $S_{c_{i}}^{- 1}$ is positive definite and the remaining $I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})$ are positive semi-definite for i ∈ {1, …, q}, then

\begin{matrix} S_{c_{i}}^{- 1} + \sum_{Q_{i} / X_{c_{i}}} [I_{p} (R_{m})^{'} S_{m}^{- 1} I_{p} (R_{m})] \end{matrix}

(58)

is positive definite.

Since $\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m}$ is a block matrix, the eigenvalues of $\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m}$ are the eigenvalues of all of the blocks. Since each block (Eq 58) is positive definite and hence has positive eigenvalues, it follows that $\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m}$ must also be positive definite.
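The positive definiteness argument can be checked numerically on a single block. The sketch below uses hypothetical values with p = 2. It adds a full-observation term (positive definite) to a partially observed term (positive semi-definite and singular) and tests the sum with Sylvester's criterion for a symmetric (2 × 2) matrix.

```python
def is_pd_2x2(a):
    """Sylvester's criterion for a symmetric 2 x 2 matrix: both leading minors positive."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return a[0][0] > 0 and det > 0

# Full observation, R = {1, 2}: I_p' S^{-1} I_p = S^{-1}, which is positive definite.
full_term = [[2.0, 0.5], [0.5, 1.0]]
# Partial observation, R = {1} with S^{-1} = [3]: the term is padded with zeros.
partial_term = [[3.0, 0.0], [0.0, 0.0]]

block_sum = [[full_term[r][c] + partial_term[r][c] for c in range(2)]
             for r in range(2)]

print(is_pd_2x2(full_term), is_pd_2x2(partial_term), is_pd_2x2(block_sum))
```

The partially observed term alone fails the check because its determinant is zero, which is why the theorem requires at least one fully observed X_m for each i.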

Theorem 2. For m ∈ {1, …, k}, i ∈ {1, …, q}, R_m ⊂ {1, 2, …, p} of cardinality p_m, X_m = I_q({i}) ⊗ I_p(R_m) a (p_m × qp) matrix of rank p_m < qp, N_m > (p_m + 3), Ψ_m a (p_m × p_m) symmetric, positive definite matrix, and $S_{m}^{- 1} \sim I W_{p_{m}} (N_{m}, Ψ_{m})$ ,

\begin{matrix} tr (S_{m}^{- 1}) = tr (X_{m}^{'} S_{m}^{- 1} X_{m}) . \end{matrix}

(59)

Proof. Let Dg(x) indicate a square matrix with the elements of the vector x on the diagonal.

Since $S_{m}^{- 1}$ is positive definite and has full rank, then by Lemma 1.24 (a) of Muller and Stewart [9], it has the spectral decomposition

\begin{matrix} S_{m}^{- 1} = V Dg (λ) V^{'}, \end{matrix}

(60)

where λ is the (p_m × 1) vector of eigenvalues and V is the (p_m × p_m) orthogonal matrix of eigenvectors of $S_{m}^{- 1}$ . Then

\begin{matrix} X_{m}^{'} S_{m}^{- 1} X_{m} = X_{m}^{'} V Dg (λ) V^{'} X_{m} . \end{matrix}

(61)

Since X_m has rank p_m < qp, the (qp × qp) matrix $X_{m}^{'} S_{m}^{- 1} X_{m}$ is rank deficient, and by Lemma 1.25 of Muller and Stewart [9] it must have qp − p_m zero eigenvalues. Let λ₀ be the (qp − p_m × 1) vector of zero eigenvalues and V₀ the [qp × (qp − p_m)] matrix of corresponding eigenvectors. Then

\begin{matrix} X_{m}^{'} S_{m}^{- 1} X_{m} & = & X_{m}^{'} V Dg (λ) V^{'} X_{m} \\ = & X_{m}^{'} V Dg (λ) V^{'} X_{m} + V_{0} Dg (λ_{0}) V_{0}^{'} \\ = & [\begin{matrix} X_{m}^{'} V & V_{0} \end{matrix}] [\begin{matrix} Dg (λ) & 0 \\ 0 & Dg (λ_{0}) \end{matrix}] [\begin{matrix} V^{'} X_{m} \\ V_{0}^{'} \end{matrix}] . \end{matrix}

(62)

Selecting V₀ such that $V_{0}^{'} V_{0} = I_{q p - p_{m}}$ , $V_{0} V_{0}^{'} = I - X_{m}^{'} X_{m}$ and X_m V₀ = 0, ensures that $[\begin{matrix} X_{m}^{'} V & V_{0} \end{matrix}]$ is orthogonal. Then Eq 62 is the spectral decomposition of $X_{m}^{'} S_{m}^{- 1} X_{m}$ , with eigenvalues $[\begin{matrix} λ^{'} & λ_{0}^{'} \end{matrix}]^{'}$ .

Since λ₀ contains only zero eigenvalues and using the definition of the trace,

\begin{matrix} tr (X_{m}^{'} S_{m}^{- 1} X_{m}) & = & \sum_{i = 1}^{p_{m}} λ_{i} + \sum_{j = 1}^{q p - p_{m}} λ_{0 j} \\ = & \sum_{i = 1}^{p_{m}} λ_{i} \\ = & tr (S_{m}^{- 1}) . \end{matrix}

(63)
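Eq 59 can also be checked numerically. Assuming I_p(R_m) denotes rows of I_p, the rows of X_m are distinct rows of an identity matrix, so X_m X_m′ = I_{p_m}, and the cyclic property of the trace gives tr(X_m′ S_m⁻¹ X_m) = tr(S_m⁻¹ X_m X_m′) = tr(S_m⁻¹). This is a shortcut for illustration, not the spectral argument used in the proof. The sketch uses a hypothetical X_m with q = 2, p = 3, i = 1 and R_m = {2, 3}.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# X_m = I_q({1}) (x) I_p({2, 3}), written out explicitly as a (2 x 6) matrix.
x_m = [[0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]]
s_inv = [[4.0, 1.0], [1.0, 3.0]]   # illustrative S_m^{-1}

gram = matmul(x_m, transpose(x_m))                      # X_m X_m'
padded = matmul(matmul(transpose(x_m), s_inv), x_m)     # X_m' S_m^{-1} X_m

print(gram)
print(trace(padded), trace(s_inv))
```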

Theorem 3. For m ∈ {1, …, k}, let p_m ∈ {1, …, p}, N_m > (p_m+ 3) and let Ψ_m = {ψ_mij} be a (p_m × p_m) symmetric, positive definite matrix. Define a set of k ≥ 2, independent, non-identically distributed inverse central Wishart random matrices, such that for m ∈ {1, …, k}, $S_{m}^{- 1} \sim I W_{p_{m}} (N_{m}, Ψ_{m})$ . For i ∈ {1, …, q} and R_m ⊂ {1, …, p} of cardinality p_m, define X_m to be a (p_m × qp) matrix of rank p_m < qp with the form

\begin{matrix} X_{m} = I_{q} ({i}) \otimes I_{p} (R_{m}) . \end{matrix}

(64)

Under the assumption that for each i ∈ {1, …, q}, there exists at least one m such that X_m = I_q({i}) ⊗ I_p, it can be shown that

\begin{matrix} Q^{- 1} = \sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m} \end{matrix}

(65)

is approximately distributed as $S_{*}^{- 1} \sim I W_{q p} (N_{*}, Ψ_{*})$ .

Proof. Theorem 1 in the Appendix demonstrates that Q⁻¹ is positive definite under the restriction that for each i ∈ {1, …, q}, there exists at least one m such that X_m = I_q({i}) ⊗ I_p.

To derive an approximate distribution for Q⁻¹, we match the expectation of the sum of the inverse Wishart matrices and the variance of the trace of the sum of the inverse Wishart matrices. Set

\begin{matrix} E (S_{*}^{- 1}) = E (\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m}) \end{matrix}

(66)

and

\begin{matrix} V [tr (S_{*}^{- 1})] = V [tr (\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m})] . \end{matrix}

(67)

From Theorem 2 in the Appendix and the independence of the $S_{m}^{- 1}$ ,

\begin{matrix} V [tr (\sum_{m = 1}^{k} X_{m}^{'} S_{m}^{- 1} X_{m})] & = & \sum_{m = 1}^{k} V [tr (X_{m}^{'} S_{m}^{- 1} X_{m})] \\ = & \sum_{m = 1}^{k} V [tr (S_{m}^{- 1})] . \end{matrix}

(68)

Then the approximate parameters for $S_{*}^{- 1} \sim I W_{q p} (N_{*}, Ψ_{*})$ are

\begin{matrix} Ψ_{*} & = & (N_{*} - q - 1) \sum_{m = 1}^{k} \frac{1}{N_{m} - p_{m} - 1} X_{m}^{'} Ψ_{m} X_{m} \end{matrix}

(69)

and

\begin{matrix} N_{*} = \frac{b_{h} + \sqrt{b_{h}^{2} - 4 h_{4} c_{h}}}{2 h_{4}}, \end{matrix}

(70)

where

\begin{matrix} h_{1} & = & \sum_{i = 1}^{q p} [\sum_{m = 1}^{k} \frac{1}{N_{m} - p_{m} - 1} (X_{m}^{'} Ψ_{m} X_{m})_{i i}]^{2}, \end{matrix}

(71)

\begin{matrix} h_{2} & = & \sum_{1 < i < j < q p} {[\sum_{m = 1}^{k} \frac{1}{N_{m} - p_{m} - 1} {(X_{m}^{'} Ψ_{m} X_{m})}_{i i}] \\ \times [\sum_{m = 1}^{k} \frac{1}{N_{m} - p_{m} - 1} (X_{m}^{'} Ψ_{m} X_{m})_{j j}]}, \end{matrix}

(72)

\begin{matrix} h_{3} & = & \sum_{1 < i < j < q p} [\sum_{m = 1}^{k} \frac{1}{N_{m} - p_{m} - 1} (X_{m}^{'} Ψ_{m} X_{m})_{i j}]^{2}, \end{matrix}

(73)

\begin{matrix} h_{4} & = & \sum_{m = 1}^{k} \sum_{i = 1}^{p_{m}} \frac{2 ψ_{m i i}^{2}}{(N_{m} - p_{m} - 1)^{2} (N_{m} - p_{m} - 3)} \\ + 4 \sum_{m = 1}^{k} \sum_{1 < i < j < p_{m}} \frac{ψ_{m i i} ψ_{m j j} + (N_{m} - p_{m} - 1) ψ_{m i j}^{2}}{(N_{m} - p_{m}) (N_{m} - p_{m} - 1)^{2} (N_{m} - p_{m} - 3)}, \end{matrix}

(74)

b_{h} = (2 h_{1} + 4 h_{3} + 2 h_{4} p + 3 h_{4}),

(75)

and

\begin{matrix} c_{h} & = & (2 h_{1} p - 4 h_{2} + 4 h_{3} p + 4 h_{3} + h_{4} p^{2} + 3 h_{4} p) . \end{matrix}

(76)

The method of moments approximation yields an asymptotic approximation for the sum, as desired.
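The last step of the calculation, Eq 70, has the form of the larger root of the quadratic h₄N² − b_h N + c_h = 0. The sketch below covers only that step: it takes h₁ through h₄ as given inputs, forms b_h and c_h from Eqs 75 and 76, and returns N_*. The function name and the numerical inputs are hypothetical and are not taken from any design in the paper.

```python
import math

def n_star(h1, h2, h3, h4, p):
    """Return (N_*, b_h, c_h) from Eqs 70, 75 and 76, given h1..h4 and p."""
    b_h = 2 * h1 + 4 * h3 + 2 * h4 * p + 3 * h4                                   # Eq 75
    c_h = 2 * h1 * p - 4 * h2 + 4 * h3 * p + 4 * h3 + h4 * p ** 2 + 3 * h4 * p    # Eq 76
    disc = b_h ** 2 - 4 * h4 * c_h
    if disc < 0:
        raise ValueError("negative discriminant: Eq 70 has no real solution")
    return (b_h + math.sqrt(disc)) / (2 * h4), b_h, c_h                           # Eq 70

# Illustrative inputs only.
n_val, b_val, c_val = n_star(h1=1.0, h2=0.5, h3=0.1, h4=0.01, p=2)
residual = 0.01 * n_val ** 2 - b_val * n_val + c_val
print(n_val, residual)
```

The residual should be zero up to floating-point error, which confirms that the returned value satisfies the quadratic.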

Theorem 4. Let n and p be positive integers, μ be a (p × 1) vector of means, and Σ_x ≠ Σ_W be symmetric and positive definite (p × p) matrices. Suppose $x \sim N_{p} (μ, Σ_{x})$ independently of $W \sim W_{p} (n, Σ_{W})$ . Then

\begin{matrix} x^{'} W^{- 1} x \dot{\sim} \frac{λ_{u} n_{u}}{(n + p - 1)} F {n_{u}, (n + p - 1), δ_{u}}, \end{matrix}

(77)

with

λ_{u} = δ_{u}^{- 1} (μ^{'} Σ_{W}^{- 1} μ),

(78)

n_{u} = δ_{u} (μ^{'} Σ_{W}^{- 1} μ)^{- 1} tr (Σ_{W}^{- 1} Σ_{x}),

(79)

and

\begin{matrix} δ_{u} = \frac{(μ^{'} Σ_{W}^{- 1} μ) tr (Σ_{W}^{- 1} Σ_{x}) + 2 (μ^{'} Σ_{W}^{- 1} μ)^{2}}{tr (Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} Σ_{x}) + 2 μ^{'} Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} μ} . \end{matrix}

(80)

Proof. Define $V = x^{'} Σ_{W}^{- 1} x / x^{'} W^{- 1} x$ . Define $U = x^{'} Σ_{W}^{- 1} x$ . Using Lemma 17.10 in Arnold [19, p. 319], it follows that $V | x \sim χ_{n + p - 1}^{2}$ . Hence, V ⊥ x, which implies V ⊥ U.

The expression U is a weighted sum of noncentral χ² random variables [9, Theorem 9.5, p. 176]. Approximate the distribution of U with a single noncentral χ², so that $U \dot{\sim} λ_{u} χ_{n_{u}}^{2} (δ_{u})$ . Using the approach described by Kim et al. [8], obtain values for λ_u, n_u and δ_u by matching the following three moments:

E_{0} {λ_{u} χ_{n_{u}}^{2} (δ_{u})} = E_{0} (U),

(81)

E_{A} {λ_{u} χ_{n_{u}}^{2} (δ_{u})} = E_{A} (U),

(82)

and

\begin{matrix} V_{A} {λ_{u} χ_{n_{u}}^{2} (δ_{u})} = V_{A} (U) . \end{matrix}

(83)

The moments of U are [9, Corollary 9.6.3, p. 179],

E_{0} (U) = tr (Σ_{W}^{- 1} Σ_{x}),

(84)

E_{A} (U) = tr (Σ_{W}^{- 1} Σ_{x}) + μ^{'} Σ_{W}^{- 1} μ,

(85)

and

\begin{matrix} V_{A} (U) = 2 tr (Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} Σ_{x}) + 4 μ^{'} Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} μ . \end{matrix}

(86)

Then the approximate parameters of U are

λ_{u} = δ_{u}^{- 1} (μ^{'} Σ_{W}^{- 1} μ),

(87)

n_{u} = δ_{u} (μ^{'} Σ_{W}^{- 1} μ)^{- 1} tr (Σ_{W}^{- 1} Σ_{x}),

(88)

and

\begin{matrix} δ_{u} = \frac{(μ^{'} Σ_{W}^{- 1} μ) tr (Σ_{W}^{- 1} Σ_{x}) + 2 (μ^{'} Σ_{W}^{- 1} μ)^{2}}{tr (Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} Σ_{x}) + 2 μ^{'} Σ_{W}^{- 1} Σ_{x} Σ_{W}^{- 1} μ} . \end{matrix}

(89)

Since $(U / λ_{u}) \dot{\sim} χ_{n_{u}}^{2} (δ_{u})$ , $V \sim χ_{n + p - 1}^{2}$ , and V ⊥ U,

\begin{matrix} \frac{U / (λ_{u} n_{u})}{V / (n + p - 1)} \dot{\sim} F {n_{u}, (n + p - 1), δ_{u}} . \end{matrix}

Because U/V = x′W⁻¹ x,

\begin{matrix} (x^{'} W^{- 1} x) \frac{(n + p - 1)}{λ_{u} n_{u}} \dot{\sim} F {n_{u}, (n + p - 1), δ_{u}}, \end{matrix}

and the result follows.
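The moment matching in Theorem 4 can be sketched numerically. The code below computes λ_u, n_u and δ_u from μ, Σ_x and Σ_W using the expressions in Eqs 78 to 80, then checks the three matching conditions using the standard mean n + δ and variance 2(n + 2δ) of a noncentral χ² variable. It is restricted to p = 2 so that a closed-form inverse can be used. The function names and all numerical inputs are hypothetical.

```python
def matmul(a, b):
    bt = [list(col) for col in zip(*b)]
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inv2(a):
    """Closed-form inverse of a 2 x 2 matrix (illustration only)."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

def quad(v, a):
    """Quadratic form v' A v for a vector v."""
    n = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))

def moment_match(mu, sigma_x, sigma_w):
    w_inv = inv2(sigma_w)
    a = matmul(w_inv, sigma_x)            # Sigma_W^{-1} Sigma_x
    m1 = quad(mu, w_inv)                  # mu' Sigma_W^{-1} mu
    t1 = trace(a)                         # tr(Sigma_W^{-1} Sigma_x)
    t2 = trace(matmul(a, a))              # tr(Sigma_W^{-1} Sigma_x Sigma_W^{-1} Sigma_x)
    m2 = quad(mu, matmul(a, w_inv))       # mu' Sigma_W^{-1} Sigma_x Sigma_W^{-1} mu
    delta = (m1 * t1 + 2 * m1 ** 2) / (t2 + 2 * m2)   # Eq 80
    lam = m1 / delta                                   # Eq 78
    n_u = delta * t1 / m1                              # Eq 79
    return lam, n_u, delta, (t1, m1, t2, m2)

mu = [1.0, 0.5]
sigma_x = [[1.0, 0.2], [0.2, 2.0]]
sigma_w = [[2.0, 0.3], [0.3, 1.0]]
lam, n_u, delta, (t1, m1, t2, m2) = moment_match(mu, sigma_x, sigma_w)

null_mean = lam * n_u                          # should equal t1
alt_mean = lam * (n_u + delta)                 # should equal t1 + m1
alt_var = 2 * lam ** 2 * (n_u + 2 * delta)     # should equal 2*t2 + 4*m2
print(lam, n_u, delta)
```

Given these three parameters, Eq 77 rescales x′W⁻¹x by (n + p − 1)/(λ_u n_u) to obtain the approximate noncentral F variable.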

Acknowledgments

A portion of this paper was submitted to the University of Colorado Denver in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biostatistics for Dr. Sarah M. Kreidler.

Data Availability

Source code, data and instructions for reproducing the manuscript results are available at http://github.com/SampleSizeShop/mixedPower.

Funding Statement

This study was supported by The National Institute of Dental and Craniofacial Research (www.nih.gov) in the form of a grant awarded to KEM and DHG (NIDCR 1 R01 DE020832-01A1), The National Institute of General Medical Sciences (www.nih.gov) in the form of a grant awarded to KEM and DHG (NIGMS 9R01GM121081-05), and the Office of the Director (www.nih.gov) in the form of a grant awarded to Dana Dabelea, PI (OD 5UG3OD023248-02). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53(3):983–997. doi: 10.2307/2533558
2. Helms RW. Intentionally incomplete longitudinal designs: I. Methodology and comparison of some full span designs. Statistics in medicine. 1992;11(14-15):1889–1913. doi: 10.1002/sim.4780111411
3. Stroup WW. Mixed Model Procedures to Assess Power, Precision, and Sample Size in the Design of Experiments. 1999 Proceedings of the Biopharmaceutical Section, Alexandria, VA: American Statistical Association. 1999; p. 15–24.
4. Tu XM, Kowalski J, Zhang J, Lynch KG, Crits-Christoph P. Power analyses for longitudinal trials and other clustered designs. Statistics in medicine. 2004;23(18):2799–2815. doi: 10.1002/sim.1869
5. Tu XM, Zhang J, Kowalski J, Shults J, Feng C, Sun W, et al. Power analyses for longitudinal study designs with missing data. Statistics in medicine. 2007;26(15):2958–2981. doi: 10.1002/sim.2773
6. Shieh G. A unified approach to power calculation and sample size determination for random regression models. Psychometrika. 2007;72(3):347–360. doi: 10.1007/s11336-007-9012-5
7. Chi YY, Glueck DH, Muller KE. Power and Sample Size for Fixed-Effects Inference in Reversible Linear Mixed Models. The American Statistician. In press. doi: 10.1080/00031305.2017.1415972
8. Kim HY, Gribbin MJ, Muller KE, Taylor DJ. Analytic, Computational, and Approximate Forms for Ratios of Noncentral and Central Gaussian Quadratic Forms. Journal of computational and graphical statistics: a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America. 2006;15(2):443–459. doi: 10.1198/106186006X112954
9. Muller KE, Stewart PW. Linear model theory: univariate, multivariate, and mixed models. Hoboken, New Jersey: John Wiley and Sons; 2006.
10. Johnson NL, Kotz S, Balakrishnan N. Continuous univariate distributions. Wiley & Sons; 1995.
11. Gupta AK, Nagar DK. Matrix variate distributions. Boca Raton, FL: Chapman & Hall; 2000.
12. Verbeke G, Molenberghs G. Linear mixed models for longitudinal data. New York: Springer; 2009.
13. Murray DM. Design and Analysis of Group-Randomized Trials. 1st ed. Oxford University Press, USA; 1998.
14. Anderson TW. An Introduction to Multivariate Statistical Analysis. 2nd ed. Wiley Series in Probability and Statistics. Wiley; 1984.
15. R Development Core. R: A Language and Environment for Statistical Computing. Vienna, Austria; 2010. Available from: http://www.R-project.org/.
16. SAS Institute Inc. SAS 9.3 Software, Version 9.3. Cary, NC; 2013. Available from: http://www.sas.com/software/sas9/.
17. Guo Y, Logan HL, Glueck DH, Muller KE. Selecting a sample size for studies with repeated measures. BMC Medical Research Methodology. 2013;13(1). doi: 10.1186/1471-2288-13-100
18. Mathai AM, Provost SB. Quadratic Forms in Random Variables: Theory and Applications. Marcel Dekker Incorporated; 1992.
19. Arnold SF. The theory of linear models and multivariate analysis. New York: Wiley; 1981.

PLoS One. doi: 10.1371/journal.pone.0254811.r001

Decision Letter 0

Lei Shi

3 Mar 2021

PONE-D-21-00404

A power approximation for the Kenward and Roger Wald test in the linear mixed model

PLOS ONE

Dear Dr. Ringham

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Apr 17 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Lei Shi

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Competing Interests section:

"The authors have declared that no competing interests exist."

We note that one or more of the authors are employed by a commercial company: Sunrun Inc.

2.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

2.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

3. Thank you for stating the following in the Acknowledgments Section of your manuscript:

Funding for this work was provided by NIDCR 1 R01 DE020832-01A1 (Keith E. Muller, PI; Deborah H. Glueck, University of Colorado site PI), NIGMS 9R01GM121081-05 (Deborah H. Glueck, Keith E. Muller, Dana Dabelea, PIs), and OD 5UG3OD023248-02 (Dana Dabelea, PI).

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"KEM, DHG

NIDCR 1 R01 DE020832-01A1

The National Institute of Dental and Craniofacial Research

www.nih.gov

KEM, DHG

NIGMS 9R01GM121081-05

The National Institute of General Medical Sciences

www.nih.gov

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, a power approximation for the Kenward and Roger test is derived. Via Monte Carlo simulation, the authors demonstrate that the new power approximation is accurate for cluster randomised trials and longitudinal study designs. The paper is well written and addresses an interesting problem.

My major issues are listed below.

Major comments:

Comment 1: On line 171, it is claimed that if Σs and β are estimated using multivariate techniques, independence would follow. Provide a reference for this or give a detailed explanation in support of this claim.

Comment 2: The comparison of empirical and proposed powers is done assuming intraclass correlation(ICC) 0.04. This value of ICC is very small and in practice it can vary up to 0.5. Make the comparison of powers for higher values of ICC as well (say 0.1, 0.2, 0.5).

Comment 3: In the spirit of the longitudinal studies, how efficient is the power approximation when the correlation structure is assumed to be auto–regressive?

Comment 4: In Section 5 (Applied Example), rather than assuming the values of standard deviation and intraclass correlation, it is more reasonable to use the estimates of the parameters obtained from the data.

Minor comments:

Comment 1: Check line 122

Reviewer #2: The authors here present a noncentral F power approximation for the Kenward and Roger test. This work is innovative, and the organization of the manuscript is clear and comprehensive. Below are some minor comments that could help further streamline the text.

My review comments has been uploaded as an attachment.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Huang Lin

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review_Report.pdf


Attachment

Submitted filename: Comments.docx


PLoS One. 2021 Jul 21;16(7):e0254811. doi: 10.1371/journal.pone.0254811.r002

Author response to Decision Letter 0

10 May 2021

Response to Reviewers

We thank the reviewers for their kind comments. We have responded to all comments below. Reviewer comments are in italics and our response is in plain type.

General Comments

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Done.

2. Thank you for stating the following in the Competing Interests section:

"The authors have declared that no competing interests exist." We note that one or more of the authors are employed by a commercial company: Sunrun Inc. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

We have removed Dr. Kreidler’s affiliation with SunRun, Inc. Previously we indicated that Dr. Kreidler was affiliated with SunRun Inc. After further review, and examining the policies of PLOS One, we realized that SunRun did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and, in addition, did not provide support in the form of salaries for any author, including SMK.

We have updated Dr. Kreidler’s affiliation to be the Department of Biostatistics and Informatics, University of Colorado Denver. Although Dr. Kreidler is currently employed by SunRun, Inc., all of the work on this manuscript was completed while Dr. Kreidler was a doctoral student at the University of Colorado Denver. Current revisions are being done during Dr. Kreidler’s personal time. No SunRun, Inc. resources were used in the revisions of this manuscript for submission. The publication of this manuscript will not affect SunRun Inc., nor Dr. Kreidler in any financial way.

3. Thank you for stating the following in the Acknowledgments Section of your manuscript: Funding for this work was provided by NIDCR 1 R01 DE020832-01A1 (Keith E. Muller, PI; Deborah H. Glueck, University of Colorado site PI), NIGMS 9R01GM121081-05 (Deborah H. Glueck, Keith E. Muller, Dana Dabelea, PIs), and OD 5UG3OD023248-02 (Dana Dabelea, PI).

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"KEM, DHG

NIDCR 1 R01 DE020832-01A1

The National Institute of Dental and Craniofacial Research

www.nih.gov

KEM, DHG

NIGMS 9R01GM121081-05

The National Institute of General Medical Sciences

www.nih.gov

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

We have provided an amended funding statement in the cover letter for this resubmission.

Reviewer 1 Comments

1. On line 171, it is claimed that if Σs and β are estimated using multivariate techniques, independence would follow. Provide a reference for this or give a detailed explanation in support of this claim.

We now provide a citation to support the statement above where it is mentioned in the manuscript. For reference, the citation is Anderson (1984, pg. 291, Theorem 8.2.2).

2. The comparison of empirical and proposed powers is done assuming intraclass correlation(ICC) 0.04. This value of ICC is very small and in practice it can vary up to 0.5. Make the comparison of powers for higher values of ICC as well (say 0.1, 0.2, 0.5).

We have re-run the simulation after incorporating the reviewer’s suggestions. The new results appear in the revised manuscript. The different ICC’s do not appear to change the accuracy of the results.

3. In the spirit of the longitudinal studies, how efficient is the power approximation when the correlation structure is assumed to be auto–regressive?

We used an auto-regressive correlation structure for the simulation studies described in Section 4.1.2 Longitudinal Designs. The median deviation between the approximate power and the true power was 0.003 (range: -0.010 to 0.016; 1st quartile: 0.00, 3rd quartile: 0.009).
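As an illustration of the kind of structure discussed here, the sketch below builds a first-order auto-regressive correlation matrix, in which the correlation between occasions $j$ and $k$ is $\rho^{|j-k|}$. The first-order form, the function name `ar1_correlation`, and the values of `n_times` and `rho` are assumptions made for illustration; the response does not state the exact parameters used in the simulations.

```python
def ar1_correlation(n_times, rho):
    """Return an n_times x n_times AR(1) correlation matrix.

    Entry (j, k) is rho ** |j - k|, so correlation decays with the
    number of occasions separating two repeated measures.
    """
    return [
        [rho ** abs(j - k) for k in range(n_times)]
        for j in range(n_times)
    ]


# Hypothetical example: 3 repeated measures with lag-1 correlation 0.5.
for row in ar1_correlation(3, 0.5):
    print(row)
```

In contrast to an exchangeable structure, measurements taken further apart in time are less correlated.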

4. In Section 5 (Applied Example), rather than assuming the values of standard deviation and intraclass correlation, it is more reasonable to use the estimates of the parameters obtained from the data.

The example is a synthetic example designed to demonstrate the utility of the calculations for a simple, cluster randomized study with different sized clusters. While the example was inspired by the trial described by Hennrikus et al., the example was so simplified that the reference to the Hennrikus et al. paper was confusing, rather than helpful. We have removed the reference to Hennrikus et al., clarified that the example is synthetic, and added a sentence to describe how a researcher might extract the inputs for the power analysis from review of the literature.

5. Check line 122

We reviewed the line and removed the duplicated equation. Thank you for finding the error.

Reviewer 2 Comments

1. The homoscedasticity assumption (line 136) the author made could be a key limitation of the approximation method with the presence of different random coefficients. Even though it has been mentioned that there are no repeated covariates (line 82), it is worth mentioning it in the Discussion section.

We now describe limitations due to model assumptions in the Discussion section.

“The method does not apply to repeated covariates, which often appear in biomedical studies. However, the method does apply to baseline covariates, a common study design. We make a strong homoscedasticity assumption of equal variance for each independent sampling unit. This assumption means that the power computations are not appropriate for random regression, for models with group differences in variance, or for certain spatial-temporal applications. Nevertheless, the assumption of homoscedasticity is widely made for randomized controlled clinical trials, laboratory studies, and observational studies, which makes the method useful for a variety of cases.”

2. Theorem 3 in the Appendix lacks the reference regarding “the sum of the inverse Wishart distribution asymptotically or approximately follows an inverse Wishart distribution.”

We have now clarified the proof for Theorem 3 to explicitly state what we set out to prove. We added the following text to the end of the proof.

“The method of moments approximation yields an asymptotic approximation for the sum, as desired.”

3. Using $p_i$ to denote the number of observations is counter-intuitive, especially given you define $n = \sum_{i=1}^{N} p_i$. Replacing $p_i$ with $n_i$ would be easier to read.

We recognize the value of using $n$ to denote different sample sizes. However, for this particular application, we use multivariate modeling theory to develop our results. In multivariate notation, it is standard to denote the number of repeated measures by $p$. When the data are stacked, the vector of observations for independent sampling unit $i$ then has $p_i$ elements. We recognize that the notation may appear slightly awkward, but respectfully request that we may keep it as written so that it aligns with standard notation in the field. See, for example, Muller and Stewart (2006).

4. Duplicated terms appeared in line 122.

Thank you for finding the error. We have corrected it.

5. The notation for noncentrality parameter of F distribution is very similar to the notation of Wald statistic, please consider changing one of them.

We agree and have now adjusted the notation for the non-centrality parameter that appears in Section 2.1 Notation.

6. Box-plot elements should be defined (e.g. center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers) in the legend.

Done.
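The box-plot elements listed by the reviewer can be computed as in the sketch below. It assumes the common Tukey convention, in which whiskers extend to the most extreme observations lying within 1.5 times the interquartile range of the box, and it uses the "inclusive" quartile method from Python's standard library. Both choices, the function name `boxplot_elements`, and the example data are illustrative assumptions and may differ from the plotting software used for the figures.

```python
import statistics


def boxplot_elements(values, whisker_factor=1.5):
    """Compute median, quartiles, whisker ends, and outliers for a box plot."""
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    # Whiskers end at the most extreme points inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical data with one extreme value.
example = boxplot_elements([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(example)
```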

Attachment

Submitted filename: Repsonse to Editor Requests 04.pdf

Click here for additional data file.^{(172.8KB, pdf)}

PLoS One. doi: 10.1371/journal.pone.0254811.r003

Decision Letter 1

Lei Shi

5 Jul 2021

A power approximation for the Kenward and Roger Wald test in the linear mixed model

PONE-D-21-00404R1

Dear Dr. Brandy Ringham,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Lei Shi

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

6. Review Comments to the Author

Reviewer #1: In the revised version authors successfully incorporated all the suggestions and corrections.

All comments are addressed adequately.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Huang Lin

PLoS One. doi: 10.1371/journal.pone.0254811.r004

Acceptance letter

Lei Shi

9 Jul 2021

PONE-D-21-00404R1

A Power Approximation for the Kenward and Roger Wald Test in the Linear Mixed Model

Dear Dr. Ringham:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Lei Shi

Academic Editor

PLOS ONE

Associated Data

Supplementary Materials

Attachment

Submitted filename: Review_Report.pdf

Click here for additional data file.^{(78KB, pdf)}

Attachment

Submitted filename: Comments.docx

Click here for additional data file.^{(13.7KB, docx)}

Data Availability Statement

Source code, data and instructions for reproducing the manuscript results are available at http://github.com/SampleSizeShop/mixedPower.

1. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53(3):983–997. doi: 10.2307/2533558

2. Helms RW. Intentionally incomplete longitudinal designs: I. Methodology and comparison of some full span designs. Statistics in medicine. 1992;11(14-15):1889–1913. doi: 10.1002/sim.4780111411

3. Stroup WW. Mixed Model Procedures to Assess Power, Precision, and Sample Size in the Design of Experiments. 1999 Proceedings of the Biopharmaceutical Section, Alexandria, VA: American Statistical Association. 1999; p. 15–24.

4. Tu XM, Kowalski J, Zhang J, Lynch KG, Crits-Christoph P. Power analyses for longitudinal trials and other clustered designs. Statistics in medicine. 2004;23(18):2799–2815. doi: 10.1002/sim.1869

5. Tu XM, Zhang J, Kowalski J, Shults J, Feng C, Sun W, et al. Power analyses for longitudinal study designs with missing data. Statistics in medicine. 2007;26(15):2958–2981. doi: 10.1002/sim.2773

6. Shieh G. A unified approach to power calculation and sample size determination for random regression models. Psychometrika. 2007;72(3):347–360. doi: 10.1007/s11336-007-9012-5

7. Chi YY, Glueck DH, Muller KE. Power and Sample Size for Fixed-Effects Inference in Reversible Linear Mixed Models. The American Statistician. In press. doi: 10.1080/00031305.2017.1415972

8. Kim HY, Gribbin MJ, Muller KE, Taylor DJ. Analytic, Computational, and Approximate Forms for Ratios of Noncentral and Central Gaussian Quadratic Forms. Journal of computational and graphical statistics: a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America. 2006;15(2):443–459. doi: 10.1198/106186006X112954

9. Muller KE, Stewart PW. Linear model theory: univariate, multivariate, and mixed models. Hoboken, New Jersey: John Wiley and Sons; 2006.

10. Johnson NL, Kotz S, Balakrishnan N. Continuous univariate distributions. Wiley & Sons; 1995.

11. Gupta AK, Nagar DK. Matrix variate distributions. Boca Raton, FL: Chapman & Hall; 2000.

12. Verbeke G, Molenberghs G. Linear mixed models for longitudinal data. New York: Springer; 2009.

13. Murray DM. Design and Analysis of Group-Randomized Trials. 1st ed. Oxford University Press, USA; 1998.

14. Anderson TW. An Introduction to Multivariate Statistical Analysis. 2nd ed. Wiley Series in Probability and Statistics. Wiley; 1984.

15. R Development Core. R: A Language and Environment for Statistical Computing. Vienna, Austria; 2010. Available from: http://www.R-project.org/.

16. SAS Institute Inc. SAS 9.3 Software, Version 9.3. Cary, NC; 2013. Available from: http://www.sas.com/software/sas9/.

17. Guo Y, Logan HL, Glueck DH, Muller KE. Selecting a sample size for studies with repeated measures. BMC Medical Research Methodology. 2013;13(1). doi: 10.1186/1471-2288-13-100

18. Mathai AM, Provost SB. Quadratic Forms in Random Variables: Theory and Applications. Marcel Dekker Incorporated; 1992.

19. Arnold SF. The theory of linear models and multivariate analysis. New York: Wiley; 1981.
