Author manuscript; available in PMC: 2019 Mar 27.
Published in final edited form as: Commun Stat Theory Methods. 2017 Aug 7;46(22):11204–11213. doi: 10.1080/03610926.2016.1260744

Sample Size Estimation for Comparing Rates of Change in K-group Repeated Count Outcomes

Ying Lou 1, Jing Cao 2, Chul Ahn 3
PMCID: PMC6436812  NIHMSID: NIHMS1505258  PMID: 30930527

Abstract

Sample size estimation for comparing the rates of change in two-arm repeated measurements has been investigated by many authors. In contrast, the literature has paid relatively little attention to sample size estimation for studies with multi-arm repeated measurements, where the design and data analysis can be more complex than in two-arm trials. For continuous outcomes, Jung & Ahn (2004) and Zhang & Ahn (2013) presented sample size formulas for comparing the rates of change and time-averaged responses, respectively, in multi-arm trials using the generalized estimating equation (GEE) approach. To our knowledge, there has been no corresponding development for multi-arm trials with count outcomes. We present a sample size formula for comparing the rates of change in multi-arm repeated count outcomes using the GEE approach that accommodates various correlation structures, missing data patterns, and unbalanced designs. We conduct simulation studies to assess the performance of the proposed sample size formula under a wide range of design configurations. Simulation results suggest that the empirical type I error and power are maintained close to their nominal levels. The proposed method is illustrated using an epileptic clinical trial example.

Keywords: GEE, sample size, clinical trials, repeated count outcomes

1. Introduction

There has been extensive literature on the derivation of sample size formulas for comparing the rates of change between two treatment arms in repeated measurements studies (Patel & Rowe 1999, Diggle et al. 2002, Jung & Ahn 2003, 2005, Zhang & Ahn 2010, Ahn et al. 2015). In contrast, researchers have paid relatively little attention to the sample size problem in a more complicated scenario: multi-arm clinical trials with repeated measurements, which can be more complex in design, data analysis, and implementation than two-arm trials.

In phase III randomized clinical trials (RCTs), the number of experimental agents that can be tested may be extremely limited since RCTs are expensive and time-consuming. The efficiency of RCTs can be improved by conducting multi-arm trials in which multiple experimental treatment arms are compared with a single control arm simultaneously. By sharing one control arm, a multi-arm trial can require a much smaller sample size than the total sample size of multiple two-arm trials, each separately comparing an experimental agent with the control. The multi-arm trial is also more appealing to patients and physicians because patients have a higher chance of receiving an experimental agent than the control treatment. Freidlin et al. (2008) discussed statistical and logistical issues in the design of multi-arm trials that affect their relative efficiency compared with separate two-arm trials. Multi-arm designs are common in practice: Liu & Dahlberg (1995) reported that 24 of the 112 phase III trials in progress in the Southwest Oncology Group at the time were multi-arm trials.

In the context of multi-arm trials with repeatedly measured continuous outcomes, Jung & Ahn (2004) and Zhang & Ahn (2013) developed sample size formulas for K-sample (K ≥ 3) comparisons of slopes and time-averaged responses, respectively, using the generalized estimating equation (GEE) approach. To our knowledge, there has been no corresponding sample size development for count outcomes. Repeatedly measured count outcomes are frequently encountered in clinical studies (Diggle et al. 2002, Ogungbenro & Aarons 2010); examples include epileptic seizure counts (Thall & Vail 1990) and swollen joint counts in rheumatoid arthritis patients (Tilley et al. 1995). There has been some development in sample size calculation for count outcomes in two-arm trials. For example, Patel & Rowe (1999) provided sample size formulas for comparing the rates of change in repeated count outcomes using GEE, incorporating general correlation structures; however, their approach does not account for missing data. Recently, a more general sample size formula was developed by Lou et al. (in press), which accommodates arbitrary types of missing data patterns. In this study, we investigate sample size calculation for the comparison of slopes in multi-arm clinical trials with repeated measurements of count outcomes.

The proposed sample size calculation approach is based on an approximately normal test statistic under the GEE approach. It is realistic in its ability to accommodate arbitrary types of missing data patterns and correlation structures, inherent for repeated measurements of outcomes in clinical trials. We assess the performance of the sample size formula under various correlation structures, missing data patterns, and observation probabilities through simulation studies. Finally, we illustrate the sample size approach using a real clinical trial example with epileptic patients.

2. Generalized Estimating Equation

A total of n subjects are recruited and randomly assigned to one of K treatment groups. Suppose each subject is scheduled to be measured at J time points t1 < ··· < tJ. Let nk denote the number of subjects assigned to treatment group k, with $\sum_{k=1}^{K} n_k = n$. Then rk = nk/n is the proportion of subjects assigned to the kth treatment. Let ykij be the count outcome observed from the ith subject of the kth treatment group at time tj. Defining μkij = E(ykij), we model ykij by a Poisson model,

$$f(y_{kij}) = \frac{e^{-\mu_{kij}}\,\mu_{kij}^{\,y_{kij}}}{y_{kij}!}. \tag{1}$$

Employing a log link function g(μ) = log(μ), we have

$$g(\mu_{kij}) = \log(\mu_{kij}) = a_k + b_k t_j. \tag{2}$$

Coefficients ak and bk are the group-specific intercept and slope parameters. Hence the first moment of ykij is modeled as $E(y_{kij}) = \mu_{kij}(a_k, b_k) = g^{-1}(a_k + b_k t_j) = e^{a_k + b_k t_j}$.

As for the second moment, recall that under a Poisson model, Var(ykij) = μkij. Furthermore, we use Corr(ykij, ykij′) = ρjj′ (with ρjj = 1) to denote the within-subject correlation. We assume the observations to be independent across subjects.
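As a concrete check of these first two moments, a minimal numeric sketch (the parameter values below are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Illustrative design: J = 6 equally spaced visits on [0, 1],
# intercept a_k = 0 and slope b_k = 0.25 (assumed values for the sketch).
t = np.linspace(0.0, 1.0, 6)
a_k, b_k = 0.0, 0.25

mu = np.exp(a_k + b_k * t)   # E(y_kij) = exp(a_k + b_k * t_j), the log-link mean
var = mu                     # under the Poisson model, Var(y_kij) = mu_kij
```

Note that the slope enters the variance as well as the mean; this is the feature that later makes the intercepts matter for the sample size.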

The GEE estimators of parameters b = (a1, b1, · · · , aK, bK), denoted as b^, can be solved from Un(b) = 0, where Un(b) contains the score functions

$$U_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\{y_{1ij} - \mu_{1ij}(b)\} \\ \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\{y_{1ij} - \mu_{1ij}(b)\}\, t_j \\ \vdots \\ \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\{y_{Kij} - \mu_{Kij}(b)\} \\ \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\{y_{Kij} - \mu_{Kij}(b)\}\, t_j \end{pmatrix}. \tag{3}$$

Note that in (3) the score functions are obtained using an independent working correlation structure, which greatly simplifies the derivation of the sample size estimate. When the true correlation is unknown, which is usually the case in practice, an independent working correlation has been used because the parameter estimates remain consistent and the computation is usually more stable and efficient (Liang & Zeger 1986, Crowder 1995, McDonald 1993).

We solve Un(b) = 0 using the Newton-Raphson algorithm. Specifically, at the lth iteration,

$$\hat b^{(l)} = \hat b^{(l-1)} + A_n^{-1}\big(\hat b^{(l-1)}\big)\, U_n\big(\hat b^{(l-1)}\big),$$

where

$$A_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\mu_{1ij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\mu_{Kij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix}.$$

Note that An(b) is a block-diagonal matrix. By Liang & Zeger (1986), $\sqrt{n}(\hat b - b) \to N(0, V)$ in distribution as n → ∞. The covariance matrix V can be consistently estimated by $V_n = W A_n^{-1}(\hat b)\,\Sigma_n\,A_n^{-1}(\hat b)\,W$, where

$$\Sigma_n = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\Big\{\sum_{j=1}^{J}\hat\epsilon_{1ij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\Big\{\sum_{j=1}^{J}\hat\epsilon_{Kij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} \end{pmatrix},$$

W is a diagonal matrix with diagonal elements $(1/\sqrt{r_1}, 1/\sqrt{r_1}, \ldots, 1/\sqrt{r_K}, 1/\sqrt{r_K})$, $\hat\epsilon_{kij} = y_{kij} - \mu_{kij}(\hat b)$ denotes the residual, and $c^{\otimes 2} = c c^T$ for a vector c.
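Because of the working-independence structure, the score equations separate by group, so $\hat b$ can be obtained one group at a time. A sketch of the Newton iteration for a single group (the function name and array layout are ours):

```python
import numpy as np

def fit_group(y, t, iters=25):
    """Newton-Raphson for one group's (a_k, b_k) under working independence.

    y: (n_k, J) array of counts; t: length-J vector of measurement times.
    """
    X = np.column_stack([np.ones(len(t)), t])     # rows are (1, t_j)
    beta = np.zeros(2)                            # start at (a_k, b_k) = (0, 0)
    for _ in range(iters):
        mu = np.exp(X @ beta)                     # mu_kj, common to all subjects
        score = X.T @ (y - mu).sum(axis=0)        # sum_i sum_j (y_kij - mu_kj)(1, t_j)'
        hess = y.shape[0] * (X.T * mu) @ X        # sum_i sum_j mu_kj (1, t_j)'(1, t_j)
        beta = beta + np.linalg.solve(hess, score)
    return beta
```

With a large group this recovers the true (a_k, b_k); the sandwich pieces A_n and Σ_n can then be assembled from `mu` and the residuals `y - mu`.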

To compare the slope parameters among K groups, the hypotheses of interest are H0 : b1 = ··· = bK, versus H1 : b1 = θ1,··· ,bK = θK (at least one θk is different from the others). We can construct a test statistic

$$Z = \frac{C\hat b}{\sqrt{\mathrm{Var}(C\hat b)}}, \tag{4}$$

where C is a vector defining a contrast of the slope parameters. For example, without loss of generality, let k = 1 denote the control arm and k = 2, · · · , K denote the experimental arms. We can specify $C = \big(0, 1, 0, -\tfrac{1}{K-1}, \ldots, 0, -\tfrac{1}{K-1}\big)$. Note that the elements corresponding to the intercepts (ak) are set to 0. We reject the null hypothesis if |Z| > z1−α/2, where z1−α/2 is the 100(1 − α/2)th percentile of the standard normal distribution.
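The contrast that pits the control slope against the average of the experimental slopes can be built mechanically for any K (a small sketch; the function name is ours):

```python
import numpy as np

def slope_contrast(K):
    """C = (0, 1, 0, -1/(K-1), ..., 0, -1/(K-1)) for b = (a_1, b_1, ..., a_K, b_K)."""
    C = np.zeros(2 * K)
    C[1] = 1.0                    # slope of the control arm (k = 1)
    C[3::2] = -1.0 / (K - 1)      # slopes of the K - 1 experimental arms
    return C                      # intercept positions stay 0
```

For K = 4 this gives (0, 1, 0, −1/3, 0, −1/3, 0, −1/3), so Cθ is the control slope minus the mean experimental slope.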

3. Sample Size Calculation

Let A and Σ be the limits of An and Σn as n → ∞, respectively. Then the limit of Vn is V = WA−1ΣA−1W. Based on test statistic (4), given type I error α and power 1 − γ, the sample size needed to reject H0 : b1 = ··· = bK under the truth b1 = θ1,··· ,bK = θK is

$$n = \frac{(z_{1-\alpha/2} + z_{1-\gamma})^2\, C V C^T}{(C\theta)^2}, \tag{5}$$

where θ = (ϑ1, θ1, ··· , ϑK, θK). Here ϑk denotes the true value of ak and θk the true value of bk. It is noteworthy that although we are only interested in testing hypotheses about the slopes bk (k = 1,··· ,K), for count outcomes with a Poisson distribution the intercept parameters ak (k = 1,··· ,K) still affect the test statistic and sample size through $\mu_{kij} = e^{a_k + b_k t_j}$, which is also the variance of ykij.

In clinical trials, researchers frequently encounter missing data for various reasons. In the following, we present a generalized derivation of V to accommodate the presence of missing data under the MCAR (missing completely at random) assumption. Specifically, we introduce a missing indicator Δkij, which takes value 0/1 for missed/observed measurements. Then An and Σn with missing data can be expressed as

$$A_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\Delta_{1ij}\,\mu_{1ij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\Delta_{Kij}\,\mu_{Kij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix}$$

and

$$\Sigma_n = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\Big\{\sum_{j=1}^{J}\Delta_{1ij}\,\hat\epsilon_{1ij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\Big\{\sum_{j=1}^{J}\Delta_{Kij}\,\hat\epsilon_{Kij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} \end{pmatrix}.$$

Let δj = E(Δkij) be the marginal probability of obtaining an observation at time tj and δjj′ = E(ΔkijΔkij′) be the joint probability of a subject having observations at both tj and tj′. Note that δjj = δj. The above specification effectively assumes that the probabilities of missing data depend on time only. By specifying δj and δjj′, this approach allows us to accommodate arbitrary types of missing patterns. Utilizing the fact that μkij = μkj (the mean does not depend on subject i), it can be shown that An(b) and Σn converge to

$$A(b) = \begin{pmatrix} \sum_{j=1}^{J}\delta_j\,\mu_{1j}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \sum_{j=1}^{J}\delta_j\,\mu_{Kj}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix},$$

and

$$\Sigma = \begin{pmatrix} \sum_{j=1}^{J}\sum_{j'=1}^{J}\delta_{jj'}\,\rho_{jj'}\sqrt{\mu_{1j}\,\mu_{1j'}}\begin{pmatrix}1 & t_{j'}\\ t_j & t_j t_{j'}\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \sum_{j=1}^{J}\sum_{j'=1}^{J}\delta_{jj'}\,\rho_{jj'}\sqrt{\mu_{Kj}\,\mu_{Kj'}}\begin{pmatrix}1 & t_{j'}\\ t_j & t_j t_{j'}\end{pmatrix} \end{pmatrix},$$

respectively. By plugging A(b) and Σ into (5), the required sample size can be calculated with $C = \big(0, 1, 0, -\tfrac{1}{K-1}, \ldots, 0, -\tfrac{1}{K-1}\big)$ and $W = \mathrm{diag}\big(1/\sqrt{r_1}, 1/\sqrt{r_1}, \ldots, 1/\sqrt{r_K}, 1/\sqrt{r_K}\big)$.

In summary, to assess the sample size requirement for comparing the slope parameters of a count outcome among K arms, besides power 1 − γ and type I error α, the information needed at the design stage includes: the measurement schedule characterized by (t1,··· ,tJ), the true correlation structure by ρjj′, the missing data pattern by δj and δjj′, the randomization ratios rk, and the true values of the parameters θ (both intercepts and slopes).
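Formula (5) can be coded directly from these ingredients. A sketch for a balanced design (function and argument names are ours; `statistics.NormalDist` supplies the normal quantiles, the block-diagonal structure of A and Σ is exploited so V is never formed in full, and rounding the per-group size up to an integer is our choice):

```python
import math
import numpy as np
from statistics import NormalDist

def n_per_group(a, b, t, delta, delta_joint, rho, alpha=0.05, power=0.8):
    """Per-group sample size for H0: b_1 = ... = b_K, balanced design.

    a, b: length-K true intercepts and slopes; t: length-J times;
    delta: length-J marginal observation probabilities;
    delta_joint, rho: J x J joint observation probabilities and correlations.
    """
    K, t = len(a), np.asarray(t, float)
    V = np.column_stack([np.ones(len(t)), t])            # rows v_j = (1, t_j)
    slope_var = []
    for k in range(K):
        mu = np.exp(a[k] + b[k] * t)                     # mu_kj under H1
        s = np.sqrt(mu)
        A = V.T @ np.diag(delta * mu) @ V                # sum_j delta_j mu_kj v_j v_j'
        Sig = V.T @ (delta_joint * rho * np.outer(s, s)) @ V
        M = np.linalg.inv(A) @ Sig @ np.linalg.inv(A)
        slope_var.append(K * M[1, 1])                    # W contributes 1/r_k = K
    C_slope = np.r_[1.0, np.full(K - 1, -1.0 / (K - 1))] # contrast on slopes only
    CVC = np.sum(C_slope**2 * np.array(slope_var))
    Ctheta = b[0] - np.mean(b[1:])
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z**2 * CVC / Ctheta**2 / K)
```

For the Table 1 configuration with no missing data (δ1), CS correlation with ρ = 0.3, intercepts 0, and slopes (0, 0.25, 0.25, 0.25), this sketch returns 163 per group, matching the table.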

4. Numerical Studies

To investigate the performance of the proposed sample size formula, we conduct extensive simulations under different parameter settings. Suppose that K = 4 treatment groups are compared under a balanced design with J = 6 equidistant measurements obtained at times tj = (j − 1)/(J − 1), j = 1, · · · , 6. We explore two missing patterns: independent missing (IM) and monotone missing (MM). Independent missing means that the occurrence of a missing value at time tj is independent of the occurrence at any other time tj′; that is, δjj′ = δjδj′. Monotone missing means that if a subject misses a clinical visit at time tj, then he/she will miss all the remaining visits; under MM, we have δjj′ = δj′ for j′ > j. Note that IM and MM lead to different joint probabilities δjj′. For the marginal probabilities δ = (δ1,··· ,δJ)′, we investigate four scenarios:

δ1 = (1, 1, 1, 1, 1, 1), δ2 = (1, 0.95, 0.9, 0.85, 0.8, 0.75), δ3 = (1, 0.99, 0.96, 0.91, 0.84, 0.75), δ4 = (1, 0.91, 0.84, 0.79, 0.76, 0.75).

All scenarios assume complete observations at t1. δ1 corresponds to the scenario of no missing data across the study period. Under δ2, δ3, and δ4, the missing probabilities follow different trends but share an equal proportion (25%) of missing values at the end of the study. Specifically, δ2 represents a linear trend. Under δ3 there are few missing values initially, but the proportion of missing values increases rapidly toward the end of the study. Under δ4 the trend is the opposite of δ3.
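The two missing patterns determine the joint probabilities δjj′ from the marginals. A small sketch (the function name is ours; the MM branch assumes nonincreasing δ, as in all four scenarios above):

```python
import numpy as np

def joint_obs_probs(delta, pattern):
    """delta_{jj'} = E(Delta_kij * Delta_kij') under IM or MM missing."""
    d = np.asarray(delta, float)
    if pattern == "IM":                 # missingness independent across visits
        D = np.outer(d, d)
    elif pattern == "MM":               # monotone: observed later => observed earlier
        D = np.minimum.outer(d, d)      # delta_{jj'} = delta at the later visit
    else:
        raise ValueError(pattern)
    np.fill_diagonal(D, d)              # delta_{jj} = delta_j
    return D
```

The resulting J × J matrix plugs directly into the Σ limit in Section 3.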

We consider two structures for the within-subject correlation: compound symmetric (CS) with ρjj′ = ρ for j ≠ j′, and AR(1) with $\rho_{jj'} = \rho^{|t_j - t_{j'}|}$. We explore two values for the correlation parameter ρ: 0.3 and 0.5. The nominal type I error and power are set at α = 0.05 and 1 − γ = 0.8, respectively. We assume b1 = · · · = b4 = 0 under the null hypothesis. We consider two types of alternative hypotheses for clinical trials with K treatments. One assumes the first group receives the control treatment and the remaining K − 1 groups receive experimental treatments with similar efficacy; hence the true slopes are specified as θ1 = 0, θ2 = θ3 = θ4 = 0.25. The other assumes the K − 1 experimental groups are ordered with respect to treatment effect; hence we set θ1 = 0, θ2 = 0.12, θ3 = 0.24, θ4 = 0.36. Finally, the intercept parameters are specified as a1 = · · · = a4 = 0. For each combination of design parameters, we conduct the simulation as follows:

Step 1: Calculate sample size n according to formula (5).

Step 2: Generate random samples of size n under H0 and H1, respectively. Correlated count outcomes are generated using R-package corcounts (Erhardt & Czado 2009).

Step 3: Generate missing indicators according to specified missing pattern and marginal observation probabilities.

Step 4: Calculate b^, An, and Σn, and obtain test statistic Z. Reject the null hypothesis if |Z| > z1−α/2.

Step 5: Repeat Step 2 to Step 4 for L = 5000 times. The empirical type I error and power are calculated as the proportions of rejections among the 5000 repetitions under the null and alternative hypotheses, respectively.
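Step 2 relies on the R package corcounts; a rough Gaussian-copula stand-in in Python is sketched below. It reproduces the Poisson marginals exactly, but the induced Pearson correlation only approximates the target matrix R, so this is a sketch rather than a replacement for the generator used in the paper (function names are ours):

```python
import math
import numpy as np

def poisson_ppf(u, lam):
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(lam)."""
    k, p = 0, math.exp(-lam)
    cdf = p
    while cdf < u:
        k += 1
        p *= lam / k
        cdf += p
    return k

def correlated_counts(mu, R, n, rng):
    """n draws of a J-vector of correlated Poisson(mu_j) counts via a Gaussian copula."""
    L = np.linalg.cholesky(R)
    z = rng.standard_normal((n, len(mu))) @ L.T          # latent N(0, R) draws
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))  # Phi(z)
    return np.array([[poisson_ppf(u[i, j], mu[j]) for j in range(len(mu))]
                     for i in range(n)])
```

Missing indicators (Step 3) can then be drawn per subject and multiplied in before computing the GEE quantities of Step 4.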

Table 1 summarizes sample size estimates based on (5), as well as the empirical powers and type I errors, under a1 = · · · = a4 = 0. Results from a similar simulation, except with different intercepts (a1 = · · · = a4 = 0.1), are presented in Table 2. First, we observe that the empirical powers and type I errors are close to the nominal levels, indicating that the proposed sample size method performs well across a wide range of design configurations. Sample sizes under the AR(1) correlation structure are always larger than those under the CS correlation structure given the same ρ. Furthermore, given the same marginal observation probabilities, the monotone missing pattern leads to a larger sample size requirement than the independent missing pattern. Finally, comparison of Table 1 and Table 2 suggests that sample sizes differ under different levels of the intercepts a1–a4, despite the same treatment effects represented by the slopes b1–b4. In the design of clinical trials with count outcomes, the baseline level is thus an important design input, which is generally not the case for trials with continuous outcomes.

Table 1:

Sample size per group (Empirical power, Empirical type I error) under a1 = a2 = a3 = a4 = 0.

Missing δ CS AR(1)
ρ = 0.3 ρ = 0.5 ρ = 0.3 ρ = 0.5
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 163(0.812,0.059) 117(0.811,0.061) 245(0.801,0.056) 175(0.805,0.055)
δ2 198(0.813,0.046) 151(0.802,0.058) 280(0.806,0.053) 209(0.807,0.057)
δ3 194(0.800,0.052) 148(0.809,0.050) 277(0.810,0.056) 206(0.800,0.057)
δ4 202(0.807,0.060) 155(0.799,0.056) 282(0.807,0.051) 212(0.809,0.052)
MM δ1 163(0.823,0.054) 117(0.803,0.064) 245(0.812,0.058) 175(0.809,0.056)
δ2 204(0.817,0.054) 162(0.800,0.059) 299(0.806,0.053) 230(0.810,0.048)
δ3 201(0.801,0.053) 158(0.805,0.058) 294(0.806,0.054) 225(0.802,0.050)
δ4 208(0.814,0.053) 166(0.798,0.059) 305(0.805,0.049) 235(0.802,0.051)
One control vs. three ordered treatments b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 176(0.807,0.061) 126(0.815,0.060) 264(0.808,0.057) 189(0.803,0.048)
δ2 214(0.799,0.054) 163(0.796,0.061) 302(0.805,0.045) 226(0.801,0.051)
δ3 210(0.809,0.056) 159(0.807,0.055) 299(0.805,0.049) 223(0.807,0.051)
δ4 218(0.802,0.056) 168(0.799,0.058) 304(0.804,0.046) 229(0.800,0.054)
MM δ1 176(0.806,0.065) 126(0.797,0.056) 264(0.809,0.060) 189(0.814,0.054)
δ2 221(0.801,0.059) 175(0.794,0.049) 323(0.807,0.053) 248(0.804,0.051)
δ3 216(0.800,0.061) 171(0.801,0.058) 317(0.800,0.052) 242(0.795,0.055)
δ4 225(0.801,0.053) 179(0.797,0.055) 329(0.801,0.052) 254(0.799,0.053)

Table 2:

Sample size per group (Empirical power, Empirical type I error) under a1 = a2 = a3 = a4 = 0.1.

Missing δ CS AR(1)
ρ = 0.3 ρ = 0.5 ρ = 0.3 ρ = 0.5
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 147(0.809,0.063) 106(0.806,0.066) 221(0.815,0.048) 158(0.802,0.054)
δ2 179(0.811,0.059) 137(0.798,0.058) 253(0.803,0.053) 189(0.802,0.050)
δ3 176(0.807,0.055) 134(0.810,0.052) 251(0.803,0.055) 187(0.798,0.057)
δ4 183(0.822,0.057) 140(0.810,0.056) 255(0.808,0.053) 192(0.803,0.059)
MM δ1 147(0.808,0.061) 106(0.802,0.061) 221(0.814,0.048) 158(0.807,0.060)
δ2 185(0.809,0.057) 147(0.804,0.054) 271(0.814,0.052) 208(0.807,0.051)
δ3 182(0.811,0.063) 143(0.809,0.058) 266(0.804,0.054) 203(0.802,0.053)
δ4 189(0.817,0.056) 150(0.809,0.059) 276(0.806,0.049) 213(0.806,0.051)
One control vs. three ordered treatments b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 159(0.807,0.055) 114(0.796,0.062) 239(0.809,0.057) 171(0.806,0.057)
δ2 193(0.811,0.055) 148(0.800,0.056) 273(0.793,0.053) 204(0.805,0.060)
δ3 190(0.803,0.054) 144(0.811,0.060) 270(0.808,0.053) 201(0.811,0.053)
δ4 197(0.807,0.060) 152(0.801,0.055) 275(0.803,0.055) 207(0.810,0.052)
MM δ1 159(0.805,0.064) 114(0.799,0.057) 239(0.792,0.052) 171(0.807,0.054)
δ2 200(0.802,0.057) 158(0.797,0.057) 292(0.796,0.057) 224(0.813,0.053)
δ3 196(0.803,0.057) 154(0.793,0.057) 287(0.808,0.052) 219(0.808,0.057)
δ4 203(0.808,0.054) 162(0.794,0.061) 297(0.801,0.055) 230(0.802,0.054)

Examining sample sizes under different values of δ in Tables 1 and 2, we conclude that the proposed sample size method can appropriately account for missing data. Let n0 be the required sample size under no missing data (δ1) and q be the dropout rate at the end of the study. Traditionally, to account for missing data, sample sizes have been computed as n0/(1 − q), which is conservative because it ignores the information contributed by subjects with partial observations. Although δ2–δ4 represent three different missing scenarios with the same dropout rate at the end of the study (q = 0.25), the sample sizes required under δ2–δ4 are always smaller than n0/(1 − q). For example, in Table 1, the sample sizes per group under δ2–δ4 are 198, 194, and 202, respectively, under three equally effective experimental treatments, independent missing, ρ = 0.3, and the CS correlation structure. The traditional adjustment for missing data, however, produces a sample size of 163/0.75 ≈ 217, which is 7.4% to 11.9% larger than actually needed according to the proposed method.

In Table 3 we further list the sample size requirements under a wide range of values of the correlation parameter ρ. It shows that, with the other design parameters held the same, a stronger correlation is associated with a smaller sample size when comparing the slopes of a count outcome among K ≥ 3 groups. A similar relationship has been observed for continuous outcomes by Jung & Ahn (2004).

Table 3:

Sample size per group for different values of ρ under a1 = a2 = a3 = a4 = 0.

Missing δ CS AR(1)
ρ = 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 209 163 117 70 24 310 245 175 105 35
δ2 245 198 151 105 58 345 280 209 139 69
δ3 241 194 148 101 54 343 277 206 135 66
δ4 249 202 155 109 62 347 282 212 142 73
MM δ1 209 163 117 70 24 310 245 175 105 35
δ2 247 204 162 120 77 362 299 230 160 90
δ3 243 201 158 116 73 357 294 225 155 86
δ4 251 208 166 124 81 367 305 235 165 95
Four ordered treatments: b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 226 176 126 76 26 334 264 189 113 39
δ2 264 214 163 113 63 372 302 226 150 75
δ3 260 210 159 109 59 370 299 223 146 71
δ4 268 218 168 117 67 374 304 229 154 79
MM δ1 226 176 126 76 26 334 264 189 113 39
δ2 266 221 175 129 83 390 323 248 173 98
δ3 262 216 171 125 79 385 317 242 168 93
δ4 270 225 179 134 88 396 329 254 178 103

5. Example

In a randomized epilepsy clinical trial, epileptic patients will be randomly assigned to one of four medications: placebo, tegretol, felbatol, and lamictal. The number of epileptic seizures will be recorded at baseline and at four consecutive two-week intervals for each patient. Hence we have J = 5, and the measurement times are coded as (t1,··· ,t5) = (0, 0.25, 0.5, 0.75, 1).

An investigator wants to design a study that compares the three medications with placebo. The null hypothesis states that there is no difference in the rate of change in the number of seizures over the 8-week treatment period among the 4 groups. A previous study (Diggle et al. 2002) shows that the number of seizures fluctuates around 8 over the five measurement time points in the placebo group, based on which we specify the intercept parameters as ak = log(8) = 2.08 (k = 1,· · · ,4) and the slope parameter in the placebo group as b1 = 0. Suppose a clinically meaningful difference in the number of seizures between placebo and the other medication groups after the 8-week treatment period is 4. The true slope parameter can then be obtained by solving exp(ak + bktJ) = 8 − 4 = 4. Plugging in ak = 2.08 and tJ = 1, we have bk = −0.693 for k = 2, 3, 4. The hypotheses of interest are then H0 : b1 = · · · = b4 = 0 versus H1 : b1 = 0, b2 = b3 = b4 = −0.693. We assume the marginal observation probabilities to be δ = (1, 0.95, 0.9, 0.85, 0.8), which implies a linear trend with a dropout rate of 20% at the end of the study, and a correlation parameter of ρ = 0.5. We calculate the sample size requirement under a balanced design to achieve 80% power at the 5% two-sided type I error level. Under the CS correlation structure with missing patterns IM and MM, the required sample sizes are 50 and 53 per group, respectively. Under AR(1) with missing patterns IM and MM, the required sample sizes are 63 and 67 per group, respectively.
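The slope used in this example follows directly from the log link; a quick arithmetic check with the values from the trial description:

```python
import math

a = math.log(8)      # baseline level: about 8 seizures per interval, so a_k = log(8)
b = math.log(4) - a  # solve exp(a + b * t_J) = 4 with t_J = 1
# b = -log(2), i.e. approximately -0.693, the common slope of the three
# active arms under H1
```

The count at the end of the study is exp(a + b) = 4, the clinically meaningful target.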

6. Discussion

We have presented a sample size formula for comparing the rates of change in a K-group (K ≥ 3) study based on repeated count outcomes, which allows incorporation of general correlation structures, missing data patterns, and unbalanced experimental designs. We employed the GEE method for sample size estimation, which has been widely used for the analysis of repeated measurement studies owing to its robustness to randomly missing data and to misspecification of the true correlation structure. Simulation studies show that the empirical type I errors and powers are close to their nominal levels under various correlation structures and missing data patterns. We demonstrate that the sample size decreases as ρ increases in studies comparing the rates of change among K (≥ 3) groups, as was also observed for repeated continuous outcomes (Jung & Ahn 2004). The actual information loss caused by missing data depends on various design factors, such as the correlation structure, the missing pattern, and the trend in the marginal observation probabilities. The proposed sample size method appropriately takes such factors into consideration, and the resulting sample size is much smaller than that under the traditional adjustment for missing data.

In practice the true correlation structure is usually unknown at the design stage. We use an independent working correlation structure because it not only simplifies the derivation but also improves the applicability of the proposed sample size approach to clinical trials where prior knowledge about the correlation is limited. When the true correlation structure is known, using an independent working correlation leads to a loss in efficiency; in this case the proposed approach provides a conservative sample size relative to that calculated under a correctly specified working correlation.

The proposed sample size approach is based on a Poisson model (1), which implies a marginal variance Var(ykij) = μkij. In order to accommodate under- or over-dispersion, a different model (such as a negative binomial model) needs to be employed, which will be the topic of our future research.

The software to estimate the sample size for comparing rates of change in K-group repeated count outcomes can be obtained from http://faculty.smu.edu/jcao/Rcode-KgroupCount-Slope.zip

Acknowledgment

The work was supported in part by NIH grant 1UL1TR001105, AHRQ grant R24HS22418, NSF grant IIS-1302497–04, and CPRIT grants RP110562-C1 and RP120670-C1.

Contributor Information

Ying Lou, Department of Statistical Science, Southern Methodist University, Dallas, TX.

Jing Cao, Department of Statistical Science, Southern Methodist University, Dallas, TX.

Chul Ahn, Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX.

References

  1. Ahn C, Heo M & Zhang S (2015), Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical Research, Chapman & Hall, New York.
  2. Crowder M (1995), 'On the use of a working correlation matrix in using generalised linear models for repeated measures', Biometrika 82, 407–410.
  3. Diggle P, Heagerty P, Liang K & Zeger S (2002), Analysis of Longitudinal Data (2nd ed.), Oxford University Press.
  4. Erhardt V & Czado C (2009), 'A method for approximately sampling high-dimensional count variables with prespecified Pearson correlation', Technical Report.
  5. Freidlin B, Korn E, Gray R & Martin A (2008), 'Multi-arm clinical trials of new agents: some design considerations', Clinical Cancer Research 14, 4368–4371.
  6. Jung S & Ahn C (2003), 'Sample size estimation for GEE method for comparing slopes in repeated measurements data', Statistics in Medicine 22(8), 1305–1315.
  7. Jung S & Ahn C (2004), 'K-sample test and sample size calculation for comparing slopes in data with repeated measurements', Biometrical Journal 46(5), 554–564.
  8. Jung S & Ahn C (2005), 'Sample size for repeated binary measurements using GEE', Statistics in Medicine 24, 2583–2596.
  9. Liang K & Zeger S (1986), 'Longitudinal data analysis using generalized linear models', Biometrika 73(1), 13–22.
  10. Liu P & Dahlberg S (1995), 'Design and analysis of multiarm clinical trials with survival endpoints', Controlled Clinical Trials 16, 119–130.
  11. Lou Y, Cao J, Zhang S & Ahn C (in press), 'Sample size estimation for a two-group comparison of repeated count outcomes using GEE', Communications in Statistics - Theory and Methods.
  12. McDonald B (1993), 'Estimating logistic regression parameters for bivariate binary data', Journal of the Royal Statistical Society, Series B 55, 391–397.
  13. Ogungbenro K & Aarons L (2010), 'Sample size/power calculations for population pharmacodynamic experiments involving repeated-count measurements', Journal of Biopharmaceutical Statistics 20(5), 1026–1042.
  14. Patel H & Rowe E (1999), 'Sample size for comparing linear growth curves', Journal of Biopharmaceutical Statistics 9(2), 339–350.
  15. Thall PF & Vail SC (1990), 'Some covariance models for longitudinal count data with overdispersion', Biometrics 46, 657–671.
  16. Tilley B, Alarcon G, Heyse S, Trentham D, Neuner R, Kaplan D, Clegg D, Leisen J, Buckley L, Cooper S, Duncan H, Pillemer S, Tuttleman M & Fowler S (1995), 'Minocycline in rheumatoid arthritis: a 48-week, double-blind, placebo-controlled trial', Annals of Internal Medicine 122(2), 81–89.
  17. Zhang S & Ahn C (2010), 'Effects of correlation and missing data on sample size estimation in longitudinal clinical trials', Pharmaceutical Statistics 9(1), 2–9.
  18. Zhang S & Ahn C (2013), 'Sample size calculation for comparing time-averaged responses in K-group repeated measurement studies', Computational Statistics and Data Analysis 58, 283–291.
