Author manuscript; available in PMC: 2019 Mar 27.
Published in final edited form as: Commun Stat Theory Methods. 2017 Aug 7;46(22):11204–11213. doi: 10.1080/03610926.2016.1260744

Sample Size Estimation for Comparing Rates of Change in K-group Repeated Count Outcomes

Ying Lou 1, Jing Cao 2, Chul Ahn 3
PMCID: PMC6436812  NIHMSID: NIHMS1505258  PMID: 30930527

Abstract

Sample size estimation for comparing the rates of change in two-arm repeated measurements has been investigated by many authors. In contrast, the literature has paid relatively little attention to sample size estimation for studies with multi-arm repeated measurements, where the design and data analysis can be more complex than in two-arm trials. For continuous outcomes, Jung & Ahn (2004) and Zhang & Ahn (2013) presented sample size formulas for comparing the rates of change and time-averaged responses, respectively, in multi-arm trials using the generalized estimating equation (GEE) approach. To our knowledge, there has been no corresponding development for multi-arm trials with count outcomes. We present a sample size formula for comparing the rates of change in multi-arm repeated count outcomes using the GEE approach that accommodates various correlation structures, missing data patterns, and unbalanced designs. We conduct simulation studies to assess the performance of the proposed sample size formula under a wide range of design configurations. Simulation results suggest that the empirical type I error and power are maintained close to their nominal levels. The proposed method is illustrated using an epileptic clinical trial example.

Keywords: GEE, sample size, clinical trials, repeated count outcomes

1. Introduction

There has been extensive literature on the derivation of sample size formulas for comparing the rates of change between two treatment arms in repeated measurements studies (Patel & Rowe 1999, Diggle et al. 2002, Jung & Ahn 2003, 2005, Zhang & Ahn 2010, Ahn et al. 2015). In contrast, researchers have paid relatively little attention to the sample size problem in a more complicated scenario: multi-arm clinical trials with repeated measurements, which can be more complex in design, data analysis, and implementation than two-arm trials.

In phase III randomized clinical trials (RCTs), the number of experimental agents that can be tested may be extremely limited since RCTs are expensive and time-consuming. The efficiency of RCTs can be improved by conducting multi-arm trials in which multiple experimental treatment arms are compared with a single control arm simultaneously. By sharing one control arm, a multi-arm trial can require a much smaller sample size than the total sample size of multiple two-arm trials, each separately comparing an experimental agent with the control. The multi-arm trial is also more appealing to patients and physicians because patients have a higher chance of receiving an experimental agent than the control treatment. Freidlin et al. (2008) discussed statistical and logistical issues in the design of multi-arm trials that affect their relative efficiency compared with separate two-arm trials. Multi-arm designs are common in practice: Liu & Dahlberg (1995) reported that 24 of the 112 phase III trials in progress in the Southwest Oncology Group at the time were multi-arm trials.

In the context of multi-arm trials with repeatedly measured continuous outcomes, Jung & Ahn (2004) and Zhang & Ahn (2013) developed sample size formulas for K-sample (K ≥ 3) comparisons of slopes and time-averaged responses, respectively, using the generalized estimating equation (GEE) approach. To our knowledge, there has been no corresponding sample size development for count outcomes. Repeatedly measured count outcomes are frequently encountered in clinical studies (Diggle et al. 2002, Ogungbenro & Aarons 2010); examples include epileptic seizure counts (Thall & Vail 1990) and swollen joint counts in rheumatoid arthritis patients (Tilley et al. 1995). There has been some development in sample size calculation for count outcomes in two-arm trials. For example, Patel & Rowe (1999) provided sample size formulas for comparing the rates of change in repeated count outcomes using GEE, incorporating general correlation structures; however, their approach does not account for missing data. Recently, a more general sample size formula was developed by Lou et al. (in press), which accommodates arbitrary types of missing data patterns. In this study, we investigate sample size calculation for the comparison of slopes in multi-arm clinical trials with repeated measurements of count outcomes.

The proposed sample size calculation approach is based on an approximately normal test statistic under the GEE approach. It is realistic in its ability to accommodate arbitrary types of missing data patterns and correlation structures, inherent for repeated measurements of outcomes in clinical trials. We assess the performance of the sample size formula under various correlation structures, missing data patterns, and observation probabilities through simulation studies. Finally, we illustrate the sample size approach using a real clinical trial example with epileptic patients.

2. Generalized Estimating Equation

A total of n subjects are recruited and randomly assigned to one of K treatment groups. Suppose each subject is scheduled to be measured at J time points t1 < ··· < tJ. Let nk denote the number of subjects assigned to treatment group k, with $\sum_{k=1}^{K} n_k = n$. Then rk = nk/n is the proportion of subjects assigned to the kth treatment. Let ykij be the count outcome observed from the ith subject of the kth treatment group at time tj. Defining μkij = E(ykij), we model ykij by a Poisson model,

$$f(y_{kij}) = \frac{e^{-\mu_{kij}}\,\mu_{kij}^{\,y_{kij}}}{y_{kij}!}. \tag{1}$$

Employing a log link function g(μ) = log(μ), we have

$$g(\mu_{kij}) = \log(\mu_{kij}) = a_k + b_k t_j. \tag{2}$$

Coefficients ak and bk are the group-specific intercept and slope parameters. Hence the first moment of ykij is modeled as $E(y_{kij}) = \mu_{kij}(a_k, b_k) = g^{-1}(a_k + b_k t_j) = e^{a_k + b_k t_j}$.

As for the second moment, recall that under a Poisson model, Var(ykij) = μkij. Furthermore, we use Corr(ykij, ykij′) = ρjj′ (with ρjj = 1) to denote the within-subject correlation. We assume the observations to be independent across subjects.
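As a concrete check of these first two moments, a minimal numeric sketch (the parameter values below are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Illustrative design: J = 6 equally spaced visits on [0, 1],
# intercept a_k = 0 and slope b_k = 0.25 (assumed values for the sketch).
t = np.linspace(0.0, 1.0, 6)
a_k, b_k = 0.0, 0.25

mu = np.exp(a_k + b_k * t)   # E(y_kij) = exp(a_k + b_k * t_j), the log-link mean
var = mu                     # under the Poisson model, Var(y_kij) = mu_kij
```

Note that the slope enters the variance as well as the mean; this is the feature that later makes the intercepts matter for the sample size.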

The GEE estimators of parameters b = (a1, b1, · · · , aK, bK), denoted as b^, can be solved from Un(b) = 0, where Un(b) contains the score functions

$$U_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\{y_{1ij} - \mu_{1ij}(b)\} \\ \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\{y_{1ij} - \mu_{1ij}(b)\}\, t_j \\ \vdots \\ \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\{y_{Kij} - \mu_{Kij}(b)\} \\ \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\{y_{Kij} - \mu_{Kij}(b)\}\, t_j \end{pmatrix}. \tag{3}$$

Note that in (3) the score functions are obtained using an independent working correlation structure, which greatly simplifies the derivation of the sample size estimate. When the true correlation is unknown, which is usually the case in practice, an independent working correlation has been used because the parameter estimates remain consistent and the computation is usually more stable and efficient (Liang & Zeger 1986, Crowder 1995, McDonald 1993).

We solve Un(b) = 0 using the Newton-Raphson algorithm. Specifically, at the lth iteration,

$$\hat b^{(l)} = \hat b^{(l-1)} + A_n^{-1}\big(\hat b^{(l-1)}\big)\, U_n\big(\hat b^{(l-1)}\big),$$

where

$$A_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\mu_{1ij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\mu_{Kij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix}.$$

Note that An(b) is a block-diagonal matrix. By Liang & Zeger (1986), $\sqrt{n}(\hat b - b) \to N(0, V)$ in distribution as n → ∞. The covariance matrix V can be consistently estimated by $V_n = W A_n^{-1}(\hat b)\,\Sigma_n\,A_n^{-1}(\hat b)\,W$, where

$$\Sigma_n = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\Big\{\sum_{j=1}^{J}\hat\epsilon_{1ij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\Big\{\sum_{j=1}^{J}\hat\epsilon_{Kij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} \end{pmatrix},$$

W is a diagonal matrix with diagonal elements $(1/\sqrt{r_1}, 1/\sqrt{r_1}, \ldots, 1/\sqrt{r_K}, 1/\sqrt{r_K})$, $\hat\epsilon_{kij} = y_{kij} - \mu_{kij}(\hat b)$ denotes the residual, and $c^{\otimes 2} = c c^T$ for a vector c.
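Because of the working-independence structure, the score equations separate by group, so $\hat b$ can be obtained one group at a time. A sketch of the Newton iteration for a single group (the function name and array layout are ours):

```python
import numpy as np

def fit_group(y, t, iters=25):
    """Newton-Raphson for one group's (a_k, b_k) under working independence.

    y: (n_k, J) array of counts; t: length-J vector of measurement times.
    """
    X = np.column_stack([np.ones(len(t)), t])     # rows are (1, t_j)
    beta = np.zeros(2)                            # start at (a_k, b_k) = (0, 0)
    for _ in range(iters):
        mu = np.exp(X @ beta)                     # mu_kj, common to all subjects
        score = X.T @ (y - mu).sum(axis=0)        # sum_i sum_j (y_kij - mu_kj)(1, t_j)'
        hess = y.shape[0] * (X.T * mu) @ X        # sum_i sum_j mu_kj (1, t_j)'(1, t_j)
        beta = beta + np.linalg.solve(hess, score)
    return beta
```

With a large group this recovers the true (a_k, b_k); the sandwich pieces A_n and Σ_n can then be assembled from `mu` and the residuals `y - mu`.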

To compare the slope parameters among K groups, the hypotheses of interest are H0 : b1 = ··· = bK, versus H1 : b1 = θ1,··· ,bK = θK (at least one θk is different from the others). We can construct a test statistic

$$Z = \frac{C\hat b}{\sqrt{\mathrm{Var}(C\hat b)}}, \tag{4}$$

where C is a vector defining a contrast of the slope parameters. For example, without loss of generality, let k = 1 denote the control arm and k = 2, · · · , K denote the experimental arms. We can specify $C = \big(0, 1, 0, -\tfrac{1}{K-1}, \ldots, 0, -\tfrac{1}{K-1}\big)$. Note that the elements corresponding to the intercepts (ak) are set to 0. We reject the null hypothesis if |Z| > z1−α/2, where z1−α/2 is the 100(1 − α/2)th percentile of the standard normal distribution.
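The contrast that pits the control slope against the average of the experimental slopes can be built mechanically for any K (a small sketch; the function name is ours):

```python
import numpy as np

def slope_contrast(K):
    """C = (0, 1, 0, -1/(K-1), ..., 0, -1/(K-1)) for b = (a_1, b_1, ..., a_K, b_K)."""
    C = np.zeros(2 * K)
    C[1] = 1.0                    # slope of the control arm (k = 1)
    C[3::2] = -1.0 / (K - 1)      # slopes of the K - 1 experimental arms
    return C                      # intercept positions stay 0
```

For K = 4 this gives (0, 1, 0, −1/3, 0, −1/3, 0, −1/3), so Cθ is the control slope minus the mean experimental slope.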

3. Sample Size Calculation

Let A and Σ be the limits of An and Σn as n → ∞, respectively. Then the limit of Vn is V = WA−1ΣA−1W. Based on test statistic (4), given type I error α and power 1 − γ, the sample size needed to reject H0 : b1 = ··· = bK under the truth b1 = θ1,··· ,bK = θK is

$$n = \frac{(z_{1-\alpha/2} + z_{1-\gamma})^2\, C V C^T}{(C\theta)^2}, \tag{5}$$

where θ = (ϑ1, θ1, ··· , ϑK, θK). Here ϑk denotes the true value of ak and θk the true value of bk. It is noteworthy that although we are only interested in testing hypotheses about the slopes bk (k = 1,··· ,K), for count outcomes with a Poisson distribution the intercept parameters ak (k = 1,··· ,K) still affect the test statistic and sample size through $\mu_{kij} = e^{a_k + b_k t_j}$, which is also the variance of ykij.

In clinical trials, researchers frequently encounter missing data for various reasons. In the following, we present a generalized derivation of V to accommodate the presence of missing data under the MCAR (missing completely at random) assumption. Specifically, we introduce a missing indicator Δkij, which takes value 0/1 for missed/observed measurements. Then An and Σn with missing data can be expressed as

$$A_n(b) = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\sum_{j=1}^{J}\Delta_{1ij}\,\mu_{1ij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\sum_{j=1}^{J}\Delta_{Kij}\,\mu_{Kij}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix}$$

and

$$\Sigma_n = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1}\Big\{\sum_{j=1}^{J}\Delta_{1ij}\,\hat\epsilon_{1ij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} & & 0\\ & \ddots & \\ 0 & & \frac{1}{n_K}\sum_{i=1}^{n_K}\Big\{\sum_{j=1}^{J}\Delta_{Kij}\,\hat\epsilon_{Kij}\begin{pmatrix}1\\ t_j\end{pmatrix}\Big\}^{\otimes 2} \end{pmatrix}.$$

Let δj = E(Δkij) be the marginal probability of obtaining an observation at time tj and δjj′ = E(ΔkijΔkij′) be the joint probability of a subject having observations at both tj and tj′. Note that δjj = δj. The above specification effectively assumes that the probabilities of missing data depend on time only. By specifying δj and δjj′, this approach allows us to accommodate arbitrary types of missing patterns. Utilizing the fact that μkij = μkj (the mean does not depend on subject i), it can be shown that An(b) and Σn converge to

$$A(b) = \begin{pmatrix} \sum_{j=1}^{J}\delta_j\,\mu_{1j}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \sum_{j=1}^{J}\delta_j\,\mu_{Kj}\begin{pmatrix}1 & t_j\\ t_j & t_j^2\end{pmatrix} \end{pmatrix},$$

and

$$\Sigma = \begin{pmatrix} \sum_{j=1}^{J}\sum_{j'=1}^{J}\delta_{jj'}\,\rho_{jj'}\sqrt{\mu_{1j}\,\mu_{1j'}}\begin{pmatrix}1 & t_{j'}\\ t_j & t_j t_{j'}\end{pmatrix} & & 0\\ & \ddots & \\ 0 & & \sum_{j=1}^{J}\sum_{j'=1}^{J}\delta_{jj'}\,\rho_{jj'}\sqrt{\mu_{Kj}\,\mu_{Kj'}}\begin{pmatrix}1 & t_{j'}\\ t_j & t_j t_{j'}\end{pmatrix} \end{pmatrix},$$

respectively. By plugging A(b) and Σ into (5), the required sample size can be calculated with $C = \big(0, 1, 0, -\tfrac{1}{K-1}, \ldots, 0, -\tfrac{1}{K-1}\big)$ and $W = \mathrm{diag}\big(1/\sqrt{r_1}, 1/\sqrt{r_1}, \ldots, 1/\sqrt{r_K}, 1/\sqrt{r_K}\big)$.

In summary, to assess the sample size requirement for comparing the slope parameters of a count outcome among K arms, besides power 1 − γ and type I error α, the information needed at the design stage includes: the measurement schedule characterized by (t1,··· ,tJ), the true correlation structure by ρjj′, the missing data pattern by δj and δjj′, the randomization ratios rk, and the true values of the parameters θ (both intercepts and slopes).
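Formula (5) can be coded directly from these ingredients. A sketch for a balanced design (function and argument names are ours; `statistics.NormalDist` supplies the normal quantiles, the block-diagonal structure of A and Σ is exploited so V is never formed in full, and rounding the per-group size up to an integer is our choice):

```python
import math
import numpy as np
from statistics import NormalDist

def n_per_group(a, b, t, delta, delta_joint, rho, alpha=0.05, power=0.8):
    """Per-group sample size for H0: b_1 = ... = b_K, balanced design.

    a, b: length-K true intercepts and slopes; t: length-J times;
    delta: length-J marginal observation probabilities;
    delta_joint, rho: J x J joint observation probabilities and correlations.
    """
    K, t = len(a), np.asarray(t, float)
    V = np.column_stack([np.ones(len(t)), t])            # rows v_j = (1, t_j)
    slope_var = []
    for k in range(K):
        mu = np.exp(a[k] + b[k] * t)                     # mu_kj under H1
        s = np.sqrt(mu)
        A = V.T @ np.diag(delta * mu) @ V                # sum_j delta_j mu_kj v_j v_j'
        Sig = V.T @ (delta_joint * rho * np.outer(s, s)) @ V
        M = np.linalg.inv(A) @ Sig @ np.linalg.inv(A)
        slope_var.append(K * M[1, 1])                    # W contributes 1/r_k = K
    C_slope = np.r_[1.0, np.full(K - 1, -1.0 / (K - 1))] # contrast on slopes only
    CVC = np.sum(C_slope**2 * np.array(slope_var))
    Ctheta = b[0] - np.mean(b[1:])
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z**2 * CVC / Ctheta**2 / K)
```

For the Table 1 configuration with no missing data (δ1), CS correlation with ρ = 0.3, intercepts 0, and slopes (0, 0.25, 0.25, 0.25), this sketch returns 163 per group, matching the table.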

4. Numerical Studies

To investigate the performance of the proposed sample size formula, we conduct extensive simulations under different parameter settings. Suppose that K = 4 treatment groups are compared under a balanced design with J = 6 equidistant measurements obtained at times tj = (j − 1)/(J − 1), j = 1, · · · , 6. We explore two missing patterns: independent missing (IM) and monotone missing (MM). Independent missing means that the occurrence of a missing value at time tj is independent of the occurrence at any other time tj′; that is, δjj′ = δjδj′. Monotone missing means that if a subject misses a clinical visit at time tj, then he/she will miss all the remaining visits; under MM, we have δjj′ = δj′ for j′ > j. Note that IM and MM lead to different joint probabilities δjj′. For the marginal probabilities δ = (δ1,··· ,δJ)′, we investigate four scenarios:

δ1 = (1, 1, 1, 1, 1, 1), δ2 = (1, 0.95, 0.9, 0.85, 0.8, 0.75), δ3 = (1, 0.99, 0.96, 0.91, 0.84, 0.75), δ4 = (1, 0.91, 0.84, 0.79, 0.76, 0.75).

All scenarios assume complete observations at t1. δ1 corresponds to the scenario of no missing data across the study period. Under δ2, δ3, and δ4, the missing probabilities follow different trends but share an equal proportion (25%) of missing values at the end of the study. Specifically, δ2 represents a linear trend. Under δ3 there are few missing values initially, but the proportion of missing values increases rapidly toward the end of the study. Under δ4 the trend is the opposite of δ3.
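The two missing patterns determine the joint probabilities δjj′ from the marginals. A small sketch (the function name is ours; the MM branch assumes nonincreasing δ, as in all four scenarios above):

```python
import numpy as np

def joint_obs_probs(delta, pattern):
    """delta_{jj'} = E(Delta_kij * Delta_kij') under IM or MM missing."""
    d = np.asarray(delta, float)
    if pattern == "IM":                 # missingness independent across visits
        D = np.outer(d, d)
    elif pattern == "MM":               # monotone: observed later => observed earlier
        D = np.minimum.outer(d, d)      # delta_{jj'} = delta at the later visit
    else:
        raise ValueError(pattern)
    np.fill_diagonal(D, d)              # delta_{jj} = delta_j
    return D
```

The resulting J × J matrix plugs directly into the Σ limit in Section 3.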

We consider two structures for the within-subject correlation: compound symmetric (CS) with ρjj′ = ρ for j ≠ j′, and AR(1) with $\rho_{jj'} = \rho^{|t_j - t_{j'}|}$. We explore two values for the correlation parameter ρ: 0.3 and 0.5. The nominal type I error and power are set at α = 0.05 and 1 − γ = 0.8, respectively. We assume b1 = · · · = b4 = 0 under the null hypothesis. We consider two types of alternative hypotheses for clinical trials with K treatments. One assumes the first group receives the control treatment and the remaining K − 1 groups receive experimental treatments with similar efficacy; hence the true slopes are specified as θ1 = 0, θ2 = θ3 = θ4 = 0.25. The other assumes the K − 1 experimental groups are ordered with respect to treatment effect; hence we set θ1 = 0, θ2 = 0.12, θ3 = 0.24, θ4 = 0.36. Finally, the intercept parameters are specified as a1 = · · · = a4 = 0. For each combination of design parameters, we conduct the simulation as follows:

Step 1: Calculate sample size n according to formula (5).

Step 2: Generate random samples of size n under H0 and H1, respectively. Correlated count outcomes are generated using R-package corcounts (Erhardt & Czado 2009).

Step 3: Generate missing indicators according to specified missing pattern and marginal observation probabilities.

Step 4: Calculate b^, An, and Σn, and obtain test statistic Z. Reject the null hypothesis if |Z| > z1−α/2.

Step 5: Repeat Step 2 to Step 4 for L = 5000 times. The empirical type I error and power are calculated as the proportions of rejections among the 5000 repetitions under the null and alternative hypotheses, respectively.
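Step 2 relies on the R package corcounts; a rough Gaussian-copula stand-in in Python is sketched below. It reproduces the Poisson marginals exactly, but the induced Pearson correlation only approximates the target matrix R, so this is a sketch rather than a replacement for the generator used in the paper (function names are ours):

```python
import math
import numpy as np

def poisson_ppf(u, lam):
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(lam)."""
    k, p = 0, math.exp(-lam)
    cdf = p
    while cdf < u:
        k += 1
        p *= lam / k
        cdf += p
    return k

def correlated_counts(mu, R, n, rng):
    """n draws of a J-vector of correlated Poisson(mu_j) counts via a Gaussian copula."""
    L = np.linalg.cholesky(R)
    z = rng.standard_normal((n, len(mu))) @ L.T          # latent N(0, R) draws
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))  # Phi(z)
    return np.array([[poisson_ppf(u[i, j], mu[j]) for j in range(len(mu))]
                     for i in range(n)])
```

Missing indicators (Step 3) can then be drawn per subject and multiplied in before computing the GEE quantities of Step 4.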

Table 1 summarizes sample size estimates based on (5), as well as the empirical powers and type I errors, under a1 = · · · = a4 = 0. Results from a similar simulation, except with different intercepts (a1 = · · · = a4 = 0.1), are presented in Table 2. First, we observe that the empirical powers and type I errors are close to the nominal levels, indicating that the proposed sample size method performs well across a wide range of design configurations. Sample sizes under the AR(1) correlation structure are always larger than those under the CS correlation structure given the same ρ. Furthermore, given the same marginal observation probabilities, the monotone missing pattern leads to a larger sample size requirement than the independent missing pattern. Finally, comparison of Table 1 and Table 2 suggests that sample sizes differ under different levels of the intercepts a1–a4, despite the same treatment effects represented by the slopes b1–b4. In the design of clinical trials with count outcomes, the baseline level is thus an important design input, which is generally not the case for trials with continuous outcomes.

Table 1:

Sample size per group (Empirical power, Empirical type I error) under a1 = a2 = a3 = a4 = 0.

Missing δ CS AR(1)
ρ = 0.3 ρ = 0.5 ρ = 0.3 ρ = 0.5
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 163(0.812,0.059) 117(0.811,0.061) 245(0.801,0.056) 175(0.805,0.055)
δ2 198(0.813,0.046) 151(0.802,0.058) 280(0.806,0.053) 209(0.807,0.057)
δ3 194(0.800,0.052) 148(0.809,0.050) 277(0.810,0.056) 206(0.800,0.057)
δ4 202(0.807,0.060) 155(0.799,0.056) 282(0.807,0.051) 212(0.809,0.052)
MM δ1 163(0.823,0.054) 117(0.803,0.064) 245(0.812,0.058) 175(0.809,0.056)
δ2 204(0.817,0.054) 162(0.800,0.059) 299(0.806,0.053) 230(0.810,0.048)
δ3 201(0.801,0.053) 158(0.805,0.058) 294(0.806,0.054) 225(0.802,0.050)
δ4 208(0.814,0.053) 166(0.798,0.059) 305(0.805,0.049) 235(0.802,0.051)
One control vs. three ordered treatments b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 176(0.807,0.061) 126(0.815,0.060) 264(0.808,0.057) 189(0.803,0.048)
δ2 214(0.799,0.054) 163(0.796,0.061) 302(0.805,0.045) 226(0.801,0.051)
δ3 210(0.809,0.056) 159(0.807,0.055) 299(0.805,0.049) 223(0.807,0.051)
δ4 218(0.802,0.056) 168(0.799,0.058) 304(0.804,0.046) 229(0.800,0.054)
MM δ1 176(0.806,0.065) 126(0.797,0.056) 264(0.809,0.060) 189(0.814,0.054)
δ2 221(0.801,0.059) 175(0.794,0.049) 323(0.807,0.053) 248(0.804,0.051)
δ3 216(0.800,0.061) 171(0.801,0.058) 317(0.800,0.052) 242(0.795,0.055)
δ4 225(0.801,0.053) 179(0.797,0.055) 329(0.801,0.052) 254(0.799,0.053)

Table 2:

Sample size per group (Empirical power, Empirical type I error) under a1 = a2 = a3 = a4 = 0.1.

Missing δ CS AR(1)
ρ = 0.3 ρ = 0.5 ρ = 0.3 ρ = 0.5
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 147(0.809,0.063) 106(0.806,0.066) 221(0.815,0.048) 158(0.802,0.054)
δ2 179(0.811,0.059) 137(0.798,0.058) 253(0.803,0.053) 189(0.802,0.050)
δ3 176(0.807,0.055) 134(0.810,0.052) 251(0.803,0.055) 187(0.798,0.057)
δ4 183(0.822,0.057) 140(0.810,0.056) 255(0.808,0.053) 192(0.803,0.059)
MM δ1 147(0.808,0.061) 106(0.802,0.061) 221(0.814,0.048) 158(0.807,0.060)
δ2 185(0.809,0.057) 147(0.804,0.054) 271(0.814,0.052) 208(0.807,0.051)
δ3 182(0.811,0.063) 143(0.809,0.058) 266(0.804,0.054) 203(0.802,0.053)
δ4 189(0.817,0.056) 150(0.809,0.059) 276(0.806,0.049) 213(0.806,0.051)
One control vs. three ordered treatments b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 159(0.807,0.055) 114(0.796,0.062) 239(0.809,0.057) 171(0.806,0.057)
δ2 193(0.811,0.055) 148(0.800,0.056) 273(0.793,0.053) 204(0.805,0.060)
δ3 190(0.803,0.054) 144(0.811,0.060) 270(0.808,0.053) 201(0.811,0.053)
δ4 197(0.807,0.060) 152(0.801,0.055) 275(0.803,0.055) 207(0.810,0.052)
MM δ1 159(0.805,0.064) 114(0.799,0.057) 239(0.792,0.052) 171(0.807,0.054)
δ2 200(0.802,0.057) 158(0.797,0.057) 292(0.796,0.057) 224(0.813,0.053)
δ3 196(0.803,0.057) 154(0.793,0.057) 287(0.808,0.052) 219(0.808,0.057)
δ4 203(0.808,0.054) 162(0.794,0.061) 297(0.801,0.055) 230(0.802,0.054)

Examining sample sizes under different values of δ in Tables 1 and 2, we conclude that the proposed sample size method can appropriately account for missing data. Let n0 be the required sample size under no missing data (δ1) and q be the dropout rate at the end of the study. Traditionally, to account for missing data, sample sizes have been computed as n0/(1 − q), which is conservative because it ignores the information contributed by subjects with partial observations. Although δ2–δ4 represent three different missing scenarios with the same dropout rate at the end of the study (q = 0.25), the sample sizes required under δ2–δ4 are always smaller than n0/(1 − q). For example, in Table 1, the sample sizes per group under δ2–δ4 are 198, 194, and 202, respectively, under three equally effective experimental treatments, independent missing, ρ = 0.3, and the CS correlation structure. The traditional adjustment for missing data, however, produces a sample size of 163/0.75 ≈ 217, which is 7.4% to 11.9% larger than actually needed according to the proposed method.

In Table 3 we further list the sample size requirements under a wide range of values of the correlation parameter ρ. It shows that, with the other design parameters held the same, a stronger correlation is associated with a smaller sample size when comparing the slopes of a count outcome among K ≥ 3 groups. A similar relationship has been observed for continuous outcomes by Jung & Ahn (2004).

Table 3:

Sample size per group for different values of ρ under a1 = a2 = a3 = a4 = 0.

Missing δ CS AR(1)
ρ = 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9
One control vs. three similar treatments: b1 = 0,b2 = b3 = b4 = 0.25
IM δ1 209 163 117 70 24 310 245 175 105 35
δ2 245 198 151 105 58 345 280 209 139 69
δ3 241 194 148 101 54 343 277 206 135 66
δ4 249 202 155 109 62 347 282 212 142 73
MM δ1 209 163 117 70 24 310 245 175 105 35
δ2 247 204 162 120 77 362 299 230 160 90
δ3 243 201 158 116 73 357 294 225 155 86
δ4 251 208 166 124 81 367 305 235 165 95
Four ordered treatments: b1 = 0,b2 = 0.12,b3 = 0.24,b4 = 0.36
IM δ1 226 176 126 76 26 334 264 189 113 39
δ2 264 214 163 113 63 372 302 226 150 75
δ3 260 210 159 109 59 370 299 223 146 71
δ4 268 218 168 117 67 374 304 229 154 79
MM δ1 226 176 126 76 26 334 264 189 113 39
δ2 266 221 175 129 83 390 323 248 173 98
δ3 262 216 171 125 79 385 317 242 168 93
δ4 270 225 179 134 88 396 329 254 178 103

5. Example

In a randomized epilepsy clinical trial, epileptic patients will be randomly assigned to one of four medications: placebo, tegretol, felbatol, and lamictal. The number of epileptic seizures will be recorded at baseline and at four consecutive two-week intervals for each patient. Hence we have J = 5, and the measurement times are coded as (t1,··· ,t5) = (0, 0.25, 0.5, 0.75, 1).

An investigator wants to design a study that compares the three medications with placebo. The null hypothesis states that there is no difference in the rate of change in the number of seizures over the 8-week treatment period among the 4 groups. A previous study (Diggle et al. 2002) shows that the number of seizures fluctuates around 8 over the five measurement time points in the placebo group, based on which we specify the intercept parameters as ak = log(8) = 2.08 (k = 1,· · · ,4) and the slope parameter in the placebo group as b1 = 0. Suppose a clinically meaningful difference in the number of seizures between placebo and the other medication groups after the 8-week treatment period is 4. The true slope parameter can then be obtained by solving exp(ak + bktJ) = 8 − 4 = 4. Plugging in ak = 2.08 and tJ = 1, we have bk = −0.693 for k = 2, 3, 4. The hypotheses of interest are then H0 : b1 = · · · = b4 = 0 versus H1 : b1 = 0, b2 = b3 = b4 = −0.693. We assume the marginal observation probabilities to be δ = (1, 0.95, 0.9, 0.85, 0.8), which implies a linear trend with a dropout rate of 20% at the end of the study, and a correlation parameter of ρ = 0.5. We calculate the sample size requirement under a balanced design to achieve 80% power at the 5% two-sided type I error level. Under the CS correlation structure with missing patterns IM and MM, the required sample sizes are 50 and 53 per group, respectively. Under AR(1) with missing patterns IM and MM, the required sample sizes are 63 and 67 per group, respectively.
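The slope used in this example follows directly from the log link; a quick arithmetic check with the values from the trial description:

```python
import math

a = math.log(8)      # baseline level: about 8 seizures per interval, so a_k = log(8)
b = math.log(4) - a  # solve exp(a + b * t_J) = 4 with t_J = 1
# b = -log(2), i.e. approximately -0.693, the common slope of the three
# active arms under H1
```

The count at the end of the study is exp(a + b) = 4, the clinically meaningful target.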

6. Discussion

We have presented a sample size formula for comparing the rates of change in a K-group (K ≥ 3) study based on repeated count outcomes, which allows incorporation of general correlation structures, missing data patterns, and unbalanced experimental designs. We employed the GEE method for sample size estimation, which has been widely used for the analysis of repeated measurement studies owing to its robustness to randomly missing data and to misspecification of the true correlation structure. Simulation studies show that the empirical type I errors and powers are close to their nominal levels under various correlation structures and missing data patterns. We demonstrate that the sample size decreases as ρ increases in studies comparing the rates of change among K (≥ 3) groups, as was also observed for repeated continuous outcomes (Jung & Ahn 2004). The actual information loss caused by missing data depends on various design factors, such as the correlation structure, the missing pattern, and the trend in the marginal observation probabilities. The proposed sample size method appropriately takes such factors into consideration, and the resulting sample size is much smaller than that under the traditional adjustment for missing data.

In practice the true correlation structure is usually unknown at the design stage. We use an independent working correlation structure because it not only simplifies the derivation but also improves the applicability of the proposed sample size approach to clinical trials where prior knowledge about the correlation is limited. When the true correlation structure is known, using an independent working correlation leads to a loss in efficiency; in this case the proposed approach provides a conservative sample size relative to that calculated under a correctly specified working correlation.

The proposed sample size approach is based on a Poisson model (1), which implies a marginal variance Var(ykij) = μkij. In order to accommodate under- or over-dispersion, a different model (such as a negative binomial model) needs to be employed, which will be the topic of our future research.

The software to estimate the sample size for comparing rates of change in K-group repeated count outcomes can be obtained from http://faculty.smu.edu/jcao/Rcode-KgroupCount-Slope.zip

Acknowledgment

The work was supported in part by NIH grant 1UL1TR001105, AHRQ grant R24HS22418, NSF grant IIS-1302497–04, and CPRIT grants RP110562-C1 and RP120670-C1.

Contributor Information

Ying Lou, Department of Statistical Science, Southern Methodist University, Dallas, TX.

Jing Cao, Department of Statistical Science, Southern Methodist University, Dallas, TX.

Chul Ahn, Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX.

References

  1. Ahn C, Heo M & Zhang S (2015), Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical Research, Chapman & Hall, New York.
  2. Crowder M (1995), 'On the use of a working correlation matrix in using generalised linear models for repeated measures', Biometrika 82, 407–410.
  3. Diggle P, Heagerty P, Liang K & Zeger S (2002), Analysis of Longitudinal Data (2nd ed.), Oxford University Press.
  4. Erhardt V & Czado C (2009), 'A method for approximately sampling high-dimensional count variables with prespecified Pearson correlation', Technical Report.
  5. Freidlin B, Korn E, Gray R & Martin A (2008), 'Multi-arm clinical trials of new agents: some design considerations', Clinical Cancer Research 14, 4368–4371.
  6. Jung S & Ahn C (2003), 'Sample size estimation for GEE method for comparing slopes in repeated measurements data', Statistics in Medicine 22(8), 1305–1315.
  7. Jung S & Ahn C (2004), 'K-sample test and sample size calculation for comparing slopes in data with repeated measurements', Biometrical Journal 46(5), 554–564.
  8. Jung S & Ahn C (2005), 'Sample size for repeated binary measurements using GEE', Statistics in Medicine 24, 2583–2596.
  9. Liang K & Zeger S (1986), 'Longitudinal data analysis using generalized linear models', Biometrika 73(1), 13–22.
  10. Liu P & Dahlberg S (1995), 'Design and analysis of multiarm clinical trials with survival endpoints', Controlled Clinical Trials 16, 119–130.
  11. Lou Y, Cao J, Zhang S & Ahn C (in press), 'Sample size estimation for a two-group comparison of repeated count outcomes using GEE', Communications in Statistics - Theory and Methods.
  12. McDonald B (1993), 'Estimating logistic regression parameters for bivariate binary data', Journal of the Royal Statistical Society, Series B 55, 391–397.
  13. Ogungbenro K & Aarons L (2010), 'Sample size/power calculations for population pharmacodynamic experiments involving repeated-count measurements', Journal of Biopharmaceutical Statistics 20(5), 1026–1042.
  14. Patel H & Rowe E (1999), 'Sample size for comparing linear growth curves', Journal of Biopharmaceutical Statistics 9(2), 339–350.
  15. Thall PF & Vail SC (1990), 'Some covariance models for longitudinal count data with overdispersion', Biometrics 46, 657–671.
  16. Tilley B, Alarcon G, Heyse S, Trentham D, Neuner R, Kaplan D, Clegg D, Leisen J, Buckley L, Cooper S, Duncan H, Pillemer S, Tuttleman M & Fowler S (1995), 'Minocycline in rheumatoid arthritis: a 48-week, double-blind, placebo-controlled trial', Annals of Internal Medicine 122(2), 81–89.
  17. Zhang S & Ahn C (2010), 'Effects of correlation and missing data on sample size estimation in longitudinal clinical trials', Pharmaceutical Statistics 9(1), 2–9.
  18. Zhang S & Ahn C (2013), 'Sample size calculation for comparing time-averaged responses in K-group repeated measurement studies', Computational Statistics and Data Analysis 58, 283–291.
