Author manuscript; available in PMC: 2018 Dec 1.
Published in final edited form as: Stat Methods Med Res. 2015 Aug 24;26(6):2543–2551. doi: 10.1177/0962280215601137

Sample Size Considerations for Split-Mouth Design

Hong Zhu 1, Song Zhang 1, Chul Ahn 1
PMCID: PMC5573650  NIHMSID: NIHMS894714  PMID: 26303156

Abstract

Split-mouth designs are frequently used in dental clinical research, where a mouth is divided into two or more experimental segments that are randomly assigned to different treatments. This design has the distinct advantage of removing much of the inter-subject variability from the estimated treatment effect. Methods of statistical analysis for the split-mouth design have been well developed. However, little work is available on sample size considerations at the design phase of a split-mouth trial, although many researchers have pointed out that the split-mouth design can only be more efficient than a parallel-group design when the within-subject correlation coefficient is substantial. In this paper, we propose to use the generalized estimating equation (GEE) approach to assess the treatment effect in split-mouth trials, accounting for correlations among observations. Closed-form sample size formulas are introduced for the split-mouth design with continuous and binary outcomes, assuming exchangeable and “nested exchangeable” correlation structures for outcomes from the same subject. The statistical inference is based on the large sample approximation under the GEE approach. Simulation studies are conducted to investigate the finite-sample performance of the GEE sample size formulas. A dental clinical trial example is presented for illustration.

Keywords: continuous and binary outcomes, dental clinical trial, GEE, sample size, split-mouth

1. Introduction

In dental clinical trials, clinicians have the option of randomizing treatments over individuals (mouth level) or over segments in the mouth (segment level) within each individual. In the former case, all segments/sites of an individual receive the same treatment, which is called the parallel-group design. In the latter case, the split-mouth design is adopted, where the mouth is divided into two or more experimental segments that are randomly assigned to different treatments. For example, Morrow et al.1 enrolled 23 patients in a split-mouth trial for the treatment of gingivitis. Four sites located on the left or right side (segment) of a patient’s mouth were randomly assigned to either the experimental treatment (chlorhexidine) or control (saline). The split-mouth design has the advantage of removing much of the inter-subject variability from the estimated treatment effect, and potentially requires fewer subjects than a parallel-group trial with the same power. Statistical methods for the analysis of quantitative outcomes arising from the split-mouth design have been developed, as reviewed in Lesaffre et al.2 Mixed-effect ANOVA models, generalized mixed-effect models, and the generalized estimating equation (GEE) technique3 can be used to test or estimate the treatment effect, adjusting for the clustering of data. In particular, Donner et al.4 proposed an adjusted chi-square test for analyzing binary data in a split-mouth design, and Donner and Zou5 discussed the problem using the GEE method to further correct for imbalance in baseline measurements.
Split-mouth trials are not recommended when contamination between sites is suspected and when finding matching sites is impossible in subjects.6 In addition, some researchers have pointed out that the split-mouth design can only be more efficient than the parallel-group design when the within-patient correlation coefficient is substantial, where efficiency is determined in terms of the number of measurements needed.7 However, relatively little attention has been paid in the literature to sample size considerations for the split-mouth design, and a rigorous comparison with the parallel-group design is not available. This paper intends to fill this gap by developing closed-form sample size formulas to help clinicians design split-mouth studies effectively.

The GEE method has been widely used to model repeated measurement data or clustered data because of its robustness against mis-specification of the true correlation structure and its ability to accommodate missing data. Sample size calculation based on the GEE method has been explored by many researchers in the literature. Liu and Liang8 developed a sample size formula based on a generalized score test. Jung and Ahn9,10 proposed sample size approaches to comparing rates of change for repeated continuous and binary measurements between two treatment groups. Zhang and Ahn11 and Lou et al.12 investigated sample size calculation for time-averaged differences for continuous and binary outcomes from repeated measurement studies, in the presence of missing data. In this article, we derive closed-form sample size formulas based on the GEE approach, which properly account for correlations among observations in the split-mouth design, for both continuous and binary outcomes. We also explore the connection between the sample size requirement for the split-mouth design and that for the parallel-group design, and demonstrate how correlations among observations can influence their relative efficiency. There are other application areas in health sciences, such as dermatology and certain animal studies,13,14 which also utilize the split-mouth design, or more accurately, the split-cluster design. Nevertheless, we adopt the terminology of the split-mouth design here, and the proposed methods can be directly applied to the split-cluster design when naturally occurring clusters such as multiple sites or organs in the same subject are assigned to different treatments.

The remainder of the article is arranged as follows. In Section 2, we present the data structure and notations related to the split-mouth design, and propose a GEE sample size approach based on a marginal linear regression model for continuous outcomes. In Section 3, a GEE sample size approach based on a marginal logistic regression model for binary outcomes from the split-mouth design is presented. Closed-form sample size formulas are derived based on the large sample approximation under the GEE approach. Section 4 describes simulation studies conducted to investigate performances of the GEE sample size formulas for continuous and binary outcomes with exchangeable and “nested exchangeable”15 correlation structures. It shows that power levels as predicted by the GEE sample size formula generally agree well with the empirical power. In Section 5, the proposed sample size method is illustrated with a dental clinical trial example. Some recommendations at the design stage of a split-mouth trial and potential extensions of the proposed methods are discussed in Section 6.

2. Statistical model and sample size approach for continuous outcomes

Consider a split-mouth design involving two treatments. Each subject’s mouth is divided into two segments, and segments of $m_{1j}$ and $m_{2j}$ sites within the $j$th subject are randomly assigned to receive either the experimental or the control treatment, where $m_{1j} + m_{2j} = m_j$ and $j = 1,\dots,n$. Let $Y_{ijl}$ denote the continuous outcome on site $l$ of subject $j$ under treatment $i$, for $i = 1,2$, $j = 1,\dots,n$ and $l = 1,\dots,m_{ij}$. We further assume that there is a common correlation $\rho$ among outcomes $(Y_{ijl}, Y_{i'jl'})$ within the same subject, where $i \neq i'$ or $l \neq l'$. Observed outcomes from different subjects are assumed to be independent. Let $r_{1j} = 1$ and $r_{2j} = 0$ denote the experimental and control treatments, respectively.

To make an inference on the differential effect of treatment on $Y_{ijl}$, we assume a linear regression model,

$$Y_{ijl} = \beta_1 + \beta_2 r_{ij} + \varepsilon_{ijl}, \tag{1}$$

where $\beta_1$ is the intercept, $\beta_2$ quantifies the differential effect of treatment, and $\varepsilon_{ijl}$ denotes random error. It is common to assume that $E(\varepsilon_{ijl}) = 0$ and $\mathrm{Var}(\varepsilon_{ijl}) = \sigma^2$. Our primary interest is to test the null hypothesis $H_0: \beta_2 = 0$, accounting for the within-subject correlation. Let $\beta = (\beta_1, \beta_2)'$. For convenience of discussion, model (1) can be rewritten as

$$Y_j = X_j\beta + \varepsilon_j,$$

where, for subject $j$, $Y_j$ is an $m_j \times 1$ vector of outcomes, $\varepsilon_j$ is an $m_j \times 1$ vector of random errors, and $X_j$ is an $m_j \times 2$ design matrix

$$X_j = \begin{pmatrix} 1_{m_{1j}} & 1_{m_{1j}} \\ 1_{m_{2j}} & 0_{m_{2j}} \end{pmatrix},$$

and $1_m$ is an $m \times 1$ vector of 1’s and $0_m$ is an $m \times 1$ vector of 0’s.

Under the model assumption, the true correlation matrix is given by $R_0 = (1 - \rho)I_{m_{1j}+m_{2j}} + \rho J_{m_{1j}+m_{2j}}$ (exchangeable), where $I$ is an identity matrix and $J$ is a square matrix of 1’s. We use the independent working correlation structure to derive the GEE estimator $\hat\beta = (\hat\beta_1, \hat\beta_2)'$, which is obtained by solving the equation $S_n(\beta) = 0$ with

$$S_n(\beta) = n^{-1/2}\sum_{j=1}^{n} (Y_j - X_j\beta)^T X_j.$$

Liang and Zeger3 showed that $n^{1/2}(\hat\beta - \beta)$ is approximately normal with mean 0 and that its variance is consistently estimated by

$$\Sigma_n = \sigma^2 \Big(n^{-1}\sum_{j=1}^{n} X_j^T X_j\Big)^{-1} \Big(n^{-1}\sum_{j=1}^{n} X_j^T R_0 X_j\Big) \Big(n^{-1}\sum_{j=1}^{n} X_j^T X_j\Big)^{-1}.$$

The robust variance of $n^{1/2}\hat\beta_2$ is the (2,2)th element of $\Sigma_n$, denoted by $\sigma_2^2$. We reject $H_0$ if $|n^{1/2}\hat\beta_2/\sigma_2| > z_{1-\alpha/2}$, where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of a standard normal distribution. A sample size formula for the split-mouth design can be developed based on this result.

For illustration, we assume a balanced design where $m_{1j} = m_{2j} = k$. By some algebra, it can be shown that, as $n \to \infty$,

$$\Big(n^{-1}\sum_{j=1}^{n} X_j^T X_j\Big)^{-1} \to \begin{pmatrix} \frac{1}{k} & -\frac{1}{k} \\ -\frac{1}{k} & \frac{2}{k} \end{pmatrix}, \qquad n^{-1}\sum_{j=1}^{n} X_j^T R_0 X_j \to \begin{pmatrix} 2k\{1+(2k-1)\rho\} & k\{1+(2k-1)\rho\} \\ k\{1+(2k-1)\rho\} & k\{1+(k-1)\rho\} \end{pmatrix}.$$

The (2,2)th element of $\Sigma_n$ simplifies to $\sigma_2^2 = 2\sigma^2(1-\rho)/k$. Then, given type I error $\alpha$, power $1 - \gamma$, and the true value of the differential treatment effect $\beta_2$, the required total number of subjects for the split-mouth design is

$$n_s = \frac{2\sigma^2(1-\rho)(z_{1-\alpha/2} + z_{1-\gamma})^2}{k\beta_2^2}, \tag{2}$$

where $k$ is the number of sites per segment. Since each subject contributes $2k$ sites in a split-mouth trial, the total number of sites is $m_s = 2kn_s$. On the other hand, the required total number of subjects for the parallel-group design with a cluster size of $k$ is16

$$n_p = \frac{1+(k-1)\rho}{k} \times \frac{4\sigma^2(z_{1-\alpha/2} + z_{1-\gamma})^2}{\beta_2^2}.$$

In the parallel-group design the total number of sites is $m_p = kn_p$. Therefore, in terms of the number of subjects, the relative efficiency of the split-mouth design over the parallel-group design can be expressed as

$$\frac{n_p}{n_s} = \frac{2\{1+(k-1)\rho\}}{1-\rho},$$

from which we find that the relative efficiency of the split-mouth design increases with the within-subject correlation $\rho$. The relative efficiency of the split-mouth design in terms of the number of sites is

$$\frac{m_p}{m_s} = \frac{n_p}{2n_s} = \frac{1+(k-1)\rho}{1-\rho}.$$

Further, for the special case of one site per segment ($k = 1$), the relative efficiency of the split-mouth design in terms of the number of subjects becomes

$$\frac{n_p}{n_s} = \frac{2}{1-\rho},$$

which was discussed in Wang and Bakhai.17
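As a quick numerical illustration of formula (2), the parallel-group formula and the two relative-efficiency ratios, the following sketch evaluates both designs for one configuration. The function names and the example inputs ($\sigma^2 = 1$, $\rho = 0.1$, $k = 3$, $\beta_2 = 0.2$) are assumptions made for illustration only; sample sizes are returned unrounded because the text does not state a rounding rule.

```python
from statistics import NormalDist


def z_sum_sq(alpha=0.05, power=0.8):
    """(z_{1-alpha/2} + z_{1-gamma})^2, where power = 1 - gamma."""
    nd = NormalDist()
    return (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2


def n_split_mouth(sigma2, rho, k, beta2, alpha=0.05, power=0.8):
    """Formula (2): subjects for a balanced split-mouth design, exchangeable correlation."""
    return 2 * sigma2 * (1 - rho) * z_sum_sq(alpha, power) / (k * beta2 ** 2)


def n_parallel(sigma2, rho, k, beta2, alpha=0.05, power=0.8):
    """Total subjects for a parallel-group design with cluster size k."""
    return (1 + (k - 1) * rho) / k * 4 * sigma2 * z_sum_sq(alpha, power) / beta2 ** 2


# Hypothetical inputs, chosen only for illustration.
sigma2, rho, k, beta2 = 1.0, 0.1, 3, 0.2
n_s = n_split_mouth(sigma2, rho, k, beta2)
n_par = n_parallel(sigma2, rho, k, beta2)

print(f"n_s = {n_s:.1f}, n_p = {n_par:.1f}")
print(f"relative efficiency (subjects): {n_par / n_s:.3f}")
# Sites: m_p = k * n_p and m_s = 2k * n_s, so the ratio is n_p / (2 n_s).
print(f"relative efficiency (sites):    {(k * n_par) / (2 * k * n_s):.3f}")
```

The subject-level ratio printed here should equal $2\{1+(k-1)\rho\}/(1-\rho)$ for any choice of $\sigma^2$, $\beta_2$, $\alpha$ and power, since those factors cancel.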

Additionally, we can employ a more general correlation structure by assuming a common correlation $\rho$ among outcomes $(Y_{ijl}, Y_{ijl'})$, $l \neq l'$, observed in the same segment, referred to as the intra-segment correlation, while the correlation among outcomes $(Y_{1jl}, Y_{2jl'})$ observed in different segments within the same subject is given by $\rho_{12}$, referred to as the inter-segment correlation. Consequently, both the intra-segment correlation and the inter-segment correlation need to be taken into account for testing the null hypothesis $H_0: \beta_2 = 0$. Under the new model assumption, the true correlation matrix (“nested exchangeable”) is

$$R_1 = \begin{pmatrix} (1-\rho)I_{m_{1j}} + \rho J_{m_{1j}} & \rho_{12} J_{m_{1j}\times m_{2j}} \\ \rho_{12} J_{m_{2j}\times m_{1j}} & (1-\rho)I_{m_{2j}} + \rho J_{m_{2j}} \end{pmatrix}.$$

The combinations $(\rho, \rho_{12})$ for which $R_1$ is positive definite can be determined using the technique in Teerenstra et al.15 Similarly, the robust variance of $n^{1/2}\hat\beta_2$ is the (2,2)th element of

$$\Sigma_{n1} = \sigma^2 \Big(n^{-1}\sum_{j=1}^{n} X_j^T X_j\Big)^{-1} \Big(n^{-1}\sum_{j=1}^{n} X_j^T R_1 X_j\Big) \Big(n^{-1}\sum_{j=1}^{n} X_j^T X_j\Big)^{-1}.$$

For a balanced design where $m_{1j} = m_{2j} = k$, as $n \to \infty$,

$$n^{-1}\sum_{j=1}^{n} X_j^T R_1 X_j \to \begin{pmatrix} 2k\{1+(k-1)\rho+k\rho_{12}\} & k\{1+(k-1)\rho+k\rho_{12}\} \\ k\{1+(k-1)\rho+k\rho_{12}\} & k\{1+(k-1)\rho\} \end{pmatrix}$$

and the (2,2)th element of $\Sigma_{n1}$ is $\sigma_{21}^2 = 2\sigma^2\{1+(k-1)\rho-k\rho_{12}\}/k$. Thus, given type I error $\alpha$, power $1 - \gamma$, and the true value of the differential treatment effect $\beta_2$, the required total number of subjects for the split-mouth design is

$$n_s = \frac{2\sigma^2\{1+(k-1)\rho-k\rho_{12}\}(z_{1-\alpha/2} + z_{1-\gamma})^2}{k\beta_2^2}, \tag{3}$$

where k is the number of sites per segment. When the true correlation is exchangeable (ρ = ρ12), the sample size formula in (3) reduces to the one in (2).
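The matrix algebra behind formula (3) can be checked numerically. The sketch below builds the “nested exchangeable” matrix for a balanced design, tests positive definiteness through its eigenvalues (a direct numerical check, not the technique of Teerenstra et al.), and compares the (2,2) element of the sandwich product with the closed form. Function names and the parameter values are illustrative assumptions, and the first $k$ sites of each subject are assumed to be the experimental segment.

```python
import numpy as np
from statistics import NormalDist


def nested_exchangeable(k, rho, rho12):
    """Correlation matrix R1 for a balanced design with k sites per segment."""
    within = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
    between = rho12 * np.ones((k, k))
    return np.block([[within, between], [between, within]])


def sandwich_var_beta2(k, rho, rho12, sigma2):
    """(2,2) element of sigma^2 (X'X)^{-1} (X' R1 X) (X'X)^{-1}."""
    # Columns: intercept and treatment indicator (first k sites treated).
    X = np.column_stack([np.ones(2 * k), np.r_[np.ones(k), np.zeros(k)]])
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ nested_exchangeable(k, rho, rho12) @ X
    return sigma2 * (bread @ meat @ bread)[1, 1]


def n_split_mouth_nested(sigma2, rho, rho12, k, beta2, alpha=0.05, power=0.8):
    """Formula (3), returned unrounded."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    return 2 * sigma2 * (1 + (k - 1) * rho - k * rho12) * z ** 2 / (k * beta2 ** 2)


# Illustrative values only.
k, rho, rho12, sigma2 = 3, 0.1, 0.05, 0.5
R1 = nested_exchangeable(k, rho, rho12)
print("R1 positive definite:", bool(np.all(np.linalg.eigvalsh(R1) > 0)))

closed_form = 2 * sigma2 * (1 + (k - 1) * rho - k * rho12) / k
print("sandwich:", sandwich_var_beta2(k, rho, rho12, sigma2))
print("closed form:", closed_form)
print("n_s from (3):", n_split_mouth_nested(sigma2, rho, rho12, k, beta2=0.2))
```

Setting `rho12 = rho` in these functions gives the exchangeable case, so the same check covers formula (2).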

3. Statistical model and sample size approach for binary outcomes

In this section, we discuss sample size consideration for the split-mouth design with binary outcomes. Let Yijl denote the binary response on site l of subject j receiving treatment i, where Yijl = 1 denotes a “success” and Yijl = 0 denotes a “failure” for i = 1,2, j = 1,…,n and l = 1,…,mij. We assume that the intra-segment correlation is ρ and the inter-segment correlation is ρ12.

To make an inference on the differential effect of treatment, we assume the following logistic regression model: $Y_{ijl} \sim \mathrm{Bernoulli}(p_{ijl})$ and

$$\mathrm{logit}\{\Pr(Y_{ijl}=1)\} = \log\Big(\frac{p_{ijl}}{1-p_{ijl}}\Big) = \beta_1 + \beta_2 r_{ij}, \tag{4}$$

where $r_{1j} = 1$ and $r_{2j} = 0$ indicate the experimental and control treatments, respectively, $\beta_1$ is the log-transformed odds for the control group, and $\beta_2$ is the log-transformed odds ratio between the experimental and control treatments, representing the treatment difference on the outcome. The primary interest is to test the null hypothesis $H_0: \beta_2 = 0$, accounting for both the intra-segment and inter-segment correlations. We can apply the GEE using the robust variance approach and an independent working correlation matrix to test $H_0: \beta_2 = 0$. Model (4) can be rewritten as

$$p_{ijl}(\beta) = \frac{e^{\beta' Z_{ijl}}}{1 + e^{\beta' Z_{ijl}}},$$

where $\beta = (\beta_1, \beta_2)'$ and $Z_{ijl} = (1, r_{ij})'$. Under the model assumption, the true correlation matrix is “nested exchangeable”. By the GEE method, an estimator $\hat\beta$ is obtained by solving the equation $S_n(\beta) = 0$ with

$$S_n(\beta) = n^{-1/2}\sum_{j=1}^{n}\sum_{i=1}^{2}\sum_{l=1}^{m_{ij}} \{Y_{ijl} - p_{ijl}(\beta)\} Z_{ijl}.$$

The equation is solved by the Newton-Raphson algorithm. At the $m$th iteration,

$$\hat\beta^{(m)} = \hat\beta^{(m-1)} + n^{-1/2} A_n^{-1}(\hat\beta^{(m-1)}) S_n(\hat\beta^{(m-1)}),$$

where

$$A_n(\beta) = -n^{-1/2}\frac{\partial S_n(\beta)}{\partial \beta} = n^{-1}\sum_{j=1}^{n}\sum_{i=1}^{2}\sum_{l=1}^{m_{ij}} p_{ijl}(1-p_{ijl}) \begin{pmatrix} 1 & r_{ij} \\ r_{ij} & r_{ij}^2 \end{pmatrix}.$$

Liang and Zeger3 showed that $n^{1/2}(\hat\beta - \beta)$ is approximately normal with mean 0 and that its variance is consistently estimated by $\Sigma_n = A_n^{-1}(\hat\beta)V_n(\hat\beta)A_n^{-1}(\hat\beta)$, where

$$V_n(\hat\beta) = n^{-1}\sum_{j=1}^{n}\Big(\sum_{i=1}^{2}\sum_{l=1}^{m_{ij}} \hat\varepsilon_{ijl} Z_{ijl}\Big)^{\otimes 2},$$

with $\hat\varepsilon_{ijl} = Y_{ijl} - p_{ijl}(\hat\beta)$, and $c^{\otimes 2} = cc^T$ for a vector $c$. We reject $H_0$ if $|n^{1/2}\hat\beta_2/\sigma_2| > z_{1-\alpha/2}$, where $\sigma_2^2$ is the (2,2)th element of $\Sigma_n$.
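The estimation steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name, the convergence tolerance and the toy data are assumptions, and the code assumes a balanced design in which every subject has the same treatment layout, so that the weight matrix $A_n(\beta)$ is identical across subjects.

```python
import numpy as np


def gee_logistic_independence(Y, r, tol=1e-10, max_iter=50):
    """Fit model (4) by Newton-Raphson with an independence working correlation.

    Y : (n, m) array of 0/1 outcomes, one row per subject.
    r : length-m treatment indicators (1 = experimental, 0 = control),
        assumed identical for every subject.
    Returns (beta_hat, Sigma_n) with Sigma_n = A^{-1} V A^{-1}.
    """
    Y = np.asarray(Y, dtype=float)
    n, m = Y.shape
    Z = np.column_stack([np.ones(m), np.asarray(r, dtype=float)])  # rows are Z_ijl'
    beta = np.zeros(2)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ beta)))      # p_ijl(beta), same for each subject
        score = ((Y - p) @ Z).sum(axis=0) / n       # equals n^{-1/2} S_n(beta)
        A = Z.T @ (Z * (p * (1 - p))[:, None])      # A_n(beta)
        step = np.linalg.solve(A, score)            # n^{-1/2} A_n^{-1} S_n
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-(Z @ beta)))
    A = Z.T @ (Z * (p * (1 - p))[:, None])
    U = (Y - p) @ Z                                 # row j: sum_i sum_l eps_ijl Z_ijl
    V = U.T @ U / n                                 # V_n(beta_hat)
    A_inv = np.linalg.inv(A)
    return beta, A_inv @ V @ A_inv


# Toy data for illustration: independent sites, which ignores the
# within-subject correlation that a real split-mouth trial would have.
rng = np.random.default_rng(0)
r = np.array([1, 1, 1, 0, 0, 0])
Y = (rng.random((200, 6)) < np.where(r == 1, 0.3, 0.2)).astype(float)
beta_hat, Sigma = gee_logistic_independence(Y, r)
z_stat = np.sqrt(Y.shape[0]) * beta_hat[1] / np.sqrt(Sigma[1, 1])
print("beta_hat:", beta_hat, "test statistic:", z_stat)
```

Because the model has only an intercept and a binary treatment indicator, the fitted coefficients coincide with the logits of the pooled sample proportions in each arm, which gives a simple way to sanity-check the iteration.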

We consider a balanced design where $m_{1j} = m_{2j} = k$. To facilitate the discussion, let $P_1 = e^{\beta_1+\beta_2}/(1 + e^{\beta_1+\beta_2})$ and $P_2 = e^{\beta_1}/(1 + e^{\beta_1})$ denote the true success rates in the experimental and control groups, respectively. Define $Q_1 = 1 - P_1$ and $Q_2 = 1 - P_2$. The null hypothesis $H_0: \beta_2 = 0$ is equivalent to $H_0: P_1 = P_2$.

Theorem 1

As $n \to \infty$, $\Sigma_n = A_n^{-1}(\hat\beta)V_n(\hat\beta)A_n^{-1}(\hat\beta) \to \Sigma^*$, and the (2,2)th element of $\Sigma^*$ has the closed form

$$\sigma_2^2 = \frac{\{1+(k-1)\rho\}(P_1Q_1 + P_2Q_2) - 2k\rho_{12}\sqrt{P_1Q_1P_2Q_2}}{kP_1Q_1P_2Q_2}.$$

The proof of Theorem 1 is presented in the Appendix. Given type I error $\alpha$, power $1 - \gamma$, and the true value of the differential treatment effect $\beta_2$, the required total number of subjects for the split-mouth design is

$$n_s = \frac{\sigma_2^2 (z_{1-\alpha/2} + z_{1-\gamma})^2}{\beta_2^2}. \tag{5}$$

When the true correlation is exchangeable ($\rho = \rho_{12}$) and $P_1 = P_2 = P$ ($P$ is the true overall success rate), the sample size formula (5) can be simplified as

$$n_s = \frac{2(1-\rho)(z_{1-\alpha/2} + z_{1-\gamma})^2}{kP(1-P)\beta_2^2},$$

where $k$ is the number of sites per segment. For the parallel-group design with a cluster size of $k$ for binary outcomes, the required total number of subjects is

$$n_p = \frac{1+(k-1)\rho}{k} \times \frac{4(z_{1-\alpha/2} + z_{1-\gamma})^2}{P(1-P)\beta_2^2}.$$

Therefore, with the exchangeable correlation structure, we have

$$\frac{n_p}{n_s} = \frac{2\{1+(k-1)\rho\}}{1-\rho},$$

which is consistent with the relative efficiency of the split-mouth design over the parallel-group design that we find for the continuous outcomes, in terms of the number of subjects.
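Formula (5) is straightforward to evaluate once the closed form of Theorem 1 is available. The sketch below assumes that the inter-segment term enters as $2k\rho_{12}\sqrt{P_1Q_1P_2Q_2}$, which is what the covariance $\rho_{12}\sqrt{P_1Q_1P_2Q_2}$ between two binary outcomes in different segments implies. Function names are illustrative, and results are returned unrounded since the text does not state a rounding rule.

```python
from math import log, sqrt
from statistics import NormalDist


def sigma2_sq_binary(P1, P2, rho, rho12, k):
    """Closed-form (2,2) element of the limiting sandwich variance (Theorem 1)."""
    a, b = P1 * (1 - P1), P2 * (1 - P2)   # P1*Q1 and P2*Q2
    numerator = (1 + (k - 1) * rho) * (a + b) - 2 * k * rho12 * sqrt(a * b)
    return numerator / (k * a * b)


def n_split_mouth_binary(P1, P2, rho, rho12, k, alpha=0.05, power=0.8):
    """Formula (5); beta2 is the log odds ratio implied by P1 and P2."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    beta2 = log(P1 / (1 - P1)) - log(P2 / (1 - P2))
    return sigma2_sq_binary(P1, P2, rho, rho12, k) * z ** 2 / beta2 ** 2


# Two configurations that also appear in the simulation study (k = 3).
for P1, P2 in [(0.15, 0.1), (0.2, 0.1)]:
    n = n_split_mouth_binary(P1, P2, rho=0.1, rho12=0.05, k=3)
    print(f"P1={P1}, P2={P2}: n_s = {n:.2f}")
```

Rounded to the nearest integer, these two values agree with the corresponding entries of Table 2; other cells may differ by one subject depending on how the normal quantiles are rounded, which is not specified in the text.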

4. Simulation study

The first set of simulations demonstrates the effect of various design configurations on the sample size for the split-mouth design with continuous outcomes. The nominal levels of type I error and power are set at $\alpha = 0.05$ and $1 - \gamma = 0.8$, respectively. We consider both exchangeable ($\rho = \rho_{12}$) and “nested exchangeable” ($\rho \neq \rho_{12}$) correlation structures, where $\rho$ takes the values 0.1, 0.15 and 0.2, and $\rho_{12}$ takes the values 0.05, 0.1 and 0.15. Assuming a balanced design with $m_{1j} = m_{2j} = k = 3$, three sites in each segment are assigned to either the experimental or the control treatment. We set the true values of the regression coefficients to $\beta = (\beta_1, \beta_2)' = (0.3, 0.2)'$ and the variance to $\sigma^2 = 0.5$ or 1, where $(\beta_2, \sigma^2) = (0.2, 1)$ indicates an effect size of 0.2 comparing the experimental treatment with the control treatment. We assess the performance of the proposed sample size approach for continuous outcomes for each combination of $(\sigma^2, \rho, \rho_{12})$. The simulation procedure is as follows: (i) Estimate the sample size $n_s$ based on equation (3); (ii) Generate 5000 null ($\beta_2 = 0$) data sets and 5000 alternative ($\beta_2 = 0.2$) data sets, each containing $n_s$ subjects. For subject $j$, generate a vector of outcomes $Y_j$ from the model $Y_j = X_j\beta + \varepsilon_j$, where the vector of random errors $\varepsilon_j$ is generated from a multivariate normal distribution with mean 0, variance $\sigma^2 = 0.5$ or 1, and correlation matrix $R_0$ (exchangeable) or $R_1$ (“nested exchangeable”); (iii) For each data set, estimate $\hat\beta_2$ and $\sigma_2$ (or $\sigma_{21}$); (iv) Calculate the empirical type I error and empirical power as the proportion of data sets with $|n^{1/2}\hat\beta_2/\sigma_2| > z_{1-0.05/2}$ (or $|n^{1/2}\hat\beta_2/\sigma_{21}| > z_{1-0.05/2}$) under the null and alternative hypotheses, respectively.

Table 1 presents the sample size estimate, empirical power and empirical type I error from the simulation. The empirical powers and type I errors are generally close to their nominal levels, which indicates good performance of the proposed method. With all other factors fixed, for the “nested exchangeable” correlation ($\rho \neq \rho_{12}$), the sample size increases as the intra-segment correlation $\rho$ increases, or as the inter-segment correlation $\rho_{12}$ decreases. For the exchangeable correlation ($\rho = \rho_{12}$), the sample size increases as $\rho$ decreases. The statistical inference under the GEE method is based on a large sample approximation. It is thus important to examine the performance of the proposed method in some small-sample-size scenarios. In Table 1, we have explored scenarios where the sample size can be as small as 49, and the corresponding empirical power remains close to the nominal level of 0.8. This provides assurance to researchers that the proposed method is widely applicable to split-mouth clinical trials, even when the sample size is relatively small.

Table 1.

Sample size (empirical power, empirical type I error) for simulation with continuous outcomes, type I error=0.05, power=0.8.

| $\sigma^2$ | $\rho$ | $\rho_{12} = 0.05$ | $\rho_{12} = 0.1$ | $\rho_{12} = 0.15$ |
|---|---|---|---|---|
| 0.5 | 0.1 | 69 (0.810, 0.058) | 59 (0.795, 0.061) | 49 (0.791, 0.062) |
| 0.5 | 0.15 | 75 (0.810, 0.047) | 65 (0.795, 0.057) | 56 (0.809, 0.054) |
| 0.5 | 0.2 | 82 (0.820, 0.057) | 72 (0.805, 0.053) | 62 (0.804, 0.056) |
| 1 | 0.1 | 137 (0.799, 0.051) | 118 (0.808, 0.052) | 98 (0.809, 0.056) |
| 1 | 0.15 | 150 (0.798, 0.053) | 131 (0.785, 0.049) | 111 (0.793, 0.057) |
| 1 | 0.2 | 163 (0.798, 0.047) | 144 (0.810, 0.056) | 124 (0.791, 0.051) |
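The simulation procedure for continuous outcomes can be sketched as follows for a single cell of Table 1. This is an illustrative reimplementation under stated assumptions, not the authors' code: the function name, seeds and the reduced number of replications are hypothetical, and the variance of $\hat\beta_2$ is estimated with the empirical sandwich estimator (residual cross-products in place of $\sigma^2 R_1$), which is one natural reading of step (iii).

```python
import numpy as np
from statistics import NormalDist


def simulate_rejection_rate(n, k, beta, sigma2, rho, rho12,
                            reps=2000, alpha=0.05, seed=0):
    """Proportion of simulated data sets in which H0: beta2 = 0 is rejected."""
    rng = np.random.default_rng(seed)
    # Balanced design: first k sites experimental (r = 1), last k control (r = 0).
    X = np.column_stack([np.ones(2 * k), np.r_[np.ones(k), np.zeros(k)]])
    within = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
    between = rho12 * np.ones((k, k))
    cov = sigma2 * np.block([[within, between], [between, within]])
    bread = np.linalg.inv(X.T @ X)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    mean = X @ np.asarray(beta, dtype=float)
    rejections = 0
    for _ in range(reps):
        Y = rng.multivariate_normal(mean, cov, size=n)   # shape (n, 2k)
        beta_hat = bread @ X.T @ Y.mean(axis=0)          # independence GEE = least squares
        U = (Y - X @ beta_hat) @ X                       # row j: X_j' (Y_j - X_j beta_hat)
        Sigma = bread @ (U.T @ U / n) @ bread            # empirical sandwich variance
        stat = np.sqrt(n) * beta_hat[1] / np.sqrt(Sigma[1, 1])
        rejections += int(abs(stat) > z_crit)
    return rejections / reps


if __name__ == "__main__":
    # Cell sigma^2 = 0.5, rho = 0.1, rho12 = 0.05 of Table 1, where n_s = 69.
    power = simulate_rejection_rate(69, 3, (0.3, 0.2), 0.5, 0.1, 0.05, seed=0)
    type1 = simulate_rejection_rate(69, 3, (0.3, 0.0), 0.5, 0.1, 0.05, seed=1)
    print(f"empirical power ~ {power:.3f}, empirical type I error ~ {type1:.3f}")
```

With fewer replications than the 5000 used in the study, the estimates carry more Monte Carlo noise, so they should be compared with Table 1 only approximately.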

The second set of simulations evaluates the performance of the GEE sample size formula for binary outcomes under various design configurations: true success rate for control $P_2$ of 0.1 or 0.2; true treatment effect $\Delta = P_1 - P_2$ of 0.05 or 0.1; true intra-segment correlation $\rho$ of 0.1, 0.15 or 0.2; true inter-segment correlation $\rho_{12}$ of 0.05, 0.1 or 0.15. We assume a balanced design with $m_{1j} = m_{2j} = k = 3$. For each $(P_1, P_2, \rho, \rho_{12})$, we generate 5000 null data sets and 5000 alternative data sets with sample size $n_s$, where the vector of binary outcomes $Y_j$ for subject $j$ is generated using the approach described by Obuchowski.18 The empirical type I error and empirical power are calculated as the proportions of data sets in which the null hypothesis is rejected by the GEE approach, under the null and alternative settings respectively, when fitting the logistic regression model (4) to the simulated data. Table 2 summarizes the simulation results, including the required sample size, empirical power and empirical type I error under different combinations of design factors $(P_1, P_2, \rho, \rho_{12})$. We have studied scenarios with a wide range of sample sizes, varying from 53 to 457. The results show that a larger treatment effect leads to a smaller sample size requirement. As with continuous outcomes, the sample size increases as $\rho$ increases or $\rho_{12}$ decreases, when $\rho \neq \rho_{12}$, with all other factors fixed. The empirical type I errors are all close to the nominal level of 0.05. The empirical power tends to be smaller than the nominal level when the sample size is relatively small, and gets closer to the nominal level as the sample size increases. A possible explanation is that the normal approximation is less satisfactory when the sample size is small.

Table 2.

Sample size (empirical power, empirical type I error) for simulation with binary outcomes, type I error=0.05, power=0.8.

| $P_1$ | $P_2$ | $\rho$ | $\rho_{12} = 0.05$ | $\rho_{12} = 0.1$ | $\rho_{12} = 0.15$ |
|---|---|---|---|---|---|
| 0.15 | 0.1 | 0.1 | 244 (0.793, 0.054) | 209 (0.766, 0.047) | 175 (0.730, 0.049) |
| 0.15 | 0.1 | 0.15 | 267 (0.826, 0.051) | 232 (0.791, 0.050) | 198 (0.759, 0.049) |
| 0.15 | 0.1 | 0.2 | 290 (0.823, 0.052) | 256 (0.806, 0.049) | 221 (0.780, 0.056) |
| 0.2 | 0.1 | 0.1 | 73 (0.836, 0.056) | 63 (0.802, 0.052) | 53 (0.759, 0.053) |
| 0.2 | 0.1 | 0.15 | 80 (0.845, 0.053) | 70 (0.830, 0.054) | 60 (0.795, 0.053) |
| 0.2 | 0.1 | 0.2 | 87 (0.862, 0.048) | 77 (0.846, 0.046) | 67 (0.813, 0.055) |
| 0.25 | 0.2 | 0.1 | 384 (0.826, 0.052) | 330 (0.805, 0.053) | 275 (0.768, 0.054) |
| 0.25 | 0.2 | 0.15 | 421 (0.844, 0.054) | 366 (0.823, 0.051) | 311 (0.791, 0.053) |
| 0.25 | 0.2 | 0.2 | 457 (0.847, 0.050) | 403 (0.834, 0.050) | 348 (0.805, 0.049) |
| 0.3 | 0.2 | 0.1 | 104 (0.822, 0.053) | 89 (0.790, 0.051) | 75 (0.762, 0.054) |
| 0.3 | 0.2 | 0.15 | 114 (0.839, 0.053) | 99 (0.819, 0.051) | 85 (0.787, 0.053) |
| 0.3 | 0.2 | 0.2 | 124 (0.841, 0.057) | 109 (0.819, 0.051) | 95 (0.801, 0.055) |

5. Example

Morrow et al.1 reported a split-mouth trial of chlorhexidine in the treatment of gingivitis, where the left and right sides of a patient’s upper and lower jaws were randomly assigned to chlorhexidine or a control treatment. Each treatment was applied to four sites located on the left and right sides of the upper and lower jaws ($m_{1j} = m_{2j} = k = 4$). The trial enrolled 23 orthodontic patients, and the proportions of patients having plaque in the chlorhexidine and control groups are estimated as $\hat P_1 = 0.89$ and $\hat P_2 = 0.77$, respectively, with the intra-segment correlation estimated as $\hat\rho = 0.070$ and the inter-segment correlation estimated as $\hat\rho_{12} = 0.039$. Correspondingly, we have the log-transformed odds for the control group $\hat\beta_1 = 1.21$, and the log-transformed odds ratio $\hat\beta_2 = 0.88$. An investigator would like to conduct a split-mouth trial to study the effect of a new drug as a treatment of gingivitis, following a similar study design where the new drug and a control treatment are randomly assigned to each segment with four sites of a patient’s mouth. To test the hypotheses $H_0: \beta_2 = 0$ versus $H_1: \beta_2 \neq 0$ with 80% power at a 5% significance level, the number of subjects needs to be determined. Based on the preliminary data from the aforementioned trial, we assume that the design factors are $P_2 = 0.77$, $\rho = 0.070$ and $\rho_{12} = 0.039$. By the proposed method for binary outcomes, the required total number of subjects to detect a treatment effect of $\Delta = P_1 - P_2 = 0.10$ ($\beta_2 = 0.69$) is $n_s = 84$ for a split-mouth trial. We also estimate the sample size for a parallel-group trial of the same cluster size of $k = 4$, and the required total number of subjects is $n_p = 135$. In terms of the number of subjects, the relative efficiency of the split-mouth design over the parallel-group design is $n_p/n_s = 1.61$.
Further, to detect a treatment effect of $\Delta = P_1 - P_2 = 0.15$ ($\beta_2 = 1.23$), the required total number of subjects is $n_s = 35$ for a split-mouth trial, and the required total number of subjects for the corresponding parallel-group trial is $n_p = 48$.

6. Discussion

In this paper, we present closed-form sample size formulas for the split-mouth design with continuous and binary outcomes. To our knowledge, this is the first attempt to systematically investigate sample size methods for designing a split-mouth study. Marginal linear and logistic regression models with the GEE approach are employed to account for correlations among observations in split-mouth trials. The main contribution of this paper is to provide closed-form sample size formulas, which allow deeper insight into the impact of various design factors (treatment effect size, intra-segment and inter-segment correlations, etc.) on the sample size. We also theoretically derive the relative efficiency of the split-mouth design over the parallel-group design. The relative efficiency increases with the inter-segment correlation $\rho_{12}$, while it decreases with the intra-segment correlation $\rho$. The statistical inference under the GEE approach is based on asymptotic properties; thus the sample size formulas are generally applicable for large sample sizes. Our simulation shows that the nominal power and type I error are preserved over a wide range of sample sizes.

Clinical trials with multiple or repeated measurements often encounter missing data, and in a split-mouth trial, there may be missing observations at some sites of a patient. A common practice to account for missing data is to estimate the sample size by $n/(1 - q)$, where $n$ is the sample size (number of subjects or sites) calculated assuming no missing data and $q$ is the expected missing rate. However, as shown in Zhang and Ahn,11 this crude adjustment may be unsatisfactory, as missing data might cause less information loss if the correlations among repeated measurements are high. On the other hand, the GEE approach has the advantage of utilizing incomplete observations. One possible extension of the current work is to incorporate missing data into sample size consideration. Finally, in this paper we have concentrated on the split-mouth design with continuous and binary outcomes. In the future, we will extend the proposed methods to the split-mouth design with categorical, count and survival outcomes. The aforementioned extensions demand significant methodological development and will be pursued in separate studies.
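For reference, the crude $n/(1 - q)$ adjustment mentioned above amounts to the following one-line calculation. The function name and the example inputs are hypothetical, and rounding up is an assumption; as the paragraph notes, this adjustment may be unsatisfactory when correlations among measurements are high.

```python
from math import ceil


def inflate_for_missing(n, q):
    """Crude adjustment n / (1 - q) for an expected missing rate q, rounded up."""
    if not 0 <= q < 1:
        raise ValueError("q must lie in [0, 1)")
    return ceil(n / (1 - q))


# Hypothetical example: 84 subjects planned, 10% of data expected to be missing.
print(inflate_for_missing(84, 0.10))
```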

Acknowledgments

This work was supported in part by the Cancer Center Support Grant from the National Cancer Institute (5P30CA142543) awarded to the Harold C. Simmons Cancer Center at the University of Texas Southwestern Medical Center. The authors thank the editor and two reviewers for their constructive comments that have improved the initial version of this paper.

Appendix: Proof of Theorem 1

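We consider a balanced design where $m_{1j} = m_{2j} = k$. First of all, $A_n(\hat\beta)$ can be split into two parts for the treatment and control as

$$\begin{aligned} A_n(\hat\beta) &= \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{2}\sum_{l=1}^{k} p_{ijl}(\hat\beta)\{1-p_{ijl}(\hat\beta)\} \begin{pmatrix} 1 & r_{ij} \\ r_{ij} & r_{ij}^2 \end{pmatrix} \\ &= \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{2}\sum_{l=1}^{k} p_{ijl}(\beta)\{1-p_{ijl}(\beta)\} \begin{pmatrix} 1 & r_{ij} \\ r_{ij} & r_{ij}^2 \end{pmatrix} + o_p(1) \\ &= \frac{1}{n}\sum_{j=1}^{n}\sum_{l=1}^{k} p_{1jl}(1-p_{1jl}) \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \frac{1}{n}\sum_{j=1}^{n}\sum_{l=1}^{k} p_{2jl}(1-p_{2jl}) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + o_p(1). \end{aligned}$$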

Applying the law of large numbers, as $n \to \infty$, $A_n(\hat\beta)$ converges to

$$A = \begin{pmatrix} k(P_1Q_1 + P_2Q_2) & kP_1Q_1 \\ kP_1Q_1 & kP_1Q_1 \end{pmatrix}.$$

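Next, we separate $V_n(\hat\beta)$ into two parts for the treatment and control as

$$\begin{aligned} V_n(\hat\beta) &= \frac{1}{n}\sum_{j=1}^{n}\Big(\sum_{i=1}^{2}\sum_{l=1}^{k} \hat\varepsilon_{ijl} Z_{ijl}\Big)^{\otimes 2} = \frac{1}{n}\sum_{j=1}^{n}\Big\{\sum_{l=1}^{k} \begin{pmatrix} \varepsilon_{1jl} + \varepsilon_{2jl} \\ \varepsilon_{1jl} \end{pmatrix}\Big\}^{\otimes 2} + o_p(1) \\ &= \frac{1}{n}\sum_{j=1}^{n}\sum_{l=1}^{k}\sum_{l'=1}^{k} \begin{pmatrix} \varepsilon_{1jl}\varepsilon_{1jl'} + \varepsilon_{1jl}\varepsilon_{2jl'} + \varepsilon_{2jl}\varepsilon_{2jl'} + \varepsilon_{2jl}\varepsilon_{1jl'} & \varepsilon_{1jl}\varepsilon_{1jl'} + \varepsilon_{2jl}\varepsilon_{1jl'} \\ \varepsilon_{1jl}\varepsilon_{1jl'} + \varepsilon_{1jl}\varepsilon_{2jl'} & \varepsilon_{1jl}\varepsilon_{1jl'} \end{pmatrix} + o_p(1). \end{aligned}$$

By the law of large numbers, as $n \to \infty$, $V_n(\hat\beta)$ converges to $V$, where

$$\begin{aligned} V_{11} &= \{k + (k^2-k)\rho\}(P_1Q_1 + P_2Q_2) + 2k^2\rho_{12}\sqrt{P_1Q_1P_2Q_2}, \\ V_{12} = V_{21} &= \{k + (k^2-k)\rho\}P_1Q_1 + k^2\rho_{12}\sqrt{P_1Q_1P_2Q_2}, \\ V_{22} &= \{k + (k^2-k)\rho\}P_1Q_1. \end{aligned}$$

A few steps of algebra show that the (2,2)th element of $\Sigma^* = A^{-1}VA^{-1}$ is

$$\sigma_2^2 = \frac{\{1+(k-1)\rho\}(P_1Q_1 + P_2Q_2) - 2k\rho_{12}\sqrt{P_1Q_1P_2Q_2}}{kP_1Q_1P_2Q_2}.$$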
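The "few steps of algebra" can be spelled out as follows. This is a sketch that uses shorthand not found in the original ($a = P_1Q_1$, $b = P_2Q_2$, $c = k + (k^2-k)\rho$, $d = k^2\rho_{12}\sqrt{ab}$), takes $A$ to be the limit implied by the expansion of $A_n(\hat\beta)$, and takes the covariance between outcomes in different segments to be $\rho_{12}\sqrt{ab}$.

$$\begin{aligned} A &= k\begin{pmatrix} a+b & a \\ a & a \end{pmatrix}, \qquad \det A = k^2ab, \qquad A^{-1} = \frac{1}{kab}\begin{pmatrix} a & -a \\ -a & a+b \end{pmatrix}, \\ V &= \begin{pmatrix} c(a+b) + 2d & ca + d \\ ca + d & ca \end{pmatrix}, \\ \sigma_2^2 &= \frac{1}{(kab)^2}\Big[a^2V_{11} - 2a(a+b)V_{12} + (a+b)^2V_{22}\Big] \\ &= \frac{1}{(kab)^2}\Big[ca(a+b)\{a - 2a + (a+b)\} + 2ad\{a - (a+b)\}\Big] \\ &= \frac{ab\{c(a+b) - 2d\}}{(kab)^2} = \frac{\{1+(k-1)\rho\}(a+b) - 2k\rho_{12}\sqrt{ab}}{kab}. \end{aligned}$$

The last step substitutes $c = k\{1+(k-1)\rho\}$ and $d = k^2\rho_{12}\sqrt{ab}$ and cancels one factor of $k$.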

References

1. Morrow D, Wood DP, Speechley M. Clinical effect of subgingival chlorhexidine irrigation on gingivitis in adolescent orthodontic patients. Am J Orthod Dentofac. 1992;101:408–413. doi: 10.1016/0889-5406(92)70113-O.
2. Lesaffre E, Philstrom B, Needleman I, et al. The design and analysis of split-mouth studies: what statisticians and clinicians should know. Stat Med. 2009;28:3470–3482. doi: 10.1002/sim.3634.
3. Liang K, Zeger S. Longitudinal data analysis for discrete and continuous outcomes using generalized linear models. Biometrika. 1986;84:3–32.
4. Donner A, Klar N, Zou G. Methods for the statistical analysis of binary data in split-cluster designs. Biometrics. 2004;60:919–925. doi: 10.1111/j.0006-341X.2004.00247.x.
5. Donner A, Zou G. Methods for the statistical analysis of binary data in split-mouth designs with baseline measurements. Stat Med. 2007;26:3476–3486. doi: 10.1002/sim.2782.
6. Pandis N. Sample calculation for split-mouth designs. Am J Orthod Dentofac. 2012;141:818–819. doi: 10.1016/j.ajodo.2012.03.015.
7. Hujoel P, Loesche W. Efficiency of split-mouth designs. J Clin Periodontol. 1990;17:722–728. doi: 10.1111/j.1600-051x.1990.tb01060.x.
8. Liu G, Liang K. Sample size calculations for studies with correlated observations. Biometrics. 1997;53:937–947.
9. Jung S, Ahn C. Sample size estimation for GEE method for comparing slopes in repeated measurements data. Stat Med. 2003;22:1305–1315. doi: 10.1002/sim.1384.
10. Jung S, Ahn C. Sample size for a two-group comparison of repeated binary measurements using GEE. Stat Med. 2005;24:2583–2596. doi: 10.1002/sim.2136.
11. Zhang S, Ahn C. Sample size calculation for time-averaged differences in the presence of missing data. Contemp Clin Trials. 2012;33:550–556. doi: 10.1016/j.cct.2012.02.004.
12. Lou Y, Cao J, Zhang S, et al. Sample size calculations for time-averaged difference of longitudinal binary outcomes. Commun Stat A-Theor. In press. doi: 10.1080/03610926.2014.991040.
13. Bigby M, Godenne A. Continuing medical education: understanding and evaluating clinical trials. J Am Acad Dermatol. 1986;34:555–593. doi: 10.1016/s0190-9622(96)80053-3.
14. Weiss RA. Comparison of endovenous radiofrequency versus 810 nm diode laser occlusion of large veins in an animal model. Dermatol Surg. 2002;28:56–61. doi: 10.1046/j.1524-4725.2002.01191.x.
15. Teerenstra S, Lu B, Preisser JS, et al. Sample size considerations for GEE analyses of three-level cluster randomized trials. Biometrics. 2010;66:1230–1237. doi: 10.1111/j.1541-0420.2009.01374.x.
16. Donner A, Klar N. Design and analysis of cluster randomization trials in health research. London: Arnold; 2000.
17. Wang D, Bakhai A. Clinical Trials in Practice. A Practical Guide to Design, Analysis and Reporting, chapter 10. London: Remedica; 2006.
18. Obuchowski NA. On the comparison of correlated proportions for clustered data. Stat Med. 1998;17:1495–1507. doi: 10.1002/(sici)1097-0258(19980715)17:13<1495::aid-sim863>3.0.co;2-i.
