Simultaneous confidence interval construction for many-to-one comparisons of proportion differences based on correlated paired data

Zhengyu Yang; Guo-Liang Tian; Xiaobin Liu; Chang-Xing Ma

doi:10.1080/02664763.2020.1795815

2020 Jul 22;48(8):1442–1456. doi: 10.1080/02664763.2020.1795815

PMCID: PMC9042020 PMID: 35706469

Abstract

In some medical research, such as ophthalmological, orthopaedic and otolaryngologic studies, it is often of interest to compare multiple groups with a control using data collected from paired organs of patients. The major difficulty in the data analysis is to adjust for the multiplicity of comparisons among multiple groups and for the correlation within the same patient's paired organs. In this article, we construct asymptotic simultaneous confidence intervals (SCIs) for many-to-one comparisons of proportion differences, adjusting for multiplicity and the correlation. The coverage probabilities and widths of the proposed SCIs are evaluated by Monte Carlo simulation studies. The methods are illustrated by a real data example.

Keywords: Bilateral data, profile likelihood method, score method, simultaneous confidence interval, Wald method

2010 Mathematics Subject Classification: 62F99

1. Introduction

In some medical research, such as ophthalmological, orthopaedic or otolaryngologic studies, bilateral observations from paired organs are commonly collected for data analysis. For example, in a randomized trial of group comparisons, researchers are interested in assessing the effectiveness of several new treatments relative to a traditional one. After a patient is randomly allocated to one of these treatment groups, he/she receives the same therapy on both paired organs, such as eyes, ears or hands. Response data on both organs are collected from each patient and labeled with the group allocation. The information can often be summarized as categorical bilateral data: both of the paired organs respond, only one of them responds, or neither responds. Naturally, the responses of paired organs from the same patient are correlated.

Existing research has shown that ignoring the correlation may yield biased inference [2,5,8]. In 1982, a parametric model [16] with a constant R was introduced to measure the dependency in bilateral data. For this ‘R model,’ various homogeneity test statistics were proposed [10,19,20], and confidence intervals for the difference of proportions between two groups have been developed [18]. In this article, we propose statistical approaches to construct simultaneous confidence intervals (SCIs) for proportion differences among g ( $g \geq 2$ ) groups under the dependency assumption.

The rest of this article is organized as follows. In Section 2, we propose the method for multiplicity adjustment. Then we introduce methods for constructing SCIs in Section 3. Monte Carlo simulations are conducted in Section 4 to evaluate the performance of the proposed SCIs with respect to their coverage probabilities and widths. A real example is presented in Section 5 to illustrate the methodology.

2. Multiplicity adjustment

In the construction of SCIs, ignoring multiplicity adjustment may lead to biased results. Suppose that there are g groups (one control group and g−1 target groups) and that the ith group has proportion $π_{i}$ . We are interested in estimating the difference of proportions between the ith target group and the control group, $π_{i} - π_{1}$ $(i = 2, \dots, g)$ . The simplest way to handle multiplicity is to apply the Bonferroni correction, using the quantile $c = z_{1 - α / [2 (g - 1)]}$ instead of $c = z_{1 - α / 2}$ for two-sided limits, where $z_{1 - α}$ denotes the $(1 - α)$ quantile of the standard normal distribution.
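As a small illustration of the Bonferroni quantile described above, the following sketch computes the unadjusted and the adjusted two-sided critical values using only the Python standard library. The function name `bonferroni_critical_value` is ours and is introduced purely for illustration.

```python
from statistics import NormalDist


def bonferroni_critical_value(alpha: float, g: int) -> float:
    """Two-sided Bonferroni critical value for g - 1 comparisons with a control.

    Hypothetical helper: returns the 1 - alpha / [2 (g - 1)] quantile of the
    standard normal distribution.
    """
    if g < 2:
        raise ValueError("need at least one target group besides the control")
    return NormalDist().inv_cdf(1 - alpha / (2 * (g - 1)))


# Unadjusted quantile z_{1 - alpha/2} versus the adjusted one for g = 4 groups.
c_unadjusted = NormalDist().inv_cdf(1 - 0.05 / 2)
c_bonferroni = bonferroni_critical_value(0.05, g=4)
print(round(c_unadjusted, 3), round(c_bonferroni, 3))
```

With g = 2 there is a single comparison and the adjusted value coincides with the usual $z_{1 - α / 2}$ .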

For problems of multiple comparisons to a control, Dunnett's test [7] is designed to control the family-wise error rate at or below the nominal level when the normality assumption is valid, taking the correlation among groups into account. Based on Dunnett's test, a general approach [14] was developed to construct many-to-one SCIs for binary data. The correlation coefficient, $ρ_{i j} \hat{=} w_{i} w_{j}$ where $w_{i} = \{1 + [n_{1} π_{i} (1 - π_{i})] / [n_{i} π_{1} (1 - π_{1})]\}^{- 1 / 2}$ and $n_{i}$ is the sample size of the ith group, is determined from the asymptotic correlation between ${\hat{π}}_{i} - {\hat{π}}_{1}$ and ${\hat{π}}_{j} - {\hat{π}}_{1}$ . Then the critical value $c = z_{g - 1, 1 - α, R}$ is calculated as the $1 - α$ quantile of the $(g - 1)$ -variate normal distribution with zero mean and correlation matrix $R = (ρ_{i j})$ . In this paper, these values are estimated consistently by replacing $π_{i}$ with ${\hat{π}}_{i}$ .
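The critical value $z_{g - 1, 1 - α, R}$ has no closed form. The sketch below approximates it by Monte Carlo rather than by numerical integration, and it assumes the two-sided (equicoordinate) reading, i.e. c satisfies $\Pr (\max_{i} |Z_{i}| \leq c) = 1 - α$ . It uses the fact that the product structure $ρ_{i j} = w_{i} w_{j}$ is reproduced by $Z_{i} = \sqrt{1 - w_{i}^{2}} ε_{i} - w_{i} ε_{0}$ with independent standard normal $ε$ 's. The function names, the number of draws and the seed are illustrative assumptions, not part of the original method.

```python
import math
import random


def dunnett_weights(pi, n):
    """Weights w_i for the target groups; pi[0] and n[0] belong to the control."""
    v1 = pi[0] * (1 - pi[0]) / n[0]
    return [(1 + (p * (1 - p) / m) / v1) ** -0.5 for p, m in zip(pi[1:], n[1:])]


def mc_critical_value(weights, alpha=0.05, draws=50_000, seed=1):
    """Monte Carlo estimate of c with P(max_i |Z_i| <= c) = 1 - alpha.

    Z has unit variances and corr(Z_i, Z_j) = w_i * w_j, generated through the
    one-factor representation Z_i = sqrt(1 - w_i^2) * e_i - w_i * e_0.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(draws):
        e0 = rng.gauss(0.0, 1.0)
        maxima.append(max(
            abs(math.sqrt(1 - w * w) * rng.gauss(0.0, 1.0) - w * e0)
            for w in weights
        ))
    maxima.sort()
    return maxima[math.ceil((1 - alpha) * draws) - 1]


# Example: g = 4 balanced groups with equal proportions, so every w_i = 1/sqrt(2).
w = dunnett_weights([0.4, 0.4, 0.4, 0.4], [50, 50, 50, 50])
print(w, mc_critical_value(w, draws=20_000))
```

Because the comparisons share the control group and are positively correlated, the resulting value should lie between the unadjusted quantile $z_{1 - α / 2}$ and the Bonferroni quantile, up to simulation error.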

3. The proposed methods

Let $l (= 0, 1, 2)$ be the number of responses per patient. When both of the paired organs have responses, l = 2; one of the paired organs has response, l = 1; and none of them has response, l = 0. Let $m_{l i}$ denote the number of patients with l response(s) in the ith group, $i = 1, \dots, g$ . Let $m_{i} = \sum_{l = 0}^{2} m_{l i}$ be the number of patients in the ith group, and $N = \sum_{i = 1}^{g} m_{i}$ be the total number of patients. Therefore, a typical data structure can be summarized in Table 1 with each column following a multinomial distribution.

Table 1. Data structure for bilateral patients.

	Group (i)
Number of responses (l)	1	2	···	g	Total
0	$m_{01}$	$m_{02}$	···	$m_{0 g}$	$S_{0}$
1	$m_{11}$	$m_{12}$	···	$m_{1 g}$	$S_{1}$
2	$m_{21}$	$m_{22}$	···	$m_{2 g}$	$S_{2}$
Total	$m_{1}$	$m_{2}$	···	$m_{g}$	N

Let $Z_{i j k}$ be the indicator variable of response for the kth eye of the jth patient in the ith group, where $i = 1, \dots, g$ , $j = 1, \dots, m_{i}$ and k = 1, 2. Let $Z_{i j k} = 1$ if there is a response, and 0 otherwise. We present the ‘R model’ and assume that

\begin{cases} \Pr (Z_{i j k} = 1) = π_{i}, \\ \Pr (Z_{i j k} = 1 \mid Z_{i j, 3 - k} = 1) = R π_{i}, \end{cases}

(1)

where $π_{i}$ is the response rate in the ith group, and R is a positive constant measuring the dependence between two eyes of the same patient. R = 1 indicates that the two eyes of the same patient are completely independent, while $R π_{i} = 1$ means complete dependency. It can be shown that the correlation between the two eyes is

ρ_{i} = corr (Z_{i j 1}, Z_{i j 2}) = \frac{π_{i}}{1 - π_{i}} (R - 1), i = 1, \dots, g .

Let $p_{l i}$ be the cell probability corresponding to $m_{l i}$ , i.e. the probability that a patient in the ith group has l responses. It is easy to show that $p_{0 i} = R π_{i}^{2} - 2 π_{i} + 1$ , $p_{1 i} = 2 π_{i} (1 - R π_{i})$ and $p_{2 i} = R π_{i}^{2}$ [12]. Without loss of generality, let the first group be the control group with response rate $π_{1}$ . In this article, we aim to construct the SCIs for the differences of response rates between the other g−1 treatment groups and the control group, denoted by $Δ_{i} = π_{i} - π_{1}$ ( $i = 2, \dots, g$ ), which lie within $[- 1, 1]$ . Given the observed data $\tilde{M} = \{m_{01}, \dots, m_{0 g}, m_{11}, \dots, m_{1 g}, m_{21}, \dots, m_{2 g}\}$ , the log-likelihood function of the parameters $(π_{1}, \dots, π_{g}, R)$ is

l_{1} (π_{1}, \dots, π_{g}, R) = \sum_{i = 1}^{g} l_{i}^{'}

(2)

where

l_{i}^{'} = m_{0 i} \log (R π_{i}^{2} - 2 π_{i} + 1) + m_{1 i} \log [2 π_{i} (1 - R π_{i})] + m_{2 i} \log (R π_{i}^{2}) .
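To make the model concrete, the following sketch evaluates the cell probabilities, the within-patient correlation $ρ_{i}$ and the log-likelihood (2) for a table laid out as in Table 1. The function names are ours. The example uses the counts of Table 5 together with the estimates reported in Section 5, and compares the log-likelihood with the one obtained by forcing R = 1 (independence) at the same response rates.

```python
import math


def cell_probabilities(pi, R):
    """(p0, p1, p2) under the R model for response rate pi and constant R."""
    probs = (R * pi**2 - 2 * pi + 1, 2 * pi * (1 - R * pi), R * pi**2)
    if min(probs) <= 0:
        raise ValueError("(pi, R) is outside the interior of the parameter space")
    return probs


def eye_correlation(pi, R):
    """Correlation between the two organs of one patient."""
    return pi * (R - 1) / (1 - pi)


def log_likelihood(pis, R, table):
    """Log-likelihood (2); table[i] holds (m0, m1, m2) for group i."""
    total = 0.0
    for pi, counts in zip(pis, table):
        probs = cell_probabilities(pi, R)
        total += sum(m * math.log(p) for m, p in zip(counts, probs))
    return total


# Counts from Table 5 (DOM, AR, SL, ISO) and the estimates quoted in Section 5.
rp_table = [(15, 6, 7), (7, 5, 9), (3, 2, 14), (67, 24, 57)]
rates = [0.393, 0.480, 0.563, 0.493]
print(log_likelihood(rates, 1.664, rp_table), log_likelihood(rates, 1.0, rp_table))
```

The three cell probabilities sum to one for any valid $(π_{i}, R)$ , and R = 1 gives $ρ_{i} = 0$ , which is a quick check on the formulas.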

Substituting $π_{i} = π_{1} + Δ_{i} (i = 2, \dots, g)$ into $l_{1}$ , the log-likelihood function can be rewritten as

\begin{aligned} l_{2} (Δ_{i}; π_{1}, \dots, π_{i - 1}, π_{i + 1}, \dots, π_{g}, R) & = l_{1}^{'} + \dots + l_{i - 1}^{'} + l_{i + 1}^{'} + \dots + l_{g}^{'} \\ & \quad + m_{0 i} \log [R (π_{1} + Δ_{i})^{2} - 2 (π_{1} + Δ_{i}) + 1] \\ & \quad + m_{1 i} \log \{2 (π_{1} + Δ_{i}) [1 - R (π_{1} + Δ_{i})]\} \\ & \quad + m_{2 i} \log [R (π_{1} + Δ_{i})^{2}] . \end{aligned}

(3)

In the form of $l_{2}$ , $Δ_{i}$ $(i = 2, \dots, g)$ is the parameter of interest, and the remaining parameters are nuisance parameters.

3.1. The method of variance estimates recovery

A general approach [22,23] was developed to construct a CI for a difference between effect measures. Here we briefly summarize the method of variance estimates recovery (MOVER). Let $θ_{ℓ}$ be the proportion in population $ℓ (ℓ = 1, 2)$ , with estimate ${\hat{θ}}_{ℓ}$ . The problem of interest is to construct a CI (L, U) for $θ_{1} - θ_{2}$ . Let $(l_{1}, u_{1})$ and $(l_{2}, u_{2})$ be the separate two-sided $100 (1 - α) %$ CIs for $θ_{1}$ and $θ_{2}$ , respectively. To obtain L, the variances are estimated at $θ_{1} = l_{1}$ and $θ_{2} = u_{2}$ ; to obtain U, they are estimated at $θ_{1} = u_{1}$ and $θ_{2} = l_{2}$ . By applying the inversion principle, $\hat{var} ({\hat{θ}}_{1}) = ({\hat{θ}}_{1} - l_{1})^{2} / z_{1 - α / 2}^{2}$ for the lower limit and $\hat{var} ({\hat{θ}}_{1}) = (u_{1} - {\hat{θ}}_{1})^{2} / z_{1 - α / 2}^{2}$ for the upper limit. Similarly, $\hat{var} ({\hat{θ}}_{2}) = (u_{2} - {\hat{θ}}_{2})^{2} / z_{1 - α / 2}^{2}$ for the lower limit and $\hat{var} ({\hat{θ}}_{2}) = ({\hat{θ}}_{2} - l_{2})^{2} / z_{1 - α / 2}^{2}$ for the upper limit. (L, U) is traditionally given by ${\hat{θ}}_{1} - {\hat{θ}}_{2} \mp z_{1 - α / 2} \sqrt{\hat{var} ({\hat{θ}}_{1}) + \hat{var} ({\hat{θ}}_{2})}$ . Plugging these variance estimators into (L, U), the quantile cancels and we have

\begin{aligned} L & = {\hat{θ}}_{1} - {\hat{θ}}_{2} - \sqrt{({\hat{θ}}_{1} - l_{1})^{2} + (u_{2} - {\hat{θ}}_{2})^{2}} \quad \text{and} \\ U & = {\hat{θ}}_{1} - {\hat{θ}}_{2} + \sqrt{(u_{1} - {\hat{θ}}_{1})^{2} + ({\hat{θ}}_{2} - l_{2})^{2}} . \end{aligned}

The resulting expressions do not rely on specific distributions for ${\hat{θ}}_{i}$ .
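The MOVER limits translate directly into code. The sketch below is a minimal illustration with a function name of our choosing: it takes the two point estimates and their separate CIs and returns (L, U) for $θ_{1} - θ_{2}$ .

```python
import math


def mover_difference(est1, ci1, est2, ci2):
    """MOVER limits (L, U) for theta_1 - theta_2.

    est1, est2 are point estimates; ci1 = (l1, u1) and ci2 = (l2, u2) are the
    separate confidence limits for theta_1 and theta_2.
    """
    l1, u1 = ci1
    l2, u2 = ci2
    diff = est1 - est2
    lower = diff - math.sqrt((est1 - l1) ** 2 + (u2 - est2) ** 2)
    upper = diff + math.sqrt((u1 - est1) ** 2 + (est2 - l2) ** 2)
    return lower, upper


# Toy numbers (not from the paper): two proportions with symmetric CIs.
print(mover_difference(0.5, (0.4, 0.6), 0.3, (0.2, 0.4)))
```

For the many-to-one setting of this article, $θ_{1}$ would be $π_{i}$ and $θ_{2}$ would be $π_{1}$ , with the separate limits computed from a critical value c that accounts for multiplicity.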

To construct SCIs for $Δ_{i} = π_{i} - π_{1}$ , where $i = 2, \dots, g$ , a single CI for each proportion $π_{i}$ is needed. We review two single-CI estimators as follows.

3.1.1. Wilson score interval

The Wilson score interval [21] was developed for binomial proportions. Given Y responses in n trials and a critical value z, define $\tilde{n} = n + z^{2}$ , $\hat{p} = Y / n$ and $\tilde{p} = (Y + 0.5 z^{2}) / \tilde{n}$ . Then, the Wilson score CI is given by $\tilde{p} \mp (z / \tilde{n}) \sqrt{n \hat{p} (1 - \hat{p}) + z^{2} / 4}$ . Under the assumption in Section 3, each column of the data structure follows a multinomial distribution. In this case, the $2 m_{i}$ organs of the ith group play the role of the n trials and c replaces z, so that $\tilde{n} = 2 m_{i} + c^{2}$ , $\hat{p} = (m_{1 i} + 2 m_{2 i}) / (2 m_{i})$ and $\tilde{p} = (m_{1 i} + 2 m_{2 i} + 0.5 c^{2}) / (2 m_{i} + c^{2})$ , where c is the critical value of the multivariate normal distribution described in Section 2.
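The two steps above (the generic Wilson formula, then its application to one column of Table 1) can be sketched as follows. The helper names are ours, and the value of c in the example is an arbitrary placeholder rather than a quantile computed from Section 2.

```python
import math


def wilson_interval(y, n, z):
    """Wilson score interval for y responses in n trials with critical value z."""
    n_tilde = n + z * z
    p_hat = y / n
    p_tilde = (y + 0.5 * z * z) / n_tilde
    half = (z / n_tilde) * math.sqrt(n * p_hat * (1 - p_hat) + z * z / 4)
    return p_tilde - half, p_tilde + half


def wilson_bilateral(m0, m1, m2, c):
    """Wilson interval for pi_i from one column (m0, m1, m2) of Table 1.

    The 2 * m_i organs are treated as the binomial trials and m1 + 2 * m2 as
    the number of responses, as described in the text.
    """
    m = m0 + m1 + m2
    return wilson_interval(m1 + 2 * m2, 2 * m, c)


# DOM column of Table 5 with a placeholder critical value c = 2.2.
print(wilson_bilateral(15, 6, 7, c=2.2))
```

Note that treating the $2 m_{i}$ organs as binomial trials does not use the within-patient correlation, which is relevant when reading the MOVER results in Section 4.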

3.1.2. Agresti–Coull interval

Using the same notation for $\tilde{n}$ and $\tilde{p}$ , the Agresti–Coull interval [1] is given by $\tilde{p} \mp z \sqrt{\tilde{p} (1 - \tilde{p}) / \tilde{n}}$ . Similarly, under the assumption in Section 3, c replaces z, with $\tilde{n} = 2 m_{i} + c^{2}$ and $\tilde{p} = (m_{1 i} + 2 m_{2 i} + 0.5 c^{2}) / (2 m_{i} + c^{2})$ .
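A corresponding sketch for the Agresti–Coull interval applied to one column of Table 1 is given below. The function name is ours, and c = 1.96 in the example is only a placeholder for the critical value of Section 2.

```python
import math


def agresti_coull_bilateral(m0, m1, m2, c):
    """Agresti-Coull interval for pi_i from one column (m0, m1, m2) of Table 1."""
    m = m0 + m1 + m2
    n_tilde = 2 * m + c * c                        # adjusted number of organs
    p_tilde = (m1 + 2 * m2 + 0.5 * c * c) / n_tilde  # adjusted proportion
    half = c * math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half, p_tilde + half


# DOM column of Table 5 with a placeholder critical value.
print(agresti_coull_bilateral(15, 6, 7, c=1.96))
```

The interval is centred at $\tilde{p}$ , the same centre as the Wilson interval, and differs from it only in the half-width.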

3.2. Wald-type intervals

3.2.1. Under the null hypothesis

An intuitive sample estimate of $Δ_{i}$ is

{\hat{Δ}}_{i} = \frac{m_{1 i} + 2 m_{2 i}}{2 m_{i}} - \frac{m_{11} + 2 m_{21}}{2 m_{1}} .

It has been shown [20] that it is the maximum likelihood estimate (MLE) of $Δ_{i}$ under the null hypothesis ( $H_{0 i} : Δ_{i} = 0$ ), and the variance is

var ({\hat{Δ}}_{i}) = \frac{π_{i} [1 + (R_{i} - 2) π_{i}]}{2 m_{i}} + \frac{π_{1} [1 + (R_{i} - 2) π_{1}]}{2 m_{1}} .

The MLEs of $π_{i}$ and $R_{i}$ are given by [16]

{\hat{π}}_{i} = \frac{m_{1 i} + 2 m_{2 i}}{2 m_{i}} and {\hat{R}}_{i} = \frac{4 (m_{1} + m_{i}) (m_{21} + m_{2 i})}{(m_{11} + m_{1 i} + 2 m_{21} + 2 m_{2 i})^{2}} .

Substituting these in the variance we obtain the estimate of variance $\hat{var} ({\hat{Δ}}_{i})$ . Then, the Wald-type SCI of $Δ_{i}$ is given by

[max (- 1, {\hat{Δ}}_{i} - c \sqrt{\hat{var} ({\hat{Δ}}_{i})}), min (1, {\hat{Δ}}_{i} + c \sqrt{\hat{var} ({\hat{Δ}}_{i})})],

where c is the critical value of multivariate normal distribution defined in Section 2.
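The Wald-type limits of this subsection only require the counts of the control column and of the ith column. The following sketch follows the formulas above literally. The function name is ours, and the critical value in the example is the Bonferroni quantile for g = 4, used here only as a convenient stand-in for c.

```python
import math
from statistics import NormalDist


def wald_null_sci(control, treatment, c):
    """Wald-type limits for Delta_i = pi_i - pi_1 (Section 3.2.1).

    control and treatment are (m0, m1, m2) columns of Table 1; c is the
    critical value that accounts for multiplicity.
    """
    _, m11, m21 = control
    _, m1i, m2i = treatment
    m1, mi = sum(control), sum(treatment)
    pi1 = (m11 + 2 * m21) / (2 * m1)
    pii = (m1i + 2 * m2i) / (2 * mi)
    delta = pii - pi1
    r_hat = 4 * (m1 + mi) * (m21 + m2i) / (m11 + m1i + 2 * m21 + 2 * m2i) ** 2
    var = (pii * (1 + (r_hat - 2) * pii) / (2 * mi)
           + pi1 * (1 + (r_hat - 2) * pi1) / (2 * m1))
    half = c * math.sqrt(var)
    return max(-1.0, delta - half), min(1.0, delta + half)


# AR versus DOM from Table 5, with a Bonferroni quantile for three comparisons.
c = NormalDist().inv_cdf(1 - 0.05 / (2 * 3))
print(wald_null_sci((15, 6, 7), (7, 5, 9), c))
```

The truncation to $[- 1, 1]$ only matters when the untruncated limits fall outside the parameter range.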

3.2.2. Under the dependence assumption

The unconstrained MLEs of all unknown parameters ( $π_{1}, \dots, π_{g}, R$ ) were derived [10] based on the data structure shown in Table 1 under the dependence assumption. We apply the algorithm to construct SCI for $Δ_{i}$ using a contrast matrix

K_{g \times (g + 1)} = (\begin{matrix} 0 & 0 & 0 & \dots & 0 & 0 & 0 \\ - 1 & 1 & 0 & \dots & 0 & 0 & 0 \\ - 1 & 0 & 1 & \dots & 0 & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ & ⋮ & ⋮ \\ - 1 & 0 & 0 & \dots & 1 & 0 & 0 \\ - 1 & 0 & 0 & \dots & 0 & 1 & 0 \end{matrix}) .

By the invariance property of the MLE, we can obtain the MLE of $Δ_{i}$ by a simple linear transformation. Let $β = (π_{1}, \dots, π_{g}, R)$ , with corresponding MLE $\hat{β} = ({\hat{π}}_{1}, \dots, {\hat{π}}_{g}, \hat{R})$ . Then the MLE of $Δ_{i}$ is

{\hat{Δ}}_{i} = K_{i ∙} {\hat{β}}^{T},

where $K_{i ∙}$ is the ith row of $K$ ( $i = 2, \dots, g$ ).

According to the asymptotic normality of MLE under certain regularity conditions, $\sqrt{n} (\hat{β} - β) \overset{d}{\to} N_{g + 1} (0, I^{- 1})$ , where $I$ is the Fisher information matrix [10] for $β$ . The $100 (1 - α) %$ CI is defined as

K_{i ∙} {\hat{β}}^{T} \mp c \sqrt{K_{i ∙} I^{- 1} K_{i ∙}^{T}},

where c is the critical value of multivariate normal distribution described in Section 2. Considering the range of Δ, the Wald-type SCI is given by

[max (- 1, K_{i ∙} {\hat{β}}^{T} - c \sqrt{K_{i ∙} I^{- 1} K_{i ∙}^{T}}), min (1, K_{i ∙} {\hat{β}}^{T} + c \sqrt{K_{i ∙} I^{- 1} K_{i ∙}^{T}})] .
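Once $\hat{β}$ and an estimate of its covariance matrix are available, the contrast step is a few lines of arithmetic. The sketch below assumes that both are supplied by the caller (the MLE algorithm of [10] is not reproduced here), that the covariance matrix is already on the scale of $\hat{β}$ , and that a row of K has −1 in the control position and 1 in position i. The function name and all numbers in the example are hypothetical.

```python
import math


def wald_contrast_sci(beta_hat, cov, i, c):
    """Wald-type limits for Delta_i = pi_i - pi_1 from unconstrained MLEs.

    beta_hat = [pi_1, ..., pi_g, R]; cov is the estimated covariance matrix of
    beta_hat (list of rows); i is the 1-based group index (2 <= i <= g).
    """
    k = [0.0] * len(beta_hat)
    k[0], k[i - 1] = -1.0, 1.0                      # contrast row K_i
    est = sum(kj * bj for kj, bj in zip(k, beta_hat))
    var = sum(k[a] * cov[a][b] * k[b]
              for a in range(len(k)) for b in range(len(k)))
    half = c * math.sqrt(var)
    return max(-1.0, est - half), min(1.0, est + half)


# Hypothetical estimates for g = 2 groups plus R (illustrative numbers only).
beta_hat = [0.4, 0.5, 1.2]
cov = [[0.004, 0.000, 0.001],
       [0.000, 0.005, 0.001],
       [0.001, 0.001, 0.020]]
print(wald_contrast_sci(beta_hat, cov, i=2, c=2.0))
```

Because the contrast row has a zero in the position of R, the variance of $\hat{R}$ enters only through its effect on the estimated covariance of the ${\hat{π}}_{i}$ 's.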

3.3. Profile likelihood confidence interval

An asymptotic profile likelihood CI can be constructed from the $χ^{2}$ distribution by inverting the likelihood ratio test of the null hypothesis $H_{0} : Δ_{i} = Δ_{0}$ against $H_{a} : Δ_{i} \neq Δ_{0}$ . Here we describe the case of $Δ_{2} = π_{2} - π_{1}$ ; the procedures for finding the intervals of the other $Δ_{i}$ 's are similar. Let the vector $({\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R})$ denote the constrained MLE of $(π_{1}, π_{3}, \dots, π_{g}, R)$ under the null hypothesis. Then $({\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R})$ can be computed by solving the system of equations:

\begin{cases} \frac{\partial l_{2}}{\partial π_{i}} |_{Δ_{2} = Δ_{0}} = 0, & i = 1, 3, \dots, g, \\ \frac{\partial l_{2}}{\partial R} |_{Δ_{2} = Δ_{0}} = 0. \end{cases}

There is no closed form for $({\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R})$ , so we solve the maximum likelihood equations numerically using the Fisher scoring algorithm:

[\begin{matrix} π_{1}^{(t + 1)} \\ π_{3}^{(t + 1)} \\ ⋮ \\ π_{g}^{(t + 1)} \\ R^{(t + 1)} \end{matrix}] = [\begin{matrix} π_{1}^{(t)} \\ π_{3}^{(t)} \\ ⋮ \\ π_{g}^{(t)} \\ R^{(t)} \end{matrix}] + I^{- 1} (π_{1}^{(t)}, π_{3}^{(t)}, \dots, π_{g}^{(t)}, R^{(t)}) [\begin{matrix} \frac{\partial l_{2}}{\partial π_{1}} \\ \frac{\partial l_{2}}{\partial π_{3}} \\ ⋮ \\ \frac{\partial l_{2}}{\partial π_{g}} \\ \frac{\partial l_{2}}{\partial R} \end{matrix}] |_{(π_{i} = π_{i}^{(t)}, R = R^{(t)})},

for $i = 1, 3, \dots, g$ , where $I (π_{1}^{(t)}, π_{3}^{(t)}, \dots, π_{g}^{(t)}, R^{(t)})$ is the $g \times g$ Fisher information matrix (derivation and details are given in the Appendix). Let $({\hat{Δ}}_{2}, {\hat{π}}_{1}, {\hat{π}}_{3}, \dots, {\hat{π}}_{g}, \hat{R})$ be the unconstrained MLEs of $(Δ_{2}, π_{1}, π_{3}, \dots, π_{g}, R)$ under the alternative hypothesis. The likelihood ratio test statistic is twice the difference between $l_{2}$ in the form of (3) evaluated under the ‘full’ model and under the reduced model. Since the test statistic follows an asymptotic chi-square distribution with one degree of freedom under the null hypothesis, the CI for $Δ_{2}$ at the Bonferroni-adjusted level $1 - α / (g - 1)$ consists of the values in $[- 1, 1]$ that satisfy

l_{2} ({\hat{Δ}}_{2}, {\hat{π}}_{1}, {\hat{π}}_{3}, \dots, {\hat{π}}_{g}, \hat{R}) - l_{2} (Δ_{2}, {\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R}) \leq χ_{1 - α / (g - 1)}^{2} (1) / 2,

where $χ_{1 - α / (g - 1)}^{2} (1)$ is the $1 - α / (g - 1)$ quantile of the chi-square distribution with one degree of freedom. The CIs for the other $Δ_{i}$ 's can be constructed in a similar way, and thus the SCIs for all $Δ_{i}$ 's are obtained.

In practical computing, the following algorithm is used to find the roots of $Δ_{2}$ in the equation

l_{2} ({\hat{Δ}}_{2}, {\hat{π}}_{1}, {\hat{π}}_{3}, \dots, {\hat{π}}_{g}, \hat{R}) - l_{2} (Δ_{2}, {\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R}) = χ_{1 - α / (g - 1)}^{2} (1) / 2.

To search for the larger root (upper bound) of $Δ_{2}$ , the algorithm proceeds as follows:

1. Calculate the unconstrained MLEs $({\hat{Δ}}_{2}, {\hat{π}}_{1}, {\hat{π}}_{3}, \dots, {\hat{π}}_{g}, \hat{R})$ as initial estimates, and set the initial step size $= 0.1$ and flag $= 1$ .
2. Update ${\hat{Δ}}_{2}^{(t + 1)} = {\hat{Δ}}_{2}^{(t)} +$ step size × flag. Based on ${\hat{Δ}}_{2}^{(t + 1)}$ , calculate the constrained MLEs $({\tilde{π}}_{1}, {\tilde{π}}_{3}, \dots, {\tilde{π}}_{g}, \tilde{R})$ for the nuisance parameters, and evaluate the log-likelihood function in the form of (3), ${\hat{l}}_{2}^{(t + 1)}$ .
3. If $2 [l_{2} ({\hat{Δ}}_{2}, {\hat{π}}_{1}, {\hat{π}}_{3}, \dots, {\hat{π}}_{g}, \hat{R}) - {\hat{l}}_{2}^{(t + 1)}] - χ_{1 - α / (g - 1)}^{2} (1)$ has the same sign as at the previous iterate (it is negative at the starting value), the root has not been crossed: return to step 2. Otherwise, the root has been crossed: set $flag = - flag$ and step size $= 0.1 \times$ step size. If the step size is now sufficiently small (say below $10^{- 5}$ ), convergence is satisfactory: return ${\hat{Δ}}_{2}^{(t + 1)}$ and stop iterating. If not, return to step 2.

Similarly, to obtain the lower bound, repeat the iteration with initial step size $= 0.1$ , and flag $= - 1$ .
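The search above can be sketched as follows. To keep the example self-contained, the profile deviance $2 [l_{2} (\text{unconstrained}) - l_{2} (\text{constrained at } Δ_{2})]$ is passed in as a function, and a toy quadratic deviance stands in for it; computing the real one requires the constrained MLEs from the Fisher scoring step. The sketch reverses direction and shrinks the step by a factor of 10 each time the deviance crosses the chi-square threshold, which is our reading of the algorithm. All function names are ours.

```python
from statistics import NormalDist


def chi2_1_quantile(prob):
    """Quantile of the chi-square(1) distribution, using chi2(1) = Z^2."""
    return NormalDist().inv_cdf((1 + prob) / 2) ** 2


def search_limit(deviance, delta_hat, crit, direction, step=0.1, tol=1e-5):
    """Step-shrinking search for a root of deviance(delta) = crit.

    direction = +1 searches for the upper limit, -1 for the lower limit.
    """
    delta, flag, inside = delta_hat, direction, True  # deviance(delta_hat) = 0
    while step >= tol:
        delta = min(1.0, max(-1.0, delta + flag * step))
        now_inside = deviance(delta) < crit
        if now_inside != inside:          # threshold crossed: turn and shrink
            flag, step, inside = -flag, 0.1 * step, now_inside
        elif now_inside and abs(delta) == 1.0:
            break                         # reached the edge of [-1, 1]
    return delta


def toy_deviance(delta):
    """Toy stand-in for the profile deviance (assumption for illustration)."""
    return ((delta - 0.1) / 0.05) ** 2


crit = chi2_1_quantile(0.95)  # for g groups, use 1 - alpha / (g - 1) instead
print(search_limit(toy_deviance, 0.1, crit, -1),
      search_limit(toy_deviance, 0.1, crit, +1))
```

For the toy deviance the exact roots are $0.1 \mp 0.05 z_{0.975}$ , so the returned limits can be checked by hand.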

3.4. Score confidence interval

We first derive the score test of the null hypothesis $H_{0} : Δ_{i} = Δ_{0}$ against $H_{a} : Δ_{i} \neq Δ_{0}$ . We consider the case of $Δ_{2} = π_{2} - π_{1}$ ; the procedure for finding the CIs of the other $Δ_{i}$ 's is similar. The score test statistic $T_{SC}$ utilizes the constrained MLEs under $H_{0}$ . Then, $T_{SC}$ for testing $H_{0}$ is

T_{SC} = U I^{- 1} U^{T} |_{H_{0}},

where $U = (\partial l_{2} / \partial Δ_{2}, \partial l_{2} / \partial π_{1}, \partial l_{2} / \partial π_{3}, \dots, \partial l_{2} / \partial π_{g}, \partial l_{2} / \partial R)$ is the score vector and $I$ is the Fisher information matrix for $β = (Δ_{2}, π_{1}, π_{3}, \dots, π_{g}, R)^{T}$ . Note that $Δ_{2}$ is the parameter of interest, while $π_{i}$ and R are nuisance parameters. Because the nuisance parameters are evaluated at their constrained MLEs, the corresponding components of the score vanish. Therefore, the score function is $U = (\partial l_{2} / \partial Δ_{2}, 0, \dots, 0) |_{Δ_{2} = Δ_{0}}$ , so that the test statistic can be rewritten as

T_{SC} = {(\frac{\partial l_{2}}{\partial Δ_{2}})}^{2} I^{- 1} (1, 1),

where $I^{- 1} (1, 1)$ represents the $(1, 1)^{t h}$ element of $I^{- 1}$ , and the expression of $\partial l_{2} / \partial Δ_{2}$ is given in Appendix. $T_{SC}$ is asymptotically distributed as a chi-square distribution with 1 degree of freedom.

By inverting the score test, we can obtain the $100 (1 - α) %$ CI by including all $- 1 \leq Δ_{0} \leq 1$ that satisfy

T_{SC} \leq χ_{1 - α / (g - 1)}^{2} (1) .

As with the profile likelihood CI, the lower and upper limits are found numerically; here the bisection algorithm can be used to search for them.
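A bisection search for one limit can be sketched as below. The score statistic is passed in as a function of $Δ_{0}$ and a toy quadratic statistic is used in the example, since evaluating $T_{SC}$ itself requires the constrained MLEs and the Fisher information matrix. The sketch assumes that the statistic increases monotonically as $Δ_{0}$ moves away from the point estimate and that it can be evaluated at the end point supplied. All names are ours.

```python
from statistics import NormalDist


def bisect_limit(stat, inside, outside, crit, tol=1e-6):
    """Bisection for the point between `inside` and `outside` where stat = crit.

    `inside` must satisfy stat(inside) <= crit (e.g. the point estimate);
    `outside` is the end of the search range (e.g. -1 or 1).
    """
    if stat(outside) <= crit:
        return outside                     # the whole range is in the interval
    while abs(outside - inside) > tol:
        mid = (inside + outside) / 2
        if stat(mid) <= crit:
            inside = mid
        else:
            outside = mid
    return inside


def toy_score_stat(delta0):
    """Toy stand-in for T_SC as a function of Delta_0 (illustration only)."""
    return ((delta0 - 0.1) / 0.05) ** 2


g, alpha = 4, 0.05
# chi-square(1) quantile at level 1 - alpha/(g - 1), via the normal quantile.
crit = NormalDist().inv_cdf(1 - alpha / (2 * (g - 1))) ** 2
lower = bisect_limit(toy_score_stat, 0.1, -1.0, crit)
upper = bisect_limit(toy_score_stat, 0.1, 1.0, crit)
print(lower, upper)
```

The chi-square quantile with one degree of freedom is obtained here as the square of a standard normal quantile, which avoids any dependency beyond the standard library.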

4. Simulation studies

4.1. Simulations under selected sets of parameters

The performance of the proposed CIs is evaluated with both unbalanced ( $m_{i}$ 's are different) and balanced (all $m_{i}$ 's are equal) designs in terms of empirical coverage probability (ECP) and mean interval width (MIW). Let $π_{i}$ and R be chosen as $π_{i} = 0.3$ to 0.6 and $R = 0.5, 1.0, 1.5$ . The settings of the simulations are listed in Table 2.

Table 2. Scenarios of simulation.

g	Scenario	R	$π_{1}, π_{2}, \dots, π_{g}$	$m_{1}, m_{2}, \dots, m_{g}$
3	1	0.5	.4, .4, .5	50, 50, 50
	2	1.0	.4, .5, .6	50, 50, 50
	3	1.5	.3, .4, .5	50, 50, 50
	4	0.5	.4, .4, .5	40, 30, 60
	5	1.0	.4, .4, .5	40, 30, 60
	6	1.5	.4, .4, .5	40, 30, 60
4	7	0.5	.4, .4, .5, .5	50, 50, 50, 50
	8	1.0	.5, .4, .5, .6	50, 50, 50, 50
	9	1.5	.4, .3, .4, .5	50, 50, 50, 50
	10	0.5	.4, .4, .5, .5	40, 30, 50, 70
	11	1.0	.5, .4, .5, .6	40, 30, 50, 70
	12	1.5	.4, .3, .4, .5	40, 30, 50, 70
5	13	0.5	.4, .4, .5, .5, .5	50, 50, 50, 50, 50
	14	1.0	.45, .4, .45, .5, .6	50, 50, 50, 50, 50
	15	1.5	.4, .3, .4, .5, .6	50, 50, 50, 50, 50
	16	0.5	.4, .4, .5, .5, .5	40, 30, 50, 60, 70
	17	1.0	.45, .4, .45, .5, .6	40, 30, 50, 60, 70
	18	1.5	.4, .3, .4, .5, .6	40, 30, 50, 60, 70

Under each scenario, we simulate $\tilde{M}$ 10,000 times, i.e. 10,000 summary tables in the format of Table 1 are randomly generated. Specifically, ( $m_{0 i}, m_{1 i}, m_{2 i}$ ) are generated from the multinomial distribution $\mathrm{Mult} (m_{i}; p_{0 i}, p_{1 i}, p_{2 i})$ , where $p_{0 i} = R π_{i}^{2} - 2 π_{i} + 1$ , $p_{1 i} = 2 π_{i} (1 - R π_{i})$ and $p_{2 i} = R π_{i}^{2}$ are the cell probabilities shown in Section 3, and $(m_{0 i}, m_{1 i}, m_{2 i})^{T}$ is the ith column of $\tilde{M}$ . Then the 95% SCIs for $Δ_{i}$ are computed by the proposed methods for each $\tilde{M}$ . The ECP is defined as the proportion of simulated tables for which the true value $Δ_{0 i}$ lies within the constructed SCI, and the MIW is the mean of all SCI widths.
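The data-generating step can be sketched with the standard library alone. The code below draws one table $\tilde{M}$ for Scenario 4 of Table 2. The function names and the seed are our choices, and the multinomial draw is implemented by classifying each patient with a single uniform random number.

```python
import random


def simulate_group(m, pi, R, rng):
    """Draw (m0, m1, m2) ~ Mult(m; p0, p1, p2) under the R model."""
    p0 = R * pi**2 - 2 * pi + 1
    p1 = 2 * pi * (1 - R * pi)
    counts = [0, 0, 0]
    for _ in range(m):
        u = rng.random()
        # Classify the patient as having 0, 1 or 2 responses.
        counts[0 if u < p0 else 1 if u < p0 + p1 else 2] += 1
    return tuple(counts)


def simulate_table(ms, pis, R, rng):
    """One simulated table: a list of (m0, m1, m2) columns, control first."""
    return [simulate_group(m, pi, R, rng) for m, pi in zip(ms, pis)]


rng = random.Random(2020)
# Scenario 4 of Table 2: R = 0.5, pi = (.4, .4, .5), m = (40, 30, 60).
table = simulate_table([40, 30, 60], [0.4, 0.4, 0.5], 0.5, rng)
print(table)
```

Repeating the call 10,000 times and recording, for each table, whether the SCIs contain the true differences and how wide they are would yield the ECP and MIW defined above.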

Tables 3 and 4 present the simulated ECPs and MIWs under selected sets of parameters. MOVER1 and MOVER2 are MOVER SCIs with the Wilson score interval and the Agresti–Coull interval, respectively. Wald1 indicates the Wald-type SCIs under the null hypothesis, and Wald2 is the one under the dependence assumption.

Table 3. Simulation results: empirical coverage probability (%).

Case	R	MOVER1	MOVER2	Wald1	Wald2	Profile likelihood	Score
1	0.5	99.04	99.09	94.63	94.67	94.59	94.63
2	1	95.37	95.18	94.99	93.64	94.53	95.10
3	1.5	89.80	89.90	95.10	94.05	95.13	95.47
4	0.5	98.90	98.84	94.36	93.62	94.93	95.18
5	1	94.61	94.66	94.64	93.75	94.94	95.13
6	1.5	90.02	90.32	95.06	93.53	94.44	95.45
7	0.5	99.30	99.20	94.55	94.31	95.05	95.24
8	1.0	94.78	95.30	94.73	94.06	94.83	95.09
9	1.5	89.53	88.81	94.83	93.70	95.08	95.55
10	0.5	99.19	98.96	94.60	93.92	94.85	96.08
11	1.0	95.22	95.15	94.56	94.39	94.09	95.70
12	1.5	89.15	89.68	94.92	93.16	95.31	95.90
13	0.5	99.24	99.34	94.67	94.56	95.33	95.95
14	1.0	94.84	95.23	94.79	93.59	94.23	95.55
15	1.5	87.29	88.17	94.93	93.06	94.90	95.66
16	0.5	99.20	99.30	94.35	94.35	94.77	95.73
17	1.0	95.17	94.95	94.49	93.80	93.33	96.21
18	1.5	87.47	87.53	94.85	92.95	95.09	96.19

Note: MOVER1 = MOVER SCIs with the Wilson Score interval. MOVER2 = MOVER SCIs with the Agresti–Coull interval. Wald1 = Wald-type SCIs under the null hypothesis. Wald2 = Wald-type SCIs under the dependence assumption.

Table 4. Simulation results: mean interval width.

Case	R	MOVER1	MOVER2	Wald1	Wald2	Profile likelihood	Score
1	0.5	0.3005	0.3007	0.2410	0.2281	0.2340	0.2334
2	1.0	0.2997	0.3000	0.3096	0.3036	0.3077	0.3083
3	1.5	0.2912	0.2921	0.3471	0.3095	0.3149	0.3174
4	0.5	0.3317	0.3321	0.2696	0.2567	0.2614	0.2596
5	1.0	0.3315	0.3318	0.3437	0.3384	0.3404	0.3420
6	1.5	0.3213	0.3225	0.3823	0.3440	0.3484	0.3535
7	0.5	0.3187	0.3190	0.2536	0.2382	0.2448	0.2444
8	1.0	0.3197	0.3200	0.3282	0.3248	0.3296	0.3314
9	1.5	0.3133	0.3138	0.3732	0.3243	0.3334	0.3353
10	0.5	0.3425	0.3430	0.2765	0.2612	0.2672	0.2653
11	1.0	0.3440	0.3442	0.3545	0.3517	0.3547	0.3584
12	1.5	0.3360	0.3370	0.3996	0.3504	0.3590	0.3633
13	0.5	0.3310	0.3312	0.2628	0.2454	0.2525	0.2524
14	1.0	0.3313	0.3316	0.3408	0.3379	0.3423	0.3443
15	1.5	0.3254	0.3260	0.3974	0.3275	0.3340	0.3369
16	0.5	0.3497	0.3501	0.2815	0.2650	0.2716	0.2700
17	1.0	0.3504	0.3507	0.3615	0.3595	0.3628	0.3667
18	1.5	0.3436	0.3444	0.4164	0.3507	0.3568	0.3614

When the independence assumption is valid (i.e. R = 1), all methods have coverage close to the nominal level (95%) with similar MIWs. When the dependence assumption holds ( $R \neq 1$ ), neither of the MOVER SCIs achieves the nominal coverage: in Table 3 they are conservative for R = 0.5 (ECP about 99%) and anti-conservative for R = 1.5 (ECP between about 87% and 90%). In these settings, the Wald-type SCI under the dependence assumption, the profile likelihood SCI and the score SCI are slightly shorter than the Wald-type SCI under the null hypothesis. It is noteworthy that the Wald-type SCI under the dependence assumption obtains poor standard deviation estimates when the sample size is small, or when the data are very unequally distributed among the cells of the table. The profile likelihood and score SCIs still work well under these conditions, but they require more computing effort.

4.2. A more extensive simulation study

A more extensive simulation study was performed for g = 3 balanced designs with $m_{i} = 60$ in each group. In this part, various sets of parameters ( $π_{1}, π_{2}, π_{3}$ ) and R are randomly generated, where R is simulated from the uniform distribution $U (0, 2)$ , and the $π_{i}$ 's are generated independently from the uniform distribution $U (0, 1)$ . The probabilities $p_{0 i} = R π_{i}^{2} - 2 π_{i} + 1$ , $p_{1 i} = 2 π_{i} (1 - R π_{i})$ and $p_{2 i} = R π_{i}^{2}$ are computed to validate the parameter setting. All $p_{l i}$ 's need to be within the range of $[0, 1]$ for a set ( $π_{1}, π_{2}, π_{3}$ , R) to be considered valid and used in the actual simulation.
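The validity check amounts to a rejection step, sketched below. The function names and the seed are ours. A drawn parameter set is kept only if every cell probability lies in $[0, 1]$ .

```python
import random


def is_valid(pis, R):
    """True if all cell probabilities p0, p1, p2 lie in [0, 1] for every group."""
    for pi in pis:
        probs = (R * pi**2 - 2 * pi + 1, 2 * pi * (1 - R * pi), R * pi**2)
        if any(p < 0 or p > 1 for p in probs):
            return False
    return True


def draw_valid_parameters(rng, g=3):
    """Redraw R ~ U(0, 2) and pi_i ~ U(0, 1) until the set is valid."""
    while True:
        R = rng.uniform(0, 2)
        pis = [rng.uniform(0, 1) for _ in range(g)]
        if is_valid(pis, R):
            return pis, R


rng = random.Random(7)
pis, R = draw_valid_parameters(rng)
print(pis, R)
```

For example, $π_{i} = 0.9$ with R = 1.5 is rejected because $p_{1 i} = 2 π_{i} (1 - R π_{i})$ would be negative.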

Under each parameter setting, similar to the simulation described in Section 4.1, 10,000 different data sets are simulated, with the corresponding 95% SCIs for $Δ_{i}$ computed using the proposed methods. An ECP and an MIW can be calculated from the 10,000 computed SCIs in each parameter setting. We repeat the process 1,000 times (1,000 sets of parameters $π_{1}, π_{2}, π_{3}$ and R), resulting in 1,000 simulated ECPs and MIWs.

Figures 1 and 2 summarize the distribution of the 1,000 simulated ECPs and MIWs. Both Wald-type SCIs, the profile likelihood SCI and the score SCI give median coverage probabilities close to the nominal level. Both MOVER intervals have poor coverage probabilities (median below 92%), with median widths very close to those of the Wald-type SCI under the dependence assumption and the profile likelihood SCI. The Wald-type SCI under the null hypothesis has a relatively broader median width. The score SCI has a broader median width than the Wald-type SCI under the dependence assumption and the profile likelihood SCI. Overall, the Wald-type SCI under the dependence assumption, the profile likelihood SCI and the score SCI are favored and thus are recommended.

Figure 1. — Boxplots of empirical coverage probabilities.

Figure 2. — Boxplots of mean interval width.

5. A real example

We revisited the dataset [16] obtained from an outpatient population of 218 subjects at the Massachusetts Eye and Ear Infirmary. Table 5 summarizes the prevalence of affected eyes in an outpatient population aged 20–39 with retinitis pigmentosa (RP). They were classified on the basis of a detailed family history into four genetic types (Dominant (DOM), Recessive (AR), Sex-linked (SL) and Isolate (ISO)). The question arises as to whether the affection rates differ between genetic types, with DOM as the control. SCIs are an appropriate tool for this analysis.

Table 5. Prevalence of retinitis pigmentosa.

	Genetic type
Number of affected eyes	DOM	AR	SL	ISO
0	15	7	3	67
1	6	5	2	24
2	7	9	14	57

Applying the proposed methods, we have $\hat{R} = 1.664$ , which suggests large correlation. Using the MOVER SCIs or the Wald-type SCI under the null hypothesis could yield biased results. According to the results of the simulation study, the Wald-type SCI under the dependence assumption, the profile likelihood SCI and the score SCI are recommended. The affection rates for the genetic types are DOM 0.393, AR 0.480, SL 0.563 and ISO 0.493. The Wald-type SCI under the dependence assumption gives: AR-DOM ( $- 0.113$ , 0.287), SL-DOM ( $- 0.001$ , 0.341), ISO-DOM ( $- 0.056$ , 0.256). The profile likelihood SCI gives: AR-DOM ( $- 0.121, 0.295$ ), SL-DOM ( $- 0.001, 0.305$ ), ISO-DOM ( $- 0.055, 0.195$ ). The score SCI gives: AR-DOM ( $- 0.111, 0.291$ ), SL-DOM (0.004, 0.319), ISO-DOM ( $- 0.067, 0.282$ ). Overall, we consider there is no statistically significant difference in the affection rates between genetic types.

6. Discussions

In this article, our aim is to find simple and reliable asymptotic SCIs for the difference of proportions, which require large samples. Small-sample trials using an exact method have been studied [17]. Also, we focus on adjusting for multiplicity in many-to-one comparisons. For all possible pairwise comparisons, there are other ways to control the false positive error rate, e.g. the Tukey and Scheffé methods. When constructing the profile likelihood SCI, the Bonferroni correction is used because it is computationally easy, but it has been criticized as conservative when g is large and the test statistics are correlated [3,4,13].

Another parametric model [6] (‘ρ model’) was suggested with an adjusted chi-square statistic for testing homogeneity. Based on this model, three asymptotic procedures [9] for testing the equality of proportions among g ( $\geq 2$ ) groups have been developed, along with the construction of asymptotic CIs [11] for the difference of proportions between two groups. Future work will address SCI construction based on the ‘ρ model.’ Sample size calculation [15] has been done for similar problems, and the sample size formula can also be extended to the g-group problem.

7. Supporting information

We have created an online calculator for readers to simulate bilateral data, calculate the proposed confidence intervals, and compute intervals using data collected from their own studies. The website can be accessed through the link: http://www.acsu.buffalo.edu/~cxma/CI_constant_R_model.htm.

Appendix. Fisher information matrix.

The second-order partial derivatives of $l_{2}$ with respect to $Δ_{2}, π_{1}, π_{3}, \dots, π_{g}$ , and R are

\begin{aligned} \frac{\partial^{2} l_{2}}{\partial Δ_{2}^{2}} & = \frac{2 m_{22}}{{(π_{1} + Δ_{2})}^{2}} - \frac{2 m_{22} (2 π_{1} + 2 Δ_{2})}{{(π_{1} + Δ_{2})}^{3}} \\ & \quad - \frac{2 R m_{02}}{2 π_{1} + 2 Δ_{2} - R {(π_{1} + Δ_{2})}^{2} - 1} \\ & \quad - \frac{m_{02} {[R (2 π_{1} + 2 Δ_{2}) - 2]}^{2}}{{[2 π_{1} + 2 Δ_{2} - R {(π_{1} + Δ_{2})}^{2} - 1]}^{2}} \\ & \quad - \frac{2 m_{12} [R (2 π_{1} + 2 Δ_{2}) + 2 R (π_{1} + Δ_{2}) - 2]}{[R (π_{1} + Δ_{2}) - 1] {(2 π_{1} + 2 Δ_{2})}^{2}} \\ & \quad + \frac{4 R m_{12}}{[R (π_{1} + Δ_{2}) - 1] (2 π_{1} + 2 Δ_{2})} \\ & \quad - \frac{R m_{12} [R (2 π_{1} + 2 Δ_{2}) + 2 R (π_{1} + Δ_{2}) - 2]}{{[R (π_{1} + Δ_{2}) - 1]}^{2} (2 π_{1} + 2 Δ_{2})}, \\ \frac{\partial^{2} l_{2}}{\partial π_{i}^{2}} & = \frac{m_{0 i} (- 2 R^{2} π_{i}^{2} + 4 R π_{i} + 2 R - 4)}{(R π_{i}^{2} - 2 π_{i} + 1)^{2}} \\ & \quad - \frac{2 m_{2 i}}{π_{i}^{2}} - \frac{(2 R^{2} π_{i}^{2} - 2 R π_{i} + 1) m_{1 i}}{π_{i}^{2} (R π_{i} - 1)^{2}}, \quad i = 1, 3, \dots, g, \\ \frac{\partial^{2} l_{2}}{\partial R^{2}} & = - \frac{S_{2}}{R^{2}} - \sum_{i = 1}^{g} [\frac{π_{i}^{2} m_{1 i}}{(R π_{i} - 1)^{2}} + \frac{π_{i}^{4} m_{0 i}}{(R π_{i}^{2} - 2 π_{i} + 1)^{2}}], \\ \frac{\partial^{2} l_{2}}{\partial Δ_{2} \partial π_{1}} & = \frac{\partial^{2} l_{2}}{\partial Δ_{2}^{2}}, \\ \frac{\partial^{2} l_{2}}{\partial Δ_{2} \partial π_{i}} & = 0, \quad i = 3, \dots, g, \\ \frac{\partial^{2} l_{2}}{\partial Δ_{2} \partial R} & = \frac{m_{12} (4 π_{1} + 4 Δ_{2})}{[R (π_{1} + Δ_{2}) - 1] (2 π_{1} + 2 Δ_{2})} - \frac{m_{02} (2 π_{1} + 2 Δ_{2})}{2 π_{1} + 2 Δ_{2} - R {(π_{1} + Δ_{2})}^{2} - 1} \\ & \quad - \frac{m_{02} {(π_{1} + Δ_{2})}^{2} [R (2 π_{1} + 2 Δ_{2}) - 2]}{{[2 π_{1} + 2 Δ_{2} - R {(π_{1} + Δ_{2})}^{2} - 1]}^{2}} \\ & \quad - \frac{m_{12} (π_{1} + Δ_{2}) [R (2 π_{1} + 2 Δ_{2}) + 2 R (π_{1} + Δ_{2}) - 2]}{{[R (π_{1} + Δ_{2}) - 1]}^{2} (2 π_{1} + 2 Δ_{2})}, \\ \frac{\partial^{2} l_{2}}{\partial π_{i} \partial R} & = - \frac{m_{1 i}}{(R π_{i} - 1)^{2}} - \frac{2 (π_{i} - 1) π_{i} m_{0 i}}{(R π_{i}^{2} - 2 π_{i} + 1)^{2}}, \quad i = 1, 3, \dots, g, \\ \frac{\partial^{2} l_{2}}{\partial π_{i} \partial π_{j}} & = 0, \quad i \neq j, \quad i, j = 1, 3, \dots, g . \end{aligned}

Therefore, the Fisher information matrix is given by

I (Δ_{2}, π_{1}, π_{3}, \dots, π_{g}, R) = (\begin{matrix} I_{Δ_{2}, Δ_{2}} & I_{Δ_{2}, 1} & 0 & \dots & 0 & I_{Δ_{2}, R} \\ I_{Δ_{2}, 1} & I_{1, 1} & 0 & \dots & 0 & I_{1, R} \\ 0 & 0 & I_{3, 3} & \dots & 0 & I_{3, R} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & I_{g, g} & I_{g, R} \\ I_{Δ_{2}, R} & I_{1, R} & I_{3, R} & \dots & I_{g, R} & I_{R, R} \end{matrix}),

where

\begin{aligned}
I_{Δ_{2}, Δ_{2}} & = E (- \frac{\partial^{2} l_{2}}{\partial Δ_{2}^{2}}) \\
& = \frac{2 R m_{2} (π_{1} + Δ_{2} - 2 R π_{1} - 2 R Δ_{2} + 2) (π_{1} + Δ_{2}) - 2 m_{2}}{(R π_{1} + R Δ_{2} - 1) (R {π_{1}}^{2} + 2 R π_{1} Δ_{2} - 2 π_{1} + R Δ_{2}^{2} - 2 Δ_{2} + 1) (π_{1} + Δ_{2})}, \\
I_{Δ_{2}, 1} & = E (- \frac{\partial^{2} l_{2}}{\partial Δ_{2} \partial π_{1}}) = I_{Δ_{2}, Δ_{2}}, \\
I_{Δ_{2}, R} & = E (- \frac{\partial^{2} l_{2}}{\partial Δ_{2} \partial R}) = 4 m_{2} (π_{1} + Δ_{2}) - m_{2} (2 π_{1} + 2 Δ_{2}) \\
& \quad - \frac{2 m_{2} (π_{1} + Δ_{2}) (2 R π_{1} + 2 R Δ_{2} - 1)}{R π_{1} + R Δ_{2} - 1} \\
& \quad + \frac{m_{2} {(π_{1} + Δ_{2})}^{2} (2 R π_{1} + 2 R Δ_{2} - 2)}{R {π_{1}}^{2} + 2 R π_{1} Δ_{2} - 2 π_{1} + R Δ_{2}^{2} - 2 Δ_{2} + 1}, \\
I_{1, 1} & = E (- \frac{\partial^{2} l_{2}}{\partial π_{1}^{2}}) = \frac{2 m_{1} (2 R^{2} π_{1}^{2} - R π_{1}^{2} - 2 R π_{1} + 1)}{π_{1} (R π_{1}^{2} - 2 π_{1} + 1) (1 - R π_{1})} \\
& \quad + \frac{2 m_{2} [2 R^{2} (π_{1} + Δ_{2})^{2} - R (π_{1} + Δ_{2})^{2} - 2 R (π_{1} + Δ_{2}) + 1]}{(π_{1} + Δ_{2}) [R (π_{1} + Δ_{2})^{2} - 2 (π_{1} + Δ_{2}) + 1] [1 - R (π_{1} + Δ_{2})]}, \\
I_{i, i} & = E (- \frac{\partial^{2} l_{2}}{\partial π_{i}^{2}}) = \frac{2 m_{i} (2 R^{2} π_{i}^{2} - R π_{i}^{2} - 2 R π_{i} + 1)}{π_{i} (R π_{i}^{2} - 2 π_{i} + 1) (1 - R π_{i})}, \quad i = 3, \dots, g, \\
I_{1, R} & = E (- \frac{\partial^{2} l_{2}}{\partial π_{1} \partial R}) = \frac{2 m_{1} π_{1}^{2} (R - 1)}{(R π_{1}^{2} - 2 π_{1} + 1) (1 - R π_{1})} \\
& \quad + \frac{2 m_{2} (π_{1} + Δ_{2})^{2} (R - 1)}{[R (π_{1} + Δ_{2})^{2} - 2 (π_{1} + Δ_{2}) + 1] [1 - R (π_{1} + Δ_{2})]}, \\
I_{i, R} & = E (- \frac{\partial^{2} l_{2}}{\partial π_{i} \partial R}) = \frac{2 m_{i} π_{i}^{2} (R - 1)}{(R π_{i}^{2} - 2 π_{i} + 1) (1 - R π_{i})}, \quad i = 3, \dots, g, \\
I_{R, R} & = E (- \frac{\partial^{2} l_{2}}{\partial R^{2}}) = \sum_{i = 1}^{g} \frac{m_{i} π_{i}^{2} (2 π_{i} - R π_{i} - 1)}{R (R π_{i}^{2} - 2 π_{i} + 1) (R π_{i} - 1)},
\end{aligned}

where $π_{2} = π_{1} + Δ_{2}$.
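The block structure of the matrix can be made concrete with a short Python sketch that assembles it from the closed-form entries. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the argument layout are hypothetical, $m_{i}$ is taken to be the number of patients in group $i$, and the entries $I_{Δ_{2}, Δ_{2}}$ and $I_{Δ_{2}, R}$ are computed from the per-group forms of $I_{i, i}$ and $I_{i, R}$ evaluated at $π_{2} = π_{1} + Δ_{2}$, to which the displayed expressions appear to simplify.

```python
def info_pi_pi(m_i, p, R):
    """Per-group information for a response rate p (the form of I_{i,i})."""
    num = 2 * m_i * (2 * R**2 * p**2 - R * p**2 - 2 * R * p + 1)
    return num / (p * (R * p**2 - 2 * p + 1) * (1 - R * p))


def info_pi_R(m_i, p, R):
    """Per-group cross information between p and R (the form of I_{i,R})."""
    return 2 * m_i * p**2 * (R - 1) / ((R * p**2 - 2 * p + 1) * (1 - R * p))


def info_R_R(m_i, p, R):
    """Per-group contribution to I_{R,R}."""
    return m_i * p**2 * (2 * p - R * p - 1) / (R * (R * p**2 - 2 * p + 1) * (R * p - 1))


def fisher_information(delta2, pi1, pi_rest, R, m):
    """Matrix for (Delta_2, pi_1, pi_3, ..., pi_g, R) as nested lists.

    pi_rest holds (pi_3, ..., pi_g); m holds (m_1, ..., m_g).
    Both layouts are assumptions made for this sketch.
    """
    g = len(m)
    if len(pi_rest) != g - 2:
        raise ValueError("pi_rest must contain g - 2 values")
    pi2 = pi1 + delta2
    n = g + 1
    last = n - 1
    info = [[0.0] * n for _ in range(n)]

    i_dd = info_pi_pi(m[1], pi2, R)   # group 2 enters through pi_1 + Delta_2
    i_dr = info_pi_R(m[1], pi2, R)
    info[0][0] = i_dd
    info[0][1] = info[1][0] = i_dd    # I_{Delta_2,1} = I_{Delta_2,Delta_2}
    info[0][last] = info[last][0] = i_dr
    info[1][1] = info_pi_pi(m[0], pi1, R) + i_dd
    info[1][last] = info[last][1] = info_pi_R(m[0], pi1, R) + i_dr

    # Groups 3, ..., g only interact with R.
    for k, (p, m_i) in enumerate(zip(pi_rest, m[2:]), start=2):
        info[k][k] = info_pi_pi(m_i, p, R)
        info[k][last] = info[last][k] = info_pi_R(m_i, p, R)

    all_pis = [pi1, pi2] + list(pi_rest)
    info[last][last] = sum(info_R_R(m_i, p, R) for m_i, p in zip(m, all_pis))
    return info


# Hypothetical example with g = 3 groups.
example = fisher_information(delta2=0.05, pi1=0.3, pi_rest=[0.4], R=1.5, m=[50, 40, 60])
for row in example:
    print(["%.3f" % value for value in row])
```

Nested lists keep the sketch free of dependencies; in practice a numerical array library would be the natural choice for inverting the matrix.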

For a given $Δ_{2} = Δ_{0}$, the Fisher information matrix of $(π_{1}, π_{3}, \dots, π_{g}, R)$ is the bottom-right $g \times g$ submatrix of $I (Δ_{2}, π_{1}, π_{3}, \dots, π_{g}, R)$, that is, the matrix that remains after deleting the row and column corresponding to $Δ_{2}$.
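In code, taking this submatrix amounts to dropping the first row and the first column. A minimal Python sketch follows; the function name and the numbers in the example matrix are hypothetical and serve only to show the block pattern for $g = 3$.

```python
def nuisance_information(full_info):
    """Return the bottom-right submatrix, i.e. drop the Delta_2 row and column."""
    return [row[1:] for row in full_info[1:]]


# Hypothetical 4 x 4 matrix (g = 3) ordered as (Delta_2, pi_1, pi_3, R).
full = [
    [5.0, 5.0, 0.0, 1.0],
    [5.0, 9.0, 0.0, 2.0],
    [0.0, 0.0, 4.0, 0.5],
    [1.0, 2.0, 0.5, 3.0],
]
sub = nuisance_information(full)
for row in sub:
    print(row)
```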

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Agresti A. and Coull B.A., Approximate is better than ‘exact’ for interval estimation of binomial proportions, Am. Stat. 52 (1998), pp. 119–126.
2. Ahn C., Jung S.H., and Donner A., Application of an adjusted $χ^{2}$ statistic to site-specific data in observational dental studies, J. Clin. Periodontol. 29 (2002), pp. 79–82. doi: 10.1034/j.1600-051x.2002.290112.x
3. Aickin M. and Gensler H., Adjusting for multiple testing when reporting research results: The Bonferroni vs Holm methods, Am. J. Public Health 86 (1996), pp. 726–728. doi: 10.2105/AJPH.86.5.726
4. Bland J.M. and Altman D.G., Multiple significance tests: The Bonferroni method, Br. Med. J. 310 (1995), p. 170. doi: 10.1136/bmj.310.6973.170
5. Dallal G.E., Paired Bernoulli trials, Biometrics 44 (1988), pp. 253–257. doi: 10.2307/2531913
6. Donner A., Statistical methods in ophthalmology: An adjusted chi-square approach, Biometrics 45 (1989), pp. 605–611. doi: 10.2307/2531501
7. Dunnett C.W., A multiple comparison procedure for comparing several treatments with a control, J. Amer. Statist. Assoc. 50 (1955), pp. 1096–1121. doi: 10.1080/01621459.1955.10501294
8. Jung S.H., Ahn C., and Donner A., Evaluation of an adjusted chi-square statistic as applied to observational studies involving clustered binary data, Stat. Med. 20 (2001), pp. 2149–2161. doi: 10.1002/sim.857
9. Ma C.X. and Liu S., Testing equality of proportions for correlated binary data in ophthalmologic studies, J. Biopharm. Stat. 27 (2016), pp. 611–619. doi: 10.1080/10543406.2016.1167072
10. Ma C., Shan G., and Liu S., Homogeneity test for correlated binary data, PLoS One 10 (2015), p. e0124337. doi: 10.1371/journal.pone.0124337
11. Pei Y., Tang M.L., Wong W.K., and Guo J., Confidence intervals for correlated proportion differences from paired data in a two-arm randomised clinical trial, Stat. Methods Med. Res. 21 (2012), pp. 167–187. doi: 10.1177/0962280210365018
12. Peng X., Liu C., Liu S., and Ma C.X., Asymptotic confidence interval construction for proportion ratio based on correlated paired data, J. Biopharm. Stat. 29 (2019), pp. 1137–1152. doi: 10.1080/10543406.2019.1584629
13. Perneger T.V., What's wrong with Bonferroni adjustments, Br. Med. J. 316 (1998), pp. 1236–1238. doi: 10.1136/bmj.316.7139.1236
14. Piegorsch W.W., Multiple comparisons for analyzing dichotomous response, Biometrics 47 (1991), pp. 45–52. doi: 10.2307/2532494
15. Qiu S.F., Tang N.S., Tang M.L., and Pei Y.B., Sample size for testing difference between two proportions for the bilateral-sample design, J. Biopharm. Stat. 19 (2009), pp. 857–871. doi: 10.1080/10543400903105372
16. Rosner B., Statistical methods in ophthalmology: An adjustment for the intraclass correlation between eyes, Biometrics 38 (1982), pp. 105–114. doi: 10.2307/2530293
17. Tang M.L., Ling M.H., and Tian G.L., Exact and approximate unconditional confidence intervals for proportion difference in the presence of incomplete data, Stat. Med. 28 (2009), pp. 625–641. doi: 10.1002/sim.3490
18. Tang N.S., Qiu S.F., Tang M.L., and Pei Y.B., Asymptotic confidence interval construction for proportion difference in medical studies with bilateral data, Stat. Methods Med. Res. 20 (2011), pp. 233–259. doi: 10.1177/0962280209358135
19. Tang N.S., Tang M.L., and Qiu S.F., Testing the equality of proportions for correlated otolaryngologic data, Comput. Stat. Data Anal. 52 (2008), pp. 3719–3729. doi: 10.1016/j.csda.2007.12.017
20. Tang M.L., Tang N.S., and Rosner B., Statistical inference for correlated data in ophthalmologic studies, Stat. Med. 25 (2006), pp. 2771–2783. doi: 10.1002/sim.2425
21. Wilson E.B., Probable inference, the law of succession, and statistical inference, J. Amer. Statist. Assoc. 22 (1927), pp. 209–212. doi: 10.1080/01621459.1927.10502953
22. Zou G.Y., On the estimation of additive interaction by use of the four-by-two table and beyond, Am. J. Epidemiol. 168 (2008), pp. 212–224. doi: 10.1093/aje/kwn104
23. Zou G.Y. and Donner A., Construction of confidence limits about effect measures: A general approach, Stat. Med. 27 (2008), pp. 1693–1702. doi: 10.1002/sim.3095

