Closed Testing of Each Group Versus the Others Combined in a Multiple Group Analysis

John M Lachin; Ionut Bebu

doi:10.1177/1740774519879932

Author manuscript; available in PMC: 2020 Apr 10.

Published in final edited form as: Clin Trials. 2019 Oct 24;17(1):77–86. doi: 10.1177/1740774519879932


PMCID: PMC7147831 NIHMSID: NIHMS1539768 PMID: 31647326

Abstract

Background:

Many studies, such as a study of comparative effectiveness, entail a comparison of the beneficial and adverse effects of multiple K > 2 competing therapies. Often the analysis consists of a comparison of the K groups using an omnibus (T²-like) test for any difference among the groups followed by pairwise comparisons with adjustments for multiple tests.

Methods:

We evaluate the properties of an analysis strategy in which each group is compared to the average of the others in hopes of establishing the overall superiority (or harm) of at least one of the therapies. Testing of one-versus-others can be accomplished for virtually any model using simple tests and the type I error probability α can be controlled by conducting such tests under the closed testing principle. Testing using linear models, the family of generalized linear models (GLMs) and Cox proportional hazards are described with examples.

Results:

Since each tested hypothesis compares one treatment to the average of the others, the K–level null hypothesis in the tree of closed testing is equivalent to any of the K – 1 level tests, thus reducing the number of tests required. This applies to linear, generalized linear and Cox PH models. While the Bonferroni, Holm, and Hommel procedures preserve the desired level α, all are conservative relative to closed one-versus-others testing and closed testing in general provides greater power.

Conclusions:

Testing each of multiple treatments versus the average of the others is readily and efficiently conducted under the closed testing principle and may be especially useful in the assessment of studies of comparative effectiveness.

ClinicalTrials.gov Identifier for the ALLHAT study: NCT00000542

Keywords: Comparative Effectiveness, One-versus-Others Combined, Closed Testing

Background

The American Recovery and Reinvestment Act (ARRA) of 2009 reinvigorated the interest of the National Institutes of Health and other agencies (e.g. PCORI) in comparative effectiveness research aimed at identifying the most effective (and safest) therapy among the many that may be available to treat a specific condition [1]. Many such studies involve the comparison of multiple K > 2 groups. The objective is to then determine which treatment, if any, is “best”.

For example, the NIDDK-funded Glycemia Reduction Approaches in Diabetes, A Comparative Effectiveness Study (GRADE) is designed to compare the effectiveness of four classes of drugs commonly used to treat early type 2 diabetes of less than 10 years duration [2]. All 5047 participants had previously been treated with metformin alone and were randomly assigned to receive a second drug from among one of four commonly used drug classes: either glimepiride (a sulfonylurea), sitagliptin (a DPP-4 inhibitor), liraglutide (a GLP-1 agonist) or glargine (a basal insulin). All medications have been approved for use by the FDA and each is administered in accordance with the FDA-approved drug labeling. There is no placebo or “comparator” group. The cohort of 5047 participants is being followed until the summer of 2021 with follow-up ranging from 4 to 7.5 years, depending on the time of entry.

The primary analyses will compare the long-term differences between treatment groups in metabolic status as measured by the HbA1c levels over time. Traditionally this would be conducted using a T²-like omnibus test for any difference in any direction among the 4 groups. To illustrate, let μ_j denote the mean within the jth group, j = 1, . . . , K. A T²-like test on K – 1 df would provide a test of the joint null hypothesis

H_{0} : μ_{1} = μ_{2} = \dots = μ_{K}

(1)

against the global alternative

H_{1} : μ_{i} \neq μ_{j} for some 1 \leq i < j \leq K .

(2)

Alternately, all K(K − 1)/2 pairwise comparisons could be conducted between the pairs of groups. To protect against inflation of the type I error probability, the set of pairwise tests would be conducted with an appropriate adjustment for multiple tests. The simplest adjustment would be an improved Bonferroni procedure such as that of Holm or Hommel [3], among others. Alternately, closed sequential testing might be employed [4].

However, neither the omnibus test nor the pairwise tests will address whether one treatment is superior to the other treatments combined for a given outcome. Herein we propose a different approach that compares each group in turn to the other treatments combined, such as group 1 versus groups 2, 3 and 4 combined, then group 2 versus groups 1, 3 and 4 combined, etc. The comparison of group 1 versus the others would compare μ₁ versus (μ₂ + μ₃ + μ₄)/3; and likewise for group 2 versus others, etc. Each would constitute a simple contrast among the 4 groups that is interpretable in the context of the GRADE study.

Herein we first show that this one-versus-others approach for the comparison of group means is conveniently conducted using the closed testing principle, followed by examples, simulations and numerical computations to show that it may have favorable properties in some settings. This is followed by a description of its implementation using the family of generalized linear regression models (GLMs) that includes logistic and Poisson regression models, among others, and using Cox Proportional Hazards models for event time data. We then present a detailed analysis of the past-completed “Antihypertensive and Lipid-lowering Treatment to Prevent Heart Attack Trial” (ALLHAT) [7].

Methods

One-Versus-Others Closed Testing

Consider a study comparing the means {μ_i} of K = 3 treatments versus the others using a testing procedure that preserves the type I error probability at the desired level α. The null one-versus-others elemental hypotheses of interest are

H_{01} : μ_{1} = \frac{μ_{2} + μ_{3}}{2}, H_{02} : μ_{2} = \frac{μ_{1} + μ_{3}}{2}, and H_{03} : μ_{3} = \frac{μ_{1} + μ_{2}}{2} .

(3)

To account for the three tests, the p -values could be adjusted for multiplicity using the Bonferroni or Holm (or other) adjustment. Alternatively, the closed testing procedure [4] provides strong control of type I error probability. To reject a simple hypothesis (e.g., H₀₁) at level α, one needs to reject at level α all intersection hypotheses that include that particular simple hypothesis. More specifically, to reject H₀₁ at level α, one needs to reject H₀₁, H₀₁∩H₀₂, H₀₁ ∩ H₀₃ and H₀₁ ∩ H₀₂ ∩ H₀₃ at level α. A similar testing tree would apply to the tests of H₀₂ and H₀₃. Note that the K = 3 level intersection hypothesis is equivalent to the joint null hypothesis of equality in the three groups H₀,₁₂₃: μ₁ = μ₂ = μ₃.

However, in this instance, the testing tree can be further simplified since any level 2 intersection hypothesis, such as H₀₁ ∩ H₀₂, is equivalent to the joint null hypothesis H₀,₁₂₃. For example, H₀₁ ∩ H₀₂ specifies that

\begin{array}{l} μ_{1} = \frac{μ_{2} + μ_{3}}{2} and μ_{2} = \frac{μ_{1} + μ_{3}}{2} \\ \Rightarrow μ_{3} = 2 μ_{1} - μ_{2} = 2 μ_{2} - μ_{1} \\ \Rightarrow μ_{1} = μ_{2} = μ_{3} . \end{array}

(4)

Therefore, with three treatments, if the 2 df test of the joint null hypothesis H₀,₁₂₃ is rejected at level α, then one can proceed to test each of the elementary hypotheses H₀₁, H₀₂ and H₀₃ at level α without the need to specifically test the intersection hypotheses H₀₁ ∩H₀₂, H₀₁ ∩H₀₃ and H₀₂ ∩ H₀₃.

Similarly, for K = 4 treatments, there are four individual hypotheses:

\begin{array}{l} H_{01} : μ_{1} = \frac{μ_{2} + μ_{3} + μ_{4}}{3}, H_{02} : μ_{2} = \frac{μ_{1} + μ_{3} + μ_{4}}{3} \\ H_{03} : μ_{3} = \frac{μ_{1} + μ_{2} + μ_{4}}{3}, H_{04} : μ_{4} = \frac{μ_{1} + μ_{2} + μ_{3}}{3} . \end{array}

(5)

To reject H₀₁ at level α using the closed testing principle, one needs to reject at level α all intersection hypotheses that include H₀₁, namely

\begin{array}{l} level 4 : H_{01} \cap H_{02} \cap H_{03} \cap H_{04} \\ level 3 : H_{01} \cap H_{02} \cap H_{03}, H_{01} \cap H_{02} \cap H_{04}, H_{01} \cap H_{03} \cap H_{04} \\ level 2 : H_{01} \cap H_{02}, H_{01} \cap H_{03}, H_{01} \cap H_{04} \\ level 1 : H_{01} \end{array}

(6)

where the Level 4 intersection hypothesis is the joint null hypothesis

H_{0, 1234} : μ_{1} = μ_{2} = μ_{3} = μ_{4} .

(7)

Note, however, that H₀₁ specifies that

\begin{array}{l} μ_{1} = \frac{μ_{2} + μ_{3} + μ_{4}}{3} \\ \Rightarrow 3 μ_{1} = μ_{2} + μ_{3} + μ_{4} \Rightarrow 4 μ_{1} = μ_{1} + μ_{2} + μ_{3} + μ_{4}, \end{array}

(8)

or equivalently

μ_{1} = \bar{μ} = \frac{μ_{1} + μ_{2} + μ_{3} + μ_{4}}{4} .

(9)

Thus, the elemental hypotheses H₀₂, H₀₃, and H₀₄ also specify that μ₂, μ₃ and μ₄ equal $\bar{μ}$ so that each of the Level 3 intersection hypotheses is also equivalent to H₀,₁₂₃₄. For example, H₀₁ ∩ H₀₂ ∩ H₀₃ specifies that $μ_{1} = μ_{2} = μ_{3} = \bar{μ}$ that can only occur if $μ_{4} = \bar{μ}$ which yields H₀,₁₂₃₄. Then the Level 2 intersection hypotheses specify linear functions of the means, such as H₀₁ ∩ H₀₂ which specifies that μ₁ = μ₂ = (μ₃ + μ₄)/2.

To summarize, with four groups it follows that rejecting H₀₁ at level α requires rejecting H₀₁, H₀₁ ∩H₀₂, H₀₁ ∩H₀₃, H₀₁ ∩H₀₄, and H₀₁ ∩H₀₂ ∩H₀₃ each at level α. Note that the test of any level 3 hypothesis involving H₀₁ suffices to test all other level 3 hypotheses involving H₀₁, as well as the Level K = 4 intersection hypothesis. Also note that this equivalence applies for other values of K where the Level K and level K − 1 intersection hypotheses can be tested simultaneously using a test of any one of these hypotheses.

Simple Test Statistics

To describe a family of simple test statistics, we assume that the vector of estimates $\hat{μ} = {({\hat{μ}}_{1}, {\hat{μ}}_{2}, \dots, {\hat{μ}}_{K})}^{T}$ is asymptotically normally distributed with mean $μ = {(μ_{1}, \dots, μ_{K})}^{T}$ and covariance matrix $Cov (\hat{μ}) = Σ = diag (σ_{1}^{2} / n_{1}, \dots, σ_{K}^{2} / n_{K})$, $σ_{j}^{2}$ being the variance of the observations in the jth group of size n_j. These statistics could be a set of sample means or proportions or log (hazard rates), etc. Further, the design could be balanced (equal sample sizes) or not.

Then the joint null hypothesis (1) can be tested against the global alternative (2) with a T²-like test on K − 1 df using a (K − 1) × K contrast matrix of the form

C = {[\begin{matrix} 1 & 0 & \dots & 0 & - 1 \\ 0 & 1 & \dots & 0 & - 1 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & \dots & 1 & - 1 \end{matrix}]}^{T} or equivalently C = {[\begin{matrix} 1 & - q & \dots & - q & - q \\ - q & 1 & \dots & - q & - q \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ - q & - q & \dots & 1 & - q \end{matrix}]}^{T}

(10)

where q = 1/(K − 1), both satisfying C^T J = 0 for a K element unit vector J. Given the vector of estimates $\hat{μ}$ and a consistent estimate of the covariance matrix $\hat{Σ} = \widehat{Cov} (\hat{μ})$, the test statistic

X_{K - 1}^{2} = {\hat{μ}}^{T} C {(C^{T} \hat{Σ} C)}^{- 1} C^{T} \hat{μ}

(11)

is asymptotically distributed as chi-square on K − 1 df.

Moreover, any intersection hypothesis can also be tested using an appropriate contrast matrix. Let I = {i₁, . . . , i_{m_I}} denote a subset of m_I of the K treatments. The hypothesis H_{0I} = ∩_{i∈I} H_{0i} can be tested using the contrast matrix C_I with rows given by c_i with elements c_ii = 1 and c_ij = −1/(K − 1) for j ≠ i where i ∈ I. This yields a T²-like test on m_I df. For example, consider a test of the intersection hypothesis H₀₁ ∩ H₀₂ that equals the hypothesis μ₁ = μ₂ = (μ₃ + μ₄)/2. This would be tested using a contrast matrix

C = {[\begin{matrix} 1 & - 1 / 3 & - 1 / 3 & - 1 / 3 \\ - 1 / 3 & 1 & - 1 / 3 & - 1 / 3 \end{matrix}]}^{T} or equivalently C = {[\begin{matrix} 1 & - 1 & 0 & 0 \\ 1 & 0 & - 0.5 & - 0.5 \end{matrix}]}^{T} .

(12)

on 2 df. Then each individual hypothesis H_0i (i = 1, . . . K) can be tested using the contrast $c_{i}^{T} \hat{μ}$ . Again, the K level and all (K − 1) level tests are numerically equivalent.

For the ith group, the elemental hypothesis H_0i in (5) can also be tested using a large sample Z-test

Z_{i} = \frac{{\hat{μ}}_{i} - \frac{\sum_{j \neq i} {\hat{μ}}_{j}}{K - 1}}{S E}

(13)

where SE is the root of the variance of the numerator, this being

V = \frac{{\hat{σ}}_{i}^{2}}{n_{i}} + \sum_{j \neq i} \frac{{\hat{σ}}_{j}^{2}}{{(K - 1)}^{2} n_{j}} .

(14)

The statistic Z_i can then be used to conduct a one- or a two-sided test at level α.

Results

Analysis of Means

McMillan-Price et al. [5] evaluated weight loss in 129 overweight or obese young adults (BMI≥25) who were randomized to one of four diets: Diet 1: high carbohydrate and high glycemic index; Diet 2: high carbohydrate and low glycemic index; Diet 3: high protein and high glycemic index; and Diet 4: high protein and low glycemic index. The mean reductions and standard errors were

	Diet: 1	2	3	4
n	32	32	32	33
Mean	4.2	5.5	6.2	4.8
SE	0.6	0.5	0.4	0.7


This yields the following closed set of tests of hypotheses comparing one group versus the others where ∗ designates a non-significant test such that further testing for the groups involved is not conducted

\begin{array}{lrrr} Test & χ^{2} & df & p \\ H_{01} \cap H_{02} \cap H_{03} & 8.727 & 3 & 0.034 \\ H_{01} \cap H_{02} * & 3.686 & 2 & 0.159 \\ H_{01} \cap H_{03} & 8.330 & 2 & 0.016 \\ H_{01} \cap H_{04} & 6.708 & 2 & 0.035 \\ H_{02} \cap H_{03} & 7.761 & 2 & 0.021 \\ H_{02} \cap H_{04} * & 0.675 & 2 & 0.714 \\ H_{03} \cap H_{04} & 6.716 & 2 & 0.035 \\ H_{03} & 6.618 & 1 & 0.010 \end{array}

(15)

The 3 df test is significant at the 0.05 level so that the 2-level tests can then be conducted. All those involving group 3 are significant at the 0.05 level, whereas the tests of H₀₁ ∩ H₀₂ and H₀₂ ∩ H₀₄ fail to reach significance and thus the elemental hypotheses H₀₁, H₀₂ and H₀₄ are not tested. The test of group 3 versus the others is then significant at the 0.05 level comparing the mean in group 3 of 6.2 versus the average of 4.83 in groups 1, 2 and 4. Note that a simple test of H₀₃ with a Holm (or Bonferroni) adjustment for 4 tests would also have reached significance but with a larger adjusted p = 0.040.
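As an illustration, the chi-square values in (15) can be approximately recomputed from the tabulated means and standard errors with the statistic in (11), using rows c_i as the contrasts. The following is a minimal Python sketch, not the SAS program from the Supplemental Material; the helper names `solve`, `wald_chi2` and `chi2_sf` are hypothetical, and the tail probabilities are closed forms that this sketch implements only for 1 to 3 df.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def wald_chi2(subset, means, ses):
    """Chi-square for the intersection of H_0i, i in subset (1-based labels)."""
    K = len(means)
    q = 1.0 / (K - 1)
    rows = [[1.0 if j == i - 1 else -q for j in range(K)] for i in subset]
    var = [s * s for s in ses]          # diagonal covariance of the group means
    d = [sum(c * m for c, m in zip(r, means)) for r in rows]
    V = [[sum(a * b * v for a, b, v in zip(r1, r2, var)) for r2 in rows]
         for r1 in rows]
    return sum(di * xi for di, xi in zip(d, solve(V, d)))

def chi2_sf(x, df):
    """Upper tail probability of a chi-square variate for df = 1, 2 or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("only 1 to 3 df are handled in this sketch")

means = [4.2, 5.5, 6.2, 4.8]   # mean reductions, diets 1-4
ses = [0.6, 0.5, 0.4, 0.7]     # standard errors, diets 1-4
for subset in [(1, 2, 3), (1, 2), (1, 3), (3,)]:
    x2 = wald_chi2(subset, means, ses)
    print(subset, round(x2, 3), round(chi2_sf(x2, len(subset)), 3))
```

Because the inputs are rounded to one decimal place, small differences from the published p-values are to be expected.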

Analysis of Proportions

Treiman et al. [6] reported the results of a randomized clinical trial that compared four treatments (phenytoin, lorazepam, phenobarbital and diazepam & phenytoin) for convulsive status epilepticus. The proportions in the four groups who were successfully treated among the patients with overt status epilepticus were $\hat{p}$ = (0.436,0.649,0.582,0.558) and the number of participants in each group was n = (101,97,91,95) with estimated covariance of $\hat{p}$ being

\hat{Σ} = diag ({\hat{p}}_{i} (1 - {\hat{p}}_{i}) / n_{i}), i = 1, \dots, 4.

(16)

This yields the closed set of tests of hypotheses as in (15). The K-level Wald T²-test as in (11) yields p = 0.0199 for the 3 df test and the 2 df tests have p-values

\begin{array}{llllll} H_{01} \cap H_{02} & H_{01} \cap H_{03} & H_{01} \cap H_{04} & H_{02} \cap H_{03} & H_{02} \cap H_{04} & H_{03} \cap H_{04} \\ 0.0077 & 0.019 & 0.013 & 0.035 & 0.067 & 0.816 \end{array} .

(17)

Since all intersection tests involving H₀₁ are significant at the 0.05 level, the elementary hypothesis H₀₁ can also be tested and is significant at p = 0.0052 on 1 df. Thus, phenytoin, with the lowest proportion successfully treated (0.436), is significantly less effective than the average of the other three treatments (0.596).

Note that the nominal p-value for the test of H₀₂ (lorazepam versus others, not shown) is 0.029 but H₀₂ cannot be tested because the p-value for the test of the intersection hypothesis H₀₂ ∩ H₀₄ is 0.067. The Holm procedure would also reject H₀₁ (adjusted p-value=0.021) and fail to reject H₀₂ (adjusted p-value=0.087).
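The elemental tests and the Holm adjustment quoted above can be approximately reproduced from the reported proportions and sample sizes using the Z-test in (13) with variance (14). The sketch below is illustrative Python rather than the SAS program in the Supplemental Material; the function names are hypothetical and the Holm adjustment is the standard step-down form.

```python
import math

def one_vs_others_z(i, est, var):
    """Z statistic (13) for group i (0-based) from estimates and their variances."""
    K = len(est)
    others = [j for j in range(K) if j != i]
    diff = est[i] - sum(est[j] for j in others) / (K - 1)
    v = var[i] + sum(var[j] for j in others) / (K - 1) ** 2   # equation (14)
    return diff / math.sqrt(v)

def two_sided_p(z):
    """Two-sided normal p-value."""
    return math.erfc(abs(z) / math.sqrt(2))

def holm(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda k: pvals[k])
    adj = [0.0] * m
    running = 0.0
    for rank, k in enumerate(order):
        running = max(running, min(1.0, (m - rank) * pvals[k]))
        adj[k] = running
    return adj

# Phenytoin, lorazepam, phenobarbital, diazepam & phenytoin
p_hat = [0.436, 0.649, 0.582, 0.558]
n = [101, 97, 91, 95]
var = [p * (1 - p) / m for p, m in zip(p_hat, n)]

z = [one_vs_others_z(i, p_hat, var) for i in range(4)]
p = [two_sided_p(v) for v in z]
print([round(v, 4) for v in p])        # nominal one-versus-others p-values
print([round(v, 3) for v in holm(p)])  # Holm-adjusted p-values
```

The first Z statistic is negative because the proportion for phenytoin lies below the average of the other three groups. Since the proportions are rounded to three decimals, the computed p-values agree with those reported only to about two significant digits.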

The Supplemental Material contains the SAS program that performed the above calculations.

Simulations

Consider the case of three groups of n = 200 each with X_i ~ N (μ_i, σ_i²), σ_i = 5 (i = 1,2,3). For specified mean values {μ₁, μ₂, μ₃}, we then performed 10,000 simulations to compare the rejection probabilities of the tests of the elemental hypotheses in (3) under closed testing compared to no adjustment for multiple tests, and adjustment using the Bonferroni, Holm or Hommel procedures. We employed the F-test on 2 df of the joint 3-group null hypothesis as in (1), and the t-test for the elemental hypotheses in (3). The simulation employed two scenarios with specified values for the {μ_i}. The probabilities of rejection of the elemental hypotheses of each group versus the others are presented in Table 1.

Table 1:

Probabilities of rejection of the elemental hypotheses of equality of each of three groups versus the others for specific mean values μ₁, μ₂ and μ₃ and standard deviation of 5 without adjustment for three tests of each group versus the others, and using the Bonferroni, Holm, Hommel and closed testing procedures; 10,000 simulations with n=200 per group.

	Parameters			Rejection Probabilities
Test	μ₁	μ₂	μ₃	μ₁ vs. μ₂₃	μ₂ vs. μ₁₃	μ₃ vs. μ₁₂
Unadjusted	0	0.25	0.5	0.517	0.052	0.501
Bonferroni	0	0.25	0.5	0.338	0.020	0.326
Holm	0	0.25	0.5	0.365	0.030	0.343
Hommel	0	0.25	0.5	0.375	0.033	0.351
Closed testing	0	0.25	0.5	0.436	0.049	0.421
Unadjusted	0	0.50	0.5	0.706	0.221	0.250
Bonferroni	0	0.50	0.5	0.533	0.121	0.139
Holm	0	0.50	0.5	0.547	0.151	0.168
Hommel	0	0.50	0.5	0.552	0.159	0.175
Closed testing	0	0.50	0.5	0.583	0.207	0.234


Let μ_ij = (μ_i + μ_j)/2 for (ij) = (12,13,23). The first 5 rows present results for the scenario with ordered means (0, 0.25, 0.5), which yields μ₁ − μ₂₃ = −0.375, μ₂ − μ₁₃ = 0 and μ₃ − μ₁₂ = 0.375. Under this scenario, the hypothesis μ₂ = μ₁₃ that group 2 equals the others combined is true (i.e. a null hypothesis) even though μ₁ ≠ μ₂ ≠ μ₃. In this case the Bonferroni, Holm and Hommel procedures all protect the type I error probability α at the 0.05 level, but are unduly conservative with rejection probabilities ≤ 0.033 relative to the closed testing procedure with rejection probability of 0.049. For the other comparisons under this scenario, μ₁ − μ₂₃ = −0.375 and μ₃ − μ₁₂ = 0.375, the closed testing approach provides rejection probability (power) somewhat greater than the other adjusted procedures.

The second 5 rows present the scenario with means (0, 0.5, 0.5), in which case μ₁ − μ₂₃ = −0.5 whereas μ₂ −μ₁₃ = μ₃ −μ₁₂ = 0.25. Again the closed testing procedure is more powerful than the other methods except the unadjusted test owing to its inflated type I error probability.
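A comparison of this kind can be sketched with a short Monte Carlo loop. The Python code below is a simplified stand-in for the simulation described above, under stated assumptions: sample means are drawn directly, the standard deviation is treated as known so that z and chi-square tests replace the t- and F-tests, fewer replicates are used, and only the unadjusted, Bonferroni and closed procedures are tallied. The function name `simulate` and its defaults are hypothetical. Because rejection rates are sensitive to the dispersion supplied and to these simplifications, the printed rates are illustrative and need not agree with Table 1.

```python
import math
import random

def simulate(mu, sigma=5.0, n=200, reps=4000, alpha=0.05, seed=1):
    """Rejection rates of each H_0i for three balanced groups.

    Simplification (not the procedure in the text): sample means are drawn
    directly and sigma is treated as known, so z and chi-square tests
    replace the t- and F-tests.
    """
    rng = random.Random(seed)
    se2 = sigma ** 2 / n                     # variance of one sample mean
    counts = {"unadjusted": [0] * 3, "bonferroni": [0] * 3, "closed": [0] * 3}
    for _ in range(reps):
        xbar = [rng.gauss(m, math.sqrt(se2)) for m in mu]
        grand = sum(xbar) / 3
        omnibus = sum((x - grand) ** 2 for x in xbar) / se2   # chi-square, 2 df
        p_joint = math.exp(-omnibus / 2)
        for i in range(3):
            others = (sum(xbar) - xbar[i]) / 2
            z = (xbar[i] - others) / math.sqrt(1.5 * se2)
            p_i = math.erfc(abs(z) / math.sqrt(2))
            counts["unadjusted"][i] += p_i <= alpha
            counts["bonferroni"][i] += p_i <= alpha / 3
            # Closed testing: joint null first, then the elemental test.
            counts["closed"][i] += (p_joint <= alpha) and (p_i <= alpha)
    return {k: [c / reps for c in v] for k, v in counts.items()}

rates = simulate([0.0, 0.25, 0.5])
for name, r in rates.items():
    print(name, [round(v, 3) for v in r])
```

With three groups only the joint null needs to be rejected before each elemental test, which is why the closed procedure here involves a single gatekeeping test.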

Figures 1 and 2 present contour plots comparing the probability of rejecting H₀₁ : μ₁ = (μ₂+μ₃)/2 using the closed testing procedure versus using a t-test with Bonferroni adjustment. These calculations employed numerical integration of the corresponding power functions, rather than simulations. See the Supplemental Material.

Figure 1 employs μ₁ = 0 and values for μ₂ and μ₃ ranging from −1 to 1 whereas Figure 2 does so for μ₁ = 1. Each plot shows contours for the ratio of the rejection probability using closed testing versus that of a Bonferroni adjustment. As seen from both figures, the proposed closed testing procedure for testing one group versus the mean of all others provides more power (ratio > 1) than the Bonferroni adjustment over all values of μ₂ and μ₃ considered.

For μ₁ = 0 (Figure 1), the ratio of the powers increases as the parameters μ₂ and μ₃ approach the null μ₁ = μ₂ = μ₃ = 0 (i.e., the center of Figure 1), most likely due to the conservativeness of the Bonferroni adjustment. Conversely, for μ₂ and μ₃ further from the null values (i.e., the SW and NE regions in Figure 1), both tests have power approaching one, and the ratio decreases.

The same pattern is observed for μ₁ = 1 (Figure 2), where values for μ₂ and μ₃ approaching the upper right corner approach the null whereas values in the lower left region have substantial departure from the null. Again the ratio of the closed testing to Bonferroni power is largest around the null parameters μ₂ = μ₃ = 1 (i.e., the NE region of Figure 2), and decreases as μ₂ and μ₃ depart from the null values (e.g., the SW region).

Regression Models

We now extend the above results for tests of means to tests of the coefficients for group comparisons in a regression model, starting with the family of Generalized Linear Models that includes the linear, logistic and Poisson models as special cases, followed by the Cox Proportional Hazards model.

Generalized Linear Models.

Consider an analysis using a member of the family of generalized linear models. Let μ_x denote E(Y |x) where g(μ_x) = x^T β for covariate vector x. For a linear model, μ_x is the conditional mean with the identity link function. For a logistic model, μ_x is the probability of an outcome and g(·) is the logit so that $μ_{x} = g^{- 1} (x^{T} β) = e^{x^{T} β} / (1 + e^{x^{T} β})$ . For a Poisson model, μ_x is the rate of an event and g(·) is the log so that $μ_{x} = g^{- 1} (x^{T} β) = e^{x^{T} β}$ . Herein we describe application to logistic regression.

Again consider the case of K = 3 groups with expectations μ_i in group i = 1,2,3 parameterized as

g (μ_{1}) = α; g (μ_{2}) = α + β_{2 : 1}; g (μ_{3}) = α + β_{3 : 1}

(18)

where β_i:j = g(μ_i) − g(μ_j) comparing group i versus group j, with group j as the reference, and where β_j:i = −β_i:j. In logistic regression β_i:j is the log odds ratio comparing group i versus j, where β_i:j < 0 provides evidence that individuals in group i have lower risk than those in group j (1 ≤ i < j ≤ K). The elementary null hypothesis for the test of group 1 versus the others then specifies that

H_{01} : g (μ_{1}) = \frac{g (μ_{2}) + g (μ_{3})}{2}, or [g (μ_{1}) - g (μ_{2})] + [g (μ_{1}) - g (μ_{3})] = 0, or (β_{1 : 2} + β_{1 : 3}) = 0, or (β_{2 : 1} + β_{3 : 1}) = 0

(19)

The other elemental hypotheses are

\begin{array}{l} H_{02} : g (μ_{2}) = \frac{g (μ_{1}) + g (μ_{3})}{2}, or 2 β_{2 : 1} - β_{3 : 1} = 0 \\ H_{03} : g (μ_{3}) = \frac{g (μ_{1}) + g (μ_{2})}{2}, or 2 β_{3 : 1} - β_{2 : 1} = 0. \end{array}

(20)

Expressed in terms of the g(μ_i), the Level 2 hypothesis H₀₁ ∩ H₀₂ specifies that 2g(μ₃) − g(μ₁) − g(μ₂) = 0 that is the elemental hypothesis H₀₃. Thus, H₀₁ ∩ H₀₂ implies the Level 3 hypothesis H₀₁ ∩ H₀₂ ∩ H₀₃ and vice versa. As was the case of an analysis of means, it also follows that the K-level and all (K − 1)-level hypotheses are equal and a single test of any one will suffice. Thus, to reject H₀₁, for example, accounting for the multiple tests, the closed testing procedure with three groups requires rejecting H₀₁ and H₀₁ ∩ H₀₂ at level α.

For a logistic model with 4 groups, the variable group would provide the coefficient estimates ${\hat{β}}_{2 : 1}, {\hat{β}}_{3 : 1}$ ,and ${\hat{β}}_{4 : 1}$ . The analysis would then entail tests of the following contrasts for each of the elemental hypotheses

\begin{array}{ll} Hypothesis & Contrast \\ H_{04} & - β_{2 : 1} - β_{3 : 1} + 3 β_{4 : 1} = 0 \\ H_{03} & - β_{2 : 1} + 3 β_{3 : 1} - β_{4 : 1} = 0 \\ H_{02} & 3 β_{2 : 1} - β_{3 : 1} - β_{4 : 1} = 0 \\ H_{01} & β_{2 : 1} + β_{3 : 1} + β_{4 : 1} = 0 \end{array}

The Supplemental Material provides the SAS code that could be used to fit a logistic model for an analysis with three groups, either balanced or unbalanced, using 1 df contrasts. It also describes contrasts for the analysis of four groups.
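For readers working outside SAS, the contrasts above can be generated for any K and tested with a 1 df Wald statistic, as in the following Python sketch. It is not the code from the Supplemental Material. The function names are hypothetical, and the coefficient estimates and covariance matrix are invented purely for illustration; in practice they would come from the fitted logistic model with group 1 as the reference.

```python
import math

def contrast_weights(i, K):
    """Weights on (beta_{2:1}, ..., beta_{K:1}) whose weighted sum is zero under H_0i."""
    if i == 1:
        return [1.0] * (K - 1)            # sum of all coefficients
    return [float(K - 1) if j == i else -1.0 for j in range(2, K + 1)]

def wald_1df(weights, beta, cov):
    """Chi-square (1 df) and p-value for the hypothesis weights' beta = 0."""
    est = sum(w * b for w, b in zip(weights, beta))
    var = sum(wi * wj * cov[r][c]
              for r, wi in enumerate(weights)
              for c, wj in enumerate(weights))
    x2 = est * est / var
    return x2, math.erfc(math.sqrt(x2 / 2))

# Hypothetical estimates of (beta_2:1, beta_3:1, beta_4:1) and their covariance.
beta = [0.40, 0.10, -0.20]
cov = [[0.04, 0.02, 0.02],
       [0.02, 0.05, 0.02],
       [0.02, 0.02, 0.06]]

for i in range(1, 5):
    w = contrast_weights(i, 4)
    x2, p = wald_1df(w, beta, cov)
    print("H0%d" % i, w, round(x2, 3), round(p, 3))
```

For K = 3 the same function returns the weights implied by (19) and (20).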

The Supplemental Material also includes a simulation assessment of the properties of this approach and confirms that the advantages noted in the above simulation also apply to analyses using a logistic regression model.

Cox Proportional Hazards Models.

Consider the case of K = 3 groups with hazard functions λ_i(t) over time in group i = 1,2,3. The elementary null hypothesis for the test of group 1 versus the others then specifies that the hazard function in the first group equals the average of the hazards in the other groups, or

H_{01} : λ_{1} (t) = \frac{λ_{2} (t) + λ_{3} (t)}{2}, or H R_{2 : 1} (t) + H R_{3 : 1} (t) = 2.

(21)

Under the assumption of proportional hazards among groups, let β_i:j denote the log hazard ratio (HR_i:j) comparing group i versus group j, with group j as the reference, and where β_i:j < 0 provides evidence that individuals in group i have lower risk than those in group j (1 ≤ i < j ≤ K). Then the elementary null hypotheses become

H_{01} : \exp {β_{2 : 1}} + \exp {β_{3 : 1}} = 2

(22)

H_{02} : \exp {β_{1 : 2}} + \exp {β_{3 : 2}} = 2

(23)

H_{03} : \exp {β_{1 : 3}} + \exp {β_{2 : 3}} = 2.

(24)

The individual coefficients are related such that

β_{1 : 2} + β_{2 : 3} + β_{3 : 1} = \log \frac{λ_{1} (t)}{λ_{2} (t)} + \log \frac{λ_{2} (t)}{λ_{3} (t)} + \log \frac{λ_{3} (t)}{λ_{1} (t)} = 0,

(25)

the denominator of one term canceling with the numerator of the next. To evaluate H₀₁ ∩H₀₂, setting β_1:2 = −β_2:1 in (22) yields

β_{3 : 1} = \log (2 - \exp {- β_{1 : 2}}),

(26)

while setting β_2:3 = −β_3:2 in (23) yields

β_{2 : 3} = - \log (2 - \exp {β_{1 : 2}}) .

(27)

Replacing the expressions above for β_3:1 and β_2:3 in (25) yields

β_{1 : 2} - \log (2 - \exp {β_{1 : 2}}) + \log (2 - \exp {- β_{1 : 2}}) = 0.

(28)

The left hand side of (28) is an increasing function in β_1:2, and therefore it has a unique root given by β_1:2 = 0. Replacing β_1:2 with 0 in (22) and (23) then yields β_1:2 = β_2:3 = β_3:1 = 0 that equals the joint null H₀₁ ∩ H₀₂ ∩ H₀₃. Again H₀₁ ∩ H₀₂ implies H₀₁ ∩ H₀₂ ∩ H₀₃.
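The monotonicity can be checked directly. Writing f(β_1:2) for the left hand side of (28), which is defined for |β_1:2| < log 2 (so that both logarithms exist), a sketch of the derivative is

f^{\prime} (β_{1 : 2}) = 1 + \frac{\exp {β_{1 : 2}}}{2 - \exp {β_{1 : 2}}} + \frac{\exp {- β_{1 : 2}}}{2 - \exp {- β_{1 : 2}}} > 0,

since each denominator is positive on that range. Together with f(0) = 0 − log(1) + log(1) = 0, this gives β_1:2 = 0 as the only root.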

The intersection hypothesis for the closed testing procedure is tested by the 2 df test of β_2:1 = β_3:1 = 0. If significant at level α, then tests of the elementary hypotheses can also be conducted at level α.

To test the null hypothesis H₀₁ in (22), estimates for β_2:1 and β_3:1, denoted by $({\hat{β}}_{2 : 1}, {\hat{β}}_{3 : 1})$, along with their variance-covariance matrix, denoted by ${\hat{Σ}}_{23 : 1}$, can be obtained by fitting a Cox PH model with group as a class variable with 3 levels and level 1 as the reference group. Then from (22) the test statistic is

X_{01}^{2} = \frac{{[\exp {{\hat{β}}_{2 : 1}} + \exp {{\hat{β}}_{3 : 1}} - 2]}^{2}}{(\exp {{\hat{β}}_{2 : 1}}, \exp {{\hat{β}}_{3 : 1}}) ({\hat{Σ}}_{23 : 1}) {(\exp {{\hat{β}}_{2 : 1}}, \exp {{\hat{β}}_{3 : 1}})}^{T}}

(29)

where the variance in the denominator is obtained using the delta method and $X_{01}^{2}$ is asymptotically distributed as chi-square on 1 df.

The null hypotheses in (23) and (24) can similarly be tested by noting that

{({\hat{β}}_{1 : 2}, {\hat{β}}_{3 : 2})}^{T} = (C_{13 : 2}) {({\hat{β}}_{2 : 1}, {\hat{β}}_{3 : 1})}^{T}, and {({\hat{β}}_{1 : 3}, {\hat{β}}_{2 : 3})}^{T} = (C_{12 : 3}) {({\hat{β}}_{2 : 1}, {\hat{β}}_{3 : 1})}^{T}

(30)

where

C_{13 : 2} = [\begin{matrix} - 1 & 0 \\ - 1 & 1 \end{matrix}], and C_{12 : 3} = [\begin{matrix} 0 & - 1 \\ 1 & - 1 \end{matrix}]

(31)

so that

{\hat{Σ}}_{13 : 2} = C_{13 : 2} ({\hat{Σ}}_{23 : 1}) C_{13 : 2}^{T}, and {\hat{Σ}}_{12 : 3} = (C_{12 : 3}) ({\hat{Σ}}_{23 : 1}) {(C_{12 : 3})}^{T} .

(32)

Then the test of H₀₂ is constructed as in (29) using ${\hat{β}}_{1 : 2}$, ${\hat{β}}_{3 : 2}$ and ${\hat{Σ}}_{13 : 2}$, and that of H₀₃ using ${\hat{β}}_{1 : 3}$, ${\hat{β}}_{2 : 3}$ and ${\hat{Σ}}_{12 : 3}$.
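The calculations in (29) to (32) can be sketched as follows in Python, given the two coefficient estimates and their covariance matrix from a Cox model fitted with group 1 as the reference. This is an illustrative sketch rather than the supplemental SAS or R program; the function names are hypothetical and the numerical inputs are invented for illustration only.

```python
import math

def cox_one_vs_others(b, S):
    """Delta-method chi-square (1 df) for exp(b[0]) + exp(b[1]) = 2, as in (29).

    b holds the two log hazard ratios versus the reference group and S is
    their 2 x 2 covariance matrix.
    """
    g = [math.exp(b[0]), math.exp(b[1])]    # gradient of the sum of hazard ratios
    var = sum(g[r] * S[r][c] * g[c] for r in range(2) for c in range(2))
    x2 = (g[0] + g[1] - 2.0) ** 2 / var
    return x2, math.erfc(math.sqrt(x2 / 2))

def change_reference(C, b, S):
    """Apply (30) and (32): returns C b and C S C^T for a 2 x 2 matrix C."""
    b_new = [sum(C[r][k] * b[k] for k in range(2)) for r in range(2)]
    S_new = [[sum(C[r][k] * S[k][l] * C[c][l]
                  for k in range(2) for l in range(2))
              for c in range(2)] for r in range(2)]
    return b_new, S_new

C_13_2 = [[-1, 0], [-1, 1]]   # (beta_1:2, beta_3:2) from (beta_2:1, beta_3:1)
C_12_3 = [[0, -1], [1, -1]]   # (beta_1:3, beta_2:3) from (beta_2:1, beta_3:1)

# Hypothetical estimates with group 1 as reference, for illustration only.
b_ref1 = [0.10, 0.30]
S_ref1 = [[0.010, 0.004], [0.004, 0.020]]

print("H01", cox_one_vs_others(b_ref1, S_ref1))
print("H02", cox_one_vs_others(*change_reference(C_13_2, b_ref1, S_ref1)))
print("H03", cox_one_vs_others(*change_reference(C_12_3, b_ref1, S_ref1)))
```

Changing the reference group through the matrices in (31) avoids refitting the model for each elemental hypothesis.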

The Supplemental Material includes a SAS program and an R/latex program to conduct one-versus-others analyses for three groups. The supplement also provides a generalization to an analysis of four groups. Simulations also show that for time-to-event outcomes, as for the comparisons of means and proportions, the closed testing procedure is preferable to the other adjustments for multiple tests.

Example: ALLHAT

The Antihypertensive and Lipid-lowering Treatment to Prevent Heart Attack Trial (ALLHAT) [7] compared the risks of cardiovascular outcomes in 33,357 patients who were randomly assigned to receive the diuretic Chlorthalidone (n =15,255) versus the ACE inhibitor Lisinopril (n =9,054) versus the calcium channel blocker Amlodipine (n =9,048). The main report [7] presents analyses comparing Lisinopril versus Chlorthalidone and Amlodipine versus Chlorthalidone for 12 major cardiovascular clinical outcomes, one designated as primary. As is common for comparison of two groups (Lisinopril and Amlodipine) versus a common control group (Chlorthalidone), the control group sample size was increased.

Herein we consider the three pairwise comparisons among the three groups using a Holm adjustment for multiple tests, and closed testing of each group versus the others combined starting with the 2-df test of the joint null hypothesis. Note that the statistical analysis comparing one group versus the others is based on combinations of coefficients comparing 2 groups at a time and thus is not affected by an imbalance among groups as in this example.

Of the 12 variables analyzed, 5 had at least one multiplicity-corrected test (pairwise Holm adjusted or closed one-versus-others) that met the criteria for significance at the 0.05 level two-sided. For each of these 5 outcomes, Table 2 presents the number of subjects (cases) who experienced each type of event within each treatment group and the corresponding event rate per 1000 patient-years. The rates of angina, combined cardiovascular disease and stroke were highest in the Lisinopril group, and those of congestive and combined heart failure highest with Amlodipine, while the rates of angina, combined cardiovascular disease, congestive heart failure and combined heart failure were lowest with Chlorthalidone.

Table 2:

Number of subjects (cases) with each of the selected types of events and the corresponding rate per 1000 patient-years among subjects receiving the diuretic Chlorthalidone (C) versus the ACE inhibitor Lisinopril (L) versus the calcium channel blocker Amlodipine (A) in ALLHAT.

	Amlodipine		Lisinopril		Chlorthalidone
	n = 9,048		n = 9,054		n = 15,255
Outcome	Cases	Rate	Cases	Rate	Cases	Rate
Angina	950	23.6	1019	25.8	1567	23.2
Combined CVD	2432	65.4	2514	69.2	3941	62.8
Congestive Heart Failure	706	16.9	612	14.8	870	12.3
Combined Heart Failure	578	13.7	471	11.3	724	10.2
Stroke	377	8.9	457	10.9	675	9.5


Table 3 then presents the hazard ratios and p-values for these 5 outcomes. For each type of event the three pairwise tests are presented with Holm-adjusted p-values as well as the three one-versus-others comparisons using closed testing. Note that owing to the construction in (21) the models assess the hazard ratio of the other combined groups versus a given group rather than that of the one group versus the others.
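As a rough consistency check on this construction, and assuming that the reported "Others vs" hazard ratio corresponds to the average of the two pairwise hazard ratios taken against the same group, as (21) suggests, the congestive heart failure entries of Table 3 can be recovered from its rounded pairwise hazard ratios. The helper name below is hypothetical and the check is only approximate because the inputs are rounded.

```python
def others_vs_group(hr_a, hr_b):
    """Average of two hazard ratios taken against the same reference group."""
    return (hr_a + hr_b) / 2.0

# Congestive heart failure, pairwise hazard ratios as reported in Table 3.
a_vs_c, l_vs_c, a_vs_l = 1.38, 1.20, 1.14

# Others (A and L) versus Chlorthalidone.
print(round(others_vs_group(a_vs_c, l_vs_c), 2))
# Others (C and L) versus Amlodipine, inverting the reported ratios.
print(round(others_vs_group(1 / a_vs_c, 1 / a_vs_l), 2))
```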

Table 3:

Comparisons of the risk of cardiovascular outcomes among subjects receiving the diuretic Chlorthalidone (C) versus the ACE inhibitor Lisinopril (L) versus the calcium channel blocker Amlodipine (A). Hazard ratio and Wald-test p-value shown. For pairwise comparisons the Holm-adjusted p-value is shown. If the ordered p-values are p₁ < p₂ < p₃ then the corrected values are 3p₁, max(3p₁, 2p₂) and max(3p₁, 2p₂, p₃). For closed testing the 2 df p-value of the joint null hypothesis is presented, and if ≤ 0.05 then the one-group versus others are tested at the 0.05 level.

	Holm-Corrected Pairwise Comparisons						Closed Testing One vs. Others
	A vs C		L vs C		A vs L		2 df	Others vs A		Others vs L		Others vs C
Outcome	HR	P	HR	P	HR	P	P	HR	P	HR	P	HR	P
Angina			1.11	0.0292			0.0304			0.91	0.0076
Combined CVD			1.10	0.0005			0.0008			0.93	0.0009	1.07	0.0018
Congestive Heart Failure	1.38	<.0001	1.20	0.0009	1.14	0.0159	<.0001	0.80	<.0001			1.29	<.0001
Combined Heart Failure	1.35	<.0001			1.22	0.0030	<.0001	0.78	<.0001			1.23	0.0002
Stroke			1.15	0.0416	0.81	0.0090	0.0084	1.15	0.0298	0.84	0.0084


For angina, the pairwise test of Lisinopril versus Chlorthalidone was significant with an adjusted p = 0.0292. Further, the 2-df test of the joint null hypothesis was significant (p = 0.0304), and in closed testing the other groups combined had a 9% lower risk of angina than Lisinopril (HR = 0.91, p = 0.0076).

For combined cardiovascular disease, the pairwise test shows a 10% greater risk for Lisinopril versus Chlorthalidone (p = 0.0005), whereas closed testing shows that the other groups combined have a 7% lower risk than Lisinopril (HR = 0.93, p = 0.0009), and further that the other groups have a 7% greater risk than Chlorthalidone (HR = 1.07, p = 0.0018). A similar pattern is observed for congestive heart failure and combined heart failure, in which one or more pairwise comparisons showed higher risk for either Amlodipine or Lisinopril versus Chlorthalidone, while closed testing showed that the other groups combined had a significantly higher risk than Chlorthalidone.

For stroke, the pairwise comparisons showed that Lisinopril had a significantly higher risk of stroke than did Chlorthalidone (HR = 1.15) and that Amlodipine had a significantly lower risk than Lisinopril (HR = 0.81). Alternatively, closed testing showed that the others combined had a significantly higher risk than Amlodipine (HR = 1.15), and that the others combined had a significantly lower risk than Lisinopril (HR = 0.84).

The results of the pairwise and closed testing versus the others are largely concordant. However, the closed one-versus-others results provide a less complicated and more global conclusion for some outcomes, such as where Chlorthalidone has significantly lower risks than the other two treatments combined for combined cardiovascular disease, congestive heart failure and combined heart failure.

We also computed Holm-corrected p-values for the comparison of each group versus the others (data not shown). By construction the Holm p-values were at least as large as the closed testing p-values since, for example, the Holm procedure multiplies the smallest of the three p-values by 3, the second smallest by 2, and so on.
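The two adjustments can be contrasted with a minimal sketch for K = 3. It assumes, as described for closed one-versus-others testing, that every pairwise intersection of the three elementary hypotheses is equivalent to the joint 2 df hypothesis, so that the closed-testing adjusted p-value is the larger of the joint p-value and the elementary p-value. The function names and the p-values are hypothetical and are not taken from the ALLHAT analysis.

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvals[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def closed_adjust_k3(p_joint, pvals):
    """Closed-testing adjusted p-values for three one-versus-others tests.

    Assumes the only intersection hypothesis is the joint (2 df) null.
    """
    return [max(p_joint, p) for p in pvals]


# Hypothetical p-values for illustration only.
p_one_vs_others = [0.030, 0.008, 0.400]
p_joint = 0.009

print(holm_adjust(p_one_vs_others))
print(closed_adjust_k3(p_joint, p_one_vs_others))
```

With these illustrative inputs, the first hypothesis would be rejected at the 0.05 level under closed testing but not under the Holm adjustment.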

Discussion

The comparison of multiple alternative therapies for the treatment of a given condition is the hallmark of a comparative effectiveness assessment. While it might be useful in such situations to compare each of the K alternative treatments against each other, as in pairwise testing, it could often suffice, from the perspective of public health and cost-efficiency, to assess whether any one treatment is better than all others combined. Herein we describe such analyses and show that testing each therapy versus the others combined can be conducted efficiently through the closed testing principle.

The traditional pairwise testing approach consists of K(K − 1)/2 pairwise tests, while the proposed closed testing procedure for one versus the others requires 2^{K−1} − K + 1 tests, counting all tests of intersection and elementary hypotheses. The closed testing approach yields fewer tests than pairwise testing for K = 3 or K = 4, and more tests for K ≥ 5. However, all individual tests in the closed testing procedure are conducted at level α, while marginally adjusting for multiplicity with the Holm procedure requires testing at a much smaller significance level (e.g., α/[K(K − 1)/2] for the most significant pairwise test).
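The comparison of the two counts can be tabulated with a short sketch. It reads the closed testing count quoted above as 2^(K−1) − K + 1; the function names are illustrative only.

```python
def pairwise_tests(k):
    """Number of pairwise comparisons among k groups: k(k - 1)/2."""
    return k * (k - 1) // 2


def closed_one_vs_others_tests(k):
    """Closed one-versus-others count, read as 2^(k - 1) - k + 1."""
    return 2 ** (k - 1) - k + 1


# Tabulate both counts for a few values of K.
for k in range(3, 7):
    print(k, pairwise_tests(k), closed_one_vs_others_tests(k))
```

Under this reading the closed testing count is smaller for K = 3 and K = 4 and larger from K = 5 onward, in agreement with the statement above.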

Note that the test of each group versus the others employs the total sample size N, which affords greater power than the sample size of 2(N/K) employed in each of the pairwise tests. This is true for the three examples presented herein.
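For the analysis of means, the gain can be illustrated with a simple variance calculation. This is a sketch under assumptions that are not stated in the text: a balanced design with n = N/K subjects per group, independent groups, and a common variance σ². The variance of a pairwise difference and that of a one-versus-others contrast for group 1 are then

```latex
\begin{aligned}
\mathrm{Var}\left(\hat{\mu}_1 - \hat{\mu}_2\right) &= \frac{2\sigma^2}{n}, \\
\mathrm{Var}\left(\hat{\mu}_1 - \frac{1}{K-1}\sum_{j=2}^{K}\hat{\mu}_j\right)
  &= \frac{\sigma^2}{n}\left(1 + \frac{1}{K-1}\right)
   = \frac{K}{K-1}\,\frac{\sigma^2}{n}.
\end{aligned}
```

For K = 3 the one-versus-others contrast has variance 1.5σ²/n compared with 2σ²/n for a pairwise difference, so under these assumptions the same true difference is estimated with greater precision.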

Closed one-versus-others testing herein is described in terms of test statistics that are based on combinations of effects or coefficients from pairwise 1:1 group comparisons. This approach can then be applied to any study regardless of variation of the sample sizes among groups. However, in the case of a balanced design with equal sample sizes (or approximately so), another approach might be to conduct the tests using the pooled comparator groups.

For example, consider the analysis of means in a balanced 3 group study. Then, the elemental hypothesis for group 1 becomes $H_{0,1:23}$: μ₁ = μ₂₃, where μ₂₃ is the mean of groups 2 and 3 combined in aggregate, which in turn equals the unweighted average of the group means in a balanced study, that is

H_{0, 1 : 23} : μ_{1} = μ_{23} \equiv H_{01} : μ_{1} = \frac{μ_{2} + μ_{3}}{2} .

(33)
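The equality between the pooled mean and the unweighted average of the group means, and its failure when group sizes differ, can be checked numerically. The following sketch uses small hypothetical samples; the function and variable names are illustrative only.

```python
from statistics import mean


def pooled_mean(*groups):
    """Mean of all observations after pooling the groups into one sample."""
    return mean([x for g in groups for x in g])


def unweighted_average_of_means(*groups):
    """Simple average of the separate group means."""
    return mean([mean(g) for g in groups])


# Hypothetical observations for groups 2 and 3.
group2 = [1.0, 2.0, 3.0]
group3_balanced = [4.0, 6.0, 8.0]
group3_unbalanced = [4.0, 6.0, 8.0, 10.0, 12.0]

# Balanced: the two quantities agree.
print(pooled_mean(group2, group3_balanced),
      unweighted_average_of_means(group2, group3_balanced))

# Unbalanced: the pooled mean is weighted toward the larger group.
print(pooled_mean(group2, group3_unbalanced),
      unweighted_average_of_means(group2, group3_unbalanced))
```

With unequal group sizes the pooled comparator therefore tests a different hypothesis from the one based on the unweighted average of the group means.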

$H_{0,1:23}$ could be tested using a simple t-test based on ${\hat{μ}}_{1} - {\hat{μ}}_{23}$. However, in a study with 4 or more groups we would then need to test intersection hypotheses such as

H_{0, 1 : 234} \cap H_{0, 2 : 134} : [μ_{1} = μ_{234}] \cap [μ_{2} = μ_{134}] \equiv H_{01} \cap H_{02} : [μ_{1} = \frac{μ_{2} + μ_{3} + μ_{4}}{3}] \cap [μ_{2} = \frac{μ_{1} + μ_{3} + μ_{4}}{3}]

(34)

However, a T²-like test of this intersection hypothesis would require an estimate of $\mathrm{Cov}({\hat{μ}}_{234}, {\hat{μ}}_{134})$, as would higher order intersection hypotheses.
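To indicate what such a covariance looks like in the simplest case, the following sketch assumes a balanced design with n subjects per group, independent groups, and a common variance σ², so that each pooled mean is the unweighted average of three group means. These assumptions are for illustration only. The two pooled means share groups 3 and 4, which gives

```latex
\begin{aligned}
\mathrm{Cov}\left(\hat{\mu}_{234}, \hat{\mu}_{134}\right)
  &= \frac{1}{9}\left[\mathrm{Var}(\hat{\mu}_3) + \mathrm{Var}(\hat{\mu}_4)\right]
   = \frac{2\sigma^2}{9n}, \\
\mathrm{Cov}\left(\hat{\mu}_1 - \hat{\mu}_{234},\; \hat{\mu}_2 - \hat{\mu}_{134}\right)
  &= -\frac{1}{3}\mathrm{Var}(\hat{\mu}_1) - \frac{1}{3}\mathrm{Var}(\hat{\mu}_2) + \frac{2\sigma^2}{9n}
   = -\frac{4\sigma^2}{9n}.
\end{aligned}
```

In models other than the simple analysis of means, no comparably simple closed form need exist, and the covariance would have to be estimated from the data.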

For three balanced groups, this approach also would provide a simplification of the Cox PH Model analysis. In that case the elemental hypotheses for group 1 versus the others can be tested using the Wald test of the coefficient ${\hat{β}}_{1 : 23}$ in a model with a binary covariate for group 1 versus the other two. Likewise the other coefficients ${\hat{β}}_{2 : 13}$ and ${\hat{β}}_{3 : 12}$ can be tested. Again, in a 4 or more group study such an approach would require the covariance between terms such as ${\hat{β}}_{1 : 234}$ and ${\hat{β}}_{2 : 134}$ which may be obtained using a sandwich estimator.

However, the pooled comparator approach should not be employed with a logistic regression model because the logistic model hypotheses (19) and (20) were specified on the log odds scale. Therefore, comparing group 1 versus a pooled comparator group combining the other two groups would not provide a valid test for (19). Note that (19) specifies that g(μ₁) = [g(μ₂) + g(μ₃)]/2. However, in an analysis comparing group 1 to the pooled groups 2 and 3, the hypothesis states that g(μ₁) = g[(μ₂ + μ₃)/2], which differs from (19). In fact, the same applies to any model based on a member of the exponential family with a link function other than the identity function.
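The distinction between the two hypotheses can be seen numerically for the logit link. The sketch below uses hypothetical probabilities for groups 2 and 3 and assumes a balanced study, so that the pooled proportion equals the average of the two group proportions; the names are illustrative only.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical event probabilities in groups 2 and 3.
mu2, mu3 = 0.2, 0.6

# Right-hand side of (19): the average of the log odds.
average_of_logits = (logit(mu2) + logit(mu3)) / 2.0

# Pooled comparator in a balanced study: the log odds of the average probability.
logit_of_average = logit((mu2 + mu3) / 2.0)

print(round(average_of_logits, 4), round(logit_of_average, 4))
```

The two values differ (about −0.49 versus about −0.41 here), so a value of μ₁ that satisfies one hypothesis does not in general satisfy the other.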

On the other hand, a Cox proportional hazards model in a balanced study with a pooled comparator group is valid since the analysis is conducted on the hazard scale. For example, with three groups (21) becomes λ₁(t) = λ₂₃(t) = [λ₂(t) + λ₃(t)]/2.

A further natural question is whether the closed testing of one versus the others could be combined with pairwise testing. For example, suppose the hypothesis H₀₁ comparing group 1 versus the others is rejected using the closed testing procedure described herein. The question is whether the corresponding pairwise comparisons still require adjustment for multiplicity. Let ${\tilde{H}}_{0 i j}$ denote the null hypothesis μ_i = μ_j, 1 ≤ i < j ≤ K. Using K = 4 to illustrate the ideas, the individual hypotheses are then given by $H_{0i}$, 1 ≤ i ≤ K, and ${\tilde{H}}_{0 i j}$, 1 ≤ i < j ≤ K. Under the closed testing procedure, rejecting (say) ${\tilde{H}}_{012}$ requires rejecting all intersection hypotheses that include it. It is easy to show that ${\tilde{H}}_{012} \cap H_{01}$ and ${\tilde{H}}_{012} \cap H_{02}$ are each equivalent to H₀₁ ∩ H₀₂. However, ${\tilde{H}}_{012} \cap H_{03}$ is equivalent to μ₁ = μ₂ = (3μ₃ − μ₄)/2, which was not tested under the closed testing procedure for H₀₁. So the answer seems to be that a further adjustment is required, or that additional intersection hypotheses have to be tested (again, all at level α).
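The equivalences stated above for K = 4 follow from elementary algebra, using $H_{0i}$: μ_i equal to the average of the other three means. A brief worked sketch:

```latex
\begin{aligned}
\tilde{H}_{012} \cap H_{01}: &\quad \mu_1 = \mu_2,\; 3\mu_1 = \mu_2 + \mu_3 + \mu_4
  \;\Longrightarrow\; \mu_1 = \mu_2 = \frac{\mu_3 + \mu_4}{2}, \\
H_{01} \cap H_{02}: &\quad 3\mu_1 = \mu_2 + \mu_3 + \mu_4,\; 3\mu_2 = \mu_1 + \mu_3 + \mu_4
  \;\Longrightarrow\; 4(\mu_1 - \mu_2) = 0
  \;\Longrightarrow\; \mu_1 = \mu_2 = \frac{\mu_3 + \mu_4}{2}, \\
\tilde{H}_{012} \cap H_{03}: &\quad \mu_1 = \mu_2,\; 3\mu_3 = \mu_1 + \mu_2 + \mu_4
  \;\Longrightarrow\; \mu_1 = \mu_2 = \frac{3\mu_3 - \mu_4}{2}.
\end{aligned}
```

The first two lines describe the same constraint, whereas the third imposes a different one that does not appear among the intersection hypotheses of the one-versus-others closure.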

Supplementary Material

NIHMS1539768-supplement-3.pdf (167.9 KB, pdf)

Acknowledgment

The data from The Antihypertensive and Lipid-lowering Treatment to Prevent Heart Attack Trial (ALLHAT) study were provided by the National Heart, Lung and Blood Institute’s Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC).

Funding

This work was partially supported by grant U01-DK-098246 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), NIH, for the Glycemia Reduction Approaches in Diabetes: A Comparative Effectiveness (GRADE) Study.

The second author (I.B.) was also supported by the Samuel W. Greenhouse Biostatistics Research Enhancement Award.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

[1] Lauer MS, Collins FS. Using science to improve the nation’s health system: NIH’s commitment to comparative effectiveness research. JAMA 2010; 303(21): 2182–2183.
[2] Nathan DM, Buse JB, Kahn SE, Krause-Steinrauf H, Larkin ME, Staten M, Wexler D, Lachin JM. Rationale and design of the Glycemia Reduction Approaches in Diabetes: A Comparative Effectiveness Study (GRADE). Diabetes Care 2013; 36(8): 2254–2261.
[3] Bretz F, Hothorn T, Westfall P. Multiple Comparisons Using R. CRC Press, 2011.
[4] Marcus R, Peritz E, Gabriel KR. On closed testing procedures with special reference to ordered analysis of variance. Biometrika 1976; 63: 655–660.
[5] McMillan-Price J, Petocz P, Atkinson F, O’Neill K, Samman S, Steinbeck K, Caterson I, Brand-Miller J. Comparison of 4 diets of varying glycemic load on weight loss and cardiovascular risk reduction in overweight and obese young adults. Archives of Internal Medicine 2006; 166: 1466–1475.
[6] Treiman DM, Meyers PD, Walton NY, Collins JF, Colling C, Rowan AJ, Handforth A, Faught E, Calabrese VP, Uthman BM, Ramsay RE, Mamdani MB, Yagnik P, Jones JC, Barry E, Boggs JG, Kanner AM, for the Veterans Affairs Status Epilepticus Cooperative Study Group. A comparison of four treatments for generalized convulsive status epilepticus. The New England Journal of Medicine 1998; 339: 792–798.
[7] ALLHAT Officers and Coordinators for the ALLHAT Collaborative Research Group. Major outcomes in high-risk hypertensive patients randomized to angiotensin-converting enzyme inhibitor or calcium channel blocker vs diuretic: the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT). JAMA 2002; 288: 2981–2997.
