Comparing Two Exponential Distributions Using the Exact Likelihood Ratio Test

Gang Han; Michael J Schell; Jongphil Kim

doi:10.1080/19466315.2012.698945

Author manuscript; available in PMC: 2013 Oct 1.

Published in final edited form as: Stat Biopharm Res. 2012 Oct 1;4(4):348–356. doi: 10.1080/19466315.2012.698945


PMCID: PMC3693095 NIHMSID: NIHMS354604 PMID: 23814641

Abstract

The exact two-sided likelihood ratio test for testing the equality of two exponential means is proposed and proved to be the uniformly most powerful unbiased test. This exact test has advantages over two alternative approaches in that it is unbiased and more powerful while maintaining the type I error. The use of the proposed test is demonstrated in a non-small cell lung cancer clinical trial design.

Keywords: Survival analysis, exponential family, uniformly most powerful unbiased test, power calculation

1 Introduction

The exponential distribution has been shown to have high inference efficiency in survival analysis (Miller (1983) and Meier et al. (2004)). The problem of testing the equality of two exponential means is commonly seen in practice. For example, Bui et al. (2011) compared the survival distributions of sarcoma before and after metastasis as well as for patients with low and high biomarker Cx43 scores, where the survival distribution is believed to be exponential.

Several approaches exist for comparing two exponential distributions. For example, Lee (1992) described an asymptotic likelihood ratio test (ALRT). An exact F-test, originally given in Cox (1953), has been widely adopted in statistical software, including noncommercial packages such as STPLAN and commercial packages such as Power Analysis and Sample Size (PASS). Both of these widely used approaches, however, have weaknesses. As shown in Section 4, the ALRT is an asymptotic test that requires a relatively large sample size to maintain the type I error. The F-test is biased, and because of this bias its power is typically low, which may lead to misleading interpretation of the data.

In this paper, we propose a uniformly most powerful unbiased (UMPU) test that has advantages over the existing tests. We discuss a likelihood ratio test in Section 2, and prove it is the UMPU test in Section 3. In Section 4, we compare the proposed test with the two existing tests. Concluding remarks are given in Section 5. Proofs of a remark, two lemmas, and a theorem are provided in the appendix.

2 Testing the Equivalence of Two Exponential Distributions

Suppose we have two groups of observations following exponential distributions. In group 1, we let {t_{1,i}}_{i=1,…,n₁} and {c_{1,i}}_{i=1,…,n₁} denote the event/censoring times and the censoring indicators, respectively, where n₁ is the number of observations, c_{1,i} = 1 if the ith observation is an event, and c_{1,i} = 0 if it is censored. Similarly, we let {t_{2,i}}_{i=1,…,n₂} and {c_{2,i}}_{i=1,…,n₂} denote the event/censoring times and the censoring indicators in group 2. We let d₁ and d₂ denote the numbers of events in groups 1 and 2, respectively. We let t_{1,1} ≤ t_{1,2} ≤ ··· ≤ t_{1,d₁} denote the ordered event times in group 1, with c_{1,1} = c_{1,2} = ··· = c_{1,d₁} = 1 the indicators of the d₁ events. Similarly, t_{2,1} ≤ t_{2,2} ≤ ··· ≤ t_{2,d₂} are the ordered event times in group 2, and c_{2,1} = c_{2,2} = ··· = c_{2,d₂} = 1 are the indicators of the d₂ events. We assume that the survival times in groups 1 and 2 follow exponential distributions with means λ₁ and λ₂, respectively, and that the censoring mechanisms in both groups are type II. For m ∈ {1, 2}, the maximum likelihood estimate (MLE) of λ_m is

{\hat{λ}}_{m} = \frac{x_{m}}{d_{m}},

(1)

where x_m is the total time on test (TTOT) between time 0 and time t_{m,d_m}

x_{m} = TTOT (0, t_{m, d_{m}}) = \sum_{i = 1}^{d_{m}} t_{m, i} + (n_{m} - d_{m}) \times t_{m, d_{m}} .
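The total time on test and the MLE in (1) can be sketched in a few lines of Python. This is only an illustration: the function names and the small data set are hypothetical, and the code assumes type II censoring, so the n − d subjects without an event are all censored at the largest observed event time.

```python
def total_time_on_test(event_times, n):
    """Total time on test x_m under type II censoring.

    event_times: the d observed event times in one group (any order).
    n: number of subjects in the group; the n - d subjects without an
       event are censored at the d-th (largest) event time.
    """
    d = len(event_times)
    if d == 0 or n < d:
        raise ValueError("need at least one event and n >= d")
    return sum(event_times) + (n - d) * max(event_times)


def exponential_mle(event_times, n):
    """MLE of the exponential mean, lambda_hat = x_m / d_m, as in (1)."""
    return total_time_on_test(event_times, n) / len(event_times)


# Hypothetical data: 3 events observed among 5 subjects.
x = total_time_on_test([2.0, 5.0, 9.0], n=5)     # (2 + 5 + 9) + 2 * 9
lam_hat = exponential_mle([2.0, 5.0, 9.0], n=5)  # x / 3
```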

The 100(1 − α)% confidence interval of λ_m is

λ_{m} \in (\frac{2 d_{m} {\hat{λ}}_{m}}{χ_{2 d_{m}, 1 - α / 2}^{2}}, \frac{2 d_{m} {\hat{λ}}_{m}}{χ_{2 d_{m}, α / 2}^{2}}),

where $χ_{2 d_{m}, α / 2}^{2}$ is the lower quantile at probability α/2 of the central chi-square distribution with 2d_m degrees of freedom (Epstein and Sobel 1954). As discussed in Kalbfleisch and Prentice (1980), page 41, if the survival distribution is exponential in group 1, then x₁ is a realization from Gamma(d₁, λ₁), the gamma distribution with parameters (d₁, λ₁) and having mean d₁λ₁ and variance $d_{1} λ_{1}^{2}$ . Similarly, x₂ is a realization from Gamma(d₂, λ₂) if the survival distribution is exponential in group 2. We let X₁ and X₂ denote two random variables with their distributions X₁ ~ Gamma(d₁, λ₁) and X₂ ~ Gamma(d₂, λ₂).

The null and alternative hypotheses for testing the equivalence between λ₁ and λ₂ are

H_{0} : λ_{1} = λ_{2} vs . H_{1} : λ_{1} \neq λ_{2} .

(2)

Under the null hypothesis, we let λ = λ₁ = λ₂ denote the model parameter. We derive the likelihood ratio test (LRT) in Remark 1.

Remark 1

The likelihood ratio test statistic for testing hypothesis (2) can be written as

φ (x_{1}, x_{2}) = {(\frac{x_{1}}{x_{1} + x_{2}})}^{d_{1}} {(\frac{x_{2}}{x_{1} + x_{2}})}^{d_{2}} .

The level α likelihood ratio test rejects H₀ if φ (x₁, x₂) ≤ C_α, where C_α is a real value having P(φ(X₁, X₂) ≤ C_α|H₀) = α for all α ∈ (0, 1).

Next we quantify the p-value of this LRT. Under H₀, Y = X₁/(X₁ + X₂) has a beta distribution, Beta(d₁,d₂), with the probability density function (p.d.f.)

f_{β} (y ∣ d_{1}, d_{2}) = \frac{(d_{1} + d_{2} - 1)!}{(d_{1} - 1)! (d_{2} - 1)!} y^{d_{1} - 1} {(1 - y)}^{d_{2} - 1},

(3)

for all 0 < y < 1. Given x₁ and x₂, the p-value of the LRT is P (Y^d₁(1−Y)^d₂ ≤ φ(x₁, x₂)|H₀). Because the function y^d₁(1−y)^d₂ ∝ f_β(y|d₁+1, d₂+1) is strictly increasing for y ∈ [0, d₁/(d₁+d₂)] and strictly decreasing for y ∈ [d₁/(d₁+d₂), 1], there must exist two real numbers A₁ ∈ [0, d₁/(d₁+d₂)] and A₂ ∈ [d₁/(d₁ + d₂), 1] satisfying

Y^{d_{1}} {(1 - Y)}^{d_{2}} \leq φ (x_{1}, x_{2}) iff Y \in [0, A_{1}] \cup [A_{2}, 1],

(4)

where 0 ≤ A₁ ≤ d₁/(d₁ + d₂) ≤ A₂ ≤ 1, and

f_{β} (A_{1} ∣ d_{1} + 1, d_{2} + 1) = f_{β} (A_{2} ∣ d_{1} + 1, d_{2} + 1) = φ (x_{1}, x_{2}) \times \frac{(d_{1} + d_{2} + 1)!}{d_{1}! d_{2}!} .

(5)

The p-value of the LRT can be computed as

p - value = P (Y \leq A_{1} ∣ H_{0}) + P (Y \geq A_{2} ∣ H_{0}) .

(6)

We illustrate this LRT for d₁ = d₂ = 1. In this case, the test statistic is φ(x₁, x₂) = x₁x₂/(x₁ + x₂)² = y(1 − y), where y = x₁/(x₁ + x₂). So φ(x₁, x₂) ≤ C_α iff

y \in [0, 0.5 - \sqrt{0.25 - C_{α}}] \cup [0.5 + \sqrt{0.25 - C_{α}}, 1] .

Because Y = X₁/(X₁+X₂) has a uniform distribution on [0, 1], $P (y \in [0, 0.5 - \sqrt{0.25 - C_{α}}] \cup [0.5 + \sqrt{0.25 - C_{α}}, 1]) = α$ leads to C_α = 0.25 × (2α − α²). If α = 0.05, the LRT rejects H₀ if x₁/(x₁ + x₂) ≤ 1/40 or x₁/(x₁ + x₂) ≥ 39/40.

There exist A₁ and A₂ to achieve any level α ∈ (0, 1) because the function y^d₁(1 − y)^d₂ is continuous and monotone on each of the intervals y ∈ [0, d₁/(d₁+d₂)] and y ∈ [d₁/(d₁+d₂), 1], and the cumulative distribution function (c.d.f.) of Beta(d₁, d₂) is continuous as well. To compute the p-value of the test (2) given x₁ and x₂, we search for (A₁, A₂) numerically for arbitrary positive integers d₁ and d₂ using (5) and (6). Based on (3)–(5), A₁ = x₁/(x₁ + x₂) if x₁/(x₁ + x₂) ≤ d₁/(d₁ + d₂), and A₂ = x₁/(x₁ + x₂) if x₁/(x₁ + x₂) ≥ d₁/(d₁ + d₂). We implement the bisection search algorithm to determine the remaining endpoint, A₂ ∈ [d₁/(d₁ + d₂), 1] (or A₁ ∈ [0, d₁/(d₁ + d₂)]). The exact p-value in (6) can be computed using a beta distribution table or a beta c.d.f. in software packages such as MATLAB (function “betacdf”) and SAS (function “cdf”). Similarly, given a level α, the critical values A₁ and A₂ can be computed using (5) and the equation

P (A_{1} < Y < A_{2} ∣ H_{0}) = 1 - α .

(7)
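The search described in this section can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB package: the function names are hypothetical, the beta c.d.f. is evaluated through the finite binomial sum that holds for integer d₁ and d₂ (the same identity used as (23) in the appendix), and the matching endpoint is found by bisection on the log of y^d₁(1 − y)^d₂.

```python
import math


def beta_cdf(c, a, b):
    """P(Y <= c) for Y ~ Beta(a, b) with positive integers a, b."""
    n = a + b - 1
    return sum(math.comb(n, j) * c**j * (1.0 - c)**(n - j) for j in range(a, n + 1))


def log_phi(y, d1, d2):
    """Log of y^d1 (1 - y)^d2; minus infinity at the endpoints."""
    if y <= 0.0 or y >= 1.0:
        return -math.inf
    return d1 * math.log(y) + d2 * math.log(1.0 - y)


def matching_point(y, d1, d2, iters=200):
    """Bisection for the point across the mode with the same phi value."""
    mode = d1 / (d1 + d2)
    target = log_phi(y, d1, d2)
    if y <= mode:
        lo, hi, increasing = mode, 1.0, False  # look for A2 in [mode, 1]
    else:
        lo, hi, increasing = 0.0, mode, True   # look for A1 in [0, mode]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (log_phi(mid, d1, d2) > target) == increasing:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def lrt_p_value(x1, x2, d1, d2):
    """Two-sided p-value (6): P(Y <= A1) + P(Y >= A2) under H0."""
    y = x1 / (x1 + x2)
    other = matching_point(y, d1, d2)
    a1, a2 = min(y, other), max(y, other)
    return beta_cdf(a1, d1, d2) + 1.0 - beta_cdf(a2, d1, d2)


def lrt_critical_values(alpha, d1, d2, iters=100):
    """Solve (5) and (7) for (A1, A2) by an outer bisection on A1."""
    lo, hi = 0.0, d1 / (d1 + d2)
    for _ in range(iters):
        a1 = 0.5 * (lo + hi)
        a2 = matching_point(a1, d1, d2)
        size = beta_cdf(a1, d1, d2) + 1.0 - beta_cdf(a2, d1, d2)
        if size < alpha:
            lo = a1  # rejection region too small: move A1 toward the mode
        else:
            hi = a1
    a1 = 0.5 * (lo + hi)
    return a1, matching_point(a1, d1, d2)


a1, a2 = lrt_critical_values(0.05, 1, 1)
p = lrt_p_value(1.0, 39.0, 1, 1)
```

For d₁ = d₂ = 1 and α = 0.05 the routine should recover the cutoffs 1/40 and 39/40 derived above. The binomial-sum c.d.f. is chosen only to keep the sketch dependency-free; a library beta c.d.f. would serve equally well.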

3 UMPU Property of the Likelihood Ratio Test

Lehmann (1986), pages 188–192, provided a general form of the UMPU test for the exponential family. For a two-parameter model having parameters (θ₁, θ₂), the level α UMPU test for

H_{0}^{★} : θ_{2} = 0 vs . H_{1}^{★} : θ_{2} \neq 0

(8)

is to reject $H_{0}^{★}$ if V (x) ≤ C₁ or V (x) ≥ C₂, provided that the following five conditions hold:

1. The likelihood function can be written as
$L (θ_{1}, θ_{2} ∣ x) = C (θ_{1}, θ_{2}) H (x) \exp \{T (x) θ_{1} + U (x) θ_{2}\},$ (9)
where C(θ₁, θ₂) is a function of (θ₁, θ₂) and H(x), T(x), and U(x) are functions of x;
2. T(x) is sufficient for θ₁ and U(x) is sufficient for θ₂;
3. V(X) is independent of T(X) under θ₂ = 0;
4. V(x) can be written as
$V (x) = a (T (x)) U (x) + b (T (x)),$ (10)
where a(·) and b(·) are two functions with a(T(x)) > 0;
5. V(x) is continuous in x and L(θ₁, θ₂|x) is continuous in θ₁ and θ₂.

The value of (C₁, C₂) is determined by

P (V (X) \leq C_{1}) + P (V (X) \geq C_{2}) = α

(11)

and

E (V (X) \times (1 - I_{(C_{1}, C_{2})} (V (X)))) = α \times E (V (X)),

(12)

where I_(C₁,C₂)(·) is an indicator function such that I_(C₁,C₂)(V(X)) = 1 if V(X) ∈ (C₁, C₂) and 0 otherwise.

Applying this general form to the two sample exponential setting, we can derive Lemma 1.

Lemma 1

The UMPU test of H₀ vs. H₁ in (2) rejects H₀ if y = x₁/(x₁ + x₂) ≤ C₁ or y ≥ C₂ where C₁ and C₂ are determined by

\int_{C_{1}}^{C_{2}} f_{β} (y ∣ d_{1}, d_{2}) d y = 1 - α

(13)

and

\int_{C_{1}}^{C_{2}} f_{β} (y ∣ d_{1} + 1, d_{2}) d y = 1 - α .

(14)

We propose Lemma 2 to prove Theorem 1.

Lemma 2

Suppose that X_n is a random variable having binomial distribution Bin(n, p), where the probability mass function (p.m.f.) of X_n can be written as

P (X_{n} = x) = \frac{n!}{x! (n - x)!} p^{x} {(1 - p)}^{n - x},

(15)

for all x = 0, 1, …, n, and Z_n₊₁ is a random variable with distribution Bin(n+1, p). If Z_n₊₁ is independent of X_n, then

P (Z_{n + 1} \geq k) = P (X_{n} \geq k) + P (X_{n} = k - 1) \times p,

(16)

for any positive integer k.
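Identity (16) is easy to check numerically. The short Python sketch below (helper names are hypothetical) evaluates both sides directly from the p.m.f. (15) over a small grid of n, p, and k.

```python
import math


def binom_pmf(n, p, x):
    """P(X = x) for X ~ Bin(n, p), as in (15); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)


def binom_upper_tail(n, p, k):
    """P(X >= k) for X ~ Bin(n, p)."""
    return sum(binom_pmf(n, p, x) for x in range(max(k, 0), n + 1))


def lemma2_gap(n, p, k):
    """Absolute difference between the two sides of (16)."""
    lhs = binom_upper_tail(n + 1, p, k)
    rhs = binom_upper_tail(n, p, k) + binom_pmf(n, p, k - 1) * p
    return abs(lhs - rhs)


max_gap = max(
    lemma2_gap(n, p, k)
    for n in range(1, 15)
    for p in (0.1, 0.5, 0.83)
    for k in range(1, n + 2)
)
```

The largest gap over the grid should be at the level of floating-point rounding.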

Following Lemma 1 and Lemma 2, Theorem 1 provides the UMPU condition for the test of H₀ vs. H₁.

Theorem 1

For any level α ∈ (0, 1), the LRT in Remark 1 is the UMPU test.
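A numerical illustration of Theorem 1, as a sketch with hypothetical names and illustrative inputs: pick any A₁ below the mode, find the A₂ that gives the same value of y^d₁(1 − y)^d₂, and compare the level implied by (13) with the level implied by (14). If the LRT cutoffs are the UMPU cutoffs, the two levels should agree up to rounding.

```python
import math


def beta_cdf(c, a, b):
    """P(Y <= c) for Y ~ Beta(a, b) with positive integers a, b (binomial sum)."""
    n = a + b - 1
    return sum(math.comb(n, j) * c**j * (1.0 - c)**(n - j) for j in range(a, n + 1))


def log_phi(y, d1, d2):
    """Log of y^d1 (1 - y)^d2 for 0 < y < 1."""
    return d1 * math.log(y) + d2 * math.log(1.0 - y)


def upper_match(a1, d1, d2, iters=200):
    """Bisection for A2 above the mode with phi(A2) = phi(a1)."""
    lo, hi = d1 / (d1 + d2), 1.0
    target = log_phi(a1, d1, d2)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid < 1.0 and log_phi(mid, d1, d2) > target:
            lo = mid  # phi is still too large here, so A2 lies further right
        else:
            hi = mid
    return 0.5 * (lo + hi)


d1, d2, a1 = 30, 4, 0.75  # illustrative values; a1 lies below the mode 30/34
a2 = upper_match(a1, d1, d2)

# Level implied by condition (13), using Beta(d1, d2).
alpha_13 = 1.0 - (beta_cdf(a2, d1, d2) - beta_cdf(a1, d1, d2))
# Level implied by condition (14), using Beta(d1 + 1, d2).
alpha_14 = 1.0 - (beta_cdf(a2, d1 + 1, d2) - beta_cdf(a1, d1 + 1, d2))
```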

4 Method Comparison

In this section, we compare two existing tests with the exact LRT. Prior to the comparison in Section 4.3, we introduce the alternative tests in Section 4.1 and derive the power of the three tests in Section 4.2.

4.1 Alternative Tests

The first alternative is an asymptotic likelihood ratio test (ALRT) described in Lee (1992), pages 233–236. The α level ALRT rejects H₀ in (2) if $- 2 \log (Λ) > χ_{1, 1 - α}^{2}$ , where

Λ = \frac{{(d_{1} + d_{2})}^{d_{1} + d_{2}}}{d_{1}^{d_{1}} d_{2}^{d_{2}}} \times \frac{x_{1}^{d_{1}} x_{2}^{d_{2}}}{{(x_{1} + x_{2})}^{d_{1} + d_{2}}}

is the LRT test statistic. Following the notation y = x₁/(x₁ + x₂) and the p.d.f. in (3), the α level ALRT rejects H₀ if $f_{β} (y ∣ d_{1} + 1, d_{2} + 1) \leq C_{α}^{A}$ , where

C_{α}^{A} = \frac{d_{1}^{d_{1}} d_{2}^{d_{2}} (d_{1} + d_{2} + 1)!}{{(d_{1} + d_{2})}^{d_{1} + d_{2}} d_{1}! d_{2}!} exp {- χ_{1, 1 - α}^{2} / 2} .

As discussed in Section 2, f_β(y|d₁ + 1, d₂ + 1) ∝ y^d₁ (1 − y)^d₂ is monotonically increasing for y ∈ [0, d₁/(d₁ + d₂)] and monotonically decreasing for y ∈ [d₁/(d₁ + d₂), 1]. The rejection region of the ALRT for testing the two-sided hypothesis (2) at level α is

y \in [0, A_{1}^{A}] \cup [A_{2}^{A}, 1],

(17)

where $0 \leq A_{1}^{A} \leq d_{1} / (d_{1} + d_{2}) \leq A_{2}^{A} \leq 1$ and $f_{β} (A_{1}^{A} ∣ d_{1} + 1, d_{2} + 1) = f_{β} (A_{2}^{A} ∣ d_{1} + 1, d_{2} + 1) = C_{α}^{A}$ for all α ∈ [0, 1]. The bisection search algorithm mentioned in Section 2 can be used to compute $A_{1}^{A}$ and $A_{2}^{A}$ numerically. Lee (1992) pointed out three limitations of the ALRT. First, it does not suit one-sided hypotheses. Second, this asymptotic test requires a large sample size. Third, the power of the ALRT is typically lower than that of the exact tests.

The second alternative is an F-test originally proposed by Cox (1953) for comparing the rates of occurrence of two Poisson samples, which is the same as comparing the means of two exponential samples of waiting times. Based on the fact that F′ = (X₁/d₁)/(X₂/d₂) has an exact F distribution with degrees of freedom 2d₁ and 2d₂ under H₀, the F-test rejects H₀ in (2) at level α if f′ ≤ F_{2d₁,2d₂,α/2} or f′ ≥ F_{2d₁,2d₂,1−α/2}, where f′ = (x₁/d₁)/(x₂/d₂) and F_{2d₁,2d₂,α} is the 100 × α% lower quantile of the F distribution with degrees of freedom 2d₁ and 2d₂. Because of the functional relation Y = X₁/(X₁ + X₂) = 1 − 1/(F′d₁/d₂ + 1), which is strictly increasing in F′, the F-test and the LRT are identical when testing a one-sided hypothesis. For the two-sided hypothesis in (2), the level α F-test has the rejection region

y \in [0, I_{β}^{- 1} (α / 2; d_{1}, d_{2})] \cup [I_{β}^{- 1} (1 - α / 2; d_{1}, d_{2}), 1]

(18)

The difference between the F-test and the proposed LRT is that the F-test uses the equal tail probability rejection region in (18), whereas the rejection region of the LRT is defined by equations (5) and (7). The F-test and the LRT are identical if and only if A₁ = I_β⁻¹(α/2; d₁, d₂) and A₂ = I_β⁻¹(1 − α/2; d₁, d₂), which occurs when d₁ = d₂.
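The equal-tail region (18) can be computed on the y scale and compared against the equal-likelihood condition that defines the LRT cutoffs. The Python sketch below uses hypothetical function names, obtains beta quantiles by bisection on a binomial-sum c.d.f. (valid for integer d₁, d₂), and reports the gap in log y^d₁(1 − y)^d₂ between the two F-test cutoffs. A gap of zero means the F-test cutoffs also satisfy the LRT condition.

```python
import math


def beta_cdf(c, a, b):
    """P(Y <= c) for Y ~ Beta(a, b) with positive integers a, b."""
    n = a + b - 1
    return sum(math.comb(n, j) * c**j * (1.0 - c)**(n - j) for j in range(a, n + 1))


def beta_quantile(q, a, b, iters=200):
    """Inverse of beta_cdf by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f_test_cutoffs(alpha, d1, d2):
    """Equal-tail cutoffs of the F-test expressed on the y scale, as in (18)."""
    return beta_quantile(alpha / 2, d1, d2), beta_quantile(1 - alpha / 2, d1, d2)


def y_from_f(f, d1, d2):
    """Map f' = (x1/d1)/(x2/d2) to y = x1/(x1 + x2)."""
    return d1 * f / (d1 * f + d2)


def log_phi(y, d1, d2):
    """Log of y^d1 (1 - y)^d2 for 0 < y < 1."""
    return d1 * math.log(y) + d2 * math.log(1.0 - y)


for d1, d2 in [(5, 5), (30, 4)]:
    lo, hi = f_test_cutoffs(0.05, d1, d2)
    gap = log_phi(lo, d1, d2) - log_phi(hi, d1, d2)
    print(d1, d2, round(lo, 4), round(hi, 4), "log-phi gap:", round(gap, 4))
```

With equal numbers of events the gap should vanish by symmetry, while for unbalanced designs such as (d₁, d₂) = (30, 4) it does not, which is where the two tests part ways.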

4.2 Rejection Probabilities of the Three Tests

To quantify the power, we first derive the p.d.f. of y = x₁/(x₁ + x₂) given λ₁ and λ₂. Define z = x₂. Then x₁ = yz/(1 − y) and x₂ = z. Thus,

\begin{array}{l} f (y ∣ λ_{1}, λ_{2}) = \int_{0}^{\infty} f (y, z ∣ λ_{1}, λ_{2}) d z \\ = \int_{0}^{\infty} f_{1} (\frac{y z}{1 - y} ∣ λ_{1}) f_{2} (z ∣ λ_{2}) \times \left| \begin{matrix} \frac{\partial x_{1}}{\partial z} & \frac{\partial x_{2}}{\partial z} \\ \frac{\partial x_{1}}{\partial y} & \frac{\partial x_{2}}{\partial y} \end{matrix} \right| d z \\ = \int_{0}^{\infty} f_{1} (\frac{y z}{1 - y} ∣ λ_{1}) f_{2} (z ∣ λ_{2}) \times \frac{z}{{(1 - y)}^{2}} d z \\ = \frac{y^{d_{1} - 1} (d_{1} + d_{2} - 1)!}{(d_{1} - 1)! (d_{2} - 1)! λ_{1}^{d_{1}} λ_{2}^{d_{2}} {(1 - y)}^{d_{1} + 1}} {[\frac{λ_{1} λ_{2} (1 - y)}{λ_{2} y + λ_{1} (1 - y)}]}^{d_{1} + d_{2}} \\ = f_{β} (y ∣ d_{1}, d_{2}) \times \frac{λ_{1}^{d_{2}} λ_{2}^{d_{1}}}{{[(λ_{2} - λ_{1}) y + λ_{1}]}^{d_{1} + d_{2}}} . \end{array}

(19)

Thus, given two generic real numbers 0 ≤ g₁ ≤ g₂ ≤ 1, we can quantify the probability of Y ∈ [0, g₁] ∪ [g₂, 1] as

\begin{array}{l} P (Y \in [0, g_{1}] \cup [g_{2}, 1] ∣ λ_{1}, λ_{2}) = 1 - P (Y \in (g_{1}, g_{2}) ∣ λ_{1}, λ_{2}) \\ = 1 - \int_{g_{1}}^{g_{2}} f (y ∣ λ_{1}, λ_{2}) d y \\ = 1 - \int_{g_{1}}^{g_{2}} f_{β} (y ∣ d_{1}, d_{2}) \times \frac{λ_{1}^{d_{2}} λ_{2}^{d_{1}}}{{[(λ_{2} - λ_{1}) y + λ_{1}]}^{d_{1} + d_{2}}} d y . \end{array}

(20)

Combining (20) with (5), (7), (17), and (18), the rejection probabilities of all three tests at level α have the form (20) with $(g_{1}, g_{2}) = (A_{1}^{A}, A_{2}^{A})$ for the ALRT, $(g_{1}, g_{2}) = (I_{β}^{- 1} (α / 2; d_{1}, d_{2}), I_{β}^{- 1} (1 - α / 2; d_{1}, d_{2}))$ for the F-test, and (g₁, g₂) = (A₁, A₂) for the proposed LRT. Numerical integration methods (e.g., importance sampling, Laplace approximation, and Riemann sum approximation) can be used to approximate the integral in (20). We use Riemann sum approximation for the three examples in Section 4.3.
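A Riemann sum for (20) can be sketched as below. The function names, the midpoint rule, and the step count are assumptions made for illustration; the paper does not specify which Riemann sum was used. The density (19) is evaluated on the log scale to avoid overflow for large d₁ + d₂.

```python
import math


def log_density_y(y, d1, d2, lam1, lam2):
    """Log of the density (19) of Y = X1/(X1 + X2) given (lam1, lam2)."""
    denom = (lam2 - lam1) * y + lam1
    log_beta_pdf = (
        math.lgamma(d1 + d2) - math.lgamma(d1) - math.lgamma(d2)
        + (d1 - 1) * math.log(y) + (d2 - 1) * math.log(1.0 - y)
    )
    return log_beta_pdf + d2 * math.log(lam1 / denom) + d1 * math.log(lam2 / denom)


def rejection_probability(g1, g2, d1, d2, lam1, lam2, steps=20000):
    """Equation (20): one minus the integral of (19) over (g1, g2).

    Uses a midpoint Riemann sum, so the density is never evaluated at 0 or 1.
    """
    h = (g2 - g1) / steps
    inside = h * sum(
        math.exp(log_density_y(g1 + (i + 0.5) * h, d1, d2, lam1, lam2))
        for i in range(steps)
    )
    return 1.0 - inside


# d1 = d2 = 1 with the alpha = 0.05 cutoffs (1/40, 39/40) from Section 2.
size = rejection_probability(0.025, 0.975, 1, 1, 10.0, 10.0)
power = rejection_probability(0.025, 0.975, 1, 1, 10.0, 15.0)
```

Under λ₁ = λ₂ the call returns the type I error of the chosen cutoffs; under λ₁ ≠ λ₂ it returns the power. Any pair (g₁, g₂) from the three tests can be passed in.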

4.3 Comparing Three Tests

We compare the three tests in Examples 1–3 by investigating the type I error in Example 1 and the power in Examples 2 and 3.

Example 1

Let d₁ = d₂ = d, λ₁ = λ₂, and level α = 0.05. Rejection probabilities of the three tests are calculated for d = 1, …, 100 using (20). Figure 1(a) plots the type I error against d. We can see that both the F-test and the LRT have their type I errors equal to α. The type I error of the ALRT is larger than 0.05 and converges to 0.05 asymptotically as the sample size increases. A general rule given by Lee (1992) suggests that the ALRT works only if d₁ + d₂ ≥ 25, which can be violated in engineering experiments and medical research if the sample size is small. The rejection probability under α = 0.05 for d = 13 (the smallest value of d to satisfy d₁ + d₂ ≥ 25) is roughly 0.053, which may still be too high for controlling the type I error.
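The type I error of the ALRT in this balanced setting can be sketched directly, because for d₁ = d₂ = d the statistic reduces to −2 log Λ = −2d log(4y(1 − y)) and the rejection region is symmetric about 1/2. In the Python sketch below the function names are hypothetical, the chi-square critical value is hard-coded rather than computed, and the beta c.d.f. uses the binomial sum valid for integer parameters.

```python
import math

# 0.95 quantile of the chi-square distribution with 1 degree of freedom,
# hard-coded here as an assumption instead of calling a statistics library.
CHI2_1_095 = 3.841458820694124


def beta_cdf(c, a, b):
    """P(Y <= c) for Y ~ Beta(a, b) with positive integers a, b."""
    n = a + b - 1
    return sum(math.comb(n, j) * c**j * (1.0 - c)**(n - j) for j in range(a, n + 1))


def alrt_size_equal_events(d, crit=CHI2_1_095):
    """Type I error of the ALRT when d1 = d2 = d and lambda1 = lambda2.

    Rejecting when -2 d log(4 y (1 - y)) > crit is the same as rejecting
    when |y - 1/2| > 0.5 * sqrt(1 - exp(-crit / (2 d))).
    """
    half_width = 0.5 * math.sqrt(1.0 - math.exp(-crit / (2.0 * d)))
    a1, a2 = 0.5 - half_width, 0.5 + half_width
    return beta_cdf(a1, d, d) + 1.0 - beta_cdf(a2, d, d)


for d in (1, 5, 13, 50, 100):
    print(d, round(alrt_size_equal_events(d), 4))
```

The printed values can be compared with the ALRT curve in Figure 1(a); they should sit above 0.05 and move toward it as d grows.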

Example 2

Let (d₁, d₂) = (30, 4) and (λ₁, λ₂) = (12, 11). We compare the power of the F-test with the power of the LRT for level α ∈ (0, 1). Following the power computation in (20), we depict in Figure 1(b) the difference between the power and level α. We see that the proposed LRT is unbiased because the solid curve is positive. The F-test, however, is biased for all α ∈ (0, 1). Further, the LRT is more powerful than the F-test. For example, at level α = 0.1, power of the F-test is about 0.097, and the power from the LRT is about 0.104 in this example.

Example 3

We investigate the relation between the power and sample size of the F-test and LRT under two scenarios. In this example, α = 0.1, (λ₁, λ₂) = (10, 10.5) for scenario 1, and (λ₁, λ₂) = (10, 15) for scenario 2. We compute the power of the F-test and LRT for d₁ = {1, 2, …, 50} and d₂ = 4 × d₁, which mimics a typical sample size increase when moving from the phase II to the phase III of a clinical trial. Figure 1(c) and (d) plot the power against d₁ for scenarios 1 and 2, respectively. We can see that the power is increasing with the sample size d₁. The power of the LRT is higher than the power of the F-test for all d₁ ∈ {1, 2, … 50}. For instance, 2 more patients are needed for the F-test to be as powerful as the LRT for all d₁ ∈ {1, …, 50} in scenario 2. Moreover, bias from the F-test occurs under both scenarios. In scenario 1 where the difference between λ₁ and λ₂ is small (λ₂ = 1.05 × λ₁), the F-test is biased for all d₁ < 11; for scenario 2 with a larger difference between λ₁ and λ₂ (λ₂ = 1.5 × λ₁), the F-test is biased for d₁ < 2. This result indicates that the bias in the F-test occurs mostly for small sample size with a moderate difference between λ₁ and λ₂.

4.4 Application in the Design of a Comparative Lung Cancer Clinical Trial

We implement the F-test and LRT to design a non-small cell lung cancer (NSCLC) clinical trial. Simon, Schell, Begum, Haura, Antonia and Bepler (2011) introduced a personalized NSCLC therapy along with three alternative standard (non-personalized) NSCLC therapies, and showed that the personalized therapy had statistically significant improvements over standard therapies in response rate, overall survival, and progression free survival. We attempt to design a phase III trial to further compare the first 2 year survival distributions of the personalized therapy and that of a standard therapy (recorded as MCC 13303 in Simon et al. (2011)). The time span of first two years was chosen because it was where the difference in survival was found and where exponential assumption was believed to be valid. We assume that more patients can be recruited for the personalized therapy compared with the alternative therapy because 1, the personalized therapy showed a significant improvement in the Phase II trial (Simon et al. 2011), and 2, non-small cell lung cancer patients typically prefer the most promising treatment. Specifically, we assume that 75% of the events are from the personalized therapy and the other 25% are from the alternative therapy.

Figure 2 plots the Kaplan-Meier curves of overall survival corresponding to the two therapies from month 0 to 24. Assuming that λ₁ = 24 for personalized therapy group and λ₂ = 15 for the alternative group, we applied the exponential test in Hollander and Proschan (1979) to check the exponential assumption with patients alive at 2 years being censored. The p-value of 0.853 (test statistic 0.185) of personalized therapy group and 0.561 (test statistic 0.582) of the other group indicate that exponential assumption was valid for both groups.

Figure 2. Kaplan-Meier curves of the first 2 year survival data from the personalized therapy and an alternative therapy described in Simon et al. (2011).

We compute the power of the F-test and LRT based on the MLE of (λ₁, λ₂) that were estimated using the data in Simon et al. (2011), i.e., λ̂₁ = 22.25 and λ̂₂ = 13.52. Table 1 lists the exact power from the two tests for the two-sided hypothesis in (2) with significance level 0.1. We can see that in the first row, where only 4 patients were recruited, the power of the F-test is less than the significance level 0.1, which indicates that the F-test is biased. The proposed LRT is unbiased. Furthermore, the minimum sample size to achieve 80% of power is 136 for the LRT, less than that for the F-test with difference of 4 patients. Similarly, the minimum sample size to achieve 90% of power is 184 for the LRT, which is 4 patients less than that of the F-test.

Table 1.

Total sample size (left column), power from the LRT (middle column), and power from the F-test (right column) for testing the equivalence of λ₁ and λ₂ with (λ̂₁, λ̂₂) = (22.25, 13.52) and proportion of patients being 3: 1 in the two groups.

Total Sample Size	LRT Power	F-Test Power
4	0.112	0.098
8	0.148	0.124
12	0.177	0.152
100	0.687	0.673
104	0.703	0.689
108	0.718	0.704
112	0.732	0.719
116	0.745	0.733
120	0.758	0.747
124	0.771	0.760
128	0.783	0.772
132	0.794	0.784
136	0.805	0.795
140	0.815	0.806
144	0.825	0.816
148	0.834	0.826
152	0.843	0.835
156	0.852	0.844
160	0.860	0.852
164	0.867	0.860
168	0.875	0.868
172	0.882	0.875
176	0.888	0.882
180	0.894	0.889
184	0.900	0.895
188	0.906	0.901


5 Summary and Concluding Remarks

The test for comparing two exponential distributions is used extensively in survival studies and engineering experiments. A UMPU test is necessary to provide correct interpretation of two-sample exponential data in scientific research. In this paper we have proposed a two-sided exact likelihood ratio test to compare two exponential parameters. We have proved that the LRT is the UMPU test. This test can also be implemented to compare the occurrence rates of two Poisson samples (Cox 1953).

Because of the UMPU property, the acceptance regions of Y = X₁/(X₁ + X₂) for all levels α are the uniformly most accurate unbiased (UMAU) confidence sets under H₀ (Lehmann 1986). A similar form of the acceptance region has been discussed in the literature. For instance, Tate and Klett (1959) provided the optimal confidence interval of the variance parameter from a normal distribution. Their confidence interval was defined using two conditions similar to (5) and (7).

To implement the proposed LRT (as well as the other two tests), it is crucial to verify the exponential assumption. Given that the focus of this manuscript is rather on the exact UMPU test for comparing two exponential samples, we briefly discuss violation and verification of the exponential assumption next. Violation of the exponential assumption may be due to either non-exponential survival or censoring (e.g., different follow-up time or lost to follow-up). Verifying the exponential distribution with censored observations can be challenging, especially for small samples. Three approaches could be applied to check the exponential assumption: The first approach is the test derived in Hollander and Proschan (1979) that was shown in Section 4.4 for the lung cancer data with censoring. The second is a graphical inspection approach given in Lee (1992). Relying on subjective judgement only, this approach can be difficult to apply in practice. The third approach, which is currently under development, is a reduced piecewise exponential modeling approach that builds on the proposed LRT and a backward elimination procedure to model the failure rate by a piecewise constant function with significant changepoints. Exponential distribution is believed to be valid if no significant changepoint in the failure rate could be identified. According to our findings, this approach is more objective than the graphical inspection, and can be more powerful than the test in Hollander and Proschan (1979).

Current software packages (e.g., PASS and STPLAN) implement the F-test, which, as shown in Section 4.3, may fail to detect significant differences due to its bias and lower power, and thus result in misleading interpretation of the data. The weaknesses of the F-test are more pronounced in situations where the sample size of the test is relatively small, as can be seen in clinical trials of rare diseases and high cost industrial experiments. We suggest that the proposed LRT be used instead for comparing two exponential distributions. A software package on the two-sided UMPU test has been developed in MATLAB and is available upon request.

Acknowledgments

The authors would like to thank the editor, Dr. Steven Snapinn, the associate editor, and two referees for their constructive suggestions that improved the quality of this paper. The authors thank Gerold Bepler and George R. Simon for providing the non-small cell lung cancer data. This research was sponsored, in part, by the National Institutes of Health and the National Cancer Institute, grant 1RC2CA14833201.

APPENDIX. PROOFS

Proof of Remark 1

Proof

The test statistic of the LRT is

\begin{array}{l} Λ = \frac{L (\hat{λ} ∣ \{t_{1, i}, c_{1, i}\}_{i = 1}^{n_{1}}, \{t_{2, j}, c_{2, j}\}_{j = 1}^{n_{2}})}{L ({\hat{λ}}_{1}, {\hat{λ}}_{2} ∣ \{t_{1, i}, c_{1, i}\}_{i = 1}^{n_{1}}, \{t_{2, j}, c_{2, j}\}_{j = 1}^{n_{2}})} \\ = \frac{\prod_{i = 1}^{d_{1}} f (t_{1, i} ∣ \hat{λ}) {(1 - F (t_{1, d_{1}} ∣ \hat{λ}))}^{n_{1} - d_{1}} \times \prod_{j = 1}^{d_{2}} f (t_{2, j} ∣ \hat{λ}) {(1 - F (t_{2, d_{2}} ∣ \hat{λ}))}^{n_{2} - d_{2}}}{\prod_{i = 1}^{d_{1}} f (t_{1, i} ∣ {\hat{λ}}_{1}) {(1 - F (t_{1, d_{1}} ∣ {\hat{λ}}_{1}))}^{n_{1} - d_{1}} \times \prod_{j = 1}^{d_{2}} f (t_{2, j} ∣ {\hat{λ}}_{2}) {(1 - F (t_{2, d_{2}} ∣ {\hat{λ}}_{2}))}^{n_{2} - d_{2}}} \\ = \frac{{(d_{1} + d_{2})}^{d_{1} + d_{2}}}{d_{1}^{d_{1}} d_{2}^{d_{2}}} \times \frac{x_{1}^{d_{1}} x_{2}^{d_{2}}}{{(x_{1} + x_{2})}^{d_{1} + d_{2}}} \\ \propto {(\frac{x_{1}}{x_{1} + x_{2}})}^{d_{1}} {(\frac{x_{2}}{x_{1} + x_{2}})}^{d_{2}} = φ (x_{1}, x_{2}) . \end{array}

So the LRT rejects H₀ if (x₁/(x₁ + x₂))^d₁ (x₂/(x₁ + x₂))^d₂ ≤ C_α, where P((X₁/(X₁ + X₂))^d₁ (X₂/(X₁ + X₂))^d₂ ≤ C_α)= α under H₀.

Proof of Lemma 1

Proof

Define θ₁ = 1/λ₁ + 1/λ₂ and θ₂ = 1/λ₂ − 1/λ₁, so that 1/λ₁ = (θ₁ − θ₂)/2 and 1/λ₂ = (θ₁ + θ₂)/2. The likelihood function of (θ₁, θ₂) can be written in the form of (9), where

\begin{array}{l} C (θ_{1}, θ_{2}) = I_{(0, \infty)} (θ_{1}) I_{(- θ_{1}, θ_{1})} (θ_{2}) {(θ_{1} - θ_{2})}^{d_{1}} {(θ_{1} + θ_{2})}^{d_{2}}, \\ H (x_{1}, x_{2}) = 2^{- (d_{1} + d_{2})} x_{1}^{d_{1} - 1} x_{2}^{d_{2} - 1} / [(d_{1} - 1)! (d_{2} - 1)!], \\ T (x_{1}, x_{2}) = - (x_{1} + x_{2}) / 2, \\ U (x_{1}, x_{2}) = (x_{1} - x_{2}) / 2. \end{array}

Let V (x₁, x₂) = x₁/(x₁ + x₂) = y. Then the distribution of Y, Beta(d₁, d₂), is free of θ₁ under H₀. Thus V (x₁, x₂) is an ancillary statistic for θ₁. For the exponential family, T (X₁, X₂) is minimal sufficient and complete for θ₁. (See Lehmann (1986), page 142.) Thus, Basu’s theorem (Basu 1958) guarantees that V (X₁, X₂) and T (X₁, X₂) are independent. On the other hand, V (x₁, x₂) can be written in the form of (10) with a(T (x₁, x₂)) = −1/[2T (x₁, x₂)] > 0 and b(T(x₁, x₂)) = 0.5. Thus, (9) and (10) both hold.

Note that testing the hypothesis in (2) is equivalent to testing $H_{0}^{★} : θ_{2} = 0$ vs. $H_{1}^{★} : θ_{2} \neq 0$ . Following the five conditions in Section 3, the UMPU test rejects H₀ (or $H_{0}^{★}$ ) if y ≤ C₁ or y ≥ C₂ where the values of C₁ and C₂ are determined by (11) and (12). Since Y has Beta(d₁, d₂) distribution, (11) can be written as (13), and (12) can be written as

\int_{C_{1}}^{C_{2}} y f_{β} (y ∣ d_{1}, d_{2}) d y = (1 - α) \times \frac{d_{1}}{d_{1} + d_{2}} .

(21)

By (3), (21) is equivalent to (14). Thus (13) and (14) are equivalent to (11) and (12). The test in Lemma 1 is the UMPU test.
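The step from (21) to (14) rests on a one-line identity between two beta densities. It is spelled out here as a sketch in the notation of (3):

```latex
\begin{aligned}
y\, f_{\beta}(y \mid d_1, d_2)
  &= \frac{(d_1 + d_2 - 1)!}{(d_1 - 1)!\,(d_2 - 1)!}\; y^{d_1} (1 - y)^{d_2 - 1} \\
  &= \frac{d_1}{d_1 + d_2} \cdot \frac{(d_1 + d_2)!}{d_1!\,(d_2 - 1)!}\; y^{d_1} (1 - y)^{d_2 - 1} \\
  &= \frac{d_1}{d_1 + d_2}\; f_{\beta}(y \mid d_1 + 1, d_2).
\end{aligned}
```

Substituting this into the left-hand side of (21) and cancelling the common factor d₁/(d₁ + d₂) gives (14).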

Proof of Lemma 2

Proof

Let $X_{n} = \sum_{i = 1}^{n} S_{i}$ and $Z_{n + 1} = \sum_{j = 1}^{n + 1} R_{j}$ denote two mutually independent random variables distributed Bin(n, p) and Bin(n+1, p), respectively, where {S_i}_{i=1,…,n} and {R_j}_{j=1,…,n+1} are 2n + 1 mutually independent and identically distributed (i.i.d.) random variables having a Bernoulli distribution with parameter p. Using the i.i.d. condition and the definition of X_n and Z_n₊₁,

\begin{array}{l} P (Z_{n + 1} \geq k) = P (\sum_{i = 1}^{n + 1} R_{i} \geq k) \\ = P (\sum_{i = 1}^{n} R_{i} \geq k and \sum_{i = 1}^{n + 1} R_{i} \geq k) + P (\sum_{i = 1}^{n} R_{i} < k and \sum_{i = 1}^{n + 1} R_{i} \geq k) \\ = P (\sum_{i = 1}^{n} R_{i} \geq k) + P (\sum_{i = 1}^{n} R_{i} = k - 1 and R_{n + 1} = 1) \\ = P (\sum_{i = 1}^{n} R_{i} \geq k) + P (\sum_{i = 1}^{n} R_{i} = k - 1) \times P (R_{n + 1} = 1) \\ = P (\sum_{i = 1}^{n} S_{i} \geq k) + P (\sum_{i = 1}^{n} S_{i} = k - 1) \times p \\ = P (X_{n} \geq k) + P (X_{n} = k - 1) \times p . \end{array}

Equation (16) holds for all positive integers k.

Proof of Theorem 1

Proof

Conditional on (13), (14) is equivalent to $\int_{C_{1}}^{C_{2}} f_{β} (y ∣ d_{1} + 1, d_{2}) d y = \int_{C_{1}}^{C_{2}} f_{β} (y ∣ d_{1}, d_{2}) d y$ , which can be written as I_β(C₂; d₁ + 1, d₂) − I_β(C₁; d₁ + 1, d₂) = I_β(C₂; d₁, d₂) − I_β(C₁; d₁, d₂), where the function I_β(·; d₁, d₂) is the incomplete Beta function with parameters d₁ and d₂, and I_β(C₁; d₁, d₂) = P (Y < C₁|Y ~ Beta(d₁, d₂)). Thus, conditional on (13), (14) can be written as

I_{β} (C_{2}; d_{1}, d_{2}) - I_{β} (C_{2}; d_{1} + 1, d_{2}) = I_{β} (C_{1}; d_{1}, d_{2}) - I_{β} (C_{1}; d_{1} + 1, d_{2}) .

(22)

By integrating beta p.d.f. (3) by parts, I_β(C₁; d₁, d₂) can be written as

I_{β} (C_{1}; d_{1}, d_{2}) = \sum_{j = d_{1}}^{d_{1} + d_{2} - 1} \frac{(d_{1} + d_{2} - 1)!}{j! (d_{1} + d_{2} - 1 - j)!} C_{1}^{j} {(1 - C_{1})}^{d_{1} + d_{2} - 1 - j} .

(23)

Let n = d₁ +d₂ −1 and p = C₁. Following the notation in Lemma 2, X_n ~ Bin(d₁ +d₂ −1, C₁) and Z_n₊₁ ~ Bin(d₁ + d₂, C₁).

By (15) and (23), I_β(C₁; d₁, d₂) = P (X_n ≥ d₁) and I_β(C₁; d₁ + 1, d₂) = P (Z_n₊₁ ≥ d₁ + 1). Using Lemma 2, the right hand side of (22) can be written as

\begin{array}{l} I_{β} (C_{1}; d_{1}, d_{2}) - I_{β} (C_{1}; d_{1} + 1, d_{2}) = P (X_{n} \geq d_{1}) - P (Z_{n + 1} \geq d_{1} + 1) \\ = P (X_{n} \geq d_{1}) - [P (X_{n} \geq d_{1} + 1) + C_{1} \times P (X_{n} = d_{1})] \\ = [P (X_{n} \geq d_{1}) - P (X_{n} \geq d_{1} + 1)] - C_{1} \times P (X_{n} = d_{1}) \\ = P (X_{n} = d_{1}) - C_{1} \times P (X_{n} = d_{1}) \\ = (1 - C_{1}) \times P (X_{n} = d_{1}) \\ = \frac{(d_{1} + d_{2} - 1)!}{d_{1}! (d_{2} - 1)!} C_{1}^{d_{1}} {(1 - C_{1})}^{d_{2}} . \end{array}

(24)

Similarly, the left hand side of (22) can be derived as

I_{β} (C_{2}; d_{1}, d_{2}) - I_{β} (C_{2}; d_{1} + 1, d_{2}) = \frac{(d_{1} + d_{2} - 1)!}{d_{1}! (d_{2} - 1)!} C_{2}^{d_{1}} {(1 - C_{2})}^{d_{2}} .

(25)

By (22), (24), and (25), the joint condition of (13, 14) is equivalent to the joint condition of (13) and

C_{1}^{d_{1}} {(1 - C_{1})}^{d_{2}} = C_{2}^{d_{1}} {(1 - C_{2})}^{d_{2}} .

(26)

Notice that if (A₁, A₂) = (C₁, C₂), the joint condition of (5, 7) can be written as (13, 26). Thus, A₁ and A₂ in the LRT can guarantee (13) and (14) if we let A₁ = C₁ and A₂ = C₂. This completes the proof.

Contributor Information

Gang Han, Email: Gang.Han@moffitt.org.

Michael J. Schell, Email: Michael.Schell@moffitt.org.

Jongphil Kim, Email: Jongphil.Kim@moffitt.org.

References

Basu D. On Statistics Independent of a Sufficient Statistic. Sankhyā. 1958;20:223–226.
Bui M, Han G, Acs G, Gonzalez R, Reed D, Pasha IL, Zhang P. Connexin 43 Is a Potential Prognostic Biomarker for Ewing Sarcoma (EWS)/Primitive Neuroectodermal Tumor (PNET). Sarcoma. 2011:Article ID 971050. doi: 10.1155/2011/971050.
Cox DR. Some Simple Approximate Tests for Poisson Variates. Biometrika. 1953;40:354–360.
Epstein B, Sobel M. Some Theorems Relevant to Life Testing from an Exponential Distribution. Annals of Mathematical Statistics. 1954;25:373–381.
Hollander M, Proschan F. Testing to Determine the Underlying Distribution Using Randomly Censored Data. Biometrics. 1979;35:393–401.
Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Wiley; New York: 1980.
Lee ET. Statistical Methods for Survival Data Analysis. Wiley; New York: 1992.
Lehmann EL. Testing Statistical Hypotheses. Springer; New York: 1986.
Meier P, Karrison T, Chappell R, Xie H. The Price of Kaplan-Meier. Journal of the American Statistical Association. 2004;99:890–896.
Miller RG. What Price Kaplan-Meier? Biometrics. 1983;39:1077–1081.
Simon GR, Schell MJ, Begum M, Haura JKE, Antonia SJ, Bepler G. Preliminary indication of survival benefit from ERCC1 and RRM1-tailored chemotherapy in patients with advanced nonsmall cell lung cancer: Evidence from an individual patient analysis. Cancer. 2011. doi: 10.1002/cncr.26522.
Tate RF, Klett GW. Optimal Confidence Intervals for the Variance of a Normal Distribution. Journal of the American Statistical Association. 1959;54:674–682.
