Abstract
The aim of this study is to propose a new pairwise multiple comparison adjustment procedure based on Genz's numerical computation of probabilities from a multivariate normal distribution. The method is applied to the results of two-sample log-rank and weighted log-rank statistics for survival data containing right-censored observations. We conducted Monte Carlo simulation studies both to evaluate the familywise error rate and power of the proposed procedure and to compare it with conventional methods. The proposed method is also applied to a data set of 815 patients on a liver transplant waiting list from 1990 to 1999. The proposed method controlled the type I error rate and yielded power similar to Tukey's and high power relative to the other adjustment procedures. In addition to having a straightforward formula, it is easy to implement.
1. Introduction
Survival analysis is concerned with making inferences from time-to-event data. It provides many statistical procedures for studying the time from a well-defined origin until the occurrence of a certain event [1]. One of the main interests in survival analysis is evaluating the equality of survival functions across groups. Many tests, such as the log-rank and weighted log-rank tests, have been proposed [2–8]. Although these tests made important contributions to survival analysis, they provide only overall or two-sample comparison results. If researchers apply them repeatedly to compare each group with every other in a multigroup design, the probability of making at least one type I error rises above the nominal level. To prevent this, pairwise multiple comparison procedures are needed. When more than two groups are found to be unequal, it is necessary to decide correctly which groups differ from the others. The appropriate way to control the type I error is to consider the familywise error (FWE) rate, which is the probability of making at least one type I error when making all pairwise comparisons [9].
Adjustment methods such as the Bonferroni, Holm, and Sidak methods are commonly used in the literature. In survival analysis, however, this topic has only recently been studied. Adjustment methods are applied to the results of two-sample log-rank and weighted log-rank tests. Bonferroni is the most widely preferred method. In a two-sided test, Bonferroni takes the significance level for each tail as (α/2)/m, where m is the number of pairwise comparisons in the study, but it does not keep the familywise error rate at the nominal level: in spite of its simplicity, it has been found to be conservative in survival analysis [9, 10]. Logan et al. proposed two adjustment methods that account for the correlation among the pairwise tests [9]. One was derived from the multivariate normal distribution, while the other was obtained from a simulated martingale approach. Koziol and Reid used the Sidak adjustment method for pairwise comparisons with weighted log-rank tests. Although it generates more consistent results than Bonferroni's, it was also found to be conservative [11]. Besides pairwise multiple comparisons, comparisons against a single control group have also been proposed for survival functions with right-censored data. Chakraborti and Desu developed linear rank tests, and Chen proposed a generalized Steel's test and an alternative to the generalization of Steel's test [12–15].
The aim of this study is to propose a new pairwise multiple comparison adjustment procedure based on Genz's numerical computation of probabilities from a multivariate normal distribution [16, 17]. The method is applied to the results of two-sample log-rank and weighted log-rank statistics for survival data containing right-censored observations. In Section 2, the notation is introduced and the design of the simulation study is detailed. SAS PROC LIFETEST and the R package mvtnorm were used in the simulation studies. Moreover, all adjustment methods are applied to a real lifetime data set and compared with each other. The results are presented and discussed in relation to other studies in Section 3. Finally, conclusions are given in Section 4.
2. Materials and Methods
2.1. Notation and Background
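Let $S_k(t)$ be the survival function of the $k$th group for $k = 1, \ldots, K$, where $K$ is the number of groups. The null and alternative hypotheses for the survival functions are
$$H_0: S_1(t) = S_2(t) = \cdots = S_K(t) \ \text{for all } t \le \tau \quad \text{versus} \quad H_1: \text{at least one } S_k(t) \text{ differs for some } t \le \tau, \tag{1}$$
where $\tau$ is the largest observed time.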
Let $(T_i, \delta_i, X_i, w_i)$, for $i = 1, \ldots, n$, denote an independent sample of right-censored survival data, where $T_i$ is the possibly right-censored time, $\delta_i$ is the censoring indicator ($\delta_i = 0$ if $T_i$ is censored; $\delta_i = 1$ if $T_i$ is an event time), $X_i \in \{1, \ldots, K\}$ is the group indicator, and $w_i$ is a weight. Let $t_1 < t_2 < \cdots < t_D$ be the distinct event times in the sample, indexed by $j = 1, \ldots, D$. At time $t_j$, for the $k$th group, let $Y_{jk} = \sum_{i: T_i \ge t_j} I(X_i = k)$ and $d_{jk} = \sum_{i: T_i = t_j} \delta_i I(X_i = k)$ denote the number of individuals at risk and the number of events, respectively. Let $Y_j = \sum_{k=1}^{K} Y_{jk}$ and $d_j = \sum_{k=1}^{K} d_{jk}$ denote the total number of individuals at risk and the total number of events over all groups, respectively. The weighted number of individuals at risk in the $k$th group is $Y_{jk}^{w} = \sum_{i: T_i \ge t_j} w_i I(X_i = k)$, while the weighted number of events in the $k$th group is $d_{jk}^{w} = \sum_{i: T_i = t_j} w_i \delta_i I(X_i = k)$. Let $Y_j^{w} = \sum_{k=1}^{K} Y_{jk}^{w}$ and $d_j^{w} = \sum_{k=1}^{K} d_{jk}^{w}$ indicate the total weighted number of individuals at risk and the total weighted number of events, respectively.
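To make the counting notation concrete, the following base R sketch computes the at-risk counts and event counts per group at each distinct event time. The data values and the helper name `risk_table` are illustrative assumptions and do not come from the study.

```r
# Toy right-censored sample (made-up values, not from the study).
time  <- c(2, 3, 3, 5, 6, 7, 7, 9)   # T_i: observed, possibly censored, times
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)   # delta_i: 1 = event, 0 = censored
group <- c(1, 2, 1, 2, 1, 1, 2, 2)   # X_i: group label in 1..K

# Hypothetical helper: at-risk counts Y_jk and event counts d_jk
# at each distinct event time t_j, for each group k.
risk_table <- function(time, delta, group) {
  t_j <- sort(unique(time[delta == 1]))   # distinct event times t_1 < ... < t_D
  K   <- max(group)
  Y <- sapply(seq_len(K), function(k)
    sapply(t_j, function(tj) sum(time >= tj & group == k)))
  d <- sapply(seq_len(K), function(k)
    sapply(t_j, function(tj) sum(time == tj & delta == 1 & group == k)))
  list(t = t_j, Y = Y, d = d)             # Y and d are D x K matrices
}

tab <- risk_table(time, delta, group)
tab$Y            # Y_jk: rows are event times, columns are groups
rowSums(tab$Y)   # Y_j: total number at risk at each t_j
rowSums(tab$d)   # d_j: total number of events at each t_j
```

With all weights equal to 1, the weighted counts coincide with these unweighted ones.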
For testing the null hypothesis, the test statistics form a $K$-vector $R = (r_1, r_2, \ldots, r_K)'$, where
(2) |
The variance of $r_k$ and the covariance of $r_k$ and $r_h$ are, respectively,
(3) |
(4) |
Because the $r_k$ sum to 0, they are linearly dependent. Accordingly, the general test statistic is constructed by selecting any $K - 1$ of the $r_k$'s. The test statistic, $(r_1, r_2, \ldots, r_{K-1}) V^{-1} (r_1, r_2, \ldots, r_{K-1})'$, follows a chi-square distribution with $K - 1$ degrees of freedom under the null hypothesis, where $V$ is the $(K - 1) \times (K - 1)$ variance-covariance matrix.
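The quadratic form can be illustrated with a short base R sketch. The vector `r`, the matrix `V`, and the helper `overall_stat` are made-up illustrations chosen only so that the statistics sum to zero and the rows of the covariance matrix sum to zero; they are not values from the study.

```r
# Illustrative inputs (assumed): the r_k sum to zero, and each row and
# column of their covariance matrix V sums to zero as well.
r <- c(2.0, -0.5, -1.5)
V <- matrix(c( 2, -1, -1,
              -1,  2, -1,
              -1, -1,  2), nrow = 3, byrow = TRUE)
K <- length(r)

# Hypothetical helper: quadratic form using all groups except `drop`.
overall_stat <- function(r, V, drop) {
  keep <- setdiff(seq_along(r), drop)
  as.numeric(t(r[keep]) %*% solve(V[keep, keep]) %*% r[keep])
}

stat      <- overall_stat(r, V, drop = K)
p_overall <- pchisq(stat, df = K - 1, lower.tail = FALSE)

# The statistic is the same whichever group is left out.
all_drops <- sapply(seq_len(K), function(k) overall_stat(r, V, drop = k))
all_drops
```

Dropping one group removes the linear dependence, which is why the reduced matrix can be inverted even though the full one cannot.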
Let $m = K(K - 1)/2$ be the number of all pairwise comparisons. The two-sided test statistic $Z_{kh}$ compares groups $k$ and $h$ and follows a standard normal distribution under the null hypothesis.
(5) |
The unadjusted p value is $p_{kh} = P(\chi_1^2 > Z_{kh}^2)$. The multiple comparison procedures used to adjust the p values in this study are as follows:
Bonferroni: $p_{kh} = \min[1, m \times P(\chi_1^2 > Z_{kh}^2)]$.
Scheffé: $p_{kh} = P(\chi_{K-1}^2 > Z_{kh}^2)$.
Sidak: $p_{kh} = 1 - [1 - P(\chi_1^2 > Z_{kh}^2)]^m$.
Studentized Maximum Modulus (SMM): $p_{kh} = 1 - [2\Phi(|Z_{kh}|) - 1]^m$.
Tukey: $p_{kh} = 1 - \int_{-\infty}^{\infty} K \phi(y) [\Phi(y) - \Phi(y - \sqrt{2}\,|Z_{kh}|)]^{K-1} \, dy$,
where $\phi$ and $\Phi$ are the standard normal density and cumulative distribution functions, respectively.
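As a rough check of the chi-square-based formulas, the base R sketch below computes the unadjusted, Bonferroni, Scheffé, and Sidak p values for a single pairwise statistic. The function name `adjust_pairwise` is a hypothetical helper, and the SMM and Tukey adjustments are deliberately left out.

```r
# Adjusted p values for one pairwise statistic z among K groups.
# Covers only the chi-square-based formulas (Bonferroni, Scheffe, Sidak).
adjust_pairwise <- function(z, K) {
  m     <- K * (K - 1) / 2                            # number of pairwise comparisons
  p_raw <- pchisq(z^2, df = 1, lower.tail = FALSE)    # unadjusted p value
  c(unadjusted = p_raw,
    bonferroni = min(1, m * p_raw),
    scheffe    = pchisq(z^2, df = K - 1, lower.tail = FALSE),
    sidak      = 1 - (1 - p_raw)^m)
}

# |Z| = 1.639 is the Fleming statistic for the AB versus B pair in Table 6 (K = 4).
res <- adjust_pairwise(1.639, K = 4)
round(res, 4)
```

The values should come out close to the corresponding Table 6 entries (0.1012, 0.6070, 0.4424, 0.4727); small differences are expected because the tabulated |Z| is rounded to three decimals.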
2.2. Proposed Adjustment Procedure
The $m$-vector $Z = (Z_{12}, \ldots, Z_{1K}, Z_{23}, \ldots, Z_{2K}, \ldots, Z_{K-1,K})'$ has a multivariate normal distribution with mean zero and variance-covariance matrix $\Sigma$. Under the null hypothesis, the elements of $\Sigma$ follow the rule $\mathrm{Cov}(Z_{kh}, Z_{kh'}) = 0.5$, $\mathrm{Cov}(Z_{kh}, Z_{k'h'}) = 0$, and $\mathrm{Cov}(Z_{kh}, Z_{hk'}) = -0.5$, where $1 \le k \ne h \ne k' \ne h' \le K$ [9, 14, 15].
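One way to build a matrix with this pattern in base R is through a contrast matrix whose row for the pair (k, h) holds +1 in column k and -1 in column h. This construction is an assumption made for illustration (it treats each pairwise statistic as a scaled difference of independent, equal-variance components) and `null_sigma` is a hypothetical helper name. It reproduces the 0.5, 0, and -0.5 entries stated above, and also gives 0.5 for pairs that share their second index.

```r
# Null correlation matrix of the m pairwise statistics, ordered as
# Z_12, ..., Z_1K, Z_23, ..., Z_{K-1,K}. Hypothetical helper.
null_sigma <- function(K) {
  pairs <- t(combn(K, 2))                  # m x 2 matrix of (k, h) with k < h
  m <- nrow(pairs)
  C <- matrix(0, nrow = m, ncol = K)       # row for Z_kh is e_k - e_h
  C[cbind(seq_len(m), pairs[, 1])] <-  1
  C[cbind(seq_len(m), pairs[, 2])] <- -1
  Sigma <- tcrossprod(C) / 2               # entries are 1, 0.5, -0.5, or 0
  dimnames(Sigma) <- rep(list(paste0("Z", pairs[, 1], pairs[, 2])), 2)
  Sigma
}

Sigma <- null_sigma(4)
Sigma["Z12", c("Z13", "Z34", "Z23")]       # shared first index, disjoint, chained
```

Note that a matrix built this way has rank K - 1 rather than m, since the m pairwise statistics are determined by K - 1 independent contrasts.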
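The multivariate normal distribution function $\Phi(a, b, \Sigma)$ is
$$\Phi(a, b, \Sigma) = \frac{1}{\sqrt{|\Sigma| (2\pi)^m}} \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_m}^{b_m} e^{-\frac{1}{2} x' \Sigma^{-1} x} \, dx_m \cdots dx_1. \tag{6}$$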
For the integration shown above, we used the “mvtnorm” library (released February 2, 2016) for numerical computation in R. Three algorithms are available for evaluating normal probabilities; the default is the randomized quasi-Monte Carlo procedure of Genz (1992, 1993). We used this approach because it is easy to use and compute in R.
The proposed adjusted p value for the pairwise comparison of the $k$th and $h$th groups is obtained using $\Phi$ as
$$p_{kh} = 2\,[1 - \Phi(a, b, \Sigma)], \tag{7}$$
where $a = (-\infty, \ldots, -\infty)$ and $b = (Z_{kh}, \ldots, Z_{kh})$ are vectors of length $m$.
Additionally, the critical value for the pairwise comparison can be evaluated with
(8) |
2.3. Simulation Study
We performed Monte Carlo simulation studies to examine the proposed and conventional adjustment procedures. The FWE rate and power of the adjustment procedures were estimated from the simulation results. The number of groups was set to $K = 4$, so that $X_i \in \{1, \ldots, 4\}$. Sample sizes were equal across groups, with $n = 50$, 150, and 250 used to estimate the FWE rate and only $n = 250$ used in the power study. The survival times $T_i$ were generated from the exponential distribution, $T_i \sim \exp(\lambda_{X_i})$, and the lognormal distribution, $T_i \sim \mathrm{lognormal}(\mu_{X_i}, \sigma^2)$. The censoring rate was set to 30%; accordingly, the censoring indicator was generated from a Bernoulli distribution, $\delta_i \sim \mathrm{Bernoulli}(p = 0.70)$. Note that the censoring rate was the same for each group in both the FWE rate and power studies. To obtain the adjusted p values, the Bonferroni, Scheffé, Sidak, SMM, Tukey, and proposed adjustment procedures were applied to the pairwise comparison results of the log-rank and weighted log-rank tests. For each scenario, 1000 data sets were simulated independently.
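The data-generating step for the exponential scenarios might look like the base R sketch below. It rests on several assumptions that the text does not settle: that n is the size of each group, that the exponential parameter is a rate, and that the event indicator is drawn independently of the survival time. The helper name `simulate_exp` and the seed are likewise illustrative.

```r
set.seed(2016)  # arbitrary seed, for reproducibility only

# Hypothetical helper: one simulated data set with K groups of size n,
# exponential survival times with group-specific rates (assumed to be rates),
# and an event indicator with P(delta = 1) = 0.70, i.e. 30% censoring.
simulate_exp <- function(n, lambda, p_event = 0.70) {
  K     <- length(lambda)
  group <- rep(seq_len(K), each = n)
  data.frame(
    time  = rexp(n * K, rate = lambda[group]),       # T_i ~ exp(lambda_{X_i})
    delta = rbinom(n * K, size = 1, prob = p_event), # 1 = event, 0 = censored
    group = group
  )
}

# FWE scenario: all four groups share lambda = 1.
null_data <- simulate_exp(n = 50, lambda = rep(1, 4))
# Power scenario: first configuration of Table 3.
alt_data  <- simulate_exp(n = 250, lambda = c(2.25, 1.50, 1.50, 1.50))

table(null_data$group, null_data$delta)
```

A lognormal scenario could be generated the same way by swapping `rexp` for `rlnorm` with group-specific `meanlog` and `sdlog = 0.5`.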
To compare the FWE rates of the adjustment procedures, the survival times for each group were generated from the standard exponential distribution with $\lambda_k = 1$ and from the lognormal distribution with location parameter $\mu_k = 0$ and scale parameter $\sigma_k = 0.5$. The estimated FWE rates were evaluated at the significance level $\alpha = 0.05$. In the power study, we used exponential distributions with various parameters $\lambda_k$ and lognormal distributions with $\sigma_k = 0.5$ but different values of $\mu_k$. Power was calculated as the probability of making a correct decision for the unequal pairs only. Note that the exponential distribution provides a proportional hazards model, while the lognormal distribution corresponds to location shifts in log survival times. The lognormal distributions with various means were used because they have different hazards at early times [15].
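Given a matrix of p values from the simulated data sets, the FWE rate can be estimated as the share of data sets in which at least one pairwise comparison is rejected. The sketch below uses a small made-up matrix; the helper name `fwe_rate` and the numbers are illustrative only.

```r
# Hypothetical helper: p is an N x m matrix of p values, one row per
# simulated data set and one column per pairwise comparison.
fwe_rate <- function(p, alpha = 0.05) {
  mean(apply(p, 1, min) < alpha)    # share of data sets with at least one rejection
}

# Made-up example: 4 data sets, m = 3 comparisons.
p <- rbind(c(0.20, 0.04, 0.60),
           c(0.30, 0.50, 0.70),
           c(0.01, 0.02, 0.90),
           c(0.06, 0.80, 0.40))

fwe_rate(p)                  # unadjusted: 0.5
fwe_rate(pmin(3 * p, 1))     # Bonferroni-adjusted: 0.25
```

In a real run, each row would hold the m pairwise p values computed under the null configuration, and the same function would be applied once per adjustment method.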
2.4. Application Data
The data set was obtained from the freely available data sets in the R package “survival” [18, 19]. It consists of 815 patients on a liver transplant waiting list from 1990 to 1999, with six variables: age at addition to the waiting list, sex, blood type, year in which the patient entered the waiting list, time from entry to the end point, and final disposition. The final disposition of the patients was categorized as received a transplant, died while waiting, withdrew from the list, or censored. Blood type is a crucial factor affecting the waiting time for transplantation. Although a liver donated by a subject with blood type O can be used by patients of all blood types, a patient with blood type O can only receive a donation from a subject with blood type O. Thus, patients with blood type O on the waiting list are at a disadvantage. These data are of historical interest and provide a useful example of competing risks, but they have little relevance to current practice. We used them as an example to compare the proposed and conventional adjustment techniques on a real data set. We took receiving a transplant as the event and treated the other categories of final disposition as censored.
3. Results and Discussion
Table 1 shows the estimated FWE rates of the proposed and conventional adjustment procedures for the exponential survival distribution with different sample sizes. Under the null hypothesis, the FWE rate is expected to be 0.05. As the sample size increases, estimates get closer to the targeted value in all adjustment procedures. The Scheffé method is clearly the most conservative of the procedures. The proposed adjustment procedure and Tukey's give similar results, and both control the type I error even for small samples. Their performance is followed by the Sidak, SMM, and Bonferroni procedures. Table 2 gives the estimated FWE rates for survival times from the lognormal distribution with parameters $\mu_k = 0$ and $\sigma_k = 0.5$. Unlike the previous simulation results, not all procedures give estimates that are close to the targeted value. The proposed adjustment procedure and Tukey's again come closest. The decrease in the performance of the adjustment procedures may depend on the type of distribution. Because an exponential distribution provides a more appropriate proportional hazards model than a lognormal distribution, the performance of the log-rank and weighted log-rank tests is affected, and the adjustment procedures applied to them become more error-prone.
Table 1.
FWE rates of the proposed and conventional adjustment procedures for $K = 4$, $\alpha = 0.05$, and exponential survival distribution with $\lambda_k = 1$.
Sample size | Test | Unadjusted | Bonferroni | Scheffé | Sidak | SMM | Tukey | Proposed
---|---|---|---|---|---|---|---|---
50 | Fleming | 0.194 | 0.039 | 0.031 | 0.040 | 0.040 | 0.053 | 0.053
50 | Log-rank | 0.187 | 0.040 | 0.024 | 0.040 | 0.040 | 0.046 | 0.046
50 | ModPeto | 0.196 | 0.040 | 0.031 | 0.043 | 0.043 | 0.054 | 0.054
50 | Peto | 0.194 | 0.039 | 0.031 | 0.042 | 0.042 | 0.054 | 0.054
50 | Tarone | 0.193 | 0.041 | 0.027 | 0.041 | 0.041 | 0.053 | 0.053
50 | Wilcoxon | 0.204 | 0.043 | 0.029 | 0.044 | 0.044 | 0.052 | 0.052
150 | Fleming | 0.206 | 0.034 | 0.022 | 0.035 | 0.035 | 0.039 | 0.039
150 | Log-rank | 0.185 | 0.036 | 0.019 | 0.037 | 0.037 | 0.044 | 0.044
150 | ModPeto | 0.204 | 0.033 | 0.023 | 0.034 | 0.034 | 0.038 | 0.038
150 | Peto | 0.206 | 0.034 | 0.022 | 0.034 | 0.034 | 0.038 | 0.038
150 | Tarone | 0.198 | 0.035 | 0.020 | 0.035 | 0.035 | 0.045 | 0.045
150 | Wilcoxon | 0.211 | 0.038 | 0.023 | 0.038 | 0.038 | 0.045 | 0.045
250 | Fleming | 0.214 | 0.045 | 0.032 | 0.046 | 0.046 | 0.057 | 0.057
250 | Log-rank | 0.209 | 0.043 | 0.030 | 0.044 | 0.044 | 0.049 | 0.049
250 | ModPeto | 0.214 | 0.045 | 0.032 | 0.046 | 0.046 | 0.057 | 0.057
250 | Peto | 0.214 | 0.045 | 0.032 | 0.046 | 0.046 | 0.057 | 0.057
250 | Tarone | 0.210 | 0.047 | 0.033 | 0.047 | 0.047 | 0.054 | 0.054
250 | Wilcoxon | 0.209 | 0.044 | 0.029 | 0.045 | 0.045 | 0.056 | 0.056
Table 2.
FWE rates of the proposed and conventional adjustment procedures for $K = 4$, $\alpha = 0.05$, and lognormal survival distribution with $\mu_k = 0$ and $\sigma_k = 0.5$.
Sample size | Test | Unadjusted | Bonferroni | Scheffé | Sidak | SMM | Tukey | Proposed
---|---|---|---|---|---|---|---|---
50 | Fleming | 0.188 | 0.035 | 0.023 | 0.035 | 0.035 | 0.041 | 0.041
50 | Log-rank | 0.199 | 0.038 | 0.021 | 0.038 | 0.038 | 0.043 | 0.043
50 | ModPeto | 0.187 | 0.036 | 0.023 | 0.036 | 0.036 | 0.041 | 0.041
50 | Peto | 0.189 | 0.035 | 0.023 | 0.036 | 0.036 | 0.041 | 0.041
50 | Tarone | 0.182 | 0.032 | 0.022 | 0.033 | 0.033 | 0.041 | 0.041
50 | Wilcoxon | 0.182 | 0.038 | 0.019 | 0.038 | 0.038 | 0.047 | 0.047
150 | Fleming | 0.202 | 0.046 | 0.025 | 0.046 | 0.046 | 0.051 | 0.051
150 | Log-rank | 0.220 | 0.043 | 0.030 | 0.043 | 0.043 | 0.050 | 0.050
150 | ModPeto | 0.200 | 0.045 | 0.025 | 0.046 | 0.046 | 0.051 | 0.051
150 | Peto | 0.201 | 0.045 | 0.025 | 0.046 | 0.046 | 0.051 | 0.051
150 | Tarone | 0.210 | 0.041 | 0.029 | 0.043 | 0.043 | 0.051 | 0.051
150 | Wilcoxon | 0.196 | 0.044 | 0.023 | 0.044 | 0.044 | 0.051 | 0.051
250 | Fleming | 0.196 | 0.040 | 0.024 | 0.041 | 0.041 | 0.049 | 0.049
250 | Log-rank | 0.201 | 0.037 | 0.023 | 0.037 | 0.037 | 0.044 | 0.044
250 | ModPeto | 0.197 | 0.040 | 0.024 | 0.040 | 0.040 | 0.049 | 0.049
250 | Peto | 0.196 | 0.040 | 0.024 | 0.041 | 0.041 | 0.049 | 0.049
250 | Tarone | 0.195 | 0.045 | 0.023 | 0.046 | 0.046 | 0.049 | 0.049
250 | Wilcoxon | 0.202 | 0.032 | 0.021 | 0.034 | 0.034 | 0.042 | 0.042
Next, we examined the power of the proposed and conventional adjustment procedures for the exponential survival distribution. The estimated power under a variety of hypothesis configurations, denoted by $\lambda_k$, is given in Table 3. As the values of $\lambda_k$ become different from each other, the power of all of the adjustment procedures decreases rapidly. The proposed adjustment procedure and Tukey's give similar results, with the highest power. We also conducted additional simulations in which the survival times were generated from a lognormal distribution. The estimated power under alternative configurations of $\mu_k$ is given in Table 4. Low power is seen only when all of the $\mu_k$ values are different. Moreover, all of the adjustment procedures perform very similarly. Across all the simulation results, there is no notable difference between the log-rank and weighted log-rank tests.
Table 3.
Power of the proposed and conventional adjustment procedures for $K = 4$, $\alpha = 0.05$, and exponential survival distribution with different $\lambda_k$.
Parameters $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ | Test | Unadjusted | Bonferroni | Scheffé | Sidak | SMM | Tukey | Proposed
---|---|---|---|---|---|---|---|---
(2.25, 1.50, 1.50, 1.50) | Fleming | 0.765 | 0.594 | 0.523 | 0.597 | 0.597 | 0.618 | 0.618
(2.25, 1.50, 1.50, 1.50) | Log-rank | 0.858 | 0.785 | 0.726 | 0.784 | 0.784 | 0.802 | 0.802
(2.25, 1.50, 1.50, 1.50) | ModPeto | 0.763 | 0.593 | 0.521 | 0.593 | 0.593 | 0.618 | 0.618
(2.25, 1.50, 1.50, 1.50) | Peto | 0.764 | 0.593 | 0.522 | 0.597 | 0.597 | 0.618 | 0.618
(2.25, 1.50, 1.50, 1.50) | Tarone | 0.789 | 0.654 | 0.595 | 0.657 | 0.657 | 0.676 | 0.676
(2.25, 1.50, 1.50, 1.50) | Wilcoxon | 0.725 | 0.506 | 0.428 | 0.507 | 0.507 | 0.535 | 0.535
(2.25, 2.25, 1.50, 1.50) | Fleming | 0.757 | 0.514 | 0.437 | 0.516 | 0.516 | 0.541 | 0.541
(2.25, 2.25, 1.50, 1.50) | Log-rank | 0.823 | 0.627 | 0.555 | 0.629 | 0.629 | 0.665 | 0.665
(2.25, 2.25, 1.50, 1.50) | ModPeto | 0.756 | 0.512 | 0.436 | 0.516 | 0.516 | 0.539 | 0.539
(2.25, 2.25, 1.50, 1.50) | Peto | 0.757 | 0.513 | 0.436 | 0.516 | 0.516 | 0.541 | 0.541
(2.25, 2.25, 1.50, 1.50) | Tarone | 0.789 | 0.558 | 0.490 | 0.560 | 0.560 | 0.588 | 0.588
(2.25, 2.25, 1.50, 1.50) | Wilcoxon | 0.713 | 0.442 | 0.358 | 0.447 | 0.447 | 0.468 | 0.468
(2.25, 1.75, 1.75, 1.25) | Fleming | 0.243 | 0.032 | 0.017 | 0.033 | 0.033 | 0.045 | 0.045
(2.25, 1.75, 1.75, 1.25) | Log-rank | 0.368 | 0.063 | 0.034 | 0.064 | 0.064 | 0.080 | 0.080
(2.25, 1.75, 1.75, 1.25) | ModPeto | 0.243 | 0.032 | 0.017 | 0.033 | 0.033 | 0.044 | 0.044
(2.25, 1.75, 1.75, 1.25) | Peto | 0.243 | 0.032 | 0.017 | 0.033 | 0.033 | 0.044 | 0.044
(2.25, 1.75, 1.75, 1.25) | Tarone | 0.290 | 0.046 | 0.024 | 0.047 | 0.047 | 0.055 | 0.055
(2.25, 1.75, 1.75, 1.25) | Wilcoxon | 0.186 | 0.026 | 0.009 | 0.026 | 0.026 | 0.031 | 0.031
(2.50, 2.00, 1.50, 1.00) | Fleming | 0.168 | 0.010 | 0.002 | 0.010 | 0.010 | 0.014 | 0.014
(2.50, 2.00, 1.50, 1.00) | Log-rank | 0.269 | 0.023 | 0.006 | 0.024 | 0.024 | 0.035 | 0.035
(2.50, 2.00, 1.50, 1.00) | ModPeto | 0.167 | 0.009 | 0.002 | 0.010 | 0.010 | 0.013 | 0.013
(2.50, 2.00, 1.50, 1.00) | Peto | 0.167 | 0.010 | 0.002 | 0.010 | 0.010 | 0.014 | 0.014
(2.50, 2.00, 1.50, 1.00) | Tarone | 0.204 | 0.012 | 0.006 | 0.013 | 0.013 | 0.018 | 0.018
(2.50, 2.00, 1.50, 1.00) | Wilcoxon | 0.121 | 0.005 | 0.000 | 0.005 | 0.005 | 0.005 | 0.005
Table 4.
Power of the proposed and conventional adjustment procedures for $K = 4$, $\alpha = 0.05$, and lognormal survival distribution with different $\mu_k$ and $\sigma_k = 0.5$.
Parameters $(\mu_1, \mu_2, \mu_3, \mu_4)$ | Test | Unadjusted | Bonferroni | Scheffé | Sidak | SMM | Tukey | Proposed
---|---|---|---|---|---|---|---|---
(0.5, 0, 0, 0) | Fleming | 0.871 | 0.978 | 0.985 | 0.978 | 0.978 | 0.976 | 0.976
(0.5, 0, 0, 0) | Log-rank | 0.901 | 0.995 | 0.996 | 0.995 | 0.995 | 0.991 | 0.991
(0.5, 0, 0, 0) | ModPeto | 0.871 | 0.978 | 0.985 | 0.978 | 0.978 | 0.976 | 0.976
(0.5, 0, 0, 0) | Peto | 0.871 | 0.978 | 0.985 | 0.978 | 0.978 | 0.976 | 0.976
(0.5, 0, 0, 0) | Tarone | 0.880 | 0.979 | 0.989 | 0.978 | 0.978 | 0.973 | 0.973
(0.5, 0, 0, 0) | Wilcoxon | 0.869 | 0.978 | 0.985 | 0.977 | 0.977 | 0.973 | 0.973
(0.5, 0.5, 0, 0) | Fleming | 0.922 | 0.987 | 0.992 | 0.987 | 0.987 | 0.984 | 0.984
(0.5, 0.5, 0, 0) | Log-rank | 0.923 | 0.988 | 0.995 | 0.988 | 0.988 | 0.985 | 0.985
(0.5, 0.5, 0, 0) | ModPeto | 0.923 | 0.987 | 0.992 | 0.987 | 0.987 | 0.984 | 0.984
(0.5, 0.5, 0, 0) | Peto | 0.923 | 0.987 | 0.992 | 0.987 | 0.987 | 0.984 | 0.984
(0.5, 0.5, 0, 0) | Tarone | 0.928 | 0.987 | 0.994 | 0.987 | 0.987 | 0.986 | 0.986
(0.5, 0.5, 0, 0) | Wilcoxon | 0.931 | 0.986 | 0.992 | 0.986 | 0.986 | 0.983 | 0.983
(0.3, 0, 0, −0.3) | Fleming | 0.962 | 0.982 | 0.980 | 0.982 | 0.982 | 0.984 | 0.984
(0.3, 0, 0, −0.3) | Log-rank | 0.947 | 0.940 | 0.906 | 0.940 | 0.940 | 0.949 | 0.949
(0.3, 0, 0, −0.3) | ModPeto | 0.962 | 0.982 | 0.980 | 0.982 | 0.982 | 0.984 | 0.984
(0.3, 0, 0, −0.3) | Peto | 0.962 | 0.982 | 0.980 | 0.982 | 0.982 | 0.984 | 0.984
(0.3, 0, 0, −0.3) | Tarone | 0.958 | 0.979 | 0.974 | 0.979 | 0.979 | 0.980 | 0.980
(0.3, 0, 0, −0.3) | Wilcoxon | 0.962 | 0.985 | 0.979 | 0.985 | 0.985 | 0.984 | 0.984
(0.5, 0.3, −0.3, −0.5) | Fleming | 0.716 | 0.332 | 0.245 | 0.336 | 0.336 | 0.373 | 0.373
(0.5, 0.3, −0.3, −0.5) | Log-rank | 0.551 | 0.260 | 0.199 | 0.262 | 0.262 | 0.293 | 0.293
(0.5, 0.3, −0.3, −0.5) | ModPeto | 0.716 | 0.329 | 0.244 | 0.337 | 0.337 | 0.371 | 0.371
(0.5, 0.3, −0.3, −0.5) | Peto | 0.716 | 0.332 | 0.245 | 0.337 | 0.337 | 0.371 | 0.371
(0.5, 0.3, −0.3, −0.5) | Tarone | 0.712 | 0.345 | 0.278 | 0.347 | 0.347 | 0.395 | 0.395
(0.5, 0.3, −0.3, −0.5) | Wilcoxon | 0.698 | 0.266 | 0.191 | 0.271 | 0.271 | 0.304 | 0.304
Descriptive statistics of the application data set are given in Table 5, and the survival functions of the groups are shown in Figure 1. The overall comparison of the blood type groups was conducted with the log-rank test, and the result was highly significant ($\chi^2 = 45.5$, df = 3, $p < 0.001$). Pairwise comparisons followed by the multiple adjustment procedures were therefore conducted, and the results are given in Table 6. All of the adjustment procedures led to the same conclusions, in line with what we observed in the simulation studies. With the exception of the AB and B pair ($p > 0.05$), all pairs of blood groups differed significantly in survival time ($p < 0.001$). These results can also be seen in the Kaplan-Meier curves in Figure 1. The survival curves show a proportional structure until the middle of the 0–500-day interval. Also, the survival curves of the AB and B blood groups are closer to each other than those of the other groups.
Table 5.
Descriptive statistics for the liver transplant waiting list data.
Blood group | LTX | Censored | Total | Proportion censored | Median follow-up (days) | 95% CI lower | 95% CI upper
---|---|---|---|---|---|---|---
A | 269 | 56 | 325 | 0.172 | 100 | 95 | 108
AB | 33 | 8 | 41 | 0.195 | 84 | 52 | 202
B | 78 | 25 | 103 | 0.243 | 173 | 116 | 212
O | 256 | 90 | 346 | 0.260 | 223 | 193 | 276
Total | 636 | 179 | 815 | 0.219 | | |
Figure 1.
Kaplan-Meier estimates of not receiving a transplant for each blood type group.
Table 6.
Test statistics and the adjusted p values of the proposed and conventional adjustment techniques for the liver transplant waiting list data.
Test | Blood groups | Test statistic \|Z\| | Unadjusted | Bonferroni | Scheffé | Sidak | SMM | Tukey | Proposed
---|---|---|---|---|---|---|---|---|---
Fleming | A vs AB | 5.106 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Fleming | A vs B | 5.236 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Fleming | A vs O | 7.570 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Fleming | AB vs B | 1.639 | 0.1012 | 0.6070 | 0.4424 | 0.4727 | 0.4727 | 0.3565 | 0.4297
Fleming | AB vs O | 7.039 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Fleming | B vs O | 4.826 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Log-rank | A vs AB | 4.543 | <0.0001 | <0.0001 | 0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Log-rank | A vs B | 4.519 | <0.0001 | <0.0001 | 0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Log-rank | A vs O | 6.483 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Log-rank | AB vs B | 1.258 | 0.2084 | 1.0000 | 0.6634 | 0.7539 | 0.7539 | 0.5898 | 0.7716
Log-rank | AB vs O | 5.924 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Log-rank | B vs O | 4.088 | <0.0001 | 0.0003 | 0.0008 | 0.0003 | 0.0003 | 0.0003 | <0.001
ModPeto | A vs AB | 5.103 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
ModPeto | A vs B | 5.233 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
ModPeto | A vs O | 7.570 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
ModPeto | AB vs B | 1.638 | 0.1015 | 0.6089 | 0.4433 | 0.4738 | 0.4738 | 0.3573 | 0.4303
ModPeto | AB vs O | 7.042 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
ModPeto | B vs O | 4.829 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Peto | A vs AB | 5.102 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Peto | A vs B | 5.231 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Peto | A vs O | 7.567 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Peto | AB vs B | 1.637 | 0.1016 | 0.6094 | 0.4435 | 0.4741 | 0.4741 | 0.3576 | 0.4312
Peto | AB vs O | 7.040 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Peto | B vs O | 4.827 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Tarone | A vs AB | 5.131 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Tarone | A vs B | 5.153 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Tarone | A vs O | 7.480 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Tarone | AB vs B | 1.480 | 0.1388 | 0.8330 | 0.5338 | 0.5921 | 0.5921 | 0.4495 | 0.5594
Tarone | AB vs O | 6.887 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Tarone | B vs O | 4.776 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Wilcoxon | A vs AB | 5.100 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Wilcoxon | A vs B | 5.264 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Wilcoxon | A vs O | 7.598 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Wilcoxon | AB vs B | 1.688 | 0.0915 | 0.5488 | 0.4156 | 0.4376 | 0.4376 | 0.3301 | 0.3944
Wilcoxon | AB vs O | 7.084 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
Wilcoxon | B vs O | 4.840 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | <0.0001
A statistician can apply this method in routine data analysis as follows. For example, to calculate the adjusted p value for the comparison of groups k and h,
calculate Z kh (Section 2.1) and Σ (Section 2.2),
use the pmvnorm function in the mvtnorm library in R as follows:
l = rep(-Inf, m)
u = rep(Zkh, m)
a = pmvnorm(lower = l, upper = u, mean = rep(0, m), corr = sigma)
p = 2*(1 - (a[1] + attributes(a)$error))
where m is the number of all pairwise comparisons, Zkh is the test statistic Z kh, and sigma is Σ.
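The steps above can be sketched end to end as follows. Several choices here are assumptions made for illustration: the correlation matrix uses the null-hypothesis pattern (1, 0.5, -0.5, 0) rather than an estimate from data, the statistic 1.639 is taken from Table 6 purely as an example input, the absolute value of the statistic is used as the upper limit, and the seed is fixed only because the default algorithm is randomized. The pattern matrix is rank-deficient; the default algorithm is understood to accept such input, but that is worth confirming with the installed mvtnorm version.

```r
library(mvtnorm)   # provides pmvnorm()

K     <- 4
pairs <- t(combn(K, 2))          # all (k, h) pairs with k < h
m     <- nrow(pairs)             # number of pairwise comparisons

# Assumed null-hypothesis correlation pattern for the pairwise statistics.
C <- matrix(0, nrow = m, ncol = K)
C[cbind(seq_len(m), pairs[, 1])] <-  1
C[cbind(seq_len(m), pairs[, 2])] <- -1
sigma <- tcrossprod(C) / 2

z_kh <- 1.639                    # illustrative |Z_kh|

set.seed(1)                      # default quasi-Monte Carlo algorithm is randomized
l <- rep(-Inf, m)
u <- rep(abs(z_kh), m)
a <- pmvnorm(lower = l, upper = u, mean = rep(0, m), corr = sigma)

# Adjusted p value, including the reported integration error as in the text.
p_adj <- 2 * (1 - (a[1] + attr(a, "error")))
p_adj <- min(1, max(0, p_adj))   # keep the result inside [0, 1]
p_adj
```

Because the joint probability cannot exceed the single-comparison probability and cannot fall below the Bonferroni bound, the result should lie between the unadjusted and Bonferroni p values for the same statistic.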
4. Conclusions
In this study, we proposed a multiple adjustment procedure for the pairwise comparison of survival functions with right-censored data. We conducted Monte Carlo simulation studies both to evaluate the FWE rate and power of the proposed procedure and to compare it with conventional methods. The proposed method controlled the type I error rate and yielded power similar to Tukey's and high power relative to the other adjustment procedures. In addition to having a straightforward formula, it is easy to implement.
This study has some limitations. The main one is that the simulations compared only the proposed and conventional methods. The comparison could be extended to include methods such as those of Logan et al. [9], who proposed two adjustment methods that account for the correlation among the pairwise tests: one derived from the multivariate normal distribution and the other obtained from a simulated martingale approach. These methods may work well for data with a proportional hazards structure. Future research should include them in the comparison.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
- 1. Collett D. Modelling Survival Data in Medical Research. Boca Raton, Fla, USA: CRC Press; 2015.
- 2. Gehan E. A. A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika. 1965;52:203–223. doi: 10.1093/biomet/52.1-2.203.
- 3. Mantel N. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports. 1966;50(3):163–170.
- 4. Breslow N. A generalized Kruskal-Wallis test for comparing k samples subject to unequal patterns of censorship. Biometrika. 1970;57(3):579–594. doi: 10.1093/biomet/57.3.579.
- 5. Peto R., Peto J. Asymptotically efficient rank invariant test procedures. Journal of the Royal Statistical Society. Series A (General). 1972;135(2):185. doi: 10.2307/2344317.
- 6. Prentice R. L. Linear rank tests with right censored data. Biometrika. 1978;65(1):167–179. doi: 10.1093/biomet/65.1.167.
- 7. Tarone R. E., Ware J. On distribution-free tests for equality of survival distributions. Biometrika. 1977;64(1):156–160. doi: 10.1093/biomet/64.1.156.
- 8. Harrington D. P., Fleming T. R. A class of rank test procedures for censored survival data. Biometrika. 1982;69(3):553–566. doi: 10.1093/biomet/69.3.553.
- 9. Logan B. R., Wang H., Zhang M.-J. Pairwise multiple comparison adjustment in survival analysis. Statistics in Medicine. 2005;24(16):2509–2523. doi: 10.1002/sim.2125.
- 10. Tressler A., Chow A. Multiple pairwise comparison procedures based on the Lin and Wang test for right censored survival data. Statistica Neerlandica. 2013;67(1):112–120. doi: 10.1111/j.1467-9574.2012.00535.x.
- 11. Koziol J. A., Reid N. On multiple comparisons among K samples subject to unequal patterns of censorship. Communications in Statistics: Theory and Methods. 1977;6(12):1149–1164. doi: 10.1080/03610927708827558.
- 12. Steel R. G. A multiple comparison rank sum test: treatments versus control. Biometrics. 1959;15:560–572. doi: 10.2307/2527654.
- 13. Chakraborti S., Desu M. M. Linear rank tests for comparing treatments with a control when data are subject to unequal patterns of censorship. Statistica Neerlandica. 1991;45(3):227–254. doi: 10.1111/j.1467-9574.1991.tb01307.x.
- 14. Chen Y.-I. A generalized steel procedure for comparing several treatments with a control under random right-censorship. Communications in Statistics: Simulation and Computation. 1994;23(1):1–16. doi: 10.1080/03610919408813152.
- 15. Chen Y.-I. Multiple comparisons in carcinogenesis study with right-censored survival data. Statistics in Medicine. 2000;19(3):353–367. doi: 10.1002/(SICI)1097-0258(20000215)19:3<353::AID-SIM333>3.0.CO;2-B.
- 16. Genz A. Numerical computation of multivariate normal probabilities. Journal of Computational and Graphical Statistics. 1992;1(2):141–149. doi: 10.1080/10618600.1992.10477010.
- 17. Genz A. Comparison of methods for the computation of multivariate normal probabilities. Computing Science and Statistics. 1993;25:400–405.
- 18. Kim W. R., Therneau T. M., Benson J. T., et al. Deaths on the liver transplant waiting list: an analysis of competing risks. Hepatology. 2006;43(2):345–351. doi: 10.1002/hep.21025.
- 19. Therneau T. M., Lumley T. Package ‘survival’. 2016.