A sample size planning approach that considers both statistical significance and clinical significance

Bin Jia; Henry S Lynn

2015 May 12;16:213. doi: 10.1186/s13063-015-0727-9

PMCID: PMC4455608 PMID: 25962998

Abstract

Background

The CONSORT statement requires clinical trials to report confidence intervals, which help to assess the precision and clinical importance of the treatment effect. Conventional sample size calculations for clinical trials, however, only consider issues of statistical significance (that is, significance level and power).

Method

A more consistent approach is proposed whereby sample size planning also incorporates information on clinical significance as indicated by the boundaries of the confidence limits of the treatment effect.

Results

The probabilities of declaring a “definitive-positive” or “definitive-negative” result (as defined by Guyatt et al., CMAJ 152(2):169-173, 1995) are controlled by calculating the sample size such that the lower confidence limit under H₁ and the upper confidence limit under H₀ are bounded by relevant cut-offs. Adjustments to the traditional sample size can be directly derived for the comparison of two normally distributed means in a test of nonequality, while simulations are used to estimate the sample size for evaluating the hazards ratio in a proportional-hazards model.

Conclusions

This sample size planning approach allows for an assessment of the potential clinical importance and precision of the treatment effect in a clinical trial in addition to considerations of statistical power and type I error.

Keywords: clinical significance, confidence interval, sample size

Background

The importance of confidence intervals is clearly attested by journal guidelines [1-3] as they “convey information about magnitude and precision of effect simultaneously, and keep these two aspects of measurements closely linked” [4]. For clinical trials, the CONSORT statement [5] stipulates the reporting of the “estimated effect size and its precision (such as 95% confidence interval)” and “how sample size was determined,” but traditional sample size calculations for testing scientific hypotheses consider only statistical significance and power. The precision and clinical importance of the effect that can be depicted by confidence intervals is ignored. Under the usual practice, one calculates the sample size needed to declare some “clinically important difference” statistically significant at the α-level with 1 - β probability. The problem is that there is substantial subjectivity in quantifying this difference, and this can turn the sample size calculation into a moot exercise for choosing a difference to justify the number of patients the study can afford [6]. Frequently, the selected difference ends up larger than what is usual, and thus many studies may display large differences but lack the precision to make them statistically significant. Such shortcomings have led some to argue for reform of current sample size conventions in order to avoid misinterpretation of completed studies and harm to scientific research [7].

What would be helpful is a sample size estimation procedure that provides information on the confidence interval to supply users with information on the clinical significance and precision of the treatment effect in addition to power and statistical significance. Beal [8] suggested selecting sample size such that there is a high probability of the half-width of the confidence interval being less than some prescribed length, conditional on the interval containing the parameter of interest. Similarly, Liu [9] chose the sample size to yield a short confidence interval width but conditional on the rejection of the null hypothesis H₀. Jiroutek et al. [10] combined the two by considering the probability of attaining a certain interval width conditional on both rejection of H₀ and inclusion of the true parameter. Cesana et al. [11,12] introduced a two-step procedure by first obtaining the sample size according to power and then iteratively increasing the sample size until the probability of obtaining confidence intervals with widths less than the expected interval width under H₁ exceeds a specified level.

In the above methods, the user either has to designate an interval length as reference or rely on the expected interval width, which may not be clinically relevant. A more straightforward alternative is to calculate a sample size such that the confidence limits of the parameter will be bounded by designated cut-offs. Specifically, the sample size is chosen such that according to the confidence limits the result can be deemed “definitive-positive” if there is indeed an effect or deemed “definitive-negative” if there is none. According to Guyatt et al. [13], a “definitive-positive” result implies that the lower confidence limit (LCL) of the parameter is not only larger than zero, implying a “positive” and statistically significant study, but above a relevant nonzero threshold. Conversely, a “definitive-negative” result implies that the upper confidence limit (UCL) is below some nonzero threshold. In hypothesis testing, one does not know whether H₁ or H₀ is true and can only control the probabilities of making a false positive or false negative error. Likewise, in this approach, we control the probabilities of declaring a “definitive-positive” or “definitive-negative” result by calculating the sample size such that LCL under H₁ and UCL under H₀ are bounded by fixed cut-offs. The following section demonstrates these concepts first for continuous normally distributed data and then for time-to-event data.

Methods

Normally distributed data

Consider a randomized 1:1 clinical trial comparing the mean responses between the treatment and control groups. When the response (or appropriately transformed response) can be regarded as normally distributed, the assessment of the treatment effect can be formulated as a hypothesis test of H₀: μ₁ - μ₀ = 0 versus H₁: μ₁ - μ₀ ≠ 0. The sample size is then given by

n = \frac{σ^{2} {(Z_{1 - α / 2} + Z_{1 - β})}^{2}}{δ^{2}}, \tag{1}

where Z_γ is the γth quantile of the standard normal distribution, (μ₀, σ₀) and (μ₁, σ₁) are the means and standard deviations of the control and treatment groups, respectively, σ² = σ₀² + σ₁², and δ = μ₁ - μ₀ is the clinically important difference to be detected at level α with power 1 - β.
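As a minimal sketch, the sample size formula above can be evaluated with Python's standard library. The function name `per_group_n` and its arguments are illustrative, not taken from the article. Reading n as the number of subjects per group is an assumption, based on the variance of the estimated difference being σ²/n with σ² = σ₀² + σ₁².

```python
from math import ceil
from statistics import NormalDist


def per_group_n(delta, sd_control, sd_treatment, alpha=0.05, beta=0.20):
    """Unrounded sample size for a two-sided test of mu1 - mu0 = 0.

    Assumption: n is interpreted as the number of subjects per group.
    """
    z = NormalDist().inv_cdf                      # standard normal quantile function
    sigma_sq = sd_control ** 2 + sd_treatment ** 2  # sigma^2 = sigma_0^2 + sigma_1^2
    return sigma_sq * (z(1 - alpha / 2) + z(1 - beta)) ** 2 / delta ** 2


# Example with sigma^2 = 2 and delta = 1 (unit standard deviation in each group).
n = per_group_n(delta=1.0, sd_control=1.0, sd_treatment=1.0)
print(round(n, 2), ceil(n))
```

With these inputs the unrounded value is about 15.7, which rounds up to 16 subjects.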

We first examine how likely the above sample size will yield a “definitive-negative” or “definitive-positive” result by calculating, respectively, the probabilities Pr(UCL < k₀δ | H₀) and Pr(LCL > k₁δ | H₁) for k₀, k₁ ∈ [0,1]. Without loss of generality, assume δ > 0 and let $\bar{D}$ be the sample estimate of the treatment difference. If σ is known, then

\begin{aligned} Pr (UCL < k_{0} δ | H_{0}) &= Pr (\bar{D} + Z_{1 - α / 2} \frac{σ}{\sqrt{n}} < k_{0} δ | H_{0}) = Pr (Z < k_{0} δ \frac{\sqrt{n}}{σ} - Z_{1 - α / 2}) \\ &= Pr (Z < (k_{0} - 1) Z_{1 - α / 2} + k_{0} Z_{1 - β}), \end{aligned} \tag{2}

and

\begin{aligned} Pr (LCL > k_{1} δ | H_{1}) &= Pr (\bar{D} - Z_{1 - α / 2} \frac{σ}{\sqrt{n}} > k_{1} δ | H_{1}) = Pr (Z > (k_{1} δ - δ) \frac{\sqrt{n}}{σ} + Z_{1 - α / 2}) \\ &= Pr (Z > (k_{1} - 1) Z_{1 - β} + k_{1} Z_{1 - α / 2}), \end{aligned} \tag{3}

where Z is the standard normal variable. As k₀, k₁ vary from 0 to 1, these two probability functions are mirror images about 1/2, with Pr(LCL > δ /2 | H₁) = Pr(UCL < δ /2 | H₀). At the boundaries of 0 and 1, Pr(LCL > 0 | H₁) = Pr(UCL < δ | H₀) = 1 - β.
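The two closed-form probabilities can be evaluated numerically as a check on the symmetry and boundary values just described. This is a sketch under the same assumptions (σ known, sample size taken from the conventional formula); the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pr_definitive_negative(k0, alpha=0.05, beta=0.20):
    """Pr(UCL < k0 * delta | H0) at the conventional sample size."""
    z_a = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_b = STD_NORMAL.inv_cdf(1 - beta)
    return STD_NORMAL.cdf((k0 - 1) * z_a + k0 * z_b)


def pr_definitive_positive(k1, alpha=0.05, beta=0.20):
    """Pr(LCL > k1 * delta | H1) at the conventional sample size."""
    z_a = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_b = STD_NORMAL.inv_cdf(1 - beta)
    return 1 - STD_NORMAL.cdf((k1 - 1) * z_b + k1 * z_a)


# Tabulate both probabilities over a small grid of cut-off fractions.
for k in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(k, round(pr_definitive_positive(k), 3), round(pr_definitive_negative(k), 3))
```

Neither function needs δ, σ, or n as an input, because at the conventional sample size these enter only through Z_{1 - α/2} + Z_{1 - β}.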

Based on the derivations of equations (2) and (3), it can be shown that if the sample size is increased to n₀ = n/k₀² then Pr(UCL < k₀δ | H₀) = 1 - β for k₀ ∈ (0,1), and if it is increased to n₁ = n/(1 − k₁)² then Pr(LCL > k₁δ | H₁) = 1 - β for k₁ ∈ (0,1). For example, with k₀ = k₁ = 1/2 and sample size n₀ = n₁ = 4n, both Pr(LCL > δ /2 | H₁) = Pr(UCL < δ /2 | H₀) = 1 - β. Note that if k₀ = k₁ < 1/2 then n₀ > n₁ and a larger sample size is required to establish a “definitive-negative” compared to a “definitive-positive” result. Conversely, if k₀ = k₁ > 1/2, then n₀ < n₁, and a larger sample size is needed to establish a “definitive-positive” result. In general, if

k_{0} = 1 - k_{1} \quad \text{and} \quad n_{0} = n_{1} = n / k_{0}^{2}, \tag{4}

then Pr(UCL < k₀δ | H₀) = Pr(LCL > k₁δ | H₁) = 1 - β. For example, if k₀ = 2/3, k₁ = 1/3 and n₀ = n₁ = 9n /4 then Pr(LCL > δ /3 | H₁) = Pr(UCL < 2δ /3 | H₀) = 1 - β.
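The inflation factors can be verified by substituting the enlarged sample sizes into the probability expressions derived earlier. A brief sketch of that step, using only the relation δ√n/σ = Z_{1 - α/2} + Z_{1 - β} implied by the conventional sample size formula:

```latex
\begin{aligned}
n_{0} = n / k_{0}^{2}: \quad
k_{0} δ \frac{\sqrt{n_{0}}}{σ} - Z_{1 - α / 2}
  &= δ \frac{\sqrt{n}}{σ} - Z_{1 - α / 2}
   = Z_{1 - β}
  &&\Rightarrow\; Pr (UCL < k_{0} δ | H_{0}) = Pr (Z < Z_{1 - β}) = 1 - β, \\
n_{1} = n / (1 - k_{1})^{2}: \quad
(k_{1} δ - δ) \frac{\sqrt{n_{1}}}{σ} + Z_{1 - α / 2}
  &= - δ \frac{\sqrt{n}}{σ} + Z_{1 - α / 2}
   = - Z_{1 - β}
  &&\Rightarrow\; Pr (LCL > k_{1} δ | H_{1}) = Pr (Z > - Z_{1 - β}) = 1 - β.
\end{aligned}
```

Setting k₀ = 1 - k₁ makes the two inflation factors 1/k₀² and 1/(1 - k₁)² coincide, which is why a single enlarged sample size then controls both probabilities.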

Time-to-event data

We extend our proposed method to include time-to-event data, and use this case to show how a simulation-based approach can be used to estimate the sample size when the validity of the normal approximation may be in doubt. In situations where a closed-form sample size formula is not readily available or is difficult to derive, simulation provides an alternative and offers greater flexibility for adapting to more complicated analyses. Briefly, the initial sample size required to detect the clinically important difference δ at power 1 - β is first calculated and then iteratively increased until Pr(LCL > k₁δ | H₁) and Pr(UCL < k₀δ | H₀) reach desired levels. The hazard ratio Δ is chosen as the parameter of interest with its corresponding confidence limits LCL and UCL being estimated using Cox regression. In the following description, we select for simplicity and convenience a single common cut-off by letting k₀ = k₁ = 1/2.

Under the proportional hazards assumption, the initial total sample size N₀ for detecting δ = log_eΔ at level α and power 1 - β can be estimated using Schoenfeld’s [14] formula,

N_{0} = \frac{{(Z_{1 - α / 2} + Z_{1 - β})}^{2}}{P_{0} P_{1} {({log}_{e} Δ)}^{2}} \frac{1}{1 - π_{c}}, \tag{5}

where π_c is the overall censoring proportion, and P₀ and P₁ are the proportions of subjects in the control and treatment groups, respectively. (Another choice is to use Freedman’s [15] formula, which gives a slightly smaller sample size.)
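Schoenfeld's formula as written above is straightforward to compute. The sketch below uses an illustrative function name and returns the unrounded total sample size, because the rounding convention used by the authors is not stated.

```python
from math import log
from statistics import NormalDist


def schoenfeld_total_n(hazard_ratio, alpha=0.05, beta=0.20,
                       p0=0.5, p1=0.5, censoring=0.5):
    """Unrounded total sample size from Schoenfeld's formula."""
    z = NormalDist().inv_cdf
    # Number of events needed to detect log(hazard_ratio) with power 1 - beta.
    events = (z(1 - alpha / 2) + z(1 - beta)) ** 2 / (p0 * p1 * log(hazard_ratio) ** 2)
    # Inflate for the proportion of subjects expected to be censored.
    return events / (1 - censoring)


for hr in (1.25, 1.5, 1.75, 2.0):
    print(hr, round(schoenfeld_total_n(hr), 1))
```

The unrounded values come out slightly below the initial sample sizes reported in the Results (1264, 384, 204, and 132). The gap presumably reflects rounding choices in the authors' program, which is not reproduced here.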

Time-to-event data are simulated from the exponential distribution since it is most widely used to model time-to-event data under the proportional hazards assumption. Specifically, we simulate exponential survival times T_i and exponential censoring times L_i for subjects i = 1, …, N₀/2 in each group, and consider a subject censored whenever T_i > L_i, that is, whenever the censoring time precedes the event time. According to Halabi and Bahadur [16], the parameters for the survival and censoring time distributions are related by

2 π_{c} = \frac{λ_{c}}{(λ_{0} + λ_{c})} + \frac{λ_{c}}{(λ_{1} + λ_{c})}, \tag{6}

where λ₀, λ₁ are the hazard rates of the exponential survival times for the control and treatment groups, respectively, and λ_c is the hazard rate for the exponential censoring time. When π_c = 0.5, equation (6) reduces to the simple relationship

λ_{c} = \sqrt{λ_{0} λ_{1}} . \tag{7}
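For censoring proportions other than 0.5, the relationship between π_c and the three hazard rates can be solved for λ_c as the positive root of a quadratic. The rearrangement below is ours, not the article's, and the function names are illustrative.

```python
from math import sqrt


def overall_censoring(lam0, lam1, lam_c):
    """Overall censoring proportion for two equally sized exponential arms."""
    return 0.5 * (lam_c / (lam0 + lam_c) + lam_c / (lam1 + lam_c))


def censoring_rate(lam0, lam1, pi_c):
    """Censoring hazard lam_c that yields overall censoring proportion pi_c.

    Clearing denominators gives
        2(1 - pi_c) lam_c^2 + (1 - 2 pi_c)(lam0 + lam1) lam_c - 2 pi_c lam0 lam1 = 0,
    and the positive root is returned. Requires 0 < pi_c < 1.
    """
    a = 2 * (1 - pi_c)
    b = (1 - 2 * pi_c) * (lam0 + lam1)
    c = -2 * pi_c * lam0 * lam1
    return (-b + sqrt(b * b - 4 * a * c)) / (2 * a)


# With pi_c = 0.5 the root reduces to sqrt(lam0 * lam1).
print(round(censoring_rate(1.0, 1.75, 0.5), 2), round(sqrt(1.75), 2))
```

At π_c = 0.5 the linear term vanishes, which is what produces the simple square-root relationship.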

We set λ₀ = 1 and select four values, (1.25, 1.5, 1.75, 2.0), for the hazard ratio Δ ≡ λ₁/λ₀ = λ₁. For each value of Δ, the procedure goes through the following steps:

1. With α = 0.05, β = 0.2, P₀ = P₁ = 0.5, π_c = 0.5, and δ = log_e(Δ), calculate the initial total sample size N₀ using (5);
2. Simulate N₀/2 independent samples of exponential survival and censoring times for the treatment and control groups with corresponding parameters λ₀ = 1, λ₁, and λ_c = √λ₁;
3. Compare the survival times between the treatment and control groups using Cox regression and compute the 95% confidence interval for log_e(Δ);
4. Repeat steps (2) and (3) for 10,000 iterations and estimate Pr(LCL > δ /2 | H₁) using the proportion of iterations where LCL > δ /2;
5. Set Δ = 1 and repeat steps (2) and (3) 10,000 times to estimate Pr(UCL < δ /2 | H₀) using the proportion of times when UCL < δ /2;
6. Replace N₀ with a larger sample size and repeat steps (2) through (5) until the estimates for both Pr(LCL > δ /2 | H₁) and Pr(UCL < δ /2 | H₀) are greater than some desired level (for example, 0.8).
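The steps above can be sketched in Python with two loudly stated substitutions. First, the confidence interval comes from a Wald interval for the log rate ratio under an exponential model, used here as a simple stand-in for Cox regression, so results will only approximate the authors'. Second, under H₀ the censoring rate is taken as √λ₁ with λ₁ = 1, which is our literal reading of step 5. All function names, the step size, and the seed are illustrative.

```python
import random
from math import log, sqrt
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # multiplier for a 95% interval


def arm_summary(n, event_rate, censor_rate, rng):
    """Simulate one arm; return (observed events, total follow-up time)."""
    events, follow_up = 0, 0.0
    for _ in range(n):
        t = rng.expovariate(event_rate)    # survival time T_i
        c = rng.expovariate(censor_rate)   # censoring time L_i
        events += (t <= c)                 # event is observed unless censored first
        follow_up += min(t, c)
    return events, follow_up


def log_hr_interval(n_per_arm, lam1, rng, lam0=1.0):
    """95% Wald interval for the log hazard ratio (exponential model, not Cox)."""
    lam_c = sqrt(lam0 * lam1)              # censoring rate for pi_c = 0.5
    d0, y0 = arm_summary(n_per_arm, lam0, lam_c, rng)
    d1, y1 = arm_summary(n_per_arm, lam1, lam_c, rng)
    if d0 == 0 or d1 == 0:
        return None                        # interval undefined without events
    estimate = log((d1 / y1) / (d0 / y0))
    half_width = Z975 * sqrt(1 / d0 + 1 / d1)
    return estimate - half_width, estimate + half_width


def estimate_probabilities(total_n, hazard_ratio, iterations=10_000, rng=None):
    """Estimate Pr(LCL > delta/2 | H1) and Pr(UCL < delta/2 | H0)."""
    if rng is None:
        rng = random.Random(2015)
    cutoff = log(hazard_ratio) / 2
    n_per_arm = total_n // 2
    positive = negative = 0
    for _ in range(iterations):
        ci = log_hr_interval(n_per_arm, hazard_ratio, rng)   # data under H1
        positive += (ci is not None and ci[0] > cutoff)
        ci = log_hr_interval(n_per_arm, 1.0, rng)            # data under H0
        negative += (ci is not None and ci[1] < cutoff)
    return positive / iterations, negative / iterations


def search_total_n(start_n, hazard_ratio, target=0.8, step=100,
                   iterations=2_000, seed=2015):
    """Increase the total sample size until both estimates reach the target."""
    rng = random.Random(seed)
    total_n = start_n
    while min(estimate_probabilities(total_n, hazard_ratio, iterations, rng)) < target:
        total_n += step
    return total_n


print(estimate_probabilities(204, 1.75, iterations=500))
```

A coarse `step` and a modest number of iterations keep the search fast but make the stopping point noisy; a finer step with more iterations would be the natural refinement once the approximate range is known.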

The above procedure was programmed using SAS 9.2, and a sample SAS program is provided in the Appendix as reference.

Results

For comparing the means of normally distributed outcomes, Figure 1 shows that when α = 0.05 and power = 0.8, Pr(LCL > kδ | H₁) decreases steadily from 0.8 to 0.025 while Pr(UCL < kδ | H₀) increases steadily from 0.025 to 0.80 as k varies from 0 to 1. In fact, these two probability functions are mirror images about k = 1/2, where they both equal 0.288. This implies that a trial designed to detect a clinically important difference δ at the 5% significance level with 80% power will be “definitive-positive” about 29% of the time if one wants to say with 95% confidence that the treatment effect must be at least δ /2.

Figure 1. Plot of Pr(LCL > kδ | H₁) (red curve) and Pr(UCL < kδ | H₀) (blue curve) for k ∈ [0,1], α = 0.05, and power 1 - β = 0.80 in a comparison of normally distributed mean responses with known σ between treatment and control groups for a 1:1 randomized clinical trial.

For time-to-event data, the initial total sample size (N₀ = 1264) for detecting a hazard ratio Δ = 1.25 is almost 5/(1 - π_c) or ten times larger than that (N₀ = 132) for detecting Δ = 2.00 according to Schoenfeld’s [14] formula. At these initial sample sizes, the estimates of Pr(LCL > 0 | H₁) ranged from 0.79 to 0.81 as expected, while Pr(UCL < δ | H₀) ranged from 0.70 to 0.77, slightly less than 0.8. Similarly, estimates for Pr(LCL > δ /2 | H₁) ranged from 0.27 to 0.29, close to what is expected for normally distributed data, while estimates of Pr(UCL < δ /2 | H₀) are slightly lower than expected, ranging from 0.23 to 0.27. For a specific example, say Δ = 1.75, then N₀ = 204 according to (5) and the estimates of α and β are 0.0485 and 0.2044, respectively. The β estimate implies that 79.6% of the samples have LCL > 0 under H₁. But the mean LCL is 0.16, thus as shown in Table 1 only 27.7% of the samples have LCL > δ /2 = log_e(1.75)/2 = 0.28. Correspondingly, 95.2% of the samples under H₀ have confidence intervals that include zero, but since the mean UCL is 0.42 only 25.4% of the samples have UCL < 0.28.

Table 1.

Clinical significance and precision of the log-hazard ratio according to the initial and final sample sizes

Δ	log_e(Δ)	λ_c ^b	Sample size	N	Pr(LCL > δ/2 | H₁)	CIW₁ ^e	Pr(UCL < δ/2 | H₀)	CIW₀ ^d
1.25	0.22	1.12	Initial ^a	1264	0.2925	0.322	0.2651	0.314
1.25	0.22	1.12	Final ^c	5402	0.8241	0.155	0.8016	0.151
1.50	0.41	1.22	Initial ^a	384	0.2759	0.602	0.2658	0.577
1.50	0.41	1.22	Final ^c	1694	0.8349	0.285	0.8039	0.273
1.75	0.56	1.32	Initial ^a	204	0.2766	0.850	0.2536	0.804
1.75	0.56	1.32	Final ^c	938	0.8496	0.392	0.8021	0.371
2.00	0.69	1.41	Initial ^a	132	0.2700	1.087	0.2344	1.018
2.00	0.69	1.41	Final ^c	632	0.8503	0.487	0.8052	0.457


^a The initial N, calculated using equation (5) (Schoenfeld’s [14] formula), is the total sample size required to detect a hazard ratio Δ at the 5% level with 80% power, assuming equal subject allocation and a 0.5 overall censoring proportion. ^b λ_c is the hazard rate for the exponential censoring time given by equation (7), and δ = log_e(Δ). ^c The final N is the total sample size such that both Pr(LCL > δ /2 | H₁) and Pr(UCL < δ /2 | H₀) are at least 0.8, as estimated by the proportion of times LCL and UCL are bounded by δ /2 in 10,000 iterations. ^d CIW₀ and ^e CIW₁ are the mean widths of the 95% confidence intervals under H₀ and H₁, respectively.

Table 1 suggests that sample sizes need to be four to five times the initial sample size before estimates of both Pr(LCL > δ /2 | H₁) and Pr(UCL < δ /2 | H₀) are above 0.8. For example, with Δ = 1.75, the mean LCL for samples under H₁ equals 0.38 when the sample size reaches 938 (4.6 times N₀), and 85.0% of the samples then have LCL > δ /2 = 0.28. In addition, at this sample size, the mean UCL for samples under H₀ equals 0.19, and 80.2% of the samples have UCL < 0.28. In terms of confidence interval width, the mean widths at the final sample sizes are between 0.4 and 0.5 times those at the initial sample sizes. For example, with Δ = 1.75 and a final sample size of 938, the mean confidence interval widths are 0.37 and 0.39 under H₀ and H₁, respectively, each about 0.46 times the corresponding mean width at the initial sample size of 204.
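The ratios quoted in this paragraph can be recomputed directly from Table 1. The short script below does only that arithmetic; the variable names are illustrative. The last column it prints, 1/√(sample size ratio), is included on the assumption that interval width shrinks roughly in proportion to 1/√N, which is a large-sample heuristic rather than a result stated in the article.

```python
from math import sqrt

# Per hazard ratio: (initial N, final N, CIW1 initial, CIW1 final, CIW0 initial, CIW0 final),
# copied from Table 1.
TABLE_1 = {
    1.25: (1264, 5402, 0.322, 0.155, 0.314, 0.151),
    1.50: (384, 1694, 0.602, 0.285, 0.577, 0.273),
    1.75: (204, 938, 0.850, 0.392, 0.804, 0.371),
    2.00: (132, 632, 1.087, 0.487, 1.018, 0.457),
}


def table_ratios(row):
    """Return (N ratio, width ratio under H1, width ratio under H0)."""
    n_init, n_final, w1_init, w1_final, w0_init, w0_final = row
    return n_final / n_init, w1_final / w1_init, w0_final / w0_init


for hr, row in TABLE_1.items():
    n_ratio, w1_ratio, w0_ratio = table_ratios(row)
    print(hr, round(n_ratio, 2), round(w1_ratio, 2), round(w0_ratio, 2),
          round(1 / sqrt(n_ratio), 2))
```

For every hazard ratio the sample size ratio falls between 4 and 5 and both width ratios fall between 0.4 and 0.5, in line with the text.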

Discussion

Many researchers realize that a traditional sample size calculation for testing H₀: μ₁ - μ₀ = 0 versus H₁: μ₁ - μ₀ ≠ 0 with α = 0.05 and 80% power to detect a clinically important difference δ implies that: 1) 95% of the 95% confidence intervals for μ₁ - μ₀ will include zero when H₀ is true, and 2) 80% of the 95% confidence intervals will exclude zero when H₁ (that is, μ₁ - μ₀ = δ) is true. However, a confidence interval with an LCL that is barely larger than zero may indicate a statistically significant treatment effect but be unconvincing to investigators who desire a “definitive-positive” result [13]. In contrast, a confidence interval that includes zero and demonstrates a “statistically nonsignificant” effect may be more convincing as a “definitive-negative” result when its UCL is small. Therefore, we propose that information on Pr(LCL > cut-off | H₁) and Pr(UCL < cut-off | H₀) be available to assist investigators in gauging the clinical significance of the treatment effect. For example, a plot similar to Figure 1 can be provided as a supplement to the usual sample size calculation, or the investigator can directly estimate the sample size required such that LCL and UCL are bounded by relevant cut-offs with high probability. This offers a more consistent approach since the confidence interval becomes an important component in the design of clinical trials and is not used solely for analysis.

One question for this method concerns how a clinically relevant cut-off can be selected. Since δ, the clinically important difference, is already defined in the original sample size calculation, a convenient choice is to specify the cut-off with respect to δ. Given the uncertainty involved in quantifying δ and the tendency to inflate it [6], we set the cut-off equal to kδ for k ∈ (0,1). This bypasses the need to additionally specify a confidence interval reference width [8-10] or calculate an expected confidence interval width [11,12]. For example, δ /2 can be used as the cut-off since it gives equal consideration to the expected precision of symmetrical intervals under H₀ and H₁. However, it should be stressed that there is no requirement for intervals under H₀ and H₁ to be given equal emphasis or for the boundaries of LCL and UCL to be the same. A researcher may well choose different cut-offs corresponding to a “definitive-positive” and a “definitive-negative” result; for example, LCL > 3δ /4 and UCL < δ /4 or LCL > δ /3 and UCL < 2δ /3.

Previous considerations of sample size estimation by controlling statistical power and precision often involve complex calculations even for normally distributed or binary outcomes. The current proposal is pedagogically straightforward as it simply focuses on the position of the confidence limits in relation to clinically relevant boundaries. Greenland [17] designed a method that provides high power to discriminate between the parameter values under H₀ and H₁. A sample size was chosen such that the discriminatory power, min{ Pr(LCL > 0 | H₁), Pr(UCL < δ | H₀)}, equals a specified level. Our method also focuses on the probabilities of the lower and upper confidence limits being bounded, but the boundaries are different as Greenland was not thinking of clinically important effect sizes but the original parameter values under H₀ and H₁.

The condition LCL > k₁δ corresponds to the alternative hypothesis for a superiority test of H₀: μ₁ - μ₀ ≤ k₁δ versus H₁: μ₁ - μ₀ > k₁δ. However, the sample size n₁ to attain a “definitive-positive” result is different from the sample size for the superiority test since the former is two-sided while the latter is one-sided. For example, with α = 0.05, β = 0.2, σ² = 2, δ = 1, and k₁ = 1/2, equations (1) and (4) imply that n₁ = 4×16 = 64, while the sample size for the superiority test, as given by

\frac{σ^{2} {(Z_{1 - α} + Z_{1 - β})}^{2}}{{(δ - k_{1} δ)}^{2}},

equals 50. More importantly, our method calculates not only the sample size involving LCL > k₁δ but also that for UCL < k₀δ.
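The comparison above can be reproduced numerically. The sketch below follows the text in rounding n up to 16 before applying the fourfold inflation; the variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf
alpha, beta, sigma_sq, delta, k1 = 0.05, 0.20, 2.0, 1.0, 0.5

# Conventional two-sided sample size, then the inflation 1 / (1 - k1)^2.
n = sigma_sq * (z(1 - alpha / 2) + z(1 - beta)) ** 2 / delta ** 2
n1 = ceil(n) / (1 - k1) ** 2

# One-sided superiority test against the margin k1 * delta.
n_superiority = sigma_sq * (z(1 - alpha) + z(1 - beta)) ** 2 / (delta - k1 * delta) ** 2

print(ceil(n), int(n1), ceil(n_superiority))
```

The difference between the two results comes entirely from replacing Z_{1 - α/2} with Z_{1 - α}; the denominators agree because (δ - k₁δ)² = (1 - k₁)²δ². Inflating the unrounded n instead of the rounded value would give 63 rather than 64.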

Conclusions

In summary, our proposed method allows the researcher to calculate the sample size for a clinical trial not only according to the specifications of statistical significance (that is, α and β) but also in terms of clinical significance as judged by the boundaries of the confidence limits. For normally distributed data, simple formulae are available and their results serve as a reference for sample size planning when analyzing other types of data. For example, to ensure that LCL and UCL are both bounded by δ /2, the sample size needs to be increased 4-fold when comparing normally distributed means. Likewise, when evaluating the hazard ratio for time-to-event data, simulation results also suggest that sample sizes need to be 4 to 5 times larger. The results of our method indicate that sample size needs to be increased, but our intention is not to mandate larger sample sizes per se. Such an effort may be futile since in practice cost constraints force clinical trials to aim for the smallest possible sample size. What is important is that researchers be informed, for example by a graph similar to Figure 1, as to how their sample size will affect judgments of clinical significance using confidence intervals. In this respect, our proposal directs attention back to the importance of gauging effect sizes using confidence intervals, and is consistent with the predicted confidence intervals Goodman and Berlin [6] advocated to help investigators better understand the idea of statistical power when calculating sample size.

Acknowledgements

None. This research was not supported by any external funding sources.

Abbreviations

CONSORT: Consolidated Standards of Reporting Trials
LCL: lower confidence limit
UCL: upper confidence limit

Appendix

Sample SAS program to estimate the total sample size for testing H₀: Δ = 1 versus H₁: Δ ≠ 1 such that Pr(LCL > δ /2 | H₁) = Pr(UCL < δ /2 | H₀) = 1 - β. Survival and censoring times are assumed to be exponentially distributed, and the overall censoring proportion equals 0.5. The initial sample size is estimated using Schoenfeld’s [14] formula for detecting δ = log_e(Δ) with 80% power at the 5% significance level.

[SAS program listing: provided as an image in the original article and not reproduced here.]

Footnotes

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

HSL conceived the study, performed the analyses, and drafted the manuscript. BJ participated in the analyses and drafted the manuscript. Both authors have read and approved the final manuscript.

Contributor Information

Bin Jia, Email: bjia33@its.jnj.com.

Henry S Lynn, Email: hslynn@shmu.edu.cn.

References

1. Simon R, Wittes RE. Methodologic guidelines for reports of clinical trials. Cancer Treat Rep. 1985;69:1–3.
2. Bailar JC, Mosteller F. Guidelines for statistical reporting in articles for medical journals. Ann Intern Med. 1988;108:266–73. doi: 10.7326/0003-4819-108-2-266.
3. Lang T. Documenting research in scientific articles: guidelines for authors. 1. Reporting research designs and activities. Chest. 2006;130:1263–8. doi: 10.1378/chest.130.4.1263.
4. Rothman K. Modern epidemiology. Boston: Little Brown; 1986.
5. Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001;134:663–94. doi: 10.7326/0003-4819-134-8-200104170-00012.
6. Goodman S, Berlin J. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med. 1994;121:200–6. doi: 10.7326/0003-4819-121-3-199408010-00008.
7. Bacchetti P. Current sample size conventions: flaws, harms, and alternatives. BMC Med. 2010;8. doi: 10.1186/1741-7015-8-17.
8. Beal SL. Sample size determination for confidence intervals on the population mean and on the difference between two population means. Biometrics. 1989;45:969–77. doi: 10.2307/2531696.
9. Liu XS. Implications of statistical power for confidence intervals. Br J Math Stat Psychol. 2012;65:427–37. doi: 10.1111/j.2044-8317.2011.02035.x.
10. Jiroutek MR, Muller KE, Kupper LL, Stewart PW. A new method for choosing sample size for confidence interval-based inferences. Biometrics. 2003;59:580–90. doi: 10.1111/1541-0420.00068.
11. Cesana BM, Reina G, Marubini E. Sample size for testing a proportion in clinical trials: a ‘two-step’ procedure combining power and confidence interval expected width. Am Stat. 2001;55:288–92. doi: 10.1198/000313001753272222.
12. Cesana BM. Sample size for testing and estimating the difference between two paired and unpaired proportions: a ‘two-step’ procedure combining power and the probability of obtaining a precise estimate. Stat Med. 2004;23:2359–73. doi: 10.1002/sim.1827.
13. Guyatt G, Jaeschke R, Heddle N, Cook D, Shannon H, Walter S. Basic statistics for clinicians: 2. Interpreting study results: confidence intervals. Can Med Assoc J. 1995;152:169–73.
14. Schoenfeld DA. Sample-size formula for the proportional-hazards regression model. Biometrics. 1983;39:499–503. doi: 10.2307/2531021.
15. Freedman LS. Tables of the number of patients required in clinical trials using the logrank test. Stat Med. 1982;1:121–9. doi: 10.1002/sim.4780010204.
16. Halabi S, Bahadur S. Sample size determination for comparing several survival curves with unequal allocations. Stat Med. 2004;23:1793–815. doi: 10.1002/sim.1771.
17. Greenland S. On sample-size and power calculations for studies using confidence intervals. Am J Epidemiol. 1988;128:231–7. doi: 10.1093/oxfordjournals.aje.a114945.

