CMAJ: Canadian Medical Association Journal. 2002 Jan 8;166(1):65–66.

If we're so different, why do we keep overlapping? When 1 plus 1 doesn't make 2

Rory Wolfe, James Hanley
PMCID: PMC99228  PMID: 11800251

In the last decade, guidelines for the presentation of statistical results in medical journals have emphasized confidence intervals (CIs) as an adjunct to, or even a replacement for, statistical tests and p values. Because of the intimate links between the 2 concepts, authors now use statements like “the 95% CI overlaps 0” where they would formerly have stated “the difference is not statistically significant at the 5% level.” Although this interchangeability is technically correct in 1-sample situations, it does not carry over fully to comparisons involving 2 samples. A frequently encountered misconception is that if 2 independent 95% CIs overlap each other, as they do in Fig. 1, then a statistical test of the difference will not be statistically significant at the 5% level.


Fig. 1: Group means with confidence intervals that overlap.

Why is this not necessarily so? Consider the means in 2 independent groups, meanA and meanB, where for simplicity meanA is the smaller of the 2. The 95% CI for the mean in group A is given approximately by meanA plus or minus twice the standard error of the mean for that group, SEA, and correspondingly for group B. To check mathematically whether these CIs overlap, add the distance 2SEA (from meanA to the upper bound of its CI) to the distance 2SEB (from the lower bound of the CI for group B to meanB) and compare this sum with the distance between the 2 means, that is, meanB minus meanA (Fig. 2). The CIs overlap when

\[ 2\,\mathrm{SE}_A + 2\,\mathrm{SE}_B > \mathrm{mean}_B - \mathrm{mean}_A . \]

Fig. 2: Confidence intervals and comparison of 2 group means (hypothetical clinical trial data: SEA = SEB = 1.8, means differ by 3 SE; assuming n > 30 and independent samples, the 2-sided p value for testing the difference in means is approximately 0.036). SE = standard error of the mean.

But overlapping confidence intervals do not demonstrate that group means are not statistically significantly different from each other. In a 2-sample t-test to compare 2 means, significance is attained at the 0.05 level if the t statistic exceeds the critical value of about 2, which occurs when the difference between the means exceeds twice its standard error, namely, if

\[ \mathrm{mean}_B - \mathrm{mean}_A > 2\,\mathrm{SE}(\mathrm{mean}_B - \mathrm{mean}_A) . \]

The standard error of a difference combines the standard errors of the 2 estimates, not by simple addition but by “adding in quadrature,” that is,

\[ \mathrm{SE}(\mathrm{mean}_B - \mathrm{mean}_A) = \sqrt{\mathrm{SE}_A^2 + \mathrm{SE}_B^2} . \]
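As a numerical check of the quadrature rule, the following minimal sketch plugs in the values from the Fig. 2 caption (SEA = SEB = 1.8, means 3 SE apart). The function names are illustrative, and the p value uses a normal approximation rather than a t distribution, which is an assumption made here for simplicity.

```python
import math

def se_of_difference(se_a, se_b):
    """Standard error of a difference between two independent estimates,
    obtained by adding the individual standard errors in quadrature."""
    return math.hypot(se_a, se_b)

def two_sided_p_normal(diff, se_diff):
    """Two-sided p value for diff / se_diff under a normal approximation
    (an assumption; a t distribution gives a slightly larger p value)."""
    z = abs(diff) / se_diff
    return math.erfc(z / math.sqrt(2))

# Values taken from the Fig. 2 caption.
se_a = se_b = 1.8
diff = 3 * se_a                      # the means differ by 3 SE

se_diff = se_of_difference(se_a, se_b)
p = two_sided_p_normal(diff, se_diff)
print(f"SE of difference = {se_diff:.3f}, z = {diff / se_diff:.3f}, p = {p:.3f}")
# prints: SE of difference = 2.546, z = 2.121, p = 0.034
```

The normal approximation gives roughly 0.034, slightly below the 0.036 quoted in the caption; the gap is consistent with the caption relying on a t reference distribution with finite degrees of freedom, although the source does not state the exact sample sizes.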

Thus, 2 separate 95% CIs overlap while the difference between the means is nevertheless significant at the 0.05 level when, as a rough rule,

\[ 2\sqrt{\mathrm{SE}_A^2 + \mathrm{SE}_B^2} < \mathrm{mean}_B - \mathrm{mean}_A < 2\,(\mathrm{SE}_A + \mathrm{SE}_B) . \]
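The rough rule can be sketched as a small function that reports, for 2 independent groups, whether the approximate 95% CIs overlap and whether the difference is significant at about the 0.05 level. The function name, the `multiplier` parameter and the group means used below are hypothetical; the multiplier of 2 is the same approximation to the critical value used in the text.

```python
import math

def compare_intervals(mean_a, mean_b, se_a, se_b, multiplier=2.0):
    """Return (overlap, significant) for two independent group means.

    overlap:     the intervals mean +/- multiplier * SE share some values.
    significant: the difference exceeds multiplier times its own SE,
                 where that SE is the quadrature sum of se_a and se_b.
    """
    diff = abs(mean_b - mean_a)
    overlap = diff < multiplier * (se_a + se_b)
    significant = diff > multiplier * math.hypot(se_a, se_b)
    return overlap, significant

# Hypothetical means, with SE = 1.8 in each group as in the Fig. 2 caption.
print(compare_intervals(10.0, 15.4, 1.8, 1.8))  # (True, True): overlap, yet significant
print(compare_intervals(10.0, 14.0, 1.8, 1.8))  # (True, False): overlap, not significant
print(compare_intervals(10.0, 18.0, 1.8, 1.8))  # (False, True): no overlap, significant
```

Because the quadrature sum can never exceed the simple sum, the combination of non-overlapping intervals with a non-significant difference cannot occur under this rule.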

If SEA and SEB are equal, the condition is as follows:

\[ 2\sqrt{2}\,\mathrm{SE} \approx 2.83\,\mathrm{SE} < \mathrm{mean}_B - \mathrm{mean}_A < 4\,\mathrm{SE} . \]

When one SE is 25% larger than the other, the boundaries are 3.2 and 4.5 times the smaller SE. As the lower boundary remains close to 3, Moses [1] was prompted to display group means with error bars extending 1.5 SE on either side of the mean in order to have a “by eye” test of significance between the 2 group means while presenting the information in the 2 groups separately.
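The boundary values quoted above can be reproduced with a short calculation. In this sketch, `ratio` (the larger SE divided by the smaller) and both function names are illustrative; the second function reflects one reading of the “by eye” test, namely that bars of 1.5 SE fail to touch once the means differ by more than 1.5 times the sum of the 2 SEs.

```python
import math

def boundaries(ratio):
    """Lower and upper bounds on the difference in means, in units of the
    smaller SE, between which 95% CIs overlap yet the difference is
    significant. `ratio` is the larger SE divided by the smaller SE."""
    lower = 2 * math.sqrt(1 + ratio ** 2)   # twice the quadrature sum
    upper = 2 * (1 + ratio)                 # twice the simple sum
    return lower, upper

def bar_gap_threshold(ratio, bar=1.5):
    """Difference in means, in units of the smaller SE, beyond which
    error bars of +/- bar * SE no longer overlap."""
    return bar * (1 + ratio)

for ratio in (1.0, 1.25):
    lower, upper = boundaries(ratio)
    print(f"ratio {ratio}: significant above {lower:.2f}, "
          f"CIs overlap below {upper:.2f}, "
          f"1.5 SE bars separate above {bar_gap_threshold(ratio):.3f}")
# prints:
# ratio 1.0: significant above 2.83, CIs overlap below 4.00, 1.5 SE bars separate above 3.000
# ratio 1.25: significant above 3.20, CIs overlap below 4.50, 1.5 SE bars separate above 3.375
```

With these ratios the point at which the 1.5 SE bars separate lies just above the significance boundary, so separated bars indicate a significant difference, while bars that barely touch may still correspond to one.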

Footnotes

This article has been peer reviewed.

Competing interests: None declared.

Correspondence to: Dr. Rory Wolfe, Department of Epidemiology and Preventive Medicine, Central and Eastern Clinical School, Monash University and the Alfred Hospital, Commercial Rd., Prahran, Victoria 3183, Australia; fax 61 3 9903 0556; rory.wolfe@med.monash.edu.au

Reference

  • 1. Moses LE. Graphical methods in statistical analysis. Annu Rev Public Health 1987;8:309-53.

Articles from CMAJ: Canadian Medical Association Journal are provided here courtesy of Canadian Medical Association
