PLOS ONE. 2013 Mar 7;8(3):e58777. doi: 10.1371/journal.pone.0058777

Effect Sizes for 2×2 Contingency Tables

Jake Olivier 1,*, Melanie L Bell 2
Editor: Fabio Rapallo3
PMCID: PMC3591379  PMID: 23505560

Abstract

Sample size calculations are an important part of research to balance the use of resources and to avoid undue harm to participants. Effect sizes are an integral part of these calculations and meaningful values are often unknown to the researcher. General recommendations for effect sizes have been proposed for several commonly used statistical procedures. For the analysis of 2×2 tables, recommendations have been given for the correlation coefficient $\phi$ for binary data; however, it is well known that $\phi$ suffers from poor statistical properties. The odds ratio is not problematic, although recommendations based on objective reasoning do not exist. This paper proposes odds ratio recommendations that are anchored to $\phi$ for fixed marginal probabilities. It will further be demonstrated that the marginal assumptions can be relaxed, resulting in more general results.

Introduction

Sample size calculations are an integral part of scientifically useful and ethical research [1]. A study which is too small may not answer the research question, wasting resources and potentially putting participants at risk for no purpose [2]. Studies which are too large can also waste resources and expose participants to the potential harms of research needlessly, as well as delaying results and their translation into practice. The computation of sample size a priori is usually dependent upon predetermined values for power and level of significance, an estimate of the expected variability in the sample and an effect size of practical or clinical importance. By convention, the choice of power and level of significance is usually at least 80% and no more than 5% respectively. When a practically important effect size is unknown, there are several recommendations in the literature to guide the researcher. In his seminal paper, Cohen [3] gives operationally defined small, medium and large effect sizes for various common significance tests. The use of effect size recommendations should not replace differences of clinical or practical importance [4] and may not be appropriate for all disciplines. In basic science research, for example, large effect sizes by Cohen's criteria are common and, therefore, require small sample sizes. On the other hand, clinical and epidemiological research often deals with small effect sizes and often requires large, population-based studies. While there are some approaches to estimating a minimum important effect [5], there are instances where this information is simply not known. Thus, effect size recommendations assist with the balance between overly small and overly large sample sizes.

When the researcher is interested in 2×2 contingency tables, a common measure of effect size is $\phi$ which, in this instance, is equivalent to Pearson's correlation coefficient [6]. Cohen [3] recommends effect sizes of $\phi = 0.1$, $0.3$ and $0.5$ for small, medium and large effect sizes respectively, which are identical to his recommendations for the correlation coefficient. Although Cohen [3] denotes this statistic as $w$, much of the literature uses $\phi$ [6]–[10] and the remainder of this manuscript follows this convention. To support his recommended effect sizes for correlation coefficients, Cohen [11] chose equivalent values for the difference in two means through the connection with point biserial correlation. Additionally, $\phi$ is applicable to logistic regression since it can be converted to an odds ratio ($OR$) when the row (or column) marginal probabilities of the 2×2 table are fixed. For example, when the marginal probabilities are uniform (i.e., $\pi_{1+} = \pi_{+1} = 0.5$ for row and column probabilities), Cohen's recommended effect sizes are equivalent to odds ratios of $1.49$, $3.45$ and $9.00$. It will be demonstrated that the connection between the odds ratio and $\phi$ is largely dependent on the marginal probabilities and these $OR$ values should not be used in general.

A problem arises when using the effect size $\phi$ for 2×2 tables as the full range of correlation coefficients is only possible under very restrictive circumstances and is not justified in general [12]. On the other hand, odds ratios are valid effect size measures that are not constrained by the marginal probabilities. Ferguson [10] recommends small, medium, and large odds ratio effect sizes of $2.0$, $3.0$ and $4.0$, but urges caution in their use as they are not “anchored” to Pearson's correlation coefficient. Although many have pointed out problems with $\phi$ as an association measure and advocate the use of odds ratios as an alternative, effect size recommendations for odds ratios do not exist in general.

It is common in randomised controlled trials and case-control studies to fix one of the marginal probabilities in the 2×2 table as it directly relates to the ratio of participant allocation. For instance, a marginal probability of $0.5$ corresponds to a 1:1 case-control ratio while a 2:1 ratio is a marginal probability of $2/3$ (or equivalently $1/3$ for 1:2).

The aims of this paper are to demonstrate: (1) the equivalence of effect size measures for 2×2 contingency tables, in particular the relationship between $\phi$ and the odds ratio; (2) that recommended odds ratio effect sizes can be derived from Cohen's work using the maximum value of $\phi$ as a guideline for fixed marginal probabilities; (3) the shortcomings of $\phi$ and the strength of the odds ratio as an effect size measure; and (4) that conservative odds ratio effect size recommendations can be derived without relying on fixed margins. We provide an example that investigates the association between helmet wearing by bicyclists and overtaking distance by automobiles.

Equivalence of Effect Size Measures for 2×2 Contingency Tables

2×2 Contingency tables

The two-way classification or contingency table is a common method for summarising the relationship between two binary variables, say $X$ and $Y$. Table 1 gives the joint probability distribution of $X$ and $Y$ when their individual outcomes are from the set $\{0, 1\}$.

Table 1. 2×2 contingency table of probabilities.

X = 0 X = 1 Total
Y = 0 π00 π01 π0+
Y = 1 π10 π11 π1+
Total π+0 π+1 1.0

In this formulation, $\pi_{ij}$, for $i, j \in \{0, 1\}$, is the joint probability of $Y = i$ and $X = j$, $\pi_{i+}$ is the marginal probability of $Y = i$, and $\pi_{+j}$ is the marginal probability of $X = j$. Under an assumption of independence between $X$ and $Y$, the product of the marginal probabilities equals the cell probabilities, i.e., $\pi_{ij} = \pi_{i+}\pi_{+j}$. Alternatively, the 2×2 table could be represented by the frequency of observations so that $n_{ij} = n\pi_{ij}$ where $n = \sum_{i}\sum_{j} n_{ij}$. Similarly, the marginal frequencies are $n_{i+} = n\pi_{i+}$ and $n_{+j} = n\pi_{+j}$. Note that $\pi_{ij}$ is assumed to be the population proportion as the focus of this paper is the use of effect sizes as a planning tool and not statistical inference per se. In a case-control study, for example, $X$ may indicate the presence or absence of disease while $Y$ is an indication of exposure. Thus, $\pi_{11}$ would represent the joint probability of being diseased and exposed.

Effect size $\phi$ and Equivalences for 2×2 Tables

There are many association measures applicable to 2×2 tables which, with the exception of the odds ratio and relative risk, are equivalent or similar to $\phi$. The equivalence of some of these association measures is outlined below.

For the random sample $(x_1, y_1), \ldots, (x_n, y_n)$, Pearson's correlation coefficient is

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of the $x_i$ and $y_i$ respectively. Although used primarily as a measure of linear association, Pearson's correlation coefficient can be applied to binary variables and is often given the notation $\phi$. For the 2×2 table case, we get

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = n\left(\pi_{11} - \pi_{1+}\pi_{+1}\right)$$
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = n\,\pi_{+1}(1 - \pi_{+1})$$
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = n\,\pi_{1+}(1 - \pi_{1+})$$

So, Pearson's correlation coefficient for binary random variables $X$ and $Y$ is

$$\phi = \frac{\pi_{11} - \pi_{1+}\pi_{+1}}{\sqrt{\pi_{1+}(1 - \pi_{1+})\,\pi_{+1}(1 - \pi_{+1})}}$$

Since $\pi_{11} = \pi_{1+}\pi_{+1}$ under the hypothesis of independence, $\phi$ can be interpreted as measuring the departure from independence between $X$ and $Y$. Note that Cramér's $V$ is equivalent to this equation for the 2×2 table case [11] as well as the square root of Goodman and Kruskal's $\tau$ [13].

For the analysis of contingency tables in general (not just the 2×2 table case), the effect size formula for $m$ total cells is

$$\phi = \sqrt{\sum_{i=1}^{m} \frac{(\pi_{1i} - \pi_{0i})^2}{\pi_{0i}}}$$

where $\pi_{0i}$ and $\pi_{1i}$ are cell probabilities under the null and alternative hypotheses respectively. Note that $\phi$ is related to the usual chi-square statistic $\chi^2$ by $\phi = \sqrt{\chi^2/n}$ and is sometimes called the contingency coefficient. Using this formula, Cohen [3] recommends $\phi = 0.1$, $0.3$ and $0.5$ for small, medium and large effect sizes. Making note that $\pi_{1i}$ is the probability of each cell ($\pi_{ij}$) and $\pi_{0i}$ is the cell probability under an independence assumption (so that $\pi_{0i} = \pi_{i+}\pi_{+j}$), we can then write the effect size formula for the 2×2 table as follows

$$\phi = \operatorname{sgn}\left(\pi_{11} - \pi_{1+}\pi_{+1}\right)\sqrt{\sum_{i=0}^{1}\sum_{j=0}^{1} \frac{(\pi_{ij} - \pi_{i+}\pi_{+j})^2}{\pi_{i+}\pi_{+j}}}$$

Simple arithmetic demonstrates the equivalence of this expression with Pearson's correlation coefficient for binary variables. The $\operatorname{sgn}$ function is used to give the appropriate sign since the chi-square statistic is inherently non-directional.
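The equivalence described above can be checked numerically. The following is a minimal sketch, assuming the standard formula for $\phi$ in a 2×2 table and the cell layout of Table 1; the function names and the example table are hypothetical and chosen only for illustration.

```python
import math

def phi_from_cells(p00, p01, p10, p11):
    """Phi as Pearson's correlation for a 2x2 table of joint probabilities.

    Layout follows Table 1: first index is the Y row, second the X column.
    """
    row1 = p10 + p11  # pi_{1+}: marginal probability of Y = 1
    col1 = p01 + p11  # pi_{+1}: marginal probability of X = 1
    scale = math.sqrt(row1 * (1 - row1) * col1 * (1 - col1))
    return (p11 - row1 * col1) / scale

def signed_w_from_cells(p00, p01, p10, p11):
    """Chi-square-type effect size over the four cells, with the sign restored."""
    cells = [[p00, p01], [p10, p11]]
    rows = [p00 + p01, p10 + p11]
    cols = [p00 + p10, p01 + p11]
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j]  # cell probability under independence
            total += (cells[i][j] - expected) ** 2 / expected
    sign = math.copysign(1.0, p11 - rows[1] * cols[1])
    return sign * math.sqrt(total)

# Hypothetical table (p00, p01, p10, p11), for illustration only.
table = (0.4, 0.1, 0.2, 0.3)
phi = phi_from_cells(*table)
w = signed_w_from_cells(*table)
print(phi, w)
```

Both functions should agree up to floating-point error for any table with strictly positive margins.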

The relationship of $\phi$ to the odds ratio

The odds ratio for the association between $X$ and $Y$ is $OR = \pi_{11}\pi_{00}/(\pi_{10}\pi_{01})$. When the marginal probabilities are held constant and the cell probability $\pi_{11}$ is known, the remaining cell probabilities can be written as

$$\pi_{10} = \pi_{1+} - \pi_{11}$$
$$\pi_{01} = \pi_{+1} - \pi_{11}$$
$$\pi_{00} = 1 - \pi_{1+} - \pi_{+1} + \pi_{11}$$

Therefore, when the marginal probabilities are fixed, the odds ratio can be computed directly from $\pi_{11}$, which can then be expressed as

$$OR = \frac{\pi_{11}\left(1 - \pi_{1+} - \pi_{+1} + \pi_{11}\right)}{\left(\pi_{1+} - \pi_{11}\right)\left(\pi_{+1} - \pi_{11}\right)}$$

It is clear from the above formula that the odds ratio will be greater than one (or less than one) precisely when the joint probability $\pi_{11}$ is greater (or less) than expected under an assumption of independence, i.e., $\pi_{1+}\pi_{+1}$. Additionally, the formula for $\phi$ can be rearranged to solve for $\pi_{11}$, i.e.,

$$\pi_{11} = \pi_{1+}\pi_{+1} + \phi\sqrt{\pi_{1+}(1 - \pi_{1+})\,\pi_{+1}(1 - \pi_{+1})}$$

Although mathematically unattractive, it is clear the odds ratio can then be computed from $\phi$, $\pi_{1+}$, and $\pi_{+1}$. Note that when $\phi = 0$ (i.e., no correlation), we get $\pi_{11} = \pi_{1+}\pi_{+1}$ (i.e., $X$ and $Y$ are independent) and the odds ratio is $OR = 1$. When $\phi \neq 0$, the term $\phi\sqrt{\pi_{1+}(1 - \pi_{1+})\,\pi_{+1}(1 - \pi_{+1})}$ is then a measure of the departure from independence.
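The conversion from $\phi$ to an odds ratio for fixed margins can be sketched in a few lines. This is an illustrative sketch, not the authors' supplementary SAS macro; it assumes the standard 2×2 expression for $\phi$, and the function names are hypothetical. With both margins set to 0.5, Cohen's values of 0.1, 0.3 and 0.5 map to odds ratios of about 1.49, 3.45 and 9.

```python
import math

def p11_from_phi(phi, row1, col1):
    """Joint probability pi_11 implied by phi and the two fixed margins."""
    scale = math.sqrt(row1 * (1 - row1) * col1 * (1 - col1))
    return row1 * col1 + phi * scale

def odds_ratio_from_p11(p11, row1, col1):
    """Odds ratio once the margins and pi_11 determine the other cells."""
    p10 = row1 - p11
    p01 = col1 - p11
    p00 = 1 - row1 - col1 + p11
    return (p11 * p00) / (p10 * p01)

def odds_ratio_from_phi(phi, row1, col1):
    return odds_ratio_from_p11(p11_from_phi(phi, row1, col1), row1, col1)

# Uniform margins: Cohen's small, medium and large phi values.
uniform = [odds_ratio_from_phi(phi, 0.5, 0.5) for phi in (0.1, 0.3, 0.5)]
print([round(value, 2) for value in uniform])
```

Changing the margins changes the odds ratio attached to the same $\phi$, which is why the uniform-margin values do not transfer to other designs.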

Maximum $\phi$ and Modified Effect Sizes

When the marginal probabilities are fixed constants, $\phi$ is an increasing linear function of $\pi_{11}$. Further, $\pi_{11}$ is bounded by

$$\max\left(0,\ \pi_{1+} + \pi_{+1} - 1\right) \le \pi_{11} \le \min\left(\pi_{1+},\ \pi_{+1}\right)$$

These bounds are due to all cell probabilities being non-negative and the relationship of $\pi_{11}$ with the other cell probabilities given above. As a result, $\phi$ is bounded as well and attains its maximum when $\pi_{11} = \min(\pi_{1+}, \pi_{+1})$. Using the upper bound of the above inequality, it can be shown that

$$\phi_{\max} = \sqrt{\frac{\pi_{1+}(1 - \pi_{+1})}{\pi_{+1}(1 - \pi_{1+})}}$$

where $\pi_{1+} \le \pi_{+1}$ to ensure $\phi_{\max} \le 1$. It is clear from the formula for $\phi_{\max}$ that the full range of correlation coefficients, i.e., $-1 \le \phi \le 1$, is attainable only when the marginal probabilities are equal, i.e., $\pi_{1+} = \pi_{+1}$ or $\pi_{1+} = \pi_{+0}$. This has an intuitive appeal as perfect correlation for two binary variables is only possible when two cell probabilities are zero. For example, when all observations are in either the $\pi_{00}$ or $\pi_{11}$ cells, $\phi = 1$. However, it would appear highly unlikely both marginal probabilities will be equal in practice. For example, in a 1:1 case-control study with mortality as the primary outcome, half of all patients would need to die for perfect correlation to be possible. On the other hand, if $10\%$ of all patients die, the maximum correlation possible is $\phi_{\max} = 0.33$, which is near a medium recommended effect size. So, in this situation, all estimates of $\phi$, computed from observed proportions, are bounded by

$$-0.33 \le \phi \le 0.33$$

Importantly, odds ratios are not bounded: all values $0 \le OR < \infty$ are possible as $\pi_{11}$ varies over its admissible interval. In fact, as $\phi$ approaches $\phi_{\max}$, the $OR$ increases without bound. Figure 1 demonstrates this relationship. This indicates $\phi$ has serious limitations as a measure of association and that these limitations are not applicable to the odds ratio.
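The attainable range of $\phi$ for given margins follows directly from the bounds on $\pi_{11}$. The sketch below assumes the standard 2×2 expression for $\phi$; the function name is hypothetical. It reproduces the mortality example: with one margin at 0.5 and the other at 0.1, $\phi$ cannot leave roughly $\pm 0.33$.

```python
import math

def phi_range(row1, col1):
    """Smallest and largest phi attainable for the two fixed margins."""
    scale = math.sqrt(row1 * (1 - row1) * col1 * (1 - col1))
    p11_low = max(0.0, row1 + col1 - 1.0)  # all cells must stay non-negative
    p11_high = min(row1, col1)             # pi_11 cannot exceed either margin
    independent = row1 * col1
    return ((p11_low - independent) / scale, (p11_high - independent) / scale)

low, high = phi_range(0.5, 0.1)            # 1:1 allocation, 10% mortality
full_low, full_high = phi_range(0.5, 0.5)  # both margins equal to 0.5
print(round(low, 2), round(high, 2), full_low, full_high)
```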

Figure 1. Relationship between the odds ratio and $\phi$ for unequal marginal probabilities.

Effect Sizes Relative to $\phi_{\max}$

In many practical instances, the marginal probabilities are not equal, making the full range of values for $\phi$ impossible with the potential of making Cohen's recommended effect sizes unusable for 2×2 tables. Although not equivalent to perfect correlation, $\phi_{\max}$ can be interpreted as the maximum possible correlation given the marginal probabilities. In fact, $\phi/\phi_{\max}$ has been proposed as an association measure with the interpretation as the proportion of observed correlation relative to the maximum attainable with fixed marginal probabilities [7], although the researcher is cautioned when the marginal probabilities diverge [6]. Note that $\phi$ is not equivalent to Cohen's similarity/agreement measure $\kappa$. However, $\kappa$ suffers from the same boundary problems as $\phi$ and the two are equivalent when scaled to their maximum values, i.e., $\phi/\phi_{\max} = \kappa/\kappa_{\max}$, making the two measures similar [6].

Recommended effect sizes in terms of the odds ratio

As an alternative to Cohen's recommendations, increments of $\phi_{\max}$ can be related to the odds ratio, say $\phi = k\,\phi_{\max}$, where $0 \le k \le 1$. Note that values of $k = 0.1$, $0.3$ or $0.5$ coincide with Cohen's usual recommendations when $\phi_{\max} = 1$. The relationship between $\phi$ and the odds ratio can be simplified by choosing marginal probabilities for commonly used participant allocations. As an example, Figures 2 and 3 demonstrate the relationship between $\pi_{1+}$ and odds ratios for $k = 0.1$, $k = 0.3$ and $k = 0.5$ for 1:1 and 1:2 allocations respectively. Note that the minimal odds ratios, and therefore most conservative when used to compute sample size, occur when $\pi_{1+}$ tends to $0$. Although the odds ratio does not exist when $\pi_{1+} = 0$, the limit exists and is

$$\lim_{\pi_{1+} \to 0} OR = \frac{\pi_{+1} + k\,(1 - \pi_{+1})}{\pi_{+1}\,(1 - k)}$$
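The limiting odds ratio can be explored numerically. The sketch below rests on assumptions that are not spelled out in the surviving text: `alloc` denotes the fixed allocation margin, `other` the second margin with `other <= alloc`, and $\phi$ is set to a fraction `k` of its maximum. Under these assumptions $\pi_{11}$ is `other * (alloc + k * (1 - alloc))`. The function names are hypothetical.

```python
def odds_ratio_at_fraction(k, alloc, other):
    """Odds ratio when phi equals k times its maximum (requires other <= alloc)."""
    p11 = other * (alloc + k * (1 - alloc))
    p_other_only = other - p11      # second variable = 1, allocation variable = 0
    p_alloc_only = alloc - p11      # allocation variable = 1, second variable = 0
    p00 = 1 - alloc - other + p11
    return (p11 * p00) / (p_other_only * p_alloc_only)

def limiting_odds_ratio(k, alloc):
    """Limit of the odds ratio as the second margin tends to zero."""
    return (alloc + k * (1 - alloc)) / (alloc * (1 - k))

# 1:1 allocation, with phi at 10%, 30% and 50% of its maximum.
minimal = [limiting_odds_ratio(k, 0.5) for k in (0.1, 0.3, 0.5)]
print([round(value, 2) for value in minimal])
```

For 1:1 allocation the limit reduces to $(1 + k)/(1 - k)$, which gives the values 1.22, 1.86 and 3.00 that head the rows of Table 2.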

Figure 2. Odds ratios and marginal probability by small, medium and large effect sizes for 1:1 allocation.

Figure 3. Odds ratios and marginal probability by small, medium and large effect sizes for 1:2 allocation.

Additionally, the maximal odds ratio, and therefore most anti-conservative, occurs when the marginal probabilities are equal, as expected. Below is the maximum attainable odds ratio for equal margins $\pi_{1+} = \pi_{+1}$ for increments $k$ of $\phi_{\max}$,

$$OR_{\max} = \frac{\left[\pi_{+1} + k\,(1 - \pi_{+1})\right]\left[1 - \pi_{+1} + k\,\pi_{+1}\right]}{\pi_{+1}\,(1 - \pi_{+1})\,(1 - k)^2}$$
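For equal margins, the odds ratio at a fraction $k$ of the maximum correlation has a closed form that is easy to evaluate. The sketch below assumes both margins equal a common value `margin` (so that $\phi_{\max} = 1$) and uses a hypothetical function name; the closed form was derived for this illustration from the cell probabilities implied by fixed margins.

```python
def max_odds_ratio_equal_margins(k, margin):
    """Odds ratio when both margins equal `margin` and phi = k."""
    numerator = (margin + k * (1 - margin)) * (1 - margin + k * margin)
    denominator = margin * (1 - margin) * (1 - k) ** 2
    return numerator / denominator

# Both margins at 0.5: the expression reduces to ((1 + k) / (1 - k)) ** 2.
maximal = [max_odds_ratio_equal_margins(k, 0.5) for k in (0.1, 0.3, 0.5)]
print([round(value, 2) for value in maximal])
```

With both margins at 0.5 these are the odds ratios that correspond to Cohen's recommendations under uniform margins, roughly 1.49, 3.45 and 9.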

It is important to note that when $\pi_{+1} \le 0.5$, as is often true for case-control studies where cases are harder to identify or enrol than controls, the minimal odds ratio will be smallest for evenly allocated studies, i.e., $\pi_{+1} = 0.5$. Further, it is generally recommended to use 1:1 allocation as it is the most statistically efficient ratio, i.e., maximum power for a fixed overall sample size. So, odds ratios of $1.22$, $1.86$ and $3.00$ can be used as small, medium and large effect sizes without assumptions regarding marginal probabilities. Sample sizes computed using these odds ratios for 1:1 allocation are given in Table 2 for $80\%$ power and $5\%$ level of significance. A SAS macro that will compute sample sizes from given marginal probabilities for small, medium and large odds ratios has been provided as a supplementary file.

Table 2. Sample sizes calculated for small, medium and large effect sizes for 1:1 allocation, 80% power and $\alpha = 0.05$.

π1+
Odds Ratio 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1.22 8168 4688 3646 3254 3188 3386 3948 5282 9576
1.86 724 436 354 330 338 374 454 632 1188
3.00 200 128 110 108 116 134 170 246 480

Interestingly, Haddock et al. [12] as a rule of thumb consider odds ratios greater than $3$ large effect sizes, although there is no clear justification given. In a situation where an allocation ratio other than 1:1 is used, recommended odds ratios can be computed directly using the above formula. These results also apply to other values of a marginal probability through its complement. This is equivalent to swapping the columns (or rows) and the researcher should be aware the recommended odds ratio effect sizes are now the reciprocals of those above, i.e., $0.82$, $0.54$ and $0.33$ for small, medium and large respectively.

This approach can also be applied to the relative risk and risk difference. If $X$ is taken as the grouping variable and $Y$ as the outcome, the relative risk is $RR = (\pi_{11}/\pi_{+1})/(\pi_{10}/\pi_{+0})$. Simple substitution of $\phi = k\,\phi_{\max}$ and the marginal probabilities $\pi_{1+}$ and $\pi_{+1}$ results in a relative risk identical to the limiting $OR$ for $\pi_{1+} \to 0$, i.e.,

$$RR = \frac{\pi_{+1} + k\,(1 - \pi_{+1})}{\pi_{+1}\,(1 - k)}$$

Therefore, recommendations can also be derived for relative risk and are identical to those given for the odds ratio above. This result is expected as the odds ratio converges to the relative risk as the incidence rate approaches $0$.
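The substitution behind the relative risk result can be written out step by step. The derivation below is a sketch under stated assumptions rather than a reproduction of the original display: $X$ is the grouping variable with margin $\pi_{+1}$, $Y$ is the outcome with margin $\pi_{1+} \le \pi_{+1}$, and $\phi$ is set to a fraction $k$ of its maximum, so that the departure from independence is $k\,\pi_{1+}(1 - \pi_{+1})$.

```latex
\begin{align*}
\pi_{11} &= \pi_{1+}\pi_{+1} + k\,\pi_{1+}(1-\pi_{+1})
          = \pi_{1+}\bigl[\pi_{+1} + k(1-\pi_{+1})\bigr],\\
\pi_{10} &= \pi_{1+} - \pi_{11}
          = \pi_{1+}(1-\pi_{+1})(1-k),\\
RR &= \frac{\pi_{11}/\pi_{+1}}{\pi_{10}/(1-\pi_{+1})}
    = \frac{\pi_{+1} + k(1-\pi_{+1})}{\pi_{+1}(1-k)}.
\end{align*}
```

The outcome margin $\pi_{1+}$ cancels, so under these assumptions the relative risk depends only on the allocation margin and $k$. For 1:1 allocation it reduces to $(1+k)/(1-k)$.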

Instead of comparing the risk between two groups as a ratio, it is sometimes useful to compare their differences [14]. Again taking $X$ as the grouping variable and $Y$ as the outcome, the risk difference can be written as

$$RD = \frac{\pi_{11}}{\pi_{+1}} - \frac{\pi_{10}}{\pi_{+0}} = \frac{\pi_{11} - \pi_{1+}\pi_{+1}}{\pi_{+1}(1 - \pi_{+1})}$$

where $\pi_{1+} \le \pi_{+1}$ to ensure $\phi_{\max} \le 1$ as above. It is clear from the numerator in this representation that $RD$ is a measure of the departure from independence, i.e., $\pi_{11} - \pi_{1+}\pi_{+1}$. Simple substitution of $\phi = k\,\phi_{\max}$ into $RD$ yields

$$RD_k = \frac{k\,\pi_{1+}}{\pi_{+1}}$$

where the subscript $k$ is used to distinguish between risk difference formulae. This formula can be simplified somewhat for 1:1 allocations, i.e., $RD_k = 2k\,\pi_{1+}$; however, a general result independent of the marginal probabilities is clearly not possible in this instance as $\pi_{1+} \to 0$ and therefore $RD_k \to 0$.
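The dependence of the risk difference on the margins can be seen with a small calculation. The sketch below assumes the grouping variable has margin `alloc`, the outcome has margin `other` with `other <= alloc`, and $\phi$ equals a fraction `k` of its maximum; the function name and the example values are hypothetical.

```python
def risk_difference_at_fraction(k, alloc, other):
    """Risk difference between groups when phi is k times its maximum."""
    p11 = other * (alloc + k * (1 - alloc))        # outcome and group both 1
    risk_group1 = p11 / alloc                      # P(outcome = 1 | group = 1)
    risk_group0 = (other - p11) / (1 - alloc)      # P(outcome = 1 | group = 0)
    return risk_group1 - risk_group0

# Illustrative values: 1:1 allocation, outcome margin 0.2, k = 0.3.
rd = risk_difference_at_fraction(0.3, 0.5, 0.2)
closed_form = 0.3 * 0.2 / 0.5                      # k * other / alloc
print(rd, closed_form)
```

Because the result scales with the outcome margin, it shrinks towards zero as that margin does, so no margin-free recommendation is available for the risk difference.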

Alternatively, the $OR$ formula can be solved for $k$ and compared to previously given odds ratio recommendations. In terms of $OR$ and $\pi_{+1}$, we get

$$k = \frac{\pi_{+1}\,(OR - 1)}{1 + \pi_{+1}\,(OR - 1)}$$

When the allocation ratio is 1:1, this formula simplifies to $k = (OR - 1)/(OR + 1)$, which has a form identical to Yule's $Q$ [15]. So, Ferguson's [10] odds ratio recommendations of $2.0$, $3.0$ and $4.0$ therefore correspond to proportions of maximum correlation of $k = 0.33$, $0.50$ and $0.60$. This suggests Ferguson's recommendations have the potential to be anti-conservative from a sample size viewpoint.
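The mapping from an odds ratio back to a proportion of maximum correlation can be checked directly. The sketch below assumes the limiting (minimal) odds ratio relationship with allocation margin `alloc`; the function name is hypothetical.

```python
def fraction_from_odds_ratio(odds_ratio, alloc):
    """Proportion k of maximum correlation implied by a minimal odds ratio."""
    excess = alloc * (odds_ratio - 1)
    return excess / (1 + excess)

# 1:1 allocation: reduces to (OR - 1) / (OR + 1), the form of Yule's Q.
ferguson = [fraction_from_odds_ratio(value, 0.5) for value in (2.0, 3.0, 4.0)]
print([round(k, 2) for k in ferguson])
```

Applied to odds ratios of 2, 3 and 4, this gives proportions of about 0.33, 0.5 and 0.6, all well above the 0.1, 0.3 and 0.5 increments used for small, medium and large effects.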

Example

This paper was motivated by a reanalysis of passing distances for motor vehicles overtaking a bicyclist [16]. One of the primary results of this study was a significant association between helmet wearing and less overtaking distance, supporting a theory of risk perception for motor vehicle drivers directed towards bicyclists. Prior to collecting data, Walker [16] reported computing a sample size of Inline graphic overtaking manoeuvres based on a 2×5 fixed effects factorial ANOVA for a small effect size Inline graphic, Inline graphic level of significance and Inline graphic power. The factors for this study were helmet wearing (2 levels) and bicycle position relative to the kerb (5 levels). It has been noted, however, that passing distances are often recommended and sometimes legislated to one metre or more [17]. So, passing manoeuvres of at least a metre are considered safe and less than a metre unsafe, with the implication that large differences in passing distance are unimportant beyond one metre in terms of bicycle safety. When compared with helmet wearing, safe/unsafe passing distances can be analysed using a 2×2 table. Since Walker's study was powered at an unusually high level with subsequent increased probability of a type I error, bootstrap standard errors were estimated for more reasonable values for power of Inline graphic, Inline graphic and Inline graphic. Operationally defined small, medium and large effect sizes were also used since a meaningful difference in overtaking distance is unknown.

The relevant observed data from Walker [16] is given in Table 3. The observed marginal proportions here are Inline graphic for helmet wearing and Inline graphic for unsafe passing manoeuvres. Using the marginal probabilities, the maximum attainable effect size is Inline graphic and the estimated correlation is Inline graphic. A consequence is the effect size for the association between helmet wearing and safe passing distance is, at best, much less than a small effect size by Cohen's index. The corresponding small, medium and large odds ratio effect sizes using increments of Inline graphic are Inline graphic and Inline graphic for Inline graphic and Inline graphic. Note that these values are not much greater than the minimal recommended odds ratios mentioned in the previous section, further suggesting the association between safe/unsafe passing distance and helmet wearing is, at best, a small effect size. In fact, the unadjusted odds ratio is Inline graphic and non-significant by the chi-square test (Inline graphic). Conversely, sample sizes for a future study can be computed from the observed probabilities using G*Power for logistic regression with a single binomially distributed predictor for Inline graphic and Inline graphic power [18] resulting in Inline graphic and Inline graphic observations for small, medium and large odds ratios. To put these sample size computations into perspective, a future study would need to extend the sampling period by a factor greater than seven to detect a significant association between helmet wearing and safe/unsafe overtaking distance given a small effect size and identical marginal probabilities.

Table 3. Observed proportion of helmet use and safe passing manoeuvres from Walker (2007).

No Helmet Helmet Total
Safe 0.491 0.462 0.953
Unsafe 0.021 0.026 0.047
Total 0.512 0.488
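The quantities discussed in this example can be recomputed from Table 3. The sketch below uses only the rounded proportions printed in the table and the standard 2×2 formulas for $\phi$, its maximum and the odds ratio, so the results may differ slightly from values computed from the raw counts. Variable names are hypothetical.

```python
import math

# Joint proportions from Table 3 (rounded to three decimals in the source).
safe_no_helmet, safe_helmet = 0.491, 0.462
unsafe_no_helmet, unsafe_helmet = 0.021, 0.026

unsafe = unsafe_no_helmet + unsafe_helmet  # marginal proportion of unsafe passes
helmet = safe_helmet + unsafe_helmet       # marginal proportion of helmet wearing

scale = math.sqrt(unsafe * (1 - unsafe) * helmet * (1 - helmet))
independent = unsafe * helmet              # joint proportion expected under independence

phi = (unsafe_helmet - independent) / scale
phi_max = (min(unsafe, helmet) - independent) / scale
odds_ratio = (unsafe_helmet * safe_no_helmet) / (unsafe_no_helmet * safe_helmet)

print(f"phi = {phi:.3f}, phi_max = {phi_max:.3f}, OR = {odds_ratio:.2f}")
```

Because unsafe passes are rare while helmet wearing is close to evenly split, the maximum attainable correlation is far below 1, and the observed correlation is a small fraction of that maximum.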

Discussion

We present a demonstration that many contingency table correlation measures are equivalent for the 2×2 case and their use is limited due to constraints created by fixed marginal probabilities. The odds ratio, which is a function of these measures for fixed marginal probabilities, is not problematic, is regularly used in statistical analyses and has a direct application to logistic regression. Recommended odds ratios have been proposed from Cohen's small, medium and large effect sizes for $\phi$ relative to the maximum attainable correlation $\phi_{\max}$. Further, minimal odds ratios can be computed with only knowledge of participant allocation.

The use of effect size recommendations should be avoided in situations in which clinical or practical differences are known. However, they can help the researcher balance between overly large or overly small sample size calculations when such information is unknown. In these situations, conservative estimates for odds ratio effect sizes can be derived from only the allocation ratio leading to a general result and, when a 1:1 allocation is chosen for optimal power, odds ratios of $1.22$, $1.86$ and $3.00$ correspond to small, medium and large effect sizes.

Supporting Information

File S1

SAS Macro to compute sample sizes from marginal probabilities for small, medium and large odds ratios.

(SAS)

Acknowledgments

The authors would like to thank Warren May, David Warton and Jakub Stoklosa for their help in the preparation of this manuscript.

Funding Statement

The authors have no support or funding to report.

References

  • 1. Lewis J (1999) Statistical principles for clinical trials (ICH E9): an introductory note on an international guideline. Statistics in Medicine 18: 1903–1942.
  • 2. Halpern S, Karlawish J, Berlin J (2002) The continuing unethical conduct of underpowered clinical trials. JAMA: The Journal of the American Medical Association 288: 358–362.
  • 3. Cohen J (1992) A power primer. Psychological Bulletin 112: 155–159.
  • 4. Lenth R (2001) Some practical guidelines for effective sample size determination. The American Statistician 55: 187–193.
  • 5. King M (2011) A point of minimal important difference (MID): a critique of terminology and methods. Expert Review of Pharmacoeconomics & Outcomes Research 11: 171–184.
  • 6. Davenport Jr E, El-Sanhurry N (1991) Phi/phimax: review and synthesis. Educational and Psychological Measurement 51: 821–828.
  • 7. Ferguson G (1941) The factorial interpretation of test difficulty. Psychometrika 6: 323–329.
  • 8. Guilford J (1965) The minimal phi coefficient and the maximal phi. Educational and Psychological Measurement 25: 3–8.
  • 9. Breaugh J (2003) Effect size estimation: Factors to consider and mistakes to avoid. Journal of Management 29: 79–97.
  • 10. Ferguson C (2009) An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research and Practice 40: 532–538.
  • 11. Cohen J (1988) Statistical Power Analysis for the Behavioral Sciences. Hillsdale, New Jersey: Lawrence Erlbaum Associates.
  • 12. Haddock C, Rindskopf D, Shadish W (1998) Using odds ratios as effect sizes for meta-analysis of dichotomous data: A primer on methods and issues. Psychological Methods 3: 339–353.
  • 13. Agresti A (2002) Categorical Data Analysis. New York: Wiley-Interscience.
  • 14. Greenberg R, Daniels S, Flanders W, Eley J, Boring J (1996) Medical Epidemiology. Appleton & Lange.
  • 15. Liebetrau A (1983) Measures of Association, volume 32. Sage Publications, Incorporated.
  • 16. Walker I (2007) Drivers overtaking bicyclists: Objective data on the effects of riding position, helmet use, vehicle type and apparent gender. Accident Analysis & Prevention 39: 417–425.
  • 17. Olivier J (2013) Bicycle helmet wearing is not associated with close overtaking: A re-analysis of Walker 2007. Submitted.
  • 18. Faul F, Erdfelder E, Buchner A, Lang A (2009) Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods 41: 1149–1160.
