Leveraging confidence intervals to inform the clinical significance of interventions

Pamela Fernainy; Nadia Sourial

Fam Pract. 2022 Jun 18;39(6):1196–1198. doi: 10.1093/fampra/cmac065

PMCID: PMC9680657 PMID: 35716096

Introduction

When testing the effect of an intervention, clinicians frequently use and report confidence intervals (CIs) to determine the statistical significance of the effect. CIs provide the range of values within which we are reasonably confident that the true effect lies.¹ A 95% CI means that, if the study were repeated 100 times, we would expect about 95 of the resulting CIs to contain the true effect of the intervention.² If the CI excludes the value of “no effect” (i.e. a difference of zero between the intervention and control group), it can be concluded that the result is statistically significant. CIs have been favoured by clinicians and quantitative experts because they are useful not only in determining statistical significance but also in providing information on the direction and range of the effect of the intervention. While these advantages of CIs are well known, an underutilized yet useful application of CIs is in assessing the potential clinical significance of an intervention.³ In this Methods Brief, we provide practical examples, for clinicians who use and report CIs, of how CIs can be leveraged to “rule in” or “rule out” effects according to both their statistical and clinical significance.
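The repeated-study reading of a 95% CI can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true effect, the standard deviation, the sample size, and the number of simulated studies are hypothetical values chosen for the example, and the interval uses a normal approximation with a known standard deviation.

```python
import random
import statistics


def coverage(true_effect=3.0, sigma=5.0, n=50, trials=2000, seed=0):
    """Fraction of simulated studies whose 95% CI contains the true effect.

    All default values are hypothetical; sigma is treated as known.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5  # normal-approximation margin
    hits = 0
    for _ in range(trials):
        # Simulate one study: n observed effects around the true effect.
        sample = [rng.gauss(true_effect, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= true_effect <= mean + half_width:
            hits += 1
    return hits / trials


print(coverage())  # expected to be close to 0.95
```

The proportion printed varies slightly with the seed and the number of simulated studies, which is why the interpretation is best phrased as "about 95 out of 100".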

Incorporating a minimally clinically important difference into the interpretation of confidence intervals

Let’s consider the following hypothetical example. A family physician wants to test if an exercise program can help her patients suffering from obesity lose at least 10% of their body weight over a period of 1 year. To that end, she randomizes her patients into two groups and observes an average decrease of 3% in body weight after 1 year in patients who followed the program compared to those who did not. Moreover, the results indicate a 95% CI of [2%, 4%].

In our example, the physician concludes that the result is “statistically significant” since the CI does not contain the value of 0%. However, she wonders, is the result “clinically significant?”
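As a rough sketch of how such an interval and the "excludes zero" check might be computed, the code below derives a 95% CI for a difference between two group means from summary statistics. The group means, standard deviations, and sample sizes are hypothetical (they are not from the example above), and the 1.96 multiplier assumes a normal approximation.

```python
import math


def diff_ci_95(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in means (a - b) with an approximate 95% CI."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)  # standard error
    margin = 1.96 * se  # normal approximation
    return diff, diff - margin, diff + margin


def excludes_no_effect(lower, upper, no_effect=0.0):
    """True when the CI does not contain the 'no effect' value."""
    return not (lower <= no_effect <= upper)


# Hypothetical summary data: mean % weight loss, SD, and group size.
diff, lower, upper = diff_ci_95(5.0, 3.0, 100, 2.0, 3.0, 100)
print(round(diff, 1), round(lower, 2), round(upper, 2))
print(excludes_no_effect(lower, upper))
```

With these made-up inputs the interval lies entirely above zero, so the difference would be called statistically significant; whether it is clinically significant is a separate question.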

“Clinically significant” within the context of clinical practice is mostly dependent on “the extent of change, whether the change makes a real difference to subjects’ lives, how long the effects last, consumer acceptability, cost-effectiveness and ease of implementation”.⁴ In other words, clinical significance indicates whether the results of a study are meaningful or not for stakeholders, including patients, clinicians, managers, and decision-makers.

The minimally clinically important difference (MCID) is one approach to establishing a threshold for clinical significance, a threshold decided on a priori, at the stage of study design.³ Going back to the example, the 10% goal that the family physician had set out to find would correspond to the MCID. There are various methods for determining the MCID, including anchor-based approaches and distribution-based methods.⁵ In the anchor-based approach, the patients’ perspective is the focus, and the MCID is determined using an external measure of change, usually a global assessment. For example, for weight loss, the HRQOL (health-related quality of life) questionnaire, which includes the patients’ perceptions of physical, mental, and social functioning and their changes following weight loss, has been considered by some researchers as the anchor. Distribution-based interpretations are based on statistical criteria.⁶ For instance, contemporary guidelines for weight loss in adult patients suffering from obesity define 5%–10% weight reductions as clinically significant for this population. In other words, this threshold of 10% is based on expert opinion and statistically significant improvements in cardio-metabolic risk.⁷ Incorporating an MCID into the interpretation of CIs allows researchers to go beyond solely examining statistical significance when drawing conclusions about the effectiveness of an intervention. Consequently, this reduces the risk of reporting statistical relationships that are not clinically significant or relevant to stakeholders.⁶

Leveraging the power of confidence intervals to “rule in” and “rule out” clinically significant differences

How then can CIs be used to inform the clinical significance of the exercise program? We will review four different hypothetical scenarios where the CIs can be used to “rule in” or “rule out” clinically significant effects based on the MCID.

Scenario 1: “rule out”: Let us once again consider the study described above, in which the physician calculated a 3% average weight loss for her patients and a 95% CI of [2%, 4%] (Fig. 1a). Since the value of 0% is not found within the 95% CI, the result can be considered statistically significant. However, the physician had predetermined an MCID of 10% average reduction in weight for her patients. Since the MCID was not contained within the CI, the physician could conclude that implementing the weight loss program for all her patients suffering from obesity would be unlikely to result in a clinically significant difference. Therefore, although the results were statistically significant, they were not clinically significant and could be used to “rule out” a possible clinically significant effect of this intervention.

Fig. 1. — Combining confidence intervals (CI) and the minimally clinically important difference (MCID) to “rule in” or “rule out” potentially clinically significant interventions. In this example, the MCID was set at 10% and represents the cut-off for a clinically significant effect: Scenario (a): Results are statistically significant but the CI does not contain the MCID. These results support that the clinically significant effect of the intervention can be “ruled out”. Scenario (b): Results are not statistically significant and the CI does not contain the MCID. These results support that the clinically significant effect of the intervention can be “ruled out”. Scenario (c): Results are not statistically significant but the CI contains the MCID. These results support that the clinically significant effect of the intervention can neither be “ruled in” nor “ruled out”. Scenario (d): Results are statistically significant and the CI contains the MCID. These results support that the clinically significant effect of the intervention can be “ruled in”.

Scenario 2: “rule out”: Another situation that might occur is if the physician obtained an average weight loss of 2% for her patients and a 95% CI of [−4%, 4%] (Fig. 1b). Since the CI excludes a possible weight loss difference at or above the MCID threshold of 10%, the physician could “rule out” a potentially clinically significant effect of the exercise program. Considering scenarios 1 and 2 together, we see that regardless of statistical significance, the CI in combination with the MCID allowed the family physician to conclude that the intervention was not clinically significant.

Scenario 3: Inconclusive, can neither “rule in” nor “rule out”: Now let us suppose instead that the physician calculated an 8% average weight loss for her patients and a 95% CI of [0%, 14%] (Fig. 1c). How should the physician interpret these results? At first glance, she might be inclined to conclude that the exercise program is not efficacious since the results were not statistically significant. However, the CI contains the MCID of 10%. The physician should therefore conclude that the results of the study were inconclusive, since a clinically significant effect of the exercise program can be neither “ruled in” nor “ruled out”. One explanation for this inconclusive outcome and wide CI may be the size of the sample (in this scenario, the sample size was 58), because the width of the CI depends on the sample size.⁸ A smaller sample size tends to produce a wider CI, while a larger sample size tends to produce a narrower CI.⁹ In this scenario, the study could be repeated with a larger sample size to reassess the potential clinical importance of the program. If the estimated effect stayed the same, repeating the experiment could lead to results that are both statistically and clinically significant. For example, if the sample size was doubled from 58 to 116, the CI would become [3%, 13%].
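The effect of sample size on CI width can be approximated with a short calculation. The sketch below assumes that the variability of the outcome stays the same and that the margin of error shrinks in proportion to one over the square root of the sample size (a normal approximation); the function name is hypothetical and chosen for illustration.

```python
import math


def rescaled_half_width(half_width, n_old, n_new):
    """Approximate CI half-width after changing the sample size.

    Assumes the same standard deviation, so the margin of error
    scales with 1 / sqrt(n).
    """
    return half_width * math.sqrt(n_old / n_new)


# Scenario 3: the CI of [0%, 14%] is 14 percentage points wide at n = 58.
old_width = 14.0
new_width = 2 * rescaled_half_width(old_width / 2, 58, 116)
print(round(new_width, 1))  # 9.9
```

Doubling the sample size narrows the interval by a factor of about 1.41, from 14 to roughly 10 percentage points, which matches the width of the [3%, 13%] interval quoted for a sample of 116.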

Scenario 4: “rule in”: Finally, suppose the physician calculates an average weight loss for her patients of 11% and a 95% CI of [9%, 14%] (Fig. 1d). In this scenario, the results are statistically significant, and the MCID value of 10% is contained within the CI. The physician would therefore be confident in “ruling in” the clinically significant effect of the exercise program.

In view of these four scenarios, researchers should consider interpreting statistical and clinical significance together by examining the CI in relation to the MCID.¹⁰ By considering the MCID when interpreting CIs, researchers can go beyond focusing only on statistical significance when drawing conclusions about the effectiveness of an intervention. Using this approach, clinicians will be able to identify interventions worthy of further exploration as well as interventions that are unlikely to yield benefit.⁶
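The decision logic of the four scenarios can be written as a small function. This is a minimal sketch, not a validated tool: the function name is hypothetical, it assumes that larger values mean greater benefit, and it treats a CI lying entirely above the MCID as "rule in", a case the scenarios above do not cover.

```python
def interpret_ci(lower, upper, mcid, no_effect=0.0):
    """Classify a CI against the MCID as in scenarios 1 to 4.

    Assumes larger values indicate greater benefit (e.g. % weight loss).
    """
    if upper < mcid:
        # The CI excludes the MCID: scenarios 1 and 2.
        return "rule out"
    if lower > no_effect:
        # Statistically significant and the CI reaches the MCID: scenario 4.
        return "rule in"
    # The CI reaches the MCID but also includes "no effect": scenario 3.
    return "inconclusive"


scenarios = {
    "1": (2, 4),
    "2": (-4, 4),
    "3": (0, 14),
    "4": (9, 14),
}
for name, (lower, upper) in scenarios.items():
    print(name, interpret_ci(lower, upper, mcid=10))
```

Applied to the intervals from the text with an MCID of 10%, this returns "rule out" for scenarios 1 and 2, "inconclusive" for scenario 3, and "rule in" for scenario 4.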

Conclusion

CIs are a helpful tool, not only in determining statistical significance, but also in determining clinical significance when considered in combination with the MCID. In short, it is recommended that the MCID be decided on a priori, during study design, and that the CI be interpreted with that threshold as a guide. Finally, our purpose in this paper was to highlight the important and often overlooked role that CIs can have in deciding to “rule in” or “rule out” clinically important interventions.

Contributor Information

Pamela Fernainy, Department of Health Management, Evaluation and Policy, University of Montreal, Montreal, QC, Canada.

Nadia Sourial, Department of Health Management, Evaluation and Policy, University of Montreal, Montreal, QC, Canada.

Funding

This work was funded by a Canadian Institutes of Health Research grant (PJT-178264).

Ethical approval

Not applicable.

Conflict of interest

No conflicts of interest to disclose.

Data availability

Not applicable.

References

1. Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed). 1986;292(6522):746–750.
2. Greenfield MLVH, Kuhn JE, Wojtys EM. A statistics primer. Am J Sports Med. 1998;26(1):145–149.
3. Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR. Methods to explain the clinical significance of health status measures. Mayo Clin Proc. 2002;77(4):371–383.
4. LeFort SM. The statistical versus clinical significance debate. Image J Nurs Sch. 1993;25(1):57–62.
5. Armijo-Olivo S. The importance of determining the clinical significance of research results in physical therapy clinical research. Braz J Phys Ther. 2018;22(3):175–176.
6. Rai SK, Yazdany J, Fortin PR, Aviña-Zubieta JA. Approaches for estimating minimal clinically important differences in systemic lupus erythematosus. Arthritis Res Ther. 2015;17(1):143.
7. Warkentin LM, Majumdar SR, Johnson JA, Agborsangaya CB, Rueda-Clausen CF, Sharma AM, Klarenbach SW, Karmali S, Birch DW, Padwal RS. Weight loss required by the severely obese to achieve clinically important differences in health-related quality of life: two-year prospective cohort study. BMC Med. 2014;12(1):175.
8. Cumming G. The new statistics: why and how. Psychol Sci. 2014;25(1):7–29.
9. Trafimow D. Confidence intervals, precision and confounding. New Ideas Psychol. 2018;50(1):48–53.
10. Ranganathan P, Pramesh CS, Buyse M. Common pitfalls in statistical analysis: clinical versus statistical significance. Perspect Clin Res. 2015;6(3):169–170.
