Bayesian Inference: Understanding Experimental Data With Informative Hypotheses

Sabeeh A Baig

Nicotine Tob Res. 2020 Jul 1;22(11):2118–2121. doi: 10.1093/ntr/ntaa120

PMCID: PMC7593362 PMID: 32610347

Introduction

When analyzing experimental data, researchers will often test the null hypothesis that the means for all conditions are equal. If the null hypothesis is rejected, researchers will conduct pairwise comparisons of the means to better understand the effect of an experimental manipulation. This exploratory procedure assumes that researchers know nothing more about the data than that they correspond to the experimental design. However, in designing an experiment, researchers commonly have clear expectations of how a manipulation will behave (e.g., in a three-arm design, both treatments will have similarly positive effects in comparison to the placebo control), often in light of prior studies and/or relevant theory. Bayesian psychologists and statisticians refer to these expectations as informative hypotheses and have routinely emphasized testing them in a confirmatory fashion as a robust method of understanding experimental data.¹ The current commentary overviews how to specify informative hypotheses for experimental means, test them via Bayes factors, and account for multiple testing within this analytical framework.

Specifying Informative Hypotheses

The specification and evaluation of informative hypotheses follows from the essential quality of Bayesian methods that encourages incorporating prior information into the inferential process.² As such, informative hypotheses are specified to include restrictions on experimental means that follow from relevant theory or empirical studies. Informative hypotheses will commonly include order restrictions, which posit some partial or complete ordering of a set of experimental means. On occasion, informative hypotheses will also include restrictions that compare one or more experimental means against a substantively meaningful value, positing that those means are greater or less than that value. It is also common to evaluate multiple informative hypotheses because experimental data can be noisy. Evaluating the extent to which experimental data support a small number of competing informative hypotheses amounts to assessing multiple theoretical explanations for the observed data. The use of informative hypotheses in this way then leads to a confirmatory procedure for understanding experimental data that is remarkably robust.³

Recently published research provides the opportunity to consider some illustrative examples of informative hypotheses about the effects of e-cigarette warnings. Given that there is uncertainty about the behavioral effects of e-cigarettes,⁴ it is possible that e-cigarette warnings may have diverse effects on the use of tobacco products including cigarettes. In this context, an informative hypothesis to consider could be that pictorial and text e-cigarette warnings elicit similarly high intentions to quit smoking combustible cigarettes (H₁: 3 < μ_Text ≈ μ_Pictorial, where “3” is the midpoint on the five-point response scale for quit intentions). This informative hypothesis draws on an online experiment that found that pictorial and text warnings for e-cigarettes similarly decreased interest in smoking among smokers.⁵ Thus, salient risk information about e-cigarettes may generalize to combustible cigarettes, thereby increasing intentions to quit smoking. An alternative informative hypothesis to consider could be that pictorial warnings for e-cigarettes increase intentions to quit smoking cigarettes even more so than text-only warnings (H₂: μ_Text < μ_Pictorial). When considering effect sizes, the same experiment did not demonstrate that pictorial warnings for e-cigarettes had a clear substantive advantage over text warnings. However, a previously published large meta-analysis of the impact of pictorial warnings for cigarettes demonstrated that pictorial warnings were clearly more effective than text warnings.⁶ As such, this finding about health warnings for cigarettes may still generalize to health warnings for other tobacco products including e-cigarettes.

Bayes Factors for Informative Hypotheses

Each informative hypothesis corresponds to a separate Bayesian model that can be evaluated using Bayes factors. In an experimental context, the calculation of such Bayes factors follows four basic steps. The first step is to calculate the Bayes factor that quantifies evidence for a main effect of the experimental manipulation of interest against a null effect. These general Bayes factors were covered in greater detail in the previous commentary.⁷ The second step is to evaluate the complexity of the informative hypothesis.⁸ In this setting, complexity corresponds to the proportion of the parameter space that is in agreement with the informative hypothesis. As a simple example, let us posit that the treatment is more effective than the control in a two-arm trial. Here, the parameter space is defined by the two distinct ways in which the experimental means can be ordered. As the informative hypothesis posits one of these two orders, its complexity is 0.5. The complexity of an informative hypothesis is incorporated as its prior probability into the corresponding Bayes factor.
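The counting argument behind complexity can be sketched in a few lines of Python. This is only an illustration under the assumption that every strict ordering of the means is equally likely a priori; the function name `complexity` and the rank-based way of writing constraints are choices made for this sketch, not part of any package mentioned in the commentary.

```python
from itertools import permutations


def complexity(labels, constraint):
    """Proportion of strict orderings of the means that satisfy the constraint.

    `constraint` receives a dict mapping each label to its rank (0 = smallest mean).
    """
    orderings = list(permutations(labels))
    agreeing = 0
    for ordering in orderings:
        rank = {label: position for position, label in enumerate(ordering)}
        if constraint(rank):
            agreeing += 1
    return agreeing / len(orderings)


# Two-arm trial: the treatment mean is hypothesized to exceed the control mean.
two_arm = complexity(["control", "treatment"], lambda r: r["control"] < r["treatment"])
print(two_arm)  # 0.5
```

The same function applies to designs with more arms; a complete ordering of three means, for example, agrees with one of six possible orderings.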

The third step is to examine the posterior probability of the informative hypothesis, which is usually estimated as the joint proportion of the posterior distribution that satisfies the individual constraints in the informative hypothesis. Here, the posterior distribution is the one specified by the main effects model and corresponds to the unconstrained model that posits that all experimental means are independent.⁹ In the final step, we divide the posterior probability of the informative hypothesis by its prior probability (i.e., complexity) and multiply the resulting ratio by the Bayes factor for a main effect, thereby combining the individual pieces of information computed in the previous steps. The resulting Bayes factor quantifies the extent to which the data support the informative hypothesis over a null effect of the experimental manipulation. The computation of Bayes factors, including components such as complexity, is partially or completely automated via specialized software that implements simulation and other data-based estimation methods.
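The third and fourth steps can be sketched as follows. The posterior draws here are simulated from independent normal distributions with made-up means and standard deviations, and the main-effect Bayes factor of 20 is likewise hypothetical; in practice the draws would come from the fitted unconstrained model. The function names are illustrative and do not belong to any particular package.

```python
import random


def posterior_proportion(draws, constraint):
    """Share of posterior draws that satisfy every constraint in the hypothesis."""
    hits = sum(1 for draw in draws if constraint(draw))
    return hits / len(draws)


def informative_bayes_factor(posterior_prop, complexity, bf_main_effect):
    """Posterior proportion divided by complexity, scaled by the main-effect Bayes factor."""
    return posterior_prop / complexity * bf_main_effect


# Hypothetical posterior draws for a two-arm trial (assumed values, for illustration only).
rng = random.Random(2020)
draws = [
    {"control": rng.gauss(2.0, 0.10), "treatment": rng.gauss(2.4, 0.10)}
    for _ in range(10_000)
]

d = posterior_proportion(draws, lambda m: m["control"] < m["treatment"])
bf = informative_bayes_factor(d, complexity=0.5, bf_main_effect=20.0)
print(f"posterior proportion = {d:.3f}, Bayes factor vs. null = {bf:.1f}")
```

Because the hypothesis covers half of the prior parameter space, a posterior proportion close to 1 roughly doubles the main-effect Bayes factor in this sketch.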

Multiple testing is usually not a concern when analyzing experimental data using Bayesian model comparison.¹⁰ This is because Bayes factors for informative hypotheses account for the complexity of informative hypotheses upfront via prior specification. As a result, the difficulty of obtaining evidence in favor of an informative hypothesis increases as its complexity increases (i.e., as the hypothesis admits a larger share of the parameter space). When considering multiple informative hypotheses, it is common to examine the evidence for an informative hypothesis relative to the evidence for all others being considered using posterior model probabilities.¹¹ Posterior model probabilities can be thought of as standardized Bayes factors; for a given informative hypothesis, they are calculated as the ratio of the Bayes factor for that hypothesis to the sum of the Bayes factors for all informative hypotheses under consideration. Assuming that the hypotheses are equally probable a priori, the resulting quantity represents the probability that an informative hypothesis is true after observing the data, given that one of the hypotheses under consideration is true. It can also be read as the share of the total evidence, across the hypotheses considered, that favors that hypothesis. Therefore, posterior model probabilities can provide a more complete picture of how experimental data support competing hypotheses and tackle the problem of multiple testing by recognizing that empirical data are often noisy and support multiple theoretical explanations.
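Written as a formula, the standardization described above takes the following form. The notation is introduced here for illustration: BF_{i0} denotes the Bayes factor for informative hypothesis i against the null effect, K is the number of informative hypotheses under consideration, and the hypotheses are assumed to be equally probable a priori.

```latex
\mathrm{PMP}_i = \frac{\mathrm{BF}_{i0}}{\sum_{j=1}^{K} \mathrm{BF}_{j0}}, \qquad i = 1, \dots, K
```

By construction, the posterior model probabilities of the K hypotheses sum to 1.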

Empirical Example: Formatively Testing Health Messages for E-cigarettes

Informative hypotheses may be readily applied to understanding how health messages about harmful or potentially harmful chemicals in e-cigarette aerosol affect discouragement from wanting to use e-cigarettes. As an illustrative hypothetical example, we conduct a formative online study with 358 adults who vape every day or some days to develop a better understanding of such messages. This experiment randomizes participants to receive one of four basic messages about one of three chemicals that are known to be present in e-cigarette vapor: acrolein, lead, and formaldehyde. An example message included in this hypothetical study is, “E-cigarette vapor contains acrolein.” The experiment also includes two messages that replace “acrolein” with “lead” or “formaldehyde” and a fourth message that is a reworded version of the formaldehyde message. After viewing the message for 10 seconds, participants complete an item on discouragement, “How much does this message discourage you from wanting to vape?” ¹² Response options are “not at all” (coded as 1), “a little” (2), “somewhat” (3), and “a lot” (4).

Before analyzing the experimental data, we elicit three informative hypotheses based on previously published empirical studies in tobacco control (Table 1). The first hypothesis (H₁) specifies that the acrolein message will elicit lower discouragement than the lead message and both formaldehyde messages. The second hypothesis (H₂) specifies that the acrolein message will elicit lower discouragement than the lead message, which will elicit lower discouragement than both of the formaldehyde messages. The third hypothesis (H₃) specifies that the acrolein message will elicit lower discouragement than both formaldehyde messages, which will elicit lower discouragement than the lead message. These hypotheses follow from previous research on risk communication strategies for harmful chemicals in cigarette smoke showing that acrolein least discourages people from wanting to smoke, formaldehyde and lead are two of the more discouraging chemicals, and the latter chemicals may be similarly discouraging.¹³ Because we test only basic messages, we make no prediction about how the two formaldehyde messages compare with each other; in each hypothesis, their means are left unconstrained relative to one another. Testing different order-restricted hypotheses can reveal experimental conditions that may behave similarly, especially when the data assign equivocal evidence to the hypotheses.

Table 1.

Bayes Factors (BFs) and Posterior Model Probabilities (PMPs) for Informative Hypotheses About the Impact of Health Messages Focusing on Harmful Chemicals in E-cigarette Vapor on Discouragement From Wanting to Vape

Informative hypothesis	Complexity (c)	Posterior proportion (d)	BF	PMP
H₁: μ_Acrolein < (μ_Lead, μ_Formaldehyde 1, μ_Formaldehyde 2)	0.250	1.00	136 412	0.48
H₂: μ_Acrolein < μ_Lead < (μ_Formaldehyde 1, μ_Formaldehyde 2)	0.083	0.21	86 726	0.31
H₃: μ_Acrolein < (μ_Formaldehyde 1, μ_Formaldehyde 2) < μ_Lead	0.083	0.15	60 900	0.21

Complexity (c) was calculated by considering possible orders of experimental means; joint posterior proportions (d) for agreement with hypothesized constraints assumed that mean differences of 0.05 or less on discouragement were substantively irrelevant; PMPs did not consider the unconstrained model because it was not substantively informative.

In this hypothetical study, the acrolein message elicits mean discouragement of 1.85 (SD = 0.79), and the lead message elicits mean discouragement of 2.48 (SD = 0.90). The two formaldehyde messages elicit mean discouragement of 2.54 (SD = 1.00) and 2.45 (SD = 0.99). The Bayes factor quantifying evidence for a main effect of the health messages on discouragement is 34 106. This Bayes factor indicates that the data support a main effect of the health messages on discouragement over no effect at all by a factor of roughly 34 100. Turning to the three informative hypotheses, H₁, H₂, and H₃, their respective complexities or prior probabilities are approximately 0.25, 0.08, and 0.08 (calculated based on possible orderings of the experimental means). For H₁, H₂, and H₃, the joint proportions of the posterior distribution that agree with the constraints in each hypothesis are 1.00, 0.21, and 0.15. As such, the Bayes factors for H₁, H₂, and H₃ are 136 412, 86 726, and 60 900, respectively (Table 1). These Bayes factors are quite large and indicate that the data support each of these hypotheses over no effect by a factor of at least 60 000. As the common constraint in all three hypotheses is about the acrolein message, these Bayes factors offer strong support that the acrolein message discourages the least of all the messages being considered. Therefore, messages about acrolein may not be appropriate for risk communication interventions for e-cigarette use.
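The complexities and Bayes factors above can be approximately reproduced by enumeration. The sketch below counts the orderings of the four means that agree with each hypothesis and combines them with the rounded posterior proportions and the main-effect Bayes factor reported in the text. It ignores the 0.05 tolerance mentioned in the note to Table 1 and works from rounded inputs, so its output is expected to differ slightly from the published Bayes factors; the variable names are illustrative.

```python
from itertools import permutations

MEANS = ("acrolein", "lead", "formaldehyde_1", "formaldehyde_2")

# Order constraints from Table 1, written against ranks (0 = lowest mean).
HYPOTHESES = {
    "H1": lambda r: r["acrolein"] < min(r["lead"], r["formaldehyde_1"], r["formaldehyde_2"]),
    "H2": lambda r: r["acrolein"] < r["lead"] < min(r["formaldehyde_1"], r["formaldehyde_2"]),
    "H3": lambda r: r["acrolein"] < min(r["formaldehyde_1"], r["formaldehyde_2"])
    and max(r["formaldehyde_1"], r["formaldehyde_2"]) < r["lead"],
}

BF_MAIN_EFFECT = 34106  # main-effect Bayes factor reported in the text
POSTERIOR_PROPORTION = {"H1": 1.00, "H2": 0.21, "H3": 0.15}  # rounded d values from Table 1

# Every strict ordering of the four means, expressed as a label -> rank mapping.
rankings = [dict(zip(order, range(len(order)))) for order in permutations(MEANS)]

results = {}
for name, constraint in HYPOTHESES.items():
    c = sum(constraint(r) for r in rankings) / len(rankings)
    bf = POSTERIOR_PROPORTION[name] / c * BF_MAIN_EFFECT
    results[name] = (c, bf)
    print(f"{name}: complexity = {c:.3f}, approximate BF = {bf:,.0f}")
```

The enumeration gives complexities of 6/24 for H₁ and 2/24 for H₂ and H₃, which round to the 0.250 and 0.083 shown in Table 1.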

If we were presented with the Bayes factor for any one of the three informative hypotheses alone, we could be persuaded to accept that hypothesis as a plausible explanation for the data. After all, each of the three Bayes factors is very large and constitutes substantial evidence by the usual standards.⁷,¹⁴ However, posterior model probabilities provide a more complete picture of the data. For H₁, H₂, and H₃, these are 0.48, 0.31, and 0.21, respectively (Table 1). None of the posterior model probabilities is negligible, indicating that none of the hypotheses should be ruled out based on the current experiment. Given that the hypotheses differ only in the constraints placed on the formaldehyde and lead messages, the posterior model probabilities suggest that the experimental data are unable to sufficiently distinguish between such messages. Perhaps such messages do affect discouragement similarly, which would be consistent with studies of antismoking messages about formaldehyde and lead.¹³,¹⁵ However, a revised experiment could more completely assess the effects of formaldehyde and lead messages.
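The posterior model probabilities in Table 1 follow directly from the three Bayes factors. A minimal sketch, assuming equal prior probabilities for the three hypotheses and excluding the unconstrained model as the table note describes (the function name is illustrative):

```python
def posterior_model_probabilities(bayes_factors):
    """Standardize Bayes factors so that they sum to 1 across the hypotheses considered."""
    total = sum(bayes_factors.values())
    return {name: bf / total for name, bf in bayes_factors.items()}


# Bayes factors against the null effect, as reported in Table 1.
table_1_bfs = {"H1": 136412, "H2": 86726, "H3": 60900}

pmps = posterior_model_probabilities(table_1_bfs)
for name, pmp in pmps.items():
    print(f"{name}: PMP = {pmp:.2f}")
# H1: PMP = 0.48
# H2: PMP = 0.31
# H3: PMP = 0.21
```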

Conclusions

As noted in the first commentary in this series,⁷ the hallmark of Bayesian model comparison (and other Bayesian approaches) is the consideration of uncertainty throughout the process of statistical inference. In comparison to simpler Bayesian model comparison, the evaluation of informative hypotheses via Bayes factors considers inferential uncertainty in two additional ways. First, the Bayes factor for an informative hypothesis incorporates the complexity of that hypothesis as its prior probability. Second, posterior model probabilities (i.e., standardized Bayes factors) quantify the evidence for an informative hypothesis relative to all others being considered. Therefore, Bayesian model comparison necessarily leaves open the possibilities that experimental data may support one theoretical explanation while not ruling out others under consideration, may not clearly favor any of the explanations under consideration, or may simply be noisy. In doing so, this approach further shifts science away from the binary inferences that have long dominated the enterprise toward more realistic understandings of data.

Although Bayes factors for informative hypotheses have been most commonly applied in the context of linear regression and analysis of variance, they have been extended to informative hypotheses formulated in the context of structural equation modeling, item response theory, simple correlations, population variances, and other methods.¹⁶–¹⁹ Given that informative hypotheses are elicited based on relevant theory or empirical studies, they may facilitate some evaluation of the consistency between similar studies. However, this is best accomplished via specialized Bayes factors for evaluating replication attempts, which are the focus of the next commentary. Supplementary Appendix 1 provides annotated syntax for computing Bayes factors in the empirical example using R and the utility package BayesFactor. The syntax should be adaptable to common analyses for between-subject experiments.

Supplementary Material

A Contributorship Form detailing each author’s specific involvement with this content, as well as any supplementary data, are available online at https://academic.oup.com/ntr.

Funding

This work was partially supported by the Office of the Director, National Institutes of Health (award number DP5OD023064) when the author was a postdoctoral research scientist at Columbia University Mailman School of Public Health.

Declaration of Interests

None declared.

References

1. Mulder J, Wagenmakers E-J. Editors’ introduction to the special issue “Bayes factors for testing hypotheses in psychological research: Practical relevance and new developments.” J Math Psychol. 2016;72:1–5.
2. Hoijtink H, Kooten P van, Hulsker K. Why Bayesian psychologists should change the way they use the Bayes factor. Multivariate Behav Res. 2016;51(1):2–10.
3. Kuiper RM, Nederhoff T, Klugkist I. Properties of hypothesis testing techniques and (Bayesian) model selection for exploration-based and theory-based (order-restricted) hypotheses. Br J Math Stat Psychol. 2015;68(2):220–245.
4. National Academies of Sciences, Engineering, and Medicine. Public Health Consequences of E-Cigarettes. Washington, DC: The National Academies Press; 2018. doi: 10.17226/24952
5. Brewer NT, Jeong M, Hall MG, et al. Impact of e-cigarette health warnings on motivation to vape and smoke. Tob Control. 2019;28(e1):e64–e70.
6. Noar SM, Hall MG, Francis DB, Ribisl KM, Pepper JK, Brewer NT. Pictorial cigarette pack warnings: a meta-analysis of experimental studies. Tob Control. 2016;25(3):341–354.
7. Baig SA. Bayesian inference: an introduction to hypothesis testing using Bayes factors. Nicotine Tob Res. 2020;22(7):1244–1246. doi: 10.1093/ntr/ntz207
8. Klugkist I, Laudy O, Hoijtink H. Inequality constrained analysis of variance: a Bayesian approach. Psychol Methods. 2005;10(4):477–493. doi: 10.1037/1082-989x.10.4.477
9. Rouder JN, Morey RD, Verhagen J, Swagman AR, Wagenmakers EJ. Bayesian analysis of factorial designs. Psychol Methods. 2017;22(2):304–321.
10. Dienes Z. Bayesian versus orthodox statistics: which side are you on? Perspect Psychol Sci. 2011;6(3):274–290.
11. Etz A, Haaf JM, Rouder JN, Vandekerckhove J. Bayesian inference and testing any hypothesis you can specify. Adv Methods Pract Psychol Sci. 2018;1(2):281–295.
12. Baig SA, Noar SM, Gottfredson NC, Boynton MH, Ribisl KM, Brewer NT. UNC Perceived message effectiveness: validation of a brief scale. Ann Behav Med. 2018;53(8):732–742.
13. Brewer NT, Morgan JC, Baig SA, et al. Public understanding of cigarette smoke constituents: three US surveys. Tob Control. 2016;26(5):592–599.
14. Jeon M, De Boeck P. Decision qualities of Bayes factor and p value-based hypothesis testing. Psychol Methods. 2017;22(2):340–360.
15. Kelley DE, Boynton MH, Noar SM, et al. Effective message elements for disclosures about chemicals in cigarette smoke. Nicotine Tob Res. 2017;20(9):1047–1054. doi: 10.1093/ntr/ntx109
16. Gu X, Mulder J, Deković M, Hoijtink H. Bayesian evaluation of inequality constrained hypotheses. Psychol Methods. 2014;19(4):511–527.
17. Tijmstra J, Hoijtink H, Sijtsma K. Evaluating manifest monotonicity using Bayes factors. Psychometrika. 2015;80(4):880–896.
18. Mulder J. Bayes factors for testing order-constrained hypotheses on correlations. J Math Psychol. 2016;72:104–115.
19. Böing-Messing F, Mulder J. Automatic Bayes factors for testing equality- and inequality-constrained hypotheses on variances. Psychometrika. 2018;83(3):586–617.
