Abstract
Background and Aims
It has been proposed that more use should be made of Bayes factors in hypothesis testing in addiction research. Bayes factors are the ratios of the likelihood of a specified hypothesis (e.g. an intervention effect within a given range) to another hypothesis (e.g. no effect). They are particularly important for differentiating lack of strong evidence for an effect and evidence for lack of an effect. This paper reviewed randomized trials reported in Addiction between January and June 2013 to assess how far Bayes factors might improve the interpretation of the data.
Methods
Seventy‐five effect sizes and their standard errors were extracted from 12 trials. Seventy‐three per cent (n = 55) of these were non‐significant (i.e. P > 0.05). For each non‐significant finding a Bayes factor was calculated using a population effect derived from previous research. In sensitivity analyses, a further two Bayes factors were calculated assuming clinically meaningful and plausible ranges around this population effect.
Results
Twenty per cent (n = 11) of the non‐significant Bayes factors were < ⅓ and 3.6% (n = 2) were > 3. The other 76.4% (n = 42) of Bayes factors were between ⅓ and 3. Of these, 26 were in the direction of there being an effect (Bayes factor > 1 and < 3); 12 tended to favour the hypothesis of no effect (Bayes factor < 1 and > ⅓); and for four there was no evidence either way (Bayes factor = 1). In sensitivity analyses, 13.3% of Bayes Factors were < ⅓ (n = 20), 62.7% (n = 94) were between ⅓ and 3 and 24.0% (n = 36) were > 3, showing good concordance with the main results.
Conclusions
Use of Bayes factors when analysing data from randomized trials of interventions in addiction research can provide important information that would lead to more precise conclusions than are obtained typically using currently prevailing methods.
Keywords: Addiction, Bayes factors, Bayesian, hypothesis testing, non‐significant, RCT
Introduction
Bayesian statistical analyses are being used increasingly in addictions research, and it has been proposed that this trend should accelerate 1. One important component of Bayesian analysis is the calculation of Bayes factors, which overcome many of the problems of traditional frequentist statistics 2. One of these is the misinterpretation that P‐values can be used to make claims of ‘no effect’ 3, 4, 5. P‐values signal the extremeness of the data under the assumption of the null hypothesis and so only tell us the probability of a test statistic at least as extreme as the one observed 6. Thus, a P > 0.05 may reflect evidence for ‘no effect’ or data insensitivity, i.e. a failure to distinguish the null hypothesis from the alternative because, for example, the standard error (SE) is high.
Bayes factors are the ratio of the (average) likelihood of the data under two competing hypotheses. When evaluating interventions, the two hypotheses are typically H1: that the intervention had a desired effect within a given range of plausible sizes, versus H0: that it had no effect. Thus, a Bayes factor is equivalent to a likelihood ratio 7 (averaged over different plausible effect sizes) and is often denoted as:
$$B = \frac{P(D \mid H_1)}{P(D \mid H_0)}$$
which simply represents the probability of the data (D) given the alternative hypothesis divided by the probability of the data given the null hypothesis.
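The averaging over plausible effect sizes can be made concrete with a short numerical sketch. It assumes a normal likelihood for the observed effect and a half-normal prior on the true effect under H1; the function names, the prior form and the example numbers are illustrative assumptions, not the calculators cited in this paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor(obs, se, prior_sd, steps=20000):
    """Bayes factor for H1 (half-normal prior on the effect) versus H0 (effect = 0).

    obs      -- observed effect, signed so that positive is the predicted direction
    se       -- standard error of the observed effect
    prior_sd -- SD of the half-normal prior (assumed scale of plausible effects)
    """
    # Integrate far enough to cover both the prior and the likelihood.
    upper = max(6.0 * prior_sd, obs + 6.0 * se)
    width = upper / steps
    averaged = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width                       # midpoint rule
        prior = 2.0 * normal_pdf(theta, 0.0, prior_sd)  # half-normal density
        averaged += prior * normal_pdf(obs, theta, se) * width
    # P(D | H1) averaged over the prior, divided by P(D | H0).
    return averaged / normal_pdf(obs, 0.0, se)


# Hypothetical example: observed effect 0.5 (SE 0.2), prior SD 0.4.
print(round(bayes_factor(obs=0.5, se=0.2, prior_sd=0.4), 2))
```

The numerator is the likelihood of the data averaged over every effect size the prior treats as plausible; the denominator is the likelihood at exactly zero effect.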
The use of Bayes factors has become more feasible in recent years following the development of online calculators 8 and R code 9, 10. Conventional cut‐offs for the interpretation of Bayes factors depend typically upon those set by Jeffreys 2 in the 1930s, with a Bayes factor greater than 3, or else less than ⅓, representing sufficient evidence to be taken note of for the experimental and null hypotheses, respectively; while values between approximately ⅓ and 3 indicate that the data are insensitive (see Table 1).
Table 1.
Jeffreys’ Bayes factor cut‐offs.
Bayes factor | Interpretation |
---|---|
> 100 | Extreme evidence for the experimental hypothesis |
30–100 | Very strong evidence for the experimental hypothesis |
10–30 | Strong evidence for the experimental hypothesis |
3–10 | Moderate evidence for the experimental hypothesis |
1–3 | Anecdotal evidence for the experimental hypothesis |
1 | No evidence |
⅓–1 | Anecdotal evidence for the null hypothesis |
⅓–1/10 | Moderate evidence for the null hypothesis |
1/10–1/30 | Strong evidence for the null hypothesis |
1/30–1/100 | Very strong evidence for the null hypothesis |
< 1/100 | Extreme evidence for the null hypothesis |
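
The cut-offs in Table 1 can be expressed as a small lookup. This is a minimal sketch: the function names are hypothetical, and treating each lower bound as inclusive (so that a Bayes factor of exactly 3 counts as moderate) is an assumption, because the table does not say how boundary values are handled.

```python
def jeffreys_label(bf):
    """Return the Table 1 interpretation for a Bayes factor (bf must be > 0)."""
    if bf <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf == 1:
        return "No evidence"
    favoured = "experimental" if bf > 1 else "null"
    # Values below 1 mirror those above 1, so classify on the reciprocal.
    strength = bf if bf > 1 else 1.0 / bf
    if strength > 100:
        grade = "Extreme"
    elif strength >= 30:
        grade = "Very strong"
    elif strength >= 10:
        grade = "Strong"
    elif strength >= 3:
        grade = "Moderate"
    else:
        grade = "Anecdotal"
    return f"{grade} evidence for the {favoured} hypothesis"


def is_sensitive(bf):
    """True when the data favour one hypothesis by the conventional factor of 3."""
    return bf >= 3 or bf <= 1.0 / 3.0


for value in (17.5, 2.2, 1.0, 0.2):
    print(value, jeffreys_label(value), is_sensitive(value))
```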
This paper uses a set of randomized trials in the field of addiction to examine whether, and in what way, the conclusions may have been different had the authors calculated Bayes factors in their analyses. This should be useful in future research to assess whether and when to use this form of analysis.
Calculating Bayes factors
Several software packages are available including an online calculator developed by Zoltan Dienes (http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/Bayes.htm) and a modified version by John Christie using R code, which allows one to adjust the quality of the estimation 9, 10.
Both approaches require the specification of an expected effect size (i.e. a plausible range of predicted values based on previous studies, judgement or clinical significance), the published effect size (e.g. mean difference or log odds ratio) and the standard error of this parameter. They also both assume that the sampling distribution of the parameter estimate is distributed normally (hence the need to use the natural logs of odds ratios). The natural log of the odds ratio is approximately normally distributed with known standard error given by $\sqrt{1/A + 1/B + 1/C + 1/D}$, where A is the number of individuals in the experimental condition with the outcome of interest, B is the number of individuals in the experimental condition without the outcome of interest and C and D reflect the number of individuals with and without the outcome of interest in the control condition respectively (i.e. odds ratio = (A/B)/(C/D)), provided that these numbers are not very small. For adjusted odds ratios, and/or where standard errors (SE) are not reported, 95% confidence intervals (CI) can be used to derive the standard error {i.e. [LN(upper confidence limit)–LN(lower confidence limit)]/3.92}.
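The two routes to a standard error for a log odds ratio can be sketched as follows. The first function uses the standard formula, the square root of 1/A + 1/B + 1/C + 1/D; the second recovers the SE from a reported 95% CI. The function names and the cell counts in the example are hypothetical.

```python
import math


def log_odds_ratio_and_se(a, b, c, d):
    """Log odds ratio and its SE from a 2 x 2 table.

    a, b -- experimental condition: with / without the outcome
    c, d -- control condition: with / without the outcome
    """
    log_or = math.log((a / b) / (c / d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def se_from_or_ci(lower, upper):
    """SE of a log odds ratio from the limits of its 95% CI (3.92 = 2 x 1.96)."""
    return (math.log(upper) - math.log(lower)) / 3.92


# Hypothetical counts: 20/100 with the outcome in the experimental arm, 10/100 in control.
log_or, se = log_odds_ratio_and_se(20, 80, 10, 90)
print(round(math.exp(log_or), 2), round(se, 3))

# Rebuilding the 95% CI from these values and converting back returns the same SE.
lower, upper = math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)
print(round(se_from_or_ci(lower, upper), 3))
```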
In instances where the primary outcome measure is a continuous variable, SEs can be derived for mean differences or regression coefficients (β) using either (1) the standard formula for the SE of a mean difference, i.e. $\sqrt{SD_{\text{control}}^2/n_{\text{control}} + SD_{\text{experimental}}^2/n_{\text{experimental}}}$; (2) the t‐test value, i.e. [mean difference (or β)/t‐test value]; or (3) the 95% CI, i.e. [(upper confidence limit − lower confidence limit)/3.92], with no log transformation because these estimates are already on an additive scale.
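For continuous outcomes, the three routes to a standard error can be sketched as below. The function names and example numbers are hypothetical, and the CI route assumes the limits are reported on the raw (unlogged) scale of the mean difference or regression coefficient.

```python
import math


def se_from_sds(sd_control, n_control, sd_experimental, n_experimental):
    """SE of a difference between two independent group means."""
    return math.sqrt(sd_control ** 2 / n_control
                     + sd_experimental ** 2 / n_experimental)


def se_from_t(estimate, t_value):
    """SE recovered from an estimate (mean difference or beta) and its t statistic."""
    return abs(estimate / t_value)


def se_from_ci(lower, upper):
    """SE recovered from the limits of a 95% CI on the raw scale (3.92 = 2 x 1.96)."""
    return (upper - lower) / 3.92


# Hypothetical inputs for each route.
print(round(se_from_sds(10.0, 50, 12.0, 50), 3))
print(se_from_t(5.0, 2.5))
print(round(se_from_ci(1.0, 8.84), 3))
```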
A worked example, using the calculator associated with Dienes, can be found in the Supporting information.
Others have advocated alternative methods of computing Bayes factors, including the Jeffreys–Zellner–Siow (JZS) t‐test 4, 12, which can be implemented in R 13, 14 (see Dienes & McLatchie, submitted, for comparison). Moves have also been made towards full Bayesian modelling, which requires much more advanced knowledge of R or of specialist software packages (e.g. WinBUGS) and is beyond the scope of the current paper 3, 11.
Methods
Bayes factors were calculated for 12 randomized controlled trials published in the first six issues of Addiction in 2013 (between January and June). Effect sizes, SEs, P‐values and the main conclusions drawn by the authors were extracted from the papers for both primary and main secondary outcomes. Studies are generally only powered to detect estimated differences between experimental and control groups for the primary outcome, and thus Bayes factors may be particularly useful for secondary analyses 15, 16. Concerns have been raised previously regarding the interpretation of non‐significant findings for sensitivity analyses 15, 16.
Adjusted effect sizes (where available) and those reported at the longest point of follow‐up were used. Bayes factors were calculated using the online calculator provided by Dienes 8 and the modified version using R code by Christie 9, 10. Predicted values for the effect size or population standard deviation (SD) were based on previous studies (see Table 2). Additional sensitivity analyses were run to assess the effect of using higher and lower values. The chosen range was based either on the reported CI of the predicted effect size selected from previous publications or, when not available, the opinion of the lead author as to what would be a plausible effect.
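The main and sensitivity calculations for a single outcome can be sketched as follows, using the onQ 6‐month sustained abstinence row of Table 2 (OR 1.44, SE 0.24, expected OR 1.50, sensitivity values 1.20 and 1.80). The sketch rests on assumptions that are not stated in the text: that the reported SE is on the log odds scale, and that H1 is a half‐normal distribution whose SD is the log of the expected odds ratio. The function name is hypothetical.

```python
import math


def half_normal_bf(obs, se, prior_sd):
    """Closed-form Bayes factor: half-normal prior on the effect versus a point null."""
    total_sd = math.sqrt(prior_sd ** 2 + se ** 2)
    # Density of the data under a full normal prior centred on zero.
    marginal = math.exp(-0.5 * (obs / total_sd) ** 2) / (total_sd * math.sqrt(2 * math.pi))
    # Share of the posterior lying in the predicted (positive) direction.
    post_mean = obs * prior_sd ** 2 / total_sd ** 2
    post_sd = prior_sd * se / total_sd
    in_direction = 0.5 * (1.0 + math.erf(post_mean / (post_sd * math.sqrt(2))))
    # Density of the data if the true effect is exactly zero.
    null_density = math.exp(-0.5 * (obs / se) ** 2) / (se * math.sqrt(2 * math.pi))
    return 2.0 * marginal * in_direction / null_density


observed_log_or = math.log(1.44)   # published effect size, on the log scale
standard_error = 0.24              # assumed to be the SE of the log odds ratio

results = {}
for expected_or in (1.50, 1.20, 1.80):  # main value, then lower and higher values
    results[expected_or] = half_normal_bf(observed_log_or, standard_error,
                                          math.log(expected_or))
    print(expected_or, round(results[expected_or], 1))
```

Under these assumptions the three values should land close to the 2.2, 2.0 and 1.9 reported for that row, all within the insensitive range of ⅓ to 3.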
Table 2.
Results, conclusions and corresponding Bayes factors for randomized controlled trials (RCTs) published in Addiction in the first six issues of 2013.
Study | Intervention | Control | Participants | Outcome | Sample mean | Sample standard error | Significance p | Study conclusions for non‐significant findings | Expected effect size | Bayes factor: Dienes (Christie) 8, 9, 10 | Interpretation of Bayes factor using Dienes 8 | Interpretation of Bayes Factors using Jeffreys 2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Kypri 19 | Web based alcohol screening and brief intervention for reducing hazardous drinking among Maori university students | Screening only | 6697 students aged 17–24 | P: Frequency of alcohol consumption | RaR 0.89 | 0.04 | 0.01** | ‘Web‐based screening and brief intervention reduced hazardous and harmful drinking among non‐help‐seeking Maori students’ | RaR 0.91c | 17.5 (17.5) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis |
RaR 0.85Ω | 16.0 (16.0) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
RaR 0.97Ω | 5.3 (5.3) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
P: Quantity of alcohol | RaR 0.92 | 0.04 | 0.04* | No mention of results > 0.05 | RaR 0.96a | 3.0 (3.0) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | ||||
RaR 0.91Ω | 3.4 (3.4) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
RaR 0.99Ω | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
P: Volume of alcohol | RaR 0.78 | 0.06 | < 0.001*** | RaR 0.89a | 261.6 (261.3) | Evidence for experimental hypothesis (i.e. an effect) | Extreme evidence for experimental hypothesis | |||||
RaR 0.82Ω | 475.0 (466.2) | Evidence for experimental hypothesis (i.e. an effect) | Extreme evidence for experimental hypothesis | |||||||||
RaR 0.96Ω | 13.2 (13.2) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
P: Academic Role Expectation and Alcohol Scale (AREAS) | RaR 0.81 | 0.08 | 0.01* | RaR 0.95a | 3.9 (3.9) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||
RaR 0.82Ω | 13.1 (13.1) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
RaR 0.99Ω | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Binge drinking | OR 0.80 | 0.12 | 0.06 | OR 0.89a | 3.2 (3.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||
OR 0.65Ω | 2.8 (2.8) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 0.99Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Heavy drinking | OR 0.65 | 0.15 | < 0.001*** | OR 0.55a | 19.0 (19.0) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||
OR 0.38Ω | 13.9 (13.9) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
OR 0.80Ω | 15.5 (15.5) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
Li 20 | Methadone maintenance therapy (MMT) care intervention (with motivational interviewing) | Standard care | 41 providers and 179 clients from six clinics | P: Provider client interaction | MD 4.82 | 2.23 | 0.033* | ‘The MMT CARE intervention targeting providers in methadone maintenance clinics can improve providers’ treatment knowledge and their interaction with clients. The intervention can also reduce clients’ drug‐using behaviour through motivational interviewing sessions conducted by trained providers. . It is difficult to explain the unexpected findings in provider MMT knowledge and client drug avoidance self‐efficacy [long term]; this may be a result of the small sample size and the pilot nature of the study’ | MD 4.65b | 5.6 (5.6) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis |
MD 2.18Ω | 4.2 (4.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
MD 7.01Ω | 4.9 (4.9) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
P: MMT knowledge | MD 1.00 | 0.56 | 0.544 | MD 4.65b | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
MD 2.18Ω | 2.1 (2.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 7.01Ω | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
P: Perceived stigma | MD −1.87 | 2.31 | 0.421 | MD −5.1c | 0.8 (0.8) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
MD −1.2Ω | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD −9.0Ω | 0.5 (0.5) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
P: Perceived client support | MD 1.82 | 0.65 | 0.006** | No mention of results >0.05 | MD 4.65b | 12.9 (12.9) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | ||||
MD 2.18Ω | 20.8 (20.8) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
MD 7.01Ω | 8.9 (8.9) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
P: Drug avoidance self‐efficacy | MD 1.25 | 1.24 | 0.312 | MD 0.9d | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
MD 0.3Ω | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 1.5Ω | 1.4 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
P: Concurrent drug use | OR 0.36 | 0.59 | 0.084 | OR 0.66e | 2.3 (2.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 0.56Ω | 2.7 (2.7) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 0.78Ω | 1.7 (1.7) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Ward 21 | Behavioural support and nicotine replacement therapy (NRT) | Behavioural support | 269 adults in four primary care clinics | P: 12 month prolonged abstinence | OR 0.51 | 0.50 | 0.182 | ‘Nicotine patches may not be effective in helping smokers in low‐income countries to stop when given as an adjunct to behavioural support. . Our results do not support the incremental value of providing NRT in addition to behavioural counselling’ | OR 1.51f | 1.8 (1.8) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis |
OR 1.35Ω | 1.6 (1.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.70Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 0.69 | 0.32 | > 0.05 | ‘Between‐group differences [for 12 month prolonged abstinence] were not statistically significant at follow‐up. . . No significant between‐group differences were found for seven‐day point prevalence abstinence’ | OR 1.78f | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | ||||
OR 1.49Ω | 1.5 (1.5) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 2.12Ω | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Borland 22 | OnQ: An interactive text messaging program | Minimal intervention | 3530 smokers interested in quitting | P: 6‐months sustained abstinence | OR 1.44 | 0.24 | > 0.05 | ‘Smokers interested in quitting who were assigned randomly to an offer of either the internet‐based support program and/or the intervention automated text‐messaging program had a non‐significantly greater odds of quitting for at least 6 months than those randomized to an offer of a single website. . we failed to find clear significant effects between the intervention and the control’ | OR 1.50g | 2.2 (2.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis |
OR 1.20α | 2.0 (2.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.80α | 1.9 (1.9) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 1.20 | 0.15 | > 0.05 | OR 1.50g | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.20α | 1.6 (1.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.80α | 0.9 (0.9) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: Quit attempt | OR 1.11 | 0.12 | > 0.05 | OR 1.50g | 0.6 (0.6) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
OR 1.20α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.80α | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
QuitCoach: Personalized tailored internet‐delivered advice program | Minimal intervention | P: 6‐months sustained abstinence | OR 1.40 | 0.24 | > 0.05 | ‘There were no differences in the proportion who reported making a quit attempt by the 1‐month follow‐up. . . At the 7‐month follow up, 8.5% of the sample achieved 6‐month sustained abstinence. No significant differences were found by condition, but the control condition was numerically least successful’. | OR 1.50g | 1.9 (1.9) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | ||
OR 1.20α | 1.8 (1.8) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.80α | 1.6 (1.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 1.03 | 0.15 | > 0.05 | OR 1.50g | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
OR 1.20α | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
OR 1.80α | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
S: Quit attempt | OR 0.91 | 0.12 | > 0.05 | OR 1.50g | 0.6 (0.6) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
OR 1.20α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
OR 1.80α | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
Integration of onQ and QuitCoach | Minimal intervention | P: 6‐months sustained abstinence | OR 1.06 | 0.15 | > 0.05 | OR 1.92g | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||
OR 1.40α | 0.6 (0.6) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
OR 2.40α | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 1.45 | 0.24 | > 0.05 | OR 1.92g | 1.8 (1.8) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.40α | 2.3 (2.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 2.40α | 1.5 (1.5) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Quit attempt | OR 1.03 | 0.12 | > 0.05 | OR 1.92g | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||
OR 1.40α | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
OR 2.40α | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
Choice of either alone or combined program | Minimal intervention | P: 6‐months sustained abstinence | OR 1.47 | 0.24 | > 0.05 | OR 1.92g | 2.0 (2.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||
OR 1.40α | 2.5 (2.5) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 2.40α | 1.6 (1.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 1.07 | 0.15 | > 0.05 | OR 1.92g | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||
OR 1.40α | 0.6 (0.6) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
OR 2.40α | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
S: Quit attempt | OR 1.15 | 0.12 | > 0.05 | OR 1.92g | 0.6 (0.6) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
OR 1.40α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
OR 2.40α | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
Rendall‐Mkosi 23 | Motivational Interviewing | Minimal intervention | 165 women aged 18–44 years at risk of alcohol exposed pregnancy | P: Alcohol exposed pregnancy | OR 0.46 | 0.35 | 0.024* | ‘A five session motivational interviewing intervention was found to be effective with women at risk of an alcohol‐exposed pregnancy. . . it is noteworthy that the reduction in risk for AEP in this study was mainly due to the improved contraceptive rather than a reduction in risky alcohol use’ | OR 1.90h | 6.5 (6.5) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis |
OR 1.36Ω | 4.2 (4.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
OR 2.66Ω | 6.2 (6.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
S: Risky drinking | OR 0.75 | 0.53 | 0.580 | OR 0.84i | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 0.70α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 0.90α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Ineffective contraception | OR 0.51 | 0.37 | 0.067 | ‘At the 12‐month follow‐up, the reduction [in risky drinking] in the MI group (14.75%) was modestly larger when compared to the control group (10.94%), but this difference was also not statistically significant. . the reduction in the proportion of participants who were using ineffective contraception at 12 months was no longer statistically significant’ | OR 0.63i | 3.0 (3.0) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | ||||
OR 0.54α | 3.2 (3.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
OR 0.74α | 2.6 (2.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Coffin 24 | Aripiprazole | Placebo | 90 methamphetamine dependent, sexually active adults from the community | P: Methamphetamine use | RR 0.88 | 0.15 | 0.410 | ‘Compared with placebo, aripiprazole did not reduce methamphetamine use significantly among actively dependent adults. . notwithstanding the promising pre‐clinical results suggesting that aripiprazole might be effective at decreasing craving for methamphetamine and reducing its rewarding properties, we found no effect of this medication on methamphetamine use, severity of craving. We also did not evidence that aripiprazole was associated with increased methamphetamine use or rewards, as suggested by some investigators.’ | RR 1.12j | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis |
RR 1.02α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
RR 1.22α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Adherence – medication event monitoring systems | RR 1.33 | 0.43 | 0.310 | RR 0.99k | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||
RR 0.80α | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
RR 1.00 | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: Adherence, self‐reported | RR 0.59 | 0.49 | 0.170 | RR 1.03k | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
RR 1.01α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
RR 1.10α | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Number of partners with whom methamphetamines were used | RR 0.38 | 0.86 | 0.254 | RR 0.45k | 1.5 (1.5) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
RR 0.24Ω | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
RR 0.82Ω | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Number of sexual partners | RR 0.69 | 0.46 | 0.418 | ‘In the intention‐to‐treat GEE analysis, the risk of testing positive for methamphetamine was similar in the aripiprazole arm compared to the placebo arm. . difference between arms over follow‐up was not significant [in severity of dependence]. . After controlling for imbalanced baseline characteristics, sexual risk behaviors declined similarly in the aripiprazole and placebo arms.’ | RR 0.20k | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | ||||
RR 0.04Ω | 0.1 (0.1) | Evidence for null hypothesis (i.e. no effect) | Strong evidence for null hypothesis | |||||||||
RR 0.93Ω | 0.9 (0.9) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: Episodes of anal and/or vaginal sex with sero‐discordant partners | RR 0.42 | 0.65 | 0.190 | RR 0.31k | 1.7 (1.7) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
RR 0.14Ω | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
RR 0.66Ω | 1.7 (1.7) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Episodes of unprotected anal and/or vaginal sex with sero‐discordant partners | RR 0.61 | 0.98 | 0.612 | RR 0.34k | 0.9 (0.9) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
RR 0.17Ω | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
RR 0.70Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Episodes of insertive unprotected anal sex with sero‐discordant partners | RR 0.54 | 0.72 | 0.385 | RR 0.29k | 1.0 (1.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
RR 0.14Ω | 0.8 (0.8) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
RR 0.58Ω | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Episodes of receptive unprotected anal and/or vaginal sex with sero‐discordant partners | RR 0.02 | 1.32 | 0.007** | RR 0.27k | 12.0 (12.0) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||
RR 0.05Ω | 30.9 (30.9) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
RR 0.49Ω | 4.4 (4.4) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
S: Methamphetamine craving | MD 6.8 | 7.65 | 0.380 | MD 35k | 0.5 (0.5) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
MD 8Ω | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 62Ω | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
S: Severity of dependence | MD −0.04 | 0.85 | 0.960 | MD 2.00l | 0.4 (0.4) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
MD 1.00α | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
MD 3.00α | 0.3 (0.3) | Evidence for null hypothesis (i.e. no effect) | Moderate evidence for null hypothesis | |||||||||
S: Depression | MD 1.47 | 2.19 | 0.500 | MD 2.00l | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
MD 1.00α | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 3.00α | 1.0 (1.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Gilbert 25 | Tailored cessation on advice reports, including levels of reading ability | Generic self‐help booklet | 58 66 current cigarette smokers aged 18–65 years, identified from general practitioner records | P: Prolonged abstinence for 3 months | OR 1.18 | 0.13 | 0.184 | ‘ESCAPE. . appears to increase the rate at which smokers try to stop, but if there is an effect on prolonged abstinence it is small… Quit rates for the primary outcome of three months of prolonged abstinence were not significantly different between study groups. Thus, the intervention showed no effect. Quit rates in a number of different outcome measures of abstinence also showed no significant effect. However, all outcome measures showed a non‐significant trend towards more abstinence in the intervention group’ | OR 1.42m | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis |
OR 1.21Ω | 1.7 (1.7) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.68Ω | 0.9 (0.9) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: Prolonged abstinence for 1 month | OR 1.17 | 0.11 | 0.130 | OR 1.42m | 1.5 (1.5) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.21Ω | 2.0 (2.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.68Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: 7‐day point prevalence abstinence | OR 1.11 | 0.10 | 0.307 | OR 1.42m | 0.8 (0.8) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||
OR 1.21Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.68Ω | 0.5 (0.5) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: 24‐hour point prevalence abstinence | OR 1.15 | 0.09 | 0.131 | OR 1.42m | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.21Ω | 2.1 (2.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.68Ω | 1.0 (1.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Quit attempt | OR 1.11 | 0.06 | 0.074 | ‘The difference [in 3 month prolonged abstinence] was not significant… No significant differences were found between the intervention and control groups on shorter periods or on point‐prevalence measures of abstinence’. | OR 1.42m | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | ||||
OR 1.21Ω | 2.3 (2.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.68Ω | 1.0 (1.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Alessi 26 | Compensation for video recording alcohol breath tests using a cell phone and contingency management with escalating vouchers for on‐time alcohol‐negative tests. | Compensation for video recording alcohol breath tests using a cell phone | 30 adults who drank frequently but were not physiologically dependent | P: Negative breath sample | MD 20.20 | 5.74 | < 0.001*** | ‘Cellphone technology may be useful for extending contingency management to treatment for alcohol problems’ | MD 8.00n | 69.8 (69.9) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis |
MD 5.00α | 21.7 (21.7) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
MD 12.00α | 134.1 (134.2) | Evidence for experimental hypothesis (i.e. an effect) | Extreme evidence for experimental hypothesis | |||||||||
S: Longest duration of negative samples | MD 10.90 | 3.52 | < 0.001*** | No mention of results > 0.05 | MD 2.00n | 5.3 (5.3) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | ||||
MD 1.00α | 2.2 (2.2) | Evidence is insensitive | Moderate evidence for experimental hypothesis | |||||||||
MD 3.00α | 11.2 (11.2) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
S: Days of drinking | MD −11.00 | 3.48 | < 0.001*** | MD 3.71o | 19.5 (19.5) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||
MD 1.00α | 2.3 (2.3) | Evidence is insensitive | Moderate evidence for experimental hypothesis | |||||||||
MD 7.00α | 49.4 (49.4) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
S: Drinks per drinking day | MD −0.80 | 0.83 | 0.350 | MD 1.20o | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
MD 0.5α | 1.3 (1.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 1.90α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
S: Addiction Severity Index | MD −0.09 | 0.03 | 0.010** | MD 0.10n | 41.3 (41.3) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||
MD 0.01α | 2.6 (2.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 0.20α | 28.0 (28.0) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
S: Drinker Inventory of Consequences | MD −0.80 | 0.23 | < 0.001*** | MD 1.00p | 120.0 (120.0) | Evidence for experimental hypothesis (i.e. an effect) | Extreme evidence for experimental hypothesis | |||||
MD 0.2Ω | 18.1 (18.1) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
MD 1.8Ω | 83.4 (83.4) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
Richmond 27 | Nortriptyline added to multi‐component smoking cessation intervention (included nicotine replacement therapy and cognitive behavioural therapy) | Placebo added to multi‐component smoking cessation intervention (included nicotine replacement therapy and cognitive behavioural therapy) | 425 male prisoners | P: Continuous abstinence | OR 0.98 | 0.30 | > 0.05 | ‘Adding nortriptyline to a smoking cessation treatment package consisting of behavioural support and nicotine replacement therapy does not appear to improve long‐term abstinence rates in male prisoners… In this study, we found no significant difference in an intention‐to‐treat analysis between the two study groups, suggesting that the additional use of NOR does not enhance quit rates for tobacco in the longer term’ | OR 1.21q | 0.9 (0.9) | Evidence is insensitive | Moderate evidence for null hypothesis |
OR 1.01Ω | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
OR 1.55Ω | 0.6 (0.6) | Evidence is insensitive | Moderate evidence for null hypothesis | |||||||||
P: Point prevalence abstinence | OR 0.81 | 0.29 | > 0.05 | OR 1.21q | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.01Ω | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
OR 1.55Ω | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
S: Smoking reduction (>50% reduction in cigarette consumption) | OR 0.75 | 0.26 | > 0.05 | ‘Based on an intention‐to‐treat analysis and cut‐off point for CO of ≤ 10 p.p.m, continuous abstinence between the treatment and comparison groups were not statistically different at 3 months… point‐prevalence abstinence, using the ≤ 5 p.p.m. cut‐off between the treatment and control groups, was also not statistically significant different at three months’. | OR 0.43q | 0.9 (0.9) | Evidence is insensitive | Moderate evidence for null hypothesis | ||||
OR 0.12Ω | 0.4 (0.4) | Evidence is insensitive | Moderate evidence for null hypothesis | |||||||||
OR 0.99Ω | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
Levin 28 | Venlafaxine‐extended release | Placebo | 103 cannabis dependent adults | P: Two‐week abstinence | OR 0.23 | 0.52 | < 0.001*** | ‘For depressed, cannabis‐dependent patients, venlafaxine‐extended release does not appear to be effective at reducing depression and may lead to an increase in cannabis use’ | OR 0.80r | 2.9 (2.9) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis |
OR 0.70α | 5.5 (5.5) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
OR 0.90α | 1.6 (1.6) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
P: 50% reduction in depressive symptoms (Hamilton Depression rating scale) | OR 0.75 | 0.42 | 0.510 | OR 1.43s | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||
OR 1.20α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.60α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: THC urine levels | MD 964 | 320.27 | < 0.001*** | ‘No significant effect of treatment and no significant effect of baseline HAMD on 50% reduction of HAMD’. | MD 137.3t | 3.3 (3.3) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | ||||
MD 100α | 2.3 (2.3) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 300α | 11.9 (11.9) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
S: Use in grams | MD 2.67 | 4.72 | 0.320 | MD 0.45u | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||
MD 0.02α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
MD 0.88α | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
Okuyemi 29 | Motivational interviewing and nicotine patch | Nicotine patch and brief advice to quit | 430 homeless smokers | P: 7‐day point prevalence abstinence | OR 1.33 | 0.21 | 0.170 | ‘Adding motivation interviewing counselling for nicotine patch did not increase smoking rate significantly at 26‐week follow‐up for homeless smokers… MI did not improve adherence measures among participants who received MI.’ | OR 1.35v | 1.8 (1.8) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis |
OR 1.02Ω | 1.1 (1.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
OR 1.78Ω | 1.4 (1.4) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
S: Motivation to adhere | MD 1.4 | 0.49 | 0.080 | MD 4.97w | 11.2 (11.2) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||
MD 1.19Ω | 25.0 (25.0) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
MD 8.75Ω | 6.6 (6.6) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
S: Self‐efficacy to adhere | MD 2.5 | 3.12 | 0.220 | ‘Motivation for adherence scores at week 6 were marginally higher for participants in the intervention group than those in the control group… There were no differences between study groups in the proportion of participants who had their nicotine patches on at various study visits’. | MD 4.97w | 1.0 (1.0) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | ||||
MD 1.19Ω | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
MD 8.75Ω | 0.7 (0.7) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
S: Nicotine patch use | OR 1.0 | 0.20 | 0.970 | OR 1.14z | 0.8 (0.8) | Evidence is insensitive | Moderate evidence for null hypothesis | |||||
OR 1.02α | 1.0 (1.0) | Evidence is insensitive | No evidence | |||||||||
OR 1.28α | 0.6 (0.6) | Evidence is insensitive | Moderate evidence for null hypothesis | |||||||||
Gustafson 30 | Interest circle calls | No intervention | 201 clinics | P: Waiting‐time (mean days between first contact and first treatment) | MD −0.24 | 2.12 | 0.911 | ‘When trying to improve the effectiveness of addiction treatment services, clinic‐level coaching appears to help improve waiting‐time and number of new patients while other components of improvement collaboratives (interest circle calls and learning sessions) do not seem to add further value’ | MD 10.6y | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Strong evidence for null hypothesis |
MD 15α | 0.2 (0.2) | Evidence for null hypothesis (i.e. no effect) | Strong evidence for null hypothesis | |||||||||
MD 5α | 0.4 (0.4) | Evidence is insensitive | Moderate evidence for null hypothesis | |||||||||
P: Retention (percentage of patients retained from first to fourth treatment session) | MD −0.003 | 0.03 | 0.912 | MD 7.5y | 0.01 (0.01) | Evidence for null hypothesis (i.e. no effect) | Very strong evidence for null hypothesis | |||||
MD 10α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 5α | 0.01 (0.01) | Evidence for null hypothesis (i.e. no effect) | Very strong evidence for null hypothesis | |||||||||
P: Annual number of new patients | MD −0.04 | 0.04 | 0.369 | ‘Learning sessions had a modest waiting time reduction while interest circle calls had a slight increase, but these two groups’ changes were not statistically significant… None of the groups showed significant improvement in retention for the 6‐month intervention period (Table 3a), or the entire intervention and sustainability period (Table 3b), and there were no significant differences between groups’ | MD 14.2y | 0.01 (0.01) | Evidence for null hypothesis (i.e. no effect) | Very strong evidence for null hypothesis | ||||
MD 20α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 10α | 0.01 (0.00) | Evidence for null hypothesis (i.e. no effect) | Very strong evidence for null hypothesis | |||||||||
Coaching | No intervention | P: Waiting‐time (mean days between first contact and first treatment) | MD 4.86 | 1.95 | 0.013* | MD 10.6y | 7.2 (7.2) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||
MD 15α | 5.4 (5.4) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
MD 5α | 10.7 (10.7) | Evidence for experimental hypothesis (i.e. an effect) | Strong evidence for experimental hypothesis | |||||||||
P: Retention (percentage of patients retained from first to fourth treatment session) | MD 0.035 | 0.02 | 0.118 | MD 7.5y | 0.0 (0.0) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||
MD 10α | 0.0 (0.0) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 5α | 0.0 (0.0) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
P: Annual number of new patients | MD 0.20 | 0.09 | 0.028* | MD 0.14y | 6.0 (6.0) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||
MD 0.20α | 6.3 (6.3) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
MD 0.10α | 5.0 (5.0) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
Learning sessions | No intervention | P: Waiting‐time (mean days between first contact and first treatment) | MD 3.14 | 1.93 | 0.103 | MD 10.6y | 1.2 (1.2) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||
MD 15α | 0.9 (0.9) | Evidence is insensitive | Anecdotal evidence for null hypothesis | |||||||||
MD 5α | 2.1 (2.1) | Evidence is insensitive | Anecdotal evidence for experimental hypothesis | |||||||||
P: Retention (percentage of patients retained from first to fourth treatment session) | MD −0.003 | 0.02 | 0.899 | MD 7.5y | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||
MD 10α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 5α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
P: Annual number of new patients | MD −0.001 | 0.07 | 0.982 | MD 14.2y | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||
MD 20α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 10α | 0.01 (0.01) | Evidence for null hypothesis (i.e. no effect) | Very strong evidence for null hypothesis | |||||||||
Combination | No intervention | P: Waiting‐time (mean days between first contact and first treatment) | MD 6.16 | 1.97 | 0.002** | MD 10.6y | 41.2 (41.2) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||
MD 15α | 31.8 (31.8) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
MD 5α | 50.4 (50.4) | Evidence for experimental hypothesis (i.e. an effect) | Very strong evidence for experimental hypothesis | |||||||||
P: Retention (percentage of patients retained from first to fourth treatment session) | MD −0.003 | 0.02 | 0.891 | MD 7.5y | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||
MD 10α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
MD 5α | 0.00 (0.00) | Evidence for null hypothesis (i.e. no effect) | Extreme evidence for null hypothesis | |||||||||
P: Annual number of new patients | MD 0.09 | 0.04 | 0.029* | MD 0.14y | 5.6 (5.6) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||
MD 0.20α | 4.4 (4.4) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis | |||||||||
MD 0.10α | 6.5 (6.5) | Evidence for experimental hypothesis (i.e. an effect) | Moderate evidence for experimental hypothesis |
P = primary outcome; S = secondary outcome;
*significant at P < 0.05;
**significant at P < 0.01;
***significant at P < 0.001;
RaR = rate ratio; RR = relative risk; OR = odds ratio; MD = mean difference;
Ω = range of population SD reflects the CI of the expected effect size;
α = range of population SD based on opinion on a viable effect; a one‐directional relationship was assumed in all instances; Based on:
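a = 31;
b = 32;
c = 33;
d = 34;
e = 35;
f = 36;
g = values specified in the sample size calculation;
h = 37;
i = 38;
j = 39;
k = 40;
l = 41;
m = 42;
n = 43;
o = 44;
p = 45;
q = 46;
r = 47;
s = 48;
t = 49;
u = 50;
v = 51;
w = 52;
x = 53;
y = values specified in the sample size calculation.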
HAMD = Hamilton Rating Scale for Depression; p.p.m. = parts per million.
When specifying the predicted effect, we used a ‘half normal distribution’ whose peak was at 0 (no effect) and extending upwards with a SD equal to the expected effect size. This represents a hypothesis that the intervention had at least some positive effect, with the effect being more likely to be smaller than larger. This is a conservative approach to prediction. Another approach would be to specify the hypothesis as a uniform distribution between 0 (or a minimally clinically significant value) and a plausible upper bound. Given that none of the authors of the studies reviewed indicated what they considered to be a clinically meaningful effect or a plausible upper bound for the effect size, we took the conservative approach.
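The half‐normal specification described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a normal likelihood for the observed effect with the reported SE; it is not the code of the online calculator 8 or the R script 9, and the function names are hypothetical. It further assumes that effects are signed so that the predicted direction is positive, and that ratio measures (OR, RR) are entered on the natural log scale.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_factor_half_normal(effect: float, se: float, prior_sd: float) -> float:
    """Bayes factor for H1 (half-normal prior, mode 0, SD = prior_sd)
    against H0 (effect exactly 0), assuming a normal likelihood.

    The marginal likelihood under H1 has a closed form:
    2 * N(effect; 0, sqrt(se^2 + prior_sd^2)) * Phi(effect * prior_sd / (se * marginal_sd)).
    """
    if se <= 0 or prior_sd <= 0:
        raise ValueError("se and prior_sd must be positive")
    marginal_sd = math.sqrt(se ** 2 + prior_sd ** 2)
    like_h1 = (
        2.0
        * normal_pdf(effect, 0.0, marginal_sd)
        * normal_cdf(effect * prior_sd / (se * marginal_sd))
    )
    like_h0 = normal_pdf(effect, 0.0, se)
    return like_h1 / like_h0


if __name__ == "__main__":
    # Illustrative inputs read from Table 2 (observed effect, SE, expected effect).
    # Mean difference in negative breath samples (Alessi 26):
    print(bayes_factor_half_normal(20.20, 5.74, 8.00))
    # Odds ratio for prolonged abstinence (Gilbert 25), entered as log odds
    # (assumption: the tabulated SE is on the log scale):
    print(bayes_factor_half_normal(math.log(1.18), 0.13, math.log(1.42)))
```

Because the sketch uses a closed‐form integral rather than the numerical integration of the published tools, small differences from the tabulated Bayes factors are to be expected.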
Results
Across the 12 studies, 55 non‐significant effects and 20 significant effects were reported. For each of these, three Bayes factors were calculated: one based on an expected population SD (identified from previous studies) and two based on a range of values around the expected population SD (identified from previous studies or based on expert opinion). Thus, a total of 75 Bayes factors were calculated in the main analysis and 150 Bayes factors were derived in the sensitivity analysis (see Table 2).
Fifty‐six per cent (n = 42) of the Bayes factors were between ⅓ and 3; 14.7% (n = 11) were < ⅓ and 29.3% (n = 22) were > 3. When considering only the non‐significant findings (n = 55), 20.0% (n = 11) of Bayes factors were < ⅓ and 3.6% (n = 2) were > 3. The other 76.4% (n = 42) of Bayes factors were between ⅓ and 3. Of these, 26 were in the direction of there being an effect (Bayes factor > 1 and < 3); 12 tended to favour the hypothesis of no effect (Bayes factor < 1 and > ⅓); and for four there was no evidence either way (Bayes factor = 1).
In sensitivity analyses, 13.3% of Bayes factors were < ⅓ (n = 20), 62.7% (n = 94) were between ⅓ and 3 and 24.0% (n = 36) were > 3, showing good consistency with the main results.
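The tallies reported here follow from applying the cut‐offs of ⅓ and 3 to each Bayes factor. A minimal Python sketch of that bookkeeping is given below; the function names and category labels are hypothetical, and the treatment of values exactly equal to a cut‐off (counted as insensitive) is an assumption.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def classify(bf: float, lower: float = 1 / 3, upper: float = 3.0) -> str:
    """Map a Bayes factor to one of three conclusions using the 1/3 and 3 cut-offs."""
    if bf > upper:
        return "experimental"   # evidence for an effect
    if bf < lower:
        return "null"           # evidence for no effect
    return "insensitive"        # data do not discriminate between hypotheses


def summarise(bfs: Sequence[float]) -> Dict[str, Tuple[int, float]]:
    """Return (count, percentage to one decimal place) for each category."""
    counts = Counter(classify(bf) for bf in bfs)
    total = len(bfs)
    return {label: (n, round(100.0 * n / total, 1)) for label, n in counts.items()}


if __name__ == "__main__":
    # Main-analysis Bayes factors for the six outcomes of Alessi 26 in Table 2.
    alessi = [69.8, 5.3, 19.5, 1.2, 41.3, 120.0]
    print(summarise(alessi))
```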
Authors either did not discuss results where P > 0.05, reported them as non‐significant and/or stated that no association was found. Good concordance was noted between the online calculator 8 and the adapted R code 9, except for those Bayes factors that indicated extreme evidence for the experimental hypothesis.
Discussion
Only ⅕ of all non‐significant findings provided support for the hypothesis of no effect, while approximately ¾ (76.4%) of the Bayes factors for these findings indicated data insensitivity. Thus, reporting ‘no difference’ between conditions or lack of associations was appropriate for only a small number of papers. A minority of Bayes factors for the non‐significant effects also supported the experimental hypothesis; this tended to occur with P‐values close to statistical significance.
The development of online calculators and R code 9, 10 means that researchers in the addiction field can calculate Bayes factors easily to include as an adjunct to traditional frequentist results. The requirement to specify the experimental hypothesis means that scientific judgement is needed. This is a common criticism of Bayesian‐type methods 17, but it can also be a potential strength, because it forces researchers to be specific about what they are testing. Moreover, if there are differences of view about what may be plausible values of the effect size, it is a simple matter to conduct sensitivity analyses to assess what, if any, difference this makes. As a rule of thumb, if one is interested in a clinically relevant range then a uniform distribution can be specified; alternatively, one can use a half‐normal distribution with the peak at 0 if one is interested in any effect at all and has little confidence in the probable value. To prevent researcher bias, pre‐specified analysis plans may be published that detail the method to be used to calculate Bayes factors, the cut‐off values for interpretation and the plausible effect size that is expected.
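The uniform specification and a simple sensitivity analysis over its bounds can be sketched as follows. This is an illustration only, again assuming a normal likelihood; the function names are hypothetical and the upper bounds in the usage example are assumed values chosen for demonstration, not bounds endorsed by the trial authors.

```python
import math
from typing import Dict, Sequence


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_factor_uniform(effect: float, se: float, lower: float, upper: float) -> float:
    """Bayes factor for H1 (effect uniform on [lower, upper]) against H0 (effect = 0).

    Marginal likelihood under H1:
    [Phi((upper - effect) / se) - Phi((lower - effect) / se)] / (upper - lower).
    """
    if se <= 0:
        raise ValueError("se must be positive")
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    like_h1 = (
        normal_cdf((upper - effect) / se) - normal_cdf((lower - effect) / se)
    ) / (upper - lower)
    like_h0 = normal_pdf(effect, 0.0, se)
    return like_h1 / like_h0


def sensitivity_to_upper_bound(
    effect: float, se: float, upper_bounds: Sequence[float], lower: float = 0.0
) -> Dict[float, float]:
    """Recompute the Bayes factor for each candidate upper bound."""
    return {u: bayes_factor_uniform(effect, se, lower, u) for u in upper_bounds}


if __name__ == "__main__":
    # Observed waiting-time difference for learning sessions (MD 3.14, SE 1.93).
    # The upper bounds 5, 10.6 and 15 days are assumed for illustration.
    for bound, bf in sensitivity_to_upper_bound(3.14, 1.93, [5.0, 10.6, 15.0]).items():
        print(f"upper bound {bound}: BF = {bf:.2f}")
```

If the conclusion (effect, no effect or insensitive) is the same across the plausible bounds, disagreement about the exact specification matters little in practice.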
The findings of this review show that researchers should avoid the use of terms such as ‘no difference’ or ‘lack of associations’ for P‐values > 0.05, unless a Bayes factor < ⅓ is also found. Otherwise null findings should be framed as ‘the findings were inconclusive as to whether or not a difference/association was present’, or some similar wording. This is now encouraged practice by the Addiction journal 1. Researchers may also wish to use Bayes factors in order to quantify the evidence for the experimental hypothesis (i.e. moderate, strong, very strong and extreme) and/or use such a calculation as a stopping rule for data collection 18. For ethical and perhaps financial reasons interim analyses are often planned for randomized trials, with early stopping occurring if there is demonstrated efficacy, the intervention is harmful or there is no beneficial effect. P‐values cannot inform us about the latter; in contrast, a Bayes factor indicating data insensitivity would suggest further recruitment, while a Bayes factor indicating evidence for the null hypothesis may point towards early termination.
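The stopping logic described above can be expressed as a small decision rule applied to the Bayes factor at each planned interim analysis. The sketch below is hypothetical: the function name, the messages and the use of ⅓ and 3 as stopping boundaries are assumptions for illustration, and monitoring for harm would need to be handled separately.

```python
from typing import Optional, Sequence, Tuple


def first_conclusive_look(
    bfs: Sequence[float], lower: float = 1 / 3, upper: float = 3.0
) -> Tuple[Optional[int], str]:
    """Scan Bayes factors from successive interim analyses (in order) and
    report the first look at which the evidence becomes conclusive."""
    for look, bf in enumerate(bfs, start=1):
        if bf > upper:
            return look, "stop: evidence for an effect"
        if bf < lower:
            return look, "stop: evidence for no effect"
    # No look crossed either boundary: the data remain insensitive.
    return None, "continue recruiting: data insensitive"


if __name__ == "__main__":
    # Hypothetical Bayes factors at three interim analyses.
    print(first_conclusive_look([1.2, 0.8, 0.2]))
```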
Note that the methods used to derive Bayes factors in this paper did not cover all the possibilities. More advanced Bayesian hierarchical modelling (BHM) 11, implemented in R and WinBUGS, allows a wider range of distributions, e.g. gamma, Poisson, binomial and negative binomial.
Declaration of interests
E.B. has received unrestricted funding from Pfizer. R.W. undertakes consultancy and research for and receives travel funds and hospitality from manufacturers of smoking cessation medications but does not, and will not, take funds from EC manufacturers or the tobacco industry. R.W. is an advisor to the National Centre for Smoking Cessation. Z.D. has no conflicts of interest to declare.
Supporting information
Appendix S1 Example: calculating a Bayes Factor.
Acknowledgements
R.W.’s salary is funded by Cancer Research UK (CRUK). E.B. is funded by CRUK and by the National Institute for Health Research (NIHR)’s School for Public Health Research (SPHR). The views are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. SPHR is a partnership between the Universities of Sheffield; Bristol; Cambridge; Exeter; UCL; The London School for Hygiene and Tropical Medicine; the LiLaC collaboration between the Universities of Liverpool and Lancaster and Fuse; The Centre for Translational Research in Public Health, a collaboration between Newcastle, Durham, Northumbria, Sunderland and Teesside Universities. C.M. has nothing to declare.
Beard, E. , Dienes, Z. , Muirhead, C. , and West, R. (2016) Using Bayes factors for testing hypotheses about intervention effectiveness in addictions research. Addiction, 111: 2230–2247. doi: 10.1111/add.13501.
References
- 1. West R. Using Bayesian analysis for hypothesis testing in addiction science. Addiction 2016; 111: 3–4.
- 2. Jeffreys H. Theory of Probability. Oxford: Clarendon Press; 1961.
- 3. Wagenmakers E.‐J. A practical solution to the pervasive problems of p values. Psychon Bull Rev 2007; 14: 779–804.
- 4. Rouder J. N., Speckman P. L., Sun D., Morey R. D., Iverson G. Bayesian t‐tests for accepting and rejecting the null hypothesis. Psychon Bull Rev 2009; 16: 225–37.
- 5. Gallistel C. The importance of proving the null. Psychol Rev 2009; 116: 439.
- 6. Schervish M. J. P values: what they are and what they are not. Am Statistician 1996; 50: 203–6.
- 7. Berger J. O. Statistical Decision Theory and Bayesian Analysis. Berlin: Springer Science and Business Media; 2013.
- 8. Dienes Z. Using Bayes to get the most out of non‐significant results. Front Psychol 2014; 5: 781.
- 9. Bayes C. J. Factor Calculator. R code 2011. Available at: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/bayesFactorCalc2.R (accessed 1 March 2016).
- 10. Baguley T., Kaye W. S. Review of understanding psychology as a science: an introduction to scientific and statistical inference. Br J Math Stat Psychol 2010; 63: 695–8.
- 11. Lee M., Wagenmakers E. Bayesian Modeling for Cognitive Science: A Practical Course. Cambridge: Cambridge University Press; 2013.
- 12. Rouder J. N., Morey R. D., Speckman P. L., Province J. M. Default Bayes factors for ANOVA designs. J Math Psychol 2012; 56: 356–74.
- 13. Baguley T. Serious Stats: A Guide to Advanced Statistics for the Behavioral Sciences. Basingstoke: Palgrave Macmillan; 2012.
- 14. Morey R., Rouder J., Jamil T. BayesFactor: computation of Bayes factors for common designs. R package version 0.9.8. 2014. Available from: https://cran.r‐project.org/web/packages/BayesFactor/BayesFactor.pdf (accessed 1 March 2016).
- 15. Koch M., Riss P., Umek W., Hanzal E. The primary outcomes and power calculations in clinical RCTs in urogynecology—need for improvement? Trials 2015; 16: 1.
- 16. Freemantle N. Interpreting the results of secondary end points and subgroup analyses in clinical trials: should we lock the crazy aunt in the attic? BMJ 2001; 322: 989–91.
- 17. Sprenger J. The Objectivity of Subjective Bayesian Inference 2015. Available at: http://philsci‐archive.pitt.edu/11936/1/ObjectiveBayesianStatistics_v3.pdf (accessed 1 March 2016).
- 18. Rouder J. N. Optional stopping: no problem for Bayesians. Psychon Bull Rev 2014; 21: 301–8.
- 19. Kypri K., McCambridge J., Vater T., Bowe S. J., Saunders J. B., Cunningham J. A. et al. Web‐based alcohol intervention for Māori university students: double‐blind, multi‐site randomized controlled trial. Addiction 2013; 108: 331–8.
- 20. Li L., Wu Z., Liang L. J., Lin C., Zhang L., Guo S. et al. An intervention targeting service providers and clients for methadone maintenance treatment in China: a cluster‐randomized trial. Addiction 2013; 108: 356–66.
- 21. Ward K. D., Asfar T., Ali A. R., Rastam S., Weg M. W. V., Eissenberg T. et al. Randomized trial of the effectiveness of combined behavioral/pharmacological smoking cessation treatment in Syrian primary care clinics. Addiction 2013; 108: 394–403.
- 22. Borland R., Balmford J., Benda P. Population‐level effects of automated smoking cessation help programs: a randomized controlled trial. Addiction 2013; 108: 618–28. [DOI] [PubMed] [Google Scholar]
- 23. Rendall‐Mkosi K., Morojele N., London L., Moodley S., Singh C., Girdler‐Brown B. A randomized controlled trial of motivational interviewing to prevent risk for an alcohol‐exposed pregnancy in the Western Cape, South Africa. Addiction 2013; 108: 725–32. [DOI] [PubMed] [Google Scholar]
- 24. Coffin P. O., Santos G. M., Das M., Santos D. M., Huffaker S., Matheson T. et al. Aripiprazole for the treatment of methamphetamine dependence: a randomized, double‐blind, placebo‐controlled trial. Addiction 2013; 108: 751–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Gilbert H. M., Leurent B., Sutton S., Alexis‐Garsee C., Morris R. W., Nazareth I. ESCAPE: a randomised controlled trial of computer‐tailored smoking cessation advice in primary care. Addiction 2013; 108: 811–9. [DOI] [PubMed] [Google Scholar]
- 26. Alessi S. M., Petry N. M. A randomized study of cellphone technology to reinforce alcohol abstinence in the natural environment. Addiction 2013; 108: 900–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Richmond R., Indig D., Butler T., Wilhelm K., Archer V., Wodak A. A randomized controlled trial of a smoking cessation intervention conducted among prisoners. Addiction 2013; 108: 966–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Levin F. R., Mariani J., Brooks D. J., Pavlicova M., Nunes E. V., Agosti V. et al. A randomized double‐blind, placebo‐controlled trial of venlafaxine‐extended release for co‐occurring cannabis dependence and depressive disorders. Addiction 2013; 108: 1084–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Okuyemi K. S., Goldade K., Whembolua G. L., Thomas J. L., Eischen S., Sewali B. et al. Motivational interviewing to enhance nicotine patch treatment for smoking cessation among homeless smokers: a randomized controlled trial. Addiction 2013; 108: 1136–44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Gustafson D. H., Quanbeck A. R., Robinson J. M., Ford J. H., Pulvermacher A., French M. T. et al. Which elements of improvement collaboratives are most effective? A cluster‐randomized trial. Addiction 2013; 108: 1145–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Kypri K., Hallett J., Howat P., McManus A., Maycock B., Bowe S. et al. Randomized controlled trial of proactive web‐based alcohol screening and brief intervention for university students. Arch Intern Med 2009; 169: 1508–14. [DOI] [PubMed] [Google Scholar]
- 32. Andrews S., Sorensen J. L., Guydish J., Delucchi K., Greenberg B. Knowledge and attitudes about methadone maintenance among staff working in a therapeutic community. J Mainten Addict 2005; 3: 47–59. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33. Livingston J. D., Milne T., Fang M. L., Amari E. The effectiveness of interventions for reducing stigma related to substance use disorders: a systematic review. Addiction 2012; 107: 39–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34. Hser Y.‐I. Predicting long‐term stable recovery from heroin addiction: findings from a 33‐year follow‐up study. J Addict Dis 2007; 26: 51–60. [DOI] [PubMed] [Google Scholar]
- 35. Mattick R. P., Breen C., Kimber J., Davoli M. Methadone maintenance therapy versus no opioid replacement therapy for opioid dependence. Cochrane Database Syst Rev 2009; 3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Stead L. F., Perera R., Bullen C., Mant D., Lancaster T. Nicotine replacement therapy for smoking cessation. Cochrane Database Syst Rev 2008; 1: CD000146. [DOI] [PubMed] [Google Scholar]
- 37. Floyd R. L., Sobell M., Velasquez M. M., Ingersoll K., Nettleman M., Sobell L. et al. Preventing alcohol‐exposed pregnancies: a randomized controlled trial. Am J Prev Med 2007; 32: 1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. Ingersoll K. S., Ceperich S. D., Nettleman M. D., Karanda K., Brocksen S., Johnson B. A. Reducing alcohol‐exposed pregnancy risk in college women: initial outcomes of a clinical trial of a motivational intervention. J Subst Abuse Treat 2005; 29: 173–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39. Tiihonen J., Kuoppasalmi K., Föhr J., Tuomola P., Kuikanmäki O., Vorma H. et al. A comparison of aripiprazole, methylphenidate, and placebo for amphetamine dependence. Am J Psychiatry 2007; 164: 160–2. [DOI] [PubMed] [Google Scholar]
- 40. Colfax G. N., Santos G.‐M., Das M., Santos D. M., Matheson T., Gasper J. et al. Mirtazapine to reduce methamphetamine use: a randomized controlled trial. Arch Gen Psychiatry 2011; 68: 1168–75.
- 41. Meini M., Moncini M., Cecconi D., Cellesi V., Biasci L., Simoni G. et al. Aripiprazole and ropinirole treatment for cocaine dependence: evidence from a pilot study. Curr Pharm Des 2011; 17: 1376–83.
- 42. Lancaster T., Stead L. F. Self‐help interventions for smoking cessation. Cochrane Database Syst Rev 2005; 3: CD001007.
- 43. Petry N. M., Martin B., Cooney J. L., Kranzler H. R. Give them prizes and they will come: contingency management for treatment of alcohol dependence. J Consult Clin Psychol 2000; 68: 250.
- 44. Barnett N. P., Tidey J., Murphy J. G., Swift R., Colby S. M. Contingency management for alcohol use reduction: a pilot study using a transdermal alcohol sensor. Drug Alcohol Depend 2011; 118: 391–9.
- 45. Litt M. D., Kadden R. M., Kabela‐Cormier E., Petry N. M. Changing network support for drinking: network support project 2‐year follow‐up. J Consult Clin Psychol 2009; 77: 229.
- 46. Hughes J. R., Stead L. F., Hartmann‐Boyce J., Cahill K., Lancaster T. Antidepressants for smoking cessation. Cochrane Database Syst Rev 2014; 1: CD000031.
- 47. Findling R. L., Pagano M. E., McNamara N. K., Stansbrey R. J., Faber J. E., Lingler J. et al. The short‐term safety and efficacy of fluoxetine in depressed adolescents with alcohol and cannabis use disorders: a pilot randomized placebo‐controlled trial. Child Adolesc Psychiatr Ment Health 2009; 3: 11.
- 48. Keller M. B., Trivedi M. H., Thase M. E., Shelton R. C., Kornstein S. G., Nemeroff C. B. et al. The Prevention of Recurrent Episodes of Depression with Venlafaxine for Two Years (PREVENT) Study: outcomes from the 2‐year and combined maintenance phases. J Clin Psychiatry 2007; 68: 1246–56.
- 49. Bonnet U., Specka M., Stratmann U., Ochwadt R., Scherbaum N. Abstinence phenomena of chronic cannabis‐addicts prospectively monitored during controlled inpatient detoxification: Cannabis withdrawal syndrome and its correlation with delta‐9‐tetrahydrocannabinol and‐metabolites in serum. Drug Alcohol Depend 2014; 143: 189–97.
- 50. Smeerdijk M., Keet R., Dekker N., van Raaij B., Krikke M., Koeter M. et al. Motivational interviewing and interaction skills training for parents to change cannabis use in young adults with recent‐onset schizophrenia: a randomized controlled trial. Psychol Med 2012; 42: 1626–36.
- 51. Hettema J. E., Hendricks P. S. Motivational interviewing for smoking cessation: a meta‐analytic review. J Consult Clin Psychol 2010; 78: 868.
- 52. Alterman A. I., Gariti P., Cook T. G., Cnaan A. Nicodermal patch adherence and its correlates. Drug Alcohol Depend 1999; 53: 159–65.
- 53. Hollands G. J., McDermott M. S., Lindson‐Hawley N., Vogt F., Farley A., Aveyard P. Interventions to increase adherence to medications for tobacco dependence. Cochrane Database Syst Rev 2015; 2: CD009164.
Associated Data
Supplementary Materials
Appendix S1 Example: calculating a Bayes Factor.