Author manuscript; available in PMC: 2021 Jan 1.
Published in final edited form as: Psychother Res. 2018 Nov 18;30(1):53–67. doi: 10.1080/10503307.2018.1544724

Instrumental Variable Analyses for Causal Inference: Application to Multilevel Analyses of the Alliance-Outcome Relation

Paul Crits-Christoph 1, Robert Gallop 2, Averi Gaines 3, Agnes Rieger 4, Mary Beth Connolly Gibbons 5
PMCID: PMC6525655  NIHMSID: NIHMS1512062  PMID: 30451094

Abstract

Objective.

To introduce readers to instrumental variable analyses for causal inferences using as an example a test of the hypothesis that the quality of the therapeutic alliance has a causal role in relation to the outcome of psychotherapy.

Method.

We used data from a recent non-inferiority trial of cognitive and dynamic therapies for major depressive disorder in a community mental health setting. The data (N = 161) were analyzed using standard approaches as well as a multilevel 2-stage instrumental variables approach that allows for causal interpretations by removing the influence of unmeasured confounds.

Results.

Instrumental variables were created at the patient and therapist level using baseline patient and therapist variables. These baseline variables predicted the alliance but were otherwise unrelated to treatment outcome other than through their effects on the alliance. Standard multilevel mixed effects analyses revealed statistically significant associations of the alliance with outcome at the therapist level of analysis. The therapist level effect remained statistically significant when using the instrumental variables approach.

Conclusion.

Our results support the hypothesis that, at least at the therapist level, the alliance plays a causal role in producing better outcomes. Instrumental variable analyses can be a useful tool to supplement standard analyses.

Keywords: alliance, outcome, causal inference, instrumental variables, depression


In the past 40 years, the operationalization of the concept of the therapeutic relationship in terms of the alliance, defined transtheoretically by Bordin (1979) as being composed of the emotional bond between patient and therapist, agreement on tasks, and agreement on goals, has led to a broad acceptance of the alliance as an important aspect of the process of change in various psychotherapies. This has included recognition of the role of the alliance in cognitive-behavioral therapy (Castonguay, Constantino, McAleavey, & Goldfried, 2010), a range of psychodynamically-oriented therapies (Messer & Wolitzky, 2010), interpersonally oriented treatment (Benjamin & Critchfield, 2010), modern humanistic approaches (Watson & Kalogerakos, 2010), modes of family therapy (Escudero, Heatherington, & Friedlander, 2010), couples therapy (Horvath, Symonds, & Tapia, 2010), and group therapy (Piper & Ogrodniczuk, 2010). This broad recognition of the role of the alliance is supported by a meta-analysis of 295 empirical studies that found a modest average association (r = .278) between the quality of the alliance during psychotherapy and the outcomes of a wide range of therapies (Flückiger, Del Re, Wampold, & Horvath, 2018).

Despite the volume of studies showing a consistent association between the alliance and therapy outcome, questions remain. Most notable is the issue of causality (Barber et al., 1999; Barber, Connolly, Crits-Christoph, Gladis, & Siqueland, 2000; Crits-Christoph, Connolly Gibbons, & Hearon, 2006; DeRubeis, Brotman, & Gibbons, 2005). DeRubeis et al. (2005) argue that, at least for cognitive therapy, multiple studies have failed to find a relation between the alliance and outcome, particularly when predicting subsequent symptom change (e.g., DeRubeis & Feeley, 1990; Feeley, DeRubeis, & Gelfand, 1999; Safran & Wallner, 1991; Webb, Beard, Auerbach, Menninger, & Björgvinsson, 2014), and that evidence for the reverse causation (i.e., early positive outcome causes subsequent positive alliance) was present in three of these studies (DeRubeis & Feeley, 1990; Feeley, DeRubeis, & Gelfand, 1999; Webb et al., 2014). There are now at least 11 alliance studies that attempt to address the influence of early symptom improvement on the alliance-outcome relation, and a median r of .24 is evident across these studies (Crits-Christoph, Connolly Gibbons, & Mukherjee, 2013). However, the relatively small number of studies, variability in findings across studies (rs ranging from −.27 to .42), and use of diverse statistical methods in the studies leave open the question of whether the alliance-outcome relation is partly, or wholly, a function of reverse causation.

Of course, early improvement is only one such “third variable” that might be generating a spurious association between the alliance and outcome. Few of the nearly 300 studies of the alliance-outcome relation statistically controlled for other variables. One study (Klein et al., 2003) had a large enough sample size to specify a number of covariates, including gender, chronicity of depression, comorbid anxiety, substance use, personality disorders, social functioning, and history of abuse/neglect in childhood. Of note is that the alliance predicted subsequent change in this study even after controlling for this set of variables. However, there are many other potential confounding variables that were not controlled.

The problem of “third variables” in non-experimental (i.e., non-random assignment) investigations is, of course, not unique to studies of the alliance-outcome relation. The mantra that “correlation does not prove causality” echoes through all graduate training programs and is a standard point made in the limitation sections of many articles reporting results from non-random assignment studies. Statistical techniques such as structural equation modeling (SEM), which disentangle the direct and indirect effects of latent and observed variables on a specified outcome, are often used by some investigators to infer causal connections. However, causal relations inferred from such standard statistical models remain vulnerable to spurious findings driven by unmeasured variables. Such unmeasured confounding variables are described as “specification errors” in SEM. Thus, causality is an assumption of a standard SEM model rather than a consequence (Brannick, 1995), though causal inference methods (described below) can also be applied within the context of SEM to address unmeasured confounding.

Further complicating a full understanding of any causal connection between the alliance and outcome is the fact that psychotherapy is a dyadic process in which causal elements may occur at either the patient or therapist level. In recent years, this has been more fully acknowledged with the use of multilevel analyses to disentangle patient level effects from therapist level effects. A number of studies (e.g., Baldwin, Wampold, & Imel, 2007; Constantino et al., 2017; Crits-Christoph et al., 2009; Crits-Christoph, Connolly Gibbons, Hamilton, Ring-Kurtz, & Gallop, 2011; Crits-Christoph, Hamilton, Ring-Kurtz, Gallop, McClure, Kulaga, & Rotrosen, 2011; Dinger, Strack, Leichsenring, Wilmers, & Schauenberg, 2008; Falkenström, Granström, & Holmqvist, 2014; Huppert et al., 2014; Laws et al., 2017; Marcus, Kashy, Wintersteen, & Diamond, 2011; Zuroff, Kelly, Leybman, Blatt, & Wampold, 2010) have now reported multilevel analyses of the alliance-outcome relation. Though significant effects at the therapist (but not patient) level have been evident in some studies (e.g., Baldwin et al., 2007; Crits-Christoph et al., 2009; Dinger et al., 2008; Marcus et al., 2011; Zuroff et al., 2010), significant effects at the patient (but not therapist) level have been found in other studies (e.g., Crits-Christoph, Hamilton, et al., 2011; Falkenström et al., 2014; Huppert et al., 2014; Laws et al., 2016). Further multilevel studies of the alliance-outcome relation are therefore necessary to begin to sort out the potential effect of alliance on outcome.

Beyond designing a randomized experiment, which is a difficult task for research addressing the alliance-outcome relation, how is it possible to determine whether a positive alliance “causes” better psychotherapy outcomes? Morgan and Winship (2015), in their book Counterfactuals and Causal Inference: Method and Principles for Social Research, list three approaches to causality in the context of observational studies. One approach is to measure and control for potential confounding variables in order to eliminate the noncausal portion of the association between a variable of interest and an outcome variable. This is done typically through regression analyses that include pre-identified confounders in the model. Elimination can also be achieved through stratification, or matching, on such confounders. Standard regression analyses with covariates or more advanced approaches, such as standard SEM and path analysis, rely on this approach to causality. As mentioned, it may not be easy to identify and measure all possible patient and therapist confounding variables when attempting to model the causal influence of the alliance on outcome (or, for that matter, many other associations of interest in psychotherapy research and other social science domains of investigation). However, one way to eliminate patient characteristics and therapist characteristics that might confound the alliance-outcome relation is to conduct within-patient analyses in studies that measure the alliance and outcome repeatedly over time and use lagged analyses (change in the alliance predicting subsequent change in symptoms) to eliminate reverse causation (e.g., Falkenström, Granström, & Holmqvist, 2013). Although such within-patient analyses are appealing for examining the effect of alliance on next session outcomes, there remain possible unmeasured within-patient variables that confound the alliance-symptom change association. 
In theory, such within-patient relations could be examined using the causal inference methods described in this article to eliminate the impact of unmeasured within-patient confounders, though the identification and measurement of within-patient instrumental variables (see below) for accomplishing this would be a significant challenge. It is also important to be aware that it is possible that patients who show the strongest association between session-by-session changes in alliance and symptoms might be those that benefit the least from psychotherapy (e.g., patients with borderline personality disorders who have highly variable feelings towards their therapists from session to session). Thus, within-patient analyses may not be ideal for asking questions about which patients benefit relatively more from psychotherapy.

Another approach to isolate causal effects involves examining the mechanism through which the causal variable of interest affects the outcome. If one or more mediator variables are the only causal route through which the variable of interest affects the outcome, then causality can be estimated indirectly through mediational analysis (e.g., by examining if the relationship between alliance and outcome is reduced to zero once mediators are included in a regression model). Yet, causality is evident with this approach only if all potential confounding variables are both measured and statistically controlled, which is likely impracticable. The third approach to causality described by Morgan and Winship (2015) when randomization is not possible is much less known in psychology and psychotherapy research. This is the method of instrumental variables. Because it is less known in our field, we provide an introduction to this method here.

The basic idea of the method of instrumental variables is relatively straightforward. One identifies a variable (called the “instrument,” typically labeled “Z”) that “causes” the predictor variable of interest (“X”), but with the constraint that Z has no independent association with the outcome (“Y”). The need for the identified instrument to have a causal effect on the predictor is called the relevance assumption. The lack of causal paths from Z to Y, except through X, is called the exclusion restriction. For example, assume that patient interpersonal style (e.g., desire/willingness to emotionally connect to people vs. lack of interest in connecting emotionally to other people) is an instrument (Z) that predicts alliance in therapy (X), but Z has no other association with treatment outcome (Y), other than the causal path that Z is associated with X which is in turn associated with Y. If such an instrument can be identified, then the method of instrumental variables proceeds by statistically isolating the part of variable X that is associated only with Z. This part of X is then correlated with Y to determine if X has a statistical association with Y, independent of any and all potential confounding variables. Because we have isolated the part of X (e.g., alliance) that is “caused” only by Z (the instrument), no other causal factors (measured or unmeasured) could “cause” Y (except through their association with X).
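The logic of the preceding paragraph can be written compactly. The following is only a sketch under simplifying assumptions that are not part of the original text: all relations are linear, the effect β of X on Y is constant, a single unmeasured confounder U affects both X and Y, and the error term ε_Y is unrelated to Z and X. The symbols π, γ, δ, and ε are introduced here for illustration.

```latex
\begin{align}
X &= \pi Z + \gamma U + \varepsilon_X, \\
Y &= \beta X + \delta U + \varepsilon_Y, \\
\operatorname{Cov}(Z, U) &= \operatorname{Cov}(Z, \varepsilon_Y) = 0
  \quad \text{($Z$ is related to $Y$ only through $X$)}, \\
\operatorname{Cov}(Z, Y) &= \beta \operatorname{Cov}(Z, X)
  + \delta \operatorname{Cov}(Z, U) + \operatorname{Cov}(Z, \varepsilon_Y)
  = \beta \operatorname{Cov}(Z, X), \\
\beta_{IV} &= \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)},
  \qquad \operatorname{Cov}(Z, X) \neq 0 \ \text{(relevance)}, \\
\frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} &= \beta
  + \delta \, \frac{\operatorname{Cov}(X, U)}{\operatorname{Var}(X)}
  \quad \text{(the standard regression slope, biased whenever $\gamma \delta \neq 0$)}.
\end{align}
```

Under these assumptions the ratio of covariances recovers β even though U is never measured, whereas the ordinary regression slope absorbs the confounder's contribution.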

It is useful to note that an instrument could also be a linear combination of variables. By using a combination of variables, a relatively larger portion of X can be isolated. If only a small portion of X is isolated (e.g., by using a single, less effective instrumental variable) it is less likely that a meaningful causal association between X and Y is uncovered. A weak correlation between the instrument (Z) and the predictor (X) can lead to misleading parameter estimates with large standard errors (wide confidence intervals) when examining the causal impact on outcome (Y). As a general guideline, an F ratio of 10 or greater has been suggested for evaluating the statistical relevance of Z to X (Staiger & Stock, 1997), though other factors (i.e., persuasive arguments, theory, previous research) should also be considered when evaluating potential instruments.
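As a rough illustration of the F ≥ 10 guideline, the sketch below computes the first-stage F statistic for the simplest case of one instrument and one predictor, where F equals the squared t statistic of the slope. The function name `first_stage_f` and all numbers are made up for illustration; with several instruments, the relevant quantity is the joint F test of all instruments in the first-stage regression, which this sketch does not cover.

```python
def first_stage_f(z, x):
    """F statistic for regressing x on a single instrument z (equals t squared)."""
    n = len(z)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    szx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    r_squared = szx ** 2 / (szz * sxx)
    # One numerator df (a single instrument) and n - 2 denominator df.
    # Note: a perfect fit (r_squared == 1) would divide by zero here.
    return r_squared / (1 - r_squared) * (n - 2)


# Made-up numbers for illustration only.
z = [-2, -1, 0, 1, 2, 3]
x_strong = [-1.9, -1.2, 0.3, 0.8, 2.1, 2.9]  # closely tracks z
x_weak = [0.5, -1.0, 1.2, -0.4, 0.9, -0.2]   # nearly unrelated to z

for label, x in (("strong", x_strong), ("weak", x_weak)):
    f_value = first_stage_f(z, x)
    verdict = "meets the F >= 10 guideline" if f_value >= 10 else "falls below the F >= 10 guideline"
    print(f"{label} instrument: F = {f_value:.2f}, {verdict}")
```

The statistic is only one piece of evidence; as the text notes, theory and previous research also bear on whether a candidate instrument is credible.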

In addition to the relevance assumption and the exclusion restriction, the instrumental variable approach assumes that the instrument does not share common causes with the outcome (this is termed the exchangeability assumption) (Lousdal, 2018). Further, the approach also assumes that effects of the predictor variable are homogeneous. In the case of the alliance, the assumption of homogeneity (at the extreme) would be that the effect of a given level of the alliance on outcome is the same across all patients. Because the assumption of homogeneity would often be violated (almost certainly in the case of the alliance-outcome association), an alternative assumption is one of monotonicity (Swanson, Miller, Robins, & Hernán, 2015). In the case of the alliance-outcome association, this assumption means that the alliance effect on outcome is in the same direction for each subject, though the magnitude of the effect can vary. Although the monotonicity assumption is a preferred alternative to the homogeneity assumption because the latter is likely to be violated, the monotonicity assumption comes with the cost of reduced generalizability of causal effects estimates (Lousdal, 2018).

The method of instrumental variables was developed within the context of econometric research in the 1920s, with early studies focusing on the estimation of supply and demand curves in agricultural markets (Stock & Trebbi, 2003). The method has since been widely applied in economic research (Angrist & Krueger, 2006) and has expanded to medical research (e.g., Earle et al., 2001). Of note is that econometric applications of instrumental variable analysis typically have much larger sample sizes than those typically used in psychotherapy studies.

There is now an extensive statistical literature on various causal inference methods (see: Imbens & Rubin, 2015; Morgan & Winship, 2015; Pearl, 2009; Pearl, Glymour, & Jewell, 2016). Several extensions of the method of instrumental variables have been described that rely on different estimation approaches (Uddin et al., 2015). The structural mean model approach, developed specifically in the context of attempting to address randomized experiments that have been compromised by attrition or lack of compliance, uses a semiparametric method of estimation called G estimation (Robins, 1994). The structural nested mean model approach is a further extension that applies the G estimation method to longitudinal data with potential time-dependent confounders (Robins, 2000). These models were subsequently extended to the study of mediating variables (which have not been subject to randomization) in the context of randomized controlled trials (Gallop et al., 2009; Ten Have et al., 2007). One problem with the G estimation method is that such methods are currently not readily available in commonly used statistical packages like SAS and SPSS without specialized macros or modules. This is one likely reason why their use is relatively uncommon. In contrast, the method of instrumental variables can be easily implemented with a two-stage ordinary least squares regression analysis. In the first stage, the predictor variable (X) is predicted from the instrument (Z), and the predicted values of X (that portion of X related to Z) are output from the regression. This new variable (portion of X predicted by Z) is then used to predict the outcome (Y) in the second stage of the analyses. There is no reason why such analyses cannot be generalized to a multilevel context, with instruments used at each level, as done by Holmes et al. (2017) in their examination of adherence for sleep apnea interventions.
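The two-stage procedure described above can be sketched in a few lines. This is a minimal single-level illustration on simulated data, not the multilevel model used in the article: the data-generating values (a true effect of 0.5, a confounder `u` that raises both X and Y) and the function names are assumptions chosen for the example.

```python
import random


def slope_intercept(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(z, x, y):
    """Return the second-stage slope of y on the part of x predicted by z."""
    # Stage 1: predict X from the instrument Z and keep the fitted values.
    b1, a1 = slope_intercept(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress Y on the fitted values of X.
    slope, _ = slope_intercept(x_hat, y)
    return slope


def simulate(n=5000, true_effect=0.5, seed=1):
    """Hypothetical data: u confounds X and Y; Z affects Y only through X."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)  # unmeasured confounder, never passed to the models
        xi = zi + ui + rng.gauss(0, 1)
        yi = true_effect * xi + ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_slope, _ = slope_intercept(x, y)
iv_slope = two_stage_least_squares(z, x, y)
print(f"Standard regression slope: {ols_slope:.3f}")
print(f"Two-stage (instrumental variable) slope: {iv_slope:.3f}")
```

In this setup the standard regression slope is inflated by the confounder, while the two-stage slope should land near the assumed true effect. One caution: standard errors taken directly from a hand-run second stage are not correct, because they ignore that the predictor was itself estimated, so dedicated instrumental variable routines are preferable for inference.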

While the instrumental variable analyses may seem straightforward, the challenge for conducting causal inference analyses using the instrumental variable method is first determining an adequate instrument. As mentioned, a weak instrument will lead to large standard errors, reducing the likelihood of uncovering a statistically significant causal connection between the predictor variable and the outcome. With the alliance-outcome association, the task is complicated by the need to identify valid instruments at both the patient and therapist level of analyses. Adequate instruments at both levels would need to both fulfill the exclusion restriction and explain a meaningful amount of variance in the alliance at both levels.

Ideally, valid instruments would be identified in some areas of science based purely on logic and/or theoretical grounds. That is, there might be some variable that is highly associated with the predictor (X) of interest, but logically/theoretically cannot have a direct causal impact on the outcome measure (except through its relation to the predictor). Within psychotherapy research, identifying variables on purely logical or theoretical grounds that cause the alliance, but do not cause outcome (except through their relation with the alliance), is a difficult task. Instead of relying on theoretical or logical grounds, one could potentially identify instruments on empirical grounds. The likelihood that such empirically-identified instruments are sample-specific, or at the least have rather limited generalizability to other contexts, is high. Accordingly, it is difficult to recommend causal inference methods based on empirically-developed instruments as the primary analyses that should be conducted for examining the alliance-outcome relation (or, for that matter, for testing any process-outcome association). The alternative is to use causal inference methods as a secondary supporting analysis to be conducted after a statistically significant (and clinically meaningful) process-outcome association is detected with standard (non-causal) statistical methods. Then, one could investigate if the association holds up using causal inference methods (i.e., to what extent can we be confident that the association is not due to unmeasured confounds?). This is the approach taken in the current article and is consistent with the recommendation of Greenland (2000) for use of instrumental variable analysis as a secondary, or confirmatory, approach.

To narrow the search for empirically-based instruments, it is useful to begin by relying on results from previous investigations of variables predicting the alliance. Fortunately, a number of studies have reported correlates of the alliance at the patient and therapist levels. With regard to patient variables that could be considered as instruments at the patient level of analysis, several studies have indicated that patient baseline interpersonal problems and patient expectations regarding therapy outcome or therapy process predict the alliance, or have found that the alliance mediates the relation of these factors to outcome (Abouguendia, Joyce, Piper, & Ogrodniczuk, 2004; Connolly Gibbons et al., 2003; Constantino, Arnow, Blasey, & Agras, 2005; Gaudiano & Miller, 2006; Hersoug, Høglend, Monsen, & Havik, 2001; Johansson, Høglend, & Hersoug, 2011; Joyce, Ogrodniczuk, Piper, & McCallum, 2003; Joyce & Piper, 1998; Meyer et al., 2002; Ollila, Knekt, Heinonen, & Lindfors, 2016; Patterson, Uhlin, & Anderson, 2008; Stiles-Shields et al., 2016; Tsai, Ogrodniczuk, Sochting, & Mirmiran, 2014). We therefore evaluated available variables in the current dataset that were related to these domains as potential instruments at the patient level. Therapist variables previously found to be related to the alliance include therapist experience (Mallinckrodt & Nelson, 1991) and personal characteristics (Ackerman & Hilsenroth, 2004; Dunkle & Friedlander, 1996; Hersoug et al., 2001; Hersoug, Høglend, Havik, von der Lippe, & Monsen, 2009; Nissen-Lie, Monsen, & Rønnestad, 2010). Based on these findings, we examined therapist personal characteristics and therapist experience variables as predictors of therapist level alliance in the current study.

Instrumental variable analysis has been used in one previous study to examine the alliance-outcome relation (Goldsmith, Lewis, Dunn, & Bentall, 2015). The study examined the interaction between the number of sessions attended and the alliance at Session 4 in relation to the outcome of 6-week treatment with cognitive behavioral therapy plus routine care, supportive counselling plus routine care, or routine care alone for patients who had a first or second episode of psychosis. The instrumental variables were interactions between randomized group assignment and four baseline variables: the log of the duration of untreated psychosis, years of education, baseline symptoms, and treatment site (three sites were included in the study). The instrumental variable analysis supported a potential causal role for the alliance interacting with number of sessions in relation to 18-month follow-up outcomes. The authors concluded that attending more sessions resulted in significantly better outcomes when there was a positive alliance, but when there was a poor alliance, more sessions were detrimental to outcome. Though these findings are important, the authors were not able to model therapist vs. patient level effects. In addition, the strength of the relation between the instruments and the predictor (alliance by number of sessions) was not reported. Further, the psychotherapy treatment was very brief (6 sessions maximum), and 45% of treated patients failed to provide an alliance score.

The purpose of the current article is to illustrate the use of causal inference analysis in the context of a multilevel analysis of the alliance-outcome relation. Our hope is to introduce psychotherapy researchers to causal inference methods and to advance the study of the alliance-outcome relation through raising important issues regarding whether or not the relation is spurious. We first applied standard multilevel (non-causal) statistical methods and, then, conducted multilevel causal analyses using an instrumental variables approach. By implementing the causal inference approach, we attempted to rule out the influence of unmeasured confounding variables on the alliance-outcome association at the patient and therapist levels.

Method

Overview

The dataset that was used to evaluate the potential causal influence of the alliance on outcome was chosen because it had a relatively large sample size (a feature needed for causal inference analyses), and both patient and therapist baseline variables that could be used as instruments at both levels in the multilevel causal inference approach. The dataset was from a large-scale non-inferiority trial comparing cognitive therapy (CT) and dynamic therapy (DT) for patients with major depressive disorder being treated at a community mental health center (Connolly Gibbons et al., 2016). The alliance in relation to outcome in this dataset has not been previously reported.

A brief description of additional methods used in the study is given below; further details are available in other published reports from the parent study (Connolly Gibbons et al., 2014; Connolly Gibbons et al., 2016).

Participants

The trial was conducted at an outpatient community mental health center (CMHC). Patients were recruited from those seeking services for depression at the CMHC. The Quick Inventory for Depressive Symptomatology (QIDS; Rush et al., 2003) was completed by all adult patients attending an intake assessment at the clinic. Patients ages 18 to 65 who scored at least 11 on the QIDS were screened and, if eligible, scheduled for a research baseline assessment. A trained research clinical evaluator (advanced graduate students) blind to the study design then conducted the Structured Clinical Interview for the DSM-IV Axis I Disorders Interview (SCID; First, Spitzer, Gibbon, & Williams, 1997), as well as the interview for the Hamilton Depression Rating Scale (HAM-D; Hamilton, 1960). Patients who met criteria for major depressive disorder (MDD) were included in the study.

Clinicians employed at the CMHC were recruited to participate as study therapists. All clinicians had a master’s degree or above. Clinicians were matched to treatment based on previous training and education, theoretical orientation, and desire to be trained in a given treatment. Clinicians were predominantly female and Caucasian and had an average age of 41 years. A total of nine cognitive therapists and 11 psychodynamic therapists treated at least one randomized patient in the original Connolly Gibbons et al. (2016) study. Of these 20, 18 therapists who had alliance scores and outcome data for at least one patient were included in the current analyses. Eleven of the 18 therapists self-identified their religion as Christian. On average, these 18 therapists treated 8.9 patients each in the study.

Treatments

Patients were randomized to either 16 sessions of manual-guided DT or CT. The DT consisted of supportive-expressive DT (Book, 1998; Luborsky, 1984). The treatment includes supportive techniques to build a positive working alliance and expressive techniques to help patients gain self-understanding of their repetitive maladaptive relationship patterns. The treatment is an active, focused exploration of current relationship conflicts, including an explicit socialization to treatment and a focus on specific interpersonal goals.

The CT (Beck, 1995; Beck, Rush, Shaw, & Emery, 1979) consisted of structured sessions focused on behavioral activation and the exploration of negative thought patterns. Specific interventions included activity scheduling, evaluating automatic thoughts, and behavioral experiments. Treatment progressed to include exploration of underlying beliefs and attitudes.

The training and supervision was provided by expert supervisors with substantial experience delivering the respective treatments in clinical practice, delivering the treatment as part of research study protocols, and in supervising delivery of the treatment. Training consisted of a workshop followed by intensive individual supervision across each clinician’s first three training cases, during which clinicians received one hour of individual supervision for every two hours of delivered therapy. Supervision continued throughout the protocol’s randomization phase with ongoing bimonthly group supervisions.

Measures

The primary outcome measure was the 17-item HAM-D total score, administered at baseline and Months 1, 2, 4, and 5, to evaluate the severity of common symptoms of depression. The HAM-D was completed by applying the Structured Interview Guide to enhance reliability (Williams, 1988). A meta-analysis reports a Cronbach’s alpha of .79, as well as good inter-rater and test-retest reliability (Trajkovic et al., 2011). The internal consistency for the HAM-D at the month 5 assessment was adequate for the current sample (Cronbach’s α = .78).

The Working Alliance Inventory–Client (WAI-C) scale (Horvath & Greenberg, 1989) was used to assess the alliance in this project. The 12-item version was administered at sessions 2, 4, 6, and 8, but only the Session 2 and 4 measures were used in the current study due to (a) high levels of attrition from treatment, and (b) our desire to minimize the impact of reverse causation (outcome influencing the alliance) in later sessions. The total score of the WAI-C has been previously reported to have an internal consistency of .93 (Cronbach’s alpha coefficient). In the Connolly Gibbons et al. (2016) data, the WAI-C total score had an alpha of .89 using Session 2 data and .90 using Session 4 data. The correlation between Session 2 and Session 4 WAI-C total scores was .67; these two session scores were averaged for each patient for the analyses presented here (if one session WAI-C was missing, the other session score was used).
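The averaging rule in the preceding paragraph (use the mean of the two sessions, or the single available score if one is missing) can be expressed as a small helper. The function name `early_alliance` and the use of `None` to mark a missing score are assumptions made for this sketch.

```python
def early_alliance(session2, session4):
    """Mean of the available Session 2 and Session 4 WAI-C totals, or None if both are missing."""
    available = [score for score in (session2, session4) if score is not None]
    if not available:
        return None
    return sum(available) / len(available)


# Made-up scores for illustration only.
print(early_alliance(60, 70))      # both sessions present
print(early_alliance(None, 70))    # Session 2 missing, Session 4 used
print(early_alliance(None, None))  # no alliance score available
```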

After averaging Session 2 and 4 WAI-C total scores, two alliance variables were created to separate the patient-to-patient differences and therapist-to-therapist differences in the alliance-outcome relation. Differences between patients treated by the same therapist (i.e., the patient level alliance score) were quantified as each patient’s deviation from their respective therapist’s mean alliance score. Because the therapist’s mean is removed from each patient’s score, the resulting deviation scores are not influenced by therapist differences in average alliances and can therefore be used as the index of patient level alliance. The therapist level alliance score was the mean alliance score across all patients in each therapist’s caseload within the study.
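The separation into therapist level and patient level alliance scores can be sketched as follows. The record layout, identifiers, and scores are hypothetical, and the sign convention (patient score minus therapist mean) is an assumption of this sketch.

```python
from collections import defaultdict


def split_alliance(records):
    """Split alliance into therapist-level means and patient-level deviations.

    records: list of (therapist_id, patient_id, alliance_score) tuples.
    Returns a dict keyed by patient_id.
    """
    by_therapist = defaultdict(list)
    for therapist_id, _patient_id, score in records:
        by_therapist[therapist_id].append(score)
    # Therapist level: mean alliance across that therapist's caseload.
    therapist_mean = {t: sum(s) / len(s) for t, s in by_therapist.items()}
    result = {}
    for therapist_id, patient_id, score in records:
        mean = therapist_mean[therapist_id]
        result[patient_id] = {
            "therapist_level": mean,
            # Patient level: deviation from the patient's own therapist's mean.
            "patient_level": score - mean,
        }
    return result


# Made-up data for illustration only.
records = [("T1", "P1", 60.0), ("T1", "P2", 70.0), ("T2", "P3", 80.0)]
for patient_id, scores in split_alliance(records).items():
    print(patient_id, scores)
```

Within each therapist the deviations sum to zero, so the two variables carry non-overlapping information. A therapist with a single patient contributes a deviation of zero for that patient.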

Patient Baseline Variables Evaluated for Use as Instruments

We evaluated all available variables related to patient interpersonal problems and patient expectations about therapy as potential instruments. This consideration included the 32-item version of the Inventory of Interpersonal Problems (IIP; Barkham, Hardy, & Startup, 1996; Horowitz, Rosenberg, Baer, Ureño, & Villaseñor, 1988), scored for 8 subscales: hard to be sociable, hard to be assertive, too aggressive, too open, too caring, hard to be supportive, hard to be involved, and too dependent. Variables related to attitudes and expectations regarding therapy were available from the Attitudes and Expectations Questionnaire (Elkin et al., 1989). One set of questions asked the patient to rate, on a 1 to 7 scale, the extent to which each of the following would be helpful to learn in the treatment of their depression: (a) how to get along better with the important people in their life, (b) more realistic attitudes about themselves and the world, (c) how to stop the negative thoughts that interfere with functioning, (d) understand the patterns they have in their relationships, and (e) understand how the ways they respond to people can cause problems in relationships. Additional questions asked were “What is your attitude toward talking with a therapist/counselor as treatment for your problem?” and “Overall, how much improvement do you expect to experience as a result of treatment?” The Attitudes and Expectations questionnaire also included two items that touched on interpersonal issues: (a) the extent to which the patient believed that their problems are caused by issues in their relationships with the important people in their life, and (b) the extent to which the patient believed that their problems are caused by repeating the same mistakes in all of their relationships. All of these variables were retained as single items so that the unique contribution of each to the alliance could be evaluated.

Therapist Baseline Variables Evaluated for Use as Instruments

From a therapist questionnaire administered prior to participation in the Connolly Gibbons et al. (2016) study, we collected several variables related to therapist experience that could be evaluated as potential instruments. These included: (a) the length of time the therapist had worked at the CMHC, (b) the number of clinical hours per week the therapist worked, (c) the number of years practicing therapy since receiving their degree, and (d) whether or not the therapist had received previous training in an evidence-based therapy. In addition to general background experience variables, we evaluated variables related to the therapist’s experience in the study’s training phase and with working at the particular agency. These were (each rated on a 1 to 7 scale): (a) “How sensible does the treatment you have been practicing as a part of this study seem to you?” (b) “How confident are you that this treatment will help to decrease your patients’ depression?” (c) “How comfortable would you feel recommending this kind of treatment to another therapist?” (d) “How satisfied were you with the training you received as a part of this study?” (e) “How much did the training you received as a part of this study change your treatment approach?” (f) “How satisfied are you with your work environment at the agency?” (g) “How satisfied are you with your experience participating in this study?” (h) “After the conclusion of this study, will the treatment you have been trained in continue to be useful?”, and (i) “After the conclusion of this study, how likely are you to continue using the treatment with future clients?” We also examined therapist demographic variables as potential instruments, including age, race (coded as white vs. all others), Hispanic vs. not Hispanic, and religion (coded as Christian vs. 
another religion or no religion, with 62% of therapists self-identifying as Christian; sample sizes of more diverse religions were not large enough to be coded as a dichotomous variable). Finally, one question that related to therapist values was available to examine: “How important is religion in your life?” There was only one male therapist, so gender was not included in these analyses.

Statistical Analyses

Creation of instruments.

At the patient level, the full set of patient baseline variables was entered into a multiple regression analysis with backwards elimination set at a p value of .10. A p value of .10 was chosen to allow for a broad initial consideration of variables as potential instruments. The dependent variable in this analysis was the patient level alliance score. Potential variables identified in the backwards elimination were examined as predictors of outcome to determine whether they could meet the exclusion restriction for instruments (i.e., in our case, the instrument could only predict outcome through its association with the alliance). These analyses were conducted as longitudinal mixed effects models that examined the slope of HAM-D scores from Month 1 to Month 5 as the dependent variable. In separate analyses, each of the potential instrument variables was entered as a predictor of the slope of HAM-D change over Months 1 to 5, with the patient level alliance scores also in the model. After these analyses, only potential instrument variables that were non-significant predictors of outcome at p > .20 (to indicate emphatic non-significance) were retained for use (following the exclusion restriction).

To identify instruments at the therapist level, the limited sample size (N = 18 therapists) required a focus on effect sizes rather than on statistical significance for selection of variables. Initial analyses selected therapist variables based on a zero-order correlation of .30 or greater with therapist level alliance scores. Potential therapist-level instrument variables were then examined as therapist level predictors of the slope of HAM-D change over Months 1 to 5, using the full sample (N = 161) but including therapist as a clustering variable and therapist level alliance scores in the models. The variables that were unrelated to outcome (near zero association), with therapist level alliance scores in the model, were retained for use in the creation of the final therapist level instrument (following the exclusion restriction).

Standard non-causal analyses.

We first implemented a standard (non-causal), mixed effects, multilevel modeling approach, adjusting for the hierarchy of patient clustering within therapists with nested random effects to predict outcome from the patient level and therapist level alliance scores. The HAM-D total scores obtained at Months 1, 2, 4, and 5 served as the dependent variable. Time of each assessment was coded as weeks from baseline because of deviations from exact “monthly” intervals. Because there were no significant effects on outcome for treatment group by time, or treatment group by alliance by time, in any analyses, no term for treatment group was included in the models. This allowed for simplification of the models so that convergence could be obtained. The analyses were conducted using the SAS GLIMMIX procedure specifying an identity link (indicating a normal distribution of the dependent variable) and the use of robust standard errors. Robust standard errors address any potential heteroscedasticity that occurs if variability in the dependent variable (here, the HAM-D) is unequal across the range of values of the predictor variables (here, the patient and therapist level alliance variables).

Effect sizes (Cohen’s d, which was then converted to partial rs to compare to the literature) were derived from the F-test for the compound symmetry design created by the specification of therapist as a random effect, and were calculated as $d = 2\sqrt{F}/\sqrt{df}$, where F is the F-test statistic for the regression coefficient of the interaction term and df is its denominator degrees of freedom (Rosenthal & Rosnow, 1991). Denominator degrees of freedom for all models were based on the Containment approach as given in SAS output for the GLIMMIX procedure.
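The effect size calculation can be illustrated with a short sketch. It reads the formula above as d = 2√F/√df and assumes the common equal-group-size conversion r = d/√(d² + 4); the paper does not state which conversion was used, so that step is an assumption, although it reproduces the therapist level r of .174 reported in the Results for F(1, 314) = 9.8. The function names are hypothetical.

```python
import math


def cohens_d_from_f(f_stat, df_denom):
    """Cohen's d from an F test with 1 numerator df: d = 2*sqrt(F)/sqrt(df)."""
    return 2.0 * math.sqrt(f_stat) / math.sqrt(df_denom)


def r_from_d(d):
    """Convert d to r (assumed conversion: r = d / sqrt(d^2 + 4))."""
    return d / math.sqrt(d * d + 4.0)


# Therapist level alliance effect from the standard analysis: F(1, 314) = 9.8.
d_therapist = cohens_d_from_f(9.8, 314)
r_therapist = r_from_d(d_therapist)
print(round(d_therapist, 3), round(r_therapist, 3))
```

Under these assumptions the two steps collapse algebraically to r = √(F/(F + df)), which makes clear how strongly the resulting r depends on the denominator degrees of freedom.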

Causal inference analyses.

Causal inference analyses were conducted using the commonly employed two-stage approach, but extended to a multilevel model with a longitudinal outcome. In the first stage, the instruments were created at the patient and therapist level as described above. In the second stage, the parts of the alliance predicted separately by the patient and therapist level instruments were used as predictors of HAM-D outcomes in a mixed effects multilevel model with patients nested within therapists. The SAS GLIMMIX procedure was used for this second stage analysis. As with the standard analysis, we specified an identity link and robust standard errors.

The equation for this analysis is as follows:

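$$
Y_{ijt} = \beta_0 + \beta_1 \mathrm{PIV}_{ij} + \beta_2 \mathrm{TIV}_{j} + \beta_3 \mathrm{week}_{ijt} + \beta_4 \mathrm{PIV}_{ij} \times \mathrm{week}_{ijt} + \beta_5 \mathrm{TIV}_{j} \times \mathrm{week}_{ijt} + \underbrace{r_{1ij}}_{\text{random patient effect}} + \underbrace{r_{2j}}_{\text{random therapist effect}} + \underbrace{e_{ijt}}_{\text{random error effect}},
$$

Where:

$$
r_{1ij} \sim N(0, \sigma_p^2), \quad r_{2j} \sim N(0, \sigma_T^2), \quad e_{ijt} \sim N(0, \sigma_e^2)
$$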

with i indicating patients within therapist j at week = t. The PIV variable is the patient level (within-therapist) score created by predicting the patient level alliance variable from the patient instruments. The TIV variable is the therapist level score created by predicting the between-therapist alliance variable from the therapist level instruments.
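The logic of the two-stage approach can be illustrated with a deliberately simplified sketch. It is not the multilevel longitudinal model used in this study: it uses simulated single-level data, one instrument, and ordinary least squares at both stages, and every name and parameter value in it is hypothetical. Its only purpose is to show why regressing the outcome on the instrument-predicted exposure can remove bias from an unmeasured confound.

```python
import random
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) from a simple least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage(z, x, y):
    """Two-stage estimate of the effect of x on y using instrument z."""
    # Stage 1: predict the exposure (e.g., alliance) from the instrument.
    a, b = ols(z, x)
    x_hat = [a + b * zi for zi in z]
    # Stage 2: regress the outcome on the instrument-predicted exposure.
    _, slope = ols(x_hat, y)
    return slope


# Simulated data (all values hypothetical). u is an unmeasured confound that
# affects both exposure and outcome; z affects the outcome only through x.
rng = random.Random(0)
TRUE_EFFECT = -0.3
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [TRUE_EFFECT * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

_, naive_slope = ols(x, y)      # biased by the confound u
iv_slope = two_stage(z, x, y)   # should lie near TRUE_EFFECT
print(round(naive_slope, 2), round(iv_slope, 2))
```

Note that standard errors taken from a naive second-stage regression are generally not valid without correction, which is one reason dedicated routines or robust standard errors are used in practice.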

Results

Characteristics of Sample

Demographic and other characteristics of the sample (N = 161) with at least one alliance assessment and at least one outcome assessment at Months 1, 2, 4 or 5 are presented in Table 1.

Table 1.

Demographic Characteristics of Patients

Characteristics n=161
Gender, % Female 77.0
Marital Status
        % Single 54.0
        % Married/Cohabitating 18.0
        % Separated/Divorced 25.5
        % Widowed 2.5
Ethnicity, % Hispanic 3.7
Race
        % African-American 39.1
        % Caucasian 52.2
        % Other 8.7
Employment
        % Full-Time 6.2
        % Part-Time 6.2
        % Stay at Home Parent 6.8
        % Unemployed 55.3
        % Student 8.1
        % Disability 17.4
Highest level of education
        % < High School Diploma 22.4
        % High School Diploma/GED 31.7
        % Some College 37.3
        % College Graduate 4.3
        % Post-graduate or Professional degree 4.3
Age, years m (sd) 37.5 (12.1)

The mean (SD) treatment duration was 8.8 (4.7) sessions in the sample of 161 patients.

We calculated intraclass correlation coefficients (ICC) to examine therapist effects on alliance (average of Sessions 2 and 4 on the WAI) and outcome (HAM-D scores over time). These ICCs were estimated from variance components derived from mixed effect models with therapist specified as a random effect. For the alliance, the ICC for therapist differences was .11; for outcome it was .02.
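As a reminder of how an ICC follows from variance components, the sketch below computes the proportion of total variance attributable to therapists. The variance components shown are hypothetical values chosen only so that the result equals the reported alliance ICC of .11; the actual components are not given in the text.

```python
def therapist_icc(var_therapist, *other_variances):
    """Proportion of total variance attributable to therapists.

    other_variances: remaining variance components in the model, e.g. a
    residual variance, or patient and residual variances in a 3-level model.
    """
    total = var_therapist + sum(other_variances)
    return var_therapist / total


# Hypothetical components (illustration only), scaled to sum to 1.
icc_alliance = therapist_icc(0.11, 0.89)
print(round(icc_alliance, 2))
```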

Creation of Instruments

At the patient level, an initial set of 5 variables passed the backwards elimination criterion of .10 for retention as predictors of patient level alliance. However, in analyses predicting outcome, two of these variables predicted the slope of HAM-D change in the context of the patient level alliance: the belief that learning more realistic attitudes about myself and the world would be helpful, F(1, 338) = 4.8, p = .029, and overall expectations for improvement, F(1, 338) = 3.5, p = .06. These two variables were therefore dropped because they failed to meet the exclusion restriction. The relations of the other three variables to outcome, in the context of patient level alliance, were as follows: (a) the extent to which the patient believes that their depression was caused by repeating mistakes in relationships, F(1, 346) = .40, p = .53; (b) having a negative attitude about talking to a therapist, F(1, 351) = .48, p = .49; and (c) the IIP-32 subscale “hard to be supportive,” F(1, 360) = 1.29, p = .26. The regression model predicting the patient level alliance score from these three measures yielded a multiple R of .41 (F(3, 155) = 10.4, p < .001), indicating that the composite of these three variables would serve as an adequate instrument, given that these variables were not related to outcome when alliance was included in a regression model predicting outcome. Standardized regression coefficients for the individual instrument variables predicting alliance were as follows: repeating relationship mistakes, β = .26 (p = .001); attitude towards talking with a therapist, β = −.24 (p = .002); and IIP-32 hard to be supportive, β = −.154 (p = .039). Controlling for therapist differences in average alliance, patients who had more positive alliances endorsed the belief that the cause of their depression was repeating relationship mistakes, had a positive attitude about talking to a therapist, and did not find it hard to be supportive to others. 
We used these three variables in a multiple regression with patient level alliance as the dependent variable and outputted model-based predicted patient level alliance scores to use in the final causal inference analysis predicting outcome.
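The two numerical checks in this paragraph can be retraced with a brief sketch. The p values and the multiple R are taken from the text above; the function names and short variable labels are hypothetical, and the F formula is the standard relation between R² and the overall regression F test.

```python
def passes_exclusion_screen(p_value, threshold=0.20):
    """Retain a candidate instrument only if it does not predict outcome."""
    return p_value > threshold


def f_from_multiple_r(multiple_r, n_predictors, df_resid):
    """Overall regression F implied by a multiple R."""
    r2 = multiple_r ** 2
    return (r2 / n_predictors) / ((1.0 - r2) / df_resid)


# p values for each candidate predicting HAM-D slope, alliance in the model.
outcome_p = {
    "realistic attitudes would be helpful": 0.029,
    "expected improvement": 0.06,
    "repeating relationship mistakes": 0.53,
    "attitude toward talking with a therapist": 0.49,
    "IIP-32 hard to be supportive": 0.26,
}
retained = [name for name, p in outcome_p.items() if passes_exclusion_screen(p)]

# First-stage strength: multiple R = .41 with 3 predictors and 155 residual df.
first_stage_f = f_from_multiple_r(0.41, 3, 155)
print(retained, round(first_stage_f, 1))
```

The implied F agrees with the reported F(3, 155) = 10.4 to within rounding of R.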

Among the therapist variables, there were six variables that demonstrated a zero-order correlation of greater than .30 with therapist level (N = 18) alliance: (a) importance of religion in the therapist’s life (r = .41), (b) Christian religion vs. other religion or no religion (r = .37), (c) length of time the therapist had been employed at the agency (r = .47), (d) “How comfortable would you feel recommending this kind of treatment to another therapist?” (r = .31), (e) “How likely are you to continue using the treatment with future clients?” (r = .31), and (f) “How much did the training you received as part of the study change your treatment approach?” (r = .43). Analyses in the full sample (N = 161, clustering by therapist) predicting HAM-D outcomes revealed that some of the variables predicted outcome in the context of the therapist level alliance scores (i.e., failed the exclusion restriction) to a relatively stronger degree than other variables. To balance the need to meet the exclusion restriction against the aim of retaining as many variables as possible, we excluded the three variables with the strongest associations with outcome when alliance was in the model. These were importance of religion (r = .08), “How likely are you to continue using the treatment with future clients?” (r = .04), and “How much did the training you received as part of the study change your treatment approach?” (r = .07). The other three variables had less association with outcome in the context of the alliance: length of time working at agency, r = .01; “How comfortable would you feel recommending this kind of treatment to another therapist?”, r = .01; and Christian religion, r = .03. We therefore used these three variables in a multiple regression with therapist level alliance as the dependent variable and outputted model-based predicted therapist level alliance scores to use in the final causal inference analysis predicting outcome. 
The effect size (multiple R) for the relation of the three baseline variables to alliance at the therapist level was.56. Therapists who had better alliances on average tended to have worked longer at the agency, were more comfortable recommending the treatment to another therapist, and were more likely to be Christian.
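The therapist level selection rule, which ranks candidates by the size of their association with outcome rather than by significance, can be sketched as follows. The correlations are those reported above; the short labels are hypothetical shorthand for the questionnaire items.

```python
# Association of each candidate with HAM-D outcome, alliance in the model.
outcome_r = {
    "importance of religion": 0.08,
    "likely to continue using treatment": 0.04,
    "training changed approach": 0.07,
    "length of time at agency": 0.01,
    "comfortable recommending treatment": 0.01,
    "Christian religion": 0.03,
}

# Rank by absolute association with outcome; keep the three weakest.
ranked = sorted(outcome_r, key=lambda name: abs(outcome_r[name]))
retained, dropped = ranked[:3], ranked[3:]
print(retained)
```

A rank-based cut of this kind is relative rather than absolute: it guarantees that three variables are retained, but not that their association with outcome is exactly zero.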

Relation of Alliance to Outcome: Multilevel Standard Analyses

Standard (non-causal) multilevel model results for examining the alliance (average of Sessions 2 and 4) predicting subsequent change in the HAM-D (slope of change from Month 1 to 5) revealed a statistically significant relationship of the alliance to outcome at the therapist level, F(1, 314) = 9.8, p = .0019, r = .174, but not the patient level, F(1, 314) = 1.4, p = .23, r = .069. The unpartitioned association of the alliance with subsequent HAM-D change was r = −.130 (p = .012).

Relation of Alliance to Outcome: Multilevel Causal Inference Analyses

The multilevel mixed effects analyses predicting HAM-D slope from Month 1 to 5 from the patient and therapist level instruments yielded results consistent with the standard (non-causal) analyses (Table 2). A statistically significant effect, F(1, 310) = 5.07, p = .025, r = .127, was evident at the therapist level but not the patient level, F(1, 310) = .25, p = .62, r = .028.

Table 2.

Results of Multilevel Instrumental Variables Analyses Predicting Slope of HAM-D Change from Month 1 to 5

Parameter | Parameter Estimate | SE | DF | F | p
Intercept | 19.46 | .54 | 1, 16 | 1280.2 | <.0001
Time | −.144 | .038 | 1, 310 | 14.5 | .0002
Patient Level Instrument-Predicted Alliance | .037 | .47 | 1, 310 | .01 | .94
Therapist Level Instrument-Predicted Alliance | −.319 | .346 | 1, 310 | .85 | .36
Patient Level Instrument-Predicted Alliance By Time | −.024 | .047 | 1, 310 | .25 | .62
Therapist Level Instrument-Predicted Alliance By Time | −.065 | .029 | 1, 310 | 5.07 | .025

Note. HAM-D = 17-item Hamilton Rating Scale for Depression total score.
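To make the interaction terms in Table 2 concrete, the sketch below plugs the fixed effect estimates into the model equation to obtain predicted HAM-D trajectories. It assumes that time is in weeks from baseline and that both instrument-predicted alliance scores are centered at zero; the scaling of those scores is not stated in the text, so "one unit" here is illustrative only. Random effects are ignored and the function names are hypothetical.

```python
# Fixed effect estimates from Table 2.
COEF = {
    "intercept": 19.46,
    "week": -0.144,
    "piv": 0.037,
    "tiv": -0.319,
    "piv_x_week": -0.024,
    "tiv_x_week": -0.065,
}


def predicted_hamd(week, piv=0.0, tiv=0.0, coef=COEF):
    """Model-implied HAM-D at a given week (fixed effects only)."""
    return (coef["intercept"]
            + coef["week"] * week
            + coef["piv"] * piv
            + coef["tiv"] * tiv
            + coef["piv_x_week"] * piv * week
            + coef["tiv_x_week"] * tiv * week)


def weekly_slope(piv=0.0, tiv=0.0, coef=COEF):
    """Model-implied weekly change in HAM-D for given instrument scores."""
    return coef["week"] + coef["piv_x_week"] * piv + coef["tiv_x_week"] * tiv


# Compare a therapist at the mean with one a unit higher on the instrument.
slope_average = weekly_slope(tiv=0.0)
slope_higher = weekly_slope(tiv=1.0)
gap_week_20 = predicted_hamd(20, tiv=1.0) - predicted_hamd(20, tiv=0.0)
print(slope_average, slope_higher, gap_week_20)
```

Under these assumptions, a more negative weekly slope for therapists with higher instrument-predicted alliance corresponds to faster symptom reduction, which is what the significant therapist level by time term reflects.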

Discussion

This article had two goals: (a) to present an introduction to causal inference models, and (b) to apply a causal inference statistical model to the alliance-outcome association. Regarding the first goal, it should be acknowledged that our presentation consisted of a brief overview designed to provide an initial orientation to these approaches. The topic of causal inference in statistics is broad. Consultation of relevant textbooks and articles is recommended for a more in depth understanding (see, for example: Angrist, Imbens, & Rubin, 1996; Bound, Jaeger, & Baker, 1995; Lousdal, 2018; Morgan & Winship, 2014; Pearl, 2009; Robins, 1994; Rubin, 1974). However, we hope that our brief overview of causal inference methods has highlighted the importance of attempting to understand whether associations between variables are causal or not, and has provided an initial starting point for psychotherapy researchers who might be interested in using such methods.

The results of applying a causal inference approach (method of instrumental variables) to the alliance-outcome association provided evidence for a causal role for therapist level alliance in producing relatively better treatment outcomes. This suggests that individual differences between therapists in their abilities to form, on average, positive alliances with patients are a contributing factor leading to positive outcomes. As such, the current findings support the results of previous studies using non-causal statistical methods that also found therapist level associations between the alliance and outcome (Baldwin et al., 2007; Crits-Christoph et al., 2009; Dinger et al., 2008; Marcus et al., 2011; Zuroff et al., 2010).

It is important to stress that valid interpretation of causality using instrumental variables analysis rests on a number of assumptions. Some of these assumptions, such as the exclusion restriction (in our case, lack of association between the instruments and outcome, except through the alliance), can be evaluated and met. Other assumptions, such as the exchangeability assumption (i.e., the instrument does not share common causes with the outcome), cannot be tested. If, for example, the causes of “baseline attitude towards talking with a therapist” also independently cause better outcome (separate from their causal influence through “baseline attitude toward talking with a therapist”), then confounding has simply been moved from the alliance to the instrument. Additional concerns about the use of the instrumental variable approach are the strength of the instruments and finite sample size bias. These two factors interact (i.e., weak instruments are particularly problematic within smaller sample sizes) (Crown, Henk, & Vanness, 2011). Because of these limitations, it is not appropriate to claim that the current analyses definitely “prove” causality of the alliance in relation to outcome. We would, however, suggest that the analyses reported here, in addition to providing an illustration of instrumental variable analysis, move the ball down the field a bit further than standard (non-causal) analyses in terms of evidence supporting a causal role for the alliance. Or, at the very least, our approach forges a path for future studies to further examine a potential causal role for the alliance. 
To address the limitations of the current study, the ideal study of the alliance-outcome relation would (a) include a larger sample size of both patients and therapists (and more patients per therapist), (b) obtain more assessments of the alliance, and (c) build upon theory and empirical findings of the causes of the alliance in order to measure variables that likely would be stronger instruments.

It is also important that the instruments selected be plausible and relevant. We believe that the three therapist level variables used as instruments in this study are plausible causes of therapist level alliance. Therapists who have worked for a longer period of time at the community mental health agency may have developed better skills at establishing an alliance with the types of patients who receive treatment at the agency. Therapists who self-identified as Christian (vs. another religion or no religion) may, as part of their value system, be particularly empathic and willing to create a bond with individuals who are less fortunate; in our experience, some of these therapists have chosen to work in the community mental health setting based on their values towards helping disadvantaged populations. Those therapists who feel comfortable recommending the type of treatment they delivered in the study to another therapist are also likely to be positive about their treatment modality, and this carries over to a motivation to emotionally connect with their patients and/or achieve agreement on goals and tasks of treatment.

Although a causal effect was suggested here, critics of the role of the alliance might argue that the size of the effect was relatively small (r = .127) and therefore the alliance is relatively unimportant. However, there are several issues to consider when evaluating the reported size of the effect. First, the size of the causal effect is constrained by the particular instruments used. Thus, the effect found here is not necessarily the full causal effect of the alliance on outcome. Rather, the therapist level causal effect reported here is the causal effect of that part of the alliance that is caused by the three therapist variables used as instruments. Second, we predicted outcomes from Month 1 to 5 because of constraints on when the alliance and outcome were assessed and the desire to predict subsequent change in symptoms. Given that much improvement typically occurs prior to Month 1, more robust effects of the causal role of the alliance might be obtained by measuring the alliance and outcome earlier so that the impact of the alliance on this early change can be included in the statistical modeling. Third, we measured the alliance as an average of two sessions. Previous work has shown that the size of the alliance-outcome relation increases substantially as the alliance is assessed at more sessions (Crits-Christoph, Connolly Gibbons, et al., 2011). The typical session to session variability in the alliance suggests that averaging over two sessions is only a marginally acceptable basis for creating a stable estimate of the typical level of the alliance for a given patient. Moreover, the average level of alliance may be only one facet of the alliance that plays a causal role in outcome. 
Other studies have suggested that of particular importance are the pattern of change in the alliance over therapy (e.g., Kivlighan & Shaughnessy, 2000; Stiles et al., 2004), the resolution of alliance ruptures (e.g., Daly, Llewelyn, McDougall, & Chanen, 2010; Muran et al., 2009), and the differential impact of specific dimensions of the alliance (e.g., Webb et al., 2011). Finally, it should also be mentioned that there are alternative ways of calculating effect sizes in the context of mixed effects multilevel designs that yield highly discrepant results (Feingold, 2009). The different methods are a function of different ways of estimating the degrees of freedom for the relevant terms in such models. The approach taken here relied on the Containment approach incorporated in SAS. Using estimates for degrees of freedom for the therapist level predictor that are closer to the number of therapists in the study (N = 18) would yield substantially higher effect sizes once formulas for translating the obtained F to r are implemented. Regardless of which effect size is more appropriate, given the limitations of assessment instruments (less than perfect reliability), the complexity of change over the course of psychotherapy, restriction in range in process measures, and variability between patients in the degree to which they respond to a given element of psychotherapy process, it is surprising that even relatively small process-outcome associations can be detected (DeRubeis, Gelfand, German, Fournier, & Forand, 2014).
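The sensitivity of the effect size to the choice of degrees of freedom can be shown with a short worked calculation. It assumes the conversion r = √(F/(F + df)), which is not stated in the text but reproduces the reported r of .127 for F(1, 310) = 5.07. The alternative value df = 16 is purely illustrative; it is the denominator df shown for the intercept in Table 2 and is used here only as an example of a value close to the number of therapists.

```latex
r = \sqrt{\frac{F}{F + df}}, \qquad
r_{df = 310} = \sqrt{\frac{5.07}{5.07 + 310}} \approx .127, \qquad
r_{df = 16} = \sqrt{\frac{5.07}{5.07 + 16}} \approx .49
```

Under these assumptions, the same F statistic corresponds to an r nearly four times larger when the therapist level df is used.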

But what does it mean to say that the alliance causes positive outcomes? Under the counterfactual model that guides causal inference testing (Rubin, 1974), causality would reduce theoretically to this: if one individual is exposed to a high alliance therapy, and an identical individual is exposed to a lower alliance therapy, all other things being identical, the individual exposed to the higher alliance therapy would have relatively better outcomes. Thus, the definition of causality does not mean that there are no intervening variables between the alliance and outcome, nor does it mean that there are no other causes of improvement. Both of these are highly likely. A positive alliance may permit a therapist to more effectively help a patient understand interpersonal conflicts, persuade a patient to engage in new behaviors, or reevaluate negative cognitive strategies. Multiple patient, therapist, dyad, and setting variables also contribute to outcome. Each of the factors potentially associated with treatment outcome can in theory be examined using causal inference methods as a supplement to standard non-causal statistical methods.

As mentioned, there are a variety of issues related to measurement, assumptions of statistical methods, sampling of patients and therapists, the nature of the setting (community mental health center), and attrition that place limits on the generalizability of our findings. Given these issues and the empirical selection (guided by previous research) of variables to be used as instruments, the results presented herein are unlikely to be broadly generalizable. It may well be that the alliance does not play a causal role within treatments in which the techniques might have a stronger influence on outcome (e.g., exposure therapy for certain anxiety disorders). Further, as we have discussed elsewhere (Crits-Christoph, Connolly Gibbons, et al., 2011), the ability of a therapist to provide a caring and empathic therapeutic relationship that facilitates the alliance may be especially important for patients with depressive disorders who are isolated from others and/or have low self-esteem. In patients with other disorders, a positive alliance might be more a function of prior symptomatic improvement rather than playing a causal role in subsequent symptom improvement.

Nonetheless, the findings of the current study serve as a starting point, and challenge, to psychotherapy researchers to begin to identify whether the alliance has a causal effect on treatment outcome. The nearly 300 studies documenting an association between the alliance and outcome are more than sufficient to support the hypothesis that these variables are correlated. What is needed moving forward is to identify the extent to which the empirical association is causal. As part of such a program of research, identification of the causal influences on the alliance, and the variables that mediate the connection between the alliance and outcome, should also be delineated. If research can help support strong theories about the causes of the alliance, and the causal factors intervening between the alliance and outcome, more robust and generalizable findings might emerge when applying causal inference approaches, such as the method of instrumental variables used here, to the investigation of the alliance-outcome association.

Acknowledgments

Research reported in this publication was supported by the Agency for Healthcare Research and Quality under award number R01HS018440, award number K02HS022124, and National Institute of Mental Health award number R01-MH092363. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality or the National Institute of Mental Health.

Contributor Information

Paul Crits-Christoph, University of Pennsylvania.

Robert Gallop, West Chester University.

Averi Gaines, University of Pennsylvania.

Agnes Rieger, University of Pennsylvania.

Mary Beth Connolly Gibbons, University of Pennsylvania.

References

  1. Abouguendia M,Joyce AS,Piper WE, &Ogrodniczuk JS(2004).Alliance as a mediator of expectancy effects in short-term group psychotherapy.Group Dynamics: Theory, Research, and Practice,8(1),3–12. doi: 10.1037/1089-2699.8.1.3 [DOI] [Google Scholar]
  2. Ackerman SJ, & Hilsenroth MJ (2003). A review of therapist characteristics and techniques positively impacting the therapeutic alliance. Clinical Psychology Review, 23, 1–33. doi: 10.1016/S0272-7358(02)00146-0 [DOI] [PubMed] [Google Scholar]
  3. Angrist JD, & Kruger AB (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15(4), 69–85. doi: 10.3386/w8456 [DOI] [Google Scholar]
  4. Baldwin SA, Wampold BE, & Imel ZE (2007). Untangling the alliance-outcome correlation: Exploring the relative importance of therapist and patient variability in the alliance. Journal of Consulting and Clinical Psychology, 75(6), 842–852. doi: 10.1037/0022-006X.75.6.842 [DOI] [PubMed] [Google Scholar]
  5. Barber JP, Connolly MB, Crits-Christoph P, Gladis M, & Siqueland L (2000). Alliance predicts patients’ outcome beyond in-treatment change in symptoms. Journal of Consulting and Clinical Psychology, 68(6), 1027–1032. doi: 10.1037/0022-006X.68.6.1027 [DOI] [PubMed] [Google Scholar]
  6. Barber JP, Luborsky L, Crits-Christoph P Thase M, Weiss R, Frank A, … Gallop R (1999). Therapeutic alliance as a predictor of outcome in treatment of cocaine dependence. Psychotherapy Research, 9(1), 54–73. doi: 10.1080/10503309912331332591 [DOI] [Google Scholar]
  7. Barkham M, Hardy GE, & Startup M (1996). The IIP-32: A short version of the Inventory of Interpersonal Problems. British Journal of Clinical Psychology, 35(1), 21–35. doi: 10.1111/j.2044-8260.1996.tb01159.x [DOI] [PubMed] [Google Scholar]
  8. Beck AT, Rush AJ, Shaw BF, & Emery G (1979). Cognitive therapy of depression. New York, NY: Guilford Press. [Google Scholar]
  9. Beck JS (1995). Cognitive therapy: Basics and beyond. New York, NY: Guilford Press. [Google Scholar]
  10. Benjamin LS & Critchfield KL (2010). An interpersonal perspective on therapy alliances and techniques In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based approach to practice and training (pp. 123–149). New York, NY: Guilford Press. [Google Scholar]
  11. Robinson GK (1991). That BLUP is a good thing. Statistical Science 6, 15–51. [Google Scholar]
  12. Book HE (1998). How to practice brief dynamic psychotherapy: The CCRT method. Washington, DC: American Psychological Association. [Google Scholar]
  13. Bordin ES (1979). The generalizability of the psycho-analytic concept of the working alliance. Psychotherapy: Theory, Research, and Practice, 16(3), 252–260. doi: 10.1037/h0085885. [DOI] [Google Scholar]
  14. Bound J, Jaeger DA, & Baker RM (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, 90(430), 443–450. doi: 10.1080/01621459.1995.10476536 [DOI] [Google Scholar]
  15. Brannick MT (1995). Critical comments on applying covariance structure modeling. Journal of Organizational Behavior, 16(3), 201–213. doi: 10.1002/job.4030160303 [DOI] [Google Scholar]
  16. Castonguay LG, Constantino MJ, McAleavey AA, & Goldfried MR (2010). The therapeutic alliance in cognitive-behavioral therapy In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based approach to practice and training (pp. 150–172). New York, NY: Guilford Press. [Google Scholar]
  17. Connolly Gibbons MB, Gallop R, Thompson D, Luther D, Crits-Christoph K, Jacobs, … Crits-Christoph P (2016). Comparative effectiveness of cognitive and dynamic therapies for major depressive disorder in a community mental health setting: A randomized non-inferiority trial. JAMA Psychiatry, 73(9), 904–911. doi: 10.1001/jamapsychiatry.2016.1720 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Connolly Gibbons MB, Mack R, Lee J, Gallop R, Thompson D, Burock D, & Crits-Christoph P (2014). Comparative effectiveness of cognitive and dynamic therapies for major depressive disorder in a community mental health setting: Study protocol for a randomized non-inferiority trial. BMC Psychology, 2(1), 47. doi: 10.1186/s40359-014-0047-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Connolly Gibbons MB,Crits-Christoph P,de la Cruz C,Barber JP,Siqueland L, &Gladis M(2003).Pretreatment expectations, interpersonal functioning, and symptoms in the prediction of the therapeutic alliance across supportive-expressive psychotherapy and cognitive therapy.Psychotherapy Research,13(1),59–76. doi: 10.1093/ptr/kpg007 [DOI] [PubMed] [Google Scholar]
  20. Constantino MJ, Coyne AE, Luukko EK, Newkirk K, Bernecker SL, Ravitz P, & McBride C (2017). Therapeutic alliance, subsequent change, and moderators of the alliance–outcome association in interpersonal psychotherapy for depression. Psychotherapy, 54(2), 125–135. doi: 10.1037/pst0000101 [DOI] [PubMed] [Google Scholar]
  21. Constantino MJ, Arnow BA, Blasey C, & Agras W (2005). The association between patient characteristics and the therapeutic alliance in cognitive-behavioral and interpersonal therapy for bulimia nervosa. Journal of Consulting and Clinical Psychology, 73(2), 203–211. doi: 10.1037/0022-006X.73.2.203 [DOI] [PubMed] [Google Scholar]
  22. Crits-Christoph P, Connolly Gibbons MB, & Hearon B (2006). Does the alliance cause good outcome? Recommendations for future research on the alliance. Psychotherapy: Theory, Research, Practice, Training, 43(3), 280–285. doi: 10.1037/0033-3204.43.3.280 [DOI] [PubMed] [Google Scholar]
  23. Crits-Christoph P, Connolly Gibbons MB, & Mukherjee D (2013). Psychotherapy process-outcome research In Lambert M (Ed.), Handbook of psychotherapy and behavior change (6th ed., pp. 298–339). New York, NY: John Wiley and Sons. [Google Scholar]
  24. Crits-Christoph P, Connolly Gibbons MB, Hamilton J, Ring-Kurtz S, & Gallop R (2011). The dependability of alliance assessments: The alliance-outcome correlation is larger than you might think. Journal of Consulting and Clinical Psychology, 79(3), 267–278. doi: 10.1037/a0023668 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Crits-Christoph P, Gallop R, Temes CM, Woody G, Ball SA, Martino S, & Carroll KM (2009). The alliance in motivational enhancement therapy and counseling as usual for substance use problems. Journal of Consulting and Clinical Psychology, 77(6), 1125–1135. doi: 10.1037/a0017045 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Crits-Christoph P, Hamilton J, Ring-Kurtz S, Gallop R, McClure B, Kulaga A, & Rotrosen R (2011). Program, counselor, and patient variability in the alliance: A multilevel study of the alliance in relation to substance use outcomes. Journal of Substance Abuse Treatment, 40(4), 405–413. doi: 10.1016/j.jsat.2011.01.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Crown WH, Henk HJ, & Vanness DV (2011). Some cautions on the use of instrumental variables estimators in outcomes research: How bias in instrumental variables estimators is affected by instrument strength, instrument contamination, and sample size. Value in Health, 14, 1078–1084. doi: 10.1016/j.jval.2011.06.009 [DOI] [PubMed] [Google Scholar]
  28. Daly AM, Llewelyn S, McDougall E, & Chanen AM (2010). Rupture resolution in cognitive analytic therapy for adolescents with borderline personality disorder. Psychology and Psychotherapy: Theory, Research and Practice, 83, 273–288. doi: [DOI] [PubMed] [Google Scholar]
  29. DeRubeis RJ, & Feeley M (1990). Determinants of change in cognitive therapy for depression. Cognitive Therapy and Research, 14(5), 469–482. doi: 10.1007/BF01172968 [DOI] [Google Scholar]
  30. DeRubeis RJ, Brotman MA, & Gibbons CJ (2005). A conceptual and methodological analysis of the nonspecifics argument. Clinical Psychology: Science & Practice, 12(2), 174–183. doi: 10.1093/clipsy/bpi022 [DOI] [Google Scholar]
  31. DeRubeis RJ, Gelfand LA, German RE, Fournier JC, & Forand NR (2014). Understanding processes of change: How some patients reveal more than others - and some groups of therapists less - about what matters in psychotherapy. Psychotherapy Research, 24(3), 419–428. doi: 10.1080/10503307.2013.838654 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Dinger U, Strack M, Leichsenring F, Wilmers F, & Schauenburg H (2008). Therapist effects on outcome and alliance in inpatient psychotherapy. Journal of Clinical Psychology, 64(3), 344–354. doi: 10.1002/jclp.20443 [DOI] [PubMed] [Google Scholar]
  33. Dunkle JH, & Friedlander ML (1996). Contribution of therapist experience and personal characteristics to the working alliance. Journal of Counseling Psychology, 43(4), 456–460. doi: 10.1037/0022-0167.43.4.456 [DOI] [Google Scholar]
  34. Earle CC, Tsai JS, Gelber RD, Weinstein MC, Neumann PJ, & Weeks JC (2001). Effectiveness of chemotherapy for advanced lung cancer in the elderly: Instrumental variable and propensity analysis. Journal of Clinical Oncology, 19, 1064–1070. doi: 10.1200/JCO.2001.19.4.1064 [DOI] [PubMed] [Google Scholar]
  35. Elkin I, Shea MT, Watkins JT, Imber SD, Sotsky JF, Collins DR, … Parloff MB (1989). National Institute of Mental Health Treatment of Depression Collaborative Research Program: General effectiveness of treatments. Archives of General Psychiatry, 46(11), 971–982. doi: 10.1001/archpsyc.1989.01810110013002 [DOI] [PubMed] [Google Scholar]
  36. Escudero V, Heatherington L, & Friedlander ML (2010). Therapeutic alliances and alliance building in family therapy In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based guide to practice (pp. 240–262). New York, NY: Guilford Press. [Google Scholar]
  37. Falkenström F, Granström F, & Holmqvist R (2013). Therapeutic alliance predicts symptomatic improvement session by session. Journal of Counseling Psychology, 60(3), 317–328. doi: 10.1037/a0032258 [DOI] [PubMed] [Google Scholar]
  38. Falkenström F, Granström F, & Holmqvist R (2014). Working alliance predicts psychotherapy outcome even while controlling for prior symptom improvement. Psychotherapy Research, 24(2), 146–159. doi: 10.1080/10503307.2013.847985 [DOI] [PubMed] [Google Scholar]
  39. Feeley M, DeRubeis RJ, & Gelfand LA (1999). The temporal relation of adherence and alliance to symptom change in cognitive therapy for depression. Journal of Consulting and Clinical Psychology, 67(4), 578–582. doi: 10.1037/0022-006X.67.4.578 [DOI] [PubMed] [Google Scholar]
  40. Feingold A (2009). Effect sizes for growth-modeling analysis for controlled clinical trials in the same metric as for classical analysis. Psychological Methods, 14(1), 43–53. doi: 10.1037/a0014699 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. First MB, Spitzer RL, Gibbon M, & Williams JB (1997). Structured clinical interview for DSM-IV axis I disorders (SCID-I), Clinical Version. Washington, DC: American Psychiatric Association. [Google Scholar]
  42. Flückiger C, Del Re AC, Wampold BE, & Horvath AO (2018). The alliance in adult psychotherapy: A meta-analytic synthesis. Psychotherapy. Advance online publication. doi: 10.1037/pst0000172 [DOI] [PubMed] [Google Scholar]
  43. Gallop R, Small D, Lin JY, Elliott MR, Joffe M, & Ten Have TR (2009). Mediation analysis with principal stratification. Statistics in Medicine, 28(7), 1108–1130. doi: 10.1002/sim.3533 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Gaudiano BA, & Miller IW (2006). Patients’ expectancies, the alliance in pharmacotherapy, and treatment outcomes in bipolar disorder. Journal of Consulting and Clinical Psychology, 74(4), 671–676. doi: 10.1037/0022-006X.74.4.671 [DOI] [PubMed] [Google Scholar]
  45. Goldsmith LP, Lewis SW, Dunn G, & Bentall RP (2015). Psychological treatments for early psychosis can be beneficial or harmful, depending on the therapeutic alliance: An instrumental variable analysis. Psychological Medicine, 45(11), 2365–2373. doi: 10.1017/S003329171500032X [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Greenland S (2000). An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology, 29, 722–729. doi: 10.1093/ije/29.4.722 [DOI] [PubMed] [Google Scholar]
  47. Hamilton MA (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56–62. doi: 10.1136/jnnp.23.1.56 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Hersoug AG, Høglend P, Havik O, von der Lippe A, & Monsen J (2009). Therapist characteristics influencing the quality of alliance in long‐term psychotherapy. Clinical Psychology and Psychotherapy, 16, 100–110. doi: 10.1002/cpp.605 [DOI] [PubMed] [Google Scholar]
  49. Hersoug AG, Høglend P, Monsen JT, & Havik OE (2001). Quality of working alliance in psychotherapy: Therapist variables and patient/therapist similarity as predictors. Journal of Psychotherapy Practice and Research, 10(4), 205–216. [PMC free article] [PubMed] [Google Scholar]
  50. Holmes TH, Zulman DM, & Kushida CA (2017). Adjustment for variable adherence under hierarchical structure: instrumental variable modeling through compound residual inclusion. Medical Care, 55, e120–e130. doi: 10.1097/MLR.0000000000000464 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Horowitz LM, Rosenberg SE, Baer BA, Ureño G, & Villaseñor VS (1988). Inventory of Interpersonal Problems: Psychometric properties and clinical applications. Journal of Consulting and Clinical Psychology, 56(6), 885–892. doi: 10.1037/0022-006X.56.6.885 [DOI] [PubMed] [Google Scholar]
  52. Horvath AO, Symonds D, & Tapia L (2010). Therapeutic alliances in couple therapy: The web of relationships In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based guide to practice (pp. 210–239). New York, NY: Guilford Press. [Google Scholar]
  53. Horvath AO, & Greenberg LS (1989). Development and validation of the Working Alliance Inventory. Journal of Counseling Psychology, 36(2), 223–233. doi: 10.1037/0022-0167.36.2.223 [DOI] [Google Scholar]
  54. Huppert JD, Kivity Y, Barlow DH, Gorman JM, Shear MK, & Woods SW (2014). Therapist effects and the outcome-alliance correlation in cognitive behavioral therapy for panic disorder with agoraphobia. Behaviour Research and Therapy, 52, 26–34. doi: 10.1016/j.brat.2013.11.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Imbens G, & Rubin D (2015). Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge: Cambridge University Press. [Google Scholar]
  56. Johansson P, Høglend P, & Hersoug AG (2011). Therapeutic alliance mediates the effect of patient expectancy in dynamic psychotherapy. British Journal of Clinical Psychology, 50(3), 283–297. doi: 10.1348/014466510X517406 [DOI] [PubMed] [Google Scholar]
  57. Joyce AS, & Piper WE (1998). Expectancy, the therapeutic alliance, and treatment outcome in short-term individual psychotherapy. Journal of Psychotherapy Practice and Research, 7(3), 236–248. doi: 10.1037/0022-006X.71.4.672 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Joyce AS, Ogrodniczuk JS, Piper WE, & McCallum M (2003). The alliance as mediator of expectancy effects in short-term individual therapy. Journal of Consulting and Clinical Psychology, 71(4), 672–679. doi: 10.1037/0022-006X.71.4.672 [DOI] [PubMed] [Google Scholar]
  59. Kivlighan DM, & Shaughnessy P (2000). Patterns of working alliance development: A typology of client’s working alliance ratings. Journal of Counseling Psychology, 47(3), 362–371. doi: 10.1037/0022-0167.47.3.362 [DOI] [Google Scholar]
  60. Klein DN, Schwartz JE, Santiago NJ, Vivian D, Vocisano C, Castonguay LG, … Keller MB (2003). Therapeutic alliance in depression treatment: Controlling for prior change and patient characteristics. Journal of Consulting and Clinical Psychology, 71(6), 997–1006. doi: 10.1037/0022-006X.71.6.997 [DOI] [PubMed] [Google Scholar]
  61. Laws HB, Constantino MJ, Sayer AG, Klein DN, Kocsis JH, Manber R, … Arnow BA (2017). Convergence in patient–therapist therapeutic alliance ratings and its relation to outcome in chronic depression treatment. Psychotherapy Research, 27(4), 410–424. doi: 10.1080/10503307.2015.1114687 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Lousdal ML (2018). An introduction to instrumental variable assumptions, validation and estimation. Emerging Themes in Epidemiology, 15(1). doi: 10.1186/s12982-018-0069-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Luborsky L (1984). Principles of psychoanalytic psychotherapy: A manual for supportive-expressive (SE) treatment. New York, NY: Basic Books. [Google Scholar]
  64. Mallinckrodt B, & Nelson ML (1991). Counselor training level and the formation of the psychotherapeutic working alliance. Journal of Counseling Psychology, 38(2), 133–138. doi: 10.1037/0022-0167.38.2.133 [DOI] [Google Scholar]
  65. Marcus DK, Kashy DA, Wintersteen MB, & Diamond GS (2011). The therapeutic alliance in adolescent substance abuse treatment: A one-with-many analysis. Journal of Counseling Psychology, 58(3), 449–455. doi: 10.1037/a0023196 [DOI] [PubMed] [Google Scholar]
  66. Messer SB, & Wolitzky DL (2010). A psychodynamic perspective on the therapeutic alliance In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based guide to practice (pp. 97–122). New York, NY: Guilford Press. [Google Scholar]
  67. Meyer B, Pilkonis PA, Krupnick JL, Egan MK, Simmens SJ, & Sotsky SM (2002). Treatment expectancies, patient alliance and outcome: Further analyses from the National Institute of Mental Health Treatment of Depression Collaborative Research Program. Journal of Consulting and Clinical Psychology, 70(4), 1051–1055. doi: 10.1037/0022-006X.70.4.1051 [DOI] [PubMed] [Google Scholar]
  68. Morgan SL, & Winship C (2015). Counterfactuals and causal inference: Methods and principles for social research. New York, NY: Cambridge University Press. [Google Scholar]
  69. Muran JC, Safran JD, Gorman BS, Samstag LW, Eubanks-Carter C, & Winston A (2009). The relationship of early alliance ruptures and their resolution to process and outcome in three time-limited psychotherapies for personality disorders. Psychotherapy: Theory, Research, Practice, Training, 46, 233–248. doi: 10.1037/a0016085 [DOI] [PubMed] [Google Scholar]
  70. Nissen-Lie HA, Monsen JT, & Rønnestad MH (2010). Therapist predictors of early patient-rated working alliance: A multilevel approach. Psychotherapy Research, 20(6), 627–646. doi: 10.1080/10503307.2010.497633 [DOI] [PubMed] [Google Scholar]
  71. Ollila P, Knekt P, Heinonen E, & Lindfors O (2016). Patients’ pre-treatment interpersonal problems as predictors of therapeutic alliance in long-term psychodynamic psychotherapy. Psychiatry Research, 241, 110–117. doi: 10.1016/j.psychres.2016.04.093 [DOI] [PubMed] [Google Scholar]
  72. Patterson CL, Uhlin B, & Anderson T (2008). Clients’ pretreatment counseling expectations as predictors of the working alliance. Journal of Counseling Psychology, 55(4), 528–534. doi: 10.1037/a0013289 [DOI] [PubMed] [Google Scholar]
  73. Pearl J (2009). Causality: Models, reasoning, and inference (2nd ed.). New York, NY: Cambridge University Press. [Google Scholar]
  74. Pearl J, Glymour M, & Jewell NP (2016). Causal inference in statistics: A primer. West Sussex, UK: John Wiley & Sons. [Google Scholar]
  75. Piper WE, & Ogrodniczuk JS (2010). The therapeutic alliance in group therapy In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based guide to practice (pp. 263–282). New York, NY: Guilford Press. [Google Scholar]
  76. Robins JM (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics - Theory and Methods, 23(8), 2379–2412. doi: 10.1080/03610929408831393 [DOI] [Google Scholar]
  77. Robins JM (2000). Marginal structural models versus structural nested models as tools for causal inference In Halloran M & Berry D (Eds.), Statistical models in epidemiology, the environment, and clinical trials (IMA Vol. Math. Appl., 116, pp. 95–133). New York, NY: Springer-Verlag. [Google Scholar]
  78. Rosenthal R, & Rosnow RL (1991). Essentials of behavioral research: Methods and data analysis (2nd ed.). New York, NY: McGraw Hill. [Google Scholar]
  79. Rubin D (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66(5), 688–701. doi: 10.1037/h0037350 [DOI] [Google Scholar]
  80. Rush AJ, Trivedi MH, Ibrahim HM, Carmody TJ, Arnow B, Klein DN, … Keller MB (2003). The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): A psychometric evaluation in patients with chronic major depression. Biological Psychiatry, 54(5), 573–583. doi: 10.1016/S0006-3223(02)01866-8 [DOI] [PubMed] [Google Scholar]
  81. Safran JD, & Wallner LK (1991). The relative predictive validity of two therapeutic alliance measures in cognitive therapy. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3(2), 188–195. doi: 10.1037/1040-3590.3.2.188 [DOI] [Google Scholar]
  82. Staiger D, & Stock J (1997). Instrumental variables regression with weak instruments. Econometrica, 65, 557–586. [Google Scholar]
  83. Stiles WB, Glick MJ, Osatuke K, Hardy GE, Shapiro DA, Agnew-Davies R, … Barkham M (2004). Patterns of alliance development and the rupture-repair hypothesis: Are productive relationships U-shaped or V-shaped? Journal of Counseling Psychology, 51(1), 81–92. doi: 10.1037/0022-0167.51.1.81 [DOI] [Google Scholar]
  84. Stiles-Shields C, Bamford BH, Touyz S, Le Grange D, Hay P, & Lacey H (2016). Predictors of therapeutic alliance in two treatments for adults with severe and enduring anorexia nervosa. Journal of Eating Disorders, 4, 13. doi: 10.1186/s40337-016-0102-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Stock JH, & Trebbi F (2003). Who invented instrumental variables regression? Journal of Economic Perspectives, 17(3), 177–194. doi: 10.1257/089533003769204416 [DOI] [Google Scholar]
  86. Swanson SA, Miller M, Robins JM, & Hernán MA (2015). Definition and evaluation of the monotonicity condition for preference-based instruments. Epidemiology, 26(3), 414–420. doi: 10.1097/EDE.0000000000000279 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Ten Have TR, Joffe MM, Lynch KG, Brown GK, Maisto SA, & Beck AT (2007). Causal mediation analyses with rank preserving models. Biometrics, 63(3), 926–934. doi: 10.1111/j.1541-0420.2007.00766.x [DOI] [PubMed] [Google Scholar]
  88. Trajković G, Starčević V, Latas M, Leštarević M, Ille T, Bukumirić Z, & Marinković J (2011). Reliability of the Hamilton Rating Scale for Depression: A meta-analysis over a period of 49 years. Psychiatry Research, 189(1), 1–9. doi: 10.1016/j.psychres.2010.12.007 [DOI] [PubMed] [Google Scholar]
  89. Tsai M, Ogrodniczuk JS, Sochting I, & Mirmiran J (2014). Forecasting success: Patients’ expectations for improvement and their relations to baseline, process, and outcome variables in group cognitive-behavioural therapy for depression. Clinical Psychology and Psychotherapy, 21(2), 97–107. doi: 10.1002/cpp.1831 [DOI] [PubMed] [Google Scholar]
  90. Uddin MJ, Groenwold RH, de Boer T, Belitser SV, Roes KC, & Klungel OH (2015). Instrumental variable analysis in epidemiologic studies: an overview of the estimation methods. Pharmaceutica Analytica Acta, 6:4. doi: 10.4172/2153-2435.1000353 [DOI] [Google Scholar]
  91. Watson JC, & Kalogerakos F (2010). The therapeutic alliance in humanistic psychotherapy In Muran JC & Barber JP (Eds.), The therapeutic alliance: An evidence-based guide to practice (pp. 191–209). New York, NY: Guilford Press. [Google Scholar]
  92. Webb CA, DeRubeis RJ, Amsterdam JD, Shelton RC, Hollon SD, & Dimidjian S (2011). Two aspects of the therapeutic alliance: Differential relations with depressive symptom change. Journal of Consulting and Clinical Psychology, 79(3), 279–283. doi: 10.1037/a0023252 [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Webb CA, Beard C, Auerbach RP, Menninger E, & Björgvinsson T (2014). The therapeutic alliance in a naturalistic psychiatric setting: Temporal relations with depressive symptom change. Behaviour Research and Therapy, 61, 70–77. doi: 10.1016/j.brat.2014.07.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Williams JBW (1988). A structured interview guide for the Hamilton Depression Rating Scale. Archives of General Psychiatry, 45(8), 742–747. doi: 10.1001/archpsyc.1988.01800320058007 [DOI] [PubMed] [Google Scholar]
  95. Zuroff DC, Kelly AC, Leybman MJ, Blatt SJ, & Wampold BE (2010). Between therapist and within-therapist differences in the quality of the therapeutic relationship: Effects on maladjustment and self-critical perfectionism. Journal of Clinical Psychology, 66(7), 681–697. doi: 10.1002/jclp.20683 [DOI] [PubMed] [Google Scholar]
