Author manuscript; available in PMC: 2015 Sep 1.
Published in final edited form as: Adm Policy Ment Health. 2014 Sep;41(5):697–705. doi: 10.1007/s10488-013-0520-6

Reliability of Therapist Self-Report on Treatment Targets and Focus in Family-Based Intervention

Aaron Hogue 1, Sarah Dauber 2, Craig E Henderson 3, Howard A Liddle 4
PMCID: PMC3966982  NIHMSID: NIHMS527805  PMID: 24068479

Abstract

Reliable therapist-report methods appear to be an essential component of quality assurance procedures to support adoption of evidence-based practices in usual care, but studies have found weak correspondence between therapist and observer ratings of treatment techniques. This study examined therapist reliability and accuracy in rating intervention target (i.e., session participants) and focus (i.e., session content) in a manual-guided, family-based preventive intervention implemented with 50 inner-city adolescents at risk for substance use. A total of 106 sessions selected from three phases of treatment were rated via post-session self-report by the participating therapist and also via videotape by nonparticipant coders. Both groups estimated the amount of session time devoted to model-prescribed treatment targets (adolescent, parent, conjoint) and foci (family, school, peer, prosocial, drugs). Therapists demonstrated excellent reliability with coders for treatment targets and moderate to high reliability for treatment foci across the sample and within each phase. Also, therapists did not consistently overestimate their degree of activity with targets or foci. Implications of study findings for fidelity assessment in routine settings are discussed.

Keywords: Therapist self-report, quality assurance, treatment fidelity, treatment targets, therapy focus, usual care, family therapy

Introduction

Need for Validated Therapist Self-Report Measures of Fidelity to Manualized Behavioral Interventions in Routine Care

Whereas observational assessment of treatment implementation of manualized behavioral interventions remains the gold standard for scientific research on treatment fidelity (Hogue, Liddle, & Rowe, 1996; Garland, Hurlburt, Brookman-Frazee, Taylor, & Accurso, 2010), it is critical to develop reliable complements or even alternatives to observational methods that are cost-effective and easy to use by non-researchers in clinical practice. The most promising method is therapist self-report measures, which offer several advantages over observational ratings (Carroll, Nich, & Rounsaville, 1998; Weersing, Weisz, & Donenberg, 2002): they are quick, inexpensive, and non-intrusive; they can be completed throughout treatment, which facilitates evaluation of infrequent but clinically meaningful interventions; and they can assess therapist intentions as well as observed behaviors.

Reliable therapist-report methods may also be essential for improved adoption of evidence-based practices (EBPs) specified in manualized treatments (Garland et al., 2010). Transporting manualized treatments into usual care invariably requires expensive and time-consuming quality assurance “superstructures” consisting of intensive on-site training of line staff by model experts, followed by continuous monitoring via booster trainings and remote consultation (Hogue, Ozechowski, Robbins, & Waldron, 2013). Thus, resource-efficient quality assurance tools that monitor EBP implementation in field settings with precision and clinical sophistication are urgently needed (Kelley et al., 2010; Schoenwald et al., 2011). Valid therapist-report measures could support technology transfer in several ways: as a self-check by therapists to mark their own progress in treating individual cases; as a supervision aid for trainers and agency supervisors to monitor treatment fidelity; and as administrative data for stakeholders and external reviewers to evaluate therapist- and agency-level clinical performance (Bearsley-Smith et al., 2008; Carroll et al., 1998; Garland, Bickman, & Chorpita, 2010). Therapist-report measures could eventually develop standardized “red line” scores to serve as benchmarks for determining if a given session was faithful to an EBP (Dobson & Shaw, 1993).

Extant Research on the Reliability of Therapist-Reported Fidelity to EBPs

To date only a handful of studies have tested a therapist-report measure of EBP implementation (e.g., Weersing et al., 2002). Fewer still have attempted to confirm the validity of therapist self-reports by comparing them to observational ratings. Pioneering work with EBPs for adult substance abuse has found modest to weak correspondence between nonparticipant raters and therapist reports of fidelity to manualized intervention techniques (Carroll et al., 1998; Martino et al., 2009). In the youth treatment arena, Hurlburt, Garland, Nguyen, & Brookman-Frazee (2010) found that observational coders reported substantially less occurrence and lower intensity of EBPs compared to therapist report in front-line mental health care. Therapist reports of adherence to multisystemic therapy (MST) for antisocial youth have been linked to various client, therapist, and organizational factors (e.g., Schoenwald, Letourneau, & Halliday-Boykins, 2005), including ongoing expert consultation (Schoenwald, Sheidow, & Letourneau, 2004). However, the MST therapist-report fidelity scale assesses adherence to basic intervention principles that guide the MST model specifically (limiting its utility for measuring other EBPs) and has not yet been validated with non-participant observational coding.

Methodological Innovations of the Current Study

The current study advances research on therapist ratings of EBPs in three ways. First, it examined therapist reliability and accuracy in rating intervention target (session participants, i.e., to whom interventions are addressed) and focus (session content, i.e., domains of functioning addressed). Therapist-report studies have paid close attention to treatment principles and specific techniques but have not addressed the “contours” of implementation (Schoenwald et al., 2011) defined by the parameters of a given treatment (i.e., service delivery aspects of implementation: to whom, where, and how often); nor have they examined prescribed session content (Garland, Hurlburt, et al., 2010; Hogue et al., 2004). To make headway in developing efficient fidelity measures for real-world application, it is critical to explore the feasibility of various methods for assessing various dimensions of implementation (Schoenwald, 2011). Asking therapists to judge the (more) readily defined targets and foci of their interventions, rather than treatment techniques that are often multifaceted and interwoven, sets the fidelity measurement bar a notch lower, which might engender improved reliability. Along these lines Kelley et al. (2010) developed a brief therapist-report measure of session focus that showed acceptable internal consistency and distinguished between clinician versus client influences on session content, though it has not been validated with observational data.

Fidelity evaluation of treatment targets and foci is especially germane to family-based intervention, one of the most common treatment modalities in child mental health services (Hoagwood, 2005) and one that places strong emphasis on tracking treatment activity with various family members and social systems (e.g., school, social services) that serve as key developmental contexts for youth (Glisson et al., 2010; Schoenwald et al., 2008). Whereas adherence to signature family therapy techniques has been linked to in-session changes in parenting and family interactions (Diamond & Liddle, 1996, 1999; Schmidt, Liddle, & Dakof, 1996) and to posttreatment outcomes (Huey et al., 2000; Hogue, Henderson, et al., 2008; Robbins et al., 2011), research on benefits associated with faithful attention to treatment parameters and content has rarely surfaced. One exception is a study by Hogue et al. (2006) that incorporated observational ratings of treatment foci in showing that both family-focused and adolescent-focused interventions in family therapy predicted improvement in adolescent symptoms and family processes up to one year posttreatment. Looking ahead, as manualized family therapies progress in efforts to adapt to new settings and clinical populations (e.g., Hogue, Liddle, Becker, & Johnson-Leckrone, 2002; Liddle et al., 2006) and to ensure fidelity standards during large-scale dissemination initiatives (e.g., Barnoski, 2003; Zazzali et al., 2008), adherence to model tenets about what persons and domains of functioning to address in what stages of treatment will be prominently featured in quality assurance procedures.

A second key feature of this study is inclusion of research-funded therapists rather than community practitioners (see also Carroll et al., 1998). Although the ultimate goal of fidelity measurement is real-world utility (Schoenwald, 2011), at this nascent stage in the development of therapist-report instruments there are two practical advantages to using research therapists. First, study therapists were made fluent and then continuously supervised in the intervention components being rated, and they participated in designing the measure itself. In this way they received training in the coding scheme that was equivalent to training received by the observational coders (see Therapist and Observational Coder Training). Second, study therapists were operating in a de facto “low stakes” context, wherein the self-report data they provided were tallied for scientific and clinical purposes only. This stands in contrast to “high stakes” fidelity evaluation that may obtain in front-line settings, wherein practitioner self-reports of model implementation can have direct implications for performance ratings, immediate rewards or punitive action, and even job security (Schoenwald et al., 2011). Thus, utilizing research therapists created optimal conditions in which to measure their fundamental capacity for accurate reporting, uncontaminated by potential negative consequences for subpar fidelity. Once baseline benchmarks for self-report accuracy are established in low stakes contexts such as this, it will be possible to move forward with the critical task of determining which training and motivational factors influence (compromise, or perhaps enhance?) accuracy in higher stakes settings.

Third, whereas most EBP fidelity measures assess intervention extensiveness (i.e., the thoroughness and/or frequency with which an intervention is delivered; Hogue et al., 1996), this study examined therapist ratings of the number of minutes devoted to each treatment target and focus. This self-rating task aligns with standard quality assurance procedures in everyday practice, wherein it is important to account accurately for billable clinical hours spent with various members of the client system as well as time devoted to model- or client-specific clinical topics (Fixsen et al., 2005; Schoenwald et al., 2011).

Study Hypotheses

The main study hypothesis was that research therapist ratings of session time devoted to various treatment targets (adolescent alone, parent(s) alone, adolescent-parent conjointly) and developmental foci (family, peer, school, prosocial activities, substance use) would be reliable with ratings made by observational coders, and that strong reliability would be consistent across treatment phases (early, middle, late). A secondary hypothesis was that therapists would overestimate the average amount of therapeutic activity in these modules and domains (Hurlburt et al., 2010). The observational coder data used in the current study have already been examined in a previous study with this sample (Hogue, Liddle, Singer, & Leckrone, 2005) to demonstrate that study therapists adhered to manual-specified guidelines regarding session activity in treatment modules and domains for a manualized family-based intervention, multidimensional family prevention (MDFP; Hogue & Liddle, 1999; Hogue, Liddle, & Becker, 2002). Of interest in the current study was whether therapists were able to make consistently reliable judgments about their own adherence to these MDFP intervention components.

Method

Participants

Clients

Clients (N = 50) were adolescents and families who participated in a randomized trial of the MDFP model (Hogue et al., 2002b). Clients were recruited from a community youth enrichment program after the target adolescents were identified as being at indicated risk for developing substance use problems based on individualized risk screening (see Hogue, Johnson-Leckrone, & Liddle, 1999). The mean age of the adolescents was 12.5 years (SD = .79, range 11–14). There were 24 boys (48%) and 26 girls (52%), and 98% identified as African American. Families were headed by a single biological parent (54%), grandparent(s) (16%), one biological and one stepparent (14%), two biological parents (12%), and other (4%). A total of 60% reported annual family income under $15,000, and 62% received some form of public assistance.

Therapists

Four male therapists completed post-session self-reports of MDFP implementation: two African American, one European American, and one Asian American (M age: 31 yrs). Three had a master’s in counseling and one a doctorate in Clinical Psychology, and they averaged two years of experience as family counselors.

Observational Coders

Observational coders were nine undergraduates and one Psychology graduate student: six European American females, two Asian American females, one European American male, one Asian American male. Undergraduates were considered qualified for this task because they were not required to make clinically sophisticated judgments about thoroughness, quality, or appropriateness of interventions within the treatment model (Hogue et al., 1996; Hogue, Dauber, et al., 2008).

Intervention Model and Fidelity Measure

Multidimensional family prevention (MDFP)

MDFP (Hogue & Liddle, 1999) is a family-based preventive intervention for indicated-risk adolescents. MDFP is a prevention version of multidimensional family therapy, an empirically supported treatment for adolescent substance use (as nominated in Becker & Curry, 2008; Waldron & Turner, 2008). MDFP has shown efficacy in preventing problem behaviors in multiple developmental domains (Hogue et al., 2002b). MDFP session composition varies on a case-by-case and session-by-session basis, and therapists regularly spend time working individually with family members to accomplish family-wide goals. A total of 15–25 sessions are held over a 3–4 month period. Initial sessions are dedicated to assessment of adolescent and family functioning in several risk and protection domains, particularly family relations, school performance, extracurricular prosocial activities, peer relations, and drug use attitudes and experiences. Therapist and family then review the risk profile and construct a treatment agenda for addressing the most significant themes within three primary modules. The Adolescent Module focuses on the teen’s developmental milestones, coping and problem-solving skills, investment in prosocial institutions, and risky behaviors associated with drug use and delinquency. The Parent Module fosters parenting competency by supporting consistency in limit-setting and discipline, teaching age-appropriate behavioral management techniques, and encouraging monitoring of school performance and other behavior outside the home. The Interactional Module builds relationship skills and fosters autonomy and relatedness in the parent-teen relationship. In-session conversation among family members is shaped to increase family cohesion, problem-solving, and clarity of communication and roles. As an individually tailored model, MDFP makes few prescriptions about specific modules or domains being implemented in specific sessions.

Therapist Self-Report Checklist (TSRC)

The TSRC is a therapist-report measure of the amount of time devoted in session to predefined treatment targets and domains of adolescent functioning. TSRC items are consistent with the MDFP model but also with general implementation principles that guide the family-based approach for adolescent behavior problems. Modules (i.e., targets) are defined by who attends the session. Module ratings require a therapist to estimate the number of minutes during which s/he was (a) with the adolescent only (Adolescent Module), (b) with parent(s) only (Parent Module), or (c) with parent(s) and adolescent conjointly (Interactional Module). Domains (i.e., foci) are defined by session content. Therapists estimate the number of minutes during which each of five core developmental domains were discussed: family relationships (Family Domain), school-related issues (School Domain), health-promoting extracurricular activities (Prosocial Domain), peer relations (Peer Domain), and drug use attitudes and experiences (Drug Domain). Therapists co-score categories whenever more than one domain is discussed simultaneously. For example, if a therapist and teen spend one-third of a 60-minute session discussing the drug use attitudes of the teen’s friends, then Drug Domain and Peer Domain should each receive a score of 20 minutes.
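The co-scoring rule can be made concrete with a small sketch. The function name, the domain labels, and the representation of a session as a list of timed segments are illustrative assumptions and are not part of the TSRC itself; the 40 minutes of family discussion are invented filler so that the example adds up to a 60-minute session.

```python
# Hypothetical helper; names and session layout are illustrative only.
DOMAINS = ("family", "school", "prosocial", "peer", "drug")


def score_domains(segments):
    """Return minutes per domain for one session.

    `segments` is a list of (minutes, domains) pairs, where `domains` is the
    set of domains discussed during that stretch of the session. A stretch
    that covers several domains at once is co-scored: its full duration is
    credited to every domain it touches.
    """
    totals = {domain: 0 for domain in DOMAINS}
    for minutes, domains in segments:
        for domain in domains:
            if domain not in totals:
                raise ValueError(f"unknown domain: {domain}")
            totals[domain] += minutes
    return totals


# Example from the text: one-third of a 60-minute session (20 minutes) on the
# drug use attitudes of the teen's friends. The other 40 minutes are invented.
session = [(20, {"drug", "peer"}), (40, {"family"})]
scores = score_domains(session)
print(scores["drug"], scores["peer"])  # 20 20
```

One consequence of co-scoring is that domain totals can exceed session length: in this example the five domain scores sum to 80 minutes for a 60-minute session.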

Procedures

Sampling design

MDFP therapists videotaped every session (as possible) of every case and also completed a TSRC after every session. For observational coding purposes, one session was randomly chosen for videotape review from each available treatment phase for every case in order to sample representatively across the duration of treatment. Treatment phases were defined as: Phase 1 (sessions 1–5), Phase 2 (sessions 6–12), and Phase 3 (sessions 13 and higher). Thus every study case had between 1–3 sessions selected, depending on how long the given case was retained in treatment, and these sessions were rated by both the MDFP therapist and observational coders. Of the 106 sessions selected, 44 (42%) were from Phase 1, 38 (36%) from Phase 2, and 24 (23%) from Phase 3. Due to logistical barriers and client preferences it was not possible to videotape a Phase 1 session for six of the 50 study cases. Successively fewer sessions were available for coding in Phases 2 and 3 due to treatment attrition during the trial. Of the 65 families who participated in the MDFP trial (Hogue et al., 2002b), 15 were not included in the current study for the following reasons: four did not complete a posttreatment assessment, one attended no treatment sessions at all, and ten refused to be videotaped. There were no significant differences between the original trial sample and the current study sample on any demographic variables (Hogue et al., 2005).
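The phase definitions and the one-session-per-phase draw can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the example case is invented, and the study's actual randomization procedure is not described beyond what appears above. The last lines simply recompute the reported phase percentages from the reported counts.

```python
import random


def phase_of(session_number):
    """Map a session number to its treatment phase as defined in the text."""
    if session_number < 1:
        raise ValueError("session numbers start at 1")
    if session_number <= 5:
        return 1
    if session_number <= 12:
        return 2
    return 3


def sample_sessions(taped_sessions, rng):
    """Pick one taped session at random from each phase that has any tapes."""
    by_phase = {}
    for number in taped_sessions:
        by_phase.setdefault(phase_of(number), []).append(number)
    return {phase: rng.choice(sorted(numbers))
            for phase, numbers in sorted(by_phase.items())}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Invented case: 14 sessions held, sessions 1 and 2 not videotaped.
picked = sample_sessions(range(3, 15), rng)
print(picked)

# Phase shares of the 106 selected sessions, from the counts in the text.
counts = {1: 44, 2: 38, 3: 24}
shares = {phase: round(100 * n / sum(counts.values()))
          for phase, n in counts.items()}
print(shares)  # {1: 42, 2: 36, 3: 23}
```

The rounded shares sum to 101% because each is rounded independently.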

Therapist and Observational Coder Training

Therapists were introduced to the structure and content of the TSRC during a one-hour training session. Then, for four weeks they dedicated a portion of their routine clinical supervision to reviewing their self-reported ratings of module and domain activity; questions regarding TSRC fidelity ratings were resolved at these meetings. Observational coders trained in a group format for ninety minutes per week over a four-month period to reach adequate pre-study reliability (Intraclass correlation coefficient (1,2) > .65 for most study items). Training consisted of didactic instruction and discussion of the rating manual, trainer and peer review of practice scales using pilot cases, and coding exercises designed to test and expand understanding of each scale item. Once rating of study tapes commenced, coders reconvened on a weekly basis for supportive training and to prevent rater drift.

Ratings

Therapists completed the TSRC after every session. Observational coders completed TSRC items after viewing entire videotaped sessions (average length: 60 minutes). Coders were kept unaware of the intent of the study, instructed that family involvement and session composition would vary according to the contingencies of each case, and informed that each scale item could arise in every session. Two coders rated every session; coders were randomly assigned to sessions following balanced incomplete block design procedures (Fleiss, 1981). Reliability and fidelity of these observational data were described in a previous study (Hogue et al., 2005). Intraclass correlation coefficients (ICCs; Shrout & Fleiss, 1979) were adequate for Module items: Adolescent Module ICC(1,2)= .62; Parent = .91, and Interactional = .85. ICCs for Domain items were adequate as well: Family Domain ICC(1,2) = .80, School = .76, Prosocial = .50, Peer = .72, and Drug = .82. The final observational rating scores for each item were calculated by averaging the time estimates provided by each coder.

The study was conducted under approval by the governing Institutional Review Board. Active consent from caregivers and assent from adolescents were collected. Therapists provided active consent for sessions to be judged for MDFP model adherence.

Plan of Analysis

Therapist reliability in rating the amount of session time devoted to MDFP modules and domains was calculated using the one-way random ICC (Shrout & Fleiss, 1979). Therapist accuracy was tested by comparing therapist versus coder mean scores using paired samples t-tests. Cohen’s d effect size was calculated for mean differences of p < .10; according to Cohen (1988), d = .20 is a small effect, .50 is medium, and .80 is large. Analyses were conducted first on all 106 study sessions and then separately by phase (1 = early, 2 = middle, 3 = later) to examine consistency over the course of intervention. Variance components analysis was then used to determine the proportion of variance in both therapist and observer scores attributable to client and therapist effects.
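For readers who want to see the two core statistics spelled out, the following is a minimal sketch of a one-way random ICC and a paired-samples t statistic. It uses the standard one-way ANOVA mean squares from Shrout and Fleiss (1979); the function names and the toy minutes are assumptions made for illustration, and this is not the software or code used in the study. The observer-minus-therapist sign convention is inferred from the reported results, where higher therapist scores yield negative t values.

```python
import math
from statistics import mean, stdev


def icc_one_way(ratings):
    """One-way random ICC for rows of k ratings (one row per session).

    Returns (single, average), i.e. ICC(1,1) and ICC(1,k).
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 sessions with the same k >= 2 ratings each")
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-session and within-session mean squares.
    bms = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    wms = sum((x - m) ** 2
              for row, m in zip(ratings, row_means)
              for x in row) / (n * (k - 1))
    if bms == 0:
        raise ValueError("no between-session variance; ICC is undefined")
    single = (bms - wms) / (bms + (k - 1) * wms)
    average = (bms - wms) / bms
    return single, average


def paired_t(observer, therapist):
    """Paired-samples t on observer minus therapist; returns (t, df)."""
    diffs = [o - t for o, t in zip(observer, therapist)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Invented minutes for five sessions; not study data.
therapist = [10, 25, 0, 40, 15]
observer = [12, 20, 0, 35, 15]
single, average = icc_one_way([list(pair) for pair in zip(therapist, observer)])
t_value, df = paired_t(observer, therapist)
print(round(single, 2), round(average, 2), round(t_value, 2), df)
```

With two raters per session (therapist and the averaged observer score), the second value returned corresponds to the ICC(1,2) notation used in this paper.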

Results

According to Cicchetti (1994), ICC > .70 is acceptable reliability and > .80 strong reliability for molar behavioral ratings. Results for the full sample (Table 1) show strong reliability for modules: Adolescent ICC(1,2) = .96; Parent ICC = .95; Interactional ICC = .86. Therapists reported more time spent working in the Parent (t(105) = −2.45, p < .05, d = .48) and Interactional (t(105) = −2.21, p < .05, d = .43) modules than did observers. Domain ratings showed moderate to high reliability: ICC(1,2) was above .80 for Drug and above .70 for Family, Peer, and School; one exception was a low ICC = .59 for Prosocial. Mean comparisons revealed a trend for observers to give higher scores in the Family Domain (t(105) = 1.80, p < .10, d = .35) but no differences in any other domain.
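The paper does not state which formula was used to obtain Cohen's d. As a rough consistency check, the three full-sample effect sizes above are reproduced by the common conversion d = 2|t| / sqrt(df); treating that conversion as the one the authors used is an assumption, and the function name below is hypothetical.

```python
import math


def d_from_t(t, df):
    """Convert a t statistic to Cohen's d via d = 2|t| / sqrt(df)."""
    return 2 * abs(t) / math.sqrt(df)


# (t, df, reported d) for the full-sample comparisons given in the text.
reported = {
    "Parent module": (-2.45, 105, 0.48),
    "Interactional module": (-2.21, 105, 0.43),
    "Family domain": (1.80, 105, 0.35),
}
for name, (t, df, d) in reported.items():
    print(name, round(d_from_t(t, df), 2), d)
```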

Table 1.

Interrater Reliability and Mean Scores for Therapist Self-Report and Observational Ratings of Treatment Modules and Domains for the Full Sample (N = 106 sessions)

Item | Therapist-Observer ICC | Therapist Score (# Minutes) M (SD) | Observer Score (# Minutes) M (SD) | Paired-Sample t and Cohen’s d coefficients
Adolescent module | .96 | 12.6 (18.0) | 12.7 (16.5) | ns
Parent module | .95 | 13.8 (20.2) | 11.8 (17.2) | t = −2.45**, d = .48
Interactional module | .86 | 20.3 (21.9) | 17.3 (19.6) | t = −2.21**, d = .43
Family domain | .71 | 29.2 (15.4) | 32.1 (19.4) | t = 1.80*, d = .35
Peer domain | .70 | 5.2 (6.3) | 5.0 (5.8) | ns
School domain | .74 | 7.3 (7.0) | 6.9 (8.0) | ns
Prosocial domain | .59 | 5.9 (5.8) | 5.1 (5.5) | ns
Drug domain | .87 | 1.9 (4.8) | 1.9 (5.2) | ns

Note: ** p < .05, * p < .10; ns = not significant.

Results for each treatment phase are presented in Table 2. Phase-specific results for modules mirror those found for the overall sample, with ICCs above .80 in all three phases and relatively stable across phase for each module. Mean comparisons revealed trend-level effects for higher therapist scores in Parent Module Phase 2 (t(37) = −1.98, p < .10, d = .65) and in Interactional Module Phase 1 (t(43) = −1.79, p < .10, d = .55) and Phase 3 (t(23) = −2.00, p < .10, d = .83). Though not statistically significant, Table 2 suggests a trend for Adolescent interventions to decrease as therapy enters the later phases, whereas Parent and Interactional work remain constant or increase. ICCs for domains were also relatively stable across time, with the exception of the School and Drug Domains, which both declined from Phase 1 to Phase 2, and then increased in Phase 3 for Drug Domain. Trend-level differences between reporters were found for Family Domain Phase 2 (t(37) = 2.02, p < .10, d = .66), with observers giving higher scores, and Peer Domain Phase 3 (t(23) = −1.83, p < .10, d = .66), with therapists giving higher scores.

Table 2.

Interrater Reliability and Mean Scores for Therapist Self-Report and Observational Ratings of Treatment Modules and Domains by Treatment Phase

Columns are grouped by phase: Phase One (n = 44), Phase Two (n = 38), Phase Three (n = 24). Within each phase the columns are Therapist-Observer ICC, Therapist Score M (SD), and Observer Score M (SD).

Item | Phase One ICC | Phase One Therapist M (SD) | Phase One Observer M (SD) | Phase Two ICC | Phase Two Therapist M (SD) | Phase Two Observer M (SD) | Phase Three ICC | Phase Three Therapist M (SD) | Phase Three Observer M (SD)
Adolescent module | .95 | 13.0 (18.3) | 12.9 (16.0) | .97 | 14.7 (19.9) | 14.8 (17.6) | .95 | 8.7 (14.1) | 9.1 (15.4)
Parent module | .95 | 12.6 (19.6) | 10.8 (17.2) | .93 | 14.3 (21.2) | 11.4 (16.7)* | .97 | 15.2 (20.3) | 14.6 (18.5)
Interactional module | .92 | 21.7 (21.6) | 18.7 (19.0)* | .83 | 18.0 (24.2) | 16.2 (22.6) | .83 | 21.7 (19.0) | 16.5 (15.8)*
Family domain | .65 | 27.1 (13.9) | 29.2 (19.9) | .75 | 29.4 (15.1) | 34.1 (17.9)* | .73 | 32.6 (18.3) | 34.0 (20.7)
Peer domain | .69 | 6.3 (7.6) | 5.8 (6.2) | .73 | 4.3 (5.1) | 5.4 (6.6) | .61 | 4.5 (5.1) | 2.9 (2.8)*
School domain | .87 | 8.2 (7.3) | 8.2 (7.3) | .67 | 6.2 (7.0) | 6.1 (7.0) | .62 | 7.5 (6.7) | 5.7 (10.2)
Prosocial domain | .61 | 5.8 (5.5) | 5.2 (6.6) | .57 | 5.4 (5.3) | 4.7 (4.4) | .58 | 6.8 (6.9) | 5.5 (5.2)
Drug domain | .93 | 2.1 (5.1) | 1.9 (6.2) | .63 | 1.5 (3.3) | 2.2 (4.6) | .92 | 2.2 (6.3) | 1.7 (4.2)

Note: * p < .10

Variance components analyses were conducted on all sessions using a restricted maximum-likelihood estimation method for three terms: Therapist, Client (nested within Therapist), and Error (Note: There were not enough observations to calculate the multiply nested Phase and Coder terms). Each term was entered as a random effect in the analysis, and estimates of variance for each term were transformed into proportions of variance based on the estimates of total variance across terms. For observational ratings of treatment modules, Therapist accounted for 8% (Adolescent), 2% (Parent), and 0% (Interactional) of respective total score variance, whereas Client accounted for 14% (Adolescent), 28% (Parent), and 0% (Interactional). For therapist ratings of modules, Therapist accounted for 5% (Adolescent), 0% (Parent), and 0% (Interactional), whereas Client accounted for 23% (Adolescent), 20% (Parent), and 0% (Interactional), respectively. Therapist and Client component variances were less than 5% for each domain score for both observer and therapist ratings, with three exceptions: Client term for therapist ratings in the Prosocial Domain (24%), Client term for therapist ratings in Family (17%), and Therapist term for observer ratings in Family (11%). These data show that therapists and coders concurred that session time spent with adolescents and parents varied somewhat across clients but hardly at all across therapists. Informants also agreed that time spent in the Interactional Module, and in most domains, was highly consistent across therapists and clients.
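The final step described above, turning variance estimates into proportions of total variance, amounts to dividing each estimate by their sum. The sketch below shows only that step and does not perform the restricted maximum-likelihood estimation itself. The function name and the variance values are invented for illustration; the raw estimates are not reported in the paper.

```python
def variance_proportions(components):
    """Express each variance estimate as a share of the total across terms."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {term: value / total for term, value in components.items()}


# Invented estimates for one score; not values from the study.
components = {"Therapist": 2.0, "Client": 6.0, "Error": 12.0}
proportions = variance_proportions(components)
for term, share in proportions.items():
    print(f"{term}: {share:.0%}")
```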

Discussion

Main Study Findings

This study found that therapists were quite reliable in judging their own activity in treatment modules and domains: Correlations with nonparticipant observers ranged from ICC = .57 to .97. This level of reliability is substantially higher overall than found in previous research on therapist self-report of treatment techniques (Carroll et al., 1998; Hurlburt et al., 2010; Martino et al., 2009). Contrary to hypotheses, therapists were not prone to overestimating their degree of activity in modules and domains, especially when compared to strong positive biases (i.e., over-reporting) documented for techniques (Carroll et al., 1998; Hurlburt et al., 2010). Therapists and observers also agreed that the amount of time spent working in treatment modules and (to a lesser extent) domains varied somewhat across clients but not across therapists. Pending replication with other EBPs and client populations, results suggest that therapists can be much more reliable and accurate in judging their implementation of treatment targets and foci than specific techniques.

The comparatively strong reliability documented in this study may be attributable to any of the following reasons: use of research-hired therapists (vs. Martino et al., 2009), thorough training and monitoring of therapists in the coding task (vs. Hurlburt et al., 2010), or the measurement specificity associated with rating treatment targets and foci rather than techniques (vs. Carroll et al., 1998). Clearly, the shallow pool of studies on therapist-report fidelity needs to be considerably deepened in order to identify which person, task, and context factors most influence therapist reliability, including whether and how self-report fidelity will be affected by high-stakes performance incentives enacted in front-line settings (Schoenwald et al., 2011). The current study advances this area of work by establishing reliability benchmarks that are achievable under optimal conditions for accurate reporting on treatment parameters and content.

The pattern of ICCs indicates that the highly specifiable Module items—defined primarily by who attends the session—garnered stronger therapist-observer concordance than did the Domain items, which are more broadly defined by the content of participant speech in session. The Prosocial Domain, which was more broadly defined than other Domain items (and rarely benefitted from the word “prosocial” being uttered in session), demonstrated the weakest correlations overall. Also, results from variance components analyses suggest that therapists can accurately discern that they spend time discriminately with adolescents and parents—devoting more or less time in these Modules depending on the presenting needs of the case—but remain constant across cases in the amount of time devoted to interactional work, in keeping with MDFP principles (Hogue et al., 2002a). These results mirror those generated by the observational coders for MDFP therapist adherence to core family therapy techniques for these same cases (Hogue et al., 2005).

Study Limitations

The study sample was too small to test reliably for fidelity-outcome correlations. An important next step is to determine whether levels of prescribed activity with treatment targets and foci predict client outcomes for EBPs of all kinds. Establishing fidelity-outcome links, or the absence thereof, is a high priority for implementation science. There remains little knowledge regarding which aspects of EBPs directly influence outcomes, and virtually no knowledge about which aspects are, and are not, essential for producing key effects. As it stands, model-specific treatment techniques, which have received the most attention to date, have not consistently predicted outcome (Perepletchikova & Kazdin, 2005), and when they have, effect sizes are typically small (Webb, DeRubeis, & Barber, 2010). However, it is possible that technique-outcome links are stronger in family-based treatment for adolescent behavior problems, as several studies have reported at least some adherence effects on outcome (Hogue, Henderson, et al., 2008; Huey et al., 2000; Robbins et al., 2011). Investigating whether fidelity to the contours of family therapy also predicts outcome is a logical next step. More broadly, advancing research on multiple dimensions of fidelity-outcome links involving the techniques, content, and service delivery parameters of treatment is perhaps the most promising method for uncovering mechanisms of change in EBPs (McLeod, Southam-Gerow, Tully, Rodriguez, & Smith, 2013).

Another limitation is that this study focused on only one aspect of treatment fidelity, namely treatment adherence (i.e., quantity), and did not attempt to measure therapist competence (i.e., quality) in implementing MDFP modules and domains. Although virtually every manualized treatment offers guidelines for skillful implementation (how to deliver the appropriate interventions at the appropriate time), it has proven exceedingly difficult to assess competence reliably (Barber, Sharpless, Klostermann, & McCarthy, 2007; Hogue, Dauber, et al., 2008). Moreover, even reliable measures of competence do not reliably predict outcomes, a counterintuitive but persistent finding (Barber et al., 2007; Hogue, Henderson, et al., 2008).

Other limitations include the small sample size for therapists, the limited number and range of treatment domains measured by the study’s fidelity instrument, and inclusion of male therapists only. Study clients were enrolled in prevention counseling and may therefore differ in meaningful ways from families that typically seek outpatient behavioral treatment; however, the central task of accurately self-reporting on model implementation appears little affected by this potential difference. Also, although there was some variability in the strength of correlations across treatment phases, the small samples within each phase, combined with the modest variability in mean scores reported within-phase for each Module and Domain item, make it premature to draw conclusions about phase-specific differences in reporter accuracy for these aspects of treatment.

Implications for Mental Health Services

This study was not designed to validate a specific fidelity instrument or promulgate treatment elements specific to MDFP, but instead, to test a question broadly applicable to fidelity measures and quality assurance procedures of all kinds: Can therapists estimate their own activity with regard to the targets and foci of treatment with moderate to high reliability and acceptable accuracy? Study results confirm that they can, and this is welcome news for service providers and evaluators invested in quality assurance methods for EBPs. Many evidence-based models have explicit guidelines for whom to treat and what themes to address. Specifications for activity with treatment targets and foci are often contained in quality assurance procedures designed to promote fidelity during the training and sustainability phases of EBP dissemination (McHugh & Barlow, 2010). Such specifications are especially prominent in dissemination toolkits for family-based models (Henggeler & Sheidow, 2012) and are primary indices of fidelity success in family therapy transportability research (e.g., Liddle et al., 2006), marking them as central to effective implementation in routine care.
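To make the reliability question concrete, the following is a minimal sketch of how agreement between therapist and observer ratings might be quantified with an intraclass correlation. It assumes the two-way random-effects, single-rater, absolute-agreement form, ICC(2,1), from Shrout and Fleiss (1979); this section does not state which ICC form the study used, so that choice is an assumption. The function name `icc_2_1` and the ratings in `hypothetical_ratings` are invented for illustration and are not study data.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) of Shrout & Fleiss (1979): two-way random effects,
    single rater, absolute agreement.

    ratings: 2-D array-like, rows = rated sessions, columns = raters.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape  # n sessions, k raters
    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # one mean per session
    col_means = x.mean(axis=0)  # one mean per rater

    # Sums of squares for a two-way layout without replication.
    ss_rows = k * ((row_means - grand_mean) ** 2).sum()
    ss_cols = n * ((col_means - grand_mean) ** 2).sum()
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-session mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical example (NOT study data): percentage of session time
# devoted to one treatment target, as estimated by the therapist
# (column 0) and by an observational coder (column 1).
hypothetical_ratings = [
    [60, 55],
    [20, 25],
    [80, 70],
    [40, 45],
    [10, 15],
]

example_icc = icc_2_1(hypothetical_ratings)
print(f"ICC(2,1) for the hypothetical ratings: {example_icc:.2f}")
```

Because ICC(2,1) penalizes systematic differences between raters as well as inconsistent rank ordering, a therapist who uniformly over-reported activity would score lower here than under a consistency-only coefficient. That property is one reason an absolute-agreement form may suit questions of reporter accuracy, although other forms could be defensible depending on the rating design.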

In addition to straightforward fidelity monitoring, therapist-report procedures can supply front-line implementation data that establish concrete benchmarks for therapist performance in real-world conditions (McLeod et al., 2013). Such fidelity benchmarks should prove to be essential for calibrating EBP training efforts and assessing the adequacy and consistency of programmatic EBP delivery across clients, therapists, and organizational levels. Indeed, it may be difficult to achieve meaningful improvements in EBP delivery, and subsequent client outcomes, without compiling fidelity benchmarks for treatment parameters and techniques that have been carefully vetted in routine care (Hogue et al., 2013).

For all these reasons, quality assurance procedures for EBP dissemination and implementation will be greatly enhanced by validated methods for therapist self-report of treatment targets and foci, particularly methods that can be easily incorporated into therapist training guidelines, supervision and fidelity monitoring procedures, and clinical feedback systems to improve client care (Garland, Bickman, & Chorpita, 2010; Hogue et al., 2013). Moreover, methods for training therapists to rate their own fidelity to treatment contours might “lift the whole boat” by generalizing to ratings of treatment techniques and socializing therapists to become more reflective and programmatic about their interventions. The methods described in this study represent a generic assessment approach that can be readily adapted to fit specifications for all varieties of models—including prescriptions as appropriate for how much time and activity should be devoted to each—and then incorporated as evaluation items into companion fidelity toolkits.

Acknowledgments

Aaron Hogue and Sarah Dauber, The National Center on Addiction and Substance Abuse at Columbia University; Craig E. Henderson, Department of Psychology, Sam Houston State University; Howard A. Liddle, Departments of Epidemiology and Public Health, and Psychology, Center for Treatment Research on Adolescent Drug Abuse, University of Miami Miller School of Medicine.

Preparation of this article was supported by grants R01DA023945 from the National Institute on Drug Abuse and HD1SP07054 from the Center for Substance Abuse Prevention. The authors are grateful for the dedicated work of the prevention counselors at Temple Teen Care and the observational coders for the TBRS-2 project.

Contributor Information

Aaron Hogue, The National Center on Addiction and Substance Abuse at Columbia University

Sarah Dauber, The National Center on Addiction and Substance Abuse at Columbia University

Craig E. Henderson, Sam Houston State University

Howard A. Liddle, Center for Treatment Research on Adolescent Drug Abuse, University of Miami Miller School of Medicine

References

  1. Barber JP, Sharpless B, Klostermann S, McCarthy KS. Assessing intervention competence and its relation to therapy outcome: A selected review derived from the outcome literature. Professional Psychology: Research and Practice. 2007;38:493–500.
  2. Barnoski R. Outcome evaluation of Washington State's research-based programs for juvenile offenders. Washington State Institute for Public Policy; 2003. (wsipp.wa.gov)
  3. Bearsley-Smith C, Sellick K, Chesters J, Francis K. Treatment content in child and adolescent mental health services: Development of the Treatment Recording Sheet. Administration and Policy in Mental Health and Mental Health Services Research. 2008;35:423–435. doi: 10.1007/s10488-008-0184-9.
  4. Becker SJ, Curry JF. Outpatient interventions for adolescent substance abuse: A quality of evidence review. Journal of Consulting and Clinical Psychology. 2008;76:531–543. doi: 10.1037/0022-006X.76.4.531.
  5. Carroll KM, Nich C, Rounsaville BJ. Utility of therapist session checklists to monitor delivery of coping skills treatment for cocaine abusers. Psychotherapy Research. 1998;8:307–320.
  6. Cicchetti D. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment. 1994;6:284–90.
  7. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1988.
  8. Diamond GS, Liddle HA. Resolving a therapeutic impasse between parents and adolescents in Multidimensional Family Therapy. Journal of Consulting and Clinical Psychology. 1996;64:481–488. doi: 10.1037//0022-006x.64.3.481.
  9. Diamond GS, Liddle HA. Transforming negative parent-adolescent interactions: From impasse to dialogue. Family Process. 1999;38:5–26. doi: 10.1111/j.1545-5300.1999.00005.x.
  10. Dobson KS, Shaw BF. The training of cognitive therapists: What have we learned from treatment manuals? Psychotherapy: Theory, Research, Practice, & Training. 1993;30:573–77.
  11. Fixsen DL, Naoom SF, Blasé KA, Friedman RM, Wallace F. Implementation research: A synthesis of the literature. Tampa, FL: University of South Florida, Louis de la Parte Florida Mental Health Institute, The National Implementation Research Network; 2005. Report No.: FMHI Publication #231.
  12. Fleiss JL. Balanced incomplete block designs for inter-rater reliability studies. Applied Psychological Measurement. 1981;5:105–112.
  13. Garland AF, Bickman L, Chorpita BF. Change what? Identifying quality improvement targets by investigating usual mental health care. Administration and Policy in Mental Health and Mental Health Services Research. 2010;37:15–26. doi: 10.1007/s10488-010-0279-y.
  14. Garland AF, Hurlburt MS, Brookman-Frazee L, Taylor RM, Accurso EC. Methodological challenges of characterizing usual care psychotherapeutic practice. Administration and Policy in Mental Health and Mental Health Services Research. 2010;37:208–220. doi: 10.1007/s10488-009-0237-8.
  15. Glisson C, Schoenwald SK, Hemmelgarn A, Green P, Dukes D, Armstrong KS, Chapman JE. Randomized trial of MST and ARC in a two-level evidence-based treatment implementation strategy. Journal of Consulting and Clinical Psychology. 2010;78:537–550. doi: 10.1037/a0019160.
  16. Henggeler SW, Sheidow AJ. Empirically supported family-based treatments for conduct disorder and delinquency in adolescents. Journal of Marital and Family Therapy. 2012;38:30–58. doi: 10.1111/j.1752-0606.2011.00244.x.
  17. Hoagwood KE. Family-based services in children’s mental health: A research review and synthesis. Journal of Child Psychology and Psychiatry. 2005;46:690–713. doi: 10.1111/j.1469-7610.2005.01451.x.
  18. Hogue A, Dauber S, Chinchilla P, Fried A, Henderson CE, Inclan J, Reiner R, Liddle HA. Assessing fidelity in individual and family therapy for adolescent substance abuse. Journal of Substance Abuse Treatment. 2008;35:137–147. doi: 10.1016/j.jsat.2007.09.002.
  19. Hogue A, Dauber S, Samuolis J, Liddle HA. Treatment techniques and outcomes in multidimensional family therapy for adolescent behavior problems. Journal of Family Psychology. 2006;20:535–543. doi: 10.1037/0893-3200.20.4.535.
  20. Hogue A, Henderson CE, Dauber S, Barajas PC, Fried A, Liddle HA. Treatment adherence, competence, and outcome in individual and family therapy for adolescent behavior problems. Journal of Consulting and Clinical Psychology. 2008;76:544–555. doi: 10.1037/0022-006X.76.4.544.
  21. Hogue A, Johnson-Leckrone J, Liddle HA. Recruiting high-risk families into family-based prevention and prevention research. Journal of Mental Health Counseling. 1999;21:337–351.
  22. Hogue A, Liddle HA. Family-based preventive intervention: An approach to preventing substance use and antisocial behavior. American Journal of Orthopsychiatry. 1999;69:278–293. doi: 10.1037/h0080403.
  23. Hogue A, Liddle HA, Becker D. Multidimensional family prevention for at-risk adolescents. In: Patterson T, editor. Comprehensive handbook of psychotherapy, volume II: Cognitive-behavioral approaches. NY: Wiley; 2002a. pp. 141–166.
  24. Hogue A, Liddle HA, Becker D, Johnson-Leckrone J. Family-based prevention counseling for high-risk young adolescents: Immediate outcomes. Journal of Community Psychology. 2002b;30:1–22.
  25. Hogue A, Liddle HA, Dauber S, Samuolis J. Linking session focus to treatment outcome in evidence-based treatments for adolescent substance abuse. Psychotherapy: Theory, Research, Practice, & Training. 2004;41:83–96. doi: 10.1037/0033-3204.41.2.83.
  26. Hogue A, Liddle HA, Rowe C. Treatment adherence process research in family therapy: A rationale and some practical guidelines. Psychotherapy: Theory, Research, Practice, & Training. 1996;33:332–345.
  27. Hogue A, Liddle HA, Singer A, Leckrone J. Intervention fidelity in family-based prevention counseling for adolescent problem behaviors. Journal of Community Psychology. 2005;33:191–211.
  28. Hogue A, Ozechowski TJ, Robbins MS, Waldron HB. Making fidelity an intramural game: Localizing quality assurance procedures to promote sustainability of evidence-based practices in usual care. Clinical Psychology: Science and Practice. 2013;20:60–77.
  29. Huey S, Henggeler SW, Brondino MJ, Pickrel SG. Mechanisms of change in multisystemic therapy: Reducing delinquent behavior through therapist adherence and improved family and peer functioning. Journal of Consulting & Clinical Psychology. 2000;68:451–67.
  30. Hurlburt MS, Garland AF, Nguyen K, Brookman-Frazee L. Child and family therapy process: Concordance of therapist and observational perspectives. Administration and Policy in Mental Health and Mental Health Services Research. 2010;37:230–244. doi: 10.1007/s10488-009-0251-x.
  31. Kelley SD, Vides de Andrade AR, Sheffer E, Bickman L. Exploring the black box: Measuring youth treatment process and progress in usual care. Administration and Policy in Mental Health and Mental Health Services Research. 2010;37:287–300. doi: 10.1007/s10488-010-0298-8.
  32. Liddle HA, Rowe CL, Gonzalez A, Henderson CE, Dakof GA, Greenbaum PE. Changing provider practices, program environment, and improving outcomes by transporting multidimensional family therapy to an adolescent drug treatment setting. The American Journal on Addictions. 2006;15:102–112. doi: 10.1080/10550490601003698.
  33. Martino S, Ball S, Nich C, Frankforter TL, Carroll KM. Correspondence of motivational enhancement treatment integrity ratings among therapists, supervisors, and observers. Psychotherapy Research. 2009;19:181–193. doi: 10.1080/10503300802688460.
  34. McHugh RK, Barlow DH. The dissemination and implementation of evidence-based psychological treatments. American Psychologist. 2010;65:73–84. doi: 10.1037/a0018121.
  35. McLeod BD, Southam-Gerow MA, Tully CB, Rodriguez A, Smith MM. Making a case for treatment integrity as a psychosocial treatment quality indicator for youth mental health care. Clinical Psychology: Science and Practice. 2013;20:14–32. doi: 10.1111/cpsp.12020.
  36. Perepletchikova F, Kazdin AE. Treatment integrity and therapeutic change: Issues and research recommendations. Clinical Psychology: Science and Practice. 2005;12:365–383.
  37. Robbins MS, Feaster DJ, Horigian VE, Puccinelli MJ, Henderson CE, Szapocznik J. Therapist adherence in Brief Strategic Family Therapy for adolescent drug abusers. Journal of Consulting & Clinical Psychology. 2011;79:43–53. doi: 10.1037/a0022146.
  38. Schmidt SE, Liddle HA, Dakof GA. Changes in parenting practices and adolescent drug abuse during multidimensional family therapy. Journal of Family Psychology. 1996;10:12–27.
  39. Schoenwald SK. It’s a bird, it’s a plane, it’s…fidelity measurement in the real world. Clinical Psychology: Science and Practice. 2011;18:142–147. doi: 10.1111/j.1468-2850.2011.01245.x.
  40. Schoenwald SK, Chapman JE, Kelleher KE, Hoagwood KE, Landsverk J, Stevens J, et al. A survey of the infrastructure for children’s mental health services: Implications for the implementation of empirically supported treatments (ESTs). Administration and Policy in Mental Health and Mental Health Services Research. 2008;35:84–97. doi: 10.1007/s10488-007-0147-6.
  41. Schoenwald SK, Garland AF, Chapman JE, Frazier SL, Sheidow AJ, Southam-Gerow MA. Toward the effective and efficient measurement of implementation fidelity. Administration and Policy in Mental Health and Mental Health Services Research. 2011;38:32–43. doi: 10.1007/s10488-010-0321-0.
  42. Schoenwald SK, Letourneau EJ, Halliday-Boykins C. Predicting therapist adherence to a transported family-based treatment for youth. Journal of Clinical Child and Adolescent Psychology. 2005;34:658–670. doi: 10.1207/s15374424jccp3404_8.
  43. Schoenwald SK, Sheidow A, Letourneau EJ. Toward effective quality assurance in evidence-based practice: Links between expert consultation, therapist fidelity, and child outcomes. Journal of Clinical Child and Adolescent Psychology. 2004;33:94–104. doi: 10.1207/S15374424JCCP3301_10.
  44. Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin. 1979;86:420–428. doi: 10.1037//0033-2909.86.2.420.
  45. Waldron HB, Turner CW. Evidence-based psychosocial treatments for adolescent substance abuse. Journal of Clinical Child and Adolescent Psychology. 2008;37:238–261. doi: 10.1080/15374410701820133.
  46. Webb CA, DeRubeis RJ, Barber JP. Therapist adherence/competence and treatment outcome: A meta-analytic review. Journal of Consulting and Clinical Psychology. 2010;78:200–211. doi: 10.1037/a0018912.
  47. Weersing RV, Weisz JR, Donenberg GR. Development of the Therapy Procedures Checklist: A therapist-report measure of technique use in child and adolescent treatment. Journal of Clinical Child Psychology. 2002;31:168–80. doi: 10.1207/S15374424JCCP3102_03.
  48. Zazzali JL, Sherbourne C, Hoagwood KE, Greene D, Bigley MF, Sexton T. The adoption and implementation of an evidence based practice in child and family mental health services organizations: A pilot study of Functional Family Therapy in New York State. Administration and Policy in Mental Health and Mental Health Services Research. 2008;35:38–49. doi: 10.1007/s10488-007-0145-8.
