Abstract
This paper examines the selection and use of multiple methods and informants for the assessment of disruptive behavior syndromes and attention deficit/hyperactivity disorder (ADHD), providing a critical discussion of (a) the bidirectional linkages between theoretical models of childhood psychopathology and current assessment techniques; and (b) current knowledge concerning the utility of different methods and informants for key clinical goals. There is growing recognition that children’s behavior varies meaningfully across situations, and evidence indicates that these differences, in combination with informants’ unique perspectives, are at least partly responsible for inter-rater discrepancies in reports of symptomatology. Such data suggest that we should embrace this contextual variability as clinically meaningful information, moving away from models of psychopathology as generalized traits that manifest uniformly across situations and settings, and towards theoretical conceptualizations that explicitly incorporate contextual features, such as considering clinical syndromes identified by different informants to be discrete phenomena. We highlight different approaches to measurement that embrace contextual variability in children’s behavior and describe how the use of such tools and techniques may yield significant gains clinically (e.g., for treatment planning and monitoring). The continued development of a variety of feasible, contextually sensitive methods for assessing children’s behavior will allow us to determine further the validity of incorporating contextual features into models of developmental psychopathology and nosological frameworks.
I. Introduction
Given there is no biological or behavioral marker that definitively indicates the presence of clinically impairing psychological syndromes in children or adolescents (De Los Reyes, 2011; Kraemer et al., 2003), the collection of data from multiple sources is, by necessity, the gold standard for measuring developmental psychopathology (Hunsley & Mash, 2007). Thus, clinicians and researchers are tasked with two key assessment decisions: (1) How, and from whom, should information be collected? (2) How should the resulting data be integrated? Little consensus exists on how to make these important choices. The Diagnostic and Statistical Manual of Mental Disorders-IV-TR (DSM-IV, American Psychiatric Association, 2000), for example, provides minimal guidance concerning what information should be obtained to guide diagnostic decision-making (Hudziak, Achenbach, Althoff, & Pine, 2007).
In this paper, we critically review evidence for the selection and use of multiple methods and informants to assess psychopathology in children and adolescents (herein referred to as “children”), focusing on the DSM disruptive behavior syndromes (oppositional defiant disorder, ODD, and conduct disorder, CD) and attention deficit/hyperactivity disorder (ADHD), because the empirical knowledge base concerning multi-method and multi-informant measurement is most substantial for these syndromes. Our goal is not to provide a comprehensive cataloguing of tools and techniques for assessing these syndromes, as others have done this work (e.g., Hunsley & Mash, 2008). Rather, our objective is to examine: (a) the bidirectional linkages between conceptual models of childhood psychopathology and commonly used assessment techniques; and (b) current knowledge concerning the utility of different methods and informants for making diagnoses and planning and monitoring progress in treatment across development.
Assessment practices – which include not only the instruments used but techniques for quantifying, summarizing, analyzing, and interpreting the resulting information – should maximally fit theoretical models of the clinical phenomena under consideration (McFall, 2005). Psychological syndromes are theoretical models advanced to explain patterns in children’s functioning (Kendell & Jablensky, 2003), and these conceptualizations are tested by measuring referents, which are the observable phenomena that are the manifestations of the underlying construct (McFall & Townsend, 1998). As such, selecting measures and analytic approaches stakes a theoretical claim. For example, integrating data across different informants’ reports to form one score reflects an implicit conceptualization of the underlying syndrome as a unitary construct of interest that generalizes across settings (Gomez, Burns, Walsh, & de Moura, 2003). Thus, beyond psychometric considerations, choosing among measures is inherently a theory-based process that necessitates thoughtful evaluation of the nature of the phenomena under consideration.
For this reason, there must be an ongoing dialogue between assessment and theory, such that theoretical and nosological frameworks are continuously refined to accommodate the data resulting from developments in assessment techniques. The current psychiatric nosology has advanced our understanding of childhood psychopathology in many important ways (Angold & Costello, 2009). Its problems, however, have also been documented. Critically, the existing diagnostic categories do not provide adequately defined phenotypes for studies of genetic contributions to psychiatric symptomatology (e.g., Ginsburg et al., 1996), nor are they easily integrated with findings from clinical neuroscience (Insel et al., 2010). The DSM taxonomy yields groupings that are merely descriptive, highly heterogeneous, and markedly overlapping. As researchers seek to identify increasingly specific causal mechanisms, it has become apparent that alternative approaches to organizing behavioral and emotional dysfunction are needed (Sanislow et al., 2010).
One promising possibility is to examine functional characterizations reflecting the different circumstances in which children’s symptoms manifest (De Los Reyes, Henry, Tolan, & Wakschlag, 2009; Wright & Zakriski, 2001). Mounting evidence suggests that children’s behavior varies reliably and meaningfully across interpersonal situations (Dirks, Treat, & Weersing, 2007a; Wright, Zakriski, & Drinkwater, 1999). These behavioral differences, in combination with informants’ unique perspectives on children’s behavior, are at least partly responsible for the discrepancies that occur when different raters are asked to report on children’s behavior and psychological symptoms (De Los Reyes & Kazdin, 2005; Dumenci, Achenbach, & Windle, 2011). Variability in assessments of children’s behavior – across both specific interpersonal situations and more broadly construed settings (e.g., home, school, clinic), and as judged by different individuals – has often been considered something to be erased, in order to identify the “true” dispositions underlying actions (see Wright et al., 2011). We, on the other hand, take the perspective that such differences should be embraced, as they will contribute to our understanding of psychopathology. There may be both conceptual and practical benefits to revising our theoretical models to incorporate commonly observed contextual variations in children’s behavior generally and symptom presentations in particular. We suggest that symptoms occurring in different situations or settings, or as perceived by different informants, may constitute distinct phenotypes, and that our ability to understand those phenotypes holds the promise of advancing the diagnosis and treatment of psychopathology.
Advancing understanding of why behavioral variability occurs and what it can reveal about the heterogeneity of behavioral disorders in children will help both clinicians and researchers to collect the most relevant information and to use those data efficiently. These issues of utility, or the extent to which an assessment practice contributes to improved clinical decision-making (Hunsley & Mash, 2007), must also inform assessment choices (McFall, 2005). During an assessment, each instrument, informant, and data-analytic strategy should show evidence of incremental validity, contributing uniquely to the goal(s) of the process (Hunsley & Meyer, 2003). The rule of parsimony should prevail unless there is empirical evidence that “more is better,” yet too often more intensive practices are adopted in the absence of compelling evidence for their added value (Cella, Gershon, Lai, & Choi, 2007; Dirks & Boyle, 2010). This is a critical challenge for the field, as incorporating more complex procedures without demonstrated utility increases burden on families and may exacerbate clinicians’ resistance to the incorporation of standardized measurements into clinical practice (Johnston & Murray, 2003).
II. Choosing Among Methods in the Assessment of Child Psychopathology
In this section, we review how these considerations of theory and utility do, and should, guide clinicians and researchers as they choose among three of the major approaches to assessing psychopathology: rating scales, interviews, and observational procedures. These three strategies share a key limitation, which is that they only provide access to information that can be reported or seen. There are important processes in the etiology and maintenance of ADHD, ODD, and CD that can only be assessed through biological or performance-based tasks (e.g., cognitive functioning in ADHD; Pelham, Fabiano, & Massetti, 2005). Although such tools have yielded significant insights into these syndromes, their clinical utility has yet to be widely established. This situation is likely to change. For example, recent work incorporates a performance-based test of interpretation biases into a treatment protocol for adolescent mood disorder (Lothmann et al., 2011), and as the Research Domain Criteria (RDoC) initiative advances understanding of the underlying etiological mechanisms of psychiatric disorder (Insel et al., 2010), performance-based and biological assessment approaches will likely become more common. Yet even as the validity of such tasks for clinical purposes becomes increasingly established, there will always be a need for valid and reliable indices of children’s observable functioning. Understanding of the cognitive, biological, and social mechanisms contributing to psychopathology is advancing rapidly, but the complexity of both the pathways leading to children’s behavioral and emotional dysfunction and the resulting phenotypes makes it unlikely that the field will reach a point when valid and reliable reports of phenomenology will play no role in diagnosis (Kendler, 2005). Moreover, such reports will always be important for treatment planning and monitoring (Pelham et al., 2005).
Given their current and future importance for the assessment of psychopathology, it is essential that we continually evaluate the theoretical underpinnings and clinical utility of rating scales, interviews, and observational procedures.
II. a. Underlying Theoretical Models of Rating Scales, Interviews, and Observational Procedures
Rating scales and interviews, which can be unstructured, respondent based (i.e., structured) or interviewer based (i.e., semi-structured), are the most widely used tools for assessing childhood psychopathology (e.g., Hunsley & Mash, 2008). Naturalistic and laboratory observational procedures are also used in the assessment of ADHD, ODD, and CD (Frick & McMahon, 2008; Pelham et al., 2005), often to corroborate evidence provided by rating scales or interviews (McConaughy et al., 2010). Diagnostic observation procedures, however, are specifically designed to generate unique information to be incorporated into clinical decision-making, by engaging families in standardized laboratory procedures that “press” for the range of clinically salient behaviors (Lord et al., 2000).
Although the methodologies differ, rating scales, interviews, and observational procedures, as they are typically used, reflect similar underlying theoretical models of child psychopathology. Specifically, they emphasize psychopathology as a trait that will generalize across situations (see Wright, Zakriski, Hartley, & Parad, 2011). In general, rating scales ask informants to make global judgments about the frequency or intensity of symptoms (Dirks, Treat, & Weersing, 2007a; McDermott, 1993; Wright et al., 1999). For example, the widely used Child Behavior Checklist (CBCL, Achenbach & Rescorla, 2001) asks parents to evaluate the extent to which statements such as “argues a lot,” “talks too much,” and “threatens people” are true of their child. Such ratings emphasize overall frequency or require a global trait judgment without explicit reference to the interpersonal circumstances in which symptoms are occurring. Diagnostic interviews also tend not to solicit information about context, except in cases where the DSM-IV criteria explicitly reference contextual antecedents. For example, to be diagnosed with ADHD, impairment must be present in two settings, and as such, diagnostic interviews often query whether symptoms or impairment occur at home, at school, or in other contexts (e.g., the Child and Adolescent Psychiatric Assessment, CAPA, Angold & Costello, 2000).
In contrast to interviews and rating scales, observational measures provide a significant amount of contextual information, both at the setting and situation level. Observations can take place at home, school, or in the clinic, three discrete settings that present different demands. Moreover, it is possible to observe the specific interpersonal circumstances that precede behaviors. Often, however, this information is disregarded when these procedures are used to obtain decontextualized frequency counts of behavior (Wright et al., 2011). In addition, behavior may only be assessed in one setting, under the assumption that it will generalize to others, which may not be the case (see Gardner, 2000). Thus, as they are typically used, observational approaches are consonant with a theoretical model similar to that described for rating scales and interviews. Behaviors are the referents of the underlying psychopathology, without regard to the interpersonal situations in which they are embedded. Situational information is available, but not considered.
II. a. i. The role of interpersonal context in developmental psychopathology
Increasingly, however, there is emphasis on the importance of context for diagnostic assessment (Drabick, 2009). The situations in which symptoms are elicited can provide important clues concerning the presence and severity of the syndrome. Pervasiveness, the extent to which symptoms are displayed across situations, has been identified as a key indicator of psychopathology in children (Angold & Costello, 2000), an idea that has been incorporated into some measures. For example, the CAPA uses the number of activities in which symptoms occur as a marker of intensity (Angold & Costello, 2000). The Adjustment Scales for Children and Adolescents (ASCA; McDermott, 1993) operationalizes psychopathology as the occurrence of symptoms across discrete situations. Teachers are asked to identify how a child responds in a particular interpersonal circumstance (e.g., when given correction) from a menu of behaviors (e.g., “takes correction without fuss,” “takes correction badly, [such as] sulky muttering, expressions, etc.,”), and the presence of a clinically concerning syndrome is determined based on the number of situations in which the corresponding behavior occurs. Clinically significant oppositionality, for example, is identified when a child is reported to engage in the related behaviors in six or more situations (also see Dodge, McClaskey, & Feldman, 1985; DuPaul & Barkley, 1992).
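As a minimal illustration of pervasiveness-based scoring, the following Python sketch counts the situations in which an oppositional response is endorsed and compares that count with a cut-off. It is not the published ASCA scoring algorithm: the situation labels, the response labels, and the function names are hypothetical, and only the cut-off of six situations is taken from the description above.

```python
# Hypothetical illustration of a pervasiveness count; NOT the published ASCA scoring.
# The response labels below are assumed for the example.
OPPOSITIONAL = {"takes correction badly", "argues", "refuses"}
THRESHOLD = 6  # "six or more situations", as described in the text


def pervasiveness(report):
    """Count situations in which at least one oppositional response was endorsed."""
    return sum(1 for responses in report.values() if OPPOSITIONAL & set(responses))


def is_clinically_significant(report, threshold=THRESHOLD):
    """Apply the situation-count cut-off to a single informant's report."""
    return pervasiveness(report) >= threshold


# A made-up teacher report: situation -> responses endorsed in that situation.
teacher_report = {
    "given correction": ["takes correction badly"],
    "asked to share": ["argues"],
    "asked to clean up": ["refuses"],
    "group work": ["argues"],
    "transition between activities": ["refuses"],
    "waiting for a turn": ["argues"],
    "greeted by teacher": ["responds politely"],
}

situations_with_symptoms = pervasiveness(teacher_report)
flagged = is_clinically_significant(teacher_report)
```

The point of the sketch is that the unit being counted is the situation rather than the behavior: a child who argues very often in a single situation would receive a low pervasiveness score under this rule.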
A second way in which situation may modify the clinical significance of a behavior is through the principle of developmental expectability (Wakschlag et al., 2010). Some behaviors are more likely, or expectable, in particular interpersonal contexts (Cole, Martin, & Dennis, 2004), and behavior that occurs in an expected situation (e.g., a preschooler displaying aggression during a toy dispute) may not be as clinically concerning as behavior that occurs in atypical circumstances (e.g., aggression by a preschooler that appears to come “out of the blue”; Wakschlag et al., 2010). In this way, the referent of the underlying pathology is the behavior tied to its interpersonal and/or broader contextual antecedents (e.g., setting). This idea has been incorporated into some measures for some symptom types, with the most common example being the discounting of aggression towards siblings as a symptom of ODD or CD.
Both pervasiveness and expectability involve considering contextual information at the level of the referent. Alternatively, the underlying syndrome could be conceptualized to incorporate contextual variability in the manifestation of symptoms (Wright & Zakriski, 2001). Within groups of children experiencing each of ADHD, ODD, and CD, there is variability in the number and types of situations in which they exhibit symptoms (e.g., DuPaul & Barkley, 1992; Matthys, Maassen, Cuperus, & van Engeland, 2001). Although little work has examined this issue, it is possible that stable patterns of situation-symptom contingencies may underlie, or cut across, the existing diagnostic categories. For example, Wright and Zakriski (2001) found that within a group of boys exhibiting clinically significant conduct problems, two distinct subgroups emerged, differentiated by situational variability in aggressive behavior: One group was perceived by teachers to show elevated aggression only in response to aversive events with peers, whereas the second was perceived to engage in elevated aggression in response to all interpersonal situations. Within this theoretical framework, then, not only will symptomatology show variability across situations at the level of the child, but there will be stable patterns of situations and responses that differentiate groups of children. For example, a child who displays oppositionality only with a parent would be considered different from a child who displayed such behaviors only with peers. This approach is evident in the distinction in developmental psychopathology research between reactive and proactive aggression. These two behaviors differ in the eliciting situations: Reactive aggression occurs in the context of perceived provocation, frustration, or threat. In contrast, proactive aggression is planned behavior intended to help achieve a desired outcome (Crick & Dodge, 1996). 
These types of aggression have been associated with unique developmental pathways (e.g., Brendgen, Vitaro, Boivin, Dionne, & Perusse, 2006), suggesting that the incorporation of contextual information may contribute to the identification of more precise etiological mechanisms.
To be consistent with a theoretical model incorporating contextual patterning, the referents must tie symptoms to context, and this information should be maintained when data are aggregated or summarized. Some rating scales and interviews contain contextualized items (e.g., “argues when denied own way,” Wright et al., 2011), but when these items are added together to form a total score this information is lost. The Behavior-Environment Transactional Analysis (BETA; Wright & Zakriski, 2001) is an example of an instrument that maintains situational patterning of behavior by asking informants to report how often children encounter specific social events and how they respond in these circumstances, and then capturing these situation-behavior contingencies in the scoring. For example, children receive separate scores for aggressive behavior in response to aversive events occurring with adults and aversive events occurring with peers. Similarly, there are a number of inventories that assess children’s management of key interpersonal situations, such as responding to conflict with a friend (Rose & Asher, 1999), and peer provocation (Dirks, Treat, & Weersing, 2007b). To date, such measures have not been widely integrated into clinical research or practice. The data they yield, however, could prove valuable for these purposes by providing detailed information about the specific circumstances under which behavioral dysfunction is occurring.
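The loss of information that occurs when contextualized items are summed can be made concrete with a short Python sketch. The items, ratings, and function names below are hypothetical and are not drawn from the BETA or any other instrument; the sketch only contrasts a single total score with scores kept separate for each situation-behavior pair.

```python
from collections import defaultdict

# Hypothetical ratings as (situation, behavior, rating on an assumed 0-3 scale).
# These are illustrative values, not items from the BETA.
child_a = [
    ("aversive event with adult", "aggression", 0),
    ("aversive event with adult", "aggression", 1),
    ("aversive event with peer", "aggression", 3),
    ("aversive event with peer", "aggression", 2),
]
child_b = [
    ("aversive event with adult", "aggression", 2),
    ("aversive event with adult", "aggression", 1),
    ("aversive event with peer", "aggression", 2),
    ("aversive event with peer", "aggression", 1),
]


def total_score(ratings):
    """Sum all ratings, discarding the situation in which each was made."""
    return sum(rating for _, _, rating in ratings)


def contingency_scores(ratings):
    """Keep one score per (situation, behavior) pair."""
    scores = defaultdict(int)
    for situation, behavior, rating in ratings:
        scores[(situation, behavior)] += rating
    return dict(scores)


same_total = total_score(child_a) == total_score(child_b)
profile_a = contingency_scores(child_a)
profile_b = contingency_scores(child_b)
```

In this made-up example the two children receive the same total score, yet the first child's aggression is concentrated in aversive events with peers, whereas the second child's is spread evenly across adults and peers.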
II. b. Issues of Utility in the Use of Rating Scales, Interviews, and Observational Procedures
Ultimately, bringing our measurement approaches in line with data concerning the situational specificity of youth symptomatology should pay dividends for clinical decision-making. When evaluating the utility of an assessment practice, it is important to consider who is being assessed and why, as different methods will be more or less informative depending upon developmental stage (Silverman & Ollendick, 2005), and will be better suited to some purposes than others (Angold & Costello, 2009). We next consider the utility of rating scales, interviews, and observations for different clinical tasks, highlighting circumstances under which the incorporation of contextual information may be particularly useful.
II. b. i. Diagnostic decision-making
Many clinicians rely on unstructured interviews for diagnostic purposes (Jensen-Doss & Hawley, 2010). A recent meta-analysis indicates that case classifications based on these evaluations show limited agreement with those yielded by standardized interviews (Rettew, Lynch, Achenbach, Dumenci, & Ivanova, 2009), a difference that may be partly due to the demands of clinical practice. Jensen and Weisz (2002) compared clinician-generated diagnoses to those obtained through a structured interview and found that the latter were more likely to result in no diagnosis, which may reflect the reality that clinicians have to assign a diagnosis to have services authorized. On average, standardized interviews also generated more diagnoses for a given child, a discrepancy that may relate to time limitations that understandably force clinicians to focus on the primary concern. Although these differences make sense given the constraints of clinical practice, there is evidence that structured interviews are more comprehensive and reliable than unstructured interviews (see Garb, 2007, for review). Only a few studies have examined whether greater structure affords increased validity (Garb, 2007); available evidence, however, suggests that standardized interviews yield more valid classifications than unstructured interviews (see Jensen-Doss & Hawley, 2010, for review). This difference may be due, in part, to the minimization of biases affecting the unstructured collection of diagnostic information (e.g., selectively obtaining information that confirms initial impressions; Garb, 2007), and other influences on clinicians’ judgment (e.g., therapeutic orientation; Pottick et al., 2007). Such data clearly indicate the benefits of incorporating standardized assessments into clinical practice.
A primary reason that clinicians are reluctant to use standardized tools is that they view them as impractical (Jensen-Doss & Hawley, 2010). In this regard, interviews are considerably more onerous than rating scales as they are lengthier, and, in the case of semi-structured interviews, must be administered by a trained individual. It is critical, then, that the scientific benefits of interviews, compared to rating scales, compensate for this additional burden. The relative utility of interviews will vary as a function of the purpose of the assessment. For example, for researchers wishing to quantify symptomatology, available evidence suggests that briefer rating scales perform as well as respondent-based interviews in community samples (Dirks & Boyle, 2010). Similarly, their equivalence has been demonstrated when diagnoses are being made to estimate prevalence rates in the general population (e.g., Boyle et al., 1997). Under these conditions, as long as false positives and false negatives are roughly balanced, inferences will not be affected (Costello, Egger, & Angold, 2005).
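The claim that balanced errors leave prevalence estimates unaffected follows from simple arithmetic, which the Python sketch below works through. All counts are invented for illustration and the function name is hypothetical.

```python
def prevalence_estimates(tp, fp, fn, tn):
    """Return (true prevalence, estimated prevalence, number misclassified).

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives in a hypothetical sample.
    """
    n = tp + fp + fn + tn
    true_prevalence = (tp + fn) / n        # children who actually have the syndrome
    estimated_prevalence = (tp + fp) / n   # children the measure classifies as cases
    misclassified = fp + fn
    return true_prevalence, estimated_prevalence, misclassified


# Invented sample of 1,000 children in which the two error types are balanced.
balanced = prevalence_estimates(tp=80, fp=20, fn=20, tn=880)

# Same true prevalence, but false positives now outnumber false negatives.
unbalanced = prevalence_estimates(tp=80, fp=60, fn=20, tn=840)
```

In the balanced case the estimated prevalence equals the true prevalence even though 40 individual children are misclassified, which is why a measure that is adequate for estimating population rates may still be inadequate when each individual classification is tied to a treatment decision.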
In the clinic, however, diagnoses are tied to treatment, making accurate identification of cases critical. A number of studies have demonstrated that rating scales perform as well as structured interviews for diagnosing ADHD (see Johnston & Mah, 2008; Pelham et al., 2005). However, further work is needed to assess the generalizability of these findings, particularly among samples of youth seeking clinical services. There appears to be less work comparing interview- and rating scale-based diagnoses of ODD and CD, but two investigations have suggested that rating scales perform comparably to structured (Edelbrock & Costello, 1988) and semi-structured interviews (Grayson & Carlson, 1991).
This research provides preliminary evidence for the possibility that ADHD, ODD, and CD could be accurately diagnosed with briefer assessments. Further support for the potential of using shorter measures comes from data indicating that, although the DSM-IV weights all symptoms of these syndromes equally, some symptoms are more predictive of a diagnosis than others (e.g., Frick et al., 1994; Gelhorn et al., 2009; Power et al., 2001). Such findings suggest the possibility of paring down assessment items. Building on this idea, psychometric advances, particularly item-response theory (IRT), have facilitated the development of computerized adaptive testing (CAT), an individualized approach to measurement that greatly reduces the number of questions needed to assess accurately the construct of interest (Cella et al., 2007). Applications of CAT to the assessment of psychopathology have begun recently (Reise & Waller, 2009) and wider use of this technique will contribute to the development of more efficient assessment batteries. This dissemination will also advance basic knowledge of developmental psychopathology, as this approach could provide information about which symptoms, as rated by which informants, are most predictive of clinically significant syndromes.
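To illustrate how adaptive testing can reduce the number of questions, the following Python sketch implements one common CAT scheme: items are modeled with a two-parameter logistic (2PL) IRT model, the next item is the one with maximum Fisher information at the current trait estimate, and the trait is re-estimated after each response. This is a generic textbook scheme, not the procedure of any study cited here; the item wording and parameter values are invented, and the grid-based estimator with a standard-normal prior is an assumption chosen for brevity.

```python
import math

# Hypothetical 2PL item bank: item -> (discrimination a, difficulty b).
# Item wording and parameter values are invented for illustration.
ITEM_BANK = {
    "argues with adults": (1.2, -1.0),
    "loses temper": (1.5, -0.5),
    "defies requests": (1.8, 0.0),
    "deliberately annoys others": (1.4, 0.8),
    "spiteful or vindictive": (1.6, 1.5),
}


def prob_endorse(theta, a, b):
    """2PL probability that an informant endorses the item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * p * (1 - p)."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, administered):
    """Pick the not-yet-administered item that is most informative at theta."""
    candidates = [name for name in ITEM_BANK if name not in administered]
    return max(candidates, key=lambda name: information(theta, *ITEM_BANK[name]))


def estimate_theta(responses):
    """Posterior-mean trait estimate on a grid, assuming a standard-normal prior."""
    grid = [g / 10.0 for g in range(-40, 41)]
    weights = []
    for theta in grid:
        weight = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for name, endorsed in responses.items():
            p = prob_endorse(theta, *ITEM_BANK[name])
            weight *= p if endorsed else (1.0 - p)
        weights.append(weight)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


# One adaptive step: start at the prior mean, ask the most informative item,
# record a (made-up) endorsement, then update the estimate and pick again.
responses = {}
first = next_item(estimate_theta(responses), responses)
responses[first] = True
updated_theta = estimate_theta(responses)
second = next_item(updated_theta, responses)
```

In practice the loop would continue until the trait estimate is sufficiently precise, so that informants reporting few or many symptoms are each asked only the items most relevant to their level.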
Although there is evidence suggesting that briefer assessments may yield comparable classification, it is also important to consider other types of information essential for diagnostic decisions, as well as whether a trained interviewer may be necessary to gather these data. For example, an interviewer may be able to obtain more precise estimates of the onset and duration of symptoms, which may prove important given evidence for the different trajectories associated with early- versus late-onset CD (Frick & McMahon, 2008). Empirical tests of the incremental validity of interviews should consider all of the information being gathered, to pinpoint more precisely the conditions under which interviews yield significant added value.
This work must also consider developmental level, as it is likely that the need for more intensive assessments, such as interviews or diagnostic observation, will vary across childhood. During some developmental periods, it may be difficult for someone without specialized training to determine whether a behavior is clinically concerning. For example, aggression and oppositionality commonly occur in preschoolers. Thus, the presence of these behaviors per se may not be as clinically informative as it is in older children, making reliance on reports of behavioral frequency inadequate during this developmental period (Wakschlag, Tolan, & Leventhal, 2010). To address the challenges associated with disentangling clinically significant disruptive behavior from normative misbehavior at this age, Wakschlag et al. (2008a, b) developed the Disruptive Behavior Diagnostic Observation System (DB-DOS). This standardized diagnostic observation moves beyond simple behavioral counts by using ordinal ratings to code clinical concern. These judgments are based on the quality of behavior, its age appropriateness, and, importantly, the context in which it is occurring. For example, saying “no” in response to a request to clean up is developmentally expectable, but a “reflexive” no across a range of circumstances is not and is thus coded as clinically concerning (Wakschlag et al., 2007). Early evidence from the DB-DOS suggests that examining the expectability and pervasiveness of preschoolers’ disruptive behavior may have incremental clinical utility, suggesting the potential value of systematically incorporating contextual information for accurate identification of clinical syndromes (Wakschlag et al., 2008a).
II. b. ii. Treatment planning and monitoring
Incorporation of contextual features could also be helpful for planning or monitoring progress in treatment. To date, little research has examined treatment utility, the extent to which an assessment contributes to beneficial intervention outcomes (Mash & Hunsley, 2005). Some studies, however, have shown that functional-analytic assessments, which focus on understanding the conditioning of symptom expression, are associated with greater improvement in treatment (Haynes, Liesen, & Blaine, 1997), suggesting the possible utility of incorporating situation-level variability into measures and maintaining it in scoring algorithms. Similarly, assessing change in overall rates of behavior may obscure important differences in children’s behavior in specific social situations. Wright et al. (2011) found that over the course of a therapeutic summer-camp program, children’s average level of prosocial behavior increased and mean aggression decreased. Closer inspection revealed, however, that children’s aggressive behavior actually increased significantly in response to provocation by a peer, and prosocial behavior in this situation decreased. These findings suggest that although there are a number of rating scales sensitive to change in treatment for ADHD, ODD, and CD (Frick & McMahon, 2008; Johnston & Mah, 2008), such global assessments may provide an incomplete accounting of behavioral change. Children’s behavior may improve in some situations, but worsen or show no change in others.
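The way in which an overall average can mask situation-specific change is illustrated by the brief Python sketch below. The situations and rates are invented and are not data from Wright et al. (2011); an unweighted mean across situations is assumed for simplicity.

```python
# Invented aggression rates (proportion of encounters with an aggressive response)
# before and after a hypothetical intervention; not data from any cited study.
pre = {
    "peer provocation": 0.30,
    "adult instruction": 0.40,
    "peer prosocial approach": 0.20,
}
post = {
    "peer provocation": 0.40,
    "adult instruction": 0.15,
    "peer prosocial approach": 0.05,
}


def mean_rate(rates):
    """Unweighted mean rate across situations (a simplifying assumption)."""
    return sum(rates.values()) / len(rates)


def change_by_situation(before, after):
    """Post-minus-pre change in rate, kept separate for each situation."""
    return {situation: after[situation] - before[situation] for situation in before}


overall_change = mean_rate(post) - mean_rate(pre)
situation_change = change_by_situation(pre, post)
```

With these made-up numbers the overall change is negative, suggesting improvement, while the change for peer provocation is positive. A global score would therefore report improvement for a child whose responses to provocation had worsened.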
Although much more research is needed to determine the generalizability of these findings, particularly within the context of widely implemented interventions for childhood disorders, preliminary evidence points to the value of situation-specific measurement in the context of intervention planning and delivery. Observational methods would seem ideally suited to this task, and work is ongoing to increase the feasibility of these approaches for clinical use (Pelham et al., 2005; Wakschlag et al., 2008b). It is also possible to translate the knowledge gained from more intensive methodologies into briefer instruments. The nuanced information about situation-behavior patterning gleaned from qualitative interviewing or observational paradigms can provide the foundation for the construction of contextually and developmentally sensitive rating scales (e.g., Dirks, Treat, & Weersing, 2011; Wright & Zakriski, 2001). For example, Wakschlag, Briggs-Gowan and colleagues have “translated” constructs about behavioral qualities salient to identification of disruptive behavior at preschool age from direct observation during the DB-DOS to a paper and pencil measure. This Multidimensional Assessment of Preschool Disruptive Behavior queries multiple facets and contexts of behavior in order to distinguish normative from clinically concerning occurrence (Wakschlag et al., 2011). In general, such measures maintain valuable contextual information, but may be more broadly useful given how much easier they are to administer.
The use of contextualized measures of psychopathology holds considerable promise for clinical practice, and we advocate here for research that examines empirically the clinical utility of such assessments. To begin, incorporating existing situation-based inventories of children’s behavior (e.g., Dirks et al., 2007b; Rose & Asher, 1999) into intervention studies will provide preliminary evidence concerning whether the inclusion of contextual information provides a more nuanced view of behavior change. Next, a more systematic approach could be taken to assess the treatment utility of these measures. In general, randomized studies of measurement approaches have been rare; however, such investigations would provide significant information about the relative utility of different assessment strategies. For example, using this approach to assess the incremental validity of systematically tiered levels of methods would provide an empirical knowledge base for decision-making about inclusion of various levels of measurement. Although this type of work would be labor intensive, such efforts would be justified by the strength of the resulting inferences and implications for treatment, and the proposed studies offer a promising opportunity for researchers and clinicians to collaborate in ways that would enhance assessment, theory, and intervention.
III. Use of Multiple Informants in the Assessment of Developmental Psychopathology
As the field works to develop rating scales and interviews that focus explicitly on the contextual patterning of children’s behavior, clinicians and researchers who use these methods will continue to face a second choice point: Who should complete them? It is generally recommended that data be collected from more than one informant (Hunsley & Mash, 2007), and typical raters include the children themselves, their parents, and their teachers. It has been widely documented that the agreement between any two of these individuals is typically low to moderate (Achenbach, McConaughy, & Howell, 1987; De Los Reyes & Kazdin, 2005). To make sound choices about which informants to ask, and ultimately, to make sense of the resulting data, it is essential to understand why these discrepancies arise.
III. a. Reasons for Informant Discrepancies
Historically, random error, which can result from a number of different factors (e.g., differing interpretations of the anchors on a rating scale), has been viewed as the principal reason informants diverge (De Los Reyes, 2011). Several lines of evidence, however, suggest that this is not the case. First, different informants provide reports of children’s behavioral problems that are reliable and valid (De Los Reyes, 2011). Second, reports by different raters often share unique associations with a number of indices of youth functioning, both concurrently and longitudinally (see Burt, McCue, Krueger, & Iacono, 2005; Collishaw, Goodman, Ford, Rabe-Hesketh, & Pickles, 2009), and some research suggests that the variance unique to informants may share stronger associations with criterion variables than the variance shared between them (Dirks, Boyle, & Georgiades, 2011; but see Van Dulmen & Egeland, 2011). Third, discrepancies between informants are stable over time (e.g., De Los Reyes, Alfano, & Beidel, 2010) and show high levels of internal consistency (De Los Reyes et al., 2011a).
III. a. i. Reason 1: Informants’ unique perspectives
Given such findings, it is likely that systematic differences between raters are playing a bigger role in informant disagreement. Some of these are sources of error: Factors that cause raters to consistently report particular symptoms unconfirmed by other sources. A significant amount of work has focused on detailing such biases (see De Los Reyes & Kazdin, 2005), including contrast effects, such that the behavior of one sibling influences perceptions of the other (Simonoff et al., 1998); and halo effects, in which estimates of a given behavioral problem (e.g., ADHD), are inflated in the presence of other symptom types (e.g., ODD; Abikoff, Courtney, Pelham, & Koplewicz, 1993).
Some of the differences between informants’ perceptions, however, likely reflect variability in the meaning or interpretation of a particular behavior or symptom across contexts (Dirks, Treat, & Weersing, 2010; De Los Reyes et al., 2009). Research has shown that thresholds for the acceptability of children’s behavior vary as a function of cultural factors (see Weisz, Chaiyasit, Weiss, Eastman, & Jackson, 1997), and at a more micro level, these thresholds likely also vary across settings. In school, for instance, teachers must handle the demands of managing a classroom, and under these circumstances, behaviors that are often considered assertive, such as questioning rules and perceived unfair treatment (Gresham & Elliott, 2008), may be construed as oppositional. As such, informant discrepancies may be capturing, in part, differences in the types of behaviors that are problematic in a given context from the perspective of a particular informant (Dumenci et al., 2011). When considered within this framework, variability in informants’ ratings is not a problem, but an opportunity to learn about children’s adaptation in various settings. Disentangling the extent to which informant discrepancies reflect factors resulting from rater characteristics and genuine differences in the meaning of a behavior across settings will be an important focus for future research.
III. a. ii. Reason 2: Situation specificity of children’s behavior
In addition to informant characteristics and perspective, researchers have hypothesized that the marked situation specificity of children’s behavior is a key contributor to inter-rater discrepancies (Achenbach et al., 1987; De Los Reyes & Kazdin, 2005). Previously, support for this supposition has been limited to the indirect evidence that there is greater agreement between informants in the same setting (e.g., peers and teachers) than informants in different settings (e.g., parent and teacher; Achenbach et al., 1987). Two recent studies, however, provide more direct corroboration. De Los Reyes et al. (2009) used the DB-DOS to examine the associations between preschoolers’ disruptive behavior observed in two interpersonal contexts – interacting with an examiner and interacting with a parent – and different informant ratings. Results indicated that observed disruptive behavior with the parent was associated with parent, but not teacher, ratings of disruptive behavior, whereas observed disruptive behavior with the examiner was associated with teacher, but not parent, ratings of disruptive behavior, a pattern that indicates that contextual variability in children’s behavior is “real,” and not merely an artifact of rater characteristics. In a second study, Hartley, Zakriski, and Wright (2011) found that greater similarity in the types of interpersonal events children experience at home and school predicted increased agreement between parent and teacher reports of their aggressive behavior, suggesting that some of the discrepancy between parent and teacher reports might be attributable to differences in the social situations children encounter in each context. Greater situational similarity likely leads to increased consistency of behavior across contexts, which then contributes to greater convergence between informants. 
In both of these studies, then, variability in children’s behavior across interpersonal contexts, defined by both interaction partner and interaction type, contributes substantially to the degree of inter-rater agreement.
III. b. Theoretical Implications of Informant Discrepancies
Such evidence that the differences between informants reflect meaningful variation is inconsistent with the historical emphasis in developmental psychopathology on the agreement between raters (see Hartley et al., 2011). This expectation of convergence is consistent with a theoretical model of psychopathology as a trait that generalizes across contexts (see Rowe & Kandel, 1997): a given syndrome should manifest in the same way across settings and situations, and be perceived in the same way by raters. Within this conceptual framework, each informant is thought to provide an alternate sample of the indicators of the underlying construct. As noted by McFall and Townsend (1998) “if the construct is a good one, these different sampling methods should yield convergent evidence” (p. 317). If measurement of the referents is adequate, and if the referents reflect the same underlying construct, informants’ ratings should converge.
The presence of significant discrepancies between raters, then, signals one of two issues. Given the characteristics and perspective unique to each informant, inter-rater discrepancies may reflect differences in understanding of the referents. There may be variability between parents and teachers, for example, in their judgments of the nature and severity of behaviors that would warrant ratings of “often forgetful,” “often leaves seat,” and “easily distracted.” In this case, it may be possible to reduce inter-rater discrepancies by providing tighter specification of symptoms. If, however, differences in ratings are at least partly driven by informants’ access to different behavioral samples, as well as differences in the meaning of a behavior in a given context, then the issue may lie with the overarching theoretical construction. Rather than reflecting a unitary syndrome, it may be that reports by different informants are representative of different underlying constructs; for instance, teacher-reported ODD may be a different construct than parent-reported ODD (see Drabick, Gadow, & Loney, 2007; Drabick, Bubier, Chen, Price, & Lanza, 2011), a conceptualization that maps on to findings, reviewed earlier, that there may be functional differences between children in the manifestation of psychopathology (e.g., Wright & Zakriski, 2001). For example, children who have behavioral difficulties only in interactions with peers may be identified by teachers, but not parents (De Los Reyes et al., 2009).
III. b. i. Implications of theoretical models of informant discrepancies for data aggregation strategies
Clarifying the underlying theoretical model of informant discrepancies is critical because it informs the selection of strategies used to combine multi-informant data. Many of the strategies used to aggregate data are inconsistent with the burgeoning evidence that source-specific variability is meaningful. The “or” rule counts a symptom (or diagnosis) as present if it is endorsed by any informant, making no distinction between children for whom there is agreement about symptoms or classification and those for whom there is disagreement (Dirks et al., 2011a). Adding together the symptoms identified by different informants also does not distinguish between raters; a child who had two symptoms reported by a parent and six by a teacher would be treated the same as one who had eight symptoms reported by a parent and none by a teacher (Holmbeck, Li, Schurman, Friedman, & Coakley, 2002). Alternatively, the “and” rule emphasizes convergence of information; symptoms (or diagnoses) “count” only when informants agree. Similarly, latent constructs that combine data provided by multiple raters reflect the variance shared between informants, with unique information relegated to the error terms (Holmbeck et al., 2002), although it is possible, with careful selection of informants, to model inter-rater discrepancies meaningfully (Kraemer et al., 2003).
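The contrast among these rules can be made concrete with a minimal sketch. The function names, the informant keys, and the symptom labels (`s1` to `s8`) are hypothetical placeholders introduced here for illustration and are not drawn from any published scoring procedure; the two example children mirror the two-versus-six and eight-versus-none case described above.

```python
from typing import Dict, Set

Reports = Dict[str, Set[str]]  # informant -> set of endorsed symptoms


def or_rule(reports: Reports) -> Set[str]:
    """A symptom is present if ANY informant endorses it."""
    return set().union(*reports.values())


def and_rule(reports: Reports) -> Set[str]:
    """A symptom is present only if EVERY informant endorses it."""
    endorsed = list(reports.values())
    return set.intersection(*endorsed) if endorsed else set()


def summed_count(reports: Reports) -> int:
    """Add endorsements across informants, ignoring who reported them."""
    return sum(len(symptoms) for symptoms in reports.values())


# Hypothetical children; the symptom labels are placeholders.
child_a = {"parent": {"s1", "s2"},
           "teacher": {"s3", "s4", "s5", "s6", "s7", "s8"}}
child_b = {"parent": {"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"},
           "teacher": set()}

for name, child in (("A", child_a), ("B", child_b)):
    print(name, summed_count(child), len(or_rule(child)), len(and_rule(child)))
```

In this toy case the summed count and the “or” rule return identical results for the two children despite their opposite informant patterns, whereas the “and” rule returns no symptoms for either child. None of the three summaries preserves which informant reported what.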
Treating raters as equivalent, or discarding the differences between them as error, will result in the loss of valuable information about children’s current impairment and ultimate prognosis, leading a number of authors to suggest that information provided by raters should be maintained separately (e.g., Offord et al., 1996; Drabick et al., 2007). This source-specific approach is consistent with a theoretical model that indicates that variability across informants’ ratings is consequential. It also assumes, however, that the agreement between raters is not informative (Baillargeon, Boulerice, & Tremblay, 2001), a problematic premise for at least two reasons. First, the variability shared among informants is consistently associated with outcomes of interest (e.g., Cole, Martin, Powers, & Truglio, 1996; Perren, Von Wyl, Stadelmann, Burgin, & Von Klitzing, 2006) suggesting that it is not occurring purely by chance. Second, there may be differences between children identified as exhibiting a clinical syndrome by multiple informants compared to one informant, variability that will not be apparent if ratings are kept separate (e.g., Ho et al., 1996).
What is needed, then, are strategies that capture both the convergence and divergence among raters. One approach is to differentiate between children identified as having a clinically impairing syndrome by one or multiple informants. The ADHD and Disruptive Behaviors Workgroup has suggested this approach for DSM-5, recommending the use of a severity index of ODD based on the pervasiveness of symptoms across contexts (Drabick, 2011). Because informant typically serves as a proxy for setting (Drabick, 2011), in practice, this approach, which is similar to the DSM-IV specification that impairment must be present in two settings for a child to be diagnosed with ADHD, would often mean children identified by only their parent (or a teacher) would be seen as having a less severe presentation than those identified by both.
Research, however, does not unequivocally support this framework for either ODD or ADHD. For example, Drabick et al. (2007) compared boys in three groups: those who met criteria for ODD based on maternal report only, teacher report only, or report by both informants (combined). To provide support for the hypothesis that the combined group was the most severely impaired, their functioning would have to be significantly poorer than that of both single-informant groups. This pattern emerged for only two of eighteen variables. In a second study, Munkvold, Lundervold, Lie, and Manger (2009) found that a combined group was significantly more impaired, as rated by both parents and teachers, than parent- and teacher-only groups, and had more CD symptoms, as rated by teachers. The combined group was not the most severe on seven other variables, however, and was identified using the “and” rule for symptoms, which resulted in the identification of a relatively small (0.2% of 7,007 children), and thus possibly unrepresentative, group.
Evidence for the hypothesis that “pervasive” ADHD identified by both parents and teachers represents a more severe presentation than “situational” ADHD identified by only one of these informants is also not clear cut (see Costello, Loeber, & Stouthamer-Loeber, 1991; Ho et al., 1996). Some work has shown that pervasive ADHD is associated with poorer functioning than situational ADHD on a number of objectively measured indices, including inhibitory control and response reengagement (Schachar, Tannock, & Logan, 1993), IQ (Schachar, Rutter, & Smith, 1981), and objectively measured levels of hyperactivity (Tripp & Luk, 1997). Other investigations, however, are not consistent with this pattern, finding no significant difference across groups on the latter two variables (e.g., Costello et al., 1991; Rettew et al., 2011; Rapoport, Donnelly, Zametkin, & Carrougher, 1986).
Two recent investigations have relevance for the applicability of this conceptualization to CD, comparing children identified as having clinically concerning conduct problems by parents only, teachers only, or both parents and teachers (combined). The first found that the combined group had significantly lower IQ scores and significantly greater parent-rated impairment than the other two groups (Rettew et al., 2011). The magnitude of the difference between the combined and parent-only groups on the impairment rating was small, however, and teacher-ratings of impairment did not differ between the combined and teacher-only groups. The second found no difference between the groups on a number of longitudinal outcomes, including criminality, substance use, anxiety, and depression, although the small number of children per group may have limited analytic power (Fergusson, Boden, & Horwood, 2009).
Taken together, available data are not clearly consonant with a model positing cross-setting pervasiveness as a marker of syndrome severity. There are not enough studies assessing the patterning of correlates and outcomes across syndromes identified by different informants to draw firm conclusions about ODD and CD, and although more data are available concerning ADHD, interpretations are complicated by small sample sizes, under-representation of girls, and differences across studies in the definition of situational hyperactivity (i.e., whether children identified by parents only or teachers only are considered separately or collapsed into one group). As we await further research to elucidate this issue, two themes emerge from the extant literature. First, clinically significant syndromes identified by only a parent or only a teacher are associated with substantial impairment and should not be discounted (see Drabick, 2011; Fergusson et al., 2009). Clinicians may wish to investigate carefully whether ADHD reported by parent only would be better characterized as a disruptive behavior problem, given data suggesting that these children (a) are not distinguishable from children with antisocial behavior problems on a number of indices, including family relationships and IQ (Ho et al., 1996), (b) do not demonstrate the same deficits in executive control exhibited by children with ADHD identified by a teacher (Schachar et al., 1993), and (c) may have better long-term prognoses (Mannuzza, Klein, & Moulton, 2002). Second, the assumption made by a cross-setting severity index is that what matters is the number of settings in which children are impaired, but previous work suggests that which settings are involved is also critical information, as children with clinically significant syndromes identified by parents appear different from those identified by teachers as well as from those identified by both.
III. c. Utility and the Use of Multi-Informant Data: Choosing Among Aggregation Strategies and Informants
III. c. i. Aggregation strategies
If there are important differences among groups of children identified via different combinations of informants, then there should be utility in maintaining this patterning during clinical decision-making. Researchers have adopted a number of different approaches to capture this information analytically. Laird and Weems (2011) suggested constructing regression models that assess whether the interaction between informants explains additional variance, after accounting for the prediction afforded by the separate ratings. Kraemer et al. (2003) advocated using a principal components analysis to parse explicitly the variance between informants into three meaningful components: trait, the characteristic of interest; context, “factors related to place and circumstance that influence the subject’s expression of [the trait]” (p. 1569); and perspective, the characteristics of informants that affect their judgments. Similarly, other investigators have used factor-analytic strategies to derive latent variables capturing different aspects of informants’ ratings. For example, Dumenci et al. (2011) created factors reflecting a higher-order externalizing trait generalized across raters, and lower-order traits reflecting source-specific variability. Finally, a number of research teams have used latent class analysis to identify groups of children differentiated by their behavior in specific contexts (e.g., behavior is displayed when interacting with a parent, with a stranger, or both; De Los Reyes et al., 2009) or as perceived by different informants (e.g., high ratings of problematic behavior given by mother only, teacher only, or both; Fergusson et al., 2009).
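One way to read the regression approach attributed above to Laird and Weems (2011) is as a comparison of nested models: a model with the two informants’ ratings entered separately, and a model that adds their product. The sketch below illustrates that logic under stated assumptions and is not a reproduction of their procedure. The ratings are synthetic, the outcome is deliberately constructed to depend on the parent-by-teacher product, the helper name `ols_r2` is hypothetical, and only the descriptive change in R² is computed (no significance test).

```python
from typing import List, Sequence


def ols_r2(predictors: List[List[float]], y: Sequence[float]) -> float:
    """R^2 of a least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + list(r) for r in predictors]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coefs = [aug[i][k] for i in range(k)]
    fitted = [sum(c * x for c, x in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic ratings for eight hypothetical children.
parent = [0, 0, 1, 1, 2, 2, 3, 3]
teacher = [0, 3, 1, 2, 0, 3, 1, 2]
# Outcome built to depend on the parent-by-teacher product.
outcome = [p + t + p * t for p, t in zip(parent, teacher)]

r2_main = ols_r2([[p, t] for p, t in zip(parent, teacher)], outcome)
r2_full = ols_r2([[p, t, p * t] for p, t in zip(parent, teacher)], outcome)
delta_r2 = r2_full - r2_main
print(f"main effects R^2 = {r2_main:.3f}; with interaction = {r2_full:.3f}")
```

A positive `delta_r2` indicates that the pattern across informants carries predictive information beyond the two ratings considered separately. With real data, the increment would need to be evaluated inferentially and replicated.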
As a beginning step, the last approach may hold the most promise for case conceptualization. This strategy could be adapted for clinical use by identifying meaningful cut points on dimensions of interest as rated by a particular informant and using those to classify children. For example, children would be grouped as manifesting clinically significant ODD as identified by parent only, teacher only, or both (e.g., Drabick et al., 2007). Clinicians, fundamentally, have to make a dichotomous decision – treat or not – and given the evidence reviewed previously, children in all three groups would warrant intervention. However, what type of intervention, and how children could respond, might vary meaningfully and in unexpected ways if the differences between these groups are not limited to phenomenology. There is some evidence, for instance, that children with pervasive hyperactivity benefit more from treatment with stimulants than children with situational hyperactivity (Schachar & Tannock, 1993).
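The cut-point strategy described above can be sketched in a few lines. This is a minimal illustration, not a validated procedure: the function name, the child identifiers, the scores, and the cut point of 4 are all hypothetical placeholders, and in practice cut points would need to be established empirically for each instrument and informant.

```python
from typing import Dict, Tuple


def classify_by_source(parent_score: float, teacher_score: float,
                       parent_cut: float, teacher_cut: float) -> str:
    """Group a child by which informant(s) rate at or above the cut point."""
    parent_flag = parent_score >= parent_cut
    teacher_flag = teacher_score >= teacher_cut
    if parent_flag and teacher_flag:
        return "both"
    if parent_flag:
        return "parent_only"
    if teacher_flag:
        return "teacher_only"
    return "neither"


ILLUSTRATIVE_CUT = 4  # placeholder value, not a validated threshold

# Hypothetical (parent_score, teacher_score) pairs.
ratings: Dict[str, Tuple[int, int]] = {
    "child_1": (5, 1),
    "child_2": (2, 6),
    "child_3": (6, 7),
    "child_4": (1, 0),
}

groups = {
    child_id: classify_by_source(p, t, ILLUSTRATIVE_CUT, ILLUSTRATIVE_CUT)
    for child_id, (p, t) in ratings.items()
}
print(groups)
```

Unlike a simple count of settings, this grouping keeps parent-only and teacher-only presentations distinct, so their correlates and treatment response can be examined separately.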
Clinicians are sensitive to the context in which symptoms occur (Pottick et al., 2007), and many will be incorporating this type of information into their conceptualizations already. Systematizing this process provides an opportunity to examine critically the potential clinical utility of such distinctions, allowing for further refinements. For example, it would be important to establish that there is predictive power with regard to treatment outcome associated with establishing these groupings. Preliminary evidence could be obtained by reanalyzing existing data to ascertain (a) whether it is possible to obtain consistently meaningful classifications of children into these groupings, and (b) their associations with correlates and outcomes, both normatively and in response to intervention. The existing diagnostic categories likely provide a useful starting point. Given the movement within the field to identifying core, underlying mechanisms of psychopathology (Insel et al., 2010), it may eventually be fruitful to examine inter-informant variability in more specific behavioral and emotional processes.
Although there is anticipated benefit to maintaining cross-informant patterning at the onset of treatment, considering each rater separately may be the most useful strategy for monitoring progress. Research suggests that raters’ ability to report on behavior outside of their own setting is limited. For example, parental report on behavior at school shows markedly higher correlations with their ratings of behavior at home than with teacher report of behavior at school, and the converse is also true (de Nijs et al., 2004; Mitsis, McKay, Schulz, Newcorn, & Halperin, 2000). As such, report from an informant in one setting may not capture adequately functioning in a different context. Given the situation specificity of children’s behavior, more generalized response to intervention may not always occur, making it important to collect data from an informant with first-hand knowledge of the setting of interest (De Los Reyes & Kazdin, 2009). One concern about using a source-specific approach is that it may yield lower quality measurement than strategies that combine information. There is some evidence to suggest, however, that reliability of source-specific ratings is comparable to a number of other data aggregation approaches, including both the “and” and the “or” rule for symptoms (Drabick et al., 2007; Jensen et al., 1995; Kraemer et al., 2003; Munkvold et al., 2009; Offord et al., 1996).
III. c. ii. Informants
As the preceding review has made clear, there is substantial clinical utility associated with collecting information from both parents and teachers when making diagnostic decisions concerning ODD, CD, and ADHD (for additional evidence, see Owens & Hoza, 2003; Pelham et al., 2005; Loeber, Green, Lahey, & Stouthamer-Loeber, 1989). Most of this work has been conducted with school-aged children, but there is evidence that teacher reports will also be useful for those who attend preschool (e.g., Murray et al., 2007). Obtaining self-report from children is also informative under some circumstances. Depending on the instrument used, young children may not be able to provide a reliable report (Frick & McMahon, 2008). For older children, however, self-report is a critical piece of the puzzle in the assessment of CD, likely because many of the behaviors occur in settings to which adults are not privy (Cantwell, Lewinsohn, Rohde, & Seeley, 1997; Jensen et al., 1999; Loeber et al., 1989). In contrast, children’s self-report of ADHD symptoms is of limited value (Pelham et al., 2005), and there is debate concerning how much children’s self-report of ODD symptoms contributes beyond parental report (Angold & Costello, 1996; Loeber et al., 1989; Jensen et al., 1999).
Thus far, the research reviewed provides information about a given class of informants, on average. One question with which clinicians must wrestle is whether there are conditions under which reports provided by a particular rater may not be credible (Youngstrom et al., 2011; De Los Reyes et al., 2011b). Given the reliance of clinicians and researchers on maternal report, there has been substantial interest in factors that may impact mothers’ judgments, with much work focusing on whether maternal depression is associated with a tendency to over-endorse disruptive behavior problems. This phenomenon has been demonstrated (e.g., Boyle & Pickles, 1997; Briggs-Gowan, Carter, & Schwab-Stone, 1996; but see Conrad & Hammen, 1989), but the magnitude of the bias may actually be quite small, indicating that there is still value in these reports (Youngstrom et al., 1999). When considering teacher ratings, concern has been raised that there may be a systematic over-reporting of externalizing problems for minority children (e.g., Epstein et al., 2005); however, empirical support for this position is equivocal. Some studies are consistent with this hypothesis (e.g., Sonuga-Barke et al., 1993; see Lau et al., 2004, for review), but others are not (e.g., Chang & Sue, 2003; Epstein et al., 2005; Hosterman et al., 2008), with evidence appearing stronger for disruptive behaviors than for ADHD. Some researchers have suggested that such biases may be due to a cultural mismatch between teachers, who, at least in the United States, are predominantly non-Hispanic white (Hosterman et al., 2008), and their students (Puig et al., 1999); however, data addressing this issue appear sparse and do not clearly indicate that congruence between teacher and student ethnicity will yield a more accurate accounting (see Pigott & Cowen, 2000; Dominguez de Ramirez & Shapiro, 2005).
Even if reports by parents and teachers, on average, do not show evidence of substantial bias, clinicians will always confront individual cases in which they are concerned about the veracity of a report (e.g., the informant uses substances; Youngstrom et al., 2011). In recent research, Youngstrom et al. (2011) examined clinicians’ perceptions of informants’ credibility. Results indicated that informants seen as less credible did provide less valid information, but the authors concluded that these differences were not great enough to justify discarding the data. Although further work on this issue is needed, these findings suggest that it is rare that an informant’s report is of no value, and that one fruitful direction for research would be the development of techniques to correct for systematic error in informants’ reports, both at the individual and aggregate levels.
IV. “Coming Around Again”: Application of Advances in Assessment to the Refinement of Conceptualizations of Child Psychopathology for DSM-5
Advancing understanding of how to obtain maximum benefit from informants’ reports will increase the clinical utility of these instruments, but ultimately what is needed is greater understanding of the meaning of informant disagreement for conceptualizations of clinical phenomenology. Although inter-rater discrepancies in judgments of children’s psychopathology have been viewed as a problem, these differences reflect meaningful variability in children’s behavior and informants’ perspectives across contexts, and as such, the presence of informant disagreement provides an opportunity to advance theory and nosology in childhood psychopathology, which, in turn, should contribute to an increased understanding of developmental mechanisms. For example, there is growing evidence that genetic contributions to childhood psychopathology vary as a function of informant (e.g., Burt, 2009; Gizer et al., 2008). Such work suggests further unpacking informant discrepancies will advance clinical practice not only by enabling the development of more valid and efficient assessment techniques, but also by contributing to fundamental understanding of the etiology and maintenance of psychiatric disorders in childhood.
The critical next step for this line of research is to disentangle the relative contribution of (a) situational variability in behavior, and (b) rater-specific variables. To date, work on the situation specificity of children’s behavior has been conducted along disparate lines from investigation of inter-rater discrepancies in evaluations of children’s psychological symptomatology. The merging of these two traditions (e.g., De Los Reyes et al., 2009; Hartley et al., 2011) will be essential as researchers work to delineate the extent to which variability in informants’ evaluations is driven by differences in the behavior of the child across contexts versus factors related to the informant, including both bias and varying perspectives resulting from the demands of a particular setting. Such work is already underway. For example, a recent investigation by Gomez (2007) used IRT to demonstrate that ADHD symptoms were perceived in a similar way by parents and teachers, suggesting that the low agreement between informants resulted from cross-setting differences in children’s behavior. More research of this type is needed to examine the generalizability of these findings to other symptom types, as well as to clinical samples.
It is also imperative that research move beyond the common practice of confounding informant and setting (i.e., using parent report to assess behavior at home and teacher report to assess behavior at school; Drabick, 2011), which often complicates interpretation of data due to the issue of shared method variance (see Costello et al., 1991) and provides only a crude measure of children’s behavior across settings and situations. This decoupling can be achieved by systematically assessing differences in specific behaviors across situations directly, resulting in a clearer mapping of the roles of situational and informant factors in inter-rater discrepancies (De Los Reyes et al., 2009). This work will contribute to the continued development of theoretical models of developmental psychopathology. In this paper, we have suggested two possible ways to parse children’s symptomatology: functional groupings, based on the situations in which symptoms occur, and source-specific categories, defined by the combinations of informants who have identified clinically significant syndromes or behaviors. Although related, these conceptualizations are not the same, and in order to determine which approach is more valid, it is necessary to separate children’s behavior from the informant, so that the contributions of each may be analyzed.
Advances in contextualized measurement make it possible to answer these questions. Much work in this area has relied on intensive, naturalistic observations (e.g., Wright et al., 2011), which provide a rich behavioral sample, but are impractical, particularly in clinical settings. It is now clear that it is possible to capture reliable, clinically meaningful, contextual variability in behavior using interviews and rating scales (e.g., Dirks et al., 2007b; Wright & Zakriski, 2001), as well as brief, structured observational tasks (Wakschlag et al., 2008a). The increased feasibility of these approaches will allow researchers to conduct studies explicitly examining the associations between situational factors and symptomatology with a variety of samples and in an increased range of settings, providing significant opportunity to advance understanding of the role of situation- and setting-level factors in externalizing behavior problems. For example, Gray et al. (2011) utilized the contextualized measurement afforded by the DB-DOS to demonstrate that the pervasiveness of disruptive behavior may be less clinically informative for girls. Specifically, they showed that disruptive boys were disruptive during interactions with both parent and examiner, whereas disruptive girls showed high levels of disruptive behavior only when interacting with their parents. These findings suggest that a cross-contextual pervasiveness requirement may under-identify clinically significant disruptive behavior in girls, information that could only be obtained through the use of standardized, contextually sensitive measures.
V. Conclusions
The role of context in the development and maintenance of children’s behavior problems has long been recognized by clinicians in their day-to-day work with individual children and their families. This knowledge, however, has not been widely integrated into measurement tools, nor into conceptualizations of psychopathology. Yet, there is increasing evidence that behavioral differences across settings and situations are reliable and meaningful, data that suggest that developing a more fine-grained understanding of the contextualized patterns of children’s symptomatology will advance our knowledge of developmental psychopathology. As the field pushes towards DSM-5, there is an opportunity to consider how to strengthen the existing nosological framework. Considering the specific conditions under which symptomatology manifests, and measuring these contingencies systematically, may aid in the refinement of psychiatric phenotypes, work that may be necessary to push the boundaries of our knowledge of the etiology and maintenance of childhood psychiatric disorder.
Increased attention to the role of context in the expression of psychological symptoms should also translate into more precise assessment of clinical phenomena, ultimately bolstering the utility of our assessment approaches. For example, advances in contextualized measurement have helped, in part, to address the absence of developmental considerations that has characterized the disruptive syndromes (Wakschlag et al., 2010) by providing a more detailed framework by which to evaluate whether behaviors are clinically concerning or within normative bounds. Incorporation of contextual features could pay dividends for the creation of developmentally sensitive measures at other stages of childhood and adolescence, an issue that has received little attention (Carter et al., in press).
Such focus on the utility of measurement approaches remains critically important. Given the enormous and growing strain on the mental health system, it is essential that assessment procedures be as streamlined as possible, with each approach and informant contributing substantially to diagnosis and treatment. The incremental validity of different techniques has received insufficient attention from researchers, and we recommend that the bar be raised with regard to standards of evidence for inclusion of multiple methods and informants for treatment and prediction (see Hunsley & Meyer, 2003). There are data suggesting that briefer rating scales perform as well as lengthier interviews for some purposes, as well as substantial evidence indicating that acquiring information from children’s teachers about disruptive behavior syndromes and ADHD is worth the extra effort. Much work remains, however. Utility will be heavily influenced by developmental concerns, but little work has evaluated whether different methods are more informative during particular periods of childhood. Establishing treatment utility by determining the extent to which assessments contribute to outcomes in intervention will provide a strong case for their inclusion and will help to trim unnecessary procedures from assessment batteries. As the field advances and we continue to deepen our understanding of which assessment practices are most efficient, for whom, and when, the goal should not be the eradication of differences across informants and methods. Rather, these differences should be embraced, as they reflect meaningful information that could play an important role in clinical decision-making. Ultimately, further elucidation of their causes will yield significant theoretical dividends, enhancing both our measurement and, eventually, our intervention practices.
Key Points
- The tools and techniques used to assess developmental psychopathology must be consistent with theoretical models of the phenomena, and data yielded by advances in measurement should contribute to refinement of these conceptualizations.
- Children’s behavior varies meaningfully across contexts, differences that, in combination with informants’ perspectives, contribute to inter-rater discrepancies in symptom reports.
- Incorporating contextual features into measurement approaches (e.g., maintaining patterns of ratings across informants rather than collapsing them together) will contribute to conceptual understanding of psychopathology and enhance the clinical utility of assessment instruments.
- Clinical utility of methods and informants must be considered carefully, relative to the goal of the assessment, and the “value added” of more intensive methods and additional informants must be demonstrated.
Acknowledgments
The writing of this paper has been supported by NIMH grant R01MH082830 to Drs. Wakschlag and Briggs-Gowan and support to Dr. Wakschlag by the Walden & Jean Young Shaw Foundation.
Dr. Dirks is grateful to Dr. Timothy Strauman for his comments on earlier versions of this manuscript, as well as to her student, Laura Bellhouse.
References
- Abikoff H, Courtney M, Pelham WE, Koplewicz HS. Teachers’ ratings of disruptive behaviors: The influence of halo effects. Journal of Abnormal Child Psychology. 1993;21:519–533. doi: 10.1007/BF00916317.
- Achenbach TM, McConaughy SH, Howell CT. Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin. 1987;101:213–232.
- Achenbach TM, Rescorla LA. Manual for the ASEBA school-age forms and profiles. Burlington, VT: University of Vermont, Research Center for Children, Youth and Families; 2001.
- American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 4th ed., text rev. Washington, DC: Author; 2000.
- Angold A, Costello EJ. The Child and Adolescent Psychiatric Assessment (CAPA). Journal of the American Academy of Child & Adolescent Psychiatry. 2000;39:39–48. doi: 10.1097/00004583-200001000-00015.
- Angold A, Costello EJ. Nosology and measurement in child and adolescent psychiatry. Journal of Child Psychology and Psychiatry. 2009;50:9–15. doi: 10.1111/j.1469-7610.2008.01981.x.
- Angold A, Costello EJ. The relative diagnostic utility of child and parent reports of oppositional defiant behaviors. International Journal of Methods in Psychiatric Research. 1996;6:253–259.
- Baillargeon RH, Boulerice B, Tremblay RE. Modeling interinformant agreement in the absence of a “gold standard”. Journal of Child Psychology and Psychiatry. 2001;42:463–473.
- Boyle MH, Offord DR, Racine YA, Szatmari P, Sanford M, Fleming JE. Adequacy of interviews vs checklists for classifying childhood psychiatric disorder based on parent reports. Archives of General Psychiatry. 1997;54:793–799. doi: 10.1001/archpsyc.1997.01830210029003.
- Boyle MH, Pickles A. Influence of maternal depressive symptoms on ratings of childhood behavior. Journal of Abnormal Child Psychology. 1997;25:399–412. doi: 10.1023/a:1025737124888.
- Brendgen M, Vitaro F, Boivin M, Dionne G, Perusse D. Examining genetic and environmental effects on reactive versus proactive aggression. Developmental Psychology. 2006;42:1299–1312. doi: 10.1037/0012-1649.42.6.1299.
- Briggs-Gowan MJ, Carter AS, Schwab-Stone M. Discrepancies among mother, child, and teacher reports: Examining the contributions of maternal depression and anxiety. Journal of Abnormal Child Psychology. 1996;24:749–765. doi: 10.1007/BF01664738.
- Burt SA. Rethinking environmental contributions to child and adolescent psychopathology: A meta-analysis of shared environmental influences. Psychological Bulletin. 2009;135:608–637. doi: 10.1037/a0015702.
- Burt SA, McGue M, Krueger RF, Iacono WG. Sources of covariation among the child-externalizing disorders: Informant effects and the shared environment. Psychological Medicine. 2005;35:1133–1144. doi: 10.1017/S0033291705004770.
- Cantwell DP, Lewinsohn PM, Rohde P, Seeley JR. Correspondence between adolescent report and parent report of psychiatric diagnostic data. Journal of the American Academy of Child & Adolescent Psychiatry. 1997;36:610–619. doi: 10.1097/00004583-199705000-00011.
- Carter AS, Gray SAO, Baillargeon RH, Wakschlag LS. A multidimensional approach to disruptive behaviors: Informing lifespan research from an early childhood perspective. In: Tolan PH, Leventhal BL, editors. Brain Research Foundation Symposium Series Advances in development and psychopathology: Volume 1, Disruptive behavior disorders. New York: Springer; (In press).
- Cella D, Gershon R, Lai JS, Choi S. The future of outcomes measurement: Item banking, tailored short-forms, and computerized adaptive assessment. Quality of Life Research. 2007;16(Supplement 1):133–141. doi: 10.1007/s11136-007-9204-6.
- Chang DF, Sue S. The effects of race and problem type on teachers’ assessments of student behavior. Journal of Consulting and Clinical Psychology. 2003;71:235–242. doi: 10.1037/0022-006x.71.2.235.
- Cole DA, Martin JM, Powers B, Truglio R. Modeling causal relations between academic and social competence and depression: A multi-trait, multi-method longitudinal study of children. Journal of Abnormal Psychology. 1996;105:258–270. doi: 10.1037//0021-843x.105.2.258.
- Cole PM, Martin SE, Dennis TA. Emotion regulation as a scientific construct: Methodological challenges and directions for child development research. Child Development. 2004;75:317–333. doi: 10.1111/j.1467-8624.2004.00673.x.
- Collishaw S, Goodman R, Ford T, Rabe-Hesketh S, Pickles A. How far are associations between child, family and community factors and child psychopathology informant-specific and informant-general? Journal of Child Psychology and Psychiatry. 2009;50:571–580. doi: 10.1111/j.1469-7610.2008.02026.x.
- Conrad M, Hammen C. Role of maternal depression in perceptions of child maladjustment. Journal of Consulting and Clinical Psychology. 1989;57:663–667. doi: 10.1037//0022-006x.57.5.663.
- Costello EJ, Egger H, Angold A. 10-year research update review: The epidemiology of child and adolescent psychiatric disorders: I. Methods and public health burden. Journal of the American Academy of Child & Adolescent Psychiatry. 2005;44:972–986. doi: 10.1097/01.chi.0000172552.41596.6f.
- Costello EJ, Loeber R, Stouthamer-Loeber M. Pervasive and situational hyperactivity—confounding effect of informant: A research note. Journal of Child Psychology and Psychiatry. 1991;32:367–376. doi: 10.1111/j.1469-7610.1991.tb00313.x.
- Crick NR, Dodge KA. Social information-processing mechanisms in reactive and proactive aggression. Child Development. 1996;67:993–1002.
- De Los Reyes A. Introduction to the special section: More than measurement error: Discovering meaning behind informant discrepancies in clinical assessments of children and adolescents. Journal of Clinical Child and Adolescent Psychology. 2011;40:1–9. doi: 10.1080/15374416.2011.533405.
- De Los Reyes A, Alfano CA, Beidel DC. The relations among measurements of informant discrepancies within a multisite trial of treatments for childhood social phobia. Journal of Abnormal Child Psychology. 2010;38:395–404. doi: 10.1007/s10802-009-9373-6.
- De Los Reyes A, Henry DB, Tolan PH, Wakschlag LS. Linking informant discrepancies to observed variations in young children’s disruptive behavior. Journal of Abnormal Child Psychology. 2009;37:637–652. doi: 10.1007/s10802-009-9307-3.
- De Los Reyes A, Kazdin AE. Informant discrepancies in the assessment of childhood psychopathology: A critical review, theoretical framework, and recommendations for further study. Psychological Bulletin. 2005;131:483–509. doi: 10.1037/0033-2909.131.4.483.
- De Los Reyes A, Kazdin AE. Conceptualizing changes in behavior in intervention research: The range of possible changes model. Psychological Review. 2006;113:554–583. doi: 10.1037/0033-295X.113.3.554.
- De Los Reyes A, Kazdin AE. Identifying evidence-based interventions for children and adolescents using the range of possible changes model: A meta-analytic illustration. Behavior Modification. 2009;33:583–617. doi: 10.1177/0145445509343203.
- De Los Reyes A, Youngstrom EA, Pabón SC, Youngstrom JK, Feeny NC, Findling RL. Internal consistency and associated characteristics of informant discrepancies in clinic referred youths age 11 to 17 years. Journal of Clinical Child and Adolescent Psychology. 2011a;40:36–53. doi: 10.1080/15374416.2011.533402.
- De Los Reyes A, Youngstrom EA, Swan AJ, Youngstrom JK, Feeny NC, Findling RL. Informant discrepancies in clinical reports of youths and interviewers’ impressions of the reliability of informants. Journal of Child and Adolescent Psychopharmacology. 2011b;21:417–424. doi: 10.1089/cap.2011.0011.
- De Nijs PFA, Ferdinand RF, de Bruin EI, Dekker MCJ, van Duijn CM, Verhulst FC. Attention-deficit/hyperactivity disorder (ADHD): Parents’ judgment about school, teachers’ judgment about home. European Child and Adolescent Psychiatry. 2004;13:315–320. doi: 10.1007/s00787-004-0405-z.
- De Ramirez RD, Shapiro ES. Effects of student ethnicity on judgments of ADHD symptoms among Hispanic and White teachers. School Psychology Quarterly. 2005;20:268–287.
- Dirks MA, Boyle MH. The comparability of mother-report structured interviews and checklists for the quantification of youth externalizing symptoms. Journal of Child Psychology and Psychiatry. 2010;51:1040–1049. doi: 10.1111/j.1469-7610.2010.02244.x.
- Dirks MA, Boyle MH, Georgiades K. Psychological symptoms in youth and later socioeconomic functioning: Do associations vary by informant? Journal of Clinical Child and Adolescent Psychology. 2011a;40:10–22. doi: 10.1080/15374416.2011.533403.
- Dirks MA, Treat TA, Weersing VR. Integrating theoretical, measurement, and intervention models of youth social competence. Clinical Psychology Review. 2007a;27:327–347. doi: 10.1016/j.cpr.2006.11.002.
- Dirks MA, Treat TA, Weersing VR. The situation specificity of youth responses to peer provocation. Journal of Clinical Child and Adolescent Psychology. 2007b;36:621–628. doi: 10.1080/15374410701662758.
- Dirks MA, Treat TA, Weersing VR. The judge specificity of evaluations of youth social behaviour: The case of peer provocation. Social Development. 2010;19:736–757.
- Dirks MA, Treat TA, Weersing VR. The latent structure of youth responses to peer provocation. Journal of Psychopathology and Behavioral Assessment. 2011b;33:58–68. doi: 10.1007/s10862-010-9206-5.
- Dodge KA, McClaskey CL, Feldman E. Situational approach to the assessment of social competence in children. Journal of Consulting and Clinical Psychology. 1985;53:344–353. doi: 10.1037//0022-006x.53.3.344.
- Drabick DAG. Can a developmental psychopathology perspective facilitate a paradigm shift toward a mixed categorical-dimensional classification system? Clinical Psychology: Science and Practice. 2009;16:41–49. doi: 10.1111/j.1468-2850.2009.01141.x.
- Drabick DAG. Report to the DSM-5 ADHD and Disruptive Behaviors Working Group on Oppositional Defiant Disorder. 2011. Unpublished manuscript. Retrieved from http://www.dsm5.org/ProposedRevision/Pages/proposedrevision.aspx?rid=106#.
- Drabick DAG, Bubier J, Chen D, Price J, Lanza IH. Source-specific oppositional defiant disorder among inner-city children: Prospective prediction and moderation. Journal of Clinical Child and Adolescent Psychology. 2011;40:23–35. doi: 10.1080/15374416.2011.533401.
- Drabick DAG, Gadow KD, Loney J. Source-specific oppositional defiant disorder: Comorbidity and risk factors in referred elementary schoolboys. Journal of the American Academy of Child & Adolescent Psychiatry. 2007;46:92–101. doi: 10.1097/01.chi.0000242245.00174.90.
- Dumenci L, Achenbach TM, Windle M. Measuring context-specific and cross-contextual components of hierarchical constructs. Journal of Psychopathology and Behavioral Assessment. 2011;33:3–10. doi: 10.1007/s10862-010-9187-4.
- DuPaul GJ, Barkley RA. Situational variability of attention problems: Psychometric properties of the revised home and school situations questionnaires. Journal of Clinical Child Psychology. 1992;21:178–188.
- Edelbrock C, Costello AJ. Convergence between statistically derived behavior problem syndromes and child psychiatric diagnoses. Journal of Abnormal Child Psychology. 1988;16:219–231. doi: 10.1007/BF00913597.
- Epstein JN, Willoughby M, Valencia EY, Toney ST, Abikoff HB, Arnold LE, Hinshaw SP. The role of children’s ethnicity in the relationship between teacher ratings of attention-deficit/hyperactivity disorder and observed classroom behavior. Journal of Consulting and Clinical Psychology. 2005;73:424–434. doi: 10.1037/0022-006X.73.3.424.
- Fergusson DM, Boden JM, Horwood LJ. Situational and generalised conduct problems and later life outcomes: Evidence from a New Zealand birth cohort. Journal of Child Psychology and Psychiatry. 2009;50:1084–1092. doi: 10.1111/j.1469-7610.2009.02070.x.
- Frick PJ, McMahon RJ. Child and adolescent conduct problems. In: Hunsley J, Mash EJ, editors. A guide to assessments that work. New York: Oxford; 2008. pp. 41–68.
- Garb H. Computer-administered interviews and rating scales. Psychological Assessment. 2007;19:4–13. doi: 10.1037/1040-3590.19.1.4.
- Gardner F. Methodological issues in the direct observation of parent-child interaction: Do observational findings reflect the natural behavior of participants? Clinical Child and Family Psychology Review. 2000;3:185–198. doi: 10.1023/a:1009503409699.
- Gelhorn H, Hartman C, Sakai J, Mikulich-Gilbertson S, Stallings M, Young S, Crowley T. An item-response theory analysis of DSM-IV conduct disorder. Journal of the American Academy of Child and Adolescent Psychiatry. 2009;48:42–50. doi: 10.1097/CHI.0b013e31818b1c4e.
- Ginsburg BE, Werick TM, Escobar JI, Kugelmass S, Treanor JJ, Wendtland L. Molecular genetics of psychopathologies: A search for simple answers to complex problems. Behavior Genetics. 1996;26:325–333. doi: 10.1007/BF02359388.
- Gizer IR, Waldman ID, Abramowitz A, Barr C, Feng Y, Wigg KG, Rowe DC. Relations between multi-informant assessments of ADHD symptoms, DAT1, and DRD4. Journal of Abnormal Psychology. 2008;117:869–880. doi: 10.1037/a0013297.
- Gomez R. Parent and teacher ratings of the DSM-IV ADHD symptoms: Differential symptom functioning, and parent-teacher agreement and differences. Journal of Attention Disorders. 2007;11:17–27. doi: 10.1177/1087054706295665.
- Gomez R, Burns GL, Walsh JA, de Moura MA. A multitrait-multisource confirmatory factor analytic approach to the construct validity of ADHD rating scales. Psychological Assessment. 2003;15:3–16. doi: 10.1037/1040-3590.15.1.3.
- Gray S, Carter A, Briggs-Gowan MJ, Hill C, Danis B, Wakschlag LS. Preschool children’s observed disruptive behavior: Variations across sex, interactional context, and severity of disruptive behavior. 2011. Manuscript submitted for publication. doi: 10.1080/15374416.2012.675570.
- Grayson P, Carlson GA. The utility of a DSM-III-R-based checklist in screening child psychiatric patients. Journal of the American Academy of Child & Adolescent Psychiatry. 1991;30:669–673. doi: 10.1097/00004583-199107000-00021.
- Gresham FM, Elliott SN. Social Skills Improvement System: Rating Scales. Bloomington, MN: Pearson Assessments; 2008.
- Hart EL, Lahey BB, Loeber R, Hanson KS. Criterion validity of informants in the diagnosis of disruptive behavior disorders in children: A preliminary study. Journal of Consulting and Clinical Psychology. 1994;62:410–414. doi: 10.1037/0022-006X.62.2.410.
- Hartley AG, Zakriski AL, Wright JC. Probing the depths of informant discrepancies: Contextual influences on divergence and convergence. Journal of Clinical Child and Adolescent Psychology. 2011;40:54–66. doi: 10.1080/15374416.2011.533404.
- Haynes SN, Leisen M, Blaine DD. Design of individualized behavioral treatment programs using functional analytic clinical case models. Psychological Assessment. 1997;9:334–348.
- Ho TP, Luk ESL, Leung PWL, Taylor E, Lieh-Mak F, Bacon-Shone J. Situational versus pervasive hyperactivity in a community sample. Psychological Medicine. 1996;26:308–321. doi: 10.1017/s003329170003470x.
- Holmbeck GN, Li ST, Schurman JV, Friedman D, Coakley RM. Collecting and managing multisource and multimethod data in studies of pediatric populations. Journal of Pediatric Psychology. 2002;27:5–18. doi: 10.1093/jpepsy/27.1.5.
- Hosterman SJ, DuPaul GJ, Jitendra AK. Teacher ratings of ADHD symptoms in ethnic minority students: Bias or behavioral difference? School Psychology Quarterly. 2008;23:418–435.
- Hudziak JJ, Achenbach TM, Althoff RR, Pine DS. A dimensional approach to developmental psychopathology. International Journal of Methods in Psychiatric Research. 2007;16(S1):S16–S23. doi: 10.1002/mpr.217.
- Hunsley J, Mash EJ. Evidence-based assessment. Annual Review of Clinical Psychology. 2007;3:29–51. doi: 10.1146/annurev.clinpsy.3.022806.091419.
- Hunsley J, Mash EJ, editors. A guide to assessments that work. New York: Oxford; 2008.
- Hunsley J, Meyer GJ. The incremental validity of psychological testing and assessment: Conceptual, methodological, and statistical issues. Psychological Assessment. 2003;15:446–455. doi: 10.1037/1040-3590.15.4.446.
- Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Wang P. Research Domain Criteria (RDoC): Toward a new classification framework for research on mental disorders. American Journal of Psychiatry. 2010;167:748–750. doi: 10.1176/appi.ajp.2010.09091379.
- Jensen AL, Weisz JR. Assessing match and mismatch between practitioner-generated and standardized interview-generated diagnoses for clinic-referred children and adolescents. Journal of Consulting and Clinical Psychology. 2002;70:158–168.
- Jensen-Doss A, Hawley K. Understanding barriers to evidence-based assessment: Clinician attitudes toward standardized assessment tools. Journal of Clinical Child and Adolescent Psychology. 2010;39:885–896. doi: 10.1080/15374416.2010.517169.
- Jensen P, Roper M, Fisher P, Piacentini J, Canino G, Richters J, Schwab-Stone ME. Test-retest reliability of the Diagnostic Interview Schedule for Children (DISC 2.1). Archives of General Psychiatry. 1995;52:61–71. doi: 10.1001/archpsyc.1995.03950130061007.
- Jensen PS, Rubio-Stipec M, Canino G, Bird HR, Dulcan MK, Schwab-Stone ME, Lahey BB. Parent and child contributions to diagnosis of mental disorder: Are both informants always necessary? Journal of the American Academy of Child & Adolescent Psychiatry. 1999;38:1569–1579. doi: 10.1097/00004583-199912000-00019.
- Johnston C, Mah JWT. Child attention-deficit/hyperactivity disorder. In: Hunsley J, Mash EJ, editors. A guide to assessments that work. New York: Oxford; 2008. pp. 41–68.
- Johnston C, Murray C. Incremental validity in the psychological assessment of children and adolescents. Psychological Assessment. 2003;15:496–507. doi: 10.1037/1040-3590.15.4.496.
- Kendell R, Jablensky A. Distinguishing between the validity and utility of psychiatric diagnoses. The American Journal of Psychiatry. 2003;160:4–12. doi: 10.1176/appi.ajp.160.1.4.
- Kendler KS. Toward a philosophical structure for psychiatry. American Journal of Psychiatry. 2005;162:433–440. doi: 10.1176/appi.ajp.162.3.433.
- Kraemer HC, Measelle JR, Ablow JC, Essex MJ, Boyce WT, Kupfer DJ. A new approach to integrating data from multiple informants in psychiatric assessment and research: mixing and matching contexts and perspectives. American Journal of Psychiatry. 2003;160:1566–1577. doi: 10.1176/appi.ajp.160.9.1566.
- Laird RD, Weems CF. The equivalence of regression models using difference scores and models using separate scores for each informant: Implications for the study of informant discrepancies. Psychological Assessment. 2011;23:388–397. doi: 10.1037/a0021926.
- Lau AS, Garland AF, Yeh M, McCabe KM, Wood PA, Hough RL. Race/ethnicity and inter-informant agreement in assessing adolescent psychopathology. Journal of Emotional and Behavioral Disorders. 2004;12:145–156.
- Loeber R, Green SM, Lahey BB, Stouthamer-Loeber M. Optimal informants on childhood disruptive behaviors. Development and Psychopathology. 1989;1:317–337.
- Loeber R, Green SM, Lahey BB, Stouthamer-Loeber M. Differences and similarities between children, mothers, and teachers as informants on disruptive child behavior. Journal of Abnormal Child Psychology. 1991;19:75–95. doi: 10.1007/BF00910566.
- Lord C, Risi S, Lambrecht L, Cook E, Leventhal B, DiLavore P, Pickles A, Rutter M. The Autism Diagnostic Observation Schedule-Generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of Autism and Developmental Disorders. 2000;30:205–223.
- Lothmann C, Holmes EA, Chan SW, Lau JY. Cognitive bias modification training in adolescents: effects on interpretation biases and mood. Journal of Child Psychology and Psychiatry. 2011;52:24–32. doi: 10.1111/j.1469-7610.2010.02286.x.
- Mannuzza S, Klein RG, Moulton JL. Young adult outcome of children with “situational” hyperactivity: A prospective, controlled follow-up study. Journal of Abnormal Child Psychology. 2002;30:191–198. doi: 10.1023/a:1014761401202.
- Mash EJ, Hunsley J. Evidence-based assessment of child and adolescent disorders: Issues and challenges. Journal of Clinical Child and Adolescent Psychology. 2005;34:362–379. doi: 10.1207/s15374424jccp3403_1.
- Matthys W, Maassen GH, Cuperus JM, van Engeland H. The assessment of the situational specificity of children’s problem behavior in a peer-peer context. Journal of Child Psychology and Psychiatry. 2001;42:413–420.
- McConaughy SH, Harder VS, Antshel KM, Gordon M, Eiraldi R, Dumenci L. Incremental validity of test session and classroom observations in a multimethod assessment of attention deficit/hyperactivity disorder. Journal of Clinical Child and Adolescent Psychology. 2010;39:650–666. doi: 10.1080/15374416.2010.501287.
- McDermott PA. National standardization of uniform multisituational measures of child and adolescent behavior pathology. Psychological Assessment. 1993;5:413–424.
- McFall RM. Theory and utility-key themes in evidence-based assessment: Comment on the special section. Psychological Assessment. 2005;17:312–323. doi: 10.1037/1040-3590.17.3.312.
- McFall RM, Townsend JT. Foundations of psychological assessment: Implications for cognitive assessment in clinical science. Psychological Assessment. 1998;10:316–330.
- Mitsis EM, McKay KE, Schulz KP, Newcorn JH, Halperin JM. Parent-teacher concordance for DSM-IV attention deficit/hyperactivity disorder in a clinic-referred sample. Journal of the American Academy of Child & Adolescent Psychiatry. 2000;39:308–313. doi: 10.1097/00004583-200003000-00012.
- Munkvold L, Lundervold A, Lie SA, Manger T. Should there be separate parent and teacher-based categories of ODD? Evidence from a general population. Journal of Child Psychology and Psychiatry. 2009;50:1264–1272. doi: 10.1111/j.1469-7610.2009.02091.x.
- Murray DW, Kollins SH, Hardy KK, Abikoff HB, Swanson JM, Cunningham C, Chuang SZ. Parent versus teacher ratings of attention-deficit/hyperactivity disorder symptoms in the Preschoolers with Attention-Deficit/Hyperactivity Disorder Treatment Study (PATS). Journal of Child and Adolescent Psychopharmacology. 2007;17:605–619. doi: 10.1089/cap.2007.0060.
- Offord DR, Boyle MH, Racine Y, Szatmari P, Fleming JE, Sanford M, Lipman EL. Integrating assessment data from multiple informants. Journal of the American Academy of Child and Adolescent Psychiatry. 1996;35:1078–1085. doi: 10.1097/00004583-199608000-00019.
- Owens J, Hoza B. Diagnostic utility of DSM-IV-TR symptoms in the prediction of DSM-IV-TR ADHD subtypes and ODD. Journal of Attention Disorders. 2003;7:11–27. doi: 10.1177/108705470300700102.
- Pelham WE, Fabiano GA, Massetti GM. Evidence-based assessment of attention deficit hyperactivity disorder in children and adolescents. Journal of Clinical Child and Adolescent Psychology. 2005;34:449–476. doi: 10.1207/s15374424jccp3403_5.
- Perren S, Von Wyl A, Stadelmann S, Burgin D, Von Klitzing K. Associations between behavioral/emotional difficulties in kindergarten children and the quality of their peer relationships. Journal of the American Academy of Child and Adolescent Psychiatry. 2006;45:867–876. doi: 10.1097/01.chi.0000220853.71521.cb. [DOI] [PubMed] [Google Scholar]
- Pigott RL, Cowen EL. Teacher race, child race, racial congruence, and teacher ratings of children’s school adjustment. Journal of School Psychology. 2000;38:177–196. [Google Scholar]
- Pottick KJ, Kirk SA, Hsieh DK, Tian X. Judging mental disorder: Effects of client, clinician, and contextual differences. Journal of Consulting and Clinical Psychology. 2007;75:1–8. doi: 10.1037/0022-006X.75.1.1. [DOI] [PubMed] [Google Scholar]
- Power TJ, Costigan TE, Leff SS, Eiraldi RB, Landau S. Assessing ADHD across settings: Contributions of behavioral assessment to categorical decision making. Journal of Clinical Child Psychology. 2001;30:399–412. doi: 10.1207/S15374424JCCP3003_11. [DOI] [PubMed] [Google Scholar]
- Puig M, Lambert MC, Rowan GT, Winfrey T, Lyubansky M, Hannah SD, Hill MF. Behavioral and emotional problems among Jamaican and African-American children, ages 6 to 11: Teacher reports versus direct observations. Journal of Emotional and Behavioral Disorders. 1999:240–250. [Google Scholar]
- Rapoport JL, Donnelly M, Zametkin A, Carrougher J. ‘Situational hyperactivity’ in a U.S. clinical setting. Journal of Child Psychology and Psychiatry. 1986;27:639–646. doi: 10.1111/j.1469-7610.1986.tb00188.x. [DOI] [PubMed] [Google Scholar]
- Reise SP, Waller NG. Item response theory and clinical measurement. Annual Review of Clinical Psychology. 2009;5:27–48. doi: 10.1146/annurev.clinpsy.032408.153553. [DOI] [PubMed] [Google Scholar]
- Rettew DC, Lynch AD, Achenbach TM, Dumenci L, Ivanova MY. Meta-analyses of agreement between diagnoses made from clinical evaluations and standardized diagnostic interviews. International Journal of Methods in Psychiatric Research. 2009;18:169–184. doi: 10.1002/mpr.289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rettew DC, van Oort FVA, Verhulst FC, Buitelaar JK, Ormel J, Hartman C, Hudziak JJ. When parent and teacher ratings don’t agree: The Tracking Adolescents’ Individual Lives Survey (TRAILS). Journal of Child and Adolescent Psychopharmacology. 2011;21:389–397. doi: 10.1089/cap.2010.0153.
- Rose AJ, Asher SR. Children's goals and strategies in response to conflicts within a friendship. Developmental Psychology. 1999;35:69–79. doi: 10.1037/0012-1649.35.1.69.
- Rowe DC, Kandel D. In the eye of the beholder? Parental ratings of externalizing and internalizing symptoms. Journal of Abnormal Child Psychology. 1997;25:265–275. doi: 10.1023/a:1025756201689.
- Sanislow CA, Pine DS, Quinn KJ, Kozak MJ, Garvey MA, Heinssen RK, Wang PS, Cuthbert BN. Developing constructs for psychopathology research: Research domain criteria. Journal of Abnormal Psychology. 2010;119:631–639. doi: 10.1037/a0020909.
- Schachar R, Rutter M, Smith A. The characteristics of situationally and pervasively hyperactive children: Implications for syndrome definition. Journal of Child Psychology and Psychiatry. 1981;22:375–392. doi: 10.1111/j.1469-7610.1981.tb00562.x.
- Schachar R, Tannock R. Childhood hyperactivity and psychostimulants: A review of extended treatment studies. Journal of Child and Adolescent Psychopharmacology. 1993;3:81–97. doi: 10.1089/cap.1993.3.81.
- Schachar RJ, Tannock R, Logan G. Inhibitory control, impulsiveness, and attention deficit hyperactivity disorder. Clinical Psychology Review. 1993;13:721–739.
- Silverman WK, Ollendick TH. Evidence-based assessment of anxiety and its disorders in children and adolescents. Journal of Clinical Child and Adolescent Psychology. 2005;34:380–411. doi: 10.1207/s15374424jccp3403_2.
- Simonoff E, Pickles A, Hervas A, Silberg JL, Rutter M, Eaves L. Genetic influences on childhood hyperactivity: Contrast effects imply parental rating bias, not sibling interaction. Psychological Medicine. 1998;28:825–837. doi: 10.1017/s0033291798006886.
- Sonuga-Barke EJS, Minocha K, Taylor EA, Sandberg S. Inter-ethnic bias in teachers’ ratings of childhood hyperactivity. British Journal of Developmental Psychology. 1993;11:187–200.
- Tripp GG, Luk SL. The identification of pervasive hyperactivity: Is clinic observation necessary? Journal of Child Psychology and Psychiatry. 1997;38:219–234. doi: 10.1111/j.1469-7610.1997.tb01856.x.
- Van Dulmen MHM, Egeland B. Analyzing multiple informant data on child and adolescent behavior problems: Predictive validity and comparison of aggregation procedure. International Journal of Behavioral Development. 2011;35:84–92.
- Wakschlag LS, Briggs-Gowan MJ, Carter AS, Hill C, Danis B, Keenan K, Leventhal BL. A developmental framework for distinguishing disruptive behavior from normative misbehavior in preschool children. Journal of Child Psychology and Psychiatry. 2007;48:976–987. doi: 10.1111/j.1469-7610.2007.01786.x.
- Wakschlag LS, Briggs-Gowan MJ, Hill C, Danis B, Leventhal BL, Keenan K, Carter AS. Observational assessment of preschool disruptive behavior, Part II: Validity of the Disruptive Behavior Diagnostic Observation Schedule (DB-DOS). Journal of the American Academy of Child & Adolescent Psychiatry. 2008a;47:632–641. doi: 10.1097/CHI.0b013e31816c5c10.
- Wakschlag LS, Hill C, Carter AS, Danis B, Egger HL, Keenan K, Briggs-Gowan MJ. Observational assessment of preschool disruptive behavior, Part I: Reliability of the Disruptive Behavior Diagnostic Observation Schedule (DB-DOS). Journal of the American Academy of Child & Adolescent Psychiatry. 2008b;47:622–631. doi: 10.1097/CHI.0b013e31816c5bdb.
- Wakschlag LS, Tolan PH, Leventhal BL. Research review: 'Ain't misbehavin': Towards a developmentally-specified nosology for preschool disruptive behavior. Journal of Child Psychology and Psychiatry. 2010;51:3–22. doi: 10.1111/j.1469-7610.2009.02184.x.
- Weisz JR, McCarty CA, Eastman KL, Chaiyasit W, Sunwanlert S. Developmental psychopathology and culture: Ten lessons from Thailand. In: Luthar S, Burack J, Cicchetti D, Weisz J, editors. Developmental psychopathology: Perspectives on adjustment, risk, and disorder. New York: Cambridge University Press; 1997. pp. 568–592.
- Wright JC, Zakriski AL. A contextual analysis of externalizing and mixed syndrome boys: When syndromal similarity obscures functional dissimilarity. Journal of Consulting and Clinical Psychology. 2001;69:457–470.
- Wright JC, Zakriski AL, Drinkwater M. Developmental psychopathology and the reciprocal patterning of behavior and environment: Distinctive situational and behavioral signatures of internalizing, externalizing, and mixed-syndrome children. Journal of Consulting and Clinical Psychology. 1999;67:95–107. doi: 10.1037//0022-006x.67.1.95.
- Wright JC, Zakriski AL, Hartley AG, Parad HW. Reassessing the assessment of change in at-risk youth: Conflict and coherence in overall versus contextual assessments of behavior. Journal of Psychopathology and Behavioral Assessment. 2011;33:215–227.
- Youngstrom EA, Findling RL, Calabrese JR. Who are the comorbid adolescents? Agreement between psychiatric diagnosis, youth, parent, and teacher report. Journal of Abnormal Child Psychology. 2003;31:231–245. doi: 10.1023/a:1023244512119.
- Youngstrom E, Izard C, Ackerman B. Dysphoria-related bias in maternal ratings of children. Journal of Consulting and Clinical Psychology. 1999;67:905–916. doi: 10.1037//0022-006x.67.6.905.
- Youngstrom EA, Youngstrom JK, Freeman AJ, De Los Reyes A, Feeny NC, Findling RL. Informants are not all equal: Predictors and correlates of clinician judgments about caregiver and youth credibility. Journal of Child and Adolescent Psychopharmacology. 2011;21:407–415. doi: 10.1089/cap.2011.0032.