Author manuscript; available in PMC: 2021 Jul 1.
Published in final edited form as: J Pers Assess. 2019 Jul 5;102(4):480–487. doi: 10.1080/00223891.2019.1618319

Psychometric Properties of the Aggressive Behaviors Scale from the Youth Self-Report in Juvenile Offenders

Ashley D Kendall, Erin M Emerson, Richard E Zinbarg, Geri R Donenberg
PMCID: PMC6942636  NIHMSID: NIHMS1534266  PMID: 31276436

Abstract

The study of aggression in juvenile offenders, a high priority from clinical and public health standpoints, depends on properly measuring and modeling aggression. The Aggressive Behaviors scale from the Youth Self-Report (YSR-AB) has been widely used to measure youth aggression, often functioning as a stand-alone scale in analyses (of note, even when analyzed alone, the YSR-AB must be administered as part of the full YSR to retain its integrity). However, knowledge of its factor analytic structure among juvenile offenders is lacking. We addressed this gap. Factor analyses of YSR-AB data from 310 probation youth (mean age = 16 years, 90% African-American, 66% male) supported a hierarchical structure, with two lower-order factors distinguishing aggression targeting others (e.g., physical attack) from related symptoms (e.g., mood swings). The targeted aggression items showed significantly stronger associations with other externalizing symptoms than did the related symptom items; the opposite pattern emerged for internalizing symptoms. In further support of the convergent and discriminant validity of these subscales, the related symptoms were differentially linked to gender, with females reporting significantly higher levels than males. The hierarchical solution appeared to be stable over one year. Implications for interpreting past findings, and conducting future research with the YSR-AB, are discussed.

Keywords: aggression, juvenile offenders, Youth Self-Report, structure


The study of aggression in juvenile offenders (JOs), a high priority from clinical and public health standpoints, depends on properly measuring and modeling aggression. Aggression is typically defined as a behavioral pattern that causes or threatens harm to others, with physical violence considered an important aspect (e.g., Loeber & Hay, 1997). Well over a million juveniles are arrested annually in the United States, including tens of thousands (e.g., 61,070 in 2014) for violent crimes (Puzzanchera, 2014). Violent JOs are at risk for a variety of adverse outcomes including psychiatric and physical health problems, low educational and vocational achievement, and ongoing criminal activity (e.g., Colins, Vermeiren, Schuyten, & Broekaert, 2009; Haapasalo & Hamalainen, 1996; Lambie & Randell, 2013). Victims and witnesses of violence, in turn, are vulnerable to increased anxiety and depression, impaired academic functioning, and higher likelihood of going on to initiate violence themselves (e.g., Margolin & Gordis, 2000). The cumulative economic costs of violent acts by JOs are substantial, including lost productivity on the part of both offenders and victims (e.g., Cohen, Miller, & Rossman, 1994; Cohen & Piquero, 2009). There is thus a need for research on aggression in JOs, and a necessary first step is articulating the psychometric properties of the scale(s) used to measure aggression in this group. Beyond ensuring that a scale is appropriate for use with JOs, such work could pave the way to identifying the aspects of the scale most strongly related to key outcomes, such as change in aggression over time or recidivism.

To measure symptoms of aggression in young people, the Aggressive Behaviors Scale from the Youth Self-Report (YSR-AB; Achenbach & Rescorla, 2001) has been widely employed (e.g., Shoal, Giancola, & Kirillova, 2003). The YSR-AB was conceptualized as tapping a single underlying aggression construct and, consistent with this, has typically been represented by a total scale score (e.g., Shoal et al., 2003). However, the items included in the scale appear to tap two potentially distinguishable—and conceptually meaningful—dimensions. Some items assess aggressive behaviors targeting others, such as physical attack. Others measure related symptoms, such as sudden mood swings or having a “hot temper.” We refer to these dimensions as targeted aggression and related symptoms, respectively. It stands to reason that targeted aggression would be a stronger measure of the underlying aggression construct, given that the symptoms captured by this facet are relatively specific to aggression, whereas symptoms like mood change are common across several psychiatric conditions (American Psychiatric Association, 2013). And importantly, it is acting out against another person (i.e., targeted aggression), not simply having a “hot” temper or being stubborn (i.e., related symptoms), that would bring a youth into contact with the juvenile justice system.

Although the psychometric properties of the full YSR have been extensively examined (e.g., Achenbach & Rescorla, 2001; Ivanova et al., 2007), and the YSR has been used in numerous studies with justice-involved youth (e.g., Colins, 2015; Karnik, Jones, Campanaro, Haapanen, & Steiner, 2006; Vreugdenhil, van den Brink, Ferdinand, Wouters, & Doreleijers, 2006), there are three important gaps in current understanding of the YSR-AB. First, the structure of the YSR-AB items has not been tested in isolation from the full YSR. Of note, we do not advocate administering the YSR-AB items outside of their standard context (i.e., interspersed among the other YSR items that assess various problems and favorable characteristics). Doing so may result in halo effects (in this case, the tendency for impressions created by answering one item to influence opinion when answering others) or other artifacts and violates the YSR copyright. Rather, recognizing the relatively common usage of the items as a stand-alone measure within particular studies after proper administration (e.g., Shoal et al., 2003), we suggest it is an important shortcoming that their structure independent of the full YSR remains largely unknown. Second, it remains to our knowledge unknown if a general factor underlies the YSR-AB, and thus if it is appropriate to represent the items with a total scale score. Such a conclusion would rely on testing a hierarchical solution that includes a general factor and calculating omega hierarchical (ωh), which indicates the proportion of variance in the total score attributable to the general factor (Zinbarg, Revelle, Yovel, & Li, 2005; Zinbarg, Yovel, Revelle, & McDonald, 2006). Third, despite the clear relevance of the scale to JOs (e.g., for identifying those youth who warrant in-depth assessment of their tendencies to act aggressively), existing psychometric work on the YSR items in isolation has not to our knowledge focused on this population.

Song et al. (1994) examined the factor structure of an older (Achenbach, 1991) version of the YSR-AB items in isolation; however, several aspects of their study rendered it unable to address the gaps described above. To begin, the 1991 version of the YSR-AB shares less than 80% item overlap with the updated version of the scale. As such, any factor solution based on these items is not directly applicable to the revised scale. In line with our content analysis of the 2001 items, Song and colleagues rejected a unidimensional structure in the older scale, instead selecting three group factors. The authors labeled these factors active aggressive behaviors, which included items that involved aggression targeting others; affect aggressive behaviors, which consisted of items tapping emotions or personality traits related to aggression; and attention-focused aggressive behaviors, which was made up of items measuring attempts at attracting other people’s attention. They did not compare their multidimensional model to a similar one that included a general factor, or test the proportion of variance accounted for by a general factor in such a hierarchical solution, leaving open the question of whether a general factor underlay the scale. Finally, Song et al. utilized a sample of hospitalized psychiatric adolescents that was primarily female and white. JOs in the U.S., by contrast, are disproportionately male and African American (U.S. Department of Justice, 2010), limiting generalizability of the prior findings to this population.

An important issue raised by content analysis of the YSR-AB, as well as the psychometric study from Song et al. (1994), is that the scale appears to include some items, such as assessment of sudden mood change, that do not pertain to aggression directly. This issue can be similarly observed in other investigations of aggression scales, such as the Child Behavior Checklist, in which subtypes of “aggression” are identified based on existing scale items (e.g., Ligthart, Bartels, Hoekstra, Hudziak, & Boomsma, 2005). It is a common problem resulting from data-driven approaches to developing dimensional measures of psychopathology, whereby items that tend to correlate with each other are included in a scale even when some of the items merely tap risk factors for the construct of interest. Factor analysis cannot solve issues pertaining to the content of a scale. However, such psychometric examination can help distinguish those items in a scale that most strongly mark the intended construct from items that simply reflect related symptoms, and guide optimal modeling of the scale to maximize its performance.

We thus aimed to establish the psychometric properties of the YSR-AB (Achenbach & Rescorla, 2001) as a stand-alone scale in a sample of JOs. To address this aim, we tested and compared three competing structural models. Following from the common practice in the literature of deriving a total scale score, the first model tested the YSR-AB data as a unidimensional structure. The second model evaluated whether the data would have a two-factor structure, reflecting targeted aggression and related symptoms, but would include a general factor. We believed it was unlikely that three group factors, along the lines of those identified by Song et al. (1994), would provide significantly better fit to the data than two, and viewed the single distinction between targeted aggression and related symptoms as more parsimonious and conceptually meaningful. The third model was that the data would take the same two-factor structure but would not include a general factor, challenging the use of a total YSR-AB score. We hypothesized that the second model would provide the best fit to the YSR-AB data, revealing factors distinguishing targeted aggression from related symptoms, along with a general factor that accounted for at least a moderate amount of variance in the solution, supporting use of a total scale score.

To further test the separability of the hypothesized group factors in the hierarchical solution, we examined the convergent and discriminant validity of the items making up these factors with (1) other mental health symptoms and (2) gender. Gender was selected because it allowed for tests of convergent and discriminant validity based on data obtained outside the YSR, and for its conceptual relevance. We hypothesized that a subscale based on the targeted aggression factor would show a significantly stronger relation with other externalizing symptoms (i.e., rule breaking) than would a subscale based on the related symptoms factor, and that the opposite pattern would emerge for internalizing symptoms. These predictions were based on our expectation that the targeted aggression items would tap more overtly externalized behaviors, such as physical attack, whereas the related symptoms items would reflect relatively more internalized symptoms, such as mood swings. We further expected targeted aggression would be higher among males than females, and related symptoms higher among females than males. This was because externalizing symptoms are typically elevated among boys relative to girls, with the reverse being true for internalizing symptoms (e.g., Leadbeater, Kuperminc, Blatt, & Hertzog, 1999), although it should be acknowledged that evidence from detained youth indicates both externalizing and internalizing symptoms may be higher among justice-involved girls relative to boys (e.g., Van Damme, Colins, & Vanderplasschen, 2014).

Finally, we anticipated that the hierarchical structure in the YSR-AB data, including the targeted aggression and related symptoms group factors, would be stable over one year.

Method

Participants

Participants came from a prospective clinical trial aimed primarily at reducing HIV risk behaviors among JOs (Donenberg, Emerson, & Kendall, 2018). JOs 13–17 years old and currently on probation were recruited via two strategies (see Donenberg et al., 2018 for details). First, they were recruited from Evening Reporting Centers (ERCs) in Cook County, Chicago. The ERCs housed a community-based probation program that served as an alternative to detention following arrest. Minors could be court-ordered to participate in an ERC for anywhere from 5–28 days in lieu of detention. The ERC programming was single-sex, on-site, after-school supervision that youth participated in while awaiting sentencing. Research staff presented the study to all youth at the ERC as a group; youth assent and parental consent were obtained for those interested.

Because females constituted only 14% of young people arrested annually in Cook County and were less likely than males to be remanded to ERCs, our second recruitment strategy focused exclusively on females. Specifically, probation officers distributed study fliers to the girls in their caseload; to avoid coercion, officers were instructed not to promote the program. They collected responses (indicating yes/no interest in learning more about the study) in a sealed envelope.

JOs were eligible to participate if they: spoke English, as assessments were normed for English speakers; understood the consent/assent process; provided assent; had consent from a legal guardian; were remanded to an ERC and attended at least one of the first four intervention sessions; and completed baseline assessments.

The final analytic sample included 310 JOs (average age = 16.08 years, SD = 1.09) at baseline (see Donenberg et al., 2018 for details). Consistent with national demographics, participants were primarily African American (90%), male (66%), and qualified for free lunch at school (86%). The frequencies of their most recent offenses, as reported by the JOs at baseline, were: person-related crimes (i.e., assault and battery) (41%), public order offenses (e.g., gun possession) (22%), property-related crimes (e.g., theft, trespassing) (20%), drug-related crimes (9%), and other offenses (8%). All study procedures were approved by the University of Illinois at Chicago Institutional Review Board with special attention to vulnerable populations. A Certificate of Confidentiality was obtained for additional protection of participant privacy.

Measures

Mental health data came from the full Youth Self-Report (YSR) (Achenbach & Rescorla, 2001), administered at baseline and 12-month follow-up. There is extensive support for reliability and validity of the YSR in assessing mental health among young people (e.g., Achenbach, McConaughy, & Howell, 1987; Achenbach & Rescorla, 2001). The YSR is made up of eight syndrome scales: anxious/depressed, withdrawn/depressed, somatic complaints, social problems, thought problems, attention problems, rule-breaking behaviors, and aggressive behaviors. The syndrome scales are combined to create externalizing (rule-breaking and aggressive behaviors), internalizing (anxious/depressed, withdrawn/depressed, and somatic complaints), and total problems (all items) broadband scales. Each syndrome and broadband scale from the YSR generates a raw and T-score.

The aggressive behaviors (YSR-AB), externalizing, and internalizing scales were used for the present report. The YSR-AB consists of 17 items describing various behaviors and moods related to aggression. Because the Externalizing Symptoms broadband scale is composed of items from the YSR-AB and Rule-Breaking scales, only the Rule-Breaking items were retained for tests of discriminant validity between YSR-AB subscales and (other) externalizing behaviors in order to avoid overlap. The Rule-Breaking scale includes 15 items assessing rule-breaking behaviors. The Internalizing Symptoms broadband scale is made up of 31 items from the Anxious/Depressed, Withdrawn/Depressed, and Somatic Complaints scales. All YSR items are rated on a 3-point scale ranging from not true to very/often true. In our sample, coefficient alpha values for the YSR-AB, Rule-Breaking, and Internalizing Symptoms scales, respectively, were .86, .72, and .85. Mean respective T-scores at baseline were 58.85 (SD = 9.45), 63.50 (SD = 8.68), and 54.05 (SD = 10.53). Symptoms can be considered within the clinical range for T-scores at or above 67 for the YSR-AB and Rule-Breaking scales, and at or above 63 for the Internalizing scale; all baseline scores were thus at subclinical levels, on average.

Data Analyses

Analyses were run with Mplus (Muthen & Muthen, 1998–2011) and the psych package (Revelle, 2019) in R (R Core Team, 2019). Full information maximum likelihood (FIML) accommodated missing data. FIML is a state-of-the-art method that reduces some bias that would have been caused by differential attrition if list-wise deletion were employed instead (Enders & Bandalos, 2001). The full sample (n = 310) provided baseline data. Follow-up retention was high at eighty-six percent (n = 268). The mean YSR-AB T-score at baseline among youth who provided follow-up data (mean = 59.83, SD = 9.67) was significantly higher than among those lost to attrition (mean = 54.98, SD = 7.43), t (121) = −4.33, p < .05.

Exploratory factor analyses (EFA) of one- and two-factor solutions for the YSR-AB data were first conducted in the full baseline sample. The reason we did not immediately perform CFA based on our content analysis of the items was that although we expected support for targeted and related symptoms group factors, we did not have a priori expectations about the assignment of every item. Items in the EFA were treated as continuous with the minimum residual method. In order to allow for either an oblique or orthogonal structure to emerge from the data, an oblimin rotation was applied.

Three fit indices were consulted to evaluate model goodness of fit in the confirmatory factor analyses (CFA) that followed the EFA: root mean square error of approximation (RMSEA; Steiger, 1989) with 90% confidence interval (CI) as is standard in the field, standardized root mean square residual (SRMR), and comparative fit index (CFI; Bentler, 2004). To conclude good fit between the observed data and the hypothesized model, we considered cutoffs of ≤ .06 and ≤ .08 for RMSEA and SRMR, respectively, and ≥ .95 for CFI (Hu & Bentler, 1998). Based on the advice that these cutoffs be used flexibly (Marsh, Hau, & Wen, 2004), we considered a model to have good fit if it met two of the three cut points. Of note, Hu and Bentler’s original recommendations applied to interpreting pairs of fit indices (e.g., RMSEA and CFI); by consulting three indices, the present study took a more stringent approach.
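The two-of-three decision rule described above can be illustrated with a minimal sketch. The function names are hypothetical and not part of the original analysis; the fit values are those reported in the Results for the three baseline models.

```python
def count_cutoffs_met(rmsea, srmr, cfi):
    """Count how many of the three conventional cutoffs a model satisfies."""
    return sum([rmsea <= 0.06, srmr <= 0.08, cfi >= 0.95])


def has_good_fit(rmsea, srmr, cfi, required=2):
    """Apply the flexible rule: good fit if at least `required` cutoffs are met."""
    return count_cutoffs_met(rmsea, srmr, cfi) >= required


# (RMSEA, SRMR, CFI) as reported for the baseline CFA models.
models = {
    "hierarchical": (0.047, 0.04, 0.94),
    "two group factors, no general factor": (0.083, 0.15, 0.80),
    "one factor": (0.070, 0.06, 0.86),
}

decisions = {name: has_good_fit(*fit) for name, fit in models.items()}
for name, ok in decisions.items():
    print(f"{name}: {'good fit' if ok else 'not good fit'}")
```

Under this rule only the hierarchical model qualifies: it meets the RMSEA and SRMR cutoffs but falls just short on CFI.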

Results

Factor Structure and General Factor Variance of the YSR-AB

In interpreting factor loadings from the EFA of one- and two-factor solutions, we assigned an item to a group factor if its largest loading was on that factor and if the item loading met or exceeded an absolute value of 0.30 (Gorsuch, 1983). Consistent with expectations, the first factor appeared to primarily reflect aggressive behaviors targeting others, and the second related symptoms (see Table 1). One of the items (“I am suspicious of others”) did not have absolute loadings at or above 0.30 on either factor, but had substantial loadings (i.e., above 0.20) on both factors. This item was considered a good representation of the general factor variance, and allowed to load onto both factors.1

Table 1.

Factor Pattern of the Aggressive Behavior Scale Items from the Youth Self-Report (YSR) after Oblimin Rotation for a Two-Factor Exploratory Extraction at Baseline

Item Targeted Aggression Related Symptoms
#21 I destroy things belonging to others 0.77 −0.12
#97 I threaten to hurt people 0.74 −0.04
#57 I physically attack people 0.61 0.17
#94 I tease others a lot 0.56 0.03
#23 I disobey at school 0.56 0.05
#20 I destroy my own things 0.41 0.08
#16 I am mean to others 0.40 0.22
#37 I get in many fights 0.37 0.24
#19 I try to get a lot of attention 0.35 0.12
#22 I disobey my parents 0.30 0.20
#89 I am suspicious of others 0.23 0.24
#87 My moods or feelings change suddenly −0.08 0.62
#68 I scream a lot −0.01 0.56
#86 I am stubborn 0.01 0.53
#95 I have a hot temper 0.20 0.44
#3 I argue a lot 0.21 0.42
#104 I am louder than other kids 0.18 0.31

Note. Loadings in boldface indicate the factor to which an item was assigned.
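As an illustration of the assignment rule (largest loading, with absolute value at or above 0.30), the following sketch applies it to the Table 1 loadings. The function and variable names are hypothetical; the rule itself is the one stated in the text.

```python
CUTOFF = 0.30

# YSR item number -> (targeted aggression loading, related symptoms loading), from Table 1.
table1 = {
    21: (0.77, -0.12), 97: (0.74, -0.04), 57: (0.61, 0.17), 94: (0.56, 0.03),
    23: (0.56, 0.05), 20: (0.41, 0.08), 16: (0.40, 0.22), 37: (0.37, 0.24),
    19: (0.35, 0.12), 22: (0.30, 0.20), 89: (0.23, 0.24), 87: (-0.08, 0.62),
    68: (-0.01, 0.56), 86: (0.01, 0.53), 95: (0.20, 0.44), 3: (0.21, 0.42),
    104: (0.18, 0.31),
}


def assign_item(targeted, related, cutoff=CUTOFF):
    """Return the factor with the largest absolute loading, or None if below the cutoff."""
    candidates = (("targeted aggression", targeted), ("related symptoms", related))
    name, loading = max(candidates, key=lambda pair: abs(pair[1]))
    return name if abs(loading) >= cutoff else None


assignments = {item: assign_item(*loads) for item, loads in table1.items()}
unassigned = [item for item, factor in assignments.items() if factor is None]
print(unassigned)
```

Only item #89 is left unassigned by the rule, which matches the item the text describes as loading on both factors.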

In support of testing up to two factors, results from both a scree test and parallel analysis (Horn, 1965), which compared the eigenvalues of the observed data to random re-samples, suggested retaining at most two factors (see Figure 1). That is, it was only the first two eigenvalues that exceeded the eigenvalues based on the random re-sampled data, and likewise it was only the first two that appeared to be above the scree line.
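The logic of parallel analysis can be sketched as follows. This is a simplified illustration and not the procedure used in the study: it works on principal-component eigenvalues only, builds the reference distribution by shuffling each column independently, and uses the 95th percentile as the threshold. All three are assumptions, as are the function name and the simulated toy data (NumPy is required).

```python
import numpy as np


def parallel_analysis(data, n_iter=200, percentile=95, seed=0):
    """Count leading eigenvalues of the correlation matrix that exceed
    the chosen percentile of eigenvalues from column-shuffled data."""
    rng = np.random.default_rng(seed)
    n_items = data.shape[1]
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_iter, n_items))
    for i in range(n_iter):
        # Shuffling each column separately destroys the inter-item correlations.
        shuffled = np.column_stack(
            [rng.permutation(data[:, j]) for j in range(n_items)]
        )
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(shuffled, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    n_retain = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_retain += 1
    return n_retain, observed, threshold


# Toy data: 8 items generated from 2 uncorrelated factors (hypothetical values).
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
noise = rng.standard_normal((500, 8)) * np.sqrt(1 - 0.7**2)
toy = factors @ loadings.T + noise

n_factors, observed, threshold = parallel_analysis(toy)
print(n_factors)
```

The count stops at the first eigenvalue that fails to exceed its reference value, mirroring the "only the first two eigenvalues exceeded" reasoning in the text.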

Figure 1.

Principal components (PCs) and factors (FAs) derived from the Aggressive Behaviors scale from the Youth Self-Report (YSR-AB) at baseline comparing the eigenvalues of the observed data to random re-samples.

CFA was then used to test the one- and two-factor models. By conducting the EFA in the same sample as the CFA we could have capitalized on sampling error. Our tests of temporal invariance, described below, provided a partial check against this by testing fit in a wave different from the one in which the EFA was derived. In the two-factor CFA, items were not allowed to load on more than one of the group factors, with the exception of the one item described above. A general factor consisting of all the items was included in the model, making it a hierarchical (or bifactor) model.2 The group factors were constrained to be orthogonal to each other and to the general factor. We opted to use a hierarchical model rather than a simple higher-order model in which two latent factors load on a superordinate factor because the former allows for a clean decomposition of variance due to a general factor versus variance due to group factors. To evaluate the inclusion of the general factor in the hierarchical solution, a chi-square difference compared the hierarchical model with two group factors, RMSEA = 0.047 with 90% CI = [0.04–0.06] , SRMR = 0.04, CFI = 0.94, to one that included the same group factors constrained to be orthogonal but that did not include a general factor, RMSEA = 0.083 with 90% CI = [0.07–0.09], SRMR = 0.15, CFI = 0.80. This test showed that inclusion of the general factor significantly improved model fit, χ2 (17) = 197.14, p < .001. A chi-square difference test next determined if the hierarchical model was a better representation of the data than was the one-factor solution. The fit indices for the one-factor CFA were RMSEA = 0.070 with 90% CI = [0.06–0.08], SRMR = 0.06, CFI = 0.86. Results from the difference test showed that compared with the one-factor model, the hierarchical model provided significantly better fit, χ2 (18) = 127.42, p < .001. We thus ultimately selected the hierarchical solution. 
An omega hierarchical (ωh) value of 0.66 provided further support for the presence of a moderate to strong general factor in the data, indicating that 66% of the variance in the total score was attributable to the general factor (Zinbarg et al., 2005; Zinbarg et al., 2006). Together, these findings supported our hypothesis, demonstrating the use of a total scale score was appropriate but also revealing group factors distinguishing targeted aggression from related symptoms.
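To make the meaning of this coefficient concrete, the sketch below computes omega hierarchical for an orthogonal bifactor model with standardized items, as the squared sum of general-factor loadings divided by the model-implied variance of the unit-weighted total score. The loadings are hypothetical toy values, not the study's estimates, and the function name is illustrative; the study's reported value was obtained with its own software and is not reproduced here.

```python
def omega_hierarchical(general, groups):
    """Omega hierarchical for an orthogonal bifactor model with standardized items.

    general: general-factor loading for each item.
    groups:  one list per group factor, giving each item's loading on that
             factor (0.0 where the item does not load).
    """
    general_var = sum(general) ** 2
    group_var = sum(sum(g) ** 2 for g in groups)
    # Unique variance of each item: 1 minus its communality.
    unique_var = sum(
        1.0 - lam ** 2 - sum(g[i] ** 2 for g in groups)
        for i, lam in enumerate(general)
    )
    return general_var / (general_var + group_var + unique_var)


# Hypothetical example: 4 items, 2 group factors with 2 items each.
toy_general = [0.6, 0.6, 0.6, 0.6]
toy_groups = [[0.4, 0.4, 0.0, 0.0], [0.0, 0.0, 0.4, 0.4]]
omega_h = omega_hierarchical(toy_general, toy_groups)
print(round(omega_h, 3))
```

In this toy case the general factor accounts for roughly 64% of total-score variance, with the remainder split between group-factor and unique variance.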

Table 2 presents the factor loadings from the hierarchical CFA. As can be seen in this table, the highest loadings on the general factor came almost entirely from items making up the targeted aggression group factor. In other words, the targeted aggression items appeared to be better measures of the underlying aggressive behaviors construct than did the remaining items. This conclusion was reinforced by comparison of a hierarchical model with two group factors in which all item loadings onto the general factor were constrained to be equal to each other to a similar model in which loadings onto the general factor were set equal among the targeted aggression items (unstandardized loadings = .33) and, separately, among the related symptoms items (unstandardized loadings = .29). The significant difference between models, χ2 (1) = 3.86, p < .05, supported the differential loadings and suggested that the targeted aggression items had reliably higher loadings onto the general factor than did the related symptoms items.
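The p-values implied by the chi-square difference tests reported so far can be checked with a small standard-library routine. The function below is an illustrative helper, not the software used in the study; it evaluates the upper tail of the chi-square distribution for integer degrees of freedom using the standard recurrence for the regularized incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1, x > 0."""
    half_x = x / 2.0
    if df % 2:
        # df = 1 base case: chi-square(1) is a squared standard normal.
        a, q = 0.5, math.erfc(math.sqrt(half_x))
    else:
        # df = 2 base case: exponential tail.
        a, q = 1.0, math.exp(-half_x)
    # Recurrence Q(a + 1, x) = Q(a, x) + x^a e^(-x) / Gamma(a + 1), in log space.
    while a < df / 2.0:
        q += math.exp(a * math.log(half_x) - half_x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Difference tests as reported: (chi-square difference, df).
tests = {
    "general factor added": (197.14, 17),
    "hierarchical vs. one factor": (127.42, 18),
    "differential general loadings": (3.86, 1),
}
p_values = {name: chi2_sf(stat, df) for name, (stat, df) in tests.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3g}")
```

The first two tests fall far below .001, whereas the test of differential loadings is only marginally below .05, consistent with the "small but reliable" characterization of that difference.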

Table 2.

Factor Loadings for the Hierarchical Model of the Aggressive Behaviors Scale Items from the Youth Self-Report (YSR) at Baseline

Item Targeted Aggression Related Symptoms General Factor
#21 I destroy things belonging to others 0.15 -- 0.61
#97 I threaten to hurt people 1.23 -- 0.62
#57 I physically attack people 0.13 -- 0.69
#94 I tease others a lot 0.00 -- 0.58
#23 I disobey at school −0.03 -- 0.63
#20 I destroy my own things 0.00 -- 0.47
#16 I am mean to others −0.01 -- 0.58
#37 I get in many fights −0.01 -- 0.54
#19 I try to get a lot of attention −0.02 -- 0.45
#22 I disobey my parents −0.11 -- 0.49
#89 I am suspicious of others 0.08 0.18 0.37
#87 My moods or feelings change suddenly -- 0.48 0.33
#68 I scream a lot -- 0.43 0.37
#86 I am stubborn -- 0.38 0.37
#95 I have a hot temper -- 0.33 0.49
#3 I argue a lot -- 0.23 0.52
#104 I am louder than other kids -- 0.27 0.37

Note. All loadings are standardized. Values for the targeted aggression and related symptoms factors indicate the item loadings residualized for general factor variance.

Tests of Convergent and Discriminant Validity

We next tested if subscales corresponding to the targeted aggression and related symptoms factors related differentially to other mental health symptoms and gender. Tests of the relative strengths of correlations from dependent samples were used to compare the links between targeted aggression and related symptoms and the continuous symptom measures. To examine if these subscales related differentially to the binary gender outcome, a repeated measures ANOVA was first conducted of aggression subscale by gender. The significant higher-order interaction was then followed up by independent samples t-tests of the gender effects for each of the two scales. The correlation between targeted aggression and rule-breaking (r = 0.68) was significantly stronger than that between related symptoms and rule-breaking (r = 0.45), after accounting for the correlation between targeted aggression and related symptoms (r = 0.65), z = 6.12, p < .001. Similarly, targeted aggression showed a significantly weaker association with internalizing symptoms (r = 0.42) than did related symptoms (r = 0.58), after accounting for the correlation between subscales (r = 0.65), z = −3.91, p < .001.3 The higher-order interaction of subscale (targeted aggression vs. related symptoms) by gender was significant, F (1, 308) = 54.30, p < .001. Follow-up tests revealed no significant difference between genders on targeted aggression, t (308) = −.89, p = .37. However, significant effects emerged for related symptoms, t (179) = 5.54, p < .001.4 In line with expectations, related symptoms were significantly higher among females (mean = 6.26, SD = 3.36) than males (mean = 4.14, SD = 2.80).

Temporal Invariance over One-Year Follow-up

Although measurement equivalence across time is often assumed, it is important to test this assumption (Ciesla, Cole, & Steiger, 2007). As a final analytic step, we constrained the item loadings in the hierarchical solution to be equal from baseline to 12-month follow-up (as well as equal among each other for loadings onto each group or general factor, as the model would otherwise not converge). Fit indices for the configural invariant model were RMSEA = 0.053 with 90% CI = [0.048–0.057], SRMR = 0.07, and CFI = 0.83. Those for the metric invariant model were RMSEA = 0.053 with 90% CI = [0.048–0.057], SRMR = 0.07, and CFI = 0.83. Imposing across-time equality constraints did not result in a significant decrement in model fit, χ2 (3) = 4.92, p > .05, indicating any longitudinal changes in the group or general factors were due to changes in levels of the latent construct, as opposed to changes in the structure of the measure.

Discussion

Consistent with expectations, we found that a hierarchical model with two group factors was the best representation of the YSR-AB data in a sample of JOs on probation. The two lower-order factors differentiated targeted aggression (e.g., physically attacking a person) from related symptoms (e.g., mood swings). In this model, 66% of the variance in the total score was attributable to a general factor. Tests of temporal invariance conducted over one year showed that the factor structure held across time. Together, these findings suggested that use of a total scale score to represent YSR-AB data in JOs was justified. At the same time, they pointed to the potential importance of considering distinctions between items tapping targeted aggression versus related symptoms. This distinction could have implications for identifying aspects of the YSR-AB most strongly associated with key outcomes related to aggression.

The separable nature of targeted aggression from related symptoms was reinforced by our tests of convergent and discriminant validity, as well as by examination of the factor loadings from the hierarchical solution. We found that a subscale comprised of the targeted aggression items had a significantly stronger association with other externalizing symptoms (i.e., rule-breaking behaviors) than did a related symptoms subscale. Similarly, the targeted aggression subscale had a significantly weaker association with internalizing symptoms than did the related symptoms subscale. There were no significant gender differences in targeted aggression, but females were significantly higher in related symptoms than were males. Although we expected males to report significantly higher levels of targeted aggression than females, the pattern of findings was otherwise in line with expectations. This was because we anticipated that targeted aggression would tap more overtly externalized behaviors, such as physical attack, whereas related symptoms would largely reflect symptoms that were more internalized in nature, such as mood swings. Moreover, internalizing symptoms are generally higher among adolescent girls than boys (e.g., Leadbeater et al., 1999). Given that aggression is defined as an externalizing condition (American Psychiatric Association, 2013), the results also offered evidence that the targeted aggression items are better indicators of the general aggression construct. In line with this conclusion, we found a small but reliable difference in the item loadings onto the general factor in our hierarchical solution, with the highest loadings coming almost entirely from those items assigned to the targeted aggression group factor.

We suggest that future studies either consider a hierarchical structure with two group factors to represent the YSR-AB data or, where latent variable modeling is not employed due to sample size limitations or other methodological constraints, run separate tests of aggression as measured by subscales corresponding to the targeted aggression and related symptoms factors. This could be useful as the targeted aggression items, to the extent that they are stronger and purer markers of the general construct, will be better predictors of key outcomes related to aggression. For example, the targeted aggression items might be more sensitive than the related symptom items to changes in aggression over time resulting from an intervention, or be more strongly linked with future behaviors such as recidivism. In support of this expectation, work from our group based on the present report found that the prospective effects of our psychosocial intervention on future aggression in clinically aggressive JOs were significantly stronger when aggression was measured with the targeted aggression versus related symptom items from the YSR-AB (Kendall, Emerson, Hartmann, Zinbarg, & Donenberg, 2017). Furthermore, targeted aggression might be more costly to society, given that it is overt expressions of aggression—and not the related internal states per se—that result in physical harm to others and contact with the justice system. From intervention and prevention standpoints, focus on reductions in targeted aggression might thus be the most critical outcome.

Implications for Interpreting Past Findings from the YSR-AB

Although our findings of a hierarchical model help justify the use in past research of a total scale score (e.g., Shoal et al., 2003), they also highlight points of caution in interpreting past results from studies with JOs. As previously described, we showed that the targeted aggression items were the strongest indicators of aggression. By combining the targeted aggression and related symptom items from the YSR-AB into a single composite score, it is possible that past studies may have attenuated their measure of aggression and produced effect sizes that were weaker than the true values, or even missed significant effects entirely. Furthermore, whereas the targeted aggression items appeared to be relatively specific to aggression, those from the related symptoms factor appeared to more often be common to multiple psychiatric conditions (American Psychiatric Association, 2013). Past studies using a total YSR-AB score may therefore have inadvertently infused the aggression construct with other symptom variance.
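The attenuation argument above can be made concrete with the standard formula for the correlation between an outcome and the unit-weighted sum of two standardized subscales. The sketch below assumes both subscales have equal variance; the function name and the input correlations are hypothetical and are not estimates from the study.

```python
import math


def composite_correlation(r_a: float, r_b: float, r_ab: float) -> float:
    """Correlation between an outcome and the sum of two standardized subscales.

    r_a:  correlation of subscale A with the outcome
    r_b:  correlation of subscale B with the outcome
    r_ab: correlation between subscales A and B
    """
    # Cov(A + B, Y) = r_a + r_b and Var(A + B) = 2 + 2 * r_ab for unit-variance scores.
    return (r_a + r_b) / math.sqrt(2 + 2 * r_ab)


if __name__ == "__main__":
    # Hypothetical values: a strong marker (A) pooled with a weak marker (B).
    r_total = composite_correlation(r_a=0.5, r_b=0.1, r_ab=0.4)
    print(f"Subscale A alone: 0.500; composite of A and B: {r_total:.3f}")
```

Under these assumed values the composite correlates less strongly with the outcome than subscale A does alone, which is the pattern the paragraph above describes for a total score that pools stronger and weaker indicators.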

Considerations and Conclusions

Limitations must be considered in interpreting the results. First, our focus on JOs fills an important gap and positions the present report to inform research on aggression in this population, but also necessarily limits generalizability. In particular, our findings based on a primarily urban, African-American, and male sample might not generalize to youth outside the juvenile justice system or to other youth within it. Yet, this sample is similar to other urban JO populations, providing information about the majority of youth arrested in the U.S. (U.S. Department of Justice, 2010). Second, there was evidence of differential attrition, with baseline aggression higher among youth who provided follow-up data than those who did not (although both groups were at subclinical levels). Concerns related to differential bias are offset, however, by the fact that our tests of metric invariance were supported. Were this not the case, it would remain possible that invariance held, but worked differentially depending on level of aggression. Instead, the present report demonstrated that the factor structure of the YSR-AB in JOs was stable across both time and level of aggression. Third, because EFA was conducted in the same sample as CFA, our analyses may have capitalized on sampling error. A partial check against this was provided by the tests of temporal invariance, as model fit was tested in a wave different from the one in which the EFA was derived. Fourth, the inclusion of an item with roughly equivalent loadings on both the targeted aggression and related symptoms factors could be problematic without significant primary-secondary discrepancy (Matsunaga, 2010). However, re-running all models without inclusion of this item on either factor did not meaningfully impact the pattern of results. Fifth, some of the item loadings from our two-factor extraction were small, indicating the items were not strong markers of their assigned factor. Such items would thus not function well as stand-alone measures, but of note, we are not advocating that any one of the items be used on its own. A final limitation is that all of our outcomes relied on self-report. It could be useful for future work to establish the convergent and discriminant validity of the targeted aggression and related symptom items from the YSR-AB in relation to objective measures, such as youths’ officially registered violent offenses. However, it is a strength that convergent and discriminant validity were demonstrated not just using other mental health symptoms from the YSR, but also a baseline characteristic obtained outside the YSR (gender). Future studies should expand on these findings by assessing gender with non-binary response options.

Despite its limitations, the present study addresses important gaps in knowledge of the psychometric properties of the YSR-AB, a widely used measure of aggression in young people. It also assists in interpreting past work conducted with the scale, and lays the foundation for productive future applications. We hope our analyses will encourage hierarchical factor modeling of the YSR-AB data where possible, particularly in work with JOs, and special consideration of the targeted aggression versus related symptom items.

Acknowledgments

This research was supported in part by the National Institute on Minority Health and Health Disparities (R01MD005861) to Geri R. Donenberg and Erin M. Emerson. We thank all collaborating institutions in the conduct of this study (Cook County Department of Probation and Court Services, Evening Reporting Centers, Circuit Court of Cook County, Cook County Juvenile Temporary Detention Center, Illinois Department of Juvenile Justice, Cook County Sheriff’s Office, Illinois Department of Corrections). We also thank the youth for their participation. Ashley D. Kendall, Erin M. Emerson, Richard E. Zinbarg, and Geri R. Donenberg have no conflicts of interest.

Footnotes

1

As an alternative approach, we re-ran the models described below without this item included on either factor. Doing so did not meaningfully impact the pattern of results. For consistency with other studies that use the full YSR-AB scale, we report the set of findings that included all items.

2

The hierarchical solution produced a Heywood case, prompting us to constrain the residual variance for a single item (Item 97) to zero. The fit indices for this constrained model, RMSEA = 0.047 with 90% CI = [0.04, 0.06], SRMR = 0.04, CFI = 0.94, were nearly identical to those from the original solution, suggesting that the Heywood case was not indicative of poor model specification. To facilitate comparison to other models in which constraint of the single item was not called for, we thus retained use of the original hierarchical model (i.e., that in which no constraint was imposed on Item 97).

3

We re-ran each of the mental health symptom tests excluding Item 87 (“My moods or feelings change suddenly”), given that this item would be expected to correlate strongly with internalizing symptoms. After doing so, the correlations did not change by more than a hundredth of a point as compared with those reported in the text, and the significant effects all remained at the level of p < .001.

4

Levene’s test for equality of variances was violated; results are thus reported from a model in which equal variances were not assumed.
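A comparison in which equal variances are not assumed is commonly carried out with Welch's t-test, which uses separate variance estimates for each group and the Welch-Satterthwaite approximation for the degrees of freedom. The footnote does not name the procedure, so treating it as Welch's test is an assumption; the function name and the example scores below are likewise hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_x, n_y = len(x), len(y)
    if n_x < 2 or n_y < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error contributed by each group (sample variance / n).
    se2_x = variance(x) / n_x
    se2_y = variance(y) / n_y
    t = (mean(x) - mean(y)) / math.sqrt(se2_x + se2_y)
    df = (se2_x + se2_y) ** 2 / (se2_x**2 / (n_x - 1) + se2_y**2 / (n_y - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical subscale scores for two groups with unequal spread.
    group_a = [1, 2, 3, 4, 5]
    group_b = [2, 4, 6, 8, 10]
    t_stat, df = welch_t(group_a, group_b)
    print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because the degrees of freedom are estimated from the data, they are usually non-integer and smaller than the pooled-variance value of n_x + n_y - 2.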

Contributor Information

Erin M. Emerson, University of Illinois at Chicago

Richard E. Zinbarg, Northwestern University and The Family Institute at Northwestern University

Geri R. Donenberg, University of Illinois at Chicago

References

  1. Achenbach T (1991). Manual for the Youth Self-Report and 1991 Profile. Burlington, VT: University of Vermont, Department of Psychiatry.
  2. Achenbach T, McConaughy SH, & Howell CT (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232. doi: 10.1037/0033-2909.101.2.213
  3. Achenbach T, & Rescorla LA (2001). Manual for the ASEBA school-age forms & profiles. Burlington, VT: University of Vermont, Department of Psychiatry.
  4. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
  5. Bentler PM (2004). EQS 6.1 Structural Equations Program Manual. Encino, CA: Multivariate Software.
  6. Ciesla JA, Cole DA, & Steiger JH (2007). Extending the trait-state-occasion model: How important is within-wave measurement equivalence? Structural Equation Modeling, 14, 77–97. doi: 10.1080/10705510709336737
  7. Cohen MA, Miller TR, & Rossman SB (1994). The costs and consequences of violent behavior in the United States. In Reiss AJ & Roth JA (Eds.), Understanding and Preventing Violence: Consequences and Control (Vol. 4, pp. 67–166). Washington, D.C.: National Academy Press.
  8. Cohen MA, & Piquero AR (2009). New evidence on the monetary value of saving a high risk youth. Journal of Quantitative Criminology, 25, 25–49. doi: 10.1007/s10940-008-9057-3
  9. Colins O (2015). Assessing reactive and proactive aggression in detained adolescents outside of a research context. Child Psychiatry and Human Development, 47, 159–172. doi: 10.1007/s10578-015-0553-z
  10. Colins O, Vermeiren R, Schuyten G, & Broekaert E (2009). Psychiatric disorders in property, violent, and versatile offending detained male adolescents. American Journal of Orthopsychiatry, 79, 31–38. doi: 10.1037/a0015337
  11. Donenberg GR, Emerson E, & Kendall AD (2018). HIV-Risk Reduction Intervention for Juvenile Offenders on Probation: The PHAT Life Group Randomized Controlled Trial. Health Psychology, 37, 364–374. doi: 10.1037/hea0000582
  12. Enders CK, & Bandalos DL (2001). The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling, 8, 430–457. doi: 10.1207/S15328007SEM0803_5
  13. Gorsuch RL (1983). Factor Analysis (2nd ed.). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
  14. Haapasalo J, & Hamalainen T (1996). Childhood family problems and current psychiatric problems among young violent and property offenders. Journal of the American Academy of Child & Adolescent Psychiatry, 35, 1394–1401. doi: 10.1097/00004583-199610000-00027
  15. Horn J (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185. doi: 10.1007/BF02289447
  16. Hu L, & Bentler PM (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424–453. doi: 10.1037/1082-989X.3.4.424
  17. Ivanova MY, Achenbach T, Dumenci L, Bilenberg N, Broberg AG, Dopfner M, … Zukauskiene R (2007). The generalizability of the Youth Self-Report syndrome structure in 23 societies. Journal of Consulting and Clinical Psychology, 75, 729–738. doi: 10.1037/0022-006X.75.7.729
  18. Karnik NS, Jones PA, Campanaro AE, Haapanen R, & Steiner H (2006). Ethnic variation of self-reported psychopathology among incarcerated youth. Community Mental Health Journal, 42, 477–486. doi: 10.1007/s10597-006-9056-5
  19. Kendall AD, Emerson EM, Hartmann W, Zinbarg RE, & Donenberg GR (2017). A two-week psychosocial intervention reduces future aggression and incarceration in clinically aggressive juvenile offenders. Journal of the American Academy of Child & Adolescent Psychiatry, 56, 1053–1061. doi: 10.1016/j.jaac.2017.09.424
  20. Lambie I, & Randell I (2013). The impact of incarceration on juvenile offenders. Clinical Psychology Review, 33, 448–459. doi: 10.1016/j.cpr.2013.01.007
  21. Leadbeater BJ, Kuperminc GP, Blatt SJ, & Hertzog C (1999). A multivariate model of gender differences in adolescents’ internalizing and externalizing problems. Developmental Psychology, 35, 1268–1282.
  22. Ligthart L, Bartels M, Hoekstra RA, Hudziak JJ, & Boomsma DI (2005). Genetic contributions to subtypes of aggression. Twin Research and Human Genetics, 8, 483–491. doi: 10.1375/twin.8.5.483
  23. Loeber R, & Hay D (1997). Key issues in the development of aggression and violence from childhood to early adulthood. Annual Review of Psychology, 48, 371–410. doi: 10.1146/annurev.psych.48.1.371
  24. Margolin G, & Gordis EB (2000). The effects of family and community violence on children. Annual Review of Psychology, 51, 445–479. doi: 10.1146/annurev.psych.51.1.445
  25. Marsh HW, Hau K, & Wen Z (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling, 11, 320–341. doi: 10.1207/s15328007sem1103_2
  26. Matsunaga M (2010). How to factor-analyze your data right: Do’s, don’ts, and how-to’s. International Journal of Psychological Research, 3, 97–110. doi: 10.21500/20112084.854
  27. Muthen LK, & Muthen BO (1998-2011). Mplus User’s Guide (6th ed.). Los Angeles, CA: Muthen & Muthen.
  28. Office of Juvenile Justice and Delinquency Prevention. (2013). Juvenile Arrests 2011. U.S. Department of Justice, Office of Justice Programs.
  29. Puzzanchera C (2014). Juvenile arrests 2012. Juvenile Justice Bulletin. Washington, DC: Office of Juvenile Justice and Delinquency Prevention.
  30. R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/
  31. Revelle W (2019). psych: Procedures for personality and psychological research. Evanston, IL: Northwestern University. Retrieved from http://CRAN.R-project.org/package=psych
  32. Shoal GD, Giancola PR, & Kirillova GP (2003). Salivary cortisol, personality, and aggressive behavior in adolescent boys: A 5-year longitudinal study. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 1101–1107. doi: 10.1097/01.CHI.000007024
  33. Steiger JH (1989). EzPATH: A supplementary module for SYSTAT and SYSGRAPH. Evanston: SYSTAT.
  34. U.S. Department of Justice-Federal Bureau of Investigation. (2010). Crime in the United States, 2009. Retrieved from http://www2.fbi.gov/ucr/cius2009/data/table_43.htlm.
  35. Van Damme L, Colins OF, & Vanderplasschen W (2014). Gender differences in psychiatric disorders and cluster of self-esteem among detained adolescents. Psychiatry Research, 220, 991–997. doi: 10.1016/j.psychres.2014.10.012
  36. Vreugdenhil C, van den Brink W, Ferdinand R, Wouters L, & Doreleijers T (2006). The ability of YSR scales to predict DSM/DISC-C psychiatric disorders among incarcerated male adolescents. European Child & Adolescent Psychiatry, 15, 88–96. doi: 10.1007/s00787-006-0497-8
  37. Zinbarg RE, Revelle W, Yovel I, & Li W (2005). Cronbach’s α, Revelle’s β, and McDonald’s ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123–133. doi: 10.1007/s11336-003-0974-7
  38. Zinbarg RE, Yovel I, Revelle W, & McDonald RP (2006). Estimating generalizability to a latent variable common to all of a scale’s indicators: A comparison of estimators for ωh. Applied Psychological Measurement, 30, 121–144. doi: 10.1177/0146621605278814
