Author manuscript; available in PMC: 2020 May 1.
Published in final edited form as: J Clin Child Adolesc Psychol. 2017 Aug 18;48(3):491–500. doi: 10.1080/15374416.2017.1350963

Associations between anxious and depressive symptoms and the recognition of vocal socio-emotional expressions in youth

Michele Morningstar 1, Melanie A Dirks 1, Brent I Rappaport 2, Daniel S Pine 2, Eric E Nelson 2,3
PMCID: PMC6314909  NIHMSID: NIHMS1512483  PMID: 28820619

Abstract

Objective:

The current study examined the associations between internalizing symptoms and adolescents’ recognition of vocal socio-emotional expressions produced by youth.

Method:

Fifty-seven youth (8–17 years old, M = 12.62, SD = 2.66; 29 anxious, 28 non-anxious; 32 female, 25 male) were asked to identify the intended expression in auditory recordings of youth’s portrayals of basic emotions and social attitudes.

Results:

Recognition accuracy increased with age, suggesting that the ability to recognize vocal affect continues to develop into adolescence. Anxiety symptoms were not associated with recognition ability, but youth’s depressive symptoms were related to poorer identification of anger and happiness.

Conclusions:

Youth experiencing symptoms of depression may be likely to misinterpret vocal expressions of happiness and anger.

Keywords: Vocal emotion recognition, anxiety, depression, youth, development


The ability to recognize others’ emotions and social intents from their nonverbal cues is an important component of social cognition and competent interpersonal interactions (Halberstadt, Denham, & Dunsmore, 2001), as well as psychosocial well-being (Maxim & Nowicki, 2003; Trentacosta & Fine, 2009). Prior research has linked deficits in the identification of facial expressions to anxiety and depression in youth (e.g., Demenescu, Kortekaas, den Boer, & Aleman, 2010; Easter et al., 2005; Guyer et al., 2007; McClure, Pope, Hoberman, Pine, & Leibenluft, 2003; Simonian, Beidel, Turner, Berkes, & Long, 2001); however, little is known about how internalizing symptoms may be associated with the recognition of other nonverbal cues, such as vocal expressions. Given that vocal affect is an important component of emotional communication in social interactions (Scherer, 2003), the current study examined how adolescents’ anxious and depressive symptoms related to their ability to identify other youth’s vocal expressions.

Emotion recognition (ER) is crucial to youth’s navigation of their social worlds (Crick & Dodge, 1994; Halberstadt et al., 2001). During adolescence, youth begin to interact more frequently with peers (Larson & Richards, 1991; Larson, Richards, Moneta, Holmbeck, & Duckett, 1996), with whom they share their emotional experiences (Stanton-Salazar & Spina, 2005). ER deficits may complicate such social interactions, which can already be difficult for youth experiencing anxiety and depression (Erath, Flanagan, & Bierman, 2007; Kingery, Erdley, Marshall, Whitaker, & Reuter, 2010; Kochel, Ladd, & Rudolph, 2012; La Greca & Harrison, 2005). As such, investigating ER deficits in youth with internalizing symptoms can help identify problematic social-information processing patterns that may be linked to the experience of depression and anxiety.

Most research on the psychosocial correlates of ER in youth has examined the identification of emotional cues in faces, which relies on the visual modality. In contrast, little attention has been given to the recognition of other nonverbal cues, which the NIMH Research Domain Criteria (RDoC) delineates as a distinct facet of social communication. For instance, a speaker’s voice contains important information about their emotions and social attitudes (Banse & Scherer, 1996), beyond what is gleaned from facial expressions (Heinrich & Borkenau, 1998; Pell, 2002; Zaki, Bolger, & Ochsner, 2009). However, very little is known about youth’s understanding of emotional prosody in the auditory modality.

The ability to identify emotional cues in the voice has been found to increase with age throughout childhood (Allgood & Heaton, 2015; Chronaki, Hadwin, Garner, Maurage, & Sonuga-Barke, 2014; Sauter, Panattoni, & Happé, 2013) and beyond, as adults typically outperform youth in vocal ER tasks (Brosgole & Weisman, 1995; Chronaki et al., 2014; Zupan, 2015). However, previous studies have investigated age-related changes in vocal ER skills in youth 12 years old or younger, and little data exist to inform the developmental trajectory of this skill in later adolescence. The fact that vocal ER capacities may be continuing to mature at a time when internalizing disorders often manifest indicates that developmental investigations of vocal ER in anxious and depressed youth are warranted.

At present, few studies have examined the associations between internalizing symptoms and vocal ER in youth. Manassis and Young (2000) reported that 8- to 12-year-old patients diagnosed with anxiety disorders were highly accurate in identifying vocal sadness, and McClure and Nowicki (2001) found that a community sample of 8- to 10-year-old children with higher self-reported levels of social anxiety struggled to identify fearful voices; however, little else is known about the links between anxiety and vocal affect recognition in youth. Similarly, only a few studies have examined associations between depression and vocal ER in children and adolescents: clinically depressed 9- to 11-year-old boys were less accurate than their peers at recognizing vocal affect (Emerson, Harrison, & Everhart, 1999), and boys with high self-reported feelings of depression showed similar impairments (Nowicki & Carton, 1997).

The present research sought to address two gaps in the existing literature. First, most of the studies reviewed above assessed vocal ER and its association with internalizing disorders in children between 7 and 12 years of age. However, the increased incidence of anxiety and depression in adolescence (see Paus, Keshavan, & Giedd, 2008) encourages the investigation of this link in older youth as well. The current study thus examined the relation between vocal ER and symptoms of depression and anxiety in 8- to 17-year-olds. Second, most previous studies have used adult-generated stimuli to assess youth’s ER. However, since children and adolescents may be exposed more frequently to their peers’ emotional outputs than to those of adults (e.g., Larson & Richards, 1991), the recognition of other youth’s affect may be more relevant to their interpersonal functioning. Though some of the previous research (e.g., McClure & Nowicki, 2001; Nowicki & Carton, 1997) has utilized the Diagnostic Analysis of Nonverbal Accuracy Scale (DANVA; Nowicki & Duke, 1994), which contains a child paralanguage subtest, the stimuli used in this measure were produced by a single 10-year-old girl. Given adolescents’ expanding social networks, the inclusion of stimuli produced by several peer-aged speakers may provide a more generalizable estimate of youth’s vocal ER abilities in their day-to-day lives.

Goals and Hypotheses of the Current Study

The current study investigated associations between the recognition of youth’s vocal affect and symptoms of anxiety and depression in 8- to 17-year-olds. We recruited clinically-referred anxious youth, along with non-anxious comparison youth, from an ongoing study on the association between anxiety and cognitive and emotional responses (Stoddard et al., 2016). Given the high comorbidity between anxiety and depression in youth (Brady & Kendall, 1992), we assessed both types of symptoms in the same group of participants. Consistent with current emphasis on treating internalizing symptoms as continuous rather than discrete variables (Krueger & Markon, 2006), we examined associations between participants’ self-reported anxious and depressive symptoms and their ability to identify key basic emotions (anger, disgust, fear, happiness, and sadness), as well as two social expressions, meanness and friendliness.

We included the latter socio-emotional expressions of social rejection (meanness) and affiliation (friendliness) due to their relevance to youth’s social interactions and psychological outcomes. Experiencing social rejection is deemed hurtful by youth (Paquette & Underwood, 1999) and is linked to psychosocial difficulties (Richman & Leary, 2009), including lower self-reported well-being (Rigby, 2000) and aggression (Asher, Rose, & Gabriel, 2001; Kupersmidt, Burchinal, & Patterson, 1995). Further, enhanced vigilance to cues of affiliation and rejection may lead to greater rejection sensitivity (Masten et al., 2009). For these reasons, it is crucial for youth to be able to parse these cues accurately. Additionally, meanness and friendliness have been shown to be acoustically distinct from emotions like anger and happiness (Morningstar, Huang, & Dirks, 2017; Noble & Xu, 2011), and there is evidence that social expressions are processed differently than basic emotions at a neural level (e.g., Adolphs, Baron-Cohen, & Tranel, 2002). Including meanness and friendliness in our stimulus set thus allowed for a deeper understanding of youth’s interpretation of socially relevant affective cues.

Due to limited prior research, we had no specific hypotheses about the associations between anxiety symptoms and vocal ER. We expected that greater depressive symptoms would be linked to lower recognition accuracy (Deveney, Brotman, Decker, Pine, & Leibenluft, 2012; Emerson et al., 1999; Nowicki & Carton, 1997). Additionally, since some evidence suggests specific deficits in depressed adults for the recognition of happy facial expressions (Joorman & Gotlib, 2006; Leppanen, 2006), we hypothesized that greater depressive symptoms in youth may be related to an especially pronounced deficit in the identification of vocal portrayals of positively valenced expressions, like happiness and friendliness. Lastly, based on previous findings showing that adults outperformed 4- to 12-year-olds in recognition tasks (e.g., Chronaki et al., 2014; Zupan, 2015), we expected that increased age in adolescence would be associated with heightened vocal ER accuracy.

Method

Participants

Fifty-seven participants (32 female, 25 male), 8–17 years old (M=12.62 years old, SD=2.66), were recruited from a sample of youth in an urban area of the American East coast, who participated in a study examining the association between anxiety and cognitive and emotional responses (Stoddard et al., 2016). Participants in the larger project were given information about the current study, and those interested in participating were included in the present sample.

Twenty-nine listeners were diagnosed with an anxiety disorder following DSM-IV criteria, using semi-structured clinical interviews (K-SADS, or Schedule for Affective Disorders and Schizophrenia for School-Age Children, Present and Lifetime; Kaufman et al., 1997) administered by trained clinicians to participants and their parents (with diagnoses based on reports by either of the respondents), and confirmed by consultation with a psychiatrist (with reliability exceeding κ=0.7). Most participants (i.e., 24 of 29 youth) were diagnosed with multiple anxiety disorders, including social anxiety disorder (14/29), separation anxiety disorder (15/29), generalized anxiety disorder (22/29), specific phobias (14/29), and selective mutism (3/29). Exclusion criteria were assessed with the K-SADS and included autism spectrum disorder, IQ below 70 (assessed using the matrix reasoning and vocabulary subtests of the Wechsler Abbreviated Scale of Intelligence; Psychological Corporation, 1999), history of head trauma, attention deficit/hyperactivity disorder, obsessive-compulsive disorder, and depression if primary to anxiety. All participants were free of psychotropic medication for at least one month prior to testing. Twenty-eight participants were healthy comparison youth recruited from the same broader study. Participants were generally from middle-class families, with parents with a college education (Hollingshead, 1975); 50% were Caucasian, 24.5% were Black or African-American, and 24.5% reported other ethnicities. Two additional participants were excluded from analyses due to missing data. All participants were compensated for their time.

Measures

All participants completed the Screen for Childhood Anxiety-Related Emotional Disorders (SCARED, child report; Birmaher et al., 1997), a 41-item measure of anxiety symptoms (current sample α=0.94), and the Children’s Depression Inventory (CDI; Kovacs, 1981), a 27-item measure of youth depression symptoms (current sample α=0.90). Self-report measures were used as they are generally more sensitive than parent reports for internalizing disorders (Dougherty, Klein, Olino, & Laptook, 2008). Single missing items (<0.01% of questionnaire data) were imputed by taking the average of that participant’s other responses on the questionnaire. Total scores were computed (SCARED M=18.30, SD=13.63; CDI M=6.36, SD=6.83).
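The imputation rule described above (a single missing item replaced by the mean of that participant's other responses on the same questionnaire) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the five-item response vector are hypothetical, and missing items are assumed to be coded as None.

```python
from statistics import mean

def impute_missing_items(responses):
    """Replace None entries with the mean of the participant's answered items.

    `responses` is one participant's item scores on one questionnaire;
    None marks a skipped item (a coding assumption for this sketch).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("cannot impute: no answered items")
    fill = mean(answered)
    return [fill if r is None else r for r in responses]

# Hypothetical 5-item questionnaire with one skipped item.
filled = impute_missing_items([0, 2, None, 1, 1])
total_score = sum(filled)  # total score computed after imputation
```

Person-mean imputation of this kind leaves the participant's average item score unchanged, which is why it is usually reserved for cases where very few items are missing.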

Stimuli

Participants heard vocal recordings produced by 12 female youth actors, 11 to 15 years old (M=13.60 years, SD=1.43), who spoke sentences in each of the following 7 socio-emotional expressions: anger, disgust, fear, friendliness, happiness, meanness, and sadness (from Morningstar et al., 2017). The sentences’ text was neutral in emotional content, socially relevant, and designed to be applicable to all expressions (“Why did you do that?”, “I was just trying to be nice,” “I didn’t know about it,” “You shouldn’t have done that”). The use of actors is a common practice in the field of emotion production (Scherer, 2003), given their training in the portrayal of stereotypical representations of various expressions, and the possibility for experimental control over the stimuli’s verbal content. Only recordings from females were included in this study due to difficulties recruiting young male actors.

From the total sample of recordings produced by these actors, we used validation data obtained from 55 listeners (47 adults, 87.2% female, 18–28 years old, and 8 youth, all female, 10–15 years old) in pilot testing to select the best-recognized recordings for each expression and sentence. The task included a total of 84 recordings (3 best-recognized versions of each of the 4 sentences, for each of the 7 expressions).

Procedure

All procedures were approved by the ethics review boards of McGill University and the National Institute of Mental Health. Written assent and consent were obtained from participants and their parents, respectively. The study was conducted in a laboratory setting at a research institution. Listeners heard each recording twice, in a randomized order, over sound-cancelling headphones. They then selected the speaker’s intended expression from seven label options (anger, disgust, fear, friendliness, happiness, meanness, sadness). Responses were self-paced after each recording was presented. The protocol took approximately 20 minutes.

Analyses

Recognition accuracy was computed using the unbiased hit rate (Hu; Wagner, 1993), which corrects raw accuracy estimates by accounting for response biases (see Footnote 1). For instance, a participant who selected “anger” for all recordings would be 100% accurate in detecting anger (i.e., no false negatives for anger), but would not be discriminating anger from other expressions (i.e., many false alarms). Instead, Hu integrates information about both detection and discrimination in recognition. A Hu value of 1 represents perfect recognition (100% hit rate, without false alarms or false negatives), whereas a value of 0 indicates no recognition (0% hit rate, with only false alarms and false negatives). Hu was calculated for each expression, yielding 7 values of Hu for each participant. Following recommendations from Wagner (1993), Hu values were arcsine-transformed prior to analyses, but raw means are presented in text for ease of interpretation.
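The always-“anger” listener described above can be worked through with a short sketch of the Hu computation, following the formula given in Footnote 1. The function names, the nested-dictionary layout, and the three-category toy data are assumptions made for illustration. Two further assumptions are labeled in the code: Hu is set to 0 when a response label was never chosen, and the arcsine transform is taken as the arcsine of the square root of Hu, since the text does not spell out the exact form used.

```python
import math

def unbiased_hit_rate(confusion, category):
    """Hu for one category; confusion[stimulus][response] holds counts."""
    hits = confusion[category][category]
    n_presented = sum(confusion[category].values())
    n_chosen = sum(row[category] for row in confusion.values())
    if n_presented == 0 or n_chosen == 0:
        return 0.0  # assumption: undefined ratios are treated as no recognition
    return (hits / n_presented) * (hits / n_chosen)

def arcsine_transform(hu):
    """Assumed form of the transform: arcsine of the square root, in radians."""
    return math.asin(math.sqrt(hu))

# Hypothetical listener who answers "anger" to 4 stimuli in each of 3 categories.
always_anger = {
    "anger":   {"anger": 4, "fear": 0, "sadness": 0},
    "fear":    {"anger": 4, "fear": 0, "sadness": 0},
    "sadness": {"anger": 4, "fear": 0, "sadness": 0},
}

raw_hit_rate = always_anger["anger"]["anger"] / 4        # every anger stimulus is "detected"
hu_anger = unbiased_hit_rate(always_anger, "anger")      # penalized for indiscriminate use of the label
hu_fear = unbiased_hit_rate(always_anger, "fear")
```

In this toy case the raw hit rate for anger is perfect, while Hu falls to one third because only 4 of the 12 “anger” responses were correct.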

A Generalized Linear Model was performed to examine the effect of Expression (within-subjects, 7 levels: anger, disgust, fear, friendliness, happiness, meanness, sadness) on listeners’ accuracy (Hu). Continuous mean-centered measures of participants’ anxiety symptoms (SCARED score), depression symptoms (CDI score), and age were entered as subject-level predictors. Gender (2 levels: male vs. female) was also entered as a control variable. Anxious and depressed symptoms were significantly correlated, r(55)=.682, p<.001; we accounted for this covariation by entering both predictors in the same model. Two-way interactions between each predictor and Expression were also included in the GLM. Based on the results of Mauchly’s test of sphericity (p>.05), sphericity was assumed and univariate test results were considered, with α = 0.05.

We controlled the family-wise error rate by applying a Holm-Bonferroni correction (Holm, 1979). First, we ordered effects from smallest to largest p value. For each effect, we compared the obtained p value to an adjusted alpha level, computed with the following formula: original alpha level (i.e., 0.05) / (number of effects – rank number of effect tested + 1). If the obtained p value was smaller than the adjusted alpha level, that effect was considered robust and satisfied significance criteria. Effects with p values that were not smaller than the adjusted alpha level were considered non-significant. We applied this procedure to the main effects and interactions in the GLM, as well as when probing significant interactions.
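The correction procedure can be sketched as below. This is an illustration under stated assumptions rather than the analysis code used in the study: the function name and the four p values are hypothetical, and the sketch uses the standard step-down form of Holm's method, in which every effect ranked after the first failure is also treated as non-significant (the text does not state this step explicitly).

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return {effect: True/False} for significance after Holm's step-down rule.

    p_values maps effect names to uncorrected p values.
    """
    m = len(p_values)
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    decisions = {}
    still_rejecting = True
    for rank, (effect, p) in enumerate(ordered, start=1):
        adjusted_alpha = alpha / (m - rank + 1)
        # Once one effect fails, all larger p values fail as well (step-down).
        still_rejecting = still_rejecting and p < adjusted_alpha
        decisions[effect] = still_rejecting
    return decisions

# Hypothetical p values for four effects (not taken from the study).
decisions = holm_bonferroni({"A": 0.001, "B": 0.015, "C": 0.040, "D": 0.030})
```

With these made-up values, the adjusted alpha levels are .0125, .0167, .025, and .05 in rank order. Effects A and B pass, D fails at the third step (.030 is not below .025), and C is then also treated as non-significant even though .040 is below .05.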

Results

Across all participants, we found a significant main effect of Expression, F(6,312)=45.74, p<.001, η2=.47, showing that accuracy varied by emotional expression. Pairwise comparisons with Šidák corrections indicated that anger (M=0.47, SE=0.02) and sadness (M=0.40, SE=0.02) were the best-recognized expressions (and did not differ in recognition rate from one another, p=.267), followed by fear (M=0.38, SE=0.02, which did not differ from sadness, p=.999), friendliness (M=0.30, SE=0.02), then happiness (M=0.23, SE=0.02; did not differ from friendliness, p=.11), meanness (M=0.16, SE=0.02), and disgust (M=0.17, SE=0.01; the latter three did not differ from one another, all ps>.05). Unless otherwise specified, all expressions differed significantly from one another, ps<.05. To better understand the nature of incorrect responses, the Appendix contains a confusion matrix for all participants, which tabulates the types of responses made for each stimulus category.

We found a significant main effect of Age, F(1,52)=49.72, p<.001, η2=.49, whereby increased age was associated with greater identification accuracy. Age also interacted with Expression at the conventional alpha level of .05, F(6,312)=2.67, p=.015, η2=.05, but this effect was considered non-significant after application of the Holm-Bonferroni correction. There were no main or interaction effects related to Gender (all ps>.05).

There was no main effect of the continuous measure of Anxiety symptoms, F(1,52)=0.11, p=.740, η2<.01 (see Footnote 2). The Expression x Anxiety interaction was significant at the conventional alpha level of .05, F(6,312)=2.41, p=.027, η2=.04, but was not retained as significant after application of the Holm-Bonferroni correction procedure. The main effect of Depression symptoms, F(1,52)=4.30, p=.043, was also not retained after alpha corrections; however, the Expression x Depression interaction was significant, F(6,312)=2.84, p=.01, η2=.05. Parameter estimates suggested that greater depression symptoms were related to poorer identification of anger, β=−0.48, t(52)=−2.80, p=.007, 95% CI [−.83, −.14], and of happiness, β=−0.54, t(52)=−2.89, p=.006, 95% CI [−.92, −.17] (Figure 1).

Figure 1.

Scatterplots representing the relationship between recognition accuracy (Hu; unbiased hit rate, arcsine transformed) and depression symptoms (residualized on age, gender, and anxiety symptoms). Top panel: depression symptoms and Hu for Anger. Bottom panel: depression symptoms and Hu for Happiness.

Discussion

The current study investigated the associations between anxious and depressive symptoms and youth’s ability to identify affect in vocal socio-emotional expressions. Consistent with work on vocal ER with adults (Banse & Scherer, 1996; Johnstone & Scherer, 2000), portrayals of anger, sadness, and fear were well-recognized by listeners, whereas happiness, meanness and disgust were not. We discuss two additional findings in turn: ER increased with age, and depressive symptoms were linked to poorer recognition of both anger and happiness.

Our results add to a growing body of evidence documenting that ER skills continue to improve through adolescence (Brosgole & Weisman, 1995; Chronaki et al., 2014; Zupan, 2015). Though prior studies had demonstrated increased recognition ability in younger children (Allgood & Heaton, 2015; Sauter et al., 2013), our finding that age was positively associated with accuracy in our sample suggests improvements in vocal ER at least through age 17. It is possible that the convergent neural and social development during adolescence (see Nelson, Leibenluft, McClure, & Pine, 2005) contributes to the growth of ER skills during this developmental stage.

Further, we found links between ER deficits and internalizing symptoms, although these effects were not as robust as the age-related findings. Mirroring previous findings on facial ER in youth (Demenescu et al., 2010), no global deficit was noted in relation to anxiety symptoms, nor was the interaction between anxiety and expression significant after applying an alpha correction procedure. Generally, the literature linking anxiety to vocal ER has been inconsistent (e.g., Manassis & Young, 2000; McClure & Nowicki, 2001). In contrast, more evidence supports the presence of attentional biases related to anxiety (e.g., Bar-Haim, Lamy, Pergamin, Bakermans-Kranenburg, & van Ijzendoorn, 2007). It may be that orienting and accuracy are orthogonal constructs, and that anxiety symptoms do not impair the identification of emotional outputs per se. For instance, though anxious youth may attend to threatening stimuli more than their non-anxious peers, this may not translate to errors in understanding others’ cues and may not negatively impact their social behaviours. Researchers should continue to map the associations between anxiety and different stages of social-information processing.

Though we recruited participants based on anxiety diagnoses, we also examined the contribution of depressive symptoms to vocal ER accuracy. Because of the high comorbidity between anxiety and depression, many participants reported elevated depression symptoms. Though previous studies have shown that depressive symptomatology is linked to poorer vocal ER (Emerson et al., 1999; Nowicki & Carton, 1997), the current study indicates that these deficits may be expression-specific, rather than reflecting a generalized problem: our results revealed a significant interaction between expression and depressive symptoms, such that greater symptomatology was associated with poorer recognition of anger and happiness only.

Our finding that greater depression symptoms were associated with poorer recognition of anger mirrors reports in the facial ER literature with children (Lenti et al., 2000; van Beek & Dubas, 2008); this pattern has only been noted with youth, and has not yet been integrated in theoretical conceptualizations of depression. Conversely, the relationship between depressive symptoms and impaired recognition of happiness is consistent with theoretical proposals that depression may be associated with deficits in the processing of positive emotional cues (Beck, 1979; Bourke, Douglas, & Porter, 2010) and aligns with reports of facial ER deficits in adults (e.g., Luck & Dowrick, 2004; Surguladze et al., 2004). However, depressive symptoms were not linked to impaired recognition of friendliness, which is also a positively valenced expression. It may be that friendly voices are perceived as less “positive” than are happy voices. Friendliness has been found to be expressed with a smaller deviation from baseline speech in pitch mean, pitch range, and intensity range than is happiness (Morningstar et al., 2017). As a result, listeners may perceive friendliness to be less emotionally intense than happiness. Indeed, based on acoustic analyses, one previous study has suggested that friendliness may be perceived as “polite” rather than happy (Noble & Xu, 2011). Further research is needed to determine how listeners distinguish friendly and happy speech. More broadly, it will be important to attempt to replicate our finding that depressive symptoms are not associated with identification of friendly expressions, as well as to examine the links between depression and recognition of other positively valenced expressions, such as amusement or contentment (Sauter & Scott, 2007).

Expression-specific deficits in vocal ER may be contributing to the negative social outcomes often faced by youth experiencing depression, such as reduced friendship stability (Chan & Poulin, 2009) and greater rejection by peers (Platt, Kadosh, & Lau, 2013). For example, difficulty recognizing anger has been associated with greater aggressive behaviour (e.g., Fine, Trentacosta, Izard, Mostow, & Campbell, 2004), which in turn has been linked to interpersonal consequences, such as reduced acceptance by peers (Chang et al., 2005; Newcomb, Bukowski, & Pattee, 1993). Correspondingly, improving recognition of anger may ameliorate social relationships; these benefits may then alleviate depressive symptomatology, given evidence that interpersonal stress plays a role in the maintenance of depression (e.g., Davila, Hammen, Burge, Paley, & Daley, 1995; Rudolph et al., 2000).

Moreover, misidentification of happiness may reflect a reduced capacity to recognize positive social cues, or lead to the perception that social interactions are less positive than they are, thus potentially playing a role in the maintenance of depressive symptoms (Joorman & Gotlib, 2006). For these reasons, improving youth’s recognition of relevant socio-emotional expressions, and addressing vocal ER in particular, may be an important intervention target for both social skills training (Spence, 2003) and psychotherapy for depression (Bellack, Hersen, & Himmelhoch, 1996). There is preliminary evidence that such training programs could be successful. For example, interventions that encouraged the recognition of happiness over anger in ambiguous faces reduced aggressive behaviour in youth (Penton-Voak et al., 2013). Similar strategies could be explored for vocal ER.

Strengths and limitations

Though adult voices are overwhelmingly used to assess children’s vocal ER skills, the present investigation used stimuli acquired from youth actors. Adults’ prosodic modulations have been shown to differ acoustically (Morningstar et al., 2017) and perceptually (McClure & Nowicki, 2001) from those of children and adolescents. Youth may have limited exposure to the socio-emotional outputs of adults, as their relevant social experiences occur primarily with other youth (Larson & Richards, 1991; Larson et al., 1996). Our study thus provided information about youth’s interpretation of socially relevant cues.

In that context, limitations must be noted. First, our stimuli were produced by girls, whose emotional prosody has been shown to differ from that of boys (Morningstar et al., 2017). There is some evidence that vocal expressions produced by females are easier to recognize than those of males (Belin, Fillion-Bilodeau, & Gosselin, 2008; Gallois & Callan, 1986; Zuckerman, Lipets, Hall Koivumaki, & Rosenthal, 1975); our results may therefore overestimate youth’s ability to decode youth’s voices, as the task may have been harder if it contained boys’ portrayals of socio-emotional expressions. However, data collected in our lab suggest no interaction between speaker gender and expression for youth listeners’ ability to recognize vocal affect (analyses available from first author); thus, it appears unlikely that the inclusion of boys’ voices would yield different expression-specific patterns. Second, our stimuli were actors’ posed, rather than naturalistic, portrayals of socio-emotional expressions. It is possible that listeners’ recognition of posed expressions may have differed from their identification of spontaneous emotions; however, studies have shown that the use of actors versus untrained speakers does not elicit different recognition patterns by listeners (Jürgens, Grass, Drolet, & Fischer, 2015; Spackman, Brown, & Otto, 2009).

Moreover, the inclusion of social expressions beyond the typically studied basic emotions is both a strength and limitation of the current investigation. Meanness and friendliness are distinguished from emotions like anger and happiness in terms of acoustic characteristics (Morningstar et al., 2017), as well as in recognition and confusion patterns (see Appendix); as such, the current study assesses youth’s ability to identify a distinct type of social information. However, participants may have had difficulty distinguishing between the two types of stimuli due to their differing nature: emotions are thought to be internal experiences (Ortony, Clore, & Foss, 1987), whereas social expressions are inherently interpersonal. As such, a forced-choice design may not have provided accurate estimates of youth’s relative ability to recognize basic and social emotional expressions. Further, we used the K-SADS to assess psychiatric exclusion criteria; however, the use of more specific screening measures may have been more appropriate for certain conditions (e.g., the Social Communication Questionnaire [Rutter, Bailey, & Lord, 2003] to screen for autism spectrum disorders). Lastly, replication of our findings in a larger sample of greater socio-economic and ethnic diversity would increase confidence in the generalizability of our results.

Future directions

It will be important for future studies to examine associations between internalizing symptoms and the perception of multimodal emotion cues. Though our findings converge with reports about facial ER, our study design does not permit a direct comparison of these modalities. Moreover, our results may not generalize to naturalistic social communication, which involves the integration of both vocal and facial cues. The ability to interpret multimodal nonverbal information may also be subject to developmental changes and be influenced by internalizing symptoms. For instance, socially anxious individuals tend to avoid looking at the eyes of facial stimuli (Horley, Williams, Gonsalvez, & Gordon, 2004), and may thus prioritize vocal information to decipher others’ affect.

Additionally, it is possible that the association between internalizing symptoms and ER will vary across development. For instance, since ER ability develops with age, depression may interfere more significantly with this skill’s normative development if symptoms emerge in childhood rather than in adolescence. Our sample size did not afford adequate power for analyses testing whether age moderated the associations between internalizing symptoms and the recognition of different socio-emotional expressions; however, it will be important for future research to use larger samples and longitudinal designs to render a more detailed picture of the links between internalizing symptoms and vocal ER during different developmental stages.

Conclusion

The current study examined associations between symptoms of anxiety and depression and the recognition of vocal affect in youth’s voices. Accuracy in recognition increased with age, demonstrating that vocal ER capacity continues to develop during late adolescence. Further, though anxiety symptoms were not robustly associated with recognition accuracy, greater depressive symptoms were linked to poorer identification of anger and happiness. Given that misidentifications of such cues can impact social behaviours, these findings suggest that addressing expression recognition skills may be a useful intervention in psychotherapy or social skills training for youth.

Key points:

  • There were age-related increases in vocal emotion recognition skills through age 17.

  • Depression symptoms were associated with deficits in the recognition of vocal anger and happiness.

Acknowledgments

The present data do not appear in previous publications. This work was funded by the intramural research program of the National Institute of Mental Health.

Appendix

Responses (All)

Stimulus          Anger  Disgust  Fear  Friendliness  Happiness  Meanness  Sadness  Total stimuli
Anger               485       50    32             6          7        90       12            682
Disgust              82      273    40            32         10       149       96            682
Fear                 26       44   396            52          6        10      150            684
Friendliness         19       42    37           397         85        21       82            683
Happiness            39       48    47           238        220        28       63            683
Meanness            127      236    19            25          5       235       33            680
Sadness              10       28    82            70          7        16      470            683
Total responses:    788      721   653           820        340       549      906           4777

Note. Confusion matrix tabulating all participants’ responses to different types of stimuli. The stimulus type is indicated on the rows, and participants’ responses are indicated in the columns.

Footnotes

The authors have no conflict of interest to report.

1. Hu = (number of correct responses / number of times that stimulus category was presented) × (number of correct responses / number of times that response was made).

2. We conducted a secondary analysis using a binary variable for Anxiety (“anxious” or “non-anxious”). The effect of Anxiety remained non-significant (p > .05).
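
As an illustration of the unbiased hit rate (Hu; Wagner, 1993) defined in the first footnote, the following minimal Python sketch applies the formula to the pooled confusion matrix in the Appendix. The names LABELS, COUNTS, HU, and unbiased_hit_rates are hypothetical and chosen for this example only. Because the counts are pooled across all participants, the resulting values are illustrative and may differ from any per-participant scores used in the analyses. Setting Hu to 0 when a response category is never used is an assumption of this sketch.

```python
# Pooled confusion matrix from the Appendix.
# Rows are stimulus types, columns are participants' responses.
LABELS = ["Anger", "Disgust", "Fear", "Friendliness",
          "Happiness", "Meanness", "Sadness"]
COUNTS = [
    [485,  50,  32,   6,   7,  90,  12],   # Anger stimuli
    [ 82, 273,  40,  32,  10, 149,  96],   # Disgust stimuli
    [ 26,  44, 396,  52,   6,  10, 150],   # Fear stimuli
    [ 19,  42,  37, 397,  85,  21,  82],   # Friendliness stimuli
    [ 39,  48,  47, 238, 220,  28,  63],   # Happiness stimuli
    [127, 236,  19,  25,   5, 235,  33],   # Meanness stimuli
    [ 10,  28,  82,  70,   7,  16, 470],   # Sadness stimuli
]


def unbiased_hit_rates(counts):
    """Return Hu for each category of a square confusion matrix.

    Hu = (correct / times the stimulus category was presented)
         * (correct / times the response was made)
    """
    n = len(counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    rates = []
    for k in range(n):
        correct = counts[k][k]
        if row_totals[k] == 0 or col_totals[k] == 0:
            # Assumption of this sketch: an unused category scores 0.
            rates.append(0.0)
        else:
            rates.append((correct / row_totals[k]) * (correct / col_totals[k]))
    return rates


HU = dict(zip(LABELS, unbiased_hit_rates(COUNTS)))

if __name__ == "__main__":
    for label, hu in HU.items():
        print(f"{label:<13}{hu:.3f}")
```

In this pooled illustration, Hu penalizes a category both for missed stimuli (row total) and for overuse of its response label (column total), which is why a frequently chosen response such as Friendliness can lower its own score even when many friendly stimuli are identified correctly.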

References

  1. Adolphs R, Baron-Cohen S, & Tranel D (2002). Impaired recognition of social emotions following amygdala damage. Journal of Cognitive Neuroscience, 14, 1264–1274.
  2. Allgood R, & Heaton P (2015). Developmental change and cross-domain links in vocal and musical emotion recognition performance in childhood. British Journal of Developmental Psychology, 33, 398–403.
  3. Asher SR, Rose AJ, & Gabriel SW (2001). Peer rejection in everyday life. In Leary MR (Ed.), Interpersonal Rejection (pp. 105–144). Oxford University Press.
  4. Banse R, & Scherer KR (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70, 614–636.
  5. Bar-Haim Y, Lamy D, Pergamin L, Bakermans-Kranenburg MJ, & van IJzendoorn MH (2007). Threat-related attentional bias in anxious and nonanxious individuals: A meta-analytic study. Psychological Bulletin, 133, 1–24.
  6. Beck A (1979). Cognitive Therapy of Depression. Guilford Press.
  7. Belin P, Fillion-Bilodeau S, & Gosselin F (2008). The Montreal Affective Voices: A validated set of nonverbal affect bursts for research on auditory affective processing. Behavior Research Methods, 40, 531–539.
  8. Bellack AS, Hersen M, & Himmelhoch JM (1996). Social skills training for depression: A treatment manual. In Van Hasselt VB et al. (Eds.), Sourcebook of Psychological Treatment Manuals for Adult Disorders (pp. 179–200). New York, NY: Springer.
  9. Birmaher B, Khetarpal S, Brent D, Cully M, Balach L, Kaufman J, & McKenzie Neer S (1997). The Screen for Child Anxiety Related Emotional Disorders (SCARED): Scale construction and psychometric characteristics. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 545–553.
  10. Bourke C, Douglas K, & Porter R (2010). Processing of facial emotion expression in major depression: A review. Australian and New Zealand Journal of Psychiatry, 44, 681–696.
  11. Brady EU, & Kendall PC (1992). Comorbidity of anxiety and depression in children and adolescents. Psychological Bulletin, 111, 244–255.
  12. Brosgole L, & Weisman J (1995). Mood recognition across the ages. International Journal of Neuroscience, 82, 169–189.
  13. Chan A, & Poulin F (2009). Monthly instability in early adolescent friendship networks and depressive symptoms. Social Development, 18, 1–23.
  14. Chang L, Lei L, Li KK, Liu H, Guo B, Wang Y, & Fung KY (2005). Peer acceptance and self-perceptions of verbal and behavioural aggression and social withdrawal. International Journal of Behavioral Development, 29, 48–57.
  15. Chronaki G, Hadwin JA, Garner M, Maurage P, & Sonuga-Barke EJS (2014). The development of emotion recognition from facial expressions and non-linguistic vocalizations during childhood. British Journal of Developmental Psychology, 33, 218–236.
  16. Crick NR, & Dodge KA (1994). A review and reformulation of social information-processing mechanisms in children’s social adjustment. Psychological Bulletin, 115, 74–101.
  17. Davila J, Hammen C, Burge D, Paley B, & Daley SE (1995). Poor interpersonal problem solving as a mechanism of stress generation in depression among adolescent women. Journal of Abnormal Psychology, 104, 592–600.
  18. Demenescu LR, Kortekaas R, den Boer JA, & Aleman A (2010). Impaired attribution of emotion to facial expressions in anxiety and major depression. PLoS ONE, 5, e15058.
  19. Deveney CM, Brotman MA, Decker AM, Pine DS, & Leibenluft E (2012). Affective prosody labeling in youths with bipolar disorder or severe mood dysregulation. Journal of Child Psychology and Psychiatry, 53, 262–270.
  20. Dougherty LR, Klein DN, Olino TM, & Laptook RS (2008). Depression in children and adolescents. In Hunsley J & Mash EJ (Eds.), A Guide to Assessments that Work (pp. 69–95). New York: Oxford University Press.
  21. Easter J, McClure EB, Monk CS, Dhanani M, Hodgdon H, Leibenluft E, Charney DS, Pine DS, & Ernst M (2005). Emotion recognition deficits in pediatric anxiety disorders: Implications for amygdala research. Journal of Child and Adolescent Psychopharmacology, 15, 563–570.
  22. Emerson CS, Harrison DW, & Everhart DE (1999). Investigation of receptive affective prosodic ability in school-aged boys with and without depression. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 12, 102–109.
  23. Erath SA, Flanagan KS, & Bierman KL (2007). Social anxiety and peer relations in early adolescence: Behavioral and cognitive factors. Journal of Abnormal Child Psychology, 35, 405–416.
  24. Fine SE, Trentacosta CJ, Izard CE, Mostow AJ, & Campbell JL (2004). Anger perception, caregivers’ use of physical discipline, and aggression in children at risk. Social Development, 13, 213–228.
  25. Gallois C, & Callan VJ (1986). Decoding emotional messages: Influence of ethnicity, sex, message type, and channel. Journal of Personality and Social Psychology, 51, 755–762.
  26. Guyer AE, McClure EB, Adler AD, Brotman MA, Rich BA, Kimes AS, Pine DS, Ernst M, & Leibenluft E (2007). Specificity of facial expression labelling deficits in childhood psychopathology. Journal of Child Psychology and Psychiatry, 48, 863–871.
  27. Halberstadt AG, Denham SA, & Dunsmore JC (2001). Affective social competence. Social Development, 10, 79–119.
  28. Heinrich CU, & Borkenau P (1998). Deception and deception detection: The role of cross-modal inconsistency. Journal of Personality, 66, 687–712.
  29. Hollingshead AA (1975). Four-factor index of social status. Unpublished manuscript, Yale University, New Haven, CT.
  30. Holm S (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65–70.
  31. Horley K, Williams LM, Gonsalvez C, & Gordon E (2004). Face to face: Visual scanpath evidence for abnormal processing of facial expressions in social phobia. Psychiatry Research, 127, 43–53.
  32. Johnstone T, & Scherer KR (2000). Vocal communication of emotion. In Lewis M & Haviland J (Eds.), The Handbook of Emotion (pp. 220–235). New York: Guilford.
  33. Joormann J, & Gotlib IH (2006). Is this happiness I see? Biases in the identification of emotional facial expressions in depression and social phobia. Journal of Abnormal Psychology, 115, 705–714.
  34. Jürgens R, Grass A, Drolet M, & Fischer J (2015). Effect of acting experience on emotion expression and recognition in voice: Non-actors provide better stimuli than expected. Journal of Nonverbal Behavior, 39, 195–214.
  35. Kaufman J, Birmaher B, Brent D, Rao U, Flynn C, Moreci P, Williamson D, & Ryan N (1997). Schedule for Affective Disorders and Schizophrenia for School-Age Children – Present and Lifetime Version (K-SADS-PL): Initial reliability and validity data. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 980–988.
  36. Kingery JN, Erdley CA, Marshall KC, Whitaker KG, & Reuter TR (2010). Peer experiences of anxious and socially withdrawn youth: An integrative review of the developmental and clinical literature. Clinical Child and Family Psychology Review, 13, 91–128.
  37. Kochel KP, Ladd GW, & Rudolph KD (2012). Longitudinal associations among youth depressive symptoms, peer victimization, and low peer acceptance: An interpersonal process perspective. Child Development, 83, 637–650.
  38. Kovacs M (1981). Rating scales to assess depression in school-aged children. Acta Paedopsychiatrica: International Journal of Child & Adolescent Psychiatry, 46, 305–315.
  39. Krueger RF, & Markon KE (2006). Understanding psychopathology: Melding behavior genetics, personality, and quantitative psychology to develop an empirically based model. Current Directions in Psychological Science, 15, 113–117.
  40. Kupersmidt J, Burchinal M, & Patterson CJ (1995). Developmental patterns of childhood peer relations as predictors of externalizing behavior problems. Development and Psychopathology, 7, 825–843.
  41. La Greca AM, & Harrison HM (2005). Adolescent peer relations, friendships, and romantic relationships: Do they predict social anxiety and depression? Journal of Clinical Child & Adolescent Psychology, 34, 49–61.
  42. Larson R, & Richards MH (1991). Daily companionship in late childhood and early adolescence: Changing developmental contexts. Child Development, 62, 284–300.
  43. Larson RW, Richards MH, Moneta G, Holmbeck G, & Duckett E (1996). Changes in adolescents’ daily interactions with their families from ages 10 to 18: Disengagement and transformation. Developmental Psychology, 32, 744–754.
  44. Lenti C, Giacobbe A, & Pegna C (2000). Recognition of emotional facial expressions in depressed children and adolescents. Perceptual and Motor Skills, 91, 227–236.
  45. Leppänen JM (2006). Emotional information processing in mood disorders: A review of behavioral and neuroimaging findings. Current Opinion in Psychiatry, 19, 34–39.
  46. Luck P, & Dowrick CF (2004). ‘Don’t look at me in that tone of voice!’ Disturbances in the perception of emotion in facial expression and vocal intonation by depressed patients. Primary Care Mental Health, 2, 99–106.
  47. Manassis K, & Young A (2000). Perception of emotions in anxious and learning disabled children. Depression and Anxiety, 12, 209–216.
  48. Masten CL, Eisenberger NI, Borofsky LA, Pfeifer JH, McNealy K, Mazziotta JC, & Dapretto M (2009). Neural correlates of social exclusion during adolescence: Understanding the distress of peer rejection. Social Cognitive and Affective Neuroscience, 4, 143–157.
  49. Maxim LA, & Nowicki SJ (2003). Developmental associations between nonverbal ability and social competence. Philosophy, Sociology and Psychology, 2, 745–758.
  50. McClure EB, & Nowicki S (2001). Associations between social anxiety and nonverbal processing skill in preadolescent boys and girls. Journal of Nonverbal Behavior, 25, 3–19.
  51. McClure EB, Pope K, Hoberman AJ, Pine DS, & Leibenluft E (2003). Facial expression recognition in adolescents with mood and anxiety disorders. American Journal of Psychiatry, 160, 1172–1174.
  52. Morningstar M, Huang S, & Dirks MA (2017). Vocal cues underlying youth and adult portrayals of socio-emotional expressions. Journal of Nonverbal Behavior, 1–29.
  53. Nelson EE, Leibenluft E, McClure EB, & Pine DS (2005). The social re-orientation of adolescence: A neuroscience perspective on the process and its relation to psychopathology. Psychological Medicine, 35, 163–174.
  54. Newcomb AF, Bukowski WM, & Pattee L (1993). Children’s peer relations: A meta-analytic review of popular, rejected, neglected, controversial, and average sociometric status. Psychological Bulletin, 113, 99–128.
  55. Noble L, & Xu Y (2011). Friendly speech and happy speech – are they the same? Proceedings of the 17th International Congress of Phonetic Sciences, Hong Kong.
  56. Nowicki S, & Carton E (1997). The relation of nonverbal processing ability of faces and voices and children’s feelings of depression and competence. Journal of Genetic Psychology, 158, 357–363.
  57. Nowicki S, & Duke MP (1994). Individual differences in the nonverbal communication of affect: The Diagnostic Analysis of Nonverbal Accuracy scale. Journal of Nonverbal Behavior, 18, 9–35.
  58. Ortony A, Clore GL, & Foss M (1987). The referential structure of the affective lexicon. Cognitive Science, 11, 361–384.
  59. Paquette JA, & Underwood MK (1999). Gender differences in young adolescents’ experiences of peer victimization: Social and physical aggression. Merrill-Palmer Quarterly, 45, 242–266.
  60. Paus T, Keshavan M, & Giedd JN (2008). Why do many psychiatric disorders emerge during adolescence? Nature Reviews Neuroscience, 9, 947–957.
  61. Pell MD (2002). Evaluation of nonverbal emotion in face and voice: Some preliminary findings on a new battery of tests. Brain and Cognition, 48, 499–514.
  62. Penton-Voak IS, Thomas J, Gage SH, McMurran M, McDonald S, & Munafò MR (2013). Increasing recognition of happiness in ambiguous facial expressions reduces anger and aggressive behavior. Psychological Science, 24, 1–10.
  63. Platt B, Kadosh KC, & Lau JYF (2013). The role of peer rejection in adolescent depression. Depression and Anxiety, 30, 809–821.
  64. Psychological Corporation. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX: Author.
  65. Richman LS, & Leary MR (2009). Reactions to discrimination, stigmatization, ostracism, and other forms of interpersonal rejection: A multimotive model. Psychological Review, 116, 365–383.
  66. Rigby K (2000). Effects of peer victimization in schools and perceived social support on adolescent well-being. Journal of Adolescence, 23, 57–68.
  67. Rudolph KD, Hammen C, Burge D, Lindberg N, Herzberg D, & Daley SE (2000). Toward an interpersonal life-stress model of depression: The developmental context of stress generation. Development and Psychopathology, 12, 215–234.
  68. Rutter M, Bailey A, & Lord C (2003). Manual for the Social Communication Questionnaire. Los Angeles, CA: Western Psychological Services.
  69. Sauter DA, & Scott SK (2007). More than one kind of happiness: Can we recognize vocal expressions of different positive states? Motivation and Emotion, 31, 192–199.
  70. Sauter DA, Panattoni C, & Happé F (2013). Children’s recognition of emotions from vocal cues. British Journal of Developmental Psychology, 31, 97–113.
  71. Scherer KR (2003). Vocal communication of emotion: A review of research paradigms. Speech Communication, 40, 227–256.
  72. Simonian SJ, Beidel DC, Turner SM, Berkes JL, & Long JH (2001). Recognition of facial affect by children and adolescents diagnosed with social phobia. Child Psychiatry and Human Development, 32, 137–145.
  73. Spackman MP, Brown BL, & Otto S (2009). Do emotions have distinct vocal profiles? A study of idiographic patterns of expression. Cognition & Emotion, 23, 1565–1588.
  74. Spence SH (2003). Social skills training with children and young people: Theory, evidence and practice. Child and Adolescent Mental Health, 8, 84–96.
  75. Stanton-Salazar RD, & Spina SU (2005). Adolescent peer networks as a context for social and emotional support. Youth & Society, 36(4), 379–417.
  76. Stoddard J, Tseng WL, Kim P, Chen G, Yi J, Donahue L, Brotman MA, Towbin KA, Pine DS, & Leibenluft E (2016). Association of irritability and anxiety with the neural mechanisms of implicit face emotion processing in youths with psychopathology. JAMA Psychiatry. Advance online publication. doi: 10.1001/jamapsychiatry.2016.3282
  77. Surguladze SA, Young AW, Senior C, Brébion G, Travis MJ, & Phillips ML (2004). Recognition accuracy and response bias to happy and sad facial expressions in patients with major depression. Neuropsychology, 18, 212–218.
  78. Trentacosta CJ, & Fine SE (2009). Emotion knowledge, social competence, and behavior problems in childhood and adolescence: A meta-analytic review. Social Development, 19, 1–29.
  79. van Beek Y, & Dubas JS (2008). Decoding basic and non-basic facial expressions and depressive symptoms in late childhood and adolescence. Journal of Nonverbal Behavior, 32, 53–64.
  80. Wagner HL (1993). On measuring performance in category judgment studies of nonverbal behavior. Journal of Nonverbal Behavior, 17, 3–28.
  81. Zaki J, Bolger N, & Ochsner K (2009). Unpacking the informational bases of empathic accuracy. Emotion, 9, 478–487.
  82. Zuckerman M, Lipets MS, Hall Koivumaki J, & Rosenthal R (1975). Encoding and decoding nonverbal cues of emotion. Journal of Personality and Social Psychology, 32, 1068–1076.
  83. Zupan B (2015). Recognition of high and low intensity facial and vocal expressions of emotion by children and adults. Journal of Social Sciences and Humanities, 1, 332–344.
