Author manuscript; available in PMC: 2022 Mar 1.
Published in final edited form as: J Clin Child Adolesc Psychol. 2021 Mar-Apr;50(2):155–176. doi: 10.1080/15374416.2021.1878898

Using Evaluative Criteria to Review Youth Anxiety Measures, Part II: Parent-Report

Rebecca G Etkin 1, Eli R Lebowitz 1, Wendy K Silverman 1
PMCID: PMC8025201  NIHMSID: NIHMS1670894  PMID: 33739908

Abstract

This Evidence Base Update of parent-report measures of youth anxiety symptoms is a companion piece to our update on youth self-report anxiety symptom measures (Etkin, Shimshoni, Lebowitz, & Silverman, 2020). We rate the psychometric properties of the parent-report measures as Adequate, Good, or Excellent using criteria developed by Hunsley and Mash (2008) and Youngstrom et al. (2017). Our review reveals that the evidence base for parent-report measures is considerably less developed compared with the evidence base for youth self-report measures. Nevertheless, several measures, the parent-report Screen for Child Anxiety Related Emotional Disorders, Multidimensional Anxiety Scale for Children, and Spence Children’s Anxiety Scale, were found to have Good to Excellent psychometric properties. We conclude our review with suggestions about which parent-report youth anxiety measures are best suited to perform different assessment functions and directions for additional research to expand and strengthen the evidence base.

Keywords: Evidence-based, Anxiety, Assessment, Measurement, Child, Parent


This article is a companion piece to “Using Evaluative Criteria to Review Youth Anxiety Measures, Part I: Self-Report” (Etkin et al., 2020). In that article we used evaluative criteria developed by Hunsley and Mash (2008) and expanded by Youngstrom et al. (2017) to evaluate and rate the quality and quantity of research examining child and adolescent anxiety assessment measures’ psychometric properties (‘youth’ unless referring to a specific developmental period). In this article we evaluate the parent versions of these same measures, as well as two other measures designed specifically for parental use (i.e., no companion youth version). As illustrated by the rubric showing the evaluative criteria (Table 1), each psychometric property may earn a rating of Adequate, Good, or Excellent. Benchmarks for each rating vary by the psychometric property under consideration. A rating of Adequate indicates that a measure has a minimal level of rigorous research support; a rating of Good indicates sound empirical research support; a rating of Excellent indicates substantial and high-quality research for a given psychometric property (De Los Reyes & Langer, 2018). Together, Part I and Part II of this review provide systematic evaluations that allow for comparisons of psychometric properties and associated assessment functions (e.g., screening) within, and between, youth and parent measures (Silverman & Kurtines, 1996).

Table 1.

Rubric for evaluating norms, validity, and utility (De Los Reyes & Langer, 2018)

Each criterion is described at the Adequate, Good, and Excellent levels.

Norms. Adequate: M and SD for total score (and subscores if relevant) from a large, relevant clinical sample. Good: M and SD for total score (and subscores if relevant) from multiple large, relevant samples, at least one clinical and one nonclinical. Excellent: Same as “good,” but must be from a representative sample (i.e., random sampling, or matching to census data).

Internal consistency (Cronbach’s alpha, split half, etc.). Adequate: Most evidence shows alpha values of 0.70–0.79. Good: Most reported alphas 0.80–0.89. Excellent: Most reported alphas ≥0.90.

Inter-rater reliability. Adequate: Most evidence shows kappas of 0.60–0.74, or ICCs of 0.70–0.79. Good: Most reported kappas of 0.75–0.84, or ICCs of 0.80–0.89. Excellent: Most kappas ≥0.85, or ICCs ≥0.90.

Test–retest reliability (stability). Adequate: Most evidence shows test–retest correlations ≥0.70 over a period of several days or weeks. Good: Most evidence shows test–retest correlations ≥0.70 over a period of several months. Excellent: Most evidence shows test–retest correlations ≥0.70 over 1 year or longer.

Repeatability. Adequate: Bland–Altman (Bland & Altman, 1986) plots show small bias and/or weak trends; coefficient of repeatability is tolerable compared to clinical benchmarks (Vaz, Falkmer, Passmore, Parsons, & Andreou, 2013). Good: Bland–Altman plots and corresponding regressions show no significant bias and no significant trends; coefficient of repeatability is tolerable. Excellent: Bland–Altman plots and corresponding regressions show no significant bias and no significant trends; established for multiple studies; coefficient of repeatability is small enough that it is not clinically concerning.

Content validity. Adequate: Test developers clearly defined the domain and ensured representation of the entire set of facets. Good: Same as “adequate,” plus all elements (items, instructions) evaluated by judges (experts or pilot participants). Excellent: Same as “good,” plus multiple groups of judges and quantitative ratings.

Construct validity (e.g., predictive, concurrent, convergent, and discriminant validity). Adequate: Some independently replicated evidence of construct validity. Good: Bulk of independently replicated evidence shows multiple aspects of construct validity. Excellent: Same as “good,” plus evidence of incremental validity with respect to other clinical data.

Discriminative validity. Adequate: Statistically significant discrimination in multiple samples; AUCs <0.6 under clinically realistic conditions (i.e., not comparing treatment-seeking and healthy youth). Good: AUCs of 0.60 to <0.75 under clinically realistic conditions. Excellent: AUCs of 0.75 to 0.90 under clinically realistic conditions.

Prescriptive validity. Adequate: Statistically significant accuracy at identifying a diagnosis with a well-specified matching intervention, or statistically significant moderator of treatment. Good: Same as “adequate,” with good kappa for diagnosis, or significant treatment moderation in more than one sample. Excellent: Same as “good,” with good kappa for diagnosis in more than one sample, or moderate effect size for treatment moderation.

Validity generalization. Adequate: Some evidence supports use with either more than one specific demographic group or in more than one setting. Good: Bulk of evidence supports use with either more than one specific demographic group or in multiple settings. Excellent: Bulk of evidence supports use with more than one specific demographic group AND in multiple settings.

Treatment sensitivity. Adequate: Some evidence of sensitivity to change over the course of treatment. Good: Independent replications show evidence of sensitivity to change over the course of treatment. Excellent: Same as “good,” plus sensitive to change across different types of treatments.

Clinical utility. Adequate: After practical considerations (e.g., costs, respondent burden, ease of administration and scoring, availability of relevant benchmark scores, patient acceptability), assessment data are likely to be clinically actionable. Good: Same as “adequate,” plus published evidence that using the assessment data confers clinical benefit (e.g., better outcome, lower attrition, greater satisfaction) in areas important to stakeholders. Excellent: Same as “good,” plus independent replication.

Note: ICC = intraclass correlation coefficient; AUC = area under the curve. Table reproduced with permission.

Extending our previous update to parent-report measures reflects the assessment process itself: assessing youths’ emotional and behavioral problems, including anxiety, is incomplete without parental perspectives. A multimodal, multi-informant approach has long been considered a hallmark of evidence-based assessment in clinical child and adolescent psychology and psychiatry because it captures the breadth, depth, and nuances of the clinical picture (e.g., Ross, 1980; Silverman & Kurtines, 1996). Parents are essential and unique informants of their child’s clinical status given their proximity and close relationship. Parent-report measures of youth anxiety offer the advantages of being inexpensive, efficient, and versatile. They offer the additional advantage of providing critical information about anxiety among youths who may have difficulty completing self-report measures, such as young children and children with developmental disabilities (Fisak & Barrett, 2019; Gotham, Brunwasser, & Lord, 2015), and those too anxious and/or unwilling to engage in the assessment process (e.g., Silverman & Eisen, 1992).

When incorporating parents’ perspectives of youth anxiety into the assessment process, discrepancies between parent- and youth-reports are the general rule, not the exception (e.g., Achenbach, McConaughy, & Howell, 1987). Indeed, such discrepancies often reflect meaningful differences relating to the different contexts (e.g., home versus peer) in which parents and youths observe or experience youths’ emotional and behavioral problems (e.g., Cannon et al., 2020; Makol et al., 2020), as well as parents’ and youths’ respective histories, attributions, and motivations (De Los Reyes & Kazdin, 2004, 2005; Kraemer et al., 2003). Discrepancies in reporting on youth anxiety symptoms may also relate to other meaningful demand characteristics such as social desirability and self-presentation (e.g., DiBartolo, Albano, Barlow, & Heimberg, 1998; Pina, Silverman, Saavedra, & Weems, 2001). In sum, it is not the case that one informant is right and the other is wrong, nor is it the assessor’s task to solve a right-versus-wrong conundrum. Rather, whether discrepant or not, reports from multiple informants provide the opportunity to glean conceptually- and clinically-meaningful information that can enhance clinical decision-making (De Los Reyes, Thomas, Goodman, & Kundey, 2013; Kraemer et al., 2003; Makol et al., 2020). In this article, while our emphasis is on providing recommendations based on our evaluation of parent-report measures of youth anxiety, we also compare our evaluative ratings across informants in the Discussion.

Method

Our procedure for retrieving and evaluating the studies used in this review is identical to that in Etkin et al. (2020). Briefly, to search for studies of the psychometric properties of parent-report youth anxiety symptom measures, we conducted several keyword-guided electronic database searches using PsycINFO and Google Scholar. Some searches included the specific names of the youth anxiety self-report measures evaluated in our prior review (e.g., “Parent-report AND Multidimensional Anxiety Scale for Children OR MASC OR MASC-P”). Other searches included broader, descriptive terms (e.g., “Parent AND Youth OR Child* OR Adolesc* AND Measure OR Questionnaire OR Survey OR Scale”). We also consulted narrative reviews and meta-analyses of anxiety treatment studies to locate articles that allowed for our evaluation of measures’ treatment sensitivity – one of the psychometric criteria.

Our inclusion criteria required that articles were (1) peer-reviewed, (2) empirical studies evaluating the psychometric properties of parent-report youth anxiety symptom measures, and (3) published in English using the English-language version of the instruments. Articles were screened by the first and last authors to ensure these criteria were met. We included studies of the most widely used and researched version of each measure and included samples of parents whose children ranged in age from 3 to 19 years. Given our focus on anxiety symptom measures, as in Etkin et al. (2020), we did not include studies of measures specific to obsessive-compulsive disorder (OCD) and posttraumatic stress disorder, or broadband measures of youth psychiatric disorders. Although the Child Behavior Checklist (CBCL; Achenbach, 1991; Achenbach & Edelbrock, 1983) is frequently used to assess youth anxiety, we did not include CBCL studies because this measure was not developed as an anxiety symptom measure per se, and because no one subscale is consistently used in the literature (e.g., Internalizing scale, Anxious/Depressed scale). Finally, although the Social Interaction Anxiety Scale (Mattick & Clarke, 1998) and the Social Phobia Scale (Mattick & Clarke, 1998) have been used in a small number of studies to assess adolescent anxiety from both the youth and parent perspective (e.g., Deros et al., 2017; Glenn et al., 2019), we did not include them in this review because they are adult self-report measures of social anxiety.

A total of 55 articles met these inclusion criteria. We reviewed these articles for information pertinent to rating the measures’ psychometric properties as Adequate, Good, or Excellent. Because no study evaluated the psychometric properties of repeatability, prescriptive validity, or clinical utility, these criteria are not covered in the current review.

Overview of Measures

There were far fewer articles for us to cover in the current review (N = 55) compared with our youth anxiety self-report measure review (N = 136). A likely reason is that most of the youth anxiety measures were developed first, to assess anxiety from the youth perspective (Edelbrock, Costello, Dulcan, Kalas, & Conover, 1985). Only the parent version of the Screen for Child Anxiety Related Emotional Disorders (SCARED-P; Birmaher et al., 1997) was developed and evaluated in tandem with the youth self-report version. As part of the development/evaluation of the youth self-report version of the Multidimensional Anxiety Scale for Children (MASC), parents were asked to complete the youth form from their perspective (March, Parker, Sullivan, Stallings, & Conners, 1997; referred to subsequently as the MASC-P; e.g., Wood, Piacentini, Bergman, McCracken, & Barrios, 2002). The other parent versions were developed after the youth measures, by their original developers (Social Phobia and Anxiety Inventory for Children; SPAI-C/P; Beidel, Turner, Hamlin, & Morris, 2000; Social Anxiety Scale for Children - Revised; SASC-R/P; La Greca, 1999; Spence Children’s Anxiety Scale; SCAS-P; Nauta et al., 2004) or by independent investigators (State-Trait Anxiety Inventory for Children - Trait scale; STAIC-P-T; Revised Children’s Manifest Anxiety Scale; RCMAS-P; Fear Survey Schedule for Children - Revised; FSSC-R/P; e.g., Kendall, 1994; Weems, Silverman, Saavedra, Pina, & Lumpkin, 1999). Changes from the youth versions involved re-wording items from “I…” to “My child….” The STAIC-P-T also contained additional items not in the youth version; the SCAS-P had some items deleted (see Content Validity).

As noted, our literature search revealed two youth anxiety symptom measures that were developed from the outset as parent-report measures and met the inclusion criteria for our review, the Preschool Anxiety Scale/Preschool Anxiety Scale - Revised (PAS/PAS-R; Edwards, Rapee, Kennedy, & Spence, 2010; Spence, Rapee, McDonald, & Ingram, 2001) and Selective Mutism Questionnaire (SMQ; Bergman, Keller, Piacentini, & Bergman, 2008). The PAS/PAS-R are both included and evaluated together in this review because the differences between the two versions are minimal (see Content Validity); they appear to have generally the same frequency of use across the literature; and they have nearly identical psychometric properties.

In total, we cover psychometric studies of 10 parent-report youth anxiety symptom measures in the current review (see Table 2 for a summary of each measure’s length, subscales, translations, and open-access availability). A summary of this research is presented in the Results, organized by psychometric property. Each section proceeds from lowest to highest quality evaluative rating (i.e., Adequate, Good, Excellent) and from oldest- to newest-developed measure. If a measure falls below the threshold required for a rating of Adequate for a given psychometric property, we explain the reasons at the end of the section. Unless otherwise noted, our use of the term “clinical samples” refers to families who presented to youth anxiety disorder specialty clinics. Our ratings are summarized in Table 3.

Table 2.

Summary Information for Youth Anxiety Parent-Report Measures

State-Trait Anxiety Inventory for Children – Parent-Report – Trait Version (STAIC-P-T; Strauss, 1987). Age range: 7–15 years (Southam-Gerow et al., 2003). Items: 20 items + 6 items assessing physiological symptoms. Response scale: 1 (hardly) – 3 (often). Subscales: None. Translations: None. Open access: Yes.

Revised Children’s Manifest Anxiety Scale – Parent-Report (RCMAS-P; e.g., Silverman et al., 1999). Age range: 6–17 years (Pina et al., 2001). Items: 28 items + 9-item Lie scale (37 items total). Response scale: Yes/No (Cole et al., 2000 used 3 response options: Yes, Sort of, No). Subscales: Cole et al. (2000) found evidence of 3 factors: Social Alienation; Worry-Oversensitivity; Sleep Disturbance (Lie scale not examined). Original factors (youth version): Worry; Oversensitivity; Physiological Reactivity; Concentration Problems; Lie Scale. Translations: German, Spanish. Open access: No.

Fear Survey Schedule for Children – Revised – Parent-Report (FSSC-R/P; e.g., Weems et al., 1999). Age range: 6–17 years (Weems et al., 1999). Items: 80 items. Response scale: 1 (none) – 3 (a lot). Subscales: Fear of Failure and Criticism; Fear of Danger and Death; Fear of Small Animals; Medical Fears; Fear of the Unknown. Translations: Italian, Spanish, Swedish. Open access: Yes.

Social Anxiety Scale for Children – Revised – Parent-Report (SASC-R/P; La Greca, 1999). Age range: 8–16 years (e.g., Epkins & Seegan, 2015). Items: 22 items. Response scale: 1 (not at all) – 5 (all the time). Subscales: Not published. Original factors (youth version): Fear of Negative Evaluation; Social Avoidance and Distress in New Situations; General Social Avoidance and Distress. Translations: Norwegian. Open access: Yes.

Social Phobia and Anxiety Inventory for Children – Parent-Report (SPAI-C/P; Beidel et al., 2000). Age range: 8–14 years (Beidel et al., 2000; Higa et al., 2006). Items: 26 items. Response scale: 0 (never or hardly ever) – 2 (most of the time or always). Subscales: Assertiveness/General Conversation; Traditional Social Encounters; Public Performance. Translations: Swedish. Open access: No.

Multidimensional Anxiety Scale for Children (MASC; March et al., 1997). Age range: 6–17 years (e.g., Langer et al., 2010). Items: 39 items. Response scale: 0 (never) – 3 (often). Subscales: Social Anxiety; Separation Anxiety/Panic; Harm Avoidance; Physical Symptoms. Translations: Spanish. Open access: No.

Screen for Child Anxiety Related Emotional Disorders – Parent-Report (SCARED-P; Birmaher et al., 1997, 1999). Age range: 5–19 years (e.g., Birmaher et al., 1999; Sequeira et al., 2019). Items: 41 items. Response scale: 0 (not true or hardly ever true) – 2 (very true or often true). Subscales: Social Anxiety; Separation Anxiety; Generalized Anxiety; Panic Disorder/Significant Somatic Symptoms; Significant School Avoidance. Translations: Arabic, Dutch, Finnish, Hebrew, German, Portuguese, Spanish, Turkish. Open access: Yes.

Spence Children’s Anxiety Scale – Parent-Report (SCAS-P; Nauta et al., 2004). Age range: 6–18 years (Nauta et al., 2004). Items: 38 items. Response scale: 0 (never) – 3 (always). Subscales: Separation Anxiety; Social Phobia; Generalized Anxiety; Obsessive-Compulsive Disorder; Panic/Agoraphobia; Physical Injury Fears. Translations: Armenian, Catalan, Chinese, Czech, Danish, Dutch, French, Hebrew, Italian, Japanese, Malay, Norwegian, Persian, Polish, Portuguese, Slovenian, Spanish, Swedish, Syrian-Arabic, Urdu (many available on scaswebsite.com). Open access: Yes.

Preschool Anxiety Scale (PAS; Spence et al., 2001) / Preschool Anxiety Scale – Revised (PAS-R; Edwards et al., 2010). Age range: 3–6 years (Spence et al., 2001; Edwards et al., 2010). Items: PAS: 28 items (34 items if PTSD scale included); PAS-R: 28 items. Response scale: 0 (not at all true) – 4 (very often true). Subscales: PAS: Generalized Anxiety; Social Anxiety; Obsessive-Compulsive; Physical Injury Fears; Separation Anxiety; Posttraumatic Stress (not scored). PAS-R: Generalized Anxiety; Social Anxiety; Physical Injury Fears; Separation Anxiety. Translations: Armenian, Chinese, Danish, Dutch, French, Hebrew, Icelandic, Persian, Portuguese, Romanian, Russian, Slovenian, Spanish, Syrian-Arabic, Turkish (many available on scaswebsite.com). Open access: Yes.

Selective Mutism Questionnaire (SMQ; Bergman et al., 2008). Age range: 3–11 years (Bergman et al., 2008; Letamendi et al., 2008). Items: 17 items. Response scale: 0 (never) – 3 (always); note that lower scores indicate lower frequency of speech (i.e., greater SM symptoms). Subscales: School; Home; Other Social Situations. Translations: Hebrew, Norwegian. Open access: Yes.

Table 3.

Summary of Ratings for Youth Anxiety Parent-Report Measures

Ratings are listed for each measure in the following order: Norms, Internal Consistency, Test-Retest Reliability/Stability, Content Validity, Construct Validity, Discriminative Validity, Validity Generalization, Treatment Sensitivity, and Totals (by measure).

STAIC-P-T Adequate Good Excellent Good Adequate Adequate Adequate Good 4A, 3G, 1E
RCMAS-P Adequate Good Excellent Good N/A Adequate Good Good 2A, 4G, 1E
FSSC-R/P Adequate Good N/A Good N/A Adequate Good Good 2A, 4G
SASC-R/P Adequate Good N/A Good Adequate Adequate Adequate N/A 4A, 2G
SPAI-C/P N/A Good N/A Excellent Adequate N/A Adequate N/A 2A, 1G, 1E
SCARED-P Excellent Good Excellent Good Good Excellent Excellent Excellent 3G, 5E
MASC-P Good Good Good Excellent Adequate Excellent Good Excellent 1A, 4G, 3E
SCAS-P Good Good N/A Excellent Adequate Good Good Excellent 1A, 4G, 2E
PAS/PAS-R N/A Good Excellent Excellent Adequate Adequate Adequate Good 3A, 2G, 2E
SMQ Adequate Good N/A Good Adequate Adequate Adequate Good 4A, 3G
Totals (by criterion) 5A, 2G, 1E 10G 1G, 4E 6G, 4E 7A, 1G 6A, 1G, 2E 5A, 4G, 1E 5G, 3E

Note. N/A = the measure could not be rated due to insufficient data or did not meet the threshold for a rating of Adequate on a given criterion.

Results

Norms

The STAIC-P-T, RCMAS-P, FSSC-R/P, SASC-R/P, and SMQ each earn a rating of Adequate because means and standard deviations are available for one clinical sample with N > 100 (defined as large in prior Evidence Base updates; Holly et al., 2019). For the STAIC-P-T, there are descriptive data for mother- and father-reports in a clinical sample of 241 youth (7–16 years; 83% White; Southam-Gerow, Flannery-Schroeder, & Kendall, 2003) and a smaller community sample of 85 youth (8–16 years; 94% White; Engel, Rodrigue, & Geffken, 1994; also for the State scale). For the RCMAS-P, Pina et al. (2001) present means and standard deviations (broken down by youth age, sex, and ethnicity) for the Total and Lie scale scores in a clinical sample (N = 284; 6–17 years; 60% White). There are also descriptive data (Lie and Total scales only) for a smaller sample of parents and their children presenting to anxiety and attention-deficit/hyperactivity disorder specialty clinics (N = 62; 8–12 years; 85% White; Barbosa, Tannock, & Manassis, 2002) and for several community samples (Ns = 85–359; 7–16 years; 21 – 94% White; Cole, Hoffman, Tram, & Maxwell, 2000; Engel et al., 1994; Varela, Sanchez-Sosa, Biggs, & Luis, 2008).

For the FSSC-R/P, Weems et al. (1999) present means and standard deviations by subscales and youth diagnosis for a clinical sample (N = 120; 6–17 years; 34% Hispanic); Varela et al. (2008) present descriptive data for subscales (mother- and father-reports) for a community sample (N = 217; 7–16 years; 79% Mexican or Hispanic American). For the SASC-R/P, there are descriptive data from a sample of parents and their children referred to an outpatient community-based clinic (N = 110; 8–16 years; 71% White; Epkins & Seegan, 2015) and another small clinical sample (N = 23; 7–15 years; 74% White; Manassis et al., 2003). The mean and standard deviation for the Total scale score were also reported in a community sample of 32 children (3rd – 5th grade; 75% White; DiBartolo & Grills, 2006) and in a large primary care sample of 714 youths (8–17 years; 71% White; Bailey, Chavira, Stein, & Stein, 2006). The SMQ was originally normed using data from an internet-based community sample (N = 589) and a clinical sample of youth with anxiety disorders including selective mutism (N = 66; 3–10 years; 65 – 83% White; Bergman et al., 2008). There are descriptive data for one other sample of parents and their children with (n = 102) and without (n = 43) selective mutism diagnoses participating in a national research study (3–11 years; 75% White; Letamendi et al., 2008) and the small clinical sample described above (Manassis et al., 2003).

The MASC-P and SCAS-P each earn a rating of Good because means and standard deviations are available from several clinical and community samples with Ns >100. March et al. (1997) first reported descriptive data for parents completing the MASC from a sample of youth and their parents who presented to specialty anxiety disorder and attention-deficit/hyperactivity disorder clinics (N = 24, 8–16 years; all White). Means and standard deviations (broken down by subscales and youth age and sex) are reported from other larger clinical samples (Ns = 438 – 488; 7–17 years; 79% White; Palitz et al., 2018; Wei et al., 2014; and Ns = 174–186; 6–17 years; 77% White; Langer, Wood, Bergman, & Piacentini, 2010; Wood et al., 2002) and community samples (Ns = 217 – 499; 7–16 years; Australian and Mexican/Hispanic/White; Baldwin & Dadds, 2007). For the SCAS-P, means and standard deviations (broken down by mother- and father-report, subscales, and other relevant factors) can be found for several clinical samples (Ns = 88 – 1,438; 7–18 years; predominantly White; Brown-Jacobsen, Wallace, & Whiteside, 2011; Evans, Thirlwall, Cooper, & Creswell, 2017; Reardon et al., 2019) and mixed clinical/community samples (Ns = 85 – 484; 6–18 years; predominantly White; Nauta et al., 2004; Whiteside & Brown, 2008).

The SCARED-P earns a rating of Excellent because there are descriptive data from a nationally representative sample of parents of 1,570 5–12 year-old youth (59% White) from all 50 states of the U.S., who were selected to match the U.S. population on key demographic variables (Sequeira, Silk, Woods, Kolko, & Lindhiem, 2019). There are more descriptive data for the SCARED-P than any other parent measure, including data from clinical, community/healthy control, and primary care samples. Across these studies, sample sizes were Ns = 190 – 1,092, youth ages were 5–19 years, and 9 – 64% were ethnic minority (Bailey et al., 2006; Behrens, Swetlitz, Pine, & Pagliaccio, 2019; Birmaher et al., 1999; Birmaher et al., 1997; Bowers et al., 2020; Caporino et al., 2017; Dirks et al., 2014; Ford-Paz et al., 2019; Gardner, Lucas, Kolko, & Campo, 2007; Gonzalez, Weersing, Warnick, Scahill, & Woolston, 2012; Jastrowski Mano et al., 2012; Monga et al., 2000; Rappaport, Pagliaccio, Pine, Klein, & Jarcho, 2017; Van Meter et al., 2018; Wren et al., 2007; Wren, Bridge, & Birmaher, 2004).

The SPAI-C/P and PAS/PAS-R did not meet the threshold for ratings of Adequate because there are no normative data from clinical samples of N > 100. The SPAI-C/P was normed originally in a clinical sample of parents of socially anxious (n = 40) and healthy control (n = 15) youth (8–14 years; 80% White); the means and standard deviations reported were derived by combining the data of the two subsamples (Beidel et al., 2000). The SPAI-C/P Total and subscale score means and standard deviations were also reported in a Hawaiian community sample (N = 158; 10–14 years; majority bi/multiracial, 6% White; Higa, Fernandez, Nakamura, Chorpita, & Daleiden, 2006); means and standard deviations are reported in other studies of mixed community and clinic-referred samples (Ns < 100; e.g., Glenn et al., 2019; Lipton, Augenstein, Weeks, & De Los Reyes, 2014). The PAS/PAS-R were normed originally in Australian community samples of predominantly White and middle-to-upper-income mothers (ns = 755 – 764) and fathers (ns = 383 – 418) of preschool children (2.5–7 years); descriptive data are presented broken down by mother- and father-report, subscales, and youth sex (Edwards, Rapee, Kennedy, et al., 2010; Spence et al., 2001).

Internal Consistency

The STAIC-P-T, RCMAS-P, FSSC-R/P, SASC-R/P, SPAI-C/P, PAS/PAS-R, and SMQ each earn a rating of Good given that most Cronbach’s alphas (α) range from .80 – .89, although alphas are lower for certain subscales and higher for Total scales. For the STAIC-P-T, alphas ranged from .84 – .91 for mother-report and .88 – .91 for father-report (at four assessments; two before and two after treatment) in a clinical sample (Southam-Gerow et al., 2003). For the RCMAS-P, Pina et al. (2001) reported alphas of .85 and .82 for the Total scale and Lie scales, respectively, in a clinical sample. In community samples, Total scale alphas ranged from .89 – .92 (Cole et al., 2000; Cole, Truglio, & Peeke, 1997). For the FSSC-R/P, Weems et al. (1999) reported subscale alphas ranging from .78 – .92 in a clinical sample. In a community sample of mothers and fathers Varela et al. (2008) reported alphas ranging from .94 – .97 for the Total scale and .59 – .92 for the subscales. For the SASC-R/P, Total scale alphas ranged from .68 – .89 in a community sample and .91 in a clinical sample (Bergman et al., 2008; DiBartolo & Grills, 2006). For the SPAI-C/P, alphas ranged from .93 – .95 for the Total scale and .79 – .88 for the subscales in community and mixed community and clinic-referred samples (e.g., Higa et al., 2006; Lipton et al., 2014). For the PAS/PAS-R, Total scale alphas were .92 and the subscale alphas ranged from .72 – .89 in a community sample of mothers and fathers (Edwards, Rapee, Kennedy, et al., 2010); Total scale alphas above .80 are consistent with those reported in treatment studies (e.g., Kennedy, Rapee, & Edwards, 2009). For the SMQ, Bergman et al. (2008) reported a Total scale alpha of .84 in a community sample, and a Total scale alpha of .97 and subscale alphas ranging from .88 – .97 in a clinical sample. Letamendi et al. (2008) reported similar alpha coefficients in a clinical sample (Total α = .78; subscale αs = .65 – .91).

The MASC-P, SCARED-P, and SCAS-P also each earn ratings of Good, and have more supportive data than the measures discussed above. For the MASC-P, alpha coefficients in clinical samples range from .78 – .88 for the Total scale and .68 – .88 for the subscales (Bergman et al., 2008; Langer et al., 2010; Wei et al., 2014; Wood et al., 2002). In community samples, alphas range from .78 – .91 for the Total scale and .33 – .87 for the subscales (Baldwin & Dadds, 2007; Varela et al., 2008). In the first studies of the SCARED-P, Birmaher et al. (1997) and Birmaher et al. (1999) reported alphas of .74 to .90 in samples presenting to a mood and anxiety disorders clinic; it was not specified which alpha values corresponded to which subscale or to parent- versus youth-report. Alphas reported in several other samples of parents of children presenting to anxiety and general outpatient clinics range from .91 – .96 for the Total scale and .72 – .95 for the subscales (Dirks et al., 2014; Gonzalez et al., 2012; Rappaport et al., 2017; Van Meter et al., 2018). In two primary care samples, alphas ranged from .92 – .93 for the Total scale and .59 – .89 for the subscales (Jastrowski Mano et al., 2012; Wren et al., 2007). For the SCAS-P, Nauta et al. (2004) originally reported alphas from a sample presenting to anxiety and general outpatient clinics (Total scale α =.89, subscale αs = .61 – .81) and a healthy control sample drawn from the community (Total scale α =.89, subscale αs = .58 – .74). Whiteside and Brown (2008) reported a Total score alpha of .93 and subscale alphas ranging from .47 – .83 for a combined clinical and community sample (authors indicate estimates were nearly identical for each group), and Total and subscale alphas ranged from .88 – .90 and .62 – .87, respectively, in two clinical samples (Brown-Jacobsen et al., 2011; Reardon et al., 2019).
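For readers who wish to reproduce these estimates, the brief sketch below (in Python) shows one standard way to compute Cronbach's alpha from an item-level response matrix and to map the result onto the rubric in Table 1 (.70–.79 Adequate, .80–.89 Good, ≥.90 Excellent). The data and variable names are simulated for illustration only and are not drawn from any study reviewed here.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items
    item_variances = items.var(axis=0, ddof=1)   # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def rate_alpha(alpha: float) -> str:
    """Map an alpha value onto the rubric used in this review (Table 1)."""
    if alpha >= 0.90:
        return "Excellent"
    if alpha >= 0.80:
        return "Good"
    if alpha >= 0.70:
        return "Adequate"
    return "Below Adequate"

# Simulated example: 200 parents rating 38 items that all tap one latent anxiety dimension.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
responses = latent + rng.normal(scale=1.0, size=(200, 38))
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f} ({rate_alpha(alpha)})")
```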

Test-Retest Reliability and Stability

Test-retest reliability and stability refer to the consistency of a measure’s scores over a few days or weeks and over a few months or longer, respectively (Watson, 2004). We rated measures’ test-retest reliability and stability based on commonly used benchmarks (e.g., Cohen, 2013; Landis & Koch, 1977): intraclass correlations (ICCs) > .74 and Pearson’s r > .70 are considered Excellent; ICCs = .59 – .74 and Pearson’s r = .50 – .70 are considered Good; and ICCs = .40 – .58 and Pearson’s r = .30 – .50 are considered Adequate (see Etkin et al., 2020).
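As a concrete illustration of these benchmarks, the sketch below computes a Pearson correlation and an intraclass correlation for two assessment points and classifies their magnitudes using the cutoffs above. It assumes an ICC(3,1) formulation for simplicity (the studies reviewed do not always report which ICC form they used), all data are simulated, and the full rating in this review also depends on the length of the retest interval (see Table 1).

```python
import numpy as np
from scipy.stats import pearsonr

def icc_3_1(scores: np.ndarray) -> float:
    """ICC(3,1): two-way mixed-effects, consistency, single measurement.
    `scores` is (n_subjects x n_occasions), e.g., columns for time 1 and time 2 totals."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()   # between-subject
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()   # between-occasion
    ss_total = ((x - grand_mean) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

def rate_icc(icc: float) -> str:
    if icc > 0.74:
        return "Excellent"
    if icc >= 0.59:
        return "Good"
    if icc >= 0.40:
        return "Adequate"
    return "Below Adequate"

def rate_r(r: float) -> str:
    if r > 0.70:
        return "Excellent"
    if r >= 0.50:
        return "Good"
    if r >= 0.30:
        return "Adequate"
    return "Below Adequate"

# Simulated total scores at two assessments, a few weeks apart.
rng = np.random.default_rng(1)
time1 = rng.normal(50, 10, size=120)
time2 = 0.8 * time1 + rng.normal(10, 6, size=120)
r, _ = pearsonr(time1, time2)
icc = icc_3_1(np.column_stack([time1, time2]))
print(f"r = {r:.2f} ({rate_r(r)}); ICC = {icc:.2f} ({rate_icc(icc)})")
```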

The MASC-P earns a rating of Good, as Baldwin and Dadds (2007) reported r = .70 for the Total scale and rs = .56 – .70 for the subscales over 12 months in a community sample. The STAIC-P-T, RCMAS-P, SCARED-P, and PAS/PAS-R each earn a rating of Excellent. For the STAIC-P-T, Southam-Gerow et al. (2003) reported ICCs of .71 over 8 weeks and .76 over one year for mother-report, and .75 over 8 weeks and .68 over one year for father-report, in a clinical sample. For the RCMAS-P, Cole et al. (2000) reported r = .76 over 6 months for the Total scale in a community sample. For the SCARED-P, Birmaher et al. (1997) reported ICCs of .86 for the Total scale and .70 – .90 for the subscales over a span of 4 days to 15 weeks (median = 5 weeks) in a clinical sample. Behrens et al. (2019) also found ICCs of .86 for the Total scale and .74 – .85 for the subscales over 5 days to 15 weeks apart (M = 39 days) in a mixed treatment-seeking/healthy sample. For the PAS-R, Edwards, Rapee, Kennedy, et al. (2010) found 12-month stability estimates of rs = .73 – .74 for the Total scale, and rs = .60 – .76 for the subscales (most > .70) in a community sample of mothers and fathers (similar findings in Edwards, Rapee, & Kennedy, 2010). The FSSC-R/P, SASC-R/P, SPAI-C/P, SCAS-P, and SMQ are not rated because we could not locate any studies of their test-retest reliability or stability.

Content Validity

As noted, the STAIC-P-T, RCMAS-P, FSSC-R/P, SASC-R/P, SPAI-C/P, SCARED-P, MASC-P, and SCAS-P were created by changing item stems from the youth self-perspective (e.g., “I…”) to the parent perspective (e.g., “My child…”). For the STAIC-P-T, six items were added to assess parents’ perceptions of several anxiety-related physiological responses (e.g., headaches, jitters) in their child (Southam-Gerow et al., 2003; Strauss, 1987). For the SCAS-P, six filler items (e.g., “I am good at sports”) included in the original youth version were omitted from the parent version (Nauta et al., 2004). Given that the above noted parent measures were all adapted from the youth versions, their content validity ratings are the same as those we reported in our Part I update (Etkin et al., 2020). Specifically, the STAIC-P-T, RCMAS-P, FSSC-R/P, SASC-R/P, and SCARED-P each earn a rating of Good because developers clearly defined each measure’s content domains, ensured representation across items, and had items evaluated by judges (e.g., experts, youth). The SPAI-C/P, MASC-P, and SCAS-P each earn a rating of Excellent because developers further refined the measures through consultation with judges and pilot tested the next iteration in another independent sample.

The PAS/PAS-R and SMQ are the two parent-report youth anxiety measures that we indicated earlier were designed from the outset for parental use (i.e., not derived from a youth version). The SMQ earns a rating of Good. The item pool was developed to assess a range of situations in which youth may display a failure to speak, which is the core feature of selective mutism. Developers consulted clinicians and parents of youth with selective mutism to generate these items (Bergman et al., 2008). The PAS/PAS-R earns a rating of Excellent. PAS items were developed to assess a broad spectrum of anxiety symptoms relevant to preschoolers based on literature reviews, DSM criteria, other forms of anxiety assessments (e.g., Anxiety Disorders Interview Schedule for Children; ADIS-C/P; Silverman & Albano, 1996), and input from developers, who were experts on anxiety problems in preschool children. In addition, several items deemed relevant to preschoolers were adapted from the SCAS. The initial item pool was reduced following consultation with groups of parents of preschool children who provided feedback about the relevance and understandability of the items. The next iteration underwent pilot testing with 600 mothers and fathers of children aged 3–5 years before the final set of items was established (Spence et al., 2001). The PAS-R is a slightly modified version of the PAS; specifically, nine items were added to improve coverage of symptoms, three items were modified to further clarify meaning, and seven items were removed due to very low response rates in the original version (Edwards, Rapee, Kennedy, et al., 2010). Following confirmatory factor analyses, two additional obsessive-compulsive symptom items were removed from the PAS-R due to poor psychometric properties (e.g., αs < .50), leading to equivalent content between the PAS and PAS-R aside from the Obsessive-Compulsive subscale and the item modifications noted above.

Construct Validity

The STAIC-P-T, SPAI-C/P, SASC-R/P, MASC-P, SCAS-P, PAS/PAS-R and SMQ each earn ratings of Adequate as there is independently replicated evidence to support some form of each measure’s construct validity. In demonstrating convergent validity, moderate-to-strong significant correlations have been found between these measures. For example, a significant correlation of r = −.52 was found between the SMQ and SASC-R/P in a clinical sample (lower SMQ scores indicate greater selective mutism symptoms; Bergman et al., 2008); significant correlations of rs > .70 were found between corresponding MASC-P and SCAS-P subscales in a community sample (except for the MASC-P Harm Avoidance scale; Baldwin & Dadds, 2007). In addition, each of these measures significantly correlates with clinician and/or observer ratings of youth anxiety. For example, significant correlations of rs = .39 – .67 were found between the SCAS-P subscales and clinician ratings of corresponding symptoms on the ADIS-C/P in a clinical sample (Brown-Jacobsen et al., 2011). Significant correlations of rs = .25 – .31 were found between the SPAI-C/P and observer-rated youth social anxiety during social interaction and speech tasks in a combined sample of clinic-referred and community control adolescents (Glenn et al., 2019).

In demonstrating divergent validity, low and/or nonsignificant correlations have been found with non-anxiety measures, including measures of externalizing and depressive symptoms, as would be expected given that these constructs are distinct from anxiety. For example, in a clinical sample, correlations were r = .09 (ns) /.16 (p <.05) and .21/.31 (ps <.05) (father/mother report) between the STAIC-P-T and the CBCL Delinquent and Aggressive scales, respectively, and .14 (ns) / .20 (p < .05) between the STAIC-P-T and the Beck Depression Inventory (Southam-Gerow et al., 2003). Significant but low correlations of rs = .18 – .19 were found between the SPAI-C/P and the CBCL Aggressive, Rule-Breaking, and Externalizing scales (Higa et al., 2006), and correlations of r = .21/.27 were found between mother/father-reports of the PAS and the CBCL Externalizing scale (Spence et al., 2001).

The SCARED-P earns a rating of Good because it has a larger body of independently replicated evidence supporting different forms of its construct validity. Regarding convergent validity, there are significant correlations of rs = .34 – .76 with other parent-report measures of anxiety and internalizing problems (Monga et al., 2000; Van Meter et al., 2018; Wren et al., 2004), clinician-rated anxiety (r = 0.25, p < .001; Behrens et al., 2019), and observer ratings of social anxiety (Social Anxiety subscale, rs = .21 – .26, ps < .05; Bowers et al., 2020). Regarding divergent validity, there is evidence of low correlations between the SCARED-P and externalizing symptom measures. For example, Monga et al. (2000), in a clinical sample, found that the Total score correlated significantly more strongly, t(159) = 7.11, p < .001, with the CBCL Internalizing scale (r = 0.61, p < .001) than with the Externalizing scale (r = 0.28, p < .001); this also was true for SCARED-P subscales (i.e., each subscale correlated more highly with the CBCL Internalizing than Externalizing scale), and similar findings were reported in a primary care sample (Jastrowski Mano et al., 2012).

The SCARED-P, MASC-P, SCAS-P, and SMQ each have some evidence of incremental validity, that is, the ability to predict clinically-relevant data above and beyond other measures of the same construct (e.g., Haynes & Lench, 2003; Johnston & Murray, 2003). Letamendi et al. (2008) found the SMQ Total scale, School, and Social subscales added significant variance in the prediction of selective mutism diagnosis over the CBCL Anxious/Depressed subscale. Wei et al. (2014) found the MASC-P had incremental validity over the MASC in predicting the percentage of youth correctly diagnosed with separation (SAD), social (SOC), and generalized (GAD) anxiety disorders. Reardon et al. (2019) found father- and mother-reports on the SCAS-P Physical Injury Fears subscale each made a significant contribution in identifying specific phobias among girls, and mother-reports only in identifying specific phobias among boys (for other subscales and disorders, all reporters made unique contributions). Bowers et al. (2020) found that the SCARED-P Social Anxiety scale significantly predicted youth observed anxiety during a speech task above and beyond youth reports on the same scale. Because of the limited amount of evidence, however, no measure was rated as Excellent.

The RCMAS-P and FSSC-R/P are not rated because, as far as our search revealed, they do not have independently replicated evidence of construct validity (i.e., no studies of construct validity for the FSSC-R/P; low convergent and divergent validity for the RCMAS-P in a community sample; Cole et al., 1997).

Discriminative Validity

Discriminative validity refers to the ability of a measure to differentiate between or identify groups, in this case, youth with anxiety disorders. Some studies compare mean levels of scores on anxiety measures for youth with and without anxiety disorders, with significantly higher scores for youth with anxiety disorders indicating good discriminative validity. Other studies use logistic regression, likelihood ratios, or discriminant function analysis to determine whether anxiety disorders can be identified from a measure’s set of scores. Receiver operating characteristic (ROC) analysis is also used to determine how well scores on a measure can differentiate between or identify disorders with the area under the curve (AUC) metric. Studies also commonly report measures’ sensitivity (i.e., the measure’s ability to correctly identify those with anxiety), and specificity (i.e., the measure’s ability to correctly identify those without anxiety). For the AUC, sensitivity, and specificity, values closer to 1.00 indicate better discrimination.
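For concreteness, the short sketch below shows how the AUC, sensitivity, and specificity summarized in this section are typically computed from dimensional scores and diagnostic status. The scores, diagnoses, and cut-score are simulated and purely illustrative; they do not reproduce any study's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Simulated data: parent-report total scores and clinician-assigned diagnoses (1 = anxiety disorder).
rng = np.random.default_rng(2)
diagnosis = rng.integers(0, 2, size=300)
scores = np.where(diagnosis == 1,
                  rng.normal(30, 8, size=300),   # diagnosed youth tend to score higher
                  rng.normal(20, 8, size=300))

auc = roc_auc_score(diagnosis, scores)           # overall discrimination (closer to 1.00 is better)

cut_score = 25                                   # hypothetical cut-score
flagged = scores >= cut_score
sensitivity = np.mean(flagged[diagnosis == 1])   # proportion of diagnosed youth correctly flagged
specificity = np.mean(~flagged[diagnosis == 0])  # proportion of non-diagnosed youth correctly not flagged
print(f"AUC = {auc:.2f}; sensitivity = {sensitivity:.2f}; specificity = {specificity:.2f}")
```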

The STAIC-P-T, RCMAS-P, FSSC-R/P, SASC-R/P, SPAI-C/P, PAS/PAS-R, and SMQ are all rated Adequate, as there is some evidence of significant discrimination in clinically realistic conditions (i.e., not just comparing treatment-seeking and healthy youth). For the STAIC-P-T, one study found low though significant discrimination between youth with anxiety disorders and youth with depressive and/or disruptive disorders (AUC = 0.61; Monga et al., 2000). For the RCMAS-P, Pina et al. (2001) found that the Total scale mean score significantly discriminated youth with overanxious disorder/GAD from youth with specific phobias, and girls with comorbid anxiety disorders from girls with comorbid anxiety/depressive disorders; the Total and Lie scale mean scores also discriminated boys with comorbid anxiety disorders from boys with comorbid anxiety/disruptive disorders. Barbosa et al. (2002) also found the RCMAS-P Total score discriminated youth with anxiety disorders from healthy controls in a small clinical sample. For the FSSC-R/P, Weems et al. (1999) found an overall diagnostic accuracy rate of 73% based on subscale scores in a clinical sample. Positive and negative predictive power (i.e., the probability that youth do or do not have the anxiety disorder) for the most highly-endorsed items in identifying corresponding specific phobia types ranged from .16 – .77 and .87 – .97, respectively; these items outperformed subscale scores. For the SASC-R/P, Bailey et al. (2006) found AUCs of .83 – .85 (for children) and .85 – .87 (for adolescents) for the Total scale score in identifying SOC in a primary care sample; they also found AUCs >.80 for most subscales. For the PAS/PAS-R, mean scores were significantly higher for children with anxiety disorders than children without anxiety disorders, and a series of logistic regressions showed that anxiety diagnoses of GAD, SAD, SOC, and specific phobia were most strongly predicted by the corresponding PAS–R subscales (Edwards, Rapee, Kennedy, et al., 2010). For the SMQ, mean scores were significantly lower (i.e., greater selective mutism symptoms) across scales for youth diagnosed with selective mutism than for youth with other anxiety disorders (Bergman et al., 2008).

The SCAS-P earns a rating of Good. Youth with anxiety disorders were found to have significantly higher mean scores on all SCAS-P subscales than healthy control youth. Moreover, there was specificity in the scores in that youth with SAD, SOC, and OCD had significantly higher scores on the respective subscales (i.e., Separation Anxiety, Social Phobia, Obsessive-Compulsive) than on the other subscales (e.g., youth with SAD did not have higher scores on the GAD subscale than youth with GAD) (Nauta et al., 2004; Whiteside & Brown, 2008). Using discriminant function analysis, 80.5% of youth with anxiety disorders and healthy controls were correctly identified, and among youth with anxiety disorders specifically, 72% were correctly identified with OCD, 70% with SAD, 68% with panic disorder/agoraphobia, 60% with SOC, and 31% with GAD (Nauta et al., 2004). Using a series of logistic regressions, Brown-Jacobsen et al. (2011) found that the Separation Anxiety, Social Phobia, and Obsessive-Compulsive subscales also correctly identified about 70% of youth with corresponding diagnoses. Using ROC analysis, S. P. Whiteside, Gryczkowski, Biggs, Fagen, and Owusu (2012) found that the SCAS-P Obsessive-Compulsive subscale discriminated youth with OCD from community youth (cut-score = 3; AUC = .96; sensitivity = .95, specificity = .87), and also youth with OCD from youth with other anxiety disorders (cut-score = 7, AUC = .89; sensitivity = .76, specificity = .88). Reardon et al. (2019) found that mother- and father-reports on the SCAS-P Separation Anxiety subscale correctly identified SAD (AUC = .74 – .82, sensitivity = .70 – .78, specificity = .69 – .75) and the Social Phobia scale correctly identified SOC (AUCs = .70 – .77, sensitivity = .66 – .71, specificity = .63 – .71) for boys and girls with primary anxiety disorders participating in a multisite research study. However, the Physical Injury Fears scale identified specific phobia only with father-report and for girls (AUC = .70, sensitivity = .61, specificity = .71) and the Generalized Anxiety scale could not identify youth with GAD (AUCs < .65).

The SCARED-P and MASC-P each earn a rating of Excellent. For both measures, mean scores were significantly higher for youth with anxiety disorders than for youth with other psychiatric disorders in several studies. Additionally, there was specificity in the scores in that youth had significantly higher scores on the subscales corresponding to their anxiety disorders (e.g., Birmaher et al., 1999; Birmaher et al., 1997; Monga et al., 2000; Wei et al., 2014; Wood et al., 2002).

Several studies used ROC analysis. For the MASC-P, Wood et al. (2002) found optimal cut-scores for the Total scale that identified SOC (cut-score = 16.5, sensitivity = .70, specificity = .63), SAD (cut-score = 13.75, sensitivity = .76, specificity = .72), and panic disorder (cut-score = 18.5, sensitivity = .88, specificity = .83). Wei et al. (2014) found that most subscales accurately identified corresponding diagnoses; AUCs, in the order of highest to lowest and for children/adolescents, were Separation/Panic predicting SAD (AUCs = .78/.80), Social Anxiety predicting SOC (AUCs = .74/.82), and Physical Symptoms (AUCs = .63/.67) and Harm Avoidance (AUCs = .56/.68) predicting GAD.

For the SCARED-P, Birmaher et al. (1999) and Monga et al. (2000) found the Total scale score significantly discriminated youth with anxiety disorders from those with depressive and/or disruptive disorders in clinical samples (AUC = 0.67; Monga et al., 2000); although mean scores discriminated anxiety and depression specifically, AUCs did not (AUC = .59, ns; Birmaher et al., 1999). Rappaport et al. (2017) found the Total score correctly identified diagnoses of GAD and/or SAD (AUC = .98, sensitivity = 65%, specificity = 99%), the Generalized Anxiety scale identified GAD specifically (AUC = .93, sensitivity = 78%, specificity = 91%), and the Separation Anxiety scale SAD specifically (AUC = .94, sensitivity = 79%, specificity = 92%) in a treatment-seeking sample of youth with and without anxiety disorders (similar results reported for non-treatment seeking youth with and without anxiety disorders by the authors). Van Meter et al. (2018) found AUCs = .69 – .89 for the Total scale score in identifying any anxiety disorder, GAD, and SAD; only the Generalized Anxiety scale score outperformed the Total scale score when identifying GAD (AUC = .86). Using diagnostic likelihood ratios for optimal cut-scores, the authors also found Total scale scores ≥ 22 were associated with a moderate increase in the odds of receiving one or more anxiety disorder diagnoses, and scores ≤ 21 were associated with reduced likelihood of receiving any anxiety diagnosis (similar findings emerged for the Generalized and Separation Anxiety subscales with cut-scores of 8 for both). For all optimal cut-scores, sensitivity values were ≥ .82 (except .56 for identifying any anxiety disorder with the Total score) and specificity values were ≥ .79. Gonzalez et al. (2012) similarly found Total scale scores ≥ 25 were significantly associated with increased odds of meeting criteria for any anxiety disorder diagnosis (sensitivity = 60% – 68%, specificity = 77% – 88%). Finally, in two primary care samples, Gardner et al. (2007) reported significant identification of any anxiety disorder (AUC = .79, sensitivity = .44, specificity = .92) and Bailey et al. (2006) reported significant identification of SOC with the Social Anxiety subscale optimal cutoff of 5 (children: AUCs = .81 – .85; adolescents: AUCs = .84 – .89, sensitivity = .74 – .78, specificity = .73 – .69). The SPAI-C/P is not rated because we did not find any studies of discriminative validity.
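As a worked illustration of the diagnostic likelihood ratio logic referenced above (e.g., Van Meter et al., 2018), the sketch below converts a cut-score's sensitivity and specificity into LR+ and LR- and uses them to update a pretest probability. The input values are hypothetical stand-ins roughly in the range summarized above, not figures from any single study.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Positive and negative diagnostic likelihood ratios for a given cut-score."""
    lr_positive = sensitivity / (1 - specificity)    # odds multiplier for scoring at/above the cut
    lr_negative = (1 - sensitivity) / specificity    # odds multiplier for scoring below the cut
    return lr_positive, lr_negative

def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Update a pretest probability: convert to odds, multiply by the LR, convert back."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Hypothetical sensitivity/specificity for an optimal cut-score (illustrative values only).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.82, specificity=0.79)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Posttest probability after a positive screen, assuming a .20 base rate: "
      f"{posttest_probability(0.20, lr_pos):.2f}")
```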

Validity Generalization

Validity generalization refers to “the extent to which there is evidence for validity across a range of samples and settings” for a given measure (Hunsley & Mash, 2008, p. 11). The STAIC-P-T, SASC-R/P, SPAI-C/P, PAS/PAS-R, and SMQ each earn a rating of Adequate. Regarding samples, some evidence supports their use with more than one specific demographic group, including parent and youth sex and race/ethnicity. Each of these measures, except the STAIC-P-T, has also been used in international samples/validated in languages other than English (see Table 2). Regarding settings, as far as our search revealed, the SASC-R/P has been used most in community settings (e.g., DiBartolo & Grills, 2006) and less often in outpatient (Epkins & Seegan, 2015) and primary care settings (Bailey et al., 2006). The STAIC-P-T has been used exclusively in anxiety specialty clinics (e.g., Southam-Gerow et al., 2003). The SPAI-C/P, PAS/PAS-R, and SMQ have been used in community and clinical settings.

The RCMAS-P, FSSC-R/P, MASC-P, and SCAS-P each earn a rating of Good because they have more evidence supporting their use with more than one specific demographic group and in multiple settings. Each of these measures has also been used internationally and/or translated into languages other than English (see Table 2). Additionally, they each have evidence of measurement invariance across specific demographic groups. There are three forms of measurement invariance: (1) configural, which indicates that the measure’s factor structure is equivalent between groups, (2) metric (i.e., weak factorial), which indicates that factor loadings of items are equivalent between groups, and (3) scalar (i.e., strong factorial), which indicates that item scores/intercepts are equivalent between groups (Millsap & Meredith, 2007). For the RCMAS-P, FSSC-R/P and MASC-P, Varela et al. (2008) found evidence of configural, metric, and scalar invariance across White, Hispanic-American, and Mexican participants. Nauta et al. (2004) reported that the factors of the SCAS-P are “sufficiently invariant” (p. 827) across youth age and gender, and participants from Australia and the Netherlands. Relative to the other parent measures, the MASC-P and SCAS-P are also the most commonly used to assess anxiety among youth with autism spectrum disorder, although studies do not find strict measurement invariance (i.e., configural, metric, and scalar) in these populations (Glod et al., 2017; Magiati et al., 2017; Toscano et al., 2019; White et al., 2015).

The SCARED-P earns a rating of Excellent. In addition to the bulk of evidence supporting its use with more than one specific demographic group and in multiple settings, several studies find support for some degree of measurement invariance across youth ages and racial/ethnic groups. For example, Gonzalez et al. (2012) found evidence of configural and partial metric invariance (i.e., some different factor loadings on the Panic/Somatic and School Avoidance subscales) across Black and White parents, and Dirks et al. (2014) found evidence for configural and complete metric invariance among White and ethnic minority parents, both in general outpatient samples. Behrens et al. (2019) found evidence of strict invariance among younger and older youth, suggesting that parents’ interpretation of the SCARED-P is not significantly impacted by youth age. Sequeira et al. (2019) also found evidence of nearly identical model fit between several different ethnic groups, and between children aged 5–8 years and 8–12 years in a nationally representative community sample. However, as in earlier research in ethnically diverse samples (e.g., Wren et al., 2007), a five-factor model was not a good fit to the data, highlighting the need for even more factor analytic research on the SCARED-P.

Treatment Sensitivity

The STAIC-P-T, RCMAS-P, FSSC-R/P, PAS/PAS-R, and SMQ each earn ratings of Good, as independent replications show evidence of treatment sensitivity, which refers to the extent that measures can detect change over the course of treatment (Youngstrom et al., 2017). The STAIC-P-T is sensitive to change over the course of individual and group cognitive-behavioral therapy (CBT), as shown in several randomized controlled trials (e.g., Flannery-Schroeder & Kendall, 2000; Kendall, 1994). The RCMAS-P is likewise sensitive to change across several CBT approaches, including individual, group, and parent-involvement formats (e.g., Lumpkin, Silverman, Weems, Markham, & Kurtines, 2002; Silverman, Kurtines, Ginsburg, Weems, Lumpkin, et al., 1999; Silverman, Kurtines, Ginsburg, Weems, Rabian, et al., 1999; Silverman et al., 2019). The FSSC-R/P has also been used in several of these same studies and is sensitive to change (e.g., Silverman, Kurtines, Ginsburg, Weems, Rabian, et al., 1999). It is also sensitive to change in single-session phobia treatments (Oar, Farrell, Waters, Conlon, & Ollendick, 2015). The PAS/PAS-R has been primarily used in interventions focused on preventing the development of anxiety disorders in temperamentally-inhibited children (e.g., Bayer et al., 2011). It has also been used in internet-delivered parent-based treatment for young children with anxiety disorders (Donovan & March, 2014). The SMQ is likewise sensitive to different treatment approaches, including individual and intensive group-based CBT for selective mutism (e.g., Bergman et al., 2008; Cornacchio et al., 2019).

The MASC-P, SCARED-P, and SCAS-P each earn a rating of Excellent because they have multiple independent replications showing evidence of sensitivity to change across different treatment approaches (e.g., internet-delivered CBT; Spence et al., 2011), types (e.g., parent-focused training; attention bias modification training; pharmacological treatment; Birmaher et al., 2003; Klein et al., 2015; Lebowitz, Marin, Martino, Shimshoni, & Silverman, 2019; Pettit et al., 2020), and settings (e.g., school-based CBT; treatment delivered in primary care; Chavira et al., 2014; Chiu et al., 2013; Ginsburg, Pella, Pikulski, Tein, & Drake, 2020). The MASC-P and SCARED-P were also used in the Child/Adolescent Anxiety Multimodal Study (CAMS) and showed sensitivity to change across study arms (CBT, sertraline, and combined; e.g., Compton et al., 2014).

As noted in our Part I review, ROC analyses can be used to measure response and remission following treatment; the MASC-P, SCARED-P, and SCAS-P all have supportive evidence. Palitz et al. (2018) found a 35% reduction in the MASC-P Total score had maximum efficiency (i.e., the greatest accuracy of classification) for predicting treatment response (sensitivity = .77, specificity = .77), and a raw score cutoff of 42 had maximum efficiency for predicting remission (sensitivity = .83, specificity = .64) for CAMS participants. The MASC-P subscales also predicted response and remission from specific anxiety disorders, except for the Harm Avoidance subscale not being predictive of GAD remission. Also among CAMS participants, Caporino et al. (2017) found a 55% reduction in SCARED-P Total scores most strongly predicted response (i.e., maximum efficiency), and a 60% reduction most strongly predicted remission. For the SCAS-P, Evans et al. (2017) found that the Separation Anxiety subscale predicted remission from SAD (AUC = .82 for a cut-score of 6; sensitivity = .76, specificity = .73) for youth participating in two randomized controlled trials of different CBT approaches. The Total scale score also predicted remission from SAD, although somewhat less accurately (AUC = .77 for a cut-score of 24; sensitivity = .62, specificity = .73); the Physical Injury subscale predicted remission from specific phobias (AUC = .71 for a cut-score of 2.5), although poor sensitivity (.48) limits this subscale’s utility.
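The "maximum efficiency" approach described above can be sketched as a simple search over candidate percent-reduction thresholds, selecting the one with the greatest overall classification accuracy. The code below is a schematic illustration with simulated pre/post scores and responder labels; it is not a reproduction of the CAMS analyses.

```python
import numpy as np

def maximum_efficiency_threshold(percent_reduction: np.ndarray,
                                 responded: np.ndarray) -> tuple[float, float]:
    """Scan candidate percent-reduction thresholds and return the one with the
    greatest overall classification accuracy ("maximum efficiency")."""
    best_threshold, best_accuracy = 0.0, 0.0
    for threshold in np.unique(percent_reduction):
        predicted_response = percent_reduction >= threshold
        accuracy = np.mean(predicted_response == responded)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy

# Simulated pre- and post-treatment parent-report totals and an independent response criterion.
rng = np.random.default_rng(3)
pre = rng.normal(50, 10, size=150)
post = pre - rng.normal(15, 10, size=150)
percent_reduction = (pre - post) / pre
responded = (percent_reduction + rng.normal(0, 0.10, size=150)) > 0.35  # noisy "true" responder status

threshold, accuracy = maximum_efficiency_threshold(percent_reduction, responded)
print(f"Maximum-efficiency threshold: {threshold:.0%} reduction (accuracy = {accuracy:.2f})")
```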

The SASC-R/P and SPAI-C/P are not rated because we were able to identify their use in only two school-based interventions for social anxiety (Masia Warner et al., 2016; Masia Warner, Fisher, Shrout, Rathor, & Klein, 2007). Evidence of sensitivity to change has been found only for the corresponding youth self-report versions of these measures.

Discussion

Overview of Findings

This article complements our Part I review of youth anxiety self-report measures (Etkin et al., 2020) by providing an evaluative review of parent-report measures of youth anxiety. We applied criteria that allowed us to rate the studies that examined the measures' psychometric properties relevant to accomplishing key assessment goals (Hunsley & Mash, 2008; Silverman & Kurtines, 1996; Youngstrom et al., 2017). Of the 80 ratings assigned across the ten measures and eight psychometric properties, 29% (n = 23) were Adequate, 37.5% (n = 30) were Good, 19% (n = 15) were Excellent, and 17.5% (n = 14) could not be assigned.

This distribution of ratings, while less encouraging than that found for the youth self-report measures (i.e., 9% Adequate; 53% Good; 38% Excellent; Etkin et al., 2020), is still promising overall; over one half of all the ratings were either Good or Excellent. The Good and Excellent ratings tended to cluster around a smaller number of parent measures compared with the youth measures, for which the Good and Excellent ratings were more evenly spread. Indeed, a noteworthy parent-report 'frontrunner' was the SCARED-P. The SCARED-P was the only parent measure to receive exclusively Good (37.5%; n = 3) and Excellent (62.5%; n = 5) ratings, perhaps due in part to its greater psychometric scrutiny relative to other measures (n = 17 SCARED-P studies versus n < 10 studies for each of the others). The MASC-P and SCAS-P had the next strongest ratings, with a majority of Good (50%; n = 4 for each) and Excellent ratings (MASC-P: 37.5%; n = 3; SCAS-P: 25%; n = 2). Given these findings, the psychometric evidence supports the use of the SCARED-P, MASC-P, and SCAS-P for performing most assessment functions.

Of the remaining seven measures, the STAIC-P, RCMAS-P, SPAI-C/P, and PAS/PAS-R received a mix of Adequate, Good, and Excellent ratings and could not be rated on several psychometric properties; the FSSC-R/P, SASC-R/P, and SMQ received no Excellent ratings (see Table 3). Of note, Adequate and missing ratings do not necessarily reflect poor psychometric qualities of the measures or methodological limitations of studies; they may also reflect an underdeveloped evidence base for a given measure. For example, ratings for several of the psychometric properties (e.g., discriminative validity) depend on whether there is evidence from multiple samples, and most measures lack this evidence.

Below we summarize the findings in more detail and discuss their implications. We begin with the psychometric properties that received the most Good and Excellent ratings across the ten parent-report measures, followed by the properties that received the most Adequate ratings.

Criteria with the Most Good and Excellent Ratings

Internal Consistency.

Internal consistency was the only psychometric property with ratings of Good across all ten of the parent-report measures of youth anxiety. Although all youth self-report measures also received ratings of Good (Etkin et al., 2020), alpha coefficients for the parent versions were typically higher than for the respective youth versions. For example, Rappaport et al. (2017) found that SCARED-P Total and subscale alphas ranged from .86 to .96, whereas SCARED alphas ranged from .71 to .93. Nevertheless, the ubiquity of Good ratings (and lack of Excellent ratings) reflects the fact that most subscale alphas across the parent measures were in the .80 range (see Table 1). Moreover, several parent subscale alphas were .70 or lower, including the SCAS-P Physical Injury Fears, MASC-P Harm Avoidance, and SCARED-P School Avoidance subscales. It is further the case that Cronbach's alpha is not a 'true' estimate of internal consistency because it is a function of the number of scale items and their average correlation; McDonald's (1999) omega (ω) coefficient is preferable because it accounts for the magnitude of item factor loadings and residual covariance between item pairs (Revelle & Condon, 2019). Although none of the studies of the measures we evaluated utilized this metric, we anticipate its increased usage in future work.
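
To illustrate the distinction between the two reliability coefficients, the minimal Python sketch below computes alpha from simulated item responses and omega total from factor loadings and residual variances; all item data, loadings, and residual variances are hypothetical and assume a unidimensional scale, with the loadings ordinarily estimated from a factor model such as the CFA discussed later in this section.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha; items is an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def mcdonald_omega(loadings: np.ndarray, residual_vars: np.ndarray) -> float:
    """Omega total for a unidimensional model, from standardized factor
    loadings and residual (unique) variances estimated in a CFA/EFA."""
    return loadings.sum() ** 2 / (loadings.sum() ** 2 + residual_vars.sum())

# Hypothetical 8-item parent-report subscale completed by 200 parents.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
items = 0.7 * trait + rng.normal(scale=0.7, size=(200, 8))
print(f"alpha = {cronbach_alpha(items):.2f}")

# Illustrative loadings and residual variances (not estimates from any
# published parent-report anxiety measure).
loadings = np.full(8, 0.7)
residual_vars = 1 - loadings ** 2
print(f"omega = {mcdonald_omega(loadings, residual_vars):.2f}")
```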

Test-Retest Reliability and Stability.

Four of the ten parent-report measures received ratings of Excellent for test-retest reliability and stability: the STAIC-P-T, RCMAS-P, SCARED-P, and PAS/PAS-R. The MASC-P received a rating of Good. As with internal consistency, estimates were generally higher for parent than for youth versions when administered within the same sample (e.g., Baldwin & Dadds, 2007). Indeed, only two youth versions, the SPAI-C and MASC, received ratings of Excellent, while the others were rated Good. Nevertheless, test-retest reliability and stability studies are scarce for the parent versions. The FSSC-R/P, SASC-R/P, SPAI-C/P, SCAS-P, and SMQ have no studies of either test-retest reliability or stability and therefore could not be rated, and the Good or Excellent ratings received by the measures noted above were derived from only one or two studies. Given the small number of studies, more research is needed to evaluate the parent measures' test-retest reliability and stability. Also, as in Etkin et al. (2020), we needed to deviate somewhat from the Hunsley and Mash (2008) and Youngstrom et al. (2017) criteria for test-retest reliability/stability to avoid conflating test-retest reliability with stability or 'penalizing' measures for having lower reliability estimates over longer intervals of time (e.g., Youngstrom, Salcedo, Frazier, & Perez Algorta, 2019). We instead rated measures according to commonly used benchmarks for Adequate, Good, and Excellent Pearson's r values and ICCs (e.g., Cohen, 2013). We again suggest a revision of the criteria to reflect these issues.
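
For readers who wish to apply these benchmarks, the sketch below shows one way to obtain a Pearson r and ICCs from two-occasion parent ratings; the data, retest interval, and sample size are simulated and hypothetical, and the pingouin package is used here simply as one convenient option for ICC estimation.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr
import pingouin as pg

# Simulated parent-report totals at baseline and at a brief retest.
rng = np.random.default_rng(1)
time1 = rng.normal(30, 10, size=80)
time2 = time1 + rng.normal(0, 5, size=80)

r, _ = pearsonr(time1, time2)

# pingouin expects long-format data: one row per (subject, occasion) pair.
long = pd.DataFrame({
    "subject": np.tile(np.arange(80), 2),
    "occasion": np.repeat(["t1", "t2"], 80),
    "score": np.concatenate([time1, time2]),
})
icc = pg.intraclass_corr(data=long, targets="subject",
                         raters="occasion", ratings="score")
print(f"Pearson r = {r:.2f}")
print(icc[["Type", "ICC"]])
```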

Content Validity.

All ten of the parent measures received ratings of either Good or Excellent for content validity. Although this is very encouraging, we note that all of the measures except the PAS/PAS-R and SMQ are reworded versions of the youth measures rather than instruments developed from the ground up. As such, their ratings were equivalent to those of the youth versions (i.e., the same youth versions also received Good and Excellent ratings). As Chorpita and Lilienfeld (1999) noted, "adaptation of a new measure from an existing questionnaire raises potential issues about content validity" (p. 215). A related issue is whether the content of these measures would differ if they had been developed from the outset to assess youth anxiety from parents' perspectives. Because most parent-report measures of youth anxiety were adapted from the youth self-report versions, mainly by changing the pronouns from "I" to "My child," it is worth considering whether using the measurement development procedures specified in the criteria, such as pilot-testing initial item pools with parents instead of youth (see Table 1), would strengthen the measures' content validity. The only two measures that included parents in the development process, the PAS/PAS-R and SMQ, are also unique in their respective foci on preschool anxiety symptoms and selective mutism symptoms, content domains that are not included in any of the other parent measures.

Treatment Sensitivity.

Eight of the ten measures received ratings of either Good (i.e., the STAIC-P, RCMAS-P, FSSC-R/P, PAS/PAS-R, and SMQ) or Excellent (i.e., the SCARED-P, MASC-P, and SCAS-P) for treatment sensitivity. The SPAI-C/P and SASC-R/P could not be rated. Among the youth self-report measures, six received ratings of Excellent and two (the STAIC and SASC-R) received ratings of Good. Like the youth versions, the parent measures that assess symptoms linked to DSM criteria (i.e., the SCARED-P, MASC-P, SCAS-P, and PAS/PAS-R) are the most widely used in current treatment outcome studies. These measures were all developed later than the STAIC-P, RCMAS-P, and FSSC-R/P; the latter were therefore the measures used in earlier studies to assess parents' views of their child's progress in cognitive-behavioral treatments (e.g., Kendall, 1994; Silverman, Kurtines, Ginsburg, Weems, Lumpkin, et al., 1999; Silverman, Kurtines, Ginsburg, Weems, Rabian, et al., 1999) and in recent trials to enable comparisons (e.g., Silverman et al., 2019). Parent ratings generally mirror the improvements shown in their child's symptom ratings, making all of the measures rated Good or Excellent candidates for inclusion in treatment outcome studies. Consistent use of the same measure or set of measures across treatment outcome studies (as in Silverman and colleagues' trials) does, however, allow for drawing comparisons of measures' sensitivity to change (Creswell et al., 2020; Etkin et al., 2020; Silverman & Ollendick, 2005). A consistent measurement approach can also help address additional questions relating to treatment sensitivity, such as whether and how the type of parent involvement in their child's treatment may impact parents' ratings of their child's progress over the course of treatment and follow-up (e.g., Saavedra, Silverman, Morgan‐Lopez, & Kurtines, 2010).

The criteria delineating treatment sensitivity may further benefit from improved specificity to permit more nuanced distinctions among measures (Etkin et al., 2020). For example, ratings of Excellent could be contingent on studies demonstrating a measure’s ability to identify treatment response and remission with ROC analysis (e.g., Palitz et al., 2018), a measure’s sensitivity to change following different time intervals (e.g., long-term follow-up; Saavedra et al., 2010), or social validation of symptom change (e.g., assessing whether reduction in anxiety symptoms is associated with qualitative changes such as increased social interactions; Kazdin, 1977). Additions to the treatment sensitivity criteria that encourage such research would facilitate more specific guidance on this important assessment function.

Criteria with Mostly Adequate Ratings

Norms.

Five of the ten parent measures received ratings of Adequate for norms: the STAIC-P, RCMAS-P, FSSC-R/P, SASC-R/P, and SMQ. We could not provide ratings for the PAS/PAS-R and SPAI-C/P because normative/descriptive data from large clinical samples, a required criterion for a rating of Adequate, are lacking. Among the youth versions, there were mostly Good ratings and only one measure, the RCMAS, was rated Excellent. Our finding that the majority of parent-report youth anxiety measures received ratings of Adequate or could not be rated underscores the need for more descriptive data from larger and more diverse clinical samples of parents (and youths) (Etkin et al., 2020). Availability of descriptive data from clinical samples will advance understanding and interpretation of parents' scores on these measures. Of note, while the MASC-P and SCAS-P received ratings of Good for norms, the SCARED-P is the only measure to receive a rating of Excellent. This Excellent rating is based on the SCARED-P's larger body of supportive normative data across different samples, including one large (N = 1,570) sample selected to match U.S. census data on key demographic variables (Sequeira et al., 2019). Overall, the procedure used to gather normative data for the SCARED-P differs from that of the other parent measures and illustrates how other measures' normative data might be extended in future research.

Construct Validity.

Seven of the ten parent measures received ratings of Adequate for construct validity. Of the three remaining measures, the SCARED-P received a rating of Good, and the RCMAS-P and FSSC-R/P could not be rated because they lacked independently replicated evidence of construct validity. Like the youth measures, convergent and divergent validity were the most commonly studied forms of construct validity for the parent measures. Yet, unlike the youth measures, which all earned ratings of Good, there was less research overall on forms of construct validity beyond convergent and divergent validity, and fewer independently replicated studies focused on a given aspect of construct validity. The SCARED-P once again stands out due to its greater amount of construct validity evidence relative to the other parent-report measures. Likewise, the MASC-P, SCAS-P, and SMQ stand out because each has one study demonstrating its incremental validity (i.e., predictive ability beyond that provided by another measure; Hunsley & Meyer, 2003). Knowledge about measures' incremental validity is useful in research and clinical settings because it informs which measure to use over another and thereby reduces burden for researchers, clinicians, and respondents (De Los Reyes & Langer, 2018). Because supportive evidence of incremental validity is part of the criteria for Excellent construct validity, additional research would likely result in more measures receiving higher ratings.
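
Incremental validity is commonly operationalized as the gain in explained variance when one measure is added to a prediction model that already contains another. The Python sketch below illustrates this ΔR² logic with simulated scores; the variable names, effect sizes, and criterion are hypothetical and are not drawn from the studies cited above.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
youth = rng.normal(size=n)                        # e.g., youth self-report total
parent = 0.3 * youth + rng.normal(size=n)         # e.g., parent-report total
criterion = 0.5 * youth + 0.4 * parent + rng.normal(size=n)  # e.g., severity rating

df = pd.DataFrame({"youth": youth, "parent": parent, "y": criterion})

base = sm.OLS(df["y"], sm.add_constant(df[["youth"]])).fit()
full = sm.OLS(df["y"], sm.add_constant(df[["youth", "parent"]])).fit()
print(f"R2 (youth only) = {base.rsquared:.3f}; "
      f"R2 (youth + parent) = {full.rsquared:.3f}; "
      f"incremental R2 = {full.rsquared - base.rsquared:.3f}")
```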

Although not specified in the criteria, CFA is widely viewed as an optimal way to establish a construct's validity. CFA identifies the dimensions or factors underlying a construct and the patterns of item-factor relationships, or factor loadings (Brown, 2015). CFA also offers a more modern analytic approach to establishing convergent and divergent validity with multitrait-multimethod matrices (Campbell & Fiske, 1959). The RCMAS-P, SCARED, SCAS-P, and PAS/PAS-R have undergone CFA, and findings generally support the construct validity of the latter three measures (Bowers et al., 2020; Cole et al., 1997; Edwards, Rapee, & Kennedy, 2010; Langer et al., 2010; Spence et al., 2001). We suggest revising the construct validity criteria to require CFA evidence from large samples to attain a rating of Excellent.
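
As a rough illustration of what such CFA evidence involves, the sketch below fits a two-factor model in Python using the semopy package (an SEM library with lavaan-style model syntax). The item names, factor structure, loadings, and data are entirely hypothetical and do not represent the published structure of any reviewed measure.

```python
import numpy as np
import pandas as pd
import semopy

# Simulate item-level data for two correlated latent factors.
rng = np.random.default_rng(3)
n = 500
factors = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n)
items = {}
for j in range(4):
    items[f"i{j + 1}"] = 0.7 * factors[:, 0] + rng.normal(0, 0.7, n)
    items[f"i{j + 5}"] = 0.7 * factors[:, 1] + rng.normal(0, 0.7, n)
data = pd.DataFrame(items)

# Hypothetical two-factor measurement model.
desc = """
separation =~ i1 + i2 + i3 + i4
social =~ i5 + i6 + i7 + i8
"""

model = semopy.Model(desc)
model.fit(data)
print(model.inspect())           # loadings and factor covariance
print(semopy.calc_stats(model))  # fit indices (e.g., CFI, TLI, RMSEA)
```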

Discriminative Validity.

Six of the ten measures, the STAIC-P, RCMAS-P, FSSC-R/P, SASC-R/P, PAS/PAS-R, and SMQ, received ratings of Adequate. A similar pattern was found for the youth anxiety self-report measures; specifically, the STAIC, RCMAS, FSSC-R, and SASC-R also received ratings of Adequate for discriminative validity (Etkin et al., 2020). Most studies of the parent-report measures established discriminative validity by testing for Total and/or subscale mean score differences between parents of anxious youths and parents of youths with other psychiatric disorders or of healthy controls. ROC analysis provides additional, more nuanced information about measures' discriminative abilities; specifically, the accuracy with which specific Total and subscale scores can identify youth with specific diagnoses. Only the SCARED-P, MASC-P, and SCAS-P, the three measures receiving ratings of Good or Excellent, have been examined in multiple ROC analyses. The AUCs and estimates of sensitivity/specificity and positive/negative predictive values found in these studies were higher for parent reports than youth reports, pointing to the overall superiority of the parent measures in discriminating anxiety from other disorders and/or no disorder (e.g., Bailey et al., 2006; Gardner et al., 2007; Rappaport et al., 2017; Wood et al., 2002). As with the youth measures, the parent measures' subscales varied in how accurately they identified youth with the corresponding anxiety disorder. Generally, subscales assessing disorders with more observable/behavioral symptoms (e.g., SAD, SOC) performed better than those assessing disorders with less observable symptoms (e.g., GAD). Additional ROC studies would be helpful to provide further evidence about optimal parent cut-scores for identifying different anxiety disorders in different groups of youth (e.g., boys, girls, clinical, community).
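
The diagnostic indices described here are straightforward to compute once a gold-standard diagnosis and a candidate cut-score are available. The sketch below uses simulated data (the score distributions, base rate, and cut-score are hypothetical) to show how AUC, sensitivity, specificity, and positive/negative predictive values are derived.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(4)
diagnosis = rng.integers(0, 2, size=300)   # 1 = anxiety disorder on a diagnostic interview
score = np.where(diagnosis == 1,
                 rng.normal(35, 10, size=300),
                 rng.normal(22, 10, size=300))

auc = roc_auc_score(diagnosis, score)

cut = 25                                    # hypothetical parent-report cut-score
pred = (score >= cut).astype(int)
tn, fp, fn, tp = confusion_matrix(diagnosis, pred).ravel()
sens, spec = tp / (tp + fn), tn / (tn + fp)
ppv, npv = tp / (tp + fp), tn / (tn + fn)
print(f"AUC = {auc:.2f}; cut >= {cut}: sensitivity = {sens:.2f}, "
      f"specificity = {spec:.2f}, PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Note that PPV and NPV depend on the base rate of the disorder in the sample, which is one reason cut-scores derived in clinical samples may perform differently when used for community screening.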

Validity Generalization.

Five of the ten measures, the STAIC-P-T, SASC-R/P, SPAI-C/P, PAS/PAS-R, and SMQ, received ratings of Adequate for validity generalization, a significant departure from the all-Excellent ratings found for the youth versions. All of the parent measures have been used with different demographic groups of youth and in different settings, but only the measures rated Good (i.e., the RCMAS-P, FSSC-R/P, MASC-P, and SCAS-P) and Excellent (i.e., the SCARED-P) have studies supporting measurement invariance between demographic groups. Although support for measurement invariance is not currently required as part of the criteria, it would improve understanding of measures' validity generalization by showing that measures have equivalent meaning across groups. In the absence of such support, the meaning of group differences in scores on anxiety measures (e.g., parents of one group of youth rating their children as more anxious than parents of another group) is difficult to interpret.
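
Measurement invariance is typically evaluated by fitting a sequence of increasingly constrained multi-group CFA models and comparing their fit (e.g., via changes in CFI or chi-square). The following sketch of the standard hierarchy is our own illustration, not a formulation taken from the cited studies.

```latex
% Multi-group CFA for item j, respondent i, group g
\begin{align*}
\text{Configural (same pattern):} \quad
  & x_{ij}^{(g)} = \tau_j^{(g)} + \lambda_j^{(g)}\,\eta_i^{(g)} + \varepsilon_{ij}^{(g)} \\
\text{Metric (equal loadings):} \quad
  & \lambda_j^{(g)} = \lambda_j \ \text{for all } g \\
\text{Scalar (equal loadings and intercepts):} \quad
  & \lambda_j^{(g)} = \lambda_j, \quad \tau_j^{(g)} = \tau_j
\end{align*}
```

Scalar invariance is generally required before group differences in observed or latent means can be interpreted as true differences in anxiety rather than differences in how the items function across groups.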

The SCARED-P has the most research supporting its measurement invariance across different racial/ethnic groups; we therefore feel most confident about this parent measure's validity generalization. Of interest, though, is that studies using diverse samples (e.g., Sequeira et al., 2019) found invariance for a different factor structure than that found in predominantly White samples (e.g., Birmaher et al., 1999; Birmaher et al., 1997). Of further note, relatively few studies have focused on establishing invariance of parent reports between children and adolescents (e.g., Behrens et al., 2019); this may be especially important for promoting developmentally sensitive assessment (Creswell et al., 2020; Saavedra et al., 2010). The criteria could thus be further specified to encourage more studies of measurement invariance, which would promote confidence about the equivalent meaning and use of measures across different demographic groups and settings.

Considering Multiple Informants

About 75% of the articles included in our review reported associations between the parent-report youth anxiety measures and versions completed by other informants, usually the youth. Discrepancies between informants on anxiety measures are more common than not, with typical correlations between parent and youth versions in the range of r = .20 to .30 (e.g., Achenbach et al., 1987; De Los Reyes et al., 2015; De Los Reyes & Kazdin, 2005). As noted, when interpreting discrepancies it is important to keep in mind that "diverging findings do not always signal 'noisy data,' and at times might index meaningful psychological phenomena" (De Los Reyes et al., 2019, p. 294). Briefly, the context in which informants witness or experience symptoms (e.g., home, school, different cultures; De Los Reyes et al., 2019) and the extent to which symptoms are observable (e.g., Comer & Kendall, 2004) are examples of variables that influence agreement/discrepancy, whereas informant characteristics (e.g., youth age, parent psychopathology) are less consistently found to be influential (Achenbach et al., 1987; De Los Reyes et al., 2015; De Los Reyes & Kazdin, 2005). Studies demonstrating associations between informant discrepancy and variables such as parent-child relationship quality (De Los Reyes & Kazdin, 2006) and anxiety treatment outcome (Becker-Haimes, Jensen-Doss, Birmaher, Kendall, & Ginsburg, 2018; Zilcha-Mano, Shimshoni, Silverman, & Lebowitz, 2020) further highlight the clinical significance of informant discrepancy.

Given the often-meaningful nature of informant discrepancy, we caution against further research that characterizes correlations between informants' reports as an index of a measure's validity. Rather, we recommend more research investigating the contributions of different informants to specific assessment goals, such as incremental validity (e.g., whether certain combinations of informant reports add significant clinical benefit relative to other combinations in identifying anxiety disorders; Ford-Paz et al., 2019; Reardon et al., 2019; Wei et al., 2014). In one recent illustration, Makol et al. (2020) found that leveraging the shared variance of multiple informants' reports of youth social anxiety, including their unique perspectives (self-report versus other-report) and contexts (home versus school), predicted clinical outcomes (i.e., observed social anxiety; referral status) better than did a single informant's rating or the average of informants' ratings. The approach of Makol et al. (2020) and other novel approaches (e.g., De Los Reyes et al., 2013) are promising means of improving evidence-based clinical judgment and decision-making, including decisions relating to the National Institute of Mental Health Research Domain Criteria initiative and its call for incorporating multiple units of analysis within clinical and translational research (De Los Reyes, Drabick, Makol, & Jakubovic, 2020; Lebowitz, Gee, Pine, & Silverman, 2018).

Additional Recommendations

Other Youth Anxiety Measures.

Several measures did not meet our inclusion criteria for this review but warrant mention. The Youth Anxiety Measure for DSM-5 (YAM-5) uniquely assesses symptoms of all DSM-5 anxiety disorders and phobias; its parent-report version has initial evidence of reliability and validity, but to date only in Dutch samples (Muris, Mannens, Peters, & Meesters, 2017; Muris, Simon, et al., 2017; Simon, Bos, Verboon, Smeekens, & Muris, 2017). The parent-report versions of the Children's Separation Anxiety Scale (Méndez, Espada, Orgilés, Hidalgo, & García-Fernández, 2008; Méndez, Espada, Orgilés, Llavona, & García-Fernández, 2014) and the Separation Anxiety Avoidance Inventory (In-Albon, Meyer, & Schneider, 2013; Orenes, García-Fernández, & Méndez, 2019) are specific to SAD; they also have initial evidence of reliability and validity in Spanish and German samples. We also note the emergence of computerized adaptive measures of youth anxiety. The Youth Online Diagnostic Assessment (YODA; McLellan et al., 2020) is a parent-completed online diagnostic measure; although it can be fully automated, preliminary evidence suggests that it is most reliable and valid when reviewed by a clinician (i.e., 65% agreement between the clinician-reviewed YODA and the ADIS for the presence of any anxiety disorder; κ = 0.70; 89% sensitivity; 93% specificity). This measure format may have clinical utility in reducing the time and resources that often pose barriers to other assessment formats, such as diagnostic interviews (see Ford-Paz et al., 2019, for another approach).

Measures of Anxiety Impairment and Accommodation.

As noted in Etkin et al. (2020), it is important to evaluate impairment related to youth anxiety symptoms. The Child Anxiety Impact Scale (Langley, Bergman, McCracken, & Piacentini, 2004) and the Child Anxiety Life Interference Scale (Lyneham et al., 2013), which also has a preschool version (Gilbertson, Morgan, Rapee, Lyneham, & Bayer, 2017), are well-supported measures of parents' perspectives on the degree to which youth anxiety symptoms interfere with functioning across contexts. There has also been a proliferation of parent-rating scales of family accommodation, a set of parental behaviors associated with greater anxiety symptom severity and impairment. The most widely used is the Family Accommodation Scale (Lebowitz et al., 2013). Other measures include the Family Accommodation Checklist and Interference Scale (Thompson-Hollands, Kerns, Pincus, & Comer, 2014), the Pediatric Accommodation Scale (Benito et al., 2015), the Parenting Anxious Kids Ratings Scale (Flessner, Murphy, Brennan, & D'Auria, 2017), and the Parental Accommodation Scale (Meyer et al., 2018). The Parenting to Reduce Child Anxiety and Depression Scale (Sim, Jorm, Lawrence, & Yap, 2019) assesses other parenting behaviors known to protect against the development of youth anxiety and depression. When included in the assessment process, these measures of impairment and accommodation may add clinical benefit by identifying potential treatment targets. Finally, we note that a thorough evaluation of youth anxiety symptoms should include not only rating measures but also other methods (e.g., observations), the selection of which should be directly informed by the goals of the assessment (Silverman & Kurtines, 1996).

Limitations

This article uniquely contributes to the youth anxiety assessment literature by using evaluative criteria to rate parent-report measures. In conjunction with our Part I article, it provides a full picture of the scope of symptom rating measures that specifically assess youth anxiety. Other assessment methods (e.g., diagnostic interviews, peer- and teacher-reports, clinician-rated measures, broadband measures) are essential to the assessment of youth anxiety but were beyond the aims of this evaluative review. To ensure the consistency of our evaluations, we also did not include studies in which measures were administered in languages other than English. As such, studies of some psychometric properties were omitted (e.g., studies reporting norms from clinical samples and test-retest reliability for the PAS/PAS-R; Benga, Ţincaş, & Visu-Petra, 2010; Broeren & Muris, 2008; Wang & Zhao, 2015).

Our review is also, of course, bounded by the extant literature. There are measures that we did not evaluate due to a lack of supporting research; one notable example is the parent-report MASC-2, which includes supporting psychometric data upon purchase but does not, as far as our search revealed, have peer-reviewed psychometric research. There is also limited research on father reports on youth anxiety measures. There is a long history of neglecting fathers in youth research even though they play crucial roles in child development and psychopathology (e.g., Bögels & Phares, 2008; Phares & Compas, 1992). In our review, studies of only 50% of the measures (i.e., the STAIC-P, RCMAS-P, SCARED-P, SCAS-P, and PAS/PAS-R) reported data specific to fathers. Although not a new call to action, it bears repeating that there is a need to broaden samples to include fathers. Finally, we could not discuss three psychometric properties included in the criteria (i.e., repeatability, prescriptive validity, clinical utility) because we could not locate any supporting studies. We hope that, as the evidence base continues to grow, these limitations will be addressed.

Conclusion

This is the first article to use evaluative criteria to review the state of the evidence base for parent-report measures of youth anxiety symptoms. Our findings provide both good news and less good news for the field. Encouragingly, several measures were found to have good psychometric properties, and the SCARED-P stands out in particular as a well-researched measure that can be used with confidence in research and clinical settings. However, our findings also underscore that the evidence base for parent-report measures remains underdeveloped, especially in comparison with that for youth self-reports. We hope that parent-report measure development and evaluation will continue to be the focus of systematic and rigorous empirical research, and that our review can be meaningfully updated as new knowledge accumulates.

Acknowledgments

This study was supported by National Institute of Mental Health grants R01MH119299, R01DK117651, and R61MH115113. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

  1. Achenbach TM (1991). Manual for the Child Behavior Checklist/4–18 and 1991 profile. University of Vermont, Department of Psychiatry. [Google Scholar]
  2. Achenbach TM, & Edelbrock C. (1983). Child behavior checklist (CBCL). [Google Scholar]
  3. Achenbach TM, McConaughy SH, & Howell CT (1987). Child/adolescent behavioral and emotional problems: implications of cross-informant correlations for situational specificity. Psychological bulletin, 101(2), 213. [PubMed] [Google Scholar]
  4. Bailey KA, Chavira DA, Stein MT, & Stein MB (2006). Brief measures to screen for social phobia in primary care pediatrics. Journal of Pediatric Psychology, 31(5), 512–521. [DOI] [PubMed] [Google Scholar]
  5. Baldwin JS, & Dadds MR (2007). Reliability and validity of parent and child versions of the multidimensional anxiety scale for children in community samples. Journal of the American Academy of Child & Adolescent Psychiatry, 46(2), 252–260. [DOI] [PubMed] [Google Scholar]
  6. Barbosa J, Tannock R, & Manassis K. (2002). Measuring anxiety: Parent‐child reporting differences in clinical samples. Depression and Anxiety, 15(2), 61–65. [DOI] [PubMed] [Google Scholar]
  7. Bayer JK, Rapee RM, Hiscock H, Ukoumunne OC, Mihalopoulos C, Clifford S, & Wake M. (2011). The Cool Little Kids randomised controlled trial: Population-level early prevention for anxiety disorders. BMC public health, 11(1), 11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Becker-Haimes EM, Jensen-Doss A, Birmaher B, Kendall PC, & Ginsburg GS (2018). Parent–youth informant disagreement: Implications for youth anxiety treatment. Clinical child psychology and psychiatry, 23(1), 42–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Behrens B, Swetlitz C, Pine DS, & Pagliaccio D. (2019). The Screen for Child Anxiety Related Emotional Disorders (SCARED): Informant Discrepancy, Measurement Invariance, and Test–Retest Reliability. Child Psychiatry & Human Development, 50(3), 473–482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Beidel DC, Turner SM, Hamlin K, & Morris TL (2000). The Social Phobia and Anxiety Inventory for Children (SPAI-C): external and discriminative validity. Behavior Therapy, 31(1), 75–87. [Google Scholar]
  11. Benga O, Ţincaş I, & Visu-Petra L. (2010). Investigating the structure of anxiety symptoms among Romanian preschoolers using the Spence Preschool Anxiety Scales. Cognitie, Creier, Comportament/Cognition, Brain, Behavior, 14(2). [Google Scholar]
  12. Benito KG, Caporino NE, Frank HE, Ramanujam K, Garcia A, Freeman J, . . . Storch E. (2015). Development of the pediatric accommodation scale: Reliability and validity of clinician-and parent-report measures. Journal of Anxiety Disorders, 29, 14–24. [DOI] [PubMed] [Google Scholar]
  13. Bergman RL, Keller ML, Piacentini J, & Bergman AJ (2008). The development and psychometric properties of the selective mutism questionnaire. Journal of Clinical Child & Adolescent Psychology, 37(2), 456–464. [DOI] [PubMed] [Google Scholar]
  14. Birmaher B, Axelson DA, Monk K, Kalas C, Clark DB, Ehmann M, . . . Brent DA. (2003). Fluoxetine for the treatment of childhood anxiety disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 42(4), 415–423. [DOI] [PubMed] [Google Scholar]
  15. Birmaher B, Brent DA, Chiappetta L, Bridge J, Monga S, & Baugher M. (1999). Psychometric properties of the Screen for Child Anxiety Related Emotional Disorders (SCARED): a replication study. Journal of the American Academy of Child & Adolescent Psychiatry, 38(10), 1230–1236. [DOI] [PubMed] [Google Scholar]
  16. Birmaher B, Khetarpal S, Brent D, Cully M, Balach L, Kaufman J, & Neer SM (1997). The screen for child anxiety related emotional disorders (SCARED): Scale construction and psychometric characteristics. Journal of the American Academy of Child & Adolescent Psychiatry, 36(4), 545–553. [DOI] [PubMed] [Google Scholar]
  17. Bögels S, & Phares V. (2008). Fathers’ role in the etiology, prevention and treatment of child anxiety: A review and new model. Clinical Psychology Review, 28(4), 539–558. [DOI] [PubMed] [Google Scholar]
  18. Bowers ME, Reider LB, Morales S, Buzzell GA, Miller N, Troller-Renfree SV, . . . Fox NA. (2020). Differences in Parent and Child Report on the Screen for Child Anxiety-Related Emotional Disorders (SCARED): Implications for Investigations of Social Anxiety in Adolescents. Journal of Abnormal Child Psychology, 48(4), 561–571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Broeren S, & Muris P. (2008). Psychometric evaluation of two new parent-rating scales for measuring anxiety symptoms in young Dutch children. Journal of Anxiety Disorders, 22(6), 949–958. [DOI] [PubMed] [Google Scholar]
  20. Brown-Jacobsen AM, Wallace DP, & Whiteside SP (2011). Multimethod, multi-informant agreement, and positive predictive value in the identification of child anxiety disorders using the SCAS and ADIS-C. Assessment, 18(3), 382–392. [DOI] [PubMed] [Google Scholar]
  21. Brown TA (2015). Confirmatory factor analysis for applied research: Guilford publications. [Google Scholar]
  22. Campbell DT, & Fiske DW (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological bulletin, 56(2), 81. [PubMed] [Google Scholar]
  23. Cannon CJ, Makol BA, Keeley LM, Qasmieh N, Okuno H, Racz SJ, & De Los Reyes A. (2020). A paradigm for understanding adolescent social anxiety with unfamiliar peers: Conceptual foundations and directions for future research. Clinical child and family psychology review, 1–27. [DOI] [PubMed] [Google Scholar]
  24. Caporino NE, Sakolsky D, Brodman DM, McGuire JF, Piacentini J, Peris TS, . . . Kendall PC (2017). Establishing clinical cutoffs for response and remission on the Screen for Child Anxiety Related Emotional Disorders (SCARED). Journal of the American Academy of Child & Adolescent Psychiatry, 56(8), 696–702. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Chavira DA, Drahota A, Garland AF, Roesch S, Garcia M, & Stein MB (2014). Feasibility of two modes of treatment delivery for child anxiety in primary care. Behaviour Research and Therapy, 60, 60–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Chiu AW, Langer DA, McLeod BD, Har K, Drahota A, Galla BM, . . . Wood JJ (2013). Effectiveness of modular CBT for child anxiety in elementary schools. School psychology quarterly, 28(2), 141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Chorpita BF, & Lilienfeld SO (1999). Clinical assessment of anxiety sensitivity in children and adolescents: where do we go from here? Psychological assessment, 11(2), 212. [Google Scholar]
  28. Cohen J. (2013). Statistical power analysis for the behavioral sciences: Academic press. [Google Scholar]
  29. Cole DA, Hoffman K, Tram JM, & Maxwell SE (2000). Structural differences in parent and child reports of children’s symptoms of depression and anxiety. Psychological assessment, 12(2), 174. [DOI] [PubMed] [Google Scholar]
  30. Cole DA, Truglio R, & Peeke L. (1997). Relation between symptoms of anxiety and depression in children: A multitrait-multimethod-multigroup assessment. Journal of consulting and clinical psychology, 65(1), 110. [DOI] [PubMed] [Google Scholar]
  31. Comer JS, & Kendall PC (2004). A symptom-level examination of parent–child agreement in the diagnosis of anxious youths. Journal of the American Academy of Child & Adolescent Psychiatry, 43(7), 878–886. [DOI] [PubMed] [Google Scholar]
  32. Compton SN, Peris TS, Almirall D, Birmaher B, Sherrill J, Kendall PC, . . . Rynn MA (2014). Predictors and moderators of treatment response in childhood anxiety disorders: Results from the CAMS trial. Journal of consulting and clinical psychology, 82(2), 212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Cornacchio D, Furr JM, Sanchez AL, Hong N, Feinberg LK, Tenenbaum R, . . . Miguel E. (2019). Intensive group behavioral treatment (IGBT) for children with selective mutism: A preliminary randomized clinical trial. Journal of consulting and clinical psychology, 87(8), 720. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Creswell C, Nauta MH, Hudson JL, March S, Reardon T, Arendt K, . . . Halldorsson B. (2020). Research Review: Recommendations for reporting on treatment trials for child and adolescent anxiety disorders–an international consensus statement. Journal of Child Psychology and Psychiatry. [DOI] [PubMed] [Google Scholar]
  35. De Los Reyes A, Augenstein TM, Wang M, Thomas SA, Drabick DA, Burgers DE, & Rabinowitz J. (2015). The validity of the multi-informant approach to assessing child and adolescent mental health. Psychological bulletin, 141(4), 858. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. De Los Reyes A, Drabick DA, Makol BA, & Jakubovic RJ (2020). Introduction to the Special Section: The Research Domain Criteria’s Units of Analysis and Cross-Unit Correspondence in Youth Mental Health Research: Taylor & Francis. [DOI] [PubMed] [Google Scholar]
  37. De Los Reyes A, & Kazdin AE (2004). Measuring informant discrepancies in clinical child research. Psychological assessment, 16(3), 330. [DOI] [PubMed] [Google Scholar]
  38. De Los Reyes A, & Kazdin AE (2005). Informant discrepancies in the assessment of childhood psychopathology: a critical review, theoretical framework, and recommendations for further study. Psychological bulletin, 131(4), 483. [DOI] [PubMed] [Google Scholar]
  39. De Los Reyes A, & Kazdin AE (2006). Informant discrepancies in assessing child dysfunction relate to dysfunction within mother-child interactions. Journal of Child and Family Studies, 15(5), 643–661. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. De Los Reyes A, & Langer DA (2018). Assessment and the journal of clinical child and adolescent psychology’s evidence base updates series: Evaluating the tools for gathering evidence. Journal of Clinical Child & Adolescent Psychology, 47(3), 357–365. [DOI] [PubMed] [Google Scholar]
  41. De Los Reyes A, Lerner MD, Keeley LM, Weber RJ, Drabick DA, Rabinowitz J, & Goodman KL (2019). Improving interpretability of subjective assessments about psychological phenomena: A review and cross-cultural meta-analysis. Review of General Psychology, 23(3), 293–319. [Google Scholar]
  42. De Los Reyes A, Thomas SA, Goodman KL, & Kundey SM (2013). Principles underlying the use of multiple informants’ reports. Annual review of clinical psychology, 9, 123–149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. DiBartolo PM, & Grills AE (2006). Who is best at predicting children’s anxiety in response to a social evaluative task?: A comparison of child, parent, and teacher reports. Journal of Anxiety Disorders, 20(5), 630–645. [DOI] [PubMed] [Google Scholar]
  44. Dirks MA, Weersing VR, Warnick E, Gonzalez A, Alton M, Dauser C, . . . Woolston J. (2014). Parent and youth report of youth anxiety: Evidence for measurement invariance. Journal of Child Psychology and Psychiatry, 55(3), 284–291. [DOI] [PubMed] [Google Scholar]
  45. Donovan CL, & March S. (2014). Online CBT for preschool anxiety disorders: a randomised control trial. Behaviour Research and Therapy, 58, 24–35. [DOI] [PubMed] [Google Scholar]
  46. Edelbrock C, Costello AJ, Dulcan MK, Kalas R, & Conover NC (1985). Age differences in the reliability of the psychiatric interview of the child. Child development, 265–275. [PubMed] [Google Scholar]
  47. Edwards SL, Rapee RM, & Kennedy S. (2010). Prediction of anxiety symptoms in preschool‐aged children: examination of maternal and paternal perspectives. Journal of Child Psychology and Psychiatry, 51(3), 313–321. [DOI] [PubMed] [Google Scholar]
  48. Edwards SL, Rapee RM, Kennedy SJ, & Spence SH (2010). The assessment of anxiety symptoms in preschool-aged children: the revised Preschool Anxiety Scale. Journal of Clinical Child & Adolescent Psychology, 39(3), 400–409. [DOI] [PubMed] [Google Scholar]
  49. Engel NA, Rodrigue JR, & Geffken GR (1994). Parent-child agreement on ratings of anxiety in children. Psychological Reports, 75(3), 1251–1260. [DOI] [PubMed] [Google Scholar]
  50. Epkins CC, & Seegan PL (2015). Mother-reported and children’s perceived social and academic competence in clinic-referred youth: Unique relations to depression and/or social anxiety and the role of self-perceptions. Child Psychiatry & Human Development, 46(5), 656–670. [DOI] [PubMed] [Google Scholar]
  51. Etkin RG, Shimshoni Y, Lebowitz ER, & Silverman WK (2020). Using evaluative criteria to review youth anxiety measures, part i: self-report. Journal of Clinical Child & Adolescent Psychology, 1–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Evans R, Thirlwall K, Cooper P, & Creswell C. (2017). Using symptom and interference questionnaires to identify recovery among children with anxiety disorders. Psychological assessment, 29(7), 835. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Fisak B, & Barrett P. (2019). Anxiety in preschool children: Assessment, treatment, and prevention: Routledge. [Google Scholar]
  54. Flannery-Schroeder EC, & Kendall PC (2000). Group and individual cognitive-behavioral treatments for youth with anxiety disorders: A randomized clinical trial. Cognitive Therapy and Research, 24(3), 251–278. [Google Scholar]
  55. Flessner CA, Murphy YE, Brennan E, & D’Auria A. (2017). The Parenting Anxious Kids Ratings Scale-Parent Report (PAKRS-PR): initial scale development and psychometric properties. Child Psychiatry & Human Development, 48(4), 651–667. [DOI] [PubMed] [Google Scholar]
  56. Ford-Paz RE, Gouze KR, Kerns CE, Ballard R, Parkhurst JT, Jha P, & Lavigne J. (2019). Evidence-based assessment in clinical settings: Reducing assessment burden for a structured measure of child and adolescent anxiety. Psychological services. doi: 10.1037/ser0000367 [DOI] [PubMed] [Google Scholar]
  57. Gardner W, Lucas A, Kolko DJ, & Campo JV (2007). Comparison of the PSC-17 and alternative mental health screens in an at-risk primary care sample. Journal of the American Academy of Child & Adolescent Psychiatry, 46(5), 611–618. [DOI] [PubMed] [Google Scholar]
  58. Gilbertson TJ, Morgan AJ, Rapee RM, Lyneham HJ, & Bayer JK (2017). Psychometric properties of the child anxiety life interference scale–preschool version. Journal of Anxiety Disorders, 52, 62–71. [DOI] [PubMed] [Google Scholar]
  59. Ginsburg GS, Pella JE, Pikulski PJ, Tein J-Y, & Drake KL (2020). School-based treatment for anxiety research study (STARS): a randomized controlled effectiveness trial. Journal of Abnormal Child Psychology, 48(3), 407–417. [DOI] [PubMed] [Google Scholar]
  60. Glenn LE, Keeley LM, Szollos S, Okuno H, Wang X, Rausch E, . . . Makol BA (2019). Trained observers’ ratings of adolescents’ social anxiety and social skills within controlled, cross-contextual social interactions with unfamiliar peer confederates. Journal of Psychopathology and Behavioral Assessment, 41(1), 1–15. [Google Scholar]
  61. Glod M, Creswell C, Waite P, Jamieson R, McConachie H, South MD, & Rodgers J. (2017). Comparisons of the factor structure and measurement invariance of the spence children’s anxiety scale—parent version in children with autism spectrum disorder and typically developing anxious children. Journal of Autism and Developmental Disorders, 47(12), 3834–3846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Gonzalez A, Weersing VR, Warnick E, Scahill L, & Woolston J. (2012). Cross-ethnic measurement equivalence of the SCARED in an outpatient sample of African American and non-Hispanic white youths and parents. Journal of Clinical Child & Adolescent Psychology, 41(3), 361–369. [DOI] [PubMed] [Google Scholar]
  63. Gotham K, Brunwasser SM, & Lord C. (2015). Depressive and anxiety symptom trajectories from school age through young adulthood in samples with autism spectrum disorder and developmental delay. Journal of the American Academy of Child & Adolescent Psychiatry, 54(5), 369–376. e363. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Haynes SN, & Lench HC (2003). Incremental validity of new clinical assessment measures. Psychological assessment, 15(4), 456. [DOI] [PubMed] [Google Scholar]
  65. Higa CK, Fernandez SN, Nakamura BJ, Chorpita BF, & Daleiden EL (2006). Parental assessment of childhood social phobia: Psychometric properties of the Social Phobia and Anxiety Inventory for Children–Parent Report. Journal of Clinical Child and Adolescent Psychology, 35(4), 590–597. [DOI] [PubMed] [Google Scholar]
  66. Holly LE, Fenley AR, Kritikos TK, Merson RA, Abidin RR, & Langer DA (2019). Evidence-base update for parenting stress measures in clinical samples. Journal of Clinical Child & Adolescent Psychology, 48(5), 685–705. [DOI] [PubMed] [Google Scholar]
  67. Hunsley J, & Mash EJ (2008). Developing criteria for evidence-based assessment: An introduction to assessments that work. A guide to assessments that work, 3–14. [Google Scholar]
  68. Hunsley J, & Meyer GJ (2003). The incremental validity of psychological testing and assessment: conceptual, methodological, and statistical issues. Psychological assessment, 15(4), 446. [DOI] [PubMed] [Google Scholar]
  69. In-Albon T, Meyer AH, & Schneider S. (2013). Separation anxiety avoidance inventory-child and parent version: psychometric properties and clinical utility in a clinical and school sample. Child Psychiatry & Human Development, 44(6), 689–697. [DOI] [PubMed] [Google Scholar]
  70. Jastrowski Mano KE, Evans JR, Tran ST, Anderson Khan K, Weisman SJ, & Hainsworth KR (2012). The psychometric properties of the screen for child anxiety related emotional disorders in pediatric chronic pain. Journal of Pediatric Psychology, 37(9), 999–1011. [DOI] [PubMed] [Google Scholar]
  71. Johnston C, & Murray C. (2003). Incremental validity in the psychological assessment of children and adolescents. Psychological assessment, 15(4), 496. [DOI] [PubMed] [Google Scholar]
  72. Kazdin AE (1977). Assessing the clinical or applied importance of behavior change through social validation. Behavior modification, 1(4), 427–452. [Google Scholar]
  73. Kendall PC (1994). Treating anxiety disorders in children: results of a randomized clinical trial. Journal of consulting and clinical psychology, 62(1), 100. [DOI] [PubMed] [Google Scholar]
  74. Kennedy SJ, Rapee RM, & Edwards SL (2009). A selective intervention program for inhibited preschool-aged children of parents with an anxiety disorder: Effects on current anxiety disorders and temperament. Journal of the American Academy of Child & Adolescent Psychiatry, 48(6), 602–609. [DOI] [PubMed] [Google Scholar]
  75. Klein AM, Rapee RM, Hudson JL, Schniering CA, Wuthrich VM, Kangas M, . . . Rinck M. (2015). Interpretation modification training reduces social anxiety in clinically anxious children. Behaviour Research and Therapy, 75, 78–84. [DOI] [PubMed] [Google Scholar]
  76. Kraemer HC, Measelle JR, Ablow JC, Essex MJ, Boyce WT, & Kupfer DJ (2003). A new approach to integrating data from multiple informants in psychiatric assessment and research: Mixing and matching contexts and perspectives. American journal of psychiatry, 160(9), 1566–1577. [DOI] [PubMed] [Google Scholar]
  77. La Greca AM (1999). Social anxiety scales for children and adolescents: Manual and instructions for the SASC, SASC-R, SAS-A (adolescents), and parent versions of the scales: AM La Greca. [Google Scholar]
  78. Landis JR, & Koch GG (1977). The measurement of observer agreement for categorical data. biometrics, 159–174. [PubMed] [Google Scholar]
  79. Langer DA, Wood JJ, Bergman RL, & Piacentini JC (2010). A multitrait–multimethod analysis of the construct validity of child anxiety disorders in a clinical sample. Child Psychiatry & Human Development, 41(5), 549–561. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Langley AK, Bergman RL, McCracken J, & Piacentini JC (2004). Impairment in childhood anxiety disorders: Preliminary examination of the child anxiety impact scale–parent version. Journal of Child and Adolescent Psychopharmacology, 14(1), 105–114. [DOI] [PubMed] [Google Scholar]
  81. Lebowitz ER, Gee DG, Pine DS, & Silverman WK (2018). Implications of the Research Domain Criteria project for childhood anxiety and its disorders. Clinical Psychology Review, 64, 99–109. doi: 10.1016/j.cpr.2018.01.005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Lebowitz ER, Marin C, Martino A, Shimshoni Y, & Silverman WK (2019). Parent-based treatment as efficacious as cognitive-behavioral therapy for childhood anxiety: A randomized noninferiority study of supportive parenting for anxious childhood emotions. Journal of the American Academy of Child & Adolescent Psychiatry. doi: 10.1016/j.jaac.2019.02.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Lebowitz ER, Woolston J, Bar-Haim Y, Calvocoressi L, Dauser C, Warnick E, . . . Leckman JF (2013). Family accommodation in pediatric anxiety disorders. Depression and Anxiety, 30(1), 47–54. doi: 10.1002/da.21998 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Letamendi AM, Chavira DA, Hitchcock CA, Roesch SC, Shipon-Blum E, & Stein MB (2008). Selective mutism questionnaire: Measurement structure and validity. Journal of the American Academy of Child & Adolescent Psychiatry, 47(10), 1197–1204. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Lipton MF, Augenstein TM, Weeks JW, & De Los Reyes A. (2014). A multi-informant approach to assessing fear of positive evaluation in socially anxious adolescents. Journal of Child and Family Studies, 23(7), 1247–1257. [Google Scholar]
  86. Lyneham HJ, Sburlati ES, Abbott MJ, Rapee RM, Hudson JL, Tolin DF, & Carlson SE (2013). Psychometric properties of the Child Anxiety Life Interference Scale (CALIS). Journal of Anxiety Disorders, 27(7), 711–719. doi: 10.1016/j.janxdis.2013.09.008 [DOI] [PubMed] [Google Scholar]
  87. Magiati I, Lerh JW, Hollocks MJ, Uljarevic M, Rodgers J, McConachie H, . . . Hardan A. (2017). The measurement properties of the spence children’s anxiety scale‐parent version in a large international pooled sample of young people with autism spectrum disorder. Autism Research, 10(10), 1629–1652. [DOI] [PubMed] [Google Scholar]
  88. Makol BA, Youngstrom EA, Racz SJ, Qasmieh N, Glenn LE, & De Los Reyes A. (2020). Integrating multiple informants’ reports: How conceptual and measurement models may address long-standing problems in clinical decision-making. Clinical Psychological Science, 8(6), 953–970. [Google Scholar]
  89. Manassis K, Fung D, Tannock R, Sloman L, Fiksenbaum L, & McInnes A. (2003). Characterizing selective mutism: Is it more than social anxiety? Depression and Anxiety, 18(3), 153–161. [DOI] [PubMed] [Google Scholar]
  90. March JS, Parker JD, Sullivan K, Stallings P, & Conners CK (1997). The Multidimensional Anxiety Scale for Children (MASC): factor structure, reliability, and validity. Journal of the American Academy of Child & Adolescent Psychiatry, 36(4), 554–565. [DOI] [PubMed] [Google Scholar]
  91. Masia Warner C, Colognori D, Brice C, Herzig K, Mufson L, Lynch C, . . . Moceri DC. (2016). Can school counselors deliver cognitive‐behavioral treatment for social anxiety effectively? A randomized controlled trial. Journal of Child Psychology and Psychiatry, 57(11), 1229–1238. [DOI] [PubMed] [Google Scholar]
  92. Masia Warner C, Fisher PH, Shrout PE, Rathor S, & Klein RG (2007). Treating adolescents with social anxiety disorder in school: An attention control trial. Journal of Child Psychology and Psychiatry, 48(7), 676–686. [DOI] [PubMed] [Google Scholar]
  93. Mattick RP, & Clarke JC (1998). Development and validation of measures of social phobia scrutiny fear and social interaction anxiety. Behaviour Research and Therapy, 36(4), 455–470. [DOI] [PubMed] [Google Scholar]
  94. McLellan LF, Kangas M, Rapee RM, Iverach L, Wuthrich VM, Hudson JL, & Lyneham HJ (2020). The Youth Online Diagnostic Assessment (YODA): Validity of a New Tool to Assess Anxiety Disorders in Youth. Child Psychiatry and Human Development. [DOI] [PubMed] [Google Scholar]
  95. Méndez X, Espada JP, Orgilés M, Hidalgo MD, & García-Fernández JM (2008). Psychometric properties and diagnostic ability of the Separation Anxiety Scale for Children (SASC). European Child & Adolescent Psychiatry, 17(6), 365–372. [DOI] [PubMed] [Google Scholar]
  96. Méndez X, Espada JP, Orgilés M, Llavona LM, & García-Fernández JM (2014). Children’s separation anxiety scale (CSAS): psychometric properties. PloS one, 9(7), e103212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Meyer JM, Clapp JD, Whiteside SP, Dammann J, Kriegshauser KD, Hale LR, . . . Deacon BJ (2018). Predictive relationship between parental beliefs and accommodation of pediatric anxiety. Behavior Therapy, 49(4), 580–593. [DOI] [PubMed] [Google Scholar]
  98. Millsap RE, & Meredith W. (2007). Factorial invariance: Historical perspectives and new problems. Factor analysis at, 100, 131–152. [Google Scholar]
  99. Monga S, Birmaher B, Chiappetta L, Brent D, Kaufman J, Bridge J, & Cully M. (2000). Screen for child anxiety‐related emotional disorders (SCARED): Convergent and divergent validity. Depression and Anxiety, 12(2), 85–91. [DOI] [PubMed] [Google Scholar]
  100. Muris P, Mannens J, Peters L, & Meesters C. (2017). The Youth Anxiety Measure for DSM-5 (YAM-5): Correlations with anxiety, fear, and depression scales in non-clinical children. Journal of Anxiety Disorders, 51, 72–78. [DOI] [PubMed] [Google Scholar]
  101. Muris P, Simon E, Lijphart H, Bos A, Hale W, & Schmeitz K. (2017). The youth anxiety measure for DSM-5 (YAM-5): development and first psychometric evidence of a new scale for assessing anxiety disorders symptoms of children and adolescents. Child Psychiatry & Human Development, 48(1), 1–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Nauta MH, Scholing A, Rapee RM, Abbott M, Spence SH, & Waters A. (2004). A parent-report measure of children’s anxiety: Psychometric properties and comparison with child-report in a clinic and normal sample. Behaviour Research and Therapy, 42(7), 813–839. doi: 10.1016/S0005-7967(03)00200-6 [DOI] [PubMed] [Google Scholar]
  103. Oar EL, Farrell LJ, Waters AM, Conlon EG, & Ollendick TH (2015). One session treatment for pediatric blood-injection-injury phobia: A controlled multiple baseline trial. Behaviour Research and Therapy, 73, 131–142. [DOI] [PubMed] [Google Scholar]
  104. Orenes A, García-Fernández JM, & Méndez X. (2019). Separation Anxiety Assessment Scale—Parent Version: Spanish Validation (SAAS-P: Spanish Validation). Child Psychiatry & Human Development, 50(5), 826–834. [DOI] [PubMed] [Google Scholar]
  105. Palitz SA, Caporino NE, McGuire JF, Piacentini J, Albano AM, Birmaher B, . . . Kendall PC (2018). Defining treatment response and remission in youth anxiety: A signal detection analysis with the multidimensional anxiety scale for children. Journal of the American Academy of Child & Adolescent Psychiatry, 57(6), 418–427. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Pettit JW, Bechor M, Rey Y, Vasey MW, Abend R, Pine DS, . . . Silverman WK (2020). A randomized controlled trial of attention bias modification treatment in youth with treatment-resistant anxiety disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 59(1), 157–165. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Phares V, & Compas BE (1992). The role of fathers in child and adolescent psychopathology: make room for daddy. Psychological bulletin, 111(3), 387. [DOI] [PubMed] [Google Scholar]
  108. Pina AA, Silverman WK, Saavedra LM, & Weems CF (2001). An analysis of the RCMAS lie scale in a clinic sample of anxious children. Journal of Anxiety Disorders, 15(5), 443–457. [DOI] [PubMed] [Google Scholar]
  109. Rappaport B, Pagliaccio D, Pine D, Klein D, & Jarcho J. (2017). Discriminant validity, diagnostic utility, and parent-child agreement on the Screen for Child Anxiety Related Emotional Disorders (SCARED) in treatment-and non-treatment-seeking youth. Journal of Anxiety Disorders, 51, 22–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Reardon T, Creswell C, Lester KJ, Arendt K, Blatter-Meunier J, Bögels SM, . . . Herren C. (2019). The utility of the SCAS-C/P to detect specific anxiety disorders among clinically anxious children. Psychological assessment. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Revelle W, & Condon DM (2019). Reliability from α to ω: A tutorial. Psychological assessment, 31(12), 1395. [DOI] [PubMed] [Google Scholar]
  112. Ross AO (1980). Psychological disorders of children: A behavioral approach to theory, research, and therapy: McGraw-Hill Companies. [Google Scholar]
  113. Saavedra LM, Silverman WK, Morgan‐Lopez AA, & Kurtines WM (2010). Cognitive behavioral treatment for childhood anxiety disorders: long‐term effects on anxiety and secondary disorders in young adulthood. Journal of Child Psychology and Psychiatry, 51(8), 924–934. [DOI] [PubMed] [Google Scholar]
  114. Sequeira SL, Silk JS, Woods WC, Kolko DJ, & Lindhiem O. (2019). Psychometric Properties of the SCARED in a Nationally Representative US Sample of 5–12-Year-Olds. Journal of Clinical Child & Adolescent Psychology, 1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Silverman WK, & Albano AM (1996). Anxiety disorders interview schedule: Adis-IV child interview schedule (Vol. 2): Graywind Publications. [Google Scholar]
116. Silverman WK, & Eisen AR (1992). Age differences in the reliability of parent and child reports of child anxious symptomatology using a structured interview. Journal of the American Academy of Child & Adolescent Psychiatry, 31(1), 117–124.
117. Silverman WK, & Kurtines WM (1996). Anxiety and phobic disorders: A pragmatic approach. Springer Science & Business Media.
118. Silverman WK, Kurtines WM, Ginsburg GS, Weems CF, Lumpkin PW, & Carmichael DH (1999). Treating anxiety disorders in children with group cognitive-behavioral therapy: A randomized clinical trial. Journal of Consulting and Clinical Psychology, 67(6), 995.
119. Silverman WK, Kurtines WM, Ginsburg GS, Weems CF, Rabian B, & Serafini LT (1999). Contingency management, self-control, and education support in the treatment of childhood phobic disorders: A randomized clinical trial. Journal of Consulting and Clinical Psychology, 67(5), 675.
120. Silverman WK, Marin CE, Rey Y, Kurtines WM, Jaccard J, & Pettit JW (2019). Group- versus parent-involvement CBT for childhood anxiety disorders: Treatment specificity and long-term recovery mediation. Clinical Psychological Science, 7(4), 840–855.
121. Silverman WK, & Ollendick TH (2005). Evidence-based assessment of anxiety and its disorders in children and adolescents. Journal of Clinical Child and Adolescent Psychology, 34(3), 380–411.
122. Sim WH, Jorm AF, Lawrence KA, & Yap MB (2019). Development and evaluation of the Parenting to Reduce Child Anxiety and Depression Scale (PaRCADS): Assessment of parental concordance with guidelines for the prevention of child anxiety and depression. PeerJ, 7, e6865.
123. Simon E, Bos AER, Verboon P, Smeekens S, & Muris P. (2017). Psychometric properties of the Youth Anxiety Measure for DSM-5 (YAM-5) in a community sample. Personality and Individual Differences, 116, 258–264. doi: 10.1016/j.paid.2017.04.058
124. Southam-Gerow MA, Flannery-Schroeder EC, & Kendall PC (2003). A psychometric evaluation of the parent report form of the State-Trait Anxiety Inventory for Children—Trait Version. Journal of Anxiety Disorders, 17(4), 427–446.
125. Spence SH, Donovan CL, March S, Gamble A, Anderson RE, Prosser S, & Kenardy J. (2011). A randomized controlled trial of online versus clinic-based CBT for adolescent anxiety. Journal of Consulting and Clinical Psychology, 79(5), 629.
126. Spence SH, Rapee R, McDonald C, & Ingram M. (2001). The structure of anxiety symptoms among preschoolers. Behaviour Research and Therapy, 39(11), 1293–1316.
127. Strauss C. (1987). Modification of trait portion of State-Trait Anxiety Inventory for Children-parent form. Gainesville, FL: University of Florida.
128. Thompson-Hollands J, Kerns CE, Pincus DB, & Comer JS (2014). Parental accommodation of child anxiety and related symptoms: Range, impact, and correlates. Journal of Anxiety Disorders, 28(8), 765–773.
129. Toscano R, Baillie AJ, Lyneham HJ, Kelly A, Kidd T, & Hudson JL (2019). Assessment of anxiety in children and adolescents: A comparative study on the validity and reliability of the Spence Children’s Anxiety Scale in children and adolescents with anxiety and autism spectrum disorder. Journal of Affective Disorders.
130. Van Meter AR, You DS, Halverson T, Youngstrom EA, Birmaher B, Fristad MA, . . . Frazier TW (2018). Diagnostic efficiency of caregiver report on the SCARED for identifying youth anxiety disorders in outpatient settings. Journal of Clinical Child & Adolescent Psychology, 47(sup1), S161–S175.
131. Varela RE, Sanchez-Sosa JJ, Biggs BK, & Luis TM (2008). Anxiety symptoms and fears in Hispanic and European American children: Cross-cultural measurement equivalence. Journal of Psychopathology and Behavioral Assessment, 30(2), 132–145.
132. Wang M, & Zhao J. (2015). Anxiety disorder symptoms in Chinese preschool children. Child Psychiatry & Human Development, 46(1), 158–166.
133. Watson D. (2004). Stability versus change, dependability versus error: Issues in the assessment of personality over time. Journal of Research in Personality, 38(4), 319–350.
134. Weems CF, Silverman WK, Saavedra LM, Pina AA, & Lumpkin PW (1999). The discrimination of children’s phobias using the Revised Fear Survey Schedule for Children. The Journal of Child Psychology and Psychiatry and Allied Disciplines, 40(6), 941–952.
135. Wei C, Hoff A, Villabø MA, Peterman J, Kendall PC, Piacentini J, . . . Rynn M (2014). Assessing anxiety in youth with the Multidimensional Anxiety Scale for Children. Journal of Clinical Child & Adolescent Psychology, 43(4), 566–578.
136. White SW, Lerner MD, McLeod BD, Wood JJ, Ginsburg GS, Kerns C, . . . Walkup J. (2015). Anxiety in youth with and without autism spectrum disorder: Examination of factorial equivalence. Behavior Therapy, 46(1), 40–53.
137. Whiteside SP, & Brown AM (2008). Exploring the utility of the Spence Children’s Anxiety Scales parent- and child-report forms in a North American sample. Journal of Anxiety Disorders, 22(8), 1440–1446. doi: 10.1016/j.janxdis.2008.02.006
138. Whiteside SP, Gryczkowski MR, Biggs BK, Fagen R, & Owusu D. (2012). Validation of the Spence Children’s Anxiety Scale’s obsessive compulsive subscale in a clinical and community sample. Journal of Anxiety Disorders, 26(1), 111–116.
139. Wood JJ, Piacentini JC, Bergman RL, McCracken J, & Barrios V. (2002). Concurrent validity of the anxiety disorders section of the Anxiety Disorders Interview Schedule for DSM-IV: Child and Parent Versions. Journal of Clinical Child and Adolescent Psychology, 31(3), 335–342.
140. Wren FJ, Berg EA, Heiden LA, Kinnamon CJ, Ohlson LA, Bridge JA, . . . Bernal MP (2007). Childhood anxiety in a diverse primary care population: Parent-child reports, ethnicity and SCARED factor structure. Journal of the American Academy of Child & Adolescent Psychiatry, 46(3), 332–340.
141. Wren FJ, Bridge JA, & Birmaher B. (2004). Screening for childhood anxiety symptoms in primary care: Integrating child and parent reports. Journal of the American Academy of Child & Adolescent Psychiatry, 43(11), 1364–1371.
142. Youngstrom EA, Salcedo S, Frazier TW, & Perez Algorta G. (2019). Is the finding too good to be true? Moving from “more is better” to thinking in terms of simple predictions and credibility. Journal of Clinical Child & Adolescent Psychology, 48(6), 811–824.
143. Youngstrom EA, Van Meter A, Frazier TW, Hunsley J, Prinstein MJ, Ong ML, & Youngstrom JK (2017). Evidence-based assessment as an integrative model for applying psychological science to guide the voyage of treatment. Clinical Psychology: Science and Practice, 24(4), 331–363.
144. Zilcha-Mano S, Shimshoni Y, Silverman WK, & Lebowitz ER (2020). Parent-child agreement on family accommodation differentially predicts outcomes of child-based and parent-based child anxiety treatment. Journal of Clinical Child & Adolescent Psychology, 1–13.
