Abstract
Objectives:
To compare the reliability and convergent validity of parent assessments from the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID—a structured diagnostic interview) and the Ontario Child Health Study Emotional Behavioural Scales (OCHS-EBS) symptom checklist for classifying conduct disorder (CD), conduct disorder or oppositional defiant disorder (CD-ODD), attention-deficit hyperactivity disorder (ADHD), major depressive disorder (MDD), generalized anxiety disorder (GAD), and separation anxiety disorder (SAD) based on DSM-5 criteria.
Methods:
Data came from 283 parent-youth dyads aged 9 to 18 years. Parents and youth completed the assessments separately on 2 different occasions 7 to 14 days apart. After converting the OCHS-EBS scale scores to binary disorder classifications, we compared test-retest reliability estimates and used structural equation modelling (SEM) to compare estimates of convergent validity for the same disorders assessed by each instrument.
Results:
Average test-retest reliabilities based on κ were 0.71 (MINI-KID) and 0.67 (OCHS-EBS). The average β coefficients for 3 latent measures comprising the following indicators—parent perceptions of youth mental health need and impairment, diagnosis of specific disorders based on health professional communications and youth taking prescribed medication, and youth classifications of disorder based on the MINI-KID—were 0.67 (MINI-KID) and 0.69 (OCHS-EBS).
Conclusion:
The OCHS-EBS and MINI-KID achieve comparable levels of reliability and convergent validity for classifying child psychiatric disorder. The flexibility, low cost, and minimal respondent burden of checklists for classifying disorder make them well suited for studying disorder in the general population and screening in clinical settings.
Keywords: symptom checklist, structured diagnostic interview, measurement, structural equation modelling, validity, reliability, child psychiatric disorder
Reliable, valid, and inexpensive instruments are needed to measure child and adolescent (youth) psychiatric disorder conceptualized as both dimensional and categorical (present or absent) phenomena for use in epidemiological studies in the general population and screening in clinical settings.1 The most common approaches used to measure youth disorders are structured and semistructured standardized diagnostic interviews (SDIs) and self-completed symptom checklists.2,3 Interviews focus on disorder as a categorical phenomenon, drawing on symptom and impairment criteria specified in the Diagnostic and Statistical Manual of Mental Disorders (DSM)4 to classify disorder. Checklists focus on mental problems as dimensional phenomena, drawing on empirical methods such as factor analysis to identify syndromes based on parent or youth ratings of problem behaviours.
SDIs are expensive and time-consuming to implement. For example, the fourth edition of the Diagnostic Interview Schedule for Children takes on average 70 minutes to complete for a nonclinic respondent (general population) and 90 to 120 minutes for a clinic respondent.5 To lessen the burden of response, most interviews use screening questions to skip respondents out of modules where they are likely to test negative.6 This strategy leads to information loss about psychiatric symptoms and an inability to construct dimensional measures of disorder applicable to all respondents. Checklists are brief, simple, and inexpensive to implement; pose little burden to respondents; and collect information on all symptoms. Choosing cut-points along the continuum of scale scores allows checklists to represent disorder categorically as well as dimensionally. Demonstrating comparable reliability and validity between checklists and SDIs would greatly expand our ability to study and screen for child psychiatric disorder in situations where SDIs would be too burdensome (general population studies, community child mental health centres).
Although SDIs have become the de facto gold standard for classifying youth psychiatric disorder,7 there are compelling arguments for expecting checklists to classify psychiatric disorder as reliably and validly as SDIs.8 Admittedly, the empirical studies9–14 are dated, few in number, and associated with some important limitations that include 1) relatively small samples, 2) lack of comparative data on the test-retest reliabilities of the instruments, 3) inattention to prevalence effects, 4) reliance on subjective interpretation of the numerical findings in the absence of formal empirical tests, and 5) failure to account for measurement error in comparing the validity of the instruments.
In this study, we conduct a direct comparison of the reliability and validity of the Ontario Child Health Study Emotional Behavioural Scales (OCHS-EBS)15 measuring conduct disorder (CD), conduct disorder or oppositional-defiant disorder (CD-ODD), attention-deficit hyperactivity disorder (ADHD), generalized anxiety disorder (GAD), separation anxiety disorder (SAD), and major depressive disorder (MDD) with the parent Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID-P—a structured diagnostic interview). (CD and ODD are combined because the MINI-KID skips respondents over ODD when youth test positive for CD.)16,17 We address the limitations and extend previous studies by 1) comparing the test-retest reliability of the 2 instruments for classifying disorder in the same time interval (7-14 days), 2) implementing formal empirical tests of convergent validity, 3) using structural equation modelling (SEM) with latent variables free of measurement error for the validity analysis, and 4) conducting a sensitivity analysis to determine the extent to which instrument differences in prevalence account for differences in convergent validity.
Methods
Participants
In total, 283 parent-youth dyads with youth aged 9 to 18 years (185 from the general population and 98 from a mental health outpatient clinic) participated. One parent and 5 complete parent-youth dyads (6 dyads in total, 2.1%) did not complete the retest interview and were removed from the reliability analysis.
General population participants
Youth in the general population were sampled from 4 elementary schools (grades 5-8) and 4 high schools (grades 9-12) enlisted by school board representatives. Students took home a study letter, a consent form to be completed by parents, and a 7-item screening questionnaire to be completed by parents of elementary students or by the students themselves, if attending high school (N = 4333). Students who returned signed parental consent agreeing to be contacted about the study and who completed the screening questionnaire (n = 1210) formed the eligible sample (27.9% response). The 7 items, identical for parents and youth, assessed students’ emotional, social, and academic functioning; they were scored from positive to negative and summed to produce a distribution of risk. Based on parent assessments (elementary school) or youth assessments (secondary school), youth were classified as high risk (top 10%), medium risk (11%-30%), or low risk (bottom 70%), sampled in equal numbers from each risk group, and invited to participate. Youth classified as high and medium risk were thereby oversampled to increase the number of participants likely to have disorder. Across the strata, 346 youth were sampled and 185 participated: 34.1% at low risk, 30.8% at medium risk, and 35.1% at high risk. Sampling weights were created based on the probability of youth being selected and participating within each stratum.
Mental health outpatient clinic participants
Families were eligible if they provided informed consent and had a youth aged 10 to 17 years who was at no immediate risk of self-harm or harm to others and exhibited no apparent developmental or learning problem such as autism or a learning disability. One university-based and 1 community-based children’s mental health centre contributed data to the clinic sample. In the university-based centre, 243 families seen at intake during the study period were deemed eligible; the research team contacted 129 of these families (53.1%), and 54 participated (22.2%). In the community-based centre, 158 families seen at intake during the study period were deemed eligible; the research team contacted all of these families, and 45 participated (28.5%).
Families were interviewed 7 to 14 days apart between December 2011 and December 2013. All study procedures, including consent and confidentiality requirements, were approved by the Hamilton Integrated Research Ethics Board at McMaster University, the Research Ethics Committees at the School Boards, and the clinics involved in the study.
Concepts and Measures
Measures of psychiatric disorder.
MINI-KID-P
The MINI-KID-P for parents and MINI-KID-Y for youth are SDIs that assess DSM-IV-TR disorders in youth aged 6 to 17 years.17 Validated against the Schedule for Affective Disorders and Schizophrenia for School Aged Children–Present and Lifetime Version, the MINI-KID has a 1- to 5-day test-retest reliability based on κ (>0.75) for all diagnoses identified in combined interviews with parents and youth.17 In our study, the interviews were administered separately: the MINI-KID-Y to youth and the MINI-KID-P to parents.15
OCHS-EBS
The scales used in this study were developed for the 2014 Ontario Child Health Study to provide both dimensional and categorical representations of child psychiatric disorder.15 They draw on items used in our previous studies,18,19 as well as new items judged by clinicians and researchers to approximate DSM-5 criteria. The reference period for assessing items is the past 6 months, and each one is scored 0, 1, or 2, indicating responses of ‘never or not true’, ‘sometimes or somewhat true’, and ‘often or very true’, respectively. The raw scores are summed to form a scale score to measure each disorder. The OCHS-EBS takes about 8 to 10 minutes for a parent to complete.
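For concreteness, the scoring rule can be sketched as follows. This is a minimal Python illustration with hypothetical item labels and a hypothetical 5-item subscale; the actual OCHS-EBS items and their disorder groupings are not reproduced here.

```python
# Minimal sketch of checklist scale scoring. Item wording and the mapping of
# items to disorders are hypothetical placeholders, not the OCHS-EBS content.
RESPONSE_CODES = {
    "never or not true": 0,
    "sometimes or somewhat true": 1,
    "often or very true": 2,
}

def scale_score(item_responses):
    """Sum the 0/1/2 response codes for one disorder's items into a scale score."""
    return sum(RESPONSE_CODES[response] for response in item_responses)

# Example: a parent's answers to a hypothetical 5-item subscale.
answers = [
    "never or not true",
    "often or very true",
    "sometimes or somewhat true",
    "never or not true",
    "sometimes or somewhat true",
]
print(scale_score(answers))  # 4
```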
Convergent validity indicators
Convergent validity is an approach to testing the validity of a measure by quantifying its strength of association with measures of similar constructs hypothesized to be linked theoretically. In this study, we compare the strength of association between the classifications of disorder based on the MINI-KID-P and OCHS-EBS with 3 highly related constructs measured as latent variables. Variable 1 is parental perceptions of their youth’s need for professional help with emotional or behavioural problems, in conjunction with impaired social or academic functioning—a general trait assessed in the past 6 months and hypothesized to underlie all types of psychiatric disorder.20 Variable 2 includes specific types of emotional-behavioural problems ever diagnosed by health care professionals or school personnel and communicated to parents, in conjunction with parental reports of their youth currently taking prescribed medication for the same problem. Variable 3 is youth classifications of the same disorders identified independently by the MINI-KID-Y.
Mental health need/impairment: parent ratings
Need for help: a binary indicator coded positive when respondents answered yes to both of the following questions: ‘During the last 6 months, do you think that ___ has had any emotional or behavioural problems?’ and ‘Do you think that ___ needs or needed professional help with these problems?’
Impaired social functioning: A summated rating scale of items scored from 1) very well, no problems to 5) not well at all, constant problems, in response to the stem question: ‘During the past 6 months, how well has ___ gotten along with…’ asking about a) other kids such as friends or classmates, b) teachers at school, and c) the family.
Impaired academic functioning: A single item scored from 1) excellent student to 5) poor student, constant problems in response to the question, ‘Which of the statements best describes how well ___ has done overall in subjects at school during the past 6 months?’
Health care diagnosed problems and use of prescription medication: parent report
Specific diagnoses of emotional-behavioural problems lifetime: Binary classifications derived from positive responses to the following question: ‘Have you ever been told by a teacher, school official, doctor, nurse or other health professional that ___ has a) anxiety; b) depression; c) attention problems; d) behavioural problems?’
Use of prescription medications currently: Binary classifications derived from positive responses to a stem question and follow-up questions: ‘Is ___ currently taking any prescribed medication?’ and ‘What does ___ take this medication for: Hyperactivity? Behavioural problem? Depression? Anxiety?’
Youth (cross-informant) classifications of the same disorders based on the MINI-KID-Y
Please see MINI-KID-P above.
Analyses
Converting OCHS-EBS scale scores to binary classifications of disorder
Scale scores on the OCHS-EBS were converted to binary measures of disorder independently of the MINI-KID-P at the thresholds matching the general population prevalence estimates for CD (2.1%), ODD (3.6%), ADHD (3.4%), and MDD (1.3%) identified among youth in a recent worldwide meta-analysis of prevalence studies.21 Prevalence estimates for GAD (1.8%) and SAD (1.9%) were taken from a different review.22 Each scale score was converted to a binary measure at the threshold closest to the prevalence of its corresponding disorder identified above. These threshold scores were determined in the weighted general population sample (see “General Population Participants”) and applied to all respondents.
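The threshold-setting logic can be sketched in Python as follows. The scores, weights, and target prevalence shown are hypothetical stand-ins; the published thresholds were derived on the weighted general population sample as described above.

```python
import numpy as np

def prevalence_matched_threshold(scores, weights, target_prevalence):
    """Return the scale-score cut-point whose weighted proportion of scores at
    or above it comes closest to the target population prevalence."""
    best_cut, best_gap = None, np.inf
    for cut in np.unique(scores):
        weighted_prev = weights[scores >= cut].sum() / weights.sum()
        gap = abs(weighted_prev - target_prevalence)
        if gap < best_gap:
            best_cut, best_gap = cut, gap
    return best_cut

# Hypothetical illustration: simulated scale scores and sampling weights for a
# general population sample, with a 2.1% target prevalence (as for CD).
rng = np.random.default_rng(0)
scores = rng.poisson(lam=2.0, size=185)
weights = rng.uniform(0.5, 2.0, size=185)
cut = prevalence_matched_threshold(scores, weights, target_prevalence=0.021)
print(cut)  # respondents scoring at or above this value are classified positive
```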
Prevalence, test-retest reliability, and cross-instrument agreement
The 6-month prevalence of disorders assessed by the MINI-KID-P and OCHS-EBS is expressed as a percentage. Test-retest reliability over a 1- to 2-week period and cross-instrument agreement are estimated by κ.23 Our sample size for reliability is 277 parent-youth dyads because 6 did not complete the retest. With type I error (α) set at 0.05 (2-tailed), the statistical power (1 – β) available in the study to detect a difference of |0.20| in κ between the MINI-KID-P and OCHS-EBS ranges from about 35% to 95%, depending on the test-positive rate (prevalence), which was expected to range from 0.04 to 0.26.
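As a reference for how the agreement statistic is computed, the following minimal Python sketch applies Cohen's formula to binary classifications from 2 occasions (or 2 instruments); the dyad data shown are invented for illustration.

```python
def cohens_kappa(x, y):
    """Cohen's kappa for two binary (0/1) classifications of the same
    respondents, e.g., time 1 vs. time 2 (test-retest) or MINI-KID-P vs.
    OCHS-EBS (cross-instrument agreement)."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n        # observed agreement
    p1_x, p1_y = sum(x) / n, sum(y) / n                     # test-positive rates
    expected = p1_x * p1_y + (1 - p1_x) * (1 - p1_y)        # chance agreement
    return (observed - expected) / (1 - expected)

# Toy example: 10 dyads classified positive (1) or negative (0) on 2 occasions.
time1 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
time2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(round(cohens_kappa(time1, time2), 2))  # 0.74
```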
Convergent validity
We used SEM to test for differences in convergent validity between the MINI-KID-P and OCHS-EBS classifications of disorder. SEM is a multivariate statistical technique with a measurement component—derivation of latent variable measures based on indicator variables—and a structural component—the specification of relationships among the latent variable measures. To remove temporal error, we construct latent variable measures of each disorder (dependent variables) for each instrument based on their assessments at each time point. We also construct latent variable measures of our convergent validity variables (independent variables). For 2 of the variables—youth mental health need/impairment and health care diagnosed problems and use of prescription medication—the latent variable measures are based on their time 1 indicators because these questions were not repeated at time 2. For classifications of the same disorders based on the MINI-KID-Y, we create latent variable measures of disorder based on their assessments at each time point as we did with the MINI-KID-P.
To compare the convergent validity of the MINI-KID-P and OCHS-EBS classifications of disorder, we specify separate regression models for each disorder and test for differences in the magnitude (β coefficients) and strength (explained variance) of association linking the MINI-KID-P and OCHS-EBS to each of the construct validity variables. Each model consists of 3 latent variable measures: one each for the 2 instruments (dependent variables) and one for the convergent validity variable (independent variable). Figure 1 illustrates the regression of the MINI-KID-P and OCHS-EBS classifications of ADHD on MINI-KID-Y classifications of ADHD.
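Although the models in this study were fitted in Mplus (described in the next paragraph), the structure of each Figure 1–style model can be sketched in lavaan-style syntax using the open-source semopy package. The variable names below are hypothetical, and the sketch treats the binary indicators as continuous, so it illustrates the model specification rather than reproducing the categorical-indicator estimation used in the paper.

```python
import pandas as pd
import semopy

# Latent ADHD classifications for each instrument are measured by their time 1
# and time 2 indicators; both are regressed on the latent MINI-KID-Y measure.
# Column names (mini_p_t1, ..., mini_y_t2) are hypothetical placeholders.
MODEL_DESC = """
MINI_P_ADHD =~ mini_p_t1 + mini_p_t2
OCHS_ADHD   =~ ochs_t1 + ochs_t2
MINI_Y_ADHD =~ mini_y_t1 + mini_y_t2
MINI_P_ADHD ~ MINI_Y_ADHD
OCHS_ADHD   ~ MINI_Y_ADHD
"""

def fit_convergent_validity_model(data: pd.DataFrame):
    """Fit the two-outcome regression model and return parameter estimates."""
    model = semopy.Model(MODEL_DESC)
    model.fit(data)
    return model.inspect()  # loadings, regression coefficients, residual variances

# Usage (assuming `df` holds the six indicator columns):
# estimates = fit_convergent_validity_model(df)
# print(estimates)
```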
We used Mplus (version 7.4)24 to develop separate SEMs for each disorder. Mplus offers a generalized measurement component that allows dichotomous and ordered categorical variables (indicators) in the derivation of latent variable measures.25 Adequate model fit was defined as values ≥0.98 for the comparative fit index (CFI, range 0 to 1.0), ≤0.05 for the root mean squared error of approximation (RMSEA), and a nonsignificant model fit χ2. The Wald statistic follows the Student t distribution.26 Using maximum likelihood to handle missing retest information, our sample size for validity is 283. With type I error (α) set at 0.05 (2-tailed), power depends on the effect size (difference between βMINI-KID-P and βOCHS-EBS) and the standard deviation of the regression errors. In our study, these errors varied at the extremes between 0.38 and 0.95 (standardized), so effect size differences ranging from about 0.07 to 0.24 can be reliably identified with 80% power.
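The parameter-equality tests reported below were carried out within the fitted models (Mplus's Wald test of parameter constraints). Purely as an illustration of the underlying 1-df statistic, the Python sketch below computes a Wald χ2 for H0: bMINI-KID-P = bOCHS-EBS from hypothetical coefficient estimates and their sampling (co)variances.

```python
from scipy.stats import chi2

def wald_equality_test(b1, b2, var_b1, var_b2, cov_b1_b2=0.0):
    """1-df Wald chi-square for H0: b1 = b2, given two unstandardized
    coefficients and the variances/covariance of their estimates (taken, in
    practice, from the fitted model's parameter covariance matrix)."""
    diff = b1 - b2
    var_diff = var_b1 + var_b2 - 2.0 * cov_b1_b2
    statistic = diff ** 2 / var_diff
    return statistic, chi2.sf(statistic, df=1)

# Hypothetical values for illustration only.
stat, p_value = wald_equality_test(b1=0.95, b2=0.70, var_b1=0.010,
                                   var_b2=0.012, cov_b1_b2=0.004)
print(f"Wald chi-square = {stat:.2f}, P = {p_value:.3f}")  # ~4.46, ~0.035
```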
Finally, we conducted a sensitivity analysis to determine if statistically significant differences in convergent validity between the MINI-KID-P and OCHS-EBS were influenced by our approach to setting thresholds for classification. To do this, we re-ran the original SEM analysis after re-setting the OCHS-EBS thresholds for classifying disorder to align with the prevalence estimates observed for the MINI-KID-P in our general population sample.
Results
The sample characteristics and distribution of the construct validity indicators appear in Table 1. There are fewer males (43.5%) than females. The average age of youth is 14.8 (SD = 2.3) years.
Table 1.
Sample Characteristics | % | Mean (SD)
---|---|---
Youth | |
Male | 43.5 |
Mean age, y | | 14.8 (2.3)
Born outside Canada | 8.5 |
Parent | |
Birth mother | 83.0 |
Mean age, y | | 44.7 (6.8)
Born outside Canada | 19.1 |
≤ Secondary education | 25.8 |
Family | |
Lone parent | 29.3 |
Mean income, $000s | | 72.9 (39.6)
Convergent validity indicators | |
Mental health need/impairment | |
Need for help | 42.4 |
Mean impaired academic functioning | | 2.37 (1.08)
Mean impaired social functioning | | 6.32 (2.40)
Diagnostic groupings | |
Specific mental health problems | |
a) Behaviour | 25.8 |
b) Attention | 36.7 |
c) Depression | 21.2 |
d) Anxiety | 36.4 |
Prescribed medication | |
a) Behaviour | 8.5 |
b) Hyperactivity | 12.4 |
c) Depression | 9.5 |
d) Anxiety | 13.8 |
Youth Psychiatric Disorder, MINI-KID | |
CD | 2.5 |
CD-ODD | 11.7 |
ADHD | 4.2 |
MDD | 15.9 |
GAD | 12.4 |
SAD | 3.2 |
ADHD, attention deficit hyperactivity disorder; CD, conduct disorder; CD-ODD, conduct disorder or oppositional defiant disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; MINI-KID, Mini International Neuropsychiatric Interview for Children and Adolescents; SAD, separation anxiety disorder.
In Table 2, the weighted prevalence of disorders identified by the OCHS-EBS approximates the population estimates on which they are based and is similar to the MINI-KID-P except for MDD, where the prevalence is 6.9% versus 2.0% for the MINI-KID-P and OCHS-EBS, respectively. The unweighted prevalence estimates for the general population and clinic samples combined are higher for CD and ADHD based on the OCHS-EBS and for CD-ODD, GAD, and MDD based on the MINI-KID-P.
Table 2.
Disorder | Weighted Prevalence,a % (n = 185): MINI-KID-P | Weighted Prevalence,a % (n = 185): OCHS-EBS | Unweighted Prevalence, % (n = 283): MINI-KID-P | Unweighted Prevalence, % (n = 283): OCHS-EBS | Test-Retest Reliability, κ (SE) (n = 277): MINI-KID-P | Test-Retest Reliability, κ (SE) (n = 277): OCHS-EBS | MINI-KID-P/OCHS-EBS Agreement, κ (SE) (n = 283)
---|---|---|---|---|---|---|---
CD | 0.7 | 1.7 | 6.4 | 13.1 | .67 (.10) | .65 (.08) | .46 (.09) |
CD-ODD | 3.6 | 4.1 | 26.5 | 18.4 | .77 (.04) | .73 (.06) | .59 (.06) |
ADHD | 2.7 | 2.4 | 9.5 | 18.4 | .77 (.07) | .71 (.06) | .49 (.07) |
GAD | 3.1 | 2.1 | 19.8 | 9.5 | .75 (.05) | .56 (.09) | .46 (.07) |
MDD | 6.9 | 2.0 | 20.1 | 7.1 | .67 (.06) | .62 (.09) | .38 (.07) |
SAD | 1.3 | 2.7 | 4.6 | 12.4 | .60 (.12) | .72 (.07) | .38 (.09) |
ADHD, attention deficit hyperactivity disorder; CD, conduct disorder; CD-ODD, conduct disorder or oppositional defiant disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; MINI-KID-P, Mini International Neuropsychiatric Interview for Children and Adolescents Parent Version; SAD, separation anxiety disorder.
a Based on general population sample responses weighted inversely to their probability of being selected. All other estimates based on combined, unweighted general and clinic population sample response.
Differences between instruments in test-retest reliability based on κ are ≤|0.10| for all of the disorders except GAD (0.19 higher for the MINI-KID-P) and SAD (0.12 higher for the OCHS-EBS), and none of the differences are statistically significant. Based on κ, agreement between instruments on the classifications of disorder ranges from 0.38 (MDD, SAD) to 0.59 (CD-ODD).
Table 3 shows the SEM results used to test equivalence in the convergent validity of the instruments based on the construct validity variables. Each line represents a separate SEM. All models provide excellent fit based on the CFI (all ≥0.98) and RMSEA (all ≤0.05) (not shown), and all model fit χ2 values are nonsignificant at P > 0.05.
Table 3.
Covariate/Disorder | βMINI-KID-P (SE) | βOCHS-EBS (SE) | Model Fit, χ2 (df) [P Value] | Wald χ2 (P Value),a b | Wald χ2 (P Value),a Residual
---|---|---|---|---|---
Mental health need/impairment | |||||
CD | .73 (.08) | .85 (.05) | 16.03 (11) [.14] | 3.69 (.06)** | 1.66 (.20) |
CD-ODD | .93 (.03) | .92 (.04) | 12.99 (11) [.29] | 0.16 (.69) | 0.15 (.70) |
ADHD | .76 (.07) | .90 (.05) | 12.39 (11) [.34] | 1.77 (.18) | 4.32 (.04)* |
MDD | .82 (.05) | .79 (.09) | 17.68 (11) [.09] | 0.03 (.86) | 0.18 (.67) |
GAD | .79 (.05) | .65 (.08) | 18.87 (11) [.06] | 5.84 (.02)* | 1.01 (.32) |
SAD | .60 (.10) | .55 (.08) | 15.81 (11) [.15] | 0.42 (.52) | 0.01 (.93) |
Diagnostic groupings | |||||
CD | .63 (.11) | .65 (.10) | 4.11 (6) [.66] | 0.07 (.80) | 0.00 (.95) |
CD-ODD | .81 (.06) | .70 (.07) | 4.62 (6) [.59] | 1.57 (.21) | 2.78 (.09)** |
ADHD | .87 (.06) | .88 (.05) | 2.41 (6) [.88] | 0.33 (.56) | 0.21 (.65) |
MDD | .77 (.06) | .82 (.07) | 6.46 (7) [.49] | 0.65 (.42) | 0.19 (.67) |
GAD | .76 (.06) | .76 (.08) | 5.59 (8) [.69] | 0.03 (.87) | 0.02 (.90) |
SAD | .45 (.13) | .51 (.09) | 12.27 (8) [.72] | 0.22 (.64) | 0.13 (.72) |
Youth-identified psychiatric disorder | |||||
CD | .41 (.08) | .74 (.07) | 2.53 (7) [.92] | 8.06 (.01)* | 4.33 (.04)* |
CD-ODD | .57 (.09) | .70 (.08) | 2.68 (6) [.85] | 2.88 (.09)** | 1.40 (.24) |
ADHD | .54 (.14) | .65 (.12) | 3.99 (6) [.68] | 0.42 (.52) | 2.08 (.15) |
MDD | .61 (.08) | .66 (.08) | 4.80 (6) [.57] | 0.47 (.49) | 0.14 (.71) |
GAD | .58 (.09) | .41 (.12) | 7.32 (6) [.29] | 3.73 (.05)** | 0.14 (.70) |
SAD | .38 (.16) | .16 (.14) | 9.03 (8) [.34] | 5.65 (.02)* | 0.03 (.87) |
ADHD, attention deficit hyperactivity disorder; CD, conduct disorder; CD-ODD, conduct disorder or oppositional defiant disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; MINI-KID-P, Mini International Neuropsychiatric Interview for Children and Adolescents Parent Version; SAD, separation anxiety disorder.
a Wald χ2 (1 df): estimated loss of fit associated with constraining the unstandardized b coefficients and residual variance to be equal for the MINI-KID-P and OCHS-EBS.
*P < 0.05. **P ≥ 0.05, ≤ 0.10.
Among the 18 SEMs in Table 3, the β coefficients are numerically larger for the MINI-KID-P in 7 comparisons, larger for the OCHS-EBS in 10 comparisons, and identical in 1 comparison. The average β coefficients for the interview and checklist are 0.67 and 0.69, respectively. The Wald tests of parameter constraints (1 df) indicate that constraining the unstandardized b coefficients or residual variances to be equal led to a statistically significant loss of fit (χ2 ≥ 3.86, P < 0.05) in 5 comparisons and a marginally significant loss of fit (P ≥ 0.05, <0.10) in 4 comparisons. Although there is no obvious pattern of between-instrument differences, the MINI-KID-P appears stronger in measuring GAD, while the OCHS-EBS may be stronger in measuring CD.
Table 4 shows the effect of resetting the checklist thresholds to align with the prevalence of disorder observed for the MINI-KID-P. This analysis is restricted to the 8 disorder models associated with a significant or marginally significant loss of fit. In comparison with βOCHS-EBS in Table 3, βOCHS-EBS in Table 4 converges towards βMINI-KID-P in all instances. With two exceptions, CD-ODD (diagnostic groupings) and CD (youth-identified psychiatric disorder), all of the significant differences in Table 3 were rendered statistically nonsignificant.
Table 4.
Covariate/Disorder | βMINI-KID-P (SE) | βOCHS-EBS (SE) | Model Fit, χ2 (df) [P Value] | Wald χ2 (P Value),a b | Wald χ2 (P Value),a Residual
---|---|---|---|---|---
Mental health need/impairment | |||||
CD | .73 (.08) | .83 (.08) | 13.32 (11) [.14] | 0.80 (.37) | 1.32 (.25) |
ADHD | .76 (.07) | .83 (.08) | 13.02 (11) [.29] | 0.15 (.69) | 0.79 (.38) |
GAD | .79 (.05) | .78 (.06) | 10.17 (11) [.52] | 1.52 (.22) | 0.17 (.68) |
Diagnostic groupings | |||||
CD-ODD | .81 (.06) | .71 (.07) | 2.72 (6) [.84] | 1.03 (.31) | 4.24 (.04)* |
Youth-identified psychiatric disorder | |||||
CD | .37 (.12) | .72 (.10) | 4.66 (7) [.70] | 4.67 (.03)* | 3.23 (.07)** |
CD-ODD | .57 (.09) | .65 (.08) | 11.85 (7) [.11] | 0.96 (.33) | 0.64 (.42) |
GAD | .58 (.09) | .56 (.10) | 1.55 (6) [.96] | 0.57 (.45) | 0.51 (.48) |
SAD | .38 (.16) | .21 (.18) | 8.12 (8) [.42] | 1.73 (.19) | 0.26 (.61) |
ADHD, attention deficit hyperactivity disorder; CD, conduct disorder; CD-ODD, conduct disorder or oppositional defiant disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; MINI-KID-P, Mini International Neuropsychiatric Interview for Children and Adolescents Parent Version; SAD, separation anxiety disorder.
a Wald χ2 (1 df): estimated loss of fit associated with constraining the unstandardized b coefficients and residual variance to be equal for the MINI-KID-P and OCHS-EBS.
*P < 0.05. **P ≥ 0.05, ≤ 0.10.
Discussion
This study indicates that a self-administered problem checklist can achieve the same levels of reliability and convergent validity for classifying youth psychiatric disorder as a structured SDI. These findings are consistent with the small number of investigations done in the 1990s that examined the construct validity of interviews and checklists for classifying child psychiatric disorder.9–14
There are important challenges associated with comparing the psychometric properties of SDIs and checklists. The first challenge arises from limits to our understanding about the nature of child psychopathology. In the absence of criterion measures, we rely on construct validity indicators to assess the comparative validity and usefulness of these instruments. These indicators should be theoretically important, empirically supported (reliable and valid), and ‘independent’ of the specific questions/items and methods making up the competing instruments. Although our third approach to convergent validity—the use of cross-informant classifications of disorder based solely on the MINI-KID-Y—violates one of these dictums, we believe that the absence of between-instrument differences in their strength of association with MINI-KID-Y classifications of disorder is strong evidence of their equivalence.
The second challenge arises from selecting checklist thresholds for identifying disorders. To ensure independence of the MINI-KID-P, we aligned our checklist thresholds with population prevalence estimates from a meta-analysis. The few convergent validity advantages of either instrument can be traced to differences in prevalence, which, in turn, can be traced to differences in reliability, particularly at the extremes of prevalence.27 A recent study of 3 SDIs yielded prevalence estimates of 1+ disorders in the same respondents of 47.1%, 32.4%, and 17.7%,28 illustrating that the challenge of selecting thresholds is not unique to checklists.
The third challenge focuses on a trio of methodological concerns arising from sampling and response, statistical power, and measurement error. One, comparative studies should be done separately in clinical and general population samples—a requirement far beyond our funding capacity. Our sampling strategy reflected a desire to represent the general population while ensuring that there would be enough youth classified with disorder to conduct meaningful convergent validity analyses. Without access to information on nonrespondents, we cannot evaluate the representativeness of our samples. If there are selection factors at work in our study, they would need to exert a differential effect on reliability and validity across instruments, which seems unlikely to us. Two, large samples are needed to have adequate statistical power for comparing the psychometric properties of different instruments measuring the same traits.29 Although our sample is large compared with other studies, it is still limited. Furthermore, power in a given study will vary across disorders because of differences in their prevalence. Three, uncontrolled measurement error is a serious threat when comparing the validity of measurement instruments. The use of SEM to remove measurement error in the classification of disorder and measurement of convergent validity variables substantially enhanced our ability to conduct meaningful tests. However, the use of SEM did not fully overcome the effects of prevalence differences on associations between the convergent validity variables and disorders. Not taking prevalence differences into account could lead one to believe mistakenly that the validity and usefulness of alternative instruments for classification might depend on the type of disorder being assessed.
Checklists Versus Interviews
In this article, we focus exclusively on the measurement objective of classifying disorder for epidemiological studies in the general population and for screening in clinical settings, in a head-to-head comparison of a symptom checklist with a structured SDI. In clinical settings, semistructured SDIs serve the broader diagnostic objectives of engaging patients and formulating an intervention plan, a process that depends on years of clinical training and experience. Although checklists can contribute to this process through screening, they cannot substitute for it.
Over the past 30 years, substantial resources have gone into the development of structured SDIs, fostering a belief in their superiority. At the same time, there is a willingness to overlook the differences in prevalence estimates they produce from the same diagnostic criteria28 and the fact that the overall test-retest reliability of SDIs is modest at best (κ = 0.58; 95% CI, 0.53 to 0.63) and highly variable across studies.30 Given the striking differences in cost and burden between structured SDIs and checklists, it is surprising how little research has been directed towards examining their relative scientific merits. In our view, carefully developed symptom checklists can substitute for structured SDIs and provide an effective way to measure child and youth psychiatric disorder as both categorical and dimensional phenomena. Studies addressing this question are urgently needed to provide researchers and clinicians with an appropriate evidence base for making cost-effective decisions about using checklists or SDIs to classify youth disorder in epidemiological studies and to screen in clinical practice.
Footnotes
Data Access: Data access available upon request, with appropriate ethics approval.
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by research operating grant FRN111110 from the Canadian Institutes of Health Research (CIHR). Dr. Boyle was supported by CIHR Canada Research Chair in the Social Determinants of Child Health, Dr. Georgiades by a CIHR New Investigator Award and the David R. (Dan) Offord Chair in Child Studies, Dr. Ferro by a CIHR Canada Research Chair in Youth Mental Health and a Research Early Career Award from Hamilton Health Sciences, and Dr. MacMillan by the Chedoke Health Chair in Child Psychiatry.
ORCID iD: Laura Duncan, MA https://orcid.org/0000-0001-7120-6629
References
- 1. Coghill D, Sonuga-Barke EJS. Annual research review: categories versus dimensions in the classification and conceptualisation of child and adolescent mental disorders—implications of recent empirical study. J Child Psychol Psychiatry. 2012;53(5):469–489.
- 2. Angold A. Diagnostic interviews with parents and children. In: Rutter M, Taylor E, eds. Child and adolescent psychiatry. 4th ed. Oxford (UK): Blackwell; 2002. p. 32–51.
- 3. Verhulst FC, Van der Ende J. Rating scales. In: Rutter M, Taylor E, eds. Child and adolescent psychiatry. 4th ed. Oxford (UK): Blackwell; 2002. p. 70–86.
- 4. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Arlington (VA): American Psychiatric Publishing; 2013.
- 5. Shaffer D, Fisher P, Lucas CP, et al. National Institute of Mental Health Diagnostic Interview Schedule for Children Version IV (NIMH DISC-IV): description, differences from previous versions, and reliability of some common diagnoses. J Am Acad Child Adolesc Psychiatry. 2000;39(1):28–38.
- 6. Kessler RC, Wittchen HU, Abelson JM, et al. Methodological studies of the Composite International Diagnostic Interview (CIDI) in the US national comorbidity survey. Int J Methods Psychiatr Res. 1998;7(1):33–55.
- 7. Rettew DC, Lynch AD, Achenbach TM, et al. Meta-analyses of agreement between diagnoses made from clinical evaluations and standardized diagnostic interviews. Int J Methods Psychiatr Res. 2009;18(3):169–184.
- 8. Boyle MH, Duncan L, Georgiades K, et al. Classifying child and adolescent psychiatric disorder by problem checklists and standardized interviews. Int J Methods Psychiatr Res. 2016;26(4):e1544.
- 9. Boyle MH, Offord DR, Racine Y, et al. Interviews versus checklists: adequacy for classifying childhood psychiatric disorder based on adolescent reports. Int J Methods Psychiatr Res. 1996;6:309–319.
- 10. Boyle MH, Offord DR, Racine Y, et al. Adequacy of interviews versus checklists for classifying childhood psychiatric disorder based on parent reports. Arch Gen Psychiatry. 1997;54(9):793–797.
- 11. Dirks MA, Boyle MH. The comparability of mother-report structured interviews and checklists for the quantification of youth externalizing symptoms. J Child Psychol Psychiatry. 2010;51(9):1040–1049.
- 12. Gould MS, Bird H, Jaramillo S. Correspondence between statistically derived behavior problem syndromes and child psychiatric diagnoses in a community sample. J Abnorm Child Psychol. 1993;21(3):287–313.
- 13. Jensen PS, Watanabe HK, Richters JE, et al. Scales, diagnosis and child psychopathology, II: comparing the CBCL and the DISC against external validators. J Abnorm Child Psychol. 1996;24(2):151–168.
- 14. Jensen PS, Watanabe HK. Sherlock Holmes and child psychopathology assessment approaches: the case of the false-positive. J Am Acad Child Adolesc Psychiatry. 1999;38(2):138–146.
- 15. Duncan L, Georgiades K, Wang L, et al. The 2014 Ontario Child Health Study Emotional Behavioural Scales (OCHS-EBS) Part I: a checklist for dimensional measurement of selected DSM-5 disorders. Can J Psychiatry. 2018;64:423–433.
- 16. Duncan L, Georgiades K, Wang L, et al. Psychometric evaluation of the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID). Psychol Assess. 2017;30(7):916–928.
- 17. Sheehan DV, Sheehan KH, Shytle RG, et al. Reliability and validity of the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID). J Clin Psychiatry. 2010;71(3):313–326.
- 18. Boyle M, Offord DR, Hofmann HF, et al. Ontario Child Health Study, I: methodology. Arch Gen Psychiatry. 1987;44(9):826–831.
- 19. Boyle MH, Offord DR, Racine YA, et al. Evaluation of the revised Ontario Child Health Study Scales. J Child Psychol Psychiatry. 1993;34(2):189–213.
- 20. Wittchen HU, Nelson CB, Lachner G. Prevalence of mental disorders and psychosocial impairments in adolescents and young adults. Psychol Med. 1998;28(1):109–126.
- 21. Polanczyk GV, Salum GA, Sugaya LS, et al. Annual research review: a meta-analysis of the worldwide prevalence of mental disorders in children and adolescents. J Child Psychol Psychiatry. 2015;56(3):345–365.
- 22. Merikangas KR, Nakamura EF, Kessler RC. Epidemiology of mental disorders in children and adolescents. Dialogues Clin Neurosci. 2009;11(1):7–20.
- 23. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.
- 24. Muthén LK, Muthén BO. Mplus user’s guide. 6th ed. Los Angeles (CA): Muthén & Muthén; 2016.
- 25. Muthen B. A general structural equation model with dichotomous, ordered categorical and continuous latent variable indicators. Psychometrika. 1984;4:115–132.
- 26. Dupont WD, Plummer WD. Power and sample size calculations for studies involving linear regression. Control Clin Trials. 1998;19(6):589–601.
- 27. Shrout P. Measurement reliability and agreement in psychiatry. Stat Methods Med Res. 1998;7(3):301–317.
- 28. Angold A, Erkanl A, Copeland W, et al. Psychiatric diagnostic interviews for children and adolescents: a comparative study. J Am Acad Child Adolesc Psychiatry. 2012;38:506–517.
- 29. Lin HM, Williamson JM, Lipsitz SR. Calculating power for the comparison of dependent κ-coefficients. J R Stat Soc Ser C Appl Stat. 2003;52:391–404.
- 30. Duncan L, Comeau J, Wang L, et al. Research review: test-retest reliability of standardized diagnostic interviews to assess child and adolescent psychiatric disorders: a systematic review and meta-analysis. J Child Psychol Psychiatry. 2018 Feb 19 [Epub ahead of print].