This study evaluates consistency between clinical diagnosis and diagnosis incorporating the Autism Diagnostic Observation Schedule and examines clinician and child factors that predict consistency between index and reference standard diagnoses.
Key Points
Question
What is the role of the Autism Diagnostic Observation Schedule (ADOS) for diagnosis of autism spectrum disorder (ASD) in young children?
Findings
In this diagnostic study of 349 children ages 18 months to 5 years, 11 months, there was 90.0% agreement between index diagnoses (ie, clinical diagnosis) and reference standard diagnoses (ie, diagnosis including information from ADOS). Clinician diagnostic certainty was the best predictor of consistency between index diagnoses and reference standard diagnoses.
Meaning
The ADOS is not required for ASD diagnosis in young children; specialist clinicians can identify children for whom the ADOS may contribute to accurate diagnosis.
Abstract
Importance
Autism spectrum disorder (ASD) affects 1 in 44 children. The Autism Diagnostic Observation Schedule (ADOS) is a semi-structured observation developed for use in research but is considered a component of gold standard clinical diagnosis. The ADOS adds time and cost to diagnostic assessments.
Objective
To evaluate consistency between clinical diagnosis (index ASD diagnosis) and diagnosis incorporating the ADOS (reference standard ASD diagnosis) and to examine clinician and child factors that predict consistency between index diagnoses and reference standard diagnoses.
Design, Setting, and Participants
This prospective diagnostic study was conducted between May 2019 and February 2020. Developmental-behavioral pediatricians (DBPs) made a diagnosis based on clinical assessment (index ASD diagnosis). The ADOS was then administered, after which the DBP made a second diagnosis (reference standard ASD diagnosis). DBPs self-reported diagnostic certainty at the time of the index diagnoses and reference standard diagnoses. The study took place at 8 sites (7 US and 1 European) that provided subspecialty assessments for children with concerns for ASD. Participants included children aged 18 months to 5 years, 11 months, without a prior ASD diagnosis, consecutively referred for possible ASD. Among 648 eligible children, 23 refused, 376 enrolled, and 349 completed the study. All 40 eligible DBPs participated.
Exposures
ADOS administered to all child participants.
Main Outcomes and Measures
Index diagnoses and reference standard diagnoses of ASD (yes/no).
Results
Among the 349 children (279 [79.9%] male; mean [SD] age, 39.9 [13.4] months), index diagnoses and reference standard diagnoses were consistent for 314 (90%) (ASD = 232; not ASD = 82) and changed for 35. Clinician diagnostic certainty was the most sensitive and specific predictor of diagnostic consistency (area under curve = 0.860; P < .001). In a multilevel logistic regression, no child or clinician factors improved prediction of diagnostic consistency based solely on clinician diagnostic certainty at time of index diagnosis.
Conclusions and Relevance
In this prospective diagnostic study, clinical diagnoses of ASD by DBPs with vs without the ADOS were consistent in 90.0% of cases. Clinician diagnostic certainty predicted consistency of index diagnoses and reference standard diagnoses. This study suggests that the ADOS is generally not required for diagnosis of ASD in young children by DBPs and that DBPs can identify children for whom the ADOS may be needed.
Introduction
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication and the presence of restricted and repetitive behaviors.1 The increasing prevalence of ASD, estimated at 1 in 44 eight-year-old children, is an acknowledged public health crisis.2 Early, intensive treatment offers the best hope for improved function and outcomes.3 ASD is diagnosed according to behavioral criteria from the Diagnostic and Statistical Manual of Mental Disorders (Fifth Edition) (DSM-5), including deficits in social communication and social behavior (A criteria) and presence of restricted, repetitive behaviors or interests (B criteria).1 Diagnosis of ASD by a health care professional is required for a child to access treatment through insurance or government-funded early intervention and school programs, depending on the state of residence.4,5 Wait lists for diagnostic services are often many months long, and there are significant disparities in the age of diagnosis by race, ethnicity, socioeconomic status, and geography.2,4,6,7,8,9,10
The Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) is a semistructured observation that allows examiners to observe behaviors relevant to ASD.11,12 The ADOS was developed to establish ASD status among research patients, and studies of its accuracy have primarily been conducted in research settings.13,14 The ADOS takes 45 to 60 minutes to administer, requires additional time to score and interpret, and requires specialized training. According to the ADOS manual, “…clinical judgment should overrule the ADOS-2 Classification in achieving a best-estimate clinical diagnosis.”11 Despite limited research in clinical settings, the ADOS is considered by some to be a component of a gold standard ASD clinical diagnosis and is often required for access to early intervention, school-based intervention, and intensive behavioral treatment.2,4,5,6,7,15,16 The use of semistructured ASD assessments (including the ADOS) varies from 12.9% to 100% across 10 sites in the Developmental-Behavioral Pediatric Research Network (DBPNet).17
Given the prevalence of ASD, limited data on the clinical use of the ADOS, frequent requirement for ADOS results to access treatment, importance of early diagnosis, long wait lists, and disparities in age of diagnosis, it is critically important to evaluate both the accuracy and efficiency of different diagnostic approaches.18 Therefore, this multisite prospective diagnostic study was conducted to evaluate the consistency between clinical ASD diagnosis (index diagnosis) and diagnosis that incorporates results of the ADOS (reference standard diagnosis). Factors that predict consistency between the index and reference standard diagnoses were examined.
Methods
Study Design and Participants
This prospective diagnostic study collected data prior to completion of index diagnoses and reference standard diagnoses for ASD. The Standards for Reporting of Diagnostic Accuracy Studies (STARD) reporting guidelines were followed in this report. The study was approved by the institutional review boards of the Children’s Hospital of Philadelphia (DBPNet coordinating center) and the 8 participating sites (7 DBPNet and 1 Austrian site), each of which provides subspecialty assessments for children with ASD. The study was completed during the course of routine clinical care and included children from ages 18 months to 5 years, 11 months consecutively referred for possible ASD between May 2019 and February 2020. Written informed consent was obtained from the parent or legal guardian of each participant. Children were excluded if they had a prior diagnosis of ASD established through a multidisciplinary team assessment or the child was non-English speaking (non-German speaking at the Austrian site). Board eligible or certified Developmental-Behavioral Pediatricians (DBPs) who completed diagnostic assessments were also considered participants in this study and consented to participate. At the Austrian site, physicians were specialists trained to diagnose ASD.
Study Procedure
Screening Phase
Prior to enrollment of children, all participating DBPs completed a demographic form. Eligible children were identified based on a review of physician schedules and medical records for age, reason for referral, and confirmation of no prior ASD diagnosis (Figure).
Child Assessment Phase 1 and Completion of Index Diagnosis
A demographic form was completed including visit date, age, parent-reported race and ethnicity, insurance type, primary caregiver education level, and primary reason for referral. DBPs had access to clinical intake forms that included information about presenting concerns and referring clinician questions, any prior assessments, and, when available, results of the Modified Checklist for Autism in Toddlers-Revised screening tool.19 Each child was evaluated by a DBP who completed a medical and developmental history, physical examination, and clinical observation form developed for this study that included a DSM-5 ASD symptom checklist, DSM-5 severity ratings for A and B criteria, and visit characteristics (total time spent by the diagnosing clinician directly observing and/or interacting with the child, number of visits to complete assessment). The presence of significant aggressive, hyperactive, or inattentive/distractible behaviors was recorded. All child participants had a clinical evaluation determined by the usual approach and assessment tools at each site, including cognitive/developmental, language, adaptive, and social skills measures.
At the conclusion of phase 1, the DBP made and recorded a diagnosis of ASD yes or ASD no. This constituted the index diagnosis (clinical diagnosis). The DBPs self-rated the degree of certainty of their diagnosis on a 10-point Likert scale with poles labeled as follows: 1 indicated low (not at all certain) and 10 indicated high (very certain).
Child Assessment Phase 2 and Completion of Reference Standard Diagnosis
During phase 2, the ADOS-2 or Autism Diagnostic Observation Schedule-Toddler Module (ADOS-T) was administered to each child by clinical staff trained to administer the ADOS in a clinically reliable manner.11,12,20 The ADOS-2 yields 3 classifications (nonspectrum, spectrum, autism). The ADOS was required for this study and was therefore considered to be a study procedure. The DBP was provided with the results of the ADOS-2 or ADOS-T and, on the basis of the ADOS plus information from phase 1, the DBP again made a diagnosis of ASD-yes or ASD-no and self-rated their degree of diagnostic certainty. The diagnosis based on the combination of all clinically obtained data plus the results of the ADOS-2 or ADOS-T constituted the final, reference standard diagnosis (clinical diagnosis including ADOS). Index diagnoses and reference standard diagnoses were completed on the same day, except at 2 sites, at which the ADOS-2 and ADOS-T were completed several days to weeks later due to site logistics.
Analyses
Frequencies and descriptive statistics describe participant and site characteristics, as well as the assessment results by site (eMethods in the Supplement). The intended sample size for univariate analyses was based on a power calculation for logistic regression to identify factors associated with differences between index diagnoses and reference standard diagnoses. Using sex as an example, a sample size of 300 observations (75% male) achieves 80% power at an α level of .05 (2-tailed test). Power calculations used a 2-tailed α of 5% throughout. The exception was the multivariate, multilevel logistic model, in which the significance of a predictor, and hence its entry into the model, depends on the amount of variance it shares with other predictors. Thus, to avoid excluding meaningful predictors when building the multivariate model, a 1-tailed test was the criterion for a predictor’s inclusion in the model.
Multilevel multivariate logistic regression was used to compare index diagnoses and reference standard diagnoses.21 The model was estimated using full maximum likelihood estimation using 300 Laplace iterations. Deviance statistics were used across nested models to justify the inclusion of additional predictors based on their contribution to model fit (eMethods in the Supplement).
Power of the logistic regression coefficients was estimated so that an odds ratio (OR) of 2.0 would be significant 80% of the time using a nominal α level of 5% and a 2-tailed test.22,23 Given the observed ratio of consistency/inconsistency between index diagnoses and reference standard diagnoses, estimated sample size for power levels equal to 80% was 188 participants.
Receiver operating characteristic (ROC) analyses were used to supplement the multilevel logistic model by assessing the ability of variables to correctly classify consistency between index and reference standard diagnoses (sensitivity) vs inconsistency between index diagnoses and reference standard diagnoses (specificity). Thus, information on predictive ability, as well as the cutoff values of each predictor that maximize prediction, is also presented.24 Less than 60% area under the curve (AUC) represents chance classification accuracy.25 Both the significance of the curve and effect size conventions were used.
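As a rough illustration of the AUC described above, the sketch below computes it as the probability that a randomly chosen consistent case has a higher predictor value than a randomly chosen inconsistent case, counting ties as one half (the Mann-Whitney formulation). The function name `auc` and the certainty ratings are hypothetical and are not study data.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all pos/neg pairs."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical certainty ratings (1-10), NOT study data.
consistent = [9, 8, 10, 7, 6]   # index and reference diagnoses agreed
inconsistent = [5, 7, 4]        # diagnosis changed after the ADOS

print(round(auc(consistent, inconsistent), 3))
```

An AUC of 0.5 under this definition corresponds to a predictor that ranks consistent and inconsistent cases no better than chance.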
Power for the ROC curve was estimated using the following specifications: power equals 80%, type-II error equals 20%, null hypothesis ROC equals 50%; alternative hypothesis ROC equals 70%; ratio of positive vs negative cases 10:1. The required sample sizes for 80% power were equal to 18 and 180 participants for a total of n = 198. Missing data represented less than 1% of the sample across all analyses and were treated using pairwise deletion.
Results
Participants/Demographics
The reference standard diagnosis was completed for 349 children (Table 1). Diagnoses were made by 40 DBPs. Overall, 279 children (79.9%) were male, 276 children were non-Hispanic (80.5% of 343 with data available), and 212 children were White (60.7% of all 349 children). Scores on cognitive, language, and adaptive measures were typical for a population of young children referred for possible ASD.1 For example, among those with a reference standard diagnosis of ASD, 39.6% had mild, moderate, or severe cognitive impairment.
Table 1. Child Participant and Clinician Demographic Characteristics.
Child participants | All sites, No. (%) | Range across sites, No. (%) |
---|---|---|
No. of participants | 349 | 28-65 |
Index diagnosis of ASD | 249 (71.3) | 17-46 (60.7-70.8) |
Reference diagnosis of ASD | 250 (71.6) | 18-44 (64.3-78.6) |
Child’s age, mean (SD), mo | 39.90 (13.4) | 33.47-49.86 |
Sex | ||
Male | 279 (79.9) | 23-49 (74.4-85.7) |
Female | 70 (20.1) | 4-16 (14.3-25.6) |
Race^a | ||
>1 Race | 33 (9.4) | 0-8 (0-24.1) |
Asian | 22 (6.3) | 0-8 (0-19.5) |
Black | 52 (14.9) | 1-16 (3.0-28.6) |
Hawaiian/Pacific Islander | 1 (0.3) | 0-1 (0-1.5) |
Native American | 2 (0.6) | 0-1 (0-3.6) |
White | 212 (60.7) | 12-43 (27.9-80.5) |
Unknown/not reported | 26 (7.4) | 0-1 (0-1.5) |
Ethnicity^a | ||
Hispanic | 67 (19.5) | 1-54 (1.5-9.1) |
Non-Hispanic | 276 (80.5) | 19-54 (44.2-96.4) |
Unknown | 6 (1.7) | |
Insurance | ||
Medicaid/SCHIP/CHIP | 184 (52.7) | 1-45 (3.6-100) |
Private | 155 (44.4) | 0-44 (0-89.3) |
Military | 9 (2.6) | 0-3 (0-11.5) |
Self-pay | 0 (0.0) | 0 (0) |
Primary caregiver education level | 343 | |
<HS | 38 (11.1) | 0-10 (0-24.4) |
HS/GED | 82 (23.9) | 1-28 (1.8-68.3) |
Some post-HS | 63 (18.4) | 0-18 (0-36.0) |
College graduate | 105 (30.6) | 0-26 (0-53.5) |
Graduate degree | 55 (16.0) | 1-18 (3.2-32.7) |
Unknown | 6 (1.7) | 1-3 (2.3-10.7) |
ASD diagnosis | ASD | Not ASD |
Child participant assessment results by reference standard diagnosis | 250 | 99 |
Behavior problems noted | 225 | 99 |
Aggressive | 39 (17) | 27 (27) |
Hyperactive | 86 (38) | 36 (36) |
Inattentive/distractible | 100 (44) | 36 (36) |
Cognitive assessment^b | 154 | 65 |
Average to above average | 55 (36) | 40 (62) |
Borderline | 38 (25) | 11 (17) |
Mild impairment | 41 (27) | 10 (15) |
Moderate impairment | 13 (8) | 3 (5) |
Severe/profound impairment | 7 (5) | 1 (2) |
Language assessment^b | 138 | 50 |
Average to above average | 11 (8) | 14 (28) |
Borderline | 23 (17) | 17 (34) |
Mild impairment | 38 (27) | 12 (24) |
Moderate impairment | 49 (36) | 3 (6) |
Severe/profound impairment | 17 (12) | 4 (8) |
Adaptive assessment^b | 162 | 56 |
Average to above average | 17 (10) | 11 (20) |
Borderline | 46 (28) | 29 (52) |
Mild impairment | 76 (47) | 14 (25) |
Moderate impairment | 21 (13) | 0 (0) |
Severe/profound impairment | 2 (1) | 2 (3) |
Participating DBP clinicians | ||
No. | 40 | 3-9 |
ADOS routinely used at site (Y/N) | 34 (85.0) | NA |
Sex | ||
Female | 33 (82.5) | NA |
Male | 7 (17.5) | NA |
Age, mean (SD) | 48.10 (10.7) | NA |
Years of experience, mean (SD) | 14.04 (11.9) | NA |
ASD care a primary responsibility | 33 (82.5) | NA |
Abbreviations: ADOS, Autism Diagnostic Observation Schedule; ASD, autism spectrum disorder; CHIP, Children’s Health Insurance Program; DBP, developmental-behavioral pediatricians; GED, general educational development diploma; HS, high school; NA, not applicable; SCHIP, State Children’s Health Insurance Program.
^a Race and ethnicity were self-reported.
^b Scores for cognitive, language, and adaptive measures were obtained using different instruments across sites. Standard scores from individual test results were therefore categorized according to the following convention: average to above average, higher than 84; borderline, 69 to 84; mild impairment, 54 to 69; moderate impairment, 39 to 54; severe to profound impairment, less than 39.
Among participating DBPs, 33 were women (82.5%), mean age was 48.1 years, with a mean of 14 years since training was completed. Due to variability across sites, site of data collection was treated as a random variable across all models.
Assessment Characteristics and Index vs Reference Standard Diagnoses
Overall, there was consistency between index diagnoses and reference standard diagnoses for 314 children (90%) (ASD, 232; not ASD, 82), while diagnoses differed for 35 (index ASD and reference not ASD, 17; index not ASD and reference ASD, 18) (Table 2). Among 250 children who received a reference standard diagnosis of ASD, 232 children (92.8%) also received an index diagnosis of ASD. DBP diagnostic certainty increased from 7.5 at the time of the index diagnosis to 8.7 at the time of the reference standard diagnosis (P < .001). The reference standard diagnosis was consistent with the diagnostic categorization from the ADOS for 322 children (92.3%) and differed for 27 children (7.7%). These findings are consistent with rates of agreement noted in the ADOS manual.11,12,20
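The agreement figures follow directly from the four cells of the index-by-reference cross-tabulation. A minimal sketch of that arithmetic, using the all-sites cell counts from Table 2; the variable names and the `pct` helper are illustrative only.

```python
# All-sites cell counts from Table 2 (index diagnosis vs reference standard diagnosis).
both_asd = 232        # index ASD, reference ASD
index_only = 17       # index ASD, reference not ASD
reference_only = 18   # index not ASD, reference ASD
both_not_asd = 82     # index not ASD, reference not ASD

total = both_asd + index_only + reference_only + both_not_asd
consistent = both_asd + both_not_asd
changed = index_only + reference_only


def pct(numerator, denominator):
    """Percentage rounded to 1 decimal place."""
    return round(100 * numerator / denominator, 1)


print(total, consistent, changed)                  # 349 314 35
print(pct(consistent, total))                      # 90.0
print(pct(both_asd, both_asd + reference_only))    # 92.8
```

The last line is the share of children with a reference standard diagnosis of ASD who had already received an index diagnosis of ASD.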
Table 2. Assessment Characteristics by Site and Overall.
Characteristic, No. (%) | Site 1 | Site 2 | Site 3 | Site 4 | Site 5 | Site 6 | Site 7 | Site 8 | All sites |
---|---|---|---|---|---|---|---|---|---|
Children assessed, No. | 55 | 28 | 33 | 43 | 65 | 56 | 28 | 41 | 349 |
Index and reference standard diagnoses | |||||||||
Index = ASD, reference = ASD | 38 (69.1) | 20 (71.4) | 26 (78.8) | 29 (67.4) | 39 (60) | 43 (76.8) | 17 (60.7) | 20 (48.8) | 232 (66.5) |
Index = not-ASD, reference = ASD | 5 (9.0) | 1 (3.6) | 1 (3) | 7 (16.3) | 2 (3.1) | 1 (1.8) | 1 (3.6) | 0 (0) | 18 (5.2) |
Index = ASD, reference = not-ASD | 2 (3.6) | 0 (0) | 2 (6.1) | 3 (6.9) | 7 (10.8) | 3 (5.4) | 0 (0) | 0 (0) | 17 (4.8) |
Index = not-ASD, reference = not-ASD | 10 (18.2) | 7 (25) | 4 (12.1) | 4 (9.3) | 17 (26.1) | 9 (16.1) | 10 (35.7) | 21 (51.2) | 82 (23.4) |
Total No. of DBP assessments at site | |||||||||
DBP clinician who completed assessment | 55 | 27 | 33 | 43 | 66 | 55 | 28 | 41 | 348 |
Attending MD | 44 (80.0) | 22 (81.5) | 28 (84.8) | 41 (95.3) | 62 (93.9) | 50 (90.9) | 16 (57.1) | 41 (100) | 304 (87.4) |
Supervised trainee | 11 (20.0) | 5 (18.5) | 5 (15.2) | 2 (4.7) | 4 (6.1) | 5 (9.1) | 12 (42.9) | 0 (0) | 44 (12.6) |
Time spent in direct observation | |||||||||
Children with data on time spent, No. | 55 | 5^a | 32 | 42 | 65 | 56 | 27 | 41 | 323 |
Time spent, min | |||||||||
<30 | 6 (10.9) | 0 (0) | 10 (31.3) | 2 (4.8) | 0 (0) | 2 (3.6) | 14 (51.9) | 0 (0) | 34 (10.5) |
31-60 | 36 (65.5) | 1 (20.0) | 6 (18.8) | 6 (14.3) | 51 (78.5) | 35 (62.5) | 11 (40.7) | 0 (0) | 146 (45.2) |
61-90 | 13 (23.6) | 4 (80.0) | 16 (50.0) | 34 (81.0) | 14 (21.5) | 19 (33.9) | 2 (7.4) | 41 (100) | 143 (44.3) |
No. of visits to complete assessment | |||||||||
Total No. of children with assessment results | 55 | 27 | 32 | 43 | 65 | 55 | 28 | 41 | 346 |
1 | 55 (100) | 16 (59.3) | 30 (93.8) | 24 (55.8) | 2 (3.0) | 24 (43.6) | 6 (21.4) | 32 (78.0) | 189 (54.5) |
2 | 0 (0) | 11 (40.7) | 1 (3.1) | 7 (16.3) | 64 (97.0) | 31 (56.4) | 22 (78.6) | 9 (22.0) | 145 (41.8) |
>2 | 0 (0) | 0 (0) | 1 (3.1) | 12 (27.9) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 13 (3.7) |
Assessment results available to diagnosing clinician at time of index diagnosis | 55 | 28 | 33 | 43 | 66 | 56 | 29 | 41 | 657^b |
Cognitive/developmental | 51 (92.7) | 14 (50.0) | 21 (63.6) | 30 (69.8) | 6 (9.1) | 30 (53.6) | 0 (0) | 41 (100) | 193 (29.4) |
Language | 0 (0) | 1 (3.6) | 1 (3.0) | 24 (55.8) | 35 (53.0) | 47 (83.9) | 1 (3.4) | 40 (97.6) | 149 (22.7) |
Adaptive | 17 (30.9) | 9 (32.1) | 24 (72.7) | 19 (44.2) | 44 (66.7) | 22 (39.3) | 0 (0) | 0 (0) | 135 (20.5) |
Social | 32 (58.2) | 13 (46.4) | 2 (6.1) | 36 (83.7) | 59 (89.4) | 14 (25.0) | 0 (0) | 24 (58.5) | 180 (27.4) |
Abbreviations: ASD, autism spectrum disorder; DBP, developmental-behavioral pediatricians.
^a Data were missing for time spent in direct observation for all but 5 participants at site 2.
^b The total number of formal, standardized assessment results available would equal the total number per site times 4 (for the 4 assessments) if all children had all 4 types of assessment administered at each site. However, assessments were administered as part of routine clinical practice at each site and therefore this was not the case. For example, at site 1, 93% of the child participants had a cognitive assessment, none had a language assessment, 31% had an adaptive assessment, and 58% had a social assessment.
Diagnoses Among Child Participants Not Receiving Reference Standard Diagnosis of ASD
Among the 99 children who did not receive a reference standard diagnosis of ASD, 49 children (49.5%) received more than 1 diagnosis. Diagnoses included a range of language, developmental, motor, behavioral, and neurodevelopmental disorders (eMethods in the Supplement).
Prediction of Consistency Between Index and Reference Standard Diagnoses: Univariate Analyses
Univariate models, each containing a single predictor, were evaluated so that all available data would be used (eMethods in the Supplement). The importance of predictors was evaluated using both inferential criteria (P < .05) and effect size indicators of ORs (Table 3).26 Ratings of severe for DSM A (social/communication) and B criteria (restricted/repetitive behaviors) were associated with diagnostic consistency (OR, 2.962; 95% CI, 1.882-4.663 for A and OR, 2.044; 95% CI, 1.379-3.031 for B criteria). Higher DBP diagnostic certainty was also associated with diagnostic consistency: an increase of 1 unit in certainty was associated with almost twice the odds of a consistent diagnosis (OR, 1.809; 95% CI, 1.589-2.059, employing binary logistic regression).
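Because each OR in a logistic model is the exponential of its coefficient, the reported pairs can be checked with a few lines of arithmetic. The sketch below uses the coefficients and ORs reported in Table 3 for the three predictors discussed above; the dictionary layout and the `odds_multiplier` helper are illustrative, and the compounding over several units assumes the log-odds are linear in the predictor, as the logistic model does.

```python
import math

# Coefficients (log-odds) and ORs as reported for the univariate models (Table 3).
reported = {
    "DSM-5 A severity": (1.086, 2.962),
    "DSM-5 B severity": (0.715, 2.044),
    "Diagnostic certainty": (0.593, 1.809),
}

for name, (coef, odds_ratio) in reported.items():
    # exp(coefficient) should reproduce the OR up to rounding.
    print(f"{name}: exp({coef}) = {math.exp(coef):.3f} (reported {odds_ratio})")


def odds_multiplier(coef, units):
    """Multiplicative change in odds for a `units`-point change in the predictor."""
    return math.exp(coef * units)


# Example: odds multiplier for a 3-point difference in certainty.
print(round(odds_multiplier(0.593, 3), 2))
```

Under the linearity assumption, a multi-point difference in certainty multiplies the odds by the per-unit OR raised to the number of points.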
Table 3. Coefficients, Odds Ratios, and Respective 95% CIs When Predicting the Consistency Between Index and Reference Autism Spectrum Disorder (ASD) Diagnoses Using Univariate Mixed-Effects Logistic Regression Models.
Parameter fixed effects | Coefficient | OR (95% CI) | Effect size of OR |
---|---|---|---|
Predicted intercept^a | 2.142^b | 8.514 (5.401-13.420)^b | Large |
Child-level predictors | |||
Child age | −0.006 | 0.994 (0.964-1.025) | Small |
Child sex (male) | −0.353 | 0.702 (0.300-1.646) | Small |
Child race | −0.005 | 0.995 (0.797-1.243) | Small |
Child ethnicity | 0.577 | 1.782 (0.663-4.638) | Small to medium |
Insurance | |||
Medicaid | −0.060 | 0.941 (0.414-2.143) | Small |
Private | −0.058 | 0.944 (0.421-2.118) | Small |
Military^c | |||
Severity DSM-5 A criteria | 1.086^b | 2.962 (1.882-4.663)^b | Small to medium |
Severity DSM-5 B criteria | 0.715^b | 2.044 (1.379-3.031)^b | Small to medium |
DBP clinician self-rated diagnostic certainty per child participant | 0.593^b | 1.809 (1.589-2.059)^b | Small to medium |
ADOS classification^d | 0.448 | 1.565 (0.944-2.594) | Small |
ADOS module scores | 0.086 | 1.089 (1.016-1.168) | Small |
Cognition | −0.008 | 0.992 (0.974-1.010) | Small |
Language | −0.029 | 0.972 (0.945-0.999) | Small |
Adaptive behavior | −0.011 | 0.989 (0.948-1.031) | Small |
Child directly observed (time) | 0.656^b | 1.926 (1.123-3.303)^b | Small to medium |
Evaluation by trainee | −0.780 | 0.459 (0.176-1.194) | Small |
Availability of measures | |||
Cognitive | 0.704 | 2.021 (0.960-4.256) | Small to medium |
Language | 0.589 | 1.803 (0.833-3.901) | Small to medium |
Social function | 0.653 | 1.921 (0.904-4.080) | Small to medium |
Adaptive behavior | 0.328 | 1.388 (0.481-4.013) | Small |
Behavioral problems | |||
Self-injurious | 1.594 | 4.924 (0.800-30.285) | Medium to large |
Aggressive | −0.080 | 0.923 (0.395-2.161) | Small |
Hyperactive | −0.182 | 0.833 (0.416-1.668) | Small |
Inattentive/distractible | −0.171 | 0.843 (0.370-1.920) | Small |
DBP clinician-level predictors | |||
Sex (male) | 1.124 | 3.076 (0.932-10.157) | Small to medium |
Age | −0.028 | 0.973 (0.936-1.011) | Small |
Years past training | −0.031 | 0.969 (0.936-1.004) | Small |
Experience with DP | 0.249 | 1.283 (0.317-5.191) | Small |
Primary effort in ASD | −0.531 | 0.588 (0.209-1.655) | Small |
ADOS routinely used | 0.685 | 1.984 (0.398-9.884) | Small to medium |
Abbreviations: DP, Developmental Pediatrician defined by board certification in Developmental-Behavioral Pediatrics and/or Neurodevelopmental Disabilities; DSM-5, Diagnostic and Statistical Manual of Mental Disorders (Fifth Edition); OR, odds ratio.
^a Predicted intercept is from the null model. Results are based on univariate models to use all available data per predictor variable. The conventions used in the table refer to small, medium, and large effects as denoted by Cohen’s d values of .20, .50, and .80 standard deviations (in the OR metric). The variables child directly observed (time) and evaluation by trainee are conceptually clinician-level variables but had estimates per child and were therefore used as child-level predictors.
^b P < .05, 2-tailed test. Significance was adjusted for multiple comparisons using the Benjamini-Hochberg correction and a false discovery rate equal to 5%.
^c Coefficients could not be estimated because of the low frequency of military insurance, which did not provide enough variability to predict a binary outcome (consistency of index and reference diagnoses).
^d Values indicate 0 = non-ASD, 1 = autism spectrum, 2 = autism.
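The Benjamini-Hochberg adjustment mentioned in the table notes can be sketched as follows. This is a generic implementation of the step-up procedure, not the authors' code; the function name and the P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans: True where the hypothesis is rejected at the given FDR."""
    m = len(p_values)
    # Indices of the P values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P values, NOT taken from the study.
p_values = [0.001, 0.008, 0.039, 0.041, 0.30]
print(benjamini_hochberg(p_values))
```

Note that the procedure is step-up: a P value above its own rank threshold is still rejected if a larger P value further down the ordering meets its threshold.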
For each additional 30 minutes spent observing the child, index diagnoses and reference standard diagnoses were twice as likely to be consistent (OR, 1.926; 95% CI, 1.123-3.303); however, there was no correlation between degree of diagnostic certainty and time spent with the child (r = -0.025; P = .68).
Due to site variability in clinic flow, cognitive, language, adaptive, and social measures were not always available to the DBP at the time of index diagnosis. The availability of 1 or more cognitive or social behavior measures at the time of index diagnosis was associated with increased consistency between index diagnosis and reference standard diagnosis (cognitive measure OR, 2.021; 95% CI, 0.960-4.256; and social measure OR, 1.921; 95% CI, 0.904-4.080). However, availability of measures was not correlated with clinician degree of diagnostic certainty at time of index diagnosis.
The presence of self-injurious behavior was associated with an increase in consistency of diagnosis (OR, 4.924; 95% CI, 0.800-30.285), with a medium to large effect size, although this estimate did not reach significance following false discovery rate correction.
Clinician male sex was associated with a 3-fold increase in the odds of diagnostic consistency (OR, 3.076; 95% CI, 0.932-10.157), although this was not statistically significant. The same was true of the ADOS classification results (OR, 1.565; 95% CI, 0.944-2.594), which were associated with an approximately 1.6-fold increase in the odds of consistency.
Prediction of Consistency Between Index and Reference Standard Diagnoses: Receiver Operating Characteristic (ROC) Curve
The most important predictor was DBP diagnostic certainty at the time of the index diagnosis (AUC, 0.860; effect size, good) (eTable in the Supplement). A value of 7 on the 10-point Likert scale maximized prediction of a consistent diagnosis (sensitivity, or correct classification of consistent diagnosis, 76.43%; specificity, or correct classification of inconsistent diagnosis, 80%). Clinician ratings of severe DSM A criteria (AUC, 0.732) and scores on language measures (AUC, 0.707) were fair predictors of diagnostic consistency (eFigure in the Supplement).
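One common way to pick such a cutoff is to scan every candidate value and keep the one with the largest Youden index (sensitivity + specificity - 1). Whether the authors used this exact criterion is not stated, so the sketch below is an assumption; the ratings are hypothetical and the function names are illustrative.

```python
def sens_spec(consistent, inconsistent, cutoff):
    """Sensitivity and specificity when 'rating >= cutoff' predicts a consistent diagnosis."""
    sensitivity = sum(r >= cutoff for r in consistent) / len(consistent)
    specificity = sum(r < cutoff for r in inconsistent) / len(inconsistent)
    return sensitivity, specificity


def best_cutoff(consistent, inconsistent, candidates=range(1, 11)):
    """Cutoff on the 1-10 scale that maximizes the Youden index (an assumed criterion)."""
    def youden(cutoff):
        sensitivity, specificity = sens_spec(consistent, inconsistent, cutoff)
        return sensitivity + specificity - 1
    return max(candidates, key=youden)


# Hypothetical certainty ratings, NOT study data.
consistent = [10, 9, 9, 8, 8, 7, 7, 6, 5, 10]
inconsistent = [4, 5, 6, 6, 8]

cutoff = best_cutoff(consistent, inconsistent)
print(cutoff, sens_spec(consistent, inconsistent, cutoff))
```

Other criteria, such as minimizing the distance to the top-left corner of the ROC plot, can select a different cutoff on the same data.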
Prediction of Consistency Between Index and Reference Standard Diagnoses: Multivariate Model
A multivariate model was employed, retaining predictors only if (1) their coefficients were significantly different from 0 using either 1-tailed or 2-tailed tests and (2) model fit improved in their presence as judged by the deviance statistic (Table 4). DBP diagnostic certainty at index diagnosis, child ethnicity, and amount of time spent observing the child were important predictors of diagnostic consistency. Higher certainty, Hispanic ethnicity, and more time spent observing the child were each associated with increased diagnostic consistency. Among clinician-level predictors, no significant findings emerged in the multivariate model.
Table 4. Odds Ratios for the Prediction of Consistency Between Index and Reference ASD Diagnoses Using Child- and Clinician-Based Predictors in a Multivariate Multilevel Model^a.
Parameter fixed effects | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
---|---|---|---|---|---|
Predicted intercept | 2.135^b | 2.777^b | 1.014^b | 0.239 | 0.029 |
Child-level predictor | |||||
Certainty | NA | 1.794^b | 1.536^b | 1.631^b | 1.817^b |
Ethnicity | NA | NA | 2.659^b | 2.796^b | 2.459^c |
Child directly observed | NA | NA | NA | 1.947^b | 2.312^b |
Clinician-level predictor | |||||
Sex | NA | NA | NA | NA | 3.084 |
Age | NA | NA | NA | NA | 1.004 |
Years past training | NA | NA | NA | NA | 0.970 |
ASD primary responsibility | NA | NA | NA | NA | 2.156 |
ADOS routinely used | NA | NA | NA | NA | 0.933 |
Model improvement^d | |||||
Deviance based χ2 | 837.701 | 54.812 | 20.838 | 121.760 | NA |
df | 2 | 3 | 4 | 1 | NA |
P value | NA | <.001 | <.001 | <.001 | NA |
Abbreviations: ADOS, Autism Diagnostic Observation Schedule; ASD, autism spectrum disorder; NA, not applicable.
^a Valid cases in the multivariate model were n = 341; the 8 missing cases (2.3%) were due to the listwise deletion required by the multivariate model.
^b P < .05, 2-tailed test.
^c P < .05, 1-tailed test.
^d Variance reduction by use of a χ2 test based on the difference in the 2 models’ deviance estimates. Nested models involve only parameters that were significant in the multivariate model using either a 1-tailed or 2-tailed test. NA in model 5 denotes the absence of a comparison model 6.
Discussion
In this prospective diagnostic study including children ages 18 months to 5 years, 11 months who were referred for possible ASD, there was 90.0% agreement between the index diagnoses (ie, clinical diagnosis) and reference standard diagnoses (ie, clinical diagnosis plus information from the ADOS). In univariate analyses, factors associated with consistency between the index diagnoses and reference standard diagnoses included severity of ASD symptoms, clinician diagnostic certainty of the index diagnosis, time spent in direct observation of the child, availability of measures of development, and presence of self-injurious behavior. ROC and multivariate analyses indicated that clinician diagnostic certainty was the most robust predictor of consistency between index diagnoses and reference standard diagnoses.
A recent meta-analysis27 identified 14 studies on ASD classification using the ADOS alone compared with a reference standard assessment that included a focused clinical interview and the ADOS. Only 6 of these studies included toddler and preschool age children, and only 3 were conducted exclusively in clinical settings. The authors27 concluded that additional research on the ADOS is needed in the clinical setting. The current study was not designed to compare the ASD classification with the ADOS to a reference standard diagnosis that includes a clinical evaluation plus the ADOS; rather, it was aimed at evaluating the clinical utility of the ADOS. Clinical diagnoses by DBPs (index diagnosis), employing information from a clinical evaluation and, in some cases, results of standardized cognitive, language, and adaptive measures, were consistent with the diagnosis that also included information from the ADOS (reference standard diagnosis) in 90% of children. The likelihood that the index and reference standard diagnoses would be consistent was predicted by the DBP’s level of certainty in their clinical diagnosis.
While clinician diagnostic certainty was the most robust predictor of diagnostic consistency, several other factors also predicted consistency. Some children present with more severe and therefore diagnostically more salient ASD symptoms, and children in this study whose ASD symptoms were rated as more severe were also more likely to have consistency between the index diagnoses and reference standard diagnoses.2,3,4,15 The presence of self-injurious behaviors was found to predict consistent index diagnoses and reference standard diagnoses, suggesting that when self-injurious behaviors are noted in young children referred for possible ASD, there may be a higher likelihood that the child has ASD.
A previous study28 found that ASD diagnoses based on brief periods of direct observation of the child are often inaccurate. In this study, increased time of observation was associated with increased consistency between index diagnoses and reference standard diagnoses; however, post hoc analyses found that clinician diagnostic certainty did not vary as a function of time spent observing the child. In the clinical settings for this study,28 clinicians typically spent at least 30 minutes, and often longer, in direct observation of and/or interaction with the child.
Hispanic ethnicity predicted diagnostic consistency in multivariate analyses only. This important finding requires further consideration beyond the scope of this initial report.
This study has several strengths, including a large, diverse sample from 7 US sites and 1 European site. The study was conducted in the context of routine clinical care in specialty clinics, and findings are therefore likely to reflect the population of children referred for ASD diagnostic evaluations. Findings also reflect typical variations in clinical practice, including time spent evaluating the child; availability of standardized measures of cognitive, language, and adaptive function; and variability in the child’s neurodevelopmental profile.
Limitations
First, the study included only English (or German) speaking children under age 6 years, limiting generalizability. Second, the study was conducted in tertiary care referral centers, suggesting caution in generalizing findings to other settings. Third, DBPs who participated in this study are experienced subspecialists; therefore, study results are not applicable to nonspecialist clinicians who are often asked to diagnose ASD. Fourth, logistical and feasibility considerations precluded a design that incorporated an ASD evaluation independent of the clinician participants in the study.
Conclusions
In this prospective diagnostic study, clinical diagnoses of ASD by DBPs (index diagnosis) were consistent with diagnoses that incorporated information from the ADOS (reference standard diagnosis) in 90.0% of cases. Clinicians’ certainty in their clinical diagnosis predicted consistency between the index diagnoses and reference standard diagnoses. The ADOS may have clinical use in certain scenarios (eg, older children or evaluations by less highly trained specialist clinicians); however, this study suggests that the ADOS is generally not required for diagnosis of ASD by DBPs and that DBPs can identify children for whom the ADOS may contribute to accurate diagnosis. ASD diagnostic assessments that do not include the ADOS are less time-consuming and less costly, potentially leading to more streamlined assessments that could improve access to timely diagnosis for more children. Additionally, this study suggests that results from the ADOS should not be required by insurers, early intervention programs, or schools for children to access intervention and treatment for ASD.
References
- 1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Association; 2013.
- 2. Maenner MJ, Shaw KA, Bakian AV, et al. Prevalence and characteristics of autism spectrum disorder among children aged 8 years—Autism and Developmental Disabilities Monitoring Network, 11 sites, United States, 2018. MMWR Surveill Summ. 2021;70(11):1-16. doi: 10.15585/mmwr.ss7011a1
- 3. Hyman SL, Levy SE, Myers SM; Council on Children With Disabilities, Section on Developmental and Behavioral Pediatrics. Identification, evaluation, and management of children with autism spectrum disorder. Pediatrics. 2020;145(1):e20193447. doi: 10.1542/peds.2019-3447
- 4. Gwynette MF, McGuire K, Fadus MC, Feder JD, Koth KA, King BH. Overemphasis of the autism diagnostic observation schedule (ADOS) evaluation subverts a clinician’s ability to provide access to autism services. J Am Acad Child Adolesc Psychiatry. 2019;58(12):1222-1223. doi: 10.1016/j.jaac.2019.07.933
- 5. Mandell DS, Barry CL, Marcus SC, et al. Effects of autism spectrum disorder insurance mandates on the treated prevalence of autism spectrum disorder. JAMA Pediatr. 2016;170(9):887-893. doi: 10.1001/jamapediatrics.2016.1049
- 6. Kanne SM, Bishop SL. Editorial perspective: the autism waitlist crisis and remembering what families need. J Child Psychol Psychiatry. 2021;62(2):140-142. doi: 10.1111/jcpp.13254
- 7. McNally Keehn R, Tomlin A, Ciccarelli MR. COVID-19 pandemic highlights access barriers for children with autism spectrum disorder. J Dev Behav Pediatr. 2021;42(7):599-601. doi: 10.1097/DBP.0000000000000988
- 8. Constantino JN, Abbacchi AM, Saulnier C, et al. Timing of the diagnosis of autism in African American children. Pediatrics. 2020;146(3):e20193629. doi: 10.1542/peds.2019-3629
- 9. Durkin MS, Maenner MJ, Baio J, et al. Autism spectrum disorder among US children (2002-2010): socioeconomic, racial, and ethnic disparities. Am J Public Health. 2017;107(11):1818-1826. doi: 10.2105/AJPH.2017.304032
- 10. Wiggins LD, Durkin M, Esler A, et al. Disparities in documented diagnoses of autism spectrum disorder based on demographic, individual, and service factors. Autism Res. 2020;13(3):464-473. doi: 10.1002/aur.2255
- 11. Lord C, Rutter M, DiLavore PC, Risi S, Gotham K, Bishop S. Autism Diagnostic Observation Schedule: ADOS-2. Western Psychological Services; 2012.
- 12. Lord C, Rutter M, DiLavore PC, Risi S. Autism Diagnostic Observation Schedule: Toddler Module. Western Psychological Services; 2006.
- 13. Akshoomoff N, Corsello C, Schmidt H. The role of the autism diagnostic observation schedule in the assessment of autism spectrum disorders in school and community settings. Calif School Psychol. 2006;11:7-19. doi: 10.1007/BF03341111
- 14. Havdahl KA, Hus Bal V, Huerta M, et al. Multidimensional influences on autism symptom measures: implications for use in etiological research. J Am Acad Child Adolesc Psychiatry. 2016;55(12):1054-1063.e3. doi: 10.1016/j.jaac.2016.09.490
- 15. Kaufman NK. Rethinking “gold standards” and “best practices” in the assessment of autism. Appl Neuropsychol Child. Published online May 5, 2022. doi: 10.1080/21622965.2020.1809414
- 16. Johnson CP, Myers SM; American Academy of Pediatrics Council on Children With Disabilities. Identification and evaluation of children with autism spectrum disorders. Pediatrics. 2007;120(5):1183-1215. doi: 10.1542/peds.2007-2361
- 17. Hansen RL, Blum NJ, Gaham A, Shults J; DBPNet Steering Committee. Diagnosis of autism spectrum disorder by developmental-behavioral pediatricians in academic centers. Pediatrics. 2016;137(suppl 2):S79-S89. doi: 10.1542/peds.2015-2851F
- 18. Lord C, Charman T, Havdahl A, et al. The Lancet Commission on the future of care and clinical research in autism. Lancet. 2022;399(10321):271-334. doi: 10.1016/S0140-6736(21)01541-5
- 19. Robins DL, Casagrande K, Barton M, Chen CM, Dumont-Mathieu T, Fein D. Validation of the modified checklist for autism in toddlers, revised with follow-up (M-CHAT-R/F). Pediatrics. 2014;133(1):37-45. doi: 10.1542/peds.2013-1813
- 20. Luyster R, Gotham K, Guthrie W, et al. The Autism Diagnostic Observation Schedule-toddler module: a new module of a standardized diagnostic measure for autism spectrum disorders. J Autism Dev Disord. 2009;39(9):1305-1320. doi: 10.1007/s10803-009-0746-z
- 21. Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd ed. Sage; 2002.
- 22. Hsieh FY, Bloch DA, Larsen MD. A simple method of sample size calculation for linear and logistic regression. Stat Med. 1998;17(14):1623-1634.
- 23. NCSS Statistical Software. PASS 2021 Power Analysis and Sample Size. Accessed September 13, 2022. http://ncss.com/software/pass
- 24. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29-36. doi: 10.1148/radiology.143.1.7063747
- 25. Gallop RJ, Crits-Christoph P, Muenz LR, Tu XM. Determination and interpretation of the optimal operating point for ROC curves derived through generalized linear models. Underst Stat. 2003;2:219-242. doi: 10.1207/S15328031US0204_01
- 26. Chen H, Cohen P, Chen S. How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies. Commun Stat Simul Comput. 2010;39(4):860-864. doi: 10.1080/03610911003650383
- 27. Lebersfeld JB, Swanson M, Clesi CD, O’Kelley SE. Systematic review and meta-analysis of the clinical utility of the ADOS-2 and the ADI-R in diagnosing autism spectrum disorders in children. J Autism Dev Disord. 2021;51(11):4101-4114. doi: 10.1007/s10803-020-04839-z
- 28. Gabrielsen TP, Farley M, Speer L, Villalobos M, Baker CN, Miller J. Identifying autism in a brief observation. Pediatrics. 2015;135(2):e330-e338. doi: 10.1542/peds.2014-1428
Associated Data