Sensitivity and Specificity of the Modified Checklist for Autism in Toddlers (Original and Revised): A Systematic Review and Meta-analysis

Andrea Trubanova Wieckowski; Lashae N Williams; Juliette Rando; Kristen Lyall; Diana L Robins

doi:10.1001/jamapediatrics.2022.5975

2023 Feb 20;177(4):373–383. doi: 10.1001/jamapediatrics.2022.5975

Andrea Trubanova Wieckowski ¹,✉, Lashae N Williams ¹, Juliette Rando ¹, Kristen Lyall ¹, Diana L Robins ¹

¹A.J. Drexel Autism Institute, Drexel University, Philadelphia, Pennsylvania

Accepted for Publication: November 11, 2022.

Published Online: February 20, 2023. doi:10.1001/jamapediatrics.2022.5975

✉ Corresponding Author: Andrea Trubanova Wieckowski, PhD, A.J. Drexel Autism Institute, 3020 Market St, Ste 560, Philadelphia, PA 19104 (atw64@drexel.edu).

Author Contributions: Dr Wieckowski had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Wieckowski, Lyall, Robins.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Wieckowski, Williams, Rando, Lyall.

Critical revision of the manuscript for important intellectual content: Wieckowski, Williams, Lyall, Robins.

Statistical analysis: Wieckowski, Rando, Lyall.

Administrative, technical, or material support: Wieckowski.

Supervision: Wieckowski, Lyall, Robins.

Conflict of Interest Disclosures: Dr Wieckowski reported receiving grants from the Pennsylvania Medical Society and the Eagles Autism Foundation. Dr Lyall reported receiving grants from the Eagles Autism Foundation. Dr Robins reported receiving personal fees from M-CHAT LLC co-ownership, in which licensees pay royalties; receiving grants from the Eagles Autism Foundation, the National Institutes of Health, and the Pennsylvania Medical Society; having a contract to contribute to a Food and Drug Administration trial from Autism Speaks; receiving a gift to support pilot research from the Wawa Foundation; receiving personal fees from Quadrant Biosciences, Inc, for serving as a member of an advisory board; having a contract to collaborate on a toddler screening study in Monterrey, Mexico, from Autismo ABP outside the submitted work; and holding a copyright for M-CHAT, M-CHAT-R/F issued to M-CHAT, LLC (M-CHAT and M-CHAT-R/F are copyrighted instruments). No other disclosures were reported.

Data Sharing Statement: See Supplement 2.

PMCID: PMC9941975 PMID: 36804771

Key Points

Question

What factors are associated with the sensitivity and specificity of the Modified Checklist for Autism in Toddlers (M-CHAT) and the M-CHAT, Revised With Follow-up (M-CHAT-R/F)—henceforth referred to as M-CHAT(-R/F)?

Findings

In this systematic review and meta-analysis of 50 studies, the pooled sensitivity of M-CHAT(-R/F) was 0.83, and the pooled specificity was 0.94. Heterogeneity analyses revealed greater diagnostic accuracy for low- vs high-likelihood samples, a concurrent vs prospective case confirmation strategy, a large vs small sample size, use of the M-CHAT(-R/F) Follow-up, and non-English vs primarily English administration.

Meaning

Findings suggest that M-CHAT(-R/F) shows strong performance as an autism spectrum disorder screener, but researchers and clinicians should be aware of the variability in the sensitivity and specificity based on multiple factors.

Abstract

Importance

The Modified Checklist for Autism in Toddlers (M-CHAT) and the M-CHAT, Revised With Follow-up (M-CHAT-R/F)—henceforth referred to as M-CHAT(-R/F)—are the most commonly used toddler screeners for autism spectrum disorder (ASD). Their use often differs from that in the original validation studies, resulting in a range of estimates of sensitivity and specificity. Also, given the variability in reports of the clinical utility of the M-CHAT(-R/F), researchers and practitioners lack guidance to inform autism screening protocols.

Objective

To synthesize variability in sensitivity and specificity of M-CHAT(-R/F) across multiple factors, including procedures for identifying missed cases, likelihood level, screening age, and single compared with repeated screenings.

Data Sources

A literature search was conducted with PubMed, Web of Science, and Scopus to identify studies published between January 1, 2001, and August 31, 2022.

Study Selection

Articles were included if the studies used the M-CHAT(-R/F) (ie, original or revised version) to identify new ASD cases, were published in English-language peer-reviewed journals, included at least 10 ASD cases, reported procedures for false-negative case identification, screened children by 48 months, and included information (or had information provided by authors when contacted) needed to conduct the meta-analysis.

Data Extraction and Synthesis

The systematic review and meta-analysis was conducted within the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline. The Quality Assessment of Diagnostic Accuracy Studies–2 tool evaluated bias in sample selection. Data extraction and quality assessment were performed by 2 authors independently. The overall diagnostic accuracy of the M-CHAT(-R/F) was assessed with the hierarchic summary receiver operating characteristic (HSROC) model.

Main Outcomes and Measures

Sensitivity, specificity, diagnostic odds ratios, and HSROC curves of M-CHAT(-R/F).

Results

The review included 50 studies with 51 samples. The pooled sensitivity of M-CHAT(-R/F) was 0.83 (95% CI, 0.77-0.88), and the pooled specificity was 0.94 (95% CI, 0.89-0.97). Heterogeneity analyses revealed greater diagnostic accuracy for low- vs high-likelihood samples, a concurrent vs prospective case confirmation strategy, a large vs small sample size, use of M-CHAT(-R/F) Follow-up, and non-English vs English only.

Conclusions and Relevance

Overall, results of this study suggest the utility of the M-CHAT(-R/F) as an ASD screener. The wide variability in psychometric properties of M-CHAT(-R/F) highlights differences in screener use that should be considered in research and practice.

This systematic review and meta-analysis assesses the sensitivity and specificity of the Modified Checklist for Autism in Toddlers (M-CHAT) and the M-CHAT, Revised With Follow-up as autism spectrum disorder screeners.

Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder marked by core deficits in social communication and restricted and repetitive behaviors. It can be accurately detected by 18 months of age, although the diagnosis usually occurs much later.^1,2,3 Early detection of ASD informs autism-specific early intervention, which improves outcomes.^4,5 The American Academy of Pediatrics (AAP) recommends all children undergo general developmental and autism-specific screening paired with developmental surveillance.⁶ Although most primary care professionals screen for autism with a standardized autism-specific tool (eg, 72%),⁷ implementation of screening is inconsistent, in part owing to the lack of a recommendation for universal screening from the US Preventive Services Task Force.⁵ Reports of low sensitivity of toddler screening^8,9 may also affect universal implementation.

The Modified Checklist for Autism in Toddlers (M-CHAT)¹⁰ and the M-CHAT, Revised With Follow-up (M-CHAT-R/F),¹¹ henceforth referred to as M-CHAT(-R/F), are the most commonly used ASD-specific screening tools.¹² However, implementation often diverges from that in the original validation studies, including in whether the Follow-up portion of the M-CHAT-R/F is administered, which limits interpretation of sensitivity and specificity. Previous studies describing variability in the methods used to examine psychometric properties of ASD screening tools,^13,14 and the M-CHAT specifically,¹⁵ have been limited in their assessment of contextual factors, such as the rigor of false-negative case identification strategies. Consequently, there is a lack of consensus regarding M-CHAT(-R/F) screening properties. The present systematic literature review and meta-analysis examines screening properties of the M-CHAT(-R/F) to assess variations in reported sensitivity and specificity across multiple factors, including procedures for identifying missed cases, likelihood level, screening age, single or repeated screenings, sample size, version, language administered, and use of the structured Follow-up portion of the M-CHAT(-R/F).

Methods

Search Strategy

This systematic review and meta-analysis was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline.¹⁶ We searched PubMed, Web of Science, and Scopus to identify peer-reviewed articles published between January 1, 2001, and August 31, 2022 (see eTable 1 in Supplement 1 for key words). This systematic review and meta-analysis has been registered with the International Prospective Register of Systematic Reviews (CRD42021232792).

Inclusion Criteria

Included articles used M-CHAT(-R/F) to screen for ASD and (1) were published in English-language peer-reviewed journals; (2) used M-CHAT(-R/F) to screen children younger than 48 months, before ASD diagnosis; (3) identified at least 10 ASD cases; (4) reported procedures for identifying false-negative cases (critical to estimate sensitivity); and (5) included information (or had information provided by authors when contacted) needed to conduct the meta-analysis (rates of true-positive, false-positive, false-negative, and true-negative cases). Reviews, commentaries, and conference articles were excluded. For studies with overlapping data sets, the study with the largest sample was included.

Data Extraction and Quality Assessment

Data in Table 1 were extracted independently by 2 authors (A.T.W. and L.N.W.).^{8,9,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64} Screening metrics not provided in the studies were calculated from raw values. When possible, specificity was recalculated from the raw numbers so that true-negative cases included presumed true-negative ones (ie, screen-negative children who were not evaluated). Of 42 authors contacted for missing information, 27 either responded with the requested information or noted that it was not available.
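As an illustration of how screening metrics can be calculated from raw values, the following minimal Python sketch derives sensitivity and specificity from the 4 cell counts of a study. The class and function names are hypothetical and are not taken from the authors' analysis; the example counts are those listed for Baduel et al¹⁷ in Table 1, with the true-negative count assumed to already include presumed true-negative children.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """2x2 counts for one study sample (hypothetical container)."""
    tp: float  # screen positive, ASD confirmed
    fn: float  # screen negative, ASD confirmed (missed cases)
    fp: float  # screen positive, no ASD
    tn: float  # screen negative, no ASD (including presumed true negatives)


def sensitivity(c: ScreeningCounts) -> float:
    """Proportion of children with ASD who screened positive."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ScreeningCounts) -> float:
    """Proportion of children without ASD who screened negative."""
    return c.tn / (c.tn + c.fp)


# Counts as listed for Baduel et al in Table 1.
baduel = ScreeningCounts(tp=12, fn=6, fp=8, tn=1201)
print(round(sensitivity(baduel), 3), round(specificity(baduel), 3))
```

Because presumed true-negative children enter only the specificity denominator, the choice to include them changes specificity but leaves sensitivity untouched.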

Table 1. Study Characteristics and Screening Metrics for M-CHAT(-R/F).

Source	Total No.^a	Screen age, mo^b	Evaluation age, mo^b	HL, LL, or mixed	M-CHAT language	FU	ASD detection^c	TP	FN	FP	TN	Sens	Spec^d
Baduel et al,¹⁷ 2017	1250	22-26	24-34	LL	French	Y	C: FN strategy	12	6	8	1201	0.667	0.993
Beacham et al,¹⁸ 2018	154	16-45	16-45	HL	English	N	C: all eval	105	19	14	16	0.847	0.533
Canal-Bedia et al,¹⁹ 2011	2480	18-36	18-48	Mixed^e	Spanish	Y	C: weak	23	0	43	2414	1.000	0.980^f
Carbone et al,⁸ 2020	26 364	16-30	46.8 (17.7)^g	LL	English	Y/N^h	P: record	125	253	579	25 407	0.331	0.978
Chang et al,²⁰ 2021	990	17-37	17-37	LL	English	Y	C: FN strategy	31	7	11	941	0.816	0.988
Charman et al,²¹ 2016	543	18-56	32-73	HL	English	N	C: all eval	45	10	32.5	32.5	0.818	0.500
Chlebowski et al,²² 2013	18 989	16-30	25.8 (4.5)	LL	English or Spanish	Y	C: FN strategy	92	6	79	18 289	0.939	0.996
Choueiri et al,²³ 2021	80	18-36	18-36	HL	English	N	C: all eval	53	3	0	24	0.946	1.000
Christopher et al,²⁴ 2021	290	18-48	18-48	HL	English	Y	C: all eval	170	48	48	24	0.780	0.333
Coelho-Medeiros et al,²⁵ 2019	120	16-30	16-30	Mixed^e	Spanish (Chile)	Y	C: FN strategy	18	0	4	97	1.000	0.960
Dereu et al,²⁶ 2012	199	16-31	13-51	HL	Dutch	N	C: FN strategy	10	4	22	163	0.714ⁱ	0.881
DiGuiseppi et al,²⁷ 2010	85	20-86	20-86	HL	English or Spanish	N	C: FN strategy	9	2	26	38	0.818	0.594
Dudova et al,²⁸ 2014	157	≈24	NA	HL	Czech^j	N	C: FN strategy	9	4	9	112	0.692	0.926
Eaves et al,²⁹ 2006	84	17-48	22-53	HL	English	N	C: all eval	48	4	22	8	0.923	0.267
Guo et al,³⁰ 2019	7928	16-30	23 (4)	LL	Chinese	Y	C: FN strategy	72	10	103	7166	0.878	0.986
Guthrie et al,⁹ 2019^k	20 375	16-26	17-88	LL	English or Spanish	Y/N^h	P: record	225	229	1247	18 674	0.496	0.937
Harris et al,³¹ 2021^l	360	24-48	NA	LL	English or Spanish	Y	C: FN strategy	8	29	4	315	0.216	0.987
Hoang et al,³² 2019	17 277	18-30	18-30	LL	Vietnamese	N	C: FN strategy	129	1	118	17 021	0.992	0.993
Inada et al,³³ 2011	1187	17-23	35-44	LL	Japanese	N	P: FN strategy	11	9	46	1121	0.550	0.961
Jonsdottir et al,³⁴ 2022	1586	31.7 (1.7)	NA^m	LL	Icelandic	Y	P: record	18	11	7	1549	0.621	0.988
Kamio et al,³⁵ 2014	1851	17-26	33-73	LL	Japanese	Y	P: FN strategy	29ⁿ	22	24	1661	0.569	0.986
Kanne et al,³⁶ 2018	158	18-48	24-32	HL	English	Y	C: all eval	96	23	24	15	0.807	0.385
Kara et al,³⁷ 2014	618	18-36	24-42	Mixed^e	Turkish	Y	C: FN strategy	45	2	15	534	0.957	0.973
Keehn et al,³⁸ 2021	605	18-48	18-48	HL	English	Y	C: all eval	198	31	234	142	0.865	0.378
Kerub et al,³⁹ 2020	1591	18-36	NA^o	LL	Hebrew	Y	P: record	7	3	43	1538	0.700	0.973
Kim et al,⁴⁰ 2016	827	14-43^p	110-151	HL	English	N	P: all eval	30	28	123	646	0.517	0.840
Kleinman et al,⁴¹ 2008^q	1416	16-30	52.2 (8.0)^g	Mixed^e	English	Y	P: FN strategy	73	7	51	1285	0.913	0.962
Koh et al,⁴² 2014	580	17-48	18-69	HL	English or Chinese	N	C: all eval	158	41	123	258	0.794	0.677
Magán-Maganto et al,⁴³ 2020	6625	14-36	23-36	LL	Spanish	Y	C: FN strategy	15	4	24	6542	0.789	0.996
Matson et al,⁴⁴ 2013	552	16-30	16-30	HL	English	N	C: all eval	150	101	150	151	0.598	0.502
Oner and Munir,⁴⁵ 2020	6712	16-36	16-41	LL	Turkish	Y	C: weak	57	0	95	6388	1.000	0.985
Robins et al,⁴⁶ 2014	16 041	16-30	26.2 (5.5)	LL	English	Y	C: FN strategy	105	18	116	15 496	0.854	0.993
Salim et al,⁴⁷ 2020	143	18-48	18-48	HL	Indonesian	N	C: all eval	16	1	27	99	0.941	0.786
Salisbury et al,⁴⁸ 2018	485	16-48	16-48	HL	English	N	C: all eval	220	77	82	106	0.741	0.564
Samadi and McConkey,⁴⁹ 2015	2941	24-60	24-60	LL	Kurdish or Persian	N	C: FN strategy	25	3	45	2380	0.903^r	0.981
Schjølberg et al,⁵⁰ 2022^s	54 436	19.0 (1.2)	≈42	LL	Norwegian	N	P: record	105	232	4048	50 078	0.312	0.925
Smith et al,⁵¹ 2013	217	18-48	18-48	HL	English	N	C: all eval	97	39	31	50	0.713	0.617
Snow and Lecavalier,⁵² 2008	56	18-48	18-48	HL	English	N	C: all eval	38	5	8	5	0.884	0.385
Srisinghasongkram et al,⁵³ 2016 (sample 1)	109	18-48	18-48	HL	Thai	Y	C: all eval	40	5	1	63	0.889	0.984
Srisinghasongkram et al,⁵³ 2016 (sample 2)	732	18-48	18-48	LL	Thai	Y	C: FN strategy	9	0	1	722	1.000	0.999
Sturner et al,⁵⁴ 2016	5071	18-24	14-41	LL	English	Y	C: FN strategy	23	16	17	4772	0.590	0.996
Sturner et al,⁵⁵ 2022	408	16-20	20.5 (1.9)	LL	English	Y	C: FN strategy	46	17	118	227	0.730	0.658
Taylor et al,⁵⁶ 2014	145	28.1 (4.8)	28.1 (4.8)	HL	English	N	C: all eval	74	12	26	33	0.860	0.559
Toh et al,⁵⁷ 2018	19 297	15-36	NA	LL	Malay, Chinese, or English	N	P: record	18	32	20	19 227	0.360	0.999
Tsai et al,⁵⁸ 2019	317	16-32	36-37	Mixed	Mandarin Chinese	Y	P: all eval	22	3	19	273	0.860^r	0.935
Thi Vui et al,⁵⁹ 2022	40 243	18-30	NA	LL	Vietnamese	N	C: FN strategy	302	3	193	39 726	0.990	0.995
Weitlauf et al,⁶⁰ 2015	74	16-21	18-43	HL	English	Y	C: FN strategy	21	6	7	29	0.778	0.806
Wieckowski et al,⁶¹ 2021^t	3052	17-22	18-60	LL	English or Spanish	Y	C: FN strategy	61	13	79	2729	0.824	0.972
Windiani et al,⁶² 2016	110	18-48	18-48	HL	Indonesian	Y	C: all eval	16	2	5	87	0.889	0.946
Wong et al,⁶³ 2018	236	18-47	18-47	HL	Chinese	N	C: all eval	99	14	58	65	0.876	0.528
Zhang et al,⁶⁴ 2022	11 190	18-24	23.1 (4.6)	LL	Chinese	Y	C: FN strategy	33	15^u	56^u	11 056^u	0.688	0.995


Abbreviations: all eval, all participants evaluated; ASD, autism spectrum disorder; C, concurrent; FN, false negative; FP, false positive; FU, M-CHAT(-R/F) Follow-up administration; HL, high likelihood for ASD; LL, low likelihood for ASD; M-CHAT(-R/F), Modified Checklist for Autism in Toddlers and Modified Checklist for Autism in Toddlers, Revised With Follow-up (original and revised versions combined); N, no; NA, not available; P, prospective; record, medical record review; Sens, sensitivity; Spec, specificity; TN, true negative; TP, true positive; Y, yes; Y/N, Y, but not consistently.

^a Refers to the sample that received the M-CHAT(-R/F).

^b Range reported for entire sample who received M-CHAT(-R/F), when available. If not available, mean (SD) is reported. If mean (SD) was not available, an estimate from the article or from communication with authors is reported.

^c Strategy used to detect FN cases.

^d Specificity was recalculated from the raw numbers so that TN cases included presumed TN results (ie, including children who screened negative but were not further evaluated) for consistency across studies, unless noted otherwise. Negative screen results were presumed to be TN unless there was other presented evidence.

^e Reclassified as low risk for analyses because most participants were low risk.

^f True negative and specificity taken directly from article and not recalculated due to missing information.

^g Age of evaluation is for ASD sample only; age for non-ASD sample is unknown.

^h Yes/no was reclassified for analyses: Carbone et al⁸ was reclassified as N based on few practices using the Follow-up portion of the M-CHAT-R/F, and Guthrie et al⁹ was reclassified as Y even though the Follow-up portion of the M-CHAT-R/F was not always used, given that it was built into the medical record system and intended to be used when indicated.

ⁱ Sensitivity differs slightly from that reported in the article because of a focus on M-CHAT(-R/F) only.

^j Language presumed as not directly reported in the article.

^k Values were obtained from the main author for repeated screenings and do not match those reported in the article for single screening.

^l Subsample of children screened before 48 months of age only is reported. Values were obtained from communication with the main author.

^m Age of evaluation was up to 18 months after the age of screening.

ⁿ TP value differs from the one reported in the article because of the addition of 9 nonresponders who needed the Follow-up portion of the M-CHAT-R/F but did not complete it and had confirmed ASD.

^o Age of evaluation was within 10 months of screenings.

^p Screen age reported is uncorrected for prematurity.

^q Study 2 sample only presented and analyzed because of overlap of sample 1 with Chlebowski et al.²²

^r Sensitivity calculation slightly differs from article’s reported sensitivity owing to rounding in calculation from raw numbers.

^s Information for the 23-item M-CHAT is reported.

^t Information is reported for 18-month screening start age only.

^u Numbers adjusted to include the 12 screen-negative children with a diagnosis of ASD during subsequent well-child visit and follow-up.

Two authors (A.T.W. and L.N.W.) used the Quality Assessment of Diagnostic Accuracy Studies–2 (QUADAS-2)⁶⁵ tool, adapting signaling questions (see Yuen et al¹⁵) to evaluate bias in sample selection, implementation of M-CHAT(-R/F), and diagnostic assessment procedures. QUADAS-2 assesses risk of bias across 4 domains (patient selection, index test, reference standard, and flow and timing) and applicability across 3 domains (patient selection, index test, and reference standard). Each study received ratings of low, high, or unclear risk of bias and applicability for each domain according to the signaling questions (eTables 2 and 3 in Supplement 1).

Data Analysis

Data analyses were conducted between March 11, 2022, and October 10, 2022. The overall diagnostic accuracy of the M-CHAT(-R/F) was assessed with the hierarchic summary receiver operating characteristic (HSROC)^66,67 model using the MetaDAS SAS version 1.3.0 (SAS Institute Inc) macro.⁶⁸ Models were run with and without covariates: likelihood level of sample, case confirmation strategy classification, sample size, M-CHAT version, use of the Follow-up, and language. The HSROC parameters output by each model were input into RevMan, version 5 (Cochrane) software to create the HSROC summary curves. Models were run with all included studies except those that were unable to be classified into main categories based on predominant data (eMethods in Supplement 1). The diagnostic odds ratio (DOR) is an estimate of overall diagnostic test accuracy, reflecting how many times higher the odds are of obtaining an M-CHAT(-R/F) score in the diagnostic range for a randomly selected person with ASD vs without ASD. The likelihood ratio test assesses the goodness of fit of 2 competing statistical models by testing whether the ratio of their likelihoods is significantly different from 1 and was used to assess whether the addition of each covariate to the model was significant (eMethods in Supplement 1).
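For intuition about the DOR, the sketch below computes it for a single study in 2 equivalent ways: from the 4 cell counts and from sensitivity and specificity. This is only an illustration with hypothetical function names; the pooled DORs reported in this article come from the HSROC model, not from this per-study formula. The example counts are those listed for Chang et al²⁰ in Table 1.

```python
import math


def dor_from_counts(tp: float, fn: float, fp: float, tn: float) -> float:
    """Per-study diagnostic odds ratio: (TP x TN) / (FP x FN).

    Undefined when FP or FN is 0; a continuity correction would be needed
    in that case (not shown here).
    """
    return (tp * tn) / (fp * fn)


def dor_from_rates(sens: float, spec: float) -> float:
    """Same quantity written as odds(sensitivity) x odds(specificity)."""
    return (sens / (1.0 - sens)) * (spec / (1.0 - spec))


# Counts as listed for Chang et al in Table 1.
tp, fn, fp, tn = 31, 7, 11, 941
dor_a = dor_from_counts(tp, fn, fp, tn)
dor_b = dor_from_rates(tp / (tp + fn), tn / (tn + fp))
print(round(dor_a, 1), math.isclose(dor_a, dor_b, rel_tol=1e-9))
```

The 2 forms agree because the odds of a positive screen among children with ASD are TP/FN and the odds of a negative screen among children without ASD are TN/FP.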

Results

Search Results

Overall, 50 published studies met the criteria and were included in our systematic review (see eFigure 1 in Supplement 1 for exclusions). One study⁵³ included 2 distinct samples, resulting in 51 study samples described in the present review and meta-analysis.

Study Characteristics

Table 1 and eTable 4 in Supplement 1 summarize the characteristics of the included studies. Almost half of the samples were small (<500; 21 of 51 studies [41%]); 26 (51%) administered the M-CHAT(-R/F) primarily in English. Thirty-two studies (63%) used the original version of M-CHAT, whereas 19 (37%) used M-CHAT-R. Thirty studies (59%) used the structured Follow-up portion of the M-CHAT-R/F with the original or revised M-CHAT, and 21 (41%) used only the initial items.

Studies differed based on case confirmation strategies. Most of the 51 studies (n = 40 [78%]) used concurrent detection methods, defined as evaluation to confirm ASD status within 6 months of screening. These studies used (1) additional rigorous false-negative case detection approaches, such as asking pediatric physicians to note ASD concerns, using additional screening tools for all or a subset of the sample, or both (n = 21)^{17,20,22,25,26,27,28,30,31,32,37,43,46,49,53,54,55,59,60,61,64}; or (2) evaluation of all children regardless of screening results (n = 17).^{18,21,23,24,29,36,38,42,44,47,48,51,52,53,56,62,63} In addition, 2 studies used strategies that were classified as weak concurrent false-negative case confirmation strategies, defined as strategies that resulted in few children who had not screened positive being invited for an evaluation.^19,45 The rest of the studies (n = 11 [22%]) used prospective approaches to identify children who received a diagnosis of ASD, defined as diagnostic confirmation occurring when children were older (>6 months from screening). These studies used strategies that included medical record reviews from both primary and specialty clinics (n = 6),^{8,9,34,39,50,57} additional rigorous strategies to identify false-negative cases through additional screening tools or physician-indicated concerns (n = 3),^33,35,41 and evaluation of all children regardless of screening result (n = 2).^40,58
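The classification rule and the counts in the preceding paragraph can be restated as a short Python sketch. The function name, dictionary labels, and variable names are hypothetical; only the 6-month cutoff and the study counts come from the text.

```python
TOTAL_SAMPLES = 51  # study samples included in the review

# Counts of samples by case confirmation approach, as stated in the text.
concurrent = {"rigorous FN strategy": 21, "all children evaluated": 17, "weak FN strategy": 2}
prospective = {"record review": 6, "rigorous FN strategy": 3, "all children evaluated": 2}


def confirmation_timing(months_from_screen_to_evaluation: float) -> str:
    """Concurrent if ASD status was confirmed within 6 months of screening."""
    return "concurrent" if months_from_screen_to_evaluation <= 6 else "prospective"


def share(counts: dict) -> tuple:
    """Return the number of samples and their rounded percentage of all samples."""
    n = sum(counts.values())
    return n, round(100 * n / TOTAL_SAMPLES)


print(share(concurrent), share(prospective))
```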

Studies also differed significantly based on the characteristics of the children screened. Twenty-four of the 51 studies (47%) recruited low-likelihood (LL) children, which included screening at well-child visits, childcare centers, kindergarten, or preschool. Twenty-two studies (43%) used high-likelihood (HL) samples, including children referred to community health services or for ASD-specific evaluations because of parent or practitioner concern about development,^{18,21,24,29,36,38,42,47,48,51,52,53,56,62,63} those at elevated likelihood for ASD according to another screening result,²⁶ those receiving services through an early intervention system,^23,44 those with Down syndrome,²⁷ those with a history of prematurity or low birth weight,^28,40 and those with older siblings who had received a diagnosis of ASD.⁶⁰ In addition, 5 studies administered the M-CHAT(-R/F) to a mixed sample comprising both LL and HL children.^{19,25,37,41,58} Most of these studies included predominantly LL samples, however.

Methodological Quality of Included Studies

Figure 1 shows the results of the methodological quality assessment for all published studies included in the review. There was generally low risk of bias across domains with the exception of flow and timing, for which 36 of the 51 studies (71%) showed high risk of bias, either because many studies used LL samples and did not evaluate all children screened with the M-CHAT(-R/F) or because of the longer interval between screening and evaluation in prospective case confirmation studies. Regarding concern for applicability, all studies showed low concern for patient selection and reference standard; for the index test domain, only 1 study was rated as high concern and 1 as unclear concern. See eTable 3 in Supplement 1 for domain ratings for individual studies.

Meta-analysis Results

Across all studies, M-CHAT(-R/F) sensitivity ranged from 0.22 to 1.00, with a pooled sensitivity of 0.83 (95% CI, 0.77-0.88). Specificity similarly ranged from 0.27 to 1.00, with a pooled specificity of 0.94 (95% CI, 0.89-0.97). See Table 1 for screening metrics for each identified study, Table 2 for HSROC model results, and Figure 2 for a forest plot of included studies, sorted by sensitivity.^{8,9,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,59,60,61,62,63,64} The SROC plot of the overall model is depicted in eFigure 2 in Supplement 1. Figure 3 displays a visual of sensitivity and specificity values grouped by several factors affecting the values, including strategy used for identifying missed cases, risk level, and size of the study.

Table 2. Sensitivity and Specificity of M-CHAT(-R/F) Overall and by Study Characteristics.

Characteristic	No.	Sens	Spec	DOR (95% CI)	−2LL	P value
Overall	49^a	0.83	0.94	75.40 (35.56-159.87)	NA	NA
M-CHAT version
M-CHAT	31	0.83	0.94	63.47 (25.38-158.73)	1.35	.72
M-CHAT-R	18	0.83	0.95	102.69 (27.87-378.26)	1.35	.72
Likelihood level
Low	27	0.83	0.99	334.78 (156.69-715.31)	58.28	<.001^b
High	22	0.82	0.70	10.93 (4.45-26.83)	58.28	<.001^b
Case confirmation strategy
Concurrent	40	0.86	0.93	85.84 (37.56-196.16)	17.66	.001^b
Prospective	9	0.57	0.97	38.71 (8.38-178.82)	17.66	.001^b
Sample size
<500	20	0.83	0.78	17.31 (6.48-46.21)	38.26	<.001^b
500-5000	17	0.80	0.96	88.94 (29.58-267.43)	38.26	<.001^b
>5000	12	0.85	0.99	553.41 (159.34-1922.06)	38.26	<.001^b
Follow-up
Initial only	22	0.82	0.84	24.61 (9.01-67.23)	11.29	.01^b
Follow-up	27	0.83	0.97	182.87 (71.12-470.22)	11.29	.01^b
Language
English or primarily English	26	0.78	0.85	20.19 (7.85-51.91)	23.96	<.001^b
Other	23	0.89	0.98	361.76 (145.80-897.58)	23.96	<.001^b


Abbreviations: DOR, diagnostic odds ratio; M-CHAT, Modified Checklist for Autism in Toddlers; M-CHAT-R, Modified Checklist for Autism in Toddlers, Revised; NA, not applicable; Sens, sensitivity; Spec, specificity; −2LL, −2 log likelihood difference.

^a Studies classified as “mixed” or “unknown” in any category were excluded from the analysis (n = 2).^57,58 When the overall model was run with the full set of 51 studies, sensitivity was only slightly different, specificity was identical, and DOR was slightly higher (78.71 [95% CI, 38.11-162.57]). Therefore, for comparison across the subanalyses, the common set of 49 studies is reported.

^b P < .05.
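To show how a −2 log likelihood difference maps to a P value in a likelihood ratio test, the sketch below evaluates the upper tail of a χ² distribution using only the Python standard library. The degrees of freedom are not stated in this section; the value of 3 used in the example is purely an assumption for illustration (for instance, if a 2-level covariate added 3 parameters to the model), and the function name is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    series = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                 for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


ASSUMED_DF = 3  # assumption for illustration only; not reported in the text
for minus_2ll in (1.35, 11.29):  # -2LL differences from Table 2 (version, Follow-up)
    print(minus_2ll, chi2_sf(minus_2ll, ASSUMED_DF))
```

Under that assumed 3 degrees of freedom, the 2 computed tail probabilities should land close to the .72 and .01 reported in Table 2, which gives some reassurance about the assumption but does not confirm how the authors specified the test.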

Figure 2. FN indicates false negative; FP, false positive; TN, true negative; and TP, true positive.

^aRefers to sample 2 of Srisinghasongkram et al.⁵³

Figure 3. FN indicates false negative.

Analyses testing the association of covariates (likelihood level of sample, case confirmation strategy classification, sample size, M-CHAT version, Follow-up use, and language) suggested several significant covariates (overall: DOR, 75.40 [95% CI, 35.56-159.87]) (Table 2; eFigure 2 in Supplement 1). All covariates examined except for the M-CHAT(-R/F) version (original vs revised) were statistically significant (Table 2; eFigure 3 in Supplement 1). Mixed and unknown subgroups were excluded from the models presented. When included, sensitivity was not significantly different, specificity was identical, and DOR was slightly higher (78.71 [95% CI, 38.11-162.57]). The LL group reported a DOR more than 30 times greater than that of the HL group (334.78 [95% CI, 156.69-715.31] vs 10.93 [95% CI, 4.45-26.83]) (Table 2; eFigure 4 in Supplement 1). Sensitivity was similar between the 2 groups (0.83 vs 0.82), but specificity of the LL group was higher than that of the HL group (0.99 vs 0.70). Higher DOR was achieved by the concurrent rather than the prospective case confirmation strategy (85.84 [95% CI, 37.56-196.16] vs 38.71 [95% CI, 8.38-178.82]) (Table 2; eFigure 5 in Supplement 1). Although the concurrent group achieved a higher sensitivity than the prospective group (0.86 vs 0.57), there were no large differences in specificity based on the detection strategy (0.93 vs 0.97). Diagnostic accuracy increased with sample size; the small sample size group (<500) had a DOR 5 times smaller than that of the medium sample size group (500-5000), which was, in turn, approximately 6 times smaller than that of the large sample size group (>5000). The DOR was 17.31 (95% CI, 6.48-46.21), 88.94 (95% CI, 29.58-267.43), and 553.41 (95% CI, 159.34-1922.06), respectively (Table 2; eFigure 6 in Supplement 1). Specificity increased with increasing sample size (0.78, 0.96, and 0.99, respectively), but sensitivity did not (0.83, 0.80, and 0.85, respectively). 
Higher DOR, sensitivity, and specificity were observed for use of the Follow-up compared with initial screening only (DOR, 182.87 [95% CI, 71.12-470.22] vs 24.61 [95% CI, 9.01-67.23]; sensitivity, 0.83 vs 0.82; specificity, 0.97 vs 0.84) (Table 2; eFigure 7 in Supplement 1). Higher DOR, sensitivity, and specificity were observed for studies conducted in languages other than predominantly English (DOR, 361.76 [95% CI, 145.80-897.58] vs 20.19 [95% CI, 7.85-51.91]; sensitivity, 0.89 vs 0.78; specificity, 0.98 vs 0.85) (Table 2; eFigure 8 in Supplement 1).
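The relative differences in DOR described above follow directly from the point estimates in Table 2. A minimal Python check, with hypothetical variable names, is:

```python
# DOR point estimates copied from Table 2.
dor = {
    "low_likelihood": 334.78,
    "high_likelihood": 10.93,
    "small": 17.31,    # <500
    "medium": 88.94,   # 500-5000
    "large": 553.41,   # >5000
}

ll_vs_hl = dor["low_likelihood"] / dor["high_likelihood"]
medium_vs_small = dor["medium"] / dor["small"]
large_vs_medium = dor["large"] / dor["medium"]

print(round(ll_vs_hl, 1), round(medium_vs_small, 1), round(large_vs_medium, 1))
```

These are ratios of point estimates only; the wide, overlapping CIs in Table 2 mean the fold differences themselves are imprecise.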

Systematic Review Results

Two factors were examined in an insufficient number of studies to be assessed as covariates in meta-analyses: screening age and repeated screening. We therefore provide a descriptive review of these factors here. Although M-CHAT(-R/F) was initially validated for children between 16 and 30 months of age, it has been used with children up to 48 months of age. Only 5 included studies directly compared M-CHAT(-R/F) psychometric values for younger children within the validated age range (up to 30 months) vs children older than 30 months (eTable 5 in Supplement 1). Descriptively, all 5 studies found slightly lower sensitivity for children older than 30 months compared with younger than 30 months.^{18,24,36,42,48} However, specificity differed, with 1 study reporting lower specificity with children older than 30 months,¹⁸ whereas other studies showed higher specificity for the same age group.^24,36,42,48

Only 6 of the 51 studies (12%) reported repeated screenings at 18 and 24 months, as is recommended by the AAP. Three of the repeated screening studies used concurrent false-negative case detection strategies,^22,46,61 1 study used a prospective false-negative case detection strategy,⁴¹ and 2 studies used prospective record review.^8,9 Three studies did not report enough information to allow for comparison of sensitivity for single compared with repeated screenings.^22,41,46 The remaining 3 studies demonstrated 11% to 45% higher sensitivity across repeated screenings compared with sensitivity based on a single screening, without decreasing specificity (eTable 6 in Supplement 1).

Discussion

The AAP identifies sensitivity and specificity above 70% to be acceptable for screening measures.⁶⁹ Estimates of these properties for the M-CHAT(-R/F) vary widely according to study methods and sample characteristics, and this variability affects use in both clinical and research settings. For example, as a result of 2 large prospective studies’ finding of low sensitivity for M-CHAT(-R/F),^8,9 a recent study⁷⁰ used an alternative screener that not only lacked long-term outcome data but also had been tested only with small samples, insufficient for thorough validation. Overall, across the studies identified in this systematic review and meta-analysis, sensitivity and specificity of the M-CHAT(-R/F) were found to be strong, with pooled values of 0.83 and 0.94, respectively. The variability of the estimates of sensitivity and specificity of M-CHAT(-R/F), however, highlights a need to consider factors that influence screening performance.
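To make this variability concrete, the sketch below compares the subgroup estimates from Table 2 with the 70% benchmark. It is illustrative only: the names are hypothetical, "above 70%" is read strictly, and the Table 2 values are rounded to 2 decimals, so a subgroup sitting exactly at 0.70 is listed even though its unrounded value might differ slightly.

```python
THRESHOLD = 0.70  # AAP benchmark cited in the text, read here as "strictly above"

# (sensitivity, specificity) point estimates copied from Table 2.
subgroups = {
    "M-CHAT": (0.83, 0.94),
    "M-CHAT-R": (0.83, 0.95),
    "Low likelihood": (0.83, 0.99),
    "High likelihood": (0.82, 0.70),
    "Concurrent": (0.86, 0.93),
    "Prospective": (0.57, 0.97),
    "<500": (0.83, 0.78),
    "500-5000": (0.80, 0.96),
    ">5000": (0.85, 0.99),
    "Initial only": (0.82, 0.84),
    "Follow-up": (0.83, 0.97),
    "English or primarily English": (0.78, 0.85),
    "Other": (0.89, 0.98),
}

# Collect every estimate that is not strictly above the benchmark.
not_above = []
for name, (sens, spec) in subgroups.items():
    for metric, value in (("sensitivity", sens), ("specificity", spec)):
        if not value > THRESHOLD:
            not_above.append((name, metric, value))

print(not_above)
```

With the rounded values in Table 2, only sensitivity under prospective case confirmation and specificity in high-likelihood samples fail to clear the benchmark, which matches the 2 sources of variability emphasized in the Discussion.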

Case confirmation strategies used to identify missed cases were closely associated with sensitivity estimates. Weak concurrent strategies used to detect false-negative results likely inflated sensitivity compared with rigorous false-negative strategies better equipped to identify missed cases. Prospective strategies, on the other hand, likely conflate missed children who may not have had measurable symptoms during toddlerhood (ie, children whose ASD symptoms emerged later in childhood) with children truly missed by screening (who should have been detectable), potentially because of parents’ inaccurate report or their not being willing or ready to endorse an increased likelihood of ASD behaviors during screening.⁷¹ Studies show that for some children with ASD, symptoms are subtle early in development or show a prolonged course of symptom development,⁷² consistent with the theory that although brain development may be different before birth, measurable symptoms of neurodevelopmental disorders may emerge gradually because children are expected to demonstrate more sophisticated behavior as they grow older.⁷³ Although an advantage of reviewing medical records or registries lies in obtaining information over a range of ages, this broad age range may also be associated with ascertainment differences, given expected differences in medical record content for preschoolers compared with older children. Furthermore, prospective record review often includes community diagnoses that may not be as rigorous as diagnostic procedures in research⁷⁴; therefore, some children identified through record review would not meet more stringent research classification.

Another factor associated with variability in M-CHAT(-R/F)’s specificity is the classification of study samples based on ASD likelihood. The M-CHAT(-R/F) casts a broad net for children in need of expert differential diagnosis. Therefore, it is not surprising that specificity is lower for HL samples than for LL samples because HL samples include many children with other developmental delays or co-occurring conditions. However, sensitivity (the ability to detect ASD when it is truly present) is equally high for both groups. The lower specificity (the ability to identify individuals without ASD) supports the recommendation for comprehensive evaluation to assess differential diagnoses when M-CHAT(-R/F) is used, particularly in HL samples. Similarly, because the M-CHAT(-R/F) was validated for children between 16 and 30 months of age, it is not surprising that studies found slightly lower sensitivity for those older than 30 months than for those 30 months of age or younger, although M-CHAT(-R/F) has utility for children up to 48 months of age.
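
To make the distinction between the two metrics concrete, the following minimal sketch computes sensitivity and specificity from a 2 × 2 table of screening results against diagnostic outcomes. The counts for the low-likelihood (LL) and high-likelihood (HL) samples are hypothetical and were chosen only to illustrate the pattern discussed here (similar sensitivity, lower specificity in the HL sample); they are not data from any included study, and the class name `ScreeningTable` is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningTable:
    """2 x 2 counts of screen result vs diagnostic outcome (illustrative only)."""

    true_pos: int   # screened positive, ASD confirmed
    false_neg: int  # screened negative, ASD confirmed
    false_pos: int  # screened positive, no ASD
    true_neg: int   # screened negative, no ASD

    @property
    def sensitivity(self) -> float:
        # Proportion of children with ASD who screened positive.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Proportion of children without ASD who screened negative.
        return self.true_neg / (self.true_neg + self.false_pos)


# Hypothetical counts, NOT taken from any study in this review.
ll_sample = ScreeningTable(true_pos=40, false_neg=10, false_pos=30, true_neg=920)
hl_sample = ScreeningTable(true_pos=80, false_neg=20, false_pos=40, true_neg=60)

for name, table in (("LL", ll_sample), ("HL", hl_sample)):
    print(f"{name}: sensitivity={table.sensitivity:.2f}, "
          f"specificity={table.specificity:.2f}")
```

In this toy example, the false-positive results in the HL sample stand in for children with other developmental delays, which is one way to see why a positive screen in such samples calls for comprehensive differential evaluation.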

Repeated screening is also an important factor in maximizing sensitivity. However, the results of this systematic review suggest that repeated screening is extremely underused: only 6 of the 51 studies reported systematically screening toddlers more than once. The 3 studies for which data allowed direct comparison suggested a large increase in sensitivity with repeated vs single screening, without decreasing specificity. Similarly, other studies have shown that repeated screenings for ASD at 18 and 24 months of age increase the likelihood of identifying children missed by earlier screening,^75,76 and rescreening after 18 months of age detects children with ASD who initially screened negative.⁷⁷ In addition to symptom emergence detected among children at later ages, parents’ limited knowledge of typical developmental milestones, or of ASD-specific symptoms, may be associated with negative screening results at 18 months. For example, some children demonstrate symptoms of ASD at 18 months, even though they screen negative for ASD at that age.⁷⁸ These findings support the AAP’s recommendation for repeated screening at 18- and 24-month well-child visits; however, most studies did not adopt this recommendation in their study designs.
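
The arithmetic behind the sensitivity gain from repeated screening can be sketched as follows. The example assumes a rule in which a child counts as detected if the screen is positive at either the 18- or the 24-month visit, and it uses 10 hypothetical children later diagnosed with ASD; the records and variable names are illustrative assumptions, not data from the included studies, and specificity is not modeled here.

```python
# Hypothetical screening results for 10 children later diagnosed with ASD.
# Each tuple is (screen_positive_at_18_months, screen_positive_at_24_months).
asd_children = (
    [(True, True)] * 5      # positive at both visits
    + [(True, False)] * 1   # positive at 18 months only
    + [(False, True)] * 2   # negative at 18 months, detected at 24 months
    + [(False, False)] * 2  # missed at both visits
)


def sensitivity(detected_flags):
    """Proportion of children with ASD who were flagged by the screen."""
    flags = list(detected_flags)
    return sum(flags) / len(flags)


# Single screening: only the 18-month result is considered.
single_visit_sensitivity = sensitivity(pos18 for pos18, _ in asd_children)

# Repeated screening: a positive result at either visit counts as detection.
repeated_sensitivity = sensitivity(
    pos18 or pos24 for pos18, pos24 in asd_children
)

print(f"Single screening (18 months): {single_visit_sensitivity:.2f}")
print(f"Repeated screening (18 and 24 months): {repeated_sensitivity:.2f}")
```

Under these assumed records, the children who screened negative at 18 months but positive at 24 months account for the entire gain in sensitivity.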

The studies included in the systematic review and meta-analysis differed in many aspects, including sample size, version of M-CHAT (original vs revised), whether the structured Follow-up was used appropriately for children whose initial scores were in the medium likelihood of ASD range, and language of M-CHAT(-R/F). Larger sample sizes were associated with higher DORs, possibly because of improved statistical stability in larger samples or because more resources were available in larger studies. Use of the Follow-up portion of the M-CHAT(-R/F) significantly improved the tool’s performance, with greatly increased specificity and a more than 7-fold increase in DOR, consistent with findings from the original validation studies of the M-CHAT(-R/F).^22,46 Even though use of the Follow-up does not change sensitivity (nor would it be expected to), it greatly reduces false-positive rates, which in turn reduces the burden on tertiary care clinics that receive referrals. This finding highlights the importance and benefit of the use of the Follow-up portion of the M-CHAT(-R/F) during screening, even though only slightly more than half of the identified studies administered the Follow-up consistently. The structured Follow-up clarifies parents’ endorsements during initial screening, giving them the opportunity to explain behaviors beyond dichotomous response options, which improves accuracy. The improved performance of the M-CHAT in non-English languages is notable, but it is unclear whether other factors may be associated with this finding. For example, non-English administration of M-CHAT(-R/F) appears to occur more often in HL studies. These potential interactions between variables need to be explored in future studies. In addition to the factors explored in the present meta-analysis, factors to explore in future work include parents’ prior knowledge of ASD, which may be more common in HL compared with LL samples, or parent education, which may account for some of the findings.
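
The link between the Follow-up's effect on specificity and the change in DOR can be sketched with the standard definition DOR = [sensitivity / (1 − sensitivity)] × [specificity / (1 − specificity)]. When sensitivity is unchanged, the fold change in DOR equals the ratio of the specificity odds. In the sketch below, the specificity values (0.80 without and 0.97 with the Follow-up) are hypothetical placeholders rather than estimates from this meta-analysis; only the sensitivity of 0.83 is the pooled value reported in this review, and the function names are assumptions for illustration.

```python
def odds(p: float) -> float:
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = odds(sensitivity) * odds(specificity)."""
    return odds(sensitivity) * odds(specificity)


SENSITIVITY = 0.83          # pooled estimate reported in this review
SPEC_INITIAL_ONLY = 0.80    # hypothetical, for illustration only
SPEC_WITH_FOLLOW_UP = 0.97  # hypothetical, for illustration only

dor_initial = diagnostic_odds_ratio(SENSITIVITY, SPEC_INITIAL_ONLY)
dor_follow_up = diagnostic_odds_ratio(SENSITIVITY, SPEC_WITH_FOLLOW_UP)
fold_change = dor_follow_up / dor_initial

# With sensitivity held fixed, the fold change depends only on specificity.
specificity_odds_ratio = odds(SPEC_WITH_FOLLOW_UP) / odds(SPEC_INITIAL_ONLY)

print(f"DOR, initial screen only: {dor_initial:.1f}")
print(f"DOR, with Follow-up: {dor_follow_up:.1f}")
print(f"Fold change in DOR: {fold_change:.2f}")
```

Because the sensitivity term cancels in the ratio, a tool that leaves sensitivity untouched can still show a several-fold change in DOR purely through fewer false-positive results.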

Limitations

A limitation of this study is that methodological issues were identified in a majority of the included studies, which could have biased reported accuracy measures. In particular, many studies with LL samples did not evaluate every child who was screened with the M-CHAT(-R/F), likely because of obvious feasibility challenges. Children who screened negative and were not evaluated were therefore presumed to have true-negative results for analyses, but it is possible that some cases were missed. Similarly, variation in the reporting of ASD diagnostic criteria and in the type of clinicians who performed assessments was associated with between-study heterogeneity. In addition, the present systematic review was limited to studies published in English-language peer-reviewed journals.
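
The effect of presuming that unevaluated screen-negative children are true negatives can be illustrated with a small sketch. All counts below are hypothetical (a sample of 10 000 screened toddlers in which 20 children with ASD screened negative and were never evaluated) and are not drawn from any included study; the function name `sensitivity_specificity` is an assumption for illustration.

```python
def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int):
    """Return (sensitivity, specificity) from 2 x 2 counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical low-likelihood sample of 10,000 screened toddlers.
true_pos = 80        # screen positive, evaluated, ASD confirmed
false_pos = 120      # screen positive, evaluated, no ASD
screen_neg = 9_800   # screen negative, not evaluated
missed_cases = 20    # hypothetical ASD cases hidden among screen negatives

# Analysis as often reported: every unevaluated screen negative is a true negative.
presumed_sens, presumed_spec = sensitivity_specificity(
    tp=true_pos, fp=false_pos, fn=0, tn=screen_neg
)

# Analysis if the missed cases had been ascertained.
actual_sens, actual_spec = sensitivity_specificity(
    tp=true_pos, fp=false_pos, fn=missed_cases, tn=screen_neg - missed_cases
)

print(f"Presumed: sensitivity={presumed_sens:.2f}, specificity={presumed_spec:.4f}")
print(f"Actual:   sensitivity={actual_sens:.2f}, specificity={actual_spec:.4f}")
```

Under these assumed counts, sensitivity is overstated while specificity is nearly unchanged, which is consistent with the concern that incomplete evaluation of screen-negative children mainly biases sensitivity estimates.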

Conclusions

When the M-CHAT(-R/F)’s utility in detecting toddlers at greater likelihood for ASD is evaluated, it is critical to consider study strategies, age of the children, and ASD likelihood status of the children in addition to other factors that tend to vary widely between the existing studies. Although the AAP recommends screening at both 18- and 24-month visits and studies emphasize the added value of repeated screenings, very few studies consistently rescreened participants, highlighting the need for continued effort in dissemination of best practice screening protocols. The results of this systematic review and meta-analysis illuminate important clinical implications for pediatric physicians. Even though no single measure will identify all children at increased likelihood for ASD, M-CHAT(-R/F) shows strong overall sensitivity and specificity, and research with the tool indicates that screening early and at multiple time points is critical to identify children at increased likelihood for ASD who are in need of access to ASD-specific early intervention services.

Overall, the results of this systematic review and meta-analysis highlight strong sensitivity and specificity for the M-CHAT(-R/F) across the 51 study samples, which is critical information for clinicians, researchers, and policy makers alike. However, the wide variability, ranging from poor to excellent screening metrics, highlights differences in how the screener is used, which should be considered when studies are designed. Critically, although the version of M-CHAT (original vs revised) does not significantly affect sensitivity and specificity, use of the Follow-up portion of the M-CHAT(-R/F) significantly reduces false-positive rates, which in turn reduces the burden on diagnostic and intervention systems. Other guidance based on this study’s findings includes emphasizing the importance of referral for comprehensive evaluations to discern symptoms of autism from symptoms observed in other developmental disorders, particularly when M-CHAT(-R/F) is used with HL populations. In addition, the difference in M-CHAT(-R/F)’s sensitivity in concurrent vs prospective studies indicates the need to account for timing and diagnostic rigor when designing studies.

Supplement 1.

eMethods. Diagnostic Accuracy of the M-CHAT(-R/F)

eTable 1. Database Search Terms

eTable 2. QUADAS-2 Description and Adapted Signaling Questions

eTable 3. Quality Assessment of Studies Included in the Systematic Review

eTable 4. Additional Study Characteristics and Psychometric Properties for M-CHAT(-R/F)

eTable 5. Sensitivity and Specificity for Younger and Older Samples

eTable 6. Sensitivity and Specificity for Single and Repeated Screening

eFigure 1. Study Selection Flow Chart Following PRISMA Guidelines

eFigure 2. Overall SROC of M-CHAT(-R/F) (n = 49 Studies)

eFigure 3. SROC Plot of M-CHAT(-R/F) by M-CHAT Version (M-CHAT n = 31, M-CHAT-R n = 18)

eFigure 4. SROC Plot of M-CHAT(-R/F) by Likelihood Level of Sample (Low Likelihood n = 27, High Likelihood n = 22)

eFigure 5. SROC Plot of M-CHAT(-R/F) by Case Confirmation Strategy (Concurrent n = 40, Prospective n = 9)

eFigure 6. SROC Plot of M-CHAT(-R/F) by Sample Size (Small n = 20, Medium n = 17, Large n = 12)

eFigure 7. SROC Plot of M-CHAT(-R/F) With Follow-up vs Initial Only (Initial n = 22, Follow-up n = 27)

eFigure 8. SROC Plot of M-CHAT(-R/F) by Language (English/Primarily English n = 26, Other Language n = 23)

Supplement 2.

Data Sharing Statement

References

1. Ozonoff S, Young GS, Landa RJ, et al. Diagnostic stability in young children at risk for autism spectrum disorder: a Baby Siblings Research Consortium study. J Child Psychol Psychiatry. 2015;56(9):988-998. doi: 10.1111/jcpp.12421
2. Pierce K, Gazestani VH, Bacon E, et al. Evaluation of the diagnostic stability of the early autism spectrum disorder phenotype in the general population starting at 12 months. JAMA Pediatr. 2019;173(6):578-587. doi: 10.1001/jamapediatrics.2019.0624
3. Shaw KA, Maenner MJ, Baio J, et al. Early identification of autism spectrum disorder among children aged 4 years—Early Autism and Developmental Disabilities Monitoring Network, six sites, United States, 2016. MMWR Surveill Summ. 2020;69(3):1-11. doi: 10.15585/mmwr.ss6903a1
4. Elder JH, Kreider CM, Brasher SN, Ansell M. Clinical impact of early diagnosis of autism on the prognosis and parent-child relationships. Psychol Res Behav Manag. 2017;10:283-292. doi: 10.2147/PRBM.S117499
5. Siu AL, Bibbins-Domingo K, Grossman DC, et al.; US Preventive Services Task Force (USPSTF). Screening for depression in adults: US Preventive Services Task Force recommendation statement. JAMA. 2016;315(4):380-387. doi: 10.1001/jama.2015.18392
6. Hyman SL, Levy SE, Myers SM, et al. Identification, evaluation, and management of children with autism spectrum disorder. Pediatrics. 2020;145(1):e20193447. doi: 10.1542/peds.2019-3447
7. Lipkin PH, Macias MM, Baer Chen B, et al. Trends in pediatricians’ developmental screening: 2002–2016. Pediatrics. 2020;145(4):e20190851. doi: 10.1542/peds.2019-0851
8. Carbone PS, Campbell K, Wilkes J, et al. Primary care autism screening and later autism diagnosis. Pediatrics. 2020;146(2):e20192314. doi: 10.1542/peds.2019-2314
9. Guthrie W, Wallis K, Bennett A, et al. Accuracy of autism screening in a large pediatric network. Pediatrics. 2019;144(4):e20183963. doi: 10.1542/peds.2018-3963
10. Robins DL, Fein D, Barton ML, Green JA. The Modified Checklist for Autism in Toddlers: an initial study investigating the early detection of autism and pervasive developmental disorders. J Autism Dev Disord. 2001;31(2):131-144. doi: 10.1023/A:1010738829569
11. Robins DL, Fein D, Barton M. Modified Checklist for Autism in Toddlers–Revised With Follow-up (M-CHAT-R/F). Lineagen; 2009.
12. Levy SE, Wolfe A, Coury D, et al. Screening tools for autism spectrum disorder in primary care: a systematic evidence review. Pediatrics. 2020;145(suppl 1):S47-S59. doi: 10.1542/peds.2019-1895H
13. Petrocchi S, Levante A, Lecciso F. Systematic review of level 1 and level 2 screening tools for autism spectrum disorders in toddlers. Brain Sci. 2020;10(3):180. doi: 10.3390/brainsci10030180
14. Sánchez-García AB, Galindo-Villardón P, Nieto-Librero AB, Martín-Rodero H, Robins DL. Toddler screening for autism spectrum disorder: a meta-analysis of diagnostic accuracy. J Autism Dev Disord. 2019;49(5):1837-1852. doi: 10.1007/s10803-018-03865-2
15. Yuen T, Penner M, Carter MT, Szatmari P, Ungar WJ. Assessing the accuracy of the Modified Checklist for Autism in Toddlers: a systematic review and meta-analysis. Dev Med Child Neurol. 2018;60(11):1093-1100. doi: 10.1111/dmcn.13964
16. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-analyses: the PRISMA statement. Int J Surg. 2010;8(5):336-341. doi: 10.1016/j.ijsu.2010.02.007
17. Baduel S, Guillon Q, Afzali MH, Foudon N, Kruck J, Rogé B. The French version of the Modified-Checklist for Autism in Toddlers (M-CHAT): a validation study on a French sample of 24 month-old children. J Autism Dev Disord. 2017;47(2):297-304. doi: 10.1007/s10803-016-2950-y
18. Beacham C, Reid M, Bradshaw J, et al. Screening for autism spectrum disorder: profiles of children who are missed. J Dev Behav Pediatr. 2018;39(9):673-682. doi: 10.1097/DBP.0000000000000607
19. Canal-Bedia R, García-Primo P, Martín-Cilleros MV, et al. Modified Checklist for Autism in Toddlers: cross-cultural adaptation and validation in Spain. J Autism Dev Disord. 2011;41(10):1342-1351. doi: 10.1007/s10803-010-1163-z
20. Chang Z, Di Martino JM, Aiello R, et al. Computational methods to measure patterns of gaze in toddlers with autism spectrum disorder. JAMA Pediatr. 2021;175(8):827-836. doi: 10.1001/jamapediatrics.2021.0530
21. Charman T, Baird G, Simonoff E, et al. Testing two screening instruments for autism spectrum disorder in UK community child health services. Dev Med Child Neurol. 2016;58(4):369-375. doi: 10.1111/dmcn.12874
22. Chlebowski C, Robins DL, Barton ML, Fein D. Large-scale use of the modified checklist for autism in low-risk toddlers. Pediatrics. 2013;131(4):e1121-e1127. doi: 10.1542/peds.2012-1525
23. Choueiri R, Lindenbaum A, Ravi M, Robsky W, Flahive J, Garrison W. Improving early identification and access to diagnosis of autism spectrum disorder in toddlers in a culturally diverse community with the Rapid Interactive Screening Test for Autism in Toddlers. J Autism Dev Disord. 2021;51(11):3937-3945. doi: 10.1007/s10803-020-04851-3
24. Christopher K, Bishop S, Carpenter LA, Warren Z, Kanne S. The implications of parent-reported emotional and behavioral problems on the Modified Checklist for Autism in Toddlers. J Autism Dev Disord. 2021;51(3):884-891. doi: 10.1007/s10803-020-04469-5
25. Coelho-Medeiros ME, Bronstein J, Aedo K, et al. M-CHAT-R/F validation as a screening tool for early detection in children with autism spectrum disorder. Article in Spanish. Rev Chil Pediatr. 2019;90(5):492-499. doi: 10.32641/rchped.v90i5.703
26. Dereu M, Raymaekers R, Warreyn P, Schietecatte I, Meirsschaut M, Roeyers H. Can child care workers contribute to the early detection of autism spectrum disorders? a comparison between screening instruments with child care workers versus parents as informants. J Autism Dev Disord. 2012;42(5):781-796. doi: 10.1007/s10803-011-1307-9
27. DiGuiseppi C, Hepburn S, Davis JM, et al. Screening for autism spectrum disorders in children with Down syndrome: population prevalence and screening test characteristics. J Dev Behav Pediatr. 2010;31(3):181-191. doi: 10.1097/DBP.0b013e3181d5aa6d
28. Dudova I, Markova D, Kasparova M, et al. Comparison of three screening tests for autism in preterm children with birth weights less than 1,500 grams. Neuropsychiatr Dis Treat. 2014;10:2201-2208. doi: 10.2147/NDT.S72921
29. Eaves LC, Wingert H, Ho HH. Screening for autism: agreement with diagnosis. Autism. 2006;10(3):229-242. doi: 10.1177/1362361306063288
30. Guo C, Luo M, Wang X, et al. Reliability and validity of the Chinese version of Modified Checklist for Autism in Toddlers, Revised, With Follow-up (M-CHAT-R/F). J Autism Dev Disord. 2019;49(1):185-196. doi: 10.1007/s10803-018-3682-y
31. Harris JF, Coffield CN, Janvier YM, Mandell D, Cidav Z. Validation of the developmental check-in tool for low-literacy autism screening. Pediatrics. 2021;147(1):e20193659. doi: 10.1542/peds.2019-3659
32. Hoang VM, Le TV, Chu TTQ, et al. Prevalence of autism spectrum disorders and their relation to selected socio-demographic factors among children aged 18-30 months in northern Vietnam, 2017. Int J Ment Health Syst. 2019;13(1):29. doi: 10.1186/s13033-019-0285-8
33. Inada N, Koyama T, Inokuchi E, Kuroda M, Kamio Y. Reliability and validity of the Japanese version of the Modified Checklist for Autism in Toddlers (M-CHAT). Res Autism Spectr Disord. 2011;5(1):330-336. doi: 10.1016/j.rasd.2010.04.016
34. Jonsdottir SL, Saemundsen E, Jonsson BG, Rafnsson V. Validation of the Modified Checklist for Autism in Toddlers, Revised With Follow-up in a population sample of 30-month-old children in Iceland: a prospective approach. J Autism Dev Disord. 2022;52(4):1507-1522. doi: 10.1007/s10803-021-05053-1
35. Kamio Y, Inada N, Koyama T, Inokuchi E, Tsuchiya K, Kuroda M. Effectiveness of using the Modified Checklist for Autism in Toddlers in two-stage screening of autism spectrum disorder at the 18-month health check-up in Japan. J Autism Dev Disord. 2014;44(1):194-203. doi: 10.1007/s10803-013-1864-1
36. Kanne SM, Carpenter LA, Warren Z. Screening in toddlers and preschoolers at risk for autism spectrum disorder: evaluating a novel mobile-health screening tool. Autism Res. 2018;11(7):1038-1049. doi: 10.1002/aur.1959
37. Kara B, Mukaddes NM, Altınkaya I, Güntepe D, Gökçay G, Özmen M. Using the Modified Checklist for Autism in Toddlers in a well-child clinic in Turkey: adapting the screening method based on culture and setting. Autism. 2014;18(3):331-338. doi: 10.1177/1362361312467864
38. Keehn RM, Tang Q, Swigonski N, Ciccarelli M. Associations among referral concerns, screening results, and diagnostic outcomes of young children assessed in a statewide early autism evaluation network. J Pediatr. 2021;233:74-81. doi: 10.1016/j.jpeds.2021.02.063
39. Kerub O, Haas EJ, Meiri G, Davidovitch N, Menashe I. A comparison between two screening approaches for ASD among toddlers in Israel. J Autism Dev Disord. 2020;50(5):1553-1560. doi: 10.1007/s10803-018-3711-x
40. Kim SH, Joseph RM, Frazier JA, et al. Predictive validity of the Modified Checklist for Autism in Toddlers (M-CHAT) born very preterm. J Pediatr. 2016;178:101-107. doi: 10.1016/j.jpeds.2016.07.052
41. Kleinman JM, Robins DL, Ventola PE, et al. The Modified Checklist for Autism in Toddlers: a follow-up study investigating the early detection of autism spectrum disorders. J Autism Dev Disord. 2008;38(5):827-839. doi: 10.1007/s10803-007-0450-9
42. Koh HC, Lim SH, Chan GJ, et al. The clinical utility of the Modified Checklist for Autism in Toddlers with high risk 18–48 month old children in Singapore. J Autism Dev Disord. 2014;44(2):405-416. doi: 10.1007/s10803-013-1880-1
43. Magán-Maganto M, Canal-Bedia R, Hernández-Fabián A, et al. Spanish cultural validation of the Modified Checklist for Autism in Toddlers, Revised. J Autism Dev Disord. 2020;50(7):2412-2423. doi: 10.1007/s10803-018-3777-5
44. Matson JL, Kozlowski AM, Fitzgerald ME, Sipes M. True versus false positives and negatives on the Modified Checklist for Autism in Toddlers. Res Autism Spectr Disord. 2013;7(1):17-22. doi: 10.1016/j.rasd.2012.02.011
45. Oner O, Munir KM. Modified Checklist for Autism in Toddlers Revised (MCHAT-R/F) in an urban metropolitan sample of young children in Turkey. J Autism Dev Disord. 2020;50(9):3312-3319. doi: 10.1007/s10803-019-04160-4
46. Robins DL, Casagrande K, Barton M, Chen CMA, Dumont-Mathieu T, Fein D. Validation of the Modified Checklist for Autism in Toddlers, Revised With Follow-up (M-CHAT-R/F). Pediatrics. 2014;133(1):37-45. doi: 10.1542/peds.2013-1813
47. Salim H, Soetjiningsih S, Windiani IGAT, Widiana IGR. Validation of the Indonesian version of Modified Checklist for Autism in Toddlers: a diagnostic study. Paediatr Indones. 2020;60(3):160-166. doi: 10.14238/pi60.3.2020.160-6
48. Salisbury LA, Nyce JD, Hannum CD, Sheldrick RC, Perrin EC. Sensitivity and specificity of 2 autism screeners among referred children between 16 and 48 months of age. J Dev Behav Pediatr. 2018;39(3):254-258. doi: 10.1097/DBP.0000000000000537
49. Samadi SA, McConkey R. Screening for autism in Iranian preschoolers: contrasting M-CHAT and a scale developed in Iran. J Autism Dev Disord. 2015;45(9):2908-2916. doi: 10.1007/s10803-015-2454-1
50. Schjølberg S, Shic F, Volkmar FR, et al. What are we optimizing for in autism screening? examination of algorithmic changes in the M-CHAT. Autism Res. 2022;15(2):296-304. doi: 10.1002/aur.2643
51. Smith NJ, Sheldrick RC, Perrin EC. An abbreviated screening instrument for autism spectrum disorders. Infant Ment Health J. 2013;34(2):149-155. doi: 10.1002/imhj.21356
52. Snow AV, Lecavalier L. Sensitivity and specificity of the Modified Checklist for Autism in Toddlers and the Social Communication Questionnaire in preschoolers suspected of having pervasive developmental disorders. Autism. 2008;12(6):627-644. doi: 10.1177/1362361308097116
53. Srisinghasongkram P, Pruksananonda C, Chonchaiya W. Two-step screening of the Modified Checklist for Autism in Toddlers in Thai children with language delay and typically developing children. J Autism Dev Disord. 2016;46(10):3317-3329. doi: 10.1007/s10803-016-2876-4
54. Sturner R, Howard B, Bergmann P, et al. Autism screening with online decision support by primary care pediatricians aided by M-CHAT/F. Pediatrics. 2016;138(3):e20153036. doi: 10.1542/peds.2015-3036
55. Sturner R, Howard B, Bergmann P, et al. Autism screening at 18 months of age: a comparison of the Q-CHAT-10 and M-CHAT screeners. Mol Autism. 2022;13(1):2. doi: 10.1186/s13229-021-00480-4
56. Taylor CM, Vehorn A, Noble H, Weitlauf AS, Warren ZE. Brief report: can metrics of reporting bias enhance early autism screening measures? J Autism Dev Disord. 2014;44(9):2375-2380. doi: 10.1007/s10803-014-2099-5
57. Toh TH, Tan VWY, Lau PST, Kiyu A. Accuracy of Modified Checklist for Autism in Toddlers (M-CHAT) in detecting autism and other developmental disorders in community clinics. J Autism Dev Disord. 2018;48(1):28-35. doi: 10.1007/s10803-017-3287-x
58. Tsai JM, Lu L, Jeng SF, et al. Validation of the Modified Checklist for Autism in Toddlers, Revised With Follow-up in Taiwanese toddlers. Res Dev Disabil. 2019;85:205-216. doi: 10.1016/j.ridd.2018.11.011
59. Thi Vui L, Duc DM, Thuy Quynh N, et al. Early screening and diagnosis of autism spectrum disorders in Vietnam: a population-based cross-sectional survey. J Public Health Res. 2022;11(2):2460. doi: 10.4081/jphr.2021.2460
60. Weitlauf AS, Vehorn AC, Stone WL, Fein D, Warren ZE. Using the M-CHAT-R/F to identify developmental concerns in a high-risk 18-month-old sibling sample. J Dev Behav Pediatr. 2015;36(7):497-502. doi: 10.1097/DBP.0000000000000194
61. Wieckowski AT, Hamner T, Nanovic S, et al. Early and repeated screening detects autism spectrum disorder. J Pediatr. 2021;234:227-235. doi: 10.1016/j.jpeds.2021.03.009
62. Windiani I, Soetjiningsih S, Adnyana I, Lestari KA. Indonesian Modified Checklist for Autism in Toddler, Revised With Follow-up (M-CHAT-R/F) for autism screening in children at Sanglah General Hospital, Bali-Indonesia. Bali Med J. 2016;5(2):133. doi: 10.15562/bmj.v5i2.240
63. Wong YS, Yang CC, Stewart L, Chiang CH, Wu CC, Iao LS. Use of the Chinese version Modified Checklist for Autism in Toddlers in a high-risk sample in Taiwan. Res Autism Spectr Disord. 2018;49:56-64. doi: 10.1016/j.rasd.2018.01.010
64. Zhang Y, Zhou Z, Xu Q, et al. Screening for autism spectrum disorder in toddlers during the 18- and 24-month well-child visits. Front Psychiatry. 2022;13:879625. doi: 10.3389/fpsyt.2022.879625
65. Whiting PF, Rutjes AW, Westwood ME, et al.; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529-536. doi: 10.7326/0003-4819-155-8-201110180-00009
66. Rutter CM, Gatsonis CA. Regression methods for meta-analysis of diagnostic test data. Acad Radiol. 1995;2(suppl 1):S48-S56.
67. Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med. 2001;20(19):2865-2884. doi: 10.1002/sim.942
68. Takwoingi Y, Deeks J. MetaDAS: a SAS macro for meta-analysis of diagnostic accuracy studies: quick reference and worked example. Accessed October 1, 2022. https://methods.cochrane.org/sdt/sites/methods.cochrane.org.sdt/files/uploads/MetaDAS%20Quick%20Reference%20v1.3%20May%202012.pdf
69. Council on Children With Disabilities; Section on Developmental Behavioral Pediatrics; Bright Futures Steering Committee; Medical Home Initiatives for Children With Special Needs Project Advisory Committee. Identifying infants and young children with developmental disorders in the medical home: an algorithm for developmental surveillance and screening. Pediatrics. 2006;118(1):405-420. doi: 10.1542/peds.2006-1231
70. Campbell K, Carbone PS, Liu D, Stipelman CH. Improving autism screening and referrals with electronic support and evaluations in primary care. Pediatrics. 2021;147(3):e20201609. doi: 10.1542/peds.2020-1609
71. Robins DL. How do we determine the utility of screening tools? Autism. 2020;24(2):271-273. doi: 10.1177/1362361319894170
72. Ozonoff S, Young GS, Brian J, et al. Diagnosis of autism spectrum disorder after age 5 in children evaluated longitudinally since infancy. J Am Acad Child Adolesc Psychiatry. 2018;57(11):849-857. doi: 10.1016/j.jaac.2018.06.022
73. Dennis M, Spiegler BJ, Simic N, et al. Functional plasticity in childhood brain disorders: when, what, how, and whom to assess. Neuropsychol Rev. 2014;24(4):389-408. doi: 10.1007/s11065-014-9261-x
74. Hausman-Kedem M, Kosofsky BE, Ross G, et al. Accuracy of reported community diagnosis of autism spectrum disorder. J Psychopathol Behav Assess. 2018;40(3):367-375. doi: 10.1007/s10862-018-9642-1
75. Barton ML, Dumont-Mathieu T, Fein D. Screening young children for autism spectrum disorders in primary practice. J Autism Dev Disord. 2012;42(6):1165-1174. doi: 10.1007/s10803-011-1343-5
76. Crais ER, Watson LR. Challenges and opportunities in early identification and intervention for children at-risk for autism spectrum disorders. Int J Speech Lang Pathol. 2014;16(1):23-29. doi: 10.3109/17549507.2013.862860
77. Dai YG, Miller LE, Ramsey RK, Robins DL, Fein DA, Dumont-Mathieu T. Incremental utility of 24-month autism spectrum disorder screening after negative 18-month screening. J Autism Dev Disord. 2020;50(6):2030-2040. doi: 10.1007/s10803-019-03959-5
78. Øien RA, Schjølberg S, Volkmar FR, et al. Clinical features of children with autism who passed 18-month screening. Pediatrics. 2018;141(6):e20173596. doi: 10.1542/peds.2017-3596
