Author manuscript; available in PMC 2024 Jan 1. Published in final edited form as: J Child Psychol Psychiatry. 2022 Aug 14;64(1):156–166. doi: 10.1111/jcpp.13681

Complexity Analysis of Head Movements in Autistic Toddlers

Pradeep Raj Krishnappa Babu 1, J Matias Di Martino 1, Zhuoqing Chang 1, Sam Perochon 1,2, Rachel Aiello 3,4, Kimberly LH Carpenter 3,4, Scott Compton 3, Naomi Davis 3, Lauren Franz 3,4,6, Steven Espinosa 5, Jacqueline Flowers 3,4, Geraldine Dawson 3,4,*, Guillermo Sapiro 1,7,*
PMCID: PMC9771883  NIHMSID: NIHMS1828843  PMID: 35965431

Abstract

Background:

Early differences in sensorimotor functioning have been documented in young autistic children and infants who are later diagnosed with autism. Previous research has demonstrated that autistic toddlers exhibit more frequent head movement when viewing dynamic audiovisual stimuli, compared to neurotypical toddlers. To further explore this behavioral characteristic, in this study, computer vision (CV) analysis was used to measure several aspects of head movement dynamics of autistic and neurotypical toddlers while they watched a set of brief movies with social and nonsocial content presented on a tablet.

Methods:

Data were collected from 457 toddlers, 17–36 months old, during their well-child visit to four pediatric primary care clinics. Forty-one toddlers were subsequently diagnosed with autism. An application (app) displayed several brief movies on a tablet, and the toddlers watched these movies while sitting on their caregiver’s lap. The front-facing camera in the tablet recorded the toddlers’ behavioral responses. CV was used to measure the participants’ head movement rate, movement acceleration and complexity using multiscale entropy.

Results:

Autistic toddlers exhibited significantly higher rate, acceleration and complexity in their head movements while watching the movies compared to neurotypical toddlers, regardless of the type of movie content (social versus nonsocial). The combined features of head movement acceleration and complexity reliably distinguished the autistic and neurotypical toddlers.

Conclusions:

Autistic toddlers exhibit differences in their head movement dynamics when viewing audiovisual stimuli. Higher complexity of their head movements suggests that their movements were less predictable and less stable compared to neurotypical toddlers. CV offers a scalable means of detecting subtle differences in head movement dynamics, which may be helpful in identifying early behaviors associated with autism and providing insight into the nature of sensorimotor differences associated with autism.

Keywords: autism, head movements, computer vision, complexity analysis


Autism is characterized by differences in social communication and the presence of restrictive and repetitive behaviors (American Psychiatric Association, 2014). In addition to the presence of motor stereotypies, other motor differences often associated with autism include impairments in fine and gross motor skills, motor planning, and motor coordination (Bhat, 2021; Flanagan, Landa, Bhat, & Bauman, 2012; Fournier, Hass, Naik, Lodha, & Cauraugh, 2010; Melo et al., 2020). Studies based on home videos of infants who were later diagnosed with autism reported asymmetry in body movements (Baranek, 1999; Esposito & Venuti, 2009; Teitelbaum, Teitelbaum, Nye, Fryman, & Maurer, 1998). Detailed motor assessments have documented difficulties in gait and balance stability, postural control, movement accuracy, manual dexterity, and praxis among autistic individuals (Chang, Wade, Stoffregen, Hsu, & Pan, 2010; Minshew, Sung, Jones, & Furman, 2004; Molloy, Dietrich, & Bhattacharya, 2003; Wilson, Enticott, & Rinehart, 2018; Wilson, McCracken, Rinehart, & Jeste, 2018).

Recent research utilizing computer vision analysis to measure differences in movement patterns has documented differences in patterns of head movement dynamics while watching dynamic audiovisual stimuli among autistic children. Martin and colleagues (Martin et al., 2018) examined differences in head movement displacement and velocity in 2.5 to 6.5-year-old children with and without a diagnosis of autism while they watched a video of social and nonsocial stimuli. Head movement differences between the autistic and neurotypical children were found in the lateral (yaw and roll) but not vertical (pitch) movement and were specific to periods when children were watching social videos. These authors suggested that the autistic children may use head movements to modulate their perception of social scenes. Zhao et al. quantified three-dimensional head movements in 6–13-year-old children with and without autism while they engaged in a conversation with an adult (Zhao et al., 2021). They found that the autistic children showed differences in their head movement dynamics not explained by whether they were fixating on the adult. Dawson et al. found that toddlers who were later diagnosed with autism exhibited a significantly higher rate of head movement while watching brief movies as compared to neurotypical toddlers (Dawson et al., 2018).

In the current study, we extended our earlier work on head movements in autistic toddlers (Dawson et al., 2018) in two ways. First, we sought to replicate our earlier findings on head movement rate in a significantly larger sample, using a similar but redesigned set of movies with social versus nonsocial content. Second, we expanded our analysis by computing not only the head movement rate, but also the acceleration and the complexity of the head movement time-series. The acceleration provides an estimate of changes in head movement rate (i.e., velocity), while the complexity estimate reflects the predictability and stability of head movements (Costa, Goldberger, & Peng, 2002). We used multiscale entropy (MSE) to quantitatively assess the complexity, or predictability, of head movement dynamics (Costa et al., 2002). This metric quantifies the regularity of a one-dimensional time-series at multiple scales (Costa et al., 2002; Zhao et al., 2021).

We hypothesized that, compared to neurotypical toddlers, autistic toddlers would exhibit higher head movement rate, acceleration, and increased complexity (less predictability). We also examined whether differences in head movement measures were more pronounced when autistic children watched movies with high levels of social content. Finally, we used machine learning classification analyses to determine whether these measures can be integrated to distinguish autistic and neurotypical toddlers.

Methods

Participants

Participants were 457 toddlers, 17–36 months of age, who were recruited from four pediatric primary care clinics during a well-child checkup. Forty-one toddlers were subsequently diagnosed with autism spectrum disorder (ASD) based on DSM-5 criteria. Inclusion criteria were: (i) age 16–38 months; (ii) not ill at the time of visit; and (iii) caregiver’s primary language at home was English or Spanish. Exclusion criteria were: (i) known hearing or vision impairments; (ii) the child was too upset during the visit; (iii) the caregiver expressed no interest or did not have enough time; (iv) the child was not able to complete the study procedures (e.g., the child would not stay on their caregiver’s lap, or the app or device failed to upload data), or clinical information was missing; and/or (v) presence of a significant sensory or motor impairment that precluded the child from seeing the movies and/or sitting upright. Table 1 shows the participants’ demographic characteristics.

Table 1: Demographic characteristics

Values are N (%) unless otherwise noted.

                                              Neurotypical (N=416)       Autistic (N=41)
Age in months
  Mean (SD, Range)                            20.59 (3.18, 17.2–32.3)a   24.38 (4.73, 17.9–36.8)a
Sex
  Female                                      207 (49.76%)b              12 (29.26%)b
  Male                                        209 (50.24%)b              29 (70.73%)b
Race
  American Indian/Alaskan Native              1 (0.24%)                  3 (7.32%)
  Asian                                       6 (1.44%)                  1 (2.44%)
  Black or African American                   43 (10.33%)                6 (14.63%)
  Native Hawaiian or Other Pacific Islander   0 (0.00%)                  0 (0.00%)
  White/Caucasian                             316 (75.96%)               21 (51.22%)
  More Than One Race                          41 (9.85%)                 6 (14.63%)
  Other                                       9 (2.16%)                  4 (9.75%)
Ethnicity
  Hispanic/Latino                             31 (7.45%)b                12 (29.26%)b
  Not Hispanic/Latino                         385 (92.54%)b              29 (70.73%)b
Caregivers’ Highest Level of Education
  Without High School Diploma                 2 (0.49%)b                 4 (9.76%)b
  High School Diploma or Equivalent           14 (3.36%)b                5 (12.20%)b
  Some College Education                      40 (9.61%)b                10 (24.39%)b
  4-Year College Degree or More               356 (85.57%)b              21 (53.65%)b
  Unknown/Not Reported                        4 (0.96%)                  0 (0.00%)

Clinical Variables: Mean (SD, Range)
ADOS-2 Toddler Module
  Calibrated Severity Score                   N/A                        7.56 (1.67, 2–10)
Mullen Scales of Early Learning
  Early Learning Composite Score              N/A                        63.64 (10.17, 49–87)
  Expressive Language T-Score                 N/A                        28.08 (7.41, 20–50)
  Receptive Language T-Score                  N/A                        23.14 (4.93, 20–37)
  Fine Motor T-Score                          N/A                        34.11 (10.76, 20–56)
  Visual Reception T-Score                    N/A                        33.94 (10.63, 20–50)

ADOS-2: Autism Diagnostic Observation Schedule – Second Edition.

Age at diagnosis (months): Mean=23.9, SD=4.5, Range=18–37.

Time between diagnosis and app administration (months): Mean=0.7, SD=1.2, Range=−0.1–5.9.

a Significant difference between the two groups based on ANOVA test.

b Significant difference between the two groups based on chi-square test.

Ethical considerations.

Caregivers provided written informed consent, and the study protocols were approved by the Duke University Health System Institutional Review Board (Pro00085434, Pro00085435).

Clinical measures

Modified Checklist for Autism in Toddlers, Revised with Follow-Up (M-CHAT-R/F).

As a part of routine clinical care, all participants were assessed with a commonly used autism screening questionnaire, the M-CHAT-R/F (Robins et al., 2014). The M-CHAT-R/F consists of 20 questions answered by the caregiver to evaluate the presence/absence of autism-related symptoms.

Diagnostic and cognitive assessments.

Toddlers with an initial total M-CHAT-R/F score ≥3, those whose total score was ≥2 after the follow-up questions, and/or those for whom the pediatrician and/or parent expressed developmental concerns were referred for diagnostic evaluation by the study team psychologist. The Autism Diagnostic Observation Schedule – Toddler Module (ADOS-2) was administered by research-reliable licensed psychologists, who determined whether the child met DSM-5 criteria for ASD (Luyster et al., 2009). Cognitive and language abilities were assessed using the Mullen Scales of Early Learning (Mullen, 1995).

Group definitions

Group definitions were: (1) Autistic (N = 41), defined as having a positive M-CHAT-R/F score or caregiver/physician-raised concerns and subsequently meeting DSM-5 diagnostic criteria for autism spectrum disorder (ASD), with or without developmental delay, based on both the ADOS-2 and the clinical judgment of a licensed psychologist; and (2) Neurotypical (N = 416), defined as (a) having a high likelihood of typical development, with a score of 1 or below on the M-CHAT-R/F and no developmental concerns raised by caregiver/physician, or (b) having a positive M-CHAT-R/F score and/or caregiver/physician-raised concerns, but being determined, based on the ADOS-2 and the clinical judgment of the psychologist, not to have developmental or autism-related concerns. Another group of participants (N=12) had a positive M-CHAT-R/F score and received a diagnosis other than autism (e.g., language delay without autism). Given the small sample size, we excluded these participants from the current analyses.

Application (app) administration and stimuli

In each of the four clinics, a quiet room with few distractions was identified in which the app could be administered. Although quiet and relatively free of distractions, the rooms were not otherwise controlled as in a laboratory setting. The rooms in the four clinics were similar in size, lighting, and presence of distractions (e.g., a table in the room).

App administration.

We designed an app, compatible with iOS devices, that displayed developmentally appropriate movies. The front-facing camera recorded the toddlers’ behavioral responses while they watched the movies. Caregivers were asked to hold their child on their lap while a tablet was placed on a tripod about 60 cm in front of the child. No special instructions were given regarding how to hold the child. The caregiver sat quietly throughout the app administration and was asked not to guide the child’s behavior or give instructions. Other family members (e.g., siblings) and the assistant who administered the app stayed behind the caregiver and child to reduce distractions during the experiment. We computed and analyzed the participants’ head movements recorded while they watched four movies with high social content and three movies with low social content (nonsocial), described next.

Social movies containing human actors in the scene.

Blowing-Bubbles (~64 secs): An actor blew bubbles with a bubble wand while smiling, frowning, and producing limited verbal expressions (Figure 1a). Spinning-Top (~53 secs): An actress played with a spinning top, smiling and frowning with limited verbal expressions (Figure 1b). Rhymes (~30 secs): An actress recited nursery rhymes while smiling and gesturing (Figure 1c). Make-Me-Laugh (~56 secs): An actress performed funny actions while smiling (Figure 1d).

Figure 1.

Illustrative screenshots: social movies were (a) Blowing-Bubbles, (b) Spinning-Top, (c) Rhymes, and (d) Make-Me-Laugh; nonsocial movies were (e) Floating-Bubbles, (f) Mechanical-Puppy, and (g) Dog-in-the-Grass-RRL.

Nonsocial movies containing dynamic objects.

Floating-Bubbles (~35 secs): Bubbles appeared at random and moved throughout the frame (Figure 1e). Mechanical-Puppy (~25 secs): A mechanical toy puppy barked and walked towards vegetable toys (Figure 1f). Dog-in-Grass-Right-Right-Left (RRL) (~40 secs): A puppy appeared at random in the right or left part of the screen, followed by a constant right-right-left pattern (Figure 1g).

Capturing facial landmarks and head orientation using computer vision

A computer vision algorithm was used to detect faces in each frame of the recorded video (King, 2009). As in (Chang et al., 2021; Perochon et al., 2021), sparse human supervision, triggered by the face-tracking algorithm, ensured that only the participant’s face was tracked. We then extracted 49 facial landmarks as 2D positional coordinates (Baltrusaitis, Zadeh, Lim, & Morency, 2018) (Figure 2), time-synchronized with the movies. Since we were interested in measuring the participants’ head movements while they watched the movies, we focused our analysis on time-segments during which the participants were looking towards the tablet’s screen. To this end, for each frame we computed the child’s head pose angles relative to the tablet (Figure 2): θyaw (left-right), θpitch (up-down), and θroll (tilting left-right) (Hashemi et al., 2021). A criterion of |θyaw| < ~20° was used as a proxy for attention (Dawson et al., 2018; Hashemi et al., 2021), supported by the ‘central bias’ theory of gaze estimation (Li et al., 2013; Mannan et al., 1995). In addition, we verified that the participants were looking towards the tablet’s screen using gaze information extracted with an automatic gaze estimation algorithm (Chang et al., 2021). The resulting time-segments were used to estimate the head movement time-series for further analysis.
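As a rough illustration of this frame-selection step, the sketch below filters a per-frame time-series down to the “attending” frames. It assumes yaw angles (in degrees, NaN where no face was detected) and a boolean on-screen gaze flag have already been extracted by the upstream face-tracking and gaze-estimation steps; the function and argument names are ours, not the authors’ implementation.

```python
import numpy as np

def attending_mask(theta_yaw_deg, gaze_on_screen, yaw_limit=20.0):
    """Boolean mask of frames treated as 'attending to the screen':
    a detected face with |yaw| below ~20 degrees AND gaze on screen."""
    yaw = np.asarray(theta_yaw_deg, dtype=float)
    face_detected = ~np.isnan(yaw)             # NaN marks frames without a face
    facing_screen = np.abs(yaw) < yaw_limit    # |theta_yaw| < ~20 deg attention proxy
    return face_detected & facing_screen & np.asarray(gaze_on_screen, dtype=bool)
```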

Figure 2.

Facial landmarks representation. Circular points illustrate the positions of the extracted landmarks. Axes illustrate the 3D coordinate system associated with the face, with angles of rotation. The red landmarks were used to compute head movement dynamics.

Computational estimates of the head movement

Rate.

We computed the average Euclidean displacement of three central landmarks (shown in red in Figure 2) between consecutive frames. To reduce the effect of changes in the distance between the child and the camera, we normalized the landmarks’ displacement by dividing by a 1-second moving average of the distance between the eyes (W in Figure 2). We regularized the resulting time-series of normalized landmark displacements by averaging the signal over a moving window of 10 frames (1/3 second) (Dawson et al., 2018). For further analysis, we estimated the mean head movement rate (Mean_rateHM) from this time-series.
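A minimal sketch of this rate computation is shown below, assuming per-frame 2D coordinates for the three central landmarks and a per-frame inter-eye distance are available at 30 fps; the array shapes and names are our assumptions, not the authors’ code.

```python
import numpy as np

def head_movement_rate(landmarks, inter_eye_dist, fps=30, win=10):
    """Per-frame head movement rate from three central landmarks.

    landmarks: array (T, 3, 2) -- x,y of the three red landmarks per frame
    inter_eye_dist: array (T,) -- eye distance W per frame (scale reference)
    """
    # Euclidean displacement of each landmark between consecutive frames,
    # averaged over the three landmarks.
    disp = np.linalg.norm(np.diff(landmarks, axis=0), axis=2).mean(axis=1)
    # Normalize by a 1-second moving average of the inter-eye distance
    # to reduce sensitivity to the child-camera distance.
    kernel = np.ones(fps) / fps
    w = np.convolve(inter_eye_dist, kernel, mode="same")[1:]
    rate = disp / w
    # Regularize with a 10-frame (1/3 s) moving average, as in the text.
    return np.convolve(rate, np.ones(win) / win, mode="valid")

# Mean_rateHM would then be head_movement_rate(...).mean() over attending segments.
```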

Acceleration.

We estimated the child’s acceleration from the head movement rate. Intuitively, the head movement rate corresponds to the physical velocity of the head, so its first derivative can be interpreted as the second derivative of the head positions. This second-order derivative is of particular interest since it relates to the magnitude of the instantaneous forces involved in head movement. We estimated the mean absolute acceleration (Mean_accelHM) from the differences in head movement rate between consecutive frames, averaged over a 1/3-second window. We also estimated the total energy of the head movements, which was less powerful than Mean_accelHM in detecting a statistical difference between the two groups.
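Continuing the sketch above, the acceleration feature could be derived from the rate series as follows (again an illustration under our assumptions, not the authors’ code):

```python
import numpy as np

def head_movement_accel(rate, win=10):
    """Absolute acceleration: frame-to-frame difference of the head movement
    rate, averaged over a 1/3 s (10-frame) window."""
    accel = np.abs(np.diff(rate))
    return np.convolve(accel, np.ones(win) / win, mode="valid")

# Mean_accelHM would then be head_movement_accel(rate).mean().
```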

Complexity.

To estimate the complexity of the head movement rate time-series at multiple time-resolutions using MSE (Costa et al., 2002; Costa, Goldberger, & Peng, 2005), the time-series $X = \{x_1, x_2, \ldots, x_N\}$ was down-sampled (coarse-grained) to 30 scales ($\tau = 1$ to $30$), represented as

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \le j \le \frac{N}{\tau}. \tag{1}$$

Subsequently, sample entropy (SampEn) was calculated on each of these resolutions of the time-series. SampEn is an estimate of the irregularity of a time-series: given an embedding dimension m and a positive scalar tolerance r, SampEn is the negative logarithm of the conditional probability that if sequences of m consecutive data points repeat within distance r, then sequences of m+1 points also repeat within distance r (Richman & Moorman, 2000). If the repeatability is low, SampEn is high, and the time-series is considered more complex. Considering the m-dimensional embedding vectors $x_i^m = \{x_i, x_{i+1}, \ldots, x_{i+m-1}\}$ from the time-series $X = \{x_1, x_2, \ldots, x_N\}$ of length N, the distance d between two vectors $x_i^m$ and $x_j^m$ was defined as

$$d(x_i^m, x_j^m) = \max\left\{\, |x_{i+k-1} - x_{j+k-1}| \,\right\}, \qquad k = 1, 2, \ldots, m, \tag{2}$$

and

$$\mathrm{SampEn} = -\ln \frac{C^{m+1}(r)}{C^{m}(r)}. \tag{3}$$

Equation (3) defines the sample entropy, where $C^{m}(r)$ and $C^{m+1}(r)$ denote the cumulative counts of repeating vector pairs in the m- and (m+1)-dimensional embedding spaces, respectively. Two vectors $x_i^m$ and $x_j^m$ were defined as repeating if they met the condition $d(x_i^m, x_j^m) \le r$, with $i \ne j$. To handle bias due to missing data while computing the SampEn, we only considered segments where data were available in the (m+1)-dimensional space (Dong et al., 2019).

The parameter m was set to 2, as in (Costa et al., 2002; Harati, Crowell, Mayberg, Kong, & Nemati, 2016), and the tolerance was set to $r = 0.15\,\sigma$, where 0.15 is a scaling factor chosen as in (Costa et al., 2002; Dong et al., 2019; Lake et al., 2002) and $\sigma$ denotes the standard deviation of the signal. Since $\sigma$ can vary across participants, we used the population-wise standard deviation; this choice defines a distance threshold $r$ that is consistent across participants (see (Krishnappa Babu et al., 2021) for a detailed discussion). Finally, a global complexity estimate (across multiple scales) was obtained by integrating the SampEn across the first ten scales (Integrated_entropyHM). After handling missing segments as in (Krishnappa Babu et al., 2021), at least 40% of the data was necessary to perform effective complexity analysis (Cirugeda-Roldan, Cuesta-Frau, Miro-Martinez, & Oltra-Crespo, 2014; Lake et al., 2002); below this threshold the SampEn estimate may not be reliable. Participants with <40% valid data were removed from the analysis for each specific movie.
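To make the MSE computation concrete, here is a minimal Python sketch of Equations (1)–(3), assuming a head movement rate series with missing segments already handled. Following the text, r should be computed once as 0.15 times the population-wise standard deviation and held fixed across participants and scales. This is our illustrative implementation, not the authors’ code.

```python
import numpy as np

def coarse_grain(x, tau):
    """Eq. (1): average consecutive non-overlapping windows of length tau."""
    n = len(x) // tau
    return x[: n * tau].reshape(n, tau).mean(axis=1)

def sample_entropy(x, m=2, r=0.1):
    """Eqs. (2)-(3): SampEn with Chebyshev distance and absolute tolerance r
    (here r = 0.15 * population SD, per the text)."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def match_count(mm):
        # All embedding vectors of length mm, compared pairwise (i < j).
        emb = np.array([x[i: i + mm] for i in range(n - mm + 1)])
        count = 0
        for i in range(len(emb) - 1):
            d = np.max(np.abs(emb[i + 1:] - emb[i]), axis=1)  # Eq. (2)
            count += int(np.sum(d <= r))
        return count

    b, a = match_count(m), match_count(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.nan      # Eq. (3)

def integrated_entropy(x, r, scales=10, m=2):
    """Integrated_entropyHM: SampEn summed over the first `scales` scales."""
    return sum(sample_entropy(coarse_grain(x, tau), m, r)
               for tau in range(1, scales + 1))
```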

Statistical analysis

The Mann–Whitney U test was used to assess the statistical significance of between-group differences, using Python (pingouin.mwu). Within-group comparisons (e.g., between the social and nonsocial movies) were performed with the Wilcoxon signed-rank test (pingouin.wilcoxon). Statistical power was calculated using the effect size r provided by pingouin.mwu and pingouin.wilcoxon. A 2×2 mixed ANOVA (pingouin.mixed_anova) was used to estimate the main effects of (i) group and (ii) movie type (social vs. nonsocial) and their interaction. For the mixed ANOVA, we estimated the mean values of Mean_rateHM, Mean_accelHM, and Integrated_entropyHM for the social and nonsocial movies across all participants. Additionally, analysis of covariance (ANCOVA; pingouin.ancova) was performed to determine the influence of covariates such as participant age and the percentage of missing data. A support vector machine (SVM) classifier with a radial basis function (RBF) kernel (Cortes & Vapnik, 1995) was used to assess the classification power of the proposed features. Classification performance was compared using the area under the curve (AUC) of the receiver operating characteristic (ROC) with leave-one-out cross-validation (Elisseeff & Pontil, 2003), and 95% confidence intervals were computed with the Hanley and McNeil method (Hanley & McNeil, 1982). We chose the SVM because it is well suited to relatively small datasets, and cross-validation was used to minimize the risk of overoptimistic classification (Vabalas, Gowen, Poliakoff, & Casson, 2019). Classification performance of the different models was compared based on their true difference $d_t$, estimated from the observed difference in error $d$ and the summed variance of the error across the two models $\sigma_d$, at a significance threshold of p<0.05: $d_t = d \pm 1.96\,\sigma_d$ (Tan et al., 2006). If this interval spans zero, the two models are considered not significantly different.
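The pingouin calls named in the text could be invoked as sketched below; the synthetic data, dataframe layout, and column names are our assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)

# Toy stand-ins for per-participant summary features (illustrative only).
asd = rng.normal(1.2, 0.4, 41)    # e.g., Mean_rateHM, autistic group
nt = rng.normal(1.0, 0.4, 416)    # neurotypical group

# Between-group comparison; the 'RBC' column is the effect size r.
print(pg.mwu(asd, nt))

# Paired within-group comparison (social vs nonsocial means).
social, nonsocial = rng.normal(1.1, 0.3, 416), rng.normal(1.0, 0.3, 416)
print(pg.wilcoxon(social, nonsocial))

# 2x2 mixed ANOVA: group (between) x movie type (within).
df = pd.DataFrame({
    "participant": np.repeat(np.arange(457), 2),
    "group": np.repeat(["ASD"] * 41 + ["NT"] * 416, 2),
    "movie_type": ["social", "nonsocial"] * 457,
    "Mean_rateHM": rng.normal(1.0, 0.3, 914),
})
print(pg.mixed_anova(data=df, dv="Mean_rateHM", within="movie_type",
                     subject="participant", between="group"))
```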

Results

Engagement with the app administration

The child attended to the majority of the app administration in 95% of assessments for the neurotypical group and 93% for the autistic group.

Differences in rate, acceleration, and complexity of head movements

Rate.

Figure 3 displays the time-series plots of the head movement rate per 1/3 second for each of the movies. To replicate our findings in (Dawson et al., 2018), we used the mean head movement rate (Mean_rateHM) to compare the two groups. A 2×2 mixed ANOVA was used to estimate the main effects of (i) group and (ii) movie type (social and nonsocial) and their interaction. There was a significant effect of group, with autistic toddlers exhibiting a higher rate of movement compared to the neurotypical toddlers (F(1, 427) = 42.48, p<0.0001), and a significant effect of movie type, with social movies eliciting more movement than nonsocial movies (F(1, 427) = 35.22, p<0.0001), but no significant interaction effect (F(1, 427) = 0.77, p=0.09). Analyzed by movie type separately, Mean_rateHM was significantly higher for the autistic group compared to the neurotypical group for the social movies (Blowing-Bubbles (p<0.0001, r = 0.56), Spinning-Top (p=0.0001, r = 0.39), Rhymes (p<0.0001, r = 0.72), and Make-Me-Laugh (p<0.0001, r = 0.62)), with medium to large effect sizes, as well as for the nonsocial movies except Mechanical-Puppy (Floating-Bubbles (p<0.001, r = 0.35), Dog-in-Grass-RRL (p = 0.005, r = 0.27)), with small to medium effect sizes.

Figure 3.

(a–g) Rate of head movement (left) and Mean_accelHM (right) for each of the movies. Solid lines represent the median; the shaded band represents the 1st and 3rd quartiles. The number of participants in the autistic and neurotypical groups varies across movies because only individuals with at least 40% valid data for the stimulus were included. The last plot (h) shows the mean Mean_accelHM across all the tasks for social and nonsocial movies.

Acceleration.

Figure 3 displays the mean acceleration values per group. A 2×2 mixed ANOVA again showed a significant effect of group, with autistic toddlers exhibiting higher acceleration than neurotypical toddlers (F(1, 427) = 38.65, p<0.0001), and a significant effect of movie type, with social movies eliciting higher acceleration than nonsocial movies (F(1, 427) = 70.14, p<0.0001), but no significant interaction effect (F(1, 427) = 0.007, p=0.92). Analyzed by movie type separately, the Mean_accelHM of the autistic group was significantly higher than that of the neurotypical group across all the social movies, with medium to large effect sizes (Figure 3a–d), as well as for the nonsocial movies (Figure 3e–g), except Mechanical-Puppy. Notably, the effect sizes comparing the two groups were smaller for the nonsocial movies than for the social movies.

Complexity (entropy).

Figure 4a–g displays the MSE of the head movements across scales 1–30. A 2×2 mixed ANOVA revealed a significant effect of group, indicating a greater level of complexity for the autistic compared to the neurotypical group (F(1, 427) = 29.68, p<0.0001), and a significant effect of movie type, indicating greater complexity of movement during social compared to nonsocial movies (F(1, 427) = 42.94, p<0.0001), but no significant interaction effect (F(1, 427) = 1.68, p=0.06). Analyzed by movie type separately, the SampEn was significantly higher for the autistic group compared to the neurotypical group during social movies, especially over the first 10 scales, with small to medium effect sizes (ranging from .25 to .48; per-scale p-values are provided in Figure 4a–g). The Integrated_entropyHM during social movies (Figure 4a–d) was also significantly higher for the autistic group, with effect sizes ranging from medium (r = .4) to large (r = .68). Similar results were found for the nonsocial movies (Figure 4e–g), with significant group differences in SampEn at some of the resolutions, but with smaller effect sizes (0.15–0.2). The Integrated_entropyHM also differed significantly between the two groups during the nonsocial movies (with small to large effect sizes; see Figure 4), except for Mechanical-Puppy.

Figure 4.

(a–g) MSE across scales (left) and Integrated_MSEHM (right) for each of the movies. The number of participants in the autistic and neurotypical (NT) groups varies across movies because only individuals with at least 40% valid data were included (Δ indicates p<.05). The last plot (h) shows the mean Integrated_MSEHM across all the tasks for social and nonsocial movies.

Within-group differences.

We further analyzed the differences within each of the autistic and neurotypical groups in Mean_rateHM, Mean_accelHM, and Integrated_entropyHM in response to the social versus nonsocial movies. The Wilcoxon signed-rank test indicated that the neurotypical group exhibited significantly higher Mean_rateHM (p<0.0001, r = 0.47), Mean_accelHM (p<0.0001, r = 0.62; Figure 3h), and Integrated_entropyHM (p<0.0001, r = 0.51; Figure 4h) during the social movies than during the nonsocial movies. In contrast, the autistic group did not exhibit differences in Mean_rateHM, Mean_accelHM, or Integrated_entropyHM during the social versus nonsocial movies.

Influence of varying time segments and age.

We repeated our analyses using the number of time-segments and the participant’s age as covariates in an ANCOVA and found that these two covariates did not affect our between-group results.

Relationship between head movement variables and cognitive ability

Mullen Scale scores were available for the autistic group. There was a positive correlation between the Mullen Early Learning Composite Score and the head movement measures, Mean_rateHM, Mean_accelHM and Integrated_entropyHM, during the social movies (r’s = 0.36, 0.39 and 0.34, respectively; p’s <0.05) but not during the nonsocial movies (all nonsignificant).

Combining acceleration and complexity via a classification framework

Mean_accelHM and Integrated_entropyHM were moderately correlated (r = 0.4–0.5). We used these two measures from the four social movies for the classification analysis, since the effect sizes for group differences in head movements were larger during the social movies. We trained an SVM-based classifier with these two input features and group as the classification target to evaluate how well these measures discriminate the groups. We evaluated performance using information collected during a single movie and using the combination of all four social movies. For the latter analysis, we included only participants with both features available for all four movies, resulting in N=31 for the autistic group and N=389 for the neurotypical group. Tested on individual movies, Mean_accelHM and Integrated_entropyHM distinguished the autistic and neurotypical groups (see Figure 5), both alone and in combination. Combining either of the two features across all the movies (resulting in a 4-dimensional feature space), the AUC of the ROC increased to 0.85 for Mean_accelHM and 0.80 for Integrated_entropyHM. Combining either feature across movies performed better than combining the two features extracted within each movie, though the classifiers were not statistically significantly different. Combining all the features for all the movies (an 8-dimensional space) resulted in lower performance (0.73 AUC, compared to the 0.85 and 0.80 AUC achieved with individual features combined across movies). For datasets of moderate size, performance commonly decreases once the number of features grows beyond a certain (data-dependent) point.
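The evaluation described above could be set up as sketched below. The paper does not specify its classifier software, so scikit-learn is our stand-in; the toy data, feature scaling step, and names are our assumptions. The confidence-interval helper follows the cited Hanley and McNeil (1982) formula.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score

def loo_auc(X, y):
    """Leave-one-out ROC AUC for an RBF-kernel SVM on standardized features."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    prob = cross_val_predict(clf, X, y, cv=LeaveOneOut(), method="predict_proba")
    return roc_auc_score(y, prob[:, 1])

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """~95% CI for an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    se = np.sqrt((auc * (1 - auc) + (n_pos - 1) * (q1 - auc**2)
                  + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg))
    return auc - z * se, auc + z * se

# Illustrative usage with toy data: X would hold Mean_accelHM and/or
# Integrated_entropyHM per social movie; y = 1 autistic, 0 neurotypical.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.5, 1, (31, 4)), rng.normal(0, 1, (389, 4))])
y = np.array([1] * 31 + [0] * 389)
auc = loo_auc(X, y)
print(auc, hanley_mcneil_ci(auc, n_pos=31, n_neg=389))
```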

Figure 5.

ROC curves for Mean_accelHM and Integrated_MSEHM, in isolation and in combination, for (a) Rhymes and (b) all the social movies. CI = confidence interval. AUCs of the ROC for the other movies are: Blowing-Bubbles: Mean_accelHM AUC=0.71, CI=[0.60, 0.82]; Integrated_MSEHM AUC=0.68, CI=[0.53, 0.75]; both AUC=0.73, CI=[0.62, 0.85]. Spinning-Top: Mean_accelHM AUC=0.68, CI=[0.54, 0.77]; Integrated_MSEHM AUC=0.67, CI=[0.51, 0.75]; both AUC=0.71, CI=[0.52, 0.78]. Make-Me-Laugh: Mean_accelHM AUC=0.70, CI=[0.52, 0.71]; Integrated_MSEHM AUC=0.69, CI=[0.52, 0.76]; both AUC=0.72, CI=[0.56, 0.78].

Discussion

We demonstrated that a scalable app delivered to toddlers on an iPad during a well-child visit can be used to detect early head movement differences in toddlers diagnosed with autism. Similar to our previously published findings, autistic toddlers exhibited a higher rate of head movements while watching dynamic audiovisual movies, regardless of whether the content was social or nonsocial in nature. Furthermore, we found that the autistic toddlers also showed greater acceleration and complexity of their head movements compared to neurotypical toddlers. Our findings suggest that this sensorimotor behavior, which is exhibited while watching complex, dynamic stimuli and characterized by more frequent head movements that have higher acceleration and more complexity, is an early feature of autism. Moreover, in an analysis combining measures of head movement acceleration and complexity for each movie and across all movies with social content, we demonstrated that an SVM-based classifier based on head movement dynamics differentiated the autistic and neurotypical groups in a data-driven fashion.

The nature of these differences in head movement dynamics is not fully understood. Such differences do not appear to reflect the degree of attention to the movies, as the measures were only taken during the time frames when children were attending to the movies (facing forward and gazing towards the screen). Moreover, including the amount of attention to the movies as a covariate in our analyses did not affect our results. Similarly, the head movements do not appear to reflect degree of social engagement because the autistic children also showed differences in head movement dynamics while watching the nonsocial movies. Martin and colleagues found that autistic children exhibited higher levels of head movements only while viewing social stimuli and interpreted the movements as a mechanism for modulating their perception of the social stimuli (Martin et al., 2018). In contrast, we found that, whereas the neurotypical toddlers showed increased head movements during social as compared to nonsocial movies, the autistic toddlers showed similarly high levels of head movements during both types of movies. Thus, our data do not support the hypothesis that the head movements of the autistic toddlers were used to modulate the perception of social stimuli, per se. It is still possible, however, that the movements were more generally used to modulate sensory information across the different types of movies.

Interestingly, autistic children with lower cognitive abilities showed higher levels of head movement rate, acceleration, and complexity specifically during viewing of the social movies. It is possible that children with lower cognitive abilities found the social movies more difficult to interpret, as these movies did involve the use of simple speech, facial expressions, and gestures by the actor.

Another possibility is that the head movements reflect differences in postural control. Previous studies of postural sway in autistic individuals have found that postural control difficulties increase when sensory demands are increased, such as when viewing stimuli requiring multisensory integration (Cham et al., 2021; Minshew et al., 2004). Examining the videos of toddlers with high levels of head movements in the present study revealed that the movement involves not just the head but also the upper body including trunk and shoulders, which were not captured by our computer vision algorithm, as we focused solely on the face in this study. Maintaining stability of posture and midline head control relies on complex sensorimotor processes which are challenged when viewing complex multisensory information. Difficulties in multisensory integration have been documented in autistic individuals (Donohue, Darling, & Mitroff, 2012). Like some forms of repetitive behavior, head movements might also serve a regulatory function, especially if children found the stimuli arousing, similar to findings in studies of postural control (Cham et al., 2021). Future research is needed to further explore the developmental course of differences in head movement dynamics in autism and elucidate their nature and neurobiological basis.

Limitations of this study include the sample size: although the overall sample was relatively large, the number of autistic children was smaller and did not offer sufficient power to determine the influence of biological and demographic characteristics, such as sex. Some analyses did not use data from all participants, because we included only participants who attended for at least 40% of the movie length. Finally, the autistic and neurotypical groups differed in cognitive ability; thus, the degree to which differences in cognition contributed to our findings is unclear.

In summary, results of this study confirm that a difference in head movement dynamics is one of the early sensorimotor signs associated with autism. Combining this feature with other behavioral biomarkers such as gaze, facial dynamics, and response to name will allow us to develop a multimodal computer vision-based digital phenotyping tool capable of offering a quantitative and objective characterization of early behaviors associated with autism.

Key points.

  • Autistic children exhibit more frequent head movements while watching dynamic stimuli compared to neurotypical children.

  • Earlier research suggested that computer vision can automatically measure these head movement patterns.

  • This larger study confirmed that computer vision can be used to objectively and automatically measure head movement dynamics in toddlers from videos recorded via a digital app during a well-child checkup in primary care.

  • Rate, acceleration, and complexity of head movements were found to be significantly higher in autistic toddlers compared to neurotypical toddlers.

  • Combining head movements with other behavioral biomarkers, a multimodal computer vision and machine learning based digital autism screening tool can be developed, offering quantitative and objective characterization of early autism-related behaviors.

Acknowledgements

The authors wish to thank the many caregivers and children for their participation in the study, without whom this research would not have been possible. The authors gratefully acknowledge the collaboration of the physicians and nurses in Duke Children’s Primary Care, members of the NIH ACE research team, as well as several clinical research specialists. This work was supported by NIH Autism Centers of Excellence Award NICHD P50HD093074 (Dawson, PI), NIMH R01MH121329 (Dawson & Sapiro, Co-PIs), and NIMH R01MH120093 (Sapiro & Dawson, Co-PIs). Perochon, Di Martino, and Sapiro received additional support from NSF and the Department of Defense (ONR and NGA) as well as gifts from Cisco, AWS, and Microsoft. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. K.C., Z.C., S.E., G.D., and G.S. developed technology related to the app that has been licensed, and both they and Duke University have benefited financially. G.D. is on the Scientific Advisory Boards of Akili Interactive, Inc., Zynerba, Nonverbal Learning Disability Project, and Tris Pharma, is a consultant to Apple Inc., Gerson Lehrman Group, and Guidepoint Global, Inc., and receives book royalties from Guilford Press and Springer Nature. G.D. and G.S. have invention disclosures and patent applications registered at the Duke Office of Licensing and Ventures. G.S. is affiliated with Apple Inc.; this work was done before and independently of that affiliation. The remaining authors have declared that they have no competing or potential conflicts of interest.

Abbreviations:

ADOS

Autism Diagnostic Observation Schedule

ANOVA

Analysis of variance

ASD

Autism spectrum disorder

AUC

Area under the curve

CV

Computer vision

DSM-5

Diagnostic and Statistical Manual of Mental Disorders – 5th Edition

M-CHAT-R/F

Modified Checklist for Autism in Toddlers, Revised with Follow-Up

MSE

Multiscale entropy

RBF

Radial basis function

ROC

Receiver Operating Characteristic

SVM

Support vector machine

References

  1. American Psychiatric Association. (2014). Diagnostic and statistical manual of mental disorders: DSM-5. American Psychiatric Association.
  2. Baltrusaitis T, Zadeh A, Lim YC, & Morency LP (2018). OpenFace 2.0: Facial behavior analysis toolkit. Proceedings - 13th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2018, 59–66.
  3. Baranek GT (1999). Autism during infancy: A retrospective video analysis of sensory-motor and social behaviors at 9–12 months of age. Journal of Autism and Developmental Disorders, 29(3), 213–224.
  4. Bhat AN (2021). Motor Impairment Increases in Children With Autism Spectrum Disorder as a Function of Social Communication, Cognitive and Functional Impairment, Repetitive Behavior Severity, and Comorbid Diagnoses: A SPARK Study Report. Autism Research, 14(1), 202–219.
  5. Cham R, Iverson JM, Bailes AH, Jennings JR, Eack SM, & Redfern MS (2021). Attention and sensory integration for postural control in young adults with autism spectrum disorders. Experimental Brain Research, 239(5), 1417–1426.
  6. Chang CH, Wade MG, Stoffregen TA, Hsu CY, & Pan CY (2010). Visual tasks and postural sway in children with and without autism spectrum disorders. Research in Developmental Disabilities, 31(6), 1536–1542.
  7. Chang Z, Di Martino JM, Aiello R, Baker J, Carpenter K, Compton S, Davis N, Eichner B, Espinosa S, Flowers J, Franz L, Harris A, Howard J, Perochon S, Perrin EM, Krishnappa Babu PR, Spanos M, Sullivan C, Walter BK, Kollins SH, Dawson G, & Sapiro G (2021). Computational Methods to Measure Patterns of Gaze in Toddlers With Autism Spectrum Disorder. JAMA Pediatrics, 175(8), 827–836.
  8. Cirugeda-Roldan E, Cuesta-Frau D, Miro-Martinez P, & Oltra-Crespo S (2014). Comparative study of entropy sensitivity to missing biosignal data. Entropy, 16(11), 5901–5918.
  9. Cortes C, & Vapnik V (1995). Support-Vector Networks. Machine Learning, 20(3), 273–297.
  10. Costa M, Goldberger AL, & Peng CK (2002). Multiscale Entropy Analysis of Complex Physiologic Time Series. Physical Review Letters, 89(6), 068102.
  11. Costa M, Goldberger AL, & Peng CK (2005). Multiscale entropy analysis of biological signals. Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, 71(2), 021906.
  12. Dawson G, Campbell K, Hashemi J, Lippmann SJ, Smith V, Carpenter K, Egger H, Espinosa S, Vermeer S, Baker J, & Sapiro G (2018). Atypical postural control can be detected via computer vision analysis in toddlers with autism spectrum disorder. Scientific Reports, 8(1), 1–7.
  13. Dong X, Chen C, Geng Q, Cao Z, Chen X, Lin J, Jin Y, Zhang Z, Shi Y, & Zhang XD (2019). An improved method of handling missing values in the analysis of sample entropy for continuous monitoring of physiological signals. Entropy, 21(3), 274.
  14. Donohue SE, Darling EF, & Mitroff SR (2012). Links between multisensory processing and autism. Experimental Brain Research, 222(4), 377–387.
  15. Elisseeff A, & Pontil M (2003). Leave-one-out error and stability of learning algorithms with applications. Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer & Systems Sciences, 190, 111–130.
  16. Esposito G, & Venuti P (2009). Symmetry in infancy: Analysis of motor development in autism spectrum disorders. Symmetry, 1(2), 215–225.
  17. Flanagan JE, Landa R, Bhat A, & Bauman M (2012). Head lag in infants at risk for autism: A preliminary study. American Journal of Occupational Therapy, 66(5), 577–585.
  18. Fournier KA, Hass CJ, Naik SK, Lodha N, & Cauraugh JH (2010). Motor coordination in autism spectrum disorders: A synthesis and meta-analysis. Journal of Autism and Developmental Disorders, 40(10), 1227–1240.
  19. Hanley JA, & McNeil BJ (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36.
  20. Harati S, Crowell A, Mayberg H, Kong J, & Nemati S (2016). Discriminating clinical phases of recovery from major depressive disorder using the dynamics of facial expression. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, 2016, 2254–2257.
  21. Hashemi J, Dawson G, Carpenter KLH, Campbell K, Qiu Q, Espinosa S, Marsan S, Baker JP, Egger HL, & Sapiro G (2021). Computer Vision Analysis for Quantification of Autism Risk Behaviors. IEEE Transactions on Affective Computing, 12(1), 215–226.
  22. King DE (2009). Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research, 10, 1755–1758.
  23. Krishnappa Babu PR, Di Martino JM, Chang Z, Perochon SP, Carpenter KLH, Compton S, Espinosa S, Dawson G, & Sapiro G (2021). Exploring Complexity of Facial Dynamics in Autism Spectrum Disorder. IEEE Transactions on Affective Computing, 1–12. doi: 10.1109/TAFFC.2021.3113876
  24. Lake DE, Richman JS, Griffin MP, & Moorman JR (2002). Sample entropy analysis of neonatal heart rate variability. American Journal of Physiology - Regulatory, Integrative and Comparative Physiology, 283(3), R789–R797.
  25. Li Y, Fathi A, & Rehg JM (2013). Learning to predict gaze in egocentric video. Proceedings of the IEEE International Conference on Computer Vision, 3216–3223.
  26. Luyster R, Gotham K, Guthrie W, Coffing M, Petrak R, Pierce K, Bishop S, Esler A, Hus V, Oti R, Richler J, Risi S, & Lord C (2009). The autism diagnostic observation schedule - Toddler module: A new module of a standardized diagnostic measure for autism spectrum disorders. Journal of Autism and Developmental Disorders, 39(9), 1305–1320.
  27. Mannan S, Ruddock KH, & Wooding DS (1995). Automatic control of saccadic eye movements made in visual inspection of briefly presented 2-D images. Spatial Vision, 9(3), 363–386.
  28. Martin KB, Hammal Z, Ren G, Cohn JF, Cassell J, Ogihara M, Britton JC, Gutierrez A, & Messinger DS (2018). Objective measurement of head movement differences in children with and without autism spectrum disorder. Molecular Autism, 9(1), 1–10.
  29. Melo C, Ruano L, Jorge J, Pinto Ribeiro T, Oliveira G, Azevedo L, & Temudo T (2020). Prevalence and determinants of motor stereotypies in autism spectrum disorder: A systematic review and meta-analysis. Autism, 24(3), 569–590.
  30. Minshew NJ, Sung KB, Jones BL, & Furman JM (2004). Underdevelopment of the postural control system in autism. Neurology, 63(11), 2056–2061.
  31. Molloy CA, Dietrich KN, & Bhattacharya A (2003). Postural Stability in Children with Autism Spectrum Disorder. Journal of Autism and Developmental Disorders, 33(6), 643–652.
  32. Mullen EM (1995). Mullen Scales of Early Learning. Circle Pines, MN: American Guidance Service.
  33. Perochon S, Di Martino JM, Aiello R, Baker J, Carpenter K, Chang Z, Compton S, Davis N, Eichner B, Espinosa S, Flowers J, Franz L, Gagliano M, Harris A, Howard J, Kollins SH, Perrin EM, Raj P, Spanos M, Walter B, Sapiro G, & Dawson G (2021). A scalable computational approach to assessing response to name in toddlers with autism. Journal of Child Psychology and Psychiatry and Allied Disciplines, 62(9), 1120–1131.
  34. Richman JS, & Moorman JR (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology - Heart and Circulatory Physiology, 278(6), H2039–H2049.
  35. Robins DL, Casagrande K, Barton M, Chen CMA, Dumont-Mathieu T, & Fein D (2014). Validation of the modified checklist for autism in toddlers, revised with follow-up (M-CHAT-R/F). Pediatrics, 133(1), 37–45.
  36. Stevenson RA, Ghose D, Fister JK, Sarko DK, Altieri NA, Nidiffer AR, Kurela LAR, Siemann JK, James TW, & Wallace MT (2014). Identifying and Quantifying Multisensory Integration: A Tutorial Review. Brain Topography, 27(6), 707–730.
  37. Tan P-N, Steinbach M, & Kumar V (2006). Classification: Basic concepts, decision trees, and model evaluation. Introduction to Data Mining, 1, 145–205.
  38. Teitelbaum P, Teitelbaum O, Nye J, Fryman J, & Maurer RG (1998). Movement analysis in infancy may be useful for early diagnosis of autism. Proceedings of the National Academy of Sciences of the United States of America, 95(23), 13982–13987.
  39. Vabalas A, Gowen E, Poliakoff E, & Casson AJ (2019). Machine learning algorithm validation with a limited sample size. PLoS ONE, 14(11).
  40. Wilson RB, Enticott PG, & Rinehart NJ (2018). Motor development and delay: advances in assessment of motor skills in autism spectrum disorders. Current Opinion in Neurology, 31(2), 134.
  41. Wilson RB, McCracken JT, Rinehart NJ, & Jeste SS (2018). What’s missing in autism spectrum disorder motor assessments? Journal of Neurodevelopmental Disorders, 10(1), 1–13.
  42. Zhao Z, Zhu Z, Zhang X, Tang H, Xing J, Hu X, Lu J, Peng Q, & Qu X (2021). Atypical Head Movement during Face-to-Face Interaction in Children with Autism Spectrum Disorder. Autism Research, 14(6), 1197–1208.
