Abstract
Objectives
Research on typically developing (TD) children and those with neurodevelopmental disorders and genetic syndromes was targeted. Specifically, studies on autism spectrum disorder, Down syndrome, Rett syndrome, fragile X syndrome, cerebral palsy, Angelman syndrome, tuberous sclerosis complex, Williams-Beuren syndrome, Cri-du-chat syndrome, Prader-Willi syndrome, and West syndrome were searched. The objectives are to review observational and computational studies on the emergence of (pre-)babbling vocalisations and outline findings on acoustic characteristics of early verbal functions.
Methods
A comprehensive review of the literature was performed, including observational and computational studies focusing on spontaneous infant vocalisations at the pre-babbling age in TD children and in individuals with genetic or neurodevelopmental disorders.
Results
While there is substantial knowledge about early vocal development in TD infants, the pre-babbling phase in infants with neurodevelopmental and genetic syndromes is scarcely scrutinised. Related approaches, paradigms, and definitions vary substantially and insights into the onset and characteristics of early verbal functions in most above-mentioned disorders are missing. Most studies focused on acoustic low-level descriptors (e.g. fundamental frequency) which bore limited clinical relevance. This calls for computational approaches to analyse features of infant typical and atypical verbal development.
Conclusions
Pre-babbling vocalisations as precursors of future speech-language functions may reveal valuable signs for identifying infants at risk for atypical development. Observational studies should be complemented by computational approaches to enable in-depth understanding of the developing speech-language functions. By disentangling features of typical and atypical early verbal development, computational approaches may support clinical screening and evaluation.
Keywords: Autism, Developmental disorder, Infant, Speech-language, Vocalisation
Myriad studies have furthered our understanding of the ontogeny of human behaviour and early neurofunctions that underlie our later capacities and skills. Early human behaviours are complex, dynamic, and diverse. Despite commonalities in emerging neurofunctions across development, there are undeniable individual distinctions. One of the most fascinating questions in development is whether, which, and how individual variations lead to long-term favourable or adverse outcomes. Following a neurodevelopmentalist perspective of development, acknowledging early functions as precursors and prerequisites for later ones, we presume that early deviations or impairments precede suboptimal traits or adverse outcomes, even if the core symptomatology of certain disorders may appear later in development (as for example in the case of autism spectrum disorder, ASD; e.g. Estes et al., 2019). This assumption, also known as deep constructivist notion (neuroconstructivism; e.g. Johnson, 2000; Johnson et al., 2021; Karmiloff-Smith, 1998; Mareschal, 2011; Westermann et al., 2011), is tightly linked to attempts at detecting and defining early functional markers of neurodiversity or atypicality, i.e. predictors of developmental trajectories (D’Souza & Karmiloff-Smith, 2017; Jones et al., 2014; Karmiloff-Smith, 1998, 2009; Marschik et al., 2014b, 2017; Micai et al., 2020). Concerns about whether behaviours reflect potential developmental atypicality, delay, or mere diversity in typical development often result from recognising inter-individual discrepancies among peers caused by slowed or divergent functional acquisition within and across developmental domains, which might indicate stagnation or regression of intra-individual development. Notably, although the very early periods of speech-language development are not yet fully understood, atypicalities in the verbal domain are often one of the first perceived signs of neurodiversity during the first year of life.
Taking a closer look at the developing speech-language and communicative system, there is broad consensus regarding the essential role of prelinguistic vocalisations during early infancy for successful development of subsequent verbal functions (e.g. Karmiloff & Karmiloff-Smith, 2002; Locke, 1995; Oller, 2000; Vihman et al., 1985). Verbal development, meaning speech-language and communicative functions, follows a developmental trajectory of increasing complexity, accuracy and stability, thus building the complex human verbal capacity (e.g. Buder et al., 2013; Locke, 1995; Nathani et al., 2006; Oller, 1978, 2000; Papoušek, 1994; Stark, 1980). About four decades ago, stage-models were proposed describing developmental pathways from an infant’s first cry to becoming a competent communicator (cf. Karmiloff & Karmiloff-Smith, 2002; Koopmans-van Beinum & van der Stelt, 1986; Oller, 1978; Papoušek, 1994; Roug et al., 1989; Stark, 1980). While there are differences in exact definitions and labels for categorically distinct vocalisation types, reported age of onset and stages/phases, and the mastering of certain milestones, researchers have offered similar models which describe evolving verbal functions. In the initial developmental phase, most vocalisations are faint and brief quasi-vowels. This first phase is often referred to as phonation stage or uninterrupted phonation stage (Fig. 1; Koopmans-van Beinum & van der Stelt, 1986; Oller, 2000). Thereafter, emerging at 1 to 2 months of age, vocalisations with articulatory movements of the tongue during phonation are uttered, a stage labelled the “cooing” or “gooing” phase (Oller, 1978, 2000). Approximately 2 months later, an expansion of vocal and articulatory capacities can be observed. Vocalisation types at this expansion or vocal play stage are vowel-/consonant-like sounds, squeals, and marginal syllables. These utterances are not yet produced with the articulatory accuracy and timing of adult speech (Fig. 1; Nathani et al., 2006; Oller, 2000; Stark, 1980). The final stage of prelinguistic development, commonly referred to as canonical babbling stage, marks an infant’s start to produce speech-like syllables, usually starting between 5 and 10 months of age (Oller, 2000). Vocalisations are single or multiple consonant–vowel-combinations with rapid formant transitions between the consonantal and vocalic part. In some stage models, reduplicated and variegated babbling have been proposed as subsequent stages (Oller, 1978; Roug et al., 1989; Stark, 1980). In summary, specific vocalisation types occur in a cascading fashion and become increasingly speech-like towards the end of the first year of life, when the first (proto-)words are uttered. Besides this shift to language-specific phonetic forms, vocalisation types and developmental stages during the first year of life have been considered as universal (cf. Buder et al., 2013 who provide an acoustic phonetic catalogue of pre-speech vocalisations).
Fig. 1. The developing speech-language capacity
The classical approach to assess whether the above-mentioned early speech-language milestones are met follows a perceptual segmentation-annotation-classification procedure of infant utterances. In such studies (which are observational), vocalisation-entities are commonly defined through the breath group criterion (i.e. vocalisation(s) uttered in the exhalation/expiration phase of one breathing cycle; Lynch et al., 1995a; Nathani & Oller, 2001) and segmented accordingly. Other approaches segmenting infant speech have differentiated vocalisations through a pause criterion (e.g. pauses longer than 300 ms subdivide vocalisation clusters; Oller et al., 2010). Either way, the segmentation step is usually followed by an annotation process, in which trained listeners assign vocalisations to the predefined vocalisation classes (e.g. Koopmans-van Beinum & van der Stelt, 1986; Lang et al., 2021; Lynch et al., 1995a; Nathani et al., 2006; Oller, 1978, 2000; Roug et al., 1989; Stark, 1980). Recently, a citizen science study externally validated the expert classification of babbling vocalisations and the onset of canonical babbling (Cychosz et al., 2021). Together with findings on auditory Gestalt perception of experts and naïve listeners differentiating early verbal functions of infants with neurodevelopmental disorders (NDDs), this points to the existence of an intrinsic human Gestalt of different vocal categories or typical vs. atypical pre-linguistic vocalisations (Marschik et al., 2012a). Human auditory Gestalt perception, or the adult capacity of intuitively recognising different vocal categories, becomes more robust when evaluating “higher order verbal functions” of infants. Explicitly, babbling vocalisations, being more salient in form, are easier for listeners to categorise than pre-babbling vocalisations uttered in the first 5 months of life (e.g. Marschik et al., 2012a; Pokorny et al., 2018).
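To make the pause criterion concrete, the following minimal sketch groups voiced intervals into vocalisation clusters whenever the silent gap between them does not exceed 300 ms. The function name `group_by_pause`, the interval representation, and the example onset/offset times are illustrative assumptions and are not taken from any of the cited studies.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (onset, offset) in seconds


def group_by_pause(intervals: List[Interval],
                   max_pause: float = 0.3) -> List[List[Interval]]:
    """Group voiced intervals into clusters.

    A new cluster starts whenever the pause between the previous offset
    and the next onset is longer than `max_pause` (300 ms by default).
    """
    clusters: List[List[Interval]] = []
    for onset, offset in sorted(intervals):
        if clusters and onset - clusters[-1][-1][1] <= max_pause:
            # Short pause: the interval belongs to the current cluster.
            clusters[-1].append((onset, offset))
        else:
            # First interval or long pause: open a new cluster.
            clusters.append([(onset, offset)])
    return clusters


# Hypothetical voiced intervals from one recording (seconds).
voiced = [(0.00, 0.40), (0.55, 0.90), (1.50, 1.80), (1.95, 2.30), (3.10, 3.40)]
clusters = group_by_pause(voiced)
print([len(c) for c in clusters])
```

A breath group criterion would need respiratory information instead of pause length, so this sketch covers only the pause-based variant.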
In the first 5 months of life, before the canonical babbling stage, the various stage-models concordantly include descriptions of a developmental pathway from simple phonation to an expansion phase (Fig. 1, e.g. Kent, 2022; Nathani et al., 2006; Oller, 2000). Oller and colleagues introduced a classification scheme of three types of infant vocalisations: cry, laughter and protophones; the latter are defined as precursors to speech and subdivided into vocants, squeals and growls (Jhang & Oller, 2017; Oller et al., 2013). Interestingly, evidence showed spontaneously produced protophones to outnumber cries and laughter from early on (Jhang & Oller, 2017; Oller et al., 2019). The importance of protophones lies, in contrast to cry and laughter, in their functional flexibility. They can be used in variable contexts and may fulfil different communicative functions (Jhang & Oller, 2017; Oller et al., 2013). Besides flexibility in functioning, the ontogeny of vocalisations has been discussed in terms of physiological constraints. Physiological adaptation of peripheral anatomical structures, such as the larynx descent or vocal-tract shape (e.g. Fitch, 2010; Lieberman et al., 2001) as well as neurophysiological changes governing the functional output, shape the development and the increasing complexity of vocalisations (see Fig. 1; e.g. Kent, 2021, 2022; Oller, 2000; Zhang & Ghazanfar, 2020).
In infants with various developmental disorders (DDs), an increasing number of studies has investigated the prelinguistic development aiming to detect early atypical findings and potential associations with later speech-language development (for reviews see for example Lang et al., 2019; Roche et al., 2018; Yankowitz et al., 2019). Canonical babbling, for example, was reported to be delayed or deviant in infants with hearing impairment (HI; Eilers & Oller, 1994; Koopmans-van Beinum et al., 2001; Moeller et al., 2007; Nathani Iyer & Oller, 2008; Shehata-Dieler et al., 2013; von Hapsburg & Davis, 2006), Down syndrome (DS; Lohmander et al., 2017; Lynch et al., 1995b), cerebral palsy (CP; Levin, 1999; Nyman & Lohmander, 2018), Williams-Beuren syndrome (WBS; Masataka, 2001), Cri-du-chat syndrome (CDS; Sohner & Mitchell, 1991), tuberous sclerosis complex (TSC; Gipson et al., 2021), autism spectrum disorder (ASD; Patten et al., 2014; Paul et al., 2011; Yankowitz et al., 2022), Rett syndrome (RTT; Einspieler et al., 2014; Marschik et al., 2012b, 2013), and fragile X syndrome (FXS; Belardi et al., 2017; Marschik et al., 2014a). Findings were however inconsistent and may depend on measures applied. For example, some infants with late detected developmental disorders (LDDDs such as ASD, RTT, FXS) exhibited a delayed onset of canonical babbling whereas others have reached this milestone at an adequate age, i.e. between 5 and 10 months (Bartl-Pokorny et al., 2022; Lang et al., 2019; Marschik et al., 2013; Yankowitz et al., 2019, 2022).
As findings regarding achievement of developmental milestones in infants with DDs were inconclusive, recent research increasingly aimed at gaining in-depth knowledge about early vocal patterns through the extraction and characterisation of acoustic features of emerging verbal functions. For example, in cry but also in spontaneous infant vocal patterns acoustic features like fundamental frequency (lowest frequency of a periodic waveform, usually denoted as F0) or duration of vocalisations have been documented (Borysiak et al., 2017; Buder et al., 2013; Hamrick et al., 2019; Kent & Murray, 1982; Wermke & Robb, 2010). More complex models on analysing acoustic properties of infant vocalisations include machine learning approaches applied on a set of parameters or features on signal level (Pokorny et al., 2020; Schuller & Batliner, 2013). There are established parameter sets for analysing voice features such as the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS; Eyben et al., 2015) and the Computational Paralinguistics ChallengEs parameter set (ComParE; Schuller et al., 2013).
The features that are included in such sets can be subdivided into three categories: parameters related to frequency aspects (e.g. pitch), parameters related to the energy or amplitude of the signal (e.g. harmonics-to-noise ratio; HNR) and spectral parameters (e.g. harmonic differences). Another common approach to produce a more specialised parameter set is the use of the unsupervised Bag-of-Audio-Words (BoAW) approach to derive the best set of features according to a customised codebook quantisation of the low-level descriptors (LLDs). In addition, machine learning models have been applied to vocalisations including neural networks in different varieties, testing classification tasks (e.g. adult vs. infant speech, canonical vs. non-canonical utterances; Ebrahimpour et al., 2020; Warlaumont et al., 2010). In our group, we have utilised a machine learning approach (i.e. support vector machines) that focused on automatic preverbal vocalisation-based differentiation between typically developing infants and infants later diagnosed with RTT, FXS or ASD (Pokorny et al., 2016a, 2017, 2022). Studies evaluating acoustic features of early vocalisations or applying machine learning models or neural networks will be referred to as “computational studies” hereafter.
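The codebook quantisation behind a Bag-of-Audio-Words representation can be sketched as follows: each frame of LLDs is assigned to its nearest codeword, and the vocalisation is then described by the relative frequency of each codeword. This is a simplified illustration, not the procedure of any cited study. The codebook and frame values are hypothetical, and in practice the codebook would be learned from data and the LLDs would be standardised first.

```python
from typing import List, Sequence


def nearest_codeword(frame: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def bag_of_audio_words(frames: Sequence[Sequence[float]],
                       codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalised histogram of codeword assignments over all frames."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    total = len(frames)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical two-dimensional LLD frames: (F0 in Hz, loudness).
codebook = [(300.0, 0.2), (450.0, 0.6), (900.0, 0.9)]
frames = [(310.0, 0.25), (440.0, 0.5), (460.0, 0.7), (880.0, 0.8)]
histogram = bag_of_audio_words(frames, codebook)
print(histogram)
```

The resulting fixed-length histogram can be passed to a classifier regardless of how long the original vocalisation was.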
Despite recent efforts to perceptually classify preverbal vocal patterns and characterise them acoustically, there is still a lack of synergised information in the field of prodromal or pre-diagnostic development in infants with neurodevelopmental or genetic disorders, especially concerning the pre-babbling phase. Therefore, the current article aimed to (i) outline characteristics of age-specific pre-linguistic vocalisations in the first 5 months of age (i.e. the pre-babbling phase), (ii) summarise computer-based approaches for the automated analysis of physiological and pathological pre-babbling vocalisations, and (iii) compare computer-based approaches on atypical early verbal functions and outline their potential to serve as neurofunctional markers of DDs.
Methods
To address the above-mentioned issues, we systematically searched the existing literature for (a) characteristics of and (b) state-of-the-art computational and observational methods on prelinguistic vocalisations in infants with DDs. We conducted two rounds of paper extraction and selection, the first one in September 2021 and a second one in February and March 2022 in the following online electronic databases: PubMed, Web of Science, Science Direct, Scopus, and PsycINFO using the search strings “infan* AND (prelinguistic OR preverbal OR cooing OR babbling OR vocal) AND (syndrome OR “genetic disorder” OR “developmental disorder”)” and “infan* AND vocal* AND (“computational analysis” OR “acoustic analysis” OR “audio analysis”)”.
Following this initial step, we performed an ancestral search for papers from the retrieved articles and searched Google Scholar for further publications. The retrieved articles were screened by two independent raters (CW and SL). Results were discussed with the co-authors, duplicates were removed, and articles were selected according to the following criteria: (1) peer-reviewed; (2) original studies or reviews and meta-analyses; (3) written in English; and (4) focusing on the pre-babbling age (0 to 5 months) in (4a) typically developing infants and (4b) infants at elevated likelihood for or diagnosed with neurodevelopmental disorders (NDDs), late detected developmental disorders (LDDDs), genetic syndromes, or developmental disorders (DDs). Articles of interest were those based on human coder-based assessments (observational studies) as well as articles on machine learning approaches (computational studies). We intended to focus on spontaneous infant vocalisations and excluded all studies analysing or reporting infant cry or distress vocalisations as well as vocalisations from parent–child interaction paradigms (PCI).
Results
Our literature selection process led to a total of 27 papers, 17 of which are on pre-babbling in infants diagnosed with neurodevelopmental disorders or genetic syndromes applying observational methods (Table 1). Six articles focused on DS, seven on ASD (one of them also including infants with TSC), three on RTT or the preserved speech variant of RTT (PSV), and one on PWS. Two of the 17 articles reported acoustic features in addition to observational characteristics. The remaining ten articles focused on acoustic features/computational models, three studies applying computational methods on pre-babbling behaviour in TD infants (Table 2) and seven papers discussed the babbling stage in infants later diagnosed with a DD (i.e. ASD, CDS, PSV-RTT, RTT, WS and one study reporting on ASD, FXS, and RTT; Table 2). It is important to note that a differentiation between spontaneous vocalisations vs. vocalisations in interactive settings could not be reliably done for all articles. Thus, against the initially set exclusion criterion, we decided to report all observational studies of this age-range and outlined information on data sampling whenever possible (Tables 1 and 2).
Table 1.
Studies analysing pre-babbling behaviour in infants with neurodevelopmental disorders or genetic syndromes applying observational methods (ascending order)
| Observational studies on pre-babbling in neurodevelopmental disorders or genetic syndromes | | | | | |
|---|---|---|---|---|---|
| Authors (year of publication) | Condition: sample size (M/F) | Age range (in months) | Data sampling | Behaviours analysed related to pre-babbling vocalisations | Key results |
| Smith and Oller (1981) | TD: n = 9 (7/2); DS: n = 10 (5/5) | T1: 0–3; T2: 3–6; T3: 6–9; T4: 9–12; T5: 12–15; T6 (DS): 15–18; T7 (DS): 18–21 | | | |
| Legerstee et al. (1992) | DS: n = 8 (4/4) | 2–10 | | | |
| Steffens et al. (1992) | TD: n = 27 (17/10); DS: n = 13 (4/9) | 4–18 | | | |
| Lynch et al. (1995b) | TD: n = 27 (17/10); DS: n = 13 (4/9) | 4–18 | | | |
| Lynch et al. (1995a) | TD: n = 8 (−/−); DS: n = 8 (−/−) | T1: 2–4; T2: 6–8; T3: 10–12 | | | |
| Maestro et al. (2001) | TD: n = 15 (11/4); ASD: n = 15 (10/5) | T1: 0–6; T2: 6–12; T3: 12–18; T4: 18–24 | | | |
| Maestro et al. (2002) | TD: n = 15 (9/6); ASD or PDD-NOS: n = 15 (10/5) | 0–6 | | | |
| Maestro et al. (2005) | TD: n = 13 (5/8); ASD: n = 15 (11/4) | T1: 0–6; T2: 6–12 | | | |
| Marschik et al. (2009) | PSV-RTT: n = 1 (0/1) | 0–24 | | | |
| Apicella et al. (2013) | TD: n = 9 (−/−); ASD: n = 10 (9/1) | T1: 0–6; T2: 6–12 | | | |
| Marschik et al. (2013) | RTT: n = 10 (0/10); PSV-RTT: n = 5 (0/5) | 0–24 | | | |
| Brisson et al. (2014) | TD: n = 13 (9/4); ASD: n = 13 (11/2) | 0–6 | | | |
| Einspieler et al. (2014) | RTT: n = 2 (0/2), monozygotic twins | 0–22 | | | |
| Zappella et al. (2015) | Suspected ASD: n = 18 (18/0); ASD+: n = 10; ASD−: n = 1; TSC: n = 7 | T1: 1–2; T2: 3–4; T3: 5–6 | | | |
| Chericoni et al. (2016) | TD: n = 10 (8/2); ASD: n = 10 (9/1) | T1: 0–6; T2: 6–12; T3: 12–18 | | | |
| Pansy et al. (2019) | PWS: n = 1 (1/0) | 0–6 | | | |
| Onnivello et al. (2021) | DS: n = 74 (38/36) | 4–18 | | | |
ASD autism spectrum disorder, DS Down syndrome, PDD-NOS Pervasive Developmental Disorder-Not Otherwise Specified, PSV-RTT preserved speech variant–Rett syndrome, PWS Prader-Willi syndrome, RTT Rett syndrome, RVA retrospective video analysis, TD typically developing, TSC tuberous sclerosis complex
Only verbal scale attributes of CIRS (for details see Apicella et al., 2013)
SAEVD-R (for details see Nathani et al., 2006)
Bayley Scales of Infant and Toddler Development Third Edition (Bayley, 2006)
Table 2.
Studies applying computational methods on pre-babbling behaviour in typically developing infants and on babbling behaviour in infants with neurodevelopmental disorders or genetic syndromes (ascending order)
| Computational studies on pre-babbling in TD | | | | | |
|---|---|---|---|---|---|
| Authors (year of publication) | Condition: sample size (M/F) | Age range (in months) | Data sampling | Behaviours analysed related to pre-babbling vocalisations | Methods/results |
| Warlaumont et al. (2010) | TD: n = 6 (2/4) | T1: 3–6; T2: 6–9; T3: 9–12 | | | |
| Ebrahimpour et al. (2020) | TD: n = 15 (−/−) | 3–18 | | | |
| Li et al. (2021) | TD: n = 119 (−/−) | 3–12 | | | |
| Computational studies on babbling in neurodevelopmental disorders or genetic syndromes | | | | | |
| Sohner and Mitchell (1991) | CDS: n = 1 (−/−) | 8–26 | | | |
| Pokorny et al. (2016a) | TD: n = 4 (0/4); RTT: n = 4 (0/4) | 6–12 | | | |
| Pokorny et al. (2016b) | ASD: n = 14 (13/−) IT; FXS: n = 1 (1/−) AT; RTT: n = 2 (−/2) AT; RTT: n = 2 (−/2) DE; RTT: n = 5 (−/5) UK; TD: n = 9 (5/4) AT | 7–12 | | | |
| Pokorny et al. (2017) | TD: n = 10 (5/5); ASD: n = 10 (5/5) | 10 | | | |
| Pokorny et al. (2018) | PSV-RTT: n = 1 (−/1) | 7–12 | | | |
| Pokorny et al. (2020) | TD: n = 3 (−/−); RTT: n = 3 (−/−) | 6–12 | | | |
| Ouss et al. (2020) | TD: n = 19 (−/−); WS: n = 32 (−/−); Outcome 48 M: WS with ASD/ID: n = 10; WS without ASD/ID: n = 22 | 9–12 | | | |
ASD autism spectrum disorder, AT Austria, CDS Cri-du-Chat syndrome, DE Germany, FXS fragile X syndrome, ID intellectual disorder/intellectual disability, IT Italy, PSV-RTT preserved speech variant–Rett syndrome, RTT Rett syndrome, RVA retrospective video analysis, TD typical development/typically developing, UAR unweighted average recall, UK United Kingdom, WS West syndrome
LENA (for details see Oller et al., 2010)
SAEVD-R (for details see Nathani et al., 2006)
ComParE (for details see Schuller et al., 2013)
eGeMAPS (for details see Eyben et al., 2015)
Whilst a number of studies report early physiological development according to the established stage models (Fig. 1), reports of atypical development in infants with neurodevelopmental disorders or genetic syndromes at younger ages are rare (Table 1). Most of the 17 included studies report on extended age bands of up to 24 months; very few explicitly investigate the characteristics of early verbal functions emerging in the first 5 months of life (Brisson et al., 2014; Maestro et al., 2002; Pansy et al., 2019; Zappella et al., 2015). Most studies investigate developing verbal functions applying the classical approach of perceptual segmentation-annotation-classification. Less effort has gone into delineating acoustic features (such as duration of vocalisations, syllables or phrases, pitch, fundamental frequency (F0) or intonation contours; Brisson et al., 2014; Lynch et al., 1995a). Observational studies reveal inconclusive results on behavioural differences in pre-babbling vocalisations between infants with DDs and TD infants. Compared to TD infants, several diverse behaviours have been reported for infants with DDs: e.g. longer duration of rhythmic units in infants with DS (Lynch et al., 1995a); divergent intonation contours and less vocal response in interactive settings (Brisson et al., 2014); failure of some participants with ASD to achieve the developmental milestone “cooing” (Maestro et al., 2002; Zappella et al., 2015); and typical vocalisations interspersed with atypical forceful and/or inspiratory vocalisations in infants with RTT (Marschik et al., 2009). More details on age-specific vocalisations and characteristics of this period are outlined in Table 1.
More advanced methods such as digital measurement instruments and computational analyses open new possibilities for earlier identification of atypical development, as they surpass human capabilities of perception. Most approaches identified aim to describe and investigate trends in the typical development of vocalisations throughout the first 5 months of life. Very early studies focus on a categorical analysis of vocalisations, applying spectral analysis to complement the verbal Gestalt perception (Buder et al., 2008; Lynch et al., 1995a; Oller et al., 2019; Warlaumont et al., 2010). The spectra analysed were obtained by applying a window function to short, successive segments of the signal. Most commonly, a fast Fourier transform is then computed for each windowed segment and the results are presented as a graphical visualisation (spectrogram), showing the intensity of frequencies at each point in time (Heideman et al., 1985). With the resulting graphical representation, one can visually determine fundamental and formant frequencies (F0 and Fn, respectively) and the general “shape” of a vocalisation (Bauer & Kent, 1987; Kent & Murray, 1982; Oller et al., 2019). The method of spectrography has been applied in studies over the last 3 decades, finding specific intonation patterns in pre-babbling vocalisations and a developmental trajectory of the F0 and Fn (Kent & Murray, 1982). Oller and colleagues used spectrograms to visualise examples of vocants, squeals, growls, and cries at specific ages, providing a visual description of the noise found in the signal as well as other unique features (e.g. F0 contour) of the analysed classes of utterances (Oller et al., 2019). Another feature, which can be identified through inspection of the spectrogram or the waveform of a vocalisation, is the duration of a single utterance. The duration is used in several studies to gain an understanding of how utterance durations change with age (Apicella et al., 2013; Brisson et al., 2014; Lynch et al., 1995a; Smith & Oller, 1981).
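As an illustration of how such a spectrogram is obtained, the sketch below applies a Hann window to successive frames of a signal and computes the magnitude spectrum of each frame. For self-containment it uses a naive discrete Fourier transform, which yields the same values as a fast Fourier transform but more slowly. The frame length, hop size, sampling rate, and the synthetic 500 Hz tone are assumptions chosen for the example and do not come from any of the cited studies.

```python
import cmath
import math
from typing import List


def hann(n: int) -> List[float]:
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal: List[float], frame_len: int = 256,
                hop: int = 128) -> List[List[float]]:
    """One magnitude spectrum per frame; frames overlap by frame_len - hop."""
    return [magnitude_spectrum(signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]


# Synthetic stand-in for a vocalisation: a pure 500 Hz tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(1024)]
spec = spectrogram(tone)

# The strongest bin of the first frame gives a crude frequency estimate.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 256
print(len(spec), peak_hz)
```

Real infant vocalisations are noisy and harmonically rich, so picking the strongest bin is not a reliable F0 estimator; the sketch only shows where the values displayed in a spectrogram come from.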
More in-depth analyses of audio signals require multi-dimensional parameter sets to provide feature-based representations of the underlying audio segment to a classifier, which can then build an optimal predictor for the classification scheme provided. There are pre-defined parameter sets that are commonly utilised in linguistic and acoustic analyses. Such parameter sets are for example the Computational Paralinguistics ChallengEs parameter set (ComParE; Schuller et al., 2013) or the eGeMAPS (Eyben et al., 2015). These parameter sets consist of low-level descriptors (LLDs). LLDs are parameters that are very closely related to the signal itself (e.g. fundamental frequency F0, loudness). To gain further insights about the general occurrence and statistical behaviour of those LLDs, functionals (e.g. mean, kurtosis, variance) are used on top of these (Schuller & Batliner, 2013).
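The relation between LLDs and functionals can be illustrated with a short sketch that summarises a frame-wise F0 contour by its mean, variance, and kurtosis. The contour values are hypothetical. The sketch also assumes the population variance and the non-excess kurtosis (fourth central moment divided by squared variance), whereas established toolkits may use other definitions.

```python
from statistics import fmean, pvariance
from typing import Dict, List


def functionals(contour: List[float]) -> Dict[str, float]:
    """Summarise a frame-wise LLD contour with three simple functionals."""
    mean = fmean(contour)
    variance = pvariance(contour, mu=mean)
    # Fourth central moment, needed for the kurtosis.
    m4 = fmean([(x - mean) ** 4 for x in contour])
    kurtosis = m4 / variance ** 2 if variance > 0 else float("nan")
    return {"mean": mean, "variance": variance, "kurtosis": kurtosis}


# Hypothetical F0 contour of one vocalisation, one value per frame (Hz).
f0_contour = [380.0, 400.0, 420.0, 400.0]
summary = functionals(f0_contour)
print(summary)
```

Applying the same functionals to every LLD contour turns vocalisations of different lengths into feature vectors of equal length.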
Yet, in the field of pre-babbling vocalisations, most studies rely on basic features such as duration or fundamental frequency to gain a more in-depth understanding of infant vocalisations (Apicella et al., 2013; Brisson et al., 2014; Lynch et al., 1995a; Smith & Oller, 1981). Visual spectrogram analysis has been used to evaluate different vocalisation shapes and help estimate signal-to-noise ratios in certain vocalisation types (Oller et al., 2019). These approaches, whilst not utilising advanced computational methods, highlight the importance of particular features for identification of certain vocalisation types and analysis of developmental trajectories. Lynch and colleagues, who focused on a comparison between TD children and children with DS, present the only study that employs a feature-based approach in the analysis of pre-babbling vocalisations in infants with DDs (Lynch et al., 1995a). In this study, the duration of utterances was compared between DS and TD children across respective timelines. For the first 5 months of life, no significant difference was found between TD infants and infants with DS. Nevertheless, the duration of utterances increases until 8 months of age and then decreases until 12 months of age, continuously diverging between TD and DS groups (Lynch et al., 1995a). Although the methodology is not sensitive enough for an accurate differentiation between the two studied groups, it provides a starting point in the identification of possible features that can be used for future analysis of pre-babbling vocalisations (Lynch et al., 1995a). This early phase of verbal development is not yet very well researched in terms of the effectiveness of the aforementioned parameter sets (i.e. ComParE and eGeMAPS). So far, there is a lack of studies applying advanced computational approaches as well as comparative studies that enable rendering a verdict on their applicability (see Table 2).
Deep learning approaches have been applied to different settings (e.g. interactive settings, home recordings; Pokorny et al., 2020) of pre-segmented infant audio signals to solve superficial classification tasks (e.g. infant vs. adult, canonical vs. non-canonical). However, none of these studies focused on infants with DDs in the first few months of life (Ebrahimpour et al., 2020; Warlaumont et al., 2010).
Several studies on machine learning approaches applied to vocalisations in the first year of life (pre-babbling and babbling) were identified. In the pre-babbling phase, only three studies utilised approaches beyond the manual analysis of LLDs in the assessment of vocalisations in TD infants (Table 2). To the best of our knowledge, there are no studies available in infants at risk or with a later diagnosis of DDs. These approaches investigate the effectiveness of different neural network architectures (i.e. convolutional neural network, self-organising map and perceptron hybrid network), input features (i.e. spectrograms, waveform, parametric representation), and classification schemes (i.e. infant-directed speech vs. adult-directed speech, infant vs. adult, vocalisation vs. non-vocalisation, canonical vs. non-canonical; vocant vs. squeal vs. growl; Ebrahimpour et al., 2020; Li et al., 2021; Warlaumont et al., 2010). In contrast, in the babbling phase, a number of studies analyse verbal capacities utilising computational approaches (e.g. Pokorny et al., 2018, 2020, 2022). In general, manual analysis of LLDs such as fundamental frequency (F0) is not very common for babbling vocalisations. Spectrographic analysis is very often used only for representational purposes, e.g. to represent different syllable types (e.g. Poeppel & Assaneo, 2020). For analysis and detection of atypical development by utilising computational methods, the number of approaches described is limited (Table 2).
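Table 2 lists unweighted average recall (UAR) among its abbreviations. As a brief illustration of why this measure is informative for classification tasks with unequal group sizes, the sketch below computes UAR as the mean of the per-class recalls and compares it with plain accuracy. The labels and predictions are invented for the example and do not reproduce results from any study.

```python
from typing import List, Sequence


def unweighted_average_recall(y_true: Sequence[str],
                              y_pred: Sequence[str]) -> float:
    """Mean of per-class recalls; every class counts equally."""
    recalls: List[float] = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of correctly classified items."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced test set; the classifier always predicts "TD".
y_true = ["TD"] * 8 + ["DD"] * 2
y_pred = ["TD"] * 10

acc = accuracy(y_true, y_pred)
uar = unweighted_average_recall(y_true, y_pred)
print(acc, uar)
```

In this toy case accuracy looks acceptable although no infant of the smaller group is detected, whereas UAR falls to chance level for two classes.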
Discussion
Some 40 years ago, the field of early infant vocalisation study was revolutionised with new ways to assess, measure and interpret early development (Koopmans-van Beinum & van der Stelt, 1986; Oller, 1978; Papoušek, 1994; Roug et al., 1989; Stark, 1980). Since then, we have learned a lot about infant prelinguistic development and vocalisation categories. Most studies, however, focused on babbling and the emergence of first words (second half of the first year of life) whilst the pre-babbling phase (first months of life), especially in infants at elevated likelihood for or diagnosed with neurodevelopmental disorders and genetic syndromes, was less researched.
The very early phase of verbal development is mostly described through the achievement of certain milestones (e.g. phonation, cooing, expansion) or via perceptual assignment of infant vocalisations to certain types (e.g. vocant, canonical syllable). Another, albeit still rarely used, approach is the description of infant vocalisations through acoustic features (e.g. duration, mean pitch, F0). Studies have only recently focused on the investigation of quantitative changes of different vocalisation types in the first 5 months of life (Jhang & Oller, 2017; Oller et al., 2013, 2019). However, these studies have not assessed infants with developmental disorders or genetic syndromes so far. Threshold definitions, such as the canonical babbling ratio (CBR) applied in the second half of the first year of life, have, to the best of our knowledge, not yet been developed or used for types of pre-babbling vocalisations. For the later stages of development, a number of different approaches to define the onset of certain functions (e.g. canonical babbling) have been proposed, providing similar critical time periods in which milestones are achieved (Lang et al., 2021; Molemans et al., 2012; Oller, 2000). Oller and colleagues (Oller et al., 1998, 1999) reported that delayed onset of canonical babbling is a precursor to later adverse linguistic functioning. Whether precursors of atypical development may already be detected in earlier vocalisations has not yet been investigated. Further research observing typical verbal development is still needed as a basis for understanding deviant patterns and trajectories.
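A threshold definition of the kind mentioned above can be sketched in a few lines. The sketch assumes that the CBR is computed as the number of canonical syllables divided by the total number of syllables in a sample, and it uses a cut-off of 0.15 as an illustrative default. Both the exact definition and the cut-off vary between studies and are not specified in this review, so they should be treated as adjustable assumptions.

```python
def canonical_babbling_ratio(n_canonical: int, n_total: int) -> float:
    """Canonical syllables divided by all syllables in a sample."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_canonical <= n_total:
        raise ValueError("n_canonical must lie between 0 and n_total")
    return n_canonical / n_total


def in_canonical_stage(cbr: float, threshold: float = 0.15) -> bool:
    """True if the ratio reaches the (assumed) cut-off."""
    return cbr >= threshold


# Hypothetical annotated samples: (canonical syllables, all syllables).
early_sample = canonical_babbling_ratio(3, 60)
later_sample = canonical_babbling_ratio(12, 60)
print(early_sample, in_canonical_stage(early_sample))
print(later_sample, in_canonical_stage(later_sample))
```

An analogous ratio for pre-babbling vocalisation types would first require agreed vocalisation categories and a validated cut-off, neither of which is available yet.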
Besides pioneering the perceptual evaluation of infant vocalisations, Oller and colleagues were also at the forefront of proposing semi-automated recording and analytical tools for the assessment of infant vocalisations (e.g. LENA system; Oller et al., 2010). Challenges of recording preverbal data as well as advantages of automated tools for the acquisition and analysis of acoustic features have been increasingly discussed (Pokorny et al., 2020). The aim of this article is not to discuss the pros and cons of automated data acquisition approaches but to focus on whether such approaches have been utilised in the study of infant vocalisations in the first half year of life, in typical cohorts, in individuals at elevated likelihood for DDs, or in groups with DDs or pre-/perinatally diagnosed disorders.
When looking beyond behavioural observations and general perceptual evaluations of early infant vocalisations, there is a lack of computational methods that study, substantiate, and support the findings of observational studies. We found that despite the existence of thoroughly tested computational approaches for babbling vocalisations, there have been hardly any attempts to use these methods in the evaluation of pre-babbling vocalisations. These vocalisations, which are perceptually less salient than canonical babbling, have mostly been studied through simple LLDs such as F0 and duration. Only a few studies have used more advanced computational approaches to demonstrate the applicability and value of such approaches in the field of pre-babbling vocalisations (Ebrahimpour et al., 2020; Li et al., 2021; Warlaumont et al., 2010). Besides missing analytical approaches, there is also a lack of standardisation of coding schemes and datasets, which impedes the comparability of performance between applied computational models in the field of speech-language analysis in the first 5 months of life. Additionally, the sample sizes investigated in observational and computational studies are usually small (i.e. 1–119; see Table 2). The generalisation capabilities of machine learning approaches applied to small datasets are questionable. Computational or feature-based approaches are underrepresented in studying pre-babbling vocalisations, especially in infants with NDDs (Brisson et al., 2014; Lynch et al., 1995a). To fingerprint early neurofunctional development and its deviations (Marschik et al., 2017), we need in-depth understanding of physiological functioning as well as disorder-specific characteristics. Early verbal development is one domain of interest that provides clues to the integrity of the developing nervous system.
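As an illustration of what is meant by simple LLDs, the sketch below estimates F0 and duration for a single voiced segment using a plain autocorrelation search. It is a simplified stand-in under stated assumptions, not the procedure of any study reviewed here: the search range of 150–1000 Hz, the function names, and the synthetic 400 Hz test tone are all hypothetical choices. Real infant recordings would additionally require voicing detection, windowing, and handling of octave errors and noise.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=150.0, f0_max=1000.0):
    """Estimate F0 (Hz) of one voiced segment via autocorrelation.

    Assumption: the true F0 lies within [f0_min, f0_max]; this range
    is an illustrative choice, not a value taken from the review.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    n = len(samples)
    if lag_min < 1 or lag_max >= n:
        raise ValueError("segment too short for the requested F0 range")

    def autocorrelation(lag):
        # Unnormalised autocorrelation of the segment at the given lag.
        return sum(samples[i] * samples[i + lag] for i in range(n - lag))

    # The lag with the strongest self-similarity approximates one period.
    best_lag = max(range(lag_min, lag_max + 1), key=autocorrelation)
    return sample_rate / best_lag


def duration_seconds(samples, sample_rate):
    """Duration of the segment in seconds."""
    return len(samples) / sample_rate


# Synthetic 400 Hz tone, 0.1 s at 8 kHz, standing in for a vocalisation.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 400.0 * i / sample_rate) for i in range(800)]
print(estimate_f0(tone, sample_rate), duration_seconds(tone, sample_rate))
```

Descriptors of this kind summarise a vocalisation in one or two numbers, which is why they are easy to compute but capture little of the spectral and temporal structure that more advanced feature sets or learned representations can exploit.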
Recent developments in analytical tools appear well suited for analysing pre-linguistic vocalisations at pre-babbling age and for enhancing our insights into emerging early verbal functions. Pioneering work is required to verify the ability of computational tools to identify disorder-specific features in early vocalisations, which may inform future clinical diagnoses and be used for monitoring therapeutic success.
Acknowledgements
We would like to thank the members of the Systemic Ethology and Developmental Science Team (SEE), especially Christiane Theodossiou-Wegner for critically discussing the manuscript; PBM and FW are supported by the Volkswagen Foundation – IDENTIFIED and a Leibniz Science Campus Audacity Award; CW was funded through the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), SFB1528 – Project C03; SL by Rett Elternhilfe e.V.; DZ and LP were supported by DFG 456967546. We would like to extend our sincere gratitude to all colleagues and experts of early vocal development who have discussed the idea of this manuscript with us and helped to further develop our approach.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Conflict of Interest
All authors declare no direct conflict of interest related to this article. Bölte discloses that he has in the last 3 years acted as an author, consultant, or lecturer for Medice and Roche. He receives royalties for textbooks and diagnostic tools from Hogrefe and Liber. Marschik and Lang receive royalties from Elsevier, Springer, and Urban & Fischer. Bölte is shareholder in SB Education/Psychological Consulting AB and NeuroSupportSolutions International AB.
References
All references marked with an * are included in the review
- *Apicella F, Chericoni N, Costanzo V, Baldini S, Billeci L, Cohen D, & Muratori F (2013). Reciprocity in interaction: A window on the first year of life in autism. Autism Research and Treatment, 2013, 705895. 10.1155/2013/705895
- Bartl-Pokorny KD, Pokorny FB, Garrido D, Schuller BW, Zhang D, & Marschik PB (2022). Vocalisation repertoire at the end of the first year of life: An exploratory comparison of Rett syndrome and typical development. Journal of Developmental and Physical Disabilities. 10.1007/s10882-022-09837-w
- Bauer HR, & Kent RD (1987). Acoustic analyses of infant fricative and trill vocalizations. The Journal of the Acoustical Society of America, 81(2), 505–511. 10.1121/1.394916
- Bayley N (2006). Bayley scales of infant and toddler development (3rd ed.). Psychological Corporation.
- Belardi K, Watson LR, Faldowski RA, Hazlett H, Crais E, Baranek GT, McComish C, Patten E, & Oller DK (2017). A retrospective video analysis of canonical babbling and volubility in infants with fragile X syndrome at 9–12 months of age. Journal of Autism and Developmental Disorders, 47(4), 1193–1206. 10.1007/s10803-017-3033-4
- Borysiak A, Hesse V, Wermke P, Hain J, Robb M, & Wermke K (2017). Fundamental frequency of crying in two-month-old boys and girls: Do sex hormones during mini-puberty mediate differences? Journal of Voice, 31(1), 128.e21–128.e28. 10.1016/j.jvoice.2015.12.006
- *Brisson J, Martel K, Serres J, Sirois S, & Adrien JL (2014). Acoustic analysis of oral productions of infants later diagnosed with autism and their mother. Infant Mental Health Journal, 35(3), 285–295. 10.1002/imhj.21442
- Buder EH, Chorna LB, Oller DK, & Robinson RB (2008). Vibratory regime classification of infant phonation. Journal of Voice: Official Journal of the Voice Foundation, 22(5), 553–564. 10.1016/j.jvoice.2006.12.009
- Buder EH, Warlaumont AS, & Oller DK (2013). An acoustic phonetic catalog of prespeech vocalizations from a developmental perspective. In Peter B & MacLeod AAN (Eds.), Comprehensive perspectives on speech sound development and disorders: Pathways from linguistic theory to clinical practice (pp. 103–134). Nova Publishers.
- *Chericoni N, de Brito Wanderley D, Costanzo V, Diniz-Goncalves A, Leitgel Gille M, Parlato E, Cohen D, Apicella F, Calderoni S, & Muratori F (2016). Pre-linguistic vocal trajectories at 6–18 months of age as early markers of autism. Frontiers in Psychology, 7, 1595. 10.3389/fpsyg.2016.01595
- Cychosz M, Cristia A, Bergelson E, Casillas M, Baudet G, Warlaumont AS, Scaff C, Yankowitz L, & Seidl A (2021). Vocal development in a large-scale crosslinguistic corpus. Developmental Science, 24(5), e13090. 10.1111/desc.13090
- D’Souza H, & Karmiloff-Smith A (2017). Neurodevelopmental disorders. WIREs Cognitive Science, 8(1–2), e1398. 10.1002/wcs.1398
- *Ebrahimpour MK, Schneider S, Noelle DC, & Kello CT (2020). InfantNet: A deep neural network for analyzing infant vocalizations. arXiv preprint arXiv:2005.12412. 10.48550/arXiv.2005.12412
- Eilers RE, & Oller DK (1994). Infant vocalizations and the early diagnosis of severe hearing impairment. Journal of Pediatrics, 124(2), 199–203. 10.1016/s0022-3476(94)70303-5
- Einspieler C, Marschik PB, Domingues W, Talisa VB, Bartl-Pokorny KD, Wolin T, & Sigafoos J (2014). Monozygotic twins with Rett syndrome: Phenotyping the first two years of life. Journal of Developmental and Physical Disabilities, 26(2), 171–182. 10.1007/s10882-013-9351-3
- Estes A, St. John T, & Dager SR (2019). What to tell a parent who worries a young child has autism. JAMA Psychiatry, 76(10), 1092–1093. 10.1001/jamapsychiatry.2019.1234
- Eyben F, Scherer KR, Schuller BW, Sundberg J, André E, Busso C, Devillers LY, Epps J, Laukka P, & Narayanan SS (2015). The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing. IEEE Transactions on Affective Computing, 7(2), 190–202. 10.1109/TAFFC.2015.2457417
- Fitch WT (2010). The evolution of language. Cambridge University Press. 10.1017/CBO9780511817779
- Gipson TT, Ramsay G, Ellison EE, Bene ER, Long HL, & Oller DK (2021). Early vocal development in tuberous sclerosis complex. Pediatric Neurology, 125, 48–52. 10.1101/2021.01.06.21249364
- Hamrick LR, Seidl A, & Tonnsen BL (2019). Acoustic properties of early vocalizations in infants with fragile X syndrome. Autism Research, 12(11), 1663–1679. 10.1002/aur.2176
- Heideman MT, Johnson DH, & Burrus CS (1985). Gauss and the history of the fast Fourier transform. Archive for History of Exact Sciences, 34(3), 265–277.
- Jhang Y, & Oller DK (2017). Emergence of functional flexibility in infant vocalizations of the first 3 months. Frontiers in Psychology, 8, 300. 10.3389/fpsyg.2017.00300
- Johnson MH (2000). Functional brain development in infants: Elements of an interactive specialization framework. Child Development, 71(1), 75–81. 10.1111/1467-8624.00120
- Johnson MH, Charman T, Pickles A, & Jones EJH (2021). Annual research review: Anterior modifiers in the emergence of neurodevelopmental disorders (AMEND)—A systems neuroscience approach to common developmental disorders. Journal of Child Psychology and Psychiatry, 62(5), 610–630. 10.1111/jcpp.13372
- Jones EJH, Gliga T, Bedford R, Charman T, & Johnson MH (2014). Developmental pathways to autism: A review of prospective studies of infants at risk. Neuroscience & Biobehavioral Reviews, 39, 1–33. 10.1016/j.neubiorev.2013.12.001
- Karmiloff-Smith A (2009). Nativism versus neuroconstructivism: Rethinking the study of developmental disorders. Developmental Psychology, 45(1), 56–63. 10.1037/a0014506
- Karmiloff K, & Karmiloff-Smith A (2002). Pathways to language: From fetus to adolescent. Harvard University Press.
- Karmiloff-Smith A (1998). Development itself is the key to understanding developmental disorders. Trends in Cognitive Sciences, 2(10), 389–398.
- Kent RD (2021). Developmental functional modules in infant vocalizations. Journal of Speech, Language, and Hearing Research, 64(5), 1581–1604. 10.1044/2021_JSLHR-20-00703
- Kent RD (2022). The maturational gradient of infant vocalizations: Developmental stages and functional modules. Infant Behavior and Development, 66, 101682. 10.1016/j.infbeh.2021.101682
- Kent RD, & Murray AD (1982). Acoustic features of infant vocalic utterances at 3, 6, and 9 months. The Journal of the Acoustical Society of America, 72(2), 353–365. 10.1121/1.388089
- Koopmans-van Beinum FJ, Clement CJ, & van den Dikkenberg-Pot I (2001). Babbling and the lack of auditory speech perception: A matter of coordination? Developmental Science, 4(1), 61–70. 10.1111/1467-7687.00149
- Koopmans-van Beinum FJ, & van der Stelt JM (1986). Early stages in the development of speech movements. In Lindblom B & Zetterström R (Eds.), Precursors of early speech (pp. 37–50). Stockton.
- Lang S, Bartl-Pokorny KD, Pokorny FB, Garrido D, Mani N, Fox-Boyer AV, Zhang D, & Marschik PB (2019). Canonical babbling: A marker for earlier identification of late detected developmental disorders? Current Developmental Disorders Reports, 6(3), 111–118. 10.1007/s40474-019-00166-w
- Lang S, Willmes K, Marschik PB, Zhang D, & Fox-Boyer A (2021). Prelexical phonetic and early lexical development in German-acquiring infants: Canonical babbling and first spoken words. Clinical Linguistics & Phonetics, 35(2), 185–200. 10.1080/02699206.2020.1731606
- *Legerstee M, Bowman TG, & Fels S (1992). People and objects affect the quality of vocalizations in infants with Down syndrome. Early Development and Parenting, 1(3), 149–156. 10.1002/edp.2430010304
- Levin K (1999). Babbling in infants with cerebral palsy. Clinical Linguistics & Phonetics, 13(4), 249–267. 10.1080/026992099299077
- *Li J, Hasegawa-Johnson M, & McElwain NL (2021). Analysis of acoustic and voice quality features for the classification of infant and mother vocalizations. Speech Communication, 133, 41–61. 10.1016/j.specom.2021.07.010
- Lieberman DE, McCarthy RC, Hiiemae KM, & Palmer JB (2001). Ontogeny of postnatal hyoid and larynx descent in humans. Archives of Oral Biology, 46(2), 117–128. 10.1016/S0003-9969(00)00108-4
- Locke JL (1995). The child’s path to spoken language. Harvard University Press.
- Lohmander A, Holm K, Eriksson S, & Lieberman M (2017). Observation method identifies that a lack of canonical babbling can indicate future speech and language problems. Acta Paediatrica, 106(6), 935–943. 10.1111/apa.13816
- *Lynch MP, Oller DK, Steffens ML, & Buder EH (1995a). Phrasing in prelinguistic vocalizations. Developmental Psychobiology, 28(1), 3–25.
- *Lynch MP, Oller DK, Steffens ML, Levine SL, Basinger DL, & Umbel V (1995b). Onset of speech-like vocalizations in infants with Down syndrome. American Journal of Mental Retardation, 100(1), 68–86.
- *Maestro S, Muratori F, Barbieri F, Casella C, Cattaneo V, Cavallaro MC, Cesari A, Milone A, Rizzo L, Viglione V, Stern DD, & Palacio-Espasa F (2001). Early behavioral development in autistic children: The first 2 years of life through home movies. Psychopathology, 34(3), 147–152. 10.1159/000049298
- *Maestro S, Muratori F, Cavallaro MC, Pecini C, Cesari A, Paziente A, Stern D, Golse B, & Palacio-Espasa F (2005). How young children treat objects and people: An empirical study of the first year of life in autism. Child Psychiatry and Human Development, 35(4), 383–396. 10.1007/s10578-005-2695-x
- *Maestro S, Muratori F, Cavallaro MC, Pei F, Stern D, Golse B, & Palacio-Espasa F (2002). Attentional skills during the first 6 months of age in autism spectrum disorder. Journal of the American Academy of Child and Adolescent Psychiatry, 41(10), 1239–1245. 10.1097/00004583-200210000-00014
- Mareschal D (2011). From NEOconstructivism to NEUROconstructivism. Child Development Perspectives, 5(3), 169–170. 10.1111/j.1750-8606.2011.00185.x
- *Marschik PB, Einspieler C, Oberle A, Laccone F, & Prechtl HF (2009). Case report: Retracing atypical development: A preserved speech variant of Rett syndrome. Journal of Autism and Developmental Disorders, 39(6), 958–961. 10.1007/s10803-009-0703-x
- Marschik PB, Einspieler C, & Sigafoos J (2012a). Contributing to the early detection of Rett syndrome: The potential role of auditory Gestalt perception. Research in Developmental Disabilities, 33(2), 461–466. 10.1016/j.ridd.2011.10.007
- Marschik PB, Pini G, Bartl-Pokorny KD, Duckworth M, Gugatschka M, Vollmann R, Zappella M, & Einspieler C (2012b). Early speech-language development in females with Rett syndrome: Focusing on the preserved speech variant. Developmental Medicine and Child Neurology, 54(5), 451–456. 10.1111/j.1469-8749.2012.04123.x
- *Marschik PB, Kaufmann WE, Sigafoos J, Wolin T, Zhang D, Bartl-Pokorny KD, Pini G, Zappella M, Tager-Flusberg H, & Einspieler C (2013). Changing the perspective on early development of Rett syndrome. Research in Developmental Disabilities, 34(4), 1236–1239. 10.1016/j.ridd.2013.01.014
- Marschik PB, Bartl-Pokorny KD, Sigafoos J, Urlesberger L, Pokorny F, Didden R, Einspieler C, & Kaufmann WE (2014a). Development of socio-communicative skills in 9- to 12-month-old individuals with fragile X syndrome. Research in Developmental Disabilities, 35(3), 597–602. 10.1016/j.ridd.2014.01.004
- Marschik PB, Bartl-Pokorny KD, Tager-Flusberg H, Kaufmann WE, Pokorny F, Grossmann T, Windpassinger C, Petek E, & Einspieler C (2014b). Three different profiles: Early socio-communicative capacities in typical Rett syndrome, the preserved speech variant and normal development. Developmental Neurorehabilitation, 17(1), 34–38. 10.3109/17518423.2013.837537
- Marschik PB, Pokorny FB, Peharz R, Zhang D, O’Muircheartaigh J, Roeyers H, Bölte S, Spittle AJ, Urlesberger B, Schuller B, Poustka L, Ozonoff S, Pernkopf F, Pock T, Tammimies K, Enzinger C, Krieber M, Tomantschger I, Bartl-Pokorny KD, & Bee-Pri Study Group (2017). A novel way to measure and predict development: A heuristic approach to facilitate the early detection of neurodevelopmental disorders. Current Neurology and Neuroscience Reports, 17(5), 43. 10.1007/s11910-017-0748-8
- Masataka N (2001). Why early linguistic milestones are delayed in children with Williams syndrome: Late onset of hand banging as a possible rate-limiting constraint on the emergence of canonical babbling. Developmental Science, 4, 158–164. 10.1111/1467-7687.00161
- Micai M, Fulceri F, Caruso A, Guzzetta A, Gila L, & Scattoni ML (2020). Early behavioral markers for neurodevelopmental disorders in the first 3 years of life: An overview of systematic reviews. Neuroscience & Biobehavioral Reviews, 116, 183–201. 10.1016/j.neubiorev.2020.06.027
- Moeller MP, Hoover B, Putman C, Arbataitis K, Bohnenkamp G, Peterson B, Lewis D, Estee S, Pittman A, & Stelmachowicz P (2007). Vocalizations of infants with hearing loss compared with infants with normal hearing: Part II–transition to words. Ear and Hearing, 28(5), 628–642. 10.1097/AUD.0b013e31812564c9
- Molemans I, van den Berg R, van Severen L, & Gillis S (2012). How to measure the onset of babbling reliably? Journal of Child Language, 39(3), 523–552. 10.1017/S0305000911000171
- Morgan L, & Wren YE (2018). A systematic review of the literature on early vocalizations and babbling patterns in young children. Communication Disorders Quarterly, 40(1), 3–14. 10.1177/1525740118760215
- Nathani Iyer S, & Oller DK (2008). Prelinguistic vocal development in infants with typical hearing and infants with severe-to-profound hearing loss. Volta Review, 108(2), 115–138.
- Nathani S, Ertmer DJ, & Stark RE (2006). Assessing vocal development in infants and toddlers. Clinical Linguistics & Phonetics, 20(5), 351–369. 10.1080/02699200500211451
- Nathani S, & Oller DK (2001). Beyond ba-ba and gu-gu: Challenges and strategies in coding infant vocalizations. Behavior Research Methods, Instruments, & Computers, 33(3), 321–330. 10.3758/bf03195385
- Nyman A, & Lohmander A (2018). Babbling in children with neurodevelopmental disability and validity of a simplified way of measuring canonical babbling ratio. Clinical Linguistics & Phonetics, 32(2), 114–127. 10.1080/02699206.2017.1320588
- Oller DK (1978). Infant vocalization and the development of speech. Allied Health and Behavioral Sciences, 1(4), 523–549.
- Oller DK (2000). The emergence of the speech capacity. Lawrence Erlbaum Associates.
- Oller DK, Buder EH, Ramsdell HL, Warlaumont AS, Chorna L, & Bakeman R (2013). Functional flexibility of infant vocalization and the emergence of language. Proceedings of the National Academy of Sciences, 110(16), 6318–6323. 10.1073/pnas.1300337110
- Oller DK, Caskey M, Yoo H, Bene ER, Jhang Y, Lee C-C, Bowman DD, Long HL, Buder EH, & Vohr B (2019). Preterm and full term infant vocalization and the origin of language. Scientific Reports, 9(1), 1–10. 10.1038/s41598-019-51352-0
- Oller DK, Eilers RE, Neal AR, & Cobo-Lewis AB (1998). Late onset canonical babbling: A possible early marker of abnormal development. American Journal of Mental Retardation, 103(3), 249–263.
- Oller DK, Eilers RE, Neal AR, & Schwartz HK (1999). Precursors to speech in infancy: The prediction of speech and language disorders. Journal of Communication Disorders, 32(4), 223–245. 10.1016/s0021-9924(99)00013-1
- Oller DK, Niyogi P, Gray S, Richards JA, Gilkerson J, Xu D, Yapanel U, & Warren S (2010). Automated vocal analysis of naturalistic recordings from children with autism, language delay, and typical development. Proceedings of the National Academy of Sciences, 107(30), 13354–13359. 10.1073/pnas.1003882107
- *Onnivello S, Schworer EK, Daunhauer LA, & Fidler DJ (2021). Acquisition of cognitive and communication milestones in infants with Down syndrome. Journal of Intellectual Disability Research, jir.12893. 10.1111/jir.12893
- *Ouss L, Palestra G, Saint-Georges C, Leitgel Gille M, Afshar M, Pellerin H, Bailly K, Chetouani M, Robel L, Golse B, Nabbout R, Desguerre I, Guergova-Kuras M, & Cohen D (2020). Behavior and interaction imaging at 9 months of age predict autism/intellectual disability in high-risk infants with West syndrome. Translational Psychiatry, 10(1), 1–7. 10.1038/s41398-020-0743-8
- Pansy J, Barones C, Urlesberger B, Pokorny FB, Bartl-Pokorny KD, Verheyen S, Marschik PB, & Einspieler C (2019). Early motor and pre-linguistic verbal development in Prader-Willi syndrome—A case report. Research in Developmental Disabilities, 88, 16–21. 10.1016/j.ridd.2019.01.012
- Papoušek M (1994). Vom ersten Schrei zum ersten Wort. Anfänge der Sprachentwicklung in der vorsprachlichen Kommunikation. Hans Huber.
- Patten E, Belardi K, Baranek GT, Watson LR, Labban JD, & Oller DK (2014). Vocal patterns in infants with autism spectrum disorder: Canonical babbling status and vocalization frequency. Journal of Autism and Developmental Disorders, 44(10), 2413–2428. 10.1007/s10803-014-2047-4
- Paul R, Fuerst Y, Ramsay G, Chawarska K, & Klin A (2011). Out of the mouths of babes: Vocal production in infant siblings of children with ASD. Journal of Child Psychology and Psychiatry and Allied Disciplines, 52(5), 588–598. 10.1111/j.1469-7610.2010.02332.x
- Poeppel D, & Assaneo MF (2020). Speech rhythms and their neural foundations. Nature Reviews Neuroscience, 21(6), 322–334. 10.1038/s41583-020-0304-4
- *Pokorny FB, Bartl-Pokorny KD, Einspieler C, Zhang D, Vollmann R, Bölte S, Gugatschka M, Schuller BW, & Marschik PB (2018). Typical vs. atypical: Combining auditory Gestalt perception and acoustic analysis of early vocalisations in Rett syndrome. Research in Developmental Disabilities, 82, 109–119. 10.1016/j.ridd.2018.02.019
- *Pokorny FB, Bartl-Pokorny KD, Zhang D, Marschik PB, Schuller D, & Schuller BW (2020). Efficient collection and representation of preverbal data in typical and atypical development. Journal of Nonverbal Behavior, 44(4), 419–436. 10.1007/s10919-020-00332-4
- *Pokorny FB, Marschik PB, Einspieler C, & Schuller BW (2016a). Does she speak RTT? Towards an earlier identification of Rett Syndrome through intelligent pre-linguistic vocalisation analysis. Proceedings of Interspeech 2016, 1953–1957. 10.21437/Interspeech.2016-520
- *Pokorny FB, Peharz R, Roth W, Zöhrer M, Pernkopf F, Marschik PB, & Schuller B (2016b). Manual versus automated: The challenging routine of infant vocalisation segmentation in home videos to study neuro(mal)development. Proceedings of Interspeech 2016, 2997–3001. 10.21437/Interspeech.2016-1341
- *Pokorny FB, Schuller BW, Marschik PB, Brueckner R, Nyström P, Cummins N, Bölte S, Einspieler C, & Falck-Ytter T (2017). Earlier identification of children with autism spectrum disorder: An automatic vocalisation-based approach. Proceedings of Interspeech 2017, 309–313. 10.21437/Interspeech.2017-1007
- Pokorny FB, Schmitt M, Egger M, Bartl-Pokorny KD, Zhang D, Schuller BW, & Marschik PB (2022). Automatic vocalisation-based detection of fragile X syndrome and Rett syndrome. Scientific Reports, 12(1), 1–13. 10.1038/s41598-022-17203-1
- Roche L, Zhang D, Bartl-Pokorny KD, Pokorny FB, Schuller BW, Esposito G, Bölte S, Roeyers H, Poustka L, Gugatschka M, Waddington H, Vollmann R, Einspieler C, & Marschik PB (2018). Early vocal development in autism spectrum disorder, Rett syndrome, and fragile X syndrome: Insights from studies using retrospective video analysis. Advances in Neurodevelopmental Disorders, 2(1), 49–61. 10.1007/s41252-017-0051-3
- Roug L, Landberg I, & Lundberg LJ (1989). Phonetic development in early infancy: A study of four Swedish children during the first eighteen months of life. Journal of Child Language, 16(1), 19–40. 10.1017/s0305000900013416
- Schuller BW, & Batliner AM (2013). Computational paralinguistics: Emotion, affect and personality in speech and language processing. John Wiley & Sons Ltd. 10.1002/9781118706664
- Schuller BW, Steidl S, Batliner A, Vinciarelli A, Scherer K, Ringeval F, Chetouani M, Weninger F, Eyben F, Marchi E, Mortillaro M, Salamin H, Polychroniou A, Valente F, & Kim S (2013). Computational paralinguistics challenge: Social signals, conflict, emotion, autism. Proceedings of Interspeech 2013, 148–152.
- Shehata-Dieler W, Ehrmann-Mueller D, Wermke P, Voit V, Cebulla M, & Wermke K (2013). Pre-speech diagnosis in hearing-impaired infants: How auditory experience affects early vocal development. Speech, Language and Hearing, 16(2), 99–106. 10.1179/2050571x13z.00000000011
- *Smith BL, & Oller DK (1981). A comparative study of pre-meaningful vocalizations produced by normally developing and Down’s syndrome infants. Journal of Speech and Hearing Disorders, 46(1), 46–51. 10.1044/jshd.4601.46
- *Sohner L, & Mitchell P (1991). Phonatory and phonetic characteristics of prelinguistic vocal development in cri du chat syndrome. Journal of Communication Disorders, 24(1), 13–20. 10.1016/0021-9924(91)90030-M
- Stark RE (1980). Stages of speech development in the first year of life. In Yeni-Komshian G, Kavanagh JF, & Ferguson CA (Eds.), Child Phonology (Vol. 1, pp. 73–90). Academic Press.
- *Steffens ML, Oller DK, Lynch M, & Urbano RC (1992). Vocal development in infants with Down syndrome and infants who are developing normally. American Journal of Mental Retardation, 97(2), 235–246.
- Vihman MM, Macken MA, Miller R, Simmons H, & Miller J (1985). From babbling to speech: A re-assessment of the continuity issue. Language, 61(2), 397–445.
- von Hapsburg D, & Davis BL (2006). Auditory sensitivity and the prelinguistic vocalizations of early-amplified infants. Journal of Speech, Language, and Hearing Research, 49(4), 809–822. 10.1044/1092-4388(2006/057)
- *Warlaumont AS, Oller DK, Buder EH, Dale R, & Kozma R (2010). Data-driven automated acoustic analysis of human infant vocalizations using neural network tools. The Journal of the Acoustical Society of America, 127(4), 2563–2577. 10.1121/1.3327460
- Wermke K, & Robb MP (2010). Fundamental frequency of neonatal crying: Does body size matter? Journal of Voice, 24(4), 388–394. 10.1016/j.jvoice.2008.11.002
- Westermann G, Thomas MSC, & Karmiloff-Smith A (2011). Neuroconstructivism. In Goswami U (Ed.), The Wiley-Blackwell handbook of childhood cognitive development (2nd ed., pp. 723–747). Wiley-Blackwell.
- Yankowitz LD, Schultz RT, & Parish-Morris J (2019). Pre- and paralinguistic vocal production in ASD: Birth through school age. Current Psychiatry Reports, 21(12), 126. 10.1007/s11920-019-1113-1
- Yankowitz LD, Petrulla V, Plate S, Tune B, Guthrie W, Meera SS, Tena K, Pandey J, Swanson MR, Pruett JR, Cola M, Russel A, Marrus N, Hazlett HC, Botteron K, Constantino JN, Dager SR, Estes A, Zwaigenbaum L, … The IBIS Network (2022). Infants later diagnosed with autism have lower canonical babbling ratios in the first year of life. Molecular Autism, 13, 28. 10.1186/s13229-022-00503-8
- *Zappella M, Einspieler C, Bartl-Pokorny KD, Krieber M, Coleman M, Bölte S, & Marschik PB (2015). What do home videos tell us about early motor and socio-communicative behaviours in children with autistic features during the second year of life—An exploratory study. Early Human Development, 91(10), 569–575. 10.1016/j.earlhumdev.2015.07.006
- Zhang YS, & Ghazanfar AA (2020). A hierarchy of autonomous systems for vocal production. Trends in Neurosciences, 43(2), 115–126. 10.1016/j.tins.2019.12.006
