Keywords: Evoked potentials, Language acquisition, Syntactic processing, Toddlers
Abstract
Syntax allows human beings to build an infinite number of sentences from a finite number of words. How this unique, productive power of human language unfolds over the course of language development is still hotly debated. When they listen to sentences comprising newly-learned words, do children generalize from their knowledge of the legal combinations of word categories or do they instead rely on strings of words stored in memory to detect syntactic errors? Using novel words taught in the lab, we recorded Evoked Response Potentials (ERPs) in two-year-olds and adults listening to grammatical and ungrammatical sentences containing syntactic contexts that had not been used during training. In toddlers, the ungrammatical use of words, even when they have been just learned, induced an early left anterior negativity (surfacing 100–400 ms after target word onset) followed by a late posterior positivity (surfacing 700–900 ms after target word onset) that was not observed in grammatical sentences. This late effect was remarkably similar to the P600 displayed by adults, suggesting that toddlers and adults perform similar syntactic computations. Our results thus show that toddlers build on-line expectations regarding the syntactic category of upcoming words in a sentence.
1. Introduction
Language's impressive productivity is primarily the result of syntax, a set of processes that allow listeners to encode the relationship between the words in a sentence. Adults rapidly access the syntactic structure of unfolding sentences (e.g., Brown and Hagoort, 1999, Friederici, 1995). But how and when do language learners acquire the syntax of their native language? At the onset of speech production, children typically do not utter well-formed sentences. For instance, toddlers around two years of age tend to use ‘telegraphic’ speech, omitting many grammatical items (e.g., articles, auxiliaries) from their spontaneous productions. This lack of grammatical items was originally interpreted as a sign of their inability to process – or even perceive – these short, and often reduced words. However, a mounting body of more recent experimental studies testing toddlers’ comprehension of grammatical words has shown that children are sensitive to the function words of their language from very early on (Gerken et al., 1990, Shi, 2014, for a review), suggesting that the absence of these words reflects difficulties in planning and uttering multi-word utterances, rather than a lack of receptive knowledge.
The accessibility of grammatical words early in life could be of great use for children's language development, in particular for syntactic categorization (Christophe et al., 2008). Specifically, because nouns tend to be preceded by determiners and verbs by pronouns, children might exploit these differential contextual statistics to group the two types of words into different syntactic classes (Redington et al., 1998). Experimental work supports this hypothesis. Young children familiarized with novel words preceded by one determiner later distinguish cases in which this new word is (correctly) preceded by a different determiner from those in which it is (erroneously) preceded by a pronoun (Höhle et al., 2004, Shi and Melançon, 2010; see also Cauvet et al., 2014). Grammatical words can furthermore help toddlers constrain the meaning of unknown content words. That is, toddlers map a displayed action onto a label when the grammatical structure of the accompanying sentence, as established by its function words, is consistent with a verb context, but not when it is consistent with a noun context (Bernal et al., 2007, Waxman et al., 2009). It is unclear, however, if children in those studies relied on the local co-occurrences of function and content words or whether a more complex syntactic structure was used, as has been shown in older, 4–5-year-old children (e.g., Huttenlocher et al., 2004, Lidz and Musolino, 2002). To better understand whether and how young children compute syntactic structure, we explore the nature and time course of syntactic processing in 24-month-old toddlers.
To examine children's early syntactic abilities, we recorded toddlers’ ERPs while they listened to grammatical and ungrammatical sentences. In adults, ungrammatical sentences typically evoke a P600, a late posterior ERP component that is thought to reflect revision processes or sustained integration processes needed to interpret these sentences (Hagoort and Brown, 2000, Osterhout, 1997, Hagoort, 2011, Kaan et al., 2000, Kuperberg, 2007 for a review). Many studies additionally report an earlier negative component, either an Early Left Anterior Negativity (ELAN) that is thought to reflect the violation of syntactic expectations based on local dependencies (Hahne and Friederici, 1999, Neville et al., 1991, Steinhauer and Drury, 2012 for a review) or an N400, typically associated with the integration of the current word within the on-going semantic representation (Kutas and Hillyard, 1980, Debruille et al., 2008, Kutas and Federmeier, 2000 for a review). In toddlers, like in adults, syntactic violations trigger evoked potentials that are different from those observed for syntactically well-formed co-occurrences. For instance, Oberecker et al. (2005) and Oberecker and Friederici (2006) exposed German-learning 2-year-olds to sentences containing local grammatical errors such as *Der Löwe im brüllt (‘The lion in-the roars’) and observed a greater positivity, interpreted as a P600, for these sentences compared to sentences containing grammatically intact phrases, such as Der Löwe brüllt im Zoo (‘The lion roars in the zoo’). However, the observed effect might be related to memory processes rather than syntactic processing per se: Ungrammatical pairs of words (e.g., “im brüllt”) may create a surprise because they have never been heard before. Bernal et al. (2010) partially addressed this issue by exploiting the ambiguity of certain French function words and constructing ungrammatical sentences in which all adjacent pairs of words occurred together in child-directed speech (e.g., la poire ‘the pear’ is grammatical, so is je la mange ‘I eat it’, but je la poire ‘I pear it’ is ungrammatical), and they observed a significant grammaticality effect. However, even though all word pairs in this study were legal, ungrammatical sentences did contain novel triplets of words, which may have been noticed by toddlers (Gómez and Gerken, 2000).
If toddlers distinguish between grammatical and ungrammatical sentences on the basis of the relative frequency of word strings (pairs, triplets, or more generally n-grams), this would imply that they rely on memorized chunks of sentences and may not be capable of more abstract processing based on syntactic categories, which would be particularly useful in language acquisition, as it would allow learners to generalize to novel situations. However, a frequency of zero does not necessarily imply that a sentence is ungrammatical. It is precisely because syntax is more than a mere frequency analysis that humans can interpret sentences they have never heard before. As a consequence, the only way to eliminate frequency confounds and truly test whether toddlers are able to perform syntactic computations is to use novel words, for which the listener has not yet established any frequency counts. In the present experiment, we thus test listeners’ sensitivity to the grammatical usage of newly-learned nouns and verbs. As these novel words are only heard in a laboratory setting, we can control the contexts in which they occur during training and use novel contexts at test. This way, only if processing is grammatical in nature will listeners be able to discriminate the grammatical and ungrammatical structures. To establish a baseline against which children's performance could be compared, we first test a group of adults in this paradigm in Experiment 1. In Experiment 2, we subsequently extend this work to 24-month-old toddlers.
2. Experiment 1: adults
In Experiment 1, we validate our experimental stimuli by examining adults’ sensitivity to the grammatical structure of sentences containing novel words. This ensures that participants in this experiment rely on the grammatical structure of the sentence, rather than on local sequences of words that tend to co-occur. Although past work has convincingly shown that adults compute the grammatical structure of sentences online, and that they can learn novel words efficiently and with tremendous speed (Batterink and Neville, 2011, Davis and Gaskell, 2009), it is important to verify that newly acquired words undergo syntactic processing similar to well-known words, and hence trigger similar ERP responses.
Adults were first taught the meaning of four new words (e.g., touse meaning triceratops). In the following test phase, they listened to grammatical and ungrammatical sentences containing these novel words (e.g., L’indien pousse le touseN, ‘The Indian pushes the touseN’; *Alors elle le touseN de joie, ‘*Then she tousesN it happily’, respectively), and, solely for comparison purposes, to grammatical and ungrammatical sentences containing familiar words (e.g., Puis elle dispute le chienN, ‘Then she scolds the dogN’; *Et il le chienN à Marie, ‘*And he dogsN it to Marie’, respectively). Crucially, at test, these words were presented following an ambiguous function word (le, ‘the/it’), which had never been used before these critical words during the teaching period. Therefore, both grammatical and ungrammatical sentences contained local co-occurrences of words that had a frequency of zero, and only the computation of the syntactic category of each word in the sentence would allow listeners to detect whether or not a word fits within the syntactic structure. We expect to observe a P600, a robust component often observed following a syntactic error. We might also observe an early component, ELAN or N400, depending on the weight given to the lexical or syntactic information carried by the target word in this particular paradigm (Friederici, 2011, Luck, 2005).
2.1. Methods
2.1.1. Participants
Twenty-one right-handed, native French-speaking adults (18–25 years) with no history of neurological disorders and no known hearing deficits were recruited to participate in the current experiment. Data from 5 additional participants were excluded from the final analysis: 3 because of insufficient data quality due to hair thickness (<45 artifact-free trials in each condition) and 2 because of a technical problem during the recording. Participants received monetary compensation. The study was approved by the local ethical committee for biomedical research.
2.1.2. Stimuli
Four phonotactically legal target words (two nouns and two verbs) were selected as critical words in this study. As the goal was to test children on the same stimuli as adults, invented nouns (touse, rane) referred to animals unlikely to be familiar to young children (a triceratops and a vulture). Invented verbs (pouner, dumer) referred to actions that were imageable, yet non-familiar to young Parisian children (to saddle (a horse) and to fish). We used toys (e.g., triceratops, vulture, fishing rod, fish) to display the objects and actions. In addition, four well-known monosyllabic words (two nouns and two verbs) were used as target words in control trials (chat ‘cat’, chien ‘dog’, donner ‘to give’, manger ‘to eat’).
Ungrammatical sentences were constructed by inserting a noun in a verb position or a verb in a noun position (see Table 1). Within the test sentences, all target words were preceded by the French function word le. This function word could either be a determiner preceding a noun (equivalent to ‘the’ in ‘the catN’, le chatN), or an object clitic preceding a verb (equivalent to ‘it’ in ‘I giveV it’, je le donneV). In this design, the comparison between grammatical and ungrammatical conditions relies on responses evoked by the same string of words (e.g., the critical word touse, preceded by the function word le), thereby ruling out potential acoustical confounds due to the phonological form of words. In addition, we reserved the ‘le’ context for use in the EEG recording session. In the case of newly-learned words, this meant that both grammatical and ungrammatical test sentences contained word strings that subjects had never heard before. Consequently, the transitional probability between le and the target word was zero in all these test sentences, ensuring that differentiation of the two conditions could only be accomplished by retrieving the grammatical category of the target word and checking whether it matched the syntactic context. Sentences with well-known and newly-learned words were similar in structure, and so were grammatical and ungrammatical sentences (counterbalancing involved having, on average, the same number of syllables before and after the critical word across these four subconditions; the number of times the critical word appeared in specific structural positions, for instance subject vs direct object for noun slots, was also counterbalanced across subconditions; see supplementary materials for a full list of experimental stories).
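To make the notion of a zero transitional probability concrete, the following minimal sketch estimates P(target | le) from adjacent word pairs. The utterance list is a hypothetical stand-in for the input heard before test, and the function name `transitional_probability` is introduced here for illustration only; it is not part of the original study materials.

```python
from collections import Counter

def transitional_probability(utterances, first, second):
    """Estimate P(second | first) from adjacent word pairs in the utterances."""
    unigrams = Counter()
    bigrams = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        # Count every word that has a successor, and every adjacent pair.
        for left, right in zip(words, words[1:]):
            unigrams[left] += 1
            bigrams[(left, right)] += 1
    if unigrams[first] == 0:
        return 0.0  # 'first' was never followed by anything: no evidence for the pair
    return bigrams[(first, second)] / unigrams[first]

# Hypothetical input: the novel noun 'touse' is never preceded by 'le',
# whereas the well-known noun 'chat' is.
training = [
    "oh regarde un touse",
    "ce touse est très beau",
    "le chat dort",
    "je le donne",
]

tp_novel = transitional_probability(training, "le", "touse")
tp_known = transitional_probability(training, "le", "chat")
```

With this toy input, the pair le + touse is unattested, so a learner relying only on such pair statistics would have no basis for separating the grammatical from the ungrammatical test sentences.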
Table 1.
Examples of experimental stimuli involving well-known and newly-learned words. Both nouns and verbs occurred in noun and verb contexts, yielding grammatical sentences (when the context was congruent with the critical word syntactic category) and ungrammatical sentences, marked with a star (when context and critical word syntactic category were incongruent).
| | Well-known words: grammatical | Well-known words: ungrammatical | Newly-learned words: grammatical | Newly-learned words: ungrammatical |
|---|---|---|---|---|
| Noun | Martin éloigne le chat du cheval. | *Le cheval le chat pour s’amuser. | L’indien pousse le touse. | *Alors, elle le touse de joie. |
| | Martin keeps the cat away from the horse. | *The horse cats it for fun. | The Indian pushes the touse. | *Then, she touses it happily. |
| Verb | Martin le donne. | *Voilà que le donne se fane. | Marie le dume bien. | *Alors, le dume aboie. |
| | Martin gives it. | *Now the give withers. | Marie dumes it well. | *Then the dume barks. |
We recorded 16 different video clips, in which a native French speaker (the last author) narrated a 30-s story in child-directed speech. She used the toys to illustrate the stories and keep the participants interested during the introduction and filler sentences (see Fig. 1 for a story outline). Test sentences only displayed the speaker's face to keep the visual information similar across conditions, and to minimize eye movements. Each story contained four test sentences (64 in total), constructed in such a way that all parameters were counterbalanced (newly-learned/well-known, grammatical/ungrammatical, noun/verb). Acoustic–prosodic analyses were conducted on the stimuli (see Table 2), and showed that neither the critical word nor its preceding clitic differed in duration or pitch between grammatical and ungrammatical sentences.
Fig. 1.
Example of a video story. In this story the newly-learned word was the verb “pouner” meaning “to saddle” (here, used correctly in trial T2 and incorrectly in T4). During test trials only the speaker's face was visible, whereas in the remainder of the video the whole scene was presented, to keep participants interested.
Table 2.
Duration and pitch of the critical word and preceding clitic.
| Measure | Word | Grammatical mean (standard error) | Ungrammatical mean (standard error) | t(31) (p-value) |
|---|---|---|---|---|
| Duration (ms) | Clitic | 139.0 (5.6) | 138.9 (4.8) | <1 |
| Duration (ms) | Critical word | 392.6 (14.5) | 418.2 (16.5) | 1.18 (0.25) |
| F0 (Hz) | Clitic | 262.9 (6.3) | 257.9 (5.5) | <1 |
| F0 (Hz) | Critical word | 304.4 (7.5) | 299.9 (7.6) | <1 |
As a result, it is unlikely that any difference that arises in the ERPs is due to acoustic characteristics of either the critical words, or the preceding clitics (which constituted most of the 200 ms baseline).
2.1.3. Procedure
Participants were told that they would watch short movies created for children. The experimenter explained that these movies featured novel words and gave them the meaning of these words (touse means ‘triceratops’; rane means ‘vulture’, pouner means ‘to saddle’, and dumer means ‘to fish’). Once the EGI net was put in place relative to external anatomical markers (reference at the vertex), test trials started. Participants watched four blocks of 16 video stories while their EEG was recorded.
Two computers were used to conduct the experiment; one played the video-clips, the other one selected the clip to be played and sent trial information to the EEG recording system.
2.1.4. ERP recording
High-density EEG (128 electrodes referenced to the vertex, Net Amps 200 system, EGI, Eugene, USA) was continuously digitized at 250 Hz during the video presentations. Recordings were pre-processed using custom functions developed within the EEGLAB MATLAB toolbox (Delorme and Makeig, 2004). The signal was digitally band-pass filtered (0.3–20 Hz) and segmented into 1400-ms epochs starting 200 ms prior to target word onset. In order to use the same pre-processing procedure for adults and children, channels at the edge of the scalp, which are generally very noisy in toddlers, were removed from the analyses.¹ For each trial, the channels that were contaminated by eye or motion artifacts (i.e. local deviation higher than 40 μV) were excluded, and channels comprising fewer than 50% good trials over a participant's whole recording session were rejected. Trials with more than four contaminated channels (5%) were not taken into account for the analyses. Excluded channels were interpolated for each trial separately by using the linear interpolation method of EEGLAB. The artifact-free trials (mean 245.4, range 223–256) were averaged for each participant in each condition (Well-known words, grammatical 60.4, ungrammatical 61.3; Newly-learned words, grammatical 62.2, ungrammatical 61.4). Averages were baseline-corrected (−200 ms to 0 ms window) and transformed into reference-independent values using the average of all channels as reference.
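The trial-level steps described above (flagging contaminated channels, rejecting trials, baseline correction, average reference) can be sketched as follows. This is an illustrative outline in plain Python, not the authors' EEGLAB code: all function names are hypothetical, filtering and interpolation are omitted, and "local deviation" is assumed here to mean the largest absolute deviation from a channel's own mean within the epoch.

```python
from statistics import fmean

SAMPLING_RATE_HZ = 250
BASELINE_MS = 200
N_BASELINE = SAMPLING_RATE_HZ * BASELINE_MS // 1000  # 50 pre-onset samples

def contaminated_channels(epoch, threshold_uv=40.0):
    """Indices of channels whose deviation exceeds the threshold (microvolts).

    Assumption: deviation is measured as the largest absolute distance
    from the channel's mean over the epoch.
    """
    bad = []
    for index, samples in enumerate(epoch):
        mean = fmean(samples)
        if max(abs(value - mean) for value in samples) > threshold_uv:
            bad.append(index)
    return bad

def keep_trial(epoch, max_bad_channels=4, threshold_uv=40.0):
    """Keep a trial when at most `max_bad_channels` channels are contaminated."""
    return len(contaminated_channels(epoch, threshold_uv)) <= max_bad_channels

def baseline_correct(samples, n_baseline=N_BASELINE):
    """Subtract the mean of the pre-onset window from every sample."""
    baseline = fmean(samples[:n_baseline])
    return [value - baseline for value in samples]

def average_reference(erp):
    """Subtract the mean across channels at each time point."""
    n_samples = len(erp[0])
    means = [fmean(channel[t] for channel in erp) for t in range(n_samples)]
    return [[channel[t] - means[t] for t in range(n_samples)] for channel in erp]
```

Here an epoch is a list of channels, each a list of voltage samples; a 1400-ms epoch at 250 Hz corresponds to 350 samples per channel.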
2.1.5. Data analysis
Analyses of EEG datasets are complex given the number of electrodes (here 91, after removal of the outer electrodes) and time samples (here 300), which increases the risk of type I errors (false alarms) if each possible comparison is considered (here 91 × 300). To avoid these errors and reduce the number of comparisons, three strategies are generally proposed. The most classical consists in constraining the analysis through the existing literature and computing ANOVAs on the time-windows and scalp regions already reported to be significant in similar experimental conditions. This method has been criticized as being sensitive to biases in the literature reports and in the experimenters’ choices (increasing the risk of false-alarms and ‘double-dipping’), but also as restricting analyses to known effects. Furthermore, in less studied populations, the literature may not be sufficiently dense. A second strategy consists in identifying experimental effects on a subset of the data and then checking whether they replicate on another subset; this strategy reduces the number of trials taken into account, which is problematic with a toddler population where it is challenging to obtain a sufficient number of trials. A final strategy, the cluster-based permutation analysis (Maris and Oostenveld, 2007), exploits the fact that neighboring channels and time-points are highly correlated. It identifies spatio-temporal clusters which exhibit a significant difference between conditions. The statistical value of these clusters is subsequently assessed by comparing them to a null distribution obtained through randomized permutations of the initial data. In practice, a t-test is computed on each electrode and time-point, then a threshold is applied and clusters are built as the sum of the t-values above threshold in neighboring points in time and space. 
The same procedure is applied on the shuffled data and the largest clusters from the original data are compared to the distribution of the clusters obtained in the shuffled data. This general method, which is instantiated in several matlab toolboxes (Fieldtrip, Oostenveld et al., 2010; SPM, Kiebel and Friston, 2004; LIMO, Pernet et al., 2011; TFCE, Mensen and Khatami, 2013), is conservative, and its sensitivity depends on how the clusters are constructed (see Mensen and Khatami, 2013 for a comparison of the different toolboxes and the different choices to construct clusters). In a nutshell, there is a trade-off between sensitivity to local but intense effects vs effects with smaller amplitude but which are more sustained in time and diffuse on the scalp. Here, we used a combination of the last and first approaches, cluster-based permutation analysis (with the Fieldtrip toolbox) and t-tests on identified regions of interest.
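The logic of the cluster-based permutation analysis can be illustrated with a deliberately simplified sketch. It is not the Fieldtrip implementation: it handles a single channel (or region average), so clusters are formed over adjacent time points only and spatial neighborhood is ignored, and the null distribution is built by flipping the sign of each subject's Ungrammatical minus Grammatical difference wave. The function names and the toy data are hypothetical.

```python
import math
import random

def one_sample_t(values):
    """t statistic for the mean of `values` against zero."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance == 0.0:
        return 0.0  # guard against division by zero in this sketch
    return mean / math.sqrt(variance / n)

def cluster_masses(t_values, threshold):
    """Sum t over runs of adjacent same-sign points with |t| above threshold."""
    masses, current, sign = [], 0.0, 0
    for t in t_values:
        s = (t > threshold) - (t < -threshold)  # +1, -1 or 0
        if s != 0 and s == sign:
            current += t  # extend the running cluster
        else:
            if sign != 0:
                masses.append(current)  # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        masses.append(current)
    return masses

def cluster_permutation_test(differences, threshold, n_permutations=1000, seed=0):
    """differences[subject][time] holds one difference wave per subject.

    Returns the largest observed cluster mass and its permutation p-value.
    """
    def max_mass(data):
        t_values = [one_sample_t(column) for column in zip(*data)]
        return max(cluster_masses(t_values, threshold), key=abs, default=0.0)

    observed = max_mass(differences)
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_permutations):
        # Under the null, condition labels are exchangeable within a subject,
        # which amounts to flipping the sign of that subject's difference wave.
        flipped = []
        for row in differences:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * value for value in row])
        if abs(max_mass(flipped)) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)

# Hypothetical data: 12 subjects, 30 time points, with a +3 microvolt effect
# added to samples 10-19 on top of unit-variance Gaussian noise.
toy_rng = random.Random(1)
toy = [[toy_rng.gauss(0.0, 1.0) + (3.0 if 10 <= t < 20 else 0.0) for t in range(30)]
       for _ in range(12)]
# 3.106 is the two-tailed p = .01 critical t value for 11 degrees of freedom.
mass, p_value = cluster_permutation_test(toy, threshold=3.106)
```

Because only the largest cluster of each permutation enters the null distribution, a single test controls the error rate over all time points, which is the property that motivates the approach; the price is the trade-off between focal and diffuse effects discussed above.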
We applied the exact same analysis strategy in adults and toddlers. First, to ensure that a main effect of grammaticality was present in our data, we performed the conservative cluster-based permutation analysis on the main effect of grammaticality (i.e. comparison between grammatical and ungrammatical sentences, pooling together newly-learned and well-known words) using Fieldtrip, with 10,000 iterations and a threshold of p = .01. For this analysis, we considered two time-windows: an early one (100–600 ms) to capture the early effects described in adults, i.e. either a LAN (Left Anterior Negativity), which typically deploys between 100 and 400 ms, or an N400 (around 300–600 ms). The second time-window (500–1000 ms) aims to capture the late P600 response whose typical latency is between 500 and 800 ms, but may be later especially in children (Atchley et al., 2006, Oberecker and Friederici, 2006, Schipke et al., 2011).
We then conducted comparisons on selected time-windows and clusters of electrodes, constraining this selection through the existing literature. The inspection of the 2D-maps of the main effect of grammaticality in adults showed two differences compatible with the literature, a LAN followed by a P600. We selected the time-windows and clusters of electrodes encompassing these effects and averaged the voltage over the selected electrodes and time-windows in each subject and each of the 4 conditions (Grammaticality by Familiarity). We performed paired t-test comparisons between Grammatical vs Ungrammatical sentences for each type of words (newly-learned and well-known). We also examined the Familiarity by Grammaticality interaction. The aim of this second analysis is to assess whether the grammaticality effect that we expect to observe for the well-known words is reproduced when newly-learned words are considered.
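A minimal sketch of this second analysis is given below, assuming each subject's averaged ERP is stored as a list of channels, each a list of samples at 250 Hz with a 200-ms pre-onset baseline. The helper names are hypothetical. For the interaction, the sketch uses the fact that in a 2 × 2 within-subject design the interaction can be tested as a paired t-test on the per-subject grammaticality effects; the source does not specify which test was used.

```python
import math

def ms_to_sample(ms, sampling_rate_hz=250, baseline_ms=200):
    """Convert a latency relative to word onset into an epoch sample index."""
    return (ms + baseline_ms) * sampling_rate_hz // 1000

def roi_mean(erp, channels, start, stop):
    """Mean voltage over the selected channels and samples [start, stop)."""
    values = [erp[ch][t] for ch in channels for t in range(start, stop)]
    return sum(values) / len(values)

def paired_t(a, b):
    """Paired t statistic for a - b and its degrees of freedom (n - 1).

    Assumes the differences are not all identical (non-zero variance).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(variance / n), n - 1

def interaction_t(gram_known, ungram_known, gram_new, ungram_new):
    """Familiarity by Grammaticality interaction as a paired t on effect sizes."""
    effect_known = [u - g for u, g in zip(ungram_known, gram_known)]
    effect_new = [u - g for u, g in zip(ungram_new, gram_new)]
    return paired_t(effect_known, effect_new)
```

Each input list to `paired_t` or `interaction_t` would hold one `roi_mean` value per subject for the relevant condition, computed over the selected electrodes and time-window.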
2.2. Results
2.2.1. Cluster-based permutation analyses
During the early time-window (100–600 ms), the cluster-based permutation analysis did not reveal any significant effect (no cluster reached p < 0.05). In contrast, the analysis of the late time-window revealed a significant positive centro-posterior cluster (p = 0.007) spreading between 650 and 800 ms and involving up to 10 electrodes around Cz and C3 at its peak (between 700 and 750 ms): this effect exhibits the timing and topography typical of a P600, which is almost systematically reported in adults when grammatical and ungrammatical sentences are compared.
2.2.2. Analyses for each type of words
The inspection of the two-dimensional reconstructions of the Ungrammatical–Grammatical difference revealed two expected classical effects: an early left frontal negativity, with the timing and topography of a LAN (Left Anterior Negativity), recorded between 250 and 400 ms after critical word onset, followed by a late centro-posterior positivity observed between 550 and 800 ms, very similar to a P600 and corresponding to the main effect observed in the cluster-based permutation analysis (see Fig. 2). For each time window, we selected a cluster of electrodes based on the maximum of the main effect, in order to examine the effect of grammaticality for each type of words. These clusters of electrodes comprised 12 channels (including C3 and Fz) for the LAN (250–400 ms) and 12 central channels (including Cz and Pz) for the P600 (550–800 ms) (see the black triangles in Fig. 2).
Fig. 2.
Adult results. Top: LAN (250–400 ms); bottom: P600 (550–800 ms). (A) Maps of statistical significance (z-score) of the difference Ungrammatical–Grammatical (triangles represent the electrodes used in the ANOVAs), for well-known words (left) and newly-learned words (right), together with the time course of the activation for the selected cluster of electrodes, over the entire trial (blue curve: grammatical sentences; green curve: ungrammatical sentences); the selected time window is shaded. (B) Mean potentials over the selected time window and electrodes, split by Grammaticality and Familiarity. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
2.2.2.1. Well-known words
A significant Grammaticality effect was observed for these words in both time-windows: LAN (t(20) = 3.18, p = 0.004) and P600 (t(20) = −2.6, p = 0.02; Fig. 2). Consistent with the existing literature, this observation of a classical LAN-P600 complex suggests that adults computed online, as the sentence unfolded, whether the syntactic category of a well-known word matched its syntactic context.
2.2.2.2. Newly-learned words
Only the P600 was observed for the newly-learned words (t(20) = −2.43, p = 0.02). As can be seen in Fig. 2 (bottom), this P600 response was very similar for newly-learned and well-known words. In contrast, during the LAN time-window newly-learned words induced a negative potential for both grammatical and ungrammatical sentences, with no significant difference between them (t(20) = 1.25, p = 0.22). Even though well-known and newly-learned words exhibited different responses during the early time-window, the interaction between Grammaticality and Familiarity was not significant (p > .10). The absence of the early effect on newly-learned words probably explains why this effect did not come out significant in the cluster-based permutation analysis.
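For readers who wish to relate the reported t statistics to their p-values, the sketch below computes a two-tailed p-value for Student's t distribution using only the Python standard library, by numerically integrating the density with Simpson's rule. It assumes the reported tests are two-tailed paired t-tests with 20 degrees of freedom; the function names are illustrative, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson integration of the density over [0, |t|].

    `steps` must be even for Simpson's rule.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_mass

# t statistics reported for adults (df = 20).
reported = {
    "LAN, well-known": 3.18,
    "P600, well-known": -2.6,
    "P600, newly-learned": -2.43,
    "LAN, newly-learned": 1.25,
}
p_values = {label: two_tailed_p(t, 20) for label, t in reported.items()}
```

Under these assumptions, the computed values should fall close to the rounded p-values given in the text.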
2.3. Discussion
Overall, our results show that the well-known words exhibit a classical LAN-P600 complex, while the newly-learned words (although not significantly different from the well-known words) exhibit only a significant P600 effect. LAN components are typically elicited by a violation of a strong syntactic expectation (e.g., the expectation to hear a word belonging to a specific syntactic category). This component is thought to reflect highly automatic syntactic processing (Hahne and Friederici, 1999). The finding that the LAN effect was statistically present only for well-known words might thus indicate that the access to the syntactic category of the newly-learned words may not have been fully automatized yet, which is not too surprising given that these words had been learned only just before the test. In addition, since the newly-learned words designated referents that adults already had a label for (e.g., to saddle, a triceratops), participants may have anticipated the real French words instead of (or in addition to) the novel words.
Importantly, the P600 effect was observed both for misplaced well-known words and for misplaced newly-learned words. The P600 effect is usually assumed to reflect a re-analysis process when the syntactic structure is perceived as incongruent (or sometimes simply very hard to process). This suggests that adults were able to use the syntactic category of a word that they had learned just before the test when processing the syntactic structure of sentences. Having obtained evidence that adults compute on-line the syntactic structure of sentences involving newly-learned words, we now turn to testing toddlers with the same procedure.
Previous research has suggested that children, too, respond differently to grammatical and ungrammatical sentences (Bernal et al., 2010, Oberecker et al., 2005, Oberecker and Friederici, 2006, Schipke et al., 2011, Silva-Pereyra et al., 2005, Silva-Pereyra et al., 2007). It is, however, unclear whether children in those studies differentiated ungrammatical from grammatical sentences on the basis of purely syntactic information or on the basis of frequency. Experiment 2 disentangles these two possibilities by presenting children with the same materials that were used in Experiment 1.
3. Experiment 2
In Experiment 2, 24-month-old toddlers were taught the meaning of four new words in an interactive play session. They heard many different correct contexts for each new word, but the test context, featuring an ambiguous function word, was never used. A week later, they were tested on the same experiment as the adults in Experiment 1 (same stimuli and same procedure). If the grammaticality effects observed in previous ERP studies with children emerged due to the infrequent co-occurrence of the words in ungrammatical trials, then the current experiment should find the same ERP components for both grammatical and ungrammatical sentences, as both had a frequency of zero. By contrast, if toddlers compute the syntactic structure of sentences on-line, and if they have encoded the syntactic category of the new words during the training session, then the electrical response elicited by grammatical and ungrammatical sentences should diverge from one another, for both well-known and newly-learned words.
3.1. Methods
3.1.1. Participants
Twenty-four monolingual French toddlers (11 boys) were tested (mean age 24.4 months; range 23.8–25.1 months). An additional 26 children were tested, but were too agitated to reach the criterion for inclusion in the analysis (at least 16 artifact-free trials in each of the conditions, 20 children), or had too many excluded channels for the interpolation procedure to work reliably (6 children). Another 34 children were recruited to participate, but refused to wear the net (32 children), or were not tested because the parent used the critical phrase during training (2 children, see below).
3.1.2. Stimuli and procedure
Stimuli and procedure were identical to those in Experiment 1, except for a more elaborate learning session, which took place a week prior to test. New words were taught to toddlers in an interactive play session that lasted approximately 20 min. During this first session, toddlers heard many exemplars of good contexts for each new word, but never the one used in the test sentences. For example, the experimenter may have said Oh regarde, un touse! ‘Oh look, a touse!’, or Ce touse est très beau! ‘This touse is beautiful!’ but never said le touse ‘the touse’. The session was videotaped, and later checked to ensure that the critical phrase had not been produced by the experimenter, nor by the parent. The experimenter freely exploited all the socio-pragmatic cues that are known to help word learning. At the end of this session, toddlers were able to point to the toy animals and perform the actions corresponding to the new verbs. Parents were instructed not to use the words during the week.
A week later, the EEG was recorded while children were seated on their parent's lap and watched at least two blocks of 16 video stories. Parents were asked to remain silent throughout the experiment.
3.1.3. ERP recording and data analysis
EEG recording, data processing and analyses were the same as for adults, except that different thresholds were used for artifact rejection because of the higher amplitude of the background EEG and the different types of artifacts in toddlers: (1) Trials with local deviation higher than 80 μV in more than 20% of the channels were rejected. (2) Channels comprising fewer than 50% good trials over the entire recording were rejected. On average, 99.1 artifact-free trials (range 68–172) were kept per toddler (by subcondition: well-known words, grammatical 25.3, ungrammatical 24.3; newly-learned words, grammatical 24.0, ungrammatical 23.4). Data analyses were conducted as in adults.
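The toddler-specific criteria can be summarized in a short sketch. As before, the function names are hypothetical and "local deviation" is assumed to mean the largest absolute deviation from a channel's mean within the epoch; the inclusion check uses the minimum of 16 artifact-free trials per condition stated in the Participants section.

```python
def reject_trial(epoch, threshold_uv=80.0, max_bad_fraction=0.20):
    """Reject a trial when more than 20% of its channels exceed the threshold.

    `epoch` is a list of channels, each a list of voltage samples (microvolts).
    """
    bad = 0
    for samples in epoch:
        mean = sum(samples) / len(samples)
        # Assumption: deviation is the largest absolute distance from the mean.
        if max(abs(value - mean) for value in samples) > threshold_uv:
            bad += 1
    return bad / len(epoch) > max_bad_fraction

def meets_inclusion(trial_counts, minimum=16):
    """A toddler is included when every condition has enough clean trials."""
    return all(count >= minimum for count in trial_counts.values())

# Hypothetical counts of artifact-free trials per condition for one toddler.
example_counts = {
    "well-known grammatical": 25,
    "well-known ungrammatical": 24,
    "newly-learned grammatical": 24,
    "newly-learned ungrammatical": 15,
}
included = meets_inclusion(example_counts)
```

In this hypothetical case the toddler would be excluded, because one condition falls below the 16-trial minimum.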
3.2. Results
3.2.1. Cluster-based permutation analyses
The analysis centered on the 100–600 ms time-window uncovered a significant negative spatio-temporal cluster (p = 0.01) spreading between 100 and 400 ms and involving up to 10 left frontal electrodes around F7 and F3 at its peak (between 150 and 300 ms). The second analysis, centered on the 500–1000 ms time-window, revealed two marginally significant clusters, namely a positive centro-posterior cluster (p = 0.086) spreading between 650 and 800 ms and involving up to 6 electrodes around P3 and Pz at its peak (between 700 and 750 ms), as well as a negative cluster that was its counterpart and spread between 700 and 800 ms, involving up to 4 electrodes around F3 and Fz at its peak (between 700 and 750 ms). These first analyses revealed that toddlers were able to distinguish between ungrammatical and grammatical sentences, even in the strictly controlled contexts used here. Furthermore, they presented a two-step response similar to the LAN-P600 complex that was observed in adults (on well-known words). The goal of the subsequent analyses is to examine whether the main grammaticality effect described here is observed in the newly-learned words as well as in the well-known words.
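For readers unfamiliar with the logic of cluster-based permutation testing, the following sketch shows a simplified, time-only version of the idea. It is not the spatio-temporal analysis reported here: it assumes one Ungrammatical minus Grammatical difference wave per participant on a single channel or channel average, it uses a fixed cluster-forming threshold of |t| > 2.07 (roughly the two-tailed 5% critical value for 23 degrees of freedom), and it builds the null distribution by randomly flipping the sign of each participant's difference wave. All names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def t_values(diffs: List[List[float]]) -> List[float]:
    """One-sample t statistic across participants at each time point."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [subj[t] for subj in diffs]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        # Guard against zero variance (degenerate toy data); treated as no effect.
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_masses(tvals: List[float], thr: float) -> List[Tuple[int, int, float]]:
    """Contiguous same-sign runs of supra-threshold t values as (start, end, mass)."""
    clusters = []
    start, sign, mass = None, 0, 0.0
    for i, t in enumerate(tvals + [0.0]):  # trailing sentinel closes the last run
        s = 1 if t > thr else -1 if t < -thr else 0
        if s != sign:
            if sign != 0:
                clusters.append((start, i - 1, mass))
            start, sign, mass = (i, s, 0.0) if s != 0 else (None, 0, 0.0)
        if s != 0:
            mass += t
    return clusters


def max_abs_mass(tvals: List[float], thr: float) -> float:
    """Largest absolute cluster mass, or 0.0 when no cluster is found."""
    return max((abs(m) for _, _, m in cluster_masses(tvals, thr)), default=0.0)


def cluster_permutation_test(diffs: List[List[float]], thr: float = 2.07,
                             n_perm: int = 1000, seed: int = 0):
    """Return (start, end, mass, p) for every observed cluster."""
    rng = random.Random(seed)
    observed = cluster_masses(t_values(diffs), thr)
    null = []
    for _ in range(n_perm):
        flipped = []
        for subj in diffs:
            sign = rng.choice((-1.0, 1.0))  # flip the whole difference wave
            flipped.append([sign * x for x in subj])
        null.append(max_abs_mass(t_values(flipped), thr))
    results = []
    for start, end, mass in observed:
        n_extreme = sum(1 for m in null if m >= abs(mass))
        results.append((start, end, mass, (1 + n_extreme) / (n_perm + 1)))
    return results
```

Each cluster's p-value is the proportion of permutations whose largest cluster mass is at least as extreme as the observed one, which is what controls for multiple comparisons across time points. Extending the sketch to electrodes would require an adjacency structure for neighboring channels.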
3.2.2. Analyses for each type of words
The inspection of the two-dimensional reconstructions of the Ungrammatical–Grammatical difference revealed a left anterior negativity from 100 to 400 ms after critical word onset, with the timing and topography of a LAN, followed by a late centro-posterior positivity (700–900 ms), with the timing and topography of a P600 (Fig. 3). Both effects are consistent with the cluster-based permutation analysis. We selected a cluster of 13 channels (including C3 and Fz) to analyze the effect present in the early time-window and a second cluster of 15 central channels (including Cz and Pz) for the late effect (black triangles in Fig. 3).
Fig. 3.
Toddler results. Top: early effect (100–400 ms); bottom: late effect (700–900 ms). (A) Maps of statistical significance (z-score) of the difference Ungrammatical–Grammatical, on well-known words and newly-learned words, and the time-course of the activation for the selected cluster of electrodes (selected time-window in gray). (B) Mean voltage over the selected time-window and electrodes, split by Grammaticality and Familiarity.
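The step from ERP waveforms to the per-condition values plotted in panel (B) amounts to averaging over the selected channels and time window. A minimal sketch of that averaging is given here; the function name, the dictionary layout, the 250 Hz sampling rate and the −200 ms epoch start are illustrative assumptions, not values taken from the recording setup.

```python
from typing import Dict, List, Sequence


def window_mean(erp: Dict[str, List[float]], channels: Sequence[str],
                t_start_ms: float, t_end_ms: float,
                sfreq_hz: float, epoch_start_ms: float) -> float:
    """Mean voltage over the given channels and time window (bounds inclusive)."""
    first = round((t_start_ms - epoch_start_ms) * sfreq_hz / 1000.0)
    last = round((t_end_ms - epoch_start_ms) * sfreq_hz / 1000.0)
    values = [v for ch in channels for v in erp[ch][first:last + 1]]
    return sum(values) / len(values)


# Toy data: two hypothetical channels whose "voltage" equals the time in ms,
# sampled at an assumed 250 Hz from -200 ms to 1000 ms.
times_ms = [-200 + 4 * i for i in range(301)]
toy_erp = {"F3": [float(t) for t in times_ms], "F7": [float(t) for t in times_ms]}

# Average over the early window (100-400 ms) for the two toy channels.
early = window_mean(toy_erp, ["F3", "F7"], 100, 400, sfreq_hz=250, epoch_start_ms=-200)
```

Computing this value separately for each participant and condition yields the numbers that enter the t-tests reported in the following subsections.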
3.2.2.1. Well-known words
The analyses restricted to the well-known words revealed a significant Grammaticality effect for both time-windows: a LAN-like early left frontal negativity (100–400 ms: t(23) = 2.81, p = 0.015) as well as a P600-like late centro-posterior positivity (700–900 ms: t(23) = −2.1, p = 0.05).
3.2.2.2. Newly-learned words
A significant Grammaticality effect was also observed for both time-windows: a LAN-like early left anterior negativity (t(23) = 2.10, p = 0.047), followed by a P600-like late centro-posterior positivity (t(23) = −2.68, p = 0.014). As can be seen in Fig. 3, newly-learned and well-known words look very much alike, both exhibiting a significant early effect with very similar topography and timing, as well as a similar P600 effect. Follow-up analyses showed no difference in the size of the grammaticality effects between the two types of words, for both time-windows (Grammaticality by Familiarity interaction: both F(1,23) < 1; see footnote 2).
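The statistics in this section reduce to simple operations on per-participant window means. As a hedged sketch: the Grammaticality effect within one word type is a paired t-test, and in a 2 × 2 fully within-participant design the Grammaticality by Familiarity interaction F(1, n−1) equals the squared paired t comparing the two grammaticality effects. The function names are hypothetical, and the sign of t depends on the order of subtraction, which is an assumption in this sketch.

```python
import math
from typing import List


def paired_t(a: List[float], b: List[float]) -> float:
    """Paired t statistic for a - b; degrees of freedom are len(a) - 1."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    return mean / math.sqrt(var / n)


def interaction_f(known_ungram: List[float], known_gram: List[float],
                  new_ungram: List[float], new_gram: List[float]) -> float:
    """F(1, n-1) for the 2x2 within-participant interaction.

    Computed as the squared paired t of the difference of differences.
    """
    effect_known = [u - g for u, g in zip(known_ungram, known_gram)]
    effect_new = [u - g for u, g in zip(new_ungram, new_gram)]
    return paired_t(effect_known, effect_new) ** 2
```

With 24 participants, both functions yield statistics with 23 denominator degrees of freedom, matching the t(23) and F(1,23) values reported above.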
3.3. Discussion
Two-year-olds process newly-learned and well-known words in a very similar fashion, and detect cases in which a word's syntactic category does not match its syntactic context with equal ease in both conditions. This is all the more striking given that the newly-learned words had never been heard in the grammatical contexts before the EEG experiment. Two-year-olds are thus able to distinguish between strings of words depending on whether or not they are grammatical, even though all those word strings had a frequency of zero. To do so, they first had to assign a syntactic category to the newly-learned words during the training session (using either their linguistic contexts, their meanings, or both), and then correctly compute which contexts were legal for these syntactic categories.
4. General discussion
In a series of two studies, we examined adults’ and children's ability to execute syntactic computations online. Using ERPs, we found that adults and children alike rapidly compute expectations regarding the possible syntactic categories of upcoming words and match these to the actual syntactic category of the words they hear. Moreover, this pattern of results was found regardless of whether listeners had long known the critical words or acquired them only recently. As these newly-learned words had never been heard in any of the test contexts, and both grammatical and ungrammatical sentences contained novel strings of words (le + critical word), toddlers’ responses cannot have been due to hearing unfamiliar strings of words. This suggests that 24-month-olds, as well as adults, rely on abstract syntactic categories during on-line analysis of their language input.
To be able to recognize the ungrammatical use of newly-learned words, toddlers must have proceeded in two steps. First, they had to learn the syntactic category of the novel word (i.e. noun or verb) during the training session. Second, as toddlers had never heard the newly-learned words preceded by the function word used at test (i.e. le), they had to exploit their knowledge of their native language syntax to compute which contexts are legal for each of the newly-learnt words and extrapolate this information during online syntactic processing. Note that the syntactic contexts used here were particularly complex, in that they featured an ambiguous function word, le, which could either be an article or an object clitic. Toddlers were thus able to use the syntactic context preceding an ambiguous function word to identify its role in the sentence and then build expectations regarding the syntactic category of the following content word. Children's sensitivity to the sentence structure of their language may hence be grammatical rather than solely distributional in nature from early on.
For both toddlers and adults, ungrammatical sentences induced a sequence of two components: an early negativity followed by a late P600-like component. Both components were very similar in toddlers and adults. The P600 effect surfaced in both populations as a central positivity, and was delayed by about 150 ms in toddlers relative to adults, which is in line with previous developmental studies (Atchley et al., 2006, Hahne et al., 2004, Holcomb et al., 1992). The early effect surfaced in both populations as a left-lateralized early negativity, resembling a LAN, with overlapping topographies and time-windows. The only difference between toddlers and adults is that toddlers exhibited similar responses on both newly-learned and well-known words, while in adults newly-learned words did not show the early effect. This may be due to either or both of the following factors: First, adults learned the novel words just before the experiment, while toddlers had the opportunity to sleep between the learning and the test sessions, and sleep is known to consolidate lexical knowledge (Davis and Gaskell, 2009, Friedrich et al., 2015). Second, toddlers are used to learning novel words every day, and had yet to build a lexical entry for the referents that were used, whereas word learning is a more unusual activity for adults, who in addition had already stored a competing lexical item for the referents. Overall, the presence of this LAN-like early effect, a component often observed in the context of phrase structure violations that can be detected on the basis of a fast template-matching process, suggests that the computation of syntactic category expectations is fast and robust in toddlers, as it is in adults.
The early latency of the first grammaticality effect raises interesting questions regarding the time course of grammatical processing in young children. Specifically, as the entire effect (observed between 100 and 400 ms after the onset of the critical word) develops before the end of the critical words (average word duration: 405 ms), this suggests that the misplacement of the critical word is noticed extremely fast. That is, after hearing as little as just the first few phonemes of a content word, children can infer whether or not this word is correctly placed in the sentence context. Specific features of our experimental paradigm may have promoted this extremely fast processing. Each story featured only two protagonists performing a small number of actions and these stories were narrated using a very limited number of nouns and verbs. Because each story typically only contained 3–5 content words, anticipating the syntactic category of the next word in the sentence may have allowed toddlers to restrict their expectations to a very small number of specific words (typically 1 or 2 only). For this reason, hearing the initial phoneme of the critical item was sufficient to determine whether the unfolding word matched the expected lexical item (see Connolly and Phillips, 1994, for a similar effect in adults). Note, however, that although the minimal selection of words may have helped toddlers’ anticipation of the specific content words, restriction of the set could only be achieved once children had computed the structure of the phrase and inferred the category of the upcoming word.
If children's sensitivity to the syntactic contexts of nouns and verbs is so robust, then how did they acquire this knowledge? In other work, we explored the possibility that children bootstrap into the abstract categories of their language by initially relying on a small number of highly frequent nouns and verbs. This hypothesis rests on two pre-requisites: First, toddlers should be able to learn the meaning of this small group of frequent words by solely relying on their visual contexts. Recent findings showing that infants as young as 6 months of age already display some knowledge about some content words (Bergelson and Swingley, 2012, Bergelson and Swingley, 2013) suggest that this hypothesis is plausible. Second, within this set of known words, they should be able to group objects with objects and actions with actions, again a plausible assumption given the existing work on concept formation in infants (see Carey, 2011 for a review). These initial semantic-based categories may then be expanded by collecting the contexts in which words from these categories occur. A simulation of this procedure on a CHILDES corpus of child-directed speech showed that even with an initial vocabulary as small as only 6 nouns and 2 verbs, new words can be categorized with 75–80% precision (Brusini et al., 2011, Gutman et al., 2015), suggesting that it is a plausible strategy for learning noun and verb contexts during the first year of life.
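The bootstrapping idea can be made concrete with a toy example. The sketch below is not the cited computational models: it assumes that a word's context is simply the immediately preceding word, it uses a handful of invented English utterances rather than a CHILDES corpus, and all names are hypothetical. It first counts which preceding words occur before a few seed nouns and verbs, then labels an unknown word by the category its own preceding words have voted for most often.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def learn_contexts(utterances: List[List[str]],
                   seeds: Dict[str, str]) -> Dict[str, Counter]:
    """Count how often each preceding word occurs before a seed noun or seed verb."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for utt in utterances:
        for prev, word in zip(utt, utt[1:]):
            if word in seeds:
                counts[prev][seeds[word]] += 1
    return counts


def categorize(word: str, utterances: List[List[str]],
               counts: Dict[str, Counter]) -> Optional[str]:
    """Label a word by summing the category votes of the words that precede it."""
    votes: Counter = Counter()
    for utt in utterances:
        for prev, w in zip(utt, utt[1:]):
            if w == word:
                votes.update(counts.get(prev, Counter()))
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return None  # no evidence, or a tie between categories
    return ranked[0][0]


# Invented toy corpus; 'touse' and 'dase' stand in for novel words.
corpus = [
    "the dog can run".split(),
    "a cat will jump".split(),
    "the ball can roll".split(),
    "she will eat the apple".split(),
    "the touse can dase".split(),
]
seeds = {"dog": "noun", "cat": "noun", "ball": "noun", "run": "verb", "jump": "verb"}
counts = learn_contexts(corpus, seeds)
```

In this toy corpus, `categorize("touse", corpus, counts)` returns `"noun"` because the preceding word only ever occurred before seed nouns, and `categorize("dase", corpus, counts)` returns `"verb"` for the analogous reason. Real child-directed speech is far noisier, which is why the reported precision is well below 100%.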
5. Conclusion
Our results indicate that two-year-old children, like adults, are able to deduce the syntactic category of novel words and use this information to infer the syntactic structures in which these novel words can occur. In addition, we show that two-year-olds compute online expectations regarding the syntactic category of upcoming words, suggesting that children's early syntactic processing is extremely robust. This, in turn, may facilitate lexical acquisition. When children encounter a novel word for the first time, they could assign the anticipated syntactic category to that word, which would provide a major boost to the acquisition of the lexicon.
Acknowledgements
This work was supported by the French Ministry of Research, the French Agence Nationale de la Recherche (grants n° ANR-2010-BLAN-1901, ANR-13-APPR-0012, ANR-10-IDEX-0001-02 PSL* and ANR-10-LABX-0087 IEC), the Fondation de France, as well as by the Région Ile-de-France and European Research Council 269502, which supported P.B. while writing this manuscript. We thank M. van Heugten, A. Cristia, S. Kouider, and S. Peperkamp for suggestions on the manuscript; A.-C. Fiévet for help with the data collection, L. Barbosa for help with data analysis, V. Ul for technical help, and Dr. Billard, as well as all children and their families for their participation.
Footnotes
From the 128-channel HydroCel Geodesic Sensor Net, the following electrodes, which represent the three outer-most circles of the geodesic net, were removed: 17-126-127-21-14-8-1-125-121-120-119-114-113-107-99-94-88-81-73-68-63-56-49-43-48-128-44-38-32-25-100-95-89-82-74-69-64-57. As a result, 91 electrodes are analyzed.
Following a reviewer's suggestion, we conducted an additional analysis to see whether nouns and verbs elicited similar effects or not. In two ANOVAs that included a Category factor (Noun vs Verb), together with the Grammaticality factor (using the time-windows and electrodes selected for the overall analysis, and pooling together Well-known and Newly-learnt words), we observed only a significant main effect of Grammaticality; word Category did not yield a main effect, nor did it interact with Grammaticality. Both the early and the late effects were thus present for both nouns and verbs.
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.dcn.2016.02.009.
Appendix A. Supplementary data
Example of one of the 16 videos presented to the subjects. The beeps present at the beginning of each critical word were used as a marker to time-lock the EEG data and were not heard by participants.
References
- Atchley R.A., Rice M.L., Betz S.K., Kwasny K.M., Sereno J.A., Jongman A. A comparison of semantic and syntactic event related potentials generated by children and adults. Brain Lang. 2006;99(3):236–246. doi: 10.1016/j.bandl.2005.08.005.
- Batterink L., Neville H. Implicit and explicit mechanisms of word learning in a narrative context: an event-related potential study. J. Cogn. Neurosci. 2011;23(11):3181–3196. doi: 10.1162/jocn_a_00013.
- Bergelson E., Swingley D. At 6–9 months, human infants know the meanings of many common nouns. Proc. Natl. Acad. Sci. 2012;109(9):3253–3258. doi: 10.1073/pnas.1113380109.
- Bergelson E., Swingley D. The acquisition of abstract words by young infants. Cognition. 2013;127(3):391–397. doi: 10.1016/j.cognition.2013.02.011.
- Bernal S., Dehaene-Lambertz G., Millotte S., Christophe A. Two-year-olds compute syntactic structure on-line. Dev. Sci. 2010;13(1):69–76. doi: 10.1111/j.1467-7687.2009.00865.x.
- Bernal S., Lidz J., Millotte S., Christophe A. Syntax constrains the acquisition of verb meaning. Lang. Learn. Dev. 2007;3(4):325–341.
- Brown C., Hagoort P. Architectures and Mechanisms for Language Processing. Cambridge University Press; 1999. On the electrophysiology of language comprehension: implications for the human language system.
- Brusini P., Amsili P., Chemla E., Christophe A. Presented at the Society for Research on Child Development Biennial Meeting, Montreal, Canada. 2011. Learning to categorize nouns and verbs on the basis of a few known examples: a computational model relying on 2-word contexts.
- Carey S. The origin of concepts: a precis. Behav. Brain Sci. 2011;34(3):113–162. doi: 10.1017/S0140525X10000919.
- Cauvet E., Limissuri R., Millotte S., Skoruppa K., Cabrol D., Christophe A. Function words constrain on-line recognition of verbs and nouns in French 18-month-olds. Lang. Learn. Dev. 2014;10(1):1–18.
- Christophe A., Millotte S., Bernal S., Lidz J. Bootstrapping lexical and syntactic acquisition. Lang. Speech. 2008;51(1–2):61–75. doi: 10.1177/00238309080510010501.
- Connolly J.F., Phillips N.A. Event-related potential components reflect phonological and semantic processing of the terminal word of spoken sentences. J. Cogn. Neurosci. 1994;6(3):256–266. doi: 10.1162/jocn.1994.6.3.256.
- Davis M.H., Gaskell M.G. A complementary systems account of word learning: neural and behavioural evidence. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 2009;364(1536):3773–3800. doi: 10.1098/rstb.2009.0111.
- Debruille J.B., Ramirez D., Wolf Y., Schaefer A., Nguyen T.-V., Bacon B.A., …, Brodeur M. Knowledge inhibition and N400: a within- and a between-subjects study with distractor words. Brain Res. 2008;1187:167–183. doi: 10.1016/j.brainres.2007.10.021.
- Delorme A., Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods. 2004;134(1):9–21. doi: 10.1016/j.jneumeth.2003.10.009.
- Friederici A.D. The time course of syntactic activation during language processing: a model based on neuropsychological and neurophysiological data. Brain Lang. 1995;50(3):259–281. doi: 10.1006/brln.1995.1048.
- Friederici A.D. The brain basis of language processing: from structure to function. Physiol. Rev. 2011;91(4):1357–1392. doi: 10.1152/physrev.00006.2011.
- Friedrich M., Wilhelm I., Born J., Friederici A.D. Generalization of word meanings during infant sleep. Nat. Commun. 2015:6. doi: 10.1038/ncomms7004.
- Gerken L., Landau B., Remez R.E. Function morphemes in young children's speech perception and production. Dev. Psychol. 1990;26(2):204–216.
- Gómez R.L., Gerken L. Infant artificial language learning and language acquisition. Trends Cogn. Sci. 2000;4(5):178–186. doi: 10.1016/s1364-6613(00)01467-4.
- Gutman A., Dautriche I., Crabbé B., Christophe A. Bootstrapping the syntactic bootstrapper: probabilistic labeling of prosodic phrases. Lang. Acquis. 2015;22(3):285–309.
- Hagoort P. Interplay between syntax and semantics during sentence comprehension: ERP effects of combining syntactic and semantic violations. J. Cogn. Neurosci. 2011;15(6):883–899. doi: 10.1162/089892903322370807.
- Hagoort P., Brown C.M. ERP effects of listening to speech compared to reading: the P600/SPS to syntactic violations in spoken sentences and rapid serial visual presentation. Neuropsychologia. 2000;38(11):1531–1549. doi: 10.1016/s0028-3932(00)00053-1.
- Hahne A., Eckstein K., Friederici A.D. Brain signatures of syntactic and semantic processes during children's language development. J. Cogn. Neurosci. 2004;16(7):1302–1318. doi: 10.1162/0898929041920504.
- Hahne A., Friederici A.D. Electrophysiological evidence for two steps in syntactic analysis: early automatic and late controlled processes. J. Cogn. Neurosci. 1999;11(2):194–205. doi: 10.1162/089892999563328.
- Höhle B., Weissenborn J., Kiefer D., Schulz A., Schmitz M. Functional elements in infants’ speech processing: the role of determiners in the syntactic categorization of lexical elements. Infancy. 2004;5(3):341–353.
- Holcomb P.J., Coffey S.A., Neville H.J. Visual and auditory sentence processing: a developmental analysis using event-related brain potentials. Dev. Neuropsychol. 1992;8(2–3):203–241.
- Huttenlocher J., Vasilyeva M., Shimpi P. Syntactic priming in young children. J. Mem. Lang. 2004;50(2):182–195.
- Kaan E., Harris A., Gibson E., Holcomb P. The P600 as an index of syntactic integration difficulty. Lang. Cogn. Process. 2000;15(2):159.
- Kiebel S.J., Friston K.J. Statistical parametric mapping for event-related potentials: I. Generic considerations. NeuroImage. 2004;22(2):492–502. doi: 10.1016/j.neuroimage.2004.02.012.
- Kuperberg G. Neural mechanisms of language comprehension: challenges to syntax. Brain Res. 2007;1146:23–49. doi: 10.1016/j.brainres.2006.12.063.
- Kutas M., Federmeier K. Electrophysiology reveals semantic memory use in language comprehension. Trends Cogn. Sci. 2000;4(12):463–470. doi: 10.1016/s1364-6613(00)01560-6.
- Kutas M., Hillyard S.A. Reading senseless sentences: brain potentials reflect semantic incongruity. Science. 1980;207(4427):203–205. doi: 10.1126/science.7350657.
- Lidz J., Musolino J. Children's command of quantification. Cognition. 2002;84(2):113–154. doi: 10.1016/s0010-0277(02)00013-6.
- Luck S. 2005. An Introduction to the Event-Related Potential Technique (Cognitive Neuroscience). A Bradford Book. Retrieved from http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/0262621967.
- Maris E., Oostenveld R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods. 2007;164(1):177–190. doi: 10.1016/j.jneumeth.2007.03.024.
- Mensen A., Khatami R. Advanced EEG analysis using threshold-free cluster-enhancement and non-parametric statistics. Neuroimage. 2013;67:111–118. doi: 10.1016/j.neuroimage.2012.10.027.
- Neville H., Nicol J.L., Barss A., Forster K.I., Garrett M.F. Syntactically based sentence processing classes: evidence from event-related brain potentials. J. Cogn. Neurosci. 1991;3(2):151–165. doi: 10.1162/jocn.1991.3.2.151.
- Oberecker R., Friederici A.D. Syntactic event-related potential components in 24-month-olds’ sentence comprehension. Neuroreport. 2006;17(10):1017–1021. doi: 10.1097/01.wnr.0000223397.12694.9a.
- Oberecker R., Friedrich M., Friederici A.D. Neural correlates of syntactic processing in two-year-olds. J. Cogn. Neurosci. 2005;17(10):1667–1678. doi: 10.1162/089892905774597236.
- Oostenveld R., Fries P., Maris E., Schoffelen J.-M. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2010;2011:e156869. doi: 10.1155/2011/156869.
- Osterhout L. On the brain response to syntactic anomalies: manipulations of word position and word class reveal individual differences. Brain Lang. 1997;59(3):494–522. doi: 10.1006/brln.1997.1793.
- Pernet C.R., Chauveau N., Gaspar C., Rousselet G.A. LIMO EEG: a toolbox for hierarchical linear modeling of electroencephalographic data. Comput. Intell. Neurosci. 2011;2011:e831409. doi: 10.1155/2011/831409.
- Redington M., Chater N., Finch S. Distributional information: a powerful cue for acquiring syntactic categories. Cogn. Sci. 1998;22(4):425–469.
- Schipke C.S., Friederici A.D., Oberecker R. Brain responses to case-marking violations in German preschool children. Neuroreport. 2011;22(16):850–854. doi: 10.1097/WNR.0b013e32834c1578.
- Shi R. Functional morphemes and early language acquisition. Child Dev. Perspect. 2014;8(1):6–11.
- Shi R., Melançon A. Syntactic categorization in French-learning infants. Infancy. 2010;15(5):517–533. doi: 10.1111/j.1532-7078.2009.00022.x.
- Silva-Pereyra J., Conboy B.T., Klarman L., Kuhl P.K. Grammatical processing without semantics? An event-related brain potential study of preschoolers using jabberwocky sentences. J. Cogn. Neurosci. 2007;19(6):1050–1065. doi: 10.1162/jocn.2007.19.6.1050.
- Silva Pereyra J.F., Klarman L., Lin L.J.-F., Kuhl P.K. Sentence processing in 30-month-old children: an event-related potential study. Neuroreport. 2005;16(6):645–648. doi: 10.1097/00001756-200504250-00026.
- Steinhauer K., Drury J.E. On the early left-anterior negativity (ELAN) in syntax studies. Brain Lang. 2012;120(2):135–162. doi: 10.1016/j.bandl.2011.07.001.
- Waxman S.R., Lidz J.L., Braun I.E., Lavin T. Twenty-four-month-old infants’ interpretations of novel verbs and nouns in dynamic scenes. Cognit. Psychol. 2009;59(1):67–95. doi: 10.1016/j.cogpsych.2009.02.001.