Published in final edited form as: Top Cogn Sci. 2019 Mar 5;12(1):153–169. doi: 10.1111/tops.12416

Children and Adults as Language Learners: Rules, Variation, and Maturational Change

Elissa L Newport 1

Abstract

Here we review our recent research investigating children's and adults' learning of rules and variation. Across these studies, we find that children and adults differ in how they acquire linguistic patterns that are productive, variable, inconsistently used, or lexically restricted. Some of our studies examine children's learning of natural languages; others expose learners to miniature languages and then ask them to produce novel sentences or judge their grammaticality. In every case there are important differences between learners as a function of their age. Young children learn categorical rules and categorically follow patterns that are widespread in natural languages, even when their linguistic input exemplifies these patterns only probabilistically. In contrast, adult learners reproduce the probabilistic patterns of the input. Older children are in between, producing regular patterns somewhat more often than they appear in the input but also acquiring some probabilistic variation. These results suggest that the outcome of learning is quite different at different ages and that many of the properties of natural languages may be shaped by the behavior of children as they learn their native languages.

Keywords: Language acquisition, Statistical learning, Morphology, Rules, Inconsistent input

Preface to Lila

I am incredibly honored to be invited to write this paper to celebrate the award of the Rumelhart Prize to Lila Gleitman. Many years ago, Lila and her dear husband, Henry, were my graduate advisors and shaped my thinking in countless ways regarding how to select an important problem to study and how to design research that could contribute to the answers. Lila has been a lifelong inspiration for my research. From her I have acquired an appreciation of the riches and revealing insights I could obtain from linguistics regarding language structure and the complexities of its acquisition, as well as an equally deep appreciation of learning and development as crucial contributors. I also learned the complementary advantages of studying experiments of nature and conducting experiments of our own design. Since all those many years ago, Lila has also been my dear friend. Her own work, and her training of so many of us who were her students, forged a new and distinctive approach to language and its acquisition. For this we all owe her a debt of gratitude—and many thanks to those who had the wisdom to award her the Rumelhart Prize.

1. Introduction

Much of my recent work has focused on what we have called statistical learning: the idea that language learners—infants and young children, and also adult learners in our experiments in the lab—use the statistical information derived from the linguistic distribution of elements in the speech stream to determine such things as what sequences of sound form the morphemes and words of the language, in what syntactic contexts these elements can appear, what grammatical categories they form, and what the phrases and hierarchical sentence structures of the language are (Reeder, Newport, & Aslin, 2013; Saffran, Aslin & Newport, 1996; Saffran, Newport & Aslin 1996; Schuler, Reeder, Newport, & Aslin, 2017; Thompson & Newport, 2007; Wonnacott, Newport, & Tanenhaus, 2008). In contrast to some researchers taking this approach, we do not argue that statistical learning is limited to very simple sequence statistics (e.g., finite state statistics); even infants and young children can acquire quite complex statistical properties of linguistic corpora, those that linguists from every theoretical persuasion think of as the distributional properties that define linguistic structure. What I will focus on in the present paper is a related but somewhat different question: What are the statistics of rules, as compared with probabilistic or inconsistent variation, and how do children and adults use these statistics to acquire constructions in their languages?

There are a number of issues that arise in considering how learners might learn from consistent versus variable or inconsistent properties of languages. First, what is the statistical distribution of forms in the input that characterizes consistent rule-governed forms, as compared with usages that are variable rules of the language or inconsistent usages of their input models? Presumably, different types of consistency and variation have different patterns of occurrence in the corpus, and these may lead learners to acquire them differently. Second, we have seen across many studies, on both natural languages and artificial languages learned in the laboratory, that the age of the learner can make an important difference in how learning proceeds and what types of structures are learned most readily. Young children learn differently—and produce different learning outcomes—than adult learners. While young children base many aspects of their learning on the input statistics, they do not merely reproduce these statistics; rather, they may use them to formulate patterns and rules. In the next section, I will review our studies of children versus adults learning consistent morphology. In the subsequent section, I will review our studies of children versus adults learning inconsistent or variable use of morphology. In the final section I will look more directly at how these results change over age, including older children as well as young children and adults, and consider how learners’ age may be pertinent to understanding how consistent and variable properties of language are acquired.

2. Learning regular morphology

What is the distributional information that learners receive in their input for morphology that is regular and consistently used (e.g., -s marking plural nouns in English or -ed marking past tense in verbs)? First, each lexical item is consistently marked in consistent linguistic contexts; that is, every time a word (e.g., elephant) is used in a particular context (e.g., a context for plural), that word is marked with the same inflection (e.g., -s). Second, many lexical items will show the same pattern; that is, many words in the same form class category (e.g., many nouns) will take the same marking (-s) in the same contexts (plural). This does not necessarily mean that every lexical item in the form class category will follow the same pattern or take the same inflection; individual lexical items may have a unique marking of their own (lexical exceptions, e.g., -ren in children), or small numbers of lexical items may follow a different pattern (e.g., no change, as in deer or sheep, or vowel change, as in goose/geese and tooth/teeth). Again, though, they will be so marked every time they occur in the relevant context.

What is the particular distribution that characterizes productive morphology—that is, morphemes that learners will apply to novel lexical items they have never heard in these contexts before? There are a number of different hypotheses that have been suggested in the literature regarding when a morpheme will be productive. Rumelhart and McClelland’s neural network model (1987) suggests (implicitly) that a morpheme will be productive when a majority of the lexical items—or a majority of the lexical tokens—take that morpheme in the relevant context. Marcus et al. (1995) suggest that a morpheme will be the productive default form when a diverse set of lexical items, differing in the phonological characteristics of their stems, all take this same morpheme in the relevant context (e.g., when lexical items like table, elephant, and car, whose forms cannot be characterized by a single phonological rule, all take -s in the plural). In contrast to these hypotheses, Yang (2016) has suggested a principle of computational efficiency (the Tolerance Principle [TP]) that predicts when a morpheme will become productive: Specifically, when enough lexical items in a class take the same morpheme, it becomes more efficient to store a rule (plus exceptions) for the class than to store each of the lexical forms individually. In our recent studies (Schuler, 2017; Schuler, Yang, & Newport, 2016), we have begun to ask, in a miniature language learning paradigm, whether child and adult learners follow lexical type or token frequency, diversity of phonological forms, or computational efficiency as articulated by the TP.

2.1. The Tolerance Principle

In many languages, there is a cluster of properties that characterize the regular form: a number of lexical items that are regular, often the majority of regular word forms in terms of both types and tokens, and also phonological diversity in the stems that take the regular ending. It is not clear which of these properties is crucial for determining which form will be productive (applying to novel words, foreign words, derived words, or other words for which there is no memorized form); different investigators have argued for each of these properties as most relevant (cf. Baayen, 1992, 1993; Goldberg, 2019; Marcus et al., 1995; McClelland & Patterson, 2002; O’Donnell, 2011).

One account, of particular interest because it is accompanied by a psycholinguistic rationale and also because it has been proposed in a mathematically precise and testable form, is the TP (Yang, 2005, 2016). The TP asserts that a productive rule will be formed when it is computationally more efficient than storing lexical forms individually. Yang has quantified the precise number of exceptions that a productive rule can tolerate before it becomes computationally less efficient than storing all the lexical items, by calculating the time complexity of applying a rule compared with accessing individual lexical forms. To illustrate, imagine that a learner is faced with a potential rule, for example, the English “add -ed to make a verb in the past tense.” The English learner will encounter many items that obey this rule (regular forms) as well as many that do not (irregular forms or exceptions). The learner can do one of two things:

  1. Store all lexical forms individually, then search through them to express the past tense.

  2. Form a productive rule and store the exceptions. To express the past tense, the learner searches the exceptions first. If the target verb is not among these exceptions, the learner applies the rule “add -ed.”

The TP computes the time required for each of these operations and assumes that the learner will adopt the optimal (faster) strategy. The details of this computation can be found in Yang (2016); here we note that, by solving the central equation of the TP for e (the number of exceptions), the TP allows us to compute the precise number of exceptions that a productive rule can tolerate before its formation becomes computationally less efficient than storing all the lexical forms without a rule.

Tolerance Principle (TP):

Let R be a rule that is applicable to N items, of which e are exceptions. R is productive if and only if e ≤ θ_N = N/ln(N).

That is, it is more efficient to form a productive rule when the number of exceptions is less than the number of items in the category divided by the natural log of the number of items in the category. For example, for a category of nine items, the TP sets the cutoff for productivity at θ_9 = 9/ln 9 ≈ 4.096; that is, if there are more than 4.096 exceptions, forming a productive rule will be less efficient than storing all 9 items individually. Learners will therefore form a productive rule if there are 4 or fewer exceptions, but not if there are 5 or more. This also implies that the distinction between forming a productive rule and storing individual lexical items is categorical.
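To make this arithmetic concrete, here is a minimal Python sketch of the TP decision rule as stated above (the function names are ours and purely illustrative): it computes the threshold θ_N = N/ln(N) and declares a rule productive only when the number of exceptions does not exceed that threshold.

```python
import math

def tolerance_threshold(n_items: int) -> float:
    """Tolerance Principle threshold: theta_N = N / ln(N)."""
    return n_items / math.log(n_items)

def rule_is_productive(n_items: int, n_exceptions: int) -> bool:
    """A rule over n_items lexical types with n_exceptions exceptions
    is productive iff e <= N / ln(N)."""
    return n_exceptions <= tolerance_threshold(n_items)

# The nine-noun example from the text:
print(round(tolerance_threshold(9), 3))   # 4.096
print(rule_is_productive(9, 4))           # True: 5 regulars / 4 exceptions -> form a rule
print(rule_is_productive(9, 6))           # False: 3 regulars / 6 exceptions -> no rule
```

With these two cases, the sketch reproduces the categorical cutoff described in the text: four exceptions are tolerated, five are not.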

2.2. Child versus adult learners and acquiring productive rules

Yang has examined a number of child corpora across different languages and found that the formation of many productive or unproductive rules in natural languages is consistent with the TP. However, by this method one can only test patterns that happen to occur in real languages; the number of exceptions is often quite distant from the critical values predicted by the TP. In collaboration with Yang, Katie Schuler and I (Schuler et al., 2016) tested the TP more closely by conducting an artificial language learning experiment, where we could precisely manipulate the number of lexical items that obeyed a rule or were exceptions, as well as their token frequencies. In our first experiment,1 for example, we created two conditions, one in which the TP predicts productive rule formation and another in which it predicts an unproductive pattern, using a “wug” test to assess whether children formed a productive rule (one that applied to novel lexical items).

Participants were 15 children ages 5–8 and 20 adults. We created nine nonsense nouns labeling nine objects (Fig. 1) and a rule R: “To make a noun plural, add -ka.” We used the TP to calculate the number of regulars versus exceptions a productive rule could tolerate for nine nouns (as noted above, 4.096 exceptions). We then created two conditions: one where a productive rule should be formed (five regulars, four exceptions: 5R/4E) and one where a productive rule should not be formed (three regulars, six exceptions: 3R/6E).

Fig. 1.

The nine nonsense objects used in Schuler et al. (2016) and the inflection used on each noun to mark the plural (-ka or various exceptions) in the 5R/4E condition (where five nouns were marked with -ka in the plural) and the 3R/6E condition (where three nouns were marked with -ka in the plural). Noun frequency followed a Zipfian distribution, and in both conditions the most frequent nouns were marked with -ka. -ka was therefore the most frequent inflection in both conditions, but only the 5R/4E condition had enough nouns marked with -ka to form a productive rule according to the Tolerance Principle.

To create our exposure corpus, we assigned to each noun a plural marker that either followed the rule (add -ka) or was an exception (add -po, -tay, -lee, -bae, -muy, or -woo), depending on the condition. Then we created an exposure corpus of sentences, each containing one of the nonsense nouns in the singular or plural and accompanied by an appropriate picture (see Fig. 2a). All sentences began with the same nonsense verb gentif, meaning “there is/are.” Singular sentences were unmarked (“gentif + NOUN”) and paired with one image of the corresponding object. Plural sentences were formed “gentif + NOUN + MARKER” and paired with two, four, or six images of the corresponding object. Noun frequency varied along a Zipfian distribution, with the nouns taking the regular -ka the most frequent in both conditions. (In this Zipfian distribution, the second most frequent noun was presented half as often as the most frequent noun, the third half as often as the second, and so on.) Making the regular form the most frequent ensured that its frequency was high in both conditions.
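For readers who want the design in executable form, the following sketch assembles a toy exposure corpus of this general shape. The noun labels, the halving token counts, the particular exception ending assigned to each noun, and the singular/plural split are all illustrative stand-ins, not the actual Schuler et al. (2016) materials.

```python
import random

# Illustrative nonsense nouns, ordered from most to least frequent
# (placeholder labels, not the actual experimental items).
NOUNS = ["blicket", "dax", "wug", "fep", "toma", "zib", "gorp", "pim", "lut"]
EXCEPTION_MARKERS = ["po", "tay", "lee", "bae", "muy", "woo"]

def zipf_like_counts(n_nouns: int, top_count: int = 32) -> list[int]:
    """Token counts that halve with rank (32, 16, 8, ...), as described in the text."""
    return [max(top_count // (2 ** rank), 1) for rank in range(n_nouns)]

def plural_marker(rank: int, n_regulars: int) -> str:
    """The n_regulars most frequent nouns take -ka; each remaining noun takes a unique exception."""
    return "ka" if rank < n_regulars else EXCEPTION_MARKERS[rank - n_regulars]

def build_exposure(n_regulars: int, seed: int = 0) -> list[str]:
    """Sentences of the form 'gentif NOUN' (singular) or 'gentif NOUN+MARKER' (plural)."""
    rng = random.Random(seed)
    sentences = []
    for rank, (noun, count) in enumerate(zip(NOUNS, zipf_like_counts(len(NOUNS)))):
        for _ in range(count):
            if rng.random() < 0.5:   # illustrative 50/50 singular/plural split
                sentences.append(f"gentif {noun}")
            else:
                sentences.append(f"gentif {noun}{plural_marker(rank, n_regulars)}")
    rng.shuffle(sentences)
    return sentences

corpus_5R4E = build_exposure(n_regulars=5)   # TP predicts a productive -ka rule
corpus_3R6E = build_exposure(n_regulars=3)   # TP predicts no productive rule
```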

Fig. 2.

(a) The exposure phase, in which sentences are presented; each exposure trial contains one nonsense noun in the singular or plural, accompanied by an appropriate picture. (b) The test phase, in which participants are given singular sentence-image pairs containing novel nonsense nouns they had not heard during exposure and are asked to describe an image requiring the plural form.

After exposure, we used a wug test to assess whether children had formed a productive rule (Berko, 1958). Participants were given singular sentence-image pairs containing novel nonsense nouns they had not heard during exposure and were asked to provide the plural form (Fig. 2b).

Participants who formed a productive rule should mark novel plural nouns with -ka close to 100% of the time. In contrast, participants who did not form a productive rule should use -ka substantially less than 100% of the time. Fig. 3 shows the percentage of novel test trials on which -ka was produced by children and by adults during the production test in the 5R/4E and 3R/6E conditions. Children formed a productive rule when the TP predicted they would (5R/4E condition), but not when it predicted they would not (3R/6E condition). Six of seven children produced -ka on 100% of production trials in the 5R/4E condition, whereas only one of eight children did so in the 3R/6E condition.

Fig. 3.

Percentage of the regular inflection -ka applied to novel nouns by children and adults when their exposure contained five regulars/four exceptions compared with three regulars/six exceptions. Dashed line indicates the frequency of the -ka inflection predicted by the Tolerance Principle.

In contrast, adults in the 5R/4E condition marked novel nouns with -ka on 65% of plural trials, significantly below 100% (t = 3.23, p < .01). Adults in the 3R/6E condition marked novel nouns with -ka on 51.7% of test trials. The TP is thus strongly in accord with the performance of children, who exhibited a highly categorical response in their use of -ka, but it is not clearly the right principle for describing the performance of adults. Indeed, children, but not adults, showed a significant difference in the use of -ka between the 5R/4E and 3R/6E conditions (children: t = 4.91, p < .001; adults: t = .89, p = .39).

Our results indicate that children did indeed form a productive rule when the TP predicted that they would and did not extend the rule when the TP predicted that no productive rule should be formed (even though the token frequency of the regular form was still high in this condition). This effect was nearly categorical. The TP thus appears to capture a basic aspect of generalization in rule formation in children.

In contrast, adults produced -ka at the frequency with which they heard it in their input, approximating probability matching (as we found in Hudson Kam & Newport, 2005, 2009). Because the nouns in our artificial language followed a Zipfian distribution, with the -ka nouns the most frequent, the token frequency of -ka was high in both conditions: 75% of the plural exposure sentences in the 5R/4E condition and 58.3% in the 3R/6E condition. Adults matched this frequency in both the 5R/4E (t = .92, p = .19) and 3R/6E conditions (t = .63, p = .27). In contrast, children produced -ka significantly more often than its input frequency in the 5R/4E condition (t = 2.00, p < .05) and significantly less often than its input frequency in the 3R/6E condition (t = 3.40, p < .01). These results suggest that the TP captures something significant about generalization for children.

We have seen probability matching in adult learners in other experiments in our laboratory (Hudson Kam & Newport, 2005, 2009), which we will describe in the next section (below). However, it is important to note that the paradigms in these experiments are quite different. In our TP experiments, variation occurs across lexical items, but all tokens of each lexical item are consistent (e.g., tomber always takes -ka in the plural; mawg always takes -po in the plural). This is like what one finds in natural languages, where some lexical items are regular and others are irregular. In contrast, in our inconsistent input studies, each noun takes -ka on some occasions and -po on other occasions (as when parents are late learners and make inconsistent errors). In both cases, however, as we will see, children form strong, productive rules that apply consistently across words and their tokens, whereas adults match the inconsistencies of forms in their input.

These results suggest that children and adults may use different types of computations and/or storage as they learn a language. Children appear to be learning in relation to the number of lexical items following a pattern, not to the number of tokens following this pattern across lexical items. In these experiments we can see this definitively, because the Zipfian distribution of lexical frequency differentiates number of lexical items versus number of tokens that are regular. In contrast, adults appear to learn in relation to the number of tokens following the pattern, collapsed across lexical items. This contrast raises the possibility that children and adults may differ not necessarily in all the computations they perform during learning, but perhaps in how they bin the data they receive: Children appear to bin tokens separately for each lexical item and compare patterns across lexical items, whereas adults bin together the tokens of all lexical items in a class. In ongoing research we are further examining these computational differences between child and adult learners.
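A small sketch (our own illustration of this binning idea, not a model from these studies) shows how the two binning schemes dissociate on the same toy corpus: a child-like learner counts the lexical types that take -ka and applies the Tolerance Principle, while an adult-like learner pools tokens across lexical items and tracks the resulting proportion.

```python
import math
from collections import Counter

# Toy plural tokens as (noun, marker) pairs, with illustrative halving frequencies
# and the five -ka nouns most frequent (not the actual study counts).
plural_tokens = (
    [("noun1", "ka")] * 32 + [("noun2", "ka")] * 16 + [("noun3", "ka")] * 8 +
    [("noun4", "ka")] * 4  + [("noun5", "ka")] * 2  +
    [("noun6", "po")] * 1  + [("noun7", "tay")] * 1 +
    [("noun8", "lee")] * 1 + [("noun9", "bae")] * 1
)

def childlike_forms_rule(tokens) -> bool:
    """Bin tokens by lexical item, then apply the Tolerance Principle over types."""
    marker_by_noun = dict(tokens)                 # each noun is internally consistent
    n_types = len(marker_by_noun)
    n_exceptions = sum(1 for m in marker_by_noun.values() if m != "ka")
    return n_exceptions <= n_types / math.log(n_types)

def adultlike_ka_rate(tokens) -> float:
    """Pool tokens across lexical items and track the overall proportion of -ka."""
    counts = Counter(marker for _, marker in tokens)
    return counts["ka"] / sum(counts.values())

print(childlike_forms_rule(plural_tokens))          # True: 4 exceptions among 9 types (theta_9 ~ 4.1)
print(round(adultlike_ka_rate(plural_tokens), 2))   # 0.94 with these toy counts (not the study's 75%)
```

The point of the sketch is only that the same exposure yields a categorical decision under type-based binning but a graded proportion under token-based binning.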

3. Learning morphology that is used inconsistently or probabilistically

In the previous section we focused on the statistics or distributional information provided by regular, productive morphology. Now we turn to asking about morphology that is used inconsistently or probabilistically. Our own studies have examined children’s acquisition of morphology that their non-native speaking parents use inconsistently; but variable structures—forms used probabilistically rather than deterministically in a particular context—also occur in many languages and contexts among native speakers as well (Labov, 2001). Variable forms that are used probabilistically and do not have an obligatory context will typically provide distributional information to children acquiring these structures in their native language that is similar to what we see in our experiments.

3.1. Children learning their native language from non-native parents

We have conducted several studies examining the acquisition of American Sign Language (ASL) by young children whose parents are late learners (and therefore non-native users) of the language. In the first of these studies, Jenny Singleton and I (Singleton & Newport, 2004) ran a longitudinal study of a Deaf child we called Simon. Simon's Deaf parents grew up in an oral English environment and first learned ASL as teenagers, but they used ASL as their everyday language, both at home and with their friends.

An important phenomenon in the Deaf signing community is that, like Simon, children learning ASL as their native language from birth are often learning that language from parents who are themselves late learners of the language (Fischer, 1978; Newport, 1981, 1982, 1990). While these parents, like late learners of other languages, may use complex constructions inconsistently and with many errors, their children look like other native users, acquiring these constructions without learning their errors (Ross, 2001; Singleton & Newport, 2004). In hearing communities, such improvements among child language learners might arise from input they receive from native speakers outside the family. However, in the Deaf community, due to the small number of native signers—only 5%–10% of Deaf signers are native users of the language (Schein & Delk, 1974)—many children learn ASL only from their late-learning parents, without any exposure to native signers. Singleton and Newport (2004) and Ross (2001) showed that such children make ASL constructions much more consistent, acquiring their parents' regular usages but not their inconsistent errors.

In these studies we examined children's and their parents' production of ASL verbs of motion, which are morphologically complex and often difficult for late learners to acquire. To elicit these forms, we presented short films of toy people and objects moving in varying paths and manners of motion (e.g., a doll jumping into a hoop, a robot moving past a motorcycle) and asked Deaf signers to describe what happened in each film.

Fig. 4, based on Singleton and Newport (2004), shows the production of morphemes in ASL verbs of motion produced by Simon, compared with those of his late-learning parents, who provided his only ASL input. While Simon's parents produced each of these ASL morphemes correctly only about 70%–75% of the time (and produced inconsistent errors in the other 25%–30%), Simon produced the same morphemes correctly almost 90% of the time, virtually eliminating their inconsistent errors. We described this finding as regularization and suggested that it may be similar to what happens when children acquire young pidgin or early creole languages (sometimes called creolization) (cf. also Fischer, 1978; Newport, 1988, 1999). Simon's parents inflected each lexical item inconsistently: In a particular morphological context, one morpheme (usually the correct ASL morpheme) was used most frequently, with multiple other forms used at lower frequencies. Moreover, his mother and father typically used the same dominant form but different minority forms. These characteristics may be the perfect circumstances for regularization by child learners. Indeed, Simon did not merely regularize these forms for a brief developmental moment; he was still regularizing his parents' morphological forms at the oldest age at which we observed him (age 9). In a community in which many native learners experience this type of acquisition process, regularization may permanently change the language.

Fig. 4.

Elicited production of movement morphemes in American Sign Language (ASL) verbs of motion produced by Simon at age 9, compared with his parents, who first learned ASL in their teens and provided Simon’s only ASL input; and also compared with children of the same age who learned ASL from native signing parents (from Singleton & Newport, 2004).

3.2. Children, adults, and inconsistent input

Hudson Kam and Newport (2005, 2009) brought this phenomenon into the laboratory in order to understand the process by which children accomplished this regularization. We created miniature languages in which most properties were very regular, but one construction—“determiners” (ka or po) that followed nonsense nouns—was used inconsistently. The amount of inconsistency was varied across experimental conditions. After five to eight sessions of watching films of puppets interacting and hearing the accompanying sentences, children and adults were asked to produce novel utterances to describe novel interactions among the puppets. The production of ka and po by the child and adult learners was the focus of the study. Austin, Furlong, Schuler, and Newport (in preparation) replicated these experiments with adult and child learners of different ages and with modifications to make the languages easier to learn. In all these experiments, we found that adults closely reproduced the inconsistencies of their linguistic input, but young children (ages 5–6) produced only the most regular and consistently used forms.

In Austin et al., participants saw short film clips of two puppets interacting, each clip accompanied by a spoken sentence in a nonsense language with VSO word order and the determiners ka or po appearing after the nouns. Fig. 5 shows an example of the input items.

Fig. 5.

An example film presented along with a spoken artificial language sentence meaning “The bee rams the giraffe” in Austin et al. (in preparation).

In our first experiment in Austin et al., ka appeared after nouns 67% of the time in the input corpus and po appeared after nouns 33% of the time. No characteristic of the sentences predicted when ka versus po would appear; they simply varied, with 67% ka and 33% po for every noun and in every sentence position in the language. At the end of exposure we presented novel films (the same puppets and the same actions, but in novel combinations) and asked learners to describe them. Fig. 6 shows the results from this experiment. Adults reproduced the 67/33 variation with amazing precision: They produced ka on 67.4% of their nouns, no different from the frequency of ka in their input (t(8) = 0.20, p = .85). Young children (ages 5–6), in contrast, produced ka 90% of the time and po only 10%, quite different from the probabilities with which these forms occurred in their input (t(8) = 4.11, p = .003).

Fig. 6.

Adults versus children ages 5–6: production of ka versus po in Inconsistent 67/33 Condition (from Austin et al., in preparation).

These results make a number of important points. First, it is important to point out that this is a type of statistical learning: Inconsistent variation in these studies follows a set of controlled statistical probabilities, and adult learners indeed reproduced the precise statistics of variation. Children also followed the statistics of their input, but only in the sense that they learned best the form that occurred more frequently or more consistently. However, in contrast to the adults, children reproduced this more consistent form almost all the time—turning probabilistic variation into something more like a rule.
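As a purely illustrative toy simulation of these two outcome patterns (our own sketch, not the learning model behind the studies; the 90% regularization rate is simply set to match the children's reported productions), an adult-like producer samples each determiner at its input probability, while a child-like producer almost always uses the dominant form:

```python
import random

def adultlike_ka_rate(p_ka: float, n_trials: int = 1000, seed: int = 1) -> float:
    """Probability matching: produce ka on each trial with its input probability."""
    rng = random.Random(seed)
    return sum(rng.random() < p_ka for _ in range(n_trials)) / n_trials

def childlike_ka_rate(p_ka: float, n_trials: int = 1000,
                      regularization: float = 0.90, seed: int = 1) -> float:
    """Regularization: produce the majority determiner on ~90% of trials (illustrative rate)."""
    rng = random.Random(seed)
    majority_is_ka = p_ka >= 0.5
    majority_count = sum(rng.random() < regularization for _ in range(n_trials))
    return (majority_count if majority_is_ka else n_trials - majority_count) / n_trials

print(adultlike_ka_rate(0.67))   # ~0.67, like the adults' 67.4% ka
print(childlike_ka_rate(0.67))   # ~0.90, like the children's ~90% ka
```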

Some additional findings are relevant to understanding this regularization phenomenon. In most of our experiments on inconsistent input, we have examined children’s judgments of sentences including various determiner forms, as well as their productions. Our judgment data suggest that children are aware of the minority forms and rate sentences containing them as significantly “better” than sentences that contain no determiners or contain a novel determiner they were not exposed to—even though they rarely produce these minority forms. However, we believe the dramatically greater production of ka, the more consistent form, is not merely a production bias but rather is due to its more robust learning (reflected in production because it is a difficult task). We have also shown that children are capable of learning and producing forms that simply occur with low frequency: They produce both high-frequency ka and low-frequency po when each is used consistently, in a predictable and obligatory context. This contrasts with their behavior when each is used unpredictably and inconsistently, where they strongly favor producing the more consistent form. This is apparently what Simon does in natural language learning, and perhaps what children exposed to young languages or inconsistent language communities do as well.

4. Summary so far

Thus far we have seen that when morphology is used consistently in the linguistic input that learners receive, young children use the same forms either fully productively or not at all—that is, they make a fairly categorical contrast between using the form as a regular morpheme or not using it much at all. This regularization behavior suggests that, when the input justifies it and in accord with Yang’s predictions from the TP, children form implicit rules that are applied in a fairly deterministic way. Most surprising, perhaps, is that they show the same rule-like and deterministic behavior even when the relevant grammatical forms are used inconsistently in their linguistic input. In contrast, across several different types of experiments, adult learners match the frequency with which an inflection is used. When a form is used inconsistently or probabilistically, without a conditioning context, adult learners reproduce the statistical inconsistencies or probabilities of that form in the input. This probability matching behavior is not characteristic of young child learners, either in our naturalistic data or in our laboratory experiments.

But these outcomes pose a problem. The speech (or signing) of late learners is indeed riddled with inconsistent usages that one does not find in native speakers of a language, and our results on children’s acquisition provide an account of this contrast. However, the literature on sociolinguistic variation shows that natural languages do contain many forms that are used variably (Labov, 2001). For example, in King of Prussia, Pennsylvania (and many other speech communities), the presence or absence of -t/d at the end of a word varies (e.g., old is pronounced sometimes as old and sometimes as ol’). Remarkably, the probability of -t/d deletion varies fairly systematically across different contexts of use (e.g., whether the syllable is stressed or unstressed; whether the word is one morpheme or a multi-morpheme inflected form), and the gradient in probabilities is stable across the speech community and across multiple generations of speakers (Guy, 1980; Labov, 1989). But our results reviewed thus far suggest that young children virtually always form consistent or categorical rules and do not produce variable or probabilistic usages. If this is correct, then how and when are the variable rules of natural languages acquired?

One possibility—since of course children do grow up!—is that variable or probabilistic rules (and perhaps particularly those that are systematic in their variation) are acquired by older children. In fact, in several of our miniature language studies we have tested older children as well as younger children and adults, and indeed older children do look somewhat more like adults, acquiring probabilistically variable forms in a way that is not characteristic of their younger counterparts. Fig. 7 shows the results for 5- to 6-year-old children, 7- to 8-year-old children, and adults exposed to a language in which the determiners were used inconsistently, 60% ka and 40% po. The young children and adults look like what we have already seen: Young children regularize, producing ka over 80% of the time, whereas adults probability match, producing ka about 60% of the time, almost precisely the input frequency. Importantly, the 7- to 8-year-old children are in between these groups, producing ka on about 70% of the test items.

Fig. 7.

Adults, children ages 7–8, and children ages 5–6: Production of ka versus po in Inconsistent 60/40 Condition (from Austin et al., in preparation).

What does this change with age mean for the acquisition of regular versus variable forms? One possibility is that children acquire these types of constructions in sequence as they get older: first learning forms as regular, consistent rules, and later acquiring the probabilistic or variable aspects of their usage (see also Guy & Boyd, 1990, and Song, Shattuck-Hufnagel, & Demuth, 2015, for evidence). A second possibility—though it seems less likely—is that different individuals, who learn the language at different ages, differ from one another in how they learn and use such forms. In either case, languages and language communities will ultimately include both consistent and variable forms and constructions. Our miniature language studies are very brief and do not follow learners as they mature; longitudinal studies of young children will be required to clarify this process. There is a literature on children acquiring variable rules in their native languages (mostly variable phonological rules), but the results of these studies are not highly consistent: In some, children match the usages of their parents; in others, they regularize. Further research is needed to resolve these discrepancies.

5. Conclusions

In many studies we have demonstrated that statistical learning is an important part of language acquisition; human learners are remarkably sensitive to a variety of statistics that reflect linguistic distributional information and can use these statistics to acquire the structures of natural languages. But our focus here has been on an important aspect of this process: Statistical learning is not just a matter of acquiring and reproducing whatever is in the input.

First, different types of statistical distributions in linguistic input—including those that differ only quantitatively and seem like small differences, such as the precise number of lexical items that take a highly frequent morpheme—can lead to qualitatively different learning outcomes. In examining the learning of consistently used morphological forms (such as -ed on English verbs or -ka on nouns in our “TP” miniature languages), we saw that forms used highly frequently in the input can be acquired either as regular, productive rules or as unproductive lexical exceptions, depending on specific details of the way the form is distributed across lexical items in the linguistic input. When morphological forms are used inconsistently and probabilistically across all lexical items (as in the input provided by late learner parents or in our “inconsistent input” miniature languages), there are several different ways in which learners acquire these forms, depending on the type of learner in question.

Second, learners of different ages can be exposed to the same linguistic input and yet learn in very different ways.2 When exposed to regular and consistently used morphological forms, young children learn these forms categorically, either forming a rule and using the form productively or not forming a rule and using the form only rarely, restricted to the lexical items on which they heard it. For young children, this appears to be a sharp and surprisingly categorical contrast, even when the forms in question are all used with high token frequency. In contrast, adult learners exposed to precisely the same input will match the probabilities or token frequencies with which the forms are used; they do not form categorical rules in the way that young children do. These tendencies change gradually over age, with older children performing somewhere between young children and adults.

We believe that these phenomena regarding biases in learning and changes in these biases over a learner's age, combined with the statistics or distribution of forms in linguistic input, are an important part of the processes that shape the languages of the world.3 Our work suggests that young children may be biased toward forming categorical rules but that, as they grow up, they may more readily acquire probabilistic variation. We believe that continued research using miniature languages can help to provide insights into the processes and the developmental changes that occur in natural language acquisition.

Acknowledgments

I am grateful to all my collaborators on the studies I have reviewed here, and especially to Katie Schuler, who has been my collaborator on many of the studies described in this paper. Thanks also to Jenny Culbertson and Barbara Landau for helpful comments on this paper. This research was supported in part by NIH grants R01 HD037082, R01 DC000167, and K18 DC014558, by the Feldstein Veron Innovation Fund, and by the Center for Brain Plasticity and Recovery at Georgetown University.

Footnotes

1.

See Schuler (2017) and K. D. Schuler, C. Yang, & E. L. Newport (unpublished data) for additional TP experiments.

2.

As noted above, one possibility in our TP outcomes is that learners of different ages may simply bin the linguistic data differently—for example, children might be keeping track of the use of a morpheme separately for each lexical item (and then forming a rule or not depending on the number of lexical items that take that morpheme), whereas adults are pooling the data about morpheme use across lexical items in the same class. This rather small difference in the way linguistic input is stored can make a qualitative difference in the outcomes (forming rules vs. matching usage probabilities).

3.

We are sometimes asked about the degree to which children may shape languages, given that they of course grow up to become adults, who do not regularize the way children do. While this is an important topic for another paper, a few comments here may be helpful. First, adults do often show modest biases that are similar to the more dramatic biases of children (Culbertson, Smolensky, & Legendre, 2012; Fedzechkina, Jaeger, & Newport, 2012); some have argued that these adult biases, amplified over generations, may be responsible for language change (cf. Smith & Wonnacott, 2010). Languages are surely shaped by many forces, and the strong biases we see in children in the lab may be modulated and reduced in real languages. However, it is important to note that, in most linguistic communities, native speakers (that is, those who learn the language in childhood) outnumber adult learners, and most learning in native speakers occurs during their early years, not equally or predominantly during adulthood. We therefore do think that children's biases, though possibly weaker than what we see in the lab, may be the more important force in shaping languages. This is especially likely in creolizing communities, where the irregular usages of adults may make children's biases quite strong.

References

  1. Baayen RH (1992). Quantitative aspects of morphological productivity. In Booij GE & van Marle J (Eds.), Yearbook of Morphology 1991 (pp. 109–149). Dordrecht: Kluwer Academic Publishers.
  2. Baayen RH (1993). On frequency, transparency, and productivity. In Booij GE & van Marle J (Eds.), Yearbook of Morphology 1992 (pp. 181–208). Dordrecht: Kluwer Academic Publishers.
  3. Berko J (1958). The child’s learning of English morphology. Word, 14, 150–177.
  4. Culbertson J, Smolensky P, & Legendre G (2012). Learning biases predict a word order universal. Cognition, 122, 306–329.
  5. Fedzechkina M, Jaeger TF, & Newport EL (2012). Language learners restructure their input to facilitate efficient communication. Proceedings of the National Academy of Sciences, 109, 17897–17902.
  6. Fischer SD (1978). Sign language and creoles. In Siple P (Ed.), Understanding language through sign language research (pp. 309–331). New York: Academic Press.
  7. Goldberg AE (2019). Explain me this. Princeton, NJ: Princeton University Press.
  8. Guy G (1980). Variation in the group and the individual: The case of final stop deletion. In Labov W (Ed.), Locating language in time and space (pp. 1–36). New York: Academic Press.
  9. Guy G, & Boyd S (1990). The development of a morphological class. Language Variation and Change, 2, 1–18.
  10. Hudson Kam C, & Newport EL (2005). Regularizing unpredictable variation. Language Learning and Development, 1, 151–195.
  11. Hudson Kam C, & Newport EL (2009). Getting it right by getting it wrong: When learners change languages. Cognitive Psychology, 59, 30–66.
  12. Labov W (1989). The child as linguistic historian. Language Variation and Change, 1, 85–97.
  13. Labov W (2001). Principles of linguistic change, vol. 2: Social factors. Oxford, UK: Blackwell.
  14. Marcus GF, Brinkmann U, Clahsen H, Wiese R, & Pinker S (1995). German inflection: The exception that proves the rule. Cognitive Psychology, 29, 189–256.
  15. McClelland J, & Patterson K (2002). Rules or connections in past-tense inflections: What does the evidence rule out? Trends in Cognitive Sciences, 6, 465–472.
  16. Newport EL (1981). Constraints on structure: Evidence from American Sign Language and language learning. In Collins WA (Ed.), Aspects of the development of competence: Minnesota Symposium on Child Psychology (pp. 93–124). Hillsdale, NJ: Lawrence Erlbaum Associates.
  17. Newport EL (1982). Task-specificity in language learning? Evidence from speech perception and American Sign Language. In Wanner E & Gleitman LR (Eds.), Language acquisition: The state of the art (pp. 450–486). New York: Cambridge University Press.
  18. Newport EL (1988). Constraints on learning and their role in language acquisition. Language Sciences, 10, 147–172.
  19. Newport EL (1990). Maturational constraints on language learning. Cognitive Science, 14, 11–28.
  20. Newport EL (1999). Reduced input in the acquisition of signed languages: Contributions to the study of creolization. In DeGraff M (Ed.), Language creation and language change: Creolization, diachrony, and development (pp. 161–178). Cambridge, MA: MIT Press.
  21. O’Donnell T (2011). Productivity and reuse in language. Unpublished doctoral dissertation, Harvard University. (Published as a book by MIT Press, 2015.)
  22. Reeder PA, Newport EL, & Aslin RN (2013). From shared contexts to syntactic categories: The role of distributional information in learning linguistic form-classes. Cognitive Psychology, 66, 30–54.
  23. Ross DS (2001). Disentangling the nature-nurture interaction in the language acquisition process: Evidence from deaf children of hearing parents exposed to non-native input. Unpublished doctoral dissertation, Department of Brain & Cognitive Sciences, University of Rochester.
  24. Rumelhart DE, & McClelland JL (1987). Learning the past tenses of English verbs: Implicit rules or parallel distributed processing. In MacWhinney B (Ed.), Mechanisms of language acquisition (pp. 195–248). Hillsdale, NJ: Lawrence Erlbaum.
  25. Saffran JR, Aslin RN, & Newport EL (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
  26. Saffran JR, Newport EL, & Aslin RN (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606–621.
  27. Schein J, & Delk M (1974). The deaf population of the United States. Silver Spring, MD: National Association of the Deaf.
  28. Schuler KD (2017). The acquisition of productive rules in child and adult language learners. Unpublished doctoral dissertation, Interdepartmental Program in Neuroscience, Georgetown University.
  29. Schuler KD, Reeder PA, Newport EL, & Aslin RN (2017). The effect of Zipfian frequency variations on category formation in adult artificial language learning. Language Learning and Development, 13, 357–374.
  30. Schuler KD, Yang C, & Newport EL (2016). Testing the Tolerance Principle: Children form productive rules when it is computationally more efficient to do so. In Papafragou A, Grodner D, Mirman D, & Trueswell JC (Eds.), Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 2321–2326). Austin, TX: Cognitive Science Society.
  31. Singleton JL, & Newport EL (2004). When learners surpass their models: The acquisition of American Sign Language from inconsistent input. Cognitive Psychology, 49, 370–407.
  32. Smith K, & Wonnacott E (2010). Eliminating unpredictable variation through iterated learning. Cognition, 116, 444–449.
  33. Song JY, Shattuck-Hufnagel S, & Demuth K (2015). Development of phonetic variants (allophones) in 2-year-olds learning American English: A study of alveolar stop /t, d/ codas. Journal of Phonetics, 52, 152–169.
  34. Thompson SP, & Newport EL (2007). Statistical learning of syntax: The role of transitional probability. Language Learning and Development, 3, 1–42.
  35. Wonnacott E, Newport EL, & Tanenhaus MK (2008). Acquiring and processing verb argument structure: Distributional learning in a miniature language. Cognitive Psychology, 56, 165–209.
  36. Yang C (2005). On productivity. Linguistic Variation Yearbook, 5, 265–302.
  37. Yang C (2016). The price of productivity. Cambridge, MA: MIT Press.
