Abstract
Since Chomsky’s pioneering work on syntactic structures, comparative psychologists interested in the study of language evolution have targeted pattern complexity, using formal mathematical grammars, as the key to organizing language-relevant cognitive processes across species. This focus on formal syntactic complexity, however, often disregards the close interaction in real-world signals between the structure of a pattern and its constituent elements. Whether such features of natural auditory signals shape pattern generalization is unknown. In the present paper, we train birds to recognize differently patterned strings of natural signals (song motifs). Instead of focusing on the complexity of the overtly reinforced patterns, we ask how the perceptual groupings of pattern elements influence the generalization of pattern knowledge. We find that learning and perception of training patterns are agnostic to the perceptual features of underlying elements. Surprisingly, however, these same features constrain the generalization of pattern knowledge, and thus its broader use. Our results demonstrate that the restricted focus of comparative language research on formal models of syntactic complexity is, at best, insufficient to understand pattern use.
Keywords: pattern recognition, artificial grammar learning, songbird, auditory processing, rule learning
Introduction
Theories of syntax are central to understanding language. As a result, many comparative psychologists interested in the study of language evolution have targeted pattern complexity, using formal mathematical grammars, as the key to organizing language-relevant cognitive processes across species (Berwick, Okanoya, Beckers, & Bolhuis, 2011; Berwick, Beckers, Okanoya, & Bolhuis, 2012). In identifying similarities and differences between human and nonhuman animals in relation to pattern recognition and production, some researchers have concluded that comparative models are inadequate because “human language sentences are potentially unbounded in length and structure, limited only by extraneous factors, such as short-term memory or lung capacity” (Berwick, Okanoya, Beckers, & Bolhuis, 2011). This focus on formal syntactic complexity, however, disregards the close interaction in real-world signals between the structure of a pattern and its constituent elements as well as core biological and cognitive constraints intrinsic to temporal processing and, therefore, language. Others have argued that comparative studies are essential to the study of language precisely because they showcase how biological and cognitive mechanisms interact with dynamic real-world signals to tune pattern perception mechanisms crucial to aspects of language (Margoliash & Nusbaum, 2009; Kiggins, Comins, & Gentner, 2012). The latter perspective proposes to study language and its evolution in the context of the principles of organismal biology (Margoliash & Nusbaum, 2009), whereas the former posits these questions in the domain of mathematical formalisms specifically unburdened by such restrictions (Berwick, Okanoya, Beckers, & Bolhuis, 2011; Berwick, Beckers, Okanoya, & Bolhuis, 2012).
What role does syntactic pattern complexity play in language origin? Is the human capacity to perceive mildly context-sensitive language constructions, such as “John Mary Peter Jane lets help teach swim” (see Berwick, Okanoya, Beckers, & Bolhuis, 2011), or even context-free language constructions, like “the starling the cats want was tired” (again, Berwick, Okanoya, Beckers, & Bolhuis, 2011), what divides a nervous system that can support language from one that cannot? Or are these grades of temporal pattern complexity extraneous to the minimally sufficient requirements for language? Regardless of the formal computational models that might be instantiated to perceive a given pattern, the broader utility of any pattern is constrained by its generalizability. That is, can knowledge of a pattern gained in one context be employed to recognize the same pattern in other contexts?
To explore this idea, we examined how perceptual constraints shape pattern generalization in European starlings, Sturnus vulgaris, a species of songbird. Starlings are expert auditory sequence learners both in the laboratory and during natural song development. Acoustically, starling song appears as a patterned sequence of distinct motifs, where each 200–1000 ms long motif comprises multiple shorter notes presented in a stereotyped pattern. Although motif repertoires between starlings are largely unique, all motifs can be grouped into four general classes by their broad acoustic characteristics (Eens, Pinxten, & Verheyen, 1988). Perceptually, these classes form open-ended natural categories for starlings (Braaten, 2000), and individual motifs within categories can be readily differentiated (Meliza, 2011). Starlings attend to the temporal patterning of notes within motifs (Gentner, 2008) and to the temporal patterning of motifs in songs (Gentner & Hulse, 1998). Further, starlings accurately classify and generalize artificial motif patterns that take the forms AABB and ABAB (Gentner, Fenn, Margoliash, & Nusbaum, 2006), where sets A and B represent two of four natural motif categories (rattles and warbles) in starling song.
Methods
Subjects
Eight (N=8) European starlings (Sturnus vulgaris) wild-caught in southern California served as subjects. We controlled for neither age nor sex, though all subjects had full adult plumage when acquired and, thus, were at least one year old. From the time of capture until use in this study, subjects were housed in a large mixed sex, conspecific aviary with ad libitum access to food and water. The photoperiod in the aviary and the testing chambers followed the seasonal variation in local sunrise and sunset times.
Stimuli
Patterned stimuli were made using motifs from the song recordings of one adult male starling captured near Baltimore, Maryland (for details, see Gentner & Hulse, 1998). Motifs can be broadly classified into four classes based on their spectrotemporal features: whistles, warbles, rattles and high frequency events (Adret-Hausberger & Jenkins, 1988). We extracted eight unique (non-repetitious) warble and rattle motifs from the songs of one male. Both warbles and rattles possess a variable acoustic structure, with rattles further characterized by a broadband click train. The rattle and warble motif categories follow ethologically-relevant and psychologically-salient boundaries (Braaten, 2000; Eens, Pinxten, & Verheyen, 1988). Using these two classes of motifs, we built two patterns of motif sequences of the form XXYY or XYXY. While the pattern constituents and groupings (i.e., motifs and motif classes) are ethologically-relevant, the overall pattern forms XXYY/XYXY are not.
To construct XXYY and XYXY patterned stimuli, we divided the motifs into two sets, labeled A and B. For half of the subjects (n = 4), motif membership in set A and B preserved natural acoustic (perceptual) category boundaries (i.e., 8 warbles for set A; 8 rattles for set B; see figure 1). For the other half of subjects (n = 4) motif membership in sets A and B conflicted with the natural perceptual boundaries (i.e., 4 warbles, 4 rattles for A; 4 warbles, 4 rattles for B; see figure 2C and 2D).
Subjects were trained using 32 (out of a possible 16,384) XXYY and XYXY patterned stimuli (8 of the explicit form AABB, 8 BBAA, 8 ABAB and 8 BABA). Within each subset of 8 explicit training patterns (e.g., AABB), each motif appeared exactly once in each possible position, and no motif appeared twice in the same sequence. No differences were observed between the mean durations of warbles and rattles (independent samples t-test: t = 0.98; df = 14; p = 0.34). Nor did the durations of XXYY and XYXY patterns differ, either in the group where the A/B motif sets preserved the natural perceptual boundaries (independent samples t-test: t = 0.0; df = 30; p = 1.00) or in the group where the sets were pseudo-randomized (independent samples t-test: t = 0.0; df = 30; p = 1.00). Thus, stimulus length could not be used to recognize the patterns.
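The figure of 16,384 possible stimuli is consistent with four explicit forms, each filling four positions from an 8-motif set with repeats allowed (4 × 8^4). The following Python sketch checks that arithmetic and shows one way to build 8 sequences per form that satisfy the stated constraints. The motif labels and the cyclic-shift construction are illustrative assumptions; the text does not say how the 32 training sequences were actually chosen.

```python
# Hypothetical labels for the 8 motifs in each set (not from the source).
SET_A = [f"a{i}" for i in range(8)]
SET_B = [f"b{i}" for i in range(8)]
FORMS = ["AABB", "BBAA", "ABAB", "BABA"]


def count_possible(forms=FORMS, n_a=len(SET_A), n_b=len(SET_B)):
    """Count every sequence of every form, allowing motifs to repeat."""
    total = 0
    for form in forms:
        n = 1
        for slot in form:
            n *= n_a if slot == "A" else n_b
        total += n
    return total


def build_training_set(form, set_a=SET_A, set_b=SET_B):
    """Build 8 sequences of one form by cyclic shifts (assumed construction).

    In sequence k, the first slot of a set gets motif k and the second slot
    gets motif k + 1 (mod 8). Across the 8 sequences every motif therefore
    fills each of its set's positions exactly once, and no sequence
    contains the same motif twice.
    """
    sequences = []
    for k in range(8):
        used = {"A": 0, "B": 0}  # how many slots of each set are filled so far
        seq = []
        for slot in form:
            pool = set_a if slot == "A" else set_b
            seq.append(pool[(k + used[slot]) % len(pool)])
            used[slot] += 1
        sequences.append(tuple(seq))
    return sequences


training = [seq for form in FORMS for seq in build_training_set(form)]
```

Here `count_possible()` reproduces the 16,384 total and `training` holds 32 sequences, 8 per explicit form.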
Apparatus
Figure 2A illustrates the operant apparatus where starlings learned to classify the training stimuli. Each subject was held in a small weld-wire cage containing an operant panel. On the panel, a centrally located response port was a PVC-housed opening. Inside of this opening was an IR receiver and transmitter used to detect when the bird broke the plane of the response port with its beak. This ‘poke-hole’ design allows starlings to probe the apparatus with their beaks, in a manner akin to their natural appetitive foraging behavior. A remotely controlled hopper, positioned behind the panel, moved the food within and beyond the subject’s reach beneath the opening. Acoustic stimuli were delivered through a small full-range audio speaker mounted behind the panel and out of the subject’s view. The sound pressure level inside all chambers was calibrated to the same standard broadband noise signal. Full details for all the mechanical components of the apparatus, audio interface, digital I/O control hardware, and custom software are available upon request.
Shaping
Upon initially entering the operant chamber, each subject was given unrestricted access to the food hopper. Following acclimation to eating from this device, the hopper was lowered beyond the subject’s reach. Next, subjects were placed on an autoshaping routine (Brown & Jenkins, 1968). Several times per minute, we presented the subject with a blinking LED in the center response port followed by 2-s access to the food hopper. This process recurred until the subject acquired a key-peck response. From then on, the lowered food hopper was raised only if the subject pecked the blinking LED in the center port. Subjects repeated this behavior for 100 trials. After the completion of this phase, the center LED ceased blinking, requiring subjects to peck at the darkened center port to raise the food hopper. Following 100 such trials, pecking the center port initiated the playback of an acoustic stimulus, and subjects earned food rewards in accordance with the standard go/no-go training procedures described below.
Classification training
Figure 2B demonstrates the operant procedure. We trained subjects to classify sets of starling song stimuli abiding by two different patterning rules. To initiate a trial, subjects pecked a darkened center port, which triggered the playback of a training stimulus. Once stimulus presentation concluded, the animal could either peck the darkened center port once more (termed a ‘go’ response) or not (termed a ‘no-go’ response). A peck in response to one set of stimuli (S+ training set) was positively reinforced with 2-s access to the food hopper. A peck in response to the other set of training stimuli (S− training set) was punished by extinguishing the operant box light for at least 10 seconds, thereby prohibiting food hopper access. An absence of a response to either the S+ or S− stimuli yielded no operant consequence. Correction trials were available only during training phases. For these trials, a key-peck response within the response window following an S− stimulus not only extinguished the house lights as usual, but also caused the program to continue presenting the same S− stimulus until the subject withheld a key-peck response. Correction trials were discontinued once the animal achieved our criterion for successful classification performance during the initial training phase (criterion: 3 successive d′ ≥ 1.0). The stimulus exemplar presented on any given non-correction trial was sampled randomly with replacement from the pool of all stimuli the animal was learning to classify. To evaluate classification accuracy, we labeled a response to an S+ stimulus and the withholding of a response to an S− stimulus as correct. Conversely, withholding a response to an S+ stimulus and responding to an S− stimulus were labeled incorrect.
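The go/no-go contingencies described above can be summarized as a small decision function. This is a minimal Python sketch for illustration only; the function name and the consequence labels are hypothetical and do not describe the authors' custom control software.

```python
def score_trial(stimulus_class, pecked):
    """Return (consequence, correct) for one non-correction go/no-go trial.

    stimulus_class: "S+" or "S-".
    pecked: True if the bird pecked within the response window.
    The consequence labels are hypothetical names for the outcomes in the text.
    """
    if stimulus_class not in ("S+", "S-"):
        raise ValueError("stimulus_class must be 'S+' or 'S-'")
    if pecked:
        if stimulus_class == "S+":
            return "food", True        # go to S+: reinforced, correct
        return "lights_out", False     # go to S-: punished, incorrect
    # Withholding a response has no operant consequence for either class;
    # it counts as correct only for S- stimuli.
    return "none", stimulus_class == "S-"
```

For example, `score_trial("S-", True)` returns `("lights_out", False)`, the case that would also trigger a correction trial during training.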
Subjects could freely peck at the center response port throughout stimulus presentation, but only the first response within a 2-s response window beginning at stimulus offset triggered reinforcement or punishment. Subjects were on a closed economy during training, with daily sessions lasting from sunrise to sunset, and each subject could run as few or as many trials as they were able. Food intake was monitored daily to ensure each subject’s well-being. The inter-trial interval was a minimum of 2 s. Water was always available. All procedures were approved by the UCSD institutional animal care and use committee.
Procedure
To quantify learning during training we used a performance criterion of 3 consecutive 100-trial blocks where d′ scores exceeded 1.0. Subjects remained on training stimuli after reaching this learning threshold to ensure stable behavioral performance prior to generalization testing. We assessed the subjects’ ability to generalize knowledge of the XXYY and XYXY patterns presented during initial training using an abrupt and complete transfer to 500 novel motif sequences that followed the training patterns. Each of the 500 test sequences was built using the training motifs in sets A and B assigned to that subject, and was presented only once. The entire generalization test was exactly 500 trials. During transfer, we maintained all reinforcement contingencies as in training, except that no correction trials were used.
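The learning criterion can be expressed as a search for the first run of three consecutive qualifying blocks. The Python sketch below is illustrative; the function name is hypothetical, and it uses a strict inequality (d′ > 1.0) to follow the wording "exceeded 1.0", which is an assumption given that the criterion is also written as d′ ≥ 1.0 in the training description.

```python
def blocks_to_criterion(block_dprimes, threshold=1.0, run_length=3):
    """Return the 1-based index of the block that completes the first run of
    `run_length` consecutive blocks with d' above `threshold`, or None if
    the criterion is never met.
    """
    run = 0
    for index, d in enumerate(block_dprimes, start=1):
        run = run + 1 if d > threshold else 0  # any sub-threshold block resets the run
        if run == run_length:
            return index
    return None
```

Applied to a subject's sequence of per-block d′ values, the returned index corresponds to the acquisition rate measured in 100-trial blocks.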
We used d-prime (d′) to measure bias-free sensitivity of baseline and test pattern classification as follows: d′ = z(H − ε) − z(F + ε), where H and F are the proportions of responses to S+ and S− stimuli, respectively, z(·) is the corresponding z-score, and ε = 0.001 is an adjustment that keeps the response rates within the bounds (0, 1). To compare generalization performance between the two groups, we normalized d′ by dividing every subject’s d′ on each of their five 100-trial transfer blocks by their mean performance across the five 100-trial blocks preceding transfer.
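As a worked illustration of this measure, the Python sketch below computes d′ from a hit rate and a false-alarm rate and then normalizes transfer blocks by the pre-transfer mean. The function names are hypothetical, and the treatment of the edge cases H = 0 and F = 1 (where the adjusted rates leave the open interval) is an assumption, since the text does not say how they were handled.

```python
from statistics import NormalDist, mean

EPSILON = 0.001  # adjustment factor stated in the text


def d_prime(hit_rate, false_alarm_rate, eps=EPSILON):
    """d' = z(H - eps) - z(F + eps), with z the inverse standard normal CDF."""
    h = hit_rate - eps
    f = false_alarm_rate + eps
    if not (0.0 < h < 1.0 and 0.0 < f < 1.0):
        # Assumption: the formula as written is undefined here, so raise
        # rather than guess at a correction.
        raise ValueError("adjusted rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf
    return z(h) - z(f)


def normalize_transfer(transfer_dprimes, baseline_dprimes):
    """Divide each transfer-block d' by the mean d' of the baseline blocks."""
    baseline = mean(baseline_dprimes)
    return [d / baseline for d in transfer_dprimes]
```

With this adjustment a perfect block (H = 1, F = 0) yields a finite d′ of about 6.18 instead of diverging, and a normalized value of 1.0 means that transfer performance matched the pre-transfer baseline.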
Results
Initial pattern learning was similar for both groups. We observed no significant differences between groups in acquisition rate (number of 100-trial blocks until d′ > 1.0 for 3 consecutive blocks; t = −2.14, df = 6, p = 0.076; figure 3A). More importantly, all subjects learned to classify the training patterns with accuracy significantly above chance (p < 0.02 in all cases, t-test per bird for the last five 100-trial blocks of training, where chance d′ = 0; figure 3B).
Surprisingly, pattern generalization differed qualitatively between groups. Subjects that learned patterns of categorically-organized motifs maintained accurate classification across the 500 novel patterns (p < 0.05 all cases, t-test per bird relative to chance; figure 3C). In contrast, none of the subjects that learned patterns with pseudo-randomly grouped motifs performed above chance during the transfer (t-test for each bird; in three cases: p > 0.3; one subject below chance; p = 0.044; figure 3C). Consistent with these differences, normalized generalization performance (see methods) for subjects trained with the categorically organized motifs was significantly better than that for subjects trained with the pseudo-randomly organized motifs (F(1,6) = 13.67, p = 0.01; nested rmANOVA; figure 4).
Discussion
The results of this simple experiment are clear. Reshuffling pattern elements across the boundaries of natural acoustic categories leaves pattern learning intact, but fully blocks pattern generalization. While pattern learning and perception in a restricted context (i.e., explicit training sequences) appear agnostic to the perceptual grouping of underlying elements, such grouping constrains generalization of pattern knowledge. Thus, regularity in the perceptual structure of the pattern elements is a salient feature of pattern perception in non-humans and is essential to the broader use of pattern knowledge.
Although this experiment was not designed to test the perceptual strategies that each group of subjects used to acquire pattern knowledge during baseline training, the results are instructive nonetheless. Given that subjects trained with the pseudo-randomized motif groups failed to generalize to novel pattern exemplars, the most parsimonious strategy for their recognition of the baseline sequences is rote memorization. In contrast, rote memorization cannot account for generalization to the novel patterns by the group working with the categorically well-defined motifs. Although we observed no statistically significant difference in the acquisition rates between subjects in the two groups, there was a trend for those learning patterns with the pseudo-category motifs to take longer, and it is likely that with larger group sizes these differences would reach statistical significance, consistent with a different learning strategy. Understanding the learning strategy that enables successful generalization will require further experiments, but it clearly involves sensitivity to the perceptual grouping of the patterned elements.
It is unclear whether successful pattern generalization was enabled because the perceptual features of motifs aligned with already known natural categories of song elements (i.e., warble and rattle), or because the categorical boundaries were more easily acquired during training due to greater within-class perceptual similarities, or both. Our choice to use species-typical song elements was motivated by recent demonstrations that such stimuli, compared to simple tones, enhance performance on other auditory tasks (Bregman, Patel, & Gentner, 2012). We suspect that starlings would perform similarly on this same pattern task using other auditory objects so long as they are sensitive to the perceptual categories of those stimuli; this remains, however, an open question for future research.
The present results may help in understanding the mechanisms and constraints for pattern generalization in young infants. Specifically, Marcus et al. (2007) report that infants fail to generalize patterning rules across sequences of tones, animal vocalizations or different musical timbres unless they are first exposed to those same patterning rules instantiated over speech sounds. Our results demonstrate a close interaction between pattern knowledge and pattern components, and support the idea that the abstraction of patterning information is closely constrained by the categorical structure of the constituent elements. Thus the ease with which children learn patterning rules instantiated over speech signals may reflect both familiarity and, perhaps more importantly, the perceptual structure of the speech sounds. This is consistent with several results suggesting that human infants use acoustic categories embedded in constituent elements of natural speech (e.g., words) as perceptual scaffolding for later acquiring knowledge of more abstract categories of language (Frigo & McDonald, 1998; Kelly, 1992; Monaghan, Christiansen, & Chater, 2007; Shi, Morgan, & Allopenna, 1998; Shi, Werker, & Morgan, 1999).
Our results indicate that the restricted focus of comparative language research on formal models of syntactic complexity is, at best, insufficient to understand how patterns are used. The abilities of non-human animals to perceive specific, formally defined patterns do not predict pattern generalization. Because reinterpreting patterns independent of specific perceptual events is essential for human language acquisition, we contend there is much to learn about the evolution of language faculties by understanding the benefits and constraints that learning any pattern exerts over multiple levels of abstraction.
Acknowledgments
Work supported by NSF Graduate Research Fellowship 2011122846 to JAC and NIH DC008358 to TQG. We thank Dan Knudsen, Amy Skerry and Jon Mahlandt for comments.
References
- Adret-Hausberger M, Jenkins P. Complex Organization of the Warbling Song in the European Starling Sturnus vulgaris. Behaviour. 1988;107(3):138–156.
- Berwick RC, Beckers GJL, Okanoya K, Bolhuis JJ. A Bird’s Eye View of Human Language Evolution. Frontiers in evolutionary neuroscience. 2012 Apr;4:5. doi: 10.3389/fnevo.2012.00005.
- Berwick R, Okanoya K, Beckers G, Bolhuis J. Songs to syntax: the linguistics of birdsong. Trends in Cognitive Sciences. 2011;15:113–121. doi: 10.1016/j.tics.2011.01.002.
- Braaten RF. Multiple levels of representation of song by European starlings (Sturnus vulgaris): Open-ended categorization of starling song types and differential forgetting of song categories and exemplars. Journal of Comparative Psychology. 2000;114(1):61–72. doi: 10.1037//0735-7036.114.1.61.
- Bregman MR, Patel AD, Gentner TQ. Stimulus-dependent flexibility in non-human auditory pitch processing. Cognition. 2012;122(1):51–60. doi: 10.1016/j.cognition.2011.08.008.
- Brown P, Jenkins H. Auto-shaping of the pigeon’s keypeck. Journal of the experimental analysis of behavior. 1968;11(1):1–8. doi: 10.1901/jeab.1968.11-1.
- Eens M, Pinxten R, Verheyen RF. Temporal and sequential organisation of song bouts in the starling. Ardea. 1988;77:75–86.
- Frigo L, McDonald JL. Properties of Phonological Markers That Affect the Acquisition of Gender-Like Subclasses. Journal of Memory and Language. 1998;39(2):218–245. doi: 10.1006/jmla.1998.2569.
- Gentner TQ, Fenn KM, Margoliash D, Nusbaum HC. Recursive syntactic pattern learning by songbirds. Nature. 2006;440(7088):1204–1207. doi: 10.1038/nature04675.
- Gentner TQ. Temporal scales of auditory objects underlying birdsong vocal recognition. The Journal of the Acoustical Society of America. 2008;124(2):1350–1359. doi: 10.1121/1.2945705.
- Gentner TQ, Hulse SH. Perceptual mechanisms for individual vocal recognition in European starlings, Sturnus vulgaris. Animal Behaviour. 1998;56:579–594. doi: 10.1006/anbe.1998.0810.
- Kelly MH. Using sound to solve syntactic problems: the role of phonology in grammatical category assignments. Psychological Review. 1992;99(2):349–64. doi: 10.1037/0033-295x.99.2.349.
- Kiggins JT, Comins JA, Gentner TQ. Targets for a Comparative Neurobiology of Language. Frontiers in Evolutionary Neuroscience. 2012 Apr;4:1–13. doi: 10.3389/fnevo.2012.00006.
- Marcus GF, Fernandes KJ, Johnson SP. Infant rule learning facilitated by speech. Psychological science. 2007;18(5):387–91. doi: 10.1111/j.1467-9280.2007.01910.x.
- Margoliash D, Nusbaum HC. Language: the perspective from organismal biology. Trends In Cognitive Sciences. 2009;13(12):505–510. doi: 10.1016/j.tics.2009.10.003.
- Meliza CD. Effects of auditory recognition learning on the perception of vocal features in European starlings (Sturnus vulgaris). The Journal of the Acoustical Society of America. 2011;130(5):3115–23. doi: 10.1121/1.3641420.
- Monaghan P, Christiansen MH, Chater N. The phonological-distributional coherence hypothesis: cross-linguistic evidence in language acquisition. Cognitive psychology. 2007;55(4):259–305. doi: 10.1016/j.cogpsych.2006.12.001.
- Shi R, Morgan JL, Allopenna P. Phonological and acoustic bases for earliest grammatical category assignment: a cross-linguistic perspective. Journal of child language. 1998;25(1):169–201. doi: 10.1017/s0305000997003395.
- Shi R, Werker JF, Morgan JL. Newborn infants’ sensitivity to perceptual cues to lexical and grammatical words. Cognition. 1999;72(2):B11–21. doi: 10.1016/s0010-0277(99)00047-5.