Author manuscript; available in PMC: 2019 Jan 1.
Published in final edited form as: Trends Cogn Sci. 2017 Nov 14;22(1):52–63. doi: 10.1016/j.tics.2017.10.003

Constraints on Statistical Learning across Species

Chiara Santolin 1,2, Jenny R Saffran 2
PMCID: PMC5777226  NIHMSID: NIHMS915313  PMID: 29150414

Abstract

Both human and nonhuman organisms are sensitive to statistical regularities in sensory inputs, supporting functions including communication, visual processing and sequence learning. One of the issues faced by comparative research in this field is the lack of a comprehensive theory to explain the relevance of statistical learning across distinct ecological niches. In the current review, we interpret cross-species research on statistical learning based on the perceptual and cognitive mechanisms characterizing the human and nonhuman models under investigation. Considering statistical learning as an essential part of the animal’s cognitive architecture will help to uncover the potential ecological functions of this powerful learning process.

Keywords: Statistical learning, infancy, comparative psychology

Finding structures in the sensory world

A central problem in the study of cognition and development concerns the ways in which organisms compute sensory inputs and discover patterns in the environment. Statistical learning has been proposed as a key mechanism for extracting regularities distributed across sensory modalities and cognitive domains (see Glossary), and as a process that constitutes the foundation of further abilities (see [1–3] for recent reviews). Over the past 20 years, a substantial body of research has explored statistical learning across myriad domains in both human and nonhuman learners.

In the present review, we identify constraints on statistical learning: differences in the amount of statistical information acquired or types of computations performed over a structured input by a given organism. We examine similarities and differences between species, and interpret statistical learning findings based on what is known about domain-general mechanisms possessed by the species in question.

Researchers in the field of human statistical learning consider these mechanisms to be an essential component of our perceptual and cognitive systems [3]. In humans, statistical learning is constrained in the way in which it operates across modalities and domains [1,4,5]. In addition, data from human infants suggest that statistical learning abilities may emerge at distinct points in development in different sensory modalities. This trajectory is likely explained, at least in part, by the ways in which infants’ perceptual and cognitive skills develop as a function of age and experience.

In nonhuman organisms, it is still unknown how statistical learning relates to other perceptual and cognitive capacities and to ecological niches. Studies have largely focused on the types of statistical computations performed by each species, and the extent to which such computations mirror those performed by humans. Less attention has been directed towards the ways in which statistical learning is constrained in nonhuman animals. As in prior work with humans, our goal is to examine statistical learning abilities through the lens of domain-general abilities that differentiate species.

Who can do what?

Statistical learning facilitates the detection of sequential patterns in visual, auditory, and tactile information, as well as spatial patterns in the visual domain [6–27]. In humans, and some songbird species, statistical learning also supports communicative functions. This process has been implicated in numerous aspects of early language development, including discovering word boundaries, prosodic and phonotactic patterns, syntactic structures, and label-object mappings [28–36]. In a similar vein, vocal learning in songbirds may involve statistical learning processes. For example, juvenile Bengalese and zebra finches learn their own songs by tracking the probability distribution of syllables within songs sung by adult tutors [37–39].

We have chosen to focus this review on three specific statistical learning abilities: tracking sequential statistics, generalizing sequential patterns, and acquiring simple syntactic structures. These three abilities are particularly relevant because they have been studied using comparable behavioral methods across species. Note that there is also a large literature testing human adults. For the purposes of the current comparative review, however, we focus primarily on human infant studies because the methods are more comparable to those used with nonhuman animals (see Box 1).

Box 1. Behavioral procedures.

1. Habituation

Infants are exposed to structured visual or auditory sequences until they reach a habituation criterion (diminished looking). Each test trial consists of presentation of a single sequence that is either consistent or inconsistent with the habituation materials. Discrimination is measured as a function of looking time to the consistent vs. inconsistent test items.

2. Head-Turn Preference

Infants are exposed to sound sequences for a few minutes in the presence of a neutral visual stimulus. The test items consist of sequences that are either consistent or inconsistent with the exposure materials (e.g., words vs. non-words: the same syllables in novel orders). Infants control the duration of each test trial via a head turn in the direction of the auditory stimulus. Discrimination is measured as in habituation.

In cotton-top tamarins, the orienting response towards the speaker playing the stimuli serves as the measure of discrimination.

3. Go/No-Go

Animals are first trained to discriminate a sound associated with the Go response (e.g., pecking on a sensor) from a sound associated with the No-Go response (no pecking) via reinforcement and/or punishment. Test items alternate with training stimuli which are reinforced to avoid response extinction. Subjects categorize test stimuli as consistent with Go or No-Go stimuli experienced during training; discrimination is measured as proportion of correct responses.

4. Spontaneous discrimination

Animals are familiarized with artificial languages for hours, and then tested on strings of sounds consistent vs. inconsistent with the familiarization language; discrimination is measured as a function of changes in calling behavior in response to the stimuli. In the newborn chick task, animals are exposed for hours to a structured visual stream of objects. Discrimination between consistent vs. inconsistent stimuli is measured by the proportion of time spent near the screen playing the consistent stimulus (see Figure 1).

5. Operant conditioning

Animals are trained to select one type of stimulus associated with food (S+); responses toward S− are associated with mild punishment or no reward. Discrimination is measured as a function of choice of consistent vs. inconsistent stimuli with respect to S+. Alternatively, the training phase consists of presentation of stimuli associated with food reinforcement. Test items are words consistent with the language vs. non-words, and the behavioral response (e.g., lever-pressing) associated with test stimuli is measured.

Tracking sequential patterns

Human neonates detect frequencies of co-occurrence in streams of syllables [40] and shapes [41]. The youngest ages for which there is published evidence of transitional probability detection are 4.5 months for visual materials [42] and 7 months for auditory materials [43]. Detecting transitional probabilities is more complex than detecting co-occurrence frequencies because transitional probabilities entail predictive relations amongst items, whereas frequencies involve simple co-occurrence of elements. Given the studies published to date, it is unclear whether detection of sequential statistics emerges earlier than what is reported in the literature.
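To make the distinction between the two statistics concrete, the following minimal Python sketch computes both over a toy syllable stream. The stream, the syllable labels, and the function name `pair_statistics` are illustrative assumptions, not materials or analyses from the studies cited above. The co-occurrence frequency of a pair is taken here to be its share of all adjacent pairs, and the forward transitional probability of Y given X is taken to be count(XY) / count(X).

```python
from collections import Counter


def pair_statistics(stream):
    """Return (co-occurrence frequency, forward transitional probability)
    for every adjacent pair of elements in `stream`."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each element only where it can act as the first member of a pair.
    first_counts = Counter(stream[:-1])
    n_pairs = len(stream) - 1

    # Co-occurrence frequency: how often the pair occurs among all pairs.
    freq = {pair: count / n_pairs for pair, count in pair_counts.items()}
    # Transitional probability: how well the first element predicts the second.
    tp = {(x, y): count / first_counts[x] for (x, y), count in pair_counts.items()}
    return freq, tp


# Hypothetical toy stream: "ba" is always followed by "du",
# while "du" is followed by either "ki" or "ga".
stream = ["ba", "du", "ki", "ba", "du", "ga", "ba", "du", "ki"]
freq, tp = pair_statistics(stream)

for pair in sorted(tp):
    print(pair, "frequency:", round(freq[pair], 3), "TP:", round(tp[pair], 3))
```

In this toy stream the pair ba-du has a transitional probability of 1.0 and a frequency of 0.375, whereas du-ki has a transitional probability of 2/3 and a frequency of 0.25. A learner sensitive only to frequency and a learner sensitive to predictive relations would therefore rank pairs on different quantities, which is why the two statistics must be dissociated experimentally.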

Sensitivity to sequential statistics has also been demonstrated in nonhuman species. Zebra finches (Taeniopygia guttata) track transitional probabilities to encode sequences of song syllables [44]. When learning songs from adult tutors, juvenile Bengalese finches (Lonchura striata domestica) select chunks of notes with greater internal transitional probabilities over groups of notes spanning chunk boundaries [37]. Following mere exposure without reward, newborn domestic chicks (Gallus gallus) detect patterns in streams of visual elements [13]. Although it remains unclear exactly which statistics chicks are computing (e.g., transitional probabilities or co-occurrence frequencies), this species does appear to be tuned to distributional information in its post-natal environment.

Among mammals, statistical learning from linguistic materials has been demonstrated in rats (Rattus norvegicus), who utilize frequencies of co-occurrence rather than transitional probabilities to track syllables in speech streams [8]. Cotton-top tamarins (Saguinus oedipus) show similar capacities; however, it is unknown whether tamarins detect co-occurrence frequencies or transitional probabilities between syllables [45].

Generalizing sequential patterns

The sequential learning tasks described in the previous section require learners to track specific sequences of elements. Other experimental paradigms assess whether learners can abstract beyond specific sequences, requiring generalization. Human neonates generalize the structure of triplets of syllables arranged in an ABB pattern (e.g., ga-ti-ti, we-fo-fo, la-gu-gu), discriminating novel ABB sequences from random sequences of the same syllables [46]. A few months later, infants’ generalization abilities become more robust. After familiarization with a pattern like ABB, 7-month-old infants discriminate novel syllables arrayed in this familiar structure from the same syllables in a new structure (e.g., ga-ti-ti vs. ga-ti-ga [47]).
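The logic of these generalization tasks can be sketched in a few lines of Python. The function below reduces any sequence to its abstract repetition pattern, so that sequences built from entirely different syllables can be compared on structure alone. The function name `abstract_pattern` is a hypothetical helper introduced for illustration; it describes the structure of the stimuli and is not a claim about the computation that infants or animals actually perform.

```python
def abstract_pattern(sequence):
    """Map a sequence of items onto its abstract pattern (e.g., 'ABB'),
    labeling items A, B, C, ... in order of first appearance."""
    labels = {}
    pattern = []
    for item in sequence:
        if item not in labels:
            # Assign the next unused letter to a newly encountered item.
            labels[item] = chr(ord("A") + len(labels))
        pattern.append(labels[item])
    return "".join(pattern)


# Familiarization items and test items share no syllables,
# yet the consistent test item has the same abstract pattern.
print(abstract_pattern(["ga", "ti", "ti"]))  # ABB
print(abstract_pattern(["we", "fo", "fo"]))  # ABB
print(abstract_pattern(["ga", "ti", "ga"]))  # ABA
```

A learner that generalizes must treat ga-ti-ti and we-fo-fo as equivalent despite the absence of shared elements, which is exactly what item-specific sequence tracking cannot deliver.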

Infants perform similar computations in visual and auditory nonlinguistic domains, although with some constraints. It has been suggested that infants perform better if previously exposed to the same regularities implemented by speech ([48]; for counterexamples, see [49]), or, more broadly, communicative signals [50]. An alternative hypothesis suggests that successful learning is impacted by familiarity and/or ease of processing. For example, infants are more apt to generalize sequences of animals or upright faces than geometric forms or inverted faces [9,21,51]. Other perceptual and cognitive factors appear to constrain this form of learning in infants, including the presence of immediate repetitions of individual elements [9,21,52].

Nonhuman animals also generalize in sequential learning tasks, but with notable limitations. Zebra finches generalize based on the perceptual similarity between training and test materials rather than syllable order. In fact, in the absence of acoustic properties shared between familiar and novel streams, finches fail to generalize at all [53–55]. Like human infants, finches seem to privilege patterns containing adjacent repetitions like ABB and AAB [54]. Newly-hatched chicks demonstrate robust generalization given training on triplets of visual objects arranged according to ABA, AAB, ABB and BAA patterns, regardless of the presence of immediate repetitions [26]. Among mammals, rats trained on ABA sequences instantiated by strings of tones generalize to novel stimuli [56]. Similar results have been obtained with consonant-vowel alternations implementing ABB patterns compared to random sequences [57,58]. Rhesus macaques (Macaca mulatta) discriminate novel AAB vs. ABB strings implemented by their own calls [59]; a recent extension of this study involved another primate species, the cotton-top tamarin, which can generalize structures presented in both speech and musical tones [60].

Acquiring simple syntactic structures

By 12 months of age, human infants can learn rudimentary syntactic structures in lab tasks. Using artificial grammar learning paradigms, infants are first familiarized with simple miniature grammars that generate sets of sentences. Infants subsequently discriminate grammatical versus ungrammatical strings containing either violations of internal syllable pairs or violations at the edges of the grammar [32]. In a similar paradigm, 12-month-olds learned phrase structures that mimicked basic statistical patterns of natural languages (e.g., a determiner, such as “the” or “a”, requires a noun somewhere within a sentence [34]). In this latter study, nonsense words from an artificial language were clustered in categories (e.g., determiners), whose presence depended on the presence of other categories (e.g., nouns). After exposure to the language, infants distinguished grammatical versus ungrammatical test strings, which violated predictive dependencies between word categories. This task also required generalization of grammatical knowledge beyond the trained sentences.

Songbirds’ ability to learn and generalize syntactic structures is somewhat more restricted. European starlings (Sturnus vulgaris) learn structures as complex as finite-state (e.g., (AB)^n) and hierarchical (e.g., A^nB^n) grammars formed by their own song syllables [61]. Finite-state grammars generate strings of items repeated linearly (e.g., ABABAB) whereas hierarchical grammars comprise categories of items grouped into sub-strings, nested into other sub-strings (e.g., phrase structure, AAABBB). However, at least in starlings, generalization to novel instances of the grammars (e.g., CDCDCD) is driven by acoustic similarity between training and test syllables, rather than by detection of the underlying patterns [62]. When the computations require transfer of structured information to novel inputs, this species exhibits limited capacities. Bengalese finches (Lonchura striata domestica) demonstrate more advanced generalization skills, taking advantage of statistical (predictive) dependencies between categories of song syllables and generalizing to strings composed of novel syllables [10]. This evidence is consistent with the previously described results from 12-month-old human infants, pointing to common processing of basic syntactic patterns between the two species.
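The contrast between the two grammar types can be illustrated with a short Python sketch, assuming the conventional reading of the notation: (AB)^n denotes n repetitions of an AB pair, and A^nB^n denotes n items of category A followed by the same number of items of category B. The function names are hypothetical, and the single letters stand in for whole categories of song syllables; the actual stimuli in the cited studies were acoustic motifs, not letter strings.

```python
import re


def finite_state_string(n):
    """Generate (AB)^n: the AB pair repeated n times."""
    return "AB" * n


def hierarchical_string(n):
    """Generate A^nB^n: n A items followed by n B items."""
    return "A" * n + "B" * n


def is_finite_state(s):
    """Check membership in (AB)^n for n >= 1; no counting is needed."""
    return re.fullmatch(r"(AB)+", s) is not None


def is_hierarchical(s):
    """Check membership in A^nB^n for n >= 1; the number of A items
    must be matched against the number of B items."""
    match = re.fullmatch(r"(A+)(B+)", s)
    return match is not None and len(match.group(1)) == len(match.group(2))


print(finite_state_string(3))      # ABABAB
print(hierarchical_string(3))      # AAABBB
print(is_finite_state("AAABBB"))   # False
print(is_hierarchical("ABABAB"))   # False
```

The membership checks make the computational difference visible: recognizing (AB)^n only requires tracking the transition from each item to the next, whereas recognizing A^nB^n additionally requires comparing the lengths of the two sub-strings. Note that the two grammars coincide for n = 1, so test strings must be longer to be diagnostic.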

Among nonhuman primates, tamarins learn finite-state and phrase structure grammars [34,63]. However, in the latter case, monkeys failed to extract the structure when it required generalization to novel sentences, unlike human infants, suggesting that the operation is limited to the specific stimuli with which they were familiarized. In a series of artificial grammar learning tasks involving phrase-structure grammars (similar to [34]), marmosets (Callithrix jacchus) primarily encoded regularities involving the beginning of sentences, whereas macaques could also track violations throughout the strings [11]. Macaques have also been compared to human adults in tasks investigating learning of nonadjacent dependencies. In line with some findings on nonhuman primates [63,64] but not others [65], macaques exhibited no sensitivity to regularities involving nonadjacent syllables, only responding to violations of contiguous syllables. Humans performed better than macaques overall, detecting nonadjacent patterns [66].

Species comparisons & constraints

Despite many similarities, human infants and nonhuman animals diverge in the facility with which they perform various statistical learning tasks. Our interpretation of these cross-species differences considers perception and cognition, focusing on the extent to which these general abilities constrain statistical learning. A similar approach has been taken to explain early learning of grammar with respect to similarities between human and nonhuman performance. According to this perspective, grammar acquisition is supported by specialized perceptual and memory systems, possibly shared with other species [67]. In the present review article, we examine and interpret cross-species differences in comparable tasks based on domain-general abilities possessed by the species in question. Considering the creature under investigation tout court has the potential to provide a window into the ecological functions of statistical learning. To this end, we will examine statistical learning cross-species comparisons through the lens of three aspects of perception and cognition: vocal learning, perceptual processing, and memory.

Relationship between vocal learning and statistical learning

Vocal learning requires animals to modify the acoustic structure of their own species’ vocalizations and produce novel patterns of sounds. Through this mechanism, the young of some species acquire fundamental communicative signals that facilitate social interactions with conspecifics. Vocal learners include passerine songbirds, seals, cetaceans, and some bats. However, there are substantial differences in the way this process unfolds across species. Some species acquire a single novel vocalization as juveniles, whereas others learn new sounds even in adulthood [68,69]. Differences also include the structure of the learned sound patterns. Some species’ songs provide more structural variability (e.g., European starlings), whereas others consist of fixed sequences of notes (e.g., zebra finches) [70,71]. Differences in vocal learning across species may predict the complexity of the patterns animals can learn and generalize in lab tasks [55]. In particular, species whose songs are syntactically-structured and composed of a wide range of notes should be better at performing complex computations requiring, for instance, generalization of a given pattern to novel exemplars.

Consistent with this hypothesis, recent findings comparing two avian species in an artificial grammar learning task reveal differences that are consistent with general aspects (syntactic-like organization, phonological variability) of birds’ vocal behavior [55]. Zebra finches learn regularities like ABA and AAB presented as short sequences of song syllables, but fail to generalize to structured strings formed by novel song syllables. However, the budgerigar (common parakeet; Melopsittacus undulatus) goes beyond item-specific information and generalizes these patterns, recognizing them even when they are implemented in novel sounds. This difference in performance can be explained in light of what is known about vocal learning in these species. The vocal repertoire of zebra finches includes highly-stereotyped songs formed by rigid syllable sequences, resulting in repetitive, linear patterns [70–73]. Budgerigars are open-ended vocal learners, with high vocal plasticity, whose songs show greater phonological variability and syntactic-like organization [74,75]. We hypothesize that the perceptual and learning skills possessed by the finches may not be suited to perform computations more complex than rote memorization of specific syllable order. In contrast, budgerigars’ vocal learning abilities suggest the presence of learning mechanisms that can detect the underlying structure of sound sequences, later recognizing that structure when implemented by novel syllables (i.e., generalization). This hypothesis is supported by findings showing that budgerigars possess generally superior memory for complex acoustic stimuli [76]. The directionality of this relationship remains unknown. In line with theorizing about human language, we hypothesize that the learning abilities themselves have shaped the structure of the songs in these species [31,89].

Visual statistical learning in neonates

The development of perceptual processing also provides insights into the learning outcomes observed across species. Precocial and altricial species differ in the ontogeny of a range of physiological and behavioral functions. Precocial animals are biologically mature from hatching or birth, and are generally independent from parental care. For instance, superprecocial animals like megapode birds (e.g., Australian brushturkey, Alectura lathami) leave the nest shortly after hatching, and display fully developed brain structures and motor behavior, being able to fly a few hours after hatching [77–79]. As a consequence, the early perceptual processing performed by precocial organisms exceeds that of altricial species [see also 80]. These ontogenetic differences may be linked to the statistical learning skills present in a given species at birth. It is likely the case that altricial animals would have to deal with biological limitations (e.g., neural development, sensory and perceptual processing) that constrain what can be learned at the outset of postnatal life.

From this perspective, consider the comparison between human neonates – an altricial species, with an extended developmental timeline – and newly-hatched domestic chicks, a precocial bird species. In a visual statistical learning study, newly-hatched chicks discriminated structured from random streams of four and six shapes [13]. Human neonates, however, succeeded only with streams of four shapes, failing with six-shape streams [41]. Limited perceptual abilities are likely to constrain visual statistical learning in humans at birth, whose visual system is immature (e.g., severely reduced acuity [81]), generally limiting visual learning [82]. Such restrictions are typical of the altricial human primate, whose offspring stay immature longer than other mammals [83,84]. Unlike humans, chicks hatch in an advanced stage of development, with completely developed visual pathways from the very first days of postnatal life [85–87]. In chicks, vision is the predominant sensory modality, allowing full processing of complex visual stimuli right after hatching. Compared to chicks, reduced visual skills in human infants appear to constrain statistical learning, limiting the extent to which infants can process statistics over particular inputs. Indeed, a classic theory of perceptual development hypothesizes that limitations on the amount of information computed – due to the protracted maturation of some sensory systems relative to others – actually facilitate human perceptual and cognitive development [88].

Statistical learning is likely to guide different functions in humans and chicks. In humans, early learning of regularities plays a key role in domains other than vision, especially language processing. In chicks, however, extracting regularities might be linked with early social learning, a fundamental capacity that allows newly-hatched chicks to recognize the mother hen and siblings. This process, filial imprinting, occurs immediately after birth via learning of invariant visual features of their social companions, like plumage colored patterns, peck and head shape [85,89]. We hypothesize that statistical learning works in tandem with filial imprinting in chicks, and leads to an integrated representation of relevant social objects that will allow further recognition and identification. Consistent with our hypothesis, a recent study shows that imprinting promotes chicks’ learning of multimodal regularities such as XX vs. XY implemented by sound-shape pairs as well as generalization to novel instances [90; see also 91]. Being precocial requires chicks to process salient visual stimuli at birth, whereas altricial humans do not have to rely on early visual learning to identify relevant social inputs. On this view, statistical learning abilities are influenced by the ontogeny of the species, which determines the functions supported by the learning process (i.e., language acquisition, recognition of social objects, etc.).

Memory and statistical learning

Memory is a fundamental component of statistical learning. Learners must keep track of sequences of elements that rarely persist over time, and temporarily hold information necessary for future processing (i.e., working memory [92]). According to the memory-based framework presented in [93], regularities extracted from structured input lead to representations in long-term memory even after short exposure [see also 94]. Memory traces then become the foundation for subsequent learning operations, driving detection of further patterns, and integrating stored information to find common regularities across exemplars [95,96]. Computations such as extraction, storing and integration seem to be fundamental for both acquiring sequential regularities and generalizing to novel exemplars. This framework also points to the way in which learners retain and retrieve information as a constraint on statistical learning, suggesting that sensitivity to patterns in the sensory input is shaped by learners’ memory skills (e.g., storing, access). Following this path, we hypothesize that species with reduced memory skills would be sensitive to a restricted set of structured information from sequentially-presented inputs compared to species with enhanced storing and retention abilities.

Among nonhuman primates, marmosets and macaques have been directly compared in artificial grammar learning experiments to test learning of sequential syntactic patterns. When presented with strings violating the familiar grammar at multiple locations, marmosets, unlike macaques, detect only violations at the very beginning of the grammar [11]. One explanation for this pattern of performance points to cross-species differences in memory skills. These species belong to separate groups of the primate order, with distinct physical features and cognitive abilities. Macaques are equipped with superior general learning and memory capacities, outperforming marmosets in basic learning tasks such as discrimination of learned stimuli [97,98]. In particular, some species of the Callitrichidae family, including the common marmoset, seem to use spatial memory while foraging, concurrently tracking several object locations containing food [99–101]. It is likely that such differences affect other computations that closely involve learning, such as statistical processing of grammatical sequences. On this view, marmosets developed better memory skills in the spatial modality than in the temporal modality, thus performing worse on tasks requiring retention of sequential stimuli.

Nonhuman primates’ general learning systems may not be tailored to acquire linguistic structures as complex as hierarchical grammars [see also 102], but do support learning of sequential patterns in other domains like tool-use [103,104], motor actions [105,106] and social communication [107] which require the processing of events occurring over time, and play a prominent role in primates’ behavioral repertoires. It is possible that the use of linguistic materials in studies investigating sequential statistical learning has hindered the discovery of more advanced capabilities in nonhuman primates (see Concluding Remarks). Experiments using “grammars” created not from words but from sequences of tools or actions might reveal statistical learning abilities not observed with linguistic materials. More generally, interactions between statistical learning abilities and the elements over which learning occurs (e.g., speech syllables vs. motor actions) point to important constraints on statistical learning, and may help to explain divergent learning outcomes across species.

Concluding remarks

In this review, we have framed cross-species differences in statistical learning in the context of perceptual and cognitive mechanisms that vary across organisms. It is clearly the case that reduced perceptual abilities in a given modality lead to limited learning of structured inputs, as in the case of human neonates. In a similar vein, reduced memory capacity affects regularities detected from sequential inputs, confining learning to a restricted portion of the information available in the input.

According to this perspective, we might expect similar computations in species that share cognitive functions supported by statistical learning. For instance, statistical learning is likely to drive vocal learning in organisms that must learn to produce structured vocalizations [71,108,109]. Like human infants acquiring languages, juveniles of some songbird species track statistical properties of songs, and re-combine statistically coherent patterns into new songs [37,38]. These observations suggest that both acoustic features and structural organization of the songs are acquired during vocal learning, making this process similar to the beginning of human language acquisition. It is thus not surprising that some bird species and humans exhibit the most advanced statistical learning of sound sequences.

Vocal learning is not limited to birds and humans. Bats, seals, cetaceans and elephants meet the requirements for vocal learners: they can learn novel vocalizations and imitate other species’ sounds. Among cetaceans, bottlenose dolphins (Tursiops truncatus) possess impressive vocal learning and mimicry abilities: they learn to produce new vocalizations, have sophisticated communication systems composed of extended call repertoires, and generally possess remarkable cognitive skills [110–112]. For these reasons, we would expect dolphins to show statistical learning capacities similar to those of human neonates and songbirds [113]. In a similar vein, we would expect non-vocal learners to show restricted abilities in artificial grammar learning tasks. For example, domestic chickens are limited to short repetitive calls [114,115], whose usage is acquired following social experience with conspecifics [109]. Given that there is no vocal learning in this species, we predict that young chicks would fail to track probabilistic structures from auditory stimuli, performing worse than vocal learners.

Statistical learning may support communicative functions even in the absence of vocal learning. Fieldwork suggests that free-ranging putty-nosed monkeys (Cercopithecus nictitans) can track co-occurrences between specific combinations of natural calls and the presence of an eagle, eventually emitting vocalizations to signal the predator’s presence to the rest of the group [116]. Black-fronted titi monkeys (Callicebus nigrifrons) produce call sequences whose combination and type vary based on the context (e.g., aerial vs. terrestrial predator [117,118]). Female baboons (Papio cynocephalus ursinus) appear to possess a combinatorial system of communication reflecting the complex hierarchy of their social group [107]. For example, female baboons notice inconsistent call sequences with respect to the rank of the caller, recognizing when a dominant female emits unusual vocalizations when interacting with subordinates [119].

Statistical learning abilities also impact non-communicative domains across species. Guinea baboons (Papio papio) demonstrated learning of hierarchical structures from visual stimuli [120] and orthographic patterns [23]. Baboons learned statistical relations between printed letters and their positions within a word, distinguishing English words and nonwords with remarkable accuracy. The main theoretical implication of this work is that orthographic processing, an essential computation in reading, might be rooted in more general statistical learning mechanisms shared across baboons and humans, and appears to be constrained by domain-general abilities necessary to discriminate visual objects (e.g., detecting feature combinations).

Further consideration of the role of the perceptual and cognitive mechanisms supporting statistical learning will help to clarify how this process unfolds in different ecological niches (see Outstanding Questions). In human infants, the evidence suggests that statistical learning abilities have impacted the types of structures that are observed in human languages [4,33,34]. It is also the case that what human infants have already learned impacts the types of structures that they subsequently learn most readily [95,121]. Both of these directions of effects may influence statistical learning in non-human animals as well. Interpreting cross-species findings in this light links our approach to perspectives on human learning that consider statistical learning as a core component of cognitive systems, rather than an independent computational process [3]. This view is also aligned with modern neurocomputational theories, which treat brains as sophisticated prediction machines [122], internalizing probabilistic models of the environment in order to anticipate the sensory stream and generate inferences [123,124]. Shifting the focus of comparative research to the systems within which learning is embedded will allow us to develop a much deeper understanding of the ecological relevance of statistical learning.

Outstanding Questions.

  • What is the ecological function of statistical learning in distinct species? What does tracking frequencies or probability distributions allow animals to do in real life?

  • Do cross-species findings imply the presence of a unique statistical learning mechanism shared between organisms? Or does the fact that statistical learning leads to different outcomes in different species imply the presence of multiple separate mechanisms?

  • Is statistical learning domain-general even within a given animal species? For example, is statistical learning involved in domains other than vocal learning in songbirds as it is in humans?

  • What is the directionality of the relationship between differences in statistical learning and differences in general perceptual/cognitive abilities across species? For example, are differences in birdsong structure due to different learning mechanisms across species, or are differences in learning mechanisms due to different song structures across species?

Figure 1.

Apparatus and sample stimuli used to investigate chicks’ detection of statistical patterns [13]. In the Familiar sequence, the shapes are structured into pairs, such that the first shape in a pair is always followed by the same second shape. In the Unfamiliar sequence, the same shapes are presented but in random order.

Trends Box.

  • Ecological relevance of statistical learning is mostly undefined in the animal kingdom; the majority of cross-species research in this area focuses on revealing human-like abilities rather than discovering the functions of statistical learning in different species.

  • Perception, memory and learning guide and constrain statistical learning differently across species, limiting the extent to which organisms can process regularities from sensory inputs.

  • Cross-species differences can be framed around general-purpose abilities, developmental processes, and learning challenges characterizing the animal models under investigation, and interpreted by considering how statistical learning is integrated within the learner’s cognitive system.

Acknowledgments

We are grateful to Alberto Testolin and Erica Wojcik for comments on earlier versions of this manuscript. Preparation of the manuscript was supported by a fellowship from the Foundation Marica DeVincenzi onlus along with the Department of Psychology and Cognitive Science of the University of Trento to C.S., and by a grant from NIH (R37HD037466) to J.S.

Glossary

Artificial grammar

a nonsense language composed of novel words whose order mimics structures of natural languages

Domains

general areas of cognition such as language, music, object recognition, memory, attention

Domain-general

refers to those processes and abilities operating across distinct cognitive domains. Statistical learning is often considered to be a domain-general mechanism because it operates across multiple domains, including language, music, and visual objects

Ecological niche

habitat or environment in which a species lives, together with the behavioral responses emitted in order to best adapt to that habitat

Frequency of co-occurrence

how often two elements composing a pattern appear together. Elements can be syllables, musical notes, or visual shapes, depending on the sensory modality in which the pattern is implemented

Finite-state grammar

transitional probabilities between a set of adjacent items, (AB)^n. A unique pair of items generates a linear structure based on the number of repetitions required (e.g., ABABABABAB)

Generalization

learning of a structure independent of the perceptual identity of its components. For example, ABA represents a specific ordering of categories of items (A and B), which can be implemented by any given syllables or shapes; cross-circle-cross, triangle-hexagon-triangle represent instances of the same structure. Generalization requires learners to transfer this structure to novel stimuli (new syllables or shapes)

Modality

sensory presentation of the stimuli used in experimental tasks. In the statistical learning literature, the principal modalities are auditory, visual (which can be spatial or temporal) and tactile

Nonadjacent dependency

conditional probability between two items separated by at least one intervening item (AXB). As with transitional probabilities, A predicts B

Phrase structure grammar

categories of items (usually words like determiners, nouns, verbs, etc.) clustered into sub-groups such as noun-phrases and verb-phrases. Sub-phrases can be nested into other sub-phrases, conferring hierarchical organization to a sentence. Which word pertains to which category, and the position of sub-phrases within a sentence, is determined by statistical dependencies (example of rudimentary phrase structures in the literature: A^nB^n; AAABBB)

Statistical learning

learning mechanism enabling detection of regularities (e.g., co-occurrence frequencies, transitional probabilities, nonadjacent dependencies, etc.) in sensory inputs

Transitional probability

a form of conditional probability defined by the formula P(B|A) = frequency of AB / frequency of A. This conditional statistic is sensitive to the order in which one item predicts the following one within a pair (AB)
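
To make the transitional probability definition concrete, the following is a minimal Python sketch, not a procedure taken from any of the reviewed studies. The function name `transitional_probabilities` and the example stream are hypothetical. The sketch assumes a single sequence of discrete items and counts "frequency of A" only over positions that have a successor, so that the probabilities leaving each item sum to 1.

```python
from collections import Counter


def transitional_probabilities(sequence):
    """Return {(A, B): P(B|A)} for every adjacent pair in the sequence."""
    # Frequency of each adjacent pair AB.
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Frequency of A, counted only where A is followed by another item
    # (an assumption of this sketch; the final item has no successor).
    first_counts = Counter(sequence[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical stream built from two fixed pairs, (A, B) and (C, D),
# concatenated in varying order.
stream = list("ABCDCDABABCD")
tps = transitional_probabilities(stream)

for (a, b), p in sorted(tps.items()):
    print(f"P({b}|{a}) = {p:.2f}")
```

In this toy stream the within-pair transitions (A to B, C to D) have a probability of 1.0, whereas transitions that span a pair boundary (e.g., B to C) are lower. This difference is the kind of cue that segmentation studies rely on.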

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Frost R, et al. Domain generality versus modality specificity: The paradox of statistical learning. Trends Cogn Sci. 2015;19:117–125. doi: 10.1016/j.tics.2014.12.010.
  • 2. Dehaene S, et al. The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees. Neuron. 2015;88:2–19. doi: 10.1016/j.neuron.2015.09.019.
  • 3. Armstrong BC, et al. The long road of statistical learning research: past, present and future. Phil Trans R Soc B. 2017 doi: 10.1098/rstb.2016.0047.
  • 4. Saffran JR. Constraints on statistical language learning. J Mem Lang. 2002;47:172–196.
  • 5. Krogh L, et al. Statistical learning across development: flexible yet constrained. Front Psychol. 2012 doi: 10.3389/fpsyg.2012.00598.
  • 6. Saffran JR, et al. Statistical learning of tone sequences by human infants and adults. Cognition. 1999;70:27–52. doi: 10.1016/s0010-0277(98)00075-4.
  • 7. Conway CM, Christiansen MH. Sequential learning in non-human primates. Trends Cogn Sci. 2001;5:539–546. doi: 10.1016/s1364-6613(00)01800-3.
  • 8. Toro JM, Trobalón JB. Statistical computations over a speech stream in a rodent. Atten Percept Psychophys. 2005;67:867–875. doi: 10.3758/bf03193539.
  • 9. Johnson SP, et al. Abstract rule learning for visual sequences in 8- and 11-month-olds. Infancy. 2009;14:2–18. doi: 10.1080/15250000802569611.
  • 10. Abe K, Watanabe D. Songbirds possess the spontaneous ability to discriminate syntactic rules. Nat Neurosci. 2011;14:1067–1074. doi: 10.1038/nn.2869.
  • 11. Wilson B, et al. Auditory artificial grammar learning in macaque and marmoset monkeys. J Neurosci. 2013;33:18825–18835. doi: 10.1523/JNEUROSCI.2414-13.2013.
  • 12. Slone LK, Johnson SP. Infants’ statistical learning: 2- and 5-month-olds’ segmentation of continuous visual sequences. J Exp Child Psychol. 2015;133:47–56. doi: 10.1016/j.jecp.2015.01.007.
  • 13. Santolin C, et al. Unsupervised statistical learning in newly hatched chicks. Curr Biol. 2016a;26:R1218–R1220. doi: 10.1016/j.cub.2016.10.011.
  • 14. Tummeltshammer K, et al. Across space and time: infants learn from backward and forward visual statistics. Dev Sci. 2016 doi: 10.1111/desc.12474.
  • 15. Baldwin DA, et al. Infants parse dynamic action. Child Dev. 2001;72:708–717. doi: 10.1111/1467-8624.00310.
  • 16. Kirkham NZ, et al. Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition. 2002;83:B35–B42. doi: 10.1016/s0010-0277(02)00004-5.
  • 17. Fiser J, Aslin RN. Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychol Sci. 2001;12:499–504. doi: 10.1111/1467-9280.00392.
  • 18. Fiser J, Aslin RN. Statistical learning of higher-order temporal structure from visual shape sequences. J Exp Psychol Learn Mem Cogn. 2002a;28:458–467. doi: 10.1037//0278-7393.28.3.458.
  • 19. Fiser J, Aslin RN. Statistical learning of new visual feature combinations by infants. Proc Natl Acad Sci USA. 2002b;99:15822–15826. doi: 10.1073/pnas.232472899.
  • 20. Fiser J, Aslin RN. Encoding multielement scenes: statistical learning of visual feature hierarchies. J Exp Psychol Gen. 2005;134:521–537. doi: 10.1037/0096-3445.134.4.521.
  • 21. Saffran JR, et al. Dog is a dog is a dog: Infant rule learning is not specific to language. Cognition. 2007;105:669–680. doi: 10.1016/j.cognition.2006.11.004.
  • 22. Kirkham NZ, et al. Location, location, location: Development of spatiotemporal sequence learning in infancy. Child Dev. 2007;78:1559–1571. doi: 10.1111/j.1467-8624.2007.01083.x.
  • 23. Grainger J, et al. Orthographic processing in baboons (Papio papio). Science. 2012;336:245–248. doi: 10.1126/science.1218152.
  • 24. Stobbe N, et al. Visual artificial grammar learning: comparative research on humans, kea (Nestor notabilis) and pigeons (Columba livia). Phil Trans R Soc B. 2012;367:1995–2006. doi: 10.1098/rstb.2012.0096.
  • 25. Minier L, et al. The temporal dynamics of regularity extraction in non-human primates. Cogn Sci. 2015;40:1019–1030. doi: 10.1111/cogs.12279.
  • 26. Santolin C, et al. Generalization of visual regularities in newly hatched chicks (Gallus gallus). Anim Cogn. 2016b;19:1007–1017. doi: 10.1007/s10071-016-1005-2.
  • 27. Scarf D, et al. Orthographic processing in pigeons (Columba livia). Proc Natl Acad Sci USA. 2016;113:11272–11276. doi: 10.1073/pnas.1607870113.
  • 28. Saffran JR, et al. Statistical learning by 8-month-old infants. Science. 1996;274:1926–1928. doi: 10.1126/science.274.5294.1926.
  • 29. Aslin RN, et al. Computation of conditional probability statistics by 8-month-old infants. Psychol Sci. 1998;9:321–324.
  • 30. Mattys SL, et al. Phonotactic and prosodic effects on word segmentation in infants. Cogn Psychol. 1999;38:465–494. doi: 10.1006/cogp.1999.0721.
  • 31. Mattys SL, Jusczyk PW. Phonotactic cues for segmentation of fluent speech by infants. Cognition. 2001;78:91–121. doi: 10.1016/s0010-0277(00)00109-8.
  • 32. Gomez RL, Gerken L. Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition. 1999;70:109–135. doi: 10.1016/s0010-0277(99)00003-7.
  • 33. Saffran JR. The use of predictive dependencies in language learning. J Mem Lang. 2001;44:493–515.
  • 34. Saffran JR, et al. Grammatical pattern learning by human infants and cotton-top tamarin monkeys. Cognition. 2008;107:479–500. doi: 10.1016/j.cognition.2007.10.010.
  • 35. Graf-Estes K, et al. Can infants map meaning to newly segmented words? Statistical segmentation and word learning. Psychol Sci. 2007;18:254–260. doi: 10.1111/j.1467-9280.2007.01885.x.
  • 36. Clerkin EM, et al. Real-world visual statistics and infants’ first-learned object names. Phil Trans R Soc B. 2017 doi: 10.1098/rstb.2016.0055.
  • 37. Takahasi M, et al. Statistical and prosodic cues for song segmentation learning by Bengalese finches (Lonchura striata var. domestica). Ethology. 2010;116:481–489.
  • 38. Menyhart O, et al. Juvenile zebra finches learn the underlying structural regularities of their fathers’ song. Front Psychol. 2015;6:1–12. doi: 10.3389/fpsyg.2015.00571.
  • 39. Fehér O, et al. Statistical learning in songbirds: from self-tutoring to song culture. Phil Trans R Soc B. 2017 doi: 10.1098/rstb.2016.0053.
  • 40. Teinonen T, et al. Statistical language learning in neonates revealed by event-related brain potentials. BMC Neurosci. 2009;10:21. doi: 10.1186/1471-2202-10-21.
  • 41. Bulf H, et al. Visual statistical learning in the newborn infant. Cognition. 2011;121:127–132. doi: 10.1016/j.cognition.2011.06.010.
  • 42. Marcovitch S, Lewkowicz DJ. Sequence learning in infancy: The independent contributions of conditional probability and pair frequency information. Dev Sci. 2009;12:1020–1025. doi: 10.1111/j.1467-7687.2009.00838.x.
  • 43. Thiessen ED, et al. Infant-directed speech facilitates word segmentation. Infancy. 2005;7:53–71. doi: 10.1207/s15327078in0701_5.
  • 44. Chen J, ten Cate C. Zebra finches can use positional and transitional cues to distinguish vocal element strings. Behav Process. 2015;117:29–34. doi: 10.1016/j.beproc.2014.09.004.
  • 45. Hauser MD, et al. Segmentation of the speech stream in a non-human primate: statistical learning in cotton-top tamarins. Cognition. 2001;78:B53–B64. doi: 10.1016/s0010-0277(00)00132-3.
  • 46. Gervain J, et al. The neonate brain detects speech structure. Proc Natl Acad Sci USA. 2008;105:14222–14227. doi: 10.1073/pnas.0806530105.
  • 47. Marcus GF, et al. Rule learning by seven-month-old infants. Science. 1999;283:77–80. doi: 10.1126/science.283.5398.77.
  • 48. Marcus GF, et al. Infant rule learning facilitated by speech. Psychol Sci. 2007;18:387–391. doi: 10.1111/j.1467-9280.2007.01910.x.
  • 49. Dawson C, Gerken L. From domain-generality to domain-sensitivity: 4-month-olds learn an abstract repetition rule in music that 7-month-olds do not. Cognition. 2009;111:378–382. doi: 10.1016/j.cognition.2009.02.010.
  • 50. Ferguson B, Lew-Williams C. Communicative signals support abstract rule learning by 7-month-old infants. Sci Rep. 2016 doi: 10.1038/srep25434.
  • 51. Bulf H, et al. Many faces, one rule: the role of perceptual expertise in infants’ sequential rule learning. Front Psychol. 2015:1595. doi: 10.3389/fpsyg.2015.01595.
  • 52. Thiessen ED. Effects of inter- and intra-modal redundancy on infants’ rule learning. Lang Learn Dev. 2012;8:197–214.
  • 53. Chen J, et al. Artificial grammar learning in zebra finches and human adults: XYX versus XXY. Anim Cogn. 2015;18:151–164. doi: 10.1007/s10071-014-0786-4.
  • 54. van Heijningen CA, et al. Rule learning by zebra finches in an artificial grammar learning task: Which rule? Anim Cogn. 2013;16:165–175. doi: 10.1007/s10071-012-0559-x.
  • 55. Spierings MJ, ten Cate C. Budgerigars and zebra finches differ in how they generalize in an artificial grammar learning experiment. Proc Natl Acad Sci USA. 2016;113:E3977–E3984. doi: 10.1073/pnas.1600483113.
  • 56. Murphy RA, et al. Rule learning by rats. Science. 2008;319:1849–1851. doi: 10.1126/science.1151564.
  • 57. de La Mora D, Toro JM. Rule learning over consonants and vowels in a non-human animal. Cognition. 2013;126:307–312. doi: 10.1016/j.cognition.2012.09.015.
  • 58. Crespo-Bojorque P, Toro JM. The use of interval ratios in consonance perception by rats (Rattus norvegicus) and humans (Homo sapiens). J Comp Psychol. 2015;129:42. doi: 10.1037/a0037991.
  • 59. Hauser MD, Glynn D. Can free-ranging rhesus monkeys (Macaca mulatta) extract artificially created rules comprised of natural vocalizations? J Comp Psychol. 2009;123:161. doi: 10.1037/a0015584.
  • 60. Neiworth JJ, et al. Artificial grammar learning in tamarins (Saguinus oedipus) in varying stimulus contexts. J Comp Psychol. 2017;131:128. doi: 10.1037/com0000066.
  • 61. Gentner TQ, et al. Recursive syntactic pattern learning by songbirds. Nature. 2006;440:1204–1207. doi: 10.1038/nature04675.
  • 62. Comins JA, Gentner TQ. Perceptual categories enable pattern generalization in songbirds. Cognition. 2013;128:113–118. doi: 10.1016/j.cognition.2013.03.014.
  • 63. Fitch WT, Hauser MD. Computational constraints on syntactic processing in a nonhuman primate. Science. 2004;303:377–380. doi: 10.1126/science.1089401.
  • 64. Newport EL, et al. Learning at a distance II. Statistical learning of non-adjacent dependencies in a non-human primate. Cogn Psychol. 2004;49:85–117. doi: 10.1016/j.cogpsych.2003.12.002.
  • 65. Ravignani A, et al. Action at a distance: dependency sensitivity in a New World primate. Biol Lett. 2013 doi: 10.1098/rsbl.2013.0852.
  • 66. Wilson B, et al. Mixed-complexity artificial grammar learning in humans and macaque monkeys: evaluating learning strategies. Eur J Neurosci. 2015;41:568–578. doi: 10.1111/ejn.12834.
  • 67. Endress AD, et al. Perceptual and memory constraints on language acquisition. Trends Cogn Sci. 2009;13:348–353. doi: 10.1016/j.tics.2009.05.005.
  • 68. Catchpole CK, Slater PJB, editors. Bird Song: Biological Themes and Variations. Cambridge University Press; 1995.
  • 69. Okanoya K. The Bengalese finch: a window on the behavioral neurobiology of birdsong syntax. Ann NY Acad Sci. 2004;1016:724–735. doi: 10.1196/annals.1298.026.
  • 70. Honda E, Okanoya K. Acoustical and syntactical comparisons between songs of the white-backed Munia (Lonchura striata) and its domesticated strain, the Bengalese finch (Lonchura striata var. domestica). Zool Sci. 1999;16:319–326. doi: 10.1002/jez.1748.
  • 71. Berwick RC, et al. Songs to syntax: the linguistics of birdsong. Trends Cogn Sci. 2011;15:113–121. doi: 10.1016/j.tics.2011.01.002.
  • 72. Price PH. Developmental determinants of structure in zebra finch song. J Comp Physiol Psychol. 1979;93:260.
  • 73. Scharff C, Nottebohm F. Study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning. J Neurosci. 1991;11:2896–2913. doi: 10.1523/JNEUROSCI.11-09-02896.1991.
  • 74. Farabaugh SM, et al. Analysis of warble song of the budgerigar Melopsittacus undulatus. Bioacoustics. 1992;4:111–130.
  • 75. Dooling RJ, et al. Perceptual organization of acoustic stimuli by budgerigars (Melopsittacus undulatus): II. Vocal signals. J Comp Psychol. 1987;101:367. doi: 10.1037/0735-7036.101.4.367.
  • 76. Dent ML, Welch TE, McClaine EM, Shinn-Cunningham BG. Species differences in the identification of acoustic stimuli by birds. Behav Process. 2008;77:184–190. doi: 10.1016/j.beproc.2007.11.005.
  • 77. Starck JM, Ricklefs RE. Patterns of development: the altricial-precocial spectrum. Oxford Ornithology Series. 1998;8:3–30.
  • 78. Andrew RJ, editor. Neural and behavioural plasticity: The use of the chick as a model. Oxford University Press; 1991.
  • 79. Rose S. God’s organism? The chick as a model system for memory studies. Learn Mem. 2000;7:1–17. doi: 10.1101/lm.7.1.1.
  • 80. Sloman A, Chappell J. The altricial-precocial spectrum for robots. Proceedings International Joint Conference on Artificial Intelligence. 2005;19:1187–1192.
  • 81. Braddick O, Atkinson J. Development of human visual function. Vision Res. 2011;51:1588–1609. doi: 10.1016/j.visres.2011.02.018.
  • 82. Johnson MH, De Haan M. Developmental Cognitive Neuroscience: An Introduction. John Wiley & Sons; 2015.
  • 83. Zeveloff SI, Boyce MS. Why human neonates are so altricial. Am Nat. 1982;120:537–542.
  • 84. Montagu MF. An introduction to physical anthropology. 3. Charles C Thomas Publisher; 1960. Time, morphology, and neoteny in the evolution of man; pp. 295–316.
  • 85. Lorenz KZ. The companion in the bird’s world. Auk. 1937;54:245–273.
  • 86. Schmid KL, Wildsoet CF. Assessment of visual acuity and contrast sensitivity in the chick using an optokinetic nystagmus paradigm. Vision Res. 1998;38:2629–2634. doi: 10.1016/s0042-6989(97)00446-x.
  • 87. Deng C, Rogers LJ. Bilaterally projecting neurons in the two visual pathways of chicks. Brain Res. 1998;794:281–290. doi: 10.1016/s0006-8993(98)00237-6.
  • 88. Turkewitz G, Kenny PA. Limitations on input as a basis for neural organization and perceptual development: A preliminary theoretical statement. Dev Psychobiol. 1982;15:357–368. doi: 10.1002/dev.420150408.
  • 89. McCabe BJ. Imprinting. Wiley Interdiscip Rev Cogn Sci. 2013;4:375–390. doi: 10.1002/wcs.1231.
  • 90. Versace E, Spierings MJ, Caffini M, ten Cate C, Vallortigara G. Spontaneous generalization of abstract multimodal patterns in young domestic chicks. Anim Cogn. 2017;20:521–529. doi: 10.1007/s10071-017-1079-5.
  • 91. Wood JN. Newborn chickens generate invariant object representations at the onset of visual object experience. Proc Natl Acad Sci USA. 2013;110:14000–14005. doi: 10.1073/pnas.1308246110.
  • 92. Awh E, Jonides J. Overlapping mechanisms of attention and spatial working memory. Trends Cogn Sci. 2001;5:119–126. doi: 10.1016/s1364-6613(00)01593-x.
  • 93. Thiessen ED, Kronstein AT, Hufnagle DG. The extraction and integration framework: A two-process account of statistical learning. Psychol Bull. 2013;139:792–814. doi: 10.1037/a0030801.
  • 94. Kim R, Seitz A, Feenstra H, Shams L. Testing assumptions of statistical learning: is it long-term and implicit? Neurosci Lett. 2009;461:145–149. doi: 10.1016/j.neulet.2009.06.030.
  • 95. Thiessen ED, Saffran JR. Learning to learn: Infants’ acquisition of stress-based strategies for word segmentation. Lang Learn Dev. 2007;3:73–100.
  • 96. McClelland JL, Rumelhart DE. Distributed memory and the representation of general and specific information. J Exp Psychol Gen. 1985;114:159–188. doi: 10.1037/0096-3445.114.2.159.
  • 97. Miles RC. Delayed-response learning in the marmoset and the macaque. J Comp Physiol Psychol. 1957;50:352–355. doi: 10.1037/h0040257.
  • 98. Miles RC, Meyer DC. Learning sets in marmosets. J Comp Physiol Psychol. 1956;49:212–222. doi: 10.1037/h0045088.
  • 99. Menzel EW, Juno C, Garrud P. Social foraging in marmoset monkeys and the question of intelligence. Proc Biol Sci. 1985:145–158.
  • 100. Garber PA. Role of spatial memory in primate foraging patterns: Saguinus mystax and Saguinus fuscicollis. Am J Primatol. 1989;19:203–216. doi: 10.1002/ajp.1350190403.
  • 101. MacDonald SE, Pang JC, Gibeault S. Marmoset (Callithrix jacchus jacchus) spatial memory in a foraging task: Win-stay versus win-shift strategies. J Comp Psychol. 1994;108:328. doi: 10.1037/0735-7036.108.4.328.
  • 102. Fitch WT. The evolution of speech: a comparative review. Trends Cogn Sci. 2000;4:258–267. doi: 10.1016/s1364-6613(00)01494-7.
  • 103. Greenfield PM. Language, tools and brain: the ontogeny and phylogeny of hierarchically organized sequential behavior. Behav Brain Sci. 1991;14:531–595.
  • 104. Piñon D, Greenfield PM. Does everybody do it? Hierarchically organized sequential activity in robots, birds and monkeys. Behav Brain Sci. 1994;17:361–365.
  • 105. Jordan MI, Rosenbaum DA. Action. In: Posner MI, editor. Foundations of Cognitive Science. MIT Press; 1990. pp. 727–767.
  • 106. Dawkins R. Hierarchical organization: a candidate principle for ethology. In: Bateson PPG, Hinde RA, editors. Growing Points in Ethology. Cambridge University Press; 1976.
  • 107. Seyfarth RM, Cheney DL. The evolution of language from social cognition. Curr Opin Neurobiol. 2014;28:5–9. doi: 10.1016/j.conb.2014.04.003.
  • 108. Doupe AJ, Kuhl PK. Birdsong and human speech: common themes and mechanisms. Ann Rev Neurosci. 1999;22:567–631. doi: 10.1146/annurev.neuro.22.1.567.
  • 109. Petkov CI, Jarvis E. Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates. Front Evol Neurosci. 2012 doi: 10.3389/fnevo.2012.00012.
  • 110. Lilly JC. Vocal mimicry in tursiops: ability to match numbers and durations of human vocal bursts. Science. 1965;147:300–301. doi: 10.1126/science.147.3655.300.
  • 111. Janik VM. Acoustic communication in delphinids. Adv Study Behav. 2009;40:123–157.
  • 112. Connor RC. Dolphin social intelligence: complex alliance relationships in bottlenose dolphins and a consideration of selective environments for extreme brain size evolution in mammals. Phil Trans R Soc B. 2007;362:587–602. doi: 10.1098/rstb.2006.1997.
  • 113. Herman LM, et al. Responses to anomalous gestural sequences by a language-trained dolphin: Evidence for processing of semantic relations and syntactic information. J Exp Psychol Gen. 1993;122:184. doi: 10.1037//0096-3445.122.2.184.
  • 114. Marler P, et al. Vocal communication in the domestic chicken: I. Does a sender communicate information about the quality of a food referent to a receiver? Anim Behav. 1986;34:188–193.
  • 115. Jarvis ED. Selection for and against vocal learning in birds and mammals. Ornithol Sci. 2006;5:5–14.
  • 116. Arnold K, Zuberbühler K. Language evolution: semantic combinations in primate calls. Nature. 2006;441:303–303. doi: 10.1038/441303a.
  • 117. Cäsar C, Byrne R, Young RJ, Zuberbühler K. The alarm call system of wild black-fronted titi monkeys, Callicebus nigrifrons. Behav Ecol Sociobiol. 2012;66:653–667.
  • 118. Cäsar C, Zuberbühler K, Young RJ, Byrne RW. Titi monkey call sequences vary with predator location and type. Biol Lett. 2013;9:20130535. doi: 10.1098/rsbl.2013.0535.
  • 119. Cheney DL, Seyfarth RM, Silk JB. The responses of female baboons (Papio cynocephalus ursinus) to anomalous social interactions: evidence for causal reasoning? J Comp Psychol. 1995;109:134. doi: 10.1037/0735-7036.109.2.134.
  • 120. Rey A, et al. Centre-embedded structures are a by-product of associative learning and working memory constraints: Evidence from baboons (Papio papio). Cognition. 2012;123:180–184. doi: 10.1016/j.cognition.2011.12.005.
  • 121. Lew-Williams C, Saffran JR. All words are not created equal: Expectations about word length guide infant statistical learning. Cognition. 2012;122:241–246. doi: 10.1016/j.cognition.2011.10.007.
  • 122. Clark A. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav Brain Sci. 2013;36:181–204. doi: 10.1017/S0140525X12000477.
  • 123. Fiser J, et al. Statistically optimal perception and learning: from behavior to neural representations. Trends Cogn Sci. 2010;14:119–130. doi: 10.1016/j.tics.2010.01.003.
  • 124. Testolin A, Zorzi M. Probabilistic models and generative neural networks: Towards a unified framework for modeling normal and impaired neurocognitive functions. Front Comput Neurosci. 2016 doi: 10.3389/fncom.2016.00073.
