Philosophical Transactions of the Royal Society B: Biological Sciences. 2012 Jul 19;367(1598):2077–2088. doi: 10.1098/rstb.2012.0073

On the pursuit of the brain network for proto-syntactic learning in non-human primates: conceptual issues and neurobiological hypotheses

Christopher I Petkov 1,2,*, Benjamin Wilson 1
PMCID: PMC3367685  EMSID: UKMS49154  PMID: 22688642

Abstract

Songbirds have become impressive neurobiological models for aspects of human verbal communication because they learn to sequence their song elements, analogous, in some ways, to how humans learn to produce spoken sequences with syntactic structure. However, mammals such as non-human primates are considered to be at best limited vocal learners and not able to sequence their vocalizations, although some of these animals can learn certain ‘artificial grammar’ sequences. Thus, conceptual issues have slowed progress in exploring potential neurobiological homologues of language-related processes in species that are taxonomically closely related to humans. We consider some of the conceptual issues impeding the pursuit of, as we define them, ‘proto-syntactic’ capabilities and their neuronal substrates in non-human animals. We also discuss ways to better bridge comparative behavioural and neurobiological data between humans and other animals. Finally, we propose guiding neurobiological hypotheses with which we aim to facilitate the future testing of the level of correspondence between the human brain network for syntactic learning and related neurobiological networks present in other primates. Insights from the study of non-human primates and other mammals are likely to complement those being obtained in birds to further our knowledge of the human language-related network at the cellular level.

Keywords: language, monkeys, humans, functional magnetic-resonance imaging, hypotheses

1. Introduction

If you can find a path with no obstacles, it probably doesn't lead anywhere.

Frank A. ‘Parson’ Clark, ca. 1963

The path towards understanding the behavioural abilities and neuronal substrates that are evolutionarily related to those that humans use for language has been as challenging as it has been informative. Recently, we have seen considerable advances in modern language theory [1–4] and in our understanding of language-related processes (for recent reviews on the neurobiology of syntax, see Bickerton & Szathmary [5]). Concurrently, work in non-human animals has seen the development of theoretical frameworks on the evolutionary origins of language-related processes [1,6–9]. This has led to an increase in comparative animal studies on ‘artificial grammar learning’ (AGL) [10–12]. As we consider below, AGL paradigms aim to tap into the computational abilities that humans use to learn syntactically structured sequences [9,13,14]. Moreover, songbirds have recently become an important neurobiological model system, in part because they learn their vocalizations and because their song production seems to reveal ‘syntactic-like’ abilities that are in some ways related to how humans learn to produce language with syntactic structure [6,15]. These are all exciting developments, but, arguably, one area that remains relatively underdeveloped is the advancement of mammalian model systems that can provide insights into the cellular mechanisms that might be homologous to those that the human brain uses to support language-related processes. In particular, additional comparative work with non-human primates, although faced with considerable challenges, as we consider in this paper, is needed to inform us about the evolutionary changes that are likely to have occurred within the primate order as language evolved in humans [8]. Interdisciplinary efforts will remain important for advancing future treatments for communication and language disorders, and it is likely that major advances will be difficult to achieve if research efforts are limited to the study of select animal species or to the non-invasive approaches that are normally available for studying humans.1

In this paper, we focus on the conceptual and technical challenges faced in pursuing evolutionary homologues of human syntactic learning in mammals such as non-human primates. We provide a description of what we define here as ‘proto-syntactic’ processes and of how we might study these behaviourally and neurobiologically, in ways that can facilitate comparative testing with humans and other animals. We conclude by reviewing recent perspectives on the structure and function of the human brain network for syntactic processes, and propose several neurobiological hypotheses that consider the possible combinations of behavioural sequencing capabilities and neurobiological substrates with which different non-human primate species might present.

2. A conceptual framework for the pursuit of proto-syntactic capabilities and processes in non-human animals

Syntax can be defined as the ability to learn and to produce grammatical relations between words and word parts in a sentence. However, syntax is not simply the linear sequencing of words (i.e. evaluating the word-by-word relationships between elements in a string). Although we speak and write word-by-word, modern linguistic theory emphasizes that beneath the surface-level of word sequences is an underlying structure, such as hierarchically nested phrases and ‘movement’ (perceived or actual) of syntactic constituents [1,2,5,6,17]. In this section, we consider: (i) two examples of operational definitions of syntactic abilities that could be comparatively studied with non-human animals; (ii) the important distinction between production and learning, the latter of which allows us to ask questions about the learning abilities of animals, which might be better than their vocal production capabilities; and (iii) the idea of an evolutionary gradient in syntactic complexity to help us to understand how human syntactic abilities may have evolved from simpler systems. In this regard, we define proto-syntactic abilities as those that reflect an evolutionary increase in computational processing capabilities, which comparative testing might reveal to have formed an evolutionary basis for human syntactic abilities.

(a). A place to start: creating operational definitions of syntactic abilities for comparative testing

Most definitions of syntax reflect capabilities that are uniquely human, such as the ability to learn to produce and evaluate considerable levels of complexity in the hierarchical structure of sentences. Since no other animals have syntax, grammar, words, sentences, semantics, etc. as they are defined for human language, the first major hurdle for comparative study is to be as clear as possible about the operational definition of the core aspects of some of these abilities that one hopes to study with other animals. Each operational definition will suggest and constrain the ways in which these abilities can be comparatively studied and with which species these could be realistically explored.

As one example, we might be interested in studying a general aspect of syntactic sequencing ability, operationally defined as follows: An aspect of syntactic structure building is present in animals that can learn to produce structural relationships between their individual vocalizations (what we might call ‘syntactic-like’ ability). This definition, however, suggests that we would need to study species of animals that are vocal learners and have communication systems that allow them to combine several of their vocalizations into some sort of a sequence for production. Syntactic-like abilities in non-human animals seem to be closely associated with vocal imitation and vocal learning, such as when songbirds and humpback whales learn to structure their songs. The few animal species known to be vocal learners (humans, songbirds, parrots, hummingbirds, bats, elephants, pinnipeds and cetaceans [8,15,18–22]) have varying degrees of syntactic-like capabilities. Of these groups of animals, not all are being neurobiologically studied; some groups of songbirds have become representative neurobiological animal model systems for vocal production learning and syntactic-like abilities. Moreover, although different songbirds show varying levels of song complexity, the structure of their songs is typically described as exhibiting ‘phonological syntax’ [5,23], where different sequencing combinations of the units do not produce different meanings (in contrast to the ‘semantically compositional syntax’ of human language).

Many mammals, non-human primates included, have a call-based system for vocal communication that lacks the sequencing abilities of songbirds or cetaceans. Most non-human primates are generally thought to produce unitary calls from a limited set of innate or genetically regulated vocalizations, although this perspective is changing somewhat. Recently, Snowdon [24] reviewed the evidence for vocal learning in non-human primates, stating: ‘None of these new results suggest that primates will soon challenge songbirds for vocal virtuosity, but nonetheless the accumulation of results suggests a much greater degree of vocal control and flexibility of production than previously thought’ (see also [8,25]). Moreover, some species of guenons (Old World monkeys) appear to combine their calls into different context-specific call combinations [26,27]. However, as with songbirds, these call combinations lack semantic compositionality [1,2].

Therefore, an operational definition such as the following is required to help us remain empirically grounded regarding the limited vocal production learning and sequencing capabilities of non-human primates: A core aspect of the human syntactic capacity, to learn how sensory elements are appropriately sequenced, might exist in mammals that are able to evaluate whether sequences of auditory or visual elements violate a previously learned structure. This operational definition differs from the one for vocal learners above in two key respects. First, it does not depend on the vocal production capabilities of the animals, which theoretical papers on language evolution have suggested are not necessary [7]. Second, it draws a distinction between learning and production, suggesting that some animals might be able to learn sequences of sensory elements better than they are able to (re)produce them. We evaluate the basis for this claim next.

(b). The distinction between vocal production learning and auditory learning

It is well known that human receptive capabilities can outstrip productive capabilities. Any learner of a second language will be familiar with the feeling that their ability to understand that language exceeds their ability to produce well-formed sentences in it, and we know that infants are sensitive to certain properties of their native language before they can use them [28]. A related distinction is made by comparative scientists between ‘auditory learning’ and ‘vocal production learning’, because many vertebrates are capable of some form of auditory learning although very few species are also vocal production learners [19]. Linguists tend to focus on receptive abilities when they evaluate human language—particularly the ability to differentiate between well-formed and ill-formed (ungrammatical) sentences. However, when scientists look for corresponding abilities in other animals, there is a strong tendency to focus on production [1], such as syntactic-like abilities in songbirds [6,15]. Although many vertebrates are often considered to be vocal non-learners, many of these animals are capable of considerable auditory learning [25,29]. Thus, the extent to which different animal species can learn varying levels of complexity in how sensory elements are temporally sequenced remains an open question, and one that is linguistically relevant.

(c). The notion of an evolutionary gradient of syntactic complexity

The formal language hierarchy (FLH; or extended Chomsky hierarchy [4]) contains several categories of grammar, each describing an increasingly powerful computational language (see figure 1a, which is based on Berwick et al. [6]). Here, lower ranked grammars (e.g. finite-state grammars (FSGs); also referred to as ‘sub-regular’ grammars [4]) generate sets of languages that are subsets of the sets of languages generated by higher ranked grammars. Humans seem to be unique in the animal kingdom in being able to produce languages that reach into the realm of context-sensitive languages [30] (figure 1a). However, as Hurford notes: ‘ … linguists pay little attention to classes of languages of [the] lowly rank on the Formal Language Hierarchy.’ [1]. In our view, this has resulted in poor resolution of the levels of complexity within FSGs that are not uniquely human, and an emphasis on determining whether the status of some non-human animal species can be elevated if they are able to learn context-free patterns from context-free grammars (CFGs; also referred to as ‘supra-regular’ grammars [3,4,31]). Moreover, the interpretation that songbirds can learn CFGs [12,32] has been questioned for several reasons considered in detail elsewhere [6,33,34], leading Berwick et al. [6] to conclude that: ‘Considerable controversy remains as to whether any nonhuman species can truly recognize strictly context-free patterns’. Context-free pattern learning may someday be demonstrated in certain animals [4], yet, even if it is not, it remains important to understand how human capabilities with CFGs and beyond may have evolved from abilities lower in the FLH that are present in other living animals. This requires better resolution of the lower parts of the hierarchy (figure 1) and consideration of the distinction between learning—as a behavioural measure of reception—and production. As we schematize in figure 1b for humans and other species of animals, these two behavioural phenotypes should be distinguished (see [25]).
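To make the computational distinction concrete, the following sketch (illustrative Python, not taken from any of the studies cited) contrasts a simple finite-state pattern, (AB)^n, with the context-free pattern A^nB^n: the former can be accepted by checking only adjacent transitions, whereas the latter requires a counter, i.e. an additional memory resource. The grammars and function names are our own illustrative choices.

```python
def generate_fsg(n_pairs):
    """(AB)^n: a finite-state pattern; each element depends only on the previous one."""
    return "AB" * n_pairs

def generate_cfg(n):
    """A^nB^n: a context-free pattern; the number of Bs must match the number of As."""
    return "A" * n + "B" * n

def accept_fsg(s):
    """Accept (AB)^n by checking only adjacent transitions (a finite-state check)."""
    allowed = {("A", "B"), ("B", "A")}
    if not s or s[0] != "A" or s[-1] != "B":
        return False
    return all((x, y) in allowed for x, y in zip(s, s[1:]))

def accept_cfg(s):
    """Accept A^nB^n: adjacent checks are insufficient; a counter (memory) is needed."""
    count, seen_b = 0, False
    for ch in s:
        if ch == "A":
            if seen_b:            # an A after a B violates the pattern
                return False
            count += 1
        elif ch == "B":
            seen_b = True
            count -= 1
            if count < 0:         # more Bs than As so far
                return False
        else:
            return False
    return seen_b and count == 0

if __name__ == "__main__":
    print(accept_fsg(generate_fsg(3)))   # True
    print(accept_cfg(generate_cfg(3)))   # True
    print(accept_cfg("AABBB"))           # False: counts do not match
```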

Figure 1.

Formal language hierarchy (FLH) distinguishing learning versus production and the notion of a gradient of syntactic complexity. (a) Schematic of the FLH, based on Berwick et al. [6]. (b) Our illustrated distinction between learning and production in relation to the FLH, highlighting considerable uncertainty about how high human and other animal learning (rather than production) capabilities can reach in the hierarchy; see text. (c) Schematic for quantifying different dimensions of syntactic complexity: a measure of linearity on the vertical axis as a function of increasing memory demands on the horizontal axis. Other ways of quantifying complexity in different dimensions would also be useful to test. At the lowest level are single element/state systems, followed by multi-state linear systems with (i) only ‘adjacent relationships’, (ii) forward branching systems including ‘non-adjacent relationships’ that some animals might be able to learn, and (iii) state repetitions that might tap into numerosity sensitivity. Higher still are ‘state chains’ that cannot be solved by first-order Markov models; here the transition following each ‘a’ element depends on the preceding state transition. This process is second-order Markov, whereas all other transitions are first-order Markov processes [1].

How might the ability to generate context-free languages or beyond have evolved? One possibility is that when the ancestors to living humans began to organize vocalizations and then words into sentences of greater complexity, this built upon the evolutionarily conserved ability to process sets of serially ordered strings. Then at some point selective pressures to reduce memory demands may have expanded syntactic capabilities by the adoption of rule-based learning strategies that avoid having to memorize all the elements and transitions in the sequences from more complex grammars [35]. In this regard, we are motivated by Hurford's attempts to resolve several of the stages below CFGs in reference to the various levels of complexity seen in the songs of songbirds and humpback whales [1]. We will expand on some of his ideas to illustrate our proposed notion of an evolutionary gradient of syntactic complexity. See Jäger & Rogers [4] for other approaches to resolve the sub-regular grammar space in the extended Chomsky hierarchy.

One of the simplest scenarios is for a system to recognize and/or to generate single elements. Such is the case for animals with call-based systems that can produce and recognize single vocalizations from a limited set (figure 1c). The next level of sequencing complexity is introduced when two calls are combined, where it then becomes important to evaluate the ‘adjacent relationships’ between element pairs. The subsequent level of complexity occurs when several elements are serially sequenced in a purely linear fashion. An example of this is the linear song components of, for example, zebra finch songs [36], where the pairwise transitions can be modelled by a first-order Markov process [1] (figure 1c). Adding more elements or transitions does not change the computational complexity of the pairwise sequencing process, but requires a larger indexical memory store. Some songbirds, such as Bengalese finches, nightingales and chaffinches, and humpback whales have songs that show sequencing elaborations such as forward or backward branching relationships, or repetitions of an element within a range of acceptable repetitions. While it is not always clear which of these would be hierarchically higher than the others in terms of syntactic complexity (however defined), these sorts of transitions deviate from strictly linear processes [37], although they still only require first-order Markov processes to model them (figure 1c). Of special interest are the branching transitions, since these can be modelled either as a number of adjacent relationships or as more complex ‘non-adjacent relationships’, where an optional element can occur between two other elements with some probability. The recognition of non-adjacent relationships can reduce the need to memorize many pairwise transitions if the non-adjacency ‘rule’ can be learned. For adult humans, non-adjacent relationships can include even greater levels of complexity (e.g. nested or crossed relationships [38,39]). Moreover, the ability to deal with non-adjacent relationships is not present at birth but seems to emerge during infant development [40,41]. As a final example of another level of syntactic complexity (figure 1c), Hurford notes the special case of the same element occurring in multiple parts of a sequence, where the transition that follows it depends on the preceding state transition, a so-called ‘state chain’ process [1]. Such transitions require higher order Markov models, although much of the rest of the sequence could remain a first-order Markov process.
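To illustrate why a ‘state chain’ exceeds a first-order Markov process, the hypothetical sketch below fits transition tables of different orders to a toy ‘song’ corpus; the element names and the song structure are invented purely for illustration. With only first-order (pairwise) contexts, the element ‘a’ is ambiguous, whereas second-order contexts recover the dependency on the preceding transition.

```python
from collections import defaultdict

def fit_markov(sequences, order=1):
    """Count which element follows each context of the preceding `order` elements."""
    table = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])
            table[context][seq[i]] += 1
    return table

# Toy 'state chain': the element 'a' occurs twice per song, and what follows it
# depends on what preceded it (x-a is followed by b, y-a is followed by c).
songs = [list("xabyac") for _ in range(10)]

first = fit_markov(songs, order=1)
second = fit_markov(songs, order=2)

print(dict(first[("a",)]))        # {'b': 10, 'c': 10} -> first-order context is ambiguous
print(dict(second[("x", "a")]))   # {'b': 10}          -> second-order context disambiguates
print(dict(second[("y", "a")]))   # {'c': 10}
```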

We hope that these examples help to illustrate the great variety seen in animal song production, which can be usefully applied towards quantifying the structural complexity of different artificial grammars prior to using them in comparative tests with different animal species. It would benefit many if the scientific community worked together to rank the complexity of these structures along different dimensions (using quantitative rather than qualitative descriptions wherever possible); a simple illustration of what such quantification might look like follows this paragraph. Subsequently, the learning abilities of animals can be evaluated along the various dimensions of ‘syntactic complexity’ to advance our understanding of the evolutionary bases of human syntactic abilities. It remains possible that the evolution of syntactic complexity was step-wise rather than, as we have proposed, a gradient function. Yet, if the pursuit is informative regarding how language may have evolved, we welcome the testing of different alternative hypotheses. As we will discuss in §4, there is already a basis for considering syntactic complexity in the human cognitive neuroscience literature, where, for instance, the comparison of adjacent versus non-adjacent relationships (broadly defined) seems to predict which parts of the human language network are engaged [42].
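One way such quantitative ranking might look, purely as an illustration, is sketched below. The two metrics (a transition count as a proxy for memory demand and a branching factor as a proxy for deviation from linearity) are our own hypothetical choices, not measures proposed in the cited work.

```python
def complexity_metrics(grammar):
    """Two illustrative (hypothetical) complexity dimensions for a transition grammar.

    `grammar` maps each element to its list of legal successors.
    - memory_demand: total number of distinct pairwise transitions to be stored
    - branching_factor: mean successors per non-terminal element (1.0 = strictly linear)
    """
    n_transitions = sum(len(succ) for succ in grammar.values())
    n_nonterminal = sum(1 for succ in grammar.values() if succ)
    return {"memory_demand": n_transitions,
            "branching_factor": n_transitions / n_nonterminal}

# A strictly linear song versus one with a forward branch.
linear   = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
branched = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print(complexity_metrics(linear))    # {'memory_demand': 3, 'branching_factor': 1.0}
print(complexity_metrics(branched))  # {'memory_demand': 4, 'branching_factor': 1.33...}
```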

3. Obtaining comparative data on artificial grammar learning: implicit versus explicit learning

Classically, the behavioural approach has been a tool of choice for comparative biologists and psychologists. However, even behavioural testing is challenging to apply in the same way across species that may have different forms of communication, different levels of motivation, varying abilities to engage in behavioural testing and that may find different methods of providing responses more natural than others. Combining behavioural study with neurobiological measurements that can be performed in a similar way across the species escalates the obstacles to success. Yet, bridging techniques and approaches are required to link research based on the study of different species. In this section, we consider (i) how AGL can be used to study implicit or explicit learning processes and (ii) several approaches in which behavioural and neurobiological data can be similarly obtained across species to facilitate comparative testing.

The use of AGL paradigms is a promising approach for understanding what aspects of syntactic-related patterns can be learned by animals. Following Chomsky's theoretical formulations of the structure of language [17], Reber pioneered the use of artificial language paradigms to study how humans learn language structure [10]. AGL paradigms have been used to explore the types of structures that humans (including infants and adults), songbirds, non-human primates and rodents can learn [33,43–47]. However, there are differences in how some of these study groups have been tested, such that different learning substrates might have been engaged.

The infant and non-human primate data have tended to be obtained using paradigms that rely on the implicit learning of artificial grammars, which is often studied by measuring preferential looking during habituation/dishabituation paradigms [11,14,44,48]. Typically, these experiments are conducted by familiarizing the individual for some length of time with exemplary sequences of stimuli that follow the artificial-grammar pattern or rule(s) [11,12,14,33,44,45,48]. Then, in the second ‘testing phase’ of the experiment, the individual is tested with well-formed ‘correct’ or ‘violation’ sequences while natural responses are measured, such as preferential looking towards the audio speaker that presented the test sequence. In this way, the familiarization and testing need not engage perceptual awareness for learning to have occurred, i.e. implicit learning [11,44,45,48]. However, in the bird and rodent studies, the participants were trained to discriminate correct versus violation sequences, which could engage an explicit rather than implicit learning system [12,33,46,49]. Similarly, in many of the human studies [43,50,51], either during the familiarization phase or during the testing phase, the participants were engaged in learning the sequencing structure of the artificial grammar by being asked to judge whether the sequences were correct or violation sequences. When participants are actively seeking to determine the artificial-grammar pattern, there is a risk that they might fail to learn some of the sequencing relationships after the point at which they feel that they have sufficiently understood the pattern and are performing reasonably well. Such explicit learning could engage different brain circuits [52] relative to studies of AGL that use implicit learning (such as those in infants and non-human primates).
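As a concrete illustration of how familiarization and test material can be constructed for such a habituation/dishabituation paradigm, the sketch below generates well-formed sequences from a small finite-state grammar and derives ‘violation’ sequences from them by swapping adjacent elements. The grammar and nonsense-syllable names are hypothetical stand-ins, not the materials used in the studies cited.

```python
import random

# Hypothetical finite-state grammar: each element lists its legal successors;
# '#' marks the end of a sequence. Syllable names are invented for illustration.
GRAMMAR = {
    "START": ["wo", "pa"],
    "wo":    ["ki", "pa"],
    "pa":    ["ki"],
    "ki":    ["mu", "#"],
    "mu":    ["#"],
}

def generate_correct(rng):
    """Follow legal transitions from START until the end marker is reached."""
    seq, state = [], "START"
    while True:
        state = rng.choice(GRAMMAR[state])
        if state == "#":
            return seq
        seq.append(state)

def make_violation(seq, rng):
    """Swap two adjacent elements; in this grammar, any such swap breaks a legal transition."""
    s = list(seq)
    i = rng.randrange(len(s) - 1)
    s[i], s[i + 1] = s[i + 1], s[i]
    return s

rng = random.Random(0)
familiarization = [generate_correct(rng) for _ in range(20)]   # exposure phase
test_correct    = [generate_correct(rng) for _ in range(5)]    # well-formed probes
test_violation  = [make_violation(seq, rng) for seq in test_correct]
```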

More recently, the groups of Hagoort and Petersson have worked to engage adult humans in more implicit learning paradigms, whereby little instruction is given to participants during testing other than to report, by pressing one of two buttons, their preference for a test sequence (i.e. whether they ‘liked’ the sequence or not). Subsequent to this, the participants were asked to make ‘grammaticality’ judgements, both to validate the preference judgements and to engage explicit learning [38,39]. Interestingly, implicit and explicit AGL are reported as yielding fairly comparable results. Both seem to engage the inferior-frontal gyrus (IFG), e.g. Broca's territory (Brodmann areas (BA) 44/45), as has been reported in several other human AGL or natural language-learning studies.

When comparing data with animals such as non-human primates that are limited vocal learners, an advantage of using implicit rather than explicit learning of artificial-grammar sequences is that it avoids engaging aspects of the network that, in vocal learners such as humans and songbirds, might form part of the circuitry for vocal production. Implicit learning might be better able to distinguish perception from motor production in the service of perception, by reducing the ability of vocal learners to rely on sub-articulation, imitation, etc. to assist in the perception of syntactic sequences. Otherwise, several aspects of the networks that support syntactic or syntactic-like learning in vocal learners would, by comparison with vocal non-learners, appear to be strikingly different (e.g. unique to humans or songbirds). For a more detailed discussion of the similarities and differences in the behaviour and neurobiology of vocal learners (such as songbirds and humans) and other animals with more limited vocal learning abilities (such as non-human primates and other birds), see Petkov & Jarvis [25].

Another way in which comparative testing can be facilitated is to use similar behavioural and neurobiological measurements in humans, infants and non-human animals. For instance, for behavioural testing, infra-red eye tracking has become more widely available in scientific laboratories and can be used to evaluate preferential looking responses after habituation to artificial-grammar sequences. This is shown for macaques in figure 2a,b and can be comparably conducted in adult humans, infants and other types of monkeys, such as marmosets (figure 2c–f). Apart from the advantage of using eye tracking to measure implicit learning similarly across participant groups, the approach also offers a more objective way to analyse behavioural data than the traditional approach of manually rating the animals’ responses as captured on video, which has been criticized [34]. Other groups have opted to use brain potentials, both to obtain neurobiological data after AGL and to evaluate whether, for instance, infant brain potentials show a signature of learning [40].
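To illustrate how such preferential-looking data might be summarized, the sketch below computes the time each gaze trace spends inside an analysis region around the presenting audio speaker and averages it per condition. The data structures, region format and sampling rate are hypothetical and are not the authors' analysis pipeline.

```python
import statistics

def looking_time(trace, region, sample_rate_hz=120.0):
    """Total time (s) the gaze falls inside a rectangular analysis region.

    `trace` is a list of (x, y) gaze samples; `region` is (xmin, xmax, ymin, ymax).
    The sampling rate is an assumed, illustrative value.
    """
    xmin, xmax, ymin, ymax = region
    inside = sum(1 for x, y in trace if xmin <= x <= xmax and ymin <= y <= ymax)
    return inside / sample_rate_hz

def compare_conditions(correct_trials, violation_trials, region):
    """Mean looking time towards the presenting speaker for each condition."""
    correct = [looking_time(t, region) for t in correct_trials]
    violation = [looking_time(t, region) for t in violation_trials]
    return statistics.mean(correct), statistics.mean(violation)
```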

Figure 2.

Eye-tracking measurement of implicit artificial grammar learning. (a) Schematic of a behavioural eye-tracking experiment with monkeys in our laboratory. The monkey sits in front of a monitor and, after a brief central fixation period, an auditory test sequence is randomly presented from the left or right audio speaker. The length of time spent looking into the predefined analysis region around the presenting audio speaker is measured. (b) Exemplary eye-traces towards correct (‘grammatical’) and violation (‘ungrammatical’) sequences. Positive values in the plot indicate eye movements towards the test speaker location, whichever audio speaker it was; negative values are looks in the opposite direction, away from the presenting audio speaker. (c–f) Exemplary non-invasive infra-red eye tracking of (c) an adult human, (d) a human infant (image courtesy of J. Read), (e) a macaque and (f) a marmoset.

Many neuroscientific studies are conducted in anaesthetized animals. However, comparative AGL studies using evoked potentials or brain neuroimaging depend on the animals being studied awake, rather than anaesthetized. Technical advances have made it possible to accommodate non-human animals so that they can be scanned awake with functional magnetic-resonance imaging (fMRI), which is often used to scan humans [53,54]. Moreover, although the gradient systems of MRI scanners generate a considerable amount of noise, animal MRI studies often use strategies to reduce the impact of scanner noise on the animals and to improve auditory activity responses during sound stimulation [55,56]. Recent fMRI and positron emission tomography (PET) studies have described how the brains of non-human primates process communication signals (monkeys [57–60]; chimpanzees [61]). General summaries are now available on how the results in monkeys and apes relate to how the human brain processes species-specific communication signals [62,63]. In this way, testing for the level of correspondence across the species, rather than assuming that it exists, provides a stronger bridge between human neuroimaging work and studies in certain species of non-human primates, where, for instance, the processing of communication signals can be studied at the neuronal level [64–66]. As a specific example, an fMRI-based correspondence has recently been suggested between how human [67,68] and monkey [57] brains process voice content in communication sounds; see Petkov et al. [62]. Subsequently, fMRI-guided electrophysiology was used in the monkeys to target fMRI-identified voice-sensitive brain clusters, which, when studied, seemed to reveal ‘voice cells’ in the primate brain [69]. A similar two-stage approach—linking human neuroimaging results on language-related processes using a bridging technique, followed by neuronal-level study of potential homologues in an animal model system—could provide novel insights into the cellular function of evolutionarily conserved regions that in humans evolved to support language-related processes.

4. Neurobiological hypotheses on the proto-syntactic learning network in monkeys

There is a growing consensus among scientists that the prominent brain regions in humans engaged in syntactic processes include the left inferior and middle frontal cortex, large parts of the superior and middle temporal cortex, parts of the parietal cortex and subcortical regions such as the basal ganglia, as well as a number of these same regions in the right hemisphere [42,70]. Many of these brain regions appear to be engaged both during syntactic processing of natural language [71,72] and when human participants evaluate artificial-grammar sequences [13,38,43,50]. Thus, a considerable amount of language-related processing does not appear to be strictly language-specific. Friederici [42] has recently proposed an extensive model integrating information on the structure, function and connectivity of the human brain network that subserves language processing. Important to this model is how different behavioural demands can engage different aspects of the language network [42]; thus, we next overview some of the key concepts that are relevant for neurobiological hypotheses of proto-syntactic networks in non-human primates. For other models, including those that focus on human speech processing and their relevance to brain pathways for auditory processing in primates, see [73,74].

  • — Several language pathways. Human semantic and syntactic processing engages several brain pathways: two dorsal pathways link posterior temporal and parietal lobe regions with either premotor cortex BA 6 (dorsal pathway I; via the superior longitudinal fasciculus (SLF)) or BA 44 in Broca's territory (dorsal pathway II; via the arcuate fasciculus, a part of the SLF). Two ventral pathways are hypothesized to link anterior supra-temporal lobe regions with either BA 45 in Broca's territory (ventral pathway I; via the extreme capsule (EC) fibre system) or the frontal operculum (FOP) area below BA 44/45 (ventral pathway II; via the uncinate fasciculus (UF)).

  • — Syntactic complexity demands on the network. For initial syntactic structural analysis, the FOP and ventral pathway II are engaged (including for finite-state grammars such as (AB)^n that monkeys and songbirds appear able to learn [11,12,50]). Dorsal pathway II (arcuate fasciculus) and BA 44 are critical for syntactic function, such as evaluating hierarchical structure and ‘non-adjacent relationships’ of various types [13,43]. Dorsal pathway II (to BA 44) and ventral pathway I (to BA 45) are engaged in semantic and syntactic relationships or syntactic movement (e.g. evaluating whether a sentence structure is subject–verb–object versus object–subject–verb). Higher memory demands and longer distance non-adjacent relationships engage Broca's territory (BA 44 in particular) and dorsal pathway I to premotor cortex. However, dorsal pathway I is primarily involved in sensory-to-motor mapping.

  • — Left-hemisphere dominance; subcortical structures can be engaged. The syntactic/semantic network in frontal cortex tends to be left lateralized; see also [72,75]. The right hemisphere is thought mainly to subserve functions such as the prosodic and emotional aspects associated with linguistic comprehension. Subcortical structures such as the hippocampus and basal ganglia can be differentially engaged, relative to, e.g. BA 44, at different stages of syntactic learning [76].

Based on these considerations, several hypotheses can be articulated that consider the level of complexity that non-human primates are capable of learning and the neurobiological regions and pathways that might be engaged. For clarity of illustration, in figure 3 we subdivide the likely AGL capabilities into abilities for evaluating adjacent relationships alone or together with non-adjacent relationships. See §2 and figure 1 for other aspects of syntactic complexity that could also be useful for testing. Moreover, since we are considering AGL of the temporal structure of sensory elements, it is an open question whether, all things being equal, all presumed homologues of the pathways that have been described in humans would be engaged (e.g. dorsal pathway I to premotor cortex, which is engaged in sensory-to-motor mapping, might not be involved in this case). Also, although traditionally the dorsal arcuate fasciculus is considered the classical language pathway linking Broca's and Wernicke's territories, the ventral pathway(s) and their role in language processes are being emphasized by some groups [77–79]. However, although the ventral UF and EC pathways are anatomically evident in non-human primates, our hypotheses at this point only make predictions about the EC pathway, since it is the primary ventral fronto-temporal tract that can currently be resolved with in vivo connectivity studies of the IFG in monkeys and apes [80,81].

Figure 3.

Hypothetical proto-syntactic learning capabilities and neurobiological substrates in monkeys. (a) Hypothesis 1 illustrates a ventral pathway linking the supratemporal plane with inferior frontal cortex. Here, the animals are only able to learn adjacent relationships in finite-state grammars (FSGs). (b) Hypothesis 2 illustrates a dorsal pathway supporting the learning of FSGs. (c) Hypothesis 3 illustrates the reliance on multiple pathways and regions depending on the complexity of the FSG patterns that can be learned (e.g. for adjacent relationships, a ventral pathway; for non-adjacent relationships, a dorsal pathway and/or a different part of the ventral pathway). (d) Variations on these hypotheses are discussed in the text. AC, auditory cortex; aSt, anterior striatum; Gp, globus pallidus; vF4/vF5, ventral frontal cortical areas F4 and F5 [82]; VL, ventro-lateral thalamus; 44/45, Brodmann areas 44/45.

(a). Hypothesis 1: ventral pathway for proto-syntactic learning

There is evidence that tamarin monkeys are able to learn adjacent relationships in FSGs, but are insensitive to violations of more complex grammatical patterns [11]. Also, human fMRI results suggest that the processing of such adjacent relationships engages the FOP more so than Broca's territory [50]. However, in humans the processing of various sorts of non-adjacent relationships in artificial grammars [13,50], including those with hierarchical structure [43], engages at least Broca's territory, e.g. BA 44. Thus, one hypothesis is that the involvement of Broca's territory (and the dorsal SLF pathway) would not be seen in non-human primates [9,50], especially if the animals are not capable of evaluating non-adjacent relationships. In this scenario, when evaluating adjacent relationships or simpler syntactic-related relationships, both humans and some species of non-human primates might engage a ventral pathway (EC and/or UF) interconnecting anterior temporal lobe regions and frontal cortical areas that are inferior to BA 44/45 (e.g. in monkeys, the frontal opercular areas or areas vF5/F4 [82]). We illustrate this scenario in figure 3a (hypothesis 1: ventral pathway). If the non-human primates are able to evaluate non-adjacent relationships and engage the ventral pathway to do so, then this would suggest that the human dorsal pathway involving the arcuate fasciculus differentiated during language evolution to support increasing syntactic complexity, as Rilling et al. [80] have suggested.

(b). Hypothesis 2: dorsal pathway for proto-syntactic learning

A second hypothesis is that the processing of FSGs with only adjacent relationships engages monkey homologues of BA 44/45 and the dorsal SLF pathway, as illustrated in figure 3b (hypothesis 2: dorsal pathway). This would also suggest that the dorsal pathway differentiated after the split from a common ancestor to support the learning of greater syntactic complexity in humans.

(c). Hypothesis 3: multiple pathways in non-human primates for proto-syntactic learning depend on syntactic complexity

A third hypothesis is that different brain regions and pathways are engaged depending on the complexity of the grammars that can be learned. For instance, any combination of the following might be possible: (i) parts of the ventral pathway linking temporal lobe regions to monkey homologues of the human FOP are engaged in the processing of adjacent relationships in FSGs; (ii) the dorsal pathway is relied on for processing greater complexity in FSGs, such as non-adjacent relationships [13]; and/or (iii) different parts of the ventral pathway are engaged in evaluating either adjacent or non-adjacent relationships (figure 3c; hypothesis 3: multiple pathways). The combination of these scenarios in monkeys might be viewed as the most comparable to how the human brain processes syntactic complexity, but there could be subtle differences. For instance, would the processing of comparable adjacent and non-adjacent relationships in artificial grammars engage a broader set of regions in frontal cortex in monkeys? If so, this could suggest a different form of functional differentiation during human language evolution from the ones considered for the other hypotheses above. For example, the ventral pathway and BA 45 might in humans have had to differentiate to support the combination of semantic and syntactic relationships [42].

(d). Other variants and hypotheses

The human syntactic learning network is also not entirely left lateralized [70], nor is the processing of communication sounds in humans, chimpanzees or monkeys [62,63]. Thus, it is possible that the right hemisphere in non-human primates might show some of the homotopic regions and connectivity illustrated here for the left hemisphere. Also, for brevity, the hypotheses of figure 3 do not illustrate the possible greater or lesser reliance on subcortical structures (such as the striatum and basal ganglia) or cerebellum to support, for instance, the implicit learning of artificial-grammar sequences. Fitch [83] proposed three interesting hypotheses regarding how the human syntactic network might differ from ancestral variants present in living non-human animals. First is the notion that human vocal learning involves a direct pathway between the regions required for vocal learning and the laryngeal motoneurons in the nucleus ambiguus in the brainstem. As suggested in §3 above, we would not expect the vocal production pathway to be engaged in (at least) the implicit learning of artificial-grammar sequences in non-human primates; for more details, see Petkov & Jarvis [25]. The second Fitch hypothesis regarding the specialization of the arcuate fasciculus [80] is considered in detail above. The third of the hypotheses considers the architectonic and other specializations of Broca's territory, e.g. BA 44, which, if present, might be evident in differences in the neurobiological activity and/or connectivity patterns between humans and monkeys in relation to their behavioural capabilities.

In summary, it is possible that humans engage at least Broca's territory and a dorsal pathway to process grammatical complexity in a way that may not be evident in non-human primates (hypothesis 1 in figure 3). Other possibilities are that monkeys may engage homologues of BA 44 and parts of the dorsal SLF tract for grammars perceived as simple by humans (hypothesis 2), or that there is a general correspondence between how human and monkey brain networks evaluate artificial-grammar complexity (hypothesis 3), with more or less subtle differences in hemispheric lateralization and/or cortical and subcortical engagement. It remains to be seen how a proto-syntactic network in monkeys would compare with the network in humans that subserves syntactic learning.

5. Conclusions

At least conceptually, the approach with non-human primates, and possibly also the one that might be taken with other so-called ‘vocal non-learning’ animals, must differ from the approaches being taken with vocal learning animals, such as songbirds. On the other hand, the comparative testing of behaviour and neurobiology needs to be done as similarly as possible across the species so that data can be compared. We have aimed to build on the efforts of the international scientific community to understand the origins of language and to open new pathways for pursuing language homologues in non-human animals that tend to be dismissed from consideration. Work has also begun to refine the comparative behavioural testing of humans and non-human animals on AGL paradigms, and we have begun to obtain initial results on monkey AGL with fMRI in our laboratory [84]. The constraints imposed by working with animals that are limited vocal learners can also be positively viewed as providing important insights, guidance and predictions regarding the ancestral state of the human language-related network and its generic processing capabilities. Thus, the comparative approach remains important for understanding language evolution and for the development of useful animal model systems with which to study the evolutionarily conserved aspects of the human language-related network at the cellular and molecular levels.

Acknowledgements

We thank W. T. Fitch and the co-organizers A. Friederici and P. Hagoort for the invitation to present at the workshop in Nijmegen upon which this paper is based (supported by ERC grant SOMACCA to W.T.F.). We thank W. Levelt for useful discussions at the workshop and our collaborator on these projects K. Smith, who went beyond the call of duty in commenting on and discussing previous versions of the manuscript. J. Read provided the infant eye-tracking image. Financial support was provided by the Wellcome Trust (C.I.P.).

Endnote

1. In some cases, neuronal studies in humans are possible [16]. However, great care is required when interpreting results from clinical patients, because the recordings either involve or neighbour pathological regions that are being monitored for neurosurgical resection. In all cases, information from cellular and molecular studies in animals can enhance the data from neuronal-level studies in humans, provided that the level of correspondence across the species is tested using a common bridging technique.

References

