Abstract
Sequence learning underlies many uniquely human behaviours, from complex tool use to language and ritual. To understand whether this fundamental cognitive feature is uniquely derived in humans requires a comparative approach. We propose that the vicarious (but not individual) learning of novel arbitrary sequences represents a human cognitive specialization. To test this hypothesis, we compared the abilities of human children aged 3–5 years and orangutans to learn different types of arbitrary sequences (item-based and spatial-based). Sequences could be learned individually (by trial and error) or vicariously from a human (social) demonstrator or a computer (ghost control). We found that both children and orangutans recalled both types of sequence following trial-and-error learning; older children also learned both types of sequence following social and ghost demonstrations. Orangutans' success individually learning arbitrary sequences shows that their failure to do so in some vicarious learning conditions is not owing to general representational problems. These results provide new insights into some of the most persistent discontinuities observed between humans and other great apes in terms of complex tool use, language and ritual, all of which involve the cultural learning of novel arbitrary sequences.
This article is part of the theme issue ‘Ritual renaissance: new insights into the most human of behaviours’.
Keywords: social learning, sequence learning, ritual, children, apes, ghost control
1. Introduction
Sequences are pervasive features of human thoughts and actions. As such, they underlie many uniquely human traits, including complex tool use, language and ritual. Consider the sequences involved in everyday actions like making tea in the morning, sending a text message and celebrating a friend's birthday. In some cases, the thoughts and corresponding actions are causally yoked (e.g. making tea). In others, they are constrained by linguistic rules and communicative norms (e.g. text versus spoken messages). Still others are governed by cultural conventions as well as idiosyncratic considerations (e.g. birthday celebrations) that are causally opaque and ‘goal demoted’ (i.e. it is difficult for an observer to discern the objective [1–3]). Ritual sequences include these last two features, which distinguish them from instrumental sequences, such as tool use, that are causally meaningful and exhibit clear goals [1,2].
Learning novel and arbitrary sequences is not unique to humans. In fact, the learning of causally opaque, serially organized responses appears to be widely shared in the animal kingdom. Animals as different as pigeons, rats, rhesus monkeys [4], chimpanzees [5] and orangutans [6] can learn sequences of arbitrarily related items. Macaques and human adults learn sequences using the same cognitive and inferential processes [4,7]. Like humans, macaques evidence increasing expertise when individually learning arbitrary sequences, demonstrating greater accuracy and more rapid acquisition with each new list as list length increases [7]. They also use transitive inference (rather than associative weight) when learning the serial position of novel items in a sequence [8]. However, all of these studies involved individual, direct, trial-and-error learning, not social or vicarious learning. Sensu Bandura et al. [9] and Renner et al. [10], vicarious learning happens as a result of exposure to events that are neither generated nor directly experienced by the learner, that is, usually from a conspecific or the environment.
Both monkeys and apes can socially learn single or familiar responses, including causally relevant sequences of familiar actions (for reviews, see [11,12]). There is a growing consensus that when causally irrelevant actions are added to sequences, children faithfully copy them, while great apes omit them, copying only the causally relevant intentional actions [13–15]. To date, only one study has shown the social learning of novel sequences in primates: rhesus monkeys, with years of expertise, successfully imitated the serial position of at least two (out of four) items in a novel arbitrary sequence demonstrated by another monkey [16]. The issue of expertise raises the question: is the poor performance of non-human primates in novel social learning tasks owing to issues associated with task difficulty (e.g. encoding and recalling particular types of responses)? After all, if one cannot learn how to solve a given task via individual learning, one may also be unable to do so vicariously or socially.
This is not to say that individual and social learning are always interdependent. A growing body of research has shown that the social and individual learning of novel sequences are dissociable skills in humans [17–21]. For example, various studies with preschool-aged children have now shown that the imitation of item-based sequences, involving responses to distinct items that are arbitrarily related (e.g. ambulance → bird → crown; figure 1a), is not correlated with the imitation of similarly arbitrary spatial-based sequences (e.g. right → bottom → left; figure 1b) within subjects [19,20]. Moreover, individual differences associated with learning each sequence type by trial and error do not predict variation in learning either sequence type by imitation [20]. In other words, being a good independent learner does not necessarily make one a good imitator. This pattern of results has led Subiaul [11,23] and Subiaul et al. [20] to hypothesize that while the individual learning of arbitrary sequences may be widely shared in the primate order, the ability to vicariously learn such sequences may be phylogenetically restricted to humans. Heyes [24, p. 4] has made a similar point, arguing that ‘even when [animals] get the experience necessary…[they] are limited in their capacity to imitate new sequences of action’.
What exactly is it that makes humans exceptional sequence imitators relative to non-human species? Do humans have a general facility for vicariously acquiring information from the environment, regardless of what is learned or from where? Or is this facility linked to particular content types and sources? For example, if a live model or social demonstrator were removed, but their actions and/or their effects preserved, would humans learn nonetheless? And would learning in such a condition differ from that in one involving a live agent? If humans are specialized vicarious/social learners, then learning in social conditions, as well as in vicarious conditions without a live agent (i.e. ghost control for affordance learning [12]), should be better than that in individual conditions. However, if social and asocial learning are not independent [25,26], then there may be no difference between learning in social and individual conditions. Alternatively, one might reasonably predict that performance following individual learning may be better than that following vicarious learning owing to the direct experience of actions and feedback. While research has shown significant dissociations between social and individual learning in sequencing tasks in human children [19,20,27], to our knowledge, there is no comparable evidence with non-human primates. Such evidence is necessary to address the question of cognitive specialization.
Here, we investigate these questions using well-established touchscreen-based sequence learning tasks. Touchscreen tasks use familiar responses in novel ways and allow for within-subject comparisons, something that is impossible to do with object-based tasks involving serial actions or events (e.g. [10,14,28]). Given that most complex responses confound seriating item-specific information (i.e. which objects are relevant when in an event) with spatial-specific information (i.e. when and where in space objects are placed), coupled with the fact that the brain independently processes what and where information [29], we employed two sequencing tasks that isolate these components. In the item-specific task (hereafter called the cognitive task), participants must select three different pictures in an item-specific order, ignoring their spatial locations. In the spatial-specific task (hereafter the spatial task), participants must select three identical pictures in a spatial-specific order, ignoring their identity. By comparing orangutans with preschool children, our study provides unique insights into the underlying cognitive similarities and/or differences between species in sequence learning under conditions that vary in the amount and type of vicarious input.
We selected preschool-aged children (3 and 5 years old) as a comparison group for two reasons. First, we wanted to minimize the effect of formal education (which emphasizes complex sequence learning). Second, previous studies have shown that by the age of 4.5, children evidence robust imitation in both the cognitive and spatial tasks. However, younger age groups (less than 3.5 years) evidence a mosaic social learning pattern (e.g. copying in the cognitive but not the spatial task). Subiaul [11,23,30] has argued that the social and imitation learning skills of non-human primates may show a similar mosaic pattern.
We examine the following four hypotheses (see the electronic supplementary material, table S7 for a summary): (i) concerning how vicariously acquired information compares to individually acquired information: if humans or orangutans have a specialization for vicarious learning compared to individual learning, we would expect an advantage in performance in vicarious (Social or Ghost) over individual (Recall) learning conditions; (ii) regarding the specificity of vicarious learning: if human or orangutan vicarious learning is narrowly specified for social information (obtained from agents), we would expect an advantage in performance in Social over Ghost conditions; (iii) considering how task difficulty or disinterest (i.e. motivation) may confound social learning performance: if orangutans fail to perform at above-chance levels in any learning condition, we would conclude that the apes lacked the necessary motivation or the tasks were too difficult to be useful in evaluating learning competence; and (iv) finally, on the relationship between social, vicarious and individual learning in children and orangutans: if individual or vicarious learning scaffolds the development of social learning in particular tasks or conditions, they should predict social learning performance.
2. Experiment 1: children
(a). Material and methods
(i). Tasks
Two touchscreen-based tasks were used in the present study: the cognitive task and the spatial task. In the cognitive task [4,31,32], three different images appear on the screen in various locations within a 4 × 4 grid (gridlines are not visible). To solve the task, the images must be touched in a certain order governed by the contents of the pictures: for example, ambulance → bird → crown (figure 1a). After each trial, the image locations are shuffled around in the grid, but the correct order (governed by picture contents) remains the same. These rules require participants to learn the sequence based on image content rather than spatial location.
The spatial task (figure 1b; [19,20]) is similar to the cognitive task, with the following exceptions: (i) the three picture items are identical within a trial, but change across trials; (ii) the images' locations remain fixed from trial to trial; and (iii) the sequence is governed by location.
Both tasks require participants to attend to, encode and recall different features: item identity in the cognitive task and spatial location in the spatial task. With both tasks, when there are three images, the chance of choosing the correct sequence if selecting items at random is 16.7% (that is, 1/3 × 1/2 × 1/1 = 0.167).
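The chance level quoted above can be checked by enumeration. The following is a minimal sketch, assuming that each item is touched exactly once and that a random responder picks among the remaining items with equal probability; the function name is illustrative and is not part of the original task software.

```python
from fractions import Fraction
from itertools import permutations


def chance_of_correct_sequence(n_items: int) -> Fraction:
    """Probability that a uniformly random ordering of n_items
    matches the single target order (assumes no item is repeated)."""
    target = tuple(range(n_items))
    orders = list(permutations(range(n_items)))
    hits = sum(1 for order in orders if order == target)
    return Fraction(hits, len(orders))


# Three-item test sequences: 1/3 x 1/2 x 1/1
p3 = chance_of_correct_sequence(3)
print(p3, round(float(p3), 3))
```

Under the same assumptions, a two-item sequence has a chance level of 1/2, which is why longer sequences give a stricter test of learning.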
In both tasks, the relationship between the pictures or locations themselves is arbitrary. However, when items are touched in an arbitrarily specified order, a reward (both primary and secondary reinforcers) is produced. The order of elements in a ritualized behaviour (e.g. praying to a deity) is arbitrary, in that the order itself has no clear causal connection to a (presumed) outcome. The order of elements in a linguistic utterance, by contrast, often has a clear connection to the utterance's meaning and, therefore, a causal connection to an outcome, even though the syntactic rules of an individual language are themselves largely arbitrary. For example, asking someone to ‘cut the bandage’ will probably lead to a different outcome than asking them to ‘bandage the cut’. In this way, the sequences in these tasks are more like linguistic ones—with causal effects—than ritualistic ones. The causal link to a detectable outcome makes it plausible that apes would be more likely to copy sequences in these situations than in tasks examining the copying of causally irrelevant actions in sequences [13–15].
(ii). Conditions
Four different learning conditions were used with each task: (i) in the Baseline condition (individual learning), participants discovered the correct sequence independently by trial and error; (ii) the Recall condition (another individual learning condition) always occurred directly after the Baseline condition. Once subjects correctly entered a full three-item sequence during Baseline, there was a 30 s delay during which the computer screen was occluded. Then participants were presented with the same sequence used in the preceding Baseline condition to assess recall of the sequence; (iii) in the Ghost (vicarious) condition, the computer demonstrated the correct sequence three times by highlighting individual items on the screen in the target order; and (iv) in the Social (vicarious) condition, a human experimenter demonstrated the correct sequence three times.
(iii). Participants
A total of 96 typically developing 3 year old (n = 44; mean age in months, 41.8; s.d., 5.2) and 5 year old (n = 52; mean age in months, 65.3; s.d., 3.8) children were recruited at a local museum or zoo following Institutional Review Board approved protocols.
(iv). Apparatus
All tasks were carried out on an iMac computer (Apple, Cupertino, CA, USA) with a MagicTouch touchscreen panel (Keytec, Garland, TX, USA) affixed to it. The tasks were custom-written.
(v). Procedure
The cognitive and spatial tasks were done in blocks to avoid interference, minimize training time and ensure understanding of the rules of the task. The order of tasks was counterbalanced; half of the children received the spatial block first and the other half received the cognitive block first. A brief training phase preceded the first experimental condition for each task. After training, the testing phase began. The conditions within each task were counterbalanced (but Baseline was always immediately followed by Recall). Four different conditions (delineated above), and a total of three different sequences, were used for each task, for a 2 (tasks) × 4 (learning conditions) design. A sequence consisted of a set of three images (for the cognitive task) or three locations (for the spatial task). A trial consisted of an attempt (either correct or incorrect) to enter a sequence. After an incorrect response on a trial, a new trial with the same sequence started; in the cognitive task, the spatial locations of the pictures were shuffled, and in the spatial task, the identity of the pictures changed.
For the Baseline and Recall conditions, performance from the first presentation of a sequence (first trial, T1), as well as subsequent trials until the correct sequence was entered, was measured. Once children selected the items in the correct order in one condition, testing was complete in that condition, and a new condition was begun. For additional details on the methods, see the electronic supplementary material.
(vi). Statistical analysis
Two measures were used to quantify performance. Performance on the first trial (T1) of each condition was a binomial measure, with values of correct (1) or incorrect (0). T1 is the strictest measure of learning. In conditions with demonstrations, T1 represents a ‘pure’ measure of vicarious learning, whereas success on any trial after the first could be owing to vicarious learning, individual learning or a combination of both. The only way to perform at above-chance levels on T1 is to have prior knowledge of the sequence (by individual learning or vicarious learning). However, T1 cannot give information about what happens after the first trial. An individual who first enters a correct sequence on trial 2 and one who first enters a correct sequence on trial 10 receive the same score on the T1 measure (both score a 0). As such, T1 excludes partial learning.
A second measure, the correct : incorrect (CI) presses measure, is a per-trial measurement of performance. It consists of two response variables: out of the first two presses in a trial, the number of correct selections and the number of incorrect selections. A fully correct trial (picture A → picture B → picture C) results in values of [2:0]; a partially correct trial (picture A → picture C) results in values of [1:1] and an incorrect trial (picture B or picture C selected first) results in values of [0:2]. The CI measure can show how quickly an individual finds a solution if they do not do so on the first trial, and also how performance changes within a condition. For children, CI values for each trial until the first correct trial were calculated for each condition; that is, if an individual first entered two incorrect trials and then one correct trial, a total of three CI values were calculated. The first four trials for each condition were included in the statistical models, to maintain consistency with the orangutan data structure (see §3a); if a child entered the correct sequence before they had performed four trials, the number of CI values corresponded to the total number of trials performed by the child.
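The two measures can be illustrated with a short scoring sketch. It assumes that a trial ends at the first incorrect press, so that any press not made among the first two counts as incorrect; this is consistent with the [2:0], [1:1] and [0:2] examples above but is an assumption about the task software. The function names and the example data are hypothetical.

```python
from typing import List, Sequence, Tuple


def ci_score(presses: Sequence[str], target: Sequence[str]) -> Tuple[int, int]:
    """Return (correct, incorrect) over the first two presses of one trial.

    Assumption: the trial ends at the first incorrect press, so a press
    that was never made is scored as incorrect.
    """
    correct = 0
    for position in range(2):
        if position < len(presses) and presses[position] == target[position]:
            correct += 1
        else:
            break
    return correct, 2 - correct


def t1_score(trials: Sequence[Sequence[str]], target: Sequence[str]) -> int:
    """Return 1 if the first trial reproduces the full target sequence, else 0."""
    return int(list(trials[0]) == list(target))


def ci_values(trials: Sequence[Sequence[str]], target: Sequence[str],
              max_trials: int = 4) -> List[Tuple[int, int]]:
    """CI values for each trial up to the first correct trial, capped at max_trials."""
    values = []
    for presses in trials[:max_trials]:
        values.append(ci_score(presses, target))
        if list(presses) == list(target):
            break
    return values


TARGET = ["A", "B", "C"]
# Hypothetical participant: two incorrect trials followed by a correct one
example_trials = [["B"], ["A", "C"], ["A", "B", "C"]]
print(t1_score(example_trials, TARGET))
print(ci_values(example_trials, TARGET))
```

In this example the participant scores 0 on T1 but contributes three CI values, which shows how CI captures partial learning that T1 discards.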
For the main statistical analyses, generalized linear mixed models (GLMMs) implemented in lme4 in R [33] were used to examine the fixed effects of condition, age group (3 and 5 year olds), and their interactions. We analysed the same data with Markov chain Monte Carlo GLMMs in a Bayesian framework using the MCMCglmm package [34], to determine if analysis method affected the results (see the electronic supplementary material, §S1.7). It did not; therefore we report here the results of the lme4 analyses. Regression analyses (general linear models; GLMs) were used to evaluate the degree to which learning in certain conditions predicted that in others.
(b). Results
(i). Cognitive task performance by condition and age
We used a binomial GLMM with a response variable of CI; fixed effects of age group (two levels: 3 and 5 years), condition (four levels: Baseline, Ghost, Social, Recall) and their interaction; and a random effect of participant (identity). There was no main effect of age group, but there was a main effect of condition and an interaction between condition and age group. To explore this interaction, we separated the data by age and re-ran the models with condition as the only fixed effect. For 3 year olds, pairwise contrasts between conditions (using Tukey's correction for multiple comparisons) indicated that performance was significantly better in the Social condition than Baseline (b = 0.68, s.e. = 0.23, Z = 3.0, p = 0.013); no other pairwise contrast showed a significant difference (all ps > 0.2; figure 2), including comparisons between Social and Recall and between Social and Ghost. For 5 year olds, pairwise contrasts between conditions indicated that when compared with Baseline, performance was significantly better in the Social (b = 1.9, s.e. = 0.27, Z = 7.2, p < 0.001), Ghost (b = 0.64, s.e. = 0.20, Z = 3.2, p = 0.0077) and Recall (b = 1.1, s.e. = 0.21, Z = 5.1, p < 0.001) conditions. Additionally, performance in the Social condition was better than that in the Recall (b = 0.81, s.e. = 0.28, Z = 2.9, p = 0.017) and Ghost (b = 1.3, s.e. = 0.27, Z = 4.6, p < 0.001) conditions.
(ii). Spatial task performance by condition and age
We used an analysis analogous to that described above and found that, similar to the cognitive task, there was no main effect of age group, but there was a main effect of condition and an interaction between condition and age group. We again separated the data by age to explore this interaction, and ran models with condition as the only fixed effect. For 3 year olds, pairwise contrasts between conditions (with Tukey's correction) indicated that performance was significantly better in the Recall condition than in Baseline (b = 0.61, s.e. = 0.21, Z = 2.9, p = 0.018); no other pairwise contrast showed a significant difference (all ps > 0.1; figure 3). For 5 year olds, pairwise contrasts between conditions indicated that when compared with Baseline, performance was significantly better in the Social (b = 0.95, s.e. = 0.22, Z = 4.3, p < 0.001), Ghost (b = 0.89, s.e. = 0.20, Z = 4.4, p < 0.001) and Recall (b = 1.2, s.e. = 0.21, Z = 5.6, p < 0.001) conditions. In contrast with the cognitive task described above, however, no other pairwise contrast showed a significant difference (all ps > 0.6), including the comparisons between Social and Recall and between Social and Ghost.
For results of the analyses of both tasks with the T1 measure, see the electronic supplementary material.
(iii). Differences between tasks
We examined whether children's overall performance was better in the spatial or the cognitive task by using a binomial GLMM with a response variable of CI, a fixed effect of task (two levels: cognitive and spatial) and a random effect of participant. There was no main effect of task (b = −0.08, s.e. = 0.073, Z = −1.2, p = 0.25), indicating that children's performance overall was not better or worse in either task.
(iv). Relationships between performance in Social and other conditions
We examined whether performance in the various conditions predicted children's social learning performance in each task using GLMs (see the electronic supplementary material, S1.6 for details). For the cognitive task, while age group (5 year olds vs 3 year olds; b = −0.78, s.e. = 0.26, Z = −3.0, p = 0.0025) was a significant predictor of performance in the cognitive Social condition, performance in the other conditions (cognitive Ghost, cognitive Recall, spatial Ghost, spatial Recall and spatial Social) was not (all ps > 0.08). For the spatial task, age group (5 year olds vs 3 year olds; b = −0.40, s.e. = 0.16, Z = −2.5, p = 0.014) was a significant predictor of performance in the spatial Social condition, as was spatial Ghost performance (b = 0.54, s.e. = 0.19, Z = 2.8, p = 0.005; all other ps > 0.07). Results are summarized in the electronic supplementary material, figure S3.
3. Experiment 2: orangutans
(a). Material and methods
(i). Tasks
The cognitive and spatial tasks described above were used to test the orangutans.
(ii). Conditions
The same four learning conditions described above for children were used.
(iii). Participants
Three adult orangutans living at Smithsonian's National Zoo in Washington, DC, participated in this study. Demographic details of the orangutans are shown in the electronic supplementary material, table S3. The protocols for this study were approved by the Institutional Animal Care and Use Committees of the George Washington University and the Smithsonian Institution.
(iv). Apparatus
The apparatus used for orangutans was similar to the one used for children, and was affixed to a mobile cart that allowed testing in the orangutans' living enclosures. During testing, the touchscreen was placed against the enclosure mesh so that an orangutan could interact with the screen and access rewards (grapes) delivered through a feeding tube.
(v). Procedure
Training and testing on the cognitive and spatial tasks were done in blocks to minimize training time and between-task interference, and to maximize understanding of the task rules. Iris received the cognitive block first, while Batang and Kyle received the spatial block first. For additional details, see the electronic supplementary material.
Training. In the training sessions, orangutans were given three demonstrations of a two-item sequence by a familiar zookeeper. They then had four consecutive opportunities (trials) per sequence to enter the correct two-item sequence, and received rewards for correct performance. They saw four different sequences per training session. Upon reaching a performance criterion, they moved to the testing phase.
Testing. Experimental trials used three-item sequences. Conditions within each experimental block were counterbalanced, so that each orangutan received the Ghost condition first in one block and the Social condition first in the other block. Each condition block consisted of 12 sessions; in addition, 12 sessions that comprised both Baseline and Recall conditions were interspersed throughout the Ghost and Social sessions, rather than performed as a block, to distribute any potential changes in expertise.
During testing, as during training, orangutans were given four trials per sequence in the Recall, Ghost and Social conditions. The number of response trials was limited in order to ensure that session length was predictable and motivation to respond correctly was high (i.e. there were relatively few opportunities for a reward). Orangutans were given up to 35 trials in the Baseline condition to discover the correct sequence initially. If they did not do so, no Recall condition was begun; instead, the next sequence in a session was begun.
(vi). Statistical analysis
For orangutans, both T1 and CI values were used. CI values were calculated for trials 1–4 in the Baseline, Recall, Ghost and Social conditions.
GLMMs implemented in lme4 in R were used to examine the fixed effect of condition. As for the children's data, we also created GLMMs for the orangutan data using the MCMCglmm package [34]; see the electronic supplementary material S2.9. The method of analysis did not affect the results, so we report here the results of the lme4 analyses. To evaluate the degree to which performance in other conditions predicted that in the Social conditions, we used MCMCglmm because the lme4 models did not converge.
(b). Results
(i). Cognitive task performance by condition
We used a binomial GLMM, with a response variable of CI, a fixed effect of condition (four levels: Baseline, Ghost, Social, Recall) and random effects of participant and demonstrator. There was a main effect of condition; pairwise contrasts between conditions (using Tukey's correction) indicated that performance was significantly better in the Recall condition than in the Baseline (b = 0.35, s.e. = 0.096, Z = 3.7, p = 0.0013), Social (b = 0.27, s.e. = 0.087, Z = 3.2, p = 0.0086) and Ghost (b = 0.41, s.e. = 0.090, Z = 4.6, p < 0.001) conditions. No other contrasts were significantly different. Results are summarized in figure 2 and the electronic supplementary material, figure S4.
(ii). Spatial task performance by condition
We repeated the analysis described above for the spatial task and found a main effect of condition. Pairwise contrasts between conditions (using Tukey's correction) indicated that when compared with Baseline, performance was significantly better in the Recall (b = 1.0, s.e. = 0.098, Z = 10, p < 0.001), Social (b = 0.45, s.e. = 0.10, Z = 4.5, p < 0.001) and Ghost (b = 0.37, s.e. = 0.10, Z = 3.7, p = 0.0013) conditions. Additionally, performance in the Recall condition was better than that in both the Social (b = 0.55, s.e. = 0.091, Z = 6.0, p < 0.001) and the Ghost (b = 0.63, s.e. = 0.089, Z = 7.1, p < 0.001) conditions. There were no other significant differences, including between the Social and Ghost conditions. Results are summarized in figure 3 and the electronic supplementary material, figure S5.
(iii). Differences between tasks
We examined whether orangutans' overall performance was better in the spatial or the cognitive task by using a binomial GLMM with a response variable of CI, a fixed effect of task and a random effect of participant. There was a main effect of task (b = 0.28, s.e. = 0.045, Z = 6.2, p < 0.001), indicating that orangutans performed better in the spatial than the cognitive task.
(iv). Relationships between performance in the Social and other conditions
We examined whether performance in the various conditions predicted orangutans’ social learning performance in each task using MCMCglmm (see the electronic supplementary material S2.8). For the cognitive task, the model indicated that performance in the other conditions (cognitive Ghost, cognitive Recall, spatial Ghost, spatial Recall and spatial Social) did not significantly predict performance in the cognitive Social condition (all ps > 0.14). For the spatial task, performance in the other conditions (spatial Ghost, spatial Recall, cognitive Ghost, cognitive Recall, and cognitive Social) did not significantly predict performance in the spatial Social condition (all ps > 0.06). One condition, cognitive Social, had a marginal negative relationship with spatial Social performance (see the electronic supplementary material, table S5). See the electronic supplementary material, figure S3 for a summary.
(v). Comparing Recall performance with chance
To compare the performance of orangutans with chance levels, we performed χ² tests on the distribution of first-trial (T1) responses (correctly pressing all three items [ABC], correctly pressing the first but not the second item [AC], or pressing an initial incorrect item [B or C]) for each individual. The full results are reported in the electronic supplementary material, table S6. For the Recall conditions, Batang's responses in both the cognitive and spatial tasks included more ABC responses than expected by chance; Iris's responses in the spatial task (but not the cognitive task) included more ABC responses than expected by chance; and Kyle's responses in the cognitive task included fewer ABC responses than expected by chance. Kyle's responses in the Ghost condition of the spatial task also included fewer ABC responses than expected by chance. No other result differed significantly from chance.
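A comparison of this kind can be sketched as a goodness-of-fit test. The example below assumes that, under random responding, the three response categories occur with probabilities 1/6 (ABC), 1/6 (AC) and 2/3 (B or C first), and that the test uses 2 degrees of freedom at α = 0.05 (critical value 5.991). These expected proportions and the significance threshold are assumptions, not values reported in the text, and the counts are invented for illustration only.

```python
from fractions import Fraction
from typing import Dict

# Assumed chance proportions for first-trial responses when three
# items are pressed in a uniformly random order.
CHANCE = {
    "ABC": Fraction(1, 6),           # all three items in the correct order
    "AC": Fraction(1, 6),            # first item correct, second incorrect
    "B_or_C_first": Fraction(2, 3),  # first item incorrect
}

# Chi-square critical value for 2 degrees of freedom at alpha = 0.05
CRITICAL_DF2_ALPHA05 = 5.991


def chi_square_vs_chance(observed: Dict[str, int]) -> float:
    """Chi-square goodness-of-fit statistic against the assumed chance proportions."""
    total = sum(observed.values())
    stat = Fraction(0)
    for category, p in CHANCE.items():
        expected = p * total
        stat += (observed[category] - expected) ** 2 / expected
    return float(stat)


# Hypothetical counts, not data from the study
hypothetical_counts = {"ABC": 20, "AC": 8, "B_or_C_first": 20}
stat = chi_square_vs_chance(hypothetical_counts)
print(stat, stat > CRITICAL_DF2_ALPHA05)
```

The statistic alone does not indicate direction; whether ABC responses were more or fewer than expected is read from the observed and expected counts for that category.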
4. Discussion
Several uniquely human traits, such as language and ritual, involve arbitrary sequences that are culturally rather than causally specified, and vicariously rather than individually learned. The ubiquity of both language and ritual in human activities prompts the question: do humans have a specialization for vicariously learning arbitrary sequences across tasks and problem domains? Consider that ritual, for example, includes arbitrary sequences of actions that are causally opaque and goal demoted [35]. Language—syntax specifically—similarly consists of arbitrary sequences of words, but in contrast with ritual, words and phrases can be clearly linked to causal outcomes [36]. In addition to their sequential features, the abstract and symbolic nature of the tokens (i.e. ritual acts and words)—representing unobservable concepts or absent entities—used in both domains links the evolution of ritual and language [37,38].
Here, we present evidence that humans at an early age, but not orangutans, possess a specialization for vicariously learning some arbitrary sequences (electronic supplementary material, table S7). In each task, participants received either individual information (Recall) or vicarious information, the latter provided by a social agent serving as the model (Social) or by a computer acting as an artificial agent (Ghost). The performance of orangutans and young children on the tasks diverged. Regardless of task, orangutans learned sequences best in the Recall condition. They also evidenced some limited vicarious learning in the spatial task, with performance following the pattern of Recall > Social ≈ Ghost > Baseline. Such results show that the orangutans had both the motivation and ability to encode and recall novel arbitrary sequences in individual learning conditions, and in some vicarious conditions. By contrast, 5 year old children showed proficient learning in all individual and vicarious conditions, regardless of task. As in orangutans, 3 year old children's performance in the spatial task was best in the Recall condition. However, unlike that of orangutans, their performance in the cognitive task was best in the Social condition.
To our surprise, orangutans appeared to learn spatial-based sequences (spatial task) better than item-based sequences (cognitive task), despite the fact that they had less expertise in the former than the latter. This pattern differs from that observed in monkeys (who evidenced social learning in the cognitive task [16]) and the developmental pattern in children observed here and in other studies [19,20]. There are several possible explanations for this result. First, there is empirical evidence that captive [39] as well as wild [40,41] orangutans have excellent spatial memory, as evidenced by the ability to form cognitive maps of complex habitats, which includes avoiding previously depleted sites in experimental tasks and revisiting preferred sites in the wild. Second, in search tasks, orangutans, like all other non-human great apes (gorillas, chimpanzees and bonobos) and 1 year old infants, favoured the use of a spatial rather than a feature-based memory strategy, while 3 year olds showed the reverse strategy [42]. Finally, in a previous study by Swartz et al. [43], the orangutans (one of which was involved in this study) spontaneously used a spatial strategy (selecting items from right to left) when encoding and recalling unordered items on a touchscreen task similar to the cognitive task used here. These factors may explain orangutans' comparatively better performance in the spatial than the cognitive task.
Children did not show any overall performance differences between the tasks, and the pattern of their performance differences by age replicates and extends previously reported results [19,20,44,45].
Species differences are further highlighted by our predictive analyses (electronic supplementary material, tables S1, S2, S4 and S5 and figure S3). These show that children's spatial Social performance is predicted by their spatial Ghost performance (consistent with children being adept vicarious learners, regardless of source). However, neither orangutans' spatial Social performance nor their cognitive Social performance is significantly predicted by either their individual learning (Recall) or their vicarious learning (Ghost). Neither orangutans' nor children's Social performance across tasks was predicted by their Recall performance. Together, these results suggest that social and individual learning may be dissociable in both humans and orangutans, consistent with previous studies [19,20] and theories of a mosaic architecture of social learning [11,23,30].
Some limitations of the present study should be considered for designing future research. In addition to testing other great apes, it would be useful to expose participants to incorrect as well as correct responses. Errors, executed by conspecifics, have been associated with more robust social learning in both children [19,20,46] and monkeys [47,48]. This would reveal whether the pattern of results reported here would change if apes and children were provided with models that showed both correct and incorrect responses. Additionally, the copying of non-arbitrary causal sequences by orangutans should be tested, for example, in physical tasks that visibly require certain orders of actions. This would indicate whether it is the (arbitrary) relationship between elements that makes it difficult for orangutans to vicariously learn sequences in these tasks.
In summary, these results show that humans, from an early age, have a facility to learn novel arbitrary sequences from others in a way that orangutans do not. Is this because humans are exposed to more (and are perhaps more dependent on) sequences of actions than orangutans? Consider that wild orangutans perform some serial actions such as the daily construction of their nests for night-time sleep. This involves selecting a site for the nest, making a foundation of larger branches, and sometimes adding embellishments [49]. While the seriation of some of these actions is instrumental (e.g. a nest could not be constructed in a different order), other behaviours appear to have less of an instrumental role (e.g. adding a ‘rim’ around the edge of the nest, or other embellishments called ‘artistic’ features [50,51]). While wild orangutans may sometimes be exposed to and use sequences in cases like this, sequences are ubiquitous in children's lives. In fact, from the moment children wake up in the morning to when their heads are placed on a pillow at night, children's days are organized into a series of elaborate routines that include hierarchically organized sequences. Might such experiences explain the species differences observed here? Or do humans rely on and use such elaborate sequences and routines because learning them comes so easily and naturally?
While most developmental research has focused on the unique pressures faced by human children to learn new instrumental skills by imitation [52–54], less attention has been paid to the challenges associated with the vicarious and social learning of arbitrary sequences, critical for both language and ritual [55,56]. The evidence that does exist suggests that placing serial responses in a ‘ritual’ context enhances imitation fidelity [56,57]. For example, making sequences causally opaque and without an obvious end goal (two core features of rituals [35]) increases, rather than decreases, imitation fidelity. That said, children in general excel at imitating all types of sequences [58–60]. While our results confirm that humans and orangutans share various individual sequence learning skills, the faithful copying of socially demonstrated arbitrary sequences is highly developed in humans relative to orangutans. This is consistent with the hypothesis that imitation of novel arbitrary sequences is a human cognitive specialization [23,61]. From a developmental perspective, we do not have enough information to determine the extent to which this specialization is acquired via experience [26]. Also, from an evolutionary perspective, we cannot say conclusively whether this skill precipitated complex tool use, language and symbolic rituals; whether an increasing dependence on these skills placed unique pressures on the ability to vicariously learn novel arbitrary sequences; or even whether these suites of skills coevolved [24,37,57,62,63]. Regardless, the interdependence between the ability to vicariously learn sequences and these uniquely human behaviours is unmistakable.
5. Conclusion
Do humans possess a cognitive specialization for vicariously learning novel sequences? The evidence presented here is consistent with the specialization hypothesis: at a young age, children are particularly adept, when compared with orangutans, at faithfully imitating arbitrary sequences. Orangutans' comparatively poor performance on the same tasks and conditions cannot be explained by some general representational deficit or a lack of interest (or motivation), as they showed significant learning in individual conditions and even some vicarious conditions. However, differences may be explained by the fact that vicarious sequence learning underlies many uniquely human behaviours that range from complex tool use to language and ritual. These results raise the question: is the relative facility with which humans vicariously learn novel sequences a cause of the emergence of ritual and language or, as Heyes [26] has suggested, is the specialization a product of these cultural activities? We may never know for sure. What we can say, however, is that the few apes that have been raised in human homes or given language training do not show the same facility for learning complex sequences as young human children, whether imitating novel actions on objects [23] or the sequencing of signs to communicate [64].
Acknowledgements
We thank the staff in the primate unit at Smithsonian National Zoological Park and Natural History Museum for logistical assistance and the undergraduate research assistants involved in collecting data for this research. We thank Shreejata Gupta, Cheryl Stimpson, Mark Atkinson and two anonymous reviewers for comments on earlier versions of the manuscript.
Ethics
Ethical approval for this work was obtained from the George Washington University Institutional Research Board (for work with children) and Institutional Animal Care and Use Committee (for work with orangutans). For all child subjects, informed consent from a parent as well as the child's assent were obtained prior to participation. This research was conducted according to the principles in the Declaration of Helsinki, and followed the Association for the Study of Animal Behaviour Guidelines for the Use of Animals in Research, as well as all legal and institutional requirements.
Data accessibility
Data can be accessed at https://osf.io/n2rgv/.
Authors' contributions
E.R. conceived of and designed the study, carried out the data collection, carried out the statistical analysis and drafted the manuscript; E.M.P. carried out the statistical analysis and critically revised the manuscript; F.S. helped conceive and design the study and helped draft and critically revise the manuscript. All authors gave final approval for publication and agree to be held accountable for the work performed therein.
Competing interests
We declare we have no competing interests.
Funding
Funding for this research was provided by the Leakey Foundation (to E.R.) and the National Science Foundation, BCS-0748717 (to F.S.). During write-up of this project, E.R. was supported by ERC grant no. 648841 RATCHETCOG ERC-2014-CoG (to Christine A. Caldwell) under the European Union's Horizon 2020 research and innovation programme.
References
- 1. Kapitány R, Nielsen M. 2015. Adopting the ritual stance: the role of opacity and context in ritual and everyday actions. Cognition 145, 13–29. (doi:10.1016/j.cognition.2015.08.002)
- 2. Kapitány R, Nielsen M. 2017. The ritual stance and the precaution system: the role of goal-demotion and opacity in ritual and everyday actions. Relig. Brain Behav. 7, 27–42. (doi:10.1080/2153599X.2016.1141792)
- 3. Kapitány R, Nielsen M. 2019. Ritualized objects: how we perceive and respond to causally opaque and goal demoted action. J. Cogn. Cult. 19, 170–194. (doi:10.1163/15685373-12340053)
- 4. Terrace HS. 2005. The simultaneous chain: a new approach to serial learning. Trends Cogn. Sci. 9, 202–210. (doi:10.1016/j.tics.2005.02.003)
- 5. Inoue S, Matsuzawa T. 2007. Working memory of numerals in chimpanzees. Curr. Biol. 17, R1004–R1005. (doi:10.1016/j.cub.2007.10.027)
- 6. Renner E, Price EE, Subiaul F. 2016. Sequential recall of meaningful and arbitrary sequences by orangutans and human children: does content matter? Anim. Cogn. 19, 39–52. (doi:10.1007/s10071-015-0911-z)
- 7. Terrace HS, Son LK, Brannon EM. 2003. Serial expertise of rhesus macaques. Psychol. Sci. 14, 66–73. (doi:10.1111/1467-9280.01420)
- 8. Merritt DJ, Terrace HS. 2011. Mechanisms of inferential order judgments in humans (Homo sapiens) and rhesus monkeys (Macaca mulatta). J. Comp. Psychol. 125, 227–238. (doi:10.1037/a0021572)
- 9. Bandura A, Ross D, Ross SA. 1963. Vicarious reinforcement and imitative learning. J. Abnorm. Soc. Psychol. 67, 601–607. (doi:10.1037/h0045550)
- 10. Renner E, Atkinson M, Caldwell CA. 2019. Squirrel monkey responses to information from social demonstration and individual exploration using touchscreen and object choice tasks. PeerJ 7, e7960. (doi:10.7717/peerj.7960)
- 11. Subiaul F. 2007. The imitation faculty in monkeys: evaluating its features, distribution and evolution. J. Anthropol. Sci. 85, 35–62.
- 12. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM. 2009. Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Phil. Trans. R. Soc. B 364, 2417–2428. (doi:10.1098/rstb.2009.0069)
- 13. Clay Z, Tennie C. 2018. Is overimitation a uniquely human phenomenon? Insights from human children as compared to bonobos. Child Dev. 89, 1535–1544. (doi:10.1111/cdev.12857)
- 14. Horner V, Whiten A. 2005. Causal knowledge and imitation/emulation switching in chimpanzees (Pan troglodytes) and children (Homo sapiens). Anim. Cogn. 8, 164–181. (doi:10.1007/s10071-004-0239-6)
- 15. Nielsen M, Susianto EWE. 2010. Failure to find over-imitation in captive orangutans (Pongo pygmaeus): implications for our understanding of cross-generation information transfer. In Developmental psychology (ed. Håkansson J.), pp. 153–167. New York, NY: Nova Science Publishers.
- 16. Subiaul F, Cantlon JF, Holloway RL, Terrace HS. 2004. Cognitive imitation in rhesus macaques. Science 305, 407–410. (doi:10.1126/science.1099136)
- 17. Boutin A, Fries U, Panzer S, Shea CH, Blandin Y. 2010. Role of action observation and action in sequence learning and coding. Acta Psychol. (Amst.) 135, 240–251. (doi:10.1016/j.actpsy.2010.07.005)
- 18. Boutin A, Badets A, Salesse RN, Fries U, Panzer S, Blandin Y. 2012. Practice makes transfer of motor skills imperfect. Psychol. Res. 76, 611–625. (doi:10.1007/s00426-011-0355-2)
- 19. Subiaul F, Anderson S, Brandt J, Elkins J. 2012. Multiple imitation mechanisms in children. Dev. Psychol. 48, 1165–1179. (doi:10.1037/a0026646)
- 20. Subiaul F, Patterson EM, Schilder B, Renner E, Barr R. 2015. Becoming a high-fidelity–super-imitator: what are the contributions of social and individual learning? Dev. Sci. 18, 1025–1035. (doi:10.1111/desc.12276)
- 21. Subiaul F, Zimmermann L, Renner E, Schilder B, Barr R. 2016. Defining elemental imitation mechanisms: a comparison of cognitive and motor-spatial imitation learning across object- and computer-based tasks. J. Cogn. Dev. 17, 221–243. (doi:10.1080/15248372.2015.1053483)
- 22. Subiaul F, Renner E. 2020. Example trials of the cognitive and spatial touchscreen tasks. Figshare Figure. See .
- 23. Subiaul F. 2016. What's special about human imitation? A comparison with enculturated apes. Behav. Sci. (Basel) 6, 13.
- 24. Heyes C. 2016. Homo imitans? Seven reasons why imitation couldn't possibly be associative. Phil. Trans. R. Soc. B 371, 20150069. (doi:10.1098/rstb.2015.0069)
- 25. Heyes C. 2012. What's social about social learning? J. Comp. Psychol. 126, 193–202. (doi:10.1037/a0025180)
- 26. Heyes C. 2018. Cognitive gadgets: the cultural evolution of thinking. Cambridge, MA: The Belknap Press of Harvard University Press.
- 27. Subiaul F, Patterson EM, Zimmermann L, Barr R. 2019. Only domain-specific imitation practice makes imitation perfect. J. Exp. Child Psychol. 177, 248–264. (doi:10.1016/j.jecp.2018.07.004)
- 28. Bauer PJ, Mandler JM. 1992. Putting the horse before the cart: the use of temporal order in recall of events by one year old children. Dev. Psychol. 28, 441–452. (doi:10.1037/0012-1649.28.3.441)
- 29. Mishkin M, Ungerleider LG. 1982. Contribution of striate inputs to the visuospatial functions of parieto-preoccipital cortex in monkeys. Behav. Brain Res. 6, 57–77. (doi:10.1016/0166-4328(82)90081-X)
- 30. Subiaul F. 2010. Dissecting the imitation faculty: the multiple imitation mechanisms (MIM) hypothesis. Behav. Processes 83, 222–234. (doi:10.1016/j.beproc.2009.12.002)
- 31. Terrace HS. 2001. Chunking and serially organized behavior in pigeons, monkeys and humans. In Avian visual cognition (ed. Cook RG.), (online). Medford, MA: Comparative Cognition Press.
- 32. Terrace HS. 2001. Comparative psychology of chunking. In Animal cognition and sequential behavior (ed. Fountain S.), pp. 23–56. Dordrecht, The Netherlands: Kluwer Academic Publishing.
- 33. R Core Team. 2017. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. See https://www.r-project.org/.
- 34. Hadfield JD. 2010. MCMC methods for multi-response generalized linear mixed models: the MCMCglmm R package. J. Stat. Softw. 33, 1–22. (doi:10.18637/jss.v033.i02)
- 35. Nielsen M, Tomaselli K, Kapitány R. 2018. The influence of goal demotion on children's reproduction of ritual behavior. Evol. Hum. Behav. 39, 343–348. (doi:10.1016/j.evolhumbehav.2018.02.006)
- 36. Pinker S. 2007. The language instinct. New York, NY: Harper Perennial Modern Classics.
- 37. Deacon TW. 1997. The symbolic species: the co-evolution of language and the brain, 1st edn. New York, NY: W.W. Norton.
- 38. Donald M. 1991. Origins of the modern mind: three stages in the evolution of culture and cognition. Cambridge, MA: Harvard University Press.
- 39. MacDonald SE, Agnes MM. 1999. Orangutan (Pongo pygmaeus abelii) spatial memory and behavior in a foraging task. J. Comp. Psychol. 113, 213–217. (doi:10.1037/0735-7036.113.2.213)
- 40. Vogel ER, Haag L, Wich SA, Bastian ML, van Schaik CP. 2008. Factors affecting foraging decisions in a wild population of sympatric orangutans (Pongo pygmaeus wurmbii) and white-bearded gibbons (Hylobates albibarbis): evidence of cognitive maps. In Prog. 77th Ann. Meeting of the American Assoc. Physical Anthropologists, 9–12 April 2008, Columbus, OH, p. S215. Herndon, VA: American Association of Physical Anthropologists.
- 41. Markham KE, van Noordwijk MA, Vogel ER. 2014. Factors influencing revisitation rates to feeding trees in wild Bornean orangutans (Pongo pygmaeus wurmbii) in Central Kalimantan, Indonesia. In 37th Annual Meeting of the American Society of Primatologists Scientific Prog., 12–15 September 2014, Decatur, GA, p. 75. Norman, OK: American Society of Primatologists.
- 42. Haun DBM, Call J, Janzen G, Levinson SC. 2006. Evolutionary psychology of spatial representations in the Hominidae. Curr. Biol. 16, 1736–1740. (doi:10.1016/j.cub.2006.07.049)
- 43. Swartz KB, Himmanen SA, Shumaker RW. 2007. Response strategies in list learning by orangutans (Pongo pygmaeus × P. abelii). J. Comp. Psychol. 121, 260–269. (doi:10.1037/0735-7036.121.3.260)
- 44. Subiaul F, Lurie H, Romansky K, Klein T, Holmes D, Terrace H. 2007. Cognitive imitation in typically-developing 3- and 4-year olds and individuals with autism. Cogn. Dev. 22, 230–243. (doi:10.1016/j.cogdev.2006.10.003)
- 45. Subiaul F, Vonk J, Rutherford MD. 2011. The ghosts in the computer: the role of agency and animacy attributions in ‘ghost controls’. PLoS ONE 6, e26429. (doi:10.1371/journal.pone.0026429)
- 46. Want SC, Harris PL. 2001. Learning from other people's mistakes: causal understanding in learning to use a tool. Child Dev. 72, 431–443. (doi:10.1111/1467-8624.00288)
- 47. Ferrucci L, Nougaret S, Genovesio A. 2019. Macaque monkeys learn by observation in the ghost display condition in the object-in-place task with differential reward to the observer. Sci. Rep. 9, 401. (doi:10.1038/s41598-018-36803-4)
- 48. Monfardini E, Hadj-Bouziane F, Meunier M. 2014. Model-observer similarity, error modeling and social learning in rhesus macaques. PLoS ONE 9, e89825. (doi:10.1371/journal.pone.0089825)
- 49. Russon AE, Handayani DP, Kuncoro P, Ferisa A. 2007. Orangutan leaf-carrying for nest-building: toward unraveling cultural processes. Anim. Cogn. 10, 189–202. (doi:10.1007/s10071-006-0058-z)
- 50. van Schaik CP, Ancrenaz M, Borgen G, Galdikas B, Knott CD, Singleton I et al. 2003. Orangutan cultures and the evolution of material culture. Science 299, 102–105. (doi:10.1126/science.1078004)
- 51. Bastian ML, Van Noordwijk MA, Van Schaik CP. 2012. Innovative behaviors in wild Bornean orangutans revealed by targeted population comparison. Behaviour 149, 275–297. (doi:10.1163/156853912X636726)
- 52. Csibra G, Gergely G. 2009. Natural pedagogy. Trends Cogn. Sci. 13, 148–153. (doi:10.1016/j.tics.2009.01.005)
- 53. Lyons DE, Damrosch DH, Lin JK, Macris DM, Keil FC. 2011. The scope and limits of overimitation in the transmission of artefact culture. Phil. Trans. R. Soc. B 366, 1158–1167. (doi:10.1098/rstb.2010.0335)
- 54. Tomasello M. 2016. Cultural learning redux. Child Dev. 87, 643–653. (doi:10.1111/cdev.12499)
- 55. Herrmann PA, Legare CH, Harris PL, Whitehouse H. 2013. Stick to the script: the effect of witnessing multiple actors on children's imitation. Cognition 129, 536–543. (doi:10.1016/j.cognition.2013.08.010)
- 56. Legare CH, Wen NJ, Herrmann PA, Whitehouse H. 2015. Imitative flexibility and the development of cultural learning. Cognition 142, 351–361. (doi:10.1016/j.cognition.2015.05.020)
- 57. Legare CH, Nielsen M. 2015. Imitation and innovation: the dual engines of cultural learning. Trends Cogn. Sci. 19, 688–699. (doi:10.1016/j.tics.2015.08.005)
- 58. Bauer PJ. 1992. Holding it all together: how enabling relations facilitate young children's event recall. Cogn. Dev. 7, 1–28. (doi:10.1016/0885-2014(92)90002-9)
- 59. Loucks J, Meltzoff AN. 2013. Goals influence memory and imitation for dynamic human action in 36-month-old children. Scand. J. Psychol. 54, 41–50. (doi:10.1111/sjop.12004)
- 60. Loucks J, Mutschler C, Meltzoff AN. 2017. Children's representation and imitation of events: how goal organization influences 3 year old children's memory for action sequences. Cogn. Sci. 41, 1904–1933. (doi:10.1111/cogs.12446)
- 61. Sherwood CC, Subiaul F, Zawidzki TW. 2008. A natural history of the human mind: tracing evolutionary changes in brain and cognition. J. Anat. 212, 426–454. (doi:10.1111/j.1469-7580.2008.00868.x)
- 62. Henrich J. 2016. The secret of our success: how culture is driving human evolution, domesticating our species, and making us smarter. Princeton, NJ: Princeton University Press.
- 63. Renner E, Zawidzki T. 2018. Minimal cognitive preconditions on the ratchet. In Evolution of primate social cognition: interdisciplinary evolution research (eds Di Paolo LD, Di Vincenzo F, De Petrillo F), pp. 249–265. New York, NY: Springer International Publishing AG.
- 64. Lyn H, Greenfield PM, Savage-Rumbaugh S, Gillespie-Lynch K, Hopkins WD. 2011. Nonhuman primates do declare! A comparison of declarative symbol and gesture use in two children, two bonobos, and a chimpanzee. Lang. Commun. 31, 63–74. (doi:10.1016/j.langcom.2010.11.001)