Author manuscript; available in PMC: 2020 Oct 1.
Published in final edited form as: Atten Percept Psychophys. 2019 Oct;81(7):2354–2364. doi: 10.3758/s13414-019-01734-3

The role of motor context in the beneficial effects of hand gesture on memory

Kimberly M Halvorson 1, Alexa Bushinski 1, Caitlin Hilverman 2
PMCID: PMC6824968  NIHMSID: NIHMS1528354  PMID: 31044395

Abstract

Viewing co-speech hand gesture with spoken phrases enhances memory for phrases compared to when the phrases are presented without gesture. Prior work investigating the mechanism underlying the effect of gesture on memory has implicated engagement of the motor system; when the hands are engaged in an unrelated motor task when viewing gesture, the beneficial effect of gesture is absent. However, one alternative interpretation of these findings is that the beneficial effect of gesture disappears due to mismatched context at encoding and retrieval: the hands are engaged during either encoding or retrieval, but not during both stages. Here, we examined whether matching motor context at encoding and retrieval plays a role in the beneficial effect of gesture on memory during a phrase recall task. Participants were presented with phrases that were viewed with and without gesture. Participants were assigned to one of four conditions that determined whether they would complete an unrelated motor task at (1) encoding only, (2) retrieval only, (3) both encoding and retrieval, or (4) neither. For stages when they were not completing a motor task, participants’ hands were in their laps. We found that gesture enhanced memory for phrases when participants engaged in an unrelated motor task at encoding and retrieval and when they did not complete the motor task during either stage. Further, phrases observed with gesture were more likely to be paraphrased rather than recalled literally. Together, these findings demonstrate that gesture can enhance memory even when the motor system is engaged in another task as long as that same task is performed at retrieval.

Keywords: working memory, Perception and Action, embodied perception


When we communicate, we often produce co-speech hand gestures. Gestures are spontaneous movements of our hands and arms that are semantically related to the content of our speech (McNeill, 1992). These gestures are integrated with the content of spoken language and affect the listener’s comprehension of the message (Beattie & Shovelton, 1999; McNeill, Cassell, & McCullough, 1994). Even when gesture provides information not present in spoken language, listeners remember unique information from gesture when reporting what they heard (Hilverman, Clough, Duff, & Cook, 2018; Kelly, Barr, Church, & Lynch, 1999; McNeill et al., 1994). Thus, gesture and spoken language together comprise a unified, dynamic system that can profoundly affect communication and memory.

Observing or producing gesture during encoding can facilitate the learning of a new language in adults (Hilverman, Cook, & Duff, 2018; Kelly, McDevitt, & Esch, 2009; Kroenke, Mueller, Friederici, & Obrig, 2013; Macedonia, 2014). Similarly, in children, observing or producing gesture when learning a new mathematical concept enhances learning for that concept both in an immediate posttest (Cook, Mitchell, & Goldin-Meadow, 2008) and after a delay (Cook, Duffy, & Fenn, 2013). Instructing children to produce gesture when describing a past event enhances their memory for that event (Stevanoni & Salmon, 2005). Therefore, producing or observing gestures during encoding, retrieval, or both can enhance memory.

In the present study, we are specifically interested in the benefit of observing gesture on the listener’s memory for spoken words. This benefit derives from a phenomenon known as the enactment effect (Cohen, 1989; Engelkamp & Krumnacker, 1980). The enactment effect refers to the finding that pantomiming the relevant movements associated with action words or phrases will lead to better memory for those words or phrases. This is true whether the participant is doing the acting or whether they are observing someone else do the acting (Cohen, 1989). Additionally, the enactment effect is found when enactment occurs during encoding, and a further benefit is found when the same action is performed at retrieval (Engelkamp, Zimmer, Mohr, & Sellen, 1994). Yet the enactment effect only facilitates memory if the action performed matches the verbal content that it is produced with. Action production can inhibit memory performance if the action produced does not match the concurrent verbal content. Zimmer & Engelkamp (1984) had participants learn motor action sentences (e.g., “The father is winding up his watch”) and kinematic sentences (e.g., “The smoke was rising”). Participants that produced a concurrent motor action that did not match the content of the sentence (e.g., fist clenching) remembered significantly fewer motor sentences than participants that had viewed short videos containing kinematic movement (e.g., a ball rolling across a table). Thus, unrelated motor engagement while learning sentences inhibited learning, implicating the motor system in the processing of and memory for action sentences, particularly when those sentences also contained motor information.

Similarly, previous studies have implicated the motor system in gesture processing more generally. Spontaneously producing hand gestures while describing a narrative enhances memory for speech compared to when they are not produced (Cook, Yip, & Goldin-Meadow, 2010). Viewing gesture with sentences facilitates the recall of those sentences, specifically when gestures are related to the verbal information (Feyereisen, 2006). Imaging work has also linked gesture processing and the motor system; a study using EEG demonstrated that having prior sensorimotor experience with an object affects how the gesture for that object is processed (Quandt, Marshall, Shipley, Beilock, & Goldin-Meadow, 2012). Relatedly, a study of both adults and children using fMRI demonstrated that the neural correlates of observing gesture are affected by how much experience one has in producing gesture (Wakefield, James, & James, 2013).

In addition to this empirical work, a theoretical gesture production framework – the Gesture as Simulated Action framework (Hostetter & Alibali, 2008, 2018) – also suggests that the link between gesture and the motor system is critical. By this account, speakers gesture because they simulate actions and perceptual states while thinking, and these thoughts engage the motor system and serve as the building blocks of gesture. Taken together, this framework and the aforementioned empirical work demonstrate a well-established link between gesture and the motor system. However, the extent to which the motor system is involved in gesture processing – and specifically in gesture observation’s facilitative effect on memory – remains less clear.

Recent work by Ianì and colleagues (Ianì & Bucciarelli, 2017, 2018) investigated a direct activation account of the motor system of the listener as a possible mechanism for the facilitative effect of gesture observation on memory. Similar to Zimmer & Engelkamp (1984), they tested whether having listeners perform a concurrent motor task disrupts the beneficial effect of gesture observation on recall for spoken phrases. Ianì & Bucciarelli (2017, 2018) had participants watch videos of a person saying action phrases. In half of the videos, the phrase was accompanied by a meaningful co-speech gesture. For the other half the phrase was not accompanied by any arm movements. In that case, when there were no additional instructions regarding the participants’ own hands or constraints on his or her movements, recall was better for the action phrases accompanied by gesture (Ianì & Bucciarelli, 2017).

In critical comparison conditions, an irrelevant motor task was introduced: participants were instructed to move their hands in a rhythmic tapping motion during encoding (i.e., while watching the videos) or during retrieval (i.e., while attempting to recall the action phrases), or to move their feet in a comparable pattern at either encoding or retrieval (Ianì & Bucciarelli, 2017, 2018). For the conditions in which the hands were engaged in the irrelevant motor task at either encoding or retrieval, the facilitation of memory for action phrases that were accompanied by gesture was disrupted. When listeners’ feet were engaged in an irrelevant task at encoding or retrieval, the benefit for gesture persisted.

Ianì & Bucciarelli (2017, 2018) concluded that observing gesture activated the listeners’ own motor systems and that this activation supported the development of a content-rich mental model for representing the speech information. When the hands were engaged in another motor task, the listeners’ motor systems were occupied and unable to encode supplementary information from gesture. According to this theory, mental models are constructed during discourse and relevant motor information can contribute to a more fully articulated model. These models can contain declarative knowledge (“knowing that”) and procedural knowledge (“knowing how”) for the to-be-remembered speech (Ianì et al., 2018; Ianì & Bucciarelli, 2017). Multiple knowledge types foster a more complete understanding of the speech and make it easier to recall.

Despite the central role that the motor system plays in gesture production and observation, manipulating the presence of an additional tapping task at encoding or retrieval results in inconsistencies in the motor context that could also affect recall of the phrases. One recent study highlighted the importance of matching motor context in a procedural learning task that involved instructing participants to gesture or not. Huff, Maurer, & Merkt (2018) had participants complete a procedural learning task (tying knots) and found that participants who gestured during the learning phase were more accurate at test so long as they also gestured during the testing phase. Participants who did not gesture during the learning phase were more accurate when they did not gesture during the testing phase. Both congruent groups did better than participants who had gestured during learning but not at test (Huff et al., 2018). The researchers suggested that gesturing during learning provides an added benefit for retrieval only when that context is reinstated (i.e., gestures or actual movements are required) at test.

This work differs slightly from the gesture observation studies discussed above, as participants were not observing someone gesture or making task-irrelevant hand movements, but rather were using their own motor system to mimic the procedural knowledge they were trying to acquire. But the reinstatement of the encoding context at retrieval is crucial to the question we are addressing in the current work. In Ianì & Bucciarelli (2017, 2018), manipulating the availability of the listeners’ motor systems by engaging the hands and arms with an irrelevant motor task only during encoding or only at retrieval created inconsistencies across the encoding and retrieval contexts. Such inconsistencies are known to be disruptive to memory consolidation and retrieval.

According to the principle of encoding specificity, when a word, for example, is encoded, what is stored is very specific information about that word based on, and including information from, the context from the specific situation in which it was encountered (Tulving & Thomson, 1973). Put more plainly, when an item is encoded into memory, the stored representation is not only the information from the relevant stimulus. Rather, memory includes the information from the stimulus plus any number of situational, environmental, emotional, or semantic cues present at the time of encoding. One example of the reach of the encoding specificity principle was the classic experiment by Baddeley and colleagues in which participants were asked to learn a list of words either on land or under water in full scuba gear (Godden & Baddeley, 1975). In this early demonstration of the role of context on memory formation and retrieval, half of the participants were tested under the same conditions that they learned the information and the other half were tested in the opposite environment. Memory was better when the encoding and retrieval conditions matched – whether that was on land or in water – than when they were different.

In addition to occupying the listeners’ motor systems with an irrelevant task, in Ianì & Bucciarelli’s (2017, 2018) gesture studies, the researchers also created different conditions at encoding and retrieval by introducing the motor task only at encoding or only at test. The aim of the current study is to more precisely characterize the conditions under which an irrelevant motor task disrupts the beneficial effect of gesture on recall for action phrases by keeping the encoding and retrieval contexts consistent. We hypothesize that providing matching motor contexts at both encoding and retrieval will enhance memory for phrases with gesture, even when the motor system is engaged in a task that is not directly related to the information being learned.

Current Experiment

To address the question of whether the change in context might have had an effect on the benefit of gesture on recall of verbal phrases, we replicated the three primary conditions from Ianì & Bucciarelli (2017, 2018) and added a fourth condition in which participants were instructed to perform the irrelevant motor task throughout encoding and retrieval. In all four conditions, participants saw 24 distinct sentences – 12 accompanied by gesture and 12 without. In the first condition – No Tapping – participants were not told to move their hands in any specific way. In the second condition – Both Tapping – participants were instructed to tap at both encoding and retrieval. In the third condition – Encoding Tapping – participants were instructed to tap only at encoding. In the fourth condition – Retrieval Tapping – participants were instructed to tap only at retrieval. For all four groups we measured memory for the spoken phrases via an uncued recall task.

We predicted a benefit for sentences accompanied by gesture in two of the four conditions; specifically, the No Tapping and Both Tapping conditions. Because the encoding and retrieval contexts are the same in these cases, it is possible that the benefit for memory for sentences accompanied by gesture will be preserved. In the No Tapping condition, following Ianì & Bucciarelli (2017), listening to sentences accompanied by gesture may create a more detailed mental model for the phrases. At retrieval, because the motor system is unoccupied, motor simulations for the observed gestures may be reactivated to boost the overall number of items remembered.

In the Both Tapping condition, we predict that performing a concurrent motor task will not disrupt the benefit for memory for sentences accompanied by gesture so long as the same motor context is reinstated at recall. In fact, the continuous activity of the motor system during encoding may allow participants to actually use the tapping task as one of many retrieval cues at recall since it was part of the motor context at encoding. By reinstating the encoding context it is possible that we would see the same benefit for observing gesture on memory for the phrases. Alternatively, if the motor system is overwhelmed by the demands of the tapping task, the motor information from the gestures would never be encoded and therefore would not be available for retrieval regardless of whether the motor contexts match.

We make a different prediction for the Encoding Tapping and Retrieval Tapping conditions. In the Encoding Tapping condition we predict no benefit for phrases accompanied by gesture. Assuming the concurrent motor task is a critical part of the motor context at encoding, information from the gesture-accompanied phrases would be harder to access at recall. For the Retrieval Tapping condition, the addition of a new task may make the retrieval process more difficult and potentially diminish the extent to which the information that was acquired via gesture may be used as a retrieval cue or activated. This would replicate the two conditions in Ianì & Bucciarelli (2017) that show no benefit of gesture when the encoding and retrieval conditions are mismatched.
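The predictions above reduce to a simple rule: a gesture benefit is expected only when the motor context is the same at encoding and retrieval. The following is a minimal sketch of that rule; the names `CONDITIONS`, `context_matches`, and `gesture_benefit_predicted` are illustrative and are not part of the study materials.

```python
# Tapping schedule for each condition: (taps at encoding, taps at retrieval).
CONDITIONS = {
    "No Tapping": (False, False),
    "Both Tapping": (True, True),
    "Encoding Tapping": (True, False),
    "Retrieval Tapping": (False, True),
}


def context_matches(condition: str) -> bool:
    """Return True when the motor context is the same at encoding and retrieval."""
    at_encoding, at_retrieval = CONDITIONS[condition]
    return at_encoding == at_retrieval


def gesture_benefit_predicted(condition: str) -> bool:
    """Context-matching hypothesis: a gesture benefit is predicted
    only when the encoding and retrieval motor contexts match."""
    return context_matches(condition)


for name in CONDITIONS:
    print(f"{name}: gesture benefit predicted = {gesture_benefit_predicted(name)}")
```

Under the alternative account mentioned above, in which tapping overwhelms the motor system at encoding, the Both Tapping condition would instead show no benefit; this sketch encodes only the context-matching prediction.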

Methods

Participants

Eighty-three volunteers were recruited for participation in this study. Twenty-three participants (19 female, 3 male, 1 other) were recruited via an electronic subject pool and participated in exchange for course credit in an introductory Psychology course in St. Paul, Minnesota. The average participant age was 30 years old (range 18 to 55); 1 identified as Hispanic or Latino, 2 identified as Asian, 8 identified as Black, not of Hispanic origin, 12 identified as White, not of Hispanic origin, and 1 identified as other. The remaining 60 volunteers (29 female, 28 male, 3 other) were recruited from the same geographical region via snowball sampling, advertisements on social media platforms, and word-of-mouth in exchange for one entry in a drawing for a $25 gift card to an online retailer. The average participant age was 32 years old (range 18 to 66); 2 identified as Hispanic or Latino, 3 identified as American Indian or Alaska Native, 6 identified as Asian, 9 identified as Black, not of Hispanic origin, 40 identified as White, not of Hispanic origin, and 2 identified as other.

Materials

The 24 normed sentences consisting of action phrases from Ianì and Bucciarelli (2017, Experiment 2) were adapted slightly for comprehension and familiarity (Appendix A). There were 48 videos total in which the speaker uttered the phrases with or without accompanying gestures depicting the action information (see Figure 1 for an example). The action phrases were divided into two sets with 12 sentences in each set. There were 2 versions of each set (one with the accompanying gestures and one without). In total there were four sets of videos: gesture + set A, no gesture + set A, gesture + set B, no gesture + set B. These video sets were used to construct four protocols. In protocol 1, gesture + set A was followed by no gesture + set B. In protocol 2, gesture + set B was followed by no gesture + set A. In protocol 3, no gesture + set A was followed by gesture + set B. In protocol 4, no gesture + set B was followed by gesture + set A. Thus the order of the gesture block and the set of action phrases were fully counterbalanced across participants. Stimulus videos were presented on a 13.3-inch monitor using OpenSesame’s media_player_mpy plugin, which is based on the MoviePy software.
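The four protocols are the full crossing of gesture-block order with phrase-set order. As an illustration only, the sketch below enumerates them; the function name `build_protocols` and the tuple representation are assumptions made for this example, not the software used to run the study.

```python
from itertools import product

# Two possible orders of the gesture blocks and two of the phrase sets.
GESTURE_ORDERS = [("gesture", "no gesture"), ("no gesture", "gesture")]
SET_ORDERS = [("A", "B"), ("B", "A")]


def build_protocols():
    """Cross block order with set order; each protocol is a list of
    (gesture condition, phrase set) pairs in presentation order."""
    protocols = []
    for gesture_order, set_order in product(GESTURE_ORDERS, SET_ORDERS):
        protocols.append(list(zip(gesture_order, set_order)))
    return protocols


for number, blocks in enumerate(build_protocols(), start=1):
    description = ", then ".join(f"{g} + set {s}" for g, s in blocks)
    print(f"Protocol {number}: {description}")
```

Enumerated this way, the four results follow the same order as protocols 1 through 4 described above, and every protocol contains both phrase sets and both gesture conditions exactly once.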

Figure 1.

A still from the videos accompanying the phrase, “playing the piano”. The Gesture Observed condition is on the left, the No Gesture Observed condition on the right. Participants saw two sets of 12 sentences, with one set accompanied by gesture and the other set with the hands at rest.

Procedure

Participants were randomly assigned to one of the four protocols; participants were also randomly assigned to one of the four tapping conditions. The same tapping instructions and task from Ianì and Bucciarelli (2017) were used here. Participants in the No Tapping condition (n = 21) were not given any additional instructions about what to do with their hands during encoding or retrieval. Participants in the Encoding Tapping condition (n = 21) were instructed at the start of each video set to “place their hands on their knees. Throughout the videos, continuously and alternately tap the table in front of you with your index fingers. After tapping the table with one hand, that same hand would come back down to the knee before the next hand goes on to tap the table.” The 12 sentences in the video set were presented in random order in immediate succession. See Figure 2 for a schematic of the procedure.

Figure 2.

Experimental procedure for all conditions. Participants were instructed to either keep their hands on their lap or rhythmically tap the table in front of them in each phase, depending on the condition to which they were assigned. The order of phrase set presented was randomized across participants; the order of the type of phrase (with gesture or without) was counterbalanced.

Next a white screen appeared with the word “Now” in the center of the screen in black type for 90 seconds. Participants were asked to recall as many of the phrases as possible (but they were not engaged in the tapping task). Vocal responses were recorded. Participants were then instructed to resume the tapping task while they watched the second video set which was followed by the “Now” screen. Participants had 90 seconds to recall as many sentences as possible. In the Retrieval Tapping condition (n = 21) participants were not told what to do with their hands while watching the videos. They were given the instructions for the tapping task prior to the start of the study and prompted to begin the tapping task when the “Now” screen appeared. In the Both Tapping condition (n = 20) participants were given the instructions for the tapping task at the start of the experiment and were told to continue the tapping throughout the duration of the study. In total, there were 8 possible conditions; the design was fully counterbalanced.

Coding of the recollections

We adopted the same coding system as Ianì and Bucciarelli (2017). Responses were coded into one of three categories: literal recollections, paraphrase recollections, and erroneous recollections. Literal recollections were phrases recalled exactly as they were originally uttered by the speaker. Paraphrase recollections were phrases recalled that captured the general meaning or gist of the original phrase. We used the same system for identifying paraphrases as Ianì and Bucciarelli (2017), which included changes to the plurality of the items in the phrases, different articles, or minor verb modifications. All other recollections were recorded as erroneous recollections. In addition to coding responses, we also identified the phrases that each participant missed; we incorporated these missed trials into our analyses reported below.

Results

Participants correctly recalled a mean of 5.70 sentences per set of 12 (sd = 1.75, range = 2–10 phrases; Figure 3). The condition with the highest average of correctly recalled phrases was the No Tapping, Gesture Observed condition (M = 7.00) whereas the condition with the lowest average of correctly recalled phrases was the Both Tapping, No Gesture Observed condition (M = 5.05).

Figure 3.

Mean number of phrases correctly recalled by Tapping Condition and Gesture Type. Participants recalled significantly more phrases in the No Tapping condition than in the Encoding Tapping condition and the Retrieval Tapping condition.

Of those coded as correct, a mean of 4.31 (sd = 1.68) were literal recollections and 1.38 (sd = 1.25) were paraphrased. Literal responses comprised 35.2% of all response types while paraphrased responses comprised 11.4% of response types. Errors were relatively infrequent, with a mean of 0.30 (sd = 0.71) per set of 12, comprising just 2.5% of all response types. Missed phrases made up the largest proportion of possible response types at 50.9%. See Figure 4 for the breakdown in proportion of responses by condition.

Figure 4.

Proportion of responses for phrases at recall by Tapping Condition and Gesture Type. Participants were significantly more likely to provide a paraphrased response when phrases were encoded while observing gesture compared to those encoded without gesture.

To assess whether context modulated the effect of gesture on memory, we used binomial mixed effect regression models. We used the glmer() function from the lme4 package (version 1.1-13) in R (version 1.1.419). Tapping Condition and Gesture Type were dummy coded, with the No Tapping condition and Gesture Viewed serving as the reference groups. We determined random effect structure by using the most maximal random effect structure that would converge (Barr, Levy, Scheepers, & Tily, 2013). We ran six models: the first predicting correct recall of a phrase across all four conditions, the second and third predicting correct recall across the two conditions with matching context (No Tapping, Both Tapping) and then the two conditions with mismatching context (Encoding Tapping, Retrieval Tapping), the fourth predicting literal recall of a phrase across all conditions, the fifth predicting paraphrased recall of a phrase across all conditions, and the sixth predicting errors in recall of a phrase across all conditions. The model predicting erroneous responses did not converge due to data sparsity. For the first five logistic regression models, Missed and Error responses were coded as 0s. For the sixth model, errors were coded as 1s and the rest as 0s.
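The coding described above can be sketched in a few lines. This is an illustration, not the authors' analysis script: the function and column names are hypothetical, and it assumes that paraphrases count as 0 in the literal model and literal recollections count as 0 in the paraphrase model, which the text does not state explicitly.

```python
# Response categories assigned during coding, plus missed phrases.
CATEGORIES = ("literal", "paraphrase", "error", "missed")

# Non-reference levels of Tapping Condition (reference: "No Tapping").
TAPPING_LEVELS = ("Encoding Tapping", "Retrieval Tapping", "Both Tapping")


def code_outcomes(category: str) -> dict:
    """Map one coded response to the binary outcomes used by the models."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown response category: {category}")
    return {
        "correct": int(category in ("literal", "paraphrase")),
        "literal": int(category == "literal"),        # assumption: paraphrase -> 0
        "paraphrase": int(category == "paraphrase"),  # assumption: literal -> 0
        "error": int(category == "error"),            # all other responses -> 0
    }


def dummy_code(tapping: str, gesture: str) -> dict:
    """Dummy-code the predictors; all zeros is No Tapping with Gesture Viewed."""
    row = {level: int(tapping == level) for level in TAPPING_LEVELS}
    row["No Gesture"] = int(gesture == "No Gesture Viewed")
    return row


print(code_outcomes("paraphrase"))
print(dummy_code("Both Tapping", "No Gesture Viewed"))
```

With this coding, each model coefficient is a difference in log-odds relative to the reference cell (No Tapping, Gesture Viewed).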

Correct Phrase Recall.

Our first model predicted correct phrase recall – collapsing across literal and paraphrased responses – as a function of Tapping Condition (No Tapping, Encoding Tapping, Retrieval Tapping, Both Tapping), Gesture Type (Gesture Viewed or No Gesture Viewed), and their interactions. There were random intercepts for item and subject, with a random slope for Gesture Type on the intercept for phrase; the three models we tried with more complex random effect structure failed to converge. Our final model was Recalled ~ TappingCondition * GestureType + (1 + GestureType | Phrase) + (1 | Subject). There was a main effect of Gesture Type (B = −0.40, z = −1.97, p = .049); phrases viewed with gesture were more likely to be recalled than those viewed without. There were main effects of Tapping Condition for both Encoding Tapping (B = −0.57, z = −2.86, p = .004) and Retrieval Tapping (B = −0.51, z = −2.64, p = .008); phrases were more likely to be recalled in the No Tapping condition than in the Encoding Tapping and Retrieval Tapping conditions. There was no significant difference between the No Tapping and Both Tapping conditions (B = −0.31, z = −1.55, p = 0.12). None of the interactions were significant (BothTap x NoGesture: B = 0.03, z = 1.51, p = 0.88; EncodingTap x NoGesture: B = 0.40, z = 1.51, p = .13; RetrievalTap x NoGesture: B = 0.26, z = 0.99, p = 0.32).
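Because these are logistic models, each B is a difference in log-odds, and exponentiating it gives an odds ratio. The sketch below is a reader's aid rather than part of the reported analysis: it converts the reported coefficients to odds ratios and recomputes two-sided p values from the reported z statistics under a standard normal reference distribution. The helper names `odds_ratio` and `two_sided_p` are illustrative.

```python
import math


def odds_ratio(b: float) -> float:
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(b)


def two_sided_p(z: float) -> float:
    """Two-sided p value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# (label, B, z) as reported for the main effects in the first model.
REPORTED = [
    ("No Gesture vs. Gesture Viewed", -0.40, -1.97),
    ("Encoding Tapping vs. No Tapping", -0.57, -2.86),
    ("Retrieval Tapping vs. No Tapping", -0.51, -2.64),
    ("Both Tapping vs. No Tapping", -0.31, -1.55),
]

for label, b, z in REPORTED:
    print(f"{label}: OR = {odds_ratio(b):.2f}, p = {two_sided_p(z):.3f}")
```

For example, B = −0.40 for Gesture Type corresponds to an odds ratio of roughly 0.67, meaning the odds of correct recall for phrases viewed without gesture were about two thirds of the odds for phrases viewed with gesture in the reference (No Tapping) condition.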

After finding no difference between the No Tapping and Both Tapping conditions, our next model included just the conditions with matching context. We included this model to examine whether these two conditions were different without the added variability present in the full model from the mismatched conditions. The model structure was the same as above. There was a marginal main effect of Gesture Type (B = −0.41, z = −1.85, p = .06); phrases viewed with gesture were more likely to be recalled than those viewed without. The main effect of condition was not significant (B = −0.31, z = −1.54, p = 0.12), nor was the interaction (B = 0.04, z = 0.16, p = 0.88). We then ran an identical model with just the Encoding Tapping and Retrieval Tapping conditions. The main effect of Gesture Type was not significant (B = −0.02, z = −0.53, p = 0.60). The main effect of Tapping Condition was also not significant (B = 0.05, z = 0.28, p = 0.78), nor was the interaction of Tapping Condition and Gesture Type (B = −0.14, z = −0.52, p = 0.60).

Literal Phrase Recall.

We next analyzed literal phrase recall as a function of Tapping Condition and Gesture Type. The model structure was the same as above. There was a main effect of Tapping Condition (B = −0.43, z = −2.01, p = .045); phrases were more likely to be literally recalled in the No Tapping compared to the Retrieval Tapping condition. The remaining main effects of Both Tapping (B = −0.11, z = −0.53, p = 0.60) and Encoding Tapping (B = −0.25, z = −1.19, p = 0.23) were not significant. The main effect of Gesture Type was not significant (B = −0.16, z = −0.78, p = 0.43). The interactions of Tapping Condition and Gesture Type were also not significant (BothTapping x NoGesture: B = 0.005, z = 0.27, p = 0.99; EncodingTapping x NoGesture: B = 0.14, z = 0.50, p = 0.62; RetrievalTapping x NoGesture: B = 0.17, z = 0.60, p = 0.55).

Paraphrased Phrase Recall.

We next analyzed paraphrased phrase recall as a function of Tapping Condition and Gesture Type. The model structure was the same as above. There was a main effect of Gesture Type (B = −0.58, z = −2.07, p = 0.038); phrases were more likely to be paraphrased when viewed with gesture than when viewed without. There was also a main effect of Tapping Condition for the Encoding Tapping condition (B = −0.78, z = −2.38, p = 0.017); phrases were more likely to be paraphrased in the No Tapping condition than the Encoding Tapping condition. The remaining effects of Tapping Condition were not significant (BothTapping: B = −0.45, z = −1.42, p = 0.16; RetrievalTapping: B = −0.27, z = −0.89, p = 0.38). The interactions of Tapping Condition and Gesture Type were also not significant (BothTapping x NoGesture: B = −0.09, z = −0.22, p = 0.83; EncodingTapping x NoGesture: B = 0.58, z = 1.38, p = 0.17; RetrievalTapping x NoGesture: B = 0.21, z = 0.52, p = 0.60).

Errors.

We next analyzed errors in recall as a function of Tapping Condition and Gesture Type. We included a random intercept for participant; this was the most complex model that would converge due to the relatively infrequent occurrence of errors. Neither the main effects of Tapping Condition (BothTapping: B = 0.958, z = 1.19, p = 0.23; EncodingTapping: B = 1.35, t = 1.71, p = .09; RetrievalTapping: B = 0.67, t = 0.82, p = 0.41) and Gesture Type (NoGesture: B = 0.50, t = 0.68, p = 0.50) nor their two-way interactions (BothTapping x NoGesture: B = −0.77, t = −0.85, p = 0.40; EncodingTapping x NoGesture: B = −1.46, t = −1.52, p = 0.13; RetrievalTapping x NoGesture: B = −0.72, t = −0.74, p = 0.46) were significant.

Discussion

We investigated whether viewing gesture enhances memory for phrases and whether engaging in an unrelated motor task mitigates the effect of gesture on memory, depending on whether the encoding and retrieval contexts match. Consistent with prior work, we found that seeing and hearing phrases accompanied by gesture enhanced memory for those phrases (Cohen, 1989; Engelkamp et al., 1994; Ianì & Bucciarelli, 2017, 2018). Further, we replicated Ianì & Bucciarelli (2017, 2018): the beneficial effect of gesture was present in the No Tapping condition but was not observed in the Encoding Tapping or Retrieval Tapping conditions. As predicted, the results of Both Tapping, the new condition that was not included in the previous studies by Ianì and Bucciarelli (2017, 2018), showed that the motor context at encoding and retrieval mattered: performance was affected by whether the encoding and retrieval contexts matched. Specifically, participants who engaged in a motor task at encoding and retrieval performed similarly to participants who did not engage in a motor task at either stage. Therefore, engaging in an unrelated motor task does not wipe out any beneficial effect of gesture for memory. Rather, matching the learning and recall contexts – by engaging the hands and arms in the same task at encoding and retrieval – leads to enhanced memory for phrases accompanied by gesture compared to those unaccompanied by gesture.

Although our first model comparing all four conditions did not yield significant interactions, the main effects of the encoding and retrieval conditions suggested different performance based on whether the contexts matched or not. We ran a follow-up model only on the No Tapping and Both Tapping conditions to directly compare the novel condition with the baseline condition. We found that gesture marginally enhanced memory for phrases across these two conditions (p = .06), but the overall mean number of phrases recalled was not significantly different across groups. When we ran this same model in the Encoding Tapping and Retrieval Tapping conditions, the beneficial effect of gesture was not present. We can conclude from these results that even when the motor system is engaged in a motor task during encoding, information is still encoded from gesture. If the motor system is engaged in that same task at retrieval, a benefit for gestured information persists.

How is it that engaging in a motor task at both encoding and retrieval showed the same facilitative effect of gesture on recall as keeping the hands at rest? We do not interpret these findings as evidence that the listeners’ motor system is not involved in gesture observation, understanding, or comprehension. Rather, we suggest that engaging the arms and hands in a secondary motor task during encoding or retrieval does not, on its own, disrupt the benefit for gesture. Although the tapping task was unrelated to the primary task of recalling the spoken phrases, we interpret our findings as evidence that it was never task irrelevant. It seems that instructing participants to engage in the tapping task created a motor context that must be present at both encoding and retrieval for the benefit of gesture to be observed. We argue that, unlike the leg and foot movements that participants were prompted to make in the critical control condition of Ianì & Bucciarelli (2018), moving your hands while watching someone else move their hands is inherently task relevant. Moving your feet does not disrupt the beneficial effect of gesture because it is truly task irrelevant.

We can conceive of a few explanations for the persistence of the benefit for gesture in the Both Tapping condition based on potential differences in the role of the listeners’ motor systems during gesture observation once the relevant motor context from encoding has been reinstated at retrieval. Under one view, gesture observation elicits an identical motor trace in the motor system of the listener; in this case, we think it is entirely possible that the motor information from the gestures was simply integrated with the ongoing motor activity from the tapping task. Given that the tapping task used was relatively simple, rhythmic and repetitive, it likely was not a significant strain on the cognitive and neural systems underlying action planning and production. It could be that there were plenty of available resources in working memory such that the information from gesture could be stored along with or in addition to the action information necessary for executing the tapping task.

When the same movements are produced again at recall, they could cue the action information encoded concurrently via gesture. This reactivation of the motor information may serve as an effective retrieval cue that makes the verbal information easier to access or makes the memory for the spoken information more robust. In this case, the action information and verbal information from each phrase could be stored (and subsequently reactivated) as a single, multimodal representation.

It is also possible that during gesture observation the listener acquires additional content from the gestures via a motor simulation but ultimately this information does not get stored as a motor trace. Instead, motor simulation during gesture observation facilitates comprehension of the spoken content by activating or even generating a corresponding mental image or other analog representation of the to-be-remembered information. This is consistent with the Gesture as Simulated Action framework (Hostetter & Alibali, 2008, 2018) described earlier.

We remain agnostic about the precise role of the listener’s motor system in gesture observation. However, as noted above, the tapping task used in these experiments was repetitive and simple, and arguably placed little burden on the motor system. Would engaging the hands in a more complicated motor task overload the motor system and wipe out the benefit of gesture? There is some evidence that this would be the case. Ping et al. (2013) had participants view and hear sentences, some accompanied by gesture and some not, and then judge whether a pictured object had been present in that sentence. Gestures contained additional information that could be used to speed up reaction times for the judgments. One group of participants completed this task while carrying out a concurrent motor task that involved making unplanned hand and arm movements. Ping et al. (2013) found that engaging in this motor task rendered participants incapable of using information from gesture to influence their sentence comprehension. It remains unknown whether a more complex motor task would similarly mitigate the effect of gesture on memory. Follow-up studies could use such a task in a Both Tapping-like condition to investigate whether the benefit for gesture would be eliminated.

Another aspect of our findings that speaks to mechanism is the relative proportions of paraphrased versus literal recollections of the phrases by Gesture Type; gesture’s facilitative effect on memory appears to have been driven by the paraphrased responses. When we restricted the analysis to just paraphrased recollections, there was a significant effect of gesture on memory. When we restricted the analysis to just literal recollections, this benefit disappeared. This is consistent with earlier findings showing a detriment in memory when participants were cued with literal phrases at recall. Cutica and colleagues had both children (Cutica, Iani, & Bucciarelli, 2014) and adults (Cutica & Bucciarelli, 2013) read scientific texts and then gave them tests measuring comprehension, verbatim memory, and inference. They found that when participants were instructed to gesture during encoding, they got more questions correct at test than when they did not gesture. However, in a follow-up experiment, participants who had been instructed to gesture got fewer questions correct on a recognition test when the questions included a literal phrase from the study materials than did participants who had not been instructed to gesture (Cutica & Bucciarelli, 2013). The researchers concluded that gesture improves memory by helping to establish a more detailed, articulated mental model of the written information. One side effect of this process may be a diminished memory for surface features of the original material (Cutica & Bucciarelli, 2013).

We posit a similar interpretation for our results. Observing gesture with the phrases significantly enhanced memory for paraphrased, but not literal, responses. We suggest that this may have been due to participants sometimes relying more heavily on their memory for the gesture than for the phrase itself. For example, the phrase “whisking eggs” was presented with a gesture of a hand moving vigorously in a circular motion in the gesture condition. At recall, the reactivation of the motor information from gesture could link back to several different ways to phrase this in spoken language (e.g., whisking some eggs, beating eggs, stirring the eggs). Because specific gestures can map onto multiple different words and ways of phrasing the intended meaning, or can activate corresponding images or analog representations of the semantic content, retrieving multimodal representations via gesture is likely to lead to a “gist” memory for what was encoded from spoken language. Indeed, the phrases that had the lowest incidence of paraphrased responses were all phrases that have clear mappings with specific gestures; the gesture canonically represented the literal phrase, with few other options for what it was representing (i.e., hammering a nail, rowing a boat, shooting a gun). Future work should further investigate this possibility by using gesture and phrase pairings that vary systematically with respect to how clearly the gesture represents the literal phrase. Understanding this distinction has practical implications for the use of gesture in classrooms, therapeutic environments, and other learning contexts; it may suggest that viewing gesture is most useful when the goal is to learn and understand concepts that do not require rote memory for specific words.

In sum, we followed up on prior work demonstrating that a concurrent motor task diminished the effect of gesture on memory for phrases by testing a critical condition: matching the learning and retrieval contexts by engaging in the motor task at both encoding and recall. We found that participants who completed a tapping task with their hands at both stages performed similarly to participants who did not engage in the task at either stage. Further, we found that when participants viewed gesture with phrases, they were more likely to provide paraphrased responses than when they did not view gesture. Taken together, these findings suggest that the learning context, specifically the motor context for the task-relevant effectors, is critical for assessing how gesture affects memory for phrases. Engaging in an unrelated motor task need not disrupt the memory benefit of observing gestures if the task is completed at both encoding and retrieval.

Appendix A. Stimuli phrases

Rowing a boat

Conducting an orchestra

Playing the violin

Dribbling a basketball

Playing the piano

Cleaning a window

Driving the car

Painting a painting

Ironing a shirt

Whisking eggs

Wringing out a washcloth

Throwing a stone

Getting shampoo

Polishing silver

Hammering a nail

Brushing your teeth

Putting lotion on your hands

Stacking some blocks

Sewing by hand

Typing on a computer

Giving a hug

Shooting a gun

Rolling some yarn

Sharpening a knife

Footnotes

Publisher's Disclaimer: This Author Accepted Manuscript is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication but has not been copyedited or corrected. The official version of record that is published in the journal is kept up to date and so may therefore differ from this version.

References

  1. Barr DJ, Levy R, Scheepers C, & Tily HJ (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. 10.1016/j.jml.2012.11.001
  2. Beattie G, & Shovelton H (1999). Mapping the Range of Information Contained in the Iconic Hand Gestures that Accompany Spontaneous Speech. Journal of Language and Social Psychology, 18(4), 438–462. 10.1177/0261927X99018004005
  3. Cohen RL (1989). Memory for action events: The power of enactment. Educational Psychology Review, 1(1), 57–80. 10.1007/BF01326550
  4. Cook SW, Duffy RG, & Fenn KM (2013). Consolidation and transfer of learning after observing hand gesture. Child Development, 84(6), 1863–71. 10.1111/cdev.12097
  5. Cook SW, Mitchell Z, & Goldin-Meadow S (2008). Gesturing makes learning last. Cognition, 106(2), 1047–58. 10.1016/j.cognition.2007.04.010
  6. Cook SW, Yip TK, & Goldin-Meadow S (2010). Gesturing makes memories that last. Journal of Memory and Language, 63(4), 465–475. 10.1016/j.jml.2010.07.002
  7. Cutica I, & Bucciarelli M (2013). Cognitive change in learning from text: Gesturing enhances the construction of the text mental model. Journal of Cognitive Psychology, 25, 201–209. http://doi.org/10.1080
  8. Cutica I, Iani F, & Bucciarelli M (2014). Learning from text benefits from enactment. Memory & Cognition, 42, 1026–1037.
  9. Engelkamp J, & Krumnacker H (1980). Image-and motor-processes in the retention of verbal materials. Zeitschrift Für Experimentelle Und Angewandte Psychologie.
  10. Engelkamp J, Zimmer HD, Mohr G, & Sellen O (1994). Memory of self-performed tasks: Self-performing during recognition. Memory & Cognition, 22(1), 34–39. 10.3758/BF03202759
  11. Feyereisen P (2006). Further investigation on the mnemonic effect of gestures: Their meaning matters. European Journal of Cognitive Psychology, 18(2), 185–205. 10.1080/09541440540000158
  12. Godden DR, & Baddeley AD (1975). Context-dependent memory in two natural environments: on land and underwater. British Journal of Psychology, 66(3), 325–331.
  13. Hilverman C, Clough SA, Duff MC, & Cook SW (2018). Patients with hippocampal amnesia successfully integrate gesture and speech. Neuropsychologia, 117(June), 332–338. 10.1016/j.neuropsychologia.2018.06.012
  14. Hilverman C, Cook SW, & Duff MC (2018). Hand gestures support word learning in patients with hippocampal amnesia. Hippocampus, 28(6), 406–415. 10.1002/hipo.22840
  15. Hostetter AB, & Alibali MW (2008). Visible embodiment: Gestures as simulated action. Psychonomic Bulletin & Review, 15(3), 495–514. 10.3758/PBR.15.3.495
  16. Hostetter AB, & Alibali MW (2018). Gesture as simulated action: Revisiting the framework. Psychonomic Bulletin & Review.
  17. Huff M, Maurer AE, & Merkt M (2018). Producing gestures establishes a motor context for procedural learning tasks. Learning and Instruction, 58(July), 245–254. 10.1016/j.learninstruc.2018.07.008
  18. Ianì F, & Bucciarelli M (2017). Mechanisms underlying the beneficial effect of a speaker’s gestures on the listener. Journal of Memory and Language, 96, 110–121. 10.1016/j.jml.2017.05.004
  19. Ianì F, & Bucciarelli M (2018). Relevance of the listener’s motor system in recalling phrases enacted by the speaker. Memory, 26(8), 1084–1092. 10.1080/09658211.2018.1433214
  20. Ianì F, Burin D, Salatino A, Pia L, Ricci R, & Bucciarelli M (2018). The beneficial effect of a speaker’s gestures on the listener’s memory for action phrases: The pivotal role of the listener’s premotor cortex. Brain and Language, 180–182(March), 8–13. 10.1016/j.bandl.2018.03.001
  21. Kelly SD, Barr DJ, Church RB, & Lynch K (1999). Offering a Hand to Pragmatic Understanding: The Role of Speech and Gesture in Comprehension and Memory. Journal of Memory and Language, 40(4), 577–592. 10.1006/jmla.1999.2634
  22. Kelly SD, McDevitt T, & Esch M (2009). Brief training with co-speech gesture lends a hand to word learning in a foreign language. Language and Cognitive Processes, 24(2), 313–334. 10.1080/01690960802365567
  23. Kroenke K-M, Mueller K, Friederici AD, & Obrig H (2013). Learning by doing? The effect of gestures on implicit retrieval of newly acquired words. Cortex; a Journal Devoted to the Study of the Nervous System and Behavior, 49(9), 2553–68. 10.1016/j.cortex.2012.11.016
  24. Macedonia M (2014). Bringing back the body into the mind: gestures enhance word learning in foreign language. Frontiers in Psychology, 5(December), 1–6. 10.3389/fpsyg.2014.01467
  25. McNeill D (1992). Hand and Mind: What Gestures Reveal About Thought. University of Chicago Press.
  26. McNeill D, Cassell J, & McCullough K-E (1994). Communicative Effects of Speech-Mismatched Gestures. Research on Language & Social Interaction. 10.1207/s15327973rlsi2703_4
  27. Ping RM, Goldin-Meadow S, & Beilock SL (2013). Understanding Gesture: Is the Listener’s Motor System Involved? Journal of Experimental Psychology: General, 143(1), 195–204. 10.1037/a0032246
  28. Quandt LC, Marshall PJ, Shipley TF, Beilock SL, & Goldin-Meadow S (2012). Sensitivity of Alpha and Beta Oscillations to Sensorimotor Characteristics of Action: An EEG Study of Action Production and Gesture Observation. Neuropsychologia, 50(12), 2745–2751. 10.1016/j.neuropsychologia.2012.08.005
  29. Stevanoni E, & Salmon K (2005). Giving Memory a Hand: Instructing Children to Gesture Enhances their Event Recall. Journal of Nonverbal Behavior, 29(4), 217–233. 10.1007/s10919-005-7721-y
  30. Tulving E, & Thomson DM (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80(5), 352–373. 10.1037/h0020071
  31. Wakefield EM, James TW, & James KH (2013). Neural correlates of gesture processing across human development. Cognitive Neuropsychology, 30(2), 58–76. 10.1080/02643294.2013.794777
