Abstract
Embodied cognition theory underscores the close coupling of bodily action and cognition, suggesting that learning can be strengthened when language is enacted. However, in English as a Foreign Language (EFL) education, the instructional value of gesture-based digital interfaces remains underexamined beyond vocabulary outcomes, particularly in relation to working memory capacity, spatial reasoning, and learner engagement. Addressing this gap, the present study examined how gesture-mediated interactions shape cognitive and affective outcomes in an EFL context using a convergent parallel mixed-methods design. A total of 143 Grade 12 students participated, with 74 allocated to the experimental group (EG) and 69 to the control group (CG). Over six weeks, EG participated in language tasks using Leap Motion, while CG received traditional instruction. Working memory capacity, spatial reasoning, and engagement were measured with three validated scales administered before and after the intervention, and open-ended responses were collected from EG; the quantitative and qualitative findings were then integrated. The quantitative analyses indicated that EG made significantly larger improvements in working memory capacity, spatial reasoning, and engagement than CG. Qualitative narratives supported these findings, highlighting increased motivation, deeper understanding, and greater agency among participants in EG, despite some initial technical challenges. These results suggest that gesture-based instruction may offer a promising approach for enhancing EFL instruction, boosting cognitive abilities, and strengthening classroom engagement.
Keywords: Digital age, Gesture-based educational technology, Spatial reasoning, Working memory capacity, Engagement, EFL
Introduction
The integration of digital technology into education has expanded opportunities for interactive, immersive learning experiences. In English as a Foreign Language (EFL) classrooms, conventional text-based teaching methods frequently restrict students’ active participation. Digital tools that use embodied interaction help learners link language with physical actions, boosting their understanding, memory, and motivation [43, 69]. Schnaider [74] highlights how these technologies enable learners to engage with abstract concepts in tangible ways, making classroom learning a more vibrant experience. Gesture-based technologies, including motion-sensing devices, enable students to interact with virtual content using hand movements and other physical actions [9], so that learners can express language and build understanding through movement. Gesture-based learning has proven effective in various educational fields, such as STEM, vocational training, and language acquisition [10, 44, 78]. In EFL education, these tools provide a distinctive way to strengthen vocabulary, grammar, and spatial concepts. When students use gestures to express words, sentences, or prepositions, they connect physical actions with language, which improves comprehension and memory [44]. These embodied experiences foster motivation and active engagement, enabling learners to interact with language in ways that go beyond mere memorization. Gesture-based learning turns the classroom into a space where students actively create understanding through movement, helping to make abstract concepts more concrete and memorable.
This study examines three constructs central to gesture-based EFL learning: working memory capacity, spatial reasoning, and learner engagement. Barutchu et al. [6] describe working memory capacity as the short-term storage and manipulation of linguistic information, which is essential for encoding and retaining knowledge needed for long-term language proficiency [50]. In this study, memory is defined more broadly as learners’ capacity to retain and later retrieve linguistic information over time, and Barutchu et al. [6] likewise emphasize that such retention supports durable language proficiency. Spatial reasoning encompasses the ability to visualize, manipulate, and interpret the relationships between objects. This skill is especially important for grasping prepositions, directions, and language that is grounded in spatial contexts [47, 59]. Learner engagement includes emotional, behavioral, cognitive, and agentic involvement, showcasing students’ interest, effort, and proactive participation in learning tasks, as noted by Reeve and Lee [66] and Ghelichli et al. [18]. This study offers a thorough perspective on how gesture-based learning impacts both the cognitive and emotional dimensions of EFL education by examining these constructs in tandem.
While there is increasing interest in gesture-based technologies, research exploring their impact in EFL contexts is still scarce, particularly concerning working memory capacity, spatial reasoning, and engagement. Many studies have focused on short-term vocabulary acquisition or overall educational outcomes, while the wider cognitive and emotional benefits remain less examined [44, 78]. This study explores the impact of Leap Motion, a gesture-based learning tool, on Grade 12 EFL students during a six-week intervention, aiming to fill an existing gap in the literature. The study highlights the pedagogical benefits of gesture-based tools by examining how physical interaction with virtual content affects working memory capacity, spatial reasoning, and engagement. While the findings are derived from an Ethiopian urban context, they suggest that embodied digital learning may enhance language acquisition by making it more active, meaningful, and learner-focused, with potential implications for EFL classroom practice and future research in similar resource-constrained settings, though generalizability to diverse global EFL populations requires further validation.
Literature review
Gesture-based educational technology allows learners to interact with educational content using physical movements, especially through hand and arm gestures. In EFL contexts, such tools enable students to express vocabulary, spatial relationships, or grammatical structures through gestures, transcending the limitations of text or speech alone [75, 83]. These systems often utilize motion sensors or gesture recognition to connect physical actions with linguistic meaning. Hsiao and Chen [28] and Janzen-Ulbricht [31] demonstrate that these technologies connect abstract linguistic forms with real-life experiences, which is particularly beneficial for EFL learners who frequently find it challenging to grasp grammar or vocabulary through solely verbal teaching methods. Moreover, recent studies on embodied cognition reveal that gestures play a crucial role in helping learners process spatial and abstract concepts during second language acquisition. By simulating actions and metaphors through bodily movements, learners can ground meaning more effectively [4, 37, 38]. These studies highlight that iconic and metaphoric gestures improve understanding by connecting gestural representations with semantic structures, facilitating a deeper integration of linguistic elements in EFL contexts. By mapping hand movements onto spatial relations (e.g., under/between), gesture-based tools can reduce processing demands and support sustained attention. Research shows that integrating gestures with speech or text enhances understanding, boosts retention, and elevates motivation when compared to conventional verbal approaches [34, 51].
Theoretical foundation of the study
Glenberg [19] posits that embodied cognition views cognitive processes as emerging from dynamic interplay among brain, body, and environment, rather than isolated neural computations. Johnson [33] argues that these sensorimotor experiences are integral to how knowledge is formed and used, and this premise carries direct implications for language learning. Applied to language learning, this view suggests that spatial prepositions gain meaning through physical interaction that mirrors real-world relations. Learners internalize concepts by enacting scenarios that engage motor and perceptual systems. For example, positioning a virtual book under a chair or spreading fingers for between promotes a multimodal representation which integrates linguistic symbols with movements.
Research in second language acquisition shows that physical activities yield superior vocabulary and grammar retention over passive methods like listening or reading [48]. One explanation is that gesturing links language processing to motor activity, which can strengthen initial encoding and support later retrieval [27, 70]. Such activation enhances encoding, turning memories into embodied simulations that are reactivated by similar sensorimotor cues [49]. From an embodied cognition perspective, gesture-mediated tasks should support working memory, spatial reasoning, and engagement by distributing processing across verbal and sensorimotor channels. Such interactions can reduce cognitive load by shifting some demands to physical actions, supporting both verbal rehearsal and visuospatial processing through sensorimotor practice [60, 62]. For spatial reasoning, learner interaction with virtual objects grounds relationships in bodily coordinates, boosting mental rotation and visualization [58]. For engagement, active participation meets needs for competence and autonomy, fostering sustained behavioral and cognitive investment [73].
Building on embodied cognition, Goldin-Meadow and Beilock [20] present a gesture-based theory that treats gestures as tools for cognitive offloading and conceptual elaboration, not just communicative aids. Kelly and Ngo Tran [35] extend this account by proposing that gestures help externalize internal thought, which enables learners to explore and refine emerging ideas with less pressure on limited cognitive resources. Janzen-Ulbricht [32] explains that this action-meaning linkage can strengthen retention because learners encode language through coordinated motor and semantic cues. In EFL, where vocabulary lacks direct representations, gestures bridge new inputs to familiar motor patterns. For example, a learner struggling with “between” can trace the space between two virtual books, making the relation tangible and supporting more accurate interpretation through physical enactment [12]. Gesture-based theory aligns with evidence that situated gestures boost word learning in second language contexts, especially with contextual integration [26]. It also predicts the outcomes examined in this study. Working memory should improve as gestures distribute load across channels, strengthening encoding and retrieval [13]. Spatial reasoning may improve because gestures mimic spatial transformations, fostering more intricate mental models of dynamics [1]. Engagement should rise through the feedback loops that gesturing creates, nurturing agentic behaviors and emotional commitment and transforming learning into self-directed performance [67].
This approach integrates these viewpoints into a unified framework for gesture-based EFL interaction, particularly via motion-sensing technologies like Leap Motion (detailed in the methods section). Gesture-mediated tasks create a perception–action loop in which learners act on language input, receive immediate feedback, and refine form–meaning mappings over repeated trials. This cycle contrasts with passive methods, turning acquisition into embodied simulation. Leap Motion’s tracking connects physical inputs to linguistic outputs, supporting measurable changes on standardized assessments. Such interactions are expected to advance verbal working memory by embedding elements in motor traces, elevate spatial reasoning via kinesthetic mapping, and heighten engagement through autonomy and flow. The framework supports measure selection and result interpretation, positioning gesture-based technology as a catalyst for holistic improvements [5]. Beyond novelty, it shifts instruction to an active modality. For example, learners can enact spatial prepositions by repositioning virtual objects via gesture recognition. In this perception–action–feedback loop, grounded in both theories, prompt responses are intended to fortify neural connections linking movement to meaning through iterative practice.
Working memory capacity, spatial reasoning, and learner engagement in EFL contexts: constructs and empirical gaps
In the context of EFL learning, working memory capacity is the ability to temporarily store and manipulate linguistic components like vocabulary, syntax, and meaning. It supports immediate processing and initial encoding, which can contribute to later retention and access to semantic representations [42, 50]. It also supports online comprehension and production, including fluent processing during reading and speaking. Limited capacity can lead learners to forget newly acquired forms quickly [2, 24, 57]. Studies indicate that using gestures can greatly improve working memory capacity during second language learning. Kelly et al. [36] found that primary school students who used gestures while learning verbs retained those verbs longer than their peers who did not use gestures, with the effects lasting for weeks after instruction. In a similar vein, Hsiao et al. [29] found that using gesture-based memory strategies enhanced working memory in preschoolers. These findings suggest that gesture-based technologies enhance vocabulary and syntax encoding, allowing EFL learners to retain language more effectively [71]. Additionally, embodied approaches show that using gestures to simulate motion verbs and spatial schemas enhances memory by forming sensorimotor connections to language. As learners engage with intensifiers and manner adverbs through actions, this practice strengthens recall via metaphoric and iconic representations [37, 40].
Spatial reasoning in EFL learning involves the capacity to comprehend and mentally navigate spatial relationships conveyed through language, including prepositions, directions, locatives, or spatial metaphors [63]. For learners whose first language organizes space differently, this skill can be challenging but is essential for understanding terms related to place and movement, navigating spatial discourse, and interpreting visual aids such as diagrams or maps in English materials [79]. Gestures are essential in cultivating this skill. Janzen-Ulbricht [31] demonstrated that teaching gestures focused on the morpheme level improved learners’ comprehension and application of spatial terms, especially during the initial stages, in contrast to gestures at the sentence level. Nakatsukasa [55] discovered that using movement-based expressions of spatial relations, like recasts for locative prepositions, enhanced the performance of EFL learners on spatial reasoning tasks. Gesture-based technology allows learners to engage with virtual objects and adjust spatial relationships, promoting a more intuitive and perceptual understanding of spatial reasoning. Recent evidence highlights how using gestures to embody spatial schematic information, both in motion and static contexts, enhances the understanding of complex relational concepts. Learners effectively simulate definite articles and ergative verbs through gestures, which helps them visualize and remember spatial configurations [38, 39]. For cohesion with the study’s Leap Motion tasks, which primarily involve manipulating virtual objects to enact prepositions (e.g., placing items ‘under’ or ‘between’ others), these activities ground literal spatial relations while facilitating extensions to spatial metaphors, such as conceptualizing abstract ideas like ‘time under pressure’ through similar embodied simulations [26].
Engagement in EFL classrooms involves emotional, behavioral, cognitive, and agentic aspects [45, 46]. Emotional engagement reflects how much learners enjoy and are interested in tasks, while behavioral engagement concerns active participation and sustained focus. Cognitive engagement involves investing mental effort, and agentic engagement involves making choices and taking initiative [3, 30]. High engagement is essential, as learners who are disengaged might miss important information, forget what they have learned, or withdraw from practice, ultimately affecting their learning outcomes. Gesture-based technology boosts engagement by transforming tasks into interactive and dynamic experiences. According to Gullberg [23], when gestures play a crucial role in learning tasks, learners show increased motivation and persistence. Likewise, gesture-interactive game-based methods enhance both performance and engagement, with learners indicating greater attention and enjoyment [22, 29]. These interactions enrich tasks, offer instant feedback through movement, and enable learners to take charge of their learning journey, promoting ongoing engagement.
Gesture-based educational technology provides valuable advantages for enhancing working memory capacity, improving spatial reasoning, and increasing engagement in EFL learning. Encouraging learners to physically enact vocabulary, prepositions, or grammatical structures helps to anchor abstract linguistic content in sensorimotor experiences. This approach strengthens encoding and supports recall [51, 77]. At the same time, using gestures to illustrate spatial relationships, like moving virtual objects to show location or direction, makes these ideas more concrete. This approach improves spatial reasoning skills, which can be challenging to cultivate through text or speech alone [51, 76]. Engagement increases with multimodal interactions, allowing learners to see and feel their movements, receive immediate visual or tactile feedback, and participate in active, playful tasks [52, 56, 68, 81]. These constructs likely reinforce one another. Higher engagement can increase time-on-task, and clearer spatial representations may reduce misunderstandings, which can ease processing demands during language use. The interconnectedness is enriched by embodied metaphor processing, where conditions that align gestures with meaning enhance the understanding of metaphor schemas. This suggests that the psychological processes involved in language acquisition utilize gestural simulations, which boost both cognitive and emotional engagement [4, 37, 39].
While research increasingly highlights the promise of gesture-based learning, notable gaps remain in the current literature. Yuan et al. [82] offered important insights into the physiological aspects of memory and attention; however, their research did not explore the effects of gesture-based interaction on spatial reasoning or ongoing learner engagement. While the use of physiological data provided valuable insights into cognitive states, the study fell short in thoroughly examining learners’ subjective experiences and the application of learning beyond the immediate context. Rau and Beier [64] effectively demonstrated the benefits of representational gestures within STEM disciplines. However, their findings do not directly address linguistic contexts or the potential influence of such gestures on learner agency, affect, or spatial reasoning in non-technical domains like language acquisition. Hsiao et al.’s [29] research is important for early childhood education, but its focus on preschoolers restricts the applicability of its findings to older learners involved in more complex cognitive activities, including spatial transformation, grammatical manipulation, or reflective language use.
The studies mentioned, along with other recent works (e.g., [4, 34, 37, 38, 51]), indicate growing interest in gesture-based approaches, yet current research has not sufficiently explored their implementation within the unique sociocultural context of EFL learners in regions like Ethiopia. These learners frequently face distinct challenges due to limited exposure to English outside formal settings, the prevalence of teacher-centered instructional methods, restricted access to advanced digital learning environments, and cultural-linguistic differences that influence spatial and metaphorical language processing. While broader studies on embodied cognition and gestures exist, the specific overlap between gesture-based learning and EFL pedagogy in non-Western, resource-constrained contexts like Ethiopia remains under-researched, even as global education systems shift toward more digital, learner-centered approaches. This study seeks to fill these targeted gaps by investigating how gesture-based educational technology influences working memory capacity, spatial reasoning, and learner engagement in Ethiopian EFL classrooms, while also exploring learners’ perceptions to provide a contextually grounded assessment of its educational value in language acquisition. The study is guided by the following research questions, which are addressed through an experimental approach and intended to extend the current body of literature:
Research Question 1: Does the integration of gesture-based educational technology result in improved working memory capacity in EFL learners?
Research Question 2: Does the integration of gesture-based educational technology lead to improved spatial reasoning abilities of EFL learners?
Research Question 3: Does the integration of gesture-based educational technology result in enhanced engagement in EFL classrooms?
Research Question 4: What are EFL learners’ perceptions of gesture-based technology as a language learning tool, including its perceived benefits, challenges, and potential for future use?
Method and measures
Research design
The authors utilized a convergent parallel mixed-methods design to achieve the research objectives. This design, which entails the simultaneous collection and analysis of both quantitative and qualitative data, integrates the findings to offer a more comprehensive account. It is well-recognized in the field of educational research [14]. The authors chose this design because it allows quantitative results from a pre-test–post-test control-group design, which compares an experimental group (EG) that uses gesture-based technology with a control group (CG) that receives traditional instruction, to be triangulated with qualitative insights gathered from open-ended survey responses, an approach also used in applied linguistics research [17]. To isolate the intervention’s effects, the authors kept instructional content, teacher delivery, and classroom environments consistent across both groups, allowing only the instructional mode to vary. Additionally, the qualitative data gathered concurrently from the EG enhanced the understanding of the statistical findings by revealing learners’ perceptions and emotional reactions, offering a comprehensive perspective on the intervention’s effects.
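The logic of the pre-test–post-test control-group comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' analysis: the section does not name the statistical test used, the function names `gain_scores` and `cohens_d` are hypothetical, and the scores are invented toy data.

```python
from math import sqrt
from statistics import mean, stdev


def gain_scores(pre, post):
    """Return per-learner gains (post minus pre) for paired score lists."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per learner")
    return [after - before for before, after in zip(pre, post)]


def cohens_d(group_a, group_b):
    """Return Cohen's d for two independent groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented toy scores, for illustration only (not study data).
eg_gains = gain_scores(pre=[10, 12, 9, 11], post=[15, 16, 13, 17])
cg_gains = gain_scores(pre=[10, 11, 9, 12], post=[12, 12, 10, 15])
print(eg_gains, cg_gains, round(cohens_d(eg_gains, cg_gains), 2))
```

Comparing gains rather than raw post-test scores is one simple way to account for baseline differences between intact classes; an analysis of covariance with the pre-test as covariate is a common alternative.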
Participants
Participants were 143 Grade 12 students from four public high schools situated in an urban district in Ethiopia. The participants, aged 17 to 18, exhibited a lower-intermediate level of English proficiency, aligning with the B1 level as outlined by the Common European Framework of Reference for Languages. The proficiency level was evaluated by reviewing their past academic records and confirming results through placement tests conducted by the schools. A preliminary questionnaire was conducted to identify potential confounding factors related to prior technology experience. The results indicated that while most students had regular access to smartphones, their familiarity with interactive learning technologies, such as motion-sensing devices, was limited, with fewer than 20% reporting any prior use. This suggests that prior experience with interactive learning technologies was unlikely to confound intervention effects.
Participants were divided into two groups: EG, which included 74 students, and CG, made up of 69 students. Given the practical constraints of school scheduling, assignments were made at the class level, leveraging the current classroom frameworks. As a result, the study utilized a quasi-experimental design instead of a fully randomized method. Gender representation was similar across groups (EG: 38 female, 36 male; CG: 35 female, 34 male). To maintain uniformity in teaching, a single qualified EFL teacher instructed all students during the 6-week intervention period. EG participated in learning activities that involved gestures using Leap Motion, whereas CG experienced conventional instruction. This study was categorized as minimal-risk educational research and did not necessitate any procedures that would require formal institutional approval according to national regulations. Nevertheless, we adhered closely to ethical safeguards. All student participants and their legal guardians provided informed consent after receiving a complete explanation of the study’s aims and procedures. Participation was completely voluntary, and participants were informed of their right to withdraw at any point without any repercussions. We ensured that confidentiality and anonymity were rigorously upheld, with all data being securely managed and utilized exclusively for research purposes. While convenience sampling was used, efforts were made to ensure that the two groups were comparable with respect to age, baseline proficiency, and prior technology exposure. All procedures were conducted in accordance with relevant international ethical standards for social science and educational research.
Instruments
The authors chose and used a set of validated tools to collect the necessary data. The Oxford Quick Placement Test (OQPT), created by Oxford University Press, evaluated general proficiency in the English language. This 60-item assessment measures essential areas such as grammar, vocabulary, and reading comprehension. Numerous validation studies have established its robust construct validity, showing significant correlations with standardized tests such as TOEFL and IELTS, which highlights its effectiveness in measuring language skills accurately. The streamlined design enhances efficiency in both research and educational environments. This study found that the OQPT had a Cronbach’s alpha of 0.93, demonstrating strong internal consistency and reliability among participants.
The authors developed a composite questionnaire to evaluate student engagement, integrating Reeve’s [65] agentic engagement scale with the Student Engagement in Schools Questionnaire (SESQ) by Hart et al. [25]. This 14-item tool assessed four key dimensions: emotional, behavioral, cognitive, and agentic engagement. Participants evaluated each item using a 5-point Likert scale, where 1 indicated strong disagreement and 5 indicated strong agreement. The SESQ, which has been validated in educational contexts, effectively captures emotional responses to learning. Its integration with Reeve’s scale offers a thorough assessment of engagement in EFL settings. The adapted engagement questionnaire’s validity was assessed by a panel of five experts, including two applied linguists specializing in learner engagement, two seasoned EFL secondary teachers, and one educational psychologist. They evaluated each item for relevance, clarity, and suitability for the EFL classroom, yielding a content validity index (CVI) of 0.94. The authors carried out a pilot test involving 10 comparable students to ensure the clarity, cultural relevance, and reliability of the modified engagement questionnaire. To ensure EFL participants could easily understand the questionnaire, it was translated into their first language (L1) through a forward–backward translation method. Two bilingual translators produced independent forward translations, discrepancies were reconciled, and a third bilingual translator conducted back-translation, followed by committee review. After finalizing the translated version, the pilot confirmed the instrument’s reliability in the EFL context, yielding a Cronbach’s alpha of 0.87. This alpha reflects the final administered version after refinement.
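As a rough illustration of the internal-consistency figures reported for the instruments, Cronbach's alpha can be computed from item-level responses as follows. This is a minimal sketch, not the authors' analysis script: the function name `cronbach_alpha` and the toy Likert data are hypothetical, and the source does not state which software was used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a respondents-by-items score matrix.

    responses: list of rows, one per respondent, each holding k item scores.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same k items")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses (4 respondents x 3 items).
toy_responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
]
print(round(cronbach_alpha(toy_responses), 2))
```

Alpha approaches 1 when items rise and fall together across respondents, which is why values such as 0.87 are read as evidence that the items measure a common construct.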
The Spatial Reasoning Inventory (SRI) developed by Ramful et al. [63] was utilized to assess spatial reasoning. This 30-item paper-and-pencil test evaluates three essential constructs: mental rotation, spatial orientation, and spatial visualization, all of which are closely associated with cognitive performance in educational psychology. Originally designed for middle school students in mathematical and spatial contexts, the SRI was adapted for this study’s Grade 12 EFL participants (aged 17–18) to measure language-related spatial skills, such as understanding prepositions and directions. Validation studies demonstrate the psychometric robustness of the SRI, showing significant correlations with established measures of spatial ability, which makes it appropriate for adolescent learners. The SRI’s validity was evaluated through expert judgment by a panel of four specialists—two educational psychologists with a focus on spatial cognition and two EFL curriculum experts. They assessed the items for their conceptual fit and relevance to language-related spatial tasks, resulting in a Content Validity Index (CVI) of 0.92. To address potential age-related biases, the items were selected and reviewed to ensure developmental appropriateness for older adolescents, emphasizing abstract spatial manipulation relevant to EFL. Additionally, cultural biases were mitigated through the adaptation process, including pretesting with Ethiopian students to confirm item clarity and relevance in a non-Western educational context, where spatial concepts may differ due to linguistic and environmental influences. Responses were scored based on correctness, with unanswered questions receiving a score of zero. The items were selected from the original validation pool to align with the Grade 12 curriculum. 
To adapt to the EFL context, the authors translated the SRI into the participants’ L1 using a careful approach: forward translation by subject experts, back-translation, and a committee review for conceptual accuracy, followed by pretesting to confirm comprehension. The SRI demonstrated a Cronbach’s alpha of 0.81, indicating its reliability in assessing spatial skills affected by the intervention.
The Working Memory Test Battery for Children (WMTB-C), developed by Pickering and Gathercole [61], evaluated the working memory capacity of EFL learners. This battery has been effectively adapted for adolescents in language studies, focusing on verbal short-term memory, visuospatial short-term memory, and working memory capacity. Participants remembered sequences of digits, words, or patterns after brief delays, achieving scores between 0 and 20 depending on their correct responses. Higher scores reflected a greater capacity. The validation of the WMTB-C in second language research highlights its effectiveness in examining the role of working memory in the learning process. The adapted WMTB-C’s validity was assessed through expert judgment, with three cognitive psychologists specializing in L2 memory evaluation reviewing the subtest suitability and cultural neutrality for the participant group, resulting in a CVI of 0.92. The authors translated the battery into participants’ first language through a forward–backward translation process, engaging bilingual psychologists for accuracy. This was followed by a reconciliation phase and a pilot study to guarantee cultural neutrality and clarity. The WMTB-C demonstrated a reliability coefficient of 0.85, indicating its trustworthiness for group comparisons.
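The content validity indices reported for the three instruments could be computed along the following lines. The source reports CVI values but not how they were derived, so this sketch assumes a common convention: each expert rates every item on a 4-point relevance scale, the item-level index (I-CVI) is the proportion of experts giving a rating of 3 or 4, and the scale-level index is the average of the I-CVIs (S-CVI/Ave). The function names and ratings are hypothetical.

```python
def item_cvi(ratings, threshold=3):
    """Return the proportion of experts rating an item at or above threshold."""
    if not ratings:
        raise ValueError("an item needs at least one expert rating")
    return sum(1 for rating in ratings if rating >= threshold) / len(ratings)


def scale_cvi_ave(rating_matrix, threshold=3):
    """Return S-CVI/Ave: the mean of the item-level CVIs.

    rating_matrix: list of items, each a list of expert ratings (1 to 4).
    """
    item_values = [item_cvi(ratings, threshold) for ratings in rating_matrix]
    return sum(item_values) / len(item_values)


# Hypothetical ratings from three experts on four items (assumed 1-4 scale).
toy_ratings = [
    [4, 4, 3],
    [4, 3, 4],
    [3, 2, 4],
    [4, 4, 4],
]
print(round(scale_cvi_ave(toy_ratings), 2))
```

With only three raters, each item-level value can take just four values (0, 1/3, 2/3, or 1), so a single dissenting rating on one item noticeably lowers the scale-level average.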
The authors conducted an open-ended survey with EG to gather qualitative insights about their experiences with gesture-based technology. This approach enabled participants to share their insights openly, yielding valuable information on perceived effectiveness, usability, motivational effects, and obstacles encountered. Questions prompted reflections on the emotional and cognitive impacts, including increased confidence and a deeper connection to learning materials, consistent with methodologies in technology-enhanced language studies. The survey was given in participants’ first language to encourage genuine responses. Educators reviewed it for clarity and relevance, ensuring it aligned with quantitative findings and provided a richer insight into the practical implications of the intervention.
Data collection procedures
Data collection occurred over a six-week instructional period, employing a mixed-methods pre-test–treatment–post-test design to investigate the impact of gesture-based technology on EFL learners’ memory retention, spatial reasoning, and engagement. Every participant, irrespective of their group assignment, underwent the same assessments at both the start and conclusion of the intervention. This approach was essential for establishing baseline performance and measuring changes over time. Furthermore, qualitative data were gathered solely from EG following the completion of instruction, providing a richer understanding of their experiences, perceptions, and evolving insights. This organized method made it possible to attribute the differences seen between groups mainly to the type of instructional medium, rather than to changes in content, timing, or assessment conditions. For better clarity and organization in presenting the steps of data collection, the procedures are detailed in Table 1.
Table 1.
Overview of data collection procedures
| Phase | Key activities | Participants | Instruments | Conditions/controls |
|---|---|---|---|---|
| Pre-Testing (Week 1) | Assess memory, spatial reasoning, and engagement with standardized instruments | All (EG: 74; CG: 69) | WMTB-C (memory); SRI (spatial reasoning); Adapted engagement questionnaire | Computer lab; uniform instructions; no feedback; trained assistants |
| Intervention (Weeks 1–6) | Deliver identical content (Vision Book 3) in two 90-min lessons/week; EG uses Leap Motion for gesture tasks; CG uses traditional methods. Control group activities included teacher-led explanation, printed flashcards, choral repetition, grammar worksheets, guided sentence-construction and dialogue practice, drawing tasks based on spatial descriptions, and feedback delivered verbally and through written annotations, without gesture recognition or digital feedback | All (EG and CG) | EG: Gesture-based tasks (vocabulary enactment, categorization, sentence building, prepositions, pair activities, dialogues, reflections). CG: Flashcards, worksheets, oral practice, written exercises | Same teacher, pacing, objectives; EG: embodied interaction with feedback; CG: text-based, no movement |
| Post-Testing (After Week 6) | Repeat pre-test instruments; administer open-ended survey to EG | All for quantitative; EG only for qualitative | WMTB-C; SRI; Engagement questionnaire; Open-ended survey (5 questions) | Same lab, proctors, timing; controls ensure attribution to intervention |
Pre-testing phase
During the initial week of the study, all 143 Grade 12 students participated in three standardized assessments, conducted under uniform and controlled conditions. The WMTB-C was utilized to evaluate memory, focusing on both verbal and visuospatial working memory. This assessment involves tasks that require participants to recall sequences and patterns. SRI was used to assess spatial reasoning, focusing on mental rotation and the interpretation of spatial relationships between objects. This ability is crucial for grasping prepositional language within its contextual framework. The adapted engagement questionnaire was used to measure learner engagement, showing a Cronbach’s alpha of 0.87 in a post-translation pilot test involving 10 students, thereby confirming its reliability for the main study. All assessments took place in the school’s computer lab during standard class hours. Research assistants delivered consistent oral instructions, kept an eye on student engagement, and made sure that no feedback or help was offered during the testing process. The implementation of these procedures aimed to maintain the integrity of baseline measurements and reduce variability among participants.
Instructional intervention
This study primarily aimed to compare two different teaching methods implemented over a six-week period, featuring two 90-min lessons each week. Both groups were exposed to the same curriculum content from Vision Book 3 and were instructed by the same teacher, following the same pacing and learning objectives. The sole distinction was in the presentation of the material. EG interacted with virtual environments using hand gestures monitored by the Leap Motion sensor, whereas CG took part in conventional, text-based tasks without any digital or physical engagement. Across the intervention, the experimental tasks were sequenced to address three aligned instructional goals, namely enhancing lexical memory, developing syntactic awareness, and improving spatial understanding, so that learners repeatedly revisited target forms through complementary modes of enactment, assembly, and manipulation.
Experimental Group: embodied learning through gesture-based technology
Instruction for EG was based on theories of embodied cognition, dual coding, and active learning. The frameworks indicate that language learning is more effective when abstract linguistic forms are linked to physical movement and sensory experiences. Every lesson was thoughtfully crafted to engage students in physically embodying vocabulary, grammar, and spatial relationships through intuitive hand movements, while immediate digital feedback provided reinforcement for correct usage. To strengthen coherence across tasks, each activity was explicitly aligned with one or more of the three instructional goals stated in the lesson overview, so that the rationale for each step remained visible in the instructional sequence.
The initial lesson emphasized enhancing lexical memory, developing syntactic awareness, and improving spatial understanding via a variety of interactive tasks. Students started by linking target verbs like fly, run, sleep, and write with gestures that are culturally relevant. To illustrate the concept of flying, they extended their arms and moved them up and down. They tilted their head down, mimicking the posture of someone resting on a pillow to symbolize sleep. When a gesture aligned with the system’s predefined model, an animation was triggered that depicted the corresponding action, such as a bird taking flight or a character closing their eyes while the word was articulated in a natural tone. When the gesture was not accurate, there was no response, prompting students to self-correct independently without external cues. This process strengthened motor memory and connected physical movement to meaning, enhancing the encoding of vocabulary. In this opening phase, lexical memory was foregrounded through repeated form-meaning-action coupling, which ensured that each new verb was introduced as an enactable concept rather than as a decontextualized label.
To consolidate lexical memory through semantic organization, students then organized 30 virtual objects, including a dog, pencil, apple, and chair, into thematic categories labeled Animals, Classroom Objects, Food, and Furniture. Through pinching and dragging motions, they sorted each object into its correct category. The system necessitated careful positioning. An object needed to stay within the specified area for a complete second to be considered accepted. This requirement discouraged random clicking and encouraged thoughtful decision-making, aiding students in developing stable mental categories grounded in semantic features rather than superficial associations.
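The one-second placement requirement described above can be sketched as a simple dwell-time check. This is a minimal illustration, not the software used in the study: the `Zone` class, the `is_accepted` function, and the `(timestamp, x, y)` sample format are hypothetical, and no Leap Motion API is involved.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Axis-aligned rectangle standing in for one category area (hypothetical)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def is_accepted(samples, zone, dwell_seconds=1.0):
    """Return True once the object has stayed inside the zone for dwell_seconds.

    samples: iterable of (timestamp_in_seconds, x, y) tuples in time order.
    Leaving the zone resets the timer, so brief pass-throughs are not accepted.
    """
    entered_at = None
    for t, x, y in samples:
        if zone.contains(x, y):
            if entered_at is None:
                entered_at = t
            if t - entered_at >= dwell_seconds:
                return True
        else:
            entered_at = None
    return False


animals = Zone(0, 0, 10, 10)
held = [(0.0, 5, 5), (0.5, 5, 6), (1.0, 6, 6)]
passed_through = [(0.0, 5, 5), (0.5, 20, 5), (1.0, 5, 5), (1.4, 5, 5)]
print(is_accepted(held, animals), is_accepted(passed_through, animals))
```

Resetting the timer whenever the object leaves the area is what discourages rapid, unconsidered placements in this sketch: only a deliberate hold satisfies the rule.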
To develop syntactic awareness through structured sentence assembly, students constructed simple subject-verb-object sentences by manipulating floating word icons on a digital canvas. To construct a sentence like “The dog runs under the table”, they physically moved each word into place, adhering to a left-to-right order. The system prevented progression until the sentence was grammatically accurate. Misplaced words activated a visual cue, a gentle red glow encircling the error, encouraging correction while maintaining the continuity of the task. Correct constructions were indicated with a green highlight and accompanied by an audio playback of the sentence spoken naturally by a native speaker. This design positioned word order as an explicitly monitorable problem, so that syntactic form was practiced as a meaningful constraint on successful completion, not as an isolated rule statement. This practical method redefined syntax, turning it from a collection of theoretical rules into a spatial challenge that can be addressed through movement.
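The word-order feedback described above can be approximated by comparing the learner's arrangement with a target sentence and flagging the slots that differ. The function names and the exact-match rule are assumptions for illustration; the study does not describe how the system judged grammatical accuracy, and a real system might accept more than one correct order.

```python
def misplaced_positions(arranged, target):
    """Return the indices where the learner's word order differs from the target."""
    return [i for i, (got, want) in enumerate(zip(arranged, target)) if got != want]


def check_sentence(arranged, target):
    """Return ('green', []) for a correct sentence, else ('red', slots_to_highlight)."""
    if len(arranged) != len(target):
        # Incomplete or overlong arrangement: flag every target slot.
        return "red", list(range(len(target)))
    errors = misplaced_positions(arranged, target)
    return ("green", []) if not errors else ("red", errors)


target = "The dog runs under the table".split()
attempt = "The dog under runs the table".split()
print(check_sentence(attempt, target))
print(check_sentence(target, target))
```

Returning the positions of the misplaced words, not only a pass or fail signal, mirrors the localized red glow: the learner sees where the problem is while the rest of the sentence stays in place.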
To improve spatial understanding and strengthen the form-meaning mapping for prepositions, a third component of the lesson focused on mastering spatial prepositions through hands-on activities. Students received instructions like “Place the book on the desk” or “Put the cup between the two chairs.” They employed particular hand movements to accomplish these tasks. Students used a downward motion for below, a lateral motion for beside, an upward motion for above, and a circular placement around paired objects for between. The Leap Motion system effectively monitored the path and angle of every gesture, while gentle visual prompts assisted in fine-tuning the actions. For instance, when a student raised their hand too high while positioning an object beneath a chair, the target object shimmered subtly, suggesting a need for adjustment. This non-verbal feedback directed attention to how movement expressed meaning, which strengthened learners’ interpretation of spatial language while keeping the interaction uninterrupted. During the lesson, feedback was offered in real time through various channels. The device provided visual cues, auditory feedback, and subtle vibrations to confirm that the gesture had been recognized successfully. Following each task, students received a tailored progress summary that highlighted their accuracy rates, average response times, and specific areas for improvement. This openness fostered self-regulated learning, enabling students to track their own development and take charge of their advancement.
The second lesson expanded on these foundations by incorporating collaborative, dialogic, and communicative tasks that demanded ongoing linguistic production. While the first lesson established the three goals through tightly guided enactment and construction, the second lesson maintained the same targets in interactional settings that required lexical retrieval, syntactic negotiation, and continued attention to spatial meaning within communicative flow. In one activity, students collaborated in pairs, with each taking on a specific role. One served as the Noun Driver, tasked with selecting nouns and adjectives, while the other functioned as the Verb Driver, responsible for choosing verbs and sentence modifiers. They constructed sentences like “The tall student draws a blue picture”. Both partners needed to reach a mutual agreement before moving forward with the final arrangement. The necessity for negotiation, explanation, and justification fostered peer-mediated learning and enhanced syntactic awareness through social interaction.
To reinforce lexical memory with precision in form-meaning-action links, another task resembled a game of charades, enhanced by technological support. A student made a gesture for a specific word, while their partner guessed the word out loud. In contrast to conventional approaches, this system examined the kinematics of the gesture and matched it against a database containing five hundred recorded examples. When the gesture lacked clarity, such as a vague waving motion intended to signify a wave, the system provided subtle suggestions to assist in refining the movement. This process tightened the alignment between perception, production, and interpretation, which made lexical meanings more concrete and more readily retrievable.
To integrate syntactic production with situated, meaning-oriented use, students participated in brief, scripted conversations with a virtual avatar called Leo. For instance, a student might point to a virtual apple and ask, “How much is this?” Leo replied, “Three dollars,” and the student pretended to hand over the money while saying, “Here you go”. The conversation progressed only when both the spoken words and the related gestures were properly performed. When a student made a mistake, Leo repeated his line slowly and clearly, allowing the student to try again without feeling embarrassed. Across four sessions, students engaged in eight distinct dialogues that addressed everyday scenarios such as shopping, asking for directions, and ordering food, resulting in over twelve minutes of uninterrupted speaking practice within relevant contexts.
To extend syntactic awareness beyond fixed templates, the system sometimes displayed sentences that were incorrectly structured, such as “Book the on table is.” Students were invited to organize the word tiles into the correct sequence through gestures. To achieve success, they needed to leverage their understanding of sentence structure and remember how similar phrases had been formed previously. This task strengthened schema development by linking procedural memory to abstract grammatical rules. At the end of each session, students took five minutes to capture short voice reflections using a tablet app. Participants were asked to share their observations regarding their learning experiences. One student shared, “I remembered under because I had to bend low to put the bag there.” Another remarked that they didn’t understand that open implied pulling rather than pushing until they attempted the gesture. These reflections offered a brief, learner-centered account of how the activities supported vocabulary recall and grammatical understanding, complementing the instructional emphasis on lexical memory and syntactic awareness while documenting learners’ perceived mechanisms of change. The spontaneous reflections were transcribed and subsequently analyzed thematically, offering direct evidence of the influence of embodied experience on conceptual understanding.
Control Group: traditional text-based instruction
CG was taught using traditional classroom methods that matched the same learning goals. Vocabulary was presented using printed flashcards, explanations from the teacher, and choral repetition. Students engaged in various written exercises, including matching words to pictures, filling in sentence blanks, and illustrating scenes based on spatial descriptions. For instance, they were requested to depict a book resting on the table and a bag positioned beneath the chair. The tasks depended on visual representation and written output instead of physical interaction.
During the second lesson, students worked on constructing sentences with word lists, filled in grammar worksheets, and participated in peer-editing activities to spot and correct errors in one another’s writing. Oral practice included answering teacher prompts like “Use the word run in a sentence,” followed by responses from individuals or groups. Feedback was given either verbally or through written annotations on paper. These activities emphasized visual and verbal practice and did not include gesture recognition or system-generated feedback.
The key difference between the two groups was not just technological; it was also cognitive. EG engaged in language learning by connecting linguistic forms with physical actions and reactions to their surroundings. CG acquired language by engaging in visual and verbal repetition, utilizing explanations and written exercises. This contrast allowed for a clear exploration of whether embodied interaction improves learning outcomes compared to traditional methods.
Post-testing phase
After the intervention, all participants returned to the computer lab to take the same three assessments administered at pre-test: the WMTB-C, the SRI, and the adapted engagement questionnaire. The testing conditions did not change: to ensure comparability, we maintained the same room, proctors, timing, and instructions. An additional open-ended survey was administered exclusively to EG. It comprised five reflective questions aimed at gathering insights from learners:
What activities assisted you in recalling new words, and what was the reason behind their effectiveness?
Did using hand gestures enhance your grasp of grammar? Could you provide an explanation, please?
What did you find to be the most difficult aspect of incorporating gestures into your lessons?
Were you more motivated to engage in this activity compared to your usual English classes? What are the reasons for or against this?
Are you interested in utilizing this method once more? What changes would you consider making?
We recorded responses digitally, transcribed them verbatim, and analyzed them thematically to identify recurring themes related to motivation, autonomy, conceptual clarity, and technical barriers. The qualitative findings complemented the quantitative results by suggesting mechanisms behind the observed gains. Because content, teacher, pacing, setting, and testing conditions were held constant, differences in working memory capacity, spatial reasoning, or engagement can be attributed primarily to the instructional medium, that is, to the use of gesture-based technology.
Data analysis
The analysis combined quantitative and qualitative methods. Quantitative data from the validated instruments were analyzed using IBM SPSS Statistics (Version 26). Descriptive statistics (means and standard deviations) were calculated, followed by inferential tests. Independent-samples t-tests compared EG and CG at pretest and posttest, and paired-samples t-tests assessed pretest–posttest change within each group.
The qualitative data were analyzed through thematic analysis because of its flexibility in identifying, analyzing, and reporting patterns within narrative data (e.g., [8]). Thematic analysis is a structured approach that involves coding and categorizing qualitative data to reveal significant themes. This method allowed the authors to examine participants’ experiences with gesture-based technology in depth. The analysis followed a six-step process: (1) immersing in the data through multiple readings, (2) creating initial codes to identify recurring concepts, (3) organizing related codes into potential themes, (4) assessing themes for coherence and relevance, (5) defining and naming themes to highlight their importance, and (6) crafting a narrative report that combined the themes with quantitative results. The identified themes were enhanced motivation, improved conceptual understanding, learner agency, and technical challenges. To enhance reliability and validity, the authors employed investigator triangulation: two experts independently coded the data and resolved discrepancies through discussion, resulting in 90% intercoder agreement. Furthermore, the authors engaged in member checking by presenting preliminary findings to a select group of participants.
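To make the intercoder figure concrete, the sketch below computes simple percent agreement between two coders and, as a chance-corrected complement, Cohen's kappa. The code labels and the ten coded segments are hypothetical, and kappa is not reported in the study; it is shown only as an optional check that is often read alongside raw agreement.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of segments to which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for ten response segments; the coders disagree on one segment.
coder_a = ["motivation", "motivation", "understanding", "agency", "technical",
           "motivation", "understanding", "agency", "technical", "understanding"]
coder_b = ["motivation", "motivation", "understanding", "motivation", "technical",
           "motivation", "understanding", "agency", "technical", "understanding"]

print(percent_agreement(coder_a, coder_b), round(cohens_kappa(coder_a, coder_b), 3))
```

Both functions assume the two lists are the same length and aligned segment by segment; kappa is undefined when the expected agreement equals 1, for example when both coders use a single code throughout.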
Results
Quantitative findings
This section reports the pre-test and post-test results on how gesture-based educational technology influenced working memory capacity, spatial reasoning, and engagement. The analysis involved descriptive statistics, independent-samples t-tests, and effect size calculations with Cohen’s d. Before running the t-tests, we checked the assumptions for parametric testing. Shapiro–Wilk tests indicated no significant departure from normality, with p-values exceeding 0.05 for all groups. Levene’s tests supported homogeneity of variances, with p-values exceeding 0.05 for all comparisons except posttest spatial reasoning, where the test was significant (Table 4). With that exception noted, the checks indicate that the parametric methods employed were suitable.
We conducted reliability analyses to confirm the psychometric strength of the measurement instruments. Table 2 shows the Cronbach’s alpha values for each instrument, which range from 0.81 to 0.93, reflecting acceptable to excellent internal consistency. The high reliability scores indicate that the instruments effectively measured the intended constructs, establishing a dependable basis for examining the intervention’s effects.
Table 2.
Reliability analyses of the instruments
| Instrument | Reliability measure | Cronbach's alpha | Comments |
|---|---|---|---|
| OQPT | Reliability Coefficient | 0.93 | High reliability coefficient, indicating strong consistency in measuring overall English proficiency |
| SESQ | Cronbach's Alpha | 0.82 | Reliability coefficient indicates good internal consistency of the engagement measure |
| SRI | Cronbach's Alpha | 0.81 | Consistent measurement of spatial reasoning constructs |
| WMTB-C | Cronbach's Alpha | 0.85 | Strong internal consistency in assessing verbal and spatial memory |
OQPT Oxford Quick Placement Test, SESQ Student Engagement in School Scale Questionnaire, SRI Spatial Reasoning Inventory, WMTB-C Working Memory Test Battery for Children
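For readers unfamiliar with the coefficient reported in Table 2, the following minimal sketch shows how Cronbach's alpha is computed from item-level responses. The response matrix is hypothetical and far smaller than the study's data, which are not available here; the function name is likewise illustrative.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    items = list(zip(*item_scores))  # transpose: one tuple of scores per item
    sum_item_variances = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: five respondents by four Likert-type items.
responses = [
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 4, 4],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as an index of internal consistency.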
As shown in Table 3, EG and CG did not differ significantly at pretest, but EG outperformed CG at posttest on working memory capacity, with a large between-group effect (d = 0.99). EG showed a significant pretest-to-posttest improvement (mean score rising from 12.24 to 15.51), whereas CG improved more modestly (from 12.84 to 13.36), with a smaller effect size. These findings suggest that the gesture-based intervention enhanced working memory capacity and point to practical applications in educational environments.
Table 3.
Working memory results
| Outcome | Group | N | Mean | Std. Deviation | Std. Error Mean | Levene’s F | Levene’s Sig. | t | df | Sig. (2-tailed) | Cohen’s d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pre | EG | 74 | 12.24 | 2.05 | .23 | .76 | .38 | −1.780 | 141 | .07 | −0.30 |
| | CG | 69 | 12.84 | 1.95 | .23 | | | | | | |
| Post | EG | 74 | 15.51 | 2.09 | .24 | 1.06 | .30 | 5.869 | 141 | .00 | 0.99 |
| | CG | 69 | 13.36 | 2.28 | .27 | | | | | | |
EG Experimental Group, CG Control Group
Standard deviations are reported for both pretest and posttest scores. Cohen's d values are interpreted as small (0.2), medium (0.5), and large (0.8) effects
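The between-group statistics in Table 3 can be approximately reproduced from the reported means, standard deviations, and group sizes. The sketch below assumes the conventional pooled-standard-deviation formulas for Cohen's d and the equal-variance independent-samples t statistic; the function names are illustrative, and small discrepancies from the tabled values are expected because the inputs are rounded to two decimals.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference based on the pooled standard deviation."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent-samples t statistic (df = n1 + n2 - 2)."""
    return (m1 - m2) / (pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2))


# Posttest working memory from Table 3: (mean, SD, N) for EG, then CG.
eg = (15.51, 2.09, 74)
cg = (13.36, 2.28, 69)
print(round(cohens_d(*eg, *cg), 2), round(student_t(*eg, *cg), 2))
```

Run on the rounded summary values, the sketch gives a d and t close to the reported 0.99 and 5.869, which supports the internal consistency of the table.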
Table 4 shows a significant posttest advantage for EG in spatial reasoning, with a large effect size. CG demonstrated some improvement, albeit with a moderate effect size, indicating that the gesture-based approach was more effective. The findings highlight the promise of gesture-based technology in enhancing spatial reasoning skills, a factor that holds significant importance for fields like science, technology, engineering, and mathematics.
Table 4.
Spatial reasoning results
| Outcome | Group | N | Mean | Std. Deviation | Std. Error Mean | Levene’s F | Levene’s Sig. | t | df | Sig. (2-tailed) | Cohen’s d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pre | EG | 74 | 18.44 | 2.32 | .27 | .95 | .32 | .512 | 141 | .61 | 0.08 |
| | CG | 69 | 18.23 | 2.67 | .32 | | | | | | |
| Post | EG | 74 | 23.55 | 3.37 | .39 | 12.25 | .00 | 8.008 | 141 | .00 | 1.34 |
| | CG | 69 | 19.69 | 2.22 | .26 | | | | | | |
SR Spatial Reasoning, EG Experimental Group, CG Control Group
Standard deviations are reported for both pretest and posttest scores. Cohen's d values are interpreted as small (0.2), medium (0.5), and large (0.8) effects
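Because Table 4 reports a significant Levene's test for the posttest comparison (F = 12.25), a reader may wish to check whether the result is robust to unequal variances. The sketch below applies Welch's t-test to the summary statistics in Table 4. Welch's test is not reported in the study; it is introduced here only as an illustrative robustness check, and the function name is hypothetical.

```python
from math import sqrt


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Posttest spatial reasoning from Table 4: EG (23.55, 3.37, 74) vs CG (19.69, 2.22, 69).
t, df = welch_t(23.55, 3.37, 74, 19.69, 2.22, 69)
print(round(t, 2), round(df, 1))
```

On these values the Welch statistic stays close to the reported t of 8.008, with somewhat fewer degrees of freedom than 141, so the posttest advantage for EG does not appear to hinge on the equal-variance assumption.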
Table 5 indicates higher posttest engagement in EG than CG, with a large between-group effect. CG demonstrated some improvement, albeit with a moderate effect size, suggesting a less significant change. The findings indicate that gesture-based technology can significantly boost student engagement, presenting an effective method for maintaining interest in educational settings.
Table 5.
Engagement results
| Outcome | Group | N | Mean | Std. Deviation | Std. Error Mean | Levene’s F | Levene’s Sig. | t | df | Sig. (2-tailed) | Cohen’s d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pre | EG | 74 | 45.31 | 4.98 | .57 | .17 | .67 | .695 | 141 | .48 | 0.12 |
| | CG | 69 | 44.71 | 5.35 | .64 | | | | | | |
| Post | EG | 74 | 55.21 | 4.87 | .56 | .44 | .50 | 8.907 | 141 | .00 | 1.49 |
| | CG | 69 | 48.26 | 4.43 | .53 | | | | | | |
EG Experimental Group, CG Control Group
Standard deviations are reported for both pretest and posttest scores. Cohen's d values are interpreted as small (0.2), medium (0.5), and large (0.8) effects
Qualitative findings
The open-ended survey responses from EG yielded five main themes: perceived effectiveness, increased motivation, engagement and interaction, challenges, and growing confidence. Across responses, perceived effectiveness and increased motivation were mentioned most often, engagement and interaction were also frequently emphasized, challenges were raised by several students, and growing confidence was discussed by a smaller but notable subset. In line with Maxwell [53], cautious verbal quantifiers such as many, several and a smaller subset are used to signal prevalence without reducing the qualitative account to simple tallies.
Perceived Effectiveness: Many students described gesture-based activities as noticeably more memorable than conventional approaches, and they connected recall to bodily experience rather than repetition alone. A student shared, “When I bent down to put the book under the chair, I didn’t just hear ‘under.’ I felt it. I can't stop thinking about it.” Another noted, “Before, I always got ‘on’ and ‘under’ mixed up, but I could see and feel the difference after I moved the things myself. It stuck.” A third noted, “When I think of ‘fly,’ my arms still move a little. That helps me remember it more quickly.” Collectively, these statements suggest that learners experienced vocabulary and prepositions as actions they could reenact, not only as forms they could rehearse. This pattern aligns with embodied cognition accounts in which meaning and retention are strengthened when linguistic representations are grounded in sensorimotor experience rather than relying only on auditory or visual input [19]. Students were not simply memorizing words; they were encoding them through coordinated perception and movement. However, the WMTB-C did not test recall of the lesson-specific vocabulary, so these perceptions of improved recall could not be checked against a direct measure. Future work should add delayed vocabulary measures (e.g., picture-cued recognition).
Increased Motivation: Many students reported heightened motivation, and they attributed it to a sense of control and immediacy rather than to grades, praise, or other external incentives. One student explained, “I usually zone out in grammar class, but here I had to do something every second. It didn’t feel like work; it felt like a game.” Another remarked, “I didn’t realize how much I was learning until I looked back. The whole time, I was smiling.” A third added, “I didn’t think I liked tech class, but this made me want to go back. I wanted to try again the next day.” These comments portray motivation as emerging from autonomy, curiosity, and enjoyment, and they indicate that novelty alone did not fully account for engagement because students repeatedly framed the activity as meaningful and self-directed. In contrast to conventional lessons where learners often wait, follow rigid routines, or participate intermittently, the gesture-based tasks positioned students as active agents who continuously made decisions and observed the consequences of those decisions. This orientation reflects core propositions of self-determination theory, which emphasizes autonomy and competence as central antecedents of intrinsic motivation [72]. When learners perceive genuine control over the learning process and experience competence through successful interaction, they are more likely to persist and to invest effort in ways that deepen understanding.
Engagement and Interaction: A common theme was sustained attention, with students linking focus to the requirement for continuous embodied participation. One student noted, “I couldn’t zone out because I had too much to do. If I stopped moving, the screen would stop.” Another student remarked, “I never looked at my phone during class. I was too busy writing sentences with my hands.” A third observed, “I used to get bored after ten minutes. I didn’t even notice how long it had been today.” These accounts suggest that engagement was not limited to interest or enjoyment; it was enacted through ongoing involvement that demanded perception, decision-making, and action in rapid succession. Whereas many EFL classrooms rely heavily on listening, reading, and writing for extended periods, which can contribute to cognitive fatigue, the gesture-based environment distributed activity across multiple modalities and kept learners physically and mentally oriented to the task. This pattern is consistent with multimedia learning perspectives that emphasize how coordinated channels of processing can support sustained attention and reduce overload when learners interact meaningfully with content [54]. In this setting, learning became a dynamic experience in which language was enacted and monitored in real time, which offers a plausible explanation for why attention remained elevated throughout longer sessions.
Challenges: Several students reported initial technical difficulty and physical fatigue, and these challenges were most salient early in the intervention. One student recalled, “At first, I was angry. I moved my hand like the teacher told me to, but nothing happened. I thought the system wasn’t working.” Another described repetition and strain: “I told him to ‘jump’ five times. My arm hurt. I was ready to quit.” Although these frustrations were genuine, many of the students who mentioned challenges also described adapting over time, shifting from blaming the device to refining their own actions. As one learner reflected, “After two lessons, I knew how to move more slowly and clearly. It wasn't the machine; it was me.” This shift from external attribution to self-regulation is a meaningful qualitative marker because it suggests metacognitive adjustment rather than simple resignation. For a number of learners, the early obstacles became opportunities to reinterpret errors as part of learning, as skill and confidence increased through repeated practice. This trajectory resonates with accounts of productive struggle, in which temporary difficulty can contribute to deeper understanding when learners receive sufficient structure and support [7]. In this sense, initial barriers did not merely hinder learning; they also revealed the persistence through which students negotiated unfamiliar interaction demands and gradually converted difficulty into mastery.
Confidence and Comfort: A smaller but notable subset of learners foregrounded self-consciousness at the outset, especially the discomfort of moving visibly in front of peers in classroom cultures where stillness is often expected. One student admitted, “I thought it was dumb to wave my hands in front of everyone. I thought people would get a kick out of it.” Another learner said, “At first, I wasn’t sure.” These statements reflect common participation norms in many EFL classrooms, where physical expressiveness may be discouraged and learners can fear negative evaluation. Importantly, most students who raised this concern also described a gradual transition toward ease. One student noted, “I don’t even think about moving anymore. It just happens. It sounds like I’m using my hands to talk.” Another related the experience to habitual communication: “I realized that I’ve been using my hands to explain things since I was a kid. This just made it official.” This progression from embarrassment to comfort suggests that the intervention supported more than vocabulary instruction because it legitimized gesture as part of language use and allowed learners to experience agency through embodied expression. Over time, students increasingly framed themselves as active participants who could influence their learning through physical engagement, which indicates growth in both communicative confidence and classroom participation.
Discussion
This research explored how gesture-based learning tools influence working memory capacity, spatial reasoning, and engagement in EFL students. Across the six-week intervention, EG showed consistently stronger post-intervention performance on the established measures of working memory capacity, spatial reasoning, and engagement than CG, which received conventional instruction. Open-ended responses also converged with these patterns, as many students described heightened motivation, clearer understanding of target concepts, and a stronger sense of ownership in learning, which together suggest that gesture-based technology can enrich both cognitive and affective dimensions of the EFL learning experience.
The findings support an increasing amount of research highlighting the advantages of embodied cognition and learning through gestures. Chen and Fang [11] illustrated how gesture-based technologies positively influence learning, emphasizing the crucial impact of physical activity on retention and understanding. Yuan et al. [82] demonstrated that gesture-based technology enhances memory training in adaptive learning environments, reinforcing the link between gesture use and improved working memory capacity in language learning settings. The noted enhancements in motivation and ongoing engagement align with the research by Shakroum et al. [75], which found that Gesture-Based Learning Systems significantly improve learning outcomes by fostering greater engagement and interactive learning settings. Taken together, these convergent findings strengthen the argument that integrating structured movement into language instruction can support both deeper processing and sustained participation, especially when movement is meaning-aligned rather than incidental.
This study builds on previous research in several significant ways. First, it extends findings such as those of Hsiao et al. [29], which highlighted the advantages of gesture recognition for cognitive skills in younger learners. By examining Ethiopian Grade 12 EFL learners, the present study extends this line of work to older students who engage in more complex cognitive activity, including rule-based manipulation, spatial transformation, and reflective language use in a high-stakes secondary-school setting. Second, unlike prior work that has focused mainly on STEM contexts (e.g., [64]), this study examines gesture-based technology in an EFL setting and its potential contribution to language learning. In doing so, it clarifies how embodied interaction can operate as a language-learning mechanism, not only as a representational aid within technical domains. The observed gains in spatial reasoning, together with increased engagement, add to the current body of research. They indicate that learning through gestures could be an especially effective method for fostering both the cognitive and the emotional skills essential for successful language acquisition. The findings align with recent studies by Dimitriadou and Lanitis [16] and Maruf et al. [51], which highlight the beneficial effects of body language and gesture-based feedback on engagement and interaction in language learning.
Quantitative improvements observed in EG’s working memory capacity can be interpreted through the perspective of embodied cognition. This framework suggests that linguistic knowledge gains stability when it is rooted in sensorimotor experience. Kogan et al. [41] suggest that language comprehension is rooted in lasting sensorimotor activations that support verbal learning. Relatedly, Willems et al. [80] highlight how the integration of gesture and speech enhances the construction of meaning. Consistent with this perspective, participants in EG engaged in physical actions by grasping, dragging, and positioning virtual objects to enact words. Such enactment may have produced stronger connections between linguistic forms and their meanings. Studies indicate that engaging in self-performed actions while learning, known as the enactment effect, significantly improves memory retention compared to more passive learning conditions [21]. The WMTB-C assessed overall working memory capacity, separate from the specific recall of learned items. For this reason, and consistent with the cautious interpretation stated in the qualitative findings, we cannot verify that the observed gains reflect direct improvements in vocabulary learning per se, even though many students perceived vocabulary as more memorable when it was enacted. One learner noted that they recalled the word “under” because they had to bend down to position the bag accordingly. These accounts suggest that gestures may serve as cognitive supports, allowing students to encode language through their lived experiences instead of solely through verbal repetition. Martin and Ellis [50] suggest that greater working memory capacity aids vocabulary retention in EFL settings by enabling learners to maintain and manipulate linguistic information during processing.
At the same time, the present mixed-methods evidence supports a strong inference regarding the instructional advantage of embodied interaction, but it does not justify attributing vocabulary-specific gains solely to gesture, because the WMTB-C is a domain-general index and other co-occurring features of the environment may have contributed. These nuances help preserve interpretive accuracy while still recognizing the robust pattern of improvement.
Improvements in spatial reasoning can be understood through the combined perspectives of embodied cognition and gesture-based theory. Hostetter and Alibali [27] characterize gesture as a simulated action that enhances conceptual understanding, while Kelly et al. [36] show that speech and gesture work together to aid in the comprehension of spatial relations. In the present study, students did not just study prepositions; they actively engaged with them. To place a virtual cup under a chair, one needed to execute a downward motion, whereas positioning an object between two others required lateral coordination. These embodied actions turned abstract relational concepts into salient learning experiences. Scores on the SRI provided the quantitative evidence of this improvement. Qualitative reflections suggested that this embodied manipulation often translated into greater confidence, with several students emphasizing that they understood spatial relations more clearly once they could perform them. This qualitative insight indicates that gesture connects perception and conceptualization. In contrast to traditional instruction, which depended on static diagrams and written prompts, active manipulation seems to encourage a more profound internalization of concepts. However, uncertainties persist regarding the applicability of these benefits to wider contexts of language acquisition. Moreover, the use of gestural embodiment to convey spatial schematic information and ergative verbs adds depth to this explanation. Simulations involving definite articles and space-based metaphors allow learners to practice mental manipulation of relationships through iconic and metaphoric gesture, which can strengthen spatial visualization in EFL tasks [38, 39].
Quantitative data indicated a significant rise in learner engagement, as evidenced by SESQ scores. Qualitative feedback distinctly emphasized aspects of motivation. Gesture-based approaches perceive movement as both a communicative tool and a method for shaping thought. Goldin-Meadow and Beilock [20] demonstrated that gesturing facilitates conceptual change, while Janzen-Ulbricht [32] has connected gesture to increased learning motivation through a neurocognitive lens. In this study, students reported feeling more involved, curious, and proud when participating in gesture-guided activities. Their feeling of control seemed to stem from the clear connection between their intentions and the results: as they moved their hands to create a correct sentence, the system reacted instantly, bolstering their confidence in their abilities. Many responses signaled sustained engagement, and a common thread was the sense that time passed quickly during tasks that required continuous perception, decision-making, and action. Statements like “I did not look at my phone once” and “I wanted to come back tomorrow” highlight how gesture-based interaction fosters intrinsic motivation by fulfilling the needs for autonomy and mastery. The initial excitement may have stemmed from novelty, but the consistent engagement over six weeks indicates that the intervention had a lasting impact on learners’ approaches to language learning. This pattern is also compatible with embodied accounts of L2 motivation, in which gestural representations of metaphor schemas and manner adverbs can intensify emotional investment through embodied simulations and encourage agentic engagement under supportive instructional conditions [4, 39, 40].
The compelling nature of these findings lies in their alignment with the fundamental principles of embodied cognition and gesture-based theory. Kogan et al. [41] argue that abstract linguistic knowledge becomes meaningful when it is connected to bodily experiences. In this study, learners engaged with vocabulary and grammar not merely as concepts, but as experiences through enactment. A student who physically moved the word dog into a category labeled animals or mimicked the action of jumping with their arms was not merely engaging in an activity; they were actively constructing meaning through sensorimotor involvement. Gesture-based tasks can create a perception–action loop: learners act on language input, receive immediate feedback, and refine form–meaning mappings over repeated trials [80]. The gesture-based theory provides additional insight into the enhancement of spatial reasoning. Prepositions like under and between serve not merely as abstract rules to memorize but as relational concepts that we actively engage with [11]. When learners intentionally placed objects in space, they began to grasp the logic of spatial terms through their actions rather than deducing them from unchanging examples. Qualitative accounts frequently echoed this mechanism, as several learners described a clearer internal picture of meaning when they could coordinate movement with language form. Qualitative reports, including students expressing that they could “see it in their hands,” demonstrate how gesture served as a medium for internal representation. Furthermore, the increased sense of agency and motivation might not have stemmed only from novelty or autonomy, but rather from the alignment between intention and outcome. The embodied approach fosters a seamless connection between thought, action, and language, in contrast to traditional instruction, which frequently results in learners feeling detached from the material.
The quantitative enhancements in working memory capacity, reasoning, and engagement suggest that gesture serves not merely as an auxiliary tool but as a crucial avenue for learning. At the same time, the interpretation remains intentionally cautious: the mixed-method pattern supports a strong instructional inference, but it does not warrant a claim that gesture alone directly caused vocabulary gains, particularly when the WMTB-C provides a general index rather than item-specific learning evidence. Meanwhile, the qualitative themes provide additional insights into perceived memorability and motivation, without asserting direct causation from unmeasured variables. The intriguing aspects of these findings are evident in the context of embodied metaphor perspectives. Here, gestural simulations of motion and spatial elements tackle acquisition challenges, merging iconic and metaphoric embodiments to foster cohesive cognitive and emotional development [38–40].
This study offers a theoretical insight by demonstrating the interplay between embodied cognition and gesture-based learning in the context of second language acquisition. Embodied cognition highlights that linguistic knowledge gains significance when it is rooted in bodily action [41]. Gesture-based theory emphasizes how movement contributes to the development of thought and the formation of conceptual understanding [20]. This study’s findings indicate a convergence of perspectives, showing that learners improved in memory retention, spatial reasoning, and motivation when language was engaged physically instead of being presented in a static format. In an Ethiopian Grade 12 EFL context, where opportunities for sustained, meaning-rich English use are often constrained, this convergence is especially consequential because it suggests that embodied interaction can function as a practical pathway for making language learnable, memorable, and engaging under resource-sensitive classroom conditions. This convergence suggests that bodily engagement is not just a supplementary tool but a fundamental avenue for learners to internalize language and create lasting representations of meaning [80]. This study challenges conventional models that regard linguistic knowledge as abstract and separate from action. It emphasizes the importance of gesture and bodily enactment as essential components of both cognitive and motivational aspects of learning. This contribution draws on recent embodied research concerning the gestural realization of ergative verbs, definite articles, and spatial metaphors. It highlights how psychological processes in L2 learning are enhanced by congruent gesture alignments, which aid in the understanding of comprehensive schemas [4, 37, 40].
Research conclusions
Across six weeks, EG outperformed CG on working memory capacity, spatial reasoning, and engagement. Participants’ reflections pointed to increased motivation, deeper conceptual understanding, and greater agency, with students linking their physical actions to language retention. The observed patterns highlight how physical enactment helps make abstract language tangible, creating lasting cognitive connections that go beyond simple memorization to a deeper, embodied understanding. This work contributes to the intersection of embodied cognition and gesture-based paradigms by clarifying how sensorimotor interaction can support language processing. It repositions gestures as crucial builders of meaning rather than mere supports, challenging traditional views that separate cognition from physical experience. This aligns with recent advancements that highlight the importance of movement in educational theory [15, 19, 20, 27].
These insights, drawn from an urban Ethiopian sample, highlight the need to integrate gesture-oriented strategies into EFL curricula to create engaging, learner-centered environments in similar contexts. Educators can plan short, structured “embodiment routines” that align specific language targets with observable classroom actions. For example, teachers can design brief preposition tasks in which students physically arrange familiar objects to model spatial relations, such as placing a book under a chair, positioning a pencil between two notebooks, or moving a cup on a desk, and then prompt learners to describe what they did using complete sentences that reuse the same forms (“The book is under the chair”). For verbs, instructors can use guided enactment in which students act out target actions and peers produce full sentences, followed by teacher modeling and brief corrective moves accompanied by meaning-aligned gesture, which is consistent with pedagogical-gesture accounts of form-focused interaction [52, 55]. This approach connects linguistic forms with real-life experiences, enhancing comprehension and persistence [36, 52, 55, 76]. Digital augmentations, such as responsive interfaces and personalized progress tracking, enhance these initiatives by providing consistent feedback and adaptive challenges. This is particularly advantageous in diverse classrooms where the level of individualized attention can differ [11, 28, 34, 44, 75, 81, 82]. To support scalable adoption, curriculum designers and institutional leaders can develop ready-to-use lesson modules that include a small, standardized gesture repertoire for high-frequency prepositions and classroom verbs, short teacher scripts for prompts and feedback, and simple rubrics that prioritize accuracy and intelligibility, alongside brief professional workshops that help teachers integrate movement without disrupting pacing. 
This approach will foster teaching environments that prioritize embodiment as a key element of successful language learning, with broader global applicability warranting additional studies across varied EFL settings.
While the study offers valuable insights, it faces several limitations. The participant group is limited to a single urban district in Ethiopia, which may hinder the ability to generalize findings to diverse populations and skill levels. Additionally, the reliance on one specific technology could blur the distinction between the advantages of gestural communication and the features of the device itself. The duration of the study is also insufficient to determine long-term effects on comprehensive language skills, such as productive fluency. Furthermore, there is a lack of focused assessments for instructed vocabulary, despite broader evaluations of memory [50], and the qualitative data collection appears to be skewed in favor of EG. Future endeavors should address these issues by implementing extended longitudinal designs with delayed evaluations to assess sustainability, exploring progressive gesture attenuation to understand cognitive internalization, utilizing collaborative narrative frameworks across various platforms for spontaneous output, conducting equitable thematic explorations that include all conditions, and evaluating cost-effective alternatives such as app-integrated tracking to enhance access in underserved areas. These efforts aim to enhance embodied approaches, transforming them from mere experimental novelties into essential components of global language education.
Acknowledgements
AI Use Statement
Artificial intelligence tools (ChatGPT and Grammarly) were utilized exclusively for surface-level linguistic editing. These tools did not contribute to the study’s conceptual framework, research design, data collection, statistical analysis, or interpretation of results. The authors take full responsibility for the integrity of the manuscript.
Authors’ contributions
AA conceived and designed the study, collected and analyzed the data, and drafted the initial version of the manuscript. WC contributed substantially to the revision process by strengthening the literature review, enriching the theoretical background, clarifying methodological details, and refining the discussion in response to reviewer feedback. Both authors critically reviewed, edited, and approved the final manuscript and agree to be accountable for all aspects of the work.
Funding
No financial support was received from any funding agency, whether public, private, or non-profit, for the conduct of this research.
Data availability
The data generated and analyzed during this study are available from the corresponding author upon reasonable request.
Declarations
Ethics approval and consent to participate
This study was classified as minimal-risk educational research conducted within the context of regular instructional activities. All participants were fully informed about the aims, procedures, and scope of the study prior to data collection. Participation was entirely voluntary, and written informed consent was obtained from all participants before their inclusion in the study. Participants were assured that confidentiality and anonymity would be strictly maintained and that all data would be used exclusively for research purposes. They were also informed of their right to withdraw from the study at any stage without any academic or personal consequences.
In accordance with the institutional and national regulations governing educational research in Ethiopia, formal ethical approval was not required for this type of minimal-risk study. The requirement for ethics review was therefore waived in line with the research ethics regulations of Jimma University, Jimma, Ethiopia, as the study did not involve medical or psychological intervention, vulnerable populations, or the collection of sensitive personal data. Nevertheless, all research procedures adhered to the ethical principles outlined in the Declaration of Helsinki (1964, as revised) and comparable international ethical guidelines.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Alibali MW, Heath DC, Myers HJ. Effects of visibility between speaker and listener on gesture production: some gestures are meant to be seen. J Mem Lang. 2001;44(2):169–88. 10.1006/jmla.2000.2752.
- 2.Alzubi T, Fernández R, Flores J, Duran M, Cotos JM. Improving the working memory during early childhood education through the use of an interactive gesture game-based learning approach. IEEE Access. 2018;6:53998–4009. 10.1109/ACCESS.2018.2870575.
- 3.Aubrey S. Enhancing long-term learner engagement through project-based learning. ELT J. 2022;76(4):441–51. 10.1093/elt/ccab032.
- 4.Banaruee H, Khatin-Zadeh O, Farsani D. The challenge of psychological processes in language acquisition: a systematic review. Cogent Arts Humanit. 2023;10(1):2157961. 10.1080/23311983.2022.2157961.
- 5.Barsalou LW. Grounded cognition. Annu Rev Psychol. 2008;59:617–45. 10.1146/annurev.psych.59.103006.093639.
- 6.Barutchu A, Sahu A, Humphreys GW, Spence C. Multisensory processing in event-based prospective memory. Acta Psychol. 2019;192:23–30. 10.1016/j.actpsy.2018.10.015.
- 7.Boaler J. Mathematical mindsets: unleashing students’ potential through creative math, inspiring messages and innovative teaching. John Wiley & Sons; 2015.
- 8.Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101. 10.1191/1478088706qp063oa.
- 9.Camurri A, Volpe G. Gesture-based communication in human-computer interaction: Selected revised papers from the 5th International Gesture Workshop, GW 2003, Genova, Italy, April 15–17, 2003. Springer; 2004. 10.1007/b95740.
- 10.Chang W, Fang WC, Lin YL, Chen NS. Gesture-facilitated learning of English word stress patterns. In 2014 International Conference of Educational Innovation through Technology. IEEE; 2014. pp. 37–42.
- 11.Chen N-S, Fang W-C. Gesture-based technologies for enhancing learning. In The new development of technology-enhanced learning. Springer; 2014. pp. 95–112. 10.1007/978-3-642-38291-8_6.
- 12.Church RB, Ayman-Nolley S, Mahootian S. The role of gesture in bilingual education: does gesture enhance learning? Int J Bilingual Educ Bilingual. 2004;7(4):303–19. 10.1080/13670050408667815.
- 13.Cook SW, Yip TK, Goldin-Meadow S. Gestures, but not meaningless movements, lighten working memory load when explaining math. Lang Cogn Process. 2012;27(4):594–610. 10.1080/01690965.2011.567074.
- 14.Creswell JW, Clark VLP. Designing and conducting mixed methods research. Sage Publications; 2017.
- 15.de Koning B, Zhang S, Sepp S. Integrating human movement in learning: advancements in language instruction, multimedia, and theory. Educ Psychol Rev. 2025;37(2):51. 10.1007/s10648-025-10027-1.
- 16.Dimitriadou E, Lanitis A. Evaluating the impact of an automated body language assessment system. Educ Inf Technol. 2025;30:3509–39. 10.1007/s10639-024-12931-5.
- 17.Dörnyei Z. Research methods in applied linguistics: quantitative, qualitative, and mixed methodologies. Oxford University Press; 2007.
- 18.Ghelichli Y, Seyyedrezaei SH, Barani G, Mazandarani O. The mediating role of self-regulation between student engagement and motivation among Iranian EFL learners: a structural equation modeling approach. J Modern Res English Language Stud. 2021;9(1):179–200. 10.30479/jmrels.2020.13689.1679.
- 19.Glenberg AM. Embodiment as a unifying perspective for psychology. WIREs Cogn Sci. 2010;1(4):586–96. 10.1002/wcs.55.
- 20.Goldin-Meadow S, Beilock SL. Action’s influence on thought: the case of gesture. Perspect Psychol Sci. 2010;5(6):664–74. 10.1177/1745691610388764.
- 21.Goldin-Meadow S, Cook SW, Mitchell ZA. Gesturing gives children new ideas about math. Psychol Sci. 2009;20(3):267–72. 10.1111/j.1467-9280.2009.02297.x.
- 22.Gullberg M. Gestures and second language acquisition. In Robinson P, Ellis NC (Eds.), Handbook of cognitive linguistics and second language acquisition. Routledge/Taylor & Francis Group; 2008. p. 276-305.
- 23.Gullberg M. The relationship between gestures and speaking in L2 learning. In Derwing T, Munro M, Thomson R (Eds.), The Routledge handbook on second language acquisition and speaking. (The Routledge Handbooks in Second Language Acquisition). Routledge; 2022. p. 386-98.
- 24.Guo X, Rosenberg MD, Bainbridge WA, Goldin-Meadow S. How gesture benefits learning: a working framework for examining attention and memory mechanisms. Annu Rev Dev Psychol. 2025;7:1–22. 10.1146/annurev-devpsych-111323-112334.
- 25.Hart SR, Stewart K, Jimerson SR. The student engagement in schools questionnaire (SESQ) and the teacher engagement report form-new (TERF-N): examining the preliminary evidence. Contemp Sch Psychol. 2011;15(1):67–79.
- 26.He-Zhang Y, Duvignau K, Huet N. The effects of situated gestures on Mandarin Chinese word learning. Acta Psychol. 2025;261:105806. 10.1016/j.actpsy.2025.105806.
- 27.Hostetter AB, Alibali MW. Gesture as simulated action: revisiting the framework. Psychon Bull Rev. 2019;26(3):721–52. 10.3758/s13423-018-1548-0.
- 28.Hsiao HS, Chen JC. Using a gesture interactive game-based learning approach to improve preschool children’s learning performance and motor skills. Comput Educ. 2016;95:151–62. 10.1016/j.compedu.2016.01.005.
- 29.Hsiao HS, Chang IH, Chen YH, Chen JC. Using gesture recognition with the memory strategy to improve preschoolers’ learning performance, motor skills, and executive function. Educ Technol Res Dev. 2025:1–19. 10.1007/s11423-025-10471-4.
- 30.Huang M, Kuang F, Ling Y. EFL learners’ engagement in different activities of blended learning environment. Asian-Pac J Sec Foreign Lang Educ. 2022;7(1):1–15. 10.1186/s40862-022-00136-7.
- 31.Janzen Ulbricht N. The embodied teaching of spatial terms: gestures mapped to morphemes improve learning. Front Educ. 2020;5. 10.3389/feduc.2020.00109.
- 32.Janzen-Ulbricht N. Gesture-based language learning–Can more learners learn more?. Freie Universitaet Berlin (Germany). 2024.
- 33.Johnson MH. Cortical plasticity in normal and abnormal cognitive development: evidence and working hypotheses. Dev Psychopathol. 1999;11(3):419–37. 10.1017/S0954579499002138.
- 34.Jusslin S, Korpinen K, Lilja N, Martin R, Lehtinen-Schnabel J, Anttila E. Embodied learning and teaching approaches in language education: a mixed studies review. Educ Res Rev. 2022;37:100480. 10.1016/j.edurev.2022.100480.
- 35.Kelly SD, Ngo Tran QA. Exploring the emotional functions of co‐speech hand gesture in language and communication. Top Cogn Sci. 2025;17(3):586–608. 10.1111/tops.12657.
- 36.Kelly SD, Özyürek A, Maris E. Two sides of the same coin: speech and gesture mutually interact to enhance comprehension. Psychol Sci. 2010;21(2):260–7. 10.1177/0956797609357327.
- 37.Khatin-Zadeh O, Eskandari Z, Farsani D, Banaruee H. Embodiment and gestural simulation of the definite article. SAGE Open. 2025;15(4):21582440251385321. 10.1177/21582440251385321.
- 38.Khatin-Zadeh O, Farsani D, Banaruee H. A study of the use of iconic and metaphoric gestures with motion-based, static space-based, static object-based, and static event-based statements. Behav Sci. 2022;12(7):239. 10.3390/bs12070239.
- 39.Khatin-Zadeh O, Farsani D, Banaruee H. How gestural representation of metaphor schema facilitates metaphor comprehension in congruent gesture-aligned conditions: an embodied metaphor processing perspective. SAGE Open. 2025;15(4):21582440251394448. 10.1177/21582440251394448.
- 40.Khatin-Zadeh O, Farsani D, Eskandari Z, Li S, Banaruee H. Gestural embodiment of spatial schematic information in motion-based and static space-based metaphors. Cogent Arts Humanities. 2023;10(1):2266904. 10.1080/23311983.2023.2266904.
- 41.Kogan B, Birba A, Díaz Rivera MN, González Santibáñez C, García AM. Dynamics of language grounding: on the time course, durability, adaptability, and vulnerability of embodied effects. Front Psychol. 2025;16:1637855. 10.3389/fpsyg.2025.1637855.
- 42.Kohonen V. Is the relation between phonological memory and foreign language learning accounted for by vocabulary acquisition? Appl Psycholinguist. 1995;16(2):155–72. 10.1017/S0142716400007062.
- 43.Kosmas P, Zaphiris P. Embodied Interaction in Language Learning: Enhancing Students’ Collaboration and Emotional Engagement. In: Lamas D, Loizides F, Nacke L, Petrie H, Winckler M, Zaphiris P. (eds) Human-Computer Interaction – INTERACT 2019. INTERACT 2019. Lecture Notes in Computer Science, vol 11747. Cham: Springer; 2019. 10.1007/978-3-030-29384-0_11.
- 44.Kuo F-R, Hsu C-C, Fang W-C, Chen N-S. The effects of embodiment-based TPR approach on student English vocabulary learning achievement, retention and acceptance. J King Saud University Comput Inform Sci. 2014;26:63–70. 10.1016/j.jksuci.2013.10.003.
- 45.Li Z, Li J. Learner engagement in the flipped foreign language classroom: definitions, debates, and directions of future research. Front Psychol. 2022;13:810701. 10.3389/fpsyg.2022.810701.
- 46.Liu Q, Du X, Lu H. Teacher support and learning engagement of EFL learners: the mediating role of self-efficacy and achievement goal orientation. Curr Psychol. 2023;42(4):2619–35. 10.1007/s12144-022-04043-5.
- 47.Lowrie T, Logan T, Harris D, Hegarty M. The impact of an intervention program on students’ spatial reasoning: student engagement through mathematics-enhanced learning activities. Cogn Res Princ Implic. 2018;3:50. 10.1186/s41235-018-0147-y.
- 48.Macedonia M. Your body as a tool to learn second language vocabulary. Behav Sci. 2025;15(8):997. 10.3390/bs15080997.
- 49.Macedonia M, von Kriegstein K. Gestures enhance foreign language learning. Biolinguistics. 2012;6:393–416.
- 50.Martin KI, Ellis NC. The roles of phonological short-term memory and working memory in L2 grammar and vocabulary learning. Stud Second Lang Acquis. 2012;34(3):379–413. 10.1017/S0272263112000125.
- 51.Maruf N, Liu Y, Halyna K, Anwar K, Santoso RA. Enhancing language learning through gesture-based corrective feedback: a comparative study in a multi-country context. Indonesian J Appl Linguistics. 2025;14(3):612–25.
- 52.Matsumoto Y, Dobs AM. Pedagogical gestures as interactional resources for teaching and learning tense and aspect in the ESL grammar classroom. Lang Learn. 2017;67(1):7–42. 10.1111/lang.12181.
- 53.Maxwell JA. Using numbers in qualitative research. Qual Inq. 2010;16(6):475–82. 10.1177/1077800410364740.
- 54.Mayer RE. Incorporating motivation into multimedia learning. Learn Instr. 2014;29:171–3. 10.1016/j.learninstruc.2013.04.003.
- 55.Nakatsukasa K. Efficacy of recasts and gestures on the acquisition of locative prepositions. Stud Second Lang Acquis. 2016;38(4):771–99.
- 56.Namaziandost E, Hwang G-J. Implementing multiple intelligence-informed tasks to cultivate willingness to communicate, academic engagement, and academic success: evidence from EFL learners. Instr Sci. 2025. 10.1007/s11251-025-09739-2.
- 57.Namaziandost E, Hafezian M, Shafiee S. Exploring the association among working memory, anxiety and Iranian EFL learners’ listening comprehension. Asian Pacific J Second Foreign Language Educ. 2018;3(1):1–17. 10.1186/s40862-018-0061-3.
- 58.Newcombe NS, Frick A. Early education for spatial intelligence: why, what, and how. Mind Brain Educ. 2010;4(3):102–11. 10.1111/j.1751-228X.2010.01089.x.
- 59.Newcombe NS, Shipley TF. Thinking about spatial thinking: new typology, new assessments. In: Studying Visual and Spatial Reasoning for Design Creativity. 2015. p. 179–92. 10.1007/978-94-017-9297-4_10.
- 60.Paivio A. Dual coding theory: retrospect and current status. Can J Psychol. 1991;45(3):255.
- 61.Pickering S, Gathercole S. Working Memory Test Battery for Children (WMTB-C): Manual. The Psychological Corporation; 2001.
- 62.Pulvermüller F. Brain mechanisms linking language and action. Nat Rev Neurosci. 2005;6(7):576–82. 10.1038/nrn1706.
- 63.Ramful A, Lowrie T, Logan T. Measurement of spatial ability: construction and validation of the spatial reasoning instrument for middle school students. J Psychoeduc Assess. 2017;35(7):709–27. 10.1177/0734282916659207.
- 64.Rau MA, Beier JP. Exploring the effects of gesture-based collaboration on students’ benefit from a perceptual training. J Educ Psychol. 2023;115(2):267–89. 10.1037/edu0000774.
- 65.Reeve J. How students create motivationally supportive learning environments for themselves: the concept of agentic engagement. J Educ Psychol. 2013;105(3):579–95.
- 66.Reeve J, Lee W. Students’ classroom engagement produces longitudinal changes in classroom motivation. J Educ Psychol. 2014;106(2):527–40.
- 67.Reeve J, Tseng CM. Agency as a fourth aspect of students’ engagement during learning activities. Contemp Educ Psychol. 2011;36(4):257–67. 10.1016/j.cedpsych.2011.05.002.
- 68.Reinders H. Touch and gesture-based language learning: some possible avenues for research and classroom practice. Teach English Technol. 2014;14(1):3–8.
- 69.Rezai A, Soyoof A, Reynolds BL. Disclosing the correlation between using ChatGPT and well-being in EFL learners: considering the mediating role of emotion regulation. Eur J Educ. 2024;59(4):e12752. 10.1111/ejed.12752.
- 70.Rosborough AA. Gesture, meaning-making, and embodiment: second language learning in an elementary classroom. J Pedagogy. 2014;5(2):1–24. 10.2478/jped-2014-0011.
- 71.Rudner M. Working memory for linguistic and non-linguistic manual gestures: evidence, theory, and application. Front Psychol. 2018;9:679. 10.3389/fpsyg.2018.00679.
- 72.Ryan RM, Deci EL. Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. Am Psychol. 2000;55(1):68–78.
- 73.Ryan RM, Deci EL. Self-determination theory. In: Encyclopedia of quality of life and well-being research. Cham: Springer International Publishing; 2024. p. 6229–35.
- 74.Schnaider K. The influence of technological designs on teachers’ and students’ meaning-making: semiotic chains configuring teaching and learning activities. Comput Educ Open. 2023;4:100136. 10.1016/j.caeo.2023.100136.
- 75.Shakroum M, Wong KW, Fung CC. The influence of gesture-based learning system (GBLS) on learning outcomes. Comput Educ. 2018;117:75–101. 10.1016/j.compedu.2017.10.002.
- 76.Smotrova T. Gesture as a mediational tool in the L2 classroom. In: The Routledge handbook of sociocultural theory and second language development. 1st ed. Routledge; 2018. p. 472–86. 10.4324/9781315624747-30.
- 77.Tian L. Motivation and gesture in foreign and second language development: a sociocultural study of Chinese learners of English [doctoral dissertation]. University of Nevada, Las Vegas; 2019.
- 78.Vernadakis N, Gioftsidou A, Antoniou P, Ioannidis D, Giannousi M. The impact of Nintendo Wii to physical education students’ balance compared to the traditional approaches. Comput Educ. 2012;59(2):196–205.
- 79.Widyasari FE. Teaching vocabulary by enhancing students’ spatial-visual intelligence. Asian EFL J. 2018;20(4):19–26.
- 80.Willems RM, Özyürek A, Hagoort P. When language meets action: the neural integration of gesture and speech. Cereb Cortex. 2007;17(10):2322–33.
- 81.Yu Q. Effects of motion-sensing technology on language learning: evidence from a meta-analysis. Interact Learn Environ. 2024;32(10):7507–23. 10.1080/10494820.2024.2324328.
- 82.Yuan RQ, Hsieh SW, Chew SW, Chen NS. The effects of gesture-based technology on memory training in adaptive learning environment. In: 2015 International Conference of Educational Innovation through Technology (EITT). Wuhan, China: IEEE; 2015. p. 190–3.
- 83.Yukselturk E, Altıok S, Başer Z. Using game-based learning with Kinect technology in foreign language education course. J Educ Technol Soc. 2018;21(3):159–73.
Data Availability Statement
The data generated and analyzed during this study are available from the corresponding author upon reasonable request.
