Abstract
Storytelling has long been used to support language learning; however, the potential of online digital storytelling (ODST) to enhance authentic listening and engagement in public-school EFL classrooms remains underexplored. This study examined whether ODST can improve Grade 7 learners’ comprehension of authentic spoken English and their behavioral, emotional, and cognitive engagement. A mixed-methods, quasi-experimental design was employed with two intact Grade 7 classes in Guangzhou, China (N = 59; experimental n = 30, control n = 29). Both groups completed parallel pre- and posttests of authentic listening comprehension and an 18-item engagement scale. The experimental class received an eight-week ODST intervention integrating narrated online stories with captions, visuals, and sound effects. In contrast, the control class followed textbook-based audio activities aligned with the national syllabus. Baseline equivalence was established on listening and engagement. Quantitatively, posttest outcomes were analyzed using MANCOVA/ANCOVA with pretest scores as covariates. Qualitatively, semi-structured interviews (n = 20) were thematically analyzed, with saturation assessed by codebook stability (i.e., no substantively new themes emerged in the final interviews). Compared with conventional instruction, ODST led to significantly greater improvement in authentic listening comprehension. ODST also yielded significantly higher post-intervention engagement overall and across behavioral, emotional, and cognitive subdimensions, with effects in the medium-to-large range. Interview data converged with these findings, highlighting greater enjoyment, stronger motivation to persist with challenging input, clearer understanding through multimodal cues, and increased confidence in listening. 
ODST yielded dual benefits for junior high EFL learners in a public-school context, improving comprehension of authentic spoken English and enhancing behavioral, emotional, and cognitive engagement. Because this is a single-site, quasi-experimental study with intact classes, causal claims and generalizability should be interpreted cautiously. The study advances ODST research by empirically linking multimodal narrative scaffolding to both authentic-listening gains and multidimensional engagement in an ecologically realistic public-school setting, and it justifies broader, longitudinal, and multi-site replications.
Keywords: Online digital storytelling, Authentic listening comprehension, Learner engagement, Multimodal instruction, Junior high EFL learners, Mixed-methods quasi-experiment
Subject terms: Education, Language and linguistics, Psychology
Introduction
Listening—the core receptive language skill—remains decisive for English as a Foreign Language (EFL) learners because it underpins effective communication and supports the development of other language skills1,2. Yet traditional listening exercises in many EFL contexts, particularly in public school systems, often rely on decontextualized, scripted recordings and mechanical tasks that offer few engaging features, dampen motivation, and yield limited learning outcomes3. In such environments, learners may receive “listening practice” without sufficient meaningful exposure to varied, natural speech to develop robust comprehension routines. This matters because early listening experiences can either promote confident growth or cultivate avoidance and low self-efficacy that persist across later language learning.
Authentic texts are those produced for real communicative purposes rather than for language teaching4. In the audio domain, these include conversations, interviews, and newscasts that model natural pronunciation, rhythm, discourse markers, and pragmatic choices in ways that scripted textbook recordings rarely do4,5. In line with communicative language teaching, such texts are strongly recommended because they provide exposure to real-world language use, including variation in accent, rhythm, vocabulary, and sentence structure that students must ultimately cope with outside the classroom5. However, for novice listeners, authentic speech can be difficult precisely because it is fast, variable, and information-dense, and because it offers fewer built-in pedagogical redundancies than textbook audio6. Therefore, authentic audio materials need to be presented in ways that are developmentally appropriate and sensitive to learners’ ages and proficiency levels7. A key implication is not “avoid authenticity,” but “scaffold authenticity,” especially at the junior-high entry level when learners are still building phonological, lexical, and metacognitive listening resources8–10.
Alongside listening itself, a crucial psychological factor—and the second dependent variable in this study—is learner engagement. Engagement has been identified as a key predictor of success or failure across domains in both general education and language-specific learning environments11–14. A substantial body of research indicates that higher levels of learner engagement are associated with better achievement in acquiring target skills, including L2 listening and other language abilities11,12. In the context of L2 teaching, learners who minimally invest attention, effort, and persistence in classroom tasks rarely achieve satisfactory outcomes, regardless of instructional quality15,16. Recent technology-mediated learning research likewise emphasizes that learners’ engagement is shaped by how they perceive digital tools as practical, manageable, and personally meaningful17–19, underscoring the need to design technology-enhanced tasks that genuinely invite sustained participation rather than mere novelty.
L2 engagement is typically understood as the time, effort, and psychological resources learners invest in language tasks, encompassing behavioral, emotional, and cognitive components11,17–19. As Fredricks and colleagues explain, engagement reflects not only overt participation but also the degree of emotional involvement and strategic thinking that students bring to learning activities18. Because engagement is dynamic and context-sensitive, task design becomes the “control knob” that teachers can realistically adjust in public-school classrooms, primarily through tasks that reduce threat, clarify goals, and provide visible progress signals.
This is where story-based learning matters. Story-based tasks can create a coherent purpose for listening (following events, resolving uncertainty, understanding characters’ intentions) rather than treating listening as an answer-checking routine16,20,21. Narrative coherence and emotional resonance can support emotional engagement (interest, enjoyment), predictable story structure can support cognitive engagement (prediction, inference, monitoring), and interactive elements can support behavioral engagement (participation and persistence). Thus, story-based learning provides a conceptual bridge between engagement theory and ODST: ODST is a specific, technology-mediated means of operationalizing narrative-rich, scaffolded listening tasks at scale.
Given its centrality, engagement has been linked to a range of personal resources, such as emotional intelligence, self-efficacy, academic motivation, autonomy, academic buoyancy, and academic self-concept, all of which have been shown to contribute to students’ willingness to invest effort in L2 learning22–25. With the rise of Positive Psychology in language education, research has increasingly highlighted how positive communication behaviors and supportive teacher–student relationships further bolster L2 engagement, amplifying the benefits of well-designed tasks21,26. Accordingly, the present study treats engagement not as a decorative “extra,” but as a mechanism through which learners persist with challenging authentic input long enough to improve.
Digital storytelling (DST) has emerged as one promising approach. With the rapid expansion of online and blended learning during and after COVID-19, language teachers have turned to digital tools that can provide rich input while sustaining motivation. DST, which typically involves visuals, narrated audio, music, and simple interactivity, has been shown to enhance motivation, vocabulary learning, and listening comprehension in ESL/EFL contexts28–30. Converging evidence suggests that DST can improve learners’ listening performance27, support oral production via story retelling28, and foster positive attitudes and motivation in school settings29. However, fewer studies have simultaneously tested whether ODST can (a) improve comprehension of authentic or semi-authentic spoken English and (b) enhance multidimensional engagement in ecologically typical public-school classrooms.
At the same time, young EFL learners are often underserved by textbooks that rely heavily on decontextualized, worksheet-based tasks30; in such contexts, DST and ODST offer an alternative source of comprehensible, engaging, and visually supported input31,32. ODST is particularly promising because it can combine pictorial storylines, concise captioning, and coherent narrative flow, thereby reducing extraneous cognitive load while preserving key discourse features needed for authentic listening33–36. In other words, ODST is hypothesized to make authentic listening “survivable and meaningful” rather than merely “difficult and test-like.”
Against this backdrop, the present study introduces an online digital storytelling project that uses an interactive web page to deliver adapted versions of popular fairy tales to seventh-grade EFL learners in public schools. The design combines authentic and semi-authentic spoken language (natural storytelling voices, realistic pacing, and authentic discourse features) with supportive multimodal scaffolds, including images and captions, to make listening both enjoyable and instructionally effective37–39. By testing ODST under real classroom constraints, this study aims to provide practical evidence on whether ODST is a feasible approach to improving authentic listening and engagement in public-school Grade 7 EFL classes, while acknowledging that a single-site quasi-experiment cannot establish broad causal generalizations.
Literature review
This review synthesizes theoretical and empirical work explaining how Online Digital Storytelling (ODST) can foster authentic listening comprehension and multidimensional engagement in public-school EFL contexts, clarifies why authentic listening is challenging for novice listeners, and identifies the specific gap addressed by the present quasi-experimental mixed-methods study.
Theoretical framework
The study draws on three complementary perspectives—Cognitive Theory of Multimedia Learning, Sociocultural Theory, and Engagement Theory—to explain why ODST may support both comprehension and engagement when learners encounter authentic or semi-authentic spoken English.
Cognitive theory of multimedia learning
The Cognitive Theory of Multimedia Learning (CTML) proposes that learners understand and retain information more effectively when words and pictures are combined, provided that the multimedia materials are well designed and do not overload working memory12,36. Because both visual and auditory channels have limited capacity, instruction should focus on essential elements, avoid extraneous information, and support learners in actively integrating what they see and hear36.
ODST aligns closely with these principles. A digital story typically integrates spoken narration, images or video, sound effects, on-screen text, and music into a coherent narrative. For young EFL learners—who often struggle to understand decontextualized vocabulary or follow abstract language—this coordinated multimodal input can be especially beneficial. It simultaneously supports top-down processes (e.g., using context and narrative to infer meaning) and bottom-up processes (e.g., recognizing sounds, word forms, and prosodic patterns)40–44.
When digital stories are carefully designed, multimodality can reduce extraneous cognitive load by signaling important information, offering visual anchors, and minimizing distraction, thereby promoting durable learning and transfer—particularly when the stories are meaningful and emotionally engaging34,35. In brief, CTML offers strong theoretical justification for using ODST to scaffold young learners’ comprehension of real or semi-authentic spoken English.
Sociocultural theory
From a sociocultural perspective, ODST is not only input delivery; it is a mediated activity that can reorganize how learners attend to, interpret, and discuss spoken language in the classroom. Vygotsky’s Sociocultural Theory emphasizes that learning and development occur through mediated social interaction, the use of cultural tools, and language, particularly within the Zone of Proximal Development (ZPD)45,46. Stories are powerful cultural tools because they organize experience, convey values, and highlight salient features of language and context.
In EFL classrooms, digital stories can become shared objects of attention around which learners listen, interpret, create, or retell narratives together. Through these joint activities, students coordinate attention, negotiate meaning, and co-construct understanding—precisely the kinds of mediated tasks that foster development within the ZPD46. When learners interpret or co-author stories, they observe peers’ strategies, rehearse language with support, and gradually internalize linguistic and cultural patterns.
From this perspective, ODST is far more than an information delivery mechanism. As a socially organized activity in which learners interact around stories, it can heighten engagement and deepen comprehension through collaborative, meaningful experience47.
Engagement theory
Engagement Theory provides the missing “bridge logic” between narrative task design and learning outcomes: learners persist and invest effort when tasks feel meaningful, doable, and socially situated. Engagement Theory posits that students invest more effort and achieve better outcomes when they participate in authentic, collaborative, and personally meaningful activities—often mediated by technology42. ODST exemplifies this orientation: it is typically project-based, creative, and audience-oriented, positioning technology not just as a tool but as a partner in creating and sharing meaning.
For young learners, stories that are visually vivid, emotionally resonant, and that offer choice or interaction can enhance behavioral, emotional, and cognitive engagement11,48. In the present study, ODST is conceptualized not only as a vehicle for authentic listening practice but also as a platform for emotionally engaging, socially interactive tasks. Such experiences are expected to help learners remain on task, experience positive emotions, and deploy deeper learning strategies.
Listening in language learning and teaching
This section clarifies why authentic listening is pedagogically essential but developmentally demanding, motivating the need for scaffolded approaches such as ODST. Listening is a dynamic, constructive process in which learners interpret incoming speech and build meaning by integrating new information with prior knowledge49,50. Recent work has highlighted the metacognitive dimension of listening—planning, monitoring, and evaluating comprehension—and suggests that self-regulation processes should be developed alongside perceptual and linguistic skills2.
A persistent concern in EFL listening instruction is the heavy reliance on scripted recordings and simplified audio that poorly reflect the variability and spontaneity of real-world talk51. Authentic materials such as live talk, podcasts, and videos expose learners to variation in accent, speech rate, intonation, discourse markers, and pragmatic cues. These materials invite listeners to engage in both bottom-up processing (e.g., decoding phonological detail) and top-down processing (e.g., prediction, inference, and schematic activation)2,51. Well-scaffolded use of authentic materials can heighten learners’ awareness of real-world variation and foster tolerance of ambiguity2.
Pedagogically, two implications follow. First, curricula should systematically incorporate authentic spoken language. Second, listening tasks should explicitly cultivate metacognitive control. Engaging media, such as songs or narratives, can be used to practice strategies through repetition and reflection, increasing awareness and memory52.
Challenges with authentic listening for junior high beginners
This section specifies the core difficulty: authentic speech can exceed novice listeners’ processing capacity unless instructional supports reduce extraneous load and guide attention. Despite their benefits, authentic materials pose particular challenges for beginners. Young learners often struggle with rapid speech rate, unfamiliar accents, dense figurative language, and limited redundancy53. Children, in particular, rely on visual support and clear organization; in the absence of such scaffolding in audio-only tasks, comprehension can quickly deteriorate51.
If tasks are not scaffolded, learners may focus narrowly on individual words rather than constructing a global message, which can lead to frustration and a loss of motivation2. Accordingly, the instructional problem is not authenticity itself, but unscaffolded authenticity.
Digital storytelling in language education
This section defines DST/ODST and explains why multimodal narrative input is theoretically suited to scaffolding authentic listening. Digital storytelling (DST) is the practice of using computer-based or other digital tools to tell stories that integrate narration, images or video, text, sound, and music54. Initially prominent in media literacy and digital humanities, DST has been widely adopted in second- and foreign-language education to support speaking, writing, listening, and viewing55. Common typologies include personal narratives, instructional stories, and documentary or historical narratives55. Classroom implementations, in turn, range from narrated slide shows and photo stories to interactive branching stories and subtitled video mashups48.
DST’s affordances map well onto both cognitive and sociocultural frameworks. Its multimodal character aligns with dual-coding and multimedia learning principles, in which paired verbal and nonverbal representations enhance understanding and recall56. At the same time, its narrative form is socioculturally grounded: stories function as meaningful artifacts that mediate language learning within communities of practice57.
Beyond cognitive benefits, DST can lower the affective filter and open space for identity expression28,58. Opportunities for personal choice—over topic, images, voice, or soundtrack—have been associated with increased ownership and intrinsic motivation59, which in turn support attention and memory consolidation, particularly in young learners60,61.
Engagement as a multidimensional construct in EFL
This section foregrounds engagement as a mechanism—behavioral, emotional, and cognitive—through which ODST is expected to produce learning gains. Engagement is a flexible, context-sensitive construct comprising cognitive, emotional, and behavioral dimensions that jointly shape learning trajectories16,19,62. Cognitive engagement involves strategic processing, elaboration, and self-regulation20; emotional engagement reflects interest, enjoyment, and a sense of belonging13; and behavioral engagement refers to on-task participation, effort, and persistence18,19.
For EFL learners—especially early adolescents—engagement predicts depth of processing, willingness to communicate, and the formation of positive learning habits13,20. ODST has the potential to address each dimension. Interactive story platforms and click-through options can enhance behavioral engagement by encouraging active participation and reducing off-task behavior48,63. Narrative suspense, music, and relatable characters can elevate positive affect and curiosity, thereby supporting emotional engagement60. Carefully designed pre-, during-, and post-listening activities around digital stories can foster cognitive engagement through prediction, inference, monitoring, and reflection—the hallmarks of strategic listening2,56. Recent digital learning research also suggests that learners’ beliefs about, and perceived value of, technology-mediated learning experiences shape engagement trajectories18,19, reinforcing the need to evaluate ODST not only for outcomes but also for learner experience.
Task design, technology, and scaffolding in authentic listening
This section links mechanisms to design: ODST is expected to work when it reduces cognitive load, supports strategy use, and creates socially meaningful reasons to listen. Engagement depends strongly on instructional design and classroom context rather than on fixed learner traits19. For authentic listening in particular, studies emphasize scaffolded task sequences involving pre-listening schema activation and vocabulary preview, during-listening tasks targeting gist and detail, and post-listening reflection and extension2,51,64. Visual scaffolds—such as images, captions, and on-screen prompts—can reduce extraneous load and support meaning construction21.
ODST integrates many of these supports within a coherent narrative frame. A story’s plotline aids global prediction; scene imagery anchors referents; and recurrent motifs cue noticing of lexis, prosody, and discourse structure. Compared with traditional audio-only drills that prioritize answer-checking, ODST reconceptualizes listening as an interactive, emotionally resonant experience in which learners make predictions, track cause-and-effect relations, and co-construct meaning with peers21,65.
Empirical evidence on digital storytelling for listening and engagement
This section selectively summarizes evidence most relevant to the present study’s two core outcomes (authentic listening comprehension and multidimensional engagement) rather than peripheral outcomes. Across educational levels and domains, research has reported that DST can enhance motivation, vocabulary, listening, and oral performance49. Regarding listening, studies with younger learners report improved comprehension when stories are delivered digitally, often attributing gains to visual scaffolds, paralinguistic cues, contextual redundancy, and sustained attention66,67. Broader research on technology-enhanced listening similarly documents the accessibility of digital media for building personalized listening ecologies and improving comprehension49.
At the same time, the literature suggests that benefits are not automatic: effects depend on whether multimodal features are coherent and instructionally aligned rather than distracting, and whether tasks guide learners toward strategic listening rather than passive consumption36,51,64.
Why ODST should support authentic listening and engagement
Synthesizing theory and evidence, ODST is hypothesized to support the present study’s outcomes through three mechanisms.
Affective alignment and sustained engagement. Personalization and emotional narrative arcs can increase enjoyment, reduce anxiety, and heighten attentional investment, thereby encouraging learners to persist with authentic input and to use strategies more deeply58,60.
Cognitive integration under load constraints. Through dual-channel presentation, coherence, and spatial/temporal contiguity, ODST can reduce extraneous cognitive load and support the integration of visual and verbal information, enabling learners to construct coherent situation models of authentic discourse2,36.
Collaborative meaning-making and social mediation. When learners co-author digital stories or co-listen and retell, they receive scaffolding within the ZPD through shared attention, peer modeling, and negotiated clarification—processes associated with durable comprehension gains57.
Despite this promise, relatively few quasi-experimental classroom studies in public-school settings have examined ODST’s simultaneous effects on authentic (or semi-authentic) listening comprehension and multidimensional engagement, as well as learners’ perceptions of feasibility and value. This gap motivates the present mixed-methods study, while recognizing that a single-site intact-class design necessarily limits external validity and causal inference.
Purpose and research questions
Rooted in multimedia learning, sociocultural theory, and engagement theory, the present study investigates whether Online Digital Storytelling can serve as a practical, developmentally appropriate approach to enhance listening comprehension of authentic or semi-authentic materials and engagement among Grade 7 (junior high) EFL learners in public-school classrooms. Given the intact-class, single-site quasi-experimental design, the study interprets effects as classroom-based evidence rather than definitive causal proof and frames conclusions with appropriate caution. The study addresses the following research questions:
Does online digital storytelling significantly enhance junior high school EFL learners’ comprehension of authentic listening materials compared with conventional listening instruction?
Does online digital storytelling significantly improve learners’ emotional, behavioral, and cognitive engagement during listening tasks compared with conventional instruction?
How do Grade 7 EFL learners perceive instruction with authentic listening materials delivered via ODST?
Methodology
Research design
This study adopted a mixed-methods design comprising (a) a quasi-experimental pretest–posttest control-group design and (b) a qualitative phase involving semi-structured interviews to explore participants’ perceptions of the instruction received. This design is particularly appropriate in educational contexts where random assignment is often infeasible due to ethical or administrative constraints68. It allows for comparison of outcomes between groups while statistically controlling for pre-existing differences through pre-instruction measurements, thereby combining methodological rigor with ecological validity68. Accordingly, findings are interpreted as classroom-based evidence from an intact-class intervention rather than as definitive causal proof. In this framework, the researchers aimed not only to examine whether listening comprehension improved following online digital storytelling (ODST) instruction, but also to determine whether changes in engagement levels could be attributed to the nature of the instructional tool rather than to general classroom exposure. To align with current transparency expectations, the analysis plan, de-identified instruments, and the minimal anonymized dataset are planned for public archiving upon acceptance (see Availability of data and materials).
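The covariate adjustment described in this design can be illustrated with a minimal sketch of classical one-covariate ANCOVA, in which each group's posttest mean is adjusted using the pooled within-group regression slope of posttest on pretest. The function name, group labels, and scores below are hypothetical and serve only to show the arithmetic; they are not the study's analysis code or data.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Return (pooled_slope, adjusted_means) for a one-covariate ANCOVA.

    groups maps a group label to a list of (pretest, posttest) pairs.
    Adjusted mean = posttest mean - slope * (group pretest mean - grand pretest mean).
    """
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)

    sxy = 0.0  # pooled within-group cross-products of pretest and posttest
    sxx = 0.0  # pooled within-group sum of squares of pretest
    group_means = {}
    for label, pairs in groups.items():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        group_means[label] = (pre_mean, post_mean)
        sxy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sxx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)

    slope = sxy / sxx
    adjusted = {
        label: post_mean - slope * (pre_mean - grand_pre)
        for label, (pre_mean, post_mean) in group_means.items()
    }
    return slope, adjusted


# Hypothetical scores, not data from the study.
example = {
    "odst": [(10, 16), (12, 18), (14, 20)],
    "control": [(12, 14), (14, 16), (16, 18)],
}
slope, adjusted = ancova_adjusted_means(example)
print(slope, adjusted)
```

In this toy example the raw posttest difference between groups is 2 points, but the adjusted difference is 4 points because the hypothetical ODST class started lower on the pretest. A full analysis would also test the homogeneity-of-slopes assumption and report inferential statistics, which this sketch omits.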
Participants and sampling
Participants were 59 EFL learners (aged 11–12) enrolled in Grade 7 at two comparable public junior high schools in Guangzhou, China. The schools were similar in demographic and curricular profiles, and both followed the same national English curriculum. One intact class (n = 30) was assigned as the experimental group (receiving ODST-based instruction), and the other intact class (n = 29) served as the control group (receiving conventional listening instruction).
Schools were selected based on their willingness to participate, the availability of digital facilities (e.g., tablets, computers, audio-visual equipment), and the presence of EFL teachers willing to implement ODST-based lessons. Although random assignment at the individual level was not possible, group equivalence was examined by comparing the two classes on the schools’ English proficiency reports and on the listening comprehension pretest. These comparisons indicated no statistically significant baseline differences between the two classes. Nevertheless, because intact classes were used within a single local context, external validity is necessarily limited.
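A baseline comparison of this kind could be sketched as follows. The text does not state which test was used, so Welch's unequal-variance t statistic and Cohen's d are assumptions chosen for illustration, and the scores are hypothetical. The sketch uses only the Python standard library and therefore reports the statistic and degrees of freedom without a p-value.

```python
from math import sqrt
from statistics import mean, variance


def baseline_check(group_a, group_b):
    """Compare two groups' pretest scores.

    Returns Welch's t, its Welch-Satterthwaite degrees of freedom,
    and Cohen's d based on the pooled standard deviation.
    """
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = mean(group_a), mean(group_b)
    v1, v2 = variance(group_a), variance(group_b)  # sample variances (n - 1)

    se_sq = v1 / n1 + v2 / n2
    t = (m1 - m2) / sqrt(se_sq)
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    return {"t": t, "df": df, "d": d}


# Hypothetical pretest scores, not data from the study.
result = baseline_check([10, 12, 14, 16], [11, 13, 15, 17])
print(result)
```

Reporting a standardized difference alongside the significance test is useful with small intact classes, because a non-significant result alone does not show that the groups were equivalent.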
Grade 7 (ages 11–12) corresponds to a critical phase in second-language development, when learners begin to exhibit greater metacognitive awareness and increased responsiveness to narrative-based instruction, making them suitable candidates for digital storytelling activities55. At the same time, learners at this age are still developing higher-order listening strategies and frequently struggle with authentic input2, underscoring the need for scaffolded, multimodal interventions such as ODST.
Ethical approval for the study was obtained from the Ministry of Education Ethics Committee. Written informed consent was secured from classroom teachers, parents, and school administrators. To protect student identities, pseudonyms were used in all datasets and analytic reports. All interview excerpts were further screened to remove identifying details (e.g., teacher or school names).
Instructional intervention: integration of ODST
Consistent with the study’s aims, the instructional intervention was designed to be feasible and relevant to public-school EFL classrooms. The experimental class received ODST-integrated instruction embedded within the listening component of the Grade 7 English syllabus over eight weeks, with one 90-minute session per week. The intervention was designed to examine the impact of ODST on authentic listening comprehension and student engagement, compared with conventional listening instruction in the control class. To enhance ecological applicability, the intervention was implemented by the regular classroom teacher during existing syllabus time, with minimal disruption to routine assessment demands.
The selection of ODST as the primary instructional technique was grounded in its multimodal affordances and its alignment with Sociocultural Theory45,55, engagement perspectives emphasizing meaningful, technology-supported activity12,21,60, and multimedia learning principles43,44. Prior research indicates that digital storytelling can enhance motivation, comprehension, and vocabulary acquisition in young learners by integrating emotional, auditory, visual, and narrative elements within meaningful learning contexts47,57,59.
The digital stories used in the intervention were adapted from authentic spoken English materials, including real-life narratives, short folk tales, and age-appropriate dialogues sourced from oral storytelling websites and children’s storybooks. Each story lasted approximately 2–3 min and was delivered via an online digital platform that incorporated:
Narrated audio recorded by fluent English speakers;
Subtitled text with simplified L2 captions;
Visual illustrations and simple animations;
Sound effects and background music to enhance emotional and narrative salience.
ODST materials were created using StoryJumper, a free educational tool that enables multimodal layering of images, sound, and text. The classroom teacher received prior training from the researcher on using this platform and on implementing scaffolded listening instruction with ODST. Teacher-training demands were deliberately kept modest: a single two-hour workshop plus brief weekly check-ins, reflecting realistic professional-development constraints in public schools.
Lesson structure and activities
Each ODST-based lesson followed a structured three-phase framework designed to support comprehension and foster multidimensional engagement. In the pre-listening phase, the teacher activated learners’ prior knowledge through guided discussion of the story’s themes or related images, introduced key vocabulary using flashcards and pictures, and invited students to predict the story content based on the title and visual cues. During the while-listening phase, students first viewed or listened to the digital story without interruption to gain a global understanding, after which the story was replayed in shorter segments accompanied by comprehension checks (e.g., “How did the character feel?” “What happened next?”). At this stage, the teacher also provided brief strategy instruction on effective listening, drawing learners’ attention to the role of images, tone of voice, and key phrases in inferring meaning. In the post-listening phase, learners engaged in productive and reflective tasks, including story retelling via storyboards, pair retelling, or role-play, as well as short collaborative responses such as drawing favorite scenes or suggesting alternative endings. They then completed follow-up listening activities (e.g., sequencing events, true/false items, and wh-questions) targeting both gist and detailed comprehension.
To support consistent implementation, a detailed lesson plan guide was developed for the experimental class, and the researcher conducted a two-hour workshop for the classroom teacher on digital storytelling principles and classroom procedures. Weekly classroom observations and brief post-lesson meetings were also held to monitor fidelity, address questions, and resolve any emerging technical or pedagogical issues. Fidelity was monitored using an observation checklist that captured the presence/absence of core lesson components (e.g., prediction, segmented replay, strategy prompts, post-listening retell), and deviations (e.g., time loss due to technical issues) were logged to ensure interpretive transparency.
Control classroom instruction
Students in the control class received conventional listening instruction using the same thematic content (e.g., parallel dialogues or stories) drawn from the national EFL textbook. Listening input was delivered via textbook audio CDs, with no digital storytelling, multimedia tools, or visual support. Lessons typically involved students listening to the recordings and then completing workbook-based comprehension exercises and drills. The control condition thus represents an ecologically typical comparison for public-school listening instruction rather than an “inactive” control.
Instruments
Three instruments were used in this study.
Listening comprehension test
Two parallel versions of a listening comprehension test based on authentic materials were developed: one for the pretest and one for the posttest. Each version contained three short listening passages (2–3 min each), consisting of narratives, dialogues, or descriptive texts comparable in complexity and style to the digital stories used in the intervention. Each passage was followed by 10 multiple-choice items assessing inferential comprehension (e.g., implied meanings), literal comprehension (e.g., specific details), and global gist (e.g., main idea), for a total of 30 items (maximum score = 30).
The test was piloted with a comparable class of public-school EFL students to check clarity, age appropriateness, and task feasibility. Content validity was established through expert review by two experienced EFL teachers and two applied linguists, who confirmed that the passages and items reflected the grammar, vocabulary, and discourse features suitable for CEFR A1–A2 learners. To support construct validity, items were designed to target multiple levels of comprehension, consistent with contemporary models of listening. Reliability analysis in the pilot phase yielded a Cronbach’s alpha of 0.82, indicating good internal consistency. In the main study, internal consistency was recalculated for both test forms to verify reliability under intervention conditions.
Student engagement instrument (SEI)
Students’ engagement was measured using an adapted version of the Student Engagement Instrument originally developed by Appleton et al.20. The instrument was simplified linguistically for young EFL learners and implemented with a visual Likert scale using smiley-face icons ranging from “Not at all” to “Very much.” The adapted SEI comprised 18 items, equally distributed across three dimensions:
Cognitive engagement (e.g., “I try hard to understand the stories”);
Emotional engagement (e.g., “I enjoy listening to stories in English”);
Behavioral engagement (e.g., “I pay attention during listening activities”).
To account for varying reading proficiency, the instructor read the items aloud before students responded. Experts in child psychology and language education reviewed content validity. A pilot administration with 25 similar learners produced a Cronbach’s alpha of 0.87 for the full scale, with subscale alphas of 0.85 (emotional), 0.81 (behavioral), and 0.79 (cognitive), indicating acceptable reliability. Exploratory factor analysis supported the three-dimensional structure of the scale, consistent with engagement theory20. In the main sample, reliability was re-estimated, and scale-scoring procedures (including any reverse-coded items) were prespecified to enhance analytic transparency.
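The internal-consistency coefficients reported above follow the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The following minimal sketch illustrates the computation; the function name and the toy responses are hypothetical and are not the study's data or scoring script.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of k item scores per respondent)."""
    k = len(responses[0])
    # Sample variance of each item (column) across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 students x 3 items on a 1-4 scale (NOT study data)
toy_responses = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 3))
```

Applied to the real item-level data, the same function could be run once on all 18 items and once per six-item subscale to obtain the full-scale and subscale coefficients.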
Semi-structured interviews
Semi-structured interviews were conducted to explore participants’ perceptions of the ODST intervention (experimental group) and conventional instruction (control group). Ten students from each class (n = 20 total) volunteered to participate. The interview protocol focused on perceived enjoyment, difficulty, perceived impact on listening, and engagement with classroom activities. Two experts in language education evaluated the content validity of the interview questions. All interviews were audio-recorded, transcribed verbatim, and prepared for thematic analysis. To strengthen rigor, saturation was assessed using codebook stability: after iterative coding, the final interviews were examined for substantively new codes/themes; when none emerged, thematic saturation was judged to have been reached for the study’s focused aims (perceptions of ODST vs. conventional listening).
Procedure
After the two intact classes were selected, both groups took the listening comprehension pretest (Version A), based on authentic materials, which served as a baseline for subsequent comparison with the parallel posttest (Version B). Both classes continued to follow the national English syllabus, but differed in how listening was taught.
Over eight weeks, the experimental class received ODST-based listening instruction incorporating images, narration, sound effects, and music aligned with authentic input. The control class, in contrast, received conventional listening activities using authentic materials delivered without ODST (audio-only, textbook-based tasks). At the end of the eight weeks, both classes completed equivalent posttest versions of the listening comprehension test and the engagement scale.
Finally, semi-structured interviews were conducted with a subsample of students (n = 10 in the experimental group; n = 10 in the control group) to gather qualitative feedback on their experiences with digital storytelling and conventional listening instruction.
All quantitative data were coded using pseudonyms to ensure anonymity. Tests and questionnaires were administered in quiet classroom settings during regular school hours to provide standardized conditions. Scheduling of intervention sessions and assessments was coordinated with school staff to minimize disruption. Because device access can be a limiting constraint in many public schools, sessions were scheduled to match the school’s available computer/tablet slots; any interruptions due to connectivity or device sharing were recorded in the fidelity log to contextualize implementation realities.
Data analysis
Quantitative analysis
Descriptive statistics (means and standard deviations) were calculated to summarize pre- and posttest performance for both groups. Assumptions for parametric tests were checked using the Shapiro–Wilk test for normality and Levene’s test for homogeneity of variance. Baseline equivalence between groups was assessed via independent-samples t-tests on pretest scores.
To estimate treatment effects while accounting for baseline differences, posttest outcomes were analyzed using ANCOVA (for listening) and MANCOVA/ANCOVA (for engagement subscales), with the corresponding pretest scores entered as covariates. This approach aligns with the intact-class design and reduces bias due to initial score differences. Effect sizes (partial eta squared) were reported to indicate the magnitude of observed effects. Where relevant, adjusted means were reported to make group comparisons interpretable.
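To make the notion of adjusted means concrete, the sketch below applies the classical ANCOVA adjustment: each group's posttest mean is shifted by the pooled within-group slope times the distance between that group's pretest mean and the grand pretest mean. It assumes a single covariate and homogeneous regression slopes; the function name and the toy scores are hypothetical, and the sketch does not reproduce the software or data used in this study.

```python
from statistics import mean


def adjusted_means(groups):
    """groups: dict mapping group name -> (pre_scores, post_scores).

    Returns covariate-adjusted posttest means under a common within-group slope.
    """
    grand_pre = mean([x for pre, _ in groups.values() for x in pre])
    sxy = 0.0  # pooled within-group cross-products of pre and post
    sxx = 0.0  # pooled within-group sum of squares of pre
    for pre, post in groups.values():
        m_pre, m_post = mean(pre), mean(post)
        sxy += sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
        sxx += sum((x - m_pre) ** 2 for x in pre)
    slope = sxy / sxx  # pooled within-group regression slope
    return {
        name: mean(post) - slope * (mean(pre) - grand_pre)
        for name, (pre, post) in groups.items()
    }


# Hypothetical toy scores (NOT study data)
toy = {
    "ODST": ([10, 12, 14], [18, 20, 22]),
    "Control": ([11, 13, 15], [13, 15, 17]),
}
adj = adjusted_means(toy)
diff = adj["ODST"] - adj["Control"]
print(adj, diff)
```

In this toy example the control group starts slightly higher at pretest, so the adjustment widens the raw posttest gap of 5 points to an adjusted difference of 6 points.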
Because the study is single-site and quasi-experimental, all statistical conclusions are presented with caution regarding causal inference and generalizability. Reliability coefficients (Cronbach’s alpha) were recalculated for the listening test and engagement scale in the main sample. All analytic decisions (exclusions, missing-data handling, and scoring rules) were documented for reproducibility and will be made available with the archived dataset upon acceptance.
Qualitative analysis
Interview transcripts from both experimental and control groups were analyzed using thematic analysis69. The analysis focused on identifying salient themes related to students’ perceptions of ODST and conventional instruction, perceived motivational impact, perceived difficulty, and listening experiences. Coding proceeded through iterative reading, categorization of meaning units, and refinement of themes. A codebook was developed and iteratively refined; saturation was evaluated via codebook stability, and an audit trail was maintained to document code/theme revisions. Qualitative findings were used to complement and triangulate the quantitative results, providing richer insight into how ODST influenced learners’ engagement and perceptions relative to traditional listening instruction.
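The codebook-stability criterion can be expressed as a simple tally: for each interview in coding order, count the codes not seen in any earlier interview, and treat saturation as reached when the final interviews contribute none. The sketch below is illustrative only; the function names, the `final_window` parameter, and the example codes are hypothetical and do not represent the study's actual coding records.

```python
def new_codes_per_interview(coded_interviews):
    """Count codes that appear for the first time in each interview, in coding order."""
    seen = set()
    counts = []
    for codes in coded_interviews:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def saturation_reached(counts, final_window=2):
    """True if the last `final_window` interviews added no new codes."""
    return all(c == 0 for c in counts[-final_window:])


# Hypothetical coded interviews (NOT study data)
toy_interviews = [
    ["captions help", "pictures guide"],
    ["captions help", "too fast"],
    ["pictures guide"],
    ["too fast", "captions help"],
]
counts = new_codes_per_interview(toy_interviews)
print(counts, saturation_reached(counts))
```

The size of the final window is a judgment call; a wider window gives a more conservative saturation decision.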
Results
Assumption checking for MANCOVA
Before the main analyses, assumptions for MANCOVA and the follow-up ANCOVAs were examined. Visual inspection of histograms and Q–Q plots, together with Shapiro–Wilk tests, indicated that posttest scores for listening comprehension and the three engagement dimensions did not significantly deviate from normality within groups. Levene’s tests indicated homogeneity of variance across groups for all posttest dependent variables. Inspection of scatterplots showed approximately linear relationships between each covariate (corresponding pretest score) and its posttest score, and tests of homogeneity of regression slopes indicated no significant Group × Covariate interactions, supporting the use of ANCOVA. The intercorrelations among the four dependent variables (posttest listening, cognitive, emotional, and behavioral engagement) were moderate and positive, suggesting that multicollinearity was not problematic. Box’s M test for equality of covariance matrices was nonsignificant, indicating that the assumption of homogeneity of variance–covariance matrices was tenable. Collectively, these results supported proceeding with a one-way MANCOVA with Group (ODST vs. control) as the fixed factor and the four posttest scores as dependent variables, controlling for their respective pretest scores. Although the diagnostic checks indicated that the planned inferential model was appropriate for the intact-class dataset, the results should still be interpreted in light of the quasi-experimental, single-site design.
Descriptive statistics
Table 1 summarizes the descriptive statistics for authentic listening comprehension at pretest and posttest for the ODST and control classes. As shown, both groups started at a comparable level (ODST: M = 14.20, SD = 3.10; control: M = 13.80, SD = 3.40), but the ODST class showed a marked increase at posttest (M = 21.80, SD = 2.70), whereas the control class showed only a modest gain (M = 15.40, SD = 3.20). This pattern is consistent with a substantially larger pre–post improvement under ODST, even before covariate adjustment.
Table 1.
Listening comprehension (0–30) by group and time.
| Class | n | Pre Mean | Pre SD | Post Mean | Post SD |
|---|---|---|---|---|---|
| ODST (Experimental) | 30 | 14.20 | 3.10 | 21.80 | 2.70 |
| Control | 29 | 13.80 | 3.40 | 15.40 | 3.20 |
Note. Values are M (SD). ODST n = 30; Control n = 29.
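As an illustration of the baseline-equivalence check, the pretest summary statistics in Table 1 are sufficient to recompute an independent-samples t statistic. The sketch below assumes the equal-variance (pooled) form of the test, which the text does not specify; the function name is hypothetical.

```python
from math import sqrt


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's t, Cohen's d, and df from group means, SDs, and sizes (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the mean difference
    t = (m1 - m2) / se
    d = (m1 - m2) / sqrt(pooled_var)  # standardized mean difference
    return t, d, df


# Pretest listening values from Table 1: ODST vs. control
t, d, df = pooled_t_from_summary(14.20, 3.10, 30, 13.80, 3.40, 29)
print(round(t, 2), round(d, 2), df)
```

Under these assumptions the pretest difference corresponds to t of roughly 0.47 on 57 degrees of freedom (d of roughly 0.12), well below the conventional two-tailed critical value of about 2.00, which is consistent with the reported baseline comparability.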
Table 2 presents the descriptive statistics for the three engagement dimensions (behavioral, cognitive, and emotional) by group and time. Pretest means were similar across conditions, with both groups reporting moderately high engagement. At posttest, however, the ODST group exhibited consistently higher means across all three dimensions than the control group. In particular, emotional engagement in the ODST class rose to M = 3.80 (SD = 0.30), compared with M = 3.20 (SD = 0.50) in the control class, while cognitive and behavioral engagement showed parallel advantages for the ODST group. Importantly, these posttest differences were observed in a routine public-school setting with typical constraints on time, devices, and teacher workload, suggesting that the engagement advantage was not dependent on intensive or atypical implementation conditions.
Table 2.
Engagement by dimension, group, and time (1–4).
| Dimension | Pre Mean – Control | Pre SD – Control | Pre Mean – ODST | Pre SD – ODST | Post Mean – Control | Post SD – Control | Post Mean – ODST | Post SD – ODST |
|---|---|---|---|---|---|---|---|---|
| Behavioral | 3.10 | 0.60 | 3.00 | 0.50 | 3.20 | 0.50 | 3.70 | 0.40 |
| Cognitive | 2.80 | 0.70 | 2.90 | 0.60 | 3.00 | 0.60 | 3.60 | 0.50 |
| Emotional | 3.00 | 0.50 | 3.10 | 0.40 | 3.20 | 0.50 | 3.80 | 0.30 |
Note. Values are M (SD). ODST n = 30; Control n = 29.
Follow-up ANCOVAs
Given the significant multivariate Group effect, follow-up univariate ANCOVAs were conducted for each dependent variable, using the corresponding pretest scores as covariates. For listening comprehension, the ANCOVA revealed a significant Group effect on posttest scores after controlling for pretest listening, F(1, 55) = 16.45, p < .001, ηp² = 0.22. Adjusted posttest means indicated that the ODST class substantially outperformed the control class (ΔM_adj ≈ 6.40, 95% CI [4.85, 7.95]), representing a large, educationally meaningful advantage in authentic listening comprehension. Because intact classes were compared, this effect is best interpreted as a strong treatment-associated difference under real classroom conditions rather than a fully isolated causal effect.
For engagement, parallel ANCOVAs for each dimension showed significant Group effects: behavioral engagement, F(1, 55) = 10.42, p = .002, ηp² = 0.15; emotional engagement, F(1, 55) = 14.03, p < .001, ηp² = 0.20; and cognitive engagement, F(1, 55) = 9.15, p = .004, ηp² = 0.14. In each case, ODST students reported higher adjusted posttest engagement than their peers in the control class, with the largest effect emerging for emotional engagement. Notably, the strongest advantage in emotional engagement aligns with the narrative and affective affordances of ODST (e.g., suspense, character identification, music), which may help sustain attention to challenging authentic input.
Within the ODST group, exploratory correlational analysis further showed that gains in listening comprehension were positively associated with increases in overall engagement (r = .47, p = .009, 95% CI [0.13, 0.71]), suggesting that students who improved more in listening also tended to report larger increases in engagement. This association is consistent with (but does not prove) a mechanism-based interpretation in which engagement supports persistence and strategic processing during listening tasks.
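The confidence interval for the correlation can be reproduced with the Fisher z-transformation, a standard approach for intervals around Pearson's r. The sketch below assumes that all 30 ODST students contributed gain scores (n = 30) and that a normal-approximation interval was used; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence interval for Pearson's r via the Fisher z-transformation."""
    z = atanh(r)                # transform r to the z scale
    se = 1 / sqrt(n - 3)        # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return tanh(z - crit * se), tanh(z + crit * se)  # back-transform the limits


# r and group size as reported for the ODST class (n = 30 is an assumption)
lower, upper = fisher_ci(0.47, 30)
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the interval rounds to [0.13, 0.71], matching the reported values. The width of the interval underlines that, with 30 learners, the association is estimated imprecisely.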
Taken together, the ANCOVA results indicate that ODST was associated with medium-to-large advantages over conventional instruction for both authentic listening comprehension and multidimensional engagement, even after adjusting for baseline differences. (See Table 3.)
Table 3.
Follow-up ANCOVAs for listening and engagement outcomes (Posttest controlling for Pretest).
| Dependent Variable | Covariate (Pretest) | F(1, 55) | p | ηp² | Adj. ΔM (ODST – Control) | Direction |
|---|---|---|---|---|---|---|
| Listening comprehension | Listening pretest | 16.45 | < 0.001 | 0.22 | 6.40 [4.85, 7.95] | ODST > Control |
| Behavioral engagement | Behavioral pretest | 10.42 | 0.002 | 0.15 | 0.50 | ODST > Control |
| Emotional engagement | Emotional pretest | 14.03 | < 0.001 | 0.20 | 0.60 | ODST > Control |
| Cognitive engagement | Cognitive pretest | 9.15 | 0.004 | 0.14 | 0.60 | ODST > Control |
Note. Adj. ΔM = adjusted mean difference (ODST minus control). Values for engagement differences are consistent with observed posttest means; listening difference includes 95% CI as reported in the text.
Qualitative findings
To supplement the quantitative assessment of ODST’s effects on engagement and listening comprehension, semi-structured interviews with a purposive subsample (n = 20; 10 ODST, 10 control) were conducted. Thematic analysis was performed using an iterative codebook approach; saturation was evaluated via codebook stability (i.e., no substantively new themes emerged in the final interviews within each condition), supporting the adequacy of the sample for the study’s focused perceptual aims. An audit trail was maintained to document coding decisions and theme refinement.
Overall, students in the ODST class reported stronger motivation, clearer comprehension, and greater willingness to participate than their control peers. These patterns align with the posttest gains in listening and engagement. Table 4 presents the thematic map with exemplar quotes and indicates how themes converge with quantitative patterns. Table 5 summarizes contrastive patterns and triangulation, and Table 6 provides additional representative excerpts.
Table 4.
Thematic map of interview findings with exemplar quotes (ODST vs. Control).
| Theme | Axial subthemes | Representative open codes | Exemplar excerpt | Alignment with quantitative results |
|---|---|---|---|---|
| Multimodal scaffolding reduces listening load | Captions as support; Image–speech alignment; Chunked replay | “captions help”, “pictures guide”, “replay in parts” | E07 (ODST): “Pictures and words at the bottom helped me follow.” | Matches higher comprehension gains in ODST; supports reduced processing load |
| Narrative & emotion drive attention | Curiosity/suspense; Affective resonance | “What happens next?”, “I felt the hero’s worry” | E03 (ODST): “I wanted to know what happens next, so I listened more carefully.” | Converges with increases in emotional engagement |
| Agency & participation in story tasks | Group retell; Role-play; Choice of scenes | “retell in groups”, “act a scene”, “choose parts” | E10 (ODST): “After watching, we retold in groups, and I spoke more English.” | Aligns with higher behavioral engagement |
| Confidence & willingness to communicate | Answering aloud; Reduced shyness | “I can answer”, “not shy” | E01 (ODST): “With the story I feel I can answer questions because I understand more.” | Consistent with behavioral & cognitive engagement gains |
| Strategy uptake & transfer beyond stories | Prediction; Monitoring; Using prosody/keywords | “predict from title”, “guess from tone”, “check while listening.” | E05 (ODST): “I started guessing from intonation and key words—even without pictures.” | Echoes cognitive-engagement increases and posttest performance |
| Contrastive (control): monotony & overload | Audio-only fatigue; Fast speech; Low relevance | “too fast”, “no pictures”, “just exercises.” | C04 (Control): “The CD is fast, and I lose it… I stop trying.” | Explains weaker gains in listening and engagement in the control group |
Note. ODST excerpts are labeled “E”; control excerpts are labeled “C”.
Table 5.
Contrastive patterns and triangulation of qualitative themes with quantitative results.
| Pattern | Interview evidence | Quantitative convergence | Instructional implication |
|---|---|---|---|
| ODST sustains attention & enjoyment | “What happens next?”; “fun and interesting” (E03, E09) | Higher emotional engagement in ODST | Embed story arcs/music; maintain suspense to sustain focus |
| Visual/caption support aids decoding | “Pictures and words helped me follow” (E07) | Higher listening posttest for ODST | Pair visuals with concise captions; highlight key phrases/prosody |
| Agency increases participation | Retells/role-play/choice raised on-task talk (E10, E02) | Higher behavioral engagement in ODST | Include short collaborative retell or scene-acting tasks |
| Strategy uptake extends beyond ODST | Prediction and monitoring used elsewhere (E05, E08) | Higher cognitive engagement in ODST | Teach prediction/monitoring explicitly; prompt transfer to new audio |
| Control fatigue & overload | “CD is fast… no pictures” (C04) | Smaller gains for control | Avoid audio-only drills for novices; add scaffolds or ODST elements |
Table 6.
Interview excerpts by theme.
| Theme | Excerpts |
|---|---|
| Multimodal scaffolding reduces listening load | E07 (ODST): “When I watched the story, the pictures and the words at the bottom helped me follow.”/E04 (ODST): “If I miss a part, the replay in small pieces makes it clear.” |
| Narrative & emotion drive attention | E03 (ODST): “I wanted to know what happens next, so I listened more carefully.”/E09 (ODST): “The hero felt worried, and I was also worried so I kept listening.” |
| Agency & participation in story tasks | E10 (ODST): “After watching, we retold in groups and I spoke more English.”/E02 (ODST): “Choosing scenes to act made me pay attention to details.” |
| Confidence & willingness to communicate | E01 (ODST): “With the story, I feel I can answer questions because I understand more.”/E06 (ODST): “I’m not shy now; I know what to say from the story.” |
| Strategy uptake & transfer beyond stories | E05 (ODST): “I started guessing from intonation and keywords even when there are no pictures.”/E08 (ODST): “First, I predict from the title and images, then I check if I’m right while listening.” |
| Contrastive (control): monotony & overload | C04 (Control): “The CD is fast, and I lose it. No pictures, so I stop trying.”/C07 (Control): “It’s just exercises I get bored with and forget.” |
Synthesizing across strands, ODST students consistently attributed improved comprehension and persistence to (a) multimodal scaffolds (captions, images, chunked replay) that reduced perceived processing burden and (b) narrative features that made listening feel purposeful and emotionally engaging. Control students more often described audio-only tasks as fast, monotonous, and discouraging, reporting word-by-word breakdowns and reduced effort. These qualitative mechanisms provide a coherent explanatory account for the quantitative pattern: higher adjusted posttest listening scores, higher engagement across dimensions, and the positive association between engagement gains and listening gains within the ODST group.
Finally, these findings should be interpreted within the study’s methodological boundaries: the qualitative themes reflect the perceptions of a small, purposively selected subsample from a single context, and the quantitative effects are based on intact-class comparisons. Nevertheless, convergence between statistically significant group differences and thematically consistent student accounts strengthens confidence in the practical plausibility of the ODST mechanism in this setting.
Discussion
This study investigated whether ODST enhances junior high school EFL students’ comprehension of authentic listening input and their behavioral, emotional, and cognitive engagement, and how learners perceive ODST. Consistent with both hypotheses, the ODST class exceeded the control class on the posttest of authentic listening and showed significantly higher engagement across all three dimensions. The moderate relation between improvements in engagement and gains in listening (r = .47) suggests that ODST’s impacts on comprehension may be at least partly engagement-mediated, an interpretation that accords with socio-cognitive accounts of listening and motivation in classroom contexts13,22. However, because this was a single-site quasi-experiment with intact classes, the correlational pattern should be interpreted as mechanism-consistent evidence rather than causal mediation68.
Main findings in light of theory
For RQ1, the adjusted group differences in listening comprehension indicated a substantial advantage for ODST in processing authentic materials. Interpreted through the Cognitive Theory of Multimedia Learning, the coordinated use of narration, captions, visuals, and sound likely reduced extraneous load and supported the integration of pictorial and verbal information, helping students construct more stable situation models of rapid, variable speech43,44. The narrative structure of ODST also seems to have stimulated top-down processes—prediction, inference, and schema activation—known to facilitate comprehension of authentic speech, particularly for novices encountering accent and rate variation2,39,50. From a cognitive-load perspective, ODST’s multimodal scaffolds may have kept total processing demands within working-memory limits43,44, which is critical for younger learners facing authentic input7,33,70.
For RQ2, robust post-intervention effects on emotional, behavioral, and cognitive engagement were consistent with technology-supported engagement approaches and with classroom research emphasizing that engagement is multidimensional and context-sensitive3,12,21,22. The qualitative themes (e.g., clarity from visuals/audio, enjoyment, willingness to participate) converge with these quantitative impacts and point to reduced threat, increased comprehensibility, and more explicit task purpose as proximal drivers of effort and attention13,22. Jointly, the results also align with a sociocultural view in which story-based, shared activities mediate learning within the ZPD through strategy use, confidence building, and supported participation45,55.
ODST can be viewed as a specific instantiation of story-based learning that operationalizes engagement-supportive conditions: narrative coherence provides a purpose for listening; multimodal cues increase perceived comprehensibility; and collaborative retell/role-play tasks foster social accountability and agency16,22,55. In this sense, the “ODST effect” is not only technological but also narrative and interactional13,21.
Evidence from prior EFL/ESL research
The ODST benefit for authentic listening fits within EFL/ESL research showing that DST and multimodal narratives can improve motivation and comprehension when principled design is foregrounded40,47,59. In elementary and secondary EFL settings, quasi-experimental studies report improvements in listening inference, accuracy, and gist recognition with multimodal or interactive stories, attributing enhancements to paralinguistic signals, contextual redundancy, and visual scaffolds59,66,67. Complementary work supports positive effects on autonomy, vocabulary, and motivation, especially when students co-produce or retell stories29,47,57. In EAP/ESL settings, project-based digital video/storytelling tasks can promote collaboration, ownership, and strategic processing, supporting receptive outcomes and self-regulated learning27,65. More generally, scaffolding authentic listening through visuals, staged pre-, during-, and post-task design, and explicit strategy prompts has been repeatedly linked to improved comprehension and persistence1,2,33,34.
Recent engagement-oriented research also reinforces the relevance of learners’ perceptions and beliefs in technology-mediated language learning. For example, Wang and Reynolds19 show that engagement with large language models for vocabulary learning is shaped by learners’ perceived value and contextual factors, and Wang et al.18 similarly highlight belief profiles about AI-mediated informal digital learning. Although these studies address different tools and skills, they underscore a compatible principle: technology only “produces engagement” when learners interpret it as usable, valuable, and supportive, conditions that ODST may create through narrative meaning, multimodal comprehensibility, and visible progress cues17,22.
ODST appears to support authentic listening by (a) reducing cognitive load, (b) increasing affective willingness to persist with difficult input, and (c) enabling social mediation through collaborative story tasks13,43,44,55. These mechanisms map directly onto the dual outcome pattern observed here (listening + engagement), rather than peripheral outcomes12,21.
The literature cautions that multimedia or authenticity alone does not guarantee improved listening, particularly for novice learners. Reviews of technology-enhanced L2 listening and classroom listening pedagogy highlight that poorly aligned multimedia can add extraneous load and that listening tasks sometimes under-scaffold strategy use, yielding modest impacts33,34,43,44,50. These mixed results do not contradict the present findings; instead, they emphasize that ODST’s efficacy hinges on principled design—signaling, coherence, brief segments, and explicit strategy instruction—and developmentally appropriate pacing1,33,34,43,44. The current intervention incorporated these principles (e.g., pre-, during-, and post-task scaffolds; 2–3-minute stories; segmented replay), which may help explain the stronger effects observed2,7,8.
Plausible mechanisms linking ODST to outcomes
Three interacting mechanisms offer a coherent explanation for the ODST advantage observed in the current study. First, cognitive integration under load constraints: dual-channel presentation, combined with concise captions, likely helped students map sounds to meanings without saturating working memory43,71. Second, social mediation and participation: guided retells, shared viewing, and storyboard tasks provided opportunities for joint attention and peer scaffolding, consistent with ZPD dynamics45,55. Third, affective alignment: music, emotional arcs, and identifiable characters appear to have increased confidence and interest; this was evident in qualitative accounts and mirrored by the strongest quantitative advantage in emotional engagement21,23,72,73. Together, these mechanisms also provide a plausible account of why engagement gains covaried with listening gains within the ODST group, without implying that engagement is the only pathway or that the correlation establishes mediation12,13.
Pedagogical implications
For public-school EFL classrooms aiming to integrate authentic listening earlier in the syllabus, ODST offers a feasible, developmentally appropriate pathway. Practically, teachers can: (a) prime background knowledge and a small set of key lexis; (b) provide one uninterrupted viewing for gist followed by short segmented replays tied to gist/detail prompts; (c) make strategies explicit (e.g., using visuals, prosody, and captions to infer meaning; monitoring comprehension; tolerating ambiguity); and (d) consolidate with storyboards, brief retells, or alternative endings.
Teacher training burden: ODST adoption can be supported with a short, targeted workshop on multimedia principles (coherence, signaling, contiguity) plus reusable lesson templates, rather than extensive training.
Device availability: Where 1:1 access is unrealistic, ODST can be implemented with whole-class projection and structured pair/group response tasks; learners still benefit from multimodal cues and narrative coherence even without individual devices.
Classroom time: ODST works best with micro-routines (2–3-minute stories, segmented replay, two or three focused prompts, one short output task) that fit existing syllabus time rather than adding units.
In resource-constrained settings, teachers can approximate ODST using print-visual storyboards, audio, and teacher-led signaling, thereby preserving core design features even without full interactivity. For decision-makers, the practical “cost profile” of ODST is thus less about high-end technology and more about access to stable audio-visual display and a small amount of teacher capacity for principled task design.
Limitations and directions for future research
This study has limitations that constrain interpretation and generalizability. The quasi-experimental, single-site design with a modest sample limits causal inference and external validity68. Classroom-level variability (e.g., teacher discourse moves) may have influenced effects despite fidelity monitoring. Engagement assessment was primarily self-reported, a common but imperfect approach for younger students, given metacognitive constraints and potential social desirability bias21,60.
Future work should use multi-site or cluster-randomized trials, incorporate observational or trace indicators of engagement (e.g., time-on-task logs, interaction counts, clickstream data), and include delayed posttests to examine retention71. Comparative studies in Chinese schools could pit ODST against other scalable listening supports (e.g., captioned short-form video, podcasts) to identify which design characteristics matter most locally. Finally, formal mediation models (with adequate sample sizes and longitudinal measurement) can test whether specific engagement components—particularly cognitive engagement—statistically account for ODST-related gains in listening.
To meet contemporary Open Science expectations, future studies should, where feasible, also preregister analysis plans and deposit minimal anonymized datasets and analysis code in public repositories, subject to ethical constraints.
Contribution
This study advances the field in three specific ways. First, it provides classroom-based quasi-experimental evidence (via covariate-adjusted intact-class comparisons) that ODST can simultaneously improve authentic listening comprehension and multidimensional engagement in a public-school junior high context, an underrepresented setting in ODST research. Second, it links outcomes to mechanism-consistent learner accounts (multimodal load reduction, narrative-driven attention, and socially mediated participation), thereby offering an explanatory pathway rather than merely reporting outcomes. Third, it operationalizes a feasible ODST routine (brief stories, segmented replay, explicit strategy prompts, short collaborative output) that can inform scalable instructional design under typical public-school constraints.
Conclusion
This mixed-methods study sought to determine whether ODST can enhance junior-high EFL students’ comprehension of authentic listening input and their emotional, behavioral, and cognitive engagement. Over 8 weeks, the ODST class achieved significantly greater improvements in listening than the control class and reported higher engagement across all three dimensions. Improvements in engagement were moderately correlated with advances in listening, suggesting that ODST may support comprehension through positive affect, sustained attention, and the use of effortful strategies. Internal consistency for the engagement assessment was acceptable, supporting these inferences. Conceptually, the pattern of findings is consistent with a multimodal-narrative pathway to listening enhancement: coordinated audio–visual cues and clear story structure lessen extraneous processing, facilitate mapping of form to meaning, and encourage collaborative sense-making. These conditions appear to translate into measurable comprehension improvements and more persistent engagement [71].
In sum, within ordinary public-school constraints, ODST yielded educationally meaningful improvements in authentic listening while simultaneously elevating learner engagement [33,34]. Framed as a principled pedagogy rather than a technology add-on, ODST offers a dual benefit (comprehension and engagement) that justifies its adoption in junior-high EFL listening programs and warrants continued investigation in broader, longer-term implementations.
Acknowledgements
We acknowledge that AI applications were used for proofreading the manuscript.
Author contributions
W.W., L.Z., and J.Z. designed the study and collected the data. W.W. analyzed and interpreted the data. L.Z. and J.Z. drafted the manuscript. W.W., L.Z., and J.Z. proofread the paper, verified the submitted version, and agreed to be accountable for it.
Funding
This work was sponsored in part by the 2023 Featured Innovation Project, Guangdong Scientific Research Projects for the Higher-educational Institution [grant number: 2023WTSCX027], and by the 2025 Philosophy and Social Sciences Planning Project of the Guangdong Provincial Government [grant number: GD25YWW03].
Data availability
Data is provided within the manuscript.
Declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
Before participating, all subjects provided informed consent. The study was conducted in accordance with the Declaration of Helsinki, and Guangdong University of Finance and Economics approved the protocol.
Informed consent
Informed consent was obtained from the experts.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Prasongngern, P. & Soontornwipast, K. Effects of listening strategy instruction incorporating intensive and extensive listening on listening skills and metacognitive awareness. Int. J. Instruction 16 (4), 155–172 (2023).
- 2. Vandergrift, L. & Goh, C. C. M. Teaching and Learning Second Language Listening: Metacognition in Action (Routledge, 2012).
- 3. Carroll, M., Lindsey, S. & Chaparro, M. Integrating engagement-inducing interventions into traditional, virtual and embedded learning environments. In International Conference on Human-Computer Interaction 263–281 (Springer International Publishing, Cham, 2019).
- 4. Hedgcock, J. S. & Ferris, D. R. Teaching Readers of English: Students, Texts, and Contexts (Routledge, 2018).
- 5. AlShareef, S. N. A. Applying the communicative approach in assessing EFL young learners. Eurasian J. Appl. Linguistics. 10 (1), 14–27 (2024).
- 6. Dasam, S. Essentials of English Language and Communication (Academic Guru Publishing House, 2025).
- 7. Renandya, W. A. & Farrell, T. S. C. ‘Teacher, the tape is too fast!’: extensive listening in ELT. ELT J. 65 (1), 52–59 (2011).
- 8. Goh, C. & Taib, Y. Metacognitive instruction in listening for young learners. ELT J. 60 (3), 222–232 (2006).
- 9. Abdushukurova, U. Teaching receptive skills: reading and listening. Eur. J. Humanit. Educational Advancements. 5 (5), 58–63 (2024).
- 10. Gilmore, A. I prefer not text: developing Japanese learners’ communicative competence with authentic materials. Lang. Learn. 61 (3), 786–819 (2011).
- 11. Yin, M. The effect and importance of authentic language exposure in improving listening comprehension. (2015).
- 12. Hiver, P., Al-Hoorie, A. H., Vitta, J. P. & Wu, J. Engagement in language learning: A systematic review of 20 years of research methods and definitions. Lang. Teach. Res. 28 (1), 201–230 (2024).
- 13. Miller, E. R. & Coen, L. L. Engagement in second language listening: A socio-cognitive perspective. Appl. Linguistics Rev. 12 (2), 273–296 (2021).
- 14. Kahu, E. Increasing the emotional engagement of first year mature-aged distance students: interest and belonging. Int. J. First Year High. Educ. 5 (2), 45–55 (2014).
- 15. Wandas, A. M. P. English language needs of non-language education students. Cognizance J. Multidisciplinary Stud. 4 (5), 283–306 (2024).
- 16. Baralt, M., Gurzynski-Weiss, L. & Kim, Y. Engagement with the language: how examining learners’ affective and social engagement explains successful learner-generated attention to form. In Peer Interaction and Second Language Learning: Pedagogical Potential and Research Agenda Vol. 26, 209–239 (John Benjamins Publishing Company, 2016).
- 17. Han, R., Alibakhshi, G., Lu, L. & Labbafi, A. Digital communication activities and EFL learners’ willingness to communicate and engagement: exploring the intermediate language learners’ perceptions. Heliyon 10 (3), e25213. 10.1016/j.heliyon.2024.e25213 (2024).
- 18. Wang, X., Gao, Y., Reynolds, B. L. & Wang, Q. Exploring Chinese EFL learners’ beliefs about AI-mediated informal digital learning of English (IDLE): insights from Q methodology. Porta Linguarum. (XIII), 131–146. 10.30827/portalin.viXIII.31925 (2025). Revista Seug.
- 19. Wang, X. & Reynolds, B. L. Beyond the books: exploring factors shaping Chinese English learners’ engagement with large language models for vocabulary learning. Educ. Sci. 14 (5), 496. 10.3390/educsci1405049 (2024).
- 20. Appleton, J. J., Christenson, S. L., Kim, D. & Reschly, A. L. Measuring cognitive and psychological engagement: validation of the student engagement instrument. J. Sch. Psychol. 44 (5), 427–445 (2006).
- 21. Fredricks, J. A., Blumenfeld, P. C. & Paris, A. H. School engagement: potential of the concept, state of the evidence. Rev. Educ. Res. 74 (1), 59–109 (2004).
- 22. Mercer, S. & Dörnyei, Z. Engaging Language Learners in Contemporary Classrooms (Cambridge University Press, 2020).
- 23. Pakdaman, A., Alibakhshi, G. & Baradaran, A. The impact of negotiated syllabus on foreign language learners’ language anxiety and learning motivation. Teach. Engl. Lang. 16 (1), 35–63 (2022).
- 24. Lambert, C., Philp, J. & Nakamura, S. Learner-generated content and engagement in second language task performance. Lang. Teach. Res. 21 (6), 665–680 (2017).
- 25. Ghelichli, Y., Seyyedrezaei, S. H., Barani, G. & Mazandarani, O. The relationship between dimensions of student engagement and language learning motivation among Iranian EFL learners. Int. J. Foreign Lang. Teach. Res. 2 (31), 43 (2020).
- 26. Schnitzler, K., Holzberger, D. & Seidel, T. All better than being disengaged: student engagement patterns and their relations to academic self-concept and achievement. Eur. J. Psychol. Educ. 36 (3), 627–652 (2021).
- 27. Chubko, N., Morris, J. E., McKinnon, D. H., Slater, E. V. & Lummis, G. W. Digital storytelling as a disciplinary literacy enhancement tool for EFL students. Education Tech. Research Dev. 68 (6), 3587–3604 (2020).
- 28. Zemali, S. The role of audio narration and visual aids in teaching short stories in Algerian middle school. 15 (1), 889–900 (2025).
- 29. Razmi, M., Pourali, S. & Nozad, S. Digital storytelling in EFL classroom (oral presentation of the story): A pathway to improve oral production. Procedia - Social Behav. Sci. 98, 1541–1544 (2014).
- 30. Alibakhshi, G., Abdollahi, H. & Nezakatgoo, B. Exploring the antecedents of English language teachers’ teaching self-efficacy: a qualitative study. Qualitative Res. J. 21 (3), 286–303. 10.1108/QRJ-05-2020-0040 (2021).
- 31. Yaşar, B. The Effects of Project-based Learning Using Storytelling on Enhancing EFL Young Learners’ 21st Century Skills (Master’s thesis, Instituto Politecnico do Porto (Portugal)).
- 32. Dobakhti, L., Zohrabi, M. & Masoudi, S. Scrutinizing the affective predictors of teacher immunity in foreign language classroom. Teach. Engl. Lang. 16 (1), 65–88 (2022).
- 33. Zhang, R., Zou, D. & Cheng, G. A systematic review of technology-enhanced L2 listening development since 2000. Lang. Learn. Technol. 27 (3), 41–64. 10.64152/10125/73531 (2023).
- 34. Graham, S. Research into practice: listening strategies in an instructed classroom setting. Lang. Teach. 50 (1), 107–119 (2017).
- 35. Parsaiyan, S. F. & Mansouri, S. Digital storytelling (DST) in nurturing EFL teachers’ professional competencies: A qualitative exploration. Technol. Assist. Lang. Educ. 2 (4), 141–167 (2024).
- 36. Alibakhshi, G. & Mohammadi, M. J. Synchronous and asynchronous multimedia and Iranian EFL learners’ learning of collocations. Appl. Res. Engl. Lang. 5 (2), 237–254 (2016).
- 37. Zarei, M. A., Alibakhshi, G. & Nezakatgoo, B. Strategies employed by EFL teachers to cope with language learners’ classroom anxiety. TESOL J. 15 (4), e843. 10.1002/tesj.84 (2024).
- 38. Amiryousefi, M. The incorporation of flipped learning into conventional classes to enhance EFL learners’ L2 speaking, L2 listening, and engagement. Innov. Lang. Learn. Teach. 13 (2), 147–161 (2019).
- 39. Vandergrift, L. Second language listening: presage, process, product, and pedagogy. In Handbook of Research in Second Language Teaching and Learning Vol. 2 (ed. Hinkel, E.) 455–471 (Routledge, 2011).
- 40. Tsou, W., Wang, W. & Tzeng, Y. Applying a multimedia storytelling website in foreign language learning. Comput. Educ. 47 (1), 17–28 (2006).
- 41. Kearsley, G. Distance learning. In Connotative Learning: The Trainer’s Guide to Learning Theories and Their Practical Application to Training Design, 101 [publisher/location needed] (2004).
- 42. Xiao, Y. The impact of AI-driven speech recognition on EFL listening comprehension, flow experience, and anxiety: A randomized controlled trial. Humanit. Social Sci. Commun. 12 (1), 1–14 (2025).
- 43. Mayer, R. E. Cognitive theory of multimedia learning. In The Cambridge Handbook of Multimedia Learning 31–48 (2005).
- 44. Mayer, R. E. Multimedia Learning 2nd edn (Cambridge University Press, 2009).
- 45. Verenikina, I. Vygotsky’s socio-cultural theory and the zone of proximal development. [details needed]. (2003).
- 46. Lowrance-Faulhaber, E. Young English learners as writers: An exploration of teacher-student dialogic relationships in two mainstream classrooms [dissertation] (University of Cincinnati, Cincinnati, OH, 2020).
- 47. Yang, Y. T. C. & Wu, W. C. I. Digital storytelling for enhancing student academic achievement, critical thinking, and learning motivation: A year-long experimental study. Comput. Educ. 59 (2), 339–352 (2012).
- 48. Mukhtorova, M. & Ilxomov, X. How to improve listening skills of both ESL and EFL students. Qo‘Qon Universiteti Xabarnomasi. 11, 84–86 (2024).
- 49. Polishchuk, G. Listening as the major language processing skill of spoken interaction. [journal/title language: Ukrainian; article number 2025189; details needed]. (2025).
- 50. Vandergrift, L. Recent developments in second and foreign language listening comprehension research. Lang. Teach. 40 (3), 191–210 (2007).
- 51. Afriyuninda, E. & Oktaviani, L. The Use of English Songs To Improve English Students’ Listening Skills ([details needed], 2021).
- 52. Qizi, T. M. S. The main difficulties encountered in the beginning of the English language. Am. J. Social Sci. Humanity Res. 3 (12), 85–98 (2023).
- 53. Goh, C. C. M. Listening as process: learning activities for self-appraisal and self-regulation. In English Language Teaching Materials (ed. Harwood, N.) 179–206 (Cambridge University Press, 2012).
- 54. Banaszewski, T. M. Digital storytelling: Supporting digital literacy in grades 4–12 [dissertation] (Georgia Institute of Technology, Atlanta, GA, 2005).
- 55. Lantolf, J. P. & Thorne, S. L. Sociocultural Theory and the Genesis of Second Language Development (Oxford University Press, 2006).
- 56. Kramsch, C. From communicative competence to symbolic competence. Mod. Lang. J. 90 (2), 249–252 (2006).
- 57. Riaz, H. & Riasat, A. Effect of digital storytelling on vocabulary retention and language motivation of Pakistani EFL learners. Lang. Teach. Res. Q. 33 (1), 52–65 (2023).
- 58. Gregersen, T., MacIntyre, P. D. & Meza, M. D. The motion of emotion: idiodynamic case studies of learners’ foreign language anxiety. Mod. Lang. J. 98 (2), 574–588 (2014).
- 59. Xu, Q., Wang, X. & Ma, Q. Enhancing EFL students’ listening skills through digital storytelling: A quasi-experimental study. Computer-Assisted Lang. Learn. 33 (5–6), 570–594 (2020).
- 60. Reschly, A. L. & Christenson, S. L. Jingle, jangle, and conceptual haziness: evolution and future directions of the engagement construct. In Handbook of Research on Student Engagement (eds Christenson, S. L. et al.) 3–19 (Springer, 2012).
- 61. Reinders, H. & Wattana, S. Affect and willingness to communicate in digital game-based learning. ReCALL 27 (1), 38–57 (2015).
- 62. Graham, S. & Santos, D. Language learning in the public eye: an analysis of newspapers and official documents in England. Innov. Lang. Learn. Teach. 9 (1), 72–85 (2015).
- 63. Goh, C. & Aryadoust, V. Examining the notion of listening subskill divisibility and its implications for second language listening. Int. J. Listening. 29 (3), 109–133 (2015).
- 64. Dörnyei, Z. & Ushioda, E. Teaching and Researching Motivation (Routledge, 2021).
- 65. Selfa-Sastre, M., Pifarre, M., Cujba, A., Cutillas, L. & Falguera, E. The role of digital technologies to promote collaborative creativity in language education. Front. Psychol. 13, 828981 (2022).
- 66. Ramírez Verdugo, D. & Alonso Belmonte, I. Using digital stories to improve listening comprehension with Spanish young learners of English. Lang. Learn. Technol. 11 (1), 87–101. 10.64152/10125/44090 (2007).
- 67. Demirbaş, İ. & Şahin, A. The effect of digital stories on primary school students’ listening comprehension skills. Participatory Educational Res. 9 (6), 380–397. 10.17275/per.22.144.9.6 (2022).
- 68. Shadish, W. R., Cook, T. D. & Campbell, D. T. Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Houghton Mifflin, 2002).
- 69. Braun, V. & Clarke, V. Using thematic analysis in psychology. Qualitative Res. Psychol. 3 (2), 77–101 (2006).
- 70. Polat, M. & Erişti, B. The effects of authentic video materials on foreign language listening skill development and foreign language listening anxiety at different levels of English proficiency. Int. J. Contemp. Educational Res. 6 (1), 135–154 (2019).
- 71. Mardiha, S. M., Alibakhshi, G., Mazloum, M. & Javaheri, R. Electronic flipped classrooms as a solution to educational problems caused by COVID-19: a case study of a research course in Iran higher education. Electron. J. e-Learning. 21 (1), 26–35. 10.34190/ejel.21.1.2440 (2023).
- 72. Kearsley, G. & Shneiderman, B. Engagement theory: A framework for technology-based teaching and learning. Educational Technol. 38 (5), 20–23 (1998).
- 73. Goh, C. C. M. & Vandergrift, L. Teaching and Learning Second Language Listening: Metacognition in Action (Routledge, 2012). 10.4324/9780203843376
