Abstract
Background
Augmented reality (AR) and mixed reality (MR) are increasingly being applied in the field of medical education. However, the effectiveness and acceptance of AR/MR in different teaching scenarios remain unclear. This meta-analysis aimed to examine the effectiveness of AR/MR in improving medical students' knowledge and skills training.
Methods
PubMed, Embase, Cochrane Library, PsycINFO, Education Source, Education Resources Information Center, CINAHL, Web of Science Core Collection, China National Knowledge Infrastructure, WanFang, WeiPu, and Chinese BioMedical Literature databases were searched. The search encompassed literature from the database establishment to December 2023. Studies examining the use of AR/MR in medical education and reporting outcomes related to knowledge learning or skill training were included. The data were analyzed using Stata version 15.1 software.
Results
Twenty-nine publications were included in this meta-analysis. Compared to traditional teaching methods, AR/MR-assisted teaching showed greater effectiveness in medical skills training, as indicated by the higher skill scores (WMD = 12.31, 95% CI: 4.12 to 20.50), reduced failure rate (RD = -0.08, 95% CI: -0.12 to -0.04), and shortened performance time (SMD = -0.20, 95% CI: -0.36 to -0.04). However, AR/MR did not significantly improve knowledge acquisition (WMD = 2.92, 95% CI: -1.73 to 7.57). The questionnaire survey revealed the advantages of AR/MR in terms of perceived usefulness (PU) (WMD = 0.27, 95% CI: 0.00 to 0.53), perceived ease of use (PEOU) (WMD = 0.35, 95% CI: 0.13 to 0.57), and enjoyment (WMD = 0.67, 95% CI: 0.20 to 1.13).
Conclusion
This meta-analysis highlights the effectiveness and user acceptance of AR/MR in medical education, particularly in skill training, although no significant improvement was observed in knowledge learning. The findings provide a foundational evidence base for the further expansion of AR/MR applications in medical education. However, it should be noted that these estimates may be influenced by heterogeneity and publication bias.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12909-025-08166-8.
Keywords: Augmented reality, Mixed reality, Medical education, Meta-analysis
Introduction
Augmented reality (AR), along with mixed reality (MR), is increasingly used in medical education and clinical practice. AR overlays digital information onto real-world environments, whereas MR blends the virtual and real worlds [1]. These technologies offer distinct advantages, such as reducing cognitive load, enhancing spatial understanding, and enabling interactive learning through dynamic visualization and feedback [2]. Because AR/MR-based medical devices have already been approved for medical practice [3], these technologies should be included in the curriculum not only for educational purposes but also for practical reasons.
In medical education, AR/MR applications fall into two broad categories: knowledge learning (e.g., anatomy, pathology) and skill training (e.g., surgery, physical examinations). For knowledge acquisition, AR/MR improves spatial reasoning and retention of complex structures [4], while operational training, such as surgical skills [5], physical examinations [6], interventional diagnosis and treatment, and emergency skills training [7], benefits from highly realistic, repeatable simulations with real-time feedback [8]. AR/MR greatly enriches teaching methods, enhances teaching effectiveness, and promotes the development of clinical thinking and skills in medical students [9]. Compared to traditional methods, AR/MR offers superior safety, engagement, and collaborative learning opportunities [10]. However, the existing published studies have contradictory results, with some suggesting a positive effect on learning outcomes [7] and others indicating no significant impact [11].
To our knowledge, apart from a study by Baashar et al. [12], no meta-analysis has been conducted to evaluate AR/MR in medical education. Moreover, the study by Baashar et al. included 13 studies published up to April 2021, mainly analyzed the impact of AR on knowledge learning performance, skill operation completion time, and satisfaction, and lacked a multidimensional assessment of AR effects and user acceptance. The present study advances beyond previous research by employing a Technology Acceptance Model (TAM)-based framework to comprehensively assess multidimensional indicators, including perceived usefulness (PU), perceived ease of use (PEOU), and user enjoyment, while systematically analyzing different performance metrics across both major AR/MR application scenarios. The evidence base has been substantially extended by including studies published up to December 2023, capturing the latest advancements in this rapidly evolving field. Notably, this is the first meta-analysis to quantitatively assess key training parameters, including failure rates and performance time. As a result, it provides more comprehensive and up-to-date evidence to inform the adoption of AR/MR educational technologies.
The results of this work have many implications for teachers, medical students, education professionals, and the users of AR/MR technology to promote learning and training in the medical field. This meta-analysis aims to address the following research questions (RQs):
RQ1: Does the use of AR/MR have an impact on medical education when compared to other instructional methods?
RQ2: For which medical education application scenarios is AR/MR suitable?
RQ3: What is the user experience of AR/MR?
Methods
Definitions
For the purposes of this meta-analysis, we defined AR as a “computer-generated holographic image overlaid into the real clinical environment, permitting the user to interact with the hologram and objects in the real environment in an integrated fashion” [13], and MR as “blending virtual objects and information onto a physical environment, interacting with it, and responding to it” [9].
Inclusion criteria
Based on the PICOS (Population, Intervention, Comparison, Outcomes, Study design) framework, the inclusion criteria were as follows:
Population(P): Individuals enrolled in undergraduate or graduate medical programs.
Intervention(I): The intervention group received education involving AR or MR (combining AR/MR with traditional teaching, or purely AR/MR-based teaching).
Comparison(C): traditional teaching methods, which may include classroom-based learning, no intervention, or other forms of digital and blended education.
Outcome(O) (at least one of these): (a) Skill assessment, incorporating: the skill score (SS), evaluated via standardized procedures and converted to a percentage format; failure rate (FR), documenting instances where tasks were not successfully completed; and performance time (PT), measuring the duration taken to finish a single task. (b) Knowledge assessment, incorporating: knowledge score (KS), assessed via a written examination designed to test the comprehension of pertinent theoretical knowledge. (c) Participants' experience was evaluated by a standardized self-report questionnaire survey (QS), which included items on PU, PEOU and enjoyment. Additional Notes: The KS and SS results were presented as scores out of 100 and for QS, the response options ranged from “very dissatisfied” (1 point) to “very satisfied” (5 points) on a five-point Likert scale.
Study design(S): Randomized controlled trials (RCTs).
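The percentage conversion described under Outcome (KS and SS presented as scores out of 100) could be sketched as follows. This is a minimal illustration, assuming a simple linear rescaling relative to the theoretical maximum of the instrument; the function names, the optional `scale_min` argument, and the 25-point example are hypothetical and are not taken from the included studies.

```python
def to_percentage(raw_score: float, scale_max: float, scale_min: float = 0.0) -> float:
    """Linearly rescale a raw score to a 0-100 percentage of the theoretical range."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return 100.0 * (raw_score - scale_min) / (scale_max - scale_min)


def rescale_sd(raw_sd: float, scale_max: float, scale_min: float = 0.0) -> float:
    """A standard deviation is multiplied by the same factor but is not shifted."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return 100.0 * raw_sd / (scale_max - scale_min)


# Hypothetical example: a group mean of 20 (SD 2.5) on a 25-point checklist.
mean_pct = to_percentage(20, 25)
sd_pct = rescale_sd(2.5, 25)
print(mean_pct, sd_pct)
```

Rescaling the mean and the standard deviation with the same factor keeps the group comparison unchanged while placing instruments with different maximum scores on a common 0 to 100 scale.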
Exclusion criteria
The exclusion criteria were as follows: (a) studies with unclear outcome measures or insufficient data description; (b) duplicate publications; (c) studies without access to the full text; and (d) studies with a high risk of bias.
Search strategies
We searched seven English-language online databases (MEDLINE, Embase (Elsevier), Cochrane Central Register of Controlled Trials, PsycINFO, Education Resources Information Center (Ovid), CINAHL (EBSCO), and Web of Science Core Collection (Thomson Reuters)) and four Chinese databases (China National Knowledge Infrastructure, WanFang, WeiPu, and Chinese BioMedical Literature). The search encompassed literature from the database establishment to December 2023, and the languages were limited to Chinese and English. The keywords used in the search strategy corresponded to Medical Subject Headings (MeSH) and included, but were not limited to: “Augmented Reality/Mixed Reality/Education/Learn*/Teach*/Train*/Health/Medicine/Randomized Controlled Trial”. Our study protocol was registered with PROSPERO (Code: CRD42024506938). Table S1 shows the literature search strategy for each database.
Data extraction
Two researchers conducted literature searches based on the inclusion and exclusion criteria. NoteExpress software was used for literature screening and data extraction. Disputed articles were discussed, and the final decision on inclusion was made by the corresponding author. The included literature was then cross-checked by two researchers to extract the following information: author, publication date, location, specialty, application scenarios, equipment, sample size, outcome measures, and intervention duration.
For studies that reported outcomes measured by multiple scales, we prioritized the primary outcome defined by the authors. If no primary outcome was specified, we calculated the average of standardized mean differences across all scales. For studies comparing multiple intervention groups, each group was treated as an independent comparison.
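The rule for studies with several scales and no declared primary outcome can be illustrated with a short sketch. It assumes that each per-scale standardized mean difference is Cohen's d with a pooled standard deviation; the source does not state which SMD variant was averaged, and the function names and numbers below are hypothetical.

```python
from math import sqrt
from statistics import mean


def cohen_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


def average_smd(scales):
    """Average the SMDs of several scales reported by the same study.

    Each item is (mean_I, sd_I, n_I, mean_C, sd_C, n_C) for one scale.
    """
    return mean(cohen_d(*scale) for scale in scales)


# Hypothetical study reporting two scales without naming a primary outcome.
study_scales = [
    (12.0, 2.0, 10, 10.0, 2.0, 10),  # scale A: d = 1.0
    (8.0, 4.0, 10, 6.0, 4.0, 10),    # scale B: d = 0.5
]
print(average_smd(study_scales))
```

Averaging on the standardized scale avoids mixing raw units, at the cost of treating the scales as equally informative.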
Quality assessment
Ruoxuan Zhang and Xiaoyan Jin independently evaluated the methodological quality of the included studies. Any discrepancies were resolved through discussion or, if necessary, by consultation with a third author, Ming Liu. The methodological quality of the RCTs was assessed using the Cochrane Collaboration’s Risk of Bias tool for randomized trials (ROB 2.0) [14]. Each of its five domains (randomization process, deviations from the intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result) was rated as low risk of bias, some concerns, or high risk of bias, and the domain-level ratings were combined into an overall study-level risk of bias.
Statistical analysis
Heterogeneity among the included studies was evaluated using the chi-square test and the I² statistic, with P < 0.1 and I² > 50% indicating significant heterogeneity. A random-effects model was employed for the meta-analysis. As part of data standardization, the original assessment scores, such as OSATS or Likert scales, were transformed into percentage values using linear normalization relative to their theoretical maximums. The pooled effects are presented as weighted or standardized mean differences (WMD/SMD) with 95% confidence intervals (CIs), with performance time specifically analyzed using the SMD. For failure rates, the effects are presented as risk differences (RDs) with 95% CIs. Subgroup analysis was performed based on the teaching scenarios. Sensitivity analysis was conducted by sequentially excluding each study to assess its influence on the pooled effects. Publication bias was evaluated using funnel plots and Egger’s test. The data were analyzed using Stata version 15.1 software. A significance level of P < 0.05 indicated statistical significance.
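The pooling, heterogeneity, and leave-one-out steps described above can be sketched in a few lines. The sketch assumes the DerSimonian-Laird estimator for the between-study variance; the source states only that a random-effects model was fitted in Stata and does not name the estimator. Function names and the three-study data set are hypothetical.

```python
from math import sqrt


def pool_random_effects(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0  # I-squared in percent
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


def leave_one_out(effects, variances):
    """Re-pool after removing each study in turn (sensitivity analysis)."""
    effects, variances = list(effects), list(variances)
    return [
        pool_random_effects(effects[:i] + effects[i + 1:],
                            variances[:i] + variances[i + 1:])["pooled"]
        for i in range(len(effects))
    ]


# Hypothetical mean differences and their variances for three studies.
result = pool_random_effects([2.0, 4.0, 6.0], [1.0, 1.0, 1.0])
loo = leave_one_out([2.0, 4.0, 6.0], [1.0, 1.0, 1.0])
print(result["pooled"], result["i2"], loo)
```

A pooled estimate that changes sign or loses significance when one study is dropped signals that the result depends on that study.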
Results
Search results
A total of 386 potentially relevant articles were identified, of which 128 duplicates were removed. At the screening stage, 202 articles were excluded after reading the titles and abstracts, among which 136 were irrelevant to the topic and 66 did not match the research type. According to the inclusion and exclusion criteria, 56 full-text articles were assessed for eligibility. Among these, 27 studies did not meet the inclusion criteria or had incomplete outcome indicators. Thus, a total of 29 articles were included in this meta-analysis. A summary of the process is outlined in Fig. 1.
Fig. 1.
Flow chart of literature screening
Study characteristics
Most of the 29 included studies were published in 2020 or later (n = 23, 79.31%), and the majority were conducted in Europe and North America (n = 19, 65.52%). The included studies covered 9 specialties, including anatomy, surgery, imaging, and emergency medicine. One study addressed both knowledge learning and skill training, giving 30 study applications in total: theoretical knowledge learning accounted for 53.33% (n = 16) and skill training for 46.67% (n = 14) (Table 1).
Table 1.
Basic characteristics of the included studies (n = 29)
| Article | Year | Location | Specialty | Application | Equipment(I/C) | Sample size(I/C) | Outcome measures | Intervention duration |
|---|---|---|---|---|---|---|---|---|
| Bogomolova et al. [15] | 2023 | Netherlands | Anatomy | KL | 3D holography/monoscopic 3D | 32/34 | KS,PU,PEOU,EN | 45 min |
| Cizmic et al. [16] | 2023 | Germany | Laparoscope | ST | iSurgeon/verbal guidance | 20/20 | PT,FR | / |
| D'Aiello et al. [17] | 2023 | Italy | Anatomy | KL | Microsoft HoloLens/regular slideware | 19/20 | KS | 30 min |
| Farshad et al. [18] | 2023 | Switzerland | Ultrasound | ST | Microsoft HoloLens/standard US | 22/22 | PT | / |
| Felinska et al. [6] | 2023 | Germany | Laparoscope | ST | iSurgeon/no AR telestration | 20/20 | FR, PT,PU | / |
| Guha et al. [19] | 2023 | UK | Surgery | ST | Microsoft HoloLens/video | 18/18 | SS,PEOU,EN | 20 min |
| Hayasaka et al. [20] | 2023 | Japan | Anesthesia | ST | Microsoft HoloLens/traditional | 10/10 | FR,PU,PEOU,EN | 10 min |
| Malik et al. [21] | 2023 | UK | Otolaryngology | KL | Microsoft HoloLens/traditional | 35/21 | KS,PU,PEOU,EN | 45 min |
| An et al. [22] | 2022 | Korea | Self-Regulated learning | KL | AR app/textbook | 31/31 | KS | 4 weeks |
| Hou et al. [23] | 2022 | China | Cardiopulmonary Resuscitation | ST | Microsoft HoloLens/instructor-assisted | 14/13 | FR | 10 min |
| Little et al. [24] | 2022 | USA | Anatomy | KL | IVALA®/textbook | 38/36 | KS,PU,PEOU | 60 min |
| Nagayo et al. [25] | 2022 | Japan | Surgery | ST | Microsoft HoloLens/video | 19/19 | SS,PU,PEOU | 10 min |
| Pickering et al. [26] | 2022 | UK,Greece | Anatomy | KL | MR/screencast | 62/84 | KS | 90 min |
| Veer et al. [11] | 2022 | Australia | Asthma | KL | Microsoft HoloLens/textbook | 33/34 | KS | 6 min |
| Yohannan et al. [27] | 2022 | India | Anatomy | KL | Air Anatomy/lecture | 49/45 | KS | 3 h |
| Chen et al. [28] | 2021 | | Health assessment and practice course | ST,KL | 3D holography/lecture | 40/39 | SS,KS | 18 weeks |
| Kamonwon et al. [29] | 2021 | Thailand | Ultrasound | KL | AR/no AR | 23/26 | KS | 4 weeks |
| Liu [30] | 2021 | China | Pulmonary lesions | ST | 3D holography/2D CT | 10/10 | SS | / |
| Weeks et al. [31] | 2021 | USA | Anatomy | KL | Microsoft HoloLens/2D screens | 15/15 | KS | 60 min |
| Bogomolova et al. [32] | 2020 | Netherlands | Anatomy | KL | stereoscopic 3D AR/monoscopic 3D | 20/22 | KS,PU,PEOU,EN | 45 min |
| Schoeb et al. [33] | 2020 | Germany | Surgery | ST | Microsoft HoloLens/instructor | 59/105 | SS,PU,PEOU,EN | 30 min |
| Stojanovska et al. [34] | 2020 | USA | Anatomy | KL | Microsoft HoloLens/Dissection | 31/33 | KS | 4 h |
| Wang et al. [35] | 2020 | New Zealand et al | Anatomy | KL | Microsoft HoloLens/3DM laptop program | 19/15 | KS,PU,PEOU,EN | 20 min |
| Noll et al. [36] | 2017 | Germany | Dermatology | KL | mARble/traditional | 22/22 | KS,PU,PEOU,EN | 45 min |
| Leitritz et al. [37] | 2014 | Germany | Ophthalmoscopy | ST | AR ophthalmoscopy/traditional | 19/18 | SS, FR,PU | 15 min |
| Vera et al. [38] | 2014 | USA | Laparoscope | ST | ART platform/traditional | 9/9 | FR, PT | 1 h |
| Albrecht et al. [39] | 2013 | Germany | Forensic medicine | KL | mARble/textbook | 4/4 | KS,PU,PEOU,EN | 30 min |
| Wilson et al. [40] | 2013 | USA | Surgery | ST | AR goggles/lecture | 13/21 | SS, FR | / |
| Yeo et al. [41] | 2011 | Canada | Surgery | ST | Perk Station training suite/traditional | 20/20 | DS | 8 time |
ST Skill training, KL Knowledge learning, KS Knowledge scores, SS Skill scores, FR Failure rate, PT Performance time, PU Perceived Usefulness, PEOU Perceived Ease of Use, EN Enjoyment
Study quality
According to the ROB 2.0 tool, 23 of the RCTs were of low risk (79.3%) and 6 of the RCTs were of some concerns (20.7%) (Table 2).
Table 2.
Sensitivity analysis and publication bias
Due to high heterogeneity, a sensitivity analysis was conducted to assess the reliability of the results. In the analysis of performance time, when the study by Yeo et al. [41], Felinska et al. [6], or Cizmic et al. [16] was excluded, the difference between the AR/MR and control groups was no longer significant, in contrast to the primary pooled effect. Apart from these studies, no single study significantly influenced the overall pooled effects, indicating the stability of our results (Figure S1). Aside from the failure rate analysis (P = 0.017), no analyses revealed significant publication bias based on Egger's test and funnel plots (all P > 0.05) (Table S2, Figure S2).
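For readers unfamiliar with Egger's test, the following sketch shows the underlying regression: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel-plot asymmetry. This is an illustrative ordinary least squares version with hypothetical data and a hypothetical function name; it returns the t statistic for the intercept (with k − 2 degrees of freedom) rather than a P value, and it is not a reproduction of the Stata routine used in the analysis.

```python
from math import sqrt


def egger_regression(effects, std_errors):
    """Regress standardized effects on precision; return intercept, its SE, and t."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are required")
    x = [1.0 / se for se in std_errors]                # precision
    z = [y / se for y, se in zip(effects, std_errors)]  # standardized effect
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must differ between studies")
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    resid_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    sigma2 = resid_ss / (k - 2)                         # residual variance
    se_intercept = sqrt(sigma2 * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("inf")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat}


# Hypothetical effects and standard errors for three studies.
egger = egger_regression([1.0, 1.5, 2.0 / 3.0], [1.0, 0.5, 1.0 / 3.0])
print(egger["intercept"], egger["t"])
```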
Effects of AR/MR on skill training
A total of 8 publications involving 454 participants (AR/MR group = 205 and control group = 249) reported skill scores. Because high heterogeneity was observed across these studies (I² = 95.8%, P < 0.001), a random-effects model was used. The pooled effects showed a significant difference in skill scores (WMD = 12.31, 95% CI: 4.12 to 20.50, P = 0.003) in favor of the AR/MR group compared to the control group (Fig. 2). Subgroup analysis showed reduced heterogeneity in the invasive procedure subgroup (I² = 5.9%, P = 0.373), with effects consistent with the pooled analysis results (Figure S3). Meta-regression showed that the type of technology (AR vs. MR) was not significantly associated with the skill scores (coefficient = −8.661, SE = 9.043, t = −0.96, P = 0.375).
Fig. 2.
Forest plot for the effects of AR/MR on skill scores
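As a quick consistency check on the pooled skill-score estimate, the reported P value can be approximately recovered from the effect and its 95% CI. The sketch below assumes a normal (z) approximation and a symmetric interval; the helper name is hypothetical, and this is not necessarily how the statistical software derived the P value.

```python
from math import erfc, sqrt


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Back-calculate the standard error, z statistic, and two-sided P value."""
    se = (ci_high - ci_low) / (2.0 * z_crit)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided P under the standard normal
    return se, z, p


# Pooled skill-score effect reported in the text: WMD = 12.31, 95% CI 4.12 to 20.50.
se, z, p = p_from_ci(12.31, 4.12, 20.50)
print(f"SE = {se:.2f}, z = {z:.2f}, P = {p:.3f}")
```

The back-calculated value rounds to 0.003, matching the P value reported for the pooled skill scores.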
A total of 9 studies reported failure rates, with 3,690 failures in the AR/MR group and 4,487 in the control group. Due to high heterogeneity in the meta-analysis (I² = 67.4%, P = 0.002), a random-effects model was used. Egger’s test indicated significant publication bias in the failure rate analysis (P = 0.028). Within the scope of currently published studies, AR/MR showed a trend toward reducing failure rates (RD = −0.08, 95% CI: −0.12 to −0.04, P < 0.001). However, this estimate may be influenced by publication bias (Fig. 3). Subgroup analysis showed reduced heterogeneity in the invasive procedure subgroup (I² = 18.1%, P = 0.296), with effects consistent with the pooled analysis results (Figure S4).
Fig. 3.
Forest plot for the effects of AR/MR on the failure rate
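The risk difference used for failure rates is simply the failure proportion in the AR/MR group minus that in the control group. A minimal per-study sketch follows, assuming the usual large-sample (Wald) standard error; the function name and the counts are hypothetical and do not come from any included study.

```python
from math import sqrt


def risk_difference(events_i, n_i, events_c, n_c, z_crit=1.96):
    """Risk difference (intervention minus control) with a Wald 95% CI."""
    p_i = events_i / n_i
    p_c = events_c / n_c
    rd = p_i - p_c
    se = sqrt(p_i * (1 - p_i) / n_i + p_c * (1 - p_c) / n_c)
    return rd, rd - z_crit * se, rd + z_crit * se


# Hypothetical study: 10 failures in 100 AR/MR attempts vs. 20 in 100 control attempts.
rd, low, high = risk_difference(10, 100, 20, 100)
print(f"RD = {rd:.2f} (95% CI: {low:.3f} to {high:.3f})")
```

A negative risk difference means fewer failures with AR/MR; a confidence interval that excludes zero corresponds to a significant difference at the 0.05 level.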
A total of 5 publications involving 582 participants (AR/MR group = 291 and control group = 291) reported performance time. Heterogeneity analysis revealed an I² of 0.0% and a P value of 0.438, indicating no significant heterogeneity. A random-effects model was nevertheless used for the meta-analysis to ensure the reliability of the results. The results showed a significant reduction in performance time in the AR/MR group compared to the control group (SMD = −0.20, 95% CI: −0.36 to −0.04, P = 0.017) (Fig. 4).
Fig. 4.
Forest plot for the effects of AR/MR on performance time
Effects of AR/MR on knowledge learning
A total of 14 publications involving 826 participants (AR/MR group = 406 and control group = 420) reported the use of AR/MR for knowledge learning. Due to high heterogeneity in the meta-analysis (I² = 87.7%, P < 0.001), a random-effects model was used. The results showed no significant difference in knowledge test scores between the AR/MR group and the control group (WMD = 2.92, 95% CI: −1.73 to 7.57, P = 0.218) (Fig. 5). The meta-regression results showed that the type of technology (AR vs. MR) was not significantly associated with the knowledge scores (coefficient = −1.342, SE = 4.640, t = −0.29, P = 0.777). In contrast, live teacher guidance was significantly associated with the knowledge scores (coefficient = 9.380, SE = 3.771, t = 2.49, P = 0.029) (Figure S5). Subgroup analysis by live teacher guidance revealed a significant improvement in knowledge scores in the AR/MR group when live teacher guidance was provided (WMD = 7.15, 95% CI: 0.80 to 13.50, P = 0.027) (Fig. 6).
Fig. 5.
Forest plot for the effects of AR/MR on knowledge scores
Fig. 6.
Forest plot for the subgroup effects of AR/MR on knowledge scores
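The meta-regression on a binary moderator such as live teacher guidance can be pictured as a weighted regression of study effects on a 0/1 indicator. The sketch below is a simplified illustration: it takes the between-study variance `tau2` as a given input and uses the model-based standard error, whereas the estimation options actually used in Stata are not stated in the source. All names and data are hypothetical.

```python
from math import sqrt


def meta_regression(effects, variances, moderator, tau2=0.0):
    """Weighted least squares of effects on one moderator (weights 1/(v + tau2))."""
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    if sxx == 0:
        raise ValueError("the moderator must vary across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, moderator, effects))
    coefficient = sxy / sxx
    se = sqrt(1.0 / sxx)  # model-based standard error of the coefficient
    return {"intercept": y_bar - coefficient * x_bar,
            "coefficient": coefficient, "se": se, "t": coefficient / se}


# Hypothetical data: two studies without (0) and two with (1) live teacher guidance.
fit = meta_regression([1.0, 3.0, 5.0, 7.0], [1.0, 1.0, 1.0, 1.0], [0, 0, 1, 1])
print(fit)
```

In any such model the test statistic is the coefficient divided by its standard error, which agrees with the values reported above (for example, 9.380 / 3.771 ≈ 2.49).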
Effects of AR/MR on student experience indicators
A total of 12 publications reported 605 student evaluation questionnaires on PU (AR/MR group = 281 and control group = 324). Due to high heterogeneity in the meta-analysis (I² = 77.4%, P < 0.001), a random-effects model was used. The pooled effects showed that the score of the AR/MR group was significantly greater than that of the control group (WMD = 0.27, 95% CI: 0.00–0.53, P = 0.048) (Figure S6). To understand the PU of AR/MR in knowledge learning and skill training activities, a subgroup analysis was conducted. In the skill training subgroup, which included 5 studies, the AR/MR scores were significantly greater than those of the control group (WMD = 0.30, 95% CI: 0.11–0.49, P = 0.002). Conversely, in the knowledge learning subgroup, which included 7 studies, no significant difference was observed between the AR/MR scores and those of the control group (Fig. 7).
Fig. 7.
Effects of AR/MR on student experience indicators
A total of 11 publications reported 568 student evaluation questionnaires on PEOU (AR/MR group = 262 and control group = 306). Heterogeneity analysis revealed high heterogeneity (I² = 66.5%, P = 0.001), and a random-effects model was used for the meta-analysis. The pooled effects showed that the AR/MR score was significantly greater than that of the control group (WMD = 0.35, 95% CI: 0.13–0.57, P = 0.002) (Figure S7). Subgroup analysis was conducted for the knowledge learning group (7 studies) and the skill training group (4 studies). In both the knowledge learning (WMD = 0.40, 95% CI: 0.07–0.73, P = 0.018) and skill training (WMD = 0.32, 95% CI: 0.09–0.55, P = 0.007) subgroups, the PEOU scores of the AR/MR group were significantly greater than those of the control group (Fig. 7).
A total of 9 publications reported 456 student evaluations of enjoyment (AR/MR group = 205 and control group = 251). Heterogeneity analysis revealed high heterogeneity (I² = 90.2%, P < 0.001), and a random-effects model was used. The pooled effects showed that the score of the AR/MR group was significantly greater than that of the control group (WMD = 0.67, 95% CI: 0.20–1.13, P = 0.005) (Figure S8). The subgroup analysis included 6 studies on knowledge learning and 3 on skill training. In both the knowledge learning (WMD = 0.77, 95% CI: 0.10–1.44, P = 0.024) and skill training (WMD = 0.52, 95% CI: 0.23–0.81, P < 0.001) subgroups, the enjoyment scores of the AR/MR group were significantly greater than those of the control group (Fig. 7).
Discussion
The current meta-analysis encompasses 29 studies, 23 of which were conducted between 2020 and 2023, highlighting the rapid expansion of AR/MR applications in medical education with technological advancements. This meta-analysis indicated that employing AR/MR in skill training significantly reduces both failure rates and performance time, while concurrently yielding higher skill training scores. Notably, no significant difference was detected in the knowledge learning scores. Participants find AR/MR more useful, easier to use, and more enjoyable than traditional training methods, especially for skill training.
Information processing theory divides knowledge into declarative and procedural knowledge [42]. AR/MR aids declarative knowledge (e.g., anatomy, pathology) via visualizations [43] and enhances procedural skills through immersive simulations, offering real-time operational practice [44].
AR/MR enhanced skill training but had limited impact on knowledge learning
While AR/MR excels in skill training within medical education, as evidenced by reduced failure rates and performance times, its impact on theoretical knowledge learning appears less substantial. This meta-analysis revealed no significant advantage for AR/MR over traditional methods in enhancing knowledge comprehension, which is congruent with the findings of Moro et al. [45] and contrasts with the conclusion of Paloma et al. [8].
AR/MR's potential to reduce cognitive load during information processing [46] suggests benefits for learners [27], yet its reliance on visualization and interaction may inadequately address deeper cognitive processes such as understanding and memory, which are crucial for knowledge acquisition. AR/MR content design must be closely aligned with learning strategies. Simply converting traditional textbooks into 3D models without fully leveraging the interactive and immersive features of AR/MR yields learning outcomes no better than conventional methods. Real-time guidance, explanations, and feedback from instructors can compensate for current technological limitations in content integration and cognitive depth support, thereby facilitating the mastery of declarative knowledge. This is consistent with our analysis, which identified "teacher assistance" as a significant moderating factor. Therefore, future AR/MR educational interventions should emphasize human–machine collaboration, strengthen the role of teachers in knowledge structuring, and carefully manage cognitive load to prevent information overload and keep users focused on learning objectives.
All 14 studies in this meta-analysis that focused on knowledge learning assessed learning outcomes using traditional evaluation methods emphasizing memorization and recall. Although AR/MR technology offers unique advantages in cultivating students' spatial thinking, problem-solving, and collaborative abilities, these competencies may not be adequately quantified through conventional examinations or tests [47]. There is a need for more holistic evaluation methods, such as 3D spatial tests (e.g., molecular structures) and performance-based tasks (e.g., virtual experiments), to assess deep understanding and real-world application. Combining these methods offers a more authentic evaluation of technology-enhanced learning, enhancing the validity of educational assessments.
In contrast, AR/MR technology significantly improved skill training outcomes and reduced failure rates and performance times, consistent with past meta-analyses [12, 13]. It enhances skill acquisition by offering standardized demos and key explanations, facilitating mental model formation and skill mimicry [6]. The real-time simulation of AR/MR in medical training provides authentic intrinsic feedback, which is augmented by extrinsic feedback mechanisms, accelerating skill development and refinement [16, 23, 48]. AR/MR simplifies operational orientation and task execution, reduces cognitive load, and thus accelerates skill formation [30, 49].
We evaluated the impact of AR/MR in medical education at the four levels of the Kirkpatrick model [50]: increased student satisfaction (reaction), enhanced skill performance (learning), reduced performance time (behavior), and reduced failure rate (result). In knowledge learning, although students reacted favorably (reaction), there was no marked improvement in learning outcomes (learning), and assessments of behavior and results were lacking.
This emphasizes the necessity of refined, multilevel evaluation approaches that cover spatial abilities and practical skills. Innovative biometric sensors are expected to enable accurate quantification of such outcomes [6]. Future research should adopt standardized methodologies to address evaluation gaps and optimize the implementation strategy of AR/MR for both knowledge and skill development, particularly in light of its notable potential for skill training.
Acceptance of AR/MR in skill training is better than in knowledge learning
The meta-analysis shows that medical students have varying levels of acceptance of AR/MR in skill training and knowledge learning, which is related to the effectiveness demonstrated by the technology. User acceptance is the key to technological implementation, and Fred D. Davis's (1986) Technology Acceptance Model (TAM) is an important evaluation model. The TAM centers on perceived usefulness (PU) and perceived ease of use (PEOU) as determinants [51]. PU measures the perceived improvement in work performance, while PEOU reflects the perceived ease of use [51]. These factors are instrumental in forecasting the adoption of emerging technologies. However, to our knowledge, no meta-analysis has yet been conducted to evaluate the user acceptance of AR/MR in medical education in terms of PU, PEOU, and enjoyment.
In both knowledge learning and skill training, AR/MR technology has been widely valued for its user-friendly interface and engaging experience. Student feedback indicates high ratings for PEOU and enjoyment. Consistent with the impact of AR/MR on learning outcomes, its PU remains moderate in knowledge learning. However, in skill training, learners show stronger recognition of the effectiveness of AR/MR in enhancing training outcomes, aligning with the findings of a previous meta-analysis that highlighted AR’s efficacy in enhancing learner satisfaction and proficiency in medical training exercises [12]. A large proportion of heterogeneity in user experience metrics remains unexplained, and this variability may be attributed to factors that were not captured, such as qualitative differences in instructional design, software usability, and learners’ prior experience.
To effectively implement AR/MR in medical education, it is important to carefully design the course curriculum. Courses should be tailored to the characteristics of AR/MR so that they fully exploit the unique value and potential of the technology in medical education. Implementing assessment models that accurately reflect the unique advantages of AR/MR can facilitate a clearer understanding of learning outcomes among students, thereby increasing their acceptance of AR/MR-integrated curricula. By addressing these considerations, medical schools can optimize the integration of AR/MR technology, create a more effective and engaging learning environment, and enhance both the theoretical comprehension and practical skill development of future healthcare practitioners.
Future research should focus on integrating AR/MR with VR and AI technologies to develop hybrid immersive learning environments and assess their long-term educational outcomes. Moreover, comparative studies in diverse teaching scenarios are needed to generalize findings and refine implementation strategies.
Limitations
The main limitation of this study is the high heterogeneity among the included studies. There is significant heterogeneity in terms of research methods, outcome measures, data collection, and reporting due to the different research topics covered by the included articles. This is not surprising because there is considerable variation in medical curricula and educational interventions are typically designed based on specific teaching content and targeted populations to meet specific educational needs. High heterogeneity has also been observed in meta-analyses of other medical and health sciences education-related interventions. The high heterogeneity observed across studies highlights that the effectiveness of AR/MR interventions is highly context-dependent. Differences in technological implementations (e.g., display types, interaction modalities), instructional designs, and assessment protocols can significantly impact learning outcomes. This underscores the importance of interpreting our findings within specific educational and technological contexts. Additionally, there is a pressing need for future research to develop a standardized evaluation framework for augmented reality (AR) and mixed reality (MR) technologies in medical education. It is also important to note that our findings should be interpreted with caution due to evidence of publication bias, which may lead to an overestimation of AR/MR's true effect on reducing failure rates.
Conclusion
This meta-analysis demonstrated that AR/MR has been applied in medical education in Europe, America, and other regions. AR/MR improves medical skill performance in terms of accuracy and speed, particularly for spatial-visual tasks such as surgery, but shows no significant impact on knowledge acquisition. Compared with traditional methods, the technology is rated higher for perceived ease of use (PEOU) and enjoyment, although improved assessment tools and further validation are needed. These estimates may be influenced by heterogeneity and publication bias.
Supplementary Information
Acknowledgements
Not applicable.
Abbreviations
- AR: Augmented reality
- MR: Mixed reality
- PEOU: Perceived ease of use
- PU: Perceived usefulness
- RD: Risk difference
- SMD: Standardized mean difference
- WMD: Weighted mean difference
- TAM: Technology acceptance model
Authors’ contributions
R.Z.: Conceptualization, Data curation, Formal analysis, Methodology, Software, Writing – original draft, Writing – review & editing. X.J.: Conceptualization, Data curation, Project administration, Writing – review & editing. M.L.: Conceptualization, Methodology, Project administration, Supervision, Writing – review & editing. H.T.: Methodology, Writing – review & editing. All authors reviewed the manuscript.
Funding
This study was funded by a Macao Polytechnic University research project (Grant number RP/AE-04/2022).
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Declarations
Ethics approval and consent to participate
Not applicable. (This manuscript is a meta-analysis, and does not report on or involve the use of any animal or human data or tissue.)
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Ruoxuan Zhang and Xiaoyan Jin contributed equally to this work.