BMC Medical Education. 2026 Mar 6;26:601. doi: 10.1186/s12909-026-08948-8

Reforming hospital management education with the P-MASE pedagogy: a randomized controlled trial

Yongshuo Zhang 1,#, Wenbo Li 2,3,#, Chuchen Zhang 4, Yawei Wang 5, Shuo Feng 2
PMCID: PMC13078037  PMID: 41787402

Abstract

Background

To address persistent gaps in hospital management education—including outdated theoretical instruction, insufficient competency assessment, and limited interdisciplinary training—we developed and evaluated the P-MASE pedagogy (Problem-based, Mobile, Authentic, Social, Experiential) to narrow the knowing–doing gap.

Methods

In a prospective, parallel-group randomized controlled trial, 120 undergraduates from a medical university were allocated 1:1 to the P-MASE pedagogy or traditional instruction for a 16-week teaching period followed by a 4-week evaluation. The primary outcome was the closed-book examination score. Secondary outcomes included skills (Diagnosis-Related Groups (DRG) grouping accuracy, empathy in crisis statements, and departmental budgeting satisfaction); learning-process indicators (weekly platform logins, valid discussion posts, and the modeled Emergency Department (ED) waiting-time reduction rate); and teaching satisfaction. A combined evaluation framework integrating data-driven metrics, simulation modeling, and Natural Language Processing (NLP) was applied. Outcome assessors were blinded to group allocation. A two-sided α of 0.05 was used.

Results

In this educational RCT, the P-MASE group had higher closed-book examination scores than the control group (88.3 ± 3.2 vs. 73.6 ± 5.8, P < 0.001). The P-MASE group also demonstrated better performance on standardized simulation/task-based assessments and showed higher learning engagement and teaching satisfaction than the control group (all P < 0.05).

Conclusions

P-MASE was associated with improved short-term educational outcomes in this course-based setting. Findings should be interpreted within the educational context of this course.

Trial registration

Chinese Clinical Trial Registry (ChiCTR), ChiCTR2400082568. Registered on 01 April 2024.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12909-026-08948-8.

Keywords: Hospital management education, Health professions education, Competency-based education, Simulation-based learning, Performance-based assessment, Active learning, Interprofessional education

Background

Hospital management education is a specialized strand of health professions education that focuses on system-level competencies (e.g., policy interpretation, resource allocation, quality governance, and institutional crisis communication). It is conceptually distinct from general medical education, which primarily targets patient-level clinical competence (e.g., diagnosis, treatment, and bedside communication) and typically evaluates outcomes through clinical knowledge and skills related to direct care. By contrast, hospital management education centers on organizational and system performance reasoning—how decisions are made under institutional constraints, across departments, and in response to policy and operational signals—therefore requiring different learning tasks and assessment evidence. As such, educational outcomes are often evaluated through performance-based assessments conducted in simulated or practice-proximal learning contexts, rather than interpreted as direct evidence of real-world hospital operational improvement. Against this educational framing, we consider how current hospital management teaching can better cultivate applied reasoning and decision-making competencies.

Payment reform and tertiary hospital performance evaluation are pushing hospital management from experience-based practice toward data- and process-driven operations [1]. Yet current teaching faces three structural gaps: first, curricula overemphasize legacy theories and underrepresent emerging competencies such as Diagnosis-Related Groups (DRG) / Diagnosis-Intervention Packet (DIP) [2–4]; second, traditional written exams fail to measure complex, context-dependent decision-making and communication [5, 6]; third, systematic interdisciplinary training is limited, weakening graduates’ coordination across information, clinical, and administrative interfaces [7]. Systematic evidence indicates that competency-based assessment and interdisciplinary training remain insufficient in medical and health management education [8, 9].

To address these gaps, we introduce the P‑MASE pedagogy, which combines problem orientation, mobile learning, authentic contexts, social collaboration, and experiential learning [10, 11]. In this manuscript, “P-MASE” is used as a descriptive acronym for an integrated pedagogy that combines five well-established educational approaches, rather than as an established indexing term or a single proprietary method.

Conceptually, P-MASE differs from many existing hospital management teaching approaches that adopt single or loosely coupled instructional strategies (e.g., lecture-based delivery, case discussions, isolated simulations, or stand-alone projects) or apply generic competency-based education without an explicit linkage between competencies, task sequences, and performance evidence. P-MASE is positioned as an integrated pedagogical package: it organizes job-relevant management competencies into authentic, cross-department task chains; supports learning through mobile/digital tools and collaboration; and aligns assessment with management performance evidence collected across tasks. It emphasizes real operational data, cross‑department tasks, and digital tools to translate policy rules into executable decisions [12]. Unlike single‑tool or single‑skill approaches, P‑MASE embeds the development of knowledge, skills, and attitudes within job‑relevant task chains and uses a combined evaluation framework (data‑driven metrics, simulation, and Natural Language Processing (NLP)) to yield measurable and reusable competency assessments [13]. In other words, the novelty of P-MASE lies less in introducing a new single teaching technique and more in integrating “what to learn” (hospital management competencies), “how to learn” (problem-oriented, authentic, collaborative, mobile-supported experiences), and “how to judge performance” (multi-source, task-embedded evidence including NLP) into one coherent design logic tailored to hospital management education.

Importantly, the management-derived indicators used in this study (e.g., DRG grouping accuracy, budgeting/department satisfaction, and ED process optimization) are operationalized within standardized simulations or practice-proximal learning tasks and are interpreted as educational performance measures rather than real-world hospital operational outcomes. In addition to domain knowledge, the pedagogy targets higher-order cognitive outcomes, including systems thinking (recognizing interdependencies across processes and departments) and trade-off reasoning (prioritizing among competing objectives under constraints), which are operationalized through the course tasks and performance-based assessments.

This study asks whether, compared with traditional instruction, the P‑MASE pedagogy produces superior outcomes in key competencies for undergraduate hospital management students—specifically knowledge acquisition, data-driven decision making, and simulation-based crisis communication—while also improving learning engagement and teaching satisfaction. We pre-specified one primary and several secondary hypotheses. Primary hypothesis: P‑MASE increases closed-book examination scores (the primary outcome). Secondary hypotheses: P‑MASE reduces DRG grouping errors; increases empathy in crisis statements; increases departmental budgeting satisfaction; enhances learning-process indicators (platform logins, valid discussion posts, and process-optimization performance); and raises teaching satisfaction and perceived job competence.

Methods

Study design and ethical approval

This study was a prospective, randomized controlled educational trial conducted in accordance with the CONSORT-Edu guidelines to ensure methodological transparency and reproducibility. It evaluated P-MASE as an integrated, multicomponent intervention delivered as a single package; the effects of individual components were not tested separately. The sample consisted of 120 undergraduate students majoring in Health Administration and Information Resource Management at a medical university’s School of Management. All participants were drawn from the same academic cohort (i.e., the same year/batch) and undertook the course during the same semester (September 2024 to January 2025). All participants provided written informed consent before enrollment. The study was approved by the institutional ethics committee (approval number XYFY2023-KL146-01), and the course reform and its implementation within the undergraduate curriculum were reviewed and approved through the university’s (or school-level) teaching/curriculum governance process (e.g., the School of Management Teaching Committee) before study initiation. The study was prospectively registered in the Chinese Clinical Trial Registry (ChiCTR2400082568).

Based on pilot data (mean difference in exam score = 8 points, SD ≈ 6), we calculated that at α = 0.05 and power = 0.80, at least 52 participants were required per group. Allowing for a 10% dropout rate, we recruited 60 students per group.

Given formative pilot work suggesting potential promise of the integrated P-MASE package, we considered alternative designs, including a cross-over design. However, the pilot was conducted primarily to refine feasibility and implementation and was not designed or powered to establish definitive superiority; educational equipoise therefore remained regarding the magnitude and generalizability of benefit, supporting a parallel-group randomized design. A cross-over design was deemed inappropriate because educational exposure is not readily reversible, substantial carry-over would be expected across periods, and interaction within a single cohort would increase contamination risk. To address ethical considerations, the control group received the institution’s standard curriculum covering the same course content, and after completion of outcome assessments, control participants were offered access to key P-MASE learning resources (e.g., selected modules and case materials) where feasible.

Participant allocation and intervention procedures

Participants were stratified by major and randomly assigned in a 1:1 ratio to either the experimental group (P-MASE pedagogy, n = 60) or the control group (traditional instruction, n = 60) using computer-generated permuted blocks. Allocation concealment was achieved through sequentially numbered, opaque, sealed envelopes prepared by an independent administrator. The intervention lasted for 16 weeks, followed by a 4-week evaluation phase, spanning a total of 5 months (September 2024 to January 2025). To minimize expectancy effects, students were informed that two teaching formats were being compared as part of routine course delivery, but they were not informed of the study hypothesis or any pilot findings suggesting superiority of either format.
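For readers implementing a similar design, the allocation logic can be sketched as follows. This is a minimal illustration, not the administrator's actual script: the block size and per-stratum counts are assumptions, since the trial specifies only 1:1 permuted blocks stratified by major.

```python
import random

rng = random.Random(2024)  # fixed seed for a reproducible sequence

def permuted_blocks(n, block_size=4):
    """Return a 1:1 allocation sequence of length n built from shuffled blocks."""
    seq = []
    while len(seq) < n:
        block = ["P-MASE", "Control"] * (block_size // 2)
        rng.shuffle(block)  # randomize order within each block
        seq.extend(block)
    return seq[:n]

# One independent sequence per stratum (major); 60 per stratum is assumed here.
allocation = {
    "Health Administration": permuted_blocks(60),
    "Information Resource Management": permuted_blocks(60),
}
```

Permuted blocks guarantee near-exact 1:1 balance at any interim point within each stratum, which matters in a fixed-cohort course where recruitment stops at 120.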

Course context and learning objectives

Within this trial, the intervention was embedded within a required undergraduate hospital management course delivered to students majoring in Health Administration and Information Resource Management at the School of Management. The course spanned 16 weeks of instruction followed by a 4-week evaluation period (September 2024 to January 2025). The broad course objectives were to enable students to: (1) understand and apply core hospital management concepts and policy/regulatory content; (2) demonstrate data-informed decision-making using hospital management tools (e.g., DRG/DIP-related classification and interpretation tasks); (3) reason through system-level trade-offs and resource allocation under constraints (e.g., budgeting and departmental satisfaction scenarios); (4) practice crisis communication strategies emphasizing empathic engagement and accountability; and (5) develop collaborative problem-solving and self-directed learning behaviors in team-based tasks.

Outside of the intervention, students followed the standard curriculum of the program; because randomization occurred within the same cohort, exposure to other learning activities beyond this course was expected to be comparable between groups.

Teaching interventions

P-MASE teaching model

Problem-Based Learning (PBL): Semi-structured interviews at eight tertiary hospitals identified six categories of authentic management problems (e.g., DRG cost overruns, Intensive Care Unit (ICU) infection rates). The process included four stages: problem deconstruction, policy-guided knowledge retrieval, multidimensional solution design, and simulation validation (AnyLogic modeling).

Mobile Learning (ML): Delivered on the SuperStar platform, the curriculum integrated theory, skills, and policy engagement through three pathways: knowledge internalization, data tool application, and dynamic policy synchronization. Dedicated course modules included a Policy Intelligence Hub and a Case Discussion Forum. Each week, students solved real management challenges using a “problem identification → policy alignment → evidence-based solutions” approach.

Authentic Learning (AL): An innovative departmental rotation program immersed students in Operations, Quality Control, and Outpatient Services, building data literacy and system optimization skills. For example, Power BI was used to analyze surgical efficiency, and a three-dimensional framework (length of stay (LOS), complication rate, DRG weight) was used to identify management improvement opportunities.

Social Learning (SL): Five-member interdisciplinary teaching teams (management, clinical, informatics) guided students in needs analysis, digital modeling, and change management. Students optimized ED workflows, applied data-driven staffing solutions, and balanced compliance, cost, and ethics via online forums.

Experiential Learning (EL): Students completed standardized modules on resource allocation, crisis communication, and process optimization using a virtual simulation platform integrating authentic hospital management scenarios; decision-support tools (e.g., AHP) and NLP-assisted text analysis were used to generate performance-based educational assessment indicators. Table 1 provides a concise summary of the five P-MASE components, their objectives, core learning activities, and links to assessment evidence. An overview of this virtual simulation training platform is presented in Fig. 1, which illustrates how hospital management theories and digital technologies are combined to provide a realistic environment for decision-making, resource allocation, and crisis management. This trial evaluated P-MASE as an integrated, multi-component pedagogy; it was not designed to isolate the independent effects of individual components.
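The study's process simulations were built with commercial tools (AnyLogic; Simio). As a rough, illustrative analogue of the kind of discrete-event model students worked with, the sketch below simulates an ED triage queue in Python with simpy; the arrival rate, service time, and staffing level are assumptions, not the study's parameters.

```python
import random
import statistics

import simpy

random.seed(1)
waits = []  # minutes each simulated patient spends queueing for triage

def patient(env, triage):
    arrived = env.now
    with triage.request() as req:
        yield req                                      # wait for a free nurse
        waits.append(env.now - arrived)
        yield env.timeout(random.expovariate(1 / 12))  # ~12-min mean service

def arrivals(env, triage, mean_gap):
    while True:
        yield env.timeout(random.expovariate(1 / mean_gap))
        env.process(patient(env, triage))

env = simpy.Environment()
triage = simpy.Resource(env, capacity=2)         # two triage nurses (assumed)
env.process(arrivals(env, triage, mean_gap=7))   # ~7-min mean inter-arrival
env.run(until=8 * 60)                            # one 8-hour shift, in minutes
print(f"mean wait: {statistics.mean(waits):.1f} min over {len(waits)} patients")
```

Re-running the model with capacity=3 lets a learner quantify the waiting-time reduction from adding a triage nurse, mirroring the workflow-optimization tasks described above.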

Table 1.

Summary of the five P-MASE components: objectives, learning activities, and assessment links

P-MASE component Objective (competency focus) Core learning activities / tools Assessment evidence (links)
Problem-Based Learning (PBL) Applied reasoning for authentic management problems; policy-guided problem solving Semi-structured interview–derived problem sets; 4-stage workflow (deconstruction → policy retrieval → solution design → simulation validation); AnyLogic modeling Performance task outputs (solutions/model results); rubric-based scoring of reasoning quality; problem-solution alignment
Mobile Learning (ML) Continuous policy/knowledge updating; data tool application; self-regulated learning SuperStar modules (Policy Intelligence Hub; Case Discussion Forum); weekly “identify → policy align → evidence-based solution” tasks LMS log indicators (logins; valid discussion posts); completion of weekly tasks; knowledge checks (if applicable)
Authentic Learning (AL) Practice-proximal data literacy and system optimization in real contexts Departmental rotation (Operations / Quality Control / Outpatient); Power BI analysis; LOS/complication/DRG-weight framework Deliverables from authentic tasks (dashboards/analyses); performance-based scoring of optimization proposals
Social Learning (SL) Interdisciplinary coordination and communication across clinical–admin–informatics interfaces Interdisciplinary teaching teams; collaborative needs analysis; digital modeling; change management; online forums Team artefacts (workflow redesign; staffing); peer/team feedback; participation quality (valid posts)
Experiential Learning (EL) Decision-making under constraints; crisis communication; process optimization Virtual simulation modules (resource allocation; crisis communication; process optimization); decision tools (e.g., AHP); NLP-assisted text analysis Simulation performance metrics (e.g., DRG grouping accuracy / process indicators); NLP-derived text indicators (e.g., empathy/communication features); integrated competency scoring framework

P-MASE is a descriptive acronym for a bundled, five-component pedagogy (PBL, ML, AL, SL, EL); the study was not designed to estimate component-specific effects. Assessment indicators were derived from standardized simulations or practice-proximal tasks (including LMS logs and NLP outputs) and represent educational performance rather than real-world hospital outcomes.

Fig. 1.


This virtual simulation training platform integrates hospital management theories with digital technologies, providing a realistic virtual environment for decision-making, resource allocation, and crisis management

Control group

The control group received traditional lectures and case discussions covering identical content, but without the five innovative P-MASE components. To mitigate instructor-related bias, instructors from both groups received standardized training before the trial, and teaching experience and academic rank were matched between groups.

Contamination prevention

To minimize contamination between groups, separate teaching assistants were assigned, classes were scheduled at different times and locations, and learning management system (LMS) access was restricted to group-specific materials and forums. Students were regularly reminded not to share materials across groups, and access logs were monitored. Participation was administratively separated from official academic evaluation to avoid coercion.

Adherence/attendance and engagement

Formal attendance was not systematically recorded for all sessions (e.g., no standardized roll-call or electronic sign-in across teaching activities) in either group. Therefore, we could not compute session-level attendance rates as a measure of adherence. To provide an objective indicator of participation, we used learning platform logs as engagement proxies, including number of logins and valid discussion posts, and we additionally tracked completion of required course tasks.
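To illustrate how such proxies can be computed from raw platform exports, the sketch below aggregates hypothetical log rows with pandas; the column names and the minimum-length rule defining a "valid" post are assumptions, since the paper does not specify the platform's export schema.

```python
import pandas as pd

# Hypothetical LMS export: one row per logged event (schema assumed).
logs = pd.DataFrame({
    "student_id": [101, 101, 102, 102, 102],
    "week":       [1,   1,   1,   2,   2],
    "event":      ["login", "post", "login", "login", "post"],
    "post_chars": [None, 240, None, None, 30],
})

# Weekly logins per student.
weekly_logins = (logs[logs["event"] == "login"]
                 .groupby(["student_id", "week"]).size()
                 .rename("logins"))

# "Valid" posts, here operationalized as posts above a length threshold.
valid = logs[(logs["event"] == "post") & (logs["post_chars"].ge(50))]
valid_posts = valid.groupby("student_id").size().rename("valid_posts")
```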

Evaluation index system

Teaching effectiveness was assessed across four dimensions—Theoretical Knowledge, Skills Ability, Learning Process, and Teaching Satisfaction—using ten indicators (Table 2). To ensure that the trial evaluations reflected the course’s core learning objectives, we defined an objective-to-assessment mapping within this four-dimensional framework, operationalizing each objective into measurable indicators and aligning them with specific assessment methods (Table 2). For example, policy/regulatory knowledge was assessed via a closed-book examination (primary outcome), while applied decision-making competencies were assessed through standardized performance tasks such as DRG grouping accuracy, crisis-statement communication (NLP-assisted empathy markers in standardized written tasks), and simulation-based resource allocation and process optimization scenarios. Learning engagement and collaborative processes were captured via platform behavioral indicators and structured discussion contributions, and perceived readiness was assessed via teaching satisfaction and job-competence confidence measures.
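As a concrete example of this objective-to-assessment mapping, the DRG indicator reduces to an agreement computation between learner assignments and a reference standard; the codes below are illustrative placeholders, not actual grouper output.

```python
# Illustrative learner vs. reference DRG assignments (codes are made up).
reference = ["GB29", "ES31", "FM19", "GB29", "IC39", "ES31"]
learner   = ["GB29", "ES35", "FM19", "GB29", "IC39", "ES31"]

errors = sum(ref != ans for ref, ans in zip(reference, learner))
error_rate = 100 * errors / len(reference)  # percent, as reported in Table 5
print(f"DRG grouping error rate: {error_rate:.1f}%")
```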

Table 2.

Evaluation indicator system for hospital management teaching effectiveness

Dimension Indicator Evaluation Method Teaching Mapping Goal
Theoretical Knowledge 1. Mastery of the “Hospital Management” textbook Closed-book exam (100 points) Knowledge retention and policy comprehension
2. Awareness of the medical quality management system Case analysis scoring Ability to link theory to practice
Skills Ability 3. Budget preparation and resource allocation rationality Simulation sandbox scoring (Department satisfaction ≥ 80%) Strategic planning and cost control skills
4. Effectiveness of crisis response plan Frequency of “empathy expression” in public statements (times/100 words) (RoBERTa model analysis) Emergency decision-making and risk prevention skills
5. Data-driven decision-making ability Disease DRG clustering accuracy (comparison with standard group results) Use of information tools and data analysis
Learning Process 6. Learning motivation and participation Platform login frequency, quality of discussion posts (NLP sentiment analysis) Self-directed learning and proactive inquiry skills
7. Team collaboration effectiveness Group task contribution (peer review + teacher scoring) Cross-department communication and collaboration skills
8. Hospital management thinking and problem-solving ability Feasibility of emergency resource allocation plan (Simio simulation results) Complex problem analysis and systematic decision-making
Teaching Satisfaction 9. Practicality of course content Likert 5-point scale (“degree of alignment with actual management needs”) Alignment of teaching content with job competency
10. Support of teaching resources Satisfaction score for the case database, simulation system, and mentoring Appropriateness of resource allocation

Between-group comparisons used independent-samples t tests for continuous variables and χ² tests (or Fisher’s exact test) for categorical variables. Two-sided P < 0.05 was considered statistically significant

To clarify the educational intent of the management-derived indicators, we treated them as performance-based assessments of learners’ reasoning processes within standardized tasks. Specifically, DRG grouping accuracy was used as an indicator of rule application, data interpretation, and justification of classification decisions; the budgeting/resource-allocation simulation (department satisfaction) reflected trade-off reasoning under constraints; where applicable, an Analytic Hierarchy Process (AHP) decision model was used within the resource-allocation tasks to structure multi-criteria prioritization; learners’ weighting choices and consistency of judgments were treated as indicators of trade-off reasoning and justification under constraints; and ED workflow optimization performance (simulated waiting-time reduction) indicated systems thinking and process-improvement decision-making within the learning tasks. Empathy markers in crisis communication were derived from standardized written crisis statements using NLP-assisted analysis and were interpreted as an educational indicator of empathic risk communication strategies, complementing other competency assessments.
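To make the AHP step concrete, the sketch below derives priority weights from a Saaty-style pairwise comparison matrix and checks judgment consistency; the criteria and judgment values are hypothetical, chosen only to illustrate how weighting choices and consistency of judgments can be scored.

```python
import numpy as np

# Hypothetical pairwise judgments (Saaty 1-9 scale) over three budgeting
# criteria: cost control, service quality, staff satisfaction.
A = np.array([
    [1.0,   3.0,   5.0],
    [1 / 3, 1.0,   2.0],
    [1 / 5, 1 / 2, 1.0],
])

eigvalues, eigvectors = np.linalg.eig(A)
k = int(np.argmax(eigvalues.real))
weights = eigvectors[:, k].real
weights = weights / weights.sum()  # priority weights, normalized to sum to 1

n = A.shape[0]
ci = (eigvalues.real[k] - n) / (n - 1)  # consistency index
cr = ci / 0.58                          # random index RI = 0.58 for n = 3
print(weights.round(3), f"CR = {cr:.3f}")  # CR < 0.10 = acceptable consistency
```

In an assessment setting, the learner supplies the pairwise judgments; the derived weights expose their prioritization rationale, and the consistency ratio flags internally contradictory trade-off reasoning.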

The teaching satisfaction questionnaire (Table 3) was developed for this study based on prior hospital management education literature and expert consultation. It included three items—course practicality, resource support, and confidence in job competence—rated on a 5-point Likert scale. The full questionnaire is available upon reasonable request.

Table 3.

Teaching satisfaction survey results (Likert 5-point scale)

Indicator Experimental group (n = 60) Control group (n = 60) T P value
Practicality of course content 4.7 ± 0.3 3.1 ± 0.6 17.03 < 0.001
Supportiveness of teaching resources 4.5 ± 0.4 3.4 ± 0.5 13.48 < 0.001
Confidence in job competence 4.6 ± 0.2 3.0 ± 0.7 15.67 < 0.001

Data are presented as mean ± standard deviation. Satisfaction levels were measured using a Likert 5-point scale, where higher scores indicate greater satisfaction

Primary and secondary outcomes

  • Primary outcome: Closed-book examination score.

  • Secondary outcomes: Departmental budgeting satisfaction, empathy expression, simulation-based DRG grouping accuracy, simulated ED waiting-time reduction, platform logins, valid discussion posts, and teaching satisfaction scores.

Statistical analysis

All analyses were performed using Statistical Package for the Social Sciences (SPSS) v25.0. Continuous variables are reported as mean ± standard deviation (SD). Between-group comparisons were conducted using Welch’s independent-samples t-tests (two-sided, α = 0.05) for continuous variables, and chi-square or Fisher’s exact tests for categorical variables. Statistical significance was set at P < 0.05. Missing data were minimal (< 5%) and handled by complete-case analysis. To control for type I error inflation due to multiple secondary outcomes, the Benjamini–Hochberg false discovery rate (FDR) procedure was applied.
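Although the trial's analyses were run in SPSS, the same pipeline can be sketched in Python for illustration; the data below are simulated to mirror the reported group means and SDs, and the secondary-outcome P values are placeholders.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(42)
# Simulated exam scores matching the reported group means/SDs (n = 60 each).
pmase = rng.normal(88.3, 3.2, 60)
control = rng.normal(73.6, 5.8, 60)

# Welch's two-sided t-test (unequal variances assumed).
t_stat, p_value = stats.ttest_ind(pmase, control, equal_var=False)

# Benjamini-Hochberg FDR adjustment across the secondary-outcome P values
# (placeholder values; the trial's actual P values appear in Tables 3, 5, 6).
secondary_p = [0.0004, 0.0030, 0.0120, 0.0001, 0.0200]
reject, p_adjusted, _, _ = multipletests(secondary_p, alpha=0.05,
                                         method="fdr_bh")
```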

Baseline characteristics

A baseline comparison was performed to ensure initial equivalence between groups. No significant differences were observed in age, sex, GPA, or prior management-related coursework.

Reporting guidelines

This randomized controlled trial was reported in accordance with the Consolidated Standards of Reporting Trials (CONSORT) 2010 statement [14]. A completed CONSORT 2010 checklist is provided as Additional file 1.

Results

Primary outcome (closed-book examination score) is presented first, followed by prespecified secondary outcomes grouped by skills-based performance assessments, learning-process indicators, and teaching satisfaction. Several management-derived outcomes (e.g., DRG grouping accuracy, simulated departmental satisfaction, and simulated ED waiting-time reduction) were generated within standardized simulations or practice-proximal learning tasks and are reported as educational performance indicators rather than real-world operational metrics. Full descriptive statistics and hypothesis-test results are provided in the corresponding tables; the narrative below emphasizes overall patterns and educational significance rather than restating all numeric values.

Primary outcome: closed-book examination score

As shown in Table 4, the experimental group demonstrated superior closed-book examination performance compared with the control group, with consistent advantages in both policy/regulation mastery and case-based analysis. These between-group differences were statistically significant (P < 0.001 for both domains), indicating stronger consolidation of core management knowledge and improved ability to apply this knowledge to practice-relevant scenarios (see Table 4 for full values).

Table 4.

Comparison of theoretical knowledge assessment results (Full Score: 100 points)

Indicator Experimental Group (n = 60) Control Group (n = 60) T P value
Mastery of Policies and Regulations (Closed-book) 88.3 ± 3.2 73.6 ± 5.8 17.12 < 0.001
Case Analysis Score 90.1 ± 4.5 65.4 ± 7.2 22.79 < 0.001

Data are presented as mean ± standard deviation. A P value less than 0.001 indicates a statistically significant difference between the experimental and control groups

Secondary outcomes: skills performance assessments

As shown in Table 5, in the standardized budgeting/resource-allocation simulation task, the experimental group performed better, reflected by higher simulated departmental satisfaction (82.5 ± 3.4% vs. 74.3 ± 5.1%, P < 0.001). In a standardized written crisis-statement task, NLP-assisted analysis identified more empathy expressions, used as an educational performance marker (5.3 ± 0.8 vs. 2.1 ± 0.4 per 100 words, P < 0.001). In a DRG grouping performance-based assessment (comparison against a reference standard), the experimental group had a lower error rate (6.3 ± 2.1% vs. 21.7 ± 4.3%, P < 0.001).

Table 5.

Skills improvement effects (standardized simulation- and task-based assessments)

Indicator Experimental Group (n = 60) Control Group (n = 60) T P value
Simulated departmental satisfaction (%) (budgeting/resource-allocation task) 82.5 ± 3.4 74.3 ± 5.1 10.25 < 0.001
Empathy markers in standardized written crisis statements (times/100 words; NLP-assisted analysis) 5.3 ± 0.8 2.1 ± 0.4 29.14 < 0.001
DRG grouping error rate (%) 6.3 ± 2.1 21.7 ± 4.3 -25.24 < 0.001

All management-derived indicators in this table are educational performance measures generated within standardized tasks/simulations. The values represent the mean ± standard deviation (SD) for each indicator. The experimental group demonstrated significantly higher performance in terms of department satisfaction, empathy in simulation-based crisis communication, and data-driven decision-making ability compared to the control group

Secondary outcomes: learning-process indicators

As summarized in Table 6, students in the experimental group showed higher engagement, with more weekly platform logins (18.3 ± 2.1 vs. 9.7 ± 3.4, P < 0.001) and valid discussion posts (24.5 ± 5.2 vs. 8.3 ± 3.7, P < 0.001). In a standardized ED workflow optimization simulation scenario (educational assessment), their simulation-based ED waiting-time reduction rate was also higher and was interpreted as a performance indicator of process-optimization decision-making within the learning tasks (33.3 ± 4.7% vs. 11.2 ± 3.6%, P < 0.001).

Table 6.

Learning-process behavioral outcomes

Indicator Experimental Group (n = 60) Control Group (n = 60) T P value
Weekly platform logins (times/week) 18.3 ± 2.1 9.7 ± 3.4 16.74 < 0.001
Valid discussion contributions (posts per student) 24.5 ± 5.2 8.3 ± 3.7 19.61 < 0.001
Simulated ED waiting-time reduction (%) (standardized workflow optimization simulation scenario) 33.3 ± 4.7 11.2 ± 3.6 28.45 < 0.001

Values are mean ± SD or percentage as indicated; two-sided tests at α = 0.05. P values for continuous outcomes use Welch’s t-test. The simulated ED waiting-time reduction represents performance within a standardized workflow optimization simulation scenario used for educational assessment and is not a real-world operational metric; it is reported as the mean (± SD) percentage reduction from baseline in the simulation scenario. As a sensitivity check for count outcomes, negative binomial models with robust standard errors yielded consistent inferences (not shown)

Secondary outcomes: teaching satisfaction

Post-course surveys indicated greater satisfaction in the experimental group, which rated course practicality (4.7 ± 0.3 vs. 3.1 ± 0.6, P < 0.001), teaching resource support (4.5 ± 0.4 vs. 3.4 ± 0.5, P < 0.001), and confidence in job competence (4.6 ± 0.2 vs. 3.0 ± 0.7, P < 0.001) significantly higher than the control group did (Table 3). Over 91% agreed that simulation-based training was closely connected to real hospital management scenarios.

Discussion

The prespecified primary outcome—closed-book examination performance—was higher in the P-MASE group than in the control group, indicating improved short-term knowledge acquisition within this course context. Across prespecified secondary outcomes, the P-MASE group also demonstrated better performance on standardized simulation/task-based assessments (e.g., DRG grouping accuracy, crisis-statement empathy markers, and budgeting/resource-allocation satisfaction), higher engagement indicators (platform logins and discussion posts), and higher teaching satisfaction. Because several secondary indicators were derived from standardized simulations or practice-proximal learning tasks, these results should be interpreted as educational performance measures of learners’ applied reasoning and decision-making within a course-based, simulation-supported environment, not as evidence of improved real-world hospital management effectiveness or policy execution. Accordingly, any translation of these educational gains to sustained workplace performance would require further workplace-based evaluation with objective, real-world outcomes.

To support performance-based assessment in hospital management education, we used several management-derived tools as educational instruments to externalize learners’ reasoning processes rather than to evaluate real-world operational performance. DRG grouping accuracy was used to assess rule application and data interpretation in classification decisions against a reference standard. In the budgeting/resource-allocation simulations, the use of AHP enabled multi-criteria structuring of trade-offs, allowing evaluation of learners’ prioritization rationale and consistency of judgments under constraints. For crisis communication, NLP-assisted analysis of standardized written statements provided an objective indicator of empathic risk-communication strategies, complementing other competency assessments. Collectively, these tools operationalize applied reasoning and decision-making constructs within controlled learning tasks, aligning assessment outputs with the educational aims of the course.
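As an illustrative sketch only: the study's fine-tuned RoBERTa empathy classifier is locally developed and not publicly released, so the snippet below shows how a per-100-words marker count could be computed with the Hugging Face transformers API, using a public emotion model as a stand-in and treating its "caring"/"remorse" labels as a rough empathy proxy.

```python
from transformers import pipeline

# Public stand-in model (go_emotions); the study's own empathy classifier
# is locally developed and not released.
clf = pipeline("text-classification", model="SamLowe/roberta-base-go_emotions")

statement = (
    "We deeply regret the distress this incident has caused patients and "
    "their families. We are reviewing every step of the response and will "
    "share the findings openly."
)

# Count sentences predicted as caring/remorseful (a rough empathy proxy),
# then normalize to markers per 100 words, mirroring the study's indicator.
sentences = [s.strip() for s in statement.split(".") if s.strip()]
hits = sum(1 for s in sentences
           if clf(s)[0]["label"] in {"caring", "remorse"})
per_100_words = 100 * hits / len(statement.split())
print(f"empathy markers per 100 words: {per_100_words:.1f}")
```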

From a health professions education perspective, these findings support the value of aligning teaching strategies with competency-based, performance-oriented assessment. By integrating problem-based inquiry, authentic task chains, social learning supports, and experiential simulation, P-MASE may help narrow the knowing–doing gap by providing repeated opportunities for learners to apply policy rules, interpret data, and justify decisions under competing constraints within a controlled educational context.

Theoretical contribution: bridging the knowing–doing gap in hospital management education

This study suggests that traditional policy-memorization-focused pedagogies are inadequate for meeting the developmental demands of public hospitals pursuing high-quality transformation. The P-MASE model achieves a shift from “knowledge transmission” to “decision-wisdom cultivation” by embedding authentic managerial conflicts (e.g., interest negotiations in DRG cost containment) and leveraging digital analytics technologies (e.g., RoBERTa-based sentiment analysis) [15]. This approach aligns with Kolb’s experiential learning theory, which emphasizes reflective practice—when students engage with unprocessed operational data (such as raw DRG grouper outputs), policy provisions transform from isolated knowledge points into practical decision-making tools. Such pedagogical logic restructuring provides a theoretical pathway to resolve the persistent “knowing-doing gap” in hospital management education [16, 17]. Given the magnitude of between-group differences, potential confounding factors—such as group motivation, instructor preference, or differential exposure to digital tools—cannot be entirely excluded. Although efforts were made to minimize these factors, future multicenter studies are needed.

Practical implications: transferable cultivation of job competence

This study suggests that hospital administrators’ core competency lies in dynamically balancing complex systems [18]. First, reconciling data-driven and humanistic dimensions: the experimental group’s low negative-sentiment scores in crisis responses stemmed from integrating the tripartite communication strategy—empathic engagement, factual clarification, and institutional accountability—requiring managers to simultaneously interpret public emotion metrics and design compassionate communication frameworks [19]. Second, optimizing the efficiency-quality equilibrium: the interdisciplinary team’s emergency resource scheduling model demonstrated how synchronizing patient flow optimization with algorithmic rostering could overcome traditional management’s “efficiency ceiling” [20]. Cultivating such competencies necessitates authentic conflict scenarios where students experience the complete decision-making continuum—from data insight generation to stakeholder interest harmonization.

Innovation: clinical translation of educational technology

The innovation of this study lies in transforming hospital management tools—such as the Analytic Hierarchy Process (AHP)—into pedagogical enablers. First, decision-process visualization: through departmental satisfaction modeling in budgetary simulations, students observed direct trade-offs (e.g., “a 10% increase in equipment investment raised surgical satisfaction by 6.2% but reduced medical department satisfaction by 3.8%”), thereby internalizing marginal utility principles and resource allocation compromises [21]. Second, real-time risk feedback: RoBERTa-driven sentiment analysis of crisis statements shifted communication training from post hoc scoring to in-process calibration—mirroring real-time monitoring paradigms in clinical diagnostics [22]. This “management-as-pedagogy” technological translation establishes a replicable methodological framework for hospital management education [23].

Future direction: from educational innovation to management ecosystem transformation

Despite the favorable outcomes observed with the P-MASE model, its broader implementation faces significant challenges, particularly regarding the integration of pedagogical tools (e.g., DRG groupers, performance algorithms) with hospital management information systems. To address this, a tripartite collaboration framework is proposed: health administrative departments providing de-identified medical data, universities developing pedagogy-adapted algorithms, and hospitals co-constructing a closed-loop “education-practice-research” ecosystem. At the same time, technological dependency risks must be mitigated: when students overemphasize quantitative metrics (e.g., CMI values), individualized patient needs may be overlooked, necessitating strengthened medical humanities modules in future curricula [24, 25].

Limitations

This study has several limitations that should be considered when interpreting the results. First, it was conducted at a single medical university in China, which limits generalizability to other regions, healthcare systems, cultural contexts, and levels of learner baseline experience. Second, P-MASE is a multi-component intervention delivered as a combined package, so we cannot determine which specific elements (or their combinations) drove the observed improvements; future work could use factorial or dismantling designs to isolate component contributions, assess potential interactions among components, and optimize the intervention. Third, although allocation concealment and outcome-assessor blinding were implemented, blinding of students and instructors was not feasible given the nature of the educational intervention, raising the possibility of performance bias and Hawthorne effects, particularly in the intervention group, which may have been motivated by the perceived novelty of digital tools and interdisciplinary mentoring. Fourth, several outcomes (teaching satisfaction, perceived job competence) relied on self-reported data, which are susceptible to social desirability and recall bias and may not fully reflect objectively measured competence. Fifth, although we took multiple measures to prevent contamination (separate schedules, restricted LMS access, monitoring of logs), some information exchange between groups cannot be completely ruled out given shared institutional settings and peer networks; a cross-over design was not used because of expected carry-over and contamination effects in a semester-long educational intervention. Sixth, the NLP (RoBERTa) model for empathy detection and the simulation platforms were locally developed and have not yet undergone extensive external validation across diverse datasets or institutions, nor been calibrated against widely used benchmark corpora or commercial platforms. Finally, outcomes were assessed immediately after the course and therefore represent short-term, course-based effects without evidence of long-term retention. Because several secondary indicators were generated within standardized simulations or analytic models (e.g., workflow optimization scenarios, budgeting/resource-allocation simulations, and NLP-assisted analysis of crisis statements), these measures should be interpreted as educational performance indicators of learners’ competence and decision-making in controlled learning contexts and should not be extrapolated as evidence of sustained workplace performance or real-world hospital operational or policy impact under routine clinical/administrative conditions. Future multicenter studies with longer follow-up and objective workplace-based outcomes are needed to evaluate retention, transfer, and implementation feasibility (including cost, faculty workload, and infrastructure requirements).

Conclusions

Driven by authentic scenarios and interdisciplinary collaboration, P-MASE integrates and scaffolds digital decision-support tools to strengthen learners’ ability to apply policy rules, interpret data, and justify decisions in standardized learning tasks and simulations within policy-constrained hospital management contexts. In this educational RCT, P-MASE was associated with higher knowledge scores and better simulation/task-based performance indicators, together with greater engagement and teaching satisfaction, compared with standard instruction. These findings should be interpreted as short-term, course-based educational effects and improvements in simulation-supported learning performance, rather than direct evidence of enhanced real-world hospital management effectiveness or policy execution in practice. Future multicenter studies with longer follow-up are needed to evaluate retention and transfer to workplace performance and to examine feasibility, scalability, and potential equity impacts across learner groups and settings.

Supplementary Information

Supplementary Material 1 (32.9KB, docx)
Supplementary Material 2 (27.6KB, docx)

Acknowledgements

We thank all participants who took part in the study.

Abbreviations

AHP

Analytic Hierarchy Process

AL

Authentic Learning

CBE

Competency-Based Education

ChiCTR

Chinese Clinical Trial Registry

CMI

Case Mix Index

CNY

Chinese Yuan

CONSORT-Edu

Consolidated Standards of Reporting Trials for Education

CRBSI

Catheter-Related Bloodstream Infections

DIP

Diagnosis-Intervention Packet

DRG

Diagnosis-Related Groups

ED

Emergency Department

EL

Experiential Learning

FDR

False Discovery Rate

GPA

Grade Point Average

GLM

Generalized Linear Model

HIS

Hospital Information Systems

ICD-9-CM

International Classification of Diseases, 9th Revision, Clinical Modification

ICC

Intraclass Correlation Coefficient

ICU

Intensive Care Unit

LMS

Learning Management System

LOS

Length of Stay

ML

Mobile Learning

NLP

Natural Language Processing

P-MASE

Problem-based, Mobile, Authentic, Social, and Experiential

PBL

Problem-Based Learning

RoBERTa

Robustly Optimized BERT Pretraining Approach

SD

Standard Deviation

SL

Social Learning

SPICES

Student-centered, Problem-based, Integrated, Community-oriented, Elective, Systematic

SPSS

Statistical Package for the Social Sciences

Authors’ contributions

YSZ and WL contributed equally to this work as co-first authors. YSZ conceptualized the overall study design, developed the P-MASE pedagogy framework, and led the methodology, including the integration of mobile and experiential learning elements. WL handled data curation, performed the formal statistical analysis (t-tests, GLMs, and ICC calculations), and interpreted results related to DRG grouping accuracy and empathy expression. CCZ assisted with the investigation, managed participant recruitment and randomization, and oversaw data collection from the learning platforms, including logins and discussion posts. YWW supervised the project, secured funding, reviewed and edited the manuscript for intellectual content, and managed ethics approval and trial registration. SF, as co-corresponding author, supervised interdisciplinary aspects, contributed to the skills-improvement analysis using tools such as RoBERTa for sentiment evaluation and simulation modeling, and finalized manuscript revisions. All authors read and approved the final manuscript.

Funding

This research was funded by the Construction Project of High Level Hospital of Jiangsu Province (GSPJS202515), the Paired Assistance Scientific Research Project of the Affiliated Hospital of Xuzhou Medical University (FXJDBF2024204), and the Major Program in Philosophy and Social Sciences of Jiangsu Provincial Colleges and Universities (2021SJZDA1350).

Data availability

The datasets used and/or analyzed in this study are available from the corresponding author upon reasonable request. We do not have ethical permission to upload the datasets to a repository. Please note that all research data have been anonymized for confidentiality purposes.

Declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committee of the Affiliated Hospital of Xuzhou Medical University (XYFY2023-KL146-01). All participants provided informed consent before the study began; the participants were healthy adults, and the study complied with the Declaration of Helsinki. All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yongshuo Zhang and Wenbo Li are co-first authors.

Contributor Information

Yawei Wang, Email: wangyawei2013@foxmail.com.

Shuo Feng, Email: xzfs0561@163.com.

References

  • 1. Li Y, He W, Yang L, Zheng K. A historical review of performance appraisal of public hospitals in China from the perspective of historical institutionalism. Front Public Health. 2022;10:1009780.
  • 2. Cooper D, Holmboe ES. Competency-based medical education at the front lines of patient care. N Engl J Med. 2025;393(4):376–88.
  • 3. Wu Z, Huang Y, Lyu L, Huang Y, Ping F. The efficacy of simulation-based learning versus non-simulation-based learning in endocrinology education: a systematic review and meta-analysis. BMC Med Educ. 2024;24(1):1069.
  • 4. Patel H, Perry S, Badu E, Mwangi F, Onifade O, Mazurskyy A, et al. A scoping review of interprofessional education in healthcare: evaluating competency development, educational outcomes and challenges. BMC Med Educ. 2025;25(1):409.
  • 5. Song X, Cleaves E, Gluzman E, Kotlyar B, Russo RA, Schilling DC, et al. A scoping review of assessments in undergraduate medical education: implications for residency programs and medical schools. Acad Psychiatry. 2025;49(3):263–73.
  • 6. Chang O, Holbrook AM, Lohit S, Deng J, Xu J, Lee M, et al. Comparability of objective structured clinical examinations (OSCEs) and written tests for assessing medical school students’ competencies: a scoping review. Eval Health Prof. 2023;46(3):213–24.
  • 7. Oudbier J, Verheijck E, van Diermen D, Tams J, Bramer J, Spaai G. Enhancing the effectiveness of interprofessional education in health science education: a state-of-the-art review. BMC Med Educ. 2024;24(1):1492.
  • 8. Alharbi NS. Evaluating competency-based medical education: a systematized review of current practices. BMC Med Educ. 2024;24(1):612.
  • 9. Tulshian P, Montgomery L, McCrory K, Theobald M, Matosich S, Wright O, et al. National recommendations for implementation of competency-based medical education in family medicine. Fam Med. 2025;57(4):253–60.
  • 10. Lall P, Rees R, Law GCY, Dunleavy G, Cotič Ž, Car J. Influences on the implementation of mobile learning for medical and nursing education: qualitative systematic review by the Digital Health Education Collaboration. J Med Internet Res. 2019;21(2):e12895.
  • 11. Alizadeh M, Masoomi R, Mafinejad MK, Parmelee D, Khalaf RJ, Norouzi A. Team-based learning in health professions education: an umbrella review. BMC Med Educ. 2024;24(1):1131.
  • 12. Aversano L, Iammarino M, Madau A, Pirlo G, Semeraro G. Process mining applications in healthcare: a systematic literature review. PeerJ Comput Sci. 2025;11:e2613.
  • 13. Healy LI, Rodriguez-Guerineau L, Mema B. Development and validity of a simulation program for assessment of clinical teaching skills. ATS Scholar. 2025;6(2):217–31.
  • 14. Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomized trials. Ann Intern Med. 2010;152(11):726–32.
  • 15. Semary NA, Ahmed W, Amin K, Pławiak P, Hammad M. Improving sentiment classification using a RoBERTa-based hybrid model. Front Hum Neurosci. 2023;17:1292010.
  • 16. Choshi M. Addressing challenges in undergraduate community health nursing clinical: Kolb’s experiential learning theory. J Nurs Educ. 2025;64(6):e31–4.
  • 17. Nagel DA, Penner JL, Halas G, Philip MT, Cooke CA. Exploring experiential learning within interprofessional practice education initiatives for pre-licensure healthcare students: a scoping review. BMC Med Educ. 2024;24(1):139.
  • 18. Terzic-Supic Z, Bjegovic-Mikanovic V, Vukovic D, Santric-Milicevic M, Marinkovic J, Vasic V, et al. Training hospital managers for strategic planning and management: a prospective study. BMC Med Educ. 2015;15:25.
  • 19. Lighterness A, Adcock M, Scanlon LA, Price G. Data quality-driven improvement in health care: systematic literature review. J Med Internet Res. 2024;26:e57615.
  • 20. El-Rifai O, Garaix T, Augusto V, Xie X. A stochastic optimization model for shift scheduling in emergency departments. Health Care Manag Sci. 2015;18(3):289–302.
  • 21. Feng X, Qu Y, Sun K, Luo T, Meng K. Identifying strategic human resource management ability in the clinical departments of public hospitals in China: a modified Delphi study. BMJ Open. 2023;13(3):e066599.
  • 22. Fuller K, Lupton-Smith C, Hubal R, McLaughlin JE. Automated analysis of preceptor comments: a pilot study using sentiment analysis to identify potential student issues in experiential education. Am J Pharm Educ. 2023;87(9):100005.
  • 23. Cahn A, Akirov A, Raz I. Digital health technology and diabetes management. J Diabetes. 2018;10(1):10–7.
  • 24. Chen L, Zhang J, Zhu Y, Shan J, Zeng L. Exploration and practice of humanistic education for medical students based on volunteerism. Med Educ Online. 2023;28(1):2182691.
  • 25. Ambalavanan R, Snead RS, Marczika J, Towett G, Malioukis A, Mbogori-Kairichi M. Challenges and strategies in building a foundational digital health data integration ecosystem: a systematic review and thematic synthesis. Front Health Serv. 2025;5:1600689.
