2019 Mar 28;21(3):e12994. doi: 10.2196/12994

Table 7.

Summary of findings for serious gaming versus traditional learning. Patient or population: various health professionals; settings: high- and middle-income countries; intervention: serious gaming and gamification; comparison: traditional learning.

Outcomes	Number of participants (number of studies)	Quality of evidence (GRADE^a)	Comments
Knowledge (measures include multiple-choice questions, clinical scenario–based questions, and self-assessment; follow-up mostly immediately after the intervention, longest follow-up of 52 weeks)	769 (7)	Low^b,c,d	All the individually played games with an objective assessment of knowledge suggested serious gaming/gamification was superior to traditional learning. Four RCTs^e and one cRCT^f reported higher postintervention knowledge scores in the serious gaming group than in the control group, with moderate-to-large effect sizes, although the result for the cRCT may not have been statistically significant^g. One RCT of a serious gaming intervention reported no difference between groups. One cRCT assessing perceived knowledge reported no difference between groups.
Skills (measures include performance metrics on a simulator, practical examinations, OSCEs^h, and self-evaluation; most studies followed up until immediately after the intervention only)	1195 (14)	Low	Six RCTs reported higher postintervention skill scores in the serious gaming group on every skill measure used in the respective study, with small-to-large effect sizes. A further cRCT suggested higher skill scores of small magnitude, but this result may not have been statistically significant^g. Three RCTs measured skill outcomes using multiple measures (and no summary measure) and reported higher postintervention scores for some of these measures and no difference for others. Two RCTs and one cRCT reported no difference in postintervention skill scores between groups. One cRCT suggested serious gaming may be inferior to traditional learning, but this result may not have been statistically significant^g.
Attitudes (measured with participant-completed rating scales; follow-up immediately after the test)	369 (3)	Very low^b,c,i,j	One RCT reported higher postintervention attitude scores in the serious gaming group (small effect size), and one RCT reported no difference between groups. A third study reported higher scores in the intervention groups, but this result may not have been statistically significant^g.
Satisfaction (3 questions on attitudes toward the learning experience, measured on a 4-point Likert scale; follow-up immediately after the intervention)	144 (1)	Low	One study reported higher postintervention satisfaction scores in the serious gaming group than in the control group.

^aGRADE: Grading of Recommendations, Assessment, Development and Evaluations.

^bRated down one level for study limitations: The risk of bias was unclear for multiple domains.

^cRated down one level for imprecision: All included studies assessing this comparison and outcome had fewer than 400 participants.

^dLow quality (+ + – –): Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.

^eRCT: randomized controlled trial.

^fcRCT: cluster randomized controlled trial.

^gNone of the 3 included cRCTs accounted for clustering in their analyses. They were therefore reanalyzed using the number of clusters as the sample size and were likely substantially underpowered.

^hOSCE: objective structured clinical examination.

^iRated down one level for inconsistency: There was considerable heterogeneity in the results without a clear explanation.

^jVery low quality (+ – – –): We are uncertain about the estimate.
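
The footnotes above rate the evidence down one level per concern. A minimal Python sketch of that arithmetic follows, assuming the standard GRADE convention that evidence from randomized trials starts at "High" and moves down the ladder High, Moderate, Low, Very low. The function name `grade_rating` and the reason labels are hypothetical, chosen only for illustration; they are not taken from the review.

```python
# GRADE quality levels, ordered from highest to lowest.
GRADE_LEVELS = ["High", "Moderate", "Low", "Very low"]


def grade_rating(downgrades, start="High"):
    """Return the GRADE level after rating down one level per listed reason.

    Assumption: each reason lowers the rating by exactly one level, as in
    footnotes b, c, and i; the rating cannot fall below "Very low".
    """
    index = GRADE_LEVELS.index(start) + len(downgrades)
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]


# Knowledge row: footnotes b (study limitations) and c (imprecision).
knowledge = grade_rating(["study limitations", "imprecision"])

# Attitudes row: footnotes b, c, and i (inconsistency).
attitudes = grade_rating(["study limitations", "imprecision", "inconsistency"])

print(knowledge)
print(attitudes)
```

Under these assumptions, two downgrades give the "Low" rating shown for knowledge, and three give the "Very low" rating shown for attitudes. The skills and satisfaction rows list no footnote markers in the table, so they are left out of the sketch.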