Table 7.
Outcomes | Number of participants (number of studies) | Quality of evidence (GRADEa) | Comments |
Knowledge (measures include multiple-choice questions, clinical scenario–based questions, and self-assessment; follow-up mostly immediately after the intervention, longest follow-up of 52 weeks) | 769 (7) | Lowb,c,d | All studies of individually played games with an objective assessment of knowledge suggested that serious gaming/gamification was superior to traditional learning. Four RCTse and one cRCTf reported higher postintervention knowledge scores in the serious gaming group than in the control group, with moderate-to-large effect sizes, although the result for the cRCT may not have been statistically significantg. An RCT of a serious gaming intervention reported no difference between groups. A cRCT assessing perceived knowledge reported no difference between groups. |
Skills (measures include performance metrics on a simulator, practical examinations, OSCEsh and self-evaluation; most studies followed up until immediately after the intervention only) | 1195 (14) | Low | Six RCTs reported higher postintervention skill scores in the serious gaming group on all skill measures used in the respective study, with small-to-large effect sizes. A further cRCT suggested higher skill scores of small magnitude, but this result may not have been statistically significantg. Three RCTs measured skill outcomes using multiple measures (and no summary measure) and reported higher postintervention scores for some of these measures and no difference for others. Two RCTs and one cRCT reported no difference in postintervention skill scores between groups. One cRCT suggested serious gaming may be inferior to traditional learning, but this result may not have been statistically significantg. |
Attitudes (measured with participant-completed rating scales; follow-up immediately after the test) | 369 (3) | Very lowb,c,i,j | One RCT reported higher postintervention attitude scores in the serious gaming group (small effect size) and one RCT reported no difference between groups. One reported higher scores in the intervention groups, but this result may not have been statistically significantg. |
Satisfaction (3 questions on attitudes toward learning experience measured on a 4-point Likert scale; follow-up immediately after the intervention) | 144 (1) | Low | One study reported higher postintervention satisfaction scores in the serious gaming group compared with the control. |
aGRADE: Grading of Recommendations, Assessment, Development and Evaluations.
bRated down one level for study limitations: The risk of bias was unclear for multiple domains.
cRated down one level for imprecision: All included studies assessing this comparison and outcome had fewer than 400 participants.
dLow quality (+ + – –): Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
eRCT: randomized controlled trial.
fcRCT: cluster randomized controlled trial.
gNone of the 3 included cRCTs accounted for clustering in their analyses. They were therefore reanalyzed using the number of clusters as the sample size and were likely substantially underpowered.
hOSCE: objective structured clinical examination.
iRated down one level for inconsistency: There was considerable heterogeneity in the results without a clear explanation.
jVery low quality (+ – – –): We are uncertain about the estimate.