Abstract
Independent meta-analyses on the same topic can sometimes yield seemingly conflicting results. For example, prominent meta-analyses assessing the effects of violent video games on aggressive behavior have reached apparently different conclusions, provoking ongoing debate. We suggest that such conflicts are sometimes partly an artifact of reporting practices for meta-analyses that focus only on the pooled point estimate and its statistical significance. Considering statistics that focus on the distributions of effect sizes and that adequately characterize effect heterogeneity can sometimes indicate reasonable consensus between “warring” meta-analyses. Using novel analyses, we show that this seems to be the case in the video-game literature. Despite seemingly conflicting results for the statistical significance of the pooled estimates in different meta-analyses of video-game studies, all of the meta-analyses do in fact point to the conclusion that, in the vast majority of settings, violent video games do increase aggressive behavior but that these effects are almost always quite small.
Keywords: meta-analysis, effect sizes, video games, aggression
Meta-analyses are often intended to improve scientific certainty and settle debates. However, in a recent article, de Vrieze (2018) described instances in which independent meta-analyses on the same topic yield apparently conflicting results, seeming only to exacerbate controversy and uncertainty. He noted that prominent meta-analyses assessing the effects of violent video games on aggressive behavior have been widely interpreted as yielding opposite conclusions, which has provoked ongoing heated debate (Anderson et al., 2010; Ferguson, 2015). de Vrieze (2018) described compelling possibilities to help adjudicate the results of such meta-analyses, giving the following specific examples: (a) minimizing researcher degrees of freedom when conducting new meta-analyses, (b) ensuring full analytic reproducibility, and (c) conducting prospective multisite replications of the phenomenon of interest. We agree emphatically with all of these recommendations. In the specific context of the video-game debate, others have also provided interesting commentary on potential scientific reasons for apparently conflicting conclusions (e.g., Kepes, Bushman, & Anderson, 2017; Prescott, Sargent, & Hull, 2018).
However, we also believe that reporting practices for meta-analyses can sometimes produce an illusion of conflict in the meta-analyses’ scientific implications when in fact little conflict exists, as we illustrate for meta-analyses of violent video games later in the article. Meta-analyses are usually reported with nearly exclusive focus on the pooled point estimate and its so-called statistical significance and are sometimes treated as conflicting merely because one point estimate attains statistical significance whereas the other does not, even when the point estimates and confidence intervals (CIs) are quite similar (e.g., Appleton, Rogers, & Ness, 2010; Bloch & Hannestad, 2012). Even when we focus on comparing the point estimates rather than the p values, this approach still does not fully characterize evidence strength when the effects represent a heterogeneous distribution (Mathur & VanderWeele, 2018).
Some “metawars” might be reduced to smaller skirmishes or entirely resolved if investigators were to compare evidence strength between the meta-analyses in a manner that characterizes effect heterogeneity and focuses on the distributions of effect sizes rather than statistical significance. Specifically, an investigator could select thresholds above which an effect size might be considered scientifically meaningful depending on the context of the effect under investigation. (There is a large, interdisciplinary literature considering how to choose such thresholds, as summarized by Mathur & VanderWeele, 2018.) With such a threshold in mind, it is statistically straightforward to estimate, in each meta-analysis, the percentage of true effects stronger than this threshold; we have provided MetaUtility, a package for the R software environment (R Development Core Team, 2017), to do so (Mathur & VanderWeele, 2018). It is also possible to assess the percentage of scientifically meaningful effect sizes in the unexpected direction—that is, opposite in sign from the pooled point estimate. These metrics can help identify whether (a) there are few effects of scientifically meaningful size despite a statistically significant pooled point estimate, (b) there are some large effects despite an apparently null point estimate, or (c) strong effects in the direction opposite the pooled estimate also regularly occur (and thus potential moderators should be examined).
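As a sketch of the estimator behind this metric, suppose (as in the parametric approach of Mathur & VanderWeele, 2018) that the true effects are approximately normally distributed. The symbols below are notation introduced here for illustration: $\hat{\mu}$ is the pooled point estimate, $\hat{\tau}$ is the estimated standard deviation of the true effects, $q$ is the chosen threshold, $q^{*}$ is a threshold in the opposite direction, and $\Phi$ is the standard normal cumulative distribution function.

```latex
\hat{P}(\theta > q) \;=\; 1 - \Phi\!\left(\frac{q - \hat{\mu}}{\hat{\tau}}\right),
\qquad
\hat{P}(\theta < q^{*}) \;=\; \Phi\!\left(\frac{q^{*} - \hat{\mu}}{\hat{\tau}}\right)
```

Under this assumption, a pooled estimate of 0.10 with $\hat{\tau} = 0.05$ places a threshold of $q = 0$ two standard deviations below the mean, so nearly all true effects would be estimated to exceed it, even though roughly half would fall below 0.10.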
Methods
For three prominent meta-analyses on violent video games (Anderson et al., 2010; Ferguson, 2015; Prescott et al., 2018), we meta-analytically estimated (see Footnote 1) the percentage of effects that surpass standardized effect-size thresholds of q = 0, 0.10, and 0.20 (where standardized effect sizes were either Fisher-transformed correlations or standardized multiple-regression coefficients). The threshold of 0 is the least stringent in that it considers all detrimental effects regardless of the magnitude. The more stringent thresholds of 0.10 and 0.20 consider only effects of at least modest sizes. Estimating the percentage (and 95% CI) of true effects stronger than these thresholds can be done meta-analytically, an approach that is distinct from simply counting the so-called significant p values in the observed sample of studies, as we have described elsewhere (Mathur & VanderWeele, 2018).
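To make the closed-form inference concrete, the following is a minimal Python sketch, not the authors' code and not the MetaUtility interface. It assumes normally distributed true effects and approximately independent estimates of the mean and of the heterogeneity variance, and it applies the delta method to obtain a standard error. All function names, and the inputs `var_mu` and `var_tau2` (sampling variances of the pooled estimate and of the heterogeneity variance), are hypothetical. The 15%–85% rule mirrors the one stated in the article's footnote on inference; the bootstrap branch itself is not implemented here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf, pdf, and quantiles


def prop_above(q, mu, tau):
    """Estimated proportion of true effects above q, assuming normality."""
    if tau <= 0:
        raise ValueError("tau must be positive for the closed-form estimator")
    return 1.0 - _STD_NORMAL.cdf((q - mu) / tau)


def prop_above_se(q, mu, tau, var_mu, var_tau2):
    """Delta-method standard error of prop_above.

    Assumes the estimates of mu and tau^2 are approximately independent.
    """
    if tau <= 0:
        raise ValueError("tau must be positive for the closed-form estimator")
    z = (q - mu) / tau
    variance_term = var_mu / tau**2 + var_tau2 * (q - mu) ** 2 / (4 * tau**6)
    return _STD_NORMAL.pdf(z) * sqrt(variance_term)


def closed_form_ci(q, mu, tau, var_mu, var_tau2, level=0.95):
    """Point estimate and Wald-type interval, truncated to [0, 1]."""
    p_hat = prop_above(q, mu, tau)
    se = prop_above_se(q, mu, tau, var_mu, var_tau2)
    crit = _STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    return p_hat, max(0.0, p_hat - crit * se), min(1.0, p_hat + crit * se)


def inference_method(p_hat):
    """Rule from the article's footnote: closed form only for 15% < p < 85%."""
    return "closed-form" if 0.15 < p_hat < 0.85 else "bootstrap"


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    p_hat, lower, upper = closed_form_ci(0.10, 0.10, 0.05, 0.0004, 1e-6)
    print(inference_method(p_hat), p_hat, lower, upper)
```

The interval is truncated to [0, 1] because a Wald-type interval can otherwise extend past the range of a proportion, which is one reason a bootstrap is preferable when the estimate lies near 0% or 100%.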
For each meta-analysis, we first conducted an analysis that reproduced as closely as possible the main results as reported in the respective article’s abstract (see Footnote 2). We also assessed the percentage of effects below effect sizes of −0.10 or −0.20, which would indicate beneficial, rather than detrimental, effects of violent video games. In addition, for more direct scientific comparability across analyses, we conducted a second controlled analysis that included only longitudinal studies (Anderson et al., 2010; Prescott et al., 2018) or studies controlling for baseline aggression through statistical adjustment or randomization (Ferguson, 2015). (For Prescott et al., 2018, the main and controlled analyses were identical.) Secondarily, we also compared the consistency of these metrics upon correction for publication bias of a form that favors studies with statistically significant positive results (Vevea & Hedges, 1995).
For analytic reproducibility, data from the Prescott et al. meta-analysis were obtained from the published forest plot and are publicly available at https://osf.io/eunz3. Data from the Ferguson and Anderson et al. meta-analyses cannot be made public at the authors’ request, but they are available upon request to individuals who have secured permission from the original authors. All code required to reproduce our reanalyses is also publicly available at https://osf.io/eunz3.
Results
When both main- and controlled-analysis specifications are considered, the results (see Table 1) suggest considerable common ground between these three meta-analyses. All six analyses suggest that a large majority (point estimate of at least 80%, with CIs all bounded above 57%) of effects are greater than 0, indicating frequent detrimental effects of violent video games, albeit possibly of negligible size (Table 1, fifth column). In addition, five of the six analyses suggest that very few effects are above 0.20, with the five CIs all bounded below 12% (Table 1, last column). The remaining analysis (the main analysis from Anderson et al., 2010) suggests that this percentage of effects above 0.20, although not negligible, still represents a minority of effects (29%; 95% CI = [14, 45]). The meta-analyses diverge meaningfully only in their estimation of effects above 0.10 (Table 1, sixth column), suggesting that the “conflict” between these analyses is limited to their estimation of effects in the narrow range between 0.10 and 0.20. For example, considering the controlled analyses for the two meta-analyses that have been most central in the debate, we estimate for the Anderson et al. (2010) meta-analysis that 100% (95% CI = [99, 100]), 0% (95% CI = [0, 69]), and 0% (95% CI = [0, 0]) of effects surpass the thresholds of 0, 0.10, and 0.20, respectively; similarly, we estimate for the Ferguson (2015) meta-analysis that 80% (95% CI = [57, 100]), 9% (95% CI = [0, 29]), and 0% (95% CI = [0, 3]) of effects surpass these thresholds.
Table 1.
Estimates From Video-Game Meta-Analyses
| Meta-analysis and specification | Number of estimates | Pooled point estimate | Heterogeneity estimate | True effects > 0 (%) | True effects > 0.10 (%) | True effects > 0.20 (%) |
|---|---|---|---|---|---|---|
| Anderson et al. (2010) | | | | | | |
| Main | 75 | 0.17 [0.15, 0.19] | 0.05 [0.02, 0.07] | 100 [99, 100] | 90 [79, 98] | 29 [14, 45] |
| Controlled | 12 | 0.08 [0.05, 0.10] | 0.00 [0.00, 0.04] | 100 [98, 100] | 0 [0, 69] | 0 [0, 0] |
| Ferguson (2015) | | | | | | |
| Main | 166 | 0.05 [0.04, 0.07] | 0.06 [0.04, 0.06] | 83 [76, 90] | 20 [12, 28] | 0 [0, 2] |
| Controlled | 22 | 0.04 [0.01, 0.07] | 0.05 [0.00, 0.07] | 80 [57, 100] | 9 [0, 29] | 0 [0, 3] |
| Prescott, Sargent, and Hull (2018) | | | | | | |
| Main and controlled | 25 | 0.10 [0.07, 0.13] | 0.05 [0.01, 0.07] | 98 [75, 100] | 53 [29, 76] | 3 [0, 12] |
Note: Effect sizes are on the standardized scale. Heterogeneity estimates represent the estimated standard deviation of the true effect distribution. Values reported in brackets are 95% confidence intervals. Specifications for the main analyses are reported in the respective article’s abstract; controlled analyses include only longitudinal studies or studies otherwise controlling for baseline aggression. The two specifications were identical for Prescott et al. (2018).
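As a rough plausibility check on Table 1, the sketch below recomputes the percentages from the rounded pooled estimates and heterogeneity estimates shown in the table, assuming normally distributed true effects. It is illustrative only and is not the authors' analysis code: the names `TABLE_1` and `percent_above` are hypothetical, the inputs are rounded to two decimals, and the published values were estimated from the underlying data, so small discrepancies are expected. The Anderson et al. (2010) controlled analysis is omitted because its heterogeneity estimate is 0, for which this formula is undefined.

```python
from statistics import NormalDist

THRESHOLDS = (0.0, 0.10, 0.20)

# name -> (pooled estimate, heterogeneity SD, percentages reported in Table 1)
TABLE_1 = {
    "Anderson et al. (2010), main": (0.17, 0.05, (100, 90, 29)),
    "Ferguson (2015), main": (0.05, 0.06, (83, 20, 0)),
    "Ferguson (2015), controlled": (0.04, 0.05, (80, 9, 0)),
    "Prescott et al. (2018)": (0.10, 0.05, (98, 53, 3)),
}


def percent_above(q, mu, tau):
    """Percentage of a Normal(mu, tau) distribution lying above q."""
    return 100.0 * (1.0 - NormalDist(mu, tau).cdf(q))


def approximate_table(rows=TABLE_1, thresholds=THRESHOLDS):
    """Recompute each row's percentages from its rounded mu and tau."""
    return {
        name: [round(percent_above(q, mu, tau)) for q in thresholds]
        for name, (mu, tau, _reported) in rows.items()
    }


if __name__ == "__main__":
    recomputed = approximate_table()
    for name, (_mu, _tau, reported) in TABLE_1.items():
        print(f"{name}: recomputed {recomputed[name]}, reported {list(reported)}")
```

Because the recomputation uses only two summary numbers per analysis, agreement within a few percentage points is the most that should be expected; it shows how the pooled estimate and the heterogeneity estimate jointly determine the threshold-based percentages.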
Considering instead the percentage of effects suggesting beneficial rather than detrimental effects of violent video games, the three meta-analyses all estimate that no such effects (i.e., 0%) are stronger than (i.e., more beneficial than) an effect size of −0.20 (with CIs all bounded below 3%) or even −0.10 (with CIs all bounded below 21%); these analyses are presented in the Supplemental Material available online. The sensitivity analysis correcting for publication bias suggested similarly consistent evidence across the three meta-analyses.
Discussion
In practice, we would interpret these various meta-analyses as providing consistent evidence that the effects of violent video games on aggressive behavior are nearly always detrimental in direction but are rarely stronger than a standardized effect size of 0.20. These conclusions are not intended to trivialize important methodological critiques and debates in this literature, such as those regarding demand characteristics, expectancy effects, confounding, measurement of aggression, and publication bias in experiments with behavioral outcomes (e.g., Ferguson, 2015; Hilgard, Engelhardt, & Rouder, 2017; Markey, 2015). Our claim is not that our reanalyses resolve these methodological problems but rather that widespread perceptions of conflict among the results of these meta-analyses—even when taken at face value without reconciling their substantial methodological differences—may in part be an artifact of statistical reporting practices in meta-analyses. Indeed, our quantitative findings seem to support a recent task force’s suggestion that, heuristically, the conflicting meta-analyses may indicate similar effect sizes (Calvert et al., 2017).
Our findings also in no way undermine the recommendations of de Vrieze (2018) and many others for designing scientifically robust meta-analyses and for adjudicating seemingly conflicting results. Corroborating his discussion of analytic reproducibility, we were able to obtain raw data for the three discussed meta-analyses but experienced challenges in attempting to analytically reproduce several published results; these challenges persisted after contact with the authors. In addition, for one meta-analysis that we initially intended to include because of its historical prominence (Ferguson & Kilburn, 2009), contact with the original author indicated that neither the data nor the list of the studies included in the meta-analysis still existed, although with the author’s assistance, we were able to obtain data for a subsequent, partly overlapping meta-analysis (Ferguson, 2015). Ultimately, even in light of potential methodological problems, suboptimal reproducibility, and researcher degrees of freedom, as noted by de Vrieze (2018), we believe that these conflicting meta-analyses in fact provide considerable consensus in favor of consistent, but small, detrimental effects of violent video games on aggressive behavior.
Acknowledgments
We thank C. Ferguson and C. Anderson for providing raw data for reanalysis and for answering analytic questions.
Funding
M. B. Mathur and T. J. VanderWeele were supported by National Cancer Institute Grant R01-CA222147. The funders had no role in the design, conduct, or reporting of this research.
Footnotes
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Supplemental Material
Additional supporting information can be found at http://journals.sagepub.com/doi/suppl/10.1177/1745691619850104
1. Per Mathur and VanderWeele (2018), we used closed-form inference when the estimated percentage was greater than 15% and less than 85%. Otherwise, we used bias-corrected and accelerated bootstrapping with 1,000 iterations or percentile bootstrapping if needed to alleviate computational problems.
2. We fit meta-analyses using restricted maximum likelihood with Knapp-Hartung–adjusted standard errors (IntHout, Ioannidis, & Borm, 2014). In each meta-analysis, some studies seemed to contribute multiple, potentially nonindependent point estimates, although all original analyses used standard methods assuming independence. For the meta-analysis with the most apparent clustering, we performed a sensitivity analysis by refitting the model using robust methods (Hedges et al., 2010), yielding nearly identical results. For the others, limitations in available data precluded this sensitivity analysis, but clustering seemed to be minimal and so unlikely to affect results.
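The Knapp-Hartung adjustment mentioned in the footnote above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the heterogeneity variance `tau2` is taken as given (for example, from a restricted maximum likelihood fit performed in other software), the critical value `t_crit` (the t quantile with k − 1 degrees of freedom) is supplied by the caller, and the function name and example data are hypothetical.

```python
from math import sqrt


def knapp_hartung(estimates, variances, tau2, t_crit):
    """Random-effects pooled estimate with a Knapp-Hartung standard error.

    estimates: study-level point estimates
    variances: their within-study sampling variances
    tau2:      heterogeneity variance, estimated elsewhere (e.g., by REML)
    t_crit:    t quantile with len(estimates) - 1 degrees of freedom
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance weights that include the between-study variance.
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mu = sum(w * y for w, y in zip(weights, estimates)) / total

    # Knapp-Hartung: rescale the usual variance 1 / total by the weighted
    # mean squared deviation of the estimates around the pooled estimate.
    scale = sum(w * (y - mu) ** 2 for w, y in zip(weights, estimates)) / (k - 1)
    se = sqrt(scale / total)
    return mu, se, (mu - t_crit * se, mu + t_crit * se)


if __name__ == "__main__":
    # Made-up data; 4.303 approximates the 97.5th percentile of t with 2 df.
    print(knapp_hartung([0.1, 0.2, 0.3], [0.01, 0.01, 0.01], 0.0, 4.303))
```

Using a t reference distribution together with this rescaled standard error typically widens the interval when few studies are available, which matches the motivation given by IntHout et al. (2014).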
References
- Anderson CA, Shibuya A, Ihori N, Swing EL, Bushman BJ, Sakamoto A, … Saleem M (2010). Violent video game effects on aggression, empathy, and prosocial behavior in Eastern and Western countries: A meta-analytic review. Psychological Bulletin, 136, 151–173.
- Appleton KM, Rogers PJ, & Ness AR (2010). Updated systematic review and meta-analysis of the effects of n-3 long-chain polyunsaturated fatty acids on depressed mood. The American Journal of Clinical Nutrition, 91, 757–770.
- Bloch MH, & Hannestad J (2012). Omega-3 fatty acids for the treatment of depression: Systematic review and meta-analysis. Molecular Psychiatry, 17, 1272–1282.
- Calvert SL, Appelbaum M, Dodge KA, Graham S, Nagayama Hall GC, Hamby S, … Hedges LV (2017). The American Psychological Association Task Force assessment of violent video games: Science in the service of public interest. American Psychologist, 72, 126–143.
- de Vrieze J (2018). The metawars. Science, 361, 1184–1188.
- Ferguson CJ (2015). Do angry birds make for angry children? A meta-analysis of video game influences on children’s and adolescents’ aggression, mental health, prosocial behavior, and academic performance. Perspectives on Psychological Science, 10, 646–666.
- Ferguson CJ, & Kilburn J (2009). The public health risks of media violence: A meta-analytic review. The Journal of Pediatrics, 154, 759–763.
- Hedges LV, Tipton E, & Johnson MC (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1, 39–65.
- Hilgard J, Engelhardt CR, & Rouder JN (2017). Overstated evidence for short-term effects of violent games on affect and behavior: A reanalysis of Anderson et al. (2010). Psychological Bulletin, 143, 757–774.
- IntHout J, Ioannidis JP, & Borm GF (2014). The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Medical Research Methodology, 14(1), Article 25. doi:10.1186/1471-2288-14-25
- Kepes S, Bushman BJ, & Anderson CA (2017). Violent video game effects remain a societal concern: Reply to Hilgard, Engelhardt, and Rouder (2017). Psychological Bulletin, 143, 775–782.
- Markey PM (2015). Finding the middle ground in violent video game research: Lessons from Ferguson (2015). Perspectives on Psychological Science, 10, 667–670.
- Mathur MB, & VanderWeele TJ (2018). New metrics for meta-analyses of heterogeneous effects. Statistics in Medicine, 38, 1336–1342.
- Prescott AT, Sargent JD, & Hull JG (2018). Metaanalysis of the relationship between violent video game play and physical aggression over time. Proceedings of the National Academy of Sciences, USA, 115, 9882–9888.
- R Development Core Team (2017). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
- Vevea JL, & Hedges LV (1995). A general linear model for estimating effect size in the presence of publication bias. Psychometrika, 60, 419–435.
