Current Research in Neurobiology. 2022 May 20;3:100041. doi: 10.1016/j.crneur.2022.100041

Commentary on Unnecessary reliance on multilevel modelling to analyse nested data in neuroscience: When a traditional summary-statistics approach suffices

Paul Alexander Bloom, Monica Kim Ngan Thieu, Niall Bolger
PMCID: PMC9846465  PMID: 36685767

McNabb and Murayama (2021) present a set of simulations demonstrating that, under certain circumstances, simpler models based on summary statistics perform equivalently to multilevel models (MLM). We commend the clarification that the summary-statistics approach does not automatically result in a loss of statistical power, and we agree that this approach can be useful in some cases in neuroimaging (Mumford and Nichols, 2009).

However, the authors' conclusion that the summary-statistics approach can do “an equally good or even better job in some situations” overstates the drawbacks of MLM and neglects its key advantages. Most importantly, the singular fit errors that the authors cite as common with MLM arise only in frequentist methods that use maximum likelihood estimation (e.g., lmer; Bates and Bolker, 2011). With Bayesian MLM methods that use Markov chain Monte Carlo computation, such singular fit or convergence errors largely do not occur. Bayesian multilevel regression models can be readily fit with syntax nearly identical to that of lmer using the brms (Bürkner, 2017) and rstanarm (Gabry et al., 2019) packages in R, or with the PyMC Python library (Salvatier et al., 2016). Thus, Bayesian MLM methods can alleviate the singular fit errors, and the associated problems with power and Type I error rates, that can occur with equivalent maximum-likelihood MLM methods.
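To make the term "singular fit" concrete, the block below writes out a generic two-level model with a random intercept and a random slope. It is a sketch only: the notation (i indexing observations, j indexing clusters, and the symbols β, u, τ, ρ, σ) and the example priors (with hypothetical hyperparameters s and η) are assumptions introduced here for illustration and are not taken from McNabb and Murayama (2021) or from any package's defaults.

```latex
\begin{aligned}
y_{ij} &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,x_{ij} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim \mathcal{N}\!\left(\mathbf{0},\;
  \begin{pmatrix} \tau_0^2 & \rho\,\tau_0\tau_1 \\
                  \rho\,\tau_0\tau_1 & \tau_1^2 \end{pmatrix}\right), \\
\text{singular fit:}\quad
  & \hat{\tau}_0 = 0 \;\text{ or }\; \hat{\tau}_1 = 0 \;\text{ or }\; |\hat{\rho}| = 1, \\
\text{illustrative priors:}\quad
  & \tau_0,\ \tau_1 \sim \text{Half-Normal}(0, s), \qquad \rho \sim \text{LKJ}(\eta).
\end{aligned}
```

Under maximum likelihood, the point estimate of the random-effects covariance matrix can land on the boundary of the parameter space (the third line), which is what a singular fit reports. A Bayesian fit instead summarizes a posterior distribution over τ₀, τ₁, and ρ, and priors of the general kind shown in the last line place little or no mass exactly on that boundary.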

Further, although in certain circumstances the summary-statistics approach returns results equivalent to MLM fixed effects (Murayama et al., 2022), it can be problematic in cases of unequal cluster sizes or unequal cluster variances (Gelman and Hill, 2006). The summary-statistics approach tends to overweight summary statistics from smaller clusters in estimating fixed effects (Gelman, 2006). In contrast, the partial pooling of variances across clusters within the MLM framework can improve estimation of uncertainty and model predictive performance (van Boekel, 2021).
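The weighting issue can be illustrated with a minimal sketch. It compares the equal per-cluster weights implied by an unweighted mean of cluster means with precision weights proportional to 1 / (τ² + σ²/n_j), which is the form the weights take for a random-intercept model when the variance components are treated as known. The cluster sizes, cluster means, and variance values are made-up numbers chosen for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only: all numbers below are hypothetical and are not
# taken from McNabb and Murayama (2021) or from this commentary.


def equal_weights(cluster_sizes):
    """Weights implied by an unweighted mean of cluster means (1/J each)."""
    n_clusters = len(cluster_sizes)
    return [1.0 / n_clusters for _ in cluster_sizes]


def precision_weights(cluster_sizes, tau2, sigma2):
    """Normalized weights proportional to 1 / (tau2 + sigma2 / n_j).

    tau2 is the between-cluster variance and sigma2 the within-cluster
    variance; both are assumed known here to keep the example short.
    """
    raw = [1.0 / (tau2 + sigma2 / n) for n in cluster_sizes]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(values, weights):
    """Weighted average of values; weights are assumed to sum to 1."""
    return sum(v * w for v, w in zip(values, weights))


cluster_sizes = [2, 5, 50, 200]        # very unequal cluster sizes
cluster_means = [3.0, 1.5, 1.0, 0.9]   # the smallest cluster has a noisy mean
tau2, sigma2 = 0.25, 4.0               # assumed variance components

w_equal = equal_weights(cluster_sizes)
w_precision = precision_weights(cluster_sizes, tau2, sigma2)

estimate_equal = weighted_mean(cluster_means, w_equal)
estimate_precision = weighted_mean(cluster_means, w_precision)

print("weight on smallest cluster (equal):    ", round(w_equal[0], 3))
print("weight on smallest cluster (precision):", round(w_precision[0], 3))
print("equal-weight estimate:    ", round(estimate_equal, 3))
print("precision-weight estimate:", round(estimate_precision, 3))
```

In this toy setup the two-observation cluster receives the same weight as the 200-observation cluster under the unweighted mean, whereas the precision weights discount it. In practice an MLM estimates the variance components rather than assuming them.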

Finally, although the approaches can be equivalent in detecting fixed or average effects, the summary-statistics approach is unable to estimate the extent of between-cluster variation in those effects. Unless within-cluster variance is explicitly modeled (for example, by using a “sufficient” summary-statistic approach; see Dowding and Haufe, 2018), between-cluster variation in effects will be overestimated. By contrast, MLM approaches give researchers a principled set of tools for understanding between-person and other sources of effect heterogeneity (Bolger et al., 2019; Chen et al., 2021; Koo and Li, 2016). When effects vary widely across participants, for example, methods that focus only on the average participant can be very misleading (DiGiovanni et al., 2021; Haaf and Rouder, 2019). Thus, summary-statistics approaches, by discarding information on random effects, can impede researchers’ understanding of the data-generating process and the development of adequate theories (Baayen et al., 2008). We therefore encourage the broad use of MLMs, particularly within a Bayesian framework, when possible.
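The overestimation of between-cluster variation can be seen in a small simulation. The sketch below draws a true effect for each cluster, estimates each effect from a handful of noisy trials, and then compares the raw variance of the per-cluster estimates with a simple method-of-moments correction that subtracts the average within-cluster sampling variance. This correction is a stand-in chosen for brevity; it is neither the sufficient-summary-statistic method of Dowding and Haufe (2018) nor a multilevel model. All parameter values and function names are hypothetical.

```python
import random
import statistics


def simulate_cluster_estimates(n_clusters, n_trials, tau, sigma, seed=0):
    """Simulate per-cluster effect estimates and their sampling variances.

    tau is the true SD of effects across clusters; sigma is the trial-level
    noise SD within a cluster. Both are assumed values for illustration.
    """
    rng = random.Random(seed)
    estimates, sampling_vars = [], []
    for _ in range(n_clusters):
        true_effect = rng.gauss(0.5, tau)
        trials = [rng.gauss(true_effect, sigma) for _ in range(n_trials)]
        estimates.append(statistics.fmean(trials))
        # Squared standard error of this cluster's mean.
        sampling_vars.append(statistics.variance(trials) / n_trials)
    return estimates, sampling_vars


def naive_between_variance(estimates):
    """Variance of per-cluster estimates, ignoring within-cluster noise."""
    return statistics.variance(estimates)


def corrected_between_variance(estimates, sampling_vars):
    """Method-of-moments estimate: naive variance minus mean sampling variance."""
    corrected = statistics.variance(estimates) - statistics.fmean(sampling_vars)
    return max(0.0, corrected)


true_tau = 1.0
estimates, sampling_vars = simulate_cluster_estimates(
    n_clusters=2000, n_trials=5, tau=true_tau, sigma=2.0, seed=1
)
naive = naive_between_variance(estimates)
corrected = corrected_between_variance(estimates, sampling_vars)

print("true between-cluster variance:", true_tau ** 2)
print("naive estimate:               ", round(naive, 3))
print("corrected estimate:           ", round(corrected, 3))
```

The naive variance mixes true heterogeneity (τ²) with sampling noise (σ²/n), so it is inflated whenever clusters contain few trials. An MLM separates the two components within a single model and also provides uncertainty estimates for each.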

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Contributor Information

Paul Alexander Bloom, Email: paul.bloom@columbia.edu.

Monica Kim Ngan Thieu, Email: monica.thieu@columbia.edu.

Niall Bolger, Email: nb2229@columbia.edu.

References

  1. Baayen R.H., Davidson D.J., Bates D.M. Mixed-effects modeling with crossed random effects for subjects and items. J. Mem. Lang. 2008;59(4):390–412. doi: 10.1016/j.jml.2007.12.005.
  2. Bates D., Bolker M.M., B. 2011. lme4: Linear Mixed-Effects Models Using S4 Classes (0.999375-39). http://www.idg.pl/mirrors/CRAN/web/packages/lme4/ [Computer software]
  3. Bolger N., Zee K.S., Rossignac-Milon M., Hassin R.R. Causal processes in psychology are heterogeneous. J. Exp. Psychol. Gen. 2019;148(4):601–618. doi: 10.1037/xge0000558.
  4. Bürkner P.-C. 2017. Brms: Bayesian Regression Models Using Stan (1.6.1). https://cran.r-project.org/web/packages/brms/index.html [Computer software]
  5. Chen G., Pine D.S., Brotman M.A., Smith A.R., Cox R.W., Haller S.P. Trial and error: a hierarchical modeling approach to test-retest reliability. Neuroimage. 2021;245 doi: 10.1016/j.neuroimage.2021.118647.
  6. DiGiovanni A., Vannucci A., Ohannessian C.M., Bolger N. 2021. Modeling Heterogeneity in the Simultaneous Emotional Costs and Social Benefits of Co-Rumination.
  7. Dowding I., Haufe S. Powerful statistical inference for nested data using sufficient summary statistics. Front. Hum. Neurosci. 2018;12 doi: 10.3389/fnhum.2018.00103. https://www.frontiersin.org/article/10.3389/fnhum.2018.00103
  8. Gabry J., Ali I., Brilleman S., Novik J.B., Wood S., Development R.C., Bates D., Maechler M., Bolker B., Walker S., Burkner P.-C., Ripley B., Venables W., Goodrich B., AstraZeneca. CRAN; 2019. Rstanarm: Bayesian Applied Regression Modeling via Stan (2.19.2). https://CRAN.R-project.org/package=rstanarm [Computer software]
  9. Gelman A. Multilevel (hierarchical) modeling: what it can and cannot do. Technometrics. 2006;48(3):432–435. doi: 10.1198/004017005000000661.
  10. Gelman A., Hill J. 1 edition. Cambridge University Press; 2006. Data Analysis Using Regression and Multilevel/Hierarchical Models.
  11. Haaf J.M., Rouder J.N. Some do and some don't? Accounting for variability of individual difference structures. Psychon. Bull. Rev. 2019;26(3):772–789. doi: 10.3758/s13423-018-1522-x.
  12. Koo T.K., Li M.Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine. 2016;15(2):155–163. doi: 10.1016/j.jcm.2016.02.012.
  13. McNabb C.B., Murayama K. Unnecessary reliance on multilevel modelling to analyse nested data in neuroscience: when a traditional summary-statistics approach suffices. Current Research in Neurobiology. 2021;2 doi: 10.1016/j.crneur.2021.100024.
  14. Mumford J.A., Nichols T. Simple group fMRI modeling and inference. Neuroimage. 2009;47(4):1469–1475. doi: 10.1016/j.neuroimage.2009.05.034.
  15. Murayama K., Usami S., Sakaki M. Summary-statistics-based power analysis: a new and practical method to determine sample size for mixed-effects modeling. Psychol. Methods. 2022 doi: 10.1037/met0000330.
  16. Salvatier J., Wiecki T.V., Fonnesbeck C. Probabilistic programming in Python using PyMC3. PeerJ Computer Science. 2016;2:e55. doi: 10.7717/peerj-cs.55.
  17. van Boekel M.A.J.S. To pool or not to pool: that is the question in microbial kinetics. Int. J. Food Microbiol. 2021;354 doi: 10.1016/j.ijfoodmicro.2021.109283.

Articles from Current Research in Neurobiology are provided here courtesy of Elsevier
