Description
Multiple imputation (Rubin, 1987) has become a widely recommended way to deal with missing data. The procedure consists of three steps: (1) the missing values are imputed multiple times, which yields multiple plausible complete versions of the incomplete dataset, (2) each completed dataset is analyzed using the intended statistical analysis, and (3) the results of these analyses are combined into a single pooled analysis using specific combination rules. Until recently, no combination rules had been proposed for standardized regression coefficients or for the proportion of explained variance, R², in multiple regression analysis. Van Ginkel (2020) proposed combination rules for both of these statistics, along with combination rules for confidence intervals for standardized regression coefficients. Van Ginkel and Karch (2024) subsequently extended the combination rules for R² by Van Ginkel (2020) to 10 alternative estimators of the proportion of explained variance, including the Ezekiel estimator (better known as R²-adjusted), the Smith estimator, and the Wherry estimator.
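To make step (3) concrete, the following minimal sketch pools a single scalar estimate with the standard combination rules of Rubin (1987). It is illustrative only: it is written in Python rather than R, the function name pool_rubin and the input numbers are hypothetical, and it does not implement the rules of Van Ginkel (2020) for R² or standardized coefficients.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m estimates and their squared standard errors with Rubin's rules.

    Hypothetical helper for illustration; not part of any package.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divides by m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, t


# Made-up regression coefficients and squared standard errors
# from m = 5 completed datasets (assumed values, not real results).
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
variances = [0.010, 0.012, 0.011, 0.009, 0.008]

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {total_var ** 0.5:.3f}")
```

The total variance adds the between-imputation spread, inflated by (1 + 1/m), to the average sampling variance, so the pooled standard error reflects the extra uncertainty due to the missing data.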
To calculate the above statistics for multiply imputed datasets, we developed an R package (R Core Team, 2024) that implements these combination rules. By default, the procedure displays the pooled estimator for R² by Van Ginkel (2020), the pooled multiple correlation (the square root of R²), R²-adjusted, and the pooled standardized regression coefficients by Van Ginkel (2020). Optionally, the procedure also displays confidence intervals for the pooled standardized regression coefficients, the pooled zero-order correlations of the predictor variables with the outcome variable, and the pooled alternative estimators of the proportion of explained variance.
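As a rough illustration of how two of the default quantities relate to R², the sketch below computes the multiple correlation and the Ezekiel (adjusted) estimator from a single R² value. The function name and the input values are hypothetical, and the sketch says nothing about how the package pools these quantities across imputed datasets.

```python
import math


def ezekiel_adjusted_r2(r2, n, p):
    """Ezekiel estimator: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2: sample R-squared, n: sample size, p: number of predictors.
    Hypothetical helper for illustration only.
    """
    if n - p - 1 <= 0:
        raise ValueError("n must exceed p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Assumed example values, not taken from any real analysis.
r2, n, p = 0.40, 100, 4

multiple_r = math.sqrt(r2)                # multiple correlation
adj_r2 = ezekiel_adjusted_r2(r2, n, p)    # adjusted R-squared
print(f"R = {multiple_r:.3f}, adjusted R2 = {adj_r2:.3f}")
```

The adjustment shrinks R² more strongly when the number of predictors is large relative to the sample size.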
Availability
The R package RSquaredMI can be obtained from the Comprehensive R Archive Network (CRAN) repository at https://cran.r-project.org/web/packages/RSquaredMI/. The manual is available at https://cran.r-project.org/web/packages/RSquaredMI/RSquaredMI.pdf. The development version of the package is hosted on GitHub: https://github.com/karchjd/RsquaredMI.
Footnotes
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.
ORCID iD
Joost R. van Ginkel https://orcid.org/0000-0002-4137-0943
References
- R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available at: https://www.R-project.org/
- Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley.
- Van Ginkel, J. R. (2020). Standardized regression coefficients and newly proposed estimators for R² in multiply imputed data. Psychometrika, 85(1), 185–205. https://doi.org/10.1007/s11336-020-09696-4
- Van Ginkel, J. R., & Karch, J. D. (2024). A comparison of different measures of the proportion of explained variance in multiply imputed data sets. British Journal of Mathematical and Statistical Psychology, 77(1), 672–693. https://doi.org/10.1111/bmsp.12344
