Applied Psychological Measurement. 2025 Jan 26;49(4-5):243–244. doi: 10.1177/01466216251316275

R Package for Calculating Estimators of the Proportion of Explained Variance and Standardized Regression Coefficients in Multiply Imputed Datasets

Joost R van Ginkel, Julian D Karch
PMCID: PMC11770685  PMID: 39877663

Description

Multiple imputation (Rubin, 1987) has become a widely recommended way to deal with missing data. The procedure consists of three steps: (1) the missing data are estimated multiple times, which results in multiple plausible complete versions of the incomplete dataset, (2) each completed dataset is analyzed using the intended statistical analysis, and (3) the results of these analyses are combined into a pooled analysis using specific combination rules. Until recently, no combination rules had been proposed for standardized regression coefficients and the proportion of explained variance, R², in multiple regression analysis. Van Ginkel (2020) proposed combination rules for both of these statistics, along with combination rules for confidence intervals for standardized regression coefficients. Additionally, Van Ginkel and Karch (2024) extended the combination rules for R² by Van Ginkel (2020) to 10 alternative estimators of the proportion of explained variance, including the Ezekiel estimator (better known as adjusted R²), the Smith estimator, and the Wherry estimator.
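
As a rough illustration of step (3), the sketch below applies the general combination rules of Rubin (1987) to a scalar estimate and its sampling variance. It is not the set of rules for R² or standardized regression coefficients from Van Ginkel (2020), which are not reproduced here. The function name `pool_rubin` and the input numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (Rubin, 1987).

    Hypothetical helper for illustration; not part of any package.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, t


# Made-up results from m = 5 imputed datasets (illustration only)
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
variances = [0.010, 0.012, 0.011, 0.009, 0.008]

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {total_var ** 0.5:.3f}")
```

The total variance adds a between-imputation term to the average within-imputation variance, so the pooled standard error reflects the extra uncertainty caused by the missing data.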

To calculate the above statistics for multiply imputed datasets, we developed an R (R Core Team, 2024) package in which the combination rules for these statistics are included. By default, the procedure displays the pooled estimator for R² by Van Ginkel (2020), the pooled multiple correlation (the square root of R²), adjusted R², and the pooled standardized regression coefficients by Van Ginkel (2020). Optionally, the procedure also displays confidence intervals for the pooled standardized regression coefficients, the pooled zero-order correlations of the predictor variables with the outcome variable, and the pooled alternative estimators of the proportion of explained variance.

Availability

The R package RSquaredMI can be obtained from the Comprehensive R Archive Network (CRAN) repository at https://cran.r-project.org/web/packages/RSquaredMI/. The manual is available at https://cran.r-project.org/web/packages/RSquaredMI/RSquaredMI.pdf. The development version of the package is hosted on GitHub: https://github.com/karchjd/RsquaredMI.

Footnotes

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Joost R. van Ginkel https://orcid.org/0000-0002-4137-0943

References

  1. R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available at: https://www.R-project.org/
  2. Rubin D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley.
  3. Van Ginkel J. R. (2020). Standardized regression coefficients and newly proposed estimators for R² in multiply imputed data. Psychometrika, 85(1), 185–205. doi: 10.1007/s11336-020-09696-4
  4. Van Ginkel J. R., Karch J. D. (2024). A comparison of different measures of the proportion of explained variance in multiply imputed data sets. British Journal of Mathematical and Statistical Psychology, 77(1), 672–693. doi: 10.1111/bmsp.12344

Articles from Applied Psychological Measurement are provided here courtesy of SAGE Publications
