Author manuscript; available in PMC: 2022 Apr 11.
Published in final edited form as: J Proteome Res. 2020 Sep 25;20(1):1–13. doi: 10.1021/acs.jproteome.0c00123

Table 1.

Statistical Approaches to Impute Missing Intensity Values

| Method | Description | R package | Method reference | R parameters |
|---|---|---|---|---|
| BPCA | Bayesian principal component analysis: the posterior distributions of the model parameters and the missing values are estimated using a variational Bayes algorithm | pcaMethods [24] (Bioconductor) | Oba et al. [25] | nPcs = 3, method = "bpca" |
| EM | Expectation maximization: the observed data are used to estimate missing data via penalized likelihood expectation maximization | PEMM [26] v 1.0 (CRAN) | Chen et al. [27] | phi = 0 |
| IRMI | Iterative robust model-based imputation: each peptide with missing values is iteratively used as the response variable in a linear regression, while the remaining peptides are used as explanatory variables | VIM [28] v 5.1.0 (CRAN) | Templ et al. [29] | |
| kNN | k-nearest neighbors: values are imputed using a weighted average intensity of the k most similar peptides | VIM [28] v 5.1.0 (CRAN) | Kowarik et al. [28] | k = 5 |
| LLS | Local least-squares: the missing values are imputed based on locally weighted linear least-squares regression | imputation [30] v 2.0.1, leveraging locfit [31] v 1.5-9.1 (GitHub) | Loader [32] | |
| MEAN | Mean replacement: missing values are filled in with the mean observed value for the respective peptide | | | |
| MICE | Multivariate imputation by chained equations: multiple imputation method that replaces missing values by predictive mean matching | mice [33] v 3.8.0 (CRAN) | Little [34] | m = 5 |
| PCA | Principal component analysis: runs PCA, imputes the missing values with the regularized reconstruction formulas, and repeats until convergence | missMDA [35] v 1.16.0 (CRAN) | Josse et al. [36] | ncp = 3 |
| RF | Random forest: nonparametric method that imputes missing values using a random forest trained on the observed parts of the data set, repeated iteratively until convergence | missForest [37] v 1.4 (CRAN) | Stekhoven et al. [38] | ntree = 100 |
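
To make the two simplest entries in Table 1 concrete, the following is a minimal Python sketch of mean replacement and of a kNN-style weighted average. It is an illustration under stated assumptions and not the VIM implementation used in the study: the function names (`mean_impute`, `knn_impute`), the root-mean-square distance over co-observed samples, and the inverse-distance weights are all hypothetical choices made here. Rows are assumed to be peptides, columns samples, and missing intensities are encoded as `None`.

```python
import math


def mean_impute(matrix):
    """Fill each missing value (None) with the mean of the observed
    values in the same row (peptide). Rows with no observed values
    are left unchanged."""
    out = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        fill = sum(observed) / len(observed) if observed else None
        out.append([fill if v is None else v for v in row])
    return out


def _distance(a, b):
    """Root-mean-square difference over samples observed in both rows.
    Returns infinity when the rows share no observed sample.
    (Assumed metric, chosen for illustration only.)"""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def knn_impute(matrix, k=5):
    """Fill each missing value with an inverse-distance-weighted average
    of that sample's intensity in the k most similar peptides.
    Values with no usable neighbor stay None."""
    out = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other peptides observed in sample j.
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if math.isfinite(d):
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if not nearest:
                continue
            # Small constant avoids division by zero for identical rows.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            total = sum(w * v for w, (_, v) in zip(weights, nearest))
            out[i][j] = total / sum(weights)
    return out


# Toy log-intensity matrix (made-up numbers): 3 peptides x 4 samples.
intensities = [
    [20.0, 21.0, None, 22.0],
    [20.1, 21.1, 21.5, 22.1],
    [15.0, 15.5, 16.0, 16.5],
]
print(mean_impute(intensities)[0])
print(knn_impute(intensities, k=2)[0])
```

In this toy example the kNN estimate for the missing value is pulled toward the second peptide, whose profile closely tracks the first, whereas mean replacement ignores the other peptides entirely. The value k = 5 listed in Table 1 is kept as the default; the example passes k = 2 only because the toy matrix has three rows.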