2015 Jun 10;34(25):3298–3317. doi: 10.1002/sim.6553

Table A.1.

Possible strategies for imputation and model building with pros, cons and recommendations in light of results.

| Stage | Possible approach | Pros | Cons | Practical advice |
| --- | --- | --- | --- | --- |
| Imputation | JAV | Unbiased for linear models with data MCAR. | Biased in all other settings. | Avoid |
| | PMM | Ease of implementation; some ability to model nonlinear associations. | Performance degrades under strong MAR mechanisms. | Possibly useful for exploratory analysis. |
| | SMC FCS | Good approach if the analysis model is known, for example when validating a prognostic model. | Unclear how best to proceed when the analysis model is to be developed from multiply imputed data. | Consider using |
| | Draw FP1 exponents via ABB | Good approach if the highest dimension of FP considered is FP1. | Does not extend beyond FP1. Predictive mean matching can be incorporated for further flexibility but comes with the aforementioned caution. With complete binary or continuous covariates, the search over the parameter space of p becomes computationally infeasible. | Consider using |
| Estimation of p | Log-likelihood in complete records | Reflects how MFP models are built in complete data. May be adequate with a small fraction of incomplete records and could be followed by SMC FCS to impute for the selected model. | Does not use incomplete records, leading to bias in estimates of $\hat{p}$ and $\hat{\beta} \mid \hat{p}$ under departures from MCAR. | Avoid unless there are few incomplete records |
| | Log-likelihood in MI data | Reflects how MFP models are built in complete data but uses MI data. | Small bias in $\hat{p}$. | Consider using |
| | Wald statistics | Typically used in MI data where likelihoods do not have the same meaning. | Very small bias in $\hat{p}$ (less than using the log-likelihood in MI data). | Consider using |
| Selection of D | Likelihood-ratio tests on complete records | Type I error rate well controlled. May be adequate with a small fraction of incomplete records and could be followed by SMC FCS to impute for the selected model. | Estimates of p and β are biased. Power is lower than any alternative method. | Avoid unless there are few incomplete records |
| | Weighted likelihood-ratio tests on stacked MI data [22] | Standard approach to building MFP models in complete data. Superior power to complete records and less biased. | Approximation for the fraction of missing information may be wrong. Type I error rate less well controlled than analysis of complete records. | Consider |
| | Wald and ΔWald tests on MI data | The standard approach to testing in multiply imputed data. Better power and lower bias than complete records. | No theoretical basis for ΔWald. Type I error less well controlled than analysis of complete records. | Consider |
| | Meng and Rubin [23] | Does not require access to full covariance matrix. | Computational complexity and extremely low power. | Avoid |
| | Robins and Wang [33] | Provides consistent variance estimation even when the imputation and analysis models are incompatible. | Impractical. Requires a different approach to imputation. Implementation is extremely complex for all but the simplest settings and is infeasible for MFP. | Avoid |

ABB, approximate Bayesian bootstrap; JAV, just another variable; PMM, predictive mean matching; SMC FCS, substantive model compatible fully conditional specification; FP, fractional polynomial; MCAR, missing completely at random; MAR, missing at random; MFP, multivariable FP; MI, multiple imputation.
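
The Wald-based rows of Table A.1 rely on combining per-imputation estimates into a single estimate and variance. As a rough illustration only, the sketch below pools a scalar coefficient across M imputed data sets with Rubin's rules and forms a Wald statistic from the result. The function name `pool_rubin` and the input numbers are hypothetical, and the sketch does not reproduce the paper's procedure for choosing FP powers or its ΔWald test.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool a scalar estimate across M imputed data sets (Rubin's rules).

    estimates: per-imputation point estimates of one coefficient
    variances: per-imputation squared standard errors of that coefficient
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates with matching variances")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor M - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate

    # Wald statistic for the null hypothesis that the coefficient is zero
    wald = q_bar ** 2 / total
    return {"estimate": q_bar, "total_variance": total, "wald": wald}


# Hypothetical inputs: one coefficient estimated in M = 5 imputed data sets
result = pool_rubin(
    estimates=[0.42, 0.39, 0.47, 0.41, 0.44],
    variances=[0.010, 0.011, 0.009, 0.010, 0.012],
)
print(result)
```

Under the assumption that one candidate model is fitted per imputed data set, a statistic of this kind could be computed for each candidate and compared across candidates. Reference distributions and degrees of freedom are deliberately left out of the sketch.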