2020 Aug 12;20:208. doi: 10.1186/s12874-020-01089-6

Table 1.

Methods for unbiased analysis under MCAR/MAR/MNAR in clinical trials (not an exhaustive listing of options)

Assumption | Method for unbiased estimation | Comments
MCAR | Any complete case analysis | Easy to implement but may not use all the available information in the data. Excludes participants with any missing data.
MAR | Complete case analysis incorporating all variables associated with both outcome and missingness | Easy to implement but may not use all the available information in the data. Excludes participants with any missing data. Generally cannot incorporate post-randomisation data.
MAR | Mixed model incorporating all variables associated with both outcome and missingness, which may include post-randomisation variables (e.g. for a longitudinal trial, a mixed model for repeated measures, MMRM) | Earlier response data can be incorporated in the analysis to strengthen/justify MAR. Includes all observed data on each participant. Additional post-randomisation data that are predictive of both missingness and outcome, and that are needed to justify MAR but that the treatment estimate (estimand) should not be conditioned on, can also be included. However, careful model specification is required to do so (e.g. any post-randomisation variables must be included as additional responses in the model with separate means for each treatment group; for detailed guidance see [7]).
MAR | Multiple Imputation incorporating all variables associated with both outcome and missingness, which may include post-randomisation variables | Closely approximates a complete case/mixed model analysis when the variables included in the imputation and analysis models match those in the complete case/mixed model analysis. The imputation model must include, as a minimum, all variables included in the analysis model [16]. Provides a convenient analysis method when conditioning on particular variables is required to justify/strengthen a MAR assumption, but conditioning on these in the analysis is not required/appropriate. This is because variables that are predictive of both outcome and missingness, but that the treatment effect (estimand) should not be adjusted for, can be included in the imputation model and not in the subsequent analysis model.
MNAR | Selection models: Consist of a model for the outcome data and a model for the occurrence of missing data | For example, may include a logistic model for the log odds of response and how this depends on the unobserved outcome. Expert/clinical knowledge is required to inform how the log odds of response depends on the unobserved outcome. Can be fitted using maximum likelihood, or within a Bayesian framework [8].
MNAR | Pattern mixture models: Consist of a model for the outcome data for each missing data pattern | Expert/clinical knowledge is required to inform how the outcome data distribution varies for each missing data pattern. Can be fitted using maximum likelihood, within a Bayesian framework, or within a multiple imputation framework, where the approach is referred to as ‘Controlled Multiple Imputation’ [8, 15].
MNAR | Controlled Multiple Imputation: Combines pattern mixture modelling and multiple imputation (e.g. delta-based or reference-based Multiple Imputation) | Data are imputed multiple times from a pattern mixture model. The analyst has direct ‘control’ over the imputation distribution. For example, in delta-based imputation a numerical offset term, delta, is typically added to the expected value of the missing data to assess the impact of unobserved participants having a worse or better response than those observed. Reference-based imputation draws imputed values with some reference to the observed data in other groups of the trial, typically in other treatment arms. Different MNAR assumptions can be made for different groups of individuals in the same trial analysis by postulating different distributions for imputation, or MAR and MNAR assumptions may be made for different groups [17].

For any trial analysis, care should be taken to ensure that an appropriate method is chosen that handles all missing data appropriately for the treatment estimand of interest.
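
As an illustration of the delta-based form of controlled multiple imputation described in Table 1, the following is a minimal sketch in Python (NumPy only). It is not the method of any particular package or of the cited references. It assumes a two-arm trial with a single continuous outcome, missing values coded as NaN, a per-arm normal imputation model with no covariates, and a difference in means as the analysis model. The function names `delta_based_mi` and `rubin_pool`, the `delta_by_arm` argument, and the simulated toy data are all hypothetical. Setting every delta to zero gives an MAR-type analysis within arm; a non-zero delta shifts the imputed values to represent an MNAR assumption.

```python
import numpy as np


def rubin_pool(estimates, variances):
    """Combine per-imputation estimates and variances with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()            # pooled point estimate
    within = variances.mean()            # average within-imputation variance
    between = estimates.var(ddof=1)      # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total


def delta_based_mi(y, arm, delta_by_arm, n_imputations=20, seed=0):
    """Delta-based multiple imputation for a continuous outcome (NaN = missing).

    Assumption for this sketch: within each arm the observed outcomes are
    modelled as normal, and imputed values are shifted by that arm's delta.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    arm = np.asarray(arm)
    estimates, variances = [], []
    for _ in range(n_imputations):
        completed = y.copy()
        for a in (0, 1):
            in_arm = arm == a
            observed = y[in_arm & ~np.isnan(y)]
            missing = in_arm & np.isnan(y)
            n_obs = observed.size
            # Draw the variance and mean from their posterior (non-informative
            # prior) so that parameter uncertainty is propagated.
            sigma2 = (n_obs - 1) * observed.var(ddof=1) / rng.chisquare(n_obs - 1)
            mu = rng.normal(observed.mean(), np.sqrt(sigma2 / n_obs))
            draws = rng.normal(mu, np.sqrt(sigma2), size=int(missing.sum()))
            # The MNAR step: offset the imputed values by delta for this arm.
            completed[missing] = draws + delta_by_arm.get(a, 0.0)
        y1, y0 = completed[arm == 1], completed[arm == 0]
        estimates.append(y1.mean() - y0.mean())
        variances.append(y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size)
    return rubin_pool(estimates, variances)


# Toy data (simulated purely for illustration): 50 participants per arm,
# with roughly 20% of outcomes set to missing.
sim = np.random.default_rng(1)
arm = np.repeat([0, 1], 50)
y = sim.normal(10 + 2 * arm, 3)
y[sim.random(100) < 0.2] = np.nan

# Sensitivity analysis: the same analysis under a range of deltas applied to
# the missing outcomes in the active arm only.
for delta in (0.0, -1.0, -2.0, -5.0):
    est, var = delta_based_mi(y, arm, {0: 0.0, 1: delta})
    print(f"delta={delta:+.1f}  estimate={est:.3f}  SE={np.sqrt(var):.3f}")
```

In practice the imputation model would usually also include baseline covariates and any earlier outcome measurements, and the range of delta values would be informed by expert/clinical knowledge, as the table notes. Reference-based imputation would replace the delta offset with draws based on the observed data in another arm.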