. 2019 Jun 7;9:162. doi: 10.1038/s41398-019-0484-8

Table 2.

Approaches to handling missing data

Method	Description	Limitations
Replacement with mean or median	Inserts the mean or median of a variable's observed values in place of its missing values	Reduces the variance
Last observation carried forward/back	Inserts the last observation in place of the missing data points	Ignores existing trends in the data; reduces variance; weakens covariance and correlations
Linear interpolation	Assumes a linear relationship between two points and uses non-missing values from adjacent points to compute a value for the missing data points	Inappropriate in oscillatory data
Regression substitution	Predicts the most likely value of the missing data	May overestimate model fit; does not quantify uncertainty about that value; reduces variance
Maximum likelihood estimation	Identifies a likely set of values based on the observed data. The maximum likelihood estimate of a parameter is the value of the parameter that is most likely to have resulted in the observed data	Limited to linear models
Multiple imputation	Plausible values that reflect uncertainty are generated for the missing observations and used to impute them. This process is repeated to create a number of 'completed' datasets. Each of these datasets is analyzed separately. The results are then combined, allowing the uncertainty of the imputation to be taken into account	Complex to employ; choosing the correct model can be difficult
Dimensionality	Views missing data as an additional dimension within the data	Complex to employ. May be difficult to interpret
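
The three simplest approaches in the table (mean replacement, last observation carried forward, and linear interpolation) can be illustrated with a minimal sketch. This is not code from the article: the function names, the use of `None` to mark a missing value, and the example series are all assumptions made for illustration, and the interpolation assumes equally spaced observations.

```python
from statistics import mean, pvariance


def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def locf(values):
    """Last observation carried forward; gaps before the first observation stay None."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def linear_interpolate(values):
    """Fill interior gaps on a straight line between the nearest observed neighbours.

    Assumes equally spaced points; leading and trailing gaps are left as None.
    """
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out


# Hypothetical series with two missing points (made-up data).
series = [1.0, 2.0, None, None, 5.0, 6.0]

print(mean_impute(series))         # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0]
print(locf(series))                # [1.0, 2.0, 2.0, 2.0, 5.0, 6.0]
print(linear_interpolate(series))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Mean replacement shrinks the variance relative to the observed values alone.
observed = [v for v in series if v is not None]
print(pvariance(observed), pvariance(mean_impute(series)))
```

In this toy series the mean-imputed data have a smaller variance than the observed values, and carrying the last observation forward flattens the upward trend that linear interpolation preserves, which matches the limitations listed in the table.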