2020 Aug 25;20:790. doi: 10.1186/s12913-020-05660-1

Table 3.

Strategies applied in research articles to counter issues of RHIS data

Type of strategy	Description of strategy
*Missing data*
Exclusion	Exclude facility data if missing data reached a certain threshold (e.g. more than two-thirds of months in a year; more than a sixth of baseline data; facilities with any missing data)
	Restrict the analysis to a period with a low level of missing data
	Sensitivity analysis comparing the restricted period with the full period
Imputation	Replace missing observations with the mean value for the year
	Replace missing observations with the average of the preceding and subsequent observations
	Imputation using a conditional autoregressive model
	Missing values were coded as positive (in binary form) to avoid exaggerating the fade-out effect
	Sensitivity analysis of imputation strategies: 1) single imputation using means, trimmed means, and median, 2) Poisson generalized linear modeling, 3) iterative singular value decomposition method
Interpolation	Interpolation using space-time kriging
	Adjust results by dividing each indicator by the percentage of reports submitted
	Adjust the data by calibrating to the total population, using the proportion reported in a household survey to have occurred in health facilities
Verification	Manual verification of the missing data against the register at the health facility
Account in the modeling method	Missing data was assumed missing at random and accounted for in the mixed-effect models using standard maximum likelihood estimation
*Identifying extreme values*
Specific threshold	Establishing a lower and upper limit based on proportion of the annual average or feasible value
Specific threshold	Univariate regression at the individual facility level to identify deviations from the mean time trend (e.g. exceeding 8 standard deviations)
Visual	Visual inspection of outliers
Analytic assessment	Jackknifing analysis to assess influence
Analytic assessment	Studentized residual with an absolute value greater than 2, and influence on the estimated coefficients indicated by a high Cook’s distance
*Handling of extreme values*
Exclusion	Extreme values were excluded from analyses
Replacing extreme value with average	Extreme values were assigned the average value for the year, except where the average value was low
Replacing extreme value with missing	Outliers set to missing
Verification with data source	Any drastic change in electronically reported monthly data was manually verified against the register at the health facility. Discrepancies were replaced with the data in the register
Discount observation in estimation	Outliers were dummy-coded to discount the observation in the calculation of coefficients
*Assess reliability*
Data validation process	Randomly selected 10% of the total sample to check the accuracy and reliability of the data against reports and registers
	Verify data with another source (e.g. payroll)
	Established a routine data validation process by the health information and records officer (e.g. monthly data review meetings)
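
Several of the simpler strategies in the table can be illustrated with a short sketch. The Python below is an illustration only, not the method of any reviewed article. It assumes one facility's monthly counts for a single year are held in a list, with `None` marking a missing report. All function names are hypothetical, and the 0.5 and 2.0 multipliers for the lower and upper limits are arbitrary placeholders, not thresholds taken from the source.

```python
from statistics import mean
from typing import List, Optional

Series = List[Optional[float]]  # monthly counts; None marks a missing report


def impute_year_mean(counts: Series) -> List[float]:
    """Replace missing observations with the mean of the observed values."""
    observed = [c for c in counts if c is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if c is None else c for c in counts]


def impute_neighbour_mean(counts: Series) -> Series:
    """Replace a missing value with the average of the preceding and
    subsequent observations; leave it missing if either neighbour is absent."""
    out = list(counts)
    for i in range(1, len(counts) - 1):
        prev, nxt = counts[i - 1], counts[i + 1]
        if counts[i] is None and prev is not None and nxt is not None:
            out[i] = (prev + nxt) / 2
    return out


def flag_extremes(counts: Series, lower: float = 0.5, upper: float = 2.0) -> List[bool]:
    """Flag values outside [lower * annual average, upper * annual average].
    The multipliers are illustrative assumptions."""
    observed = [c for c in counts if c is not None]
    if not observed:
        return [False] * len(counts)
    avg = mean(observed)
    return [c is not None and not (lower * avg <= c <= upper * avg) for c in counts]


def extremes_to_missing(counts: Series, flags: List[bool]) -> Series:
    """Set flagged extreme values to missing."""
    return [None if flagged else c for c, flagged in zip(counts, flags)]


def adjust_for_reporting(indicator: float, share_reported: float) -> float:
    """Divide an indicator by the proportion of reports submitted (0 < share <= 1)."""
    if not 0 < share_reported <= 1:
        raise ValueError("share_reported must be in (0, 1]")
    return indicator / share_reported
```

A typical use would chain the steps: flag extreme values, set them to missing, then impute, for example `impute_neighbour_mean(extremes_to_missing(counts, flag_extremes(counts)))`. Because the annual average is itself inflated by extreme values, a median-based limit may be more robust in practice; the average is used here only because it is what the table describes.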