2020 Aug 25;20:790. doi: 10.1186/s12913-020-05660-1

Table 3.

Strategies applied in research articles to counter issues of RHIS data

Type of strategy Description of strategy
Missing data
 Exclusion Exclude facility data if a certain threshold was reached (e.g. more than two-thirds of months in a year; more than a sixth of baseline data; facilities with any missing data)
Restrict analysis to a period with a low level of missing data
Sensitivity analysis to compare analysis of restricted period and full period
 Imputation Assign the mean value for the year to missing observations
Assign the average of the preceding and subsequent observations to missing observations
Imputation using conditional autoregressive model
Missing values were coded as positive (binary form) to avoid exaggerating the fade-out effect
Sensitivity analysis of imputation strategies: 1) single imputation using means, trimmed means, and median, 2) Poisson generalized linear modeling, 3) iterative singular value decomposition method
 Interpolation Interpolation using space-time kriging
Adjust results by dividing each indicator by the percentage of reports submitted
Adjust the data by calibrating to the total population using proportion reported in a household survey to have occurred in health facilities

 Verification Manual verification of the missing data with register at the health facility
 Account in the modeling method Missing data was assumed missing at random and accounted for in the mixed-effect models using standard maximum likelihood estimation
Identifying extreme values
 Specific threshold Establishing a lower and upper limit based on proportion of the annual average or feasible value
Univariate regression at the individual facility level to identify deviations from the mean time trend (e.g. values exceeding 8 standard deviations)
 Visual Visual inspection of outliers
 Analytic assessment Jackknifing analysis to assess influence
Studentized residual with an absolute value greater than 2, with influence on the estimated coefficients indicated by a high Cook’s distance statistic
Handling of extreme values
 Exclusion Extreme values were excluded from analyses
 Replacing extreme value with average Extreme values were assigned the average value of the year, except where average values were low
 Replacing extreme value with missing Outliers set to missing
 Verification with data source Any drastic change in monthly data reported electronically was manually verified with the register at the health facility. Discrepant values were replaced with the data in the register
 Discount observation in estimation Outliers were assigned a dummy variable to discount the observation in the calculation of coefficients
Assess reliability
 Data validation process A randomly selected 10% of the total sample was checked against reports and registers for accuracy and reliability
Verify data with another source (e.g. payroll)
Routine data validation process established by health information and records officers (e.g. monthly data review meetings)
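
Several of the simpler strategies in Table 3 can be illustrated with a minimal Python sketch. It is not the method of any reviewed article: the monthly counts, the function names, and the 0.5x and 2x outlier limits are hypothetical, and only the two-thirds exclusion threshold echoes the example given in the table. The sketch covers exclusion by missingness, two single-imputation rules, adjustment for reporting completeness, and a threshold-based outlier flag.

```python
from statistics import mean

# Hypothetical monthly counts for one facility; None marks a missing report.
MONTHLY = [120, None, 130, 125, None, None, 140, 135, 900, 128, 132, 126]


def missing_fraction(series):
    """Share of months with no report."""
    return sum(v is None for v in series) / len(series)


def exclude_facility(series, threshold=2 / 3):
    """Exclusion: drop the facility if more than `threshold` of months are missing."""
    return missing_fraction(series) > threshold


def impute_annual_mean(series):
    """Imputation: assign the mean of the observed months to each missing month."""
    observed_mean = mean(v for v in series if v is not None)
    return [observed_mean if v is None else v for v in series]


def impute_neighbour_average(series):
    """Imputation: average the nearest observed values before and after each gap."""
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        before = next((x for x in reversed(series[:i]) if x is not None), None)
        after = next((x for x in series[i + 1:] if x is not None), None)
        neighbours = [x for x in (before, after) if x is not None]
        out[i] = mean(neighbours) if neighbours else None
    return out


def adjust_for_completeness(total, reports_received, reports_expected):
    """Adjustment: divide the indicator by the proportion of reports submitted."""
    return total / (reports_received / reports_expected)


def flag_extreme(series, lower=0.5, upper=2.0):
    """Specific threshold: flag values outside [lower, upper] x the annual average.

    The limits are assumed for illustration. The average here includes the
    extreme values themselves, which a real analysis may want to avoid.
    """
    avg = mean(v for v in series if v is not None)
    return [v is not None and not (lower * avg <= v <= upper * avg) for v in series]


def set_extreme_to_missing(series, flags):
    """Handling: replace flagged values with missing."""
    return [None if flagged else v for v, flagged in zip(series, flags)]


if __name__ == "__main__":
    print("exclude facility:", exclude_facility(MONTHLY))
    print("neighbour-average imputation:", impute_neighbour_average(MONTHLY))
    flags = flag_extreme(MONTHLY)
    print("after setting outliers to missing:", set_extreme_to_missing(MONTHLY, flags))
```

In practice the order of operations matters: flagging extreme values before imputing keeps an implausible month from being propagated into the imputed values, and a sensitivity analysis across the alternative rules, as described in the table, shows how much the choice affects the results.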