Table 1. Data quality and performance indicator definitions, mitigation strategies, and references.
Indicator | Definition | Mitigation strategies | Relevant studies |
Data quality | | | |
Completeness (or, conversely, missingness) | The absence of data points, without reference to data type or plausibility [12] | Automated data extraction; data imputation | [2-6,8,9,24-37] |
Conformance | The compliance of data with expected formatting, relational, or absolute definitions [12] | Preemptively enforced data format standardization | [2-6,8,14,24-27,29-33,36,38] |
Plausibility | The possibility that a value is true given the context of other variables or temporal sequences (eg, patient date of birth must precede date of treatment or diagnosis) [12] | Periodic realignment with logic rule sets or objective truth standards; thresholding | [4-6,8,14,25,27,28,30-33,35,37-39] |
Uniqueness | The lack of duplicate data among other patient records [8] | Two-level encounter or visit data structure | [8] |
Data performance | | | |
Correctness or accuracy | Whether patient records are free from errors or inconsistencies when the information provided in them is true [10,13] | Periodic validation against internal and external gold standards | [2,7-9,14,23,24,28] |
Currency or recency | Whether data were entered into the EHR^a within a clinically relevant time frame and are representative of the patient state at a given time of interest [10,13] | Enforcing predetermined hard and soft rule sets for timeline of data entry | [2,4,9,27,32,34,36] |
Fairness (or, conversely, bias) | The degree to which data collection, augmentation, and application are free from unwarranted over- or underrepresentation of individual data elements or characteristics | Periodic review against a predetermined internal gold standard or bias criterion | [3,19,22,24,27,35] |
Stability (or, conversely, temporal variability) | Whether temporally dependent variables change according to predefined expectations [10,12] | Periodic measurement of data drift against a baseline standard of data distribution | [4,8,19,31] |
Shareability | Whether data can be shared directly, easily, and with no information loss [3] | Preemptively enforced data standardization | [2,3] |
Robustness | The percentage of patient records with tolerable (eg, inaccurate, inconsistent, or outdated information) versus intolerable (eg, missing required information) data quality problems [24] | Timely identification of critical data quality issues | [24] |
^a EHR: electronic health record.
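
As an illustration of how the four data quality indicators in Table 1 might be turned into simple checks, the following is a minimal Python sketch, not a method taken from the reviewed studies. The record layout, the field names (`patient_id`, `birth_date`, `diagnosis_date`), the identifier format `P` plus three digits, and the choice to report each indicator as a fraction of records are all assumptions made for this example.

```python
import re
from datetime import date

# Hypothetical patient records; field names and values are illustrative only.
RECORDS = [
    {"patient_id": "P001", "birth_date": date(1980, 5, 1), "diagnosis_date": date(2020, 3, 2)},
    {"patient_id": "P002", "birth_date": date(1995, 7, 9), "diagnosis_date": date(1990, 1, 1)},
    {"patient_id": "p-3", "birth_date": None, "diagnosis_date": date(2021, 6, 5)},
    {"patient_id": "P001", "birth_date": date(1980, 5, 1), "diagnosis_date": date(2020, 3, 2)},
]

# Assumed identifier format: capital "P" followed by exactly three digits.
ID_PATTERN = re.compile(r"P\d{3}")


def completeness(records, field):
    """Fraction of records in which `field` is present (not None)."""
    return sum(r.get(field) is not None for r in records) / len(records)


def conformance(records, field, pattern):
    """Fraction of records whose `field` fully matches the expected format."""
    return sum(bool(pattern.fullmatch(r[field])) for r in records) / len(records)


def plausibility(records):
    """Among records with both dates, fraction where birth precedes diagnosis."""
    checked = [r for r in records if r["birth_date"] and r["diagnosis_date"]]
    if not checked:
        return None  # nothing to evaluate
    ok = sum(r["birth_date"] < r["diagnosis_date"] for r in checked)
    return ok / len(checked)


def uniqueness(records):
    """Fraction of records that are distinct (exact duplicates counted once)."""
    distinct = {tuple(sorted(r.items(), key=lambda kv: kv[0])) for r in records}
    return len(distinct) / len(records)


if __name__ == "__main__":
    print("completeness(birth_date):", completeness(RECORDS, "birth_date"))
    print("conformance(patient_id):", conformance(RECORDS, "patient_id", ID_PATTERN))
    print("plausibility:", plausibility(RECORDS))
    print("uniqueness:", uniqueness(RECORDS))
```

In this sketch, plausibility is evaluated only on records where both dates are present, so that missing values are counted under completeness rather than twice. A real pipeline would likely replace the hard-coded pattern and date rule with the site's own rule sets.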