2024 Nov 6;12:e58130. doi: 10.2196/58130

Table 1. Data quality and performance indicator definitions, mitigation strategies, and references.

| Indicator | Definition | Mitigation strategies | Relevant studies |
| --- | --- | --- | --- |
| Data quality | | | |
| Completeness (or, conversely, missingness) | The absence of data points, without reference to data type or plausibility [12] | Automated data extraction; data imputation | [2-6,8,9,24-37] |
| Conformance | The compliance of data with expected formatting, relational, or absolute definitions [12] | Preemptively enforced data format standardization | [2-6,8,14,24-27,29-33,36,38] |
| Plausibility | The possibility that a value is true given the context of other variables or temporal sequences (ie, patient date of birth must precede date of treatment or diagnosis) [12] | Periodic realignment with logic rule sets or objective truth standards; thresholding | [4-6,8,14,25,27,28,30-33,35,37-39] |
| Uniqueness | The lack of duplicate data among other patient records [8] | Two-level encounter or visit data structure | [8] |
| Data performance | | | |
| Correctness or accuracy | Whether patient records are free from errors or inconsistencies when the information provided in them is true [10,13] | Periodic validation against internal and external gold standards | [2,7-9,14,23,24,28] |
| Currency or recency | Whether data were entered into the EHR^a within a clinically relevant time frame and are representative of the patient state at a given time of interest [10,13] | Enforcing predetermined hard and soft rule sets for timeline of data entry | [2,4,9,27,32,34,36] |
| Fairness (or, conversely, bias) | The degree to which data collection, augmentation, and application are free from unwarranted over- or underrepresentation of individual data elements or characteristics | Periodic review against a predetermined internal gold standard or bias criterion | [3,19,22,24,27,35] |
| Stability (or, conversely, temporal variability) | Whether temporally dependent variables change according to predefined expectations [10,12] | Periodic measurement of data drift against a baseline standard of data distribution | [4,8,19,31] |
| Shareability | Whether data can be shared directly, easily, and with no information loss [3] | Preemptively enforced data standardization | [2,3] |
| Robustness | The percent of patient records with tolerable (eg, inaccurate, inconsistent, and outdated information) versus intolerable (eg, missing required information) data quality problems [24] | Timely identification of critical data quality issues | [24] |
^a EHR: electronic health record.
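
As an illustration only, the four data quality indicators in Table 1 can be expressed as simple checks over a set of records. The sketch below is not taken from the article: the record layout, the field names (`patient_id`, `encounter_id`, `birth_date`, `treatment_date`, `sex`), the allowed code set, and all function names are hypothetical. It treats completeness as the share of non-missing values, conformance as membership in an expected code set, plausibility as the date-of-birth rule given in the table, and uniqueness as the absence of repeated patient and encounter keys.

```python
from datetime import date

# Hypothetical encounter records; the field names are assumptions for illustration.
records = [
    {"patient_id": "p1", "encounter_id": "e1", "birth_date": date(1980, 5, 1),
     "treatment_date": date(2020, 3, 2), "sex": "F"},
    {"patient_id": "p2", "encounter_id": "e2", "birth_date": date(1990, 1, 1),
     "treatment_date": date(1985, 6, 1), "sex": "M"},       # treatment before birth
    {"patient_id": "p3", "encounter_id": "e3", "birth_date": None,
     "treatment_date": date(2021, 7, 9), "sex": "female"},  # missing and nonconformant
    {"patient_id": "p1", "encounter_id": "e1", "birth_date": date(1980, 5, 1),
     "treatment_date": date(2020, 3, 2), "sex": "F"},       # repeats the first record
]

ALLOWED_SEX_CODES = {"F", "M", "O", "U"}  # assumed code set, not from the article


def completeness(rows, field):
    """Fraction of rows in which `field` is present (not None)."""
    return sum(r.get(field) is not None for r in rows) / len(rows)


def conformance(rows, field, allowed):
    """Fraction of non-missing values of `field` that belong to `allowed`."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return sum(v in allowed for v in values) / len(values) if values else 1.0


def implausible_rows(rows):
    """Indices of rows whose birth date does not precede the treatment date."""
    return [
        i for i, r in enumerate(rows)
        if r["birth_date"] is not None
        and r["treatment_date"] is not None
        and r["birth_date"] >= r["treatment_date"]
    ]


def duplicate_rows(rows, key_fields=("patient_id", "encounter_id")):
    """Indices of rows that repeat a key already seen earlier in the list."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        key = tuple(r[f] for f in key_fields)
        if key in seen:
            dupes.append(i)
        seen.add(key)
    return dupes


if __name__ == "__main__":
    print("completeness(birth_date):", completeness(records, "birth_date"))
    print("conformance(sex):", conformance(records, "sex", ALLOWED_SEX_CODES))
    print("implausible rows:", implausible_rows(records))
    print("duplicate rows:", duplicate_rows(records))
```

On the sample data, the birth date is present in three of four records, three of four sex codes conform, the second record fails the date rule, and the fourth record is flagged as a duplicate. A real pipeline would likely run such checks periodically against agreed rule sets and thresholds, in line with the mitigation strategies listed in the table.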