TABLE A1.
Concept | Definition | Rationale for Excluding |
---|---|---|
Coherence | Dimension that expresses how consistent the different parts of an overall data set are in their representation and meaning. It closely relates to consistency and validation. Consistency and coherence can be considered largely synonymous, with the caveat that detection of inconsistencies is often a way to measure the reliability of data. Subdimensions of coherence include format coherence, structural coherence, semantic coherence, and uniqueness. Conformance assesses coherence toward a specific reference or data model. | Coherence is only described as a distinct dimension by EMA. Coherence overlaps with both consistency and conformance, which are incorporated into the evaluation of accuracy under the Flatiron Health framework. |
Coverage | Amount of information available with respect to what exists in the real world, whether it is within the capture process or not. This cannot be easily measured if the total information is not definable or accessible. | Coverage is only described as a distinct dimension by EMA. Flatiron Health's evaluative approach to completeness includes elements of coverage, as thresholds of completeness are set based on clinically informed expectations of information availability. |
Extensiveness | How much data are available, and whether the data are sufficient for purpose. Extensiveness is composed of completeness and coverage. | Extensiveness is only described as a distinct dimension by EMA. Extensiveness overlaps with dimensions of completeness and sufficiency in the Flatiron Health framework. |
Precision | Degree of approximation by which data represent reality. For instance, the age of a person could be reported in years or months. | Precision is only described as a distinct dimension by EMA. Precision or granularity of a variable is incorporated in part in accuracy (the degree to which the operational definitions represent reality) and in part under the availability subdimension of relevance (whether the variable has the appropriate granularity for the use case). Excluded also to avoid the term precision, which can also be used to mean positive predictive value. |
Traceability | Permits an understanding of the relationships between the analysis results (tables, listings, and figures in the study report), analysis data sets, tabulation data sets, and source data. | Traceability is only described as a distinct dimension by FDA. Excluded from Flatiron Health data quality process as the focus of this dimension on analytic output is more applicable to real-world evidence than real-world data. Traceability of source data is incorporated under provenance. |
Abbreviations: EMA, European Medicines Agency; FDA, US Food and Drug Administration; ML, machine learning.