2024 Jan 19;8:e2300046. doi: 10.1200/CCI.23.00046

TABLE A1.

Data Quality Concepts Not Included Within the Flatiron Health Framework (Table 1) Because of Inconsistent Use Across Sources or Overlap With Other Dimensions

Concept: Coherence
Definition: Dimension that expresses how different parts of an overall data set are consistent in their representation and meaning. It closely relates to consistency and validation. We can consider consistency and coherence largely synonyms, with the caveat that detection of inconsistencies is often a way to measure the reliability of data. Subdimensions of coherence include format coherence, structural coherence, semantic coherence, and uniqueness. Conformance assesses coherence toward a specific reference or data model.
Rationale for Excluding: Coherence is described as a distinct dimension only by EMA. Coherence overlaps with both consistency and conformance, which are incorporated into the evaluation of accuracy under the Flatiron Health framework.

Concept: Coverage
Definition: Amount of information available with respect to what exists in the real world, whether it is within the capture process or not. This cannot be easily measured if the total information is not definable or accessible.
Rationale for Excluding: Coverage is described as a distinct dimension only by EMA. Flatiron Health's evaluative approach to completeness includes elements of coverage, as thresholds of completeness are set based on clinically informed expectations of information availability.

Concept: Extensiveness
Definition: How much data are available, and whether the data are sufficient for purpose. Extensiveness is composed of completeness and coverage.
Rationale for Excluding: Extensiveness is described as a distinct dimension only by EMA. Extensiveness overlaps with the dimensions of completeness and sufficiency in the Flatiron Health framework.

Concept: Precision
Definition: Degree of approximation by which data represent reality. For instance, the age of a person could be reported in years or months.
Rationale for Excluding: Precision is described as a distinct dimension only by EMA. Precision, or granularity, of a variable is incorporated in part under accuracy (the degree to which the operational definition represents reality) and in part under the availability subdimension of relevance (whether the variable has the appropriate granularity for the use case). Also excluded to avoid the term precision, which can also be used to mean positive predictive value.

Concept: Traceability
Definition: Permits an understanding of the relationships between the analysis results (tables, listings, and figures in the study report), analysis data sets, tabulation data sets, and source data.
Rationale for Excluding: Traceability is described as a distinct dimension only by FDA. Excluded from the Flatiron Health data quality process because this dimension's focus on analytic output is more applicable to real-world evidence than to real-world data. Traceability of source data is incorporated under provenance.

Abbreviations: EMA, European Medicines Agency; FDA, US Food and Drug Administration; ML, machine learning.