TABLE 3.
| Category | Subcategory | Description | Example Verification Check |
|---|---|---|---|
| Conformance | Value conformance | Data values conform to internal formatting constraints | Dates are recorded as YYYY-MM-DD |
| | | Data values conform to allowable values or ranges | Stage is abstracted from unstructured documents into structured categories aligned to AJCC terminology |
| | Relational conformance | Data values conform to relational constraints | Patients with documentation of real-world response events also have documented treatment data |
| | | Unique (key) data values are not duplicated | Duplicate records for the same patient across multiple clinic sites are merged into a single record |
| | | Changes to the data model or data model versioning | Changes to the data model are tracked, and only inputs that match the current data model at the time of entry are allowed |
| | Computational conformance | Computed values conform to computational or programming specifications | Human-abstracted group stage and group stage calculated from abstracted T, N, M components, when available, are identical |
| Plausibility | Uniqueness plausibility | Data values that identify a single object are not duplicated | Biomarker tests are not captured in duplicate when there are multiple references to the same event in documentation |
| | Atemporal plausibility | Data values and distributions agree with an internal measurement or local knowledge (overlaps with indirect benchmarking) | First-line treatment regimens, as defined according to line of therapy business rules, reflect expected clinical practice as described by NCCN guidelines |
| | | Data values and distributions for independent measurements of the same or related facts agree | Date of treatment discontinuation for disease progression is in close proximity to date of progression documented on imaging |
| | | Logical constraints between values agree with local or common knowledge (includes "expected" missingness) | Patients receiving TRK inhibitor therapy have documentation of an NTRK fusion |
| | | Biologic plausibility of different values is in agreement with local or common knowledge | Coexistence of EGFR and KRAS mutations is rare |
| | | Values of repeated measurement of the same fact show expected variability | Time between repeated response assessments is generally aligned to intervals recommended by NCCN guidelines; however, shorter and longer intervals are also present in line with real-world practice patterns |
| | Temporal plausibility | Observed or derived values conform to expected temporal properties | Initial diagnosis date precedes metastatic diagnosis date for patients whose cancer stage at initial diagnosis is nonadvanced |
| | | Sequences of values that represent state transitions conform to expected properties | Real-world progression events are followed by a logical clinical event, such as change in treatment, referral to hospice, or death |
| | | Measures of data value density against a time-oriented denominator are expected on the basis of internal or common knowledge | PD-L1 testing events become more frequent after approval of a therapy for which PD-L1 positivity is required by the indication |
| Consistency | Cross-field consistency | Data are consistent across multiple fields or data sources | Patients documented as having brain metastases at initial diagnosis are also identified as having stage IV disease |
| | Temporal consistency | Data from recurring or refreshed databases are consistent over time | Frequency of PSA values within a given site shows minimal month-over-month variation |
| | Agreement | Duplicate capture of the same data point by different processes or individuals results in the same values | Two abstractors agree on the discontinuation date and reason for discontinuation of the same drug |
| | Reproducibility | Repeat use of operational data capture algorithms will result in the same or similar results | The algorithm that extracts a smoking status variable yields a consistent result each time it is used on the same or similar tasks |
NOTE. Modified from Kahn et al. (ref. 25).
Abbreviations: AJCC, American Joint Committee on Cancer; EGFR, epidermal growth factor receptor; NCCN, National Comprehensive Cancer Network; NTRK, neurotrophic tyrosine receptor kinase; PD-L1, programmed death ligand 1; PSA, prostate-specific antigen; TRK, tropomyosin receptor kinase.
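
As an illustration, three of the example checks in Table 3 (value conformance of dates, temporal plausibility of diagnosis dates, and cross-field consistency of brain metastases with stage) could be automated roughly as follows. This is a minimal sketch, not the implementation behind the table: the record layout, the field names (`initial_dx_date`, `met_dx_date`, `stage_at_dx`, `brain_mets_at_dx`), the sample patients, and the treatment of stages I to III as nonadvanced are all assumptions made for the example.

```python
import re
from datetime import datetime

# Hypothetical patient records; all field names and values are illustrative assumptions.
records = [
    {"patient_id": "P1", "initial_dx_date": "2019-03-14", "met_dx_date": "2020-01-02",
     "stage_at_dx": "II", "brain_mets_at_dx": False},
    {"patient_id": "P2", "initial_dx_date": "14/03/2019", "met_dx_date": "2019-06-01",
     "stage_at_dx": "III", "brain_mets_at_dx": False},
    {"patient_id": "P3", "initial_dx_date": "2021-05-20", "met_dx_date": "2021-02-11",
     "stage_at_dx": "I", "brain_mets_at_dx": False},
    {"patient_id": "P4", "initial_dx_date": "2020-08-09", "met_dx_date": "2021-01-15",
     "stage_at_dx": "III", "brain_mets_at_dx": True},
]

# Assumption: stages I-III count as nonadvanced; a real rule would be disease specific.
NONADVANCED_STAGES = {"I", "II", "III"}

ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_iso_date(value):
    """Return a date if value is a valid, zero-padded YYYY-MM-DD string, else None."""
    if not isinstance(value, str) or not ISO_DATE.fullmatch(value):
        return None
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:  # right shape but not a real calendar date, e.g. 2021-02-30
        return None


def run_checks(record):
    """Return the names of the verification checks that this record fails."""
    failures = []
    initial = parse_iso_date(record["initial_dx_date"])
    metastatic = parse_iso_date(record["met_dx_date"])

    # Value conformance: dates are recorded as YYYY-MM-DD.
    if initial is None or metastatic is None:
        failures.append("value_conformance")

    # Temporal plausibility: initial diagnosis precedes metastatic diagnosis
    # when stage at initial diagnosis is nonadvanced (only testable if both dates parse).
    if (initial and metastatic
            and record["stage_at_dx"] in NONADVANCED_STAGES
            and not initial < metastatic):
        failures.append("temporal_plausibility")

    # Cross-field consistency: brain metastases at initial diagnosis imply stage IV.
    if record["brain_mets_at_dx"] and record["stage_at_dx"] != "IV":
        failures.append("cross_field_consistency")

    return failures


report = {r["patient_id"]: run_checks(r) for r in records}
for patient_id, failures in report.items():
    print(patient_id, failures or "pass")
```

Each check returns a named failure rather than raising an error, so a single pass over the data can report every rule a record violates. Checks that depend on distributions or on comparison with external knowledge, such as atemporal plausibility against NCCN guidelines, need cohort-level summaries and are not covered by this record-level sketch.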