2021 Aug 11;12(4):757–767. doi: 10.1055/s-0041-1732301

Table 5. Source data exclusion rules.

Columns: source data table (source count).

| Exclusion rule | Person (5,214) | Condition (338,688) | Drug (2,274,749) | Visit (261,499) | Laboratory (11,563,678) | Death (244) | Observation (542,211) |
|---|---|---|---|---|---|---|---|
| Missing or useless values in patient identifiers | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Multiple gender values for a patient | 0 | NA | NA | NA | NA | NA | NA |
| Multiple date of birth records for a patient differing by more than 2 years | 0 | NA | NA | NA | NA | NA | NA |
| Date of birth < 1900 | 0 | NA | NA | NA | NA | NA | NA |
| Event start_date > event end_date (e.g., Drug_start_date > Drug_end_date) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Multiple relevant variables missing | 0 | 44,453 | 0 | 840 | 522,455 | 0 | 324,820 |
| Removal of duplicates and junk values | 15 | 7,020 | 62 | 9,304 | 612,465 | 0 | 0 |
| Final cleaned source data count | 5,199 | 331,669 | 2,274,687 | 251,355 | 10,428,758 | 244 | 217,391 |

Abbreviation: NA, not applicable.
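
The arithmetic behind Table 5 can be checked with a short sketch. It assumes that each exclusion rule removes a disjoint set of rows, so that the final count should equal the source count minus the rows removed by each rule, and it treats NA entries as 0. Both assumptions are ours, not statements from the paper, and the dictionary and function names are hypothetical. Only the two rules with nonzero removals are transcribed.

```python
# Counts transcribed from Table 5. All other rules removed 0 rows
# (or were NA, which this sketch treats as 0).
SOURCE = {
    "Person": 5_214, "Condition": 338_688, "Drug": 2_274_749,
    "Visit": 261_499, "Laboratory": 11_563_678, "Death": 244,
    "Observation": 542_211,
}
MISSING_VARIABLES = {  # rule: multiple relevant variables missing
    "Person": 0, "Condition": 44_453, "Drug": 0,
    "Visit": 840, "Laboratory": 522_455, "Death": 0,
    "Observation": 324_820,
}
DUPLICATES_JUNK = {  # rule: removal of duplicates and junk values
    "Person": 15, "Condition": 7_020, "Drug": 62,
    "Visit": 9_304, "Laboratory": 612_465, "Death": 0,
    "Observation": 0,
}
REPORTED_FINAL = {
    "Person": 5_199, "Condition": 331_669, "Drug": 2_274_687,
    "Visit": 251_355, "Laboratory": 10_428_758, "Death": 244,
    "Observation": 217_391,
}


def expected_final(table: str) -> int:
    """Source count minus removed rows, assuming rules remove disjoint rows."""
    return SOURCE[table] - MISSING_VARIABLES[table] - DUPLICATES_JUNK[table]


def unreconciled() -> dict[str, int]:
    """Tables whose reported final count differs, mapped to reported - expected."""
    return {
        table: REPORTED_FINAL[table] - expected_final(table)
        for table in SOURCE
        if REPORTED_FINAL[table] != expected_final(table)
    }


if __name__ == "__main__":
    for table in SOURCE:
        print(f"{table:<12} expected {expected_final(table):>12,}"
              f"  reported {REPORTED_FINAL[table]:>12,}")
    print("Unreconciled:", unreconciled())
```

Under these assumptions, six of the seven columns reconcile exactly. The Condition column does not: the reported final count is higher than the source count minus the listed removals, and the table alone does not explain the difference.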