Table 5. Source data exclusion rules, by source data table (source count in parentheses).
| Exclusion rule | Person (5,214) | Condition (338,688) | Drug (2,274,749) | Visit (261,499) | Laboratory (11,563,678) | Death (244) | Observation (542,211) |
|---|---|---|---|---|---|---|---|
| Missing or unusable values in patient identifiers | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Multiple gender values for a patient | 0 | NA | NA | NA | NA | NA | NA |
| Multiple date of birth records for a patient that differ by more than 2 y | 0 | NA | NA | NA | NA | NA | NA |
| Date of birth < 1900 | 0 | NA | NA | NA | NA | NA | NA |
| Event start_date > event end_date (e.g., Drug_start_date > Drug_end_date) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Multiple relevant variables missing | 0 | 44,453 | 0 | 840 | 522,455 | 0 | 324,820 |
| Removal of duplicates and junk values | 15 | 7,020 | 62 | 9,304 | 612,465 | 0 | 0 |
| Final cleaned source data count | 5,199 | 331,669 | 2,274,687 | 251,355 | 10,428,758 | 244 | 217,391 |
Abbreviation: NA, not applicable.
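
The way a table like this is typically produced can be illustrated with a minimal Python sketch that applies exclusion rules one after another and tallies how many records each rule removes. The record layout, field names (`person_id`, `start_date`, `end_date`), and sample rows are hypothetical, and the rule order simply follows the table rows; the source does not state the order or the implementation actually used.

```python
from datetime import date

# Hypothetical visit-like records; field names are illustrative only.
visits = [
    {"person_id": "P1", "start_date": date(2015, 1, 1), "end_date": date(2015, 1, 3)},
    {"person_id": "", "start_date": date(2015, 2, 1), "end_date": date(2015, 2, 2)},    # missing identifier
    {"person_id": "P2", "start_date": date(2015, 3, 5), "end_date": date(2015, 3, 1)},  # start after end
    {"person_id": "P1", "start_date": date(2015, 1, 1), "end_date": date(2015, 1, 3)},  # duplicate of first row
    {"person_id": "P3", "start_date": None, "end_date": None},                          # both dates missing
]


def apply_exclusion_rules(records):
    """Apply rules in order; return (kept_records, {rule_name: excluded_count})."""
    # Each rule is a predicate that returns True when a record must be excluded.
    rules = [
        ("missing patient identifier", lambda r: not r["person_id"]),
        (
            "start_date > end_date",
            lambda r: r["start_date"] is not None
            and r["end_date"] is not None
            and r["start_date"] > r["end_date"],
        ),
        (
            "multiple relevant variables missing",
            lambda r: r["start_date"] is None and r["end_date"] is None,
        ),
    ]
    counts = {}
    kept = list(records)
    for name, is_excluded in rules:
        remaining = [r for r in kept if not is_excluded(r)]
        counts[name] = len(kept) - len(remaining)
        kept = remaining

    # Remove exact duplicates last, keeping the first occurrence of each record.
    seen = set()
    deduped = []
    for r in kept:
        key = (r["person_id"], r["start_date"], r["end_date"])
        if key not in seen:
            seen.add(key)
            deduped.append(r)
    counts["duplicates"] = len(kept) - len(deduped)
    return deduped, counts


cleaned, excluded = apply_exclusion_rules(visits)

# Reconciliation: source count minus all exclusions equals the cleaned count.
assert len(visits) - sum(excluded.values()) == len(cleaned)
```

The final assertion mirrors how a column of the table can be checked: for Visit, 261,499 − 840 − 9,304 = 251,355, which matches the reported final cleaned count.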