Table 2.
Descriptive analysis of the cohorts.
Characteristic | Pretraining | DHF-Cerner | PaCa-Cerner | PaCa-Truven |
---|---|---|---|---|
Cohort size (n) | 28,490,650 | 672,647 | 29,405 | 42,721 |
Percentage of patients with the event^a | 15% | 14% | 0.07% | 0.06% |
Average age on last/index encounter (std) | 41 | 61 | 65 | 63 |
Gender—Male (%) | 45% | 47% | 45% | 48% |
Race | | | | |
White (%) | 68% | 72% | 77% | NA |
African American (%) | 15% | 16% | 13% | NA |
Asian/Pacific Islander (%) | 2% | 2% | 2% | NA |
African American (%) | 2% | 2% | 1% | NA |
Average number of visits per patient | 8 | 17 | 7 | 19 |
Average number of codes per patient | 15 | 33 | 14 | 18 |
Vocabulary size | 82,603 | 26,427 | 13,071 | 7,002 |
ICD-10 codes (%) | 33.8% | 13.3% | 20.7% | 0% |
^a The event for pretraining is a prolonged hospitalization (>7 days). The event for DHF-Cerner is the development of heart failure in diabetic patients. The event for PaCa-Cerner and PaCa-Truven is a diagnosis of pancreatic cancer, and the percentage is computed relative to the total population of the dataset.
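
As an illustration of how the rows of Table 2 could be derived from patient-level data, the following is a minimal sketch in Python. The record layout, the field names (`age`, `male`, `event`, `visits`), and the toy values are assumptions made for this example and do not come from the Cerner or Truven datasets. The sketch also assumes that "codes per patient" counts every code occurrence across visits; counting distinct codes per patient would be an equally plausible reading of that row.

```python
from statistics import mean

# Hypothetical toy records: each patient has an age at the last/index
# encounter, a sex flag, an event label, and a list of visits, where each
# visit is a list of diagnosis codes. None of these values are real data.
patients = [
    {"age": 60, "male": True,  "event": True,
     "visits": [["E11.9", "I10"], ["I50.9"]]},
    {"age": 70, "male": False, "event": False,
     "visits": [["E11.9"]]},
    {"age": 50, "male": True,  "event": False,
     "visits": [["250.00", "I10"], ["E11.9"], ["I10"]]},
    {"age": 40, "male": False, "event": False,
     "visits": [["250.00"]]},
]


def describe(cohort):
    """Compute summary statistics analogous to the rows of the table."""
    n = len(cohort)
    # Flatten every code occurrence in the cohort to build the vocabulary.
    all_codes = [code for p in cohort for visit in p["visits"] for code in visit]
    return {
        "cohort_size": n,
        "event_pct": 100 * sum(p["event"] for p in cohort) / n,
        "mean_age": mean(p["age"] for p in cohort),
        "male_pct": 100 * sum(p["male"] for p in cohort) / n,
        "avg_visits": mean(len(p["visits"]) for p in cohort),
        # Assumption: total code occurrences per patient, not distinct codes.
        "avg_codes": mean(sum(len(v) for v in p["visits"]) for p in cohort),
        "vocab_size": len(set(all_codes)),
    }


stats = describe(patients)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Note that this sketch computes the event percentage within the cohort itself. For the PaCa cohorts the footnote states that the percentage is taken over the dataset's total population, so the denominator there would be the size of the full dataset rather than `len(cohort)`.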