Table 1. Data Sources, Extraction Method, and Description.
Key variable | Data source and extraction method: structured data | Data source and extraction method: unstructured data | Variable description |
---|---|---|---|
Demographic characteristic | |||
Birth date | Demographics | NA | NA |
Sex | Demographics | NA | NA |
Race/ethnicity | Demographics | NA | NA |
Clinical outcomes | |||
Diagnosis date | Diagnosis codes (ICD-9 and ICD-10 codes) | NICE | Date of lung cancer diagnosis |
Date of death | Death report | NA | NA |
Prognostic factors | NA | NA | NA |
Stage | NA | NICE | TNM stage and clinical stage |
Histologic type | NA | NICE | NSCLC (ie, adenocarcinoma, squamous cell carcinoma, other non–small cell carcinoma) or small cell lung cancer |
Smoking status | NA | NA | Smoker or nonsmoker |
BMI | Vital signs | EXTEND | Calculated as weight in kilograms divided by height in meters squared |
ECOG performance status | NA | EXTEND | Grade 0 to 4 |
Laboratory test | Laboratory test codes | NA | Complete blood count, metabolic panel, lipid panel, liver panel, hemoglobin A1C, and urinalysis |
Tumor somatic variant information | NA | NICE | Genetic alterations in EGFR, KRAS, ALK, ROS1, MET, or BRAF |
Medical history | Diagnosis codes (ICD-9 and ICD-10 codes) | NA | Respiratory disease (eg, COPD and asthma), cardiovascular disease, type 2 diabetes, and others |
Treatment | |||
Surgical treatment | Procedure codes (CPT and ICD-10 codes) | NA | Surgical procedure (ie, lobectomy, segmentectomy, wedge resection, video-assisted thoracic surgical procedure) with surgical admission and discharge dates |
Radiation therapy | Procedure codes (CPT and ICD-10 codes) | NA | Radiation therapy procedure and treatment start and end dates |
Chemotherapy | Procedure codes (CPT and ICD-10 codes) and medication name codes | NA | Chemotherapy procedures, chemotherapy drugs, and treatment start and end dates |
Targeted therapy and immunotherapy | Medication name codes | NA | Targeted therapy and immunotherapy drugs and treatment start and end dates |
Abbreviations: BMI, body mass index; COPD, chronic obstructive pulmonary disease; CPT, Current Procedural Terminology; ECOG, Eastern Cooperative Oncology Group; EXTEND, Extraction of Electronic Medical Record Numerical Data; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Statistical Classification of Diseases and Related Health Problems, Tenth Revision; NA, not applicable; NICE, Natural Language Processing Interpreter for Cancer Extraction; NSCLC, non–small cell lung cancer; TNM, tumor-node-metastasis.
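
The BMI definition in Table 1 (weight in kilograms divided by height in meters squared) can be illustrated with a minimal Python sketch. This is not the study's extraction code: the function name `bmi` and the input check are assumptions added for illustration only.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index: weight (kg) divided by height (m) squared."""
    # Hypothetical guard: non-positive inputs would give a meaningless
    # value or a division by zero, so they are rejected here.
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Example: 70 kg at 1.75 m
    print(round(bmi(70.0, 1.75), 1))  # 22.9
```

Height recorded in centimeters would need to be divided by 100 before calling the function.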