Abstract
Lung cancer is the leading cause of cancer-related mortality in the United States. Investigating epidemiological and clinical parameters can contribute to an improved understanding of disease development and management. In this cross-sectional, case–control study, we used the All of Us database to compare healthcare access, family history, smoking-related behaviors, and psychiatric comorbidities in light smoking controls, matched smoking controls, and primary and secondary lung cancer patients. We found a decreased odds of primary lung cancer patients versus matched smoking controls reporting inability to afford follow-up or specialist care. Additionally, we found a significantly increased odds of secondary lung cancer patients having comorbid anxiety and insomnia when compared to matched smoking controls. Our study provides a profile of the psychiatric disease burden in lung cancer patients and reports key epidemiological factors in patients with primary and secondary lung cancer. By using two controls, we were able to separate smoking behavior from lung cancer and identify factors that were mediated by heavy smoking alone or by both smoking and lung cancer.
Subject terms: Cancer epidemiology, Lung cancer, Risk factors
Introduction
In 2022, approximately 236,000 lung cancer diagnoses and 130,000 lung cancer deaths were expected to occur in the United States (US)1. Every day, roughly 350 patients are expected to die from lung cancer, making it the leading cause of cancer-related death in the US1. Tumors in the lungs can be classified as primary lung cancer, including small-cell or non-small-cell lung cancer, or secondary lung cancer, which typically arises from the metastasis of breast2, colorectal3, renal4, testicular5, and uterine cancer6, among other forms of cancer. Many primary lung cancers are attributed to modifiable risk factors, such as smoking1,7, secondhand smoke7, excess body weight8, red and processed meat consumption7, alcohol intake7, and various occupational exposures9. Among these, cigarette smoking is a well-known risk factor for primary lung cancer and is considered the leading cause, accounting for more than 80% of lung cancer cases in the US1.
Although cigarette smoking is a significant risk factor for the development of lung cancer, numerous studies have demonstrated that a family history of lung cancer is also associated with an increased risk10. Even after accounting for age, sex, smoking history, and occupation, studies suggest a 2–4-fold increase in lung cancer risk for first-degree relatives of lung cancer patients10. Other epidemiologic factors, such as barriers to healthcare, can impact lung cancer development and outcomes, especially in vulnerable populations11. Studies estimate that only 5–18% of patients at high risk for lung cancer receive low dose computed tomography (LDCT) screening12. Investigating smoking-related behaviors is also crucial in the context of lung cancer risks, including e-cigarette use and smokeless tobacco. While nicotine replacement and pharmacological therapies along with behavior therapies have led to improved smoking cessation rates, the accessibility of e-cigarettes has led to an increase in their usage13,14. A particular concern is that e-cigarette users often also use cigarettes, thus increasing their lung cancer risk15. Notably, a literature gap exists in understanding the complex interplay between smoking, e-cigarette or smokeless tobacco use, and lung cancer, which this study aims to address.
Finally, the psychiatric disease burden associated with both smoking and lung cancer is well-documented16–18. However, to our knowledge, no studies have investigated the differences in the psychiatric disease burden between primary and secondary lung cancer. Understanding which psychiatric diseases are comorbid with both primary and secondary lung cancer can help physicians develop treatment plans tailored to the individual patient.
To obtain a more comprehensive understanding of lung cancer development, treatment, and outcomes, it is essential to investigate epidemiological factors beyond cigarette smoking. This investigation can help develop better risk-based lung cancer screening methods and outcome prediction models that can draw on data from diverse sources19. This study aims to explore several key factors that may contribute to primary and secondary lung cancer, including lung cancer family history, barriers to healthcare, smoking-related behaviors, and psychiatric comorbidities. To understand better the impact of these factors, we designed a case–control analysis using two control groups (light smokers and matched smokers) to study the effects of smoking, lung cancer, and comorbid psychiatric conditions. Specifically, this study aims to answer the research question of whether the prevalence and impact of smoking-related behaviors, psychiatric comorbidities, and other epidemiological factors differ between primary and secondary lung cancer patients compared to light smoking and matched smoking controls.
Materials and methods
All of Us Research Program
The All of Us Research Program is a prospective cohort study with the objective of recruiting at least one million individuals in the US to provide a comprehensive database that enables researchers to investigate the effects of lifestyle, access to care, family history, environment, and genomics on participant health20. The program collects data through self-reported surveys, electronic health records (EHRs), and physical wearables such as Fitbit devices. Of the 372,082 patients in the All of Us Research Program, 54.1% are white, 19.7% are black or African American, 3.3% are Asian, 0.60% are Middle Eastern or North African, and 0.11% are Native Hawaiian or Other Pacific Islander. Data from this program are accessible at http://www.allofus.nih.gov, and this study was conducted on version 6 of the data utilizing the All of Us Researcher Workbench. The Supplementary Material provides the codes used to query EHRs for lung cancer and psychiatric conditions.
Lung cancer patient and control selection
Using the cohort builder function within the All of Us workbench, we created cohorts for patients with primary and secondary lung cancer based on source concept names (Supplementary Material). To protect individual-level patient information and in accordance with the All of Us data access policy, we excluded a small number of patients from both the primary and secondary lung cancer cohorts whose sex at birth survey answer categories contained fewer than 20 participants. Controls were divided into two groups: a light smoking control (LSC) and a matched smoking control (MSC). Light smoking controls in primary lung cancer and secondary lung cancer are designated as LSC-1 and LSC-2, respectively. Matched smoking controls in primary and secondary lung cancer are designated as MSC-1 and MSC-2, respectively. Control group participants were matched with patients based on their current age at the time of this study in 5-year intervals, sex at birth, and smoking status from a sample excluding primary and secondary lung cancer patients. The controls were matched by randomly selecting the control group participant with the appropriate inclusion criteria for a given matched lung cancer patient from a list of eligible control participants (i.e., same age, sex at birth, and smoking status as matched lung cancer patient). While smoking pack years is a well-established metric for smoking history21, we used the number of years smoked as the matching criteria because not all patients filled out both years smoked and the average number of daily cigarettes, which are needed to calculate the pack-year metric. LSC controls answered the “Number of Years Smoked” question from the “Lifestyle” survey with an answer less than or equal to 5, which is a well-published “years smoked” cutoff for light smokers22,23, while MSC controls were matched based on the exact number of years smoked. 
Fewer secondary lung cancer patients completed the “Number of Years Smoked” question, leading to a smaller sample size for the matched smoking controls in secondary lung cancer. We excluded answers of “PMI: Skip” and “PMI: Don’t Know” when calculating smoking-related demographic information such as the average daily cigarette number, the current average daily cigarette number, the daily smoking starting age, and the number of years smoked.
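The matching procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the record keys (`id`, `age`, `sex`, `years_smoked`), the function names, and the choice to sample controls without replacement are all assumptions made for the example. For matched smoking controls the key includes the exact number of years smoked; for light smoking controls the pool is assumed to be pre-filtered to participants reporting 5 or fewer years smoked, and matching uses age interval and sex only.

```python
import random
from collections import defaultdict


def age_bin(age, width=5):
    """Return the lower bound of the 5-year interval containing `age`."""
    return (age // width) * width


def matching_key(person, exact_years):
    """Build the matching key: age interval, sex, and optionally years smoked."""
    key = (age_bin(person["age"]), person["sex"])
    if exact_years:
        key += (person["years_smoked"],)
    return key


def match_controls(cases, pool, exact_years=True, seed=0):
    """Randomly pick one eligible control per case.

    `cases` and `pool` are lists of dicts with hypothetical keys
    'id', 'age', 'sex', and 'years_smoked'. Controls are drawn without
    replacement (an assumption); cases with no eligible control stay unmatched.
    """
    rng = random.Random(seed)
    eligible = defaultdict(list)
    for person in pool:
        eligible[matching_key(person, exact_years)].append(person)

    matches = {}
    for case in cases:
        candidates = eligible.get(matching_key(case, exact_years), [])
        if candidates:
            chosen = candidates.pop(rng.randrange(len(candidates)))
            matches[case["id"]] = chosen["id"]
    return matches


cases = [
    {"id": "c1", "age": 72, "sex": "F", "years_smoked": 35},
    {"id": "c2", "age": 61, "sex": "M", "years_smoked": 10},
]
pool = [
    {"id": "p1", "age": 70, "sex": "F", "years_smoked": 35},
    {"id": "p2", "age": 74, "sex": "F", "years_smoked": 3},
    {"id": "p3", "age": 66, "sex": "M", "years_smoked": 10},
]

# Matched smoking controls: same age interval, sex, and exact years smoked.
msc = match_controls(cases, pool, exact_years=True)

# Light smoking controls: restrict the pool to <= 5 years smoked first.
light_pool = [p for p in pool if p["years_smoked"] <= 5]
lsc = match_controls(cases, light_pool, exact_years=False)
```

In this toy example the second case finds no control in either pool, which mirrors how cases without a completed "Number of Years Smoked" answer or without an eligible counterpart reduce the size of the matched control groups.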
Statistical analysis
Odds ratios were used to generate forest plots, and the following R (v 4.2.2) packages were used for statistical analysis or plotting: epitools (v 0.5.10.1)24, tidyverse (v 1.3.2)25, patchwork (v 1.1.2)26, and ggplot2 (v 3.4.0)27. Mid p-values are commonly used in the analysis of odds ratios and are calculated by taking the midpoint of the range of p-values with a full description available in the documentation for the epitools24 R package. The epitools R package provides mid p-values, Fisher p-values, and Chi-squared p-values. Mid p-values are used for all p-values in this study except for the primary lung cancer vs. LSC and MSC vs. LSC comparisons for electronic cigarette use and in analysis of psychiatric comorbidities, in which cases Fisher exact p-values were used as the epitools program returned a value of 0 for the mid p-value. Bonferroni p-values were calculated by multiplying the shown p-value by the number of comparisons and are significant if they are less than 0.05. All p-values reported in results text are mid p-values.
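To make the quantities in this section concrete, the following Python sketch computes an odds ratio from a 2 × 2 table, a two-sided mid p-value, and a Bonferroni-adjusted p-value. It is an illustration under stated assumptions rather than a reproduction of the epitools routines: the odds ratio here is the simple sample estimate with a Wald interval, the mid p-value uses one common definition (doubling the smaller one-sided mid-p tail of the hypergeometric distribution), and capping the Bonferroni-adjusted value at 1 is a convention not stated in the text. All function names are hypothetical.

```python
from math import comb, exp, log, sqrt


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Sample odds ratio (a*d)/(b*c) with a Wald 95% confidence interval.

    Layout: a = cases with the factor, b = cases without,
            c = controls with the factor, d = controls without.
    All four cells must be non-zero for this simple estimate.
    """
    or_hat = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_hat) - z * se_log)
    upper = exp(log(or_hat) + z * se_log)
    return or_hat, lower, upper


def mid_p_two_sided(a, b, c, d):
    """Two-sided mid p-value for the null hypothesis of odds ratio = 1.

    With margins fixed, the top-left cell X is hypergeometric. The lower
    mid-p tail is P(X < a) + 0.5 * P(X = a); the two-sided value doubles
    the smaller of the lower and upper tails.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    smallest = max(0, row1 + col1 - n)  # smallest feasible value of X
    denom = comb(n, row1)

    def pmf(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lower_tail = sum(pmf(x) for x in range(smallest, a)) + 0.5 * pmf(a)
    return min(1.0, 2 * min(lower_tail, 1 - lower_tail))


def bonferroni(p, n_comparisons):
    """Multiply p by the number of comparisons, capped at 1 (assumed cap)."""
    return min(1.0, p * n_comparisons)


# Toy table: 3 of 4 cases and 1 of 4 controls report the factor.
or_hat, ci_low, ci_high = odds_ratio_wald(3, 1, 1, 3)
p_mid = mid_p_two_sided(3, 1, 1, 3)
p_adj = bonferroni(p_mid, n_comparisons=3)  # 3 comparisons is illustrative
```

An adjusted value is then compared with 0.05, as described above. The number of comparisons passed to `bonferroni` is purely illustrative here; the per-figure comparison counts are not stated in this section.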
Results
Lung cancer patient and control demographics
We conducted a matched case–control study to investigate the epidemiological and clinical parameters of primary and secondary lung cancer. This study included two age- and sex-matched controls for each case: a light smoking control (LSC) and a matched smoking control (MSC), with the latter having smoked for an equivalent number of years as their respective lung cancer patient. From a total of 221,125 patients in the All of Us database with available electronic health record (EHR) data, we identified 1451 patients with primary lung cancer (prevalence of 0.66%) and 1161 patients with secondary lung cancer (prevalence of 0.53%). The median age of lung cancer patients in our cohorts at the time of this study was 72 for primary lung cancer and 67 for secondary lung cancer (Table 1), which aligns with the literature suggesting a median age at lung cancer diagnosis of 70 for both men and women28. In our primary lung cancer cohort, 60.0% of patients reported female sex at birth, while 55.1% of secondary lung cancer patients did so. In our primary lung cancer cohort, 68.8% of patients were white, 16.7% were black or African American, 2.4% were Asian, and 7.7% were Hispanic. In our secondary lung cancer cohort, 68.6% of patients were white, 10.1% were black or African American, 3.2% were Asian, and 14.8% were Hispanic.
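The prevalence figures quoted above follow directly from the cohort counts. As a quick arithmetic check (the function name is hypothetical):

```python
def prevalence_percent(cases, population):
    """Prevalence as a percentage of the population with EHR data."""
    return 100 * cases / population


primary = round(prevalence_percent(1451, 221125), 2)    # 0.66
secondary = round(prevalence_percent(1161, 221125), 2)  # 0.53
print(primary, secondary)
```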
Table 1.
Demographic and smoking behavior for patients with primary or secondary lung cancer and controls.
Characteristic | Primary lung cancer (n = 1451) | Light smoking control without primary lung cancer (n = 1433) (LSC-1) | Matched smoking control without primary lung cancer (n = 1051) (MSC-1) | Secondary lung cancer (n = 1161) | Light smoking control without secondary lung cancer (n = 1127) (LSC-2) | Matched smoking control without secondary lung cancer (n = 541) (MSC-2) |
---|---|---|---|---|---|---|
Current median age (IQR) | 72 (65–78) | 71 (64–78) | 72 (66–78) | 67 (58–75) | 68 (59–75) | 70 (61–76) |
Sex at birth, female | 871 (60.0) | 859 (59.9) | 616 (58.6) | 640 (55.1) | 623 (55.3) | 270 (49.9) |
Race | ||||||
White | 999 (68.8) | 1142 (79.7) | 660 (62.8) | 797 (68.6) | 839 (74.4) | 344 (63.6) |
Black or African American | 243 (16.7) | 114 (8.0) | 237 (22.5) | 117 (10.1) | 120 (10.6) | 123 (22.7) |
Asian | 35 (2.4) | NA (NA) | NA (NA) | 37 (3.2) | NA (NA) | NA (NA) |
Other | 174 (12.0) | NA (NA) | NA (NA) | 210 (18.1) | NA (NA) | NA (NA) |
Ethnicity | ||||||
Hispanic or Latino | 112 (7.7) | 119 (8.3) | 105 (10.0) | 172 (14.8) | 113 (10.0) | 60 (11.1) |
Not Hispanic or Latino | 1288 (88.8) | 1274 (88.9) | 913 (86.9) | 958 (82.5) | 991 (87.9) | 472 (87.2) |
Other | 51 (3.5) | 40 (2.8) | 33 (3.1) | 31 (2.7) | 23 (2.0) | NA (NA) |
Income | ||||||
< $25k | 357 (24.6) | 194 (13.5) | 316 (30.1) | 222 (19.1) | 177 (15.7) | 172 (31.8) |
$25–50k | 252 (17.4) | 213 (14.9) | 177 (16.8) | 152 (13.1) | 146 (13.0) | 80 (14.8) |
$50–100k | 241 (16.6) | 355 (24.8) | 206 (19.6) | 243 (20.9) | 292 (25.9) | 91 (16.8) |
$100k+ | 272 (18.7) | 455 (31.8) | 141 (13.4) | 253 (21.8) | 358 (31.8) | 100 (18.5) |
Not reported | 329 (22.7) | 216 (15.1) | 211 (20.1) | 291 (25.1) | 154 (13.7) | 98 (18.1) |
Education | ||||||
Advanced degree | 311 (21.4) | 507 (35.4) | 178 (16.9) | 276 (23.8) | 353 (31.3) | 98 (18.1) |
College graduate | 286 (19.7) | 386 (26.9) | 200 (19.0) | 284 (24.5) | 317 (28.1) | 93 (17.2) |
Some college | 410 (28.3) | 318 (22.2) | 297 (28.3) | 285 (24.5) | 278 (24.7) | 151 (27.9) |
High school graduate or GED | 291 (20.1) | 161 (11.2) | 244 (23.2) | 210 (18.1) | 109 (9.7) | 118 (21.8) |
Other | 153 (10.5) | 61 (4.3) | 132 (12.6) | 106 (9.1) | 70 (6.2) | 81 (15.0) |
Smoking | ||||||
100 cigarettes lifetime = yes | 1057 (72.8) | 1401 (97.8) | 1037 (98.7) | 541 (46.6) | 1104 (98.0) | 534 (98.7) |
Daily cigarette median (IQR) | 20 (10–25) | 5 (3–10) | 17 (10–20) | 18 (10–20) | 5 (3–10) | 12 (5–20) |
Current daily cigarette median (IQR) | 0 (0–10) | 0 (0–0) | 0 (0–10) | 0 (0–7) | 0 (0–1) | 0 (0–7) |
Daily smoking starting age median (IQR) | 16 (14–18) | 18 (16.25–21) | 17 (15–19) | 17 (15–18) | 18 (16–21) | 17 (15–20) |
Median number of years smoked (IQR) | 35 (21.5–45) | 3 (2–5) | 35 (21–45) | 25 (11–40) | 3 (2–4) | 24.5 (11–40) |
Demographic and smoking behavior data is provided for primary and secondary lung cancer patients, as well as for light smoking (LSC) and matched smoking (MSC) controls. Parentheses indicate percentages except where noted as the interquartile range (IQR).
*NA is included if participant count < 20 to prevent patient identification and in accordance with All of Us policy.
The lifestyle survey data from participants offered insights into smoking behaviors and patterns. Of the primary lung cancer patients, 72.8% self-reported having smoked at least 100 cigarettes in their lifetime, compared to 46.6% of secondary lung cancer patients (Table 1). In the light smoking controls without primary lung cancer (LSC-1), the median years smoked was 3 (interquartile range [IQR]: 2–5). Primary lung cancer patients and matched smoking controls without primary lung cancer (MSC-1) reported a median of 35 years smoked (IQR: 21.5–45) and 35 years smoked (IQR: 21–45), respectively. In the light smoking controls without secondary lung cancer (LSC-2), the median years smoked was 3 (IQR: 2–4). Secondary lung cancer patients and matched smoking controls without secondary lung cancer (MSC-2) reported a median of 25 years smoked (IQR: 11–40) and 24.5 years smoked (IQR: 11–40), respectively.
Differences in access to healthcare in primary and secondary lung cancer
After defining our cases and controls, we investigated several macro-level healthcare access factors, as well as patient-specific information such as smoking-related behavior and psychiatric comorbidities. We assessed the results from several healthcare access survey questions, including whether a patient could afford their co-pay, deductible, mental health counseling, or follow-up care, and whether they were worried about paying. The results of our analysis showed that primary lung cancer patients had significantly lower odds of reporting that they could not afford specialist or follow-up care, compared to MSC-1 controls, with odds ratios of 0.57 (p = 0.046) and 0.41 (p = 0.0038), respectively (Fig. 1, top panel). However, after Bonferroni’s multiple comparisons adjustment, these associations did not reach significance. In contrast, MSC-1 controls had significantly higher odds of reporting that they could not afford specialist or follow-up care, or mental health counseling, compared to LSC-1 controls, with odds ratios of 2.11 (p = 0.0073), 3.34 (p = 8.74e−05), and 1.95 (p = 0.048), respectively. For secondary lung cancer patients, cases had significantly higher odds of reporting that they were somewhat or very worried about paying, compared to LSC-2 controls, with an odds ratio of 1.31 (p = 0.030) (Fig. 1, bottom panel). However, none of the odds ratios in the healthcare access analysis for secondary lung cancer patients met the stricter Bonferroni significance threshold.
Figure 1.
Healthcare access in primary and secondary lung cancer patients. Odds ratios (± standard error) generated comparing healthcare access patient-reported metrics in primary and secondary lung cancer patients to light smoking (LSC) and matched smoking (MSC) controls. The reference group (e.g., (ref: MSC)), mid-p value, and Bonferroni-corrected p-values are reported for each comparison.
Family history patterns in primary and secondary lung cancer
In our investigation of familial history in primary and secondary lung cancer patients, we found that smoking status, rather than lung cancer diagnosis, was associated with an increased odds of having a sibling or father with primary lung cancer (Fig. 2, top panel). The odds of having a sibling with lung cancer comparing both MSC-1 controls and primary lung cancer patients with LSC-1 controls were 2.31 (p = 0.0020) and 3.17 (p = 4.54e−07), respectively, with both p-values remaining significant after Bonferroni correction. Similarly, the odds of having a father with lung cancer comparing both MSC-1 controls and primary lung cancer patients with LSC-1 controls were 1.66 (p = 0.018) and 1.82 (p = 0.0017), respectively, with the latter maintaining significance after Bonferroni correction. Although the odds of having a mother or grandparent with lung cancer were also increased when comparing our primary lung cancer patients to LSC-1 controls, with odds ratios of 1.75 (p = 0.0087) and 1.74 (p = 0.0083), respectively, neither remained significant after Bonferroni correction. For patients with secondary lung cancer, the odds of having a father with lung cancer were increased with an odds ratio of 1.66 (p = 0.034) compared to LSC-2 controls, while MSC-2 controls compared to LSC-2 controls had an odds ratio of 0.31 (p = 0.00065) of having a grandparent with lung cancer (Fig. 2, bottom panel).
Figure 2.
Family history in primary and secondary lung cancer patients. Odds ratios (± standard error) generated comparing family history patient-reported metrics in primary and secondary lung cancer patients to light smoking (LSC) and matched smoking (MSC) controls. The reference group (e.g., (ref: MSC)), mid-p value, and Bonferroni-corrected p-values are reported for each comparison.
Smoking-related behavior in primary and secondary lung cancer
We investigated several smoking-related behaviors, including electronic cigarette use, smokeless tobacco use, hookah use, cigar smoking, and alcohol use, in both primary and secondary lung cancer patients. We observed that primary lung cancer patients had a significantly lower odds of using alcohol compared to all comparison groups (Fig. 3, top panel). Additionally, primary lung cancer patients had a significantly lower odds of using cigars compared to both MSC-1 and LSC-1 controls, with odds ratios of 0.78 (p = 0.0027) and 0.79 (p = 0.0017), respectively, which retained significance after Bonferroni correction. Interestingly, electronic cigarette use was found to be associated with smoking status rather than lung cancer status, with both primary lung cancer patients and MSC-1 controls having a greater odds of using electronic cigarettes compared to LSC-1 controls, with odds ratios of 3.85 (p = 8.55e−22) and 4.24 (p = 1.46e−22), respectively. These associations also retained significance after Bonferroni correction. Furthermore, primary lung cancer patients demonstrated a nominally significant increased odds of having made a serious smoking quit attempt compared to both MSC-1 and LSC-1 controls, with odds ratios of 1.44 (p = 0.028) and 1.42 (p = 0.026), respectively.
Figure 3.
Smoking-related behaviors in primary and secondary lung cancer patients. Odds ratios (± standard error) generated comparing smoking-related behaviors from patient-reported metrics in primary and secondary lung cancer patients to light smoking (LSC) and matched smoking (MSC) controls. The reference group (e.g., (ref: MSC)), mid-p value, and Bonferroni-corrected p-values are reported for each comparison.
In the analysis of smoking-related behaviors in secondary lung cancer patients, cases had significantly lower odds of using alcohol than both comparison groups after Bonferroni correction (Fig. 3, bottom panel). When compared to both MSC-2 and LSC-2 controls, secondary lung cancer patients also demonstrated significantly lower odds of using cigars after Bonferroni correction, with odds ratios of 0.52 (p = 1.1e−09) and 0.57 (p = 4.23e−11), respectively. Electronic cigarette use was associated with smoking status, rather than lung cancer status. MSC-2 controls had greater odds of using electronic cigarettes than LSC-2 controls (odds ratio 2.70; p = 1.06e−08), as did secondary lung cancer patients (odds ratio 1.76; p = 0.00026). These associations retained significance after Bonferroni correction.
Primary and secondary lung cancer are associated with significant psychiatric comorbidities
We investigated the odds of lung cancer patients having psychiatric conditions in their electronic health record (EHR) compared to their controls. The analyzed conditions included anxiety, bipolar disorder, depressive disorders, disorders caused by alcohol, insomnia, schizophrenia, and substance use disorder. We found that primary lung cancer patients had significantly higher odds of having substance use disorder, insomnia, bipolar disorder, disorder caused by alcohol, depressive disorder, and anxiety compared to their LSC-1 controls. Each of these odds ratios (except for bipolar disorder) remained significant after Bonferroni correction (Fig. 4, top panel). MSC-1 controls had significantly greater odds of having a substance use disorder, bipolar disorder, disorder caused by alcohol, anxiety, or a depressive disorder when compared to LSC-1 controls. Furthermore, primary lung cancer patients had significantly higher odds of having anxiety compared to MSC-1 controls (OR: 1.39; p = 0.00052). Interestingly, smoking status was associated with comorbid substance use disorder, bipolar disorder, disorder caused by alcohol, and depressive disorder, instead of primary lung cancer status. Both MSC-1 controls versus LSC-1 controls and primary lung cancer versus LSC-1 controls had a greater odds of having these psychiatric comorbidities. Additionally, secondary lung cancer patients had significantly higher odds of having substance use disorder, insomnia, and anxiety compared to their LSC-2 controls, and these odds ratios retained significance after Bonferroni multiple comparisons adjustment (Fig. 4, bottom panel). Furthermore, secondary lung cancer patients versus the MSC-2 controls had significantly higher odds of having comorbid insomnia and anxiety.
Figure 4.
Psychiatric disease burden in primary and secondary lung cancer patients. Odds ratios (± standard error) generated comparing psychiatric disease burden from patient EHR data in primary and secondary lung cancer patients to light smoking (LSC) and matched smoking (MSC) controls. The reference group (e.g., (ref: MSC)), Fisher p-value, and Bonferroni-corrected p-values are reported for each comparison.
Discussion
In this cross-sectional, case–control study, we examined various epidemiological factors and psychiatric comorbidities in primary and secondary lung cancer. Previous case–control studies on primary lung cancer have investigated factors such as diet29,30, occupational exposure31,32, physical activity33, medications34,35, cannabis use36, genetic polymorphisms37, and various other factors. However, our study is the first to report key epidemiological information in lung cancer from the All of Us Research Program, which has a focus on recruiting historically underrepresented individuals38. Additionally, our dual control study design allowed us to differentiate the effect of smoking from the effect of lung cancer when examining variables of interest (Fig. 5). We investigated differences in healthcare access, family history, smoking-related behavior, and psychiatric disease burden in our cohort of primary and secondary lung cancer, as well as in the light smoking and matched smoking controls.
Figure 5.
Dual control study design. Schema depicting the dual control design utilized in the present study.
The issue of healthcare access and equity is of significant concern in cancer research. In a previous study, it was discovered that cancer death rates in men and women are 13% and 3% higher, respectively, in poorer counties compared to more affluent counties39. Furthermore, the same study found that non-Hispanic whites have higher 5-year cancer survival rates than African American, American Indian/Alaskan Native, and Asian/Pacific Islander men39. These findings underscore the need to identify and remove barriers to healthcare. Our study produced positive results, as none of the examined lung cancer groups reached Bonferroni-corrected levels of significance for access to care metrics such as increased worry about payment or concern about high copays or deductibles (Fig. 1). While our analysis was conducted on the entire cohort of primary and secondary lung cancer patients, future studies can stratify by race, ethnicity, and income to identify potential nuanced differences between these groups regarding access to care metrics.
Family history in primary lung cancer plays an important, yet not fully characterized, role in determining a patient’s predisposition to primary lung cancer40. Presently, our results demonstrate that while a first-degree relative with primary lung cancer can increase the odds of a patient having primary lung cancer (Fig. 2, top panel), family history understandably cannot explain the entire risk. Interestingly, we also saw that an increased odds of having a sibling or father with lung cancer was associated more with smoking behavior, with both primary lung cancer patients and their matched smoking controls compared to light smoking controls having a similar odds of having a sibling or father with lung cancer. This suggests a strong role for the environment in the development of lung cancer, and various literature demonstrates that smoking behavior is correlated in families41–43. One major limitation of our family history analysis, however, is that the data is self-reported through a survey, meaning (1) there was no stratification between primary and secondary lung cancer in relatives and (2) not all cases of familial lung cancer will be captured.
In addition to cigarette smoking, we investigated other smoking-related behaviors, such as alcohol use44, electronic cigarette use14, cigar smoking, hookah use, and smokeless tobacco use. Both primary and secondary lung cancer patients showed a lower odds of using alcohol or cigars, and secondary lung cancer patients also showed a lower odds of using smokeless tobacco compared to the MSC-2 control (Fig. 3). Interestingly, in our primary lung cancer patient analysis, electronic cigarette use was associated with smoking status, irrespective of whether or not the patient had primary lung cancer. Namely, smokers (with and without lung cancer) were more likely to use electronic cigarettes compared to their light smoking counterparts. Electronic cigarette use (i.e., vaping) has increased significantly in recent years, and smokers may view vaping as a safer alternative, which can explain the trend observed in this study14,45. The safety of vaping is under active investigation, and many researchers are concerned about the rapid rise in patients presenting with e-cigarette use-associated lung injury (EVALI)45. While vaping may be a more benign alternative to smoking, evidence strongly suggests that vaping has its own associated risks.
Finally, the presence of significant psychiatric comorbidities is well-known in cancer46, including lung cancer47. By querying electronic health records, we wanted to understand whether or not smoking and/or lung cancer increased the odds of having a comorbid psychiatric condition and by how much. The results from this analysis demonstrated that primary lung cancer patients have a significantly higher odds of having comorbid substance use disorder, insomnia, bipolar disorder, disorder caused by alcohol, depressive disorder, and anxiety when compared to their LSC-1 controls. Secondary lung cancer patients had a significantly higher odds of having substance use disorder, insomnia, and anxiety compared to their LSC-2 controls. However, smoking, rather than lung cancer, appeared to be associated with an increased odds of particular psychiatric comorbidities, such as substance use disorder, bipolar disorder, disorder caused by alcohol, and depressive disorder in primary lung cancer patients. This suggests that much of the psychiatric burden associated with lung cancer may be due to smoking status, rather than lung cancer diagnosis. Of the studied conditions, only secondary lung cancer patients versus matched smoking controls demonstrated a significantly increased odds of comorbid anxiety and insomnia conditions. Additionally, for these two conditions in secondary lung cancer, no significant differences were seen between the matched and light smoking controls, suggesting that the increase in odds was due to secondary lung cancer. Psychiatric conditions like anxiety and depression are well-documented in primary lung cancer47, and studies have demonstrated that the mood and anxiety symptoms in lung cancer patients may exceed those of other cancer patients as a result of negative psychosocial and physical (e.g., symptom-related) factors48.
Additionally, perceived negative stigma surrounding primary lung cancer, which is correlated with depressive and anxious symptoms, has been associated with greater psychiatric symptom severity48. Furthermore, lung cancer patients may have impaired pulmonary function, leading to lower quality of life (QoL) and increased psychiatric symptom severity48,49. Understanding the relationship between lung cancer, smoking, and psychiatric disease may help oncologists collaborate closely with mental health professionals to provide well-rounded, comprehensive care to lung cancer patients.
This study did have several limitations, primarily related to known challenges that occur with extracting data from electronic health records and patient surveys. First, in this study, some patients had EHR codes for both primary and secondary lung cancer, leading to an overlap between the primary and secondary lung cancer cohorts of approximately 300 patients. Another challenge is that secondary lung cancer had fewer smokers and fewer patients who filled out the survey indicating the number of years smoked, making it more challenging to assign a full suite of matched smoking controls. Notably, self-reported data from patient surveys may also be subject to bias or inaccuracies. Not every patient in the All of Us database has EHR data, and not all patients who have consented to provide their EHR data have all of their EHR data successfully integrated into the All of Us database, meaning there is a potential for errors or inconsistencies in the coding and categorization of EHR data. Given the size and diversity of the All of Us data network, there are several obstacles related to data integration, and the All of Us team has implemented data quality tools to regularly evaluate, quantify, and communicate about EHR data quality issues50. The present study’s cross-sectional design prevents us from making conclusions that establish a temporal relationship between smoking or lung cancer diagnosis and the diagnosis of a comorbid psychiatric condition. Moreover, there is a potential for confounding variables that were not included in the analysis, such as environmental exposures or other health conditions, and the limited number of variables included in the analysis may not fully capture the complex interactions between various epidemiological and clinical factors in lung cancer development and outcomes.
Finally, the study population may not be representative of the general population, and this analysis should be repeated as the All of Us Research Program recruits more participants. Of particular note, 60% of participants in the All of Us v6 data release who completed the Basics survey identified as female.
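As an illustration of the cohort overlap noted among the limitations, the following minimal Python sketch shows one way to count participants who appear in both the primary and secondary lung cancer cohorts and to derive mutually exclusive cohorts for a sensitivity analysis. The participant IDs and the function name `split_cohorts` are hypothetical; they are not drawn from the All of Us Researcher Workbench or from the analysis code used in this study.

```python
# Hypothetical participant IDs for illustration only; these are NOT
# real All of Us identifiers and do not reflect the study's cohorts.
primary_ids = {101, 102, 103, 104, 105}
secondary_ids = {104, 105, 106, 107}


def split_cohorts(primary, secondary):
    """Return (primary_only, secondary_only, overlap) as sets of IDs."""
    overlap = primary & secondary          # participants coded for both
    return primary - overlap, secondary - overlap, overlap


primary_only, secondary_only, overlap = split_cohorts(primary_ids, secondary_ids)

# Report how large the overlap is relative to each original cohort.
print(f"overlap: {len(overlap)} participants")
print(f"share of primary cohort: {len(overlap) / len(primary_ids):.0%}")
print(f"share of secondary cohort: {len(overlap) / len(secondary_ids):.0%}")
```

Under these assumptions, re-running an association analysis on `primary_only` and `secondary_only` would indicate whether findings are sensitive to participants who carry both codes.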
In conclusion, our cross-sectional, case–control study characterizes primary and secondary lung cancer in the All of Us database, providing information on demographics, healthcare access, family history, smoking-related behaviors, and psychiatric conditions. Future studies can draw on the breadth of data in the All of Us database, including genetic information, to probe the combined effect of genetic, environmental, clinical, and epidemiological factors on the development of lung cancer. Such studies can also combine genetic models (e.g., polygenic risk scores) with models built from EHR information to improve predictions of disease development, progression, and management. The All of Us database will be an excellent tool to help researchers answer a wide range of these questions.
Acknowledgements
The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers: 1 OT2 OD026549; 1 OT2 OD026554; 1 OT2 OD026557; 1 OT2 OD026556; 1 OT2 OD026550; 1 OT2 OD 026552; 1 OT2 OD026553; 1 OT2 OD026548; 1 OT2 OD026551; 1 OT2 OD026555; IAA #: AOD 16037; Federally Qualified Health Centers: HHSN 263201600085U; Data and Research Center: 5 U2C OD023196; Biobank: 1 U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: 1 U24 OD023163; Communications and Engagement: 3 OT2 OD023205; 3 OT2 OD023206; and Community Partners: 1 OT2 OD025277; 3 OT2 OD025315; 1 OT2 OD025337; 1 OT2 OD025276. In addition, the All of Us Research Program would not be possible without the partnership of its participants. V.S. and R.P. would like to thank the Baylor College of Medicine Medical Scientist M.D./Ph.D. training program for their support.
Author contributions
V.S., J.B., R.P., D.H., and L.N. designed the study and analyzed the data. V.S. carried out the implementation. V.S. wrote the manuscript with input from all authors. J.B., Y.H., and C.A. were in charge of overall direction and planning.
Funding
Our study was supported by the National Institutes of Health (NIH) for Integrative Analysis of Lung Cancer Etiology and Risk (U19CA203654) and Sequencing Familial Lung Cancer (R01CA243483). C.I.A. is a Research Scholar of the Cancer Prevention and Research Institute of Texas (CPRIT; award RR170048).
Data availability
Data from this program are accessible at http://www.allofus.nih.gov, and this study was conducted on version 6 of the data utilizing the All of Us Researcher Workbench.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-023-37585-0.
References
- 1. Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J. Clin. 2022;72(1):7–33. doi: 10.3322/caac.21708.
- 2. Medeiros B, Allan AL. Molecular mechanisms of breast cancer metastasis to the lung: clinical and experimental perspectives. Int. J. Mol. Sci. 2019;20(9):2272. doi: 10.3390/ijms20092272.
- 3. Riihimäki M, Hemminki A, Sundquist J, Hemminki K. Patterns of metastasis in colon and rectal cancer. Sci. Rep. 2016;6:29765. doi: 10.1038/srep29765.
- 4. Dudani S, de Velasco G, Wells JC, et al. Evaluation of clear cell, papillary, and chromophobe renal cell carcinoma metastasis sites and association with survival. JAMA Netw. Open. 2021;4(1):e2021869. doi: 10.1001/jamanetworkopen.2020.21869.
- 5. Bozkurt M, Aghalarov S, Atci MM, Selvi O, Canat HL. A new biomarker for lung metastasis in non-seminomatous testicular cancer: De Ritis Ratio. Aktuelle Urol. 2022;53(6):540–544. doi: 10.1055/a-1926-9698.
- 6. Tsuyoshi H, Yoshida Y. Molecular biomarkers for uterine leiomyosarcoma and endometrial stromal sarcoma. Cancer Sci. 2018;109(6):1743–1752. doi: 10.1111/cas.13613.
- 7. Islami F, Goding Sauer A, Miller KD, et al. Proportion and number of cancer cases and deaths attributable to potentially modifiable risk factors in the United States. CA Cancer J. Clin. 2018;68(1):31–54. doi: 10.3322/caac.21440.
- 8. Zhou W, Liu G, Hung RJ, et al. Causal relationships between body mass index, smoking and lung cancer: Univariable and multivariable Mendelian randomization. Int. J. Cancer. 2021;148(5):1077–1086. doi: 10.1002/ijc.33292.
- 9. Christiani DAC, Broaddus C, et al. Murray and Nadel’s Textbook of Respiratory Medicine. Elsevier; 2021. Lung cancer: Epidemiology, chapter 74; pp. 1018–1028.
- 10. Schwartz AG, Cote ML. Epidemiology of lung cancer. Adv. Exp. Med. Biol. 2016;893:21–41. doi: 10.1007/978-3-319-24223-1_2.
- 11. Haddad DN, Sandler KL, Henderson LM, Rivera MP, Aldrich MC. Disparities in lung cancer screening: A review. Ann. Am. Thorac. Soc. 2020;17(4):399–405. doi: 10.1513/AnnalsATS.201907-556CME.
- 12. Bernstein E, Bade BC, Akgün KM, Rose MG, Cain HC. Barriers and facilitators to lung cancer screening and follow-up. Semin. Oncol. 2022. doi: 10.1053/j.seminoncol.2022.07.004.
- 13. Hartmann-Boyce J, Chepkin SC, Ye W, Bullen C, Lancaster T. Nicotine replacement therapy versus control for smoking cessation. Cochrane Database Syst. Rev. 2018;5(5):CD000146. doi: 10.1002/14651858.CD000146.pub5.
- 14. Bracken-Clarke D, Kapoor D, Baird AM, et al. Vaping and lung cancer – A review of current data and recommendations. Lung Cancer. 2021;153:11–20. doi: 10.1016/j.lungcan.2020.12.030.
- 15. Soneji SS, Sung HY, Primack BA, Pierce JP, Sargent JD. Quantifying population-level health benefits and harms of e-cigarette use in the United States. PLoS One. 2018;13(3):e0193328. doi: 10.1371/journal.pone.0193328.
- 16. Coughlin LN, Wilson SM, Erwin MC, Beckham JC, VA Mid-Atlantic MIRECC Workgroup, Calhoun PS. Cigarette smoking rates among veterans: Association with rurality and psychiatric disorders. Addict. Behav. 2019;90:119–123. doi: 10.1016/j.addbeh.2018.10.034.
- 17. Zhuo C, Zhuang H, Gao X, Triplett PT. Lung cancer incidence in patients with schizophrenia: Meta-analysis. Br. J. Psychiatry. 2019;215(6):704–711. doi: 10.1192/bjp.2019.23.
- 18. Sikjær MG, Løkke A, Hilberg O. The influence of psychiatric disorders on the course of lung cancer, chronic obstructive pulmonary disease and tuberculosis. Respir. Med. 2018;135:35–41. doi: 10.1016/j.rmed.2017.12.012.
- 19. Toumazis I, Bastani M, Han SS, Plevritis SK. Risk-based lung cancer screening: A systematic review. Lung Cancer. 2020;147:154–186. doi: 10.1016/j.lungcan.2020.07.007.
- 20. Alonso A, Alam AB, Kamel H, et al. Epidemiology of atrial fibrillation in the All of Us Research Program. PLoS One. 2022;17(3):e0265498. doi: 10.1371/journal.pone.0265498.
- 21. Janjigian YY, McDonnell K, Kris MG, et al. Pack-years of cigarette smoking as a prognostic factor in patients with stage IIIB/IV nonsmall cell lung cancer. Cancer. 2010;116(3):670–675. doi: 10.1002/cncr.24813.
- 22. Schane RE, Ling PM, Glantz SA. Health effects of light and intermittent smoking: A review. Circulation. 2010;121(13):1518–1522. doi: 10.1161/CIRCULATIONAHA.109.904235.
- 23. Husten CG. How should we define light or intermittent smoking? Does it matter? Nicotine Tob. Res. 2009;11(2):111–121. doi: 10.1093/ntr/ntp010.
- 24. Aragon, T. epitools: Epidemiology Tools R package version 0.5-10.1. https://CRAN.R-project.org/package=epitools (2020) (published online).
- 25. Wickham H, Averick M, Bryan J, et al. Welcome to the Tidyverse. J. Open Source Softw. 2019;4(43):1686. doi: 10.21105/joss.01686.
- 26. Pedersen, T. patchwork: The Composer of Plots R package version 1.1.2. https://CRAN.R-project.org/package=patchwork (2022) (published online).
- 27. Wickham H. Ggplot2. Springer; 2009.
- 28. Torre LA, Siegel RL, Jemal A. Lung cancer statistics. Adv. Exp. Med. Biol. 2016;893:1–19. doi: 10.1007/978-3-319-24223-1_1.
- 29. Sadeghi A, Parastouei K, Seifi S, et al. Inflammatory potential of diet and odds of lung cancer: A case–control study. Nutr. Cancer. 2022;74(8):2859–2867. doi: 10.1080/01635581.2022.2036770.
- 30. Krusinska B, Hawrysz I, Wadolowska L, et al. Associations of Mediterranean diet and a posteriori derived dietary patterns with breast and lung cancer risk: A case–control study. Nutrients. 2018;10(4):470. doi: 10.3390/nu10040470.
- 31. Austin H, Delzell E, Lally C, Rotimi C, Oestenstad K. A case–control study of lung cancer at a foundry and two engine plants. Am. J. Ind. Med. 1997;31(4):414–421. doi: 10.1002/(sici)1097-0274(199704)31:4<414::aid-ajim6>3.0.co;2-v.
- 32. Hosseini B, Olsson A, Bouaoun L, et al. Lung cancer risk in relation to jobs held in a nationwide case–control study in Iran. Occup. Environ. Med. 2022;79(12):831–838. doi: 10.1136/oemed-2022-108463.
- 33. Brizio MLR, Hallal PC, Lee IM, Domingues MR. Physical activity and lung cancer: A case–control study in Brazil. J. Phys. Act. Health. 2016;13(3):257–261. doi: 10.1123/jpah.2014-0571.
- 34. Suissa S, Dell’aniello S, Vahey S, Renoux C. Time-window bias in case–control studies: Statins and lung cancer. Epidemiology. 2011;22(2):228–231. doi: 10.1097/EDE.0b013e3182093a0f.
- 35. Kristensen KB, Hicks B, Azoulay L, Pottegård A. Use of ACE (angiotensin-converting enzyme) inhibitors and risk of lung cancer: A nationwide nested case–control study. Circ. Cardiovasc. Qual. Outcomes. 2021;14(1):e006687. doi: 10.1161/CIRCOUTCOMES.120.006687.
- 36. Aldington S, Harwood M, Cox B, et al. Cannabis use and risk of lung cancer: A case–control study. Eur. Respir. J. 2008;31(2):280–286. doi: 10.1183/09031936.00065707.
- 37. Ji Y, Yang Y, Yin Z. Polymorphisms in lncRNA CCAT1 on the susceptibility of lung cancer in a Chinese northeast population: A case–control study. Cancer Med. 2023;12(1):500–512. doi: 10.1002/cam4.4902.
- 38. Denny JC, Rutter JL, et al. The “All of Us” Research Program. N. Engl. J. Med. 2019;381(7):668–676. doi: 10.1056/NEJMsr1809937.
- 39. Ward E, Jemal A, Cokkinides V, et al. Cancer disparities by race/ethnicity and socioeconomic status. CA Cancer J. Clin. 2004;54(2):78–93. doi: 10.3322/canjclin.54.2.78.
- 40. Dragani TA, Manenti G, Pierotti MA. Polygenic inheritance of predisposition to lung cancer. Ann. Ist. Super. Sanita. 1996;32(1):145–150.
- 41. Joung MJ, Han MA, Park J, Ryu SY. Association between family and friend smoking status and adolescent smoking behavior and e-cigarette use in Korea. Int. J. Environ. Res. Public Health. 2016. doi: 10.3390/ijerph13121183.
- 42. Vàzquez-Nava F, Peinado-Herreros JM, Saldívar-González AH, et al. Association between family structure, parental smoking, friends who smoke, and smoking behavior in adolescents with asthma. ScientificWorldJournal. 2010;10:62–69. doi: 10.1100/tsw.2010.10.
- 43. Schuck K, Otten R, Engels RCME, Barker ED, Kleinjan M. Bidirectional influences between parents and children in smoking behavior: A longitudinal full-family model. Nicotine Tob. Res. 2013;15(1):44–51. doi: 10.1093/ntr/nts082.
- 44. DiFranza JR, Guerrera MP. Alcoholism and smoking. J. Stud. Alcohol. 1990;51(2):130–135. doi: 10.15288/jsa.1990.51.130.
- 45. Smith ML, Gotway MB, Crotty Alexander LE, Hariri LP. Vaping-related lung injury. Virchows Arch. 2021;478(1):81–88. doi: 10.1007/s00428-020-02943-0.
- 46. Quante A, Schulz K, Fissler M. Psychiatric comorbidities in cancer patients: Acute interventions by the psychiatric consultation liaison service. Wien Med. Wochenschr. 2020;170(13–14):348–356. doi: 10.1007/s10354-020-00739-0.
- 47. Chen HM, Tsai CM, Wu YC, Lin KC, Lin CC. Randomised controlled trial on the effectiveness of home-based walking exercise on anxiety, depression and cancer-related symptoms in patients with lung cancer. Br. J. Cancer. 2015;112(3):438–445. doi: 10.1038/bjc.2014.612.
- 48. Morrison EJ, Novotny PJ, Sloan JA, et al. Emotional problems, quality of life, and symptom burden in patients with lung cancer. Clin. Lung Cancer. 2017;18(5):497–503. doi: 10.1016/j.cllc.2017.02.008.
- 49. Sterzi S, Cesario A, Cusumano G, et al. How best to assess the quality of life in long-term survivors after surgery for NSCLC? Comparison between clinical predictors and questionnaire scores. Clin. Lung Cancer. 2013;14(1):78–87. doi: 10.1016/j.cllc.2012.04.002.
- 50. Engel N, Wang H, Jiang X, et al. EHR data quality assessment tools and issue reporting workflows for the “All of Us” Research Program clinical data research network. AMIA Annu. Symp. Proc. 2022;2022:186–195.