Abstract
Background
Although lung cancer incidence rates according to smoking status, sex, and detailed race/ethnicity have not been available, it is estimated that more than half of Asian American, Native Hawaiian, and Pacific Islander (AANHPI) females with lung cancer have never smoked.
Methods
We calculated age-adjusted incidence rates for lung cancer according to smoking status and detailed race/ethnicity among females, focusing on AANHPI ethnic groups, and assessed relative incidence across racial/ethnic groups. We used a large-scale dataset that integrates data from electronic health records from 2 large health-care systems (Sutter Health in Northern California and Kaiser Permanente Hawai’i) linked to state cancer registries for incident lung cancer diagnoses between 2000 and 2013. The study population included 1 222 694 females (n = 244 147 AANHPI), 3297 of whom were diagnosed with lung cancer (n = 535 AANHPI).
Results
Incidence of lung cancer among never-smoking AANHPI as an aggregate group was 17.1 per 100 000 (95% confidence interval [CI] = 14.9 to 19.4) but varied widely across ethnic groups. Never-smoking Chinese American females had the highest rate (22.8 per 100 000, 95% CI = 17.3 to 29.1). Except for Japanese American females, incidence among every never-smoking AANHPI female ethnic group was higher than that of never-smoking non-Hispanic White females, from 66% greater among Native Hawaiian females (incidence rate ratio = 1.66, 95% CI = 1.03 to 2.56) to more than 100% greater among Chinese American females (incidence rate ratio = 2.26, 95% CI = 1.67 to 3.02).
Conclusions
Our study revealed high rates of lung cancer among most never-smoking AANHPI female ethnic groups. Our approach illustrates the use of innovative data integration to dispel the myth that AANHPI females are at overall reduced risk of lung cancer and demonstrates the need to disaggregate this highly diverse population.
Lung cancer among persons who have never smoked is the seventh leading cause of cancer mortality in the United States (1,2). Moreover, an increasing proportion of lung cancers are diagnosed among persons who have never smoked (1,3). Our 2007 study estimated incidence rates of non-small cell lung cancer (NSCLC) by sex and smoking status using data from several large, prospective cohorts and reported substantially higher rates among females than males who have never smoked (ranging from 15.2 to 20.8 per 100 000 in females and 4.8 to 13.7 in males) (4). Furthermore, we previously reported 24% of non-Hispanic (NH) White females diagnosed with lung cancer have never smoked, and nearly double that proportion (44%) of Asian American, Native Hawaiian, and Pacific Islander (AANHPI) females with lung cancer have never smoked (5). Similarly, in a study of seven state-based cancer registries with smoking information abstracted from medical records, 57% of AANHPI females with lung cancer had never smoked (6). Smoking-specific lung cancer incidence rates, however, have not been available.
AANHPIs are the fastest growing racial/ethnic population in the United States, increasing by 45% between 2000 and 2019 (7-9). It is a heterogeneous group with people from 30 countries who speak more than 100 languages and represent great diversity in socioeconomic levels, cultural beliefs and behaviors, English proficiency, immigration experience, generational status, and acculturation (10-13). In addition, the proportion of AANHPIs identifying as multiple races or ethnicities is growing, increasing 60% among Asian Americans and 44% among Native Hawaiian and Pacific Islanders (NHPIs) from 2000 to 2010 (7,11). Despite this diversity, AANHPIs have historically been studied as an aggregate, which masks health inequalities across ethnicities. For example, we previously reported that the proportion of AANHPI females with lung cancer who never smoked varies from a high of 88% among Chinese American females to a low of 16% among Native Hawaiian females (5). In addition, a study of Surveillance, Epidemiology, and End Results (SEER) data for diagnoses from 1990 to 2008 showed that lung cancer incidence rates were lower for most Asian American ethnic groups compared with the NH White group but varied substantially, from 12.4 per 100 000 for Asian Indian and Pakistani American females to 31.8 for Vietnamese American females (14). Native Hawaiian females, on the other hand, had rates similar to NH White females (51.6 and 56.9, respectively) in a SEER-13 study (15). These SEER studies, however, could not report rates by smoking status (13,16).
In this study, we used a linked data resource, including 1 222 694 females (244 147 AANHPI), that integrates electronic health record (EHR) data on race, ethnicity, and smoking status with cancer registry data on incident lung cancer diagnoses (5). We examined lung cancer incidence among females by joint race/ethnicity and smoking status focusing on specific AANHPI single and multi-racial/ethnic groups. We aimed to fill a critical gap in knowledge that has limited our ability to quantify the burden of lung cancer among never-smoking AANHPI females.
Methods
Study Approval
All aspects of the study protocol were approved by the institutional review boards of the State of California Protection for Human Subjects; University of California, San Francisco; Sutter Health; Kaiser Permanente Hawai’i; and the Hawai’i Medical Association.
Study Population
Development of the cohort and dataset has been described (5). The cohort included adults 18 years of age or older with at least one in-person visit to Sutter Health in Northern California or Kaiser Permanente Hawai’i between 2000 and 2013. Individuals were excluded if they did not have sex classified as male or female, had no social history (the portion of the EHR related to patient behavioral, familial, or occupational characteristics) within the study period, had a history of lung cancer at the time of their baseline visit (ie, the first in-person visit during the study period), did not have EHR record of residence in California (for Sutter Health) or Hawai’i (for Kaiser Permanente Hawai’i) during the study period, had a date of death (per the EHR) before baseline or unknown date of death, or who had no follow-up (ie, baseline date was equivalent to the study end date, date of diagnosis, or date of death). The final pooled cohort comprised 2 211 476 individuals (n = 1 275 838 females; n = 935 638 males). Although our proposed study was intended to investigate the incidence of lung cancer among females who never smoked, the rates among males are also presented in the Supplementary Materials (available online) for some analyses.
EHR Data
Collection and harmonization of EHR data on self-reported race, ethnicity, and smoking status have been described (5). Our race/ethnicity classification algorithm prioritized small AANHPI populations and distinguished between single and multiple races (5). Our AANHPI groupings are any Native Hawaiian, Pacific Islander, Asian Indian only, Chinese only, Filipinx only, Japanese only, Korean only, Vietnamese only, Other single Asian ethnic group only, multiple Asian ethnic groups only (Asian, multiple group), or Asian and non-Asian race (Asian and non-Asian multiple). Individuals not indicating AANHPI were categorized as NH White, Black, Hispanic, multiple (non-AANHPI) races, Other (including American Indian and Alaska Native), and unknown/missing.
We determined smoking status by extracting up to 2 nonmissing smoking status values (current, former, passive, never) from EHR social history: 1) the first available value recorded on the day of or after the baseline visit and 2) the last available value recorded before date of lung cancer diagnosis, death, or study end (December 31, 2013) (5). If a patient’s extracted smoking status values were discordant, we applied a simple algorithm to classify a single smoking status value of “ever” (at least 1 entry of “current” or “former”) or “never” (all nonmissing entries of “never” or “passive”). We excluded 53 144 (4.2%) females and 39 387 (4.2%) males who had unknown smoking status with this approach (5). The final study population thus comprised 1 222 694 females and 896 251 males with known smoking status.
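The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the study's actual extraction code: the function name `classify_smoking` and the treatment of unrecognized strings as missing are assumptions made for the example.

```python
from typing import Iterable, Optional

# Category labels taken from the text; the set names are illustrative.
EVER_VALUES = {"current", "former"}
NEVER_VALUES = {"never", "passive"}


def classify_smoking(values: Iterable[Optional[str]]) -> Optional[str]:
    """Collapse up to 2 extracted EHR smoking values into 'ever', 'never', or None.

    None means smoking status is unknown (no nonmissing value available).
    Assumption: strings outside the four recognized categories count as missing.
    """
    observed = [v.strip().lower() for v in values if v is not None and v.strip()]
    observed = [v for v in observed if v in EVER_VALUES | NEVER_VALUES]
    if not observed:
        return None  # unknown smoking status; such individuals were excluded
    if any(v in EVER_VALUES for v in observed):
        return "ever"  # at least 1 entry of "current" or "former"
    return "never"  # all nonmissing entries are "never" or "passive"
```

For example, a patient recorded as "never" at baseline and "former" at the last available entry would be classified as "ever" under this rule.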
Lung Cancer Diagnoses and Tumor Characteristics
Sutter Health patients were previously linked to the California Cancer Registry for all invasive cancers diagnosed between 1988 and 2013 (17). The Kaiser Permanente Hawai’i cohort was linked to the Hawai’i Tumor Registry for lung cancers diagnosed between 1973 and 2013 (18). For all lung cancer cases, cancer registry data included date of diagnosis, tumor stage (localized, regional, remote), and tumor histology. Histologic cell types were based on morphology codes as defined by Lewis et al. (19). Incident lung cancer was defined as a diagnosis of invasive lung or bronchus carcinoma (International Classification of Disease for Oncology, third edition, site codes C34.0-34.9, excluding morphology codes 8500 and 8580-9999) occurring after the date of the baseline visit and before December 31, 2013. In situ lung cancer (n = 2 females and n = 2 males), noncarcinomas (n = 16 females and n = 16 males), and apparent metastases (n = 1 female and n = 0 males) were not considered as cases. This resulted in 3297 females and 2892 males with incident lung cancer diagnoses for analysis.
Although the study population is enriched for AANHPI groups in the United States, it is generally representative of the target populations of California and Hawai’i in regard to race/ethnicity and smoking prevalence among the cohort and in regard to race/ethnicity among lung cancer cases; a detailed account of the representativeness of the cohort has been published (18).
Statistical Analyses
All statistical analyses were conducted with SAS software version 9.4 (SAS Institute, Cary, NC). We used frequencies and percentages to describe the age and race/ethnicity of the cohort population and individuals diagnosed with lung cancer according to sex and smoking status. We also describe the distribution of age and race/ethnicity among females diagnosed with lung cancer according to histology and stage at diagnosis. We followed SEER guidance on suppression of table cells with fewer than 5 cases.
Follow-up time for lung cancer incidence was defined as days from the baseline visit to the earliest of lung cancer diagnosis, death, or December 31, 2013. Median follow-up was 4.8 (interquartile range [IQR] = 2.6 to 9.2) years.
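As an illustration of this definition, the sketch below computes follow-up in days as the time from baseline to the earliest of diagnosis, death, or study end. The function names and the conversion of days to person-years by dividing by 365.25 are assumptions for the example; the source does not state the conversion used.

```python
from datetime import date
from typing import Optional

STUDY_END = date(2013, 12, 31)  # study end date given in the text


def follow_up_days(
    baseline: date,
    diagnosis: Optional[date] = None,
    death: Optional[date] = None,
    study_end: date = STUDY_END,
) -> int:
    """Days from the baseline visit to the earliest of diagnosis, death, or study end."""
    end = min(d for d in (diagnosis, death, study_end) if d is not None)
    return (end - baseline).days


def person_years(days: int) -> float:
    """Convert days to person-years (365.25 days per year is an assumption)."""
    return days / 365.25
```

Individuals whose follow-up would be zero days under this definition correspond to those excluded from the cohort for having no follow-up.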
Age-adjusted incidence rates (AAIRs) represent the number of cases per 100 000 person-years at risk. Among groups defined by sex and race/ethnicity, AAIRs were calculated for overall lung cancer and for lung cancer stratified by smoking status. We also calculated AAIRs for the NSCLC and adenocarcinoma subtypes; AAIRs for other subtypes could not be calculated because of low frequencies of cases among persons who never smoked. To calculate AAIRs, we first calculated age-specific incidence rates for 15 age groups of the cohort by dividing the number of cases by the corresponding person-years of at-risk follow-up time in each age group. These age-specific rates were then standardized to the United States 2000 population, summed over age groups, and multiplied by 100 000 to obtain the AAIR. Ninety-five percent confidence intervals (CIs) were calculated using the Fay and Feuer method (20) with the modification by Tiwari et al. (21) for the upper confidence limit. Incidence rate ratios (IRRs) with 95% confidence intervals were calculated within each stratum of smoking status by dividing the AAIR for each racial/ethnic group by the AAIR for the NH White group (22). P values were all 2-sided, and P values less than .05 were considered statistically significant. We followed the Centers for Disease Control and Prevention guidance on suppression of incidence rates based on a case count of fewer than 16 (23). Thus, because of low frequencies of cases among Asian Indian, Vietnamese, and Korean American females in all smoking categories, these ethnic groups were aggregated with “Other Asian” for analyses of incidence rates and incidence rate ratios.
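The direct standardization and rate ratio steps can be illustrated with a short Python sketch. It is a toy example under stated assumptions: the function names are hypothetical, the two-group weights and counts are invented for illustration (they are not the United States 2000 standard population or study data), and the Fay and Feuer and Tiwari confidence intervals are not implemented here.

```python
from typing import Sequence


def age_adjusted_rate(
    cases: Sequence[float],
    person_years: Sequence[float],
    std_weights: Sequence[float],
    per: float = 100_000,
) -> float:
    """Directly standardized rate: weighted sum of age-specific rates, per `per` person-years."""
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    rate = 0.0
    for d, py, w in zip(cases, person_years, std_weights):
        if py > 0:  # assumption: age groups with no person-years contribute nothing
            rate += (w / total_weight) * (d / py)
    return rate * per


def incidence_rate_ratio(aair_group: float, aair_reference: float) -> float:
    """Ratio of a group's AAIR to the reference group's AAIR."""
    return aair_group / aair_reference


# Toy data with 2 age groups (the study used 15); weights are illustrative only.
weights = [0.75, 0.25]
group = age_adjusted_rate([1, 4], [10_000, 20_000], weights)
reference = age_adjusted_rate([1, 2], [20_000, 20_000], weights)
ratio = incidence_rate_ratio(group, reference)
```

An IRR converts to a percentage difference as (IRR - 1) x 100, so an IRR of 1.66 corresponds to an incidence 66% greater than that of the reference group.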
To account for differential misclassification of smoking by case status, we conducted a sensitivity analysis whereby AAIRs were recalculated with smoking status defined using 2 smoking status values that were at least 3 months prior to the date of diagnosis among cases. To account for potential loss to follow-up by state cancer registries because of an out-of-state move, we conducted a sensitivity analysis whereby individuals whose last known address in the EHR was out of state were censored at the date of the first out-of-state address. To assess the impact of unknown or missing race/ethnicity in EHRs, we calculated standardized incidence rate ratios as a relative measure across racial/ethnic groups; expected rates for each racial/ethnic group were calculated with the rate among all females including or, in a separate analysis, not including females with unknown or missing race/ethnicity. Confidence intervals were calculated using Byar approximation (24).
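For readers unfamiliar with the approach, the sketch below shows a standardized incidence ratio (observed divided by expected cases) with confidence limits from the commonly cited form of Byar's approximation to the exact Poisson interval. The source does not give the formula, so the specific form and the function name `sir_byar` are assumptions, and the numbers in the usage line are invented.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def sir_byar(observed: int, expected: float, level: float = 0.95) -> Tuple[float, float, float]:
    """Return (SIR, lower, upper) using Byar's approximation for a Poisson count."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    sir = observed / expected
    if observed == 0:
        lower = 0.0  # the lower limit is taken as 0 when no cases are observed
    else:
        lower = observed * (1 - 1 / (9 * observed) - z / (3 * sqrt(observed))) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1))) ** 3 / expected
    return sir, lower, upper


# Invented example: 10 observed cases against 10 expected cases.
sir, lower, upper = sir_byar(10, 10.0)
```

In this framing, the expected count for a racial/ethnic group would come from applying the rate among all females to that group's person-years, as described above.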
Results
The distributions of age at baseline, baseline year, and race/ethnicity according to smoking status among all females in the cohort and females diagnosed with lung cancer are presented in Table 1. There were 535 incident lung cancer diagnoses among 244 147 AANHPI females; 43.9% of AANHPI females with lung cancer never smoked. Analogous distributions among males are available in Supplementary Table 1 (available online). Table 2 presents the distribution of females with lung cancer by age at diagnosis, year of baseline visit, and race/ethnicity according to smoking status as well as lung cancer histology and stage. Across all racial/ethnic groups, adenocarcinoma accounted for 64.5% of lung cancers among females who never smoked compared with 46.5% among those who ever smoked. Among AANHPI females with lung cancer who never smoked, 77.0% were diagnosed with the adenocarcinoma histologic subtype.
Table 1.
Characteristic | Female cohort: total No. (%)a | Female cohort: never smoked No. (%) | Female cohort: ever smoked No. (%) | Female incident lung cancer cases: total No. (%)a | Female incident lung cancer cases: never smoked No. (%) | Female incident lung cancer cases: ever smoked No. (%) |
---|---|---|---|---|---|---|
Total | 1 222 694 (100.0) | 889 870 (72.8) | 332 824 (27.2) | 3297 (100.0) | 884 (26.8) | 2413 (73.2) |
Age at baseline/diagnosis, yb | ||||||
18-29 | 345 599 (28.3) | 272 851 (79.0) | 72 748 (21.0) | 5 (0.2) | — | — |
30-39 | 253 666 (20.7) | 198 435 (78.2) | 55 231 (21.8) | 37 (1.1) | 23 (62.2) | 14 (37.8) |
40-49 | 214 981 (17.6) | 154 146 (71.7) | 60 835 (28.3) | 119 (3.6) | 55 (46.2) | 64 (53.8) |
50-59 | 177 066 (14.5) | 116 551 (65.8) | 60 515 (34.2) | 407 (12.3) | 107 (26.3) | 300 (73.7) |
60-69 | 116 361 (9.5) | 71 997 (61.9) | 44 364 (38.1) | 903 (27.4) | 217 (24.0) | 686 (76.0) |
70-79 | 70 032 (5.7) | 44 193 (63.1) | 25 839 (36.9) | 1051 (31.9) | 232 (22.1) | 819 (77.9) |
≥80 | 44 989 (3.7) | 31 697 (70.5) | 13 292 (29.5) | 775 (23.5) | 246 (31.7) | 529 (68.3) |
Baseline year | ||||||
2000-2004 | 326 811 (26.7) | 227 712 (69.7) | 99 099 (30.3) | 1724 (52.3) | 460 (26.7) | 1264 (73.3) |
2005-2009 | 404 724 (33.1) | 294 143 (72.7) | 110 581 (27.3) | 976 (29.6) | 257 (26.3) | 719 (73.7) |
2010-2013 | 491 159 (40.2) | 368 015 (74.9) | 123 144 (25.1) | 597 (18.1) | 167 (28.0) | 430 (72.0) |
Race/Ethnicity | ||||||
Any AANHPI | 244 147 (20.0) | 198 208 (81.2) | 45 939 (18.8) | 535 (16.2) | 235 (43.9) | 300 (56.1) |
Any NHPI | 42 627 (3.5) | 25 139 (59.0) | 17 488 (41.0) | 178 (5.4) | 31 (17.4) | 147 (82.6) |
Native Hawaiianc | 26 467 (2.2) | 14 658 (55.4) | 11 809 (44.6) | 144 (4.4) | 23 (16.0) | 121 (84.0) |
Pacific Islanderd | 16 160 (1.3) | 10 481 (64.9) | 5679 (35.1) | 34 (1.0) | 8 (23.5) | 26 (76.5) |
Asian (single or multiple) | 201 520 (16.5) | 173 069 (85.9) | 28 451 (14.1) | 357 (10.8) | 204 (57.1) | 153 (42.9) |
Asian (single group) | 167 135 (13.7) | 146 518 (87.7) | 20 617 (12.3) | 261 (7.9) | 158 (60.5) | 103 (39.5) |
Asian Indian | 36 382 (3.0) | 35 458 (97.5) | 924 (2.5) | 6 (0.2) | 6 (100.0) | 0 (0.0) |
Chinese | 37 982 (3.1) | 35 622 (93.8) | 2360 (6.2) | 67 (2.0) | 59 (88.1) | 8 (11.9) |
Japanese | 19 441 (1.6) | 14 453 (74.3) | 4988 (25.7) | 57 (1.7) | 18 (31.6) | 39 (68.4) |
Filipinx | 35 852 (2.9) | 28 841 (80.4) | 7011 (19.6) | 76 (2.3) | 42 (55.3) | 34 (44.7) |
Korean | 6429 (0.5) | 4887 (76.0) | 1542 (24.0) | 19 (0.6) | 6 (31.6) | 13 (68.4) |
Vietnamese | 4329 (0.4) | 4012 (92.7) | 317 (7.3) | 5 (0.2) | 5 (100.0) | 0 (0.0) |
Other Asian | 26 720 (2.2) | 23 245 (87.0) | 3475 (13.0) | 31 (0.9) | 22 (71.0) | 9 (29.0) |
Asian (multiple group) | 34 385 (2.8) | 26 551 (77.2) | 7834 (22.8) | 96 (2.9) | 46 (47.9) | 50 (52.1) |
Asian only | 10 082 (0.8) | 8461 (83.9) | 1621 (16.1) | 16 (0.5) | 12 (75.0) | — |
Asian and non-Asian | 24 303 (2.0) | 18 090 (74.4) | 6213 (25.6) | 80 (2.4) | 34 (42.5) | 46 (57.5) |
Non-Hispanic White | 518 152 (42.4) | 348 492 (67.3) | 169 660 (32.7) | 1297 (39.3) | 306 (23.6) | 991 (76.4) |
Black | 35 488 (2.9) | 23 995 (67.6) | 11 493 (32.4) | 82 (2.5) | 13 (15.9) | 69 (84.1) |
Hispanic | 100 070 (8.2) | 79 235 (79.2) | 20 835 (20.8) | 72 (2.2) | 31 (43.1) | 41 (56.9) |
Non-AANHPI multiple | 38 228 (3.1) | 26 055 (68.2) | 12 173 (31.8) | 94 (2.9) | 14 (14.9) | 80 (85.1) |
Other (including AIAN) | 25 195 (2.1) | 19 220 (76.3) | 5975 (23.7) | 18 (0.5) | — | 14 (77.8) |
Unknown | 261 414 (21.4) | 194 665 (74.5) | 66 749 (25.5) | 1199 (36.4) | 281 (23.4) | 918 (76.6) |
a Column percentages are provided in Total columns. All other columns with proportions present row percentages. AANHPI = Asian American, Native Hawaiian, and Pacific Islander; AIAN = American Indian and Alaska Native; NHPI = Native Hawaiian and Pacific Islander. “—” indicates censoring due to low numbers (<5 individuals).
b Age at baseline among cohort, age at diagnosis among cases.
c Individuals indicating any Native Hawaiian, even if also indicating other races or ethnicities, are categorized as Native Hawaiian.
d Pacific Islander, not indicating Native Hawaiian.
Table 2.
Characteristics | Never-smokers: all cases No. (%)a | Never-smokers, by histology: adenocarcinoma No. (%)b | Never-smokers, by stage: localized No. (%)b | Never-smokers, by stage: regional + distant No. (%)b | Ever-smokers: all cases No. (%)a | Ever-smokers, by histology: adenocarcinoma No. (%)b | Ever-smokers, by stage: localized No. (%)b | Ever-smokers, by stage: regional + distant No. (%)b |
---|---|---|---|---|---|---|---|---|
All | 884 (100.0) | 570 (64.5) | 203 (23.0) | 642 (72.6) | 2413 (100.0) | 1123 (46.5) | 498 (20.6) | 1828 (75.8) |
Age at diagnosis, y | ||||||||
18-39 | 27 (3.1) | 17 (63.0) | 9 (33.3) | 16 (59.3) | 15 (0.6) | 11 (73.3) | 5 (33.3) | 10 (66.7) |
40-49 | 55 (6.2) | 42 (76.4) | 12 (21.8) | 42 (76.4) | 64 (2.7) | 34 (53.1) | 13 (20.3) | 48 (75.0) |
50-59 | 107 (12.1) | 73 (68.2) | 28 (26.2) | 78 (72.9) | 300 (12.4) | 162 (54.0) | 57 (19.0) | 235 (78.3) |
60-69 | 217 (24.5) | 150 (69.1) | 53 (24.4) | 155 (71.4) | 686 (28.4) | 345 (50.3) | 138 (20.1) | 526 (76.7) |
70-79 | 232 (26.2) | 165 (71.1) | 54 (23.3) | 171 (73.7) | 819 (33.9) | 368 (44.9) | 181 (22.1) | 620 (75.7) |
≥80 | 246 (27.8) | 123 (50.0) | 47 (19.1) | 180 (73.2) | 529 (21.9) | 203 (38.4) | 104 (19.7) | 389 (73.5) |
Baseline year | ||||||||
2000-2004 | 460 (52.0) | 310 (67.4) | 101 (22.0) | 339 (73.7) | 1264 (52.4) | 584 (46.2) | 267 (21.1) | 954 (75.5) |
2005-2009 | 257 (29.1) | 156 (60.7) | 60 (23.3) | 186 (72.4) | 719 (29.8) | 332 (46.2) | 133 (18.5) | 556 (77.3) |
2010-2013 | 167 (18.9) | 104 (62.3) | 42 (25.1) | 117 (70.1) | 430 (17.8) | 207 (48.1) | 98 (22.8) | 318 (74.0) |
Race/Ethnicity | ||||||||
Any AANHPI | 235 (26.6) | 181 (77.0) | 56 (23.8) | 171 (72.8) | 300 (12.4) | 140 (46.7) | 58 (19.3) | 233 (77.7) |
Any NHPI | 31 (3.5) | 22 (71.0) | 5 (16.1) | 23 (74.2) | 147 (6.1) | 61 (41.5) | 28 (19.0) | 115 (78.2) |
Native Hawaiianc | 23 (2.6) | 15 (65.2) | — | 18 (78.3) | 121 (5.0) | 52 (43.0) | 24 (19.8) | 94 (77.7) |
Pacific Islanderd | 8 (0.9) | 7 (87.5) | — | 5 (62.5) | 26 (1.1) | 9 (34.6) | — | 21 (80.8) |
Asian (single or multiple) | 204 (23.1) | 159 (77.9) | 51 (25.0) | 148 (72.5) | 153 (6.3) | 79 (51.6) | 30 (19.6) | 118 (77.1) |
Asian (single group) | 158 (17.9) | 127 (80.4) | 41 (25.9) | 114 (72.2) | 103 (4.3) | 50 (48.5) | 18 (17.5) | 84 (81.6) |
Chinese | 59 (6.7) | 47 (79.7) | 15 (25.4) | 42 (71.2) | 8 (0.3) | — | — | 7 (87.5) |
Japanese | 18 (2.0) | 16 (88.9) | 6 (33.3) | 12 (66.7) | 39 (1.6) | 18 (46.2) | — | 35 (89.7) |
Filipinx | 42 (4.8) | 36 (85.7) | 13 (31.0) | 29 (69.0) | 34 (1.4) | 15 (44.1) | 10 (29.4) | 24 (70.6) |
Other Asian | 39 (4.4) | 28 (71.8) | 7 (17.9) | 31 (79.5) | 22 (0.9) | 13 (59.1) | — | 18 (81.8) |
Asian (multiple group) | 46 (5.2) | 32 (69.6) | 10 (21.7) | 34 (73.9) | 50 (2.1) | 29 (58.0) | 12 (24.0) | 34 (68.0) |
Asian only multiple | 12 (1.4) | 10 (83.3) | — | 8 (66.7) | — | 0 (0.0) | — | — |
Asian and non-Asian | 34 (3.8) | 22 (64.7) | 6 (17.6) | 26 (76.5) | 46 (1.9) | 29 (63.0) | 11 (23.9) | 31 (67.4) |
Non-Hispanic White | 306 (34.6) | 199 (65.0) | 89 (29.1) | 206 (67.3) | 991 (41.1) | 507 (51.2) | 269 (27.1) | 691 (69.7) |
Black | 13 (1.5) | 7 (53.8) | — | 8 (61.5) | 69 (2.9) | 38 (55.1) | 21 (30.4) | 48 (69.6) |
Hispanic | 31 (3.5) | 20 (64.5) | 11 (35.5) | 20 (64.5) | 41 (1.7) | 25 (61.0) | 8 (19.5) | 33 (80.5) |
Non-AANHPI multiple | 14 (1.6) | 9 (64.3) | 5 (35.7) | 6 (42.9) | 80 (3.3) | 42 (52.5) | 16 (20.0) | 60 (75.0) |
Unknown | 281 (31.8) | 152 (54.1) | 38 (13.5) | 227 (80.8) | 918 (38.0) | 362 (39.4) | 121 (13.2) | 754 (82.1) |
a Column percentages are provided. AANHPI = Asian American, Native Hawaiian, and Pacific Islander; AIAN = American Indian and Alaska Native; NHPI = Native Hawaiian and Pacific Islander. “—” indicates censoring due to low numbers (<5 individuals).
b Row percentages are provided.
c Individuals indicating any Native Hawaiian, even if also indicating other races or ethnicities, are categorized as Native Hawaiian.
d Pacific Islander, not indicating Native Hawaiian.
Figure 1 shows AAIRs for lung cancer among never- and ever-smoking females by race/ethnicity. The rate we observed among all never-smoking females in our cohort was 13.1 per 100 000 (95% CI = 12.2 to 14.0). Incidence of lung cancer among never-smoking AANHPI females as an aggregate group was 17.1 per 100 000 (95% CI = 14.9 to 19.4). However, AAIRs varied widely across AANHPI groups, from 6.4 (95% CI = 3.6 to 10.0) among Japanese American females to 22.8 (95% CI = 17.3 to 29.1) among Chinese American females who never smoked. The rate among Native Hawaiian and Pacific Islander females who never smoked was 15.2 (95% CI = 10.2 to 21.2). A similar pattern was observed for the adenocarcinoma subtype. The AAIR among ever-smoking AANHPI females in the aggregate was 66.5 (95% CI = 59.1 to 74.4). Across AANHPI ethnic groups, AAIRs ranged from 41.3 (95% CI = 29.1 to 55.7) among Japanese American females to 102.4 (95% CI = 83.5 to 123.3) among Native Hawaiian females who ever smoked. AAIRs for lung cancer among never- and ever-smoking males by race/ethnicity are in Supplementary Figure 1 (available online). Our results show that females with unknown or missing race/ethnicity, especially those who ever smoked, have a particularly high risk of lung cancer (Figure 1). An examination of registry data on race/ethnicity among females with lung cancer and unknown or missing EHR race/ethnicity indicated that 82.9% had NH White race/ethnicity (by comparison, 61.8% of those with known EHR race/ethnicity were NH White).
Figure 2 shows smoking-specific incidence rate ratios (IRRs) by race/ethnicity among females. Among females who never smoked, AAIRs for all histologies combined and for adenocarcinoma among every AANHPI group, except Japanese American females, were higher than those of NH White females, from 66% greater among Native Hawaiian females (IRR = 1.66, 95% CI = 1.03 to 2.56) to 126% greater among Chinese American females (IRR = 2.26, 95% CI = 1.67 to 3.02). Among females who ever smoked, Native Hawaiian females (in addition to Black females and females with multiple non-AANHPI races) had higher incidence of lung cancer compared with NH White females.
Incidence of overall lung cancer among females and males not stratified by smoking is in Supplementary Table 2 (available online), and incidence of the NSCLC subtype stratified by smoking status is in Supplementary Table 3 (available online). In general, AAIRs for overall lung cancer reported here are comparable to those previously reported for AANHPI (14,15,25–27). Patterns of incidence of NSCLC across race/ethnicity resemble those for adenocarcinoma; the majority of cancers within the NSCLC group are adenocarcinomas. Results from sensitivity analyses to account for misclassification of smoking status, potential loss to follow-up, and misclassification of EHR race/ethnicity did not differ substantially from main analyses (Supplementary Tables 4 and 5, available online).
Discussion
Among never-smoking females, all AANHPI female ethnic groups, except for Japanese American females, had substantially elevated risk of lung cancer. These rates (more than 20 per 100 000) make lung cancer among those who never smoked the third most common cancer among Chinese American females (after breast and colon cancer) and the fourth most common among Filipinx females (after breast, colon, and thyroid cancer) (28). Among females who ever smoked, NHPI have higher risk of lung cancer. Our 2007 study summarized lung cancer incidence among never-smoking females in 4 large cohort studies and found published rates from 15.2 per 100 000 (95% CI = 9.1 to 24.5) to 20.8 (95% CI = 13.5 to 31.2). The rate we observed among all never-smoking females in our cohort (13.1 per 100 000, 95% CI = 12.2 to 14.0) is slightly below this range, and our study provides much needed stratification of rates by detailed race/ethnicity, particularly for AANHPI ethnic groups (4).
Among never-smoking females, we also examined incidence of the adenocarcinoma histologic subtype, the most common histologic subtype among those who never smoked (29). In our cohort, adenocarcinoma accounted for 64.5% of lung cancers among females who never smoked compared to 46.5% among those who ever smoked. In our 2010 SEER study, increasing incidence rates for adenocarcinoma among some AANHPI groups suggested higher rates of lung cancer among AANHPI who never smoked, an observation confirmed here (26).
Females with unknown or missing race/ethnicity according to our EHR data, especially those who ever smoked, have a particularly high risk of lung cancer. This observation led us to examine the possibility of misclassification of race/ethnicity among cases: although we were only able to extract EHR data on race/ethnicity from specific fields, the cancer registries have race and ethnicity data informed through more extensive chart review. We found that proportionally more individuals with lung cancer who were missing EHR race/ethnicity were NH White (82.9%) compared to those with known EHR race/ethnicity (61.8%). Thus, AAIRs for the NH White group may be slightly underestimated because of misclassification of EHR race/ethnicity. Because incidence rate ratios rely on NH White as the reference group, we also calculated standardized incidence rate ratios as a relative measure of incidence across race/ethnicity (Supplementary Table 5, available online). Incidence rate ratios and standardized incidence rate ratios show similar patterns of risk across race/ethnicity, so bias in the group with unknown or missing EHR race/ethnicity is not substantially affecting our observations.
There are few previous studies in the United States reporting incidence of lung cancer among females who have never smoked, and none, to our knowledge, that report these rates by detailed race/ethnicity. Our linkage of EHR with cancer registry data allowed calculation of lung cancer incidence by race/ethnicity and smoking status, which was not possible with registry data alone. Moreover, with more recent research focusing on the disaggregation of the AANHPI population, there is growing evidence of substantial health and exposure inequalities that debunk the perception that they are at lower risk for developing cancer (11,30,31). Our study sites, Hawai’i and Northern California, are home to some of the nation’s largest and most diverse AANHPI populations (7,8). This focus on geographies with a substantial and diverse AANHPI population and the large size of our cohort made possible the disaggregation of AANHPI groups and allowed us to document, for the first time, the high burden of lung cancer among most never-smoking AANHPI female ethnic groups.
Known risk factors for lung cancer among females in the United States who have never smoked include second-hand tobacco smoke, family history of lung cancer, air pollution, cooking oil fumes, and radon (32-35), but the degree to which these risk factors contribute to higher risk among AANHPI females is not known (36,37). There is evidence of high levels of air pollution exposures among AANHPIs, including high traffic volume, fine particulate matter (PM2.5), and exposure to volcanic smog in Hawai’i, but robust studies to address these exposures in regard to lung cancer among AANHPI are not available (38-40). No studies of cooking oil fumes and lung cancer risk have been conducted in the United States (35). Body size, reproductive factors such as hormone therapies, and certain infectious diseases may also have etiologic significance, but so far, results from previous studies have been mixed and not specific to AANHPI groups (41-51). Genetic ancestry and the mutational landscapes of lung cancers among AANHPI females who have never smoked may also provide insights into the etiology and increased burden of lung cancer among this group.
Our study was made possible with an innovative approach to data integration, including pooling EHR data across health-care systems and linkage to cancer registries. We note some limitations. EHR data often contain a high degree of missingness. The 2 EHR systems, however, had remarkably complete smoking information and, with our approach to extraction of smoking status, yielded smoking status classification for more than 95% of the cohort (18). Health-care utilization might contribute substantially to completeness of EHR data; however, the proportion of individuals with race/ethnicity and smoking status data did not change substantially when we considered a subset of the population from Sutter Health with a designated primary care physician (43% of the Sutter Health cohort). There is a potential for loss to follow-up for individuals who were diagnosed with lung cancer in another state; to mitigate this, we required a California (for Sutter Health) or Hawai’i (for Kaiser Permanente Hawai’i) address during the follow-up period. A relatively small proportion (2.5%) of the cohort had an out-of-state address at the last follow-up date, and sensitivity analyses to censor these individuals at that date did not indicate bias because of loss to follow-up. The systematic use of EHR data also requires careful consideration of the introduction of measurement bias. Thus, as previously noted, we also conducted sensitivity analyses to account for potential differential misclassification of smoking or race/ethnicity by lung cancer status; these analyses showed our results to be robust.
Among never-smoking females, most AANHPI ethnic groups experience a high burden of lung cancer incidence. Continuing studies are needed to determine risk factors for lung cancer among females who have never smoked and why the burden of this disease is greater for AANHPI females.
Funding
National Cancer Institute (R01 CA204070; multiple PIs Gomez and Cheng); National Center for Advancing Translational Sciences (KL2TR001444; Thompson).
Notes
Role of the funder: The funders were not involved in designing the study; in the collection, analysis, and interpretation of data; in writing this report; or in the decision to submit this paper for publication.
Disclosures: The authors have no conflicts of interest to declare.
Author contributions: Conceptualization: IC and SLG; Methodology: MCD, AJC, CAT, MIP, YGD, HSL, SSM, PR, HAW, BEW, IC, and SLG; Data curation: MCD, AJC, CAT, AJ, SN, CW, DL, YGD, SYL; Resources: CAT, AJ, SN, CW, DL, YGD, HSL, SYL, and BEW; Formal analysis: AJC and DL; Investigation: MCD, AJC, CAT, MIP, HSL, SSM, PR, HAW, BEW, IC, SLG; Visualization: MCD and AJC; Writing original draft: MCD, AJC, IC, and SLG; Writing review and editing: All authors; Project administration: MCD, CAT, LA, SYL, BEW, IC, and SLG; Supervision: IC and SLG; Funding acquisition: IC and SLG.
Prior presentations: The study and results herein have been presented as poster and oral presentations at academic conferences: the 12th annual AACR Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved (September 2019), the combined annual conference of the North American Association of Central Cancer Registries and the International Association of Cancer Registries (June 2019), and the American Society of Preventive Oncology 44th Annual (Virtual) Conference (March 2020).
Data Availability
The data underlying this article were provided by Sutter Health, Kaiser Permanente Hawai’i, the California Cancer Registry, and the Hawai’i Tumor Registry by permission. Data will be shared on request to the corresponding author with permission of these parties.
Supplementary Material
References
- 1. Rivera GA, Wakelee H. Lung cancer in never smokers. Adv Exp Med Biol. 2016;893:43–57.
- 2. American Cancer Society. Cancer Facts and Figures 2021; 2021. https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2021/cancer-facts-and-figures-2021.pdf. Accessed May 1, 2021.
- 3. Pelosof L, Ahn C, Gao A, et al. Proportion of never-smoker non-small cell lung cancer patients at three diverse institutions. J Natl Cancer Inst. 2017;109(7):djw295.
- 4. Wakelee HA, Chang ET, Gomez SL, et al. Lung cancer incidence in never-smokers. J Clin Oncol. 2007;25(5):472–478.
- 5. DeRouen MC, Thompson CA, Canchola AJ, et al. Integrating electronic health record, cancer registry, and geospatial data to study lung cancer in Asian American, Native Hawaiian and Pacific Islander ethnic groups. Cancer Epidemiol Biomarkers Prev. 2021;30(8):1506–1516. doi:10.1158/1055-9965.EPI-21-0019.
- 6. Siegel DA, Fedewa SA, Henley SJ, Pollack LA, Jemal A. Proportion of never smokers among men and women with lung cancer in 7 US states. JAMA Oncol. 2021;7(2):302–304.
- 7. Hoeffel EM, Rastogi S, Kim MO, Shahid H. The Asian Population: 2010. US Department of Commerce, Economics and Statistics Administration, US Census Bureau; 2012.
- 8. Hixson LK, Hepler BB, Kim MO. The Native Hawaiian and other Pacific Islander Population: 2010. US Department of Commerce, Economics and Statistics Administration, US Census Bureau; 2012.
- 9. United States Census Bureau. Explore Census Data. https://data.census.gov/cedsci/table?q=United%20States&tid=DECENNIALPLNAT2010.P1&hidePreview=true. Accessed May 1, 2021.
- 10. Pew Social & Demographic Trends. The rise of Asian Americans. Pew Research Center; 2012.
- 11. Ponce NA, Tseng W, Ong P, et al. The State of Asian American, Native Hawaiian and Pacific Islander Health in California Report; 2009.
- 12. Atienza AA, Serrano KJ, Riley WT, Moser RP, Klein WM. Advancing cancer prevention and behavior theory in the era of big data. J Cancer Prev. 2016;21(3):201–206.
- 13. Moy KL, Sallis JF, Trinidad DR, Ice CL, McEligot AJ. Health behaviors of native Hawaiian and Pacific Islander adults in California. Asia Pac J Public Health. 2012;24(6):961–969.
- 14. Gomez SL, Noone A-M, Lichtensztajn DY, et al. Cancer incidence trends among Asian American populations in the United States, 1990-2008. J Natl Cancer Inst. 2013;105(15):1096–1110.
- 15. Liu L, Noone A-M, Gomez SL, et al. Cancer incidence trends among native Hawaiians and other Pacific Islanders in the United States, 1990-2008. J Natl Cancer Inst. 2013;105(15):1086–1095.
- 16. Mukherjea A, Wackowski OA, Lee YO, Delnevo CD. Asian American, Native Hawaiian and Pacific Islander tobacco use patterns. Am J Health Behav. 2014;38(3):362–369.
- 17. Thompson CA, Kurian AW, Luft HS. Linking electronic health records to better understand breast cancer patient pathways within and between two health systems. EGEMS (Wash DC). 2015;3(1):1127.
- 18. DeRouen MC, Thompson CA, Canchola AJ, et al. Integrating electronic health record, cancer registry, and geospatial data to study lung cancer in Asian American, Native Hawaiian and Pacific Islander ethnic groups. Cancer Epidemiol Biomarkers Prev. 2021;30(8):1506–1516.
- 19. Lewis DR, Check DP, Caporaso NE, Travis WD, Devesa SS. US lung cancer trends by histologic type. Cancer. 2014;120(18):2883–2892.
- 20. Fay MP, Feuer EJ. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statist Med. 1997;16(7):791–801.
- 21. Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Stat Methods Med Res. 2006;15(6):547–569.
- 22. Fay M. Approximate confidence intervals for rate ratios from directly standardized rates with sparse data. Commun Stat-Theor Methods. 1999;28(9):2141–2160.
- 23. Centers for Disease Control and Prevention. Statistical methods: suppression of rates and counts; 2020. https://www.cdc.gov/cancer/uscs/technical_notes/stat_methods/suppression.htm. Accessed January 1, 2020.
- 24. Breslow NE, Day NE. Statistical methods in cancer research. Volume II. The design and analysis of cohort studies. IARC Sci Publ. 1987;1:406.
- 25. Haiman CA, Stram DO, Wilkens LR, et al. Ethnic and racial differences in the smoking-related risk of lung cancer. N Engl J Med. 2006;354(4):333–342.
- 26. Cheng I, Le GM, Noone A-M, et al. Lung cancer incidence trends by histology type among Asian American, Native Hawaiian, and Pacific Islander populations in the United States, 1990-2010. Cancer Epidemiol Biomarkers Prev. 2014;23(11):2250–2265.
- 27. Gallaway MS, Henley SJ, Steele CB, et al. Surveillance for cancers associated with tobacco use–United States, 2010-2014. MMWR Surveill Summ. 2018;67(12):1–42.
- 28. American Cancer Society. Cancer Facts & Figures 2016 Special Section: Cancer in Asian Americans, Native Hawaiians, and Pacific Islanders. https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2016.html. Accessed January 1, 2020.
- 29. Pesch B, Kendzia B, Gustavsson P, et al. Cigarette smoking and lung cancer–relative risk estimates for the major histological types from a pooled analysis of case-control studies. Int J Cancer. 2012;131(5):1210–1219.
- 30. US Census Bureau. Asian/Pacific American Heritage Month: May 2014. US Census Bureau News; 2014. https://www.census.gov/content/dam/Census/newsroom/facts-for-features/2014/cb14-ff13_asian.pdf. Accessed January 1, 2020.
- 31. A Community of Contrasts: Native Hawaiians and Pacific Islanders in the United States; 2014. https://www.advancingjustice-la.org/what-we-do/policy-and-research/demographic-research/community-contrasts-native-hawaiians-and-pacific. Accessed January 1, 2020.
- 32. Fontham ET. Environmental tobacco smoke and lung cancer in nonsmoking women. A multicenter study. JAMA. 1994;271(22):1752–1759.
- 33. Wu AH, Fontham ET, Reynolds P, et al. Family history of cancer and risk of lung cancer among lifetime nonsmoking women in the United States. Am J Epidemiol. 1996;143(6):535–542.
- 34. Boldo E, Linares C, Lumbreras J, et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. Environ Int. 2011;37(2):342–1141.21056471
- 35. Lee T, Gany F. Cooking oil fumes and lung cancer: a review of the literature in the context of the U.S. population. J Immigr Minor Health. 2013;15(3):646–652.
- 36. Loomis D, Huang W, Chen G. The International Agency for Research on Cancer (IARC) evaluation of the carcinogenicity of outdoor air pollution: focus on China. Chin J Cancer. 2014;33(4):189–196.
- 37. Hamra GB, Guha N, Cohen A, et al. Outdoor particulate matter exposure and lung cancer: a systematic review and meta-analysis. Environ Health Perspect. 2014;122(9):906–911.
- 38. Bell ML, Ebisu K. Environmental inequality in exposures to airborne particulate matter components in the United States. Environ Health Perspect. 2012;120(12):1699–1704.
- 39. Payne-Sturges D, Gee GC. National environmental health measures for minority and low-income populations: tracking social disparities in environmental health. Environ Res. 2006;102(2):154–171.
- 40. Brooks N, Sethi R. The distribution of pollution: community characteristics and exposure to air toxics. J Environ Econ Manag. 1997;32(2):233–250.
- 41. Wu AH, Fontham ET, Reynolds P, et al. Previous lung disease and risk of lung cancer among lifetime nonsmoking women in the United States. Am J Epidemiol. 1995;141(11):1023–1032.
- 42. Lim W-Y, Chen Y, Chuah KL, et al. Female reproductive factors, gene polymorphisms in the estrogen metabolism pathway, and risk of lung cancer in Chinese women. Am J Epidemiol. 2012;175(6):492–503.
- 43. Seow A, Koh W-P, Wang R, Lee H-P, Yu MC. Reproductive variables, soy intake, and lung cancer risk among nonsmoking women in the Singapore Chinese Health Study. Cancer Epidemiol Biomarkers Prev. 2009;18(3):821–827.
- 44. Liu Y, Inoue M, Sobue T, Tsugane S; for the JPHC Study Group. Reproductive factors, hormone use and the risk of lung cancer among middle-aged never-smoking Japanese women: a large-scale population-based cohort study. Int J Cancer. 2005;117(4):662–666.
- 45. Schwartz AG, Ray RM, Cote ML, et al. Hormone use, reproductive history and risk of lung cancer: the Women’s Health Initiative Studies. J Thorac Oncol. 2015;10(7):1004–1013.
- 46. Pesatori AC, Carugno M, Consonni D, et al. Reproductive and hormonal factors and the risk of lung cancer: the EAGLE study. Int J Cancer. 2013;132(11):2630–2639.
- 47. Brinton LA, Gierach GL, Andaya A, et al. Reproductive and hormonal factors and lung cancer risk in the NIH-AARP diet and health study cohort. Cancer Epidemiol Biomarkers Prev. 2011;20(5):900–911.
- 48. Meinhold CL, Berrington de González A, Bowman ED, et al. Reproductive and hormonal factors and the risk of nonsmall cell lung cancer. Int J Cancer. 2011;128(6):1404–1413.
- 49. Pesatori AC, Carugno M, Consonni D, et al. Hormone use and risk for lung cancer: a pooled analysis from the International Lung Cancer Consortium (ILCCO). Br J Cancer. 2013;109(7):1954–1964.
- 50. Rauscher GH, Mayne ST, Janerich DT. Relation between body mass index and lung cancer risk in men and women never and former smokers. Am J Epidemiol. 2000;152(6):506–513.
- 51. Zhu H, Zhang S. Body mass index and lung cancer risk in never smokers: a meta-analysis. BMC Cancer. 2018;18(1):635.